I have a data frame that contains two time series: the dates and version numbers of Emacs and Firefox releases. Using one ggplot2 command it's easy to make a chart that uses loess (in a way that looks a bit amusing, which I don't mind) to turn the points into lines.
How can I extend the lines into the future? I want to determine where and when Emacs and Firefox version numbers will cross, and if there's a way to show an error range, all the better.
Given that ggplot2 is plotting the lines, it must have a model, but I don't see how to tell it to extend the lines, or to get the model out and do something with it.
> library(ggplot2)
> programs <- read.csv("http://www.miskatonic.org/files/se-program-versions.csv")
> programs$Date <- as.Date(programs$Date, format="%B %d, %Y")
> head(programs)
Program Version Date
1 Emacs 24.1 2012-06-10
2 Emacs 23.4 2012-01-29
3 Emacs 23.3 2011-03-10
4 Emacs 23.2 2010-05-08
5 Emacs 23.1 2009-07-29
6 Emacs 22.3 2008-09-05
> head(subset(programs, Program == "Firefox"))
Program Version Date
18 Firefox 16 2012-10-09
19 Firefox 15 2012-08-28
20 Firefox 14 2012-06-26
21 Firefox 13 2012-06-15
22 Firefox 12 2012-04-24
23 Firefox 11 2012-03-13
> ggplot(programs, aes(y = Version, x = Date, colour = Program)) + geom_point() + geom_smooth(span = 0.5, fill = NA)
(Note: I had to fudge the early Firefox versions and turn 0.1 into 0.01, etc., because "dot one" and "dot ten" are equal arithmetically. I know Firefox is releasing every six weeks now, but those future releases don't exist yet, and I'm interested in a general answer to this prediction question.)
Best Answer
As @Glen mentions, you have to use a stat_smooth method that supports extrapolation, which loess does not. lm does, however. What you need to do is use the fullrange parameter of stat_smooth and expand the x-axis to include the range you want to predict over. I don't have your data, but here's an example using the mtcars dataset:
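A minimal sketch of that approach follows. The variables (mpg against wt, grouped by cyl) and the x-axis range of 0 to 8 are choices made here for illustration, not taken from the question. The second half fits the same kind of linear model directly with lm, on the assumption that you want numbers out of the model rather than only a picture.

```r
library(ggplot2)

# Observed wt in mtcars runs from about 1.5 to 5.4, so limits of 0 to 8
# leave room on both sides for the fitted lines to be extrapolated.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  # method = "lm" fits one straight line per colour group;
  # fullrange = TRUE draws each line across the whole x scale,
  # not just across that group's own data.
  stat_smooth(method = "lm", formula = y ~ x, fullrange = TRUE) +
  xlim(0, 8)
p

# To work with a model outside the plot, fit it directly.
# This fits a single group (cyl == 4) as an example.
fit <- lm(mpg ~ wt, data = subset(mtcars, cyl == 4))

# Predict at wt values beyond the observed range, with a prediction interval.
new_wt <- data.frame(wt = c(6, 7, 8))
pred <- predict(fit, newdata = new_wt, interval = "prediction")
pred  # columns: fit, lwr, upr
```

The grey band that stat_smooth draws by default is a confidence interval for the fitted line, and it widens as the line moves away from the data. For the release data, the same idea should carry over by passing Date values as the axis limits, and the crossing point could then be estimated from two lm fits, one per program. A straight line is a strong assumption for version numbers, so treat any crossing date as rough.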