Solved – How to predict or extend regression lines in ggplot2

ggplot2, prediction

I have a data frame that contains two time series: the dates and version numbers of Emacs and Firefox releases. Using one ggplot2 command it's easy to make a chart that uses loess (in a way that looks a bit amusing, which I don't mind) to turn the points into lines.

How can I extend the lines into the future? I want to determine where and when Emacs and Firefox version numbers will cross, and if there's a way to show an error range, all the better.

Given that ggplot2 is plotting the lines, it must have a model, but I don't see how to tell it to extend the lines, or to get the model out and do something with it.

> library(ggplot2)
> programs <- read.csv("http://www.miskatonic.org/files/se-program-versions.csv")
> programs$Date <- as.Date(programs$Date, format="%B %d, %Y")
> head(programs)
  Program Version       Date
1   Emacs    24.1 2012-06-10
2   Emacs    23.4 2012-01-29
3   Emacs    23.3 2011-03-10
4   Emacs    23.2 2010-05-08
5   Emacs    23.1 2009-07-29
6   Emacs    22.3 2008-09-05
> head(subset(programs, Program == "Firefox"))
   Program Version       Date
18 Firefox      16 2012-10-09
19 Firefox      15 2012-08-28
20 Firefox      14 2012-06-26
21 Firefox      13 2012-06-15
22 Firefox      12 2012-04-24
23 Firefox      11 2012-03-13
> ggplot(programs, aes(y = Version, x = Date, colour = Program)) + geom_point() + geom_smooth(span = 0.5, fill = NA)

[Figure: Emacs and Firefox versions by date]

(Note: I had to fudge the early Firefox versions and turn 0.1 into 0.01, etc., because "dot one" and "dot ten" are equal arithmetically. I know Firefox is releasing every six weeks now, but those future releases don't exist yet, and I'm interested in a general answer to this prediction question.)

Best Answer

As @Glen mentions, you have to use a stat_smooth method that supports extrapolation. loess does not, but lm does. Set the fullrange parameter of stat_smooth and expand the x-axis to cover the range you want to predict over. I don't have your data, but here's an example using the mtcars dataset:

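library(ggplot2)
# xlim() widens the x-axis; fullrange=TRUE draws the lm fit across the whole axis
ggplot(mtcars, aes(x=disp, y=hp)) + geom_point() + xlim(0, 700) +
  stat_smooth(method="lm", fullrange=TRUE)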
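
To get the model out and estimate where two lines cross, one option is to fit the same straight lines yourself with lm() and work from the coefficients. The following is only a minimal sketch, assuming a linear trend is acceptable for both series. The data frame `releases`, the program names "A" and "B", and the `Day` column are made up for illustration and are not the real release history.

```r
library(ggplot2)

# Hypothetical stand-in data (NOT the real Emacs/Firefox releases):
# program A grows slowly, program B grows quickly.
releases <- data.frame(
  Program = rep(c("A", "B"), each = 5),
  Date    = as.Date(c("2008-01-01", "2009-01-01", "2010-01-01",
                      "2011-01-01", "2012-01-01",
                      "2011-01-01", "2011-04-01", "2011-07-01",
                      "2011-10-01", "2012-01-01")),
  Version = c(20, 21.1, 21.9, 23.0, 24.1,
              4, 6.1, 7.9, 10.2, 12)
)

# lm() needs a numeric predictor: days since 1970-01-01.
releases$Day <- as.numeric(releases$Date)

# One linear model per program.
fits <- lapply(split(releases, releases$Program),
               function(d) lm(Version ~ Day, data = d))

# The two lines cross where a_A + b_A * day == a_B + b_B * day.
cf <- sapply(fits, coef)  # rows: "(Intercept)", "Day"; columns: "A", "B"
cross_day  <- (cf["(Intercept)", "B"] - cf["(Intercept)", "A"]) /
              (cf["Day", "A"] - cf["Day", "B"])
cross_date <- as.Date(cross_day, origin = "1970-01-01")
print(cross_date)

# Error range for future values of A: 95% prediction interval.
future <- data.frame(
  Day = as.numeric(seq(as.Date("2012-01-01"), as.Date("2014-01-01"),
                       by = "month"))
)
pred_A <- predict(fits$A, newdata = future,
                  interval = "prediction", level = 0.95)
head(pred_A)  # columns: fit, lwr, upr

# Same fits drawn by ggplot2, extended to the widened date axis.
p <- ggplot(releases, aes(x = Date, y = Version, colour = Program)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, fullrange = TRUE) +
  scale_x_date(limits = as.Date(c("2008-01-01", "2014-01-01")))
print(p)
```

The ribbon that stat_smooth draws is a confidence band for the fitted line, which is narrower than the prediction interval returned by predict(). Either way, a crossing date obtained like this is an extrapolation and only as trustworthy as the straight-line assumption.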