@Maya, I would recommend an online forecasting textbook if you are specifically interested in time series forecasting methods; it has a section on naive forecasting.
There are two types of naive method, for non-seasonal and for seasonal data, and they are among the simplest models you can program, in R or in any other language. For non-seasonal data, the last observed value is the forecast for the entire future horizon. For seasonal data, the forecast for a future period is whatever value the historical data show for the same period (for example, the forecast for Aug 2015 is the actual value for Aug 2014).
In R specifically you can produce naive forecasts (both non-seasonal and seasonal) with the forecast package. A code snippet is shown below, together with plots of what the forecasts look like for non-seasonal and seasonal data.
library("forecast")
library("fma")
## Example non-seasonal data from the fma package
nsdata <- eggs
plot(nsdata)
## Naive forecast for the non-seasonal data, 18 years ahead
nsdata.f <- naive(nsdata, h = 18)
plot(nsdata.f)
## Example seasonal data from the fma package
sdata <- airpass
plot(sdata)
## Seasonal naive forecast for the next 12 months
sdata.f <- snaive(sdata, h = 12)
plot(sdata.f)
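If you need the numbers rather than the plots, the objects returned by naive() and snaive() store the point forecasts in the mean component and the prediction intervals in lower and upper:
nsdata.f$mean   # point forecasts
nsdata.f$lower  # lower prediction bounds (80% and 95% by default)
nsdata.f$upper  # upper prediction bounds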
Provided there is some justification for an exponential decay model, you could try the gnls function from package nlme. This allows you to compare treatments and model variance heterogeneity. Here is something to get you started (DF is your data; a simulated stand-in with the same shape is sketched just below, so the code can be run end to end):
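## purely illustrative stand-in for DF (350 rows: 25 times x 2 treatments x 7 reps),
## with decay parameters near the estimates below and noise that shrinks with time
set.seed(42)
DF <- expand.grid(time = seq(0, 6, length.out = 25),
                  treatment = factor(1:2), rep = 1:7)
DF$value <- with(DF, ifelse(treatment == 1,
                            416 * exp(-1.08 * time),
                            431 * exp(-1.61 * time)) +
                     rnorm(nrow(DF), sd = 90 * exp(-0.8 * time)))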
library(ggplot2)
p <- ggplot(DF, aes(time, value, color = treatment)) + geom_point()
Get starting values by fitting separate nls models:
coef(nls(value ~ C * exp(-k*time), data = DF[DF$treatment == 1,], start = list(C=400, k=1)))
# C k
#415.729905 1.080539
coef(nls(value ~ C * exp(-k*time), data = DF[DF$treatment == 2,], start = list(C=400, k=1)))
# C k
#430.787442 1.606167
Now use gnls. Because params = list(C ~ treatment, k ~ treatment) parameterizes each coefficient as an intercept (treatment 1) plus a contrast for treatment 2, the starting values are given as c(415, 15) for C (roughly 415.7 and 430.8 - 415.7) and c(1.1, 0.5) for k (roughly 1.08 and 1.61 - 1.08):
library(nlme)
fit <- gnls(value ~ C * exp(-k*time),
            data = DF,
            params = list(C ~ treatment, k ~ treatment),
            start = list(C = c(415, 15), k = c(1.1, 0.5)),
            weights = varExp(-0.8, form = ~ time),
            control = gnlsControl(nlsTol = 0.1))
I use an exponential variance structure (without dependence on treatment) here, but you could try some alternatives. Note that I had to strongly increase nlsTol to achieve a successful fit. Use the result with caution (but it looks pretty good):
summary(fit)
#Generalized nonlinear least squares fit
# Model: value ~ C * exp(-k * time)
# Data: DF
# AIC BIC logLik
# 2485.028 2508.176 -1236.514
#
#Variance function:
# Structure: Exponential of variance covariate
# Formula: ~time
# Parameter estimates:
# expon
#-0.8035687
#
#Coefficients:
# Value Std.Error t-value p-value
#C.(Intercept) 413.5077 15.976589 25.88210 0.0000
#C.treatment2 9.7849 24.062902 0.40664 0.6845
#k.(Intercept) 1.0932 0.021132 51.73149 0.0000
#k.treatment2 0.4104 0.061657 6.65565 0.0000
#
# Correlation:
# C.(In) C.trt2 k.(In)
#C.treatment2 -0.664
#k.(Intercept) 0.629 -0.418
#k.treatment2 -0.216 0.456 -0.343
#
#Standardized residuals:
# Min Q1 Med Q3 Max
#-2.49675931 -0.55858325 -0.02101141 0.53929573 4.36616094
#
#Residual standard error: 92.79678
#Degrees of freedom: 350 total; 346 residual
plot(fit)
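plot(fit) shows the standardized residuals; if they still fan out, other nlme variance functions could be tried, for example (a sketch only, with illustrative object names, not fitted here):
## variance as a power of the fitted values
fit.pow <- update(fit, weights = varPower(form = ~ fitted(.)))
## or a separate exponential variance parameter per treatment
fit.trt <- update(fit, weights = varExp(form = ~ time | treatment))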
Now plot the result:
newdata <- expand.grid(time = seq(0, 6, length.out = 100), treatment = factor(1:2))
newdata$value <- predict(fit, newdata = newdata)
p + geom_line(data = newdata)
As a next step, one could try removing the dependence of C on treatment from the model; that might help to achieve better convergence. A sketch of that simplification (fit.c is just an illustrative name; the treatment effect stays on k only):
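fit.c <- gnls(value ~ C * exp(-k*time),
              data = DF,
              params = list(C ~ 1, k ~ treatment),
              start = list(C = 420, k = c(1.1, 0.5)),
              weights = varExp(-0.8, form = ~ time),
              control = gnlsControl(nlsTol = 0.1))
anova(fit, fit.c)  # likelihood-ratio comparison of the nested fits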
Best Answer
Without having your data it is hard to address your problem completely, but I do have a suggestion that may help. If the linear model (fit on the log scale) produces errors of constant size across all values of time, then once they are exponentiated they become errors proportional to the y-value; that is, the errors for large y-values can be expected to be much larger than the errors for small values. You can correct for this by using exponentially decaying weights.
Here is a simple example. The data are the weekly box office gross receipts for the movie "The Magnificent Seven", taken from Box Office Mojo. A minimal sketch of the idea, using synthetic stand-in numbers rather than the actual receipts:
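## synthetic stand-in data, purely illustrative:
## exponential decay plus constant additive noise
set.seed(1)
week  <- 1:15
gross <- 60 * exp(-0.35 * week) + rnorm(length(week), sd = 0.8)
gross <- pmax(gross, 0.1)                 # keep the receipts positive

## unweighted fit: constant log-scale errors exponentiate into
## errors proportional to the fitted value
fit.u <- lm(log(gross) ~ week)
## weighted fit: weights proportional to y^2 (equivalently, decaying
## exponentially in time) favor the large early values
fit.w <- lm(log(gross) ~ week, weights = gross^2)

plot(week, gross)
lines(week, exp(fitted(fit.u)), lty = 2)  # unweighted
lines(week, exp(fitted(fit.w)))           # weighted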
Notice that the error at week 1 is much smaller and the other points did not suffer too much. I hope that this helps with your data.