@Maya, I would recommend an online forecasting textbook if you are specifically interested in time series forecasting methods; it has a section on naive forecasting.
There are two types of naive method, for non-seasonal and for seasonal data, and they are among the simplest models you can program, in R or in any other language. For non-seasonal data, the last observed value is the forecast for the entire future horizon. For seasonal data, the forecast for a future period is whatever value the historical data show for the same period (for example, the forecast for Aug 2015 is the actual value for Aug 2014).
In R specifically you can produce naive forecasts (both non-seasonal and seasonal) with the forecast package. A code snippet is shown below, together with plots of what the forecasts look like for non-seasonal and seasonal data.
library("forecast")
library("fma")
## Example non-seasonal data from the fma package
nsdata <- eggs
plot(nsdata)
## Naive forecast for the non-seasonal data, 18 years ahead
nsdata.f <- naive(nsdata, h = 18)
plot(nsdata.f)
## Example seasonal data from the fma package
sdata <- airpass
plot(sdata)
## Seasonal naive forecast for the next 12 months
sdata.f <- snaive(sdata, h = 12)
plot(sdata.f)
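If you need the numbers rather than the plots, the objects returned by naive() and snaive() store the point forecasts in the mean component and the prediction intervals in lower and upper:
nsdata.f$mean   # point forecasts
nsdata.f$lower  # lower prediction bounds (80% and 95% by default)
nsdata.f$upper  # upper prediction bounds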
Provided there is some justification for an exponential decay model, you could try the gnls function from package nlme. This allows you to compare treatments and model variance heterogeneity. Here is something to get you started (DF is your data; a simulated stand-in with the same shape is sketched just below, so the code can be run end to end):
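## purely illustrative stand-in for DF (350 rows: 25 times x 2 treatments x 7 reps),
## with decay parameters near the estimates below and noise that shrinks with time
set.seed(42)
DF <- expand.grid(time = seq(0, 6, length.out = 25),
                  treatment = factor(1:2), rep = 1:7)
DF$value <- with(DF, ifelse(treatment == 1,
                            416 * exp(-1.08 * time),
                            431 * exp(-1.61 * time)) +
                     rnorm(nrow(DF), sd = 90 * exp(-0.8 * time)))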
library(ggplot2)
p <- ggplot(DF, aes(time, value, color = treatment)) + geom_point()
Get starting values by fitting separate nls models:
coef(nls(value ~ C * exp(-k*time), data = DF[DF$treatment == 1,], start = list(C=400, k=1)))
# C k
#415.729905 1.080539
coef(nls(value ~ C * exp(-k*time), data = DF[DF$treatment == 2,], start = list(C=400, k=1)))
# C k
#430.787442 1.606167
Now use gnls. Because params = list(C ~ treatment, k ~ treatment) parameterizes each coefficient as an intercept (treatment 1) plus a contrast for treatment 2, the starting values are given as c(415, 15) for C (roughly 415.7 and 430.8 - 415.7) and c(1.1, 0.5) for k (roughly 1.08 and 1.61 - 1.08):
library(nlme)
fit <- gnls(value ~ C * exp(-k*time),
            data = DF,
            params = list(C ~ treatment, k ~ treatment),
            start = list(C = c(415, 15), k = c(1.1, 0.5)),
            weights = varExp(-0.8, form = ~ time),
            control = gnlsControl(nlsTol = 0.1))
I use an exponential variance structure (without dependence on treatment) here, but you could try some alternatives. Note that I had to strongly increase nlsTol to achieve a successful fit. Use the result with caution (but it looks pretty good):
summary(fit)
#Generalized nonlinear least squares fit
# Model: value ~ C * exp(-k * time)
# Data: DF
# AIC BIC logLik
# 2485.028 2508.176 -1236.514
#
#Variance function:
# Structure: Exponential of variance covariate
# Formula: ~time
# Parameter estimates:
# expon
#-0.8035687
#
#Coefficients:
# Value Std.Error t-value p-value
#C.(Intercept) 413.5077 15.976589 25.88210 0.0000
#C.treatment2 9.7849 24.062902 0.40664 0.6845
#k.(Intercept) 1.0932 0.021132 51.73149 0.0000
#k.treatment2 0.4104 0.061657 6.65565 0.0000
#
# Correlation:
# C.(In) C.trt2 k.(In)
#C.treatment2 -0.664
#k.(Intercept) 0.629 -0.418
#k.treatment2 -0.216 0.456 -0.343
#
#Standardized residuals:
# Min Q1 Med Q3 Max
#-2.49675931 -0.55858325 -0.02101141 0.53929573 4.36616094
#
#Residual standard error: 92.79678
#Degrees of freedom: 350 total; 346 residual
plot(fit)
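plot(fit) shows the standardized residuals; if they still fan out, other nlme variance functions could be tried, for example (a sketch only, with illustrative object names, not fitted here):
## variance as a power of the fitted values
fit.pow <- update(fit, weights = varPower(form = ~ fitted(.)))
## or a separate exponential variance parameter per treatment
fit.trt <- update(fit, weights = varExp(form = ~ time | treatment))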
Now plot the result:
newdata <- expand.grid(time = seq(0, 6, length.out = 100), treatment = factor(1:2))
newdata$value <- predict(fit, newdata = newdata)
p + geom_line(data = newdata)
As a next step, one could try removing the dependence of C on treatment from the model; that might help to achieve better convergence. A sketch of that simplification (fit.c is just an illustrative name; the treatment effect stays on k only):
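fit.c <- gnls(value ~ C * exp(-k*time),
              data = DF,
              params = list(C ~ 1, k ~ treatment),
              start = list(C = 420, k = c(1.1, 0.5)),
              weights = varExp(-0.8, form = ~ time),
              control = gnlsControl(nlsTol = 0.1))
anova(fit, fit.c)  # likelihood-ratio comparison of the nested fits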
Best Answer
Without having your data it is hard to address your problem completely, but I do have a suggestion that may help. If the linear model (fit on the log scale) produces errors of constant size across all values of time, then once they are exponentiated they become errors proportional to the y-value; that is, the errors for large y-values can be expected to be much larger than the errors for small values. You can correct for this by using exponentially decaying weights.
Here is a simple example. The data are the weekly box office gross receipts for the movie "The Magnificent Seven", taken from Box Office Mojo. A minimal sketch of the idea, using synthetic stand-in numbers rather than the actual receipts:
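## synthetic stand-in data, purely illustrative:
## exponential decay plus constant additive noise
set.seed(1)
week  <- 1:15
gross <- 60 * exp(-0.35 * week) + rnorm(length(week), sd = 0.8)
gross <- pmax(gross, 0.1)                 # keep the receipts positive

## unweighted fit: constant log-scale errors exponentiate into
## errors proportional to the fitted value
fit.u <- lm(log(gross) ~ week)
## weighted fit: weights proportional to y^2 (equivalently, decaying
## exponentially in time) favor the large early values
fit.w <- lm(log(gross) ~ week, weights = gross^2)

plot(week, gross)
lines(week, exp(fitted(fit.u)), lty = 2)  # unweighted
lines(week, exp(fitted(fit.w)))           # weighted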
Notice that the error at week 1 is much smaller and the other points did not suffer too much. I hope that this helps with your data.