I'm trying to reproduce a model from a paper and am having trouble mapping the paper's wording onto an implementation in R. The model is an additive autoregressive model, nonlinear in this case. The paper says that the final model chosen was a "Spline model with a constant and with two lags, where a B-spline base of order 3 and 25 knots was chosen." The package used to fit this model was "mgcv".
That's the background; here's what I have so far. I have the following dataset:
y | x.1 | x.2 |
---|---|---|
0.7 | NA | NA |
0.4 | 0.7 | NA |
-0.2 | 0.4 | 0.7 |
0.5 | -0.2 | 0.4 |
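
As a minimal sketch, the lag columns in a table like this can be built in base R from the series itself. The names `lag_by` and `lagged_example` are hypothetical and used only for illustration; they are not from the paper or from mgcv.

```r
# The four example values of y from the table above.
y <- c(0.7, 0.4, -0.2, 0.5)

# Hypothetical helper: shift a vector down by k positions, padding with NA.
lag_by <- function(x, k) c(rep(NA, k), head(x, -k))

# Assemble y together with its first and second lags.
lagged_example <- data.frame(
  y   = y,
  x.1 = lag_by(y, 1),  # first lag
  x.2 = lag_by(y, 2)   # second lag
)

print(lagged_example)
```

Rows that contain NA in either lag cannot be used for fitting, so only the complete rows contribute to the model.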
Here y holds the known values, x.1 is the first time lag, and x.2 is the second time lag. I can load this data into R and fit a spline model with these inputs using the gam() function from the R package mgcv. I set k=25 for the knots (this can be changed to 2, since I only show a small dataset here), bs="bs" selects the B-spline basis, and s() simply wraps each lag in a smooth term.
library(mgcv)
test2 <- gam(y ~ s(x.1, bs="bs", k=25) + s(x.2, bs="bs", k=25),
             data=full_set_1)
test2
When I run this on the full dataset I get the output below, which is what I expect. However, I have two questions. First, how do I set the B-spline basis to order 3? Second, I'm not sure what "Spline model with a constant" means, or how I would add a constant to my model.
Family: gaussian
Link function: identity
Formula:
y ~ s(x.1, bs = "bs", k = 25, fx = FALSE) + s(x.2, bs = "bs",
k = 25, fx = FALSE)
Estimated degrees of freedom:
23.6 22.4 total = 46.99
GCV score: 0.01041977
Thank you for the help.
Best Answer
I presume the "constant" just means the model intercept or constant term. The intercept is implied in R's formula notation. You can state it explicitly by adding `1 +` to the right-hand side of the formula, but this isn't necessary: the intercept is always included unless you suppress it by adding `0 +` or `- 1` to the formula.

As for using cubic (order 3) B-splines, this is the default for `bs = "bs"`, but you can specify it via the `m` argument. The default is `m = c(3, 2)`, which is a cubic B-spline basis with a second-order derivative penalty.
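
To illustrate both points, here is a minimal sketch that fits the model twice: once with the defaults, and once with the intercept and `m = c(3, 2)` written out. The data are simulated purely for illustration (the series, `sim_set`, `fit_implicit` and `fit_explicit` are assumptions, not the asker's `full_set_1`), so only the comparison between the two fits is meaningful.

```r
library(mgcv)

# Simulate a bounded nonlinear AR(2)-type series (illustrative only).
set.seed(1)
n <- 500
y <- numeric(n)
for (i in 3:n) {
  y[i] <- 0.6 * sin(2 * y[i - 1]) - 0.3 * tanh(y[i - 2]) + rnorm(1, sd = 0.1)
}

# Response with its first and second lags, dropping the incomplete rows.
sim_set <- data.frame(
  y   = y[3:n],
  x.1 = y[2:(n - 1)],
  x.2 = y[1:(n - 2)]
)

# Intercept and spline order left at their defaults.
fit_implicit <- gam(y ~ s(x.1, bs = "bs", k = 25) + s(x.2, bs = "bs", k = 25),
                    data = sim_set)

# Intercept ("1 +") and cubic basis with 2nd-derivative penalty stated explicitly.
fit_explicit <- gam(y ~ 1 + s(x.1, bs = "bs", k = 25, m = c(3, 2)) +
                      s(x.2, bs = "bs", k = 25, m = c(3, 2)),
                    data = sim_set)

# Both fits contain an intercept and should have matching coefficients.
print(coef(fit_implicit)["(Intercept)"])
print(all.equal(coef(fit_implicit), coef(fit_explicit)))
```

If the two sets of coefficients agree, spelling out the constant and the order changes nothing about the fitted model; it only makes the specification from the paper visible in the code.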