Solved – Time series cross section forecasting with R

forecasting, negative-binomial-distribution, panel data, r, time series

I have a (I suspect) simple question. I have time series cross section data on voting behaviour in the Council of the European Union (the monthly number of yes, no and abstentions for each member state from 1999 to 2007). So basically the variables are counts, thus a Poisson/negative binomial regression would be appropriate, possibly with lagged dependent variables on the right hand side to control for time dependencies. I have seen papers with people using such negative binomial models to forecast, for instance the number of monthly legislative acts adopted in the future, and I have three questions in this regard:

How can I run a negative binomial regression on panel data without making any inferential mistakes?
How can I use a negative binomial model with lags to forecast future values of the dependent variable?
Can this be done in R?

Thomas

Best Answer

After a bit of research, I can give a partial answer. In his book, Wooldridge discusses Poisson and negative binomial regressions for cross-section and panel data, but for regression with lagged variables he only discusses Poisson regression. Maybe negative binomial is discussed in the new edition. The main conclusion is that a random effects Poisson regression with a lagged dependent variable can be estimated as a mixed effects Poisson regression model. The detailed description can be found here. In R, a mixed effects Poisson regression can be estimated with glmer from package lme4. To make it work with panel data, you will need to create the lagged variable explicitly. Note that current versions of lme4 do not accept quasi families in glmer, so use family=poisson; lme4 also provides glmer.nb for negative binomial mixed models. Your estimation command should then look something like this:

glmer(y~lagy+exo+(1|Country),data=data,family=poisson)
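The lag construction mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the column names (Country, month, exo, y, lagy) and the country codes are made up for the example, and the future value of exo used for the forecast is a placeholder. The model is fitted only if lme4 is installed.

```r
# Simulated toy panel: 4 hypothetical countries, 12 months each.
set.seed(1)
panel <- data.frame(
  Country = factor(rep(c("AT", "BE", "DE", "FR"), each = 12)),
  month   = rep(1:12, times = 4),
  exo     = rnorm(48)
)
panel$y <- rpois(48, lambda = rep(c(3, 5, 8, 12), each = 12))

# Sort by country and time so the lag follows the time order.
panel <- panel[order(panel$Country, panel$month), ]

# Lag y by one month within each country; the first month of each
# country has no predecessor and becomes NA.
panel$lagy <- ave(panel$y, panel$Country,
                  FUN = function(v) c(NA, head(v, -1)))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Random-intercept Poisson model; rows with NA in lagy are dropped.
  fit <- lme4::glmer(y ~ lagy + exo + (1 | Country),
                     data = panel, family = poisson)
  print(summary(fit))

  # One-step-ahead forecast: the last observed y of each country
  # becomes lagy. exo = 0 is an assumed placeholder for the future value.
  last <- panel[panel$month == max(panel$month), ]
  newdata <- data.frame(Country = last$Country, lagy = last$y, exo = 0)
  print(predict(fit, newdata = newdata, type = "response"))
}
```

Forecasting more than one month ahead would mean feeding each prediction back in as the next lagy, and it needs future values (or forecasts) of any exogenous regressors.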

You should also look into the pglm package suggested by @dickoa, but be sure to check whether it supports lagged variables. Yves Croissant, the creator of the pglm and plm packages, writes wonderful code, but unfortunately, in my personal experience, the code is not tested enough, so bugs crop up more frequently than in standard R packages.

Related Solutions

Solved – Forecasting irregular time series (with R)

State space models handle missing data very well. Take a look at Section 6.4, "Missing Data Modifications", in Time Series Analysis and Its Applications: With R Examples, 3rd ed., by Shumway and Stoffer. They have examples at http://www.stat.pitt.edu/stoffer/tsa3/
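As a rough illustration of the idea (not one of Shumway and Stoffer's examples), base R's arima computes its likelihood through a state-space representation and a Kalman filter, so it accepts NA values in the series. The sketch below assumes the irregular series has already been placed on a regular time grid with NA wherever no observation exists. The data are simulated and the AR(1) order is an arbitrary choice.

```r
# Simulated AR(1) series on a regular grid, with gaps marked as NA.
set.seed(42)
x <- arima.sim(model = list(ar = 0.7), n = 200)
x[c(20:25, 80, 150:155)] <- NA

# method = "ML" uses the Kalman filter, which skips the update step
# at time points where the observation is missing.
fit <- arima(x, order = c(1, 0, 0), method = "ML")
print(fit)

# Forecast the next 5 time points.
fc <- predict(fit, n.ahead = 5)
print(fc$pred)
```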

Solved – Forecasting a time series with weights

This is an issue with lm.

wrapper = function(formula,...)
  lm(formula=formula,...)
x=(1:27)
wrapper(ts.input~x,weights=ts.weights)

produces the exact same error. If you read the source code of tslm, you'll find that lm is called in more or less the same way.

I found here that ..1 means the first argument included in ..., which in this case is the only argument, weights. It suggests that this kind of error can occur when the argument being passed doesn't exist in the environment from which the function is being called. You can see this behavior with wrapper(ts.input~x,weights=foo) if you don't have any object called foo.
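A minimal sketch of what ..1 refers to is given below. The helper names first_dot and second_dot are made up for this illustration.

```r
# ..1 is the first argument captured by `...`, ..2 the second, and so on.
first_dot  <- function(...) ..1
second_dot <- function(...) ..2

first_value  <- first_dot(weights = c(1, 2, 3), "ignored")
second_value <- second_dot(weights = c(1, 2, 3), "ignored")
print(first_value)
print(second_value)

# Where there is no `...` to look in, evaluating ..1 fails.
# The error message is captured as a string instead of stopping the script.
outside <- tryCatch(eval(quote(..1), globalenv()),
                    error = function(e) conditionMessage(e))
print(outside)
```

So when a recorded call containing weights = ..1 is evaluated somewhere other than inside the function that received the ..., there is nothing for ..1 to resolve to.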

Running a traceback on wrapper(ts.input~x,weights=ts.weights) reveals:

8 eval(expr, envir, enclos) 
7 eval(extras, data, env) 
6 model.frame.default(formula = formula, weights = ..1, drop.unused.levels = TRUE) 
5 stats::model.frame(formula = formula, weights = ..1, drop.unused.levels = TRUE) 
4 eval(expr, envir, enclos) 
3 eval(mf, parent.frame()) 
2 lm(formula = formula, ...) 
1 wrapper(ts.input ~ x, weights = ts.weights)

And in the source code of lm,

mf <- match.call(expand.dots = FALSE)
m <- match(c("formula", "data", "subset", "weights", "na.action", 
    "offset"), names(mf), 0L)
mf <- mf[c(1L, m)]
mf$drop.unused.levels <- TRUE
mf[[1L]] <- quote(stats::model.frame)
mf <- eval(mf, parent.frame())
if (method == "model.frame") 
    return(mf)

which suggests that the call to model.frame is picking up on the ... in a strange way.

So I decided to try it without ...:

wrapper = function(formula,weights)
  lm(formula=formula,weights=weights)
x=(1:27)
wrapper(ts.input~x,weights=ts.weights)

which produced

 Error in model.frame.default(formula = formula, weights = weights, drop.unused.levels = TRUE) :
   invalid type (closure) for variable '(weights)'

A closure, of course, is a function in R. Which can mean only one thing: weights already exists as a function. Indeed, ?weights reveals that it is, ironically, the extractor function for model weights (it lives in the stats package, not the global environment). Evidently that function was found in place of my local argument. So I changed the argument names (since formula is also a function):

wrapper = function(fm,ws){
  print(ws)
  lm(formula=fm,weights=ws)
}
x=(1:27)
wrapper(fm=ts.input~x,ws=ts.weights)

now produces

 [1] 2.260324e-05 6.385091e-05 1.803683e-04 5.094997e-04 1.439136e-03 4.064300e-03
 [7] 1.147260e-02 3.234087e-02 9.082267e-02 2.523791e-01 6.815483e-01 1.712317e+00
[13] 3.685455e+00 6.224593e+00 8.232410e+00 9.293615e+00 9.737984e+00 9.905650e+00
[19] 9.966395e+00 9.988078e+00 9.995776e+00 9.998504e+00 9.999471e+00 9.999813e+00
[25] 9.999934e+00 9.999977e+00 9.999992e+00

 Error in eval(expr, envir, enclos) : object 'ws' not found

And if you run traceback you still get the same issue with model.frame. So I'm completely baffled. My only conclusion is that a) it has nothing to do with tslm or time series analysis, and b) the problem lies somewhere in the way arguments are passed around inside lm.

That probably doesn't help you at all, but hopefully at least someone can come along and explain what's going on here. My provisional answer to your actual question of how to use weights in tslm is that you can't.
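One explanation that fits all of the output above is suggested by the lm documentation: weights, subset and offset are evaluated like the variables in the formula, first in data and then in the environment of the formula. The formula here was created at the top level, so lm never looks inside the wrapper's frame for ws. The sketch below shows two workarounds for a plain lm wrapper under that reading. ts.input and ts.weights are simulated stand-ins, the wrapper names are hypothetical, and nothing here has been checked against tslm itself.

```r
# Simulated stand-ins for the objects used in the question.
set.seed(1)
x <- 1:27
ts.input   <- 2 * x + rnorm(27)
ts.weights <- runif(27, min = 0.1, max = 10)

# Reference fit: a direct call, where ts.weights is visible from the formula.
direct <- lm(ts.input ~ x, weights = ts.weights)

# Workaround 1: store the weights as a column of `data`,
# which is searched before the formula's environment.
wrapper_df <- function(fm, ws, data) {
  data$ws <- ws
  lm(fm, data = data, weights = ws)
}
fit_df <- wrapper_df(ts.input ~ x, ts.weights,
                     data = data.frame(ts.input = ts.input, x = x))

# Workaround 2: point the formula's environment at the wrapper's frame,
# so that `ws` is found there (and x, ts.input via the enclosing scope).
wrapper_env <- function(fm, ws) {
  environment(fm) <- environment()
  lm(fm, weights = ws)
}
fit_env <- wrapper_env(ts.input ~ x, ts.weights)

# Both wrappers should reproduce the coefficients of the direct call.
print(rbind(direct = coef(direct), df = coef(fit_df), env = coef(fit_env)))
```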
