Solved – Double exponential smoothing in multivariate multilevel panel regression

panel data, r, time series, trend

I would like to use double exponential smoothing to predict prevalence rates of care dependency in Austrian federal states.

My data are very detailed, and I would like to use that detail to refine my predictions. I have the percentage of people aged 50–99 in each of the care dependency levels 1–7 for the 9 Austrian federal states.

 str(daten[1:12][daten$jahr>1996,])
'data.frame':   39600 obs. of  12 variables:
 $ age       : num  50 51 52 53 54 55 56 57 58 59 ...
 $ gender    : Factor w/ 2 levels "male","female": 1 1 1 1 1 1 1 1 1 1 ...
 $ bundesland: Factor w/ 9 levels "Bgld","Ktn","Noe",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ jahr      : num  1997 1997 1997 1997 1997 ...
 $ PfSt0     : num  0.992 0.989 0.985 0.985 0.985 ...
 $ PfSt1     : num  0.001458 0.000967 0.001459 0 0.002199 ...
 $ PfSt2     : num  0.00437 0.00193 0.00802 0.00793 0.00587 ...
 $ PfSt3     : num  0.00146 0.0058 0.00073 0.00433 0.0044 ...
 $ PfSt4     : num  0.000729 0 0.002188 0.002163 0.000733 ...
 $ PfSt5     : num  0 0.000967 0.002188 0.000721 0.002199 ...
 $ PfSt6     : num  0 0.000967 0 0 0 ...
 $ PfSt7     : num  0 0 0.00073 0 0 ...

Double exponential smoothing (DES) is a time series method, and time series analysis explains a data series by its own past values only. While it is true that I use only past data on care dependency, one could regard age, gender and federal state as explanatory variables. Instead of computing a separate DES forecast for each combination of age, gender and federal state, I could assume structural uniformity across these time series. My data might then be regarded as a multilevel panel dataset, with 50 observations per year (one per age group) nested in each of the 9 federal states. (I plan to run separate regressions for males and females.)

I would like to use federal state, age and age squared as explanatory variables, in addition to the previous value and the previous trend that double exponential smoothing uses.
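For reference, the recursions usually meant by double exponential smoothing can be written as below. This is Holt's linear-trend form, which is an assumption on my part since the question does not say which variant is intended. The symbols are not from the question: $y_t$ is the observed prevalence in year $t$, $\ell_t$ the smoothed level, $b_t$ the smoothed trend, and $\alpha, \beta \in (0, 1)$ the smoothing parameters.

$$
\begin{aligned}
\ell_t &= \alpha\, y_t + (1 - \alpha)\,(\ell_{t-1} + b_{t-1}) \\
b_t &= \beta\,(\ell_t - \ell_{t-1}) + (1 - \beta)\, b_{t-1} \\
\hat{y}_{t+h \mid t} &= \ell_t + h\, b_t
\end{aligned}
$$

The "previous value" and "previous trend" correspond to $\ell_{t-1}$ and $b_{t-1}$. Note that nothing in these recursions depends on age, gender or federal state.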

However, in panel data analysis, time trends are typically handled by including the year variable in the regression, and only rarely by including lags. How could I implement a forecasting method similar to double exponential smoothing on a panel dataset, i.e. one that also includes other explanatory variables? (Preferably in R.)
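One possible reading of this idea, purely as a sketch: regress the current value on the lagged value and the lagged change (a crude stand-in for the "previous trend"), together with age, age squared and federal state. The toy panel, the column `y` and the helper `lag_by` are all hypothetical; only `age`, `jahr` and `bundesland` follow the naming in the question. This is a pooled dynamic regression, not double exponential smoothing itself.

```r
# Hypothetical toy panel (NOT the real data): 3 states x 5 ages x 15 years,
# with y a simulated prevalence in percent.
set.seed(1)
panel <- expand.grid(jahr = 1997:2011,
                     age = seq(50, 90, by = 10),
                     bundesland = factor(c("Bgld", "Ktn", "Noe")))
panel$y <- 1 + 0.02 * (panel$age - 50) + 0.05 * (panel$jahr - 1997) +
  rnorm(nrow(panel), sd = 0.05)

# Sort so that years are ascending within each state x age series.
panel <- panel[order(panel$bundesland, panel$age, panel$jahr), ]
grp <- interaction(panel$bundesland, panel$age, drop = TRUE)

# Hypothetical helper: lag x by k periods within each group g.
lag_by <- function(x, g, k) {
  ave(x, g, FUN = function(v) c(rep(NA, k), head(v, -k)))
}

panel$y_lag1  <- lag_by(panel$y, grp, 1)      # previous value
panel$y_lag2  <- lag_by(panel$y, grp, 2)
panel$dy_lag1 <- panel$y_lag1 - panel$y_lag2  # previous change ("trend")

# Pooled regression; rows with missing lags are dropped by lm().
fit <- lm(y ~ y_lag1 + dy_lag1 + age + I(age^2) + bundesland, data = panel)
coef(fit)
```

With the real data one would presumably fit this separately by gender and for each of the PfSt levels, and obtain a one-step-ahead forecast with `predict(fit, newdata = ...)` using the most recent lags.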

(Matters are complicated further by the fact that I have seven dependent variables instead of one.)

Best Answer

Double exponential smoothing can be viewed as a reduced version of the Kalman filter. It is not optimal, but it can be more robust. You could try Kalman filtering in R.
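As a rough illustration of that comparison, the sketch below fits both approaches to a single simulated series using only base R (the `stats` package). The series `y` is hypothetical and stands in for one age/gender/state prevalence path. `HoltWinters(..., gamma = FALSE)` gives double exponential smoothing, and `StructTS(..., type = "trend")` fits a local linear trend state-space model whose forecasts come from the Kalman filter. Neither call handles covariates or the panel structure; this only shows the univariate building block.

```r
# Hypothetical annual series in percent (NOT the real data).
set.seed(42)
y <- ts(2 + 0.1 * (0:29) + rnorm(30, sd = 0.05), start = 1997)

# Double exponential smoothing: level and trend, no seasonal component.
# alpha and beta are chosen by minimising squared one-step-ahead errors.
des <- HoltWinters(y, gamma = FALSE)
des_fc <- predict(des, n.ahead = 5)

# Local linear trend model, fitted by maximum likelihood;
# predict() runs the Kalman filter forward to produce forecasts.
llt <- StructTS(y, type = "trend")
llt_fc <- predict(llt, n.ahead = 5)

# Five-year-ahead point forecasts side by side.
cbind(holt = as.numeric(des_fc), kalman = as.numeric(llt_fc$pred))
```

`llt_fc$se` holds the forecast standard errors from the state-space fit, which plain exponential smoothing via `HoltWinters` does not return by default.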
