Solved – Time series analysis in Python

python, time series

I am a beginner in time-series analysis. I have the model below, where $y$ is the sales of a product and $x$ is the tweet rate:

$y_t = a y_{t-1} + b y_{t-2} + \dots + c y_{t-m} + d x_t + e x_{t-1} + \dots + f x_{t-n}$

  1. What is this model called? I guess it's called an AR model, but I am not
    sure, since the dependent variable $y$ appears on the right-hand side as well.
  2. How do I choose the lag lengths $m$ and $n$? Can $x$ and $y$ have different lags?
  3. How can I use Python to build this model and predict the sales for $t+1, \ldots, t+n$? Is there a solution that does not use rpy?

Best Answer

  1. The model you have is called an autoregressive distributed lag (ARDL) model. Specifically, \begin{equation} y_t = a y_{t-1} + b y_{t-2} + \dots + c y_{t-m} + d x_t + e x_{t-1} + \dots + f x_{t-n} \end{equation} is an ARDL($m$, $n$) model, which can be written more compactly as \begin{equation} y_{t} = \delta + \sum_{i=1}^{m} \alpha_{i} y_{t-i} + \sum_{j=0}^{n} \beta_{j} x_{t-j} + u_{t} \end{equation} where $u_{t} \sim \mathrm{IID}(0, \sigma^{2})$ for all $t$ and, in your case, $\delta = 0$.

  2. The values of $m$ and $n$ do not have to be the same. That is, the lag length of the autoregressive term does not have to equal the lag length of the distributed lag term. Note also that the model can include one or more additional distributed lag terms (for example, $z_{t-k}$). There are different ways of choosing the lag lengths; for a treatment of this issue, I refer you to Chapter 17 of Damodar Gujarati and Dawn Porter's Basic Econometrics (5th ed.).

  3. To build a model like this in Python, it is worth checking out statsmodels.tsa (recent statsmodels releases include an `ARDL` class in `statsmodels.tsa.ardl`) as well as the other packages mentioned in the other answers.
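
To make the three points above concrete, here is a minimal sketch that uses only NumPy and ordinary least squares, so it does not depend on rpy. It is one possible approach, not the only one. The function names (`build_design`, `fit_ardl`, `select_lags`, `forecast`), the simulated data, and the use of BIC to pick $m$ and $n$ are all assumptions made for illustration. Note that forecasting $y_{t+1}, \ldots$ requires values of $x$ for those future periods, because $x_t$ enters the model contemporaneously.

```python
import numpy as np


def build_design(y, x, m, n, start=None):
    """Regressors [1, y_{t-1..t-m}, x_{t..t-n}] and targets y_t for t >= start."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    start = max(m, n) if start is None else start
    T = len(y)
    cols = [np.ones(T - start)]                             # intercept (delta)
    cols += [y[start - i:T - i] for i in range(1, m + 1)]   # y_{t-1}, ..., y_{t-m}
    cols += [x[start - j:T - j] for j in range(0, n + 1)]   # x_t, ..., x_{t-n}
    return np.column_stack(cols), y[start:]


def fit_ardl(y, x, m, n):
    """Estimate an ARDL(m, n) by OLS; returns (coefficients, residuals)."""
    X, target = build_design(y, x, m, n)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef, target - X @ coef


def select_lags(y, x, max_lag):
    """Pick (m, n) with the lowest BIC, using the same sample for every candidate."""
    best = None
    for m in range(1, max_lag + 1):
        for n in range(0, max_lag + 1):
            X, target = build_design(y, x, m, n, start=max_lag)
            coef, *_ = np.linalg.lstsq(X, target, rcond=None)
            rss = float(np.sum((target - X @ coef) ** 2))
            nobs = len(target)
            bic = nobs * np.log(rss / nobs) + X.shape[1] * np.log(nobs)
            if best is None or bic < best[0]:
                best = (bic, m, n)
    return best[1], best[2]


def forecast(y, x, x_future, coef, m, n):
    """Recursive forecasts: each predicted y is fed back in as a lag."""
    y_hist = list(np.asarray(y, dtype=float))
    x_all = list(np.asarray(x, dtype=float)) + list(np.asarray(x_future, dtype=float))
    T = len(y_hist)
    preds = []
    for h in range(len(x_future)):
        t = T + h
        row = [1.0]
        row += [y_hist[t - i] for i in range(1, m + 1)]
        row += [x_all[t - j] for j in range(0, n + 1)]
        y_hat = float(np.dot(coef, row))
        preds.append(y_hat)
        y_hist.append(y_hat)
    return np.array(preds)


# Simulated example data (hypothetical): an ARDL(2, 1) process with small noise.
rng = np.random.default_rng(0)
T = 500
tweets = rng.normal(size=T)
sales = np.zeros(T)
for t in range(2, T):
    sales[t] = (0.5 * sales[t - 1] + 0.2 * sales[t - 2]
                + 1.0 * tweets[t] + 0.5 * tweets[t - 1]
                + rng.normal(scale=0.1))

m, n = select_lags(sales, tweets, max_lag=3)
coef, resid = fit_ardl(sales, tweets, m, n)
future_tweets = rng.normal(size=5)   # assumed known or separately forecast
preds = forecast(sales, tweets, future_tweets, coef, m, n)
print("selected (m, n):", (m, n))
print("coefficients:", np.round(coef, 3))
print("forecasts:", np.round(preds, 3))
```

The coefficient vector is ordered as $[\delta, \alpha_1, \ldots, \alpha_m, \beta_0, \ldots, \beta_n]$, matching the compact form in point 1. If future values of $x$ are not available, they have to be forecast separately (or the contemporaneous term dropped) before `forecast` can be used.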
