Solved – Outlier treatment in Vector Autoregression (VAR) Model

multivariate analysis, outliers, r, time series, vector-autoregression

Data: Multivariate time series with two series:

  1. Demand of a product
  2. Rainfall

Both are available at monthly level from 2010 to 2013.

Approach: I am trying to estimate the effect of rainfall on demand for the product using a VAR (vector autoregression) model. The demand data has some outliers, such as a month of sudden high demand followed by zero values.

Question: How should I treat these outliers (I am working in R)? I already have little data, so deleting them is not an option for me.

Best Answer

Why would you use a VAR on product demand and rainfall? A VAR treats every variable as endogenous, so it allows the impact to go both ways, and I find it unusual to assume that demand for the product causes rainfall. It's not entirely impossible, of course. After all, our farming impacts weather, according to global warming alarmists. However, methinks that's not what you are trying to say in your case.
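To make the point concrete, here is a sketch of a bivariate VAR(1) for these two series, in assumed notation ($y_t$ for demand, $x_t$ for rainfall, one lag chosen only for brevity). It has one equation per variable:

$$
\begin{aligned}
y_t &= c_1 + a_{11}\, y_{t-1} + a_{12}\, x_{t-1} + \varepsilon_{1t} \\
x_t &= c_2 + a_{21}\, y_{t-1} + a_{22}\, x_{t-1} + \varepsilon_{2t}
\end{aligned}
$$

The second equation is the odd one: through $a_{21}$ it lets last month's demand help explain this month's rainfall. A model that treats rainfall as exogenous keeps only an equation for $y_t$ and takes $x_t$ as given.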

That's why I'd start with ARIMAX models; here's a MATLAB example. In R there's the astsa package with similar functionality. In ARIMAX, the X stands for the exogenous time series, which in your example would be rainfall. Your dependent variable would be demand. This is a univariate setup, which is much simpler and makes more sense, in my opinion.
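As a rough illustration of the idea (in Python rather than R, using only the standard library), the sketch below fits a stripped-down stand-in for ARIMAX: demand regressed on its own lag and on rainfall by ordinary least squares, with an optional 0/1 pulse dummy for one outlier month. The data are synthetic and noise-free, and every name and number in it (`spike_month`, the coefficients, the starting value) is an assumption made for the example, not something taken from the question. A pulse dummy is one common way to keep an outlier month in the sample without letting it drive the other coefficients; it is not necessarily what astsa or MATLAB do internally.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic monthly data: 48 months, as in 2010-2013 (all values are made up).
rng = random.Random(0)
n_months = 48
rain = [rng.uniform(0.0, 10.0) for _ in range(n_months)]
spike_month = 20  # hypothetical month with a sudden jump in demand
true_const, true_phi, true_beta, true_spike = 5.0, 0.5, 2.0, 60.0

demand = [20.0]  # arbitrary starting level
for t in range(1, n_months):
    pulse = 1.0 if t == spike_month else 0.0
    demand.append(true_const + true_phi * demand[t - 1]
                  + true_beta * rain[t] + true_spike * pulse)


def fit(with_dummy):
    """Regress demand_t on [1, demand_{t-1}, rain_t] plus an optional pulse dummy."""
    rows, y = [], []
    for t in range(1, n_months):
        row = [1.0, demand[t - 1], rain[t]]
        if with_dummy:
            row.append(1.0 if t == spike_month else 0.0)
        rows.append(row)
        y.append(demand[t])
    return ols(rows, y)


coef_plain = fit(with_dummy=False)  # outlier month left untreated
coef_dummy = fit(with_dummy=True)   # outlier month absorbed by the dummy

print("rainfall effect, outlier ignored:", round(coef_plain[2], 3))
print("rainfall effect, pulse dummy:    ", round(coef_dummy[2], 3))
```

Because the synthetic data contain no noise, the fit with the dummy recovers the rainfall coefficient used to generate them (2.0), while the fit without it has to spread the spike over the remaining coefficients. On real data you would use a proper ARIMA routine instead of hand-rolled least squares; in R, the same idea corresponds to passing rainfall and the dummy as columns of `xreg` to `arima()`.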

One thing you have to be aware of is causality. It's often very difficult to establish. What if your product demand is not driven by rainfall, but by something else that is linked to the annual cycle, which in turn is correlated with rainfall? So the fact that the betas are significant doesn't automatically mean that rainfall causes or impacts demand.

There are statistical methods for testing causality in the sense of Granger, for instance.
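For reference, the usual form of such a test for "does rainfall help predict demand?" compares two regressions. The notation is assumed here: $y_t$ is demand, $x_t$ is rainfall, $p$ is the chosen lag order, and $T$ is the number of usable observations.

$$
\begin{aligned}
\text{restricted:}\quad y_t &= a_0 + \sum_{i=1}^{p} a_i\, y_{t-i} + e_t \\
\text{unrestricted:}\quad y_t &= a_0 + \sum_{i=1}^{p} a_i\, y_{t-i} + \sum_{i=1}^{p} b_i\, x_{t-i} + u_t
\end{aligned}
$$

The null hypothesis is $b_1 = \dots = b_p = 0$, tested with

$$
F = \frac{(\mathrm{RSS}_r - \mathrm{RSS}_u)/p}{\mathrm{RSS}_u/(T - 2p - 1)},
$$

where $\mathrm{RSS}_r$ and $\mathrm{RSS}_u$ are the residual sums of squares of the restricted and unrestricted regressions. Rejecting the null says only that lagged rainfall improves the prediction of demand, not that rainfall causes it. In R, `grangertest()` from the lmtest package runs a test of this form.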

However, I would rely less on stats and more on your underlying theory or domain knowledge. Let's say we're talking about umbrellas. Clearly, one would expect demand to depend on rainfall.