Solved – the difference between a one-sided filter and a two-sided filter when looking at time series analysis

Tags: filter, time series

I'm looking to understand the difference between the two and grasp in which situations each might be preferred over the other.

Best Answer

I'm encountering this question 2 years after it was asked, but I'm just entering the time series scene now and have been reading a lot on it in the past month. My answer is simply my intuitive take-aways, so I'm sure that more experienced practitioners will chime in with refinements. I'm hoping they will, because it would certainly confirm or clarify my own take on it.

Since time series analysis typically looks to forecast, it seeks dependencies on past values as predictors. So those filters are one-sided. If you have both autoregressive and moving average components, it looks like a FIR + IIR system, and I'm not sure whether the equivalent system filter is one-sided (causal), especially in the multivariate case. In fact, I don't see too many treatments of time series systems as if they were DSP systems with filtering. I recall one example in the Matlab Econometrics tutorials where a filtering command takes a time series model and applies it to input data.
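To make the one-sided vs. two-sided distinction concrete, here's a rough numpy sketch I put together (the toy random-walk series and the 5-point window are just illustrative choices on my part): the one-sided filter only ever looks backward, so it's usable for forecasting, while the two-sided filter is centered and also draws on future values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=200))   # toy series: a random walk

window = 5
weights = np.ones(window) / window    # equal weights, for simplicity

# One-sided (causal) moving average: y[t] uses x[t], x[t-1], ..., x[t-window+1].
# (The first few outputs are partial sums, an edge effect.)
one_sided = np.convolve(x, weights, mode="full")[:len(x)]

# Two-sided (non-causal) moving average: y[t] is centered on t,
# so it also uses x[t+1], x[t+2], ... -- fine for smoothing in hindsight,
# but it peeks at the future, which rules it out for forecasting.
two_sided = np.convolve(x, weights, mode="same")

print(one_sided[-5:])
print(two_sided[-5:])
```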

Compared to the discrete-time equations from signal flow graphs and block diagrams in DSP, the motivation in time series for analyzing (one-sided!) polynomial lag equations seems to be to locate the roots/poles in order to assess invertibility, causality, stationarity, and spectral characteristics. You defy causality when you have coefficients representing dependence on future values.
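As a small sketch of that root-locating exercise (the AR(2) and MA(1) coefficients below are made up for illustration): stationarity/causality asks that the roots of the AR lag polynomial lie outside the unit circle, and invertibility asks the same of the MA lag polynomial.

```python
import numpy as np

# AR(2): x[t] = 0.5*x[t-1] + 0.3*x[t-2] + e[t]
# Lag polynomial: 1 - 0.5*L - 0.3*L^2  (np.roots wants highest power first)
ar_roots = np.roots([-0.3, -0.5, 1.0])
print("AR roots:", ar_roots, "stationary:", np.all(np.abs(ar_roots) > 1))

# MA(1): x[t] = e[t] + 0.4*e[t-1]
# Lag polynomial: 1 + 0.4*L
ma_roots = np.roots([0.4, 1.0])
print("MA roots:", ma_roots, "invertible:", np.all(np.abs(ma_roots) > 1))
```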

For signal processing, you can still apply a symmetric weighted-average filter to a delayed version of the signal without violating causality, but I suspect that you don't want to hand-craft the moving average portion of a time series model, because the coefficients represent reality; that is the information you're trying to discover. In contrast, in DSP you remove information by filtering out broadband noise or adjacent-channel content and focus only on the information of interest, so you want to craft your weights based on your spectral band of interest.

From what I've seen for univariate time series, you would do much of this up front in the detrending, seasonality removal, and unit root detection (I view the latter as detecting "noise" at 0 Hertz). Basically, you're identifying and separating out the deterministic features of interest to get white-noise residuals. In so doing, you maximize the modelling of deterministic dependencies, leaving the stochastic component to represent that which you truly cannot determine.
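Here's a toy illustration of that "unit root as noise at 0 Hertz" remark, with made-up data: a random walk has a unit root, and first-differencing removes it, leaving (approximately) white-noise residuals whose lag-1 autocorrelation is near zero.

```python
import numpy as np

rng = np.random.default_rng(2)
e = rng.normal(size=500)
x = np.cumsum(e)          # random walk: x[t] = x[t-1] + e[t], i.e. a unit root

dx = np.diff(x)           # first difference removes the unit root

def lag1_autocorr(z):
    # sample lag-1 autocorrelation of a series
    z = z - z.mean()
    return np.dot(z[:-1], z[1:]) / np.dot(z, z)

# Near 1 for the level series, near 0 after differencing.
print("level series lag-1 autocorr:", round(lag1_autocorr(x), 3))
print("differenced  lag-1 autocorr:", round(lag1_autocorr(dx), 3))
```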
