Solved – Difference between Multivariate Time Series data and Panel Data

Tags: cross-section, panel data, terminology, time series

Recently I got mixed responses about the difference between multivariate time series data and panel data. I completely understand the difference between cross-sectional data, time series data and panel data. But it is sometimes difficult to distinguish between panel data and multivariate time series data.

For example, consider daily closing prices over the last year for 10 companies. Is this panel data or just multivariate time series data? I have found this type of data categorised as multivariate time series in time series books, whereas some econometrics books categorise it as panel data.

How can we statistically categorise data as multivariate time series or panel data?

Best Answer

In short, there is no such thing as Multivariate Time Series data. The only classic data types out there are: Cross Sections, Time Series, Pooled Cross Sections, and Panel data.

Panel data is multidimensional. Time series data is one-dimensional. Time series data can be seen as a special case of panel data: a panel with only one unit. Daily closing prices over the last year for one company form a time series dataset because the time variable alone uniquely identifies each observation. I will work with only 8 days' worth of prices instead of a year, but the idea is the same.

[Image not available: example of a time series dataset for one company]

Daily closing prices over the last year for 10 companies can be either a panel dataset or a time series dataset, depending on whether the data is in wide or long format.

If the data is organized so that Time still uniquely identifies each observation (i.e. wide format), then it is still a time series dataset, with a separate column for the daily closing price of each of the 10 companies. See example below:

[Image not available: example of the wide format]
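As a rough illustration of the wide layout, here is a minimal pandas sketch. The prices are made up, the company labels A, B and C are hypothetical, and only three companies are shown instead of ten to keep it short.

```python
import pandas as pd

# Made-up closing prices for illustration only; company labels are hypothetical.
dates = pd.date_range("2024-01-01", periods=8, freq="D", name="date")
wide = pd.DataFrame(
    {
        "close_A": [10.0, 10.2, 10.1, 10.4, 10.3, 10.6, 10.5, 10.8],
        "close_B": [20.0, 19.8, 20.1, 20.3, 20.2, 20.0, 20.4, 20.6],
        "close_C": [5.0, 5.1, 5.1, 5.0, 5.2, 5.3, 5.2, 5.4],
    },
    index=dates,
)

# One row per date and one column per company:
# the date alone identifies each observation.
print(wide)
print(wide.index.is_unique)
```

Adding more companies only adds columns; the number of rows stays equal to the number of dates.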

If, on the other hand, it is organized so that Time alone no longer uniquely identifies each observation (i.e. long format) then it is a panel dataset. With 10 companies, you need to use Time and Company ID together to uniquely identify each observation. See example below:

[Image not available: example of the long format]
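The long layout can be sketched the same way. Again the prices are made up and the company labels are hypothetical; two companies are enough to show that the date alone stops being a unique key.

```python
import pandas as pd

# Made-up closing prices for illustration only; company labels are hypothetical.
dates = pd.date_range("2024-01-01", periods=8, freq="D")
long_df = pd.DataFrame(
    {
        "date": list(dates) * 2,
        "company": ["A"] * 8 + ["B"] * 8,
        "close": [10.0, 10.2, 10.1, 10.4, 10.3, 10.6, 10.5, 10.8]
        + [20.0, 19.8, 20.1, 20.3, 20.2, 20.0, 20.4, 20.6],
    }
)

# Each date now appears once per company, so date alone is not a unique key.
date_is_unique = not long_df["date"].duplicated().any()

# The (date, company) pair is unique for every row.
pair_is_unique = not long_df.duplicated(subset=["date", "company"]).any()

# A two-level index makes the panel structure explicit.
panel = long_df.set_index(["company", "date"]).sort_index()

print(date_is_unique, pair_is_unique)
print(panel.head())
```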

In most statistical software it is straightforward to reshape a dataset from wide to long and from long to wide.
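In pandas, for instance, the reshape could look like the sketch below, using `melt` for wide to long and `pivot` for long to wide. The data and the column names (`company`, `close`) are made up for illustration.

```python
import pandas as pd

# Made-up wide dataset: one row per date, one column per (hypothetical) company.
wide = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=3, freq="D"),
        "A": [10.0, 10.2, 10.1],
        "B": [20.0, 19.8, 20.1],
    }
)

# Wide -> long: one row per (date, company) pair.
long_df = wide.melt(id_vars="date", var_name="company", value_name="close")

# Long -> wide: back to one row per date, one column per company.
back = long_df.pivot(index="date", columns="company", values="close")

print(long_df)
print(back)
```

The values are identical in both layouts; only the key that identifies a row changes.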

The term Multivariate Time Series is in common use, but it refers not to a type of dataset but to a type of analysis. To be more exact, it refers to time series regression analysis in which there is more than one response variable. (Make sure to distinguish it from Multiple Time Series regression, which refers to a regression with one response variable and several predictor variables.) One example of Multivariate Time Series Analysis is VAR (Vector AutoRegression).
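To make "more than one response variable" concrete, here is the standard textbook form of a VAR with one lag for two series, written out as a sketch. The symbols (y for the series, c for intercepts, a for coefficients, epsilon for errors) are my own notation, not taken from any particular source.

```latex
\begin{aligned}
y_{1,t} &= c_1 + a_{11}\, y_{1,t-1} + a_{12}\, y_{2,t-1} + \varepsilon_{1,t} \\
y_{2,t} &= c_2 + a_{21}\, y_{1,t-1} + a_{22}\, y_{2,t-1} + \varepsilon_{2,t}
\end{aligned}
\qquad\Longleftrightarrow\qquad
\mathbf{y}_t = \mathbf{c} + A_1\, \mathbf{y}_{t-1} + \boldsymbol{\varepsilon}_t
```

Both series appear on the left-hand side, each regressed on the lagged values of itself and of the other series. With the closing-price example, y_1 and y_2 would be the price columns of two companies in the wide layout.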


More on VAR here: https://en.wikipedia.org/wiki/Vector_autoregression

I hope this helps.
