What is the best way to describe the difference between a repeated cross-sectional and panel dataset?
Consider the example of measuring blood pressure for a set of patients $i\in\{1,\dots,N\}$.
Repeated cross-sectional data in this case would mean that the hospital has blood pressure records for Sam and Suzy at different points in time, not necessarily the same ones. For example, Sam might have readings on Monday and Tuesday, and Suzy on Friday and Saturday.
In a panel data setup, the hospital has 3 months' worth of blood pressure data on everyone, but $N$ is much larger than the number of time periods, so it is "longitudinal" in this sense.
Does this example portray the difference between a repeated cross-sectional and a panel dataset? Why or why not? What might be a better example to explain the difference(s)?
Best Answer
I think the terms are often used interchangeably depending on the field, but the way I think of it is that panel data explicitly views the repeated measurements as occurring through time, whereas repeated cross-sectional data can be repeated along some arbitrary dimension. Your example sounds pretty good to me.
To expand on it, let's say you have observational data consisting of daily blood pressure measurements for 100 patients over 3 months. For a single patient $i$, we can think of their data as a time series with 3 months' worth of daily observations. This is panel data.
On the other hand, perhaps some patients only recorded measurements every other day while others recorded them several times a day. The data are still repeated through "time", but the time index is somewhat meaningless now, since the $j$th observation for two different patients may correspond to entirely different timestamps.
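To make the contrast concrete, here is a minimal sketch in plain Python. The start date, the six-observation window, and the helper `index_is_shared` are all made up for illustration. It only checks whether the $j$th observation falls on the same date for every patient.

```python
from datetime import date, timedelta

start = date(2024, 1, 1)  # hypothetical start date, chosen for illustration

# Panel layout: every patient is measured on the same daily grid.
panel = {
    patient: [start + timedelta(days=d) for d in range(6)]
    for patient in ("Sam", "Suzy")
}

# Irregular layout: Sam measures every other day, Suzy twice a day.
irregular = {
    "Sam": [start + timedelta(days=2 * k) for k in range(6)],
    "Suzy": [start + timedelta(days=k // 2) for k in range(6)],
}

def index_is_shared(data):
    """Return True if the j-th observation has the same date for every patient."""
    return all(len(set(dates)) == 1 for dates in zip(*data.values()))

panel_aligned = index_is_shared(panel)
irregular_aligned = index_is_shared(irregular)
print(panel_aligned, irregular_aligned)
```

In the first layout the observation counter $j$ doubles as a calendar index. In the second it does not, so comparing "observation 3" across patients mixes different days.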
The practical implication is that if we wished to include time effects such as seasonality (perhaps there is an overall increase in blood pressure in the population over summer due to increased consumption of hot dogs), panel data can allow for identification whereas repeated cross-sectional data cannot.
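As a rough illustration of why a shared time index matters, the toy example below assumes a noise-free additive model $y_{it} = a_i + g_t$, with invented patient baselines and month effects (none of these numbers come from real data). With a balanced panel, subtracting each patient's own mean and then averaging within each month recovers the month effects up to a constant.

```python
# Hypothetical values: patient baselines a_i and a common monthly effect g_t.
baseline = {"Sam": 120.0, "Suzy": 110.0, "Ann": 130.0}
month_effect = {"May": 0.0, "Jun": 2.0, "Jul": 4.0}  # assumed summer rise

# Balanced panel: every patient is observed in every month, y_it = a_i + g_t.
panel = {
    (p, m): a + g
    for p, a in baseline.items()
    for m, g in month_effect.items()
}

# Remove each patient's own mean (the "within" transformation).
patient_mean = {
    p: sum(panel[p, m] for m in month_effect) / len(month_effect)
    for p in baseline
}

# Average the demeaned values across patients within each month.
estimated = {
    m: sum(panel[p, m] - patient_mean[p] for p in baseline) / len(baseline)
    for m in month_effect
}
print(estimated)
```

The estimates equal the assumed month effects minus their average, so the summer rise shows up as an increasing pattern. If the data carried only an observation counter instead of a shared month label, there would be no common column to average over in the last step.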