Comparison of longitudinal data in the same group of participants using two entirely different tests

Tags: dataset, multivariate analysis, panel data, spss

I have a dataset of participants who were followed up longitudinally at two time points and went through a series of measurements at both. Although the overall domain of function measured was the same at both time points, different, age-appropriate tests had to be used to determine the highest level of function attained at each age.

My questions are:

  1. How can I compare the values of these tests and quantify the development of the function as a whole? I do not see any correlation between the two sets of scores, and the data are not normally distributed at either age.

  2. What is the best method to evaluate the effect of an intervention or observe the rate of development in two groups?

I am analysing my data in SPSS, so any suggestions on methods that can be run in SPSS would be very helpful.

Best Answer

As long as the scores from the age-specific tests are directly comparable across ages, there are a couple of panel methods you can use. One of the simpler ones is the difference-in-differences (DiD) method. With DiD, you take the difference between two differences: the change in Group A's values from time 1 to time 2, minus the change in Group B's values from time 1 to time 2. Hence the name "difference-in-differences."

Visualizing what you are calculating usually helps, so here's what the comparison looks like conceptually:

Difference-in-Differences illustration from Gertler et al. (2010)

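The DiD estimator measures the impact of the treatment (circled in red in the illustration). In the illustration's notation, where A and B are the intervention group's outcomes before and after, and C and D are the comparison group's outcomes before and after, the DiD estimator is calculated as: $DiD = (B-A)-(D-C)$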
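To make the arithmetic concrete, here is a minimal sketch that computes $(B-A)-(D-C)$ from group means. The scores are invented purely for illustration, the function name `did_estimate` is hypothetical, and the labelling assumes A and B are the intervention group before and after while C and D are the comparison group before and after.

```python
from statistics import mean

# Hypothetical scores, invented for illustration only.
intervention_t1 = [10, 12, 11, 13]  # A: intervention group, time 1
intervention_t2 = [16, 18, 17, 19]  # B: intervention group, time 2
control_t1 = [9, 11, 10, 12]        # C: comparison group, time 1
control_t2 = [11, 13, 12, 14]       # D: comparison group, time 2


def did_estimate(treat_before, treat_after, ctrl_before, ctrl_after):
    """Return (B - A) - (D - C), computed from the four group means."""
    change_treated = mean(treat_after) - mean(treat_before)  # B - A
    change_control = mean(ctrl_after) - mean(ctrl_before)    # D - C
    return change_treated - change_control


did = did_estimate(intervention_t1, intervention_t2, control_t1, control_t2)
print(did)  # (17.5 - 11.5) - (12.5 - 10.5) = 4.0
```

The comparison group's change (2 points here) stands in for what would have happened to the intervention group without the intervention, so only the remaining 4 points are attributed to it.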

The good news is, this estimation is quite simple. The "impact indicator" is an interaction term of the time period (time 2 or time 1) and the study group (intervention or control). You can fit the model:

$Outcome = \beta_0 + \beta_1Time + \beta_2Group + \beta_3Time*Group +\epsilon$

and the coefficient on $Time*Group$ measures the impact of your intervention, while the p-value on that coefficient tells you whether that impact is statistically significant. Basically, you're just fitting a "regular" OLS model in your statistical package. If your data are panel data, transform them from "wide" to "long" format. That is, each person should have two observations: one that contains values at time 1 and one that contains values at time 2. To illustrate, this is what your data would look like (with more variables, of course):
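As a rough illustration of the regression route, the sketch below fits the model by ordinary least squares with NumPy on made-up long-format data. It only shows that the interaction coefficient reproduces the difference-in-differences of the group means. The data, the 0/1 coding, and the variable names are assumptions for the example, and it does not compute standard errors or p-values.

```python
import numpy as np

# Hypothetical long-format data: one entry per person per time point.
# time:  0 = time 1, 1 = time 2
# group: 0 = control, 1 = intervention
time = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
group = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1], dtype=float)
outcome = np.array([9, 11, 10, 12, 11, 13, 12, 14,
                    10, 12, 11, 13, 16, 18, 17, 19], dtype=float)

# Design matrix: intercept, Time, Group, and the Time*Group interaction.
X = np.column_stack([np.ones_like(time), time, group, time * group])

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
b0, b_time, b_group, b_did = beta

print(f"beta_0 (control mean at time 1):  {b0:.2f}")
print(f"beta_1 (change in control group): {b_time:.2f}")
print(f"beta_2 (group gap at time 1):     {b_group:.2f}")
print(f"beta_3 (DiD estimate):            {b_did:.2f}")
```

With only two groups and two time points the model is saturated, so each coefficient is a simple combination of the four cell means, and $\beta_3$ equals $(B-A)-(D-C)$.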

Data in "long" format
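Since the original image is not reproduced here, the following sketch shows one way the wide-to-long reshaping could look, using only the Python standard library. The column names (`id`, `group`, `score_t1`, `score_t2`) and the values are hypothetical. In SPSS the equivalent step would be done with its own restructuring tools.

```python
# Hypothetical wide-format data: one row per person, one column per time point.
wide = [
    {"id": 1, "group": 0, "score_t1": 9, "score_t2": 11},
    {"id": 2, "group": 1, "score_t1": 10, "score_t2": 16},
]


def wide_to_long(rows):
    """Turn one row per person into one row per person per time point."""
    long_rows = []
    for row in rows:
        # time is coded 0 for time 1 and 1 for time 2.
        for time, column in ((0, "score_t1"), (1, "score_t2")):
            long_rows.append({
                "id": row["id"],        # person identifier, repeated on both rows
                "group": row["group"],  # 0 = control, 1 = intervention
                "time": time,
                "outcome": row[column],
            })
    return long_rows


long_data = wide_to_long(wide)
for record in long_data:
    print(record)
```

Each person now contributes two rows that share the same `id`, which is the identifier later used to cluster the standard errors.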

Take care, though, to use cluster-robust standard errors if you are using true panel data instead of pooled cross sections. Be sure you have a unique identifier for each person, so that your statistical package knows which variable to cluster the standard errors on. When you have a true panel, as is the case here, the two measurements from the same person are not independent, and ignoring that clustering within the individual would artificially drive your standard errors down. Failing to account for the panel structure makes your standard errors wrong, which can lead you to wrong conclusions about whether your program or intervention was effective. Check the SPSS documentation, or search online, for how to cluster standard errors.
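To show what clustering on the person identifier does mechanically, here is a hedged sketch of a cluster-robust (sandwich) variance estimator written directly in NumPy. The function name `ols_cluster_se` and the data are hypothetical, and no small-sample correction is applied, so the numbers will differ somewhat from what a statistical package reports.

```python
import numpy as np


def ols_cluster_se(X, y, cluster_ids):
    """OLS coefficients with cluster-robust standard errors (no small-sample correction)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)

    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # "Meat" of the sandwich: sum over clusters of (X_g' u_g)(X_g' u_g)'.
    k = X.shape[1]
    meat = np.zeros((k, k))
    for cid in np.unique(cluster_ids):
        rows = cluster_ids == cid
        score = X[rows].T @ resid[rows]
        meat += np.outer(score, score)

    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))


# Hypothetical long-format panel: 8 people, each measured at time 1 and time 2.
person = np.array([1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8])
time = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
group = np.array([0] * 8 + [1] * 8, dtype=float)
outcome = np.array([9, 11, 10, 12, 11, 12, 13, 14,
                    10, 12, 11, 13, 16, 17, 18, 19], dtype=float)

X = np.column_stack([np.ones_like(time), time, group, time * group])
beta, se = ols_cluster_se(X, outcome, person)
print(f"DiD estimate: {beta[3]:.3f}, cluster-robust SE: {se[3]:.3f}")
```

The key design choice is that residuals are summed within each person before being squared, so the two observations from one participant are treated as a single correlated unit rather than as two independent data points.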

For more information on DiD, you can check out Gertler et al.'s (2010) Impact Evaluation in Practice for a "beginner's guide" (link to web resource in the References section below). More detailed treatments can be found in Wooldridge (2002), Wooldridge (2009), and Angrist & Pischke (2009). They are all very good resources for statisticians at all levels of experience.

References

  1. Angrist, J. D., & Pischke, J.-S. (2009). Mostly harmless econometrics: An empiricist’s companion. Princeton, NJ: Princeton University Press.
  2. Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B., & Vermeersch, C. M. J. (2010). Impact Evaluation in Practice. The World Bank. http://doi.org/10.1596/978-0-8213-8541-8
  3. Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA, USA: The MIT Press.
  4. Wooldridge, J. M. (2009). Introductory Econometrics: A Modern Approach (4th ed.). Mason, OH, USA: South-Western, Cengage Learning.