Solved – How to test for correlation between frequency of an event and the stock market

clustering, frequency

I am currently running an event study, for which I need to find out whether my events are clustered and/or whether their frequency is tied to the stock market. A scatter plot of the event dates against the stock market index gives me a rough idea that they are correlated (points are denser during market upswings and sparser in downturns), but I do not know how to show this at a statistically significant level.

During my search I came across Seemingly Unrelated Regression, but I am not sure whether it is the right tool or, if it is, how to use it. I have attached the scatter plot; the x-axis is the date of the event and the y-axis is the S&P 500 index at the time of the event.

If you know how I can quantify this, please let me know, as I would like to use something more solid than "look at the graph".

Much appreciated

Later edit: to give a better idea of how my data are structured, I have only two variables: date, which is the date of the event, and sp500, which is the level of the S&P 500 on each date. From these two, I am trying to see whether the dates are clustered (events happen at the same time or close to each other) when sp500 is high, and vice versa.

[Figure: scatter plot of event dates (x-axis) against the S&P 500 index at the time of each event (y-axis)]

Best Answer

If you just want to show that there is a statistically significant dependence between the occurrence of the event and the sign of the change in the stock market, you can model it as follows. Create two binary variables for each day of your observation period:

  • $X_t=1$ if the stock market rose on day $t$, 0 otherwise

  • $Y_t=1$ if the event occurred on day $t$, 0 otherwise

You can then check the independence of variables $X$ and $Y$ using a chi-squared test.
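As a rough illustration of the test above, here is a minimal sketch in plain Python. The function name `chi2_independence_2x2` and the two short series are made up for the example; they are not from the question's data. It computes the Pearson statistic without a continuity correction and uses the fact that, for one degree of freedom, the chi-squared tail probability equals `erfc(sqrt(stat / 2))`.

```python
import math

def chi2_independence_2x2(x, y):
    """Pearson chi-squared test of independence for two 0/1 sequences.

    Returns (statistic, p_value), 1 degree of freedom, no continuity
    correction.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    # Observed counts: table[i][j] = number of days with X = i and Y = j
    table = [[0, 0], [0, 0]]
    for xi, yi in zip(x, y):
        table[xi][yi] += 1
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    if 0 in row or 0 in col:
        raise ValueError("each variable must take both values at least once")
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n  # count expected under independence
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical illustration data: 1 = market rose / event occurred on day t
market_up = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
event     = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1]
stat, p = chi2_independence_2x2(market_up, event)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

With only a dozen days the expected cell counts are too small for the chi-squared approximation to be trusted; the toy data only shows the mechanics. On a real sample, a library routine such as `scipy.stats.chi2_contingency` does the same job (note that it applies Yates' continuity correction to 2x2 tables by default).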

If you want to get fancier, you can also fit a linear regression model: regress the magnitude of the stock market change on $Y$ and use the t-test p-value of $Y$'s coefficient. If you want to account for the number of occurrences on each day (rather than just 0 or 1), you can fit a Poisson regression of the occurrence counts, using the S&P change as the explanatory variable.

NB: neither the chi-squared test nor the regressions take the time-series aspect of the data into account; they treat all days in the dataset as if they were drawn at random from the whole period. Therefore they ignore the possibility that the occurrence of the event also depends on whether it occurred on the previous days.
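To make the Poisson option concrete, the sketch below fits $\log E[N_t] = b_0 + b_1 r_t$ by Newton-Raphson in plain Python, where $N_t$ is the number of events on day $t$ and $r_t$ is the S&P change. The function name `fit_poisson`, the variable names, and the ten data points are assumptions made for illustration only. The p-value is a two-sided Wald test of $b_1 = 0$.

```python
import math

def fit_poisson(x, counts, max_iter=50, tol=1e-10):
    """Fit log E[count] = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, se_b1, p_value); p_value is the two-sided Wald
    test of b1 = 0.
    """
    def information(b0, b1):
        # Fitted means and the entries of the 2x2 Fisher information matrix
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        return mu, h00, h01, h11

    b0 = math.log(sum(counts) / len(counts))  # start at the intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu, h00, h01, h11 = information(b0, b1)
        # Score vector (gradient of the log-likelihood)
        g0 = sum(c - m for c, m in zip(counts, mu))
        g1 = sum((c - m) * xi for c, m, xi in zip(counts, mu, x))
        # Newton step: solve the 2x2 system H * step = g explicitly
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard error of b1 from the inverse information at the estimate
    _, h00, h01, h11 = information(b0, b1)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return b0, b1, se_b1, p_value

# Hypothetical illustration data: daily S&P change (%) and events per day
sp_change = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
n_events  = [0, 0, 1, 0, 1, 2, 1, 3, 2, 4]
b0, b1, se_b1, p = fit_poisson(sp_change, n_events)
print(f"b1 = {b1:.3f} (se {se_b1:.3f}), p = {p:.4f}")
```

A positive $b_1$ means more events are expected on days with larger market gains; $\exp(b_1)$ is the multiplicative change in the expected count per unit of S&P change. In practice a statistics package's GLM routine would normally replace this hand-rolled fit.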
