Solved – The econometrics of a Bayesian approach to event study methodology

Event studies are widespread in economics and finance to determine the effect of an event on a stock price, but they are almost always based on frequentist reasoning. An OLS regression — over a reference period which is distinct from the event window — is usually used to determine the parameters required to model the normal return for an asset. One then determines the statistical significance of cumulative abnormal returns ($\text{CAR}$) on asset $i$ following an event during a specified event window from $T_1$ to $T_2$. A hypothesis test is used to determine whether these returns are significant and thus indeed abnormal or not. Thus:

$H_0 : \text{CAR}_i = 0$, where

$\text{CAR}_i = \sum_{t=T_1}^{T_2} \text{AR}_{i,t} = \sum_{t=T_1}^{T_2}
\left( r_{i,t} -\mathbb{E}[r_{i,t}]
\right)$, and

$\mathbb{E}[r_{i,t}]$ is the return on the asset predicted by the model.
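
For concreteness, the frequentist procedure above might be sketched as follows in Python with NumPy. This is only an illustration under assumptions not stated in the question: the normal-return model is taken to be the market model (asset return regressed on a market return), the function name `market_model_car` and its arguments are hypothetical, and the test statistic ignores parameter-estimation error in the variance of the CAR.

```python
import numpy as np

def market_model_car(r_ref, m_ref, r_evt, m_evt):
    """Fit an OLS market model on the reference period, then compute the CAR.

    r_ref, m_ref: asset and market returns over the reference period.
    r_evt, m_evt: asset and market returns over the event window T_1..T_2.
    (Assumed inputs; the market model is an illustrative choice.)
    """
    X_ref = np.column_stack([np.ones_like(m_ref), m_ref])
    coef, *_ = np.linalg.lstsq(X_ref, r_ref, rcond=None)

    # Residual variance from the reference period (two fitted parameters).
    resid = r_ref - X_ref @ coef
    sigma2 = resid @ resid / (len(r_ref) - 2)

    # Abnormal returns: realised return minus the model's predicted return.
    X_evt = np.column_stack([np.ones_like(m_evt), m_evt])
    ar = r_evt - X_evt @ coef
    car = ar.sum()

    # Simplified statistic for H_0: CAR = 0, treating the fitted
    # parameters as known and the abnormal returns as independent.
    t_stat = car / np.sqrt(len(r_evt) * sigma2)
    return car, t_stat
```

The returned statistic would then be compared with a normal or Student-t critical value, which is exactly the step whose justification is in doubt for a single firm and a single event.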

If the number of observations is large enough, we can rely on asymptotic normality of the test statistic, but this approximation may not hold for a smaller sample size.

It can be argued that because of this, single-firm, single-event studies (as required for example in litigation) should follow a Bayesian approach, because the assumption of infinitely many repetitions is much "further from being verified" than in the case of multiple firms. Yet, the frequentist approach remains common practice.

Given the scarce literature on this subject, my question is how to best approach an event study — analogous to the methodology outlined above and summarised in MacKinlay, 1997 — using a Bayesian approach.

Although this question arises within the context of empirical corporate finance, it is really about the econometrics of Bayesian regression and inference, and the differences in reasoning behind frequentist and Bayesian approaches. Specifically:

How should I best approach the estimation of the model parameters using a Bayesian approach (assuming a theoretical understanding of Bayesian statistics, but little to no experience in using it for empirical research)?
How do I test for statistical significance, once cumulative abnormal returns have been computed (using the normal returns from the model)?
How can this be implemented in Matlab?

Best Answer

As mentioned in the comments, the model you're looking for is Bayesian linear regression. And since we can use BLR to calculate the posterior predictive distribution $p(r_t|t, \mathcal{D}_\text{ref})$ for any time $t$, we can numerically evaluate the distribution $p(\text{CAR}|\mathcal{D}_\text{event}, \mathcal{D}_\text{ref})$.
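
A minimal sketch of that numerical evaluation, in Python with NumPy, could look like the following. It rests on several assumptions of mine: a zero-mean Gaussian prior on the weights with precision `alpha`, a known noise precision `beta`, and design matrices that already contain an intercept column. It also takes one particular reading of the distribution over $\text{CAR}$, namely the one induced by posterior uncertainty in the regression weights. The function names are hypothetical.

```python
import numpy as np

def blr_posterior(X, y, alpha, beta):
    """Posterior N(m, S) over weights, for prior N(0, I/alpha) and
    Gaussian noise with known precision beta (both assumed)."""
    A = alpha * np.eye(X.shape[1]) + beta * X.T @ X
    S = np.linalg.inv(A)
    m = beta * S @ X.T @ y
    return m, S

def car_samples(X_ref, r_ref, X_evt, r_evt,
                alpha=1.0, beta=1e4, n_draws=10_000, seed=0):
    """Draw CAR values implied by the reference-period posterior."""
    m, S = blr_posterior(X_ref, r_ref, alpha, beta)
    rng = np.random.default_rng(seed)

    # Each row of w is one plausible set of regression weights.
    w = rng.multivariate_normal(m, S, size=n_draws)

    # Normal returns over the event window for every draw: (n_draws, T).
    normal_returns = w @ X_evt.T

    # CAR per draw: sum over the window of realised minus normal return.
    return (r_evt - normal_returns).sum(axis=1)
```

Summaries such as `draws.mean()` or `(draws > 0).mean()` then describe where the mass of the distribution lies.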

The thing is, I don't think a distribution over $\text{CAR}$ is what you really want. The immediate problem is that, for any continuous posterior, the event $\text{CAR} = 0$ has probability zero, so $p(\text{CAR} = 0|\mathcal{D}_\text{event}, \mathcal{D}_\text{ref})$ cannot serve as a test. The underlying problem is that the Bayesian counterpart of a hypothesis test is a comparison of models via their Bayes factor, which requires you to define two competing models. And $\text{CAR} = 0$ and $\text{CAR} \neq 0$ are not models (or at least, they're not models without some extremely unnatural number juggling).

From what you've said in the comments, I think what you actually want to answer is

Are $\mathcal{D}_\text{ref}$ and $\mathcal{D}_\text{event}$ better explained by the same model or by different ones?

which has a neat Bayesian answer: define two models

$M_0$: all the data in $\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}$ is drawn from the same BLR. To calculate the marginal likelihood $p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_0)$ of this model, you'd calculate the marginal likelihood of a BLR fit to all the data.
$M_1$: the data in $\mathcal{D}_\text{ref}$ and $\mathcal{D}_\text{event}$ are drawn from two different BLRs. To calculate the marginal likelihood $p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_1)$ of this model, you'd fit BLRs to $\mathcal{D}_\text{ref}$ and $\mathcal{D}_\text{event}$ independently (though using the same hyperparameters!), then take the product of the two BLR marginal likelihoods.

Having done that, you can then calculate the Bayes factor

$$\frac{p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_1)}{p(\mathcal{D}_\text{ref}, \mathcal{D}_\text{event}|M_0)}$$

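to decide which model is more believable: a value above 1 favours $M_1$ (different models before and during the event), and a value below 1 favours $M_0$.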
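
The comparison can be sketched in Python with NumPy as below. This assumes the simplest conjugate setup, which is my choice and not something fixed by the question: a zero-mean Gaussian prior on the weights with precision `alpha` and a known noise precision `beta`, for which the BLR marginal likelihood has a closed form. The function names are hypothetical, and the default hyperparameter values are placeholders that would need to be chosen for real data.

```python
import numpy as np

def log_evidence(X, y, alpha, beta):
    """Log marginal likelihood of a BLR with prior N(0, I/alpha) on the
    weights and Gaussian noise of known precision beta (both assumed)."""
    n, d = X.shape
    A = alpha * np.eye(d) + beta * X.T @ X
    m = beta * np.linalg.solve(A, X.T @ y)      # posterior mean
    resid = y - X @ m
    fit = 0.5 * beta * resid @ resid + 0.5 * alpha * m @ m
    _, logdet_A = np.linalg.slogdet(A)
    return (0.5 * d * np.log(alpha) + 0.5 * n * np.log(beta)
            - fit - 0.5 * logdet_A - 0.5 * n * np.log(2 * np.pi))

def log_bayes_factor(X_ref, y_ref, X_evt, y_evt, alpha=1.0, beta=1e4):
    """log of p(D | M_1) / p(D | M_0), using the same hyperparameters
    for every fit."""
    # M_1: independent BLRs, so the marginal likelihoods multiply
    # (their logs add).
    log_m1 = (log_evidence(X_ref, y_ref, alpha, beta)
              + log_evidence(X_evt, y_evt, alpha, beta))
    # M_0: a single BLR fit to the pooled data.
    X_all = np.vstack([X_ref, X_evt])
    y_all = np.concatenate([y_ref, y_evt])
    log_m0 = log_evidence(X_all, y_all, alpha, beta)
    return log_m1 - log_m0
```

Working on the log scale avoids underflow when the marginal likelihoods are tiny; a positive result corresponds to a Bayes factor above 1. The result is sensitive to `alpha` and `beta`, so it is worth checking how the conclusion changes across a plausible range of values.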