Do these additional OLS regressions show that High-ESG funds have higher absolute levels of flows in the pre-period, or do they tell us something about the pre-trends?
In my review of this section, the authors are showing mean differences across sustainability fund groups. Note that in Table 3 they estimate separate equations within the different timing epochs. The dummy variables simply indicate whether a fund had a high, above-average, below-average, or low Morningstar sustainability rating as of December 2019. In other words, the estimates reflect average differences across groups within a specified time window; each group is compared to a baseline, namely average (i.e., 3-globe) funds. We can confirm this by reviewing their estimating equation. Equation 2 on page 15 doesn't include any interactions with a post-crisis indicator because they already subdivided the sample by time (e.g., pre- versus post-COVID).
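For intuition, here is a minimal sketch of what estimating those within-period group differences could look like. This is not the authors' code: the data file, the column names (`net_flow`, `rating_dec2019`, `fund_id`, `week`), and the cutoff dates are all assumptions for illustration, and the paper's Equation 2 likely includes controls as well.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical fund-week panel of weekly net flows with each fund's
# December 2019 Morningstar sustainability rating (1-5 globes).
df = pd.read_csv("fund_flows.csv", parse_dates=["week"])  # placeholder file

# Restrict to the pre-crash window (start of 2020 up to February 20).
pre = df[(df["week"] >= "2020-01-01") & (df["week"] < "2020-02-20")]

# 3-globe (average) funds are the omitted baseline, so each rating
# coefficient is the mean flow difference relative to average funds,
# averaged over the pre-crash weeks.
eq2 = smf.ols(
    "net_flow ~ C(rating_dec2019, Treatment(reference=3))",
    data=pre,
).fit(cov_type="cluster", cov_kwds={"groups": pre["fund_id"]})
print(eq2.summary())
```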
Technically, they're averaging across the weeks in each sub-period. In the pre-shock era, for example, all we can say is that high-ESG funds received relatively higher weekly net flows. In my opinion, that is more indicative of higher absolute levels over the weeks from the beginning of the year through the week before the onset of the stock market crash on February 20. Because they're averaging over the pre-shock weeks, we're not observing the week-over-week flow trends. The real work is shown in Figure 1, where we observe relatively parallel weekly average flow trends across sustainability rating groups. Those parallel paths across groups before the shock are what the design requires.
How can you formally test the parallel trends assumption in a generalized difference-in-differences setting like this, where you have weekly panel data and the crisis affects all groups, though with different intensities?
Technically, this is a 'classical' difference-in-differences setting. The treatment epochs (e.g., the "crash" and "stimulus" periods) are well defined, and the post-COVID era affects all funds at the same time. Once the post-treatment indicators are defined in software, we simply interact them with the high/low ESG dummies. Because the shock affects all funds (in other words, all units in the sample are treated), we need a reference group, say the average fund. The coefficients on the interactions of a post-COVID indicator with the high/low ESG dummies estimate how much flows to high- or low-sustainability funds change after the onset of the COVID-19 shock relative to before, as compared to the change for the average fund.
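If you prefer a single pooled regression rather than separate sub-period equations, a minimal sketch of that interaction (same hypothetical column names as above) could look like the following; the interaction coefficients are the difference-in-differences estimates relative to the average (3-globe) fund.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("fund_flows.csv", parse_dates=["week"])  # placeholder file

# Post-treatment indicator: weeks on/after the February 20, 2020 crash.
df["post"] = (df["week"] >= "2020-02-20").astype(int)

# The '*' expands to main effects plus interactions; 3-globe funds are
# the omitted rating category, so each interaction coefficient measures
# how much more (or less) that rating group's flows changed after the
# shock compared with the change for the average fund.
did = smf.ols(
    "net_flow ~ C(rating_dec2019, Treatment(reference=3)) * post",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["fund_id"]})
print(did.summary())
```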
It's also worth highlighting that the "intensity groups" you are referring to are categorical in nature. To assess parallel trends in this setting, I recommend plotting raw and/or normalized net flows by intensity group. In other words, plot average weekly retail net flows of high (i.e., 5-globe), average (i.e., 3-globe), and low (i.e., 1-globe) sustainability funds over time, one series per group. Figure 1 offers a nice illustration of this.
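A minimal plotting sketch along those lines (same hypothetical panel and column names as above):

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("fund_flows.csv", parse_dates=["week"])  # placeholder file

# Average weekly net flow by sustainability rating group; one line per
# group makes any diverging pre-crash paths easy to spot.
avg = (
    df.groupby(["week", "rating_dec2019"])["net_flow"]
      .mean()
      .unstack("rating_dec2019")
)
ax = avg.plot()
ax.axvline(pd.Timestamp("2020-02-20"), linestyle="--", color="gray")  # crash onset
ax.set_xlabel("Week")
ax.set_ylabel("Average weekly net flow")
ax.legend(title="Globes (Dec 2019 rating)")
plt.show()
```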
Third question: Do the higher flows in the High-ESG class during the pre-period (as indicated in question 1) cause problems when you try to formally test this (with, for example, an event study)?
In general, no.
While a level difference is allowed, diverging time variation is not. As indicated in my other post, if average weekly flows for high-ESG funds were on a different growth trajectory before the economic shock, then the estimated treatment effect is biased. Put more simply, it's permissible for average net flows among high-ESG funds to be higher than those of low-ESG funds in any one week, but their movement over time (i.e., week over week) should be reasonably similar as the pandemic nears.
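One way to formalize this is an event-study (dynamic) specification: interact the rating-group dummies with event-time (week) indicators, omit the week just before the crash, and check that the pre-crash interactions are statistically indistinguishable from zero. A hedged sketch, again with hypothetical variable names:

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("fund_flows.csv", parse_dates=["week"])  # placeholder file

# Event time in weeks relative to the crash week (negative = pre-crash).
df["event_week"] = (df["week"] - pd.Timestamp("2020-02-20")).dt.days // 7

# Omit event_week == -1 and the 3-globe group, so each pre-crash
# interaction is the flow gap between a rating group and average funds
# in that week, relative to the gap one week before the crash. Jointly
# small and insignificant pre-crash interactions support parallel trends.
es = smf.ols(
    "net_flow ~ C(event_week, Treatment(reference=-1)) "
    "* C(rating_dec2019, Treatment(reference=3))",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["fund_id"]})
print(es.summary())
```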
Best Answer
This alternative approach investigates whether two groups respond differently pre- versus post-event. It assumes the same macro-level shock hits all units $i$ at the same time. In the post you cited, all $i$ (e.g., firms, industries, etc.) are affected by the shock, but one group in particular (i.e., the treatment group), which has a particular characteristic that distinguishes it from the control group, will have a differential response in the post-treatment period.
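In equation form, a stylized version of this design (my notation, not necessarily the notation used in the post you cited) is

$$
y_{it} = \alpha + \beta\,\text{Treat}_i + \gamma\,\text{Post}_t + \delta\,(\text{Treat}_i \times \text{Post}_t) + \varepsilon_{it},
$$

where $\text{Treat}_i$ flags units with the distinguishing characteristic, $\text{Post}_t$ flags the post-shock period, and $\delta$ captures the differential post-shock response of the treatment group relative to the control group.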
This alternative approach can invite some skepticism. In my opinion, you have to be very clear about how you define your treatment group. You have to disabuse others of the notion that your treatment group was selected "because" it was going to respond differently anyway. If, for example, a macro-shock affects everyone, then you have to demonstrate, in some way, that the outcome trends for the group without the particular trait/characteristic (i.e., the control group) are representative of how the treatment group's trend would have evolved had no macro-shock occurred.
In a typical difference-in-differences application, the evaluator is usually not in control of the selection process. If we're lucky, nature does the entire randomization process for us, though never perfectly. In the alternative setting you're referring to, the whole sample is introduced to a new normal. The difference now is that some particular trait or attribute makes one group more/less vulnerable in this new state of the world. The evaluator is usually the one looking for this "feature" that defines treatment status.
Let's look at a real-world example.
A finance paper by Albuquerque and colleagues (2020) is probably the best example, to date, of this alternative approach. They used the COVID-19 pandemic to study how firms' environmental and social (ES) policies conditioned their stock market response. In particular, they were interested in testing how customer- and investor-loyalty-based theories of ES account for stock price behavior. Firms with high ES scores (i.e., top quartile in 2018) were considered part of the treatment group, and they showed that stock prices of high-ES firms performed much better than those of other firms. One critique is that some businesses, such as those in the utilities industry, were considered "essential" and simply operated in a "business-as-usual" manner in the early stages of the pandemic; if so, their resilience might reflect essential status rather than ES policies, and important differences by industry could have been ignored. To address the criticism that the results were driven by a particular industry, the authors estimated a third difference to show that their findings hold across most industries. Obviously, the distribution of your "treatment group" across sector, industry, or space matters in this setting.
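For intuition, one stylized way to write a third difference (my notation, not necessarily the authors' exact specification) is

$$
r_{ijt} = \alpha_i + \lambda_t + \sum_{j}\theta_j\,(\text{Post}_t \times \text{Ind}_j) + \sum_{j}\delta_j\,(\text{HighES}_i \times \text{Post}_t \times \text{Ind}_j) + \varepsilon_{ijt},
$$

where $r_{ijt}$ is the return of firm $i$ in industry $j$ at time $t$, $\alpha_i$ and $\lambda_t$ are firm and time fixed effects, and each $\delta_j$ is the high-ES effect within industry $j$. Broadly similar estimates of $\delta_j$ across industries indicate that the result is not driven by any single sector, such as utilities.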
I would give the aforementioned paper a read. The link should take you to the journal where the paper is available (ungated). It outlines this alternative difference-in-differences approach in great detail.