That is the question posed in a paper by Baker, Larcker and Wang (2022). I summarize their key arguments below.
The validity of…[the DiD]…approach rests on the central assumption that the observed trend in control units’ outcomes mimic the trend in treatment units’ outcomes had they not received treatment. As the authors write:
First, DiD estimates are unbiased in settings with a single treatment period, even when there are dynamic treatment effects. Second, DiD estimates are also unbiased in settings with staggered timing of treatment assignment and homogeneous treatment effect across firms and over time. Finally, when research settings combine staggered timing of treatment effects and treatment effect heterogeneity, staggered DiD estimates are likely biased.
Oftentimes, DiD is implemented using an ordinary least squares (OLS) regression-based model as follows:
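In the standard textbook notation (assumed here, since the symbols are not defined elsewhere in this post), with $TREAT_i$ an indicator for the treated group and $POST_t$ an indicator for the post-treatment period, the two-group, two-period specification is usually written as:

$$y_{it} = \alpha + \beta_1 TREAT_i + \beta_2 POST_t + \beta_3 (TREAT_i \times POST_t) + \epsilon_{it}$$

Under this notation, $\beta_3$ is the DiD estimate: the change in the treated group's mean outcome minus the change in the control group's mean outcome.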
When there are more than two groups and more than two time periods, regression-based DiD models typically rely on a two-way fixed effect (TWFE) specification of the form:
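In standard notation (assumed here), with $D_{it}$ an indicator equal to 1 when unit $i$ is treated in period $t$, the static TWFE specification is typically written as:

$$y_{it} = \alpha_i + \lambda_t + \delta^{DD} D_{it} + \epsilon_{it}$$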
Here, the first two terms are unit and time-period fixed effects. Note that earlier research from Goodman-Bacon (2021) shows that the static form of the TWFE DiD estimator is actually a “weighted average of all possible two-group/two-period DiD estimators in the data.”
When treatment effects can change over time (“dynamic treatment effects”), staggered DiD treatment effect estimates can actually take the opposite sign of the true ATT, even if the researcher were able to randomize treatment assignment (so that the parallel-trends assumption holds).
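A small simulation can make this sign flip concrete. The sketch below is illustrative only: the panel, cohort timing, and effect sizes are hypothetical choices, not taken from the paper. It builds a noise-free balanced panel in which parallel trends hold exactly and every treatment effect is positive but growing, then computes the static TWFE coefficient by two-way demeaning (which is equivalent to OLS with unit and time fixed effects on a balanced panel).

```python
# Hypothetical staggered panel: no noise, exact parallel trends,
# and treatment effects that are positive and grow over time.
T = 8                                      # periods 0..7 (assumed)
cohort_start = {"early": 1, "late": 3}     # first treated period per cohort (assumed)
units = [("early", 10.0), ("early", 12.0), ("late", 20.0), ("late", 23.0)]
N = len(units)

D, Y, TAU = {}, {}, {}
for i, (cohort, alpha) in enumerate(units):
    g = cohort_start[cohort]
    for t in range(T):
        treated = t >= g
        tau = float(t - g + 1) if treated else 0.0   # effect 1, 2, 3, ... after treatment
        D[i, t] = 1.0 if treated else 0.0
        TAU[i, t] = tau
        Y[i, t] = alpha + 0.5 * t + tau              # unit effect + common trend + effect


def two_way_demean(x):
    """Remove unit and period means from a balanced panel {(unit, period): value}."""
    unit_mean = [sum(x[i, t] for t in range(T)) / T for i in range(N)]
    time_mean = [sum(x[i, t] for i in range(N)) / N for t in range(T)]
    grand = sum(x.values()) / len(x)
    return {(i, t): v - unit_mean[i] - time_mean[t] + grand for (i, t), v in x.items()}


d_tilde = two_way_demean(D)
y_tilde = two_way_demean(Y)

# Static TWFE coefficient on D (Frisch-Waugh: regress demeaned y on demeaned D).
twfe = sum(d_tilde[k] * y_tilde[k] for k in D) / sum(v * v for v in d_tilde.values())

# True ATT: average effect over all treated unit-period observations.
treated_cells = [k for k in D if D[k] == 1.0]
true_att = sum(TAU[k] for k in treated_cells) / len(treated_cells)

print(f"TWFE estimate: {twfe:.4f}")
print(f"True ATT:      {true_att:.4f}")
```

In this particular design the TWFE coefficient comes out negative (about -0.17) even though the true ATT is about 3.58, because the early cohort, whose effect is still growing, serves as a comparison group for the late cohort.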
The reason is that, as Goodman-Bacon (2021) shows, the static TWFE DiD estimate actually consists of three components:
- Variance-weighted average treatment effect on the treated (VWATT)
- Variance-weighted average counterfactual trends (VWCT)
- Weighted sum of the change in the average treatment on the treated within a treatment-timing group’s post-period and around a later-treated unit’s treatment window (ΔATT)
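Written as a single expression, and using $\hat{\delta}^{DD}$ as assumed notation for the static TWFE DiD coefficient, Goodman-Bacon's decomposition combines the three components as:

$$\operatorname{plim}_{N \to \infty} \hat{\delta}^{DD} = VWATT + VWCT - \Delta ATT$$

Note that ΔATT enters with a negative sign, so treatment effects that grow over time pull the estimate below VWATT.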
The first term is the term of interest. If the parallel-trends assumption holds, then VWCT = 0. The last term arises because, under static TWFE DiD, already-treated groups are effectively used as comparison groups for later-treated groups. If DiD is estimated in a two-period model, however, this term disappears and there is no bias. Alternatively, if treatment effects are static (i.e., not changing over time after the intervention), then ΔATT = 0 and TWFE DiD is valid.
The challenge, however, arises when treatment effects are dynamic. In this case ΔATT ≠ 0 and the TWFE DiD is biased.
So what can be done? The authors offer 3 solutions:
- Callaway and Sant’Anna (2021). Here, the authors allow one to estimate the treatment effect for a particular group (treated at time g) using observations at time τ and g-1 from a clean set of controls. These are basically not-yet-treated, last-treated, or never-treated groups.
- Sun and Abraham (2021). A similar methodology is used as in Callaway and Sant’Anna (CS), but always-treated units are dropped, and the only units that can be used as effective controls are those that are never-treated or last-treated. Further, this approach is fully parametric.
- Stacked regression estimators. Cengiz et al. (2019) implement this approach. The goal is to “create event-specific ‘clean 2 × 2’ datasets, including the outcome variable and controls for the treated cohort and all other observations that are ‘clean’ controls within the treatment window (e.g., not-yet-, last-, or never-treated units). For each clean 2 × 2 dataset, the researcher generates a dataset-specific identifying variable. These event-specific data sets are then stacked together, and a TWFE DiD regression is estimated on the stacked dataset, with dataset-specific unit- and time-fixed effects… In essence, the stacked regression estimates the DiD from each of the clean 2 × 2 datasets, then applies variance weighting to combine the treatment effects across cohorts efficiently.”
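The idea common to all three approaches, comparing each treated cohort only with "clean" controls, can be illustrated with a short sketch. This is a simplified version of the 2 × 2 building block, not the full Callaway and Sant'Anna, Sun and Abraham, or stacked estimators (it has no covariates, inference, or aggregation weights). The panel, cohort timing, and effect sizes are hypothetical, and never-treated units are used as the clean control group.

```python
# Hypothetical noise-free panel with two treated cohorts and a never-treated cohort.
T = 8                      # periods 0..7 (assumed)
NEVER = None               # marker for never-treated units
# (first treated period, unit fixed effect) for each unit -- illustrative values
units = [(2, 10.0), (2, 12.0), (5, 20.0), (5, 23.0), (NEVER, 30.0), (NEVER, 31.0)]


def outcome(alpha, g, t):
    """Unit effect + common trend + a dynamic effect of 1, 2, 3, ... once treated."""
    tau = float(t - g + 1) if g is not None and t >= g else 0.0
    return alpha + 0.5 * t + tau


def cohort_mean(g, t):
    """Mean outcome in period t for the cohort first treated at g (or never treated)."""
    ys = [outcome(alpha, gi, t) for gi, alpha in units if gi == g]
    return sum(ys) / len(ys)


def att_gt(g, t):
    """Clean 2x2 DiD for cohort g in period t, relative to the last pre-period g-1."""
    base = g - 1
    treated_change = cohort_mean(g, t) - cohort_mean(g, base)
    control_change = cohort_mean(NEVER, t) - cohort_mean(NEVER, base)
    return treated_change - control_change


estimates = {(g, t): att_gt(g, t) for g in (2, 5) for t in range(g, T)}
for (g, t), est in sorted(estimates.items()):
    print(f"cohort g={g}, period t={t}: ATT(g,t) = {est:.2f}")
```

Because already-treated units never serve as controls here, each group-time estimate equals the true effect for that cohort and period (t - g + 1 in this design), regardless of how the effect evolves over time.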
While there was a lot of math in this post, if researchers apply these alternative DiD estimators, the authors wisely recommend that “researchers should justify their choice of ‘clean’ comparison groups—not-yet treated, last treated, or never treated—and articulate why the parallel-trends assumption is likely to apply”.
You can read the full article here.