Aggregation Estimation Issues

Question

I'm analyzing the effect of a law enforcement measure over time on reported violence for districts in a city. If I consider the city as a whole, I have access to a lot of possible control variables (e.g., unemployment, poverty, police expenditures) that I don’t have at the more disaggregated district level.
I'm wondering what I lose by eschewing a panel analysis (with the possibility of fixed effects to control for unobserved heterogeneity and cross-sectional variation) in favor of a city-level time series analysis.

Thomas Bilach · Answer

It appears the 'effective date' of your intervention is uniform across all districts in the city.
A fixed effects approach is not likely to perform better than simply pooling your data. For instance, you might wish to estimate something like this:
$$
\text{Crime}_{dt} = \beta_{0} + \beta_{1}\text{Policy}_{t} + \alpha_{d} + \epsilon_{dt}
$$
where you observe districts $d$ across time periods $t$. The variable $\text{Crime}_{dt}$ is some crime outcome (e.g., robbery rate) observed for each district-period. $\text{Policy}_{t}$ is a treatment dummy that 'turns on' in the period the measure takes effect and stays on afterward. Here, $\alpha_{d}$ is your district effect.
Note that the policy dummy varies only over time, so it has, or should have, the same "within-district" pattern in every district. Estimating district fixed effects, which could be achieved by including dummies for all districts, is not likely to yield anything useful in this scenario, even with a limited number of control variables. In fact, your estimate of $\beta_{1}$ should be similar to the pooled estimate that ignores the panel structure; in a balanced panel with no other covariates the two coincide, because the policy dummy is uncorrelated with the district dummies. Moreover, a full set of time effects for all $t$ periods will absorb a discrete treatment variable that uniformly affects all districts, since that dummy is an exact linear combination of the period dummies.
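A small simulation can make this concrete. The sketch below is only illustrative: the panel dimensions, the start period, and the effect size are assumptions invented for the example, not values from your data. It compares the pooled slope on the policy dummy with the within-district (fixed effects) slope on a balanced panel, then checks how much variation is left in the dummy once period means are removed.

```python
import random

random.seed(0)

# Hypothetical setup: 8 districts, 24 periods, policy starts in period 12.
N_DISTRICTS, N_PERIODS, START = 8, 24, 12
TRUE_EFFECT = -2.0  # assumed treatment effect, for illustration only

# Each row is (district, period, policy, crime).
rows = []
for d in range(N_DISTRICTS):
    alpha_d = random.gauss(10.0, 3.0)  # district effect
    for t in range(N_PERIODS):
        policy = 1.0 if t >= START else 0.0  # same pattern in every district
        crime = alpha_d + TRUE_EFFECT * policy + random.gauss(0.0, 1.0)
        rows.append((d, t, policy, crime))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y on x in a regression with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def demean_by_district(data, col):
    """Within transformation: subtract each district's own mean."""
    means = {d: mean(r[col] for r in data if r[0] == d)
             for d in range(N_DISTRICTS)}
    return [r[col] - means[r[0]] for r in data]


# Pooled OLS ignores the panel structure entirely.
beta_pooled = slope([r[2] for r in rows], [r[3] for r in rows])

# District fixed effects via the within transformation.
beta_fe = slope(demean_by_district(rows, 2), demean_by_district(rows, 3))

# Variation left in the policy dummy after removing period means,
# which is what a full set of time effects would do.
period_means = {t: mean(r[2] for r in rows if r[1] == t)
                for t in range(N_PERIODS)}
residual_variation = sum((r[2] - period_means[r[1]]) ** 2 for r in rows)

print(f"pooled: {beta_pooled:.6f}  district FE: {beta_fe:.6f}")
print(f"policy variation left after time effects: {residual_variation}")
```

In this balanced setup the two slopes agree up to floating-point error, and no variation in the dummy survives the period demeaning. With unbalanced data or added covariates the two estimates would generally differ somewhat.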
I don't presume that you only seek to model the effects of the initiative using a discrete treatment indicator. But without more information, it is difficult to advise you how to proceed. I suppose some districts received a greater dose of the initiative/intervention, in which case you could attempt estimation of the above equation with a continuous treatment variable. Or, maybe you could investigate heterogeneity of treatment effects by city section.
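To illustrate why a continuous treatment changes the picture, here is a hedged sketch. The per-district dose values, panel size, and effect size are all hypothetical, and the sketch adds period effects to the equation above, which is my assumption rather than something stated in the question. Once the dose differs across districts, the treatment variable is no longer absorbed by the time effects, so a two-way (district and period) fixed effects slope can be computed.

```python
import random

random.seed(1)

# Hypothetical setup: 8 districts, 24 periods, policy starts in period 12.
N_DISTRICTS, N_PERIODS, START = 8, 24, 12
TRUE_EFFECT = -1.5  # assumed effect per unit of dose, for illustration only

# Hypothetical dose intensities: 0.0, 0.25, ..., 1.75.
dose = [0.25 * d for d in range(N_DISTRICTS)]

alphas = [random.gauss(10.0, 3.0) for _ in range(N_DISTRICTS)]  # district effects
gammas = [random.gauss(0.0, 1.0) for _ in range(N_PERIODS)]     # period effects

# Each row is (district, period, treatment, crime).
panel = []
for d in range(N_DISTRICTS):
    for t in range(N_PERIODS):
        treatment = dose[d] if t >= START else 0.0
        crime = (alphas[d] + gammas[t] + TRUE_EFFECT * treatment
                 + random.gauss(0.0, 0.3))
        panel.append((d, t, treatment, crime))


def two_way_demean(data, col):
    """Remove district means and period means, then add back the grand mean.

    In a balanced panel this is equivalent to including a full set of
    district dummies and period dummies.
    """
    grand = sum(r[col] for r in data) / len(data)
    d_mean = [sum(r[col] for r in data if r[0] == d) / N_PERIODS
              for d in range(N_DISTRICTS)]
    t_mean = [sum(r[col] for r in data if r[1] == t) / N_DISTRICTS
              for t in range(N_PERIODS)]
    return [r[col] - d_mean[r[0]] - t_mean[r[1]] + grand for r in data]


x = two_way_demean(panel, 2)
y = two_way_demean(panel, 3)

# Unlike the uniform dummy, the dose variable keeps variation here.
treatment_variation = sum(v * v for v in x)
beta_dose = sum(a * b for a, b in zip(x, y)) / treatment_variation

print(f"treatment variation after two-way demeaning: {treatment_variation:.3f}")
print(f"estimated effect per unit of dose: {beta_dose:.3f}")
```

The estimate is identified only by how differently the high-dose and low-dose districts change after the start date, so it rests on the doses being measured well and not being assigned in response to district crime trends.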
You will lose all cross-sectional variation by aggregating your data up to the city level. Is there another neighboring city/county similar to yours which could act as a control in this setting? Your options are somewhat limited unless you can find other jurisdictions, either at the district or city level, where the intervention was absent.
If you have a sufficiently long time series, then I suppose you could look into interrupted time series modeling. Also, there is a relatively new package in R for estimating causal effects.
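For the interrupted time series route, one common starting point is a segmented regression on the city-level series, with a term for the change in level and a term for the change in slope at the start date. The sketch below uses simulated data; the series length, start period, and effect sizes are assumptions for illustration, and it ignores autocorrelation, which a real analysis would need to address.

```python
import random

random.seed(2)

# Hypothetical city-level series: 60 periods, policy starts in period 36.
N_PERIODS, START = 60, 36
LEVEL_CHANGE, SLOPE_CHANGE = -4.0, -0.1  # assumed effects, illustration only


def design_row(t):
    """Regressors: intercept, time trend, post dummy, time since start."""
    post = 1.0 if t >= START else 0.0
    return [1.0, float(t), post, post * (t - START)]


X = [design_row(t) for t in range(N_PERIODS)]
y = [50.0 + 0.2 * row[1] + LEVEL_CHANGE * row[2] + SLOPE_CHANGE * row[3]
     + random.gauss(0.0, 0.3)
     for row in X]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


intercept, trend, level_shift, slope_shift = ols(X, y)

print(f"pre-period trend: {trend:.3f}")
print(f"level change at start: {level_shift:.3f}")
print(f"slope change after start: {slope_shift:.3f}")
```

Without a comparison series, this design attributes any break at the start date to the measure, so anything else that changed in the city at the same time is folded into the estimate.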
I hope this helps!
