Fixed effects with panel data vs including lagged variables with cross section data

Question

I have panel data with many groups $i$ and two time periods $t$.
I want to know the effect of a binary treatment $D$ on a continuous outcome $Y$. Some groups go from untreated to treated, while others are treated in both periods, and others are untreated in both periods.
I am considering two approaches, and I'm curious about the differences between the two.
Approach 1: Fixed effects with panel data
I shape the data into long format, where each observation is a group-time period (so each group has two observations in this case). Then I run the following regression:
$Y_{it} = \delta_1 D_{it} + \alpha_i + \gamma_t + \epsilon_{it}$
where $\alpha_i$ is a group-level fixed effect and $\gamma_t$ is a time-period fixed effect (in this case it would just be a dummy for the second time period).
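To make Approach 1 concrete, here is a minimal sketch on simulated data. Everything in it is hypothetical and chosen only for illustration: the sample size, the assumed effect `true_delta`, and the variable names. It fits the two-way fixed effects regression with explicit group dummies, then fits the first-difference regression of $Y_{i2} - Y_{i1}$ on $D_{i2} - D_{i1}$ with an intercept. With exactly two periods the two coefficients on $D$ coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200            # number of groups (hypothetical)
true_delta = 2.0   # assumed treatment effect, used only to simulate data

# Treatment paths: never treated (0,0), switchers (0,1), always treated (1,1)
d1 = rng.integers(0, 2, size=n)
d2 = np.maximum(d1, rng.integers(0, 2, size=n))

alpha = rng.normal(size=n)  # unobserved group effects
y1 = alpha + true_delta * d1 + rng.normal(size=n)
y2 = alpha + 0.5 + true_delta * d2 + rng.normal(size=n)  # 0.5 = period-2 shift

# Long format: period-1 rows stacked on top of period-2 rows
y = np.concatenate([y1, y2])
d = np.concatenate([d1, d2]).astype(float)
period2 = np.concatenate([np.zeros(n), np.ones(n)])
group_dummies = np.vstack([np.eye(n), np.eye(n)])

# Two-way fixed effects: Y on D, a period-2 dummy, and one dummy per group
X_fe = np.column_stack([d, period2, group_dummies])
delta_fe = np.linalg.lstsq(X_fe, y, rcond=None)[0][0]

# First differences: (Y2 - Y1) on an intercept and (D2 - D1)
X_fd = np.column_stack([np.ones(n), d2 - d1])
delta_fd = np.linalg.lstsq(X_fd, y2 - y1, rcond=None)[0][1]

print(delta_fe, delta_fd)  # the two estimates agree up to floating-point error
```

In the first-difference form, the estimate compares the change in outcome for groups that switch into treatment against the change for groups whose treatment status stays the same.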
Approach 2: including lagged variables with cross section data
Reshape the data into wide format, so each observation is a group. Then I have two new variables that are the lagged outcome value ($Y_{t-1}$), and the lagged treatment status variable ($D_{t-1}$). The $i$ subscript is gone. Run the following regression:
$Y_{t} = \delta_2 D_{t} + \beta_1 D_{t-1} + \beta_2 Y_{t-1} + \nu_{t}$
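For comparison, Approach 2 can be sketched on the same kind of simulated data. Again the sample size, the assumed effect `true_delta`, and all variable names are hypothetical. One assumption goes beyond the equation as written: the sketch adds an intercept, which the equation above omits.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200            # number of groups (hypothetical)
true_delta = 2.0   # assumed treatment effect, used only to simulate data

# Treatment paths: never treated (0,0), switchers (0,1), always treated (1,1)
d_lag = rng.integers(0, 2, size=n)                   # D_{t-1}
d_now = np.maximum(d_lag, rng.integers(0, 2, size=n))  # D_t

alpha = rng.normal(size=n)  # unobserved group effects
y_lag = alpha + true_delta * d_lag + rng.normal(size=n)        # Y_{t-1}
y_now = alpha + 0.5 + true_delta * d_now + rng.normal(size=n)  # Y_t

# Wide format: one row per group, lagged values as extra regressors.
# The intercept column is an assumption not present in the stated equation.
X = np.column_stack([np.ones(n), d_now, d_lag, y_lag])
coef = np.linalg.lstsq(X, y_now, rcond=None)[0]

intercept, delta_2, beta_1, beta_2 = coef
print(delta_2, beta_1, beta_2)
```

Here `delta_2` is the coefficient on current treatment holding the lagged treatment and lagged outcome fixed, so it need not equal the fixed effects coefficient $\delta_1$ from Approach 1.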
Question:
What is the difference between the two approaches? Is one generally preferred or is it context-specific?
