Fixed effects with panel data vs including lagged variables with cross section data

Cross Validated Asked by gannawag on January 3, 2022

I have panel data with many groups $i$ and two time periods $t$.
I want to know the effect of a binary treatment $D$ on a continuous outcome $Y$. Some groups go from untreated to treated, while others are treated in both periods, and others are untreated in both periods.
I am considering two approaches, and I’m curious about the differences between the two.

Approach 1: Fixed effects with panel data

I shape the data into long format, where each observation is a group-time period (so each group has two observations in this case). Then I run the following regression:

$Y_{it} = delta_1 D_{it} + alpha_i + gamma_t + epsilon_{it} $

Where $alpha_i$ is a group-level fixed effect, and $gamma_t$ is a time period-level fixed effect (in this case it would just be a dummy for the second time period).

Approach 2: including lagged variables with cross section data

Reshape the data into wide format, so each observation is a group. Then I have two new variables that are the lagged outcome value ($Y_{t-1}$), and the lagged treatment status variable ($D_{t-1}$). The $i$ subscript is gone. Run the following regression:

$Y_{t} = delta_2 D_{t} + beta_1 D_{t-1} + beta_2 Y_{t-1} + nu_{t} $

What is the difference between the two approaches? Is one generally preferred or is it context-specific? here is a screenshot of some made up data in long format. the wide format only uses the observations where the lags are not NA

Add your own answers!

Related Questions

Is ROCR applied to training data or testing data?

1  Asked on November 12, 2021 by fcas80


Is the plot “White noise”?

0  Asked on November 12, 2021


Identify a contaminated distribution

1  Asked on November 12, 2021 by stephen-clark


Ask a Question

Get help from others!

© 2023 All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP