Cross Validated Asked by Federico Tedeschi on January 1, 2022

If I use lagged values (of order $k$, $k \geq 2$) as instruments to estimate an $AR(1)$ model (to take into account that the lagged value is endogenous), there’s the problem of missing values. For example, if I decide that $Y_{t-3}$ is a valid instrument for $Y_{t-1}$, of course I’d have it available only for individuals with at least $4$ observations, and in general only observations $Y_t$, $t \geq 4$, would be usable.
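To make the loss of observations concrete, here is a minimal Python sketch. The panel `panel` and the helper `usable_rows` are hypothetical, invented purely for illustration; the sketch only counts which periods have $Y_t$, $Y_{t-1}$ and $Y_{t-3}$ all observed.

```python
# Hypothetical unbalanced panel: individual id -> observed outcomes Y_1, Y_2, ...
panel = {
    "A": [1.0, 1.2, 0.9, 1.1, 1.3],  # 5 periods
    "B": [0.5, 0.7, 0.6],            # 3 periods only
    "C": [2.0, 1.8, 1.9, 2.1],       # 4 periods
}


def usable_rows(series, instrument_lag=3):
    """Return (t, Y_t, Y_{t-1}, Y_{t-lag}) for every t where all three exist."""
    rows = []
    for t in range(1, len(series) + 1):  # t is 1-indexed, as in the text
        if t - instrument_lag >= 1:      # Y_{t-lag} exists only when t > lag
            rows.append(
                (t, series[t - 1], series[t - 2], series[t - 1 - instrument_lag])
            )
    return rows


usable = {pid: usable_rows(y) for pid, y in panel.items()}
for pid, rows in usable.items():
    # Print the periods that survive for each individual
    print(pid, [r[0] for r in rows])
```

In this toy panel, individual `B` (three observations) contributes no rows at all, and the others contribute only periods $t \geq 4$.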

I’ve found below:

(pdf)

(pdf)

that, at least when the panel is unbalanced, missing values should be replaced by zero.

Here:

https://www.jstor.org/stable/pdf/1913103.pdf?refreqid=excelsior%3Aaff11b8d9e04796448ebfbf42d6d7132

I read:

“*recall that our procedure involves dropping the equations for the first m + 2 time periods. When the parameters are nonstationary this procedure involves no loss in efficiency. Although the equations that are dropped may be correlated with the remaining equations, there are no cross equation restrictions, and they are underidentified. When the parameters are stationary, dropping the first m + 2 periods may involve some loss in efficiency. Because there are cross-equation restrictions, efficiency can be improved by adding back t = m + 2 and t = m + 1 period equations, both of which have observable lags. Also, if there is no heteroskedasticity (across time or individuals) in the innovation variance for yit and xit, then all of the parameters for the joint Yit and process can be estimated without the earliest cross-section moments, so that it may be possible to further improve efficiency by using these moments. Cross-section moment based estimation of moving average (but not autoregressive) time series models in panel data has been considered by MaCurdy (1981a)*”.

Thus, it is clear to me from the above that using $Y_{t-3}$ will avoid a loss of efficiency, but not that it should be set to $0$ when missing.

In general, replacement with $0$ seems quite natural to me in some cases: for example, if the variable refers to a policy program that had not yet been implemented, or to service use by children who had not yet been born (when there’s no service or program, or I am too young to be entitled to it, or have not even been born, I cannot use it). But when we are simply talking about variables that were not observed before a given time, replacing them with $0$ in the model $Y_{t-1}=\alpha+\beta Y_{t-3}+\epsilon_{t-1}$ leads to $\hat{Y}_{t-1}=\alpha$, which is completely uninformative about the individual. How can this not bias estimates toward $0$? It seems to me analogous to the situation where, in the model $Y_{t}=\alpha+\beta Y_{t-1}+\epsilon_t$, instead of starting from $t=2$, we set $Y_0=0$.
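The concern can be reproduced numerically. The sketch below is only an illustration under assumptions: the data are made up, and `ols` is a hypothetical helper for a one-regressor first stage with an intercept. It shows that every row whose instrument was replaced by $0$ receives the same fitted value, the estimated intercept.

```python
def ols(x, y):
    """Simple least squares of y on a constant and x; returns (alpha, beta)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical data: instrument Y_{t-3} (0.0 where it was unobserved)
# and the endogenous regressor Y_{t-1}.
z = [0.0, 0.0, 1.0, 1.2, 0.9, 2.0]
x = [1.1, 0.6, 1.3, 1.1, 1.0, 1.9]

alpha, beta = ols(z, x)
fitted = [alpha + beta * zi for zi in z]

# Rows 0 and 1 had a zero-filled instrument: their fitted value is alpha.
print(alpha, fitted[0], fitted[1])
```

Note that this covers only the first-stage fitted values; it does not by itself say what happens to the final estimate of the autoregressive parameter.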

++++++ EDIT 24 JULY 2020 ++++++

On second thought, I guess that the reason why estimates don’t get biased is that both in $Y_{t-1}=\alpha_0+\beta_0 Y_{t-3}+\epsilon_{t-1}$ and in $Y_{t-1}=\alpha_1+\beta_1 Y_{t-3}+\epsilon^*_{t-1}$, the value of $\beta$ is irrelevant for fitted values of the outcomes. This however leads me to think that, despite not introducing bias, substituting missing values of $Y_{t-3}$ with $0$ may affect other parameters and the variance explained, but not the estimate of the autoregressive parameter itself.

++++++ EDIT 14 SEPTEMBER 2020 ++++++

Now I have a different understanding: estimates are not affected in the first-stage equations. Nevertheless, more observations will be used for the final regression, thus increasing efficiency of the estimates. While I’ve found this online, I still haven’t found a clear explanation of that in the literature.
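One possible way to picture this, sketched under assumptions that are not stated above: suppose the instruments enter only through moment conditions of the form $\sum_t z_t e_t$, with one instrument column per lag. A zero entry then switches that instrument off for that row rather than acting as an imputed value of $Y$. The series `y`, the residuals `resid` and the helper `instrument_row` are hypothetical.

```python
# Hypothetical series for one individual: Y_1, ..., Y_5
y = [1.0, 1.2, 0.9, 1.1, 1.3]
lags = (2, 3)  # instrument lags, assumed valid purely for illustration


def instrument_row(y, t, lags):
    """Instruments for the period-t equation; unavailable lags become 0."""
    return [y[t - k - 1] if t - k >= 1 else 0.0 for k in lags]


# Hypothetical residuals e_t for t = 3, 4, 5 (any numbers make the same point)
resid = {3: 0.2, 4: -0.1, 5: 0.05}

# One instrument row per period, columns ordered as in `lags`
Z = {t: instrument_row(y, t, lags) for t in resid}

# Contribution of each period to each moment condition: z_t * e_t
contrib = {t: [z * resid[t] for z in Z[t]] for t in resid}

for t in sorted(Z):
    print(t, Z[t], contrib[t])
```

In this sketch the equation for $t=3$ still contributes through $Y_{t-2}$, while its $Y_{t-3}$ column contributes exactly zero. Whether this matches the procedure in the documents cited above is an assumption.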
