What should I do about a non-stationary variable in a panel-data interaction?

Question

We have panel data on immigration stocks, immigration flows, and immigration policy for 30 countries and 10-30 years. We would like to test the theory that the effect of immigration flows (i.e., annual numbers of incoming immigrants as % of pop) on immigration policy depends on immigrant stocks (i.e., non-citizens as % of pop). In other words, immigration flows affect policy, but only when there are few existing immigrants to begin with.
It seems to me that an interaction between immigration stocks and flows will allow a test of this theory. However, while our dependent variable (immigration policy) and our main independent variable (immigration flows) appear to be stationary, immigrant stocks is not. Standard solutions like first-differencing immigrant stocks won't help because that would transform stocks into another measure of annual flows, which will not allow us to test the theory.
Another way of putting this is to ask: does stationarity matter only for the dependent variable? Or also for all independent variables?
Advice on how to proceed will be greatly appreciated!

kurtosis · Answer

I think there are some ways to better express your data that might be helpful and avoid the issues of nonstationarity as well as some other issues you have not mentioned.
You have measured stocks and flows as percentages of population. That is good for two reasons. First, those variables are unlikely to take on very large values. Second, it avoids many of the problems of heteroskedasticity and undue influence that arise when some countries are much larger or have much larger immigrant populations. Your policy measures are not so clear: are they based on announcement dates, dates of passing into law, or effective dates? That deserves some thought, so you can tease out which effects come into play when.
You say that immigrant stocks are not stationary. I am not sure that is true over the span of a few years, although it may be true across your entire time period. Nonetheless, we typically treat the independent variables as known rather than random. So I don't see a problem with using immigrant stocks as you have them (expressed as a percentage of population).
Typically, a nonstationary independent variable is not a concern, but it is likely to be unhelpful since it can wander off to values that are large in magnitude. If the dependent variable is stationary, this is unlikely to lead to a spurious regression; however, it is likely to lead to a coefficient estimate that is not significant. Since your stocks variable lives on a compact, well-defined, and small interval, I doubt that will be an issue.
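To illustrate the point about spurious regression, here is a minimal simulation sketch. It is not your model: it uses a single unrelated time series pair rather than a panel, the true slope is zero by construction, and the function names (`slope_t_stat`, `rejection_rate`) and the 30-year length are my own assumptions. It regresses `y` on a random-walk `x` and counts how often the slope looks "significant" at roughly the 5% level, once with a stationary `y` and once with a random-walk `y`.

```python
import numpy as np


def slope_t_stat(y, x):
    """OLS t-statistic for the slope in y = a + b*x + e."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)      # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # classical OLS covariance
    return beta[1] / np.sqrt(cov[1, 1])


def rejection_rate(y_is_random_walk, n_years=30, n_reps=2000, seed=0):
    """Share of simulations where |t| > 1.96 although x and y are unrelated."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        x = np.cumsum(rng.normal(size=n_years))   # nonstationary regressor
        e = rng.normal(size=n_years)
        y = np.cumsum(e) if y_is_random_walk else e
        rejections += abs(slope_t_stat(y, x)) > 1.96
    return rejections / n_reps


if __name__ == "__main__":
    print("stationary y :", rejection_rate(y_is_random_walk=False))
    print("random-walk y:", rejection_rate(y_is_random_walk=True))
```

With a stationary `y` the false-rejection rate should stay close to the nominal level, while with a random-walk `y` it should be far higher. That is the sense in which the stationarity of the dependent variable is what matters most here.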
The one issue you may find, however, is endogeneity. Immigrant stocks may affect future flows (immigrants often move to places where they already have family) and flows obviously affect future stocks. Stocks may affect policy (immigrants can press for policy changes) and policy changes can affect future stocks. Flows can also affect policy and vice versa.
You could model stocks, flows, and policy together in a simultaneous equation model. Another option is to find an instrumental variable to break the reverse causality. Taking care with your time lags may also help: if policy in year t is explained by stocks and flows from earlier years, it is harder for policy to be causing the regressors.
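As a rough sketch of the lagging idea combined with your interaction, the code below regresses policy in year t on flows, stocks, and flows × stocks from year t-1, with country fixed effects removed by demeaning within each country. Everything here is an assumption for illustration: the data are simulated, the panel is balanced, the coefficient values (1.0 and -0.08) are made up, and `simulate_panel` and `lagged_interaction_fe` are hypothetical names. Lagging does not by itself cure endogeneity; it only rules out contemporaneous feedback.

```python
import numpy as np


def simulate_panel(n_countries=30, n_years=30, seed=0):
    """Simulated (hypothetical) panel; arrays are (n_countries, n_years)."""
    rng = np.random.default_rng(seed)
    # Stocks: persistent, slowly drifting percentage, kept inside [0, 100].
    start = rng.uniform(2.0, 15.0, size=(n_countries, 1))
    drift = np.cumsum(rng.normal(0.0, 0.3, (n_countries, n_years)), axis=1)
    stocks = np.clip(start + drift, 0.0, 100.0)
    # Flows: stationary percentage of population.
    flows = rng.normal(1.0, 0.3, (n_countries, n_years))
    country_effect = rng.normal(0.0, 1.0, (n_countries, 1))
    noise = rng.normal(0.0, 0.05, (n_countries, n_years - 1))
    policy = np.zeros((n_countries, n_years))
    # Made-up truth: the effect of flows shrinks as stocks grow.
    policy[:, 1:] = (country_effect
                     + 1.0 * flows[:, :-1]
                     - 0.08 * flows[:, :-1] * stocks[:, :-1]
                     + noise)
    return policy, flows, stocks


def lagged_interaction_fe(policy, flows, stocks, lag=1):
    """Within (fixed-effects) OLS of policy[t] on regressors from t-lag (lag >= 1)."""
    y = policy[:, lag:]
    f = flows[:, :-lag]
    s = stocks[:, :-lag]
    X = np.stack([f, s, f * s], axis=-1)          # (countries, years-lag, 3)
    # Demeaning by country removes the country fixed effects.
    y_w = y - y.mean(axis=1, keepdims=True)
    X_w = X - X.mean(axis=1, keepdims=True)
    beta, *_ = np.linalg.lstsq(X_w.reshape(-1, 3), y_w.reshape(-1), rcond=None)
    return dict(zip(["flows", "stocks", "flows_x_stocks"], beta))


if __name__ == "__main__":
    policy, flows, stocks = simulate_panel()
    print(lagged_interaction_fe(policy, flows, stocks))
```

In this setup the implied effect of flows on policy is `flows + flows_x_stocks * stocks`, so a negative interaction coefficient corresponds to your theory that flows matter mainly when stocks are low. Note that the stocks variable enters in levels, not differences, so it is not turned into a second flow measure.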
