TransWikia.com

Why is it OK to model demographics as random effects in bayesian multilevel models?

Cross Validated Asked by Graham Wright on November 2, 2021

In Bayesian multilevel models (with, say, people nested within congressional districts) I sometimes see individual level demographic variables like race modeled as random effects.
So here’s a slightly simplified example from this paper:
$$
\Pr(y_i=1)=\text{logit}^{-1}\left(\gamma_0 + \alpha^{race}_{r[i]} + \alpha^{gender}_{g[i]} + \alpha^{edu}_{e[i]} + \alpha^{district}_{d[i]} \dots\right)
$$

$$\alpha^{race}_{r[i]} \sim N(0,\sigma^2_{race}), \quad \text{for } r = 1,\dots,4$$
$$\alpha^{gender}_{g[i]} \sim N(0,\sigma^2_{gender})$$
$$\alpha^{edu}_{e[i]} \sim N(0,\sigma^2_{edu}), \quad \text{for } e=1,\dots,5$$

As I understand it this model is treating all the individual level demographic variables as "random effects" just like district. So for race it is assuming that the 4 racial categories that exist in the data (black, white, hispanic, other) are actually just 4 random draws from a larger population of all possible races. To me this seems strange and wrong, since the racial categories we have in the data are meant to be exhaustive and there doesn’t seem to be any reason to think that racial differences will be normally distributed.

So my question is: Is my interpretation of this model correct, and if so why is it justified?

I know that someone actually asked this question before but the answer they were given was that it is probably NOT appropriate to treat race etc as random effects. But that’s precisely what is done in many papers on Bayesian multilevel models.

4 Answers

One view is that it is simply not helpful to call the $\alpha^{race}$ 'random effects'.

Practically, the race effects $\alpha^{race} \sim N(0,\sigma^2_{race})$ (for instance) have a hierarchical prior: conditional on the race effect variance, each effect has a normal prior. In turn, $\sigma^2_{race}$ should have its own prior, which effectively gives the $\alpha^{race}$ a prior that is a mixture distribution. As mentioned, it isn't really helpful to think of this as a random effect, because the hyperparameter $\sigma^2_{race}$ does not have a useful definition (as you said, the races were not sampled from a population of races). You could possibly make a post hoc interpretation of $\sigma^2_{race}$ as a guide to how different the race effects are, but for that purpose you could instead compare the $\alpha^{race}$ values directly.
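
To spell out the mixture, here is a sketch under one assumption: $p(\sigma^2_{race})$ stands for whatever hyperprior is chosen, which the question does not specify. The marginal prior on the effect of any one race category $r$ is then a scale mixture of normals:

$$
p(\alpha^{race}_{r}) = \int_0^\infty N(\alpha^{race}_{r} \mid 0, \sigma^2_{race}) \, p(\sigma^2_{race}) \, d\sigma^2_{race}
$$

Because all four effects share the same $\sigma^2_{race}$, they are independent only conditional on it. That shared variance is what lets the data on one category inform the prior spread of the others.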

The $\sigma^2_{race}$ is just part of the definition of the prior of $\alpha^{race}$. It might have been just as good to fix $\sigma^2_{race}$ at a huge constant value and thus leave the $\alpha^{race}$ with a vague prior.

Answered by AlaskaRon on November 2, 2021

I'd recommend looking at this answer from @Paul for guidance on so-called "random effects" and hierarchical models. In particular, this quote is on point:

Random effects are estimated with partial pooling, while fixed effects are not.

Partial pooling means that, if you have few data points in a group, the group's effect estimate will be based partially on the more abundant data from other groups. This can be a nice compromise between estimating an effect by completely pooling all groups, which masks group-level variation, and estimating an effect for all groups completely separately, which could give poor estimates for low-sample groups.

The answer goes on with an example, and discussion of the relationship of this approach to hierarchical Bayesian modeling.

Such pooling is exactly what the authors of the paper you cite were setting out to do with their multi-level approach:*

... a multilevel model pools group-level parameters towards their mean, with greater pooling when group-level variance is small and more smoothing for less populated groups. The degree of pooling emerges from the data endogenously ...

So although it's often argued that categories with few levels (sex, race) should be treated as fixed effects in regressions, they need to be treated as random effects to accomplish this partial pooling.
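
The pooling behaviour described in the quotes above can be sketched numerically. This is only an illustration, not the paper's model: it assumes a normal likelihood with known noise variance (the paper uses a logistic model), and the function name `partial_pool` and all the numbers are made up for the example.

```python
def partial_pool(group_mean, n, grand_mean, noise_var, group_var):
    """Posterior mean of one group's effect in a normal-normal model.

    Assumes the group's n observations have known variance noise_var and
    the group effect has a N(grand_mean, group_var) prior.
    """
    data_precision = n / noise_var        # information from the group's own data
    prior_precision = 1.0 / group_var     # information borrowed via the group-level prior
    weight = data_precision / (data_precision + prior_precision)
    return weight * group_mean + (1.0 - weight) * grand_mean


# Hypothetical groups with the same raw mean (1.0) but different sizes.
small_group = partial_pool(group_mean=1.0, n=5, grand_mean=0.0,
                           noise_var=1.0, group_var=0.1)
large_group = partial_pool(group_mean=1.0, n=500, grand_mean=0.0,
                           noise_var=1.0, group_var=0.1)

# Same small group, but with a very large group-level variance:
# the prior is vague and almost no pooling takes place.
small_group_vague = partial_pool(group_mean=1.0, n=5, grand_mean=0.0,
                                 noise_var=1.0, group_var=100.0)

print(small_group, large_group, small_group_vague)
```

The sparsely populated group is pulled strongly toward the overall mean, the well-populated group barely moves, and a large group-level variance switches the pooling off. In a fitted multilevel model the group-level variance is estimated rather than fixed, which is the sense in which the degree of pooling comes from the data.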


*The authors used GLMER in R for this, so I suppose this particular example isn't strictly a Bayesian approach.

Answered by EdM on November 2, 2021

Categories of social position and social identity, including common demographic variables, are important demarcations of population. In the population sciences, there is a good deal of emphasis on differentiating the mean or median (central) experiences of populations; however, the variability of experiences within populations is also substantively important.

Take systolic blood pressure (SBP) as an example: it is approximately normally distributed, and one could imagine two populations with nearly the same, or even identical, mean SBP. Does this mean that the health of the two populations with respect to blood pressure is the same? No! If one population is considerably more variable, then its SBP-related health is actually quite a bit worse. First, knowing nothing else but which population an individual is from, we are less certain of their SBP. Second, if there are extremes of SBP (values at which the risk of bad things happening rises sharply: SBP > 130 brings a sharp increase in stroke risk, SBP < 90 a sharp increase in waking up dead from hypotension), then the population with greater variability has more, possibly far more, people "falling through the cracks" at the extremes. The cyan shaded region in the graph below (a cartoon I made, not actual data) is how much more likely people in the blue population are to be at high risk due to hypertension or hypotension than people in the red population. The more variable population is more vulnerable.

Two groups have the same mean systolic blood pressure (SBP), but the group with greater variability in SBP (blue) also has both greater uncertainty and greater extremes in SBP.
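
The same point can be checked with a quick calculation. This is a sketch with made-up numbers, in the same spirit as the cartoon: the mean of 115 and the standard deviations of 8 and 16 are assumptions for illustration, while the 90 and 130 cut-offs are the ones mentioned above. SBP is treated as exactly normal.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def share_at_extremes(mean, sd, low=90.0, high=130.0):
    """Fraction of a normal population with SBP below `low` or above `high`."""
    below = normal_cdf(low, mean, sd)
    above = 1.0 - normal_cdf(high, mean, sd)
    return below + above


# Hypothetical populations: identical mean, different spread.
red = share_at_extremes(mean=115.0, sd=8.0)    # less variable population
blue = share_at_extremes(mean=115.0, sd=16.0)  # more variable population

print(f"red:  {red:.3f} of the population at the extremes")
print(f"blue: {blue:.3f} of the population at the extremes")
```

With identical means, the more variable population has a clearly larger share of people beyond either cut-off, which is what the shaded region in the cartoon represents.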

Back to your question: the current US (and global!) social moment of unrest against centuries of institutionalized anti-black racism, and against half a millennium of colonization of American Indians, Hawaiians and Pacific Islanders, and Alaska Natives, points out that the vulnerability of the populations defined by racial demographic groups (among others), meaning the increased uncertainty in outcomes and the increased numbers in the extremes, is a good reason to look to methods that provide estimates of population variability, such as mixed models/random effects models/hierarchical linear models/multilevel models/etc. (as @Tim rightly points out, the language is a smidge muddled).

NB: I do not see this as an issue of Bayesian vs Frequentist, but as a question of substantive modeling of the world around us.

Answered by Alexis on November 2, 2021

"Fixed" and "random" effects are terminology from frequentist models. In fact, the terminology is neither the best available nor consistently used. In frequentist statistics you try to find point estimates for the parameters, with the exception of random effects, where you want to learn about the distribution of those effects. In Bayesian statistics every parameter is treated as a random variable and we want to learn about its distribution, so there's no such distinction.

Answered by Tim on November 2, 2021
