
Specifying model in glmer() - interaction terms

Cross Validated Asked on December 25, 2021

I am running a generalised linear mixed-effects model with a binomial family (logistic regression), using the function glmer().

I am predicting likelihood of response (0/1) and my fixed effects to explore in my final model are:
Day/Night (D/N)
Male/Female (M/F)
Time since trial began (continuous)

My random effects are ID and Location.

I am having a lot of difficulty working out the best model to fit these. Not only do the significances of the three fixed effects vary greatly depending on whether they are modelled alone or in combination with another, but I am also reading contrasting information online as to whether “*” or “:” should be used to define interaction terms.

For example, looking just at Male/Female and ‘Time since trial began’.

Modelling just Male/Female:

mod1 <- glmer(RESPONSE ~ Sex + (1|`ID CODE`) + (1|Location), data = data, family = binomial)

summary(mod1)

output:

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -12.856      1.592  -8.075 6.74e-16 ***
SexF         -12.970      3.375  -3.843 0.000122 ***

Running just Time since Trial began:

mod2 <- glmer(RESPONSE ~ Time  + (1|`ID CODE`) + (1|Location), data = data, family = binomial)

summary(mod2)

output:

    Fixed effects:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)               -1.3185     0.5398  -2.442 0.014592 *  
Time                      -1.3036     0.3542  -3.680 0.000233 ***

When running as interaction using “*” :

mod3 <- glmer(RESPONSE ~ Time*Sex + (1|`ID CODE`) + (1|Location), data = data, family = binomial)

summary(mod3)

output:

Fixed effects:
                                    Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -11.483      1.955  -5.873 4.29e-09 *** 
Time                           -1.301      1.677  -0.776   0.4380    
SexF                           -12.488      5.439  -2.296   0.0217 *  
Time:SexF                       0.396      4.827   0.082   0.9346  

When running as interaction using “:” :

mod4 <- glmer(RESPONSE ~ Time:Sex + (1|`ID CODE`) + (1|Location), data = data, family = binomial)

summary(mod4)

output:

Fixed effects:
                                    Estimate Std. Error z value Pr(>|z|)   
(Intercept)                          -1.2957     0.5427  -2.388  0.01695 * 
Time:SexM                            -1.1943     0.3698  -3.229  0.00124 **
Time:SexF                            -1.5406     0.5019  -3.070  0.00214 **

One Answer

First, note that A*B is just shorthand for A + B + A:B, and it does not make sense to specify a model with only the interaction term, as in your last model. That is, when including an interaction, as a general rule you also need to include the main effects for each variable involved in the interaction. In other words, you should fit A + B if you don't want an interaction, or A*B (equivalently A + B + A:B) if you do.
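The expansion can be checked directly in base R without fitting anything. The sketch below uses a made-up four-row data frame (`toy` is hypothetical; only the column names and the M-then-F level order are taken from the question's output) and inspects which terms and design-matrix columns each formula produces.

```r
# Toy data: values are invented; column names mirror the question.
toy <- data.frame(
  Time = c(0.5, 1.0, 1.5, 2.0),
  Sex  = factor(c("M", "F", "M", "F"), levels = c("M", "F"))
)

# Terms each formula expands to.
terms_star  <- attr(terms(~ Time * Sex), "term.labels")
terms_long  <- attr(terms(~ Time + Sex + Time:Sex), "term.labels")
terms_colon <- attr(terms(~ Time:Sex), "term.labels")

print(terms_star)                      # terms behind Time * Sex
print(terms_colon)                     # terms behind Time:Sex alone
print(identical(terms_star, terms_long))

# Columns of the fixed-effects design matrix for each specification.
cols_star  <- colnames(model.matrix(~ Time * Sex, data = toy))
cols_colon <- colnames(model.matrix(~ Time:Sex, data = toy))
print(cols_star)
print(cols_colon)
```

With only `Time:Sex`, R builds one Time slope per sex but a single shared intercept, which is why mod4 reports `Time:SexM` and `Time:SexF` and no `SexF` term.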

Second, note that, in the presence of an interaction, the meaning of the main effects changes. Without an interaction, each main effect is interpreted as the association of a 1-unit change (or the difference compared to the reference level, in the case of a categorical variable) with the outcome, leaving the other covariates unchanged. However, in the presence of an interaction, each main effect is interpreted as the association of a 1-unit change (or the difference compared to the reference level, in the case of a categorical variable) with the outcome when the other variable involved in the interaction is zero (or at its reference level, in the case of a categorical variable). This is why the estimates and the p-values for the main effects are different after including them in an interaction: they are testing different things.
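One way to see this is to move the zero point of Time and watch the Sex coefficient change while the interaction stays put. The sketch below is an illustration on simulated data, not the asker's data: it uses plain `glm()` without random effects for brevity, and every number in the simulation is an assumption.

```r
set.seed(1)
n <- 500

# Simulated data; effect sizes are arbitrary choices for illustration.
sim <- data.frame(
  Time = runif(n, 0, 4),
  Sex  = factor(sample(c("M", "F"), n, replace = TRUE), levels = c("M", "F"))
)
is_f <- as.numeric(sim$Sex == "F")
eta  <- -0.5 + 0.4 * sim$Time + 0.8 * is_f - 0.3 * sim$Time * is_f
sim$RESPONSE <- rbinom(n, 1, plogis(eta))

# Same predictor, but zero now means "average Time".
sim$Time_c <- sim$Time - mean(sim$Time)

fit_raw <- glm(RESPONSE ~ Time * Sex,   data = sim, family = binomial)
fit_ctr <- glm(RESPONSE ~ Time_c * Sex, data = sim, family = binomial)

b_raw <- coef(fit_raw)
b_ctr <- coef(fit_ctr)

# SexF in fit_raw: F vs M difference in log-odds at Time = 0.
# SexF in fit_ctr: F vs M difference in log-odds at mean(Time).
sex_at_mean <- unname(b_raw["SexF"] + b_raw["Time:SexF"] * mean(sim$Time))

print(c(SexF_raw     = unname(b_raw["SexF"]),
        SexF_centred = unname(b_ctr["SexF"]),
        implied      = sex_at_mean))
print(c(int_raw     = unname(b_raw["Time:SexF"]),
        int_centred = unname(b_ctr["Time_c:SexF"])))
```

The two fits describe the same model; only the question answered by the `SexF` row differs. The same logic applies to the fixed effects of a `glmer()` fit.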

From the output of the first three models, we see from mod3 that the interaction term is not significant. This usually means that you can safely drop the interaction. I say "usually" because p-values are also related to sample size, so if you have strong theoretical reasons for including the interaction, and you only have a small sample, then you should retain it. A power analysis before conducting the experiment/study to determine an adequate sample size is the best way to proceed, so if you are going to follow up with a further study then I would strongly suggest that.

Assuming that statistical power isn't the issue, then as mentioned above you can drop the interaction and proceed with the model that contains both main effects. Presumably you have good reason for wanting to include them in the first place.

Model selection based on p-values is a highly dubious procedure, and it is far better to use your knowledge of the subject matter to select the best model. So a lot will depend on your research question. For example, if you are mainly interested in understanding how Time is associated with the probability of response, then it is possible that Sex is a confounder, so you would definitely want to include it.
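If a formal comparison of the additive and interaction models is wanted, a likelihood-ratio test via `anova()` is one option, subject to the same caveats about p-values and sample size. The sketch below assumes lme4 is installed and runs on simulated data; the data frame `sim`, the column `ID` (standing in for `ID CODE`), and all effect sizes are assumptions for illustration.

```r
library(lme4)

set.seed(42)
n_id  <- 40   # number of individuals (assumed)
n_obs <- 10   # observations per individual (assumed)

sim <- data.frame(
  ID       = factor(rep(seq_len(n_id), each = n_obs)),
  Location = factor(rep(sample(1:5, n_id, replace = TRUE), each = n_obs)),
  Sex      = factor(rep(sample(c("M", "F"), n_id, replace = TRUE), each = n_obs),
                    levels = c("M", "F")),
  Time     = rep(seq(0, 3, length.out = n_obs), times = n_id)
)

# Random intercepts for ID and Location; no true interaction is simulated.
id_eff  <- rnorm(n_id, sd = 0.8)[as.integer(sim$ID)]
loc_eff <- rnorm(5, sd = 0.5)[as.integer(sim$Location)]
eta <- 0.5 - 0.6 * sim$Time - 0.7 * (sim$Sex == "F") + id_eff + loc_eff
sim$RESPONSE <- rbinom(nrow(sim), 1, plogis(eta))

# Main effects only versus main effects plus interaction.
mod_add <- glmer(RESPONSE ~ Time + Sex + (1 | ID) + (1 | Location),
                 data = sim, family = binomial)
mod_int <- glmer(RESPONSE ~ Time * Sex + (1 | ID) + (1 | Location),
                 data = sim, family = binomial)

# Likelihood-ratio test on the single extra interaction parameter.
cmp <- anova(mod_add, mod_int)
print(cmp)
```

The table also reports AIC and BIC for both fits, which can be read alongside subject-matter reasoning rather than in place of it.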

Lastly, note that time often has a nonlinear association with outcomes, so you might want to include nonlinear terms for it (such as a quadratic term), or use splines. You might also want to allow the association between time and the probability of response to differ between IDs and between Locations, by including random slopes for it.
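These suggestions translate into formulas roughly as follows. This is a sketch on simulated data, assuming lme4 and the splines package are available; `sim`, `ID` (standing in for `ID CODE`), the spline degrees of freedom, and the simulated effect sizes are all illustrative choices, and fits like these may emit convergence or singular-fit messages on small data.

```r
library(lme4)
library(splines)

set.seed(7)
n_id  <- 40   # number of individuals (assumed)
n_obs <- 10   # observations per individual (assumed)

sim <- data.frame(
  ID       = factor(rep(seq_len(n_id), each = n_obs)),
  Location = factor(rep(sample(1:5, n_id, replace = TRUE), each = n_obs)),
  Sex      = factor(rep(sample(c("M", "F"), n_id, replace = TRUE), each = n_obs),
                    levels = c("M", "F")),
  Time     = rep(seq(0, 3, length.out = n_obs), times = n_id)
)

# Each individual gets its own intercept and its own Time slope.
id_int   <- rnorm(n_id, sd = 0.8)[as.integer(sim$ID)]
id_slope <- rnorm(n_id, sd = 0.4)[as.integer(sim$ID)]
eta <- 0.5 + (-0.6 + id_slope) * sim$Time - 0.7 * (sim$Sex == "F") + id_int
sim$RESPONSE <- rbinom(nrow(sim), 1, plogis(eta))

# Quadratic term for Time (orthogonal polynomial of degree 2).
mod_quad <- glmer(RESPONSE ~ poly(Time, 2) + Sex + (1 | ID) + (1 | Location),
                  data = sim, family = binomial)

# Natural cubic spline for Time; df = 3 is an arbitrary choice here.
mod_spline <- glmer(RESPONSE ~ ns(Time, df = 3) + Sex + (1 | ID) + (1 | Location),
                    data = sim, family = binomial)

# Random slope for Time within ID; (1 + Time | Location) would do the
# same for Location, but needs enough locations to be estimable.
mod_slope <- glmer(RESPONSE ~ Time + Sex + (1 + Time | ID) + (1 | Location),
                   data = sim, family = binomial)

print(fixef(mod_quad))
print(VarCorr(mod_slope))
```

Whether the extra flexibility is worth it depends on how many individuals and locations there are; with few grouping levels, random-slope variances are often poorly estimated.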

Answered by Robert Long on December 25, 2021
