# Mixed Effects Model

Cross Validated Asked by Seydou GORO on September 20, 2020

Sorry for asking even though there are already some threads about mixed effects models on the forum, but I think my question is somewhat different.
The data come from a group of people followed during treatment for depression: 146 people (men and women), with 8 measurement times for each subject. I have to determine whether the treatment works better in one gender group compared to the other.
My variables of interest are `ScoreHamilton` (score used to assess depression state), `GROUPE` (gender: male or female), `TEMPS` (time of visit), and `NUMERO` (subject ID).
I know I have to use a mixed effects model, but I am not sure if my script (below) is correct.

    library(nlme)

    modMix_H0 <- lme(ScoreHamilton ~ TEMPS + GROUPE,
                     random = ~1+TEMPS|NUMERO,
                     data = Ham_norm.mix)


I fitted `TEMPS` (time) and `GROUPE` (gender) as fixed effects and `NUMERO` (subject) as a random effect. I am wondering if that is right.

I am a little unsure about how I specified the random effects. I used a random intercept and a random slope, `~1+TEMPS|NUMERO`, because I noticed that people generally specify random effects as `~1+TIME|ID`. Now I am wondering why I cannot put my variable `GROUPE` in the random terms, something like `~1+GROUPE|NUMERO` or `~1+TEMPS+GROUPE|NUMERO`.

The other part of my question is about interpreting the output.
Here is the summary of the model:

    summary(modMix_H0)

    Linear mixed-effects model fit by REML
      Data: Ham_norm.mix
           AIC      BIC    logLik
      6628.782 6663.471 -3307.391

    Random effects:
     Formula: ~1 + TEMPS | NUMERO
     Structure: General positive-definite, Log-Cholesky parametrization
                StdDev     Corr
    (Intercept) 4.73695760 (Intr)
    TEMPS       0.08200003 -0.353
    Residual    4.72718973

    Fixed effects: ScoreHamilton ~ TEMPS + GROUPE
                    Value Std.Error  DF   t-value p-value
    (Intercept) 22.989933 0.5959364 905  38.57783   0e+00
    TEMPS       -0.352266 0.0109268 905 -32.23866   0e+00
    GROUPEHomme  2.952001 0.8013428 144   3.68382   3e-04
     Correlation:
                (Intr) TEMPS
    TEMPS       -0.359
    GROUPEHomme -0.652  0.012

    Standardized Within-Group Residuals:
            Min          Q1         Med          Q3         Max
    -2.66404151 -0.58774912  0.02206275  0.56281247  3.97325207

    Number of Observations: 1052
    Number of Groups: 146


I don't know how to interpret all of the parameters, how they influence the interpretation of my final result (that is, the effect of `GROUPE` on `ScoreHamilton`), or how to judge the quality of my model.

Still, the way I interpret this result is that the score is significantly higher in men (Homme) than in women. So the treatment improves the mental state more in women (lower score), a result I was not expecting. This makes me wonder about the way I specified the model.

I have additional questions. My variable VISIT was a factor, which I converted to numeric. Does it change anything whether VISIT is a factor or numeric?
Could it change my results whether or not I use `na.omit` in the model, since my dataset has a lot of missing values?

> I fitted `TEMPS` (time) and `GROUPE` (gender) as fixed effects and `NUMERO` (subject) as a random effect. I am wondering if that is right.

Yes, you have repeated measures within individuals so fitting random intercepts for this factor accounts for the correlation between measurements within each individual.

> Now I am wondering why I cannot put my variable `GROUPE` in the random terms, something like `~1+GROUPE|NUMERO` or `~1+TEMPS+GROUPE|NUMERO`.

You can. The variables to the left of the `|` symbol specify random slopes: the effect of each variable on the left is allowed to vary across the levels of the grouping factor on the right. So in your case `~1+GROUPE|NUMERO` means that the effect of gender can be different for each individual. You should ask yourself whether this makes sense within your particular specialty (since gender usually does not change within individuals, I expect that this would not make sense). `~1+TEMPS+GROUPE|NUMERO` additionally allows the effect of time to be different for each individual.
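One quick way to see why a random slope for gender is unlikely to make sense is to count how many distinct values each covariate takes within each subject. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical; only the column names are borrowed from the question):

```r
# Toy data (hypothetical): 4 subjects, 3 visits each; gender is fixed per subject
toy <- data.frame(
  NUMERO = rep(1:4, each = 3),
  TEMPS  = rep(c(0, 7, 14), times = 4),
  GROUPE = rep(c("Femme", "Homme", "Femme", "Homme"), each = 3)
)

# Number of distinct values of each covariate within each subject
n_groupe <- tapply(toy$GROUPE, toy$NUMERO, function(x) length(unique(x)))
n_temps  <- tapply(toy$TEMPS,  toy$NUMERO, function(x) length(unique(x)))

# GROUPE never varies within a subject, so there is no within-subject
# information from which a per-subject GROUPE effect could be estimated
all(n_groupe == 1)

# TEMPS does vary within each subject, so a per-subject slope is estimable
all(n_temps > 1)
```

Running the same two `tapply()` calls on the real data would show whether `GROUPE` is constant within `NUMERO` there as well.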

> Still, the way I interpret this result is that the score is significantly higher in men (Homme) than in women. So the treatment improves the mental state more in women (lower score), a result I was not expecting. This makes me wonder about the way I specified the model.

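Yes, your reading of the coefficient is correct. The estimate for `GROUPEHomme` can be interpreted as the difference in `ScoreHamilton` between `GROUPE=Homme` and the reference level of `GROUPE`, with the other fixed effects held constant and conditional on the estimated random effects. Note, however, that because the model contains no `TEMPS:GROUPE` interaction, this difference is assumed to be the same at every time point, so on its own it does not show that the treatment works better in one group; that would require comparing the change over time between the groups.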
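To compare how the score changes over time in the two groups, one option is to add a `TEMPS:GROUPE` interaction to the fixed effects. The following is only a sketch on simulated data: the visit days, effect sizes, and the object names `sim` and `mod_int` are all invented for illustration, and `opt = "optim"` is used just to make the toy fit more robust.

```r
library(nlme)

set.seed(1)
n_subj <- 60
times  <- c(0, 4, 7, 10, 14, 21, 28, 56)   # hypothetical visit days
sim <- expand.grid(TEMPS = times, NUMERO = factor(seq_len(n_subj)))
id  <- as.integer(sim$NUMERO)
sim$GROUPE <- factor(ifelse(id <= n_subj / 2, "Femme", "Homme"))

# Subject-specific deviations in intercept and slope (the random effects)
b0 <- rnorm(n_subj, sd = 4)
b1 <- rnorm(n_subj, sd = 0.08)

# Simulated truth: scores in men decline more slowly, by 0.10 points per day
sim$ScoreHamilton <- 23 + b0[id] +
  (-0.35 + 0.10 * (sim$GROUPE == "Homme") + b1[id]) * sim$TEMPS +
  rnorm(nrow(sim), sd = 3)

# TEMPS * GROUPE expands to TEMPS + GROUPE + TEMPS:GROUPE
mod_int <- lme(ScoreHamilton ~ TEMPS * GROUPE,
               random  = ~ 1 + TEMPS | NUMERO,
               data    = sim,
               control = lmeControl(opt = "optim"))

# The TEMPS:GROUPEHomme row estimates the difference in slopes between groups
summary(mod_int)$tTable
```

In such a model, `GROUPEHomme` is the group difference at `TEMPS = 0`, and `TEMPS:GROUPEHomme` is how much the rate of change differs between men and women.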

> I have additional questions. My variable VISIT was a factor, which I converted to numeric. Does it change anything whether VISIT is a factor or numeric?

Yes, it could. I don't see VISIT in your model. From your question it appears that `TEMPS` is the variable that was created from VISIT. I am assuming that the original factor variable had levels such as "10 days", "20 days", "25 days", ... and you converted these to a numeric variable 10, 20, 25, ... By doing this, your model estimates a linear effect for `TEMPS`. By keeping the variable as a factor, you allow for non-linearity. If there are few levels then this does not matter too much, but if you have many levels then retaining the variable as a factor leads to a model with a separate estimate for each level, which becomes difficult to interpret. If you want to allow for non-linearity, one way is to use the numeric variable and specify quadratic (and higher order) terms for it, or to use splines. Since you have 8 measurement occasions, I would be inclined to use the numeric variable with splines.
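The three ways of coding time can be compared directly. This is a minimal sketch on simulated data with a deliberately non-linear time trend; the data, the visit days, the choice of `df = 3`, and the random-intercept-only structure are assumptions made to keep the example small, not recommendations for the real data.

```r
library(nlme)
library(splines)

set.seed(2)
n_subj <- 50
times  <- c(0, 4, 7, 10, 14, 21, 28, 56)   # hypothetical visit days
sim <- expand.grid(TEMPS = times, NUMERO = factor(seq_len(n_subj)))
id  <- as.integer(sim$NUMERO)

# Simulated truth: fast early improvement that levels off (non-linear in time)
sim$ScoreHamilton <- 10 + 14 * exp(-sim$TEMPS / 10) +
  rnorm(n_subj, sd = 4)[id] + rnorm(nrow(sim), sd = 3)

# method = "ML" so that models with different fixed effects are comparable
m_lin <- lme(ScoreHamilton ~ TEMPS,             random = ~ 1 | NUMERO,
             data = sim, method = "ML")   # linear in time
m_spl <- lme(ScoreHamilton ~ ns(TEMPS, df = 3), random = ~ 1 | NUMERO,
             data = sim, method = "ML")   # natural cubic spline
m_fac <- lme(ScoreHamilton ~ factor(TEMPS),     random = ~ 1 | NUMERO,
             data = sim, method = "ML")   # one mean per visit

anova(m_lin, m_spl, m_fac)
```

The factor version spends 7 parameters on time, the spline version 3, and the linear version 1, so the comparison shows how much fit is gained for the extra parameters.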

> Could it change my results whether or not I use `na.omit` in the model, since my dataset has a lot of missing values?

In `lme` the default is `na.action = na.fail`, so missing data causes an error, and specifying `na.action = na.omit` is the simplest way to make it run. This removes rows with a missing value in any of the model variables, and this can lead to substantial bias. Depending on the reasons for missingness and the extent of missingness, you would be well-advised to consider using multiple imputation to address this problem.
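The behaviour of the two `na.action` settings can be checked on a small simulated data set. The data and object names below are made up for illustration; the missing values are removed completely at random, which is an assumption that may not hold for real dropout.

```r
library(nlme)

set.seed(3)
n_subj <- 40
sim <- expand.grid(TEMPS = c(0, 7, 14, 21, 28),
                   NUMERO = factor(seq_len(n_subj)))
id  <- as.integer(sim$NUMERO)
sim$ScoreHamilton <- 23 - 0.3 * sim$TEMPS +
  rnorm(n_subj, sd = 4)[id] + rnorm(nrow(sim), sd = 3)

# Set 30 scores to NA at random to mimic missed visits
sim$ScoreHamilton[sample(nrow(sim), 30)] <- NA

# With the default na.action = na.fail, lme() stops with an error
fit_default <- try(
  lme(ScoreHamilton ~ TEMPS, random = ~ 1 | NUMERO, data = sim),
  silent = TRUE
)
inherits(fit_default, "try-error")

# na.omit drops the incomplete rows and fits on the remaining observations
fit_omit <- lme(ScoreHamilton ~ TEMPS, random = ~ 1 | NUMERO,
                data = sim, na.action = na.omit)

fit_omit$dims$N                  # number of rows actually used in the fit
sum(!is.na(sim$ScoreHamilton))   # number of complete rows in the data
```

Comparing the number of rows used with the number of rows in the data is a simple way to see how much information is being discarded.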

Final note: `nlme` is an old package. `lme4` was subsequently developed by some of the same people and is a better choice in most situations.

Correct answer by Robert Long on September 20, 2020
