Cross Validated Asked by rnso on August 13, 2020

I am using the commonly available iris dataset and trying to fit the following regression:

PW ~ PL + SL + SW

Since the samples come from 3 species, "Species" is used as the random (grouping) variable.

The results of Linear Mixed Regression are:

```
        Mixed Linear Model Regression Results
=====================================================
Model:            MixedLM  Dependent Variable:  PW
No. Observations: 150      Method:              REML
No. Groups:       3        Scale:               0.0278
Min. group size:  50       Log-Likelihood:      41.4680
Max. group size:  50       Converged:           Yes
Mean group size:  50.0
-----------------------------------------------------
            Coef.  Std.Err.    z     P>|z|  [0.025  0.975]
-----------------------------------------------------
Intercept   0.082    0.335   0.245   0.807  -0.575   0.740
SL         -0.098    0.045  -2.199   0.028  -0.186  -0.011
SW          0.238    0.048   4.975   0.000   0.144   0.332
PL          0.257    0.050   5.139   0.000   0.159   0.355
Group Var   0.257    1.636
=====================================================
```

While the results of GEE regression are:

```
                        GEE Regression Results
==============================================================================
Dep. Variable:                   PW   No. Observations:                 150
Model:                          GEE   No. clusters:                       3
Method:                 Generalized   Min. cluster size:                 50
               Estimating Equations   Max. cluster size:                 50
Family:                    Gaussian   Mean cluster size:               50.0
Dependence structure:  Independence   Num. iterations:                    2
Date:              Thu, 16 Jul 2020   Scale:                          0.037
Covariance type:             robust   Time:                        02:42:49
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -0.2403      0.151     -1.595      0.111      -0.536       0.055
SL            -0.2073      0.088     -2.349      0.019      -0.380      -0.034
SW             0.2228      0.073      3.036      0.002       0.079       0.367
PL             0.5241      0.049     10.711      0.000       0.428       0.620
==============================================================================
Skew:                        0.2232   Kurtosis:                       0.9437
Centered skew:              -0.2824   Centered kurtosis:              1.2493
==============================================================================
=============== cov_struct.summary() ===============
Observations within a cluster are modeled as being independent.
```

Although the p-values for all 3 predictor variables are significant in both analyses, they differ between the two.

Moreover, the coefficients are quite different: for example, the PL coefficient is 0.257 in the mixed model but 0.524 in the GEE.

Which of these analyses is more appropriate and acceptable? Thanks for your insight.

When I fit these models in R I get very similar estimates to those that you obtained:

```
> library(lme4)     # provides lmer()
> library(geepack)  # provides geeglm()
> data("iris")
> # linear mixed model with a random intercept for each species
> m.lmm <- lmer(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length + (1|Species), data = iris)
> # GEE with an independence working correlation
> m.gee <- geeglm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, id = Species, data = iris, corstr = "independence")
> summary(m.lmm)
Fixed effects:
             Estimate Std. Error t value
(Intercept)    0.0821     0.3356    0.24
Sepal.Length  -0.0984     0.0444   -2.22
Sepal.Width    0.2380     0.0477    4.99
Petal.Length   0.2567     0.0478    5.37
> summary(m.gee)
Coefficients:
             Estimate Std.err   Wald Pr(>|W|)
(Intercept)   -0.2403  0.1506   2.55   0.1106
Sepal.Length  -0.2073  0.0882   5.52   0.0188 *
Sepal.Width    0.2228  0.0734   9.22   0.0024 **
Petal.Length   0.5241  0.0489 114.72   <2e-16 ***
```

The difference is mostly due to using `independence` as the correlation structure. To be equivalent to the mixed model you should use `exchangeable`:

```
> m.gee1 <- geeglm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, id = Species, data = iris, corstr="exchangeable")
> summary(m.gee1)
Coefficients:
             Estimate Std.err  Wald Pr(>|W|)
(Intercept)    0.0767  0.1960  0.15    0.695
Sepal.Length  -0.1015  0.0254 16.02  6.3e-05 ***
Sepal.Width    0.2357  0.0958  6.06    0.014 *
Petal.Length   0.2647  0.0332 63.45  1.7e-15 ***
```

An exchangeable correlation structure assumes that any two observations within the same cluster (here, the same species) have the same residual correlation, which is the same assumption a random-intercept mixed model makes.
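To see where this equivalence comes from, it may help to write out the covariance implied by a random-intercept model. The notation below ($u_i$ for the species intercept, $\sigma_u^2$ and $\sigma^2$ for the two variance components, $\rho$ for the implied correlation) is introduced here for illustration and does not appear in the output above.

```latex
y_{ij} = x_{ij}^{\top}\beta + u_i + \varepsilon_{ij},
\qquad u_i \sim N(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2),
\qquad u_i \perp \varepsilon_{ij}

\operatorname{Var}(y_{ij}) = \sigma_u^2 + \sigma^2,
\qquad
\operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_u^2 \quad (j \neq k),
\qquad
\operatorname{Cov}(y_{ij}, y_{i'k}) = 0 \quad (i \neq i')

\rho = \operatorname{Corr}(y_{ij}, y_{ik})
     = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} \quad (j \neq k)
```

Every pair of observations within a species shares the same correlation $\rho$, which is exactly the exchangeable pattern; the independence working structure corresponds to fixing $\rho = 0$.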

Correct answer by Robert Long on August 13, 2020
