TransWikia.com

Is the pooled AUC calculation for imputated data in (psfmi package) mivalext_lr() correct?

Cross Validated Asked by YY Shi on December 1, 2021

I have an imputated data with several nonmissing and not-imputated variables. However, I realised when I use mivalext_lr() to obtain pooled AUC and 95% CI of my logistic regression, the pooled 95% CI is bigger then I get from individual imputated data, (which are the same as the variable used was not imputated), and actually exactly the square root of the CI upper and lower limit obtained from each imputated data set.
I understand I should pool AUC after logit transformation with Rubin’s rule, integrating the variance of each AUC itself and the variance between imputation.
So I tried to look into the source code, and added some print out command, the variate between each imputation (b.roc.logit) truly print out as 0, but I felt the p.se.roc.logit had an extra sqrt…. Can someone help me to take a look? unless it supposed to have bigger SE even for not imputated variables after pooling…..
I tried psfmi_perform() in the same package, got the same big CI.

While math is really beyond me, I do feel at least the two parts in tv.roc.logit should have same times of sqrt, right?

(complete source code can be found on : https://rdrr.io/cran/psfmi/src/R/mivalext_lr.R)

'''
 roc.f.mi.i[[i]] <- roc(f.ext$y, p.ext)$auc
        se.roc.mi.i[[i]] <- sqrt(pROC::var(roc.f.mi.i[[i]]))
        print(roc.f.mi.i[[i]])
        print(se.roc.mi.i[[i]])
        se.roc.mi.i.logit[[i]] <- sqrt(pROC::var(roc.f.mi.i[[i]]))/(roc.f.mi.i[[i]] * 
            (1 - roc.f.mi.i[[i]]))
        if (g < 4) {
            stop("For Hosmer and Lemeshow test, number of groups must be > 3")
        }
        else {
            hl.mi.i[[i]] <- hoslem.test(f.ext$y, p.ext, g = g)[[1]]
        }
    }
    coef.pool <- round(colMeans(do.call("rbind", coef.mi.i)), 
        5)
    lp.pool <- round(colMeans(do.call("rbind", lp.ext.mi)), 5)
    est.roc.logit <- log(unlist(roc.f.mi.i)/(1 - unlist(roc.f.mi.i)))
    se.roc.logit <- unlist(se.roc.mi.i.logit)
    print(se.roc.logit)
    p.roc.logit <- mean(est.roc.logit)
    p.se.roc.logit <- mean(se.roc.logit)
    print(p.se.roc.logit)
    b.roc.logit <- var(est.roc.logit)
    print(b.roc.logit)
    tv.roc.logit <- p.se.roc.logit + ((1 + (1/nimp)) * b.roc.logit) ## is this "p.se.roc.logit" been sqrt a extra time??
    print(tv.roc.logit)
    se.t.roc.logit <- sqrt(tv.roc.logit)
    inv.roc <- exp(p.roc.logit)/(1 + exp(p.roc.logit))
    inv.roc.u <- exp(p.roc.logit + (1.96 * se.t.roc.logit))/(1 + 
        exp(p.roc.logit + (1.96 * se.t.roc.logit)))
'''

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP