Output the linear predictor from a stratified cox model?

Bioinformatics Asked by Nienke on December 17, 2020

I have fitted a cox model on a pooled dataset of multiple studies, say study A, B and C. As these studies have different baseline hazard functions, I stratified for ‘study’ in the model.

Like this:

df$SurvObj <- with(df, Surv(event_rd, event == 1))

fit <- coxph(SurvObj ~ cov1 + cov2 + cov3 + cov4 + strata(study),data=df)

Now, I want to assess whether I can use the beta coefficients from the above model to predict the event probability in a new study, study D.
The baseline hazard in study D is different from those in study A, B and C.

When I have


I get this error:

Error in model.frame.default(data = studyD, formula = ~hba1c + sbp +
: factor strata(study) has new levels study=4

Why does R require that I match the strata of study D with those in study A,B and/or C when I only want to output the linear predictor (type="lp")? As the linear predictor is independent from the baseline hazard function in this case.

I suppose it should be possible to extract the linear predictor for individuals in study D and then manually calculate the event probability using the baseline hazard of study D.

Does anybody know how to do this?
I could not find any question like this so maybe I am thinking the wrong way.

One Answer

Unlike wikipedia, the linear predictor (LP) in coxph is made dependent from the baseline hazard function $lambda$. For example, if a model has two predictors x1, x2 stratified by sex, the linear predictor would be like: $$LP = log ( lambda _{sex}) + beta _{1}x _{1}+beta _{2}x _{2}$$

Where the $log(lambda _{sex})$ can be obtained by predict() with x1=x2=0.

We could calculate the LP without strata information (sex) by hand using the $beta _{1} , beta _{2}$ provided by coef(fit) and a customised $lambda _{unkown , sex}$.

Below is a simulation comparing the results from predict() and by hand coef(fit).

## Toy data
df <- list(time=c(4,3,1,1,2,2,3), 

## Fit a stratified model 
fit <- coxph(Surv(time, status) ~ x1 + x2 + strata(sex), df) 
#         x1          x2 
# 0.79451881 -0.01917633

## Baseline linear predictor
lambda_sex1 <- predict(fit, newdata=list(x1=0, x2=0, sex=0))
# -0.746578
lambda_sex2 <- predict(fit, newdata=list(x1=0, x2=0, sex=1))
# -0.1497816

## by predict function
predict(fit, newdata=list(x1=1, x2=2, sex=0))
# 0.009588167
predict(fit, newdata=list(x1=1, x2=2, sex=1))
# 0.6063845

## by hand
lambda_sex1 + coef(fit)[1]*1 + coef(fit)[2]*2
# 0.009588167
lambda_sex2 + coef(fit)[1]*1 + coef(fit)[2]*2
# 0.6063845

Finally, to calculate "a" linear predictor for an unknown sex:

# prevented by predict function
predict(fit, newdata=list(x1=1, x2=2, sex=2))
#Error in model.frame.default(data = list(x1 = 1, x2 = 2, sex = 2), formula = ~x1 +  : 
# factor strata(sex) has new level sex=2

# by hand, assuming lambda for sex 2 is 0
0 + coef(fit)[1]*1 + coef(fit)[2]*2
# 0.7561661

Answered by Ryan SY Kwan on December 17, 2020

Add your own answers!

Related Questions

Generating 3D coordinates error

1  Asked on January 15, 2021 by shahbaaz


BAM file filteing to remain best isoform

0  Asked on January 10, 2021 by user977828


Somatic mutations for normal WES samples

0  Asked on January 6, 2021 by lot_to_learn


Get list of urls of GSM data set of a GSE set

1  Asked on January 6, 2021 by user432797


Biohackers Netflix – DNA to binary and video

1  Asked on January 3, 2021 by xamax


DNA sequence error annotation

0  Asked on December 30, 2020 by matthew-jones


samtools / bamUtil | Meaning of as Reference Name

1  Asked on December 25, 2020 by paul-endymion


How to remove batch effect from TCGA and GTEx data

2  Asked on December 22, 2020 by kai-he


Ask a Question

Get help from others!

© 2023 All rights reserved. Sites we Love: PCI Database, MenuIva, UKBizDB, Menu Kuliner, Sharing RPP, SolveDir