
SPSS - Automatic Linear Modeling "Importance" Numbers

Cross Validated Asked by Josh Davis on December 1, 2021

I have a large set of survey data. I'm trying to find out which variables have the most impact on a DV (call it "happiness"). I'm not looking for a beta coefficient as in a typical regression. I theorize that some of my ~14 variables will be collinear. Is it appropriate to do the following in SPSS, and am I interpreting the output correctly?

Analyze->Regression->Automatic Linear Modeling
Predictors = all my ~14 variables
Target = my DV ("happiness")

Build options – automatically prepare data OFF
Model selection = forward stepwise
Criteria for entry/removal = F statistic, entering effects with p-values less than 0.05 and removing effects with p-values greater than 0.10 (a rough Python sketch of this rule follows the steps below).

Hit Run.

Scroll down in the model output to "Coefficients" and expand the window.
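For reference, outside of SPSS the same entry/removal rule could be sketched roughly like this in Python with statsmodels. This is not SPSS's exact algorithm, and the column names ("happiness", the survey items in df) are just placeholders for my data:

import statsmodels.api as sm

def forward_stepwise(df, target, candidates, p_enter=0.05, p_remove=0.10):
    # Enter the most significant remaining candidate while its p-value is
    # below p_enter; drop an entered predictor if its p-value rises above
    # p_remove; stop when neither step changes the model. (For a single
    # coefficient, the t-test p-value equals the partial F-test p-value.)
    selected = []
    while True:
        changed = False
        remaining = [c for c in candidates if c not in selected]
        entry_p = {}
        for c in remaining:
            X = sm.add_constant(df[selected + [c]])
            entry_p[c] = sm.OLS(df[target], X).fit().pvalues[c]
        if entry_p:
            best = min(entry_p, key=entry_p.get)
            if entry_p[best] < p_enter:
                selected.append(best)
                changed = True
        if selected:
            fit = sm.OLS(df[target], sm.add_constant(df[selected])).fit()
            worst = fit.pvalues.drop("const").idxmax()
            if fit.pvalues[worst] > p_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            return selected

# Hypothetical usage, assuming df is a pandas DataFrame of the survey:
# predictors = [c for c in df.columns if c != "happiness"]
# chosen = forward_stepwise(df, "happiness", predictors)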

Variable 1 – coefficient .16, sig .000, importance .52
Variable 2 – coefficient .10, sig .000, importance .19
Variable 3 – coefficient .07, sig .000, importance .16
Variable 4 – coefficient .06, sig .000, importance .08
Variable 5 – coefficient .06, sig .001, importance .05

I assume the coefficient is the beta, i.e. if this variable increases by 1, "happiness" should increase by the coefficient. Now, what I'm really interested in is the importance number. Am I correct in interpreting this for variable 1 as, "Our model suggests 52% of the variance of happiness is due to this variable"?

Further, am I correct that this modeling indeed accounts for collinearity among the variables, or do I need to handle that in a different way? I'm hopeful SPSS is essentially doing a "relative weights analysis" here.

Thanks, and hopefully this makes sense!

One Answer

The SPSS Help or Algorithms documentation states the following about "variable importance" in the Automatic Linear Modeling procedure:

For linear models, the importance of a predictor is the residual sum of squares with the predictor removed from the model, normalized so that the importance values sum to 1. This is the leave-one-out method to compute the predictor importance, based on the residual sum of squares SSe by removing one predictor at a time from the final full model.

So, variable (predictor) importance (VI)

VI = SSE_without_the_predictor - SSE_full_model, and the
Normalized_VI = VI / Sum(VI)_of_all_the_predictors.
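In code, a minimal sketch of that leave-one-out computation might look like this (plain numpy OLS on a generic y vector and an n-by-p predictor matrix X; this illustrates the quoted rule, it is not SPSS's own code):

import numpy as np

def sse(y, X):
    # Residual sum of squares of an OLS fit of y on X (intercept included).
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return resid @ resid

def predictor_importance(y, X):
    # VI_j = SSE(model without predictor j) - SSE(full model),
    # normalized so the importances sum to 1.
    sse_full = sse(y, X)
    vi = np.array([sse(y, np.delete(X, j, axis=1)) - sse_full
                   for j in range(X.shape[1])])
    return vi / vi.sum()

Applied to the predictors retained in the final model, predictor_importance(y, X) should reproduce the "importance" column of the output, assuming SPSS follows the quoted rule.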

But note that

(SSE_without_the_predictor - SSE_full_model) / SSY

is the squared part correlation of the predictor with Y. Since SSY (the sum of squares of centered Y) is constant here, it does not influence the normalized VI.

Thus, the normalized VI of a predictor is nothing else than the relative size, among all the predictors, of its squared part correlation with Y.
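A quick numerical check of this equivalence, continuing the sketch above (it reuses sse() and predictor_importance() from there, and the data are simulated purely for illustration): the normalized importances coincide with the normalized squared part correlations.

import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 5
X = rng.normal(size=(n, p))                      # simulated predictors
y = X @ rng.normal(size=p) + rng.normal(size=n)  # simulated DV

def squared_part_corr(y, X, j):
    # Correlate y with x_j residualized on the other predictors (+ intercept).
    others = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    xj_resid = X[:, j] - others @ beta
    return np.corrcoef(y, xj_resid)[0, 1] ** 2

sr2 = np.array([squared_part_corr(y, X, j) for j in range(p)])
print(np.allclose(predictor_importance(y, X), sr2 / sr2.sum()))  # True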


See @amoeba's overview of various "importance" measures in the regression setting. Check also other posts tagged [importance].

Answered by ttnphns on December 1, 2021
