Averaging feature importance from different models

Cross Validated Asked by henry50618 on November 21, 2021

I have three datasets, each containing a different subset of the features.

For example, dataset 1 has features A and B, dataset 2 has features B and C, and dataset 3 has features A and C.

I would like to find the overall feature importance of A, B, and C. Is the following procedure valid?

(1) Find the feature importance in each of the three datasets separately (using dominance analysis or PLS-SEM):

dataset1 -> A : 50%, B : 50%

dataset2 -> B : 25%, C : 75%

dataset3 -> A : 40%, C : 60%

(2) Weight each feature importance by its dataset's share of the total sample size:

dataset 1 has 4000 samples

dataset 2 has 5000 samples

dataset 3 has 6000 samples

feature importance of A = 50% × (4000/15000) + 40% × (6000/15000) = 29.3%

feature importance of B = 50% × (4000/15000) + 25% × (5000/15000) = 21.7%

feature importance of C = 75% × (5000/15000) + 60% × (6000/15000) = 49.0%
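
To make the arithmetic in step (2) easy to check, here is a minimal Python sketch of it. The per-dataset importances and sample sizes are the ones listed above; the function name `pooled_importance` and the dictionary layout are just my own choices for illustration. It only reproduces the calculation and says nothing about whether the procedure is statistically sound.

```python
# Per-dataset importance shares from step (1), as fractions.
importances = {
    "dataset1": {"A": 0.50, "B": 0.50},
    "dataset2": {"B": 0.25, "C": 0.75},
    "dataset3": {"A": 0.40, "C": 0.60},
}

# Sample sizes from step (2).
n_samples = {"dataset1": 4000, "dataset2": 5000, "dataset3": 6000}


def pooled_importance(importances, n_samples):
    """Sum each feature's importance, weighted by its dataset's share
    of the total sample size (hypothetical helper for illustration)."""
    total = sum(n_samples.values())
    pooled = {}
    for name, shares in importances.items():
        weight = n_samples[name] / total
        for feature, share in shares.items():
            pooled[feature] = pooled.get(feature, 0.0) + share * weight
    return pooled


pooled = pooled_importance(importances, n_samples)
for feature in sorted(pooled):
    print(f"{feature}: {pooled[feature]:.1%}")
```

The three pooled values sum to 100% because each dataset's importances sum to 100% and the sample weights sum to 1.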

I am not sure whether this procedure is reasonable. Can anyone give me some advice?

I would really appreciate it.