TransWikia.com

Feature importance for particular classes

Data Science Asked on November 20, 2021

Suppose I have a dataset labeled with two classes, such as healthy and unhealthy, and I have computed feature importances on it for feature selection.

How can I know if the features are important to a particular class (to healthy or unhealthy)?

2 Answers

Something like this should get you going.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier

# Load the credit data set; 'serious_dlqin2yrs' is the binary target column
df = pd.read_csv("https://rodeo-tutorials.s3.amazonaws.com/data/credit-data-trainingset.csv")
df.head()

features = np.array(['revolving_utilization_of_unsecured_lines',
                     'age', 'number_of_time30-59_days_past_due_not_worse',
                     'debt_ratio', 'monthly_income', 'number_of_open_credit_lines_and_loans',
                     'number_of_times90_days_late', 'number_real_estate_loans_or_lines',
                     'number_of_time60-89_days_past_due_not_worse', 'number_of_dependents'])

# Fit a random forest on the selected feature columns
clf = RandomForestClassifier()
clf.fit(df[features], df['serious_dlqin2yrs'])

# Sort the calculated importances in ascending order, so that the most
# important feature ends up at the top of the horizontal bar plot
importances = clf.feature_importances_
sorted_idx = np.argsort(importances)

# Bar positions on the y axis, one per feature
padding = np.arange(len(features)) + 0.5
plt.barh(padding, importances[sorted_idx], align='center')
plt.yticks(padding, features[sorted_idx])
plt.xlabel("Relative Importance")
plt.title("Variable Importance")
plt.show()


Answered by ASH on November 20, 2021

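Assuming we are talking about impurity-based feature importance for decision tree algorithms here, you cannot really say. The importance only measures how much a feature contributes to the splits that separate the two classes; it does not tell you which class the feature points toward.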
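To make this concrete, here is a minimal sketch showing that a fitted random forest exposes a single importance vector for the whole model, not one per class. The synthetic data from `make_classification` and all parameter values are assumptions chosen only for illustration; they stand in for a healthy/unhealthy dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic two-class data (an assumption, standing in for healthy/unhealthy)
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# There are two classes, but only one importance score per feature.
# The scores are normalized to sum to 1 and carry no sign or class label.
print("classes:", clf.classes_)
print("importances shape:", clf.feature_importances_.shape)
print("importances sum:", clf.feature_importances_.sum())
```

Because the scores are unsigned and shared by both classes, a high value only says the feature helps separate the classes, not which class it favors.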

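If you would like more insight into how your model makes decisions, you could look into SHAP and LIME. Both explain individual predictions by attributing them to the input features: LIME fits a simple surrogate model around a single prediction, while SHAP assigns each feature a contribution based on Shapley values. Both are available as Python libraries.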

Answered by Simon Larsson on November 20, 2021
