Random Forest Output

Stack Overflow Asked by Charles Biwer on December 30, 2021

I am working on a supervised binary prediction problem. The dataframe (df) has 2 numerical features; the rest are categorical, which I one-hot encoded. I have handled all missing, NaN, and infinite values.
As a reminder, the dependent variable is binary.
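
For reference, the preprocessing described above might look roughly like the following. This is a minimal sketch on invented data: the column names num_a, num_b, and color, the median fill, and the "missing" category label are assumptions, not the choices actually made in the project.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are illustrative, not from the real project.
raw = pd.DataFrame({
    "num_a": [1.0, np.inf, 3.0, np.nan],
    "num_b": [10.0, 20.0, -np.inf, 40.0],
    "color": ["red", "blue", None, "red"],
    "dependent_var": [0, 1, 0, 1],
})

# Treat +/-inf as missing, then fill numeric gaps with the column median
# (the median fill is an assumed strategy, chosen only for the example).
num_cols = ["num_a", "num_b"]
raw[num_cols] = raw[num_cols].replace([np.inf, -np.inf], np.nan)
raw[num_cols] = raw[num_cols].fillna(raw[num_cols].median())

# Give missing categories an explicit label so one-hot encoding keeps them.
raw["color"] = raw["color"].fillna("missing")

# One-hot encode only the categorical column; numeric columns pass through.
df = pd.get_dummies(raw, columns=["color"])
print(df.columns.tolist())
```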

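import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Extract the binary target into its own single-column dataframe
dataset_target = df[['dependent_var']].values
dataset_target = pd.DataFrame(dataset_target)
dataset_target.columns = ['dependent_var']

regressor = RandomForestRegressor(n_estimators=500, random_state=0, n_jobs=-1)
# Train the regressor (note: df is passed in full, including the 'dependent_var' column)
regressor.fit(df, dataset_target.values)

# Print the impurity-based importance of each feature (values only, no names)
for importance in regressor.feature_importances_:
    print(importance)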

The model is supposed to help me select the most important features. It runs, but the results are very unsatisfying (only 0s and a single 1). I don't understand them, so I don't know what to change in my input.

This is my first prediction project at my internship as a DA, so I would be very glad for any help.

This is a snippet of the random forest output:

[image: screenshot of the output, not available]
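
One possible explanation for an importance vector of all zeros and a single one: the code above takes the target out of df but never drops it, so df still holds dependent_var when it is passed to fit, and the forest can predict the target perfectly from its own copy. The sketch below reproduces that pattern on made-up data and then refits without the target column. The toy columns (num_a, num_b, cat_x), the sample size, and the switch to RandomForestClassifier are assumptions for illustration, not part of the original setup.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Hypothetical toy data standing in for the one-hot encoded df.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "num_a": rng.normal(size=n),
    "num_b": rng.normal(size=n),
    "cat_x": rng.integers(0, 2, size=n),
})
# Binary target that depends on num_a plus some noise.
df["dependent_var"] = (df["num_a"] + 0.3 * rng.normal(size=n) > 0).astype(int)

# Same pattern as in the question: the target is still a column of the features.
leaky = RandomForestRegressor(n_estimators=50, random_state=0, n_jobs=-1)
leaky.fit(df, df["dependent_var"])
leaky_importances = pd.Series(leaky.feature_importances_, index=df.columns)
print(leaky_importances)  # the 'dependent_var' column takes almost all the weight

# Drop the target from the features; a classifier matches a binary target.
X = df.drop(columns=["dependent_var"])
y = df["dependent_var"]
clf = RandomForestClassifier(n_estimators=50, random_state=0, n_jobs=-1)
clf.fit(X, y)

# Pair each importance with its column name and sort for readability.
importances = pd.Series(clf.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

Wrapping feature_importances_ in a pd.Series indexed by X.columns also shows which feature each number belongs to. Impurity-based importances tend to favor continuous or high-cardinality features, so they are probably best read as a rough ranking rather than an exact measure.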
