Data Science Asked by Sebastian Topalian on November 1, 2020
I have created four random forest models. They share the same X data, but their y data are four different response variables. The sklearn random forest feature importances are identical for all four. All four models achieve their purpose and make different predictions, yet their feature importances are the same.
Has anyone experienced this before?
I created the models with a series of nested objects, as illustrated below. I used the same code before without getting identical random forest feature importances. The difference was that inside each object I ran a 3-fold CV to determine max_features, whereas here I just used the default, which is all of them.
Current code:
class NoCVMethod:
    def __init__(self, X_train, y_train, X_test, y_test, y, Method):
        self.clf = Method
        self.clf.fit(X_train, y_train)
        self.predictions = self.clf.predict(X_test)
        self.rev_preds = rev_pred(y[-(13978+97):].values, self.predictions)
        self.residuals = y_test - self.rev_preds
        self.RMSE = np.mean((self.residuals)**2)**0.5

class Different_variables:
    def __init__(self, X_train, y_train, X_test, y_test, Method):
        self.TSS = NoCVMethod(X_train, y_train[y_train.columns.tolist()[0]], X_test, y_test[y_test.columns.tolist()[0]], y[y.columns.tolist()[0]], Method)
        self.NOx = NoCVMethod(X_train, y_train[y_train.columns.tolist()[1]], X_test, y_test[y_test.columns.tolist()[1]], y[y.columns.tolist()[1]], Method)
        self.NH4 = NoCVMethod(X_train, y_train[y_train.columns.tolist()[2]], X_test, y_test[y_test.columns.tolist()[2]], y[y.columns.tolist()[2]], Method)
        self.PO4 = NoCVMethod(X_train, y_train[y_train.columns.tolist()[3]], X_test, y_test[y_test.columns.tolist()[3]], y[y.columns.tolist()[3]], Method)
Old code:
class CVMethod:
    def __init__(self, X_train, y_train, X_test, y_test, y, param_dict, Method):
        self.pipeline = Pipeline([
            ('scale', StandardScaler()),
            ('clf', Method)
        ])
        self.param_grid = param_dict
        self.grid = GridSearchCV(self.pipeline, param_grid=self.param_grid, cv=3, verbose=False, n_jobs=-1)
        self.grid.fit(X_train, y_train)
        self.predictions = self.grid.predict(X_test).ravel()
        self.rev_preds = rev_pred(y[-(13978+97):].values, self.predictions)
        self.residuals = y_test - self.rev_preds
        self.RMSE = np.mean((self.residuals)**2)**0.5

class CVDifferent_variables:
    def __init__(self, X_train, y_train, X_test, y_test, param_dict, Method):
        self.TSS = CVMethod(X_train, y_train[y_train.columns.tolist()[0]], X_test, y_test[y_test.columns.tolist()[0]], y[y.columns.tolist()[0]], param_dict, Method)
        self.NOx = CVMethod(X_train, y_train[y_train.columns.tolist()[1]], X_test, y_test[y_test.columns.tolist()[1]], y[y.columns.tolist()[1]], param_dict, Method)
        self.NH4 = CVMethod(X_train, y_train[y_train.columns.tolist()[2]], X_test, y_test[y_test.columns.tolist()[2]], y[y.columns.tolist()[2]], param_dict, Method)
        self.PO4 = CVMethod(X_train, y_train[y_train.columns.tolist()[3]], X_test, y_test[y_test.columns.tolist()[3]], y[y.columns.tolist()[3]], param_dict, Method)
It seems that your self.clf points to your Method, so all four objects share one estimator that is refit on each call. At the end, you are probably printing the feature importances of a single classifier, the one that was fitted last.
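The aliasing described above can be reproduced on toy data. The following is a minimal sketch, not the asker's actual setup: the `Holder` class, the synthetic `X`, and the two targets are assumptions made up for illustration. It shows that two wrappers given the same estimator instance end up reporting the importances of whichever fit ran last.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (illustrative only): y_a depends on feature 0, y_b on feature 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y_a = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)
y_b = 3.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

# One estimator instance, passed to both wrappers (as in the question).
shared = RandomForestRegressor(n_estimators=20, random_state=0)


class Holder:
    """Hypothetical stand-in for NoCVMethod: stores and fits the estimator."""

    def __init__(self, X, y, method):
        self.clf = method      # no copy: just another reference to `method`
        self.clf.fit(X, y)     # refits the shared object in place


model_a = Holder(X, y_a, shared)
model_b = Holder(X, y_b, shared)   # overwrites the fit done for model_a

same_object = model_a.clf is model_b.clf
importances_a = model_a.clf.feature_importances_
importances_b = model_b.clf.feature_importances_
```

This would also be consistent with the predictions still differing in the question: they are computed and stored inside `__init__` right after each fit, before the next fit overwrites the shared estimator.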
Maybe you should copy it:
from copy import deepcopy
from sklearn.base import clone

class NoCVMethod:
    [...]
        self.clf = clone(Method)  # copies only the estimator's parameters (unfitted)
        # OR
        self.clf = deepcopy(Method)  # if you also want to copy any fitted state
See here (or here as you suggested) for more details about copying an sklearn estimator.
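As a rough illustration of the `clone` suggestion, here is a small sketch on synthetic data. The variable names (`template`, `targets`, `fitted`) and the data are hypothetical and not taken from the question. Each target gets its own unfitted copy of the estimator, so the importances stay separate.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (illustrative only): each target depends on a different feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
targets = {
    "first": 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200),
    "last": 3.0 * X[:, 2] + rng.normal(scale=0.1, size=200),
}

# The template is never fitted itself; it only carries the hyperparameters.
template = RandomForestRegressor(n_estimators=20, random_state=0)

# clone() returns a new unfitted estimator with the same parameters,
# so every target ends up with an independent model.
fitted = {name: clone(template).fit(X, y) for name, y in targets.items()}

# Index of the most important feature for each model.
top_feature = {
    name: int(np.argmax(model.feature_importances_))
    for name, model in fitted.items()
}
```

This may also explain why the older GridSearchCV version did not show the problem: GridSearchCV clones the estimator it is given before fitting, so each grid search holds its own fitted copy.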
Correct answer by etiennedm on November 1, 2020