TransWikia.com

Getting different precisions for same neural network with same dataset and hyperparameters in sklearn mlp classifier

Data Science Asked by hamidreza davirani on January 7, 2021

I get wildly different results on each run despite setting a random state to ensure the network produces the same output for the same hyperparameters. Here are some sample outputs (I've printed the hyperparameters manually to show that they are the same), and I think I may be missing something…

{'hidden_layer_sizes': [20, 20, 30], 'activation': 'tanh', 'solver': 'adam', 'alpha': 0.02, 'learning_rate': 'adaptive', 'iteration': 400}
0.8888888888888888

{'hidden_layer_sizes': [20, 20, 30], 'activation': 'tanh', 'solver': 'adam', 'alpha': 0.02, 'learning_rate': 'adaptive', 'iteration': 400}
0.3333333333333333

{'hidden_layer_sizes': [20, 20, 30], 'activation': 'tanh', 'solver': 'adam', 'alpha': 0.02, 'learning_rate': 'adaptive', 'iteration': 400}
0.4444444444444444

{'hidden_layer_sizes': [20, 20, 30], 'activation': 'tanh', 'solver': 'adam', 'alpha': 0.02, 'learning_rate': 'adaptive', 'iteration': 400}
0.7777777777777778

here is some of the code:

from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(random_state=2,
                    hidden_layer_sizes=(20, 20, 30),
                    max_iter=400, alpha=0.02,
                    activation='tanh',
                    solver='adam',
                    learning_rate='adaptive')
mlp.fit(X_train, y_train.values.ravel())

One Answer

If the initialization of your network is random, and the order in which you feed it the training examples is not fixed, then training can produce a different model every time, with different performance. That's normal, although if you train long enough, you would hope to often reach solutions of similar quality. So perhaps you are not training long enough. Have you plotted the loss as a function of training time (in terms of max_iter)? Does your model train until it hits the maximum number of iterations, or does it converge before that?
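In scikit-learn you can check this directly: after fitting, the estimator's n_iter_ attribute reports how many iterations actually ran, and loss_curve_ holds the training loss at each iteration. A minimal sketch using synthetic data (the original dataset isn't shown) and the question's hyperparameters:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Stand-in data; substitute your own X_train / y_train here.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

mlp = MLPClassifier(random_state=2,
                    hidden_layer_sizes=(20, 20, 30),
                    max_iter=400, alpha=0.02,
                    activation='tanh',
                    solver='adam',
                    learning_rate='adaptive')
mlp.fit(X, y)

# If n_iter_ equals max_iter, training stopped at the cap rather
# than converging -- a sign you may need to train longer.
print(mlp.n_iter_, mlp.max_iter)

# loss_curve_ has one entry per iteration; plot it (or inspect the
# tail) to see whether the loss has flattened out.
print(mlp.loss_curve_[0], mlp.loss_curve_[-1])
```

If the tail of loss_curve_ is still dropping steeply when training stops, raise max_iter before comparing runs.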

Another possibility is the amount of training data. Repeat the above exercise of figuring out how long you need to train, but with smaller fractions of your training data. For example, what's the best you can do with 20% of your training data? With 40%, 60%, and so on? If your best result is still improving significantly when you reach 100%, then you may need more training data.
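scikit-learn's learning_curve utility automates exactly this experiment: it fits the model on growing fractions of the training data and cross-validates each one. A sketch on synthetic data (the original dataset isn't shown), reusing the question's hyperparameters:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.neural_network import MLPClassifier

# Stand-in data; substitute your own X and y here.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

mlp = MLPClassifier(random_state=2, hidden_layer_sizes=(20, 20, 30),
                    max_iter=400, alpha=0.02, activation='tanh',
                    solver='adam', learning_rate='adaptive')

# Fit on 20%, 40%, ..., 100% of the training portion of each CV fold
# and record the validation score at each size.
sizes, train_scores, val_scores = learning_curve(
    mlp, X, y, train_sizes=[0.2, 0.4, 0.6, 0.8, 1.0],
    cv=3, shuffle=True, random_state=0)

for n, s in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n} samples -> mean validation accuracy {s:.3f}")
```

If the validation score is still climbing at the largest size, the model is data-limited; if it plateaus early, more data alone won't help.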

Answered by Paul on January 7, 2021

