TransWikia.com

Suspiciously good accuracy using neural network

Data Science Asked by Po Chen Liu on February 1, 2021

I have an EEG dataset with 24 features (one per electrode), 88,000 samples and 3 classes. It has been normalised, and some noise has been filtered out with a band-pass filter.

When I classify with anything but a neural network, the accuracy is pretty bad. I am using a 60/40 training/test split just to make sure I can trust the result.

For example:

  • Gaussian naive bayes: 42%
  • Logistic Regression: 52%
  • Linear Discriminant Analysis: 51%

However, I played around with a neural network and got 95%+ accuracy on average with:

  • 3 hidden layers: 100, 200, 100
  • Activation: relu
  • Learning rate: adaptive

I think this is super fishy, so I ran a PCA:
[image: PCA analysis]

And plotted the data reduced to two dimensions:
[image: 2-D PCA projection]

As you can see, there is nothing significantly separable, which really confuses me.
I am definitely using the test set to run the cross validation which is 40% of the sample data.

Can someone please advise on what’s happening and whether I can trust this result? And what further steps I can do to make this result more concrete?
I don’t want to celebrate too early!!!

One Answer

"As you can see, there is nothing significantly separable, which really confuses me. I am definitely using the test set to run the cross validation which is 40% of the sample data."

Well, your interpretation is wrong: what the plot shows is that nothing is linearly separable along the first two principal components.

The other PCA components may be more relevant. Remember that PCA gives you the directions along which the data has the most variance, which are not necessarily the directions along which it is most easily separated or classified.
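To make the point about variance versus separability concrete, here is a minimal sketch on synthetic data (not your EEG data). It assumes scikit-learn and NumPy; the two-feature toy set and the helper `acc_on` are made up for illustration. One axis has a large variance and no class information, the other has a small variance and carries all of it.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)

# Axis 0: large variance, carries no class information.
# Axis 1: small variance, but the class mean shifts by 2 along it.
X = np.column_stack([
    rng.normal(0.0, 10.0, size=n),
    rng.normal(0.0, 0.5, size=n) + 2.0 * y,
])

pca = PCA(n_components=2).fit(X)
Z = pca.transform(X)

def acc_on(feature):
    """Training accuracy of a linear classifier on a single component."""
    return LogisticRegression().fit(feature, y).score(feature, y)

acc_pc1 = acc_on(Z[:, [0]])  # component with the most variance
acc_pc2 = acc_on(Z[:, [1]])  # component with the least variance

print("explained variance ratio:", pca.explained_variance_ratio_)
print("accuracy using PC1 only:", acc_pc1)
print("accuracy using PC2 only:", acc_pc2)
```

In this toy case the first component explains almost all of the variance yet is close to useless for classification, while the second one separates the classes well. A plot of the leading components alone cannot rule out structure in the remaining ones.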

I see you tried LDA, which builds on that idea that the direction of greatest variance is not necessarily the most useful one for classification. But LDA is still linear, so from its poor accuracy we can conclude that your data is not linearly separable.

Your model seems pretty large; it is considerably more powerful than LDA, LR, etc., and it can learn non-linear decision boundaries that those models cannot.
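As a rough illustration of that gap, the sketch below compares logistic regression with an MLP on a synthetic, non-linearly separable problem (two concentric circles from scikit-learn's `make_circles`). This is not your data; the layer sizes and the 60/40 split are copied from the question, and the solver is left at scikit-learn's default, which is an assumption on my part.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Two concentric rings: no straight line can separate the classes.
X, y = make_circles(n_samples=2000, noise=0.1, factor=0.4, random_state=0)

# Same 60/40 split as in the question, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)

linear = LogisticRegression().fit(X_train, y_train)
mlp = MLPClassifier(
    hidden_layer_sizes=(100, 200, 100),
    activation="relu",
    max_iter=500,
    random_state=0,
).fit(X_train, y_train)

lin_acc = linear.score(X_test, y_test)
mlp_acc = mlp.score(X_test, y_test)
print("logistic regression test accuracy:", lin_acc)
print("MLP test accuracy:", mlp_acc)
```

Here the linear model should stay near chance while the network does far better on the held-out 40%. So a large gap between linear baselines and a neural network is not suspicious by itself.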

If you are not confident in the results, try other validation methods such as k-fold or leave-one-out cross-validation.
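A minimal sketch of stratified k-fold cross-validation, assuming scikit-learn. The data comes from `make_classification` as a stand-in with 24 features and 3 classes (far fewer samples than your 88,000, to keep it quick), and the network settings are copied from the question. Swap in your own `X` and `y`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data: 24 features, 3 classes (replace with the real EEG arrays).
X, y = make_classification(
    n_samples=1500,
    n_features=24,
    n_informative=10,
    n_classes=3,
    random_state=0,
)

# The scaler sits inside the pipeline, so it is refit on each training fold
# and never sees the fold that is being evaluated.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        hidden_layer_sizes=(100, 200, 100),
        activation="relu",
        max_iter=300,
        random_state=0,
    ),
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("fold accuracies:", scores)
print("mean:", scores.mean(), "std:", scores.std())
```

If the per-fold accuracies are all high and close to each other, the result is less likely to be an artefact of one lucky split. Leave-one-out would need one fit per sample, which is probably impractical with 88,000 samples, so k-fold is the more realistic choice here.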

Answered by Pedro Henrique Monforte on February 1, 2021
