
How to understand Learning Curves

Data Science · Asked by m8tey7 on December 3, 2020

I’ve developed two models and wanted to test them. However, I don’t know how to interpret the results of the learning curves properly.

For Model 1:



                 precision    recall  f1-score   support
      class 1       0.82      0.94      0.88      1717
      class 2       0.97      0.90      0.93      3504

    accuracy                           0.91      5221

[Learning Curve for Model 1]

For Model 2:

                precision    recall  f1-score   support

      class 1       0.27      0.98      0.43       115
      class 2       1.00      0.77      0.87      1317

    accuracy                           0.79      1432

[Learning Curve for Model 2]

For Model 2, I’ve checked that my code is correct, but the classification results don’t seem to match up with the learning curve. Could someone please explain what’s happening in each case?

Thanks

Some extra info:

  • Linear SVC was used (a minimal sketch of this kind of setup is shown after this list)
  • There is a data imbalance, but I set the class weight to "balanced"
  • The dataset relates to a worklog (natural language processing)
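
A minimal sketch of a setup like the one described above; the vectorizer, the variable names texts and labels, and the split parameters are assumptions for illustration, not the actual code:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    # Hypothetical worklog data: texts is a list of strings, labels holds the class of each entry.
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=0
    )

    # TF-IDF features feeding a Linear SVC with balanced class weights, as described in the question.
    model = make_pipeline(
        TfidfVectorizer(),
        LinearSVC(class_weight="balanced"),
    )
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test)))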

One Answer

There are a couple of potential things going on here.

  • Your split is not homogeneous. Your train and test sets are not similar; this can happen when the split follows some bias or pattern, so the two sets end up with different distributions. That alone makes the train and test errors differ.

  • Overfitting. Since the training error is better than the test error, your model may simply be overfitting. This is consistent with the graphs: as you increase the corpus size, the overfitting becomes smaller.

If you consider the split to have been done correctly, then it is probably the second.
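
As an illustration of the second point, the learning curve can be recomputed with stratified folds (which also guards against the first point) and a macro-averaged F1 score, which is less misleading than accuracy under class imbalance. The pipeline and variable names below are assumptions carried over from the sketch in the question, not the original code:

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import learning_curve, StratifiedKFold

    model = make_pipeline(TfidfVectorizer(), LinearSVC(class_weight="balanced"))

    # Stratified folds keep the class ratio similar in the train and validation
    # splits, so the gap between the curves is not driven by a biased split.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

    train_sizes, train_scores, val_scores = learning_curve(
        model, texts, labels,
        train_sizes=np.linspace(0.1, 1.0, 5),
        cv=cv,
        scoring="f1_macro",  # macro F1 instead of accuracy because of the imbalance
    )

    # If the mean training score and the mean validation score converge as the
    # training size grows, the overfitting is shrinking with more data.
    print(train_sizes)
    print(train_scores.mean(axis=1))
    print(val_scores.mean(axis=1))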

Answered by Carlos Mougan on December 3, 2020
