TransWikia.com

What is the point of test set in ML?

Cross Validated Asked by Lelouche Lamperouge on November 29, 2021

The typical ML process consists of three sets: training, dev, and test.

But the actual training and tuning happen only on the training and dev sets. The final model is then applied to the test set to give us some satisfaction that our algorithm generalises well.

And that’s it. That’s the only point of the test set: satisfaction. If we are not satisfied with the test set result, we can’t redo the whole training process, because then we’d be overfitting the test set, unless we have a new test set.

So wouldn’t it be better to merge the only test set we have into the training and dev sets, so as to get more examples for the algorithm to work on?

4 Answers

I will add that your model doesn’t “lose out on” the test set data, as many people think. The purpose of the train/validation/test split is solely to estimate generalization error. Once that is complete, you can retrain your model with the test set included: it has done its job, and there is no need to exclude it any longer. So your model doesn’t lose out on that data; it uses it to train, just after it has served its purpose.
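The workflow described above can be sketched roughly as follows. This is only an illustration under assumed conditions: the synthetic data, the 160/40 split, and the helper names `fit_line` and `mse` are all made up for the example, and a simple least-squares line stands in for whatever model you actually use.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def mse(model, points):
    """Mean squared error of the fitted line on the given points."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(200)]
data = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)) for x in xs]
rng.shuffle(data)

# Step 1: hold out a test set and fit on the rest.
train_dev, test = data[:160], data[160:]
model = fit_line(train_dev)

# Step 2: the held-out error is the estimate of generalization error.
test_mse = mse(model, test)

# Step 3: the test set has done its job, so retrain on all the data.
# test_mse stays the number you report; it was measured before this step.
final_model = fit_line(data)
print(f"held-out MSE: {test_mse:.3f}, final slope: {final_model[0]:.3f}")
```

Note that the final model is never scored on held-out data itself; the reported estimate belongs to the model fitted in step 1.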

Answered by astel on November 29, 2021

The test set is used to assess the performance of the trained ML system. In the process of training, both the training and the validation (dev) set contribute to the parameters of the system: You optimise them on the training set, and check how the system performs on the dev set. Then you tune some hyperparameters and repeat, until you achieve the best performance on the dev set.

In the end, although the dev set hasn't influenced the system's parameters directly, it contributed to the choice of the hyperparameters. So the system's performance on these two sets, training and dev, is usually too good to be true: It has been optimised to work well on them.

What you want to know, however, is how the system will perform in practice, on data never seen before. Measuring performance on the test set, which hasn't been used at all during training, gives you a more realistic estimate.
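As a rough sketch of the three roles described above, the following toy example fits on the training set, picks a hyperparameter on the dev set, and touches the test set only once at the end. Everything here is an assumption made for illustration: the synthetic noisy data, the 1-D k-nearest-neighbours classifier, the candidate values of `k`, and the helper names.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D inputs)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > k else 0

def accuracy(train, points, k):
    """Fraction of points whose label the k-NN rule predicts correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in points)
    return hits / len(points)

rng = random.Random(42)

def sample(n):
    """Label is 1 when x > 0.5, flipped with probability 0.2 (label noise)."""
    out = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.2:
            y = 1 - y
        out.append((x, y))
    return out

train, dev, test = sample(300), sample(100), sample(100)

# Hyperparameter tuning: the dev set decides which k is kept.
candidate_ks = [1, 3, 5, 9, 15, 25]
dev_scores = {k: accuracy(train, dev, k) for k in candidate_ks}
best_k = max(dev_scores, key=dev_scores.get)

# Training and dev scores were both used to build or select the model,
# so only the test score comes from data that played no part in either.
train_acc = accuracy(train, train, best_k)
dev_acc = dev_scores[best_k]
test_acc = accuracy(train, test, best_k)
print(f"k={best_k} train={train_acc:.2f} dev={dev_acc:.2f} test={test_acc:.2f}")
```

With `k=1` the training accuracy is perfect by construction, since every training point is its own nearest neighbour; that is the extreme case of a score that is "too good to be true".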

Answered by Igor F. on November 29, 2021

Sometimes you don't have a test set. When data is scarce, performance is typically reported via cross-validation, because an estimate based on, say, 4-5 held-out samples out of a total of 30 would be highly unreliable. If there are hyperparameters to tune, then nested cross-validation is used. This isn't done because it's better than separating a test set, but because you have to.

However, if there is enough data, separation of a test set is highly advisable because the verdict of overfitting/underfitting would be best made on the separate test set.
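For the scarce-data case, nested cross-validation might look roughly like the sketch below: an inner loop chooses the hyperparameter using only the outer-training portion, and an outer loop scores that choice on points the selection never saw. The 30 synthetic samples, the fold counts, the k-nearest-neighbours classifier, and the helper names are all assumptions made for this example.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D inputs)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > k else 0

def k_folds(points, n_folds):
    """Yield (train, held_out) pairs; every point is held out exactly once."""
    for i in range(n_folds):
        held_out = points[i::n_folds]
        train = [p for j, p in enumerate(points) if j % n_folds != i]
        yield train, held_out

def cv_accuracy(points, k, n_folds):
    """Plain cross-validated accuracy of k-NN for a fixed k."""
    hits = 0
    for train, held_out in k_folds(points, n_folds):
        hits += sum(knn_predict(train, x, k) == y for x, y in held_out)
    return hits / len(points)

# 30 synthetic samples (an assumption): label is 1 when x > 0.5, 10% flipped.
rng = random.Random(7)
data = []
for _ in range(30):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

candidate_ks = [1, 3, 5]
chosen = []
outer_hits = 0
for outer_train, outer_test in k_folds(data, 5):
    # Inner loop: choose k using only the outer-training portion.
    best_k = max(candidate_ks, key=lambda k: cv_accuracy(outer_train, k, 4))
    chosen.append(best_k)
    # Outer loop: score the chosen k on data the selection never saw.
    outer_hits += sum(knn_predict(outer_train, x, best_k) == y
                      for x, y in outer_test)

nested_cv_accuracy = outer_hits / len(data)
print(f"k chosen per outer fold: {chosen}, accuracy: {nested_cv_accuracy:.2f}")
```

Each sample contributes to the reported accuracy exactly once, and never in a fold where it also influenced the choice of `k`.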

Answered by gunes on November 29, 2021

Edit: There are four ML sets: training, training-dev, dev, and test. Commonly, though, only three are used: training, dev, and test. The point of the test set is to find out whether your model has overfitted. For this we look at the accuracy and loss of the model on the training and training-dev sets. If the model turns out to have high bias or high variance, we try bigger networks, other types of networks, or some regularization and optimization.

I will try to illustrate Dave's problem. You want to create a self-driving car model. For this you need a big data set, well-tuned hyperparameters, a good model architecture, and so on. Finally, you've done it: you've created the model. Now it's time to train it. After training you will see a plot of the cost (loss) function. You have trained the model, but that doesn't tell you that it works well on real problems. You should test (debug) it to see how well it has learned the goal it was created for. If you don't test it with dev and test sets, it could end very, very badly ;(

So, to summarize: the point of the test set is to debug and test your model, to see whether it has overfitted, and to decide whether other techniques are needed to make it work well.

Answered by TheFnafException on November 29, 2021
