Is it possible for a model with a large amount of data to perform very well and reach an extremely low cost within a single epoch?

Question

I am working on a project to detect human awareness levels using this dataset.
I preprocessed the video data as the following:

Convert video to frames(taking a frame every 5 seconds.
Rotate the frames to be vertical.
Apply OpenCV DNN to extract the faces from the images.
Split the data into 90% train, 5% validation and 5% test.

All in the dataset has a size of about 570,000 images.
I am using the model on a mobile device so I used transfer learning with MobileNetV2.
The model classification is extremely good but it feels odd seeing it do so well and reach a very low loss so fast.
Is this even possible on a dataset this big? I am feeling that I did something wrong cause when I try to use the model on the mobile device with Tensorflow.js it does not perform well at all. After doing some research I realized that I should be using a model that combines a CNN and a LSTM as this is video data. But I am bit strapped for time to redo the whole preprocessing of the data to convert the images into a sequence of frames and then do the training once more.
What I was planning to do was make an average of the predictions on the mobile device to improve the accuracy there but I am wondering if I messed up the process anywhere.

yudhiesh · Accepted Answer

So the model was performing poorly because I was making predictions on the entire input image instead of doing face detection then performing predictions on the cropped faces.

N. Kiefer · Answer

Like you correctly pointed out, your data is actually sequential. Simply randomly splitting your data for training and testing won't do here. If you do it like that it is very likely that every test frame is only 5 frames away from a training frame making it look very similar. Your network has practically seen your testing data in training already.
You will probably have to train again. I would recommend though, to save your data after preprocessing it, so you can immediatly start from this point again.

Brandon Donehoo · Answer

A few things come to mind here:

If you are using a pre-trained MobileNetV2 on a task which is similar to the pre-training, then you may not need much fine-tuning to get good results. This may explain why you are seeing good training results.
For the poor testing results, are you transforming your frames in the same way you did for training? Any differences between the training pipeline or phone testing pipeline you can think of? Is it possible to test some of your good training results with the phone testing pipeline as a sanity check?

Is it possible for a model with a large amount of data to perform very well and reach an extremely low cost within a single epoch?

3 Answers

Add your own answers!

Ask a Question