Object detection stops predicting well depending on how I collect the images

Question

For my problem I have to record images at 4K resolution (3840x2160) and detect QR codes at the same time. I then resize the images to 1920x1080 before saving them to disk.
The problem is that if I train on this resized data, the model predicts much worse than when I train on video recorded natively at 1920x1080.
With the model trained on resized data, I can't get good results whether I predict on resized 1920x1080 images or on native 1920x1080 images.
If I train on data recorded natively at 1920x1080, it predicts well on both native 1920x1080 and resized 1920x1080 images.
Is there an explanation for why this happens?
I always save the images as JPG with quality=95, and the camera I'm using compresses the image before sending it to OpenCV (this directly affects the fps).
I can't go back to recording at 1920x1080 because the QR code reader stops detecting at that resolution.
How I record the video without resize (working method):
for z, frame in enumerate(frames):
    try:
        self.out.write(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    except AttributeError:
        # self.out does not exist yet: create the writer on the first frame
        self.out = cv2.VideoWriter(ANNOTATION_RECORD_VIDEO_FILE, cv2.VideoWriter_fourcc('m','p','4','v'), 60, (conf.cameras.phase1[0].resolution.width, conf.cameras.phase1[0].resolution.height))
        self.out.write(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

With resize (frames are coming with 4k res):
for z, frame in enumerate(frames):
    frame = cv2.resize(frame, dsize=(conf.annotation.resolution_to_save['width'],
                                     conf.annotation.resolution_to_save['height']),
                       interpolation=cv2.INTER_CUBIC)
    try:
        self.out.write(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    except AttributeError:
        self.out = cv2.VideoWriter(ANNOTATION_RECORD_VIDEO_FILE, cv2.VideoWriter_fourcc('m','p','4','v'), 60, (conf.annotation.resolution_to_save['width'], conf.annotation.resolution_to_save['height']))
        self.out.write(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

Anoop A Nair · Answer

For my problem I have to record images at 4K resolution (3840x2160)
and detect QR codes at the same time. I then resize the images to
1920x1080 before saving them to disk.

Have you tried using a Haar cascade classifier to crop out the region containing the QR code? That way you only have to process a much smaller image, which I would guess helps with the frame rate.
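As a rough sketch of the cropping step only: once some detector (a Haar cascade or anything else) returns a bounding box in 4K pixel coordinates, the QR region can be cut out at full resolution with a safety margin. The helper name `crop_with_margin`, the `margin` value, and the box coordinates below are assumptions for illustration; the detector itself is not shown.

```python
import numpy as np

def crop_with_margin(frame, box, margin=0.2):
    """Crop box = (x, y, w, h) from frame, padded by margin * box size and clamped to the frame."""
    frame_h, frame_w = frame.shape[:2]
    x, y, w, h = box
    pad_x, pad_y = int(w * margin), int(h * margin)
    x0, y0 = max(x - pad_x, 0), max(y - pad_y, 0)
    x1, y1 = min(x + w + pad_x, frame_w), min(y + h + pad_y, frame_h)
    return frame[y0:y1, x0:x1]

# Stand-in for a 4K frame; in practice this comes from the camera.
frame_4k = np.zeros((2160, 3840, 3), dtype=np.uint8)
# Hypothetical detector output (x, y, width, height) near the right edge of the frame.
qr_box = (3700, 100, 200, 200)

qr_crop = crop_with_margin(frame_4k, qr_box)
print(qr_crop.shape)  # (280, 180, 3): the right-hand margin is clipped at the frame border
```

The crop is a view into the 4K frame, so the QR reader sees the original pixels while the rest of the frame can still be resized for saving.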

The problem is that if I train on this resized data, the model
predicts much worse than when I train on video recorded natively at
1920x1080. With the model trained on resized data, I can't get good
results whether I predict on resized 1920x1080 images or on native
1920x1080 images. If I train on data recorded natively at 1920x1080,
it predicts well on both native 1920x1080 and resized 1920x1080 images.

I'm not entirely sure what you mean by this. A bit more clarity would be nice!

Is there an explanation for why this happens?

If you're training on the resized 4K images and testing on native 1920x1080 images, it's natural that this happens, because subsampling removes features. It's better to look at a few resizing algorithms and write your own resizer, because QR codes have intricate details that can be lost during subsampling.
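To make the point about resizing algorithms concrete, here is a minimal NumPy sketch (not the OpenCV implementation) that compares two ways of halving an image: plain decimation, and averaging each 2x2 block, which is the idea behind area interpolation. The function names and the synthetic stripe pattern are assumptions for illustration. For what it's worth, the OpenCV documentation suggests `cv2.INTER_AREA` rather than `cv2.INTER_CUBIC` when shrinking an image, so that may be worth trying before writing a custom resizer.

```python
import numpy as np

def downsample_decimate(img, factor=2):
    """Keep every factor-th pixel; there is no low-pass filtering, so fine detail aliases."""
    return img[::factor, ::factor]

def downsample_area(img, factor=2):
    """Average each factor x factor block of a 2-D grayscale image."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor  # drop edge rows/columns that do not fill a block
    blocks = img[:h, :w].astype(np.float32).reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3)).round().astype(np.uint8)

# Synthetic worst case: vertical stripes one pixel wide (0, 255, 0, 255, ...).
stripes = np.tile(np.array([0, 255], dtype=np.uint8), (8, 4))  # shape (8, 8)

print(downsample_decimate(stripes)[0].tolist())  # [0, 0, 0, 0]: the white stripes vanish
print(downsample_area(stripes)[0].tolist())      # [128, 128, 128, 128]: detail becomes grey
```

Neither result keeps the stripes, which is the underlying limit: detail finer than the target pixel grid cannot survive a downscale, and the resizing method only decides how it degrades. That difference in degradation may be part of why resized and native 1920x1080 frames look different to the model.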

I always save the images as JPG with quality=95, and the camera I'm
using compresses the image before sending it to OpenCV (this directly
affects the fps).

Instead of this, why not record at 1920x1080 and upsample using interpolation? This would not be as good as 4K, but it would be close. You can then sharpen the image until it's on par with 4K.
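A minimal sketch of that upsample-then-sharpen idea, written in plain NumPy for a 2-D grayscale image. It assumes nearest-neighbour upsampling and an unsharp mask built on a 3x3 box blur; the function names and the tiny test image are made up for illustration, and a real pipeline would more likely use a library resize and a Gaussian blur.

```python
import numpy as np

def upsample_nearest(img, factor=2):
    """Repeat each pixel factor x factor times (nearest-neighbour upsampling)."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def box_blur3(img):
    """3x3 mean filter on a 2-D image, with edge pixels replicated at the border."""
    padded = np.pad(img.astype(np.float32), 1, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float32)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def unsharp_mask(img, amount=1.0):
    """Sharpen by adding back the difference between the image and its blurred copy."""
    img_f = img.astype(np.float32)
    sharpened = img_f + amount * (img_f - box_blur3(img))
    return np.clip(sharpened, 0, 255).round().astype(np.uint8)

low_res = np.array([[50, 50, 200, 200]] * 4, dtype=np.uint8)  # a single vertical edge
up = upsample_nearest(low_res)  # shape (8, 8)
sharp = unsharp_mask(up)

print(up[0].tolist())     # [50, 50, 50, 50, 200, 200, 200, 200]
print(sharp[0].tolist())  # [50, 50, 50, 0, 250, 200, 200, 200]
```

Note that the sharpening only exaggerates contrast at the edge that was already there. It does not add detail that was never captured, so whether this gets close enough to real 4K for the QR reader is something to check empirically.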
