
Object detection stops predicting well depending on how I collect the images

Data Science Asked by denisb411 on January 11, 2021

For my problem I have to record at 4K resolution (3840×2160) and detect QR codes at the same time. I end up resizing the images to 1920×1080 before saving them to disk.

The problem is that if I train on this resized data, the model predicts much worse than when I record the video natively at 1920×1080.
With this training I can’t get good results on either resized 1920×1080 or native 1920×1080 images.
If I train on video recorded natively at 1920×1080, it predicts well on both native 1920×1080 and resized 1920×1080 images.

Is there an explanation for why this is happening?

I always try to save the images as JPEGs with quality=95, and the camera I’m using compresses each image before sending it to OpenCV (which directly affects the fps).

I can’t go back to recording at 1920×1080 because the QR code reader then stops detecting the codes.

How I record the video without resizing (the working method):

for z, frame in enumerate(frames):
    # Create the writer lazily on the first frame, sized to the camera resolution.
    if getattr(self, 'out', None) is None:
        self.out = cv2.VideoWriter(ANNOTATION_RECORD_VIDEO_FILE, cv2.VideoWriter_fourcc('m','p','4','v'), 60, (conf.cameras.phase1[0].resolution.width, conf.cameras.phase1[0].resolution.height))
    # Swap the R and B channels, then append the frame to the video.
    self.out.write(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

With resizing (the frames arrive at 4K resolution):

for z, frame in enumerate(frames):
    # Downscale the 4K frame to the resolution that is saved to disk.
    frame = cv2.resize(frame,
                       dsize=(conf.annotation.resolution_to_save['width'],
                              conf.annotation.resolution_to_save['height']),
                       interpolation=cv2.INTER_CUBIC)

    # Create the writer lazily on the first frame, sized to the saved resolution.
    if getattr(self, 'out', None) is None:
        self.out = cv2.VideoWriter(ANNOTATION_RECORD_VIDEO_FILE, cv2.VideoWriter_fourcc('m','p','4','v'), 60, (conf.annotation.resolution_to_save['width'], conf.annotation.resolution_to_save['height']))
    # Swap the R and B channels, then append the frame to the video.
    self.out.write(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

One Answer

> For my problem I have to record at 4K resolution (3840x2160) and detect QR codes at the same time. I end up resizing the images to 1920x1080 before saving them to disk.

Have you tried using a Haar cascade classifier to locate and crop the region containing the QR code, so that you only have to deal with a lower-resolution image? I would guess that this helps with the frame rate.
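A minimal sketch of the cropping step, assuming a detector has already returned a bounding box as (x, y, w, h), which is the format `cv2.CascadeClassifier.detectMultiScale` produces. As far as I know OpenCV does not ship a cascade trained on QR codes, so you would have to supply one yourself. The function name `crop_with_margin` and its `margin` parameter are made up for illustration.

```python
import numpy as np


def crop_with_margin(frame, box, margin=0.1):
    """Crop `box` = (x, y, w, h) out of `frame`, padded by `margin` on each side.

    `margin` is a fraction of the box size (hypothetical parameter); the
    padded region is clamped to the image borders.
    """
    x, y, w, h = box
    pad_x, pad_y = int(w * margin), int(h * margin)
    height, width = frame.shape[:2]
    x0, y0 = max(0, x - pad_x), max(0, y - pad_y)
    x1, y1 = min(width, x + w + pad_x), min(height, y + h + pad_y)
    return frame[y0:y1, x0:x1]


# Stand-in for a 4K frame; in practice this comes from the camera.
frame_4k = np.zeros((2160, 3840, 3), dtype=np.uint8)
# Stand-in for one detection; in practice this comes from the detector.
qr_patch = crop_with_margin(frame_4k, (100, 200, 300, 300))
```

The QR reader then only sees the small patch at full 4K detail, while the rest of the pipeline can work on a lower-resolution frame.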

> The problem is that if I train on this resized data, the model predicts much worse than when I record the video natively at 1920x1080. With this training I can't get good results on either resized 1920x1080 or native 1920x1080 images. If I train on video recorded natively at 1920x1080, it predicts well on both native 1920x1080 and resized 1920x1080 images.

I'm not entirely sure what you mean by this. A bit more clarity would help!

> Is there an explanation for why this is happening?

If you're training on the resized 4K images and testing on native 1920x1080 images, it is natural for this to happen, because subsampling removes features. It would be better to look at a few resizing algorithms and write your own resizer, because QR codes have intricate details that can be lost during subsampling.
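To illustrate how the choice of resizing algorithm matters, here is a small NumPy sketch comparing two ways of halving an image: plain subsampling (keep every second pixel) and area averaging (take the mean of each 2x2 block). The function names are made up for this example, and this is not what `cv2.INTER_CUBIC` does internally. For shrinking, the OpenCV documentation suggests `cv2.INTER_AREA`, which is close in spirit to the averaging version.

```python
import numpy as np


def downsample_subsample(img, factor=2):
    """Keep every `factor`-th pixel; fine detail between samples is dropped."""
    return img[::factor, ::factor]


def downsample_area(img, factor=2):
    """Replace each factor x factor block with its mean value."""
    h, w = img.shape[:2]
    # Trim so that both dimensions are multiples of `factor`.
    h2, w2 = h - h % factor, w - w % factor
    trimmed = img[:h2, :w2].astype(np.float32)
    blocks = trimmed.reshape(h2 // factor, factor, w2 // factor, factor,
                             *img.shape[2:])
    return np.rint(blocks.mean(axis=(1, 3))).astype(img.dtype)


# One-pixel-wide black/white stripes, a crude stand-in for fine QR-like detail.
stripes = np.tile(np.array([0, 255], dtype=np.uint8), (8, 4))

sub = downsample_subsample(stripes)   # only the black columns survive
area = downsample_area(stripes)       # every block averages to mid-gray
```

With subsampling the stripes vanish into a solid black image, while averaging at least preserves the overall brightness of the pattern. Either way, detail finer than the target pixel grid cannot survive the resize.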

> I always try to save the images as JPEGs with quality=95, and the camera I'm using compresses each image before sending it to OpenCV (which directly affects the fps).

Instead of this, why not record at 1920x1080 and upsample using interpolation? The result would not be as good as 4K, but it would be close. You could then sharpen the image until it is on par with 4K.

Answered by Anoop A Nair on January 11, 2021
