Role of Image Resolution in Deep Learning

Data Science: Asked by user106745 on June 27, 2021

I have multiple image datasets on the same topic that I want to use for a classification task with deep learning. The datasets differ in image resolution (some pictures are 128×128 px, some 512×512, others 2048×2048).

If I trained on the highest-resolution dataset, my intuition is that lower-resolution images would then be harder to classify, because the network would learn fine-grained patterns that it cannot recognize in lower-resolution pictures. On the other hand, if I train on the low-resolution dataset, the learned patterns are coarser, and the model should perform better on new data, since higher-resolution images can easily be scaled down. Is my intuition right, or am I missing something? What would be the best approach for selecting proper training data in my case?

One Answer

The answer may depend on what kind of information you want to extract from the images. In general, though, the goal is to find a balance: the images should not be so small that little information can be extracted from them, nor so high-resolution that they unnecessarily complicate your model. The latter also makes the model hard to train in terms of both space and time complexity.
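To make that space/time cost concrete, here is a minimal back-of-the-envelope sketch in Python. The three-layer network and its channel counts are invented for illustration; the point is only that activation memory (and, similarly, convolution compute) grows with the square of the input side length.

```python
# Hypothetical 3-layer CNN: channel counts are made up for this sketch.
# Each stride-2 layer halves the spatial size, so activation memory
# scales with the square of the input side length.
def activation_count(side, channels=(32, 64, 128)):
    total = 0
    for c in channels:
        side //= 2                 # stride-2 conv halves each dimension
        total += side * side * c   # activations in this feature map
    return total

for side in (128, 512, 2048):
    print(f"{side}x{side} input -> {activation_count(side):,} activations")
# 512x512 costs ~16x more than 128x128, and 2048x2048 another ~16x on top.
```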

Thus, unless your objective is something like identifying and classifying minuscule objects in the image, or a similarly detailed and complex task, you can use small images.

A good architecture and a well-trained model can still give you a powerful outcome. Consider that most of the famous and powerful neural network models in computer vision and image processing use input sizes such as 96×96, 128×128, 224×224, or 256×256. Perhaps I am going too far by saying this, but a good challenge would be to build a powerful model on a small input size such as 224×224 or close to it, because that scales up the model's usability and usefulness. As you also mentioned, high-quality images can be downscaled, but a small image cannot be enlarged (at least, not without the help of AI). If your model's input is reasonably small, it can handle most of the available images; detecting or classifying cropped images, small objects in images, and so on is another reason to prefer a small input size. A model built for a large input size, by contrast, requires high-quality images to work and will not cope with low-quality ones. Consider that the Google Vision API works superbly even with 64×64 images.
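As a concrete illustration of unifying a mixed-resolution dataset, here is a minimal preprocessing sketch assuming Pillow is installed. The 224×224 target, the folder names, and the load_unified helper are hypothetical choices for the example, not part of the original answer; pick the size that matches the model you train.

```python
from pathlib import Path
from PIL import Image

TARGET = (224, 224)  # one of the common input sizes mentioned above

def load_unified(path):
    """Open an image and rescale it to the common training size."""
    img = Image.open(path).convert("RGB")
    return img.resize(TARGET, Image.BILINEAR)

# Example: unify a mixed-resolution folder before training.
out = Path("unified")
out.mkdir(exist_ok=True)
for p in Path("dataset").glob("*.png"):
    load_unified(p).save(out / p.name)
```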

Briefly, if your task is not unusual and does not necessitate high-resolution images to work with small objects, then use small images. With well-designed convolutional layers, you can extract a lot of information from those small images and process it in the later layers. Also, consider that even small changes in the input image size may affect the training time drastically, as the quick sketch below shows.
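One way to see that sensitivity: convolution FLOPs scale with the number of output pixels, i.e. quadratically in the side length. A quick sketch (the size pairs are arbitrary examples):

```python
# Conv compute scales roughly with the number of pixels, i.e.
# quadratically in the side length, so even modest size bumps add up.
for old, new in [(224, 256), (128, 224), (128, 512)]:
    print(f"{old} -> {new}: ~{(new / old) ** 2:.2f}x conv compute")
# Going from 224 to 256 already costs ~1.31x per layer; 128 to 512 costs 16x.
```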

Correct answer by Shahriyar Mammadli on June 27, 2021
