What is the difference between using numpy array images and using image files in deep learning?

Question

Which way is better?

kfn95 · Accepted Answer

To pass an image as input to a model, you first need to convert it to a numpy array. Each image is actually represented as an array of values once you load it into Python. Even if you don't do the conversion explicitly (e.g. when using Keras' ImageDataGenerator), it is done behind the scenes.

If your question is: Is it better to use generators than loading the images in a large numpy array?

The answer is: it depends. Is the dataset small enough to fit in your memory?

If not, you are forced to use a generator that loads the images in batches and passes each batch to the model.

If yes, you can either use a generator to save memory for other things (e.g. the model), or load all the images into a numpy array once to save computation time (i.e. avoid the overhead of reading the same images from disk again and again).
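The trade-off above can be sketched with plain numpy. This is only a minimal illustration, not how Keras implements its generators: `batch_generator` is a hypothetical helper, and small random arrays saved as `.npy` files stand in for real image files.

```python
import os
import tempfile

import numpy as np


def batch_generator(paths, batch_size, load_fn=np.load):
    """Yield batches, keeping at most `batch_size` images in memory at once.

    Hypothetical helper for illustration; `load_fn` is whatever reads one
    file into a numpy array.
    """
    for start in range(0, len(paths), batch_size):
        batch_paths = paths[start:start + batch_size]
        yield np.stack([load_fn(p) for p in batch_paths])


# Stand-in dataset: 10 tiny random "images" written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
paths = []
for i in range(10):
    path = os.path.join(tmp_dir, f"img_{i}.npy")
    np.save(path, rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8))
    paths.append(path)

# Option 1: load everything once into one large array (needs enough memory).
all_images = np.stack([np.load(p) for p in paths])

# Option 2: stream the same data in batches (files are re-read every epoch).
batch_shapes = [batch.shape for batch in batch_generator(paths, batch_size=4)]
```

Both options deliver the same pixel values to the model; they differ only in how much data sits in memory at once and how often the files are read.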

n1k31t4 · Answer

In principle, they are exactly the same.

A numpy array holds the RGB values of an image saved on disk in a memory container (numpy.ndarray). This container offers certain built-in functions, such as the ability to do some fancy slicing. An example would be to flip an image across the vertical axis, giving a mirror image:

flipped = image[:, ::-1]         # returns a view (no copy), so it is memory efficient and fast
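A small check of why this kind of slicing is cheap, assuming a tiny made-up 2x3 RGB array in place of a real image: the flipped result is a view onto the same memory rather than a copy.

```python
import numpy as np

# Made-up image: height 2, width 3, 3 colour channels.
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Reverse the column (width) axis to mirror the image left-to-right.
flipped = image[:, ::-1]

# The slice is a view: no pixel data was copied.
shares_memory = np.shares_memory(image, flipped)

# If an independent, contiguous array is needed, make an explicit copy.
flipped_copy = np.ascontiguousarray(flipped)
copy_shares_memory = np.shares_memory(image, flipped_copy)
```

Because `flipped` is a view, writing to it also changes `image`; take the explicit copy when the two need to be independent.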

Numpy arrays can't do everything we need for modelling, especially on GPUs. So we pass the numpy arrays to frameworks such as TensorFlow or PyTorch, which put another wrapper around the data, turning it into tensor objects.

These objects have special methods and properties that are tailored to our needs for deep learning. They can do things such as store gradient information or deduce the shapes of the tensor before/after operations - all things that make our lives easier.

Loading images directly into a deep learning framework, using one of their own tools, would skip the numpy step and get straight to the tensors. That is fine when your pipeline is already set up and you don't need to perform any pre-processing.

PyTorch has its own basic tensor object, but it supports most of the operations you could perform on a standard numpy array, which many see as a big advantage of that framework.
