TransWikia.com

How do I build an image dataset for CNN?

Data Science Asked by 55thSwiss on December 4, 2020

I don’t understand how images are actually fed into a CNN. If I have a directory containing a few thousand images, what steps do I need to take in order to feed them to a neural network (for instance resizing, grey scale, labeling, etc)? I don’t understand how the labeling of an image works. What would this dataset actually look like? Or can you not look at it at all (something like a table)?

3 Answers

A dataset just consists of features and labels. Here the features are your images and the labels are their classes.

Every Keras model has a fit() method, which takes the features and labels and performs the training.

For the first layer, you need to specify the input dimensions of a single image, and the output layer should be a softmax (if you're doing classification) with as many units as you have classes.

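from keras.models import Sequential
from keras.layers import Activation, Conv2D, Dense, Dropout, Flatten, MaxPooling2D

# x_train: images with shape (n, 64, 64, 3); y_train: one-hot labels with shape (n, num_classes)
num_classes = 10  # example value; set this to your own number of classes

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(64, 64, 3)))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10)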

The above is the code for training a Keras Sequential model.

General Points:

  1. input_shape should be the shape of a single image, without the sample dimension.
  2. You can get it from X_train.shape[1:] (NumPy); X_train.shape itself also includes the number of samples.
  3. Convolutions are then applied, each followed by its activation.
  4. Dropout and pooling layers are optional.
  5. After the convolution layers, the data is flattened using Flatten().
  6. It is then passed to a few fully connected (Dense) layers.
  7. The last Dense layer should have as many units as there are classes.
  8. The final layer is the softmax activation.
  9. Now compile the model with the loss, optimizer and metric.
  10. Then call fit().
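
As a minimal sketch of how input_shape relates to the training array: the example below uses a made-up NumPy array in place of real images (the data and the name x_train are placeholders). Note that input_shape describes one image, so the leading sample dimension is left out.

```python
import numpy as np

# Placeholder data: 100 random RGB images of 64x64 pixels (not a real dataset)
x_train = np.random.rand(100, 64, 64, 3).astype("float32")

# The first axis counts samples; the remaining axes describe one image
num_samples = x_train.shape[0]
input_shape = x_train.shape[1:]

print(num_samples)   # 100
print(input_shape)   # (64, 64, 3)
```

The resulting tuple is what would be passed as input_shape to the first layer.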

Vote up ;) if you like it.

Answered by William Scott on December 4, 2020

This question packs in a lot. Let's go through it step by step, and I will provide an example of image processing using a CNN.

Pre-processing the data

Pre-processing the data, such as resizing and greyscaling, is the first step of your machine learning pipeline. Most deep learning frameworks require all of your training data to have the same shape, so it is best to resize your images to one standard size.

Whenever you train any kind of machine learning model, it is important to remember the bias-variance trade-off. The more complex the model, the harder it is to train and the more easily it overfits, so it is best to limit the number of parameters in your model. You can lower the number of inputs to your model by downsampling the images. Greyscaling is often used for the same reason: if the colors in the images do not contain any distinguishing information, then greyscaling reduces the number of inputs to a third (one channel instead of three).
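
To put numbers on that, here is a small NumPy sketch using a random placeholder image. The luma weights and the "keep every second pixel" downsampling are illustrative choices of mine, not something a particular library requires; in practice you would probably use your image library's own conversion and resize functions.

```python
import numpy as np

# Placeholder RGB image with values in [0, 1] (random data, for shape checks only)
rgb = np.random.rand(64, 64, 3)

# Weighted sum of the colour channels (ITU-R BT.601 luma weights)
grey = rgb @ np.array([0.299, 0.587, 0.114])

# Naive 2x downsampling: keep every second row and column
small = grey[::2, ::2]

print(rgb.size, grey.size, small.size)  # 12288 4096 1024
```

Greyscaling leaves a third of the original inputs, and halving each side of the image divides what remains by four.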

There are a number of other pre-processing methods which can be used depending on your data. It is also a good idea to do some data augmentation: altering your input data slightly, without changing the resulting label, to increase the number of instances you have to train your model.
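
One simple form of augmentation is a horizontal flip. The sketch below assumes that mirroring an image does not change its class, which is reasonable for photos of cats and dogs but not for text or digits. The arrays are random placeholders.

```python
import numpy as np

# Placeholder batch: 8 greyscale 28x28 images with integer labels
images = np.random.rand(8, 28, 28, 1)
labels = np.arange(8) % 2

# Mirror each image left-to-right (axis 2 is the width axis)
flipped = images[:, :, ::-1, :]

# Append the flipped copies; their labels are unchanged
aug_images = np.concatenate([images, flipped], axis=0)
aug_labels = np.concatenate([labels, labels], axis=0)

print(aug_images.shape, aug_labels.shape)  # (16, 28, 28, 1) (16,)
```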

How to structure the data?

The shape of the variable which you will use as the input for your CNN will depend on the package you choose. I prefer using TensorFlow, which is developed by Google. If you are planning on using a pretty standard architecture, then there is a very useful wrapper library named Keras which makes designing and training a CNN much easier.

When using TensorFlow you will want to get your set of images into a NumPy array. The first dimension indexes your instances, the next two are the image height and width, and the last dimension is for the channels.

So, for example, if you are using the MNIST data, then you are working with greyscale images which each have dimensions 28 by 28. The shape of the NumPy array that you would feed into your deep learning model would then be (n, 28, 28, 1), where $n$ is the number of images in your dataset.
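
A minimal sketch of building such an array from a list of separate images, with random placeholder data standing in for real 28 by 28 greyscale images:

```python
import numpy as np

# Placeholder: a list of 5 greyscale images, each 28x28 (random data)
image_list = [np.random.rand(28, 28) for _ in range(5)]

# Stack along a new first axis -> (n, 28, 28)
batch = np.stack(image_list, axis=0)

# Add the trailing channel axis -> (n, 28, 28, 1)
batch = batch[..., np.newaxis]

print(batch.shape)  # (5, 28, 28, 1)
```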

How to label images?

For most data, the labeling needs to be done manually. It is usually part of data collection and is often the hardest and most expensive part of a machine learning solution. It is often best either to use readily available data or, if the data is just unavailable, to use less complex models and more pre-processing.


Here is an example of the use of a CNN for the MNIST dataset

First we load the data

from keras.datasets import mnist
import numpy as np

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

print('Training data shape:', x_train.shape)
print('Testing data shape :', x_test.shape)

Training data shape: (60000, 28, 28)
Testing data shape : (10000, 28, 28)

# Keras components used in the rest of the example
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

Then we need to reshape our data to add the channel dimension at the end of our NumPy array. Furthermore, we will one-hot encode the labels, so you will have 10 output neurons, each representing a different class.

# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)
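
To see what one-hot encoded labels look like, here is a plain NumPy sketch of the same idea. The three labels are made-up example values, and this only illustrates the encoding; it is not how Keras implements it.

```python
import numpy as np

num_classes = 10
y = np.array([5, 0, 4])  # example integer labels

# Row i of the identity matrix is the one-hot vector for class i
y_one_hot = np.eye(num_classes, dtype="float32")[y]

print(y_one_hot.shape)  # (3, 10)
print(y_one_hot[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

Each row has a single 1 in the column of its class, which is the form that the categorical cross-entropy loss expects.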

Now we design our model

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
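
It can help to check by hand how the image shrinks through this model. The sketch below assumes the Keras defaults that the code above relies on: unpadded ('valid') convolutions with stride 1, and a pooling stride equal to the pool size. The helper conv_out is a made-up name for this illustration.

```python
def conv_out(size, kernel, stride=1):
    """Output size of a 'valid' (unpadded) convolution or pooling along one axis."""
    return (size - kernel) // stride + 1

side = 28
side = conv_out(side, 3)            # Conv2D(32, 3x3)  -> 26
side = conv_out(side, 3)            # Conv2D(64, 3x3)  -> 24
side = conv_out(side, 2, stride=2)  # MaxPooling2D 2x2 -> 12
flat = side * side * 64             # Flatten          -> 9216

params = (
    (3 * 3 * 1 * 32 + 32)      # first conv: weights + biases
    + (3 * 3 * 32 * 64 + 64)   # second conv
    + (flat * 128 + 128)       # Dense(128)
    + (128 * 10 + 10)          # Dense(num_classes)
)
print(side, flat, params)  # 12 9216 1199882
```

Almost all of the parameters sit in the first Dense layer. You can compare these numbers with the output of model.summary().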

Finally we can train the model

epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test_reshaped, y_test_binary))

Answered by JahKnows on December 4, 2020

I understand your question; I've been there. Make sure the data you've collected is saved into its respective class folder, for example all dog images in a folder named "dog", all cat images in a folder named "cat", and so on.

I think this might help:

import os
import random

import cv2
import numpy as np

IMG_SIZE = 50

# Root folder that contains one sub-folder per class
DATADIR = 'Path/classes'

# Sub-folder names; the index in this list becomes the integer label
CATEGORIES = ['class1', 'class2', 'class3']

training_data = []

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            # IMREAD_COLOR always gives 3 channels, which the reshape below relies on
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_COLOR)
            if img_array is None:
                # Not a readable image file; skip it
                continue
            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            training_data.append([new_array, class_num])

create_training_data()

# Shuffle so that the classes are mixed
random.shuffle(training_data)

X = []  # features
y = []  # labels

for features, label in training_data:
    X.append(features)
    y.append(label)

X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
y = np.asarray(y)
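
Before training, you would typically scale the pixel values and hold out some data for validation. This is a hedged sketch with random placeholder arrays standing in for X and y; the 80/20 split is an arbitrary choice of mine, not something from the source.

```python
import numpy as np

IMG_SIZE = 50
# Placeholder stand-ins for the X and y arrays built above (random data)
X = np.random.randint(0, 256, size=(20, IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
y = np.random.randint(0, 3, size=20)

# Scale 8-bit pixel values to the range [0, 1]
X_scaled = X.astype("float32") / 255.0

# Hold out the last 20% for validation (the data was already shuffled)
split = int(0.8 * len(X_scaled))
X_train, X_val = X_scaled[:split], X_scaled[split:]
y_train, y_val = y[:split], y[split:]

print(X_train.shape, X_val.shape)  # (16, 50, 50, 3) (4, 50, 50, 3)
```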

Not my code

Source

Scroll down to "Preparing the data" there and you'll find how to create the dataset and import it into your code from your computer. :)

Answered by Danny Joel Devarapalli on December 4, 2020
