
Reason why there are two dense layers?

Asked on July 31, 2021

I have been using the following model for image classification. I wonder why there is a 512-unit Dense layer before the final Dense(3) layer. Should I just use Dense(3) on its own, instead of Dense(512) followed by Dense(3)?

import tensorflow as tf

model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image: 150x150 with 3 color channels
    # This is the first convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The third convolution
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The fourth convolution
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
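
If you are unsure, a practical check is to train both variants on the same data and compare validation accuracy. A minimal sketch of the stripped-down head (assuming the same data pipeline; the compile settings are illustrative):

import tensorflow as tf

# Variant without the 512-unit hidden layer: same convolutional base,
# classifying directly from the flattened features.
model_small = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(3, activation='softmax')
])
model_small.compile(optimizer='adam',
                    loss='categorical_crossentropy',
                    metrics=['accuracy'])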

3 Answers

The convolutional layers do the job of feature extraction. Using those features, the dense layers try to do the classification task. The Dense(512) layer acts as a feature selector, learning which features are relevant for each class, while the last layer just computes the softmax probabilities.
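
To see what the last layer produces, here is a minimal sketch using the model from the question (the random array is just a hypothetical stand-in for a preprocessed image batch):

import numpy as np

# Stand-in for one preprocessed 150x150 RGB image, values in [0, 1].
dummy_batch = np.random.rand(1, 150, 150, 3).astype('float32')

probs = model.predict(dummy_batch)   # shape (1, 3)
print(probs, probs.sum())            # three class probabilities summing to 1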

Answered by SrJ on July 31, 2021

You can compute the output size of each layer and choose the number of neurons in the next layer accordingly.

# The fourth convolution
tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),

Here, the output size of a convolution or pooling layer is:

[((nh - f) / s + 1)] × [((nw - f) / s + 1)] × nc

where nh is the input height, nw is the input width, f is the filter size, s is the stride, and nc is the number of output filters.

For each convolution here (f = 3, s = 1):

((input_height - 3) / 1) + 1 = input_height - 2 = something

and each max-pooling layer (f = 2, s = 2) then halves the result. So the output of the last pooling layer will be something × something × 128. Flattened, this is the input to the dense layer (the Dropout layer does not change the shape).
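
Rather than tracing the arithmetic by hand, you can ask Keras to print every output shape; a quick check with the model from the question:

# The Flatten row of the summary shows the size of the vector
# that feeds the Dense(512) layer.
model.summary()

# Hand trace for this model (conv: f=3, s=1; pool: f=2, s=2):
# 150 -> 148 -> 74 -> 72 -> 36 -> 34 -> 17 -> 15 -> 7
# Flatten: 7 * 7 * 128 = 6272 values into the dense layer.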

Answered by Akash on July 31, 2021

It can be shown that multi-layer networks with fully connected layers and nonlinear activations can approximate any reasonable function. In your case, adding one more fully connected layer beyond the current ones may yield a better outcome. People usually add two hidden fully connected layers after the convolutional layers and before the output layer. The reason is that the convolutional layers extract features in a differentiable manner, while the fully connected layers classify those features. Consequently, adding layers to the dense section can improve your network's ability to classify the extracted features, as sketched below.
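
For example, a minimal sketch of a classifier head with two hidden fully connected layers (the widths 512 and 128 are illustrative choices, not prescribed values):

import tensorflow as tf

# Deeper head: two hidden dense layers before the softmax output.
# The input shape matches the conv output of the model in the question.
deeper_head = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(7, 7, 128)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),   # second hidden layer
    tf.keras.layers.Dense(3, activation='softmax')
])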

Answered by Media on July 31, 2021
