How can I have the same input and output shape in an auto-encoder?

Artificial Intelligence Asked by Vesko Vujovic on December 25, 2021

I’m building a denoising autoencoder, and I want the output image to have the same shape as the input image.

This is my architecture:

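from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

input_img = Input(shape=(IMG_HEIGHT, IMG_WIDTH, 1))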

x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)



x = Conv2D(32, (3, 3), activation='relu', padding='valid')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)


# decodedSize = K.int_shape(decoded)[1:]

# x_size = K.int_shape(input_img)
# decoded = Reshape(decodedSize, input_shape=decodedSize)(decoded)


autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

My input shape is: 1169×827

This is Keras output:

Model: "model_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_7 (InputLayer)         [(None, 1169, 827, 1)]    0         
_________________________________________________________________
conv2d_30 (Conv2D)           (None, 1169, 827, 32)     320       
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 585, 414, 32)      0         
_________________________________________________________________
conv2d_31 (Conv2D)           (None, 585, 414, 64)      18496     
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 293, 207, 64)      0         
_________________________________________________________________
conv2d_32 (Conv2D)           (None, 291, 205, 32)      18464     
_________________________________________________________________
up_sampling2d_12 (UpSampling (None, 582, 410, 32)      0         
_________________________________________________________________
conv2d_33 (Conv2D)           (None, 582, 410, 32)      9248      
_________________________________________________________________
up_sampling2d_13 (UpSampling (None, 1164, 820, 32)     0         
_________________________________________________________________
conv2d_34 (Conv2D)           (None, 1162, 818, 1)      289       
===============================================================

How can I have the same input and output shape?

2 Answers

I don't know if this is the right way to do it, but I solved the problem.

After the code above, I added:

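import tensorflow as tf
from tensorflow.keras import backend as K

# Input shape without the batch dimension: (height, width, channels)
img_size = K.int_shape(input_img)[1:]

# Resize the decoder output back to the input's height and width
resized_image_tensor = tf.image.resize(decoded, list(img_size[:2]))

autoencoder = Model(input_img, resized_image_tensor)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')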

I used tf.image.resize to make the shape of the reconstructed image match the shape of the input image.

Hope it helps.

Answered by Vesko Vujovic on December 25, 2021

If you look at the Keras output, several steps change the spatial size:

Max pooling with padding='same' rounds odd sizes up (1169 becomes 585), so the later 2x upsampling cannot restore an odd size. Conv2D with a 3x3 kernel loses 2 pixels per dimension when padding='valid' is used; this does not happen in the downsampling steps because those layers use padding='same'.
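The size changes in the model summary can be reproduced with a little arithmetic. The sketch below is not Keras code; the helper names (conv, pool_same, upsample, trace) are made up for illustration, and it assumes stride-1 convolutions and 2x2 pooling as in the question's model.

```python
import math


def conv(size, kernel=3, padding="same"):
    # Stride-1 convolution: 'same' keeps the size, 'valid' drops kernel - 1 pixels.
    return size if padding == "same" else size - kernel + 1


def pool_same(size, stride=2):
    # MaxPooling2D with padding='same' rounds the halved size up.
    return math.ceil(size / stride)


def upsample(size, factor=2):
    # UpSampling2D multiplies the size by the factor.
    return size * factor


def trace(size, first_decoder_padding="valid", last_padding="same"):
    # Follow one spatial dimension through the layers of the question's model.
    size = conv(size)                                  # Conv2D(32), 'same'
    size = pool_same(size)                             # MaxPooling2D
    size = conv(size)                                  # Conv2D(64), 'same'
    size = pool_same(size)                             # MaxPooling2D
    size = conv(size, padding=first_decoder_padding)   # first decoder Conv2D
    size = upsample(size)                              # UpSampling2D
    size = conv(size)                                  # Conv2D(32), 'same'
    size = upsample(size)                              # UpSampling2D
    size = conv(size, padding=last_padding)            # final Conv2D
    return size


print(trace(1169), trace(827))                    # posted code: 1164 820
print(trace(1169, last_padding="valid"),
      trace(827, last_padding="valid"))           # posted summary: 1162 818
print(trace(1169, first_decoder_padding="same"),
      trace(827, first_decoder_padding="same"))   # all 'same': 1172 828
```

The last row of the posted summary (1162 x 818) corresponds to padding='valid' in the final convolution, whereas the posted code uses 'same' there. Even with 'same' everywhere, the output would be 1172 x 828 rather than 1169 x 827, because the pooling layers round the odd sizes up.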

Intuitively, the simplest solution would be to pad the original images with enough border pixels to compensate for the pixels lost in the various layers. I can't calculate the exact amount at the moment, but I suspect that rounding each dimension up to a multiple of 4 should take care of the two max pooling layers. For denoising, the border could simply be copied from the outermost pixels, probably with some low-pass filtering to avoid artefacts.
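A minimal sketch of the "round up to a multiple of 4" idea, in plain Python. The helper names pad_amounts and crop_slice are hypothetical, and the sketch assumes every convolution in the model uses padding='same', so that only the two pooling layers constrain the size.

```python
def pad_amounts(size, multiple=4):
    """Return (before, after) padding that rounds size up to a multiple."""
    total = -size % multiple      # pixels missing to reach the next multiple
    before = total // 2
    return before, total - before


def crop_slice(size, multiple=4):
    """Slice that recovers the original extent from a padded output."""
    before, _ = pad_amounts(size, multiple)
    return slice(before, before + size)


pad_h = pad_amounts(1169)   # (1, 2): padded height is 1172
pad_w = pad_amounts(827)    # (0, 1): padded width is 828
print(pad_h, pad_w)
print(crop_slice(1169), crop_slice(827))
```

For a 2D NumPy image, these pairs could be passed as numpy.pad(image, (pad_h, pad_w), mode="edge"), which replicates the outermost pixels, and the model output could then be cropped back with output[crop_slice(1169), crop_slice(827)]. The padding='valid' convolution in the question's decoder would still remove pixels, so it would need to be changed to 'same' for this to work.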

Answered by Hans-Martin Mosner on December 25, 2021
