
Keras models break when I add batch normalization

Data Science Asked by axon on February 26, 2021

I’m creating the models for a DDPG agent (the keras-rl version), but I’m running into errors whenever I try adding batch normalization to the first of the two networks.

Here is the creation function as I’d like it to be:

def buildDDPGNets(actNum, obsSpace):
    actorObsInput = Input(shape = (1,) + obsSpace, name = "actor_obs_input")
    a = Flatten()(actorObsInput)
    a = Dense(600, use_bias = False)(a)
    a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(300, use_bias = False)(a)
    a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(actNum)(a)
    a = Activation("tanh")(a)   # Bipedal walker applies torque in (-1, 1).
    actor = Model(inputs = [actorObsInput], outputs = a)
    criticActInput = Input(shape = (actNum,), name = "critic_action_input")
    criticObsInput = Input(shape = (1,) + obsSpace, name = "critic_obs_input")
    c = Flatten()(criticObsInput)
    c = Dense(600, use_bias = False)(c)
    c = BatchNormalization()(c)
    c = Activation("relu")(c)
    c = Concatenate()([c, criticActInput])
    c = Dense(300, use_bias = False)(c)
    c = BatchNormalization()(c)
    c = Activation("relu")(c)
    c = Dense(1)(c)
    c = Activation("linear")(c)
    critic = Model(inputs = [criticActInput, criticObsInput], outputs = c)
    return (criticActInput, actor, critic)

This causes me to get the following error:

    InvalidArgumentError: You must feed a value for placeholder tensor 'actor_obs_input' with dtype float and shape [?,1,24]
     [[{{node actor_obs_input}}]]

However, if I remove the batch normalization from the first network (a) and leave the second network (c) unchanged, it runs as it should.

    a = Flatten()(actorObsInput)
    a = Dense(600, use_bias = False)(a)
    #a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(300, use_bias = False)(a)
    #a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(actNum)(a)

It’s also notable that if I do it the other way around (remove batch normalization from c and leave it in a), the error still occurs.
Any idea why that’s happening?
It’s odd because it runs fine with batch norm in the critic, just not in the actor.
The models are being used by the keras-rl DDPG agent, by the way.

Update:
I’ve tried rewriting it with the Sequential model instead of the functional API. It didn’t help; I still got the same error. I’m beginning to think this is some sort of problem with Keras’s BatchNormalization class when it is applied to systems of multiple models.

2 Answers

When you write:

a = BatchNormalization()(a)

the functional API immediately calls the layer on the tensor a, so each step depends on how the whole graph is wired; interleaving Dense, BatchNormalization, and Activation this way makes the actor harder to read and debug. You can rewrite your actor code like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

actor = Sequential([
    Dense(600, input_shape = your_input_shape, activation = 'relu', use_bias = False),
    BatchNormalization(), 
    Dense(300, activation = 'relu', use_bias = False),
    BatchNormalization(),
    Dense(actNum, activation = 'tanh')
    ])

Please note that I didn't fully specify the input shape: since I don't know the nature of your data, I left the placeholder your_input_shape. You can modify it according to your needs.

The other model, the critic, should follow the same structure IMHO; it's also more readable (and easier to debug) for you. Once both actor and critic are defined, you can assemble them together using a Model(), as you did above.
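Note that the critic takes two inputs (the observation and the action), so unlike the actor it can't be a plain Sequential model; it still needs the functional API. A minimal sketch of that structure, assuming an illustrative 24-dimensional observation and 4-dimensional action (adjust both to your environment):

```python
import numpy as np
from tensorflow.keras.layers import (Input, Flatten, Dense, Activation,
                                     BatchNormalization, Concatenate)
from tensorflow.keras.models import Model

obsSpace = (24,)  # assumed observation shape, e.g. BipedalWalker
actNum = 4        # assumed action dimension

criticObsInput = Input(shape=(1,) + obsSpace, name="critic_obs_input")
criticActInput = Input(shape=(actNum,), name="critic_action_input")

c = Flatten()(criticObsInput)
c = Dense(600, use_bias=False)(c)
c = BatchNormalization()(c)
c = Activation("relu")(c)
c = Concatenate()([c, criticActInput])  # the action joins after the first block
c = Dense(300, use_bias=False)(c)
c = BatchNormalization()(c)
c = Activation("relu")(c)
c = Dense(1, activation="linear")(c)    # scalar Q-value

critic = Model(inputs=[criticObsInput, criticActInput], outputs=c)
```

Calling critic.predict with a batch of observations and actions then returns one Q-value per sample.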

Correct answer by Leevo on February 26, 2021

This link describes the cause of the problem: it happens because we try to feed a placeholder defined as an Input layer using another Input layer. You can modify your code as follows:

def buildDDPGNets(actNum, obsSpace):

  actorObsInput = Input(shape = (1,) + obsSpace, name = "actor_obs_input")
  criticActInput = Input(shape = (actNum,), name = "critic_action_input")

  flattened_obs = Flatten()(actorObsInput)
  a = Dense(600, use_bias = False)(flattened_obs)
  a = BatchNormalization()(a)
  a = Activation("relu")(a)
  a = Dense(300, use_bias = False)(a)
  a = BatchNormalization()(a)
  a = Activation("relu")(a)
  a = Dense(actNum)(a)
  a = Activation("tanh")(a)   # Bipedal walker applies torque in (-1, 1).
  actor = Model(inputs = [actorObsInput], outputs = a)

  c = Dense(600, use_bias = False)(flattened_obs)
  c = BatchNormalization()(c)
  c = Activation("relu")(c)
  c = Concatenate()([c, criticActInput])
  c = Dense(300, use_bias = False)(c)
  c = BatchNormalization()(c)
  c = Activation("relu")(c)
  c = Dense(1)(c)
  c = Activation("linear")(c)
  critic = Model(inputs = [criticActInput, actorObsInput], outputs = c)

  return (criticActInput, actor, critic)

Answered by ckusumadewa on February 26, 2021
