Imagine I have a convolutional neural network to classify MNIST digits, such as this Keras example. This is purely for experimentation so I don’t have a clear reason or justification as to why I’m doing this, but let’s say I would like to regularize or penalize the output of an intermediate layer. I realize that the visualization below does not correspond to the MNIST CNN example and instead just has several fully connected layers. However, to help visualize what I mean let’s say I want to impose a penalty on the node values in layer 4 (either pre or post activation is fine with me).
In addition to the categorical cross-entropy loss term that is typical for multi-class classification, I would like to add another term to the loss function that minimizes the sum of squared outputs at a given layer. This is somewhat similar in concept to L2 regularization, except that L2 regularization penalizes the sum of squares of all weights in the network. Instead, I am interested purely in the values of a given layer (e.g. layer 4) and not in all the weights in the network.
I realize that this requires writing a custom loss function with the Keras backend to combine categorical cross-entropy and the penalty term, but I am not sure how to use an intermediate layer for the penalty term in the loss function. I would greatly appreciate help on how to do this. Thanks!
Another way to add loss based on input or calculations at a given layer is to use the add_loss() API. If you are already creating a custom layer, the custom loss can be added directly to the layer. Or a custom layer can be created that simply takes the input, calculates and adds the loss, and then passes the unchanged input along to the next layer.
Here is the code taken directly from the documentation (in case the link is ever broken):
    import tensorflow as tf
    from tensorflow.keras.layers import Layer


    class MyActivityRegularizer(Layer):
        """Layer that creates an activity sparsity regularization loss."""

        def __init__(self, rate=1e-2):
            super(MyActivityRegularizer, self).__init__()
            self.rate = rate

        def call(self, inputs):
            # We use `add_loss` to create a regularization loss
            # that depends on the inputs.
            self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
            return inputs
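To show how such a pass-through layer might be placed after the layer you want to penalize, here is a minimal sketch. The class name `ActivityPenalty`, the layer sizes, the `rate` value, and the random data are all illustrative assumptions, not part of the documentation example.

```python
import numpy as np
import tensorflow as tf


class ActivityPenalty(tf.keras.layers.Layer):
    """Pass-through layer that adds rate * sum(inputs ** 2) to the loss."""

    def __init__(self, rate=1e-2, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate

    def call(self, inputs):
        # Register the penalty, then return the inputs unchanged.
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs


# Hypothetical small network; the penalty sits right after the
# hidden layer whose activations should be kept small.
inputs = tf.keras.Input(shape=(784,))
x = tf.keras.layers.Dense(32, activation="relu")(inputs)
x = tf.keras.layers.Dense(16, activation="relu")(x)
x = ActivityPenalty(rate=1e-3)(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# The cross-entropy term comes from compile(); the penalty registered
# with add_loss is added to it during training.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random stand-in data, only to show that training runs.
x_batch = np.random.rand(8, 784).astype("float32")
y_batch = np.random.randint(0, 10, size=(8,))
history = model.fit(x_batch, y_batch, epochs=1, verbose=0)
```

With this layout no custom loss function is needed: the reported training loss should already include the penalty term.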
Answered by Leland Hepworth on January 5, 2022
Actually, what you are interested in is regularization, and Keras has two kinds of built-in regularization available for most layers:
Weight regularization, which penalizes the weights of a layer. Usually, you can use the kernel_regularizer and bias_regularizer arguments when constructing a layer to enable it. For example:
    l1_l2 = tf.keras.regularizers.l1_l2(l1=1.0, l2=0.01)
    x = tf.keras.layers.Dense(..., kernel_regularizer=l1_l2, bias_regularizer=l1_l2)
Activity regularization, which penalizes the output (i.e. activation) of a layer. To enable this, you can use the activity_regularizer argument when constructing a layer:
    l1_l2 = tf.keras.regularizers.l1_l2(l1=1.0, l2=0.01)
    x = tf.keras.layers.Dense(..., activity_regularizer=l1_l2)
Note that you can set activity regularization through the activity_regularizer argument for all layers, even custom layers.
In both cases, the penalties are added to the model's loss, and the resulting total is the value the optimizer minimizes during training.
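As a rough sketch of the activity-regularization route applied to the setting in the question, the example below penalizes the squared activations of a single hidden layer in a small CNN. The architecture, the `1e-3` factor, and the random stand-in data are assumptions chosen for illustration. The exact scaling of the built-in penalty (for example, whether it is averaged over the batch) may differ between Keras versions, so it is worth checking the loss reported during training.

```python
import numpy as np
import tensorflow as tf

# L2 activity penalty: factor * sum(activations ** 2).
penalty = tf.keras.regularizers.l2(1e-3)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    # Only this layer's output is penalized.
    tf.keras.layers.Dense(32, activation="relu", activity_regularizer=penalty),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random stand-in data with MNIST-like shapes.
x = np.random.rand(16, 28, 28, 1).astype("float32")
y = np.random.randint(0, 10, size=(16,))
history = model.fit(x, y, batch_size=8, epochs=1, verbose=0)
print(history.history["loss"])
```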
Further, besides the built-in regularization methods (i.e. L1 and L2), you can define your own custom regularizer method (see Developing new regularizers). As always, the documentation provides additional information which might be helpful as well.
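A custom regularizer could look roughly like the sketch below. The class name `SquaredSum` and its `rate` parameter are hypothetical; the only parts taken from Keras are the `tf.keras.regularizers.Regularizer` base class and the `activity_regularizer` argument.

```python
import tensorflow as tf


class SquaredSum(tf.keras.regularizers.Regularizer):
    """Hypothetical regularizer: rate * sum of squared values."""

    def __init__(self, rate=1e-2):
        self.rate = rate

    def __call__(self, x):
        return self.rate * tf.reduce_sum(tf.square(x))

    def get_config(self):
        # Lets Keras serialize the regularizer along with the layer.
        return {"rate": self.rate}


# Attach it to the layer whose output should be penalized.
layer = tf.keras.layers.Dense(
    16, activation="relu", activity_regularizer=SquaredSum(1e-3)
)

# The regularizer can also be called directly to inspect its value.
value = float(SquaredSum(0.5)(tf.constant([[1.0, 2.0], [3.0, 4.0]])))
print(value)  # 0.5 * (1 + 4 + 9 + 16) = 15.0
```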
Answered by today on January 5, 2022
Just specify the hidden layer as an additional output. As tf.keras.Models can have multiple outputs, this is totally allowed. Then define your custom loss using both values.
Extending your example:
    input = tf.keras.Input(...)
    x1 = tf.keras.layers.Dense(10)(input)
    x2 = tf.keras.layers.Dense(10)(x1)
    x3 = tf.keras.layers.Dense(10)(x2)
    model = tf.keras.Model(inputs=[input], outputs=[x3, x2])
For the custom loss function, I think it's something like this:
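    def custom_loss(y_true, y_pred):
        # Unpack in the same order as the model outputs: [x3, x2].
        x3, x2 = y_pred
        label = y_true  # you might need to provide a dummy var for x2
        # f1 and f2 stand for whatever you want to do, e.g. a penalty on
        # the hidden layer (f1) and the classification loss (f2).
        # Note: a loss passed to model.compile() is applied to each output
        # separately, so this combined form suits a custom training loop.
        return f1(x2) + f2(label, x3)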
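One way the multi-output idea could be wired into `model.compile()` is to give each output its own loss, as in the sketch below. The function name `activation_penalty`, the layer sizes, the `loss_weights` values, and the dummy targets are assumptions for illustration. The hidden layer's loss ignores its target, so an array of zeros is passed for it.

```python
import numpy as np
import tensorflow as tf


def activation_penalty(y_true, y_pred):
    # y_true is a dummy target and is ignored; y_pred is the hidden
    # layer's output. Returns one value per sample, which Keras then
    # averages over the batch.
    return tf.reduce_sum(tf.square(y_pred), axis=-1)


inputs = tf.keras.Input(shape=(20,))
x1 = tf.keras.layers.Dense(10, activation="relu")(inputs)
x2 = tf.keras.layers.Dense(10, activation="relu")(x1)
x3 = tf.keras.layers.Dense(10, activation="softmax")(x2)
model = tf.keras.Model(inputs=inputs, outputs=[x3, x2])

# One loss per output, in the same order as `outputs`.
model.compile(
    optimizer="adam",
    loss=["sparse_categorical_crossentropy", activation_penalty],
    loss_weights=[1.0, 1e-3],
)

# Random stand-in data; `dummy` only satisfies the second output.
x = np.random.rand(16, 20).astype("float32")
y = np.random.randint(0, 10, size=(16,))
dummy = np.zeros((16, 10), dtype="float32")
history = model.fit(x, [y, dummy], epochs=1, verbose=0)
```

Here `loss_weights` plays the role of the regularization strength, so the total loss is the cross-entropy plus `1e-3` times the batch-averaged penalty.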
Answered by Frederik Bode on January 5, 2022