
Why does my "entropy generation" RNN do so badly?

Artificial Intelligence · Asked on December 18, 2021

I’m relatively new to RNNs, and I’m trying to train generative and guessing (adversarial) neural networks to produce sequences of real numbers that look random. My architecture looks like this (each "circle" in the output is the adversarial network’s guess for the generated circle vertically below it, having seen only the terms before it):

[architecture diagram]

Note that the adversarial network is rewarded for predicting outputs close to the true values, i.e. its loss function looks like tf.math.reduce_max((sequence - predictions) ** 2) (I have also tried reduce_mean).
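For concreteness, here is a minimal sketch of the two loss variants I tried (the tensor shapes here are just placeholders, not my real batch sizes):

    import tensorflow as tf

    # Assumed shapes: (batch, timesteps).
    sequence = tf.random.uniform((32, 10))     # generator's output sequences
    predictions = tf.random.uniform((32, 10))  # adversary's step-ahead guesses

    squared_error = (sequence - predictions) ** 2

    # reduce_max penalizes only the single worst element in the batch, so each
    # training step backpropagates a gradient through just one position.
    max_loss = tf.math.reduce_max(squared_error)

    # reduce_mean averages over every element, giving denser, smoother gradients.
    mean_loss = tf.math.reduce_mean(squared_error)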

I don’t know if there’s something obviously wrong with my architecture, but when I try to train this network (and I’ve added a reasonable number of layers), it doesn’t work very well.

If you look at the result of the last code block, you’ll see that my generative neural network produces things like

  • [0.9907787, 0.9907827, 0.9907827, 0.9907827, 0.9907827, 0.9907827, 0.9907827, 0.9907827, 0.9907827, 0.9907827]

But it could easily improve itself simply by learning to jump around more, since you’ll observe that the adversarial network also predicts numbers very close to the given number (even when the sequence it is given to predict is one that jumps around a lot!).

What am I doing wrong?

One Answer

I think there are two problems with your network. The first one, always producing very similar outputs, is the simpler one: your network appears to suffer from the very common mode collapse problem. The attached link provides both an explanation and some potential remedies for that problem.
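One commonly cited remedy is to let the adversary see batch-level statistics, for example a minibatch standard-deviation feature, so that a collapsed generator emitting near-identical samples becomes easy to punish. A minimal TensorFlow sketch (the function and shapes are illustrative assumptions, not code from the question):

    import tensorflow as tf

    def minibatch_stddev(x):
        # x: (batch, features). Append the mean per-feature standard deviation
        # across the batch as one extra feature; a collapsed generator drives
        # this value towards zero, which the adversary can exploit.
        stddev = tf.math.reduce_std(x, axis=0)             # (features,)
        mean_stddev = tf.reduce_mean(stddev)               # scalar
        tiled = tf.fill([tf.shape(x)[0], 1], mean_stddev)  # (batch, 1)
        return tf.concat([x, tiled], axis=1)               # (batch, features + 1)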

The second problem is more fundamental. You say that you want your network to produce random numbers, or numbers that at least appear random. However, once training is finished, your model is a static function that will not change any further: given the same input x, it will always produce the same output y. Consequently, unless the inputs to your network already contain some randomness, or are at least always slightly dissimilar, you will not end up with a random number generator. Whether that is useful depends on your use case. But if you make sure that a truly random variable (such as the current date and time) serves as input to the RNN, and the RNN just translates this into some different format, that can still work. Just keep in mind that randomness can never arise out of a trained model itself.
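As a minimal sketch of that last point (layer sizes and names are assumptions): feed an explicit noise tensor into the generator, so that the trained, deterministic model still turns fresh entropy into fresh sequences.

    import tensorflow as tf

    seq_len, noise_dim = 10, 8  # assumed sizes

    # The generator itself is deterministic; all randomness comes from its input.
    inputs = tf.keras.Input(shape=(seq_len, noise_dim))
    hidden = tf.keras.layers.GRU(32, return_sequences=True)(inputs)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
    generator = tf.keras.Model(inputs, outputs)

    noise = tf.random.normal((1, seq_len, noise_dim))  # the actual entropy source
    sequence = generator(noise)                        # (1, seq_len, 1)

Each new draw of noise yields a new sequence, even though the weights are frozen.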

Answered by Daniel B. on December 18, 2021
