TransWikia.com

Why is DDPG not learning or converging?

Artificial Intelligence Asked by I_Al-thamary on December 9, 2021

I have tried several different settings, but DDPG is not learning and does not converge. I have used these implementations (1, 2, and 3) and tried different optimizers, activation functions, and learning rates, but there is no improvement.

    # explicit types so that values passed on the command line are parsed as numbers, not strings
    parser.add_argument('--actor-lr', type=float, help='actor network learning rate', default=0.001)
    parser.add_argument('--critic-lr', type=float, help='critic network learning rate', default=0.0001)
    parser.add_argument('--gamma', type=float, help='discount factor for critic updates', default=0.95)

    parser.add_argument('--tau', type=float, help='soft target update parameter', default=0.001)
    parser.add_argument('--buffer-size', type=int, help='max size of the replay buffer', default=int(1e5))
    parser.add_argument('--minibatch-size', type=int, help='size of minibatch for minibatch-SGD', default=64)

    # run parameters
    # parser.add_argument('--env', help='choose the gym env- tested on {Pendulum-v0}', default='MountainCarContinuous-v0')
    parser.add_argument('--random-seed', type=int, help='random seed for repeatability', default=1234)
    parser.add_argument('--max-episodes', type=int, help='max num of episodes to do while training', default=200)
    parser.add_argument('--max-episode-len', type=int, help='max length of 1 episode', default=100)
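
For reference, here is a minimal sketch of how `tau` and `gamma` from the settings above enter a DDPG update. It assumes the usual formulation (Polyak-averaged target networks and a one-step TD target for the critic); `soft_update` and `td_target` are hypothetical names used only for illustration, not functions from the linked code, and the weights are plain lists of floats rather than real network parameters.

```python
def soft_update(target_weights, online_weights, tau):
    """Polyak-average the online weights into the target weights."""
    return [(1.0 - tau) * t + tau * w
            for t, w in zip(target_weights, online_weights)]


def td_target(reward, next_q, done, gamma):
    """One-step critic target: r + gamma * Q'(s', mu'(s')).

    No bootstrapping from the next state when the episode has ended.
    """
    return reward + (0.0 if done else gamma * next_q)


# Defaults taken from the settings above.
tau, gamma = 0.001, 0.95

# Toy values (assumed, for illustration only).
target = soft_update([0.0, 0.0], [1.0, -1.0], tau)
y = td_target(reward=1.0, next_q=2.0, done=False, gamma=gamma)
```

With `tau = 0.001`, each update moves the target weights 0.1% of the way toward the online weights.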


I have trained in the same environment with A2C and it converged.


Which parameters should I change to make DDPG converge? Can anyone help me with this?
