How can I change observation states' values in OpenAI gym's cartpole environment?

Artificial Intelligence Asked by Kashan on August 24, 2021

I am learning with the OpenAI gym’s cart pole environment.

I want to make the observation states discrete (with a small step size), and for that purpose I need to change two of the observations from $[-\infty, \infty]$ to some finite upper and lower limits. (By the way, these states are the cart velocity and the pole velocity at the tip.)

How can I change these limits in the actual gym’s environment?
Any other suggestions are also welcome.

One Answer

I don't recommend changing the rules of the environment.

What you could do:

Perform a method called bucketing, i.e., take a value from the continuous state space, see which discrete bucket it falls into, and then let your agent use the bucket number as the observation.

e.g. Say we have a continuous state space with one variable in the range $[-\infty, \infty]$.

The buckets can be as follows:

0). $x < -1000$

1). $-1000 \le x < -500$

2). $-500 \le x < -100$

3). $-100 \le x < -50$

4). $-50 \le x < 0$

5). $0 \le x < 50$

6). $50 \le x < 100$

7). $100 \le x < 500$

8). $500 \le x < 1000$

9). $x \ge 1000$

Therefore in this example scenario there are 10 buckets, and the observation becomes a discrete integer in the range [0, 9].
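A minimal sketch of this bucketing scheme using `numpy.digitize` (the bucket edges below are just the example values above; the function name `bucketize` is my own):

```python
import numpy as np

# Inner bucket edges from the example above (hypothetical values).
edges = [-1000, -500, -100, -50, 0, 50, 100, 500, 1000]

def bucketize(x):
    """Map a continuous value to its discrete bucket index, 0 through 9."""
    # np.digitize returns the index of the bucket x falls into:
    # values below -1000 map to 0, values at or above 1000 map to 9.
    return int(np.digitize(x, edges))
```

With per-dimension edge lists you can discretize each continuous cartpole observation the same way and feed the tuple of bucket indices to a tabular agent.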

Answered by rert588 on August 24, 2021
