# How can I convert a simple CLI RPG to a compatible environment for training an RL agent via stable-baselines?

Artificial Intelligence Asked by SeunOsiko on August 24, 2021

What would be the good choice of algorithm to use for character action selection in an RPG, implemented in Python?

I had previously asked this question in the hope of making headway on the AI portion of a project I have been working on, only to realize that I first had to convert the text-based game I had created into a custom gym environment before I could consider algorithm selection. I have found few papers relating to this task, hence I have come for advice on how to get started.

As my code for that particular task is extensive and rather messy, I have created a minimized version that contains most of the core features. From my limited understanding, an environment built around it should transfer to the larger program, which explores the possibility of an RL agent acting as a second player in a game akin to a simple Pokemon-esque RPG.

In this example, the player (or the agent) chooses a class before the battle starts, and on each turn selects an action from a fixed set. For simplicity, I did not give each character class its own action set, and the last action, 'Special', requires a specific amount of MP, which cannot be replenished.

I assume that if this game is converted to a compatible environment, an agent can learn how to optimally play this game.
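For concreteness, here is a minimal, dependency-free sketch of the `reset()`/`step()` interface that stable-baselines expects from an environment, built around this battle game. The class name `BattleEnvSketch`, the 4-action encoding, and all the damage/heal numbers are my assumptions drawn loosely from the code below; a real environment would subclass `gym.Env` and declare `action_space = spaces.Discrete(4)` and an `observation_space`:

```python
# Hypothetical sketch of the gym-style interface stable-baselines expects.
# A real environment would subclass gym.Env and declare
#   action_space = spaces.Discrete(4)
#   observation_space = spaces.Box(low=0.0, high=1.0, shape=(4,))
# Kept dependency-free here so the shape of reset()/step() is visible.
class BattleEnvSketch:
    MAX_HP, MAX_MP = 1200, 100
    ENEMY_MAX_HP = 1700

    def reset(self):
        # Start of an episode: fresh stats for both combatants
        self.hp, self.mp = self.MAX_HP, self.MAX_MP
        self.enemy_hp = self.ENEMY_MAX_HP
        return self._obs()

    def step(self, action):
        # action: 0=Attack, 1=Defend, 2=Heal, 3=Special (assumed encoding)
        if action == 0:
            self.enemy_hp = max(0, self.enemy_hp - 275)
        elif action == 2:
            self.hp = min(self.MAX_HP, self.hp + 150)
        elif action == 3 and self.mp >= 50:
            self.mp -= 50
            self.enemy_hp = max(0, self.enemy_hp - 450)
        # The enemy replies with a plain attack in this sketch;
        # defending (action 1) reduces the incoming damage to 75%
        if self.enemy_hp > 0:
            self.hp = max(0, self.hp - (188 if action == 1 else 250))
        done = self.hp == 0 or self.enemy_hp == 0
        reward = 1.0 if done and self.enemy_hp == 0 else (-1.0 if done else 0.0)
        return self._obs(), reward, done, {}

    def _obs(self):
        # Normalized state vector the agent sees each turn
        return [self.hp / self.MAX_HP, self.mp / self.MAX_MP,
                self.enemy_hp / self.ENEMY_MAX_HP, 1.0]
```

The key change from the interactive loop is that `input()` disappears: the agent's choice arrives as the `step()` argument, and the enemy's move becomes part of the environment's transition.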

I have a few questions aside from the major 'how-to', both because I lack understanding in this area and to gather information that transfers more directly to the larger project:

• How do I prevent the agent from selecting a particular action when its condition has not been met (e.g. not enough MP)?

• Can I define the agent's action set after its character class selection, or do I have to give the complete list of actions usable in the game and then set probabilities based on the chosen character class?

• How do I handle the agent being unable to act while its HP is 0 but the game has not ended, e.g. because an allied player in the party is still fighting?

• As a follow-up, how do I allow action selection again once its HP has been restored? (Not currently applicable to the example shown below.)

• Can the agent learn to interact with another player in the game? Can I reward the agent for using particular types of actions on a particular type of player? (Not currently applicable to the example shown below.)

• If a particular condition is applied to the game environment in one episode but not another, such as healing harming instead, can the agent deduce that the condition holds in the current episode and choose not to use healing-based actions?
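On the first bullet, the standard approach is action masking: the environment exposes a boolean mask of currently legal actions and the policy only samples among those (sb3-contrib's `MaskablePPO` consumes such masks). A minimal sketch, where the action encoding and the MP cost are my assumptions:

```python
import random

# Assumed action encoding: 0=Attack, 1=Defend, 2=Heal, 3=Special.
SPECIAL_MP_COST = 50  # hypothetical cost; use whatever the game defines


def action_mask(mp, hp):
    """Boolean mask of currently legal actions for one character."""
    if hp == 0:
        # A downed character can take no action at all
        return [False, False, False, False]
    return [True, True, True, mp >= SPECIAL_MP_COST]


def sample_valid(mask, rng=random):
    """Stand-in for what a masked policy does: sample only legal actions."""
    legal = [a for a, ok in enumerate(mask) if ok]
    return rng.choice(legal)
```

An alternative is to leave every action always available and give a small negative reward (and a wasted turn) for illegal choices, but masking usually converges faster because the agent never has to learn which actions are invalid.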

If I need to make extensions or amendments to the code, just let me know.

The minimized code is as follows:

```python
# Based on original source code by users Citrus-Code and AlexV on Code Review Stack Exchange

import random


def classSelector(enemySelection=False):
    if enemySelection:
        classSelection = random.randint(1, 3)
        if classSelection == 1:
            return 1700
        elif classSelection == 2:
            return 1750
        elif classSelection == 3:
            return 1300
    else:
        print("Choose a class:\n1. Thief\n2. Warrior\n3. Mage")
        classSelection = int(input())
        print("\nPlayer Class selection complete!\n")
        if classSelection == 1:
            return 1000
        elif classSelection == 2:
            return 1200
        elif classSelection == 3:
            return 900


def battle_simulation():
    """Run a simple interactive RPGChar battle simulation."""

    class RPGChar:
        def __init__(self, health):
            self.hp = health
            self.mp = 100
            self.maxHealth = health
            # Used to denote whether a character is defending this turn
            self.defenseState = 0

        def getDefenseModifier(self):
            # Defending reduces incoming damage to 75%
            return 1 if self.defenseState == 0 else 0.75

        def heal(self, heal_amount):
            self.hp += heal_amount
            if self.hp > self.maxHealth:
                self.hp = self.maxHealth
            return heal_amount

        def cast(self, mpCost):
            self.mp -= mpCost
            if self.mp < 0:
                self.mp = 0

        def attack(self, target, damage):
            # Scale damage by the target's defense modifier
            dealt = int(damage * target.getDefenseModifier())
            target.hp -= dealt
            if target.hp < 0:
                target.hp = 0
            return dealt

        def defend(self):
            self.defenseState = 1

        def defendReset(self):
            self.defenseState = 0

    enemy = RPGChar(classSelector(enemySelection=True))
    player = RPGChar(classSelector())
    while True:
        print("\nATTACK CHOICES\n1. Attack\n2. Defend\n3. Heal\n4. Special")
        attack_choice = int(input("\nSelect an attack: "))

        # The enemy selects an action at random, but at full health
        # it will only attack or defend
        enemy_choice = random.randint(1, 2 if enemy.hp == enemy.maxHealth else 4)

        if attack_choice == 2:
            print("You defend yourself from incoming attacks!")
            player.defend()

        if enemy_choice == 2:
            print("Enemy defends from incoming attacks!")
            enemy.defend()

        if attack_choice == 1:
            print(f"You dealt {player.attack(enemy, 275)} damage.")

        if enemy_choice == 1:
            print(f"Mew dealt {enemy.attack(player, 250)} damage.")

        if attack_choice == 3:
            print(
                f"You healed {player.heal(random.randint(int(player.maxHealth * 0.1), int(player.maxHealth * 0.2)))} health points."
            )

        if enemy_choice == 3:
            print(
                f"Mew healed {enemy.heal(random.randint(int(enemy.maxHealth * 0.1), int(enemy.maxHealth * 0.15)))} health points."
            )

        if attack_choice == 4:
            if player.mp >= 50:
                player.cast(50)  # Special consumes MP, which cannot be restored
                print(f"You dealt {player.attack(enemy, 450)} damage.")
            else:
                print("You do not have enough MP to use this action!")
                print("You do nothing on this turn.")

        if enemy_choice == 4:
            if enemy.mp >= 50:
                enemy.cast(50)
                print(f"Mew dealt {enemy.attack(player, 450)} damage.")
            else:
                print("Enemy does not have enough MP to use this action!")
                print("Enemy does nothing on this turn.")

        if enemy.hp == 0 or player.hp == 0:
            break

        print(f"Mew's current health is {enemy.hp}")

        enemy.defendReset()
        player.defendReset()

        print("\nNext Turn!")

    print(f"Mew's final health is {enemy.hp}")

    if player.hp < enemy.hp:
        print("\nYou lost! Better luck next time!")
    else:
        print("\nYou won against Mew!")


def Main():
    battle_simulation()


if __name__ == "__main__":
    Main()
```
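Regarding the last bullet, one way to make a hidden per-episode rule (such as inverted healing) learnable is to include the previous action and the resulting HP change in the observation, so a recurrent policy can infer the rule from its own experience within an episode. A sketch of such an observation builder; the normalization constants and the vector layout are my assumptions:

```python
# Sketch of turning raw battle state into an observation vector.
# Including the previous action and the resulting HP change lets a
# recurrent policy infer hidden per-episode rules (e.g. inverted healing)
# from its own experience; the normalization constants are assumptions.
def build_obs(player_hp, player_mp, enemy_hp, last_action, last_hp_delta,
              max_hp=1200, max_mp=100, enemy_max_hp=1750):
    # One-hot encode the previous action (0..3); all zeros on the first turn
    action_onehot = [1.0 if last_action == a else 0.0 for a in range(4)]
    return [
        player_hp / max_hp,
        player_mp / max_mp,
        enemy_hp / enemy_max_hp,
        *action_onehot,
        last_hp_delta / max_hp,  # negative right after a "heal" reveals the twist
    ]
```

If instead you want the agent told the rule outright, append an explicit condition flag to this vector; inference-from-experience is only needed when the flag is meant to stay hidden.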


