RL for Pokemon (specifically "subgame solving")

I'm pretty new to RL, and I was wondering if I could get some insight on how an idea of mine could work for Pokemon battling. For those unfamiliar with Pokemon battling, it's an imperfect-information game with a very large state space and a lot of hidden information regarding the opponent's moves, their stats, etc. For generality, let's say each player has a team of 6 Pokemon (we'll call this 6v6). The action space is the 4 attacks plus 5 potential switches, for a total of 9 actions.
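To make the 9-action space concrete, here is a minimal sketch of one way it could be encoded, with a legality mask so the same encoding works for smaller team sizes. The index layout and the names `legal_action_mask` and `masked_argmax` are my own assumptions for illustration, not taken from any Pokemon simulator or library.

```python
# Hypothetical encoding of the 9-action space: indices 0-3 are the attacks,
# indices 4-8 are switches to one of the 5 benched Pokemon.
NUM_ATTACKS = 4
NUM_SWITCHES = 5
NUM_ACTIONS = NUM_ATTACKS + NUM_SWITCHES  # 9


def legal_action_mask(usable_moves, bench_alive):
    """Return a list of 9 booleans marking which actions are legal.

    usable_moves: 4 booleans, True if that attack can be selected.
    bench_alive:  5 booleans, True if that benched Pokemon can be switched in.
    """
    if len(usable_moves) != NUM_ATTACKS or len(bench_alive) != NUM_SWITCHES:
        raise ValueError("expected 4 move flags and 5 bench flags")
    return list(usable_moves) + list(bench_alive)


def masked_argmax(q_values, mask):
    """Pick the legal action with the highest estimated Q-value."""
    legal = [i for i, ok in enumerate(mask) if ok]
    if not legal:
        raise ValueError("no legal actions")
    return max(legal, key=lambda i: q_values[i])


# In a 1v1 there is nobody to switch to, so only the 4 attacks are legal.
mask_1v1 = legal_action_mask([True] * 4, [False] * 5)
```

With a mask like this, a 1v1 or 2v2 game is just the 6v6 action space with most switch slots masked out, so one agent interface could in principle cover every team size.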

My main question is, can you train an agent by solving "subgames", where first you have it become a 1v1 pro (so each player only has 1 Pokemon), and then become a pro at 2v2 by learning a policy that reduces the 2v2 game into a favorable 1v1 scenario, eventually scaling up to 6v6? So rather than starting from the (daunting) 6v6 problem, we explicitly start with the much simpler, smaller endgame scenarios and frame the problem as learning policies that lead to a state we have already solved before. Is there an RL algorithm that couples naturally with this? The motivation here is that hopefully the agent will naturally learn concepts like win conditions, reducing games to favorable endgame scenarios, etc., which are all larger macro concepts that separate top human players from people simply clicking the super effective move.
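The "solve the endgames first" idea can be sketched as backward induction on a heavily simplified toy game. Everything here is an assumption for illustration: the state is just the pair (my remaining Pokemon, opponent's remaining Pokemon), there is no hidden information, and there are only two made-up actions, "attack" (knocks out the opponent's active Pokemon with probability `P_ATTACK_WINS`, otherwise mine faints) and "trade" (both active Pokemon faint). Real battles are far richer, so treat this as a picture of the recursion, not a model of the game.

```python
from functools import lru_cache

# Assumed toy dynamics: chance that "attack" KOs the opponent's active Pokemon.
P_ATTACK_WINS = 0.6


@lru_cache(maxsize=None)
def solve(mine, theirs):
    """Return (win probability, best action) for the toy state (mine, theirs).

    Larger states are solved by looking up the already-solved smaller ones.
    """
    if mine == 0:
        return 0.0, None  # includes (0, 0): a double KO is scored as not a win
    if theirs == 0:
        return 1.0, None
    # "attack": either they lose a Pokemon or I do.
    attack = (P_ATTACK_WINS * solve(mine, theirs - 1)[0]
              + (1 - P_ATTACK_WINS) * solve(mine - 1, theirs)[0])
    # "trade": both sides lose their active Pokemon.
    trade = solve(mine - 1, theirs - 1)[0]
    return max((attack, "attack"), (trade, "trade"))


if __name__ == "__main__":
    for state in [(1, 1), (2, 1), (1, 2), (2, 2)]:
        print(state, solve(*state))
```

Even in this toy, the best action at (2, 1) is to trade down into an already-won endgame rather than take the coin flip, which is the kind of win-condition reasoning described above.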

For example, how would you modify Deep Q-learning to adopt this learning paradigm? Is there some way you could "freeze" the weights of the network that are good at approximating Q-values for these 1v1 states, and then freeze another layer of weights that are good at approximating Q-values for 2v2 states, etc., in these rounds of training on subgames? Would that make any sense to do?
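As a rough illustration of the freezing idea, here is a sketch that swaps Deep Q-learning for plain tabular Q-learning, so that "freezing" simply means no longer updating the Q-values of states from earlier stages while still bootstrapping from them. The toy game (states are counts of remaining Pokemon, actions are a made-up "attack" and "trade"), the stage definition, and all names and hyperparameters are assumptions for illustration. In a neural network, weights are shared across states, so freezing layers is not equivalent to freezing per-state values; this only shows the curriculum structure.

```python
import random
from collections import defaultdict

ACTIONS = ("attack", "trade")
P_ATTACK_WINS = 0.6  # assumed toy dynamics, not real battle mechanics


def step(state, action, rng):
    """Toy transition: returns (next_state, reward, done)."""
    mine, theirs = state
    if action == "trade":            # both active Pokemon faint
        mine, theirs = mine - 1, theirs - 1
    elif rng.random() < P_ATTACK_WINS:
        theirs -= 1                  # attack succeeds
    else:
        mine -= 1                    # attack fails, my Pokemon faints
    if mine == 0:
        return (mine, theirs), 0.0, True
    if theirs == 0:
        return (mine, theirs), 1.0, True
    return (mine, theirs), 0.0, False


def train_curriculum(max_size, episodes_per_stage=20000, alpha=0.02,
                     epsilon=0.2, seed=0):
    """Q-learning in stages: 1v1 first, then states with up to 2 a side, etc."""
    rng = random.Random(seed)
    q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    frozen = set()                   # states whose Q-values are no longer updated
    for size in range(1, max_size + 1):
        starts = [(m, t) for m in range(1, size + 1)
                  for t in range(1, size + 1) if max(m, t) == size]
        for _ in range(episodes_per_stage):
            state, done = rng.choice(starts), False
            # Stop as soon as the game reduces to an already-solved subgame.
            while not done and state not in frozen:
                if rng.random() < epsilon:
                    action = rng.choice(ACTIONS)
                else:
                    action = max(ACTIONS, key=lambda a: q[state][a])
                nxt, reward, done = step(state, action, rng)
                # Undiscounted target; frozen states act as fixed bootstrap values.
                target = reward if done else reward + max(q[nxt].values())
                q[state][action] += alpha * (target - q[state][action])
                state = nxt
        frozen.update(starts)        # freeze this stage before moving on
    return q, frozen


if __name__ == "__main__":
    q, frozen = train_curriculum(2)
    for s in sorted(frozen):
        print(s, q[s])
```

With a deep network, one option might be to disable gradients on selected parameters between stages (for example via `requires_grad_(False)` in PyTorch), but whether that preserves the smaller-subgame values well is an open question here rather than something this sketch demonstrates.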

Posted
3 years ago