24th LSI Design Contest in Okinawa: Design Specification - 4
4. Maze exploration using reinforcement learning
This example is available as a Matlab sample program.
Zip file (m files): Sample_program_English.zip
How to use: Run Q_Learning.m from the QL_Shortest_5x5 folder.
As an example of reinforcement learning, we consider the 5×5 maze search problem shown in Fig. 2.
Fig. 2: 5×5 maze
Agent: A person
State: The square the agent currently occupies, one of S1 to S25
Action: Move in one of the directions →, ↑, ←, ↓
Rewards:
S5, S7, S8, S14, S17, S19, S20, S22: Negative reward (demon)
S25: Positive reward (money)
All other squares: No reward
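The problem definition above can be written down as a small lookup. The following Python sketch is not the contest's Matlab sample. The reward magnitudes (-1 and +1) and the names DEMON_STATES, GOAL_STATE, and reward are assumptions made for illustration, since the text specifies only the sign of each reward.

```python
# States are numbered 1..25 (S1..S25); the four actions are the moves listed above.
STATES = range(1, 26)
ACTIONS = ("right", "up", "left", "down")

DEMON_STATES = frozenset({5, 7, 8, 14, 17, 19, 20, 22})  # negative reward
GOAL_STATE = 25                                           # positive reward

# Assumed magnitudes: the text gives only the sign of each reward.
DEMON_REWARD = -1.0
GOAL_REWARD = 1.0


def reward(state: int) -> float:
    """Return the reward received on entering `state`."""
    if state not in STATES:
        raise ValueError(f"state must be in 1..25, got {state}")
    if state in DEMON_STATES:
        return DEMON_REWARD
    if state == GOAL_STATE:
        return GOAL_REWARD
    return 0.0
```

Keeping the reward as a function of the state number alone means the sketch does not depend on how Fig. 2 arranges S1 to S25 on the grid.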
In this maze search problem, the purpose of reinforcement learning is to maximize the reward (money) the agent holds when it reaches the goal, S25.
→ The agent should not pass through any square that contains a demon.
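One way to read this objective: the total reward collected on the way to S25 is largest when the path contains no demon square. The Python sketch below illustrates that under stated assumptions. The reward magnitudes (-1 and +1), the function names, and the two visit sequences are hypothetical; whether those sequences are valid routes depends on the layout in Fig. 2, which is not encoded here.

```python
DEMON_STATES = frozenset({5, 7, 8, 14, 17, 19, 20, 22})
GOAL_STATE = 25


def total_reward(path, demon_reward=-1.0, goal_reward=1.0):
    """Sum the rewards of every state visited along `path` (assumed magnitudes)."""
    total = 0.0
    for state in path:
        if state in DEMON_STATES:
            total += demon_reward
        elif state == GOAL_STATE:
            total += goal_reward
    return total


def avoids_demons(path):
    """Return True if the path never enters a demon square."""
    return DEMON_STATES.isdisjoint(path)


# Hypothetical visit sequences; real routes depend on the layout in Fig. 2.
clean_path = [1, 2, 3, 25]
risky_path = [1, 7, 8, 25]

print(total_reward(clean_path), avoids_demons(clean_path))  # 1.0 True
print(total_reward(risky_path), avoids_demons(risky_path))  # -1.0 False
```

Under these assumed values, a path that reaches S25 without touching a demon square keeps the full positive reward, while each demon square visited reduces the total.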