Design Specification

1. Purpose
2. Design environment
3. Reinforcement learning
   3-1. The sequence of reinforcement learning
4. Maze exploration using reinforcement learning
   4-1. Q table
   4-2. Q-Learning
   4-3. Learning results
5. Challenge

24th LSI Design Contests-in Okinawa Design Specification - 4

4. Maze exploration using reinforcement learning

The following file contains this example implemented in MATLAB.

Zip file (m files): Sample_program_English.zip
How to use: Run Q_Learning.m from the QL_Shortest_5x5 folder.

As an example of reinforcement learning, we consider the 5 × 5 maze search problem shown in Fig. 2.

Fig. 2: 5 × 5 maze

Agent: A person
State: The cell the agent currently occupies, one of S1 to S25

Action: Move in one of four directions: →, ↑, ←, ↓
Rewards:
S5, S7, S8, S14, S17, S19, S20, S22: Negative reward (demon)
S25: Positive reward (money)
All other cells: No reward
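
The problem definition above can be written down as plain data. The following Python sketch is an illustration only and is not the contest's MATLAB sample. The reward magnitudes (-1 and +1), the row-major cell numbering (S1 in the top-left corner, S25 in the bottom-right), the helper names `reward` and `step`, and the rule that a move into a wall leaves the agent in place are all assumptions, since the text states only the sign of each reward and Fig. 2 is not reproduced here.

```python
# Sketch of the 5 x 5 maze as data (assumptions are marked below).
SIZE = 5
DEMONS = {5, 7, 8, 14, 17, 19, 20, 22}   # negative-reward cells from the text
GOAL = 25                                # positive-reward cell from the text

# Assumption: row-major numbering, S1 top-left, S25 bottom-right,
# so "up" decreases the row index and "right" increases the column index.
ACTIONS = {"right": (0, 1), "up": (-1, 0), "left": (0, -1), "down": (1, 0)}


def reward(state):
    """Reward for entering `state`; the magnitudes -1 and +1 are assumed."""
    if state in DEMONS:
        return -1
    if state == GOAL:
        return 1
    return 0


def step(state, action):
    """Return the next state (assumed: moving into a wall keeps the agent in place)."""
    row, col = divmod(state - 1, SIZE)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < SIZE and 0 <= new_col < SIZE:
        return new_row * SIZE + new_col + 1
    return state
```

Under these assumptions, `step(1, "down")` returns 6 and `reward(7)` returns -1.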

In this maze search problem, the goal of reinforcement learning is to maximize the reward (money) the agent has collected when it reaches the goal (S25).
→ Do not pass through cells that contain demons.
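
To make this objective concrete, the sketch below totals the reward along a candidate route and checks whether the route avoids every demon cell. It is a Python illustration rather than the contest's MATLAB code. The reward magnitudes (-1 and +1), the helper names `total_reward` and `avoids_demons`, and the two example routes (which assume row-major cell numbering with S1 in the top-left corner) are hypothetical.

```python
DEMONS = {5, 7, 8, 14, 17, 19, 20, 22}  # negative-reward cells from the text
GOAL = 25                               # positive-reward cell from the text


def total_reward(path):
    """Sum the assumed rewards (-1 demon, +1 goal, 0 otherwise) over visited cells."""
    return sum(-1 if s in DEMONS else 1 if s == GOAL else 0 for s in path)


def avoids_demons(path):
    """Return True if no visited cell contains a demon."""
    return DEMONS.isdisjoint(path)


# Hypothetical routes from S1 to S25, listing the cells entered after leaving S1.
safe_route = [6, 11, 12, 13, 18, 23, 24, 25]
risky_route = [2, 3, 8, 13, 14, 19, 24, 25]   # passes through S8, S14, S19

print(total_reward(safe_route), avoids_demons(safe_route))    # 1 True
print(total_reward(risky_route), avoids_demons(risky_route))  # -2 False
```

Both routes take eight moves, but only the first reaches S25 without a penalty, which is the behavior the agent is meant to learn.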

