25th LSI Design Contests-in Okinawa Design Specification - 4
4. Maze exploration using reinforcement learning
The following file implements this example in MATLAB.
Zip file (m file): DQN_sample_e.zip
How to use: Run sw_Q_Learning.m from the DQN_sample folder.
As an example of reinforcement learning, we consider the 3 × 3 maze search problem shown in Fig. 3.
Fig 3 : 3×3 maze
Agent: A person
State: The cell the agent currently occupies, one of S1 to S9
Action: Move in one of the directions「→」,「↑」,「←」,「↓」
Rewards:
S5, S7, S8: Negative reward (demon)
S9: Positive reward (money)
All other cells: No reward
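
The agent, state, action, and reward definitions above can be written down as a small environment. The following Python sketch is illustrative only and is not taken from the contest's MATLAB sample. It assumes that the cells are numbered row by row (S1 S2 S3 / S4 S5 S6 / S7 S8 S9), that there are no interior walls, that a move off the grid leaves the agent where it is, and that the reward values are -1 (demon), +1 (money), and 0 (otherwise). The source gives only the sign of each reward; the actual layout is the one in Fig. 3. The names `ACTIONS`, `reward`, and `step` are hypothetical.

```python
# Assumed layout (not given in the text; see Fig. 3 for the real maze):
#   S1 S2 S3
#   S4 S5 S6
#   S7 S8 S9
# Assumed reward magnitudes: -1 for a demon, +1 for the money, 0 otherwise.

ACTIONS = {"right": (0, 1), "up": (-1, 0), "left": (0, -1), "down": (1, 0)}
DEMONS = {5, 7, 8}   # S5, S7, S8: negative reward
GOAL = 9             # S9: positive reward


def reward(state):
    """Reward received on entering `state`."""
    if state in DEMONS:
        return -1.0
    if state == GOAL:
        return 1.0
    return 0.0


def step(state, action):
    """Apply one action and return (next_state, reward, reached_goal)."""
    row, col = divmod(state - 1, 3)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    # Assumption: a move that would leave the grid keeps the agent in place.
    if 0 <= new_row < 3 and 0 <= new_col < 3:
        state = 3 * new_row + new_col + 1
    return state, reward(state), state == GOAL


print(step(1, "right"))  # (2, 0.0, False)
print(step(4, "right"))  # (5, -1.0, False)
print(step(6, "down"))   # (9, 1.0, True)
```

Under these assumptions, each call to `step` returns the next cell, the reward for entering it, and a flag that is true once the agent reaches S9.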
In this maze search problem, the purpose of deep reinforcement learning is to obtain the maximum reward (money) by the time the agent reaches the goal (that is, when it arrives at S9).
→ Do not pass through a cell with a demon.
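
To illustrate how a learned value function leads the agent around the demon cells, here is a minimal sketch in Python. It uses a plain Q-table rather than the deep network this chapter targets, and it is not the contest's MATLAB sample. To keep the result reproducible, it applies the Q-learning update to every state-action pair in fixed sweeps instead of sampling episodes with an exploring agent. Everything not stated in the text is an assumption: the row-by-row cell numbering with no interior walls, the start cell S1, the reward values -1 / +1 / 0, and the hyperparameters `ALPHA`, `GAMMA`, and `SWEEPS`.

```python
# Assumed layout (see Fig. 3 for the real maze): S1 S2 S3 / S4 S5 S6 / S7 S8 S9,
# no interior walls, start at S1, S9 ends the episode.
MOVES = [(0, 1), (-1, 0), (0, -1), (1, 0)]   # right, up, left, down
DEMONS, GOAL = {5, 7, 8}, 9
ALPHA, GAMMA, SWEEPS = 0.5, 0.9, 200         # assumed hyperparameters


def step(state, move):
    """Return (next_state, reward); a move off the grid leaves the agent in place."""
    row, col = divmod(state - 1, 3)
    new_row, new_col = row + move[0], col + move[1]
    if 0 <= new_row < 3 and 0 <= new_col < 3:
        state = 3 * new_row + new_col + 1
    # Assumed magnitudes: the text only says negative, positive, or none.
    reward = -1.0 if state in DEMONS else 1.0 if state == GOAL else 0.0
    return state, reward


# Q[s][a] for states 1..9 (index 0 unused); the goal row stays at zero.
Q = [[0.0] * len(MOVES) for _ in range(10)]
for _ in range(SWEEPS):
    for s in range(1, 10):
        if s == GOAL:
            continue  # no action is taken from the terminal cell
        for a, move in enumerate(MOVES):
            s_next, r = step(s, move)
            target = r + GAMMA * max(Q[s_next])
            Q[s][a] += ALPHA * (target - Q[s][a])   # Q-learning update


def greedy_path(start=1, max_steps=20):
    """Follow the highest-valued action from `start` until the goal is reached."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        s = path[-1]
        best = max(range(len(MOVES)), key=lambda a: Q[s][a])
        path.append(step(s, MOVES[best])[0])
    return path


path = greedy_path()
print(path)  # [1, 2, 3, 6, 9]
```

With the assumed layout, the greedy path runs along the top row and down the right column, so it reaches S9 without entering S5, S7, or S8.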
Copyright (C) 2021-2022 LSI Design Contest. All Rights Reserved.