Load Predefined Grid World Environments

Reinforcement Learning Toolbox™ software provides several predefined grid world environments for which the actions, observations, rewards, and dynamics are already defined. You can use these environments to:

  • Learn reinforcement learning concepts

  • Gain familiarity with Reinforcement Learning Toolbox software features

  • Test your own reinforcement learning agents

You can load the following predefined MATLAB® grid world environments using the rlPredefinedEnv function.

Environment             Agent Task
Basic grid world        Move from a starting location to a target location on a two-dimensional grid by selecting moves from the discrete action space {N,S,E,W}.
Waterfall grid world    Move from a starting location to a target location on a larger two-dimensional grid with unknown deterministic or stochastic dynamics.

For more information on the properties of grid world environments, see Create Custom Grid World Environments.

You can also load predefined MATLAB control system environments. For more information, see Load Predefined Control System Environments.

Basic Grid World

The basic grid world environment is a two-dimensional 5-by-5 grid with a starting location, terminal location, and obstacles. The environment also contains a special jump from state [2,4] to state [4,4]. The goal of the agent is to move from the starting location to the terminal location while avoiding obstacles and maximizing the total reward.

To create a basic grid world environment, use the rlPredefinedEnv function. This function creates an rlMDPEnv object representing the grid world.

env = rlPredefinedEnv('BasicGridWorld');

You can visualize the grid world environment using the plot function. The plot displays:

  • Agent location as a red circle. By default, the agent starts in state [1,1].

  • Terminal location as a blue square.

  • Obstacles as black squares.

plot(env)

Actions

The agent can move in one of four possible directions (North, South, East, West).
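You can inspect how these four moves are encoded by querying the environment's action specification. A minimal sketch using the toolbox's getActionInfo function; the mapping of the integers 1 through 4 to North, South, East, and West is the grid world convention assumed here:

```matlab
% Create the basic grid world and inspect its discrete action space.
env = rlPredefinedEnv('BasicGridWorld');
actInfo = getActionInfo(env);
disp(actInfo.Elements)   % integers 1-4, assumed to map to N, S, E, W
```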

Rewards

The agent receives the following rewards or penalties:

  • +10 reward for reaching the terminal state at [5,5]

  • +5 reward for jumping from state [2,4] to state [4,4]

  • -1 penalty for every other action
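These rewards are stored in the reward transition array of the underlying grid world model, which you can examine directly. A sketch, assuming the Model property of the rlMDPEnv object holds the GridWorld object and that its R array is indexed as R(s,s',a):

```matlab
% Inspect the reward structure of the basic grid world.
env = rlPredefinedEnv('BasicGridWorld');
gw = env.Model;    % underlying GridWorld model object
% R(s,s',a) is the reward for transitioning from state s to
% state s' under action a. For the 5-by-5 grid with 4 actions,
% R is a 25-by-25-by-4 array.
size(gw.R)
```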

Deterministic Waterfall Grid Worlds

The deterministic waterfall grid world environment is a two-dimensional 8-by-7 grid with a starting location and terminal location. The environment includes a waterfall that pushes the agent towards the bottom of the grid. The goal of the agent is to move from the starting location to the terminal location while maximizing the total reward.

To create a deterministic waterfall grid world, use the rlPredefinedEnv function. This function creates an rlMDPEnv object representing the grid world.

env = rlPredefinedEnv('WaterFallGridWorld-Deterministic');

As with the basic grid world, you can visualize the environment, where the agent is a red circle, and the terminal location is a blue square.

plot(env)

Actions

The agent can move in one of four possible directions (North, South, East, West).

Rewards

The agent receives the following rewards or penalties:

  • +10 reward for reaching the terminal state at [4,5]

  • -1 penalty for every other action

Waterfall Dynamics

In this environment, a waterfall pushes the agent towards the bottom of the grid.

The intensity of the waterfall varies between the columns, as shown at the top of the preceding figure. When the agent moves into a column with a nonzero intensity, the waterfall pushes it downward by the indicated number of squares. For example, if the agent goes East from state [5,2], it will reach state [7,3].
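You can observe this push by placing the agent in a specific state and stepping the environment. A minimal sketch, assuming the rlMDPEnv reset/step interface, the "[row,col]" state naming used by grid worlds, and that action 3 corresponds to East:

```matlab
% Demonstrate the deterministic waterfall push.
env = rlPredefinedEnv('WaterFallGridWorld-Deterministic');
reset(env);
env.Model.CurrentState = "[5,2]";        % place the agent (assumed state naming)
[nextObs,reward,isDone] = step(env,3);   % 3 = East (assumed encoding)
% Instead of landing in [5,3], the waterfall pushes the agent down to [7,3].
disp(env.Model.CurrentState)
```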

Stochastic Waterfall Grid Worlds

The stochastic waterfall grid world environment is a two-dimensional 8-by-7 grid with a starting location and terminal locations. The environment includes a waterfall that pushes the agent towards the bottom of the grid with a stochastic intensity. The goal of the agent is to move from the starting location to the target terminal location while avoiding the penalty terminal states along the bottom of the grid and maximizing the total reward.

To create a stochastic waterfall grid world, use the rlPredefinedEnv function. This function creates an rlMDPEnv object representing the grid world.

env = rlPredefinedEnv('WaterFallGridWorld-Stochastic');

As with the basic grid world, you can visualize the environment, where the agent is a red circle, and the terminal location is a blue square.

plot(env)

Actions

The agent can move in one of four possible directions (North, South, East, West).

Rewards

The agent receives the following rewards or penalties:

  • +10 reward for reaching the terminal state at [4,5]

  • -10 penalty for reaching any terminal state in the bottom row of the grid

  • -1 penalty for every other action

Waterfall Dynamics

In this environment, a waterfall pushes the agent towards the bottom of the grid with a stochastic intensity. The baseline intensity matches the intensity of the deterministic waterfall environment. However, in the stochastic waterfall case, the agent has an equal chance of experiencing either the indicated intensity, one level above that intensity, or one level below that intensity. For example, if the agent goes East from state [5,2], it has an equal chance of reaching either state [6,3], [7,3], or [8,3].
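You can see this variability by repeating the same move many times and tallying the landing states. A sketch under the same assumptions as before (state naming and the East action encoding are assumptions):

```matlab
% Sample the stochastic waterfall transition from state [5,2].
env = rlPredefinedEnv('WaterFallGridWorld-Stochastic');
landings = strings(1,100);
for k = 1:100
    reset(env);
    env.Model.CurrentState = "[5,2]";   % assumed state naming
    step(env,3);                        % 3 = East (assumed encoding)
    landings(k) = env.Model.CurrentState;
end
% Expect roughly equal counts of "[6,3]", "[7,3]", and "[8,3]".
summary(categorical(landings))
```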
