Create Markov decision process model

Create an MDP model with eight states and two possible actions.

MDP = createMDP(8,["up";"down"]);

Specify the state transitions and their associated rewards.

% State 1 Transition and Reward
MDP.T(1,2,1) = 1;
MDP.R(1,2,1) = 3;
MDP.T(1,3,2) = 1;
MDP.R(1,3,2) = 1;
% State 2 Transition and Reward
MDP.T(2,4,1) = 1;
MDP.R(2,4,1) = 2;
MDP.T(2,5,2) = 1;
MDP.R(2,5,2) = 1;
% State 3 Transition and Reward
MDP.T(3,5,1) = 1;
MDP.R(3,5,1) = 2;
MDP.T(3,6,2) = 1;
MDP.R(3,6,2) = 4;
% State 4 Transition and Reward
MDP.T(4,7,1) = 1;
MDP.R(4,7,1) = 3;
MDP.T(4,8,2) = 1;
MDP.R(4,8,2) = 2;
% State 5 Transition and Reward
MDP.T(5,7,1) = 1;
MDP.R(5,7,1) = 1;
MDP.T(5,8,2) = 1;
MDP.R(5,8,2) = 9;
% State 6 Transition and Reward
MDP.T(6,7,1) = 1;
MDP.R(6,7,1) = 5;
MDP.T(6,8,2) = 1;
MDP.R(6,8,2) = 1;
% State 7 Transition and Reward
MDP.T(7,7,1) = 1;
MDP.R(7,7,1) = 0;
MDP.T(7,7,2) = 1;
MDP.R(7,7,2) = 0;
% State 8 Transition and Reward
MDP.T(8,8,1) = 1;
MDP.R(8,8,1) = 0;
MDP.T(8,8,2) = 1;
MDP.R(8,8,2) = 0;

Specify the terminal states of the model.

MDP.TerminalStates = ["s7";"s8"];

`states`: Model states (positive integer | string vector)

Model states, specified as one of the following:

- Positive integer: Specify the number of model states. In this case, each state has a default name, such as `"s1"` for the first state.
- String vector: Specify the state names. In this case, the total number of states is equal to the length of the vector.

`actions`: Model actions (positive integer | string vector)

Model actions, specified as one of the following:

- Positive integer: Specify the number of model actions. In this case, each action has a default name, such as `"a1"` for the first action.
- String vector: Specify the action names. In this case, the total number of actions is equal to the length of the vector.
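The third index of `T` and `R` is the action position and the first two are state positions, so it can help to translate names into indices. The following is a minimal sketch in base MATLAB that uses plain string arrays rather than a `GenericMDP` object. It assumes the default names continue the `"s1"` pattern sequentially, and the variable names (`stateNames`, `actionNames`, `downIdx`, `s7Idx`) are illustrative only.

```matlab
% Illustrative only: plain string arrays, not a GenericMDP object.
numStates = 8;

% Assumed default naming pattern: "s1", "s2", ..., "s8".
stateNames = "s" + (1:numStates)';

% Action names given explicitly, as in createMDP(8,["up";"down"]).
actionNames = ["up";"down"];

% The number of actions equals the length of the name vector.
numActions = numel(actionNames);

% Position of an action name, usable as the third index of T and R.
downIdx = find(actionNames == "down");

% Position of a state name, usable as the first or second index.
s7Idx = find(stateNames == "s7");
```

With these indices, an assignment such as `MDP.T(1,3,2) = 1` can be read as "from the first state, the second action (`"down"`) leads to the third state".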

`MDP`: MDP model (`GenericMDP` object)

MDP model, returned as a `GenericMDP` object with the following properties.

`CurrentState`: Name of the current state (string)

Name of the current state, specified as a string.

`States`: State names (string vector)

State names, specified as a string vector with length equal to the number of states.

`Actions`: Action names (string vector)

Action names, specified as a string vector with length equal to the number of actions.

`T`: State transition matrix (3-D array)

State transition matrix, specified as a 3-D array, which determines the possible movements of the agent in an environment. `T` is a probability matrix that indicates how likely the agent is to move from the current state `s` to any possible next state `s'` by performing action `a`. `T` is given by:

$$T(s,s',a) = \text{probability}(s' \mid s,a).$$

`T` is an *S*-by-*S*-by-*A* array, where *S* is the number of states and *A* is the number of actions.
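Because `T(s,:,a)` holds the probabilities of all next states for a given state and action, those entries should sum to 1 for every state-action pair. The following sketch checks this on a plain 3-D array filled with the transitions from the example. It does not use a `GenericMDP` object, and the names `moves`, `rowTotals`, and `isValid` are illustrative assumptions, not part of the toolbox API.

```matlab
% Plain 3-D array standing in for MDP.T (8 states, 2 actions).
S = 8;
A = 2;
T = zeros(S,S,A);

% Deterministic transitions from the example: [state, next state, action].
moves = [1 2 1; 1 3 2; 2 4 1; 2 5 2; 3 5 1; 3 6 2; 4 7 1; 4 8 2; ...
         5 7 1; 5 8 2; 6 7 1; 6 8 2; 7 7 1; 7 7 2; 8 8 1; 8 8 2];
for k = 1:size(moves,1)
    T(moves(k,1), moves(k,2), moves(k,3)) = 1;
end

% Summing over the next-state dimension gives an S-by-A matrix of totals.
rowTotals = squeeze(sum(T,2));

% Every state-action pair should have total probability 1.
isValid = all(abs(rowTotals(:) - 1) < 1e-12);
```

A check like this is a cheap way to catch a missing or mistyped transition before the model is used for training.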

`R`: Reward transition matrix (3-D array)

Reward transition matrix, specified as a 3-D array, which determines how much reward the agent receives after performing an action in the environment. `R` has the same shape and size as the state transition matrix `T`. The reward for moving from state `s` to state `s'` by performing action `a` is given by:

$$r = R(s,s',a).$$
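To see how `T` and `R` work together, the following sketch follows a fixed action sequence through the example model and accumulates the reward. It is an illustration in base MATLAB on plain arrays, not the toolbox's simulation routine. The names `spec`, `plan`, `terminal`, and `total` are hypothetical, and the next-state lookup assumes deterministic transitions, as in the example.

```matlab
S = 8;
A = 2;
T = zeros(S,S,A);
R = zeros(S,S,A);

% Rows: [state, next state, action, reward], copied from the example.
spec = [1 2 1 3; 1 3 2 1; 2 4 1 2; 2 5 2 1; 3 5 1 2; 3 6 2 4; ...
        4 7 1 3; 4 8 2 2; 5 7 1 1; 5 8 2 9; 6 7 1 5; 6 8 2 1; ...
        7 7 1 0; 7 7 2 0; 8 8 1 0; 8 8 2 0];
for k = 1:size(spec,1)
    T(spec(k,1), spec(k,2), spec(k,3)) = 1;
    R(spec(k,1), spec(k,2), spec(k,3)) = spec(k,4);
end

terminal = [7 8];   % indices of the terminal states "s7" and "s8"
plan = [1 2 2];     % action indices: up, down, down
s = 1;              % start in the first state
total = 0;
for a = plan
    if ismember(s, terminal)
        break;      % stop once a terminal state is reached
    end
    % Deterministic model: the only nonzero entry marks the next state.
    [~, sNext] = max(T(s,:,a));
    total = total + R(s,sNext,a);
    s = sNext;
end
```

For this plan the agent moves from state 1 to 2 to 5 to 8 and collects rewards 3, 1, and 9, so `total` is 13 and the walk ends in the terminal state 8.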

`TerminalStates`: Terminal state names (string vector)

Terminal state names, specified as a string vector of state names.
