rlRepresentation

(Not recommended) Model representation for reinforcement learning agents

Description

Use rlRepresentation to create a function approximator representation for the actor or critic of a reinforcement learning agent. To do so, you specify the observation and action signals for the training environment and options that affect the training of an agent that uses the representation. For more information on creating representations, see Create Policies and Value Functions.

rep = rlRepresentation(net,obsInfo,'Observation',obsNames) creates a representation for the deep neural network net. The observation names obsNames are the network input layer names. obsInfo contains the corresponding observation specifications for the training environment. Use this syntax to create a representation for a critic that does not require action inputs, such as a critic for an rlACAgent or rlPGAgent agent.

rep = rlRepresentation(net,obsInfo,actInfo,'Observation',obsNames,'Action',actNames) creates a representation with action signals specified by the names actNames and specification actInfo. Use this syntax to create a representation for any actor, or for a critic that takes both observation and action as input, such as a critic for an rlDQNAgent or rlDDPGAgent agent.
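
For instance, a minimal sketch of such a critic joins an observation path and an action path with an addition layer before producing a scalar Q-value. The predefined double-integrator environment, the layer sizes, and the layer names below are illustrative assumptions, not requirements of this syntax.

% Illustrative sketch: Q-value critic network with observation and action inputs
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Observation input path
obsPath = [
    imageInputLayer([obsInfo.Dimension(1) 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(24,'Name','obsFC')
    reluLayer('Name','obsRelu')
    fullyConnectedLayer(24,'Name','obsOut')];

% Action input path
actPath = [
    imageInputLayer([actInfo.Dimension(1) 1 1],'Normalization','none','Name','action')
    fullyConnectedLayer(24,'Name','actOut')];

% Common path: combine both paths and output a scalar Q-value
comPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','comRelu')
    fullyConnectedLayer(1,'Name','QValue')];

net = addLayers(layerGraph(obsPath),actPath);
net = addLayers(net,comPath);
net = connectLayers(net,'obsOut','add/in1');
net = connectLayers(net,'actOut','add/in2');

% Critic that takes both observations and actions as inputs
critic = rlRepresentation(net,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'action'});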

tableCritic = rlRepresentation(tab) creates a critic representation for the value table or Q table tab. You specify the observation and action specifications when you create tab using rlTable.

critic = rlRepresentation(basisFcn,W0,obsInfo) creates a linear basis function representation using the handle to a custom basis function basisFcn and initial weight vector W0. obsInfo contains the corresponding observation specifications for the training environment. Use this syntax to create a representation for a critic that does not require action inputs, such as a critic for an rlACAgent or rlPGAgent agent.
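
As a minimal sketch of this syntax, an observation-only critic can use a simple hand-written basis. The cart-pole environment and the quadratic feature vector below are chosen only for illustration.

% Illustrative sketch: observation-only critic based on a custom basis function
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);

myBasis = @(obs) [obs; obs.^2];           % observations and their squares
W0 = zeros(2*obsInfo.Dimension(1),1);     % one weight per basis element

critic = rlRepresentation(myBasis,W0,obsInfo);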

critic = rlRepresentation(basisFcn,W0,oaInfo) creates a linear basis function representation using the specification cell array oaInfo, where oaInfo = {obsInfo,actInfo}. Use this syntax to create a representation for a critic that takes both observations and actions as inputs, such as a critic for an rlDQNAgent or rlDDPGAgent agent.

actor = rlRepresentation(basisFcn,W0,obsInfo,actInfo) creates a linear basis function representation using the specified observation and action specifications, obsInfo and actInfo, respectively. Use this syntax to create a representation for an actor that takes observations as inputs and generates actions.
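
As a sketch of this syntax for a scalar continuous action, a linear basis actor might look like the following. The double-integrator environment and the affine basis are illustrative assumptions.

% Illustrative sketch: linear basis actor for a scalar continuous action
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

myBasis = @(obs) [obs; 1];                % linear terms plus a bias element
W0 = zeros(obsInfo.Dimension(1)+1,1);     % W'*B must match the scalar action

actor = rlRepresentation(myBasis,W0,obsInfo,actInfo);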

rep = rlRepresentation(___,repOpts) creates a representation using additional options that specify learning parameters for the representation when you train an agent. Available options include the optimizer used for training and the learning rate. Use rlRepresentationOptions to create the options set repOpts. You can use this syntax with any of the previous input-argument combinations.

Examples

Create an actor representation and a critic representation that you can use to define a reinforcement learning agent such as an Actor Critic (AC) agent.

For this example, create actor and critic representations for an agent that can be trained against the cart-pole environment described in Train AC Agent to Balance Cart-Pole System. First, create the environment. Then, extract the observation and action specifications from the environment. You need these specifications to define the actor and critic representations.

env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

For a state-value-function critic such as those used for AC or PG agents, the inputs are the observations and the output is a scalar value, the state value. For this example, create the critic representation using a deep neural network with one output, and with observation signals corresponding to x, xdot, theta, and thetadot as described in Train AC Agent to Balance Cart-Pole System. You can obtain the number of observations from the obsInfo specification. Name the network input layer 'observation'.

numObservation = obsInfo.Dimension(1);
criticNetwork = [
    imageInputLayer([numObservation 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(1,'Name','CriticFC')];

Specify options for the critic representation using rlRepresentationOptions. These options control parameters of critic network learning, when you train an agent that incorporates the critic representation. For this example, set the learning rate to 0.05 and the gradient threshold to 1.

repOpts = rlRepresentationOptions('LearnRate',5e-2,'GradientThreshold',1);

Create the critic representation using the specified neural network and options. Also, specify the action and observation information for the critic. Set the observation name to 'observation', which is the name you used when you created the network input layer for criticNetwork.

critic = rlRepresentation(criticNetwork,obsInfo,'Observation',{'observation'},repOpts)
critic = 
  rlValueRepresentation with properties:

            Options: [1x1 rl.option.rlRepresentationOptions]
    ObservationInfo: [1x1 rl.util.rlNumericSpec]
         ActionInfo: {1x0 cell}

Similarly, create a network for the actor. An AC agent decides which action to take given observations using an actor representation. For an actor, the inputs are the observations, and the output depends on whether the action space is discrete or continuous. The actor in this example has two possible discrete actions, -10 or 10. To create the actor, use a deep neural network with the same observation input as the critic and an output layer that can represent these two values. You can obtain the number of actions from the actInfo specification. Name the output layer 'action'.

numAction = numel(actInfo.Elements); 
actorNetwork = [
    imageInputLayer([numObservation 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(numAction,'Name','action')];

Create the actor representation using the observation name and specification and the action name and specification. Use the same representation options.

actor = rlRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'action'},repOpts)
actor = 
  rlStochasticActorRepresentation with properties:

            Options: [1x1 rl.option.rlRepresentationOptions]
    ObservationInfo: [1x1 rl.util.rlNumericSpec]
         ActionInfo: [1x1 rl.util.rlFiniteSetSpec]

You can now use the actor and critic representations to create an AC agent.

agentOpts = rlACAgentOptions(...
    'NumStepsToLookAhead',32,...
    'DiscountFactor',0.99);
agent = rlACAgent(actor,critic,agentOpts)
agent = 
  rlACAgent with properties:

    AgentOptions: [1x1 rl.option.rlACAgentOptions]
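
As a quick sanity check, you can query the agent for an action given a random observation. Here, getAction is assumed to accept the observation wrapped in a cell array.

action = getAction(agent,{rand(4,1)})   % sample an action for a random observation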

This example shows how to create a Q Table representation:

Create an environment interface.

env = rlPredefinedEnv("BasicGridWorld");

Create a Q table using the action and observation specifications from the environment.

qTable = rlTable(getObservationInfo(env),getActionInfo(env));

Create a representation for the Q table.

tableRep = rlRepresentation(qTable);
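
You can then use the table representation as the critic of a value-based agent. For example, one possible next step (shown here with default agent options) is to create a Q-learning agent from this critic.

qAgent = rlQAgent(tableRep);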

This example shows how to create a linear basis function critic representation.

Assume that you have an environment, env. For this example, load the environment used in the Create and Train Custom LQR Agent example.

load myLQREnv.mat

Obtain the observation and action specifications from the environment.

obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

Create a custom basis function. In this case, use the quadratic basis function from Create and Train Custom LQR Agent.

Set the dimensions and parameters required for your basis function.

n = 6;   % length of the combined state-action vector z = [x;u]

Set an initial weight vector.

w0 = 0.1*ones(0.5*(n+1)*n,1);   % one weight per quadratic basis element, n(n+1)/2 in total

Create a representation using a handle to the custom basis function.

critic = rlRepresentation(@(x,u) computeQuadraticBasis(x,u,n),w0,{obsInfo,actInfo});

The following function computes the quadratic basis, as used in Create and Train Custom LQR Agent.

function B = computeQuadraticBasis(x,u,n)
% Return the column vector of all quadratic monomials z(r)*z(c), with r <= c,
% of the combined state-action vector z = [x;u].
z = cat(1,x,u);
idx = 1;
for r = 1:n
    for c = r:n
        if idx == 1
            B = z(r)*z(c);
        else
            B = cat(1,B,z(r)*z(c));
        end
        idx = idx + 1;
    end
end
end

Input Arguments

Deep neural network for the actor or critic, specified, for example, as an array of network Layer objects or as a layerGraph object.

For a list of deep neural network layers, see List of Deep Learning Layers. For more information on creating deep neural networks for reinforcement learning, see Create Policies and Value Functions.

Observation names, specified as a cell array of character vectors. The observation names are the network input layer names you specify when you create net. The names in obsNames must be in the same order as the observation specifications in obsInfo.

Example: {'observation'}

Observation specifications, specified as an rlFiniteSetSpec or rlNumericSpec object or an array containing a mix of such objects. Each element in the array defines the properties of an environment observation channel, such as its dimensions, data type, and name.

You can extract obsInfo from an existing environment or agent using getObservationInfo. You can also construct the specifications manually.

Action name, specified as a single-element cell array that contains a character vector. The action name is the network layer name you specify when you create net. For critic networks, this layer is the first layer of the action input path. For actors, this layer is the last layer of the action output path.

Example: {'action'}

Action specifications, specified either as an rlFiniteSetSpec (for discrete action spaces) or rlNumericSpec (for continuous action spaces) object. This object defines the properties of the environment action channel, such as its dimensions, data type, and name.

Note

Only one action channel is allowed.

You can extract actInfo from an existing environment or agent using getActionInfo. You can also construct the specifications manually.

Value table or Q table for critic, specified as an rlTable object. The learnable parameters of a table representation are the elements of tab.

Custom basis function, specified as a function handle to a user-defined function. For a linear basis function representation, the output of the representation is f = W'B, where W is a weight array and B is the column vector returned by the custom basis function. The learnable parameters of a linear basis function representation are the elements of W.

When creating:

  • A critic representation with observation inputs only, your basis function must have the following signature.

    B = myBasisFunction(obs1,obs2,...,obsN)

    Here obs1 to obsN are observations in the same order and with the same data type and dimensions as the observation specifications in obsInfo.

  • A critic representation with observation and action inputs, your basis function must have the following signature.

    B = myBasisFunction(obs1,obs2,...,obsN,act)

    Here obs1 to obsN are observations in the same order and with the same data type and dimensions as the observation specifications in the first element of oaInfo, and act has the same data type and dimensions as the action specification in the second element of oaInfo.

  • An actor representation, your basis function must have the following signature.

    B = myBasisFunction(obs1,obs2,...,obsN)

    Here, obs1 to obsN are observations in the same order and with the same data type and dimensions as the observation specifications in obsInfo. The data types and dimensions of the action specification in actInfo affect the data type and dimensions of f.

Example: @(x,u) myBasisFunction(x,u)

Initial value for linear basis function weight array, W, specified as one of the following:

  • Column vector — When creating a critic representation or an actor representation with a continuous scalar action signal

  • Array — When creating an actor representation with a column vector continuous action signal or a discrete action space.

Observation and action specifications for creating linear basis function critic representations, specified as the cell array {obsInfo,actInfo}.

Representation options, specified as an option set that you create with rlRepresentationOptions. Available options include the optimizer used for training and the learning rate. See rlRepresentationOptions for details.

Output Arguments

Deep neural network representation, returned as an rlLayerRepresentation object. Use this representation to create an agent for reinforcement learning. For more information, see Reinforcement Learning Agents.

Value or Q table critic representation, returned as an rlTableRepresentation object. Use this representation to create an agent for reinforcement learning. For more information, see Reinforcement Learning Agents.

Linear basis function critic representation, returned as an rlLinearBasisRepresentation object. Use this representation to create an agent for reinforcement learning. For more information, see Reinforcement Learning Agents.

Linear basis function actor representation, returned as an rlLinearBasisRepresentation object. Use this representation to create an agent for reinforcement learning. For more information, see Reinforcement Learning Agents.

Version History

Introduced in R2019a
