rlValueRepresentation
(Not recommended) Value function critic representation for reinforcement learning agents
rlValueRepresentation is not recommended. Use rlValueFunction
instead. For more information, see rlValueRepresentation is not recommended.
Description
This object implements a value function approximator to be used as a critic within
a reinforcement learning agent. A value function is a function that maps an observation to a
scalar value. The output represents the expected discounted cumulative long-term reward when the
agent starts from the given observation and follows the policy thereafter. Value function critics
therefore need only observations (not actions) as inputs. After you create an
rlValueRepresentation critic, use it to create an agent relying on a value
function critic, such as an rlACAgent, rlPGAgent, or rlPPOAgent. For an
example of this workflow, see Create Actor and Critic Representations. For more information on creating
representations, see Create Policies and Value Functions.
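For example, the following sketch builds a value function critic for the predefined cart-pole environment and uses it, together with a discrete stochastic actor, to create an actor-critic agent. The layer names and network sizes here are illustrative assumptions, not requirements.
% Create a predefined environment and get its specifications.
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
% Critic network: maps the observation to a scalar value.
criticNet = [
    featureInputLayer(obsInfo.Dimension(1),'Normalization','none','Name','state')
    fullyConnectedLayer(32,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(1,'Name','value')];
critic = rlValueRepresentation(criticNet,obsInfo,'Observation',{'state'});
% Actor network: outputs a probability for each discrete action.
actorNet = [
    featureInputLayer(obsInfo.Dimension(1),'Normalization','none','Name','state')
    fullyConnectedLayer(numel(actInfo.Elements),'Name','fc')
    softmaxLayer('Name','actionProb')];
actor = rlStochasticActorRepresentation(actorNet,obsInfo,actInfo,'Observation',{'state'});
% Create an actor-critic agent from the actor and critic.
agent = rlACAgent(actor,critic);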
Creation
Syntax
critic = rlValueRepresentation(net,observationInfo,'Observation',obsName)
critic = rlValueRepresentation(tab,observationInfo)
critic = rlValueRepresentation({basisFcn,W0},observationInfo)
critic = rlValueRepresentation(___,options)
Description
critic = rlValueRepresentation(net,observationInfo,'Observation',obsName) creates the
value function based critic from the deep neural network net. This syntax sets the
ObservationInfo property of critic to the input observationInfo. obsName must contain the
names of the input layers of net.
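For instance, a minimal sketch, where the observation specification, layer names, and network sizes are illustrative assumptions:
obsInfo = rlNumericSpec([4 1]);
net = [
    featureInputLayer(4,'Normalization','none','Name','state')
    fullyConnectedLayer(8,'Name','fc')
    reluLayer('Name','relu')
    fullyConnectedLayer(1,'Name','value')];
% 'Observation' must list the names of the input layers of net, here 'state'.
critic = rlValueRepresentation(net,obsInfo,'Observation',{'state'});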
critic = rlValueRepresentation(tab,observationInfo) creates the value function based
critic with a discrete observation space, from the value table tab, which is an
rlTable object containing a column array with as many elements as the number of
possible observations. This syntax sets the ObservationInfo property of critic to the
input observationInfo.
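A minimal sketch, assuming a discrete observation space with four possible observations:
% Observation specification with four possible observations.
obsInfo = rlFiniteSetSpec([1 2 3 4]);
% Value table with one entry per possible observation (initialized to zero).
vTable = rlTable(obsInfo);
critic = rlValueRepresentation(vTable,obsInfo);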
critic = rlValueRepresentation({basisFcn,W0},observationInfo) creates the value function
based critic using a custom basis function as the underlying approximator. The first
input argument is a two-element cell array whose first element contains the handle
basisFcn to a custom basis function, and whose second element contains the initial
weight vector W0. This syntax sets the ObservationInfo property of critic to the input
observationInfo.
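For example, a sketch assuming a two-dimensional observation and a hand-picked quadratic basis (both are illustrative choices); the critic output is the scalar W'*basisFcn(obs):
obsInfo = rlNumericSpec([2 1]);
% Custom basis: maps a 2-dimensional observation to a 4-element feature vector.
basisFcn = @(obs) [obs(1); obs(2); obs(1)*obs(2); obs(1)^2];
W0 = rand(4,1);   % initial weights, one per basis element
critic = rlValueRepresentation({basisFcn,W0},obsInfo);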
critic = rlValueRepresentation(___,options) creates the value function based critic
using the additional option set options, which is an rlRepresentationOptions object.
This syntax sets the Options property of critic to the options input argument. You can
use this syntax with any of the previous input-argument combinations.
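For example, reusing net and obsInfo from the earlier sketch (the option values here are illustrative):
opts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlValueRepresentation(net,obsInfo,'Observation',{'state'},opts);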
Input Arguments
Properties
Object Functions
rlACAgent | Actor-critic (AC) reinforcement learning agent
rlPGAgent | Policy gradient (PG) reinforcement learning agent
rlPPOAgent | Proximal policy optimization (PPO) reinforcement learning agent
getValue | Obtain estimated value from a critic given environment observations and actions
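For example, for the basis function critic sketched above, you can query the estimated value of an observation (here a random placeholder observation):
% Estimated value of a single observation; returns a scalar.
v = getValue(critic,{rand(2,1)});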