Class sdm::BaseOccupancyMDP
template <class TOccupancyState>
This class provides a way to transform a Dec-POMDP into an occupancy MDP formalism.
#include <occupancy_mdp.hpp>
Inherits the following classes: sdm::BaseBeliefMDP
Inherited by the following classes: sdm::HierarchicalOccupancyMDP, sdm::PrivateHierarchicalOccupancyMDP, sdm::SerialOccupancyMDP
Public Attributes
Type | Name |
---|---|
std::shared_ptr< HistoryInterface > | current_history_ |
std::shared_ptr< HistoryInterface > | initial_history_ Initial and current histories. |
Public Attributes inherited from sdm::BaseBeliefMDP
Type | Name |
---|---|
int | batch_size_ |
std::shared_ptr< State > | current_state_ The current state (used in RL). |
std::shared_ptr< Graph< std::shared_ptr< State >, Pair< std::shared_ptr< Action >, std::shared_ptr< Observation > > > > | mdp_graph_ The MDP graph (graph of state transitions). |
std::shared_ptr< Graph< double, Pair< std::shared_ptr< State >, std::shared_ptr< Action > > > > | reward_graph_ |
RecursiveMap< TBelief, std::shared_ptr< State > > | state_space_ A pointer to the bag containing all states. |
int | step_ The current timestep (used in RL). |
bool | store_actions_ = true |
bool | store_states_ = true Hyperparameters. |
RecursiveMap< std::shared_ptr< State >, std::shared_ptr< Action >, std::shared_ptr< Observation >, double > | transition_probability_ The transition probability, i.e. p(o \| b, a). |
Public Static Attributes
Type | Name |
---|---|
unsigned long | MEAN_SIZE_STATE |
number | PASSAGE_IN_NEXT_STATE |
double | TIME_IN_APPLY_DR |
double | TIME_IN_COMPRESS |
double | TIME_IN_EXP_NEXT |
double | TIME_IN_GET_ACTION |
double | TIME_IN_GET_REWARD |
double | TIME_IN_NEXT_OSTATE |
double | TIME_IN_NEXT_STATE |
double | TIME_IN_STEP |
double | TIME_IN_UNDER_STEP |
Public Functions
Type | Name |
---|---|
BaseOccupancyMDP () | |
BaseOccupancyMDP (const std::shared_ptr< MPOMDPInterface > & dpomdp, number memory=-1, bool compression=true, bool store_states=true, bool store_actions=true, int batch_size=0) | |
virtual std::shared_ptr< Action > | applyDecisionRule (const std::shared_ptr< OccupancyStateInterface > & ostate, const std::shared_ptr< JointHistoryInterface > & joint_history, const std::shared_ptr< Action > & decision_rule, number t) const |
virtual bool | checkCompatibility (const std::shared_ptr< Observation > & joint_observation, const std::shared_ptr< Observation > & observation) |
virtual std::shared_ptr< Action > | computeRandomAction (const std::shared_ptr< OccupancyStateInterface > & ostate, number t) |
virtual double | do_excess (double incumbent, double lb_value, double ub_value, double cost_so_far, double error, number t) Compute the excess of the HSVI paper. It refers to the termination condition. |
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< State > & belief, number t=0) Get the action space at a specific belief and timestep. |
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< Observation > & observation, number t=0) Get the action space. |
virtual std::shared_ptr< Space > | getObservationSpaceAt (const std::shared_ptr< State > &, const std::shared_ptr< Action > &, number t) Get the observation space of the central planner. |
virtual std::shared_ptr< Action > | getRandomAction (const std::shared_ptr< Observation > & observation, number t) Get random action. |
virtual double | getReward (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, number t=0) Get the expected reward of executing a specific action in a specific belief at timestep t. |
double | getRewardBelief (const std::shared_ptr< BeliefInterface > & state, const std::shared_ptr< Action > & action, number t) |
virtual std::shared_ptr< BeliefMDP > | getUnderlyingBeliefMDP () const Get the address of the underlying BeliefMDP. |
virtual std::shared_ptr< MPOMDPInterface > | getUnderlyingMPOMDP () const Get the address of the underlying MPOMDP . |
void | initialize (number memory) |
virtual std::shared_ptr< State > | nextOccupancyState (const std::shared_ptr< State > & occupancy_state, const std::shared_ptr< Action > & decision_rule, const std::shared_ptr< Observation > & observation, number t=0) Get the next occupancy state. |
virtual std::shared_ptr< Observation > | reset () Reset the environment and return initial observation. |
virtual std::tuple< std::shared_ptr< Observation >, std::vector< double >, bool > | step (std::shared_ptr< Action > action) Do a step on the environment. |
~BaseOccupancyMDP () |
Public Functions inherited from sdm::BaseBeliefMDP
Type | Name |
---|---|
BaseBeliefMDP () | |
BaseBeliefMDP (const std::shared_ptr< POMDPInterface > & pomdp, int batch_size=0) | |
virtual Pair< std::shared_ptr< State >, std::shared_ptr< State > > | computeExactNextState (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual std::shared_ptr< State > | computeNextState (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual Pair< std::shared_ptr< State >, double > | computeNextStateAndProbability (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) Compute the state transition in order to return next state and associated probability. |
virtual Pair< std::shared_ptr< State >, std::shared_ptr< State > > | computeSampledNextState (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< State > & belief, number t=0) Get the action space at a specific belief and timestep. |
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< Observation > & observation, number t) Get the action space. |
virtual double | getExpectedNextValue (const std::shared_ptr< ValueFunction > & value_function, const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, number t=0) Get the expected next value. |
std::shared_ptr< Graph< std::shared_ptr< State >, Pair< std::shared_ptr< Action >, std::shared_ptr< Observation > > > > | getMDPGraph () Get the MDP graph. |
virtual Pair< std::shared_ptr< State >, double > | getNextState (const std::shared_ptr< ValueFunction > & value_function, const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t) |
virtual double | getObservationProbability (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< State > & next_belief, const std::shared_ptr< Observation > & obs, number t=0) const Get the observation probability, i.e. p(o \| b, a, b'). |
virtual std::shared_ptr< Space > | getObservationSpaceAt (const std::shared_ptr< State > &, const std::shared_ptr< Action > &, number t) |
virtual std::shared_ptr< Action > | getRandomAction (const std::shared_ptr< Observation > & observation, number t) Get random action. |
virtual double | getReward (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, number t=0) Get the expected reward of executing a specific action in a specific belief at timestep t. |
std::vector< std::shared_ptr< State > > | getStoredStates () const |
virtual std::shared_ptr< POMDPInterface > | getUnderlyingPOMDP () const Get the address of the underlying POMDP . |
virtual std::shared_ptr< State > | nextBelief (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual Pair< std::shared_ptr< State >, double > | nextBeliefAndProba (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) Get the next belief. |
virtual std::shared_ptr< State > | nextState (const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, number t=0, const std::shared_ptr< HSVI > & hsvi=nullptr) Select the next belief. |
virtual std::shared_ptr< Observation > | reset () Reset the environment and return initial observation. |
virtual std::tuple< std::shared_ptr< Observation >, std::vector< double >, bool > | step (std::shared_ptr< Action > action) Do a step on the environment. |
~BaseBeliefMDP () |
Public Functions inherited from sdm::SolvableByMDP
Type | Name |
---|---|
SolvableByMDP () Default constructor. | |
SolvableByMDP (const std::shared_ptr< MDPInterface > & mdp) Construct a problem solvable by HSVI . | |
virtual double | do_excess (double incumbent, double lb_value, double ub_value, double cost_so_far, double error, number horizon) Compute the excess of the HSVI paper. It refers to the termination condition. |
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< State > & state, number t=0) Get the action space at a specific state and timestep. The state dependency is required when the game forbids the use of some actions in this state. It is also used in some reformulated problems where actions are decision rules. The time dependency is required in extensive-form games in which some agents have a different action space. |
virtual double | getDiscount (number t=0) const Get the specific discount factor for the problem at hand. |
virtual double | getExpectedNextValue (const std::shared_ptr< ValueFunction > & value_function, const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t=0) Get the expected next value. |
virtual std::shared_ptr< State > | getInitialState () Get the initial state. |
virtual Pair< std::shared_ptr< State >, double > | getNextState (const std::shared_ptr< ValueFunction > & value_function, const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t) |
virtual std::shared_ptr< Space > | getObservationSpaceAt (const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t) |
virtual double | getReward (const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t=0) Get the reward of executing a specific action in a specific state at timestep t. The time dependency can be required in non-stationary problems. |
virtual const std::shared_ptr< MDPInterface > & | getUnderlyingProblem () const Get the well-defined underlying problem. Some problems are solvable by DP algorithms even if they are not well defined. Usually, they are simply reformulations of an underlying well-defined problem. For instance, the underlying Dec-POMDP of the OccupancyMDP or the underlying POMDP of the current BeliefMDP. |
virtual double | getWeightedDiscount (number t) Get the specific weighted discount factor for the problem at hand. |
virtual bool | isSerialized () const Check if the problem is serialized. |
virtual std::shared_ptr< State > | nextState (const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t=0, const std::shared_ptr< HSVI > & hsvi=nullptr) Select the next state. |
virtual Pair< std::shared_ptr< Action >, double > | selectNextAction (const std::shared_ptr< ValueFunction > & lb, const std::shared_ptr< ValueFunction > & ub, const std::shared_ptr< State > & s, number h) Select the next action. |
virtual void | setInitialState (const std::shared_ptr< State > & state) |
Public Functions inherited from sdm::SolvableByHSVI
Type | Name |
---|---|
virtual double | do_excess (double incumbent, double lb_value, double ub_value, double cost_so_far, double error, number t) = 0 Compute the excess of the HSVI paper. It refers to the termination condition. |
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< State > & state, number t) = 0 Get the action space at a specific state and timestep. The state dependency is required when the game forbids the use of some actions in this state. It is also used in some reformulated problems where actions are decision rules. The time dependency is required in extensive-form games in which some agents have a different action space. |
virtual double | getDiscount (number t) const = 0 Get the specific discount factor for the problem at hand. |
virtual double | getExpectedNextValue (const std::shared_ptr< ValueFunction > & value_function, const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t) = 0 Get the expected next value. |
virtual std::shared_ptr< State > | getInitialState () = 0 Get the initial state. |
virtual Pair< std::shared_ptr< State >, double > | getNextState (const std::shared_ptr< ValueFunction > & value_function, const std::shared_ptr< State > & belief, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t) = 0 |
virtual std::shared_ptr< Space > | getObservationSpaceAt (const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t) = 0 |
virtual double | getReward (const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t) = 0 Get the reward of executing a specific action in a specific state at timestep t. The time dependency can be required in non-stationary problems. |
virtual const std::shared_ptr< MDPInterface > & | getUnderlyingProblem () const = 0 Get the well-defined underlying problem. Some problems are solvable by DP algorithms even if they are not well defined. Usually, they are simply reformulations of an underlying well-defined problem. For instance, the underlying Dec-POMDP of the OccupancyMDP or the underlying POMDP of the current BeliefMDP. |
virtual double | getWeightedDiscount (number t) = 0 Get the specific weighted discount factor for the problem at hand. |
virtual bool | isSerialized () const = 0 Check if the problem is serialized. |
virtual std::shared_ptr< State > | nextState (const std::shared_ptr< State > & state, const std::shared_ptr< Action > & action, number t=0, const std::shared_ptr< HSVI > & hsvi=nullptr) = 0 Select the next state. |
virtual Pair< std::shared_ptr< Action >, double > | selectNextAction (const std::shared_ptr< ValueFunction > & lb, const std::shared_ptr< ValueFunction > & ub, const std::shared_ptr< State > & state, number t) = 0 Select the next action. |
virtual void | setInitialState (const std::shared_ptr< State > &) = 0 |
virtual | ~SolvableByHSVI () |
Public Functions inherited from sdm::GymInterface
Type | Name |
---|---|
virtual std::shared_ptr< Space > | getActionSpaceAt (const std::shared_ptr< Observation > & observation, number t) = 0 Get the action space. |
virtual std::shared_ptr< Action > | getRandomAction (const std::shared_ptr< Observation > & observation, number t) = 0 Get random action. |
virtual std::shared_ptr< Observation > | reset () = 0 Reset the environment and return initial observation. |
virtual std::tuple< std::shared_ptr< Observation >, std::vector< double >, bool > | step (std::shared_ptr< Action > action) = 0 Do a step on the environment. |
Protected Attributes
Type | Name |
---|---|
std::shared_ptr< BeliefMDP > | belief_mdp_ Keep a pointer to the associated belief MDP that is used to compute next beliefs. |
bool | compression_ = true Hyperparameters. |
Protected Attributes inherited from sdm::SolvableByMDP
Type | Name |
---|---|
std::shared_ptr< State > | initial_state_ The initial state. |
std::shared_ptr< MDPInterface > | underlying_problem_ The underlying well defined problem. |
Protected Functions
Type | Name |
---|---|
virtual std::shared_ptr< Space > | computeActionSpaceAt (const std::shared_ptr< State > & occupancy_state, number t=0) |
virtual Pair< std::shared_ptr< State >, std::shared_ptr< State > > | computeExactNextState (const std::shared_ptr< State > & occupancy_state, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual std::shared_ptr< State > | computeNextState (const std::shared_ptr< State > & occupancy_state, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual Pair< std::shared_ptr< State >, double > | computeNextStateAndProbability (const std::shared_ptr< State > & occupancy_state, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) Compute the state transition in order to return next state and associated probability. |
virtual Pair< std::shared_ptr< State >, std::shared_ptr< State > > | computeSampledNextState (const std::shared_ptr< State > & occupancy_state, const std::shared_ptr< Action > & action, const std::shared_ptr< Observation > & observation, number t=0) |
virtual bool | do_compression (number t) const Return true if compression must be done. |
std::shared_ptr< HistoryInterface > | getNextHistory (const std::shared_ptr< Observation > & observation) |
virtual void | update_occupancy_state_proba (const std::shared_ptr< OccupancyStateInterface > & occupancy_state, const std::shared_ptr< JointHistoryInterface > & joint_history, const std::shared_ptr< BeliefInterface > & belief, double probability) |
Protected Functions inherited from sdm::SolvableByMDP
Type | Name |
---|---|
const std::shared_ptr< MDPInterface > & | getUnderlyingMDP () const Get the underlying mdp. |
Detailed Description
This problem reformulation can be used to solve the underlying Dec-POMDP with standard dynamic programming algorithms.
Public Attributes Documentation
variable current_history_
std::shared_ptr<HistoryInterface> sdm::BaseOccupancyMDP< TOccupancyState >::current_history_;
variable initial_history_
std::shared_ptr<HistoryInterface> sdm::BaseOccupancyMDP< TOccupancyState >::initial_history_;
Public Static Attributes Documentation
variable MEAN_SIZE_STATE
unsigned long sdm::BaseOccupancyMDP< TOccupancyState >::MEAN_SIZE_STATE;
variable PASSAGE_IN_NEXT_STATE
number sdm::BaseOccupancyMDP< TOccupancyState >::PASSAGE_IN_NEXT_STATE;
variable TIME_IN_APPLY_DR
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_APPLY_DR;
variable TIME_IN_COMPRESS
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_COMPRESS;
variable TIME_IN_EXP_NEXT
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_EXP_NEXT;
variable TIME_IN_GET_ACTION
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_GET_ACTION;
variable TIME_IN_GET_REWARD
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_GET_REWARD;
variable TIME_IN_NEXT_OSTATE
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_NEXT_OSTATE;
variable TIME_IN_NEXT_STATE
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_NEXT_STATE;
variable TIME_IN_STEP
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_STEP;
variable TIME_IN_UNDER_STEP
double sdm::BaseOccupancyMDP< TOccupancyState >::TIME_IN_UNDER_STEP;
Public Functions Documentation
function BaseOccupancyMDP [1/2]
sdm::BaseOccupancyMDP::BaseOccupancyMDP ()
function BaseOccupancyMDP [2/2]
sdm::BaseOccupancyMDP::BaseOccupancyMDP (
const std::shared_ptr< MPOMDPInterface > & dpomdp,
number memory=-1,
bool compression=true,
bool store_states=true,
bool store_actions=true,
int batch_size=0
)
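The snippet below is a minimal construction sketch, not the library's canonical usage: loadMPOMDP() is a hypothetical placeholder for however you obtain the underlying MPOMDP, and sdm::OccupancyState is assumed to be the concrete occupancy-state type; only the constructor arguments themselves come from the signature above.

```cpp
#include <memory>
#include <sdm/world/occupancy_mdp.hpp>

// Sketch of constructing an occupancy MDP from an underlying MPOMDP.
// loadMPOMDP() is a hypothetical helper, not part of this class.
std::shared_ptr<sdm::BaseOccupancyMDP<sdm::OccupancyState>> buildOccupancyMDP()
{
    std::shared_ptr<sdm::MPOMDPInterface> dpomdp = loadMPOMDP("tiger.dpomdp");

    return std::make_shared<sdm::BaseOccupancyMDP<sdm::OccupancyState>>(
        dpomdp,
        /* memory        */ 3,     // history length kept per agent (-1 presumably means unbounded)
        /* compression   */ true,  // compress occupancy states when possible
        /* store_states  */ true,  // keep expanded states in the MDP graph
        /* store_actions */ true,  // cache computed action spaces
        /* batch_size    */ 0);    // 0 is assumed to mean exact (non-sampled) updates
}
```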
function applyDecisionRule
virtual std::shared_ptr< Action > sdm::BaseOccupancyMDP::applyDecisionRule (
const std::shared_ptr< OccupancyStateInterface > & ostate,
const std::shared_ptr< JointHistoryInterface > & joint_history,
const std::shared_ptr< Action > & decision_rule,
number t
) const
function checkCompatibility
virtual bool sdm::BaseOccupancyMDP::checkCompatibility (
const std::shared_ptr< Observation > & joint_observation,
const std::shared_ptr< Observation > & observation
)
function computeRandomAction
virtual std::shared_ptr< Action > sdm::BaseOccupancyMDP::computeRandomAction (
const std::shared_ptr< OccupancyStateInterface > & ostate,
number t
)
function do_excess
virtual double sdm::BaseOccupancyMDP::do_excess (
double incumbent,
double lb_value,
double ub_value,
double cost_so_far,
double error,
number t
)
Parameters:
incumbent (double)
lb_value (double)
ub_value (double)
cost_so_far (double)
error (double)
horizon (number)
Returns:
double
Implements sdm::SolvableByHSVI::do_excess
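As a rough illustration of the termination test this function is responsible for, here is a hedged sketch of an HSVI-style excess computation (an assumption about the typical formula, not the library's actual code); the weighted discount plays the role of getWeightedDiscount(t):

```cpp
// Illustrative sketch only: a typical HSVI excess compares the bound gap at the
// current node against the error tolerance rescaled by the weighted discount.
// A positive value means the gap is still too large and the trial continues.
double excess_sketch(double lb_value, double ub_value, double error,
                     double weighted_discount_t)
{
    return (ub_value - lb_value) - error / weighted_discount_t;
}
```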
function getActionSpaceAt [1/2]
virtual std::shared_ptr< Space > sdm::BaseOccupancyMDP::getActionSpaceAt (
const std::shared_ptr< State > & belief,
number t=0
)
Parameters:
belief
the belief
t
the timestep
Returns:
the action space
The time dependency is required in extensive-form games in which some agents have a different action space.
Implements sdm::BaseBeliefMDP::getActionSpaceAt
function getActionSpaceAt [2/2]
virtual std::shared_ptr< Space > sdm::BaseOccupancyMDP::getActionSpaceAt (
const std::shared_ptr< Observation > & observation,
number t=0
)
Parameters:
observation
the observation in consideration
t
the time step
Returns:
the action space.
Implements sdm::BaseBeliefMDP::getActionSpaceAt
function getObservationSpaceAt
virtual std::shared_ptr< Space > sdm::BaseOccupancyMDP::getObservationSpaceAt (
const std::shared_ptr< State > &,
const std::shared_ptr< Action > &,
number t
)
Parameters:
t
the timestep
Returns:
the space of observation of the central planner.
Depending on the case, the central planner may or may not observe what the agents observe.
Implements sdm::BaseBeliefMDP::getObservationSpaceAt
function getRandomAction
virtual std::shared_ptr< Action > sdm::BaseOccupancyMDP::getRandomAction (
const std::shared_ptr< Observation > & observation,
number t
)
Parameters:
observation
the observation in consideration
t
the time step
Returns:
the random action.
Implements sdm::BaseBeliefMDP::getRandomAction
function getReward
virtual double sdm::BaseOccupancyMDP::getReward (
const std::shared_ptr< State > & belief,
const std::shared_ptr< Action > & action,
number t=0
)
Parameters:
belief
the belief
action
the action
t
the timestep
Returns:
the reward
The time dependency can be required in non-stationary problems.
Implements sdm::BaseBeliefMDP::getReward
function getRewardBelief
double sdm::BaseOccupancyMDP::getRewardBelief (
const std::shared_ptr< BeliefInterface > & state,
const std::shared_ptr< Action > & action,
number t
)
function getUnderlyingBeliefMDP
virtual std::shared_ptr< BeliefMDP > sdm::BaseOccupancyMDP::getUnderlyingBeliefMDP () const
function getUnderlyingMPOMDP
virtual std::shared_ptr< MPOMDPInterface > sdm::BaseOccupancyMDP::getUnderlyingMPOMDP () const
function initialize
void sdm::BaseOccupancyMDP::initialize (
number memory
)
function nextOccupancyState
virtual std::shared_ptr< State > sdm::BaseOccupancyMDP::nextOccupancyState (
const std::shared_ptr< State > & occupancy_state,
const std::shared_ptr< Action > & decision_rule,
const std::shared_ptr< Observation > & observation,
number t=0
)
Parameters:
occupancy_state
the occupancy state
decision_rule
the decision rule to apply
observation
the observation
t
the timestep
Returns:
the next occupancy state
This function returns the next occupancy state. To do so, we check in the MDP graph for the existence of an edge (action/observation) starting from the current occupancy state. If it exists, we return the associated next occupancy state. Otherwise, we compute the next occupancy state using the computeNextStateAndProbability function and add an edge from the current occupancy state to the next occupancy state in the graph.
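The following is a hedged sketch of that edge-caching scheme, written as if it were a member of the class. The graph accessors (getNode, hasSuccessor, getSuccessor, addSuccessor) are hypothetical names used purely for illustration; only mdp_graph_, store_states_ and computeNextStateAndProbability correspond to members documented on this page.

```cpp
// Illustrative sketch of the caching logic described above (hypothetical graph
// helpers; not the library's actual implementation).
std::shared_ptr<State> nextOccupancyStateSketch(
    const std::shared_ptr<State> &occupancy_state,
    const std::shared_ptr<Action> &decision_rule,
    const std::shared_ptr<Observation> &observation,
    number t)
{
    auto edge = std::make_pair(decision_rule, observation); // (action, observation) edge label
    auto node = mdp_graph_->getNode(occupancy_state);        // hypothetical accessor

    if (node && node->hasSuccessor(edge))                    // edge already expanded?
        return node->getSuccessor(edge);                     // reuse the cached successor

    // Otherwise compute the successor occupancy state, then cache the new edge.
    // Pair is assumed to behave like std::pair here.
    auto result = computeNextStateAndProbability(occupancy_state, decision_rule, observation, t);
    auto next_state = result.first;
    if (store_states_)
        node->addSuccessor(edge, next_state);
    return next_state;
}
```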
function reset
virtual std::shared_ptr< Observation > sdm::BaseOccupancyMDP::reset ()
Returns:
the initial observation
Implements sdm::BaseBeliefMDP::reset
function step
virtual std::tuple< std::shared_ptr< Observation >, std::vector< double >, bool > sdm::BaseOccupancyMDP::step (
std::shared_ptr< Action > action
)
Parameters:
action
the action to execute
Returns:
the information produced, including the next observation, the rewards, and whether the episode is done
Implements sdm::BaseBeliefMDP::step
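A minimal rollout sketch built only from the reset(), getRandomAction() and step() members documented on this page; omdp is assumed to be an already constructed occupancy MDP (see the constructor example above), and the first entry of the reward vector is assumed to be the team reward.

```cpp
// Random rollout until the episode terminates.
double total_reward = 0.0;
auto observation = omdp->reset();   // initial observation of the central planner
bool done = false;
number t = 0;
while (!done)
{
    auto action = omdp->getRandomAction(observation, t);            // random decision rule
    auto [next_observation, rewards, is_done] = omdp->step(action); // advance the environment
    total_reward += rewards[0];      // first entry assumed to be the (team) reward
    observation = next_observation;
    done = is_done;
    ++t;
}
```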
function ~BaseOccupancyMDP
sdm::BaseOccupancyMDP::~BaseOccupancyMDP ()
Protected Attributes Documentation
variable belief_mdp_
std::shared_ptr<BeliefMDP> sdm::BaseOccupancyMDP< TOccupancyState >::belief_mdp_;
variable compression_
bool sdm::BaseOccupancyMDP< TOccupancyState >::compression_;
Protected Functions Documentation
function computeActionSpaceAt
virtual std::shared_ptr< Space > sdm::BaseOccupancyMDP::computeActionSpaceAt (
const std::shared_ptr< State > & occupancy_state,
number t=0
)
function computeExactNextState
virtual Pair < std::shared_ptr< State >, std::shared_ptr< State > > sdm::BaseOccupancyMDP::computeExactNextState (
const std::shared_ptr< State > & occupancy_state,
const std::shared_ptr< Action > & action,
const std::shared_ptr< Observation > & observation,
number t=0
)
Implements sdm::BaseBeliefMDP::computeExactNextState
function computeNextState
virtual std::shared_ptr< State > sdm::BaseOccupancyMDP::computeNextState (
const std::shared_ptr< State > & occupancy_state,
const std::shared_ptr< Action > & action,
const std::shared_ptr< Observation > & observation,
number t=0
)
Implements sdm::BaseBeliefMDP::computeNextState
function computeNextStateAndProbability
virtual Pair < std::shared_ptr< State >, double > sdm::BaseOccupancyMDP::computeNextStateAndProbability (
const std::shared_ptr< State > & occupancy_state,
const std::shared_ptr< Action > & action,
const std::shared_ptr< Observation > & observation,
number t=0
)
Parameters:
belief
the belief
action
the action
observation
the observation
t
the timestep
Returns:
the couple (next state, transition probability in the next state)
This function can be modified in an inherited class to define a belief MDP with a different representation of the belief state (e.g., BaseOccupancyMDP inherits from BaseBeliefMDP with TBelief = OccupancyState).
Implements sdm::BaseBeliefMDP::computeNextStateAndProbability
function computeSampledNextState
virtual Pair < std::shared_ptr< State >, std::shared_ptr< State > > sdm::BaseOccupancyMDP::computeSampledNextState (
const std::shared_ptr< State > & occupancy_state,
const std::shared_ptr< Action > & action,
const std::shared_ptr< Observation > & observation,
number t=0
)
Implements sdm::BaseBeliefMDP::computeSampledNextState
function do_compression
virtual bool sdm::BaseOccupancyMDP::do_compression (
number t
) const
function getNextHistory
std::shared_ptr< HistoryInterface > sdm::BaseOccupancyMDP::getNextHistory (
const std::shared_ptr< Observation > & observation
)
function update_occupancy_state_proba
virtual void sdm::BaseOccupancyMDP::update_occupancy_state_proba (
const std::shared_ptr< OccupancyStateInterface > & occupancy_state,
const std::shared_ptr< JointHistoryInterface > & joint_history,
const std::shared_ptr< BeliefInterface > & belief,
double probability
)
The documentation for this class was generated from the following file: src/sdm/world/occupancy_mdp.hpp