Sarsa Class Reference

#include <Sarsa.hh>

Inheritance diagram for Sarsa (graph omitted): Sarsa derives from Agent.


Public Member Functions

virtual int first_action (const std::vector< float > &s)
float getValue (std::vector< float > state)
virtual void last_action (float r)
void logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax)
virtual int next_action (float r, const std::vector< float > &s)
void printState (const std::vector< float > &s)
std::vector< float >::iterator random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end)
 Sarsa (int numactions, float gamma, float initialvalue, float alpha, float epsilon, float lambda, Random rng=Random())
 Sarsa (const Sarsa &)
virtual void savePolicy (const char *filename)
virtual void seedExp (std::vector< experience >)
virtual void setDebug (bool d)
virtual ~Sarsa ()

Protected Types

typedef const std::vector< float > * state_t

Protected Member Functions

state_t canonicalize (const std::vector< float > &s)

Private Attributes

bool ACTDEBUG
const float alpha
float * currentq
bool ELIGDEBUG
std::map< state_t, std::vector< float > > eligibility
const float epsilon
const float gamma
const float initialvalue
const float lambda
const int numactions
std::map< state_t, std::vector< float > > Q
Random rng
std::set< std::vector< float > > statespace

Detailed Description

Interface for an implementation of the canonical Sarsa(lambda) algorithm. The agent uses plain Sarsa(lambda) with epsilon-greedy exploration and no generalization across states.

Definition at line 17 of file Sarsa.hh.
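
As an illustration of the algorithm named above, the following is a minimal sketch of one tabular Sarsa(lambda) backup. It is not taken from Sarsa.cc: the function name sarsa_lambda_update, the Table typedef, the use of integer state indices instead of state_t, and the choice of accumulating traces are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat table: one row per state index, one column per action.
// The class keys its tables by state_t; indices keep this sketch short.
typedef std::vector<std::vector<float> > Table;

// One tabular Sarsa(lambda) backup with accumulating traces (an assumption),
// applied after taking action a in state s, receiving reward r, and
// selecting action a2 in the next state s2.
void sarsa_lambda_update(Table &Q, Table &elig,
                         std::size_t s, std::size_t a, float r,
                         std::size_t s2, std::size_t a2,
                         float alpha, float gamma, float lambda) {
  assert(Q.size() == elig.size());
  // On-policy TD error: uses the action actually selected in s2.
  const float delta = r + gamma * Q[s2][a2] - Q[s][a];
  elig[s][a] += 1.0f;  // bump the trace of the pair just visited
  for (std::size_t i = 0; i < Q.size(); ++i) {
    for (std::size_t j = 0; j < Q[i].size(); ++j) {
      Q[i][j] += alpha * delta * elig[i][j];  // credit in proportion to trace
      elig[i][j] *= gamma * lambda;           // decay every trace
    }
  }
}
```

With lambda set to 0 the traces vanish after each step and the update reduces to one-step Sarsa.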


Member Typedef Documentation

typedef const std::vector<float>* Sarsa::state_t [protected]

The implementation maps all sensations to a set of canonical pointers, which serve as the internal representation of environment state.

Definition at line 56 of file Sarsa.hh.


Constructor & Destructor Documentation

Sarsa::Sarsa ( int  numactions,
float  gamma,
float  initialvalue,
float  alpha,
float  epsilon,
float  lambda,
Random  rng = Random() 
)

Standard constructor.

Parameters:
numactions    The number of possible actions
gamma         The discount factor
initialvalue  The initial value of each Q(s,a)
alpha         The learning rate
epsilon       The probability of taking a random action
lambda        The eligibility trace decay rate
rng           Initial state of the random number generator to use

Definition at line 4 of file Sarsa.cc.
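
The epsilon parameter can be read as follows. This is a hedged sketch of epsilon-greedy selection over one row of Q-values, not the code in Sarsa.cc: the function name epsilon_greedy is hypothetical, and std::rand() stands in for the Random type, whose interface is not documented here.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Epsilon-greedy choice over the action values of a single state.
// std::rand() is a stand-in for the class's Random rng (an assumption).
int epsilon_greedy(const std::vector<float> &qvalues, float epsilon) {
  assert(!qvalues.empty());
  const int numactions = static_cast<int>(qvalues.size());
  // Draw u in [0, 1); explore when it falls below epsilon.
  const double u = std::rand() / (RAND_MAX + 1.0);
  if (u < epsilon) return std::rand() % numactions;
  // Otherwise exploit: pick the highest-valued action (first one on ties).
  int best = 0;
  for (int a = 1; a < numactions; ++a) {
    if (qvalues[a] > qvalues[best]) best = a;
  }
  return best;
}
```

Setting epsilon to 0 yields a purely greedy agent, while 1 yields uniformly random actions.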

Sarsa::Sarsa ( const Sarsa &  )

Unimplemented copy constructor: internal state cannot be simply copied.

Sarsa::~Sarsa ( ) [virtual]

Definition at line 19 of file Sarsa.cc.


Member Function Documentation

Sarsa::state_t Sarsa::canonicalize ( const std::vector< float > &  s) [protected]

Produces a canonical representation of the given sensation.

Parameters:
s  The current sensation from the environment.
Returns:
A pointer to an equivalent state in statespace.

Definition at line 149 of file Sarsa.cc.
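
One way such canonicalization can work is sketched below. It is written as a free function that takes the set as a parameter, which is an assumption for the sake of a self-contained example; the member function uses the private statespace attribute. The sketch relies on std::set keeping element addresses stable until the element is erased.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

typedef const std::vector<float> *state_t;

// Hypothetical free-function version of canonicalize().
state_t canonicalize(std::set<std::vector<float> > &statespace,
                     const std::vector<float> &s) {
  // insert() adds s if it is new and otherwise returns the existing element,
  // so equal sensations always map to the same stored vector.
  std::pair<std::set<std::vector<float> >::iterator, bool> result =
      statespace.insert(s);
  state_t canonical = &*result.first;
  assert(*canonical == s);
  return canonical;
}
```

Because equal sensations yield the same pointer, maps keyed by state_t can compare states by address instead of comparing whole vectors.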

int Sarsa::first_action ( const std::vector< float > &  s) [virtual]

Implements Agent.

Definition at line 21 of file Sarsa.cc.

float Sarsa::getValue ( std::vector< float >  state)

Definition at line 244 of file Sarsa.cc.

void Sarsa::last_action ( float  r) [virtual]

Implements Agent.

Definition at line 123 of file Sarsa.cc.

void Sarsa::logValues ( ofstream *  of,
int  xmin,
int  xmax,
int  ymin,
int  ymax 
)

Definition at line 228 of file Sarsa.cc.

int Sarsa::next_action ( float  r,
const std::vector< float > &  s 
) [virtual]

Implements Agent.

Definition at line 67 of file Sarsa.cc.
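
The signatures of first_action, next_action and last_action suggest an episode protocol: the first sensation arrives without a reward, each later step delivers the reward for the previous action together with the new sensation, and the final reward arrives alone. The sketch below illustrates that reading only. AgentLike and run_episode are hypothetical names, and replaying a pre-recorded episode replaces a real environment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in mirroring the three calls documented for Agent.
struct AgentLike {
  virtual int first_action(const std::vector<float> &s) = 0;
  virtual int next_action(float r, const std::vector<float> &s) = 0;
  virtual void last_action(float r) = 0;
  virtual ~AgentLike() {}
};

// Replays a recorded episode: states[i] is observed before the i-th action
// and rewards[i] follows that action. Returns the actions the agent chose.
std::vector<int> run_episode(AgentLike &agent,
                             const std::vector<std::vector<float> > &states,
                             const std::vector<float> &rewards) {
  assert(!states.empty() && states.size() == rewards.size());
  std::vector<int> actions;
  actions.push_back(agent.first_action(states[0]));
  for (std::size_t i = 1; i < states.size(); ++i) {
    // Reward for the previous action arrives with the next sensation.
    actions.push_back(agent.next_action(rewards[i - 1], states[i]));
  }
  agent.last_action(rewards.back());  // terminal reward, no further state
  return actions;
}
```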

void Sarsa::printState ( const std::vector< float > &  s)

Definition at line 190 of file Sarsa.cc.

std::vector< float >::iterator Sarsa::random_max_element ( std::vector< float >::iterator  start,
std::vector< float >::iterator  end 
)

Definition at line 165 of file Sarsa.cc.
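
This function carries no description, but its name and signature suggest a variant of std::max_element that breaks ties at random. The sketch below shows one way to do that in a single pass. It is an assumption about the behaviour, not the code in Sarsa.cc, and std::rand() stands in for the class's Random rng.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Returns an iterator to a maximal element, choosing (approximately)
// uniformly among ties. std::rand() replaces the class's rng (an assumption).
std::vector<float>::iterator random_max_element(
    std::vector<float>::iterator start, std::vector<float>::iterator end) {
  assert(start != end);
  std::vector<float>::iterator best = start;
  int ties = 1;  // number of elements seen so far that equal the maximum
  for (std::vector<float>::iterator it = start + 1; it != end; ++it) {
    if (*it > *best) {
      best = it;  // strictly better: restart the tie count
      ties = 1;
    } else if (*it == *best) {
      // Reservoir sampling: keep the new tie with probability 1/ties.
      ++ties;
      if (std::rand() % ties == 0) best = it;
    }
  }
  return best;
}
```

Random tie-breaking matters early in learning, when every Q(s,a) still equals initialvalue and a deterministic argmax would always return the first action.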

void Sarsa::savePolicy ( const char *  filename) [virtual]

Reimplemented from Agent.

Definition at line 278 of file Sarsa.cc.

void Sarsa::seedExp ( std::vector< experience >  seeds) [virtual]

Reimplemented from Agent.

Definition at line 198 of file Sarsa.cc.

void Sarsa::setDebug ( bool  d) [virtual]

Implements Agent.

Definition at line 185 of file Sarsa.cc.


Member Data Documentation

bool Sarsa::ACTDEBUG [private]

Definition at line 87 of file Sarsa.hh.

const float Sarsa::alpha [private]

Definition at line 80 of file Sarsa.hh.

float* Sarsa::currentq [private]

Definition at line 85 of file Sarsa.hh.

bool Sarsa::ELIGDEBUG [private]

Definition at line 88 of file Sarsa.hh.

std::map<state_t, std::vector<float> > Sarsa::eligibility [private]

Definition at line 74 of file Sarsa.hh.

const float Sarsa::epsilon [private]

Definition at line 81 of file Sarsa.hh.

const float Sarsa::gamma [private]

Definition at line 77 of file Sarsa.hh.

const float Sarsa::initialvalue [private]

Definition at line 79 of file Sarsa.hh.

const float Sarsa::lambda [private]

Definition at line 82 of file Sarsa.hh.

const int Sarsa::numactions [private]

Definition at line 76 of file Sarsa.hh.

std::map<state_t, std::vector<float> > Sarsa::Q [private]

The primary data structure of the learning algorithm, the value function Q. For state_t s and int a, Q[s][a] gives the learned maximum expected future discounted reward conditional on executing action a in state s.

Definition at line 73 of file Sarsa.hh.
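
A table of this shape can be populated lazily, as in the sketch below. The helper name action_values, the QTable typedef, and the decision to create a row on first lookup are assumptions for illustration; the documentation only states the type of Q and that each Q(s,a) starts at initialvalue.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

typedef const std::vector<float> *state_t;
typedef std::map<state_t, std::vector<float> > QTable;

// Returns the action values for s, creating a row filled with initialvalue
// the first time s is looked up (lazy initialization is an assumption).
std::vector<float> &action_values(QTable &Q, state_t s,
                                  int numactions, float initialvalue) {
  assert(numactions > 0);
  QTable::iterator it = Q.find(s);
  if (it == Q.end()) {
    it = Q.insert(std::make_pair(
                      s, std::vector<float>(numactions, initialvalue)))
             .first;
  }
  return it->second;
}
```

The returned reference stays valid while the entry remains in the map, so a caller can read Q[s][a] or update it in place.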

Random Sarsa::rng [private]

Definition at line 84 of file Sarsa.hh.

std::set<std::vector<float> > Sarsa::statespace [private]

Set of all distinct sensations seen. Pointers to elements of this set serve as the internal representation of the environment state.

Definition at line 67 of file Sarsa.hh.


The documentation for this class was generated from the following files:

Sarsa.hh
Sarsa.cc


rl_agent
Author(s): Todd Hester
autogenerated on Thu Jun 6 2019 22:00:14