#include <Sarsa.hh>

Inheritance diagram for Sarsa (image not reproduced): Sarsa derives from Agent.

Public Member Functions
virtual int	first_action (const std::vector< float > &s)
float	getValue (std::vector< float > state)
virtual void	last_action (float r)
void	logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax)
virtual int	next_action (float r, const std::vector< float > &s)
void	printState (const std::vector< float > &s)
std::vector< float >::iterator	random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end)
	Sarsa (int numactions, float gamma, float initialvalue, float alpha, float epsilon, float lambda, Random rng=Random())
	Sarsa (const Sarsa &)
virtual void	savePolicy (const char *filename)
virtual void	seedExp (std::vector< experience >)
virtual void	setDebug (bool d)
virtual	~Sarsa ()
Protected Types
typedef const std::vector < float > *	state_t
Protected Member Functions
state_t	canonicalize (const std::vector< float > &s)
Private Attributes
bool	ACTDEBUG
const float	alpha
float *	currentq
bool	ELIGDEBUG
std::map< state_t, std::vector < float > >	eligibility
const float	epsilon
const float	gamma
const float	initialvalue
const float	lambda
const int	numactions
std::map< state_t, std::vector < float > >	Q
Random	rng
std::set< std::vector< float > >	statespace

Detailed Description

Interface for an implementation of the canonical Sarsa(lambda) algorithm. The agent uses plain Sarsa(lambda) with epsilon-greedy exploration and no generalization: each distinct state has its own entry in the value function.

Definition at line 17 of file Sarsa.hh.
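
As a rough illustration of what one tabular Sarsa(lambda) step involves, the sketch below applies a single on-policy update with eligibility traces. It is not taken from Sarsa.cc. The struct `TabularSarsaLambda`, its `values` and `update` methods, the integer state ids, and the choice of accumulating traces are all assumptions made for the example; only the parameter names `numactions`, `gamma`, `initialvalue`, `alpha`, and `lambda` mirror the constructor documented on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the learner. States are plain ints here,
// whereas the documented class keys its tables on canonical pointers.
struct TabularSarsaLambda {
    typedef std::map<int, std::vector<float> > table_t;

    int numactions;
    float gamma, initialvalue, alpha, lambda;
    table_t Q;            // action values per state
    table_t eligibility;  // eligibility traces per state

    TabularSarsaLambda(int n, float g, float init, float a, float l)
        : numactions(n), gamma(g), initialvalue(init), alpha(a), lambda(l) {}

    // Returns Q[s], creating it filled with initialvalue on first visit.
    std::vector<float>& values(int s) {
        table_t::iterator it = Q.find(s);
        if (it == Q.end())
            it = Q.insert(std::make_pair(
                     s, std::vector<float>(numactions, initialvalue))).first;
        return it->second;
    }

    // One update for the transition (s, a) --r--> (s2, a2).
    void update(int s, int a, float r, int s2, int a2) {
        assert(a >= 0 && a < numactions && a2 >= 0 && a2 < numactions);
        // Temporal-difference error, using the action actually chosen next.
        const float delta = r + gamma * values(s2)[a2] - values(s)[a];

        // Accumulating trace for the pair just visited (an assumption).
        std::vector<float>& trace = eligibility[s];
        if (trace.empty()) trace.assign(numactions, 0.0f);
        trace[a] += 1.0f;

        // Move every traced pair toward the target, then decay its trace.
        for (table_t::iterator it = eligibility.begin();
             it != eligibility.end(); ++it) {
            std::vector<float>& q = values(it->first);
            for (std::size_t i = 0; i < q.size(); ++i) {
                q[i] += alpha * delta * it->second[i];
                it->second[i] *= gamma * lambda;
            }
        }
    }
};
```

In this sketch, `update` would be called once per step with the reward just received and the action selected for the new state, which is roughly the information `next_action` has available.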

Member Typedef Documentation

typedef const std::vector<float>* Sarsa::state_t [protected]

The implementation maps all sensations to a set of canonical pointers, which serve as the internal representation of environment state.

Definition at line 56 of file Sarsa.hh.

Constructor & Destructor Documentation

Sarsa::Sarsa	(	int	numactions,
		float	gamma,
		float	initialvalue,
		float	alpha,
		float	epsilon,
		float	lambda,
		Random	rng = Random()
	)

Standard constructor.

Parameters:

numactions	The number of possible actions
gamma	The discount factor
initialvalue	The initial value of each Q(s,a)
alpha	The learning rate
epsilon	The probability of taking a random action
lambda	The decay rate of the eligibility traces
rng	Initial state of the random number generator to use

Definition at line 4 of file Sarsa.cc.
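
To make the role of `epsilon` concrete, here is a minimal sketch of epsilon-greedy action selection over one state's action values. The helper name `epsilon_greedy` is hypothetical, and the example uses `std::mt19937` from the standard library because the interface of the project's `Random` class is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: with probability epsilon return a uniformly random
// action index, otherwise return the index of a highest-valued action.
int epsilon_greedy(const std::vector<float>& q, float epsilon,
                   std::mt19937& gen) {
    assert(!q.empty());
    std::uniform_real_distribution<float> unit(0.0f, 1.0f);
    if (unit(gen) < epsilon) {
        // Explore: every action is equally likely.
        std::uniform_int_distribution<int> any(
            0, static_cast<int>(q.size()) - 1);
        return any(gen);
    }
    // Exploit: std::max_element returns the first maximum, so ties are
    // not randomized in this simplified version.
    return static_cast<int>(std::max_element(q.begin(), q.end()) - q.begin());
}
```

With `epsilon` set to 0 the helper is purely greedy; larger values trade exploitation for exploration.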

Sarsa::Sarsa ( const Sarsa & )

Unimplemented copy constructor: internal state cannot be simply copied.
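
One plausible reason a memberwise copy would not work, inferred from the member declarations rather than stated in the source: the keys of `Q` and `eligibility` are pointers into `statespace`, so a copied object's maps would still point into the original object's set. The sketch below reproduces that situation in miniature. `Table`, `add`, and `keys_are_internal` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

typedef const std::vector<float>* state_t;

// Hypothetical miniature of the class layout: a set of states plus a
// table keyed by pointers to that set's elements.
struct Table {
    std::set<std::vector<float> > statespace;
    std::map<state_t, std::vector<float> > Q;

    // Stores s in statespace and records q under the stored element's address.
    state_t add(const std::vector<float>& s, const std::vector<float>& q) {
        state_t key = &*statespace.insert(s).first;
        Q[key] = q;
        return key;
    }
};

// True if every key of t.Q is the address of an element of t.statespace.
bool keys_are_internal(const Table& t) {
    typedef std::map<state_t, std::vector<float> >::const_iterator iter_t;
    for (iter_t it = t.Q.begin(); it != t.Q.end(); ++it) {
        assert(it->first != 0);
        std::set<std::vector<float> >::const_iterator found =
            t.statespace.find(*it->first);
        if (found == t.statespace.end() || &*found != it->first)
            return false;
    }
    return true;
}
```

After `Table b = a;`, the check holds for `a` but fails for `b`, because `b.Q` still holds addresses of elements owned by `a.statespace`. A correct copy would have to re-key the maps against the new set.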

Sarsa::~Sarsa ( ) [virtual]

Definition at line 19 of file Sarsa.cc.

Member Function Documentation

Sarsa::state_t Sarsa::canonicalize ( const std::vector< float > & s ) [protected]

Produces a canonical representation of the given sensation.

Parameters:

s	The current sensation from the environment.

Returns: A pointer to an equivalent state in statespace.

Definition at line 149 of file Sarsa.cc.
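
The mapping from sensations to canonical pointers can be sketched with `std::set::insert`, which returns an iterator to the stored element whether or not the value was already present. This is a free-function illustration under that assumption, not the code in Sarsa.cc; passing `statespace` as a parameter is a simplification, since the documented method uses the private member.

```cpp
#include <cassert>
#include <set>
#include <vector>

typedef const std::vector<float>* state_t;

// Hypothetical free-function version of canonicalize: equal sensations
// always map to the same pointer.
state_t canonicalize(std::set<std::vector<float> >& statespace,
                     const std::vector<float>& s) {
    // insert() leaves the set unchanged if an equal vector is already
    // stored; either way, .first refers to the stored element.
    state_t p = &*statespace.insert(s).first;
    assert(*p == s);
    return p;
}
```

Because `std::set` is node-based, the returned pointers stay valid as further states are inserted, which is what makes them usable as map keys.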

int Sarsa::first_action ( const std::vector< float > & s ) [virtual]

Implements Agent.

Definition at line 21 of file Sarsa.cc.

float Sarsa::getValue ( std::vector< float > state )

Definition at line 244 of file Sarsa.cc.

void Sarsa::last_action ( float r ) [virtual]

Implements Agent.

Definition at line 123 of file Sarsa.cc.

void Sarsa::logValues	(	ofstream *	of,
		int	xmin,
		int	xmax,
		int	ymin,
		int	ymax
	)

Definition at line 228 of file Sarsa.cc.

int Sarsa::next_action	(	float	r,
		const std::vector< float > &	s
	)		[virtual]

Implements Agent.

Definition at line 67 of file Sarsa.cc.

void Sarsa::printState ( const std::vector< float > & s )

Definition at line 190 of file Sarsa.cc.

std::vector< float >::iterator Sarsa::random_max_element	(	std::vector< float >::iterator	start,
		std::vector< float >::iterator	end
	)

Definition at line 165 of file Sarsa.cc.
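
This function is undocumented in the source. Its name and signature suggest that it returns an iterator to a maximum element, breaking ties at random. A minimal sketch under that assumption follows. The extra `std::mt19937&` parameter is hypothetical (the documented member presumably draws from `rng`), and the single-pass tie-breaking scheme is one possible approach, not necessarily the one in Sarsa.cc.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical variant taking an explicit generator. Returns an iterator
// to a maximum element of [start, end), chosen uniformly among ties.
std::vector<float>::iterator random_max_element(
        std::vector<float>::iterator start,
        std::vector<float>::iterator end,
        std::mt19937& gen) {
    assert(start != end);
    std::vector<float>::iterator best = start;
    int ties = 1;  // number of elements seen so far equal to *best
    for (std::vector<float>::iterator it = start + 1; it != end; ++it) {
        if (*it > *best) {
            best = it;
            ties = 1;
        } else if (*it == *best) {
            ++ties;
            // Switch to the new element with probability 1/ties, which
            // leaves each tied element equally likely to be the final pick.
            std::uniform_int_distribution<int> pick(1, ties);
            if (pick(gen) == 1) best = it;
        }
    }
    return best;
}
```

Random tie-breaking matters most early in learning, when every action in a new state shares the same `initialvalue` and a first-maximum rule would always pick action 0.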

void Sarsa::savePolicy ( const char * filename ) [virtual]

Reimplemented from Agent.

Definition at line 278 of file Sarsa.cc.

void Sarsa::seedExp ( std::vector< experience > seeds ) [virtual]

Reimplemented from Agent.

Definition at line 198 of file Sarsa.cc.

void Sarsa::setDebug ( bool d ) [virtual]

Implements Agent.

Definition at line 185 of file Sarsa.cc.

Member Data Documentation

bool Sarsa::ACTDEBUG [private]

Definition at line 87 of file Sarsa.hh.

const float Sarsa::alpha [private]

Definition at line 80 of file Sarsa.hh.

float* Sarsa::currentq [private]

Definition at line 85 of file Sarsa.hh.

bool Sarsa::ELIGDEBUG [private]

Definition at line 88 of file Sarsa.hh.

std::map<state_t, std::vector<float> > Sarsa::eligibility [private]

Definition at line 74 of file Sarsa.hh.

const float Sarsa::epsilon [private]

Definition at line 81 of file Sarsa.hh.

const float Sarsa::gamma [private]

Definition at line 77 of file Sarsa.hh.

const float Sarsa::initialvalue [private]

Definition at line 79 of file Sarsa.hh.

const float Sarsa::lambda [private]

Definition at line 82 of file Sarsa.hh.

const int Sarsa::numactions [private]

Definition at line 76 of file Sarsa.hh.

std::map<state_t, std::vector<float> > Sarsa::Q [private]

The primary data structure of the learning algorithm, the action-value function Q. For state_t s and int a, Q[s][a] gives the learned estimate of the expected discounted future reward for executing action a in state s and following the agent's current policy thereafter.

Definition at line 73 of file Sarsa.hh.

Random Sarsa::rng [private]

Definition at line 84 of file Sarsa.hh.

std::set<std::vector<float> > Sarsa::statespace [private]

Set of all distinct sensations seen. Pointers to elements of this set serve as the internal representation of the environment state.

Definition at line 67 of file Sarsa.hh.

The documentation for this class was generated from the following files: Sarsa.hh and Sarsa.cc.
