#include <QLearner.hh>
Public Member Functions
  virtual int first_action (const std::vector< float > &s)
  float getValue (std::vector< float > state)
  virtual void last_action (float r)
  void loadPolicy (const char *filename)
  void logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax)
  virtual int next_action (float r, const std::vector< float > &s)
  void printState (const std::vector< float > &s)
  QLearner (int numactions, float gamma, float initialvalue, float alpha, float epsilon, Random rng=Random())
  QLearner (const QLearner &)
  std::vector< float >::iterator random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end)
  virtual void savePolicy (const char *filename)
  virtual void seedExp (std::vector< experience >)
  virtual void setDebug (bool d)
  virtual ~QLearner ()
Public Attributes
  float epsilon
Protected Types
  typedef const std::vector< float > * state_t
Protected Member Functions
  state_t canonicalize (const std::vector< float > &s)
Private Attributes
  bool ACTDEBUG
  const float alpha
  float *currentq
  const float gamma
  const float initialvalue
  const int numactions
  std::map< state_t, std::vector< float > > Q
  Random rng
  std::set< std::vector< float > > statespace
Agent that uses straight Q-learning, with no generalization and epsilon-greedy exploration.
Definition at line 17 of file QLearner.hh.
typedef const std::vector<float>* QLearner::state_t [protected]
The implementation maps all sensations to a set of canonical pointers, which serve as the internal representation of environment state.
Definition at line 58 of file QLearner.hh.
QLearner::QLearner (int numactions, float gamma, float initialvalue, float alpha, float epsilon, Random rng = Random())
Standard constructor.
Parameters:
  numactions    The number of possible actions
  gamma         The discount factor
  initialvalue  The initial value of each Q(s,a)
  alpha         The learning rate
  epsilon       The probability of taking a random action
  rng           Initial state of the random number generator to use
Definition at line 4 of file QLearner.cc.
QLearner::QLearner (const QLearner &)
Unimplemented copy constructor: internal state cannot be simply copied.
QLearner::~QLearner () [virtual]
Definition at line 17 of file QLearner.cc.
QLearner::state_t QLearner::canonicalize (const std::vector< float > &s) [protected]
Produces a canonical representation of the given sensation.
Parameters:
  s  The current sensation from the environment.
Definition at line 102 of file QLearner.cc.
int QLearner::first_action (const std::vector< float > &s) [virtual]
Implements Agent.
Definition at line 19 of file QLearner.cc.
float QLearner::getValue (std::vector< float > state)
Definition at line 201 of file QLearner.cc.
void QLearner::last_action (float r) [virtual]
Implements Agent.
Definition at line 92 of file QLearner.cc.
void QLearner::loadPolicy (const char *filename)
Definition at line 266 of file QLearner.cc.
void QLearner::logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax)
Definition at line 185 of file QLearner.cc.
int QLearner::next_action (float r, const std::vector< float > &s) [virtual]
Implements Agent.
Definition at line 53 of file QLearner.cc.
void QLearner::printState (const std::vector< float > &s)
Definition at line 141 of file QLearner.cc.
std::vector< float >::iterator QLearner::random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end)
Definition at line 116 of file QLearner.cc.
void QLearner::savePolicy (const char *filename) [virtual]
Reimplemented from Agent.
Definition at line 235 of file QLearner.cc.
void QLearner::seedExp (std::vector< experience > seeds) [virtual]
Reimplemented from Agent.
Definition at line 149 of file QLearner.cc.
void QLearner::setDebug (bool d) [virtual]
Implements Agent.
Definition at line 136 of file QLearner.cc.
bool QLearner::ACTDEBUG [private]
Definition at line 86 of file QLearner.hh.
const float QLearner::alpha [private]
Definition at line 81 of file QLearner.hh.
float* QLearner::currentq [private]
Definition at line 84 of file QLearner.hh.
float QLearner::epsilon
Definition at line 52 of file QLearner.hh.
const float QLearner::gamma [private]
Definition at line 78 of file QLearner.hh.
const float QLearner::initialvalue [private]
Definition at line 80 of file QLearner.hh.
const int QLearner::numactions [private]
Definition at line 77 of file QLearner.hh.
std::map< state_t, std::vector< float > > QLearner::Q [private]
The primary data structure of the learning algorithm, the value function Q. For state_t s and int a, Q[s][a] gives the learned maximum expected future discounted reward conditional on executing action a in state s.
Definition at line 75 of file QLearner.hh.
Random QLearner::rng [private]
Definition at line 83 of file QLearner.hh.
std::set< std::vector< float > > QLearner::statespace [private]
Set of all distinct sensations seen. Pointers to elements of this set serve as the internal representation of the environment state.
Definition at line 69 of file QLearner.hh.