QLearner Class Reference

#include <QLearner.hh>

Public Member Functions

virtual int first_action (const std::vector< float > &s)
float getValue (std::vector< float > state)
virtual void last_action (float r)
void loadPolicy (const char *filename)
void logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax)
virtual int next_action (float r, const std::vector< float > &s)
void printState (const std::vector< float > &s)
 QLearner (int numactions, float gamma, float initialvalue, float alpha, float epsilon, Random rng=Random())
 QLearner (const QLearner &)
std::vector< float >::iterator random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end)
virtual void savePolicy (const char *filename)
virtual void seedExp (std::vector< experience >)
virtual void setDebug (bool d)
virtual ~QLearner ()

Public Attributes

float epsilon

Protected Types

typedef const std::vector< float > * state_t

Protected Member Functions

state_t canonicalize (const std::vector< float > &s)

Private Attributes

const float alpha
float * currentq
const float gamma
const float initialvalue
const int numactions
std::map< state_t, std::vector< float > > Q
Random rng
std::set< std::vector< float > > statespace

Detailed Description

Agent that uses straight Q-learning, with no generalization and epsilon-greedy exploration.
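The epsilon-greedy rule mentioned above can be illustrated with a minimal sketch. The function name `epsilon_greedy` and the use of `std::mt19937` are assumptions made for this example; they are not part of QLearner, which uses its own `Random` class. Ties between maximal values are resolved here by taking the first one, whereas the presence of `random_max_element` in the class suggests the real implementation may break ties randomly.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper, not part of QLearner: choose an action index from the
// Q-values of a single state using epsilon-greedy exploration.
int epsilon_greedy(const std::vector<float>& q, float epsilon, std::mt19937& gen) {
    assert(!q.empty());
    std::uniform_real_distribution<float> unit(0.0f, 1.0f);
    if (unit(gen) < epsilon) {
        // Explore: pick uniformly among all actions.
        std::uniform_int_distribution<int> any(0, static_cast<int>(q.size()) - 1);
        return any(gen);
    }
    // Exploit: pick the first action with the largest value.
    return static_cast<int>(std::max_element(q.begin(), q.end()) - q.begin());
}
```

With `epsilon` set to 0 the choice is purely greedy; larger values trade exploitation for exploration.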

Member Typedef Documentation

typedef const std::vector<float>* QLearner::state_t [protected]

The implementation maps all sensations to a set of canonical pointers, which serve as the internal representation of environment state.

Constructor & Destructor Documentation

QLearner::QLearner ( int  numactions,
float  gamma,
float  initialvalue,
float  alpha,
float  epsilon,
Random  rng = Random() 
)

Standard constructor.

Parameters
numactions  The number of possible actions
gamma  The discount factor
initialvalue  The initial value of each Q(s,a)
alpha  The learning rate
epsilon  The probability of taking a random action
rng  Initial state of the random number generator to use

QLearner::QLearner ( const QLearner &  )

Unimplemented copy constructor: internal state cannot be simply copied.

QLearner::~QLearner ( ) [virtual]

Member Function Documentation

QLearner::state_t QLearner::canonicalize ( const std::vector< float > &  s) [protected]

Produces a canonical representation of the given sensation.

Parameters
s  The current sensation from the environment.

Returns
A pointer to an equivalent state in statespace.

int QLearner::first_action ( const std::vector< float > &  s) [virtual]

Implements Agent.

float QLearner::getValue ( std::vector< float >  state)

void QLearner::last_action ( float  r) [virtual]

Implements Agent.

void QLearner::loadPolicy ( const char *  filename)

void QLearner::logValues ( ofstream *  of,
int  xmin,
int  xmax,
int  ymin,
int  ymax 
)

int QLearner::next_action ( float  r,
const std::vector< float > &  s 
) [virtual]

Implements Agent.

void QLearner::printState ( const std::vector< float > &  s)

std::vector< float >::iterator QLearner::random_max_element ( std::vector< float >::iterator  start,
std::vector< float >::iterator  end 
)

void QLearner::savePolicy ( const char *  filename) [virtual]

Reimplemented from Agent.

void QLearner::seedExp ( std::vector< experience >  seeds) [virtual]

Reimplemented from Agent.

void QLearner::setDebug ( bool  d) [virtual]

Implements Agent.

Member Data Documentation

const float QLearner::alpha [private]

float* QLearner::currentq [private]

const float QLearner::gamma [private]

const float QLearner::initialvalue [private]

const int QLearner::numactions [private]

std::map<state_t, std::vector<float> > QLearner::Q [private]

The primary data structure of the learning algorithm, the value function Q. For state_t s and int a, Q[s][a] gives the learned estimate of the maximum expected discounted future reward obtainable after executing action a in state s.

Definition at line 75 of file QLearner.hh.
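As an illustration of how a table of this shape is typically used, the sketch below applies the textbook tabular Q-learning backup, Q(s,a) += alpha * (r + gamma * max Q(s',a') - Q(s,a)). The helper `q_update` and the `QTable` typedef are hypothetical names introduced for this example, and the code is not taken from QLearner.cc; the actual implementation may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Stand-ins for the documented types; the real class keeps Q private.
typedef const std::vector<float>* state_t;
typedef std::map<state_t, std::vector<float> > QTable;

// Hypothetical helper: apply one tabular Q-learning backup for the
// transition (s, a, r, s_next). Both states must already have entries in Q.
void q_update(QTable& Q, state_t s, int a, float r, state_t s_next,
              float alpha, float gamma) {
    assert(Q.count(s) == 1 && Q.count(s_next) == 1);
    assert(a >= 0 && a < static_cast<int>(Q[s].size()));
    const std::vector<float>& next = Q[s_next];
    // Value of the best action available in the successor state.
    float best_next = *std::max_element(next.begin(), next.end());
    float& q = Q[s][a];
    // Move Q(s,a) a fraction alpha toward the one-step target.
    q += alpha * (r + gamma * best_next - q);
}
```

Keying the map on pointers rather than on whole vectors keeps lookups cheap, provided every sensation is first mapped to its canonical pointer.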

Random QLearner::rng [private]

std::set<std::vector<float> > QLearner::statespace [private]

Set of all distinct sensations seen. Pointers to elements of this set serve as the internal representation of the environment state.

Definition at line 69 of file QLearner.hh.
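One way such a set can yield canonical pointers is sketched below. This is a free-function approximation under assumed names, not the member function itself: the documented `canonicalize` takes only the sensation and operates on the private `statespace`. The sketch relies on the fact that `std::set` never relocates its elements, so the address of a stored vector remains valid as more sensations are inserted.

```cpp
#include <cassert>
#include <set>
#include <vector>

typedef const std::vector<float>* state_t;

// Hypothetical free-function version of canonicalize(): insert the sensation
// into the set (a no-op if an equal vector is already present) and return the
// address of the stored element.
state_t canonicalize(std::set<std::vector<float> >& statespace,
                     const std::vector<float>& s) {
    state_t result = &*statespace.insert(s).first;
    assert(*result == s);
    return result;
}
```

Two sensations with equal contents therefore map to the same pointer, which is what allows pointer comparison to stand in for vector comparison when indexing Q.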

