QLearner Class Reference

#include <QLearner.hh>

Public Member Functions

virtual int first_action (const std::vector< float > &s)
float getValue (std::vector< float > state)
virtual void last_action (float r)
void loadPolicy (const char *filename)
void logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax)
virtual int next_action (float r, const std::vector< float > &s)
void printState (const std::vector< float > &s)
 QLearner (int numactions, float gamma, float initialvalue, float alpha, float epsilon, Random rng=Random())
 QLearner (const QLearner &)
std::vector< float >::iterator random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end)
virtual void savePolicy (const char *filename)
virtual void seedExp (std::vector< experience >)
virtual void setDebug (bool d)
virtual ~QLearner ()

Public Attributes

float epsilon

Protected Types

typedef const std::vector< float > * state_t

Protected Member Functions

state_t canonicalize (const std::vector< float > &s)

Private Attributes

const float alpha
float * currentq
const float gamma
const float initialvalue
const int numactions
std::map< state_t, std::vector< float > > Q
Random rng
std::set< std::vector< float > > statespace

Detailed Description

Agent that uses plain tabular Q-learning, with no generalization across states, and epsilon-greedy exploration.

Definition at line 17 of file QLearner.hh.
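
For readers unfamiliar with the method, the following is a minimal sketch of the tabular Q-learning backup and of epsilon-greedy selection that the description refers to. It is not taken from QLearner.hh: the function names q_update, q_update_terminal and epsilon_greedy are hypothetical, a standard-library generator stands in for the Random class, and ties are resolved by taking the first maximum. The class also declares random_max_element, which suggests the actual agent may break ties randomly.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// One tabular Q-learning backup (hypothetical helper, not the class's code):
//   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
// q_sa is the entry being updated; next_q holds Q(s', .) for the successor.
inline void q_update(float &q_sa, float r, const std::vector<float> &next_q,
                     float alpha, float gamma) {
  assert(!next_q.empty());
  const float best_next = *std::max_element(next_q.begin(), next_q.end());
  q_sa += alpha * (r + gamma * best_next - q_sa);
}

// Backup for the final step of an episode, where there is no successor state.
inline void q_update_terminal(float &q_sa, float r, float alpha) {
  q_sa += alpha * (r - q_sa);
}

// Epsilon-greedy selection: with probability epsilon choose a uniformly
// random action, otherwise the first action with the largest value.
template <class Rng>
int epsilon_greedy(const std::vector<float> &q, float epsilon, Rng &rng) {
  assert(!q.empty());
  std::uniform_real_distribution<float> coin(0.0f, 1.0f);
  if (coin(rng) < epsilon) {
    std::uniform_int_distribution<int> pick(0, static_cast<int>(q.size()) - 1);
    return pick(rng);
  }
  return static_cast<int>(std::max_element(q.begin(), q.end()) - q.begin());
}
```

In terms of the interface above, first_action would presumably only select an action, next_action would apply a backup like q_update before selecting, and last_action would apply the terminal form; that mapping is an assumption based on the signatures, not on the source file.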

Member Typedef Documentation

typedef const std::vector<float>* QLearner::state_t [protected]

The implementation maps all sensations to a set of canonical pointers, which serve as the internal representation of environment state.

Definition at line 58 of file QLearner.hh.

Constructor & Destructor Documentation

QLearner::QLearner ( int  numactions,
float  gamma,
float  initialvalue,
float  alpha,
float  epsilon,
Random  rng = Random()
)

Standard constructor.

Parameters:
numactions    The number of possible actions
gamma         The discount factor
initialvalue  The initial value of each Q(s,a)
alpha         The learning rate
epsilon       The probability of taking a random action
rng           Initial state of the random number generator to use

Definition at line 4 of file

QLearner::QLearner ( const QLearner & )

Unimplemented copy constructor: the internal state cannot simply be copied.

QLearner::~QLearner ( ) [virtual]

Definition at line 17 of file

Member Function Documentation

QLearner::state_t QLearner::canonicalize ( const std::vector< float > &  s) [protected]

Produces a canonical representation of the given sensation.

Parameters:
s  The current sensation from the environment.

Returns:
A pointer to an equivalent state in statespace.

Definition at line 102 of file
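
One way such a canonicalization could work is sketched below, assuming that statespace is a std::set of sensations as listed under Private Attributes. The free function shown here takes the set as an explicit argument and is a hypothetical stand-in, not the member function's actual body.

```cpp
#include <set>
#include <vector>

typedef const std::vector<float> *state_t;

// Hypothetical stand-alone version of the idea: map a sensation to the
// address of the equal element stored in the set.
inline state_t canonicalize(std::set<std::vector<float> > &statespace,
                            const std::vector<float> &s) {
  // insert() returns an iterator to the newly inserted element, or to the
  // already-present equal element if s was seen before.
  std::set<std::vector<float> >::iterator it = statespace.insert(s).first;
  // std::set is node-based, so this address stays valid as the set grows.
  return &*it;
}
```

Because equal sensations always yield the same pointer, the pointer can be used directly as a cheap map key, which avoids comparing whole vectors on every value lookup.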

int QLearner::first_action ( const std::vector< float > &  s) [virtual]

Implements Agent.

Definition at line 19 of file

float QLearner::getValue ( std::vector< float >  state)

Definition at line 201 of file

void QLearner::last_action ( float  r) [virtual]

Implements Agent.

Definition at line 92 of file

void QLearner::loadPolicy ( const char *  filename)

Definition at line 266 of file

void QLearner::logValues ( ofstream *  of,
int  xmin,
int  xmax,
int  ymin,
int  ymax
)

Definition at line 185 of file

int QLearner::next_action ( float  r,
const std::vector< float > &  s 
) [virtual]

Implements Agent.

Definition at line 53 of file

void QLearner::printState ( const std::vector< float > &  s)

Definition at line 141 of file

std::vector< float >::iterator QLearner::random_max_element ( std::vector< float >::iterator  start,
std::vector< float >::iterator  end
)

Definition at line 116 of file

void QLearner::savePolicy ( const char *  filename) [virtual]

Reimplemented from Agent.

Definition at line 235 of file

void QLearner::seedExp ( std::vector< experience >  seeds) [virtual]

Reimplemented from Agent.

Definition at line 149 of file

void QLearner::setDebug ( bool  d) [virtual]

Implements Agent.

Definition at line 136 of file

Member Data Documentation

Definition at line 86 of file QLearner.hh.

const float QLearner::alpha [private]

Definition at line 81 of file QLearner.hh.

float* QLearner::currentq [private]

Definition at line 84 of file QLearner.hh.

float QLearner::epsilon

Definition at line 52 of file QLearner.hh.

const float QLearner::gamma [private]

Definition at line 78 of file QLearner.hh.

const float QLearner::initialvalue [private]

Definition at line 80 of file QLearner.hh.

const int QLearner::numactions [private]

Definition at line 77 of file QLearner.hh.

std::map<state_t, std::vector<float> > QLearner::Q [private]

The primary data structure of the learning algorithm, the value function Q. For state_t s and int a, Q[s][a] gives the learned maximum expected future discounted reward conditional on executing action a in state s.

Definition at line 75 of file QLearner.hh.
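
The shape of this table can be illustrated with a small sketch. The QTable struct and its row, at and value helpers are hypothetical, and creating a row filled with initialvalue on a state's first visit is an assumption about how unseen states are handled, not something the source confirms.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

typedef const std::vector<float> *state_t;

// Hypothetical stand-in for the Q member and the two constants it relies on.
struct QTable {
  int numactions;
  float initialvalue;
  std::map<state_t, std::vector<float> > Q;

  QTable(int n, float init) : numactions(n), initialvalue(init) {
    assert(n > 0);
  }

  // Action values for state s, created with initialvalue on first visit.
  std::vector<float> &row(state_t s) {
    std::map<state_t, std::vector<float> >::iterator it = Q.find(s);
    if (it == Q.end()) {
      it = Q.insert(std::make_pair(
                        s, std::vector<float>(numactions, initialvalue)))
               .first;
    }
    return it->second;
  }

  // Q[s][a] with a bounds check on the action index.
  float &at(state_t s, int a) {
    assert(a >= 0 && a < numactions);
    return row(s)[a];
  }

  // Greedy state value: max over actions of Q[s][a].
  float value(state_t s) {
    std::vector<float> &q = row(s);
    return *std::max_element(q.begin(), q.end());
  }
};
```

The keys are canonical pointers, so two lookups refer to the same row only if they pass the same address; a sensation has to go through canonicalize before it is used as a key.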

Random QLearner::rng [private]

Definition at line 83 of file QLearner.hh.

std::set<std::vector<float> > QLearner::statespace [private]

Set of all distinct sensations seen. Pointers to elements of this set serve as the internal representation of the environment state.

Definition at line 69 of file QLearner.hh.

The documentation for this class was generated from the following files:

Author(s): Todd Hester
autogenerated on Thu Jun 6 2019 22:00:14