#include <QLearner.hh>

Public Member Functions
| virtual int | first_action (const std::vector< float > &s) |
| float | getValue (std::vector< float > state) |
| virtual void | last_action (float r) |
| void | loadPolicy (const char *filename) |
| void | logValues (ofstream *of, int xmin, int xmax, int ymin, int ymax) |
| virtual int | next_action (float r, const std::vector< float > &s) |
| void | printState (const std::vector< float > &s) |
| | QLearner (int numactions, float gamma, float initialvalue, float alpha, float epsilon, Random rng=Random()) |
| | QLearner (const QLearner &) |
| std::vector< float >::iterator | random_max_element (std::vector< float >::iterator start, std::vector< float >::iterator end) |
| virtual void | savePolicy (const char *filename) |
| virtual void | seedExp (std::vector< experience >) |
| virtual void | setDebug (bool d) |
| virtual | ~QLearner () |
Public Attributes
| float | epsilon |
Protected Types
| typedef const std::vector < float > * | state_t |
Protected Member Functions
| state_t | canonicalize (const std::vector< float > &s) |
Private Attributes
| bool | ACTDEBUG |
| const float | alpha |
| float * | currentq |
| const float | gamma |
| const float | initialvalue |
| const int | numactions |
| std::map< state_t, std::vector < float > > | Q |
| Random | rng |
| std::set< std::vector< float > > | statespace |
Agent that uses plain tabular Q-learning: it does not generalize across states, and it explores with an epsilon-greedy policy.
Definition at line 17 of file QLearner.hh.
typedef const std::vector<float>* QLearner::state_t [protected] |
The implementation maps all sensations to a set of canonical pointers, which serve as the internal representation of environment state.
Definition at line 58 of file QLearner.hh.
| QLearner::QLearner | ( | int numactions, float gamma, float initialvalue, float alpha, float epsilon, Random rng = Random() | ) |
Standard constructor.
| numactions | The number of possible actions |
| gamma | The discount factor |
| initialvalue | The initial value of each Q(s,a) |
| alpha | The learning rate |
| epsilon | The probability of taking a random action |
| rng | Initial state of the random number generator to use |
Definition at line 4 of file QLearner.cc.
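The role of epsilon ("the probability of taking a random action") can be illustrated with a minimal sketch of epsilon-greedy selection over one row of action values. The helper name `epsilon_greedy` is hypothetical and not part of QLearner, and `std::mt19937` stands in for the `Random` class, whose interface is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper, not part of QLearner: picks an action index
// epsilon-greedily from the action values of a single state.
int epsilon_greedy(const std::vector<float>& qvalues, float epsilon,
                   std::mt19937& gen) {
    assert(!qvalues.empty());
    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    if (coin(gen) < epsilon) {
        // Explore: choose uniformly among all actions.
        std::uniform_int_distribution<int> pick(
            0, static_cast<int>(qvalues.size()) - 1);
        return pick(gen);
    }
    // Exploit: choose a maximal action (this sketch resolves ties
    // to the lowest index, which the real class may not do).
    return static_cast<int>(
        std::max_element(qvalues.begin(), qvalues.end()) - qvalues.begin());
}
```

With epsilon set to 0 the sketch is purely greedy; with epsilon set to 1 it behaves as a uniform random policy.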
| QLearner::QLearner | ( | const QLearner & | ) |
Unimplemented copy constructor: internal state cannot be simply copied, since Q is keyed by pointers into this object's own statespace.
| QLearner::~QLearner | ( | ) | [virtual] |
Definition at line 17 of file QLearner.cc.
| QLearner::state_t QLearner::canonicalize | ( | const std::vector< float > & | s | ) | [protected] |
Produces a canonical representation of the given sensation.
| s | The current sensation from the environment. |
Definition at line 102 of file QLearner.cc.
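One way such a canonical representation could be produced is sketched below, assuming the sensation is inserted into a `std::set` and the address of the stored element is returned. The class name `StateTable` is hypothetical; the actual body of `canonicalize` in QLearner.cc is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for the statespace/canonicalize pair.
class StateTable {
public:
    typedef const std::vector<float>* state_t;

    // std::set::insert returns an iterator to the already-stored element
    // when s is present, so equal sensations always map to the same pointer.
    state_t canonicalize(const std::vector<float>& s) {
        return &*statespace.insert(s).first;
    }

    std::size_t size() const { return statespace.size(); }

private:
    // Node-based container: element addresses stay valid across inserts.
    std::set<std::vector<float> > statespace;
};
```

Because the pointers are stable and unique per distinct sensation, they can be compared and used as map keys without comparing whole vectors.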
| int QLearner::first_action | ( | const std::vector< float > & | s | ) | [virtual] |
Implements Agent.
Definition at line 19 of file QLearner.cc.
| float QLearner::getValue | ( | std::vector< float > | state | ) |
Definition at line 201 of file QLearner.cc.
| void QLearner::last_action | ( | float | r | ) | [virtual] |
Implements Agent.
Definition at line 92 of file QLearner.cc.
| void QLearner::loadPolicy | ( | const char * | filename | ) |
Definition at line 266 of file QLearner.cc.
| void QLearner::logValues | ( | ofstream *of, int xmin, int xmax, int ymin, int ymax | ) |
Definition at line 185 of file QLearner.cc.
| int QLearner::next_action | ( | float r, const std::vector< float > &s | ) | [virtual] |
Implements Agent.
Definition at line 53 of file QLearner.cc.
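The three Agent calls documented here (first_action, next_action, last_action) suggest an episode loop like the sketch below. The calling order is an assumption inferred from the names and signatures, not something this page states. `AgentLike`, `CountingAgent`, and `run_episode` are hypothetical, and the one-dimensional corridor environment is invented purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal interface mirroring the three calls documented here.
struct AgentLike {
    virtual ~AgentLike() {}
    virtual int first_action(const std::vector<float>& s) = 0;
    virtual int next_action(float r, const std::vector<float>& s) = 0;
    virtual void last_action(float r) = 0;
};

// Hypothetical stub: always picks action 0 and sums the rewards it receives.
struct CountingAgent : AgentLike {
    float total;
    int steps;
    CountingAgent() : total(0.0f), steps(0) {}
    int first_action(const std::vector<float>&) { steps = 1; return 0; }
    int next_action(float r, const std::vector<float>&) {
        total += r;
        ++steps;
        return 0;
    }
    void last_action(float r) { total += r; }
};

// Runs one episode on an invented corridor: action 0 moves right, any other
// action moves left, every step yields reward -1, and reaching position
// `goal` ends the episode. Returns the undiscounted sum of rewards.
float run_episode(AgentLike& agent, int goal) {
    assert(goal > 0);
    std::vector<float> s(1, 0.0f);
    float ret = 0.0f;
    int a = agent.first_action(s);          // first sensation, no reward yet
    while (true) {
        s[0] += (a == 0) ? 1.0f : -1.0f;
        float r = -1.0f;
        ret += r;
        if (s[0] >= goal) {
            agent.last_action(r);           // terminal reward, no next state
            break;
        }
        a = agent.next_action(r, s);        // reward plus next sensation
    }
    return ret;
}
```

Note that the loop only terminates if the agent eventually reaches the goal; a real driver would normally cap the number of steps.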
| void QLearner::printState | ( | const std::vector< float > & | s | ) |
Definition at line 141 of file QLearner.cc.
| std::vector< float >::iterator QLearner::random_max_element | ( | std::vector< float >::iterator start, std::vector< float >::iterator end | ) |
Definition at line 116 of file QLearner.cc.
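This member has no description here. Judging only from its name and signature, it plausibly returns an iterator to a maximal element with ties broken at random. The sketch below shows one way to do that under this assumption, using reservoir sampling over the tied elements. The function name `random_max_element_sketch` and the explicit `std::mt19937` parameter are hypothetical; the real member presumably uses the class's own `rng`.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical sketch: returns an iterator to one of the maximal elements
// in [start, end), chosen uniformly among ties. Returns end if the range
// is empty.
std::vector<float>::iterator random_max_element_sketch(
        std::vector<float>::iterator start,
        std::vector<float>::iterator end,
        std::mt19937& gen) {
    if (start == end) return end;
    std::vector<float>::iterator best = start;
    int ties = 1;  // number of elements seen so far that equal *best
    for (std::vector<float>::iterator it = start + 1; it != end; ++it) {
        if (*it > *best) {
            best = it;   // strictly better: restart the tie count
            ties = 1;
        } else if (*it == *best) {
            ++ties;
            // Reservoir sampling: replace the pick with probability 1/ties.
            std::uniform_int_distribution<int> pick(0, ties - 1);
            if (pick(gen) == 0) best = it;
        }
    }
    return best;
}
```

Random tie-breaking matters early in learning, when every Q(s,a) still equals initialvalue and a deterministic maximum would always select the same action.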
| void QLearner::savePolicy | ( | const char * | filename | ) | [virtual] |
Reimplemented from Agent.
Definition at line 235 of file QLearner.cc.
| void QLearner::seedExp | ( | std::vector< experience > | seeds | ) | [virtual] |
Reimplemented from Agent.
Definition at line 149 of file QLearner.cc.
| void QLearner::setDebug | ( | bool | d | ) | [virtual] |
Implements Agent.
Definition at line 136 of file QLearner.cc.
bool QLearner::ACTDEBUG [private] |
Definition at line 86 of file QLearner.hh.
const float QLearner::alpha [private] |
Definition at line 81 of file QLearner.hh.
float* QLearner::currentq [private] |
Definition at line 84 of file QLearner.hh.
| float QLearner::epsilon |
Definition at line 52 of file QLearner.hh.
const float QLearner::gamma [private] |
Definition at line 78 of file QLearner.hh.
const float QLearner::initialvalue [private] |
Definition at line 80 of file QLearner.hh.
const int QLearner::numactions [private] |
Definition at line 77 of file QLearner.hh.
std::map<state_t, std::vector<float> > QLearner::Q [private] |
The primary data structure of the learning algorithm, the value function Q. For state_t s and int a, Q[s][a] gives the learned estimate of the maximum expected discounted future reward, conditional on executing action a in state s.
Definition at line 75 of file QLearner.hh.
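As an illustration of how a table of this shape is typically updated, the sketch below applies the textbook Q-learning rule, Q[s][a] += alpha * (r + gamma * max over a' of Q[s'][a'] - Q[s][a]). The helpers `q_row` and `q_update` are hypothetical, and creating missing rows filled with initialvalue is an assumption about how unseen states are handled; the actual update code in QLearner.cc is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

typedef const std::vector<float>* state_t;
typedef std::map<state_t, std::vector<float> > QTable;

// Hypothetical helper: returns the row of action values for s, creating it
// filled with initialvalue if the state has not been seen before.
std::vector<float>& q_row(QTable& Q, state_t s, int numactions,
                          float initialvalue) {
    QTable::iterator it = Q.find(s);
    if (it == Q.end()) {
        it = Q.insert(std::make_pair(
                 s, std::vector<float>(numactions, initialvalue))).first;
    }
    return it->second;
}

// Textbook one-step Q-learning update for the transition (s, a, r, next).
void q_update(QTable& Q, state_t s, int a, float r, state_t next,
              int numactions, float initialvalue, float alpha, float gamma) {
    assert(a >= 0 && a < numactions);
    std::vector<float>& nextrow = q_row(Q, next, numactions, initialvalue);
    // Bootstrapped target: immediate reward plus discounted best next value.
    float target =
        r + gamma * *std::max_element(nextrow.begin(), nextrow.end());
    float& q = q_row(Q, s, numactions, initialvalue)[a];
    q += alpha * (target - q);  // move the estimate toward the target
}
```

For a terminal transition, as reported through last_action, there is no next state, so the target would reduce to r alone.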
Random QLearner::rng [private] |
Definition at line 83 of file QLearner.hh.
std::set<std::vector<float> > QLearner::statespace [private] |
Set of all distinct sensations seen. Pointers to elements of this set serve as the internal representation of the environment state.
Definition at line 69 of file QLearner.hh.