#include <rl_common/Random.h>
#include <rl_common/core.hh>
#include <rl_common/ExperienceFile.hh>
#include "../Models/FactoredModel.hh"
#include "../Models/C45Tree.hh"
#include <set>
#include <vector>
#include <map>
#include <sstream>
#include <deque>

Classes
| class | PO_ParallelETUCT |
| struct | PO_ParallelETUCT::state_info |
| struct | PO_ParallelETUCT::state_samples |
Functions
| void * | poParallelModelLearningStart (void *arg) |
| void * | poParallelSearchStart (void *arg) |
Defines my real-time model-based RL architecture, which uses UCT with eligibility traces for planning. This version of UCT plans over states augmented with k-action histories. The modified version of UCT used is presented in: L. Kocsis and C. Szepesvári, "Bandit based Monte-Carlo planning," in ECML-06, number 4212 in LNCS, Springer, 2006, pp. 282-293. The real-time architecture is presented in: Hester, Quinlan, and Stone, "A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control", arXiv 1105.1749, 2011.
Definition in file PO_ParallelETUCT.hh.
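As a rough illustration of planning over states augmented with k-action histories, the sketch below keeps the last k actions in a std::deque and appends them to a state feature vector. The names recordAction and augmentState, the float encoding of actions, and the zero padding used before k actions have been taken are all assumptions made for this example; they are not taken from PO_ParallelETUCT.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper: remember an action, keeping only the k most recent.
void recordAction(std::deque<int>& history, int action, std::size_t k) {
  history.push_back(action);
  while (history.size() > k) {
    history.pop_front();  // drop the oldest action
  }
}

// Hypothetical helper: build the augmented state by appending exactly k
// history slots to the raw state features, oldest action first.
std::vector<float> augmentState(const std::vector<float>& state,
                                const std::deque<int>& history,
                                std::size_t k) {
  std::vector<float> augmented(state);
  augmented.reserve(state.size() + k);

  // Assumption: slots for actions not yet taken are padded with 0.
  for (std::size_t i = history.size(); i < k; ++i) {
    augmented.push_back(0.0f);
  }

  // Copy at most the k most recent actions.
  std::size_t start = history.size() > k ? history.size() - k : 0;
  for (std::size_t i = start; i < history.size(); ++i) {
    augmented.push_back(static_cast<float>(history[i]));
  }
  return augmented;
}
```

With k = 2, a one-feature state and the action sequence 1, 2, 3, the augmented state holds the original feature followed by the two most recent actions, 2 and 3. A planner can then treat two situations with identical observations but different recent actions as distinct states.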
void* poParallelModelLearningStart(void *arg)
Thread that loops, continually updating the model with new experiences.
Definition at line 534 of file PO_ParallelETUCT.cc.
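The void* (void*) signature matches what pthread_create expects for a thread start routine. The sketch below shows one way such a model-learning loop could be structured; it is not the code in PO_ParallelETUCT.cc. The Experience and ModelWorker types, the pending queue, the stop flag, and the updates counter are hypothetical stand-ins, and the real model update is replaced by a counter increment.

```cpp
#include <pthread.h>
#include <deque>

// Hypothetical experience record (not the rl_common type).
struct Experience { int state; int action; float reward; };

// Hypothetical shared data between the agent thread and the model thread.
struct ModelWorker {
  pthread_mutex_t lock;
  pthread_cond_t ready;            // signalled when work arrives or stop is set
  std::deque<Experience> pending;  // experiences waiting to be learned
  bool stop;
  int updates;                     // stand-in for real model updates

  ModelWorker() : stop(false), updates(0) {
    pthread_mutex_init(&lock, nullptr);
    pthread_cond_init(&ready, nullptr);
  }
  ~ModelWorker() {
    pthread_cond_destroy(&ready);
    pthread_mutex_destroy(&lock);
  }
};

// Start routine: loop until stop is requested and the queue is drained.
void* modelLearningLoop(void* arg) {
  ModelWorker* w = static_cast<ModelWorker*>(arg);
  for (;;) {
    pthread_mutex_lock(&w->lock);
    while (w->pending.empty() && !w->stop) {
      pthread_cond_wait(&w->ready, &w->lock);  // sleep instead of spinning
    }
    if (w->pending.empty()) {  // stop requested and nothing left to learn
      pthread_mutex_unlock(&w->lock);
      return nullptr;
    }
    Experience e = w->pending.front();
    w->pending.pop_front();
    ++w->updates;
    pthread_mutex_unlock(&w->lock);

    // A real implementation would update the model with e here, outside
    // the lock, so the agent thread is not blocked while learning.
    (void)e;
  }
}
```

In this sketch the agent thread would start the loop with pthread_create(&tid, nullptr, modelLearningLoop, &worker), push experiences onto pending while holding lock, and call pthread_cond_signal(&worker.ready) after each push. To shut down, it sets stop under the lock, signals ready, and calls pthread_join.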
void* poParallelSearchStart(void *arg)
Parallel thread that continually runs UCT search from the agent's current state.
Definition at line 1114 of file PO_ParallelETUCT.cc.
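For orientation, each step of a UCT rollout picks an action with the UCB1 rule from the Kocsis and Szepesvári paper cited above: the action maximising Q(s,a) + c * sqrt(ln N(s) / N(s,a)). The sketch below implements that standard rule only. It does not reproduce the modified UCT or the eligibility traces used in PO_ParallelETUCT, and the ActionStats type, the selectUcb1 name, and the exploration constant c are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical per-action statistics for one state in the search tree.
struct ActionStats {
  double q;    // current value estimate Q(s,a)
  int visits;  // number of times the action was tried, N(s,a)
};

// Standard UCB1 selection. stateVisits is N(s), the visit count of the state.
// Returns the index of the chosen action (0 if the vector is empty).
std::size_t selectUcb1(const std::vector<ActionStats>& actions,
                       int stateVisits, double c) {
  std::size_t best = 0;
  double bestScore = -std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < actions.size(); ++i) {
    if (actions[i].visits == 0) {
      return i;  // try every action once before comparing bounds
    }
    double bonus = c * std::sqrt(std::log(static_cast<double>(stateVisits)) /
                                 actions[i].visits);
    double score = actions[i].q + bonus;
    if (score > bestScore) {
      bestScore = score;
      best = i;
    }
  }
  return best;
}
```

With c = 0 the rule is purely greedy. A larger c favours rarely tried actions, which is what lets a continually running search thread keep refining its estimates rather than committing early to one branch.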