qnet Namespace Reference

Classes

class  QValueNetwork
 — Q-value networks
 

Functions

def disturb (u, i)
 
def onehot (ix, n=NX)
 
def rendertrial (maxiter=100)
 

Variables

float DECAY_RATE = 0.99
 
 env = DPendulum()
 — Environment
 
 feed_dict
 
list h_rwd = []
 — History of search
 
float LEARNING_RATE = 0.1
 
int NEPISODES = 500
 — Hyperparameters
 
int NSTEPS = 50
 
 NU = env.nu
 
 NX = env.nx
 
 optim
 
 Q2 = sess.run(qvalue.qvalue,feed_dict={ qvalue.x: onehot(x2) })
 
 Qref = sess.run(qvalue.qvalue,feed_dict={ qvalue.x: onehot(x ) })
 
 qvalue = QValueNetwork()
 
 RANDOM_SEED = int((time.time()%10)*1000)
 — Random seed
 
 reward
 
float rsum = 0.0
 
 sess = tf.InteractiveSession()
 
 u = sess.run(qvalue.u,feed_dict={ qvalue.x: onehot(x) })[0]
 
 x = env.reset()
 — Training
 
 x2
 

Detailed Description

Example of Q-table learning with a simple discretized 1-pendulum environment using a linear Q network.
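
A linear Q network on a discretized state behaves like a Q-table because multiplying a one-hot state vector by a weight matrix selects one row of that matrix. The following is a minimal NumPy sketch of that idea, not the code in qnet.py; the sizes and the names `W`, `x_onehot`, `q_row`, and `u_greedy` are assumptions made for illustration.

```python
import numpy as np

NX, NU = 5, 3                         # hypothetical state and control counts
rng = np.random.default_rng(0)
W = rng.uniform(0.0, 0.01, (NX, NU))  # weights of the linear layer (hypothetical name)

x = 2                                 # index of the current discrete state
x_onehot = np.zeros((1, NX))
x_onehot[0, x] = 1.0

q_row = x_onehot @ W                  # Q-values of all controls in state x, shape (1, NU)
u_greedy = int(np.argmax(q_row))      # control with the largest Q-value
```

Here `q_row` equals row `x` of `W`, so training the weights of the linear layer is the same as updating the entries of a table indexed by state and control.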

Function Documentation

◆ disturb()

def qnet.disturb (   u,
  i 
)

Definition at line 58 of file qnet.py.

◆ onehot()

def qnet.onehot (   ix,
  n = NX 
)
Return a vector of length n that is 0 everywhere except at index ix, which is set to 1.

Definition at line 54 of file qnet.py.
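
As an illustration of the behavior described above, here is a minimal NumPy sketch. It is not copied from qnet.py: the 1-by-n row shape, the float32 dtype, and the value `NX = 4` are assumptions (in qnet, NX comes from env.nx).

```python
import numpy as np

NX = 4  # hypothetical state count; in qnet this value comes from env.nx

def onehot(ix, n=NX):
    """Return a 1-by-n row vector that is 0 everywhere except a 1 at index ix."""
    vec = np.zeros((1, n), dtype=np.float32)
    vec[0, ix] = 1.0
    return vec

v = onehot(2)  # row vector with the 1 in the third position
```

Returning a row with a leading batch dimension is a choice made here so that the result could be fed directly to a network input expecting one sample per row.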

◆ rendertrial()

def qnet.rendertrial (   maxiter = 100)

Definition at line 62 of file qnet.py.

Variable Documentation

◆ DECAY_RATE

float qnet.DECAY_RATE = 0.99

Definition at line 23 of file qnet.py.

◆ env

qnet.env = DPendulum()

— Environment

Definition at line 26 of file qnet.py.

◆ feed_dict

qnet.feed_dict

Definition at line 90 of file qnet.py.

◆ h_rwd

list qnet.h_rwd = []

— History of search

Definition at line 72 of file qnet.py.

◆ LEARNING_RATE

float qnet.LEARNING_RATE = 0.1

Definition at line 22 of file qnet.py.

◆ NEPISODES

int qnet.NEPISODES = 500

— Hyperparameters

Definition at line 20 of file qnet.py.

◆ NSTEPS

int qnet.NSTEPS = 50

Definition at line 21 of file qnet.py.

◆ NU

qnet.NU = env.nu

Definition at line 28 of file qnet.py.

◆ NX

qnet.NX = env.nx

Definition at line 27 of file qnet.py.

◆ optim

qnet.optim

Definition at line 90 of file qnet.py.

◆ Q2

qnet.Q2 = sess.run(qvalue.qvalue,feed_dict={ qvalue.x: onehot(x2) })

Definition at line 85 of file qnet.py.

◆ Qref

qnet.Qref = sess.run(qvalue.qvalue,feed_dict={ qvalue.x: onehot(x ) })

Definition at line 86 of file qnet.py.
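
Q2 and Qref hold the network outputs at the next state x2 and at the current state x. In standard Q-learning, such values are typically combined into a regression target by overwriting the entry of the chosen control with the reward plus the discounted maximum of the next-state values. The sketch below shows that update with plain arrays standing in for the sess.run results. All numbers are made up, and this page does not state that qnet.py performs exactly this step.

```python
import numpy as np

DECAY_RATE = 0.99                    # discount factor, same value as qnet.DECAY_RATE
Qref = np.array([[0.2, 0.5, 0.1]])   # made-up Q-values at the current state x
Q2 = np.array([[0.0, 0.3, 0.4]])     # made-up Q-values at the next state x2
u, reward = 1, 1.0                   # made-up control index and reward

# Only the entry of the control actually taken is moved to the TD target.
Qref[0, u] = reward + DECAY_RATE * np.max(Q2)
```

The other entries of `Qref` are left unchanged, so a regression toward `Qref` only adjusts the value of the control that was applied.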

◆ qvalue

qnet.qvalue = QValueNetwork()

Definition at line 50 of file qnet.py.

◆ RANDOM_SEED

qnet.RANDOM_SEED = int((time.time()%10)*1000)

— Random seed

Definition at line 14 of file qnet.py.
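
The expression takes the current time modulo 10 seconds and scales it to milliseconds, so the seed is an integer in the range 0 to 9999 that changes from run to run. The sketch below applies the same arithmetic to an explicit timestamp; the helper name `seed_from_time` and the fixed timestamp are hypothetical and only serve the illustration.

```python
import time

def seed_from_time(t):
    """Apply the RANDOM_SEED arithmetic to an explicit timestamp t (hypothetical helper)."""
    return int((t % 10) * 1000)

fixed = seed_from_time(1687487916.25)  # made-up timestamp: 6.25 s into a 10 s window
live = seed_from_time(time.time())     # value depends on when the code runs
```

Replacing the time-based value with a constant would make runs repeatable, at the cost of exploring the same random sequence each time.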

◆ reward

qnet.reward

Definition at line 82 of file qnet.py.

◆ rsum

float qnet.rsum = 0.0

Definition at line 77 of file qnet.py.

◆ sess

qnet.sess = tf.InteractiveSession()

Definition at line 51 of file qnet.py.

◆ u

qnet.u = sess.run(qvalue.u,feed_dict={ qvalue.x: onehot(x) })[0]

Definition at line 80 of file qnet.py.

◆ x

qnet.x = env.reset()

— Training

Definition at line 76 of file qnet.py.

◆ x2

qnet.x2

Definition at line 82 of file qnet.py.



pinocchio
Author(s):
autogenerated on Fri Jun 23 2023 02:38:36