qnet Namespace Reference

Classes

class  QValueNetwork
 — Q-value networks
 

Functions

def disturb (u, i)
 
def onehot (ix, n=NX)
 
def rendertrial (maxiter=100)
 

Variables

float DECAY_RATE = 0.99
 
 env = DPendulum()
 — Environment
 
 feed_dict
 
list h_rwd = []
 — History of search
 
float LEARNING_RATE = 0.1
 
int NEPISODES = 500
 — Hyperparameters
 
int NSTEPS = 50
 
 NU = env.nu
 
 NX = env.nx
 
 optim
 
 Q2 = sess.run(qvalue.qvalue, feed_dict={qvalue.x: onehot(x2)})
 
 Qref = sess.run(qvalue.qvalue, feed_dict={qvalue.x: onehot(x)})
 
 qvalue = QValueNetwork()
 
 RANDOM_SEED = int((time.time() % 10) * 1000)
 — Random seed
 
 reward
 
float rsum = 0.0
 
 sess = tf.InteractiveSession()
 
 u = sess.run(qvalue.u, feed_dict={qvalue.x: onehot(x)})[0]
 
 x = env.reset()
 — Training
 
 x2
 

Detailed Description

Example of Q-learning on a simple discretized 1-pendulum environment, using a
linear Q-network in place of a Q-table.
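
The description above combines a discretized state with a linear network. The following is a minimal NumPy sketch of why that combination behaves like a table lookup. It assumes the network is a single weight matrix `W` with no bias; the actual `QValueNetwork` architecture is not shown on this page, and `q_values`, `nx`, and `nu` below are illustrative names and sizes only.

```python
import numpy as np

def q_values(W, ix):
    """Q-values of every control for discrete state ix under a linear model."""
    x = np.zeros((1, W.shape[0]))
    x[0, ix] = 1.0          # one-hot encoding of the state
    return x @ W            # shape (1, nu): selects row ix of W

nx, nu = 5, 3               # hypothetical sizes, not taken from DPendulum
W = np.arange(nx * nu, dtype=float).reshape(nx, nu)
q = q_values(W, 3)
greedy_u = int(np.argmax(q, axis=1)[0])   # greedy control for state 3
```

With a one-hot input, each row of `W` plays the role of one row of a Q-table, so training the weights amounts to updating table entries.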

Function Documentation

◆ disturb()

def qnet.disturb (   u,
  i 
)

Definition at line 68 of file qnet.py.

◆ onehot()

def qnet.onehot (   ix,
  n = NX 
)
Return a vector of length n that is 0 everywhere except at index ix, which is set to 1.

Definition at line 58 of file qnet.py.
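
A possible implementation of this encoding is sketched below. The `(1, n)` output shape is an assumption, suggested by the way the result is passed through `feed_dict` and the `[0]` indexing applied to `u`; the body of `onehot` in qnet.py is not reproduced on this page. The argument `n` is required here because `NX` comes from the environment.

```python
import numpy as np

def onehot(ix, n):
    """Return a (1, n) row vector of zeros with a 1.0 at index ix."""
    v = np.zeros((1, n), dtype=np.float32)
    v[0, ix] = 1.0
    return v

row = onehot(2, 5)   # batch of one state, encoded over 5 discrete values
```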

◆ rendertrial()

def qnet.rendertrial (   maxiter = 100)

Definition at line 73 of file qnet.py.

Variable Documentation

◆ DECAY_RATE

float qnet.DECAY_RATE = 0.99

Definition at line 24 of file qnet.py.

◆ env

qnet.env = DPendulum()

— Environment

Definition at line 27 of file qnet.py.

◆ feed_dict

qnet.feed_dict

Definition at line 107 of file qnet.py.

◆ h_rwd

list qnet.h_rwd = []

— History of search

Definition at line 89 of file qnet.py.

◆ LEARNING_RATE

float qnet.LEARNING_RATE = 0.1

Definition at line 23 of file qnet.py.

◆ NEPISODES

int qnet.NEPISODES = 500

— Hyperparameters

Definition at line 21 of file qnet.py.

◆ NSTEPS

int qnet.NSTEPS = 50

Definition at line 22 of file qnet.py.

◆ NU

qnet.NU = env.nu

Definition at line 29 of file qnet.py.

◆ NX

qnet.NX = env.nx

Definition at line 28 of file qnet.py.

◆ optim

qnet.optim

Definition at line 107 of file qnet.py.

◆ Q2

qnet.Q2 = sess.run(qvalue.qvalue, feed_dict={qvalue.x: onehot(x2)})

Definition at line 102 of file qnet.py.

◆ Qref

qnet.Qref = sess.run(qvalue.qvalue, feed_dict={qvalue.x: onehot(x)})

Definition at line 103 of file qnet.py.
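
`Q2` and `Qref` are the network outputs at the next state `x2` and the current state `x`. The sketch below shows how such values are commonly combined into a one-step Q-learning target. It assumes the standard update (reward plus discounted maximum of the next-state values); the exact statement used in qnet.py is not shown on this page, `td_target` is a hypothetical helper, and plain arrays stand in for the results of `sess.run`.

```python
import numpy as np

DECAY_RATE = 0.99                      # discount factor, as documented above

def td_target(Qref, Q2, u, reward, decay=DECAY_RATE):
    """Copy Qref and overwrite the entry of control u with the one-step target."""
    target = np.array(Qref, dtype=float, copy=True)
    target[0, u] = reward + decay * np.max(Q2)
    return target

Qref = np.array([[0.0, 1.0, 2.0]])     # stand-in for the network output at x
Q2 = np.array([[0.5, 3.0, 1.0]])       # stand-in for the network output at x2
target = td_target(Qref, Q2, u=1, reward=-1.0)
```

Only the entry for the control actually applied changes; the other entries keep the network's current prediction, so they contribute no error when the target is used as a regression reference.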

◆ qvalue

qnet.qvalue = QValueNetwork()

Definition at line 53 of file qnet.py.

◆ RANDOM_SEED

qnet.RANDOM_SEED = int((time.time() % 10) * 1000)

— Random seed

Definition at line 15 of file qnet.py.
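
The expression above takes the wall clock modulo 10 seconds at millisecond resolution, which gives an integer between 0 and 9999 that differs from run to run. The sketch below illustrates this and shows that reusing one seed value reproduces the same draws. How qnet.py passes the seed to its random generators is not shown on this page, so Python's `random` module is used here purely as a stand-in.

```python
import random
import time

RANDOM_SEED = int((time.time() % 10) * 1000)   # integer in [0, 9999]

random.seed(RANDOM_SEED)
first = random.random()
random.seed(RANDOM_SEED)
second = random.random()     # same seed, so the same draw as `first`
```

Logging the seed at start-up makes it possible to replay a particular run later by assigning that value directly.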

◆ reward

qnet.reward

Definition at line 99 of file qnet.py.

◆ rsum

float qnet.rsum = 0.0

Definition at line 94 of file qnet.py.

◆ sess

qnet.sess = tf.InteractiveSession()

Definition at line 54 of file qnet.py.

◆ u

qnet.u = sess.run(qvalue.u, feed_dict={qvalue.x: onehot(x)})[0]

Definition at line 97 of file qnet.py.

◆ x

qnet.x = env.reset()

— Training

Definition at line 93 of file qnet.py.

◆ x2

qnet.x2

Definition at line 99 of file qnet.py.



pinocchio
Author(s):
autogenerated on Sun Dec 22 2024 03:41:16