discount¶
Note
Discount factors are only implemented in the learning mechanisms Guided associative learning, Expected SARSA, Q-learning, and Actor-critic.
Specifies the discount factor, written as \(\gamma\) in the equations for memory updates, that tells how important future rewards are to the current state. The discount factor is a value between 0 and 1. A reinforcement value \(u\) that occurs \(N\) steps in the future from the current state, is multiplied by \(\gamma^N\) to describe its importance to the current state. For example, consider \(\gamma=0.9\) and a reinforcement value \(u=10\) that is 3 steps ahead of the current state. The importance of this reward to the subject from where it stands is equal to \(10 \cdot 0.9^3 = 7.29\).
The value of the parameter discount
is used in the updating equations described in the mechanisms.
Syntax¶
discount = v
where v
is a scalar expression.
Description¶
discount = v
sets the discount factor to v
.
Examples¶
discount = 0.5
sets the discount factor to 0.5.
@variables x = 0.5
discount = x + 0.1
sets the discount factor to 0.6.