discount

Note

Discount factors are only implemented in the learning mechanisms Guided associative learning, Expected SARSA, Q-learning, and Actor-critic.

Specifies the discount factor, written as \(\gamma\) in the equations for memory updates, which determines how much future rewards matter in the current state. The discount factor is a value between 0 and 1. A reinforcement value \(u\) that occurs \(N\) steps ahead of the current state is multiplied by \(\gamma^N\) to give its importance to the current state. For example, consider \(\gamma=0.9\) and a reinforcement value \(u=10\) that is 3 steps ahead of the current state. The importance of this reward to the subject in its current state is \(10 \cdot 0.9^3 = 7.29\).
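The arithmetic in this example can be checked with a short Python sketch. This is not part of the simulator or its scripting language; the function name `discounted_value` and its argument names are hypothetical and chosen only for illustration.

```python
def discounted_value(u: float, gamma: float, n_steps: int) -> float:
    """Return the importance of a reinforcement value u that lies
    n_steps ahead, given a discount factor gamma (hypothetical helper)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    # Each step into the future multiplies the value by gamma once.
    return u * gamma ** n_steps


# The example from the text: gamma = 0.9, u = 10, 3 steps ahead.
value = discounted_value(10, 0.9, 3)
print(round(value, 2))  # 7.29
```

With \(\gamma=1\) the reward keeps its full value regardless of distance, and with \(\gamma=0\) any reward that is one or more steps ahead contributes nothing.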

The value of the parameter discount is used in the update equations given in the description of each mechanism.

Syntax

discount = v

where v is a scalar expression.

Description

discount = v sets the discount factor to v.

Examples

discount = 0.5

sets the discount factor to 0.5.

@variables x = 0.5
discount = x + 0.1

sets the discount factor to 0.6.
