Symbol

Meaning

$s \in S$

States.

$a \in A$

Actions.

$r \in R$

Rewards.

q

A transition function which gives the probability of moving from one state to another.

$s_t, a_t, r_t$

State, action, and reward at time step t for one trajectory.

$r(s_t, a_t, s_{t+1})$

Reward for taking action $a_t$ in state $s_t$ and moving to the new state $s_{t+1}$. Sometimes the notation $r(s, a, s')$ is used as well.

π

A reinforcement learning policy, which maps states to actions. $\pi_\theta(\cdot)$ is a policy parameterized by $\theta$.

$J(\pi)$

Cumulative reward for a policy.

H

Horizon: the number of time steps over which a reinforcement learning policy can take actions.

γ

Discount factor, which down-weights future rewards relative to immediate ones.

α

Step size hyperparameter.

σ

Hyperparameter that controls how much random variance, or “jitter”, is applied to the Natural Evolutionary Strategy populations.

$p_f$

Final portfolio value.

$R(\phi)$

Quantum Rotation gate.

$D(\alpha)$

Quantum Displacement gate.

$S(r)$

Quantum Squeezing gate.

$BS(\theta)$

Quantum Beamsplitter gate that allows for entanglement between two photons.

$K(\kappa)$

Quantum Kerr gate.
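
To make the roles of γ, H, α, and σ in the table concrete, the following is a minimal Python sketch, not the method used in this work. It assumes the common convention that the reward at step t is weighted by γ^t, and it uses a generic evolution-strategy update in which σ scales Gaussian perturbations of the policy parameters and α scales the resulting step. The function names `discounted_return` and `nes_step`, the `fitness` callable, and the `population` size are hypothetical and chosen only for illustration.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one trajectory.

    The length of `rewards` plays the role of the horizon H.
    Assumes the usual gamma**t weighting (an assumption, not from the table).
    """
    total = 0.0
    for t, r_t in enumerate(rewards):
        total += (gamma ** t) * r_t
    return total


def nes_step(theta, fitness, alpha, sigma, rng, population=50):
    """One generic evolution-strategy update of the parameter list `theta`.

    sigma: scale of the Gaussian "jitter" applied to each population member.
    alpha: step size applied to the estimated search gradient.
    fitness: hypothetical callable mapping a parameter list to a score.
    """
    # Draw one standard-normal noise vector per population member.
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(population)]
    # Score each perturbed copy of theta.
    scores = [
        fitness([th + sigma * e for th, e in zip(theta, eps)])
        for eps in noises
    ]
    # Subtract the mean score as a baseline to reduce variance.
    mean_score = sum(scores) / population
    grad = [
        sum((s - mean_score) * eps[i] for s, eps in zip(scores, noises))
        / (population * sigma)
        for i in range(len(theta))
    ]
    # Move theta along the estimated gradient, scaled by alpha.
    return [th + alpha * g for th, g in zip(theta, grad)]


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75
```

A larger σ explores more widely but gives a noisier gradient estimate, while α controls how far the parameters move per update. Both typically need tuning together.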