Item | Do |
1 | Initialize: gn, Iterations, h, parameters r and v. |
2 | Evaluate the initial policy through (16). |
3 | Update the parameters r via (17). |
4 | Compute the functions Q(·) via (19). |
5 | Update the policy using (20). |
6 | Update the parameters v via (22). |
7 | Update function gn. |
8 | Go to item 2 and repeat items 2 through 7 until the specified number of Iterations is completed. |
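
The control flow of the table above can be sketched as a short loop. This is only an illustrative skeleton, not the authors' implementation: equations (16) to (22) are not reproduced here, so each item is represented by a caller-supplied function, and the names `ITEM_ORDER`, `run_iterations`, `steps`, and `state` are hypothetical. Item 1 (initialization) is assumed to have been done by the caller before the loop starts.

```python
from typing import Any, Callable, Dict

# Hypothetical labels for items 2-7 of the table, in execution order.
ITEM_ORDER = (
    "evaluate_policy",  # item 2, via (16)
    "update_r",         # item 3, via (17)
    "compute_q",        # item 4, via (19)
    "update_policy",    # item 5, via (20)
    "update_v",         # item 6, via (22)
    "update_gn",        # item 7
)

Step = Callable[[Dict[str, Any]], None]


def run_iterations(
    state: Dict[str, Any], steps: Dict[str, Step], iterations: int
) -> Dict[str, Any]:
    """Apply items 2-7 in order, `iterations` times (item 8).

    `state` holds the quantities initialized in item 1 (gn, h, r, v);
    each function in `steps` is assumed to update `state` in place.
    """
    for _ in range(iterations):
        for name in ITEM_ORDER:
            steps[name](state)
    return state
```

Passing the steps in as functions keeps the loop independent of the particular form of each update, so any one equation can be swapped out or tested in isolation.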