Rollout Allocation Strategies for Classification-based Policy Iteration (Programming)
Conducted by: Victor GABILLON
Description: Reinforcement Learning. In classification-based policy iteration algorithms, (1) the value function is estimated through rollouts over a finite set of states, called the rollout set, and over the actions in the action space, in order to (2) learn an improved policy through a classifier. The choice of rollout allocation strategy over states and actions has a significant impact on the performance and computation time of this class of algorithms. In this paper, we present new strategies for allocating the available budget (number of rollouts) at each iteration of the algorithm over states, actions, and state-action pairs. Our empirical results indicate that, for a fixed budget, the proposed strategies improve the accuracy of the training set compared with existing methods.
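As a rough illustration of the two steps described above, the following Python sketch builds one iteration's training set using a plain uniform allocation, where the budget is split evenly over all state-action pairs. This is only a baseline under stated assumptions, not the strategies proposed in the paper: the toy chain environment and every name here (`rollout`, `uniform_allocation`, `build_training_set`, `noisy_chain_step`) are hypothetical and chosen for illustration.

```python
import random


def rollout(step, policy, state, action, horizon, gamma, rng):
    """Discounted return of one rollout: take `action` in `state`, then follow `policy`."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        state, reward = step(state, action, rng)
        total += discount * reward
        discount *= gamma
        action = policy(state)
    return total


def uniform_allocation(rollout_set, actions, budget):
    """Baseline (assumed here): split the budget evenly over all state-action pairs."""
    per_pair = budget // (len(rollout_set) * len(actions))
    if per_pair < 1:
        raise ValueError("budget too small for one rollout per state-action pair")
    return {(s, a): per_pair for s in rollout_set for a in actions}


def build_training_set(step, policy, rollout_set, actions, budget,
                       horizon=20, gamma=0.95, seed=0):
    """Estimate action values by rollouts and label each state with its greedy action."""
    rng = random.Random(seed)
    allocation = uniform_allocation(rollout_set, actions, budget)
    training_set = []
    for s in rollout_set:
        q_hat = {}
        for a in actions:
            n = allocation[(s, a)]
            returns = [rollout(step, policy, s, a, horizon, gamma, rng)
                       for _ in range(n)]
            q_hat[a] = sum(returns) / n  # Monte Carlo estimate of the action value
        # The (state, estimated best action) pair is one classifier training example.
        training_set.append((s, max(actions, key=q_hat.get)))
    return training_set


def noisy_chain_step(state, action, rng, n_states=5, slip=0.1):
    """Hypothetical toy chain: the move is reversed with probability `slip`;
    the reward equals the index of the state reached."""
    if rng.random() < slip:
        action = -action
    nxt = min(max(state + action, 0), n_states - 1)
    return nxt, float(nxt)


# Usage: current policy always moves left; budget of 200 rollouts in total.
training_set = build_training_set(
    noisy_chain_step, lambda s: -1, list(range(5)), [-1, 1], budget=200)
print(training_set)
```

A classifier trained on `training_set` would then give the improved policy. An adaptive strategy would replace `uniform_allocation` with a rule that gives more rollouts to the states or actions whose estimates are hardest to separate, which is the kind of choice the description refers to.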
Tools used: No information
Shared by: Victor GABILLON
Last update: 2010-06-10 14:46:52