In this talk I will describe some new results for planning in Markovian Decision Processes where the planner can use a simulator of the process. The question is under what conditions we can guarantee the existence of efficient and effective planning algorithms. To deal with large state spaces, the common approach is to assume that the planner is given access to the components of some basis functions, which map either states or state-action pairs to reals, such that a linear combination of these basis functions approximates some value functions well. Depending on exactly what is assumed about the power of the basis functions and the way they can be accessed, efficient and effective planning may or may not be possible. In the talk I will give a high-level overview of this and discuss the state of the art. Intriguingly, tractability can depend on the number of actions, on whether the process is fully deterministic, and on a number of other factors that I will discuss.