POMCP

Partially Observable Monte Carlo Planning (POMCP) is an online planning algorithm designed for partially observable Markov decision processes (POMDPs). It combines Monte Carlo tree search with a particle-based representation of beliefs, enabling planning in large or continuous state spaces without requiring an explicit belief update formula.

The algorithm operates by interacting with a generative model of the environment that can sample state transitions and observations given a state and action. It builds a search tree where nodes correspond to action–observation histories. During simulations, POMCP uses a variant of the UCT (upper confidence bounds applied to trees) selection rule to explore promising actions, and it branches on possible observations to account for partial observability. Leaves of the search are rolled out with a default policy to estimate value, and values are backpropagated to inform action selection. The current belief is represented by a set of particles, which are propagated through the model and resampled in light of simulated observations.
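
To make these steps concrete, the following is a minimal sketch in Python, not a reference implementation: the toy `step` generative model (a tiger-style problem), the constant values, and all function names are illustrative assumptions chosen so the example runs end to end.

```python
import math
import random

# Hypothetical toy generative model (a tiger-style POMDP) so the sketch runs:
# states 0/1 = tiger behind the left/right door; actions 0=listen, 1/2=open a door.
def step(s, a):
    """Sample (next_state, observation, reward, done) -- all POMCP needs."""
    if a == 0:  # listening yields a noisy hint about the tiger's side
        o = s if random.random() < 0.85 else 1 - s
        return s, o, -1.0, False
    opened_tiger = (a == 1 and s == 0) or (a == 2 and s == 1)
    return s, 2, -100.0 if opened_tiger else 10.0, True

ACTIONS, GAMMA, UCB_C = (0, 1, 2), 0.95, 25.0

class Node:
    """One node per action-observation history reached during search."""
    def __init__(self):
        self.N, self.V, self.children = 0, 0.0, {}

def rollout(s, depth):
    """Default policy: uniform-random actions estimate a leaf's value."""
    if depth == 0:
        return 0.0
    s2, _, r, done = step(s, random.choice(ACTIONS))
    return r if done else r + GAMMA * rollout(s2, depth - 1)

def simulate(s, node, depth):
    """One simulation: UCT descent, observation branching, rollout, backup."""
    if depth == 0:
        return 0.0
    if not node.children:                      # first visit: expand the actions
        node.children = {a: Node() for a in ACTIONS}
        return rollout(s, depth)

    def ucb(a):                                # UCT selection rule
        c = node.children[a]
        if c.N == 0:
            return float("inf")
        return c.V + UCB_C * math.sqrt(math.log(node.N) / c.N)

    a = max(ACTIONS, key=ucb)
    s2, o, r, done = step(s, a)
    anode = node.children[a]
    total = r
    if not done:                               # branch on the sampled observation
        onode = anode.children.setdefault(o, Node())
        total += GAMMA * simulate(s2, onode, depth - 1)
    node.N += 1                                # backpropagate: visit counts and
    anode.N += 1                               # running-mean value estimates
    anode.V += (total - anode.V) / anode.N
    return total

def search(particles, n_sims=2000, depth=20):
    """Plan from a particle belief: each simulation starts at a sampled particle."""
    root = Node()
    for _ in range(n_sims):
        simulate(random.choice(particles), root, depth)
    return max(ACTIONS, key=lambda a: root.children[a].V)
```

On this toy model, a few thousand simulations typically favor listening until the sampled observations make one door clearly safer.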

Key advantages include scalability to large or continuous state spaces, because particle representations and simulation-based planning avoid enumerating the entire state space. POMCP also requires only a generative model, not closed-form solutions for belief updates or value functions. It is particularly effective in online, real-time planning tasks under uncertainty.
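
Because only sampling is required, even the belief update can be simulation-based. A sketch under the same assumptions as above (reusing the hypothetical `step` model): particles are pushed through the model and kept only when their simulated observation matches the one actually received, so no observation-likelihood formula is needed.

```python
def update_belief(particles, a, o_real, n=200):
    """Unweighted particle filter via rejection: no closed-form Bayes update."""
    new = []
    while len(new) < n:
        s = random.choice(particles)   # sample a state from the current belief
        s2, o, _, _ = step(s, a)       # push it through the generative model
        if o == o_real:                # resample: keep it only when the simulated
            new.append(s2)             # observation matches the real one
    return new
```

In practice rejection can starve when the received observation is unlikely under the current belief; the original paper addresses this with particle reinvigoration.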

Limitations include dependence on the quality of the generative model and on the computational budget for simulations and particles. Performance hinges on the number of particles, the number of simulations per decision, and the horizon length. It has found applications in robotics, autonomous systems, and other domains where planning under partial observability is essential.
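
Those three quantities appear directly as tunable budgets in an online planning loop. Continuing the hypothetical sketch above (the parameter values here are made up, not recommendations):

```python
# Made-up budget values; the right settings are domain-specific.
N_PARTICLES, N_SIMS, MAX_DEPTH = 500, 5000, 30

belief = [random.choice((0, 1)) for _ in range(N_PARTICLES)]  # uniform prior
state = random.choice((0, 1))                # hidden true state of the world
for t in range(50):                          # cap the episode length
    a = search(belief, n_sims=N_SIMS, depth=MAX_DEPTH)  # plan under the budget
    state, o, r, done = step(state, a)       # act in the real environment
    print(f"t={t} action={a} obs={o} reward={r}")
    if done:
        break
    belief = update_belief(belief, a, o, n=N_PARTICLES)
```

Raising N_SIMS or MAX_DEPTH buys better decisions at roughly linear cost per step, while too few particles makes the belief a poor approximation of the true posterior.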

POMCP was introduced by Silver and Veness in 2010 as a scalable method for large POMDPs, described in the paper "Monte Carlo Planning in Large POMDPs".