DOI: 10.1177/02783649261453546 ISSN: 0278-3649

Think fast and far: Long-horizon online POMDP planning via rapid state sampling

Yuanchu Liang, Edward Kim, J. Arden Knoll, Wil Thomason, Zachary Kingston, Lydia E. Kavraki, Hanna Kurniawati

Partially observable Markov decision processes (POMDPs) are a general and principled framework for motion planning under uncertainty. Despite tremendous improvement in the scalability of POMDP solvers, long-horizon POMDPs remain difficult to solve. To alleviate this difficulty, this paper proposes a new approximate online POMDP solver, called reference-based online POMDP planning via rapid state space sampling (ROP-RAS3). ROP-RAS3 uses novel, extremely fast sampling-based motion planning techniques to sample the state space and generate a diverse set of macro-actions online. These macro-actions are then used to bias belief-space sampling and infer high-quality policies without requiring exhaustive enumeration of the action space, a fundamental constraint for modern online POMDP solvers. ROP-RAS3 converges to a near-optimal reference-based solution at a rate that depends on the number of sampled actions, rather than the size of the action space. ROP-RAS3 is evaluated on various long-horizon POMDPs with up to 3000 lookahead steps and 35-dimensional state spaces, where the state, action, and observation spaces can be continuous, discrete, or a hybrid of discrete and continuous. Although the reference-based optimal solution may not be the same as the optimal POMDP solution, empirical results indicate that, in all of these problems, ROP-RAS3 achieves a success rate up to several times higher than that of other state-of-the-art methods. We also demonstrate the capability of our approach on a physical robot. This work extends the theory and empirical results of our ISRR 2024 paper. Code can be found at https://github.com/RDLLab/ROPRAS3.
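As a rough illustration of why planning over sampled macro-actions decouples planning cost from the size of the action space, the following toy sketch may help. It is not ROP-RAS3 and does not reflect the released code: the 1D corridor, the slip probability, the reward values, and the function names `sample_macro_actions`, `rollout_value`, and `plan` are all assumptions made for this example. A plain random target sampler stands in for the fast sampling-based motion planner, and a particle set stands in for the belief.

```python
import random


def sample_macro_actions(start, lo, hi, n, rng):
    """Sample n macro-actions (hypothetical stand-in for a motion planner).

    Each macro-action is a run of unit steps from `start` toward a
    uniformly sampled target state in [lo, hi].
    """
    macros = []
    for _ in range(n):
        target = rng.randint(lo, hi)
        step = 1 if target >= start else -1
        macros.append([step] * abs(target - start))
    return macros


def rollout_value(particle, macro, goal, rng, slip=0.1, discount=0.99):
    """Simulate one macro-action from one belief particle.

    Each primitive step costs 1 and fails (state unchanged) with
    probability `slip`. Ending on `goal` earns a reward of 100.
    """
    state, value, scale = particle, 0.0, 1.0
    for step in macro:
        if rng.random() >= slip:
            state += step
        value -= scale
        scale *= discount
    if state == goal:
        value += scale * 100.0
    return value


def plan(belief, goal, lo, hi, n_macros, rng):
    """Pick the best of n_macros sampled macro-actions for a particle belief.

    Work is proportional to n_macros * len(belief), regardless of how many
    primitive action sequences exist.
    """
    start = round(sum(belief) / len(belief))
    best, best_value = None, float("-inf")
    for macro in sample_macro_actions(start, lo, hi, n_macros, rng):
        value = sum(rollout_value(p, macro, goal, rng) for p in belief) / len(belief)
        if value > best_value:
            best, best_value = macro, value
    return best, best_value


if __name__ == "__main__":
    rng = random.Random(0)
    belief = [0, 0, 1, 0]  # particles: the robot is probably at cell 0
    macro, value = plan(belief, goal=5, lo=0, hi=10, n_macros=8, rng=rng)
    print(len(macro), value)
```

In this sketch, increasing `n_macros` improves the chance of sampling a macro-action that reaches the goal, while the cost per planning step grows only with the number of samples and particles.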
