DOI: 10.1177/02783649261453546 ISSN: 0278-3649
Think fast and far: Long-horizon online POMDP planning via rapid state sampling
Yuanchu Liang, Edward Kim, J. Arden Knoll, Wil Thomason, Zachary Kingston, Lydia E. Kavraki, Hanna Kurniawati
Partially observable Markov decision processes (POMDPs) are a general and principled framework for motion planning under uncertainty. Despite tremendous improvement in the scalability of POMDP solvers, long-horizon POMDPs remain difficult to solve. To alleviate the difficulty, this paper proposes a new approximate online POMDP solver, called reference-based online POMDP planning via rapid state space sampling (ROP-RaS3).
ROP-RaS3 uses novel, extremely fast sampling-based motion planning techniques to sample the state space and generate a diverse set of macro-actions online, which are then used to bias belief-space sampling and infer high-quality policies [...] POMDP solvers.
ROP-RaS3 converges to a near-optimal reference-based solution at a rate that depends on the number of sampled actions, rather than the size of the action space.
ROP-RaS3 is evaluated on various long-horizon POMDPs with up to 3000 lookahead steps and 35-dimensional state spaces, where the state, action, and observation spaces can be continuous, discrete, or a hybrid of discrete and continuous. Although the reference-based optimal solution may not be the same as the optimal POMDP solution, empirical results indicate that in all of these problems, in terms of success rate, ROP-RaS3