Semantic Fusion and Depth‐to‐Volume Backend Optimization for Robotic Simultaneous Localization and Mapping
Weining Huang, Haijiang Gao, Xingjun Lai, Hongyuan Liu, Yubin Zhao
ABSTRACT
Simultaneous localization and mapping (SLAM) is essential for robotics; however, conventional feature‐based methods often fail in visually degraded environments such as textureless corridors. To address this challenge, we propose a robust semantic‐fusion‐based RGB‐D SLAM system that adaptively integrates visual features with dense geometric consistency. Our framework employs a hybrid tracking strategy that combines a semantic feature‐based method with depth‐to‐volume matching. Incoming depth frames are aligned with a persistent truncated signed distance function (TSDF) map, which is stored in a hash‐based volumetric representation for scalable reconstruction. In addition, 2D instance segmentation is used to project semantic labels into the 3D volume, producing a semantically enriched model. Experiments show that our system maintains millimeter‐level accuracy in structured environments while improving tracking continuity and reducing error in textureless scenes.
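To make the map representation concrete, the following is a minimal sketch of a hash‐based TSDF volume with per‐voxel semantic voting. It is not the implementation described in this paper: the class `HashedTSDF`, its methods, the voxel size, the truncation distance, and the unit‐weight running average are all assumptions chosen for illustration. It covers only the map‐update side (fusing one labelled depth measurement along a camera ray), not the depth‐to‐volume alignment used for tracking.

```python
import math
from collections import Counter
from dataclasses import dataclass, field

VOXEL_SIZE = 0.05   # metres (assumed value, not from the paper)
TRUNCATION = 0.15   # metres (assumed value, not from the paper)


@dataclass
class Voxel:
    tsdf: float = 0.0                                 # normalised signed distance in [-1, 1]
    weight: float = 0.0                               # number of fused observations
    labels: Counter = field(default_factory=Counter)  # semantic label votes


class HashedTSDF:
    """Sparse TSDF: only voxels near observed surfaces are allocated."""

    def __init__(self, voxel_size=VOXEL_SIZE, truncation=TRUNCATION):
        self.voxel_size = voxel_size
        self.truncation = truncation
        self.voxels = {}  # hash map: (i, j, k) integer index -> Voxel

    def key(self, point):
        """Integer voxel index containing a 3D point."""
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def integrate_ray(self, origin, surface, label=None):
        """Fuse one depth measurement: a ray from the camera origin to a surface point."""
        direction = [s - o for s, o in zip(surface, origin)]
        depth = math.sqrt(sum(d * d for d in direction))
        if depth == 0.0:
            return
        unit = [d / depth for d in direction]
        step = self.voxel_size / 2.0
        n_steps = int(round(2.0 * self.truncation / step))
        visited = set()
        # Walk the truncation band [depth - trunc, depth + trunc] along the ray.
        for s in range(n_steps + 1):
            t = depth - self.truncation + s * step
            if t <= 0.0:
                continue
            k = self.key([o + t * u for o, u in zip(origin, unit)])
            if k in visited:
                continue  # update each voxel at most once per ray
            visited.add(k)
            # Projective signed distance of the voxel centre to the surface.
            centre = [(i + 0.5) * self.voxel_size for i in k]
            t_centre = sum((c - o) * u for c, o, u in zip(centre, origin, unit))
            dist = depth - t_centre
            sdf = max(-1.0, min(1.0, dist / self.truncation))
            voxel = self.voxels.setdefault(k, Voxel())
            # Running weighted average with unit weight per observation.
            voxel.tsdf = (voxel.tsdf * voxel.weight + sdf) / (voxel.weight + 1.0)
            voxel.weight += 1.0
            # Vote only in voxels close to the surface.
            if label is not None and abs(dist) <= self.voxel_size:
                voxel.labels[label] += 1

    def label_of(self, k):
        """Majority semantic label of a voxel, or None if it has no votes."""
        voxel = self.voxels.get(k)
        if voxel is None or not voxel.labels:
            return None
        return voxel.labels.most_common(1)[0][0]
```

In this sketch, calling `integrate_ray` once per labelled depth pixel allocates voxels only inside the truncation band, so memory grows with observed surface area rather than with the bounding volume. Majority voting is one simple way to reconcile conflicting 2D instance labels across frames; a real system would likely also weight observations by depth noise and viewing angle.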