DOI: 10.3390/electronics15132883 ISSN: 2079-9292

Occluder-Mask-Constrained 3D Reconstruction from Tower-Crane Construction Site Imagery

Qirun He, Rong Zhang, Changjiang Yin, Qin Ye, Shaoming Zhang

3D reconstruction of construction scenes is an important enabling technology for digital and intelligent construction project management. Recurring foreground occluders and dynamic disturbances in tower-crane imagery can destabilize image registration and introduce spurious depth responses. This paper proposes an occluder-mask-constrained 3D reconstruction framework driven by multi-view geometric anomalies. Adjacent-view geometric outliers are spatially aggregated to generate foreground prompt points, which are converted into occluder masks using Segment Anything Model 2 (SAM2). The masks are propagated as unified pixel-validity constraints through sparse feature filtering, Adaptive Patch Deformation Multi-View Stereo (APD-MVS) matching-cost evaluation, support-region selection, and depth-map fusion. Experiments on three real construction-site datasets show increased sparse-registration completeness in the tested sequences and fewer visually identifiable occluder-induced artifacts in dense point clouds. A representative 308-image sequence was further evaluated against no-mask reconstruction, You Only Look Once version 8 (YOLOv8) bounding-box removal, manually prompted Segment Anything Model 2.1 (SAM2.1), a Segment Anything Model 3 (SAM3) text-prompt baseline, and Visibility-Aware Multi-View Stereo Network (Vis-MVSNet). The evaluation combines sparse-reconstruction metrics, pixel-level mask-quality metrics from a manually annotated validation subset, module-wise runtime accounting, controlled ablations, and aligned dense-point-cloud visualization. These results show improved sparse-stage registration completeness and visible artifact suppression. Because high-precision 3D reference point clouds are unavailable, the dense results are interpreted as visual evidence of artifact suppression rather than as proof of improved absolute dense-reconstruction accuracy.
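As a rough illustration of the pixel-validity idea at the sparse feature filtering stage, the following Python sketch discards keypoints that land on occluder pixels. It is not the authors' implementation: the function name `filter_keypoints_by_mask`, the `(x, y)` keypoint layout, and the convention that `True` in the mask marks an occluder are all assumptions made for this example, and the mask is assumed to come from an upstream segmentation step such as SAM2.

```python
import numpy as np


def filter_keypoints_by_mask(keypoints_xy, occluder_mask):
    """Keep only keypoints that fall on valid (non-occluder) pixels.

    keypoints_xy  : (N, 2) float array of (x, y) pixel coordinates (assumed layout).
    occluder_mask : (H, W) bool array, True where a pixel belongs to an occluder
                    (assumed convention).
    Returns the surviving keypoints and the boolean keep-vector of length N.
    """
    keypoints_xy = np.asarray(keypoints_xy, dtype=float)
    height, width = occluder_mask.shape

    # Map sub-pixel coordinates to the integer pixel that contains them.
    cols = np.floor(keypoints_xy[:, 0]).astype(int)
    rows = np.floor(keypoints_xy[:, 1]).astype(int)

    # Keypoints outside the image are treated as invalid.
    in_bounds = (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)

    keep = np.zeros(len(keypoints_xy), dtype=bool)
    keep[in_bounds] = ~occluder_mask[rows[in_bounds], cols[in_bounds]]
    return keypoints_xy[keep], keep


# Toy example: a 4x6 image with an occluder covering rows 1-2, columns 2-4.
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True
points = np.array([[0.2, 0.7], [3.1, 1.9], [5.9, 3.0], [7.0, 1.0]])
valid_points, keep = filter_keypoints_by_mask(points, mask)
```

In this toy case the second keypoint lies on the occluder and the fourth lies outside the image, so only the first and third survive. The same validity mask could in principle be consulted again at the dense stages the abstract lists (matching-cost evaluation, support-region selection, depth-map fusion), though how that is done in APD-MVS is not shown here.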
