DOI: 10.3390/photonics13060601 ISSN: 2304-6732

CVA-Net: Multi-View 3D Reconstruction for Fringe Projection Profilometry via Cross-View Attention and Sim2Real Learning

Zuqiong Chen, Xiaopin Zhong, Yibin Tian

Fringe projection profilometry (FPP) is widely used for 3D reconstruction, but conventional single-view FPP systems suffer from inherent occlusions and shadow regions, leading to incomplete surface recovery. In this study, we propose CVA-Net, an end-to-end deep learning framework with cross-view attention (CVA) that directly reconstructs dense depth maps from multi-view fringe patterns. CVA-Net simultaneously processes four fringe images acquired from orthogonal projection directions and leverages a CVA module to explicitly model inter-view dependencies, enabling adaptive fusion of complementary information. A 3D U-Net backbone with attention gates, atrous spatial pyramid pooling (ASPP), and an auxiliary parameter estimation branch further enhances reconstruction accuracy and structural consistency via multitask learning. To support Sim2Real network training, we build a Blender-based digital twin of a multi-view FPP system and generate a large-scale synthetic dataset with perfect ground truth. Extensive experiments on both synthetic and real-world objects demonstrate that CVA-Net significantly outperforms state-of-the-art single-view methods. With a symmetric four-view configuration and fringe period of 8, CVA-Net achieves an MAE of 0.0359 mm, an MSE of 0.0379 mm², and an RMSE of 0.1947 mm, reducing the MAE, MSE, and RMSE by 32.8%, 54.1%, and 32.2%, respectively, compared to the best single-view competitor. Ablation studies validate the contribution of each architectural component, while real-system experiments demonstrate the feasibility of transferring a network trained purely on synthetic data to practical FPP measurements without domain adaptation. Although further improvements are required to enhance reconstruction accuracy under real imaging conditions, the proposed framework provides an effective initial step toward bridging the gap between digital-twin-based training and real-world multi-view FPP applications. CVA-Net provides a robust, occlusion-aware solution for multi-view FPP reconstruction.
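
The three error metrics quoted in the abstract are related in a simple way, and a short sketch may help when reading the reported figures. The code below is an illustration only, not the authors' evaluation code: the function names `depth_errors` and `relative_reduction` are hypothetical, depths are assumed to be flat lists of values in millimetres, and all three metrics are assumed to be averaged over the same set of pixels.

```python
import math


def depth_errors(pred, truth):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of depths in mm.

    Hypothetical helper for illustration; assumes every pixel is valid
    (no masking of background or occluded regions).
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("pred and truth must be non-empty and equally long")
    diffs = [p - t for p, t in zip(pred, truth)]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n   # mean absolute error, in mm
    mse = sum(d * d for d in diffs) / n    # mean squared error, in mm^2
    rmse = math.sqrt(mse)                  # root mean squared error, in mm
    return mae, mse, rmse


def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# Toy depth values (made up for this example, not data from the paper).
mae, mse, rmse = depth_errors([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])

# Consistency check on the figures quoted in the abstract:
# the square root of the reported MSE should match the reported RMSE.
rmse_from_reported_mse = math.sqrt(0.0379)
```

Under these assumptions the reported values are mutually consistent: the square root of 0.0379 mm² is approximately 0.1947 mm, which matches the stated RMSE. Because the RMSE is the square root of the MSE, a given percentage reduction in MSE corresponds to a smaller percentage reduction in RMSE, which is in line with the 54.1% and 32.2% figures quoted above.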
