DOI: 10.3390/rs18122049 ISSN: 2072-4292

Fusion of RGB and LiDAR Modalities for Building Footprint Extraction Using High-Resolution Aerial Imagery

Norbert Serbán, Péter Enyedi, Péter Burai, Balázs Harangi

In this paper, a novel approach is presented for fusing RGB and LiDAR inputs for semantic segmentation. Accurate building detection is required in scenarios such as urban planning and environmental monitoring. The two main data sources for building segmentation are RGB aerial images and LiDAR point clouds covering the selected area. Each source has well-established segmentation techniques of its own; however, few architectures are available that combine the two inputs, even though features extracted from the two modalities can complement each other and yield an improved segmentation map. The authors created a semantic segmentation model that takes both the aerial RGB image and the LiDAR point cloud as input. The network first processes the point cloud and forwards its projection to a modified U-Net-based architecture, which fuses the features extracted from the 3D input with those extracted from the 2D input at each level of the decoding path. To train and test the model, the authors used a dataset of more than 3000 images and their corresponding 3D point clouds from three different areas in Hungary. As presented in this paper, the approach yields significantly better segmentation accuracy than traditional RGB-only and point-cloud-only segmentation models and their ensembles.
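
The abstract does not specify how the point cloud is projected before it reaches the 2D network. As a minimal sketch of one common option, and not necessarily the authors' method, the Python below rasterizes LiDAR points onto a grid aligned with the image, keeping the maximum height per cell. The function name `rasterize_max_height`, the `(x, y, z)` point layout, and all parameters are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical point layout: (x, y, z) in the same ground units as the image grid.
Point = Tuple[float, float, float]


def rasterize_max_height(
    points: Sequence[Point],
    x_min: float,
    y_min: float,
    cell_size: float,
    width: int,
    height: int,
) -> List[List[Optional[float]]]:
    """Project 3D points onto a 2D grid, keeping the highest z per cell.

    Cells that receive no points stay None, so a later step can decide
    how to fill them (for example with a ground-level value).
    """
    grid: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for x, y, z in points:
        # Map ground coordinates to integer column and row indices.
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Skip points that fall outside the raster extent.
        if 0 <= col < width and 0 <= row < height:
            current = grid[row][col]
            if current is None or z > current:
                grid[row][col] = z
    return grid


# Usage: four points on a 2 x 2 grid with 1-unit cells; the last point is out of range.
demo_points = [(0.2, 0.3, 5.0), (0.7, 0.1, 8.0), (1.5, 0.5, 3.0), (5.0, 5.0, 9.0)]
demo_grid = rasterize_max_height(demo_points, 0.0, 0.0, 1.0, width=2, height=2)
print(demo_grid)  # [[8.0, 3.0], [None, None]]
```

A raster of this kind has the same spatial layout as the RGB image, which is what would let a 2D encoder process it and let its features be combined with the RGB features at matching resolutions.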
