Cross-Scale U-Net: A Deep Transfer Learning Framework for Automated High-Resolution Urban Land Cover Mapping
Zhe Wang, Chao Fan, Shoukun Sun, Haifeng (Felix) Liao, Min Xian, Xiaogang Ma, Xiang Que

Accurate and scalable urban land cover mapping is critical for sustainable urban planning and environmental management. While deep learning models offer powerful tools for this task, their performance is often constrained by the need for vast, manually labeled datasets, which are costly and challenging to acquire for diverse urban environments. To address this limitation, we propose the Cross-Scale U-Net, an original, highly adaptable operational framework that systematically exploits the inherent scale effects of remote-sensing imagery to optimize transfer learning. By operationalizing prior theoretical findings on receptive fields, this workflow provides an actionable method for users to manipulate spatial resolution, identify an optimal scale to bridge the domain gap, and subsequently automate feature extraction with significantly reduced manual effort. Using the well-annotated ISPRS Potsdam dataset as the source domain, our framework transfers learned knowledge to classify National Agriculture Imagery Program (NAIP) data from Phoenix, AZ (2015), into four primary land cover classes. We systematically evaluated the framework’s performance across spatial resolutions ranging from 15 cm to 100 cm, achieving a peak overall accuracy (OA) of 82.45%. To assess generalizability, the model was applied in a label-free transfer scenario to NAIP imagery from Las Vegas, NV (2015), and Phoenix, AZ (2013 and 2019), consistently delivering OA values above 70%. In a comparative analysis, the Cross-Scale U-Net significantly outperformed traditional classification techniques. While our current empirical validation is focused on arid urban environments due to experimental constraints, the framework introduces a highly flexible, actionable scale-adjustment process. This approach offers a scalable workflow that can be tailored to various landscape scales (for example, expanding to coarser resolutions for large-scale forests or protected areas), delivering high-fidelity maps while mitigating data scarcity.
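
The scale-adjustment idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`block_mean_resample`, `overall_accuracy`, `select_best_scale`) are hypothetical, block averaging by an integer factor is assumed as the resampling method, and the accuracy values in the usage note are made-up placeholders rather than reported results.

```python
import numpy as np


def block_mean_resample(image, factor):
    """Coarsen an (H, W, C) image by an integer factor using block averaging.

    Assumption: the paper does not state its resampling method; block
    averaging is used here only as a simple stand-in.
    """
    h, w, c = image.shape
    h2, w2 = h // factor, w // factor
    # Trim edge pixels that do not fill a complete block.
    trimmed = image[: h2 * factor, : w2 * factor, :]
    # Group pixels into factor x factor blocks and average each block.
    return trimmed.reshape(h2, factor, w2, factor, c).mean(axis=(1, 3))


def overall_accuracy(y_true, y_pred):
    """Overall accuracy (OA): fraction of pixels labeled correctly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("label maps must have the same shape")
    return float(np.mean(y_true == y_pred))


def select_best_scale(oa_by_resolution_cm):
    """Return the resolution (in cm) with the highest OA.

    `oa_by_resolution_cm` is a hypothetical dict mapping a candidate
    resolution to the OA measured on target-domain validation labels.
    """
    return max(oa_by_resolution_cm, key=oa_by_resolution_cm.get)
```

In use, one would resample the source imagery to each candidate resolution, train or fine-tune a model per resolution, score each on labeled target pixels with `overall_accuracy`, and pass the resulting dictionary (for instance `{15: 0.6, 50: 0.8, 100: 0.7}`, placeholder values) to `select_best_scale`.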