DOI: 10.3390/rs18122037 ISSN: 2072-4292

EO2SAR-Diff: Structure-Aware Latent Diffusion for Unpaired EO-to-SAR Translation

Yeon-Wook Kim, Kiyoung Kim

Synthetic aperture radar (SAR) imagery provides all-weather, day-and-night observation capabilities that complement electro-optical (EO) imaging; however, the limited number of operational SAR satellites and the difficulty of acquiring expert-annotated SAR datasets constrain deep-learning-based SAR image analysis. In this paper, we propose EO2SAR-Diff, a conditional latent diffusion framework that translates EO aerial images into realistic synthetic SAR images. The framework comprises three core components: (1) domain-adaptive LoRA pre-training that anchors the Stable Diffusion backbone in the remote sensing domain, (2) a style extraction and injection network that captures SAR-specific visual characteristics via multi-scale feature encoding and parallel cross-attention, and (3) a multi-branch ControlNet with three parallel branches for complementary structural guidance. These components are coordinated by a dual-axis feature injection strategy that modulates conditioning strength along both spatial (per-block) and temporal (per-timestep) dimensions. Experiments on the DOTA 1.0 and SARDet-100K datasets demonstrate that EO2SAR-Diff ranks in the top tier among all compared methods in distributional alignment with real SAR imagery, in terms of FID and KID computed with two SAR-domain-adapted feature extractors. Augmenting the SAR training set with our synthetic images yields consistent improvements in downstream object detection performance, confirming the practical utility of the proposed framework.
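To make the dual-axis idea concrete, the sketch below combines a per-block weight with a per-timestep weight into a single conditioning scale for each injection point. The abstract does not give the actual schedules, so everything here is an assumption made for illustration: the function names `timestep_weight` and `injection_scale`, the cosine ramp, the `floor` value, and the example block weights are hypothetical and are not taken from EO2SAR-Diff.

```python
import math


def timestep_weight(t, num_steps, floor=0.2):
    """Hypothetical temporal schedule (an assumption, not the paper's).

    Returns 1.0 at the noisiest step (t = num_steps - 1) and decays along a
    cosine ramp to `floor` at the final denoising step (t = 0).
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    frac = t / (num_steps - 1)  # 0.0 at t = 0, 1.0 at the noisiest step
    return floor + (1.0 - floor) * 0.5 * (1.0 - math.cos(math.pi * frac))


def injection_scale(block_weights, t, num_steps, floor=0.2):
    """Combine the spatial (per-block) and temporal (per-timestep) axes.

    `block_weights` holds one hypothetical weight per injection block; the
    result is the product of each block weight with the timestep weight.
    """
    w_t = timestep_weight(t, num_steps, floor)
    return [w_b * w_t for w_b in block_weights]


if __name__ == "__main__":
    # Illustrative block weights: stronger guidance in earlier blocks.
    blocks = [1.0, 0.8, 0.5]
    for t in (999, 500, 0):
        print(t, [round(s, 3) for s in injection_scale(blocks, t, 1000)])
```

A multiplicative combination is only one possible choice; it keeps the two axes independent, so a block schedule and a timestep schedule can be tuned separately. Whether the paper uses a product, a learned gate, or another rule is not stated in the abstract.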
