EO2SAR-Diff: Structure-Aware Latent Diffusion for Unpaired EO-to-SAR Translation
Yeon-Wook Kim, Kiyoung Kim

Synthetic aperture radar (SAR) imagery provides all-weather, day-and-night observation capabilities that complement electro-optical (EO) imaging; however, the limited number of operational SAR satellites and the difficulty of acquiring expert-annotated SAR datasets constrain deep-learning-based SAR image analysis. In this paper, we propose EO2SAR-Diff, a conditional latent diffusion framework that translates EO aerial images into realistic synthetic SAR images. The framework comprises three core components: (1) domain-adaptive LoRA pre-training that anchors the Stable Diffusion backbone in the remote sensing domain, (2) a style extraction and injection network that captures SAR-specific visual characteristics via multi-scale feature encoding and parallel cross-attention, and (3) a multi-branch ControlNet with three parallel branches for complementary structural guidance. These components are coordinated by a dual-axis feature injection strategy that modulates conditioning strength along both spatial (per-block) and temporal (per-timestep) dimensions. Experiments on the DOTA 1.0 and SARDet-100K datasets demonstrate that EO2SAR-Diff ranks in the top tier among all compared methods in distributional alignment with real SAR imagery, in terms of FID and KID computed with two SAR-domain-adapted feature extractors. Augmenting the SAR training set with our synthetic images yields consistent improvements in downstream object detection performance, confirming the practical utility of the proposed framework.
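
To make the idea of dual-axis feature injection concrete, the following is a minimal sketch of how a conditioning residual could be scaled by the product of a per-block (spatial) weight and a per-timestep (temporal) weight. It is an illustration under stated assumptions and not the schedule used in EO2SAR-Diff: the function names `spatial_weights`, `temporal_weight`, and `inject`, the linear ramp across blocks, and the cosine ramp across timesteps are all hypothetical choices that do not come from the paper.

```python
import numpy as np


def spatial_weights(num_blocks: int, low: float = 0.5, high: float = 1.0) -> np.ndarray:
    """Hypothetical per-block weights: a linear ramp from `low` to `high`."""
    return np.linspace(low, high, num_blocks)


def temporal_weight(t: int, num_timesteps: int) -> float:
    """Hypothetical per-timestep weight in [0, 1].

    Assumes larger t means more noise; the weight is 0 at t = 0 and
    1 at t = num_timesteps - 1, following a half-cosine ramp.
    """
    frac = t / (num_timesteps - 1)
    return float(0.5 * (1.0 - np.cos(np.pi * frac)))


def inject(features, residuals, t: int, num_timesteps: int):
    """Add conditioning residuals to backbone features, block by block.

    Each residual is scaled by (per-block weight) * (per-timestep weight),
    so the conditioning strength varies along both axes.
    """
    w = spatial_weights(len(features))
    g = temporal_weight(t, num_timesteps)
    return [h + w_b * g * r for h, r, w_b in zip(features, residuals, w)]


# Toy usage: three blocks of backbone features and conditioning residuals.
features = [np.zeros((2, 2)) for _ in range(3)]
residuals = [np.ones((2, 2)) for _ in range(3)]
out_noisy = inject(features, residuals, t=999, num_timesteps=1000)
out_clean = inject(features, residuals, t=0, num_timesteps=1000)
```

In this toy setting the conditioning is applied at full per-block strength at the noisiest step and vanishes at the final denoising step; a real schedule could just as well use the opposite direction or learned weights.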