Radar-Camera Extrinsic Calibration for Roadside Infrastructure: A Systematic Review
Zeynab Rokhi, Ali Emadi

The growth of Intelligent Transportation Systems (ITS) has made high-quality perception data from multi-sensor setups essential. Pairing millimeter-wave (mmW) radar with a monocular camera is a common way to recover three-dimensional information about the environment, but aligning the two is difficult because sparse radar point clouds and dense camera images differ sharply in how they sense a scene. The problem grows more severe in roadside infrastructure, where the high mounting elevation introduces perspective distortion that vehicle-mounted systems rarely face. This paper presents a systematic review, conducted under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, of radar-camera extrinsic calibration for fixed roadside infrastructure, organizing existing work into a taxonomy that separates traditional two-stage pipelines from recent end-to-end learning frameworks. Because methods designed specifically for roadside units remain scarce, the review also covers vehicle- and robot-mounted methods whose static-sensor formulation carries over to fixed roadside deployment. For the two-stage pipeline, the analysis covers target-based and targetless correspondence registration along with the optimization techniques and algorithmic assumptions behind parameter estimation. The end-to-end learning literature shows a clear shift toward self-supervised and fusion-based models, some of which report real-time performance. The review also compares the metrics and procedures used to quantify calibration accuracy. Progress is evident, but robustness in cluttered urban environments remains an open challenge, and the paper closes by outlining future directions, arguing that standardized roadside benchmarks are needed before scalable, targetless calibration can mature.