DOI: 10.3390/bioengineering13070714 ISSN: 2306-5354

DCP-TS: A Unified Spatiotemporal Framework for Real-Time Desmoking and Flicker Suppression in Laparoscopic Surgical Videos

Chun-Hsien Wu, Chih-Yi Lin, Yi-Chun Du

Surgical smoke generated by energy-based instruments during minimally invasive surgery severely degrades intraoperative visibility in laparoscopic procedures, prolonging operation time and elevating surgical risk. Although deep-learning desmoking methods have improved spatial clarity, most operate frame-by-frame and produce temporal artifacts—flicker, brightness drift, and color instability—that hinder clinical adoption. To our knowledge, no prior framework has jointly addressed spatial restoration and temporal consistency within a unified surgical smoke removal pipeline. We proposed DCP-TS, a unified spatiotemporal framework that coupled a Dark Channel Prior (DCP)-guided conditional generative adversarial network (cGAN) with an inference-time module integrating optical flow alignment, exponential moving-average luminance smoothing, and adaptive gamma correction. A key novelty was that this stabilizer was smoke-aware and operated entirely at inference time, requiring no retraining or post-processing, which distinguished it from generic video temporal-consistency methods. On laparoscopic colorectal surgery videos, DCP-TS achieved a PSNR of 23.39 dB, SSIM of 0.62, NIQE of 4.17, and BRISQUE of 23.66, outperforming DehazeFormer and Colores et al. across all metrics. Temporal analysis showed an approximate 28% reduction in inter-frame luminance variation, and a double-blind reader study with five experienced laparoscopic surgeons confirmed substantial improvements in brightness stability (4.37 vs. 2.86) and overall perceptual quality (4.18 vs. 3.51 on a 5-point Likert scale). The system ran at 22 fps with ~3.9 GB GPU memory on standard operating-room hardware, supporting real-time intraoperative deployment. DCP-TS demonstrated that physics-guided spatiotemporal modeling could transform frame-by-frame desmoking into a clinically promising, perceptually more continuous video stream.
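The luminance-smoothing and gamma-correction steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulation, so everything here is an assumption made for illustration: the class name `LuminanceStabilizer`, the smoothing factor `alpha`, the Rec. 601 luma weights, and the rule of choosing gamma so that the current mean luminance maps onto the running average. Optical flow alignment and the smoke-aware gating are omitted.

```python
import numpy as np

# Rec. 601 luma weights (an assumed choice; the abstract does not specify one).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


class LuminanceStabilizer:
    """Hypothetical inference-time stabilizer: EMA target plus per-frame gamma."""

    def __init__(self, alpha=0.9, eps=1e-6):
        self.alpha = alpha  # weight given to the running average (assumed value)
        self.eps = eps      # keeps the logarithms below finite and nonzero
        self.ema = None     # running mean luminance, set on the first frame

    def __call__(self, frame):
        # frame: float RGB array of shape (H, W, 3) with values in [0, 1]
        frame = np.clip(frame, 0.0, 1.0)
        mean_luma = float((frame @ LUMA_WEIGHTS).mean())
        mean_luma = min(max(mean_luma, self.eps), 1.0 - self.eps)

        # Exponential moving average of the per-frame mean luminance.
        if self.ema is None:
            self.ema = mean_luma
        else:
            self.ema = self.alpha * self.ema + (1.0 - self.alpha) * mean_luma

        # Pick gamma so that mean_luma ** gamma == ema. This is exact for a
        # uniform frame and only approximate for real image content.
        gamma = np.log(self.ema) / np.log(mean_luma)
        return frame ** gamma


# Usage: feed desmoked frames in display order; state carries across calls.
stabilizer = LuminanceStabilizer(alpha=0.9)
dark = np.full((4, 4, 3), 0.5)
bright = np.full((4, 4, 3), 0.8)
first = stabilizer(dark)     # first frame passes through unchanged (gamma = 1)
second = stabilizer(bright)  # sudden brightening is pulled toward the average
```

Because the correction is a single scalar gamma per frame, it adds little per-frame cost and needs no retraining of the desmoking network, which fits the inference-time design the abstract describes. A larger `alpha` suppresses flicker more strongly but responds more slowly to genuine lighting changes.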
