DOI: 10.1002/cpe.70833 ISSN: 1532-0626

Rethinking Lightweight Vision Transformers for Complex Medical Scenes: Blood Cell Detection With ASBH‐DETR

Wei Li, Xingwei Yao, Hongyang Luan, Yingtao Lyu

ABSTRACT

Detecting blood cells in microscopic images is crucial for diagnosing hematological diseases. However, when cells are multiscale, dense, and overlapping, traditional manual counting is laborious, and common deep learning methods struggle to balance high accuracy with real‐time performance. To address these challenges, this study proposes a high‐precision, lightweight model, the Adaptive Spatial‐frequency Blood‐cell Hybrid Detection Transformer (ASBH‐DETR). First, we designed an Adaptive Kernel Mixer (AKM) to reconstruct the backbone network; through dynamic kernel recalibration and a parallel processing mechanism, it enhances multiscale cellular feature capture while significantly reducing computational redundancy. Second, we employed a Bipolar Attention Layer (BAL), whose polarity decomposition and spike recovery properties improve the model's focusing precision in dense regions. Third, we designed the Spatial‐frequency Aggregation Network (SANet), which enhances the model's perception of overlapping boundaries and multiscale targets through multibranch spatial‐frequency decomposition and fusion. Finally, we introduced the Hybrid Downsample Module (HBDM) to counter the loss of high‐frequency features during downsampling. On the public BCCD and SCAD datasets, the model achieved scores of 92.6% and 85.1%, respectively, representing improvements of 2.6% and 2.8% over the baseline. At the same time, its parameter count and computational complexity (FLOPs) were reduced by 30.4% and 28.6%, respectively, while inference speed (FPS) increased by approximately 55.2%. In summary, ASBH‐DETR balances accuracy, speed, and model efficiency, offering a reliable solution for deploying high‐precision, automated blood cell analysis systems in resource‐constrained clinical environments.
