A portable non-contact tongue imaging system with automated analysis for community and home settings
Jiehan Wei, Jun Song, Weiliang Lu, Shaoyang Men, Chuangquan Lin, Peipei Zhou
Background
Traditional tongue inspection relies on visual assessment by practitioners, which introduces subjectivity and compromises reproducibility. Existing solutions often rely on enclosed, dedicated acquisition instruments that are nontrivial to operate. Mobile self-capture approaches are more accessible but are sensitive to environmental variability, which makes reliable analysis challenging in real-world use.
Objective
To develop a portable non-contact tongue imaging and automated analysis system that is robust to real-world acquisition variability.
Methods
We designed a portable acquisition terminal that integrates a camera, a touchscreen preview, touch-initiated capture with voice prompts, and supplementary illumination to assist acquisition. For automated analysis, we developed TongueSegNet (TSegNet) for tongue segmentation, which incorporates stage-dependent residual modulation, deep-stage attention enhancement, and gated skip-pathway feature fusion to improve feature representation and boundary delineation. For fissured-tongue feature recognition, we developed the Residual Kolmogorov-Arnold Network (ResKAN), which combines a convolutional neural network feature extractor with a Kolmogorov-Arnold Network–based head to improve modelling capacity for fine-grained texture patterns.
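The abstract does not specify how the gated skip-pathway fusion in TSegNet is formulated. As a rough illustration of the general idea only, the sketch below blends an encoder (skip) feature with a decoder feature through a sigmoid gate. The function name `gated_skip_fusion`, the scalar weights `w_enc`, `w_dec`, and `bias`, and the convex-blend form are all assumptions made for this example; in a real network the gate would typically come from learned convolutions over feature maps.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_skip_fusion(encoder_feat, decoder_feat, w_enc=1.0, w_dec=1.0, bias=0.0):
    """Blend skip (encoder) and decoder features element by element.

    Hypothetical formulation, not the TSegNet definition:
        gate  = sigmoid(w_enc * e + w_dec * d + bias)
        fused = gate * e + (1 - gate) * d
    """
    if len(encoder_feat) != len(decoder_feat):
        raise ValueError("encoder and decoder features must have the same length")
    fused = []
    for e, d in zip(encoder_feat, decoder_feat):
        gate = sigmoid(w_enc * e + w_dec * d + bias)
        # A gate near 1 passes the encoder detail; near 0 keeps the decoder feature.
        fused.append(gate * e + (1.0 - gate) * d)
    return fused


# With zero weights and zero bias the gate is exactly 0.5, so the result
# is the plain average of the two inputs.
print(gated_skip_fusion([2.0, 4.0], [0.0, 0.0], w_enc=0.0, w_dec=0.0))
```

The point of gating, as opposed to plain concatenation or addition, is that the network can suppress skip features where they carry background clutter and keep them where they sharpen the boundary.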
Results
On tongue images acquired under unconstrained conditions, TSegNet achieved a mean Dice coefficient of 98.16%, a mean intersection over union of 96.42%, and a mean pixel accuracy of 98.31%, outperforming representative baselines. ResKAN achieved a mean accuracy of 92.48%, a sensitivity of 92.67%, a specificity of 92.31%, and a fissured-class F1 score of 92.34%.
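For readers less familiar with the reported metrics, the following minimal sketch computes them from their standard confusion-count definitions. The function names and toy inputs are illustrative assumptions, not the authors' evaluation code. The abstract does not define how the "mean" is taken; this sketch computes per-image (or per-run) values, and assumes that pixel accuracy is the fraction of correctly labelled pixels and that the fissured class is the positive label (1).

```python
def segmentation_metrics(pred, target):
    """Return (dice, iou, pixel_accuracy) for two binary masks (nested lists of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    overlap_denominator = tp + fp + fn
    # Two empty masks agree perfectly, so define Dice and IoU as 1.0 in that case.
    dice = 2 * tp / (2 * tp + fp + fn) if overlap_denominator else 1.0
    iou = tp / overlap_denominator if overlap_denominator else 1.0
    pixel_accuracy = (tp + tn) / (tp + fp + fn + tn)
    return dice, iou, pixel_accuracy


def classification_metrics(pred_labels, true_labels):
    """Return (accuracy, sensitivity, specificity, f1); label 1 is the positive class."""
    pairs = list(zip(pred_labels, true_labels))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return accuracy, sensitivity, specificity, f1


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative, 2 true negatives.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
print(segmentation_metrics(pred_mask, true_mask))

# Toy labels for six images (1 = fissured, 0 = non-fissured).
print(classification_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1]))
```

Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank methods identically on a single image; the two can diverge slightly once averaged across a test set.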
Conclusion
The proposed system enables reliable non-contact tongue imaging with automated server-side analysis under unconstrained conditions. These findings support the feasibility of this integrated approach as an initial step toward more accessible automated tongue-image analysis in community and home settings.