Toward scalable wood anatomy: a toolkit for automated xylem cell identification and quantification in woody angiosperms
Longwei Mei, Cheng Qi, Zhiyang Pei, Bingbing Zhao, Qiang Fu, Zhangqun Jiang, Chengcheng Li, Zhihe Jin, Ruoya Zhao, Jin Cheng, Yongjian Wang, Quanzi Li, Manzhu Bao, Bo Zheng, Xueping Shi
Abstract
Xylem is essential for water and nutrient transport, mechanical support, and carbohydrate storage. Identification and quantification of vascular cell types remain manual, time-consuming, and prone to observer bias, limiting throughput and reproducibility. Automated, integrated tools are critical for scaling wood anatomical studies and enabling comparative analyses across taxa. We assembled a poplar xylem dataset of 1,790 microscopy images with 173,434 annotated instances. Using this dataset, we evaluated seven semantic segmentation models and five YOLOv8 detection models across section types for xylem cell recognition and morphometric attribute extraction, and adopted a “Segmentation-then-Detection” pipeline to reduce misidentifications in complex backgrounds. Mask2Former achieved the best segmentation performance, covering transverse sections (whole xylem, vessels, fibers, rays) and tangential sections (rays and four ray cell image types). YOLOv8x and YOLOv8m performed consistently for object detection and morphometrics, and the PLXY-AI toolkit was accordingly developed based on the YOLOv8 architecture. The combined pipeline markedly improved fiber identification in challenging images. In a generalization test, 34 of 42 woody angiosperms (81.0%) exceeded 90% accuracy for identifying all cell types. The workflow and PLXY-AI toolkit enable automated identification and quantification of vessels, fibers, and rays, extracting size and area while substantially reducing manual workload and observer bias. Per-image processing time averages less than 1 s. Designed for batch analysis, the pipeline minimizes operational complexity and integrates easily into existing laboratory and computational environments. With a user-friendly graphical interface, this framework supports high-throughput analysis of vascular tissue structure and function across multiple tree species.
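The “Segmentation-then-Detection” idea can be illustrated with a minimal sketch. The abstract does not specify how the segmentation output is passed to the detector, so the masking step below is an assumption, and none of the function names correspond to the PLXY-AI toolkit. The toy `toy_segment` and `toy_detect` functions stand in for a segmentation model such as Mask2Former and a YOLOv8 detector, which are not called here; `microns_per_pixel` is a hypothetical calibration parameter.

```python
import numpy as np

def apply_region_mask(image, region_mask, fill=0):
    """Blank out every pixel that lies outside the segmented tissue region."""
    region_mask = np.asarray(region_mask, dtype=bool)
    if image.shape[:2] != region_mask.shape:
        raise ValueError("mask and image must share height and width")
    masked = image.copy()
    masked[~region_mask] = fill
    return masked

def box_metrics(box, microns_per_pixel=1.0):
    """Width, height, and bounding-box area of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    width = (x1 - x0) * microns_per_pixel
    height = (y1 - y0) * microns_per_pixel
    return {"width": width, "height": height, "box_area": width * height}

def segment_then_detect(image, segment_fn, detect_fn, microns_per_pixel=1.0):
    """Run segmentation first, then detection on the masked image only."""
    region_mask = segment_fn(image)            # boolean H x W mask
    masked = apply_region_mask(image, region_mask)
    detections = detect_fn(masked)             # list of (label, box) pairs
    return [
        {"label": label, "box": box, **box_metrics(box, microns_per_pixel)}
        for label, box in detections
    ]

# --- Toy stand-ins (hypothetical, for illustration only) ---
def toy_segment(img):
    """Pretend the tissue region is the central 6 x 6 window."""
    mask = np.zeros(img.shape[:2], dtype=bool)
    mask[1:7, 1:7] = True
    return mask

def toy_detect(img):
    """Report one bounding box around all non-zero pixels."""
    ys, xs = np.nonzero(img)
    if ys.size == 0:
        return []
    box = (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)
    return [("vessel", box)]

image = np.zeros((8, 8), dtype=np.uint8)
image[2:5, 2:6] = 200   # a "cell" inside the tissue region
image[0, 6:8] = 200     # bright background clutter outside the region

results = segment_then_detect(image, toy_segment, toy_detect)
unmasked = toy_detect(image)  # detection without the segmentation step
```

In this toy case the unmasked detector stretches its box to include the background clutter, while the masked run returns a box confined to the cell. That is the kind of misidentification the two-stage ordering is meant to reduce, although the real models and their coupling may differ from this sketch.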