Development of a multidimensional diagnostic model integrating cfDNA methylation and clinical parameters for colorectal cancer and precancerous lesions using artificial intelligence.
Li-Yue Sun, Hao-Chuan Ma, Jiao-Jiao Yang, Rui-Qi Wang, Fang Wang
Background: Early detection of colorectal cancer (CRC) and precancerous lesions is critical for improving patient outcomes. This study aimed to develop a diagnostic model integrating cell-free DNA (cfDNA) methylation markers with multidimensional clinical parameters using artificial intelligence (AI) algorithms.
Methods: A total of 1,373 participants were enrolled, including 261 patients with CRC, 312 with precancerous lesions, 400 high-risk individuals, and 400 healthy controls. Clinical data were systematically collected and preprocessed to ensure uniformity. Both qualitative variables (e.g., sex, fecal immunochemical test results) and quantitative variables (e.g., age, tumor markers, cfDNA methylation indicators) were included. Fifteen AI algorithms were evaluated, including logistic regression, support vector machine, random forest, XGBoost, and AdaBoost. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), and the optimal algorithm was selected based on AUC, sensitivity, and specificity. Feature importance was analyzed using XGBoost.
Results: AdaBoost was identified as the optimal algorithm. The key selected features were CA19-9, smoking index, white blood cell count, fasting blood glucose, CEA, age, CA72-4, triglycerides, total cholesterol, and neutrophil-to-lymphocyte ratio. Two rapid, low-cost cfDNA detection methods (electrochemical cfDNA adsorption and colloidal gold-based cfDNA absorbance) were integrated into the model. The final combined model achieved an AUC of 0.999 for CRC and 0.843 for precancerous lesions. Sensitivity and specificity were 98.1% and 99.8%, respectively, for CRC, and 89.1% and 62.6%, respectively, for precancerous lesions.
Conclusions: This study developed a high-performance diagnostic model integrating cfDNA methylation markers with multidimensional clinical data using AI. The model shows exceptional accuracy in detecting both CRC and precancerous lesions, offering a promising tool for early screening and risk stratification. Incorporation of low-cost cfDNA assays enhances its potential for widespread clinical application, particularly in resource-limited settings.
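The three metrics reported above (AUC, sensitivity, and specificity) can be computed from binary labels and model scores as in the minimal sketch below. It is not the study's code: the function names, the 0.5 decision threshold, and the toy data are illustrative assumptions. AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = disease."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 3 cases and 4 controls.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)  # 2/3 and 3/4
model_auc = auc(y_true, scores)                        # 11/12
```

Sensitivity and specificity depend on the chosen threshold while AUC does not, which is why a model can pair a reasonable AUC with an uneven sensitivity/specificity split, as reported for precancerous lesions.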