Research on identification of key genes and immune–metabolic mechanisms in atrial fibrillation through integrated multi-cohort transcriptomic analysis and machine learning
Mierzhati Maimaiti, Aizizha Paerhati, Shenhong Liu, Xianglin Du, Wen Bai
This study aimed to integrate multiple datasets to identify atrial fibrillation (AF)-related differentially expressed genes (DEGs), analyze their underlying mechanisms through functional enrichment and machine learning, construct diagnostic models, and explore immune–metabolic interactions, thereby providing novel biomarkers and a theoretical foundation. Gene expression datasets were integrated and normalized, with batch effects removed using principal component analysis. Differential expression analysis, functional enrichment analysis (Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathways), and machine learning-based feature gene selection and model construction were then performed. Shapley additive explanations analysis was used to interpret the constructed models, while gene set enrichment analysis, gene set variation analysis, and immune cell infiltration analysis were conducted to investigate the associations between feature genes and immune infiltration. After integration, normalization, and batch-effect removal, 6 DEGs were identified: 4 upregulated and 2 downregulated. Functional enrichment analysis showed that these DEGs were significantly enriched in neuro-related biological processes and pathways, indicating that they play key roles in AF pathogenesis. Five key feature genes were selected using the least absolute shrinkage and selection operator (LASSO), random forest, and support vector machine-recursive feature elimination algorithms. These genes differed significantly in expression between the AF and control groups (