Early Detection of Latent Tuberculosis Using Multi-Objective Hybrid Feature Selection and Complex Network Analysis
Somayeh Ayalvari, Marjan Kaedi, Mohammadreza Sehhati

Latent tuberculosis infection (LTBI) represents a major public health challenge due to its asymptomatic nature and potential progression to active tuberculosis, making early and accurate detection critical. This study proposes a novel computational decision-support framework for the early identification of LTBI using gene expression data, aiming to enhance diagnostic accuracy and support timely clinical decision-making. The framework integrates evolutionary algorithms with multi-objective optimization for feature selection, employing a real-coded representation to dynamically identify informative gene subsets while minimizing prediction error. An initial filtering of genes is performed using a Random Forest (RF) classifier, followed by refinement through entropy-based criteria and Mutual Information Maximization. Biological relevance is further validated by mapping the selected genes onto protein–protein interaction (PPI) networks using Cytoscape. Experimental results demonstrate that the proposed approach achieves 95% classification accuracy using only the top five genes identified by the RF classifier, with a precision of approximately 92.7%, a recall (sensitivity) of 95%, and an F1-score of 93.8%. Hierarchical and cumulative clustering methods effectively distinguish latent from active tuberculosis samples, while key gene markers and gene–gene interactions associated with LTBI are revealed, providing valuable insights into disease mechanisms. The proposed bioinformatics-driven framework offers a robust, interpretable, and clinically relevant tool for LTBI screening, with potential implications for improving tuberculosis diagnosis and management strategies.
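
The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below performs that arithmetic only; it is not part of the authors' pipeline. The helper name `f1_score` is introduced here for illustration, and the precision value is the approximate figure quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (precision is stated as approximate).
precision = 0.927
recall = 0.95

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.938"
```

Under these inputs the result rounds to 93.8%, which agrees with the F1-score stated in the abstract.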