The Proposed Clustering Model to Predict Autism Spectrum Disorder in Toddlers
Abdulelah G. F. Saif
Abstract
Objective:
To introduce a robust clustering model for the early prediction of autism spectrum disorder (ASD) in toddlers.
Methods:
Five parametric and five non-parametric clustering algorithms were applied to all available features of a dataset obtained from Kaggle. Their performance was compared using the Silhouette Score (SS), Davies–Bouldin Index (DBI), Adjusted Rand Index (ARI), Fowlkes–Mallows Index (FMI), Normalized Mutual Information (NMI), standard deviation (SD), and processing time (PT).
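The evaluation described above can be sketched for a single algorithm as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: it uses scikit-learn, a synthetic two-cluster dataset from `make_blobs` in place of the Kaggle data, and a hypothetical helper named `evaluate_gmm`. Reporting SD as the spread of each metric across repeated runs is one possible reading of the SD metric, assumed here for illustration.

```python
import time

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    davies_bouldin_score,
    fowlkes_mallows_score,
    normalized_mutual_info_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture


def evaluate_gmm(X, y_true, n_components=2, n_runs=5):
    """Fit a GMM n_runs times; return (mean, SD) of each metric over the runs.

    Hypothetical helper for illustration only.
    """
    runs = {"SS": [], "DBI": [], "ARI": [], "FMI": [], "NMI": [], "PT": []}
    for seed in range(n_runs):
        start = time.perf_counter()
        model = GaussianMixture(n_components=n_components, random_state=seed)
        labels = model.fit_predict(X)
        # Processing time covers fitting and label assignment only.
        runs["PT"].append(time.perf_counter() - start)

        # Internal metrics: computed from the features and predicted labels.
        runs["SS"].append(silhouette_score(X, labels))
        runs["DBI"].append(davies_bouldin_score(X, labels))

        # External metrics: compare predicted labels with known classes.
        runs["ARI"].append(adjusted_rand_score(y_true, labels))
        runs["FMI"].append(fowlkes_mallows_score(y_true, labels))
        runs["NMI"].append(normalized_mutual_info_score(y_true, labels))

    return {k: (float(np.mean(v)), float(np.std(v))) for k, v in runs.items()}


# Synthetic stand-in data (assumption): two well-separated groups, 5 features.
X, y = make_blobs(
    n_samples=300,
    centers=[[-5.0] * 5, [5.0] * 5],
    cluster_std=1.0,
    random_state=0,
)

summary = evaluate_gmm(X, y)
for name, (mean, sd) in summary.items():
    print(f"{name}: mean={mean:.4f}, sd={sd:.4f}")
```

Higher SS, ARI, FMI, and NMI indicate better clustering, whereas lower DBI is better. The same loop could be repeated for each of the other algorithms to produce a comparison.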
Results:
The Gaussian Mixture Model (GMM) outperformed the other algorithms on this dataset. When training and testing were conducted on the entire dataset, GMM gave the best results in both clustering performance (SS = 0.1952, DBI = 1.8038, ARI = 0.9730, FMI = 0.9884, NMI = 0.9439) and time efficiency (an average of 4.1998 s over five runs). When the dataset was split into training and testing sets, GMM achieved SS = 0.2108, DBI = 1.6722, ARI = 1, FMI = 1, NMI = 1, and SD = 0.0818.
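The split-based evaluation can be sketched in the same way: fit the mixture on the training portion only, then assign clusters to the held-out portion and score those assignments. The 70/30 stratified split, the synthetic `make_blobs` data, and the fixed random seeds below are assumptions made for illustration; the split ratio used in the study is not stated here.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): two well-separated groups, 5 features.
X, y = make_blobs(
    n_samples=300,
    centers=[[-5.0] * 5, [5.0] * 5],
    cluster_std=1.0,
    random_state=0,
)

# Assumed 70/30 split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit on the training portion only; the true labels are not used for fitting.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X_train)

# Assign each held-out sample to its most probable mixture component.
test_labels = gmm.predict(X_test)

# Score the held-out assignments with one external and one internal metric.
test_ari = adjusted_rand_score(y_test, test_labels)
test_ss = silhouette_score(X_test, test_labels)
print(f"held-out ARI={test_ari:.4f}, SS={test_ss:.4f}")
```

ARI is invariant to how cluster indices are numbered, so no mapping from component indices to class labels is needed before scoring.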
Conclusion:
ASD is a long-term neurological condition that requires early, rapid, and accurate diagnosis to ensure appropriate healthcare intervention. Traditional diagnostic methods are often time-consuming and costly, so machine learning approaches have emerged as faster and more cost-effective alternatives. In this study, GMM outperformed the other algorithms compared on this dataset, but it should not be interpreted as a diagnostic tool without further validation.