DOI: 10.62520/fujece.1887893 ISSN: 2822-2881

Predicting Student Dropout Using Explainable Artificial Intelligence: The Impact of SMOTE-Based Class Balancing Techniques

Merve Zirekgür, Barış Karakaya, Derya Avcı
Student dropout poses a significant challenge to both individual and societal progress. While artificial intelligence is widely used to identify potential dropouts and analyze their causes, its success relies heavily on well-structured datasets. Imbalanced datasets can bias models toward the majority class and thereby limit their ability to generalize. This study examines the impact of dataset balancing techniques on dropout prediction using artificial intelligence models and interprets model decisions through explainable artificial intelligence, based on the publicly available Predict Students’ Dropout and Academic Success dataset. Accordingly, four different datasets were created using the SMOTE, SMOTE-ENN, and SMOTE-Tomek balancing techniques and were used to train machine learning models. The SMOTE-ENN + CatBoost combination, which achieved 98% accuracy, precision, recall, and F1 score, with an AUC of 1.00, was selected as the best-performing combination and was subjected to LIME and SHAP analyses for model interpretability. The analyses showed that low academic performance increases the risk of dropping out, whereas support mechanisms such as scholarships encourage students to complete their education. Compared to similar studies, this work addresses both class balancing techniques and model explainability within a broader scope and examines the underlying causes of student dropout through explainable artificial intelligence.
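To illustrate the idea behind the SMOTE family of balancing techniques named in the abstract, the following is a minimal sketch of SMOTE's core step: creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. It is not the authors' pipeline. The function name `smote_oversample`, its parameters, and the toy data are assumptions made for illustration only, and the sketch omits the cleaning steps (ENN, Tomek links) that SMOTE-ENN and SMOTE-Tomek apply after oversampling.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Up to k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and its neighbour.
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up data: 3 minority samples against 8 majority samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
majority_count = 8
new_points = smote_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

After this step the minority class has as many samples as the majority class. Because every synthetic point lies between two existing minority samples, the new samples stay inside the region the minority class already occupies.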
