An Ensemble Machine Learning Approach for Forecasting Credit Risk of Loan Applications
C. L. Perera, S. C. Premaratne
- Computer Science Applications
- Control and Systems Engineering
The business environment in Sri Lanka has become more competitive with the development of the financial sector and the spread of the COVID-19 pandemic, and the number of organizations and individuals applying for loans has increased. Financial institutions follow lengthy authentication procedures, yet these give no assurance that the chosen applicant is the right one. This study therefore proposes a methodology for assessing the credit risk associated with loans, to support better lending decisions in the future. An Exploratory Data Analysis was performed to provide insights into the data. The study evaluates customer profiles, based on customers' demographic and geographical data, to forecast the credit risk of loans using Machine Learning (ML) algorithms. Model performance was then assessed using evaluation metrics. The Stacking Ensemble outperformed the other techniques, with the highest training and test accuracies of 0.99 and 0.78, respectively. The novelty of this study lies in its comprehensive data collection from a leading finance institution in Sri Lanka. The study highlights the importance of the choice of features, ML techniques, hyperparameters and evaluation criteria. In addition, a novel ML technique, voting-based ensemble learning, was proposed to enhance performance.
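The abstract does not say how the voting-based ensemble combines its base models or which base models it uses. As a minimal sketch only, the following assumes hard (majority) voting over binary risk labels and scores the result with plain accuracy. The function names, the label encoding (1 = high risk, 0 = low risk) and the toy predictions are all hypothetical and are not taken from the study's data or results. Note that stacking, which the study reports as the best performer, differs from voting in that it trains a further model on the base models' outputs.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> list[int]:
    """Combine per-model label predictions by hard (majority) voting.

    predictions[m][i] is the label model m assigns to applicant i.
    On a tie, the label that appears first among the models' votes wins.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict the same number of samples")
    combined = []
    for votes in zip(*predictions):  # one tuple of votes per applicant
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: 1 = high credit risk, 0 = low credit risk.
y_true = [1, 0, 1, 0, 0]
model_a = [1, 0, 1, 1, 0]
model_b = [1, 1, 0, 1, 0]
model_c = [0, 0, 1, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print("ensemble predictions:", ensemble)
print("ensemble accuracy:", accuracy(y_true, ensemble))
```

Applying the same accuracy function separately to training and held-out test predictions gives the two figures that the abstract compares; a large gap between them is usually worth investigating as possible overfitting.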