DOI: 10.3390/jrfm19070465 ISSN: 1911-8074

Deep Learning for Credit Risk Prediction in Fintech Lending: A Systematic Literature Review on Model Architectures, Imbalanced Data Handling, and Research Agenda

Moch Panji Agung Saputra, Sukono, Riaman, Alit Kartiwa, Masnita Misiran, Alim Jaizul Wahid

This article presents a systematic literature review of deep learning for credit risk prediction in fintech lending, with particular emphasis on model architectures, imbalanced data handling techniques, and the mathematical foundations underpinning these methods. The review draws on open-access scientific publications retrieved from three complementary databases: Scopus, IEEE Xplore Digital Library, and Web of Science Core Collection. Following the PRISMA 2020 protocol, 30 publications were selected through a multi-stage screening process involving deduplication across databases, temporal filtering (2015–2026), and thematic eligibility assessment. The data were analyzed with Python-based bibliometric and network analysis tools that replicate the functionality of R Bibliometrix and VOSviewer. The results indicate sustained growth in research on deep learning applications for fintech credit risk, with a compound annual growth rate (CAGR) of approximately 29.2% and an average of 27.17 citations per document. Segmenting the conceptual landscape revealed four interconnected thematic clusters: (1) peer-to-peer lending and default prediction architectures; (2) explainability and XAI-based methods for credit scoring; (3) imbalanced data and hybrid deep-learning frameworks; and (4) credit risk assessment combining deep learning and statistical approaches.
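For readers unfamiliar with the growth metric reported above, the following is a minimal sketch of how a compound annual growth rate is conventionally computed from a first and last yearly publication count. The function name `cagr` and the example counts are hypothetical and are not taken from the review's data.

```python
def cagr(first_count: float, last_count: float, periods: int) -> float:
    """Compound annual growth rate between two counts.

    periods is the number of yearly intervals between the two
    observations (for example, 2015 to 2017 is 2 periods).
    """
    if first_count <= 0 or periods <= 0:
        raise ValueError("first_count and periods must be positive")
    return (last_count / first_count) ** (1.0 / periods) - 1.0


# Hypothetical illustration: 100 documents growing to 121 over 2 periods.
example_rate = cagr(100, 121, 2)
print(f"{example_rate:.1%}")
```

The metric depends only on the first and last observations, so it smooths over year-to-year fluctuations in between.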
The review identifies the following research areas in the deep learning-based transformation of fintech credit risk prediction: (1) feedforward deep neural networks and attention-based architectures (LSTM, CNN) as dominant predictive engines; (2) hybrid deep ensemble and deep-boosting frameworks (e.g., LightGBM-Attention, GBDT-Deep FFM) as emerging high-performance paradigms; (3) specialized techniques for imbalanced data handling, including ADASYN, SMOTE, cost-sensitive learning, and balanced stratified prioritized experience replay, as critical methodological frontiers; and (4) transformer-based and XAI-integrated architectures as the emerging frontier. The originality of this article lies in its explicit focus on the mathematical and methodological challenges of deep learning-based credit risk prediction in fintech lending. It provides an actionable research agenda that addresses class imbalance, uncertainty quantification, loss function design, concept drift, and regulatory compliance. The findings offer insights for scholars, practitioners, and policymakers, and outline a concrete roadmap for developing more accurate, robust, and explainable credit risk models in the rapidly evolving fintech ecosystem.
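To make one of the imbalanced-data techniques named above concrete, the following is a minimal sketch of cost-sensitive learning through a class-weighted binary cross-entropy loss. It is an illustration only, not the method of any reviewed study. The helper names `balanced_class_weights` and `weighted_bce` are hypothetical, and the weighting rule n / (2 * n_class) is an assumed, commonly used "balanced" heuristic.

```python
import math


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (2 * n_class).

    Assumes binary labels where 1 marks a default (the minority class).
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


def weighted_bce(labels, probs, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each sample."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        sample_loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * sample_loss
    return total / len(labels)


# Hypothetical toy portfolio: 9 repaid loans (0) and 1 default (1).
labels = [0] * 9 + [1]
# A model that predicts a 10% default probability for every loan.
probs = [0.1] * 10

weights = balanced_class_weights(labels)
uniform = {0: 1.0, 1: 1.0}

print("class weights:", weights)
print("unweighted loss:", weighted_bce(labels, probs, uniform))
print("weighted loss:  ", weighted_bce(labels, probs, weights))
```

With these weights, the single default contributes as much total weight as the nine repaid loans combined, so a model that underestimates the default is penalized more heavily than under the unweighted loss.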
