|
The relevance of developing, implementing, and using a digital customer scoring model for credit risk management is beyond doubt. Modern information technologies for customer credit scoring can take over most of the technical work of collecting and processing the initial data and of running machine learning algorithms. The volume of banking information about clients will increase in the next few years, and the requirements for the quality and speed of its processing will become more stringent.
This study compares the performance of ensemble algorithms, namely random forest, XGBoost, LightGBM, CatBoost, and Stacking, in terms of area under the curve (AUC), Brier score (BS), and model runtime. In addition, three popular base classifiers, namely decision tree (DT), logistic regression (LR), and linear discriminant analysis (LDA), are used as benchmarks in credit scoring.
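As an illustration of the three evaluation criteria, the following is a minimal sketch in plain Python, not the implementation used in the study. The function names `brier_score`, `roc_auc`, and `timed` are hypothetical, and the four-client sample at the end is made-up toy data chosen only to show the calls.

```python
import time
from typing import Callable, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    This pairwise form is quadratic in the sample size, which is fine for a sketch.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def timed(fn: Callable, *args) -> Tuple[object, float]:
    """Run fn(*args) and return its result together with elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Toy data (assumption): 1 = default, 0 = no default, with predicted default probabilities.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

auc_value, auc_seconds = timed(roc_auc, y_true, y_prob)
bs_value = brier_score(y_true, y_prob)
print(f"AUC={auc_value:.3f}  BS={bs_value:.4f}  runtime={auc_seconds:.6f}s")
```

A higher AUC indicates better ranking of defaulters above non-defaulters, while a lower Brier score indicates better-calibrated probabilities. In a model comparison, `timed` would wrap the training and prediction calls of each classifier rather than the metric itself.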
Experimental evidence shows that ensemble learning outperforms the individual base classifiers. Among the ensemble models, Stacking stands out from the rest.
Keywords: ensemble methods, stacking, credit scoring, classification
|