Abstract
In credit scoring, the data are often class-imbalanced. However, traditional cost-sensitive learning methods rarely consider the varying costs among samples. Moreover, previous studies have limitations such as a poor fit to real-world business needs and limited model interpretability. To address these issues, this paper proposes a novel example-dependent cost-sensitive learning based selective deep ensemble (ECS-SDE) model for customer credit scoring, which integrates example-dependent cost-sensitive learning with the interpretable TabNet (attentive interpretable tabular learning) and GMDH (group method of data handling) deep neural networks. Specifically, we use TabNet, which excels at handling tabular data, as the base classifier and optimize its performance on imbalanced data with an example-dependent cost loss function. Next, we design a GMDH based on an example-dependent cost-sensitive symmetric criterion to selectively and deeply integrate the base classifiers. This approach reduces the redundancy of base models found in traditional ensemble strategies and enhances classification performance. Experimental results show that the ECS-SDE model outperforms six cost-sensitive models and five advanced deep ensemble models in overall credit-scoring performance, with significant advantages in the BS+, Save, and AUC metrics on four datasets. Furthermore, the ECS-SDE model provides strong interpretability, and detailed analysis reveals the key roles of various features in credit scoring.
Introduction
Global economic integration has created a more complex environment for financial institutions1. In particular, the rise in financial derivatives and consumer loans has increased risks for financial institutions2. Credit risk, arising from borrower defaults, is a primary concern for financial institutions3. While it is difficult to accurately predict whether a borrower will default in the future, effective credit risk scoring can significantly reduce potential default losses for financial institutions4. Thus, the identification of suitable measures to mitigate losses incurred by customer defaults has emerged as a critical concern in the financial industry.
Customer credit scoring is an effective tool for evaluating borrowers’ credit risk. Credit scoring is commonly regarded as a binary classification task5,6,7, which classifies borrowers into two categories: “good credit” or “poor credit.” Most of the currently widely used credit scoring models are based on cost-insensitive learning methods, which aim to minimize the number of misclassifications while assuming that the cost of all misclassifications is the same8. However, this assumption does not fully consider the actual business objectives of financial institutions, which are to minimize operating costs9. For financial institutions, reducing the potential costs associated with misclassification is often more important than merely improving classification accuracy. As a result, cost-sensitive learning has emerged, aiming to minimize total classification costs by balancing management expenses and loss expenses.
Currently, many studies have applied cost-sensitive learning methods to credit scoring10, but most methods assume that the classification cost for each class (e.g., good credit vs. poor credit) is constant, which is referred to as class-dependent cost-sensitive (CCS) learning11. However, the limitation of CCS is that it only focuses on the misclassification cost between different classes and primarily aims to improve the classification performance of the model, neglecting the need for cost minimization in the actual business operations of financial institutions12. In real-world customer credit scoring scenarios, the economic loss to financial institutions from lending to bad customers varies, because customers have different credit limits and economic conditions13. To address this issue, researchers have proposed example-dependent cost-sensitive (ECS) learning. Studies have shown that, compared to CCS, ECS methods demonstrate better performance in customer credit scoring14. This is because ECS methods account for cost differences between classes as well as between samples. In customer credit scoring, ECS models can accurately estimate the economic loss caused by misclassification, taking into account the varying credit conditions and economic situations of different customers. This helps better meet the needs of financial institutions and enhances the economic benefits of credit scoring.
Lenarcik and Piasta15 first introduced the concept of ECS while improving the probabilistic rough set algorithm. Based on the stage at which costs are introduced, ECS methods can be divided into three categories: introducing example-dependent costs before, during, or after model training8,16. (1) Methods that introduce example-dependent costs before training adjust sample weights according to their misclassification costs; common methods include cost-proportionate rejection sampling (CPRS)17 and cost-proportionate over-sampling (CPOS)18. CPRS retains or rejects samples with a probability proportional to their misclassification cost, while CPOS creates a new dataset by duplicating samples, with the frequency of duplication determined by their misclassification cost (see the sketch after this paragraph). (2) Methods that introduce example-dependent costs during training modify the loss function to directly optimize model performance; typical models include ECS logistic regression (LR)19, ECS decision trees (DT)8,9, and ECS support vector machines20. (3) Methods that introduce example-dependent costs after training primarily employ a cost-sensitive Bayesian minimum risk approach21,22, which combines the predicted probabilities from base classifiers with the example-dependent costs to minimize the overall expected risk. However, before-training approaches, which rely on the prior distribution of the training data, may lead to data bias or reduced model generalization21. After-training methods, in turn, depend on the base classifiers, and if these fail to capture cost-sensitive information during training, optimization may be limited. In contrast, by incorporating the ECS mechanism during training, the model can more directly optimize the cost-sensitive objective, thereby improving its focus on high-cost samples. Therefore, this paper studies the ECS method that introduces example-dependent costs during the training phase.
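As a concrete illustration, the two before-training schemes can be sketched in a few lines of NumPy (a simplified sketch; the scaling choices of dividing by the maximum or minimum cost are illustrative conventions, not prescribed by the cited papers):

```python
import numpy as np

def cost_proportionate_rejection_sampling(X, y, costs, rng=np.random.default_rng(0)):
    """CPRS sketch: keep each sample with probability cost_i / max(cost)."""
    keep = rng.random(len(costs)) < costs / costs.max()
    return X[keep], y[keep]

def cost_proportionate_oversampling(X, y, costs):
    """CPOS sketch: replicate each sample proportionally to its cost."""
    reps = np.maximum(1, np.round(costs / costs.min()).astype(int))
    idx = np.repeat(np.arange(len(y)), reps)
    return X[idx], y[idx]
```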
Most of the above studies focus on improving a single classification model. However, single models are prone to overfitting, which can affect generalization ability. To solve this problem, researchers have begun to enhance the performance of ECS models through ensemble learning. For example, Bahnsen et al.23 proposed an ECS classification framework that combines ECS decision trees (CSDT) using four different ensemble methods: random forest (RF), bagging, and their variants random patches (RP) and pasting. The results showed that the CSDT model with the RP ensemble method performed best on five datasets across four applications: credit card fraud detection, customer churn prediction, credit scoring, and marketing. Zelenkov24 used DTs as base classifiers and introduced the ECS method into the AdaBoost model in three different ways: inside the exponent, outside the exponent, and both inside and outside the exponent, constructing an ECS AdaBoost ensemble model. Experiments showed that this model outperformed other ECS models on five datasets in the banking marketing and insurance fraud domains. Bhargava et al.25 proposed an ECS stacking ensemble framework for predicting potential tax defaulters. This framework consisted of two stages: the first stage trained multiple cost-insensitive classifiers, and the second stage used CSDT, RF, artificial neural networks (ANN), and a bagging ensemble classifier based on CSDT as meta-models, with the outputs of the first-stage models used as inputs to train the meta-models. Experimental results showed that this framework not only outperformed traditional ECS classifiers but also significantly reduced costs.
In recent years, deep neural networks (DNN)26,27,28,29,30,31 have demonstrated outstanding performance in various fields, showing significant potential in credit-scoring tasks. Mehta et al.32 proposed an ECS deep neural network (ECS-DNN) by modifying the loss function to incorporate ECS. Experimental results indicated that this model had significant advantages in terms of cost savings. However, traditional DNN models typically require extensive data preprocessing when dealing with complex tabular data. In contrast, the attentive interpretable tabular deep neural network (TabNet)33 is specifically designed for tabular data: it can be applied directly to raw data and achieves high prediction accuracy. As a result, researchers have attempted to introduce TabNet to credit-scoring tasks. For instance, Cai et al.34 proposed a deep ensemble model for credit card fraud detection, which used TabNet as the base classifier and XGBoost for the ensemble; experimental results showed that the proposed model outperformed the comparative models across multiple evaluation metrics. Zhang et al.35 proposed a TabNet-based credit fraud detection model, which significantly outperformed traditional XGBoost and Naive Bayes algorithms. Lee et al.36 used ensemble techniques such as LightGBM, XGBoost, RF, and CatBoost to integrate multiple TabNets and successfully applied them to credit card default prediction. Despite the significant success of TabNet in credit-scoring tasks, most existing studies focus on performance enhancement and do not consider ECS. In addition, model interpretability is particularly important in financial credit scoring. Since TabNet combines the interpretability of tree-based models with the learning ability of DNNs, it has the potential to play a greater role in this field.
However, after careful analysis, we find that the existing studies still have the following four limitations: (1) Most cost-sensitive learning-based deep learning models still adopt CCS methods, and research on ECS techniques is relatively limited14. Only one study32 has applied ECS in single DNN modeling; (2) Existing ECS ensemble models for credit scoring integrate traditional machine learning-based base classifiers, and no studies have explored ECS deep ensemble models that use deep learning models as base classifiers. In addition, existing ensemble models typically combine the predictions of all base classifiers, which may lead to redundancy. Using deep learning models as base classifiers and selecting an appropriate model subset for the ensemble, i.e., selective deep ensemble, may improve model performance; (3) Existing models that introduce the ECS mechanism during training typically adjust the loss function to account for example-dependent cost. While this adjustment reduces misclassification costs, it may compromise performance on traditional accuracy-based metrics; (4) Current deep learning algorithms considering ECS in credit scoring are black-box models, with low transparency and poor interpretability, limiting their practical application.
To address the above limitations, this paper proposes an example-dependent cost-sensitive learning based selective deep ensemble (ECS-SDE) model for customer credit scoring. First, an example-dependent cost matrix is constructed for the raw data, and the processed dataset is randomly sampled several times to generate ECS training subsets. Second, we construct example-dependent cost-sensitive TabNet (ECS TabNet) base classifiers and train multiple differentiated base classifiers using the training subsets. Finally, we propose an example-dependent cost-sensitive GMDH (ECS GMDH) neural network that uses the selection mechanism of GMDH for the selective deep ensemble. To verify the performance of the proposed model, this paper introduces five evaluation metrics and conducts empirical analysis on four datasets. The experimental results show that, compared to three ECS models, three CCS models, and five advanced deep ensemble models, the ECS-SDE model demonstrates better overall performance in customer credit scoring and has stronger model interpretability.
The theoretical contributions of this paper are as follows: (1) We are the first to apply ECS techniques in constructing deep ensemble models for customer credit scoring by combining the interpretable TabNet and GMDH deep neural networks; (2) We introduce ECS technique to the TabNet model, proposing a new TabNet deep learning model. This model is trained by embedding an enhanced ECS-based loss function, which significantly improves its performance when dealing with imbalanced data; (3) We propose a novel example-dependent cost-sensitive symmetric criterion (ECS-SC) for the GMDH, which accounts for the cost differences between samples and aims to minimize the total cost. The ECS-SC overcomes the limitation of traditional criteria that assign equal misclassification costs to all samples, making it more feasible for the practical needs of credit scoring. Additionally, we develop an ECS-SC-based GMDH model for selective deep ensemble learning. This method resolves base model redundancy in traditional ensemble strategies, enhancing classification performance; (4) We conduct a comparative analysis using four credit-scoring datasets, comparing three ECS models, three CCS models, and five advanced deep ensemble models. The results show that the ECS-SDE model achieves superior overall performance in customer credit scoring and offers strong interpretability.
The remainder of this paper is structured as follows. Section 2 briefly reviews the relevant theoretical foundations. Section 3 provides a detailed description of the basic concept and modeling steps of the ECS-SDE model. In Sect. 4, we present the experimental design, including dataset information, experimental setup, and model evaluation metrics, and we analyze the experimental results. Finally, in Sect. 5, we present the conclusions of this paper and suggest possible future research directions.
Related works
Class dependent cost sensitive learning
In the real world, misclassification of different classes may have different consequences. In credit scoring, it is often observed that misclassifying a customer with poor credit as having good credit causes more severe economic losses than misclassifying a customer with good credit as poor credit. Therefore, many studies use CCS methods that assign different costs to the misclassification of each class. Classification costs are represented by a cost matrix, where the elements within the cost matrix are the same for all samples in the same class. Credit scoring can be represented as a binary classification problem, where samples are either in the negative class or in the positive class. To quantify the cost of misclassification, a cost matrix17 is used, as shown in Table 1:
where \({C_{TP}}\) is the cost of correctly classifying a positive sample as positive. \({C_{FP}}\) is the cost of incorrectly classifying a negative sample as positive. \({C_{FN}}\) is the cost of wrongly classifying a positive sample as negative. \({C_{TN}}\) is the cost of correctly classifying a negative sample as negative.
In recent years, CCS methods have become one of the main approaches to addressing class-imbalanced problems. Many researchers have combined CCS techniques with deep learning to solve the challenges that classification models face on imbalanced datasets. For example, Yotsawat et al.10 proposed a class-dependent cost-sensitive neural network ensemble model (CSNNE). This model generated multiple differentiated cost-sensitive neural networks using different class weights and ensembled them through majority voting. Experiments showed that CSNNE was suitable for handling imbalanced datasets and demonstrated good performance on several credit-scoring datasets. Geng and Luo37 proposed an adaptive class-dependent cost-sensitive convolutional neural network ensemble model (CSCNN). This model adaptively updated the weights of misclassification costs based on the imbalance distribution of the entire training set and local training subsets. Experimental results showed that CSCNN performed well on all evaluation metrics. Similarly, the class-dependent cost-sensitive convolutional neural network model (CCS-CNN) proposed by Vimala et al.38 enhanced the classification performance on minority-class samples by adjusting the classifier's decision threshold, achieving good classification results on imbalanced datasets and outperforming existing methods across multiple metrics.
TabNet deep neural network
TabNet33 is a deep neural network designed for tabular data, proposed by Google in 2021. It combines the interpretability of tree models with the high predictive accuracy of DNNs. TabNet uses an end-to-end learning approach to directly learn features from raw data, reducing preprocessing time. It also provides feature importance through a sequential attention mechanism, enhancing model interpretability. TabNet has been widely applied in fields such as healthcare, insurance, and environmental studies5,39,40,41.
TabNet constructs a sequential multi-step neural network architecture, which mainly consists of a feature transformer module and an attentive transformer module. The input for each decision step is a d-dimensional feature matrix \(a \in {{\mathbb{R}}^d}\). First, the initial features pass through a batch normalization (BN) layer before entering the feature transformer module. This module is composed of a fully connected layer, a BN layer, and a gated linear unit layer, applied sequentially to process the features into more useful representations. In addition, to accelerate network convergence and stabilize the training process, \(momentum\) is introduced as a hyperparameter in the BN layer. This ensures that the mean and variance in the BN layer update smoothly, thereby reducing instability caused by fluctuations across batches. In each decision step j, the features \({a_{j - 1}}\) produced by the previous step are input into the current step. After processing through the feature transformer module \({f_j}( \cdot )\), the output is split into two parts: \([{d_j},{a_j}]={f_j}({M_{j - 1}} \cdot {a_{j - 1}})\), where \({M_{j - 1}}\) is the mask obtained from the previous step, and \({d_j} \in {{\mathbb{R}}^{{N_d}}}\) is the decision-layer feature representation output by the feature transformer module, with \({N_d}\) the dimension of the decision-layer features used to generate the final prediction. The other part, \({a_j} \in {{\mathbb{R}}^{{N_a}}}\), is the feature representation used for feature selection in the attentive transformer module, where \({N_a}\) is the dimension of the attention-layer features. The attentive transformer module selects important features. Let \({h_j}\) be the combination of a fully connected layer and a BN layer, which performs a linear transformation and normalization of \({a_{j - 1}}\) to obtain the intermediate representation \({h_j}({a_{j - 1}})\). The attention module combines the prior weight \({P_{j - 1}}\) from the previous step with the current \({h_j}({a_{j - 1}})\) to obtain a sparse mask \({M_j} \in {{\mathbb{R}}^d}\) through the \(Sparsemax\) activation function:
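\[{M_j}=Sparsemax\left( {{P_{j - 1}} \cdot {h_j}({a_{j - 1}})} \right)\]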
where \(Sparsemax\) is a sparse activation function used to select a small number of important features. The prior weights \({P_{j - 1}}\) control how frequently the model selects each feature. They are calculated from the previous masks and a relaxation factor \(\gamma\) as follows: \({P_{j - 1}}=\prod\nolimits_{{k=1}}^{{j - 1}} {(\gamma - {M_k})}\), where k is the step number \((k=1,2,\ldots,j - 1)\), and \(\gamma\) is a hyperparameter that controls the flexibility of feature selection. When \(\gamma=1\), each feature can be used in at most one decision step. As \(\gamma\) increases, the likelihood of reusing the same feature across multiple steps increases, relaxing the constraints on feature selection at each step and thereby enhancing the model's flexibility. Then, the new mask \({M_j}\) and the new features \({a_j}\) generated at the j-th step are passed to the next decision step. This process is repeated until the preset number of steps \({N_{step}}\) is reached or a stopping condition is met.
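The attentive step can be sketched as follows (a simplified illustration: BN is omitted and \(h_j\) is stood in for by a single linear map; all function names here are ours):

```python
import numpy as np

def sparsemax(z):
    """Sparsemax (Martins & Astudillo, 2016): Euclidean projection onto the simplex."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum          # coordinates kept in the support
    k_z = k[support][-1]
    tau = (cumsum[support][-1] - 1) / k_z        # threshold shifting z onto the simplex
    return np.maximum(z - tau, 0.0)

def attentive_step(a_prev, P_prev, W, b, gamma=1.5):
    """One simplified attentive-transformer step: sparse mask M_j and prior update."""
    h = W @ a_prev + b                           # stand-in for h_j(a_{j-1}) (FC + BN)
    M = sparsemax(P_prev * h)                    # sparse feature mask M_j
    P_new = P_prev * (gamma - M)                 # prior P_j for the next step
    return M, P_new
```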
Based on the feature mask at each step, a local importance score can be obtained for each feature. The local importance score of the i-th feature at the j-th step is \({S_{i,j}}={\eta _j}{M_{i,j}}\), where \({M_{i,j}}\) is the mask value of the i-th feature at the j-th step, and \({\eta _j}\) is the weight factor of the j-th step. Finally, by aggregating the masks and weight factors over all steps, the global importance score of the i-th feature is obtained:
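\[{S_i}=\sum\nolimits_{{j=1}}^{{{N_{step}}}} {{\eta _j}{M_{i,j}}}\]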
At the same time, by aggregating the outputs of all decision layers, the final decision output \({d_{final}}\) is expressed as: \({d_{final}}=\sum\nolimits_{{j=1}}^{{{N_{step}}}} {ReLU({d_j})}\), where \(ReLU\) is the activation function applied to the decision-layer outputs. Finally, the aggregated decision output \({d_{final}}\) is mapped to the model's output space through a fully connected layer to generate the final prediction. In binary classification problems, TabNet typically uses the binary cross-entropy loss function for training, which is expressed as:
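\[Loss= - \left[ {y\log \hat {y}+(1 - y)\log (1 - \hat {y})} \right]\]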
where y is the true value, and \(\hat {y}\) is the predicted value.
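The two aggregation formulas above can be sketched in a few lines (a simplified illustration; normalizing the importances to sum to one is an additional convention, not stated above):

```python
import numpy as np

def aggregate_tabnet_outputs(d_steps, masks, etas):
    """Aggregate per-step decision outputs d_j and masks M_j into
    the final decision d_final and global feature importances S."""
    d_final = sum(np.maximum(d, 0.0) for d in d_steps)   # sum of ReLU(d_j)
    S = sum(eta * M for eta, M in zip(etas, masks))      # sum of eta_j * M_{i,j}
    return d_final, S / S.sum()                          # importances normalized to 1
```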
GMDH neural network
GMDH neural network is a self-organizing inductive modeling technique42, commonly used for modeling and identifying complex systems. Let \(X=({x_1},{x_2},\ldots,{x_n})\) and y represent the input and output variables, respectively. The modeling process of GMDH is as follows:
First, the input dataset \({D_{input}}\) is randomly divided into a learning set A and a selection set B. Typically, a discrete Kolmogorov-Gabor (K-G) polynomial is used to establish the general relationship between the input and output variables:
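\[y={w_0}+\sum\limits_{{i=1}}^{n} {{w_i}{x_i}} +\sum\limits_{{i=1}}^{n} {\sum\limits_{{j=1}}^{n} {{w_{ij}}{x_i}{x_j}} } +\sum\limits_{{i=1}}^{n} {\sum\limits_{{j=1}}^{n} {\sum\limits_{{k=1}}^{n} {{w_{ijk}}{x_i}{x_j}{x_k}} } } + \cdots\]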
where \({w_0}\), \({w_i}\), \({w_{ij}}\), \ldots are the weights. Next, an initial input model set \(V=\{ {v_1}={x_1},{v_2}={x_2},\ldots,{v_n}={x_n}\}\) is created. The initial models in V are then combined pairwise using a transfer function \(f( \cdot )\), generating the first layer of \({n_1}=C_{n}^{2}\) intermediate candidate models in total. Then, the ordinary least squares (OLS) method is used to estimate the parameters of the candidate models on set A, and the external criterion values of the candidate models are calculated on set B. The candidate models are ranked by these criterion values, and the optimal \({F_1}( \leqslant C_{n}^{2})\) models are selected. To avoid losing important information too early, the initial models are included in the intermediate candidate model set at each layer43. That is, the selected candidate models are combined with the n initial models and once again pairwise combined using the transfer function, generating the second layer of \(C_{{{F_1}+n}}^{2}\) candidate models, from which the optimal \({F_2}\) models are selected. This process is repeated layer by layer to generate intermediate candidate models, and continues until a termination criterion is met, i.e., the external criterion value first decreases and then increases as the complexity of the candidate models grows44. When the external criterion value reaches its minimum, the optimal complexity model \({Y^*}\) with m layers is obtained. The structure of the GMDH network is shown in Fig. 1.
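As an illustration, one GMDH layer can be sketched as follows (simplified: a plain sum-of-squared-errors on set B serves as the external criterion, and helper names are ours):

```python
import numpy as np
from itertools import combinations

def fit_candidate(vi, vj, y, maskA):
    """OLS fit of f(vi, vj) = w0 + w1*vi + w2*vj + w3*vi*vj on learning set A."""
    X = np.column_stack([np.ones_like(vi), vi, vj, vi * vj])
    w, *_ = np.linalg.lstsq(X[maskA], y[maskA], rcond=None)
    return X @ w                                       # predictions on all samples

def gmdh_layer(V, y, maskA, F):
    """One GMDH layer: combine models in V pairwise, fit on A, rank on B, keep top F."""
    maskB = ~maskA
    scored = []
    for i, j in combinations(range(V.shape[1]), 2):
        pred = fit_candidate(V[:, i], V[:, j], y, maskA)
        crit = np.sum((y[maskB] - pred[maskB]) ** 2)   # external criterion on set B
        scored.append((crit, pred))
    scored.sort(key=lambda t: t[0])
    best = scored[:F]
    return np.column_stack([p for _, p in best]), best[0][0]
```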
The most commonly used external criterion for GMDH is the symmetric regularity criterion (SRC). This criterion primarily evaluates the fitting accuracy of the model. Its mathematical expression is as follows:
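\[{d^2}({D_{input}})={\Delta ^2}(A)+{\Delta ^2}(B)=\left\| {{y_B} - {{\hat {y}}_B}(A)} \right\|^{2}+\left\| {{y_A} - {{\hat {y}}_A}(B)} \right\|^{2}\]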
where \({y_B}\) is the actual output of set B, and \({\hat {y}_B}(A)\) is the predicted output of set B by the model constructed on set A. Similarly, \({y_A}\) is the actual output of set A, and \({\hat {y}_A}(B)\) is the predicted output of set A by the model constructed on set B. \({\Delta ^2}(A)\) is the error on set B by the model constructed on set A, \({\Delta ^2}(B)\) is the error on set A by the model constructed on set B, and \({d^2}({D_{input}})\) is the total error on \({D_{input}}\).
However, in the SRC, all samples are assigned the same misclassification cost. In credit scoring, in contrast, different classes often have different misclassification costs. Therefore, in our previous research45, we combined CCS with SRC and proposed a class-dependent cost-sensitive symmetric regularity criterion (CS-SRC):
where \({n_{11}}\) and \({n_{12}}\) are the numbers of positive and negative samples in set B, and \({n_{21}}\) and \({n_{22}}\) are the numbers of positive and negative samples in set A, respectively. Assume that the misclassification cost of each negative sample is 1, while the misclassification cost of each positive sample is \(\varepsilon\). \(\operatorname{Cost} (A)\) is the total misclassification cost on set B of the model constructed on set A, \(\operatorname{Cost} (B)\) is the total misclassification cost on set A of the model constructed on set B, and \(\operatorname{Cost} ({D_{input}})\) is the total misclassification cost on \({D_{input}}\).
Methods
Basic framework
Existing credit scoring models often use traditional CCS techniques. However, these methods fail to account for cost differences between samples and rarely consider practical business needs or model interpretability. To address these issues, this paper proposes an ECS-SDE model for customer credit scoring.
Let \(D=\{ ({x_i},{y_i})\} _{{i=1}}^{N}\) be a dataset containing N samples, where \({x_i} \in {{\mathbb{R}}^n}\) is an n-dimensional vector and \({y_i} \in \{ 0,1\}\) is the class label of \({x_i}\). \({D_{maj}}\) and \({D_{min}}\) are the majority and minority class samples in D, respectively. The modeling process of the ECS-SDE model mainly consists of three phases:
Phase I: construction of the example-dependent cost matrix and ECS training subset
First, based on the example-dependent cost matrix in the credit scoring domain, this paper calculates the cost matrix \({C_i}\) for each sample \({x_i}\). Next, the cost matrix is added to the dataset D to create the new dataset \({D^\prime }=\{ ({x_i},{C_i},{y_i})\} _{{i=1}}^{N}\). Then, \({D^\prime }\) is randomly divided into a training set \({D_{train}}\) and a test set \({D_{test}}\). Finally, several random samplings are performed on \({D_{train}}\) to generate the ECS training subset \({D_{sub}}\).
Phase II: training of ECS TabNet base classifiers
First, this paper constructs the ECS TabNet base classifier by embedding a new loss function. Then, M differentiated ECS TabNet base classifiers, denoted as \(\{ {T_1},{T_2},\ldots,{T_M}\}\), are trained on the ECS training subsets \({D_{sub}}\). The prediction result of the j-th base classifier \({T_j}\) on the j-th ECS training subset is denoted as \(\hat {y}_{j}^{\prime }=\{ \hat {y}_{ij}^{\prime }\} _{{i=1}}^{N}\) \((j=1,2,\ldots,M)\). Thus, the prediction results of all base classifiers on the training subsets are \(\hat {y}_{1}^{\prime }\), \(\hat {y}_{2}^{\prime }\), \ldots, \(\hat {y}_{M}^{\prime }\).
Phase III: design of an ECS GMDH for the selective deep ensemble
First, this paper proposes a new ECS-SC external criterion to construct the ECS GMDH neural network. Then, the ECS GMDH is used to perform a selective deep ensemble on the prediction results of the M ECS TabNet base classifiers, ultimately yielding the credit-scoring result. The framework of ECS-SDE is shown in Fig. 2.
Construction of the example dependent cost matrix and ECS training subset
In ECS learning, different samples correspond to different cost matrices. For customer credit scoring, this paper uses the example-dependent cost matrix proposed by Bahnsen et al.19 (Table 2) and applies the corresponding calculation formula (Eq. 9) to derive the cost matrix for all samples.
Here, \(\operatorname{Cost} ({y_i},{\hat {y}_i})={y_i}({\hat {y}_i}{C_{T{P_i}}}+(1 - {\hat {y}_i}){C_{F{N_i}}})+(1 - {y_i})({\hat {y}_i}{C_{F{P_i}}}+(1 - {\hat {y}_i}){C_{T{N_i}}})\), where \({y_i}\) is the actual output of sample \({x_i}\) and \({\hat {y}_i}\) is its predicted output, and \({\text{Cost}}({D^\prime })=\sum\nolimits_{{i=1}}^{N} {\operatorname{Cost} ({y_i},{\hat {y}_i})}\) is the total misclassification cost over all samples. When \({y_i}=1\), the cost is \({\hat {y}_i}{C_{T{P_i}}}+(1 - {\hat {y}_i}){C_{F{N_i}}}\); when \({y_i}=0\), the cost is \({\hat {y}_i}{C_{F{P_i}}}+(1 - {\hat {y}_i}){C_{T{N_i}}}\).
Next, the dataset \({D^\prime }\) is randomly split into a training set \({D_{train}}\) and a test set \({D_{test}}\). Finally, multiple random samplings are performed on \({D_{train}}\) to generate ECS training subsets \({D_{sub}}\).
Training of ECS TabNet base classifiers
Traditional TabNet deep neural networks treat all samples equally during training, which may lead to underestimating the importance of minority-class samples, especially in class-imbalanced problems. To address this, we replace the traditional loss function (Eq. 3) with an enhanced example-dependent cost function (Eq. 9), resulting in an improved loss function.
Specifically, to address class imbalance, this paper considers the importance of minority-class samples in credit scoring. According to Elkan18, in credit scoring, the misclassification costs of minority-class samples can be up to 5 times higher than those of majority-class samples. Therefore, the new loss function calculates the example-dependent costs of the two classes separately, multiplying the cost of minority-class samples by 5 to place greater emphasis on them during training. The new loss function is as follows:
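\[Los{s_{cost}}( \cdot )={\operatorname{Cost} _{maj}}( \cdot )+5 \times {\operatorname{Cost} _{min}}( \cdot )\]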
where \([C_{{F{P_i}}}^{{maj}},C_{{F{N_i}}}^{{maj}},C_{{T{P_i}}}^{{maj}},C_{{T{N_i}}}^{{maj}}]\) is the cost matrix of majority-class samples, and \({\operatorname{Cost} _{maj}}( \cdot )\) is the misclassification cost generated by majority-class samples. Similarly, \([C_{{F{P_i}}}^{{min}},C_{{F{N_i}}}^{{min}},C_{{T{P_i}}}^{{min}},C_{{T{N_i}}}^{{min}}]\) is the cost matrix of minority-class samples, and \({\operatorname{Cost} _{min}}( \cdot )\) is the misclassification cost generated by minority-class samples. \(Los{s_{cost}}( \cdot )\) is the total misclassification cost.
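As an illustration, such a loss can be sketched in PyTorch (a simplified sketch: hard labels are replaced by predicted probabilities to keep the loss differentiable, and the mean is taken rather than the sum; both choices are ours, not details given in the paper):

```python
import torch

def ecs_loss(y_true, y_prob, C_FP, C_FN, C_TP, C_TN, is_min, w_min=5.0):
    """Example-dependent cost loss sketch: per-sample cost of Eq. 9,
    with minority-class samples (is_min, a bool mask) weighted by w_min."""
    cost = (y_true * (y_prob * C_TP + (1 - y_prob) * C_FN)
            + (1 - y_true) * (y_prob * C_FP + (1 - y_prob) * C_TN))
    weights = torch.where(is_min, torch.full_like(cost, w_min), torch.ones_like(cost))
    return (weights * cost).mean()
```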
Next, we build the ECS TabNet classifier by embedding the new loss function. We train on the M ECS training subsets \({D_{sub}}\) to generate M differentiated ECS TabNets, denoted as \(\{ {T_1},{T_2},\ldots,{T_M}\}\). Let the prediction result of the j-th base classifier \({T_j}\) on the j-th training subset be \(\hat {y}_{j}^{\prime }=\{ \hat {y}_{ij}^{\prime }\} _{{i=1}}^{N}\) \((j=1,2,\ldots,M)\). Thus, the prediction results of all base classifiers on the training subsets are denoted as \(\hat {y}_{1}^{\prime }\), \(\hat {y}_{2}^{\prime }\), \ldots, \(\hat {y}_{M}^{\prime }\).
Design of an ECS GMDH for selective deep ensemble
First, let the predicted outputs of the base classifiers be \({\hat {Y}^\prime }=(\hat {y}_{1}^{\prime },\hat {y}_{2}^{\prime },\ldots,\hat {y}_{M}^{\prime })\) and the actual outputs be y; these serve as the input and output vectors of the ECS GMDH neural network, respectively, forming a new input dataset \({D_{input}}=({\hat {Y}^\prime },y)\). Then, \({D_{input}}\) is randomly split into a model learning set A and a model selection set B. Next, an initial model set \(V=\{ {v_1}=\hat {y}_{1}^{\prime },{v_2}=\hat {y}_{2}^{\prime },\ldots,{v_M}=\hat {y}_{M}^{\prime }\}\) is created. All initial models in V are pairwise combined using the transfer function \(f({v_i},{v_j})={w_0}+{w_1}{v_i}+{w_2}{v_j}+{w_3}{v_i}{v_j}\) (for \(i,j=1,2,\ldots,M\) with \(i \ne j\)) to generate the first layer of intermediate candidate models. It is important to note that in real-world credit scoring, due to operational cost constraints, companies can only manage the portion of customers most likely to reduce operating costs. Therefore, the question for companies is how much money the model can save them.
To achieve this goal, inspired by previous research45, we introduce the example-dependent cost function (Eq. 9) into the external criteria of GMDH and propose a novel criterion, the example-dependent cost-sensitive symmetric criterion (ECS-SC). Traditional SRC criterion selects models by minimizing overall classification error, assuming equal misclassification costs for all samples. In contrast, ECS-SC accounts for cost differences between samples and optimizes total cost, better aligning with the practical needs of financial institutions. Specifically, ECS-SC calculates the example-dependent costs for majority and minority-class samples separately, assigning higher weights to minority-class samples (we still set the weight to 5) to emphasize their importance in model selection. The ECS-SC is defined as follows:
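\[\text{ECS-SC}=\operatorname{Cost} (A)+\operatorname{Cost} (B)=\left[ {{{\operatorname{Cost} }_{maj}}(A)+5 \times {{\operatorname{Cost} }_{min}}(A)} \right]+\left[ {{{\operatorname{Cost} }_{maj}}(B)+5 \times {{\operatorname{Cost} }_{min}}(B)} \right]\]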
where \(y_{B}^{i}\) is the actual output of the i-th sample \({x_i}\) in set B, and \(\hat {y}_{B}^{i}(A)\) is the predicted output of the i-th sample \({x_i}\) in set B by the model constructed on set A. Similarly, \(y_{A}^{i}\) is the actual output of the i-th sample \({x_i}\) in set A, and \(\hat {y}_{A}^{i}(B)\) is the predicted output of the i-th sample \({x_i}\) in set A by the model constructed on set B. \({A_{maj}}\) and \({A_{min}}\) are the majority- and minority-class samples in set A, respectively, and \({B_{maj}}\) and \({B_{min}}\) are the majority- and minority-class samples in set B, respectively. \({\operatorname{Cost} _{maj}}(A)\), \({\operatorname{Cost} _{min}}(A)\), and \(\operatorname{Cost} (A)\) are the misclassification costs of the majority-class samples, the minority-class samples, and all samples, respectively, when the model constructed on set A is applied to set B. \({\operatorname{Cost} _{maj}}(B)\), \({\operatorname{Cost} _{min}}(B)\), and \(\operatorname{Cost} (B)\) are the corresponding costs when the model constructed on set B is applied to set A.
It should be noted that the traditional GMDH typically uses the OLS method to estimate the parameters of candidate models. However, as the number of layers in the GMDH network increases, the correlations between input variables also increase, which may lead to multicollinearity issues, thereby affecting the model performance40. To address this issue, we introduce an L2 regularization term, which compresses some of the highly correlated parameters to near zero, effectively suppressing model overfitting and mitigating the effects of multicollinearity. The expression is as follows:
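\[J(\hat {w})={J_{LS}}(\hat {w})+\lambda \left\| {\hat {w}} \right\|_{2}^{2}\]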
where \({J_{LS}}(\hat {w})\) is the sum of squared errors of the model parameters estimated by the OLS method, \(\left\| {\hat {w}} \right\|_{2}^{2}\) is the L2 norm, and \(\lambda\) is a constant that adjusts the relative strength of \({J_{LS}}(\hat {w})\) and \(\left\| {\hat {w}} \right\|_{2}^{2}\). As \(\lambda\) increases, the less important model parameters are shrunk toward zero, reducing model complexity and thereby suppressing overfitting.
Then, based on Eq. 13, the ECS-SC external criterion value for the first layer of intermediate candidate models is calculated and ranked. The top \({F_1}\) models with the best external criterion values are selected. Next, the selected \({F_1}\) candidate models, along with the initial models, are combined again using the transfer function \(f( \cdot )\) in pairs to generate the next layer of candidate models. Finally, the process is repeated until the ECS-SC external criterion value reaches its minimum, obtaining the optimal complexity model \({Y^*}\).
Modeling process
The detailed modeling process of the ECS-SDE model is as follows:
Phase I: construction of the example-dependent cost matrix and ECS training subset
Step 1: For each sample \({x_i}\), we calculate its corresponding cost matrix \({C_i}=[{C_{F{P_i}}},{C_{F{N_i}}},{C_{T{P_i}}},{C_{T{N_i}}}]\) and expand the original dataset \(D=\{ ({x_i},{y_i})\} _{{i=1}}^{N}\) into a new dataset \({D^\prime }=\{ ({x_i},{C_i},{y_i})\} _{{i=1}}^{N}\). Then, we randomly divide \({D^\prime }\) into a training set \({D_{train}}=\{ ({x_{train}},{C_{train}},{y_{train}})\}\) and a test set \({D_{test}}=\{ ({x_{test}},{C_{test}},{y_{test}})\}\);
Step 2: Multiple random samplings are performed on the training set \({D_{train}}\) to generate M ECS training subsets \({D_{sub}}\);
Phase II: training of ECS TabNet base classifiers
Step 3: ECS TabNet base classifiers are constructed, and M ECS training subsets \({D_{sub}}\) are used for training. This results in M differentiated ECS TabNets, denoted as \(\{ {T_1},{T_2},\ldots,{T_M}\}\);
Phase III: design of an ECS GMDH for selective deep ensemble
Step 4: ECS-SC external criterion is constructed, and the ECS GMDH neural network is built based on this criterion;
Step 5: The prediction results of the M ECS TabNets are taken as inputs to the ECS GMDH. The ECS-SC external criterion values of the candidate models in each layer are calculated based on Eq. 13;
Step 6: The process continues until the external criterion value reaches its minimum, obtaining the optimal complexity model \({Y^*}\) and achieving selective deep ensemble predictions.
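The six steps can be summarized in a high-level sketch (bootstrap_sample, train_ecs_tabnet, and fit_ecs_gmdh are placeholder names standing in for the procedures described above):

```python
import numpy as np

def ecs_sde(D_train, D_test, M=10):
    """High-level sketch of the three ECS-SDE phases (Steps 1-6)."""
    # Phase I: M random samplings of the ECS training set
    subsets = [bootstrap_sample(D_train) for _ in range(M)]
    # Phase II: train M differentiated ECS TabNet base classifiers
    classifiers = [train_ecs_tabnet(sub) for sub in subsets]
    # Phase III: selective deep ensemble with the ECS GMDH (ECS-SC criterion)
    Y_hat_train = np.column_stack([clf.predict(D_train.X) for clf in classifiers])
    ensemble = fit_ecs_gmdh(Y_hat_train, D_train.y)
    Y_hat_test = np.column_stack([clf.predict(D_test.X) for clf in classifiers])
    return ensemble.predict(Y_hat_test)
```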
The modeling flowchart of the ECS-SDE model is shown in Fig. 3.
Results and analysis
This section presents comparative experiments to evaluate the effectiveness of the proposed ECS-SDE model. Sections 4.1 to 4.3 introduce the datasets, experimental settings, and evaluation metrics. In Sect. 4.4, the ECS-SDE model's performance is compared with three ECS models, three CCS models, and five deep ensemble models. Section 4.5 compares the computation time of ECS-SDE with the five deep ensemble models. Section 4.6 presents ablation experiments to assess the impact of ECS TabNet and ECS GMDH on model performance. Section 4.7 analyzes the interpretability of the ECS-SDE model, and Sect. 4.8 conducts sensitivity analyses on the ECS TabNet parameters, the number of base classifiers, and the ECS GMDH parameters.
Datasets
This paper evaluates the model using four credit-scoring datasets, including the IEEE-CIS Fraud Detection (IEEE) dataset from the Kaggle competition. This dataset, which aims to predict online transaction fraud, contains 151 features and 1 binary label. The data is divided into transaction and identity information, covering aspects such as transaction amount, payment card details, and digital signatures. The Give Me Some Credit (GMSC) dataset, also from Kaggle, is used to predict the likelihood of a customer experiencing financial distress within two years, helping determine loan issuance. It contains 10 features and 1 binary label, with key features including credit utilization rate, debt ratio, and monthly income, among others. The Default of Credit Card Clients (DCCC) dataset, sourced from the UCI public database, records customer credit card payment history in Taiwan from April to September 2005. It contains 23 features and 1 binary label, with features related to credit limit, age, repayment history, and more. The 2009 Pacific-Asia Knowledge Discovery and Data Mining Conference (PAKDD) dataset includes credit data from a Brazilian financial institution, collected between 2003 and 2008. It contains 20 features and 1 binary label, with attributes such as customer age, personal net income, and gender. Table 3 provides the basic information of the four credit-scoring datasets, where the imbalanced ratio (IR) is defined as the ratio of majority-class (good credit) samples to minority-class (bad credit) samples. A higher IR value indicates a greater imbalance in the class distribution. The datasets used in this paper were preprocessed as described in the literature14,47. The "Data availability" section at the end of the paper provides details on how to obtain these datasets, with clickable links to specific acquisition information. The GMSC dataset is used as a case study, and a detailed feature description is included in Appendix A for a deeper analysis of the model's interpretability.
Experimental setup
In this experiment, we used four credit-scoring datasets, and the following steps were performed for each dataset. First, the augmented dataset \({D^\prime }\) is randomly divided into a training set and a test set in a 6:4 ratio. In the training set, 90% of the samples are used to train the model, and the remaining 10% are used for hyperparameter optimization. To reduce the randomness of the results, we repeat the entire experiment 10 times and calculate the average of the results for subsequent analysis and model performance comparison. Additionally, the credit scoring example-dependent cost matrix is shown in Table 2, where the personal income \(In{c_i}\) can be directly obtained from the dataset, and the debt ratio \(deb{t_i}\) can be indirectly calculated based on information such as income and credit limit in the dataset. Other parameters, such as the market average credit limit \(\bar {C}l\), the average profit margin \(\bar {r}\), and the loan term \({l_i}\), are set based on the research by Bahnsen et al.19.
In the model comparison, this paper compares the proposed ECS-SDE model with other models that use cost-sensitive techniques, including three ECS models and three CCS models. Given that the ECS-SDE model is a deep ensemble framework based on ECS, a review of the literature reveals that the latest advancements in ECS-based models primarily focus on traditional ensemble models and deep learning models. Therefore, the three ECS models selected are: the example-dependent cost-sensitive AdaBoost model using the outside-exponent method (ECS-AdaBoost) proposed by Zelenkov24, the example-dependent cost-sensitive deep neural network (ECSDNN) proposed by Mehta et al.32, and the example-dependent cost-sensitive stacking ensemble framework (ECS-Stacking) proposed by Bhargava et al.25. The three CCS models are: the class-dependent cost-sensitive neural network ensemble model (CSNNE) proposed by Yotsawat et al.10, the class-dependent cost-sensitive convolutional neural network ensemble model (CSCNN) proposed by Geng and Luo37, and the class-dependent cost-sensitive convolutional neural network (CNN) model (CCS-CNN) proposed by Vimala et al.38.
To further evaluate the performance of the ECS-SDE model, this paper compares it with five advanced deep ensemble models: the deep ensemble model based on long short-term memory (LSTM) and gated recurrent unit (GRU) neural networks (LSTM-GRU-ANN) proposed by Forough and Momtazi49, the deep ensemble model based on deep recurrent neural networks (LSTM-GRU-MLP) proposed by Mienye and Sun50, the deep ensemble model based on CNNs and bidirectional long short-term memory (BiLSTM) networks (CNN-BLSTM) proposed by Haghighi and Omranpour51, as well as the deep ensemble models based on CNN and BiLSTM (BiLSTM-CNN), and on CNN, BiLSTM, and Transformer (BiLSTM-Trans-CNN), both proposed by Wang et al.52. The parameter settings for the comparative models are shown in Table 4.
In the parameter settings of the ECS-SDE model, first, for ECS TabNet, this paper refers primarily to the research by Arik and Pfister33. The Adam optimizer is used with a learning rate of 0.006, a batch size of 128, and 70 epochs. The ranges of some of the hyperparameters are shown in Table 5. To address the data imbalance problem and achieve higher economic benefits, this paper employs a multi-objective optimization algorithm, with the cost-saving (Save) and geometric-mean indicators as optimization objectives. The optimization is performed using the default multi-objective algorithm from the Optuna library in Python. A sensitivity analysis of the optimal hyperparameter combinations is provided in Sect. 4.8. Then, for the parameter settings of the ECS GMDH neural network, this paper refers to the study by Lemke and Müller53. The maximum number of network layers is set to 20, and the data division method is set to random. The reference function form is \(y={w_0}+{w_1}{x_1}+{w_2}{x_2}+{w_3}{x_1}{x_2}\), with the remaining parameters kept at their default values. Additionally, considering that the ECS GMDH model complexity parameter \(\lambda\) and the number of ECS TabNet base classifiers M have a significant impact on the performance of the ECS-SDE model, this paper conducts a sensitivity analysis of these important parameters in Sect. 4.8. All experiments are run on a Windows 10 x64 system equipped with an Intel(R) Core(TM) i5 processor. The experiments are conducted in Python 3.7, and the implementation uses the deep learning framework PyTorch and the GmdhPy library.
Evaluation metrics
Traditional classification frameworks evaluate models based on statistical metrics, which typically aim to minimize misclassifications under the assumption of equal misclassification costs. However, cost-sensitive classification methods provide a comprehensive evaluation of the model performance, rather than simply aiming for the highest classification accuracy. Therefore, this paper employs two different types of metrics: precision-oriented metrics, which include AUC-PR54, AUC-ROC55, Brier Score− (BS−), and Brier Score+ (BS+)56; and a cost-oriented metric, namely cost savings (Save)19. These five metrics provide a comprehensive evaluation of the model’s performance. The confusion matrix for customer credit scoring is shown in Table 6.
TP represents the number of true positives, FN represents the number of false negatives, FP represents the number of false positives, and TN represents the number of true negatives.
(1) Save: In credit scoring, business needs are typically cost-driven. Therefore, this paper uses the Save metric to evaluate improvements in model performance from a cost-efficiency perspective. The Save metric19 is defined as the cost reduction achieved by using a model compared to not using any model. Specifically, Save assumes that all samples are predicted as the default class with the lowest cost (either 0 or 1), i.e., the baseline cost \({C_{base}}=\hbox{min} \{ C(y,0),C(y,1)\}\). It then calculates the total cost saved by the model’s classification compared to \({C_{base}}\). The formula is as follows:
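\[Save=\frac{{{C_{base}} - \operatorname{Cost} (f)}}{{{C_{base}}}}\]

where \(\operatorname{Cost} (f)\) is the total example-dependent cost incurred when classifying with model f.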
When the model achieves a cost improvement, the Save value lies in \([0,1]\), with higher values indicating better performance.
(2) AUC-PR: The precision-recall (PR) curve shows the trade-off between precision and recall. Precision is the proportion of true positives among all samples predicted as positive, i.e., precision \(=TP/\left( {TP+FP} \right)\), while recall is the proportion of actual positives correctly identified, i.e., recall \(=TP/\left( {TP+FN} \right)\). This paper uses the area under the precision-recall curve (AUC-PR)54 to assess the model's ability to discriminate positive samples, with a higher AUC-PR indicating better performance.
(3) AUC-ROC: The receiver operating characteristic (ROC) curve plots the true positive rate (TPR) against the false positive rate (FPR), where the x-axis is the false positive rate FPR \(=FP/\left( {FP+TN} \right)\) and the y-axis is the true positive rate TPR \(=TP/\left( {TP+FN} \right)\). It evaluates performance under uncertain class distributions or misclassification costs. The area under the ROC curve (AUC-ROC)55 is used to assess performance, with higher values indicating better results.
(4) BS+: BS+ is defined as the mean squared error of the minority class (positive class) samples, reflecting the model’s calibration for the minority class. It is calculated as follows:
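\[B{S^+}=\frac{1}{{{N_{min}}}}\sum\limits_{{i=1}}^{{{N_{min}}}} {{{(\hat {y}_{i}^{{min}} - y_{i}^{{min}})}^2}}\]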
where \(\hat {y}_{i}^{{min}}\) is the predicted probability that sample i belongs to the minority class, \(y_{i}^{{min}}\) is the actual label of the minority-class sample, and \({N_{min}}\) is the number of minority-class samples. Lower BS+ values indicate better calibration for minority-class samples.
(5) BS−: BS− is defined as the mean squared error of the majority class (negative class) samples, indicating the model’s calibration for the majority class. It is calculated as follows:
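\[B{S^ - }=\frac{1}{{{N_{maj}}}}\sum\limits_{{i=1}}^{{{N_{maj}}}} {{{(\hat {y}_{i}^{{maj}} - y_{i}^{{maj}})}^2}}\]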
where \(\hat {y}_{i}^{{maj}}\) is the predicted probability that sample i belongs to the majority class, \(y_{i}^{{maj}}\) is the actual label of the majority-class sample, and \({N_{maj}}\) is the number of majority-class samples. Lower BS− values indicate better calibration for the majority-class samples.
Comparison experiments
Comparison of different cost sensitive models
This section compares the ECS-SDE model with three ECS models and three CCS models in terms of credit-scoring performance. Table 7 shows the performance of the ECS-SDE model and the six comparative models on the four datasets. In the table, bold indicates the top-performing model in each row, and the number in brackets indicates the model's ranking; the smaller the number, the better the model's credit-scoring performance. In addition, the range in parentheses below each metric value is the 95% confidence interval57, which reflects the stability of the model's performance.
The results in Table 7 show that the ECS-SDE model consistently outperforms the other models, particularly excelling in the cost savings (Save) metric across all four datasets. Notably, the ECS-SDE model shows a 45.448% improvement in cost savings on the GMSC dataset and a 51.258% improvement on the IEEE dataset. This highlights the model’s effectiveness in enhancing cost efficiency, optimizing resource allocation, and minimizing financial losses by accurately identifying high-risk customers and reducing the over-management of low-risk ones.
To further assess statistically significant differences between the seven models on each metric, this paper applies non-parametric statistical tests recommended by Demšar58, namely the Friedman test59 and the Iman-Davenport test60. The null hypothesis for both tests is that the performance of the seven models is the same. For the 4 datasets and 7 models, we use a \({\chi ^2}\) distribution with 6 degrees of freedom and an F distribution with 6 and 18 (i.e., 6 × 3) degrees of freedom. The significance level is set at 0.05, with results presented in Table 8.
The test values exceed the corresponding distribution values, leading to the rejection of the null hypothesis at the 95% confidence level. This indicates significant performance differences between the seven models on each metric. Additionally, pairwise comparisons are conducted to further explore the performance differences among the models. First, we compute \(z=({R_i} - {R_j})/\sqrt {k(k+1)/(6 \cdot Num)}\), where \({R_i}\) and \({R_j}\) are the average rankings of the i-th and j-th models, respectively, k is the number of models being compared (7 in this case), and \(Num\) is the number of datasets (4 in this case). After calculating z, it is converted into a probability value, and the Benjamini-Hochberg multiple testing correction61 is applied to obtain the adjusted p-values. Table 9 shows the results of the test.
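This procedure can be sketched as follows (a simplified illustration; the function name is ours, with scipy providing the normal tail probability and statsmodels the Benjamini-Hochberg correction):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

def pairwise_adjusted_pvalues(avg_ranks, num_datasets):
    """Pairwise z-tests on average Friedman ranks, BH-corrected."""
    k = len(avg_ranks)
    se = np.sqrt(k * (k + 1) / (6.0 * num_datasets))   # std. error of rank differences
    pairs, pvals = [], []
    for i in range(k):
        for j in range(i + 1, k):
            z = (avg_ranks[i] - avg_ranks[j]) / se
            pairs.append((i, j))
            pvals.append(2 * stats.norm.sf(abs(z)))    # two-sided p-value
    reject, p_adj, *_ = multipletests(pvals, alpha=0.05, method="fdr_bh")
    return pairs, p_adj, reject
```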
From Table 9, it can be concluded that the ECS-SDE model shows significant advantages on multiple key metrics: (1) For the AUC-ROC metric, ECS-SDE shows a significant difference compared to ECS-AdaBoost, ECSDNN, ECS-Stacking, CSNNE, and CCS-CNN, with no significant difference between ECS-SDE and CSCNN. This indicates that ECS-SDE has stronger discriminatory power under the ROC curve, allowing it to more accurately distinguish between high-risk and low-risk customers. (2) For the AUC-PR metric, ECS-SDE significantly outperforms the ECS-AdaBoost, ECSDNN, CSNNE, and CSCNN models. This indicates that the ECS-SDE model has higher classification accuracy in handling class imbalance, particularly in identifying minority-class samples (i.e., high-risk customers). (3) For the BS+ metric, ECS-SDE significantly outperforms the ECS-AdaBoost, ECS-Stacking, and CSNNE models. This highlights the efficacy of the ECS-SDE model in identifying positive samples and detecting high-risk customers. (4) For the Save metric, ECS-SDE significantly outperforms the ECS-AdaBoost, CSNNE, and CCS-CNN models, indicating superior performance in cost savings. (5) For the BS– metric, ECS-SDE shows a significant advantage over CSCNN, despite its relatively average performance in predicting the negative class (low-risk customers). However, customer credit evaluation places more emphasis on the prediction of positive-class samples, as accurately identifying high-risk customers is crucial for reducing financial losses. (6) Among the six comparison models, ECS-AdaBoost, ECSDNN, ECS-Stacking, CSNNE, CSCNN, and CCS-CNN show no significant performance differences across most metrics, indicating that their overall performance is similar.
In conclusion, the ECS-SDE model outperforms the six comparison models, particularly in handling class imbalance and identifying high-risk customers, with superior classification accuracy. The performance differences among the other models are minimal across most metrics, indicating their overall similarity.
Comparison of deep ensemble models
This section compares the performance of the ECS-SDE model with five advanced deep ensemble models in credit scoring (Table 10). To ensure fairness, we used the SMOTE technique12 to generate new minority-class samples and balance the training set when training the deep ensemble comparison models. The range in parentheses below each metric value is the 95% confidence interval57. In the table, bold text highlights the top-performing model in each row.
The results in Table 10 show that the ECS-SDE model achieves the best overall average ranking among all comparison models, indicating that it has the best performance in credit scoring. It also outperforms other models in the cost savings (Save) metric across all four datasets, indicating its ability to accurately identify high-risk customers and reduce financial losses from misclassification.
To further analyze whether there are statistically significant differences between the ECS-SDE model and the five deep ensemble models in each metric, this paper still uses the Friedman test59 and the Iman-Davenport test60. The null hypothesis for both tests is that the performance of the six models is the same. When the number of datasets is 4 and the number of models is 6, we use a \({\chi ^2}\) distribution with 5 degrees of freedom and an F distribution with 5 and 15 (5 × 3) degrees of freedom, with a significance level of 0.05. The test results are shown in Table 11.
As shown in Table 11, the test values are all greater than the corresponding distribution values. Therefore, at a 95% confidence level, we reject the null hypothesis and conclude that there are significant differences in the performance of the six models across each metric. To further understand the performance differences between the six models, we perform pairwise comparisons of the model performance. We also apply the Benjamini-Hochberg multiple testing correction61 to obtain the adjusted p-values. The results are shown in Table 12. In the table, bold values indicate that the adjusted p-value is less than 0.05.
According to Table 12, the ECS-SDE model shows significant advantages in most metrics: (1) For the Save metric, ECS-SDE significantly outperforms the LSTM-GRU-ANN, LSTM-GRU-MLP, BiLSTM-CNN, and BiLSTM-Trans-CNN models, with no significant difference relative to CNN-BLSTM, indicating that ECS-SDE excels in cost savings. (2) For the BS+ metric, ECS-SDE significantly outperforms the CNN-BLSTM, BiLSTM-CNN, and BiLSTM-Trans-CNN models, showing higher accuracy in identifying minority-class samples (high-risk customers). (3) For the AUC-ROC and AUC-PR metrics, ECS-SDE significantly outperforms the BiLSTM-CNN and BiLSTM-Trans-CNN models. (4) For the BS– metric, ECS-SDE significantly outperforms the LSTM-GRU-ANN and BiLSTM-Trans-CNN models. (5) For the LSTM-GRU-ANN, LSTM-GRU-MLP, CNN-BLSTM, BiLSTM-CNN, and BiLSTM-Trans-CNN models, no significant differences are observed on most metrics, indicating that their performance is relatively similar.
In conclusion, the ECS-SDE model excels in key metrics, particularly outperforming most deep ensemble models in Save and BS+ metrics. Its superior ability to identify high-risk customers and reduce financial losses highlights its effectiveness in cost savings and minority-class prediction.
Computational time comparison of deep ensemble models
This section compares the computational time of the ECS-SDE model with five advanced deep ensemble models. Table 13 shows the time required by the six models to fit on the training set and make predictions on the test set. Bold text indicates the model with the shortest computation time in each row, with the number in brackets representing the model’s ranking, where a lower value indicates a shorter computation time. The last row presents the average ranking of total time for each model.
Table 13 shows that the ECS-SDE model shares the same average ranking as the LSTM-GRU-ANN model, placing its computational time at a moderate level among the six models. Its computational time does, however, vary across datasets. On the IEEE dataset, which has many samples, high imbalance, and many features, the ECS-SDE model requires more processing and therefore more time. On the PAKDD dataset, which has fewer samples and lower imbalance, it ranks 4th, running faster than the LSTM-GRU-ANN and LSTM-GRU-MLP deep ensemble models. In contrast, the CNN-BLSTM model has the shortest overall computational time, with an average ranking of 1.75, completing its computations fastest across the datasets. The BiLSTM-CNN and BiLSTM-Trans-CNN models follow with average rankings of 2.25 and 2.00, respectively, and slightly longer computational times than CNN-BLSTM. The LSTM-GRU-MLP model has an average ranking of 4.50 and moderate computational time.
Ablation experiment
To analyze the impact of the ECS TabNet training process and the ECS GMDH selective ensemble process on the performance of the ECS-SDE model, we conducted an ablation experiment (Table 14). The experiment compared the credit scoring performance of three models on four datasets. The three models are as follows: (1) ECSTabNet + SRCGMDH selective deep ensemble model, which uses ECS TabNet as the base classifier and applies SRC-based GMDH for selective ensemble; (2) TabNet + ECSGMDH selective deep ensemble model, which uses the traditional TabNet as the base classifier and applies ECS GMDH for selective ensemble; (3) The proposed ECS-SDE model. In the table, bold text highlights the top-performing model in each row.
Table 14 shows that the ECS-SDE model, which combines the two techniques, has the highest average ranking, indicating the best credit-scoring performance. To test whether the performance differences among the three models are statistically significant, we use the non-parametric Wilcoxon signed-rank test62. The null hypothesis is that the credit-scoring performance of the two compared models is the same. We define \({R^+}\) as the sum of the ranks where the first model is better than the second, and \({R^-}\) as the sum of the ranks where the first model is worse. We set the significance level to \(\alpha = 0.05\); at a 95% confidence level with 20 paired observations, the corresponding critical value (CV) is 52. The test results comparing the three models are shown in Table 15. If \(T = \min(R^+, R^-)\) is less than or equal to 52, the null hypothesis is rejected, indicating a statistically significant difference between the two models. Specifically, if \(T = R^- \le 52\), the first model is statistically significantly better than the second; conversely, if \(T = R^+ \le 52\), the second model is better.
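A hedged sketch of this test using SciPy follows; the paired scores are synthetic stand-ins for the 20 observations compared in the paper. For the two-sided alternative, scipy.stats.wilcoxon reports exactly \(T = \min(R^+, R^-)\).

```python
import numpy as np
from scipy import stats

# Illustrative paired metric values for two models over 20 observations.
rng = np.random.default_rng(0)
model_a = rng.uniform(0.6, 0.9, size=20)
model_b = model_a - rng.uniform(0.0, 0.05, size=20)  # model A slightly better

# Two-sided Wilcoxon signed-rank test on the paired differences; reject
# the null at alpha = 0.05 when T <= 52 for n = 20 pairs.
T, p = stats.wilcoxon(model_a, model_b)
print(f"T = {T:.1f}, p = {p:.4f}, significant: {T <= 52}")
```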
The results in Table 15 show that, at the 95% confidence level, the ECS-SDE model, which uses these two techniques, has statistically significantly better performance than the other two models. However, there is no significant difference in performance between the models that only use the ECS TabNet or the ECS GMDH. This suggests that the combination of the ECS TabNet with the ECS GMDH technique is critical to maximize the performance of the ECS SDE model.
Analysis of model interpretability
In practical scenarios, it is crucial not only to focus on model performance but also to analyze the impact of features on outcomes, especially for real-world applications like credit scoring. For instance, when a loan application is rejected, explaining the reasons to both the customer and manager is important. This section explores the interpretability of the proposed model, including visualizing the ECS GMDH selective ensemble process and analyzing the feature importance of ECS TabNet.
To explain the selection process of base classifiers, this paper visualizes the ECS GMDH network structure. According to the ECS GMDH selective ensemble modeling principle in Sect. 2.3, the prediction results of 20 base classifiers \(\{ {T_1},{T_2},\ldots,{T_{20}}\}\) are used as the initial inputs \(\{ {v_1},{v_2},\ldots,{v_{20}}\}\). These inputs are then combined pairwise through a transfer function \(f(\cdot)\) to generate intermediate candidate models. The selection process follows the ECS-SC external criterion, where candidate models are chosen layer by layer based on the external criterion value. This process continues until the external criterion value reaches its minimum. The result is an optimal complexity model with a multilayer network structure. To visualize this, the selective ensemble process of ECS GMDH and the weight coefficients of each layer are presented.
This paper uses the GMSC dataset as an example. Due to the large number of inputs at each layer, direct explanation is challenging. To simplify the ECS GMDH selective ensemble process, only the combination results of the selected base classifiers are retained (Fig. 4), with the corresponding weights listed in Table 16. By calculating layer by layer from back to front, an embedded polynomial combination function is finally obtained to represent the relationships among the selected optimal base classifiers. For example, in Layer 1, the combination of \(v_4\) and \(v_9\) is represented as \(H_1 = f(v_4, v_9) = w_0 + w_1 v_4 + w_2 v_9 + w_3 v_4 v_9\). It is worth noting that the initial inputs \(\{ {v_1},{v_2},\ldots,{v_{20}}\}\) are included in the candidate model set for each layer.
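To make the layer-wise selection concrete, the sketch below implements one generic GMDH selection layer with the quadratic transfer function above. The external criterion is passed in as a plain function (mean squared error in the demo) and merely stands in for the paper's ECS-SC criterion, so this is an illustrative skeleton under stated assumptions, not the authors' implementation.

```python
from itertools import combinations

import numpy as np

def fit_transfer(vi, vj, y):
    """Least-squares fit of f(vi, vj) = w0 + w1*vi + w2*vj + w3*vi*vj."""
    A = np.column_stack([np.ones_like(vi), vi, vj, vi * vj])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def gmdh_layer(V_train, y_train, V_val, y_val, criterion, keep=10):
    """One selection layer: combine inputs pairwise, score each candidate
    on held-out data with the external criterion, keep the best ones."""
    candidates = []
    for i, j in combinations(range(V_train.shape[1]), 2):
        w = fit_transfer(V_train[:, i], V_train[:, j], y_train)
        A_val = np.column_stack([np.ones(len(V_val)), V_val[:, i],
                                 V_val[:, j], V_val[:, i] * V_val[:, j]])
        pred = A_val @ w
        candidates.append((criterion(y_val, pred), i, j, w))
    candidates.sort(key=lambda c: c[0])  # smaller criterion value is better
    return candidates[:keep]

# Demo: 20 base-classifier outputs on 500 samples; MSE stands in for
# the ECS-SC external criterion.
rng = np.random.default_rng(0)
V = rng.uniform(0, 1, (500, 20))
y = rng.integers(0, 2, 500).astype(float)
best = gmdh_layer(V[:300], y[:300], V[300:], y[300:],
                  lambda t, p: np.mean((t - p) ** 2))
print("best pair of inputs:", best[0][1:3])
```

Stacking such layers, feeding each layer's selected outputs into the next and stopping when the external criterion stops decreasing, yields the multilayer structure shown in Fig. 4.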
As shown in Fig. 4, on the GMSC dataset, the ECS GMDH model selects 10 optimal initial inputs \(\left( {{v_1},{v_2},{v_4},{v_8},{v_9},{v_{13}},{v_{17}},{v_{18}},{v_{19}},{v_{20}}} \right)\), corresponding to the optimal ECS TabNet base classifiers \({T_1},{T_2},{T_4},{T_8},{T_9},{T_{13}},{T_{17}},{T_{18}},{T_{19}},{T_{20}}\). Table 16 lists the weights of each layer. Because the polynomial functions are complex and the coefficients of the individual terms are large relative to those of the interaction terms, only the individual terms are retained and the interaction effects are ignored. The simplified functional relationship between the base classifiers and the prediction result on the GMSC dataset is as follows:
Based on the simplified function expression, we obtain the weights for the selected 10 base classifiers as \(\{ w_1 = -1350.74,\; w_2 = 113.72,\; \ldots,\; w_{10} = -473.24 \}\), and the influence of each base classifier on the prediction result: \(T_1 > T_{20} > T_{18} > T_4 > T_2 > T_8 > T_{19} > T_{13} > T_9 > T_{17}\). According to the feature importance calculation method of TabNet described in Sect. 2.2, we calculate each base classifier's importance score for every feature. Let \(S_k \in \mathbb{R}^d\) \((k = 1, 2, \ldots, 10)\) be the importance score of the k-th ECS TabNet base classifier over the d features. The global importance of a feature reflects its contribution to the overall model performance33. The feature importance scores \(\{ S_1, S_2, \ldots, S_{10} \}\) output by the 10 selected base classifiers are averaged to obtain the final global importance score for each feature: \(S_{final} = \frac{1}{10}\sum\nolimits_{k=1}^{10} S_k\). Figure 5 presents the feature importance plot for the optimal ECS TabNet models selected by ECS GMDH on the GMSC dataset. Detailed feature descriptions are available in Appendix A; feature importance plots and ECS GMDH selective ensemble results for the other three datasets can be found in Appendix B.
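The averaging step is simple to express in code. The sketch below uses random importance vectors as stand-ins for the ten classifiers' scores; with the pytorch-tabnet library these would typically be read from each fitted model's feature_importances_ attribute, though the data here is purely illustrative.

```python
import numpy as np

# S: per-classifier global feature importances for the 10 selected ECS
# TabNet base models over d = 10 features (GMSC has features A1-A10).
rng = np.random.default_rng(0)
S = rng.dirichlet(np.ones(10), size=10)  # 10 classifiers x 10 features

# S_final = (1/10) * sum_k S_k : average across the selected base
# classifiers to obtain one global importance score per feature.
S_final = S.mean(axis=0)
top5 = np.argsort(S_final)[::-1][:5]
print("top-5 features (0-indexed):", top5, S_final[top5].round(3))
```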
Figure 5 shows that, on the GMSC dataset, the top five most important features for the selected ECS TabNet classifiers \(\left( {{T_1},{T_2},{T_4},{T_8},{T_9},{T_{13}},{T_{17}},{T_{18}},{T_{19}},{T_{20}}} \right)\) are: A2 (Age), A7 (Number of Times 90 Days Late), A4 (Debt Ratio), A9 (Number of Times 60–89 Days Past Due Not Worse), and A3 (Number of Times 30–59 Days Past Due Not Worse). These features play a significant role in credit scoring prediction, as detailed below:
- Feature A2 (Age) is generally considered an important factor in credit assessment. Older borrowers typically have more career experience and greater financial stability, which positively affects their ability to repay loans. Age therefore contributes positively to credit scoring, especially when assessing a borrower's long-term repayment capacity.
- Feature A4 (Debt Ratio) is a key indicator of a borrower's level of debt, representing the ratio of debt to income. A higher debt ratio typically signals greater financial stress and a weaker ability to repay, which increases credit risk. A4 is therefore of significant reference value in credit assessment, particularly when evaluating whether a borrower has sufficient repayment capacity.
- Credit history features, including A3 (Number of Times 30–59 Days Past Due Not Worse), A7 (Number of Times 90 Days Late), and A9 (Number of Times 60–89 Days Past Due Not Worse), directly reflect the borrower's past repayment behavior. Multiple overdue records are generally seen as a sign of credit risk, indicating that the borrower may have been unstable in repaying debts. These features therefore help financial institutions better predict future repayment behavior and influence the approval of loan or credit card applications.
In summary, the five features mentioned above reflect key aspects of the borrower, such as repayment capacity, debt levels, and repayment history. Older age and lower debt ratios generally improve credit assessment, while overdue records highlight past repayment behavior and credit risk, making these features essential for loan or credit card approval decisions.
In contrast, features like A1 (Revolving Utilization of Unsecured Lines), A5 (Monthly Income), A6 (Number of Open Credit Lines and Loans), A8 (Number of Real Estate Loans or Lines), and A10 (Number of Dependents) have a smaller impact on the model’s predictions. While these features have limited influence, they still offer valuable insights into the borrower’s financial situation. Financial institutions should consider these features alongside critical indicators, such as overdue records and debt ratio, for a more comprehensive and accurate risk assessment.
Parameter sensitivity analysis
In this section, a parameter sensitivity analysis is performed to investigate the effect of the ECS TabNet parameters \({N_a}\), \({N_d}\), \({N_{step}}\), \(\gamma\), and momentum on the performance of the ECS-SDE model. We also investigate the influence of the number of ECS TabNet base classifiers, M, and of the complexity control parameter \(\lambda\) of the ECS GMDH on the model's credit-scoring performance. The results of the parameter sensitivity analysis are presented in Appendix C.
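As a hedged illustration of how such a sweep can be organized, the sketch below varies one hyperparameter at a time around a baseline, using the pytorch-tabnet constructor arguments (n_d, n_a, n_steps, gamma, momentum); the grids and the train_and_score placeholder are assumptions for illustration, not the paper's settings.

```python
from pytorch_tabnet.tab_model import TabNetClassifier

# One-at-a-time sensitivity sweep: hold the other hyperparameters at a
# baseline and vary a single one, recording the credit-scoring metric.
baseline = dict(n_d=16, n_a=16, n_steps=5, gamma=1.3, momentum=0.3)
sweep = {
    "n_d": [8, 16, 32, 64],
    "n_a": [8, 16, 32, 64],
    "n_steps": [3, 4, 5, 6, 7],
    "gamma": [1.0, 1.2, 1.3, 1.5, 2.0],
    "momentum": [0.02, 0.1, 0.3, 0.4],
}
for name, values in sweep.items():
    for v in values:
        params = {**baseline, name: v}
        clf = TabNetClassifier(**params, verbose=0)
        # score = train_and_score(clf)  # placeholder: fit, then evaluate
        # record (name, v, score) to plot one sensitivity curve per parameter
```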
Conclusion
This paper proposes the ECS-SDE model and applies it to customer credit scoring. The model constructs an example-dependent cost matrix to generate ECS training subsets, then integrates the proposed ECS TabNet and ECS GMDH deep neural networks to perform selective deep ensemble modeling. The experimental results show that the ECS-SDE model outperforms the comparison models in overall credit-scoring performance. Notably, the ECS-SDE model offers strong interpretability, revealing the importance of each feature in credit scoring. This interpretability analysis offers valuable insights for financial institutions to identify and mitigate customer default risk, enabling more precise risk management. In summary, the study provides an effective credit-scoring tool with practical implications for improving deep learning model interpretability, ultimately reducing economic losses from customer defaults.
This paper offers key insights for financial institution management, including:
(1) Focus on core financial features, taking into account both personal information and financial status. First, in credit scoring, financial institutions should prioritize core indicators such as debt ratio and repayment history, as they directly reflect a borrower's repayment ability and credit risk; higher debt ratios and overdue records should trigger closer scrutiny and prompt risk management measures, such as adjusting loan terms or conducting further risk assessments. Second, institutions should adopt a comprehensive approach to borrower assessment, considering personal information (e.g., age, gender, occupation), financial status (e.g., income, debt ratio), and historical repayment records. This holistic evaluation enhances credit risk scoring and supports the development of more effective risk control strategies.
(2) Improving model interpretability and transparency is of great importance for the management of financial institutions. First, clear decision-making criteria enhance managers' understanding of the model's process, fostering greater trust in its results. Second, interpretable models enable managers to identify and manage potential risks, facilitating timely actions to sustain operations. Third, a transparent decision-making process ensures regulatory compliance, builds client trust, and supports more accurate business strategies.
Despite the promising potential of the ECS-SDE model in customer credit scoring, certain limitations remain. Future research could focus on the following areas: (1) Enhancing model interpretability. This study relies on feature correlation analysis, which may lead to biased or inconsistent interpretations when applied to complex business logic. Future research should incorporate causal inference techniques to better identify intrinsic feature relationships, enhancing both the accuracy and transparency of model behavior. (2) Optimizing computational resources. Training TabNet and GMDH models on large datasets is computationally intensive. Future research could address this limitation by investigating model compression, quantization, and knowledge distillation to improve efficiency and reduce hardware demands. (3) Expanding to multi-class scenarios. This study is limited to binary classification within the ECS framework. Future research could extend the ECS model to multi-class scenarios, addressing more complex, real-world applications.
Data availability
The datasets analyzed in the current study are publicly available from various sources. The IEEE-CIS Fraud Detection dataset can be accessed from the Kaggle competition (https://www.kaggle.com/competitions/ieee-fraud-detection). The Give Me Some Credit dataset is available at http://www.kaggle.com/c/GiveMeSomeCredit/. The Default of Credit Card Clients dataset is available through the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients). Additionally, the 2009 Pacific-Asia Knowledge Discovery and Data Mining dataset can be accessed via http://sede.neurotech.com.br:443/PAKDD2009/. The GMDH library is available at https://github.com/kvoyager/GmdhPy. The TabNet library is available at https://github.com/dreamquark-ai/tabnet.
References
Bressan, G., Đuranović, A., Monasterolo, I. & Battiston, S. Asset-level scoring of climate physical risk matters for adaptation finance. Nat. Commun. 15 (1), 5371 (2024).
Petrone, D., Rodosthenous, N. & Latora, V. An AI approach for managing financial systemic risk via bank bailouts by taxpayers. Nat. Commun. 13 (1), 6815 (2022).
Tang, Q., Tong, Z. & Yang, Y. Large portfolio losses in a turbulent market. Eur. J. Oper. Res. 292 (2), 755–769 (2021).
Berger, L. M. et al. Inequality in high-cost borrowing and unemployment insurance generosity in US states during the COVID-19 pandemic. Nat. Hum. Behav. 1–13. https://doi.org/10.1038/s41562-024-01922-8 (2024).
Wang, Y. et al. Hyperspectral estimation of soil copper concentration based on improved TabNet model in the Eastern Junggar Coalfield. IEEE Trans. Geosci. Remote Sens. 60, 1–20 (2022).
Xiao, J. et al. Black-box attack-based security evaluation framework for credit card fraud detection models. INFORMS J. Comput. 35 (5), 986–1001 (2023).
Xiao, J. et al. A novel deep ensemble model for imbalanced credit scoring in internet finance. Int. J. Forecast. 40 (1), 348–372 (2024).
Bahnsen, A. C., Aouada, D. & Ottersten, B. Example-dependent cost-sensitive decision trees. Expert Syst. Appl. 42 (19), 6609–6619 (2015).
Höppner, S., Baesens, B., Verbeke, W. & Verdonck, T. Instance-dependent cost-sensitive learning for detecting transfer fraud. Eur. J. Oper. Res. 297 (1), 291–300 (2022).
Yotsawat, W., Wattuya, P. & Srivihok, A. A novel method for credit scoring based on cost-sensitive neural network ensemble. IEEE Access. 9, 78521–78537 (2021).
Zhao, H. et al. An ensemble learning approach with gradient resampling for class-imbalance problems. INFORMS J. Comput. 35 (4), 747–763 (2023).
Almhaithawi, D., Jafar, A. & Aljnidi, M. Example-dependent cost-sensitive credit cards fraud detection using SMOTE and Bayes minimum risk. SN Appl. Sci. 2 (9), 1–12 (2020).
Janssens, B., Bogaert, M., Bagué, A. & Van den Poel, D. B2Boost: Instance-dependent profit-driven modelling of B2B churn. Ann. Oper. Res. 341, 1–27 (2022).
Vanderschueren, T., Verdonck, T., Baesens, B. & Verbeke, W. Predict-then-optimize or predict-and-optimize? An empirical evaluation of cost-sensitive learning strategies. Inf. Sci. 594, 400–415 (2022).
Lenarcik, A. & Piasta, Z. Rough classifiers sensitive to costs varying from object to object. Proc. Int. Conf. Rough Sets Curr. Trends Comput., 222–230 (1998).
Bahnsen, A. C., Aouada, D. & Ottersten, B. A novel cost-sensitive framework for customer churn predictive modeling. Decis. Anal. 2 (1), 1–15 (2015).
Zadrozny, B., Langford, J. & Abe, N. Cost-sensitive learning by cost-proportionate example weighting. Proc. 3rd IEEE Int. Conf. Data Min., 435–442 (2003).
Elkan, C. The foundations of cost-sensitive learning. Proc. Int. Joint Conf. Artif. Intell. 17, 973–978 (2001).
Bahnsen, A. C., Aouada, D. & Ottersten, B. Example-dependent cost-sensitive logistic regression for credit scoring. Proc. Int. Conf. Mach. Learn. Appl. (IEEE), 263–269 (2014).
González, P. et al. Multiclass support vector machines with example-dependent costs applied to plankton biomass estimation. IEEE Trans. Neural Netw. Learn. Syst. 24 (11), 1901–1905 (2013).
Bahnsen, A. C., Aouada, D. & Ottersten, B. Example-dependent cost-sensitive credit scoring using Bayes minimum risk. Proc. Int. Conf. Mach. Learn. Appl. (IEEE), 10 (2014).
Bahnsen, A. C., Stojanovic, A., Aouada, D. & Ottersten, B. Cost sensitive credit card fraud detection using Bayes minimum risk. Proc. 12th Int. Conf. Mach. Learn. Appl. (IEEE). 1, 333–338 (2013).
Bahnsen, A. C., Aouada, D. & Ottersten, B. Ensemble of example-dependent cost-sensitive decision trees. Preprint at https://arxiv.org/abs/1505.04637 (2015).
Zelenkov, Y. Example-dependent cost-sensitive adaptive boosting. Expert Syst. Appl. 135, 71–82 (2019).
Bhargava, S. et al. A novel example-dependent cost-sensitive stacking classifier to identify tax return defaulters. Proc. Bus. Inf. Syst., 343–353 (2021).
Bhuvaneshwari, K., Kannimuthu, S., Bhanu, D., Karthi, M. & Sagar, K. H. Effective radical driver support system using machine learning methods for connected vehicles. Turk. J. Physiother Rehabil. 32 (2), 1024–1031 (2020).
Saqr, A. E. S., Elshewey, A. M., Raju, S. K. & Eid, M. M. A comprehensive review on optimizing machine learning models for early detection and forecasting of monkeypox outbreaks. J. Artif. Intell. Metaheuristics. 8 (1), 9–20 (2024).
Zuo, C., Zhang, X., Yan, L. & Zhang, Z. GUGEN: Global user graph enhanced network for next POI recommendation. IEEE Trans. Mob. Comput. 23 (12), 14975–14986 (2024).
Zhu, C. Research on emotion recognition-based smart assistant system: emotional intelligence and personalized services. J. Syst. Manag Sci. 13 (5), 227–242 (2023).
Peng, Y. et al. Unveiling user identity across social media: a novel unsupervised gradient semantic model for accurate and efficient user alignment. Complex. Intell. Syst. 11 (1), 1–28 (2025).
Li, T., Li, Y., Zhang, M., Tarkoma, S. & Hui, P. You are how you use apps: user profiling based on spatiotemporal app usage behavior. ACM Trans. Intell. Syst. Technol. 14 (4), 1–21 (2023).
Mehta, P., Babu, C. S., Rao, S. K. V. & Kumar, S. DeepCatch: Predicting return defaulters in taxation system using example-dependent cost-sensitive deep neural networks. Proc. IEEE Int. Conf. Big Data (IEEE), 4412–4419 (2020).
Arik, S. Ö. & Pfister, T. TabNet: Attentive interpretable tabular learning. Proc. AAAI Conf. Artif. Intell. 35, 6679–6687 (2021).
Cai, Q. & He, J. Credit payment fraud detection model based on TabNet and Xgboot. Proc. 2nd Int. Conf. Consum. Electron. Comput. Eng. (IEEE), 823–826 (2022).
Zhang, L., Ma, K., Yuan, F. & Fang, W. A TabNet based card fraud detection algorithm with feature engineering. Proc. 2nd Int. Conf. Consum. Electron. Comput. Eng. (IEEE), 911–914 (2022).
Lee, W., Lee, S. & Seok, J. Credit card default prediction by using heterogeneous ensemble. Proc. 14th Int. Conf. Ubiquitous Fut. Networks, 907–910 (2023).
Geng, Y. & Luo, X. Cost-sensitive convolution based neural networks for imbalanced time-series classification. Intell. Data Anal. 23 (2), 357–370 (2019).
Vimala, G. A. G. et al. Cost-sensitive learning using chest X-ray with CNN for COVID-19 detection with lung diseases leading to class imbalance. Proc. 5th Int. Conf. Image Process. Capsule Netw. (IEEE), 489–495 (2024).
Boughorbel, S., Jarray, F. & Kadri, A. Fairness in TabNet model by disentangled representation for the prediction of hospital no-show. Preprint at https://arxiv.org/abs/2103.04048 (2021).
Joseph, L. P., Joseph, E. A. & Prasad, R. Explainable diabetes classification using hybrid bayesian-optimized TabNet architecture. Comput. Biol. Med. 151, 106178 (2022).
McDonnell, K., Murphy, F., Sheehan, B., Masello, L. & Castignani, G. Deep learning in insurance: accuracy and model interpretability using TabNet. Expert Syst. Appl. 217, 119543 (2023).
Ivakhnenko, A. G. The group method of data handling: a rival of the method of stochastic approximation. Sov. Autom. Control. 13, 43–55 (1968).
Stepashko, V., Bulgakova, O. & Zosimov, V. Performance of hybrid multilayered GMDH algorithm. Proc. 4th Int. Workshop on Inductive Modelling (IWIM), 5–9 (2011).
Ivakhnenko, A., Ivakhnenko, G. & Muller, J. Self-organization of neural networks with active neurons. Pattern Recognit. Image Anal. 4 (2), 185–196 (1994).
Xiao, J. et al. Cost-sensitive semi-supervised selective ensemble model for customer credit scoring. Knowl.-Based Syst. 189, 105118 (2020).
Wakitani, S. & Yamamoto, T. Study on a GMDH-PID controller design method based on LASSO. Proc. 57th Annu. Conf. Soc. Instrum. Control Eng. Jpn. (IEEE), 1464–1469 (2018).
Bahnsen, A. C. Example-dependent cost-sensitive Classification with Applications in Financial risk Modeling and Marketing Analytics (University of Luxembourg, 2015).
Yu, L., Yang, Z. & Tang, L. A novel multistage deep belief network based extreme learning machine ensemble learning paradigm for credit risk scoring. Flex. Serv. Manuf. J. 28, 576–592 (2016).
Forough, J. & Momtazi, S. Ensemble of deep sequential models for credit card fraud detection. Appl. Soft Comput. 99, 106883 (2021).
Mienye, I. D. & Sun, Y. A deep learning ensemble with data resampling for credit card fraud detection. IEEE Access. 11, 30628–30638 (2023).
Haghighi, F. & Omranpour, H. Stacking ensemble model of deep learning and its application to Persian/Arabic handwritten digits recognition. Knowl.-Based Syst. 220, 106940 (2021).
Wang, M., Ma, H., Wang, Y. & Sun, X. Design of smart home system speech emotion recognition model based on ensemble deep learning and feature fusion. Appl. Acoust. 218, 109886 (2024).
Lemke, F. & Müller, J. A. Self-organizing data mining. Syst. Anal. Model. Simul. 43 (2), 231–240 (2003).
Boyd, K., Eng, K. H. & Page, C. D. Area under the precision-recall curve: Point estimates and confidence intervals. Proc. ECML PKDD Conf. 451–466 (2013).
Bradley, A. P. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn. 30 (7), 1145–1159 (1997).
Wallace, B. C. & Dahabreh, I. J. Improving class probability estimates for imbalanced data. Knowl. Inf. Syst. 41 (1), 33–52 (2014).
Student. The probable error of a mean. Biometrika 6 (1), 1–25 (1908).
Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7 (Jan), 1–30 (2006).
Friedman, M. A comparison of alternative tests of significance for the problem of m rankings. Ann. Math. Stat. 11 (1), 86–92 (1940).
Iman, R. L. & Davenport, J. M. Approximations of the critical region of the Friedman statistic. Commun. Stat. - Theory Methods. 9 (6), 571–595 (1980).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R Stat. Soc. B. 57 (1), 289–300 (1995).
Wilcoxon, F. Individual comparisons by ranking methods. Breakthroughs Stat. 196–202 (1992).
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China (72171160, 71988101, 72401208), the National Social Science Fund of China (24VRC096), the Postdoctoral Fellowship Program of CPSF (GZB20240504), the EU Horizon 2020 RISE Project ULTRACEPT under Grant 778062, and the Sichuan University Interdisciplinary Innovation Fund.
Author information
Authors and Affiliations
Contributions
J.X.: Conceptualization; Methodology. S.L.: Data curation; Software; Writing - original draft. Y.T.: Software; Writing - review and editing. J.H.: Supervision; Writing - review. X.J.: Software; Writing - review. S.W.: Supervision; Writing - review.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xiao, J., Li, S., Tian, Y. et al. Example dependent cost sensitive learning based selective deep ensemble model for customer credit scoring. Sci Rep 15, 6000 (2025). https://doi.org/10.1038/s41598-025-89880-7