Table 1 Features of the HE and GMSC datasets used in NOTE.

From: NOTE: non-parametric oversampling technique for explainable credit scoring

| Feature | Description of HE | Type |
| --- | --- | --- |
| \(\textbf{BAD}^{*}\) | Whether the applicant paid the loan, or defaulted on it or became seriously delinquent | N (Not defaulted = 0) / Y (Defaulted = 1) |
| LOAN | Amount of the loan request | Integer |
| MORTDUE | Amount due on existing mortgage | Integer |
| VALUE | Value of current property | Integer |
| REASON | DebtCon = debt consolidation; HomeImp = home improvement | Category |
| JOB | Six occupational categories (Other, ProfExe, Office, Sales, Mgr, Self) | Category |
| YOJ | Years at present job | Integer |
| DEROG | Number of major derogatory reports | Integer |
| DELINQ | Number of delinquent credit lines | Integer |
| CLAGE | Age of oldest credit line in months | Real |
| NINQ | Number of recent credit inquiries | Integer |
| CLNO | Number of credit lines | Integer |
| DEBTINC | Debt-to-income ratio | Real |

| Feature | \(\textbf{Description of GMSC}^{***}\) | Type |
| --- | --- | --- |
| \(\textbf{SeriousDlqin2yrs}^{**}\) | Applicant experienced 90 days past due delinquency or worse | N (Not defaulted = 0) / Y (Defaulted = 1) |
| RevolvingUtilizationOfUnsecuredLines | Total balance on credit cards and personal lines of credit | Percentage |
| Age | Age of applicant in years | Integer |
| NumberOfTime30–59DaysPastDueNotWorse | Number of times applicant has been 30–59 days past due but no worse in the last 2 years | Integer |
| DebtRatio | Monthly debt payments, alimony, and living costs divided by monthly gross income | Percentage |
| MonthlyIncome | Monthly income | Real |
| CombinedCreditLoans | Number of open loans and lines of credit, including real estate and home equity | Category |
| CombinedDefaulted | Number of times an applicant has been 30–89 days or over 90 days past due | Category |
| NumberOfDependents | Number of dependents in family excluding themselves (spouse, children, etc.) | Category |

  1. *BAD and **SeriousDlqin2yrs are the class-label features.
  2. ***The GMSC dataset has been feature-engineered to derive categorical features from the original variables.
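
The HE half of the table can be read as a simple record schema. The sketch below is illustrative only and is not taken from the NOTE paper: it transcribes the HE feature names and types into a Python dictionary and checks one record against them. The names `HE_SCHEMA`, `validate_he_record`, and `example_record`, and all values in the example record, are hypothetical. It also assumes every feature is present, so missing values would need separate handling.

```python
# Hypothetical schema transcribed from the HE part of Table 1.
# A set lists the allowed values; int / float mark "Integer" / "Real" features.
HE_SCHEMA = {
    "BAD": {0, 1},  # class label: 0 = not defaulted, 1 = defaulted
    "LOAN": int,
    "MORTDUE": int,
    "VALUE": int,
    "REASON": {"DebtCon", "HomeImp"},
    "JOB": {"Other", "ProfExe", "Office", "Sales", "Mgr", "Self"},
    "YOJ": int,
    "DEROG": int,
    "DELINQ": int,
    "CLAGE": float,
    "NINQ": int,
    "CLNO": int,
    "DEBTINC": float,
}


def validate_he_record(record):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name, spec in HE_SCHEMA.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if isinstance(spec, set):
            # Categorical or binary feature: value must be one of the listed levels.
            if value not in spec:
                problems.append(f"{name}: unexpected value {value!r}")
        elif isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"{name}: expected a number, got bool")
        elif spec is int and not isinstance(value, int):
            problems.append(f"{name}: expected an integer, got {type(value).__name__}")
        elif spec is float and not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a real number, got {type(value).__name__}")
    return problems


# Made-up record for illustration; the values do not come from the dataset.
example_record = {
    "BAD": 0, "LOAN": 15000, "MORTDUE": 60000, "VALUE": 90000,
    "REASON": "DebtCon", "JOB": "Office", "YOJ": 5, "DEROG": 0,
    "DELINQ": 0, "CLAGE": 120.5, "NINQ": 1, "CLNO": 20, "DEBTINC": 34.2,
}

print(validate_he_record(example_record))
```

The same pattern could be extended to the GMSC features, although the levels of its engineered categorical features are not listed in the table and would have to be supplied separately.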