Credit Scoring, Artificial Intelligence and Quantum Machine Learning

COURSE OBJECTIVE

This rigorous course is designed to impart skills necessary for creating and calibrating credit scoring models, including calculating default probabilities (PD) and validating these models. Participants will explore a range of machine learning approaches from traditional methods to quantum and probabilistic techniques, and learn how to leverage artificial intelligence for automating these processes.

Participants will gain proficiency in both conventional and cutting-edge models for credit scoring during the stages of credit admission and monitoring. This includes handling vast datasets to construct comprehensive credit and behavior scoring systems.

The course also delves into advanced data analytics, covering topics like sampling, exploratory analysis, feature engineering, segmentation, and outlier detection.

A variety of machine learning techniques will be discussed, from supervised and unsupervised learning to reinforcement learning, all applied to building credit scoring tools. The course covers well-established methods such as logistic regression alongside machine learning techniques such as decision trees, naive Bayes, K-nearest neighbors, LASSO logistic regression, random forests, neural networks, Bayesian networks, Support Vector Machines, and gradient boosting trees.

The application of deep learning in building robust credit scoring models suitable for banking applications will be covered extensively. This includes the use of various neural network architectures such as feedforward, convolutional, recurrent, and generative adversarial networks, alongside Fermac Risk’s proprietary methodology for managing and interpreting deep learning models so that they do not become black boxes.

Instruction on tuning hyperparameters, which are crucial for controlling the learning process and optimizing model performance, will be provided along with techniques like grid search, random search, and Bayesian optimization.

The course provides over 20 distinct credit scoring models built with different methodologies across multiple programming environments, including R, Python, JupyterLab, TensorFlow, and SAS. These models cover origination, behavior, recovery, income, and churn scoring.

Advanced techniques for calibrating PD under IRB and IFRS 9 are included, covering methods from adjustment to central tendency to deep learning models for lifetime PD calibration under IFRS 9.

The curriculum introduces automated machine learning (AutoML), enhancing the ability of risk analysts to develop, scale, and validate high-quality machine learning models efficiently.

Participants will also explore probabilistic machine learning techniques, like Bayesian neural networks, to construct credit scoring models, alongside best practices for model validation, particularly focusing on AI-driven financial tools as per European regulatory standards.

Finally, the course highlights the emerging field of Quantum Machine Learning, discussing its potential to revolutionize financial services through enhanced computational speeds and capabilities using quantum algorithms.

This comprehensive program aims to equip participants with the skills to utilize advanced computing technologies, including quantum and tensor networks, for machine learning calculations, preparing them for significant advancements in the financial sector.

WHO SHOULD ATTEND?

The Course is aimed at professionals from financial institutions interested in developing powerful credit scoring models and calibrating their output, as well as model managers in credit risk and data science departments.

To follow the topics, participants need a working knowledge of statistics and mathematics. No prior knowledge of quantum physics is required to benefit from the quantum computing content.

Schedules:

Europe: Mon-Fri, CEST 16-19 h

America: Mon-Fri, CDT 18-21 h

Asia: Mon-Fri, IST 18-21 h

Price: 7 900 €

Level: Advanced

Duration: 36 h

Material:

Presentations in PDF
Exercises in Excel, R, SAS, Python, JupyterLab and TensorFlow

AGENDA
Credit Scoring, Artificial Intelligence and Quantum Machine Learning

CREDIT SCORING

Module 0: Quantum Computing and Algorithms

Future of quantum computing in banking
Is it necessary to know quantum mechanics?
QIS Hardware and Apps
Quantum operations
Qubit representation
Measurement
Superposition
Matrix multiplication
Qubit operations
Multiple Quantum Circuits
Entanglement
Deutsch Algorithm
Quantum Fourier transform and search algorithms
Hybrid quantum-classical algorithms
Quantum annealing, simulation and optimization algorithms
Quantum machine learning algorithms
Exercise 1: Quantum operations

Module 1: Artificial Intelligence for Credit Scoring

Big Data Definition
Big Data in financial institutions and fintech
Big data in Bigtech
Data typology
- Structured
- Semi-structured
- Unstructured
Big data: Volume, Velocity, Variety, Veracity and Value
Big Data Size
Big data sources
- Transactional data
- Social media data
- Credit bureau data
- Origin of data sources
- Website data
- Text data
- Sensor data
- RFID and NFC data
- Data from telecom operators
- Smart grid data
Banking digitization
Financial inclusion
Regulation in Europe, USA and Latin America
Artificial intelligence in banking
Artificial intelligence in the credit cycle

Module 2: AI in Credit Scoring

AI in Credit Scoring for Banking and Fintech
Offline and online credit scoring
Design and Construction of Credit Scoring Models
Advantages and disadvantages
Models to face new financial crises
Machine Learning to develop and validate credit scoring
Importance of the Bureau Score
Credit Scorecard Management
Probability of Default (PD) Estimation

Module 3: Machine Learning

Definition of Machine Learning
Machine Learning Methodology
- Data Storage
- Abstraction
- Generalization
- Assessment
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Deep Learning
Typology of Machine Learning Algorithms
Steps to Implement an Algorithm
- Information Collection
- Exploratory Analysis
- Model Training
- Model Evaluation
- Model Improvements
Machine Learning in Credit Scoring Models
Quantum Machine Learning

Exploratory Data Analysis (EDA) and Feature Engineering

Module 4: Exploratory Data Analysis

Data typology
Transactional data
Unstructured data embedded in text documents
Social Media Data
Data sources
Data review
Target definition
Time horizon of the target variable
Sampling
- Random Sampling
- Stratified Sampling
- Rebalanced Sampling
Exploratory Analysis:
- Histograms
- Q-Q plot
- Moment analysis
- Box plot
Treatment of Missing Values
- Multivariate Imputation Model
Advanced Outlier Detection and Treatment Techniques
- Univariate technique: winsorizing and trimming
- Multivariate technique: Mahalanobis distance

Module 5: Feature Engineering

Feature Engineering
Data Standardization
Variable categorization
- Equal Interval Binning
- Equal Frequency Binning
- Chi-Square Test
Binary Coding
WOE Coding
- WOE Definition
- Univariate Analysis with Target Variable
- Variable Selection
- Treatment of Continuous Variables
- Treatment of Categorical Variables
- Using Gini
- Information Value
- Optimization of continuous variables
- Optimization of categorical variables
Exercise 1: Exploratory Analysis in R
Exercise 2: Advanced Detection and Treatment of Outliers
Exercise 3: Stratified and Random Sampling in R
Exercise 4: Multivariate imputation model
Exercise 5: Univariate analysis by percentiles in R
Exercise 6: Optimal univariate analysis of continuous variables in Excel
Exercise 7: Estimation of the KS, Gini, and IV of each variable in Excel
Exercise 8: Word Cloud analysis of variables in R
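
As a rough illustration of the WOE coding and Information Value topics in Module 5, the sketch below computes both for a binned variable. It is not taken from the course material. It assumes one common convention, WOE = ln(share of goods / share of bads), and that every bin contains at least one good and one bad account. The function name `woe_iv` and the bin counts are hypothetical.

```python
import math


def woe_iv(bins):
    """Compute Weight of Evidence per bin and the total Information Value.

    bins: list of (goods, bads) counts, one tuple per bin.
    Assumes every bin has at least one good and one bad (no zero counts).
    """
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    woes = []
    iv = 0.0
    for goods, bads in bins:
        dist_good = goods / total_good  # share of all goods in this bin
        dist_bad = bads / total_bad     # share of all bads in this bin
        woe = math.log(dist_good / dist_bad)
        woes.append(woe)
        iv += (dist_good - dist_bad) * woe
    return woes, iv


# Hypothetical counts for three bins of one variable: (goods, bads)
example_bins = [(100, 40), (300, 40), (600, 20)]
woes, iv = woe_iv(example_bins)
for i, w in enumerate(woes, start=1):
    print(f"bin {i}: WOE = {w:.4f}")
print(f"Information Value = {iv:.4f}")
```

Under this convention, a negative WOE marks a bin with a higher-than-average concentration of bads. Some texts define WOE with the ratio inverted, which flips the sign of each WOE but leaves the IV unchanged.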

MACHINE LEARNING

Unsupervised Learning

Module 6: Unsupervised models

Hierarchical Clusters
K Means
Standard algorithm
Euclidean distance
Principal Component Analysis (PCA)
Advanced PCA Visualization
Eigenvectors and Eigenvalues
Exercise 14: Principal components in R and SAS
Exercise 15: Segmentation of the data with K-Means in R

Supervised Learning

Module 7: Logistic Regression and LASSO Regression

Econometric Models
- Logit regression
- Probit regression
- Piecewise Regression
- Survival models
Machine Learning Models
- Lasso Regression
- Ridge Regression
Model Risk in Logistic Regression
Exercise 16: Credit Scoring Logistic Regression in SAS and R
Exercise 17: Credit Scoring Lasso Logistic Regression in R
Exercise 18: Model Risk Using Confidence Intervals of Logistic Regression Coefficients

Module 8: Trees, KNN and Naive Bayes

Decision Trees
- Modeling
- Advantages and disadvantages
- Recursion and Partitioning Processes
- Recursive partitioning tree
- Pruning Decision tree
- Conditional inference tree
- Tree visualization
- Measurement of decision tree prediction
- CHAID model
- C5.0 model
K-Nearest Neighbors KNN
- Modeling
- Advantages and disadvantages
- Euclidean distance
- Manhattan distance
- K value selection
Probabilistic Model: Naive Bayes
- Naive Bayes
- Bayes' theorem
- Laplace estimator
- Classification with Naive Bayes
- Advantages and disadvantages
Exercise 19: Credit Scoring Decision Tree in SAS and R
Exercise 20: Credit Scoring KNN in R and SAS
Exercise 21: Credit Scoring Naive Bayes in R

Module 9: Support Vector Machine SVM

SVM with dummy variables
SVM
Optimal hyperplane
Support Vectors
Adding costs
Advantages and disadvantages
SVM visualization
Tuning SVM
Kernel trick
Exercise 22: Credit Scoring Support Vector Machine in R data 1
Exercise 23: Credit Scoring Support Vector Machine in Python data 2

Module 10: Ensemble Learning

Ensemble models
Bagging
Bagging trees
Random Forest
Boosting
AdaBoost
Gradient Boosting Trees
Advantages and disadvantages
Exercise 24: Credit Scoring Boosting in R
Exercise 25: Credit Scoring Bagging in R
Exercise 26: Credit Scoring Random Forest, R and Python, data 1 and 2
Exercise 27: Credit Scoring Gradient Boosting Trees

MODEL VALIDATION

Module 11: Validation of traditional and Machine Learning models

Model validation
Validation of machine learning models
Regulatory validation of machine learning models in Europe
Out of Sample and Out of time validation
Checking p-values in regressions
R squared, MSE, MAD
Residual diagnostics
Goodness of Fit Test
Multicollinearity
Binary case confusion matrix
Multinomial case confusion matrix
Main discriminant power tests
Confidence intervals
Jackknifing with discriminant power test
Bootstrapping with discriminant power test
Kappa statistic
K-Fold Cross Validation
Exercise 28: Logistic Regression Goodness-of-Fit Test
Exercise 29: Cross validation in SAS
Exercise 30: Gini Estimation, Information Value, Brier Score, Lift Curve, CAP, ROC, Divergence in SAS and Excel
Exercise 31: Bootstrapping of parameters in SAS
Exercise 32: Jackknifing in SAS
Exercise 33: Gini/ROC Bootstrapping in SAS
Exercise 34: Kappa estimation
Exercise 35: K-Fold Cross Validation in R
Exercise 36: Out-of-time traffic light validation (6-year horizon) of logistic regression and machine learning models

Module 12: Stability Testing

Model stability index
Factor stability index
Chi-square test
K-S test
Exercise 37: Stability tests of models and factors
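
The model stability index listed in Module 12 is often computed as a population stability index (PSI) over score bands. The sketch below shows that calculation under that assumption; it is illustrative only and not the course's own implementation. The function name `stability_index` and the band counts are hypothetical, and every band is assumed to be non-empty in both samples.

```python
import math


def stability_index(expected_counts, actual_counts):
    """Population stability index between two binned distributions.

    expected_counts: accounts per score band in the development sample.
    actual_counts:   accounts per score band in the recent sample.
    Assumes both lists use the same bands and contain no zero counts.
    """
    exp_total = sum(expected_counts)
    act_total = sum(actual_counts)
    psi = 0.0
    for e, a in zip(expected_counts, actual_counts):
        p_e = e / exp_total  # share of development accounts in the band
        p_a = a / act_total  # share of recent accounts in the band
        psi += (p_a - p_e) * math.log(p_a / p_e)
    return psi


# Hypothetical counts per score band
development = [200, 300, 500]
recent = [250, 300, 450]
psi = stability_index(development, recent)
print(f"PSI = {psi:.4f}")
```

The same function can be applied to the bins of a single input variable to obtain a factor-level index. Thresholds for what counts as a material shift are a matter of institutional policy and are not fixed by the formula.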

DEEP LEARNING

Module 14: Introduction to Deep Learning

Definition and concept of deep learning
Why use deep learning now?
Artificial neural networks
Neural network architectures
Activation function
- Sigmoid
- Rectified linear unit (ReLU)
- Hyperbolic tangent (tanh)
- Softmax
Feedforward network
Multilayer Perceptron
Using TensorFlow
Using TensorBoard
Deep learning in R
Deep learning in Python
Convolutional Neural Networks
Use of deep learning in image classification
Cost function
Gradient descent optimization
Using deep learning for credit scoring
- How many hidden layers?
- How many neurons, 100, 1000?
- How many epochs and what batch size?
- What is the best activation function?
Deep Learning Software: Caffe, H2O, Keras, Microsoft, Matlab, etc.
Deployment software: NVIDIA and CUDA
Hardware, CPU, GPU and cloud environments
Advantages and disadvantages of deep learning

Module 15: Deep Learning Feed Forward Neural Networks

Single Layer Perceptron
Multiple Layer Perceptron
Neural network architectures
Activation function
- Sigmoid
- Rectified linear unit (ReLU)
- ELU
- SELU
- Hyperbolic tangent (tanh)
- Softmax
- Others
Backpropagation
- Directional derivatives
- Gradients
- Jacobians
- Chain rule
- Optimization and local and global minima
Exercise 38: Credit Scoring using Deep Learning Feed Forward

Module 16: Deep Learning Convolutional Neural Networks CNN

CNN for images
Design and architectures
Convolution operation
Gradient descent
Filters
Stride
Padding
Subsampling
Pooling
Fully connected layers
Credit Scoring using CNN
Recent CNN studies applied to credit risk and scoring
Exercise 39: Credit scoring using deep learning CNN

Module 17: Deep Learning Recurrent Neural Networks RNN

Natural Language Processing
Natural Language Processing (NLP) text classification
Long Short-Term Memory (LSTM)
Hopfield networks
Bidirectional associative memory
Gradient descent
Global optimization methods
RNN and LSTM for credit scoring
Unidirectional and bidirectional models
Deep Bidirectional Transformers for Language Understanding (BERT)
Exercise 40: Credit Scoring using Deep Learning LSTM

Module 18: Generative Adversarial Networks (GANs)

Generative Adversarial Networks (GANs)
Fundamental components of the GANs
GAN architectures
Bidirectional GAN
Training generative models
Synthetic Data
Credit Scoring using GANs
Exercise 41: Credit Scoring using GANs

Module 19: Calibrating Machine Learning and Deep Learning

Hyperparameter tuning
Grid search
Random search
Bayesian Optimization
Train test split ratio
Learning rate in optimization algorithms (e.g. gradient descent)
Selection of optimization algorithm (e.g., gradient descent, stochastic gradient descent, or Adam optimizer)
Activation function selection in a neural network (NN) layer (e.g. Sigmoid, ReLU, Tanh)
Selection of loss, cost and custom functions
Number of hidden layers in an NN
Number of activation units in each layer
Dropout rate in an NN (dropout probability)
Number of iterations (epochs) in training an NN
Number of clusters in a clustering task
Kernel or filter size in convolutional layers
Pooling size
Batch size
Exercise 42: Optimized Credit Scoring with XGBoost, Random Forest and SVM
Exercise 43: Optimized Credit Scoring Deep Learning

Module 20: Traditional Scorecard Construction

Score assignment
Scorecard Classification
- Scorecard WOE
- Binary Scorecard
- Continuous Scorecard
Scorecard Rescaling
- Factor and Offset Analysis
- Scorecard WOE
- Binary Scorecard
Reject Inference Techniques
- Cut-off
- Parceling
- Fuzzy Augmentation
- Machine Learning
Advanced Cut-off Point Techniques
- Cut-off optimization using ROC curves
Exercise 44: Building Scorecard in Excel, R and Python
Exercise 45: Optimum cut-off point estimation in Excel and model risk by cut-off point selection
Exercise 46: Confusion matrix to verify Type I and Type II errors in Excel with and without variables
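
For the factor and offset analysis listed under Scorecard Rescaling, the sketch below shows the widely used points-to-double-the-odds (PDO) scaling, score = offset + factor × ln(odds). It is an illustration under stated assumptions, not the course's own method: the function names and the calibration values (600 points at 50:1 good/bad odds, 20 points to double the odds) are hypothetical.

```python
import math


def scaling_parameters(base_score, base_odds, pdo):
    """Return (factor, offset) for score = offset + factor * ln(odds).

    base_score: score assigned at the reference odds.
    base_odds:  reference good/bad odds (e.g. 50 means 50:1).
    pdo:        points needed to double the odds.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def score_from_odds(odds, factor, offset):
    """Map good/bad odds to a scorecard score."""
    return offset + factor * math.log(odds)


# Hypothetical calibration: 600 points at 50:1 odds, 20 points double the odds
factor, offset = scaling_parameters(base_score=600, base_odds=50, pdo=20)
print(f"factor = {factor:.4f}, offset = {offset:.4f}")
for odds in (25, 50, 100):
    print(f"odds {odds}:1 -> score {score_from_odds(odds, factor, offset):.1f}")
```

With these parameters, doubling the odds always adds `pdo` points, which is what makes the rescaled score easy to communicate to credit officers.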

QUANTUM MACHINE LEARNING

Module 21: Quantum Credit Scoring

What is quantum machine learning?
Qubit and Quantum States
Quantum machine learning algorithms
Quantum circuits
Quantum K-means
Support Vector Machine
Quantum Support Vector Machine
Variational quantum classifier
Training quantum machine learning models
Quantum Neural Networks
Quantum GAN
Quantum Boltzmann machines
Quantum machine learning in Credit Risk
Quantum machine learning in credit scoring
Quantum software
Exercise 47: Quantum K-means
Exercise 48: Quantum Support Vector Machine to develop credit scoring model
Exercise 49: Quantum feed forward Neural Networks to develop a credit scoring model
Exercise 50: Quantum Convolutional Neural Networks to develop a credit scoring model

Module 22: Tensor Networks for Quantum Machine Learning

What are tensor networks?
Quantum Entanglement
Tensor networks in machine learning
Tensor networks in unsupervised models
Tensor networks in SVM
Tensor networks in NN
Tensorization of NNs
Application of tensor networks in credit scoring models
Exercise 51: Construction of credit scoring using tensor networks

PROBABILISTIC MACHINE LEARNING

Module 23: Probabilistic Machine Learning

Introduction to probabilistic machine learning
Gaussian models
Bayesian Statistics
Bayesian logistic regression
Kernel family
Gaussian processes
- Gaussian processes for regression
Hidden Markov Model
Markov chain Monte Carlo (MCMC)
- Metropolis-Hastings algorithm
Probabilistic Machine Learning Models
Bayesian Boosting
Bayesian Neural Networks
Exercise 52: Gaussian process for regression
Exercise 53: Credit scoring model using Bayesian Neural Networks

MODEL RISK

Module 24: Model Risk in Credit Scoring

Model Risk
Model risk in deep learning
Model risk in credit scoring
Black boxes
Cut-off decision
Absence of data
Model risk from not updating or recalibrating
Ethical concepts of credit scoring
Exercise 54: Model risk in credit scoring due to not recalibrating on time

CREDIT SCORING MODELS

Module 25: Credit Scoring Models by Product

Origination Credit Scoring
- Credit Card Score
- Mortgage Score
- Consumer Loan Score
- Car Score
Behavior Score (BS)
- Time horizon
- Panel data information
- Panel data regression
- Cox regression
- Behavior Score with macroeconomic variables
- Transition matrices
- Behavior Score with transition matrices
- Transaction Score
- Machine Learning Models
- Behavior Score on credit cards
Exercise 55: Behavior Score Logistic Regression in Python data 2
Exercise 56: Behavior Score Support Vector Machines in Python
Exercise 57: Behavior Score Random Forest in Python
Exercise 58: Behavior Score Gradient Boosting Trees in Python
Exercise 59: Behavior Score Deep Learning LSTM in Python

Module 26: Typology of Scoring models

Response Score
Income score
Churn Score
Origination Fraud Score
Behavior Fraud Score
Collection Score
Recovery Score
Big Data Scoring
Exercise 60: Fraud Score with neural networks
Exercise 61: Income Score
Exercise 62: Collection Score
Exercise 63: Recovery Score
Exercise 64: Churn Score

CALIBRATION OF PD MODELS

Module 27: Calibration of the Probability of Default PD IRB

PD estimation
- Econometric models
- Machine Learning Models
- Data requirement
- Risk drivers and credit scoring criteria
- Rating philosophy
- Pool Treatment
PD Calibration
- Default Definition
- Long run average for PD
- Technical defaults and technical default filters
- Data requirement
- One Year Default Rate Calculation
- Long-Term Default Rate Calculation
PD Model Risk
- Conservatism Margin
PD Calibration Techniques
- Anchor Point Estimate
- Mapping from Score to PD
- Adjustment to the PD Economic Cycle
- Rating Philosophy
PD Through the Cycle (PD TTC) models
PD Point in Time (PD PIT) models
PD Calibration of Models Using Machine and Deep Learning
Exercise 65: PD Calibration in Machine Learning Models

Module 28: Machine Learning models to estimate Lifetime PD under IFRS 9

Credit scoring models to estimate Lifetime PD
Lifetime PD under IFRS 9
Impact of COVID-19 on models
Climate Risk Impact
Inflation impact
Impact of rising prices
Regression Models
- Logistic regression
- Multinomial Logistic Regression
- Ordinal Probit Regression
VAR and VEC models
Machine Learning Model
- SVM: Kernel Function Definition
- Neural Network: definition of hyperparameters and activation function
- Deep learning
- LSTM
PD Calibration of Models Using Machine and Deep Learning
Exercise 66: Lifetime PD using logistic regression
Exercise 67: Lifetime PD using multinomial regression in R
Exercise 68: Lifetime PD using SVM in Python
Exercise 69: Lifetime PD using Deep Learning in Python
Exercise 70: Lifetime PD using Deep Learning LSTM in Python

VALIDATION OF PD MODELS

Module 29: Validation of PD models

Definition of PD Backtesting
PD Calibration Validation
- Normal Test
- Binomial Test
- Traffic Light Approach
Traffic Light Analysis and PD Dashboard
PD Stability Test
Forecast PD vs. actual PD over time
When should we recalibrate or reestimate a credit scoring model?
Re-development
Re-estimation
Model Risk in PD
Machine Learning to validate PD models
Artificial Intelligence to recalibrate and rebuild models autonomously
Exercise 71: Backtesting PD in Excel
Exercise 72: Forecast PD vs. actual PD in Excel

AUTOMATION OF CREDIT SCORING AND PD WITH AI

Module 30: Automation of Credit Scoring and PD Modeling

What is modeling automation?
What is automated?
Automation of machine learning processes
Optimizers and Evaluators
Modeling Automation Workflow Components
- Summary
- Processing
- Feature engineering
- Model generation
- Assessment
Hyperparameter optimization
Reconstruction or recalibration of credit scoring
Credit Scoring Modeling
- Main milestones
- Evaluation and optimization
- Possible Issues
PD calibration modeling
- Evaluation and optimization
- Backtesting
- Discriminating Power
- Stability Tests
Global evaluation of modeling automation
Implementation of modeling automation in banking
Technological requirements
Available tools
Benefits and possible ROI estimation
Main Issues
Model Risk
Genetic algorithms
Exercise 73: Automation of the modeling, hyperparameter optimization and validation of credit scoring
Exercise 74: Automation of PD modeling and validation
