
Credit Scoring, Artificial Intelligence and Quantum Machine Learning

 

 

 

COURSE OBJECTIVE

 

Intensive course on developing credit scoring tools, calibrating the probability of default (PD) and validating models. It covers traditional, probabilistic and quantum machine learning methodologies, and explains how artificial intelligence itself can automate the construction and calibration of PD models.

 

Participants will learn to develop traditional and advanced credit scoring models for the credit admission and monitoring stages. In other words, the course explains how to build credit and behavior scores from very large volumes of data.

 

On the data analytics side, a module on advanced data processing covers, among other topics, sampling, exploratory analysis, segmentation and outlier detection.

 

The main machine learning techniques (supervised, unsupervised and reinforcement learning) are presented as applied to building credit scoring tools.

The course covers traditional methodologies such as logistic regression alongside more recent machine learning methodologies, such as decision trees, Naive Bayes, KNN, LASSO logistic regression, random forest, neural networks, Bayesian networks, Support Vector Machines and gradient boosting trees.

The course explains how to use deep learning neural networks to build powerful credit scoring models that banks can deploy as challenger models or as working tools in the admission and monitoring process. Feed-forward, convolutional and recurrent neural networks and generative adversarial networks are covered. A proprietary Fermac Risk methodology for controlling deep learning models and making them interpretable is explained, with the aim of avoiding unacceptable black boxes.

Hyperparameters are parameters whose values control the learning process and determine the values of the model parameters that a learning algorithm ends up learning. The prefix hyper suggests that they are 'higher level' parameters that control the learning process and the model parameters that result from it.

Hyperparameter tuning techniques such as grid search, random search and Bayesian optimization are shown.
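As a rough illustration of the grid search and random search techniques named above, the sketch below tunes two hyperparameters against a toy loss. The function `validation_loss`, the parameter names `learning_rate` and `depth`, and the search ranges are assumptions made for this example; in practice the loss would come from cross-validating a real scoring model.

```python
import itertools
import random


def validation_loss(learning_rate, depth):
    """Hypothetical stand-in for a cross-validated loss (lowest at 0.1 and 4)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2


def grid_search(grid):
    """Evaluate every combination in the grid and keep the lowest loss."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = validation_loss(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


def random_search(n_trials, seed=0):
    """Sample random combinations instead of enumerating a fixed grid."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, 0),  # log-uniform in [0.001, 1]
            "depth": rng.randint(1, 8),
        }
        loss = validation_loss(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


grid = {"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4, 6]}
grid_loss, grid_params = grid_search(grid)
rand_loss, rand_params = random_search(n_trials=50)
print("grid search:", grid_params, grid_loss)
print("random search:", rand_params, rand_loss)
```

Grid search cost grows multiplicatively with each added hyperparameter, while random search keeps a fixed budget of trials; Bayesian optimization, not sketched here, uses earlier trials to choose the next one.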

More than 20 credit scoring models are delivered, built with different methodologies in various languages and tools, such as R, Python, JupyterLab, TensorFlow and SAS. They include credit scoring models for admission, follow-up, recovery, income and attrition.

Advanced methodologies for calibrating the IRB PD risk parameter are taught. Topics include calibration to the central tendency, the point-in-time (PD PIT) and through-the-cycle (PD TTC) rating philosophies, and the calibration of machine learning models so that they output probabilities of default. A module on developing and calibrating the IFRS 9 Lifetime PD with deep learning models is also included.

Automated machine learning, also called automated ML or AutoML, is the process of automating the iterative tasks of machine learning model development. It allows risk analysts to build machine learning models with high scalability, efficiency and productivity while maintaining model quality, and it can support not only the automated construction of credit scoring models but also their validation.

Probabilistic machine learning techniques for building credit scoring models, such as Bayesian neural networks, are shown.

Automated machine learning methodologies based on genetic algorithms, among other advanced techniques, are explained.

Best practices for validating the credit scoring models of financial institutions that use artificial intelligence are presented, together with the European regulatory requirements for using this type of model.

Quantum Machine Learning is the integration of quantum algorithms within Machine Learning programs. Machine learning algorithms are used to process vast amounts of data; quantum machine learning uses qubits and quantum operations, or specialized quantum systems, with the aim of improving the speed of computation and data storage of the algorithms in a program. In addition, some mathematical and numerical techniques from quantum physics are applicable to classical deep learning. A quantum neural network can potentially reduce the number of steps, the qubits used and the computation time.

The objective of the course is to show how quantum computing and tensor networks can be used to implement machine learning algorithms.

We believe that quantum computing will begin to transform the financial services landscape in the coming years. Banks that adopt quantum algorithms will have competitive advantages, including the potential to outpace competitors to become undisputed market leaders.

WHO SHOULD ATTEND?

 

The course is aimed at professionals from financial institutions interested in developing powerful credit scoring models and calibrating their output, as well as model managers in credit risk and data science departments.

 

Participants need a working knowledge of statistics and mathematics to follow the topics.

 

 


Schedules:

  • Europe: Mon-Fri, CEST 16-19 h

  • America: Mon-Fri, CDT 18-21 h

  • Asia: Mon-Fri, IST 18-21 h


Price: 8,900 €

 


Level: Advanced


Duration: 40 h

 

Material:

  • Presentations in PDF

  • Exercises in Excel, R, SAS, Python, JupyterLab and TensorFlow


AGENDA
 
Credit Scoring, Artificial Intelligence and Quantum Machine Learning

 


CREDIT SCORING

 

Module 0: Quantum Computing and Algorithms (Optional)

  • Future of quantum computing in banking

  • Is it necessary to know quantum mechanics?

  • QIS Hardware and Apps

  • Quantum operations

  • Qubit representation

  • Measurement

  • Superposition

  • Matrix multiplication

  • Qubit operations

  • Multiple Quantum Circuits

  • Entanglement

  • Deutsch Algorithm

  • Quantum Fourier transform and search algorithms

  • Hybrid quantum-classical algorithms

  • Quantum annealing, simulation and optimization of algorithms

  • Quantum machine learning algorithms

  • Exercise 1: Quantum operations
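The qubit representation, superposition, measurement and matrix multiplication topics of Module 0 can be illustrated with a few lines of plain Python. This is a classical simulation of a single qubit, offered as a sketch only; the helper names `apply_gate` and `measurement_probabilities` are made up for this example and do not come from any quantum library.

```python
import math


def apply_gate(gate, state):
    """Multiply a 2x2 gate matrix by a 2-element state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def measurement_probabilities(state):
    """Probability of each outcome is the squared modulus of its amplitude."""
    return [abs(amplitude) ** 2 for amplitude in state]


# Hadamard gate: maps a basis state to an equal superposition.
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

zero = [1 + 0j, 0 + 0j]          # the basis state |0>
plus = apply_gate(H, zero)       # equal superposition of |0> and |1>
probs = measurement_probabilities(plus)
back = apply_gate(H, plus)       # applying H twice returns to |0>

print("amplitudes after H:", plus)
print("measurement probabilities:", probs)
```

A single qubit is a unit vector of two complex amplitudes, and every gate is a unitary matrix acting on it, which is why matrix multiplication appears in the module outline.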

Module 1: Artificial Intelligence for Credit Scoring

 

  • Big Data Definition

  • Big Data in financial institutions and fintech

  • Big data in Bigtech

  • Data typology

    • Structured data

    • Semi-structured data

    • Unstructured data

  • Big data: Volume, Velocity, Variety, Veracity and Value

  • Big Data Size

  • Big data sources

    • Transactional data

    • Social media data

    • Credit bureau data

    • Origin of data sources

    • Website data

    • Text data

    • Sensor data

    • RFID and NFC data

    • Data from telecom operators

    • Smart grid data

  • Banking digitization

  • Financial inclusion

  • Regulation in Europe, USA and Latin America

  • Artificial intelligence in banking

  • Artificial intelligence in the credit cycle

 

Module 2: AI in Credit Scoring

 

  • AI in Credit Scoring for Banking and Fintech

  • Offline and online credit scoring

  • Design and Construction of Credit Scoring Models

  • Advantages and disadvantages

  • Models to face new financial crises

  • Machine Learning to develop and validate credit scoring

  • Importance of the Bureau Score

  • Credit Scorecard Management

  • Default Probability Estimation PD

Module 3: Machine Learning

 

  • Definition of Machine Learning

  • Machine Learning Methodology

    • Data Storage

    • Abstraction

    • Generalization

    • Assessment

  • Supervised Learning

  • Unsupervised Learning

  • Reinforcement Learning

  • Deep Learning

  • Typology of Machine Learning algorithms

  • Steps to Implement an Algorithm

    • Information collection

    • Exploratory Analysis

    • Model Training

    • Model Evaluation

    • Model improvements

  • Machine Learning in credit scoring models

  • Quantum Machine Learning

 

MODELING

Module 4: Exploratory Analysis

  • Data typology

  • Transactional data

  • Unstructured data embedded in text documents

  • Social Media Data

  • Data sources

  • Data review

  • Target definition

  • Time horizon of the target variable

  • Sampling

    • Random Sampling

    • Stratified Sampling

    • Rebalanced Sampling

  • Exploratory Analysis:

    • Histograms

    • Q-Q Plot

    • Moment analysis

    • Boxplot

  • Treatment of Missing values

    • Multivariate Imputation Model

  • Advanced Outlier detection and treatment techniques

    • Univariate technique: winsorizing and trimming

    • Multivariate Technique: Mahalanobis Distance

Module 5: Univariate Analysis

  • Data Standardization

  • Variable categorization

    • Equal Interval Binning

    • Equal Frequency Binning

    • Chi-Square Test

  • Binary coding

  • WOE Coding

    • WOE Definition

    • Univariate Analysis with Target variable

    • Variable Selection

    • Treatment of Continuous Variables

    • Treatment of Categorical Variables

    • Gini

    • Information Value

    • Optimization of continuous variables

    • Optimization of categorical variables

  • ​Exercise 1: Exploratory Analysis in R

  • Exercise 2: Advanced detection and treatment of outliers

  • Exercise 3: Stratified and Random Sampling in R

  • Exercise 4: Multivariate imputation model

  • Exercise 5: Univariate analysis by percentiles in R

  • Exercise 6: Optimal univariate analysis of continuous variables in Excel

  • Exercise 7: Estimation of the KS, Gini and IV of each variable in Excel

  • Exercise 8: Word Cloud analysis of variables in R
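The WOE coding and Information Value topics of Module 5 can be sketched as follows. The example assumes the common convention WOE = ln(share of goods / share of bads) per bin, so that higher WOE means lower risk; some texts use the inverse ratio. The bin counts are invented for illustration, and the sketch assumes no bin has zero goods or zero bads.

```python
import math


def woe_iv(bins):
    """Return per-bin WOE values and the Information Value of one variable.

    bins: list of (goods, bads) counts, one tuple per bin.
    """
    total_good = sum(goods for goods, _ in bins)
    total_bad = sum(bads for _, bads in bins)
    woes = []
    iv = 0.0
    for goods, bads in bins:
        dist_good = goods / total_good   # share of all goods in this bin
        dist_bad = bads / total_bad      # share of all bads in this bin
        woe = math.log(dist_good / dist_bad)
        woes.append(woe)
        iv += (dist_good - dist_bad) * woe
    return woes, iv


# Hypothetical counts for a variable split into three bins.
bins = [(100, 40), (300, 40), (600, 20)]
woes, iv = woe_iv(bins)
print("WOE per bin:", [round(w, 4) for w in woes])
print("Information Value:", round(iv, 4))
```

Every term of the IV sum is non-negative, so a larger IV indicates a variable whose bins separate goods from bads more strongly.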

MACHINE LEARNING

Unsupervised Learning

Module 6: Unsupervised models

  • Hierarchical Clusters

  • K Means

  • standard algorithm

  • Euclidean distance

  • Principal Component Analysis (PCA)

  • Advanced PCA Visualization

  • Eigenvectors and Eigenvalues

  • Exercise 14: Principal components in R and SAS

  • Exercise 15: Segmentation of the data with K-Means in R

Supervised Learning

 

Module 7: Logistic Regression and LASSO Regression

 

  • Econometric Models

    • Logit regression

    • Probit regression

    • Piecewise Regression

    • Survival models

  • Machine Learning Models

    • Lasso Regression

    • Ridge Regression

  • Model Risk in Logistic Regression

  • Exercise 16: Credit Scoring Logistic Regression in SAS and R

  • Exercise 17: Credit Scoring Lasso Logistic Regression in R

  • Exercise 18: Model Risk Using Confidence Intervals of Logistic Regression Coefficients

Module 8: Trees, KNN and Naive Bayes

 

  • Decision Trees

    • modeling

    • Advantages and disadvantages

    • Recursion and Partitioning Processes

    • Recursive partitioning tree

    • Pruning Decision tree

    • Conditional inference tree

    • Tree visualization

    • Measurement of decision tree prediction

    • CHAID model

    • C5.0 model

  • K-Nearest Neighbors KNN

    • modeling

    • Advantages and disadvantages

    • Euclidean distance

    • Manhattan distance

    • K value selection

  • Probabilistic Model: Naive Bayes

    • Naive Bayes

    • Bayes' theorem

    • Laplace estimator

    • Classification with Naive Bayes

    • Advantages and disadvantages

  • Exercise 19: Credit Scoring Decision Tree in SAS and R

  • Exercise 20: Credit Scoring KNN in R and SAS

  • Exercise 21: Credit Scoring Naive Bayes in R

Module 9: Support Vector Machine SVM

  • SVM with dummy variables

  • SVM

  • Optimal hyperplane

  • Support Vectors

  • Adding costs

  • Advantages and disadvantages

  • SVM visualization

  • Tuning SVM

  • Kernel trick

  • Exercise 22: Credit Scoring Support Vector Machine in R, data 1

  • Exercise 23: Credit Scoring Support Vector Machine in Python, data 2

Module 10: Ensemble Learning

  • Ensemble models

  • Bagging

  • Bagging trees

  • Random Forest

  • Boosting

  • AdaBoost

  • Gradient Boosting Trees

  • Advantages and disadvantages

  • Exercise 24: Credit Scoring Boosting in R

  • Exercise 25: Credit Scoring Bagging in R

  • Exercise 26: Credit Scoring Random Forest, R and Python, data 1 and 2

  • Exercise 27: Credit Scoring Gradient Boosting Trees

MODEL VALIDATION

 

Module 11: Validation of traditional and Machine Learning models

  • Model validation

  • Validation of machine learning models

  • Regulatory validation of machine learning models in Europe

  • Out of Sample and Out of time validation

  • Checking p-values in regressions

  • R squared, MSE, MAD

  • Residual diagnostics

  • Goodness of Fit Test

  • Multicollinearity

  • Binary case confusion matrix

  • Multinomial case confusion matrix

  • Main discriminant power tests

  • confidence intervals

  • Jackknifing with discriminant power test

  • Bootstrapping with discriminant power test

  • Kappa statistic

  • K-Fold Cross Validation

  • Exercise 28: Logistic Regression Goodness-of-Fit Test

  • Exercise 29: Cross validation in SAS

  • Exercise 30: Gini Estimation, Information Value, Brier Score, Lift Curve, CAP, ROC, Divergence in SAS and Excel

  • Exercise 31: Bootstrapping of parameters in SAS

  • Exercise 32: Jackknifing in SAS

  • Exercise 33: Gini/ROC Bootstrapping in SAS

  • Exercise 34: Kappa estimation

  • Exercise 35: K-Fold Cross Validation in R

  • Exercise 36: Out-of-time traffic light validation (6-year horizon) of logistic and machine learning models
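Two of the discriminant power tests listed in Module 11, the Gini coefficient (via the ROC area) and the K-S statistic, can be sketched in plain Python. The sketch assumes the score is a predicted PD, so a higher value means higher risk; the six observations are invented for illustration.

```python
def auc_pairwise(scores, defaults):
    """Area under the ROC curve as the share of (defaulter, non-defaulter)
    pairs in which the defaulter has the higher score; ties count one half."""
    bad = [s for s, d in zip(scores, defaults) if d == 1]
    good = [s for s, d in zip(scores, defaults) if d == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


def ks_statistic(scores, defaults):
    """Maximum gap between the cumulative score distributions of
    defaulters and non-defaulters."""
    n_bad = sum(defaults)
    n_good = len(defaults) - n_bad
    cum_bad = cum_good = 0
    ks = 0.0
    for threshold in sorted(set(scores)):
        for s, d in zip(scores, defaults):
            if s == threshold:
                if d == 1:
                    cum_bad += 1
                else:
                    cum_good += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
    return ks


# Hypothetical predicted PDs and observed default flags.
pd_hat = [0.05, 0.10, 0.20, 0.30, 0.60, 0.80]
default = [0, 0, 1, 0, 1, 1]

auc = auc_pairwise(pd_hat, default)
gini = 2 * auc - 1
ks = ks_statistic(pd_hat, default)
print("AUC:", auc, "Gini:", gini, "KS:", ks)
```

The pairwise loop is quadratic in the sample size; it is used here for readability, and a rank-based formula would be preferable on large portfolios.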

Module 12: Stability Testing

  • Model stability index

  • Factor stability index

  • Chi-square test

  • K-S test

  • Exercise 37: Stability tests of models and factors
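One common form of the stability index in Module 12 is the population stability index (PSI), which compares the share of observations per score band between a development sample and a recent sample. Treating the course's "model stability index" as PSI is an assumption of this sketch, and the band counts are invented.

```python
import math


def population_stability_index(expected_counts, actual_counts):
    """PSI = sum over bands of (actual share - expected share)
    * ln(actual share / expected share). Assumes no empty band."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        expected_share = expected / expected_total
        actual_share = actual / actual_total
        psi += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return psi


# Hypothetical counts per score band.
development = [200, 300, 300, 200]
recent = [150, 250, 350, 250]

psi = population_stability_index(development, recent)
same = population_stability_index(development, development)
print("PSI development vs recent:", round(psi, 4))
print("PSI of a sample against itself:", same)
```

The same function applied to the bins of a single input variable gives a factor-level index. Cut-off values for "stable" versus "shifted" are conventions that vary between institutions, so none is hard-coded here.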

DEEP LEARNING 

 

Module 14: Introduction to Deep Learning

  • Definition and concept of deep learning

  • Why use deep learning now?

  • Artificial neural networks

  • Neural network architectures

  • Activation function

    • Sigmoid

    • Rectified linear unit

    • Hyperbolic tangent

    • Softmax

  • Feedforward network

  • Multilayer Perceptron

  • Using Tensorflow

  • Using Tensorboard

  • R deep learning

  • Python deep learning

  • Convolutional Neural Networks

  • Use of deep learning in image classification

  • Cost function

  • Gradient descent optimization

  • Using deep learning for credit scoring

    • How many hidden layers?

    • How many neurons: 100, 1000?

    • How many epochs, and what batch size?

    • What is the best activation function?

  • Deep Learning Software: Caffe, H2O, Keras, Microsoft, Matlab, etc.

  • Deployment software: NVIDIA and CUDA

  • Hardware, CPU, GPU and cloud environments

  • Advantages and disadvantages of deep learning

Module 15: Deep Learning Feed Forward Neural Networks

  • Single Layer Perceptron

  • Multiple Layer Perceptron

  • Neural network architectures

  • Activation function

    • Sigmoid

    • Rectified linear unit (ReLU)

    • ELU

    • SELU

    • Hyperbolic tangent

    • Softmax

    • Other

  • Back propagation

    • Directional derivatives

    • gradients

    • Jacobians

    • Chain rule

    • Optimization and local and global minima

  • Exercise 38: Credit Scoring using Deep Learning Feed Forward

Module 16: Deep Learning Convolutional Neural Networks CNN

  • CNN for images

  • Design and architectures

  • Convolution operation

  • Gradient descent

  • Filters

  • Stride

  • Padding

  • Subsampling

  • Pooling

  • Fully connected layers

  • Credit Scoring using CNN

  • Recent CNN studies applied to credit risk and scoring

  • Exercise 39: Credit scoring using deep learning CNN

 

Module 17: Deep Learning Recurrent Neural Networks RNN

  • Natural Language Processing

  • Natural Language Processing (NLP) text classification

  • Long Short-Term Memory (LSTM)

  • Hopfield networks

  • Bidirectional associative memory

  • Gradient descent

  • Global optimization methods

  • RNN and LSTM for credit scoring

  • Unidirectional and bidirectional models

  • Deep Bidirectional Transformers for Language Understanding​

  • Exercise 40: Credit Scoring using Deep Learning LSTM

Module 18: Generative Adversarial Networks (GANs)

  • Generative Adversarial Networks (GANs)

  • Fundamental components of the GANs

  • GAN architectures

  • Bidirectional GAN

  • Training generative models

  • Credit Scoring using GANs

  • Exercise 41: Credit Scoring using GANs

Module 19: Calibrating Machine Learning and Deep Learning

  • Hyperparameter tuning

  • Grid search

  • Random search

  • Bayesian Optimization

  • Train test split ratio

  • Learning rate in optimization algorithms (e.g. gradient descent)

  • Selection of optimization algorithm (e.g., gradient descent, stochastic gradient descent, or Adam optimizer)

  • Activation function selection in a neural network (NN) layer (e.g. Sigmoid, ReLU, Tanh)

  • Selection of loss, cost and custom functions

  • Number of hidden layers in an NN

  • Number of activation units in each layer

  • Dropout rate in an NN (dropout probability)

  • Number of iterations (epochs) in training an NN

  • Number of clusters in a clustering task

  • Kernel or filter size in convolutional layers

  • Pooling size

  • Batch size

  • Exercise 42: Credit Scoring Optimization with XGBoost, Random Forest and SVM

  • Exercise 43: Optimized Credit Scoring Deep Learning

​Module 20: Traditional Scorecard Construction

 

  • Score assignment

  • Scorecard Classification

    • Scorecard WOE

    • Binary Scorecard

    • Continuous Scorecard

  • Scorecard Rescaling

    • Factor and Offset Analysis

    • Scorecard WOE

    • Binary Scorecard

  • Reject Inference Techniques

    • Cut-off

    • Parceling

    • Fuzzy Augmentation

    • Machine Learning

  • Advanced Cut-off Techniques

    • Cut-off optimization using ROC curves

  • Exercise 44: Building Scorecard in Excel, R and Python

  • Exercise 45: Optimum cut-off point estimation in Excel and model risk by cut-off point selection

  • Exercise 46: Confusion matrix to verify Type 1 and Type 2 Error in Excel with and without variables
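The "Factor and Offset Analysis" item under Scorecard Rescaling can be illustrated with the widely used points-to-double-the-odds (PDO) scaling, where score = offset + factor × ln(odds). Whether the course uses exactly this scaling is an assumption; the base score of 600 at odds of 50:1 and the PDO of 20 are illustrative values only.

```python
import math


def scaling_constants(base_score, base_odds, pdo):
    """Factor and offset such that base_odds maps to base_score and
    doubling the good:bad odds adds pdo points."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def score_from_pd(pd, factor, offset):
    """Convert a probability of default into scorecard points."""
    odds = (1 - pd) / pd  # good:bad odds
    return offset + factor * math.log(odds)


factor, offset = scaling_constants(base_score=600, base_odds=50, pdo=20)
s50 = score_from_pd(1 / 51, factor, offset)    # odds of 50:1
s100 = score_from_pd(1 / 101, factor, offset)  # odds of 100:1
print("factor:", round(factor, 4), "offset:", round(offset, 4))
print("score at 50:1:", round(s50, 2), "score at 100:1:", round(s100, 2))
```

In a WOE scorecard the same factor and offset are then distributed across the attributes, so that the points of all characteristics sum to the total score.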

QUANTUM MACHINE LEARNING

 

Module 21: Quantum Credit Scoring

  • What is quantum machine learning?

  • Qubit and Quantum States

  • Quantum machine learning algorithms

  • Quantum circuits

  • Quantum k-means

  • Support Vector Machine

  • Quantum Support Vector Machine

  • Variational quantum classifier

  • Training quantum machine learning models

  • Quantum Neural Networks

  • Quantum GAN

  • Quantum Boltzmann machines

  • Quantum machine learning in Credit Risk

  • Quantum machine learning in credit scoring

  • Quantum software

  • Exercise 47: Quantum K-means

  • Exercise 48: Quantum Support Vector Machine to develop credit scoring model

  • Exercise 49: Quantum feed forward Neural Networks to develop a credit scoring model

  • Exercise 50: Quantum Convolutional Neural Networks to develop a credit scoring model

Module 22: Tensor Networks for Quantum Machine Learning

  • What are tensor networks?

  • Quantum Entanglement

  • Tensor networks in machine learning

  • Tensor networks in unsupervised models

  • Tensor networks in SVM

  • Tensor networks in NN

  • Tensorization of NN

  • Application of tensor networks in credit scoring models

  • Exercise 51: Construction of credit scoring using tensor networks

PROBABILISTIC MACHINE LEARNING

 

​Module 23: Probabilistic Machine Learning

​​

  • Introduction to probabilistic machine learning

  • Gaussian models

  • Bayesian Statistics

  • Bayesian logistic regression

  • Kernel family

  • Gaussian processes

    • Gaussian processes for regression

  • Hidden Markov Model

  • Markov chain Monte Carlo (MCMC)

    • Metropolis Hastings algorithm

  • Machine Learning Probabilistic Model

  • Bayesian Boosting

  • Bayesian Neural Networks

  • Exercise 52: Gaussian process for regression

  • Exercise 53: Credit scoring model using Bayesian Neural Networks

MODEL RISK

Module 24: Model Risk in Credit Scoring

  • Model Risk

  • Model risk in deep learning

  • Model risk in credit scoring

  • Black boxes

  • Cut-off decision

  • Lack of data

  • Model risk from failing to update or recalibrate

  • Ethical concepts of credit scoring

  • Exercise 54: Model risk in credit scoring due to not recalibrating on time

CREDIT SCORING MODELS

Module 25: Credit Scoring Models by Product

  • Admission Credit Scoring

    • Credit Card Score

    • Mortgage Score

    • Consumer Loan Score

    • Car Loan Score

  • Behavior Score (BS)

    • Time horizon

    • Panel data information

    • Panel data regression

    • Cox regression

    • Behavior Score with macroeconomic variables

    • Transition matrices

    • Behavior Score with transition matrices

    • Transaction Score

    • Machine Learning Models

    • Behavior Score on Credit Cards

  • Exercise 55: Behavior Score Logistic Regression in Python data 2

  • Exercise 56: Behavior Score Support Vector Machines in Python

  • Exercise 57: Behavior Score Random Forest in Python

  • Exercise 58: Behavior Score Gradient Boosting Trees in Python

  • Exercise 59: Behavior Score Deep Learning LSTM in Python

Module 26: Typology of Scores

​​​

  • Response Score

  • Income score

  • Attrition Score

  • Admission Fraud Score

  • Follow-up Fraud Score

  • Collection Score

  • Recovery Score

  • Big Data Scoring

  • Exercise 60: Fraud Score with neural networks

  • Exercise 61: Income Score

  • Exercise 62: Collection Score

  • Exercise 63: Recovery Score

  • Exercise 64: Attrition Score

CALIBRATION OF PD MODELS

 

Module 27: Calibration of the Probability of Default PD IRB

  • PD estimation

    • Econometric models

    • Machine Learning Models

    • Data requirement

    • Risk drivers and credit scoring criteria

    • Rating philosophy

    • Pool Treatment

  • PD Calibration

    • Default Definition

    • Long run average for PD

    • Technical defaults and technical default filters

    • Data requirement

    • One Year Default Rate Calculation

    • Long-Term Default Rate Calculation

  • PD Model Risk

    • Conservatism Margin

  • PD Calibration Techniques

    • Anchor Point Estimate

    • Mapping from Score to PD

    • Adjustment of the PD to the economic cycle

    • Rating Philosophy

  • PD Through the Cycle (PD TTC) models

  • PD Point in Time (PD PIT) models

  • PD Calibration of Models Using Machine and Deep Learning

  • Exercise 65: PD Calibration in Machine Learning Models
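One simple way to calibrate model PDs to a central tendency (for example a long-run average default rate) is to shift them on the odds scale. The sketch below shows that prior-shift adjustment; it is one of several possible techniques and is not presented as the course's own method. The sample default rate of 10% and the central tendency of 3% are invented figures.

```python
def shift_to_central_tendency(pd_model, sample_default_rate, central_tendency):
    """Rescale the odds of a model PD by the ratio between the target odds
    (central tendency) and the odds of the development sample."""
    target_odds = central_tendency / (1 - central_tendency)
    sample_odds = sample_default_rate / (1 - sample_default_rate)
    adjusted_odds = pd_model / (1 - pd_model) * target_odds / sample_odds
    return adjusted_odds / (1 + adjusted_odds)


# Hypothetical model outputs from a sample with a 10% default rate,
# recalibrated towards a 3% long-run average.
model_pds = [0.02, 0.10, 0.40]
calibrated = [shift_to_central_tendency(p, 0.10, 0.03) for p in model_pds]
print([round(p, 4) for p in calibrated])
```

The adjustment preserves the rank ordering of obligors and maps a PD equal to the sample default rate exactly onto the central tendency. The portfolio average of the adjusted PDs is only approximately equal to the target, so a final check against the long-run average is still needed.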

Module 28: Machine Learning models to estimate Lifetime PD under IFRS 9

​​

  • Credit scoring models to estimate Lifetime PD

  • PD Lifetime in IFRS 9

  • Impact of COVID-19 on models

  • Climate Risk Impact

  • Inflation impact

  • Impact of rising prices

  • Regression Models

    • Logistic regression

    • Multinomial Logistic Regression

    • Ordinal Probit Regression

  • VAR and VEC models

  • Machine Learning Model​

    • SVM: Kernel Function Definition

    • Neural Network: definition of hyperparameters and activation function

    • Deep learning

    • LSTM

  • PD Calibration of Models Using Machine and Deep Learning

  • Exercise 66: PD Lifetime using logistic regression

  • Exercise 67: PD Lifetime using multinomial regression in R

  • Exercise 68: PD Lifetime using SVM in Python

  • Exercise 69: PD Lifetime using Deep Learning in Python

  • Exercise 70: PD Lifetime using Deep Learning LSTM in Python

VALIDATION OF PD MODELS

 

​Module 29: Validation of PD models

  • Definition of PD Backtesting

  • PD Calibration Validation​

    • Normal test

    • Binomial Test

    • Traffic Light Approach

  • Traffic Light Analysis and PD Dashboard

  • PD Stability Test

  • Forecast PD vs. actual PD over time

  • When to recalibrate or reestimate a credit scoring model?

  • Re-development

  • Re-estimation

  • Model Risk in PD

  • Machine Learning to validate PD models

  • Artificial Intelligence to recalibrate and rebuild models autonomously

  • Exercise 71: Backtesting PD in Excel

  • Exercise 72: Forecasting PD and actual PD in Excel
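The binomial test and traffic light approach listed in Module 29 can be sketched as follows. The example assumes independent defaults within a rating grade, which is a strong simplification, and the amber and red significance levels (5% and 1%) are illustrative choices rather than regulatory values.

```python
import math


def binomial_p_value(n_obligors, n_defaults, pd_estimate):
    """Probability of observing at least n_defaults defaults among
    n_obligors if the true PD equals pd_estimate (independent defaults)."""
    return sum(
        math.comb(n_obligors, k)
        * pd_estimate ** k
        * (1 - pd_estimate) ** (n_obligors - k)
        for k in range(n_defaults, n_obligors + 1)
    )


def traffic_light(p_value, amber=0.05, red=0.01):
    """Map a p-value to a colour; thresholds are hypothetical."""
    if p_value < red:
        return "red"
    if p_value < amber:
        return "amber"
    return "green"


# Hypothetical grade: 1000 obligors, estimated PD of 2%, 35 observed defaults.
p_value = binomial_p_value(1000, 35, 0.02)
light = traffic_light(p_value)
print("p-value:", p_value, "->", light)
```

A small p-value means the observed defaults are unlikely under the estimated PD, which suggests the PD is underestimated for that grade. Correlated defaults widen the true distribution, so this test tends to reject too often in stressed years.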

AUTOMATION OF CREDIT SCORING AND PD WITH AI

 

Module 30: Automation of Credit Scoring and PD Modeling

  • What is modeling automation?

  • What is automated?

  • Automation of machine learning processes

  • Optimizers and Evaluators

  • Modeling Automation Workflow Components

    • Summary

    • Processing

    • Feature engineering

    • Model generation

    • Assessment

  • Hyperparameter optimization

  • Reconstruction or recalibration of credit scoring

  • Credit Scoring Modeling

    • Main milestones

    • Evaluation and optimization

    • Possible Issues

  • PD calibration modeling

    • Evaluation and optimization

    • Backtesting

    • Discriminating Power

    • Stability Tests

  • Global evaluation of modeling automation

  • Implementation of modeling automation in banking

  • Technological requirements

  • Available tools

  • Benefits and possible ROI estimation

  • Main Issues

  • Model Risk

  • Genetic algorithms

  • Exercise 73: Automation of credit scoring modeling, hyperparameter optimization and validation

  • Exercise 74: Automation of PD modeling and validation
