In November 2021, the EBA issued a discussion paper on the use of machine learning (ML) techniques in the IRB framework. The document defines ML models as models with a high number of parameters that require a large volume of data for their estimation. It goes on to describe the challenges and benefits of ML techniques in IRB calculation and, finally, discusses how to ensure a prudent use of ML in the credit risk IRB framework.
The EBA identifies several main areas where ML may be used: risk differentiation, risk quantification and model validation (where ML models may serve as challenger models), as well as some additional areas, such as input preparation for the main IRB models (for instance, collateral valuation). Less focus is required if ML models are used at lower levels in the process of IRB calculations. Other considerations may also play a role, such as legal and ethical aspects as well as consumer and data protection.
In these main areas, challenges arise in the availability, quality and representativeness of data. The regulatory requirement to apply human judgment in a sound calculation process may prevent the use of the more complex ML models, because their outcomes are difficult to explain.
The complexity, reliability and interpretability of ML model results are key challenges. For some types of ML models, documenting the underlying assumptions and theory, as required by regulation, is difficult. Similar explainability challenges appear during the validation process and in IRB governance.
As for the benefits, ML models may improve risk differentiation, either by improving discriminatory power or by identifying all relevant risk drivers. Similarly, risk quantification benefits from improved predictive ability and from the detection of material biases. ML tools are also beneficial in data collection and preparation processes.
Principles of Prudent Use of ML
Rather than describing which ML tools may be used in which aspects of the IRB process, the EBA gives recommendations in the form of a principle-based approach. Banks must have an appropriate level of knowledge of the ML model’s functioning in the model development (MD) unit, the credit risk control unit (CRCU) and the validation unit. Senior management has to be in a position to understand the models. Furthermore, unnecessary complexity should be avoided when building the models.
ML models used in IRB processes should be properly documented, and if human judgment is used in model development or to override the outcomes, the staff in charge should be in a position to assess the model assumptions.
Particular attention needs to be paid to the validation of these models. Sufficient data quality needs to be ensured, and unstructured data should be used with care. The rationale behind the choice of hyperparameters should be thoroughly assessed, as should model stability and the risk of overfitting.
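As an illustration of what a model stability check might look like in practice, the sketch below computes a Population Stability Index (PSI) between the score distribution at development time and a more recent one. The discussion paper does not prescribe this metric; PSI, the function name psi, the bin count and the sample scores are all assumptions chosen for illustration.

```python
from math import log

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two lists of scores.

    Bins are equal-width over the range of `expected` (an assumption;
    quantile bins are also commonly used). Empty bins are floored at
    `eps` so the logarithm stays defined.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back if all scores are equal

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clip values outside the range
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical scores: a development sample and a shifted recent sample.
dev_scores = [i / 100 for i in range(100)]
recent_scores = [min(s + 0.2, 0.99) for s in dev_scores]

psi_same = psi(dev_scores, dev_scores)
psi_shifted = psi(dev_scores, recent_scores)
print(psi_same, psi_shifted)
```

A PSI of zero means the two distributions fall into the bins identically; larger values indicate a larger shift. Any threshold for "material" instability would be a policy choice for the institution and is not taken from the EBA paper.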
The Experian Approach
Now that regulators are paving the way for the use of ML models in the calculation of capital, the ability to explain model outcomes becomes crucial. When consulted about the use of ML models, our clients mention explainability as one of their major concerns. Experian has developed an advanced approach to improve the explainability of ML models, so that the outcomes of these models can easily be explained to, and understood by, stakeholders and customers.
Explanations take the form of a variable importance ranking, which can then easily be translated into reason codes. This is achieved with advanced techniques that derive an importance score for each feature based on partial dependencies and SHAP values. SHAP quantifies the contribution that each feature makes to the prediction of the model. A confidence level is then derived from these scores to gauge the real importance of individual features.
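To make the idea of feature contributions and reason codes concrete, the following minimal sketch computes exact Shapley values by brute force for a toy scoring function and turns the ranking into reason codes. It is not Experian's implementation and does not use the SHAP library; the model toy_score, the feature names, the baseline customer and the reason-code texts are all hypothetical. Brute-force enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value,
    which is one common (assumed) way to define a 'missing' feature.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

def toy_score(x):
    """Hypothetical risk score with one tree-like interaction rule."""
    score = 0.02 * x["utilisation"] + 0.5 * x["missed_payments"] - 0.01 * x["months_on_book"]
    if x["utilisation"] > 80 and x["missed_payments"] > 0:
        score += 1.0
    return score

# Hypothetical reason-code texts, one per feature.
REASON_CODES = {
    "utilisation": "R01: high credit utilisation",
    "missed_payments": "R02: recent missed payments",
    "months_on_book": "R03: short account history",
}

instance = {"utilisation": 90, "missed_payments": 2, "months_on_book": 12}
baseline = {"utilisation": 30, "missed_payments": 0, "months_on_book": 60}

phi = shapley_values(toy_score, instance, baseline)
# Rank features by how much they push the score up relative to the baseline.
ranking = sorted(phi, key=phi.get, reverse=True)
codes = [REASON_CODES[f] for f in ranking[:2] if phi[f] > 0]
print(phi, codes)
```

The contributions add up to the difference between the score of the instance and the score of the baseline, which is the property that makes such values usable as a per-customer explanation.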
Experian has also developed a standardized framework for developing and deploying ML models with the required level of explainability. This makes the process 50% faster than standard modelling processes.
The Experian Explainability plug-in is compatible with tree-based ML models.