Calibration

The module provides methods for calibration analytics. It produces a table containing the number of observations, the number of observed defaults, the number of predicted defaults, the observed Probability of Default (PD), the predicted PD, the observed grade, the predicted grade, the absolute grade difference (as an integer), the binomial p-value for underestimation, the binomial p-value for overestimation, the Jeffreys p-value, the observed PD Central Tendency (CT), and the predicted PD CT. It also produces a "calibration" plot.

Accessor

Call the calibration method on the DataFrame through the crm accessor. Minimal working example:

df.crm.calibration(default="DEFAULT", score="SCORE")

Parameters:

Name	Type	Description	Default
`default`	`str`	Defines the default column, typically containing binary values.	required
`score`	`str`	Defines the score column, typically containing Probabilities of Default (PDs) in absolute values. If necessary, grades can be transformed to scores (for example, PDs) via the module "general.py".	required
`by`	`str`	Defines the grouper to be applied to the DataFrame, for example, "DATE".	`None`

Returns:

Type	Description
`Calibration`	Returns a "Calibration" object providing calibration analytics methods.

Methods

`plot(x_axis_label='Predicted Probability of Default', y_axis_label='Observed Probability of Default', annotation=True, annot_xy=None, legend_loc='upper left', fig_size=(7.5, 7.5), path=None, show=False)`

Minimal working example:

df.crm.calibration(default="DEFAULT", score="SCORE", by="GRADE").plot()

Parameters:

Name	Type	Description	Default
`x_axis_label`	`str`	Defines the x-axis label.	`'Predicted Probability of Default'`
`y_axis_label`	`str`	Defines the y-axis label.	`'Observed Probability of Default'`
`annotation`	`bool`	Adds a "Progressive" and "Conservative" box to the plot.	`True`
`annot_xy`	`tuple`	Adds annotations to the data points. The annotation position is given as relative x- and y-positions, entered as (float, float).	`None`
`legend_loc`	`str`	Defines the legend location.	`'upper left'`
`fig_size`	`tuple`	Defines the figure size.	`(7.5, 7.5)`
`path`	`str`	Defines the path, including the filename, to which the figure is saved. It should be entered as r"C:\<path>\<filename>.<image_format>".	`None`
`show`	`bool`	If True, the plot is shown.	`False`

Returns:

Type	Description
`BytesIO`	Returns a "calibration" plot.
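Because the return type is `BytesIO`, the figure can in principle also be written to disk by hand instead of via `path`. The following is a minimal sketch using only the standard library; `save_buffer` is a hypothetical helper, and `fake_plot` is a stand-in buffer with placeholder bytes, not real output of `.plot()`.

```python
import io
import tempfile
from pathlib import Path


def save_buffer(buffer: io.BytesIO, target: Path) -> int:
    """Write the full contents of an in-memory buffer to `target`.

    Returns the number of bytes written.
    """
    # getvalue() returns all bytes regardless of the current read position.
    return target.write_bytes(buffer.getvalue())


# Stand-in for the object returned by .plot(); a real buffer would hold image data.
fake_plot = io.BytesIO(b"placeholder image bytes")

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "calibration.png"  # hypothetical filename
    written = save_buffer(fake_plot, target)
    round_trip = target.read_bytes()

print(written)
```

Using `getvalue()` rather than `read()` avoids depending on where the buffer's read cursor currently sits.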

`table(add_mean=False, add_sum=False)`

Minimal working example:

df.crm.calibration(default="DEFAULT", score="SCORE").table()

Parameters:

Name	Type	Description	Default
`add_mean`	`bool`	Adds a mean row to the table. The mean is calculated for the number of observations (rounded down to the nearest integer), the number of observed defaults (rounded down to the nearest integer), the number of predicted defaults, the observed PD, and the predicted PD. The remaining statistics are based on these mean values.	`False`
`add_sum`	`bool`	Adds a sum row to the table. The sum is calculated for the number of observations, the number of observed defaults, the number of predicted defaults, the observed PD, and the predicted PD. The remaining statistics are based on these sum values.	`False`

Returns:

Type	Description
`DataFrame`	Returns a table containing the number of observations, the number of observed defaults, the number of predicted defaults, the observed PD, the predicted PD, the observed grade, the predicted grade, the absolute grade difference (as an integer), the binomial p-value for underestimation, the binomial p-value for overestimation, the Jeffreys p-value, the observed PD CT, and the predicted PD CT.
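To make the three p-value columns more concrete, here is a minimal standard-library sketch of how such values are commonly defined. It assumes the textbook conventions: the underestimation p-value is the upper binomial tail P(X >= d), the overestimation p-value is the lower tail P(X <= d), and the Jeffreys p-value is the Beta(d + 0.5, n - d + 0.5) distribution function evaluated at the predicted PD. Whether the package computes them exactly this way is an assumption; the function names are hypothetical. The inputs are taken from the first row of the example output in the Examples section (750 observations, 131 observed defaults, predicted PD 0.178551).

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the binomial probability P(X = k); requires 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_p_values(n_obs, n_defaults, pd_pred):
    """One-sided binomial tail probabilities (assumed definitions)."""
    pmf = [math.exp(binom_log_pmf(k, n_obs, pd_pred)) for k in range(n_obs + 1)]
    p_under = math.fsum(pmf[n_defaults:])      # P(X >= d): evidence of underestimation
    p_over = math.fsum(pmf[:n_defaults + 1])   # P(X <= d): evidence of overestimation
    return p_under, p_over


def jeffreys_p_value(n_obs, n_defaults, pd_pred, steps=4000):
    """Beta(d + 0.5, n - d + 0.5) CDF at pd_pred via composite Simpson's rule.

    Assumes at least one default, so the density is finite at zero,
    and an even number of integration steps.
    """
    a = n_defaults + 0.5
    b = n_obs - n_defaults + 0.5
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(x):
        if x <= 0.0 or x >= 1.0:
            return 0.0
        return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log1p(-x))

    h = pd_pred / steps
    total = pdf(0.0) + pdf(pd_pred)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3


# First row of the example table: OBS, DEFAULT_OBS, PD_PRE.
p_under, p_over = binom_p_values(750, 131, 0.178551)
p_jeff = jeffreys_p_value(750, 131, 0.178551)
print(p_under, p_over, p_jeff)
```

For that row the example table reports 0.624078, 0.412805, and 0.605746. Under the assumptions above the sketch should land close to these values, with small differences expected because the predicted PD is only given to six decimals.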

Examples

data

>>> import credit_risk_modelling as crm
>>> data = crm.load_data.load_data()
>>> data

           DATE    ID GRADE  GRADE_PD OVERRIDE  OVERRIDE_PD  DEFAULT
0    2019-12-31    10     B    0.1000        B       0.1000        0
1    2019-12-31   100   BBB    0.0090       BB       0.0400        0
2    2019-12-31  1000   BBB    0.0090      BBB       0.0090        0
3    2019-12-31  1001   BBB    0.0090      BBB       0.0090        0
4    2019-12-31  1003   BBB    0.0090      BBB       0.0090        0
...         ...   ...   ...       ...      ...          ...      ...
4145 2023-12-31   994    AA    0.0010       AA       0.0010        0
4146 2023-12-31   995    AA    0.0010       AA       0.0010        0
4147 2023-12-31   996     A    0.0020        A       0.0020        0
4148 2023-12-31   998     B    0.1000        B       0.1000        0
4149 2023-12-31   999   AAA    0.0002      AAA       0.0002        0

[4150 rows x 7 columns]

.table()

>>> (
...     data
...     .crm.calibration(default="DEFAULT", score="GRADE_PD", by="DATE")
...     .table()
... )

          BY   OBS  DEFAULT_OBS  DEFAULT_PRE    PD_OBS    PD_PRE GRADE_OBS GRADE_PRE  GRADE_DIF  P_BINOM_UND  P_BINOM_OVE  P_JEFFREYS  CT_PD_OBS  CT_PD_PRE
0 2019-12-31   750          131        134.0  0.174667  0.178551       CCC       CCC          0     0.624078     0.412805    0.605746   0.174667   0.178551
1 2020-12-31   800          170        165.0  0.212500  0.206311       CCC       CCC          0     0.346001     0.685403    0.330143   0.193583   0.192431
2 2021-12-31   700          150        149.0  0.214286  0.212161       CCC       CCC          0     0.460159     0.576223    0.441899   0.200484   0.199008
3 2022-12-31   900          205        203.0  0.227778  0.225724       CCC       CCC          0     0.454351     0.577001    0.438622   0.207308   0.205687
4 2023-12-31  1000          236        221.0  0.236000  0.220917       CCC       CCC          0     0.133490     0.882007    0.125573   0.213046   0.208733
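The relationships between several columns can be checked directly from the output above. The sketch below is inferred from the printed numbers, not taken from the package source: it assumes that PD_OBS is DEFAULT_OBS divided by OBS, that DEFAULT_PRE equals OBS times PD_PRE rounded to the nearest integer, and that the CT columns are running (expanding) means of the per-date PDs. `running_mean` is a hypothetical helper.

```python
# Values copied from the example table above (grouped by DATE).
obs = [750, 800, 700, 900, 1000]
default_obs = [131, 170, 150, 205, 236]
pd_pre = [0.178551, 0.206311, 0.212161, 0.225724, 0.220917]

# Observed PD per date: observed defaults over observations.
pd_obs = [d / n for d, n in zip(default_obs, obs)]

# Predicted defaults per date: observations times predicted PD, rounded.
default_pre = [round(n * p) for n, p in zip(obs, pd_pre)]


def running_mean(values):
    """Mean of all values up to and including each position."""
    out, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        out.append(total / count)
    return out


# Central tendency read as the running mean over dates (an assumption).
ct_pd_obs = running_mean(pd_obs)
ct_pd_pre = running_mean(pd_pre)

for row in zip(pd_obs, default_pre, ct_pd_obs, ct_pd_pre):
    print(row)
```

Under this reading, the CT value of the last date is the average PD over all dates in the table, which is why CT_PD_OBS and PD_OBS coincide in the first row and diverge afterwards.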

.plot()

>>> (
...     data
...     .crm.calibration(default="DEFAULT", score="GRADE_PD", by="GRADE")
...     .plot(show=True)
... )

calibration plot