Calibration
The module provides methods for calibration analytics. Its table reports the number of observations, the number of observed defaults, the number of predicted defaults, the observed Probability of Default (PD), the predicted PD, the observed grade, the predicted grade, the absolute grade difference as an integer, the binomial p-value for underestimation, the binomial p-value for overestimation, the Jeffreys p-value, the observed PD Central Tendency (CT), and the predicted PD CT. The module also provides a "calibration" plot.
Accessor
Call the calibration method on the DataFrame through the crm accessor. Minimal working example:
df.crm.calibration(default="DEFAULT", score="SCORE")
Parameters:
Name | Type | Description | Default |
---|---|---|---|
default | str | Defines the default column, typically containing binary values. | required |
score | str | Defines the score column, typically containing Probabilities of Default (PDs) in absolute values. If necessary, grades can be transformed to scores (for example, PDs) via the module "general.py". | required |
by | str | Defines the grouper to be applied to the DataFrame, for example, "DATE". | None |
Returns:
Type | Description |
---|---|
Calibration | Returns a class called "Calibration" providing calibration analytics methods. |
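To illustrate what the "by" grouper implies for the per-group counts, here is a minimal sketch in plain Python. It is not the package's implementation: the toy records, the function name "aggregate_by_group", and the reading of the predicted PD as the mean score per group are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy records: (group key, predicted PD, default flag).
records = [
    ("2019-12-31", 0.10, 0),
    ("2019-12-31", 0.30, 1),
    ("2020-12-31", 0.20, 0),
    ("2020-12-31", 0.40, 1),
    ("2020-12-31", 0.60, 1),
]


def aggregate_by_group(rows):
    """Collect per-group counts and PDs from (key, score, default) rows."""
    grouped = defaultdict(list)
    for key, score, default in rows:
        grouped[key].append((score, default))

    summary = {}
    for key, pairs in sorted(grouped.items()):
        n_obs = len(pairs)
        n_defaults = sum(default for _, default in pairs)
        score_sum = sum(score for score, _ in pairs)
        summary[key] = {
            "OBS": n_obs,
            "DEFAULT_OBS": n_defaults,
            "PD_OBS": n_defaults / n_obs,  # observed default rate
            "PD_PRE": score_sum / n_obs,   # mean predicted PD (assumption)
        }
    return summary


summary = aggregate_by_group(records)
for key, stats in summary.items():
    print(key, stats)
```

Each group key yields one row of counts and rates, which mirrors the one-row-per-group layout of the table returned by the package.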
Methods
plot(x_axis_label='Predicted Probability of Default', y_axis_label='Observed Probability of Default', annotation=True, annot_xy=None, legend_loc='upper left', fig_size=(7.5, 7.5), path=None, show=False)
Minimal working example:
df.crm.calibration(default="DEFAULT", score="SCORE", by="GRADE").plot()
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x_axis_label | str | Defines the x-axis label. | 'Predicted Probability of Default' |
y_axis_label | str | Defines the y-axis label. | 'Observed Probability of Default' |
annotation | bool | Adds a "Progressive" and "Conservative" box to the plot. | True |
annot_xy | tuple | Adds annotations to data points provided as relative x- and y-positions, i.e. to be entered as (float, float). | None |
legend_loc | str | Defines the legend location. | 'upper left' |
fig_size | tuple | Defines the figure size. | (7.5, 7.5) |
path | str | Defines the saving path including the filename of the figure and should be entered as r"C:\<path>\<filename>.<image_format>". | None |
show | bool | If True, plot is shown. | False |
Returns:
Type | Description |
---|---|
BytesIO | Returns a "calibration" plot. |
table(add_mean=False, add_sum=False)
Minimal working example:
df.crm.calibration(default="DEFAULT", score="SCORE").table()
Parameters:
Name | Type | Description | Default |
---|---|---|---|
add_mean | bool | Adds a mean row to the table. The mean is calculated for the number of observations (rounded down to nearest integer), the number of observed defaults (rounded down to nearest integer), the number of predicted defaults, the observed PD, and the predicted PD. The remaining statistics are based on these mean values. | False |
add_sum | bool | Adds a sum row to the table. The sum is calculated for the number of observations, the number of observed defaults, the number of predicted defaults, the observed PD, and the predicted PD. The remaining statistics are based on these sum values. | False |
Returns:
Type | Description |
---|---|
DataFrame | Returns a table containing the number of observations, the number of observed defaults, the number of predicted defaults, the observed PD, the predicted PD, the observed grade, the predicted grade, the absolute grade difference as integer, the binomial p-value for underestimation, the binomial p-value for overestimation, the Jeffreys p-value, the observed PD CT, and the predicted PD CT. |
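The two binomial p-values can be read as one-sided tail probabilities of a binomial distribution. The sketch below shows one common way to compute them with the standard library only. It is an assumption that the package uses the same definitions (upper tail for underestimation, lower tail for overestimation); the function names "binom_pmf" and "binom_p_values" are hypothetical and not part of the package.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability mass P(X = k), computed in log space.

    Requires 0 < p < 1 so that both logarithms are defined.
    """
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def binom_p_values(n_obs, n_defaults, pd_predicted):
    """Return (P(X >= d), P(X <= d)) for X ~ Binomial(n_obs, pd_predicted)."""
    pmf = [binom_pmf(k, n_obs, pd_predicted) for k in range(n_obs + 1)]
    # Upper tail: small when there are more defaults than the PD implies
    # (assumed here to correspond to the underestimation test).
    p_upper = sum(pmf[n_defaults:])
    # Lower tail: small when there are fewer defaults than the PD implies
    # (assumed here to correspond to the overestimation test).
    p_lower = sum(pmf[: n_defaults + 1])
    return p_upper, p_lower


# Inputs taken from the first row of the example output in the Examples section:
# 750 observations, 131 observed defaults, predicted PD 0.178551.
p_und, p_ove = binom_p_values(750, 131, 0.178551)
print(p_und, p_ove)
```

If the package follows the same one-sided definitions, the printed values should land close to the P_BINOM_UND and P_BINOM_OVE entries of that row. The Jeffreys p-value is not covered by this sketch.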
Examples
>>> import credit_risk_modelling as crm
>>> data = crm.load_data.load_data()
>>> data
DATE ID GRADE GRADE_PD OVERRIDE OVERRIDE_PD DEFAULT
0 2019-12-31 10 B 0.1000 B 0.1000 0
1 2019-12-31 100 BBB 0.0090 BB 0.0400 0
2 2019-12-31 1000 BBB 0.0090 BBB 0.0090 0
3 2019-12-31 1001 BBB 0.0090 BBB 0.0090 0
4 2019-12-31 1003 BBB 0.0090 BBB 0.0090 0
... ... ... ... ... ... ... ...
4145 2023-12-31 994 AA 0.0010 AA 0.0010 0
4146 2023-12-31 995 AA 0.0010 AA 0.0010 0
4147 2023-12-31 996 A 0.0020 A 0.0020 0
4148 2023-12-31 998 B 0.1000 B 0.1000 0
4149 2023-12-31 999 AAA 0.0002 AAA 0.0002 0
[4150 rows x 7 columns]
>>> (
...     data
...     .crm.calibration(default="DEFAULT", score="GRADE_PD", by="DATE")
...     .table()
... )
BY OBS DEFAULT_OBS DEFAULT_PRE PD_OBS PD_PRE GRADE_OBS GRADE_PRE GRADE_DIF P_BINOM_UND P_BINOM_OVE P_JEFFREYS CT_PD_OBS CT_PD_PRE
0 2019-12-31 750 131 134.0 0.174667 0.178551 CCC CCC 0 0.624078 0.412805 0.605746 0.174667 0.178551
1 2020-12-31 800 170 165.0 0.212500 0.206311 CCC CCC 0 0.346001 0.685403 0.330143 0.193583 0.192431
2 2021-12-31 700 150 149.0 0.214286 0.212161 CCC CCC 0 0.460159 0.576223 0.441899 0.200484 0.199008
3 2022-12-31 900 205 203.0 0.227778 0.225724 CCC CCC 0 0.454351 0.577001 0.438622 0.207308 0.205687
4 2023-12-31 1000 236 221.0 0.236000 0.220917 CCC CCC 0 0.133490 0.882007 0.125573 0.213046 0.208733
>>> (
...     data
...     .crm.calibration(default="DEFAULT", score="GRADE_PD", by="GRADE")
...     .plot(show=True)
... )
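The CT_PD_OBS column in the table output above appears to be consistent with a running (expanding) mean of the per-date observed PDs. This is an inference from the printed numbers, not a documented definition of the Central Tendency. A minimal check in plain Python, using only values copied from that output:

```python
from itertools import accumulate

# Observations and observed defaults per date, copied from the table output.
obs = [750, 800, 700, 900, 1000]
defaults = [131, 170, 150, 205, 236]

# Observed PD per date.
pd_obs = [d / n for d, n in zip(defaults, obs)]

# Running (expanding) mean of the per-date observed PDs.
ct_pd_obs = [
    total / count
    for count, total in enumerate(accumulate(pd_obs), start=1)
]

print([round(value, 6) for value in ct_pd_obs])
# [0.174667, 0.193583, 0.200484, 0.207308, 0.213046]
```

The rounded results match the CT_PD_OBS column row by row, which supports the running-mean reading for this example.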