Discrimination

The module provides discrimination analytics. Its table reports the Accuracy Ratio (AR), the lower and upper bounds of the AR confidence interval, the significance level, the number of observations, and the number of observed defaults. It also plots a Receiver Operating Characteristic (ROC) curve.

Accessor

Call `discrimination` on the DataFrame through the `crm` accessor. Minimal working example:

df.crm.discrimination(default="DEFAULT", score="SCORE")

Parameters:

Name	Type	Description	Default
`default`	`str`	Defines the default column, typically containing binary values.	required
`score`	`str`	Defines the score column, typically containing Probabilities of Default (PDs) as decimals. If only grades are available, they can first be transformed to scores such as PDs via the module "general.py".	required
`alpha`	`float`	Defines the significance level as a decimal, i.e. a significance level of 10% should be entered as 0.1.	`None`
`by`	`str`	Defines the grouper to be applied to the DataFrame, for example, "DATE".	`None`

Returns:

Type	Description
`Discrimination`	Returns a `Discrimination` object providing discrimination analytics methods.

Methods

`plot(x_axis_label='1 - Specificity / Percentage of Ratings', y_axis_label='Sensitivity / Percentage of Defaults', legend_label=None, score_2=None, legend_label_2=None, legend_loc='lower right', fig_size=(7.5, 7.5), path=None, show=False)`

Minimal working example:

df.crm.discrimination(default="DEFAULT", score="SCORE").plot()

Parameters:

Name	Type	Description	Default
`x_axis_label`	`str`	Defines the x-axis label.	`'1 - Specificity / Percentage of Ratings'`
`y_axis_label`	`str`	Defines the y-axis label.	`'Sensitivity / Percentage of Defaults'`
`legend_label`	`Union[str, list]`	Defines the legend label.	`None`
`score_2`	`str`	Defines a second score column.	`None`
`legend_label_2`	`Union[str, list]`	Defines the legend label of the second score column.	`None`
`legend_loc`	`str`	Defines the legend location.	`'lower right'`
`fig_size`	`tuple`	Defines the figure size.	`(7.5, 7.5)`
`path`	`str`	Defines the path, including the filename, under which the figure is saved. It should be entered as r"C:\<path>\<filename>.<image_format>".	`None`
`show`	`bool`	If True, the plot is shown.	`False`

Returns:

Type	Description
`BytesIO`	Returns the ROC curve.
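
To make the axes of the plot concrete, the following minimal sketch computes the points of a ROC curve from a default flag and a score. It is an illustration in plain Python, not the package's implementation; the function name `roc_points` and the toy data are assumptions made for this example. It assumes that a higher score (for example, a higher PD) indicates a higher risk of default.

```python
def roc_points(defaults, scores):
    """Return (fpr, tpr) lists of ROC points.

    fpr is 1 - specificity (x-axis), tpr is sensitivity (y-axis).
    Assumes a higher score means a riskier obligor.
    """
    n_pos = sum(defaults)
    n_neg = len(defaults) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one default and one non-default")

    # Sweep the threshold from the highest score to the lowest.
    pairs = sorted(zip(scores, defaults), key=lambda p: p[0], reverse=True)

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, is_default) in enumerate(pairs):
        if is_default:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last observation of a tied score.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


# Toy data (hypothetical): three defaults, three non-defaults, one tied score.
toy_defaults = [0, 0, 1, 0, 1, 1]
toy_scores = [0.01, 0.02, 0.05, 0.05, 0.10, 0.20]
fpr, tpr = roc_points(toy_defaults, toy_scores)
```

The curve always starts at (0, 0) and ends at (1, 1). Tied scores are handled as a single step, so the result does not depend on the order of tied observations.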

`table()`

Minimal working example:

df.crm.discrimination(default="DEFAULT", score="SCORE").table()

Returns:

Type	Description
`DataFrame`	Returns a table containing the AR, the left (lower) bound of the AR confidence interval, the right (upper) bound of the AR confidence interval, the significance level, the number of observations, and the number of observed defaults.
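
As background for the AR column, the Accuracy Ratio relates to the area under the ROC curve (AUC) by AR = 2 * AUC - 1. The sketch below computes it by counting correctly ranked (default, non-default) pairs, with ties counted as half. It is an illustration only; the function name `accuracy_ratio` and the toy data are assumptions, and the package may compute the AR differently.

```python
def accuracy_ratio(defaults, scores):
    """Return AR = 2 * AUC - 1.

    AUC is the share of (default, non-default) pairs in which the
    default has the higher score; tied pairs count as one half.
    """
    pos = [s for s, d in zip(scores, defaults) if d]
    neg = [s for s, d in zip(scores, defaults) if not d]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    auc = wins / (len(pos) * len(neg))
    return 2.0 * auc - 1.0


# Toy data (hypothetical): 8.5 of 9 pairs are ranked correctly.
toy_defaults = [0, 0, 1, 0, 1, 1]
toy_scores = [0.01, 0.02, 0.05, 0.05, 0.10, 0.20]
ar = accuracy_ratio(toy_defaults, toy_scores)
```

The pairwise loop is quadratic in the number of observations, which keeps the sketch readable; a rank-based formulation would be preferable for large portfolios.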

Examples

data

>>> import credit_risk_modelling as crm
>>> data = crm.load_data.load_data()
>>> data

           DATE    ID GRADE  GRADE_PD OVERRIDE  OVERRIDE_PD  DEFAULT
0    2019-12-31    10     B    0.1000        B       0.1000        0
1    2019-12-31   100   BBB    0.0090       BB       0.0400        0
2    2019-12-31  1000   BBB    0.0090      BBB       0.0090        0
3    2019-12-31  1001   BBB    0.0090      BBB       0.0090        0
4    2019-12-31  1003   BBB    0.0090      BBB       0.0090        0
...         ...   ...   ...       ...      ...          ...      ...
4145 2023-12-31   994    AA    0.0010       AA       0.0010        0
4146 2023-12-31   995    AA    0.0010       AA       0.0010        0
4147 2023-12-31   996     A    0.0020        A       0.0020        0
4148 2023-12-31   998     B    0.1000        B       0.1000        0
4149 2023-12-31   999   AAA    0.0002      AAA       0.0002        0

[4150 rows x 7 columns]

.table()

>>> (
...     data
...     .crm.discrimination(default="DEFAULT", score="GRADE_PD", alpha=0.1, by="DATE")
...     .table()
... )

          BY   OBS  DEFAULT_OBS        AR  AR_BOUND_LOWER  AR_BOUND_UPPER  ALPHA
0 2019-12-31   750          131  0.941644        0.907460        0.975829    0.1
1 2020-12-31   800          170  0.905444        0.867548        0.943339    0.1
2 2021-12-31   700          150  0.929358        0.894257        0.964458    0.1
3 2022-12-31   900          205  0.932367        0.902949        0.961785    0.1
4 2023-12-31  1000          236  0.921183        0.891655        0.950711    0.1
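
The documentation does not state how the confidence bounds are derived. One common choice is a normal approximation with the Hanley-McNeil standard error of the AUC, scaled by two because AR = 2 * AUC - 1. The sketch below follows that assumption; the function name `ar_confidence_interval` is hypothetical. Under this assumption the sketch reproduces the bounds of the 2019-12-31 row above to about four decimal places, which suggests, but does not confirm, that the package uses a similar estimator.

```python
from math import sqrt
from statistics import NormalDist


def ar_confidence_interval(ar, n_obs, n_defaults, alpha):
    """Two-sided (1 - alpha) interval for the AR.

    Assumption: Hanley-McNeil standard error of the AUC and a
    normal quantile; not confirmed as the package's method.
    """
    auc = (ar + 1.0) / 2.0
    n_pos = n_defaults
    n_neg = n_obs - n_defaults

    # Hanley-McNeil auxiliary quantities.
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var_auc = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)

    se_ar = 2.0 * sqrt(var_auc)  # AR = 2 * AUC - 1, so the SE doubles
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ar - z * se_ar, ar + z * se_ar


# Inputs taken from the 2019-12-31 row of the example table.
lower, upper = ar_confidence_interval(0.941644, 750, 131, alpha=0.1)
```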

.plot()

>>> (
...     data
...     .crm.discrimination(default="DEFAULT", score="GRADE_PD", alpha=0.1, by="DATE")
...     .plot(show=True)
... )

[Figure: discrimination plot (ROC curve)]