Discrimination

This module provides methods for discrimination analytics. Its table reports the Accuracy Ratio (AR), the lower and upper bounds of the AR confidence interval, the significance level, the number of observations, and the number of observed defaults. It also plots the Receiver Operating Characteristic (ROC) curve.
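To make the AR concrete, the following is a minimal sketch that uses the textbook relation AR = 2 * AUC - 1, where AUC is the area under the ROC curve. The function name `accuracy_ratio` and the toy data are illustrative assumptions; the estimator actually used inside the module may differ.

```python
def accuracy_ratio(defaults, scores):
    """Return AR = 2 * AUC - 1 for binary defaults and numeric scores.

    AUC is estimated by comparing every defaulted observation with every
    non-defaulted one; ties count as half a win. This is an illustrative
    sketch, not the library's implementation.
    """
    bad = [s for d, s in zip(defaults, scores) if d == 1]
    good = [s for d, s in zip(defaults, scores) if d == 0]
    if not bad or not good:
        raise ValueError("need at least one default and one non-default")

    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0   # defaulter has the higher PD: correct ordering
            elif b == g:
                wins += 0.5   # tie: counts as half
    auc = wins / (len(bad) * len(good))
    return 2.0 * auc - 1.0


# Toy data (hypothetical): 1 = default, scores are PDs as decimals.
defaults = [0, 0, 1, 0, 1, 0]
scores = [0.001, 0.04, 0.1, 0.009, 0.009, 0.002]
print(accuracy_ratio(defaults, scores))  # 0.625
```

An AR of 1 means every defaulter has a higher score than every non-defaulter, while an AR near 0 means the score orders the two groups no better than chance.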

Accessor

Initialise the DataFrame with the discrimination method. Minimal working example:

df.crm.discrimination(default="DEFAULT", score="SCORE")

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| default | str | Defines the default column, typically containing binary values. | required |
| score | str | Defines the score column, typically containing Probabilities of Default (PDs) as decimals. If necessary, grades can be converted to scores (for example, PDs) via the module "general.py". | required |
| alpha | float | Defines the statistical significance level as a decimal, i.e. a significance level of 10% should be entered as 0.1. | None |
| by | str | Defines the column used to group the DataFrame, for example, "DATE". | None |
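As an illustration of how alpha relates to the reported interval, here is a minimal sketch that assumes a symmetric, two-sided normal approximation around the AR. That assumption is not confirmed by this page, and the helper `ar_confidence_interval` and its `std_error` input are hypothetical; how the module estimates the standard error is not shown here.

```python
from statistics import NormalDist


def ar_confidence_interval(ar, std_error, alpha):
    """Return (lower, upper) bounds of a two-sided (1 - alpha) interval.

    Assumes the AR estimate is approximately normal with the given
    standard error. This is an assumption for illustration only.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Two-sided interval: alpha / 2 of the probability mass in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ar - z * std_error, ar + z * std_error


# alpha=0.1 corresponds to a 90% confidence interval.
lower, upper = ar_confidence_interval(ar=0.9, std_error=0.02, alpha=0.1)
print(round(lower, 4), round(upper, 4))
```

Under this assumption, a smaller alpha gives a wider interval, because a higher confidence level requires a larger quantile.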

Returns:

| Type | Description |
|---|---|
| Discrimination | Returns an instance of the "Discrimination" class, which provides the discrimination analytics methods. |

Methods

plot(x_axis_label='1 - Specificity / Percentage of Ratings', y_axis_label='Sensitivity / Percentage of Defaults', legend_label=None, score_2=None, legend_label_2=None, legend_loc='lower right', fig_size=(7.5, 7.5), path=None, show=False)

Minimal working example:

df.crm.discrimination(default="DEFAULT", score="SCORE").plot()

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x_axis_label | str | Defines the x-axis label. | '1 - Specificity / Percentage of Ratings' |
| y_axis_label | str | Defines the y-axis label. | 'Sensitivity / Percentage of Defaults' |
| legend_label | Union[str, list] | Defines the legend label. | None |
| score_2 | str | Defines a second score column. | None |
| legend_label_2 | Union[str, list] | Defines the legend label of the second score column. | None |
| legend_loc | str | Defines the legend location. | 'lower right' |
| fig_size | tuple | Defines the figure size. | (7.5, 7.5) |
| path | str | Defines the saving path including the filename of the figure and should be entered as r"C:\<path>\<filename>.<image_format>". | None |
| show | bool | If True, the plot is shown. | False |

Returns:

| Type | Description |
|---|---|
| BytesIO | Returns the ROC curve. |
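For readers unfamiliar with the axes, the following minimal sketch shows how the points of a ROC curve can be derived from a default flag and a score: each distinct score acts as a threshold, and the curve plots sensitivity against 1 - specificity. The function `roc_points` and the toy data are illustrative assumptions, not the module's plotting code.

```python
def roc_points(defaults, scores):
    """Return (1 - specificity, sensitivity) pairs, starting at (0, 0).

    Observations are flagged as "predicted default" from the highest
    score downwards; tied scores are processed together so that each
    distinct threshold yields exactly one point.
    """
    n_bad = sum(defaults)
    n_good = len(defaults) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one default and one non-default")

    ordered = sorted(zip(scores, defaults), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Consume every observation that shares this score.
        while i < len(ordered) and ordered[i][0] == threshold:
            if ordered[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / n_good, true_pos / n_bad))
    return points


# Toy data (hypothetical): 1 = default, scores are PDs as decimals.
defaults = [0, 0, 1, 0, 1, 0]
scores = [0.001, 0.04, 0.1, 0.009, 0.009, 0.002]
for x, y in roc_points(defaults, scores):
    print(x, y)
```

The curve always ends at (1.0, 1.0); the closer it runs to the top-left corner, the better the score separates defaults from non-defaults.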

table()

Minimal working example:

df.crm.discrimination(default="DEFAULT", score="SCORE").table()

Returns:

| Type | Description |
|---|---|
| DataFrame | Returns a table containing the AR, the left (lower) bound of the AR confidence interval, the right (upper) bound of the AR confidence interval, the significance level, the number of observations, and the number of observed defaults. |
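The observation and default counts in the table can be pictured with a small pure-Python sketch. It assumes that, when a `by` column is given, the counts are taken per group; the helper `counts_by_group` is hypothetical, and the keys "OBS" and "DEFAULT_OBS" simply mirror the column names in the example output on this page.

```python
from collections import defaultdict


def counts_by_group(rows, default, by):
    """Count observations and observed defaults per value of `by`.

    `rows` is a list of dicts; `default` names the binary default column
    and `by` names the grouping column. Illustrative sketch only.
    """
    totals = defaultdict(lambda: {"OBS": 0, "DEFAULT_OBS": 0})
    for row in rows:
        group = totals[row[by]]
        group["OBS"] += 1
        group["DEFAULT_OBS"] += int(row[default])
    # Sort by group key so the result is ordered like the example table.
    return dict(sorted(totals.items()))


rows = [
    {"DATE": "2019-12-31", "DEFAULT": 0},
    {"DATE": "2019-12-31", "DEFAULT": 1},
    {"DATE": "2020-12-31", "DEFAULT": 0},
]
print(counts_by_group(rows, default="DEFAULT", by="DATE"))
# {'2019-12-31': {'OBS': 2, 'DEFAULT_OBS': 1}, '2020-12-31': {'OBS': 1, 'DEFAULT_OBS': 0}}
```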

Examples

data
>>> import credit_risk_modelling as crm
>>> data = crm.load_data.load_data()
>>> data

           DATE    ID GRADE  GRADE_PD OVERRIDE  OVERRIDE_PD  DEFAULT
0    2019-12-31    10     B    0.1000        B       0.1000        0
1    2019-12-31   100   BBB    0.0090       BB       0.0400        0
2    2019-12-31  1000   BBB    0.0090      BBB       0.0090        0
3    2019-12-31  1001   BBB    0.0090      BBB       0.0090        0
4    2019-12-31  1003   BBB    0.0090      BBB       0.0090        0
...         ...   ...   ...       ...      ...          ...      ...
4145 2023-12-31   994    AA    0.0010       AA       0.0010        0
4146 2023-12-31   995    AA    0.0010       AA       0.0010        0
4147 2023-12-31   996     A    0.0020        A       0.0020        0
4148 2023-12-31   998     B    0.1000        B       0.1000        0
4149 2023-12-31   999   AAA    0.0002      AAA       0.0002        0

[4150 rows x 7 columns]
.table()
>>> (
...     data
...     .crm.discrimination(default="DEFAULT", score="GRADE_PD", alpha=0.1, by="DATE")
...     .table()
... )

          BY   OBS  DEFAULT_OBS        AR  AR_BOUND_LOWER  AR_BOUND_UPPER  ALPHA
0 2019-12-31   750          131  0.941644        0.907460        0.975829    0.1
1 2020-12-31   800          170  0.905444        0.867548        0.943339    0.1
2 2021-12-31   700          150  0.929358        0.894257        0.964458    0.1
3 2022-12-31   900          205  0.932367        0.902949        0.961785    0.1
4 2023-12-31  1000          236  0.921183        0.891655        0.950711    0.1
.plot()
>>> (
...     data
...     .crm.discrimination(default="DEFAULT", score="GRADE_PD", alpha=0.1, by="DATE")
...     .plot(show=True)
... )

[Figure: discrimination plot (ROC curve)]