Information Value
Calculate the Information Value (IV) of the features in X. X must be the output of fitted bucketers.
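\[
IV = \sum_{i} (\%\,\text{goods}_i - \%\,\text{bads}_i) \cdot WOE_i
\]
\[
WOE_i = \ln \left( \frac{\%\,\text{goods}_i}{\%\,\text{bads}_i} \right)
\]
Here the sum runs over the buckets i of a feature.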
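As a rough illustration of the two formulas, the sketch below computes the IV of a single bucketed feature with pandas. It is not skorecard's implementation: the helper name `information_value` and the convention that y == 0 marks goods and y == 1 marks bads are assumptions made for this example, and no epsilon smoothing is applied.

```python
import numpy as np
import pandas as pd


def information_value(buckets: pd.Series, y: pd.Series) -> float:
    """IV of one bucketed feature (hypothetical helper, not the skorecard API).

    Assumes y == 0 marks goods and y == 1 marks bads.
    """
    # Rows are buckets, columns are the two target classes (0 and 1)
    counts = pd.crosstab(buckets, y)
    # Share of all goods / all bads that falls into each bucket
    pct_goods = counts[0] / counts[0].sum()
    pct_bads = counts[1] / counts[1].sum()
    woe = np.log(pct_goods / pct_bads)
    return float(((pct_goods - pct_bads) * woe).sum())


# Toy data: bucket 0 holds 3 goods and 1 bad, bucket 1 holds 1 good and 3 bads
buckets = pd.Series([0, 0, 0, 0, 1, 1, 1, 1], name="bucket")
y = pd.Series([0, 0, 0, 1, 0, 1, 1, 1], name="target")
print(information_value(buckets, y))
```

Swapping which class counts as goods flips the sign of every WOE value but leaves the IV unchanged, because both factors of each product change sign.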
Example:
from skorecard import datasets
from skorecard.bucketers import DecisionTreeBucketer
from skorecard.reporting import iv

X, y = datasets.load_uci_credit_card(return_X_y=True)
dbt = DecisionTreeBucketer()
X_bins = dbt.fit_transform(X, y)
iv_dict = iv(X_bins, y)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | DataFrame | pd.DataFrame (bucketed) features | required |
y | Series | pd.Series: target values | required |
epsilon | float | Amount to be added to relative counts in order to avoid division by zero in the WOE calculation. | 0.0001 |
digits | int | Number of significant decimal digits in the IV calculation | None |
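To show why epsilon matters, here is a minimal sketch of a WOE calculation for a bucket that contains no bads. The function `woe` below is hypothetical, and adding epsilon to both relative counts is an assumption based on the parameter description, not a copy of the library's code.

```python
import numpy as np


def woe(pct_goods, pct_bads, epsilon=0.0):
    """WOE with epsilon added to both relative counts (assumed placement)."""
    pct_goods = np.asarray(pct_goods, dtype=float)
    pct_bads = np.asarray(pct_bads, dtype=float)
    return np.log((pct_goods + epsilon) / (pct_bads + epsilon))


# A bucket holding 20% of the goods and none of the bads
with np.errstate(divide="ignore"):
    unsmoothed = woe(0.2, 0.0)  # division by zero gives an infinite WOE
smoothed = woe(0.2, 0.0, epsilon=0.0001)  # finite, though still large

print(unsmoothed, smoothed)
```

A smaller epsilon keeps populated buckets closer to their unsmoothed WOE, but makes the WOE of an empty bucket more extreme.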
Returns:
Name | Type | Description |
---|---|---|
IVs | dict | Keys are feature names, values are the IV values |
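Because the result is a plain dict, ranking features by IV needs only the standard library. The feature names and values below are hypothetical placeholders shaped like the returned dict, not output from a real run.

```python
# Hypothetical IV values, shaped like the dict described above
iv_dict = {"feature_a": 0.02, "feature_b": 0.35, "feature_c": 0.11}

# Sort features from highest to lowest IV
ranked = sorted(iv_dict.items(), key=lambda item: item[1], reverse=True)

for name, value in ranked:
    print(f"{name}: {value:.2f}")
```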
Source code in skorecard/reporting/report.py