Population Stability Index
Calculate the PSI between the features in two dataframes, X1 and X2.
Both X1 and X2 should already be bucketed (that is, outputs of fitted bucketers).
\[
PSI = \sum((\%{ Good } - \%{ Bad }) \times \ln \frac{\%{ Good }}{\%{ Bad }})
\]
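As a rough illustration of the formula above, the following sketch computes the PSI of a single bucketed feature with pandas and NumPy. It assumes that %Good and %Bad stand for the share of rows falling in each bucket in the expected and actual samples respectively. The helper name `psi_single_feature` is hypothetical and is not part of skorecard.

```python
import numpy as np
import pandas as pd


def psi_single_feature(expected: pd.Series, actual: pd.Series, epsilon: float = 0.0001) -> float:
    """Hypothetical helper: PSI of one bucketed feature between two samples."""
    # Align both samples on the union of bucket labels.
    buckets = sorted(set(expected.unique()) | set(actual.unique()))
    # Share of rows per bucket; epsilon keeps empty buckets away from log(0).
    p = expected.value_counts(normalize=True).reindex(buckets, fill_value=0.0).to_numpy() + epsilon
    q = actual.value_counts(normalize=True).reindex(buckets, fill_value=0.0).to_numpy() + epsilon
    # Sum over buckets of (p - q) * ln(p / q); every term is non-negative.
    return float(np.sum((p - q) * np.log(p / q)))


# Toy bucket labels, made up for illustration.
expected = pd.Series([0, 0, 1, 1, 2, 2, 2, 2])
actual = pd.Series([0, 1, 1, 1, 2, 2, 2, 2])

same = psi_single_feature(expected, expected)   # identical distributions
shifted = psi_single_feature(expected, actual)  # some mass moved from bucket 0 to bucket 1
```

Identical distributions give a PSI of zero, and any difference in bucket shares makes the value positive, so larger values indicate a larger shift between the two samples.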
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X1 | DataFrame | bucketed features, expected | required |
X2 | DataFrame | bucketed features, actual data | required |
epsilon | float | Amount to be added to relative counts in order to avoid division by zero in the WOE calculation. | 0.0001 |
digits | int | number of significant decimal digits in the IV calculation | None |
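To see why an epsilon term is useful, consider a bucket that is empty in one of the two samples. The sketch below uses plain NumPy with made-up bucket shares and adds epsilon to both sets of relative counts, which is an assumption about how the smoothing is applied rather than a copy of the library's implementation.

```python
import numpy as np

# Made-up bucket shares; the third bucket is empty in the expected sample.
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.4, 0.4, 0.2])

# Without smoothing, the empty bucket produces log(0) and the sum diverges.
with np.errstate(divide="ignore", invalid="ignore"):
    raw = np.sum((p - q) * np.log(p / q))

# With a small epsilon added to every share, the result stays finite.
epsilon = 0.0001
p_s, q_s = p + epsilon, q + epsilon
smoothed = np.sum((p_s - q_s) * np.log(p_s / q_s))
```

A smaller epsilon makes the empty-bucket term larger, so the default is a trade-off between numerical safety and how heavily empty buckets are penalised.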
Examples:
from skorecard import datasets
from sklearn.model_selection import train_test_split
from skorecard.bucketers import DecisionTreeBucketer
from skorecard.reporting import psi

X, y = datasets.load_uci_credit_card(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,
    random_state=42,
)

# Fit the bucketer on the training data only, then apply it to both sets
dbt = DecisionTreeBucketer()
X_train_bins = dbt.fit_transform(X_train, y_train)
X_test_bins = dbt.transform(X_test)

# Compare the bucketed train (expected) and test (actual) distributions
psi_dict = psi(X_train_bins, X_test_bins)
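One possible way to inspect the result is sketched below, assuming `psi_dict` maps feature names to PSI values. The dictionary contents and feature names are placeholders for illustration, not real output of `psi`, and the 0.1 and 0.25 cut-offs are a common rule of thumb rather than anything defined by skorecard.

```python
import numpy as np
import pandas as pd

# Placeholder values for illustration only; not real output of psi().
psi_dict = {"feature_a": 0.02, "feature_b": 0.31, "feature_c": 0.12}

# Rank features from most to least shifted.
psi_series = pd.Series(psi_dict, name="psi").sort_values(ascending=False)

# Rule-of-thumb cut-offs (an assumption, not part of skorecard).
status = pd.cut(
    psi_series,
    bins=[-np.inf, 0.1, 0.25, np.inf],
    labels=["stable", "moderate shift", "large shift"],
)
report = pd.DataFrame({"psi": psi_series, "status": status})
```

Sorting in descending order puts the features that drifted most at the top of the report, which is usually where a review would start.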
Source code in skorecard/reporting/report.py