Population Stability Index
Calculate the PSI between the features in two dataframes, X1 and X2.
X1 and X2 should be bucketed (outputs of fitted bucketers).
\[
PSI = \sum((\%{ Good } - \%{ Bad }) \times \ln \frac{\%{ Good }}{\%{ Bad }})
\]
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X1 | pd.DataFrame | bucketed features, expected | required |
X2 | pd.DataFrame | bucketed features, actual data | required |
epsilon | float | Amount to be added to relative counts in order to avoid division by zero in the WOE calculation. | 0.0001 |
digits | int | number of significant decimal digits in the IV calculation | None |
Examples:
from skorecard import datasets
from sklearn.model_selection import train_test_split
from skorecard.bucketers import DecisionTreeBucketer
from skorecard.reporting import psi
X, y = datasets.load_uci_credit_card(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,
    random_state=42
)
dbt = DecisionTreeBucketer()
X_train_bins = dbt.fit_transform(X_train, y_train)
X_test_bins = dbt.transform(X_test)
psi_dict = psi(X_train_bins, X_test_bins)
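For intuition, the formula above can be sketched for a single bucketed feature in plain Python. This is only an illustration, not skorecard's implementation: the function name `psi_sketch` and the toy bucket lists are hypothetical. It also assumes that the two percentages in the formula are read as the share of rows per bucket in X1 (expected) and X2 (actual), and that epsilon is added directly to each share. The library may apply epsilon differently.

```python
import math
from collections import Counter


def psi_sketch(expected, actual, epsilon=0.0001):
    """Illustrative PSI for one bucketed feature (hypothetical helper)."""
    exp_counts = Counter(expected)
    act_counts = Counter(actual)
    # Consider every bucket that appears in either dataset.
    buckets = sorted(set(exp_counts) | set(act_counts))

    total = 0.0
    for b in buckets:
        # Share of rows in bucket b, shifted by epsilon (an assumption here)
        # so that an empty bucket cannot cause log(0) or division by zero.
        p_exp = exp_counts[b] / len(expected) + epsilon
        p_act = act_counts[b] / len(actual) + epsilon
        total += (p_exp - p_act) * math.log(p_exp / p_act)
    return total


# Toy bucket labels standing in for one column of bucketed data.
train_bins = [0] * 50 + [1] * 30 + [2] * 20
test_bins = [0] * 40 + [1] * 30 + [2] * 30

print(psi_sketch(train_bins, train_bins))  # identical distributions
print(psi_sketch(train_bins, test_bins))   # shifted distribution
```

Identical distributions give a PSI of zero, and the value grows as the bucket shares of the two datasets drift apart. Each term of the sum is non-negative, so the result does not depend on which dataset is treated as expected.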
Last update: 2021-11-24