Population Stability Index

Calculate the PSI between the features in two dataframes, X1 and X2.

X1 and X2 should be bucketed (outputs of fitted bucketers).

\[ PSI = \sum((\%{ Good } - \%{ Bad }) \times \ln \frac{\%{ Good }}{\%{ Bad }}) \]
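
As a rough illustration of the formula, the sketch below evaluates it with NumPy for a single feature. It assumes that %Good and %Bad stand for the share of rows falling into each bucket in the two samples being compared. The function name `psi_from_shares` and the example shares are hypothetical and are not part of skorecard.

```python
import numpy as np


def psi_from_shares(share_1, share_2):
    """Sum (p - q) * ln(p / q) over buckets, for two vectors of bucket shares."""
    p = np.asarray(share_1, dtype=float)
    q = np.asarray(share_2, dtype=float)
    # Every share must be strictly positive, otherwise the log term is undefined.
    return float(np.sum((p - q) * np.log(p / q)))


# Made-up shares for a feature with three buckets (each vector sums to 1).
expected_shares = [0.5, 0.3, 0.2]
actual_shares = [0.4, 0.3, 0.3]

value = psi_from_shares(expected_shares, actual_shares)
print(round(value, 4))
```

Each term of the sum is non-negative, and the expression does not change when the two samples are swapped, so two identical distributions give a value of 0.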

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X1 | pd.DataFrame | bucketed features, expected | required |
| X2 | pd.DataFrame | bucketed features, actual data | required |
| epsilon | float | Amount to be added to relative counts in order to avoid division by zero in the WOE calculation. | 0.0001 |
| digits | int | number of significant decimal digits in the IV calculation | None |
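
To make the role of `epsilon` and `digits` more concrete, here is a minimal sketch that computes the index for one bucketed column with pandas. It is not skorecard's implementation: the helper `psi_one_feature`, the exact place where `epsilon` is added, and the use of `round` for `digits` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def psi_one_feature(expected, actual, epsilon=0.0001, digits=None):
    """Illustrative PSI for one bucketed column (hypothetical helper)."""
    expected = pd.Series(expected)
    actual = pd.Series(actual)
    # Use the union of bucket labels so a bucket missing from one sample counts as 0.
    buckets = sorted(set(expected.unique()) | set(actual.unique()))
    p = expected.value_counts(normalize=True).reindex(buckets, fill_value=0.0)
    q = actual.value_counts(normalize=True).reindex(buckets, fill_value=0.0)
    # Assumption: epsilon is added to the relative counts of both samples.
    p = p + epsilon
    q = q + epsilon
    value = float(np.sum((p - q) * np.log(p / q)))
    return value if digits is None else round(value, digits)


# Bucket 2 never occurs in the second sample; without epsilon the log term would be infinite.
train_bins = [0, 0, 1, 1, 2]
test_bins = [0, 0, 0, 1, 1]
result = psi_one_feature(train_bins, test_bins, digits=4)
print(result)
```

With `epsilon` in place, the empty bucket contributes a large but finite term instead of a division by zero.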

Examples:

from skorecard import datasets
from sklearn.model_selection import train_test_split
from skorecard.bucketers import DecisionTreeBucketer
from skorecard.reporting import psi

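# Load the data and hold out 25% of it as the "actual" sample
X, y = datasets.load_uci_credit_card(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,
    random_state=42
)

# Fit the bucketer on the training data, then apply the same buckets to the test data
dbt = DecisionTreeBucketer()
X_train_bins = dbt.fit_transform(X_train, y_train)
X_test_bins = dbt.transform(X_test)

# PSI between the bucketed train (expected) and test (actual) features
psi_dict = psi(X_train_bins, X_test_bins)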

Last update: 2021-11-24