Categoricals

skorecard also has bucketers that support categorical features, such as OptimalBucketer and OrdinalCategoricalBucketer. If you have a categorical feature, you can bucket it directly:

from skorecard.bucketers import OptimalBucketer
import random
from skorecard.datasets import load_uci_credit_card

X, y = load_uci_credit_card(return_X_y=True)

# Add a categorical feature
pets = ["no pets"] * 3000 + ["cat lover"] * 1500 + ["dog lover"] * 1000 + ["rabbit"] * 498 + ["gold fish"] * 2
random.Random(42).shuffle(pets)
X["pet_ownership"] = pets

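# Bucket the categorical feature into at most 3 bins
bucketer = OptimalBucketer(
    max_n_bins=3,
    variables=["pet_ownership"],
    variables_type="categorical",
    cat_cutoff=None,
)

# Count how many rows fall into each bucket
bucketer.fit_transform(X, y)["pet_ownership"].value_counts().sort_index()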
0    1998
1    3000
2    1002
Name: pet_ownership, dtype: int64
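
The bucket sizes above can be checked against the category counts used to build the feature. The sketch below uses only the standard library and does not call skorecard. The `inferred_buckets` mapping is an assumption worked out from the arithmetic alone (it is the only grouping of the five categories that yields 1998, 3000 and 1002), not something read from the fitted bucketer.

```python
from collections import Counter

# Category counts taken from the example above
category_counts = {
    "no pets": 3000,
    "cat lover": 1500,
    "dog lover": 1000,
    "rabbit": 498,
    "gold fish": 2,
}

# Assumed grouping, inferred from the printed bucket sizes only
inferred_buckets = {
    "cat lover": 0,
    "rabbit": 0,
    "no pets": 1,
    "dog lover": 2,
    "gold fish": 2,
}

# Sum the category counts per bucket
bucket_sizes = Counter()
for category, n_rows in category_counts.items():
    bucket_sizes[inferred_buckets[category]] += n_rows

for bucket, size in sorted(bucket_sizes.items()):
    print(bucket, size)
```

Under this grouping, the two "gold fish" rows share a bucket with "dog lover" rather than forming a bucket of their own. To see the actual category-to-bucket assignment, inspect the fitted bucketer, for example with `bucketer.bucket_table("pet_ownership")` if your skorecard version provides it.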

Last update: 2023-08-08