Categoricals
skorecard also has bucketers that support categorical features (such as OptimalBucketer and OrdinalCategoricalBucketer). If you have a categorical feature, you can bucket it directly:
from skorecard.bucketers import OptimalBucketer
import random
from skorecard.datasets import load_uci_credit_card
X, y = load_uci_credit_card(return_X_y=True)
# Add a categorical feature
pets = ["no pets"] * 3000 + ["cat lover"] * 1500 + ["dog lover"] * 1000 + ["rabbit"] * 498 + ["gold fish"] * 2
random.Random(42).shuffle(pets)
X["pet_ownership"] = pets
bucketer = OptimalBucketer(max_n_bins=3, variables=["pet_ownership"], variables_type="categorical", cat_cutoff=None)
bucketer.fit_transform(X, y)["pet_ownership"].value_counts().sort_index()
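To build some intuition for what bucketing a categorical feature means, here is a minimal pure-Python sketch of one common approach: number the categories by descending frequency and merge the rare ones into a single shared bucket. It is an illustration only and not skorecard's implementation; the helper name `frequency_buckets` and its `min_share` parameter are hypothetical and do not come from the library.

```python
import random
from collections import Counter

# Same toy feature as in the example above: 6000 rows, five categories
pets = ["no pets"] * 3000 + ["cat lover"] * 1500 + ["dog lover"] * 1000 + ["rabbit"] * 498 + ["gold fish"] * 2
random.Random(42).shuffle(pets)


def frequency_buckets(values, min_share=0.05):
    """Hypothetical helper: map each category to an integer bucket.

    Categories are numbered by descending frequency. Any category whose
    share of the rows is below min_share goes into one shared last bucket.
    """
    counts = Counter(values)
    total = len(values)
    mapping = {}
    next_bucket = 0
    for category, count in counts.most_common():
        if count / total >= min_share:
            mapping[category] = next_bucket
            next_bucket += 1
    # Whatever is left is rare: assign it the shared last bucket
    for category in counts:
        mapping.setdefault(category, next_bucket)
    return mapping


mapping = frequency_buckets(pets)
bucket_counts = Counter(mapping[p] for p in pets)
for bucket in sorted(bucket_counts):
    print(bucket, bucket_counts[bucket])
```

With the 5% threshold assumed here, only "gold fish" (2 of 6000 rows) is treated as rare. A supervised bucketer such as OptimalBucketer also uses the target y when deciding which categories to merge, which this sketch does not attempt.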