Mixed usage with other packages
Many excellent packages offer functionality for bucketing, binning, or discretizing numerical variables and for encoding categorical variables. Chances are you'd like to combine them in your skorecard pipelines.
Some packages are compatible with pandas DataFrames, for example category_encoders:
%%capture
!pip install category_encoders
%%capture
from sklearn.pipeline import make_pipeline
from skorecard.datasets import load_uci_credit_card
from skorecard.bucketers import OrdinalCategoricalBucketer
X, y = load_uci_credit_card(return_X_y=True)
from category_encoders import TargetEncoder
pipe = make_pipeline(
    TargetEncoder(cols=["EDUCATION"]),  # category_encoders.TargetEncoder passes through other columns
    OrdinalCategoricalBucketer(variables=["MARRIAGE"]),
)
pipe.fit(X, y)
pipe.transform(X).head(5)
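To make "passes through other columns" concrete, here is a minimal sketch of a pandas-compatible transformer. `SimpleMeanEncoder` is a hypothetical class written for illustration; it replaces each selected column by the plain mean of the target per category and is not the algorithm used by category_encoders.TargetEncoder (which, among other things, applies smoothing). The toy data is made up.

```python
import pandas as pd


class SimpleMeanEncoder:
    """Hypothetical, simplified mean-target encoder (no smoothing)."""

    def __init__(self, cols):
        self.cols = cols

    def fit(self, X, y):
        # Align y with X so that grouping by a column of X works.
        y = pd.Series(list(y), index=X.index)
        self.means_ = {col: y.groupby(X[col]).mean() for col in self.cols}
        self.prior_ = y.mean()  # fallback for categories not seen during fit
        return self

    def transform(self, X):
        X = X.copy()  # leave the caller's DataFrame untouched
        for col in self.cols:
            X[col] = X[col].map(self.means_[col]).fillna(self.prior_)
        return X  # still a DataFrame; columns not in `cols` are unchanged


# Made-up toy data, not the UCI credit card dataset.
X_toy = pd.DataFrame({"EDUCATION": [1, 1, 2, 2, 3], "MARRIAGE": [1, 2, 1, 2, 1]})
y_toy = [0, 1, 1, 1, 0]

encoded = SimpleMeanEncoder(cols=["EDUCATION"]).fit(X_toy, y_toy).transform(X_toy)
print(encoded)
```

Because the output is still a DataFrame with the original column names, a later step that selects columns by name (such as the `variables=["MARRIAGE"]` argument above) can still find them.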
Some packages do not return pandas DataFrames, such as scikit-learn's KBinsDiscretizer, which returns a NumPy array by default.
You can wrap such a transformer in skorecard.pipeline.KeepPandas to use it in a pipeline:
from sklearn.preprocessing import KBinsDiscretizer
from skorecard.pipeline import KeepPandas
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer(
    [("binner", KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform"), ["EDUCATION"])],
    remainder="passthrough",
)
pipe = make_pipeline(KeepPandas(ct), OrdinalCategoricalBucketer(variables=["MARRIAGE"]))
pipe.fit_transform(X, y).head(5)
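The idea behind such a wrapper can be sketched as follows. `DataFrameWrapper` is a hypothetical class for illustration only; skorecard's actual KeepPandas implementation may differ. The sketch assumes the wrapped transformer keeps the number and order of columns, and it uses made-up toy data.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import KBinsDiscretizer


class DataFrameWrapper(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper, for illustration only (not skorecard's KeepPandas)."""

    def __init__(self, transformer):
        self.transformer = transformer

    def fit(self, X, y=None):
        # Remember the column names seen during fit.
        self.columns_ = list(X.columns)
        self.transformer.fit(X, y)
        return self

    def transform(self, X):
        # Assumes the wrapped transformer keeps the number and order of columns.
        values = self.transformer.transform(X)
        return pd.DataFrame(values, columns=self.columns_, index=X.index)


# Made-up toy data, not the UCI credit card dataset.
X_toy = pd.DataFrame(
    {"EDUCATION": [1, 2, 3, 1, 2, 3], "LIMIT_BAL": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]},
    index=[10, 11, 12, 13, 14, 15],
)

# Without a wrapper, the result is a NumPy array and the column names are lost.
raw = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform").fit_transform(X_toy)
print(type(raw))

# With the wrapper, column names and index are restored.
wrapped = DataFrameWrapper(
    KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform")
).fit_transform(X_toy)
print(wrapped.head())
```

Restoring the column names matters because downstream skorecard bucketers select their columns by name.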