Mixed usage with other packages
Many excellent packages offer functionality for bucketing, binning, or discretizing numerical variables and for encoding categorical variables. Chances are you'd like to combine them in your skorecard pipelines.
Some packages are compatible with pandas DataFrames, such as category_encoders:
%%capture
!pip install category_encoders
%%capture
from sklearn.pipeline import make_pipeline
from skorecard.datasets import load_uci_credit_card
from skorecard.bucketers import OrdinalCategoricalBucketer
X, y = load_uci_credit_card(return_X_y=True)
from category_encoders import TargetEncoder
pipe = make_pipeline(
    TargetEncoder(cols=["EDUCATION"]),  # category_encoders.TargetEncoder passes through other columns
    OrdinalCategoricalBucketer(variables=["MARRIAGE"])
)
pipe.fit(X, y)
pipe.transform(X).head(5)
Some packages do not return pandas DataFrames, like scikit-learn's KBinsDiscretizer and ColumnTransformer.
You can wrap such a transformer in skorecard.pipeline.KeepPandas
to use it in a pipeline:
from sklearn.preprocessing import KBinsDiscretizer
from skorecard.pipeline import KeepPandas
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer(
    [
        ("binner", KBinsDiscretizer(n_bins=3, encode='ordinal', strategy='uniform'), ['EDUCATION'])
    ],
    remainder="passthrough"
)
pipe = make_pipeline(
    KeepPandas(ct),
    OrdinalCategoricalBucketer(variables=["MARRIAGE"])
)
pipe.fit_transform(X, y).head(5)
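To see why a wrapper is needed, the sketch below shows that a ColumnTransformer returns a plain NumPy array by default, so the column names that a later step selecting columns by name would need are lost. It uses a small made-up DataFrame (X_toy, with hypothetical values) instead of the UCI credit card data. Rebuilding the DataFrame by hand, as done here, only illustrates the idea and is not necessarily how KeepPandas is implemented.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import KBinsDiscretizer

# Toy data with made-up values, standing in for the credit card dataset
X_toy = pd.DataFrame({
    "MARRIAGE": [1, 2, 1, 2, 1, 2],
    "EDUCATION": [1, 2, 3, 1, 2, 3],
})

ct = ColumnTransformer(
    [
        ("binner", KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform"), ["EDUCATION"])
    ],
    remainder="passthrough",
)

# By default the result is a NumPy array: column names and index are gone
out = ct.fit_transform(X_toy)
print(type(out))

# The transformed columns come first, followed by the passthrough columns,
# so the names can be restored manually in that order
restored = pd.DataFrame(out, columns=["EDUCATION", "MARRIAGE"], index=X_toy.index)
print(restored.head())
```

Note that the column order changes: EDUCATION moves to the front because the columns handled by the "binner" step are emitted before the passthrough columns.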
Last update: 2021-11-24