Methods¶
The bucketers in skorecard come with a handy set of methods that let you peek under the hood of a fitted bucketer.
from skorecard.datasets import load_uci_credit_card
from skorecard.bucketers import DecisionTreeBucketer
X, y = load_uci_credit_card(return_X_y=True)
specials = {"LIMIT_BAL": {"=50000": [50000], "in [20000,30000]": [20000, 30000]}}
dt_bucketer = DecisionTreeBucketer(variables=["LIMIT_BAL"], specials=specials)
dt_bucketer.fit(X, y)
dt_bucketer.fit_transform(X, y).head()
.summary()¶
This returns a simple table listing each column and the number of (pre)buckets the bucketer generated for it. The information value and dtype of each column are also given.
dt_bucketer.summary()
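The information value in the summary measures how well the bucketed feature separates events from non-events. As a rough sketch of the standard textbook formula (not necessarily skorecard's exact implementation, which may treat empty buckets differently), it can be computed from per-bucket counts as below. The function name `information_value` and the example counts are hypothetical and made up for illustration.

```python
import math


def information_value(bucket_counts):
    """Compute IV from a list of (non_events, events) counts, one pair per bucket.

    Assumes every bucket contains at least one event and one non-event,
    so that the logarithm below is always defined.
    """
    total_non_events = sum(n for n, _ in bucket_counts)
    total_events = sum(e for _, e in bucket_counts)

    iv = 0.0
    for non_events, events in bucket_counts:
        # Share of all non-events / events that fall in this bucket
        pct_non_events = non_events / total_non_events
        pct_events = events / total_events
        # Weight of evidence of the bucket
        woe = math.log(pct_non_events / pct_events)
        iv += (pct_non_events - pct_events) * woe
    return iv


# Made-up counts for three buckets: (non-events, events)
example_counts = [(400, 100), (300, 50), (300, 50)]
iv = information_value(example_counts)
print(round(iv, 4))
```

A feature whose buckets all have the same mix of events and non-events gets an information value of zero. The value grows as the buckets differ more in their event rates.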
.bucket_table()¶
To look at the buckets at a more granular level, the bucket_table() method outputs a table containing, among other things, the counts in each bucket, the percentages, and the event rate.
dt_bucketer.bucket_table("LIMIT_BAL")
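To make the columns of such a table concrete, here is a minimal, library-free sketch of how per-bucket counts, percentages and event rates can be derived from bucket assignments and a binary target. The helper `simple_bucket_table`, its column names and the data are assumptions for illustration only; the table skorecard returns has its own column names and additional columns.

```python
from collections import defaultdict


def simple_bucket_table(bucket_ids, targets):
    """Return one dict per bucket with count, count %, events and event rate."""
    counts = defaultdict(int)
    events = defaultdict(int)
    for bucket, target in zip(bucket_ids, targets):
        counts[bucket] += 1
        events[bucket] += target  # target is 1 for an event, 0 otherwise

    total = len(bucket_ids)
    rows = []
    for bucket in sorted(counts):
        rows.append(
            {
                "bucket": bucket,
                "count": counts[bucket],
                "count_pct": 100 * counts[bucket] / total,
                "events": events[bucket],
                "event_rate": events[bucket] / counts[bucket],
            }
        )
    return rows


# Made-up bucket assignments and binary targets (1 = event)
bucket_ids = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
targets = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
table = simple_bucket_table(bucket_ids, targets)
for row in table:
    print(row)
```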
.plot_bucket()¶
We have already seen that the bucket table above can be plotted to visualise the buckets more clearly.
dt_bucketer.plot_bucket(
    "LIMIT_BAL", format="png", scale=2, width=1050, height=525
)  # remove the format argument for an interactive plotly plot
.save_yml()¶
We can save the generated buckets to a YAML file. This file can later be used to create a bucketer, as shown in the create_bucketer_from_file tutorial.
dt_bucketer.save_yml("my_output.yml")
Bucket mapping¶
If you're interested in digging into the internals of the buckets, you can access the fitted attribute features_bucket_mapping_. For example:
```python
bucketer.features_bucket_mapping_.get('pet_ownership').labels
# {0: 'cat lover, rabbit',
# 1: 'no pets',
# 2: 'dog lover',
# 3: 'gold fish',
# 4: 'other',
# 5: 'Missing'}
```
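One use for such a labels dictionary is turning the integer bucket indices produced by a transform back into readable names. The sketch below assumes a plain dict shaped like the output above; the `bucket_indices` list is made up for illustration and does not come from a real transform.

```python
# Labels dictionary copied from the example output above
labels = {
    0: "cat lover, rabbit",
    1: "no pets",
    2: "dog lover",
    3: "gold fish",
    4: "other",
    5: "Missing",
}

# Hypothetical bucket indices, as a transformed column might contain them
bucket_indices = [1, 0, 5, 2]

# Look up the human-readable label for each bucket index
readable = [labels[i] for i in bucket_indices]
print(readable)
```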