dataset.py

HuggingfaceDataset dataclass

Bases: Input[Dataset | DatasetDict | IterableDatasetDict | IterableDataset]

Load a dataset from the Hugging Face datasets library.

Example usage:

>>> from ordeq_huggingface import HuggingfaceDataset
>>> ds = HuggingfaceDataset(path="imdb")
>>> data = ds.load(split="train[:10%]")  # doctest: +SKIP
>>> len(data)  # doctest: +SKIP
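
The usage above suggests that the dataset is configured once with a path and that per-call options such as split are supplied to load(). The following is a minimal sketch of that pattern only, not the ordeq_huggingface implementation: SketchDataset, fake_load_dataset, and the loader field are hypothetical names introduced for illustration, and the stand-in loader replaces any real download so the sketch runs offline. In real use the loader would presumably be datasets.load_dataset, which accepts a path and a split argument, but that wiring is an assumption here.

```python
from dataclasses import dataclass
from typing import Any, Callable


def fake_load_dataset(path: str, **kwargs: Any) -> dict[str, Any]:
    """Hypothetical stand-in for a real loader; returns its arguments."""
    return {"path": path, **kwargs}


@dataclass(frozen=True)
class SketchDataset:
    """Hypothetical illustration of a path-configured, load-time-parameterised input."""

    path: str
    # Assumption: the loader is injectable so the sketch needs no network access.
    loader: Callable[..., Any] = fake_load_dataset

    def load(self, **load_options: Any) -> Any:
        # Forward the stored path plus the per-call options to the loader.
        return self.loader(self.path, **load_options)


ds = SketchDataset(path="imdb")
result = ds.load(split="train[:10%]")
```

Keeping path on the object and split on the call means one configured input can be reused for different splits without constructing a new object.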