csv.py

DuckDBCSV dataclass

Bases: IO[DuckDBPyRelation]

IO to load and save CSV files using DuckDB.

Example:

>>> import duckdb
>>> from ordeq import node, run
>>> from ordeq_duckdb import DuckDBCSV
>>> csv = DuckDBCSV(path="data.csv")
>>> csv.save(duckdb.values([1, "a"]))
>>> data = csv.load()
>>> data.describe()
┌─────────┬────────┬─────────┐
│  aggr   │  col0  │  col1   │
│ varchar │ double │ varchar │
├─────────┼────────┼─────────┤
│ count   │    1.0 │ 1       │
│ mean    │    1.0 │ NULL    │
│ stddev  │   NULL │ NULL    │
│ min     │    1.0 │ a       │
│ max     │    1.0 │ a       │
│ median  │    1.0 │ NULL    │
└─────────┴────────┴─────────┘
<BLANKLINE>

load(**kwargs)

Load a CSV file into a DuckDB relation.

Parameters:

Name      Type  Description                                     Default
**kwargs  Any   Additional options to pass to duckdb.read_csv.  {}

Returns:

Type              Description
DuckDBPyRelation  The DuckDB relation representing the loaded CSV data.

save(relation, **kwargs)

Save a DuckDB relation to a CSV file.

Parameters:

Name      Type              Description                                     Default
relation  DuckDBPyRelation  The relation to save.                           required
**kwargs  Any               Additional options to pass to relation.to_csv.  {}