csv.py
DuckDBCSV
dataclass
Bases: IO[DuckDBPyRelation]
IO to load and save CSV files using DuckDB.
Example:
>>> import duckdb
>>> from ordeq import node, run
>>> from ordeq_duckdb import DuckDBCSV
>>> csv = DuckDBCSV(path="data.csv")
>>> csv.save(duckdb.values([1, "a"]))
>>> data = csv.load()
>>> data.describe()
┌─────────┬────────┬─────────┐
│  aggr   │  col0  │  col1   │
│ varchar │ double │ varchar │
├─────────┼────────┼─────────┤
│ count   │    1.0 │ 1       │
│ mean    │    1.0 │ NULL    │
│ stddev  │   NULL │ NULL    │
│ min     │    1.0 │ a       │
│ max     │    1.0 │ a       │
│ median  │    1.0 │ NULL    │
└─────────┴────────┴─────────┘
<BLANKLINE>
load(**kwargs)
Load a CSV file into a DuckDB relation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | Any | Additional options to pass to duckdb.read_csv. | {} |
Returns:
Type | Description |
---|---|
DuckDBPyRelation | The DuckDB relation representing the loaded CSV data. |
save(relation, **kwargs)
Save a DuckDB relation to a CSV file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
relation | DuckDBPyRelation | The relation to save. | required |
**kwargs | Any | Additional options to pass to the underlying DuckDB CSV write call. | {} |