ordeq_common

`BytesBuffer` `dataclass` ¶

Bases: IO[bytes]

IO that uses an in-memory bytes buffer to load and save data. Useful for buffering data across nodes without writing to disk.

Example:

>>> from ordeq_common import BytesBuffer
>>> buffer = BytesBuffer()
>>> buffer.load()
b''

The buffer is initially empty, unless provided with initial data:

>>> buffer = BytesBuffer(b"Initial data")
>>> buffer.load()
b'Initial data'

Saving to the buffer appends data to the existing content:

>>> buffer.save(b"New data")
>>> buffer.load()
b'Initial dataNew data'

Example in a node:

>>> from ordeq_common import BytesBuffer
>>> from ordeq import node, run, Input
>>> result = BytesBuffer()
>>> @node(
...     inputs=[BytesBuffer(b"Hello"), Input[bytes](b"you")],
...     outputs=result
... )
... def greet(greeting: bytes, name: bytes) -> bytes:
...     return greeting + b" to " + name + b"!"
>>> run(greet)
>>> result.load()
b'Hello to you!'
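
The append-on-save behaviour described above can be sketched without ordeq, using only the standard library. `InMemoryBytes` is a hypothetical stand-in written for illustration; it is not the actual `BytesBuffer` implementation, which may differ internally.

```python
import io


class InMemoryBytes:
    """Hypothetical stand-in mimicking append-on-save semantics."""

    def __init__(self, initial: bytes = b"") -> None:
        # Write the initial data instead of passing it to BytesIO(...):
        # BytesIO(initial) would leave the position at 0 and the next
        # write would overwrite the start of the content.
        self._buffer = io.BytesIO()
        self._buffer.write(initial)

    def load(self) -> bytes:
        # getvalue() returns everything, regardless of stream position.
        return self._buffer.getvalue()

    def save(self, data: bytes) -> None:
        # The position stays at the end, so each write appends.
        self._buffer.write(data)


buffer = InMemoryBytes(b"Initial data")
buffer.save(b"New data")
print(buffer.load())  # b'Initial dataNew data'
```

Because nothing is written to disk, such a buffer only lives as long as the Python process that holds it.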

`Dataclass` `dataclass` ¶

Bases: Input['DataclassInstance']

IO that parses data as a Python dataclass on load.

Example:

>>> from ordeq_common import Dataclass
>>> from ordeq_files import JSON
>>> from pathlib import Path
>>> from dataclasses import dataclass
>>> ValidJSON = JSON(path=Path("to/valid.json"))
>>> ValidJSON.load()  # doctest: +SKIP
{"name": "banana", "colour": "yellow"}

>>> @dataclass
... class Fruit:
...     name: str
...     colour: str
>>> Dataclass(ValidJSON, Fruit).load()  # doctest: +SKIP
Fruit(name="banana", colour="yellow")

>>> InvalidJSON = JSON(path=Path("to/invalid.json"))
>>> InvalidJSON.load()  # doctest: +SKIP
{"name": "banana", "weight_gr": "100"}
>>> Dataclass(InvalidJSON, Fruit).load()  # doctest: +SKIP
TypeError: Fruit.__init__() got an unexpected keyword argument 'weight_gr'

For nested models or other more sophisticated parsing requirements, consider using ordeq-pydantic instead.
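
The `TypeError` in the example is what plain keyword unpacking into a dataclass produces. The following is a minimal sketch of that mechanism, assuming the loaded mapping is passed to the dataclass constructor as keyword arguments; `parse` is a hypothetical helper, not part of ordeq.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Fruit:
    name: str
    colour: str


def parse(data: dict[str, Any], cls: type) -> Any:
    # Hypothetical helper: unpack the mapping as keyword arguments.
    return cls(**data)


fruit = parse({"name": "banana", "colour": "yellow"}, Fruit)
print(fruit)

try:
    parse({"name": "banana", "weight_gr": "100"}, Fruit)
except TypeError as error:
    # Unknown keys are rejected by the generated __init__.
    rejected = str(error)
    print(rejected)
```

Note that this approach does no type validation or nested parsing: a nested mapping stays a plain `dict`.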

`Literal` `dataclass` ¶

Bases: Input[T]

IO that returns a pre-defined value on load. Mostly useful for testing purposes.

Example:

>>> from ordeq_common import Literal
>>> value = Literal("someValue")
>>> value.load()
'someValue'
>>> print(value)
Literal('someValue')

`LoggerHook` ¶

Bases: InputHook, OutputHook, NodeHook

Hook that prints the calls to the methods. Typically only used for test purposes.

`Print` `dataclass` ¶

Bases: Output[Any]

Output that prints data on save. Mostly useful for debugging purposes. Unlike other utilities such as StringBuffer and Pass, Print shows the output of the node directly on the console.

Example:

>>> from ordeq_common import Print
>>> from ordeq import node, run, Input
>>> @node(
...     inputs=Input[str]("hello, world!"),
...     outputs=Print()
... )
... def print_message(message: str) -> str:
...     return message.capitalize()

>>> run(print_message)
Hello, world!

>>> import sys
>>> @node(
...     inputs=Input[str]("error message"),
...     outputs=Print().with_save_options(file=sys.stderr)
... )
... def log_error(message: str) -> str:
...     return f"Error: {message}"

>>> run(log_error)  # prints to stderr
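
The `file=sys.stderr` save option mirrors the `file` parameter of the built-in `print`. As a rough sketch, assuming the save options are forwarded to `print` as keyword arguments, the behaviour can be imitated as follows; `PrintLike` is a hypothetical stand-in, not the ordeq class.

```python
import io
import sys
from typing import Any


class PrintLike:
    """Hypothetical stand-in that forwards options to print()."""

    def __init__(self, **print_kwargs: Any) -> None:
        self._print_kwargs = print_kwargs

    def save(self, data: Any) -> None:
        # Keyword arguments such as file= or end= go straight to print().
        print(data, **self._print_kwargs)


# Capture the output in memory, which is convenient in tests.
captured = io.StringIO()
PrintLike(file=captured).save("Error: error message")

# Or send it to stderr, as in the node example.
PrintLike(file=sys.stderr).save("Error: error message")
```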

`SpyHook` ¶

Bases: InputHook, OutputHook, NodeHook

Hook that stores the arguments it is called with in a list. Typically only used for test purposes.
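
To illustrate the idea of a spy, here is a minimal sketch of a hook that records its calls. The class name, the method names `before_load` and `after_load`, and the `called_with` attribute are all hypothetical and chosen for illustration; they are not taken from the ordeq hook interfaces.

```python
from typing import Any


class RecordingHook:
    """Hypothetical spy; method names are illustrative only."""

    def __init__(self) -> None:
        self.called_with: list[tuple[Any, ...]] = []

    def before_load(self, io: Any) -> None:
        # Record which method was called and with which arguments.
        self.called_with.append(("before_load", io))

    def after_load(self, io: Any, data: Any) -> None:
        self.called_with.append(("after_load", io, data))


spy = RecordingHook()
spy.before_load("some-io")
spy.after_load("some-io", "payload")
print(spy.called_with)
```

A test can then assert on the recorded list to check that hooks fired in the expected order.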

`StringBuffer` `dataclass` ¶

Bases: IO[str]

IO that uses an in-memory string buffer to load and save data. Useful for buffering data across nodes without writing to disk.

Example:

>>> from ordeq_common import StringBuffer
>>> buffer = StringBuffer()
>>> buffer.load()
''

The buffer is initially empty, unless provided with initial data:

>>> buffer = StringBuffer("Initial data")
>>> buffer.load()
'Initial data'

Saving to the buffer appends data to the existing content:

>>> buffer.save("New data")
>>> buffer.load()
'Initial dataNew data'

Example in a node:

>>> from ordeq_common import StringBuffer
>>> from ordeq import node, run, Input
>>> result = StringBuffer()
>>> @node(
...     inputs=[StringBuffer("Hello"), Input[str]("you")],
...     outputs=result
... )
... def greet(greeting: str, name: str) -> str:
...     return f"{greeting} to {name}!"
>>> run(greet)
>>> result.load()
'Hello to you!'

`Iterate(*ios)` ¶

IO for loading and saving iteratively. This can be useful when processing multiple IOs with the same node while keeping only one of them in memory at a time.

Examples:

The load function returns a generator:

>>> from pathlib import Path
>>> from ordeq_files import Text, JSON
>>> from ordeq_common import Iterate
>>> paths = [Path("hello.txt"), Path("world.txt")]
>>> text_ios = Iterate(*[Text(path=path) for path in paths])
>>> text_ios.load()  # doctest: +SKIP
<generator object Iterate._load at 0x104946f60>

Consuming the generator yields the contents of the files in this case:

>>> list(text_ios.load())  # doctest: +SKIP
['hello', 'world']

By iterating over the contents, each file will be loaded and saved without the need to keep multiple files in memory at the same time:

>>> for idx, content in enumerate(text_ios.load()):   # doctest: +SKIP
...    JSON(
...        path=paths[idx].with_suffix(".json")
...    ).save({"content": content})

We can achieve the same by passing a generator to the Iterate.save method:

>>> json_dataset = Iterate(
...     *[
...         JSON(path=path.with_suffix(".json"))
...         for path in paths
...     ]
... )
>>> json_dataset.save(
...    ({"content": content} for content in text_ios.load())
... )   # doctest: +SKIP
>>> from collections.abc import Iterable
>>> def generate_json_contents(
...     contents: Iterable[str]
... ) -> Iterable[dict[str, str]]:
...     for content in contents:
...         yield {"content": content}
>>> json_dataset.save(generate_json_contents(text_ios.load()))   # doctest: +SKIP
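
The lazy load and paired save described above can be sketched with plain generators. This is an illustration under stated assumptions, not the ordeq implementation: `MemoryIO` and `IterateLike` are hypothetical classes, and in-memory values stand in for files.

```python
from collections.abc import Iterable, Iterator
from typing import Any


class MemoryIO:
    """Hypothetical in-memory IO used here instead of files."""

    def __init__(self, data: Any = None) -> None:
        self.data = data

    def load(self) -> Any:
        return self.data

    def save(self, data: Any) -> None:
        self.data = data


class IterateLike:
    """Hypothetical stand-in for the iterate-over-IOs idea."""

    def __init__(self, *ios: MemoryIO) -> None:
        self.ios = ios

    def load(self) -> Iterator[Any]:
        # Generator: an IO is loaded only when the consumer asks
        # for the next item, so one item is held at a time.
        for io in self.ios:
            yield io.load()

    def save(self, items: Iterable[Any]) -> None:
        # Pair each IO with the next item and save it immediately.
        for io, item in zip(self.ios, items):
            io.save(item)


texts = IterateLike(MemoryIO("hello"), MemoryIO("world"))
outputs = IterateLike(MemoryIO(), MemoryIO())

# The generator expression is consumed item by item inside save().
outputs.save({"content": content} for content in texts.load())
results = [io.data for io in outputs.ios]
print(results)  # [{'content': 'hello'}, {'content': 'world'}]
```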

Returns:

| Type | Description |
| --- | --- |
| `_Iterate[T]` | _Iterate |

`Match(io=None)` ¶

Match(io: Input[Tkey]) -> MatchOnLoad[Tval, Tkey]

Match() -> MatchOnSave[Tval, Tkey]

Utility IO that allows dynamic switching between IOs, like the match-case statement in Python.

Example:

>>> from ordeq import Input
>>> from ordeq_common import Match
>>> from ordeq_args import EnvironmentVariable
>>> import os
>>> Country = (
...     Match(EnvironmentVariable("COUNTRY"))
...     .Case("NL", Input[str]("Netherlands"))
...     .Case("BE", Input[str]("Belgium"))
...     .Default(Input[str]("Unknown"))
... )
>>> os.environ["COUNTRY"] = "NL"
>>> Country.load()
'Netherlands'

If a default is provided, it will be used when no cases match:

>>> os.environ["COUNTRY"] = "DE"
>>> Country.load()
'Unknown'

Otherwise, it raises an error when none of the provided cases are matched:

>>> Match(EnvironmentVariable("COUNTRY")).load()  # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
ordeq.IOException: Failed to load
Unsupported case 'DE'

Match on save works as follows:

>>> SmallOrLarge = (
...     Match()
...     .Case("S", EnvironmentVariable("SMALL"))
...     .Case("L", EnvironmentVariable("LARGE"))
...     .Default(EnvironmentVariable("UNKNOWN"))
... )
>>> SmallOrLarge.save(("S", "Andorra"))
>>> SmallOrLarge.save(("L", "Russia"))
>>> SmallOrLarge.save(("XXL", "Mars"))
>>> os.environ["SMALL"]
'Andorra'
>>> os.environ.get("LARGE")
'Russia'
>>> os.environ.get("UNKNOWN")
'Mars'

Example in a node:

>>> from ordeq import node
>>> from ordeq_files import JSON
>>> from ordeq_args import CommandLineArg
>>> from pathlib import Path
>>> TestOrTrain = (
...     Match(CommandLineArg("--split"))
...     .Case("test", JSON(path=Path("to/test.json")))
...     .Case("train", JSON(path=Path("to/train.json")))
... )
>>> @node(
...     inputs=TestOrTrain,
... )
... def evaluate(data: dict) -> dict:
...     ...
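
The dispatch logic behind the load and save variants can be sketched with a dictionary of cases and an optional default. Everything below is a hypothetical illustration using plain callables instead of IOs; the class names, the lowercase `case` and `default` methods, and the `LookupError` are assumptions and do not reflect the ordeq internals.

```python
from functools import partial
from typing import Any, Callable, Optional


class _Cases:
    """Hypothetical registry of cases plus an optional default."""

    def __init__(self) -> None:
        self._cases: dict[Any, Callable] = {}
        self._default: Optional[Callable] = None

    def case(self, key: Any, handler: Callable):
        self._cases[key] = handler
        return self  # returning self allows chaining

    def default(self, handler: Callable):
        self._default = handler
        return self

    def _pick(self, key: Any) -> Callable:
        handler = self._cases.get(key, self._default)
        if handler is None:
            raise LookupError(f"Unsupported case {key!r}")
        return handler


class MatchOnLoadSketch(_Cases):
    def __init__(self, load_key: Callable[[], Any]) -> None:
        super().__init__()
        self._load_key = load_key

    def load(self) -> Any:
        # Load the key first, then delegate to the matching loader.
        return self._pick(self._load_key())()


class MatchOnSaveSketch(_Cases):
    def save(self, keyed_value: tuple) -> None:
        # On save, the key travels with the data as a (key, value) pair.
        key, value = keyed_value
        self._pick(key)(value)


settings = {"COUNTRY": "NL"}
country = (
    MatchOnLoadSketch(lambda: settings["COUNTRY"])
    .case("NL", lambda: "Netherlands")
    .case("BE", lambda: "Belgium")
    .default(lambda: "Unknown")
)
matched = country.load()
settings["COUNTRY"] = "DE"
fallback = country.load()

store: dict[str, str] = {}
size = (
    MatchOnSaveSketch()
    .case("S", partial(store.__setitem__, "SMALL"))
    .default(partial(store.__setitem__, "UNKNOWN"))
)
size.save(("S", "Andorra"))
size.save(("XXL", "Mars"))
print(matched, fallback, store)
```

The key is resolved at load or save time rather than at construction, which is what makes the switch dynamic.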

Returns:

| Type | Description |
| --- | --- |
| `MatchOnLoad \| MatchOnSave` | MatchOnLoad or MatchOnSave |
