environment_variable.py

EnvironmentVariable dataclass

Bases: IO[str]

IO used to load and save environment variables. Use:

- as input, to parameterize the node logic
- as output, to set an environment variable based on node logic

Reads from os.environ on load and writes to os.environ on save. See [1] for more information.

[1] https://docs.python.org/3/library/os.html#os.environ
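The load and save behavior described above can be pictured with plain os.environ calls. The following is a minimal sketch of the idea, not the ordeq implementation: the helpers load_env and save_env are hypothetical names, and raising KeyError when no default is given is an assumption.

```python
import os
from typing import Optional


def load_env(key: str, default: Optional[str] = None) -> str:
    """Hypothetical helper: read a variable from os.environ."""
    value = os.environ.get(key, default)
    if value is None:
        # Assumption: a missing variable without a default is an error.
        raise KeyError(f"environment variable {key!r} is not set")
    return value


def save_env(key: str, value: str) -> None:
    """Hypothetical helper: write a variable to os.environ."""
    os.environ[key] = value


# Save, then load the same key back.
save_env("KEY", "MyValue")
loaded = load_env("KEY")

# Remove the key so the default is used on the next load.
os.environ.pop("KEY")
fallback = load_env("KEY", default="DEFAULT")
```

Changes made through os.environ are visible to the current process and to child processes it starts afterwards, but not to the parent shell.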

Example in a node:

>>> from ordeq import node
>>> from ordeq_spark import SparkHiveTable
>>> import pyspark.sql.functions as F
>>> from pyspark.sql import DataFrame

>>> @node(
...    inputs=[
...         SparkHiveTable(table="my.table"),
...         EnvironmentVariable("KEY", default="DEFAULT")
...    ],
...    outputs=SparkHiveTable(table="my.output"),
... )
... def transform(df: DataFrame, value: str) -> DataFrame:
...     return df.where(F.col("col") == value)

When you run transform through the CLI as follows:

export KEY=MyValue
python {your-entrypoint} run --node transform

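MyValue will be passed as the value argument of transform. If KEY is not set, the default given in the example ("DEFAULT") is expected to be used instead.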
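The export step works because a child process inherits the environment of the shell that starts it. The sketch below illustrates this with the standard library only. It does not call ordeq: the child is a one-line Python script standing in for the entrypoint, and run_child is a hypothetical helper.

```python
import os
import subprocess
import sys

# Stand-in for the entrypoint: print KEY, or "DEFAULT" if it is unset.
child_code = "import os; print(os.environ.get('KEY', 'DEFAULT'))"


def run_child(env: dict) -> str:
    """Hypothetical helper: run the stand-in script with a given environment."""
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Start from the current environment without KEY, to keep the demo isolated.
base_env = {k: v for k, v in os.environ.items() if k != "KEY"}

with_key = run_child({**base_env, "KEY": "MyValue"})  # like `export KEY=MyValue`
without_key = run_child(base_env)                     # KEY not exported
```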