environment_variable.py

EnvironmentVariable dataclass

Bases: IO[str]

IO used to load and save environment variables. Use:

- as input, to parameterize the node logic
- as output, to set an environment variable based on node logic

Reads from os.environ on load and writes to os.environ on save. See the Python docs for os.environ for more information.
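The load and save behavior can be pictured with plain `os.environ` calls. The sketch below is not the class's implementation: `load_env` and `save_env` are hypothetical helpers, and raising `KeyError` when the variable is unset and no default is given is an assumption made for illustration.

```python
import os
from typing import Optional


def load_env(key: str, default: Optional[str] = None) -> str:
    """Hypothetical stand-in for load: read the variable from os.environ."""
    value = os.environ.get(key, default)
    if value is None:
        # Assumed behavior for the sketch: fail loudly when nothing is available.
        raise KeyError(f"Environment variable {key!r} is not set and no default was given")
    return value


def save_env(key: str, value: str) -> None:
    """Hypothetical stand-in for save: write the variable to os.environ."""
    os.environ[key] = value


# Start from a clean state so the default is actually exercised.
os.environ.pop("KEY", None)
fallback = load_env("KEY", default="DEFAULT")  # variable unset, default applies

save_env("KEY", "MyValue")
loaded = load_env("KEY", default="DEFAULT")  # variable set, default ignored
```

Values written to `os.environ` are visible to the current process and to any child processes it starts afterwards, but not to the parent shell.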

Example in a node:

main.py
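from ordeq import run, node
from ordeq_args import EnvironmentVariable  # assumed import path; adjust if the class lives elsewhere
from ordeq_spark import SparkHiveTable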
import pyspark.sql.functions as F
from pyspark.sql import DataFrame


@node(
    inputs=[
        SparkHiveTable(table="my.table"),
        EnvironmentVariable("KEY", default="DEFAULT"),
    ],
    outputs=SparkHiveTable(table="my.output"),
)
def transform(df: DataFrame, value: str) -> DataFrame:
    return df.where(F.col("col") == value)


if __name__ == "__main__":
    run(transform)

When you run transform through the CLI as follows:

export KEY=MyValue
python main.py transform

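MyValue is loaded from the environment and passed to transform as the value argument. If KEY is not set, the default given in the node definition ("DEFAULT") is used instead.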