environment_variable.py
EnvironmentVariable
dataclass
Bases: IO[str]
IO used to load and save environment variables. Use: - as input, to parameterize the node logic - as output, to set an environment variable based on node logic
Gets and sets os.environ
on load and save. See [1] for more
information.
[1] https://docs.python.org/3/library/os.html#os.environ
Example in a node:
>>> from ordeq import node
>>> from ordeq_spark import SparkHiveTable
>>> import pyspark.sql.functions as F
>>> from pyspark.sql import DataFrame
>>> @node(
... inputs=[
... SparkHiveTable(table="my.table"),
... EnvironmentVariable("KEY", default="DEFAULT")
... ],
... outputs=SparkHiveTable(table="my.output"),
... )
... def transform(df: DataFrame, value: str) -> DataFrame:
... return df.where(F.col("col") == value)
When you run transform
through the CLI as follows:
export KEY=MyValue
python {your-entrypoint} run --node transform
MyValue
will be used as value
in transform
.