session.py
SparkSession
dataclass
Bases: Input[SparkSession]
Input representing the active Spark session. Useful for accessing the active Spark session in nodes.
Example:
>>> from ordeq_spark.io.session import SparkSession
>>> spark_session = SparkSession()
>>> spark = spark_session.load() # doctest: +SKIP
>>> print(spark.version) # doctest: +SKIP
3.3.1
Example in a node:
>>> import pyspark.sql
>>> from ordeq import node
>>> from ordeq_common import Static
>>> items = Static({'id': [1, 2, 3], 'value': ['a', 'b', 'c']})
>>> @node(
...     inputs=[items, spark_session],
...     outputs=[],
... )
... def convert_to_df(
...     data: dict, spark: pyspark.sql.SparkSession
... ) -> pyspark.sql.DataFrame:
...     return spark.createDataFrame(data)
load()
Gets the active Spark session.
Returns:
Type | Description |
---|---|
SparkSession | pyspark.sql.SparkSession: The Spark session. |
Raises:
Type | Description |
---|---|
ValueError | If there is no active Spark session. |
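
Because `load()` is documented to raise `ValueError` when no Spark session is active, callers may want to guard the call. The following is a minimal sketch of one way to do that, not part of `ordeq_spark`: `load_or_none` and `_no_active_session` are hypothetical helpers introduced here for illustration, and the stand-in only mimics the documented error so the example runs without Spark installed.

```python
from typing import Any, Callable, Optional


def load_or_none(load: Callable[[], Any]) -> Optional[Any]:
    """Call a zero-argument ``load`` callable and return its result.

    Hypothetical helper: returns None if ``load`` raises ValueError, which
    is the error documented above for a missing active Spark session.
    """
    try:
        return load()
    except ValueError:
        return None


def _no_active_session() -> Any:
    # Hypothetical stand-in that mimics the documented failure mode.
    raise ValueError("no active Spark session")


# No session available: the helper swallows the ValueError.
missing = load_or_none(_no_active_session)

# A loader that succeeds is passed through unchanged.
present = load_or_none(lambda: "spark-session-placeholder")
```

With a real session input, the same helper would presumably be called as `load_or_none(SparkSession().load)`, so that a missing session yields `None` instead of an exception.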