Lateral column alias support with Spark 3.4 and above
CTEs are used extensively to implement logic. Before Spark 3.4, you had to chain multiple SELECT statements (for example through a CTE) whenever a computed column needed to drive another column.
#Using Pyspark version 3.3 and before
#chaining multiple select statements
query = """WITH t AS (SELECT 100 AS col1)
SELECT 100 * col1 AS col2 FROM t"""
df = spark.sql(query)
df.display()
Code sample (Spark 3.3 and before)
On Spark 3.3 and earlier, attempting the lateral column alias fails analysis, because col1 cannot be resolved within the same SELECT:
#Using Pyspark version 3.3 and before
#Lateral column alias is not yet supported
query = "SELECT 100 AS col1, 100 * col1 AS col2"
df = spark.sql(query)  # raises AnalysisException: col1 cannot be resolved
Starting with Spark 3.4, we no longer need to chain multiple SELECT statements; a lateral column alias lets a later expression reference an alias defined earlier in the same SELECT.
#Using Pyspark version 3.4 and above
#Using lateral column alias
query = "SELECT 100 AS col1, 100 * col1 AS col2"
df = spark.sql(query)
df.display()