pyspark.sql.functions.stack
- pyspark.sql.functions.stack(*cols)
- Separates col1, …, colk into n rows. The output columns are named col0, col1, etc. by default; pass names to alias to override them. If the inputs do not fill the last row, the remaining cells are NULL.
- New in version 3.5.0.
- Parameters
- cols: Column or column name
- The first element should be a literal int giving the number of rows to produce; the remaining elements are the input values to be separated.
- Examples

  >>> from pyspark.sql import functions as sf
  >>> df = spark.createDataFrame([(1, 2, 3)], ['a', 'b', 'c'])
  >>> df.select('*', sf.stack(sf.lit(2), df.a, df.b, 'c')).show()
  +---+---+---+----+----+
  |  a|  b|  c|col0|col1|
  +---+---+---+----+----+
  |  1|  2|  3|   1|   2|
  |  1|  2|  3|   3|NULL|
  +---+---+---+----+----+

  >>> df.select('*', sf.stack(sf.lit(2), df.a, df.b, 'c').alias('x', 'y')).show()
  +---+---+---+---+----+
  |  a|  b|  c|  x|   y|
  +---+---+---+---+----+
  |  1|  2|  3|  1|   2|
  |  1|  2|  3|  3|NULL|
  +---+---+---+---+----+

  >>> df.select('*', sf.stack(sf.lit(3), df.a, df.b, 'c')).show()
  +---+---+---+----+
  |  a|  b|  c|col0|
  +---+---+---+----+
  |  1|  2|  3|   1|
  |  1|  2|  3|   2|
  |  1|  2|  3|   3|
  +---+---+---+----+

  >>> df.select('*', sf.stack(sf.lit(4), df.a, df.b, 'c')).show()
  +---+---+---+----+
  |  a|  b|  c|col0|
  +---+---+---+----+
  |  1|  2|  3|   1|
  |  1|  2|  3|   2|
  |  1|  2|  3|   3|
  |  1|  2|  3|NULL|
  +---+---+---+----+
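The row layout seen in the examples can be reproduced in plain Python, which may help when predicting how many output columns a given call yields. The sketch below is an illustration only, not Spark's implementation: `stack_layout` is a hypothetical helper name, and `None` stands in for the NULL that Spark displays. It assumes the values are filled row by row and that each row has ceil(k / n) columns, which is the behavior the examples show.

```python
from math import ceil
from typing import Any, List, Optional, Sequence, Tuple


def stack_layout(n: int, values: Sequence[Any]) -> List[Tuple[Optional[Any], ...]]:
    """Hypothetical helper: arrange `values` into `n` rows as in the examples."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Each row gets ceil(k / n) columns, where k is the number of input values.
    width = ceil(len(values) / n)
    # Pad with None (displayed as NULL by Spark) so the values fill n full rows.
    padded = list(values) + [None] * (n * width - len(values))
    # Fill row by row: the first `width` values form row 0, the next form row 1, etc.
    return [tuple(padded[i * width:(i + 1) * width]) for i in range(n)]


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, stack_layout(n, [1, 2, 3]))
```

With the inputs 1, 2, 3 from the examples, `stack_layout(2, [1, 2, 3])` returns `[(1, 2), (3, None)]`, matching the two-column result, while n = 3 and n = 4 each give a single column, the latter with a trailing `None` row.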