arrays_zip
pyspark.sql.functions.arrays_zip(*cols)
version: since 2.4.0
Collection function: Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
Runnable Code:
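from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Reuse the active session, or start a local one if none exists
spark = SparkSession.builder.getOrCreate()

# Set up dataframe; the second row has no "a" key, so "a" is null there
data = [{"a": [1, 2], "b": [3, 2]}, {"b": [1, 2]}]
df = spark.createDataFrame(data)

# Use function: zip the two array columns element-wise into an array of structs
df = df.withColumn("arrays_zip", F.arrays_zip(F.col("a"), F.col("b")))
df.show()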
a | b | arrays_zip |
---|---|---|
[1, 2] | [3, 2] | [{1, 3}, {2, 2}] |
null | [1, 2] | null |
Usage:
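A simple array function, similar to Python's built-in zip: the N-th struct holds the N-th element of each input array. Unlike zip, it does not truncate to the shortest input; missing elements of shorter arrays are filled with null. If any input column is null, the whole result is null, as in the second row of the output above.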
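To make the comparison with Python's zip concrete, here is a minimal pure-Python sketch of the zipping behavior. `arrays_zip_model` is a hypothetical helper written for illustration, not a PySpark function and not how Spark implements it. It assumes two behaviors: a null input array gives a null result (as in the second row of the output above), and shorter arrays are padded with null instead of being truncated. The second assumption is worth checking against your Spark version.

```python
from itertools import zip_longest


def arrays_zip_model(*arrays):
    """Pure-Python model of arrays_zip semantics (hypothetical helper, not part of PySpark)."""
    # Assumption: if any input array is null (None), the whole result is null.
    if any(arr is None for arr in arrays):
        return None
    # Assumption: shorter arrays are padded with None rather than truncated.
    return list(zip_longest(*arrays, fillvalue=None))


print(arrays_zip_model([1, 2], [3, 2]))  # [(1, 3), (2, 2)]
print(arrays_zip_model(None, [1, 2]))    # None

# Built-in zip stops at the shortest input; the model pads instead.
print(list(zip([1, 2, 3], [4])))         # [(1, 4)]
print(arrays_zip_model([1, 2, 3], [4]))  # [(1, 4), (2, None), (3, None)]
```

The tuples here stand in for Spark structs. In a real DataFrame, each element of the zipped array is a struct rather than a tuple.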
returns: Column(sc._jvm.functions.arrays_zip(_to_seq(sc, cols, _to_java_column)))
tags: zip array, zip list, from both lists
© 2023 PySpark Is Rad