base64

pyspark.sql.functions.base64(col)

version: since 1.5

Computes the BASE64 encoding of a binary column and returns it as a string column.

Runnable Code:

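from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql import types as T

# Reuse the active session (e.g. in a notebook) or start a new one
spark = SparkSession.builder.getOrCreate()

# Set up dataframe
df = spark.createDataFrame(
    [(bytearray(b"0001"), 1)],
    schema=T.StructType([
        T.StructField("bin", T.BinaryType()),
        T.StructField("number", T.IntegerType()),
    ]),
)
df = df.drop("number")

# Use function
df = df.withColumn("base64", F.base64(F.col("bin")))
df.show()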

Output:

+-------------+--------+
|          bin|  base64|
+-------------+--------+
|[30 30 30 31]|MDAwMQ==|
+-------------+--------+
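
As a sanity check on the output above, the same bytes can be encoded outside Spark with Python's standard-library base64 module. This is only a minimal sketch: it assumes Spark's base64 matches standard Base64 for a short input like this one, and the variable names (raw, hex_bytes, encoded, decoded) are illustrative, not part of PySpark.

```python
import base64

# Same four bytes as the "bin" column: the ASCII characters "0001"
raw = b"0001"

# Hex view of the bytes, in the style Spark uses to display binary columns
hex_bytes = " ".join(f"{b:02x}" for b in raw)

# Encode to Base64 text, then decode back to confirm the round trip
encoded = base64.b64encode(raw).decode("ascii")
decoded = base64.b64decode(encoded)

print(hex_bytes)       # 30 30 30 31
print(encoded)         # MDAwMQ==
print(decoded == raw)  # True
```

The trailing "==" is padding: four input bytes do not divide evenly into 3-byte groups, so the last group is padded out to a full 4-character block.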

Usage:

I’ve never used this one; I haven’t had much demand for base64 in the real world.



returns: _invoke_function_over_column("base64", col)

PySpark manual

tags: base64, binary to text, binary




© 2023 PySpark Is Rad