Spark: Date Arithmetic with Multiple Columns
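Say you have a timestamp column created_at and an integer column number that holds a number of weeks.
How do you use the date_add function to shift created_at by that many weeks?
Calling date_add from Python with a column expression for the number of days may not work (older PySpark versions expect a plain integer there), so write the call as a SQL expression and pass it to the expr function.
Note that date_add returns a date, so the time-of-day part of created_at is dropped: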
from pyspark.sql.functions import expr
# number is in weeks, so multiply by 7 to get the day count date_add expects
new_df = my_df.withColumn('test', expr('date_add(created_at, number * 7)'))
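To sanity-check the values you expect in the new column, here is a rough plain-Python model of what the expression computes for a single row. It is only a sketch: add_weeks_as_date is a made-up helper name, not part of Spark, and it assumes a naive datetime with no time zone handling.

```python
from datetime import date, datetime, timedelta


def add_weeks_as_date(created_at: datetime, weeks: int) -> date:
    """Hypothetical helper modelling date_add(created_at, weeks * 7) for one row."""
    # date_add works on dates, so the time of day is dropped first
    return created_at.date() + timedelta(days=weeks * 7)


# 15 January 2020 plus 3 weeks (21 days)
result = add_weeks_as_date(datetime(2020, 1, 15, 13, 45), 3)
print(result)  # 2020-02-05
```

Comparing a few rows of new_df against values computed this way is a cheap way to confirm the weeks-to-days conversion is doing what you intended.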
Via Stack Overflow.