Spark DataFrame aggregate column values by key into List
This is possible with the DataFrame API. Try:
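import org.apache.spark.sql.functions.{col, collect_list}
// Gather every "value" for each "Id" into a single array column
df.groupBy(col("Id"))
  .agg(collect_list(col("value")) as "value")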
If you want a comma-separated String instead of an Array, try this:
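import org.apache.spark.sql.functions.{col, collect_list, concat_ws}
// Collect the values per "Id", then join the array into one comma-separated string
df.groupBy(col("Id"))
  .agg(collect_list(col("value")) as "value")
  .withColumn("value", concat_ws(",", col("value")))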
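For anyone who wants to try both variants end to end, here is a minimal sketch. The sample rows, the local SparkSession settings, and the names `CollectListExample` and `valuesById` are my own assumptions for illustration, not part of the question. One caveat: `collect_list` does not guarantee the order of the collected elements after a shuffle, so this sketch wraps it in `sort_array` to get a stable result. Drop that call if order does not matter to you.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, concat_ws, sort_array}

object CollectListExample {

  // Hypothetical helper: group by "Id" and gather "value" into an array.
  // sort_array is an addition for determinism; it is not required by collect_list.
  def valuesById(df: DataFrame): DataFrame =
    df.groupBy(col("Id"))
      .agg(sort_array(collect_list(col("value"))) as "value")

  def main(args: Array[String]): Unit = {
    // Assumed local session, only for trying the snippet out
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-list-example")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with the column names used in the answer
    val df = Seq((1, "a"), (1, "b"), (2, "c")).toDF("Id", "value")

    // Variant 1: one array of values per Id
    val asArray = valuesById(df)
    asArray.show()

    // Variant 2: the same values joined into a comma-separated string
    val asString = asArray.withColumn("value", concat_ws(",", col("value")))
    asString.show()

    spark.stop()
  }
}
```

If you need the values in their original row order rather than sorted, you would have to carry an explicit ordering column and sort on it, since Spark makes no ordering promise on its own here.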