org.apache.spark.sql.avro.IncompatibleSchemaException: Unexpected type org.apache.spark.ml.linalg.VectorUDT code example

Example: writing a DataFrame that contains an ML Vector column to Avro can fail with org.apache.spark.sql.avro.IncompatibleSchemaException: Unexpected type org.apache.spark.ml.linalg.VectorUDT

// To convert any Vector to an Array[Double], you can use the following UDF:

import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.functions.col
import org.apache.spark.ml.linalg.Vector

val vectorToArrayUdf = udf((vector: Vector) => vector.toArray)

// Replace the Vector column with an Array[Double] column, which the Avro writer can serialize
val output = dataPredictions
    .withColumn("probabilities", vectorToArrayUdf(col("probability")))
    .select("id", "probabilities", "prediction")

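// `path` is the output location. On Spark 2.4+ with the built-in spark-avro module, the short format name "avro" can be used instead.
output.write.format("com.databricks.spark.avro").save(path)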
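
On Spark 3.0 and later, the hand-written UDF can probably be replaced by the built-in `org.apache.spark.ml.functions.vector_to_array`. The following is a minimal sketch under that assumption. The object name `VectorToArrayExample`, the helper `toAvroFriendly`, and the two-row stand-in for `dataPredictions` are hypothetical and only serve to make the example self-contained.

```scala
import org.apache.spark.ml.functions.vector_to_array
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object VectorToArrayExample {

  // Hypothetical helper: swaps the Vector column "probability" for an
  // Array[Double] column "probabilities" and keeps the other two columns.
  def toAvroFriendly(df: DataFrame): DataFrame =
    df.withColumn("probabilities", vector_to_array(col("probability")))
      .select("id", "probabilities", "prediction")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("vector-to-array")
      .getOrCreate()
    import spark.implicits._

    // Assumed stand-in for the original dataPredictions DataFrame.
    val dataPredictions = Seq(
      (1L, Vectors.dense(0.9, 0.1), 0.0),
      (2L, Vectors.dense(0.2, 0.8), 1.0)
    ).toDF("id", "probability", "prediction")

    val output = toAvroFriendly(dataPredictions)
    output.printSchema()

    // Writing requires the spark-avro module on the classpath; the path
    // is taken from the first program argument, if one is given.
    args.headOption.foreach(path => output.write.format("avro").save(path))

    spark.stop()
  }
}
```

Compared with a Scala UDF, the built-in function avoids defining and registering custom code, but it is only available from Spark 3.0 onward, so the UDF above remains the option for older versions.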
