Spark : Union can only be performed on tables with the compatible column types. Struct<name,id> != Struct<id,name>
The default Spark behaviour for union is standard SQL behaviour: columns are matched by position. This means the schemas of both DataFrames must contain the same fields, with the same types, in the same order.
If you want to match the schema by name instead, use unionByName, introduced in Spark 2.3.
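A minimal sketch of the by-name variant (the data and column names here are illustrative, not from your question), run against a local SparkSession:

```scala
import org.apache.spark.sql.SparkSession

object UnionByNameSketch {
  // Returns the combined rows as (id, name) pairs
  def combine(): Seq[(Int, String)] = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("unionByName-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same columns, different order: (id, name) vs (name, id)
    val df1 = Seq((1, "alice")).toDF("id", "name")
    val df2 = Seq(("bob", 2)).toDF("name", "id")

    // unionByName resolves columns by name, so the order mismatch is fine;
    // the result keeps df1's column order
    val result = df1.unionByName(df2).as[(Int, String)].collect().toSeq
    spark.stop()
    result
  }

  def main(args: Array[String]): Unit = println(combine())
}
```

A plain union here would either fail or silently mix up the columns, because it only looks at positions.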
You can also re-map fields, renaming one DataFrame's columns to the other's before the positional union:

val df1 = ...
val df2 = ...
// Rename df1's columns (positionally) to df2's names, then union by position
df1.toDF(df2.columns: _*).union(df2)
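Note that toDF renames positionally, so this trick assumes the two DataFrames already agree on column order and types and only differ in names. A runnable sketch with illustrative data:

```scala
import org.apache.spark.sql.SparkSession

object RenameThenUnionSketch {
  // Returns the combined rows as (name, id) pairs
  def combine(): Seq[(String, Int)] = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rename-union-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same positional meaning, different column names
    val df1 = Seq(("alice", 1)).toDF("user", "uid")
    val df2 = Seq(("bob", 2)).toDF("name", "id")

    // toDF renames df1's columns positionally to df2's names
    // ("user" -> "name", "uid" -> "id"), so the positional union lines up
    val result = df1.toDF(df2.columns: _*).union(df2).as[(String, Int)].collect().toSeq
    spark.stop()
    result
  }

  def main(args: Array[String]): Unit = println(combine())
}
```

If the columns differ in order rather than just in name, reorder with select instead of renaming with toDF.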
Edit: I saw the edit now.
You can rebuild the struct column from those flattened fields (and you may then want to drop the flat columns before the union):

import org.apache.spark.sql.functions._

// Reassemble the struct from the flattened columns
val withCorrectedStruct = df1.withColumn("skyward", struct($"skyward_number", $"tier", $"skyward_points"))