Spark Scala: explode a JSON array in a DataFrame
You'll have to parse the JSON string into an array of JSON strings, and then use `explode` on the result (`explode` expects an array column).
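For concreteness, here's a minimal setup sketch for the kind of input assumed below: a DataFrame named `dataframe` with a string column `Payment` holding a JSON array. The column name comes from the answer; the sample values are my own assumption:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Each Payment value is a single JSON *string* that encodes an array of objects
val dataframe = Seq(
  """[{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.5}]""",
  """[{"id": 3, "amount": 5.0}, {"id": 4, "amount": 7.5}]"""
).toDF("Payment")
```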
To do that (assuming Spark 2.0.*):
If you know all `Payment` values contain a JSON array of the same size (e.g. 2 in this case), you can hard-code extraction of the first and second elements, wrap them in an array and explode:

```scala
import org.apache.spark.sql.functions._

val newDF = dataframe.withColumn("Payment", explode(array(
  get_json_object($"Payment", "$[0]"),
  get_json_object($"Payment", "$[1]")
)))
```
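With input like the sketch above, each source row is duplicated once per array element, and `Payment` ends up holding one element's JSON string per row. A quick way to check (the output shape here is approximate, based on my assumed sample data):

```scala
newDF.show(false)
// Approximate output:
// {"id":1,"amount":10.0}
// {"id":2,"amount":20.5}
// {"id":3,"amount":5.0}
// {"id":4,"amount":7.5}
```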
If you can't guarantee all records have a JSON with a 2-element array, but you can guarantee a maximum length of these arrays, you can use this trick to parse elements up to the maximum size and then filter out the resulting `null`s:

```scala
val maxJsonParts = 3 // whatever that number is...

// get_json_object returns null when the index is past the end of the array
val jsonElements = (0 until maxJsonParts)
  .map(i => get_json_object($"Payment", s"$$[$i]"))

val newDF = dataframe
  .withColumn("Payment", explode(array(jsonElements: _*)))
  .where(!isnull($"Payment"))
```
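Either way, once each row holds a single JSON object as a string, you'd typically extract fields from it. One way to sketch that, reusing `get_json_object` (the `id` and `amount` field names carry over from my assumed sample data, not from the question):

```scala
// get_json_object returns strings, so cast extracted fields as needed
val parsed = newDF
  .withColumn("id", get_json_object($"Payment", "$.id").cast("long"))
  .withColumn("amount", get_json_object($"Payment", "$.amount").cast("double"))
```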