Read and write empty string "" vs NULL in Spark 2.0.1
A mere two and a half years later, empty strings are no longer considered equal to null
values thanks to Spark 2.4.0! See this commit for the details of the change. Your code will behave as expected under Spark 2.4.0+:
val df = session.createDataFrame(Seq(
  (0, "a"),
  (1, "b"),
  (2, "c"),
  (3, ""),
  (4, null)
))

df.coalesce(1).write.mode("overwrite").format("csv")
  .option("delimiter", ",")
  .option("nullValue", "unknown")              // only true nulls are written as "unknown"
  .option("treatEmptyValuesAsNulls", "false")  // empty strings stay empty
  .save(s"$path/test")
Results in:
0,a
1,b
2,c
3,
4,unknown
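
To sanity-check the round trip, you can read the file back so that "unknown" is mapped to null again. This is a minimal sketch assuming the same session and path as above; note that, depending on quoting, the unquoted empty field on row 3 may still be parsed back as null unless the reader's empty-value handling is adjusted:

val readBack = session.read.format("csv")
  .option("delimiter", ",")
  .option("nullValue", "unknown")   // "unknown" is parsed back to null on read
  .load(s"$path/test")

readBack.show()
// rows 0-2 keep their letters and row 4 comes back as null;
// row 3 may show "" or null depending on the reader's empty-value handling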