Removing Blank Strings from a Spark Dataframe

Removing things from a dataframe requires filter().

newDF = oldDF.filter("colName != ''")

or am I misunderstanding your question?


If you don't want to drop the records with blank strings, but just want to convert the blank strings to some constant value:

val newdf = df.na.replace(df.columns, Map("" -> "0")) // replace blank strings with the string "0"
newdf.show()
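
If only one column should change, or if whitespace-only values should also count as blank, a conditional expression is an alternative. Here is a minimal sketch; the object and helper names (BlankToDefault, blankToDefault) are hypothetical and not part of the answer above:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit, trim, when}

object BlankToDefault {
  // Hypothetical helper: replaces empty or whitespace-only strings in one
  // column with a default value. Nulls and all other values are left as-is,
  // because the comparison is null for null input and falls to otherwise().
  def blankToDefault(df: DataFrame, colName: String, default: String): DataFrame =
    df.withColumn(
      colName,
      when(trim(col(colName)) === "", lit(default)).otherwise(col(colName))
    )
}
```

Usage would look like BlankToDefault.blankToDefault(df, "col_name", "0").show().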

You can use this:

import spark.implicits._ // enables the $"..." syntax; assumes the SparkSession is named spark
df.filter(!($"col_name" === ""))

It filters out the rows where the value of "col_name" is "", i.e. an empty string. It uses an equality match and then inverts it with "!".
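
One thing to watch: comparing a null value with "" evaluates to null rather than true or false, so the filter above also drops rows where the column is null. The sketch below makes that choice explicit and treats whitespace-only values as blank too. The object and helper names (BlankFilters, dropBlank, dropBlankKeepNull) are hypothetical:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, trim}

object BlankFilters {
  // Drops rows whose value is empty or whitespace-only. Null values are
  // dropped too, because the comparison evaluates to null for them.
  def dropBlank(df: DataFrame, colName: String): DataFrame =
    df.filter(trim(col(colName)) =!= "")

  // Same as above, but keeps rows where the value is null.
  def dropBlankKeepNull(df: DataFrame, colName: String): DataFrame =
    df.filter(col(colName).isNull || trim(col(colName)) =!= "")
}
```

Which variant is right depends on whether null should count as "blank" in your data.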