How can I pass extra parameters to UDFs in Spark SQL?
Just use a little bit of currying:
def convertDateFunc(resolution: DateResolutionType) = udf((x: String) =>
  SparkDateTimeConverter.convertDate(x, resolution))
and use it as follows:
case FieldDataType.Date => convertDateFunc(resolution(i))(allCols(i))
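Since SparkDateTimeConverter and DateResolutionType are not shown in the question, here is a minimal self-contained sketch of the same pattern. The names CurriedUdfSketch, takeLeft and run, and the sample data, are made up for illustration. The point is only that the extra parameter is a plain Scala value captured when the UDF is built, so it never has to be a Column.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

object CurriedUdfSketch {
  // Hypothetical stand-in for convertDateFunc: `n` is an ordinary Scala value
  // captured by the closure when the UDF is built, so it is never a Column.
  def takeLeft(n: Int): UserDefinedFunction =
    udf((s: String) => if (s == null) null else s.take(n))

  // Applies the parameterised UDF to a small, made-up DataFrame.
  def run(spark: SparkSession): Seq[String] = {
    import spark.implicits._
    val df = Seq("2017-03-15 10:20:30", "2018-11-02 08:00:00").toDF("ts")
    // The first argument list fixes the parameter; the second takes the column.
    df.select(takeLeft(10)(col("ts")).as("day"))
      .collect()
      .map(_.getString(0))
      .toSeq
  }
}
```

Calling takeLeft(10) builds a new UDF each time, so if the same parameter is reused across many columns it may be worth keeping the result in a val.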
On a side note, you should take a look at sql.functions.trunc and sql.functions.date_format. These should do at least part of the job without using UDFs at all.
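As a rough illustration of the built-ins, the sketch below truncates a date to the start of its month and formats it as a year-month string. The object name, column names and sample value are assumptions for the example. Whether this covers your resolution types depends on what SparkDateTimeConverter actually does.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_format, to_date, trunc}

object BuiltinDateSketch {
  // Returns (start of month, year-month string) for one made-up date.
  def run(spark: SparkSession): (String, String) = {
    import spark.implicits._
    // Parse the sample string into a DateType column named "d".
    val df = Seq("2017-03-15").toDF("raw").select(to_date(col("raw")).as("d"))
    val row = df.select(
      // trunc keeps the date type; cast to string only to make it easy to read back.
      trunc(col("d"), "month").cast("string").as("month_start"),
      // date_format returns a string in the given pattern.
      date_format(col("d"), "yyyy-MM").as("year_month")
    ).head()
    (row.getString(0), row.getString(1))
  }
}
```

Both calls are evaluated as native expressions, so the optimizer can see through them, which is not the case for a UDF.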
Note:
In Spark 2.2 or later you can use the typedLit function:
import org.apache.spark.sql.functions.typedLit
which supports a wider range of literals, such as Seq or Map.
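A minimal sketch of passing a Map literal to a UDF with typedLit, assuming Spark 2.2 or later. The names TypedLitSketch, lookup and run, and the sample data, are hypothetical. The parameter is typed as scala.collection.Map here as a conservative choice. An immutable Map parameter commonly works too, but I have not checked that on every Spark version.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, typedLit, udf}

object TypedLitSketch {
  // Hypothetical UDF: looks up `key` in a map that arrives as a literal column.
  val lookup: UserDefinedFunction =
    udf((key: String, table: scala.collection.Map[String, Int]) =>
      table.getOrElse(key, -1))

  // Maps each key in a made-up DataFrame to its code, or -1 when absent.
  def run(spark: SparkSession): Seq[Int] = {
    import spark.implicits._
    val df = Seq("a", "b", "z").toDF("k")
    // typedLit can build a MapType literal, which plain lit does not accept.
    val codes = typedLit(Map("a" -> 1, "b" -> 2))
    df.select(lookup(col("k"), codes).as("code"))
      .collect()
      .map(_.getInt(0))
      .toSeq
  }
}
```

For a small constant like this, capturing the Map in the closure as in the currying approach works just as well. typedLit is mainly useful when the UDF is already defined with a column-shaped signature.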
You can create a literal Column to pass to a udf using the lit(...) function defined in org.apache.spark.sql.functions.
For example:
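import org.apache.spark.sql.functions.{lit, udf}
// lit(1) wraps the constant in a Column, so it can be passed as the second UDF argument.
val takeRight = udf((s: String, i: Int) => s.takeRight(i))
df.select(takeRight($"stringCol", lit(1)))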