How do I register a UDF for use in both SQL and the DataFrame API?
The UDFRegistration.register variants that take a scala.FunctionN return a UserDefinedFunction, so you can register a SQL function and create a DSL-friendly UDF in a single step:
val timesTwoUDF = spark.udf.register("timesTwo", (x: Int) => x * 2)
spark.sql("SELECT timesTwo(1)").show
+---------------+
|UDF:timesTwo(1)|
+---------------+
|              2|
+---------------+
spark.range(1, 2).toDF("x").select(timesTwoUDF($"x")).show
+------+
|UDF(x)|
+------+
|     2|
+------+
Alternatively, you can register the function as follows and still apply it to a DataFrame:
spark.sqlContext.udf.register("myUDF", myFunc)
Use selectExpr to call the registered UDF by name in DataFrame transformations:
df.selectExpr("myUDF(col1) as modified_col1")
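For anyone who wants to try this outside spark-shell, here is a minimal sketch that combines both approaches. The body of myFunc, the sample data, and the local SparkSession settings are assumptions made up for illustration, since the answers above do not define them. It uses spark.udf, which as far as I know is the same registry that spark.sqlContext.udf exposes in Spark 2.x and later.

```scala
import org.apache.spark.sql.SparkSession

object RegisterUdfSketch {
  // Hypothetical stand-in for `myFunc`, which the answer leaves undefined.
  // Null-safe, because Spark passes null for missing string values.
  val myFunc: String => String = s => if (s == null) null else s.toUpperCase

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("register-udf-sketch")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"col" syntax

    // register returns a UserDefinedFunction, so keep the handle for DSL use.
    val myUDF = spark.udf.register("myUDF", myFunc)

    // Hypothetical sample data with a single string column `col1`.
    val df = Seq("a", "b").toDF("col1")

    // Call the UDF by its registered name inside a SQL expression string.
    df.selectExpr("myUDF(col1) as modified_col1").show()

    // Call the same UDF through the returned handle in the DataFrame DSL.
    df.select(myUDF($"col1").as("modified_col1")).show()

    spark.stop()
  }
}
```

Both calls should produce the same column. Which one to use is mostly a matter of taste: selectExpr needs only the registered name, while the returned handle is checked by the compiler as an ordinary Scala value.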