SQL LIKE in Spark SQL
You are only a little bit off. Spark SQL and Hive follow the SQL standard, in which the LIKE
operator recognizes only two special characters:
_ (underscore), which matches any single character.
% (percent), which matches any sequence of characters, including an empty one.
Square brackets have no special meaning, so [4,8] matches only the literal string [4,8]:
spark.sql("SELECT '[4,8]' LIKE '[4,8]'").show
+----------------+
|[4,8] LIKE [4,8]|
+----------------+
|            true|
+----------------+
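For contrast, here is a minimal sketch of the two wildcards that LIKE does understand, next to the bracket pattern applied to an actual value. It assumes Spark is on the classpath (in spark-shell, `getOrCreate` simply returns the existing session); the column aliases and the `like-demo` app name are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell this returns the session that already exists.
val spark = SparkSession.builder().master("local[*]").appName("like-demo").getOrCreate()

// `_` stands for exactly one character, `%` for any sequence of characters.
// The bracket pattern is compared literally, so it does not match a value starting with 8.
val likeResult = spark.sql("""
  SELECT
    '8NXDPVAE' LIKE '_NXD%'     AS wildcard_match,
    '8NXDPVAE' LIKE '[4,8]NXD%' AS bracket_match
""")

likeResult.show()
```

The first column comes back true and the second false, because the value starts with 8 and not with the five characters [4,8].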
To match more complex patterns, you can use the RLIKE
operator, which supports Java regular expressions:
spark.sql("SELECT '8NXDPVAE' RLIKE '^[4,8]NXD.V.*$'").show
+-----------------------------+
|8NXDPVAE RLIKE ^[4,8]NXD.V.*$|
+-----------------------------+
|                         true|
+-----------------------------+
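The same filter can be written with the DataFrame API through `Column.rlike`. The sketch below is only an illustration: the `codes` DataFrame and its sample values are hypothetical. Note that inside a regex character class the comma is an ordinary character, so `[4,8]` also accepts a leading `,`; if only 4 or 8 should be allowed, `[48]` is the tighter class, and that is what is used here.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// In spark-shell this returns the session that already exists.
val spark = SparkSession.builder().master("local[*]").appName("rlike-demo").getOrCreate()
import spark.implicits._

// Hypothetical sample data, not taken from the question.
val codes = Seq("8NXDPVAE", "4NXDXVQQ", ",NXDPVAE", "9NXDPVAE").toDF("code")

// [48] matches only '4' or '8'; [4,8] would additionally match a literal ','.
val matched = codes.filter(col("code").rlike("^[48]NXD.V.*$"))

matched.show()
```

With this sample data only 8NXDPVAE and 4NXDXVQQ pass the filter; switching the class back to `[4,8]` would let ,NXDPVAE through as well.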