No suitable driver found for jdbc in Spark
I had to add the driver option when using the sparkSession's read function:

.option("driver", "org.postgresql.Driver")
val jdbcDF = sparkSession.read
  .format("jdbc")
  .option("driver", "org.postgresql.Driver")
  .option("url", "jdbc:postgresql://<host>:<port>/<DBName>")
  .option("dbtable", "<tableName>")
  .option("user", "<user>")
  .option("password", "<password>")
  .load()
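Note that the driver class still has to be on the classpath at runtime. If you launch with spark-submit, one way to get it there (a sketch, assuming the 42.2.8 PostgreSQL artifact mentioned below and that your machine can reach Maven Central) is the --packages flag, which resolves the JAR and ships it to driver and executors:

spark-submit --packages org.postgresql:postgresql:42.2.8 <your-app.jar>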
Depending on how your dependencies are set up, you'll notice that when you include something like compile group: 'org.postgresql', name: 'postgresql', version: '42.2.8' in Gradle, for example, this will include the Driver class at org/postgresql/Driver.class, and that's the one you want to instruct Spark to load.
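A quick way to confirm that this class actually made it onto your application's classpath is to load it reflectively before building the DataFrame; a minimal sanity check of my own (not something Spark requires):

// Throws ClassNotFoundException if the PostgreSQL JAR is missing from the classpath
Class.forName("org.postgresql.Driver")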
There are three possible solutions:

- You might want to assemble your application with your build manager (Maven, SBT), so that you won't need to add the dependencies in your spark-submit CLI.
- You can use the following option in your spark-submit CLI: --jars $(echo ./lib/*.jar | tr ' ' ',')
Explanation: Supposing that you have all your jars in a lib directory in your project root, this will pick up all the libraries and add them when the application is submitted.
- You can also try to configure the two variables spark.driver.extraClassPath and spark.executor.extraClassPath in the SPARK_HOME/conf/spark-defaults.conf file, and specify the value of these variables as the path of the jar file. Ensure that the same path exists on worker nodes; a sketch follows below.
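As a minimal sketch of that last option, assuming the driver JAR sits at /opt/spark/jars/postgresql-42.2.8.jar (an illustrative path) on the driver and on every worker, the spark-defaults.conf entries would be:

# Both paths must exist on the driver and on all worker nodes
spark.driver.extraClassPath   /opt/spark/jars/postgresql-42.2.8.jar
spark.executor.extraClassPath /opt/spark/jars/postgresql-42.2.8.jar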