spark-submit: add multiple jars to the classpath
I was trying to connect to MySQL from Python code executed via spark-submit. I was using the HDP sandbox managed by Ambari. I tried a lot of options, such as --jars, --driver-class-path, etc., but none of them worked.
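For context, here is a minimal sketch of the kind of PySpark code involved, assuming the connection goes through Spark's JDBC data source; the host, database, table, and credentials are placeholders, not values from the original setup:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mysql-poc").getOrCreate()

# This read fails with a ClassNotFoundException for com.mysql.jdbc.Driver
# unless the MySQL connector jar is on the driver and executor classpath.
df = (spark.read.format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/mydb")
      .option("dbtable", "mytable")
      .option("user", "root")
      .option("password", "secret")
      .option("driver", "com.mysql.jdbc.Driver")
      .load())
df.show()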
Solution
Copy the jar into /usr/local/miniconda/lib/python2.7/site-packages/pyspark/jars/.
I'm still not sure whether this is a real solution or a quick hack, but since I'm working on a POC, it works for me.
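For example, with the MySQL Connector/J jar (the exact jar name and version below are hypothetical; use whichever connector jar you downloaded):

cp mysql-connector-java-5.1.47.jar /usr/local/miniconda/lib/python2.7/site-packages/pyspark/jars/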
Just use the --jars parameter. Spark will share those jars (comma-separated) with the executors.
Specifying the full path for each additional jar works:
./bin/spark-submit --class "SparkTest" --master local[*] --jars /fullpath/first.jar,/fullpath/second.jar /fullpath/your-program.jar
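The same flag works for a Python application like the one in the question; the paths below are placeholders:

./bin/spark-submit --master local[*] --jars /fullpath/mysql-connector-java.jar /fullpath/your-program.py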
Or add the jars in conf/spark-defaults.conf by adding lines like:
spark.driver.extraClassPath /fullpath/first.jar:/fullpath/second.jar
spark.executor.extraClassPath /fullpath/first.jar:/fullpath/second.jar
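If you'd rather not edit the defaults file, the same properties can, as far as I know, be passed per run with --conf:

./bin/spark-submit \
  --conf spark.driver.extraClassPath=/fullpath/first.jar:/fullpath/second.jar \
  --conf spark.executor.extraClassPath=/fullpath/first.jar:/fullpath/second.jar \
  /fullpath/your-program.jar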
You can use * to include all jars in a folder when setting these in conf/spark-defaults.conf:
spark.driver.extraClassPath /fullpath/*
spark.executor.extraClassPath /fullpath/*