environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON
You should set the following environment variables in $SPARK_HOME/conf/spark-env.sh:
export PYSPARK_PYTHON=/usr/bin/python
export PYSPARK_DRIVER_PYTHON=/usr/bin/python
If spark-env.sh doesn't exist, you can create it by copying or renaming spark-env.sh.template (in the same conf directory) to spark-env.sh.
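If you would rather script that step than edit the file by hand, here is a minimal sketch. The helper name ensure_spark_env is hypothetical, not part of Spark. It assumes $SPARK_HOME/conf already exists and reuses /usr/bin/python from the example above; adjust the path to your interpreter.

```python
import shutil
from pathlib import Path


def ensure_spark_env(spark_home, python_path="/usr/bin/python"):
    """Create conf/spark-env.sh if needed and add the two PySpark exports.

    Hypothetical helper for illustration; assumes <spark_home>/conf exists.
    """
    conf_dir = Path(spark_home) / "conf"
    env_file = conf_dir / "spark-env.sh"
    template = conf_dir / "spark-env.sh.template"

    # Start from the shipped template when spark-env.sh is missing.
    if not env_file.exists() and template.exists():
        shutil.copy(template, env_file)

    existing = env_file.read_text() if env_file.exists() else ""

    # Only add exports that are not already present in the file.
    new_lines = [
        f"export {name}={python_path}"
        for name in ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON")
        if f"export {name}=" not in existing
    ]

    if new_lines:
        with env_file.open("a") as fh:
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write("\n".join(new_lines) + "\n")
    return env_file
```

Calling it a second time leaves the file unchanged, because each export is only appended when it is not already there.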
By the way, if you use PyCharm, you can add PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON to the environment variables of your run/debug configuration, as shown in the image below.
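Another option, if you can't edit spark-env.sh or the IDE configuration, is to set the same variables at the top of your script, before any SparkContext or SparkSession is created. This is only a sketch: the helper name configure_pyspark_python is made up for illustration, and it assumes the interpreter running the script is the one the workers should use too (typically true in local mode; on a cluster that path must exist on every worker node).

```python
import os
import sys


def configure_pyspark_python(python_path=None):
    """Point PySpark at a single interpreter (hypothetical helper).

    Defaults to the interpreter running this script. Must be called
    before the SparkContext / SparkSession is created.
    """
    path = python_path or sys.executable
    # Interpreter used to launch the Python worker processes.
    os.environ["PYSPARK_PYTHON"] = path
    # Set for consistency; when the script is started with plain `python`,
    # the driver is already running under this interpreter.
    os.environ["PYSPARK_DRIVER_PYTHON"] = path
    return path


chosen = configure_pyspark_python()
print("PySpark will use:", chosen)
```

After this, create your SparkSession as usual. Values set this way only apply to the current process, so they do not replace spark-env.sh for jobs launched in other ways.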