PySpark in IPython notebook raises Py4JJavaError when using count() and first()

If you are using Anaconda, try installing the java-jdk package for Anaconda:

conda install -c cyclus java-jdk

I had the same problem with PySpark in Anaconda a long time ago. I tried several ways to fix it, and what finally worked was installing Java for Anaconda separately; after that the Py4JJavaError was gone.

https://anaconda.org/cyclus/java-jdk
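
As a quick sanity check, here is a minimal sketch (assuming the conda java-jdk install succeeded and Spark can find the JVM) that reruns the count() and first() calls from the question:

from pyspark import SparkContext

sc = SparkContext("local", "py4j-smoke-test")
rdd = sc.parallelize([1, 2, 3])
print(rdd.count())  # should print 3 instead of raising Py4JJavaError
print(rdd.first())  # should print 1
sc.stop()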


PySpark 2.1.0 is not compatible with Python 3.6; see https://issues.apache.org/jira/browse/SPARK-19019.

You need to use an earlier Python version (for example, a conda environment with Python 3.5), or you can try building the master or 2.1 branch from GitHub, which should work.
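
If you would rather fail fast than hit the Py4JJavaError, a small guard can check the interpreter version before creating the context (a sketch; the version cutoff comes from the JIRA ticket above):

import sys

from pyspark import SparkContext

# PySpark 2.1.0 is not compatible with Python 3.6 (SPARK-19019), so bail out early
if sys.version_info >= (3, 6):
    raise RuntimeError("PySpark 2.1.0 requires Python < 3.6; "
                       "use a Python 3.5 environment or build from the 2.1 branch")

sc = SparkContext("local", "version-guard")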