How to find the master URL for an existing spark cluster

I found that using `--master yarn-cluster` works best; it makes sure Spark uses all the nodes of the Hadoop cluster. (In Spark 2.x and later, the equivalent is `--master yarn --deploy-mode cluster`.)
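As a sketch, a YARN submission looks like this (assuming `HADOOP_CONF_DIR` points at your cluster's config; `my-app.jar` and `com.example.MyApp` are placeholders):

```shell
# Submit to YARN in cluster mode (Spark 2.x+ syntax).
# The driver runs inside the cluster, not on the submitting machine.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  my-app.jar
```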


You are spot on. `.setMaster("local[*]")` runs Spark in local mode. In this mode Spark can use only the resources of the local machine.
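Rather than hard-coding the master in the application, you can leave it out of the code and choose the mode at launch time. For example, with `spark-shell` (the standalone URL here assumes a master running on a host named `master`):

```shell
# Local mode: use all cores of this machine only.
spark-shell --master "local[*]"

# Standalone cluster mode: point at a running Spark master instead.
spark-shell --master spark://master:7077
```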

If you've already set up a Spark standalone cluster on top of your physical cluster, the solution is easy: open http://master:8080, where `master` is the host running the Spark master (8080 is the default web UI port). The page shows the Spark master URL, which by default is spark://master:7077, along with quite a bit of other information about the standalone cluster.
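If you prefer the command line, the standalone master also serves its status as JSON under the `/json` path of the web UI. A minimal sketch, assuming the master runs on a host named `master` with the default port:

```shell
# Fetch the master's status as JSON and pull out the spark:// URL.
curl -s http://master:8080/json | grep -o '"url" *: *"[^"]*"'
```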

However, I see a lot of questions on SO claiming this does not work, for a variety of reasons. Using the spark-submit utility is simply less error prone; run `spark-submit --help` for usage.
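With the master URL in hand, submitting to the standalone cluster is one command (the jar path and class name here are placeholders):

```shell
# Submit to a standalone master; client deploy mode is the default,
# so the driver runs on the machine you submit from.
spark-submit \
  --master spark://master:7077 \
  --class com.example.MyApp \
  my-app.jar
```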

But if you don't have a Spark cluster yet, I suggest setting up a Spark Standalone cluster first.
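A minimal standalone cluster can be brought up with the scripts shipped in Spark's `sbin` directory (assuming Spark is unpacked at `$SPARK_HOME` on every node):

```shell
# On the master node: start the master; it prints/logs its spark:// URL.
$SPARK_HOME/sbin/start-master.sh

# On each worker node: register with the master.
# (On Spark versions before 3.1 the script is called start-slave.sh.)
$SPARK_HOME/sbin/start-worker.sh spark://master:7077
```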

Tags:

Apache Spark