Spark : multiple spark-submit in parallel
You could also bump up the value set for spark.port.maxRetries.
As per the docs:
Maximum number of retries when binding to a port before giving up. When a port is given a specific value (non 0), each subsequent retry will increment the port used in the previous attempt by 1 before retrying. This essentially allows it to try a range of ports from the start port specified to port + maxRetries.
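For example, to allow more parallel applications to bind ports on the same host, the property can be raised at submit time (the default is 16; the value below is only illustrative):
spark-submit --conf spark.port.maxRetries=32 <other arguments>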
The above answers are correct.
However, we should not try to change the spark.port.maxRetries value, as doing so will increase load on the same server, which in turn will degrade cluster performance and can push the node into a deadlock situation. Load can be checked with the uptime command in your session.
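For reference, uptime reports the 1-, 5- and 15-minute load averages; the figures below are purely illustrative:
uptime
 10:15:02 up 12 days,  3:41,  2 users,  load average: 0.45, 0.61, 0.58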
The root cause of this issue is running all Spark applications with --deploy-mode client.
If you have distributed capacity in your cluster, the best approach is to run them with --deploy-mode cluster.
This way, each application's driver runs on a different node, which mitigates port-binding conflicts on any single node.
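As a minimal sketch, a cluster-mode submission looks like the following (YARN is assumed purely as an example, and the class and jar names are placeholders):
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp <other arguments> my-app.jar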
Hope this helps. Cheers!
This issue occurs if multiple users try to start a Spark session at the same time, or if existing Spark sessions were not properly closed.
There are two ways to fix this issue.
Start the new Spark session on a different port, as follows:
spark-submit --conf spark.ui.port=5051 <other arguments>
spark-shell --conf spark.ui.port=5051
Find all Spark sessions using ports from 4041 to 4056 and kill those processes: netstat can be used to find the process occupying a port, and kill to terminate it. Here's the usage:
sudo netstat -tunalp | grep LISTEN | grep 4041
The above command will produce output like the line below; the last column is the process id, in this case PID 32028:
tcp 0 0 :::4041 :::* LISTEN 32028/java
Once you have found the process id (PID), you can kill the Spark process (spark-shell or spark-submit) using the command below:
sudo kill -9 32028
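If you prefer, the lookup and the kill can be combined into one step. This is only a convenience sketch, assuming the netstat output format shown above (PID/program name in the last column); as always, use kill -9 with care:
sudo kill -9 $(sudo netstat -tunalp | grep LISTEN | grep 4041 | awk '{print $7}' | cut -d/ -f1)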