Linux SSH Timeout
The first thing you should look into is setting the ServerAliveInterval. This should be set on your workstation.
On Linux or macOS clients you can create a per-user configuration file at ~/.ssh/config on your workstation and add the following directive. In my case I want it to affect all hosts, so I put it under Host *.
Host *
    ServerAliveInterval 60
This makes the client send a keepalive message through the encrypted channel whenever no data has been received from the server for 60 seconds, which keeps the connection open. You may want to tweak the value to meet your needs.
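If only one host needs a different interval, a per-host stanza along these lines might work. This is only a sketch: the host name bastion.example.com and the value 30 are made up for illustration. ssh uses the first value it finds for each option, so the more specific stanza has to come before Host *.

```
# ~/.ssh/config
# Hypothetical host; substitute your own.
Host bastion.example.com
    ServerAliveInterval 30

# Fallback for every other host.
Host *
    ServerAliveInterval 60
```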
On the server side, make sure TCPKeepAlive is set to yes (this is the OpenSSH default).
grep TCPKeepAlive /etc/ssh/sshd_config
TCPKeepAlive yes
If you are on Windows, consult the documentation for your SSH client.
Linux does not time out idle SSH connections. You can leave an SSH connection open indefinitely, and as long as neither endpoint has been rebooted or has gotten a new IP address, the connection will still work when you come back to it after a long idle time.
However, if there are any stateful middleboxes (NAT, firewall, etc.) on the path, those are likely to time out idle connections. The result is that even though the connection is alive at both ends, the two endpoints can no longer communicate, because the middlebox refuses to forward any packets until the SSH client opens a new connection.
If you know the timeout of the middlebox, you can work around the problem by configuring ClientAliveInterval in /etc/ssh/sshd_config on the server or ServerAliveInterval in ~/.ssh/config on the client. For optimal detection of broken connections it is advisable to enable both settings. This will also detect broken connections when either endpoint has been rebooted or has gotten a new IP address.
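As a rough sketch of enabling both settings, assume a middlebox that drops idle connections after about five minutes (an assumed figure, not one taken from the question). Intervals of 60 seconds would then stay well below the timeout. The CountMax options control how many unanswered probes are tolerated before the connection is closed; 3 is the OpenSSH default and is only spelled out here for clarity.

```
# /etc/ssh/sshd_config (server side; reload sshd after editing)
ClientAliveInterval 60
ClientAliveCountMax 3

# ~/.ssh/config (client side)
Host *
    ServerAliveInterval 60
    ServerAliveCountMax 3
```

With these values, an unresponsive peer should be noticed after roughly three minutes (3 probes at 60 seconds each).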
Since you indicate that the timeout sometimes appears to be as low as a few seconds, this might not be sufficient to solve your problem. A very low apparent timeout can be caused by an overloaded or misconfigured CGN (carrier-grade NAT). You need to inspect traffic at various points of the communication path in order to figure out whether a CGN is responsible for the failures.
If it turns out that the failures are caused by your ISP doing something silly such as load balancing connections over multiple CGNs that don't share connection state, you cannot fix the problem yourself by simply tweaking your SSH configuration.
If you happen to be stuck with an ISP whose unreliable CGN they refuse to fix, the only remaining options I know of are either to upgrade both client and server to kernel versions with MPTCP (Multipath TCP) support, or to use a tunnel solution designed to tolerate spontaneous changes in the port mappings on the NAT.