systemctl start trafficserver wait for start
Solution 1:
This is because systemctl start returns immediately, without waiting for traffic server to be actually started. Is there a way to tell systemctl start to only return once the service is started?
systemctl start does wait for the service to be ready (unless invoked with --no-block); the service just needs to indicate its readiness properly (i.e., not use Type=simple). If the service doesn’t tell systemd when it’s ready, no variation of systemctl is-active, systemctl show, etc. will help you.
The most elegant solution, as mentioned in the comments, would be a socket unit: systemd starts the socket, traffic_line connects to it, systemd starts the service, and traffic_line blocks until the service starts to accept connections on the file descriptor it inherited from systemd.
Alternatively, you can use either Type=forking (the service forks, and the main PID exits once the forked service is ready) or Type=notify (the service calls sd_notify(0, "READY=1") once it’s ready).
Unfortunately, all of these solutions require some support from trafficserver: it has to use systemd’s socket instead of allocating its own, fork and wait appropriately in the main process, or call sd_notify. systemd can’t magically guess when the server is ready if the server doesn’t cooperate :)
After looking at trafficserver’s source code a bit, it looks like it might actually support Type=forking: the server is spawned by a dedicated traffic_cop command, which seems to wait until the server is up and perform some basic testing (at least the code looks like it). So if you change the service type, it might just work:
# /etc/systemd/system/trafficserver.service.d/type-forking.conf
[Service]
Type=forking
Solution 2:
I finally got it to work, after several attempts.
First attempt
After digging into systemctl help I found the is-active command:
$ systemctl is-active trafficserver
active
I therefore wrote a shell script to wait until the service becomes active:
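# Poll once per second until systemd reports the unit as active.
while true; do
    # Quote the substitution and use "=" so the test is valid in any POSIX shell.
    if [ "$(systemctl is-active trafficserver)" = "active" ]; then
        break
    fi
    sleep 1
done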
Unfortunately, even though this script works as expected when I test it with start/stop, I was still getting the same error when running the traffic_line commands right after it. I think that the service is reported as active before the actual processes have fully started (probably a matter of milliseconds).
Second attempt
So I tried another way. Knowing that this is the very first start of the service, I can wait until the PID file of the trafficserver manager exists. Here is what I tried:
while [ ! -f /run/trafficserver/manager.lock ]; do
    sleep 1
done
Same problem: when the trafficserver manager's PID file is written, the manager is not actually ready to receive orders yet, so I'm still getting the error.
Damn, I don't want to use a blind sleep.
Third attempt
So I ended up checking that the traffic_line command itself does not fail:
while ! traffic_line --status &> /dev/null; do
    sleep 1
done
And this works!
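One caveat: the loop above never gives up, so a service that fails to start would hang the script forever. A minimal sketch of the same polling idea with an upper bound, written as a reusable function, might look like the following. The name wait_until and its argument order are hypothetical (they do not come from systemd or trafficserver), and the timeout only counts the one-second sleeps, not the time the probe command itself takes.

```shell
# wait_until TIMEOUT COMMAND [ARGS...]   (hypothetical helper, not a systemd tool)
# Runs COMMAND repeatedly, sleeping one second between attempts.
# Returns 0 as soon as COMMAND succeeds, or 1 after TIMEOUT sleeps without success.
# Variables are deliberately not declared "local" so the function stays POSIX sh.
wait_until() {
    _wu_timeout=$1
    shift
    _wu_elapsed=0
    # Discard the probe's output; only its exit status matters.
    until "$@" > /dev/null 2>&1; do
        if [ "$_wu_elapsed" -ge "$_wu_timeout" ]; then
            return 1
        fi
        sleep 1
        _wu_elapsed=$((_wu_elapsed + 1))
    done
    return 0
}
```

With this in place, the third attempt could be written as wait_until 30 traffic_line --status, followed by a check of the return status to report a failed start. The probe command is still service-specific, but the waiting logic itself is not.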
Nice, but...
Unfortunately, the answer is very specific to the service I'm using (trafficserver), and would not directly apply to other services.
If you know a more generic answer to this question, please feel free to share it.