Over a year later, I have found a solution. First, some additional clarification on the environment and what I believe is happening, along with speculation about a possible bug in the Docker Engine.

The Compose file I am using now is launching a lightly modified version of the 'official' Alpine NGINX image, which uses COPY to load in the healthcheck script and adds HEALTHCHECK explicitly in the image. This image is used for an nginx service, and is used in concert with an image running jwilder/docker-gen to use container metadata from Docker to generate NGINX configuration files. This container is running as a service named nginx-gen. When containers change, configuration is re-generated, and if there are any changes, a SIGHUP is sent to the nginx service.

What I discovered is the following:

  • If all services are launched together, the nginx service never runs healthchecks;
  • If the nginx service is restarted soon after launch, healthchecks complete normally;
  • If the nginx service is launched by itself, healthchecks complete normally;
  • If all services other than nginx-gen are launched together, healthchecks complete normally;
  • If all services are launched together, but nginx-gen is modified to sleep 60 before doing anything, healthchecks complete normally.

So, it appears that there is some obscure interaction between signal handling, Docker, and NGINX. If a SIGHUP is sent to an NGINX process in a container before the first healthcheck has run in that container, no healthchecks ever run.
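One way to observe this from the host is to compare how many healthcheck runs Docker has recorded before and after sending the signal. The following is only a rough sketch: the function names are made up for illustration, and it assumes the container has a HEALTHCHECK defined and was started very recently (Docker keeps only a limited number of recent entries in the health log, so the count stops growing after a while).

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any Docker tooling.

# Print the number of healthcheck runs Docker has recorded for a container.
health_run_count() {
    docker inspect --format '{{len .State.Health.Log}}' "$1"
}

# Send SIGHUP to the container, wait, then report whether any new
# healthcheck run was recorded in the meantime.
hup_then_check() {
    container="$1"
    wait_seconds="${2:-60}"
    before="$(health_run_count "$container")"
    docker kill --signal=HUP "$container" >/dev/null
    sleep "$wait_seconds"
    after="$(health_run_count "$container")"
    echo "healthcheck runs before=${before} after=${after}"
    # Succeeds only if at least one healthcheck ran after the signal.
    [ "$after" -gt "$before" ]
}
```

For example, `hup_then_check my-nginx 60` (with `my-nginx` as a placeholder container name) would fail in the situation described above, where the count stays at 0.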

The final iteration I came up with modifies the nginx-gen container to poll the health of the nginx container. It looks up the health status of a container with a defined label in a loop, with a short sleep. Once the nginx container reports healthy, nginx-gen proceeds to generate configuration files. I also changed the notification method to docker exec a script to explicitly test and reload configuration in the nginx container, rather than rely on SIGHUP.
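A minimal sketch of that approach follows. It is not my exact script: the label `com.example.role=nginx` and the function names are hypothetical, and it assumes the nginx-gen container has the docker CLI available and access to the Docker socket.

```shell
#!/bin/sh
# Sketch only: the label value and all function names below are hypothetical.

# Print the ID of the first running container that carries the given label.
find_container() {
    docker ps --quiet --filter "label=$1" | head -n 1
}

# Poll until the labelled container reports "healthy"; give up after max_tries.
wait_for_healthy() {
    label="$1"
    max_tries="${2:-60}"
    delay="${3:-2}"
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        id="$(find_container "$label")"
        if [ -n "$id" ]; then
            status="$(docker inspect --format '{{.State.Health.Status}}' "$id" 2>/dev/null || true)"
            if [ "$status" = "healthy" ]; then
                return 0
            fi
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}

# Validate the generated configuration and reload NGINX only if it passes,
# using docker exec instead of sending SIGHUP.
reload_nginx() {
    id="$(find_container "$1")"
    [ -n "$id" ] || return 1
    docker exec "$id" sh -c 'nginx -t && nginx -s reload'
}
```

The nginx-gen entrypoint would then call `wait_for_healthy "com.example.role=nginx"` before starting docker-gen, and use `reload_nginx "com.example.role=nginx"` as the notification command.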

End result: I can docker-compose up -d, and everything eventually reports healthy without further intervention. Success!

I attempted the same script and encountered the same issue. I changed the script to run like this instead:


# Report healthy only if the nginx service is running.
if service nginx status; then
    exit 0
else
    exit 1
fi

Running this in the docker container resulted in successful health checks.

For the official Alpine NGINX image you can also do:

      test: ["CMD-SHELL", "wget -O /dev/null http://localhost || exit 1"]
      timeout: 10s

wget is part of the standard image. This downloads your index.html/php/whatever to nowhere (/dev/null); if the request fails, the check exits with 1, and if it hangs, the check times out and fails.

I think that there is no need for a custom script in this case.

Try just changing your healthcheck test to:

test: ["CMD", "service", "nginx", "status"]

That works fine for me.

Try to use " instead of ' as well, just in case :)


If you really want to force an exit 1 in case of failure, you could use:

test: service nginx status || exit 1