AWS ECS - exit code 137
In Linux, a number of exit codes have special meanings. Of note here is the 128+n range, which means the process was terminated by signal n. In this case, 137 = 128 + 9, so the process was killed with signal 9 (SIGKILL).
This usually happens in ECS when ECS sends a stop (SIGTERM) to the container and it hasn't exited within the 30-second timeout, after which it is killed with SIGKILL.
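For reference, you can see the 128 + n convention directly in a local shell; this is just an illustration of the exit-code arithmetic, nothing ECS-specific:

```sh
# Start a long-running process, kill it with signal 9 (SIGKILL),
# then look at the exit status the shell reports for it.
sleep 100 &
kill -9 $!
wait $!
echo $?   # prints 137, i.e. 128 + 9
```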
Two common causes I'd check:
- In your Container Definition, you have `"memory": 300`. This is a hard limit on the amount of memory that specific container can use; if it reaches that limit, it gets killed (the OOM killer uses SIGKILL, so exit code 137 fits). Based on your example, you could try to reproduce this locally with `docker run -m 300M -it "imagename" /bin/bash`; if the container hits the memory limit there, it should also get killed (see the sketch after this list).
- If your container is behind a Load Balancer and has a health check enabled, make sure your application responds to the health check within the health check interval. If it doesn't, each miss counts against the container, and after the configured number of failed health checks the container is considered unhealthy; ECS then starts a new task and destroys the old one.
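A minimal way to check the first point locally, assuming the image does something memory-hungry by default (the container name `memtest` is arbitrary):

```sh
# Run the image with the same 300 MB hard limit the task definition applies.
docker run --name memtest -m 300M "imagename"

# After it exits, see how it exited: an OOM kill shows up as
# exit code 137 with OOMKilled set to true.
docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' memtest
```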
One thing to note here is that ECS supports both `memory` and `memoryReservation` in the task definition, and only one of them needs to be set. `memory`, as mentioned above, is a hard limit, and any container that hits it gets killed. `memoryReservation` is a soft limit; your container can exceed it, up to the total memory of the ECS instance. You can also combine the two, in which case `memoryReservation` is used to determine how much memory is available for additional tasks on an instance, and `memory` is used to kill the task if it overruns the number provided (see the sketch below).
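A sketch of what the combined form might look like in a container definition fragment (the name, image, and numbers are placeholders, not recommendations):

```json
{
  "containerDefinitions": [
    {
      "name": "app",
      "image": "imagename",
      "memoryReservation": 300,
      "memory": 512
    }
  ]
}
```

With this, the scheduler reserves 300 MiB for the container, but it is only killed if it exceeds the 512 MiB hard limit.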
I had a similar issue and my container was killed with exactly the same exit code (137).
It turned out it was being killed because of an incorrectly configured health check on my Application Load Balancer.
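If you suspect the same thing, one way to inspect and loosen the target group's health check is via the AWS CLI; the ARN, path, and numbers below are placeholders for your own setup:

```sh
# See why targets are currently failing (placeholder target group ARN).
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:region:account:targetgroup/my-tg/abc123

# Give the application more time to respond before it is marked unhealthy.
aws elbv2 modify-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:region:account:targetgroup/my-tg/abc123 \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --health-check-timeout-seconds 10 \
  --unhealthy-threshold-count 3
```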