How to debug "ImagePullBackOff"?
You can use the 'describe pod' command.
For OpenShift use:
oc describe pod <pod-id>
For vanilla Kubernetes:
kubectl describe pod <pod-id>
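For example, for the pod from the events below:

kubectl describe pod coredns-4224169331-9nhxj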
Examine the events in the output. In my case it shows Back-off pulling image coredns/coredns:latest
This means the image coredns/coredns:latest cannot be pulled from the Internet.
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
5m 5m 1 {default-scheduler } Normal Scheduled Successfully assigned coredns-4224169331-9nhxj to 192.168.122.190
5m 1m 4 {kubelet 192.168.122.190} spec.containers{coredns} Normal Pulling pulling image "coredns/coredns:latest"
4m 26s 4 {kubelet 192.168.122.190} spec.containers{coredns} Warning Failed Failed to pull image "coredns/coredns:latest": Network timed out while trying to connect to https://index.docker.io/v1/repositories/coredns/coredns/images. You may want to check your internet connection or if you are behind a proxy.
4m 26s 4 {kubelet 192.168.122.190} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "coredns" with ErrImagePull: "Network timed out while trying to connect to https://index.docker.io/v1/repositories/coredns/coredns/images. You may want to check your Internet connection or if you are behind a proxy."
4m 2s 7 {kubelet 192.168.122.190} spec.containers{coredns} Normal BackOff Back-off pulling image "coredns/coredns:latest"
4m 2s 7 {kubelet 192.168.122.190} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "coredns" with ImagePullBackOff: "Back-off pulling image \"coredns/coredns:latest\""
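If the describe output is long, you can also list only the pod's events; a minimal sketch, assuming the pod name from the output above:

kubectl get events --field-selector involvedObject.name=coredns-4224169331-9nhxj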
Additional debugging steps
- Try to pull the Docker image and tag manually on your computer (see the example commands after this list)
- Identify the node by doing a 'kubectl/oc get pods -o wide'
- SSH into the node (if you can) that cannot pull the Docker image
- Check that the node can resolve the DNS name of the Docker registry, for example by performing a ping
- Try to pull the Docker image manually on the node
- If you are using a private registry, check that your secret exists and that the secret is correct. Your secret should also be in the same namespace as the pod. Thanks swenzel
- Some registries have firewalls that limit access by IP address; the firewall may block the pull
- Some CIs create deployments with temporary Docker secrets, so the secret expires after a few days (you are asking for production failures...)
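A minimal sketch of these checks, assuming the image coredns/coredns:latest from the events above; <secret-name> and <namespace> are placeholders for your own values:

# Find the node the pod was scheduled on
kubectl get pods -o wide

# On your computer or on that node: try to pull the image manually
docker pull coredns/coredns:latest

# On the node: check that the registry's DNS name resolves
ping index.docker.io

# For a private registry: check that the pull secret exists in the pod's namespace
kubectl get secret <secret-name> -n <namespace>
kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode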
Have you tried to edit the pod to see what's wrong? (I had the wrong image location.)
kubectl edit pods arix-3-yjq9w
or even delete your pod?
kubectl delete pod arix-3-yjq9w
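If the pod is owned by a Deployment, deleting it only triggers a replacement with the same spec, so fix the image reference on the Deployment instead; a minimal sketch, assuming a hypothetical Deployment and container both named arix:

kubectl set image deployment/arix arix=<registry>/<image>:<tag>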