Pod in pending state due to Insufficient CPU
I had the same issue when attempting to deploy to the cluster. In my case, unneeded pods were being created automatically for test branches of my application. To diagnose the issue, I ran:
kubectl get po
kubectl describe po <pod-name>
- for one of the existing pods, to check which node it is running on
kubectl get nodes
kubectl describe node <node-name>
- to see the CPU usage for the node being used by the existing pod, as below:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1010m (93%)  4 (210%)
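To pull just the cpu row out of that output, the describe output can be filtered. A hedged sketch: here the sample table from above stands in for a live cluster; on a real cluster you would pipe kubectl describe node <node-name> into the same awk filter.

```shell
# Save the sample "Allocated resources" output shown above; on a real
# cluster, replace this with: kubectl describe node <node-name> > node.txt
cat > node.txt <<'EOF'
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1010m (93%)  4 (210%)
EOF

# Print the node's requested CPU and its percentage of allocatable CPU.
awk '/^ *cpu /{print "cpu requested:", $2, $3}' node.txt
```

With the sample data this prints `cpu requested: 1010m (93%)`, which is the number to watch: once requests approach 100%, new pods stay Pending with "Insufficient cpu".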
Then, the unneeded pods could be deleted using:
kubectl get deployments
kubectl delete deployment ....
- the name of the deployment for the pod I needed to delete.
Once I deleted enough unused pods, I was able to deploy new ones.
I recently had this same issue. After some research, I found that GKE has a default LimitRange with the default CPU request set to 100m; this can be checked by running kubectl get limitrange -o=yaml.
It's going to display something like this:
apiVersion: v1
items:
- apiVersion: v1
  kind: LimitRange
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"LimitRange","metadata":{"annotations":{},"name":"limits","namespace":"default"},"spec":{"limits":[{"defaultRequest":{"cpu":"100m"},"type":"Container"}]}}
    creationTimestamp: 2017-11-16T12:15:40Z
    name: limits
    namespace: default
    resourceVersion: "18741722"
    selfLink: /api/v1/namespaces/default/limitranges/limits
    uid: dcb25a24-cac7-11e7-a3d5-42010a8001b6
  spec:
    limits:
    - defaultRequest:
        cpu: 100m
      type: Container
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
This default request is applied to every container that doesn't specify its own. So, for instance, on a 4-core node, assuming each of your pods runs 2 containers, only around ~20 pods can be scheduled before the node's CPU is fully requested.
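That figure can be checked with a quick back-of-the-envelope calculation (the 4-core node and 2 containers per pod are the hypothetical numbers from the example above, not values from any real cluster):

```shell
# Back-of-the-envelope pod capacity check, using the example's numbers.
node_cpu_m=4000          # 4 cores expressed in millicores
default_request_m=100    # GKE's default LimitRange CPU request
containers_per_pod=2     # assumed containers created per pod

max_containers=$((node_cpu_m / default_request_m))   # how many 100m requests fit
max_pods=$((max_containers / containers_per_pod))

echo "$max_pods"   # -> 20
```

In practice the real ceiling is lower, since system pods and kubelet reservations also consume part of the node's allocatable CPU.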
The "fix" here is to change the default LimitRange to set your own limits, and then remove the old pods so they are recreated with the updated values, or to set resource requests directly on the pods when creating them.
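For the first option, a replacement LimitRange might look like the following (a sketch only; the name, namespace, and 50m value are illustrative, not recommendations):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: limits          # hypothetical name; match your existing LimitRange
  namespace: default
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 50m          # illustrative value; pick one that fits your workloads
```

For the second option, you would set resources.requests.cpu explicitly in each container spec, which overrides the LimitRange default for that container (see the first reading link below).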
Some reading material:
https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod
https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#how-pods-with-resource-limits-are-run
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits
Yes, overcommit is currently not supported. It's on the list of planned improvements: http://kubernetes.io/docs/user-guide/compute-resources. Related issue on GitHub: https://github.com/kubernetes/kubernetes/issues/168
PS: in theory you can define a custom node capacity, but I'm not sure.
For me, creating all the deployments and services in a different namespace (other than default) fixed this issue.