Ansible playbook wait until all pods running

I would try something like this (works for me):

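tasks:
# json_query needs the jmespath Python library on the control node
- name: wait for pods to come up
  shell: kubectl get pods -o json
  register: kubectl_get_pods
  until: kubectl_get_pods.stdout|from_json|json_query('items[*].status.phase')|unique == ["Running"]
  # example values; without them Ansible defaults to 3 retries, 5 seconds apart
  retries: 30
  delay: 10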

You are basically collecting the phase of every pod and reducing that list to its unique values, and the task won't complete until the list is exactly ["Running"]. For example, if not all of your pods are running yet, you will get something like ["Running", "Pending"].
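To make the filter chain easier to follow, here is a rough Python sketch of what the until expression computes. The function name all_pods_running and the sample data are made up for illustration; this is not part of Ansible or kubectl.

```python
import json

def all_pods_running(kubectl_stdout):
    """Rough equivalent of: stdout|from_json|json_query(...)|unique == ["Running"]."""
    pods = json.loads(kubectl_stdout)                      # from_json
    phases = [
        item.get("status", {}).get("phase")               # items[*].status.phase
        for item in pods.get("items", [])
    ]
    phases = [p for p in phases if p is not None]          # the projection skips missing values
    return set(phases) == {"Running"}                      # unique == ["Running"]

# Hypothetical sample output: one pod is still Pending
sample = json.dumps({"items": [
    {"status": {"phase": "Running"}},
    {"status": {"phase": "Pending"}},
]})
print(all_pods_running(sample))  # False
```

Note that an empty pod list also yields False, so the task keeps retrying while no pods exist at all.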


The kubectl wait command

Kubernetes introduced the kubectl wait command in v1.11:

CHANGELOG-1.11:

  • kubectl wait is a new command that allows waiting for one or more resources to be deleted or to reach a specific condition. It adds a kubectl wait --for=[delete|condition=condition-name] resource/string command.

CHANGELOG-1.13:

  • kubectl wait now supports condition value checks other than true using --for condition=available=false

CHANGELOG-1.14:

  • Expanded kubectl wait to work with more types of selectors.
  • kubectl wait command now supports the --all flag to select all resources in the namespace of the specified resource types.

It is not intended to wait for phases, but for conditions. I think that waiting for conditions is much more reliable than waiting for phases, because a Pod in the Running phase is not necessarily ready to serve requests. See the following conditions:

  • PodScheduled: the Pod has been scheduled to a node;
  • Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services;
  • Initialized: all init containers have completed successfully;
  • ContainersReady: all containers in the Pod are ready.
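As a small illustration of how conditions differ from the phase, the sketch below inspects the status.conditions list of a pod object such as the ones returned by kubectl get pods -o json. The helper name has_condition and the sample pod are hypothetical.

```python
def has_condition(pod, cond_type, expected="True"):
    """Return True if the pod reports the given condition with the expected status."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            # Condition status is a string: "True", "False" or "Unknown"
            return cond.get("status") == expected
    return False  # condition not reported at all

# Hypothetical pod: phase is Running, but its containers are not ready yet
pod = {"status": {"phase": "Running", "conditions": [
    {"type": "PodScheduled", "status": "True"},
    {"type": "Initialized", "status": "True"},
    {"type": "ContainersReady", "status": "False"},
    {"type": "Ready", "status": "False"},
]}}

print(has_condition(pod, "PodScheduled"))  # True
print(has_condition(pod, "Ready"))         # False
```

A phase-based check would accept this pod, while a check on the Ready condition would keep waiting.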

Using kubectl wait with Ansible

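Suppose that you are automating a Kubernetes installation with kubeadm and Ansible, and need to wait for the installation to complete. The first task polls until the control-plane pods exist, because kubectl wait returns an error right away when no resource matches the selector; the second task then waits for those pods to become Ready: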

- name: Wait for all control-plane pods become created
  shell: "kubectl get po --namespace=kube-system --selector tier=control-plane --output=jsonpath='{.items[*].metadata.name}'"
  register: control_plane_pods_created
  until: item in control_plane_pods_created.stdout
  retries: 10
  delay: 30
  with_items:
    - etcd
    - kube-apiserver
    - kube-controller-manager
    - kube-scheduler

- name: Wait for control-plane pods become ready
  shell: "kubectl wait --namespace=kube-system --for=condition=Ready pods --selector tier=control-plane --timeout=600s"
  register: control_plane_pods_ready

- debug: var=control_plane_pods_ready.stdout_lines

Example result:

TASK [Wait for all control-plane pods become created] ******************************
FAILED - RETRYING: Wait for all control-plane pods become created (10 retries left).
FAILED - RETRYING: Wait for all control-plane pods become created (9 retries left).
FAILED - RETRYING: Wait for all control-plane pods become created (8 retries left).
changed: [localhost -> localhost] => (item=etcd)
changed: [localhost -> localhost] => (item=kube-apiserver)
changed: [localhost -> localhost] => (item=kube-controller-manager)
changed: [localhost -> localhost] => (item=kube-scheduler)

TASK [Wait for control-plane pods become ready] ********************************
changed: [localhost -> localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "control_plane_pods_ready.stdout_lines": [
        "pod/etcd-localhost.localdomain condition met", 
        "pod/kube-apiserver-localhost.localdomain condition met", 
        "pod/kube-controller-manager-localhost.localdomain condition met", 
        "pod/kube-scheduler-localhost.localdomain condition met"
    ]    
}
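One detail worth knowing about the first task: until: item in control_plane_pods_created.stdout is a plain substring test against the space-separated pod names printed by the jsonpath output. A minimal Python sketch of that check follows; the helper name missing_components and the sample output are assumptions for illustration only.

```python
from typing import List

def missing_components(stdout: str, expected: List[str]) -> List[str]:
    """Return the expected name fragments that do not appear anywhere in stdout."""
    # Same test as `item in stdout` in the playbook: a substring match
    return [name for name in expected if name not in stdout]

expected = ["etcd", "kube-apiserver", "kube-controller-manager", "kube-scheduler"]

# Hypothetical jsonpath output while only two pods have been created
stdout = "etcd-localhost.localdomain kube-apiserver-localhost.localdomain"
print(missing_components(stdout, expected))
# ['kube-controller-manager', 'kube-scheduler']
```

Because it is a substring match, the node-name suffix on each pod name does not get in the way.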