How to wait for server restart using Ansible?
2018 Update
As of 2.3, Ansible now ships with the wait_for_connection
module, which can be used for exactly this purpose.
#
## Reboot
#
- name: (reboot) Reboot triggered
command: /sbin/shutdown -r +1 "Ansible-triggered Reboot"
async: 0
poll: 0
- name: (reboot) Wait for server to restart
wait_for_connection:
delay: 75
The shutdown -r +1 prevents a return code of 1 to be returned and have ansible fail the task. The shutdown is run as an async task, so we have to delay the wait_for_connection
task at least 60 seconds. 75 gives us a buffer for those snowflake cases.
wait_for_connection - Waits until remote system is reachable/usable
Most reliable I've with 1.9.4 got is (this is updated, original version is at the bottom):
- name: Example ansible play that requires reboot
sudo: yes
gather_facts: no
hosts:
- myhosts
tasks:
- name: example task that requires reboot
yum: name=* state=latest
notify: reboot sequence
handlers:
- name: reboot sequence
changed_when: "true"
debug: msg='trigger machine reboot sequence'
notify:
- get current time
- reboot system
- waiting for server to come back
- verify a reboot was actually initiated
- name: get current time
command: /bin/date +%s
register: before_reboot
sudo: false
- name: reboot system
shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
async: 1
poll: 0
ignore_errors: true
- name: waiting for server to come back
local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=220
sudo: false
- name: verify a reboot was actually initiated
# machine should have started after it has been rebooted
shell: (( `date +%s` - `awk -F . '{print $1}' /proc/uptime` > {{ before_reboot.stdout }} ))
sudo: false
Note the async
option. 1.8 and 2.0 may live with 0
but 1.9 wants it 1
. The above also checks if machine has actually been rebooted. This is good because once I had a typo that failed reboot and no indication of the failure.
The big issue is waiting for machine to be up. This version just sits there for 330 seconds and never tries to access host earlier. Some other answers suggest using port 22. This is good if both of these are true:
- you have direct access to the machines
- your machine is accessible immediately after port 22 is open
These are not always true so I decided to waste 5 minutes compute time.. I hope ansible extend the wait_for module to actually check host state to avoid wasting time.
btw the answer suggesting to use handlers is nice. +1 for handlers from me (and I updated answer to use handlers).
Here's original version but it it not so good and not so reliable:
- name: Reboot
sudo: yes
gather_facts: no
hosts:
- OSEv3:children
tasks:
- name: get current uptime
shell: cat /proc/uptime | awk -F . '{print $1}'
register: uptime
sudo: false
- name: reboot system
shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
async: 1
poll: 0
ignore_errors: true
- name: waiting for server to come back
local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=300
sudo: false
- name: verify a reboot was actually initiated
# uptime after reboot should be smaller than before reboot
shell: (( `cat /proc/uptime | awk -F . '{print $1}'` < {{ uptime.stdout }} ))
sudo: false
Ansible >= 2.7 (released in Oct 2018)
Use the built-in reboot module:
- name: Wait for server to restart
reboot:
reboot_timeout: 3600
Ansible < 2.7
Restart as a task
- name: restart server
shell: 'sleep 1 && shutdown -r now "Reboot triggered by Ansible" && sleep 1'
async: 1
poll: 0
become: true
This runs the shell command as an asynchronous task, so Ansible will not wait for end of the command. Usually async
param gives maximum time for the task but as poll
is set to 0, Ansible will never poll if the command has finished - it will make this command a "fire and forget". Sleeps before and after shutdown
are to prevent breaking the SSH connection during restart while Ansible is still connected to your remote host.
Wait as a task
You could just use:
- name: Wait for server to restart
local_action:
module: wait_for
host={{ inventory_hostname }}
port=22
delay=10
become: false
..but you may prefer to use {{ ansible_ssh_host }}
variable as the hostname and/or {{ ansible_ssh_port }}
as the SSH host and port if you use entries like:
hostname ansible_ssh_host=some.other.name.com ansible_ssh_port=2222
..in your inventory (Ansible hosts
file).
This will run the wait_for task on the machine running Ansible. This task will wait for port 22 to become open on your remote host, starting after 10 seconds delay.
Restart and wait as handlers
But I suggest to use both of these as handlers, not tasks.
There are 2 main reason to do this:
code reuse - you can use a handler for many tasks. Example: trigger server restart after changing the timezone and after changing the kernel,
trigger only once - if you use a handler for a few tasks, and more than 1 of them will make some change => trigger the handler, then the thing that handler does will happen only once. Example: if you have a httpd restart handler attached to httpd config change and SSL certificate update, then in case both config and SSL certificate changes httpd will be restarted only once.
Read more about handlers here.
Restarting and waiting for the restart as handlers:
handlers:
- name: Restart server
command: 'sleep 1 && shutdown -r now "Reboot triggered by Ansible" && sleep 1'
async: 1
poll: 0
ignore_errors: true
become: true
- name: Wait for server to restart
local_action:
module: wait_for
host={{ inventory_hostname }}
port=22
delay=10
become: false
..and use it in your task in a sequence, like this, here paired with rebooting the server handler:
tasks:
- name: Set hostname
hostname: name=somename
notify:
- Restart server
- Wait for server to restart
Note that handlers are run in the order they are defined, not the order they are listed in notify
!
You should change the wait_for task to run as local_action, and specify the host you're waiting for. For example:
- name: Wait for server to restart
local_action:
module: wait_for
host=192.168.50.4
port=22
delay=1
timeout=300