How to shrink root filesystem without booting a livecd
In solving this issue, the information provided at http://www.ivarch.com/blogs/oss/2007/01/resize-a-live-root-fs-a-howto.shtml was pivotal. However, that guide is for a very old version of RHEL, and much of its information is obsolete.
The instructions below are written for CentOS 7, but they should transfer easily enough to any distro that runs systemd. All commands are run as root.
1. Ensure the system is in a stable state
Make sure no one else is using it and nothing else important is going on. It's probably a good idea to stop service-providing units like httpd or ftpd, just to ensure external connections don't disrupt things in the middle.
systemctl stop httpd
systemctl stop nfs-server
# and so on...
2. Unmount all unused filesystems
umount -a
This will print a number of 'Target is busy' warnings, for the root volume itself and for various temporary/system FSs. These can be ignored for the moment. What's important is that no on-disk filesystems remain mounted, except the root filesystem itself. Verify this:
# mount alone provides the info, but column makes it possible to read
mount | column -t
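For a less eyeball-dependent check, the mount table can be filtered for block-device mounts other than the root. This is only a sketch: `list_disk_mounts` is a hypothetical helper name, and it assumes that "on-disk" means the device field starts with `/dev/`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original guide).
# Reads /proc/mounts-style lines on stdin (device, mount point, fstype, ...)
# and prints every mount point that is backed by a /dev/ device,
# except the root filesystem itself.
list_disk_mounts() {
    awk '$1 ~ /^\/dev\// && $2 != "/" { print $2 }'
}
```

Run as `list_disk_mounts < /proc/mounts`, it should print nothing once this step is complete.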
If you see any on-disk filesystems still mounted, then something is still running that shouldn't be. Check what it is using `fuser`:
# if necessary:
yum install psmisc
# then:
fuser -vm <mountpoint>
systemctl stop <whatever>
umount -a
# repeat as required...
3. Make the temporary root
mkdir /tmp/tmproot
mount -t tmpfs none /tmp/tmproot
mkdir /tmp/tmproot/{proc,sys,dev,run,usr,var,tmp,oldroot}
cp -ax /{bin,etc,mnt,sbin,lib,lib64} /tmp/tmproot/
cp -ax /usr/{bin,sbin,lib,lib64} /tmp/tmproot/usr/
cp -ax /var/{account,empty,lib,local,lock,nis,opt,preserve,run,spool,tmp,yp} /tmp/tmproot/var/
This creates a very minimal root system, which breaks (among other things) manpage viewing (no `/usr/share`), user-level customizations (no `/root` or `/home`) and so forth. This is intentional, as it constitutes encouragement not to stay in such a jury-rigged root system any longer than necessary.
At this point you should also ensure that all the necessary software is installed, as it will also assuredly break the package manager. Glance through all the steps, and make sure you have the necessary executables.
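One way to make that glance-through mechanical is to ask the shell whether each command resolves. The helper below is a hypothetical sketch, and it only checks the current `PATH`; it does not verify that a binary lives in one of the directories copied into the temporary root.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original guide).
# Prints each named command that cannot be found on the current PATH;
# prints nothing when everything is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For the commands used in this guide, that could be `missing_tools pivot_root fuser resize2fs systemctl`; any name it prints should be installed before pivoting.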
4. Pivot into the root
mount --make-rprivate / # necessary for pivot_root to work
pivot_root /tmp/tmproot /tmp/tmproot/oldroot
for i in dev proc sys run; do mount --move /oldroot/$i /$i; done
systemd causes mounts to allow subtree sharing by default (as with `mount --make-shared`), and this causes `pivot_root` to fail. Hence, we turn this off globally with `mount --make-rprivate /`. System and temporary filesystems are moved wholesale into the new root. This is necessary to make it work at all; the sockets for communication with systemd, among other things, live in `/run`, and so there's no way to make running processes close it.
5. Ensure remote access survived the changeover
systemctl restart sshd
systemctl status sshd
After restarting sshd, ensure that you can get in, by opening another terminal and connecting to the machine again via ssh. If you can't, fix the problem before moving on.
Once you've verified you can connect in again, exit the shell you're currently using and reconnect. This allows the remaining forked `sshd` to exit and ensures the new one isn't holding `/oldroot`.
6. Close everything still using the old root
fuser -vm /oldroot
This will print a list of processes still holding onto the old root directory. On my system, it looked like this:
                     USER              PID ACCESS COMMAND
/oldroot:            root           kernel mount /oldroot
                     root                1 ...e. systemd
                     root              549 ...e. systemd-journal
                     root              563 ...e. lvmetad
                     root              581 f..e. systemd-udevd
                     root              700 F..e. auditd
                     root              723 ...e. NetworkManager
                     root              727 ...e. irqbalance
                     root              730 F..e. tuned
                     root              736 ...e. smartd
                     root              737 F..e. rsyslogd
                     root              741 ...e. abrtd
                     chrony            742 ...e. chronyd
                     root              743 ...e. abrt-watch-log
                     libstoragemgmt    745 ...e. lsmd
                     root              746 ...e. systemd-logind
                     dbus              747 ...e. dbus-daemon
                     root              753 ..ce. atd
                     root              754 ...e. crond
                     root              770 ...e. agetty
                     polkitd           782 ...e. polkitd
                     root             1682 F.ce. master
                     postfix          1714 ..ce. qmgr
                     postfix         12658 ..ce. pickup
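If the table is long, the PID column can be pulled out of it for scripting. The following is a sketch under two assumptions: `fuser_pids` is a hypothetical helper name, and the table has been captured as text (the verbose table is normally written to stderr, so something like `fuser -vm /oldroot 2>&1` would be needed).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original guide).
# Reads a `fuser -vm` table on stdin and prints the numeric PIDs, one per line.
# The "kernel" pseudo-entry and the header row are skipped because their
# second column is not a number.
fuser_pids() {
    awk '
        { sub(/^[^ \t]+:[ \t]+/, "") }   # drop a leading "/oldroot:" label, if any
        $2 ~ /^[0-9]+$/ { print $2 }
    '
}
```

The resulting list is only a starting point for matching processes against services; it does not decide for you which ones are safe to restart or kill.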
You need to deal with each one of these processes before you can unmount `/oldroot`. The brute-force approach is simply `kill $PID` for each, but this can break things. To do it more softly:
systemctl | grep running
This creates a list of running services. You should be able to correlate this with the list of processes holding `/oldroot`, then issue `systemctl restart` for each of them. Some services will refuse to come up in the temporary root and enter a failed state; these don't really matter for the moment.
If the root drive you want to resize is an LVM drive, you may also need to restart some other running services, even if they do not show up in the list created by `fuser -vm /oldroot`. If you find you are unable to resize an LVM drive under Step 7, try `systemctl restart systemd-udevd`.
Some processes can't be dealt with via simple `systemctl restart`. For me these included `auditd` (which doesn't like to be killed via `systemctl`, and so just wanted a `kill -15`). These can be dealt with individually.
The last process you'll find, usually, is `systemd` itself. For this, run `systemctl daemon-reexec`.
Once you're done, the table should look like this:
                     USER              PID ACCESS COMMAND
/oldroot:            root           kernel mount /oldroot
7. Unmount the old root
umount /oldroot
At this point, you can carry out whatever manipulations you require. The original question needed a simple `resize2fs` invocation, but you can do whatever you want here; one other use case is transferring the root filesystem from a simple partition to LVM/RAID/whatever.
8. Pivot the root back
mount <blockdev> /oldroot
mount --make-rprivate / # again
pivot_root /oldroot /oldroot/tmp/tmproot
for i in dev proc sys run; do mount --move /tmp/tmproot/$i /$i; done
This is a straightforward reversal of step 4.
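To confirm that `/` is served from the block device again rather than from the tmpfs, the mount table can be queried for the root entry. This is a minimal sketch with a hypothetical helper name; it takes the last entry for `/`, on the assumption that later lines in `/proc/mounts` shadow earlier ones (CentOS 7 also lists an initial `rootfs` entry).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original guide).
# Reads /proc/mounts-style lines on stdin and prints "<device> <fstype>"
# for the last entry whose mount point is "/".
root_source() {
    awk '$2 == "/" { src = $1 " " $3 } END { if (src != "") print src }'
}
```

After a successful pivot back, `root_source < /proc/mounts` should name your block device and its filesystem type, not `none tmpfs`.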
9. Dispose of the temporary root
Repeat steps 5 and 6, except using `/tmp/tmproot` in place of `/oldroot`. Then:
umount /tmp/tmproot
rmdir /tmp/tmproot
Since it's a tmpfs, at this point the temporary root dissolves into the ether, never to be seen again.
10. Put things back in their places
Mount filesystems again:
mount -a
At this point, you should also update `/etc/fstab` and `grub.cfg` in accordance with any adjustments you made during step 7.
Restart any failed services:
systemctl | grep failed
systemctl restart <whatever>
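If there are many failed units, their names can be extracted instead of copied by hand. The sketch below assumes the input comes from `systemctl list-units --state=failed --no-legend --plain`, where the unit name is the first column; `failed_units` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original guide).
# Reads `systemctl list-units --state=failed --no-legend --plain` output
# on stdin and prints only the unit names (first column), skipping blank lines.
failed_units() {
    awk 'NF { print $1 }'
}
```

The output can then be reviewed, or fed to `systemctl restart` one unit at a time, so that a unit which fails again is easy to spot.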
Allow shared subtrees again:
mount --make-rshared /
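Whether the propagation change took effect can be read from `/proc/self/mountinfo`, where shared mounts carry a `shared:N` tag in the optional fields before the `-` separator, and private mounts carry none. The following is a sketch with a hypothetical helper name, not something the original guide used.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original guide).
# Reads /proc/self/mountinfo-style lines on stdin and prints the propagation
# tags (for example "shared:1") of the mount point given as $1, or "private"
# when that mount has no optional fields. Field 5 is the mount point; the
# optional fields start at field 7 and end at the "-" separator.
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            tags = ""
            for (i = 7; i <= NF && $i != "-"; i++)
                tags = tags (tags == "" ? "" : " ") $i
            result = (tags == "" ? "private" : tags)
        }
        END { if (result != "") print result }
    '
}
```

After `mount --make-rshared /`, `mount_propagation / < /proc/self/mountinfo` should print a `shared:` tag; the same check prints `private` after `mount --make-rprivate /`.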
Start the stopped service units - you can use this single command:
systemctl isolate default.target
And you're done.
Many thanks to Andrew Wood, who worked out this evolution on RHEL4, and steve, who provided me the link to the former.
If you are sure of what you are doing, and are not merely experimenting, you can hook into the initrd instead, which is a fast, non-interactive way to do this.
Here is how on a Debian-based system.
See the code: https://github.com/szepeviktor/debian-server-tools/blob/master/debian-setup/debian-resizefs.sh
There is another example: https://github.com/szepeviktor/debian-server-tools/blob/master/debian-setup/debian-convert-ext3-ext4.sh