What's an effective offsite backup strategy for a ZFS mirrored pool?
After much tinkering and experimentation I've found a solution, albeit with a fairly large tradeoff.
First off, the options I had to rule out:
Having a second offsite ZFS server with a mirrored pool wasn't an option due to cost. Had it been an option this would by far have been the best approach, utilizing ZFS send / receive to ship snapshots to the remote pool.
Having a second onsite ZFS mirrored pool, which I could remove disks from to take home. This is more feasible than the first option, but I would need the second pool to always have two disks onsite (or to use two data-copies on a single onsite disk). At present I have four disks, and no more space for a fifth in the server. This would be a fair approach but still not ideal.
Using zpool attach and detach to rotate the backup disk into and out of the mirrored pool. This works well, but it has to perform a full resilver every time the disk is added. That takes unacceptably long, so I couldn't rely on it.
My solution is similar to using attach and detach, but it uses online and offline instead. This has the advantage of performing a delta resilver rather than a full resilver, but the drawback that the pool always reports a DEGRADED state (the pool always has two disks; the rotating offsite disks are marked offline when they are in remote storage, and they resilver and then come online when they are onsite).
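For reference, the two approaches differ only in which zpool subcommands are used. Here is a minimal sketch, assuming a pool named tank, a permanent mirror member sdb and a rotating disk sdd (all three names are made up for illustration). The run helper is also my own addition; it only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical names: pool "tank", permanent disk "sdb", rotating disk "sdd".
POOL=tank
KEEP=sdb
ROTATE=sdd

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# attach/detach: the disk leaves the mirror entirely, so re-adding it
# means copying everything again (full resilver).
rotate_out_detach() { run zpool detach "$POOL" "$ROTATE"; }
rotate_in_attach()  { run zpool attach "$POOL" "$KEEP" "$ROTATE"; }

# offline/online: the disk stays a member of the (DEGRADED) mirror, which is
# meant to let ZFS copy only what changed while the disk was away.
rotate_out_offline() { run zpool offline "$POOL" "$ROTATE"; }
rotate_in_online()   { run zpool online "$POOL" "$ROTATE"; }
```

Calling rotate_out_offline in the default dry-run mode just prints the zpool offline command it would run, which is a cheap way to check the device names first.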
So, a quick recap and overview of my setup:
I have one ZFS server and four identical disks. ZFS is set up to use a mirrored pool. Two of the four disks are permanent members of this pool. The other two disks rotate; one is always in offsite storage, the other is part of the pool to act as a ready-to-go backup.
When it comes time to rotate the backups:
I wait for a zpool scrub to complete, to be reasonably sure the backup disk is error free.
I zpool offline the disk which will be taken remote. After it's offlined I run hdparm -Y /dev/id to spin it down. After a minute I partially remove the disk sled (just enough to ensure it has lost power) and then give it another minute before fully pulling the drive, to guarantee it has stopped spinning. The disk goes in a static bag and then a protective case and goes offsite.
I bring in the other offsite disk. It gets installed in the hotswap tray and spins up. I use zpool online to restore the disk to the pool and kick off a partial resilvering to bring it up to date.
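The rotation steps above could be scripted roughly as follows. This is only a sketch under assumptions: the pool name tank, the disk names sdd and sde, and the helper functions are all hypothetical, and it relies on zpool status printing "scrub in progress" while a scrub is running. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical pool name; the disk names are passed in as arguments.
POOL=tank

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Succeeds while the `zpool status` text on stdin reports a running scrub.
scrub_running() {
    grep -q 'scrub in progress'
}

# Block until the current scrub has finished, polling once a minute.
wait_for_scrub() {
    while zpool status "$POOL" | scrub_running; do
        sleep 60
    done
}

# Take the outgoing disk offline, then spin it down before it is pulled.
# $1 = disk name as shown in the pool, $2 = device node for hdparm.
rotate_out() {
    run zpool offline "$POOL" "$1"
    run hdparm -Y "$2"
}

# Bring the returning disk back; the resilver starts on its own.
# $1 = disk name as shown in the pool.
rotate_in() {
    run zpool online "$POOL" "$1"
}
```

Typical use would be wait_for_scrub followed by rotate_out sdd /dev/sdd, then the physical swap, then rotate_in sde. The one-minute waits before pulling the sled are deliberately left as a manual step.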
This system guarantees that at any given time I have two ONLINE mirror disks and one OFFLINE remote disk (which has been scrubbed). The fourth disk is either being resilvered or online, which has the benefit that if a running drive fails, the pool will probably still consist of two online disks.
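Because the pool always reports DEGRADED with this scheme, the pool state alone can't tell you whether something is actually wrong. One way to check the guarantee above is to count disks per state in the zpool status output. This is a rough sketch, not something from my actual setup: the function names and the pool name tank are made up, and it assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout.

```shell
#!/bin/sh
# Count leaf disks in a given state from `zpool status` text on stdin.
# $1 = pool name, $2 = state to count (ONLINE, OFFLINE, FAULTED, ...).
count_disks() {
    awk -v pool="$1" -v state="$2" '
        # Device rows have at least 5 columns; skip the pool row and vdev rows.
        NF >= 5 && $2 == state && $1 != pool &&
            $1 !~ /^(mirror|raidz|spare|log|cache)/ { n++ }
        END { print n + 0 }
    '
}

# Succeeds when at least two disks are ONLINE and none are FAULTED.
# Reads `zpool status` text on stdin; $1 = pool name.
rotation_ok() {
    status_text=$(cat)
    online=$(printf '%s\n' "$status_text" | count_disks "$1" ONLINE)
    faulted=$(printf '%s\n' "$status_text" | count_disks "$1" FAULTED)
    [ "$online" -ge 2 ] && [ "$faulted" -eq 0 ]
}
```

It could be used from cron as zpool status tank | rotation_ok tank || echo "tank needs attention", so that the expected OFFLINE disk does not trigger an alert but a lost mirror member does.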
It's worked well for the past couple weeks, but I'd still consider this a hackish approach. I'll follow up if I run into any major issues.
Update: After running with this for a couple of months I've found that in real-world use the resilvering takes the same time with detach/attach as with offline/online. In my testing I don't think I was running a scrub; my hunch is that if a drive is offline during a scrub then it requires a full resilver.
Why not zfs send your snapshots to a remote ZFS machine? I use a simple bash script for this:
#!/usr/local/bin/bash
# ZFS Snapshot BASH script by Shawn Westerhoff
# Updated 1/14/2014
### DATE VARIABLES (uses BSD date's -v option)
# D = Today's date
# D1 = Yesterday's date
# D# = Today less # days date
D=$(date +%m-%d-%Y)
D1=$(date -v-1d '+%m-%d-%Y')
D10=$(date -v-10d '+%m-%d-%Y')
D20=$(date -v-20d '+%m-%d-%Y')
# Step 1: Make the snapshots (every dataset except the pool root, tier1)
for i in $( zfs list -H -o name ); do
  if [ "$i" = tier1 ]; then
    echo "$i found, skipping"
  else
    zfs snapshot "$i@$D"
  fi
done
# Step 2: Send the snapshots to backup ZFS server
# Incremental send from yesterday's snapshot to today's.
# -c arcfour selects a fast but weak cipher; OpenSSH 7.6 and later no longer support it.
for i in $( zfs list -H -o name ); do
  # tier1 itself has no snapshot from Step 1, so skip it here as well
  if [ "$i" = tier1 ]; then
    echo "$i found, skipping"
  else
    zfs send -i "$i@$D1" "$i@$D" | ssh -c arcfour [email protected] zfs recv "$i"
  fi
done
# Step 3: Destroy snapshots that are 20 days old
for i in $( zfs list -H -o name ); do
  if [ "$i" = tier1 ]; then
    echo "$i found, skipping"
  else
    zfs destroy "$i@$D20"
  fi
done
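One thing to watch with Step 2: the incremental send fails if yesterday's snapshot is missing, for example on the first run or after a skipped day. A possible workaround is to pick the base snapshot from what actually exists. This is a sketch rather than part of the script above: the send_args helper and the dataset name tier1/data are hypothetical, and it assumes snapshot names without whitespace (which the script already assumes).

```shell
#!/bin/sh
# Read one dataset's snapshot names, oldest first, on stdin and print the
# arguments for `zfs send` needed to ship snapshot $1:
#   "-i <previous> <target>" when an earlier snapshot exists,
#   "<target>" alone (a full send) when it is the first one.
# Fails with no output if the target snapshot is not in the list.
send_args() {
    awk -v target="$1" '
        $0 == target { found = 1; exit }   # stop at the target, jump to END
        { prev = $0 }                      # remember the snapshot before it
        END {
            if (!found) exit 1
            if (prev != "") print "-i " prev " " target
            else            print target
        }
    '
}

# Example use inside the Step 2 loop (REMOTE is a placeholder for the backup host):
#   args=$(zfs list -H -d 1 -t snapshot -o name -s creation "$i" | send_args "$i@$D") &&
#       zfs send $args | ssh "$REMOTE" zfs recv "$i"
```

Note that a full send only works if the dataset does not exist on the receiving side yet, or if zfs recv is run with -F.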