I’ve been trying to find the time to write this script for quite a while, but kept getting distracted. Finally, here it is (mostly so that I can start using it myself): a script for handling backups to local destinations that may or may not be online at any given moment.

Source available on GitHub.

Quick start

  • Create a zpool (e.g. backup1) on an external device
  • Ensure that you have a snapshot on your source filesystem
  • Launch the script with the arguments “sourcepool/filesystem backup1”

That’s it. The next step is adding a second zpool (backup2) on another external device.

Launch the script with the arguments “sourcepool/filesystem backup1 backup2”. It will back up to whichever of the two backup destinations is available.

The objective: simple rotating backup solution

I’m going back to the basics here with a rotating offsite backup approach. The idea is simple. I have two big external disks that I want to rotate between home and the office. But I need a solution that’s as simple and robust as possible and that minimizes the dependency on the weakest point of the backup chain: me.

Similar to the auto-replicate script, the idea is that you should be able to pop this into a cron job and pretty much forget about it. You plug in and unplug your disks whenever it suits you (just check that there are no blinking lights on them when you do so).

Fixed schedules and rotation plans are pretty on paper, but break down in the face of the real world. So the script is designed to handle whatever it finds automatically and gracefully.

The syntax is simple: the source filesystem followed by as many destinations as you like. Unlike the auto-replicate script, this is not designed for online replication, although you can use it with FC or iSCSI mounted remote disks. It works on the assumption that the destinations are locally connected disks. To make the backups more reliable, it mounts and unmounts the destination pools for each session. This means that if you’re using an external USB disk, you’re pretty safe unplugging it even if you forget to check whether the pool is mounted or you’re rushing out the door.

The script will happily back up to multiple destinations if they are all available, one after the other rather than in parallel. If no destinations are available, it will simply note the fact and exit.
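The destination handling described above can be sketched roughly as follows. This is not the script's actual code: backup_one is a hypothetical placeholder for the real send/receive work, and the use of zpool export to "unmount" a pool is an assumption.

```shell
# Hypothetical placeholder for the real send/receive work.
backup_one() {
    echo "backing up $1 to $2"
}

# Succeeds if the pool is already imported, or can be imported right now.
pool_available() {
    zpool list -H -o name "$1" >/dev/null 2>&1 && return 0
    zpool import "$1" >/dev/null 2>&1
}

# Usage: backup_all source dest1 [dest2 ...]
backup_all() {
    src=$1
    shift
    done_count=0
    for pool in "$@"; do
        if pool_available "$pool"; then
            backup_one "$src" "$pool"   # one destination at a time, never in parallel
            zpool export "$pool"        # assumption: export so the disk is safe to unplug
            done_count=$((done_count + 1))
        else
            echo "Pool $pool is not available, skipping"
        fi
    done
    if [ "$done_count" -eq 0 ]; then
        echo "No backup destinations available, nothing to do"
    fi
    return 0
}
```

Calling backup_all data/test backup1 backup2 would then visit each pool in turn and quietly skip whichever disk is not plugged in.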

In order to handle the fact that your off-site copy may stay off-site longer than your normal snapshot retention period, a zfs hold is placed on the last snapshot copied to your backup destination. This saves you from the situation where you miss the weekly disk swap because of a holiday and the dependent snapshot is deleted from the source. The hold will prevent the snapshot from being deleted automatically and requires that you release it or force the deletion manually. See ZFS Holds in the ZFS Administrators Guide for more information.
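As a minimal sketch of the hold handling, the zfs hold and zfs release subcommands take a tag followed by the snapshot name. The tag "auto-backup" below is made up for illustration and is not necessarily the one the script uses. Taking the new hold before releasing the old one is a choice made for this sketch.

```shell
# Assumption: a made-up hold tag, used only for this illustration.
HOLD_TAG="auto-backup"

# Usage: move_hold old_snapshot new_snapshot
# Pass an empty string as old_snapshot on the very first backup.
move_hold() {
    # Hold the new snapshot first so the source is never left without a hold.
    zfs hold "$HOLD_TAG" "$2" || return 1
    if [ -n "$1" ] && [ "$1" != "$2" ]; then
        zfs release "$HOLD_TAG" "$1"
    fi
}
```

Running zfs holds on a snapshot lists the tags currently placed on it, which is a quick way to see why a snapshot refuses to be destroyed.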

But even if this should happen, the script will automatically reset the backup and recopy everything if it finds itself in a situation where there are no matching snapshots. There is also a bit of maintenance code in there to alert you by mail if you end up with an orphaned hold that blocks the deletion of a snapshot.
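The decision between an incremental update and a full reset can be sketched like this. The function names are hypothetical, and the sketch assumes snapshot names without whitespace, passed in as newline-separated lists of the part after the "@", oldest first.

```shell
# Usage: latest_common_snapshot "<source snapshot names>" "<destination snapshot names>"
# Prints the newest destination snapshot that also exists on the source.
# Returns non-zero when the two sides have nothing in common.
latest_common_snapshot() {
    match=""
    for snap in $2; do
        # -F fixed string, -x whole line: exact name match only
        if printf '%s\n' "$1" | grep -Fqx -- "$snap"; then
            match=$snap
        fi
    done
    [ -n "$match" ] && printf '%s\n' "$match"
}

# Prints "incremental from <snapshot>" or "full", depending on what matches.
plan_send() {
    if base=$(latest_common_snapshot "$1" "$2"); then
        echo "incremental from $base"
    else
        echo "full"
    fi
}
```

An empty or unrelated destination list yields "full", which corresponds to the reset-and-recopy case.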

I hope you find this useful. It’s a complement to my auto-replicate script, which is for those who can afford to have two machines running and online with decent bandwidth. That’s fine at work, but a little hard to justify on the home front, hence this script.

Snapshots

Like the auto-replicate script, this script does not create snapshots. Snapshot creation can have a lot of application-specific dependencies, so I try to keep it separate from replication. If you want to integrate the two, grab a copy of simple-snap and call it from the script that you use to launch this one, and you’ll be fine. Snapshot clean-up is handled by a separate script: auto-snap-cleanup.
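A launcher that takes a snapshot and then starts the backup might look something like the sketch below. It calls zfs snapshot directly rather than simple-snap, whose arguments are not shown here. The BACKUP_SCRIPT path is a hypothetical placeholder, and the timestamp format simply imitates the snapshot names in the sample run.

```shell
# Build a snapshot name in the timestamp style of the sample run,
# e.g. data/test@2011-09-13_02-20-10.
snapshot_name() {
    printf '%s@%s\n' "$1" "$(date +%Y-%m-%d_%H-%M-%S)"
}

# Hypothetical path; point it at your own copy of the backup script.
BACKUP_SCRIPT=${BACKUP_SCRIPT:-/usr/local/bin/auto-backup.ksh}

# Usage: snap_then_backup source dest1 [dest2 ...]
snap_then_backup() {
    # Only start the backup if the snapshot was actually created.
    zfs snapshot "$(snapshot_name "$1")" || return 1
    "$BACKUP_SCRIPT" "$@"
}
```

A wrapper like this is what would go into the cron job, so that every run has a fresh snapshot to send.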

Sample run

Here’s an example of what happens when you have a freshly formatted backup2 zpool attached and the backup1 zpool with an existing backup already in place.

auto-backup_0.1.ksh data/test backup1 backup2
Source filesystem: data/test
Checking for data/test
Source filesystem data/test exists

Checking for backup destination pool backup2
Pool backup2 is available, starting backup operations
Checking for backup2/test
Destination filesystem backup2/test does not exist - must create
Creating remote filesystem based on: data/test@2011-09-13_02-20-10
pfexec /sbin/zfs send data/test@2011-09-13_02-20-10 | pfexec /sbin/zfs recv backup2/test
Setting backup2/test to read only
Disabling auto-snapshot on backup2/test
Unmounting backup2

Checking for backup destination pool backup1
Pool backup1 is not mounted, attempting import
Pool backup1 is available, starting backup operations
Checking for backup1/test
Destination filesystem backup1/test exists
Most recent destination snapshot: test@2011-09-13_02-11-44
Matching source snapshot: data/test@2011-09-13_02-11-44
The Source snapshot does exist on the Destination, ready to send updates!
Command: pfexec /sbin/zfs send -I data/test@2011-09-13_02-11-44 data/test@2011-09-13_02-20-10 | pfexec /sbin/zfs recv -vF backup1/test
receiving incremental stream of data/test@2011-09-13_02-20-10 into backup1/test@2011-09-13_02-20-10
received 312B stream in 1 seconds (312B/sec)
Releasing hold on data/test@2011-09-13_02-11-44
Setting hold on data/test@2011-09-13_02-20-10
Number of extra snapshots to be deleted: 1
backup1/test@2011-09-12_16-22-06 will be deleted
Unmounting backup1

Internal options

snapstokeep

The snapstokeep variable determines how many snapshots you want to keep on the external backup volume. Since this is designed for off-site rotation, the only real reason to keep any additional snapshots past the current one would be to protect against propagating corrupt data to the backup. But since you’ll probably have a weekly or monthly cycle for rotation, you likely have a clean copy elsewhere anyway. In any case, this option is why I use the incremental send option (zfs send -I) to send all accumulated snapshots individually, and then delete the extras afterwards. This lets you handle retention in whatever way best suits your requirements.
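To illustrate how a snapstokeep value translates into deletions, here is a small sketch. The function name is hypothetical, and it assumes the destination snapshots are passed oldest first.

```shell
# Usage: extra_snapshots <snapstokeep> <snapshot> [<snapshot> ...]
# Snapshots must be given oldest first.
# Prints every snapshot beyond the newest <snapstokeep>, i.e. the ones to delete.
extra_snapshots() {
    keep=$1
    shift
    extra=$(( $# - keep ))
    while [ "$extra" -gt 0 ]; do
        printf '%s\n' "$1"
        shift
        extra=$((extra - 1))
    done
}
```

The list could come from something like zfs list -H -t snapshot -o name -s creation -r backup1/test, and each printed name would then be handed to zfs destroy. With three snapshots on the destination and a value of 2, only the oldest one is printed.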

contact

Email address to which a few warnings are sent directly. The default value is root@localhost.

Notes

This code is designed and tested against root filesystems only, although it should properly handle subordinate filesystems under the root. All testing was done on Solaris 11 Express, so any feedback concerning compatibility with other platforms is appreciated.

Update 6/8/2012

I added a few checks for Nexenta and OpenIndiana, as the paths to the binaries are different in each of these environments, plus some subtleties concerning the lack of pfexec. To run these scripts in non-Solaris environments, you’ll need to be running as root or have delegated the zfs and zpool commands to the user account you’re using. I haven’t looked into the state of RBAC and delegated rights in either of these environments, so you may be limited to running them as root.
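One way to cope with differing binary paths and a missing pfexec is to look things up on PATH instead of hard-coding them. The sketch below takes that approach with the POSIX command -v builtin. It is an alternative technique, not the platform checks the script actually performs, and the function names are hypothetical.

```shell
# Prints the prefix to put in front of zfs/zpool commands:
# "pfexec" where it is installed, nothing otherwise (run as root in that case).
priv_prefix() {
    if command -v pfexec >/dev/null 2>&1; then
        echo "pfexec"
    else
        echo ""
    fi
}

# Prints the full path of a binary found on PATH; fails if it is not installed.
find_bin() {
    command -v "$1"
}
```

A caller could then build its commands as $(priv_prefix) "$(find_bin zfs)" list, which degrades to a plain zfs call on systems without pfexec.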