May 13, 2003

using rsync for snapshots

Following up on my note about the wonders of rsync, here's a very interesting article by Mike Rubel about using rsync to create snapshot-style backups. (Thanks to Jeremy Zawodny for the pointer.)

Filesystem snapshots are a useful notion. I first came across them in Network Appliance's Filer products, which use the WAFL file system. Snapshots give you immediate, online access to earlier versions of your files. On the NetApp boxes, snapshots work with a copy-on-write technique: when a file captured in a snapshot is later changed, the original data is left alone (and is still accessible from the snapshot), while the changed data consumes new space in the filesystem.
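To make the copy-on-write idea concrete, here is a toy model in Python. It is only a sketch of the general technique, not a description of how WAFL is actually built: the class name `ToyCowStore`, its methods, and the snapshot label are all invented for illustration. A snapshot records which block each file pointed to at that moment, and a later write puts the new data in a fresh block instead of overwriting the old one.

```python
class ToyCowStore:
    """Toy in-memory model of copy-on-write snapshots (illustrative only)."""

    def __init__(self):
        self.blocks = {}     # block id -> stored data
        self.live = {}       # file name -> block id in the live filesystem
        self.snapshots = {}  # snapshot label -> {file name: block id}
        self._next_id = 0

    def write(self, name, data):
        # Never overwrite an existing block: new data always gets a new
        # block, so any snapshot still pointing at the old block is unaffected.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.live[name] = block_id

    def snapshot(self, label):
        # A snapshot copies only the name -> block table, not the data.
        self.snapshots[label] = dict(self.live)

    def read(self, name, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]
```

Writing a file, taking a snapshot, and then writing the file again leaves two blocks in the store: the live view reads the new data, and the snapshot still reads the old. A real filesystem would also reclaim blocks that no snapshot references any more, which this toy leaves out.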

Quoting the abstract from Mike Rubel's article:

This document describes a method for generating automatic rotating "snapshot"-style backups on a Unix-based system, with specific examples drawn from the author's GNU/Linux experience. Snapshot backups are a feature of some high-end industrial file servers; they create the illusion of multiple, full backups per day without the space or processing overhead. All of the snapshots are read-only, and are accessible directly by users as special system directories. It is often possible to store several hours, days, and even weeks' worth of snapshots with slightly more than 2x storage. This method, while not as space-efficient as some of the proprietary technologies (which, using special copy-on-write filesystems, can operate on slightly more than 1x storage), makes use of only standard file utilities and the common rsync program, which is installed by default on most Linux distributions. Properly configured, the method can also protect against hard disk failure, root compromises, or even back up a network of heterogeneous desktops automatically.
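For a feel of how rotating snapshots can look like full backups without costing the space of full backups, here is a rough Python sketch of the hard-link trick. It is not Mike Rubel's script (his article builds this from rsync and standard file utilities); the function name `take_snapshot`, the `snapshot.N` directory naming, and the `keep` parameter are my own assumptions for illustration. Files unchanged since the previous snapshot are hard-linked to it, so only changed files take up new data space.

```python
import filecmp
import os
import shutil


def take_snapshot(source, snapshot_root, keep=3):
    """Rotate snapshot.0 .. snapshot.(keep-1), then build a new snapshot.0.

    Illustrative sketch: handles regular files and directories only and
    ignores symlinks, ownership, and permissions.
    """
    os.makedirs(snapshot_root, exist_ok=True)

    def snap(i):
        return os.path.join(snapshot_root, f"snapshot.{i}")

    # Drop the oldest snapshot, then shift the others up by one.
    if os.path.isdir(snap(keep - 1)):
        shutil.rmtree(snap(keep - 1))
    for i in range(keep - 2, -1, -1):
        if os.path.isdir(snap(i)):
            os.rename(snap(i), snap(i + 1))

    previous, newest = snap(1), snap(0)
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(newest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = os.path.normpath(os.path.join(previous, rel, name))
            if os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged: share the data with the previous snapshot.
                os.link(old, dst)
            else:
                # New or changed: store a fresh copy.
                shutil.copy2(src, dst)
    return newest
```

Calling `take_snapshot("/home", "/backup")` from cron every few hours would give a set of directories that each look like a complete copy of `/home` (both paths are just examples). The snapshot directory has to live on a single filesystem, since hard links cannot cross filesystems. This sketch also does nothing to make the snapshots read-only, which the abstract lists as part of the method.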