Automated incremental off-site database backups using ZFS send

Posted by Roy Hooper Wed, 01 Apr 2009 02:07:00 GMT

Something that I've fought with for a long time is how to make good nightly off-site backups of MySQL and PostgreSQL. Having recently switched back to my favourite *nix for server use, FreeBSD, I am now using ZFS on my server. (Boy is it fast. But that's a post for some other time.)

One of the advantages of using ZFS is that it supports serialising and deserialising file systems or snapshots, so you can transport them elsewhere and restore them, possibly on the same system into a new pool. One such use is remote backups of databases.

In the past, I've tried nightly full dumps using mysqldump or pg_dump and then rsyncing them along with the rest of my data for nightly backups. This meant that I had an ever-growing quantity of data to ship. Other ideas I had included using diff to generate incremental deltas which could be sequentially patched to generate the latest database dump. This approach could work well, and would only require two full dumps to be kept around (today's and yesterday's)... but I never got around to it. And it would still require periodic full dumps to be copied over, since you really don't want to rely on trusting that many patches will apply cleanly. Plus, depending on your workload, the diffs could be as big as or bigger than the full dump, since diff is line-based. Any modification to one column in every row of a big database would result in a huge delta - for example, adding a new column.

Back to ZFS. My idea is simple: use Ralf S. Engelschall's snapshot tools for FreeBSD to keep about a week's worth of nightly snapshots (plus some weekly ones), and send only the nightly ones to a remote site every night. The trick with remote backups is that there's a chance that a night will be missed. ZFS doesn't do all the necessary magic to determine if snapshots were renamed and to determine which ones aren't yet applied.

That's where my simple little sync_remote_snapshots.pl script comes in. This little script is intended to run from the receiver ("local"). It uses ssh as the transport, something you'll have to set up to work without passphrase prompts while the script is running (ideally using ssh-agent).

The script starts by listing the local and remote ZFS snapshots (using zfs list -t snapshot -r zpool/fsname), parses the output, then matches up the local and remote snapshots by timestamp. Matching by timestamp is necessary because the snapshot numbers won't match up between the two machines after the next snapshot is run: the FreeBSD snapshot script renumbers the snapshots so that @daily.0 is always the newest daily snapshot. If two days go by, then @daily.0 on the local end would be for two days prior.

In order to be able to run the incremental backup, we need to first rename all the local snapshots just like the remote side does, purge any old local snapshots we don't want any more, and then actually do the backup.
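
To make the matching step concrete, here is a minimal Python sketch of the idea (not the actual sync_remote_snapshots.pl). It assumes the zfs list output has already been parsed into dictionaries of snapshot name to creation timestamp; the function name plan_renames, the dataset name tank/db and the timestamps are all made up for illustration.

```python
def plan_renames(local, remote):
    """Match snapshots by creation timestamp.

    local and remote map snapshot name -> creation timestamp
    (hypothetical, already-parsed input). Returns:
      renames: local name -> name the remote side now uses
      stale:   local snapshots the remote no longer has
      missing: remote snapshots not yet present locally
    """
    remote_by_time = {ts: name for name, ts in remote.items()}
    local_times = set(local.values())

    renames, stale = {}, []
    for name, ts in local.items():
        target = remote_by_time.get(ts)
        if target is None:
            stale.append(name)        # candidate for purging
        elif target != name:
            renames[name] = target    # same snapshot, new number

    missing = sorted(n for n, ts in remote.items() if ts not in local_times)
    return renames, stale, missing


# The receiver ("local") is one night behind the sender ("remote").
local = {"tank/db@daily.0": 1238457600, "tank/db@daily.1": 1238371200}
remote = {
    "tank/db@daily.0": 1238544000,
    "tank/db@daily.1": 1238457600,
    "tank/db@daily.2": 1238371200,
}

renames, stale, missing = plan_renames(local, remote)
print(renames)
print(stale)
print(missing)
```

Note that the renames have to be applied oldest first (here daily.1 to daily.2 before daily.0 to daily.1), otherwise a rename would collide with a name that is still in use.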

This script takes care of all that. It also handles the case where the snapshot failed mid-way (loss of network, out of space, etc), leaving it such that the local snapshot numbers don't start at zero.
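
Once the names line up, the incremental transfer itself is a zfs send -i on the sending side piped into zfs receive on the receiving side. The Python sketch below only builds the two command lines; the host name, dataset names and the helper incremental_pull_commands are hypothetical, and the real script may well do this differently.

```python
import shlex


def incremental_pull_commands(remote_host, remote_fs, base_snap, new_snap, local_fs):
    """Build (but do not run) the two halves of an incremental pull.

    The first command runs `zfs send -i` on the remote host via ssh,
    the second feeds that stream into `zfs receive` locally.
    All names passed in are illustrative assumptions.
    """
    send = ["zfs", "send", "-i",
            f"{remote_fs}@{base_snap}", f"{remote_fs}@{new_snap}"]
    # ssh takes the remote command as a single string, so quote each part.
    ssh_cmd = ["ssh", remote_host, " ".join(shlex.quote(part) for part in send)]
    recv_cmd = ["zfs", "receive", local_fs]
    return ssh_cmd, recv_cmd


ssh_cmd, recv_cmd = incremental_pull_commands(
    "db.example.org", "tank/db", "daily.1", "daily.0", "backup/db")
print(" ".join(ssh_cmd), "|", " ".join(recv_cmd))
```

To actually run it you would connect the two with a pipe, for example with two subprocess.Popen objects where the stdout of the first is the stdin of the second.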

There are definitely situations it doesn't cover, but it does what I need and uses way less bandwidth than rsyncing compressed nightly dumps.

Recovering lost photos from a flash card

Posted by Roy Hooper Sun, 16 Mar 2008 22:42:00 GMT

Twice now, I’ve helped friends recover lost photos from flash cards. Both friends are technical people who generally know what they’re doing … but they both got bitten by bugs in commercial software that left them without their photos for one reason or another.

The loss of photos is a very upsetting event to a photographer. Fortunately, if you stop using the card the instant you discover the problem, more than likely, your photos can be recovered. If you never use the on-camera delete to remove anything but the most recently taken photo(s), you’re in even better shape, since no fragmentation will have happened!

So how do you recover photos when there’s NO filesystem information left? It turns out it’s pretty easy for some file formats and tools that ignore trailing junk in files. The approach is simple: snapshot the raw card contents and then look for the start-of-file signature for your specific type of photo file. Most importantly, look for it only at the start of 512-byte boundaries, the typical “block” size.

To determine your file’s start signature, you’ll need a few sample files from your camera. If you have a Canon Rebel XTi and shoot in RAW mode, you’ll be getting CR2 files. These files have a nice long signature at the start, making detection and recovery a breeze.

To determine if there’s a good unique signature, you can do something like this:

for f in IMG*.CR2; do hexdump -n 16 $f; done | sort | uniq -c
     6 0000000 4949 002a 0010 0000 5243 0002 3006 0001
     6 0000010

If there were 6 files in the directory, and you only get 2 lines of output, you’ve found yourself a reliable 16-byte signature. More than enough to detect the start of files in most cases, especially when aligned to the start of a 512-byte block.

The above signature is what’s needed for a CR2 file.
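
The same check can be sketched in Python. It also prints the bytes in file order, which is easier to read than the hexdump output above (hexdump shows little-endian 16-bit words, so 4949 002a is really the bytes 49 49 2a 00). The helper name common_signature and the stand-in sample files are made up for this example; point it at your own IMG*.CR2 files instead.

```python
import os
import tempfile


def common_signature(paths, length=16):
    """Return the first `length` bytes if every file starts with the
    same bytes, otherwise None."""
    signatures = set()
    for path in paths:
        with open(path, "rb") as f:
            signatures.add(f.read(length))
    if len(signatures) != 1:
        return None
    sig = signatures.pop()
    return sig if len(sig) == length else None


# Stand-in sample files: the CR2 header bytes followed by different data.
header = bytes.fromhex("49492a00100000004352020006300100")
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, payload in enumerate([b"first photo", b"second photo"]):
        path = os.path.join(tmp, f"IMG_{i:04d}.CR2")
        with open(path, "wb") as f:
            f.write(header + payload)
        paths.append(path)
    sig = common_signature(paths)

print(sig.hex() if sig else "no common signature")
```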

The first step is to obtain an image of the flash card. I don’t recommend ever working directly on the flash card when doing recovery, so we read it once and save the image for all further processing.

So how do you grab the contents of the flash for safe processing? Under Linux, FreeBSD and OSX (and other unix platforms), you use dd. We do this because we don’t want any extra headers… just the raw bytes. This ensures the disk remains aligned at 512-byte boundaries. Some disk image containers might happen to keep things aligned to the block boundaries. I’ve never checked.

The specific instructions for OSX are as follows:

  1. Using a card reader, mount the flash card (like usual: just insert it)
  2. Start Terminal.app and run df to find the device name of the newly mounted flash card. We’re specifically interested in the /dev/diskNsN device name, since we’re going to need to directly access it. Here’s an example:
    /dev/disk4s1      999344    978464     20880    98%    /Volumes/EOS_DIGITAL
  3. Next, we need to unmount the disk without causing the device to be removed. We can either use umount /dev/disk4s1 (or diskutil unmount /dev/disk4s1) or go to Disk Utility, select the right volume, then use the “Unmount” toolbar icon to unmount it without ejecting it.
  4. Finally, we create the image we’re going to work with using dd.
    dd if=/dev/disk4s1 of=flashimage.dat
    Depending on the size, speed of the card, your card reader, and USB interface, this could take a long time. If you need to know how far it’s gotten, open a new Terminal window and run “du -h flashimage.dat”

Once the copy of the image has been made, we’ll want to run a quick recovery script. This script relies on the fact that Canon Raw conversion utilities tend to ignore trailing junk. If yours don’t, grab dcraw (http://cybercom.net/~dcoffin/dcraw/), which is also available via MacPorts (http://www.macports.org/), and convert the files to a format you can use (like TIFF).

Here’s the recovery script I hacked together to recover missing CR2 files for my friend:

#!/usr/bin/perl

use strict;
use warnings;

my @signature = qw(49 49 2a 00 10 00 00 00 43 52 02 00 06 30 01 00);
my $signature = pack("H*", join("", @signature));
my $siglen = length($signature);
my $blocksize = 512;

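# Read the image as raw bytes
open(my $in, '<', "flashimage.dat") || die "flashimage.dat: $!";
binmode($in);

# Skip 2gb to get past existing files
seek($in, 2 * 1024 * 1024 * 1024, 0);

my $block = "";
my $imgno = 0;
my $out;
while (read($in, $block, $blocksize)) {
    if (substr($block, 0, $siglen) eq $signature) {
        # A signature at the start of a block begins a new image,
        # so finish the previous output file first
        close($out) if $out;
        $imgno++;
        print "starting $imgno\n";
        open($out, '>', sprintf("found%04d.CR2", $imgno)) || die "$!";
        binmode($out);
    }
    # Everything up to the next signature belongs to the current image
    print $out $block if $out;
}
close($out) if $out;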

The end result should be a whole bunch of .CR2 files named from found0001.CR2 through the final number.
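
Before carving anything, it can be handy to do a dry run that only reports where the signature shows up on a block boundary. Here is a small Python sketch of that scan, using the same signature bytes and 512-byte block size as the script above. The function name find_candidate_offsets and the fake in-memory image are made up for the example; for a real card you would pass open("flashimage.dat", "rb") instead.

```python
import io

SIGNATURE = bytes.fromhex("49492a00100000004352020006300100")
BLOCK_SIZE = 512


def find_candidate_offsets(stream, signature=SIGNATURE, block_size=BLOCK_SIZE):
    """Return the byte offsets of every block that starts with the signature."""
    offsets = []
    offset = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        if block.startswith(signature):
            offsets.append(offset)
        offset += len(block)
    return offsets


# Fake 4-block image: one aligned signature, one misaligned one.
fake = bytearray(BLOCK_SIZE * 4)
fake[512:512 + len(SIGNATURE)] = SIGNATURE      # start of block 1: counted
fake[1544:1544 + len(SIGNATURE)] = SIGNATURE    # 8 bytes into block 3: ignored

offsets = find_candidate_offsets(io.BytesIO(bytes(fake)))
print(offsets)
```

The number of offsets reported is the number of files the recovery script would create, which is a quick sanity check against how many photos you expect to find.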

PS – The above should work for JPEG files, but JPEG headers aren’t as big/consistent. The above technique has been proven to work for a Pentax *ist with a fully erased card (unerase couldn’t be performed) before any files were written and a Canon Rebel XTi where half the card had been filled back up with new photos. Some files might be corrupt due to fragmentation. Basically, what I’m saying is: the only two cameras I know of so far that write their files in a sane manner without fragmentation (unless holes are created by erasing files on the camera) are the above two cameras.

Installing postgres gem on OSX (Leopard with MacPorts PostgreSQL82)

Posted by Roy Hooper Sat, 08 Mar 2008 20:25:00 GMT

Unfortunately, out of the box, the postgres gem fails to install with the MacPorts postgres82-server install. There are two problems. The first is it can’t find pg_config. The second is incorrect architecture detection (thanks to Andreas Flierl at the RubyForge postgres module forums for pointing out the fix).

ERROR:  Error installing postgres:
    ERROR: Failed to build gem native extension.

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb install postgres
extconf.rb:73: command not found: pg_config --bindir
The fix for this one is to adjust the PATH temporarily:
 export PATH=/opt/local/lib/postgresql82/bin/:$PATH

Then when we retry the gem installation, we’ll get the following error(s):

postgres.c:41: error: static declaration of ‘PQserverVersion’ follows non-static declaration
/opt/local/include/postgresql82/libpq-fe.h:262: error: previous declaration of ‘PQserverVersion’ was here
postgres.c:41: error: static declaration of ‘PQserverVersion’ follows non-static declaration
/opt/local/include/postgresql82/libpq-fe.h:262: error: previous declaration of ‘PQserverVersion’ was here
postgres.c: In function ‘Init_postgres’:
postgres.c:2676: error: ‘pgconn_protocol_version’ undeclared (first use in this function)
postgres.c:2676: error: (Each undeclared identifier is reported only once
postgres.c:2676: error: for each function it appears in.)
postgres.c:2677: error: ‘pgconn_server_version’ undeclared (first use in this function)
postgres.c: In function ‘Init_postgres’:
postgres.c:2676: error: ‘pgconn_protocol_version’ undeclared (first use in this function)
postgres.c:2676: error: (Each undeclared identifier is reported only once
postgres.c:2676: error: for each function it appears in.)
postgres.c:2677: error: ‘pgconn_server_version’ undeclared (first use in this function)
lipo: can't open input file: /var/tmp//ccBatpen.out (No such file or directory)
make: *** [postgres.o] Error 1

These are fixed by telling the build to use a single, specific architecture.

The following commands worked well for me:
export PATH=/opt/local/lib/postgresql82/bin/:$PATH
sudo env ARCHFLAGS="-arch i386" gem install --remote postgres

Making Postgresql82 run under Leopard (from MacPorts)

Posted by Roy Hooper Sat, 08 Mar 2008 19:42:00 GMT

After installing the MacPorts postgresql82-server port, I tried to make it run. It didn’t. The problem was that su didn’t like running anything as user postgres. Here are the original instructions:

###########################################################
# A startup item has been generated that will aid in
# starting postgresql82-server with launchd. It is disabled
# by default. Execute the following command to start it,
# and to cause it to launch at startup:
#
# sudo launchctl load -w  /Library/LaunchDaemons/org.macports.postgresql82-server.plist 
###########################################################  
--->  Installing postgresql82-server 8.2.6_0

To create a database instance, after install do
 sudo mkdir -p /opt/local/var/db/postgresql82/defaultdb
 sudo chown postgres:postgres  /opt/local/var/db/postgresql82/defaultdb 
 sudo su postgres -c '/opt/local/lib/postgresql82/bin/initdb -D  /opt/local/var/db/postgresql82/defaultdb'
And here’s my test to determine if su was working:
 sudo su postgres -c id
I got no output… So I tried
 sudo -u postgres id
And I get:
 uid=401(postgres) gid=401(postgres) groups=401(postgres)
So now what?

There’s an easy way to deal with this: Grab the server admin tools from Apple and install them.

Next, fire up Workgroup Manager (it’s in your /Applications/Server folder). Ignore the dialog and go straight to the menu and select Server -> View Directories. Select the PostgreSQL Server user in the left-hand pane, then select the Advanced tab and change the Login Shell to something like /bin/sh. Don’t forget to hit Save.

Now the command from before works:

# sudo su postgres -c id
uid=401(postgres) gid=401(postgres) groups=401(postgres)
So now we can finish the last step to initialize the database, then fire up Postgres.
  sudo su postgres -c '/opt/local/lib/postgresql82/bin/initdb -D /opt/local/var/db/postgresql82/defaultdb'

If you already ran the launchctl command, you can make Postgres start with these two commands:

 sudo launchctl unload -w /Library/LaunchDaemons/org.macports.postgresql82-server.plist
 sudo launchctl load -w /Library/LaunchDaemons/org.macports.postgresql82-server.plist

Otherwise just run the load version of the command.

Installing the Darwin Calendar Server on FreeBSD

Posted by Roy Hooper Sat, 07 Jul 2007 23:10:00 GMT

I’ve been contemplating creating a hallway dashboard to replace the paper calendar and birthday list that hangs in the hallway. My wife’s not too thrilled by the idea, so I’ve got to make it look good, work well and be useful to be able to convince her it should get installed on the wall.

As one of its main functions will be as a Calendar and Events listing, I thought it would make sense to investigate the iCal server. It turns out that the underlying server is based on an open source project at Apple by the name of Darwin Calendar Server.

It has a fair number of dependencies… But they all looked pretty reasonable, so I decided to try it out on my household FreeBSD 6.0 server.

The requirements to make it run are:
  1. Python 2.4
  2. Zope Interface 3.1.0c1
  3. PyXML 0.8.4
  4. pyOpenSSL 0.6
  5. python-dateutil-1.0
  6. xattr 0.2 (Bob Ippolito’s implementation)
  7. pysqlite 2.2.2
  8. Twisted
  9. vObject
  10. PyKerberos
  11. PyOpenDirectory

Many of these are already available as ports:

  1. /usr/ports/lang/python
  2. /usr/ports/www/zope3
  3. /usr/ports/textproc/py-xml
  4. /usr/ports/security/py-openssl
  5. /usr/ports/devel/py-dateutil
  6. Needs to be built manually. It also depends on a version of setuptools that is higher than the current port. Get and install setuptools 0.6c6. Once you’ve installed those, get and install xattr-0.4.
  7. /usr/ports/databases/py-pysqlite22
  8. /usr/ports/devel/py-twisted
  9. /usr/ports/deskutils/py-vobject

The final two in the dependency list are available via the Mac OS Forge SVN server for Darwin Calendar Server. PyKerberos requires that you have Kerberos installed, but you don’t need to have it configured. To get the Kerberos libraries on your FreeBSD system, install /usr/ports/security/krb5.

Next, create a directory to check the calendar server out into, then run:

svn co http://svn.calendarserver.org/repository/calendarserver/CalendarServer/trunk CalendarServer

You’re also going to need PyKerberos, but it won’t work as-is on FreeBSD. First we’ll fetch it to the current working directory:

svn co http://svn.calendarserver.org/repository/calendarserver/PyKerberos/trunk PyKerberos

Next you’re going to need to modify setup.py in PyKerberos slightly as well as fix the include path that is used for Python.h. First we’ll fix all the references:

perl -spi -e 's{<Python/}{<};' src/*
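
If you would rather not use a Perl one-liner, the same in-place rewrite can be sketched in Python. It turns #include <Python/Python.h> into #include <Python.h> in every file of a directory. The helper name fix_python_includes is made up, and the demo runs against a throwaway temporary directory rather than the real PyKerberos src directory.

```python
import pathlib
import tempfile


def fix_python_includes(src_dir):
    """Replace '<Python/' with '<' in every file in src_dir.

    Returns the names of the files that were changed.
    """
    changed = []
    for path in sorted(pathlib.Path(src_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text()
        fixed = text.replace("<Python/", "<")
        if fixed != text:
            path.write_text(fixed)
            changed.append(path.name)
    return changed


# Demo on stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp)
    (src / "kerberos.c").write_text("#include <Python/Python.h>\n")
    (src / "base64.c").write_text("#include <stdio.h>\n")
    changed = fix_python_includes(src)
    result = (src / "kerberos.c").read_text()

print(changed)
print(result, end="")
```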

Next, add these two lines to setup.py just after the sources block:

library_dirs=['/usr/local/lib'],
include_dirs=['/usr/local/include'],

You’ll end up with a setup.py like this:

##
# Copyright (c) 2006-2007 Apple Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# DRI: Cyrus Daboo, cdaboo@apple.com
##
from distutils.core import setup, Extension
import sys
import commands
setup (
    name = "kerberos",
    version = "1.0",
    description = "Kerberos high-level interface",
    ext_modules = [
        Extension(
            "kerberos",
            extra_link_args = commands.getoutput("krb5-config --libs gssapi").split(),
            extra_compile_args = commands.getoutput("krb5-config --cflags gssapi").split(),
            sources = [
                "src/kerberos.c",
                "src/kerberosbasic.c",
                "src/kerberosgss.c",
                "src/base64.c" 
            ],
            library_dirs=['/usr/local/lib'],
            include_dirs=['/usr/local/include'],
        ),
    ],
)

Finally, we’re ready to attempt to fire up the server for the first time, as per the instructions on the quickstart page.

Here too, unfortunately, we run into portability problems. The run script uses /bin/bash. You’ll need to install the bash port if you haven’t already, and then either modify the run script, or fire it up manually:

bash ./run -s

This will build the kerberos library we modified in the previous steps, among other things.

Once the above works (no output), you’re ready to configure things. Before you can launch the server, though, you’ll need to fix:

CalendarServer/bin/caldavd
CalendarServer/run

to use /usr/local/bin/bash

Enjoy.

Convincing darwinports to build in parallel

Posted by Roy Hooper Sun, 25 Mar 2007 22:57:00 GMT

Sometimes you want to use all of your available CPU power for building. If Darwinports were like FreeBSD ports, it would be really easy to pass -jN to the build process (make -jN install). With a little bit of playing around, I managed to sort this out.

Darwinports isn’t quite like FreeBSD ports, and doesn’t depend on makefiles, so you can’t directly pass -jN to the build process. In order to trick it into passing -jN flags to make at build time, you need to edit /opt/local/etc/ports/ports.conf to add:

extra_env                       MAKEFLAGS

From that point on, you can define flags to be passed into make as a part of the build process:

MAKEFLAGS=-j6 port build <portname>

You’ll run into weirdness if you try to install a port with MAKEFLAGS set, and some ports won’t build with a -jN flag set. For a dual-CPU system, I recommend -j3. For a quad-CPU system, I recommend -j5 or -j6.
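
Those numbers amount to roughly "number of CPUs plus one". Here is a small Python sketch that turns that rule of thumb into a MAKEFLAGS value. The plus-one rule is my generalisation of the two recommendations above, not something Darwinports prescribes, and the helper names are made up.

```python
import os


def suggested_jobs(cpu_count):
    """Rule of thumb from the text: one more make job than there are CPUs."""
    return max(1, cpu_count) + 1


def makeflags_env(cpu_count=None):
    """Return a copy of the environment with MAKEFLAGS set to -jN."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    env = dict(os.environ)
    env["MAKEFLAGS"] = f"-j{suggested_jobs(cpu_count)}"
    return env


print(makeflags_env(2)["MAKEFLAGS"])
print(makeflags_env(4)["MAKEFLAGS"])
```

The returned dictionary can be passed as env= to subprocess.run(["port", "build", portname], env=...), which has the same effect as prefixing the command with MAKEFLAGS=-jN in the shell.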

Here’s a timing comparison:

monopoly:~ rhooper# time sudo port build ImageMagick
--->  Fetching ImageMagick
--->  Verifying checksum(s) for ImageMagick
--->  Extracting ImageMagick
--->  Configuring ImageMagick
--->  Building ImageMagick with target all

real    5m2.928s
user    2m24.148s
sys     2m19.407s

And here it is with -j5:

monopoly:~ rhooper# sudo port clean ImageMagick
--->  Cleaning ImageMagick

monopoly:~ rhooper# MAKEFLAGS=-j5 bash -c "time sudo port build ImageMagick" 
--->  Fetching ImageMagick
--->  Verifying checksum(s) for ImageMagick
--->  Extracting ImageMagick
--->  Configuring ImageMagick
--->  Building ImageMagick with target all

real    2m38.003s
user    2m43.868s
sys     3m22.793s

And just to compare, here it is with -j6

monopoly:~ rhooper# sudo port clean ImageMagick
--->  Cleaning ImageMagick

monopoly:~ rhooper# MAKEFLAGS=-j6 bash -c "time port build ImageMagick" 

--->  Fetching ImageMagick
--->  Verifying checksum(s) for ImageMagick
--->  Extracting ImageMagick
--->  Configuring ImageMagick
--->  Building ImageMagick with target all

real    2m32.642s
user    2m45.351s
sys     3m26.030s
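
To put numbers on the difference, here is a short Python sketch that converts the "real" times above into seconds and works out the speedup relative to the serial build. Only the three wall-clock values are taken from the runs above; the helper name to_seconds is made up.

```python
import re


def to_seconds(stamp):
    """Convert a `time` value such as '5m2.928s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock ("real") times from the three builds above.
real = {"serial": "5m2.928s", "-j5": "2m38.003s", "-j6": "2m32.642s"}

baseline = to_seconds(real["serial"])
speedups = {flag: baseline / to_seconds(stamp)
            for flag, stamp in real.items() if flag != "serial"}

for flag, speedup in speedups.items():
    print(f"{flag}: {speedup:.2f}x faster than the serial build")
```

Both parallel builds finish in a little over half the wall-clock time of the serial one, while the combined user and sys time goes up.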

This will be a real timesaver with those bigger ports!