Recovering lost photos from a flash card
Twice now, I’ve helped friends recover lost photos from flash cards. Both friends are technical people who generally know what they’re doing … but they both got bitten by bugs in commercial software that left them without their photos for one reason or another.
The loss of photos is a very upsetting event to a photographer. Fortunately, if you stop using the card the instant you discover the problem, more than likely, your photos can be recovered. If you never use the on-camera delete to remove anything but the most recently taken photo(s), you’re in even better shape, since no fragmentation will have happened!
So how do you recover photos when there’s NO filesystem information left? It turns out it’s pretty easy for some file formats, as long as your tools ignore trailing junk in files. The approach is simple: snapshot the raw card contents, then look for the start-of-file signature for your specific type of photo file. Most importantly, look for it only at the start of 512-byte boundaries, the typical “block” size.
To determine your file’s start signature, you’ll need a few sample files from your camera. If you have a Canon Rebel XTi and shoot in RAW mode, you’ll be getting CR2 files. These files have a nice long signature at the start, making detection and recovery a breeze.
To determine if there’s a good unique signature, you can do something like this:
for f in IMG*.CR2; do hexdump -n 16 "$f"; done | sort | uniq -c
6 0000000 4949 002a 0010 0000 5243 0002 3006 0001
6 0000010
If there were 6 files in the directory, and you only get 2 lines of output, you’ve found yourself a reliable 16-byte signature. More than enough to detect the start of files in most cases, especially when aligned to the start of a 512-byte block.
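The above signature is what’s needed for a CR2 file. Note that hexdump’s default output groups bytes into 16-bit words in host byte order, so on a little-endian machine each pair appears swapped: the file actually starts with the bytes 49 49 2a 00 10 00 00 00 43 52 02 00 06 30 01 00, which is the order the recovery script uses.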
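As a hedged sketch of the same check in on-disk byte order: the helper names first16_hex and count_distinct_prefixes are made up for illustration, and this assumes an od that accepts -An -v -tx1 -N (the usual ones on Linux, FreeBSD and OSX do). Printing one byte at a time avoids the pair swapping that hexdump’s default 16-bit word output shows on little-endian machines.

```shell
# Hypothetical helper: print the first 16 bytes of a file as plain hex,
# one byte at a time, so the output is in on-disk byte order.
first16_hex() {
    od -An -v -tx1 -N 16 "$1" | tr -d ' \t\n'
}

# Hypothetical helper: count how many distinct 16-byte prefixes a set of
# sample files has. A result of 1 means every sample shares one signature.
count_distinct_prefixes() {
    for f in "$@"; do
        first16_hex "$f"
        echo
    done | sort -u | wc -l | tr -d ' '
}
```

Running count_distinct_prefixes IMG*.CR2 should print 1 if all the samples agree, and first16_hex on any one of them then gives the signature as a single hex string that can be compared directly against the byte list in the recovery script.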
Next, we need an image of the flash card. I don’t recommend ever working directly on the flash card when doing recovery, so we read it once and save the copy for all further processing.
So how do you grab the contents of the flash for safe processing? Under Linux, FreeBSD and OSX (and other unix platforms), you use dd. We do this because we don’t want any extra headers, just the raw bytes. This ensures the image stays aligned at 512-byte boundaries. Some disk image containers might happen to keep things aligned to the block boundaries; I’ve never checked.
The specific instructions for OSX are as follows:
- Using a card reader, mount the flash card (like usual: just insert it)
- Start Terminal.app and run df to find the device name of the newly mounted flash card. We’re specifically interested in the /dev/diskNsN device name, since we’re going to need to directly access it.
Here’s an example:
/dev/disk4s1 999344 978464 20880 98% /Volumes/EOS_DIGITAL
- Next, we need to unmount the disk without causing the device to be removed. We can either run umount /dev/disk4s1 or diskutil unmount /dev/disk4s1 from the terminal, or go to Disk Utility, select the right volume, then use the “Unmount” toolbar icon to unmount it without ejecting it.
- Finally, we create the image we’re going to work with using dd.
dd if=/dev/disk4s1 of=flashimage.dat
Depending on the size and speed of the card, your card reader, and the USB interface, this could take a long time. If you need to know how far it’s gotten, open a new Terminal window and run “du -h flashimage.dat”.
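Part of the wait comes from dd copying 512 bytes at a time by default. Here is a minimal sketch, assuming a larger block size suits your card reader: the helper names image_card and is_block_aligned are made up for illustration, and bs=1048576 (1 MiB) is a reasonable guess rather than a tuned value. The output is still just the raw bytes with no headers.

```shell
# Hypothetical helper: copy a device (or any file) to an image file using
# 1 MiB reads and writes instead of dd's 512-byte default.
image_card() {
    dd if="$1" of="$2" bs=1048576
}

# Hypothetical helper: succeed only if the image is a whole number of
# 512-byte blocks, which is what a complete raw copy should be.
is_block_aligned() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ $((size % 512)) -eq 0 ]
}
```

With the example device above this would be image_card /dev/disk4s1 flashimage.dat, followed by is_block_aligned flashimage.dat as a quick sanity check that the copy was not cut short partway through a block.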
Once the copy of the image has been made, we’ll want to run a quick recovery script. This script relies on the fact that Canon Raw conversion utilities tend to ignore trailing junk. If yours don’t, grab dcraw (http://cybercom.net/~dcoffin/dcraw/), which is available via MacPorts (http://www.macports.org/), and convert the files to a format you can use (like TIFF).
Here’s the recovery script I hacked together to recover missing CR2 files for my friend:
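#!/usr/bin/perl
use strict;
use warnings;

# First 16 bytes of a CR2 file, in on-disk byte order
my @signature = qw(49 49 2a 00 10 00 00 00 43 52 02 00 06 30 01 00);
my $signature = pack("H*", join("", @signature));
my $siglen    = length($signature);
my $blocksize = 512;

open(my $in, '<', "flashimage.dat") or die "flashimage.dat: $!";
binmode($in);

# Skip 2gb to get past existing files (specific to this card; use 0 to scan the whole image)
seek($in, 2 * 1024 * 1024 * 1024, 0) or die "seek: $!";

my $block = "";
my $imgno = 0;
my $out;
while (read($in, $block, $blocksize)) {
    # A signature at the start of a block marks the start of a new file
    if (substr($block, 0, $siglen) eq $signature) {
        close($out) if $out;
        my $name = sprintf("found%04d.CR2", ++$imgno);
        print "starting $name\n";
        open($out, '>', $name) or die "$name: $!";
        binmode($out);
    }
    # Everything up to the next signature goes into the current file
    print {$out} $block if $out;
}
close($out) if $out;
close($in);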
The end result should be a whole bunch of .CR2 files named from found0001.CR2 through the final number.
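To preview how many files the script is likely to find, the same block-aligned signature test can be sketched as a shell pipeline. This is only an illustration under stated assumptions: the name list_cr2_blocks and the variable CR2_SIG are made up, and it assumes od prints 16 bytes per output line (its usual default), so that every 32nd line holds the first 16 bytes of a 512-byte block. It scans the whole image from the start and will be noticeably slower than the Perl script on a large card.

```shell
# CR2 signature as plain hex, in on-disk byte order (same bytes as the Perl script).
CR2_SIG=49492a00100000004352020006300100

# Hypothetical helper: print the number of every 512-byte block in an image
# that starts with the signature. Assumes od emits 16 bytes per line, so
# lines 1, 33, 65, ... are the first 16 bytes of blocks 0, 1, 2, ...
list_cr2_blocks() {
    od -An -v -tx1 "$1" |
        awk -v sig="$CR2_SIG" '
            NR % 32 == 1 {
                gsub(/[ \t]/, "")                 # join the hex bytes
                if (($0 "") == sig) printf "%d\n", (NR - 1) / 32
            }'
}
```

For example, list_cr2_blocks flashimage.dat | wc -l gives a count of candidate files, and any block number it prints can be handed to dd as skip=N with bs=512 to look at that spot in the image by hand.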
PS – The above should work for JPEG files, but JPEG headers aren’t as big or as consistent. The technique has been proven to work on two cameras: a Pentax *ist whose card had been fully erased before any new files were written (so unerase couldn’t be performed), and a Canon Rebel XTi where half the card had been filled back up with new photos. Some files might be corrupt due to fragmentation. Basically, what I’m saying is: those are the only two cameras I know of so far that write their files in a sane manner without fragmentation (unless holes are created by erasing files on the camera).
Printing reports...
You’d think one of the things that would have been made easy in Visual Basic was creating/printing reports.
Unfortunately, that doesn’t seem to be the case. Printing support hasn’t changed much from the early days; they’ve just moved stuff around. You still have to do the hard work, like word wrapping, yourself, which just seems to me to be reinventing the wheel.
Oh, you can do things like create a WebBrowser object, and print from that, but that seems reliable only for a single page at a time. Attempts to automate it were slower than 1/sec for a batch process, because I had to ensure I waited sufficient time for the page to render. Yes, there is an event that’s fired that you can use to time when to start the .Print() method… but you also have to get the timing right as to when you refresh the content to load a new page to print. Then there’s the standard IE print headers and footers. Without adding a commercial ActiveX component like ScriptX (which incidentally, seems not to be straightforward to use in VB2005), you have to modify the registry to remove them.
In the end, I gave up on trying to do the batch printing via VB. Crystal Reports would have worked, I guess, but that’s just a different source of frustration. Oh, did I mention that to include bar codes reliably, I’d have to roll my own or license a commercial font?
My solution? PDF::API2::Simple and PDF::API2, which allow me to easily create PDFs. PDF::API2 is poorly documented, but very powerful. It even includes a number of barcode formats, including 3of9, which I have been using, as well as code128, which I had been thinking about. PDF::API2::Simple provides an easier-to-use interface to PDF::API2, including a very nicely written text placement routine that does word wrap for me. Fortunately, it also allows direct access to PDF::API2, so you can still place that barcode.
If you want to create barcodes with PDF::API2, you’ll want to check out AnnoCPAN, where some rather helpful notes have been added.
Net result? Small files, fast generation of documents, and no frustrations surrounding reprint control, timing of print documents, overflowing the print spool, etc. I can now take all the work of creating a batch print job to the server end, and there’s no client software to install other than Acrobat, which everyone needs anyhow.
