by Rachel MacGregor
At the Modern Records Centre, University of Warwick, in the United Kingdom, we have been making steady progress in our digital preservation work. Jessica Venlet from UNC Chapel Hill wrote recently about being in the lucky position of finding “an excellent stock of hardware and two processors” when she started in 2016. We’re a little further behind than this: when I began in 2017 I had a lot less!
What we want is FRED. Who’s he? He’s your Forensic Recovery of Evidence Device (forensic workstation), but costing several thousand dollars, it’s beyond the reach of many of us.
What I had in 2017:
- A Tableau T8-R2 write blocker. Write blockers are very important when working with rewritable media (USB drives, hard drives, etc.) because they prevent accidental alteration of material by blocking overwriting or deletion.
- A (fingers crossed) working 3.5 inch external floppy disk drive.
- A lot of enthusiasm.
What I didn’t have:
- A budget.
Whilst doing background research for tackling our born-digital collections, I got interested in the BitCurator toolkit, which is designed to help with the forensic recovery of digital materials. It interested me particularly because:
- It’s free.
- It’s open source.
- It’s created and managed by archivists for archivists.
- There’s a great user community.
- There are loads of training materials online and an online discussion group.
I found this excellent blog post by Porter Olsen to help me get started. He suggests starting with a standard workstation with a relatively high specification (e.g. 8 GB of RAM). So I asked our IT folk for one, which they had in stock (yay!). I specified a Windows operating system and installed a virtual machine running Linux, inside which I run BitCurator.
I’m still exploring BitCurator. It’s a powerful suite of tools with lots of features. However, when trialling it on the papers of the eminent historian Eric Hobsbawm, I found that it was a bit like using a sledgehammer to crack a nut. Whilst it was possible to produce all sorts of sophisticated reports identifying email addresses etc., this isn’t much use on drafts of published articles from the late 1990s and early 2000s. I turned to FTK Imager, which is proprietary but free software. It is widely used in the preservation community, but not designed by, with, or for archivists (as BitCurator is). I guess its popularity derives from the fact that it’s easy to use: without too much time spent learning it, you can either image the media (i.e. take a complete copy of the whole medium, including deleted files and empty space) or just extract the files. There are standard options for disk image output (e.g. a raw byte-for-byte image, the E01 Expert Witness format, SMART, and AFF). However, I would like to spend some more time getting to know BitCurator and becoming part of its community. There is always room for new and different tools and I suspect the best approaches are those which embrace diversity.
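To make "a raw byte-for-byte image" a little more concrete, here is a minimal Python sketch of the idea: read the source in chunks, write each chunk unchanged to an image file, and calculate a checksum along the way. This is an illustration only and not how FTK Imager or BitCurator work internally. The function name `raw_image` and its parameters are my own invention, and it does nothing about bad sectors or the E01, SMART, and AFF formats.

```python
import hashlib


def raw_image(source, destination, chunk_size=1024 * 1024):
    """Copy every byte of `source` to `destination`, unchanged.

    Returns a tuple of (bytes copied, SHA-256 checksum as hex).
    `source` can be an ordinary file or, with sufficient permissions,
    a device path; both names are hypothetical, chosen for this sketch.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means end of the source
                break
            dst.write(chunk)      # write the bytes exactly as read
            digest.update(chunk)  # hash the same bytes as we go
            copied += len(chunk)
    return copied, digest.hexdigest()
```

Recording the checksum at the moment of imaging means the image can be checked later for accidental change. When reading from real media, the write blocker described above still belongs between the media and the workstation.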
Another tool that looks useful for disk imaging is diskimgr, created by Johan van der Knijff of the Nationale Bibliotheek van Nederland. It will only run on a Linux operating system (not on a virtual machine), so now I am wondering about getting a separate Linux workstation. BitCurator also works more effectively on a native Linux installation than in a virtual machine, where it does sometimes stall with larger collections. I wonder if I should have opted for a Linux machine to start with... it’s certainly something to consider when creating a specification for a digital curation workstation.
Once the content is extracted, we need further tools to help us manage and process it. BitCurator does a lot, but you might need extra things depending on your intended workflow. I never go anywhere without DROID. It is useful for loads of stuff like file format identification, creating checksums, deduplication, and lots more. My standard workflow is to create a DROID profile and then use this as part of the appraisal process further down the line. What I don’t yet have is some sort of file viewer. Quick View Plus is the one I have in mind (it’s not free and, as I think I mentioned, my resources are limited!). I would also like to get LibreOffice installed as it deals quite well with old word-processed documents.
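For anyone wondering how checksums help with deduplication, the idea can be sketched in a few lines of Python: calculate a checksum for every file, then group together the files whose checksums match. This is a rough illustration of the principle rather than DROID’s own implementation, and the function names `file_checksum` and `find_duplicates` are made up for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex checksum of one file, read in chunks to save memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group the files under `root` by checksum.

    Returns a dict mapping each checksum to a list of paths, keeping
    only checksums shared by two or more files (likely duplicates).
    """
    by_checksum = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_checksum[file_checksum(path)].append(path)
    return {c: paths for c, paths in by_checksum.items() if len(paths) > 1}
```

Files with matching checksums have the same content even if their names or dates differ, which is useful to know at appraisal, for instance when the same draft was saved in several folders.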
I guess I’ll keep adding to it as I go along. I now need to work out the most efficient ways of using the tools I have and capturing the relevant metadata that is produced. I would encourage everyone to take some time experimenting with some of the tools out there and I’d love to hear about how people get on.
Rachel MacGregor is Digital Preservation Officer at the Modern Records Centre, University of Warwick, United Kingdom. Rachel is responsible for developing and implementing digital preservation processes at the Modern Records Centre, including developing cataloguing guidelines, policies and workflows. She is particularly interested in workforce digital skills development.