Over the past couple of months, we’ve heard a lot on bloggERS about how current students, recent grads, and mid-career professionals have made tech skills a strategic priority in their development plans. I like to think about the problem of “gaining tech skills” as being similar to “saving the environment”: individual action is needed and necessary, but it is most effective when it feeds clearly into systemic action.
So that raises the question: what root changes might educators of all types suggest and support to help GLAM professionals prioritize tech skills development? What are educator communities and systems – iSchools, faculty, and continuing education instructors – doing to achieve this? These questions are among those addressed by the BitCuratorEdu research project.
The BitCuratorEdu project is a three-year effort funded by the Institute of Museum and Library Services (IMLS) to study and advance the adoption of born-digital archiving and digital forensics tools and methods in libraries and archives through a range of professional education efforts. The project is a partnership between the School of Information and Library Science at the University of North Carolina at Chapel Hill and the Educopia Institute, along with the Council of State Archivists (CoSA) and nine universities that are educating future information professionals.
We’re addressing two main research questions:
What are the primary institutional and technological factors that influence adoption of digital forensics tools and methods in different educational settings?
What are the most viable mechanisms for sustaining collaboration among LIS programs on the adoption of digital forensics tools and methods?
The project started in September 2018 and will conclude in Fall 2021, and Educopia and UNC SILS will be conducting ongoing research and releasing open educational resources on a rolling basis. With the help of our Advisory Board made up of nine iSchools and our Professional Experts Panel composed of leaders in the GLAM sector, we’re:
Piloting instruction to produce and disseminate a publicly accessible set of learning objects that can be used by education providers to administer hands-on digital forensics education
Gathering information and centralizing existing educational content to produce guides and other resources, such as this (still-in-development) guide to datasets that can be used to learn new digital forensics skills or test digital archives software/processes
Investigating and reporting on institutional factors that facilitate, hinder and shape adoption of digital forensics educational offerings
Through this work and intentional community cultivation, we hope to advance a community of practice around digital forensics education through partner collaboration, wider engagement, and exploration of community sustainability mechanisms.
To support our research and steer the direction of the project, we have conducted and analyzed nine advisory board interviews with current faculty who have taught or are developing a curriculum for digital forensics education. So far we’ve learned that:
instructors want and need access to example datasets to use in the classroom (especially cultural heritage datasets);
many want lesson plans and activities for teaching born-digital archiving tools and environments like BitCurator in one or two weeks because few courses are devoted solely to digital forensics;
they want further guidance on how to facilitate hands-on digital forensics instruction in distributed online learning environments; and
they face challenges related to IT support at their home institutions, just like those grappled with by practitioners in the field.
This list barely scratches the surface of our exploration into the experiences and needs of instructors for providing more effective digital forensics education, and we’re excited to tackle the tough job of creating resources and instructional modules that address these and many other topics. We’re also interested in exploring how the resources we produce may also support continuing education needs across libraries, archives, and museums.
We recently conducted a Twitter chat with SAA’s SNAP Section to learn about students’ experiences in digital forensics learning environments. We heard a range of experiences, from students who reported they had no opportunity to learn about digital forensics in some programs, to students who received effective instruction that remained useful post-graduation. We hope that the learning modules released at the conclusion of our project will address students’ learning needs just as much as their instructors’ teaching needs.
Later this year, we’ll be conducting an educational provider survey that will gather information on barriers to adoption of digital forensics instruction in continuing education. We hope to present to and conduct workshops for a broader set of audiences including museum and public records professionals.
Our deliverables, from conference presentations to learning modules, will be released openly and freely through a variety of outlets including the project website, the BitCurator Consortium wiki, and YouTube (for recorded webinars). Follow along at the project website or contact firstname.lastname@example.org if you have feedback or want to share your insights with the project team.
Jess Farrell is the project manager for BitCuratorEdu and community coordinator for the Software Preservation Network at Educopia Institute. Katherine Skinner is the Executive Director of Educopia Institute, and Christopher (Cal) Lee is Associate Professor at the School of Information and Library Science at the University of North Carolina, Chapel Hill, teaching courses on archival administration, records management, and digital curation. Katherine and Cal are Co-PIs on the BitCuratorEdu project, funded by the Institute of Museum and Library Services.
They say you learn more from failure than from success. FIAT was a great teacher.
This is a story about never giving up, until you do: about the project where nothing went right, and just kept going. It takes place at the University of Texas, Austin, School of Information (iSchool) in the Digital Archiving course. A big part of that class is the hands-on technology project, where students apply archival theory to legacy hardware, digital records, or a mix of both. Our class had three mothballed servers and three ancient personal computers available; my group was assigned the largest (and oldest) of the retired School of Information servers, a monster tower-chassis Dell PowerEdge 4400 called FIAT.
Our assignment was clear: following archival principles, gain access to the machine’s filesystem, determine dates of service, inventory the contents, and image the disks or otherwise retrieve the data. We had an advantage in that we knew what FIAT had been used for: the backbone server for the iSchool, serving the public-facing website and the home directories for faculty, staff, and students. In light of this, we had one additional task: locate the old website directory. Hopefully, at the end of the semester, we would have a result to present for the iSchool Open House.
As the only one in the group with Linux server experience (I’d been the school’s deputy systems administrator for about a year), I volunteered as technical lead and immediately began worrying about what would go wrong.
It would be easier to list what went right.
We got access to the machine. We estimated manufacturing and usage dates. We determined the drive configuration. We were able to view the file directory, and we located the old iSchool website.
The catalog of dead ends, surprises, and failures is rather longer, and began almost immediately. None of us had done anything like this before, but I had enough experience with servers to develop specific fears, which may or may not have been better than the general anxiety my group members suffered, and turned out to be largely misplaced.
I was sure that FIAT had been set up in a RAID array, but I didn’t know the specifications or how to image it. (1) To find out without directly accessing the machine (which might have compromised the integrity of its filesystem), we needed its Dell service tag number. If we could give that to the iSchool’s IT coordinator (my boss), Dell’s lookup tool would tell us what we needed to know.
The service tag had been scraped off.
That was annoying, but not fatal. Since we had the model number, we could find a manual; with access to the iSchool’s IT inventory, I could look up the IT control tag and see what information we had. From this, we determined that FIAT was produced between 1999 and 2003, could have been set up for either hardware or software RAID depending on a hardware feature, and was probably running the operating system Red Hat v2.7. That gave us a ballpark for service life. It didn’t move us forward, though, so while my compatriots researched RAID imaging strategies, I looked for another route.
Best practice for computer accessions calls for accessing the machine from a “dead” state, so that metadata doesn’t get overwritten and the machine can be preserved in its shutdown state. For us, that meant booting from a Live CD, a distribution of Linux which runs in the RAM and mounts, or attaches, to the filesystem without engaging the operating system, allowing us to see everything without altering the data. My thought was that we could boot that way and then check for a RAID configuration at the system level: open the box with the crowbar inside it.
And it would have worked, too, if it weren’t for Murphy.
After making the live CD, we turned FIAT on and adjusted the boot order in the BIOS so we could boot from the disk. We learned three things from this: first, and most frighteningly, one of the drive ports didn’t show up in the boot sequence (and another spun up with the telltale whine of a Winchester drive going bad, increasing the pressure to get this done). Second, the battery on FIAT’s internal clock must have died, because it displayed a date in May 2000 (which we figured was probably when the board had been installed). Third, neither the service tag number nor the processor serial number appeared in the BIOS, so we still couldn’t look it up.
Carrying merrily on, we went ahead with the live CD boot. What happened next was our mistake, and I only realized it later. Though Knoppix (the Linux OS we were running from the live CD) started and ran, the commands for displaying partitions and drives returned no results, and navigating to /dev (where the drives mount in Linux) didn’t reveal any mount points. Nothing in the filesystem looked right, either.
What had happened (and a second attempt made this apparent) was that Knoppix hadn’t mounted at all. It was just running in the RAM. We hadn’t noticed the error message that told us this because we were too excited that the CD drive had worked. Knocked back but hardly defeated, we took a week off to email smarter people and regroup.
The next thing we did involved a screwdriver.
Popping the side off to read the diagram and locate the RAID controller key– or not, as it happened–was mildly cathartic and hideously dusty. I spent the next three days sneezing. Without a hardware controller, I was certain that the machine had been set up with a software RAID; since our attempts to boot from the CD had failed, I proposed that we pull the drives and image them separately with the forensic hardware we had available. My theory was that, since the RAID was configured in the software, we could rebuild it from disk images. This theory did not have a chance to be disproved.
Unscrewing the faceplate and pulling the drives gave me a certain amount of satisfaction, I’ll admit. It also solved the mystery of the missing drive: the reason why one of the SCSI ports wasn’t coming up on the boot screen was that it was empty. With that potential catastrophe averted, we imagined ourselves well set on our way to imaging the disks. Until we discovered that the Forensic Recovery of Evidence Device Laptop (or FRED for short) in the Digital Archaeology Lab didn’t have cables capable of connecting our 80-pin SCSI-2 drives to its 68-pin SCSI-3 write blocker. And that, despite having a morgue’s worth of old computer cables and connectors, there wasn’t anything in the lab with the right ends. That’s when I started fantasizing about making FIAT into a lamp table.
So, while my comrades returned to preparing a controlled vocabulary for our pictures and drafting up metadata files (remember, we never actually gave up on getting the data), I called or drove to every electronics store in town, including the Goodwill computer store. I found a lot of legacy components and machines, but nothing that would convert SCSI-2 to SCSI-3; so I put out a call to my nerd friends to find me something that would work.
With their help, I found an adaptor with SCSI-2 on one side and SCSI-3 on the other. When it arrived, I met up with one of my groupmates at the Digital Archaeology Lab, where the two of us daisy-chained the FRED cables, write blocker, our connector, and the (newly labeled according to our naming convention) drives to see what would happen.
The short version is: nothing.
The longer version is: complicated nothing. Some of the drives, attached to the power supply and write blocker, didn’t turn on at first, then did later without us changing anything. The write blocker’s SCSI connection light never lit up. FRED never registered an attached drive. We tried several jumper combinations, all with the same result: when we could get the drives to turn on at all, the write blocker couldn’t see them, and neither could FRED.
Having exhausted our options for doing it the right way, we explained the situation to our professor, Dr. Pat Galloway (who I think was enjoying our object lesson in Special Problems), and got permission to just turn FIAT on and access it directly. I put the drives back in, we tried booting with Knoppix again just in case (revealing the error), then changed the boot order back and watched it slowly come back to life.
Of course no one had the password.
Illustrating the adage “if physical access has been achieved, consider security compromised,” I put FIAT into Single User Mode, allowing me to reset the root password (we put it in the report, don’t worry) and become its new boss. (2)
This is where it got weird. And exciting! Prior to this, we’d been frustrated; afterwards, we upgraded to baffled. After watching FIAT rebuild itself as a RAID 5 array, we had to figure out what to do next: how to image the machine, and onto what.
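For readers wondering why a RAID 5 array can "rebuild itself" at all: each stripe stores a parity block that is the XOR of its data blocks, so any single missing block can be recomputed from the rest. The Python below is a minimal sketch of that idea only, not anything that was run on FIAT. The four-drive layout and the block contents are hypothetical, and a real RAID 5 implementation also rotates parity across the drives, which this sketch ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) yields one tuple of byte values per position in the stripe.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical four-drive array: three data blocks plus parity.
data_blocks = [b"iSch", b"ool ", b"site"]
parity = xor_blocks(data_blocks)

# Pretend the second drive failed: rebuild its block from the survivors and parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'ool '
```

This is also why imaging the member drives separately is not enough on its own: the images only become a usable filesystem once the stripes are reassembled in the right order.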
We made three attempts to connect FIAT to something, or something to FIAT, each of which resulted in its own unique kind of failure.
After noticing a SCSI-3 port on the back of FIAT (a match to the one on FRED’s write blocker cables), and with no idea if this would even work for a live machine, I proposed plugging the two together to see what happened.
Again, the short answer is: nothing. We tried it both through FRED’s write-blocker and directly to the laptop, but neither FRED nor FIAT registered a connection. Checking for drives showed no new devices, and no connection events appeared in the log file or the kernel message buffer. (3)
Our next bright idea was to attach a USB storage device and run a disk dump to it. We formatted a drive, plugged it in, and prepared for nothing to happen. For once, it didn’t. Instead, FIAT reported an error addressing the device that even Stack Overflow didn’t recognize. I found the error class, however: kernel issues. We thought that maybe the drive was too big, or too new, so we hunted up an older USB drive and tried again. Same result. Then we rebooted. The error messages stopped, but no new connection registered. I tried stopping the service, removing the USB module, and restarting the service, but both ports continued throwing errors.
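A disk dump is, at bottom, a byte-for-byte copy of a device into a file, usually paired with a checksum so the copy can be verified later. As an illustration of the concept (and not the command we would have run on FIAT), here is a minimal Python sketch. The function names and file names are hypothetical, and a small temporary file stands in for the source device so the example can run anywhere.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dump_to_image(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source to image chunk by chunk.

    Returns (bytes copied, SHA-256 hex digest of the bytes read from source).
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        for chunk in iter(lambda: source.read(chunk_size), b""):
            image.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()


# Demonstration with a stand-in file. On a real machine the source would be
# a block device path (an assumption here), opened read-only.
with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "pretend_disk.bin")
    image_path = os.path.join(workdir, "pretend_disk.img")
    with open(source_path, "wb") as handle:
        handle.write(b"FIAT" * 1000)

    copied, source_digest = dump_to_image(source_path, image_path, chunk_size=512)
    image_digest = sha256_of(image_path)
    print(copied, source_digest == image_digest)
```

Hashing the bytes as they are read, then hashing the finished image separately, gives two independent digests to compare; a mismatch means the destination did not record what the source delivered.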
Servers are meant to be networked. Hoping that FIAT’s core functions remained intact, I acquired some cables and a switch and rigged up a local area network (LAN) between FIAT and my work laptop. If it worked, we could send the disk dump command to a destination drive attached to my Mac. Or we could stare at the screen while FIAT dumped line after line of ethernet errors, which seemed like more fun. Again, I stopped the service, cleared the jobs on the controller (FIAT only had one ethernet port), restarted it, then restarted the service, but the timeout errors persisted. (4)
FIAT’s total solipsism suggested a dead south bridge as well as serious kernel problems. While it might have been possible for an electrical engineer (i.e., not us) to overcome the former, the latter presented a catch-22: even if we were willing to alter the operating system (no), the only way to fix FIAT would have been with an update, which, thanks to the kernel errors, we couldn’t perform.
To be clear: those three things all happened in about two hours.
During this project, we kept reflective journals, accessible only to ourselves and our professor. The final entry in mine simply reads: “I think I’m so smart.”
With about a week left, I had an idea. Other than how to convert a tower-chassis server into an end table. I discussed it with the group, and when we couldn’t find anything wrong with it, I suggested they start the final report and presentation: if this worked, we’d have something to turn in. If not, I’d write my section of the report and we’d call it done.
We had found the website while exploring the filesystem. FIAT had a CD drive. I could compress and copy the website directory to a CD and we could turn that in. It wasn’t ideal, but we’d have something to show for a semester’s worth of work.
So while my compatriots got the project documentation ready for ingest to the iSchool’s DSpace, I went to work on FIAT one last time. I covered my bases, researching the compression protocols Red Hat 2.7 supported, the commands I’d need, and how to find the hardware location so I could write the file out once it was created.
FIAT being FIAT, I hit a snag: the largest CD I had available was 700MB, but the real size of the website directory was 751MB. After a little investigation, I decided which files and folders we could live without (and put locations and my reasoning in the report): excluding them, I created an .iso smaller than 700MB.
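The arithmetic behind that decision is simple enough to sketch: total the file sizes under the directory, total them again with the candidate exclusions left out, and compare against the disc's capacity. The Python below is an illustrative sketch under stated assumptions, not the procedure used on FIAT. The function name, directory names, and sizes are all hypothetical, and the demonstration uses a 40-byte "disc" so it runs instantly. Note that an .iso adds filesystem overhead, so a raw sum like this is a lower bound on the image size.

```python
import os
import tempfile


def tree_size(root, excluded=()):
    """Sum file sizes under root, skipping excluded paths.

    excluded holds paths relative to root; an excluded directory is skipped
    along with everything beneath it.
    """
    skip = {os.path.normpath(path) for path in excluded}
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune excluded subdirectories in place so os.walk never enters them.
        dirnames[:] = [
            name for name in dirnames
            if os.path.normpath(os.path.join(rel_dir, name)) not in skip
        ]
        for name in filenames:
            if os.path.normpath(os.path.join(rel_dir, name)) in skip:
                continue
            # lstat avoids following (or failing on) symbolic links.
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


# Demonstration on a tiny stand-in "website" with made-up files and sizes.
with tempfile.TemporaryDirectory() as site_root:
    os.makedirs(os.path.join(site_root, "img"))
    os.makedirs(os.path.join(site_root, "logs"))
    sample_files = [
        ("index.html", 10),
        (os.path.join("img", "a.gif"), 20),
        (os.path.join("logs", "old.log"), 50),
    ]
    for rel_path, size in sample_files:
        with open(os.path.join(site_root, rel_path), "wb") as handle:
            handle.write(b"x" * size)

    capacity = 40  # hypothetical tiny "disc", in bytes
    full_size = tree_size(site_root)
    trimmed_size = tree_size(site_root, excluded=["logs"])
    print(full_size, trimmed_size, trimmed_size <= capacity)
```

Keeping the exclusions as an explicit list has a documentation benefit too: the same list can go straight into the report alongside the reasoning for each omission.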
That file still resides in the directory where I put it. The final indignity, FIAT’s sting in the tail, was this: it had a CD drive, which I found with cdrecord -scanbus. What it did not have was a CD writer. Attempting to write the .iso to disc, cdrecord returned an error: unsupported drive at bus location. FIAT can read, but it can’t write.
And then we were out of time. After my last idea fizzled, we gave up on FIAT and put together our final report, including suggestions for further work and server decommissioning recommendations for never letting this happen again. I presented the project at the Open House anyway, sitting on FIAT (the casters on the tower model made moving its 115lb bulk a breeze) for four hours, showing people the file structure and regaling them with tales of failure. Talking over the fantastic noise it made, while the other members of my group held down their own project posters, I found that people appreciate a good comedy of errors. I’ve embraced FIAT as my shaggy-dog story, my archival albatross. And now I know what to say in job interviews when they ask me to “talk about a time you didn’t succeed.”
The author would like to acknowledge the efforts and contributions of their fellow-travelers, Arely Alejandro, Maria Fernandez, Megan Moltrup, and Olivia Solis, as well as the guidance and assistance of Dr. Patricia Galloway, Sam Burns, and members of the UT Austin storage ITS team, without whom none of this would have happened.
1 Redundant Array of Inexpensive (or Independent) Disks, a storage virtualization method which uses either hardware or software to combine multiple physical drives into a single logical unit, improving read/write speeds and providing redundancy to protect against drive failure. RAID arrays can be set up at a number of levels depending on user need, all of which have their own implications for preservation and data recovery.
2 Something I had no prior experience with- certainly not with a Red Hat 2.7 machine! I spent more time looking up error codes, troubleshooting, and searching for workarounds than interacting with FIAT.
3 Throughout, I used the command df -h on FIAT to display drives, and read the kernel message buffer, which holds information about the operating status of the machine, with dmesg.
4 I cannot emphasize enough how much on-the-fly learning occurred during this part- even as a low-level systems administrator, trying to get FIAT to talk to something involved a lot of new material for me.
A.L. Carson, MSIS UT ‘16, is the only archivist on Earth who is allergic to cats. Trained as a digital archivist, they now apply those perspectives on metadata and digital repositories as a Library Fellow at the University of Nevada, Las Vegas. Twitter: @mdxCarson