Pathways to Automated Appraisal for Born-Digital Records: An SAA 2016 ERS Breakout Discussion Recap

By Lora Davis
____

In a stroke of brilliant SAA scheduling (or, perhaps, blind chance) the 2016 Electronic Records Section’s annual business meeting immediately followed Thursday afternoon’s session 201 “From 0 to 400 GB: Confronting the Challenges of Born-Digital Photographs.” During this session, panelists Kristen Yarmey, Ed Busch, Chris Prom, Molly Tighe, and Gregory Wiedeman discussed a variety of steps they’ve taken to answer the question “What next?” following the (physical or digital) delivery of born-digital campus photographs to their repositories. I listened intently as Wiedeman recounted how he has employed the API of his campus’ chosen cloud-based online public photo database (SmugMug) to automate the description of born-digital campus photographs at large scale. By reusing the existing photographer-generated descriptive metadata stored in SmugMug, Wiedeman’s campus photographs “describe themselves.” This struck a chord with me as I look forward to my own institution’s upcoming National Digital Stewardship Residency project “Large-Scale Digital Stewardship: Preserving Johns Hopkins University’s Born-Digital Visual History.” But, I wondered, could a similar method be employed to automate appraisal?

As the formal portion of the ERS business meeting concluded, the Section broke into several unconference-style small group discussions. Inspired by the above, I volunteered to lead one on potential methods for automating the appraisal of born-digital records. About 20 ERS members took part, and breakout participant Tammi Kim kept discussion notes. As is often the case, our conversation occasionally deviated from the primary topic of appraisal, but even these tangents proved fruitful. Some of the topics discussed and questions raised include:

  • The differences and distinctions between born-digital appraisal and weeding. Is the goal of minimizing the total size of digital records ingested (say, reducing 50TB of born-digital campus photographs to 10TB) analogous to actually doing appraisal on these records?
  • Could the type of facial recognition software discussed in session 201 be used not only for description purposes, but also to identify VIPs and other photographic content that would inform appraisal decisions?
  • If the record’s creator (say, a campus photographer) assigned rights or permissions metadata to a digital object, might that rights metadata be employed for appraisal in an MPLP-like fashion?
  • What are the differences between photographic and text-based digital records? Is automated, machine-actionable appraisal more likely to succeed with one type of record than another? (For example, it is easier to search for text in word-processing documents and OCRed PDFs than it is to “search” in photographs.)
  • How can “micro-tools” like ArchiveFinder (product mentioned, but I cannot locate a GitHub page) and FileAnalyzer help with the appraisal of large, complex directories of digital files? Additionally, while tools like ExifTool can read, write, and edit embedded technical metadata, how useful is technical metadata to appraisal decisions?
  • How might content creators be brought into appraisal decisions after content has been transferred to a repository? Can we ask creators to enhance or add metadata after the fact?
  • Where does appraisal actually fit in with processing workflows, especially when working with larger files like video and disk images? How do you manage the need for increased storage even at the appraisal stage?
  • What “traditional” approaches to analog appraisal do not necessarily apply to digital? Where does potential future use of records fit in with born-digital appraisal decisions?
  • Are born-digital archives even financially or ecologically sustainable? Are we building the Tower of Babel? What about server farms and the offset of dirty fuels?
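
To make the rights-metadata question above a little more concrete, here is a minimal Python sketch of what a first-pass, MPLP-like sort might look like. Nothing in it comes from the breakout discussion itself: the `Record` fields, the `CLEARED_RIGHTS` and `PRIORITY_KEYWORDS` values, and the three triage labels are all hypothetical, and the output is meant as a suggestion for an archivist to review, not an appraisal decision.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# Hypothetical rule sets; a real repository would define its own.
CLEARED_RIGHTS = {"university-owned", "public"}
PRIORITY_KEYWORDS = {"commencement", "president"}


@dataclass
class Record:
    """A digital object plus the metadata its creator supplied (assumed fields)."""
    path: str
    rights: str = ""
    keywords: List[str] = field(default_factory=list)


def triage(record: Record) -> str:
    """Suggest a first-pass appraisal bucket from creator-supplied metadata."""
    rights = record.rights.strip().lower()
    keywords = {k.strip().lower() for k in record.keywords}
    if not rights:
        # No rights statement at all: a person needs to look at this one.
        return "needs_review"
    if rights in CLEARED_RIGHTS and keywords & PRIORITY_KEYWORDS:
        return "likely_retain"
    return "low_priority"


def summarize(records: List[Record]) -> Counter:
    """Count how many records fall into each suggested bucket."""
    return Counter(triage(r) for r in records)


if __name__ == "__main__":
    sample = [
        Record("img_0001.jpg", "University-Owned", ["Commencement", "crowd"]),
        Record("img_0002.jpg", "", ["president"]),
        Record("img_0003.jpg", "photographer-retained", ["landscape"]),
    ]
    for bucket, count in summarize(sample).items():
        print(bucket, count)
```

A sort like this only reduces the pile an archivist has to look at. Whether that counts as appraisal or merely as weeding is exactly the distinction raised in the first question above.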

I encourage anyone who attended this discussion to add to this post and/or correct any of my poor-memory-induced misstatements above by commenting below. Similarly, whether you attended the breakout or not, let’s continue this conversation in the comments section!

Lora Davis is Digital Archivist at Johns Hopkins University, where she is tasked with creating, documenting, and managing workflows for acquiring, describing, processing, preserving, and providing access to born‐digital materials. Prior to her appointment at JHU in January 2016, Lora worked at Colgate University and the University of Delaware.

 


3 thoughts on “Pathways to Automated Appraisal for Born-Digital Records: An SAA 2016 ERS Breakout Discussion Recap”

    • Lora Davis October 21, 2016 / 5:19 pm

      Fantastic! Thanks for the addition! I knew I was likely failing to find something hidden in somewhat plain sight.


  1. Leisa Gibbons October 21, 2016 / 7:00 pm

    Great blog post Lora! I recently started exploring appraisal from a data (extraction and network mapping) perspective thinking about design of tools and better understanding of sustainability. I am so glad to hear others are thinking and talking about it! The ideas in the blog are really exciting and it would be great to explore how they might link as well – plus how my research fits in. I applied to the computational archival science workshop coming up in December and hope to join them to learn more about this area.

