Sarah Barsness: Digital Archivist’s Manual

When I joined the staff of the Minnesota Historical Society, the institution was in the process of morphing from a series of special projects to a programmatic approach to digital collecting.  The two archivists who held my position before me laid an excellent foundation for a sustainable program, and I was tasked with continuing the work they started.  I decided that I ought to kick things off by creating a manual for my job, to be written in between collections work and other projects.  

As you might imagine, the process was painfully slow.  Each time I began to think I was nearing the end, something would change.  Staff retired, departments were reorganized, or systems were upgraded, and I began to despair of ever creating an actual finished product.  My many drafts also chart the change and growth of our communities of practice; the constantly evolving work of my fellow archivists, conservationists, and digital specialists of all stripes has profoundly shaped each version of this document.

Despite all this change, each iteration of this manual has expanded upon a few core ideas that have remained central to my approach:

  1. What if we treat digital collections just like we do any other collection that has preservation concerns?  What does that look like?  What tools and skills will be needed to support that approach?
  2. What if we assume that most details of the work will change regularly? Staffing, systems, workloads, formats collected, personal skill levels, even department organizations all change.  How can we structure this work to be maximally flexible?
  3. How can we make a framework to guide decisions for each collection, and to help the institution navigate more fundamental changes to the program over time?

The manual I began nearly seven years ago still isn’t finished in any traditional sense, but I’ve decided that’s how it ought to be.  Change is constant, and just because something isn’t static doesn’t mean it’s not ready to be used or shared.  I hope that you find this document useful, and if you have feedback, ideas, or your own manuals that you’d like to share with me (finished or not), I sincerely hope you do.

Sarah Barsness has been the Digital Collections Archivist at the Minnesota Historical Society for nearly 7 years, where she works with staff across the institution to process, store, preserve, and provide access to digital collections.  Sarah has worked previously at the Wisconsin Historical Society and at Cargill Corporate Archives.  For tweets about digital archives and Dungeons & Dragons, follow her on Twitter @SarahRBarsness.

Perspectives and Experiences in Collection Development with COVID-19 Content

By Jillian Lohndorf and Sylvie Rollason-Cass

For some, the switch to remote work meant that web archiving became a much bigger part of their day-to-day. Not only were people looking for digital projects to focus on, but new web-based information and content related to COVID-19 was being generated quickly. Nearly every government, organization, and company had information to share, and librarians and archivists were working to document it as quickly as it was being created. The team at Archive-It, the web archiving service of the Internet Archive, helped facilitate the increased activity by offering steeply discounted data and providing opportunities for users to discuss strategies, challenges, and outcomes, and to share experiences more generally. These opportunities came in the form of regular community calls, as well as presentations and discussion groups related to COVID-19 collecting at the annual (and newly virtual) Archive-It Partner meeting. Information from these discussions, as well as the collecting activity itself, has shone a spotlight on the many lessons that Archive-It users have already learned about web archive collection development, and the many ways they continue to innovate.

The International Internet Preservation Consortium’s June 2020 capture of COVID-19 data reported from Palestine

The way an organization approached collection development was often dictated by already-established processes, which, in turn, are often dictated by larger goals and requirements.  Some of these organizational decisions were influenced by the resources available, most notably data and staff time.  Many organizations have created new, discrete collections for their COVID-19 content.  The idea of creating thematic collections in response to an event is not new, as many organizations have created them to document web content related to what are described as “spontaneous events,” which in the past have included natural events and societal tragedies.  Other organizations chose to integrate new COVID-19 content into existing collections, contextualized alongside other regularly captured content from their institutions and communities.  Sometimes the content went into both new and existing collections: organizations captured institutional COVID-19 responses in established collections while also creating separate collections with an exclusive COVID-19 focus.  For many, collection development decisions went hand-in-hand with descriptive practices, such as building out metadata using controlled language as a way to facilitate access to information.

Metadata record from the Pennsylvania Horticultural Society COVID-19 Collection

The collected content was often a reflection of both institution and community, and it spanned a range of formats, pushing some organizations into new territory.  Social media, from Facebook to Twitter, was of particular importance, and for many, capturing COVID-19 dashboards and maps became important as well.  Some organizations had either internal or public calls for website nominations, providing an opportunity to engage with others to build inclusive collections.  With the amount of available information, and the rate of change, no organization was immune from making difficult decisions on what to capture.

The pandemic is far from over, and even when it is, we are likely to see repercussions for a long time to come. For many, collecting COVID-19 content will be a long-term project with new twists and turns along the way. As of this writing, Archive-It users have collected over 40TB of web content related to COVID-19, and as that amount of data continues to rise, the lessons of web archiving will as well.

Jillian Lohndorf joined the Internet Archive in 2016 as an Archive-It Web Archivist. Previously, she worked in the Archives and Special Collections at DePaul University and Rotary International, and as Web Services Librarian for The Chicago School of Professional Psychology. She holds a Master of Science in Library and Information Science from the University of Illinois at Urbana-Champaign.

Sylvie Rollason-Cass is the Senior Web Archivist for Archive-It where she has been supporting the web archiving community for the past 8 years. She holds a Master of Science in Library and Information Science from the University of Illinois at Urbana-Champaign.