By Dorothy Waugh
This post is the seventh in a bloggERS series about access to born-digital materials.
As a digital archivist, I am always looking for ways to streamline my processing workflows because, when faced with a multi-gigabyte hard drive or a pile of aging 3.5” floppy disks, time is rarely on my side. The diverse nature of our collections, however, can be a hindrance when trying to build workflows based on efficiency: in spite of our best efforts, there are frequently cases where a collection’s particular characteristics demand greater levels of resources and time. During the past year, archivists at the Stuart A. Rose Manuscript, Archives, and Rare Book Library have developed a tiered approach to processing and access designed to flex in response to a collection’s limitations without obscuring its salient characteristics.
First, unprocessed collections are evaluated based on four criteria:
- Quality of data, defined by us as the totality, scope, and viability of the acquired material. Our focus here is on the completeness of the acquisition (an entire hard drive, for example, provides more content and context than a directory of files), the number of years spanned, and the extent to which data can be rendered using modern software.
- Authenticity of data, which refers to our ability to establish that it was indeed the donor that created, used, or managed the digital content.
- The number of donor restrictions and any concern regarding intellectual property rights.
- The extent to which we anticipate particularly high levels of use. Crucially, we want to avoid postponing questions about access until processing is complete, instead evaluating expected use up front so that access will inform processing from the very start.
Based on this evaluation, we assign collections to one of three processing tiers:
Tier One: Low complexity, tool-driven
Collections assigned to this tier will primarily include familiar, homogeneous file formats and will have few donor restrictions or intellectual property rights concerns. The limited complexity of the data will allow for largely tool-driven processing.
Tier Two: Average complexity, combination of tool-driven and manual approaches
Collections assigned to this tier will primarily include familiar file formats, although there may be instances of more challenging file formats. Some donor restrictions may apply.
Tier Three: High complexity, high manual effort
Collections assigned to this tier will include a large number of heterogeneous and challenging file formats. There may be a high level of donor restrictions. The scope of the collection and anticipated high levels of use may demand a more involved approach to arrangement, description, and access.
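As a rough illustration only, the tier descriptions above could be recorded in a small data structure. The sketch below is hypothetical: the field names, the three-point `Level` scale, and the rule of taking the highest rating are all assumptions made for illustration. They are not part of our workflow, which relies on archivists' judgment rather than automated scoring.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point rating scale (an assumption, not from the post)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Assessment:
    """Hypothetical record of the factors named in the tier descriptions."""
    format_complexity: Level  # how heterogeneous or challenging the file formats are
    restrictions: Level       # donor restrictions and intellectual property concerns
    anticipated_use: Level    # expected level of researcher use


def suggest_tier(assessment: Assessment) -> int:
    """Suggest a starting tier (1-3) for discussion.

    Assumed rule: the most demanding factor sets the tier. This is a
    conversation starter, not a substitute for an archivist's decision.
    """
    return int(max(
        assessment.format_complexity,
        assessment.restrictions,
        assessment.anticipated_use,
    ))


# Example: familiar formats and few restrictions, but high anticipated use.
example = Assessment(Level.LOW, Level.LOW, Level.HIGH)
print(suggest_tier(example))
```

A record like this is mainly useful as a shared vocabulary for planning; the suggested tier would still need to be reviewed against the collection itself.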
This diagram illustrates the application of this approach to born-digital materials acquired as part of Lucille Clifton’s papers:
Information in the three left-hand columns provides a broad assessment of the collection’s born-digital material based on our four criteria. This assessment is then used to determine that material should be processed as a tier three collection, meaning that it is a complex collection and will likely require high manual effort during processing.
At the same time, our assessment guides decision-making about access. We’ve very loosely labelled the three levels at which we make collections available to researchers as standard, emulation only, and optimal, although it is important to note that we define these levels based not only on what we are currently able to do but also on what we hope to be able to implement in the future. As a result, consideration of these levels is as much about processing in a way that does not preclude improved future models of access as it is about providing access right now. So, while “optimal” is perhaps more aspirational than optimal as it currently stands, these are collections that we have deemed good candidates for a more advanced access point once we have the necessary resources in place. We determined, based on our assessment, that the Lucille Clifton collection warranted an “optimal” approach to access.
Rather than automating our decision-making, the initial assessment and subsequent assignment of a processing tier and access level provide a structured vocabulary by which recurring project considerations can be discussed, and a comprehensive rubric by which new projects can be prioritized and planned. Once identified, these tiers influence decision-making at almost every stage of our born-digital workflows, including how processed collections are made available to our researchers. As we continue to apply this approach, we also hope to track our work more closely so that we can more accurately allocate the time and resources required to process a collection at each tier.
Note: This blog post borrows in part from a forthcoming article, “Flexible Processing and Diverse Collections: A Tiered Approach to Delivering Born Digital Archives,” written collaboratively by Dorothy Waugh, Elizabeth Roke, and Erika Farr, to be published in the journal Archives and Records. The article will offer additional information on how the tiered approach described in this post has been applied in practice at Emory.
Dorothy Waugh is Digital Archivist at the Stuart A. Rose Manuscript, Archives, and Rare Book Library at Emory University, where she is responsible for the management of born-digital manuscript and archival material.