By Matthew Farrell
This post is the eleventh in a bloggERS series about access to born-digital materials.
On January 15, 2016, the BitCurator Consortium (BCC) held its second annual User Forum at the Louis Round Wilson Library at UNC-Chapel Hill. Representatives from BCC member institutions joined non-member archivists and librarians engaged in digital archival work for a day of discussion regarding born-digital archives and the application of digital forensics to archival practice. Panel descriptions and links to public group notes can be found on the BCC website. A longer recap and reflection of the day is forthcoming on the BCC website.
Participants in the User Forum came from varied contexts: some hold digital archivist job titles, others are researchers active in the development of digital forensics tools, and still others are professionals from curatorial and administrative backgrounds. A major goal of the program committee was to make the User Forum engaging for participants from any background. A breakout session titled “Where Should Access Happen?” leveraged the variety in participants particularly well. The participants broke into four groups, each discussing access to born-digital materials through a different lens: (1) the environment and/or tools that should be available to a researcher in the reading room, (2) what sort of staff resources should be considered, (3) where and how access should be facilitated, and (4) how much and what kinds of metadata should be made available to users prior to and during a research visit.
The group notes are here for posterity. What follows are a few themes that came up across multiple groups.
The group discussing the environment and tools that could be available to researchers considered the amount of control exerted over both collection materials and the access environment. For reading room access, the default tends to be strict control over materials due to the nature of the content. That default is reflected in another group’s discussion of each participant’s current reading room environment: representatives of Duke University, Penn State University, North Carolina State University, Washington, and the University of Virginia all reported the use of off-network access terminals with controls limiting application access and restricting copying and/or printing. There appears to be an all-or-nothing approach to access. If a set of materials has some necessity for access controls (restricted or sensitive info, rights issues, etc.), access is only available in a heavily mediated environment. On the flip side are those materials that have no restrictions at all, which can be (and sometimes are) exposed online.
Further, determining what sort of environment and what toolset to provide researchers in a reading room is hindered by the small number of researchers currently requesting electronic collection materials. The low number of requests is almost certainly affected by the obscurity of those materials in finding aids and catalog records. Without a history of requests for access, it is impossible to determine with any certainty what researchers want to do with collection materials in a reading room environment. Three groups discussed exposing metadata about digital objects via finding aids as fundamental to offering access in any meaningful way. The group discussing metadata and processing touched on the balance between processing a set of digital objects at a minimum level to make them available and offering the maximum possible amount of metadata about those materials. The discussion generally accepted that over-processing of materials could happen. Though it is unclear what “over-processing” means for a given collection, one potential indicator is a processing archivist making assumptions about what researchers will want to see and do with materials.
Over-processing materials, coupled with heavy restrictions on access environments, leaves institutions at risk of providing researchers with glorified e-readers. Such access does little to support digital humanities research. The group discussing staffing for access described Emory’s use of Voyant with a poet’s collection of electronic text in a series of instruction sessions (though the project was not officially part of the program at the User Forum, I’m pretty sure Emory archivists would be happy to provide more information). Another group discussed basic tools that could be useful for many types of materials, including versatile viewing applications such as Quickview Plus as well as hex editors. Providing a hex editor to researchers was supported by two independent anecdotes. More advanced tools will likely vary depending on collections, though allowing a researcher to run file identification software, or to have access to the metadata output of such software, is potentially of use. For these scenarios, should repositories be responsible for vetting additional tools and applications? One participant suggested that researchers submit code or tools they wish to run across a data set in advance of their visit, for inspection by repository staff.
While this is only a small window into the User Forum, I think there are at least a couple of areas discussed above that warrant further immediate work. I’m curious to get a picture of what metadata, exactly, BCC member institutions are exposing in their finding aids, and whether there is room for creating a common set of guidelines for including forensic metadata in descriptive systems. Hopefully more information will come soon on that front. There is a wealth of engaging thoughts in the notes for this session, in the rest of the User Forum’s group notes, and on the BCC website. Any questions about the BitCurator Consortium can be sent to me or Sam Meister.
Matthew Farrell is digital archivist at the David M. Rubenstein Rare Book & Manuscript Library at Duke University and member of the program committee for the 2016 BitCurator User Forum.