Modernization of the A.D. Hopkins collection at the Smithsonian Institution Department of Entomology

by Teresa Boyd

This is the second post in the BloggERS Embedded Series.

The Smithsonian Institution’s Department of Entomology has recently finished phase one of its multiyear project to digitize its portion of the A.D. Hopkins notes and records system, which includes about 100 years of observations, both in the field and in the lab. A.D. Hopkins developed the system in order to collect biological and natural history notes about species and the environments they were found in, as well as about the collectors and locations of collection. This collection was adopted by the United States Department of Agriculture (USDA) when Hopkins was named Chief of Forest Insect Investigations, though Hopkins is known to have developed and used the system while working at West Virginia University in the late 1800s. The Smithsonian Institution’s Department of Entomology has historically worked very closely with the USDA and therefore obtained the largest portion of the Hopkins card file over the years.


It was important to Hopkins to collect as much information as possible about specimens because he felt it was the quickest way to understand the situation of potential pests and to find solutions to harmfully invasive species. Part of Hopkins’ methodology was to encourage average citizens to send in specimens and observations to the USDA, the Smithsonian, or one of the forest experiment stations that were located throughout the United States, which were then incorporated into the Hopkins note system. Some of these notes are also documentation about lab research such as specimen rearing, specimen transfers, and communications between lab and field. A few of these notes are also cross-referenced, so often a lab note can be traced back to a field note, making it easier for researchers to quickly see the correlation between field and lab (work that was often done by different individuals). The numbers on each individual card within the A.D. Hopkins system correlate to specimens that are housed in various locations. Traditionally a researcher or scientist would ask for the notes that were associated with a specimen number. By creating an online repository of the notes, the Smithsonian hopes to give researchers new tools to expand their work and perhaps find new ways to use the data that has been collected by past researchers and scientists.

I have been working on this project as a lone archivist for the past 5 years, scanning the card file portion of the collection, and am now working on preparing these scans for a website that will be built specifically for this type of collection. The Smithsonian Institution’s Department of Entomology hopes to begin sending the scans of the cards to the Smithsonian Transcription Center soon to crowdsource the transcriptions. This cuts down on the time it takes to transcribe the older material, which is all handwritten. I will be adding the transcribed notes to the digitized card on the website so that researchers will be able to go to the website, look up a specific card, and see both the original scan and the transcribed notes, making it easy for anyone to use the information contained in the Hopkins collection. Additionally, these scans will be incorporated into the Department of Entomology’s collections database by matching specimens to their unique card numbers, thereby giving researchers the complete picture.

The Smithsonian Institution’s work to digitize and make their A.D. Hopkins collection publicly available is not the first of its kind; the USDA had previously accomplished this in the 1980s, and has made their documents available on the USDA website, HUSSI. There is hope that in the future other institutions that have their own portions of the A.D. Hopkins notes and records system will also begin to digitize and make them available online, supplementing the Smithsonian and USDA efforts to make this invaluable data available to researchers. 

Teresa Boyd is an archivist for the Department of State and a volunteer archivist for the Smithsonian Institution’s Department of Entomology. She holds a degree in Library and Information Science from the University of Arizona.


Capturing Common Ground

by Leslie Matthaei

This is the first post in the BloggERS Embedded Series.

Every Tuesday I am asked the same question: “T Coast today?” T Coast, or Tortilla Coast, is the preferred lunch location for some of the photographers that occupy the Photography Branch, in the Curator Division, of the federal agency Architect of the Capitol. The agency oversees the maintenance of buildings and landscapes on Capitol Hill, including the Library of Congress buildings, Supreme Court, United States Botanic Garden, House and Senate Office Buildings, Capitol Power Plant, and, of course, the United States Capitol. I have joined the professional photographers at T Coast for more than a dozen lunches now. I am here for the Taco Salad and comradery but mostly I am here to listen. And to ask questions. I am an embedded archivist.

The Library of Congress Thomas Jefferson Building
United States Capitol Building
United States Botanic Garden – Bartholdi Park

I use my time at the T Coast lunch table to get to know the photographers and for them to get to know me. I discovered very quickly that the photographers and I have a lot in common. For example, the photographers are often assigned to shoot a long-term project (Collection) which may have multiple phases (Series), and for each phase, they go out on specific days to shoot (File Units). They cull excess and/or duplicate photographs. And they generally have a tried workflow: they ingest their born-digital objects, edit them in Adobe Lightroom, then upload them to an in-house Digital Asset Management system known as PhotoLightbox. Within PhotoLightbox, they are responsible for defining the security status of an individual image or group of images and providing the descriptive metadata. Tapping into parallel duties has allowed me to bridge potential knowledge gaps in explaining what roles and functions I can provide the branch as a whole.

One rather large knowledge gap is descriptive metadata. To be sure, the photographers in our agency are incredibly busy and in high demand. And they are professionally trained photographers. They see the world through aesthetics. It is not necessarily their job to use PhotoLightbox to help a researcher find images of the East Front extension that occurred in the 1950s, for example. That is my role, and when I query PhotoLightbox, the East Front extension project is represented in multiple ways: EFX, East Front (Plaza) Extension, East Extension, Capitol East Front Extension. You may see where this is going: there is no controlled vocabulary. When, in a staff meeting, I pitched the idea of utilizing controlled vocabularies, they immediately understood the need. Following their lead, the conversation turned to having me develop a data entry template for each of their shoots.

An example of the data entry template.
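
The controlled-vocabulary problem described above can be illustrated with a minimal Python sketch. It maps the four variant labels quoted in this post to a single preferred term, so that one query can retrieve all of them. The preferred term chosen here ("East Front Extension"), the function names, and the mapping table are all hypothetical; nothing below reflects how PhotoLightbox itself works.

```python
from typing import Dict, List, Optional

# Hypothetical lookup table: variant labels (normalized) -> preferred term.
# The variants come from the post; the preferred term is an assumption.
VARIANT_TO_PREFERRED: Dict[str, str] = {
    "efx": "East Front Extension",
    "east front (plaza) extension": "East Front Extension",
    "east extension": "East Front Extension",
    "capitol east front extension": "East Front Extension",
}


def normalize_key(label: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(label.lower().split())


def to_preferred(label: str) -> Optional[str]:
    """Return the preferred term, or None if the label is not in the table."""
    return VARIANT_TO_PREFERRED.get(normalize_key(label))


def group_by_preferred(labels: List[str]) -> Dict[str, List[str]]:
    """Group raw labels under their preferred term; unknowns go under 'UNREVIEWED'."""
    groups: Dict[str, List[str]] = {}
    for label in labels:
        term = to_preferred(label) or "UNREVIEWED"
        groups.setdefault(term, []).append(label)
    return groups
```

Returning `None` for unknown labels, rather than guessing, leaves the decision about new terms with a person, which fits the collaborative approach described in this post.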

I admit now that my first spin through PhotoLightbox revealed a pressing need for controlled vocabularies, among other concerns the database presented. I am the type of person that when I see a problem, I want to fix it immediately. Yet I knew that if my first professional introduction to the photographers was a critique of how unworkable their data entry was and had been over time, I might turn them off immediately. Instead, I went to lunch. I credit the results produced by this particular staff meeting to the time that I put in getting to know the photographers, getting to understand each of their respective workflows, and understanding a little bit about the historic function and purpose of the office within the agency.

I have another half dozen lunches to go before I begin to talk to the photographers about the need for digital preservation of born-digital images over the long term and both of our roles in the surrounding concepts and responsibilities. I have a few more lunches after that to get their assistance in codifying the decisions we are making together into branch policies. I feel confident, however, that I have their complete buy-in for the work that I have been tasked to do in the branch. Instead of seeing me as another staff member making them do something they do not want, I am seen as someone who can help them gain control of and manage their assets in a way that has yet to be done in the branch. I cannot do it alone; I need their help. And some chips and salsa every once in a while.

Leslie Matthaei

Leslie Matthaei is an Archivist in the Photography Branch, Curator Division, Architect of the Capitol. She holds an MLIS from the University of Arizona, and an MA and BA in Media Arts from the University of Arizona.

Partnerships in Advancing Digital Archival Education

by Sohan Shah, Michael J. Kurtz, and Richard Marciano

This is the fourth post in the BloggERS series on Collaborating Beyond the Archival Profession.

The mission of the Digital Curation Innovation Center (DCIC) at the University of Maryland’s iSchool is to integrate archival education with research and technology. The Center does this through innovative instructional design, integrated with student-based project experience. A key element in these projects is forming collaborations with academic, public sector, and industry partners. The DCIC fosters these interdisciplinary partnerships through the use of Big Records and Archival Analytics.

DCIC Lab space at the University of Maryland.

The DCIC works with a wide variety of U.S. and foreign academic research partners. These include, among others, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the University of British Columbia, King’s College London, and the Texas Advanced Computing Center at the University of Texas at Austin. Federal and state agencies who partner by providing access to Big Records collections and their staff expertise include the National Agricultural Library, the National Archives and Records Administration, the National Park Service, the U.S. Holocaust Memorial Museum, and the Maryland State Archives. In addition, the DCIC collaborates with the European Holocaust Research Infrastructure project to provide digital access to Holocaust-era collections documenting cultural looting by the Nazis and subsequent restitution actions. Industry partnerships have involved NetApp and Archival Analytics Solutions.

Students working on a semester-long project with Dr. Richard Marciano, Director, DCIC.

We offer students the opportunity to participate in interdisciplinary digital curation projects with the goal of developing new digital skills and conducting frontline research at the intersection of archives, digital curation, Big Data, and analytics. Projects span justice, human rights, cultural heritage, and cyber-infrastructure themes. Students explore new research opportunities as they work with cutting-edge technology and receive guidance from faculty and staff at the DCIC.

To further digital archival education, DCIC faculty develop courses at the undergraduate and graduate levels that teach digital curation theory and provide experiential learning through team-based digital curation projects. The DCIC has also collaborated with the iSchool to create a Digital Curation for Information Professionals (DCIP) Certificate program designed for working professionals who need training in next generation cloud computing technologies, tools, resources, and best practices to help with the evaluation, selection, and implementation of digital curation solutions. Along these lines, the DCIC will sponsor, with the Archival Educators Section of the Society of American Archivists (SAA), a workshop at the Center on August 13, 2018, immediately prior to the SAA’s Annual Meeting in Washington, D.C. The theme of the workshop is “Integrating Archival Education with Technology and Research.” Further information on the workshop will be forthcoming.

The DCIC seeks to integrate all its educational and research activities by exploring and developing a potentially new trans-discipline, Computational Archival Science (CAS), focused on the computational treatments of archival content. The emergence of CAS follows advances in Computational Social Science, Computational Biology, and Computational Journalism.

For further information about our programs and projects visit our web site at To learn more about CAS, see Information about a student-led Data Challenge, which the DCIC is co-sponsoring, can be accessed at

Sohan Shah

Sohan Shah is a Master’s student at the University of Maryland studying Information Management. His focus is on using research and data analytical techniques to make better business decisions. He holds a Bachelor’s degree in Computer Science from Ramaiah Institute of Technology, India, and has worked for 4 years at Microsoft as a Consultant and then as a Technical Lead prior to joining the University of Maryland. Sohan is working at the DCIC to find innovative ways of integrating data analytics with archival education. He is the co-author of “Building Open-Source Digital Curation Services and Repositories at Scale” and is working on other DCIC initiatives such as the Legacy of Slavery and Japanese American WWII Camps. Sohan is also the President of the Master of Information Management Student Association and initiated University of Maryland’s annual “Data Challenge,” bringing together hundreds of students from different academic backgrounds and class years to work with industry experts and build innovative solutions from real-world datasets.

Dr. Michael J. Kurtz is Associate Director of the Digital Curation Innovation Center in the College of Information Studies at the University of Maryland. Prior to this he worked at the U.S. National Archives and Records Administration for 37 years as a professional archivist, manager, and senior executive, retiring as Assistant Archivist in 2011. He received his doctoral degree in European History from Georgetown University in Washington, D.C. Dr. Kurtz has published extensively in the fields of American history and archival management. His works, among others, include: “The Enhanced ‘International Research Portal for Records Related to Nazi-Era Cultural Property’ Project (IRP2): A Continuing Case Study” (co-author) in Big Data in the Arts and Humanities: Theory and Practice (forthcoming); “Archival Management and Administration,” in Encyclopedia of Library and Information Sciences (Third Edition, 2010); Managing Archival and Manuscript Repositories (2004); and America and the Return of Nazi Contraband: The Recovery of Europe’s Cultural Treasures (2006, Paperback edition 2009).

Dr. Richard Marciano is a professor in the College of Information Studies at the University of Maryland and director of the Digital Curation Innovation Center (DCIC).  Prior to that, he conducted research at the San Diego Supercomputer Center (SDSC) at the University of California San Diego (UCSD) for over a decade with an affiliation in the Division of Social Sciences in the Urban Studies and Planning program.  His research interests center on digital preservation, sustainable archives, cyberinfrastructure, and big data.  He is also the 2017 recipient of the Emmett Leahy Award for achievements in records and information management. With partners from KCL, UBC, TACC, and NARA, he has launched a Computational Archival Science (CAS) initiative to explore the opportunities and challenges of applying computational treatments to archival and cultural content. He holds degrees in Avionics and Electrical Engineering, a Master’s and Ph.D. in Computer Science from the University of Iowa, and conducted a Postdoc in Computational Geography.

Call for Contributions: Embedded Series

The term embedded can mean a lot of things in the digital archives world: an archivist embedded in an architecture firm, metadata embedded into a digital image, archival work embedded into the record creation workflow. We want to hear what the word embedded means to you and your work, how it guides your processes and workflows, or how it presents challenges in your work with electronic records.

A few potential topics and themes for posts:

  • Archivists embedded in a non-archival unit
  • Embedded metadata stories of success or failure
  • Embedding archival work in external workflows
  • Non-archivists embedded in archival units


Writing for bloggERS! Embedded Series

  • We encourage visual representations: Posts can include or largely consist of comics, flowcharts, a series of memes, etc!
  • Written content should be 600-800 words in length
  • Write posts for a wide audience: anyone who stewards, studies, or has an interest in digital archives and electronic records, both within and beyond SAA
  • Align with other editorial guidelines as outlined in the bloggERS! guidelines for writers.

Posts for this series will start in March, so let us know if you are interested in contributing by sending an email to!

Teaching Personal Digital Archiving through Community Digitization

By Maggie Schreiner

This is the third post in the BloggERS series on Collaborating Beyond the Archival Profession.

Queens Memory is an outreach-based community archiving program of the Queens Library and Queens College, CUNY that collects and makes accessible oral histories, photographs, and other personal records documenting contemporary life in the borough of Queens in New York City. Queens Memory hosts public scanning events where community members can have their photographs, documents and memorabilia digitized and added to the Queens Memory digital collections. Participants leave the events with both their original material and a flash drive of digital surrogates created during the event. The flash drive is the most tangible outcome of their participation in Queens Memory. However, many donors do not have the necessary digital literacy skills for the flash drive to be a meaningful takeaway from the event. In fact, some donors do not know what a flash drive is, or how to connect it to a computer.

Queens Memory community digitization event in Forest Hills.

It was apparent that Queens Memory needed to incorporate digital literacy education, and personal digital archiving (PDA) was a natural fit for the community scanning events. Over the course of several months in 2015-2016, the Queens Memory team developed several teaching tools, iterating from a simple handout, to a brochure, to a two-hour digitization training.


“What’s on My Thumb Drive?” Handout

The first tool Queens Memory developed to integrate PDA education into the community scanning events was a simple handout explaining what files are on the flash drives that donors receive. The Queens Memory team provides donors with both TIFF and JPEG versions of each digital surrogate they create. The small handout, given with the flash drive, explains what types of files are on the flash drive, and suggests how each file type is best used. Although this information is very simple, it gives participants a starting place for understanding the digital material on their flash drives, and potentially also on their home computers.

Small handout explaining contents of Queens Memory thumb drive.

“Preserving Your Digital Memories” Brochure

Building on this simple handout, Queens Memory staff created a brochure to give participants a more comprehensive resource to learn how to care for both the digital surrogates created during the community scanning events and any digital files donors may already have. The brochure attempted to balance accessibility with robust information and professional standards. In creating the text for the brochure, staff avoided unnecessary technical language; in retrospect, accessible language could have been emphasized even more.

The brochure begins with an overview of the types of digital content that people might have, and how that material is uniquely fragile. The main threats to the longevity of digital material are outlined, including format obsolescence and the failure of computers and hard drives. The brochure then introduces the idea that digital content requires care, and provides a step-by-step guide for digital archiving.

Centerfold of Queens Memory Personal Digital Archiving brochure.


Digitization Training Sessions

Another way that Queens Memory shares this information with community members is through more in-depth digitization trainings. The trainings focus on the digitization process and workflow, and include technical explanations of resolution, bit-depth, color space, compression and file format. Although Queens Memory provides a list of technical standards that are both professional and responsive to the reality of the situation and resources available, it is also very important for participants to learn how and why to choose particular standards and settings. When community members learn about the technology behind the process of digitization, they are empowered to make their own decisions about best practices as well as apply this knowledge to other scenarios. Additionally, this portion of the training proved to be a great opportunity to talk about the historical value of the collections, and how others might interact with these materials in the future.
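
To show how resolution, bit-depth, and color space interact, here is a small Python sketch that estimates the size of the raw pixel data in a scan. The function name and the sample values (a 4 x 6 inch print at 600 ppi) are illustrative assumptions, not the settings Queens Memory recommends. The result covers uncompressed pixel data only; real TIFF or JPEG files will differ because of headers, metadata, and compression.

```python
def uncompressed_size_bytes(
    width_in: float,
    height_in: float,
    ppi: int,
    bits_per_channel: int = 8,
    channels: int = 3,
) -> int:
    """Estimate raw pixel-data size for a scan.

    width_in, height_in: physical size of the original, in inches
    ppi: scanning resolution in pixels per inch
    bits_per_channel: bit depth per color channel (e.g. 8 or 16)
    channels: 3 for RGB color, 1 for grayscale
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * channels * bits_per_channel
    return total_bits // 8


# Doubling the bit depth doubles the size; doubling the resolution quadruples it.
size_8bit = uncompressed_size_bytes(4, 6, 600)
size_16bit = uncompressed_size_bytes(4, 6, 600, bits_per_channel=16)
print(f"8-bit RGB:  {size_8bit / 1_000_000:.2f} MB")
print(f"16-bit RGB: {size_16bit / 1_000_000:.2f} MB")
```

Working through the arithmetic this way makes the trade-off concrete: higher settings capture more detail, but storage needs grow quickly, which is one reason to choose standards deliberately rather than by default.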

Breezy Point Historical Society at a digitization training with Queens Memory.

The PDA teaching tools employed by Queens Memory at community scanning events and digitization trainings extend the reach of the program’s community archiving focus. As the historical record becomes increasingly born-digital, it is imperative that Queens Memory donors gain the skills and knowledge to become stewards of the digital content that documents life in the borough of Queens, NY.

This post is an edited version of a book chapter originally written with Natalie Milbrodt. Read the full chapter “Digitizing Memories and Teaching Digital Literacy in Queens, NY” in The Complete Guide to Personal Digital Archiving, edited by Brianna Marshall.

Headshot of Maggie Schreiner

Maggie Schreiner is a Project Archivist at New York University. Previously, she was the Outreach Coordinator for Queens Memory and member of the Culture in Transit team. She holds an MA in Archives and Public History from New York University.


Digital History Station at the Capital Area District Libraries

by Heidi Butler

This is the second post in the BloggERS series on Collaborating Beyond the Archival Profession.

Inspired by the DC Public Library’s Memory Lab, the Brooklyn and Queens Public Libraries’ Culture in Transit Project, and the Kalamazoo Public Library’s Hub, in 2016 the Capital Area District Libraries (CADL) launched a pilot Digital History Station. This workstation differs from our standard patron computers in that it has many advanced capabilities for working with both old and new media. Patrons can use it to read data off older disk media, convert cassette audio or VHS to digital formats, or create new content. The station also allows for editing with a suite of programs and tools.

Digital History Station
The Digital History Station in CADL’s Local History Room. Photo by Heidi Butler.
Sanus Rack
The Digital History Station component rack, with video and audio decks, and storage for all sorts of card readers, recording devices, transparency viewers, cleaning tools, and other gear. Photo by Heidi Butler.

Hardware includes an iMac, an Epson v700 scanner, a Toshiba VHS-DVD deck with an Elgato video capture device, a Tascam cassette-CD deck, and more. For digitizing and editing, we provide the full Adobe Creative Cloud suite, as well as SilverFast 8 for scanning, and the standard iLife Mac programs. In 2018 we added Final Cut Pro to our software offerings. We have a Canon Rebel T6i camera with various lenses, a multifunction tripod, a Polaroid 3D photography cube for photographing objects or creating video, and a Zoom H2Next digital audio recorder. Due to demand, we also recently placed a Marantz cassette recorder into our Library of Things circulating collection. We are beginning to build a small collection of obsolete equipment such as mini-DV camcorders to facilitate more access to older materials.

The Digital History Station has several internal benefits as well. When it’s not in use by patrons, we are able to use it to access archival material in the library’s Local History collections or convert it to digital formats. Because Local History is a part of CADL’s Outreach department, we collaborate with coworkers on things like 3D photography for Etsy/eBay how-to classes, or workshops for seniors on personal digital archiving. We also take the handheld digital recorder and camera to family library events and record brief oral histories. Finally, we have conversations with every Digital History patron about what they are working on to determine if a copy of their materials would be a suitable addition to the Local History collections. This has been beneficial as we continue building a collection of locally produced films and music. Recent accessions include three hip hop albums by Lansing artists, and several community theater productions on video from the community of Stockbridge, Michigan.

We ask patrons to complete an application to use the station, talk through their projects to be sure we can accommodate what they wish to do, and then schedule their visits in blocks of up to three hours at a time. Local History staff are not experts in everything the station offers, but we have identified colleagues elsewhere in the CADL system with relevant skills who can help when needed. We also recommend the library’s subscription to to patrons who want to build their knowledge of various digital practices. As of early 2018, the demand for the Digital History Station is moderate but expanding.

Heidi Butler (selfie)

Heidi Butler is the Local History Specialist at CADL. She previously served as archivist at Zayed University (Dubai, United Arab Emirates), Kalamazoo College (Mich.), Rush University Medical Center (Chicago, Ill.), and the Wichita Public Library (Kans.). She received her MSLS from the University of North Carolina-Chapel Hill in 2000.


Iterative Collaboration at LC Labs

by Jaime Mears

This is the first post in the BloggERS series on Collaborating Beyond the Archival Profession.

Four women around a computer, showing the LC Labs homepage
The LC Labs Team – Abigail Potter, Jaime Mears, Meghan Ferriter, Kate Zwaard (left to right)

The LC Labs team works to increase the impact of Library of Congress digital collections. This includes not only the 2,500,000+ items available on, but also on-site only content and derivative content, such as our 25 million MARC records. We want to increase the variety of ways users engage with our content, and we get there through experimenting and collaboration, ideally setting up feedback loops whereby the work of our Library of Congress colleagues and our users can inform each other. From hands-on approaches such as crowdsourcing and tutorials for using our API, to more traditional avenues into the content such as podcasts, blog posts and works of art, we work with folks to interpret our collections in transformative ways for broader audiences.

Man in front of filing cabinets looks through Stereoscope
Innovator-in-Residence Jer Thorp visits the Library of Congress Prints & Photographs division

Our Innovator in Residence program places an individual for three months to a year with Library of Congress staff and collections to create something inspiring for the public domain. The data artist Jer Thorp is our current innovator, and it’s been a blast over the last couple months showing him what we love about this place. As a part of his residency, Jer is producing a podcast called “Artist in the Archive,” exploring both stories found in our content and the story of the content itself: how it gets here, and how it’s maintained, enriched, and shared. Listeners also get to meet the people doing the work! He’s also exploring Library of Congress data sets (such as using network analysis to identify polymaths in our MARC records), and will create a capstone work.

Inspired by the National Endowment for the Humanities’ Chronicling America Data Challenge and the excellent work we see coming from the data journalism field to make data meaningful, we are running a Congressional Data Challenge in partnership with the Congressional Research Service. This competition asks participants to leverage legislative data sets on and other platforms to develop digital projects that analyze, interpret or share congressional data in user-friendly ways. Anyone can apply, and we’re even awarding $5000 for the first prize, and $1000 for the best high school class entry! We’ll also work with the winners post-challenge to host their product on our labs site.

Piloting applications with the public is our most ambitious effort at collaboration to date. Right now, we’re running a crowdsourcing application built on Scribe called Beyond Words, where website visitors can identify, transcribe, or validate images from WWI era historic newspapers in our Chronicling America collection. The beauty of this application is that it also generates a viewable gallery of these images and a public domain data set for download and use in classrooms, research, or perhaps generating further applications. Not only do members of the public contribute to the gallery and data set (we’ve had 2240 volunteers so far and 685 completed images),  but the data we gather from feedback and metrics from Beyond Words users inform application updates and the design of our upcoming transcription platform (stay tuned!).

Events allow us to create dialogues around issues we care about, widen our network of peers, and work closely with new partners. For the past two years, we’ve hosted a Collections as Data annual symposium investigating the computational readiness, impact, and ethics of library content served as data sets.  Upcoming events include leading the local planning committee for Code4Lib 2018 and co-hosting the 2018 International Image Interoperability Framework (IIIF) Conference with the Smithsonian and Folger Shakespeare Library.

To see more of what we’re up to, go to our site at and follow us on Twitter @LC-Labs. Let’s work together!

Jaime Mears jame@loc.gov

Jaime Mears is an Innovation Specialist with the National Digital Initiatives Division at the Library of Congress. She is a former National Digital Stewardship Resident and holds an MLS from the University of Maryland.

User Centered Collaboration for Archival Discovery (Part 2)

By the SAA 2017 Session 403 Team: James Bullen, Alison Clemens, Wendy Hagenmaier, Adriane Hanson, Emilie Hardman, Carrie Hintz, Mark Matienzo, Jessica Meyerson, Amanda Pellerin, Susan Pyzynski, Mike Shallcross, Seth Shaw, Sally Vermaaten, Tim Walsh

Insights from Discussion Groups

  • In discussion group 1, we went through a few different discussion areas. We had an interesting conversation about how to navigate using a user-centered approach. We talked about a) how to balance changing user needs with professional practices; b) the difficulty of being user-centered within an MPLP paradigm; c) the contrasting difficulty of being *too* detailed in our description, and that getting in the way of discovery; and d) shifting our reference model so that public services staff are more facile with using finding aids and assisting users in navigating minimal description.
  • Discussion group 4 began by discussing the range of concerns relating to the state of discovery at the participants’ institutions. Everyone recognized that their current discovery systems for archives were not ideal, and there was a common interest across the group in centralizing discovery within an institution or consortium. The group also spent a significant amount of time discussing known obstacles to implementing a new discovery system, including system integration, the reality that information about archival materials is spread across multiple platforms, and the fact that abrupt transitions across platforms are jarring for users. We also discussed challenges to undertaking user-centered design and collaborative work, which included barriers related to administrative support, systemic IT issues, lack of knowledge of user experience design methodologies, and lack of resources for these projects.
  • Discussion group 5 began by discussing archival discovery at participants’ institutions. There was a wide range of strategies being used to facilitate archival discovery, but none of the participants were happy with their current state. In most cases, the discovery systems were too deeply connected to library technologies like the OPAC, or relied on static HTML websites. Participants were frustrated that their systems didn’t meet potential users where they were (the open web) and didn’t give users the desired opportunity to find materials or use the finding aid data more effectively. Participants saw flaws in their systems that negatively impact users, and all saw user testing as something that should be done when designing and maintaining archival discovery systems. There was, however, some concern that user testing is resource intensive and that many archives don’t have the tools or training to do it effectively. There was also concern that, while we feel good about our attempts to include users in product development, in most cases we don’t have a good sense of what the return on that investment actually is.
  • Discussion group 6 first considered how to go from having good description in multiple tools to having an effective user experience searching across all of those tools. A user-centered design approach was appealing, but there was concern about lacking the staff time and expertise to take this on, as well as the challenges of knowing the demographics of your users well enough to establish meaningful user personas and of prioritizing across different user groups’ needs. Collaboration and sharing the work seemed to be the answer to lacking staff time and expertise, although formal collaborative efforts do require overhead to manage the logistics of the collaboration. We discussed ways to have more effective collaboration, such as sharing the results of our work (like user personas) online rather than writing journal articles, making room for smaller institutions in the conversation, and allowing for different levels of time commitment and expertise within a collaboration. Roles for institutions without technical expertise include providing feedback or replicating a test at your own institution using another institution’s method.


Working on a user-centered design project for your archives? Have questions about the topic? Chime in via the comments below!

User Centered Collaboration for Archival Discovery (Part 1)

By the SAA 2017 Session 403 Team: James Bullen, Alison Clemens, Wendy Hagenmaier, Adriane Hanson, Emilie Hardman, Carrie Hintz, Mark Matienzo, Jessica Meyerson, Amanda Pellerin, Susan Pyzynski, Mike Shallcross, Seth Shaw, Sally Vermaaten, Tim Walsh

At the SAA Annual Meeting, a group of archivists and technologists organized a session on collaborative user-centered design processes across project and institutional boundaries: ArcLight (based at Stanford University), the ArchivesSpace public user interface enhancement project, and New York University’s archival discovery work. Using community-oriented approaches that foreground user experience design and usability testing, these initiatives seek to respond to the documented needs and requirements of archivists and researchers. In an effort to continue the conversation about user-centered design in archives, we wanted to share a recap of the session and discussion reflections with the community.


Sally Vermaaten started off the presentations by outlining NYU’s staged design work on a new archival discovery layer. In the first phase of work, a team of archivists, technologists, and librarians conducted a literature review on usability of archival discovery systems, held a blue-sky requirements workshop with stakeholders, assessed several systems in use by other archives, and drafted personas and high-level requirements. In parallel with this design work, a Blacklight-based proof of concept site was set up that utilized code developed by Adam Wead. The results of the pilot and design work were promising but also highlighted the growing need to upgrade other archival systems, including migration to ArchivesSpace. Because implementing ArchivesSpace would offer new mechanisms to access metadata via API and would change underlying data structures, it made sense to migrate to ArchivesSpace before a full redesign of the discovery layer.


At the same time, the team at NYU knew that the proof of concept Blacklight-based site already running in a test environment included several tangible improvements to search and browse functionality that could be polished and rolled out with minimal investment. NYU therefore decided to take a phased approach in order to put those usability improvements in the hands of users earlier. First, they used rapid user-centered design techniques to quickly iterate on the proof of concept site, including a heuristic analysis and wireframing, and were able to deploy significantly improved search functionality to users within a few months. Next, they focused on ArchivesSpace migration and once that system was live, ‘Phase II’ of archival discovery work, a holistic rethink of the archival discovery layer, was kicked off. Sally wrapped up her presentation by sharing some of the aspects of the NYU approach that proved most helpful in their process and encouraged other institutions to consider these strategies in archival discovery work:

  • sharing and re-using existing resources (user research, design work, and code)
  • documenting user needs as an impetus for and input into future systems projects
  • incremental improvement and proof-of-concept approaches.


Susan Pyzynski and Emilie Hardman discussed the ongoing collaborative work toward an enhanced public user interface (PUI) for ArchivesSpace. The first phase, the Design Phase, took place between March and December 2015. This phase produced a set of wireframes and a report by the design firm, Cherry Hill, which was contracted to establish initial plans for the PUI. The Development Phase, which spanned January-June 2017, took this initial planning into account and fleshed out the firm’s work with both fully exploratory and structured comparative user tests. A selection of these tests and findings may be found here. This work yielded a 2.0 test release of the ASpace PUI. Though it sounds conclusive, the Release Phase (summer 2017) is not the final work, but it has put forth a user-informed product.

With this new release, Harvard University is pursuing an aggressive timeline, looking at a January release for the ASpace PUI, and plans to engage in further and more specific community-centered user testing.


Finally, Mark Matienzo presented on the design process used for ArcLight, a project initiated by Stanford University Libraries to develop a purpose-built discovery and delivery system for archival collections. ArcLight’s design process followed a model similar to Stanford’s design process for the Spotlight exhibits platform, but with more community input and participation. The ArcLight design process included input from thirteen institutions, and significant individual contributions from eleven individuals, including both user experience designers and archivists. After providing an overview of the design process, Mark presented on how requirements were identified and evolved over time. Using the example of the delivery of digital objects within the description of a specific collection component, this review included looking at early stakeholder goals and investigating existing functionality in their environmental scan; identifying questions to ask in user interviews, and subsequent analysis of their answers; how those insights were reflected in design documents like personas and wireframes; and their eventual implementation in the ArcLight minimum viable product. Mark closed his presentation by discussing lessons learned about the highly collaborative process. These included the value of very broad input, the time and effort needed to organize collaboration, and the importance of professional knowledge and expertise in user experience when creating certain kinds of design artifacts.

Archival Collections as Data for Digital Scholarship

By Laurie Allen and Stewart Varner

Archives and special collections have a long history of experimenting with and embracing digital tools, so it is not surprising that they have been natural partners for digital scholarship librarians. In this blog post, we want to share a couple of experiences we’ve had where digital scholarship and the archives came together.

Laurie Allen, Director for Digital Scholarship, University of Pennsylvania:

The Cope Evans project was an early collaboration between the Digital Scholarship group, Special Collections, and a group of students at the Haverford College Libraries. Over the years, a series of gifts had made it possible for Haverford to digitize and richly describe the Cope Evans Family Papers, which include correspondence and other documents from a connected group of Philadelphia Quaker families. While the ContentDM system used by the library allowed for searching through the digitized items, it did not take full advantage of the available metadata, including geospatial metadata. In the summer of 2014, the library employed a group of students to make use of the metadata records and associated images as a dataset. Over the following two summers, two groups of Haverford undergraduates explored the exported data from the Cope Collections to create maps, network analyses, and other visualizations and analyses of the collection. Of course, their exploration of the data led them directly back to the original materials and the resulting work represented a broader and deeper connection to the materials.

This experimentation with using our collections as data for student work led the Haverford Libraries to continue approaching the data and metadata of Quaker collections in data-rich ways. The Quakers and Mental Health site and the Beyond Penn’s Treaty site, both created since then, carry this work forward at Haverford.

Stewart Varner, Managing Director of Price Lab for Digital Humanities, University of Pennsylvania:

When I was the Digital Scholarship Librarian at the University of North Carolina, I worked on a project called DocSouth Data, which was designed to facilitate innovative research methods on Documenting the American South, one of the library’s most popular online collections. Documenting the American South is composed of eighteen thematic collections of digitized material. DocSouth Data takes four of the most text-heavy collections, including the heavily used North American Slave Narratives, and makes them available as .txt files as well as .xml files. With these files, scholars can start looking for patterns using simple tools like Voyant and easily experiment with text analysis methods like topic modeling and sentiment analysis.
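
For readers curious about what a first pass over plain text files can look like, here is a minimal sketch in Python. It is not part of DocSouth Data or Voyant; the function names, the tiny stopword list, and the idea of pointing it at a local folder of unzipped .txt files are all assumptions made for illustration. It simply counts the most frequent words across a set of texts, which is roughly the kind of pattern a tool like Voyant surfaces before moving on to methods such as topic modeling.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately tiny stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "i", "was", "my", "that", "it"}


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_terms(texts, stopwords=STOPWORDS):
    """Count how often each non-stopword token appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(token for token in tokenize(text) if token not in stopwords)
    return counts


def read_corpus(folder):
    """Read every .txt file under `folder` (a hypothetical local path) into a list of strings."""
    return [
        path.read_text(encoding="utf-8", errors="replace")
        for path in sorted(Path(folder).rglob("*.txt"))
    ]
```

Assuming the downloaded files were unzipped into a local folder, calling `count_terms(read_corpus(folder)).most_common(20)` would list the twenty most frequent terms. Simple counts like these are only a starting point, but they often point back to specific documents worth reading closely.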

DocSouth Data was an exciting partnership among the Library and Information Technology team, archivists in UNC’s Special Collections, and me. The original idea came from Nick Graham who, at the time, was the Program Coordinator for the North Carolina Digital Heritage Center (and is currently the University Archivist at UNC). I worked closely with Library and Information Technology, which created the plain text files, organized them into a clear folder structure, and made them available as .zip files on the library’s website. Once DocSouth Data was live, I hosted workshops at UNC and elsewhere that gave faculty, students, and librarians the chance to explore new ways to study the collections.

Since these two projects started, both Laurie and Stewart have joined the project team for the IMLS-funded Collections as Data project. The Haverford Libraries contributed two facets to that project.