Modernization of the A.D. Hopkins collection at the Smithsonian Institution Department of Entomology

by Teresa Boyd

This is the second post in the BloggERS Embedded Series.

The Smithsonian Institution’s Department of Entomology has recently finished phase one of their multiyear project to digitize their portion of the A.D. Hopkins notes and records system, which includes about 100 years of observations, both in the field and in the lab. A.D. Hopkins developed the system to collect biological and natural history notes about species and the environments they were found in, as well as the collectors and locations of collection. This collection was adopted by the United States Department of Agriculture (USDA) when Hopkins was named Chief of Forest Insect Investigations, though Hopkins is known to have developed and used the system while working at West Virginia University in the late 1800s. The Smithsonian Institution’s Department of Entomology has historically worked very closely with the USDA and therefore obtained the largest portion of the Hopkins card file over the years.

 

It was important to Hopkins to collect as much information as possible about specimens because he felt it was the quickest way to understand the situation of potential pests and to find solutions to harmful invasive species. Part of Hopkins’ methodology was to encourage average citizens to send in specimens and observations to the USDA, the Smithsonian, or one of the forest experiment stations that were located throughout the United States; these were then incorporated into the Hopkins note system. Some of these notes also document lab research such as specimen rearing, specimen transfers, and communications between lab and field. A few of these notes are also cross-referenced, so a lab note can often be traced back to a field note, making it easier for researchers to quickly see the correlation between field and lab (work that was often done by different individuals). The numbers on each individual card within the A.D. Hopkins system correlate to specimens that are housed in various locations. Traditionally a researcher or scientist would ask for the notes that were associated with a specimen number. By creating an online repository of the notes, the Smithsonian hopes to give researchers new tools to expand their work and perhaps find new ways to use the data that has been collected by past researchers and scientists.

I have been working on this project as a lone archivist for the past five years, scanning the card file portion of the collection, and am now preparing these scans for a website that will be built specifically for this type of collection. The Smithsonian Institution’s Department of Entomology hopes to begin sending the scans of the cards to the Smithsonian Transcription Center soon to crowdsource the transcriptions. This will cut down on the time it takes to transcribe the older material, which is all handwritten. I will be adding the transcribed notes to the digitized card on the website so that researchers will be able to go to the website, look up a specific card, and see both the original scan and the transcribed notes, making it easy for anyone to use the information contained in the Hopkins collection. Additionally, these scans will be incorporated into the Department of Entomology’s collections database by matching specimens to their unique card numbers, thereby giving researchers the complete picture.

The Smithsonian Institution’s work to digitize and make their A.D. Hopkins collection publicly available is not the first of its kind; the USDA had previously accomplished this in the 1980s, and has made their documents available on the USDA website, HUSSI. There is hope that in the future other institutions that have their own portions of the A.D. Hopkins notes and records system will also begin to digitize and make them available online, supplementing the Smithsonian and USDA efforts to make this invaluable data available to researchers. 


Teresa Boyd is an archivist for the Department of State and a volunteer archivist for the Smithsonian Institution’s Department of Entomology. She holds a degree in Library and Information Science from the University of Arizona.


Capturing Common Ground

by Leslie Matthaei

This is the first post in the BloggERS Embedded Series.

Every Tuesday I am asked the same question: “T Coast today?” T Coast, or Tortilla Coast, is the preferred lunch location for some of the photographers that occupy the Photography Branch, in the Curator Division, of the federal agency Architect of the Capitol. The agency oversees the maintenance of buildings and landscapes on Capitol Hill, including the Library of Congress buildings, Supreme Court, United States Botanic Garden, House and Senate Office Buildings, Capitol Power Plant, and, of course, the United States Capitol. I have joined the professional photographers at T Coast for more than a dozen lunches now. I am here for the Taco Salad and camaraderie, but mostly I am here to listen. And to ask questions. I am an embedded archivist.

The Library of Congress Thomas Jefferson Building
United States Capitol Building
United States Botanic Garden – Bartholdi Park

I use my time at the T Coast lunch table to get to know the photographers and for them to get to know me. I discovered very quickly that the photographers and I have a lot in common. For example, the photographers are often assigned to shoot a long-term project (Collection) which may have multiple phases (Series), and for each phase, they go out on specific days to shoot (File Units). They cull excess and/or duplicate photographs. And they generally have a tried-and-true workflow for ingesting their born-digital objects, editing them in Adobe Lightroom, and then uploading them to an in-house Digital Asset Management system known as PhotoLightbox. Within PhotoLightbox, they are responsible for defining the security status of an individual image or group of images and providing the descriptive metadata. Tapping into parallel duties has allowed me to bridge potential knowledge gaps in explaining what roles and functions I can provide the branch as a whole.

One rather large knowledge gap is descriptive metadata. To be sure, the photographers in our agency are incredibly busy and in high demand. And they are professionally trained photographers. They see the world through aesthetics. It is not necessarily their job to use PhotoLightbox to help a researcher find images of the East Front extension that occurred in the 1950s, for example. That is my role, and when I query PhotoLightbox, the East Front extension project is represented in multiple ways: EFX, East Front (Plaza) Extension, East Extension, Capitol East Front Extension. You may see where this is going: there is no controlled vocabulary. When, in a staff meeting, I pitched the idea of utilizing controlled vocabularies, they immediately understood the need. Following their lead, the conversation turned to having me develop a data entry template for each of their shoots.

An example of the data entry template.

I admit now that my first spin through PhotoLightbox revealed a pressing need for controlled vocabularies, among other concerns the database presented. I am the type of person who, on seeing a problem, wants to fix it immediately. Yet I knew that if my first professional introduction to the photographers was a critique of how unworkable their data entry was and had been over time, I might turn them off immediately. Instead, I went to lunch. I credit the results produced by this particular staff meeting to the time that I put in getting to know the photographers, getting to understand each of their respective workflows, and understanding a little bit about the historic function and purpose of the office within the agency.

I have another half dozen lunches to go before I begin to talk to the photographers about the need for digital preservation of born-digital images over the long term and both of our roles in the surrounding concepts and responsibilities. I have a few more lunches after that to get their assistance in codifying the decisions we are making together into branch policies. I feel confident, however, that I have their complete buy-in for the work that I have been tasked to do in the branch. Instead of seeing me as another staff member making them do something they do not want, they see me as someone who can help them gain control of and manage their assets in a way that has yet to be done in the branch. I cannot do it alone; I need their help. And some chips and salsa every once in a while.


Leslie Matthaei

Leslie Matthaei is an Archivist in the Photography Branch, Curator Division, Architect of the Capitol. She holds an MLIS from the University of Arizona, and an MA and BA in Media Arts from the University of Arizona.

Partnerships in Advancing Digital Archival Education

by Sohan Shah, Michael J. Kurtz, and Richard Marciano

This is the fourth post in the BloggERS series on Collaborating Beyond the Archival Profession.

The mission of the Digital Curation Innovation Center (DCIC) at the University of Maryland’s iSchool is to integrate archival education with research and technology. The Center does this through innovative instructional design, integrated with student-based project experience. A key element in these projects is forming collaborations with academic, public sector, and industry partners. The DCIC fosters these interdisciplinary partnerships through the use of Big Records and Archival Analytics.

DCIC Lab space at the University of Maryland.

The DCIC works with a wide variety of U.S. and foreign academic research partners. These include, among others, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the University of British Columbia, King’s College London, and the Texas Advanced Computing Center at the University of Texas at Austin. Federal and state agencies that partner by providing access to Big Records collections and their staff expertise include the National Agricultural Library, the National Archives and Records Administration, the National Park Service, the U.S. Holocaust Memorial Museum, and the Maryland State Archives. In addition, the DCIC collaborates with the European Holocaust Research Infrastructure project to provide digital access to Holocaust-era collections documenting cultural looting by the Nazis and subsequent restitution actions. Industry partnerships have involved NetApp and Archival Analytics Solutions.

Students working on a semester-long project with Dr. Richard Marciano, Director, DCIC.

We offer students the opportunity to participate in interdisciplinary digital curation projects with the goal of developing new digital skills and conducting frontline research at the intersection of archives, digital curation, Big Data, and analytics. Projects span justice, human rights, cultural heritage, and cyber-infrastructure themes. Students explore new research opportunities as they work with cutting-edge technology and receive guidance from faculty and staff at the DCIC.

To further digital archival education, DCIC faculty develop courses at the undergraduate and graduate levels that teach digital curation theory and provide experiential learning through team-based digital curation projects. The DCIC has also collaborated with the iSchool to create a Digital Curation for Information Professionals (DCIP) Certificate program designed for working professionals who need training in next generation cloud computing technologies, tools, resources, and best practices to help with the evaluation, selection, and implementation of digital curation solutions. Along these lines, the DCIC will sponsor, with the Archival Educators Section of the Society of American Archivists (SAA), a workshop at the Center on August 13, 2018, immediately prior to the SAA’s Annual Meeting in Washington, D.C. The theme of the workshop is “Integrating Archival Education with Technology and Research.” Further information on the workshop will be forthcoming.

The DCIC seeks to integrate all its educational and research activities by exploring and developing a potentially new trans-discipline, Computational Archival Science (CAS), focused on the computational treatments of archival content. The emergence of CAS follows advances in Computational Social Science, Computational Biology, and Computational Journalism.

For further information about our programs and projects visit our web site at http://dcic.umd.edu. To learn more about CAS, see http://dcicblog.umd.edu/cas. Information about a student-led Data Challenge, which the DCIC is co-sponsoring, can be accessed at http://datachallenge.ischool.umd.edu.


Sohan Shah

Sohan Shah is a Master’s student at the University of Maryland studying Information Management. His focus is on using research and data analytical techniques to make better business decisions. He holds a Bachelor’s degree in Computer Science from Ramaiah Institute of Technology, India, and has worked for 4 years at Microsoft as a Consultant and then as a Technical Lead prior to joining the University of Maryland. Sohan is working at the DCIC to find innovative ways of integrating data analytics with archival education. He is the co-author of “Building Open-Source Digital Curation Services and Repositories at Scale” and is working on other DCIC initiatives such as the Legacy of Slavery and Japanese American WWII Camps. Sohan is also the President of the Master of Information Management Student Association and initiated University of Maryland’s annual “Data Challenge,” bringing together hundreds of students from different academic backgrounds and class years to work with industry experts and build innovative solutions from real-world datasets.

Dr. Michael J. Kurtz is Associate Director of the Digital Curation Innovation Center in the College of Information Studies at the University of Maryland. Prior to this he worked at the U.S. National Archives and Records Administration for 37 years as a professional archivist, manager, and senior executive, retiring as Assistant Archivist in 2011. He received his doctoral degree in European History from Georgetown University in Washington, D.C. Dr. Kurtz has published extensively in the fields of American history and archival management. His works, among others, include: “The Enhanced ‘International Research Portal for Records Related to Nazi-Era Cultural Property’ Project (IRP2): A Continuing Case Study” (co-author) in Big Data in the Arts and Humanities: Theory and Practice (forthcoming); “Archival Management and Administration,” in Encyclopedia of Library and Information Sciences (Third Edition, 2010); Managing Archival and Manuscript Repositories (2004); and America and the Return of Nazi Contraband: The Recovery of Europe’s Cultural Treasures (2006, Paperback edition 2009).

Dr. Richard Marciano is a professor in the College of Information Studies at the University of Maryland and director of the Digital Curation Innovation Center (DCIC). Prior to that, he conducted research at the San Diego Supercomputer Center (SDSC) at the University of California San Diego (UCSD) for over a decade with an affiliation in the Division of Social Sciences in the Urban Studies and Planning program. His research interests center on digital preservation, sustainable archives, cyberinfrastructure, and big data. He is also the 2017 recipient of the Emmett Leahy Award for achievements in records and information management. With partners from KCL, UBC, TACC, and NARA, he has launched a Computational Archival Science (CAS) initiative to explore the opportunities and challenges of applying computational treatments to archival and cultural content. He holds degrees in Avionics and Electrical Engineering, a Master’s and Ph.D. in Computer Science from the University of Iowa, and conducted a Postdoc in Computational Geography.