Perspectives and Experiences in Collection Development with COVID-19 Content

By Jillian Lohndorf and Sylvie Rollason-Cass

For some, the switch to remote work meant that web archiving became a much bigger part of their day-to-day work. Not only were people looking for digital projects to focus on, but new web-based information and content related to COVID-19 was being generated quickly. Nearly every government, organization, and company had information to share, and librarians and archivists were working to document it as quickly as it was being created. The team at Archive-It, the web archiving service of the Internet Archive, helped facilitate the increased activity by offering steeply discounted data and providing opportunities for users to discuss strategies, challenges, and outcomes, and to share experiences more generally. These opportunities came in the form of regular community calls, as well as presentations and discussion groups related to COVID-19 collecting at the annual (and newly virtual) Archive-It Partner meeting. Information from these discussions, as well as the collecting activity itself, has put a spotlight on the many lessons that Archive-It users have already learned about web archive collection development, and the many ways they continue to innovate.

The International Internet Preservation Consortium’s June 2020 capture of COVID-19 data reported from Palestine

The way an organization approached collection development was often dictated by already-established processes, which, in turn, were often dictated by larger goals and requirements. Some of these organizational decisions were influenced by the resources available, most notably data and staff time. Many organizations created new, discrete collections for their COVID-19 content. The idea of creating thematic collections in response to an event is not new: many organizations have created them to document web content related to what are described as “spontaneous events,” which in the past have included natural events and societal tragedies. Other organizations chose to integrate new COVID-19 content into existing collections, contextualized alongside other regularly captured content from their institutions and communities. Some did both, capturing institutional COVID-19 responses in established collections while also creating separate collections with an exclusive COVID-19 focus. For many, collection development decisions went hand in hand with descriptive practices, such as building out metadata using controlled language as a way to facilitate access to information.

Metadata record from the Pennsylvania Horticultural Society COVID-19 Collection

The collected content was often a reflection of both institution and community and spanned a range of formats, pushing some organizations into new territory. Social media, from Facebook to Twitter, was of particular importance, and for many, capturing COVID-19 dashboards and maps also became important. Some organizations issued either internal or public calls for website nominations, providing an opportunity to engage with others to build inclusive collections. Given the amount of available information and the rate of change, no organization was immune from making difficult decisions about what to capture.

The pandemic is far from over, and even when it is, we are likely to see repercussions for a long time to come. For many, collecting COVID-19 content will be a long-term project with new twists and turns along the way. As of this writing, Archive-It users have collected over 40 TB of web content related to COVID-19, and as that amount of data continues to grow, so will the lessons of web archiving.


Jillian Lohndorf joined the Internet Archive in 2016 as an Archive-It Web Archivist. Previously, she worked in the Archives and Special Collections at DePaul University and Rotary International, and as Web Services Librarian for The Chicago School of Professional Psychology. She holds a Master of Science in Library and Information Science from the University of Illinois at Urbana-Champaign.

Sylvie Rollason-Cass is the Senior Web Archivist for Archive-It, where she has been supporting the web archiving community for the past eight years. She holds a Master of Science in Library and Information Science from the University of Illinois at Urbana-Champaign.
