The Conversation Must Go On: Climate Change and Archival Practice

By Itza A. Carbajal
This post is part of our BloggERS Another Kind of Glacier series.

On September 20th, 2019, an unprecedented number of archivists joined together in person and online to reignite conversations around Climate Change, archives, and the role of archivists in the ongoing crisis. What initially started as a conversation in search of hope between two archivists, Ted Lee and Itza Carbajal, quickly grew into an archival-community-wide search for change (1). Archivists participating in the "Archives and Climate Change Teach Ins Action" engaged through teach-ins, marches, resource gathering, and social media in an effort to talk in parallel with the estimated 4-6 million people striking as part of the Global Climate Strike movement (2). These global strikes, led mostly by young people from around the world and inspired by Greta Thunberg's Fridays for Future student strikes, occurred in over 163 countries on all seven continents, uniting for perhaps the first time the residents of this house called Earth.

What led Ted, myself, and others to seek a shift in the conversations around archives and Climate Change likely began with this simple question: "why must we (as archivists) act?" While a simple question, the "why" in the case of the Climate Strike Teach-Ins was in fact the impetus for me and for many others involved (3). When Ted Lee and I, both archivists and archival scholars, first set out to organize the archivist community through these Teach-Ins, we intended the actions to be 1) opportunities to learn, 2) moments to converse, and 3) sparks for previous conversations around archives and climate change. With over 9 teach-ins, information translated into 5 languages, a comprehensive reading list, and a global Twitterthon, the action #Archivists4ClimateAction undoubtedly sparked a lasting conversation.

In all frankness, I would say that we are no longer at the stage of "why would we," but rather "why would we not." Not everyone in the house is aware of the growing fire outside and within our own walls, and as a result, the archival community must begin conversations like these. For those new to advocacy, organizing, or activist work, this first question is the starting line (4). Regardless of age, length of work experience, or other backgrounds, we all must start somewhere. The "why" in this question asks us to think about why it matters to act. Returning to the metaphor of our house being on fire, the initial question could be "why should or would I be compelled to act as a result of this fire?" In the case of Climate Change, some may feel more comfortable calling our work advocacy, whether on behalf of the field, our jobs, or perhaps our overall environment: the world. Others may feel more compelled to frame their work as organizing or activism, the former focused more on coordinating people and the latter on calling attention to an issue. In all three cases, we are striving for some sort of change or solution to what we perceive as a problem.

We had, I would say, already accepted our responsibility to act. Both as inhabitants of this planet and as practitioners dependent on the survival of humanity in order to make sense of our work, we had an obligation to act. We adopted a strategy, starting a conversation, that was both intentional and logical. As neither Ted nor I were environmental or climate change experts, we knew that we could only advance the conversations so much. But we recognized that our interests and skills lay in teaching, a form of educational conversation. And that led us to our answer for the second question: "what can we as archivists do?"

The Teach-In strategy addressed the discomfort Ted and I initially felt approaching this subject, which frankly still feels overwhelming and outside of our expertise. We felt that using Teach-Ins would allow us, as educators, to immerse ourselves in a topic of our choosing with the intention of sharing that information with our participants. The Teach-In method also allowed us to disrupt the "business as usual" attitude and tendency of many in our field, thus aligning with the original vision of the 2019 Global Climate Strike. As archivists, records managers, curators, librarians, and LIS students paused or walked out of work to attend or participate in these Teach-Ins, there was a recognition that many of us still desire to learn even after completing our formal participation in educational systems such as graduate programs. Plainly, the Teach-Ins resonated with participants from archival backgrounds, workplaces, and programs.

Ted and I chose a strategy that played to our strengths: the Teach-Ins were our preferred method because they gave us a path forward, a way to participate in the conversation by using our existing skills in teaching and organizing. Looking at what we knew, and what we had to contribute, the Teach-Ins made sense. Your skills, levels of comfort, insights, and connections will vary, but for better or worse, the problem of Climate Change will require us all to contribute in big and small ways. This brings me to the last question: "how do we (as archivists and an archival community) take action?" In response, I propose a follow-up question, drawn from the work we started with the Archives and Climate Change Teach-Ins as well as the discussions that led to the formation of ProjectARCC: "how do we continue the momentum built during the Global Climate Strike, build on conversations held, and work towards the changes that our field and community need?" My simple answer would be to find ways to keep on learning. What happens after you learn will be up to you. But, I believe, the answers will inevitably circle back to the initial two questions: why and what.

As many recognize, Climate Change is neither a new topic nor is it in its early stages. Our house is on fire and for many it is starting to crumble. This blog post attempts to highlight the importance of starting and continuing conversations and actions around Climate Change and its relationship and impact on archivists and archives. The work did not end with the 2019 strike. That was simply the beginning.

Itza A. Carbajal is a Ph.D. student at the University of Washington School of Information focusing her research on children and their records. Previously, she worked as the Latin American Metadata Librarian at LLILAS Benson after having received a Master of Science in Information Studies with a focus on archival management and digital records at the University of Texas at Austin School of Information. Before that, she obtained a dual-degree Bachelor of Arts in History and English with a concentration on creative writing and legal studies at the University of Texas at San Antonio. More information: www.itzacarbajal.com

Notes:
1. Itza A. Carbajal and Ted Lee, "If Not Now, When? Archivists Respond to Climate Change," Archival Outlook, November/December 2019, |PAGE|, https://mydigitalpublication.com/publication/?m=30305&i=635670&p=8
2. "Over 4 Million Join 2 Days of Global Climate Strike," Global Climate Strike, September 21, 2019, accessed October 6, 2020, https://globalclimatestrike.net/4-million/
3. "Climate Strike Teach-Ins," Project ARCC Events, September 11, 2019, accessed October 12, 2020, https://projectarcc.org/2019/09/11/climate-strike-teach-ins/
4. I couple these terms together for a reason: while they most definitely mean different things and carry different implications, they are, in my opinion, similar in that they all seek some sort of change.

From Aspirational to Actionable: Working through the OSSArcFlow Guide

by Elizabeth Stauber


Before I begin extolling the virtues of the OSSArcFlow Guide to Documenting Born-Digital Archival Workflows, I must confess that I created an aspirational digital archiving workflow four years ago, and for its entire life it has existed purely as a decorative piece of paper hanging next to my computer. This workflow was extensive and contained as many open source tools as I could find. It was my attempt to follow every digital archiving best practice that has ever existed.

In actual practice, I never had time to follow this workflow. As a lone arranger at the Hogg Foundation for Mental Health, my attention is constantly divided. Instead, I found ways to incorporate aspects of digital archiving into my records management and archival description work, leaving the documentation fragmented. A bird's-eye view of the entire lifecycle of the digital record was never captured: the transition points between accession, processing, and description were unaccounted for.

Over the summer, a colleague suggested we go through the OSSArcFlow Guide to Documenting Born-Digital Archival Workflows together. Initially, I was skeptical, but my new home office needed some sprucing up, so I decided to go along. Immediately, I saw that the biggest difference between working through this guide and my prior, ill-fated attempt is that the OSSArcFlow Guide systematically helps you document what you already do. It does not shame you for failing to convert every file type to the most archivally sound format or for skipping monthly fixity checks. Rather, it showed me that I am doing the best I can as one person managing an entire organization's records, and look how far I have come!

Taking the time to work through a structured approach for developing a workflow helped organize my digital archiving priorities and thoughts. It is easy to be haphazard as a lone arranger with so many competing projects. Following the guide allowed me to be systematic in my development and led to a better understanding of what I currently do with regard to digital archiving. For example, the act of categorizing my activities as appraisal, pre-accessioning, accessioning, arrangement, description, preservation, and access parceled out the disparate, but co-existing, work into manageable amounts. It connected the different processes I already had, and revealed the overlaps and gaps in my workflow.

As I continued mapping out my activities, I was also able to more easily see the natural “pause” points in my workflow. This is important because digital archiving is often fit in around other work, and knowing when I can break from the workflow allows me to manage my time more efficiently – making it more likely that I will achieve progress on my digital archiving work. Having this workflow that documents my actual activities rather than my aspirational activities allows for easier future adaptability. Now I can spot more readily what needs to be added or removed. This is helpful in a lone arranger archive as it allows for flexibility and the opportunity for improvement over time.

The Hogg Foundation was established in 1940 by Ima Hogg. The Foundation’s archive houses many types of records from its 80 years of existence – newspapers, film, cassette tapes, and increasingly born-digital records. As the Foundation continues to make progress in transforming how communities promote mental health in everyday life, it is important to develop robust digital archiving workflows that capture this progress.

Now I understand my workflow as an evolving document that serves as the documentation of the connections between different activities, as well as a visualization to pinpoint areas for growth. My digital processing workflow is no longer simply a decorative piece of paper hanging next to my computer.


Elizabeth Stauber stewards the Hogg Foundation’s educational mission to document, archive and share the foundation’s history, which has become an important part of the histories of mental and public health in Texas, and the evolution of mental health discourse nationally and globally. Elizabeth provides access to the Hogg Foundation’s research, programs, and operations through the publicly accessible archive. Learn more about how to access our records here.

Laying Out the Horizon of Possibilities: Reflections on Developing the OSSArcFlow Guide to Documenting Born-Digital Archival Workflows

by Alexandra Chassanoff and Hannah Wang


OSSArcFlow (2017-2020) was an IMLS-funded grant initiative that began as a collaboration between the Educopia Institute and the University of North Carolina School of Library and Information Science. The goal of the project was to investigate, model, and synchronize born-digital curation workflows for collecting institutions that were using three leading open source software (OSS) platforms: BitCurator, Archivematica, and ArchivesSpace. The team recruited a diverse group of twelve partner institutions, ranging from a state historical society to public libraries to academic archives and special collections units at large research universities and consortia.

[Image: OSSArcFlow partners at an in-person meeting in Chapel Hill, NC (December 2017). Creator: Educopia Institute]

Early on in the project, it became clear that many institutions were planning for and carrying out digital preservation activities ad hoc rather than as part of fully formed workflows. The lack of “best practice” workflow models to consult also seemed to hinder institutions’ abilities to articulate what shape their ideal workflows might take. Creating visual workflow diagrams for each institution provided an important baseline from which to compare and contrast workflow steps, tools, software, roles, and other factors across institutions. It also played an important, if unexpected, role in helping the project team understand the sociotechnical challenges underlying digital curation work. While configuring systems and processing born-digital content, institutions make many important decisions – what to do, how to do it, when to do it, and why – that influence the contours of their workflows. These decisions and underlying challenges, however, are often hidden from view, and can only be made visible by articulating and documenting the actions taken at each stage of the process. Similarly, while partners noted that automation in workflows was highly desirable, the documented workflows revealed the highly customized local implementations at each institution, which prevented the team from writing generalizable scripts for metadata handoffs that could apply to more than one institution’s use case.

Another unexpected but important pivot in the project was a shift towards breakout group discussions to focus on shared challenges or “pain points” identified in our workflow analysis. For partners, talking through shared challenges and hearing suggested approaches proved immensely helpful in advancing their own digital preservation planning. Our observation echoes similar findings by Clemens et al. (2020) in “Participatory Archival Research and Development: The Born-Digital Access Initiative,” who note that “the engagement and vulnerability involved in sharing works in progress resonates with people, particularly practitioners who are working to determine and achieve best practices in still-developing areas of digital archives and user services.” These conversations not only helped to build camaraderie and a community of practice around digital curation, but also revealed that planning for more mature workflows seemed to ultimately depend on understanding more about what was possible.

Overall, our research on the OSSArcFlow project led us to understand more about how gaps in coordinated work practices and knowledge sharing can impact the ability of institutions to plan and advance their workflows. These gaps are not just technical but also social, and crucially, often embedded in the work practices themselves. Diagramming current practices helps to make these gaps more visible so that they can be addressed programmatically. 

At the same time, the use of research-in-practice approaches that prioritize practitioner experiences in knowledge pursuits can help institutions bridge these gaps between where they are today and where they want to be tomorrow. As Clemens et al. (2020) point out, "much of digital practice itself is research, as archivists test new methods and gather information about emerging areas of the field." Our project findings show a significant difference between how the digital preservation literature conceptualizes workflow development and how boots-on-the-ground practitioners actually do the work of constructing workflows. Archival research and development projects should build in iterative practitioner reflections as a component of the R&D process, an important step for continuing to advance the work of doing digital preservation.

Initially, we imagined that the Implementation Guide we would produce would focus on strategies used to synchronize workflows across three common OSS environments. Based on our project findings, however, it became clear that helping institutions articulate a plan for digital preservation through shared and collaborative documentation of workflows would provide an invaluable resource for institutions as they undertake similar activities. Our hope in writing the Guide to Documenting Born-Digital Archival Workflows is to provide a resource that focuses on common steps, tools, and implementation examples in service of laying out the “horizon of possibilities” for practitioners doing this challenging work.  

The authors would like to recognize and extend immense gratitude to the rest of the OSSArcFlow team and the project partners who helped make the project and its deliverables a success. The Guide to Documenting Born-Digital Archival Workflows was authored by Alexandra Chassanoff and Colin Post and edited by Katherine Skinner, Jessica Farrell, Brandon Locke, Caitlin Perry, Kari Smith, and Hannah Wang, with contributions from Christopher A. Lee, Sam Meister, Jessica Meyerson, Andrew Rabkin, and Yinglong Zhang, and design work from Hannah Ballard.


Alexandra Chassanoff is an Assistant Professor at the School of Library and Information Sciences at North Carolina Central University. Her research focuses on the use and users of born-digital cultural heritage. From 2017 to 2018, she was the OSSArcFlow Project Manager. Previously, she worked with the BitCurator and BitCurator Access projects while pursuing her doctorate in Information Science at UNC-Chapel Hill. She co-authored (with Colin Post and Katherine Skinner) the Guide to Documenting Born-Digital Archival Workflows.    

Hannah Wang is currently the Project Manager for BitCuratorEdu (IMLS, 2018-2021), where she manages the development of open learning objects for digital forensics and facilitates a community of digital curation educators. She served as the Project Manager for the final stage of OSSArcFlow and co-edited the Guide to Documenting Born-Digital Archival Workflows.

Estimating Energy Use for Digital Preservation, Part II

by Bethany Scott

This post is part of our BloggERS Another Kind of Glacier series. Part I was posted last week.


Conclusions

While the findings of the carbon footprint analysis are predicated on our institutional context and practices, and therefore may be difficult to directly extrapolate to other organizations’ preservation programs, there are several actionable steps and recommendations that sustainability-minded digital preservationists can implement right away. Getting in touch with any campus sustainability officers and investigating environmental sustainability efforts currently underway can provide enlightening information – for instance, you may discover that a portion of the campus energy grid is already renewable-powered, or that your institution is purchasing renewable energy credits (RECs). In my case, I was previously not aware that UH’s Office of Sustainability has published an improvement plan outlining its sustainability goals, including a 10% total campus waste reduction, a 15% campus water use reduction, and a 35% reduction in energy expenditures for campus buildings – all of which will require institutional support from the highest level of UH administration as well as partners among students, faculty, and staff across campus. I am proud to consider myself a partner in UH campus sustainability and look forward to promoting awareness of and advocating for our sustainability goals in the future.

As Keith Pendergrass highlighted in the first post of this series, there are other methods by which digital preservation practitioners can reduce their power draw and carbon footprint, thereby increasing the sustainability of their digital preservation programs – from turning off machines when not in use or scheduling resource-intensive tasks for off-peak times, to making broader policy changes that incorporate sustainability principles and practices.

At UHL, one such policy change I would like to implement is a tiered approach to file format selection, through which we match the file formats and resolution of files created to the scale and scope of the project, the informational and research value of the content, the discovery and access needs of end users, and so on. Existing digital preservation policy documentation outlines file formats and specifications for preservation-quality archival masters for images, audio, and video files that are created through our digitization unit. However, as UHL conducts a greater number of mass digitization projects – and accumulates an ever larger number of high-resolution archival master files – greater flexibility is needed. By choosing to create lower-resolution files for some projects, we would reduce the total storage for our digital collections, thereby reducing our carbon footprint.

For instance, we may choose to retain large, high-resolution archival TIFFs for each page image of a medieval manuscript book, because researchers study minute details in the paper quality, ink and decoration, and the scribe’s lettering and handwriting. By contrast, a digitized UH thesis or dissertation from the mid-20th century could be stored long-term as one relatively small PDF, since the informational value of its contents (and not its physical characteristics) is what we are really trying to preserve. Similarly, we are currently discussing the workflow implications of providing an entire archival folder as a single PDF in our access system. Although the initial goal of this initiative was to make a larger amount of archival material quickly available online for patrons, the much smaller amount of storage needed to store one PDF vs. dozens or hundreds of high-res TIFF masters would also have a positive impact on the sustainability of the digital preservation and access systems.
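To make the storage trade-off concrete, here is a rough back-of-the-envelope sketch; the page count and per-file sizes are illustrative assumptions, not UHL's actual digitization specifications:

```python
# Rough comparison of storage footprints for two digitization choices.
# All sizes are illustrative assumptions, not measured UHL values.

def project_storage_gb(pages: int, mb_per_page: float) -> float:
    """Total storage in GB for a digitization project."""
    return pages * mb_per_page / 1024

# A hypothetical 300-page volume digitized two ways:
tiff_masters = project_storage_gb(pages=300, mb_per_page=120.0)  # high-res TIFF masters
single_pdf = project_storage_gb(pages=300, mb_per_page=0.5)      # modest-resolution PDF pages

print(f"TIFF masters: {tiff_masters:.2f} GB")   # ~35 GB
print(f"Single PDF:   {single_pdf:.2f} GB")     # ~0.15 GB
print(f"Reduction:    {100 * (1 - single_pdf / tiff_masters):.1f}%")
```

Scaled across dozens of mass digitization projects, reductions of this magnitude translate directly into less preservation storage to power and replicate.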

UHL’s digital preservation policy also includes requirements for monthly fixity checking of a random sample of preservation packages stored in Archivematica, with a full fixity check of all packages to be conducted every three years during an audit of the overall digital preservation program. Frequent fixity checking is computationally intensive, though, and adds to the total energy expenditure of an institution’s digital preservation program. But in UHL’s local storage infrastructure, storage units run on the ZFS filesystem, which includes self-healing features such as internal checksum checks each time a read/write action is performed. This storage infrastructure was put in place in 2019, but we have not yet updated our policies and procedures for fixity checking to reflect the improved baseline durability of assets in storage.

Best practices calling for frequent fixity checks were developed decades ago – but modern technology like ZFS may be able to passively address our need for file integrity and durability in a less resource-intensive way. Through considered analysis matching the frequency of fixity checking to the features of our storage infrastructure, we may come to the conclusion that less frequent hands-on fixity checks, on a smaller random sample of packages, is sufficient moving forward. Since this is a new area of inquiry for me, I would love to hear thoughts from other digital preservationists about the pros and cons to such an approach – is fixity checking really the end-all, or could we use additional technological elements as part of a broader file integrity strategy over time?

Future work

I eagerly anticipate refining this electricity consumption research with exact figures and values (rather than estimates) when we are able to more consistently return to campus. We would like to investigate overhead costs such as lighting and HVAC in UHL's server room, and we plan to take point-in-time readings directly from the power distribution units in the racks. Also, there may be additional power statistics that our Sys Admin can capture from the VMware hosts – which would allow us to begin this portion of the research remotely in the interim. Furthermore, I plan to explore additional factors to provide a broader understanding of the impact of UHL's energy consumption for digital systems and initiatives. By gaining more details on our total storage capacity, percentage of storage utilization, and GHG emissions per TB, we will be able to communicate about our carbon footprint in a way that will allow other libraries and archives to compare or estimate the environmental impact of their digital programs as well.

I would also like to investigate whether changes in preservation processes, such as the reduced hands-on fixity strategy outlined above, can have a positive impact on our energy expenditure – and whether this strategy can still provide a high level of integrity and durability for our digital assets over time. Finally, as a longer-term initiative I would like to take a deeper look at sustainability factors beyond energy expenditure, such as current practices for recycling e-waste on campus or a possible future life-cycle assessment for our hardware infrastructure. Through these efforts, I hope to help improve the long-term sustainability of UHL’s digital initiatives, and to aid other digital preservationists to undertake similar assessments of their programs and institutions as well.


Bethany Scott is Digital Projects Coordinator at the University of Houston Libraries, where she is a contributor to the development of the BCDAMS ecosystem incorporating Archivematica, ArchivesSpace, Hyrax, and Avalon. As a representative of UH Special Collections, she contributes knowledge on digital preservation, born-digital archives, and archival description to the BCDAMS team.

Estimating Energy Use for Digital Preservation, Part I

by Bethany Scott

This post is part of our BloggERS Another Kind of Glacier series. Part II will be posted next week.


Although the University of Houston Libraries (UHL) has taken steps over the last several years to initiate and grow an effective digital preservation program, until recently we had not yet considered the long-term sustainability of our digital preservation program from an environmental standpoint. As the leader of UHL’s digital preservation program, I aimed to address this disconnect by gathering information on the technology infrastructure used for digital preservation activities and its energy expenditures in collaboration with colleagues from UHL Library Technology Services and the UH Office of Sustainability. I also reviewed and evaluated the requirements of UHL’s digital preservation policy to identify areas where the overall sustainability of the program may be improved in the future by modifying current practices.

Inventory of equipment

I am fortunate to have a close collaborator in UHL’s Systems Administrator, who was instrumental in the process of implementing the technical/software elements of our digital preservation program over the past few years. He provided a detailed overview of our hardware and software infrastructure, both for long-term storage locations and for processing and workflows.

UHL’s digital access and preservation environment is almost 100% virtualized, with all of the major servers and systems for digital preservation – notably, the Archivematica processing location and storage service – running as virtual machines (VMs). The virtual environment runs on VMware ESXi and consists of five physical host servers that are part of a VMware vSAN cluster, which aggregates the disks across all five host servers into a single storage datastore.

VMs where Archivematica's OS and application data reside may have their virtual disk data spread across multiple hosts at any given time. Therefore, exact resource use for digital preservation processes running via Archivematica is difficult to distinguish from that of other VM systems and processes, including UHL's digital access systems. After discussing possible approaches for calculating the energy usage, we decided to take a generalized or blanket approach and include all five hosts. This calculation thus represents the energy expenditure not only for the digital preservation system and storage, but also for the A/V Repository and Digital Collections access systems. At UHL, digital access and preservation are strongly linked components of a single large ecosystem, so the decision to look at the overall energy expenditure makes sense from an ecosystem perspective.

In addition to the VM infrastructure described above, all user and project data is housed in the UHL storage environment. The storage environment includes both local shared network drive storage for digitized and born-digital assets in production, and additional shares that are not accessible to content producers or other end users, where data is processed and stored to be later served up by the preservation and access systems. Specifically, with the Archivematica workflow, preservation assets are processed through a series of automated preservation actions including virus scanning, file format characterization, fixity checking, and so on, and are then transferred and ingested to secure preservation storage.

UHL’s storage environment consists of two servers: a production unit and a replication unit. Archivematica’s processing shares are not replicated, but the end storage share is replicated. Again, for purposes of simplification, we generalized that both of these resources are being used as part of the digital preservation program when analyzing power use. Finally, within UHL’s server room there is a pair of redundant network switches that tie all the virtual and storage components together.

The specific hardware components that make up the digital access and preservation infrastructure described above include:

  • One (1) production storage unit: iXsystems TrueNAS M40 HA (Intel Xeon Silver 4114 CPU @ 2.2 GHz and 128 GB RAM)
  • One (1) replication storage unit: iXsystems FreeNAS IXC-4224 P-IXN (Intel Xeon CPU E5-2630 v4 @ 2.2 GHz and 128 GB RAM)
  • Two (2) disk expansion shelves: iXsystems ES60
  • Five (5) VMware ESXi hosts: Dell PowerEdge R630 (Intel Xeon CPU E5-2640 v4 @ 2.4 GHz and 192 GB RAM)
  • Two (2) network switches: HPE Aruba 3810M 16SFP+ 2-slot

Electricity usage

Each of the hardware components listed above has two power supplies. However, the actual power draw does not always run at the maximum those power supplies can deliver; it depends on current workloads, how many disks are in the units, and so on. Therefore, the power being drawn can be quantified but will vary over time.

With the unexpected closure of the campus due to COVID-19, I conducted this analysis remotely with the help of the UH campus Sustainability Coordinator. We compared the estimated maximum power draw based on the technical specifications for the hardware components, the draw when idle, and several partial power draw scenarios, with the understanding that the actual numbers will likely fall somewhere in this range.

Estimated power use and greenhouse gas emissions

          Daily Usage Total (Watts)   Annual Total (kWh)   Annual GHG (lbs)
Max       9,094                       79,663.44            124,175.71
95%       8,639.3                     75,680.268           117,966.92
90%       8,184.6                     71,697.096           111,758.14
85%       7,729.9                     67,713.924           105,549.35
80%       7,275.2                     63,730.752           99,340.565
Idle      5,365.46                    47,001.43            73,263.666

The estimated maximum annual greenhouse gas emissions from power use for the digital access and preservation hardware are over 124,000 pounds, or approximately 56.3 metric tons. To put this in perspective, that is equivalent to the GHG emissions from nearly 140,000 miles driven by an average passenger vehicle, to the carbon dioxide emissions from 62,063 pounds of coal burned, or to 130 barrels of oil consumed. While I hope to refine this analysis further in the future, for now these figures can serve as an entry point to discussions on the importance of environmental sustainability actions – and our plans to reduce our consumption – with Libraries administration, colleagues in the Office of Sustainability, and other campus leaders.
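The arithmetic behind the table can be sketched in a few lines. Note that the emissions factor used below (roughly 1.56 lbs CO2e per kWh) is back-calculated from the figures above; it is an assumption that varies with the regional grid mix.

```python
def annual_energy_kwh(avg_draw_watts: float) -> float:
    """Convert a continuous average power draw (watts) to annual energy use (kWh)."""
    return avg_draw_watts * 24 * 365 / 1000


def annual_ghg_lbs(kwh: float, lbs_co2e_per_kwh: float = 1.5588) -> float:
    """Estimate annual greenhouse gas emissions from energy use.

    The default emissions factor (~1.56 lbs CO2e per kWh) is inferred from
    the table above, not a universal constant -- treat it as an assumption.
    """
    return kwh * lbs_co2e_per_kwh


max_kwh = annual_energy_kwh(9_094)  # estimated maximum draw -> ~79,663 kWh/year
max_ghg = annual_ghg_lbs(max_kwh)   # -> ~124,000 lbs CO2e/year
```

The same two functions reproduce each row of the table by scaling the maximum draw (e.g. `annual_energy_kwh(9_094 * 0.95)` for the 95% scenario).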

Part II, including conclusions and future work, will be posted next week.


Bethany Scott is Digital Projects Coordinator at the University of Houston Libraries, where she is a contributor to the development of the BCDAMS ecosystem incorporating Archivematica, ArchivesSpace, Hyrax, and Avalon. As a representative of UH Special Collections, she contributes knowledge on digital preservation, born-digital archives, and archival description to the BCDAMS team.

An intern’s experience: Preserving Jok Church’s Beakman

by Matt McShane

I seem to have a real penchant for completing my schooling in the middle of “once-in-a-lifetime” economic crises: first the 2008 housing recession, and now a global pandemic. And while the current situation has somewhat altered the final semester of my MLIS program—and altered many others’ situations far more severely—I was still able to have a very engaging and rewarding practicum experience at The Ohio State University Libraries, working on an incredible digital collection accessioned by the Billy Ireland Cartoon Library and Museum.

U Can with Beakman and Jax may be best known to many as the predecessor of the short-lived Saturday morning live-action science show Beakman’s World, but the comic is arguably more successful than the television show it spawned. With an international readership and a run that lasted more than 25 years, it was a success story that entertained and educated readers across generations. It was also the first syndicated newspaper comic to be entirely digitally drawn and distributed. Jok Church, the comic’s creator and author, used Adobe Illustrator throughout its run, saving and migrating files at various stages of creation across multiple hard drives. With the exception of a few gaps, the entire run was saved on Jok’s hard drive at the time of his death in April 2016. These are the files University Libraries received.

Richard Bolingbroke, a friend of Jok’s and executor of his estate, donated the collection to the Billy Ireland. He also provided us with an in-progress biography of Jok, which gave insight into who he was as a person beyond his work with Beakman and Jax, as well as a condensed history of the publication. This will be useful as the Billy Ireland creates author metadata and information for the collection.

Richard gave us access, via Dropbox, to direct copies of twenty-four folders containing nearly 10,000 files, which we downloaded to our local processing server. Each folder contained a year’s worth of comics, from 1993 to 2016, though the 1995 and 1996 folders were empty due to a hard drive failure Jok had experienced. We are still trying to track down any surviving backups from those years. In the existing folders, though, we found not only many years’ worth of terrific comic content, but also a glimpse into Jok’s creative and organizational process. An initial DROID scan of the contents found over 2,000 duplicate files scattered throughout. After speaking with Richard, we determined these to be the result of a mistaken copy/paste. Rather than manipulate the existing archival collection, we decided to create a distribution collection better organized for user access to the works, while maintaining the archival integrity of the donated collection.
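A duplicate sweep like the one DROID performed can be approximated by grouping files on a cryptographic checksum: any two files with the same digest are byte-identical. This is a minimal sketch, not the actual workflow used on the collection.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: str) -> dict:
    """Group every file under `root` by SHA-256 digest.

    Any digest that maps to more than one path marks a set of
    byte-identical duplicate files.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[hashlib.sha256(path.read_bytes()).hexdigest()].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Checksum grouping only catches exact copies; near-duplicates (e.g. re-saved files) would still need format-aware tools like DROID or manual review.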

Before either of those goals could be reached, though, we faced a second primary issue: file extensions. Nearly 1,300 files in the collection lacked extensions, a legacy of older versions of Mac OS, which did not append them. Adobe Illustrator produces both .ai and .eps file types; other file types appear in the collection, but these two account for most of the works. Because it was impossible to distinguish .ai from .eps files at a batch level, we manually examined the EXIF metadata of each extension-less file to determine its proper extension. Using Bulk Rename Utility, we were able to semi-batch the extension appending, but the work still required a fair amount of manual labor because the different file types were intermingled within subfolders.
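As an alternative to reading EXIF metadata, the leading bytes of a file can often distinguish the two formats. This sketch assumes EPS files begin with a `%!PS-Adobe` header and PDF-compatible .ai files with `%PDF`; older, non-PDF-based .ai files may match neither signature and would still need manual inspection.

```python
from pathlib import Path


def guess_extension(path: Path):
    """Guess .eps or .ai from a file's leading bytes; None if unrecognized."""
    with path.open("rb") as fh:
        head = fh.read(10)
    if head.startswith(b"%!PS-Adobe"):
        return ".eps"
    if head.startswith(b"%PDF"):
        return ".ai"  # PDF-compatible Illustrator file
    return None


def append_extension(path: Path) -> Path:
    """Rename the file in place, appending the guessed extension.

    Appends with with_name() rather than with_suffix(), since date-based
    names like '1997.03.02' contain dots that are not real extensions.
    """
    ext = guess_extension(path)
    return path.rename(path.with_name(path.name + ext)) if ext else path
```

Any file returning `None` goes to a manual-review pile, which mirrors the semi-batch approach described above.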

Although create dates within the EXIF metadata were unreliable (different versions of Illustrator had been used to open the files over the years), Jok named and organized his files by publication date, which gave us reliable metadata for organizing the distribution collection. His file and folder organization did shift over time (understandable across two decades and who knows how many machines), so creating the distribution collection in a standardized file name and subfolder format required a fair bit of manual labor. The comic was published weekly, albeit with some breaks, and each finished strip typically exists in four versions: portrait and landscape layouts, each in black and white and in color. We created a year/month/date folder tree based on how the largest portion of Jok’s files were organized. Once that was completed, we shifted focus to Ohio State’s accessibility standards and investigated a batch workflow to convert the comic files to PDF/A. Unfortunately, we could not achieve PDF/A compliance due to the nature of the original files, and the “batch” processing still required significant human interaction.
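Filing the works into a year/month/date tree from their date-based names could be scripted along these lines. The `YYYY.MM.DD` name pattern here is an illustrative assumption, since Jok’s actual naming conventions shifted over the years.

```python
import re
import shutil
from pathlib import Path

# Hypothetical name pattern: a YYYY.MM.DD (or YYYY-MM-DD / YYYY_MM_DD)
# publication date somewhere in the file name.
DATE_RE = re.compile(r"(\d{4})[._-](\d{2})[._-](\d{2})")


def file_dest(src: Path, dest_root: Path):
    """Map a source file to its year/month/date slot, or None if undated."""
    match = DATE_RE.search(src.name)
    if not match:
        return None
    year, month, day = match.groups()
    return dest_root / year / month / day / src.name


def copy_into_tree(src: Path, dest_root: Path):
    """Copy (never move) a file into the distribution tree, leaving the
    donated archival collection untouched."""
    dest = file_dest(src, dest_root)
    if dest is not None:
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
    return dest
```

Copying rather than moving preserves the donated collection as received, matching the distribution-collection approach described above; undated files would fall out for manual filing.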

Further complicating matters, the COVID-19 global pandemic hit Ohio while we were in the middle of this work. In response, Ohio State directed all non-essential personnel to move to telework, which cut off my access to the server behind the University’s firewall for the remainder of my internship. As a result, we had to put the completion of the project on indefinite hold. Although these extreme circumstances prevented me from seeing the collection all the way through to public hands, I was able to leave it in an organized state, ready for file conversion and metadata creation.

I learned a lot from handling the collection from the beginning, untouched. One of the biggest takeaways was the importance of gathering information about the collection and its creator up front. Creating a manifest of the objects within the collection is clearly necessary for deciding how the collection should be preserved and accessed, but it also allowed us to spot problems, such as the significant number of duplicates and files without extensions. Having this knowledge up front let us plan our approach to the collection more effectively. Based on my experience with this project, I have even suggested to my program’s faculty that students get more exposure to “messy” digital collections.
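A basic manifest of the kind described above can be generated with a short script; this is a sketch for illustration, not the DROID output we actually worked from.

```python
import csv
import hashlib
from pathlib import Path


def write_manifest(root: str, out_csv: str) -> int:
    """Record path, size, SHA-256, and extension for every file under `root`.

    Write the manifest somewhere outside `root` so it is not swept up in
    its own listing. Returns the number of files recorded.
    """
    count = 0
    with open(out_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "size_bytes", "sha256", "extension"])
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                writer.writerow([str(path), path.stat().st_size, digest, path.suffix])
                count += 1
    return count
```

Sorting such a manifest by checksum surfaces duplicates, and filtering on an empty extension column surfaces the extension-less files, both of which were exactly the anomalies we needed to see up front.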

The other key takeaway was that sometimes it is best to get your hands dirty and perform tasks manually. Digital preservation offers many automated shortcuts compared to processing traditional analog collections, but not everything can or should be done through batch processes. While it may be technically possible to program a process, doing so may not be the best use of time or effort. Part of workflow development is recognizing when the benefit of an automated solution outweighs the time and effort of performing the task manually. It may have been possible to write a script to identify and append the file extensions for the objects missing them, but the effort of learning, writing, and troubleshooting that script likely would have been greater than the somewhat tedious work of doing it by hand in this instance. Conversely, automated scripting might be worth investigating for a significantly larger collection of mislabeled or disorganized objects. A good understanding of cost and benefit is important when approaching a problem that has multiple solutions.

My time on-site with The Ohio State University Libraries was a bit shorter than I had intended, but it still provided me with a great experience and helped to solidify my love for the digital preservation process and work. The fact that U Can with Beakman and Jax is the first digitally created syndicated newspaper comic makes the whole experience that much more apt and impactful. Even though some aspects of work are in limbo at the moment, I am confident that this terrific collection of Jok’s work will be available for the public to enjoy and learn from. Even if I am not able to fully carry the work over the finish line, I am thankful for the opportunity to work on it as much as I did. 


Matt McShane, a recent MLIS graduate from Kent State University, is currently focused on landing a role with a cultural heritage institution where he can work hands-on with digital collections, digital preservation, and influence broader preservation policy.

Securing Our Digital Legacy: An Introduction to the Digital Preservation Coalition

by Sharon McMeekin, Head of Workforce Development


Nineteen years ago, the digital preservation community gathered in York, UK, for the Cedars Project’s Preservation 2000 conference. It was here that the first seeds were sown for what would become the Digital Preservation Coalition (DPC). Guided by Neil Beagrie, then of King’s College London and Jisc, work to establish the DPC continued over the next 18 months and, in 2002, representatives from 7 organizations signed the articles that formally constituted the DPC.

In the 17 years since its creation, the DPC has gone from strength to strength, the last 10 of them under the leadership of current Executive Director William Kilbride. The past decade has been a particular period of growth, as shown by the rise in the staff complement from 2 to 7. We now have more than 90 members representing an increasingly diverse group of organizations from 12 countries, across sectors including cultural heritage, higher education, government, banking, industry, media, research, and international bodies.

DPC staff, chair, and president

Our mission at the DPC is to:

[…] enable our members to deliver resilient long-term access to digital content and services, helping them to derive enduring value from digital assets and raising awareness of the strategic, cultural and technological challenges they face.

We work to achieve this through a broad portfolio of work across six strategic areas of activity: Community Engagement, Advocacy, Workforce Development, Capacity Building, Good Practice and Standards, and Management and Governance. Everything we do is member-driven: members guide our activities through the DPC Board, Representative Council, and the Sub-Committees that oversee each strategic area.

Although the DPC is driven primarily by the needs of our members, we do also aim to contribute to the broader digital preservation community. As such, many of the resources we develop are made publicly available. In the remainder of this blog post, I’ll be taking a quick look at each of the DPC’s areas of activity and pointing out resources you might find useful.

1 | Community Engagement

First up is our work in the area of Community Engagement. Here our aim is to enable “a growing number of agencies and individuals in all sectors and in all countries to participate in a dynamic and mutually supportive digital preservation community”. Collaboration is a key to digital preservation success, and we hope to encourage and support it by helping build an inclusive and active community. An important step in achieving this aim was the publication of our ‘Inclusion and Diversity Policy’ in 2018.

Webinars are key to building community engagement amongst our members. We invite speakers to talk to our members about particular topics and share experiences through case studies. These webinars are recorded and made available for members to watch at a later date. We also run a monthly ‘Members Lounge’ to allow informal sharing of current work and discussion of issues as they arise and, on the public end of the website, a popular blog, covering case studies, new innovations, thought pieces, recaps of events and more.

2 | Advocacy

Our advocacy work campaigns “for a political and institutional climate more responsive and better informed about the digital preservation challenge”, as well as “raising awareness about the new opportunities that resilient digital assets create”. This tends to happen on several levels, from enabling and aiding members’ advocacy efforts within their own organizations, through raising legislators’ and policy makers’ awareness of digital preservation, to educating the wider populace.

To help those advocating for digital preservation within their own context, we have recently published our Executive Guide. The Guide provides a grab bag of statements and facts to help make the case for digital preservation, including key messages, motivators, opportunities to be gained and risks faced. We welcome any suggestions for additions or changes to this resource!

Our longest-running advocacy activity is the biennial Digital Preservation Awards, last held in 2018. The Awards celebrate excellence and innovation in digital preservation across a range of categories. This high-profile event has been joined in recent years by two other activities with a broad remit and engagement. The first is the Bit List of Digitally Endangered Species, which highlights at-risk digital information, showing both where preservation work is needed and where efforts have been successful. The second is World Digital Preservation Day (WDPD), a day to showcase digital preservation around the globe. Response to WDPD since its inauguration in 2017 has been exceptionally positive: there have been tweets, blogs, events, webinars, and even a song and dance! This year WDPD is scheduled for 7th November, and we encourage everyone to get involved.

The nominees, winners, and judges for the 2018 Digital Preservation Awards

3 | Workforce Development

Workforce Development activities at the DPC focus on “providing opportunities for our members to acquire, develop and retain competent and responsive workforces that are ready to address the challenges of digital preservation”. There are many threads to this work, but key for our members are the scholarships we provide through our Career Development Fund and free access to the training courses we run.

At the moment we offer three training courses: ‘Getting Started with Digital Preservation’, ‘Making Progress with Digital Preservation’ and ‘Advocacy for Digital Preservation’, but we have plans to expand the portfolio in the coming year. All of our training courses are available to non-members for a modest fee, but at the moment are mostly held face to face in the UK and Ireland. A move to online training provision is, however, planned for 2020. We are also happy to share training resources and have set up a Slack workspace to enable this and greater collaboration with regards to digital preservation training.

Other helpful resources under our Workforce Development heading include the ‘Digital Preservation Handbook’, a free online publication covering digital preservation in the broadest sense. The Handbook aims to be a comprehensive guide for those starting with digital preservation, whilst also offering links to additional resources. The content for the Handbook was crowd-sourced from experts and has all been peer reviewed. Another useful and slightly less well-known series of publications is our ‘Topical Notes’, originally funded by the National Archives of Ireland and intended to introduce key digital preservation issues to a non-specialist audience (particularly record creators). Each note is only two pages long and jargon-free, making it a great resource for raising awareness.

4 | Capacity Building

Perhaps the biggest area of DPC work covers Capacity Building, that is “supporting and assuring our members in the delivery and maintenance of high quality and sustainable digital preservation services through knowledge exchange, technology watch, research and development.” This can take the form of direct member support, helping with tasks such as policy development and procurement, as well as participation in research projects.

Our more advanced publication series, the Technology Watch Reports, also sits under the Capacity Building heading. Written by experts and peer reviewed, each report takes a deeper dive into a particular digital preservation issue. Our latest report, on Email Preservation, is currently available for member preview but will be publicly released shortly. Some other ‘classics’ include Preserving Social Media, Personal Digital Archiving, and the always popular The Open Archival Information System (OAIS) Reference Model: Introductory Guide (2nd Edition). (I always tell those new to OAIS to start here rather than with the 200+ dry pages of the full standard!)

We also run around six thematic Briefing Day events a year on topical issues. As with the training, these are largely held in the UK and Ireland, but they are now also live-streamed for members. We support a number of Thematic Task Forces and Working Groups, with the ‘Web Archiving and Preservation Working Group’ being particularly active at the moment.

DPC members engaged in a brainstorming session

5 | Good Practice and Standards

Our Good Practice and Standards stream of work was a new addition as of the publication of our latest Strategic Plan (2018-22). Here we are contributing work towards “identifying and developing good practice and standards that make digital preservation achievable, supporting efforts to ensure services are tightly matched to shifting requirements.”

We hope this work will allow us to input into standards with the needs of our members in mind and facilitate the sharing of good practice that already happens across the coalition. This has already borne fruit in the shape of the forthcoming DPC Rapid Assessment Model, a maturity model to help with benchmarking digital preservation progress within your organization. You can read a bit more about it in this blog post by Jen Mitcham and the model will be released publicly in late September.

We also work with vendors through our Supporter Program and events like our ‘Digital Futures’ series to help bridge the gap between practice and solutions.

6 | Management and Governance

Our final stream of work is less focused on digital preservation and more on “ensuring the DPC is a sustainable, competent organization focussed on member needs, providing a robust and trusted platform for collaboration within and beyond the Coalition.” This relates both to the viability of the organization and to good governance. It is essential that everything we do is transparent and that the members can both direct what we do and hold us accountable.

The Future

Before I depart, I thought I would share a little bit about some of our plans for the future. In the next few years we’ll be taking steps to further internationalize as an organization. At the moment our membership is roughly 75% UK and Ireland and 25% international, but those numbers are gradually moving closer and we hope that continues. With that in mind we will be investigating new ways to deliver services and resources online, as well as in languages beyond English. We’re starting this year with the publication of our prospectus in German, French and Spanish.

We’re also beginning to look forward to our 20th anniversary in 2022. It’s a Digital Preservation Awards Year, so that’s reason enough for a celebration, but we will also be welcoming the digital preservation community to Glasgow, Scotland, as hosts of iPRES 2022. Plans are already afoot for the conference, and we’re excited to make it a showcase for both the community and one of our home cities. Hopefully we’ll see you there, but I encourage you to make use of our resources and to get in touch soon!

Access our Knowledge Base: https://www.dpconline.org/knowledge-base

Follow us on Twitter: https://twitter.com/dpc_chat

Find out how to join us: https://www.dpconline.org/about/join-us


Sharon McMeekin is Head of Workforce Development with the Digital Preservation Coalition and leads on work including training workshops and their scholarship program. She is also Managing Editor of the ‘Digital Preservation Handbook’. With Masters degrees in Information Technology and Information Management and Preservation, both from the University of Glasgow, Sharon is an archivist by training, specializing in digital preservation. She is also an ILM qualified trainer. Before joining the DPC she spent five years as Digital Archivist with RCAHMS. As an invited speaker, Sharon presents on digital preservation at a wide variety of training events, conferences and university courses.

The Theory and Craft of Digital Preservation: An interview with Trevor Owens

BloggERS! editor Dorothy Waugh recently interviewed Trevor Owens, Head of Digital Content Management at the Library of Congress, about his recent–and award-winning–book, The Theory and Craft of Digital Preservation.


Who is this book for and how do you imagine it being used?

I attempted to write a book that would be engaging and accessible to anyone who cares about long-term access to digital content and wants to devote time and energy to helping ensure that important digital content is not lost to the ages. In that context, I imagine the primary audience as current and emerging professionals that work to ensure enduring access to cultural heritage: archivists, librarians, curators, conservators, folklorists, oral historians, etc. With that noted, I think the book can also be of use to broader conversations in information science, computer science and engineering, and the digital humanities. 

Tell us about the title of the book and, in particular, your decision to use the word “craft” to describe digital preservation.

The words “theory” and “craft” in the title of the book forecast both the structure and the two central arguments that I advance in the book. 

The first chapters focus on theory. This includes tracing the historical lineages of preservation in libraries, archives, museums, folklore, and historic preservation. I then move to explore work in new media studies and platform studies to round out a nuanced understanding of the nature of digital media. I start there because I think it’s essential that cultural heritage practitioners moor their own frameworks and approaches to digital preservation in a nuanced understanding of the varied and historically contingent nature of preservation as a concept and the complexities of digital media and digital information. 

The latter half of the book is focused on what I describe as the “craft” of digital preservation. My use of the term craft is designed to intentionally challenge the notion that work in digital preservation should be understood as “a science.” Given the complexities of both what counts as preservation in a given context and the varied nature of digital media, I believe it is essential that we explicitly distance ourselves from many of the assumptions and baggage that come along with the ideology of “digital.” 

We can’t build some super system that just solves digital preservation. Digital preservation requires making judgement calls. Digital preservation requires the applied thinking and work of professionals. Digital preservation is not simply a technical question; instead, it involves understanding the nature of the content that matters most to an intended community and making judgement calls about how best to mitigate the risks of losing access to that content. As a result of my focus on craft, I offer less of a “this is exactly what one should do” approach, and more of an invitation to join the community of practice that is developing knowledge and honing and refining its craft.

Reading the book, I was so happy to see you make connections between the work that we do as archivists and digital preservation. Can you speak to that relationship and why you think it is important?

Archivists are key players in making preservation happen and the emergence of digital content across all kinds of materials and media that archivists work with means that digital preservation is now a core part of the work that archivists do. 

I organize a lot of my discussion about the craft of digital preservation around archival concepts as opposed to library science or curatorial practices. For example, I talk about arrangement and description. I also draw from ideas like MPLP as key concepts for work in digital preservation and from work on community archives. 

Old Files. From XKCD: webcomic of romance, sarcasm, math, and language. 2014

Broadly speaking, in the development of digital media I see a growing context collapse between formats that had been distinct in the past. That is, conservation of oil paintings, management and preservation of bound volumes, and organization and management of heterogeneous sets of records have some strong similarities, but there are also a lot of differences. The born-digital incarnations of those works (digital art, digital publishing, and digital records) are all made up of digital information and file formats, and they face a related set of digital preservation issues.

With that noted, I think archival practice tends to be particularly well suited to dealing with the nature of digital content. Archives have long dealt with the problem of scale, which is now intensified by digital data. At the same time, archivists have also long dealt with hybrid collections and complex jumbles of formats, forms, and organizational structures, which is increasingly the case for all types of material transitioning into born-digital form.

You emphasize that the technical component of digital preservation is sometimes prioritized over social, ethical, and organizational components. What are the risks implicit in overlooking these other important components?

Digital preservation is not primarily a technical problem. The ideology of “digital” is that things should be faster, cheaper, and automatic. The ideology of “digital” suggests that we should need less labor, less expertise, and fewer resources to make digital stuff happen. If we let this line of thinking infect our idea of digital preservation, we are going to see major losses of important data, we will see major failures to respect the ethical and privacy issues surrounding digital content, and lots of money will be spent on work that fails to get us the results we want.

In contrast, when we take as a starting point that digital preservation is about investing resources in building strong organizations and teams who participate in the community of practice and work on the complex interactions that emerge between competing library and archives values then we have a chance of both being effective but also building great and meaningful jobs for professionals.

If digital preservation work is happening in organizations that have an overly technical view of the problem, it is happening despite, not because of, their organization’s approach. That is, there are people doing the work; they just likely aren’t getting credit and recognition for doing it. Digital preservation happens because of people who understand that the fundamental nature of the work requires continual efforts to secure enough resources to meaningfully mitigate risks of loss, along with thoughtful decision-making about building and curating collections of value to communities.

Considerations related to access and discovery form a central part of the book and you encourage readers to “Start simple and prioritize access,” an approach that reminded me of many similar initiatives focused on getting institutions started with the management and preservation of born-digital archives. Can you speak to this approach and tell us how you see the relationship between preservation and access?

A while back, OCLC ran an initiative called “walk before you run,” focused on working with digital archives and digital content. I know it was a major turning point for helping the field build our practices. Our entire community is learning how to do this work and we do it together. We need to try things and see which things work best and which don’t. 

It’s really important to prioritize access in this work. Preservation is fundamentally about access in the future. The best way you know that something will be accessible in the future is if you’re making it accessible now. Then your users will help you. They can tell you if something isn’t working. The more that we can work end-to-end, that is, that we accession, process, arrange, describe, and make available digital content to our users, the more that we are able to focus on how we can continually improve that process end-to-end. Without having a full end-to-end process in place, it’s impossible to zoom out and look at that whole sequence of processes to start figuring out where the bottlenecks are and where you need to focus on working to optimize things. 


Dr. Trevor Owens is a librarian, researcher, policy maker, and educator working on digital infrastructure for libraries. Owens serves as the first Head of Digital Content Management for Library Services at the Library of Congress. He previously worked as a senior program administrator at the United States Institute of Museum and Library Services (IMLS) and, prior to that, as a Digital Archivist for the National Digital Information Infrastructure and Preservation Program and as a history of science curator at the Library of Congress. Owens is the author of three books, including The Theory and Craft of Digital Preservation and Designing Online Communities: How Designers, Developers, Community Managers, and Software Structure Discourse and Knowledge Production on the Web. His research and writing has been featured in: Curator: The Museum Journal, Digital Humanities Quarterly, The Journal of Digital Humanities, D-Lib, Simulation & Gaming, Science Communication, New Directions in Folklore, and American Libraries. In 2014 the Society for American Archivists granted him the Archival Innovator Award, presented annually to recognize the archivist, repository, or organization that best exemplifies the “ability to think outside the professional norm.”

A Conversation with Annalise Berdini, Digital Archivist at the Seeley G. Mudd Manuscript Library, Princeton University

Interview conducted with Annalise Berdini in May 2019 by Hannah Silverman and Tamar Zeffren

This is the eighth post in a new series of conversations between emerging professionals and archivists actively working with digital materials.


Annalise Berdini is the Digital Archivist at the Seeley G. Mudd Manuscript Library at Princeton University, a position she has held since January 2018. She is responsible for the ongoing management of the University Archives Digital Curation Program, as well as managing a collection of web archives and assisting with reference services.

Annalise’s first post-graduate school position was as a manuscripts and archives processor at the University of California, San Diego (UCSD). While she was working at UCSD, universities and archives were slowly starting to see the need for a dedicated digital archivist position. When the Special Collections department at UCSD created their first digital archivist position, Annalise applied and got the job. She explains that a good deal of her work there, and at Princeton, is graciously supported by a community of digital archivists solving similar challenges in other institutions.

As Annalise has now held a digital archivist role at two different institutions, both universities, we were interested to hear her perspectives on how colleagues and researchers have understood – or misunderstood – her role. “Because I have digital in my job title,” she noted, “people interpret that in a lot of very wide and broad ways. Really digital archives is still an emerging field…there are so many questions to answer, and it’s fun to investigate that aspect of the field.”

Given prevailing concerns among institutional archives about preserving and processing legacy media, we were keenly interested in hearing Annalise’s insights about securing stakeholder buy-in to develop a digital archives program.

“It’s a struggle everywhere,” she acknowledges. Presently, Princeton’s efforts to build up a more robust digital preservation program have led the University to a partnership with a UK-based company called Arkivum, which offers digital preservation, storage, maintenance, auditing and reporting modules and has the capacity to incorporate services from Archivematica and create a customized digital storage solution for Princeton.

“We’ve been lucky here [at Mudd]. We’re getting this great system. There is buy-in and there seems to be a pretty strong push right now. For us, the most compelling argument we’ve had is that we are mandated to collect student materials and student records that will not exist anywhere else unless we take them. The school has to keep those records, there’s not an option. Emphasizing how easily that content could be lost without a proper digital preservation system in place was very compelling to people who weren’t necessarily aware of the fact that hard drives sitting on a shelf are really not acceptable storage choices and options.”

Annalise has also found that deploying some compelling statistics can aid in building awareness around digital archives needs. In discussions about how rapidly materials can degrade, Annalise likes to cite a Journal of Western Archives article, “Capturing and Processing Born-Digital Files in the STOP AIDS Project Records,” which showcases findings that out of a vast collection of optical storage media, “only 10% of these hundreds of DVDs were really able to be recovered, whereas, strangely, a lot of the floppy disks were better and easier to recover…I think emphasizing how fragile digital content is [can help people understand] how easily it will corrupt without you even knowing it.”

Equally as important to generating momentum for such programs are the direct relationships Annalise cultivates with colleagues, within and without the archives. “My boss was really instrumental in the process, and the head of library IT helped me navigate getting approvals from the University as a whole and the University IT department.”

The complex process of sustaining and innovating a digital archives infrastructure provides ongoing opportunities for Annalise to “solve puzzles” and to unite colleagues in confronting the challenges of documenting and preserving born-digital heritage: “I have focused on trying to find one person who is maybe a level above me and to connect with them and then hopefully build up a network within my institution to build some groundswell.”


Hannah Silverman

Tamar Zeffren

Hannah Silverman and Tamar Zeffren both work at JDC Archives. Tamar is the Archival Collections Manager. Hannah is the Digitization Project Specialist and also works independently as a photo archivist. Both received SAA’s DAS certification.

PASIG (Preservation and Archiving Special Interest Group) 2019 Recap

by Kelly Bolding

PASIG 2019 met the week of February 11th at El Colegio de México (commonly known as Colmex) in Mexico City. PASIG stands for Preservation and Archiving Special Interest Group, and the group’s meeting brings together an international group of practitioners, industry experts, vendors, and researchers to discuss practical digital preservation topics and approaches. This meeting was particularly special because it was the first time the group convened in Latin America (past meetings have generally been held in Europe and the United States). Excellent real-time bilingual translation for presentations given in both English and Spanish enabled conversations across geographical and linguistic boundaries and made room to center Latin American preservationists’ perspectives and transformative post-custodial archival practice.

Perla Rodriguez of the Universidad Nacional Autónoma de México (UNAM) discusses an audiovisual preservation case study.

The conference began with broad overviews of digital preservation topics and tools to create a common starting ground, followed by more focused deep-dives on subsequent days. I saw two major themes emerge over the course of the week. The first was the importance of people over technology in digital preservation. From David Minor’s introductory session to Isabel Galina Russell’s overview of the digital preservation landscape in Mexico, presenters continuously surfaced examples of the “people side” of digital preservation (think: preservation policies, appraisal strategies, human labor and decision-making, keeping momentum for programs, communicating to stakeholders, ethical partnerships). One point that struck me during the community archives session was Verónica Reyes-Escudero’s discussion of “cultural competency as a tool for front-end digital preservation.” By conceptualizing interpersonal skills as a technology for facilitating digital preservation, we gain a broader and more ethically grounded idea of what it is we are really trying to do by preserving bits in the first place. Software and hardware are part of the picture, but they are certainly not the whole view.

The second major theme was that digital preservation is best done together. Distributed digital preservation platforms, consortial preservation models, and collaborative research networks were well-represented by speakers from LOCKSS, Texas Digital Library (TDL), DuraSpace, Open Preservation Foundation, Software Preservation Network, and others. The takeaway from these sessions was that the sheer resource-intensiveness of digital preservation means that institutions, both large and small, are going to have to collaborate in order to achieve their goals. PASIG seemed to be a place where attendees could foster and strengthen these collective efforts. Throughout the conference, presenters also highlighted failures of collaborative projects and the need for sustainable financial and governance models, particularly in light of recent developments at the Digital Preservation Network (DPN) and the Digital Public Library of America (DPLA). I was particularly impressed by Mary Molinaro’s honest and informative discussion about the factors that led to the shuttering of DPN. Molinaro indicated that DPN would soon be publishing a final report in order to transparently share their model, flaws and all, with the broader community.

Touching on both of these themes, Carlos Martínez Suárez of Video Trópico Sur gave a moving keynote about his collaboration with Natalie M. Baur, Preservation Librarian at Colmex, to digitize and preserve video recordings he made while living with indigenous groups in the Mexican state of Chiapas. The question and answer portion of this session highlighted some of the ethical issues surrounding rights and consent when providing access to intimate documentation of people’s lives. While Colmex is not yet focusing on access to this collection, it was informative to hear Baur and others talk a bit about the ongoing technical, legal, and ethical challenges of a work-in-progress collaboration.

Presenters also provided some awesome practical tools for attendees to take home with them. One of the many great open resources session leaders shared was DigiPET: A Community Built Guide for Digital Preservation Education + Training, a living Google document from Frances Harrell (NEDCC) and Alexandra Chassanoff (Educopia) for compiling educational tools, which you can add to using this form. Julian Morley also shared a Preservation Storage Cost Model Google sheet that contains a template with a wealth of information about estimating the cost of different digital preservation storage models, including comparisons for several cloud providers. Amy Rudersdorf (AVP), Ben Fino-Radin (Small Data Industries), and Frances Harrell (NEDCC) also discussed helpful frameworks for conducting self-assessments.

Selina Aragon, Daina Bouquin, Don Brower, and Seth Anderson discuss the challenges of software preservation.

PASIG closed out by spending some time on the challenges involved with preserving emerging and complex formats. On the last afternoon of sessions, Amelia Acker (University of Texas at Austin) spoke about the importance of preserving APIs, terms of service, and other “born-networked” formats when archiving social media. She was followed by a panel of software preservationists who discussed different use cases for preserving binaries, source code, and other software artifacts.

Conference slides are all available online.

Thanks to the wonderful work of the PASIG 2019 steering, program, and local arrangements committees!


Kelly Bolding is the Project Archivist for Americana Manuscript Collections at Princeton University Library, as well as the team leader for bloggERS! She is interested in developing workflows for processing born-digital and audiovisual materials and making archival description more accurate, ethical, and inclusive.