“A Trial by Fire Process”: Digital Archiving at the State Historical Society of Missouri (Interview with Elizabeth Engel)

This is the fourth post in the Conversations series

Founded in 1898, the State Historical Society of Missouri (SHSMO) “collect[s], preserve[s], publish[es], exhibit[s], and make[s] available material related to all aspects and periods of Missouri history” (The State Historical Society of Missouri, “About Us”). Supporting this mission is a large staff that includes thirty-five full-time and twelve part-time employees, two research fellows, and a large number of volunteers and interns who work in one of SHSMO’s six Research Centers (The State Historical Society of Missouri, “About Us”). My interviewee, Senior Archivist Elizabeth Engel, serves at the Columbia Research Center on the University of Missouri campus. Elizabeth and her colleagues work to make SHSMO’s collections (e.g. the National Women and Media Collection) accessible to a wide variety of patrons, including film creators, reporters, and researchers from all walks of life.

Elizabeth’s entry into the archival field was due partly to happenstance. After enrolling in the University of Iowa’s (UI) School of Information Science, Elizabeth expected to work in public libraries—especially because she had worked in similar settings during her high school and college years. However, she seized upon an opportunity to complete a work-study assignment at the Iowa Women’s Archives (at the University of Iowa) and promptly discovered a passion for archives. After she graduated from UI in 2006, SHSMO hired her as a Manuscript Specialist—and the rest is, well, history (The State Historical Society of Missouri, “SHSMO Staff”). As the senior archivist for the Columbia Research Center, Elizabeth’s day-to-day work involves processing collections, fulfilling various public service responsibilities, and developing biographical histories of Missouri’s most well-known citizens. Her greatest responsibility, however, is overseeing the Columbia Research Center’s accessioning efforts—particularly as they pertain to digital content.

Elizabeth’s Research Center has seen a marked increase in the amount of born-digital material that it takes in each year. This point is exemplified by SHSMO’s recent acquisition of Senator Claire McCaskill’s papers, which consist of approximately 3.25 cubic feet and two terabytes of data. To tackle the challenges of managing such content, Elizabeth and her staff have employed a variety of tactics and tools. While MPLP-inspired collection-level descriptions have sufficed for physical collections, Elizabeth noted that digital content requires more in-depth description for access and preservation purposes. Elizabeth’s work on other projects—such as the processing of the Missouri Broadcasters Association Radio Archives Collection—reinforced the importance of flexibility, as exemplified by her arrangement tactics (recordings are organized by call sign, and further accruals are added to the end of the finding aid) and description efforts (“some of the file names were in ALL CAPS and I decided to retain that for the time being as well…perhaps it will aid in retrieval”).

This theme of flexibility emerged when Elizabeth discussed the different digital archiving tools that SHSMO staff have employed: Duke University’s DataAccessioner and Microsoft Excel spreadsheets (to create and organize metadata); various storage spaces, including network attached storage (NAS) units and a dark archive (both of which are accessible only to certain staff); thumb drives, used to deliver content to patrons; a Microsoft Access database, which serves as the institution’s collection management system; and BitCurator, which SHSMO staff set up to tackle larger and more complex collections (e.g. Senator McCaskill’s papers). Overall, effectively and efficiently managing these digital resources has been “a [constant] trial by fire process,” given the somewhat volatile nature of the digital archives field. In the future, Elizabeth hopes that SHSMO will adopt more user-friendly and compatible software—such as Archivematica and/or Access to Memory (AtoM)—to fulfill its mission. In fact, Elizabeth emphasized that finding such tools—especially cost-effective tools—represents one of the greater challenges facing modern archivists.

For the aspiring digital archivist, Elizabeth recommended seeking out practice-focused learning opportunities. To complement her largely theoretical UI coursework, Elizabeth completed the Digital Archives Specialist (DAS) certificate, scans the field for published literature, and engages in other professional development efforts. She further recommended the workshops provided by Lyrasis as another opportunity to deepen one’s digital preservation knowledge. Elizabeth explained that the twenty-first-century digital archivist must remain flexible and commit to continual learning to stay on top of the field’s recent developments. She emphasized that these professionals must also be given sufficient time to learn and experiment with tools and workflows.

Before we digitally parted ways, Elizabeth offered one final and—in this writer’s opinion—exceptionally solid piece of advice:

“You’re going to make mistakes and that’s okay. The DAS courses drilled it into me that ‘Doing something is better than nothing.’ Standards/tools are going to change and you can’t predict that. Sometimes all you can do is digital triage with the resources/time you have, so don’t let doing things perfectly be the enemy of the good.”



Author Bio: Steven Gentry is the Archives Technician for the St. Mary’s College of Maryland Archives. His responsibilities include processing collections and building finding aids; assisting with web and email archiving efforts; and researching tools and best practices pertaining to digital archives and electronic records.

 



An Interview With Caitlin Birch — Digital Collections and Oral History Archivist at the Rauner Special Collections Library, Dartmouth

Interview conducted with Caitlin Birch by Juli Folk in March 2019

This is the third post in the Conversations series

Meet Caitlin Birch

Caitlin Birch is the Digital Collections and Oral History Archivist for the Rauner Special Collections Library at Dartmouth College in Hanover, New Hampshire. She sat down with Juli Folk, a graduate student at the University of Maryland-College Park iSchool who is pursuing an archives-focused MLIS and a certificate in Museum Scholarship and Material Culture. Caitlin’s descriptions of her career path, her roles and achievements, and her insights into the challenges she faces helped frame a discussion of helpful skill sets for working with born-digital archival records on a daily basis.

Caitlin’s Career Path

As an undergraduate, Caitlin majored in English, concentrating in journalism with minors in history and Irish studies. After a few years working as a reporter and editor, she began to consider a different career path, looking for other fields that emphasize constant learning, storytelling, and contributions to the historical record. In time, she decided on a dual degree (MA/MSLIS) in history and archives management from Simmons College (now Simmons University). Throughout grad school, her studies combined historical methods and original research with archival theory and practice.

When asked about the path to her current position, Caitlin responded, “To the extent that my program allowed, I tried to take courses with a digital focus whenever I could. I also completed two internships and worked in several paraprofessional positions, which were really invaluable in preparing me for professional work in the field. I finished my degrees in December 2013 and landed my job at Dartmouth a few months later.” She now works as the Digital Collections and Oral History Archivist for Rauner Special Collections Library, the home of Dartmouth College’s rare books, manuscripts, and archives, housed within the larger academic research library.

Favorite Aspects of Being an Archivist

For Caitlin, the best aspects of being an archivist are working at the intersection of history and technology; teaching and interacting with people every day; and having new opportunities to create, innovate, and learn. Her position includes roles in both oral history and born-digital records, and on any given day she may be juggling tasks like teaching students oral history methodology, working on the implementation of a digital repository, building Dartmouth’s web archiving program, managing staff, sharing reference desk duty, and staying abreast of the profession via involvement with the SAA and the New England Archivists Executive Board. “I like that no two days are the same,” she shared, adding, “I like that my work can have a positive impact on others.”

Challenges of Being an Archivist

Caitlin pointed out that aspects of the profession change and evolve at a pace that can make it difficult to keep up, especially when job- or project-related tasks demand so much attention. She also noted other challenges: “More and more we’re grappling with issues like the ethical implications of digital archives and the environmental impact of digital preservation.” That said, she finds that “the biggest challenge is also the biggest opportunity: most of what I do hasn’t been done before at Dartmouth. I’m the first digital archivist to be hired at my institution, so everything—infrastructure, policies, workflows, etc.—has been/is being built from the ground up. It’s exciting and often very daunting, especially because this corner of the archives field is dynamic.”

Advice for Students and Young Professionals

As a result, Caitlin emphasized the importance of experimentation and failure. “Traditional archival practice is well-defined and there are standards to guide it, but digital archives present all kinds of unique challenges that didn’t exist until very recently. Out of necessity, you have to innovate and try new things and learn from failure in order to get anywhere.” For this reason, she recommended building a good professional network and finding time to keep up with the professional literature. “It’s really key to cultivate a community of practice with colleagues at other institutions.”

When asked whether she sets aside dedicated time for these tasks or finds that networking and research are natural outputs of her daily work, Caitlin stated that networking comes more easily because of her involvement with professional organizations. Finding time for professional literature and research, however, proved more difficult, a concern Caitlin brought to her manager. In response, he encouraged her to block 1–2 hours on her calendar at the same time every week to catch up on reading and professional news. She remains grateful for that support: “I would hope that every manager in this profession encourages time for regular professional development. It may seem like it’s taking time away from job responsibilities, but in actuality it’s helping you to build the skills and knowledge you need for future innovation.”



Juli Folk is finishing the MLIS program at the University of Maryland-College Park iSchool, specializing in Archives and Digital Curation. Previously a corporate editor and project manager, Juli supplements her passions for writing, art, and technology with formal archival training, refocusing her career on cultural heritage institutions.

An Interview with Erica Titkemeyer – Project Director and AV Conservator at the Southern Folklife Collection, UNC

Interview conducted with Erica Titkemeyer by Morgan McKeehan in March 2019.

This is the second post in a new series of conversations between emerging professionals and archivists actively working with digital materials.


Erica is the Project Director and AV Conservator at the Southern Folklife Collection, in Wilson Special Collections Library at the University of North Carolina at Chapel Hill’s University Libraries.

Tell us a little bit about the path that brought you to your current position.

As an undergrad I majored in Cinema and Photography, which initially put me in contact with many of the analog-based obsolete formats our team at UNC works to digitize now. It was also during this time that I saw how varied new proprietary born-digital formats could be based on camera types, capture settings, and editing environments, and how these files could be just as problematic as film and magnetic-based formats when trying to access content over time. Whether projects originated on something like DVCAM or P2 cards, codec and file format compatibility issues were a daily occurrence in classes. After undergrad I went through NYU’s Moving Image Archiving and Preservation program, where courses in digital preservation helped instill a lot of the foundational knowledge I use today.

After grad school, I spent nine months in the inaugural National Digital Stewardship Residency cohort in Washington, D.C., where I worked at the Smithsonian Institution Archives to explore the digital preservation needs and challenges of digital media art.

My current position is primarily concerned with the timely digitization and preservation of, and access to, obsolete analog audiovisual formats, but our digital tape-based collections are growing, and there are many born-digital accessions with a myriad of audio and video file formats that we need to make decisions about now to ensure they’re around for the long term.

What type of institution do you currently work at and where is it located?

I work within Wilson Special Collections Library at the University of North Carolina at Chapel Hill’s University Libraries. I am situated in the Southern Folklife Collection, which holds the majority of audiovisual recordings in Wilson Library; however, my team has expanded to work with all audiovisual recordings in the building as part of a new Andrew W. Mellon grant, Extending the Reach of Southern Audiovisual Sources: Expansion.

What do you love most about working with AV archival materials?

I’ve always been excited to learn about moving image and sound technologies and how they fit into historical contexts. Even if I know nothing about a collection except for the format, there’s enough there to understand the time and circumstances the recordings were created in. This is just as much the case for born-digital audiovisual files as it is for analog. We’ve seen file formats, codecs, and recording equipment go by the wayside, and so they exist as markers of a particular time.

What’s the biggest challenge affecting your work (and/or the field at large)?

Current and future digital video capabilities can provide a lot of options for documentarians and filmmakers, which is great news for them, but it also means there’s going to be a flood of new file formats with encodings and specifications we have not dealt with, many of which will already be difficult to access by the time they make it to our library because of planned obsolescence. We’ve already started to see these collections come in, and for various reasons it’s impossible to normalize everything to our audiovisual target preservation specifications while still retaining quality. Fortunately, there are a lot of folks thinking about this who are building some precedent when it comes to making decisions about the files. Julia Kim at the Library of Congress, Rebecca Fraimow at WGBH, and I have also done a couple of panel talks on this and recently put out an article through Code4Lib on this topic (https://journal.code4lib.org/articles/14244).

What advice would you give yourself as a student or professional first delving into digital archives work?

Everything can seem very overwhelming. There are a lot of directions to take in audiovisual preservation and archiving, and digital archiving and preservation is the shiny new frontier, but there’s a lot to gain by starting with what you know and taking it from there. I think building my knowledge and expertise in analog preservation risks has inevitably helped me tackle some of the more challenging aspects of born-digital audiovisual preservation.


Morgan McKeehan is the Digital Collections Specialist in the Repository Services Department within the University Libraries at the University of North Carolina at Chapel Hill. In this role, she provides support for the management of and access to digitized and born-digital materials from across the Libraries’ special collections units.

An Interview with Amy Berish – Assistant Archivist at the Rockefeller Archive Center

by Georgia Westbrook

This is the first post in a new series of conversations between emerging professionals and archivists actively working with digital materials.

Amy Berish is an Assistant Archivist at the Rockefeller Archive Center in Sleepy Hollow, New York. There, she is a member of the Processing Team, working on processing collections that cover a wide range of philanthropic history and a variety of materials. A recent graduate of the University of Pittsburgh Master of Library and Information Science program, Amy has generously shared her path and experiences with bloggERS!

Amy began working in her local library when she was 14 and went on to major in library and information science as an undergraduate. While there and throughout graduate school, she worked at the university library, took various internships, and worked for school credit at the preservation lab, all in an effort to find her place in the library and archives world.

In her current role at the Rockefeller Archive Center, she works as part of a larger staff to process incoming collections in both paper and digital formats. The Rockefeller Archive Center collects materials related to the Rockefeller family as well as to several other large philanthropic organizations, including the Ford Foundation, the Near East Foundation, the Commonwealth Fund, the Rockefeller Brothers Fund, the Henry Luce Foundation, and the W. T. Grant Foundation, among others. While she shied away from working with digital formats and learning coding skills during college, she has had the opportunity to pursue that work in her current role and has embraced the challenges that have come with it.

“I feel like digital work is the biggest challenge right now, in both the work I am doing and the work of the broader archival profession,” she said. “Learning to navigate the technical skills required to do some of the work we are doing can be especially daunting. Having a positive attitude about change and a willingness to learn is often easier said than done – but I also think these two factors could help make this type of work seem more doable.”

Amy has found support in her teams at the Rockefeller Archive Center and in the archives community in and around New York City. For example, Digital Team members at the Rockefeller Archive Center reminded her that it would be okay to break things in the code if she wanted to experiment with a new way of scripting, and that they would be able to fix whatever broke. She has also found support in online forums, which have allowed her to connect to others doing related work across the country.

Beyond scripting, part of her position requires her to deal with formats that might be obsolete or nearly so, and to face policy questions regarding proprietary information and copyright. As with coding, however, Amy has used her enthusiasm for learning new skills as an asset in facing these challenges.

“I love learning new things, and as a processing archivist, it’s part of my job to continue to learn more about various topics through each collection I process,” Amy said. “I also get the opportunity to learn through some of the digital projects I am working on. I have learned to automate processes by writing scripts. I have also had a lot of experience lately working with legacy digital media – from optical disks and floppies to zip disks and Bernoulli disks – it has been a challenge trying to get 10-year-old media to function properly!”

As a new professional, Amy was quick to mention some of the challenges that archivists can face at the beginning of their careers. Still, she said, a pat on the back for each small step you take is well-deserved. She cited one of her graduate school professors, who encouraged her to cultivate an “ethos of fearlessness” when facing technology; she said the phrase has become a mantra in her current position. Since that, Amy acknowledged, is easier said than done, especially while you’re still in school, she has three other pieces of advice to share for others just starting out in digital archives work: Take the opportunities you’re given, always be ready to learn, and don’t be afraid of digital work.


Georgia Westbrook is an MSLIS student at Syracuse University. She’s interested in visual resources, oral histories, digital publishing, and open access. Connect with her on LinkedIn or on her website.

Using R to Migrate Box and Folder Lists into EAD

by Andy Meyer

Introduction

This post is a case study about how I used the statistical programming language R to help export, transform, and load data from legacy finding aids into ArchivesSpace. I’m sharing this workflow in the hopes that another institution might find this approach helpful and that it could be generalized to other issues facing archives.

I decided to use the programming language R because it is a free and open source programming language that I had some prior experience using. R has a large and active user community as well as a large number of relevant packages that extend the basic functions of R, including libraries that can read Microsoft Word tables and read and write XML. All of the code for this project is posted on GitHub.

The specific task that sparked this script arose when I inherited hundreds of finding aids with minimal collection-level information and very long and detailed box and folder lists. These were all Microsoft Word documents, with the box and folder list formatted as a table within the Word document. We recently adopted ArchivesSpace as our archival content management system, so the challenge was to reformat this data and upload it into ArchivesSpace. I considered manual approaches but eventually opted to develop this code to automate the work. The code is generally organized into three sections: exporting the data, transforming and cleaning the data, and finally, creating an EAD file to load into ArchivesSpace.

Data Export

After installing the appropriate libraries, the first step of the process was to extract the data from the Microsoft Word tables. Given the nature of our finding aids, I focused on extracting only the box and folder list; collection-level information would be added manually later in the process.

This process was surprisingly straightforward: I created a variable with a path to a Word document and used the docx_extract_tbl function from the docxtractr package to extract the contents of that table into a data.frame in R. Sometimes our finding aids were inconsistent, so I occasionally had to tweak the data to rearrange the columns or add missing values. The outcome of this step is a data.frame with four columns containing folder title, date, box number, and folder number.
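As a rough sketch of that export step (the file name, table position, and column order here are assumptions that would vary by finding aid):

    # Minimal sketch of the export step, assuming a Word document whose
    # first table is the box/folder list (file name is hypothetical)
    library(docxtractr)

    doc <- read_docx("finding_aid.docx")
    box_list <- docx_extract_tbl(doc, tbl_number = 1)

    # Normalize to the four columns used in later steps; adjust if a
    # given finding aid orders its columns differently
    names(box_list) <- c("title", "date", "box", "folder")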

This data export process is remarkably flexible. Using other R functions and libraries, I have extended this process to export data from CSV files or Excel spreadsheets, as sketched below. In theory, this process could be extended to receive a wide variety of data, including collection-level descriptions and digital objects, from a wider variety of sources. There are other tools that can do this work (Yale’s Excel to EAD process and Harvard’s Aspace Import Excel plugin), but I found this process better suited to my institution’s needs.
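As a quick illustration of that flexibility (file names again hypothetical), the same downstream steps work when a list arrives in another format:

    # Reading an equivalent box/folder list from CSV or Excel
    box_list <- read.csv("box_list.csv", stringsAsFactors = FALSE)

    # or, with the readxl package:
    # box_list <- readxl::read_excel("box_list.xlsx")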

Data Transformation and Cleaning

Once I extracted the data from the Microsoft Word document, I did some minimal data cleanup, a sampling of which included:

  1. Extracting a date range for the collection. Again, past practice focused on creating folder-level descriptions and nearly all of our finding aids lacked collection-level information. From the box/folder list, I tried to extract a date range for the entire collection. This process was messy but worked a fair amount of the time. In cases when the data were not standardized, I defined this information manually.
  2. Standardizing “No Date” text. Over the course of this project, I discovered the following terms for folders that didn’t have dates: “n.d.”, “N.D.”, “no date”, “N/A”, “NDG”, “Various”, “N. D.”, “” (blank), “??”, “n. d.”, “n. d. ”, “No date”, “-”, “N.A.”, “ND”, “NO DATE”, and “Unknown.” For all of these, I updated the date field to “Undated” as a way to standardize this field.
  3. Spelling out abbreviations. Occasionally, I would use regular expressions to spell out words in the title field. This could be standard terms like “Corresp” to “Correspondence” or local terms like “NPU” to “North Park University.”

R is a powerful tool and provides many options for data cleaning. We did pretty minimal cleaning but this approach could be extended to do major transformations to the data.
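As an illustration, steps 2 and 3 above might look something like this in R; the variant list mirrors the one described earlier, and the column names follow the export sketch, so treat it as a sketch rather than the post’s actual code:

    # Standardize the many "no date" variants to "Undated" (step 2)
    no_date <- c("n.d.", "N.D.", "no date", "N/A", "NDG", "Various", "N. D.",
                 "", "??", "n. d.", "n. d. ", "No date", "-", "N.A.", "ND",
                 "NO DATE", "Unknown")
    box_list$date[trimws(box_list$date) %in% trimws(no_date)] <- "Undated"

    # Spell out abbreviations in folder titles with regular expressions (step 3)
    box_list$title <- gsub("\\bCorresp\\b", "Correspondence", box_list$title)
    box_list$title <- gsub("\\bNPU\\b", "North Park University", box_list$title)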

Create EAD to Load into ArchivesSpace

Lastly, with the data cleaned, I could restructure the data into an XML file. Because the goal of this project was to import into ArchivesSpace, I created an extremely basic EAD file meant mainly to enter the box and folder information into ArchivesSpace; collection-level information would be added manually within ArchivesSpace. In order to get the cleaned data to import, I first needed to define a few collection-level elements including the collection title, collection ID, and date range for the collection. I also took this as an opportunity to apply a standard conditions governing access note for all collections.

Next, I used the XML package in R to create the minimally required nodes and attributes. For this section, I relied on examples from the book XML and Web Technologies for Data Sciences with R by Deborah Nolan and Duncan Temple Lang. I created the basic EAD schema in R using the newXMLNode functions from the XML package. This section of code is very minimal, and I would welcome suggestions from the broader community about how to improve it. I then defined functions that make the title, date, box, and folder nodes, which were applied to the data exported and transformed in earlier steps. Finally, the script saves everything as an XML file that I then uploaded into ArchivesSpace.
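Here is a minimal sketch of that node-building step. The element names follow general EAD conventions, but the helper function, component structure, and output file name are illustrative rather than the post’s actual code, and a real ArchivesSpace import would also need the collection-level elements described above:

    library(XML)

    # Skeleton: ead > archdesc > dsc
    ead <- newXMLNode("ead")
    archdesc <- newXMLNode("archdesc", attrs = c(level = "collection"), parent = ead)
    dsc <- newXMLNode("dsc", parent = archdesc)

    # Illustrative helper: one file-level component per box/folder row
    add_component <- function(title, date, box, folder) {
      c01 <- newXMLNode("c01", attrs = c(level = "file"), parent = dsc)
      did <- newXMLNode("did", parent = c01)
      newXMLNode("unittitle", title, parent = did)
      newXMLNode("unitdate", date, parent = did)
      newXMLNode("container", as.character(box), attrs = c(type = "box"), parent = did)
      newXMLNode("container", as.character(folder), attrs = c(type = "folder"), parent = did)
    }

    # Apply to every row of the cleaned data, then write the EAD file
    invisible(mapply(add_component, box_list$title, box_list$date,
                     box_list$box, box_list$folder))
    saveXML(ead, file = "finding_aid_import.xml")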

Conclusion

Although this script was designed to solve a very specific problem—extracting box and folder information from a Microsoft Word table and importing that information into ArchivesSpace—I think this approach could have wide and varied usage. The import process can accept loosely formatted data in a variety of formats, including Microsoft Word, plain text, CSV, and Excel, and reformat the underlying data into a standard table. R offers an extremely robust set of packages to update, clean, and reformat this data. Lastly, you can define the export process to reformat the data into a suitable file format. Given the nature of this programming language, it is easy to preserve your original data source as well as document all the transformations you perform.


Andy Meyer is the director (and lone arranger) of the F.M. Johnson Archives and Special Collections at North Park University. He is interested in archival content management systems, digital preservation, and creative ways to engage communities with archival materials.

Assessing the Digital Forensics Instruction Landscape with BitCuratorEdu

by Jess Farrell

This is the sixth post in the bloggERS Making Tech Skills a Strategic Priority series.

Over the past couple of months, we’ve heard a lot on bloggERS about how current students, recent grads, and mid-career professionals have made tech skills a strategic priority in their development plans. I like to think about the problem of “gaining tech skills” as being similar to “saving the environment”: individual action is needed and necessary, but it is most effective when it feeds clearly into systemic action.

So that raises the question: what root changes might educators of all types suggest and support to help GLAM professionals prioritize tech skills development? What are educator communities and systems – iSchools, faculty, and continuing education instructors – doing to achieve this? These questions are among those addressed by the BitCuratorEdu research project.

The BitCuratorEdu project is a three-year effort funded by the Institute of Museum and Library Services (IMLS) to study and advance the adoption of born-digital archiving and digital forensics tools and methods in libraries and archives through a range of professional education efforts. The project is a partnership between the School of Information and Library Science at the University of North Carolina at Chapel Hill and the Educopia Institute, along with the Council of State Archivists (CoSA) and nine universities that are educating future information professionals.

We’re addressing two main research questions:

  1. What are the primary institutional and technological factors that influence adoption of digital forensics tools and methods in different educational settings?
  2. What are the most viable mechanisms for sustaining collaboration among LIS programs on the adoption of digital forensics tools and methods?

The project started in September 2018 and will conclude in fall 2021, with Educopia and UNC SILS conducting ongoing research and releasing open educational resources on a rolling basis. With the help of our Advisory Board, made up of nine iSchools, and our Professional Experts Panel, composed of leaders in the GLAM sector, we’re:

  • Piloting instruction to produce and disseminate a publicly accessible set of learning objects that can be used by education providers to administer hands-on digital forensics education
  • Gathering information and centralizing existing educational content to produce guides and other resources, such as this (still-in-development) guide to datasets that can be used to learn new digital forensics skills or test digital archives software/processes
  • Investigating and reporting on institutional factors that facilitate, hinder, and shape adoption of digital forensics educational offerings

Through this work and intentional community cultivation, we hope to advance a community of practice around digital forensics education through partner collaboration, wider engagement, and exploration of community sustainability mechanisms.

To support our research and steer the direction of the project, we have conducted and analyzed nine advisory board interviews with current faculty who have taught or are developing a curriculum for digital forensics education. So far we’ve learned that:

  • instructors want and need access to example datasets to use in the classroom (especially cultural heritage datasets);
  • many want lesson plans and activities for teaching born-digital archiving tools and environments like BitCurator in one or two weeks because few courses are devoted solely to digital forensics;
  • they want further guidance on how to facilitate hands-on digital forensics instruction in distributed online learning environments; and
  • they face challenges related to IT support at their home institutions, just like those grappled with by practitioners in the field.

This list barely scratches the surface of our exploration into the experiences and needs of instructors for providing more effective digital forensics education, and we’re excited to tackle the tough job of creating resources and instructional modules that address these and many other topics. We’re also interested in exploring how the resources we produce may also support continuing education needs across libraries, archives, and museums.

We recently conducted a Twitter chat with SAA’s SNAP Section to learn about students’ experiences in digital forensics learning environments. We heard a range of experiences, from students who reported they had no opportunity to learn about digital forensics in some programs, to students who received effective instruction that remained useful post-graduation. We hope that the learning modules released at the conclusion of our project will address students’ learning needs just as much as their instructors’ teaching needs.

Later this year, we’ll be conducting an educational provider survey that will gather information on barriers to adoption of digital forensics instruction in continuing education. We hope to present to and conduct workshops for a broader set of audiences including museum and public records professionals.

Our deliverables, from conference presentations to learning modules, will be released openly and freely through a variety of outlets including the project website, the BitCurator Consortium wiki, and YouTube (for recorded webinars). Follow along at the project website or contact jess.farrell@educopia.org if you have feedback or want to share your insights with the project team.

 

Author bios:

Jess Farrell is the project manager for BitCuratorEdu and community coordinator for the Software Preservation Network at Educopia Institute. Katherine Skinner is the Executive Director of Educopia Institute, and Christopher (Cal) Lee is Associate Professor at the School of Information and Library Science at the University of North Carolina, Chapel Hill, teaching courses on archival administration, records management, and digital curation. Katherine and Cal are Co-PIs on the BitCuratorEdu project, funded by the Institute of Museum and Library Services.

PASIG (Preservation and Archiving Special Interest Group) 2019 Recap

by Kelly Bolding

PASIG 2019 met the week of February 11th at El Colegio de México (commonly known as Colmex) in Mexico City. PASIG stands for Preservation and Archiving Special Interest Group, and the group’s meeting brings together an international group of practitioners, industry experts, vendors, and researchers to discuss practical digital preservation topics and approaches. This meeting was particularly special because it was the first time the group convened in Latin America (past meetings have generally been held in Europe and the United States). Excellent real-time bilingual translation for presentations given in both English and Spanish enabled conversations across geographical and linguistic boundaries and made room to center Latin American preservationists’ perspectives and transformative post-custodial archival practice.

Perla Rodriguez of the Universidad Nacional Autónoma de México (UNAM) discusses an audiovisual preservation case study.

The conference began with broad overviews of digital preservation topics and tools to create a common starting ground, followed by more focused deep-dives on subsequent days. I saw two major themes emerge over the course of the week. The first was the importance of people over technology in digital preservation. From David Minor’s introductory session to Isabel Galina Russell’s overview of the digital preservation landscape in Mexico, presenters continuously surfaced examples of the “people side” of digital preservation (think: preservation policies, appraisal strategies, human labor and decision-making, keeping momentum for programs, communicating to stakeholders, ethical partnerships). One point that struck me during the community archives session was Verónica Reyes-Escudero’s discussion of “cultural competency as a tool for front-end digital preservation.” By conceptualizing interpersonal skills as a technology for facilitating digital preservation, we gain a broader and more ethically grounded idea of what it is we are really trying to do by preserving bits in the first place. Software and hardware are part of the picture, but they are certainly not the whole view.

The second major theme was that digital preservation is best done together. Distributed digital preservation platforms, consortial preservation models, and collaborative research networks were also well-represented by speakers from LOCKSS, Texas Digital Library (TDL), Duraspace, Open Preservation Foundation, Software Preservation Network, and others. The takeaway from these sessions was that the sheer resource-intensiveness of digital preservation means that institutions, both large and small, are going to have to collaborate in order to achieve their goals. PASIG seemed to be a place where attendees could foster and strengthen these collective efforts. Throughout the conference, presenters also highlighted failures of collaborative projects and the need for sustainable financial and governance models, particularly in light of recent developments at the Digital Preservation Network (DPN) and Digital Public Library of America (DPLA). I was particularly impressed by Mary Molinaro’s honest and informative discussion about the factors that led to the shuttering of DPN. Molinaro indicated that DPN would soon be publishing a final report in order to transparently share their model, flaws and all, with the broader community.

Touching on both of these themes, Carlos Martínez Suárez of Video Trópico Sur gave a moving keynote about his collaboration with Natalie M. Baur, Preservation Librarian at Colmex, to digitize and preserve video recordings he made while living with indigenous groups in the Mexican state of Chiapas. The question and answer portion of this session highlighted some of the ethical issues surrounding rights and consent when providing access to intimate documentation of people’s lives. While Colmex is not yet focusing on access to this collection, it was informative to hear Baur and others talk a bit about the ongoing technical, legal, and ethical challenges of a work-in-progress collaboration.

Presenters also provided some awesome practical tools for attendees to take home with them. One of the many great open resources session leaders shared was DigiPET: A Community Built Guide for Digital Preservation Education + Training, a living Google document from Frances Harrell (NEDCC) and Alexandra Chassanoff (Educopia) for compiling educational tools, which you can add to using this form. Julian Morley also shared a Preservation Storage Cost Model Google sheet that contains a template with a wealth of information about estimating the cost of different digital preservation storage models, including comparisons for several cloud providers. Amy Rudersdorf (AVP), Ben Fino-Radin (Small Data Industries), and Frances Harrell (NEDCC) also discussed helpful frameworks for conducting self-assessments.

Selina Aragon, Daina Bouquin, Don Brower, and Seth Anderson discuss the challenges of software preservation.

PASIG closed out by spending some time on the challenges involved with preserving emerging and complex formats. On the last afternoon of sessions, Amelia Acker (University of Texas at Austin) spoke about the importance of preserving APIs, terms of service, and other “born-networked” formats when archiving social media. She was followed by a panel of software preservationists who discussed different use cases for preserving binaries, source code, and other software artifacts.

Conference slides are all available online.

Thanks to the wonderful work of the PASIG 2019 steering, program, and local arrangements committees!


Kelly Bolding is the Project Archivist for Americana Manuscript Collections at Princeton University Library, as well as the team leader for bloggERS! She is interested in developing workflows for processing born-digital and audiovisual materials and making archival description more accurate, ethical, and inclusive.

Contribute to an ERS Community Project!

Please take this short survey to contribute to the 2019 ERS Community Project! The survey closes on Friday, March 29.

In December 2018, the ERS Steering Committee put out a call for ideas for a 2019 ERS community project. We’re thankful for the community input and are pleased to announce that we’re building a master list of digital archives and digital preservation resources that can be used for reference, or to provide a resource overlay for existing best practice and workflow documentation. The Committee has begun compiling resources and thinking about how they connect, but broader input is essential to this project’s success.

At this stage, we are interested in getting a sense of what the most useful resources are in our community. Please take our survey to share your top three go-to resources as well as any areas of electronic records work that you feel lack guidance and documentation. We are thinking of resources broadly, so feel free to suggest your three favorite journal articles, blogs, handbooks, workflows, tools and manuals, or any other style of resource that helps you process and preserve born-digital collections.

After the survey closes on Friday, March 29, we’ll compile and share the results. We also hope to eventually open up a community documentation space where anyone can add to our current list of resources. Once the data collection period is over, we’ll determine the best way to share a more polished version of this resource list.

On behalf of the ERS Steering Committee, thank you for participating!

  • Jessica Farrell
  • Jane Kelly
  • Susan Malsbury
  • Donald Mennerich
  • Kelsey O’Connell
  • Alice Prael
  • Jessica Venlet
  • Dorothy Waugh

Just do it: Building technical capacity among Princeton’s Archival Description and Processing Team

by Alexis Antracoli

This is the fifth post in the bloggERS Making Tech Skills a Strategic Priority series.

ArchivesSpace, Archivematica, BitCurator, EAD: the list goes on! The contemporary archivist is tasked not only with processing paper collections, but also with processing digital records and managing the descriptive data we create. This work requires technical skills that archivists twenty or even ten years ago didn’t need to master. It’s also rare that archivists get extensive training in the technical aspects of the field during their graduate programs. So how can a team of archivists build the skills they’ll need to meet the needs of an increasingly technical field? At the Princeton University Library, the newly formed Archival Description and Processing Team (ADAPT) is committed to meeting these challenges by building technical capacity across the team. We are achieving this by working on real-world projects that require technical skills, leveraging existing knowledge and skills in the organization, seeking outside training, and championing supervisor support for using time to grow our technical skills.

One of the most important requirements for growing technical capacity on the processing team is supervisor support for the effort. Workshops, training, and solving technical problems take a significant amount of time. Without management support for the time needed to develop technical skills, the team would not be able to experiment, attend trainings, or practice writing code. As the manager of ADAPT, I make this possible by encouraging staff to set specific goals related to developing technical skills on their yearly performance evaluations; I also accept that it might take us a little longer to complete all of our processing. To fit this work into my own schedule, I identify real-world problems and block out time on my schedule to work on them or arrange meetings with colleagues who can assist me. Blocking out time in advance helps me stick to my commitment to building my technical skills. While the time needed to develop these skills means that some work happens more slowly today, having a team that can manipulate data and automate processes is an investment in the future that will result in a more productive and efficient processing team.

With the support to devote time to building technical skills, ADAPT staff use a number of resources to improve their skills. Working with internal staff who already have skills they want to learn has been one successful approach. This has generally paired well with the need to solve real-world data problems. For example, we recently identified the need to move some old container information to individual component-level scope and content notes in a finding aid. We were able to complete this after several in-house training sessions on XPath and XQuery taught by a Library staff member. This introductory training helped us realize that the problem could be solved with XQuery scripting and we took on the project, while drawing on the in-house XQuery expert for assistance. This combination of identifying real-world problems and leveraging existing knowledge within the organization leads both to increased technical skills and projects getting done. It also builds confidence and knowledge that can be more easily applied to the next situation that requires a particular kind of technical expertise.

Finally, building in-house expertise requires allowing staff to determine what technical skills they want to build and how they might go about doing it. Often that requires outside training. Over the past several years, we have brought workshops to campus on working with the command line and using the ArchivesSpace API. Staff have also identified online courses and classes offered by the Office of Information Technology as important resources for building their technical skills. Providing support and time to attend these various trainings or complete online courses during the work day creates an environment where individuals can explore their interests and the team can build a variety of technical skills that complement each other.

As archival work evolves, having deeper technology skills across the team improves our ability to get our work done. With the right support, tapping into in-house resources, and seeking out additional training, it’s possible to build increased technological capability within the processing team. In turn, the team will increasingly be able to tackle efficiently the day-to-day technical challenges involved in managing digital records and descriptive data.


Alexis Antracoli is Assistant University Archivist for Technical Services at Princeton University Library, where she leads the Archival Description and Processing Team. She has published on web archiving and the archiving of born-digital audiovisual content. Alexis is active in the Society of American Archivists, where she serves as Chair of the Web Archiving Section and on the Finance Committee. She is also active in Archives for Black Lives in Philadelphia, an informal group of local archivists who work on projects that engage issues at the intersection of the archival profession and the Black Lives Matter movement. She is especially interested in applying user experience research and user-centered design to archival discovery systems, developing and applying inclusive description practices, and web archiving. She holds an M.S.I. in Archives and Records Management from the University of Michigan, a Ph.D. in American History from Brandeis University, and a B.A. in History from Boston College.