Recap: Emulation in the Archives Workshop – UVA, July 18, 2019

By Brenna Edwards

The Emulation in the Archives workshop took place at the University of Virginia (UVA) on July 18, 2019, as part of the Software Preservation Network’s Fostering a Community of Practice grant cohort. This one-day workshop explored various aspects of emulation in archives, from legal challenges to access considerations, and included an overview of what UVA is currently doing in this area. The workshop featured talks from people across departments at UVA, as well as from the Library of Congress. In addition to the talks, attendees could sign up for wireframe testing of UVA’s current access methods for emulated material in their collections; this was optional, and people could also sign up for distance testing after the workshop if they preferred. 

The day was split into four different parts: an introduction to software preservation and emulation, including legal information; an overview of UVA’s current work in emulation; a look into the metadata for emulations and video game preservation; and considerations for access and user experience. Breaking up the day into these chunks defined a flow for the day, walking through the steps and considerations needed to emulate software and born digital materials. It also helped contain these topics, though of course certain themes and aspects kept appearing throughout the day in other presentations. 

The first portion of the day covered an introduction to software preservation and emulation, and the legal landscape. After explaining more of what Software Preservation Network’s Fostering a Community of Practice grant is, Lauren Work provided some definitions of emulation, software, and curatorial for use throughout the day. 

  • Emulation: digital technique that allows new computers to run legacy systems so older software appears the way it was originally designed
  • Software: source code, executables, applications, and other related components that are set of instructions for computers
  • Curatorial: responsibility and practice of appraising, acquiring, and describing collections

Work then talked more about the Peter Sheeran papers, a collection from an architectural firm based in Charlottesville and the main collection for this project. As a hybrid collection, it included Computer Aided Design (CAD) files and Building Information Modeling (BIM) software, which posed the question of what to do with them. The answer? Emulation! Since CAD/BIM files are heavily dependent on the versions of the software and files being used, UVA first did an inventory of what they had, down to license keys and compatibility with other software. To do this, they used the FCOP Collections Inventory Exercise to help guide them through what they needed to consider. They also looked at potential troubleshooting and legal issues they might run into. This led nicely into the next presentation, all about the legal landscape for software preservation, presented by Brandon Butler of UVA. Butler talked about copyright and The Copyright Permissions Culture in Software Preservation and Its Implications for the Cultural Record report done by ARL, as well as the idea of fair use, which is an often underutilized solution. He also talked about digital rights management, and how groups like SPN are bringing people together to ask questions that haven’t been asked before and working to secure exemptions, granted every three years, that allow preservationists to circumvent digital locks. Overall, he said that you should be good legally, but to do your research just to be on the safe side. 

This was followed by an overview of what UVA is currently doing. After reiterating “Access is everything” to the room, Michael Durbin demonstrated the current working pieces of their emulation system using Archivematica, Apollo, and a custom Curio display interface. He also demonstrated some of the EaaSI platform (which now has a sandbox available!), showing VectorWorks files and how they might be used. Durbin then explained how UVA, in its transition to ArchivesSpace, plans to use the Digital Object function to link to the external emulation, as well as display the metadata that goes along with it. UVA is also considering the description that can’t yet be stored in any of its systems, and how it might incorporate Wikidata in the future. Next, Lauren Work and Elizabeth Wilkinson talked about the curation workflows for software at UVA, which include a revamped Deed of Gift as well as additional checklists and questionnaires. Their main advice was to talk with donors early, early, early to get all the information you can, and to work with the donor to help make preservation and access decisions, though they also acknowledged this is not always possible. Work and Wilkinson are still working on integrating these steps into the curation workflow at UVA, but also plan to start working more on their appraisal and processing workflows. Have thoughts on the checklist and questionnaire? Feel free to comment on their documents and make suggestions! 

After lunch, we got more into the technical side of things and talked about metadata! Elizabeth Wilkinson and Jeremy Bartczak presented on how UVA is handling archival metadata for software, including questions of how much information is enough and whether ArchivesSpace could accommodate that amount of description. While heavily influenced by the University of California Guidelines for Born-Digital Archival Description, they also consulted the Software Preservation Network Emulation as a Service Infrastructure Metadata Model. The result? UVA Archival Description Strategies for Emulated Software, which presents two different approaches to describing software, and UVA MARC Field Look-up for Software Description in ArchivesSpace, which has suggestions on where to put the description in ArchivesSpace. To find information about the software, they suggested using Google, WorldCat, and Wikidata (for which Yale has created a guide). 

The second portion of this block was about the description and preservation of video games, presented by Laura Drake Davis and David Gibson of the Library of Congress. The LOC has been collecting video games since they were introduced, the first being Pac-Man. Copyright registration requires a description of the item and some sort of visual documentation or representation of gameplay (a video, source code, etc.). The LOC keeps the original packaging for a game when possible, and also collects strategy guides and periodicals related to video games. They take source code as well; the first and last 25 pages of source code are required to be printed out and sent as documentation. Right now, they are reworking their workflows for processing, cataloging, and describing video games; working on relationships with game developers and distributors and with the LC General Counsel’s Office to assess risks associated with providing access to actual games; and looking into ways to emulate the games themselves. 

The final part of the day was all about access and user experience. First, Lauren Work and Elizabeth Wilkinson talked about how UVA is considering user access to emulated environments. As of now, they plan to offer reading room access only, taking into consideration the staff training and computer station requirements this entails. They are also considering what is important about access via emulated environments, a topic discussed at the Architecture, Design, and Engineering Summit at the Library of Congress in 2017. Currently, they are doing wireframe testing with ArchivesSpace to see how users navigate through it, as well as what types of information researchers need, such as troubleshooting tips, links to related collections, instructions or a note about what to expect within the emulated environment, and how to cite the emulation.

The final talk of the day was by Julia Kim of the Library of Congress. Kim talked about her study on user experience with born-digital materials at NYU from 2014 to 2015, and compared it to Tim Walsh’s similar survey done at the Canadian Centre for Architecture in 2017. Kim found that there is a very fine line between researcher responsibilities and digital archivist responsibilities, that users got frustrated with the slowness of the emulations, and that there is a learning curve. Overall, Kim found that it’s only somewhat worth it to do emulations, but thinks the EaaSI project will help with this, along with a lot of outreach and education on what these materials are and how to use them effectively. 

Overall, I found the workshop to be highly informative, and I feel more confident considering emulation for future projects. I feel the use of shared community notes helped everyone ask for clarification without disrupting the presenters and allowed questions to be typed out and asked at the end. It’s also been helpful to look back on these notes, as slides and links to resources have been added by both presenters and attendees. It’s nice that there is a cohort of people out there working on this and willing to share resources and talk as needed! If you’d like to learn more about the workshop, you can visit their website here; if you’d like to see the community notes and presentations, you can click here, with the Twitter stream here.


Brenna Edwards is currently Project Digital Archivist at the Stuart A. Rose Library at Emory University, Atlanta, GA. Her main responsibility is imaging and processing born digital materials, while also researching the best tools and practices to make them available. 

ml4arc – Machine Learning, Deep Learning, and Natural Language Processing Applications in Archives

by Emily Higgs


On Friday, July 26, 2019, academics and practitioners met at Wilson Library at UNC Chapel Hill for “ml4arc – Machine Learning, Deep Learning, and Natural Language Processing Applications in Archives.” This meeting featured expert panels and participant-driven discussions about how we can use natural language processing – using software to understand text and its meaning – and machine learning – a branch of artificial intelligence that learns to infer patterns from data – in the archives.

The meeting was hosted by the RATOM Project (Review, Appraisal, and Triage of Mail).  The RATOM project is a partnership between the State Archives of North Carolina and the School of Information and Library Science at UNC Chapel Hill. RATOM will extend the email processing capabilities currently present in the TOMES software and BitCurator environment, developing additional modules for identifying and extracting the contents of email-containing formats, NLP tasks, and machine learning approaches. RATOM and the ml4arc meeting are generously supported by the Andrew W. Mellon Foundation.

Presentations at ml4arc were split between successful applications of machine learning and problems that could potentially be addressed by machine learning in the future. In his talk, Mike Shallcross from Indiana University identified archival workflow pain points that provide opportunities for machine learning. In particular, he sees the potential for machine learning to address issues of authenticity and integrity in digital archives, PII and risk mitigation, aggregate description, and how all these processes are (or are not) scalable and sustainable. Many of the presentations addressed these key areas and how natural language processing and machine learning can lend aid to archivists and records managers. Additionally, attendees got to see presentations and demonstrations from tools for email such as RATOM, TOMES, and ePADD. Euan Cochrane also gave a talk about the EaaSI sandbox and discussed potential relationships between software preservation and machine learning.

The meeting agenda had a strong focus on using machine learning in email archives; collecting and processing emails is a large encumbrance in many archives that can stand to benefit greatly from machine learning tools. For example, Joanne Kaczmarek from the University of Illinois presented a project processing capstone email accounts using an e-discovery and predictive coding software called Ringtail. In partnership with the Illinois State Archives, Kaczmarek used Ringtail to identify groups of “archival” and “non-archival” emails from 62 capstone accounts, and to further break down the “archival” category into “restricted” and “public.” After 3-4 weeks of tagging training data with this software, the team was able to reduce the volume of emails by 45% by excluding “non-archival” messages, and identify 1.8 million emails that met the criteria to be made available to the public. Manually, this tagging process could have easily taken over 13 years of staff time.
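Ringtail itself is commercial e-discovery software, but the predictive-coding idea it applies can be sketched in miniature: train a text classifier on a small set of hand-tagged messages, then let it label the rest. The snippet below is a hypothetical illustration only, not the project’s actual pipeline; the toy messages, labels, and the bare-bones Naive Bayes classifier are all invented for the example.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(labeled_messages):
    """Count word frequencies per label (multinomial Naive Bayes)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_messages:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the most probable label for a message."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior + sum of log likelihoods with add-one smoothing
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# A few hand-tagged messages stand in for the weeks of tagging described above.
training = [
    ("quarterly budget report attached for the archives", "archival"),
    ("meeting minutes and policy decision on retention", "archival"),
    ("lunch today? pizza in the break room", "non-archival"),
    ("your package has shipped tracking number enclosed", "non-archival"),
]
wc, lc = train(training)
print(classify("retention policy report for the quarterly meeting", wc, lc))
```

At real scale the feature engineering and review loop matter far more than the classifier, but the shape is the same: a modest tagged sample drives predictions over millions of messages.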

After the ml4arc meeting, I am excited to see the evolution of these projects and how natural language processing and machine learning can help us with our responsibilities as archivists and records managers. From entity extraction to PII identification, there are myriad possibilities for these technologies to help speed up our processes and overcome challenges.


Emily Higgs is the Digital Archivist for the Swarthmore College Peace Collection and Friends Historical Library. Before moving to Swarthmore, she was a North Carolina State University Libraries Fellow. She is also the Assistant Team Leader for the SAA ERS section blog.


Securing Our Digital Legacy: An Introduction to the Digital Preservation Coalition

by Sharon McMeekin, Head of Workforce Development


Nineteen years ago, the digital preservation community gathered in York, UK, for the Cedars Project’s Preservation 2000 conference. It was here that the first seeds were sown for what would become the Digital Preservation Coalition (DPC). Guided by Neil Beagrie, then of King’s College London and Jisc, work to establish the DPC continued over the next 18 months and, in 2002, representatives from 7 organizations signed the articles that formally constituted the DPC.

In the 17 years since its creation, the DPC has gone from strength to strength, the last 10 years under the leadership of current Executive Director, William Kilbride. The past decade has been a particular period of growth, as shown by the rise in the staff complement from 2 to 7. We now have more than 90 members representing an increasingly diverse group of organizations from 12 countries, across sectors including cultural heritage, higher education, government, banking, industry, media, research, and international bodies.

DPC staff, chair, and president

Our mission at the DPC is to:

[…] enable our members to deliver resilient long-term access to digital content and services, helping them to derive enduring value from digital assets and raising awareness of the strategic, cultural and technological challenges they face.

We work to achieve this through a broad portfolio of work across six strategic areas of activity: Community Engagement, Advocacy, Workforce Development, Capacity Building, Good Practice and Standards, and Management and Governance. Everything we do is member-driven and they guide our activities through the DPC Board, Representative Council, and Sub-Committees which oversee each strategic area.

Although the DPC is driven primarily by the needs of our members, we do also aim to contribute to the broader digital preservation community. As such, many of the resources we develop are made publicly available. In the remainder of this blog post, I’ll be taking a quick look at each of the DPC’s areas of activity and pointing out resources you might find useful.

1 | Community Engagement

First up is our work in the area of Community Engagement. Here our aim is to enable “a growing number of agencies and individuals in all sectors and in all countries to participate in a dynamic and mutually supportive digital preservation community”. Collaboration is a key to digital preservation success, and we hope to encourage and support it by helping build an inclusive and active community. An important step in achieving this aim was the publication of our ‘Inclusion and Diversity Policy’ in 2018.

Webinars are key to building community engagement amongst our members. We invite speakers to talk to our members about particular topics and share experiences through case studies. These webinars are recorded and made available for members to watch at a later date. We also run a monthly ‘Members Lounge’ to allow informal sharing of current work and discussion of issues as they arise and, on the public end of the website, a popular blog, covering case studies, new innovations, thought pieces, recaps of events and more.

2 | Advocacy

Our advocacy work campaigns “for a political and institutional climate more responsive and better informed about the digital preservation challenge”, as well as “raising awareness about the new opportunities that resilient digital assets create”. This tends to happen on several levels, from enabling and aiding members’ advocacy efforts within their own organizations, through raising legislators’ and policy makers’ awareness of digital preservation, to educating the wider populace.

To help those advocating for digital preservation within their own context, we have recently published our Executive Guide. The Guide provides a grab bag of statements and facts to help make the case for digital preservation, including key messages, motivators, opportunities to be gained and risks faced. We welcome any suggestions for additions or changes to this resource!

Our longest running advocacy activity is the biennial Digital Preservation Awards, last held in 2018. The Awards aim to celebrate excellence and innovation in digital preservation across a range of categories. This high-profile event has been joined in recent years by two other activities with a broad remit and engagement. The first is the Bit List of Digitally Endangered Species, which highlights at-risk digital information, showing both where preservation work is needed and where efforts have been successful. Finally, there is World Digital Preservation Day (WDPD), a day to showcase digital preservation around the globe. Response to WDPD since its inauguration in 2017 has been exceptionally positive. There have been tweets, blogs, events, webinars, and even a song and dance! This year WDPD is scheduled for 7th November, and we encourage everyone to get involved.

The nominees, winners, and judges for the 2018 Digital Preservation Awards

3 | Workforce Development

Workforce Development activities at the DPC focus on “providing opportunities for our members to acquire, develop and retain competent and responsive workforces that are ready to address the challenges of digital preservation”. There are many threads to this work, but key for our members are the scholarships we provide through our Career Development Fund and free access to the training courses we run.

At the moment we offer three training courses: ‘Getting Started with Digital Preservation’, ‘Making Progress with Digital Preservation’ and ‘Advocacy for Digital Preservation’, but we have plans to expand the portfolio in the coming year. All of our training courses are available to non-members for a modest fee, but at the moment are mostly held face to face in the UK and Ireland. A move to online training provision is, however, planned for 2020. We are also happy to share training resources and have set up a Slack workspace to enable this and greater collaboration with regards to digital preservation training.

Other helpful resources under our Workforce Development heading include the ‘Digital Preservation Handbook’, a free online publication covering digital preservation in the broadest sense. The Handbook aims to be a comprehensive guide for those starting with digital preservation, whilst also offering links to additional resources. The content for the Handbook was crowd-sourced from experts and has all been peer reviewed. Another useful and slightly less well-known series of publications is our ‘Topical Notes’, originally funded by the National Archives of Ireland and intended to introduce key digital preservation issues to a non-specialist audience (particularly record creators). Each note is only two pages long and jargon-free, making it a great resource to help raise awareness.

4 | Capacity Building

Perhaps the biggest area of DPC work covers Capacity Building, that is “supporting and assuring our members in the delivery and maintenance of high quality and sustainable digital preservation services through knowledge exchange, technology watch, research and development.” This can take the form of direct member support, helping with tasks such as policy development and procurement, as well as participation in research projects.

Our more advanced publication series, the Technology Watch Reports, also sits under the Capacity Building heading. Written by experts and peer reviewed, each report takes a deeper dive into a particular digital preservation issue. Our latest report, on Email Preservation, is currently available for member preview but will be publicly released shortly. Some other ‘classics’ include Preserving Social Media, Personal Digital Archiving, and the always popular The Open Archival Information System (OAIS) Reference Model: Introductory Guide (2nd Edition). (I always tell those new to OAIS to start here rather than with the 200+ dry pages of the full standard!)

We also run around six thematic Briefing Day events a year on topical issues. As with the training, these are largely held in the UK and Ireland, but they are now also live-streamed for members. We support a number of Thematic Task Forces and Working Groups, with the ‘Web Archiving and Preservation Working Group’ being particularly active at the moment.

DPC members engaged in a brainstorming session

5 | Good Practice and Standards

Our Good Practice and Standards stream of work was a new addition as of the publication of our latest Strategic Plan (2018-22). Here we are contributing work towards “identifying and developing good practice and standards that make digital preservation achievable, supporting efforts to ensure services are tightly matched to shifting requirements.”

We hope this work will allow us to input into standards with the needs of our members in mind and facilitate the sharing of good practice that already happens across the coalition. This has already borne fruit in the shape of the forthcoming DPC Rapid Assessment Model, a maturity model to help with benchmarking digital preservation progress within your organization. You can read a bit more about it in this blog post by Jen Mitcham and the model will be released publicly in late September.

We also work with vendors through our Supporter Program and events like our ‘Digital Futures’ series to help bridge the gap between practice and solutions.

6 | Management and Governance

Our final stream of work is less focused on digital preservation and more on “ensuring the DPC is a sustainable, competent organization focussed on member needs, providing a robust and trusted platform for collaboration within and beyond the Coalition.” This obviously relates both to the viability of the organization and to good governance. It is essential that everything we do is transparent and that the members can both direct what we do and ensure accountability.

The Future

Before I depart, I thought I would share a little bit about some of our plans for the future. In the next few years we’ll be taking steps to further internationalize as an organization. At the moment our membership is roughly 75% UK and Ireland and 25% international, but those numbers are gradually moving closer and we hope that continues. With that in mind we will be investigating new ways to deliver services and resources online, as well as in languages beyond English. We’re starting this year with the publication of our prospectus in German, French and Spanish.

We’re also beginning to look forward to our 20th anniversary in 2022. It’s a Digital Preservation Awards Year, so that’s reason enough for a celebration, but we will also be welcoming the digital preservation community to Glasgow, Scotland, as hosts of iPRES 2022. Plans are already afoot for the conference, and we’re excited to make it a showcase for both the community and one of our home cities. Hopefully we’ll see you there, but I encourage you to make use of our resources and to get in touch soon!

Access our Knowledge Base: https://www.dpconline.org/knowledge-base

Follow us on Twitter: https://twitter.com/dpc_chat

Find out how to join us: https://www.dpconline.org/about/join-us


Sharon McMeekin is Head of Workforce Development with the Digital Preservation Coalition and leads on work including training workshops and their scholarship program. She is also Managing Editor of the ‘Digital Preservation Handbook’. With Masters degrees in Information Technology and Information Management and Preservation, both from the University of Glasgow, Sharon is an archivist by training, specializing in digital preservation. She is also an ILM qualified trainer. Before joining the DPC she spent five years as Digital Archivist with RCAHMS. As an invited speaker, Sharon presents on digital preservation at a wide variety of training events, conferences and university courses.

Student Impressions of Tech Skills for the Field

by Sarah Nguyen


Back in March, during bloggERS’ Making Tech Skills a Strategic Priority series, we distributed an open survey to MLIS, MLS, MI, and MSIS students to understand what they know and have experienced in relation to technology skills as they enter the field.

To be frank, this survey stemmed from personal interests, since I just completed an MLIS core course on Research, Assessment, and Design (re: survey to collect data on the current landscape). I am also interested in what skills I need to build/what class I should sign up for next quarter (re: what tech skills do I need to become hire-able?). While I feel comfortable with a variety of tech-related tools and tasks, I’ve been intimidated by more “high-level” computational languages for some years. This survey was helpful for exploring what skills other LIS pre-professionals are interested in and which skills will help us make these costly degrees worth the time and financial investment traditionally required to enter a stable archive or library position.

Method

The survey was open for one month on Google Forms, and distributed to SAA communities, @SAA_ERS Twitter, the Digital Curation Google Group, and a few MLIS university program listservs. There were 15 questions and we received responses from 51 participants. 

Results & Analysis

Here’s a superficial scan of the results. If you would like to come up with your own analyses, feel free to view the raw data on GitHub.

Figure 1. Technology-related skills that students want to learn

The most popular technology-related skill that students are interested in learning is data management (manipulating, querying, transforming data, etc.). This is a pretty broad topic, as it involves many tools and protocols, which can range from GUI applications to scripts. A separate survey that breaks down specific data management tools might be in order, especially since these types of skills can be divided among specialty courses and workshops, which then translate into specific job positions. A more specific survey could help demonstrate which skills need to be taught in a full semester-long course and which can be covered in a day-long or multi-day workshop.
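To make the “manipulating, querying, transforming” umbrella a little more concrete, here is a minimal sketch of the scripting end of that spectrum, using Python’s built-in sqlite3 module. The accessions table and its field names are invented for the example; the point is simply what a load-query-transform task looks like in practice.

```python
import sqlite3

# In-memory database with an invented accessions table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accessions (
        id INTEGER PRIMARY KEY,
        title TEXT,
        extent_gb REAL,
        year INTEGER
    )
""")
rows = [
    (1, "Sheeran papers", 120.5, 2018),
    (2, "Campus web archive", 64.0, 2019),
    (3, "Oral histories", 12.25, 2019),
]
conn.executemany("INSERT INTO accessions VALUES (?, ?, ?, ?)", rows)

# Query + transform: total extent per year, largest first.
for year, total in conn.execute(
    "SELECT year, SUM(extent_gb) FROM accessions "
    "GROUP BY year ORDER BY SUM(extent_gb) DESC"
):
    print(f"{year}: {total} GB")
```

The same task could be done in a spreadsheet GUI; whether a curriculum teaches the GUI, the SQL, or both is exactly the kind of question a follow-up survey could probe.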

It was interesting to see that even in this day and age, when social media management can be second nature to many students’ daily lives, there was still a notable interest in understanding how to make this a part of their career. This makes me wonder what value students see in knowing how to strategically manage an archives’ social media account. How could this help with the job market, as well as an archival organization’s main mission?

Looking deeper into the popular data management category, it would be interesting to know the current landscape of knowledge and pedagogy around communicating with IT (e.g., project management and translating users’ needs). In many cases, archivists work separately from, but dependently on, IT system administrators, and it can be frustrating since each department may have distinct concerns about a server or other networks. At June’s NYC Preservathon/Preservashare 2019, there was mention that IT exists to make sure servers and networks are spinning at all hours of the day. Unlike archivists, they are not concerned about the longevity of the content, the obsolescence of file formats, or the software needed to render files. Could it be useful to have a course on how to effectively communicate and take ownership of issues that fall along the fuzzy lines between archives, data management, and IT? Or, as one survey respondent said, “I think more basic programming courses focusing on tech languages commonly used in archives/libraries would be very helpful.” Personally, I’ve only learned this from experience working in different tech-related jobs. It is not a subject I see in my MLIS course catalog, nor a discussion at conference workshops. 

The popularity of data management skills sparked another question: what about knowledge around computer networks and servers? Even though LTO will forever be in our hearts, cloud storage is also a backup medium we’re budgeting for and relying on. Same goes for hosting a database for remote access and/or publishing digital files. A friend mentioned this networking workshop for non-tech savvy learners—Grassroots Networking: Network Administration for Small Organizations/Home Organizations—which could be helpful for multiple skill types including data management, digital forensics, web archiving, web development, etc. This is similar to a course that could be found in computer science or MLIS-adjacent information management departments.

Figure 2. Have you taken/will you take technology-focused courses in your program?
Figure 3. Do you feel comfortable defining the difference between scripting and programming?

I can’t say this is statistically significant, but the contrast between the 15.7% of respondents who have not taken and will not take a technology-focused course in their program and the 78.4% who are not aware of the difference between scripting and programming is eyebrow-raising. According to an article in PLOS Computational Biology, the term “script” means “something that is executed directly as is,” while a “program” is “something that is explicitly compiled before being used. The distinction is more one of degree than kind—libraries written in Python are actually compiled to bytecode as they are loaded, for example—so one other way to think of it is ‘things that are edited directly’ and ‘things that are not edited directly’” (Wilson et al. 2017). This distinction is important, since more archives are acquiring, processing, and sharing collections that rely on archivists who can execute jobs such as web scraping or metadata management (scripts) or who can build and maintain a database (programming). These might be interpreted as trick questions, but the particular semantics, and what is considered technology-focused, are things modern library, archives, and information programs might want to consider. 
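As a concrete illustration of a “script” in the article’s sense (something edited and executed directly, like the metadata-management jobs mentioned above), here is a small sketch that walks a folder and writes an MD5 checksum manifest. It is a hypothetical example, not drawn from any particular program’s workflow, and the accession folder name is a placeholder.

```python
"""A small 'script' in the article's sense: edited and run directly.

It walks a directory and writes an MD5 manifest, the sort of one-off
metadata-management job an archivist might automate.
"""
import hashlib
from pathlib import Path

def md5sum(path, chunk_size=65536):
    """Hash a file in chunks so large files don't exhaust memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(root, manifest_name="manifest-md5.txt"):
    """Write 'checksum  relative/path' lines for every file under root."""
    root = Path(root)
    lines = [f"{md5sum(p)}  {p.relative_to(root)}"
             for p in sorted(root.rglob("*")) if p.is_file()]
    (root / manifest_name).write_text("\n".join(lines) + "\n")
    return lines

# Demo against a throwaway folder (hypothetical accession name).
demo = Path("accession_2019_042")
demo.mkdir(exist_ok=True)
(demo / "readme.txt").write_text("sample file\n")
for line in write_manifest(demo):
    print(line)
```

Building and maintaining the database that such manifests feed into would sit on the “programming” side of the article’s distinction.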

Figure 4. How do you approach new technology?

Figure 4 illustrates the various ways students tackle new technologies. Reading the f* manual (RTFM) and searching forums are the most common approaches to navigating technology. Here are quotes from a couple of students on how they tend to learn a new piece of software:

  • “break whatever I’m trying to do with a new technology into steps and look for tutorials & examples related to each of those steps (i.e. Is this step even possible with X, how to do it, how else to use it, alternatives for accomplishing that step that don’t involve X)”
  • “I tend to google “how to….” for specific tasks and learn new technology on a task-by-task basis.”

In the end, there was overwhelming interest in “more project-based courses that allow skills from other tech classes to be applied.” Unsurprisingly, many of us are looking for full-time, stable jobs after graduating, and the “more practical stuff, like CONTENTdm for archives” seems to be a pressure felt in order to land an entry-level position. And not just at the entry level: as continuing-education learners, there is also a push to strive for more. Several respondents are looking for a challenge to level up their tech skills:

  • “I want more classes with hands-on experience with technical skills. A lot of my classes have been theory based or else they present technology to us in a way that is not easy to process (i.e. a lecture without much hands-on work).”
  • “Higher-level programming, etc. — everything on offer at my school is entry level. Also digital forensics — using tools such as BitCurator.”
  • “Advanced courses for the introductory courses. XML 2 and python 2 to continue to develop the skills.”
  • “A skills building survey of various code/scripting, that offers structured learning (my professor doesn’t give a ton of feedback and most learning is independent, and the main focus is an independent project one comes up with), but that isn’t online. It’s really hard to learn something without face to face interaction, I don’t know why.”

It’ll be interesting to see what skills recent MLIS, MLS, MIS, and MSIM graduates will enter the field with. While many job postings list certain software and skills as requirements, will programs follow suit? I have a feeling this might be a significant question to ask in the larger context of what the purpose of this Master’s degree is and how the curriculum can keep up with the dynamic technology needs of the field.

Disclaimers: 

  1. Potential bias: Those taking the survey might be interested in learning higher-level tech skills because they do not already have them, while students who are already tech-savvy might avoid a basic survey such as this one. This may skew the survey population toward mostly novice tech students.
  2. More data on specific computational languages and technology courses taken are available in the GitHub csv file. As mentioned earlier, I just finished my first year as a part-time MLIS student, so I’m still learning the distinct jobs and nature of the LIS field. Feel free to submit an issue to the GitHub repo, or tweet me @snewyuen if you’d like to talk more about what this data could mean.

Bibliography

Wilson G, Bryan J, Cranston K, Kitzes J, Nederbragt L, Teal TK (2017) Good enough practices in scientific computing. PLoS Computational Biology 13(6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510


Sarah Nguyen with a Uovo storage truck

Sarah Nguyen is an advocate for open, accessible, and secure technologies. While studying as an MLIS candidate at the University of Washington iSchool, she is pursuing these interests through a few gigs: Project Coordinator for Preserve This Podcast at METRO, Assistant Research Scientist for Investigating & Archiving the Scholarly Git Experience at NYU Libraries, and archivist for the Dance Heritage Coalition/Mark Morris Dance Group. Offline, she can be found riding a Cannondale mtb or practicing movement through dance. (Views do not represent Uovo. And I don’t even work with them. Just liked the truck.)

The Theory and Craft of Digital Preservation: An interview with Trevor Owens

BloggERS! editor Dorothy Waugh recently interviewed Trevor Owens, Head of Digital Content Management at the Library of Congress, about his recent—and award-winning—book, The Theory and Craft of Digital Preservation.


Who is this book for and how do you imagine it being used?

I attempted to write a book that would be engaging and accessible to anyone who cares about long-term access to digital content and wants to devote time and energy to helping ensure that important digital content is not lost to the ages. In that context, I imagine the primary audience as current and emerging professionals who work to ensure enduring access to cultural heritage: archivists, librarians, curators, conservators, folklorists, oral historians, etc. With that noted, I think the book can also be of use to broader conversations in information science, computer science and engineering, and the digital humanities. 

Tell us about the title of the book and, in particular, your decision to use the word “craft” to describe digital preservation.

The words “theory” and “craft” in the title of the book forecast both the structure and the two central arguments that I advance in the book. 

The first chapters focus on theory. This includes tracing the historical lineages of preservation in libraries, archives, museums, folklore, and historic preservation. I then move to explore work in new media studies and platform studies to round out a nuanced understanding of the nature of digital media. I start there because I think it’s essential that cultural heritage practitioners moor their own frameworks and approaches to digital preservation in a nuanced understanding of the varied and historically contingent nature of preservation as a concept and the complexities of digital media and digital information. 

The latter half of the book is focused on what I describe as the “craft” of digital preservation. My use of the term craft is designed to intentionally challenge the notion that work in digital preservation should be understood as “a science.” Given the complexities of both what counts as preservation in a given context and the varied nature of digital media, I believe it is essential that we explicitly distance ourselves from many of the assumptions and baggage that come along with the ideology of “digital.” 

We can’t build some super system that just solves digital preservation. Digital preservation requires making judgement calls. Digital preservation requires the applied thinking and work of professionals. Digital preservation is not simply a technical question; instead, it involves understanding the nature of the content that matters most to an intended community and making judgement calls about how best to mitigate risks of potential loss of access to that content. As a result of my focus on craft, I offer less of a “this is exactly what one should do” approach, and more of an invitation to join the community of practice that is developing knowledge and honing and refining its craft. 

Reading the book, I was so happy to see you make connections between the work that we do as archivists and digital preservation. Can you speak to that relationship and why you think it is important?

Archivists are key players in making preservation happen and the emergence of digital content across all kinds of materials and media that archivists work with means that digital preservation is now a core part of the work that archivists do. 

I organize a lot of my discussion about the craft of digital preservation around archival concepts as opposed to library science or curatorial practices. For example, I talk about arrangement and description. I also draw from ideas like MPLP (More Product, Less Process) as key concepts for work in digital preservation and from work on community archives. 

Old Files. From XKCD: webcomic of romance, sarcasm, math, and language. 2014

Broadly speaking, in the development of digital media, I see a growing context collapse between formats that had been distinct in the past. That is, conservation of oil paintings, management and preservation of bound volumes, and organizing and managing heterogeneous sets of records have some strong similarities, but there are also a lot of differences. The born-digital incarnations of those works—digital art, digital publishing, and digital records—are all made up of digital information and file formats, and face a related set of digital preservation issues.

With that noted, I think archival practice tends to be particularly well-suited for dealing with the nature of digital content. Archives have long dealt with the problem of scale that is now intensified by digital data. At the same time, archivists have also long dealt with hybrid collections and complex jumbles of formats, forms, and organizational structures, which is increasingly the case for all types of forms that transition into born-digital content. 

You emphasize that the technical component of digital preservation is sometimes prioritized over social, ethical, and organizational components. What are the risks implicit in overlooking these other important components?

Digital preservation is not primarily a technical problem. The ideology of “digital” is that things should be faster, cheaper, and automatic. The ideology of “digital” suggests that we should need less labor, less expertise, and fewer resources to make digital stuff happen. If we let this line of thinking infect our idea of digital preservation, we are going to see major losses of important data, we will see major failures to respect ethical and privacy issues relating to digital content, and lots of money will be spent on work that fails to get us the results that we want.

In contrast, when we take as a starting point that digital preservation is about investing resources in building strong organizations and teams who participate in the community of practice and work on the complex interactions that emerge between competing library and archives values then we have a chance of both being effective but also building great and meaningful jobs for professionals.

If digital preservation work is happening in organizations that have an overly technical view of the problem, it is happening despite, not because of, their organization’s approach. That is, there are people doing the work; they just likely aren’t getting credit and recognition for doing that work. Digital preservation happens because of people who understand that the fundamental nature of the work requires continual efforts to get enough resources to meaningfully mitigate risks of loss, and thoughtful decision making about building and curating collections of value to communities.

Considerations related to access and discovery form a central part of the book and you encourage readers to “Start simple and prioritize access,” an approach that reminded me of many similar initiatives focused on getting institutions started with the management and preservation of born-digital archives. Can you speak to this approach and tell us how you see the relationship between preservation and access?

A while back, OCLC ran an initiative called “walk before you run,” focused on working with digital archives and digital content. I know it was a major turning point for helping the field build our practices. Our entire community is learning how to do this work and we do it together. We need to try things and see which things work best and which don’t. 

It’s really important to prioritize access in this work. Preservation is fundamentally about access in the future. The best way you know that something will be accessible in the future is if you’re making it accessible now. Then your users will help you. They can tell you if something isn’t working. The more that we can work end-to-end, that is, that we accession, process, arrange, describe, and make available digital content to our users, the more that we are able to focus on how we can continually improve that process end-to-end. Without having a full end-to-end process in place, it’s impossible to zoom out and look at that whole sequence of processes to start figuring out where the bottlenecks are and where you need to focus on working to optimize things. 


Dr. Trevor Owens is a librarian, researcher, policy maker, and educator working on digital infrastructure for libraries. Owens serves as the first Head of Digital Content Management for Library Services at the Library of Congress. He previously worked as a senior program administrator at the United States Institute of Museum and Library Services (IMLS) and, prior to that, as a Digital Archivist for the National Digital Information Infrastructure and Preservation Program and as a history of science curator at the Library of Congress. Owens is the author of three books, including The Theory and Craft of Digital Preservation and Designing Online Communities: How Designers, Developers, Community Managers, and Software Structure Discourse and Knowledge Production on the Web. His research and writing has been featured in: Curator: The Museum Journal, Digital Humanities Quarterly, The Journal of Digital Humanities, D-Lib, Simulation & Gaming, Science Communication, New Directions in Folklore, and American Libraries. In 2014 the Society for American Archivists granted him the Archival Innovator Award, presented annually to recognize the archivist, repository, or organization that best exemplifies the “ability to think outside the professional norm.”

Collections as Data

by Elizabeth Russey Roke


Archives put a great deal of effort into preserving the original object.  We document the context around its creation, perform conservation work on the object if necessary, and implement reading room procedures designed to limit damage or loss.  As a result, researchers can read and hold an original letter written by Alice Walker, view a series of tintypes taken before the Civil War, read marginalia written by Ted Hughes in a book from his personal library, or listen to an audio recording of one of Martin Luther King, Jr.’s speeches.  In other words, we enable researchers to encounter materials as they were originally designed to be used as much as is possible.

The nature of digital humanities research challenges these traditional modes of archival access: are these the only ways to interact with archival material?  How do we serve users who want to leverage computational techniques such as text mining, machine learning, network analysis, or computer vision in their research or teaching?  Are machines and algorithms “users”? Archivists also encounter these questions as the content of archives shifts from analog to born digital material. Digital files were created and designed to be processed by algorithms, not just encountered through experiences such as watching, viewing, or reading.  What could access for these types of materials look like if we gave access to their full functionality and not just their appearance? 

I have spent the past two years working on an IMLS grant focused on addressing these types of questions.  Collections As Data: Always Already Computational examined how digital collections are and could be used beyond analog research methodologies.  Collections as data is ordered information, stored digitally, that is inherently amenable to computation.  This could include metadata, digital text, or other digital surrogates.  Whereas a digital repository might enable researchers to read a newspaper on a computer screen, an approach grounded in collections as data would give researchers access to the OCR file the repository generated to enable keyword search. In other words, digital repositories should provide access beyond the viewers, page turners, and streaming servers of most current digital repositories that replicate analog experiences.  At its core, collections as data simply asks cultural heritage organizations to make the full digital object available rather than making assumptions about how users will want to interact with it.  

Collections as data implementations are not necessarily complex, nor do they involve complicated repository development.  Some of the simplest examples can be found on GitHub, where archives such as Vanderbilt and New York University publish their EAD files.  The Rockefeller Archive Center and the Museum of Modern Art go a step further and publish all of their collection data, along with a Creative Commons license.  Emory, my home institution, makes finding aid data available in both EAD and RDF from our finding aids database, which has led to a digital humanities project that harvested correspondence indexes from our Irish poetry collections to build network graphs of the Belfast Group.  More complex implementations often provide access to data through APIs instead of a bulk download.  An example of this can be found at the Carnegie Hall Archives, which allows researchers to query their data through a SPARQL endpoint.
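To make the idea concrete, here is a minimal sketch of what treating a finding aid as data can look like: parsing a published EAD file to pull out fields for computational use rather than on-screen reading. The EAD snippet below is a simplified, hypothetical example, not a real published finding aid.

```python
# A minimal "collections as data" sketch: parse an EAD 2002 XML finding
# aid (simplified, hypothetical snippet) and extract fields for reuse.
import xml.etree.ElementTree as ET

EAD_NS = {"ead": "urn:isbn:1-931666-22-9"}  # the EAD 2002 namespace

sample_ead = """<ead xmlns="urn:isbn:1-931666-22-9">
  <archdesc level="collection">
    <did>
      <unittitle>Belfast Group Papers</unittitle>
      <unitdate>1963-1972</unitdate>
    </did>
  </archdesc>
</ead>"""

root = ET.fromstring(sample_ead)
title = root.find(".//ead:unittitle", EAD_NS).text
date = root.find(".//ead:unitdate", EAD_NS).text
print(title, date)
```

The same loop run over a whole GitHub repository of EAD files is what turns a reading interface into a dataset, which is the shift the paragraph above describes.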

Chord diagram created as part of Emory’s Belfast Group Poetry project, showing the networks of people associated with the Belfast Group and their relationships with each other.

The Collections As Data: Always Already Computational final report includes more information and ideas for getting started with collections as data. It includes resources such as a set of user personas, methods profiles of common techniques used by data-driven researchers, and real-world case studies of institutions with collections as data including their motivations, technical details, and how they made the case to their administrators.  I highly recommend “50 Things,” which is a list of activities associated with collections as data work ranging from the simple to the complex. 

There are a few takeaways from this project I’d like to highlight for archivists in particular:

Collections as data approaches are archival.  Data-driven research demands authenticity and context of the data source, established and preserved through archival principles of documentation, transparency, and provenance.  This type of information was one of the most universal requests from digital humanities researchers. It was clear that they were not only interested in the object, but in how it came to be.  They wanted to understand their data as an archival object with information about its creation, provenance, and preservation. Archivists need to advocate for digital collections to be treated not just as digital surrogates, or what I like to think of as expensive photocopying, but as unique resources unto themselves deserving description, preservation, and access that may not necessarily match that of the original object.

Collections as data enhances access to archival material.  What if we could partially open restricted material to researchers?  Emory holds the papers of Salman Rushdie and his email files are largely restricted per the deed of gift.  Computational techniques being developed in ePADD could generate maps of Rushdie’s correspondents and reveal patterns in the timing and frequency of his correspondence, just through email header information and without exposing sensitive data (i.e. the content of the email) that Rushdie wanted to restrict.  Could this methodology be extrapolated to other types of restricted electronic files?  
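As a rough illustration of that idea (not ePADD’s actual implementation), header-only analysis can be sketched in a few lines of Python: the correspondent tallies come entirely from message headers, and the restricted bodies are never read.

```python
# Illustrative header-only email analysis: count correspondents from
# headers alone, without touching the restricted message bodies.
from collections import Counter
from email import message_from_string

# Toy raw messages standing in for a restricted email archive.
raw_messages = [
    "From: alice@example.org\nDate: Mon, 1 Jul 2019 10:00:00 +0000\nSubject: draft\n\nRESTRICTED BODY",
    "From: bob@example.org\nDate: Tue, 2 Jul 2019 11:00:00 +0000\nSubject: notes\n\nRESTRICTED BODY",
    "From: alice@example.org\nDate: Wed, 3 Jul 2019 12:00:00 +0000\nSubject: revision\n\nRESTRICTED BODY",
]

correspondents = Counter()
for raw in raw_messages:
    msg = message_from_string(raw)
    correspondents[msg["From"]] += 1  # headers only; the body is ignored

print(correspondents.most_common())
# -> [('alice@example.org', 2), ('bob@example.org', 1)]
```

A researcher could see who wrote most often and when, while the deed-of-gift restrictions on content remain intact.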

Just start.  For digital files, trying something is always the first and best approach.  There is no one way or best way to do collections as data work. Consider your community and ask them what they need.  Unlike baseball fields, if you build it, they probably won’t come unless you ask first. Collections as data material already exists in your collection, especially if you use ArchivesSpace.  Publish it. Think broadly about what might constitute collections as data and how you might make use of it yourself; collections as data benefits us too. Follow the Computational Archival Science project at the University of Maryland, which is exploring how we think about archival collections as data.  

If you want to take a deep dive into collections as data (and get funding to do so!) consider applying to be part of the second cohort of the Part to Whole Mellon grant, which aims to foster the development of broadly viable models that support implementation and use of collections as data.  The next call for proposals opens August 1:  https://collectionsasdata.github.io/part2whole/ .  On August 5, the project team will offer a webinar with more information about the grant and opportunities to ask questions:  https://collectionsasdata.github.io/part2whole/cfp2webinar/.


Elizabeth Russey Roke is a digital archivist and metadata specialist at the Stuart A. Rose Library of Emory University, Atlanta, Georgia. Primarily focused on preservation, discovery, and access to digitized and born digital assets from special collections, Elizabeth works on a variety of technology projects and initiatives related to digital repositories, metadata standards, and archival descriptive practice. She was a co-investigator on a 2016-2018 IMLS grant investigating collections as data.

A Conversation with Annalise Berdini, Digital Archivist at the Seeley G. Mudd Manuscript Library, Princeton University

Interview conducted with Annalise Berdini in May 2019 by Hannah Silverman and Tamar Zeffren

This is the eighth post in a new series of conversations between emerging professionals and archivists actively working with digital materials.


Annalise Berdini is the Digital Archivist at the Seeley G. Mudd Manuscript Library at Princeton University, a position she has held since January 2018. She is responsible for the ongoing management of the University Archives Digital Curation Program, as well as managing a collection of web archives and assisting with reference services.

Annalise’s first post-graduate school position was as a manuscripts and archives processor at the University of California, San Diego (UCSD). While she was working at UCSD, universities and archives were slowly starting to see the need for a dedicated digital archivist position. When the Special Collections department at UCSD created their first digital archivist position, Annalise applied and got the job. She explains that a good deal of her work there, and at Princeton, is graciously supported by a community of digital archivists solving similar challenges in other institutions.

As Annalise has now held a digital archivist role at two different institutions, both universities, we were interested to hear her perspectives on how colleagues and researchers have understood – or misunderstood – her role. “Because I have digital in my job title,” she noted, “people interpret that in a lot of very wide and broad ways. Really digital archives is still an emerging field…there are so many questions to answer, and it’s fun to investigate that aspect of the field.”

Given prevailing concerns among institutional archives about preserving and processing legacy media, we were keenly interested in hearing Annalise’s insights about securing stakeholder buy-in to develop a digital archives program.

“It’s a struggle everywhere,” she acknowledges. Presently, Princeton’s efforts to build up a more robust digital preservation program have led the University to a partnership with a UK-based company called Arkivum, which offers digital preservation, storage, maintenance, auditing and reporting modules and has the capacity to incorporate services from Archivematica and create a customized digital storage solution for Princeton.

“We’ve been lucky here [at Mudd]. We’re getting this great system. There is buy-in and there seems to be a pretty strong push right now. For us, the most compelling argument we’ve had is that we are mandated to collect student materials and student records that will not exist anywhere else unless we take them. The school has to keep those records, there’s not an option. Emphasizing how easily that content could be lost without a proper digital preservation system in place was very compelling to people who weren’t necessarily aware of the fact that hard drives sitting on a shelf are really not acceptable storage choices and options.”

Annalise has also found that deploying some compelling statistics can aid in building awareness around digital archives needs. In discussions about how rapidly materials can degrade, Annalise likes to cite a 2013 Journal of Western Archives article, “Capturing and Processing Born-Digital Files in the STOP AIDS Project Records,” which showcases findings that out of a vast collection of optical storage media, “only 10% of these hundreds of DVDs were really able to be recovered, whereas, strangely, a lot of the floppy disks were better and easier to recover…I think emphasizing how fragile digital content is [can help people understand] how easily it will corrupt without you even knowing it.”

Equally as important to generating momentum for such programs are the direct relationships Annalise cultivates with colleagues, within and without the archives. “My boss was really instrumental in the process, and the head of library IT helped me navigate getting approvals from the University as a whole and the University IT department.”

The complex process of sustaining and innovating a digital archives infrastructure provides ongoing opportunities for Annalise to “solve puzzles” and to unite colleagues in confronting the challenges of documenting and preserving born-digital heritage: “I have focused on trying to find one person who is maybe a level above me and to connect with them and then hopefully build up a network within my institution to build some groundswell.”


Hannah Silverman

Tamar Zeffren

Hannah Silverman and Tamar Zeffren both work at JDC Archives. Tamar is the Archival Collections Manager. Hannah is the Digitization Project Specialist and also works independently as a photo archivist. Both received SAA’s DAS certification.

An Exploration of BitCurator NLP: Incorporating New Tools for Born-Digital Collections

by Morgan Goodman

Natural Language Processing (NLP) has been a buzz-worthy topic for professionals working with born-digital material over the last few years. The BitCurator Project recently released a new set of natural language processing tools, and I had the opportunity to test out the topic modeler, bitcurator-nlp-gentm, with a group of archivists in the Raleigh-Durham area. I was interested in exploring how NLP might help archivists perform their everyday duties more effectively and efficiently. While my goal was to explore possible applications of topic modeling in archival appraisal specifically, the discussions surrounding other possible uses were enlightening.  The resulting research informed my 2019 Master’s paper for the University of North Carolina at Chapel Hill.

Topic modeling extracts text from files and organizes the tokenized words into topics. Imagine a set of words such as: mask, october, horror, michael, myers. Based on this grouping of words, you might be able to determine that somewhere across the corpus there is a file about one of the Halloween franchise horror films. When I met with the archivists, I had them run the program on disk images from their own collections, and we discussed the visualization output and whether or not they were able to easily analyze and determine the nature of the topics presented.

BitCurator utilizes open source tools in its applications and chose the pyLDAvis visualization for the final output of its topic modeling tool (more information about the algorithm and how it works can be found in Sievert and Shirley’s paper; you can also play around with the output through this Jupyter notebook).  The left side of the visualization displays topic circles in relative sizes, plotted on a two-dimensional plane. Each topic is labeled with a number in decreasing order of prevalence (circle #1 is the main topic in the overall corpus, and is also the largest circle). The space between topics is determined by how related the topics are, i.e. topics that are less related are plotted further away from each other. The right side contains a list of 30 words, with a blue bar indicating each term’s frequency across the corpus. Clicking on a topic circle alters the view of the terms list by adding a red bar for each term, showing the term’s frequency in that particular topic in relation to the overall corpus.


The user can then manipulate a metric slider meant to help decipher what the topic is about. Essentially, when the slider is all the way to the right at “1”, the most prevalent terms for the entire corpus are listed. When a topic is selected and the slider is at 1, it shows all the prevalent terms for the corpus in relation to that particular topic (in our Halloween example, you might see more general words like: movie, plot, character). Alternatively, the closer the slider moves to “0”, the fewer corpus-wide terms appear and the more topic-specific terms are displayed (i.e.: knife, haddonfield, strode).
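The slider corresponds to the relevance metric λ described in Sievert and Shirley’s paper. A back-of-the-envelope version, using toy probabilities rather than pyLDAvis’s internals, might look like this:

```python
# Toy version of Sievert & Shirley's relevance metric, which the
# pyLDAvis slider controls (illustrative only, not pyLDAvis code):
#   relevance(w, t | lam) = lam * log p(w|t) + (1 - lam) * log(p(w|t) / p(w))
import math

def relevance(p_word_given_topic, p_word_in_corpus, lam):
    lift = p_word_given_topic / p_word_in_corpus
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(lift)

# "knife" is rare corpus-wide but common in the topic; "movie" is common
# everywhere. Near lam = 0 the topic-specific word ranks higher; at
# lam = 1 ranking falls back to raw within-topic frequency.
knife = relevance(0.02, 0.001, 0.0)  # pure lift favors "knife"
movie = relevance(0.05, 0.04, 0.0)
print(knife > movie)  # True with these toy probabilities
```

This is why dragging the slider toward 0 surfaces words like “haddonfield” and “strode” that are distinctive to one topic even though they are rare in the corpus overall.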

While the NLP does the hard work of scanning and extracting text from the files, some analysis is still required of the user. The tool’s output offers archivists a bird’s-eye view of the collection, which can be helpful when little to nothing is known about its contents. However, many of the archivists I spoke to felt this tool is most effective when you already know a bit about the collection you are looking at. In that sense, it may be beneficial to allow researchers to use topic modeling in the reading room to explore a large collection. Researchers and others with subject matter expertise may get the most benefit from this tool – do you have to know about the Halloween movie franchise to know that Michael Myers is a fictional horror film character? Probably. Now imagine more complex topics that the archivist may not have working knowledge of. The archivist can point the researcher to the right collection and let them do the analysis. This tool may also help with description, or possibly with identifying duplication across a collection (which seems to be a common problem for people working with born-digital collections).

The next step to getting NLP tools like this off the ground is training. The information retrieval and ranking methods that create the output may not be widely understood. To unlock the value of an NLP tool, users must know how it works, how to run it, and how to perform meaningful analysis.  Training archivists in the reading room to assist researchers would be an excellent way to get tools like this out of the think tank and into the real world.


Morgan Goodman is a 2019 graduate of the University of North Carolina at Chapel Hill and currently resides in Denver, Colorado. She holds an MS in Information Science with a specialization in Archives and Records Management.

An Interview with Erin Barsan—Archives & Collections Information Consultant at Small Data Industries.


by Meghan Lyon

This is the seventh post in a new series of conversations between emerging professionals and archivists actively working with digital materials.

Photo credit: Small Data Industries.

Erin Barsan is a Consultant specializing in Archives & Collections Information at Small Data Industries, a private conservation lab and consultancy firm with a mission to “support and empower people to safeguard the permanence and integrity of the world’s artistic record.” She was the NDSR Art Resident (2017-2018) at Minneapolis Institute of Art, and obtained her MSLIS with an Advanced Certificate in Archives from Pratt Institute in 2015. Before attending Pratt, she studied graphic design and photography as an undergraduate at Columbia College Chicago.


I was interested in how Erin’s background in art influenced the direction of her graduate coursework and affects her style as a professional. During her BFA program, Erin learned critical thinking and analysis, visual literacy, and intentional decision-making—Erin had a professor whose frequent critique was “make no arbitrary decisions!” As an LIS student whose primary interest was archives, Erin chose to study User Experience (UX), specifically Information Architecture. The principles of UX—designing with the end user in mind, putting yourself in their place, doing research before you design—have very much influenced her working style.

At Small Data Industries, Erin works closely with clients to craft unique digital preservation and conservation strategies for institutions, private collectors, artists’ studios, and artist estates. While Erin was the NDSR Art Resident at the Minneapolis Institute of Art (Mia), she helped conceptualize and document a framework for managing and preserving the Museum’s collection of time-based media art. The day-to-day work of digital preservation includes using those visual literacy and UX principles to develop usable documents, and employing LIS research skills to find new information, learn how to complete a task, or find people with expert skills you may not have yourself. Soft skills then come in handy for building relationships with those experts.

In discussing some of the challenges of her work, Erin cited the importance of advocacy to combat the invisibility of digital work, and to educate and raise awareness of the ongoing action of preservation—i.e., nothing is ever preserved, only being preserved. There is a great need to explain “complicated things in a very succinct way” to foster support for preservation initiatives and build collaborative relationships with professionals in adjacent fields. Developing good communication skills is crucial to maintaining preservation programs within any institution. Prepare an elevator pitch to explain your job to someone outside the field, and be ready to describe digital archives and preservation in lay terms, and to share knowledge and encourage excitement about the archival endeavor.

The challenges of Erin’s work are also the rewards. As a consultant, Erin frequently works with new clients, and a preservation strategy that works well for one institution may fall flat for another. “In consulting, there’s a lot of similar problems, but every institution is different. It’s always interesting to try and take best-practices and standards and figure out how they can be applied in these unique situations.” For Erin, finding solutions to complex problems is rewarding since it often involves learning new skills and thinking creatively. She also enjoys helping to ensure that time-based media art and digital archives will be accessible and findable in the future: “I find it really gratifying to know that the work that I’m doing is going to make a difference—because I’ve seen the other side of the coin, when things get lost, and how easily information can be lost.”

For students and new professionals entering the field, Erin’s advice: “Get more internships. Everything that you learn in school is great, but hands-on experience is invaluable and is what will get you a job.” And although technical skills will help you get a job, once you’re on the job, soft skills become more important. Take advantage of the professional community: “We have a very generous community. A lot of times we can be reticent to reach out to other professionals in the field, but I know from experience that people want to help. So reach out!”

Share your experiences with your peers, find a way to connect to the larger community, and discuss what you’re learning or working on. This can be at whatever venue or capacity is comfortable for you, whether it’s presenting at conferences, tweeting, blogging, or something else. Keep abreast of what’s happening, join conversations, follow listservs, contribute to working groups. Invite and listen to other people’s perspectives. Finally, don’t be afraid to advocate for your professional development in the workplace. Imposter syndrome is real, don’t sell yourself and your experience short!


Meghan Lyon is completing the first year of her MSLIS degree at Pratt Institute School of Information. She has a BFA from The Cooper Union School of Art and is interested in artist archives, museum libraries and collections, and digital preservation.

Midwest Archives Conference 2019 meeting (MAC 2019)

by A.L. Carson

The Midwest Archives Conference 2019 meeting, held April 3-6 in Detroit (in the GM Renaissance Center, which may have the distinction, with its concentric circle design, of being the most bewildering conference center I’ve ever been in), chose “Innovations, Transformation, Resurgence” as its theme. The organizers put out a call for participants to “consider the ways they have transformed their local communities and the world,” and the call seemed to have struck a chord: the sessions reflected a sense of rootedness as well as a desire to increase and deepen connections between repositories, their holdings, the communities they represent, and (crucially) those they haven’t.

The programming took a broad perspective on the profession and practice of archives, giving space to multiple approaches and understandings of the work, from imposter syndrome to workflows, resulting in some really generative sessions. I attended a number of sessions focused on surfacing the histories of underserved and marginalized groups in the Midwest, notably “Together, We Make It: Making Collections Featuring Minority Groups More Accessible” and “Documenting the History of HIV/AIDS in the Midwest.”

Two standouts on the technical practice and electronic records side were “Computer-Assisted Appraisal of Electronic Records” and “Archival Revitalization: Transforming Technical Services with Innovative Workflows,” both of which were relevant to my (new) position as a processing archivist. For a play-by-play of some of these sessions, you can check out my MAC Twitter feed (yes, I live-tweet conferences). Both emphasized balancing competing priorities and unequal capacities, familiar themes for anyone working in archives. Leading off “Computer-Assisted Appraisal,” Cal Lee reminded everyone that there is no such thing as a perfect machine system (one that would remove the human labor from appraisal), and that the goal should never be to create one: machines are tools, not agents. That emphasis on human action, particularly communicating across and about technological divides, was echoed in “Archival Revitalization,” which focused on instances of implementation (new processes, tools, and workflows) that were made possible through, and in turn assisted, human collaboration. Both sessions also spoke to the importance of understanding iteration as an integral part of workflows (whether appraisal, processing, or providing access) rather than something to be engineered out of a process.

Thanks to scholarship and grant programs (of which we can always have more), a number of paraprofessionals and short-term or project archivists were able to attend and present, which enriched the programming significantly. There was a strong showing from the regional LIS students, both in their poster session on Friday and the general programming. Having just started my position at Iowa State, this was my first MAC; it was also my first time in Detroit, and overall I was favorably impressed. While the conference center itself is a marvel of hostile architecture (which made literal accessibility a real and not-to-be-downplayed challenge), the intellectual content of the presentations and general attitude of the attendees made it a fairly easy space in which to be a newcomer.

A.L. Carson is a processing archivist at Iowa State University, where they are engaged in developing processing, preservation, and access guidelines for digital records as well as increasing the availability of the traditional collections.