Archiving Email: Electronic Records and Records Management Sections Joint Meeting Recap

By Alice Sara Prael

This is the second post in the bloggERS series on Archiving Digital Communication.

Email has become a major challenge for archivists working to preserve and provide access to correspondence. There are technical challenges that differ between platforms, as well as intellectual challenges in describing and appraising massive, disorganized inboxes.

At this year’s Annual Meeting of the Society of American Archivists, the Electronic Records Section and the Records Management Section joined forces to present a panel on the breakthroughs and challenges of managing email in the archives.

Sarah Demb, Senior Records Manager, kicked off the panel by discussing the Harvard University Archives’ approach to collecting email from internal and donated records. Since their records retention schedule is not format specific, it doesn’t separate email from other types of correspondence. Correspondence in electronic format affects the required metadata and acquisition tools and methods, not the appraisal decisions, which are driven entirely by content. When a collection is acquired, administrative records are often mixed with faculty archives, which poses a major challenge for appraisal of correspondence. This is true for paper and email correspondence, but a digital environment lends itself to mixing administrative and faculty records much more easily. Another major challenge in acquiring these internal records is that the emails are often attached to business systems in the form of notifications and reporting features. These system-specific emails have significant overlap and cause duplication when system reports exist in one or many inboxes.

Since internal records at Harvard University Archives are closed for 50 years, and personal information is closed for 80 years, Demb is less concerned with an accidental disclosure of private information to a researcher and more concerned with making the right appraisal decisions during acquisition. Email is acquired by the archive at the end of a faculty member’s career rather than through regular, smaller acquisitions, which often leaves the archivist with one large, unwieldy inbox. Although donors are encouraged to weed their own inboxes prior to acquisition, this is a rare occurrence. The main strategy that Demb supports is to encourage best practices through training and offering guidance whenever possible.

The next presenter was Chris Prom, Assistant University Archivist at the University of Illinois at Urbana-Champaign. He discussed the work of the Andrew W. Mellon Foundation and Digital Preservation Coalition Task Force on Technical Approaches to Email Archives. This task force includes 12 members representing the U.K. and U.S. as well as numerous “Friends of the Task Force” who provide additional support. The task force recently published a draft report, which is available online for comment through August 31st. Don’t worry if you won’t have time to comment in the next two days, because the report will go out for a second round of comments in September. The task force is taking cues from other industries that are doing similar work with email, such as the legal and forensic fields, which use email as evidence. Having corporate representation from Google and Microsoft has been valuable because they are already acting upon suggestions from the task force to make their systems easier to preserve.

One major aspect of the task force’s work is addressing interoperability. Getting data out of one platform and usable by different tools has been an ongoing challenge for archivists managing email. There are many useful tools available, but chaining them together for a holistic workflow is problematic. Prom suggested that one potential solution to the ‘one big inbox’ problem is to capture email via API at regular intervals rather than waiting for an entire career’s worth of email to accumulate.
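To make the idea of regular, incremental capture a little more concrete, here is a minimal sketch in Python. It is not the task force's workflow or any vendor's API: the function name `archive_new_messages` is made up for illustration, and the step that actually fetches mail from a provider (via IMAP or a platform API) is assumed to happen elsewhere. The sketch only shows the accumulation side, appending each batch to an mbox file with the standard library's `mailbox` module and skipping any message whose Message-ID is already stored.

```python
import mailbox


def archive_new_messages(mbox_path, fetched):
    """Append fetched messages to an mbox file, skipping known Message-IDs.

    `fetched` is any iterable of email.message.Message objects; how they
    were retrieved is outside the scope of this sketch.
    Returns the number of messages actually added.
    """
    box = mailbox.mbox(mbox_path)  # creates the file if it does not exist
    try:
        # Collect the Message-IDs already present in the archive.
        seen = set()
        for stored in box:
            stored_id = stored["Message-ID"]
            if stored_id:
                seen.add(str(stored_id).strip())

        added = 0
        for msg in fetched:
            msg_id = msg["Message-ID"]
            key = str(msg_id).strip() if msg_id else None
            if key and key in seen:
                continue  # already captured in an earlier run
            box.add(msg)
            if key:
                seen.add(key)
            added += 1
    finally:
        box.close()  # flushes pending writes to disk
    return added
```

Run on a schedule, a function like this would let each capture pick up only what is new since the last one. Matching on Message-ID is a simplifying assumption. Messages without that header are always added, and a production workflow would also need file locking and fixity checks.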

Camille Tyndall Watson, Digital Services Section Manager at the State Archives of North Carolina, closed the panel by discussing the Transforming Online Mail with Embedded Semantics (TOMES) project. This grant-funded project is focused on appraisal by implementing the capstone approach, which identifies certain email accounts with enduring value rather than identifying individual emails. The project includes partners from Kansas, Utah, and North Carolina, but the hope is that this model could be duplicated in other states.

The first challenge was to choose the public officials whose accounts are considered part of the ‘capstone’ based on their position in the organizational chart. The project also crosswalked job descriptions to functional retention schedules. By working with the IT department, the team members are automating as much of the workflow as possible. This included assigning position numbers for ‘archival email accounts’ in order to track positions rather than individuals, which is difficult in an organization with significant turnover, such as a government department. This nearly constant turnover requires constant outreach to answer questions like “what is a record” and “why does the archive need your email?” The project is also researching natural language processing to allow for an automated and simplified process of arrangement and description of email collections.
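As a rough illustration of what tracking positions rather than individuals can look like in practice, here is a small Python sketch. It is not taken from the TOMES project: the `Assignment` record, the `capstone_accounts` function, the position numbers, and the addresses are all hypothetical. The idea is simply that the list of capstone positions stays stable while the people (and mailboxes) attached to each position change over time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assignment:
    """One person's tenure in a numbered position (all values hypothetical)."""
    position_number: str
    holder_email: str
    start: str            # ISO date, e.g. "2015-01-05"
    end: Optional[str]    # None while the person still holds the position


def capstone_accounts(capstone_positions, assignments):
    """Group mailbox addresses by capstone position, oldest holder first."""
    by_position = {number: [] for number in capstone_positions}
    # ISO dates sort correctly as plain strings.
    for a in sorted(assignments, key=lambda a: a.start):
        if a.position_number in by_position:
            by_position[a.position_number].append(a.holder_email)
    return by_position
```

Given a set of capstone position numbers and a list of assignments exported from an HR or IT system, the function returns every mailbox that has ever been attached to each capstone position, so a change in staff does not drop an account from the archival list.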

The main takeaway from this panel is that email matters. There are many challenges, but the work is necessary because email, much like paper correspondence, has cultural and historical value beyond the transactional value it serves in our everyday lives.


Alice Sara Prael is the Digital Accessioning Archivist for Yale Special Collections at Beinecke Rare Book & Manuscript Library. She works with born-digital archival material through a centralized accessioning service.

The Game is Afoot! Digital Sleuthing at the Electronic Records Section/Museum Archives Section Mystery Workshop

By Christine Wang

____

Every archivist wears many different hats. Detective is not usually one of them, but at this year’s SAA Annual Meeting in Atlanta, museum archivists donned their deerstalkers for the day as they delved into a mystery workshop designed to introduce participants to principles and practices in managing born-digital records within an institution.  

“Find the Person: Missing Curator Mystery Edition!” led participants (cast as the project archivist at the fictional Three Hills Museum) through the curious case of lead curator and director Jane Stevens, who seems to have suddenly vanished, only leaving behind a mysterious set of files. Tasked with finding out just what had happened to Jane, participants sifted through her files, including photographs, text files, spreadsheets, and various other documents, to solve the mystery (spoiler alert: turns out Jane was perfectly fine, having simply rushed off to Russia in her excitement to examine a potential J.M.W. Turner painting). In the process, they grappled with questions not only about Jane and her whereabouts, but also about the organization, protection, and preservation of files like the ones they were examining; that is, about digital archiving and records management in a professional setting.

Rachel Chatalbash and Susan Hernandez from the Museum Archives Section, and Ann Cooper, Wendy Hagenmaier, and Carol Kussmann from the Electronic Records Section planned the workshop based on Wendy’s 2014 workshop for the Society of Georgia Archivists. Wendy and Ann led the workshop for the Museum Archives Section at the 2016 SAA meeting. The materials for the 2016 workshop covered topics and methods in personal digital archiving to support participants in working with a mixture of personal and archival digital records and to boost participants’ confidence in working with digital material.

This year’s workshop revised and expanded upon the ideas of the original personal digital archiving workshop materials, applying them to the management and archiving of born-digital records in a museum environment. If you would like to view the materials from the workshop, follow the links to the Workshop Activity Instructions and Additional Resources.

____

Christine Wang is the Nancy Horton Bartels Scholar Intern at the Yale Center for British Art Institutional Archives.

 

Pathways to Automated Appraisal for Born-Digital Records: An SAA 2016 ERS Breakout Discussion Recap

By Lora Davis

____

In a stroke of brilliant SAA scheduling (or, perhaps, blind chance) the 2016 Electronic Records Section’s annual business meeting immediately followed Thursday afternoon’s session 201 “From 0 to 400 GB: Confronting the Challenges of Born-Digital Photographs.” During this session, panelists Kristen Yarmey, Ed Busch, Chris Prom, Molly Tighe, and Gregory Wiedeman discussed a variety of steps they’ve taken to answer the question “What next?” following the (physical or digital) delivery of born-digital campus photographs to their repositories. I listened intently as Wiedeman recounted how he has employed the API of his campus’ chosen cloud-based online public photo database (SmugMug) to automate the description of born-digital campus photographs at large scale. By reusing the existing photographer-generated descriptive metadata stored in SmugMug, Wiedeman’s campus photographs “describe themselves.” This struck a chord with me as I look forward to my own institution’s upcoming National Digital Stewardship Residency project “Large-Scale Digital Stewardship: Preserving Johns Hopkins University’s Born-Digital Visual History.” But, I wondered, could a similar method be employed to automate appraisal?

As the formal portion of the ERS business meeting concluded, the Section broke into several unconference-style small group discussions. Inspired by the above, I volunteered to lead one on potential methods for automating the appraisal of born-digital records. Breakout participant Tammi Kim kept discussion notes, as a group of about 20 ERS members engaged in discussion. As is often the case, our conversation occasionally deviated from the primary topic of appraisal, but even these tangents proved fruitful. Some of the topics discussed and questions raised include:

  • The differences and distinctions between born-digital appraisal and weeding. Is the goal of minimizing the total size of digital records ingested (say, reducing 50TB of born-digital campus photographs to 10TB) analogous to actually doing appraisal on these records?
  • Could the type of facial recognition software discussed in session 201 be used not only for description purposes, but also to identify VIPs and other photographic content that would inform appraisal decisions?
  • If the record’s creator (say, a campus photographer) assigned rights or permissions metadata to a digital object, might that rights metadata be employed for appraisal in an MPLP-like fashion?
  • What are the differences between photographic and text-based digital records? Is automated, machine-actionable appraisal more likely to succeed with one type of record than another? (E.g., it is easier to search for text in word processing documents and OCRed PDFs than it is to “search” in photographs.)
  • How can “micro-tools” like ArchiveFinder (product mentioned, but I cannot locate a GitHub page) and FileAnalyzer help with the appraisal of large, complex directories of digital files? Additionally, while tools like ExifTool can read, write, and edit embedded technical metadata, how useful is technical metadata to appraisal decisions?
  • How might content creators be brought into appraisal decisions after content has been transferred to a repository? Can we ask creators to enhance or add metadata after the fact?
  • Where does appraisal actually fit in with processing workflows, especially when working with larger files like video and disk images? How do you manage the need for increased storage even at the appraisal stage?
  • What “traditional” approaches to analog appraisal do not necessarily apply to digital? Where does potential future use of records fit in with born-digital appraisal decisions?
  • Are born-digital archives even sustainable monetarily or ecologically? Are we building the Tower of Babel? What about server farms and the offset of dirty fuels?

I encourage anyone who attended this discussion to add to this post and/or correct any of my poor-memory-induced misstatements above by commenting below. Similarly, whether you attended the breakout or not, let’s continue this conversation in the comments section!

Lora Davis is Digital Archivist at Johns Hopkins University, where she is tasked with creating, documenting, and managing workflows for acquiring, describing, processing, preserving, and providing access to born‐digital materials. Prior to her appointment at JHU in January 2016, Lora worked at Colgate University and the University of Delaware.

 

bloggERS! has gone fishin’

We’re off to SAA! Will you be there too? Check out our list of ERS-recommended sessions on Sched.

If you can’t make it this year, then follow along on Twitter with #SAA16!

People fishing on Green Lake, circa 1950s. Item 31415, Ben Evans Recreation Program Collection (Record Series 5801-02), Seattle Municipal Archives

 

We’ll be back soon with recaps from recent conferences and plenty of other good stuff.

 

Building a “Computational Archival Science” Community

By Richard Marciano

———

When the bloggERS! series started at the beginning of 2015, some of the very first posts featured work on “computer generated archival description” and “big data and big challenges for archives,” so it seems appropriate to revisit this theme of automation and management of records at scale and provide an update on a recent symposium and several upcoming events.

Richard Marciano co-hosted a recent “Archival Records in the Age of Big Data” symposium. For more information about the recent Symposium, visit: http://dcicblog.umd.edu/cas/. The three-day program is listed online and has links to all the videos and slides. A list of participants can also be found at http://dcicblog.umd.edu/cas/attendees. The objectives of the Symposium were to:

  • address the challenges of big data for digital curation,
  • explore the conjunction of emerging digital methods and technologies,
  • identify and evaluate current trends,
  • determine possible research agendas, and
  • establish a community of practice.

Richard Marciano and Bill Underwood will be further exploring these themes at SAA in Atlanta on Friday, August 5, 9:30am – 10:45am, session 311, for those ERS aficionados interested in contributing to this emerging conversation. See: https://archives2016.sched.org/event/7f9D/311-archival-records-in-the-age-of-big-data

On April 26-28, 2016 the Digital Curation Innovation Center (DCIC) at the University of Maryland’s College of Information Studies (iSchool) convened a Symposium in collaboration with King’s College London. This invitation-only symposium, entitled Finding New Knowledge: Archival Records in the Age of Big Data, featured 52 participants from the UK, Canada, South Africa and the U.S. Among the participants were researchers, students, and representatives from federal agencies, cultural institutions, and consortia.

This group of experts gathered at Maryland’s iSchool to discuss and try to define computational archival science: an interdisciplinary field concerned with the application of computational methods and resources to large-scale records/archives processing, analysis, storage, long-term preservation, and access, with the aim of improving efficiency, productivity and precision in support of appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival material.

This event, co-sponsored by Richard Marciano, Mark Hedges from King’s College London and Michael Kurtz from UMD’s iSchool, brought together thought leaders in this emerging CAS field: Maria Esteva from the Texas Advanced Computing Center (TACC), Victoria Lemieux from the University of British Columbia School of Library, Archival and Information Studies (SLAIS), and Bill Underwood from Georgia Tech Research Institute (GTRI). There is growing interest in large-scale management, automation, and analysis of archival content and the realization of enhanced possibilities for scholarship through the integration of ‘computational thinking’ and ‘archival thinking.’

To capitalize on the April Symposium, a follow-up workshop entitled Computational Archival Science: Digital Records in the Age of Big Data will take place in Washington, D.C., during the second week of December 2016 at the 2016 IEEE International Conference on Big Data. For information on the upcoming workshop, please visit: http://dcicblog.umd.edu/cas/ieee_big_data_2016_cas-workshop/. Paper contributions will be accepted until October 3, 2016.

———

Richard is a professor at Maryland’s iSchool and director of the Digital Curation Innovation Center (DCIC). His research interests include digital preservation, archives and records management, computational archival science, and big data. He holds degrees in Avionics and Electrical Engineering, a Master’s and Ph.D. in Computer Science from the University of Iowa, and conducted a Postdoc in Computational Geography.

Get to know the candidates: Lora Davis

The 2016 elections for Electronic Records Section leadership are upon us! Over the next two weeks, we will be presenting additional information provided by the 2016 nominees for ERS leadership positions. For more information about the slate of candidates, you can check out the full 2016 ERS elections site. ERS Members: be sure to vote! Polls are open July 8 through July 22!

Candidate name: Lora Davis

Running for: Steering Committee

What made you decide you wanted to become an archivist?

This question assumes a discrete “Aha!” moment, which, for me at least, never really happened. I like to say that archives found me, and not the other way around. I was first exposed to the archives (the place, if not the profession) when, as a 17-year-old undergraduate at Susquehanna University, I was awarded a university assistantship that placed me in the employ of a long-serving member of the Department of History, who had undertaken to write the history of the university. Following a brief tour (“My Moody Blues cassettes are in this drawer here, feel free to listen!”) and with a copy of James O’Toole’s Understanding Archives and Manuscripts (1990) in hand, I set about processing the papers of two former university presidents. Seven years later, after completing a master’s in history and opting to leave my PhD program, the archives (this time both place and profession) found me again when the Manuscripts Unit of the University of Delaware Library’s Special Collections department decided to take a chance and employ a grad school dropout at the height of the 2008 economic collapse. This time I was hooked for good. I went on to earn my MLIS online while working my full-time paraprofessional position at Delaware, and have since held professional positions at Colgate University and Johns Hopkins University. It took me a little while to figure it out, but, being an archivist provided me with the balance and variety of work I’d been longing for – the theory and intellectual work of a scholar, the interaction with people I’d missed as a graduate student researcher, the connection to history that had driven my prior coursework, and, perhaps most of all, the exposure to and engagement with emerging technologies I’d missed as a computer hobbyist turned grad student.

What is one thing you’d like to see the Electronic Records Section accomplish during your time on the steering committee?

Above all, I would like to see the Electronic Records Section serve as a welcoming and valuable resource to *all* archivists. In my career I have worked at a medium-sized partially public-funded university, a small liberal arts college, and a private research university, and worked on paper-based and electronic manuscript and university records’ collections, so I appreciate the variety of funding models, resource levels, institutional priorities, and individual knowledge and time we must all strive to balance and leverage in our day-to-day work. Across the profession it is still rare for someone to have the luxury of focusing day in and day out on electronic records; however, it is by no means rare for a 21st century archivist to encounter records of enduring value that exist only in digital form. By striving to be an open, welcoming, responsive, and member-driven community resource for all archivists, the Electronic Records Section can help meet the daily operational needs of its members (e.g. demystifying electronic records jargon and workflows, providing case studies of both successes and failures, serving as a non-judgmental sounding board for new and experienced archivists alike), while also helping to propel the profession forward.

What is your favorite GIF?

[GIF]

Get to know the candidates: Brian Dietz

Candidate name: Brian Dietz

Running for: Steering Committee

What made you decide you wanted to become an archivist?

All current contexts–social, cultural, economic–are historically contingent. We examine those contingencies, often with the goal of exposing power dynamics, through historical inquiry. Supporting such critical work is what excited me about becoming an archivist.

What is one thing you’d like to see the Electronic Records Section accomplish during your time on the steering committee?

I’m really interested in the idea of more of us making our documentation widely available so that it becomes a little bit easier for some folks to start digital archiving programs and others to enhance existing ones. The ERS could lead an effort around this kind of sharing.

What is your favorite GIF?

I love how affirming this one is:

[GIF]

Get to know the candidates: Blake Graham

Candidate name: Blake Graham

Running for: Steering Committee

What made you decide you wanted to become an archivist?

I love being asked this question. I started my career working as a graduate assistant at a university archives about six years ago. At the time, I was knee-deep in the curriculum – studying southern identity and slavery. I was enchanted by historiography, and by discovering how historians debate the interpretation, nature, and implications of primary source materials. My coursework, as well as my job responsibilities, were related to southern history. While working at the university archives, arranging a nineteenth-century manuscript collection, I stumbled across a slave pamphlet. For anyone unfamiliar, these were handouts for slave-trading events in the antebellum South. The text and imagery included horrific details about physique and “background information” on slaves. I buckled after reading the pamphlet. Handling and reading this document was a powerful experience for me, to say the least. I brought the item to the director, and she broke down crying as well. Because of this, along with a long list of “encounters in the archives,” I have a better understanding of the power of the written record. My work allows me to continue exploring the relationship between the written record and the human experience. This is why I work in archives, and why I love my work.

What is one thing you’d like to see the Electronic Records Section accomplish during your time on the steering committee?

I admire and appreciate all of the work in BloggERS – I believe it is a gateway for collaboration and innovation among our professional communities. If I were asked about foreseeable goals and accomplishments, I would take a bet on ERS leaders proactively seeking different voices to participate in the blog. In 2015-2016, roughly 80% of authors and ERM discussions on BloggERS came from university settings – a percentage that is also reflective of the Section’s leadership. To revisit Kyle Henke’s “Get to Know You” post last year, “I see the purpose of this group as a method to facilitate communication and encourage collaboration across the profession.” I also believe one of the best ways to improve one’s knowledge of, or develop new skills in, a topic of interest is to simply talk about it with colleagues across the profession. I would like to help move BloggERS in this direction by proactively initiating a dialogue between professionals working in a wide range of settings. I think targeted outreach and education are among the ways we can accomplish collaboration across the profession.

What is your favorite GIF?

[GIF]

Annual meeting session recommendations, courtesy of ERS

Having trouble deciding between two tantalizing-looking sessions at the Society of American Archivists annual meeting this year? Looking for some recommendations that might tip the scales? Look no further!

The Electronic Records Section has produced a schedule for this year’s conference through its online scheduling tool, Sched. Now you can see the sessions that may be of interest to ERS members in one place.

The Electronic Records Section mega-schedule is available here.

See something we may have missed? Comment below or email bloggERS! at ers.mailer.blog@gmail.com!

 

Get to know the candidates: Brad Houston

Candidate name: Brad Houston

Running for: Steering Committee

What made you decide you wanted to become an archivist?

A combination of two things: 1) A summer internship with the Truman Presidential Library, which introduced me to the work of an archivist and made me realize that said work was something I could see myself doing. 2) My subsequent experience researching for my senior History thesis, much of which took place in small-town historical societies and other poorly described and poorly organized repositories. This experience elicited a vow: “I want to help make sure other people don’t have to work this hard to find what they’re looking for.” (I hope I’ve been doing a good job on both the description and reference sides of this!)

What is one thing you’d like to see the Electronic Records Section accomplish during your time on the steering committee?

While chair of the Records Management Roundtable, I helped institute a semi-regular series of Google Hangouts, which give our members a chance to hear about archival and records management issues from various experts in the field and interact in real time to ask questions or work through examples. I think this is a model that would work well with a lot of the content put out by ERS: Hangout facilitators could walk people through using a particular tool or workflow discussed previously on the blog, for example. The Hangout format offers more interactivity than a webinar or Twitter chat (while incorporating elements of both!), and it seems like a great opportunity to expand ERS’s educational engagement with its members.

What is your favorite GIF?

nope