Collections as Data

by Elizabeth Russey Roke


Archives put a great deal of effort into preserving the original object.  We document the context around its creation, perform conservation work on the object if necessary, and implement reading room procedures designed to limit damage or loss.  As a result, researchers can read and hold an original letter written by Alice Walker, view a series of tintypes taken before the Civil War, read marginalia written by Ted Hughes in a book from his personal library, or listen to an audio recording of one of Martin Luther King, Jr.’s speeches.  In other words, we enable researchers to encounter materials as they were originally designed to be used as much as is possible.

The nature of digital humanities research challenges these traditional modes of archival access: are these the only ways to interact with archival material?  How do we serve users who want to leverage computational techniques such as text mining, machine learning, network analysis, or computer vision in their research or teaching?  Are machines and algorithms “users”? Archivists also encounter these questions as the content of archives shifts from analog to born digital material. Digital files were created and designed to be processed by algorithms, not just encountered through experiences such as watching, viewing, or reading.  What could access for these types of materials look like if we gave access to their full functionality and not just their appearance? 

I have spent the past two years working on an IMLS grant focused on addressing these types of questions.  Collections As Data: Always Already Computational examined how digital collections are and could be used beyond analog research methodologies.  Collections as data is ordered information, stored digitally, that is inherently amenable to computation.  This could include metadata, digital text, or other digital surrogates.  Whereas a digital repository might enable researchers to read a newspaper on a computer screen, an approach grounded in collections as data would give researchers access to the OCR file the repository generated to enable keyword search. In other words, digital repositories should provide access beyond the viewers, page turners, and streaming servers of most current digital repositories that replicate analog experiences.  At its core, collections as data simply asks cultural heritage organizations to make the full digital object available rather than making assumptions about how users will want to interact with it.  

Collections as data implementations are not necessarily complex, nor do they require complicated repository development. Some of the simplest examples can be found on GitHub, where archives such as Vanderbilt and New York University publish their EAD files. The Rockefeller Archive Center and the Museum of Modern Art go a step further and publish all of their collection data, along with a Creative Commons license. Emory, my home institution, makes finding aid data available in both EAD and RDF from our finding aids database, which has led to a digital humanities project that harvested correspondence indexes from our Irish poetry collections to build network graphs of the Belfast Group. More complex implementations often provide access to data through APIs instead of a bulk download. An example of this can be found at Carnegie Hall Archives, which allows researchers to query their data through a SPARQL endpoint.

Chord diagram created as part of Emory’s Belfast Group Poetry project, showing the networks of people associated with the Belfast Group and their relationships with each other.
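The kind of harvesting described above, pulling names out of a published finding aid to build a network, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the EAD fragment is invented and simplified (no namespace, placeholder names), and the function name `correspondent_edges` is hypothetical. It is not the code or data behind Emory's Belfast Group project.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified EAD fragment. Real finding aids are far
# larger and usually declare an XML namespace, which this sketch omits.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Example Poet papers</unittitle></did>
    <index>
      <indexentry><persname>Correspondent A</persname><ref>Box 1, Folder 2</ref></indexentry>
      <indexentry><persname>Correspondent B</persname><ref>Box 1, Folder 5</ref></indexentry>
      <indexentry><persname>Correspondent A</persname><ref>Box 2, Folder 1</ref></indexentry>
    </index>
  </archdesc>
</ead>
"""


def correspondent_edges(ead_xml):
    """Return (collection, person, weight) tuples for a network graph."""
    root = ET.fromstring(ead_xml)
    collection = root.findtext("./archdesc/did/unittitle")
    # Every <persname> in the index counts as one piece of correspondence.
    names = [el.text.strip() for el in root.iter("persname") if el.text]
    return [(collection, name, count) for name, count in Counter(names).most_common()]


edges = correspondent_edges(SAMPLE_EAD)
for source, target, weight in edges:
    print(f"{source} -> {target} (weight {weight})")
```

An edge list like this can be loaded into most graph or visualization tools, which is part of the appeal of publishing finding aid data in bulk.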

The Collections As Data: Always Already Computational final report includes more information and ideas for getting started with collections as data. It includes resources such as a set of user personas, methods profiles of common techniques used by data-driven researchers, and real-world case studies of institutions with collections as data including their motivations, technical details, and how they made the case to their administrators.  I highly recommend “50 Things,” which is a list of activities associated with collections as data work ranging from the simple to the complex. 

There are a few takeaways from this project I’d like to highlight for archivists in particular:

Collections as data approaches are archival.  Data-driven research demands authenticity and context of the data source, established and preserved through archival principles of documentation, transparency, and provenance.  This type of information was one of the most universal requests from digital humanities researchers. It was clear that they were not only interested in the object, but in how it came to be.  They wanted to understand their data as an archival object with information about its creation, provenance, and preservation. Archivists need to advocate for digital collections to be treated not just as digital surrogates, or what I like to think of as expensive photocopying, but as unique resources unto themselves deserving description, preservation, and access that may not necessarily match that of the original object.

Collections as data enhances access to archival material.  What if we could partially open restricted material to researchers?  Emory holds the papers of Salman Rushdie and his email files are largely restricted per the deed of gift.  Computational techniques being developed in ePADD could generate maps of Rushdie’s correspondents and reveal patterns in the timing and frequency of his correspondence, just through email header information and without exposing sensitive data (i.e. the content of the email) that Rushdie wanted to restrict.  Could this methodology be extrapolated to other types of restricted electronic files?  
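To make the header-only idea concrete, here is a minimal sketch using Python's standard `email` package. It is not ePADD's implementation; the sample messages, addresses, and the function name `header_summary` are invented for illustration. The function looks only at the To, Cc, and Date headers and never inspects the message body.

```python
from collections import Counter
from email.parser import HeaderParser
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical messages; only the header block is examined below.
SAMPLE_MESSAGES = [
    "From: author@example.org\nTo: editor@example.com\n"
    "Date: 3 Mar 2003 10:15:00 +0000\nSubject: draft\n\nRestricted body text.",
    "From: author@example.org\nTo: editor@example.com\nCc: agent@example.com\n"
    "Date: 9 Jun 2003 08:00:00 +0000\nSubject: proofs\n\nRestricted body text.",
    "From: author@example.org\nTo: agent@example.com\n"
    "Date: 12 Jan 2004 17:30:00 +0000\nSubject: contract\n\nRestricted body text.",
]


def header_summary(raw_messages):
    """Count recipients and messages per year using header fields only."""
    correspondents = Counter()
    by_year = Counter()
    parser = HeaderParser()  # parses headers and leaves the body unparsed
    for raw in raw_messages:
        msg = parser.parsestr(raw)
        recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
        for _name, address in getaddresses(recipients):
            correspondents[address.lower()] += 1
        by_year[parsedate_to_datetime(msg["Date"]).year] += 1
    return correspondents, by_year


correspondents, by_year = header_summary(SAMPLE_MESSAGES)
print(correspondents.most_common())
print(sorted(by_year.items()))
```

Whether header information itself is sensitive is a separate question that each deed of gift would have to answer.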

Just start. For digital files, trying something is always the first, and best, approach. There is no one way or best way to do collections as data work. Consider your community and ask them what they need. Unlike baseball fields, if you build it, they probably won’t come unless you ask first. Collections as data material already exists in your collection, especially if you use ArchivesSpace. Publish it. Think broadly about what might constitute collections as data and how you might make use of it yourself; collections as data benefits us too. Follow the Computational Archival Science project at the University of Maryland, which is exploring how we think about archival collections as data.

If you want to take a deep dive into collections as data (and get funding to do so!) consider applying to be part of the second cohort of the Part to Whole Mellon grant, which aims to foster the development of broadly viable models that support implementation and use of collections as data.  The next call for proposals opens August 1:  https://collectionsasdata.github.io/part2whole/ .  On August 5, the project team will offer a webinar with more information about the grant and opportunities to ask questions:  https://collectionsasdata.github.io/part2whole/cfp2webinar/.


Elizabeth Russey Roke is a digital archivist and metadata specialist at the Stuart A. Rose Library of Emory University, Atlanta, Georgia. Primarily focused on preservation, discovery, and access to digitized and born digital assets from special collections, Elizabeth works on a variety of technology projects and initiatives related to digital repositories, metadata standards, and archival descriptive practice. She was a co-investigator on a 2016-2018 IMLS grant investigating collections as data.

A Conversation with Annalise Berdini, Digital Archivist at the Seeley G. Mudd Manuscript Library, Princeton University

Interview conducted with Annalise Berdini in May 2019 by Hannah Silverman and Tamar Zeffren

This is the eighth post in a new series of conversations between emerging professionals and archivists actively working with digital materials.


Annalise Berdini is the Digital Archivist at the Seeley G. Mudd Manuscript Library at Princeton University, a position she has held since January 2018. She is responsible for the ongoing management of the University Archives Digital Curation Program, as well as managing a collection of web archives and assisting with reference services.

Annalise’s first post-graduate school position was as a manuscripts and archives processor at the University of California, San Diego (UCSD). While she was working at UCSD, universities and archives were slowly starting to see the need for a dedicated digital archivist position. When the Special Collections department at UCSD created their first digital archivist position, Annalise applied and got the job. She explains that a good deal of her work there, and at Princeton, is graciously supported by a community of digital archivists solving similar challenges in other institutions.

As Annalise has now held a digital archivist role at two different institutions, both universities, we were interested to hear her perspectives on how colleagues and researchers have understood – or misunderstood – her role. “Because I have digital in my job title,” she noted, “people interpret that in a lot of very wide and broad ways. Really digital archives is still an emerging field…there are so many questions to answer, and it’s fun to investigate that aspect of the field.”

Given prevailing concerns among institutional archives about preserving and processing legacy media, we were keenly interested in hearing Annalise’s insights about securing stakeholder buy-in to develop a digital archives program.

“It’s a struggle everywhere,” she acknowledges. Presently, Princeton’s efforts to build up a more robust digital preservation program have led the University to a partnership with a UK-based company called Arkivum, which offers digital preservation, storage, maintenance, auditing and reporting modules and has the capacity to incorporate services from Archivematica and create a customized digital storage solution for Princeton.

“We’ve been lucky here [at Mudd]. We’re getting this great system. There is buy-in and there seems to be a pretty strong push right now. For us, the most compelling argument we’ve had is that we are mandated to collect student materials and student records that will not exist anywhere else unless we take them. The school has to keep those records, there’s not an option. Emphasizing how easily that content could be lost without a proper digital preservation system in place was very compelling to people who weren’t necessarily aware of the fact that hard drives sitting on a shelf are really not acceptable storage choices and options.”

Annalise has also found that deploying some compelling statistics can aid in building awareness around digital archives needs. In discussions about how rapidly materials can degrade, Annalise likes to cite a 2013 Journal of Western Archives article, “Capturing and Processing Born-Digital Files in the STOP AIDS Project Records,” which showcases findings that out of a vast collection of optical storage media, “only 10% of these hundreds of DVDs were really able to be recovered, whereas, strangely, a lot of the floppy disks were better and easier to recover…I think emphasizing how fragile digital content is [can help people understand] how easily it will corrupt without you even knowing it.”

Equally important to generating momentum for such programs are the direct relationships Annalise cultivates with colleagues, within and without the archives. “My boss was really instrumental in the process, and the head of library IT helped me navigate getting approvals from the University as a whole and the University IT department.”

The complex process of sustaining and innovating a digital archives infrastructure provides ongoing opportunities for Annalise to “solve puzzles” and to unite colleagues in confronting the challenges of documenting and preserving born-digital heritage: “I have focused on trying to find one person who is maybe a level above me and to connect with them and then hopefully build up a network within my institution to build some groundswell.”


Hannah Silverman

Tamar Zeffren

Hannah Silverman and Tamar Zeffren both work at JDC Archives. Tamar is the Archival Collections Manager. Hannah is the Digitization Project Specialist and also works independently as a photo archivist. Both received SAA’s DAS certification.

An Exploration of BitCurator NLP: Incorporating New Tools for Born-Digital Collections

by Morgan Goodman

Natural Language Processing (NLP) has been a buzz-worthy topic for professionals working with born-digital material over the last few years. The BitCurator Project recently released a new set of natural language processing tools, and I had the opportunity to test out the topic modeler, bitcurator-nlp-gentm, with a group of archivists in the Raleigh-Durham area. I was interested in exploring how NLP might help archivists perform their everyday duties more effectively and efficiently. While my goal was to explore possible applications of topic modeling in archival appraisal specifically, the discussions surrounding other possible uses were enlightening. The resulting research informed my 2019 Master’s paper for the University of North Carolina at Chapel Hill.

Topic modeling extracts text from files and organizes the tokenized words into topics. Imagine a set of words such as: mask, october, horror, michael, myers. Based on this grouping of words you might be able to determine that somewhere across the corpus there is a file about one of the Halloween franchise horror films. When I met with the archivists, I had them run the program with disk images from their own collections, and we discussed the visualization output and whether or not they were able to easily analyze and determine the nature of the topics presented.

BitCurator uses open source tools in its applications and chose the pyLDAvis visualization for the final output of its topic modeling tool (more information about the algorithm and how it works can be found in Sievert and Shirley’s paper; you can also experiment with the output through this Jupyter notebook). The left side of the visualization displays topic circles in relative sizes, plotted on a two-dimensional plane. Each topic is labeled with a number in decreasing order of prevalence (circle #1 is the main topic in the overall corpus, and is also the largest circle). The space between topics reflects how closely they are related, i.e. topics that are less related are plotted further away from each other. The right side contains a list of 30 words, each with a blue bar indicating that term’s frequency across the corpus. Clicking on a topic circle alters the terms list by adding a red bar for each term, showing the term’s frequency in that particular topic in relation to the overall corpus.

The user can then manipulate a metric slider, which is meant to help decipher what the topic is about. Essentially, when the slider is all the way to the right at “1”, the most prevalent terms for the entire corpus are listed. When a topic is selected and the slider is at 1, it shows all the prevalent terms for the corpus in relation to that particular topic (in our Halloween example, you might see more general words like: movie, plot, character). Alternatively, the closer to “0” the slider moves, the fewer corpus-wide terms appear and the more topic-specific terms are displayed (e.g. knife, haddonfield, strode).
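The slider's behavior can be illustrated with a small sketch. It assumes the relevance definition given in Sievert and Shirley's paper, where a term's relevance to a topic is λ · log p(term | topic) + (1 − λ) · log(p(term | topic) / p(term)) and λ is the slider value. The probabilities below are invented for the Halloween example, and the function names are hypothetical; this is not output from, or code taken from, the BitCurator tool.

```python
import math


def relevance(p_term_given_topic, p_term, lam):
    """Relevance of a term to a topic for slider value lam (0 to 1)."""
    lift = p_term_given_topic / p_term
    return lam * math.log(p_term_given_topic) + (1 - lam) * math.log(lift)


# Hypothetical values: term -> (p(term | topic), p(term) across the corpus).
TERMS = {
    "movie": (0.08, 0.06),
    "character": (0.05, 0.04),
    "haddonfield": (0.02, 0.002),
    "strode": (0.015, 0.001),
}


def rank_terms(terms, lam):
    """Sort terms from most to least relevant at the given slider value."""
    return sorted(terms, key=lambda t: relevance(*terms[t], lam), reverse=True)


print(rank_terms(TERMS, 1.0))  # slider at 1: general, frequent words lead
print(rank_terms(TERMS, 0.0))  # slider at 0: topic-specific words lead
```

With these made-up numbers, "movie" and "character" rank first at 1, while "strode" and "haddonfield" rise to the top at 0, which matches the behavior described above.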

While the NLP does the hard work to scan and extract text from the files, some analysis is still required by the user. The tool’s output offers archivists a bird’s eye view of the collection, which can be helpful when little to nothing is known about its contents. However, many of the archivists I spoke to felt this tool is most effective when you already know a bit about the collection you are looking at. In that sense, it may be beneficial to allow researchers to use topic modeling in the reading room to explore a large collection. Researchers and others with subject matter expertise may get the most benefit from this tool – do you have to know about the Halloween movie franchise to know that Michael Myers is a fictional horror film character? Probably. Now imagine more complex topics that the archivists may not have working knowledge of. The archivist can point the researcher to the right collection and let them do the analysis. This tool may also help for description or possibly identifying duplication across a collection (which seems to be a common problem for people working with born-digital collections).

The next step in getting NLP tools like this off the ground is to implement training. The information retrieval and ranking methods that create the output may not be widely understood. To unlock the value within an NLP tool, users must know how it works, how to run it, and how to perform meaningful analysis. Training archivists in the reading room to assist researchers would be an excellent way to get tools like this out of the think tank and into the real world.


Morgan Goodman is a 2019 graduate of the University of North Carolina at Chapel Hill and currently resides in Denver, Colorado. She holds an MS in Information Science with a specialization in Archives and Records Management.

An Interview with Erin Barsan—Archives & Collections Information Consultant at Small Data Industries.

by Meghan Lyon

This is the seventh post in a new series of conversations between emerging professionals and archivists actively working with digital materials.

Photo credit: Small Data Industries.

Erin Barsan is a Consultant specializing in Archives & Collections Information at Small Data Industries, a private conservation lab and consultancy firm with a mission to “support and empower people to safeguard the permanence and integrity of the world’s artistic record.” She was the NDSR Art Resident (2017-2018) at Minneapolis Institute of Art, and obtained her MSLIS with an Advanced Certificate in Archives from Pratt Institute in 2015. Before attending Pratt, she studied graphic design and photography as an undergraduate at Columbia College Chicago.


I was interested in how Erin’s background in art influenced the direction of her graduate coursework and affects her style as a professional. During her BFA program, Erin learned critical thinking and analysis, visual literacy, and intentional decision-making; one professor’s frequent critique was “make no arbitrary decisions!” As an LIS student whose primary interest was archives, Erin chose to study User Experience (UX), specifically Information Architecture. The principles of UX (designing with the end user in mind, putting yourself in their place, doing research before you design) have very much influenced her working style.

At Small Data Industries, Erin works closely with clients to craft unique digital preservation and conservation strategies for institutions, private collectors, artists’ studios, and artist estates. While Erin was the NDSR Art Resident at the Minneapolis Institute of Art (Mia), she helped conceptualize and document a framework for managing and preserving the Museum’s collection of time-based media art. The day-to-day work of digital preservation includes using those visual literacy and UX principles to develop usable documents and employing LIS research skills to find new information, learn how to complete a task, or find people with expert skills that you may not have. Soft skills then come in handy to build relationships with those expert individuals.

In discussing some of the challenges of her work, Erin cited the importance of advocacy to combat the invisibility of digital work, and to educate and raise awareness of the ongoing action of preservation, i.e. nothing is ever preserved, only being preserved. There is a great need to explain “complicated things in a very succinct way,” to foster support for preservation initiatives and build collaborative relationships with professionals in adjacent fields. Developing good communication skills is crucial to maintaining preservation programs within any institution. Prepare an elevator pitch to explain your job to someone outside the field, and be ready to describe digital archives and preservation in lay terms, to share knowledge, and to encourage excitement about the archival endeavor.

The challenges of Erin’s work are also the rewards. As a consultant, Erin frequently works with new clients, and a preservation strategy that works well for one institution may fall flat for another. “In consulting, there’s a lot of similar problems, but every institution is different. It’s always interesting to try and take best-practices and standards and figure out how they can be applied in these unique situations.” For Erin, finding solutions to complex problems is rewarding since it often involves learning new skills and thinking creatively. She also enjoys helping to ensure that time-based media art and digital archives will be accessible and findable in the future, “I find it really gratifying to know that the work that I’m doing is going to make a difference—because I’ve seen the other side of the coin, when things get lost, and how easily information can be lost.”

For students and new professionals entering the field, Erin’s advice: “Get more internships. Everything that you learn in school is great, but hands-on experience is invaluable and is what will get you a job.” And although technical skills will help you get a job, once you’re on the job, soft skills become more important. Take advantage of the professional community, “we have a very generous community. A lot of times we can be reticent to reach out to other professionals in the field, but I know from experience that people want to help. So reach out!”

Share your experiences with your peers, find a way to connect to the larger community, and discuss what you’re learning or working on. This can be in whatever venue or capacity is comfortable for you, whether it’s presenting at conferences, tweeting, blogging, or something else. Keep abreast of what’s happening, join conversations, follow listservs, contribute to working groups. Invite and listen to other people’s perspectives. Finally, don’t be afraid to advocate for your professional development in the workplace. Imposter syndrome is real; don’t sell yourself and your experience short!


Meghan Lyon is completing the 1st year of her MSLIS degree program at Pratt Institute School of Information. She has a BFA from the Cooper Union, School of Art, and is interested in artist archives, museum libraries & collections, and digital preservation.

Midwest Archivists Conference 2019 meeting (MAC 2019)

by A.L. Carson

The Midwest Archivists Conference 2019 meeting, held April 3-6 in Detroit (in the GM Renaissance Center, which may have the distinction, with its concentric circle design, of being the most bewildering conference center I’ve ever been in), chose “Innovations, Transformation, Resurgence” as its theme. The organizers put out a call for participants to “consider the ways they have transformed their local communities and the world,” and it seemed to have struck a chord: the sessions reflected a sense of rootedness as well as a desire to increase and deepen connections between repositories, their holdings, the communities they represent, and (crucially) those they haven’t.

The programming took a broad perspective on the profession and practice of archives, giving space to multiple approaches and understandings of the work, from imposter syndrome to workflows, resulting in some really generative sessions. I attended a number of sessions focused on surfacing the histories of underserved and marginalized groups in the Midwest, notably “Together, We Make It: Making Collections Featuring Minority Groups More Accessible” and “Documenting the History of HIV/AIDS in the Midwest.”

Two standouts on the technical practice and electronic records side were “Computer-Assisted Appraisal of Electronic Records” and “Archival Revitalization: Transforming Technical Services with Innovative Workflows,” both of which were relevant to my (new) position as a processing archivist. For a play-by-play of some of these sessions, you can check out my MAC Twitter feed (yes, I live-tweet conferences). Both emphasized balancing competing priorities and unequal capacities, familiar themes for anyone working in archives. Leading off “Computer-Assisted Appraisal,” Cal Lee reminded everyone that there is no such thing as a perfect machine system (one that would remove the human labor from appraisal), and that the goal should never be to create one: machines are tools, not agents. That emphasis on human action, particularly communicating across and about technological divides, was echoed in “Archival Revitalization,” which focused on instances of implementation (new processes, tools, and workflows) that were made possible through, and in turn assisted, human collaboration. Both sessions, too, spoke to the importance of understanding iteration as an integral part of workflows (whether appraisal, processing, or providing access) rather than something to be engineered out of a process.

Thanks to scholarship and grant programs (of which we can always have more), a number of paraprofessionals and short-term or project archivists were able to attend and present, which enriched the programming significantly. There was a strong showing from the regional LIS students, both in their poster session on Friday and the general programming. Having just started my position at Iowa State, this was my first MAC; it was also my first time in Detroit, and overall I was favorably impressed. While the conference center itself is a marvel of hostile architecture (which made literal accessibility a real and not-to-be-downplayed challenge), the intellectual content of the presentations and general attitude of the attendees made it a fairly easy space in which to be a newcomer.

A.L. Carson is a processing archivist at Iowa State University, where they are engaged in developing processing, preservation, and access guidelines for digital records as well as increasing the availability of the traditional collections.

A Conversation with Wendy Hagenmaier, Digital Collections Archivist at Georgia Tech

Interview conducted with Wendy Hagenmaier by Colleen Farry in March 2019.

This is the sixth post in a new series of conversations between emerging professionals and archivists actively working with digital materials.


Wendy Hagenmaier is the Digital Collections Archivist at the Georgia Tech Library where she leads the development of workflows for preserving and delivering born-digital special collections. She also manages the Library’s retroTECH initiative. Recently, Wendy shared her experiences as an archivist and some recommendations for new professionals with bloggERS!

When Wendy entered graduate school at the University of Texas at Austin, she did not initially know that her area of focus would be archives. She recalled a presentation by Dr. David Gracy during orientation. “I was captivated by the thought of how records are with you from the time that you’re born and how they’re evidence of your life.” Wendy went on to work in the archives as a graduate student while pursuing her M.S. in Information Studies. In retrospect, pursuing a career in archives was a natural career path for Wendy. As a child, she was always fascinated by objects and the narratives that people attached to them. In this way, Wendy explained how “the past is still present within objects in an archive.”

In addition to managing born-digital special collections, Wendy oversees the retroTECH Library program at Georgia Tech. This initiative provides a place for engagement with vintage hardware and software and modern tools for digital archiving and emulation. As described on the program website: “retroTECH aims to inspire a cultural mindset that emphasizes the importance of personal archives, open access to digital heritage, and long-term thinking.” Wendy hopes the program will continue to grow and expand beyond its space in the library. “The students, in interacting with older technology, can consider how we interact with technology now. They begin to consider the infrastructures that define our records and think about their own engagement with technology.”

Visitors to the retroTECH lab have the opportunity to experiment with classic hardware and computer programs. “When people walk into the space they become very emotional and immediately launch into a memory of when they were a kid.” She spoke about the power of that experience and its ability to demonstrate the importance of libraries and archives as preservers of the past. Wendy also shared her thoughts on the importance of making archival work more visible and the challenge of developing models of sustainability within the profession. “We need to be able to communicate the value of our work to people in power who make resource decisions.”

When asked about the dynamic nature of digital archiving and staying up to date on new tools and technologies, Wendy acknowledged, “We will continue to encounter skills gaps in our careers.” To tackle new challenges with technology, Wendy has adopted a collaborative approach, working with colleagues to achieve goals that she might not have had bandwidth to accomplish on her own. “I think, ‘If I don’t learn scripting in the way that I had fantasies of, that’s ok.’ I spend time talking with colleagues that have expertise that I wish I had more time to cultivate.” She added, “there are many great SAA courses, and I’m grateful to benefit from these gap-filling learning opportunities.” Wendy also encourages archivists to explore open-source tools with strong user communities and training resources. For her, it is very motivating to be in a profession where “everyone is very open to sharing their knowledge and capitalizing on the ways that we can support each other.”

Some pieces of advice that Wendy shared for new professionals included staying curious and feeling empowered to question. “Try to maintain that sense of wonder and discovery about technological and socio-technical issues, and feel empowered to challenge them, where necessary. Our field is going to change a lot, and we should encourage each other to push beyond the status quo.” She observed that archivists can’t always control how technologies are developed, but they can think critically about how those infrastructures define our records and practices.

Networking can be challenging for new archivists and veterans alike. Wendy recommended pursuing virtual collaborations and reaching out to regional groups with shared interests. “I found comfort and a genuine connection with smaller working groups, like the ERS steering committee.” Wendy has also done a lot of work regionally. “It’s great to get involved locally to identify areas of commonality to present at regional conferences.”

When asked what she loved most about being an archivist, Wendy said the privilege of working with people that have a shared passion for archives. “I do this because I love it, and I get to work with others who love it as well; feeling that shared passion is very nurturing.”


Colleen Farry is an Assistant Professor and Digital Services Librarian at the University of Scranton where she develops, coordinates, and manages the Weinberg Memorial Library’s digital collections and related digital projects.

An Interview with Elise Tanner – Director of Digital Projects and Initiatives at the University of Arkansas at Little Rock Center for Arkansas History and Culture

This is the fifth post in the Conversations series.

Elise Tanner received her Master of Science in Library and Information Science from the iSchool at the University of Illinois in 2015. She was a Resident in the 2017/2018 National Digital Stewardship Residency for Art Information program. For the residency, she worked on a project to build a foundation for the preservation of the Philadelphia Museum of Art’s time-based media art collection. Today, she is the Director of Digital Projects and Initiatives at the University of Arkansas at Little Rock Center for Arkansas History and Culture, where she is taking the lead on all things digital.


“Try things.” “[Ask] lots of questions.”

Elise Tanner’s cheery force of will shines through the interviews we have over video chat. Her work in digital archives and preservation so far has been on the edges of the digital preservation map: preservation of time-based media art at the Philadelphia Museum of Art and this new position as the Director of Digital Projects & Initiatives at the University of Arkansas at Little Rock Center for Arkansas History & Culture. She admits she doesn’t necessarily see her role as that of an “archivist” but as that of a preservationist, even if archival concepts can’t help but inform her work as she considers an upcoming born-digital remote transfer.

We talk about the way archives hold stories and show structural bias, and about how cool it would be to incorporate soundscapes in future collections. Tanner is working on collaborative GIS projects with the university’s GIS Lab, getting the Digital Services Lab technology organized, and thinking about how best to engage the graduate assistants/apprentices who do much of the digitizing work in the lab, all on top of the work involved in getting up to speed with a new institution and a new home. The Center itself shares space with the Central Arkansas Library System’s Butler Center for Arkansas Studies, a unique partnership that includes shared reference work in the research room.

When I ask her what advice she has for newer professionals and students, she points to her first internship in an academic library: “It wasn’t what I really wanted [to be a reference librarian].” But it is necessary for people to try things out, see what is in the field, join the listservs and ask (more) questions. Another colleague who made a career change later in life began working at the Center as a graduate student in UALR’s Public History program and has remained at the Center as an Assistant Archivist for the past 10 years. As with many things, the first attempt will not be your last.

Tanner’s route to digital archives reflects the current socioeconomic times and her desire to keep learning. After graduating with a BA in Photography from Columbia College in Chicago, she worked for three years at Starbucks before deciding on an online MLIS program to avoid moving. She admits that the program wasn’t really structured towards archival work, but she pulled together the courses needed to obtain a certificate in Special Collections. Tanner worked full time during her MLIS as a digital imaging technician for The School of the Art Institute of Chicago. This practical component, as well as access to practitioners who could answer Tanner’s many questions, would prove a valuable counterbalance to a mainly online program.

After graduation, Tanner applied for the 2017-2018 residency in the National Digital Stewardship Residency for Art Information, and while her first interview didn’t garner a position with that particular institution, the positive impression she created led to one of the other Resident positions with the Philadelphia Museum of Art. The residency work produced the base for “an approach to digital preservation of time-based media art (TBMA)” for the institution. It also provided Tanner with the opportunity to develop presentation and project management skills, as well as mentorship from other professionals and the residency organizers.

How did she find herself at Little Rock? “The staff here really sold it to me,” Tanner admits. After the usual job application grind and interviewing near-misses, she credits luck and hard work in landing a position that was a good fit for her skills and personality. She almost didn’t apply because the word Director in the title was intimidating, before looking closer at the requested skills and deciding to go for it. The match seems well made.

What are important skill sets for the nascent digital archivist/preservationist to develop, according to Tanner? “Communication,” she says, and expands: learn to give an elevator speech; articulate your vision to a group; stay on top of an overwhelming email inbox; master project management; learn how to prepare for and run a meeting; go to conferences and put yourself out there. Technical skills follow close behind: networks, security, and any tools that will make your life easier in terms of communication and project management. It might all sound overwhelming, but getting practical experience in the field will reveal your personal strengths and narrow down the aspects you can work on. Careers are a long game: try things and ask more questions.

What does the future hold for Tanner? Publishing her TBMA work is first on her list, but she also aspires to one day collect the archives of the local Rock Town Roller Derby league and, eventually, to become more embedded in the local community. So definitely keep an eye out for more from this up-and-coming digital preservationist.


Author Bio: Meghan Whyte is a former public librarian who currently works as a government records reappraisal archivist for Library and Archives Canada.

“A Trial by Fire Process”: Digital Archiving at the State Historical Society of Missouri (Interview with Elizabeth Engel)

This is the fourth post in the Conversations series.

Founded in 1898, the State Historical Society of Missouri (SHSMO) “collect[s], preserve[s], publish[es], exhibit[s], and make[s] available material related to all aspects and periods of Missouri history” (The State Historical Society of Missouri, “About Us”). Supporting this mission is a large staff that includes thirty-five full-time and twelve part-time employees, two research fellows, and a large number of volunteers and interns who work in one of SHSMO’s six Research Centers (The State Historical Society of Missouri, “About Us”). My interviewee, Senior Archivist Elizabeth Engel, serves at the Columbia Research Center on the University of Missouri campus. Elizabeth and her colleagues work to make SHSMO’s collections (e.g. the National Women and Media Collection) accessible to a wide variety of patrons, including film creators, reporters, and researchers from all walks of life.

Elizabeth’s entry into the archival field was due partly to happenstance. After enrolling in the University of Iowa’s (UI) School of Information Science, Elizabeth expected to work in public libraries, especially because she had worked in similar settings during her high school and college years. However, she seized upon an opportunity to complete a work-study assignment at the Iowa Women’s Archives (at the University of Iowa) and promptly discovered a passion for archives. After she graduated from UI in 2006, SHSMO initially hired her as a Manuscript Specialist, and the rest is, well, history (The State Historical Society of Missouri, “SHSMO Staff”). As the senior archivist for the Columbia Research Center, Elizabeth spends her days processing collections, fulfilling various public services responsibilities, and developing biographical histories of Missouri’s most well-known citizens. Her greatest responsibility, however, is overseeing the Columbia Research Center’s accessioning efforts, particularly as they pertain to digital content.

Elizabeth’s Research Center has seen a marked increase in the amount of born-digital material that it takes in each year. This point is exemplified by SHSMO’s recent acquisition of Senator Claire McCaskill’s papers, which consist of approximately 3.25 cubic feet of material and two terabytes of data. To tackle the challenges of managing such content, Elizabeth and her staff have employed a variety of tactics and tools. While MPLP-inspired collection-level descriptions have sufficed for physical collections, Elizabeth noted that digital content requires a more in-depth description for access and preservation purposes. Elizabeth’s work on other projects, such as the processing of the Missouri Broadcasters Association Radio Archives Collection, reinforced the importance of flexibility, as exemplified by her arrangement tactics (recordings are organized by call sign, and further accruals are added to the end of the finding aid) and description efforts (“some of the file names were in ALL CAPS and I decided to retain that for the time being as well…perhaps it will aid in retrieval”).

This theme of flexibility emerged when Elizabeth discussed the different digital archiving tools that SHSMO staff have employed: Duke University’s DataAccessioner and Microsoft Excel spreadsheets (to create and organize metadata); various storage spaces, including network attached storage (NAS) units and a dark archive (both of which are accessible only to certain staff); thumb drives, used to deliver content to patrons; a Microsoft Access database, which serves as the institution’s collection management system; and BitCurator, which SHSMO staff set up to tackle larger and more complex collections (e.g. Senator McCaskill’s papers). Overall, effectively and efficiently managing these digital resources has been “a [constant] trial by fire process,” given the somewhat volatile nature of the digital archives field. In the future, Elizabeth hopes that SHSMO will adopt more user-friendly and compatible software—such as Archivematica and/or Access to Memory (AtoM)—to fulfill its mission. In fact, Elizabeth emphasized that finding such tools—especially cost-effective tools—represents one of the greater challenges facing modern archivists.

For the aspiring digital archivist, Elizabeth recommended seeking out practice-focused learning opportunities. To complement her largely theoretical UI coursework, Elizabeth completed the Digital Archives Specialist (DAS) certificate, and she continues to scan the field for published literature and engage in other professional development efforts. She further recommended the workshops provided by Lyrasis as another opportunity to deepen one’s digital preservation knowledge. Elizabeth explained that the twenty-first-century digital archivist must remain flexible and commit to continual learning to stay on top of the field’s recent developments. She also emphasized that these same professionals must be given sufficient time to learn and experiment with tools and workflows.

Before we digitally parted ways, Elizabeth offered one final and, in this writer’s opinion, exceptionally solid piece of advice:

“You’re going to make mistakes and that’s okay. The DAS courses drilled it into me that ‘Doing something is better than nothing.’ Standards/tools are going to change and you can’t predict that. Sometimes all you can do is digital triage with the resources/time you have, so don’t let the doing things perfectly be the enemy of the good.”



Author Bio: Steven Gentry is the Archives Technician for the St. Mary’s College of Maryland Archives. His responsibilities include processing collections and building finding aids; assisting with web and email archiving efforts; and researching tools and best practices pertaining to digital archives and electronic records.

 


An Interview With Caitlin Birch — Digital Collections and Oral History Archivist at the Rauner Special Collections Library, Dartmouth

Interview conducted with Caitlin Birch by Juli Folk in March 2019

This is the third post in the Conversations series.

Meet Caitlin Birch

Caitlin Birch is the Digital Collections and Oral History Archivist for the Rauner Special Collections Library at Dartmouth College in Hanover, New Hampshire. She sat down with Juli Folk, a graduate student at the University of Maryland-College Park iSchool who is pursuing an archives-focused MLIS and a certificate in Museum Scholarship and Material Culture. Caitlin’s descriptions of her career path, her roles and achievements, and her insights into the challenges she faces helped frame a discussion of helpful skill sets for working with born-digital archival records on a daily basis.

Caitlin’s Career Path

As an undergraduate, Caitlin majored in English, concentrating in journalism with minors in history and Irish studies. After a few years working as a reporter and editor, she began to consider a different career path, looking for other fields that emphasize constant learning, storytelling, and contributions to the historical record. In time, she decided on a dual degree (MA/MSLIS) in history and archives management from Simmons College (now Simmons University). Throughout grad school, her studies focused on both historical methods and original research as well as archival theory and practice.

When asked about the path to her current position, Caitlin responded, “To the extent that my program allowed, I tried to take courses with a digital focus whenever I could. I also completed two internships and worked in several paraprofessional positions, which were really invaluable to preparing me for professional work in the field. I finished my degrees in December 2013 and landed my job at Dartmouth a few months later.” She now works as the Digital Collections and Oral History Archivist for Rauner Special Collections Library, the home of Dartmouth College’s rare books, manuscripts, and archives, compartmentalized within the larger academic research library.

Favorite Aspects of Being an Archivist

For Caitlin, the best aspects of being an archivist are working at the intersection of history and technology; teaching and interacting with people every day; and having new opportunities to create, innovate, and learn. Her position includes roles in both oral history and born-digital records, and on any given day she may be juggling tasks like teaching students oral history methodology, working on the implementation of a digital repository, building Dartmouth’s web archiving program, managing staff, sharing reference desk duty, and staying abreast of the profession via involvement with the SAA and the New England Archivists Executive Board. “I like that no two days are the same,” she shared, adding, “I like that my work can have a positive impact on others.”

Challenges of Being an Archivist

Caitlin pointed out that aspects of the profession change and evolve at a pace that can make it difficult to keep up, especially when job- or project-related tasks demand so much attention. She also noted other challenges: “More and more we’re grappling with issues like the ethical implications of digital archives and the environmental impact of digital preservation.” That said, she finds that “the biggest challenge is also the biggest opportunity: most of what I do hasn’t been done before at Dartmouth. I’m the first digital archivist to be hired at my institution, so everything—infrastructure, policies, workflows, etc.—has been/is being built from the ground up. It’s exciting and often very daunting, especially because this corner of the archives field is dynamic.”

Advice for Students and Young Professionals

As a result, Caitlin emphasized the importance of experimentation and failure. “Traditional archival practice is well-defined and there are standards to guide it, but digital archives present all kinds of unique challenges that didn’t exist until very recently. Out of necessity, you have to innovate and try new things and learn from failure in order to get anywhere.” For this reason, she recommended building a good professional network and finding time to keep up with the professional literature. “It’s really key to cultivate a community of practice with colleagues at other institutions.”

When asked whether she sets aside time specifically for these tasks or finds that networking and research are natural outputs of her daily work, Caitlin stated that networking comes more easily because of her involvement with professional organizations. However, finding time for professional literature and research proved more difficult, a concern Caitlin brought to her manager. In response, he encouraged her to block one to two hours on her calendar at the same time every week to catch up on reading and professional news. She remains grateful for that support: “I would hope that every manager in this profession encourages time for regular professional development. It may seem like it’s taking time away from job responsibilities, but in actuality it’s helping you to build the skills and knowledge you need for future innovation.”



Juli Folk is finishing the MLIS program at the University of Maryland-College Park iSchool, specializing in Archives and Digital Curation. Previously a corporate editor and project manager, Juli’s graduate work supplements her passions for writing, art, and technology with formal archival training, to refocus her career on cultural heritage institutions.

An Interview with Erica Titkemeyer – Project Director and AV Conservator at the Southern Folklife Collection, UNC

Interview conducted with Erica Titkemeyer by Morgan McKeehan in March 2019.

This is the second post in a new series of conversations between emerging professionals and archivists actively working with digital materials.


Erica is the Project Director and AV Conservator at the Southern Folklife Collection, in Wilson Special Collections Library at the University of North Carolina at Chapel Hill’s University Libraries.

Tell us a little bit about the path that brought you to your current position.

As an undergrad I majored in Cinema and Photography, which initially put me in contact with many of the analog-based obsolete formats our team at UNC works to digitize now. It was also during this time when I saw how varied new proprietary born-digital formats could be based on camera types, capture settings, and editing environments, and how these files could be just as problematic as film and magnetic-based formats when trying to access content over time. Whether projects originated on something like DVCAM or P2 cards, codec and file format compatibility issues were a daily occurrence in classes. After undergrad I went through NYU’s Moving Image Archiving and Preservation program where courses in digital preservation helped instill a lot of the foundational knowledge I use today.

After grad school, I spent 9 months in the inaugural National Digital Stewardship Residency cohort in Washington, D.C., where I worked at Smithsonian Institution Archives to explore digital preservation needs and challenges of digital media art.

My current position is primarily concerned with the timely digitization, preservation and access of obsolete analog audiovisual formats, but our digital tape-based collections are growing, and there are many born-digital accessions with a myriad of audio and video file formats that we need to make decisions about now to ensure they’re around for the long term.

What type of institution do you currently work at and where is it located?

I work within Wilson Special Collections Library at the University of North Carolina at Chapel Hill’s University Libraries. I am situated in the Southern Folklife Collection, which holds the majority of audiovisual recordings in Wilson Library; however my team has expanded to work with all audiovisual recordings in the building as part of a new Andrew W. Mellon grant, Extending the Reach of Southern Audiovisual Sources: Expansion.

What do you love most about working with AV archival materials?

I’ve always been excited to learn about moving image and sound technologies and how they fit into historical contexts. Even if I know nothing about a collection except for the format, there’s enough there to understand the time and circumstances the recordings were created in. This is just as much the case for born-digital audiovisual files as it is for analog. We’ve seen file formats, codecs, and recording equipment go by the wayside, and so they exist as markers of a particular time.

What’s the biggest challenge affecting your work (and/or the field at large)?

Current and future digital video capabilities can provide a lot of options for documentarians and filmmakers, which is great news for them, but it also means there’s going to be a flood of new file formats with encodings and specifications we have not dealt with, many of which will already be difficult to access by the time they make it to our library because of planned obsolescence. We’ve already started to see these collections come in, and it’s impossible to normalize everything to our audiovisual target preservation specifications while still retaining quality for various reasons. Fortunately, there are a lot of folks thinking about this who are building some precedent when it comes to making decisions about the files. Julia Kim at Library of Congress, Rebecca Fraimow at WGBH, and I have also done a couple panel talks on this and recently put out an article through Code4Lib on this topic (https://journal.code4lib.org/articles/14244).

What advice would you give yourself as a student or professional first delving into digital archives work?

Everything can seem very overwhelming. There are a lot of directions to take in audiovisual preservation and archiving, and digital archiving and preservation is the shiny new frontier, but there’s a lot to gain by starting with what you know and taking it from there. I think building my knowledge and expertise in analog preservation risks inevitably helps me in tackling some of the more challenging aspects of born-digital audiovisual preservation.


Morgan McKeehan is the Digital Collections Specialist in the Repository Services Department, within the University Libraries at the University of North Carolina at Chapel Hill. In this role, she provides support for the management of and access to digitized and born-digital materials from across the Libraries’ special collections units.