Assessing the Digital Forensics Instruction Landscape with BitCuratorEdu

by Jess Farrell

This is the sixth post in the bloggERS Making Tech Skills a Strategic Priority series.

Over the past couple of months, we’ve heard a lot on bloggERS about how current students, recent grads, and mid-career professionals have made tech skills a strategic priority in their development plans. I like to think about the problem of “gaining tech skills” as being similar to “saving the environment”: individual action is necessary, but it is most effective when it feeds clearly into systemic action.

That raises the question: what root changes might educators of all types suggest and support to help GLAM professionals prioritize tech skills development? What are educator communities and systems – iSchools, faculty, and continuing education instructors – doing to achieve this? These questions are among those addressed by the BitCuratorEdu research project.

The BitCuratorEdu project is a three-year effort funded by the Institute of Museum and Library Services (IMLS) to study and advance the adoption of born-digital archiving and digital forensics tools and methods in libraries and archives through a range of professional education efforts. The project is a partnership between the School of Information and Library Science at the University of North Carolina at Chapel Hill and the Educopia Institute, along with the Council of State Archivists (CoSA) and nine universities that are educating future information professionals.

We’re addressing two main research questions:

  1. What are the primary institutional and technological factors that influence adoption of digital forensics tools and methods in different educational settings?
  2. What are the most viable mechanisms for sustaining collaboration among LIS programs on the adoption of digital forensics tools and methods?

The project started in September 2018 and will conclude in Fall 2021; throughout, Educopia and UNC SILS will be conducting ongoing research and releasing open educational resources on a rolling basis. With the help of our Advisory Board, made up of nine iSchools, and our Professional Experts Panel, composed of leaders in the GLAM sector, we’re:

  • Piloting instruction to produce and disseminate a publicly accessible set of learning objects that can be used by education providers to administer hands-on digital forensics education
  • Gathering information and centralizing existing educational content to produce guides and other resources, such as this (still-in-development) guide to datasets that can be used to learn new digital forensics skills or test digital archives software/processes
  • Investigating and reporting on institutional factors that facilitate, hinder and shape adoption of digital forensics educational offerings

Through this work and intentional community cultivation, we hope to advance a community of practice around digital forensics education through partner collaboration, wider engagement, and exploration of community sustainability mechanisms.

To support our research and steer the direction of the project, we have conducted and analyzed nine advisory board interviews with current faculty who have taught or are developing a curriculum for digital forensics education. So far we’ve learned that:

  • instructors want and need access to example datasets to use in the classroom (especially cultural heritage datasets);
  • many want lesson plans and activities for teaching born-digital archiving tools and environments like BitCurator in one or two weeks because few courses are devoted solely to digital forensics;
  • they want further guidance on how to facilitate hands-on digital forensics instruction in distributed online learning environments; and
  • they face challenges related to IT support at their home institutions, just like those grappled with by practitioners in the field.

This list barely scratches the surface of our exploration into the experiences and needs of instructors for providing more effective digital forensics education, and we’re excited to tackle the tough job of creating resources and instructional modules that address these and many other topics. We’re also interested in exploring how the resources we produce may also support continuing education needs across libraries, archives, and museums.

We recently conducted a Twitter chat with SAA’s SNAP Section to learn about students’ experiences in digital forensics learning environments. We heard a range of experiences, from students who reported they had no opportunity to learn about digital forensics in some programs, to students who received effective instruction that remained useful post-graduation. We hope that the learning modules released at the conclusion of our project will address students’ learning needs just as much as their instructors’ teaching needs.

Later this year, we’ll be conducting an educational provider survey that will gather information on barriers to adoption of digital forensics instruction in continuing education. We hope to present to and conduct workshops for a broader set of audiences including museum and public records professionals.

Our deliverables, from conference presentations to learning modules, will be released openly and freely through a variety of outlets including the project website, the BitCurator Consortium wiki, and YouTube (for recorded webinars). Follow along at the project website or contact jess.farrell@educopia.org if you have feedback or want to share your insights with the project team.

 

Authors’ bios:

Jess Farrell is the project manager for BitCuratorEdu and community coordinator for the Software Preservation Network at Educopia Institute. Katherine Skinner is the Executive Director of Educopia Institute, and Christopher (Cal) Lee is Associate Professor at the School of Information and Library Science at the University of North Carolina, Chapel Hill, teaching courses on archival administration, records management, and digital curation. Katherine and Cal are Co-PIs on the BitCuratorEdu project, funded by the Institute of Museum and Library Services.


Preserve This Podcast!

by Molly Schwartz

Mary Kidd (MLIS ’14) and Dana Gerber-Margie (MLS ’13) first met at a Radio Preservation Task Force meeting in 2016. They bonded over experiences of conference fatigue, but quickly moved on to topics near and dear to both of their hearts: podcasts and audio archiving. Dana Gerber-Margie has been a long-time podcast super-listener. She subscribes to over 1,400 podcasts, and she regularly listens to 40-50 of them. While getting her MLS, she launched a podcast recommendation newsletter called “The Audio Signal,” which has grown into a popular podcast publication called The Bello Collective. Mary was a National Digital Stewardship Resident at WNYC, where she was creating a born-digital preservation strategy for their archives. She had worked on analog archives projects in the past — scanning and transferring collections of tapes — but she’s embraced the madness and importance of preserving born-digital audio. Mary and Dana stayed in touch and continued to brainstorm ideas, which blossomed into a workshop about podcast preservation that they taught at the Personal Digital Archives conference at Stanford in 2017, along with Anne Wootton (co-founder of Popup Archive, now at Apple Podcasts).

Then Mary and I connected at the National Digital Stewardship Residency symposium in Washington, DC in 2017. I got my MLS back in 2013, but since then I’ve been working more at the intersection of media, storytelling, and archives. I had started a podcast and was really interested, for selfish reasons, in learning the most up-to-date best practices for born-digital audio preservation. I marched straight up to Mary and said something like, “hey, let’s work together on an audio preservation project.” Mary set up a three-way Skype call with Dana on the line, and pretty soon we were talking about podcasts. How we love them. How they are at risk because most podcasters host their files on commercial third-party platforms. And how we would love to do a massive outreach and education program where we teach podcasters that their digital files are at risk and give them techniques for preserving them. We wrote these ideas into a grant proposal, with a few numbers and a budget attached, and the Andrew W. Mellon Foundation gave us $142,000 to make it happen. We started working on this grant project, called “Preserve This Podcast,” back in February 2018. We’ve been able to hire people who are just as excited about the idea to help us make it happen. Like Sarah Nguyen, a current MLIS student at the University of Washington and our amazing Project Coordinator.

Behaviors chart from the Preserve This Podcast! survey.

One moral of this story is that digital archives conferences really can bring people together and inspire them to advance the field. The other moral of the story is that, after months of consulting audio preservation experts, interviewing podcasters, surveying 556 podcasters, and reading about the history of podcasting, we can confirm that podcasts are disappearing and that podcast producers are not adequately equipped to preserve their work against the many forces working against the long-term endurance of digital information and the devices that render it. There is more information about the project on our website (preservethispodcast.org) and in the report on the survey findings. Please reach out to mschwartz@metro.org or snguyen@metro.org if you have any thoughts or ideas.


Molly Schwartz is the Studio Manager at the Metropolitan New York Library Council (METRO). She is the host and producer of two podcasts about libraries and archives — Library Bytegeist and Preserve This Podcast. Molly did a Fulbright grant at the Aalto University Media Lab in Helsinki, was part of the inaugural cohort of National Digital Stewardship Residents in Washington, D.C., and worked at the U.S. State Department as a data analyst. She holds an MLS with a specialization in Archives, Records and Information Management from the University of Maryland at College Park and a BA/MA in History from the Johns Hopkins University.

IEEE Big Data 2018: 3rd Computational Archival Science (CAS) Workshop Recap

by Richard Marciano, Victoria Lemieux, and Mark Hedges

Introduction

The 3rd workshop on Computational Archival Science (CAS) was held on December 12, 2018, in Seattle, following two earlier CAS workshops in 2016 in Washington DC and in 2017 in Boston. It also built on three earlier workshops on ‘Big Humanities Data’ organized by the same chairs at the 2013-2015 conferences, and more directly on a symposium held in April 2016 at the University of Maryland. The current working definition of CAS is:

A transdisciplinary field that integrates computational and archival theories, methods and resources, both to support the creation and preservation of reliable and authentic records/archives and to address large-scale records/archives processing, analysis, storage, and access, with the aim of improving efficiency, productivity and precision, in support of recordkeeping, appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival material [1].

The workshop featured five sessions and thirteen papers with international presenters and authors from the US, Canada, Germany, the Netherlands, the UK, Bulgaria, South Africa, and Portugal. All details (photos, abstracts, slides, and papers) are available at: http://dcicblog.umd.edu/cas/ieee-big-data-2018-3rd-cas-workshop/. The keynote focused on using digital archives to preserve the history of WWII Japanese-American incarceration and featured Geoff Froh, Deputy Director at Densho.org in Seattle.

Keynote speaker Geoff Froh, Deputy Director at Densho.org in Seattle presenting on “Reclaiming our Story: Using Digital Archives to Preserve the History of WWII Japanese American Incarceration.”

This workshop explored the conjunction (and its consequences) of emerging methods and technologies around big data with archival practice and new forms of analysis and historical, social, scientific, and cultural research engagement with archives. The aim was to identify and evaluate current trends, requirements, and potential in these areas, to examine the new questions that they can provoke, and to help determine possible research agendas for the evolution of computational archival science in the coming years. At the same time, we addressed the questions and concerns scholarship is raising about the interpretation of ‘big data’ and the uses to which it is put, in particular appraising the challenges of producing quality – meaning, knowledge and value – from quantity, tracing data and analytic provenance across complex ‘big data’ platforms and knowledge production ecosystems, and addressing data privacy issues.

Sessions

  1. Computational Thinking and Computational Archival Science
  • #1: Introducing Computational Thinking into Archival Science Education [William Underwood et al.]
  • #2: Automating the Detection of Personally Identifiable Information (PII) in Japanese-American WWII Incarceration Camp Records [Richard Marciano et al.]
  • #3: Computational Archival Practice: Towards a Theory for Archival Engineering [Kenneth Thibodeau]
  • #4: Stirring The Cauldron: Redefining Computational Archival Science (CAS) for The Big Data Domain [Nathaniel Payne]
  2. Machine Learning in Support of Archival Functions
  • #5: Protecting Privacy in the Archives: Supervised Machine Learning and Born-Digital Records [Tim Hutchinson]
  • #6: Computer-Assisted Appraisal and Selection of Archival Materials [Cal Lee]
  3. Metadata and Enterprise Architecture
  • #7: Measuring Completeness as Metadata Quality Metric in Europeana [Péter Király et al.]
  • #8: In-place Synchronisation of Hierarchical Archival Descriptions [Mike Bryant et al.]
  • #9: The Utility Enterprise Architecture for Records Professionals [Shadrack Katuu]
  4. Data Management
  • #10: Framing the scope of the common data model for machine-actionable Data Management Plans [João Cardoso et al.]
  • #11: The Blockchain Litmus Test [Tyler Smith]
  5. Social and Cultural Institution Archives
  • #12: A Case Study in Creating Transparency in Using Cultural Big Data: The Legacy of Slavery Project [Ryan Cox, Sohan Shah et al.]
  • #13: Jupyter Notebooks for Generous Archive Interfaces [Mari Wigham et al.]

Next Steps

Updates will continue to be provided through the CAS Portal website (http://dcicblog.umd.edu/cas) and through a Google Group you can join at computational-archival-science@googlegroups.com.

Several related events are scheduled in April 2019: (1) a 1 ½ day workshop on “Developing a Computational Framework for Library and Archival Education” will take place on April 3 & 4, 2019, at the iConference 2019 event (See: https://iconference2019.umd.edu/external-events-and-excursions/ for details), and (2) a “Blue Sky” paper session on “Establishing an International Computational Network for Librarians and Archivists” (See: https://www.conftool.com/iConference2019/index.php?page=browseSessions&form_session=356).

Finally, we are planning a 4th CAS Workshop in December 2019 at the 2019 IEEE International Conference on Big Data (IEEE BigData 2019) in Los Angeles, CA. Stay tuned for an upcoming CAS#4 workshop call for proposals, where we would welcome SAA member contributions!

References

[1] Marciano, R., Lemieux, V., Hedges, M., Esteva, M., Underwood, W., Kurtz, M., & Conrad, M. “Archival Records and Training in the Age of Big Data.” In J. Percell, L. C. Sarin, P. T. Jaeger, & J. C. Bertot (Eds.), Re-Envisioning the MLS: Perspectives on the Future of Library and Information Science Education (Advances in Librarianship, Vol. 44B, pp. 179-199). Emerald Publishing Limited, May 17, 2018. http://dcicblog.umd.edu/cas/wp-content/uploads/sites/13/2017/06/Marciano-et-al-Archival-Records-and-Training-in-the-Age-of-Big-Data-final.pdf


Richard Marciano is a professor at the University of Maryland iSchool, where he directs the Digital Curation Innovation Center (DCIC). He previously conducted research at the San Diego Supercomputer Center at the University of California San Diego for over a decade. His research interests center on digital preservation, sustainable archives, cyberinfrastructure, and big data. He is also the 2017 recipient of the Emmett Leahy Award for achievements in records and information management. Marciano holds degrees in Avionics and Electrical Engineering and a Master’s and Ph.D. in Computer Science from the University of Iowa. In addition, he conducted postdoctoral research in Computational Geography.

Victoria Lemieux is an associate professor of archival science at the iSchool and lead of the Blockchain research cluster, Blockchain@UBC at the University of British Columbia – Canada’s largest and most diverse research cluster devoted to blockchain technology. Her current research is focused on risk to the availability of trustworthy records, in particular in blockchain record keeping systems, and how these risks impact upon transparency, financial stability, public accountability and human rights. She has organized two summer institutes for Blockchain@UBC to provide training in blockchain and distributed ledgers, and her next summer institute is scheduled for May 27-June 7, 2019. She has received many awards for her professional work and research, including the 2015 Emmett Leahy Award for outstanding contributions to the field of records management, a 2015 World Bank Big Data Innovation Award, a 2016 Emerald Literati Award and a 2018 Britt Literary Award for her research on blockchain technology. She is also a faculty associate at multiple units within UBC, including the Peter Wall Institute for Advanced Studies, Sauder School of Business, and the Institute for Computers, Information and Cognitive Systems.

Mark Hedges is a Senior Lecturer in the Department of Digital Humanities at King’s College London, where he teaches on the MA in Digital Asset and Media Management, and is also Departmental Research Lead. His original academic background was in mathematics and philosophy, and he gained a PhD in mathematics at University College London before embarking on a 17-year career in the software industry; he joined King’s in 2005. His research is concerned primarily with digital archives, research infrastructures, and computational methods, and he has led a range of projects in these areas over the last decade. Most recently, he has been working in Rwanda on initiatives relating to digital archives and the transformative impact of digital technologies.

A Recap of “DAM if you do and DAM if you don’t!”

by Regina Carra

When: December 3, 2018

Where: Metropolitan New York Library Council (METRO), New York, NY

Speakers:

  • Stephen Klein, Digital Services Librarian at the CUNY Graduate Center (CUNY)
  • Ashley Blewer, AV Preservation Specialist at Artefactual
  • Kelly Stewart, Digital Preservation Services Manager at Artefactual

On December 3, 2018, the Metropolitan New York Library Council (METRO)’s Digital Preservation Interest Group hosted an informative (and impeccably titled) presentation about how the CUNY Graduate Center (GC) plans to incorporate Archivematica, a web-based, open-source digital asset management system (DAMs) developed by Artefactual, into its document management strategy for student dissertations. Speakers included Stephen Klein, Digital Services Librarian at the CUNY Graduate Center; Ashley Blewer, AV Preservation Specialist at Artefactual; and Kelly Stewart, Digital Preservation Services Manager at Artefactual. The presentation began with an overview from Stephen about the GC’s needs and why they chose Archivematica as a DAMs, followed by an introduction to and demo of Archivematica and Duracloud, an open-source cloud storage service, led by Ashley and Kelly (who was presenting via video-conference call). While this post provides a general summary of the presentation, I would recommend reaching out to any of the presenters for more detailed information about their work. They were all great!

Every year the GC Library receives between 400 and 500 dissertations, theses, and capstones. These submissions can include a wide variety of digital materials, from PDF, video, and audio files to websites and software. Preservation of these materials is essential if the GC is to provide access to emerging scholarship and retain a record of students’ work towards their degrees. Prior to implementing a DAMs, however, the GC’s strategy for managing digital files of student work was focused primarily on access, not preservation. Access copies of student work were available on CUNY Academic Works, a site that uses Bepress Digital Commons as a CMS. Missing from the workflow, however, was the creation, storage, and management of archival originals. As Stephen explained, if the Open Archival Information System (OAIS) model is a guide for a proper digital preservation workflow, the GC was without the middle, Archival Information Package (AIP), portion of it. Some of the qualities that the GC liked about Archivematica were that it was open source and highly customizable, came with strong customer support from Artefactual, and had an API that could integrate with tools already in use at the library. GC Library staff hope that Archivematica can eventually integrate with both the library’s electronic submission system (Vireo) and CUNY Academic Works, making the submission, preservation, and access of digital dissertations a much more streamlined, automated, and OAIS-compliant process.
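As a deliberately simplified illustration of that missing middle step (not the GC’s actual workflow, and far shallower than what Archivematica automates), an “archival original” can be as little as the submitted files wrapped as a BagIt bag with checksums and minimal metadata. The directory name and metadata values below are invented; the sketch uses the Library of Congress’s bagit-python library:

```python
import bagit  # Library of Congress bagit-python: pip install bagit

# Hypothetical directory holding the as-submitted dissertation files
# (PDF, supplementary audio/video, etc.) before any access derivatives are made.
submission_dir = "etd_2019_0001"

# Wrap the directory as a BagIt bag: files move into data/, and checksum
# manifests plus bag-info.txt metadata are written alongside them.
bag = bagit.make_bag(
    submission_dir,
    {"Source-Organization": "Example Graduate Center Library",
     "External-Identifier": "etd_2019_0001"},
    checksums=["sha256"],
)

# Fixity can be re-verified later, for example before or after moving the bag to storage.
print("Bag is valid:", bag.is_valid())
```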

A sample of one of Duracloud’s data visualization graphs from the presentation slides.

Next, Ashley and Kelly introduced and demoed Archivematica and Duracloud. I was very pleased to see several features of the Archivematica software that were made intentionally intuitive. The design of the interface is very clean and easily customizable to fit different workflows. Also, each AIP that is processed includes a plain-text, human-readable file which serves as extra documentation explaining what Archivematica did to each file. Artefactual recommends pairing Archivematica with Duracloud, although users can choose to integrate the software with local storage or with other cloud services like those offered by Google or Amazon. One of the features I found really interesting about Duracloud is that it comes with various data visualization graphs that show the user how much storage is available and what materials are taking up the most space.

I close by referencing something Ashley wrote in her recent bloggERS post (conveniently she also contributed to this event). She makes an excellent point about how different skill-sets are needed to do digital preservation, from the developers that create the tools that automate digital archival processes to the archivists that advocate for and implement said tools at their institutions. I think this talk was successful precisely because it included the practitioner and vendor perspectives, as well as the unique expertise that comes with each role. Both are needed if we are to meet the challenges and tap into the potential that digital archives present. I hope to see more of these “meetings of the minds” in the future.

(For more info: Stephen and Ashley and Kelly have generously shared their slides!)


Regina Carra is the Archive Project Metadata and Cataloging Coordinator at Mark Morris Dance Group. She is a recent graduate of the Dual Degree MLS/MA program in Library Science and History at Queens College – CUNY.

Call for Contributions: Script It!

Scripting and working in the command line have become increasingly important skills for archivists, particularly for those who work with digital materials — at the same time, approaching these tools as a beginner can be intimidating. This series hopes to help break down barriers by allowing archivists to learn from their peers. We want to hear about how you use or are learning to use scripts (Bash, Python, Ruby, etc.) or the command line (one-liners, a favorite command line tool) in your day-to-day work, how scripts play into your processes and workflows, and how you are developing your knowledge in this area. How has this changed the way you think about your work? How has this changed your relationship with your colleagues or other stakeholders?

We’re particularly interested in posts that consist of a walk-through of a simple script (or one-liner) used in your digital archives workflow. Show us your script or command and tell us how it works.
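For anyone wondering what such a walk-through might look like, here is a minimal, hypothetical example of the kind of small script we mean; the directory and manifest names are placeholders rather than pieces of any contributor’s actual workflow. It hashes every file in a transfer directory and writes a simple checksum manifest:

```python
import hashlib
from pathlib import Path

TRANSFER_DIR = Path("transfer")               # placeholder: directory awaiting ingest
MANIFEST = TRANSFER_DIR / "manifest-md5.txt"  # placeholder manifest name

def md5sum(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large audio/video files don't exhaust memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

with MANIFEST.open("w") as out:
    for f in sorted(TRANSFER_DIR.rglob("*")):
        if f.is_file() and f != MANIFEST:
            # one "checksum  relative/path" line per file
            out.write(f"{md5sum(f)}  {f.relative_to(TRANSFER_DIR)}\n")

print(f"Wrote {MANIFEST}")
```

A post could walk through something like this line by line: why you hash in chunks, where the manifest goes next in your workflow, and what broke the first time you ran it.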

A few other potential topics and themes for posts:

  • Stories of success or failure with scripting for digital archives
  • General “tips and tricks” for the command line/scripting
  • Independent or collaborative learning strategies for developing “tech” skills
  • A round-up of resources about a particular scripting language or related topic
  • Applying computational thinking to digital archives

Writing for bloggERS! “Script It!” Series

  • We encourage visual representations: Posts can include or largely consist of comics, flowcharts, a series of memes, etc!
  • Written content should be roughly 600-800 words in length
  • Write posts for a wide audience: anyone who stewards, studies, or has an interest in digital archives and electronic records, both within and beyond SAA
  • Align with other editorial guidelines as outlined in the bloggERS! guidelines for writers.

Posts for this series will start in July, so let us know if you are interested in contributing by sending an email to ers.mailer.blog@gmail.com!

Inaugural #bdaccess Bootcamp: A Success Story

By Margaret Peachy

This post is the nineteenth in a bloggERS series about access to born-digital materials.

____

At this year’s New England Archivists Spring Meeting, archivists who work with born-digital materials had the opportunity to attend the inaugural Born-Digital Access Bootcamp. The bootcamp was an idea generated at the born-digital hackfest, part of a session at SAA 2015, where a group of about 50 archivists came together to tackle the problem facing most archival repositories: How do we provide access to born-digital records, which can have different technical and ethical requirements than digitized materials?  Since 2015, a team has come together to form a bootcamp curriculum, reach out to organizations outside of SAA, and organize bootcamps at various conferences.

Excerpt of results from a survey administered in advance of the Bootcamp.

Alison Clemens and Jessica Farrell facilitated the day-long camp, which had about 30 people in attendance from institutions of all sizes and types, though the majority were academic. The attendees also brought a broad range of experience to the camp, from those just starting out thinking about this issue, to those who have implemented access solutions.


A Case Study in Failure (and Triumph!) from the Records Management Perspective

By Sarah Dushkin

____

This is the sixth post in the bloggERS series #digitalarchivesfail: A Celebration of Failure.

I’m the Records Coordinator for a global energy engineering, procurement, and construction  contractor, herein referred to as the “Company.” The Company does design, fabrication, installation, and commissioning of upstream and downstream technologies for operators. I manage the program for our hard copy and electronic records produced from our Houston office.

A few years ago our Records Management team was asked by the IT department to help create a process to archive digital records of closed projects created out of the Houston office. I saw the effort as an opportunity to expand the scope and authority of our records program to include digital records. Up to this point, our practice only covered paper records, and we asked employees to apply the paper record policies to their own electronic records.

The Records Management team’s role was limited to providing IT with advice on how to deploy a software tool where files could be stored for a long-term period. We were not included in the discussions on which software tool to use. It took us over a year to develop the new process with IT and standardize it into a published procedure. We had many areas of triumph and failure throughout the process. Here is a synopsis of the project.

Objective:
IT was told that retaining closed projects files on the local server was an unnecessary cost and was tasked with removing them. IT reached out to Records Management to develop a process to maintain the project files for the long-term in a more cost-effective solution that was nearline or offline, where records management policies could be applied.

Vault:
The software chosen was a proprietary cloud-based file storage center or “vault.” It has search, tagging, and records disposition capabilities. It is more cost-effective than storing files on the local server.

Process:
At 80% project completion, Records Management reaches out to active projects to discover their methods for storing files and the project completion schedule. 80% engineering completion is an important timeline for projects because most of the project team is still involved and the bulk of the work is complete. Records Management also gains knowledge of the project schedule so we can accurately apply the two-year timespan to when the files will be migrated off the local server and to the vault.  The two-year time span was created to ensure that all project files would be available to the project team during the typical warranty period. Two years after a project is closed, all technical files and data are exported from the current management system and ingested into the vault, and access groups are created so employees can view and download the files for reference as needed.
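To make the timing rule above concrete, here is a minimal sketch of how a coordinator might flag projects that have passed the two-year post-closure window. The CSV file, its column names, and the 730-day figure are hypothetical stand-ins, not the Company’s actual systems or data:

```python
from datetime import date, timedelta
import csv

# Warranty-period buffer described above: two years after project closure.
RETENTION_BEFORE_MIGRATION = timedelta(days=2 * 365)
TODAY = date.today()

def ready_for_vault(close_date: date) -> bool:
    """A closed project becomes eligible once two years have passed since closure."""
    return TODAY - close_date >= RETENTION_BEFORE_MIGRATION

# Hypothetical export of closed projects: project_id, close_date (YYYY-MM-DD).
with open("closed_projects.csv", newline="") as f:
    for row in csv.DictReader(f):
        close = date.fromisoformat(row["close_date"])
        if ready_for_vault(close):
            print(f"{row['project_id']}: export and ingest into the vault")
        else:
            remaining = (close + RETENTION_BEFORE_MIGRATION) - TODAY
            print(f"{row['project_id']}: keep on local server ({remaining.days} days left)")
```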

Deployment:
Last year, we began to apply the process to large active projects that had passed 80% engineering completion. Large projects are those with more than 5 million in revenue.

Observations:
Recently we have begun to audit the whole project with IT, and are just now identifying our areas of failure and triumph. We will conduct an analysis of these areas and assess where we can make improvements.

Our big areas of failure were related to stakeholder involvement in the development, deployment, and utilization of the vault.

Stakeholders, including the Records Management team, were not involved in the selection or development of the vault software tool. As a result, the vault development project lacked the resources required to make it as successful as possible.

In the deployment of the vault, we did not create an outreach campaign with training courses that would introduce the tool across our very large company. Due to this, many employees are still unaware of the vault. When we talk with departments and projects about methods to save old files for less money they are reluctant to try the solution because it seems like another way for IT to save money from their budget without thinking about the greater needs of the company. IT is still viewed as a support function that is inessential to the Company’s philosophy.

Lastly, we did not have methods to export project files from all systems for ingest into the vault; nor did we, in North America, have the authority to develop that solution. To be effective, that type of decision and process can only be developed by our corporate office in another country. The Company also does not make information about project closure available to most employees. A project end date can be determined by several factors, including when the final invoice was received or the end of the warranty period. This type of information is essential to the information lifecycle of a project, and since we had no involvement from upper level management, we were not able to devise a solution for easily discovering this information.

We had some triumphs throughout the process, though. Our biggest triumph is that this project gave Records Management an opportunity to showcase our knowledge of records retention and its value as a method to save money and maintain business continuity. We were able to collaborate with IT and promulgate a process. It gave us a great opportunity to grow by building stronger relationships with the business lines. Although some departments and teams are still skeptical about the value of the vault, when we advertise it to other project teams, they see the vault as evidence that the Company cares about preserving their work. We earned our seat at the table with these players, but we still have to work on winning over more projects and departments. We’ve also preserved more than 30 TB of records and saved the Company several thousand dollars by ingesting inactive project files into the vault.

I am optimistic that when we have support from upper management, we will be able to improve the vault process and infrastructure, and create an effective solution for utilizing records management policies to ensure legal compliance, maintain business continuity, and save money.

____

Sarah Dushkin earned her MSIS from the University of Texas at Austin School of Information with a focus in Archival Enterprise and Records Management. Afterwards, she sought to diversify her expertise by working outside of the traditional archival setting and moved to Houston to work in oil and gas. She has collaborated with management from across her company to refine their records management program and develop a process that includes the retention of electronic records and data. She lives in Sugar Land, Texas with her husband.

Fail4Lib: Acknowledging and Embracing Professional Failure

By Andreas Orphanides

____

This is the fifth post in the bloggERS series #digitalarchivesfail: A Celebration of Failure.

It could be worse.
Image title: Train wreck at Montparnasse
Credit: Studio Lévy et Fils, 1895
Copyright: Public domain

When was the last time you totally, completely, utterly loused up a project or a report or some other task in your professional life? When was the last time you dissected that failure, in meticulous detail, in front of a room full of colleagues? Let’s face it: we’ve all had the first experience, and I’d wager that most of us would pay good money to avoid the second.

It’s a given that we’ll all encounter failure professionally, but there’s a strong cultural disincentive to talk about it. Failure is bad. It is to be avoided at all costs. And should one fail, that failure should be buried away in a dark closet with one’s other skeletons. At the same time, it’s well acknowledged that failure is a critical step on the path to success. It’s only through failing and learning from that experience that we can make the necessary course corrections. In that sense, refusing to acknowledge or unpack failure is a disservice: failure is more valuable when well-understood than when ignored.

This philosophy — that we can gain value from failure by acknowledging and understanding it openly — is the underlying principle behind Fail4Lib, the perennial preconference workshop that takes place at the annual Code4Lib conference, and which completed its fifth iteration (Fail5Lib!) at Code4Lib 2017 in Los Angeles. Jason Casden (now of UNC Libraries) originally conceived of the Fail4Lib idea, and together he and I developed the concept into a workshop about understanding, analyzing, and coming to terms with professional failure in a safe, collegial environment.

Participants in a Fail4Lib workshop engage in a number of activities to foster a healthier relationship with failure: case study discussions to analyze high-profile failures such as the Challenger disaster and the Volkswagen diesel emissions scandal; lightning talks where brave souls share their own professional failures and talk about the lessons they learned; and an open bull session about risk, failure, and organizational culture, to brainstorm on how we can identify and manage failure, and how to encourage our organizations to become more failure-tolerant.

Fail4Lib’s goal is to help its participants to get better at failing. By practicing talking about and thinking about failure, we position ourselves to learn more from the failures of others as well as our own future failures. By sharing and talking through our failures we maximize the value of our experiences, we normalize the practice of openly acknowledging and discussing failure, and we reinforce the message to participants that it happens to all of us. And by brainstorming approaches to allow our institutions to be more failure-tolerant, we can begin making meaningful organizational change towards accepting failure as part of the development process.

The principles I’ve outlined here not only form the framework for the Fail4Lib workshop, they also represent a philosophy for engaging with professional failure in a constructive and blameless way. It’s only by normalizing the experience of failure that we can gain the most from it; in so doing, we make failure more productive, we accelerate our successes, and we make ourselves more resilient.

____

Andreas Orphanides is Associate Head, User Experience at the NCSU Libraries, where he develops user-focused solutions to support teaching, learning, and information discovery. He has facilitated Fail4Lib workshops at the annual Code4Lib conference since 2013. He holds a BA from Oberlin College and an MSLS from UNC-Chapel Hill.

OSS4Pres 2.0: Design Requirements for Better Open Source Tools

By Heidi Elaine Kelly

____

This is the second post in the bloggERS series describing outcomes of the #OSS4Pres 2.0 workshop at iPRES 2016, addressing open source tool and software development for digital preservation. This post outlines the work of the group tasked with “drafting a design guide and requirements for Free and Open Source Software (FOSS) tools, to ensure that they integrate easily with digital preservation institutional systems and processes.” 

The FOSS Development Requirements Group set out to create a design guide for FOSS tools to ensure easier adoption of open-source tools by the digital preservation community, including their integration with common end-to-end software and tools supporting digital preservation and access that are now in use by that community. 

The group included representatives of large digital preservation and access projects such as Fedora and Archivematica, as well as tool developers and practitioners, ensuring a range of perspectives were represented. The group’s initial discussion led to the creation of a list of minimum necessary requirements for developing open source tools for digital preservation, based on similar examples from the Open Preservation Foundation (OPF) and from other fields. Below is the draft list that the group came up with, followed by some intended future steps. We welcome feedback or additions to the list, as well as suggestions for where such a list might be hosted long term.

Minimum Necessary Requirements for FOSS Digital Preservation Tool Development

Necessities

  • Provide publicly accessible documentation and an issue tracker
  • Have a documented process for how people can contribute to development, report bugs, and suggest new documentation
  • Every tool should do the smallest possible task really well; if you are developing an end-to-end system, develop it in a modular way in keeping with this principle
  • Follow established standards and practices for development and use of the tool
  • Keep documentation up-to-date and versioned
  • Follow test-driven development philosophy
  • Don’t develop a tool without use cases, and stakeholders willing to validate those use cases
  • Use an open and permissive software license to allow for integrations and broader use

Recommendations

  • Have a mailing list, Slack or IRC channel, or other means for community interaction
  • Establish community guidelines
  • Provide a well-documented mechanism for integration with other tools/systems in different languages
  • Provide the tool’s functionality as a library, separating the GUI from the actual functions (see the sketch following this requirements list)
  • Package the tool in an easy-to-use way; the more broadly you want the tool to be used, the more operating systems you should package it for
  • Use a packaging format that supports any dependencies
  • Provide examples of functionality for potential users
  • Consider the organizational home or archive for the tool for long-term sustainability; develop your tool based on potential organizations’ guidelines
  • Consider providing a mechanism for internationalization of your tool (this is a broader community need as well, to identify the tools that exist and to incentivize this)

Premise

  • Digital preservation is an operating system-agnostic field
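To make one of the recommendations above concrete (keeping a tool’s functionality in an importable library, with the interface in a thin wrapper), here is a minimal sketch. The module and function names are invented for illustration and are not drawn from any existing tool:

```python
"""fixity_check.py -- illustrative sketch: core logic importable, CLI kept separate."""
import argparse
import hashlib
from pathlib import Path


def checksum(path: Path, algorithm: str = "sha256") -> str:
    """Core function: other tools and test suites can import and call this directly."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def main() -> None:
    """Thin CLI wrapper around the library function; no preservation logic lives here."""
    parser = argparse.ArgumentParser(description="Print a checksum for one file.")
    parser.add_argument("file", type=Path)
    parser.add_argument("--algorithm", default="sha256")
    args = parser.parse_args()
    print(checksum(args.file, args.algorithm))


if __name__ == "__main__":
    main()
```

Because checksum() is importable on its own, an end-to-end system can call or test it directly without shelling out to the command line, which is what makes the kinds of integrations described above possible.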

Next Steps

Feedback and Perspectives. Because of the expense of the iPRES conference (and its location in Switzerland), all of the group members were from relatively large and well-resourced institutions. The perspective of under-resourced institutions is very often left out of open-source development communities, as they are unable to support and contribute to such projects; in this case, this design guide would greatly benefit from the perspective of such institutions as to how FOSS tools can be developed to better serve their digital preservation needs. The group was also largely from North America and Europe, so this work would eventually benefit greatly from adding perspectives from the FOSS and digital preservation communities in South America, Asia, and Africa.

Institutional Home and Stewardship. When finalized, the FOSS development requirements list should live somewhere permanently and develop based on the ongoing needs of our community. As this line of communication between practitioners and tool developers is key to the continual development of better and more user-friendly digital preservation tools, we should continue to build on the work of this group.

Referenced FOSS Tool and Community Guides

____

Heidi Elaine Kelly is the Digital Preservation Librarian at Indiana University, where she is responsible for building out the infrastructure to support long-term sustainability of digital content. Previously she was a DiXiT fellow at Huygens ING and an NDSR fellow at the Library of Congress.