Welcome to the World of Tomorrow: Technology that should give archivists nightmares (or at least indigestion)

by Joshua Kitchens

Advances in technology should be looked at not so much as forward progress, but as a series of more complicated things for us to preserve. This is the complicated reality that we as archivists will be facing. For just a moment, instead of considering the present or looking backwards, let us look towards the bright and shiny tomorrow.

Quantum Computing

Quantum computing seems like a real thing. There were some doubts early on about whether or not the quantum computers that existed were real, but that sort of fits the whole definition of theoretical physics. Now it seems that qubits are the new bits. With Google and other tech companies leading the efforts to build machines that can calculate seemingly impossible things at speeds unheard of by today’s standards, say goodbye to simple 1’s and 0’s and hello to 1’s and 0’s in superposition and entangled, quantumly speaking. What kinds of records will these machines create? <Shrugs> It is impossible to know just yet, but they are coming, and we should be aware. Unfortunately, I doubt Al will be there to help us figure out where our leap into this new realm of computing has landed us.

Virtual Reality and Augmented Reality

Nothing quite gets my head spinning like thinking about how to deal with the inevitable virtual reality takeover. While we may get to luxuriate in digital evergreen fields with elves, orcs, and cyberspace marines, I can only expect the inevitable need to find a way to preserve these New Age sprites, as I can only imagine that in the future a peace treaty will be worked out between a 7-foot-tall virtual anthropomorphic moose and an overly cute chibi panda. While future historians will debate the meaning of 🙂 in the third line of that treaty, we will need to understand the significant properties and other aspects that should be preserved and what could be said of the record qualities of these virtual spaces. What sorts of technological preservation will be required for these environments? Will we feel an overwhelming sense of dread as we appraise these records? Think about the headset graveyard!!! We should also consider augmented reality. Augmented reality poses a complex issue. What is the record in this case: the Google Glass overlay onto the real world, or the data behind the overlay? I feel a bit like we are Morpheus searching for our Neo in this case. Will you be the One?

Video Games

In many respects, video games could be included in any discussion of virtual worlds, but for now, let’s take Mario head on, or shall we say feet first. Like virtual reality, video games are complex digital objects, but in addition to a game with systems for rendering pixels and dynamic worlds, there is usually a rabid and supportive fan base. These are primarily cultural spaces, sometimes based in a game, like World of Warcraft and Eve Online, and sometimes existing through forums and Twitter hashtags. These groups introduce new language, like “ult” for ultimate. They debate issues going beyond the game environment. Topics range from ethics to trans rights and much more. So for video games, part of understanding the complex record that is a game is understanding the various communities that have been created around it.


Blockchain

Blockchain is the new buzzword on the internet and in business these days. What started out as principally a vehicle and system for recording transactions of a currency unfettered from governmental controls has blossomed into a buzzword-fueled explosion of… well, I’m not entirely sure. What I do know is that graphics cards are prohibitively expensive now, and Kodak has licensed its name to a bitcoin mining company. Kodak has also allowed its name to be used for a company that wants to use blockchains to help track image rights. This is quite a development. Some researchers, such as Hrvoje Stancic, are already thinking about the implications of blockchains for archives and information professionals. So get ready, you might need your hacker specs for this one.

Diving into Computational Archival Science

by Jane Kelly

In December 2017, the IEEE Big Data conference came to Boston, and with it came the second annual computational archival science workshop! Workshop participants were generous enough to come share their work with the local library and archives community during a one-day public unconference held at the Harvard Law School. After some sessions from Harvard librarians that touched on how they use computational methods to explore archival collections, the unconference continued with lightning talks from CAS workshop participants and discussions about what participants need to learn to engage with computational archival science in the future.

So, what is computational archival science? It is defined by CAS scholars as:

“An interdisciplinary field concerned with the application of computational methods and resources to large-scale records/archives processing, analysis, storage, long-term preservation, and access, with aim of improving efficiency, productivity and precision in support of appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival material.”

Lightning round talks from CAS workshop participants (and they really did strike like a dozen 90-second bolts of lightning, I promise!) ranged from computational curation of digitized records to blockchain to topic modeling for born-digital collections. Following a voting session, participants broke into two rounds of large group discussions to dig deeper into lightning round topics. These discussions considered natural language processing, computational curation of cultural heritage archives, blockchain, and computational finding aids. Slides from lightning round presenters and community notes can be found on the CAS Unconference website.

Lightning round talks. (Image credit)


What did we learn? (What questions do we have now?)

Beyond learning a bit about specific projects that leverage computational methods to explore archival material, we discussed some of the challenges that archivists may bump up against when they want to engage with this work. More questions were raised than answered, but the questions can help us build a solid foundation for future study.

First, and for some of us in attendance perhaps the most important point, is the need to familiarize ourselves with computational methods. Do we have the specific technical knowledge to understand what it really means to say we want to use topic modeling to describe digital records? If not, how can we build our skills with community support? Are our electronic records suitable for computational processes? How might these issues change the way we need to conceptualize or approach appraisal, processing, and access to electronic records?

Many conversations repeatedly turned to issues of bias, privacy, and ethics. How do our biases shape the tools we build and use? What skills do we need to develop in order to recognize and dismantle biases in technology?

Word cloud from the unconference created by event co-organizer Ceilyn Boyd.


What do we need?

The unconference was intended to provide a space to bring more voices into conversations about computational methods in archives and, more specifically, to connect those currently engaged in CAS with other library and archives practitioners. At the end of the day, we worked together to compile a list of things that we felt many of us would need to learn in order to engage with CAS.

These needs include lists of methodologies and existing tools, canonical data and/or open datasets to use in testing such tools, a robust community of practice, postmortem analysis of current/existing projects, and much more. Building a community of practice and developing skills for folks without strong programming skills were identified as both particularly important and especially challenging.

Be sure to check out some of the lightning round slides and community notes to learn more about CAS as a field as well as specific projects!

Interested in connecting with the CAS community? Join the CAS Google Group at: computational-archival-science@googlegroups.com!

The Harvard CAS unconference was planned and administered by Ceilyn Boyd, Jane Kelly, and Jessica Farrell of Harvard Library, with help from Richard Marciano and Bill Underwood from the Digital Curation Innovation Center (DCIC) at the University of Maryland’s iSchool. Many thanks to all the organizers, presenters, and participants!

Jane Kelly is the Historical & Special Collections Assistant at the Harvard Law School Library. She will complete her MSLIS from the iSchool at the University of Illinois, Urbana-Champaign in December 2018.