Vote for the Winner: Caption These Bits! #2

Last week, we asked readers to submit captions for this image:

Source: 20110615_2_Computer_Equipment_2/data/originals/9440 from Collection ID: 3 Sub-Series 1968/69 Folder .032, Carleton College Archives.

Thanks to everyone who contributed ideas!

The BloggERS editorial team voted for the top three captions, and now we need your help choosing the winner. Cast your vote for your favorite caption by 5/6, and then we’ll announce the winner.

Digital Preservation in the News: Copyright and Abandonware

Heads up for anyone with an interest in video game preservation…

The Electronic Frontier Foundation (EFF) is seeking an exemption to the Prohibition on Circumvention of Copyright Protection Systems for Access Control Technologies (17 U.S.C. § 1201(a)(1)). The exemption is proposed for users who want to modify “videogames that are no longer supported by the developer, and that require communication with a server,” in order to serve player communities who want to maintain the functionality of their games, as well as “archivists, historians, and other academic researchers who preserve and study video games[.]” The proposal emphasizes that the games impacted by this exemption would not be persistent worlds (think World of Warcraft or Eve Online), but rather those games “that must communicate with a remote computer (a server) in order to enable core functionality, and that are no longer supported by the developer.”

The exemption is opposed by the Entertainment Software Association (ESA), which represents major American game publishers and platform providers. The ESA response to the EFF proposal argues that the scope of the proposed exemption is too broadly defined, and that “permitting circumvention of video game access controls would increase piracy, significantly reduce users’ options to access copyrighted works on platforms and devices, and decrease the value of these works for copyright owners[.]”

In addition to the comments by the EFF, ESA, and their respective supporters, there are also a number of articles which go into much greater detail on this issue.

What do you think? Should there be a legal exemption for modifying unsupported (but still copyright-protected) video games to ensure their enduring usability?

The latest round of public comment on the proposed exemption closes on May 1, 2015. To voice your opinion, follow this link, where you can learn more and submit a comment on this and other existing proposals.

Martin Gengenbach is an Assistant Archivist at the Gates Archive.

Caption These Bits! #2

All right, y’all, polish those puns: it’s time for Caption These Bits! round two!

Once a month, bloggERS invites readers to submit captions for images related to electronic records and the history of technology, sourced from archives around the world. Submit your caption below by 4/22. Digital archives, preservation, and curation humor encouraged.

We’ll choose three finalists and invite readers to vote for the winner.

This month’s image:

Source: 20110615_2_Computer_Equipment_2/data/originals/9440 from Collection ID: 3 Sub-Series 1968/69 Folder .032, Carleton College Archives.


A Little Too Personal

The following is a post by John Rees, Archivist and Digital Resources Manager at the National Library of Medicine, based on a breakout session at the ERS meeting of last year’s SAA annual meeting.

One of the breakout sessions at the 2014 ERS section meeting convened around the topic of identifying and redacting personally identifiable information (PII) and personal health information (PHI) from born-digital content. The premise I proposed was, “Health sciences archivists working in the paper world have a relatively easy time of identifying/restricting PII/PHI content. As we move to born-digital collecting we are especially in need of tools and techniques that will allow us to easily identify/restrict/release similar data in electronic form.”

Of course, this issue has relevance beyond health science archives. As we transition from paper-based models of archival processing to data processing models, machines should be able to interpret and act upon various content rules in an automated fashion. The healthcare industry is ahead of the curve in this area, building tools to anonymize any of the eighteen identifiers HIPAA defines as PHI in electronic health record data systems.

Archivists arguably face greater challenges than healthcare workers, sifting through the variety of semantic and unstructured PII found in the various formats traditionally referred to as personal papers, such as recommendation letters, correspondence with sensitive content, publication peer review commentary, etc. Human cognition can learn what these data are and identify them fairly easily during physical processing of paper material, but this requires more effort when triaging unstructured data on poorly labeled media—reading a list of filenames is not sufficient due diligence.

In general, the group felt confident in our ability to collect born-digital material but was much less confident in our ability to provide unmediated access to these records on the open web. Our discussion started off by sharing any tools we knew of that purport to locate PII/PHI in digital archival materials—the list was short:

The strength of these tools is that they can easily and quickly identify logically formatted PII such as social security numbers, email addresses, credit card numbers, phone numbers, and bank account numbers. Their weaknesses include too many false positives, expense of stand-alone proprietary software, narrow use cases, too much item-level manual intervention, and steep learning curves.
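To make the "logically formatted PII" point concrete, here is a minimal Python sketch of pattern-based scanning. It is not taken from any of the tools the group discussed. The patterns, the names `PATTERNS`, `luhn_valid`, and `find_pii`, and the US-centric formats are all assumptions made for illustration. The Luhn checksum step shows one common way to cut down on false positives for credit card numbers.

```python
import re

# Hypothetical, US-centric patterns for illustration only. Real tools
# use broader and more carefully tuned rules.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"(?<!\d)\(?\d{3}\)?[ .-]\d{3}[ .-]\d{4}(?!\d)"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
}


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text):
    """Return (label, matched_text) pairs for each pattern hit in `text`."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Drop digit runs that look like card numbers but fail the checksum.
            if label == "credit_card" and not luhn_valid(match.group()):
                continue
            hits.append((label, match.group()))
    return hits


if __name__ == "__main__":
    sample = (
        "Contact jane@example.org or 555-867-5309; SSN 123-45-6789; "
        "card 4111 1111 1111 1111; order 1234 5678 9012 3456."
    )
    for label, value in find_pii(sample):
        print(label, value)
```

A scanner like this only catches strings with a predictable shape. It would miss the semantic, unstructured PII found in correspondence or peer review commentary, which is the harder problem for personal papers.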

The group then talked about access protocols. Identifying PII to restrict requires significant effort, but models for access are almost nonexistent. This complicates matters when management wants collections to be as immediate and open as possible, the common refrain being, “It’s already digital, so why can’t we put it on the web as soon as it’s acquired?”

The breakout group agreed that from a risk management perspective, outside of manual review and item-level redaction of surrogates, limiting access to data was the easiest solution. Methods of limiting access include:

  • On-site-only access via read-only physical media or a disk image
  • On-site online access via an un-networked computer
  • Authentication paywalls or read-only virtual reading rooms on the open web

In the end we recognized the problem is complex and there are no magic solutions. However, each participant went away with the goal of making incremental progress toward a solution this year.