4 thoughts on “Processing Metadata with Python: An “os.walk”through”

  1. themadgiggler July 19, 2018 / 11:55 am

    This is great! I love hearing about archivists who don’t have a programming background learning to do these things. I think learning to write scripts would be very useful for me. The problem is finding the time to learn how to do it. I’ve tried several times over the past few years, but I can never cobble together enough time to figure things out as I have too many other competing priorities. Still, reading this gives me hope that maybe I can do this too.


    • Alejandra Dean August 3, 2018 / 10:09 am

      Thanks for your comment! It’s taken me close to a year to really feel comfortable using Python, but that’s because I’ve also been learning on my own time at home and with free online resources. I’ve found the W3Schools tutorial on Python to be pretty helpful in relaying some of the fundamentals. There’s also a free online manual called “Automate the boring stuff” that explains how to install Python to your system and get started, but I think it might only cover Python 2 — you’d have to double-check on that. Overall what’s been the most helpful has been consistently attempting to apply Python IRL — tackling day-to-day data-entry tasks has been really productive.


  2. Anarchivist (@viciouslibrary) July 31, 2018 / 2:33 pm

    Great post. I’m curious, when you uploaded the metadata to Preservica, how did you package the deliverable units, i.e. did you upload 1 tiff with 1 .metadata file or upload in bulk and have all the data at the DU level?


    • Alejandra Dean August 3, 2018 / 12:49 pm

      Thanks! When we generate our descriptive metadata, we do it on a one-to-one basis. Preservica’s SIP Creator will associate XML metadata with content if the XML files conform to a specific naming convention: the XML filename needs to match the filename of the record and include a “.metadata” extension. Our XSL stylesheet transforms the single XML file we’ve converted using Python into individual .metadata files in one step, but it would be great to accomplish this with Python as well (then we could combine scripts and run everything at once).
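      As a rough illustration of how that last step might look in Python, here is a minimal sketch that splits one combined XML file into individual .metadata files. It is not the stylesheet or script described above. The element names `record` and `filename` are hypothetical placeholders for whatever the combined XML actually uses, and the sketch assumes the sidecar name is the record's full filename with ".metadata" appended, which should be checked against Preservica's documentation.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_metadata(combined_xml, out_dir, record_tag="record", name_tag="filename"):
    """Write one .metadata file per record found in combined_xml.

    record_tag and name_tag are hypothetical element names; change them
    to match the structure of your own XML.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    root = ET.parse(str(combined_xml)).getroot()
    for record in root.iter(record_tag):
        name = record.findtext(name_tag)
        if not name or not name.strip():
            continue  # skip records that do not name a file
        # Assumption: sidecar name = record filename + ".metadata"
        target = out_dir / (Path(name.strip()).name + ".metadata")
        ET.ElementTree(record).write(
            str(target), encoding="utf-8", xml_declaration=True
        )
        written.append(target)
    return written
```

      Calling `split_metadata("combined.xml", "sidecars")` would write one file per record into a `sidecars` folder and return the list of paths, so the same script that builds the combined XML could run this step right afterwards.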

