Collections as Data Jam

Experiments with the World Digital Library at the ACH 2019 Collections as Data Jam in Pittsburgh, PA

This project came out of the Collections as Data Jam workshop at the ACH Conference in Pittsburgh, PA, on July 23, 2019. The team included Kristen Mapes, Mickey Casad, David Newbury, Ellen Prokop, Anindita Basu Sempere, and Ece Turnator. We explored the World Digital Library and used its API to examine the results of a search for “dragon” in the library. The CSV included here is the compiled data we created, which includes latitude and longitude, dates, subject headings, and image measurements (hue, saturation, brightness). The PDF is a slideshow with many screenshots of our work, including a data model of the process and Palladio visualizations.

We have uploaded our project to GitHub.
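For readers curious how image measurements like the hue, saturation, and brightness columns in the CSV can be derived, here is a minimal sketch. It is not the code the team used at the jam: the function name `mean_hsb` and the choice to average the RGB channels before converting are assumptions made for illustration, and the input is a plain list of 8-bit RGB tuples rather than an actual image file.

```python
import colorsys


def mean_hsb(pixels):
    """Return (hue in degrees, saturation, brightness) for a set of RGB pixels.

    `pixels` is an iterable of (r, g, b) tuples with 8-bit channel values.
    Hypothetical helper: the RGB channels are averaged first and converted
    once, because hue is an angle and averaging hues directly can mislead.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("need at least one pixel")
    n = len(pixels)
    # Mean of each channel, scaled from 0-255 down to the 0-1 range colorsys expects.
    r = sum(p[0] for p in pixels) / (255.0 * n)
    g = sum(p[1] for p in pixels) / (255.0 * n)
    b = sum(p[2] for p in pixels) / (255.0 * n)
    hue, saturation, brightness = colorsys.rgb_to_hsv(r, g, b)
    # colorsys reports hue as a fraction of a full turn; convert to degrees.
    return hue * 360.0, saturation, brightness


if __name__ == "__main__":
    # Two made-up pixels, one pure red and one pure blue.
    sample = [(255, 0, 0), (0, 0, 255)]
    h, s, v = mean_hsb(sample)
    print(f"hue={h:.1f} saturation={s:.2f} brightness={v:.2f}")
```

In practice the pixel list would come from an image library reading each item's thumbnail, and the three values would be written out as one row per item alongside its coordinates, dates, and subject headings.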

Collections as Data Jam Description

The “data jam” is a concept that draws from both the longstanding “hackathon” tradition and the relatively recent phenomenon of the “game jam”: brief, fast-paced, cooperative events where participants produce an ephemeral, proof-of-concept final product. In industry, the creative output of hackathons includes the now-ubiquitous Facebook “like” button. Although hackathons are associated with an exclusionary hacker culture that perpetuates hacking as the defining activity of a technological “priesthood” class, the recent emergence of collaborative coding platforms has the potential to democratize hacking as an inclusive, collaborative activity more in alignment with the principles of DH. Moreover, research indicates that the peer-based learning and networking associated with the hackathon model can serve as a community-building tool that could put those new to DH on a level playing field with established practitioners.

As the digital products of humanistic research grow in number and complexity, DH project teams have repeatedly borrowed and adapted from the software development industry to bolster the sustainability of their digital scholarship. At the same time, some have asked what could be gained from embracing an ephemeral approach to DH work.

The success of the Collections as Data initiative has revealed the gulf between the needs of scholars using computational methods on humanities collections and the accessibility of those collections as usable data. The initiative stipulates that such collections designed “for everyone” will inevitably fail the individual who approaches the data for a particular purpose, looking to perform their singular brand of research. Yet the hidden strength of data is that it allows us to forge our own paths of entry into the collection, eschewing preset personas and established methods.

This workshop seeks to apply the methodology of the data jam to spontaneously uncover new points of entry into established collections as data. Prior to the workshop, a set of exemplary collections as data will be provided to participants, encouraging them to come with ideas. Over the course of a full day, participants will break into teams and utilize collaborative coding tools to produce an ephemeral digital product, with the ultimate goal of creating a novel window into a particular collection that goes beyond the personas envisioned by the Collections as Data initiative. The products of the jam will be showcased by participants and analyzed by a panel responsible for rewarding particularly innovative and thought-provoking entries.

The workshop builds on experiences at Princeton University hosting a DH hackathon and “Playing with Data” events in addition to the lessons of the Collections as Data initiative. As the workshop is also intended to serve as a community-building tool, it particularly welcomes the participation of those new to DH. No particular digital or humanistic skills are required, and teams will be arranged to ensure even distribution of skills and experience.
