Wikimania summary and Commons Machinery future

Posted by Jonas Öberg on August 12, 2014
Commons Machinery: full speed ahead

This is the first in a series of blog posts where we’ll talk about our recent development work, what we were up to at Wikimania 2014 (in London), and where that will lead us in the future.

Let’s start with a nine-sentence recap:

  1. Commons Machinery is developing tools that make attribution automatic, and which persistently connect every digital work online with information (metadata) about the same work.
  2. In 2013 we released prototypes that made it possible to copy & paste images from Flickr into LibreOffice & Aloha Editor with automatic attribution.
  3. These prototypes were based on the standards identified in our white paper on metadata standards.
  4. We came to the conclusion that this approach would only ever work for a small subset of users.
  5. We experimentally developed support for metadata in MediaGoblin, GIMP and Inkscape, and developed libraries for extracting and handling metadata and for generating attribution statements from it.
  6. Towards the end of 2013, we developed an architecture that focuses more on retaining contextual information about digital works, and that would also allow us to do automatic attribution for almost all users, rather than only the small subset we could previously support.
  7. Since then, we have further refined and developed our backend: our data model, our API and the code that implements it.
  8. We’ve also launched Elog.io to let people sign up to be the first to try our tools as they become available.
  9. During Wikimania, we spent time hacking and defining the priorities that will get us to the release of our next tool within the next two months.
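Item 5 above mentions libraries for generating attribution statements from metadata. As a rough illustration of what that means in practice, here is a minimal sketch in Python; the field names (`title`, `creator`, `license`, `source`) are purely illustrative and not the actual Commons Machinery data model:

```python
# Sketch: building a human-readable attribution statement from a
# metadata dictionary. Field names here are hypothetical examples.

def attribution(metadata):
    """Join the available metadata fields into one attribution string."""
    parts = [metadata.get("title", "Untitled")]
    if metadata.get("creator"):
        parts.append("by " + metadata["creator"])
    if metadata.get("license"):
        parts.append("licensed under " + metadata["license"])
    if metadata.get("source"):
        parts.append("via " + metadata["source"])
    return ", ".join(parts)

photo = {
    "title": "London Skyline",
    "creator": "A. Photographer",
    "license": "CC BY-SA 4.0",
    "source": "https://commons.wikimedia.org/",
}
print(attribution(photo))
# London Skyline, by A. Photographer, licensed under CC BY-SA 4.0,
# via https://commons.wikimedia.org/
```

The real work, of course, is in reliably extracting such fields from the many formats images carry them in, which is what the libraries do.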

This is what our roadmap now looks like. You’ll notice that we’ve pushed automatic attribution a bit further down the line: we realised that in order to properly support automatic attribution in other applications, we first need the technology to visualise attributions where people go looking for images: the Internet.

Over the next two months, we’ll be writing a bit more about the API, but also working on developing the first tool that can be built using our API: an extension for Firefox that allows you to see where images on the Internet come from, even if they’re not attributed by those publishing them.

It won’t work on all images, but it’ll work on some. We hope to retrieve and calculate image information from Wikimedia Commons as well as the British Library and other institutions that hold large amounts of photographs. With our API, we’ll be able to search those collections for visual matches. Even if an image is resized or has its format changed (from PNG to JPEG, for instance), we should be able to relate an image published on a web site with an image from the collection, and show that relation.
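One common way to match images across resizes and format changes is perceptual hashing: reduce the image to a coarse grayscale grid and record which cells are brighter than the mean, so two renditions of the same picture produce the same (or nearly the same) fingerprint. This is only a sketch of the general technique, not necessarily the matching algorithm our API uses:

```python
# Sketch of an "average hash": a perceptual fingerprint that survives
# resizing and re-encoding, because it only looks at coarse structure.

def average_hash(pixels, size=8):
    """pixels: 2D list of grayscale values. Returns a 64-bit string."""
    h, w = len(pixels), len(pixels[0])
    grid = []
    # Downscale by averaging blocks into a size x size grid.
    for gy in range(size):
        for gx in range(size):
            ys = range(gy * h // size, (gy + 1) * h // size)
            xs = range(gx * w // size, (gx + 1) * w // size)
            block = [pixels[y][x] for y in ys for x in xs]
            grid.append(sum(block) / len(block))
    mean = sum(grid) / len(grid)
    # One bit per cell: is it brighter than the overall mean?
    return "".join("1" if v > mean else "0" for v in grid)

def hamming(a, b):
    """Bits that differ; a small distance suggests the same image."""
    return sum(x != y for x, y in zip(a, b))

# Synthetic 16x16 "image": bright left half, dark right half.
original = [[200 if x < 8 else 30 for x in range(16)] for y in range(16)]
# The same scene "resized" to 32x32.
resized = [[200 if x < 16 else 30 for x in range(32)] for y in range(32)]

print(hamming(average_hash(original), average_hash(resized)))  # 0
```

Because the fingerprint is so small, a collection the size of Wikimedia Commons can be indexed and searched for near matches efficiently.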

The advantage for you is that you’ll see where images on the Internet originate and you’ll be able to find your way back to the collections, where you may find more information about the images. The benefit for those holding the collections is that they’ll be able to increase their engagement with their audience, even when that audience encounters images far away from the institution’s collections.

So what happened with our work on automatic attribution? That’ll be one of the next steps on our roadmap: now that we’ll have the contextual information about images that people want to use, we’ll want to make that information useful as well. We’ll also be developing ways in which users can contribute information about images that aren’t part of our catalog yet, to help others find them. Collaboration is key!

We’ll get there in a few blog posts from now. But in the next post in this series, we’ll talk more about the upcoming extension itself and show you some examples of what it will look like.