All the standard image format allow to embed metadata.
Why not store attribution there ? This would be simpler and more robust.
It’s a very good question with one technical answer, but it also raises a lot of further questions. I’ll look at those first, and then answer the original question. I don’t have answers to the follow-up questions yet!
In discussing this, I’m going to invent the term “linked metadata”. It is a sub-category of the “linked data” of the semantic web, and uses the same RDF technology. The difference is that linked metadata always describes a digital creation, while linked data can describe any object in the world. The nuance is that a digital creation and its linked metadata are accessed and processed with very similar tools, while the objects that linked data refers to may not exist on the web at all. This restriction on linked metadata results in some technical challenges.
Embedded or linked metadata?
I really agree with ballombe! It would be simpler and more robust to embed the metadata in the image, but only if the infrastructure actually supports it. Right now, embedded metadata gets lost all over the place, so it may actually be more robust to have linked metadata on a web page as RDFa, referring to the image.
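To make “linked metadata on a web page as RDFa” a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the page fragment, the example.org URLs, the choice of dc: and cc: properties, and the helper names `SimpleRDFaParser` and `metadata_for`. It is a deliberately simplified reader that only understands `about`, `property` and `rel`/`href`, not a conformant RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; URLs and property choices are illustrative only.
PAGE = """
<div about="http://example.org/photos/sunset.jpg"
     prefix="dc: http://purl.org/dc/terms/ cc: http://creativecommons.org/ns#">
  <img src="http://example.org/photos/sunset.jpg" alt="Sunset">
  <span property="dc:title">Sunset</span> by
  <span property="cc:attributionName">Jane Example</span>,
  <a rel="cc:license" href="http://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a>
</div>
"""


class SimpleRDFaParser(HTMLParser):
    """Collects (subject, predicate, object) triples from a small RDFa subset."""

    VOID = {"img", "br", "hr", "meta", "link", "input"}  # elements with no end tag

    def __init__(self):
        super().__init__()
        self.stack = []      # subject in scope for each open element
        self.pending = None  # (subject, property) waiting for its text content
        self.text = []
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # An "about" attribute sets the subject; otherwise inherit it from the parent.
        subject = a.get("about") or (self.stack[-1] if self.stack else None)
        if tag not in self.VOID:
            self.stack.append(subject)
        if subject and "rel" in a and "href" in a:
            self.triples.append((subject, a["rel"], a["href"]))
        if subject and "property" in a:
            self.pending = (subject, a["property"])
            self.text = []

    def handle_data(self, data):
        if self.pending:
            self.text.append(data)

    def handle_endtag(self, tag):
        # Simplification: a property's value ends at the next closing tag.
        if self.pending:
            subject, prop = self.pending
            self.triples.append((subject, prop, "".join(self.text).strip()))
            self.pending = None
        if tag not in self.VOID and self.stack:
            self.stack.pop()


def metadata_for(html, image_uri):
    """Return the properties whose subject is the given image URI."""
    parser = SimpleRDFaParser()
    parser.feed(html)
    parser.close()
    return {p: o for s, p, o in parser.triples if s == image_uri}


info = metadata_for(PAGE, "http://example.org/photos/sunset.jpg")
print(info)
```

The point of the sketch is that the attribution lives in the page markup and is tied to the image only through its URI, so it survives whatever happens to the metadata inside the image file itself.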
It is also true that the important image formats all support embedded metadata, but the situation is worse for sound and video formats. Linked metadata has the benefit that it doesn’t matter what file format is used to distribute the creation. However, I think we can find a way to embed metadata in other media formats, and should explore that.
Linked metadata has a lot of challenges as well. Here are a few that we’re looking at right now:
- How do you find the metadata if you only have the image?
  - You can embed a link to it in the image metadata, but that risks being stripped away.
  - You can use a fingerprint or a watermark that only relies on the content of the image itself, and look that up in a registry. But which registry? There are already several registries for professionals, but few for casual creators.
- How is the RDFa metadata linked to the image?
  - It must be possible to identify the image with a URI of some kind. It could simply be the URI of the web page where the image is published, and would not necessarily require one of the registries from the previous question.
  - But what if you publish a thumbnail on the page with the RDFa, and provide the original image with a download link? How can we express all those relationships so that tools can understand which linked metadata to actually use?
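The fingerprint idea from the list above can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses a simple “average hash” as a stand-in for whatever fingerprint a real registry would use, it works on plain grids of grayscale values instead of decoding real image files, and the registry is just a Python dict pointing at a made-up example.org page.

```python
def average_hash(pixels, size=8):
    """Fingerprint a grayscale image given as a 2D list of values.

    The image is reduced to a size x size grid of block averages, and each
    cell contributes one bit: 1 if it is brighter than the overall mean.
    The image must be at least size x size pixels.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for cell in cells:
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits


def hamming(a, b):
    """Number of bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def lookup(registry, fingerprint, max_distance=5):
    """Return the metadata URI of the closest registered fingerprint, if close enough."""
    best = min(registry, key=lambda known: hamming(known, fingerprint), default=None)
    if best is not None and hamming(best, fingerprint) <= max_distance:
        return registry[best]
    return None


# Synthetic test images: a 32x32 horizontal gradient, a 16x16 "thumbnail"
# of the same gradient, and an unrelated vertical gradient.
original = [[x * 8 for x in range(32)] for _ in range(32)]
thumbnail = [[x * 16 for x in range(16)] for _ in range(16)]
unrelated = [[y * 8 for _ in range(32)] for y in range(32)]

# Hypothetical registry: fingerprint -> page that carries the RDFa metadata.
registry = {average_hash(original): "http://example.org/photos/sunset.html"}

print(lookup(registry, average_hash(thumbnail)))  # finds the registered page
print(lookup(registry, average_hash(unrelated)))  # None: no close match
```

Because the fingerprint depends only on the picture content, the thumbnail still leads back to the metadata even though nothing is embedded in it. The “which registry?” question is of course untouched by the sketch.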
Can you see more problems, and solutions? We will continue to explore this at Commons Machinery, whenever possible with hands-on examples.
The answer to ballombe
To explore this with embedded metadata, we’d have to modify Firefox and GTK. That would have made it much more difficult for other people to try this themselves. But I want to see how this works when the metadata is embedded in the image too, so we’ll probably do that soon.