Archives and museums are filled with amazing material, and most are naturally thinking about how to better use Internet technologies to become more useful and relevant to the public. Success stories abound: The Commons on Flickr, the digitization by Oxford’s Bodleian Library of a 550-year-old copy of the Gutenberg Bible, the Europeana project, and many more.
We can also read wonderful success stories like the recent “How the Rijksmuseum opened up its collection”:
“…This has resulted in over 150,000 high resolution images for anyone to view, download, copy, remix, print and use for any purpose they can think of.
The museum has been really satisfied with the results of this move. They believe that their core business is to get people familiar with the collection and the museum. By making the images available without copyright restrictions, their reach has extended exponentially and far beyond their own website. The material is now for example being shared and used widely in all kinds of online platforms such as Wikipedia or educational websites.”
So it seems easy. All these archives opening up and going online, becoming part of the sharing revolution…but how is it done? We see the successes and marvel at the wealth of online material. The tipping point is being reached (or already has been?) and we expect everything to be online. But are we looking at the wrong thing?
Many years ago, I lived in an area of town that seemed to have a surprisingly large cluster of hairdressers. None seemed to be overly busy, and occasionally I would wonder how they survived. Were they maybe a front for an international art smuggling cartel? Or part of an intricate tax evasion scheme? In my less wild moments, I assumed that they were simply unsuccessful businesses barely surviving year after year. But eventually, I realized they were not really surviving.
On taking a closer look, I noticed that the world of hairdressers (and art thieves) I had assumed to be stable was far more volatile than I thought. The stores were in a constant state of flux, with new hairdressers taking over, buying and sometimes renaming the storefronts. The hairdressers were obviously not surviving…they were all slowly failing and being replaced by new hopefuls.
So this made me wonder: why was it that hairdressers kept coming to this location and failing? They were probably failing because there were too many hairdressers and not enough hair. But why didn’t this deter newcomers? The answer probably lay in survivor bias. Prospective hairdressers saw several hairdresser shops and concluded that this was a good location for hairdressers. Since the long list of failed startups left no trace, the onlooker presumed this was a prime hairdressing location.
The same thing seems to happen with the clusters of Italian or Thai restaurants that pop up in certain locations in my city. “The cemetery of failed restaurants is very silent,” warns Nassim Taleb in his book The Black Swan. Even though we know this, business schools insist on teaching about successful businesses while ignoring the important lessons of the failures.
The narratives surrounding survivor bias are fascinating, and none more so than the story of how statistician Abraham Wald helped the air force with its bombers during World War II. Bombers were being shot down and needed more protection. Since armoring the whole bomber would make the plane unwieldy, the question was where best to improve the plane’s defenses. The military made observations and noted that planes seemed to be taking most hits “…along the wings, around the tail gunner, and down the center of the body” and discussed adding armor to these areas. What Wald pointed out was that these planes had made it back despite their damage. Planes hit in other places had not returned to be counted, so the areas left undamaged on the survivors were the ones that needed armor. Studying only the survivors was giving a misleading picture of what needed to be better protected (You Are Not So Smart: Survivorship Bias).
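Wald's point can be made concrete with a toy simulation. The sketch below is only an illustration under invented assumptions: the section names, the `LETHALITY` values and the `simulate` function are all hypothetical and are not historical data. Every section is hit equally often, but a hit to some sections is far more likely to bring the plane down.

```python
import random

# Hypothetical chance that a single hit to each section downs the plane.
# These values are invented for illustration; they are not historical data.
LETHALITY = {
    "wings": 0.05,
    "tail": 0.05,
    "fuselage": 0.05,
    "engines": 0.60,
    "cockpit": 0.60,
}


def simulate(n_planes=10_000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits on returning planes) per section."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)

    for _ in range(n_planes):
        # Each hit lands on a section chosen uniformly at random.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LETHALITY[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                survivor_hits[s] += 1

    return all_hits, survivor_hits


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    for section in LETHALITY:
        print(f"{section:9s} all planes: {all_hits[section]:5d}   "
              f"returning planes: {survivor_hits[section]:5d}")
```

Under these assumptions the hits are spread evenly across the whole fleet, yet the returning planes show comparatively few hits to the engines and cockpit. An observer who only inspects the survivors would be tempted to armor the wings.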
When pondering survivor bias in relation to archives, we see the phenomenal success of those organizations that have shared their materials. It’s easy to think that the common factor is their openness and that every museum should put its collection online! But their success does not come from openness alone. There are a ton of openly available archives that contain a wealth of wonderful and useful things – but they are not being used to their full potential.
Opening up and making material available is crucial, but successful implementation requires that the material made available is usable. For material to be usable, it must be searchable, and the information about each digital object must be easily accessible – preferably not detachable from the object itself. In order to do this, we must understand the role of metadata.
The success stories of online archives are not due to the fact that they had a ton of material and then put it online. The success story comes when the material going online is useful.
On December 4, 2008, Wikimedia Commons received more than 80,000 images as a donation from the German Federal Archives. Imagine all those images without the data surrounding them. Imagine if the donation had been 80,000 images with only obscure file names such as the ever-useful IMG_6623.JPG my camera offers me. The collection would have been virtually useless. The sheer scale of the donation would have made it less useful rather than more.
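The difference can be sketched in a few lines of code. This is a minimal illustration, not a description of how Wikimedia Commons or the German Federal Archives actually store or search their material: the `ImageRecord` class, the `search` function and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """A hypothetical record that keeps the metadata together with the file name."""
    filename: str
    title: str = ""
    description: str = ""
    keywords: list[str] = field(default_factory=list)

    def searchable_text(self) -> str:
        # Everything a keyword search can match against, lowercased.
        parts = [self.filename, self.title, self.description, *self.keywords]
        return " ".join(parts).lower()


def search(records, query):
    """Return the records whose text contains every word of the query."""
    terms = query.lower().split()
    return [r for r in records if all(t in r.searchable_text() for t in terms)]


# Invented sample data: the same two images, with and without metadata.
described = [
    ImageRecord("IMG_0001.JPG", title="Harbour at dawn",
                description="Fishing boats leaving the harbour",
                keywords=["harbour", "boats"]),
    ImageRecord("IMG_0002.JPG", title="Market square",
                description="Crowd at a weekly market",
                keywords=["market", "crowd"]),
]
bare = [ImageRecord(r.filename) for r in described]

if __name__ == "__main__":
    print(len(search(described, "harbour boats")))  # matches found with metadata
    print(len(search(bare, "harbour boats")))       # matches found with file names only
```

With the metadata attached, a search for a subject finds the image. With file names alone, the only query that can succeed is the file name itself, which nobody looking for a picture of a harbour would ever type.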
Failed openness – openness without complete data – slowly vanishes from our view, and we are left with the idea that all openness is good. Openness is vital but information is crucial.