Towards the end of last year I received an exciting letter informing me that Trove/Australian Newspapers had been nominated for an international crowdsourcing award for the text correction activity. I had never heard of the award before but the letter explained it:
“The Digital Heritage Award is an initiative of the Dutch Institute for Cultural Heritage and the Digitaal Erfgoed Nederland Foundation (DEN). First introduced in 2008, it has since been awarded annually to a heritage institution or project that has used digital heritage in an innovative or inspiring way. In this year’s edition the award will go to the best digital heritage related crowdsourcing project.”
I considered it a great honour to be nominated and shortlisted. Several years of my life have been committed day and night to developing, maintaining and promoting Australian Newspapers, which is ‘my baby’. The five shortlisted nominees had been selected by a jury of five international experts on crowdsourcing and heritage: Susan Hazan (Director of Digital Heritage UK and Curator of New Media and Head of the Internet Office at Israel Museum), Johan Oomen (Head of R&D at the Netherlands Institute for Sound and Vision), Josh Greenberg (Director of the Alfred P. Sloan Foundation’s Digital Information Technology Programme in New York), Vincent Puig (Executive Director at IRI/Centre Pompidou) and Mia Ridge (UK researcher, consultant, programmer, analyst).
To be shortlisted the crowdsourcing projects had to meet the following selection criteria:
· Be in an advanced or finished state of development.
· Hundreds or thousands of members of the public should have contributed.
· Significant results already achieved in 2010 or 2011 and publicly visible.
· Results exceeded expectations, are inspiring to others, and can be replicated.
· Have had press coverage.
· Have a clear project leader who can present the project at the awards ceremony.
The shortlisted projects also had the following criteria applied:
· Long-term commitment to the activity
· Continued progress
· Motivation and rewards for the crowd
· Effective design
· Link to existing communities
This led to a list of five finalists for the Digital Heritage Award 2011:
- Digitalkoot from the National Library of Finland. 50,000 volunteers are correcting OCR newspaper text to 99% accuracy.
- Old Weather from the National Maritime Museum, UK. 700,000 (97%) of the handwritten navy ships’ logs recording temperatures have been transcribed by thousands of volunteers harnessed through Galaxy Zoo.
- Remember Me: Displaced Children of the Holocaust from the United States Holocaust Memorial Museum. 61,000 people have viewed 1000 pictures of children lost in the Holocaust. So far 180 have been identified and traced.
- Trove Australian Newspapers from the National Library of Australia. 40,000 volunteers have corrected 51 million lines of OCR text in historic newspapers making them more searchable.
- Transcribe Bentham from University College London. Volunteers transcribe 44 handwritten Bentham manuscripts per week.
The winner would be selected by audience voting. Over 500 digital culture heritage specialists attending the international conference Digital Strategies for Heritage (DISH2011) would watch presentations on the crowdsourcing projects, speak to the project leaders and then vote for their favourite on the first day of the conference - 7 December 2011.
So, rather belatedly I’m now going to tell you what happened next…
Unfortunately the National Library of Australia decided that it could not justify the cost of sending me to Holland for the conference. This was perhaps rightly so, since it would have cost over AU$5000 for me to attend and the Library is making travel cutbacks at a time of severe financial restraint. The conference, being in Europe, was rather pricey, but it did present a good professional development opportunity as well as the chance to win an award and to meet the other project managers face to face. Not attending or being able to present in person immediately reduced our chances of winning. The conference organisers kindly let me send a video message instead. As it turned out there was only one conference attendee from Australia, and one from New Zealand, but a very strong contingent from Scandinavia. Because the winner was based on audience voting, and on occasions such as this national pride and alliances run strong, things seemed to be against us from the start.
The winner, who had a straight lead to the finishing post, was DigitalKoot. I congratulate them and all the other nominees. It was a very hard choice to make, with each project being really good. Maybe that’s what the judges thought too when they came up with the idea of audience voting (devolving the responsibility!).
In discussions with people afterwards and by following the conference online I was interested to pick up that many of the digital culture specialists still seemed to think that you would stand little chance of getting thousands of volunteers to do something for you unless you made it into a game and it looked cool, hence perhaps their enthusiasm for DigitalKoot (a game to correct newspaper text that involves a mole). This gave me pause for thought, because I don’t think I agree with this view, but then maybe I have got the whole thing wrong? It is also interesting that the year before, in 2010, the ‘Best Archives on the Web: Best Use of Crowdsourcing for Description’ Award was given to Waisda (What’s that), a Dutch project from the Netherlands Institute for Sound and Vision. It also uses gaming technology. People tag videos with subjects and see if their tags match other people’s.
I realised there are some things we discussed on the Australian Newspapers project which I have never written about in my articles or mentioned in my presentations, which now seem very pertinent. Firstly, when we began to design the Australian Newspapers site we employed a web design company who could not think why anyone would want to correct newspaper text unless they made it into a game. But the primary purpose of our newspaper project was to digitise Australian newspapers and make them available online, free and full-text searchable. The bit on the side was that it would be good to improve the quality of the text for searching if we had the time and means. Hence we never had a ‘crowdsourcing project’ and we never focused on that. We told the web developers to focus first on getting the search and browse interface up and running and leave the text correction bit until last if they had time. They did this. When doing public usability testing for ‘search and browse’ the developers were overwhelmed by the excited response they got from people off the street about the availability of Australian Newspapers. None of these users were ‘library users’ or really considered that this was a library service. They all showed early signs of getting quite sucked into search and browse. None of the testers had any problem thinking up something they would like to look up in old newspapers. Based on this high level of interest and motivation the developers thought maybe a ‘gaming strategy’ to attract users would not after all be required. Also the library project team felt that anyone who wanted to improve the text would come from the user base, i.e. newspaper searchers, so they would be in the site already and have an interest in improving the text of something they had just read.
Lastly, the image of the newspaper text was always visible, so there was no real need to match or verify one person’s corrections against someone else’s before accepting them, a strategy often employed in gaming technology.
The simple, explicit thing I have never said is that as far as we know our volunteer text correctors are a subset of our Trove search user base, not a separate group of people who simply want to crowdsource. That is, they do not think “Oh I want to help with crowdsourcing, let’s find a site that does that”; they are already in our site thinking “this is a great site, I found what I wanted, oh look I can make it even better, I’ll do that to help”. Obviously some of the text correctors are doing vast amounts of work, but most are simultaneously undertaking research using the resource. They seem to find both activities highly enjoyable and addictive without ‘gamification’.
I’ve had so many people contact me and say “How can I set up a crowdsourcing project?” but we never came to it from that point of view and I don’t think that is how you should approach it. It is the wrong question to ask. You need to ask “What goal do I want to achieve, and how can I do that?” Ours was to improve the quality of our searching, and it happened that our solution was to get the public to help with this, which became ‘crowdsourcing’. For the National Library of Australia the crowdsourcing activity is a side effect of its ongoing effort to deliver high quality services. We play it down actually, and never use the word ‘crowdsourcing’. We just say the public are helping us, or that the community is involved, or volunteers work online. There is still great reticence about using the “C” word itself, and about acknowledging the activity or its scale. Three years in, senior managers finally agreed we could put some text on the home page of Trove saying ‘contribute’ and ‘how to correct text’. Before this a Trove user would only stumble across the fact they could correct text when they had actually reached the point in the newspaper search where they could do it. The Library has never formally appealed for the community to help, or had a strategy to do this, though it has acknowledged and congratulated the highest achieving volunteers.
So firstly I am thinking that even with this lack of public appeal our results have been phenomenal. We have drawn on our existing Trove user base (5 million) and about 40,000 people have become volunteer text correctors, about 4,000 of them hugely committed and correcting each week. They have improved 56.8 million lines of text. But then on the other hand I’m thinking “Would this have exponentially increased if we had used gaming technology from the start, targeted gamers rather than searchers, and introduced an animal like DigitalKoot have done?” Of course our animal could not have been a mole; it would have to have been a native: a kangaroo, koala, possum, sugar glider, or my favourite, a bilby. Maybe our initial decision was wrong after all.
There’s really not much written or researched on this topic. In fact the term ‘gamification’ is only about 18 months old. There is quite a lot of negativity towards ‘gamification’. It is considered by some to be a stupid fad that will soon pass, while others see it as being as important as the rise of social media. The ABC’s creative director of strategy thinks it is as stupid as it sounds because it limits creativity and goals. Viewpoints on whether or not to use gaming technology on digital cultural heritage crowdsourcing sites may also depend on what your goals are and whether or not you think the journey, i.e. the level of social engagement and community building, is as important as or more important than the destination, i.e. the result. At the National Library of Australia I think it is fair to say we consider the journey ‘interesting’ but really we have our eyes on the destination only. As I said, our results are phenomenal, but maybe they can be improved and increased. I wonder if someone could do some research on this, or alternatively we could just change our interface, add a few different levels of competence and a couple of bilbies and see what happens….
Photos of DISH 2011 from the DEN flickr stream
Finland's DigitalKoot receives the DISH2011 Crowdsourcing Award