Wednesday, September 30, 2009

Crowdsourcing Australian History using Web 2.0

Nick Gruen over at the Gov 2.0 Taskforce has reminded me of a project I took a look at last year but have never mentioned in this blog.

It's the National Library of Australia's Historic Australian Newspapers archive, which contains digitised versions of Australian newspapers published between 1803 and 1954 (which are not covered by copyright).

The archive began with the intention of using OCR (Optical Character Recognition) to digitise the newspapers and make them accessible and searchable online - a vital resource for researchers and genealogists.

However, the project took this a step further - allowing the public to correct OCR mistakes in the text, with extremely low barriers to entry.

This led to over 2 million lines of text being corrected in 100,000 articles in the first six months, with corrections undertaken by 1,300 users from around the world (78% from Australia). In fact, there wasn't a single hour of the day when corrections were not taking place - and there were no instances of vandalism.

The IT Project Manager, Rose Holley, has written a great report on the project, detailing how the crowdsourcing initiative was suggested, the process used to understand and manage potential risks and to test and establish the system, and how successful it has been - including profiles of some of the top participants and what motivates them to contribute.

This report, Many Hands Make Light Work: Public Collaborative OCR Text Correction in Australian Historic Newspapers (PDF), is a must-read for anyone in the Australian public sector considering how they can get the public involved in their online initiative.

The project is ongoing - with more than 2,294 registered users as of February this year.

So why not get involved yourself - even just to understand how such a system might work?
