Tuesday, March 08, 2011

Doing good while improving security with ReCAPTCHA

There are still many government online forms and consultation systems that don't use 'human recognition' tools such as CAPTCHA. These tools help verify that the people filling in the forms are human, and make online government forms less attractive targets for large-scale automated attacks by bot-armies.

However, even where government has added CAPTCHA security, I've yet to see an instance where this has been used for good, as well as security.

CAPTCHA, for those unfamiliar, is a technology that asks a user completing an online form to type in one or more words, or work out the answer to a simple sum, before submitting their response. The words or sum are presented in an image with 'background static' designed to make it hard for a computer to read.

In most cases, humans are able to decipher and type in the correct response, whereas automated form completion systems, often used for spamming, are not.

Many CAPTCHA systems are also enhanced with audio CAPTCHA (where words are read out, amidst static and background noises), supporting vision-impaired people.

These systems are not perfect. However, they do raise the barriers to hackers, reducing the prospect of spam submissions or attacks.

They also add a little time to each submission attempt - possibly ten seconds. This is negligible to an individual (in most circumstances). However, as millions of people complete CAPTCHA forms each day, it adds up to a lot of time overall.

Initially CAPTCHA tools just presented random words. However, a system supported by Google is now helping organisations to 'do good' as well as improve their security.

Named ReCAPTCHA, the system ties into the work being done to digitalise books and documents. Rather than using random words, it presents users with words that computers could not understand during the document digitalisation process.

Each time a user completes a ReCAPTCHA, they are helping to decipher and digitalise the world's literature and records - preserving it into the digital age.

Assuming an average of two words per ReCAPTCHA, with each word repeated many times in order to validate the entry, the contribution by any particular individual is minuscule.

However if, for example, 50 million people each verify themselves using ReCAPTCHA each day, with each set of two words presented ten times on average, that adds up to a total of 10 million words in old documents and books being deciphered and correctly digitalised. Each day. That's about 3.65 billion words per year.
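For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The figures are the illustrative ones from the example above, not real ReCAPTCHA usage statistics, and the function and parameter names are made up for this post.

```python
def words_deciphered_per_day(users_per_day: int,
                             words_per_captcha: int,
                             repeats_per_word: int) -> int:
    """Estimate how many distinct words are deciphered in a day.

    Every user types `words_per_captcha` words, but each word has to be
    shown `repeats_per_word` times before it counts as validated.
    """
    total_entries = users_per_day * words_per_captcha
    return total_entries // repeats_per_word


# Illustrative assumptions taken from the example in the text.
daily = words_deciphered_per_day(
    users_per_day=50_000_000,
    words_per_captcha=2,
    repeats_per_word=10,
)
yearly = daily * 365

print(f"{daily:,} words per day")    # 10,000,000 words per day
print(f"{yearly:,} words per year")  # 3,650,000,000 words per year
```

Changing any of the three inputs shows how sensitive the total is to the number of repeats needed to validate each word.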

So if your organisation isn't using CAPTCHA security on forms, or even if you are using a custom CAPTCHA technology, you might wish to consider exploring the use of ReCAPTCHA - which is free to use from Google.

Alternatively, of course, Australian institutions could develop their own type of CAPTCHA approach (for old newspapers, for example - or archival records). It would be a meaningful extension to the work the National Library of Australia is already doing.

Below is a video on the work being done with ReCAPTCHA.

Learn more about ReCAPTCHA.


  1. Nothing like reaching the end of a long, complicated government form. And then being punished with a captcha.

  2. In commenting on this blog, I had to answer a captcha (not recaptcha?)
    When typing the word, my iPhone wanted first to capitalise what I typed, then to autocorrect it to a different word. I'd rather integrate with an antispam service (like Akismet) than ask users to answer captchas. Or just delete the spam! Note that certain forms (forums, comments and feedback in particular) are most likely to attract spam. Just another consideration before captchas proliferate across government sites :)

  3. Ben,

    I'll see if Blogger (Google) allows me to choose ReCAPTCHA (Google) as my CAPTCHA :)

  4. I hate CAPTCHA! I can barely read the text sometimes and I'm most definitely human!