Monday, June 23, 2008

The issues with CAPTCHA security

CAPTCHA is a security technology for websites that works by asking users to prove they are human by typing in a random string of letters or numbers displayed in an image.

You could consider it a reverse Turing test - one where a machine, rather than a person, judges whether it is dealing with a human.
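As a rough illustration of the challenge-and-response flow described above, here is a minimal Python sketch. It is not how any particular CAPTCHA product works: the names `ALPHABET`, `new_challenge`, `verify` and `_pending` are made up for this example, the in-memory store stands in for whatever session storage a real site would use, and the step of drawing the text into a distorted image is left out entirely.

```python
import hmac
import secrets
import string

# Hypothetical alphabet: look-alike characters (0/O, 1/I/L) are left out.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1IL"
)

# Challenge id -> expected text. In-memory only, for illustration.
_pending = {}


def new_challenge(length=6):
    """Create a challenge and return (challenge_id, text).

    The text is what would be drawn into the image shown to the user.
    """
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    challenge_id = secrets.token_urlsafe(16)
    _pending[challenge_id] = text
    return challenge_id, text


def verify(challenge_id, response):
    """Check the typed response. Each challenge can be attempted only once."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False
    typed = response.strip().upper()
    # Constant-time comparison of the expected and typed text.
    return hmac.compare_digest(expected.encode(), typed.encode())
```

In this sketch a site would call `new_challenge()` when it renders the form, show the text as an image, and call `verify()` with the id and whatever the user typed. Discarding each challenge after one attempt is a design choice made here so that a failed guess cannot simply be retried against the same image.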

It is now widely used, as it is easy to implement and has a reasonably good success rate in telling humans and machines apart.
However, it does have weaknesses and issues, and organisations need to think a little before they simply decide on the CAPTCHA path.

Here are some factors to consider.

CAPTCHA isn't accessible - straight CAPTCHA may breach accessibility law

CAPTCHA relies on presenting a graphic image of text to a viewer, who then reads the text and enters it into a text box. As computers are now smart enough to read clear images, the images used in modern CAPTCHA systems are usually 'messy', with random strokes and distorted letters (reCAPTCHA is one widely used example).

These images can also be hard for some humans to read - the old, the young, the visually impaired and even people who would not consider themselves to have sight issues.

This means that visual CAPTCHA systems may be inaccessible under Australian laws regarding accessibility. This is a very important consideration for Australian government agencies.

There are approaches to get around this. One is to offer a selection of images, one of which (hopefully) is readable by the audience. Another is to offer an audio alternative, whereby someone listens to a series of letters or numbers - usually interspersed with other sounds - and types these in.

Note that the latter approach also has similar accessibility issues for those with hearing impairments.

Personally, I have on occasion had difficulty using both visual and audio CAPTCHA approaches, and my vision and hearing are both above average for my age group (Gen X).

CAPTCHA is breakable

There are several ways to break a CAPTCHA system.

The first is to simply have a large group of low-paid computer users systematically interpret and type in the correct response.

Organisations in nations where labour is cheap are able to offer this as a service for hacking sites or preparing the way for automated systems to then use hacked sites and accounts for spamming and other illicit purposes.

Also, as technology improves, it becomes easier for machines to break CAPTCHA. Already we've seen a move from clear text to messy and distorted images - tested against optical character recognition to ensure they are not readable - in order to reduce the ability of computers to read the image.

It is only a matter of time before machines can also read these messy images - handwriting recognition and optical character recognition technology both continue to get better and are converging on this area.

Not endorsed by the W3C

CAPTCHA is not endorsed for use by the W3C.

The W3C has indicated in a Working Group Note entitled Inaccessibility of CAPTCHA that CAPTCHA is inaccessible and the technology is not yet endorsed within W3C guidelines.

This means that it is not endorsed within the standard guidelines underpinning website development in the public sector.

This doesn't exclude agencies from using it. It has not been specifically rejected by the W3C, so it sits in a grey area and each agency would have to make its own decision.

So what next?

CAPTCHA has already advanced to reCAPTCHA - involving the messy distorted text indicated above.

Most reCAPTCHA implementations have also integrated audio reCAPTCHA as an alternative - in the hope that if people cannot read the image they can understand the sounds.

Some organisations, such as banks, use physical PIN devices. Others have talked about using fingerprint or retina scanners attached to PCs.

However there is no clear successor to reCAPTCHA for widespread use on websites.

What should organisations do?

As there's no readily accessible and cost-effective alternative, organisations should strongly consider reCAPTCHA as a security measure in their sites, integrating both visual and audio approaches.

However, they should also strongly consider offering an approach accessible to those who cannot see or hear the CAPTCHA challenge, such as phone-based identification or the use of secret questions.

1 comment:

  1. ReCAPTCHA embeds scripting from the host site into your site and sends input from users back to the host site - while this may be free captcha control it is also web security "out of the frying pan into the fire".