Monday, 12 December 2011

Another brilliant TED talk

This guy, Luis von Ahn, is a member of the team that brought us CAPTCHAs, those little strings of distorted characters that you have to type into an online form to prove to the website that you're not a computer program.  The talk is only 15 minutes long, and is definitely worth watching, but here's a summary.
They knew that the few seconds we spend entering those characters are wasted time, and that over 100 million people use captchas each day. So they wondered whether there was some useful work that could be broken down into bits small enough to be done in those few seconds. Amazingly, there is: they use it to digitise books.

They changed the captcha software to present you with two words instead of a single string of characters, and you now have to type both words in. One word the computer recognises; the other it doesn't. It uses the one it recognises to check that you're not a computer program, then stores your entry for the other word, along with a probability that it's right. When enough other people have identified this unknown word, the probability that it has a valid digitisation increases to the point where the software can accept it as right. It then moves that word from Unknown to Known and proceeds. With hundreds of millions of such transactions each day, digitisation can proceed fast!
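
For the programmers reading, here is a rough sketch of how that two-word check might look in Python. It is only my illustration, not the team's actual code: the class name, the word ids and the agreement threshold of three are all made up, and a simple count of matching answers stands in for the probability mentioned in the talk.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: matching answers needed before a word is trusted.
REQUIRED_AGREEMENT = 3


class WordDigitiser:
    """Toy model of the two-word captcha: one control word, one unknown word."""

    def __init__(self, required_agreement=REQUIRED_AGREEMENT):
        self.required_agreement = required_agreement
        self.votes = defaultdict(Counter)  # unknown word id -> answers typed so far
        self.known = {}                    # word id -> accepted text

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Return True if the user passes the human check on the control word."""
        if control_answer.strip().lower() != control_truth.lower():
            return False  # failed the known word, so ignore the other answer
        if unknown_id in self.known:
            return True   # already digitised, nothing more to record
        guess = unknown_answer.strip().lower()
        self.votes[unknown_id][guess] += 1
        # Enough people agree: move the word from Unknown to Known.
        if self.votes[unknown_id][guess] >= self.required_agreement:
            self.known[unknown_id] = guess
            del self.votes[unknown_id]
        return True


d = WordDigitiser()
for typed in ["morning", "morning", "mourning", "morning"]:
    d.submit("upon", "upon", "scan-17", typed)
print(d.known)
```

Here four people pass the control word "upon"; three of them type "morning" for the unknown word, so it gets accepted, and the lone "mourning" is simply outvoted.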

Another project they have going is translating the web into the main world languages. Machine translation is not much good yet, and human translation is expensive. So they decided to recruit anyone who wants to translate text for them, working at their own level of competence. It's free, because translating English into your chosen language helps you to learn that language, which is enough motivation to keep lots of people doing it. After all, there are millions of people around the world paying to learn a foreign language. Once a dozen people have translated the same piece of text, it's easy to merge the results and be confident the translation is good. He reckons they can translate the remaining 2/3 of Wikipedia into Spanish in 80 hours. Amazing!
Now watch the talk!

