The Fall of CAPTCHAs – really?

I recently saw a Slashdot post dramatically titled “Fallout From the Fall of CAPTCHAs”, citing an equally dramatic article about “How CAPTCHA got trashed”.  Am I missing something? Ignoring their name for a moment, CAPTCHAs are computer programs, following specific rules, and therefore they are subject to the same cat-and-mouse games that all security mechanisms go through. Where exactly is the surprise? So Google’s or Yahoo’s current versions were cracked.  They’ll soon come up with new tricks, and still newer ones after those are cracked, and so on.

In fact, I was always confused about one aspect of CAPTCHAs. I thought that a Turing test is, by definition, administered by a human, so a “completely-automated Turing-test” is an oxymoron, something like a “liberal conservative”. An unbreakable authentication system based on Turing tests should rely fully on human computation: humans should also be at the end that generates the tests. Let humans come up with questions, using references to images, web site content, and whatever else they can think of.  Then match these to other humans who can gain access to a web service by solving the riddles. Perhaps the tests should also be somehow rated, lest the simple act of logging in turns into an absurd treasure hunt. I’m not exactly sure if and how this could be turned into an addictive game, but I’ll leave that to the experts.  The idea is too obvious to miss anyway.
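
To make the idea a bit more concrete, here is a minimal sketch of such a pool of human-written tests, with the difficulty rating used to keep the treasure hunts out. Everything in it (the `Riddle` and `RiddlePool` names, the 1-to-5 rating scale, the 3.5 cut-off) is my own assumption for illustration, not a description of any existing system.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Riddle:
    """A human-written test; every field name here is hypothetical."""
    question: str
    answer: str
    ratings: list = field(default_factory=list)  # difficulty votes, 1 (easy) to 5 (hard)

    def difficulty(self):
        # Unrated riddles count as medium difficulty until people vote.
        return sum(self.ratings) / len(self.ratings) if self.ratings else 3.0


class RiddlePool:
    def __init__(self, max_difficulty=3.5):
        self.riddles = []
        self.max_difficulty = max_difficulty  # assumed cut-off, tune to taste

    def submit(self, question, answer):
        # A human contributes a new test and its expected answer.
        riddle = Riddle(question, answer.strip().lower())
        self.riddles.append(riddle)
        return riddle

    def pick(self, rng=random):
        # Only serve riddles that raters did not find absurdly hard.
        usable = [r for r in self.riddles if r.difficulty() <= self.max_difficulty]
        return rng.choice(usable) if usable else None

    def check(self, riddle, response, rating=None):
        # The solver may optionally rate the riddle while answering it.
        if rating is not None:
            riddle.ratings.append(rating)
        return response.strip().lower() == riddle.answer
```

The exact-match check is the weakest part of the sketch; a real version would presumably need humans to judge the answers too.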

CAPTCHAs, even in their current form, have led to numerous contributions.  A non-exclusive list, in no particular order:

  1. They have a catchy name. That counts a lot. Seriously. I’m not joking; if you don’t believe me, repeat out loud after me: “I have no idea what ‘onomatopoeia’ is—I’d better MSN-Live it” or “… I’d better Yahoo it.”  Doesn’t quite work, does it?
  2. They popularized an idea which, even if not entirely new, was made accessible to webmasters the world over, and is now used daily by thousands if not millions of people.  What greater measure of success can you think of for a technology?
  3. They sowed the seeds for Luis von Ahn’s viral talk on human computation, which has featured in countless universities, companies and conferences.  Although not professionally designed, the slides’ simplicity matches their content in a Jobs-esque way. As for delivery and timing, Steve might even learn something from this talk (although, in fairness, Steve Jobs probably doesn’t get the chance to introduce the same product hundreds of times).

So is anyone really surprised that the race for smarter tests and authentication mechanisms has not ended, and probably never will? (Incidentally, the lecture video above is from 2006, over three years after the first CAPTCHAs were successfully broken by another computer program; see also the CVPR 2003 paper.)  There are no silver bullets and no technology is perfect, but some are really useful. Perhaps CAPTCHAs are, to some extent, a victim of their own hype which, however, is instrumental and perhaps even necessary for the wide adoption of any useful technology.  I’m pretty sure we’ll see more elaborate tests soon, not less.


Life with three cats

I admit it. For the past year I’ve been running a zoo.  Let me introduce the protagonists, in alphabetical order.

Cats: Portraits

Two of them grew up together (Aki and Kiki), but all three have lived together at different times in the past.  For the most part, they get along just fine. Aki and Alan get into a tussle from time to time, both being a little insecure in their own way.  Kiki doesn’t care much; after all, whenever there is a bad thunderstorm, he’s the one who climbs onto the back of the couch and stares intently at the lightning and thunder, while Alan tries to bury himself behind the toilet seat (as for Aki, I still haven’t managed to figure out exactly what he does at moments like this; he simply seems to disappear).

Anyway, it’s a full house and the cats run it. They have all picked their territories by now, and they have trained me to stick to mine. A queen size bed is too small sometimes.

Cats: Pajama party

The couch can get crowded too.  They haven’t gotten hold of the remote yet, but one of these days they probably will.

Cats: Couch potatoes

Of course, it’s not always like that; we sometimes have more “intimate” moments. Things can get quite hairy (literally: the bed, the clothes, my face—all things).

Cats: Alan intimate

A year ago I used to bemoan our crowded co-existence; now I think the apartment would be too big without them. If you’ve never had a cat and wonder what it might be like, I’ll let George Carlin describe it in his unique way.

I’ve experienced all these (except the outdoor activities, since our cats don’t get the chance), but not all from the same cat. Aki is an obsessive self-petter. Alan is on a bad drug too often (but at least sometimes I manage to convince him to do his thing on my shoulders: a pretty good massage). Kiki is the proudest (and most active) of the three and his ass button is impressive.  I wonder how many cats George Carlin had.  Must have been too many; really, who would have expected this from him?

Web science: what and how?

From the article “Web Science: An Interdisciplinary Approach to Understanding the Web” in the July issue of CACM (which, by the way, looks quite impressive after the editorial overhaul!):

At the micro scale, the Web is an infrastructure of artificial languages and protocols; it is a piece of engineering. […] The macro system, that is, the use of the micro system by many users interacting with one another in often-unpredicted ways, is far more interesting in and of itself and generally must be analyzed in ways that are different from the micro system. […] The essence of our understanding of what succeeds on the Web and how to develop better Web applications is that we must create new ways to understand how to design systems to produce the effect we want.  The best we can do today is design and build in the micro, hoping for the best, but how do we know if we’ve built in the right functionality to ensure the desired macroscale effects? How do we predict other side effects and the emergent properties of the macro? […] Given the breadth of the Web and its inherently multi-user (social) nature, its science is necessarily interdisciplinary, involving at least mathematics, CS, artificial intelligence, sociology, psychology, biology and economics.

This is a noble goal indeed. The Wikipedia article on sociology sounds quite similar in many respects:

Sociologists research macro-structures and processes that organize or affect society […] And, they research micro-processes […] Sociologists often use  quantitative methods—such as social statistics or network analysis—to investigate the structure of a social process or describe patterns in social relationships. Sociologists also often use qualitative methods—such as focused interviews, group discussions and ethnographic methods—to investigate social processes.

First, we have to keep in mind that the current Western notion of “science” is fairly recent.  Furthermore, it has not always been the case that technology follows science. As an example, in the book “A People’s History of Science” by Clifford Conner, one can find the following quotation from Galileo’s Two New Sciences, about Venice’s weapons factory (the Arsenal):

Indeed, I myself, being curious by nature, frequently visit this place for the mere pleasure of observing the work of those who, on account of their superiority over other artisans, we call “first rank men.” Conference with them has often helped me in the investigation of certain effects, including not only those which are striking, but also those which are recondite and almost incredible.

Later on, Conner says (p.284), again quoting Galileo himself from the same source:

[Galileo] demonstrated mathematically that “if projectiles are fired … all having the same speed, but each having a different elevation, the maximum range … will be obtained when the elevation is 45°: the other shots, fired at angles greater or less will have a shorter range.” But in recounting how he arrived at that conclusion, he revealed that his initial inspiration came from discussions at the Arsenal: “From accounts given by gunners, I was already aware of the fact that in the use of cannons and mortars, the maximum range, that is the one in which the shot goes the farthest, is obtained when the elevation is 45°.” Although Galileo’s mathematical analysis of the problem was a valuable original contribution, it did not tell workers at the Arsenal anything they had not previously learned by empirical tests, and had little effect on the practical art of gunnery.
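
As an aside of my own (this is the modern textbook version, not something taken from Conner or from Two New Sciences): if we assume level ground, no air resistance and constant gravity g, the gunners’ 45° rule drops out of the range formula for a shot fired at speed v and elevation θ. The symbols R, v, θ and g are my notation.

```latex
R(\theta) = \frac{v^{2}\,\sin 2\theta}{g},
\qquad
\frac{dR}{d\theta} = \frac{2 v^{2}\,\cos 2\theta}{g} = 0
\;\Longrightarrow\; 2\theta = 90^{\circ}
\;\Longrightarrow\; \theta = 45^{\circ}
```

Since sin 2θ is symmetric about 45°, elevations equally above and below it give the same, shorter range, which is consistent with the quoted statement. With real air resistance the best elevation generally shifts somewhat below 45°, so the idealized analysis and the gunners’ empirical rule are not quite the same piece of knowledge.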

In any case, facilitating “technology” or “engineering” is certainly not the only good reason to pursue scientific knowledge. Conversely, although “pure science” certainly has an important role, it is not the only ingredient of technological progress (something I’ve alluded to in a previous post about, essentially, the venture capital approach to research).  Furthermore, some partly misguided opinions about the future of science have brightly shot through the journalistic sphere.

However, if, for whatever reason, we decide to go the way of science (a worthy pursuit), then I am reminded of the following interview of Richard Feynman by the BBC in 1981 (full programme):

Privacy concerns notwithstanding, the web gives us unprecedented opportunities to collect measurements in quantities and levels of detail that simply were not possible when the venerable state-of-the-art involved, e.g., passing around written notes among a few people. So, perhaps we can now check hypotheses more vigorously and eventually formulate universal laws (in the sense of physics).  Perhaps the web will allow us to prove Feynman wrong.

I’m not entirely convinced that it is possible to get quantitative causal models (a.k.a. laws) of this sort. But if it is, then we need an appropriate experimental apparatus for large-scale data analysis to test hypotheses. What would be, say, the LHC-equivalent for web science?  (Pure science, after all, seems to have an increasing need for powerful apparatuses.) I’ll write some initial thoughts and early observations on this in another post.

I’m pretty sure that my recent posts have been going in circles around something, but I’m not quite sure yet what that is.  In any case, all this seems an interesting direction worth pursuing.  Even though Feynman was sometimes a very harsh critic, we should perhaps remember his words along the way.

