CAPTCHA vs Human Logic
March 20, 2007 Posted by Tyler Cruz
After installing the spam-protected contact form plug-in a week ago, I immediately wondered why all spam-prevention systems didn’t use basic human logic questions instead of CAPTCHA technology.
By human logic questions I’m referring to questions such as “What is 2+2” and “What sound does a cow make?” where the user is to give the correct answer, as opposed to the image verification system, or CAPTCHA. The recent influx of such makeshift questions is an attempt to help solve the problem of automated spambots while CAPTCHA’s security continues to worsen.
I discovered CAPTCHA’s flaws firsthand when my various vBulletin forums started receiving mad amounts of spambots, or automated spam robots/scripts. CAPTCHA seemed like a good preventative solution, but unfortunately malicious programmers have found a way to actually program their bots to solve the image verifications. Now, CAPTCHA systems respond by making their images even more difficult to read, and while that does make it harder for the programmers, it also makes it a lot harder for legitimate users to decipher them, as the images are so badly warped!
When I upgraded vBulletin, the improved CAPTCHA system they were using definitely helped, and removed 99% of the spam. However, it’s only a matter of time before programmers “crack” the new CAPTCHA systems. It’s a neverending arms race.
That brings us back to why people have been recently using the human logic questions. What I want to know is: why don’t we use the human logic system for everything?
Please correct me if I’m wrong, but I don’t see how a spambot could ever answer such questions. Oh, sure, in a decade or two AI will have improved to the point where spambots could answer questions such as “What is two plus two?” or “How many days are there in March?”, but would they be able to answer questions such as “How many toes does a one legged man have?”
Even if they took the most common questions such as “2+2=?” and “What year is it” and added them into a database, all people would have to do would be to create their own unique questions and answers, such as my “How many toes does a one legged man have?” riddle above.
It’s a lot more fun to answer than trying to figure out a bunch of garbled numbers and letters, too. In fact, whenever I have to use a CAPTCHA verification, I often get it wrong at least once before being able to continue.
Am I missing something here, though? Why bother with CAPTCHA? Why not just use human logic questions? I don’t see how spambots can answer them, it’s a lot simpler, and it’s a lot easier on the user than having to squint to read a bunch of garbled text.
More information on CAPTCHA from Wikipedia.
Tyler,
I tried to buy gymnasticsforums.com from you one night, that’s when I started reading your blog. Since then I have decided to start a blog…greatly inspired by you.
I have a post that you might like, it’s a mod for vBulletin that does just what you want:
http://www.jbslife.com/my-1-vbulletin-mod-cant-live-without-this-one/
The mod is called NoSpam! It works great. Human logic at its best.
Thanks.. can’t seem to be able to access your blog though…
Do you really think that bots can’t answer 2+2? It takes 2 seconds to write a regexp that will take 2 numbers in a string and add them together. As for the “what noise does a cow make”, there are a number of problems:
Accessibility – What about people who don’t speak English as their primary language? People who have never seen a cow?
Limited questions – There’s only a limited number of questions you can have, and soon enough bots will have indexed all the default ones, or indexed all the ones from popular boards. They only need to be human solved once, and then the bot can cache the question and answer and spam away.
These make it not a great choice for the developers to include by default. They want their software as accessible as possible, and 99% of users won’t change the default questions – rendering it useless once the bots have a hold of them.
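To illustrate the caching point above, here is a minimal Python sketch of how little a bot would need in order to reuse an answer once a human has solved a question a single time. The `AnswerCache` class and `normalize` helper are hypothetical names made up for this illustration, not code from any real spam tool:

```python
import re


def normalize(question: str) -> str:
    """Lowercase the question and strip punctuation and extra spaces,
    so trivially reworded copies of a question map to the same key."""
    cleaned = re.sub(r"[^a-z0-9+ ]", " ", question.lower())
    return " ".join(cleaned.split())


class AnswerCache:
    """Hypothetical store of question/answer pairs solved once by a human."""

    def __init__(self):
        self._answers = {}

    def learn(self, question: str, answer: str) -> None:
        # Record an answer that a person supplied for this question.
        self._answers[normalize(question)] = answer

    def lookup(self, question: str):
        # Return the cached answer, or None if the question is new.
        return self._answers.get(normalize(question))


cache = AnswerCache()
cache.learn("What sound does a cow make?", "moo")
print(cache.lookup("what sound does a COW make"))
print(cache.lookup("How many toes does a one legged man have?"))
```

A default question shipped with popular software only has to be learned once to be answered everywhere, while a site-specific question stays unanswered until someone solves it by hand.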
At my site’s registration page I have tried a different human logic test (http://exaltic.com/register). It does limit registration to people with basic English, but since the site is in English that’s required anyway. Of course, if the bots were to cache the images and cache the answers they could beat it, but I plan to introduce noise to the images.
Tyler, in addition to the other well thought out reasons mentioned, could it be that there are a lot of people out there who can’t figure out the sum of 2 plus 2? And blog owners who want them anyway?
Seriously, it’s much easier to write code to parse text for regular expressions like “2 + 2” than the complex code needed to parse all the pixels in a common CAPTCHA image.
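As a rough illustration of that claim, here is a short Python sketch that answers simply presented arithmetic questions with one regular expression. It assumes the question contains two small numbers (digits or the words zero to ten) joined by a single operator; the names `solve` and `NUMBER_WORDS` are made up for this example:

```python
import operator
import re

# Spelled-out numbers the sketch understands (an assumption, easy to extend).
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
OPERATORS = {
    "+": operator.add, "plus": operator.add,
    "-": operator.sub, "minus": operator.sub,
    "*": operator.mul, "times": operator.mul,
}

_NUM = r"\b(\d+|" + "|".join(NUMBER_WORDS) + r")\b"
_OP = r"(\+|-|\*|plus|minus|times)"
PATTERN = re.compile(_NUM + r"\s*" + _OP + r"\s*" + _NUM, re.IGNORECASE)


def to_int(token: str) -> int:
    # Digits are converted directly, words are looked up in the table.
    return int(token) if token.isdigit() else NUMBER_WORDS[token.lower()]


def solve(question: str):
    """Return the answer to the first 'a op b' found, or None if none is found."""
    match = PATTERN.search(question)
    if match is None:
        return None
    left, op, right = match.groups()
    return OPERATORS[op.lower()](to_int(left), to_int(right))


print(solve("What is 2+2"))
print(solve("What is two plus two?"))
print(solve("What sound does a cow make?"))
```

Anything that is not plain arithmetic falls through and returns `None`, which is where the riddle-style questions would still hold up against this particular trick.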
I do agree though it is downright difficult, especially for someone like me who is slightly dyslexic. It certainly won’t meet the requirements of sites that have to be accessible to the disabled.
Don’t know the good answer, just know that simple math problems aren’t really a solution … and a one legged man … well he could have from zero to 5 toes … maybe more if you allow for birth defects :)
The first 7 mins of this video by Luis von Ahn on Human Computation talk about CAPTCHAs and an “alleged” method that pornographers are using to solve CAPTCHAs, which can just as easily be used to solve any “human logic” question:
Display the captcha/question/riddle to their users and ask them to solve it :)
http://video.google.co.uk/url?docid=-8246463980976635143&esrc=sr9&ev=v&q=captcha&vidurl=http://video.google.co.uk/videoplay%3Fdocid%3D-8246463980976635143%26q%3Dcaptcha&usg=AL29H23AskyDOHvhib9GL58K8Qr8pKPmVA
Sorry…I’ve been having some cross browser probs, thought I had it fixed.
Anyways, I believe that CAPTCHA + Human Logic is the best way to go. You get the best of both worlds. It may be possible to beat both, but since I installed NoSpam! (in conjunction with CAPTCHA) on my vBulletin boards I have had no spam.
Results don’t lie…you can find NoSpam! at vBulletin.org.
As mentioned for the logical questions, a parsing of the registration form would be the only thing necessary to bypass this.
For the linguistic questions, sure it might be effective for a very short period, but as more sites use this, it will spawn big databases where the spambots would compare the question against a sea of already identified ones.
Not to mention the way these could be bypassed using a simple brute-force method. Sure you could set the maximum tries to 3, but then again people make mistakes too, and that could cause you to lose a potential member.
For now I go with a method proven quite good, but the question is how long before the solution will be saved in one of the databases and no longer effective.
Right now I have, let’s say, 5 radio buttons, and simply tell people to check 3 of them if they aren’t.
Still this could be overcome with a simple trial and error method.
Another option, though not a realistic one, is the sound method, where a recording reads out the letters/numbers that you should type in, but again this would hurt people who do not have sound available or are deaf.
And it wouldn’t take long before the bots had voice recognition.
You want human logic? You’ve got human logic. Good luck trying to solve those!
“Results don’t lie…” Yes, and I know multiple websites that have NoSpam and still get spam. Your website is most likely too small for bots to target.
it’s a horrible idea.
math is out of the question. it is very easy to write code to solve any simply presented math problem. if you obfuscate it or make it more complex than single-digit addition, then you will lose a significant part of your userbase. try asking someone on the street what 6*8 is. won’t be pretty.
and about “What sound does a cow make?”… on the forum i run, there are many people who aren’t native english speakers and they make valuable contributions. they don’t know that a cow says moo in english. in spanish, una vaca dice mu. you get the idea.
more complicated logic questions will confuse many people.
captchas still are the best solution at the moment.
Whether you can beat it or not is beside the point. It is working on my site very well in conjunction with CAPTCHA. CAPTCHA alone was not stopping everything.
It may only work for a short period of time, but it is working right now.
If this article was about how well CAPTCHA worked, everyone would chime in the opposite way. It can be beat as well. It’s about slowing people down…adding a few extra speed bumps.
Funny…people say “The Club” will not protect a car. They say alarms can be defeated and electronic kill switches are easy to find. Maybe…but I have all three. It’s gonna take you an extra 30 seconds to steal my truck. Why not just steal the BMW next to mine…it would be easier.
If you do happen to steal my truck, it’s insured. If you kill my site, it’s backed up.
OK, I’m the other Tyler :)
I’ve used the simple 5 character CAPTCHAs and had no problems (yet), but I was thinking of using human logic if they ever got out of hand.
I never thought about using RegExp to parse the question to figure out the answer. But then how many people might have troubles reading: two plus two equals ?
Now some of you have mentioned that it might be hard for those who do not have English as their primary language to read that type of question and understand it, but if your blog is mostly in English, wouldn’t they want to respond in English? Though I did recently have to translate a question someone asked me in Spanish into English using Babelfish, but then I was only brought up with English.
I’ll stick to the character CAPTCHAs for now, perhaps mix it up a bit and use simple images? Like a house or a tree? *shrug*
to the other tyler:
just because they are not perfectly fluent in english doesn’t mean they have nothing worthwhile to say. there’s also cultural differences to worry about, what with all the different dialects of english in different countries.
Most blog spam bots are very stupid. I have done (non-scientific) experiments on them, looking for an effective way to reduce spam WITHOUT causing an additional hassle to the user. My experiments confirmed what I thought – most bots do not even read the form, so filling a hidden field with a secret is enough to stop them. The ones that do read the form don’t execute JavaScript, so filling another field with JavaScript will stop them.
I have created a WordPress plugin based on my findings. It can be found here: http://www.paulbutler.org/archives/preventing-comment-spam-with-javascript-bot-detection/
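The two checks described in this comment could be sketched on the server side roughly as follows. This is only an illustration in Python and not the plugin’s actual code: the field names `hidden_secret` and `js_secret`, the `SECRET_KEY` value, and the idea that the page’s script submits the token reversed are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical server-side key; a real site would keep its own private value.
SECRET_KEY = b"replace-with-a-private-key"


def make_token(form_id: str) -> str:
    """Secret that the server writes into the hidden field of a given form."""
    return hmac.new(SECRET_KEY, form_id.encode(), hashlib.sha256).hexdigest()


def _same(a: str, b: str) -> bool:
    # Constant-time comparison on bytes, safe for arbitrary submitted text.
    return hmac.compare_digest(a.encode(), b.encode())


def is_probably_human(submitted: dict, form_id: str) -> bool:
    """Accept a submission only if both bot checks pass.

    hidden_secret: pre-filled by the server, so a bot that posts without
                   reading the form will not send it.
    js_secret:     assumed to be filled in by a script on the page (here,
                   the token reversed), so a bot that reads the form but
                   does not run JavaScript will not send it.
    """
    token = make_token(form_id)
    hidden_ok = _same(submitted.get("hidden_secret", ""), token)
    js_ok = _same(submitted.get("js_secret", ""), token[::-1])
    return hidden_ok and js_ok


token = make_token("comment-form")
print(is_probably_human({"hidden_secret": token, "js_secret": token[::-1]}, "comment-form"))
print(is_probably_human({"hidden_secret": token}, "comment-form"))
```

Neither check asks the visitor to do anything, which is the point of the approach: the cost falls on the bot rather than on the user.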
[…] a few articles about Internet related and blog related topics. One great article I read was Captcha vs. Human Logic which is right on and the truth is so […]
What some of you are missing/forgetting/… is that CAPTCHAs are inaccessible to blind users.
Even some fully sighted people have issues with CAPTCHAs.
I agree that perhaps “What sound does a cow make?” might not be the best logic question, but it’s more accessible than image CAPTCHAs.
Is there a perfect solution? No. Will there ever be? No.
It’s all about what’s going to work best for your circumstances, whether that be an image CAPTCHA, a logic CAPTCHA or some other form of spambot detection.
The problem(s) with this is that the tests have to be come up with manually. Thus the CA part of CAPTCHA [completely automated] does not stand. There is a finite set of questions that can be asked, and bots can still come up with the answer to something as simple as arithmetic or basic trivia. You also have the gray line that’s drawn between basic logic and advanced logic, the latter of which introduces regional conflict.
As the commenter above me suggests, and I agree, it’s all about circumstances. Websites like Blogger.com and Digg are best off with the more robust, ubiquitous visual/audio letter/number CAPTCHA because it is most resistant to attack. However, spam bots in general are playing the numbers game… differentiating your blog/form from others simply by renaming a form field will prevent a lot of spam. That works in a situation (which I believe is the most common) where the form isn’t necessarily a target on its own, but is a target because it’s a known vulnerable form (such as a WordPress blog).
[…] just a personal preference/suggestion, but it’d be nice if they changed that to use a “Human Logic” question such as “What is the opposite of day?” or “What day comes after […]
Hi!!! Hope you are doing well. We the leading Data processing company in Bangladesh. Presently we are processing 300000+ captcha per day by our 55 operators. We have a well set up and We can give the law rate for the captcha solving.
Our rate $2 per 1000 captcha.
We just wanna make the relationship for long terms. can we go forward? Thank you, (For inquiry amir4@yours.com or
khoknaa@yahoo.com)
Best Regards
Amir Hossain Dewan
Data Home Ltd.
amir4@yours.com
khoknaa@yahoo.com
..and right there, my friends, we have the reason why all verification techniques will ultimately fall– unscrupulous people who throw humans at the problem.
Hey,
I’m a fan of logic captchas over image ones. While it’s true nothing will be 100% effective, answering a simple logic question is a better experience for most users than deciphering an increasingly complex image. This looks like a good service:
http://textcaptcha.com/
I’ve also found a honeypot technique successful:
http://haacked.com/archive/2007/09/11/honeypot-captcha.aspx
I just got out of a course about accessibility and we talked about logical captchas. I found textcaptcha.com with a very quick Google search. Looking at the questions, I thought it would be easy to write a program that solves their captchas. I was right: in less than 5 hours of coding, I had written a Python script (380 lines) that could run for hours without finding a captcha that it couldn’t solve.
I am still looking for a good logical captcha provider…