Integrating reCAPTCHA with Django


This is how I added reCAPTCHA captchas to TrenchMice, a Django-powered website.

Background

We didn’t initially build captchas into TrenchMice, because we simply didn’t think they would be necessary.

By September 2006, the site started receiving spam comments.  They were the usual gibberish you see in blog spam: Lots of links, garbage words, and bogus e-mail addresses.  (Whenever I see this stuff, I shake my head and wonder why script kidz waste their time generating it.  Then I remember it’s because, out of the gazillions of spam messages, some recipients click on the links, making spam financially rewarding.  And then I get slightly depressed about the average Internet user.  But, I digress…)

So we broke down and added a captcha system using PIL and our own algorithms.  Our images were simple, as modern captchas go:

TrenchMice captcha example

But they got the job done, with an acceptable load on our servers.

We decided to upgrade the captcha technology for two reasons.

  1. Lately, we’ve noticed more doorknob-jiggling activity.  This hasn’t yet resulted in comment spam, but it indicates the site is getting more attention from spammers.  I don’t want to wait until there’s a successful attack to better secure the site.
  2. We’re not interested in developing a core competency in captcha design.  The initial captchas were easy to do, but we don’t want to invest time into learning the latest and greatest imaging techniques now that more work is required.

Why reCAPTCHA?

Their image algorithms are way more sophisticated than ours, and they believe they’re as good as any out there.

Their system does useful work by correcting OCR text from digitized books.  This is rather cool.

They claim excellent system availability for their users, and expect to be in business for years.  There are no indications to the contrary.

If a hacker cracks their images, they promise to respond quickly by tweaking their algorithms.  So we won’t have to do much besides add our voice to the “Please fix this” thread that would presumably get created in their support newsgroup.

There was nothing to install.  (But PyCrypto was needed for Mailhide.)  The API looked fairly easy.  And it was free.

The deed

I replaced our template captcha code with this. It’s a straight lift from their client API instructions:


Prove you're not a robot:

<script type="text/javascript"><!--mce:0--></script>
<script type="text/javascript"><!--mce:1--></script>
<noscript>
<iframe src="http://api.recaptcha.net/noscript?k=PUBLIC_KEY{{ captcha_error }}" height="300" width="500" frameborder="0"></iframe>

<textarea name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
<input type="hidden" name="recaptcha_response_field" value="manual_challenge"/>
</noscript>

The captcha_error template variable is "&error=ERROR_CODE" if we’re re-displaying a bad form after a POST. Otherwise, it’s an empty string. I vacillated over moving this into the view’s form class for 30 minutes, but I kept it in the template because:

  • TrenchMice has a mix of oldforms and newforms, because we agreed to upgrade pages to newforms only if we edit them for another reason (i.e., fixing a bug or changing the form for some other reason).  We haven’t done all of them yet.  I didn’t want to procedurally trigger an update of the remaining oldforms-based views using captchas; and if I chose to ignore this self-imposed rule, I didn’t want to further burden them with even more code that would have to eventually be updated.
  • This is a case where the control–presentation distinction wasn’t clear.   Forced to choose one or the other, the reCAPTCHA display belongs in presentation, because it’s JavaScript from another site with core algorithms outside of TrenchMice’s control.  Of course, the view has the reCAPTCHA API code to evaluate the user’s response.
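
To make the captcha_error convention concrete, here is a minimal sketch of how the value could be built and where it lands in the noscript URL. The helper names are hypothetical (they are not part of TrenchMice or the reCAPTCHA client), PUBLIC_KEY is a placeholder, and "incorrect-captcha-sol" is used only as a sample error code:

```python
# Hypothetical helpers illustrating the captcha_error convention.
# The URL pattern is the one used in the template's <iframe> tag.
NOSCRIPT_URL = "http://api.recaptcha.net/noscript?k=%s%s"


def make_captcha_error(error_code=None):
    """Return "&error=CODE" after a failed captcha, else an empty string."""
    if error_code:
        return "&error=%s" % error_code
    return ""


def noscript_url(public_key, captcha_error=""):
    """Build the iframe URL the way the template concatenates it."""
    return NOSCRIPT_URL % (public_key, captcha_error)


# First display of the form: nothing is appended to the URL.
print(noscript_url("PUBLIC_KEY"))
# Re-display after a bad answer: the error code rides along in the query string.
print(noscript_url("PUBLIC_KEY", make_captcha_error("incorrect-captcha-sol")))
```

Because the value is either a complete "&error=..." fragment or an empty string, the template can append it unconditionally, with no branching.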

I replaced our view captcha code with this. It uses the Python recaptcha-client. (Warning, recaptcha-client didn’t install properly on my system without my tweaking the package files. YMMV.):

from recaptcha.client import captcha

# Initialize to an empty string, not None, so the reCAPTCHA call query string
# will be correct if there wasn't a captcha error on POST.
captcha_error = ""

if request.method == 'POST':
    ## ...code snipped...

    # Check the form captcha.  If not good, pass the template an error code
    captcha_response = \
        captcha.submit(request.POST.get("recaptcha_challenge_field", None),
                       request.POST.get("recaptcha_response_field", None),
                       RECAPTCHA_PRIVATE_KEY,
                       request.META.get("REMOTE_ADDR", None))

    if not captcha_response.is_valid:
        captcha_error = "&error=%s" % captcha_response.error_code

    elif form.is_valid():
        ## ...
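
The same check can be tried without Django or a network connection. The sketch below is an assumption-laden stand-in, not our production code: check_captcha, fake_submit, and FakeResponse are hypothetical names, and fake_submit merely imitates the argument order and the is_valid / error_code attributes used in the view above.

```python
from collections import namedtuple

# Stand-in for the verifier's response object; only the two attributes
# the view relies on (is_valid, error_code) are modeled.
FakeResponse = namedtuple("FakeResponse", ["is_valid", "error_code"])


def check_captcha(post, meta, private_key, submit):
    """Return "" if the captcha passed, else an "&error=..." string.

    `submit` is any callable taking (challenge, response, private key,
    remote IP), the same argument order as the call in the view.
    """
    response = submit(post.get("recaptcha_challenge_field"),
                      post.get("recaptcha_response_field"),
                      private_key,
                      meta.get("REMOTE_ADDR"))
    if response.is_valid:
        return ""
    return "&error=%s" % response.error_code


def fake_submit(challenge, response, private_key, remote_ip):
    """Hypothetical verifier: accepts only the answer "open sesame"."""
    if response == "open sesame":
        return FakeResponse(True, None)
    return FakeResponse(False, "incorrect-captcha-sol")


# Plain dicts play the role of request.POST and request.META.
post = {"recaptcha_challenge_field": "abc",
        "recaptcha_response_field": "wrong"}
meta = {"REMOTE_ADDR": "127.0.0.1"}
print(repr(check_captcha(post, meta, "PRIVATE_KEY", fake_submit)))
```

Note that both form fields are read from the POST data; only the client address comes from META.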

I also swapped out our PIL-based e-mail address obfuscation for the reCAPTCHA Mailhide API. Recaptcha-client had code for this too, and it was easy to hook up. So easy that I won’t bother writing about it. 🙂

The end result

The reCAPTCHA captchas work great, and the total amount of view and template code decreased.  Our simple captchas had small view hacks to handle the case of re-displaying a form that had a good captcha response but a problem in another field.  That code, however minor, is now gone.  We also had a background script to clean the captcha image file directory — gone.  We also had a font directory for the images — gone.

Visually, the styling isn’t completely in keeping with the rest of the page.  But it’s perfectly acceptable. We had been displaying a simple blue box with green text, and I can’t claim that was visually wonderful.

18 thoughts on “Integrating reCAPTCHA with Django”

  1. Hi,

    I’m one of the engineers on the reCAPTCHA team. Glad to hear you are happy with reCAPTCHA.

    About the styling, did you see http://recaptcha.net/apidocs/captcha/client.html? It should let you customize the look and feel to your heart’s content. If you didn’t find that, I’d appreciate it if you could shoot us an email (support@recaptcha.net) telling us where you looked for this information. We want to make the theming easier to find (I think it’s a slick API, but I’m sort of biased).

    – Ben

  2. Hi my name is Harish, and I am a consultant working through http://www.floresense.com

    You wrote a nice experience summary.. and enjoyed reading it, since it was in the area of my recent interests. CAPTCHA.

    We developed a captcha control for .net2.0 after experiences with other captchas and if you may be interested, you try downloading it from http://www.floresense.com.. I will be happy to have feedback from a person like you using it.

    We also have a free, simple captcha service that might be interesting for small sites as well.

    You can find these at our website http://www.floresense.com

    Thanks for your time

  3. @Ben:

    Yes, I did read that section. But reCAPTCHA’s styling with the white theme is sufficient for our needs, so we didn’t feel the need to use your custom theme hook.

    Our captcha images heretofore haven’t stylistically been anything to write home about. What I didn’t say, but should have clearly said, is that reCAPTCHA is slicker looking than our simple blue rectangle, so it’s actually an improvement.

    If we sometime have time to burn (ha!), we’ll look at better styling for it.

  4. @Harish:

    Thanks for writing about your company’s captcha control.

    From a cursory reading, it looks like a nice solution for a .NET shop. But TrenchMice is written in Django running on Linux!

    Your free “RND Plate” service might be applicable. (Why’d you give it that odd name?) But we’re happy with reCAPTCHA’s availability, strength, and the small social service it performs.

  5. Hiya, you’ve mentioned that you’ve changed the recaptcha view code to:

    captcha_error = ""

    if request.method == 'POST':

    ## etc..

    is that inserted into the view JavaScript code itself or is that in handler method on top? I’m trying to use reCaptcha with Python but having trouble getting request.META[‘request_challenge_field’].

  6. I gotta say, the comment by “Harish” is some damn clever spam–and pretty ironic considering this is a post about captchas, of all things.

    Notice how he doesn’t actually address the content of the post, merely referring to it as an “experience summary”–terms that could be applied to almost any blog post.

    Also note how he includes the link to his website three times. And note that the **product** he is **promoting** has nothing to do with Django–it’s written for a totally different platform.

    My guess is “Harish” is really a spam bot programmed to leave comments on blog posts that it infers are discussions regarding captchas.

    It’s pretty ironic the double standard most people have regarding promotion. My guess is most folks would feel significantly uncomfortable writing a comment as self-promotional as Harish’s, but when Harish does it we don’t bat an eye.

    Oh yeah, thanks for writing this blog post.

    1. I approved his comment because his company’s site (I didn’t miss the fact that he mentioned it three times) was real, and did offer a captcha solution.

      You may be right about Harish being a bot. I considered that, and ultimately decided there was less than a 50% chance of it being true. If so, it is, as you note, quite clever.

      I chose to err on the side of believing that Harish was just a decent individual trying to market his company’s services. If I chose to be fooled, well, it wouldn’t be the first time.

  7. This is just me saying “Thank you for this post”.

    Great writeup, exactly what I was looking for. Your examples were spot on and I really enjoyed your reasoning.

    This wasn’t exactly what I needed, but it got me 95% there and saved me a lot of trouble.

  8. You know, @Harish is not really spam. Mentioning your website thrice in a post just means you are emphasising it. If I were to mention my site, that is http://www.inchis.com, a number of times while referring to a problem, it would not make it spam. Think about that. Spam is a web designer’s nightmare that we need a solution to.
