Tag Archives: Python


tl;dr: Think about exceptions when writing a context manager.

I made a huge unforced error with a context manager at work.

We use Redis distributed locks for system synchronization. I wanted a context manager that acquired n locks, executed protected code, and then released the n locks in reverse order. It would be simple to use:

from common.util import Semaphore, distlock

semaphore1 = Semaphore(OwnerDisambiguationUpdate.UPDATE_LOCK)
semaphore2 = Semaphore(USMaintenanceFeeUpdate.UPDATE_LOCK)

with distlock(semaphore1, semaphore2):
    do_some_work()

(The Semaphore class does other work with aborting Celery tasks, but that’s not germane here. It’s a Redis distributed lock with extra fanciness.)
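A minimal sketch of such a context manager, to make the exception question concrete. This is not the code from work: FakeLock is a hypothetical stand-in for Semaphore, and the acquire()/release() method names are assumptions. It uses contextlib.ExitStack so that locks already acquired get released in reverse order, whether the protected code raises or a later acquire fails.

```python
from contextlib import ExitStack, contextmanager


class FakeLock:
    """Hypothetical stand-in for Semaphore; records calls in a shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def acquire(self):
        self.log.append(("acquire", self.name))

    def release(self):
        self.log.append(("release", self.name))


@contextmanager
def distlock(*locks):
    """Acquire locks in order; release them in reverse order, even on error."""
    with ExitStack() as stack:
        for lock in locks:
            lock.acquire()
            # Callbacks run last-in, first-out when the stack unwinds,
            # including when an exception propagates out of the with block.
            stack.callback(lock.release)
        yield
```

The point of the design is that release is registered immediately after each successful acquire, so there is no window in which a held lock has no cleanup attached to it.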



An update to an earlier post

We’ve had problems using the pyrax SDK, mostly in account authentication.

First, it wasn’t at all clear when, or under what conditions, we had to re-authenticate our pyrax token. As documented, after you initially authenticate your credentials, pyrax handles all subsequent re-authentication under the covers. I.e., it will automatically re-authenticate the token if it ever expires.

This is kind of odd. I don’t understand why a good token should need re-authentication.

We then discovered that pyrax sometimes can’t re-authenticate our token! Every 19 hours, we hit a period of about five hours when our token won’t automatically re-authenticate. Why? I still don’t have a clear answer. Some authentication server, somewhere, clearly gets confused. You won’t run into this bug if you don’t have long-running processes. But we do.



We host IP Street’s SaaS product at Rackspace. We’re finally taking the plunge and upgrading from python-cloudfiles to pyrax. We didn’t have any big issues with python-cloudfiles, but I was tiring of getting the brush-off from Rackspace when we asked for help with an API failure.

The benefits of keeping a technology up-to-date far outweigh the costs, unless you’re in an extreme corner case with a very unreliable vendor. Better performance, bug fixes, better capabilities, better support… all good stuff.



I’ve found some candidates for replacing Celery in my company’s product. (My reasons for replacing it are elucidated here, here, and here.)

I got these from web trawling, blog comments, and some e-mail. At first blush, none of the candidates have any disqualifying attributes, except for lacking subtasks. Celery is the only Python-friendly asynchronous task technology with subtask support, so I’ll need to bend on that if I want any alternatives to consider. (If I’m wrong on this point, please let me know in the comments!)

I’m not saying that these candidates will definitely satisfy all (sans subtasks) of my requirements. Right now they’ve just passed my initial sniff test. The next step will be to read each candidate’s documentation in detail, assess the health and activity of its community and developers, and try some sample code.



I’m ready to start looking at candidates to replace Celery in my company’s product. (The reasons are elucidated here, here, and here.)

Our SaaS product provides data mining and visualization for intellectual property. A 10-second elevator pitch is, it’s as though we attached Microsoft Excel’s chart wizard to US and international patent offices. (“As though” = “We didn’t do that, and in fact we go way beyond that, but I’m giving you a simple description.”) Our code is 100% Django and Python.

I looked at how we use Celery in our codebase. The reality of how we use it is much simpler than our ideas when we started two years ago. Combining our existing features with our product roadmap, I know with high confidence which features we need for our asynchronous tasks, which ones are nice to have but not required, and which ones we’ll probably never need.



Commenting on my update to my Celery rant, @asksol asked me to post the Pylint results that made me question the claim of backwards compatibility.

(“@Asksol asked” — See what I did there? That’s alliteration. It’s a sign of a quality blog post. Ask for it by name.)

Again for the record, @asksol is a smart and friendly person. I know I wouldn’t last a day supporting a project the way he has supported Celery over multiple years. I’ve calmed down since yesterday, and I hope that something good results from my rant — if not for me, then for a future Celery user needing upgrade help. In his reply to my rant, @asksol describes some history and rationale for how he manages code change, and I encourage you to read it.

Here we go:



An update to my rant on Celery’s frequently-changing API: I’ve decided to stay with Django-celery 2.5.5 and Celery 2.5.3.

When I tried using Celery 3.0.4 with my existing code, Pylint threw about 60 warnings, many of which look real and none of which were there when I used Celery 2.5.3.

“Backwards-compatible” my ass!

I shouldn’t have to chase my tail like this. Celery, you lost me. I’m now looking to replace you.


This is a rant.

My company’s code base is over 65K lines of Python and JavaScript code. We use Celery, Django-Celery, and RabbitMQ for our background asynchronous tasks. Ten different tasks.py files contain 30 task classes, split roughly 50-50 between periodic and on-demand. We use subtasks.

Today, I dug into updating from Celery 2.5.3 to 3.0.4, and I popped my cork.

I am aggravated by the frequency and extent of Celery API changes. It’s easily changed more often than any other five technologies in our stack combined. I’ve been upgrading Celery and Django-celery every six months or so, which corresponds to upgrading every few minor versions. And the changes are similar in scope to what I see when upgrading any other technology across one or two major versions.



Sometimes you don’t write unit tests. Your reason for not doing so always falls into one of two categories.

Complexity

The code you just wrote would be so much easier to test using system-level testing. For example…

  • The setup and teardown would be 10x the test code.
  • There’s too much interaction with multiple data stores or third-party vendors.
  • Your dev boxes or CI server don’t all have the necessary technology installed.

These are rational reasons to not write unit tests for new code. You’re fine.

Simplicity

But sometimes you don’t write unit tests because the code you just wrote is so darn obvious.

It’s really simple. It’s straightforward. It’s nearly trivial. Why bother writing unit tests for it?

Well, I’ll tell you why you should test it. In fact I’ll give you three reasons.


I was in some code I haven’t visited in a while. And I came upon something I coded months ago.

It used a list comprehension to test every element of a list. If the result was empty, it signaled an error. Otherwise, it used result[0].

Gah! That’s so retarded! Was I asleep when I wrote that?

I changed it to a generator expression with a .next(), within a try/except on StopIteration.

One fewer line of code, and it’s faster.

I felt good.
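For anyone who wants to see the difference, here is a rough sketch of the two versions. The function names, the predicate argument, and the LookupError are made up for illustration; the original code isn't shown. It is written for Python 3, where the built-in next() replaces the .next() method.

```python
def first_match_list(items, predicate):
    """Before: build the whole list of matches, then look at the first one."""
    matches = [item for item in items if predicate(item)]
    if not matches:
        raise LookupError("no matching element")
    return matches[0]


def first_match_gen(items, predicate):
    """After: stop at the first match instead of testing every element."""
    try:
        return next(item for item in items if predicate(item))
    except StopIteration:
        # The generator was exhausted without finding a match.
        raise LookupError("no matching element") from None
```

The generator version is faster because it stops iterating as soon as one element passes the test, while the list comprehension always walks the entire list.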


tl;dr

When writing tests, mock out a subsystem if and only if it’s prohibitive to test against the real thing.

!tl;dr

Our product uses Redis. It’s an awesome technology.

We’ve avoided needing Redis in our unit tests. But when I added a product feature that made deep use of Redis, I wrote its unit tests to use it, and changed our development fabfile to instantiate a test Redis server when running the unit tests locally.

(A QA purist might argue that unit tests should never touch major system components outside of the unit under test. I prefer to do as much testing as possible in unit tests, provided they don’t take too long to run, and setup and teardown aren’t too much of a PITA.)

This was a contributory reason for our builds now failing on our Hudson CI server. Redis wasn’t installed on it!

Why didn’t I immediately install Redis on our CI server?

  1. Our CI server had other problems
  2. I intended to nuke it and re-create it with the latest version of Jenkins. I just needed to first clear some things off my plate
  3. Our dev team had shrunk down to just two people
  4. We were both strict about running unit tests before checking code into the pool
  5. We were up to our necks in other alligators

From a test-quality perspective, if code uses X in production, it’s better for tests to run with X than with a simulation of X.

One of the many joys of working with Ryan is that he challenges my assumptions and makes me consider alternatives. Because of a perceived lack of elegance in needing Redis on our CI server, and because his work had been temporarily blocked by my code changes, he challenged me to replace my unit tests’ use of Redis with a mock.

I walked into work yesterday and it was quiet. All our critical bugs blocking Saturday’s release were closed. I thought, why not? I’ll give it a go. Today’s a good day to see what’s involved with replacing Redis with a mock!
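As a rough idea of what that swap can look like, here is a minimal sketch using the standard library's unittest.mock. The cache_greeting function and its key names are hypothetical, and the sketch assumes the code under test receives its Redis client as an argument and only calls set() and get() on it.

```python
from unittest import mock


def cache_greeting(client, user):
    """Hypothetical code under test: store a greeting, then read it back."""
    client.set("greeting:" + user, "hello " + user)
    return client.get("greeting:" + user)


# The mock stands in for a Redis client. It does not store anything;
# get() returns only what the test configures it to return.
fake_client = mock.Mock()
fake_client.get.return_value = "hello ryan"

result = cache_greeting(fake_client, "ryan")

# The test can verify how the client was called, not how Redis behaves.
fake_client.set.assert_called_once_with("greeting:ryan", "hello ryan")
fake_client.get.assert_called_once_with("greeting:ryan")
```

Note the trade-off this makes visible: the test passes even if the real Redis would have behaved differently, because the mock only echoes back the configured value.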
