Tag Archives: Postgres


I’m at work on a Saturday, doing some database munging. We have a large update that requires dropping a bunch of rows and making a schema change.

I wish we had a DBA on staff for times like this. Or maybe a kick-ass local consultant whom we could bring in from time to time.

The row drops are taking forever. We don’t use triggers, but we of course have FKs and indexes. I’ll bet a savvy DBA would know some tricks to make the drops go faster. Drop indexes first? Don’t use a transaction? Inhibit table scanning? Something something something.
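
One commonly suggested trick for big deletes is to work in small batches and commit after each one, so that no single transaction holds its locks for the whole run. It won't necessarily finish sooner, but it keeps each step short and lets you watch progress. Below is a minimal sketch of that idea. It uses Python's built-in sqlite3 purely as a stand-in so it runs anywhere; the `widget` table, `stale` column, and `delete_in_batches` helper are made-up names for illustration. With Postgres and psycopg2 the placeholder would be `%s` rather than `?`.

```python
import sqlite3


def delete_in_batches(conn, table, where_sql, batch_size=1000):
    """Delete rows matching where_sql a batch at a time; return the total deleted.

    Assumes the table has an integer primary key column named "id".
    table and where_sql are interpolated directly, so they must be
    trusted strings, never user input.
    """
    total = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE id IN "
            f"(SELECT id FROM {table} WHERE {where_sql} LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # end the transaction after every batch
        if cur.rowcount == 0:
            break  # nothing left that matches
        total += cur.rowcount
    return total
```

Separately, when the rows being deleted are referenced by foreign keys in other tables, it is worth checking that those referencing columns are indexed, since Postgres does not create such indexes automatically.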

I know what good database behavior looks like in our application, I know measurement techniques and a few performance tricks (especially with Django), and I know enough to know what I don’t know, which is always the most important part. But table-munging tricks I’m not so hot on. It’s not for lack of desire; there are only so many hours in the day.


Here’s another cautionary performance tale, wherein I thought I was clever but was not.

A table (“Vital”) holds widget information. Another table (“Furball”) holds other information, with an M:M relationship to Vital.

We want to do inferential computations on filtered Furball rows. So we generate a pk list from a Vital QuerySet, and call this function:

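def _get_top(vitals):
    # Furball is the model described above; its import is not shown here.
    from django.db.models import Count

    TOP_NUMBER = 5

    # Iterating the QuerySet here builds the pk list in Python.
    vitalids = [x.id for x in vitals]
    # Count the matching Vitals per Furball and keep the five largest counts.
    top_balls = Furball.objects.filter(vital__id__in=vitalids)\
                            .annotate(count=Count('id'))\
                            .order_by('-count')[:TOP_NUMBER]
    top_list = [(x.name, x.count) for x in top_balls]

    return top_list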
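
To make the intent concrete without a database, here is a rough pure-Python sketch of what the query above appears to compute: for each Furball, how many of the selected Vitals it is linked to, keeping the five largest counts. The `links` pairs, the names in them, and the `top_furballs` helper are all hypothetical; this illustrates the result, not the SQL the ORM generates.

```python
from collections import Counter


def top_furballs(links, vital_ids, top_number=5):
    """Return (furball_name, count) pairs, largest count first.

    links is an iterable of (furball_name, vital_id) pairs standing in
    for the rows of the M:M join table.
    """
    wanted = set(vital_ids)
    # One tally per join row whose Vital is in the filtered set.
    counts = Counter(name for name, vital_id in links if vital_id in wanted)
    return counts.most_common(top_number)


links = [("fuzzy", 1), ("fuzzy", 2), ("sleek", 2), ("matted", 3)]
print(top_furballs(links, [1, 2]))
```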



At work, we’ve contracted with PostgreSQL Experts to help us improve our Postgres performance. After analyzing our system, one of their consultants, Christophe Pettus, found glaring problems in how some of my code accessed our database.

I consider myself well-informed about good database access practices in Django, and in general. I might not exactly hit the bull’s-eye, but I’m sufficiently savvy to avoid making a “WTF” mistake, right?

Nope!


19:00: Checking the PostgreSQL BOF session. Oh, Selena’s here, that’s a +1. News and tidbits about Postgres 9… I made a lame joke about Postgres running on Android, and the response was a serious, “I don’t think so, not yet.” (The times, they are a-changin’.) Postgres’ site will be migrated to Django. Hot-standby replication and streaming replication. Automatic join removal and optimization of ORM-generated queries. Some disparaging comments about the SQL generated by Rails.

18:41: Dinner was a quick bite at a Subway. Then after I return to the hacker lounge, there’s a call for a group to go to a sushi place. argh!

16:45: import rdma: Zero-copy networking with RDMA and Python. Interesting talk about kernel-mode and user-mode buffered I/O, and the consequences of buffer copies in the socket interface. Locking down memory regions used for I/O feels like going back to the future, before the time of scatter/gather. But InfiniBand products’ price/performance is impressive. I don’t expect to use any of these techniques anytime soon, but I’ll file them away for future reference.

15:45: Cassandra: Strategies for Distributed Data Storage. Overview of the CAP theorem, then a dive into using Cassandra. It went a little too deep too quickly for my interests, but I stayed with it. A good talk.
