Rackspace changed Ubuntu bits without telling users


tl;dr

Rackspace changed their Ubuntu 11.04 (Natty) server image without telling their customers. Our installation scripts unexpectedly broke. In the cloud, the rug can be pulled out from underneath you without warning, even in a very simple setup.

The Story

My employer is a small shop, and we use Rackspace Cloud Servers for our QA and Production systems. We use unmanaged VMs, from 256 MB to 16 GB in size, running Ubuntu.

Rackspace has generally been a very good hosting provider. My only significant complaint is with their cloud administrative dashboard — it’s slow, clunky, and often hangs. But we’ve learned to live with it.

When we upgraded from Ubuntu 10.10 to 11.04, we had some typical upgrade pain with our Operations scripts. We had to remove some 10.10 package workarounds, and we switched some software from source builds to packages, because the 11.04 repository’s version was now acceptable.

We got past all that, and moved our systems to 11.04. Since then, rebuilding a server meant selecting Ubuntu 11.04 as the server image and running our Fabric scripts, and everything worked predictably.

Until November 21…

I needed to rebuild our QA system for an impending big software release. I didn’t expect any problems with the server rebuilds themselves. I do the steps almost by rote: Initiate the rebuilds, wait, launch Fabric… Yawn…

BAM! The fabfile crashes. What the??

Our installation script’s first action is to disable the root account, and create a sudo account for software installation. For the first time in over a year, this was now failing.

My first thought was, maybe this one time our servers didn’t completely boot after the rebuild, for whatever odd reason. So I rebooted them. The root/sudo account part of the installation script still failed.

I did some sleuthing and found that /etc/apt/sources.list, which lists the repositories apt searches when installing packages, had changed from what it was in the earlier 11.04 server images. It used to be:

deb http://archive.ubuntu.com/ubuntu/ natty main restricted universe
deb-src http://archive.ubuntu.com/ubuntu/ natty main restricted universe

deb http://archive.ubuntu.com/ubuntu/ natty-updates main restricted universe
deb-src http://archive.ubuntu.com/ubuntu/ natty-updates main restricted universe

deb http://security.ubuntu.com/ubuntu natty-security main restricted universe
deb-src http://security.ubuntu.com/ubuntu natty-security main restricted universe

Sometime between our last successful server rebuild and November 21, it had changed to:

deb http://us.archive.ubuntu.com/ubuntu/ natty main restricted
deb-src http://us.archive.ubuntu.com/ubuntu/ natty main restricted

## Major bug fix updates produced after the final release of the
## distribution.
deb http://us.archive.ubuntu.com/ubuntu/ natty-updates main restricted
deb-src http://us.archive.ubuntu.com/ubuntu/ natty-updates main restricted

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team. Also, please note that software in universe WILL NOT receive any
## review or updates from the Ubuntu security team.
#deb http://gb.archive.ubuntu.com/ubuntu/ natty universe
#deb-src http://gb.archive.ubuntu.com/ubuntu/ natty universe
#deb http://gb.archive.ubuntu.com/ubuntu/ natty-updates universe
#deb-src http://gb.archive.ubuntu.com/ubuntu/ natty-updates universe

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
#deb http://gb.archive.ubuntu.com/ubuntu/ natty multiverse
#deb-src http://gb.archive.ubuntu.com/ubuntu/ natty multiverse
#deb http://gb.archive.ubuntu.com/ubuntu/ natty-updates multiverse
#deb-src http://gb.archive.ubuntu.com/ubuntu/ natty-updates multiverse

## Uncomment the following two lines to add software from the 'backports'
## repository.
## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
#deb http://gb.archive.ubuntu.com/ubuntu/ natty-backports main restricted universe multiverse
#deb-src http://gb.archive.ubuntu.com/ubuntu/ natty-backports main restricted universe multiverse

## Uncomment the following two lines to add software from Canonical's
## 'partner' repository. This software is not part of Ubuntu, but is
## offered by Canonical and the respective vendors as a service to Ubuntu
## users.
#deb http://archive.canonical.com/ubuntu natty partner
#deb-src http://archive.canonical.com/ubuntu natty partner

deb http://security.ubuntu.com/ubuntu natty-security main restricted
deb-src http://security.ubuntu.com/ubuntu natty-security main restricted
#deb http://security.ubuntu.com/ubuntu natty-security universe
#deb-src http://security.ubuntu.com/ubuntu natty-security universe
#deb http://security.ubuntu.com/ubuntu natty-security multiverse
#deb-src http://security.ubuntu.com/ubuntu natty-security multiverse

In particular, the universe entries for natty, natty-updates, and natty-security were commented out in the new file, so packages from universe were no longer found during package searches.

The fix would be easy: Just uncomment six lines. But who or what had changed sources.list? Did someone at IP Street break our fabfile? Was I massively confused? I didn’t consider the possibility of the 11.04 image bits changing, because released bits wouldn’t change like that.
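For illustration, here is a minimal sketch of that six-line fix in plain Python (the language Fabric scripts are written in). It is not the code from the actual fabfile: the function name enable_universe and its suites parameter are made up for this example, and it only transforms the text of sources.list, leaving the reading and writing of the file to the caller.

```python
import re

# Matches a commented-out apt entry such as
# "#deb http://gb.archive.ubuntu.com/ubuntu/ natty universe".
# Group 1 is the entry without the "#", group 2 the suite, group 3 the components.
_COMMENTED_ENTRY = re.compile(r"^#\s*(deb(?:-src)?\s+\S+\s+(\S+)\s+(.+))$")


def enable_universe(sources_text,
                    suites=("natty", "natty-updates", "natty-security")):
    """Return sources.list text with commented-out universe entries enabled.

    Only lines whose sole component is "universe" and whose suite is in
    `suites` are uncommented; descriptive "##" comments, multiverse,
    backports, and partner entries are left untouched.
    """
    out = []
    for line in sources_text.splitlines():
        match = _COMMENTED_ENTRY.match(line)
        if match:
            entry, suite, components = match.groups()
            if suite in suites and components.split() == ["universe"]:
                line = entry
        out.append(line)
    return "\n".join(out) + "\n"
```

Run against the new sources.list shown above, this would uncomment exactly the six universe lines. Note that those entries keep whatever mirror host the image put on them.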

After more poking, a lightbulb went on: maybe the 11.04 bits had changed? Nahhhh. But… Maybe?

So I asked Rackspace if sources.list had been changed in the Ubuntu 11.04 image, or if I was suffering from massive PEBKAC. Here’s their answer:

You are correct in assuming we have changed the default Ubuntu 11.04 server image. This is part of our infrastructure migration from Xen Classic to Xen Server and required some changes to our images.

As for why the ‘universe’ repository entries are commented out, the answer I’ve been given from our Cloud Operations Engineers is that they want to keep the installs as close to the default CD installation as possible, for security reasons.

Unfortunately we do not at this time give announcements to customers for changes to the installation images. I have passed along your request for an announcement facility regarding changes to the default installation images. Unfortunately I can’t give an ETA that being implemented at this time. Feel free to also make the suggestion on our Cloud Feedback page here:

[snip]

Yikes.

I sent them back a polite (for those who know me, I was unusually polite) response pointing out this was a sub-optimal mode of customer interaction. They changed a released server image and did not tell their customer base!

Their reason for the change is rational. But they should have at least notified customers who were using the 11.04 image. Ideally, they shouldn’t have changed it at all, and should instead have changed their Ubuntu image procedures starting with the next release. The repositories searched during a package install are completely unrelated to their hypervisor upgrade.

This should’ve been the end of my problems, because if Rackspace had changed anything else in the 11.04 image, their customer support person would have told me about it in this answer, right?

Wrong. After a few successful server provisions, our Postgres installation failed. The diagnosis here was easy. It seems that Rackspace had also changed the disk’s device name. The virtual disk used to appear as /dev/sda1, but now appeared as /dev/xvda1.

It was trivial to adjust the blockdev command string in our script. Just about as trivial as it would have been for Rackspace to warn their users.
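One way to make a script tolerate this kind of rename is to probe for the device instead of hard-coding it. The following is only a sketch, not what our script actually does: find_root_device and its parameters are hypothetical names, and the two candidate paths are simply the old and new names mentioned above.

```python
import os


def find_root_device(candidates=("/dev/xvda1", "/dev/sda1"),
                     exists=os.path.exists):
    """Return the first candidate device path that exists on this host.

    `exists` is injectable so the lookup can be exercised without real
    device nodes; by default it checks the local filesystem.
    """
    for device in candidates:
        if exists(device):
            return device
    raise RuntimeError(
        "none of the expected devices exist: " + ", ".join(candidates)
    )
```

The returned path can then be substituted into the blockdev command string, and an image that renames the device again fails with a clear message instead of a confusing error later on.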

Summary

After I got past these problems, and another one caused by our goof, I provisioned the QA system.

These problems weren’t hard to fix, but they required diagnostic time, stopped a straightforward process in its tracks, and goosed my blood pressure. This was happening in the shadow of another serious problem Rackspace was having with the interaction between their Xen hypervisor and Ubuntu 11.04.

The world can change underneath your feet in the cloud, even if you just have a server with a local disk and install the same OS you did a few weeks prior.
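Since the provider may not announce image changes, one defensive option is to have the install script check a few key files against a recorded baseline before doing anything else. This is a rough sketch under my own assumptions, not something from our fabfile: fingerprint and check_baseline are hypothetical helpers, and which files to check (for example /etc/apt/sources.list) is left to the caller.

```python
import hashlib


def fingerprint(text):
    """Return a SHA-256 hex digest of the text's significant lines.

    Blank lines and lines starting with "#" are ignored, so only changes
    to active configuration entries alter the digest.
    """
    significant = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return hashlib.sha256("\n".join(significant).encode("utf-8")).hexdigest()


def check_baseline(text, expected_digest, name="file"):
    """Raise RuntimeError if the text no longer matches the recorded digest."""
    actual = fingerprint(text)
    if actual != expected_digest:
        raise RuntimeError(
            "%s differs from the recorded baseline (expected %s, got %s)"
            % (name, expected_digest, actual)
        )
```

Recording the digest after a known-good rebuild and checking it at the start of the next one would have turned the sources.list surprise into an immediate, explicit failure. Commenting out an entry changes the digest, because the entry stops counting as an active line.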
