Archive
Comparing two technologies on their configuration style
At IP Street, most of our technology stack is open-source. Something happened last week that threw our components’ different design philosophies into stark relief.
We use Solr (with Zookeeper) for many of our search and pivot tasks, and Redis as a Swiss Army Knife. They do different things and have different consistency requirements. You can easily critique any juxtaposition as comparing apples to oranges. I think it’s instructive, because Solr and Redis are both high-performance, production-quality, and powerful tools.
Working on them within the same day, I experienced exact terminal opposites in configuration philosophy!
Let’s meet contestant number 1
Solr is a powerful search engine. Its Cloud feature lets you shard and scale your index, and Solr will do the internal shard and node routing. Or you can direct your queries to the appropriate node for a small performance win. Being frugal with our people, we let Solr do the routing. “Here’s a document, store it.” “I want this document.” “Here’s a pivot within a search, do it and assemble the results for me, pronto.” Etc.
Solr nodes are peers, though internally there are leaders and replicas. Solr uses Zookeeper, an Apache technology for distributed persistent configuration. Nodes do the right thing when other nodes come and go.
Drobo data recovery: Conclusion
My dead Drobo saga’s conclusion…
tl;dr
- Grades: Drobo customer support: A+. DiskWarrior: F. Disk Rescue 3: A-.
- Don’t consider your Drobo to be hot-swappable. Ever.
- Buy Disk Rescue 3 and have it on hand.
- Run Disk Utility and do a Verify Disk once a month. If that’s too often for you, do it once a quarter.
A Drobo firmware update bricked my Drobo
I’m migrating my files and apps to my new MacBook Pro. A highly anticipated improvement was connecting my Drobo S to a USB 3.0 interface, instead of my previous laptop’s USB 2.0 bus.
During my migration, the Drobo Dashboard advised me that a Drobo firmware update was available. I did the update, which -boom- bricked my Drobo.
After rebooting, power-cycling the Drobo, and plugging it into the other USB socket, I’m at a point where Drobo Dashboard says the Drobo is healthy. But OS X won’t mount it. Disk Utility says:
Unable to bootstrap transaction group 6000: cksum mismatch
No valid commit checkpoint found
The volume xxxxxxx was found corrupt and needs to be repaired.
Problems were found with the partition map which may prevent booting
Error: This disk needs to be repaired. Click Repair Disk.
I then run Repair Disk, and it tells me the same thing! So Repair Disk can’t repair the disk!
I bought DiskWarrior (for $109, I’ll have you know) but it can’t repair a disk that isn’t mounted. They’ll ship me a physical CD-ROM of my purchase, so I can try booting from it. Oh, but wait, my MacBook Pro/Retina doesn’t have a CD-ROM drive!
I am not a happy camper.
How not to maintain an API
We license a vendor’s services for corporate information, like annual revenue and office locations. Their name shall be kept confidential in this story.
We access their API via http calls. They call it a REST API. But like 95% of the “REST” APIs in the world, it’s not REST at all, and in fact nowhere near REST. The term “REST” has been corrupted to become synonymous with “web API”.
But whatever. It’s an API accessed with http calls.
One of the service calls has a parameter called “countryCode”, which was documented as an ISO 3166 country code.
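ISO 3166-1 actually defines three code sets (two-letter, three-letter, and three-digit numeric), so “an ISO 3166 country code” leaves room for surprises. Here is a minimal sketch of a format check a client might run on a countryCode value. The function name is hypothetical and not part of the vendor’s API, and it only checks the shape of the code, not whether the code is actually assigned to a country.

```python
import re

# Shapes of the three ISO 3166-1 code sets. These check format only,
# not whether the code is actually assigned to a country.
_SHAPES = {
    "alpha-2": re.compile(r"[A-Z]{2}"),
    "alpha-3": re.compile(r"[A-Z]{3}"),
    "numeric": re.compile(r"[0-9]{3}"),
}


def classify_country_code(value):
    """Return which ISO 3166-1 code set `value` is shaped like, or None.

    Hypothetical helper for illustration; lowercase input is rejected
    on purpose so the caller has to normalize explicitly.
    """
    for name, pattern in _SHAPES.items():
        if pattern.fullmatch(value):
            return name
    return None
```

A client could call this before every request and log anything that comes back as None, rather than finding out from the vendor’s error responses.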
As the world turns…
Boy, what a roller coaster! Shortly after opening a position for a Senior Devops engineer, we had a funding “event” and now the opening’s gone. What’s worse, I had to lay off one of my developers, right before the end-of-year holidays. It was stressful for all involved.
We’re doing some interesting things with name relationships at work, and these present fun development challenges. I’m trying to spend as much time as possible in Emacs, because the less-fun work issues always occur when I’m not coding.
I upgraded our codebase to version 3 of Celery, just to get us off version 2. I’m still hankering to replace Celery, but it must have known it was living on borrowed time because it’s been behaving lately, so I’ve decided to fry some bigger fish. But the moment Celery starts acting up again…
I just turned 55. How the hell did that happen?!?
I switched from Cacti to Munin
This week, I switched our systems-level monitoring from Cacti to Munin.
I was dissatisfied with Cacti’s interactive-only configuration and limited OOTB charts, and its reluctance to correctly display the processor %U of my multicore servers. I tried the oft-cited suggestion of cloning the existing %U graph into a new template and bumping the maximum to 1,200% (for a 12-core server); no good.
I have an “I have bigger fish to fry” mindset lately, and I want something that does (mostly) the right thing OOTB without having to delve into the source code.
Nook networking blues
My wife owns a NOOK Simple Touch with GlowLight.
Our home router is an Apple Airport Extreme, the latest model.
The NOOK worked flawlessly when she first got it. But starting three weeks ago, it forgets our home network credentials about once a week.
Nothing’s changed in our network configuration. The Airport’s location hasn’t changed. No other electronics in the house have moved. We don’t have any new electronics.
We asked for help at the Barnes & Noble on Pine Street. The help desk person said, “Gee, it hooks up to our network just fine, so, it must be your wireless router.” She was not helpful.
Nothing else in our house has network connectivity problems! Not my personal MBP, my work MBP, my wife’s iMac, my iPad, or our two iPhones. Only the NOOK has a problem.
I wonder if its NVRAM (or whatever it uses to remember network credentials) is sick. But this thing is only a few months old.
No, I don’t have network sniffer software, nor do I want to download a sniffer and learn how to use it. I shouldn’t have to resort to that for an e-reader.
Sigh.
Unit test your obvious code
Sometimes you don’t write unit tests. Your reason for not doing so always falls into one of two categories.
Complexity
The code you just wrote would be so much easier to test using system-level testing. For example…
- The setup and teardown would be 10x the test code.
- There’s too much interaction with multiple data stores or third-party vendors.
- Your dev boxes or CI server don’t all have the necessary technology installed.
These are rational reasons to not write unit tests for new code. You’re fine.
Simplicity
But sometimes you don’t write unit tests because the code you just wrote is so darn obvious.
It’s really simple. It’s straightforward. It’s nearly trivial. Why bother writing unit tests for it?
Well, I’ll tell you why you should test it. In fact I’ll give you three reasons.
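To make “obvious” concrete, here is a minimal sketch of the kind of code and test I have in mind. The full_name helper is hypothetical, made up purely for illustration, and the test uses Python’s standard unittest module.

```python
import unittest


def full_name(first, last):
    """Hypothetical 'obvious' helper: join two name parts with a space."""
    return f"{first} {last}".strip()


class FullNameTest(unittest.TestCase):
    def test_both_parts(self):
        self.assertEqual(full_name("Ada", "Lovelace"), "Ada Lovelace")

    def test_missing_last_name(self):
        # An empty last name must not leave a trailing space behind.
        self.assertEqual(full_name("Ada", ""), "Ada")
```

Saved in a test module, this runs with python -m unittest. The second test is the interesting one: it pins down an edge case that the one-line implementation could easily lose in a later edit.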
iPad diagramming II
I’m playing more with Noteshelf and thinking about how I use a whiteboard. And I’m noticing aspects of my sketching for the first time…
My drawings mutate a lot as I create them:
- I’ll start out leaving space for objects (e.g., server boxes, database symbols), and then decide the objects need more space. (For practical or esthetic reasons.)
- I’ll assign colors to different entities, and later change the color assignments.
- I’ll start recording attributes A, B, and C for state transitions, and then decide to drop B and add attributes D and E.
- It’s very rare that nothing has to change. But even then, I’ll wish I could move the whole diagram on the whiteboard or page in toto, because it’s grown in a direction or to an extent that I didn’t anticipate.
I often wish I could do a diagram twice — once as a dry run, once “for real.”
These alterations happen more often to my drawings than they do for others. At least, it seems that way to me.
iPad diagramming
I often need to diagram things at work. It’s usually something like a system block diagram, a gnarly code problem, or client-server interactions. Sometimes it’s just a list of things I’m comparing.


Whatever the diagram is, I need to keep it around for a while. And refer to it, scribble on it, and update it. And sometimes share it.
Since “back in the day,” I’ve used a whiteboard for this. Or sometimes pages from a pad of graph paper. I’ll noodle around, sketch things out, and leave it up.
For sharing, I’ve resorted to snapping a photograph of the whiteboard with my iPhone. (Or a couple of photographs, which I then stitch together with AutoStitch.) If the photo’s not adequately square, I straighten it out with Genius Scan. And then e-mail it. The mail message can get pretty large, so this can be a nuisance.
Eventually the whiteboard needs to be erased, or is accidentally erased. Or I lose the graph paper doodles, or decide to throw out the diagrams.
In December, I received an iPad 2 as a gift. And I’ve gotten around to thinking, why not step up my game and use the iPad for this? (Yeah, I’m being dramatic and rhetorical. Sorry. I’ll re-phrase: “I’ve decided to use the iPad for diagrams and simple drawings.”)
I haven’t completely figured out how I’ll do this. I’ll write about my experience here as I go down the learning curve, mistakes and all.
Rackspace Ubuntu 11.04 servers broken without a network
I’ve got an odd problem. I create a 4 GB VM in the Rackspace Cloud with an Ubuntu 11.04 server image. After it’s created, I can’t ssh to it, and ping returns zero bytes.
I can get to it from the Rackspace dashboard console. But it’s not on the network. Creating a VM without a network is kind of useless.
I first alerted Rackspace to this over a week ago. It’s still present in our VMs, and now impacts our company in a very serious way. Rackspace says their Operations team has to check the host machine to fix this. You’d think this would be easy to isolate and resolve, but… nope.
Does anybody else have this problem?
———
Update: Rackspace confirmed this is a system-wide problem! Until it’s fixed, after I rebuild a VM I have to ask their customer support to goose the underlying host machine before it’ll respond to the network. Yikes.