DRBD on Ubuntu

This week I experimented with DRBD (Distributed Replicated Block Device) which is “a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1.”

DRBD can be used for build a “poor man’s SAN”, a RAID running over Ethernet. In our case, we want a warm spare of an important database system, i.e. a second set of hardware which can take over for the primary within a few minutes of failure, and DRBD might be a great way accomplish that. DRBD runs on commodity hardware (ordinary servers, network adapters, etc.) so it is inexpensive. It is also, if benchmarks and comments are to be believed, surprisingly good. I suspect this is because CPUs and networks (Gigabit Ethernet, ideally) are so much faster than disks.

The systems at hand for testing here run the Ubuntu distribution, itself based on Debian. It took a lot of reading and learning, but very little actual typing, to get DRBD going on Ubuntu. My writeup here should help the next person along. As I write this, the released version of DRBD is 0.7, and the DRBD site encourages ignoring the newer unofficial release, advice that I followed.

First, read the HOWTO and the FAQ, but don’t follow the instruction in them yet; then you are ready to begin. The guessable part is to install the module with apt-get (you’ll need the “Community” APT sources in Ubuntu; I don’t know offhand what is equivalent in Debian):

This installs the module’s source code, not the module itself. After doing this, I spent a couple of hours trying to figure out how to compile it – untarring it and running “make” does not work. There is an abundance of confusing and misleading advice on the web about this, including advice to compile your own kernel. If you don’t need a custom-compiled kernel for other reasons, ignore all that and discover the joy of Debian’s “Module Assistant”, invoked by the command “m-a”.

“prepare” will get your machine ready to compile modules that can link with the running kernel; it will prompt to download and install packages needed for this, including the correct kernel header packages. Now you are ready to build and install the module:

Follow the prompts; if it asks to install more packages, agree to do so. On Ubuntu 6.10, it may fail with an error about “mv”. This is a known bug; the workaround is to expand the module source file (it lives in /usr/src/), add the line “SHELL=/bin/bash” at the top of each Makefile, then retar the module source.

By the way, this points out a stark and troubling aspect of Ubuntu “community” packages: not only are they not assured to work, they are not even casually tested to verify they compile on the Ubuntu version they are purportedly intended for use with… which I find very disappointing. I suppose the lesson is that Ubuntu is very slick for the set of officially supported packages, but if you’ll be using a lot of packages beyond that set, its appeal is much diminished.

Now you are ready to enable DRBD, following the HOWTO instructions, starting with setting up /etc/drbd.conf.

In my case I used LVM to make a 10 GB partition on each machine, put DRBD on top of that, put a filesystem (Reiserfs, for variety) on top of them, then moved/symlinked my PostgreSQL data there. It worked fine, including these tests: I started a long intensive DB operation (restoring a several-GB backup), then with that running, rebooted the secondary machine. This did not interrupt the DB operation, and when the secondary finished its reboot, the DRBD partition re-synced automatically. I then shut down the primary machine, and verified I could mount and read that same data on the secondary machine… all exactly as described.

One caveat though – in a few cases, machines with DRBD stopped during the boot sequence, waiting for user intervention (in this case, closing a shell) because they the DRBD device wouldn’t mount. I suspect this was because I don’t fully understand how to configure it.

Of course this was just a short initial look, but I am very impressed with DRBD so far.

A/B Technique for Web Application Deployment

This description of my “A/B technique for web application deployment” was transcribed from audio, so it less tight, more verbose than my normal prose. I chose to post it in rough form, rather than leave it on the “back burner” until an unknown future date when I have time to rewrite it. I first explained this to a colleague around 1999, 8 years is long enough for an idea to wait.

The Problem / Context

At least a dozen times over the last decade, this scenario has come up at consulting client sites: you have a web application and you want to upgrade it with a new version. You could do so with a brute force cutover (stop the app, swap the code), but that’s not the scenario that I’m talking about. I’m talking about the upgrading to a new version, not quite compatible with the old one, without dropping current active users. For example, in the new version, you might have some different data that goes in the session. You might have some different pages, so you have some different URLs, you may be adding a new field so that once you put in this new version, there’ll be an additional field, and it stores that additional field in the session and in the database and the caching and so on. Yet you want the current users to keep working without interruption.

Solution

This technique is not language-specific – it applies equally in PHP, ASP, ASP.NET, Java servlets, JSP, CGI, etc.; with nearly any application or infrastructure.

Have the URL of your application, which I’ll call “/contact” here, be the URL of a proxy application (or “launching pad”). Then have two additional URLs for two specific instances of the actual application. For example, you might have “/contact” as your overall application URL and then “/contact/contactA” or “/contact/A” as one instance where you have that application installed.

At the “/contact” URL, install a simple proxy application, whose job is to take a newly arriving user, present them an intro/login screen, then redirect them to one of the specific instances of the application.

As a user I will point my browser to the “/contact” URL, and I bookmark that. I launch that, I see a login screen, I type in my name and password, I press the button to log in. The “/contact” launching pad redirects me to “/contact/A,”, an instance of real application. I’ll call the second instance “B”, perhaps at the URL “/contact/B”. In the normal steady state of the system, the user will be using A, they login to “/contact” and they end up in the “A” instance.

Then you want to do an upgrade, install a new version. Install the new version as “/contact/B.” Leave the existing application in place and working. (By the way, I’ve assumed you are using a technology where you can deploy and undeploy applications without bringing your web server down, but with mod_proxy you could make this work even without that capability) Deploy the new application version in a new and different path than what your current running users are on. Adjust some setting your proxy/launchpad application (perhaps as simple as a single line in a single config file). So, for example, you might install the new version as “/contact/B” and then you flip a switch (edit the config file) to make it so that new users that come to the “/contact” page don’t land in “/contact/A” anymore, they land in “/contact/B” as they login.

The current users already using the application in “/contact/A” stay there – they don’t know or care that you’ve deployed a new version. New users come in the come in the new version. So you want to have some sort of mechanism (likely provided by your application server if you’re using one, and not hard to build otherwise) for monitoring how many users are using each of these applications. So you might for example notice that you have 1000 users active on /contact/A. You deploy a new version as /contact/B and flip the switch. Then, depending on the usage characteristics of your application – over the next few minutes, next few hours, however it works out, the users, as they log out and log in and such, gradually all make it into the /B application. Some kind of maximum-login-time mechanism will ensure that this cutover happens in finite time.

Once the users have moved to /contact/B, you then declare it as your “current” version, and you take down /A, because no one’s using it and no one can get into it. So that next time you need to do an upgrade, you just do it in reverse – you deploy that new version as /contact/A, flip the switch back to make all new users’ logins land in the A… and again, after however many hours or minutes or whatever, you have all your users on the new version, and you can take down the old version.

You can easily implement with just the tools that already come with your Web application development system. You don’t need any kind of special hardware or special application server or HTTP server support. You don’t need any sort of special way of doing session affinity; you’re doing session affinity by simply handing out the URL of one of these other Web applications.

Bookmarks

Someone might bookmark a page of your application. So let’s say that you had directed them to /contact/A, and they were on the page /contact/A/lists.jsp. When they return to this bookmark later (you do want to support bookmarking, right?), you don’t want them to land there; you don’t want them to end up in the A application if you’re currently using the B application. This is actually pretty easy to handle also. You simply use some settings on your Web server to do a redirect, with a few lines of configuration in .htaccess or analogous mechanims. So based on your setting of which one is current, you make it so that if someone comes into the application without having a referrer from inside the application, you just redirect them over to whatever the current instance is. And that takes a little bit of thought, but only a little, and you can make it seamlessly solve that problem of users’ bookmarks working in spite of you switching back and forth between two instances.

Clustering

You might be deployed on a cluster. Perhaps you are using Websphere with 37 web servers. It turns out that this A/B approach works orthogonally to the clustering features of your Web application server. You could have the A application deployed across all 37 servers; you could deploy the B application, with a few clicks, across all 37 servers; you could flip that switch in some global way to kick people onto the B, and so on.

Override the launchpad for testing

You can permit users to enter a special URL to get to the “other side”. you could have some way of entering a URL that takes you past that launch application straight into the B side, so that you could click around, you could manually verify that the new B application works in the production environment before you flip the switch to make that the deployed production system. This is a very wise and useful type of testing to do, a great final stage of testing because the new code in actually in production. It’s obviously not a replacement for testing in a separate test environment, it’s an adjunct for even greater safety in deploment.

Performance

When a system is running in a steady state, its caches are fully populated with relevant data, so many requests can be answered with data from the cache (RAM). But when a system is freshly started, its caches are empty, so more requests require (slower) disk access, during the first few minutes of operations. This is sometimes called the “empty cache” problem, and is responsible for the poor performance sometimes seen in the first few minutes after a busy system is restarted.

The technique described here prevents this problem, because with it you avoid ever shutting down and restarting your whole Web application with your full user population on it. Instead, since the switch only brings newly-logging-in users to the new version – the new instance – you gradually have people start using it, so you never take a big hit all at once in terms of cache population.

Schema Changes

Hani asked, in a comment, about schema changes. A simple answer is that you won’t be able to make a transition like the one described here (where both the old and new code versions run in parallel for a while), if you make schema changes such that the old code no longer works. A more complex answer, which I have used with great results, is that this is a programmable computer and you are a programmer – with some effort, you can make the software tolerate both the old and new schemas. So the process works like this:

  1. Decide on the schema change, but don’t deploy it
  2. Modify your software to tolerate the old or new schema, whichever is present
  3. Deploy the new software, transition all users to it (as described above)
  4. Make the schema change; you may need to momentarily quiesce the software, but hopefully not kill user sessions

(There are a few tools out there to help with the schema-change-in-a-live-app problem. One of then is ChronicDB who wrote me to point this out.)

Of course this is a lot more work than just stopping the server, making your change, and restarting. Whether it’s worth it depends on your situation. If you have an overnight non-usage window, consider using it instead of the long path described here.

I hope this is helpful for someone out there. Comments are welcome.

Hotel connectivity, with click tracking by SuperClick

I’m enjoying my stay this weekend, but the Big Cedar Resort uses click tracking software, which is unethical and unnecessary. Some details of this “superclick” tracking system are on nerdblog and swiK. There is much discussion of the system on SlashDot.

My take on this is that “don’t be evil” is good policy, and not only for Google. In addition to a very healthy room rate, this place charges $10 for every 24 hours of connectivity – a rate of $300 per month! That should be enough – they should not also sell me out for a system that “allows hoteliers and conference center managers to leverage the investment they have made in their IP infrastructure to create advertising revenue, deliver targeted marketing and brand messages to guests and users on their network.”

Big Cedar and SuperClick: I do not want to be leveraged. I do not want to be targeted. I do want to stay at a nice resort, with access to problem-free and spy-free internet connectivity.

Update: In addition to the issues above, this remarkably bad system also breaks the back button on many web sites.

Lots of TextDrive downtime

I apologize for the site downtime today, to my huge audience of loyal readers (er… both of you). It turns out that my new ISP, TextDrive, has been having a lot of downtime on the shared hosting machine my sites are on… frequent freezes/crashes, exacerbated by long delays in noticing and recovering. Apparently it is most likely some oddball problem with some particular site belonging to one of the many customers on that machine… but unclear which one. Hopefully they will locate the trouble soon and return to the high quality of service they were previously known for.

Update: Another 1+ hour downtime on Jan. 17.

Update: Another ~1 hour downtime on Jan. 25. There were some short downtimes in the last few days also.

Update: What a surprise, another ~1 hour downtime on Jan. 27

Update: Another short downtime on Jan. 28

Update: ~2 hour downtime on Jan. 29. Seeing a pattern yet?

Update: 2.5 hour downtime on Jan. 30. This time with a kernel update.

Update: TextDrive says they have a major update coming soon, that will fix all these issues; but no specific timetable.

Update: ~1.5 hours downtime in the early overnight hours of Feb. 4. Not mentioned on the issue tracker site.

Update: ~1 hour downtime on Feb. 7, during the morning part of the workday. Maybe didn’t really want customer after all.

Update: ~1.5 hour downtime on Feb. 8, during morning part of the workday.

Update: Ah, it seems the TextDrive crew is quite busy pushing the envelope for high-end Rails hosting (via Tim Bray). It is quite impressive, but it’s also disappointing – I’d rather they had spent that brainpower making their existing hosted sites stay up. Plus… a 5-10 minutes downtime while I was trying to post this update. TextDrive considers frequent short downtimes normal, for “Apache restarts”. On several other hosts where I have had shared hosting accounts, I have not experienced this at all; so it seems to be something about how TextDrive works.

Update: After a month of stability (aside from a few short downtimes every day for “apache restarts”), ~1.5 hours downtime on March 12.

I have seen the future, and it runs OSX

iPhone: Wow

It’s a phone. It’s a PDA. It’s an iPod. It’s a widescreen video iPod. It has zero physical buttons, rather the whole front is a multi-touch-screen. I’ll leave the rest of the raving to the many other sites doing a great job of that.

The real innovation of this new device is the OS. Apple has an answer to PalmOS and Windows Mobile (CE): run their real desktop/server OSX, with Unix inside, on the phone. Today’s handhelds have much more computing power and storage than desktop PCs of a decade ago, and there are enormous benefits to running a real, common OS on such a device. I’d been saying since I bought my first Palm (a Handspring, actually) than within a decade the handheld OS’s would go away.

Apple has gone first. How soon will Microsoft follow? Will Palm and RIM make to the new era at all?

(Update: Yes, the title of this post is slightly in jest. I’m serious about real OSs in handheld devices, and the iPhone looks fantastic, but Apple is very unlikely to dominate the phone market in the way the iPod dominates the tiny-media-player market.)

Math Drill – for your 1st..4th grader

Here is a simple piece of software I wrote to help my children practice basic arithmetic facts. Of course there are no doubt already a thousand or more similar programs out there; but in a few minutes of searching I didn’t find one that did quite what I wanted. I ended up tinkering with this for several hours, and even wrote documentation and made an installer. Free download and screenshot below.

MathDrillSetup.exe (free)

Math Drill screen shot

I’ve also published the source code over on Github: http://github.com/kylecordes/mathdrill/tree/master

This software is written in Delphi, and though there is some chance that I am the only developer anywhere using both Delphi and git, hopefully this will be wrong, and someone will fork and offer their improvements.