May 2006 – Kyle Cordes

Pure Ruby Sparklines – No RMagick ImageMagick

This Pure Ruby Sparklines implementation got my attention – unlike other Ruby graphics packages, it does not need RMagick, and thus not ImageMagick, sparing the installer considerable effort and misery. The source of that misery is the long list of dependencies. Of course, a great positive of the open source world is that tools can readily build on each other – but this comes at the cost of unexpected complexity in getting things to compile and install. We had a project here delayed by a couple of weeks of working through issues in getting it all to work on a shared hosting account. I never did get it working with Ruby 1.8.4 on Windows; I have reverted to 1.8.2 for now.

With a “Pure Ruby” implementation, you copy the .rb files over and it works. I wonder if “Pure Ruby” will become a marketing point like “Pure Java”.

Lest anyone accuse me of unwarranted Ruby fanhood, I’ll point this out: with Ruby it is necessary to find or write code for graphic drawing. With Java, it is in the box, in the form of the excellent Java2D feature set. Ruby is way behind in this area.

SQL Server Log Shipping – Testing the impact of large operations

Background

In a project we have here at Oasis Digital, our customer relies on the log shipping feature of Microsoft SQL Server 2000 to keep some secondary reporting databases up to date in almost-real-time: every few minutes a transaction log backup runs, then at a slightly longer interval, every 5-15 minutes, a restore of those logs runs on the secondary reporting databases. We’ve developed some code so that the reports can automatically run against any available reporting database; thus as the reporting databases occasionally go offline for log shipping restores, the end users never notice that more than one database is being used for their reports.

Most of the time, this works very well. The users get reports quickly, and those reports (no matter how large and painful the queries are) never have any impact on the production OLTP database. But we have been stung a few times by changes we make in the production database having an unexpectedly large impact on that log shipping process.

We have two theories on what happens:

Certain kinds of changes in the production database, such as adding a field, changing a field type, changing an index, can affect a very large number of pages of the database – since SQL Server log shipping occurs a page at a time, an operation that makes a small change to 100,000 pages of the database can result in an extremely large file being shipped between the production database and the reporting databases.
There are performance characteristics of that restore process which we don’t fully understand – sometimes a log restore takes considerably longer than its size would suggest. Sometimes a log-shipped change triggers such an occurrence.
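
To put rough numbers on the first theory: SQL Server stores data in 8 KB pages. If we assume, purely for illustration, that a change causes about one full page image to be logged for every page it touches, the 100,000-page example works out as below. The function names are hypothetical, and the real log volume depends on how SQL Server logs the particular operation, so treat this as a back-of-the-envelope ceiling rather than a prediction.

```python
PAGE_SIZE_BYTES = 8 * 1024  # SQL Server data pages are 8 KB


def estimated_log_bytes(pages_touched, bytes_logged_per_page=PAGE_SIZE_BYTES):
    """Rough estimate, ASSUMING one full page image is logged per touched page."""
    return pages_touched * bytes_logged_per_page


def as_megabytes(byte_count):
    """Convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return byte_count / (1024 * 1024)


# The "small change to 100,000 pages" case from the first theory.
print(as_megabytes(estimated_log_bytes(100_000)))  # 781.25
```

Under that assumption, a change that looks trivial in SQL could still mean a log file approaching a gigabyte.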

The result of those two issues together is that sometimes we intentionally make a seemingly minor change in the database and it has a large, negative impact on that log shipping process: hours of reporting database downtime! Occasionally it’s taken more than a day for the log restore target DBs to catch up.

This affects the users severely, and is therefore a Very Bad Thing.

Therefore, we’ve been looking for a way to assess the impact of such schema and index changes on the log ship process before making them in production.

Testing / Measuring

Unfortunately the only reasonable answer appears to be the rather large hammer of recreating a copy of the production system, including log shipping. The databases in question are quite large, so this means two servers, or one server with a bunch of hard drives (to run the log ship source and destination on the same machine).

The process is:

Install a lot of hard drive space, OS, SQL Server
Restore a copy of the production DB, call this “test1”
Set up log shipping to one or two reporting DBs, test2 and test3

Then for each test run:

Run log shipping (so it’s all up to date)
Run test queries (like adding a field, changing a field type, changing a large index, etc)
Run log shipping again, and observe how big of a file is log shipped between the two DBs.

The main metric we’ll get out of this is that for change X (i.e. a certain SQL DDL or DML change), we get a log ship of size N megabytes or N gigabytes. I suspect that with this kind of data in hand, we will soon discover the underlying “rules” and understand which changes result in very large logs, so most of the time we can tell which things are going to have a big impact, and schedule them appropriately.

We can even automate the test process: feed a piece of SQL to a program that runs those steps. It might take hours of (cheap) computer time to run, but little (expensive) human time.
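
Such a program might be shaped roughly like the Python sketch below. Everything named here is hypothetical: `run_log_shipping` and `apply_change` stand in for whatever actually triggers the backup/copy/restore cycle and runs the SQL against test1, and the fake versions at the bottom exist only so the sketch runs without a database.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LogShipResult:
    label: str              # description of the change under test
    shipped_bytes: int      # size of the log shipped after the change
    restore_seconds: float  # how long the restore took on the reporting DB


def measure_change(
    label: str,
    run_log_shipping: Callable[[], Tuple[int, float]],
    apply_change: Callable[[], None],
) -> LogShipResult:
    """One test run: ship, apply the change, ship again and record the result.

    run_log_shipping() is assumed to return (bytes shipped, restore seconds).
    """
    run_log_shipping()  # bring the reporting DBs up to date first
    apply_change()      # the DDL or DML under test
    shipped_bytes, restore_seconds = run_log_shipping()
    return LogShipResult(label, shipped_bytes, restore_seconds)


# Fake stand-ins (not real SQL Server calls) so the sketch can be run as-is.
pending = {"bytes": 0}


def fake_change():
    pending["bytes"] += 500_000_000  # pretend the change generated 500 MB of log


def fake_ship():
    shipped = pending["bytes"]
    pending["bytes"] = 0
    return shipped, shipped / 50_000_000  # pretend restores run at 50 MB/s


print(measure_change("add a field", fake_ship, fake_change))
```

Passing the two operations in as functions keeps the measuring logic separate from the SQL Server specifics, which would have to be filled in for a real setup.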

The Plot Thickens

What I’ve described above is the simple case. There’s a more complex case: we’ve observed that the worst delays tend to happen if a log backup occurs in the middle of a certain operation – even operations that are usually harmless can perhaps result in a huge log ship if the log backup happens mid-operation. These also appear to be the logs that take especially long to restore on the destination database.

To test this, we can take a piece of DML that we think is going to take several minutes to run, start it, wait 30 seconds or so, and while it’s still running start a log backup. Then wait until the DML completes and start another log backup. We would gain several data points: the size of the mid-operation ship and the final data ship, and how long each takes to restore. I suspect we will learn that it’s a really bad idea to let a log backup start while running any potentially large operation.
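
The timing of that experiment could be sketched in Python as follows. This is only an outline under assumptions: `run_dml` and `backup_log` are hypothetical callables supplied by the caller (the second is assumed to return the size of the backup it just took), and measuring restore times is left out for brevity.

```python
import threading
import time


def measure_mid_operation(run_dml, backup_log, delay_seconds=30.0):
    """Start the DML, take one log backup mid-flight and one after it completes."""
    worker = threading.Thread(target=run_dml)
    worker.start()

    time.sleep(delay_seconds)        # let the operation get well under way
    overlapped = worker.is_alive()   # did the backup really start mid-operation?
    mid_bytes = backup_log()         # first backup, while the DML is running

    worker.join()                    # wait for the DML to complete
    final_bytes = backup_log()       # second backup, after completion

    return {
        "mid_operation_bytes": mid_bytes,
        "final_bytes": final_bytes,
        "backup_overlapped": overlapped,
    }
```

The `backup_overlapped` flag matters because a run where the DML finished before the first backup started tells us nothing about the mid-operation case and should be discarded.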

To prevent that, perhaps we can automate these operations like so:

Run a snippet of SQL to disable log backups (log shipping)
Wait for any running backup to finish
Run the target SQL
Wait for it to finish
Reenable log backups (log shipping)
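
As a sketch, those steps map naturally onto a Python context manager. The three callables are hypothetical placeholders for however log backups are actually disabled, checked and re-enabled in a given setup; nothing here is a real SQL Server API.

```python
import time
from contextlib import contextmanager


@contextmanager
def log_backups_paused(disable_backups, enable_backups, backup_running,
                       poll_seconds=5.0):
    """Keep log backups switched off for the duration of the with-block."""
    disable_backups()                # stop new log backups from starting
    try:
        while backup_running():      # wait for any running backup to finish
            time.sleep(poll_seconds)
        yield                        # the target SQL runs here
    finally:
        enable_backups()             # re-enable even if the target SQL fails


# Example with fake callables that only record the order of events.
log = []
with log_backups_paused(lambda: log.append("disabled"),
                        lambda: log.append("enabled"),
                        lambda: False):
    log.append("ran target SQL")
print(log)  # ['disabled', 'ran target SQL', 'enabled']
```

The `finally` clause is the important design choice: if the target SQL raises an error, log backups are still turned back on, so a failed change cannot silently leave log shipping disabled.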

That’s as much detail as I have time to post; hopefully this will help someone out there with SQL Server log shipping problems. It would be great to hear from others who have experienced similar log shipping issues.

Fewer accounts. Fewer points of failure.

I’ve recently been working on having fewer accounts, at fewer banks, fewer lines of credit, etc. This is an attempt to reduce the accidental complexity of my life, as well as reduce the likelihood of identity theft.

Imagine my joy, on reading Will Shipley’s E-TRADE nightmare, that one of my former accounts that got the axe was at E-TRADE.

Dual Monitors – Worth Every Penny

As I write this in 2006, it is a very late post. Back in January 2003, I got a major PC upgrade: two 19″ LCD monitors and a dual-DVI video card to drive them. This was somewhat less common (and much more expensive) in 2003 than today. The monitors are “Samsung SyncMaster 191T”. I took this photo at the time, meaning to post it:

Dual monitors are a remarkable productivity tool – I am confident that this upgrade paid for itself within the first few months of use. Another developer here at Oasis Digital uses three screens; I’m planning to go that direction, or two larger screens, next time I upgrade. I’ve since moved to a faster computer, but I’m still using these monitors.

For best results, don’t even think about plugging LCDs into analog VGA – the difference between analog and DVI is immediately visible and stark. I’ve seen many sources online claim that DVI is only a minor improvement; I find this unfathomable.

I use and recommend Ultramon for making best use of multiple screens – it seems to work better than the similar software which ships with some video cards.

Reimplementing Good Ideas

A while back, Weiqi posted about Eric Burke’s comments about Wikis; one point being that “WikiWordLooksStupidAndAreNotNormal”. I mostly agree with that, and the Wiki tool we’ve been using recently, MediaWiki, supports non-WikiWords trivially. Eric is working on new Wiki software that sounds quite compelling.

The notion of needing some new software with which to set up a Wiki-repository for misc. project-related information seems a little behind the times. I set up such a thing seven years ago, in 1999 – and it was old news then.

I have a personal aversion to re-solving such a well solved problem; it feels like duplication, which I have a deep aversion to in code, to the extent that I’ve held on to this snippet from a post by Ryan King on the TDD mailing list in 2002:

“So, duplication sucks. The more rabidly a person hates duplication, the more I trust their opinions on managing information. A programmer, upon noticing some redundancy, should be thrown into a panic, begin hyperventilating, and stammer something about ‘¡El diablo! ¡El diablo está en mi software!’.”

And yet… I have done such things myself many times – for a while in my early days of programming, I implemented a text editor a couple of times each year. There’s something deeply valuable in the experience of building something yourself; I think it is a mistake to dismiss such efforts as “Not Invented Here”.

Going to RailsConf. Be wary of software religion.

I’m going to RailsConf 2006 in Chicago next month.

The interesting thing about this is that I signed up before I knew how popular it would be – it sold out in a few days, so the pent-up demand must have been remarkable. More remarkable is that it sold out long before the full list of speakers and talks was released (or even existed). Me and 500 other people didn’t even need to know what/who would be there, to decide to go. For me, that is because I have a project going that uses Rails, and I generally enjoy software conferences, and this one is close.

But maybe there is more. Is Ruby+Rails a fad? A cult? I nearly always enjoy the Ruby on Rails Podcast, but its intro music proudly and bizarrely claims that “we’re building a religion”.

I don’t know who “we” are, but I’m not on board with that.