Please, Use a Web Application Framework

Historically I have not been a fan of “frameworks”, and I have often repeated the following joke:

What’s the difference between an application and a framework?

An application is something a customer actually wants!

However, for some applications, I recommend use of an application framework. For some Oasis Digital projects, I require it:

Please, Use a Web Application Framework

My reasoning here applies specifically to web applications with many CRUD (create-read-update-delete) features, and an underlying database. The advice applies much more widely, and with many nuances and caveats, but in this article I am discussing only CRUD-ish web applications. Even within this niche, my reasoning does not apply to web applications which “push the envelope” of what is possible or which attempt to advance the state of the art.

Regardless of the programming language, the application should be built on a framework. More specifically, the framework should be popular and mainstream, with a community of developers, and the appearance of momentum for the future. Likewise, the client-side JavaScript used in these applications should also be based on such a framework. Here are some examples:

Ruby: Ruby on Rails, IOWA

Python: Django, TurboGears, Pylons, TwistedWeb

PHP: Akelos, CakePHP, CodeIgniter, Symfony, Zend

Java: Struts, Seam, Rife, Tapestry, Stripes, Wicket, Spring MVC

JavaScript: Scriptaculous, Prototype, jQuery

This is just a list of some frameworks that I am aware of; I have not evaluated all of these in detail, and I do not endorse them; nor is this an exhaustive list. For Oasis Digital projects, we help evaluate proposed frameworks, then I personally give the go-ahead to use a particular framework for a particular project.

An in-house web application framework does not meet the “community of developers” criterion, except at the very largest firms. Everywhere else, you are better off with an off-the-shelf, popular framework than with an in-house framework, even if the latter is brilliantly designed.

Justification

My recommendation (and requirement, for some projects) for using an application framework for this kind of application is not based on a fad. Rather it is based on my years of experience as a developer, a team leader, a maintainer, and most importantly, a customer of software development.

The wild success of some frameworks (such as Ruby on Rails) has shown that they can reduce the amount of code and time needed to develop an application. The first of those factors, the amount of code specific to the application, is at least as important as the development time. Lines of code are not an asset; they are a liability. Only the features that the code provides are an asset. The most valuable software provides a lot of features using the smallest possible amount of application-specific code.

Therefore, even if a developer is so extraordinarily fast that they can create a system very quickly without using an off-the-shelf framework, they still have provided less value by doing so, compared to creating that same system quickly with fewer lines of code.

Another benefit of using a common framework (not a custom, in-house framework) is that this makes an application much easier and faster for other developers to work on in the future. A more maintainable system is more valuable.

Framework Caveats

Vidar Hokstad left a lengthy and excellent comment below, disagreeing with my thesis. It turns out that I mostly agree with Vidar, and it sounds like he and I have been through many of the same experiences with poor application frameworks. There are a lot of things an application framework can do wrong, and sadly, many of them take the opportunity to do so. In-house frameworks created by “architecture astronauts” seem to be especially prone to these defects:

  • All-or-nothing: Some frameworks intentionally or accidentally make it hard to replace a section of the framework. Don’t use these. Use a framework instead that has a “library” philosophy, such that you can readily choose to use some parts but not others.
  • Just Different: There are frameworks which offer an API wrapper around the underlying mechanisms, which isn’t really any better, just different. In this case, different is worse. Writing to (for example) the com.acme.inhouse.servlet API is, all else equal, much worse than writing to the standard Java Servlet API. To be worth its weight, a framework API must be demonstrably and obviously more concise.
  • Lower Abstraction: There are frameworks which, ironically, lower the level of abstraction of the application code, because that code ends up working around the framework features to get the job done.
  • Pile of Pieces: There are frameworks in which it is necessary to shred your application into a pile of pieces, and then wire those pieces together with configuration files. This is sometimes useful, but often makes the application harder to understand, not easier, especially if there are extensive “XML pushups” involved. (I’m looking at YOU, Struts!) Instead, choose a framework with convention-over-configuration, and one which offers but does not require manual wiring.
  • Keyhole Database Access: If you find you mostly use a framework’s DB access features, and as a result you have short, easy to change code, then keep it. But if you find you use extensive SQL to work around lots of framework issues, throw it out. If a framework intentionally makes it hard to reach the underlying SQL access, throw it out now.
  • No Source: If someone proposes a framework for which you won’t have source code, laugh. Aloud. If this gets you fired, then it has set you on a path to find employment at a more enlightened organization.
  • Exceptionally Bad Exception Handling: Java frameworks are especially prone to issues with exception handling, in which the framework code “eats” exception details.
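
The exception-handling defect is easiest to see in code. The following is a minimal sketch in Python rather than Java, and every name in it (FrameworkError, load_record_badly, load_record_well, describe_failure) is hypothetical. It only contrasts a wrapper that “eats” the original exception with one that keeps the original attached as the cause.

```python
class FrameworkError(Exception):
    """Hypothetical framework-level exception type, for illustration only."""


def parse_quantity(text):
    # Application-level work that can fail with a detailed, useful error.
    return int(text)


def load_record_badly(text):
    # Anti-pattern: the original exception and its message are discarded.
    try:
        return parse_quantity(text)
    except Exception:
        raise FrameworkError("operation failed") from None


def load_record_well(text):
    # Better: wrap if you must, but chain the original exception as the cause.
    try:
        return parse_quantity(text)
    except Exception as exc:
        raise FrameworkError(f"could not load record: {exc}") from exc


def describe_failure(func, text):
    # Returns (message, original cause or None) for whatever func raises.
    try:
        func(text)
    except FrameworkError as err:
        return str(err), err.__cause__
    return None, None
```

With the first wrapper, whoever debugs the failure sees only “operation failed”; with the second, the underlying ValueError and the offending input are still available.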

In summary, pick up a framework and use it to get your application up and running quickly, but don’t be stupid. Do what makes sense locally for your project over time. It is a win to use an application framework to reach “1.0” functionality, even if you end up removing or swapping out parts of it later.

Next Big Language = JavaScript

There’s a lot of buzz about Steve Yegge’s “port” of Rails to JavaScript, and Steve has now provided (in his funny, self-deprecating style) the background of how it came to be. He doesn’t quite say it explicitly in this post, but I think it reveals that the “Next Big Language” he has been hinting at is JavaScript.

I (mostly) agree:

JavaScript is in nearly every browser, including tiny ones (like the one in my BlackBerry Pearl). It may be the single most widely available language today.

Because of the above, an enormous population of JavaScript programmers (though sometimes of dubious skill) has emerged.

Starting with Java 6 it’s “in the box” there also. To me, this makes it the likely winner, by a wide margin, for a dynamic language to be used at Java shops or inside Java projects. Being “in the box” is a powerful advantage, one which the many other contenders will have a hard time overcoming.

Adobe’s new JavaScript virtual machine implementation, which they handed over to Mozilla as “Tamarin”, sounds like it will boost JavaScript performance greatly, making it good enough for a very wide variety of projects.

JavaScript uses curly braces, like the last few Big Languages.

Like Java, C, C++, etc., JavaScript has specs and multiple competing, complete, current, high quality implementations. This, to me, is a big advantage over Ruby, Python, and other currently popular dynamic languages. Of course there is plenty of room in the industry for these languages to thrive also; I am not saying any of them will go away. We use Python with great results and expect to keep doing so.

Mark Volkmann initially thought I was nuts to predict JavaScript as a winner but came around a few months later (and said so in a user group talk).

In a project at work, we’d adopted JavaScript as our plugin extension language for user-customizable rules (billing rules, etc.). I’d have chosen Lua (as I did for another project), but there are at least 1000x as many JavaScript programs out there. So far it has worked very well. If we had it to do over we might implement far more of the project in JavaScript.

However, there are a few reasons why I only “mostly” agree:

First, with JavaScript there isn’t a good way to avoid shipping source code. Sure, you can obfuscate JavaScript with various tools, but the result remains far more amenable to readable-source recovery than in a more traditionally compiled language. For open source projects this is no big deal, but there are also many worthwhile businesses and projects which depend on proprietary, not open software (including most of our projects), and it’s not yet clear that obfuscation is sufficient protection. (Update in reply to a comment below: This matters even for server-side software, because some of us create and sell software products for other people to run on their servers.)

Second, at the moment JavaScript appears to lack a module system, without which it’s painful to build large systems. I expect an upcoming language version will address this.

BaseJumpr: BaseCamp -> ActiveCollab

BaseJumpr has a fascinating service offering: they export your data from your Basecamp account, producing a set of files ready to import into ActiveCollab, the open source Basecamp-sorta-clone-like-program. They then, if you wish to buy their hosting service, create an instance of ActiveCollab for you and import your data there. (They host your file storage on Amazon S3, so they can easily offer ample storage.)

I find this very appealing, yet also a bit impolite; 37Signals has built a good business on Basecamp, the ActiveCollab team has created (well, is creating) an open source clone, while BaseJumpr did neither of these things yet stands to gain (at 37s’s expense). However, I doubt BaseJumpr is a significant threat or bother to 37Signals because most users interested in the open source ActiveCollab would likely not be using the Basecamp service in the first place.

Speaking of Basecamp, I am fascinated by 37Signals’ business success with such a simple (but well executed) application. I tried out Basecamp myself, and found it far too feature-anemic for my taste; but I could readily see its appeal and simplicity, and it has me thinking about the merit of building a business in a focussed niche, intentionally and happily excluding the potential customers outside it.

Update in 2009: BaseJumpr doesn’t appear to exist any more. I am curious how it worked out.

Indentation as Block Structure – HAML instead of RHTML

When I started with Python sometime in 2001, I was briefly frustrated by the indentation-as-block-structure syntax; but after a few weeks I found it natural. Its most obvious advantage is that it avoids the duplication between indentation and braces / keywords. Yet this kind of syntax has not become popular outside of Python.
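
For readers who have not seen the style, here is a small Python illustration (the function and its name are made up for this purpose). The indentation alone says where each block begins and ends; there are no braces or “end” keywords repeating the same information.

```python
def classify(numbers):
    evens, odds = [], []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)   # inside the if block
        else:
            odds.append(n)    # inside the else block
    return evens, odds        # back at function level, outside the loop
```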

Today I saw an interesting use of it “in the wild”: HAML, an HTML templating mechanism for Ruby on Rails. I haven’t used HAML (and may not, since at the moment we have only some sample projects using RoR, nothing in production), but from the tutorial it appears to be a very tight (indentation-based) syntax for HTML templating. I’ve encountered a Rubyist or two who disdains the Python syntax – I wonder if that similarity will limit HAML’s adoption.

A/B Technique for Web Application Deployment

This description of my “A/B technique for web application deployment” was transcribed from audio, so it is less tight, more verbose than my normal prose. I chose to post it in rough form, rather than leave it on the “back burner” until an unknown future date when I have time to rewrite it. I first explained this to a colleague around 1999; 8 years is long enough for an idea to wait.

The Problem / Context

At least a dozen times over the last decade, this scenario has come up at consulting client sites: you have a web application and you want to upgrade it with a new version. You could do so with a brute force cutover (stop the app, swap the code), but that’s not the scenario that I’m talking about. I’m talking about upgrading to a new version, not quite compatible with the old one, without dropping current active users. For example, in the new version, you might have some different data that goes in the session. You might have some different pages, and so some different URLs. You may be adding a new field, so that once you put in this new version, it stores that additional field in the session and in the database and the caching and so on. Yet you want the current users to keep working without interruption.

Solution

This technique is not language-specific – it applies equally in PHP, ASP, ASP.NET, Java servlets, JSP, CGI, etc.; with nearly any application or infrastructure.

Have the URL of your application, which I’ll call “/contact” here, be the URL of a proxy application (or “launching pad”). Then have two additional URLs for two specific instances of the actual application. For example, you might have “/contact” as your overall application URL and then “/contact/contactA” or “/contact/A” as one instance where you have that application installed.

At the “/contact” URL, install a simple proxy application, whose job is to take a newly arriving user, present them an intro/login screen, then redirect them to one of the specific instances of the application.

As a user I will point my browser to the “/contact” URL, and I bookmark that. I launch that, I see a login screen, I type in my name and password, I press the button to log in. The “/contact” launching pad redirects me to “/contact/A”, an instance of the real application. I’ll call the second instance “B”, perhaps at the URL “/contact/B”. In the normal steady state of the system, the user will be using A: they log in to “/contact” and they end up in the “A” instance.

Then you want to do an upgrade, install a new version. Install the new version as “/contact/B”. Leave the existing application in place and working. (By the way, I’ve assumed you are using a technology where you can deploy and undeploy applications without bringing your web server down, but with mod_proxy you could make this work even without that capability.) Deploy the new application version in a new and different path than what your current running users are on. Adjust some setting in your proxy/launchpad application (perhaps as simple as a single line in a single config file). So, for example, you might install the new version as “/contact/B” and then you flip a switch (edit the config file) to make it so that new users that come to the “/contact” page don’t land in “/contact/A” anymore, they land in “/contact/B” as they log in.
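
The launchpad and its switch can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not a real deployment: the CONFIG dictionary stands in for the “single line in a single config file”, and check_credentials, login_redirect and flip_switch are hypothetical names invented for this example.

```python
# Hypothetical settings; in practice this might be one line in a config file.
CONFIG = {"current_instance": "A"}

INSTANCE_PATHS = {"A": "/contact/A", "B": "/contact/B"}


def check_credentials(username, password):
    # Placeholder authentication; a real launchpad would consult a user store.
    return bool(username) and bool(password)


def login_redirect(username, password, config=CONFIG):
    """Return the URL a newly logged-in user should be redirected to,
    or None if the login is rejected."""
    if not check_credentials(username, password):
        return None
    return INSTANCE_PATHS[config["current_instance"]] + "/"


def flip_switch(config=CONFIG):
    # Point new logins at the other instance; existing sessions are untouched.
    config["current_instance"] = "B" if config["current_instance"] == "A" else "A"
    return config["current_instance"]
```

Note that flipping the switch changes only where new logins land. Nothing in the launchpad touches users who are already working inside an instance.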

The current users already using the application in “/contact/A” stay there – they don’t know or care that you’ve deployed a new version. New users come in to the new version. So you want to have some sort of mechanism (likely provided by your application server if you’re using one, and not hard to build otherwise) for monitoring how many users are using each of these applications. So you might for example notice that you have 1000 users active on /contact/A. You deploy a new version as /contact/B and flip the switch. Then, depending on the usage characteristics of your application – over the next few minutes, next few hours, however it works out – the users, as they log out and log in and such, gradually all make it into the /B application. Some kind of maximum-login-time mechanism will ensure that this cutover happens in finite time.

Once the users have moved to /contact/B, you then declare it as your “current” version, and you take down /A, because no one’s using it and no one can get into it. So that next time you need to do an upgrade, you just do it in reverse – you deploy that new version as /contact/A, flip the switch back to make all new users’ logins land in the A… and again, after however many hours or minutes or whatever, you have all your users on the new version, and you can take down the old version.
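
If your application server does not already provide the monitoring, something like the following would do. It is a rough Python sketch with assumed names (SessionTracker, MAX_LOGIN_SECONDS) and an assumed eight-hour maximum login time; it counts active sessions per instance, expires sessions that have exceeded the maximum, and reports when the old instance is empty and can be taken down.

```python
import time

MAX_LOGIN_SECONDS = 8 * 60 * 60  # assumed maximum session length


class SessionTracker:
    """Tracks which instance each active session lives on (illustrative only)."""

    def __init__(self, max_age=MAX_LOGIN_SECONDS):
        self.max_age = max_age
        self.sessions = {}  # session id -> (instance name, login timestamp)

    def login(self, session_id, instance, now=None):
        self.sessions[session_id] = (instance, time.time() if now is None else now)

    def logout(self, session_id):
        self.sessions.pop(session_id, None)

    def expire_old(self, now=None):
        # Enforce the maximum login time so that draining finishes in finite time.
        now = time.time() if now is None else now
        expired = [sid for sid, (_, started) in self.sessions.items()
                   if now - started >= self.max_age]
        for sid in expired:
            del self.sessions[sid]

    def active_count(self, instance):
        return sum(1 for inst, _ in self.sessions.values() if inst == instance)

    def safe_to_take_down(self, instance, current_instance):
        # The old instance can go once it is not current and nobody is on it.
        return instance != current_instance and self.active_count(instance) == 0
```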

You can easily implement this with just the tools that already come with your Web application development system. You don’t need any kind of special hardware or special application server or HTTP server support. You don’t need any sort of special way of doing session affinity; you’re doing session affinity by simply handing out the URL of one of these other Web applications.

Bookmarks

Someone might bookmark a page of your application. So let’s say that you had directed them to /contact/A, and they were on the page /contact/A/lists.jsp. When they return to this bookmark later (you do want to support bookmarking, right?), you don’t want them to land there; you don’t want them to end up in the A application if you’re currently using the B application. This is actually pretty easy to handle also. You simply use some settings on your Web server to do a redirect, with a few lines of configuration in .htaccess or analogous mechanisms. So based on your setting of which one is current, you make it so that if someone comes into the application without having a referrer from inside the application, you just redirect them over to whatever the current instance is. And that takes a little bit of thought, but only a little, and you can make it seamlessly solve that problem of users’ bookmarks working in spite of you switching back and forth between two instances.
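
The decision itself is simple enough to write down. The sketch below expresses it in Python rather than as web server configuration; bookmark_redirect and its parameters are hypothetical names, and the rule it encodes (redirect only when the request targets the non-current instance and carries no referrer from inside that instance) is my reading of the approach described above.

```python
from urllib.parse import urlparse

APP_PREFIX = "/contact/"
INSTANCES = ("A", "B")


def bookmark_redirect(path, referrer, current_instance):
    """Return a path to redirect to, or None to serve the request as-is."""
    if not path.startswith(APP_PREFIX):
        return None
    rest = path[len(APP_PREFIX):]
    instance, _, tail = rest.partition("/")
    if instance not in INSTANCES or instance == current_instance:
        return None
    # A referrer from inside the same instance means an in-progress session
    # on the old version, which should be left alone.
    if referrer and urlparse(referrer).path.startswith(APP_PREFIX + instance + "/"):
        return None
    return APP_PREFIX + current_instance + "/" + tail
```

A bookmarked /contact/A/lists.jsp arriving with no referrer while B is current is sent to /contact/B/lists.jsp, while a click from one /contact/A page to another is served by A as before.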

Clustering

You might be deployed on a cluster. Perhaps you are using Websphere with 37 web servers. It turns out that this A/B approach works orthogonally to the clustering features of your Web application server. You could have the A application deployed across all 37 servers; you could deploy the B application, with a few clicks, across all 37 servers; you could flip that switch in some global way to kick people onto the B, and so on.

Override the launchpad for testing

You can permit users to enter a special URL to get to the “other side”. You could have some way of entering a URL that takes you past that launch application straight into the B side, so that you could click around and manually verify that the new B application works in the production environment before you flip the switch to make that the deployed production system. This is a very wise and useful type of testing to do, a great final stage of testing because the new code is actually in production. It’s obviously not a replacement for testing in a separate test environment; it’s an adjunct for even greater safety in deployment.
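
One way to picture the override, as a small Python sketch with invented names (choose_instance, preview, is_tester): ordinary logins always follow the switch, while a tester may explicitly ask for either side.

```python
INSTANCES = ("A", "B")


def choose_instance(current_instance, preview=None, is_tester=False):
    """Pick the instance for a login; testers may ask for a specific side."""
    if is_tester and preview in INSTANCES:
        return preview
    return current_instance
```

How a user is recognized as a tester (a special URL, an account flag, a source address) is left open here; the source only calls for some way to get past the launchpad.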

Performance

When a system is running in a steady state, its caches are fully populated with relevant data, so many requests can be answered with data from the cache (RAM). But when a system is freshly started, its caches are empty, so more requests require (slower) disk access, during the first few minutes of operations. This is sometimes called the “empty cache” problem, and is responsible for the poor performance sometimes seen in the first few minutes after a busy system is restarted.

The technique described here prevents this problem, because with it you avoid ever shutting down and restarting your whole Web application with your full user population on it. Instead, since the switch only brings newly-logging-in users to the new version – the new instance – you gradually have people start using it, so you never take a big hit all at once in terms of cache population.

Schema Changes

Hani asked, in a comment, about schema changes. A simple answer is that you won’t be able to make a transition like the one described here (where both the old and new code versions run in parallel for a while), if you make schema changes such that the old code no longer works. A more complex answer, which I have used with great results, is that this is a programmable computer and you are a programmer – with some effort, you can make the software tolerate both the old and new schemas. So the process works like this:

  1. Decide on the schema change, but don’t deploy it
  2. Modify your software to tolerate the old or new schema, whichever is present
  3. Deploy the new software, transition all users to it (as described above)
  4. Make the schema change; you may need to momentarily quiesce the software, but hopefully not kill user sessions
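
Step 2 is the part that takes real work, so here is a small sketch of what “tolerate the old or new schema” can look like. It uses Python’s built-in sqlite3 module with an in-memory database; the contacts table, the nickname column, and the function names are all assumptions made up for the example, and a production database would need its own way of detecting which schema is present.

```python
import sqlite3


def has_column(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    # The table name is a trusted constant here, not user input.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def save_contact(conn, name, nickname=None):
    # Step 2: tolerate whichever schema is present.
    if has_column(conn, "contacts", "nickname"):
        conn.execute("INSERT INTO contacts (name, nickname) VALUES (?, ?)",
                     (name, nickname))
    else:
        conn.execute("INSERT INTO contacts (name) VALUES (?)", (name,))


def apply_schema_change(conn):
    # Step 4: made only after all users are on the tolerant code.
    conn.execute("ALTER TABLE contacts ADD COLUMN nickname TEXT")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (name TEXT)")
    save_contact(conn, "Ada", "countess")   # old schema: nickname is dropped
    apply_schema_change(conn)
    save_contact(conn, "Grace", "amazing")  # new schema: nickname is stored
    return conn.execute(
        "SELECT name, nickname FROM contacts ORDER BY name").fetchall()
```

The same save_contact code runs before and after the ALTER TABLE, which is what lets the schema change happen while the software stays up.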

(There are a few tools out there to help with the schema-change-in-a-live-app problem. One of them is ChronicDB, who wrote me to point this out.)

Of course this is a lot more work than just stopping the server, making your change, and restarting. Whether it’s worth it depends on your situation. If you have an overnight non-usage window, consider using it instead of the long path described here.

I hope this is helpful for someone out there. Comments are welcome.

$200 -> Rubinius

I’ve been using Ruby sporadically for some time, including in a bit of production code (in which it is running well), but the apparent lack of progress toward a more modern VM for Ruby makes it harder to get more deeply involved. On the one hand, today’s Ruby interpreter/runtime is sufficiently good to build very successful services on (37Signals’ Rails-based services, for example); but in my own testing for the kinds of higher volume data handling I often need to do, it’s among the slowest I’ve used. That matters little for populating a web page, but matters a lot for things like OLAP ETL.

So today I joined Geoffrey Grosenbach in supporting Evan Phoenix’s rubinius project, by sending $200 to help sponsor the work. It’s not much in the grand scheme of things, but I believe in “putting your money where your mouth is”.

This isn’t the first time I mentioned Geoff; earlier this year I took him to task for his choice of music for the Ruby on Rails podcast, which has changed since then to something more suitable.