gitosis on Ubuntu 9.04 Jaunty

As of April 2009, the gitosis package in Ubuntu 9.04 Jaunty is broken; it fails with an error like so:

pkg_resources.DistributionNotFound: gitosis==0.2

There are quite a few pages and mailing list messages that mention this. I only found one with a good hint toward a solution, which was that it is also a known issue on Debian. Following that lead, I got it working by grabbing newer packages from Debian Unstable:

wget http://ftp.us.debian.org/debian/pool/main/p/python-support/python-support_1.0.2_all.deb
wget http://ftp.us.debian.org/debian/pool/main/g/gitosis/gitosis_0.2+20080825-14_all.deb
sudo dpkg -i python-support_1.0.2_all.deb
sudo dpkg -i gitosis_0.2+20080825-14_all.deb

Use this at your own risk; your mileage may vary.

Please, Use a Web Application Framework

Historically I have not been a fan of “frameworks”, and I have often repeated the following joke:

What’s the difference between an application and a framework?

An application is something a customer actually wants!

However, for some applications, I recommend use of an application framework. For some Oasis Digital projects, I require it:

Please, Use a Web Application Framework

My reasoning here applies specifically to web applications with many CRUD (create-read-update-delete) features, and an underlying database. The advice applies much more widely, and with many nuances and caveats, but this article I am discussing only CRUD-ish web applications. Even within this niche, my reasoning does not apply to web applications which “push the envelope” of what is possible or which attempt to advance the state of the art.

Regardless of the programming language, the application should be built on a framework. More specifically, the framework should popular and mainstream, with a community of developers, and the appearance of momentum for the future. Likewise, the client-side JavaScript used in these applications, should also be based on such a framework. Here are some examples:

Ruby: Ruby on Rails, IOWA

Python: Django, TurboGears, Pylons, TwistedWeb

PHP: Akelos, CakePHP, CodeIgnitor, Symfony, Zend

Java: Struts, Seam, Rife, Tapestry, Stripes, Wicket Spring MVC

JavaScript: Scriptaculous, Prototype, JQuery

This is just a list of some frameworks that I am aware of; I have not evaluated all of these in detail, and I do not endorse them; nor is this an exhaustive list. For Oasis Digital projects, we help evaluate proposed frameworks, then I personally give the go-ahead to use a particular framework for a particular project.

An in-house web application framework does not meet the “community of developers” criteria, except at the very largest firms. Everywhere else, you are better off with an off-the-shelf, popular framework than with an inhouse framework, even if the latter is brilliantly designed.

Justification

My recommendation (and requirement, for some projects) for using an application framework for this kind of application is not based on a fad. Rather it is based on my years of experience as a developer, a team leader, a maintainer, and most importantly, a customer of software development.

The wild success of some frameworks (such as Ruby on Rails) has shown that they can reduce the amount of code and time needed to develop an application. That second factor, the amount of code specific to the application, is at least as important as the development time. Lines of code are not an asset; they are a liability. Only the features that the code provides are an asset. The most valuable software provides a lot of features using the smallest possible amount of application-specific code.

Therefore, even if a developer is so extraordinarily fast that they can create a system very quickly without using an off-the-shelf framework, they still have provided less value by doing so, compared to creating that same system quickly with fewer lines of code.

Another benefit of using a common framework (not a custom, in-house framework) is that this makes an application much easier and faster for other developers are work on in the future. A more maintainable system is more valuable.

Framework Caveats

Vidar Hokstad left a lengthy and excellent comment below, disagreeing with my thesis. It turns out that I mostly agree with Vidar, and it sounds like he and I have been through many of the same experiences with poor application frameworks. There are a lot of things an application framework can do wrong, and sadly, many of them take the opportunity to do so. In-house frameworks created by “architecture astronauts” seems to be especially prone to these defects:

  • All-or-nothing: Some frameworks intentionally or accidentally make it hard to replace a section of the framework. Don’t use these. Use a framework instead that has a “library” philosophy, such that you are readily choose to use some parts but not others.
  • Just Different: There are frameworks which offer an API wrapper around the underlying mechanisms, which isn’t really any better, just different. In this case, different is worse. Writing to (for example) the com.acme.inhouse.servlet API is, all else equal, much worse than writing to the standard Java Servlet API. To be worth its weight, a framework API must be demonstrably and obviously more concise.
  • Lower Abstraction: There are frameworks which, ironically, lower the level of abstraction of the application code, because that code ends up working around the framework features to get the job done.
  • Pile of Pieces: There are frameworks in which it is necessary to shred your application in to a pile of pieces, and then wire those pieces together with configuration files. This is sometimes useful, but often makes the application harder to understand, not easier, especially if there are extensive “XML pushups” involved. (I’m looking at YOU, Struts!) Instead, choose a framework with convention-over-configuration, and one which offers but does not require manual wiring.
  • Keyhole Database Access: If you find you mostly use a frameworks’ DB access features, and as a result you have short, easy to change code, then keep it. But if you find you use extensive SQL to work around lots of framework issues, throw it out. If a framework intentionally makes it hard to reach to the underlying SQL access, throw it out now.
  • No Source: If someone proposes a framework for which you won’t have source code, laugh. Aloud. If this gets you fired, then it has set you on a path to find employment at a more enlightened organization.
  • Exceptionally Bad Exception Handling: Java frameworks are especially prone to issues with exception handling, in which the framework code “eats” exception details.

In summary, pick up a framework and use it to get your application up and running quickly, but don’t be stupid. Do what makes sense locally for your project over time. It is a win to use an application framework to reach “1.0” functionality, even if you end up removing or swapping out parts of it later.

Python or Python+Delphi Developer Wanted

Speaking of Python, over at oasis Digital we’re looking for a Python (subcontract) or Python+Delphi (full time) developer. For the right person this could be a great opportunity to use your preferred tools.

Plus, a tip to anyone applying for this work or any other work: when you email a resume, don’t name it “my resume.doc” or “resume.pdf”. Rather name it with your name, perhaps “Kyle Cordes resume.doc” or “Resume for Smith, John.pdf”. (I fear that someone will read this and miss the point entirely, naming their resume file with my name…)

Update: We have hired for this work – thank you to everyone who applied.

Yet Another Python Success Story

Is it OK to use programming language X in a production enterprise application? Or are fear, uncertainly, and doubt holding you back? Public “success stories” might make it more acceptable for you to do so in your environment.

In that spirit I offer our story of a production Python deployment at an Oasis Digital customer (without names or details, to protect their privacy). There are many other success stories at Python.org. In this project, the client and application server (the bulk of the system) are written in Delphi (which was much more popular when the project started, than it is today). A major subsystem (roughly 1/3rd of the overall system) is written in Python. It consists of a set modules that parse textual data from a large number of varied formats, into a common schema, another set of modules to apply (frequently adjusted) business rules, and a third set of miscellaneous modules. These are all used in background data processing, not part of a client application or a server handling requests from a client application. These modules interact with the rest of the system primarily through state stored in a database. I generally recommend against the database integration style between separate applications, but it works well in this scenario within modules of the same application, built and maintained concurrently by the same team.

The Good

We chose Python for this subsystem for a variety of reasons. First, its built-in features are well suited to the text processing task at hand. Python’s “batteries included” have generally avoided the need to find or implement add-on text processing tools (which would have been necessary in Delphi); thus a programmer needs to know and use “just” what’s in the Python box, with few external libraries to consider.

Second, Python’s built in features and compact syntax have shortened the programming time considerably, in our estimation, than would have been otherwise. It takes relatively few lines of Python to get the job done. We have many lines of Python, and would have had many more lines had we used a lower-level language. (Of course lines of code is not everything; it’s possible to come up with dense, bad code. As a general rule, though, a language in which you can express what you need to express more succinctly, is better.)

Third, Python’s interpreted nature keeps the edit-test cycles short, further speeding development. This development speed issue is especially important given the niche this project occupies, in which data format changes and rule changes sometimes arrive with no notice: new data arrives, and part of the application does not work until it is enhanced to handle the new data. Extensive use of automated unit and integration tests (many hundreds of test cases) effectively prevents the interpreted operation from causing trivial runtime errors (type errors, syntax errors).

Of course, some other languages with similar compact syntax, included libraries, and high level features, would have worked equally well. At the time(5+ years ago), though, Python appeared to have the most momentum, other than Perl. Perl would have been a good choice also from a technical point of view, but it had an (unwarranted) reputation of being hard to maintain, which I didn’t want to have to overcome.

The Bad

There are downsides to our two-language approach. The first is somewhat Python specific, and really not a big deal: Python is slow. Its primitives are fast, but when you write considerable Python code to do something, it does that something at a rather leisurely pace compared to Delphi or Java or C++, using a lot more CPU along the way. The practical impact of this has been limited, because the bottleneck on this system is not the Python code, it is the database; but still, this has been inconvenient, and has required our customer to deploy multiple machines for this subsystem where one would have been sufficient with a more efficient language/runtime. Doing so is not particularly expensive, though, and adds a measure of reliability, so we haven’t had a need to speed things up with Psyco, C modules, etc.

The other issue is more serious. It is created by the large gap in language style and features between Delphi vs. Python in particular, and low-level vs. scripting languages in general. (Those of you unfamiliar with Delphi may be thinking this is because Delphi is some hideous VB-like toy. Wrong. Delphi is a somewhat C++-like or Java-like language, statically compiled, fast, and sadly burdened with a Pascal syntax.) Personally, this gap bothers me not at all. I’ve written production code in assembly, C, C++, Delphi, Java, Python, Ruby, Javascript, Lua, a bit of Prolog, and others I forget right now; I am happy to use one language in the morning and another radically different one in the afternoon.

I have discovered that most developers are not like me, though. Most Delphi developers are notably uninterested in Python and vice versa. As a result, our project team has ended up divided along the same lines as the software, with some cross training but relatively little production development crossover between the developers working in each languages. This is an obstacle to any developer taking end-to-end responsibility for features or issues that span the languages, and also an obstacle to hiring.

Python itself is not much of an obstacle to hiring: while there are far fewer Python programmers than Java (for example) programmers, there are also far fewer Python jobs than Java jobs.

The Verdict

In spite of the downsides discussed above, overall it has been a “win”, technically, to use two languages (each well suited to part of the application) in this project. More importantly, I am also confident this choice has been a win for our customer: they got a system delivered faster, and at lower cost, than they otherwise would have. They used every bit of speed we could deliver, to win business from their competitors.

However, the world has improved a lot since this decision was made; today I could probably choose a single language / toolset which meets all the needs sufficiently well, and thus avoid the downsides of the two-language solution. Alternatively, if starting today I might build the infrastructure for all subsystems in the same base language, with hooks to use Lua or Javascript scripting to accommodate the need for rapid runtime logic changes. It’s even possible that we would port the existing code to another language in the future – which would not make the original decision a mistake.

Next Big Language = JavaScript

There’s a lot of buzz about Steve Yegge’s “port” of Rails to JavaScript, and Steve has now provided (in his funny, self-deprecating style) the background of how it came to be. He doesn’t quite say it explicitly in this post, but I think it reveals that the “Next Big Language” he has been hinting at is JavaScript.

I (mostly) agree:

JavaScript is in nearly every browser, including tiny ones (like the one in my BlackBerry Pearl). It may be the single most widely available language today.

Because of the above, an enormous population of JavaScript programmers (though sometimes of dubious skill) has emerged.

Starting with Java 6 it’s “in the box” there also. To me, this makes it the likely winner, by a wide margin, for a dynamic language to be used at Java shops or inside Java projects. Being “in the box” is a powerful advantage, one which the many other contenders will have a hard time overcoming.

Adobe’s new JavaScript virtual machine implementation, which they handed over to Mozilla as “Tamarin”, sounds like it will boost JavaScript performance great, making it good enough for a very wide variety of projects.

JavasScript uses curly braces, like the last few Big Languages.

Like Java, C, C++, etc., JavaScript has specs and multiple competing, complete, current, high quality implementations. This, to me, is a big advantage over Ruby, Python, and other currently popular dynamic languages. Of course there is plenty of room in the industry for these language to thrive also, I am not saying any of them will go away; we use Python with great results and expect to keep doing so.
Mark Volkmann initially thought I was nuts to predict JavaScript as a winner but came around a few month later (and said so in a user group talk).

In a project at work, we’d adopted JavaScript as our plugin extension language for user-customizable rules (billing rules, etc.). I’d have chosen Lua (as I did for another project), but there are at least 1000x as many JavaScript programs out there. So far it has worked very well. If we had it to do over we might implement far more of the project in JavaScript.

However, there are a few reasons why I only “mostly” agree:

First, with JavaScript there isn’t a good way to avoid shipping source code. Sure, you an obfuscate JavaScript with various tools, but the results remains far for amenable to readable-source recovery than in a more traditionally compiled language. For open source projects this is no big deal, but there are also many worthwhile businesses and projects which depend on proprietary, not open software (including most of our projects), and it’s not year clear that obfuscation is sufficient protection. (Update in reply to a comment below: This matters even for server-side software, because some of us create and sell software products for other people to run on their servers.)

Second, at the moment JavaScript appears to lack a module system, without which it’s painful to build large systems. I expect an upcoming language version will address this.

Indentation as Block Structure – HAML instead of RHTML

When I starting with Python sometime in 2001, I was briefly frustrated by the intentation-as-block-structure syntax; but after a few weeks I found it  natural. Its most obvious advantage is that it avoid the duplication between indentation and braces / keywords. Yet this kind of syntax has not become popular outside of Python.

Today I saw an interesting use of it “in the wild”: HAML, an HTML templating mechanism for Ruby on Rails. I haven’t used HAML (and may not, since at the moment we have only some sample projects using RoR, nothing in production), but from the tutorial it appears to be a very tight (indentation-based) syntax for HTML templating. I’ve encountered a Rubyist or two who disdains the Python syntax – I wonder if that similarity will limit HAML’s adoption.