Code Reviews – Use a gate, not a drive-by

Does your team or project use code reviews? If not, I suggest starting today. There are countless resources online about how and why to do code reviews; there are numerous code review tools, open source and commercial.

Who should review code? A great question, perhaps for other blog posts.

When should code be reviewed? Hopefully also a great question, and the topic of this blog post.

Context

When making a recommendation, I’m trying to get in the habit of always mentioning the context, the situation in my work from which my recommendation arises. This is in response to seeing frustration when people try to follow advice without realizing that their situation is completely unlike that of the writer.

Here at Oasis Digital our context is mostly complex line-of-business applications, which are expected to live for a long time and are important to the business operations of the companies that use them. We are almost always replacing an existing system (which a company relies on) or expanding an existing system (which a company relies on).

Another critical part of our context: we have flexible and fast source control, and a group of developers who have learned how to use it competently. We have decoupled source control from build automation, so that we can perform builds of branches in progress without merging them to a mainline. We can achieve a sufficient degree of continuous integration without actually putting each change directly into the same code line.

Review Code Early

How long after code has been written should it be reviewed? Hours or days, not weeks. For this reason, we commit and push code frequently (to a disposable branch of course) where others can see it for early review. It can be a challenge to get past the ego-driven desire to make code perfect before exposing it to the scrutiny of others, but it is worthwhile. By exposing code to review and comment early in its life, a developer can get useful feedback early in a unit of work.

Review Code Often

Another anti-pattern of code review is to look at a given proposed change only once or twice. That is sensible for a very small change, but for a complex change to a complex system it is best to perform code review repeatedly as work proceeds. For example, consider a feature that takes four weeks to implement. Our typical approach, which I recommend to others (if their context is reasonably similar to ours, see above) is something like this:

  • Review the code a few days in.
  • Review the code progress once a week or so.
  • Review the code when it is nearly ready to be called “done” and to go in the mainline.
  • If there are fixes needed, review those promptly so as not to cause delay.

Review Code Now

Code waiting for review is a drag on the speed of development; the right time to review code is now. A quick review now is probably more valuable than a deep review deferred.

Review Code as a Gate

In some organizations, code review happens asynchronously and sporadically, after code is already part of a project. This risks software quality collapse. Once code is in the mainline of a product, it is arguably too late to review. If a reviewer finds a problem with code already in a system, there will be pressure to sweep that problem under the rug and move on, or to convince yourself that it is not really a problem, or that quality doesn’t really matter, just this once.

The right answer is simple: use code review as a gate through which code must pass before it can become part of the mainline of the project. Use your source control system, and stack up proposed code changes that are nearly ready to go into the mainline. Then periodically perform your review process (whether it is in person, remote, synchronous, asynchronous, etc.). Once the code passes a final review process, then it goes in the mainline. If there are serious problems, the code sits in a branch, a developer improves it, and it tries again next time.
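
For concreteness, here is roughly what such a gate can look like with git; consider this an illustrative sketch only, since the same approach works with any source control tool that handles branches well:

    # work happens on a disposable branch, pushed early and often for review
    git checkout -b feature/complex-change
    git push origin feature/complex-change

    # ...review, revise, push, review again...

    # only after the final review passes does the change reach the mainline
    git checkout master
    git merge --no-ff feature/complex-change
    git push origin master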

We have had excellent results with this approach, maintaining software quality for years on end even through ongoing development team turnover.

 

Upcoming Talk: Lua on iPhone and Android (using Corona)

This Thursday (May 26, 2011), I will give a talk at the St. Louis Mobile Dev group on cross-mobile-platform development with Lua. There are various ways to do this (including rolling your own), but for simplicity I’m using Ansca’s Corona product.

As usual, I’ll zoom through some slides, and concentrate instead on the code. For some background on Lua, you may want to watch the video of my 20-minute Lua talk from last year’s Strange Loop.

Update: slides are available here.

 

Write your whole stack in JavaScript with Node.JS

Node is a combination of Google’s V8 JavaScript implementation, and various plumbing and libraries. The result is an unusual and clever server programming platform. Node is in a fairly early development phase, and already has a remarkably active community: ~9000 mailing list messages (as of June 2010) and many dozens of projects and libraries. I’ve spent some time digging through Node code and writing small bits of it, and was pleased with what I found.
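
To give a flavor of the programming model, here is a minimal sketch of an asynchronous HTTP server, along the lines of the canonical example in the Node documentation:

    var http = require('http');

    // the callback runs once per request; no thread sits blocked on I/O
    http.createServer(function (request, response) {
      response.writeHead(200, { 'Content-Type': 'text/plain' });
      response.end('Hello from Node\n');
    }).listen(8124);

    console.log('Server running at http://127.0.0.1:8124/');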

Why is Node worthy of attention?

  • JavaScript is a Next Big Language; it is everywhere. It is probably the most widely used programming language ever.
  • I know a few things about asynchronous server programming, having done a lot of it in 1990s IVR software; it is very well suited to serving a large user population.
  • Node is accumulating libraries at an impressive rate, indicating momentum.
  • There are significant advantages in developing a whole application stack (server and client code) in a single language. For example, this makes code and business logic sharing work across tiers. Using Node, a JavaScript-HTML tool, a JavaScript-CSS tool, JSON, etc., it is possible to develop a complex web application using only JavaScript.

Node is not all unicorns and roses, though; my most serious misgiving about it is that it does not (yet) have a great strategy to make straightforward use of many-core servers. We’ll have to see how that develops over time.

Node Knockout

The team at Fortnight Labs is putting together Node Knockout, a 48-hour Node programming contest. I am a fan of such contests. I’ve offered to help out by being a judge, and I’ve also signed up Oasis Digital as a sponsor.

As a judge, I can’t be on a team; I’d like to see a team or two form here in St. Louis, though.

The Prolog Story

I’ve told this story in person dozens of times; it’s time to write it down and share it here. I’ve again experimentally recorded a video version (below), which you can view on a non-Flash device here.

The Prolog Story from Kyle Cordes on Vimeo.

I know a little Prolog, which I learned in college – just enough to be dangerous. Armed with that, and some vigorous just-in-time online learning, I used Prolog in a production system a few years ago, with great results. There are two stories about that woven together here; one about the technical reasons for choosing this particular tool, and the other about the business payoff for taking a road less travelled.

In 2004 (or so) I was working on a project for an Oasis Digital customer on a client/server application with SQL Server behind it. This application worked (and still works) very well for the customer, who remains quite happy with it. This is the kind of project where there is an endless series of enhancement and additions, some of them to attack a problem-of-the-moment and some of them to enrich and strengthen the overall application capabilities.

The customer approached us with a very unusual feature request – pardon my generic description here; I don’t want to accidentally reveal any of their business secrets. The feature was described to us declaratively, in terms of a few rules and a bunch of examples of those rules. The wrinkle was that these were not “forward” rules (if X, do Y). Rather, these rules described scenarios, such that if those scenarios happened, then something else should happen. Moreover, the rules were about complex transitive/recursive relationships, the sort of thing that SQL is not well suited for.

An initial analysis found that we would need to implement a complex depth/breadth search algorithm, either in the client application or in SQL. This wasn’t a straightforward graph search, though; that part was just the tip of the iceberg. I’m not afraid of algorithmic programming, and Oasis Digital is emphatically not an “OnClick-only” programming shop, so I dug in. After spending a couple of days attacking the problem this way, I concluded that this would be a substantial block of work, at least several person-months to get it working correctly and efficiently. That’s not a lot in the grand scheme of things, but for this particular customer, this would use up their reasonable-but-not-lavish budget for months, even ignoring their other feature needs.

We set this problem aside for a few days, and upon more thought I realized that:

  • this would be a simple problem to describe in Prolog (see the sketch after this list)
  • the Prolog runtime would then solve the problem
  • the Prolog runtime would be responsible for doing it correctly and efficiently, i.e. our customer would not foot the bill to achieve those things.
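
The actual rules are the customer’s secret, so here is a purely hypothetical sketch (with invented predicate names) of why this sort of problem is easy to describe in Prolog: a transitive, recursive relationship takes only a couple of clauses, and the Prolog runtime does all of the searching.

    % hypothetical illustration only, not the customer's rules
    % X affects Y directly, or through any chain of intermediaries
    affected(X, Y) :- depends_on(X, Y).
    affected(X, Y) :- depends_on(X, Z), affected(Z, Y).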

We proceeded with the Prolog approach.

….

It actually took one day of work to get it working, integrated, and into testing, then a few hours a few weeks later to deploy it.

The implementation mechanism is pretty rough:

  • The rules (the fixed portions of the Prolog solution) are expressed in a Prolog source file, a page or two in length.
  • A batch process runs every N minutes, on a server with spare capacity for this purpose.
  • The batch process executes a set of SQL queries (in stored procs), returning a total of tens or hundreds of thousands of rows of data. SQL is used to format that query output as Prolog terms. These stored procs are executed using SQL Server BCP, making it trivial to save the results in files.
  • The batch process runs a Prolog interpreter, passing the data and rules (both are code, both are data) as input. This takes up to a few minutes.
  • The Prolog rules are set up, with considerable hackery, to emit the output data we needed as CSV (sketched after this list). This output is directed to a file.
  • SQL Server BCP imports this output data back into the production SQL Server database.
  • The result of the computation is thus available in SQL tables for the application to use.
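
To make the CSV step concrete, the emitting portion might look something like this; again a hypothetical sketch with invented names, not the production code:

    % facts like the following are generated from the SQL query output:
    %   depends_on(item1, item2).
    % run the rules over all the data, printing one CSV row per result
    emit_csv :- forall(affected(X, Y), format('~w,~w~n', [X, Y])).

forall/2 and format/2 are standard in most Prolog implementations; directing the output to a file is left to the batch script.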

This batch process is not an optimal design, but it has the advantage of being quick to implement, and robust in operation. The cycle time is very small compared to the business processes being controlled, so practically speaking it is 95% as good as a continuous calculation mechanism, at much less cost.

There are some great lessons here:

  • Declarative >>> Imperative. This is among the most important and broad guidelines to follow in system design.
  • Thinking Matters. We cut the cost/time of this implementation by 90% or more, not by coding more quickly, but by thinking more clearly. I am a fan of TDD and incremental design, but you’re quite unlikely to ever make it from a hand-coded solution to this simply-add-Prolog solution that way.
  • The Right Tool for the Job. Learn a lot of them, don’t be the person who only has a hammer.
  • A big problem is a big opportunity. It is quite possible that another firm would not have been able to deliver the functionality our customer needed at a cost they could afford. This problem was an opportunity to win, both for us and for our customer.

That’s all for now; it’s time for LessConf.

When Will It Ship? Estimates and Promises

I’m trying something new with this post: a short video presentation of approximately the same content.


Here is an area of confusion that has come up both at Oasis Digital, and at every other firm where I’ve worked:

estimate ≠ promise

Background: Unpredictability

Around half of my software development and leadership experience has been in enterprise/internal software development, and that is the world I am thinking of as I write this.

Software development, like other endeavors with a significant creative component, is inherently unpredictable. With a good, deep understanding of the development process, you can build a model of the probability distribution of the cost, effort, and elapsed time for software development work. In the large, on average this can be made to work: small and large projects can succeed, within some broad range of predictability.

But notice also how common it is for large complex projects (in software and elsewhere) to be farcically over budget and late. This is not (usually) due to incompetence or fraud. It is because of the inherent unpredictability of the work.

If someone claims that they (or you) can exactly predict software development work, they are:

  • mistaken, or
  • lying, or
  • padding their estimates very substantially, stating a date or cost much later/higher than a neutral median estimate would suggest

As a customer of software development services, and as a provider of such services, I don’t want any of those things.

Blame the Service Trades

I place some of the blame for the confusion of these two wildly different things, on the service trades: it is common for auto repair shops, roof installers, landscapers, and the like to offer something they call an estimate, but which is actually a fixed price quote (a promise).

Sadly, while there are plenty of common good synonyms for promise, there aren’t many for estimate. We’re stuck with using the word estimate, and explaining that we really mean it as defined. Perhaps in a few more decades we will lose the word entirely, much like the word “literally” has come to mean its antonym, “figuratively”, which renders it mostly useless.

Estimates

An estimate is an approximation of an unknown quantity. Typically in the world of software development, it is a prediction of the cost, working hours, or delivery date of a project or milestone. It is not in any sense a commitment, any more than estimating the temperature outside this afternoon is a commitment.

As the word implies, a customer reasonably expects the actual value to vary somewhat, in either direction, from the estimate. In fact, if an estimate turns out to exactly match the actual result, there is a good chance the books have been cooked. Moreover, if the work is completed at-or-before the estimate most of the time, this means the estimates (on average) are too high.

An estimate “costs” nothing, other than the time/effort required to create it, which consists of analyzing the work at hand, decomposing it into parts, and comparing those parts to past work.

Promises

A promise, also called a commitment, deadline, quote, fixed price, etc. is a different beast entirely.

With a promise in hand, a customer should expect with high confidence that the actual value (for cost, hours of work, delivery date) will be less than (before), or equal to, the promised value/date.

Be wary of a promise easily made and freely given: it probably doesn’t mean anything at all. A wise customer (and I aim to count myself in this category) should expect that a casually made commitment will probably be broken; not because the maker is morally defective, but simply because meeting a commitment for complex work requires considerable effort and thought. Without evidence of that effort and thought, it would be mere wishful thinking to expect the results delivered as promised.

Likewise, keeping promises often has a cost. If the work underway gets behind the schedule needed to meet the promise, something will have to give:

  • Other work may fall behind, as time and effort are diverted to meet the promise.
  • Weekends, evenings, and overtime work may be needed. These might appear free, but are not.
  • Staff may need to be reassigned, or added.
  • Additional hardware and software may be needed.

These risks cost real money; thus a wise promise-maker will find that, on average, it costs more to promise feature X by date D than to deliver feature X by date D without such a promise.

Estimates are Cheaper, so Prefer Estimates

At Oasis Digital, we provide many estimates, but few promises. Most of the time, an estimate is what our customers need, and we can provide an estimate at very little cost. Typically we estimate reasonably well:

  • small features usually arrive within a day or two (plus or minus) of the estimated delivery date (and likewise for cost)
  • medium items within a week or two of the estimate, likewise
  • large items (major new features or subsystems with complex interdependencies) within a month or so, likewise

The key here is that with good estimates, commitments (promises) aren’t needed very often, and therefore the cost of promises can be avoided.

But Learn How to Promise Well, Also

Yet occasionally, a customer needs a commitment, most often because a software version needs to be available to match an important business event with a fixed date, such as a presentation, a legal filing, etc. I’ll follow up later (no promise or estimate, as to when) with thoughts on:

  • how to credibly make promises (as a service provider)
  • how to evaluate promises (as a customer)

My First Emacs (plus Slime, Swank, Clojure) on Mac OSX

Emboldened by a talk that Ryan Senior gave at the St. Louis Lambda Lounge (now available as a video), I grabbed the two most popular Emacs versions for Mac OSX (Aquamacs, and a non-Aqua build) and set up each on my Macbook Pro (10.6.2). For each Emacs, I also set up SLIME, Swank, and Clojure.

If that sounds like a bunch of garbled nonsense, don’t worry, it is simply a sign that you are a normal, well adjusted person, rather than a Lisp person.

It was a bit irritating to get these tools up and running the first time; there are countless web pages with the details (I won’t make that worse by adding my own here). The challenge is figuring out which instructions to follow, while ignoring the rest. The most tractable approach for me, for a quickly-installed, easily-updated, fewest-steps process, was to install the needed packages from ELPA, without any manual downloading/configuration of SLIME, Swank, etc. at all. I have no command line operations to share with you, because in the process I recommend, there aren’t any.

I used Emacs in school, but not at all in the last 10 years. Since then, my expectations for developer tools have grown – I think it is reasonable for a modern development environment to simultaneously:

  1. be easily discoverable, for a fast start
  2. cooperate with its environment well
  3. make good use of the large high-res display on most PCs, to expose expected status and control surfaces
  4. be deeply configurable and highly powerful

My experience, coming in from that point of view, differs from what Ryan and a couple of other people suggested: I suggest Aquamacs. It is much more of a good citizen on the Mac platform. Its menus are much more approachable. Printing works in a Mac way. You get tabs showing your buffers, by default. You get menu options to gather them up into one window (frame) or split them apart, all by default.

It may indeed be worth switching to the non-Aqua version later, if/when you get so deeply into the toolset that the pure Emacs experience is preferable… but I estimate you’d need to be pretty far down the path (much further than I).

Something I should try out eventually is the Emacs Starter Kit. Perhaps it would have given a sufficiently “saner set of defaults” to make the non-Aqua version a more reasonable choice. Emacs gurus seem to lean that direction, overall.