code – Kyle Cordes

Flex version of my overworn “flying boxes” GUI demo

Long-time fans may remember my “flying boxes” demo at the St. Louis Java user group in 2005, or my followup work in 2008, in which I translated that demo to JavaScript (Rhino).

I picked that demo project up again last year and recreated it with Flex 3; you can now try it out online, download the code, or browse it on github. It looks like this:

Click on it to try it “live” in as Flash/Flex.

The interesting part is the drag-drop movement – not my less-then-stellar color and design choices.

Rhino + JavaScript + Swing, Look Ma No Java

A while back I was discussing the future of programming languages with a colleague, and we agreed that for all its foibles, JavaScript will continue to enjoy very wide and increasing use in the coming years. I wrote last year about Steve Yegge’s hints that JavaScript is the “next big languages”, see that post for the reasoning.

Based on all that, I set about writing a small test app to see what it’s like to program a Swing app with JS. After a day or so of work (spread over a few months), I offer my Rhino Swing Test App:

Run RSTA now via Java Web Start

Get the RSTA code (git, on GitHub)

It implements the same “flying boxes” animation demo that I presented a few years ago at the St. Louis JUG, but aside from a generic launcher class, the GUI is implemented entirely in JavaScript. To clarify, this is not web browser JavaScript; it is running in Rhino, in the JVM, using Swing classes.

The documentation for interaction between Java and JS is limited, but sufficient. For simplicity, I used Rhino as an interpreter, I did not compile to java bytecode. Nonetheless, the animation runs about as smoothly in JS as it does in Java, because the heavy lifting is done by the JDK classes.
I used Eclipse (with JavaScript support) to write this code, but of course JS makes much less code completion possible than Java, and I missed that. Typically I mildly prefer IDEs, but am also productive with a text editor. For working with a large API like Swing though, IDE support helps greatly.

Still, I recommend a look at this approach if you are fond of dynamic languages but need to build on the Java platform, and I intend to investigate server-side JS development also.

Git on Windows, it actually works now

I’ve been trying out various distributed source control tools, and used several of them for various very small projects. I’ve most mostly settled on git as the one I prefer, but I haven’t yet published any code with it. Also, I’ve been frustrated that git support for Windows has been very weak.

Msysgit appears to have solved the git-windows problem, at least well enough for small scale work. If you’ve been holding back on trying git because you use Windows, now is the time to jump in.

Update: I’ve posted details on how to get started with msysgit and GitHub as well as a comparison of Git software for Windows.

Help! My Hierarchy is Slow – Faster Hierarchies with Nested Sets

A great many applications, including many that I’ve worked on, have a hierarchy of things: of parts, of people, of organizations, etc. The way most of us represent such hierarchy is with the first thing that generally comes to mind: make each Widget have a parent Widget, with a table like so:

create table widget (widget_id int, parent_widget_id int, other_fields_here);

This representation is called an “adjacency list”, and is simple and easy. You can readily build a tool to manipulate a hierarchy stored this way. Many off the shelf visual components, for both client-side and web applications, know how to manipulate hierarchies represented this way. Some reporting tools know how to report on hierarchies represented this way.

However, for answering common questions like “who all is under person X in the hierarchy”, the adjacency list approach is unwieldy and slow.

There are various other approaches to representing a hierarchy, most of them discussed in detail in Joe Celko’s articles and books, prominently in the book Trees and Hierarchies in SQL for Smarties. If you work with SQL and hierarchies, buy this book now.

One approach Celko is especially fond of is the “nested set” representation. You can read about it online here and here.

Of course, changing an entire application to use nested sets might be a very big deal in a mature application. That’s OK; in most cases we can get much of the benefit by building a nested-sets “cache” of the share of the hierarchy, with a table like so:

create table widget_hier_cache (widget_id int, left int, right int);

Each time the hierarchy has changed, or before each time we need to run complex queries, delete the rows in this cache table and repopulate them based on the current canonical adjacency data. Celko offers SQL code in his book to do that, which could be translated to work in the stored procedure language of the DBMS at hand. But what about DBMSs that don’t offer stored procedures, such as lightweight local databases (SQLite), MySQL, etc.? The translation must be done in application code instead.

I wrote such code in Delphi a while back, in the process of getting a full understanding of this problem; I’ve cleaned it up and now offer code for download here (DelphiAdjacencyNestedSets.zip), under an open source license (MIT license – use it all you want). I tested this today with Delphi 2007 Win32, but it should work fine at least back to Delphi 7. As far as I can tell with some searching, this is the only Delphi code for translating adjacency to nested sets available on the internet. This code doesn’t know about databases – it is a module to which you feed adjacency data, and from which you get back nested set data. It includes DUnit test cases.

I’ve also put the code on github, for easy browsing and forking.

(Update in August 2007: a new version (DelphiAdjacencyNestedSets2.zip) optionally tolerates “orphan” nodes and forests. Update in January 2008: a newer version (DelphiAdjacencyNestedSets3.zip) propagates an integer value down the hierarchy; it is named Tag as a nod to the .Tag property on VCL components. Both of these newer versions were tested with Delphi 2006/2007 also.)

The essence of the translation is a depth-first traversal of the hierarchy, and of course this can be easily implemented in other languages; the Delphi code is easy to understand, so don’t be afraid to take a look even if you need some Java or C# or PHP etc. I also stumbled across this PHP nested sets implementation, which offers a set of functions to maintain (insert, update, etc.) a hierarchy stored as nested sets, rather than only translate from adjacency to nested sets.

Another useful way to represent a hierarchy for fast querying is with a transitive closure table. I’ll write this up in a future post; it turns out to be especially useful (and necessary) to make arbitrary hierarchies work in the Mondrian OLAP server.

How Many (Java) Classes Do You Need To Go (XP) Bowling?

An object-oriented developer searches for a reason to add more of them.

Ron Jeffries recently posted a couple of articles on simple design, with the
specific example of code to score a bowling game:

First article,
with lots of classes
Second
article, a much shorter procedural solution

In a thread on the subject on the XP Mailing List, various posters expressed
a preference for one solution or the other.

With the caveat that I’m normally a very object-oriented guy, I prefer the
second (procedural) solution, in that I find has approriately simple design:
it work, expresses the programmer’s intentions, has (almost) no duplication
of ideas, and has the minimal amount of code needed to do all those things.

The point was made, though, that perhaps the procedural solution would fall
apart once the complexity of the problem increased. To test this idea, I decided
to extend it with the following features:

Know what frame the bowler is current done with / working one
Know the score of each frame
Know if the game is over
Reject erroneous input data

My first step was to convert the example C# code to Java. (I like C# at least
as much as Java, but for the moment I’m happier with the code-editing features
of Eclipse compared to VS.NET.) Here’s the Java code, the starting point before
adding any features:

/*
 * Trivial port of Ron Jeffries' Bowling to Java
 */
package bowling;

import java.util.ArrayList;

public class BowlingGame {

  ArrayList rolls = new ArrayList();

  public void roll(int roll) {
    rolls.add(new Integer(roll));
  }

  public int score() {
    int rollIndex = 0;
    int total = 0;
    for (int frame = 0; frame < 10; frame++) {
      if (strike(rollIndex)) {
        total += 10 + pinCount(rollIndex + 1) + pinCount(rollIndex + 2);
        rollIndex++;
      } else if (spare(rollIndex)) {
        total += 10 + pinCount(rollIndex + 2);
        rollIndex += 2;
      } else {
        total += pinCount(rollIndex) + pinCount(rollIndex + 1);
        rollIndex += 2;
      }
    }
    return total;
  }

  private boolean strike(int rollIndex) {
    return pinCount(rollIndex) == 10;
  }

  private boolean spare(int rollIndex) {
    return pinCount(rollIndex) + pinCount(rollIndex + 1) == 10;
  }

  private int pinCount(int pinPosition) {
    return ((Integer) rolls.get(pinPosition)).intValue();
  }
}

I also converted the NUnit tests to JUnit. (Formerly this was available for download,
but I’ve misplaced the file since then.)

With this in hand, I started adding tests for my new features. An example test:

  public void testStrikeWithFrameCounting() {
    game.roll(10);
    assertEquals(1, game.finishedFrames());
    assertEquals(0, game.scoredFrames());
    game.roll(5);
    assertEquals(1, game.finishedFrames());
    assertEquals(0, game.scoredFrames());
    game.roll(3);
    assertEquals(2, game.finishedFrames());
    assertEquals(2, game.scoredFrames());
    game.roll(2);
    assertEquals(2, game.finishedFrames());
    game.roll(1);
    assertEquals(3, game.finishedFrames());
    assertEquals(3, game.scoredFrames());
    rollMany(14, 0);
    assertEquals(10, game.finishedFrames());
    assertEquals(10, game.scoredFrames());
    assertEquals(29, game.score());
  }

This test, and others like it, verify that after each roll, the game object
know what frame has been completed, what frame is scorable, and when the game
is over. (I also added the feature to report the score of each frame, but didn’t
get around to testing it explicitly. I found that to get things right, I needed
a variety of tests for things happening in the last frame, since there are special
rules there.

Without further ado, here is working code to score the game, with the new features:

/*
 * Bowling Scorer, converted to Java, extended to know what
 * frame we are on, what frame has been scored, whether the
 * game is over, and the score of each frame
 */
package bowling;

public class BowlingGame {

  private static final int NUM_FRAMES = 10;
  private static final int NUM_PINS = 10;
  private static final int MAX_ROLLS_IN_GAME = NUM_FRAMES*2 + 1;

  // Input state:
  private int[] rolls = new int[MAX_ROLLS_IN_GAME];
  private int rollSoFar = 0;

  // Output state:
  private int[] frameScores = new int[10];
  private int scoredFrame;
  private int finishedFrame;

  // Processing variables; these would be locals, but
  // this class essentially is a "method object", so we
  // use instance variables instead of param passing
  private int scoringFrame;
  private int scoringRoll;

  public void roll(int roll) {
    if(roll<0 || roll>NUM_PINS)
      throw new RuntimeException("Roll out of range");

    if(gameOver())
      throw new RuntimeException("The game is over, no more rolls allowed.");

    rolls[rollSoFar++] = roll;
    calculate();
  }

  private void calculate() {
    scoredFrame = 0;
    finishedFrame = 0;
    scoringRoll = 0;
    for (scoringFrame = 1; scoringFrame <= NUM_FRAMES; scoringFrame++) {
      if (isStrike()) {
        scoreStrike();
      } else if (isSpare()) {
        scoreSpare();
      } else {
        scoreNormal();
      }
    }
  }

  private boolean isStrike() {
    return rolls[scoringRoll] == NUM_PINS;
  }

  private void scoreStrike() {
    storeFrameScore(NUM_PINS + rolls[scoringRoll + 1] + rolls[scoringRoll + 2]);
    frameIsScoredIfWeHaveRollOffset(2);

    if(scoringTheLastFrame())
      frameIsDoneIfWeHaveRollOffset(2);
    else
      frameIsDoneIfWeHaveRollOffset(0);
  }

  private boolean isSpare() {
    return rolls[scoringRoll] + rolls[scoringRoll + 1] == NUM_PINS;
  }

  private void scoreSpare() {
    storeFrameScore(NUM_PINS + rolls[scoringRoll + 2]);
    frameIsScoredIfWeHaveRollOffset(2);

    if(scoringTheLastFrame())
      frameIsDoneIfWeHaveRollOffset(2);
    else
      frameIsDoneIfWeHaveRollOffset(1);
  }

  private void scoreNormal() {
    storeFrameScore(rolls[scoringRoll] + rolls[scoringRoll + 1]);
    frameIsScoredIfWeHaveRollOffset(1);
    frameIsDoneIfWeHaveRollOffset(1);
  }

  private boolean scoringTheLastFrame() {
    return scoringFrame == NUM_FRAMES;
  }

  private void storeFrameScore(int frameScore) {
    frameScores[scoringFrame - 1] = frameScore;
  }

  private void frameIsDoneIfWeHaveRollOffset(int rollOffset) {
    if(scoringRoll + rollOffset < rollSoFar) {
      finishedFrame = scoringFrame;
    }
    // Continue scoring at the roll after the last one
    // on this frame:
    scoringRoll += rollOffset + 1;
  }

  private void frameIsScoredIfWeHaveRollOffset(int rollOffset) {
    if(scoringRoll + rollOffset < rollSoFar) {
      scoredFrame = scoringFrame;
    }
  }

  // The public interface has a few more methods for the new features:

  public int score() {
    int totalScore = 0;
    for(int i=0; i<frameScores.length; i++)
    totalScore += frameScores[i];
    return totalScore;
  }

  public int scoredFrames() {
    return scoredFrame;
  }

  public int finishedFrames() {
    return finishedFrame;
  }

  public int scoreForFrame(int frame) {
    return frameScores[frame-1];
  }

  public boolean gameOver() {
    return finishedFrame == NUM_FRAMES;
  }
}

Here is what I like about this solution:

There are tests to show that it actually works
The method names are, at least to me, intention-revealing
The methods are short and straightforward

The main scoring loop, which still happens in one straight-line pass, is
simply this:

    for (scoringFrame = 1; scoringFrame <= NUM_FRAMES; scoringFrame++) {
      if (isStrike()) {
        scoreStrike();
      } else if (isSpare()) {
        scoreSpare();
      } else {
        scoreNormal();
      }
    }

There’s not much duplication left in there; perhaps someone can suggest
a way to get rid of the bits that remain

The real question, of course, is whether this is too much for one class to
do – whether there are any parts of this code that should be a separate class.
The obvious candidates are the pairs of methods: should isStrike() and scoreStrike()
somehow be in a Strike class? They have an obvious parallel structure, a form
of duplication that I might be able to get rid of by adding classes.

I thought these new features would push me there – but they didn’t. Perhaps
a future story would do so. This implementation is still not quite “finished”,
in that I know there are more kinds of error conditions to consider, test for,
and implement. I don’t see any reason to think that adding those now would add
much to the conversation, so I didn’t add them.

One lesson of this exercrise, to me, is a common one: the problem space is
not the solution space. Just because we have a thing called a Foo in the problem
domain, doesn’t mean we need a class Foo in the solution. We might, we might
not; with test-driven design, the need for a Foo should eventually become obvious.

Comments welcome, via email (address below) or on the XP mailing list.

Word Chains Kata Solution

I worked out a solution for “Pragmatic” Dave Thomas’s Code Kata 19 (Word Chains) in Java; it’s somewhat different than Dave’s published solution.

I’ve been watching PragDave’s Code Katas with interest, since I am also in the habit of occassionally solve a small, standalone programming problem just for the learning experience. Today I worked out a solution for Kata 19, Word Chains. I did this Kata the “pure” way, i.e. without looking at any of the discussion of solution approaches before coding. Here are the results (also on github). I wrote the code this time in Java.

Unlike Dave’s published solution, I didn’t precalculate adjacency lists; rather I took the more brute-force approach of scanning the word list at each step looking for neighbor words. In spite of that non-optimization, this still calculates the word chains for a bunch of examples in a few seconds (total).

I was curious how much of the program’s time is spent checking for neighbor words; this provided a good reason to try out a profiler plug for Eclipse, available on SourceForge. The profiler worked fine, and told me, as I would have guessed, that the program spends approximately 100% of it’s time looking at words to decide if they are neighbors (in the word-chains sense). Of course the usual optimization advice applies: it would have been actively stupid of me to optimize without measuring where the slow parts are.