Ancient History: JBuilder Open Tools

Some years ago, the Java IDE marketplace looked quite different than it does today. VisualAge was very popular. Borland’s JBuilder was another top contender. Since then, many of the good ideas from VisualAge ended up in Eclipse, while the JBuilder of that era was replaced by a newer, Eclipse-based JBuilder. Not everything ended up on Eclipse, though: NetBeans matured into a slick IDE (with its own plugin ecosystem), as did IDEA.

But this post isn’t about today, it’s about a leftover bit of history. Back in that era, I had a section of this web site dedicated to the numerous JBuilder “Open Tools” (plugins) then available. That content is long obsolete and I removed it years ago. Remarkably, this site still gets hits every day from people (or perhaps bots) looking for it.

I agree strongly that Cool URIs don’t change, but that’s OK, because my old JBuilder Open Tools content just wasn’t very cool anyway.

On the off chance you landed on this page looking for it, here is a Google link for your convenience, or you can take a look at web.archive.org’s snapshot of my old list.


Comparing OPML Files, or How to Leave NetNewsWire

Recently I reached a level of excessive frustration with NetNewsWire (Mac) and decided it was time to move on. Problems with NetNewsWire include:

  1. NetNewsWire has no way to sync its subscription list to match your Google Reader subscription list. There is a Merge button in the Preferences that sounds like it should do this, but it does not work correctly. Once your lists get out of sync, they generally stay that way.
  2. NetNewsWire won’t prefetch images referenced in feeds. Without this, it is not useful for the most obvious purpose of a desktop reader: reading without a network connection. That’s a reasonable thing to leave out in early development, but in a mature product? What could they have been thinking?
  3. NetNewsWire fails (silently) to subscribe to Google Alerts feeds, apparently because Google Reader already knows about those feeds… but see #1.
  4. As many other users have reported, NetNewsWire frequently shows a different number of unread items from Google Reader, and no amount of Refreshing makes it match. The sync doesn’t quite work.

But to get rid of NetNewsWire, I needed to verify that I had all my feeds in Google Reader. This was easy:

  1. Export OPML feed list from NNW
  2. Export OPML feed list from Reader
  3. Use a bit of perl regex and diff (below) to extract and compare just the list of feed URLs
  4. Look over the diff, and copy-paste-subscribe the missing ones in Reader

The commands are:

perl -ne '/xmlUrl="([^"]*)"/ && print "$1\n"' <google-reader-subscriptions.xml  | sort >gr.urls
perl -ne '/xmlUrl="([^"]*)"/ && print "$1\n"' <nn.opml  | sort >nn.urls
diff gr.urls nn.urls
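For step 4, rather than eyeballing the raw diff, comm can list exactly which feeds are on one side only. Here is a self-contained sketch using hypothetical URLs (the real gr.urls and nn.urls come from the perl commands above):

```shell
# Two tiny sample URL lists, sorted as comm requires (hypothetical feeds)
printf 'http://a.example/feed\nhttp://b.example/feed\n' | sort > gr.urls
printf 'http://b.example/feed\nhttp://c.example/feed\n' | sort > nn.urls

# -2 and -3 suppress lines unique to gr.urls and lines in both files,
# leaving only the feeds that are in NetNewsWire but not in Reader
comm -23 nn.urls gr.urls
```

With the sample lists above, this prints only http://c.example/feed, the one feed still to be subscribed in Reader.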

… which took much less time and far fewer keypresses than writing this post.

Offline reading is still very useful; at the moment I’m trying a combination of Google Reader, Gruml, and Reeder (iPad). Those work very well – so well that the risk of time-wasting feeds must be managed aggressively: drop all but the most important, and don’t look every day.

Fix timestamps after a mass file transfer

I recently transferred a few thousand files, totalling gigabytes, from one computer to another over a slowish internet connection. At the end of the transfer, I realized the process I used had lost all the original file timestamps; instead, every file on the destination machine had a create/modify date of when the transfer occurred. In this particular case I had uploaded the files to Amazon S3 from one machine, then downloaded them to the other, but there are numerous other ways to transfer files that lose the timestamps; for example, many FTP clients do so by default.

This file transfer took many hours, so I wasn’t inclined to delete everything and try again with a better (timestamp-preserving) transfer process. Besides, it shouldn’t be very hard to fix the timestamps in place.

Both machines were Windows servers; neither had a broad set of Unix tools installed. If they had, the most obvious solution would be a simple rsync command, which would fix the timestamps without retransferring the data. But without those tools, with an unrelated desire to keep these machines as “clean” as possible, and with a firewall blocking SSH, I looked elsewhere for a fix.

I did, however, happen to have a partial set of Unix tools (in the form of the MSYS tools that come with MSYSGIT) on the source machine. After a few minutes of puzzling, I came up with this approach:

  1. Run a command on the source machine
  2. … which looks up the timestamp of each file
  3. … and stores those in the form of batch file
  4. Then copy this batch file to the destination machine and run it.

Here is the source machine command, executed at the top of the file tree to be fixed:

find . -print0 | xargs -0 stat -t "%d-%m-%Y %T" \
  -f 'nircmd.exe setfilefoldertime "%N" "%Sc" "%Sm"' \
  | tr '/' '\\' >~/fix_dates.bat

I’ve broken it across several lines here, but it’s intended as one long command.

  • “find” gets the names of every file and directory in the file tree
  • xargs feeds these to the stat command
  • stat gets the create and modify dates of each file/directory, and formats the results in a very configurable way
  • tr converts the Unix-style “/” paths to Windows-style “\” paths.
  • The results are redirected to (stored in) a batch file.

As far as I can tell, the traditional set of Windows built-in command-line tools does not include a way to set a file or directory’s timestamps. I haven’t spent much time with PowerShell yet, so I used the (very helpful) NIRCMD command-line utilities, specifically the setfilefoldertime subcommand. The batch file generated by the above process is simply a very long list of lines like this:

nircmd.exe setfilefoldertime "path\filename" "19-01-2000 04:50:26" "19-01-2000 04:50:26"

I copied this batch file to the destination machine and executed it; it corrected the timestamps, and the problem was solved.
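Incidentally, if the destination machine had also had a Unix-ish shell available, the same record-and-replay trick could emit touch commands instead of NIRCMD ones. A hypothetical variant using GNU stat and touch (not what I ran, since my destination had no such tools):

```shell
# %y is the human-readable modify time, %n the file name; each output
# line is itself a touch command that restores that file's modify time
find . -print0 | xargs -0 stat -c 'touch -m -d "%y" "%n"' > ~/fix_dates.sh
# ...then copy fix_dates.sh to the top of the destination tree and run:
# sh fix_dates.sh
```

Note that this restores only modify times; NIRCMD’s setfilefoldertime also sets the create time, which plain touch cannot.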

New site: Learn Clojure

Over the last few days I put together Learn-Clojure.com, a web site to help people get started with Clojure. Please take a look, and send feedback.

I also have several other ideas for informational sites and simple applications, which I’ll launch as time allows. In the past I’ve been inclined to just post new things here on my blog, but I think certain kinds of more “evergreen” information are more useful on standalone sites. Certainly the hosting/domain economics are such that it’s not a big deal to put them there.

Hire a RAIT: Redundant Array of Independent Teams

Life is Risk

Whenever you hire out work, either to a person, to a team, or to a company, there are risks. These risks can easily prevent the work from being completed, and even more easily prevent it from being completed on time. (I’m thinking mostly of software development work as I write this, but most of this applies to other domains as well.)

What could go wrong with the person/team/company you hire?

  • They get distracted by family or personal issues.
  • They turn out to not be as qualified or capable as they appeared.
  • They leave for better work. Sure, you might have a contract requiring them to finish, but your lawsuit won’t get the work done on time.
  • They turn out to not be as interested in your work as they first appeared.
  • They start with an approach which, while initially appearing wise, turns out to be poorly suited.
  • Illness or injury.

Of course you should carefully interview and check reputations to avert some of these risks, but you cannot make them all go away. You don’t always truly know who is good, who will produce. You can only estimate, with varying levels of accuracy. The future is unavoidably unknown and uncertain.

But you still want the work done, sufficiently well and sufficiently soon. Or at least I do.

Redundancy Reduces Risk

A few years ago I stumbled across a way to attack many of these risks with the same, simple approach: hire N people or teams in parallel to separately attack the same work. I sometimes call this a RAIT, a Redundant Array of Independent Teams. Both the team size (one person or many), and the number of teams (N) can vary. Think of the normal practice of hiring a single person or single team as a degenerate case of RAIT with N=1.

To make RAIT practical, you need a hiring and management approach that uses your time (as the hirer) very efficiently. The key to efficiency here is to avoid doing things N times (once per team); rather, do them once, and broadcast to all N teams. For example, minimize cases where you answer developer questions in a one-off way. If you get asked a question by phone, IM, or email, answer it by adding information to a document or wiki; publish the document or wiki to all N teams. If you don’t have a publishing system or wiki technology in hand, in many cases simply using a shared Google Document is sufficient.

There are plenty of variations on the RAIT theme. For example, you might keep the teams completely isolated in terms of their interim work; this would minimize the risk that one team’s bad ideas will contaminate the others. Or you might pass their work back and forth from time to time, since this would reduce duplicated effort (and thus cost) and speed up completion.

Another variation is to start with N teams, then incrementally trim back to a single team. For example, consider a project that will take 10 weeks to complete. You could start with three concurrent efforts. After one week, drop one of the efforts – whichever has made the least progress. After three weeks, again drop whichever team has made the least progress, leaving a single team to work all 10 weeks. The total cost of this approach is 14 team-weeks of work: three teams for week one, two teams for weeks two and three, and one team for the remaining seven weeks (3 + 4 + 7 = 14).

How might you think about that 14 team-weeks of effort/cost?

  1. It is a 40% increase in cost over picking the right team the first time. If you can see the future, you don’t need RAIT.
  2. It is a 30% decrease compared to paying one team for 10 weeks, realizing they won’t produce, then paying another team for 10 more weeks (20 team-weeks in all).
  3. If you hired only one team, which doesn’t deliver on time, you might miss a market opportunity.

Still, isn’t this an obvious waste of money?

To understand the motivation here, you must first understand (and accept) that no matter how amazing your management, purchasing, and contracting skills, there remains a significant random element in the results of any non-trivial project. There is a range of possibilities, a probability function describing the likelihood with which the project will be done as a function of time.

RAIT is not about minimizing best-case cost. It is about maximizing the probability of timely, successful delivery:

  • To reduce the risk of whether your project will be delivered.
  • To reduce the risk of whether your project will be delivered on time.
  • To increase your aggregate experience (as you learn from multiple teams) faster.
  • To enable bolder exploration of alternative approaches to the work.

What projects are best suited for RAIT?

Smaller projects have a lower absolute cost of duplicate efforts, so for these it is easier to consider some cost duplication. RAIT is especially well suited when hiring out work to be done “out there” by people scattered around the internet and around the world, because the risk of some of the teams/people not engaging effectively in the work is typically higher.

Very important projects justify the higher expense of RAIT. You could think of high-profile, big-dollar government technology development programs as an example of RAIT: a government will sometimes pay two firms to develop competing prototype aircraft, then choose only one of them for volume production. For a smaller-scale example, consider producing an iPhone app or Flash game for an upcoming event, where missing the date means getting no value at all for your efforts.

Thanks to David McNeil for reviewing a draft of this.

If you like it, make a link to it – a plea for real links

You see something good on the web; now it’s time to tell other people about it. Maybe you’ll use various common tools:

  • Facebook “like” it
  • Social-network-share it
  • Bit.ly it
  • Tweet it
  • Mention it in a forum post
  • Mention it in a blog comment

I believe it’s smart and convenient to do those things, but not to only do those things. Why? Because they create redirected, tracked, short-lived, rel=nofollowed, or otherwise weak links. Links that don’t properly tell search engines that the content is worthwhile. Quasi-links that attempt to replace real links as the fundamental currency of the web.

If you really like it, if you think it deserves ongoing attention, then in addition to whatever else you do, put a real A-HREF link to it on your web site/blog.