Getting Started with Git and GitHub on Windows

(Update: I have a new, related post about the Best Git GUIs for Windows.)

I’ve been drawn to, and have been trying out, various distributed source control tools for the last two years, and have come to the conclusion that the most likely “winner” is Git. Git does a great many things right, good progress is being made in the few areas where it is weak, and its popularity is growing rapidly. There are many web sites with extensive information about using Git, learning Git, Git integration, and more.

For new Oasis Digital projects, we will generally use Git rather than SVN for source control. Here are instructions I wrote to help our teams get started. The contents here are 95% generic, but the references to me are, of course, Oasis-Digital-specific.

(For a general introduction to Git, consider this video at GitHub.)

GitHub

Although Git is a fully distributed source control system, it is very convenient to have a set of robust, central repositories. Oasis Digital’s repositories are hosted by GitHub:

http://github.com/

GitHub offers a useful set of online features to supplement what Git has built in and available locally. As of the spring of 2008, GitHub is certainly a work in progress; I’d characterize it as a “beta” level service. Nonetheless it is worthwhile and recommended. There is also a lot to learn from the “guides” published here:

http://github.com/guides

Install and Configure msysGit on Windows

I assume here that you are using Windows, although Git works very well (better, actually) on Linux or Mac. As I write this, the best Windows Git package is msysgit, available here:

http://code.google.com/p/msysgit/

Make sure to follow the download instructions labeled “If you only want to use Git”. As I write this the download is Git-1.5.5-preview20080413.exe, but get the current version available as you read this instead, not that specific version.

Install by running the EXE installer. Accept the default install directory. When you get to the PATH setting screen, I recommend the “Use Git Bash only” setting, because it avoids any risk of PATH conflicts.

By default, Git will be configured to translate text files between Windows CRLFs (in your working copy) and Unix LFs (in repositories). This setting is fine if you like to use an editor on Windows that insists on Windows CRLFs. I generally use an editor that is equally happy to use Unix LFs, so I sometimes use Git in the other (non-translating) mode.
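
To make “translating” concrete, here is a small Python sketch of what the conversion amounts to for the bytes of a text file. It only illustrates the idea and is not how Git implements it; the function names to_lf and to_crlf are made up for this example.

```python
def to_lf(data: bytes) -> bytes:
    """Repository side: turn Windows CRLF line endings into Unix LF."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Working-copy side: turn LF line endings into Windows CRLF.

    Normalizing to LF first avoids turning an existing CRLF into CR CR LF.
    """
    return to_lf(data).replace(b"\n", b"\r\n")


if __name__ == "__main__":
    working_copy = b"first line\r\nsecond line\r\n"
    stored = to_lf(working_copy)
    print(stored)
    print(to_crlf(stored) == working_copy)
```

In the non-translating mode, neither step happens: the bytes in your working copy are the bytes in the repository.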

msysgit includes both the git command line, and a usable GUI. The GUI is not on par with more mature products, but it is helpful and good enough for users who are allergic to the command line.

Create your SSH Key

The first step in using Git is to create your SSH key. This will be used to secure communications between your machine and other machines, and to identify your source code changes. (If you already have an SSH key for other reasons, you can use it here; there is nothing Git-specific about this.)

In Windows Explorer, pick any convenient directory, right-click, and choose “Git Bash Here”.

Then type this command:

ssh-keygen -C "username@email.com" -t rsa

(with your own email address, of course)

Accept the default key file location. When prompted for a passphrase, make one up and enter it. If you feel confident that your own machine is secure, you can use a blank passphrase, for more convenience and less security. Note where it told you it stored the file. On the machine I tested with, it was stored in “C:\Documents and Settings\Kyle\.ssh\”

Open the file id_rsa.pub with a text editor. The text in there is your “public SSH key”. You will need it to set up your GitHub account, in the next section.

Beware of $HOME trouble: a reader reported a tricky failure mode in which some other software he had installed set a HOME or HOME_PATH environment variable pointing into that application’s directory instead of his real home (“Documents and Settings”) directory. If your key does not turn up where you expect, check for such a variable.

More details on the key process are available here:

http://github.com/guides/providing-your-ssh-key

Set up your GitHub account

Go to http://github.com/ and sign up for a free account. In the sample here I signed up with a made-up alter-ego, harry@kylecordes.com. Use your own, real email address of course.

Don’t worry that GitHub describes the free account as for “open source” work; I will later add you as a collaborator to my paid (and therefore private, non-open-source) projects. Make sure to copy-paste in your SSH public key that you created earlier.

Once you have set up your github account, email your github username (not your password) to me, so I can add you to the relevant project(s). (Reminder – send to me only if you are working on an Oasis Digital project! If you are using this page as generic Git instructions, send the information to your project leader instead!)

Once I have added you as a collaborator to the relevant project and sent back its URL, you can navigate to the URL, which will look like this:

https://github.com/kylecordes/PROJECTNAME/tree/master

On that page, click the “fork” button to create your own workspace for the project. This will take you to your own page for the project, something like this:

https://github.com/YOURNAME/PROJECTNAME/tree/master

Now you can clone this project to your own machine, as discussed below.

Getting Started Locally

First, use the “Git Bash Here” feature described above, to get a command prompt. Tell Git about your name and email address:


git config --global user.email Your.Email@domain.com
git config --global user.name "Your Real Name"

Then you are ready to proceed with getting into a project. Copy the “Clone URL” from a github project page. Make a new directory on your machine, to become your working directory. There are two approaches to which project to clone.

  1. Clone from your own fork repo. This will make it trivial to push your changes up, but require one more command to get upstream changes.
  2. Clone from the upstream (my) repo. This will make it trivial to get changes, but require one more command to be able to push changes, because you can’t push to another GitHub user’s repo.

I assume later in these instructions that you chose #1.

Next, get your local clone by clicking, or by typing.

Approach 1: GUI

In Windows Explorer, right-click the working directory and choose “Git GUI Here”.

Choose “Clone Existing Repository” in the dialog that comes up:

Paste the URL that you copied above from GitHub. Note that in the screenshot I show it as if you clone your own repo, while I think it’s slightly easier overall to clone the upstream repo. Thus it really should show git@github.com:kylecordes/sample1.git instead.
Note that your browser might add an erroneous mailto: to the URL, which you must remove, because Git URLs do not start with “mailto”. Enter the directory where you want your working copy:

Make sure to use real, reasonable values. For example, you are not working on “sample1” and you probably don’t keep your working projects in a directory named “GitStuff”, so put in a directory that makes sense for the project you are working on. Also, put your working copy in a place where you can effectively work in it; for example the working copy for a web project should usually be under the webroot of your local development web server.

Click Clone. You will be prompted for your passphrase if you used one. In a few minutes the Clone will finish, and you have the project available on your machine. I’ve had sporadic problems with this process hanging (growing pains at GitHub are the likely cause), so if you don’t see progress for a few minutes, stop it and start over.

Approach 2: Command Line

I find this easier. In Windows Explorer, right-click on the working directory you want and choose “Git Bash Here”. Then enter a command like this:

git clone repoURL

The first time you do this with GitHub (or any other new server), you might be prompted to confirm the server’s SSH host key. Answer “yes”.

It’s worth pointing out here, if you didn’t already understand from the various Git web sites, that Git is a distributed source control system. It will pull down the whole project history, so you can browse history and even commit changes without online access. Thus Git works very well if you have an intermittent or poor network connection.

Work Flow

As with all source control, work in the directory where you use source control. Do not copy files back and forth between here and some other working directory; that is a path to endless merge and update problems.

Once you have checked out the software, here is a summary of your work flow. For more details, please read the copious Git documentation online. I suggest reading both the official Git material and other sites and articles about Git.

Getting Changes

Get changes from others with “git pull” (or using the GUI). By default this will pull from the repo from which you cloned, so if you cloned the upstream repo, that will get other people’s changes.

If you cloned from your own Github repo, you’ll need to use something like this:


git remote add upstream git@github.com:kylecordes/sample1.git
git pull upstream master

Sending Changes

Commit your changes locally with “git commit” (or using the GUI). Remember that Git generally wants you to explicitly say which files’ changes to include (“git add”), so make sure you read and understand enough about Git to do this properly; it is only a few commands or clicks in the GUI. The usual caveat applies, to only commit actual source files, not generated files or temp files.

Push your changes up to your GitHub repository with “git push”. This step will make it so others on your project can see your changes. Do this at least once per day, and ideally more often as you collaborate. Assuming that you cloned from the upstream repo, you’ll need to set up a reference to your own Github repo (the one you can push to), with something like this:


git remote add harry git@github.com:harrycordes/odtimetracker.git

As usual, use reasonable names and relevant URLs, not my sample names and URLs. Once you’ve added the remote reference, pushing is easy:


git push harry master

When you have a set of changes (one or more commits) that you think are ready to go in to the main-line of the project, use Github to issue a “pull request”. Your project lead (me, typically, at Oasis Digital) will review your commits and either pull them in to the main-line, or send feedback about changes needed before they can be pulled in.

A key thing to understand about Git is that it makes branching extremely easy and fast, so it is very convenient to use branches. If you are accustomed to other source control systems where branching is a big, painful thing, it will be very different for you in Git. Once you learn to use branches, you’ll sometimes push up a branch you are working on instead of master.

I’ve only scratched the surface in this introduction. You now have Git up and running with project code in it, so pick up a Git tutorial or reference (such as the screencast videos at GitCasts) and start learning.

To Learn More

I heartily recommend the “Illustrated Guide to Git on Windows”. It doesn’t yet cover GitHub, but does cover many more details of using Git itself.

Also, a bit of Git can be very useful even in a project that uses SVN, especially when you need to rearrange a bunch of files in SVN.

Update: In a newer post, I list several other Git GUIs for Windows.

RocketModem Driver Source Package for Debian / Ubuntu

A couple of months ago I posted about using the current model Comtrol RocketModem IV with Debian / Ubuntu Linux. Ubuntu/Debian includes an older “rocket” module driver in-the-box, which works well for older RocketModem IV cards. But for the newest cards, it does not work at all: the current RocketModem IV is not recognized by the in-the-box rocket module, and requires an updated driver from Comtrol.

With some work (mostly outsourced to a Linux guru) I now present a source driver package for the 3.08 “beta” driver version (from the Comtrol FTP site):

comtrol-source_3.08_all.deb

Comtrol ships the driver source code under a GPL license, so unless I badly misunderstand, it’s totally OK for me to redistribute it here.

To install this, follow the usual Debian driver-build-install process. The most obvious place to do so is on the hardware where you want to install it, but you can also use another machine with the same distro version and Linux kernel version as your production hardware. Some of these commands must be run as root.

dpkg -i comtrol-source_3.08_all.deb
module-assistant build comtrol

This builds a .deb specific to the running kernel. When I ran it, the binary .deb landed here:

/usr/src/comtrol-module-2.6.22-14-server_3.08+2.6.22-14.52_amd64.deb

Copy to production hardware (if you are not already on it), then install:

dpkg -i /usr/src/comtrol-module-2.6.22-14-server_3.08+2.6.22-14.52_amd64.deb

and verify the module loads:

modprobe rocket

and finds the hardware:

ls /dev/ttyR*

To verify those devices really work (that they talk to the modems on your RocketModem card), Minicom is a convenient tool:

apt-get install minicom
minicom -s

Kernel Versions

Linux kernel module binaries are specific to the kernel version they are built for; this is an annoyance, but is not considered a bug (by Linus). Because of this, when you upgrade your kernel, you need to:

  • Uninstall the binary kernel module .deb you created above
  • Put in the new kernel
  • Build and install a new binary module package as above

Rebuilding the source .deb

Lastly, if you care to recreate the source .deb, you can do so by downloading the “source” for the source deb: comtrol-source_3.08.tar.gz then following a process roughly like so:

apt-get install module-assistant debhelper fakeroot
m-a prepare
tar xvf comtrol-source_3.08.tar.gz
cd comtrol-source
dpkg-buildpackage -rfakeroot

The comtrol subdirectory therein contains simply the content of Comtrol’s source driver download, and this is likely to work trivially with newer driver versions (3.09, etc.) when they appear.

Git on Windows, it actually works now

I’ve been trying out various distributed source control tools, and used several of them for various very small projects. I’ve mostly settled on git as the one I prefer, but I haven’t yet published any code with it. Also, I’ve been frustrated that git support for Windows has been very weak.

Msysgit appears to have solved the git-windows problem, at least well enough for small scale work. If you’ve been holding back on trying git because you use Windows, now is the time to jump in.

Update: I’ve posted details on how to get started with msysgit and GitHub as well as a comparison of Git software for Windows.

So, you want to use your new RocketModem IV on Linux

On one of our projects, we’ve been using the Comtrol RocketModem IV for several years, for both modem communications and FAXing (with Hylafax). All of our RMIVs have been completely reliable and very easy to get working under Linux, particularly Ubuntu/Debian, which includes the rocket driver in-the-box.

Then we got a new card; it looks like the old cards; but according to the rocket driver it does not exist.

We initially suspected a bad card. We were unable to test with the floppy-image diagnostics on the Comtrol web site, because like most of the rest of the world we stopped ordering floppy drives in new machines several years ago. Comtrol support helpfully pointed us at a CD .ISO for the diagnostics, in here:

ftp://ftp.comtrol.com/contribs/rocketport/diagnostics/rocketmodem/

…without mentioning any reason why this information would not be on their support web site in 2008.

The diagnostics showed a 100% functional card. It finally dawned on me that I might need an updated driver for the current card; and this turned out to be true. The current RocketModem IV is not recognized at all by the rocket module in-the-box in Linux.

Getting it going was an adventure; the short version is that to get the current RocketModem IV to work on any vaguely current Linux kernel, you need to poke around the Comtrol web site and find the beta 3.08 driver, in here:

ftp://ftp.comtrol.com/beta/rmodem/drivers/u_pci/linux/

then do the usual apt-get of a build environment, make, sudo make install, /etc/init.d/rocket start.

Production Deployment

Of course the instructions above are for development / test usage; it is a very poor practice to do the above installation on production hardware, because it makes no accommodation for the inevitable stream of upgrades coming through the packaging mechanism, and because in most cases sysadmins prefer to not have a build environment on production hardware at all (ever).

I am no packaging / Debian expert, but I can see two workable paths to good production deployment for kernel modules like this:

1) Use checkinstall or similar mechanism to create a package which installs the binary modules; set things up so that the resulting package depends on the proper kernel package version. As new kernel packages appear, make new package versions to accommodate them, and likewise for each architecture (i386, amd64).

2) Recast the raw source .tar file as a source-deb package suitable for use with the Debian module-assistant mechanism.

I’d love to hear from any Debian/Ubuntu experts on the relative merits of these approaches.

Perhaps in a few years, things will settle back to the previous bliss of a working driver in the Linux “box”.

Update: I’ve posted a .deb for this here.

Is Delphi Dead? No.

A few months ago Alex Miller pointed me to this Delphi doom article (the site appears to be down at the moment), which reminded me to post about the same topic. Here goes.

Delphi shipped in 1995, and its demise has been declared frequently since 1997 or so. In a sense this demise is true, yet also false. Delphi’s current popularity is very different in form (not only in magnitude) from that of Java, C#, etc. Delphi is used substantially by commercial software vendors, and only rarely by enterprises. An ugly reality of the software industry is that the bulk of software developers nationwide work inside large non-software companies, so this usage pattern most likely does not produce the level of unit sales that Codegear (Borland’s dev-tools subsidiary) would like to see. It does, however, produce an enormous number of Delphi application instances running “in the field”, used by real paying end users, who don’t care (or know) what development tools were used to build the software they buy. Many commercial software products, both those in shrinkwrap at retail stores and those for vertical markets, are written in Delphi and will continue to be, because there are very few other good choices for high quality (polished) native Win32 GUI software. In these markets, shipping a Java or .NET app can be a competitive disadvantage (though to a lesser extent over time), and old-style VB is a sad joke.

I don’t think Delphi is eligible for demise until the dominant desktop operating system ships with a dominant runtime platform “in the box”. For example, if all of this happens at the same time:

  • Microsoft ships Windows with the .NET runtime already installed
  • That version of Windows is the commonly deployed version
  • That version of the .NET runtime is the commonly targeted version

At that time, the .NET platform (with the language of your choice) could be a compelling replacement for Delphi in its niche. There is a lot to like about .NET (and Java, and I use them both), but I’m not holding my breath for the above conjunction.

Over at Oasis Digital we have several ongoing Delphi projects in which we develop and extend in-house, enterprise applications. These projects feel notably lonely (very few developers here in the midwest use Delphi), and the Delphi language leaves a lot to be desired (such as garbage collection) – but the resulting software works very well for our customers, especially when we add in a bit of Lua or Prolog (story coming someday…).

Delphi is not dead. It’s not at the top of the popularity charts, and won’t be. It probably shouldn’t be your first choice for a new in-house enterprise application starting today, because of the network effects of Java and .NET popularity. But Delphi is not going away anytime soon, and is a great choice for certain classes of projects.

Optimize Hierarchy Queries with a Transitive Closure Table

Last year I posted about the use of a Joe Celko-style nested set hierarchy representation, for fast hierarchy queries. Here I will describe another approach which is simpler to query, but more wasteful of space. I did not invent this transitive closure approach; I learned of it from several directions.

There are two (main) places to put the code for building a closure table: in your application code, or in your database. The application approach is nice if you are aiming to avoid vendor-specific SQL code, but it is quite simple in SQL, and therefore not a big problem to recode for another RDBMS if the need arises. The SQL approach also avoids round-tripping the relevant data in and out of the database. Therefore, the approach I generally recommend for this is an SQL stored procedure / function.

Here is a simplified PostgreSQL stored procedure to do the job; note that this assumes a “widget” table with a widget_id and parent_id (the “adjacency” representation of a hierarchy), and a widget_closure table with fields (ancestor_id, widget_id, depth):

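CREATE OR REPLACE FUNCTION populate_widget_closure()
RETURNS integer AS '
DECLARE
  distance int;
BEGIN
  -- Rebuild the closure table from scratch.
  delete from widget_closure;
  -- Depth 0: every widget is its own ancestor.
  insert into widget_closure (ancestor_id, widget_id, depth)
    select widget_id, widget_id, 0 from widget;

  -- Each pass extends the paths found by the previous pass by one level.
  FOR distance in 1..20 LOOP
    insert into widget_closure (ancestor_id, widget_id, depth)
    select uc.ancestor_id, u.widget_id, distance
      from widget_closure uc, widget u
      where uc.widget_id = u.parent_id
        and uc.depth = distance - 1;
  END LOOP;
  RETURN 1;
END;
' LANGUAGE plpgsql;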

This sample code assumes a maximum depth of 20, and has no error checking. It will blindly miss greater depths and produce garbage if there is a “loop” in the ancestry. I recommend both arbitrary depth handling and error checking for production use.
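
As a rough illustration of what arbitrary depth handling and loop detection could look like, here is a minimal Python sketch that builds the same (ancestor_id, widget_id, depth) rows in application code. The function name build_closure and the dict-of-parents input are assumptions for this example, not part of the stored procedure above.

```python
from typing import Dict, List, Optional, Tuple


def build_closure(parents: Dict[int, Optional[int]]) -> List[Tuple[int, int, int]]:
    """Return (ancestor_id, widget_id, depth) rows for every widget.

    `parents` maps widget_id -> parent_id, with None for a root widget.
    Raises ValueError if following parent links ever revisits a widget.
    """
    rows = []
    for widget_id in parents:
        seen = set()
        ancestor: Optional[int] = widget_id
        depth = 0
        # Walk up the parent chain, however long it is, until it ends.
        while ancestor is not None:
            if ancestor in seen:
                raise ValueError(f"loop in the ancestry of widget {widget_id}")
            seen.add(ancestor)
            rows.append((ancestor, widget_id, depth))
            # A parent_id with no widget of its own is treated as a root here.
            ancestor = parents.get(ancestor)
            depth += 1
    return rows


if __name__ == "__main__":
    # Hypothetical hierarchy: 1 is the root, 2 and 4 report to 1, 3 reports to 2.
    for row in sorted(build_closure({1: None, 2: 1, 3: 2, 4: 1})):
        print(row)
```

This walks up from each widget rather than down level by level, so it needs no depth limit, and a loop in the data becomes an error instead of garbage rows.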

Once your transitive closure table is populated, querying it is extremely fast and simple. You can get all the descendants of widget 12345 with the query “select widget_id from widget_closure where ancestor_id=12345”. Of course, this hierarchy representation, while simple to generate, is not simple to incrementally update as the hierarchy changes. The most straightforward way to use it is as a cache, regenerated as needed.
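
To try the query without a PostgreSQL server, here is a small sketch using Python’s built-in sqlite3 module as a stand-in. The sample rows and the helper names make_demo_db and descendants are invented for the example. Note that because the closure table holds a depth-0 row for every widget, the query as quoted above also returns widget 12345 itself; filtering on depth leaves it out.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory closure table for a tiny made-up hierarchy.

    Hierarchy: 1 is the root, 2 and 4 are children of 1, 3 is a child of 2.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table widget_closure "
        "(ancestor_id integer, widget_id integer, depth integer)"
    )
    rows = [
        (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 0),  # each widget is its own ancestor
        (1, 2, 1), (1, 4, 1), (2, 3, 1),             # direct parent links
        (1, 3, 2),                                   # grandparent link
    ]
    conn.executemany("insert into widget_closure values (?, ?, ?)", rows)
    return conn


def descendants(conn: sqlite3.Connection, ancestor_id: int, include_self: bool = False):
    """Return descendant widget ids, nearest first."""
    min_depth = 0 if include_self else 1
    cur = conn.execute(
        "select widget_id from widget_closure "
        "where ancestor_id = ? and depth >= ? "
        "order by depth, widget_id",
        (ancestor_id, min_depth),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    db = make_demo_db()
    print(descendants(db, 1))
    print(descendants(db, 1, include_self=True))
```

An index on ancestor_id is what keeps this lookup fast as the table grows.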