Jul 18 2008

Rearranging files in SVN? Use git-svn instead.

Published at 2:04 pm under Technology

From time to time I need to rearrange a set of files in a project, typically while revamping the file / directory layout of a set of source files. The most direct way to do so is to drag the files around (using a Windows or Linux GUI). That approach breaks down when the files are in a Subversion repo, because SVN does not understand file moves and renames unless you perform them with “svn mv” or (with TortoiseSVN) right-click-drag the files to their new locations, which is quite tedious for more than a handful of files.

The answer to all this is simple: use git-svn for your SVN checkout, instead of the SVN command-line tools or Tortoise. You can readily Google for extensive git-svn information to get started.

The workflow with git-svn for file rearrangement is fairly simple, and very fast:

  • “git svn rebase” to get your local git repo up to date with your SVN repo.
  • make new directories, rearrange files, rename files, etc., using any tool you like. This is easy and painless because git (wisely) keeps all the metadata in a top-level .git directory. It is fast because you aren’t running any svn code (or any git code) while doing so.
  • use “git add” (for example “git add -A”, which also stages the removal of the old paths), or the git GUI, to add the newly moved / renamed files.
  • “git commit”
  • “git svn dcommit” to push the changes over to SVN.
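
The steps above can be sketched as a pair of shell helpers. The function names begin_rearrange and finish_rearrange are hypothetical, the default commit message and the example paths in the comments are assumptions, and “git add -A” is used so that the disappearance of the old paths is staged along with the new files. Treat this as a sketch to adapt, run from inside an existing git-svn checkout, rather than a definitive script:

```shell
#!/bin/sh
# Hypothetical helpers; run them from inside an existing git-svn checkout.

# Step 1: bring the local git repo up to date with the SVN repo.
begin_rearrange() {
    git svn rebase
}

# Steps 3-5: stage new and removed paths, commit locally, push to SVN.
finish_rearrange() {
    message=${1:-"Rearrange files"}   # assumed default commit message
    git add -A &&
    git commit -m "$message" &&
    git svn dcommit
}

# Typical session (step 2 is any file manager or shell commands you like;
# the paths below are made up for illustration):
#   begin_rearrange
#   mkdir -p src/util && mv helpers.c src/util/
#   finish_rearrange "Move helpers into src/util"
```

Splitting the workflow in two leaves the middle step, the actual rearranging, entirely to whatever tool you prefer.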

git / git-svn looks at the content of files, notices (rather than you needing to tell it) that the content has been moved to different filenames in different directories, and generates SVN move/copy operations. It even notices nearly-identical (edited) files and handles them correctly.
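
To see this content-based detection for yourself, here is a small experiment that needs only git, not SVN. It is a sketch under stated assumptions: the scratch repository, file names, and file contents are all invented for the demo, and “git diff -M” is used simply to display the renames git infers between two commits. One file is moved untouched and one is moved with a small edit, and git is never told about either move:

```shell
#!/bin/sh
set -e

# Scratch repository, made up for this demo.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"

# Initial layout: two small text files in old/.
mkdir old
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > old/notes.txt
printf 'item %s\n' 1 2 3 4 5 6 7 8 9 10 > old/list.txt
git add -A
git commit -q -m "Initial layout"

# Rearrange with ordinary tools; no "git mv" anywhere.
mkdir -p new/docs
mv old/notes.txt new/docs/readme.txt                                   # unchanged content
sed 's/^item 5$/item five, edited/' old/list.txt > new/docs/items.txt  # lightly edited
rm old/list.txt

git add -A
git commit -q -m "Rearrange files"

# -M pairs removed and added paths by content similarity.
rename_report=$(git diff -M --name-status HEAD~1 HEAD)
echo "$rename_report"
```

The unchanged file is reported with status R100 (a rename with identical content); the edited file is also reported as a rename, with a lower similarity score.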

One loose end remains: by default, git-svn does not remove now-empty or removed directories from SVN, a consequence of git not caring about directories. Therefore, if you need the empty directories removed from the SVN repo, use a normal SVN client (command line, Tortoise) to do so.
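
As an alternative to cleaning up with an SVN client, the git-svn documentation describes a --rmdir flag for dcommit, with a matching svn.rmdir config key, that removes directories from the SVN tree when no files are left in them. A minimal sketch follows; the scratch repository is only a stand-in so the snippet can be run anywhere, and in practice you would run the “git config” line inside your real git-svn checkout:

```shell
#!/bin/sh
set -e

# Stand-in for a real git-svn checkout (assumption for this demo).
scratch=$(mktemp -d)
cd "$scratch"
git init -q .

# Persistently enable the behaviour of --rmdir for this repository.
git config svn.rmdir true
rmdir_setting=$(git config --get svn.rmdir)
echo "svn.rmdir=$rmdir_setting"

# One-off equivalent, run inside a real git-svn checkout:
#   git svn dcommit --rmdir
```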

From the description above, this does not initially sound like a big win; but with hundreds of files to move around, the time for the git commands (instant, except for the first and last) is irrelevant, and the time/hassle saved is enormous.

Perhaps someday SVN itself will gain the ability to “just work” when you move or rename files / directories, rather than having to be told.

If you found this post useful, please link to it from your web site, mention it online, or mention it to a colleague.

3 Responses to “Rearranging files in SVN? Use git-svn instead.”

  1. Juliano says:

    There is a big problem here. There is absolutely no way to implicitly detect renames with 100% accuracy without reading the user’s mind, or at least watching their actions. The first is quite impossible with today’s technology, and the second requires integration with the user’s environment (IDE, file manager or operating system).

    It is an overstatement to say that git properly detects renames. That is because git was designed with the sole purpose of managing the Linux kernel tree (only text files), is used to manage mostly source-code projects (mostly text files), and is pretty much ignored for everything else. Revision control is much broader than this. What, for example, are two “nearly-identical” PNG images? Say you open an image drafts/imagex.png, apply a brightness filter, save it, and move it to final/imagey.png. Git just breaks history in this situation, because the files are completely different (in binary terms; to a human they are not), it didn’t watch the user changing the image, and it didn’t read the user’s mind.

    “Nearly-identical” is itself very subjective. How much is this? 80%? 90%? How was this value defined? What are the user’s options when he crosses this threshold and git is not going to figure out the change?

    Git actually doesn’t “just work” unless you keep yourself under this nearly-identical threshold, which rarely holds for files other than text. Subversion, Mercurial, Bazaar, etc., on the other hand, properly record history across renames with 100% accuracy, since the user provides this information. There is no threshold, no ambiguity. Until the day that revision control can actually record what the user is really doing, I’ll prefer telling it exactly what is happening to my files, so that this information is not lost forever. I hope that Subversion and Mercurial, my preferred version control systems, never gain this “ability”.

  2. Kyle Cordes says:

    Great comment. I find git’s ability to detect this specific type of rename (text files that haven’t changed by much) *extremely* useful, because I work on a lot of projects that involve a lot of text files: source code. (“Track what the user is actually doing” is a non-goal of git; rather, it is a feature of git that it tracks the content you ended up with, without regard to how you got there.)

    On the other hand, tracking the kind of changes that Juliano describes (changes to image files while simultaneously renaming them) is not a need I have had (yet). git is a relatively poor choice for projects that consist of a lot of binary files; I am confident I would look elsewhere when working on such a project.

    Fortunately, it’s a big world, and there are a lot of great tools to choose from, including many source control tools that don’t implicitly follow renamed content, that do allow/require explicit rename tracking, etc.

  3. Two things to add:
    git-mv : http://www.kernel.org/pub/software/scm/git/docs/git-mv.html
    Move or rename a file, a directory, or a symlink

    git-svn dcommit --rmdir : http://www.kernel.org/pub/software/scm/git/docs/git-svn.html
    “Only used with the dcommit, set-tree and commit-diff commands.

    Remove directories from the SVN tree if there are no files left behind. SVN can version empty directories, and they are not removed by default if there are no files left in them. git cannot version empty directories. Enabling this flag will make the commit to SVN act like git.

    config key: svn.rmdir”

    ;)

    @Kyle: care to qualify that statement that Git isn’t good with Binary data?