Subversion tips: working with branches
February 12, 2006
update: Ian Bicking has a good followup on branching practices
Subversion is great software, essentially a major upgrade of CVS. Its branch support is stellar, for a few reasons:
- Visibility: Branches show up as physical copies in the repository, so you can see all of them, stored by convention in the /branches folder. This is unlike CVS (or VSS), where branches are placed in the time dimension and are invisible, hidden "behind" the CVS HEAD revision.
- Efficiency: Branches are stored as deltas rather than full physical copies, so they are efficient and cheap to create.
- Global revisioning: The entire repository gets versioned on every change. As a result, a merge can be treated as the merging of two source trees; this is much easier to think about and execute than merging between two sets of files, as is the case with CVS.
Update before email. Updating against the repository should be the first thing you do in the morning, even before reading your email. This tip isn't specific to branching, but it's so central to good working practice with any form of source control that I'll mention it here. Some of the biggest development issues with source control can be traced directly to not updating frequently. Do it until it becomes muscle memory. Email is a terrible way to start the day anyway.
Put the branch revision number in the comment. When you create the branch from the trunk, make a note of the revision number the branch was created from. For example:
    svn copy http://svn.example.org/foo/trunk \
        http://svn.example.org/branches/foo/mybranch \
        -m "Created foo/mybranch branch from rev  of foo/trunk"
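One way to avoid looking the number up by hand is to ask the repository for it and feed it into both the copy and the comment. What follows is a minimal sketch rather than a definitive recipe: the helper names head_rev, branch_message and make_branch are made up for illustration, and it assumes `svn info` prints a `Revision:` line for the URL it is given.

```shell
#!/bin/sh
# Hypothetical helpers; the names are assumptions, not part of Subversion.

# Print the revision number found in `svn info` output read from stdin.
head_rev() {
    awk '/^Revision:/ { print $2 }'
}

# Build the log message that records where the branch came from.
# $1 = branch name, $2 = revision, $3 = source name
branch_message() {
    printf 'Created %s branch from rev %s of %s' "$1" "$2" "$3"
}

# Copy the trunk at a pinned revision so the comment matches what was copied.
# $1 = trunk URL, $2 = branch URL, $3 = branch name, $4 = source name
make_branch() {
    rev=$(svn info "$1" | head_rev)
    [ -n "$rev" ] || return 1
    svn copy -r "$rev" "$1" "$2" -m "$(branch_message "$3" "$rev" "$4")"
}
```

Calling make_branch with the trunk URL, the branch URL, foo/mybranch and foo/trunk would then create the branch and record the revision in one step.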
Subversion does not constrain the scope of a merge to a branch, so you have to tell it to only merge changes on the branch that have happened since the branch was created. Otherwise you'll get everything that happened before the branch brought across, which screws up the changeset. Treating branches specially is something that might get added to Subversion in the future, but for now you'll have to do it yourself.
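If the revision never made it into the comment, `svn log --stop-on-copy` can usually recover it: the log stops at the copy that created the branch, so the oldest entry is the revision in which the branch was created. Here is a rough sketch, assuming the usual `rN | author | date` header lines in the log output; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the names are assumptions, not part of Subversion.

# Print the oldest revision number found in `svn log` output read from stdin.
# Log entries arrive newest first, so the last header line seen wins.
oldest_rev() {
    awk '/^r[0-9]+ [|]/ { rev = substr($1, 2) } END { print rev }'
}

# Print the revision in which a branch was created.
# $1 = branch URL or working copy path
branch_point() {
    svn log --stop-on-copy -q "$1" | oldest_rev
}
```

Run from the branch working copy, `branch_point .` should print the number to use as the start of the first merge range.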
Backport to the branch. Come the glorious day when you merge your changes back into the trunk, things will go much easier for you if you have tried to keep up with the changes on the trunk. The easiest way to do that is to keep merging changes from the trunk onto your branch as frequently as possible - aka "backporting". Here's an example of merging trunk changes onto a branch that was created in revision r20 above, while the repository has moved on to revision 25 due to changes on the trunk:
    cd /branches/foo/mybranch
    svn merge -r20:25 http://svn.example.org/foo/trunk .
    svn ci . -m "foo/mybranch: merged to "

The smaller the changeset, the easier it is to manage issues, so prefer lots of little updates to one big-bang integration that could take days, or might not be possible to complete at all.
Put the revision number in the backport. This is for the same reasons as putting the revision number in the initial branch comment. You only want the changes on the trunk made since the last backport merge. Suppose the repository has moved on to revision 30. Because we made a comment on the last backport telling us what revision we merged to (25), we know we only need to merge from r25 onwards:
    cd /branches/project/mybranch
    svn merge -r25:30 http://svn.example.org/foo/trunk .
    svn ci . -m "mybranch: merged to "
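The bookkeeping can also be read back out of the log instead of from memory. The sketch below assumes every backport comment contains "merged to rN" (that exact format is an assumption, as are the function names) and that `svn log` lists the newest entries first.

```shell
#!/bin/sh
# Hypothetical helpers; the "merged to rN" comment format is an assumption.

# Print the revision from the first "merged to rN" line read from stdin.
last_merged_rev() {
    sed -n 's/.*merged to r\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Print the next backport command without running it.
# $1 = last merged revision, $2 = revision to merge up to, $3 = trunk URL
next_backport() {
    printf 'svn merge -r%s:%s %s .\n' "$1" "$2" "$3"
}
```

From the branch working copy, `svn log . | last_merged_rev` would give the start of the next range, and next_backport prints the command so it can be checked before it is run.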
Take a merge for a dry run. The merge command has a flag called "--dry-run". This allows you to see what the result of the merge will be without actually applying it to the target. It's useful if you have any doubts that the merge will succeed or about what it's going to apply to. On this front, if the merge goes to hell you can always run the revert command to clean up your working copy.
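A dry run can be wired into a small wrapper that only merges for real when the preview looks clean. This is a sketch under assumptions: it treats any output line starting with "C" as a conflict, and the function names are invented for the example.

```shell
#!/bin/sh
# Hypothetical helpers; treating a leading "C" as a conflict is an assumption
# about the merge output format.

# Count conflict lines in merge output read from stdin.
count_conflicts() {
    grep -c '^C' || true
}

# Preview a merge and apply it only when the preview shows no conflicts.
# $1 = revision range such as 25:30, $2 = source URL; target is the current directory
careful_merge() {
    conflicts=$(svn merge --dry-run -r "$1" "$2" . | count_conflicts)
    if [ "$conflicts" -eq 0 ]; then
        svn merge -r "$1" "$2" .
    else
        echo "dry run reported $conflicts conflict(s); nothing merged" >&2
        return 1
    fi
}
```

If a real merge still goes wrong, `svn revert -R .` in the working copy throws the uncommitted changes away.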
Don't forget to commit a merge. Merging only applies changes to the working copy. You have to check those changes into the repository with the "commit" command.
Prefix branch comments with the branch name. This makes scanning the log history easy. Those that come after will thank you. Here's an example of the right thing from the Django magic-removal branch:
Merge from the target context. One thing that can be confusing with merging is making sure you don't get your merge sources (what you're merging) and targets (where you're merging to) mixed up. It's much easier to get this right if you get into the habit of running a merge from the target. That way you can think of it as taking in merges from somewhere else.
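A quick sanity check before merging is to look at which URL the current working copy points at. The following sketch assumes `svn info` prints a `URL:` line for the working copy; wc_url and check_target are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helpers; the names are assumptions, not part of Subversion.

# Print the repository URL found in `svn info` output read from stdin.
wc_url() {
    awk '/^URL:/ { print $2 }'
}

# Refuse to merge a source into itself and say where the merge will land.
# $1 = merge source URL; the target is the current directory
check_target() {
    target=$(svn info . | wc_url)
    if [ "$target" = "$1" ]; then
        echo "source and target are both $1" >&2
        return 1
    fi
    echo "merging from $1 into $target"
}
```

Running check_target with the trunk URL from inside the branch working copy makes the direction of the merge explicit before anything changes.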
Never check into a tag. The convention for tags is to place them in a /tags folder in the repository. Tags are meant to be read-only snapshots of your code. It's tempting sometimes to check little fixes into tags. Try not to do this - someday you will forget to put that change into the trunk as well, and the next tag will be hosed in a way that is difficult to track down. And those little changes will get bigger and bigger over time. In Subversion, creating new tags is a cheap operation (in both time and space). Instead, check the change into the trunk or branch, retag and release. Aside from code management, another problem is confidence - seeing commits into the /tags folder lowers confidence in the integrity of the codebase. Nobody wants to think about tags that are actually branches.
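The convention can be backed up with a pre-commit hook on the server. The sketch below is one possible approach, not a standard recipe: it assumes tags live directly under a top-level tags/ directory (tags/NAME/...), that creating a tag shows up in `svnlook changed` as a single added directory, and the function names are made up.

```shell
#!/bin/sh
# Hypothetical hook helpers; the tags/NAME layout is an assumption.

# Succeed if `svnlook changed` output on stdin touches a path inside a tag.
# Adding or deleting the tag directory itself (tags/NAME/) does not match.
modifies_tag() {
    grep -Eq '^[A-Z_]+ +tags/[^/]+/.+'
}

# Body of a pre-commit hook: fail the commit if it changes a tag's contents.
# $1 = repository path, $2 = transaction name (both passed in by Subversion)
reject_tag_commits() {
    if svnlook changed -t "$2" "$1" | modifies_tag; then
        echo "tags are read-only: commit to the trunk or a branch and retag" >&2
        return 1
    fi
}
```

A pre-commit script would call `reject_tag_commits "$1" "$2" || exit 1` so that a failed check aborts the commit.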
Minimize the number of active branches. Branches can be useful, but too many of them is indicative of problems, typically poor communication amongst developers or an inability to avoid breaking each other's code. Branches should be created only when necessary; they're not a good default approach. If you really want to work by having individuals merge changesets, you've probably been following kernel-dev too closely, but you should look at tools that support this model, such as SVK (based on Subversion), Darcs or Bazaar-NG. Subversion is a centralised revision control system, and there's not much point fighting it. Ian Bicking's "Distributed vs. Centralized Version Control" is a good overview of the two approaches.
Prune back dead branches. Branches that are no longer active or required should be deleted aggressively. Developer and experimental branches typically fall into this category, but it's more or less true of any branch that has been merged back to the trunk. Get rid of it and focus on the active lines.
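Deleting a branch is a single commit against the repository URL. Here is a minimal sketch, with hypothetical function names and a comment wording that is only a suggestion, which keeps the branch-name prefix in the log:

```shell
#!/bin/sh
# Hypothetical helpers; the comment wording is only a suggestion.

# Build a log message for removing a branch, prefixed with the branch name.
# $1 = branch name
prune_message() {
    printf '%s: deleted, already merged back to the trunk' "$1"
}

# Delete a branch directly in the repository.
# $1 = branch URL, $2 = branch name
prune_branch() {
    svn delete "$1" -m "$(prune_message "$2")"
}
```

The deleted branch stays in the repository history, so it can still be inspected or copied back from an earlier revision if it turns out to be needed.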
Never branch a branch. Branching a branch is sometimes called "staircasing", since a drawing of branching branches looks like a staircase. In general, staircases happen because active development drifts away from the trunk and onto a branch; that in turn usually happens because merging back onto the trunk was too hard to do, and that in turn happens because backporting wasn't done. Crazy as it sounds, branching off branches can happen in CVS almost by accident. This is because CVS records branches in the time dimension, so you can't see them as you could when branches are physical copies. In Subversion, as branches are copies, this problem should be alleviated, but it's still something to be watchful of. Regression merging from branch to branch is a nightmare to manage and is understood to be a revision control worst practice - any configuration manager worth their salt will go a long way to make sure it doesn't happen on their watch. This is what a staircased repository eventually looks like:
So you see, I wasn't kidding about updating first thing in the morning.
February 12, 2006 11:09 PM