
git[-svn] in 30 minutes

… or something like that… I planned to write a short note on getting started with git-svn so I could point some of my colleagues to it. It turned out that git has too many nice features you should be aware of. :) Hopefully my notes (which now serve as a reference for myself as well, thx to gebi for all the help and feedback) are useful anyway. If you think something is missing (something more or less essential, or at least something most of us should be aware of), please feel free to mention it in the comment section of my blog entry, thanks.

Disclaimer: I’m still happy with mercurial for what I – and we at grml in general – use it for: linear, but nevertheless distributed, development. git on the other hand provides some really nice features. Rebasing and branching with git is really great, so non-linear development just works. As usual: use the right tool for the right job.

git is a bit complicated to use, not only but especially in the beginning. On the other hand I’m not such a big friend of subversion. If you want or have to use subversion (Graz University of Technology, for example, provides an svn service to its students and employees) but prefer to work with git instead, you should be aware of git-svn. git-svn gives you bidirectional operation between subversion and git.

First of all make sure you have everything you might need when working with git. Make sure to use a current version of git (I’m referring to version >=1.5.3). Just execute the following command line on your Debian system to install all relevant packages:

aptitude install \
git-buildpackage git-core git-cvs git-daemon-run \
git-doc git-email git-gui git-load-dirs git-svn \
gitk gitweb qgit

Now let’s start with some general and basic configuration:

# Remove directories from the SVN tree if there
# are no files left behind, configure it globally
# (key and value are separate arguments):
git config --global svn.rmdir true

# Want some more global, personal git configuration?
# Each line is key=value; split on the first '=' so that
# values containing spaces (like user.name) stay intact:
while IFS='=' read -r key value ; do
  git config --global "$key" "$value"
done << EOF
user.name=Michael Prokop
user.email=foo@example.invalid
color.diff=auto
color.diff.new=cyan
color.diff.old=magenta
color.diff.frag=yellow
color.diff.meta=green
color.diff.commit=normal
EOF

Check out man git-config for much more details about configuration options.
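
To see what those git config calls actually write, here is a minimal sketch that uses a scratch file instead of the real ~/.gitconfig (the scratch file and the variable name cfg are assumptions for illustration only):

```shell
# Scratch config file; stands in for ~/.gitconfig in this sketch.
cfg=$(mktemp)

# Key and value are separate arguments; quote values with spaces.
git config --file "$cfg" user.name "Michael Prokop"
git config --file "$cfg" color.diff auto
git config --file "$cfg" color.diff.new cyan

# Read a single value back, then show the resulting INI-style file.
git config --file "$cfg" --get user.name
cat "$cfg"
```

The file printed at the end has the same format as ~/.gitconfig, so editing that file by hand is an equally valid way to get the same settings.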

First tip: set ‘g’ as a shell alias for git so you don’t have to type that much. I’ll write the long version in the following examples so copy/paste works for everyone. Short command names help as well: ‘git co’ instead of ‘git checkout’, for example. git doesn’t ship such short names, but you can define your own aliases inside git – either manually in ~/.gitconfig or by running something like:

git config --global alias.st status
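
The short forms ‘co’, ‘ci’ and ‘st’ could be set up the same way. The sketch below defines them in a throwaway repository (created with mktemp, an assumption for the demo) so it doesn’t touch your global configuration; add --global to make them permanent:

```shell
# Throwaway repository so the demo leaves ~/.gitconfig alone.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Define the short command names as aliases.
git config alias.co checkout
git config alias.ci commit
git config alias.st status

# List all aliases known to this repository.
git config --get-regexp '^alias\.'
```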

Enough pre-configuration for now. It’s time to check out the SVN repository:

# Check out the SVN repository and set 'svn/' as
# prefix for the branches:
git svn clone -s --prefix=svn/ \
https://svn.tugraz.at/svn/$project foobar && \
cd foobar

# Adjust svn:ignore settings within git:
git svn show-ignore >> .git/info/exclude

# List all branches:
git branch -a

# List all remote branches:
git branch -r

# Rebase your local changes against the
# latest changes in SVN (kind of 'svn up'):
git svn rebase

# Checkout a specific branch:
git checkout $branch
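
The show-ignore step above appends patterns to .git/info/exclude, which works like a private .gitignore that is never committed. A minimal sketch with plain git (no SVN server needed; the pattern and file names are made up for the demo):

```shell
# Throwaway repository for the demo.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Same file that 'git svn show-ignore >> .git/info/exclude' writes to.
mkdir -p .git/info
echo '*.o' >> .git/info/exclude

# One file matching the pattern, one that does not.
touch main.o main.c

# Only main.c should show up as untracked.
git status --porcelain
```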

Ok so far? But what do we have to do if we want to work on the upstream source and are allowed to commit/push directly to the repository? Let’s see how to work on that without using branches:

# Hack:
$EDITOR foobar

# Check status:
git status

# List diff (the file name is optional):
git diff foobar

# Commit it with a commit message using $EDITOR:
git commit -a

# Now commit your changes (that were committed
# previously using git) to SVN, as well as
# automatically updating your working HEAD:
git svn dcommit

But what should we do if we do not have commit rights? Let’s create our own branch and send a patch via mail to upstream:

# Make a new branch:
git checkout -b mikas_demo_patch

# and hack...
$EDITOR

# Commit all changes:
git commit -a -m 'Best patch but worst commit msg ever'

# ... and prepare patch[es]:
git format-patch -s -p -n master

# Now send the mail(s), either using git send-email:
git send-email --to foo@example.invalid *.patch
# ... or if you prefer mutt instead (short zsh syntax):
for f in *.patch ; mutt -H $f
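
If you want to see what format-patch produces before mailing anything, the following sketch runs the same command in a throwaway repository (repository, file and branch names are assumptions for the demo):

```shell
# Throwaway repository with a 'master' branch and a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email demo@example.invalid

# Base commit on master.
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m 'Initial commit'

# One commit on a separate branch.
git checkout -q -b demo_patch
echo "hello world" > greeting.txt
git commit -q -a -m 'Extend greeting'

# One numbered, signed-off patch file per commit not in master.
git format-patch -s -p -n master
```

Each generated file is a complete mail (headers, commit message, diff), which is why it can be handed to git send-email or mutt as-is.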

You got a mail from someone else and would like to incorporate changes from the attached patch in your repository? Just store the mail in a separate mailbox (use save-message in mutt for example, keybinding ‘s’ by default), then execute:

# Apply a [series of] patch[es] from a mailbox
git am /path/to/mailbox
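
To try git am without involving a mail client, a mailbox can be faked with format-patch --stdout. A sketch under that assumption (all names are made up for the demo):

```shell
# Throwaway repository with a 'master' branch and a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email demo@example.invalid

echo one > notes.txt
git add notes.txt
git commit -q -m 'Initial commit'

# Two commits on a topic branch, exported as one mbox file.
git checkout -q -b topic
echo two >> notes.txt && git commit -q -a -m 'Add second line'
echo three >> notes.txt && git commit -q -a -m 'Add third line'
mbox=$(mktemp)
git format-patch --stdout master > "$mbox"

# Back on master, apply the whole series from the mailbox.
git checkout -q master
git am "$mbox"
git log --oneline
```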

Want to work on a separate branch and rebase your work with upstream?

# First of all make sure to use recent sources...
# So pull when using plain git:
git pull
# .. or when using git-svn use:
git svn rebase

# Then create a new branch:
git checkout -b mika

# Hack:
$EDITOR

# Commit:
git commit -a -m 'Best patch but worst commit msg ever'

# Switch to master branch:
git checkout master

# Pull again when using plain git:
git pull
# .. or when using git-svn use:
git svn rebase

# Finally switch back
git checkout mika

# Now rebase it with plain git using:
git rebase origin/master
# ... or when using git-svn:
git svn rebase

# Now take a look at the last 5 commits:
git log -n5
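
The effect of the rebase is easiest to see in a throwaway repository. In this sketch a commit made directly on the local master stands in for what ‘git pull’ or ‘git svn rebase’ would bring in from upstream (an assumption, as are all names used):

```shell
# Throwaway repository with a 'master' branch and a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email demo@example.invalid

echo base > base.txt
git add base.txt && git commit -q -m 'Initial commit'

# Own work on a separate branch.
git checkout -q -b mika
echo feature > feature.txt
git add feature.txt && git commit -q -m 'Add feature'

# Meanwhile master moves on (simulated upstream change).
git checkout -q master
echo upstream > upstream.txt
git add upstream.txt && git commit -q -m 'Upstream change'

# Replay the branch on top of the new master.
git checkout -q mika
git rebase -q master
git log --oneline -n5
```

After the rebase the history is linear: the feature commit sits on top of the upstream change and there is no merge commit.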

Another branch-session might look like:

git checkout -b foo
$EDITOR
git commit -a -m 'foo changes'
git checkout master
git checkout -b bar
$EDITOR
git commit -a -m 'bar changes'
git checkout foo
git rebase bar
git log -n5
git status
git branch

Pfuhhh? Right. :) Now it’s time to check out another cool feature: git stash, which is just great when pulling into a dirty tree or when suffering from interrupted workflow. Demo:

git stash
git pull # or: git fetch && git rebase origin/master
$EDITOR # fix conflicts
git commit -a -m "Fix in a hurry"
git stash apply
git stash clear # unless you want to keep the stash
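
A stripped-down version of that session, without the pull, shows what stash does to the working tree (throwaway repository; names are assumptions for the demo):

```shell
# Throwaway repository with a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git config user.name "Demo User"
git config user.email demo@example.invalid

echo "first line" > notes.txt
git add notes.txt && git commit -q -m 'Initial commit'

# Uncommitted change: the tree is now dirty.
echo "work in progress" >> notes.txt

# Put the change aside; the tree is clean again.
git stash
git status --porcelain

# Bring the change back and drop the stash.
git stash apply
git stash clear
```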

git reset rocks as well:

# List all recent actions:
git reflog

# Now undo the last action (HEAD@{0} is the current
# state, HEAD@{1} the one before it):
git reset --hard HEAD@{1}
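
As a sketch of how the reflog can rescue you: the example below throws away a commit “by accident” and gets it back (throwaway repository; names are assumptions for the demo):

```shell
# Throwaway repository with a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git config user.name "Demo User"
git config user.email demo@example.invalid

echo one > notes.txt
git add notes.txt && git commit -q -m 'First commit'
echo two >> notes.txt && git commit -q -a -m 'Second commit'
good=$(git rev-parse HEAD)

# The "accident": drop the latest commit.
git reset -q --hard HEAD~1

# The reflog still remembers where HEAD was before.
git reflog -n 3

# HEAD@{1} is the state before the last action.
git reset -q --hard 'HEAD@{1}'
git log --oneline
```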

How to get rid of branches?

# Delete a branch. The branch must be fully merged:
git branch -d remove_me_branch
# Delete a branch irrespective of its merged status:
git branch -D remove_me_branch

# Delete a remote branch:
git push reponame :branch
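
The difference between -d and -D, and the remote deletion, can be tried safely like this. The “remote” is just a local bare repository created for the demo (an assumption, like all names used):

```shell
# Throwaway repository with a 'master' branch and a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email demo@example.invalid
echo base > base.txt
git add base.txt && git commit -q -m 'Initial commit'

# One branch without own commits, one with an unmerged commit.
git branch merged_branch
git checkout -q -b unmerged_branch
echo extra > extra.txt
git add extra.txt && git commit -q -m 'Unmerged work'
git checkout -q master

git branch -d merged_branch
git branch -d unmerged_branch || echo "refused: not fully merged"
git branch -D unmerged_branch

# Local bare repository standing in for a remote.
remote=$(mktemp -d)
git init -q --bare "$remote"
git remote add reponame "$remote"
git push -q reponame master:branch   # create the remote branch
git push -q reponame :branch         # ... and delete it again
git ls-remote --heads reponame
```

Pushing “nothing” (the empty left side of the colon) to the remote branch is what deletes it.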

Repack a git repository to minimize its disk usage:

git pack-refs --prune
git reflog expire --all
git repack -a -d -f -l
git prune
git rerere gc
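
To check whether repacking makes a difference, compare the output of git count-objects before and after. A minimal sketch in a throwaway repository (names are assumptions for the demo; on recent git versions ‘git gc’ bundles most of the steps above):

```shell
# Throwaway repository with a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git config user.name "Demo User"
git config user.email demo@example.invalid

# A few commits, each creating loose objects.
for i in 1 2 3 ; do
  echo "line $i" >> notes.txt
  git add notes.txt
  git commit -q -m "Commit $i"
done

# Loose objects before repacking ('count' line).
git count-objects -v

# Pack everything and remove the now redundant loose objects.
git repack -a -d -q

# Afterwards the objects live in a pack ('in-pack' line).
git count-objects -v
```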

Use git cherry to find commits not merged upstream.
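
A small sketch of what git cherry reports: the branch ‘topic’ has two commits, one of which upstream (here simply the local master, an assumption for the demo) has already picked up:

```shell
# Throwaway repository with a 'master' branch and a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email demo@example.invalid
echo base > base.txt
git add base.txt && git commit -q -m 'Initial commit'

# Two commits on the topic branch.
git checkout -q -b topic
echo one > one.txt && git add one.txt && git commit -q -m 'Topic: first'
echo two > two.txt && git add two.txt && git commit -q -m 'Topic: second'

# Upstream does its own work and takes over the first topic commit.
git checkout -q master
echo up > upstream.txt && git add upstream.txt && git commit -q -m 'Upstream work'
git cherry-pick 'topic~1' > /dev/null

# '-' marks commits with an equivalent change in master,
# '+' marks commits that are still missing there.
git cherry -v master topic
```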

Another really cool feature is interactive rebasing: git rebase --interactive
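
Normally the interactive rebase opens the todo list in your editor. For a non-interactive sketch the list can be edited by a command instead, via the GIT_SEQUENCE_EDITOR environment variable (available in git versions newer than the 1.5.3 mentioned above). The example folds a fix-up commit into the commit before it; all names are assumptions for the demo:

```shell
# Throwaway repository with a demo identity.
demo=$(mktemp -d)
git init -q "$demo" && cd "$demo"
git config user.name "Demo User"
git config user.email demo@example.invalid
echo base > base.txt
git add base.txt && git commit -q -m 'Initial commit'

# A feature commit followed by a small fix for it.
echo "featrue" > feature.txt
git add feature.txt && git commit -q -m 'Add feature'
echo "feature" > feature.txt
git commit -q -a -m 'Fix typo in feature'

# Change the second todo line from 'pick' to 'fixup', which
# melts the fix into 'Add feature' and drops its message.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" git rebase -i HEAD~2

git log --oneline
```

In everyday use you would simply run ‘git rebase --interactive HEAD~2’ and change ‘pick’ to ‘fixup’, ‘squash’, ‘edit’ and so on by hand.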

Make sure you are aware of gitk:

[Screenshot of gitk]

… and don’t forget to set readable fonts for gitk, like:

[ -r ~/.gitk ] || cat > ~/.gitk << EOF
set mainfont {Arial 10}
set textfont {Courier 10}
set uifont {Arial 10 bold}
EOF

If you prefer a Qt based interface check out qgit.

Useful resources:

10 Responses to “git[-svn] in 30 minutes”

  1. chris Says:

    A quick question regarding your “separate branch” example: I don’t understand why you switch to your branch again and rebase? I’d have expected that you would merge the changes in the master branch.

    Btw. if you “git-svn clone” a fairly large project, you can drink a few cups of coffee in the meantime.

  2. ak Says:

    These examples show why not to use git – its usage is non-intuitive and overcomplicated, at least what I can see from your examples. Why do I need to switch back and forth between branches? Why would I need branches at all to simply update my local changes with the latest revision from the remote repository? Why would I want to apply all patches of a mailbox at once? Ever heard about thorough patch reviews? Why are the commands so absolutely non-intuitive? Why does the command to delete a remote branch (“git push reponame :branch”) not contain common words such as “delete”, “rm” or “remove”?

    All in all, you describe use cases that the average open-source project is never ever going to need, nor any other “big” software project. The use cases that you describe only have a real advantage for projects that mainly do integrational work with contributions from a lot of people, and whose men in charge trust each other – incidentally, that exactly describes the work flow of the Linux kernel project, but it definitely doesn’t describe Joe Average’s open source project, nor does it describe the typical workflows of even the bigger open source projects, or software development projects in a commercial environment. Branching is totally overrated, and not only that, it encumbers software quality methods such as continuous integration (unless you immediately merge your branch back to the trunk, HEAD, whatever; or you do CI for every branch, which would be a complete waste of resources).

  3. Ted Percival Says:

    There is also a GTK+-based repository/history viewer called Giggle.

  4. Jakub Narebski Says:

    * Why do I need to switch back and forth between branches?

    For example because when developing a new feature on a separate branch, or on the ‘devel’/’next’ branch, you have noticed a bug (or been sent a bug report), and you need to correct it on the ‘maint’ branch (and then merge or cherry-pick it to ‘master’ and ‘next’).

    * Why would I need branches at all to simply update my local changes with the latest revision from the remote repository?

    You don’t need remote-tracking branches (you can pull directly into your local branch), but they are dead useful. They let you see what was/is the state of the other (fetched from) repository. Besides, in new git they are hidden by default from the user (one who uses: git clone, git remote add, git branch --track).

    * Why would I want to apply all patches of a mailbox at once? Ever heard about thorough patch reviews?

    Because you save _series of patches_ into _separate mailbox_? And why do you think you cannot do review of series of patches before applying? Besides, git-am has an --interactive option…

  5. mika Says:

    @chris: you want to rebase it because that way you don’t flood your history with merges and make it as easy as possible for upstream to merge in your changes. You just develop and make sure to rebase your work on origin/master; if upstream is “happy” with that then he/she can merge it. :)

    @ak: you’re a subversion-user, right? ;) Well, you’re right in some aspects, but a few notes from my side:

    * the “strange commands” are a good point where git is getting better from version to version (and yes: it’s definitely hard and difficult to understand all those commands and options, especially in the very beginning :)). There were just too many git-… commands and AFAIK all of them are sorted out into ‘git command’ instead (which makes it at least a bit better). Many common tasks find their way into something like a new command and hopefully stuff like “delete a remote branch” will become smoother and maybe more obvious as well. ;) Oh, if you want to use a special option you can simply assign an alias anyway *SCNR*. ;)

    * commercial software regarding version control isn’t a good example because developers just too often take the “easiest way to go”-approach instead of really learning their tools (just look at the usual developer using his/her editor ;)), or just think of MS SourceSafe).

    * regarding the use cases: well, that’s *exactly* what many projects (no matter whether OS or closed source) need, most of them just don’t know that yet (that’s not a joke). As soon as you have non-linear development (or just base your work on someone else’ source), branching/rebase is something you really want to be able to use that way. You don’t have any “waste of resources” when keeping up with trunk/head/tip/master/…. – check out the disk usage of git compared to subversion and be sure to never want to use subversion any more at all. :)

    @Ted: thx!

    regards,
    -mika-

  6. ak Says:

    Yes, I’m a subversion user, I may not be cool because I don’t follow the latest version control fads, but I do know sh*t about version management.

    it’s definitely hard and difficult to understand all those commands and options, especially in the very beginning

    Why does it need to be hard?

    commercial software regarding version control isn’t a good example because developers just too often take the “easiest way to go”-approach instead of really learning their tools

    I think that’s just generic blabla. We use Subversion here, and we support tens of customers every month with custom hotfixes, and it needs to be easy and simple, because otherwise with the huge amount of code being checked in we would definitely lose overview. You always presume that it’s so easy, you branch and make your changes, and merge, and so on, but it’s NOT. No matter what tool you use. Managing branches is always hard, unless you only work on Mickey-Mouse projects with less than 5 to 10 developers.

    regarding the use cases: well, that’s *exactly* what many projects (no matter whether OS or closed source) need, most of them just don’t know that yet (that’s not a joke). As soon as you have non-linear development (or just base your work on someone else’ source), branching/rebase is something you really want to be able to use that way.

    Don’t tell anyone what they want or don’t want. We evaluated a number of version control systems, and Subversion came out to be easy enough to use for everyone (you just can’t deploy a VCS that requires days of productivity-blocking introductions in a shop with e.g. more than 60 people, or in a single project with 10 developers or more), yet able to support our complex custom hotfix and patch process, and being a multi-platform VCS (that’s the killer argument against git).

    You don’t have any “waste of resources” when keeping up with trunk/head/tip/master/…. – check out the disk usage of git compared to subversion and be sure to never want to use subversion any more at all.

    You don’t get it, do you? Hard disk is cheap. Really cheap. Other computing resources, such as CPU time, are not. My point remains valid: a proliferation of branching is a danger to quality assurance measures such as continuous integration (do you even know what that is?). And integration, at least in bigger projects, is one of the most crucial things during development. That’s why git works well for the Linux kernel, because what these people do is basically nothing but integrating a huge pile of patches from outside sources. But those advantages don’t apply to projects who do not have this type of work flow, i.e. virtually all other open source projects and commercial projects.

  7. maks Says:

    ak first of all:

    * git is simple
    * git is fast

    i’ve deployed git across many different people with various vcs knowledge.

    The requirements for any serious vcs are
    * speed
    * offline work (people on planes, country house or wherever)
    * repo format not changeable by hostile party

    so you svn fanboy, i guess you are doing mickey mouse stuff, as your crap that pretends to be better than cvs fails on all of these important points and don’t get me started with nonlinear development.

    also mikas git post is meant for people already working with git and having fun exploring its corners. The git cheat sheet i give away to customers has no more than 10 lines.
    the 90s are over mate.

  8. tomb Says:

    All those scary git commands always make me feel dizzy. I have to do svn in my work, I played a bit with git too, but never had enough time to dive in deeper. These tips helped me a lot, thanks for this informative post.

  9. Jakub Narebski Says:

    @mikas

    Just execute the following command line on your Debian system to install all relevant packages

    Are all those packages really relevant? On the other hand git-completion (bash tab completion for git) seems to be missing; although if you use zsh, then you should have completion for git already…

    Now let’s start with some general and basic configuration

    Wouldn’t it be easier to just edit the relevant file (~/.gitconfig) instead of using git-config, which is intended for *tool* use?

    Repack a git repository to minimize its disk usage

    It is now enough to use ‘git gc’ or ‘git gc --prune’ (the latter on a quiescent repository). By the way, modern git requires repacking less and less.

    You didn’t mention the git commit tool, ‘git gui’ (and its wonderful graphical blame ‘git gui blame’). qgit also has some commit tool features.

    @ak

    Branching is totally overrated, and not only that, it encumbers software quality methods such as continuous integration (unless you immediately merge your branch back to the trunk, HEAD, whatever; or you do CI for every branch, which would be a complete waste of resources).

    Private branches let you create a “perfect patch series”: a feature introduced in small, self-contained steps, and then send that series to a public server as a unit. Centralized SCMs don’t really allow private branches (where “private” might mean: worked on by some subgroup).

    Sidenote: in git repository there is contrib/continuous with example cidaemon and post-receive hook, so when you push to central / publishing repository the code would be checked (and perhaps refused to be integrated).

    As to difficulty of git: I’d rather like to think about it as steep learning curve (which usually follows steep power curve ;-)

  10. mika Says:

    @Jakub:

    Right, I don’t use bash :) No, not all the packages are relevant, but most of them are really useful. :)

    @git-config: just depends on what you prefer :)

    Thanks for mentioning ‘git gc’ (I wasn’t aware of that) and ‘git gui’ (forgot to mention that :))

    regards,
    -mika-