Command line: version control

Version control is one of the most useful and powerful tools programmers have. It’s analogous to saving your place in a video game– it lets be creative without being nervous about having to start over from the beginning. Indeed, a good version control system sets you save your “state” (like your place in the game), try something out, save your state there, go back to where you were, try something else, then save that, and so on. Beyond saving state, it also allows you to bring your new developments back together if you decide to keep them all.

Things really get fun when you have multiple people working on the same project with version control. They can keep track of what the other people are doing, develop a new feature on a friend’s project and ask them to pull that new feature into their code. They can comment on each other’s code, point out problems, etc, etc, etc.

Sorry, it’s easy for me to get really excited about version control– it sets you free when you use it on your own, and it’s a party when everyone knows how to use it properly.

There are many version control systems out there, and each has its own philosophy and advantages. The one that has been most extensively adopted by the computational biology community is called Git. It was written by the same guy—Linus Torvalds—who started Linux. He quips that like Linux, he named it after himself (“git” in British English means an unpleasant person).

You can learn Git with this cute tutorial website. It will walk you through the basics. An excellent book is Pro Git by Scott Chacon. It’s available in its entirety at this website, but Connor has the print version I’m sure he’d be willing to loan. There are many other resources at the git-scm website, including some tutorial videos. There is also a nice interactive site that teaches you what’s going on “under the hood” once you have gained some experience.

I’m not going to try to do better than these resources, but I do have a couple of bits of advice. I would recommend starting with three commands: git add, git commit -a, and git push. Keep doing that until you get antsy, or until you need to do something else. Then read about that thing and have a play with it. I would really suggest staying away from git rebase until you feel super-confident (there is no shame in git merge!), and I suggest backing things up before using it the first several times.

You can’t work with git for very long without hearing about GitHub. It’s a place you can keep your Git repositories online with many features. You can start unlimited open repositories for free, and they have lots of great freebies for students and academics. In fact, this website is a Git repository hosted at GitHub. Send me a pull request!

Edit this page on GitHub

brought to you by