Mercurial vs. Subversion


I’m launching a new Django-based project at work.  I’ll initially be the sole developer, but I expect it to grow to 10 heads (developers, QA, web designers, operations, and project management) over time.  I hope to start the ball rolling this week, and one of my first decisions is the Source Code Management (SCM) tool.

Here’s the development environment:

  • Up to 12 people using the pool.  Four to five of them will be full-time developers.  The developers, QA, and web designers will use Macs; operations and project management will probably use Macs but might use Windows
  • Most of us will be in the same office space, and all of us will be in the same building. But I don’t want a tool that rules out hiring a contractor in a far-away land
  • We’ll use the standard pool organization of one trunk with multiple branches
  • This is for a commercial product that’ll have, eh, many lines of code.  Python will be the major language, and most of the technologies will be open-source.  There won’t be a line of .NET, or any other Microsoft technology, within 500 miles of the project
  • Preferably open-source.  Preferably free
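
As a rough sketch of the one-trunk, multiple-branches organization in the list above, assuming the conventional Subversion layout of trunk, branches, and tags directories: the helper names and the example.com URL below are hypothetical, and the functions only build the `svn copy` command rather than run it.

```python
def standard_layout(repo_url):
    """Return the conventional trunk/branches/tags URLs for a repository root."""
    root = repo_url.rstrip("/")
    return {name: f"{root}/{name}" for name in ("trunk", "branches", "tags")}


def branch_command(repo_url, branch_name, message):
    """Build the `svn copy` argument list that branches trunk on the server."""
    layout = standard_layout(repo_url)
    return [
        "svn", "copy",
        layout["trunk"],
        f"{layout['branches']}/{branch_name}",
        "-m", message,
    ]


if __name__ == "__main__":
    # Hypothetical repository URL, used only for illustration.
    print(" ".join(branch_command("http://svn.example.com/project",
                                  "feature-x", "Branch for feature X")))
```

Because a branch is just a cheap copy under branches/, nothing but this naming convention marks it as a branch.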

I quickly narrowed the solution set down to Subversion, Mercurial, and Git.  I’ve extensively used Subversion, and occasionally used Mercurial.

I first eliminated Git, after reading multiple commentaries about its UI, and getting the sense that it’s less-than-fully baked.

Mercurial’s primary advantages are in minimizing network latency effects, a simple setup, and no status directories (e.g., “.svn”) “polluting” the pool. But, we’ll all be on a high-speed internal network, I’ve used Subversion before and don’t think the setup is terrible, and the declining cost of disk space makes .svn directories a non-issue. These all blunted Mercurial’s presumed advantages.

Additionally, I think distributed SCMs like Mercurial have a not-yet-fully-appreciated problem in making it too easy to not [ever] check code back into the main pool.  With a local repository, a developer can feel protected from accidents and continue working happily for quite a long time.  And then, say a year down the road, he/she does a massive check-in and discovers an integration problem.  Branches, or a local repository that is effectively a private branch,  should be easy to make — but not too easy.

Mercurial will be very effective for SCMing individual scripts and one-off files in random directories.  It means remembering which files are in Mercurial (or doing “hg status” a lot), but that’s balanced against not needing to make a new repository for a directory that contains just one SCM’d file.
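
The "which files are in Mercurial" bookkeeping could be scripted. The following is a minimal sketch, assuming Mercurial is installed and on the PATH; the function names and sample file names are hypothetical. It groups the output of `hg status --all` by status code, so tracked and untracked files in a directory are easy to tell apart.

```python
import subprocess

# Status codes printed by `hg status`; "?" means the file is not tracked.
STATUS_NAMES = {
    "M": "modified",
    "A": "added",
    "R": "removed",
    "C": "clean",
    "!": "missing",
    "?": "untracked",
    "I": "ignored",
}


def parse_hg_status(output):
    """Group file names from `hg status` output by status name."""
    grouped = {}
    for line in output.splitlines():
        # Skip blank lines and the indented copy-source lines.
        if len(line) < 3 or line[1] != " " or line[0] == " ":
            continue
        code, path = line[0], line[2:]
        grouped.setdefault(STATUS_NAMES.get(code, code), []).append(path)
    return grouped


def hg_status(directory):
    """Run `hg status --all` in `directory` and return the grouped result."""
    result = subprocess.run(
        ["hg", "status", "--all"],
        cwd=directory, capture_output=True, text=True, check=True,
    )
    return parse_hg_status(result.stdout)
```

Calling `hg_status(".")` inside a repository returns a dictionary such as `{"clean": [...], "untracked": [...]}`; anything not listed under "untracked" or "ignored" is under version control.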

So, I decided upon Subversion, via Trac, for the main code pool, and Mercurial for a “one-off” script SCM tool.  Another nice aspect of Subversion is its Trac integration.  (Trac has a Mercurial plug-in, but it’s experimental.  No thanks.)

21 comments
  1. We started using beanstalkapp.com for our svn repo, since it’s managed and we don’t have any security department politics. Haven’t used trac before, but I thought you should take a look at http://warehouseapp.com/ since it’s similar to beanstalk, but can be installed on your own servers. The only thing is that it’s a rails app, so there might be some deployment issues. It is from the makers of mephisto and lighthouse app. The price isn’t too bad either: only $30 USD. Hope this helps.

    btw: what UI issues were you referring to on git? I ask since I’ve been tempted to move there for a while now, because of all the buzz it’s getting.

  2. Jeremy Dunck said:

    Brett and I have been happily using git and git-svn for a couple months now.

    I think the forgotten-code problem is minor– if it was useful or important work, it won’t be forgotten.

    Git’s community is very swiftly moving along, though I’d like to hear what UI problems there are, too.

    I did have to read kind of an inordinate amount of text to really get git, and that curve will slow adoption, but I do think it’s a lovely system.

    No comment on hg; I have only dabbled with it.

  3. Jakub Narebski said:

    Both Subversion and Mercurial have made IMVHO some not so good design decisions. For Subversion they are the abuse of the cheap-copy paradigm (using copying for branching *and* tagging, and relying on *convention* to tell what the branches and tags are) and, I guess, also the (ab)use of properties that are hidden from the CLI tools (instead of something like .svnignore). For Mercurial they are: using tags to name branches; the tags implementation, where tags are either transferred and versioned (needing special-casing of the .hgtags file) or neither transferred nor versioned, instead of being unversioned yet transferable; and the limitation to only two parents, so no octopus merges.

    Note also that with centralized version control system like Subversion you would be tempted to use only minimal number of branches: “branching is evil”, instead of using topic branches and developing features in separation.

    BTW. if you are Python shop, you might consider also Bazaar (`bzr`), http://bazaar-vcs.org/ (backed by Canonical of Ubuntu, integration with Launchpad, GNU project)

  4. John said:

    @Jonathan:

    Eh… A hosted svn would be my last choice, I think. I want the code safely in house and not in someone’s cloud, and I don’t want the Internet’s vagaries to affect when I can get to my code.

    Re: svn clients, thanks for the tip, I’ll look at Warehouse. FYI, I’ve heard good things about Versions and Cornerstone. The former’s free for now, and the latter is $60.

    I didn’t record all the blogs I read. But after an afternoon of web trawling, I came away with the impression that git’s command-line interface was more complicated than either it needed to be, or I needed for my project, take your pick. And I wasn’t bowled over by the state of the GUI interfaces.

    Now, note that I’m not running an open-source project divided across hundreds or thousands of developers, who are spread across the globe. Perhaps git’s features really are justified by projects like the Linux kernel, or maybe it’s a baroque design. Either way, it’s more shall we say involved than I need for a ~ 10 person in-house team.

  5. John said:

    @Jakub: Thanks for the comments. What’s wrong with using convention to differentiate the branches and tags?

  6. Simon said:

    I use Git for just about everything, after being used to Subversion for quite some time.

    I’m baffled by this post. Git was ditched because you heard on the internet that its GUI isn’t up to speed?!
    That should probably not be your primary criterion…

    The Git GUI tools (git-gui and gitk) have been criticised before, back when Git was in its infancy, but now they’re easy to use and work really well. And on all platforms. It’s not eye-candy, but they get the job done, and they do it well.

    Branching is never “bad”. It is the safest way to introduce new features — and when branching and merging is hard to do (which is the hopeless case for Subversion), you end up never branching at all, and definitely never merging again.

    Git is objectively better than Subversion. I’ve been working on very large proprietary systems using Subversion, and it gets the job done, as long as you don’t want to do anything out of the ordinary. Which you occasionally do when multiple people work on the same project and conflicts occur.

    If you want to know more about Git and why it was designed the way it is, you should check out Linus’ presentation about Git at Google: http://video.google.com/videoplay?docid=-2199332044603874737

    As to Git vs. Mercurial, I can’t really say. They seem very similar in design, but I haven’t heard of any major projects using Mercurial for primary development, while Git has the Linux Kernel, Ruby on Rails, X.org (and everything else from freedesktop.org), OLPC, Wine, various GNU build tools, Merb, Rubinius, Samba, VLC Media Player, and I could go on, but that would be annoying. :-)

    All in all, you’re wrong, everyone else is right.

    – Simon

  7. Sean said:

    I think what Jakub is referring to about the “branching is evil” view with Subversion users is about merging branches. No amount of convention will save you when you have to merge changes from one branch into another. Anyone who has had to do this will quickly realize how painful and inadequate Subversion is at tracking branches. Take a look at this link to learn more about the svn branching woes. And many more.

    In most projects that I’ve worked with using Subversion, I usually -spend- waste a considerable amount of time trying to fix some hangup that is preventing a commit. Subversion only seems to have no problems if you only make trivial changes. But what development project have you worked on that only required trivial changes?

  8. sofia said:

    Simon says
    “As to Git vs. Mercurial, I can’t really say. They seem very similar in design, but I haven’t heard of any major projects using Mercurial for primary development ”
    How about the Mozilla Project (http://hg.mozilla.org/) and Open Solaris? But you can see more at http://www.selenic.com/mercurial/wiki/index.cgi/ProjectsUsingMercurial . I’ve used mercurial for a few months and so far I’m happy with it, but I also wanna try git sometime just for fun.

    – Sofia

  9. Peter said:

    Sean says “Anyone who has had to do this [merging branches] will quickly realize how painful and inadequate Subversion is at tracking branches.”

    This limitation (lack of “merge tracking”) has a long history in the svn community and was addressed in svn 1.5 (recently released).

  10. Robert said:

    I am going to use bazaar. It has a cool name!

  11. Frank said:

    Take a look at svk, a layer on top of svn that makes it taste better :)

  12. John said:

    James Bennett just wrote an excellent post about DVCS vs. CVCS. The comments are very good, too.

  13. Taavi Burns said:

    We’ve been using Subversion at work, and migrated to 1.5 pretty quickly to take advantage of the merge tracking.

    And now we wish we hadn’t. It’s a complete bollocks that gets confused easily, and generates a LOT of svn:mergeinfo properties where they’re not needed, causing trac and other diff tools to spew irrelevant changes, adding way too much noise to the process.

    Maybe it just needs some more work. But svn’s had 7 years to get this right, and has been pretty seriously upstaged by the “newcomers” hg and git when it comes to this “basic” SCM ability.

  14. Ahmed said:

    Some thoughts…

    If a manager is worried that his or her developers are NOT checking in their changes on a timely basis, this is a problem with both CVCS and DVCS.

    You mention that this is potentially more pervasive with DVCS since they have local copies of the repository on their machine.

    Yet, there is a solution to this: DVCS allows “pull” operations. For the manager, all that is required consequently is to set up regularly scheduled pulls (e.g. a cron job) and review the diffs between the developer’s work and the current baseline project. As I see it, this is in fact a big advantage over CVCS: if the developer is negligent in checking in, the manager has the ability to investigate in a very detailed way.
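
    The scheduled-pull idea above could be sketched roughly as follows, assuming Mercurial is on the PATH and each developer's repository is reachable (for example over ssh). The function names and repository locations are hypothetical. The script uses `hg incoming --patch`, which shows changesets and their diffs without actually pulling them into the baseline.

```python
import subprocess


def incoming_command(baseline_repo, developer_repo):
    """Build the `hg incoming` call that lists, with patches, the changesets
    present in `developer_repo` but not yet in `baseline_repo`."""
    return ["hg", "-R", baseline_repo, "incoming", "--patch", developer_repo]


def review_incoming(baseline_repo, developer_repos):
    """Return {developer_repo: diff text} for repos with unmerged changesets."""
    pending = {}
    for repo in developer_repos:
        result = subprocess.run(
            incoming_command(baseline_repo, repo),
            capture_output=True, text=True,
        )
        # `hg incoming` exits 0 when there are incoming changes and 1 when
        # there are none; anything else is treated here as an error.
        if result.returncode == 0:
            pending[repo] = result.stdout
        elif result.returncode != 1:
            raise RuntimeError(f"hg incoming failed for {repo}: {result.stderr}")
    return pending
```

    A cron entry could run a small wrapper around `review_incoming` nightly and mail the resulting diffs to whoever reviews them.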

    • John said:

      @Ahmed: Your “pull” suggestion isn’t quite identical to developers making regular and healthy check-ins.

      My advocacy of regular check-ins isn’t to verify that the developer is working. I trust my developers until proven otherwise, and I have easier ways of discovering a productivity issue than doing regular pulls from their private work areas. Rather, regular check-ins are wise for early uncovering of integration problems.

      For this, performing regular “pulls” wouldn’t be as effective as regular check-ins, because I won’t know the right time to “pull”. The developer should do check-ins at a reasonable frequency, but not when their new code is in any random broken state! The code should be buttoned up and “not break the build.” Yet if I pull from their private working copies/branches at any random time, I can’t claim those attributes. Just because I pulled over pure crap wouldn’t mean that there’s a problem — it might just mean that the developer is in a massive code migration, which might be all done by, say, the end of the day.

      In addition, as a manager, I might not know all the places to pull from.

      So, I still maintain that it’s best if developers regularly check into the pool. I think DVCSs have an interesting issue with this.

      • It is easy with Git/Mercurial to prevent late pushes:
        1. 1 developer = 1 task
        2. A task is done only after it is tested
        3. The CI server pulls code from the central repo to make the daily build
        4. Salary is paid after each task is done (to be more Agile :) )

        http://www.joelonsoftware.com/items/2010/03/17.html

