It’s just data

Experience with Git

The git vs svn permathread seems to have reignited at the ASF, and I thought I would describe some of my actual experiences with git in the hopes that it will help anchor the discussion.

The context: I’m co-author of a book on Rails.  I have a vested interest in the scenarios described in that book continuing to work, so I wrote a few tests that I run against various combinations of Book editions, Rails releases, and Ruby versions.

From time to time, a test fails.  One of the first things I typically do is run git bisect.  All I need to do is to identify a good version, a bad version, and a test to run.  Even if the test takes 2 minutes and there are 30 or so revisions between the good and bad points, I get an answer in about 10 minutes without needing to be further involved in determining where the problem is.
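
The 10-minute figure follows from the arithmetic of binary search. Below is a minimal sketch of that arithmetic in Python, not git's actual implementation. The function `first_bad`, the revision names `r0` to `r31`, and the culprit position are all hypothetical, and the sketch assumes that every revision after the first bad one also fails. In practice the same search is driven by `git bisect start`, `git bisect good`, `git bisect bad`, and `git bisect run`.

```python
def first_bad(revisions, is_good):
    """Return (first bad revision, number of test runs).

    Assumes revisions[0] is known good, revisions[-1] is known bad,
    and every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    runs = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        runs += 1  # one run of the (2 minute) test
        if is_good(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi], runs


# Hypothetical history: r0 is the good point, r31 the bad point,
# with 30 revisions in between.
revisions = [f"r{i}" for i in range(32)]
culprit = 23  # assumed for illustration only

found, runs = first_bad(revisions, lambda r: int(r[1:]) < culprit)
print(found, runs, "test runs, about", runs * 2, "minutes")
```

With 30 revisions between the endpoints, the search never needs more than five test runs, wherever the culprit sits.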

I mention it to a developer, and the first thing he does is place a comment on the actual commit.

The next thing I do is build a smaller test case.  I have a scenario that fails.  I know a revision that it passes on.  I whittle down the scenario to one that continues to fail with the same symptoms and passes on the known good revision.  I post a comment on the same commit.

At this point, I have a small test and a commit that fails.  I may not know the full Rails codebase, but the commit page shows what changed and I can make an educated guess as to what the problem is.  I post my patch for all to see.  I request that this patch be pulled.  Within minutes it is, and the pull request is closed.

Can all of this be done using svn and JIRA?  Absolutely.  I’ve used svn log, diff, and patch plenty of times.  But compare that experience to going to a web page showing a list of commits, running git bisect, pushing a patch for all to see, and then clicking on pull request.

Are any of these features absolutely necessary?  Well, no.  I’m even aware that pieces are available like svn-bisect.  But I can see how people that have gotten used to having everything integrated and at their fingertips feel like they are taking a huge step backwards when they migrate to an environment that doesn’t.

Whilst I no longer read ASF mailing lists, the core of my objection to svn in an open source context is that it actively penalises developers who don’t have write access. (This is true of any centralised VCS).
Working on a project with git (and github optimises this hugely, but isn’t necessary), I can fork the upstream code base and still enjoy all of the benefits of revision control - atomic commits, history, etc etc - whilst not having had any prior interaction with the upstream project. I can merge new code from upstream, keep a number of simultaneous branches in flight, and generally get an efficient workflow for free.
None of this is true of subversion; before getting commit access to httpd or apr I regularly found myself with numerous full checkouts of svn just to work on semantically separate patches.

Posted by Thom May at

I can’t believe there even is a git (or any other of the major DVCS options) vs SVN question as we approach the end of 2011.

I second everything Thom said, and further I actively avoid contributing to (and to a much lesser extent, but it’s still a consideration, using) projects that use SVN. Apart from the practical considerations, call it “project smell” perhaps. A vastly better set of tools has emerged, and the idea that there are people that haven’t realised that after all this time is somewhat disturbing.

Posted by Ciaran at

The title only mentions Git, but the post touches heavily on both Git and Github.  A non-trivial portion of this appears to rely on Github specifically, and not just Git.

Git provides some wonderful features, but for the purposes of your title, how much would be removed for projects hosted outside of Github?

Posted by Joseph Scott at

Joseph: two comments.  First, to command line geeks (including yours truly), that question is very relevant.  But to many, the functions of a given system are viewed through the lens of whatever UI is presented to them.  For those people, movement to svn can be perceived to be a big step backwards.

The post that was of most value to me on learning the value of git was Ryan’s.

Posted by Sam Ruby at

Sam Ruby: Experience with Git

Sam Ruby: Experience with Git miyagawa [link]...

Excerpt from Hatena Bookmark - nao's bookmarks - Favorites at

Tab Sweep

Happy Thanksgiving Americans! If you’re the type who browses while full of turkey, here are some postprandial links with no unifying theme whatsoever....

Excerpt from ongoing by Tim Bray at

... so let’s change the ASF’s policies. Choice of revision control is a religious argument and has no right or wrong answer. If the mantra is “Community > Code” then the answer to the question “What RCS should we use?” should be “whatever works for the community.”

Posted by Fil Maj at

Agreed, I think the fundamental question is whether the tooling debate is even relevant to ASF projects. Is the ASF about a particular technology or is it about an ideology? Technology deprecates --- this we know for sure! Principles, while susceptible to change, are certainly more concrete. This isn’t to say that making good technology choices is not important. It is. But I can tell ya, that isn’t a battle we thought we were picking!!! That all said, we are happy to help move this forward w/ the Couch team in due process.

Posted by Brian LeRoux, PhoneGap at

Fil Maj: please point to an actual policy documented on the ASF website that needs to change.  Part of the frustration of this discussion is that people with incomplete information are filling in gaps and reacting to the picture of the world that they think that they are seeing, and not the world as it actually is.  As an example, I encourage everybody who has read Apache Considered Harmful to read the updates that have since been made to that page, in particular the last paragraph at the time of this writing, which references Noah Slater.

Brian LeRoux: “ideology” is an emotionally laden term.  But let me tell you a story.  I work for a large company (see Disclaimer on the right).  I have had occasion where a group has come to me and said that they want to contribute their project to the ASF, but they wanted to retain the ability to set the technical direction, decide who can or can not contribute, and to continue using the license that they had been using.  My response was simple: “so, you are telling me that you do not want to come to the ASF”.

If you want to call that response ideology, I’m fine with that.  I’m also fine with — and frankly the beneficiary of — any number of non-Apache ways of doing things, and any number of licenses — just not at the ASF.

I will assure you that this isn’t about dictating technology choices.  Fundamentally it comes down to being comfortable that we can answer the question as to whether or not licensees (a.k.a. users) of our software can be assured that the product that they are downloading can be and is made available under the terms of the Apache License, Version 2.0.  That quickly breaks down into a number of subquestions.  First and foremost: can the ASF infrastructure team support this tooling?  So far the answer to that question looks promising.  Secondly, how do we interpret the terms “intentionally submitted” in the context of a DVCS?  This will need to be discussed on legal-discuss.  I’m confident that pull requests for normal size patches will suffice.  And so on.  This includes how we work with our development communities so that they understand what part they need to play in this.

You can see my ASF credentials.  You can review my past statements on git and assess whether or not I am biased against git (spoiler: I am not).  I am confident that if a project can demonstrate that they will actively maintain a current and relevant master copy of the repository on ASF infrastructure, proactively monitor and police the set of patches that are integrated, and document via ICLAs and IP clearance forms each and every notable exception, that all will be fine.

That’s the outline of what needs to be addressed as I see it.  You can help — for example, either by asking questions or by providing answers.  Or you can wait a bit.  In the latter case, I don’t expect that it will be much longer.

Posted by Sam Ruby at

Related links: git work-in-progress, read-only mirrors.

Posted by Sam Ruby at

Re: "How to use Git effectively?"

Sam Ruby ? [link]...

Excerpt from: comments on Jeudi du Libre for December in Lyon: Git, or how to give the impression that you are a supercoder? at

Experience with Git: From time to time, a test fails.  One of the first things I typically do is run git bisect.  All I need to do is to identify a good version, a bad version, and a test to run.  Even if the test takes 2 minutes and there are 30 or...

Excerpt from turnings :: daniel berlinger at

As someone who was slow to move from svn to git -  I actually used mercurial for a bit there too - I can attest to the resistance. What took me over the tipping point was actually working in a shop that uses git. Having a trusted friend to build a bridge to the emerald city on the hill has been invaluable and I now can’t imagine going to anything but git.

That said, I find now that git is functioning on a couple levels sociologically. One is that nascent projects that use git seem to get traction more quickly. I attribute this to two things: 1) git encourages ease of interaction - this is true on both the ease-of-commit and pull requests that you describe and by extension, the ease of interacting with developers via github issues. I think that open source software development is accelerating as a result of git and github and for that reason alone, I think it should be considered a de facto standard.

Arguments can still be made around usability and discoverability and I think there are cases (plumbing git reset’s depths) where git’s usability and discoverability can be improved and the learning curve made easier for new developers who aren’t Linus.

Posted by David Watson at
