
Thoughts on Git and “Enterprise Open Source”

Three years ago I wrote a post about how Git and GitHub changed the Open Source world and how companies can benefit from this model (it is in Portuguese, unfortunately, but you can try reading an almost-decent automatic translation).

How come, three years later, companies are still struggling to find a good version control system and a good model for internal collaboration when there’s a working model out there, ready to be copied? 🙂

In many big corporations – like the one where I work – finding code in (Subversion) repositories is like finding a needle in a haystack. Code is hidden from potential contributors, which makes collaboration very hard. If you are really willing to collaborate, you have to download the code and send a patch by e-mail or via the bug tracker. Then you never know when, how, or even whether the patch was applied. Besides all that, people lose hours and hours on slow SVN blames, checkouts and updates. Branching and merging with Subversion is too painful to mention: it’s an error-prone process that requires a lot of attention and, sometimes, hours of work. And if you are using SVN externals, oh god, poor you!

To solve these problems, two years ago my team started to work with Git using Git-SVN. This is a very well-known approach that lets you work on your Subversion repositories using Git as the client. You get some of the Git benefits, like automatic merges, the local stash, local repositories and local commits, but keep using Subversion as your remote repository for team syncs and as the “source of truth”. Well, it turns out that Git-SVN has some problems and you need to do some tricks to avoid trouble. And because you are still working with Subversion, some operations are still slow, such as the initial clone (it can take several hours if the repository is big). This approach seems acceptable at first, until you realize you’ve put a Lamborghini body around your old Horsey Horseless car.

In an effort to increase our productivity, we decided to go for a different (and obvious) approach: build our own Git server and quit Subversion forever. But because life is hard, we had several things plugged into our SVN repositories, from CI to the internationalization system, and given that Git is not “official”, some of those internal tools didn’t support it and wouldn’t work. Not to mention that, if we still wanted to be compliant with IT, we needed to have our code on Subversion anyway. Not having Subversion was not an option.

To solve that, we wrote a server-side hook that replicates code to Subversion whenever there’s a git push. Although Git pulls and pushes were still slow sometimes, it was great because we got rid of many Git-SVN problems. The downside was that our code was being replicated to Subversion without its commit history. Every push – regardless of how many commits it contained – became one single commit on Subversion, and we lost the commit messages there. But since the main goal was to use Git only and still have the hooked systems working with Subversion, we loved it. We used Git for history (and everything else), while Subversion was just the thing that was there because we couldn’t get rid of it.
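
To make the idea more concrete, here is a minimal sketch in Python of how a hook like that could squash a push into a single Subversion commit. It is only an illustration, not the actual hook: the function names, the `/srv/svn-wc` working-copy path and the `refs/heads/master` branch are all assumptions, and the sketch only builds the list of commands instead of running them. It also does not handle files deleted in Git, which would need an extra `svn delete` step.

```python
ZERO = "0" * 40  # Git reports this id when a ref is being created or deleted


def parse_push(stdin_text):
    """Parse post-receive input: one '<old> <new> <ref>' line per updated ref."""
    updates = []
    for line in stdin_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            updates.append(tuple(parts))
    return updates


def plan_sync(updates, branch="refs/heads/master", svn_wc="/srv/svn-wc"):
    """Return the commands that would replicate a push as ONE Subversion commit.

    svn_wc is a hypothetical Subversion working copy living on the Git server.
    """
    commands = []
    for old, new, ref in updates:
        # Ignore other branches and branch deletions.
        if ref != branch or new == ZERO:
            continue
        # Overwrite the working copy with the tree of the newest pushed commit.
        commands.append(["git", "--work-tree=" + svn_wc, "checkout", "-f", new, "--", "."])
        # Schedule any new files for addition (deleted files are NOT handled here).
        commands.append(["svn", "add", "--force", svn_wc])
        # One commit per push, no matter how many Git commits it contained.
        commands.append(["svn", "commit", svn_wc, "-m", "Sync from Git at " + new[:7]])
    return commands
```

In a real `post-receive` hook you would feed the hook's standard input to `parse_push` and run each planned command in order. Because only the final tree of the push is copied over, the individual Git commits and their messages never reach Subversion, which is exactly the trade-off described above.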

Two years later, and after many different (sometimes inexplicably weird) attempts, our Git setup has evolved a lot. We’re now using Gitolite to manage our Git server and, because the server-side hook was somewhat unstable with Gitolite, we changed the sync strategy to a cron job. We now have several repositories and a few different teams using our server. For other teams that want their own similar setup, we wrote a comprehensive step-by-step manual so that they can be up and running with their own boxes in under a couple of hours.

Git adoption in our teams was painful, but we made it. We made it only for a couple of teams, though, and a lot of people are still suffering with Subversion throughout the company. And even if the entire company were using Git (which would be awesome already, don’t get me wrong), that would solve only the development productivity problem, not the collaboration problem. Repositories would still be hidden, and you would only be able to clone them and send (manual) pull requests if you already knew where they were.

That’s just wrong.

First, I believe developers shouldn’t still have to justify and fight for Git adoption when it is clearly becoming the industry standard everywhere. For instance, Google Code had to add Git support to catch up with GitHub, and because they took too long to do that they lost many projects and developers (including myself). Atlassian’s Bitbucket has also supported Git since late 2011. Even Microsoft recently announced Git support for Windows Azure. And the list goes on and on. The big players are recognizing Git’s relevance. As for the small ones, they don’t care about anything else. Take Heroku for example, where deployment is only possible through Git.

Second, many Open Source projects like Rails, Node.js, Symfony, Django, the PHP language, Qt, openSUSE, YUI, jQuery and countless others have already done us the favor of proving how platforms like GitHub and Gitorious can greatly improve the collaboration and contribution experience by providing workflows and tools that are really helpful for maintaining software projects. Those platforms enhance collaboration significantly, besides giving visibility to people’s projects and the things they are working on.

We are not talking about a new hype here. Git and GitHub (and maybe this could be extended to other GitHub-like systems such as Gitorious and Bitbucket) have become the industry standard in the past few years. Git has been around for some seven years now and was inspired by the way people worked on the Linux kernel, one of the biggest and most important software projects in the history of computing. By the way, if you haven’t done it already, stop for an hour now and watch this great video by Linus Torvalds on why he created Git. I promise you won’t regret it. Git is ultra fast, stable, scalable, secure and makes collaboration much easier and faster. Managing merges and patches won’t be a nightmare anymore, not only because a lot of effort and intelligence went into the merging itself but also because Git embraces the collaboration workflow in a way that makes life much easier for both project owners and contributors. And GitHub will make your work more pleasant, collaborative and visible by adding even more tools and value on top of Git. But let’s not make this any longer, you get the idea already: they became the new standard for a reason.

The door is there. Now companies have to walk through it. And because I like challenges, I’ll help one more company take the red pill. See you on the other side.