
Thoughts on Git and “Enterprise Open Source”

Three years ago I wrote a post about how Git and Github changed the Open Source world and how companies can benefit from this model (in Portuguese, unfortunately, but you can try to read an almost-decent automatically translated version).

How is it that, three years later, companies are still struggling to find a good version control system and a good model for internal collaboration when there’s a working model out there ready to be copied? 🙂

In many big corporations, like the one where I work, finding code in (Subversion) repositories is like finding a needle in a haystack. Code is hidden from potential contributors, which makes collaboration very hard. If you are really willing to collaborate, you have to download the code and send a patch by e-mail or via the bug tracker. You just don’t know when, how, or even if the patch was applied. Besides all that, people lose hours and hours on slow SVN blames, checkouts and updates. Branching and merging with Subversion is too painful to mention: it’s an error-prone process that requires a lot of attention and, sometimes, hours of work. And if you are using SVN externals, oh god, poor you!

To solve these problems, two years ago my team started to work with Git using Git-SVN. This is a very well-known approach that lets you work on your Subversion repositories using Git as the client. You get some of the Git benefits, like automatic merges, the local stash, local repositories and local commits, but keep using Subversion as your remote repository for team syncs and as the “source of truth”. Well, it turns out that Git-SVN has some problems and you need to do some tricks to avoid trouble. And because you are still working with Subversion, some operations are still slow, for instance the initial clone (it can take several hours if the repository is big). This approach seems acceptable at first, until you realize you’ve put a Lamborghini body around your old Horsey Horseless car.

In an effort to increase our productivity, we decided to go for a different (and obvious) approach: build our own Git server and quit Subversion forever. But because life is hard, we had several things plugged into our SVN repositories, from CI to our internationalization system, and given that Git was not “official”, some of those internal tools didn’t support it and wouldn’t work. Not to mention that if we still wanted to be compliant with IT, we needed to have our code on Subversion anyway. Not having Subversion was not an option.

To solve that, we wrote a server-side hook that replicates code to Subversion whenever there’s a git push. Even though Git pulls and pushes were still slow sometimes, it was great because we got rid of many Git-SVN problems. The downside is that our code was replicated to Subversion without its commit history. Every push, regardless of how many commits it contains, becomes one single commit on Subversion, and we lose the commit messages. But since the main goal was to use Git only and still have the hooked systems working with Subversion, we loved it. We use Git for history (and everything else), while Subversion is just the thing that’s there because we couldn’t get rid of it.
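To make the one-commit-per-push behaviour concrete, here is a minimal sketch in Python of the decision logic such a hook might contain. It is not our actual hook. It only relies on the fact that Git feeds a post-receive hook one `<old-sha> <new-sha> <ref-name>` line per updated ref on stdin. The mirrored branch name, the helper names and the commit message format are all assumptions made up for illustration.

```python
from typing import List, NamedTuple

# Git reports a created or deleted ref with an all-zero object id.
ZERO_SHA = "0" * 40

# Hypothetical choice: only this branch is mirrored to Subversion.
MIRRORED_REF = "refs/heads/master"


class RefUpdate(NamedTuple):
    old_sha: str
    new_sha: str
    ref_name: str


def parse_post_receive(stdin_text: str) -> List[RefUpdate]:
    """Parse the "<old-sha> <new-sha> <ref-name>" lines a post-receive hook gets on stdin."""
    updates = []
    for line in stdin_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            updates.append(RefUpdate(*parts))
    return updates


def should_replicate(update: RefUpdate) -> bool:
    """Mirror only updates to the chosen branch, and skip branch deletions."""
    return update.ref_name == MIRRORED_REF and update.new_sha != ZERO_SHA


def svn_commit_message(update: RefUpdate) -> str:
    """One generic message per push; individual Git commit messages are not carried over."""
    return "Sync {} at {}".format(update.ref_name, update.new_sha[:7])
```

A real hook would read `sys.stdin`, export the tree at `new_sha` into a Subversion working copy and commit it there. Those steps depend heavily on the local setup, so they are left out of the sketch.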

Two years later, and after many different (sometimes inexplicably weird) attempts, our Git setup has evolved a lot. We’re now using Gitolite to manage our Git server and, because the server-side hook was somewhat unstable with Gitolite, we changed the sync strategy to a cron job. We now have several repositories, and a few different teams are using our server. For other teams that want a similar setup of their own, we wrote a comprehensive step-by-step manual so that they can be up and running with their own boxes in less than a couple of hours.

Git adoption in our teams was painful, but we made it. We made it only for a couple of teams, though, and a lot of people throughout the company are still suffering with Subversion. And even if the entire company were using Git (which would be awesome already, don’t get me wrong), that would solve only the development productivity problem, not the collaboration problem. Repositories would still be hidden, and you would only be able to clone them and send (manual) pull requests if you already knew where they were.

That’s just wrong.

First, I believe developers shouldn’t still have to justify and fight for Git adoption when it is clearly becoming the industry standard everywhere. For instance, Google Code had to add Git support to catch up with Github, and because they took too long to do that they lost many projects and developers (including myself). Atlassian’s Bitbucket has also supported Git since late 2011. Even Microsoft recently announced Git support for Windows Azure. And the list goes on and on. The big players are recognizing Git’s relevance. As for the small ones, they don’t care about anything else. Take Heroku for example, where deployment is only possible through Git.

Second, many Open Source projects like Rails, Node.js, Symfony, Django, the PHP language, Qt, openSUSE, YUI, jQuery and countless others have already done us the favor of proving how platforms like Github and Gitorious can greatly improve the collaboration and contribution experience by providing workflows and tools that are really helpful for maintaining software projects. Those platforms enhance collaboration significantly, besides giving visibility to people’s projects and the things they are working on.

We are not talking about a new hype here. Git and Github (and maybe this could be extended to other Github-like systems such as Gitorious and Bitbucket) have become the industry standard over the past few years. Git has been around for some seven years now and was inspired by the way people worked on the Linux Kernel, one of the biggest and most important software projects in the history of computing. By the way, if you haven’t done it already, stop for an hour now and watch this great video by Linus Torvalds on why he created Git. I promise you won’t regret it. Git is ultra fast, stable, scalable and secure, and it makes collaboration much easier and faster. Managing merges and patches won’t be a nightmare anymore, not only because a lot of effort and intelligence went into the merging itself but also because Git embraces a collaboration workflow that makes life much easier for both project owners and contributors. And Github will make your work more pleasant, collaborative and visible by adding even more tools and value on top of Git. But let’s not make this any longer, you get the idea already: they became the new standard for a reason.

The door is there. Now companies have to walk through it. And because I like challenges, I’ll help one more company take the red pill. See you on the other side.


Open source: “I don’t use open source software because I want support”

“I don’t use open source software because I want support. I want to pay for it, so I can have support if I need it.” That’s what a lot of people say about free and open source software. But today I ran into a really interesting situation that is worth sharing.

We were configuring a continuous integration server in my team for a new project and we decided to use Integrity, which is a very simple yet powerful and beautiful tool. Our goal was really simple: run tests, deploy the application and run more (acceptance) tests. Then we hit a situation where the tests were not running, and the reason was somewhat bizarre. Integrity was opening a subshell to execute our build (and that’s very fair), but the problem was that Python’s sys.stdout was raising a Unicode error, because the test reports have a lot of accented characters. For some strange reason, the very same code that was working in our shells was aborting with an exception when executed in a subshell.
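I never pinned down the exact root cause, so take the following as a hedged illustration of one common way this happens, not as a diagnosis of our build. Python picks the encoding of sys.stdout from its environment, and when the output is not a terminal or the locale is not set (older Python versions in particular), it can fall back to ASCII. The sketch below reproduces that effect with an in-memory stream; the `write_report` helper is hypothetical.

```python
import io


def write_report(text: str, encoding: str) -> bytes:
    """Write text through a text stream using the given encoding, as sys.stdout would."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)  # encoding happens here, so a bad codec fails here
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    report = "Execução dos testes: aprovação"  # accented text, like our test reports

    # With UTF-8 the accented characters are written without trouble.
    print(len(write_report(report, "utf-8")), "bytes written with utf-8")

    # With ASCII the very same write aborts with an exception.
    try:
        write_report(report, "ascii")
    except UnicodeEncodeError as error:
        print("ascii stream failed:", error.reason)
```

If this is what is going on, setting the `PYTHONIOENCODING` environment variable (for example to `utf-8`) for the build command is one standard way to force the encoding of Python’s standard streams.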

Given that complex situation, I decided to go to the website’s FAQ to see if somebody had had this kind of problem before. I thought that maybe some configuration or environment variable setup could easily solve my problem. After some minutes of browsing I found instructions to configure Passenger user switching to overcome this problem, but I had no success.

Then, very frustrated, I decided to take a look at the documentation again, and this time I saw a link to “support” that pointed me to an IRC channel.

In five minutes I was talking to two committers of the project and having a high-level discussion about the problem, the causes and the possible workarounds. The best part was that it took exactly 30 seconds for them to understand what I was talking about, and they immediately started pointing me to solutions and asking me to try things… Thanks to the guys’ tips (and Google) I was able to solve the problem in the end.

If you don’t like open source software because of the support, then I would like to ask you: what reality do you live in? Do you prefer to talk about “subshells” and “environment variables” with some call center attendant, or do you want to talk to the people who can really help you solve your problem?

In another situation I was working at a company that used VoIP telephony equipment that only worked on Windows. I wanted so much to use my preferred Linux distro, but that would have meant that I couldn’t have a telephone. So, since we had a gold support plan (because we had a lot of PBXs with almost 200 branches), I decided to call the company and ask why they didn’t have a Linux version. I also tried to make a proposal to our account manager: “we can implement that for you, just give us the Windows source code or the protocol spec and we will implement everything for you and give you the source code and all the rights for free”. That was 6 years ago and they still don’t have a Linux version of the software…

So I want to ask again: do you want to pray for your vendor to implement the solutions that are important to you, or do you want to have the power to do it yourself when you need to?

Think about these things. The great majority of the times I asked for support in open source projects, it was infinitely better than any paid support I’ve ever had!