slanglois

So, when is Alfresco moving to GitHub?

Blog Post created by slanglois on Apr 1, 2015

'When are you moving to Git/GitHub/Bitbucket/decentralised version control software/...' must be the most frequent question I get, both from people inside and outside of Alfresco.

I usually start by explaining that Alfresco is on GitHub! The Alfresco organisation there currently owns 35 public repositories, and many of them are very active. All our mobile code has been there forever; for instance: (iOS + Android) * (SDK + App), as well as Aikau, Gytheio, the Alfresco SDK, etc. The recent announcement that Google Code is going to shut down triggered another batch of migrations to GitHub, such as Jive Toolkit, dropbox integration, and more.


Of course, what the question really means is: 'When is the Alfresco Platform codebase moving? Not some petty integration or client app.' :-) Well, we do have a mirror of the Community Edition source on GitHub. But as for moving the whole code base there, the answer is: never -- at least in its current form. Here are a few reasons why.

    • It doesn't fit!

      GitHub has a limit of 1GB per repository. We hit that limit for the mirror, and had to filter a few big folders. (Even Linux had to cut down the history of the repository to move there!) Of course, we used to commit big jar files in the source. We don't any more, but even then, we still cross the limit. And it's not just GitHub but all DVCS: holding all the history of the repository in your local copy has a big impact if you have a 10 year old code base of one million lines of code!

 

    • Continuity with past releases

      We could make a clean cut, leaving the Old World in Subversion and having the Brave New World in Git. That would work, but it would make merging modifications from earlier maintenance branches very hard -- and most of our commits are of that kind! We have 4 active service pack branches (4.1.x, 4.2.x, 5.0.x) and 18 active hotfix branches (from 3.4.14.x to 5.0.1.x) All fixes have to be merged forward, so that people don't get regressions when upgrading. Doing this is tedious enough without having to switch software in the middle!

 

    • Access rights

      Our Subversion repository is currently a patchwork of access rights, and DVCS don't support that -- the idea being that you spread your software in smaller repositories, and manage the rights for each repository. Even in a given branch, we have folders which are public (Community Edition), others which are reserved to customers (Enterprise Edition) and others which are private (Cloud Edition).

 

    • Big team, big problems

      I don't want to manage a Git repository where 50 people (and counting!) commit daily. DVCS are inherently more complicated than centralised systems (think of what identifies a revision, for instance), which certainly allows for more power, but also more headaches in big teams where not everyone has a PhD in Gitology!


However, don't despair. We are not stuck in this situation forever! As you saw earlier this week, we are currently working hard to make the code more modular, and to extract from our big codebase some independent, consistent pieces that can be released separately. You've seen this already with Aikau and Share (but it also happened a while ago with Records Management, and all the integrations).

As we extract smaller chunks, it can make sense to move them over to GitHub, because they will then be more manageable and will have a small team of people responsible for it. The goal here is twofold: externally, to release more frequently to our audience, and internally, to allow more parallel developments to happen and be more agile.

I hope you are as excited as we are about this change - this is much more interesting than just changing our SCM software!

Outcomes