Long Running Feature Branches

I have now worked on two scientific code bases that are written in C++. One is for classical physics, the other one does quantum field theory. But that is not the point. The point is that there is a lot of code and there are lots of scientists working on that code.

Scientists are on a different schedule than full-time employees of a software company. The software is just a means to an end, not the actual product. The quality of the software does not really matter, only the scientific results that are obtained with it. The problem is that the software needs to get more complex over the decades because the scientific problems also get more complex. The low-hanging fruit has usually been picked already. In order to get high-performance code, one has to develop and improve it for a decade or more. New architectures need to be supported and new features implemented.

One has to think about the software in the long term, whereas the scientific results are just mid-term. Sadly, the software is often thought of in shorter terms, leading to a build-up of low-quality code and technical debt.

The one aspect that makes this frustrating at times is long running feature branches. Scientists do not work nine-to-five on the software. Their schedule includes various other things like teaching, other research, writing grant applications, office hours and supervising students. There might be weeks with no lectures where one has more time to develop code. Or there is a hackathon where scientists get together and work on their codes for a week.

Also students (like myself) start to work on some code and add a new feature. They often do so by forking the upstream repository on GitHub and then working on the feature in their fork. A master's thesis takes several months, sometimes a full year. They work on the code and never pull anything from the upstream repository. In the worst case they do not even tell the upstream authors about the changes that they have planned. So after a year they have a ton of new features implemented, and all of these are sitting in their fork. The code has produced some results for them, the thesis is published, they get their degree. But the code just sits in the fork on GitHub and nobody else can actually use the features.

In order to make use of the new features, they have to be merged into the mainline of the upstream repository. That requires communication with the upstream authors and then a merge. This is the point where it becomes nasty. The upstream authors usually have not sat idle for a year; there have been changes to the code. Sometimes there was even some refactoring, meaning that the upstream code and the forked code are no longer compatible. The merge will produce a lot of conflicts that have to be resolved manually. Beyond that, more things might happen. If a function was removed and the fork still needs it, the function has to be restored. If some function now works differently, the fork needs to be adapted. In the course of a full year, a lot of such things can happen.

During my master's thesis, I tried to avoid having my code rot somewhere without landing it in the upstream repository. So I talked to the upstream authors and first landed the features that the previous two master's theses had added. Then I did some refactoring and removed a lot of duplicated code. I landed the features that I cared about and then assumed that everything of relevance was within this repository. There is no way of knowing whether somebody is working on a fork somewhere in the dark, so this is just a risk they take by not keeping in sync with upstream.

The build system was converted from GNU Autotools to CMake, the code generator was fused into the main project and the build process streamlined. A few new features were introduced which made some new function parameters necessary. And since I could not take the inconsistent mess of tabs (8 spaces), tabs (2 spaces) and actual spaces any more, I ran clang-format on the whole codebase. The development branch was changing rapidly, but that was not a problem because the feature branches got merged within a couple of days. Nothing got so far out of sync that it could not be salvaged. I also set up automated building and testing on Travis CI such that one could merge features quickly and with confidence. It was some sort of continuous integration where small chunks of changes get into the mainline quickly.

Then somebody asked for a four month old feature branch to be merged into the mainline. It turned out to be a nightmare to merge. I actually ended up not merging it but just re-implementing the code on top of the development branch. The branch contained changes to methods that I had removed in a deduplication effort. Git was not able to cope with that well; it just added the duplicated code back in. Also the API of various kernel functions had changed, and the branch did not reflect that either. So the only feasible way to get these performance improvements in was to understand the changes and perform them again where needed. And even worse, the improvements only apply to the old features that existed four months ago. That was before I landed the work of two master's theses and added my own stack of features. Therefore the feature can be included in the mainline now, but it only benefits half of the codebase.

Of course, since the features from the master's theses were unknown to the upstream authors four months ago, there was nothing they could have done about those. The people writing the theses should have kept up with the upstream developments and incorporated them into their code. But the upstream authors should also have merged the performance improvements early on, such that the mainline had them directly after the hackathon from which they originated. The issue is that these improvements do not help in every possible case. Even so, one should have merged them and added a configuration flag such that one could select which variant one would like to have. This way both variants would be in the code and not in some hidden feature branch.
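To make the idea of such a configuration flag a bit more concrete, here is a minimal sketch of how two variants of a kernel could live side by side in the mainline. It is not taken from either of the code bases mentioned here: `KernelVariant`, `scale_reference`, `scale_optimized` and `scale` are made-up names, and the "optimized" version is only a stand-in for a real tuned kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flag. In a real project its value might come from a CMake
// option or an input file; here it is just a function argument.
enum class KernelVariant { Reference, Optimized };

// The existing, straightforward implementation.
inline void scale_reference(std::vector<double> const &in, double factor,
                            std::vector<double> &out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = factor * in[i];
  }
}

// Stand-in for the tuned variant from the feature branch: the same
// operation, manually unrolled by two.
inline void scale_optimized(std::vector<double> const &in, double factor,
                            std::vector<double> &out) {
  assert(in.size() == out.size());
  std::size_t const n = in.size();
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    out[i] = factor * in[i];
    out[i + 1] = factor * in[i + 1];
  }
  if (i < n) {
    out[i] = factor * in[i];  // Leftover element for odd sizes.
  }
}

// Single entry point: callers choose the variant, both stay in the mainline.
inline void scale(KernelVariant variant, std::vector<double> const &in,
                  double factor, std::vector<double> &out) {
  switch (variant) {
    case KernelVariant::Reference:
      scale_reference(in, factor, out);
      break;
    case KernelVariant::Optimized:
      scale_optimized(in, factor, out);
      break;
  }
}
```

A caller would write something like `scale(KernelVariant::Optimized, in, 2.0, out)`. Because both variants are compiled and can be tested against each other on every commit, a later refactoring has to keep both of them working, which is exactly what a hidden feature branch cannot guarantee.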

Actually this made another thing apparent in the codebase: Some things that should be separate and interchangeable were tightly coupled. Going forward, one would have to refactor this and separate the performance logic from the scientific application. But refactoring is a major change in the code, which the developers fear because of upcoming merge conflicts. Martin Fowler summarizes this nicely:

This fear of big merges also acts as a deterrent to refactoring. Keeping code clean is constant effort, to do it well it requires everyone to keep an eye out for cruft and fix it wherever they see it. However this kind of refactoring on a feature branch is awkward because it makes the Big Scary Merge much worse. The result we see is that teams using feature branches shy away from refactoring which leads to uglier code bases.

There are two opposing ways to make changes in code:

  • The boy scouts way is leaving a code file cleaner than you found it. This means that you improve on everything that you see while you work on something. The code will continuously improve.

  • The spec ops way is just making the changes and not touching anything else. That makes merges easier.

Long running feature branches incentivise the spec ops way of working. Only minimal changes are made, nothing is cleaned up. Over time the code becomes brittle, and clean-up is frowned upon. In the long term, this is going to have a significant cost: either there will be horrible merge conflicts or everything will be written from scratch yet another time.

I would really like to avoid long running feature branches. I would like to have not only continuous testing but also continuous integration. Feature branches should only be used until some feature compiles and all the tests succeed. Then the feature should be merged into the mainline. It does not need to be activated yet, but the code should all be there. This would avoid frustrating merges and also allow for global refactoring of the code. It would make it much easier to maintain the code for the next decade, add new features and support new architectures.

::: {.seealso}
- https://blog.newrelic.com/2012/11/14/long-running-branches-considered-harmful/
:::