Wednesday, September 13, 2017

Faster builds for C++ projects

AAA titles in the video game industry really push boundaries in tech. Game engines and content requirements lead to massive projects -- almost exclusively in C++. In spite of ever faster mass storage and CPU speeds, build times get worse with each generation of titles.

C++ programmers at times look enviously to Python, JavaScript and other languages for their comparatively fast iteration loops, but know that leaving the speed, power and type safety of a compiled systems language like C++ is a train wreck waiting to happen. Sure, there are a few newcomers on the scene, but none compare to a language that has decades of stability and game industry support for blazing fast physics, rendering and game logic.

Yet, we end up in this self-defeating cycle. Production teams demand ever more content. Iteration times nose-dive, making it harder to turn around new features quickly -- build times rise above the 20 or 40 minute mark (and beyond). Producers then want to hire more programmers to get more out of their tech teams. This self-defeating act of desperation looses dozens or more engineers on the code base, adding more code, more bugs and even worse iteration times.

What is a tech team to do? Inevitably more coders are going to pour more code into the project and I have yet to meet a producer that understands that "adding more people to the project just makes the project later." Well, the next best approach is to speed up build times.

There are some powerful build acceleration tools at our disposal as programmers. Some will distribute builds across the developer's network, ensuring no single engineer has to spend all of his workstation's resources on a task that is too large. DistCC and Incredibuild do a great job distributing computation workloads.

Others will cache workloads. CCache has long been available on Linux/UNIX systems. Unfortunately, that mostly helps server and back-end platforms -- most game and engine programmers still need to deal with Windows as the preferred PC gaming platform, and Microsoft compiler tools dominate console game development. Mobile platforms are locked in with XCode, which does not support the use of CCache (though it can be crow-barred in by doing some unsavory things to an OSX system).

Around the turn of the century, I liked to use the ccache+distcc combination for my Linux platform MMO backends. A system would look for a cache hit and return immediately. On a miss, it would distribute each compilation pass across the network. It was beautiful. It was configuration hell. It was well worth all of the extra effort to administer and maintain.
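For the curious, that lookup-then-distribute flow can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how ccache or distcc are actually implemented: `cache_key`, `build_object` and the `compile_remotely` callback are made-up names, and a plain dict stands in for the on-disk cache.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def cache_key(preprocessed_source: str, compiler_args: List[str]) -> str:
    """Hash the inputs that can change the resulting object file."""
    h = hashlib.sha256()
    for arg in compiler_args:
        h.update(arg.encode("utf-8"))
        h.update(b"\0")  # separator, so ["-O", "2"] and ["-O2"] hash differently
    h.update(preprocessed_source.encode("utf-8"))
    return h.hexdigest()


def build_object(
    preprocessed_source: str,
    compiler_args: List[str],
    cache: Dict[str, bytes],
    compile_remotely: Callable[[str, List[str]], bytes],
) -> Tuple[bytes, bool]:
    """Return (object_bytes, was_cache_hit)."""
    key = cache_key(preprocessed_source, compiler_args)
    if key in cache:
        # Cache hit: return immediately, no compiler is run at all.
        return cache[key], True
    # Cache miss: hand the work to the (hypothetical) distributed backend,
    # then remember the result for next time.
    obj = compile_remotely(preprocessed_source, compiler_args)
    cache[key] = obj
    return obj, False
```

The point of passing `compile_remotely` in as a callback is that the cache does not care who does the compiling; the same lookup works whether a miss is handled locally or farmed out across the network.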

So, let's say you are a AAA game developer (or otherwise locked in to using Microsoft's tools with C++). What can you do?

Distributed Compilation

Incredibuild and FastBuild both have distributed compilation features. Incredibuild focuses on just working out of the box, but is a commercial offering. FastBuild is an up-and-coming OSS project that is free, but requires more from developers to support.



Incredibuild is a no-brainer for any team larger than a few programmers that also has to develop and maintain a complex C++ code base. The time savings are a huge pay-off (from a business point of view) and provide better quality of life (from the perspective of a human being having to write code).

Configuring and maintaining it is straightforward. Install a coordinator somewhere. Install a bunch of agents on the network, give them the IP address of the coordinator, and you are off to the races.

Incredibuild has some pretty amazing tech behind their work distribution solution. Unlike distcc, which required each node to have the right compiler installed, Incredibuild just shuttles your version of CL.EXE, LIB.EXE or other tools to an agent. The agent intercepts many of the system calls that access files, so to the copy of CL.EXE running on the agent it LOOKS like it is actually running on the initiating node.

Figure: Incredibuild's Build Monitor showing progress distributed across several machines.

It "just works" for just about every scenario.

I've used Incredibuild for at least 10 years (maybe longer) and insist on having it anywhere I work where it takes more than 20 minutes to get a decent build after switching branches or syncing someone's awful global header changes.

My experience with FastBuild is only tangential, but it is written by game developers at Ubisoft -- one of the few big publishers that takes engineering problems of scale seriously enough to encourage smart guys to take time to develop really cool solutions. 

FastBuild is a bit more invasive, requiring developers to adjust their build systems to accommodate the tool. Like ccache and CLCache, this can be employed to great effect, really reducing build times, especially for git-like workflows that have devs bouncing between feature development on their own branch and back to bug fixes on another.

Figure: FastBuild's integrated build monitor, very similar to Incredibuild's visualizer.

FastBuild provides some distributed compilation, like distcc. It also tries to do some caching, like ccache. It appears to provide good performance for complex projects, with the cost being the requirement to re-tool projects and maintain a completely new build description language.

What about Caching?

For the longest time, if you were on a Microsoft compiler suite, you could not get a compiler cache. There *are* some options now, however! CLCache comes to mind on the Free Software side, as well as the commercial offering Stashed, which is designed to work with distributed build systems like Incredibuild or FastBuild. FastBuild also sports some features of compiler caches (see above).

CLCache

CLCache aims to be a free ccache-like implementation for Microsoft C/C++ compilers. To work most generally (without having to rewrite a project's build system), the programmer compiles the CLCache Python script into an executable, renames Microsoft's CL.EXE to another file name that CLCache knows to invoke on a cache miss, and puts the CLCache executable in its place. In principle it works a lot like ccache. In practice, it also comes with a lot of the headaches that are involved with trying to get XCode/clang on Mac OSX working with ccache.

On small enough teams, the administrative headaches are easily worth the trouble. It does pretty much what it says on the tin. Best of all, it is free and open-source, so will likely improve with time.

There are some things it cannot do (yet) that might be a deal breaker for many programmers. These are the same reasons it has taken almost 20 years just to get a minimalist ccache equivalent for MSVC.

Working compiler caches that support CL.EXE are hard to make. Microsoft does some really wacky things at build time with its compiler driver (CL.EXE). Managing the program database is one that has so far been an obstacle for compiler cache developers.

So, using /Zi or /ZI with CLCache is still somewhere over the horizon. Projects are out of luck if they use PDBs for debugging, which are especially nice to keep around so you can ship release versions of a game while collecting minidumps from the top crashes in the wild and debug them. Unlike older COFF debug formats, PDBs move most of the debug symbols to a separate file that can be excluded from a shipping build, but still be used to analyze crashes.
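To make that limitation concrete, here is a rough sketch of the kind of guard a cache wrapper might apply before even attempting a lookup. It is my own illustration, not CLCache's actual code: `PDB_FLAGS` and `is_cacheable` are hypothetical names, and a real tool would have to consider many more options than these.

```python
from typing import List

# Options that ask cl.exe to write debug info into a program database (PDB).
# cl.exe accepts both "/" and "-" as the option prefix.
PDB_FLAGS = {"/Zi", "/ZI", "-Zi", "-ZI"}


def is_cacheable(compiler_args: List[str]) -> bool:
    """Return False when the command line requests PDB output.

    In this sketch, such invocations bypass the cache and go straight
    to the real compiler.
    """
    return not any(arg in PDB_FLAGS for arg in compiler_args)
```

A command line like `["/c", "/O2", "main.cpp"]` would pass this check, while adding `/Zi` would send the compile straight through to CL.EXE uncached.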

Stashed

Stashed is a recently available, commercial compiler cache. It works on similar principles to ccache, but with more smarts. It does not require modifying an existing build system or really any configuration at all to work. The setup process is to download a 10MB installer, run it, verify an email address and just get back to work. It starts with a free one-month trial -- no credit card required.

One of the nice things about Stashed is that it works with PDBs (/Zi) just fine. What is super stellar is that it also plays very nicely with FastBuild and Incredibuild. It is, so far, the only game in town that scratches all three of these itches:

  1. no crazy configuration or build system changes needed
  2. accelerates even when PDBs are being generated
  3. pairs up nicely with Incredibuild or FastBuild to distribute computation when a cache miss happens, and REALLY speeds them up when the cache is hot (sometimes by 5-10x faster than either of them alone!)


Here's a nice video showing Stashed + Incredibuild building an UnrealEngine project, as well as Stashed running standalone to build LLVM:


Conclusion?

Commercial projects with an existing code base should be using Incredibuild + Stashed, hands-down. No changes required to project files, no configuration overhead, can be easily installed and administered on lots of machines.

Small teams just starting and hobby/home users should check out all of the freebies available. ccache can be made to work on OSX and Linux. CLCache, with some effort, can be made to work seamlessly with MSVC.


Full disclosure:
I am a Stashed developer and have been using Incredibuild for a very long time, so my experiences may be biased. Please comment with corrections!