gib: a simpler build system
I have been generally dissatisfied with available build systems (a familiar notion, I am sure). For
divine, the case was especially stark: we used to have a pretty long and complicated toplevel
makefile whose main job was to figure out how to run cmake. Which then generated some more makefiles, or perhaps a
build.ninja. It's all slow, buggy, and doesn't know when files are added or removed (which we hacked around with globs and fake dependencies on version control metadata, but it's awful).
The decision to use cmake was seemingly easy, back in the day:
- everyone used it (and hence it was familiar),
- llvm also used it,
- for various reasons, we shipped and built a copy of cmake even before we imported llvm in 2016.
It wasn't great, but it wasn't terrible either. The thousands upon thousands of lines of cmake scripts that we pulled in from llvm did very little to improve that, though. We still have bugs 6 years later that we never figured out, like cmake not generating build dependencies correctly… sometimes, the build would fail the first time around. Re-running ninja (perhaps a few times) would fix the problem.
What do I want a build system to do? Well, build things, obviously, but I also have some additional preferences. When it comes to build recipes, writing and maintaining them is annoying and I want to spend as little time on that as possible. Hence the first set of requirements:
- compact, hand-written build recipes with simple semantics,
- automatic/dynamic dependencies (what century is this?),
- automatic file lists (everybody seems to be failing at this pretty hard),
- targets depend on the command lines used to build them,
- use separate source and build trees (seriously).
The second set of requirements is about the build process itself – the build system should:
- quickly figure out what needs to be done,
- run jobs in parallel (and keep them coming quickly),
- tell you what's going on,
- make it easy to find and read error messages,
- run (at least in principle) on anything POSIX, without fussing about,
- be small and simple enough to bundle the whole thing with your project.
That last requirement more or less dictates the implementation language: about the only portable choices are C and POSIX sh (perhaps with a dash of perl). Fortunately, C is pretty okay for what gib does. It took a bit less than 2500 lines.
Implicitly, we consider a dependency graph to exist: the leaf nodes are source files (and other inputs), while non-leaf nodes are targets, and are built and rebuilt only when needed (e.g. their inputs changed). The build system may, but does not have to, construct this graph explicitly.
Obviously, build systems face some implementation choices: the paper ‘Build Systems à la Carte’ is a decent summary. The two fundamental choices it describes are:
- the scheduler, which can be one of:
- topological (make, ninja – the entire dependency graph is known upfront; in both cases, this is a convenient fiction, since both systems support a specific type of dynamic dependencies),
- restarting, which supports dynamic dependencies (those that are only discovered after their build is attempted) by aborting the task and restarting it after the discovered dependency was built,
- suspending, which suspends the task that found a missing dependency until that dependency can be completed;
- the rebuilder, with these options given in the paper:
- dirty bit: goals are either up-to-date or out-of-date (dirty), and those out of date are rebuilt,
- verifying traces: compute a hash of each node in the dependency graph (hopefully cheaply, without rebuilding the node) – if the current node hash does not match the one obtained when the node was rebuilt, the node is out of date,
- constructive traces and deep constructive traces: same, but also store the result (useful in distributed build caches, not interesting for us).
gib does not neatly fit into any of the given categories: the scheduler is a compromise between topological and suspending, while the rebuilder is somewhere between dirty bit and verifying traces. In both cases, the latter of the two is better, but also more expensive. So we take the former, which is fast, and add enough of the latter to obtain the better properties when we need them.
The interesting properties contingent on the above choices are:
- support for dynamic dependencies:
- can build recipes (commands) declare dependencies as they run? (this is how make understands the concept),
- is it okay if such dependencies are themselves targets, and hence may not exist while the recipe runs (this is how the paper defines the concept),
- being minimal, as in, does it never rebuild targets that did not need to be rebuilt?
- support for early cutoff: if a recipe realizes that it did not need to do anything, can it communicate this to the build system in a way that saves work?
- being self-tracking: are the build recipes (what the user gives the build system as its input) part of the dependency graph, in a useful fashion (i.e. not ‘user touched makefile! panic and rebuild everything from scratch’)?
gib checks all those boxes, even while maintaining speed comparable to ninja. Absent implementation bugs, it is also correct, in the sense that it will not fail to rebuild outdated targets. There are also a few additional properties that I didn't see mentioned either in the paper, or elsewhere. Onwards!
First, a disclaimer: some, or even all, of the below can be done in other build systems, at least in principle. For instance, make with generated makefiles and recursion can basically do anything you can think of, though it's not going to be very efficient, or very pretty. In general, anything that relies on recursive calls to the build system will suffer from fragmentation of the dependency graph, and can easily lead to incorrectness. This is the gist of the fairly famous paper ‘Recursive Make Considered Harmful’.
So what interesting features does gib offer? The big one is that it can continue to build the dependency graph even as the build is already underway. The trick is really the same as the above: generate gib rule files as your targets and include them into the build graph. But gib can do this without recursion and without restarts. So let's dive into what a gib rule file looks like.
First of all, it is written using a non-embedded domain-specific language, like ninja. The language is regular – about the simplest possible – and rule files are processed in a single top-down pass. The file is structured into stanzas, and normally each stanza creates a single node in the dependency graph, by using commands such as out (naming the target), dep (for listing static dependencies) or cmd (telling gib what command to run to produce the output).
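To make this concrete, here is a sketch of a single stanza. The surface syntax is my guess (this post doesn't show any), but the structure – one stanza creating one node via out, dep and cmd – is as just described:

```
# one stanza = one node in the dependency graph
# (illustrative only; concrete gib syntax may differ)
out obj/main.o
dep src/main.c
cmd cc -c src/main.c -o obj/main.o
```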
Nodes, once created, cannot be changed: about the only thing you can do with them is use them as dependencies of other nodes. This will be important shortly: the point is that a statement somewhere later in the file cannot sneak up on an existing node and remove (or even worse, add) a dependency. Or change the command that it runs. And while gib of course has variables, they cannot be used to circumvent this immutability: the first time a variable is used in a substitution, it is frozen, and it is an error if a later statement tries to change it.
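To illustrate the freezing rule (again with guessed syntax: a hypothetical set command for assignment, $name for substitution):

```
set cflags -O2 -Wall                  # assumed assignment syntax
out obj/a.o
dep src/a.c
cmd cc $cflags -c src/a.c -o obj/a.o  # first substitution: cflags is now frozen
set cflags -O0                        # error: cflags was already used
```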
Therefore, as soon as a node is created, it can be safely built (though we usually don't immediately know whether we want to build it). So consider the bog-standard ‘include’ statement (spelled sub in gib). Unlike with most other build systems, the argument to sub is a node of the dependency graph, not (just) a path. Often, it will be a leaf (i.e. a source file), but it doesn't have to be. And when it's not, magic happens: if we can mention the node, that means it must exist in the graph, and hence we know how to build it. So at this point, gib suspends the rule interpreter, tosses the node onto the job queue and runs queued jobs until the node is ‘clean’ (built or rebuilt as needed). Then it opens the generated rule file, resumes the rule interpreter, and continues to build the dependency graph.
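Put together, a rule file can declare a node whose output is itself a rule file, and then sub it – in the same guessed syntax as above, with a made-up generator script:

```
# a target that produces more gib rules
out rules/generated.gib
dep tools/genrules.sh
cmd sh tools/genrules.sh > rules/generated.gib

# sub names that node: gib builds it first, then reads the result
sub rules/generated.gib
```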
Remember when I mentioned ‘automatic file lists’ about half a blog ago? This is how that feature works, too: gib contains a simple gib.findsrc utility that will walk your source tree and generate gib rules that list all your source files. It also emits a dynamic dependency on each directory that it traverses. If you create (or delete) a source file, the timestamp on its containing directory is going to change, and the affected file list will be rebuilt. Neat!
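I would expect the generated list to consist of nothing fancier than assignments along these lines (the concrete shape of gib.findsrc output is my assumption):

```
# hypothetical gib.findsrc output: a variable listing the sources;
# dynamic dependencies on the traversed directories trigger a rebuild
set sources src/main.c src/graph.c src/job.c
```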
Wait a minute, what do you mean dependencies on directories? Right, gib doesn't really care whether something is a file or a directory. If it has an mtime, it's fair game.
Finally, in addition to sub, there are 2 very simple macro-like constructs:
- simple macros, created with def and used with use, that execute statements from the macro as if they were written at the point of use,
- a for loop that iterates over a list and repeats the statements for each item in the list, binding that item to a local variable (both are sketched below).
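A sketch of both constructs, with the block structure and keywords guessed for illustration:

```
def compile                # define macro 'compile'
  out obj/$name.o
  dep src/$name.c
  cmd cc -c src/$name.c -o obj/$name.o

for name main graph job    # bind $name to each list item in turn
  use compile              # expand the macro statements right here
```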
Unlike sub, these are just shortcuts and don't bring any interesting features. Except that they offer enough abstraction to make writing most of your build rules by hand quite feasible. In turn, that means you can skip writing a build rule generator. Generated code is no fun (especially since you will inevitably need to debug it). Especially considering that for divine, the generated build.ninja is 42 megabytes. The hand-written gib rules fit within 30 kilobytes. The bulk of that difference is down to the humble for loop. In fact, the gib rules are only about 1000 lines, compared to 10000 lines of cmake scripts that were used to generate those 42M of build.ninja. Neat! Obviously, that number will grow for gib as the build system encounters new machines and new users. But I am optimistic that it will not grow too much.
Finally, what about build configuration – the part of the build that is usually done directly by cmake (or autoconf, or meson nowadays): finding compilers and other tools, libraries, dependencies, figuring out include and library paths, and so on.
I'd say that the vast majority of this stuff is basically obsolete. Way back when, autoconf made a huge deal out of testing your system to figure out where to find everything. Since crummy proprietary platforms were common, this everything included a ton of bog-standard POSIX or ISO C functionality. Which is all great, but is your program really going to work on a system that doesn't have string.h? Sure, thanks to autoconf, you can write an #ifdef that you are never going to be able to test. A few years later, someone finally comes and tries to build your software on some crippled half-unix. Your #ifdef is going to see its first use! Too bad its contents are broken or hopelessly out of date. So much for all that portability.
Anyway, along came cmake and tossed out the decades of crufty checks that nobody cares about, but it kept the same mindset. The sprawling macros shipped with cmake will try very hard to work on everything, up to and including your toaster, so that your own code can instead fail spectacularly. Because obviously, you (the author) never actually tried building your software on that brand of toaster. Or, if you did, that was 2 years ago, and things have bit-rotted since then (honestly, who puts 20 different toaster brands in a build farm?).
I guess the entire moral of this story is that software will only build and work on the platforms that the developers actively test. Everything else is going to require some porting effort. Of course, the amount of work will depend on how well the author did their homework, but the act of yak-shaving itself is going to be there, 9 times out of 10. Small yaks are better than big yaks, but they are still yaks. Eww.
So perhaps you could argue that the role of cmake is to minimize the amount of yak that you will have to shave in order to get that piece of software running. In which case, it sucks. Often the only thing you actually need is to pass a flag or two to cc. So in a normal world, you would tack those flags onto CFLAGS or whatever and go on about your life. Not so with cmake. Of course you remember that the variable is called CMAKE_C_COMPILER_FLAGS (because why not), so you toss your -I/opt/libmagic/include in there and everything should be fine, yes? Except no, because cmake will insist that libmagic cannot be found. Does it want a pkg-config file? Perhaps you need to set PKG_CONFIG_PATH. Or is it -DMagic_DIR=/opt/libmagic? Nobody knows. I mean sure, it's documented, but the yak has grown considerably.
I don't mean to beat a dead yak here, but also consider this: a typical cmake project has dozens upon dozens of user-configurable variables. You can look at them with ccmake! This is such a big problem that ccmake will, by default, hide most of those variables, because they are ‘advanced’. Of course, the odds are barely better than 1:1 that you need to tweak one of the advanced vars.
A case in point: divine has 12 pages of normal variables, and 10 more pages of the ‘advanced’ stuff. With gib, we are currently at less than 1 page of those. You look at the top of gib/main, and if you need to change something (where to find the compiler, flags to give to the compiler, stuff like that), just put that var into gib/local with your new value and that's it. Conveniently, you also won't have to redo those tweaks 2 weeks later, just because you wiped your build directory. Of course, if you edit gib/main, that's fine too, but version control might get a little upset down the line.
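A local override might then look like this (the variable names and the set syntax are, once more, my guesses):

```
# gib/local: per-checkout overrides, kept out of version control;
# the real variable names live at the top of gib/main
set cc /opt/llvm/bin/clang
set cflags -O1 -g
```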
Of course, sometimes there is a legitimate need to probe for something automatically. Fortunately, this is very easy with gib: write a program that does the probing (shell or perl usually work great) and writes the required gib rules (probably just variable assignments) into a file. Then sub that file. If the script changes, gib will of course rebuild the rules, and anything that depends on them. If you want to go all fancy, your script can even emit dynamic dependencies on the files it probed, so if those files change, it will re-run the check. Basically the same story as with file lists.
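The whole arrangement, in the same guessed syntax as the earlier sketches, with a made-up probe script:

```
# the probe: run the check, capture its output as rules
out rules/config.gib
dep tools/probe.sh               # changing the script re-runs the probe
cmd sh tools/probe.sh > rules/config.gib
sub rules/config.gib             # build it, then read the assignments

# rules/config.gib might then contain e.g.:
#   set magic_cflags -I/opt/libmagic/include
```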
All that said, cmake has one crucial advantage: many people know it, and also know where the yaks live. So you save time hunting down the right yak. Confronted with gib, chances are you will have no idea where to start, so there's definitely some extra effort in there. But even if user comfort ends up being a tossup, the improvement in my life was well worth it.
Since this post is a bit light on detail, I will perhaps write a more in-depth followup in a few weeks. Or perhaps just some old-fashioned documentation. You didn't expect the thing to be documented, did you?