gib: a simpler build system
I have been generally dissatisfied with available build systems (a familiar notion, I am sure). For
divine, the case was especially stark: we used to have a pretty long and complicated toplevel
makefile whose main job was to figure out how to run cmake. Which then generated some more makefiles, or perhaps a
build.ninja. It's all slow, buggy, and doesn't know when files are added or removed (which we hacked around with globs and fake dependencies on version control metadata, but it's awful).
The decision to use cmake was seemingly easy, back in the day:
- everyone used it (and hence it was familiar),
- llvm also used it,
- for various reasons, we shipped and built a copy of cmake even before we imported llvm in 2016.
It wasn't great, but it wasn't terrible either. The thousands upon thousands of lines of cmake scripts that we pulled in from llvm did very little to improve that, though. We still have bugs 6 years later that we never figured out, like cmake not generating build dependencies correctly… sometimes, the build would fail the first time around. Re-running ninja (perhaps a few times) would fix the problem.
What do I want a build system to do? Well, build things, obviously, but I also have some additional preferences. When it comes to build recipes, writing and maintaining them is annoying and I want to spend as little time on that as possible. Hence the first set of requirements:
- compact, hand-written build recipes with simple semantics,
- automatic/dynamic dependencies (what century is this?),
- automatic file lists (everybody seems to be failing at this pretty hard),
- targets depend on the command lines used to build them,
- use separate source and build trees (seriously).
The second set of requirements is about the build process itself – the build system should:
- quickly figure out what needs to be done,
- run jobs in parallel (and keep them coming quickly),
- tell you what's going on,
- make it easy to find and read error messages,
- run (at least in principle) on anything POSIX, without fussing about,
- be small and simple enough to bundle the whole thing with your project.
That last requirement more or less dictates the implementation language: about the only portable choices are C and POSIX sh (perhaps with a dash of perl). Fortunately, C is pretty okay for what gib does. It took a bit less than 2500 lines.
Implicitly, we consider a dependency graph to exist: the leaf nodes are source files (and other inputs), while non-leaf nodes are targets, and are built and rebuilt only when needed (e.g. their inputs changed). The build system may, but does not have to, construct this graph explicitly.
Obviously, build systems face some implementation choices: the paper ‘Build Systems à la Carte’ is a decent summary. The two fundamental choices it describes are:
- the scheduler, which can be one of:
- topological (make, ninja – the entire dependency graph is known upfront; in both cases, this is a convenient fiction, since both systems support a specific type of dynamic dependencies),
- restarting, which supports dynamic dependencies (those that are only discovered after their build is attempted) by aborting the task and restarting it after the discovered dependency was built,
- suspending, which suspends the task that found a missing dependency until that dependency can be completed;
- the rebuilder, with these options given in the paper:
- dirty bit: goals are either up-to-date or out-of-date (dirty), and those out of date are rebuilt,
- verifying traces: compute a hash of each node in the dependency graph (hopefully cheaply, without rebuilding the node) – if the current node hash does not match the one obtained when the node was rebuilt, the node is out of date,
- constructive traces and deep constructive traces: same, but also store the result (useful in distributed build caches, not interesting for us).
gib does not neatly fit into any of the given categories: the scheduler is a compromise between topological and suspending, while the rebuilder is somewhere between dirty bit and verifying traces. In both cases, the latter of the two is better, but also more expensive. So we take the former, which is fast, and add enough of the latter to obtain the better properties when we need them.
The interesting properties contingent on the above choices are:
- support for dynamic dependencies:
- can build recipes (commands) declare dependencies as they run? (this is how make understands the concept),
- is it okay if such dependencies are themselves targets, and hence may not exist while the recipe runs (this is how the paper defines the concept),
- being minimal, as in, does it never rebuild targets that did not need to be rebuilt?
- support for early cutoff: if a recipe realizes that it did not need to do anything, can it communicate this to the build system in a way that saves work?
- being self-tracking: are the build recipes (what the user gives the build system as its input) part of the dependency graph, in a useful fashion (i.e. not ‘user touched makefile! panic and rebuild everything from scratch’)?
gib checks all those boxes, even while maintaining speed comparable to ninja. Absent implementation bugs, it is also correct, in the sense that it will not fail to rebuild outdated targets. There are also a few additional properties that I didn't see mentioned either in the paper, or elsewhere. Onwards!
First, a disclaimer: some, or even all, of the below can be done in other build systems, at least in principle. For instance, make with generated makefiles and recursion can basically do anything you can think of, though it's not going to be very efficient, or very pretty. In general, anything that relies on recursive calls to the build system will suffer from fragmentation of the dependency graph, and can easily lead to incorrectness. This is the gist of the fairly famous paper ‘Recursive Make Considered Harmful’.
So what interesting features does gib offer? The big one is that it can continue to build the dependency graph even as the build is already underway. The trick is really the same as the above: generate gib rule files as your targets and include them into the build graph. But gib can do this without recursion and without restarts. So let's dive into what a gib rule file looks like.
First of all, it is written using a non-embedded domain-specific language, like ninja. The language is regular – about the simplest possible – and rule files are processed in a single top-down pass. The file is structured into stanzas, and normally each stanza creates a single node in the dependency graph, by using commands such as out (naming the target), dep (for listing static dependencies) or cmd (telling gib what command to run to produce the output).
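To make this concrete, here is a sketch of a single stanza. The surface syntax is my guess (this post doesn't show any), but the structure – one stanza creating one node via out, dep and cmd – is as just described:

```
# one stanza = one node in the dependency graph
# (illustrative only; concrete gib syntax may differ)
out obj/main.o
dep src/main.c
cmd cc -c src/main.c -o obj/main.o
```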
Nodes, once created, cannot be changed: about the only thing you can do with them is use them as dependencies of other nodes. This will be important shortly: the point is that a statement somewhere later in the file cannot sneak up on an existing node and remove (or even worse, add) a dependency. Or change the command that it runs. And while gib of course has variables, they cannot be used to circumvent this immutability: the first time a variable is used in a substitution, it is frozen, and it is an error if a later statement tries to change it.
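To illustrate the freezing rule (again with guessed syntax: a hypothetical set command for assignment, $name for substitution):

```
set cflags -O2 -Wall                  # assumed assignment syntax
out obj/a.o
dep src/a.c
cmd cc $cflags -c src/a.c -o obj/a.o  # first substitution: cflags is now frozen
set cflags -O0                        # error: cflags was already used
```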
Therefore, as soon as a node is created, it can be safely built (though we usually don't immediately know whether we want to build it). So consider the bog-standard ‘include’ statement (spelled sub in gib). Unlike with most other build systems, the argument to sub is a node of the dependency graph, not (just) a path. Often, it will be a leaf (i.e. a source file), but it doesn't have to be. And when it's not, magic happens: if we can mention the node, that means it must exist in the graph, and hence we know how to build it. So at this point, gib suspends the rule interpreter, tosses the node onto the job queue and runs queued jobs until the node is ‘clean’ (built or rebuilt as needed). Then it opens the generated rule file, resumes the rule interpreter, and continues to build the dependency graph.
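Put together, a rule file can declare a node whose output is itself a rule file, and then sub it – in the same guessed syntax as above, with a made-up generator script:

```
# a target that produces more gib rules
out rules/generated.gib
dep tools/genrules.sh
cmd sh tools/genrules.sh > rules/generated.gib

# sub names that node: gib builds it first, then reads the result
sub rules/generated.gib
```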
Remember when I mentioned ‘automatic file lists’ about half a blog ago? This is how that feature works, too: gib contains a simple gib.findsrc utility that will walk your source tree and generate gib rules that list all your source files. It also emits a dynamic dependency on each directory that it traverses. If you create (or delete) a source file, the timestamp on its containing directory is going to change, and the affected file list will be rebuilt. Neat!
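I would expect the generated list to consist of nothing fancier than assignments along these lines (the concrete shape of gib.findsrc output is my assumption):

```
# hypothetical gib.findsrc output: a variable listing the sources;
# dynamic dependencies on the traversed directories trigger a rebuild
set sources src/main.c src/graph.c src/job.c
```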
Wait a minute, what do you mean dependencies on directories? Right, gib doesn't really care whether something is a file or a directory. If it has an mtime, it's fair game.
Finally, in addition to sub, there are 2 very simple macro-like constructs:
- simple macros, created with def and used with use, that execute statements from the macro as if they were written at the point of use,
- a for loop that iterates over a list and repeats the statements for each item in the list, binding that item to a local variable (both are sketched below).
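A sketch of both constructs, with the block structure and keywords guessed for illustration:

```
def compile                # define macro 'compile'
  out obj/$name.o
  dep src/$name.c
  cmd cc -c src/$name.c -o obj/$name.o

for name main graph job    # bind $name to each list item in turn
  use compile              # expand the macro statements right here
```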
Unlike sub, these are just shortcuts and don't bring any interesting features. Except that they offer enough abstraction to make writing most of your build rules by hand quite feasible. In turn, that means you can skip writing a build rule generator. Generated code is no fun (especially since you will inevitably need to debug it). Especially considering that for divine, the generated build.ninja is 42 megabytes. The hand-written gib rules fit within 30 kilobytes. The bulk of that difference is down to the humble for loop. In fact, the gib rules are only about 1000 lines, compared to 10000 lines of cmake scripts that were used to generate those 42M of build.ninja. Neat! Obviously, that number will grow for gib as the build system encounters new machines and new users. But I am optimistic that it will not grow too much.
Finally, what about build configuration – the part of the build that is usually done directly by cmake (or autoconf, or meson nowadays): finding compilers and other tools, libraries, dependencies, figuring out include and library paths, and so on.
I'd say that the vast majority of this stuff is basically obsolete. Way back when, autoconf made a huge deal out of testing your system to figure out where to find everything. Since crummy proprietary platforms were common, this everything included a ton of bog-standard POSIX or ISO C functionality. Which is all great, but is your program really going to work on a system that doesn't have string.h? Sure, thanks to autoconf, you can write an #ifdef that you are never going to be able to test. A few years later, someone finally comes and tries to build your software on some crippled half-unix. Your #ifdef is going to see its first use! Too bad its contents are broken or hopelessly out of date. So much for all that portability.
Anyway, along came cmake and tossed out the decades of crufty checks that nobody cares about, but it kept the same mindset. The sprawling macros shipped with cmake will try very hard to work on everything, up to and including your toaster, so that your own code can instead fail spectacularly. Because obviously, you (the author) never actually tried building your software on that brand of toaster. Or, if you did, that was 2 years ago, and things have bit-rotted since then (honestly, who puts 20 different toaster brands in a build farm?).
I guess the entire moral of this story is that software will only build and work on the platforms that the developers actively test. Everything else is going to require some porting effort. Of course, the amount of work will depend on how well the author did their homework, but the act of yak-shaving itself is going to be there, 9 times out of 10. Small yaks are better than big yaks, but they are still yaks. Eww.
So perhaps you could argue that the role of cmake is to minimize the amount of yak that you will have to shave in order to get that piece of software running. In which case, it sucks. Often the only thing you actually need is to pass a flag or two to cc. So in a normal world, you would tack those flags onto CFLAGS or whatever and go on about your life. Not so with cmake. Of course you remember that the variable is called CMAKE_C_COMPILER_FLAGS (because why not), so you toss your -I/opt/libmagic/include in there and everything should be fine, yes? Except no, because cmake will insist that libmagic cannot be found. Does it want a pkg-config file? Perhaps you need to set PKG_CONFIG_PATH. Or is it -DMagic_DIR=/opt/libmagic? Nobody knows. I mean sure, it's documented, but the yak has grown considerably.
I don't mean to beat a dead yak here, but also consider this: a typical cmake project has dozens upon dozens of user-configurable variables. You can look at them with ccmake! This is such a big problem that ccmake will, by default, hide most of those variables, because they are ‘advanced’. Of course, the odds are barely better than 1:1 that you need to tweak one of the advanced vars.
A case in point: divine has 12 pages of normal variables, and 10 more pages of the ‘advanced’ stuff. With gib, we are currently at less than 1 page of those. You look at the top of gib/main, and if you need to change something (where to find the compiler, flags to give to the compiler, stuff like that), just put that var into gib/local with your new value and that's it. Conveniently, you also won't have to redo those tweaks 2 weeks later, just because you wiped your build directory. Of course, if you edit gib/main, that's fine too, but version control might get a little upset down the line.
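A local override might then look like this (the variable names and the set syntax are, once more, my guesses):

```
# gib/local: per-checkout overrides, kept out of version control;
# the real variable names live at the top of gib/main
set cc /opt/llvm/bin/clang
set cflags -O1 -g
```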
Of course, sometimes there is a legitimate need to probe for something automatically. Fortunately, this is very easy with gib: write a program that does the probing (shell or perl usually work great) and writes the required gib rules (probably just variable assignments) into a file. Then sub that file. If the script changes, gib will of course rebuild the rules, and anything that depends on them. If you want to go all fancy, your script can even emit dynamic dependencies on the files it probed, so if those files change, it will re-run the check. Basically the same story as with file lists.
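The whole arrangement, in the same guessed syntax as the earlier sketches, with a made-up probe script:

```
# the probe: run the check, capture its output as rules
out rules/config.gib
dep tools/probe.sh               # changing the script re-runs the probe
cmd sh tools/probe.sh > rules/config.gib
sub rules/config.gib             # build it, then read the assignments

# rules/config.gib might then contain e.g.:
#   set magic_cflags -I/opt/libmagic/include
```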
All that said, cmake has one crucial advantage: many people know it, and also know where the yaks live. So you save time hunting down the right yak. Confronted with gib, chances are you will have no idea where to start, so there's definitely some extra effort in there. But even if user comfort ends up being a tossup, the improvement in my life was well worth it.
Since this post is a bit light on detail, I will perhaps write a more in-depth followup in a few weeks. Or perhaps just some old-fashioned documentation. You didn't expect the thing to be documented, did you?