Wednesday, May 4, 2016

Supporting C in a build system

In a previous post, I showed that the standard build rules for C code are unreliable. Let me describe two ways to do better.

In the interest of brevity, I will describe the build rules using a toy build engine called Blcache (short for "build cache"). I initially tried to write this post using standard tools like Make, Ninja, Bazel, and Nix, but they all have one limitation or another that distracts from the main point of this post.

Example problem

Here is an example problem this post will work against. There is a test.c file, and it includes two separate header files:

// File test.c
#include <stdio.h>
#include "syslimits.h"
#include "messages.h"

int main() {
  printf("%s: %d\n", MSG_LIMIT_THREADS, LIMIT_THREADS);
}

// File localhdrs/messages.h
#define MSG_LIMIT_THREADS "Limit on threads:"

// File hdrs/syslimits.h
#define LIMIT_THREADS 10

After compiling this code, a third header file is added as follows:

// File localhdrs/syslimits.h
#define LIMIT_THREADS 500

The challenge is for the addition of this header file to trigger a rebuild, while still making the build as incremental as possible.

Method 1: Compile entire components

The simplest way to get correct C compiles is to compile entire components at a time, rather than to set up build rules to compile individual C files. I've posted before on this strategy for Java, and it applies equally well to C.

This approach seems to be unusual, but my current feel is that it should work well in practice. It seems to me that when you are actively working on a given component, you should almost always use an IDE or other specialized tool for compiling that given component. The build system should therefore not concern itself with fine-grained incremental rebuilds of individual C files. Rebuilds of whole components--executables, libraries, and shared libraries--should be plenty, and even when using a system like Gyp, there are advantages to having the low-level build graph be simple enough to read through and debug by hand.

Using such an approach, you would set up a single build rule that goes all the way from C and H files to the output. Here it is in JSON syntax, using Blcache:

{
    "environ": [
        "PATH"
    ],
    "rules": [
        {
            "commands": [
                "gcc -Ilocalhdrs -Ihdrs  -o test test.c"
            ], 
            "inputs": [
                "test.c", 
                "localhdrs",
                "hdrs"
            ], 
            "name": "c/test", 
            "outputs": [
                "test"
            ]
        }
    ]
}

The "environ" part of this build file declares which environment variables are passed through to the underlying commands. In this case, only PATH is passed through.

There is just one build rule, and it's named c/test in this example. The inputs include the one C file (test.c), as well as two entire directories of header files (localhdrs and hdrs). The build command for this rule is very simple: it invokes gcc with all of the supplied input files, and has it build the final executable directly.

With the build rules set up like this, any change to any of the declared inputs will cause a rebuild to happen. For example, here is what happens in an initial build of the tool:

$ blcache c/test
Started building c/test.
Output from building c/test:
gcc -Ilocalhdrs -Ihdrs  -o test test.c

$ ./target/c/test
Limit on threads: 10

After adding syslimits.h to the localhdrs directory, the entire component gets rebuilt, because the localhdrs input is considered to have changed:

$ blcache c/test
Started building c/test.
Output from building c/test:
gcc -Ilocalhdrs -Ihdrs  -o test test.c

$ ./target/c/test
Limit on threads: 500

As a weakness of this approach, though, any change to any C file or any header file will trigger a rebuild of the entire component.

Method 2: Preprocess as a separate build step

Reasonable people disagree about how fine-grained of build rules to use for C, so let me describe the fine-grained version as well. This version can rebuild more incrementally in certain scenarios, but that benefit comes at the expense of a substantially more complicated build graph. Then again, most developers will never look at the build graph directly, so there is some argument for increasing the complexity here to improve overall productivity.

The key idea with the finer-grained dependencies is to include a separate build step for preprocessing. Here's a build file to show how it can be done:

{
    "environ": [
        "PATH"
    ],
    "rules": [
        {
            "commands": [
                "gcc -c -o test.o target/c/preproc/test.i"
            ], 
            "inputs": [
                "c/preproc/test:test.i"
            ], 
            "name": "c/object/test",
            "outputs": [
                "test.o"
            ]
        },

        {
            "commands": [
                "gcc -Ilocalhdrs -Ihdrs -E -o test.i test.c"
            ],
            "inputs": [
                "test.c", 
                "localhdrs",
                "hdrs"
            ], 
            "name": "c/preproc/test",
            "outputs": [
                "test.i"
            ]
        },

        {
            "commands": [
                "gcc -o test target/c/object/test.o"
            ],
            "inputs": [
                "c/object/test:test.o"
            ],
            "name": "c/test",
            "outputs": [
                "test"
            ]
        }
    ]
}

This file has three rules in it that chain together to produce the final output file. When you build the test executable for the first time, all three rules will be executed:

$ blcache c/test
Started building c/preproc/test.
Output from building c/preproc/test:
gcc -Ilocalhdrs -Ihdrs -E -o test.i test.c

Started building c/object/test.
Output from building c/object/test:
gcc -c -o test.o target/c/preproc/test.i

Started building c/test.
Output from building c/test:
gcc -o test target/c/object/test.o

$ ./target/c/test
Limit on threads: 10

First, the test.c file is preprocessed, yielding test.i. Second, the test.i file is compiled to test.o. Finally, test.o is linked into the final test executable.

Adding the new syslimits.h file behaves as expected, causing the full chain of recompiles.

$ blcache c/test
Started building c/preproc/test.
Output from building c/preproc/test:
gcc -Ilocalhdrs -Ihdrs -E -o test.i test.c

Started building c/object/test.
Output from building c/object/test:
gcc -c -o test.o target/c/preproc/test.i

Started building c/test.
Output from building c/test:
gcc -o test target/c/object/test.o

$ target/c/test
Limit on threads: 500

Modifying an irrelevant header file, on the other hand, only causes the precompilation step to run. Since the precompilation yields the same result as before, rebuilding stops at that point.

$ touch localhdrs/irrelevant.h
$ blcache c/test
Started building c/preproc/test.
Output from building c/preproc/test:
gcc -Ilocalhdrs -Ihdrs -E -o test.i test.c

Using cached results for c/object/test.
Using cached results for c/test.

It's not shown in this example, but since each C file is compiled individually, a change to a C file will only trigger a rebuild of that one file. Thus, the technique here is fine-grained in two different ways. First, changes to one C file only trigger a recompile of that one file. Second, changes to the H files only trigger preprocessing of all C files, and then only compilation of those C files that turn out to be affected by the H files that were changed.

By the way, there's a trick here that generalizes to a variety of cached computations. If you want to add a cache for a complicated operation like a C compile, then don't try to have the operation itself be directly incremental. It's too error prone. Instead, add a fast pre-processing step that accumulates all of the relevant inputs, and introduce the caching after that pre-processing step. In the case of this example, the fast pre-processing step is, well, the actual C preprocessor.

Coda

Before realizing the problem with C compilation, I used C as an example of why you might want to break the rules a little bit about the most strict and simplistic version of a build cache. However, now it seems to me that you find the best set of build rules if you strictly adhere to a build-cache discipline. I'm sorry, build cache. I should never have doubted you.

No comments: