Wednesday, May 4, 2016

Standard build rules for C are unreliable

The standard way of integrating C into a build system is to use automatic dependencies generated by the compiler. GCC and Clang can emit a list of the header files they read if you run them with the -M option. Visual Studio's compiler can do the same with the /showIncludes option. What I will call the "standard approach" in this post is to use the dependencies the user explicitly declared, and then to augment them with automatic dependencies generated by options like -M or /showIncludes.

Until a few years ago, I just took this approach as received wisdom and didn't think further about it. It's a neat trick, and it works correctly in the most obvious scenarios. Unfortunately, I have learned that the technique is not completely reliable. Let me share the problem, because I figure that other people will be interested as well, especially anyone else who ends up responsible for setting up a build system.

The root problem with the standard approach is that sometimes a C compile depends on the absence of a file. Such a dependency cannot be represented, and indeed goes unnoticed, in the standard approach to automatic dependencies. The standard approach involves an "automatic dependency list", which is a file listing the automatically determined dependencies for a given C file. By its nature, a list of files only includes files that exist. If a file goes from not existing to existing, the standard approach will overlook the change and skip a rebuild that should have happened.

To look at it another way, the job of an incremental build system is to skip a compile if running it again would produce the same results. Take a moment to consider what a compiler does as it runs. It does a number of in-memory operations such as AST walks, and it does a number of IO operations including reading files into memory. Among those IO operations are things like "list a directory" and "check if a file exists". If you want to prove that a compiler is going to do the same thing on a second run as it did on the first, then you want to prove that those IO operations are going to do the same thing on a second run. That means all of the IO operations, though, not just the ones that read a file into memory.
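
To make the "check if a file exists" operations concrete, here is a minimal sketch in Python of how a header name might be resolved against an ordered include path while logging every existence check. The function name resolve_include and the probe-log format are hypothetical, chosen for illustration, and the model is simplified: real compilers apply extra lookup rules, such as searching the including file's own directory first for quoted includes. The directory names match the example later in this post.

```python
import os
import tempfile


def resolve_include(name, search_dirs):
    """Resolve a header name against an ordered list of search directories.

    Returns (resolved_path, probes). probes records every existence check
    as a (path, existed) pair, in the order the checks were made.
    Simplified, hypothetical model of a compiler's include lookup.
    """
    probes = []
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        existed = os.path.isfile(candidate)
        probes.append((candidate, existed))
        if existed:
            return candidate, probes
    return None, probes


if __name__ == "__main__":
    # Build a scratch tree with one header in the second search directory.
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "localhdrs")
        second = os.path.join(root, "hdrs")
        os.makedirs(first)
        os.makedirs(second)
        with open(os.path.join(second, "syslimits.h"), "w") as f:
            f.write("#define LIMIT_THREADS 10\n")

        resolved, probes = resolve_include("syslimits.h", [first, second])
        print("resolved:", os.path.relpath(resolved, root))
        for path, existed in probes:
            status = "exists" if existed else "absent"
            print("probed:", os.path.relpath(path, root), status)
```

The probe log here has two entries: a miss in localhdrs followed by a hit in hdrs. A dependency list in the style of -M keeps only the hit, so the fact that the compile also relied on the miss is lost.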

Such a situation may seem exotic. At least one prominent source has declared that the standard approach is "correct" up to changes in the build command, which suggests to me that the author did not consider this scenario at all. It's not just a theoretical problem, though. Let me show a concrete example of how it can arise in practice.

Suppose you are compiling the following collection of files, including a single C file and two H files:

// File test.c
#include <stdio.h>
#include "syslimits.h"
#include "messages.h"

int main() {
  printf("%s: %d\n", MSG_LIMIT_THREADS, LIMIT_THREADS);
}

// File localhdrs/messages.h
#define MSG_LIMIT_THREADS "Limit on threads:"

// File hdrs/syslimits.h
#define LIMIT_THREADS 10

Using automatic dependencies, you set up a Makefile that looks like this:

CFLAGS=-Ilocalhdrs -Ihdrs

# Note: each recipe line below must begin with a tab character.
test.o test.d : test.c
	gcc $(CFLAGS) -M test.c > test.d
	gcc $(CFLAGS) -c test.c

test: test.o
	gcc -o test test.o

-include test.d

You compile it and everything looks good:

$ make test
gcc -Ilocalhdrs -Ihdrs -M test.c > test.d
gcc -Ilocalhdrs -Ihdrs -c test.c
gcc -o test test.o
$ ./test
Limit on threads: 10

Moreover, if you change any of the input files, including either of the H files, then invoking make test will trigger a rebuild as desired.

$ touch localhdrs/messages.h
$ make test
gcc -Ilocalhdrs -Ihdrs -M test.c > test.d
gcc -Ilocalhdrs -Ihdrs -c test.c
gcc -o test test.o

What doesn't work so well is adding a new version of syslimits.h that shadows the existing one. Suppose you next create a syslimits.h in localhdrs, which comes earlier on the include path than hdrs:

// File localhdrs/syslimits.h
#define LIMIT_THREADS 500

Make should now recompile the executable, but it doesn't, because test.d lists hdrs/syslimits.h and says nothing about localhdrs/syslimits.h:

$ make test
make: 'test' is up to date.
$ ./test
Limit on threads: 10

If you force a recompile, you can see that the behavior changed, so Make really should have recompiled it:

$ rm test.o
$ make test
gcc -Ilocalhdrs -Ihdrs -M test.c > test.d
gcc -Ilocalhdrs -Ihdrs -c test.c
gcc -o test test.o
$ ./test
Limit on threads: 500

It may seem picky to discuss such a tricky scenario as this one, with header files shadowing other header files. Imagine a developer in the above scenario, though. They are doing something tricky, yes, but it's a tricky thing that is fully supported by the C language. If this test executable is part of a larger build, the developer can be in for a really difficult debugging exercise as they try to understand why their built executable is not behaving in a way that's consistent with the source code. I dare say, it is precisely in such tricky situations that people rely the most on their tools behaving in an intuitive way.
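
To illustrate what information is missing in the shadowing case, here is a minimal Python sketch of a check that a build tool could run if it had saved the outcome of every existence check from the previous compile, misses included. The function name changed_probes and the (path, existed) record format are hypothetical, and this is only an illustration of the idea, not a description of any particular build system's rules.

```python
import os
import tempfile


def changed_probes(recorded_probes):
    """Return the recorded paths whose existence no longer matches.

    recorded_probes is a list of (path, existed) pairs saved from the
    previous compile. Content changes to files that still exist are not
    checked here; ordinary timestamp comparison already covers those.
    """
    return [path for path, existed in recorded_probes
            if os.path.isfile(path) != existed]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        local = os.path.join(root, "localhdrs", "syslimits.h")
        default = os.path.join(root, "hdrs", "syslimits.h")
        os.makedirs(os.path.dirname(local))
        os.makedirs(os.path.dirname(default))
        with open(default, "w") as f:
            f.write("#define LIMIT_THREADS 10\n")

        # What the first compile observed: one miss, then one hit.
        recorded = [(local, False), (default, True)]
        print("before shadowing:", changed_probes(recorded))

        # Add the shadowing header, as in the example above.
        with open(local, "w") as f:
            f.write("#define LIMIT_THREADS 500\n")
        changed = changed_probes(recorded)
        print("after shadowing:", [os.path.relpath(p, root) for p in changed])
```

Before the shadowing header is added, nothing has changed and the compile can be skipped. Afterwards, the recorded miss for localhdrs/syslimits.h no longer holds, which is exactly the signal that a list of existing files cannot carry.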

I will describe how to set up better build rules for this scenario in a follow-up post.
