Friday, September 1, 2023

Task-based build systems

Today I ran into a great book chapter by Erik Kuefler and Lisa Carey about Google's build system philosophy. It's all very good, but the part that stood out to me is the distinction between a task-based build system and an artifact-based build system. Ant is a task-based build system: it lets you define a task in terms of other tasks. Google's build system, partially open sourced as the "Bazel" project, is based on artifacts instead. An important property of an artifact-based system, they explain, is that its builds are reproducible, which is what makes it safe to cache build outputs. Specifically, they write:
For a remote caching system to work, the build system must guarantee that builds are completely reproducible. That is, for any build target, it must be possible to determine the set of inputs to that target such that the same set of inputs will produce exactly the same output on any machine. This is the only way to ensure that the results of downloading an artifact are the same as the results of building it oneself. Fortunately, Bazel provides this guarantee and so supports remote caching. Note that this requires that each artifact in the cache be keyed on both its target and a hash of its inputs—that way, different engineers could make different modifications to the same target at the same time, and the remote cache would store all of the resulting artifacts and serve them appropriately without conflict.
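To make that last point concrete for myself, here is a minimal sketch in Python of a cache keyed on a target plus a hash of its inputs. This is only an illustration of the idea in the quote, not how Bazel actually implements it: `cache_key`, `RemoteCache`, and the `//app:main` label are all hypothetical names I made up, and the "remote" cache is just an in-memory dict.

```python
import hashlib


def cache_key(target: str, inputs: dict[str, bytes]) -> str:
    """Derive a key from the target name and the contents of every input.

    Inputs are visited in sorted order so the key does not depend on
    dictionary ordering. (Hypothetical helper, not a Bazel API.)
    """
    h = hashlib.sha256()
    h.update(target.encode("utf-8"))
    for name in sorted(inputs):
        h.update(b"\0")
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(inputs[name]).digest())
    return h.hexdigest()


class RemoteCache:
    """Stand-in for a remote cache: an in-memory dict keyed by cache_key()."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def get_or_build(self, target, inputs, build) -> bytes:
        key = cache_key(target, inputs)
        if key not in self._store:
            # Cache miss: run the build and remember the artifact.
            # This is only safe if build() is a pure function of its inputs.
            self._store[key] = build(inputs)
        return self._store[key]
```

With this shape, two engineers who edit the same target differently end up with different keys, so their artifacts sit side by side in the cache instead of overwriting each other, while anyone with identical inputs gets a cache hit. The whole scheme only works if the build step is reproducible, which is exactly the guarantee the authors say the build system has to provide.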
