Compiling C and C++ Programs for Dynamic White-Box Analysis

Authors: Zuzana Baranová, Petr Ročkai

Building software packages from source is a complex and highly technical process. For this reason, most software comes with build instructions which have both a human-readable and an executable component. The latter in turn requires substantial infrastructure, which helps software authors deal with two major sources of complexity: first, generation and management of various build artefacts and their dependencies, and second, the differences between platforms, compiler toolchains and build environments.

This poses a significant problem for white-box analysis tools, which often require that the source code of the program under test is compiled into an intermediate format, like the LLVM IR. In this paper, we present divcc, a drop-in replacement for C and C++ compilation tools which transparently fits into existing build tools and software deployment solutions. Additionally, divcc generates intermediate and native code in a single pass, ensuring that the final executable is built from the intermediate code that is being analysed.