1 Introduction

The DIVINE project aims to develop a general-purpose, fast, reliable and easy-to-use model checker. The roots of the project go back to a special-purpose, explicit-state, asynchronous system model checking tool for LTL properties. However, rigorous development processes are in a steady decline, being displaced by more agile, flexible and dynamic methods. In the agile world, there is little place for large-scale, long-term planning and pondering on “paper only” designs, which would favour the use of a traditional model checker.

The current version of DIVINE strives to keep up with this dynamic world, bringing “heavy-duty” model checking technology much closer to daily programming routine. Our major goal is to express model checking problems in a language which every developer is fluent with: the programming language of their own project. Even if you don’t apply model checking to your resulting program directly, writing throwaway models makes much more sense in a language you understand well and use daily.

Current versions of DIVINE provide out-of-the box support for the C (C99) and C++ (C++14) programming languages, including their respective standard libraries. Addtional libraries may be rebuilt for use with DIVINE by the user.

2 Installation

This section is only relevant when you are installing from source. We will assume that you have obtained a source tarball from http://divine.fi.muni.cz, eg. divine-4.0.tar.gz.

2.1 Prerequisites

To build DIVINE from source, you will need the following:

Additionally, DIVINE can make use of the following optional components:

2.2 Building & Installing

First, unzip the distribution tarball and enter the newly created directory

$ tar xvzf divine-4.0.tar.gz
$ cd divine-4.0

The build is driven by a Makefile, and should be fully automatic. You only need to run:

$ make

This will first build a C++14 toolchain and a runtime required to build DIVINE itself, then proceed to compile DIVINE. After a while, you should obtain the main DIVINE binary. You can check that this is indeed the case by running:

$ ./_build.release/tools/divine help

You can now run DIVINE from its build directory, or you can optionally install it by issuing

$ make install

You can also run the test-suite if you like:

$ make check

3 An Introduction to Using DIVINE

In this section, we will give a short example on how to invoke DIVINE and its various functions. We will use a small C program (consisting of a single compile unit) as an example in this section, along with a few simple properties.

3.1 System Modelling Basics

We will start with a simple C program with 2 threads and a single shared variable:

#include <pthread.h>
#include <assert.h>

int i = 33;

void* thread( void *x )
    i ++;
    return NULL;

int main()
    pthread_t tid;
    pthread_create( &tid, NULL, thread, NULL );

    i ++;

    pthread_join( tid, NULL );
    assert( i == 35 );
    return i;

(Yes, this program clearly has a bug in it, a race condition on the update of i. We will deal with that later.)

The program as it is can be compiled using your system’s C compiler just fine (assuming access to pthread.h) and you can try running it. To use DIVINE, we will need to compile it a little differently:

$ divine cc program.c

After a short while, it will produce a file, program.bc. This file contains LLVM bitcode of our program, and of the runtime environment. We can learn a little more about the program using divine info:

$ divine info program.bc
(TODO: divine info is not yet implemented in DIVINE 4)

We can analyse the bitcode file a little more, actually exploring its entire state space, measuring it (we won’t be asking about any properties of the program yet):

$ divine metrics program.bc
(TODO: divine metrics is not yet implemented in DIVINE 4)

We have learned that our program has NN unique states and there are NN transitions between these states. By default, DIVINE tries to use the most efficient reduction available which is still exact (i.e. it will not use hash-compaction or other approximation techniques unless explicitly enabled). We can try turning off those reductions to see what happens:

$ divine metrics program.bc --no-reduce
(TODO: divine metrics is not yet implemented in DIVINE 4)

3.2 Property Verification & Counter-Example Analysis

Now let’s turn to some more interesting properties. With the above program, the first thing that comes to mind is asking DIVINE whether the assertion in the program can ever be violated:

$ divine verify program.bc
loading /lib/libc.bc... linking... done
loading /lib/libcxxabi.bc... linking... done
loading /lib/libcxx.bc... linking... done
loading /lib/libdios.bc... linking... done
compiling program.c
annotating bitcode... done
computing RR... constants... done
found 181 states and 314 edges in 0:00, averaging inf states/s

found an error:
Dios started!
FAULT: __vm_fault called
VM Fault
1: main
2: __dios_main

choices made: 0 0 0 1 1 0 0 1 0 0 1 1 0 0 0 0 0 0 1 1 0 0
backtraces in the error state:
@address: heap* 902b4 0+0
@pc: code* 1d4 4f
@location: /divine/src/dios/fault.cpp:34
@symbol: __dios::Fault::handler(_VM_Fault, _VM_Frame*, void (*)())

@address: heap* 901e1 0+0
@pc: code* 234 11
@location: program.c:20
@symbol: main

@address: heap* 901bb 0+0
@pc: code* 1de 26
@location: /divine/src/dios/main.cpp:111
@symbol: __dios_main

Here, as we have anticipated, DIVINE tells us that in fact, the assertion can be violated and gave us an indication of what exactly went wrong. In particular, the backtrace reveals that program.c:20 was active during the fault, which is exactly our assertion.

To understand what led to the assertion failure, however, we will need to analyse the counterexample in more detail. For this purpose, DIVINE provides an interactive debugger, available as divine sim.

4 Model Checking C and C++ Code via LLVM Bitcode

The traditional “explicit-state” model checking practice is not widely adopted in the programming community, and vice-versa, use of mainstream programming languages is not a common practice in the model checking community. Hence, model checking of systems expressed as C programs comes with some novelties for both kinds of users.

First of all, the current main application of DIVINE is verification of safety (and some liveness) properties of asynchronous, shared-memory programs. The typical realisation of such asynchronous parallelism is programming with threads and shared memory. Often for performance and/or familiarity reasons, programming with threads, shared memory and locking is the only viable alternative, even though the approach is fraught with difficulties and presents many pitfalls that can catch even expert programmers unaware – not to say novices. Sadly, resource locking is inherently non-compositional, hence there is virtually no way to provide a reliable yet powerful abstraction, all that while retaining speed and scalability.

Despite all its shortcomings, lock-based programming (or alternatively, lock-free shared memory programming, which is yet more difficult) is becoming more prevalent. Model checking provides a powerful tool to ascertain correctness of programs written with locks around concurrent access to shared memory. Most programmers will agree that bugs that show up rarely and are nearly impossible to reproduce are the worst kind to deal with. Sadly, most concurrency bugs are of this nature, since they arise from subtle interactions of nondeterministically scheduled threads of execution. A test-case may work 99 times out of 100, yet the 100th time die due to an unfathomable invalid memory access. Even sophisticated modern debugging tools like valgrind are often powerless in this situation.

This is where DIVINE can help, since it systematically and efficiently explores all relevant execution orderings, discovering even the subtlest race conditions, restoring a crucially important property of bugs: reproducibility. If you have ever changed a program and watched the test-case run in a loop for hundreds of iterations, wondering if the bug is really fixed, or it just stubbornly refuses to crash the program… well, DIVINE is the tool you have been waiting for.

Of course, there is a catch. Model checking is computationally intensive and memory-hungry. While this is usually not a problem with comparably small unit tests, applying model checking to large programs may not be feasible – depending on your use-case, and on the amount of memory and time that you have.

With the universal LLVM backend, DIVINE can support a wide range of compiled programming languages. However, out of the box, language-specific support is only provided for C and C++. A fairly complete ISO C runtime library is provided as part of DIVINE, with appropriate hooks into DIVINE, as well as an implementation of the pthread specification, i.e. the POSIX threading interface. Additionally, an implementation of the standard C++ library is bundled with DIVINE. Besides the standard library, DIVINE also provides an adapted version of the runtime library required by C++ programs to implement runtime type identification and exception handling; both are fully supported by DIVINE.

If your language interfaces with the C library, the libc part of language support can be re-used transparently. However, currently no other platform glue is provided for other languages. Your mileage may vary. Data structure and algorithmic code can be very likely processed with at most trivial additions to the support code, in any language that can be compiled into LLVM bitcode.

4.1 Compiling Programs

The first step to take when you want to use DIVINE for C/C++ verification is to compile your program into LLVM bitcode and link it against the DIVINE-provided runtime libraries. When you issue divine cc program.c, divine will compile the runtime support and your program and link them together automatically, using a builtin clang-based compiler. The cc subcommand accepts a wide array of traditional C compiler flags like -I, -std, -W and so on.

Alternatively, you can pass a single-file C or C++ program directly to other divine commands, like verify or sim, in which case the program will be compiled and linked transparently.

4.2 Limitations

When DIVINE interprets your program, it does so in a very strictly controlled environment, so that every step is fully reproducible. Hence, no real IO is allowed; the program can only see its own memory.

While the pthread interface is fully supported, the fork system call is not: DIVINE cannot verify multi-process systems at this time. Likewise, C++ exceptions are well-supported but the C functions setjmp and longjmp are not yet implemented.

The interpreter simulates an “in order” CPU architecture, hence any possible detrimental effects of instruction and memory access reordering won’t be detected. Again, an improvement in this area is planned for a future release.

4.3 State Space of a Program

Internally, DIVINE constructs the state space of your program – an oriented graph, whose vertices represent various states your program can reach, i.e. the values of all its mapped memory locations, the values in all machine registers, and so on. DIVINE normally doesn’t store (or construct) all the possible states – only a relevant subset. In a multi-threaded program, it often happens that more than one thread can run at once, i.e. if your program is in a given state, the next state it will get into is determined by chance – it all hinges on which thread gets to run. Hence, some (in fact, most) states in a parallel program can proceed to multiple different configurations, depending on a scheduling choice. In such cases, DIVINE has to explore both such “variant” successors to determine the behaviour of the program in all possible scenarios.

You can visualise the state space DIVINE explores by using divine draw – this will show how the “future” of your program branches through various configurations, and how it converges back into a common point – or, if its behaviour had changed depending on scheduling, diverges into multiple different outcomes.

4.4 Non-Deterministic Choice

Scheduling is not the only source of non-determinism in typical programs. Interactions with the environment can have different outcomes, even if the internal state of the program is identical. A typical example would be memory allocation: the same call in the same context could either succeed or fail, depending on the conditions outside the control of the program. In addition to failures, the normal input of the program (files, network, user input on the terminal) are all instances of non-determinism due to external influences. The sum of all such external behaviours that affect the outcome of the program is called the environment (this includes the scheduling of threads done by the operating system and/or CPU). When testing, the environment is controlled to some degree: the inputs are usually fixed, but eg. thread scheduling is (basically) random – it changes with every test execution. Resource exhaustion can be simulated in a test environment to some degree. In a model checker, the environment is controlled much more strictly: when a test case is run through a model checker, it will come out the same every time.

4.5 ω-Regular Properties and LTL

5 Interactive Debugging

DIVINE comes with an interactive debugger, available as divine sim. The debugger loads programs the same way verify and other DIVINE commands do, so you can run it on standalone C or C++ source files directly, or you can compile larger programs into bitcode and load that.

5.1 Tutorial

Let’s say you have a small C program which you wish to debug. We will refer to this program as program.c. To load the program into the debugger, simply execute

$ divine sim program.c

and DIVINE will take care of compiling and linking your program and load it, but will not execute it. Instead, sim will present a prompt to you, looking like this:

# executing __boot at /divine/src/dios/dios.cpp:79

The __boot function is common to all DIVINE-compiled programs and belongs to DiOS, our minimalist operating system. When debugging user-mode programs, the good first command to run is

> start

which will start executing the program until it enters the main function:

# a new program state was stored as #1
# active threads: [1:0]
T: starting program.c...
# a new program state was stored as #2
# active threads: [1:0]
# executing main at program.c:13

Here, we can already see a few DIVINE-specific features of the debugger. First, program states are stored and retained for future reference. Second, thread switching is quite explicit – every time a scheduling decision is made, sim informs you of this fact. We will look at how to influence these decisions later.

Now is a good time to familiarise ourselves with how to inspect the program. There are two commands for listing the program itself: source and bitcode, each will print the currently executing function in the appropriate form (the original C source code and LLVM bitcode, respectively). Additionally, when printing bitcode, the current values of all LLVM registers are shown as inline comments:

>>label %entry:
    %01 = alloca [i32 1 d]                                  # [const* 0 0 uu]
    %02 = call @pthread_create %01 [const* 0 0 dd]          # [i32 0 u]
               [code* 233 0 dd] [const* 0 0 dd] 
    call @__vm_interrupt_mem [global* 2d 0 dd] [i32 1 d] 
    %04 = load [global* 2d 0 dd]                            # [i32 0 u]

Besides printing the currently executing function, both source and bitcode can print code corresponding to other functions; in fact, by default they print whatever function the debugger variable $frame refers to. To print the source code of the current function’s caller, you can issue

> source @caller
    103 void __dios_main( int l, int argc, char **argv, char **envp ) noexcept {
    104     __vm_control( _VM_CA_Bit, _VM_CR_Flags, _VM_CF_Mask, _VM_CF_Mask );
    105     __dios_trace_t( "Dios started!" );
    106     __dios::runCtors();
    107     int res;
    108     switch (l) {
    109     case 0:
    110         __vm_control( _VM_CA_Bit, _VM_CR_Flags, _VM_CF_Mask, _VM_CF_None );
>>  111         res = main();
    112         break;
    113     case 2:
    114         __vm_control( _VM_CA_Bit, _VM_CR_Flags, _VM_CF_Mask, _VM_CF_None );
    115         res = main( argc, argv );
    116         break;

To inspect data, we can instead use show and inspect. We have mentioned $frame earlier: there is, in fact, a number of variables set by sim. The most interesting of those is $_ which normally refers to the (usually) most interesting object at hand, typically the executing frame. By default, show will print the content of $_ (like many other commands). However, when we pass an explicit parameter to these commands, the difference between show and inspect becomes apparent: the latter also sets $_ to its parameter. This makes inspect more suitable for exploring more complex data, while show is useful for quickly glancing at nearby values:

> show
@address:     heap* 901e2 0+0
@pc:          code* 234 0
@location:    program.c:13
@symbol:      main
    @type:    pthread_t
related:      @caller

This is how a frame is presented when we look at it with show.

6 The DIVINE Virtual Machine

Programs in DIVINE are executed by a virtual machine (DIVINE VM or DiVM for short). The machine code executed by this virtual machine is the LLVM bitcode. The details of the “machine language” are, therefore, described in the LLVM Documentation.

6.1 Activation Frames

Unlike in ‘traditional’ implementations of C, there is no continuous stack in DiVM. Instead, each activation record (frame) is allocated from the heap and its size is fixed. Allocations that are normally done at runtime from the stack are instead done from the heap, using the alloca LLVM instruction. Additionally, since LLVM bitcode is in partial SSA form, what LLVM calls ‘registers’ are objects quite different from traditional machine registers. The registers used by a given function are bound to the frame of that function (they cannot be used to pass values around and they don’t need to be explicitly saved across calls). In the VM, this is realized by storing registers (statically allocated) in the activation record itself, along with DiVM-specific “program counter” register (this is an actual register, but is saved across calls automatically by the VM, see also Control Registers below) and a pointer to the caller’s activation frame. The header of the activation record has a C-compatible representation, available as struct _VM_Frame in divine.h.

6.2 Control Registers

The state of the VM consists of two parts, a set of control registers and the heap (structured memory). All available registers are described by enum _VM_ControlRegister defined in divine.h and can be manipulated through the __vm_control hypercall (see also The Hypercall Interface below). The registers are of two types, holding either an integer or a pointer. There are 2 integer registers, _VM_CR_Flags, which is used as a bitfield, and _VM_CR_User1 (the latter is not used or examined by the VM itself).

Four control registers govern address translation and normal execution:

Additional 4 registers are concerned with scheduling and interrupt control (for details, see Scheduling below):

Finally, there’s _VM_CR_FaultHandler (the address of the fault handler, see Faults) and a pair of user registers (_VM_CR_User1 and _VM_CR_User2).

The flags register (_VM_CR_Flags) is further broken down into individual bits, described by enum _VM_ControlFlags, again defined in divine.h. These are:

The remaining 3 flags indicate properties of the resulting edge in the state space (see also State Space of a Program). These may be set by the program itself or by a special monitor automaton, a feature of DiOS which enables modular specification of non-trivial (sequence-dependent) properties. These 3 flags are reset by the VM upon entering the scheduler (see Scheduling). The edge-specific flags are:

6.3 Heap

The entire persistent state of the VM is stored in the heap. The heap is represented as a directed graph of objects, where pointers stored in those objects act as the edges of the graph. For each object, in addition to the memory corresponding to that object, a supplemental information area is allocated transparently by the VM for tracking metadata, like which bytes in the object are initialised (defined) and a list of addresses in the object where pointers are stored.

Activation frames, global variables and even constants are all stored in the heap. The heap is also stored in a way that makes it quite efficient (both time- and memory-wise) for the VM to take snapshots and store them. This is how model checking and reversible debugging is realized in DIVINE.

6.4 Scheduling

The DIVINE VM has no intrinsic concept of threads or processes. Instead, it relies on an “operating system” to implement such abstractions and the VM itself only provides the minimum support neccessary. Unlike with “real” computers, a system required to operate DiVM can be extremely simple, consisting of just 2 C functions (one of them is __boot, see Boot Sequence below). The latter of those is the scheduler, the responsibility of which is to organize interleaving of threads in the program to be verified. However, the program may not use threads but some other form of concurrency – it is up to the scheduler, which may be provided by the user, to implement the correct abstractions.

From the point of view of the state space (cf. State Space of a Program), the scheduler decides what the successors of a given state are. When DIVINE needs to construct successors to a particular state, it executes the scheduler in that state; the scheduler decides which thread to run (usually with the help of the non-deterministic choice operator) and transfers control to that thread (by changing the value of the _VM_CR_Frame control register, i.e. by instructing DIVINE to execute a particular activation frame). The VM then continues execution in the activation frame that the scheduler has chosen, until it encounters an interrupt. When DIVINE loads a program, it annotates the bitcode with interrupt points, that is, locations in the program where threads may need to be re-scheduled. When such a point is encountered, the VM sets the _VM_CF_Interrupted bit in _VM_CR_Flags and unless _VM_CF_Mask is in effect, an interrupt is raised immediately.

Upon an interrupt, the values of _VM_CR_IntFrame and _VM_CR_Frame are swapped, which usually means that the control is transferred back to the scheduler, which can then read the address of the interrupted frame from _VM_CR_IntFrame (this may be a descendant or a parent of the frame that the scheduler originally transferred control to, or may be a null pointer if the activation stack became empty).

Of course, the scheduler needs to store its state – for this purpose, it must use the _VM_CR_State register, which is set by __boot to point to a particular heap object. This heap object can be resized by calling __vm_obj_resize if needed, but the register itself is read-only after __boot returns. The object can be used to, for example, store pointers to activation frames corresponding to individual threads (but of course, those may also be stored indirectly, behind a pointer to another heap object). In other words, the object pointed to by _VM_CR_State serves as the root of the heap.

6.5 Faults

An important role of DIVINE is to detect errors – various types of safety violations – in the program. For this reason, it needs to interpret the bitcode as strictly as possible and report any problems back to the user. Specifically, any dangerous operations that would normally lead to a crash (or worse, a security vulnerability) are caught and reported as faults by the VM. The fault types that can arise are the following (enumerated in enum _VM_Fault in divine.h):

When a fault is raised, control is transferred to a user-defined fault handler (a function the address of which is held in the _VM_CR_FaultHandler control register). Out of the box, DiOS (see DiOS, DIVINE’s Operating System) provides a configurable fault handler. If a fault handler is set, faults are not fatal (the only exception is a double fault, that is, a fault that occurs while the fault handler itself is active). The fault handler, possibly with cooperation from the scheduler (see Scheduling above), can terminate the program, or raise the _VM_CF_Error flag, or take other appropriate actions.

The handler can also choose to continue with execution despite the fault, by transferring control to the activation frame and program counter value that are provided by the VM for this purpose. (Note: this is neccessarry, because the fault might occur in the middle of evaluating a control flow instruction, in which case, the VM could not finish its evaluation. The continuation passed to the fault handler is the best estimate by the VM on where the execution should resume. The fault handler is free to choose a different location.)

6.6 Boot Sequence

The virtual machine explicitly recognizes two modes of execution: privileged (kernel) mode and normal, unprivileged user mode. When the VM is started, it looks up a function named __boot in the bitcode file and starts executing this function, in kernel mode. The responsibility of this function is to set up the operating system and set up the VM state for execution of the user program. There are only two mandatory steps in the boot process: set the _VM_CR_Scheduler and the _VM_CR_State control registers (see above). An optional (but recommended) step is to inform the VM (or more specifically, any debugging or verification tools outside the VM) about the C/C++ type (or DWARF type, to be precise, as this is also possible for non-C languages) associated with the OS state. This is accomplished by an appropriate __vm_trace call (see also below).

6.7 The Hypercall Interface

7 DiOS, DIVINE’s Operating System

Programs traditionaly rely on a wide array of services provided by their runtime environment (that is, a combination of libraries, the operating system kernel, the hardware architecture and its peripherals and so on). When DIVINE executes a program, it needs to provide these services to the program. Some of those, especially library functions, can be obtained the same way they are in a traditional (real) execution environment: the libraries can be compiled just like other programs and the resulting bitcode files can be linked just as easily.

The remaining services, though, must be somehow provided by DIVINE, since they are not readily available as libraries. Some of those are part of the virtual machine, like memory management and interrupt control (cf. The DIVINE Virtual Machine). The rest is provided by an “operating system”. In principle, you can write your own small operating system for use with DIVINE; however, to make common verification tasks easier, DIVINE ships with a small OS that provides a subset of POSIX interfaces that should cover the requirements of a typical user-level program.