1 Introduction

The DIVINE project aims to develop a general-purpose, fast, reliable and easy-to-use model checker. The roots of the project go back to a special-purpose, explicit-state, asynchronous system model checking tool for LTL properties. However, rigorous development processes are in a steady decline, being displaced by more agile, flexible and dynamic methods. In the agile world, there is little place for large-scale, long-term planning and pondering on “paper only” designs, which would favour the use of a traditional model checker.

The current version of DIVINE strives to keep up with this dynamic world, bringing “heavy-duty” model checking technology much closer to daily programming routine. Our major goal is to express model checking problems in a language which every developer is fluent with: the programming language of their own project. Even if you don’t apply model checking to your resulting program directly, writing throwaway models makes much more sense in a language you understand well and use daily.

Current versions of DIVINE provide out-of-the-box support for the C (C99) and C++ (C++14) programming languages, including their respective standard libraries. Additional libraries may be rebuilt for use with DIVINE by the user.

2 Installation

This section is only relevant when you are installing from source. We will assume that you have obtained a source tarball from http://divine.fi.muni.cz, e.g. divine-4.0.18.tar.gz.

DIVINE can be built on Linux and possibly on other POSIX-compatible systems including macOS (not tested). There is currently no support for DIVINE on Windows. If you do not want to build DIVINE from sources, you can download a virtual machine image with pre-built DIVINE.

2.1 Prerequisites

2.1.1 Automatic Installation

If you use recent Ubuntu, Fedora or Arch Linux (or possibly another distribution which uses apt-get, yum or pacman as a package manager) the easiest way to get dependencies of DIVINE is to run make prerequisites in the directory with the sources (you will need to have make installed):

$ tar xvzf divine-4.0.18.tar.gz
$ cd divine-4.0.18
$ make prerequisites

2.1.2 Manual Installation

Otherwise, to build DIVINE, you will need the following:

Additionally, DIVINE can make use of the following optional components:

2.2 Building & Installing

First, unzip the distribution tarball and enter the newly created directory

$ tar xvzf divine-4.0.18.tar.gz
$ cd divine-4.0.18

The build is driven by a Makefile, and should be fully automatic. You only need to run:

$ make

This will first build a C++14 toolchain and a runtime required to build DIVINE itself, then proceed to compile DIVINE. After a while, you should obtain the main DIVINE binary. You can check that this is indeed the case by running:

$ ./_build.release/tools/divine help

You can now run DIVINE from its build directory, or you can optionally install it by issuing

$ make install

This will install DIVINE and its version of LLVM into /opt/divine/.

You can also run the test-suite if you like:

$ make check

3 An Introduction to Using DIVINE

In this section, we will give a short example on how to invoke DIVINE and its various functions. We will use a small C program (consisting of a single compile unit) as an example, along with a few simple properties.

3.1 Basics of Program Analysis

We will start with the following simple C program:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#include <stdio.h>
#include <assert.h>

void foo( int *array ) {
    for ( int i = 0; i <= 4; ++i ) {
        printf( "writing at index %d\n", i );
        array[i] = 42;
    }
}

int main() {
    int x[4];
    foo( x );
    assert( x[3] == 42 );
}

The above code contains a bug, an access out of bounds of array for i == 4; we will see how this is presented by DIVINE.

The program as it is can be compiled by your system’s C compiler and run. If you do so, it will probably run OK despite the out of bounds access (this is an example of a stack buffer overflow – the program will incorrectly overwrite an adjacent value on the stack which, in most cases, does not interfere with its execution). We can proceed to verify the program using DIVINE:

$ divine verify program.c

DIVINE will now compile your program and run the verifier on the compiled code. After a short while, it will produce the following output:

compiling program.c
loading bitcode … LART … RR … constants … done
booting … done
found 83 states in 0:00, averaging 164.7

state count: 83
states per second: 164.683
version: 4.0.12+3c13c2b13ac1

architecture: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz
memory used: 1261248
physical memory used: 565456
user time: 6.830000
system time: 0.090000
wall time: 6.639394

timers:
  lart: 0.761
  loader: 1.92
  boot: 1.23
  search: 0.504
  ce: 1.67
error found: yes
error trace: |
  [0] writing at index 0
  [0] writing at index 1
  [0] writing at index 2
  [0] writing at index 3
  [0] writing at index 4
  FAULT: access of size 4 at [heap* 53e6ba2a 10 ddp] is 4 bytes out of bounds
  [0] Fault in userspace: memory
  [0] Backtrace:
  [0]   1: foo
  [0]   2: main
  [0]   3: _start

error state:
  backtrace 1: # active stack
    - address: heap* c237acdc 0+0
      pc: code* 16a 5c
      location: /divine/include/dios/core/fault.hpp:172
      symbol: {Fault}::fault_handler(int, _VM_Frame*, int)

    - address: heap* 53b4676 0+0
      pc: code* 1be e3
      location: /divine/include/dios/core/syscall.hpp:70
      symbol: {Syscall}::fault_handlerWrappper({Context}&, void*, __va_list_tag*)

    - address: heap* 743beb3 0+0
      pc: code* 1a6 14
      location: /divine/include/dios/core/syscall.hpp:41
      symbol: {Syscall}::handle({Context}&, _DiOS_Syscall&)

    - address: heap* 10 0+0
      pc: code* 18f 41
      location: /divine/include/dios/core/scheduling.hpp:467
      symbol: void {Scheduler}::run_scheduler<{Context} >()

  backtrace 2:
    - address: heap* dbc1a03b 0+0
      pc: code* 100e 15
      location: /divine/src/libc/functions/unistd/syscall.c:14
      symbol: __dios_trap

    - address: heap* 4a840476 0+0
      pc: code* 100f 9
      location: /divine/src/libc/functions/unistd/syscall.c:26
      symbol: __dios_syscall

    - address: heap* 11 0+0
      pc: code* 169 1c
      location: /divine/include/dios/core/fault.hpp:198
      symbol: void {Fault}::handler<{Context} >(_VM_Fault, _VM_Frame*, void (*)(), ...)

    - address: heap* 73fbe04e 0+0
      pc: code* 1 11
      location: program.c:7
      symbol: foo

    - address: heap* fab409f3 0+0
      pc: code* 2 4
      location: program.c:13
      symbol: main

    - address: heap* 3340f329 0+0
      pc: code* 100c b
      location: /divine/src/libc/functions/sys/start.cpp:72
      symbol: _start

a report was written to program.report.cqlyph

The output begins with compile- and load-time report messages, followed by some statistics. Then, starting with error found: yes, the detected error is introduced. The error-related information contains:

In our case, the most important information is FAULT: access of size 4 at [heap* 53e6ba2a 10 ddp] is 4 bytes out of bounds which indicates the error is caused by an invalid memory access. The other crucial information is the line of code which caused it (this can be found on the second stack, in the entry for the foo function):

- address: heap* 73fbe04e 0+0
  pc: code* 1 11
  location: program.c:7
  symbol: foo

Hence we can see that the problem is caused by an invalid memory access on line 7 in our program.

Note: one might notice that the addresses in DIVINE are printed in the form [heap* 53e6ba2a 10 ddp]; the meaning of which is: the pointer in question is a heap pointer (other types of pointers are constant pointers and global pointers; stack pointers are not distinguished from heap pointers); the object id (in hexadecimal, assigned to the allocation) is 53e6ba2a; the offset (again in hexadecimal) is 10 and the value is a defined pointer (ddp, i.e. it is not an uninitialised value).

3.1.1 Debugging Counterexamples with the Interactive Simulator

We have found a bug in our program. While in this case the cause might be visible directly, it is useful to be able to inspect the error in more detail. For this, we can use DIVINE’s simulator.

divine sim program.c

After the compilation, we will land in an interactive debugger environment:

Welcome to 'divine sim', an interactive debugger. Type 'help' to get started.
# executing __boot at /divine/src/dios/core/dios.cpp:315
>

There are a few commands we could use in this situation. For instance, the start command brings the program to the beginning of the main function (skipping the internal program initialization process). Another alternative is to invoke sim by passing the generated report (the name of which is specified at the very end of the verify output) using the --load-report option. The simulator now outputs identifiers of the produced states, together with outputs the program printed so far (prefixed with T:), and ends just after the last choice before the error. We can now let the simulator run the program until the error is found (or a new state is produced) with stepa (step atomic).

At this point, the simulator should have stopped in the fault handler which is executed by DIVINE when an error is found. We can use the up command to step from the fault handler to the frame of the function in which the error occurred (generally, up can be used to move from the callee to the caller on the stack, down can be used to move in the opposite direction). As we should be in the frame of the foo function, we can use the show command to print local variables present in this function. In the output there should be an entry for i:

.i$1:
    type:    int
    value:   [i32 4 d]

The entry suggests that i is an int variable. It is represented as [i32 4 d], meaning it is a 32 bit integer with value 4 and it is fully defined (d). If we jump one frame up and use show again, we can see the entry for x:

.x:
    type:    int[]
    .[0]:
        type: int
        value: [i32 42 d]
    .[1]:
        type: int
        value: [i32 42 d]
    .[2]:
        type: int
        value: [i32 42 d]
    .[3]:
        type: int
        value: [i32 42 d]

Here, we see that x is an int array and that it contains 4 values, therefore, access at x[4] is out of bounds of this array.

More details about the simulator can be found in the interactive debugging section.

3.1.2 Controlling the Execution Environment

Programs in DIVINE run in an environment provided by DiOS, DIVINE’s operating system, and by runtime libraries (including C and C++ standard libraries and pthreads). The behaviour of this runtime can be configured using the -o option. To get the list of options, run divine info program.c:

$ divine info program.c
compiling program.c

DIVINE 4.0.5

Available options for program.c are:
I: DiOS boot info:
I:   help:
I:     supported commands:
I:       - "debug":
I:           description: print debug information during boot
I:           arguments:
I:             - help - help and exit
I:             - machineparams - specified by user, e.g. number of cpus
I:             - mainargs - argv and envp
I:             - faultcfg - fault and simfail configuration
I:       - "{trace|notrace}":
I:           description: report/not report item back to VM
I:           arguments:
I:             - threads - thread info during execution
I:       - "[force-]{ignore|report|abort}":
I:           description: configure fault, force disables program override
I:           arguments:
I:             - assert
I:             - arithmetic
I:             - memory
I:             - control
I:             - locking
I:             - hypercall
I:             - notimplemented
I:             - diosassert
I:       - "{nofail|simfail}":
I:           description: enable/disable simulation of failure
I:           arguments:
I:             - malloc
I:       - "ncpus":
I:           description: specify number of cpu units (default 0)
I:           arguments:
I:             - <num>
I:       - "{stdout|stderr}":
I:           description: specify how to treat program output
I:           arguments:
I:             - notrace - ignore the stream
I:             - unbuffered - trace each write
I:             - trace - trace after each newline (default)
I:       - "syscall":
I:           description: specify how to treat standard syscalls
I:           arguments:
I:             - simulate - simulate syscalls, use virtual file system (can be used with verify and run)
I:             - passthrough - use syscalls from the underlying host OS (cannot be used with verify)
use -o {option}:{value} to pass these options to the program

It is often convenient to assume that malloc never fails: this can be achieved by passing the -o nofail:malloc option to DiOS. Other important options are those controlling the fatalness of errors (the default option is abort – if an error of type abort is encountered, the verification ends with an error report; on the other hand, the verifier will attempt to continue when it detects an error that was marked as ignore).

Furthermore, it is possible to pass arguments to the main function of the program by appending them after the name of the source file (e.g. divine verify program.c main-arg-1 main-arg-2), and to add environment variables for the program using the -DVAR=VALUE option.

3.1.3 Compilation Options and Compilation of Multiple Files

Supposing you wish to verify something that is not just a simple C program, you may have to pass compilation options to DIVINE. In some cases, it is sufficient to pass the following options to divine verify:

However, if you need to pass more options or if your program consists of more than one source file, it might be more practical to compile it to LLVM bitcode first and pass this bitcode to divine verify:

$ divine cc -Iinclude-path -O1 -std=c++14 -DFOO=bar program.c other.c
$ divine verify program.bc

divine cc is a wrapper for the clang compiler and it is possible to pass most of clang’s options to it directly.

4 Commandline Interface

The main interface to DIVINE is the command-line tool simply called divine. Basic information about the binary can be obtained by issuing divine --version:

version: 4.0.12+8fbbb7005640
source sha: 37282b999cc10a027cabbb9669365f9e630fb685
runtime sha: b8939c99ca81161cbe69076e31073807f69dda72
build date: 2017-09-19, 14:02 UTC
build type: Debug

4.1 Synopsis

divine cc [compiler flags] <sources>
divine verify [...] <input file>
divine draw [...] <input file>
divine run [...] <input file>
divine sim [...] <input file>

4.2 Input Options

All commands that work with inputs share a number of flags that influence the input program.

divine {...} [-D {string}]
             [--autotrace {tracepoint}]
             [--sequential]
             [--disable-static-reduction]
             [--relaxed-memory {string}]
             [--lart {string}]

4.3 State Space Visualisation & Simulation

divine draw [--distance {int}]
            [--render {string}]
            {input options}

To visualise (a part of) the state space, you can use divine draw, which creates a graphviz-compatible representation. By default, it will run “dot -Tx11” to display the resulting graph. You can override the drawing command to run by using the --render switch. The command will get the dot-formatted graph description on its standard input. Out of common input options, the --autotrace option is often quite useful with draw.

divine sim [--batch] [--load-report {file}] [--skip-init]
           {input options}

The sim sub-command is used to interactively explore a program using a terminal-based interface. Use help in the interactive prompt to obtain help on available commands. See also Interactive Debugging.

4.4 Model Checking

divine {check|verify} [model checking options]
                      [exploration options]
                      {input options}

These two commands are the main workhorse of model checking. The verify command performs full model checking under conservative assumptions. The check command is more liberal, and will assume, for instance, that malloc will not fail. While check is likely to cover most scenarios, it omits cases that are either very expensive to check or that appear in programs very often and make the tool cumbersome to use.

The algorithms DIVINE uses are often resource-intensive and some are parallel (multi-threaded). The verify and check commands will, in common cases, use a parallel algorithm to explore the state space of the system. By default, parallel algorithms will use available cores, 4 at most. A few switches control resource use:

divine {...} [--threads {int}]
             [--max-memory {mem}]
             [--max-time {int}]
--threads {int} | -T {int}
The number of threads to use for verification. The default is 4 or the number of cores if less than 4. For optimal performance, each thread should get one otherwise mostly idle CPU core. Your mileage may vary with hyper-threading (it is best to run a few benchmarks on your system to find the best configuration).
--max-memory {mem}
Limit the amount of memory divine is allowed to allocate. This is mainly useful to limit swapping. When the verification exceeds available RAM, it will usually take extremely long time to finish and put a lot of strain on the IO subsystem. It is recommended that you do not allow divine to swap excessively, either using this option or by some other means.
--max-time {int}
Put a limit of {int} seconds on the maximal running time.

Verification results can be written in a few forms, and resource use can also be logged for benchmarking purposes:

divine {...} [--report {fmt}]
             [--no-report-file]
             [--report-filename {string}]
             [--num-callers {int}]
--report {fmt}
At the end of a verification run, produce a comprehensive, machine-readable report. Currently the available formats are: none – disables the report entirely, yaml – prints a concise, yaml-fomatted summary of the results (without memory statistics or machine-readable counterexample data) and yaml-long which prints everything as yaml.
--no-report-file
By default, the long-form verification results (equivalent to --report yaml-long) are stored in a file. This switch suppresses that behaviour.
--report-filename {string}
Store the long-form verification results in a file with the given name. If this option is not used, a unique filename is derived from the name of the input file.

5 Model Checking C and C++ Code via LLVM Bitcode

The traditional “explicit-state” model checking practice is not widely adopted in the programming community, and vice-versa, use of mainstream programming languages is not a common practice in the model checking community. Hence, model checking of systems expressed as C programs comes with some novelties for both kinds of users.

First of all, the current main application of DIVINE is verification of safety (and some liveness) properties of asynchronous, shared-memory programs. The typical realisation of such asynchronous parallelism is programming with threads and shared memory. Often for performance and/or familiarity reasons, programming with threads, shared memory and locking is the only viable alternative, even though the approach is fraught with difficulties and presents many pitfalls that can catch even expert programmers unaware – not to say novices. Sadly, resource locking is inherently non-compositional, hence there is virtually no way to provide a reliable yet powerful abstraction, all that while retaining speed and scalability.

Despite all its shortcomings, lock-based programming (or alternatively, lock-free shared memory programming, which is yet more difficult) is becoming more prevalent. Model checking provides a powerful tool to ascertain correctness of programs written with locks around concurrent access to shared memory. Most programmers will agree that bugs that show up rarely and are nearly impossible to reproduce are the worst kind to deal with. Sadly, most concurrency bugs are of this nature, since they arise from subtle interactions of nondeterministically scheduled threads of execution. A test-case may work 99 times out of 100, yet the 100th time die due to an unfathomable invalid memory access. Even sophisticated modern debugging tools like valgrind are often powerless in this situation.

This is where DIVINE can help, since it systematically and efficiently explores all relevant execution orderings, discovering even the subtlest race conditions, restoring a crucially important property of bugs: reproducibility. If you have ever changed a program and watched the test-case run in a loop for hundreds of iterations, wondering if the bug is really fixed, or it just stubbornly refuses to crash the program… well, DIVINE is the tool you have been waiting for.

Of course, there is a catch. Model checking is computationally intensive and memory-hungry. While this is usually not a problem with comparably small unit tests, applying model checking to large programs may not be feasible – depending on your use-case, and on the amount of memory and time that you have.

With the universal LLVM backend, DIVINE can support a wide range of compiled programming languages. However, out of the box, language-specific support is only provided for C and C++. A fairly complete ISO C runtime library is provided as part of DIVINE, with appropriate hooks into DIVINE, as well as an implementation of the pthread specification, i.e. the POSIX threading interface. Additionally, an implementation of the standard C++ library is bundled with DIVINE. Besides the standard library, DIVINE also provides an adapted version of the runtime library required by C++ programs to implement runtime type identification and exception handling; both are fully supported by DIVINE.

If your language interfaces with the C library, the libc part of language support can be re-used transparently. However, currently no other platform glue is provided for other languages. Your mileage may vary. Data structure and algorithmic code can be very likely processed with at most trivial additions to the support code, in any language that can be compiled into LLVM bitcode.

5.1 Compiling Programs

The first step to take when you want to use DIVINE for C/C++ verification is to compile your program into LLVM bitcode and link it against the DIVINE-provided runtime libraries. When you issue divine cc program.c, divine will compile the runtime support and your program and link them together automatically, using a builtin clang-based compiler. The cc subcommand accepts a wide array of traditional C compiler flags like -I, -std, -W and so on.

Alternatively, you can pass a single-file C or C++ program directly to other divine commands, like verify or sim, in which case the program will be compiled and linked transparently.

5.2 Limitations

When DIVINE interprets your program, it does so in a very strictly controlled environment, so that every step is fully reproducible. Hence, no real IO is allowed; the program can only see its own memory.

While the pthread interface is fully supported, the fork system call is not: DIVINE cannot verify multi-process systems at this time. Likewise, C++ exceptions are well-supported but the C functions setjmp and longjmp are not yet implemented.

The interpreter simulates an “in order” CPU architecture, hence any possible detrimental effects of instruction and memory access reordering won’t be detected. Again, an improvement in this area is planned for a future release.

5.3 State Space of a Program

Internally, DIVINE constructs the state space of your program – an oriented graph, whose vertices represent various states your program can reach, i.e. the values of all its mapped memory locations, the values in all machine registers, and so on. DIVINE normally doesn’t store (or construct) all the possible states – only a relevant subset. In a multi-threaded program, it often happens that more than one thread can run at once, i.e. if your program is in a given state, the next state it will get into is determined by chance – it all hinges on which thread gets to run. Hence, some (in fact, most) states in a parallel program can proceed to multiple different configurations, depending on a scheduling choice. In such cases, DIVINE has to explore both such “variant” successors to determine the behaviour of the program in all possible scenarios.

You can visualise the state space DIVINE explores by using divine draw – this will show how the “future” of your program branches through various configurations, and how it converges back into a common point – or, if its behaviour had changed depending on scheduling, diverges into multiple different outcomes.

5.4 Non-Deterministic Choice

Scheduling is not the only source of non-determinism in typical programs. Interactions with the environment can have different outcomes, even if the internal state of the program is identical. A typical example would be memory allocation: the same call in the same context could either succeed or fail, depending on the conditions outside the control of the program. In addition to failures, the normal input of the program (files, network, user input on the terminal) are all instances of non-determinism due to external influences. The sum of all such external behaviours that affect the outcome of the program is called the environment (this includes the scheduling of threads done by the operating system and/or CPU). When testing, the environment is controlled to some degree: the inputs are usually fixed, but eg. thread scheduling is (basically) random – it changes with every test execution. Resource exhaustion can be simulated in a test environment to some degree. In a model checker, the environment is controlled much more strictly: when a test case is run through a model checker, it will come out the same every time.

5.5 ω-Regular Properties and LTL

6 Interactive Debugging

DIVINE comes with an interactive debugger, available as divine sim. The debugger loads programs the same way verify and other DIVINE commands do, so you can run it on standalone C or C++ source files directly, or you can compile larger programs into bitcode and load that.

6.1 Tutorial

Let’s say you have a small C program which you wish to debug. We will refer to this program as program.c. To load the program into the debugger, simply execute

$ divine sim program.c

and DIVINE will take care of compiling and linking your program and load it, but will not execute it. Instead, sim will present a prompt to you, looking like this:

# executing __boot at /divine/src/dios/dios.cpp:79
>

The __boot function is common to all DIVINE-compiled programs and belongs to DiOS, our minimalist operating system. When debugging user-mode programs, the good first command to run is

> start

which will start executing the program until it enters the main function:

# a new program state was stored as #1
# active threads: [1:0]
T: starting program.c...
# a new program state was stored as #2
# active threads: [1:0]
# executing main at program.c:13
>

Here, we can already see a few DIVINE-specific features of the debugger. First, program states are stored and retained for future reference. Second, thread switching is quite explicit – every time a scheduling decision is made, sim informs you of this fact. We will look at how to influence these decisions later.

Now is a good time to familiarise ourselves with how to inspect the program. There are two commands for listing the program itself: source and bitcode, each will print the currently executing function in the appropriate form (the original C source code and LLVM bitcode, respectively). Additionally, when printing bitcode, the current values of all LLVM registers are shown as inline comments:

>>label %entry:
    %01 = alloca [i32 1 d]                                  # [const* 0 0 uu]
    %02 = call @pthread_create %01 [const* 0 0 dd]          # [i32 0 u]
               [code* 233 0 dd] [const* 0 0 dd] 
    call @__vm_interrupt_mem [global* 2d 0 dd] [i32 1 d] 
    %04 = load [global* 2d 0 dd]                            # [i32 0 u]

Besides printing the currently executing function, both source and bitcode can print code corresponding to other functions; in fact, by default they print whatever function the debugger variable $frame refers to. To print the source code of the current function’s caller, you can issue

> source @caller
    103 void __dios_main( int l, int argc, char **argv, char **envp ) noexcept {
    104     __vm_control( _VM_CA_Bit, _VM_CR_Flags, _VM_CF_Mask, _VM_CF_Mask );
    105     __dios_trace_t( "Dios started!" );
    106     __dios::runCtors();
    107     int res;
    108     switch (l) {
    109     case 0:
    110         __vm_control( _VM_CA_Bit, _VM_CR_Flags, _VM_CF_Mask, _VM_CF_None );
>>  111         res = main();
    112         break;
    113     case 2:
    114         __vm_control( _VM_CA_Bit, _VM_CR_Flags, _VM_CF_Mask, _VM_CF_None );
    115         res = main( argc, argv );
    116         break;
    [...]

To inspect data, we can instead use show and inspect. We have mentioned $frame earlier: there is, in fact, a number of variables set by sim. The most interesting of those is $_ which normally refers to the (usually) most interesting object at hand, typically the executing frame. By default, show will print the content of $_ (like many other commands). However, when we pass an explicit parameter to these commands, the difference between show and inspect becomes apparent: the latter also sets $_ to its parameter. This makes inspect more suitable for exploring more complex data, while show is useful for quickly glancing at nearby values:

> show
@address:     heap* 901e2 0+0
@pc:          code* 234 0
@location:    program.c:13
@symbol:      main
tid:
    @type:    pthread_t
related:      @caller

This is how a frame is presented when we look at it with show.

7 The DIVINE Virtual Machine

Programs in DIVINE are executed by a virtual machine (DIVINE VM or DiVM for short). The machine code executed by this virtual machine is the LLVM bitcode. The details of the “machine language” are, therefore, described in the LLVM Documentation.

7.1 Activation Frames

Unlike in ‘traditional’ implementations of C, there is no continuous stack in DiVM. Instead, each activation record (frame) is allocated from the heap and its size is fixed. Allocations that are normally done at runtime from the stack are instead done from the heap, using the alloca LLVM instruction. Additionally, since LLVM bitcode is in partial SSA form, what LLVM calls ‘registers’ are objects quite different from traditional machine registers. The registers used by a given function are bound to the frame of that function (they cannot be used to pass values around and they don’t need to be explicitly saved across calls). In the VM, this is realized by storing registers (statically allocated) in the activation record itself, along with DiVM-specific “program counter” register (this is an actual register, but is saved across calls automatically by the VM, see also Control Registers below) and a pointer to the caller’s activation frame. The header of the activation record has a C-compatible representation, available as struct _VM_Frame in divine.h.

7.2 Control Registers

The state of the VM consists of two parts, a set of control registers and the heap (structured memory). All available registers are described by enum _VM_ControlRegister defined in divine.h and can be manipulated through the __vm_control hypercall (see also The Hypercall Interface below). The registers are of two types, holding either an integer or a pointer. There are 2 integer registers, _VM_CR_Flags, which is used as a bitfield, and _VM_CR_User1 (the latter is not used or examined by the VM itself).

Four control registers govern address translation and normal execution:

Additional 4 registers are concerned with scheduling and interrupt control (for details, see Scheduling below):

Finally, there’s _VM_CR_FaultHandler (the address of the fault handler, see Faults) and a pair of user registers (_VM_CR_User1 and _VM_CR_User2).

The flags register (_VM_CR_Flags) is further broken down into individual bits, described by enum _VM_ControlFlags, again defined in divine.h. These are:

The remaining 3 flags indicate properties of the resulting edge in the state space (see also State Space of a Program). These may be set by the program itself or by a special monitor automaton, a feature of DiOS which enables modular specification of non-trivial (sequence-dependent) properties. These 3 flags are reset by the VM upon entering the scheduler (see Scheduling). The edge-specific flags are:

7.3 Heap

The entire persistent state of the VM is stored in the heap. The heap is represented as a directed graph of objects, where pointers stored in those objects act as the edges of the graph. For each object, in addition to the memory corresponding to that object, a supplemental information area is allocated transparently by the VM for tracking metadata, like which bytes in the object are initialised (defined) and a list of addresses in the object where pointers are stored.

Activation frames, global variables and even constants are all stored in the heap. The heap is also stored in a way that makes it quite efficient (both time- and memory-wise) for the VM to take snapshots and store them. This is how model checking and reversible debugging is realized in DIVINE.

7.4 Scheduling

The DIVINE VM has no intrinsic concept of threads or processes. Instead, it relies on an “operating system” to implement such abstractions and the VM itself only provides the minimum support necessary. Unlike with “real” computers, a system required to operate DiVM can be extremely simple, consisting of just 2 C functions (one of them is __boot, see Boot Sequence below). The latter of those is the scheduler, the responsibility of which is to organize interleaving of threads in the program to be verified. However, the program may not use threads but some other form of concurrency – it is up to the scheduler, which may be provided by the user, to implement the correct abstractions.

From the point of view of the state space (cf. State Space of a Program), the scheduler decides what the successors of a given state are. When DIVINE needs to construct successors to a particular state, it executes the scheduler in that state; the scheduler decides which thread to run (usually with the help of the non-deterministic choice operator) and transfers control to that thread (by changing the value of the _VM_CR_Frame control register, i.e. by instructing DIVINE to execute a particular activation frame). The VM then continues execution in the activation frame that the scheduler has chosen, until it encounters an interrupt. When DIVINE loads a program, it annotates the bitcode with interrupt points, that is, locations in the program where threads may need to be re-scheduled. When such a point is encountered, the VM sets the _VM_CF_Interrupted bit in _VM_CR_Flags and unless _VM_CF_Mask is in effect, an interrupt is raised immediately.

Upon an interrupt, the values of _VM_CR_IntFrame and _VM_CR_Frame are swapped, which usually means that the control is transferred back to the scheduler, which can then read the address of the interrupted frame from _VM_CR_IntFrame (this may be a descendant or a parent of the frame that the scheduler originally transferred control to, or may be a null pointer if the activation stack became empty).

Of course, the scheduler needs to store its state – for this purpose, it must use the _VM_CR_State register, which is set by __boot to point to a particular heap object. This heap object can be resized by calling __vm_obj_resize if needed, but the register itself is read-only after __boot returns. The object can be used to, for example, store pointers to activation frames corresponding to individual threads (but of course, those may also be stored indirectly, behind a pointer to another heap object). In other words, the object pointed to by _VM_CR_State serves as the root of the heap.

7.5 Faults

An important role of DIVINE is to detect errors – various types of safety violations – in the program. For this reason, it needs to interpret the bitcode as strictly as possible and report any problems back to the user. Specifically, any dangerous operations that would normally lead to a crash (or worse, a security vulnerability) are caught and reported as faults by the VM. The fault types that can arise are the following (enumerated in enum _VM_Fault in divine.h):

When a fault is raised, control is transferred to a user-defined fault handler (a function the address of which is held in the _VM_CR_FaultHandler control register). Out of the box, DiOS (see DiOS, DIVINE’s Operating System) provides a configurable fault handler. If a fault handler is set, faults are not fatal (the only exception is a double fault, that is, a fault that occurs while the fault handler itself is active). The fault handler, possibly with cooperation from the scheduler (see Scheduling above), can terminate the program, or raise the _VM_CF_Error flag, or take other appropriate actions.

The handler can also choose to continue with execution despite the fault, by transferring control to the activation frame and program counter value that are provided by the VM for this purpose. (Note: this is necessary, because the fault might occur in the middle of evaluating a control flow instruction, in which case, the VM could not finish its evaluation. The continuation passed to the fault handler is the best estimate by the VM on where the execution should resume. The fault handler is free to choose a different location.)

7.6 Boot Sequence

The virtual machine explicitly recognizes two modes of execution: privileged (kernel) mode and normal, unprivileged user mode. When the VM is started, it looks up a function named __boot in the bitcode file and starts executing this function, in kernel mode. The responsibility of this function is to set up the operating system and set up the VM state for execution of the user program. There are only two mandatory steps in the boot process: set the _VM_CR_Scheduler and the _VM_CR_State control registers (see above). An optional (but recommended) step is to inform the VM (or more specifically, any debugging or verification tools outside the VM) about the C/C++ type (or DWARF type, to be precise, as this is also possible for non-C languages) associated with the OS state. This is accomplished by an appropriate __vm_trace call (see also below).

7.7 The Hypercall Interface

8 DiOS, DIVINE’s Operating System

Programs traditionally rely on a wide array of services provided by their runtime environment (that is, a combination of libraries, the operating system kernel, the hardware architecture and its peripherals and so on). When DIVINE executes a program, it needs to provide these services to the program. Some of those, especially library functions, can be obtained the same way they are in a traditional (real) execution environment: the libraries can be compiled just like other programs and the resulting bitcode files can be linked just as easily.

The remaining services, though, must be somehow supplied by DIVINE, since they are not readily available as libraries. Some of them are part of the virtual machine, like memory management and interrupt control (cf. The DIVINE Virtual Machine). The rest is provided by an “operating system”. In principle, you can write your own small operating system for use with DIVINE; however, to make common verification tasks easier, DIVINE ships with a small OS that implements a subset of POSIX interfaces that should cover the requirements of a typical user-level program.

8.1 DiOS Compared to Traditional Operating Systems

The main goal of DiOS is to provide a runtime environment for programs under inspection, which should not be distinguishable from the real environment. To achieve this goal, it presents an API for thread (and process in the future) handling, faults and virtual machine configuration. DiOS API is then used to implement the POSIX interface and supports the implementation of standard C and C++ library.

However, there are a few differences to the real POSIX-compatible OSs the user should be aware of:

8.2 DiOS Fault Handling And Error Traces

When DIVINE 4 VM performs an illegal operation (e.g. division by zero or null pointer dereference), it performs a so-called fault and a user supplied function, the fault handler, is called. This function can react to the error - collect additional information or decide how the error should be handled. DiOS provides its own fault handler, so that verified programs do not have to.

The DiOS fault handler prints a simple human readable stack trace together with a short summary of the error. When a fault is triggered, it can either abort the verification or continue execution – depending on its configuration for a given fault. Please see the next section for configuration details.

Consider the following simple C++ program:

int main() {
    int *a = nullptr;
    *a = 42;
    return 0;
}

This program does nothing interesting, it just triggers a fault. If we execute it using ./divine run -std=c++11 test.cpp, we obtain the following output:

T: FAULT: null pointer dereference: [const* 0 0 ddp]
T: Fault in userspace: memory
T: Backtrace:
T:   1: main
T:   2: _start

By inspecting the trace, we can see that a fault was triggered. When VM triggers a fault, it prints the reason – here a null pointer dereference caused it. The encounter of an error caused a DiOS fault handler to be called. The fault handler first communicates whether the fault occurred in the DiOS kernel or in the user space – in the actual verified program. This is followed by a simple backtrace. Note that _start is a DiOS function, which is always at the bottom of a backtrace. It calls all global constructors, initializes the standard libraries and calls main with the right arguments.

If we run ./divine verify -std=c++11 test.cpp, we obtain a more detailed backtrace:

the error trace:
  FAULT: null pointer dereference: [const* 0 0 ddp]
  Fault in userspace: memory
  Backtrace:
    1: main
    2: _start

backtrace #1 [active stack]:
  @address: heap* 998dffb7 0+0
  @pc: code* 956 50
  @location: /divine/src/dios/fault.cpp:49
  @symbol: __dios::Fault::sc_handler(__dios::Context&, int*, void*, __va_list_tag*)

  @address: heap* 118ab8d1 0+0
  @pc: code* 95f 1
  @location: /divine/src/dios/fault.cpp:272
  @symbol: __sc::fault_handler(__dios::Context&, int*, void*, __va_list_tag*)

  @address: heap* e6dbccb9 0+0
  @pc: code* b11 16
  @location: /divine/include/dios/syscall.hpp:50
  @symbol: __dios::Syscall::handle(__dios::Context*)

  @address: heap* 11 0+0
  @pc: code* a75 26
  @location: /divine/include/dios/scheduling.hpp:185
  @symbol: void __dios::sched<false>()

backtrace #2:
  @address: heap* 9aeef2fc 0+0
  @pc: code* a8c 17
  @location: /divine/include/dios/syscall.hpp:43
  @symbol: __dios::Syscall::trap(int, int*, void*, __va_list_tag (&) [1])

  @address: heap* c149bfb0 0+0
  @pc: code* 982 b
  @location: /divine/src/dios/syscall.cpp:16
  @symbol: __dios_syscall

  @address: heap* 12 0+0
  @pc: code* 958 16
  @location: /divine/src/dios/fault.cpp:76
  @symbol: __dios::Fault::handler(_VM_Fault, _VM_Frame*, void (*)(), ...)

  @address: heap* 32250330 0+0
  @pc: code* c58 2
  @location: drt_dev/test_error.cpp:3
  @symbol: main

  @address: heap* ea18a3ba 0+0
  @pc: code* 96b 7
  @location: /divine/src/dios/main.cpp:152
  @symbol: _start

The first part of the output is the same as in the previous case, the rest are full DIVINE backtraces for all the active threads. In this concrete example, we see two backtraces even for a single threaded program. To see the reason for this, let us first examine backtrace #2. If we go through the backtrace from its end, we can see an invocation of _start, which called main. main was then interrupted on line 3 because of the fault. Next, we can see a call to the DiOS fault handler __dios::Fault::handler. As the handler is called from user space and, consequently, is isolated from the internal state of DiOS, it performs a syscall to jump into the kernel, as we can see from the first two entries of the backtrace.

All DiOS kernel routines, including scheduling and syscalls, feature a separate call stack – this is the reason why we observe two backtraces. Therefore, backtrace #1 contains a scheduler and syscall machinery after which fault handler is called in kernel mode and can do further inspection.

8.3 DiOS Configuration

DIVINE supports passing of boot parameters to the OS. Some of these parameters are automatically generated by DIVINE (e.g. the name of the program, program parameters or a snapshot of a file system), others can be supplied by the user. These parameters can be specified using the command line option -o {argument}.

DiOS expects {argument} to be in the form {command}:{argument}. DiOS help can be also invoked with a shortcut -o help. All arguments are processed during the boot phase, so if an invalid command or argument is passed, DiOS fails to boot. DIVINE handles this state as unsuccessful verification and the output contains a description of the error. DiOS supports the following commands:

8.4 DiOS Virtual File System

DiOS provides a POSIX-compatible filesystem implementation. As there are no real I/O operations allowed, DiOS Virtual File System (VFS) operates on top of a filesystem snapshot. The effect of all operations performed by the VFS is not propagated back to the host. Snapshots are created by DIVINE just before DiOS boots; to create a snapshot of a directory, the command line option --capture {vfsdir} can be used, where {vfsdir} is a :-separated list of up to three options:

- Path to a directory. Mandatory.
- `follow` or `nofollow` specifies whether symlink targets should or should
  not be captured. Default is symlink following. Optional.
- Mount point in the VFS. If not specified, directory is mounted to the
  capture path.

DIVINE can handle capture of files, directories, sym- and hardlinks, DiOS can also create pipes or UNIX sockets. These, however, cannot be captured from the host system by DIVINE.

The size of the snapshot is by default limited to 16 MiB, to prevent an accidental capture of much larger portions of the filesystem than intended (due to the presence of symlinks).

Example: divine verify --capture testdir:follow:/home/test/ --vfslimit 1kB test.cpp

The VFS implements basic POSIX syscalls – write, read, etc. This allows DiOS to directly use implementations of C-library functions like printf, scanf or C++ streams. Clearly, functions which operate on the filesystem modify the internal filesystem snapshot of DiOS. The user can further specify a file (containing some data) using the DIVINE switch --stdin {file}, which will be supplied as stdin of the verified program. The standard output can be handled in several ways:

All of the above options can be specified via DiOS boot parameters. See DiOS configuration for more details.


  1. this information is run-length encoded (e.g. 0^2 2^4 means the first two choices returned 0 and the following four returned 2);