Module Operator

Operator defines the syntax and semantics of the expressions of the internal languages used by the Codex analyzers. It defines operations such as addition (on bitvectors or integers), logical/bitwise and (on booleans, integers and bitvectors), etc. In addition, it contains several utility modules (such as pretty-printers) for code dealing with these operators.

Most of our passes do not use a term representation of an AST, but instead call "constructor functions" that manipulate expressions, similarly to the tagless-final approach of Carette, Kiselyov and Shan (2009). Thus, "syntax" here means that we define signatures (see Syntax: signature of operators).
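To illustrate the tagless-final style in general terms, here is a minimal sketch. The signature BOOL_EXPR and the modules Eval, Show and Example are hypothetical names chosen for this example; they are not part of Operator. The point is only that an expression is written once against a signature and then given several meanings by different implementations.

```ocaml
(* Hypothetical signature: illustrative only, not the Codex API. *)
module type BOOL_EXPR = sig
  type t
  val true_ : t
  val false_ : t
  val not_ : t -> t
  val and_ : t -> t -> t
end

(* One interpretation: evaluate directly to OCaml booleans. *)
module Eval : BOOL_EXPR with type t = bool = struct
  type t = bool
  let true_ = true
  let false_ = false
  let not_ a = not a
  let and_ a b = a && b
end

(* Another interpretation: render the expression as a string. *)
module Show : BOOL_EXPR with type t = string = struct
  type t = string
  let true_ = "true"
  let false_ = "false"
  let not_ a = "!" ^ a
  let and_ a b = "(" ^ a ^ " && " ^ b ^ ")"
end

(* The expression is built by calling constructor functions;
   no AST datatype is involved. *)
module Example (E : BOOL_EXPR) = struct
  let e = E.and_ E.true_ (E.not_ E.false_)
end

module Evaluated = Example (Eval)
module Shown = Example (Show)
```

Here Evaluated.e is an OCaml bool and Shown.e is a string, both obtained from the same definition of e.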

We also provide a tag that can be used in an AST representation of the language (module Function_symbol).

We define a concrete semantics of these operators in module Concrete, which can be used to interpret constant terms.

Finally, Conversions contains helpers for domain transformations.

Unique identifiers

module Malloc_id : sig ... end

Unique identifiers for malloc sites, which eventually cover all allocations in a C program. Each identifier also carries a string that serves as a convenient name for the allocation.

module MakeId () : sig ... end

Generative functor to create unique ids.
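As a rough sketch of what a generative functor for identifiers can look like: the signature below (fresh, equal, to_int) is an assumption made for this example, and the actual signature of MakeId may differ. Each application of the functor produces a new abstract type with its own counter.

```ocaml
(* Hypothetical sketch: the real MakeId signature may differ. *)
module MakeId () : sig
  type t
  val fresh : unit -> t
  val equal : t -> t -> bool
  val to_int : t -> int
end = struct
  type t = int
  (* Each functor application gets its own counter. *)
  let counter = ref 0
  let fresh () = incr counter; !counter
  let equal = Int.equal
  let to_int x = x
end

(* Two applications yield two incompatible identifier types. *)
module Var_id = MakeId ()
module Block_id = MakeId ()

let v1 = Var_id.fresh ()
let v2 = Var_id.fresh ()
let b1 = Block_id.fresh ()

(* Var_id.equal v1 b1 would be rejected by the type checker,
   because Var_id.t and Block_id.t are distinct abstract types. *)
```

The design benefit is that identifiers from different namespaces cannot be mixed up by accident, even though both are plain integers underneath.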

module Condition : sig ... end

We want a choose operation on sets (which normally selects an arbitrary element of a set), but we also want to be able to tell whether two distinct choose(S) operations selected the same element or not.

module Choice : sig ... end

Alarms

module Alarm : sig ... end

In the concrete semantics, an alarm would correspond to an exception or panic raised by a partial operator.
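The correspondence between partial operators and alarms can be sketched as follows. The type alarm, its constructor, and the function checked_div are hypothetical and do not reflect the contents of the Alarm module; the sketch only contrasts raising an exception with reporting an alarm as a value.

```ocaml
(* Hypothetical alarm type: not the constructors of Codex's Alarm module. *)
type alarm = Division_by_zero_alarm

(* Concrete execution: integer division is partial and raises
   the standard Division_by_zero exception when b = 0. *)
let concrete_div a b = a / b

(* Analysis-style variant: the failing case is reported as an
   alarm value instead of interrupting execution. *)
let checked_div a b : (int, alarm) result =
  if b = 0 then Error Division_by_zero_alarm else Ok (a / b)
```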

module Flags : sig ... end

Syntax: signature of operators

module Sig : sig ... end
include module type of struct include Sig end

This defines the syntax of the operators usable in the internal languages of Codex, expressed as signatures as in the tagless-final approach.

The signatures are grouped by the type of values manipulated (boolean, integer, bitvector, binary, memory, enum). We define two sets of functions: the forward functions are the normal operations, and the backward functions exclude the functions of arity 0 (for which a backward operation is meaningless).

type access_type =
  | Read  (* Loading from memory. *)
  | Write  (* Storing into memory. *)
type arith_type =
  | Plus  (* Addition operation *)
  | Minus  (* Subtraction operation *)
module type ARITY = Sig.ARITY

Arity of function symbols. 'r represents the result type and 'a, 'b, 'c the arguments.

module Forward_Arity = Sig.Forward_Arity

Standard arities for forward transfer functions: given the arguments, return the results. These match the arities of the concrete functions they represent (but with concrete types substituted for their abstract counterparts).

module Backward_Arity = Sig.Backward_Arity

Standard arities for backward transfer functions (used to refine the arguments from information on the result values). These take the result value 'r as argument and return a new, improved value for each argument. They return None when no improvement is possible for that argument.

Note: in the following, we distinguish between backward and forward because there is no need to implement backward transfer functions for symbols with arity 0.
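The difference between the two kinds of arity can be sketched on a toy interval domain. The modules Forward and Backward, the type itv, and all functions below are hypothetical and only modelled on the description above. In particular, the assumption that the backward function receives the current arguments before the result is ours.

```ocaml
(* Hypothetical arity modules, modelled on the prose description. *)
module Forward = struct
  type ('a, 'b, 'r) ar2 = 'a -> 'b -> 'r
end

module Backward = struct
  (* Assumption: current arguments first, then the result; the output
     holds an optional refinement for each argument. *)
  type ('a, 'b, 'r) ar2 = 'a -> 'b -> 'r -> 'a option * 'b option
end

(* A toy abstract domain: closed integer intervals (lo, hi). *)
type itv = int * int

let iadd : (itv, itv, itv) Forward.ar2 =
  fun (a_lo, a_hi) (b_lo, b_hi) -> (a_lo + b_lo, a_hi + b_hi)

(* Intersection of two intervals (assumed non-empty in this sketch). *)
let meet (lo1, hi1) (lo2, hi2) = (max lo1 lo2, min hi1 hi2)

(* None when the candidate does not improve on the old value. *)
let refine old candidate =
  let r = meet old candidate in
  if r = old then None else Some r

let iadd_backward : (itv, itv, itv) Backward.ar2 =
  fun ((a_lo, a_hi) as a) ((b_lo, b_hi) as b) (r_lo, r_hi) ->
    (* From r = a + b we get a = r - b and b = r - a. *)
    let a' = refine a (r_lo - b_hi, r_hi - b_lo) in
    let b' = refine b (r_lo - a_hi, r_hi - a_lo) in
    (a', b')
```

For instance, knowing that the sum of two values in (0, 10) lies in (0, 5) refines both arguments to (0, 5), whereas a result of (0, 20) gives no improvement and yields None for both.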

Boolean transfer functions

Transfer functions for boolean values: not, and (&&), or (||), as well as the constants true_ and false_.

module type BOOLEAN_BACKWARD = Sig.BOOLEAN_BACKWARD
module type BOOLEAN_FORWARD = Sig.BOOLEAN_FORWARD

Integer transfer functions

Transfer functions for unbounded integers:

  • addition (iadd); subtraction (isub);
  • multiplication (imul in general, itimes when multiplying by a constant);
  • division (idiv), remainder (imod);
  • comparisons (ieq for ==, ile for <=);
  • shifts (left ishl and right ishr);
  • bitwise operations (ior, iand, ixor).

For the bitwise operations, we assume an infinite two's-complement representation: i.e. -1 is represented by an infinite sequence of 1 bits, and 0 by an infinite sequence of 0 bits.
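The consequences of this convention can be checked on OCaml's native int type, which is finite but follows the same two's-complement rules. This is only an illustration; it does not use the unbounded integers of the analyzer, and the names all_ones and low_byte are ours.

```ocaml
(* Native ints are finite, but obey the same two's-complement rules. *)
let all_ones = -1

(* Masking keeps only the low 8 bits. *)
let low_byte x = x land 0xFF

let () =
  (* Complementing 0 (all bits 0) gives -1 (all bits 1). *)
  assert (lnot 0 = all_ones);
  (* The low bits of -1 are all set. *)
  assert (low_byte all_ones = 0xFF);
  (* Or-ing with -1 always yields -1. *)
  assert (all_ones lor 0x10 = all_ones);
  (* Xor-ing with -1 is bitwise complement. *)
  assert (all_ones lxor 5 = lnot 5);
  (* An arithmetic right shift of -1 keeps shifting in 1 bits. *)
  assert (all_ones asr 3 = all_ones)
```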

module type INTEGER_BACKWARD = Sig.INTEGER_BACKWARD
module type INTEGER_FORWARD_MIN = Sig.INTEGER_FORWARD_MIN
module type INTEGER_FORWARD = Sig.INTEGER_FORWARD

Bitvector transfer functions

Purely numerical operations on fixed-size bitvectors. Includes bitwise operations and arithmetic, but not pointer arithmetic.

Note: the size argument is generally the size of both arguments and the result.
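As a sketch of how a size argument typically governs such operations: the helpers below (mask, biadd, bisub, to_signed) are hypothetical, represent bitvectors as non-negative native ints, and assume a size well below the native word size. They are not the Codex implementation.

```ocaml
(* Hypothetical helpers: bitvectors are non-negative ints, and size
   is assumed to be small enough to fit in a native int. *)
let mask ~size = (1 lsl size) - 1

(* Both arguments and the result are size bits wide, so the sum
   wraps around modulo 2^size. *)
let biadd ~size a b = (a + b) land mask ~size

let bisub ~size a b = (a - b) land mask ~size

(* Signed reading of a size-bit value. *)
let to_signed ~size x =
  if x land (1 lsl (size - 1)) <> 0 then x - (1 lsl size) else x
```

With size 8, adding 200 and 100 wraps around to 44, and subtracting 1 from 0 gives 255, which reads as -1 when interpreted as signed.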

module type BITVECTOR_BACKWARD = Sig.BITVECTOR_BACKWARD
module type BITVECTOR_FORWARD = Sig.BITVECTOR_FORWARD
module type BITVECTOR_FORWARD_WITH_BIMUL_ADD = Sig.BITVECTOR_FORWARD_WITH_BIMUL_ADD

Binary transfer functions

Binary is the name of values handled by C or machine-level programs, i.e. either numeric bitvectors or pointers.

module type BINARY_BACKWARD = Sig.BINARY_BACKWARD
module type BINARY_FORWARD = Sig.BINARY_FORWARD
module type OFFSET_BACKWARD = Sig.OFFSET_BACKWARD
module type OFFSET_FORWARD = Sig.OFFSET_FORWARD
module type BLOCK_BACKWARD = Sig.BLOCK_BACKWARD
module type BLOCK_FORWARD = Sig.BLOCK_FORWARD

Enum transfer functions

Transfer functions for enum values. Enums are types with a fixed (small) number of possible cases.

module type ENUM_BACKWARD = Sig.ENUM_BACKWARD
module type ENUM_FORWARD = Sig.ENUM_FORWARD

Memory transfer functions

module type MEMORY_BACKWARD = Sig.MEMORY_BACKWARD
module type MEMORY_FORWARD = Sig.MEMORY_FORWARD

Concrete (reference) implementation giving a meaning to operators

module Concrete : sig ... end

Concrete interpreter using OCaml booleans and Z.t for values.

Function symbols

module Function_symbol : sig ... end

Conversions

module Conversions : sig ... end

Functors that change the arities of transfer function signatures (i.e. replace ar0 with a new ar0). These are "conversions" in the sense that they pass the same transfer functions through with minimal changes (currently: with the same types for dimension identifiers).

Automatic logging

Similar to Conversions: converts transfer functions into equivalent ones that also log their calls.

module Autolog : sig ... end

These functors allow automatic logging of transfer functions. You define how to handle functions of different arities and how to print values of different types, and you can then automatically log transfer functions of a given signature (the functor names correspond to this signature).
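The general idea can be sketched as follows. Everything in this example is hypothetical: the reduced BOOLEAN_FORWARD signature, the global log buffer, and the functor Log_boolean_forward are simplifications written for illustration and do not reproduce the Autolog interface.

```ocaml
(* Hypothetical, reduced signature: not the real Sig.BOOLEAN_FORWARD. *)
module type BOOLEAN_FORWARD = sig
  type boolean
  val true_ : boolean
  val not_ : boolean -> boolean
  val and_ : boolean -> boolean -> boolean
end

module Concrete_bool : BOOLEAN_FORWARD with type boolean = bool = struct
  type boolean = bool
  let true_ = true
  let not_ a = not a
  let and_ a b = a && b
end

(* Log entries, most recent first. *)
let log : string list ref = ref []

(* How to print values of the boolean type. *)
module Bool_printer = struct
  type boolean = bool
  let pp = string_of_bool
end

(* Wraps every transfer function of F so that each call is logged. *)
module Log_boolean_forward
    (P : sig type boolean val pp : boolean -> string end)
    (F : BOOLEAN_FORWARD with type boolean = P.boolean)
  : BOOLEAN_FORWARD with type boolean = P.boolean = struct
  type boolean = P.boolean
  (* One generic wrapper per arity. *)
  let ar1 name f a =
    let r = f a in
    log := Printf.sprintf "%s(%s) = %s" name (P.pp a) (P.pp r) :: !log;
    r
  let ar2 name f a b =
    let r = f a b in
    log := Printf.sprintf "%s(%s, %s) = %s" name (P.pp a) (P.pp b) (P.pp r)
           :: !log;
    r
  let true_ = F.true_
  let not_ = ar1 "not" F.not_
  let and_ = ar2 "and" F.and_
end

module Logged = Log_boolean_forward (Bool_printer) (Concrete_bool)

let result = Logged.and_ Logged.true_ (Logged.not_ Logged.true_)
```

The wrapped module has the same signature as the original, so it can be substituted wherever the unlogged transfer functions were used.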