Codex is an OCaml library to help writing sound static analyses by abstract interpretations, such as BINSEC/Codex (for analysis of machine code using the BINSEC platform) or Frama-C/Codex (for analysis of C code using the Frama-C platform).

To illustrate what Codex can do, here is the output of Codex for the abs function:

You can hover on parts of the code above to see the possible values of each expression. Observe that initially, the value of x is unknown (is in the interval [--..--] meaning all the possible values), but its value is refined after the x<0 test in each branch. Codex performs a sound analysis, meaning that the values attached to each program variable is a superset of the real values in the program. In other words, when we display that x is in the interval [0..0x7FFFFFFF], then we guarantee that x cannot be negative there.

An important application of such analysis is to verify the absence of runtime errors. It is easy, even for a seasoned C programmer, to fall into one of the many pitfalls (undefined behaviour) of the C programming language, be it well-known ones such as buffer overflows, null pointer dereferences, and division by zero, or lesser known ones like signed integer overflow or absence of compliance to the C strict aliasing rules. Programs with these errors risk being miscompiled, to crash, to silently corrupt some data, or be vulnerable to cybersecurity attacks.

Codex can be used to automatically prove the absence of such defects, or warn if it finds one. The seemingly simple abs function has such a defect: the value INT_MIN = -0x80000000 cannot be negated as 0x80000000 cannot fit inside a int value. As a result, Codex displays an alarm warning about this at the top right of the analysis output (the list of alarms that Codex looks at is configurable).

This error only appears if abs is given the value INT_MIN, so we may consider that the problem does not reside in the abs implementation, but in any function that would fail to call abs properly. Codex allows to encode such contracts between abs and its callers using types specifications. For instance, a possible contract for abs could be

(int with self >= 0) abs((int with self >= 0-1000) x);

which states that if abs is provided with a value higher than -1000, then abs returns a positive number. Observe the analysis output for abs when provided with this contract: the error is now gone.

It is important to understand what “the error is gone” means. Codex is not a heuristic bug finding tool (like compiler warnings), that may miss some alarms. The fact that the signed overflow alarm is not displayed anymore is a proof that such an error cannot happen. In other words, Codex can automatically compute a proof of absence of runtime errors; it can, for instance prove that a program is memory-safe, or return a list of lines where there may be possible errors. As an example, Codex has been used to verify absence of privilege escalation (which includes proving memory safety or control-flow integrity) of a small industrial embedded OS kernel, directly from its executable (see here for more details).

The tool is currently a research prototype, but we believe that it would be very useful to C or systems programmers, which is why we are working on making it more mature (writing better documentation, improving user interface, etc.). Stay tuned for future updates, or just contact us if you are interested or want to help!