SAFECode Users Guide

Written by the LLVM Research Group

Overview

The SAFECode compiler is a memory safety compiler built using the LLVM Compiler Infrastructure and the Clang compiler driver. A memory safety compiler is a compiler that inserts run-time checks into a program during compilation to catch memory safety errors at run-time. Such errors can include buffer overflows, invalid frees, and dangling pointer dereferences.

With additional instrumentation to track debugging information, a memory safety compiler can be used to find and diagnose memory safety errors in programs (this functionality is similar to what Valgrind does).

This manual will show how to compile a program with the SAFECode compiler and how to read the diagnostic output when a memory safety error occurs.

Compiling a Program with SAFECode

The easiest way to use SAFECode is to use the modified version of Clang that comes in the SAFECode distribution. When used in this way, the SAFECode transforms are performed transparently by the Clang compiler.

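To activate the SAFECode transforms during compilation, add the -fmemsafety command-line option to the Clang command line, for example "clang -g -fmemsafety -c -o file.o file.c" (file.c and file.o are placeholders for your own files).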

You may also need to add a -L$PREFIX/lib option to the link command line to indicate where the libraries are located; $PREFIX is the directory into which SAFECode was installed.

Finally, you may want to use SAFECode's whole-program analysis features; these features allow SAFECode to detect more memory safety errors and to optimize its run-time checks. To enable them, add the -flto option to the command line (and -use-gold-plugin if you are using SAFECode on Linux).

That's it! Be sure to compile with the -g option as well; it generates debugging information that the SAFECode transforms can use to enhance their run-time checks. The -fmemsafety-logfile option can be used to specify a file into which memory safety errors are recorded (by default, they are printed to standard error).

To configure an autoconf-based software package to use SAFECode, do the following:

  1. Set the environment variable CC to $PREFIX/clang.
  2. Set the environment variable CXX to $PREFIX/clang++.
  3. Set the environment variable CFLAGS to "-g -fmemsafety -flto".
  4. Set the environment variable LDFLAGS to "-L$PREFIX/lib", where $PREFIX is the directory into which SAFECode was installed.
  5. Run the configure script.
  6. Type "make" to compile the source code.

Note that some configure scripts may not use the LDFLAGS variable properly. If the above directions do not work, try setting CFLAGS to "-g -fmemsafety -L$PREFIX/lib".

Sample Debugging with SAFECode

Let's say that we have the following C program:

  1 #include "stdio.h"
  2 #include "stdlib.h"
  3 
  4 int
  5 foo (char * bar) {
  6   for (unsigned index = 0; index < 10; ++index)
  7     bar[index] = 'a';
  8   return 0;
  9 }
 10 
 11 int
 12 main (int argc, char ** argv) {
 13   char * array[100];
 14   int max = atoi (argv[1]);
 15 
 16   for (int index = max; index >= 0; --index) {
 17     array[index] = malloc (index+1);
 18   }
 19 
 20   for (int index = max; index >= 0; --index) {
 21     foo (array[index]);
 22   }
 23 
 24   exit (0);
 25 }

Lines 16-18 allocate character arrays of decreasing size, from the user-specified argument plus one characters down to a single character. Lines 20-22 then call the function foo() on each array; foo() writes to elements 0-9 regardless of the array's actual size, so every array shorter than 10 characters is overflowed.

If we compile this program with SAFECode and execute it, we'll get the following error report:

=======+++++++    SAFECODE RUNTIME ALERT +++++++=======
= Error type                            : Load/Store Error
= Faulting pointer                      : 0x100100679
= Program counter                       : 0x100002493
= Fault PC Source                       : /Users/criswell/tmp/safecode/test/test.c:7
=
= Object allocated at PC                : 0x100002ad6
= Allocated in Source File              : /Users/criswell/tmp/safecode/test/test.c:17
= Object allocation sequence number     : 3
= Object start                          : 0x100100670
= Object length                         : 0x9

The first thing to note is the error type. SAFECode is reporting a load/store error, meaning that a load or store is trying to access a memory location that it should not. The second thing to note is the faulting pointer field (0x100100679), which tells us the pointer value that was invalid. SAFECode reports that the error occurred on line 7 of test.c; this is what we expect because that is the line in foo() that accesses out-of-bounds memory.

Now look at the "Object start" and "Object length" fields in the report:

= Object start                          : 0x100100670
= Object length                         : 0x9

Because this is a load/store error, SAFECode is telling us that a pointer started out within the bounds of the memory object starting at 0x100100670 with length 0x9, went out of bounds to 0x100100679 (the first byte past the end of the object), and was subsequently dereferenced. SAFECode can do this using a technique called pointer rewriting: when a pointer goes out of bounds, SAFECode changes it to point to a reserved, unmapped region in the program's virtual address space. SAFECode tracks enough metadata about memory objects that, if it ever detects a dereference of a rewritten pointer (as in the example above), it can report the bounds of the original object from which the pointer came. SAFECode can also tell us the source line at which the memory object was allocated:

= Allocated in Source File              : /Users/criswell/tmp/safecode/test/test.c:17

Finally, notice the allocation sequence number:

= Object allocation sequence number     : 3

A particular source line may be executed multiple times and allocate many objects. The sequence number above tells us that the memory object was the third memory object allocated at the allocation site in main(). This information could be used, for example, to set a breakpoint at the memory allocation site that doesn't trigger until the memory object experiencing the error is actually allocated.