Note: this section was written on 6/29/01. CCured has changed since
then.
Once you download the sources for a new package, you can run the translator
on it. To create a Makefile target for a package you typically add to
cil/Makefile a target that just invokes make on the package's own Makefile
with the CC variable bound to “ccured --merge”. I'll use the 6/29/01 version
of ftpd as my example.
First we try to run it without our tool involved:
% make ftpd-clean
% cd test/ftpd/ftpd
% make
This succeeds, generating an 'ftpd' binary. In the case of 'ftpd',
running it is slightly complicated:
% ./ftpd -D -d -p 3333
(then in another window)
% telnet localhost 3333
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 madrone.cs.berkeley.edu FTP server (Version 6.5/OpenBSD, linux port 0.3.2) ready.
(etc)
This is (to our way of measuring) success.
Now we try it in 'cil' mode:
% cd cil
% make ftpd-clean
% make ftpd
At the moment, this also works, producing another 'ftpd' binary. We test
it the same way, and rejoice at its success.
Finally, we dare to try it in 'box' mode, meaning the instrumentation
module will be used:
% make ftpd-clean
% make ftpd INFERBOX=4
After crunching for a while, it reports this error (you have to scroll
back a bit to see the right one):
./ls_all.c:1338: Bug: Calling non-wild ioctl with too many args
This is an error from the 'box' module, complaining about what it
perceives to be a type error. If we investigate the named source
line, we see
if(ioctl(1, 0x5413, & win) == 0 && win.ws_col > 0)
confirming that 'ioctl' is involved. Since the *_all.c files are the
output of our tool, and do not themselves #include any files, we can
simply search in this file for ioctl's declaration. We do so, and see
extern int ioctl(int __fd , unsigned long int __request , ...) ;
Hmmm... looks like it was declared to accept any number (>=2) of args, so
this looks like a bug in the 'box' module; it should accept this code,
but it does not.
The next step is to write a tiny C program which calls ioctl (see
test/small2/ioctl.c), and verify it fails the same way
% make scott/ioctl INFERBOX=4
[...]
ioctl.c:9: Bug: Calling non-wild ioctl with too many args
[...]
Yep, same problem. Now we report this to George, who wrote the 'box'
module and is typically much faster at identifying the problem.
In the meantime (waiting for George to magically fix the problem), we
could temporarily comment-out the call so we can proceed to find other
bugs. Or, perhaps we change the ioctl call to instead call a wrapper
function (wrappers are defined in lib/ccuredlib.c, which gets linked into
the translated program).
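One way to picture the wrapper workaround: hide the variadic ioctl behind a fixed-arity function, so the translated code never makes a three-argument call to ioctl itself. The sketch below is only an assumption about what such a wrapper could look like. The name ioctl_get_winsize is hypothetical and is not taken from lib/ccuredlib.c, and TIOCGWINSZ is the symbolic name for the 0x5413 request on x86 Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>   /* ioctl, struct winsize, TIOCGWINSZ */

/* Hypothetical fixed-arity wrapper around the variadic ioctl().
 * It supports exactly one request (TIOCGWINSZ) with a typed argument. */
int ioctl_get_winsize(int fd, struct winsize *win)
{
    assert(win != NULL);               /* caller must supply storage */
    return ioctl(fd, TIOCGWINSZ, win); /* 0 on success, -1 on failure */
}
```

With a wrapper of this shape, the call in ls_all.c would read ioctl_get_winsize(1, &win) == 0 && win.ws_col > 0.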
Eventually (see "make go INFERBOX=4") we'll get an executable. If it
runs correctly, celebration is in order. If not, it will usually fail
because of a failed runtime check (this one is from a test vector for
which go fails):
% make go INFERBOX=4
% cd test/spec/099.go/src
% ./go 5 4
[...]
array bug: index is 5980 (vs 5980)
Failure: Ubound
Abort
Tracking down the source of such failures is the most time-consuming
part of pushing a program through. Sometimes it's a bug in the
translator, in which case ideally a test case can be isolated for easy
diagnosis.
Sometimes (more and more often) it's a bug in the original program (go
had 10 array bounds violation bugs at last count). In this case you
have to change the original code to fix the bug; this may be easy or
hard. If it's hard, try just surrounding the offending statement with
an explicit bounds check in an 'if' statement, so the program skips
the bad statements (that is what I did to cause all of the "array bug"
outputs in the "5 4" case above).
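The explicit bounds check described above can be sketched as follows. This is not the actual patch made to go. The array, its size, and the function name are hypothetical, and the message format only imitates the "array bug" output shown earlier (what the second number meant in the real patch is an assumption).

```c
#include <stdio.h>

#define TABLE_SIZE 16            /* hypothetical array length */
static int table[TABLE_SIZE];

/* Store `value` at `index`, skipping the store when the index is out of
 * bounds. Returns 1 if the store happened, 0 if it was skipped. */
int guarded_store(int index, int value)
{
    if (index < 0 || index >= TABLE_SIZE) {
        /* Report the bad index instead of letting the runtime check abort. */
        fprintf(stderr, "array bug: index is %d (vs %d)\n", index, TABLE_SIZE);
        return 0;
    }
    table[index] = value;
    return 1;
}
```

The program keeps running past the bad statement, which is useful for finding the next problem but does not fix the underlying bug.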
C.2 Writing Wrappers Manually
To interface with external code, you are usually better off using the automatic
wrapper system described in Chapter 8. However if that isn't possible,
you'll need to write a wrapper directly in C:
- Step 1: Consult the name-mangling algorithm documented in Section 8.1
  and ccuredlib.c to decode the required types of the parameters.
- Step 2: Determine the semantics of the function being wrapped (e.g.,
  if it's a unix libc call, consult its man page). In particular,
  find out how memory passed via pointers is accessed (read and
  written).
- Step 3: Write the wrapper, and make calls to the verification
  and query functions in the section in ccuredlib.c titled
  “general-purpose”. If the function manipulates wild pointers,
  be sure to update tags; conversely, if no wild pointers are
  involved, there are no tags to worry about.
Good examples to consult (in ccuredlib.c) include read_w,
fgets_ffw, stat_ww, strcat_www, memmove_www.
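To make the three steps concrete, here is a minimal sketch of a wrapper for read(2). It does not use the real ccuredlib.c helpers or CCured's real pointer layout: the fat_char_ptr type, the fat_has_room check, and the name read_checked are all hypothetical stand-ins, chosen only to show where the verification goes relative to the underlying call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Simplified stand-in for a fat pointer: the pointer plus the end of its
 * home area. Not CCured's real definition. */
typedef struct {
    char *p;     /* current position */
    char *end;   /* one past the last valid byte */
} fat_char_ptr;

/* Hypothetical verification helper: are `n` bytes available at buf.p? */
static int fat_has_room(fat_char_ptr buf, size_t n)
{
    return buf.p != NULL && buf.p <= buf.end
        && n <= (size_t)(buf.end - buf.p);
}

/* Step 2: read(2) writes up to `count` bytes through the pointer.
 * Step 3: verify that much room exists, then call the real function. */
ssize_t read_checked(int fd, fat_char_ptr buf, size_t count)
{
    if (!fat_has_room(buf, count)) {
        errno = EFAULT;   /* this sketch refuses the call; it does not abort */
        return -1;
    }
    return read(fd, buf.p, count);
}
```

No wild pointers appear here, so there are no tags to update. A wrapper that accepts wild pointers would need that extra bookkeeping.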
C.3 Apache Modules
This section applies to CCured as of July 2002.
C.3.1 Introduction
This writing assumes Apache 1.3.19 and an x86/Linux system. Apache is an
open source web server that has the ability to dynamically load third-party
modules. Modules can examine and alter HTTP requests and also examine and
alter the webserver replies. For example, a compression module might
examine the HTTP request to see if it contains the “Accept-Encoding:
gzip” header. If it does, it might alter the HTTP reply, replacing the text of
the webpage body with a compressed version of that text. Modules can be
configured (via a file called httpd.conf) so that their behavior is
limited to a certain location or directory.
Apache modules share the same address space as the Apache webserver: no
software fault isolation is present. As a result, if the module crashes it
brings down that webserver (although Apache is usually configured to
immediately spawn a new webserver thread to replace the fallen one). More
distressingly, a module with a security violation (for example, a
format-string bug) can allow remote users to gain shell access to the
webserver machine (one version of mod_php3 features such a
vulnerability: CCured prevents that vulnerability).
Apache modules are typically single files with a fairly standard naming
convention: mod_foo.c is the foo module, where foo ranges over fairly
descriptive keywords like gzip, random, urlcount, auth or layout.
mod_foo.c almost invariably contains a global data structure of type
module with the C name foo_module. This data structure is a table of
function pointers and entry points. Once mod_foo.c has been compiled to
the shared object mod_foo.so, Apache will dynamically load it and call the
function pointers listed in the foo_module structure at the appropriate time
(e.g., when a new request comes in).
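The "table of function pointers" idea can be illustrated with a toy version. The sketch below is not Apache's module type, which has many more fields. The toy_module struct, foo_handler, and dispatch_request are hypothetical names used only to show how a server can call into a module through an exported table.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for Apache's `module` type: a named table of entry points. */
typedef struct {
    const char *name;
    int (*handle_request)(const char *uri);   /* called once per request */
} toy_module;

/* The function a hypothetical mod_foo.c would register. */
static int foo_handler(const char *uri)
{
    return (int)strlen(uri);   /* placeholder work: report the URI length */
}

/* Plays the role of the exported foo_module table. */
toy_module foo_module = { "foo", foo_handler };

/* Plays the role of the server: call through the table if an entry exists. */
int dispatch_request(const toy_module *m, const char *uri)
{
    if (m == NULL || m->handle_request == NULL)
        return -1;             /* no module, or it has no handler */
    return m->handle_request(uri);
}
```

Because the server only sees the table, the layout of that table is part of the interface between the module and Apache.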
C.3.2 Curing Apache Modules
Most Apache modules are of a relatively modest size and curing them is no
great chore. However, some annotation work must be done. Since the cured
module must interact with the non-cured Apache webserver, objects that are
passed between them must not change size. As a result, WILD and other
fat pointers cannot be introduced into such objects. Annotations must be
added to convince CCured that the module can be made safe without such
run-time checks.
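To see why fat pointers are a problem at this boundary, compare a record holding an ordinary pointer with one holding a pointer that carries its bounds. The layout below is a simplification, not CCured's real representation, and the type and function names are hypothetical.

```c
#include <stddef.h>

/* Layout the uncured server expects: one ordinary pointer. */
typedef struct {
    char *file;
} thin_rec;

/* What the record would become if `file` were a fat pointer that carries
 * bounds with it. */
typedef struct {
    struct { char *p; char *end; } file;
} fat_rec;

/* Compile-time guard for a layout shared with the server. */
_Static_assert(sizeof(thin_rec) == sizeof(char *),
               "shared record must keep its original size");

/* Returns how many bytes the fat version adds to the record. */
size_t fat_rec_growth(void)
{
    return sizeof(fat_rec) - sizeof(thin_rec);
}
```

Code compiled against thin_rec would read the wrong bytes from a fat_rec, which is why the annotations in this section aim to keep such pointers thin.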
Imagine that you are trying to cure mod_urlcount.c. Take out your
favorite text editor and open up the file. Near the top you should find a
configuration record structure. Each module defines a separate
configuration record structure with a separate (non-exported) name. For
example, mod_urlcount has:
typedef struct urlcount_config_rec {
int urlcount_default;
CounterType urlcount_type;
int urlcount_auto_add;
char *urlcount_file;
} urlcount_config_rec;
Each module also contains functions that create, manipulate and merge such
configuration structures. This is the mechanism through which Apache modules
maintain global state. Each time Apache calls one of the function pointers
exported by the module, it passes along a way to get to the appropriate
configuration record. Since Apache does not know how the config structure
will be defined, it uses void pointers to describe the type. CCured
comes with a set of macros that instantiate those void *s on a
per-module basis. Add the line:
NEW_MODULE_TYPE(urlcount, urlcount_config_rec) // this is a macro
where the first parameter is the module suffix name and the second is the
type name of the configuration record. This macro declares a type named
module_urlcount. As mentioned earlier, each module exports a module
structure (full of function pointers). We must redeclare the module to take
advantage of the instantiated types. Change:
module urlcount_module; // full of "void *"s
to
module_urlcount urlcount_module; // uses "urlcount_config_rec *", not "void *"
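The expansion of NEW_MODULE_TYPE is not shown here, so the following is only a guess at the general idea: a macro that stamps out a per-module table type whose configuration hooks mention the concrete record type where the generic table would say void *. SKETCH_MODULE_TYPE, the two hook fields, and the cut-down record are all hypothetical.

```c
#include <stdlib.h>

/* Hypothetical macro in the spirit of NEW_MODULE_TYPE. It declares a type
 * named module_SUFFIX whose hooks use CONFIG * instead of void *. */
#define SKETCH_MODULE_TYPE(SUFFIX, CONFIG)                        \
    typedef struct {                                              \
        CONFIG *(*create_config)(void);                           \
        CONFIG *(*merge_config)(CONFIG *base, CONFIG *overrides); \
    } module_##SUFFIX;

typedef struct { int urlcount_default; } urlcount_config_rec;  /* cut down */

SKETCH_MODULE_TYPE(urlcount, urlcount_config_rec)

static urlcount_config_rec *create_urlcount_config(void)
{
    return calloc(1, sizeof(urlcount_config_rec));
}

static urlcount_config_rec *merge_urlcount_config(urlcount_config_rec *base,
                                                  urlcount_config_rec *overrides)
{
    /* Prefer the override when it sets a nonzero default. */
    return overrides->urlcount_default != 0 ? overrides : base;
}

/* No casts needed: the hooks already have the instantiated types. */
module_urlcount urlcount_module = {
    create_urlcount_config, merge_urlcount_config
};
```

With typed hooks, a type checker can see that the same record type flows in and out of every configuration function.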
Now scroll down a bit and look for the keyword void. Apache
modules often feature unnecessary casts to void. For example, mod_headers contains this function:
static void *merge_headers_config(pool *p, void *basev, void *overridesv)
{
headers_conf *a = (headers_conf *) ap_pcalloc(p, sizeof(headers_conf));
headers_conf *base = (headers_conf *) basev,
*overrides = (headers_conf *) overridesv;
a->headers = ap_append_arrays(p, base->headers, overrides->headers);
return a;
}
Every void in this function really stands for headers_conf (the
mod_headers version of urlcount_config_rec). Change it so
that the void types are no longer present:
static headers_conf *
merge_headers_config(pool *p, headers_conf *basev, headers_conf *overridesv)
{
headers_conf *a = (headers_conf *) ap_pcalloc(p, sizeof(headers_conf));
headers_conf *base = (headers_conf *) basev,
*overrides = (headers_conf *) overridesv;
a->headers = ap_append_arrays(p, base->headers, overrides->headers);
return a;
}
Repeat this process with all configuration functions that contain void. Now search for ap_get_module_config. It is a macro that
contains a (safe) cast to and from void * – it allows modules to
extract their configuration record from the global server state. For
example, mod_headers contains:
headers_conf *serverconf =
(headers_conf *) ap_get_module_config(s->module_config, &headers_module);
Change this to:
headers_conf * serverconf;
{ __NOBOXBLOCK
serverconf = ap_get_module_config(s->module_config, &headers_module);
}
The __NOBOXBLOCK block keyword tells CCured to leave the block
alone: we are asserting that it is already safe. Modify every instance of
ap_get_module_config and ap_get_perdir_module_config
the same way.
Now look for a datatype with the suffix entry. For example, mod_headers features:
typedef struct {
hdr_actions action;
char *header;
char *value;
} header_entry;
This marks a use of Apache's polymorphic (via void *) array routines.
Insert the following macro declaration to tell CCured about this array
type:
NEW_TABLE_TYPE(header_entry, header_entry) // macro
This macro declares a new type, array_header_FOO (where FOO
is the first argument) that is a specialized version of the Apache-provided
type array_header. Other data structures (for example, the
configuration record structure) will contain array_headers. We
change them to use this new datatype. Change all declarations like:
typedef struct {
array_header *headers;
} headers_conf;
into:
typedef struct {
array_header_header_entry *headers;
} headers_conf;
Now search for every call to ap_make_array, ap_append_arrays and
ap_push_array, and append _FOO (in our running example, _header_entry)
to the name of each called function.
For example, change:
new = (header_entry *) ap_push_array(dirconf->headers);
into:
new = (header_entry *) ap_push_array_header_entry(dirconf->headers);
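The pattern behind NEW_TABLE_TYPE can be sketched with a toy array library. Nothing below is Apache's or CCured's actual code: toy_array, SKETCH_TABLE_TYPE, and the generated function names are hypothetical, and the sketch only shows how a macro can wrap void *-based routines in typed ones whose names end in the element type.

```c
#include <stdlib.h>
#include <string.h>

/* Toy generic array in the style of a void *-based API. */
typedef struct {
    size_t elt_size, nelts, nalloc;
    char *elts;
} toy_array;

static toy_array *toy_make_array(size_t nalloc, size_t elt_size)
{
    toy_array *a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->elt_size = elt_size;
    a->nelts = 0;
    a->nalloc = nalloc ? nalloc : 1;
    a->elts = calloc(a->nalloc, elt_size);
    if (a->elts == NULL) { free(a); return NULL; }
    return a;
}

/* Returns a zeroed slot for one new element, doubling storage if full. */
static void *toy_push_array(toy_array *a)
{
    if (a == NULL) return NULL;
    if (a->nelts == a->nalloc) {
        char *bigger = realloc(a->elts, 2 * a->nalloc * a->elt_size);
        if (bigger == NULL) return NULL;
        memset(bigger + a->nalloc * a->elt_size, 0, a->nalloc * a->elt_size);
        a->elts = bigger;
        a->nalloc *= 2;
    }
    return a->elts + a->elt_size * a->nelts++;
}

/* Hypothetical macro: typed wrappers so callers never handle void *. */
#define SKETCH_TABLE_TYPE(NAME, TYPE)                                   \
    typedef struct { toy_array *raw; } toy_array_##NAME;                \
    static toy_array_##NAME toy_make_array_##NAME(size_t n)             \
    {                                                                   \
        toy_array_##NAME t = { toy_make_array(n, sizeof(TYPE)) };       \
        return t;                                                       \
    }                                                                   \
    static TYPE *toy_push_array_##NAME(toy_array_##NAME t)              \
    {                                                                   \
        return (TYPE *) toy_push_array(t.raw);                          \
    }

typedef struct { char *header; char *value; } header_entry;  /* cut down */

SKETCH_TABLE_TYPE(header_entry, header_entry)
```

The single cast from void * now lives inside the generated wrapper, so the module code itself stays free of it.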
Next, surround all global table declarations with #pragmas that
tell CCured to leave them alone (because Apache must read them). Often
there are three such global tables per module: an array of const command_rec
structures, an array of const handler_rec structures, and the module itself.
Surround them all with #pragmas as follows:
static const handler_rec mod_gzip_handlers[] =
{
{"mod_gzip_handler", mod_gzip_handler},
{CGI_MAGIC_TYPE, mod_gzip_handler},
{"cgi-script", mod_gzip_handler},
{"*", mod_gzip_handler},
{NULL}
};
becomes:
#pragma box(off)
static const handler_rec mod_gzip_handlers[] =
{
{"mod_gzip_handler", mod_gzip_handler},
{CGI_MAGIC_TYPE, mod_gzip_handler},
{"cgi-script", mod_gzip_handler},
{"*", mod_gzip_handler},
{NULL}
};
#pragma box(on)
Finally, change the declaration of the global module to use the new
specialized type we created earlier. For example, change:
module MODULE_VAR_EXPORT urlcount_module = {
into:
module_urlcount MODULE_VAR_EXPORT urlcount_module = {
Voilà.
C.3.3 Linking Apache Modules
Suppose you have just finished making the source modifications to mod_foo.c. Now you want to test it on Apache. Use CCured to compile it to
mod_foo.o. Make sure that there are no WILD pointers and that
the sizes of types involved in the Apache-module interface did not change.
Now you must link it:
$ gcc -shared -o mod_foo.so mod_foo.o
$ cp mod_foo.so /path/to/apache/bin/
Now go to your Apache binary directory and edit httpd.conf. Go to the
LoadModule section and add something appropriate according to the
documentation for your module. For example, mod_usertrack can be
configured by adding:
LoadModule usertrack_module bin/mod_usertrack.so
CookieTracking On
CookieExpires "1 days"
Now try to start Apache:
$ ./apachectl stop
$ ./apachectl start
./apachectl start: httpd started
If you see the “httpd started” message, it worked. If there were
messages about undefined symbols, you probably have to write a few
wrappers. For example, you might see:
./httpd: undefined symbol strdup_ff: mod_foo cannot be loaded
In this case you must write a wrapper for strdup that uses FSEQ
pointers. Suppose you write it in wrapper_foo.c and compile that to
wrapper_foo.o. Now go back to the linking step:
$ gcc -shared -o mod_foo.so mod_foo.o wrapper_foo.o
$ cp mod_foo.so /path/to/apache/bin/
And try to start Apache again. Eventually this process converges (you can
skip ahead by using a utility like nm to list all of the undefined
symbols in mod_foo.so if you like) and your Apache module will be up
and running.
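As a rough idea of what a wrapper_foo.c for the strdup case might contain, here is a sketch. It does not reproduce CCured's real FSEQ layout or the exact signature the linker expects for strdup_ff; the fseq_char type and the name strdup_ff_sketch are hypothetical. For the real thing, follow the name-mangling rules and the examples in ccuredlib.c.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an FSEQ pointer: pointer plus end of its area. */
typedef struct {
    char *p;
    char *end;   /* one past the last valid byte */
} fseq_char;

/* Sketch of a strdup wrapper taking and returning fat pointers.
 * Returns a null fat pointer if the input is unusable or malloc fails. */
fseq_char strdup_ff_sketch(fseq_char s)
{
    fseq_char result = { NULL, NULL };
    size_t room, len;
    char *copy;

    if (s.p == NULL || s.p >= s.end)
        return result;                        /* nothing readable */
    room = (size_t)(s.end - s.p);
    if (memchr(s.p, '\0', room) == NULL)
        return result;                        /* no terminator within bounds */

    len = strlen(s.p);
    copy = malloc(len + 1);
    if (copy == NULL)
        return result;
    memcpy(copy, s.p, len + 1);
    result.p = copy;
    result.end = copy + len + 1;              /* bounds of the new string */
    return result;
}
```

The wrapper checks the incoming bounds before touching the string and builds fresh bounds for the copy it returns.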
If for some reason your Apache module crashes at run-time, consider using
the underlying CIL --logcalls mechanism to track down the error
(Apache modules do not work well with normal debuggers). Make sure that
the debugging messages are directed to syslog(3) rather than printf(3) or the like.
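Sending a trace message to syslog(3) takes only a few lines. The helper below is a sketch and not part of the --logcalls machinery. The name trace_call, the "mod_foo" identifier, and the choice of the LOG_DAEMON facility are assumptions for illustration.

```c
#include <stddef.h>
#include <syslog.h>

/* Hypothetical tracing helper: record that `function_name` was entered.
 * Returns 0 on success, -1 if no name was given. */
int trace_call(const char *function_name)
{
    if (function_name == NULL)
        return -1;
    /* LOG_PID tags each line with the server process that emitted it. */
    openlog("mod_foo", LOG_PID, LOG_DAEMON);
    syslog(LOG_DEBUG, "call: %s", function_name);
    closelog();
    return 0;
}
```

Where the messages end up depends on the syslog configuration of the machine.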
As daunting as it may seem, it actually takes less than 30 minutes to Cure
an Apache module of average size and get it up and running. Some of that
time is spent reading the module's documentation so that it can be loaded
and tested correctly. Good luck!