Software Security - PREfast project

Individual Project 1: Program Analysis with PREfast and SAL

In this project we will use PREfast, a static analysis tool for C(++) developed at Microsoft, and the associated annotation language SAL, on some toy C code.

Learning objectives

The goals of the project are:

to appreciate - if it wasn't clear from the lectures - some of the many things that can go wrong in a C(++) program;
to understand the capabilities and the limitations of an (almost) state-of-the-art static analysis tool;
to understand the trade-offs in the design and in the use of such a tool.

Thanks to Jonathan Aldrich and colleagues at CMU for pointers to PREfast and sample exercises.

Handing in the assignment

The project is due Monday Sept 28 before the lecture (so we can discuss solutions in class). Start well before then, so that there is time to sort out any technical problems should these arise. (There is no lecture on Sept 21, so that should give you time.)
To hand in your solution, send an email to Michael Colesky (mrc at cs.ru.nl) with subject "[SS] PREfast assignment" and two attachments:

a file called "YourName_prefast_exercise.cpp" with your modified and annotated C++ code, i.e. part I, and
a text file YourName_answers.txt or PDF file YourName_answers.pdf with your answers to the questions in part II

where YourName is your full name, without spaces.
NB1 make sure your full name and student number are also included in both these files, so that if we print them we know whose they are!
NB2 don't zip, tar, gzip, compress, bzip, etc. the attachments!
NB3 No .doc or other formats apart from PDF and plaintext!
Please follow the instructions above to the letter, to save us time having to deal with solutions without names, in strange formats, etc.

Installing PREfast

We provide a version of PREfast that can be used from the command line. How to install is described in detail here.

The assignment - Part I - Using the tool

If you've followed the installation instructions for PREfast above, then you should already have a copy of the exercise file prefast_exercise.cpp.

Running PREfast, by compiling with the option /analyze, should produce 7 warnings: C4996, C6386, C6011, C6217, C6282, C6273, C6031. If you don't get the C4996 warning, the command line option /W 3 is probably missing; you have to include that, as in cl /analyze /W 3 prefast_exercise.cpp

Once that works, follow the steps below:

Get rid of the warnings in prefast_exercise.cpp that PREfast gives, by fixing the code. Mark places where you changed the code with a comment //FIXED to keep track of the changes you made.
Keep the changes to the code minimal; the code is completely silly, no need to completely rewrite it.
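To give an idea of what a minimal fix looks like, here is a small sketch on made-up code (not the exercise file). The function demo_copy and the constant DEMO_SIZE are hypothetical names. The two marked lines are the kind of change that typically silences a "dereferencing NULL pointer" warning (C6011) and a "buffer overrun" warning (C6386); the fixes needed in prefast_exercise.cpp may well look different.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

const std::size_t DEMO_SIZE = 16;  // hypothetical buffer size for this sketch

// Returns a heap-allocated, zero-terminated copy of at most DEMO_SIZE-1
// characters of src, or NULL if the allocation fails.
char *demo_copy(const char *src) {
    assert(src != NULL);  // this sketch assumes the caller passes a valid string
    char *dst = static_cast<char *>(std::malloc(DEMO_SIZE));
    if (dst == NULL) {    // FIXED: malloc may fail, so check before writing
        return NULL;
    }
    std::size_t n = std::strlen(src);
    if (n >= DEMO_SIZE) { // FIXED: never write more than dst can hold
        n = DEMO_SIZE - 1;
    }
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return dst;
}
```

Note that both fixes add a check and leave the rest of the code alone, in line with keeping changes minimal.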
Annotate all buffers that are passed as parameters (i.e. all parameters of type char* or int*), to specify whether they are read from and/or written to, and specify their lengths. This means you have to annotate them
- with _In_count_(...) if they are only read from;
- with _Out_cap_(...) if they are only written to;
- with _Inout_count_(...) if they are both read from and written to (but I don't think you need that).
If the length is a compile-time constant, such as 55 or BUF_SIZE, rather than a program variable, you need another suffix c_. So, for example, you can annotate a buffer with _Out_cap_(len) or with _Out_cap_c_(BUF_SIZE). (Why this extra c_ is needed for constants is a mystery to me.)
There is no need to annotate the size of the argument of execute, as its size does not really matter.
Fix any new warnings this produces.
Similarly, annotate the buffers returned as results by my_alloc and do_read to specify their size, using the annotations _Ret_cap_(...) or _Ret_cap_c_(...).
Fix any new warnings this produces.
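As an illustration of the annotation steps above, here is a sketch on made-up functions (all demo_ names and DEMO_BUF_SIZE are hypothetical, not taken from the exercise file). It assumes that the macros come from sal.h when compiling with cl; the #ifndef fallbacks are an assumption of this sketch, added only so that the code also compiles with other compilers, where the annotations then have no effect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

#if defined(_MSC_VER)
#include <sal.h>  // SAL macros when compiling with Microsoft's cl
#endif
// Fallbacks (assumption of this sketch): expand to nothing if not defined.
#ifndef _In_count_
#define _In_count_(n)
#endif
#ifndef _Out_cap_
#define _Out_cap_(n)
#endif
#ifndef _Out_cap_c_
#define _Out_cap_c_(n)
#endif
#ifndef _Ret_cap_c_
#define _Ret_cap_c_(n)
#endif

#define DEMO_BUF_SIZE 8  // hypothetical compile-time constant

// buf is only read; its length is the program variable len.
int demo_sum(_In_count_(len) const int *buf, std::size_t len) {
    int total = 0;
    for (std::size_t i = 0; i < len; ++i) total += buf[i];
    return total;
}

// buf is only written; its capacity is the program variable len.
void demo_zero(_Out_cap_(len) int *buf, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) buf[i] = 0;
}

// buf is only written; its capacity is a constant, hence the c_ suffix.
void demo_fill(_Out_cap_c_(DEMO_BUF_SIZE) char *buf) {
    assert(buf != NULL);
    for (std::size_t i = 0; i + 1 < DEMO_BUF_SIZE; ++i) buf[i] = 'x';
    buf[DEMO_BUF_SIZE - 1] = '\0';
}

// The returned buffer has a constant capacity; abort rather than return NULL.
_Ret_cap_c_(DEMO_BUF_SIZE) char *demo_alloc() {
    char *p = static_cast<char *>(std::malloc(DEMO_BUF_SIZE));
    if (p == NULL) std::abort();
    return p;
}
```

With annotations like these in place, the tool can compare the sizes promised by a caller with the sizes a function actually uses, which is why new warnings may appear after annotating.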
As the last step, we will add tainting annotations to trace any input passing from input to execute without passing through the validation operation, and add calls to the validation routine validate in the right places to fix any problems with missing input validation. The steps for this are explained in more detail below.
To do this, first
- annotate the first parameter of input with [SA_Post(Tainted=SA_Yes)], which specifies that this parameter will be tainted as postcondition, and
- annotate the parameter of execute with [SA_Pre(Tainted=SA_No)] to specify the precondition that this parameter should not be tainted.
So you get
```
   HRESULT input([SA_Post(Tainted=SA_Yes)] _Out_... char *buf) {...

   int execute([SA_Pre(Tainted=SA_No)] _In_ char *buf) {...
```
Now annotate all the procedures that may handle or produce tainted data using pre- and/or postconditions as above. These procedures are:
- do_read, as it calls input, which produces tainted data;
- copy_data, as it is used to copy data coming from do_read, which is tainted.
To specify that the return value of a function is tainted, declare it as
```
   [returnvalue:SA_Post(Tainted=SA_Yes)] char* somefunction() { ...
```
PREfast should now produce C6029 warnings wherever it spots that the program is passing tainted data to the function execute().
Add calls to the validation routine validate in the right places to make such warnings disappear.
As you may notice, PREfast's tainting analysis is not reliable unless you annotate all procedures that may handle tainted data correctly.
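The data flow that the tainting steps above are about can be sketched as follows, on made-up code. All demo_ names are hypothetical. The DEMO_ macros are no-op placeholders that only mark where the attributes would go, so this sketch compiles with any compiler but does not exercise PREfast's analysis itself. How validate is annotated in the exercise file is not stated above; the postcondition shown for demo_validate is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical placeholders; each expands to nothing.
#define DEMO_TAINTED_OUT   /* stands for [SA_Post(Tainted=SA_Yes)] */
#define DEMO_UNTAINTED_IN  /* stands for [SA_Pre(Tainted=SA_No)]   */
#define DEMO_UNTAINTED_OUT /* stands for [SA_Post(Tainted=SA_No)]  */

// Source of tainted data: copies simulated untrusted input into buf.
void demo_input(DEMO_TAINTED_OUT char *buf, std::size_t cap,
                const char *simulated) {
    assert(cap > 0);
    std::size_t i = 0;
    while (i + 1 < cap && simulated[i] != '\0') {
        buf[i] = simulated[i];
        ++i;
    }
    buf[i] = '\0';
}

// Validation: accepts only letters and digits.
bool demo_validate(DEMO_UNTAINTED_OUT const char *buf) {
    for (const char *p = buf; *p != '\0'; ++p) {
        if (!std::isalnum(static_cast<unsigned char>(*p))) return false;
    }
    return true;
}

// Sink: must only receive untainted data. Stands in for running a command.
int demo_execute(DEMO_UNTAINTED_IN const char *buf) {
    return static_cast<int>(std::strlen(buf));
}

// The validation call sits between the source and the sink.
int demo_run(const char *simulated) {
    char buf[32];
    demo_input(buf, sizeof buf, simulated);
    if (!demo_validate(buf)) return -1;  // reject instead of executing
    return demo_execute(buf);
}
```

Here demo_run("date") reaches demo_execute, whereas demo_run("ls; rm -rf /") is rejected by the validation step and returns -1.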

The assignment - Part II - Reflection

Briefly answer the following questions

PREfast tries to check annotations at compile time. Suppose that we have a way to check the annotations at runtime. (Actually, it is not so straightforward to do this at runtime, for all annotations, but let's assume it is possible.) Name one advantage and two disadvantages of doing these checks at runtime instead of doing them at compile-time. (Hint: there are very generic advantages and disadvantages when it comes to runtime vs compile-time checking.)
Sometimes PREfast only warns about problems after you add annotations. For example, the tool does not complain about zero() until after you add an annotation about the size of buf. An alternative tool design would be to produce a warning about zero() if there are no annotations for it. (The warning would then not so much be that there is a potential buffer overflow problem, but rather that the tool does not have enough information to determine whether there is a buffer overflow or not.) Can you give a plausible explanation why PREfast has been designed so that it does not complain about such unannotated methods?
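For concreteness when thinking about the first question, a runtime check of a size contract could look like the sketch below. The function demo_zero_checked is a hypothetical name, and passing the capacity cap as an explicit parameter is an assumption of this sketch, since a C pointer does not carry the size of the buffer it points to.

```cpp
#include <cassert>
#include <cstddef>

// Zeroes the first len elements of buf, whose capacity is cap.
// The condition len <= cap is tested on every call, while the program runs.
bool demo_zero_checked(int *buf, std::size_t cap, std::size_t len) {
    if (buf == NULL || len > cap) {
        return false;  // contract violated: refuse to write
    }
    for (std::size_t i = 0; i < len; ++i) {
        buf[i] = 0;
    }
    return true;
}
```

A static tool would instead try to establish, for every call site, that the condition holds before the program is ever run.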

Keep your answers concise and to the point.

More information for this project

For more info about

C(++): to look up what library functions such as gets, gets_s, memcpy, printf, system, ... do, see the online C reference or C++ reference specs.
NB these websites can (and should!) be criticized for not warning forcefully about the dangers of for instance gets and providing suggestions for safer alternatives!
Secure coding in C(++): CERT publishes guidelines for this: CERT C coding standard and CERT C++ coding standard.
SAL: There is a lot of information on SAL on the Microsoft website. But beware, the syntax of SAL keeps changing, so stick to the syntax listed above in the exercises, which is SAL version 2008/2009, as this syntax works with the version of PREfast that we provide.
If you want to know more about SAL in the latest version, see SAL 2.0 version 2013. The first version of SAL dates back to 2005: SAL version 2005.