Why3 project

In this project, your task is to model a small (but non-trivial) piece of C code using Why3. You will have to make choices on how to model certain aspects of the C programming language in Why3, and how to translate the C code itself to WhyML, and document these in a report.

You can choose which routine you want to verify from a list; three of them are C library routines, the fourth one is inspired by an actual crypto library. The four options are described in the next sections. You only have to attempt verification of one of these routines, of course!

Your report should address the following items:

You may assume that the reader of your report is familiar with Why3, so you do not need to introduce it.

Submission instructions

Register your team in Brightspace.

You should hand in your report as a PDF file (try to aim for at most 5 pages), as well as your Why3 files, in Brightspace. Working in pairs is allowed, and strongly recommended. Mention both authors in your report; only one of the authors should submit it.

Please include all the Why3 session files needed to reproduce a proof. An easy way to check whether this is complete is to issue the command:

why3 replay *projectname*

Where projectname is the name of your .mlw file, without the file extension.

The deadline for submission is Wednesday 3 April 2019, 23:59. This gives you a few weeks to work, but remember that this project will take much more time than the weekly exercise, so start right now!

Grading information

Your grade for this project will be based on the quality of your Why3 formalization, and on your report. It will count towards your final grade for this course.

Asking for help will not affect your grade, so please pose questions early, especially when:

Options 1, 2 and 3: a C system library function

The library cloudlibc is designed to be a safe alternative for the standard C library. It sometimes breaks compatibility with the C/POSIX standards where they conflict with the security aims of the CloudABI platform (which cloudlibc is written for). E.g., its intent is to remove most (if not all) forms of global state, as well as functions that are intrinsically insecure. It is probably well-tested, but no attempts at verification have been made.

For your verification target, pick one routine from the bullet list below, figure out what it is supposed to do, and verify that it does that (and doesn't do things it is not supposed to):

To get an informal specification of what these routines do, you can look at man pages. A more authoritative reference is one of these two sources, which will explain it in "standardese".

  1. The draft ISO C11 standard
  2. The POSIX standard

Of course, cloudlibc may deviate from these standards, but only if it states so clearly!

Tips:

Option 4: a tweetNaCL routine

TweetNaCL (pronounced: "tweet salt") is a lightweight cryptographic library designed to fit in 100 tweets of 160 characters.

It contains code for performing modular arithmetic. Specifically, arithmetic modulo the prime number \(2^{255} - 19\).

Since a number modulo \(2^{255}-19\) can still be quite large (requiring 255 bits to store it), normal integer types cannot be used to handle them, and so cryptographic engineers write a clever routine like this to add them:

void add(int64_t o[16], int64_t a[16], int64_t b[16])
{
    for(int i=0; i<16; i++) {
        o[i] = a[i] + b[i];
    }

    for(int i=0; i<15; i++) {
        int64_t c = o[i] >> 16;
        o[i+1] = o[i+1] + c;
        o[i] = o[i] - (c << 16);
    }

    int64_t c = o[15] >> 16;
    int64_t t = 38*c;
    o[0] = o[0] + t;
    o[15] = o[15] - (c<<16);
}

Here, a and b point to representations of a number modulo \(2^{255}-19\), which are then added (modulo \(2^{255}-19\)), with the result stored at o:

$$ o \equiv (a+b), \mod 2^{255}-19 $$

Any number modulo \(2^{255}-19\) can of course be represented in 256 bits, or 16 "limbs" of 16 bits each. If you look closely, you will see that the code above instead uses int64_t instead of uint16_t. This is done so that there is some room to temporarily store carries that may be arise during the computation.

The addition code itself can be understood as follows:

  1. The first loops simply adds all the individual limbs. This may generate carries (i.e. 50000+50000 = 100000, which no longer fits in a 16 bit value).

  2. The second loops propagates the "excess carries" from the previous loop to the limb where they belong.

  3. If the result is larger than \(2^{255}-19\), it needs to be reduced modulo \(2^{255}-19\). For this, it uses the fact that $\(2^{256} \equiv 38, \mod 2^{255}-19\)$

Verification goals

Verify the add function listed above.

This challenge is slightly more open-ended, things you can try to verify are: