\documentclass[runningheads,envcountsame,envcountsect]{llncs}
\usepackage{amssymb,url,alltt,graphicx,color,eurosym}
\raggedbottom
\definecolor{lightgray}{rgb}{.9,.9,.9}
\definecolor{gray}{rgb}{.8,.8,.8}
\definecolor{darkgray}{rgb}{.6,.6,.6}
\def\promovendus{{\textit{promovendus}}}
\def\promovendus{{PhD student}}
\begin{document}
\authorrunning{Freek Wiedijk}
\titlerunning{Proof Checking the Proof Checker}
\title{}
\author{}
\institute{}
\maketitle
\vspace{-1.1cm}
\begin{itemize}
\item[1a)] \textbf{Project Title.}
\emph{Proof Checking the Proof Checker.}
\item[1b)] \textbf{Project Acronym.}
PCPC.
\item[1c)] \textbf{Principal Investigator.}
Freek Wiedijk.
\item[1d)] \textbf{Renewed Application.}
No.
\end{itemize}
\begin{itemize}
\item[2a)] \textbf{Summary.}
\parfillskip=1em plus 1fil
%
It is considered a fact of life that all serious
computer programs contain errors, so-called `bugs'.
Empirical data indicates that production software has around
two bugs per thousand lines of source code,
and even programs used on space missions by NASA are believed to have around
0.1 bugs per thousand lines of code \cite{li:mal:den:98}.
\emph{Interactive theorem proving} is
a technology for building programs
that almost certainly have \emph{zero} bugs per thousand lines of code.
Already some significant programs have been shown to be
fully correct.
For instance both the certified C compiler
of Xavier Leroy \cite{ler:06,bla:dar:ler:06} and the programs from the proof of
the Four Color Theorem by Georges Gonthier \cite{gon:06}
have been formally shown -- with a fully computer-checked
proof -- to do
\emph{precisely} what they should do,
and therefore are guaranteed to be bug-free.
This technology of interactive theorem proving for software correctness
is on the verge of becoming widely applicable.
A sign that this moment has not yet arrived
is that currently it is not even used by the
very people who build tools for it.
Thus far, no system for interactive theorem proving
has been formally proved bug-free.
The \emph{Proof Checking the Proof Checker} project
will change this situation.
At the end of this project
one of the best systems for
interactive theorem proving will have used its own technology to establish
that it is completely sound.
Furthermore not just a model, but the actual source code of the program
will have been proved correct.
\end{itemize}
\begin{itemize}
\item[2b)] \textbf{Abstract for laymen (in Dutch).}
Het wordt als onontkoombaar gezien dat ieder serieus
computerprogramma fouten bevat, `bugs' genaamd.
In de praktijk bevat productiesoftware
rond de twee bugs per duizend regels broncode, en zelfs van
programma's voor ruimtemissies door NASA wordt geschat dat
ze rond de 0,1 bugs per duizend regels bevatten.
\emph{Interactief stellingenbewijzen} is een technologie om
programma's te maken die met grote zekerheid \emph{nul} bugs per duizend
regels programmatekst hebben.
Momenteel zijn al enkele bijzonder
ingewikkelde programma's op deze manier volledig correct
bewezen.
Zo zijn er de gevalideerde C compiler van Xavier Leroy
en de programma's voor het bewijs van de vierkleurenstelling door
Georges Gon\-thier, waarvan formeel is aangetoond -- en met
een volledig door de computer gecontroleerd bewijs -- dat
ze \emph{exact} doen wat ze moeten doen, dus dat ze gegarandeerd
bugvrij zijn.
De technologie van het interactief stellingenbewijzen voor
programmacorrectheid staat op het punt om algemeen toepasbaar
te worden.
Een teken dat dit punt nog niet is bereikt is dat
deze technlogie momenteel niet eens wordt gebruikt
door de mensen die de software hiervoor ontwikkelen.
Tot nog toe is geen enkel systeem voor interactief
stellingenbewijzen formeel bugvrij bewezen.
Het \emph{Proof Checking the Proof Checker} project zal hier
verandering in brengen.
Aan het eind van het project
zal van \'e\'en van de beste systemen voor interactief
stellingenbewijzen door middel van zijn eigen technologie
betrouwbaar bewezen zijn.
Bovendien zal niet slechts
van een model, maar van de volledige broncode van het programma
de correctheid zijn aangetoond.
\end{itemize}
\begin{itemize}
\item[2c)] \textbf{Keywords.}
Theorem proving, proof assistants, formal methods, program correctness,
software reliability, HOL.
\end{itemize}
\begin{itemize}
\item[3)] \textbf{Classification.}
\emph{Informatica},
NOAG-ict 2005-2010
theme 3.6: \emph{Intelligente systemen}
(Computationele logica, Redeneersystemen, Semantiek).
\end{itemize}
\begin{itemize}
\item[4)] \textbf{Composition of the Research Team.}
\begin{center}
\begin{tabular}{lclcc}
\emph{name} &$\enskip$& \emph{primary field} &$\enskip$& \emph{affiliation} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
Prof. dr. H.P. Barendregt && mathematical logic && RU \\
Prof. dr. J.H. Geuvers (\emph{promotor}) && theorem proving && RU \\
Dr. F. Wiedijk && theorem proving && RU \\
Dr. J.H. McKinna && functional programming && RU \\
Drs. R.S.S. O'Connor && theorem proving && RU \\
Drs. D.G. Synek && functional programming && RU \\
Dr. J.R. Harrison && theorem proving && Intel \\
\emph{\promovendus} && && RU \\
\noalign{\smallskip}
\hline
\end{tabular}
\end{center}
\medskip
\emph{RU} = Institute for Computing and Information Sciences \\
Radboud University Nijmegen \\
Nijmegen, The Netherlands
\medskip
\emph{Intel} = Intel Corporation \\
Hillsboro, Oregon, USA
\end{itemize}
\begin{itemize}
\item[5)] \textbf{Research School.}
IPA, Institute for Programming research and Algorithmics.
\end{itemize}
\begin{itemize}
\item[6a)] \textbf{Description of the Proposed Research}
\
\item
\emph{Scientific Problem and Research Goals.}
\medskip
Here is the motivation for the project summarized in four short sentences:
\emph{Compilers compile themselves.
Proof assistants do not yet prove themselves correct.
This shows that compiler technology is more mature than
proof assistant technology.
The goal of the project is to remove this distinction.}
\medskip
Software is notorious for being unreliable.
Everyone has experience with software misbehaving,
and occasionally software errors have very costly consequences \cite{neu:09}.
The fact that software errors are generally affectionally
called `bugs' does not make this less significant.
\begin{figure}[ht]
\begin{center}
\bigskip
\includegraphics[scale=.55]{0.eps}
\end{center}
\caption{The main formal methods.}\label{fig:cost}
\end{figure}
There are various approaches to improve this situation, of
which one class is called \emph{formal methods}.
The formal methods consist of applying techniques
from mathematics
and specifically from mathematical logic to establish
properties of computer programs.
There are various formal methods that differ in the amount
of information that they provide about software and in the amount of work
that it costs to establish that information.
Jean-Raymond Abrial has the suggestive diagram shown in Fig.~\ref{fig:cost} \cite{abr:08},
which summarizes the main formal methods that are currently popular.
The axes of this diagram are merely suggestive, but
it should be clear that strongly typed programming
is cheap but not does give much reliability, while theorem
proving is very expensive but potentially gives the highest reliability
possible.
Because of the high cost of theorem proving, in practice
it currently is mostly used for correctness of software for human transportation
(with spacecraft an extreme case) and for medical applications.
There actually is a spectrum of theorem proving going from fully \emph{automated}
theorem proving (ATP) to \emph{interactive} theorem proving.
A system for interactive theorem proving is called a \emph{proof assistant}
or \emph{proof checker},
and a proof developed inside such a system is called a \emph{formalization}.
The most important current proof assistants are (see also \cite{wie:06}):
\begin{center}
\begin{tabular}{lclcl}
\hline
\noalign{\smallskip}
\rlap{\emph{HOL proof assistants} \cite{gor:mel:93}} &$\quad$& &$\quad$& \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
HOL4 \cite{nor:sli:07} && Norrish \& Slind && UK, USA \& Australia \\
HOL Light \cite{har:00} && Harrison && UK \& USA \\
ProofPower \cite{lem:00} && Arthan && UK \\
Isabelle \cite{nip:pau:wen:02} && Nipkow, Paulson \& Wenzel && UK \& Germany\\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
\rlap{\emph{non-HOL proof assistants}} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
Coq \cite{coq:08} && Barras \& Herbelin && France \\
B Method \cite{abr:96} && Abrial && France \\
PVS \cite{owr:rus:sha:92} && Owre, Rushby \& Shankar && USA \\
ACL2 \cite{kau:man:moo:00} && Moore && USA \\
Mizar \cite{muz:93} && Trybulec && Poland \& Japan \\
\noalign{\smallskip}
\hline
\end{tabular}
\end{center}
\medskip
\begin{figure}[bp]
\begin{center}
\bigskip
\includegraphics[scale=.55]{1.eps}
\end{center}
\caption{A compiler compiling itself.}\label{fig:compiler}
\end{figure}
\noindent
Currently interactive theorem proving for software correctness
is mostly used with mathematical models of software or for relatively
simple programs.
It is not yet applied much to the actual source code of existing
software.
In particular the proof assistants themselves are still considered
to be out of reach of current interactive theorem proving technology.
%for reasons of scale.
%
This is a noticeable difference with compilers.
Proof assistants and compilers are very similar systems,
but a compiler \emph{is} used for itself.
Compilers routinely
compile their own source code, as shown in Fig\.~\ref{fig:compiler}.
\begin{figure}[ht]
\begin{center}
\bigskip
\includegraphics[scale=.55]{2.eps}
\end{center}
\caption{A proof checker checking itself.}\label{fig:checker}
\end{figure}
The analogous situation for a proof assistant is slightly
more complex because there both a compiler \emph{and}
a proof assistant are involved.
Both systems process the same source code, but the compiler
produces an executable program while the proof
assistant just produces a Boolean value stating that the code
is correct.
This situation is shown in Fig.~\ref{fig:checker}.
Actual proof assistants, like compilers, have thousands of lines of source code,
and are in practice too big for the kind of scrutiny that interactive
theorem proving requires.
However, here is where the \emph{de Bruijn criterion} \cite{bar:wie:06} comes in.
Proof assistants often have an architecture where only a small part
of the program needs to be trusted for the whole system to be mathematically
sound.
(I.e., if that part of the program is correct, it will be impossible to prove \emph{false}, even if the rest of the program has bugs.)
This part of the program is called the \emph{kernel} or \emph{core} of the program,
and it is \emph{not} too big to be proved correct within a reasonable time.
The de Bruijn criterion leads to a modified version of Fig.~\ref{fig:checker}, shown in Fig.~\ref{fig:kernel}.
This diagram presents what we will realize in the \emph{Proof Checking the Proof Checker} project.
\begin{figure}[htbp]
\begin{center}
\bigskip
\includegraphics[scale=.55]{3.eps}
\end{center}
\caption{The \emph{Proof Checking the Proof Checker} project.}\label{fig:kernel}
\end{figure}
\end{itemize}
\begin{itemize}
\item
\emph{Relation to Existing Research.}
A self-correctness proof of a proof assistant has not been
done yet,
but significant work in that direction already exists.
(What has not been done yet is prove correct the \emph{actual production source code} of a proof assistant
which is used on a large scale for serious work.)
For example there are
the verification of the Ivy proof checker in ACL2 by Olga Shumsky \cite{mcc:shu:00},
the formalization of HOL in HOL from the nineties by Joakim von Wright \cite{wri:94},
the formalization of nqthm (the precursor of ACL2) in Coq and vice versa by Gilles Dowek and Bob Boyer \cite{dow:boy:93},
and the work on formalizing type theory in Lego by Randy Pollack and James McKinna \cite{pol:94,mck:pol:99}.
However, there
are two projects that stand out and surpass earlier efforts:
\medskip
\begin{itemize}
\item
\textbf{Coq in Coq.}
In the late nineties Bruno Barras, together with his advisor Benjamin Werner,
formalized the
theory behind the Coq proof assistant, and \emph{extracted} from it a
program that was functionally equivalent to the Coq kernel \cite{bar:96,bar:96:1,bar:99:1,bar:wer:97}.
This program can be used to recheck many Coq formalizations and is
guaranteed to be correct.
However, in practice no one uses it, and the difference in size
and sophistication between this program and the actual Coq kernel
that everyone uses is huge.
The extracted Coq in Coq program is a few hundreds lines of code,
while the actual Coq kernel
is more than ten thousand lines of code.
Also the real Coq kernel uses programming language
features that are not available when one extracts a program from
a formalization.
\begin{figure}[bp]
\begin{center}
\bigskip
\includegraphics[scale=.55]{4.eps}
\end{center}
\caption{Developing software as mathematics.}\label{fig:extract}
\end{figure}
The methodology of the Coq in Coq project is shown in Fig.~\ref{fig:extract}.
First a beautiful piece of mathematics is coded in the computer,
which then is refined into an executable program.
This is also the approach used with the B Method \cite{abr:96}.
\medskip
\item
\textbf{HOL in HOL.}
In 2006 John Harrison proved the correctness of a simplified version
of the kernel of his HOL Light proof assistant \cite{har:06}.
Here the distance between the actual code of the system
and the version proved correct is much smaller.
Mostly some features were removed to
make the effort simpler,
but the code that was proved correct was essentially
code from the real system.
For example, the function definition from the actual HOL Light kernel source code:
\medskip
\begin{quote}\scriptsize
\def\\{\char`\\}
\def\{{\char`\{}
\def\}{\char`\}}
\begin{alltt}
let vsubst =
let rec vsubst ilist tm =
match tm with
Var(_,_) -> rev_assocd tm ilist tm
| Const(_,_) -> tm
| Comb(s,t) -> let s' = vsubst ilist s and t' = vsubst ilist t in
if s' == s & t' == t then tm else Comb(s',t')
| Abs(v,s) -> let ilist' = filter (fun (t,x) -> x <> v) ilist in
if ilist' = [] then tm else
let s' = vsubst ilist' s in
if s' == s then tm else
if exists (fun (t,x) -> vfree_in v t & vfree_in x s) ilist'
then let v' = variant [s'] v in
Abs(v',vsubst ((v',v)::ilist') s)
else Abs(v,s') in
fun theta ->
if theta = [] then (fun tm -> tm) else
if forall (fun (t,x) -> type_of t = snd(dest_var x)) theta
then vsubst theta else failwith "vsubst: Bad substitution list"
\end{alltt}
\end{quote}
\medskip
is represented in the HOL in HOL formalization as:
\medskip
\begin{quote}\scriptsize
\def\\{\char`\\}
\def\{{\char`\{}
\def\}{\char`\}}
\begin{alltt}
let VSUBST = define
`(VSUBST ilist (Var x ty) = REV_ASSOCD (Var x ty) ilist (Var x ty)) /\\
(VSUBST ilist (Equal ty) = Equal ty) /\\
(VSUBST ilist (Select ty) = Select ty) /\\
(VSUBST ilist (Comb s t) = Comb (VSUBST ilist s) (VSUBST ilist t)) /\\
(VSUBST ilist (Abs x ty t) =
let ilist' = FILTER (\\(s',s). ~(s = Var x ty)) ilist in
let t' = VSUBST ilist' t in
if EX (\\(s',s). VFREE_IN (Var x ty) s' /\\ VFREE_IN s t) ilist'
then let z = VARIANT t' x ty in
let ilist'' = CONS (Var z ty,Var x ty) ilist' in
Abs z ty (VSUBST ilist'' t)
else Abs x ty t')`;;
\end{alltt}
\end{quote}
\medskip
The remainder of the HOL Light kernel after the features were removed was more than half of the code.
However, the feature removal meant that the version of the system
that was proved correct could not actually check existing HOL proofs.
%
Also, for the correctness proof the OCaml source code was translated into HOL definitions
by hand.
Although it is easy to see by comparing source files that this translation is correct,
it still leaves a gap.
\begin{figure}[ht]
\begin{center}
\bigskip
\includegraphics[scale=.55]{5.eps}
\end{center}
\caption{Proving existing software correct.}\label{fig:import}
\end{figure}
In HOL in HOL an opposite direction was followed from the Coq in Coq project.
While in Coq in Coq the software was the \emph{result} of the work,
in HOL in HOL it was the starting point.
HOL in HOL is not about creating software,
but about taking existing software -- real production code -- and proving it correct.
This is sketched in Fig.~\ref{fig:import}.
From the point of view of software verification this seems to
be the more interesting and challenging direction.
\end{itemize}
Of course in practice the best approach is a middle ground
between Figures~\ref{fig:extract} and~\ref{fig:import}.
Still, starting from experimentation with a
functional program and only when the program turns out to be
reasonable
begin work on a correctness proof
seems a good approach.
This was the methodology followed both by Georges Gonthier
in his Four Color Theorem verification \cite{gon:06} and by Russell O'Connor
in his work on provably correct exact real arithmetic \cite{oco:06}.
\end{itemize}
\begin{itemize}
\item
\emph{Research Approach.}
The project that we propose here is to {finish} what
John Harrison started with his HOL in HOL formalization.
One {\promovendus} should
be able to reach the point where the \emph{whole} of the HOL Light kernel
is proved correct.
The system then will be running a kernel that has been proved
correct by itself.
In the HOL in HOL project the HOL code that was proved correct was
syntactically closely related
to the OCaml source code of the HOL Light kernel,
but there was no \emph{formal} relation between the two.
For this reason we propose to add automatic translation from HOL code
to OCaml (Step 1.1 in Section~7),
even if it essentially will produce code that already is there.
Conversely, OCaml code will be automatically
converted to HOL definitions (Step 3).
These translations will be formally proved correct with respect
to a formal OCaml semantics (Step 2.2).
This means that we will not just prove a \emph{model} of the
HOL Light kernel correct, but that the correctness proof will apply to
the actual source code.
A more detailed description of the project will be given in Section~7 below.
\medskip
There are a couple of possible objections to this project that need to
be addressed.
First, there is a \emph{chicken and egg} problem here.
The program that checks the correctness might be wrong, and for this
reason accept a fallacious proof, and therefore in fact might not be correct.
Of course this possibility exists.
However, as a human will be paying attention as well,
it is so small that it is only of philosophical importance \cite{pol:98}.
More serious is G\"odel's second incompleteness theorem \cite{goe:31,fra:05}.
It says that a system cannot prove its own consistency,
which applies to a proof assistant too.
A proof assistant cannot support a proof that it will never
prove \emph{false}.
However this is not a serious problem either.
What will be proved is not that the system is consistent but that
the system implements its logic correctly.
Equivalently, the correctness proof will show that
the consistency of the logic implies that the system will never
accept a proof of \emph{false}.
This consistency can be stated in only a few lines and is
very simple, while
the real world program being proved correct is hundreds of lines long and
very complex.
In fact this approach was used in the HOL in HOL project.
There the consistency of the HOL logic is phrased as an `inaccessible
cardinal axiom for the HOL logic':
\medskip
\begin{quote}\small
\def\\{\char`\\}
\def\{{\char`\{}
\def\}{\char`\}}
\begin{alltt}
new_type("I",0);;\medskip
let I_AXIOM = new_axiom
`UNIV:ind->bool <_c UNIV:I->bool /\\
(!s:A->bool. s <_c UNIV:I->bool ==>
\{t | t SUBSET s\} <_c UNIV:I->bool)`;;
\end{alltt}
\end{quote}
\medskip
This then is shown to imply that the HOL Light implementation will never accept a proof of \emph{false}.
Finally there is the question: `Why HOL Light?'
There are impressive programs that have been validated
in other systems.
Maybe one of these other systems would be a better choice?
The reason for our choice for HOL Light is that it has by far the smallest
kernels of the major proof assistants that satisfy the de Bruijn criterion:
\medskip
\begin{center}
\begin{tabular}{lcr}
\hline
\noalign{\smallskip}
\emph{system} &$\quad$& \emph{kernel size {\rm (\emph{in $10^3$ lines})}} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
HOL4 && 6 \\
HOL Light && 0.7 \\
ProofPower && 7 \\
Isabelle && 5 \\
Coq && 14 \\
\noalign{\smallskip}
\hline
\end{tabular}
\end{center}
\medskip
Note that although HOL Light is a relatively light-weight
system,
it is one of the best proof assistants available.
It has been extensively used for hardware verification at Intel \cite{har:99,har:06:1},
and is among the best systems for formalization of mathematics \cite{wie:08,wie:09}.
\end{itemize}
\begin{itemize}
\item
\emph{Scientific Importance and Urgency of the Proposed Research.}
It is very important to use interactive theorem proving
on real world programs, programs that are actually used.
Only then the technology has to face real problems,
and only then will it be developed in the right direction.
Having a system verify itself (not just a model or a simplified version,
but the actual code with all the dirty tricks to make it run fast)
is a good choice for such a verification,
and the technology is at a point that this can be done.
Clearly now is the time to do this.
\end{itemize}
\begin{itemize}
\item
\emph{Relation to the Research Group.}
The Foundations Group in Nijmegen is internationally renowned for its
research into the technology and use of proof assistants,
both for mathematics and computer science.
There is strong expertise in mathematical logic, functional programming
and the use of interactive theorem provers,
which makes it the best place in the Netherlands for this project.
\end{itemize}
\begin{itemize}
\item[6b)] \textbf{Application Perspective.}
This project
will prove a program correct that has been used for many years
both in academia (with impressive results like \cite{har:08}) as well as in industry \cite{har:02,har:06:1}.
This is not about experimentation or a prototype: it is about
`the real thing'.
The project proposed here will really mean pushing the boundary
of interactive theorem proving for software correctness.
Part of the project -- Phase 3 in Section~7 below -- will be
to investigate how program
development can be integrated with validation of the developed code.
The project will not just be about proving a single
program correct, but also about creating a methodology that is
generally applicable.
This will then allow many functional programs to be proved correct,
including programs that use state, exceptions and non-structural recursion.
\end{itemize}
\begin{itemize}
\item[7)] \textbf{Project planning.}
%
\begin{figure}[ht]
\begin{center}
\begingroup
%\normalsize
\sf
\setlength{\unitlength}{.9pt}
\hspace{-2em}%
\begin{picture}(400,309)(0,238)
%\put(0,213){\framebox(400,309){}}
\put(0,512){\makebox(0,0)[lb]{\textsl{year 1}}}
\put(100,512){\makebox(0,0)[lb]{\textsl{year 2}}}
\put(200,512){\makebox(0,0)[lb]{\textsl{year 3}}}
\put(300,512){\makebox(0,0)[lb]{\textsl{year 4}}}
\put(0,238){\textcolor{darkgray}{\line(0,1){261}}}
\put(100,238){\textcolor{darkgray}{\line(0,1){261}}}
\put(200,238){\textcolor{darkgray}{\line(0,1){261}}}
\put(300,238){\textcolor{darkgray}{\line(0,1){261}}}
\put(400,238){\textcolor{darkgray}{\line(0,1){261}}}
\put(50,488){\makebox(0,0)[lt]{\textcolor{darkgray}{\circle*{20}}}}
\put(58,488){\makebox(0,0)[l]{\strut\ visiting Harrison}}
\put(250,488){\makebox(0,0)[lt]{\textcolor{darkgray}{\circle*{20}}}}
\put(258,488){\makebox(0,0)[l]{\strut\ visiting Harrison}}
\put(0,465){\makebox(0,0)[lt]{\textcolor{gray}{\rule{100\unitlength}{18\unitlength}}}}
\put(0,455){\makebox(0,0)[l]{\strut\ 1.1}}
\put(50,440){\makebox(0,0)[lt]{\textcolor{gray}{\rule{100\unitlength}{18\unitlength}}}}
\put(50,430){\makebox(0,0)[l]{\strut\ 1.2}}
\put(100,415){\makebox(0,0)[lt]{\textcolor{gray}{\rule{50\unitlength}{18\unitlength}}}}
\put(100,405){\makebox(0,0)[l]{\strut\ 1.3}}
\put(100,390){\makebox(0,0)[lt]{\textcolor{gray}{\rule{250\unitlength}{18\unitlength}}}}
\put(100,380){\makebox(0,0)[l]{\strut\ 1.4}}
\put(150,365){\makebox(0,0)[lt]{\textcolor{gray}{\rule{100\unitlength}{18\unitlength}}}}
\put(150,355){\makebox(0,0)[l]{\strut\ 2.1a}}
\put(200,355){\makebox(0,0)[l]{\strut\ 2.1b}}
\put(150,340){\makebox(0,0)[lt]{\textcolor{gray}{\rule{200\unitlength}{18\unitlength}}}}
\put(150,330){\makebox(0,0)[l]{\strut\ 2.2}}
\put(100,315){\makebox(0,0)[lt]{\textcolor{gray}{\rule{300\unitlength}{18\unitlength}}}}
\put(100,305){\makebox(0,0)[l]{\strut\ 3}}
\put(50,281){\makebox(0,0)[lt]{\textcolor{darkgray}{\rule{300\unitlength}{18\unitlength}}}}
\put(50,272){\makebox(0,0)[l]{\strut\ conference papers}}
\put(200,256){\makebox(0,0)[lt]{\textcolor{darkgray}{\rule{200\unitlength}{18\unitlength}}}}
\put(200,248){\makebox(0,0)[l]{\strut\ writing thesis}}
\end{picture}
\hspace{-2em}
\endgroup
\smallskip
\end{center}%
\caption{Project schedule.}\label{fig:schedule}
\end{figure}
%
Fig.~\ref{fig:schedule} presents an overview of the project schedule.
It consists of three phases, with each phases divided in
steps as follows:
\medskip
\begin{itemize}
\item[]
\textbf{Phase 1:}
\emph{Create a fully self-verified HOL Light.}
\begin{itemize}
\item[]
1.1.
\emph{Compile the existing HOL in HOL code to OCaml.}
\item[]
1.2.
\emph{Extend the HOL in HOL code to match the actual kernel.}
\item[]
1.3.
\emph{Run HOL Light on top of the extended HOL in HOL code.}
\item[]
1.4.
\emph{Extend the correctness proof to the extended HOL in HOL code.}
\end{itemize}
\item[]
\textbf{Phase 2:}
\emph{Obtain further correctness evidence.}
\begin{itemize}
\item[]
2.1.
\emph{HOL in HOL outside OCaml.}
\begin{itemize}
\item[]
2.1a.
\emph{Run the verified kernel inside HOL Light by rewriting.}
\item[]
2.1b.
\emph{Check the correctness proof in other HOL provers.}
\end{itemize}
\item[]
2.2.
\emph{Prove the OCaml translation correct inside HOL Light.}
\end{itemize}
\item[]
\textbf{Phase 3:}
\emph{Integrate OCaml and HOL development.}
\end{itemize}
\medskip
We will now detail each of these steps:
\medskip
\begin{itemize}
\item[]
\textbf{Phase 1:}
\emph{Create a fully self-verified HOL Light.}
\smallskip
\begin{itemize}
\item[]
1.1.
\emph{Compile the existing HOL in HOL code to OCaml.}
The HOL in HOL formalization represents the verified code
as recursive HOL definitions.
The first part of the project is to translate these `executable'
HOL definitions back into OCaml syntax.
This way an executable HOL Light-like kernel for a simplified logic is obtained.
The HOL in HOL code will have to be modified and extended many times during the project.
For this reason the translation into OCaml will be automated.
This is done by writing a printer for HOL terms in OCaml syntax,
that then is used to print the function definitions from
the HOL in HOL kernel.
%
The result of this step is what amounts to a version of
\emph{program extraction} for HOL Light.
\smallskip
\item[]
1.2.
\emph{Extend the HOL in HOL code to match the actual kernel.}
The current HOL in HOL formalization leaves out some important features
from the HOL Light kernel.
%Most notably definitions of new constants
%and type definitions are not possible.
The HOL in HOL code now is extended to exactly match the
actual HOL Light kernel.
%For this a version of the HOL Light kernel is used that is
%\emph{stateless}, to simplify the HOL in HOL definitions.
%%(It will have to be investigated whether changing the HOL Light
%%kernel even more to have it not use exceptions anymore will
%%still have acceptable performance.)
\smallskip
\item[]
1.3.
\emph{Run HOL Light on top of the extended HOL in HOL code.}
The translation from 1.1 now is applied to the code from 1.2,
and a version of HOL Light is created in which the result of this
has replaced the current HOL Light kernel.
Because the HOL in HOL definitions closely mimic code from the HOL Light
kernel, this is not a large change.
\smallskip
\item[]
1.4.
\emph{Extend the correctness proof to the extended HOL in HOL code.}
Finally the correctness proof from the HOL in HOL project is
extended to cover the extended HOL in HOL code.
%This is the most labor intensive part of Phase 1 of the project.
\smallskip
\end{itemize}
\item[]
\textbf{Phase 2:}
\emph{Obtain further correctness evidence.}
\smallskip
\begin{itemize}
\item[]
2.1.
\emph{HOL in HOL outside OCaml.}
\
\smallskip
\item[]
2.1a.
\emph{Run the verified kernel inside HOL Light by rewriting.}
\looseness=-1
The code from 1.1 that translates HOL definitions into OCaml has not
been verified, and might contain bugs that taint
the correctness result about the HOL in HOL code.
For this reason the HOL in HOL program also is executed
\emph{inside} the HOL system, by using the definitions as
rewrite rules.
This is much slower, but is certain to be correct.
\smallskip
\item[]
2.1b.
\emph{Check the correctness proof in other HOL provers.}
The formalization is imported in other theorem
provers to check the correctness by different code written
for different compilers.
Translators of HOL Light code to Isabelle/HOL \cite{obu:ska:06},
and HOL4 and ProofPower \cite{hur:09,hur:09:1}
are applied, and the results checked in these other systems.
\smallskip
\item[]
2.2.
\emph{Prove the OCaml translation correct inside HOL Light.}
It also is possible to check the correctness of the OCaml translation
\emph{inside} HOL.
The HOL semantics of OCaml Light by Scott Owens \cite{owe:08}
can be combined with a proof producing version of the HOL to OCaml translator
from 1.1 to get a validated version of the OCaml code.
\smallskip
\end{itemize}
\item[]
\textbf{Phase 3:}
\emph{Integrate OCaml and HOL development.}
\smallskip
\begin{itemize}
\item[]
We manually translated OCaml code to HOL definitions
(Step 1.2), and then had the computer translate the result
back to OCaml (Step 1.3).
For this project that is the practical way to go about it,
but for development of validated software in general it is a
convoluted way of working.
A more integrated style of software
development is now developed, with in particular compilation from OCaml to HOL.
Programs then can be written in OCaml, while the corresponding HOL definitions
are created automatically.
\end{itemize}
\end{itemize}
\medskip
\noindent
\label{visits}%
This project is a direct continuation of existing work
by John Harrison, and therefore
two visits are planned for intensive contact between
the {\promovendus} and John Harrison.
Both visits are marked in Fig.~\ref{fig:schedule},
and are planned for the first and third year of the project.
These visits will either consist of the {\promovendus} visiting
Intel in the United States, or of John Harrison visiting the Radboud University Nijmegen in the Netherlands.
(Ideally there will be one visit in each direction.)
These visits will be around one month long.
In the project budget below a special travel budget for these visits is included.
\medskip
\item
\emph{Educational Aspects.}
The {\promovendus} has to be an expert on functional programming
(OCaml), theorem proving (HOL) and mathematical logic (HOL semantics).
In the first years of the project he or she will
interact with experts in these
fields, and attend relevant courses at the Radboud University Nijmegen.
The {\promovendus} also will
attend a summer school related to the field of interactive theorem proving.
\end{itemize}
\begin{itemize}
\item[8)] \textbf{Expected Use of Instrumentation.}
The only part of the project that involves heavy computation is
2.1a, which is not a bottleneck.
Therefore there is no need for instrumentation beyond
a regular workstation.
\end{itemize}
\begin{itemize}
\item[9)] \textbf{Literature.}
The key publications of the research team relevant for the project are
\cite{har:06}, \cite{har:08}, \cite{kal:wie:07}, \cite{oco:06} and \cite{wie:06}.
\medskip
\begingroup
\def\section#1#2{}
\def\small{}
\bibliographystyle{plain}
\bibliography{frk}
\endgroup
\end{itemize}
\newpage
\begin{itemize}
\item[10)] \textbf{Requested Budget.}
\begin{center}
\begin{tabular}{llrrrclrcl}
\hline
\noalign{\smallskip}
\rlap{\emph{PhD student} for 4 years} \\
\noalign{\smallskip}
& a) appointment (incl. benchfee) &$\hspace{5em}$&& \llap{standard amount} &$\hspace{1em}$& =&\euro && 195,424 \\
& b) additional travel budget &&& \llap{(\emph{detailed below})} && =&\euro && \phantom{00}7,600 \\
\noalign{\vbox to 0pt{\vspace{-.2pt}\hbox to 0pt{\hspace{33em}\rlap{+}\hss}\vss}\smallskip}
\hline
\noalign{\smallskip}
\rlap{Total} &&&&&& =&k$\,$\euro && \textbf{203} \\
\noalign{\bigskip\medskip}
\rlap{\emph{Budget for visiting Harrison}} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
$\enskip$ & return ticket Netherlands/USA & & {\euro} \phantom{0,}800 \\
& living costs abroad for one month & & {\euro} 3,000 \\
\noalign{\vbox to 0pt{\vspace{-.2pt}\hbox to 0pt{\hspace{25em}\rlap{+}\hss}\vss}\smallskip}
\cline{4-4}
\noalign{\smallskip}
\rlap{1 $\times$ visiting Harrison} && & {\euro} 3,800 \\
\noalign{\smallskip}
\rlap{2 $\times$ visiting Harrison} &&&\llap{2 $\times$ }{\euro} 3,800 &&& =&\euro && \phantom{00}7,600 \\
\noalign{\vbox to 0pt{\vspace{-.2pt}\hbox to 0pt{\hspace{33em}\rlap{+}\hss}\vss}\smallskip}
\hline
\noalign{\smallskip}
\rlap{Total} &&&&&& =&\euro && \phantom{00}7,600
\end{tabular}
\end{center}
\noindent
(The motivation for two visits that will allow the {\promovendus} to
work with John Harrison is given in Section~7 on page~\pageref{visits}.)
\vspace{-\bigskipamount}
\end{itemize}
\end{document}