State machine inference for security protocols

This webpage tries to give a overview of experiments with using state machine inference (aka active automata learning) to analyse implementations of standard security protocols. If you know of experiments that we are missing in this overview, please send us an email!


Implementing a security protocol usually involves implementing a finite state machine (aka a Deterministic Finite Automata or DFA) to correctly handle different protocol flows. If the happy flow is not implemented correctly then you quickly notice, as the implementation simply will not work, but logical flaws in handling non-standard protocol flows can be harder to spot. Such flaws can introduce security vulnerabilities, as has for instance been the case for TLS (see here and here).

State machine inference is a technique to automatically infer the state machine from an implementation by means of black-box testing. This means we obtain formal models for free! Inspecting the state machines obtained - either manually or using a model-checker - can reveal flaws in implementations. It may also reveal differences between implementations of the same protocol that allow fingerprinting of particular implementations. Such differences are also an indication that the specifications may be unclear or ambiguous.

Often protocol state machines are not well-specified or left largely implicit in specifications. Paying explicit attention to these state machines can be seen as adhering to the principles of LangSec (language-theoretic security), as discussed in more detail here. Active state machine learning can also be regarded as a form of fuzzing, where you do not fuzz individual messages, but rather sequences of messages.

Further reading

The idea of state machine learning goes back to the work by Angluin from the 1980s. A standard software library for inferring state machines which we have used extensively in our own work is LearnLib. One of the first tools to apply the idea of state machine learning to security protocols was Prospex, which specifically targetted botnet protocols, with the aim to automatically reverse engineer these. Overviews of automated tools for protocol reverse engineering, not just using active state machine learning but also other techniques, is given in this survey paper or this one.

Overview of protocols analysed

Below an overview of research into the use of using active learning to analyse implementations of standardised security protocols. (So we exclude e.g. botnet protocols in this overview.) This wiki provides additional information about some of the experiments listed below. This wiki also includes information n some case studies with protocols that are not security-related, such as TCP, MQTT, and SIP.

Network security protocols

Smartcards and banking tokens

Industrial Control Systems