NetKAT Language

Today we will shift gears and begin exploring a set of topics that are more mathematical in nature, starting with the NetKAT language.

Overview

We have seen the value of having linguistic abstractions that sit atop the hardware-level primitives provided by OpenFlow. In particular, by writing programs in a higher-level programming language like Frenetic, we gain the ability to compose applications, since the compiler and run-time system are responsible for handling low-level issues such as the precise rules used to populate match-action tables.

But Frenetic was based on an ad hoc mathematical foundation. And as the language evolved over time, features were added and removed, seemingly without rhyme or reason. In addition, the meaning of some constructs in the language was never entirely clear. For instance, consider the following Frenetic program:

Flood() >> Flood()    

If we assume that evaluating a single instance of the flood primitive generates n packets, one for every physical port on the switch, should evaluating the composition of two primitives generate n^2 packets? To answer this question and many others, we need a more foundational account of OpenFlow programming.

Design Goals

Taking a step back, there are two pieces of functionality that are essential for writing OpenFlow programs:

  • Packet classification: we need the ability to write programs that implement different functionality on different types of packets, and

  • Network-wide forwarding: we need the ability to program the network as a whole, rather than in terms of hop-by-hop behavior on individual devices.

NetKAT is designed to be a core calculus that provides both pieces of functionality. Formally, the language is the combination of a Boolean algebra and a Kleene algebra, interpreted in the networking domain. The elements of the Boolean algebra describe logical predicates on packets used for classifying packets, while the elements of the Kleene algebra describe regular paths through the network used for forwarding packets. The language is designed to be minimal, meaning that it is designed around a set of simple and orthogonal core constructs. Many richer constructs are defined by translation into the core constructs.

Model

NetKAT is based on a model in which programs describe pure functions on packets. A packet is a record from header fields f to values n. Given a packet pk, we use the notation pk.f to denote the f field of pk and pk[f := n] to denote the packet obtained by setting the f field of pk to n. We assume that the set of all packets is finite – i.e., the sets of header fields and values that can be associated with those fields are both drawn from finite domains. For simplicity, we do not distinguish between the different types of packets that can arise in a network – e.g., ARP vs. IPv4.
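
To make the model concrete, here is a small Python sketch of these packet operations. It is our own illustration, not part of NetKAT: we represent a packet as a hashable frozenset of (field, value) pairs so that packets can later be collected into sets.

def make_packet(**fields):
    """A packet is a record from header fields to values."""
    return frozenset(fields.items())

def get(pk, f):
    """pk.f: look up the f field of packet pk."""
    return dict(pk)[f]

def update(pk, f, n):
    """pk[f := n]: a copy of pk with the f field set to n."""
    d = dict(pk)
    d[f] = n
    return frozenset(d.items())

pk = make_packet(switch=1, port=2)
assert get(pk, "switch") == 1
assert get(update(pk, "port", 3), "port") == 3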

Syntax

NetKAT’s syntax is captured by the following grammar:

p ::= -- Boolean Algebra 
    | False     -- false
    | True      -- true
    | f = n     -- test
    | !p        -- negation
    | p1 | p2   -- disjunction
    | p1 & p2   -- conjunction
    -- Kleene Algebra
    | 0         -- empty
    | 1         -- unit
    | p1 + p2   -- union 
    | p1 ; p2   -- concatenation 
    | p*        -- iteration
    -- Packet processing
    | f := n    -- modification    
    | S => S'   -- forward

The first six constructs define syntax for a standard Boolean algebra over primitive packet tests of the form f = n. The language can be generalized to accommodate richer forms of tests, but for simplicity we will focus on comparisons against concrete values in this lecture. The next five operators define syntax for regular expressions. The final two constructs define syntax for packet-processing operations: modifying header fields and moving packets between switches.
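
For concreteness, here is one way to represent NetKAT terms in a general-purpose language. The sketch below is our own Python encoding using tagged tuples; the constructor names and tags are arbitrary choices, not part of the language definition.

FALSE = ("false",)                          # false
TRUE  = ("true",)                           # true
def test(f, n):   return ("test", f, n)     # f = n
def neg(p):       return ("neg", p)         # !p
def disj(p, q):   return ("or", p, q)       # p1 | p2
def conj(p, q):   return ("and", p, q)      # p1 & p2
ZERO  = ("zero",)                           # 0
ONE   = ("one",)                            # 1
def union(p, q):  return ("plus", p, q)     # p1 + p2
def seq(p, q):    return ("seq", p, q)      # p1 ; p2
def star(p):      return ("star", p)        # p*
def mod(f, n):    return ("mod", f, n)      # f := n
def fwd(s1, s2):  return ("fwd", s1, s2)    # S => S'

# Example: the program  switch = 1 ; port := 2
prog = seq(test("switch", 1), mod("port", 2))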

Denotational Semantics

The formal semantics of NetKAT can be modeled in a number of different ways. Arguably the most intuitive semantics is a denotational semantics, which models the meaning of NetKAT programs as functions from packets to sets of packets. The notation [[ p ]] indicates the function denoted by p.

[[ p ]] : Packet -> Packet Set
[[ False ]] = lambda pk . {}
[[ True ]] = lambda pk . { pk }     
[[ f = n ]] = lambda pk . { { pk } if pk.f = n
                          { {}     otherwise 
[[ !p ]] = lambda pk . { pk } \ [[ p ]] pk
[[ p1 | p2 ]] = lambda pk . [[ p1 ]] pk ∪ [[ p2 ]] pk
[[ p1 & p2 ]] = lambda pk . [[ p1 ]] pk ∩ [[ p2 ]] pk    
[[ 0 ]] = lambda pk . {} 
[[ 1 ]] = lambda pk . { pk }     
[[ p1 + p2 ]] = lambda pk . [[ p1 ]] pk ∪ [[ p2 ]] pk
[[ p1 ; p2 ]] = lambda pk . ∪ { [[ p2 ]] pk' | pk' ∈ [[ p1 ]] pk }
[[ p* ]] = lambda pk . ∪i [[ p ]]^i pk
[[ f := n ]] = lambda pk . { pk[f := n] }
[[ S => S' ]] = lambda pk . { { pk[switch := S'] } if pk.switch = S
                            { {}                   otherwise

Here [[ p ]]^i denotes the i-fold iteration of [[ p ]]: [[ p ]]^0 pk = { pk } and [[ p ]]^(i+1) pk = ∪ { [[ p ]] pk' | pk' ∈ [[ p ]]^i pk }.
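
As a sanity check, this semantics can be transcribed almost literally into executable code. The sketch below is our own illustration: it assumes the frozenset packet representation and tagged-tuple terms from the earlier sketches, and computes the iteration for p* as a finite fixpoint, which terminates because the set of all packets is finite.

def get(pk, f):
    return dict(pk)[f]

def update(pk, f, n):
    d = dict(pk)
    d[f] = n
    return frozenset(d.items())

def denote(p, pk):
    """[[ p ]] pk : the set of packets produced by running p on pk."""
    tag = p[0]
    if tag in ("false", "zero"):
        return frozenset()
    if tag in ("true", "one"):
        return frozenset({pk})
    if tag == "test":                                   # f = n
        return frozenset({pk}) if get(pk, p[1]) == p[2] else frozenset()
    if tag == "neg":                                    # !p, p a predicate
        return frozenset({pk}) - denote(p[1], pk)
    if tag in ("or", "plus"):                           # union
        return denote(p[1], pk) | denote(p[2], pk)
    if tag == "and":                                    # intersection
        return denote(p[1], pk) & denote(p[2], pk)
    if tag == "seq":                                    # Kleisli composition
        return frozenset().union(*(denote(p[2], pk2)
                                   for pk2 in denote(p[1], pk)))
    if tag == "star":                                   # union of [[ p ]]^i
        acc = frozenset({pk})
        while True:
            step = frozenset().union(*(denote(p[1], q) for q in acc))
            if step <= acc:
                return acc
            acc = acc | step
    if tag == "mod":                                    # f := n
        return frozenset({update(pk, p[1], p[2])})
    if tag == "fwd":                                    # S => S'
        return (frozenset({update(pk, "switch", p[2])})
                if get(pk, "switch") == p[1] else frozenset())
    raise ValueError(f"unknown construct: {p!r}")

# Example: [[ switch = 1 ; port := 2 ]] applied to a packet at switch 1.
pk = frozenset({("switch", 1), ("port", 9)})
print(denote(("seq", ("test", "switch", 1), ("mod", "port", 2)), pk))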

There are a few things to notice about the semantics. First, the definition of the semantics for negation only makes sense if [[ p ]] returns either the empty set or the singleton set containing the input packet. In fact, this property holds for all elements of the Boolean sub-algebra, consisting of True, False, and f = n closed under !, | and &.

Lemma: if p is an element of the Boolean sub-algebra, then either

  • [[ p ]] pk = {} or
  • [[ p ]] pk = { pk }

Intuitively, this says that elements of the Boolean sub-algebra behave like filters that discard packets that do not satisfy the predicate they encode. We adopt the convention that ! is only ever applied to other elements of the Boolean sub-algebra.

Second, a number of operators are redundant:

  • False and 0
  • True and 1
  • | and +
  • & and ;

In addition, S => S' can be encoded as switch = S; switch := S'.

Hence, we can reduce NetKAT further to an even smaller core,

p ::=
    | 0         -- false
    | 1         -- true
    | f = n     -- test
    | !p        -- negation
    | p1 + p2   -- union 
    | p1 ; p2   -- concatenation 
    | p*        -- iteration
    | f := n    -- modification    

and streamline the semantics as follows,

[[ p ]] : Packet -> Packet Set
[[ 0 ]] = lambda pk . {} 
[[ 1 ]] = lambda pk . { pk }     
[[ f = n ]] = lambda pk . { { pk } if pk.f = n
                          { {}     otherwise 
[[ !p ]] = lambda pk . { pk } \ [[ p ]] pk
[[ p1 + p2 ]] = lambda pk . [[ p1 ]] pk ∪ [[ p2 ]] pk
[[ p1 ; p2 ]] = lambda pk . ∪ { [[ p2 ]] pk' | pk' ∈ [[ p1 ]] pk }
[[ p* ]] = lambda pk . ∪i [[ p ]]^i pk
[[ f := n ]] = lambda pk . { pk[f := n] }

without restricting the expressiveness of the language.
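
To make the reduction concrete, here is a small desugaring sketch (ours, over the assumed tagged-tuple encoding) that translates the full syntax into this core:

def desugar(p):
    """Translate full NetKAT terms into the reduced core."""
    tag = p[0]
    if tag == "false":
        return ("zero",)
    if tag == "true":
        return ("one",)
    if tag == "or":                     # p1 | p2  becomes  p1 + p2
        return ("plus", desugar(p[1]), desugar(p[2]))
    if tag == "and":                    # p1 & p2  becomes  p1 ; p2
        return ("seq", desugar(p[1]), desugar(p[2]))
    if tag == "fwd":                    # S => S'  becomes  switch = S ; switch := S'
        return ("seq", ("test", "switch", p[1]), ("mod", "switch", p[2]))
    if tag in ("zero", "one", "test", "mod"):
        return p
    if tag == "neg":
        return ("neg", desugar(p[1]))
    if tag in ("plus", "seq"):
        return (tag, desugar(p[1]), desugar(p[2]))
    if tag == "star":
        return ("star", desugar(p[1]))
    raise ValueError(f"unknown construct: {p!r}")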

Axiomatic Semantics

A number of properties of interest can be reduced to questions about program equivalence, as the examples below illustrate. But how do we show that two programs are equivalent? One way would be to use the denotational semantics,

p == q iff [[ p ]] = [[ q ]]    

But this requires reasoning about mathematical objects, including least fixpoints, which is possible but not entirely straightforward.

Another approach is to define a deductive system in terms of a finite set of equational axioms. Then a proof of equivalence for p and q is just a sequence of valid applications of these axioms that starts with p and ends with q, using reflexivity, transitivity, symmetry, and substitution as needed.

It turns out that there is a relatively simple set of axioms for NetKAT that is both sound and complete. These axioms can be divided into three groups, presented below. First, however, let us see how some properties of interest can be encoded as equivalences and inclusions between NetKAT programs.

Reachability

For instance, suppose we want to verify that location b is reachable from location a, where each location is encoded as a NetKAT predicate. Writing pol for the switch policy, topo for a program that encodes the links in the topology, and in and out for the ingress and egress predicates, we can check whether:

a; in; (pol; topo)*; pol; out; b != 0
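
As a tiny, hypothetical instance, suppose packets carry a single header field sw, the predicates a and b are sw = 1 and sw = 3, the policy passes every packet along unchanged (pol = 1), in = out = 1, and the topology is a chain of three switches encoded as

topo = (sw = 1; sw := 2) + (sw = 2; sw := 3)

Then the reachability check simplifies to

sw = 1; topo*; sw = 3 != 0

which holds: two iterations of the star carry a packet from switch 1 to switch 3, so b is reachable from a.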

Waypointing

Next, suppose we want to verify that all paths from a to b go via a waypoint w. We can check whether

a; in; (pol; topo)*; pol; out; b <= 
a; in; (!b; pol; topo)*; w; (!a; pol; topo)*; pol; b; out

The inclusion relation <= can be defined as follows:

p <= q =def= p + q == q

Hence, the check above asks whether the behaviors encoded by the program on the left are included in the set of behaviors encoded by the specification on the right.

Forgetting histories

The full NetKAT language also includes a primitive dup, and its semantics is formulated in terms of packet histories, which record the sequence of packet states produced as a packet traverses the network. In many cases, it is convenient to ignore histories and simply model programs as functions from packets to sets of packets, as we have done above. We can achieve this by first applying a simple homomorphism φ that maps dup to 1:

φ(0) = 0
φ(1) = 1
φ(f = n) = f = n
φ(f := n) = f := n    
φ(dup) = 1
φ(!p) = !φ(p)
φ(p1 + p2) = φ(p1) + φ(p2)
φ(p1 ; p2) = φ(p1) ; φ(p2)
φ(p*) = φ(p)*    
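
Concretely, φ is a simple recursive traversal. Here is a Python sketch over the assumed tagged-tuple encoding, with dup encoded as the tag ("dup",):

def phi(p):
    """Map dup to 1 and leave every other construct unchanged."""
    tag = p[0]
    if tag == "dup":
        return ("one",)
    if tag in ("zero", "one", "test", "mod"):
        return p
    if tag == "neg":
        return ("neg", phi(p[1]))
    if tag in ("plus", "seq"):
        return (tag, phi(p[1]), phi(p[2]))
    if tag == "star":
        return ("star", phi(p[1]))
    raise ValueError(f"unknown construct: {p!r}")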

Equational Axioms

Now we present a collection of axioms for proving equivalences between NetKAT programs.

Kleene Algebra

The first group of axioms captures the conditions for the Kleene Algebra fragment of the language:

p + (q + r) == (p + q) + r      KA-Plus-Associative
p + q == q + p                  KA-Plus-Commutative
p + 0 == p                      KA-Plus-Zero
p + p == p                      KA-Plus-Idempotent
(p ; q) ; r == p ; (q ; r)      KA-Sequence-Associative
1 ; p == p                      KA-One-Sequence    
p ; 1 == p                      KA-Sequence-One
p; (q + r) == p; q + p; r       KA-Sequence-Distribute-Left
(p + q); r == p; r + q; r       KA-Sequence-Distribute-Right
0 ; p == 0                      KA-Zero-Sequence
p ; 0 == 0                      KA-Sequence-Zero
1 + p; p* == p*                 KA-Unroll-Left    
1 + p*; p == p*                 KA-Unroll-Right
q + p; r <= r => p*; q <= r     KA-Least-Fixpoint-Left
p + q; r <= q => p; r* <= q     KA-Least-Fixpoint-Right                

For the most part, these axioms are straightforward. The final two axioms capture the conditions needed for Kleene star to be a least fixpoint and were originally published in a paper by Kozen in 1994.
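
For example, these axioms already suffice to derive the inclusion 1 <= p*, i.e., 1 + p* == p*:

   1 + p*
== 1 + (1 + p; p*)    -- KA-Unroll-Left
== (1 + 1) + p; p*    -- KA-Plus-Associative
== 1 + p; p*          -- KA-Plus-Idempotent
== p*                 -- KA-Unroll-Left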

Boolean Algebra

The second group of axioms captures additional conditions for the Boolean Algebra fragment of the language:

a + (b ; c) == (a + b); (a + c) BA-Plus-Distribute
a + 1 == 1                      BA-Plus-One
a + !a == 1                     BA-Excluded-Middle
a; b == b; a                    BA-Sequence-Commutative
a; !a == 0                      BA-Contradiction
a ; a == a                      BA-Sequence-Idempotent                    
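
Combined with the Kleene algebra axioms, these are enough to derive other familiar Boolean identities. For example, here is a proof of the absorption law a; (a + b) == a:

   a; (a + b)
== a; a + a; b        -- KA-Sequence-Distribute-Left
== a + a; b           -- BA-Sequence-Idempotent
== a; 1 + a; b        -- KA-Sequence-One
== a; (1 + b)         -- KA-Sequence-Distribute-Left
== a; (b + 1)         -- KA-Plus-Commutative
== a; 1               -- BA-Plus-One
== a                  -- KA-Sequence-One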

Packet Algebra

The final group of axioms captures conditions on packet tests, modifications, and dup.

f := n; f' := n' == f' := n'; f := n    if f != f'  PA-Mod-Mod-Commutative
f := n; f' = n' == f' = n'; f := n      if f != f'  PA-Mod-Filter-Commutative
dup; f = n == f = n; dup                            PA-Dup-Filter-Commutative
f := n; f = n == f := n                             PA-Mod-Filter
f = n; f := n == f = n                              PA-Filter-Mod
f := n; f := n' == f := n'                          PA-Mod-Mod
f = n; f = n' == 0                      if n != n'  PA-Contradiction
Sum_i f = i == 1                                    PA-Filter-All
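
As a small example of how these axioms are used, here is a derivation showing that, after assigning n to f, a test of f against n is redundant even across an intervening assignment to a different field f' (assuming f != f'):

   f := n; f' := n'; f = n
== f := n; f = n; f' := n'    -- PA-Mod-Filter-Commutative
== f := n; f' := n'           -- PA-Mod-Filter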

Soundness and Completeness

Recall the axioms presented above, which allow us to prove equivalences of the form p == q. We would like to know that these are “good” axioms in some sense. There are two properties we might like:

  • Soundness: if p == q then [[ p ]] = [[ q ]]
  • Completeness: if [[ p ]] = [[ q ]] then p == q.

Showing soundness is largely straightforward – we prove by induction that every valid instance of an axiom relates programs that are denotationally equal.

But completeness is more difficult. Our proof will work by first showing how to put programs into a kind of “normal form” and then showing that elements that are equal in this normal form can be proved equivalent using the axioms.

Reduced NetKAT

As a first step, let us define reduced NetKAT, which eliminates negation and replaces tests with complete tests and modifications with complete assignments:

r ::= α           -- complete test
    | β           -- complete assignment
    | dup         -- duplication
    | r1 + r2     -- union
    | r1 ; r2     -- sequence
    | r*          -- iteration

We let α range over complete tests – i.e., sequences that test the value of every header field in the packet:

α ::= f1 = n1 ; ... ; fk = nk    

Similarly, we let β range over complete assignments – i.e., sequences that assign a value to every header field in the packet:

β ::= f1 := n1 ; ... ; fk := nk    

Note that complete tests and complete assignments are in one-to-one correspondence. We will write β_α for the complete assignment corresponding to the complete test α.

Reduced NetKAT is a regular language – a reduced program is just a regular expression over the non-standard but finite alphabet Σ = A ∪ B ∪ { dup }, where A is the set of all complete tests and B is the set of all complete assignments. Hence, we can interpret programs as regular expressions:

R(r) ⊆ (A ∪ B ∪ { dup })*
R(α) = { α }   
R(β) = { β }
R(dup) = { dup }
R(r1 + r2) = R(r1) ∪ R(r2)
R(r1; r2) = R(r1) . R(r2)
R(r*) = R(r)*        

where “.” denotes language concatenation and “*” denotes language iteration.

We can transform an arbitrary NetKAT program into an equivalent program in reduced form, and we can prove the equivalence using the axioms.

Lemma: for all NetKAT programs p, there exists a reduced program r such that p == r.

For example, to keep things simple, suppose packets have just two fields, f1 and f2. Then we can transform a test f1 = n1 into a complete test as follows:

   f1 = n1
== f1 = n1 ; 1              -- KA-Sequence-One
== f1 = n1 ; Sum_i f2 = i   -- PA-Filter-All
== Sum_i (f1 = n1 ; f2 = i) -- KA-Sequence-Distribute-Left

Unfortunately, the following property holds

Fact: [[ r1 ]] = [[ r2 ]] does not imply R(r1) = R(r2).

To see why, consider two distinct complete tests: α1 and α2. The programs α1; α2 and α2; α1 are equivalent – both describe the program that filters away all packets. However, R(α1;α2) is not equal to R(α2; α1), because they are distinct “strings”.

Language Model

To fix the problem with reduced NetKAT identified above, we will work with guarded strings, which are even more normalized:

α ; β1 ; dup; ...; βn; dup; β

Note that an “empty” guarded string has the form α ; β. Intuitively, a guarded string can be thought of as capturing the trajectory of a packet as it goes through the network: initially it satisfies the test α, and then it assumes a sequence of values β1 through βn until the final value β.

We can define a partial concatenation operation on guarded strings as follows:

α1 ; u; β1 ♢ α2 ; v; β2 = { α1 ; u; v; β2     if β1 = β_α2
                          { undefined         otherwise

We “smash” together two guarded strings if the assignment at the end of the first matches the test at the start of the second. Otherwise the operation is undefined. However, when lifted to sets of guarded strings, it becomes a total function: undefined combinations are simply dropped.

G1 ♢ G2 = { g1 ♢ g2 | g1 in G1, g2 in G2, and g1 ♢ g2 is defined }
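
Here is a Python sketch of ♢ under an assumed representation of guarded strings: a triple (alpha, mids, beta), where alpha is the leading complete test, mids is the tuple of complete assignments that each precede a dup, and beta is the final complete assignment. We represent complete tests and complete assignments alike as frozensets of (field, value) pairs, so β_α is represented by the same object as α.

def diamond(g1, g2):
    """Partial concatenation: defined only when the final assignment of g1
    matches the test guarding g2."""
    alpha1, mids1, beta1 = g1
    alpha2, mids2, beta2 = g2
    if beta1 != alpha2:
        return None                      # undefined
    return (alpha1, mids1 + mids2, beta2)

def diamond_sets(G1, G2):
    """Lifted to sets of guarded strings: undefined combinations are dropped."""
    result = set()
    for g1 in G1:
        for g2 in G2:
            g = diamond(g1, g2)
            if g is not None:
                result.add(g)
    return result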

Now we can define a mapping from programs in reduced form to sets of guarded strings.

G(r) ⊆ A ; (B; { dup })* ; B
G(α) = { α; β_α }   
G(β) = { α; β | α in A }
G(dup) = { α; β_α; dup; β_α | α in A }
G(r1 + r2) = G(r1) ∪ G(r2)
G(r1; r2) = G(r1) ♢ G(r2)
G(r*) = G(r)*        

Lemma: [[ r1 ]] = [[ r2 ]] iff G(r1) = G(r2).

We say that a reduced term r is in normal form if G(r) = R(r).

Lemma: For every reduced program r there is a term in normal form r^ such that r == r^.

Completeness proof

With these building blocks in place, the completeness argument is straightforward:

[[ p ]] = [[ q ]]      -- By assumption
p == p^ and q == q^    -- Reduce and normalize
[[ p^ ]] = [[ q^ ]]    -- Soundness
G(p^) = G(q^)          -- Language model
R(p^) = R(q^)          -- Normal form
p^ == q^               -- Kleene Algebra completeness [Kozen '94]
p == q                 -- Transitivity    

Reading

The material presented in this lecture was originally published in the following research paper:

  • Carolyn Jane Anderson, Nate Foster, Arjun Guha, Jean-Baptiste Jeannin, Dexter Kozen, Cole Schlesinger, and David Walker. NetKAT: Semantic Foundations for Networks. In POPL, 2014.