Introduction to P4

Thus far in the course, we’ve seen some simple examples of network algorithms and developed a basic model of SDN.

In this lecture, we’ll introduce P4, a domain-specific language for specifying the behavior of programmable data planes. Unlike models of SDN such as OpenFlow, P4 does not presuppose any protocol functionality, such as Ethernet or IP. Instead, the packet-processing functionality, including the packet parsers and the structure of the pipeline of match-action tables, is defined by the P4 program.

At a high level, P4 is based on standard programming constructs (types, variables, assignment, conditionals, etc.) as well as network-specific packet-processing constructs (parsers, match-action tables, etc.). In this lecture, we’ll focus on the P4 type system and its features for specifying packet parsers. Other features will be introduced in subsequent lectures.

Types

P4 is a statically typed language. Every component of a P4 program has a type that is checked at compile time, and ill-typed programs are rejected by the compiler.

Primitive Types

Because packet-processing often involves manipulating bits in packet headers, P4 provides a rich collection of types for describing various kinds of packet data including:

  • bit<N>: unsigned integers of width N,
  • int<N>: signed integers of width N, and
  • int: arbitrary-precision, signed integers

Integer literals can be written in binary (0b), octal (0o), decimal, or hex (0x) notation. Programmers may also optionally specify the width of an integer literal: for example, 8w0xF specifies the encoding of 15 as an 8-bit unsigned integer.

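The type int is the type of integer literals written without a width, and of other compile-time constants built from them. No run-time value can have type int, so such values are converted to a fixed-width type such as bit<N> or int<N> before they are used at run time.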

Signed operations are carried out using two's-complement arithmetic, and most operations truncate the result in the case of arithmetic overflow.

To convert a value from one type to another, P4 provides casts between different primitive types: e.g., (bit<4>) 8w0xF produces the encoding of 15 as a 4-bit unsigned integer.
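
As a small sketch of how literals, widths, casts, and overflow interact, consider the constant declarations below. The constant names are purely illustrative, and the values in the comments follow from the truncation rules described above.

const bit<8>  a = 8w0xF;          // 15 as an 8-bit unsigned integer
const bit<8>  b = 15;             // literal without a width, converted to bit<8>
const int<8>  c = 8s0x7F;         // 127 as an 8-bit signed integer
const bit<4>  d = (bit<4>) a;     // cast keeps the low 4 bits: 15
const bit<16> e = (bit<16>) a;    // cast to a wider type zero-extends: 15
const bit<8>  f = 8w255 + 8w1;    // unsigned addition wraps around: 0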

Header Types

Packets typically comprise a sequence of headers, each of which is a sequence of fields. For example, an Ethernet packet has the following structure:

+-------------+-------------+------+-------------
| Destination | Source      | Type | Payload ...
+-------------+-------------+------+-------------

The destination and source addresses are 48 bits each, while the type field is 16 bits. The payload is simply the “rest” of the packet and has a variable format depending on the type field.

P4 provides a built-in type for representing headers, using syntax that resembles C structs:

header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

Each component of a header can be accessed using standard “dot” notation—e.g., if a variable ethernet has type ethernet_t, then ethernet.dstAddr denotes the destination address.

A header value can be in one of two states, valid or invalid, and is initially invalid. Reading a field of an invalid header produces an undefined value. The validity of a header can be tested with isValid() and changed with setValid() and setInvalid(). A header also becomes valid when it is extracted in the parser (as explained below).
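
The following sketch shows the validity operations in use. It assumes the ethernet_t declaration above and places the statements in a control block (a construct covered in a later lecture) only so that they have somewhere to live. The name InitEthernet is hypothetical.

control InitEthernet(inout ethernet_t ethernet) {
    apply {
        if (!ethernet.isValid()) {
            // Mark the header valid. Its fields hold unspecified
            // values until they are assigned.
            ethernet.setValid();
            ethernet.dstAddr = 0;
            ethernet.srcAddr = 0;
            ethernet.etherType = 0x0800;
        }
    }
}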

Typedefs and Structs

To support giving convenient names to commonly-used types, P4 provides type definitions:

typedef bit<48> macAddr_t;    

With this declaration, the types bit<48> and macAddr_t are synonyms that are treated as equivalent by the type checker.
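
For instance, the Ethernet header from the previous section could be written with the synonym. This is only a restatement of the earlier declaration, not a different type:

typedef bit<48> macAddr_t;

header ethernet_t {
    macAddr_t dstAddr;     // same as bit<48>
    macAddr_t srcAddr;     // same as bit<48>
    bit<16>   etherType;
}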

P4 also provides standard C-style structs, which are defined as follows:

struct headers_t {
  ethernet_t ethernet;
  ipv4_t ipv4;
}        

Unlike a header, a struct does not have a built-in notion of validity and does not imply any ordering between fields.

Header Stacks and Unions

P4 provides derived types for header stacks and unions. A header stack is similar to an array, but supports additional operations that can be used when parsing packets. If H is a header type, then the type H[N] denotes a header stack type, where N must be a positive integer that is known at compile time. If stack is an expression whose type is a header stack, then:

  • stack[i]: denotes the header at index i,
  • stack.size: denotes the size of the header stack,
  • stack.push_front(n): shifts stack "right" by n, making the n entries at the front of the stack invalid, and
  • stack.pop_front(n): shifts stack "left" by n, making the n elements at the end of the stack invalid.

In addition, within a parser (described below), the following expressions may be used:

  • stack.next: denotes the next element of the header stack that has not yet been populated using a call to extract(...), which is explained below, and
  • stack.last: denotes the last element of the header stack that was previously populated using a call to extract(...).
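
The sketch below exercises the stack operations outside of a parser. The header label_t and the control ShiftLabels are hypothetical names introduced only for illustration, and the control block is used merely as a container for the statements.

header label_t {
    bit<8> value;
}

control ShiftLabels(inout label_t[4] labels) {
    apply {
        // Shift every entry one position to the "right".
        // labels[0] is now invalid and the old labels[3] is lost.
        labels.push_front(1);

        // Fill in the freshly vacated slot at the front.
        labels[0].setValid();
        labels[0].value = 7;

        // Shift back to the "left": the entry just written is
        // discarded and labels[3] becomes invalid.
        labels.pop_front(1);
    }
}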

A header union encodes a disjoint alternative between two headers:

header_union l3 {
  ipv4_t ipv4;
  ipv6_t ipv6;
}    

At most one of the headers may be valid at run time. The components of a header union can be accessed and modified using standard “dot” notation.
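
To illustrate the validity rule, here is a minimal sketch with two small headers. The names a_t, b_t, ab_t, and UseUnion are hypothetical and chosen only to keep the example short. The comments describe the behavior given in the language specification, under which making one member valid makes the others invalid.

header a_t { bit<8>  x; }
header b_t { bit<16> y; }

header_union ab_t {
    a_t a;
    b_t b;
}

control UseUnion(inout ab_t u) {
    apply {
        u.a.setValid();    // u.a is valid, u.b is invalid
        u.a.x = 1;
        u.b.setValid();    // u.b is now valid and u.a becomes invalid
        u.b.y = 2;
    }
}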

Other types

P4 provides a number of other types including:

  • Generics,
  • Types for parsers, actions, tables, controls and other program elements, and
  • Types for extern functions and objects.

See the language specification document for details.

Parsers

The first piece of a P4 program is usually a parser that maps the bits in the actual packet into typed representations. A typical parser might be declared as follows:

parser P(packet_in packet,
         out headers_t headers,
         inout meta_t meta,
         inout standard_metadata_t std_meta) {
  ...
}

Here the packet argument is an object that encapsulates the packet being parsed. It has a generic method extract that can be used to populate headers. The headers, meta, and std_meta arguments are data structures representing the parsed headers, along with program-specific and standard metadata. Typically the types of headers and meta are structs defined by the programmer, while the type of std_meta is a struct defined in the standard library.

The direction annotations out and inout indicate arguments that are write-only and read-write respectively. There is also an in annotation that indicates a read-only argument.
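
The three directions can be seen side by side in the following sketch. The control Directions and its parameters are hypothetical, and the control block serves only as a container for the statements.

control Directions(in bit<8> x, out bit<8> y, inout bit<8> z) {
    apply {
        y = x + 1;     // x may only be read; y is written here
        z = z + x;     // z may be both read and written
        // x = 0;      // would be rejected: x is read-only
    }
}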

Internally, a parser describes a state machine in which each state may:

  • extract bits out of the packet header,
  • branch on the values of data, and
  • transition to the next state.

For example, the following parser recognizes Ethernet and IPv4 packets.

state start {
  transition parse_ethernet;
}
state parse_ethernet {
  packet.extract(headers.ethernet);
  // Branch on the EtherType: 0x0800 identifies IPv4.
  transition select(headers.ethernet.etherType) {
    0x0800 : parse_ipv4;
    default : accept;
  }
}
state parse_ipv4 {
  packet.extract(headers.ipv4);
  transition accept;
}

Parsers have several special states including the initial state (start), which must be explicitly defined, as well as accepting (accept) and rejecting (reject) final states.
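
As a sketch of how the reject state might be used, the following variant of the parse_ethernet state rejects every packet that is not IPv4 instead of accepting it. The rest of the parser is assumed to be unchanged.

state parse_ethernet {
  packet.extract(headers.ethernet);
  transition select(headers.ethernet.etherType) {
    0x0800 : parse_ipv4;
    default : reject;    // any other EtherType ends parsing in reject
  }
}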

Example

Next, let us see how to parse packets with variable structure. We will work with a simple example involving source routing: each packet is either a standard IPv4 packet or a source-routed packet.

First, we will define types to represent Ethernet headers

header ethernet_t {
    bit<48>   dstAddr;
    bit<48>   srcAddr;
    bit<16>   etherType;
}

and source routing headers,

header srcRoute_t {
    bit<1>    bos;
    bit<15>   port;
}

Intuitively, the port field encodes the port that the packet should be forwarded out on, while the bos field is a bottom-of-stack marker that is set to 1 on the last element of the stack.

We also define a struct to represent all headers:

struct headers {
    ethernet_t      ethernet;
    srcRoute_t[8]   srcRoutes;
    ipv4_t          ipv4;       // extracted by the parse_ipv4 state
}

Note that srcRoutes is a stack containing up to 8 elements.
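
The example refers to a type ipv4_t for IPv4 headers, whose declaration is not shown. One plausible declaration, following the standard 20-byte IPv4 header layout without options, is sketched below. The field names are an assumption and may differ from those used elsewhere in the course.

header ipv4_t {
    bit<4>    version;
    bit<4>    ihl;
    bit<8>    diffserv;
    bit<16>   totalLen;
    bit<16>   identification;
    bit<3>    flags;
    bit<13>   fragOffset;
    bit<8>    ttl;
    bit<8>    protocol;
    bit<16>   hdrChecksum;
    bit<32>   srcAddr;
    bit<32>   dstAddr;
}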

With these type definitions, we can define the parser itself:

parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t standard_metadata) {
    
    state start {
        transition parse_ethernet;
    }

    state parse_ethernet {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            0x1234: parse_srcRouting;
            default: accept;
        }
    }

    state parse_srcRouting {
        packet.extract(hdr.srcRoutes.next);
        transition select(hdr.srcRoutes.last.bos) {
            1: parse_ipv4;
            default: parse_srcRouting;
        }
    }

    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }

}

The most interesting state is parse_srcRouting, which repeatedly extracts the next element of the hdr.srcRoutes stack until it finds an element with bos set to 1. If the stack runs out of space before such an element is found, the next call to extract fails and the parser transitions to reject.

Discussion

  • What kinds of errors can arise in a parser?

  • Are there packet formats that would be difficult to express in P4?

Reading

For a full description of the P4 language, you may consult the language specification document.