Introduction to P4
Thus far in the course, we’ve seen some simple examples of network algorithms and developed a basic model of SDN.
In this lecture, we’ll introduce P4, a domain-specific language for specifying the behavior of programmable data planes. Unlike models of SDN such as OpenFlow, P4 does not presuppose any functionality, such as Ethernet or IP. Instead, the packet-processing functionality, including the packet parsers and the structure of the pipeline of match-action tables, is defined by the P4 program.
At a high level, P4 is based on standard programming constructs (types, variables, assignment, conditionals, etc.) as well as network-specific packet-processing constructs (parsers, match-action tables, etc.). In this lecture, we’ll focus on the P4 type system and its features for specifying packet parsers. Other features will be introduced in subsequent lectures.
Types
P4 is a statically-typed language. Every component of a P4 program has a type that is checked at compile time, and programs that are ill-typed are rejected by the compiler.
Primitive Types
Because packet-processing often involves manipulating bits in packet headers, P4 provides a rich collection of types for describing various kinds of packet data including:
- bit<N>: unsigned integers of width N,
- int<N>: signed integers of width N, and
- int: arbitrary-precision, signed integers.
Integer literals can be written in binary (0b), octal (0o),
decimal, or hex (0x) notation. Programmers may also optionally
specify the width of an integer literal, e.g., 8w0xF, which
specifies the encoding of 15 as an 8-bit unsigned integer.
The type int is used by the compiler for integer literals and other
compile-time constants; values of this type do not exist at run time.
Signed operations are carried out using two's-complement arithmetic, and most operations truncate the result in the case of arithmetic overflow.
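The wrap-around behavior can be seen in a small fragment. This is a minimal sketch rather than a complete program: the control name OverflowDemo and its parameters are hypothetical, and it assumes a compiler that supports the saturating operator |+|, which was added in later revisions of the P4_16 specification.

```p4
// Hypothetical control used only to illustrate fixed-width arithmetic.
control OverflowDemo(out bit<8> wrapped,
                     out int<8> wrappedSigned,
                     out bit<8> saturated) {
    apply {
        bit<8> u = 8w255;            // largest 8-bit unsigned value
        int<8> s = 8s127;            // largest 8-bit signed value

        wrapped = u + 8w1;           // result is truncated to 8 bits: 0
        wrappedSigned = s + 8s1;     // two's-complement wrap-around: -128
        saturated = u |+| 8w1;       // saturating addition stays at 255
    }
}
```

The saturating form is one of the exceptions to the "most operations truncate" rule: it clamps at the largest representable value instead of wrapping.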
To convert a value from one type to another, P4 provides casts between
different primitive types: e.g., (bit<4>) 8w0xF produces the
encoding of 15 as a 4-bit unsigned integer.
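A few more casts show what happens to the bits when the width or signedness changes. The following sketch uses a hypothetical control named CastDemo; only the cast expressions themselves matter.

```p4
// Hypothetical control used only to illustrate casts between primitive types.
control CastDemo(out bit<4> low,
                 out bit<16> wide,
                 out int<8> reinterpreted) {
    apply {
        bit<8> x = 8w0xAB;           // 171 as an 8-bit unsigned integer

        low = (bit<4>) x;            // keeps the 4 least-significant bits: 0xB
        wide = (bit<16>) x;          // pads with zeros on the left: 0x00AB
        reinterpreted = (int<8>) x;  // same bits read as two's complement: -85
    }
}
```

Narrowing casts discard the most-significant bits, so they can silently change a value; casts between bit<N> and int<N> of the same width keep every bit and change only the interpretation.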
Header Types
Packets typically comprise a sequence of headers, each of which is a sequence of fields. For example, an Ethernet packet has the following structure:
+-------------+-------------+------+-------------
| Destination | Source | Type | Payload ...
+-------------+-------------+------+-------------
The destination and source addresses are 48 bits each, while the type field is 16 bits. The payload is simply the “rest” of the packet and has a variable format depending on the type field.
P4 provides a built-in type for representing headers, using syntax
that resembles C structs:
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}
Each component of a header can be accessed using standard “dot”
notation—e.g., if a variable ethernet has type ethernet_t, then
ethernet.dstAddr denotes the destination address.
A header value can be in one of two states, valid or invalid, and is
initially invalid. Reading a field of an invalid header produces an
undefined value. A header becomes valid when setValid() is called on
it or when it is extracted in the parser (as explained below), and
becomes invalid again when setInvalid() is called. The isValid()
method tests the current state.
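The validity operations can be sketched as follows. The control EnsureEthernet is hypothetical and the values it writes are placeholders; the point is that a header should be checked with isValid() before it is read, and that setValid() alone does not give the fields defined values.

```p4
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

// Hypothetical control: makes sure the Ethernet header is valid and
// fully initialized before later code reads it.
control EnsureEthernet(inout ethernet_t ethernet) {
    apply {
        if (!ethernet.isValid()) {
            // Mark the header valid; its fields remain unspecified
            // until each one is assigned.
            ethernet.setValid();
            ethernet.dstAddr = 0;
            ethernet.srcAddr = 0;
            ethernet.etherType = 0x0800;
        }
    }
}
```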
Typedefs and Structs
To support giving convenient names to commonly-used types, P4 provides type definitions:
typedef bit<48> macAddr_t;
With this declaration, the types bit<48> and macAddr_t are
synonyms that are treated as equivalent by the type checker.
P4 also provides standard C-style structs, which are defined as follows:
struct headers_t {
    ethernet_t ethernet;
    ipv4_t ipv4;
}
Unlike a header, a struct does not have a built-in notion of
validity and does not imply any ordering between fields.
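Putting typedefs, headers, and structs together might look like the sketch below. The ip4Addr_t synonym, the field layout of ipv4_t (a fixed 20-byte IPv4 header without options), and the SwapMacs control are assumptions made for illustration; they are not defined in this lecture.

```p4
typedef bit<48> macAddr_t;
typedef bit<32> ip4Addr_t;   // hypothetical synonym for IPv4 addresses

header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16>   etherType;
}

// Assumed IPv4 layout: fixed 20-byte header, options not modelled.
header ipv4_t {
    bit<4>    version;
    bit<4>    ihl;
    bit<8>    diffserv;
    bit<16>   totalLen;
    bit<16>   identification;
    bit<3>    flags;
    bit<13>   fragOffset;
    bit<8>    ttl;
    bit<8>    protocol;
    bit<16>   hdrChecksum;
    ip4Addr_t srcAddr;
    ip4Addr_t dstAddr;
}

struct headers_t {
    ethernet_t ethernet;
    ipv4_t     ipv4;
}

// Hypothetical control: macAddr_t and bit<48> are synonyms, so values
// move between them without a cast.
control SwapMacs(inout headers_t headers) {
    apply {
        bit<48> tmp = headers.ethernet.dstAddr;
        headers.ethernet.dstAddr = headers.ethernet.srcAddr;
        headers.ethernet.srcAddr = tmp;
    }
}
```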
Header Stacks and Unions
P4 provides derived types for header stacks and unions. A header
stack is similar to an array, but supports additional operations that
can be used when parsing packets. If header is a header type, then
the type header[N] denotes a header stack type, where N must be an
integer literal. If stack is an expression whose type is a header
stack then:
- stack[i]: denotes the header at index i,
- stack.size: denotes the size of the header stack,
- stack.push_front(n): shifts stack "right" by n, making the n entries at the front of the stack invalid, and
- stack.pop_front(n): shifts stack "left" by n, making the n elements at the end of the stack invalid.
In addition, within a parser (described below), the following expressions may be used:
- stack.next: denotes the next element of the header stack that has not been populated using a call to extract(...), which is explained below, and
- stack.last: is a reference to the last element of the header stack that was previously populated using a call to extract(...).
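Outside a parser, the shifting operations can be sketched as follows. The header hop_t and the control StackDemo are hypothetical, and the value written into the new front element is a placeholder.

```p4
// Hypothetical header used as the element type of the stack.
header hop_t {
    bit<1>  bos;
    bit<15> port;
}

control StackDemo(inout hop_t[4] hops) {
    apply {
        // Shift "left" by one: hops[1] moves to hops[0], and so on;
        // the last element, hops[3], becomes invalid.
        hops.pop_front(1);

        // Shift "right" by one: every element moves one index up, the
        // element that was last is discarded, and hops[0] becomes invalid.
        hops.push_front(1);

        // Populate the new front element explicitly.
        hops[0].setValid();
        hops[0].bos = 0;
        hops[0].port = 7;
    }
}
```

Note that push_front only makes room; the new front element stays invalid until it is made valid and assigned.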
A header union encodes a disjoint alternative between two headers:
header_union l3 {
    ipv4_t ipv4;
    ipv6_t ipv6;
}
At most one of the headers may be valid at run time. The components of a header union can be accessed and modified using standard “dot” notation.
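The interaction between a union and the validity of its members can be sketched with two simplified stand-in headers. The names mini_ipv4_t, mini_ipv6_t, l3_t, and ForceIpv4 are hypothetical, and the headers model only address fields rather than real IPv4 or IPv6 layouts.

```p4
// Simplified stand-ins (assumption: address fields only).
header mini_ipv4_t {
    bit<32> srcAddr;
    bit<32> dstAddr;
}

header mini_ipv6_t {
    bit<128> srcAddr;
    bit<128> dstAddr;
}

header_union l3_t {
    mini_ipv4_t ipv4;
    mini_ipv6_t ipv6;
}

// Hypothetical control: records which alternative was present, then
// switches the union to the IPv4 alternative.
control ForceIpv4(inout l3_t l3, out bool wasIpv6) {
    apply {
        wasIpv6 = l3.ipv6.isValid();

        // Making one member valid makes every other member invalid.
        l3.ipv4.setValid();
        l3.ipv4.srcAddr = 0;
        l3.ipv4.dstAddr = 0;
    }
}
```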
Other Types
P4 provides a number of other types including:
- Generics,
- Types for parsers, actions, tables, controls and other program elements, and
- Types for extern functions and objects.
See the language specification document for details.
Parsers
The first piece of a P4 program is usually a parser that maps the bits in the actual packet into typed representations. A typical parser might be declared as follows:
parser P(packet_in packet,
         out headers_t headers,
         inout meta_t meta,
         inout standard_metadata_t std_meta) {
    ...
}
Here the packet argument is an object that encapsulates the packet
being parsed. It has a generic method extract that can be used to
populate headers. The headers, meta, and std_meta arguments are
data structures representing the parsed headers, along with
program-specific and standard metadata. Typically the types of
headers and meta are structs defined by the programmer, while the
type of std_meta is a struct defined in the standard library.
The direction annotations out and inout indicate arguments that
are write-only and read-write respectively. There is also an in
annotation that indicates a read-only argument.
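All three directions can appear on the same declaration. The sketch below uses a hypothetical action decrement_ttl and a hypothetical control TtlDemo, and assumes a compiler that supports the saturating subtraction operator |-|.

```p4
// Hypothetical action illustrating the three direction annotations.
action decrement_ttl(in bit<8> amount,      // read-only inside the action
                     inout bit<8> ttl,      // read here, updated for the caller
                     out bool expired) {    // written here for the caller
    ttl = ttl |-| amount;                   // saturating subtraction stops at 0
    expired = (ttl == 0);
}

// Hypothetical control that invokes the action directly.
control TtlDemo(inout bit<8> ttl, out bool drop) {
    apply {
        decrement_ttl(8w1, ttl, drop);
    }
}
```

Assigning to amount inside the action would be rejected by the compiler, because an in argument cannot be written.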
Internally, a parser describes a state machine in which each state may:
- extract bits out of the packet header,
- branch on the values of data, and
- transition to the next state.
For example, the following parser recognizes Ethernet and IPv4 packets.
state start {
    transition parse_ethernet;
}
state parse_ethernet {
    packet.extract(headers.ethernet);
    transition select(headers.ethernet.etherType) {
        0x800 : parse_ipv4;
        default : accept;
    }
}
state parse_ipv4 {
    packet.extract(headers.ipv4);
    transition accept;
}
Parsers have several special states including the initial state
(start), which must be explicitly defined, as well as accepting
(accept) and rejecting (reject) final states.
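The reject state is usually reached implicitly, when something goes wrong during parsing. The following sketch, with the hypothetical parser name StrictEthernet, shows two such cases using the extract method and the verify function from the core library; it assumes the standard error names declared in core.p4.

```p4
#include <core.p4>

header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

// Hypothetical parser that accepts only Ethernet frames carrying IPv4.
parser StrictEthernet(packet_in packet, out ethernet_t ethernet) {
    state start {
        // If fewer than 112 bits remain in the packet, extract signals
        // error.PacketTooShort and the parser moves to reject.
        packet.extract(ethernet);

        // verify moves to reject with the given error when the
        // condition is false.
        verify(ethernet.etherType == 0x0800, error.NoMatch);
        transition accept;
    }
}
```

A state can also name the rejecting state directly with transition reject, for example in the default case of a select.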
Example
Next, let us see how to parse packets with variable structure. We will work with a simple example involving source routing: each packet is either a standard IPv4 packet or a source-routed packet.
First, we will define types to represent Ethernet headers
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}
and source routing headers,
header srcRoute_t {
    bit<1> bos;
    bit<15> port;
}
Intuitively, the port field encodes the port that the packet should
be forwarded out on, while the bos field is a bottom-of-stack
marker that is set to 1 on the last element of the stack.
We also define a struct to represent all headers:
struct headers {
    ethernet_t ethernet;
    srcRoute_t[8] srcRoutes;
    ipv4_t ipv4;
}
Note that srcRoutes is a stack containing up to 8 elements.
With these type definitions, we can define the parser itself:
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t standard_metadata) {
    state start {
        transition parse_ethernet;
    }
    state parse_ethernet {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            0x1234: parse_srcRouting;
            default: accept;
        }
    }
    state parse_srcRouting {
        packet.extract(hdr.srcRoutes.next);
        transition select(hdr.srcRoutes.last.bos) {
            1: parse_ipv4;
            default: parse_srcRouting;
        }
    }
    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
}
The most interesting state is parse_srcRouting, which repeatedly
extracts the next element of the hdr.srcRoutes stack until it finds an
element with bos set to 1. If the stack runs out of space first, the
extract call fails with error.StackOutOfBounds and the parser
transitions to reject.
Discussion
- What kinds of errors can arise in a parser?
- Are there packet formats that would be difficult to express in P4?
Reading
For a full description of the P4 language, you may consult the language specification document.