Introduction to P4
Thus far in the course, we’ve seen some simple examples of network algorithms and developed a basic model of SDN.
In this lecture, we’ll introduce P4, a domain-specific language for specifying the behavior of programmable data planes. Unlike models of SDN such as OpenFlow, P4 does not presuppose any protocol-specific functionality, such as Ethernet or IP. Instead, the packet-processing functionality, including the packet parsers and the structure of the pipeline of match-action tables, is defined by the P4 program.
At a high level, P4 is based on standard programming constructs (types, variables, assignment, conditionals, etc.) as well as network-specific packet-processing constructs (parsers, match-action tables, etc.). In this lecture, we’ll focus on the P4 type system and its features for specifying packet parsers. Other features will be introduced in subsequent lectures.
Types
P4 is a statically typed language. Every component of a P4 program has a type that is checked at compile time, and programs that are ill-typed are rejected by the compiler.
Primitive Types
Because packet-processing often involves manipulating bits in packet headers, P4 provides a rich collection of types for describing various kinds of packet data including:
- bit<N>: unsigned integers of width N,
- int<N>: signed integers of width N, and
- int: arbitrary-precision, signed integers.
Integer literals can be written in binary (0b), octal (0o), decimal,
or hex (0x) notation. Programmers may also optionally specify the
width of an integer literal: for example, 8w0xF specifies the encoding
of 15 as an 8-bit unsigned integer. The type int is an internal type
used by the compiler for integer literals; it cannot be written
directly by the programmer.
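As a rough illustration of these notations, the declarations below write the value 15 in each base. The constant names are made up for this sketch. The `s` prefix for signed literals is part of P4_16 but is not otherwise covered in these notes.

```p4
// Hypothetical constants; only the literal syntax matters here.
const bit<8>  FIFTEEN_HEX = 8w0xF;     // 8-bit unsigned, hexadecimal
const bit<8>  FIFTEEN_BIN = 8w0b1111;  // 8-bit unsigned, binary
const bit<8>  FIFTEEN_OCT = 8w0o17;    // 8-bit unsigned, octal
const bit<8>  FIFTEEN_DEC = 8w15;      // 8-bit unsigned, decimal
const int<8>  MINUS_ONE   = -8s1;      // 's' marks an 8-bit signed literal
const bit<16> TYPE_IPV4   = 0x0800;    // no width given; it comes from the declared type
```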
Signed operations are carried out using two's-complement arithmetic, and most operations truncate the result in the case of arithmetic overflow.
To convert a value from one type to another, P4 provides casts between
different primitive types: e.g., (bit<4>) 8w0xF produces the encoding
of 15 as a 4-bit unsigned integer.
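The following sketch shows a narrowing cast, a cast between unsigned and signed types of the same width, and truncation on overflow. All names are hypothetical and chosen only for illustration.

```p4
const bit<8> WIDE   = 8w0xAB;
const bit<4> NARROW = (bit<4>) WIDE;    // keeps the low-order 4 bits: 0xB
const int<8> SIGNED = (int<8>) 8w0xFF;  // same bit pattern, read as -1

// Hypothetical helper showing truncation on overflow.
bit<8> increment(in bit<8> x) {
    return x + 1;   // when x is 255 the result wraps around to 0
}
```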
Header Types
Packets typically comprise a sequence of headers, each of which is a sequence of fields. For example, an Ethernet packet has the following structure:
+-------------+-------------+------+-------------
| Destination | Source | Type | Payload ...
+-------------+-------------+------+-------------
The destination and source addresses are 48 bits each, while the type field is 16 bits. The payload is simply the “rest” of the packet and has a variable format depending on the type field.
P4 provides a built-in type for representing headers, using syntax
that resembles C structs:
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}
Each component of a header can be accessed using standard “dot”
notation: e.g., if a variable ethernet has type ethernet_t, then
ethernet.dstAddr denotes the destination address.
A header value can be in one of two states, valid or invalid, and is
initially invalid. Reading a field of an invalid header produces an
undefined value. The validity of a header can be tested using
isValid() and changed using setValid() and setInvalid(). A header also
becomes valid when it is extracted in the parser (as explained below).
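A minimal sketch of the validity operations follows. It places them in a control block, a construct covered in later lectures. The control name and the values written are assumptions made for this example.

```p4
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

// Hypothetical control block, used only to show the validity operations.
control ValidityDemo(inout ethernet_t ethernet) {
    apply {
        if (!ethernet.isValid()) {
            // Mark the header valid before writing its fields;
            // the fields are undefined until they are assigned.
            ethernet.setValid();
            ethernet.dstAddr = 0;
            ethernet.srcAddr = 0;
            ethernet.etherType = 0x0800;
        }
    }
}
```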
Typedefs and Structs
To support giving convenient names to commonly used types, P4 provides type definitions:
typedef bit<48> macAddr_t;
With this declaration, the types bit<48> and macAddr_t are synonyms
that are treated as equivalent by the type checker.
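For instance, the Ethernet header could be rewritten with the synonym. The two constants at the end are hypothetical and show that a macAddr_t value can be used where a bit<48> is expected.

```p4
typedef bit<48> macAddr_t;

header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16>   etherType;
}

// Hypothetical constants: the checker treats both types as the same.
const macAddr_t BROADCAST      = 0xFFFFFFFFFFFF;
const bit<48>   ALSO_BROADCAST = BROADCAST;
```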
P4 also provides standard C-style structs, which are defined as follows:
struct headers_t {
    ethernet_t ethernet;
    ipv4_t     ipv4;
}
Unlike a header, a struct does not have a built-in notion of validity
and does not imply any ordering between fields.
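The headers_t struct refers to an ipv4_t type that these notes do not define. One plausible definition is sketched here, assuming a standard IPv4 header without options. The field names follow a common convention but are an assumption, not something fixed by the language.

```p4
// Assumed layout: the 20-byte IPv4 header, options not represented.
header ipv4_t {
    bit<4>  version;
    bit<4>  ihl;
    bit<8>  diffserv;
    bit<16> totalLen;
    bit<16> identification;
    bit<3>  flags;
    bit<13> fragOffset;
    bit<8>  ttl;
    bit<8>  protocol;
    bit<16> hdrChecksum;
    bit<32> srcAddr;
    bit<32> dstAddr;
}
```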
Header Stacks and Unions
P4 provides derived types for header stacks and unions. A header
stack is similar to an array, but supports additional operations that
can be used when parsing packets. If H is a header type, then the type
H[N] denotes a header stack type, where N must be a positive integer
known at compile time. If stack is an expression whose type is a
header stack then:
- stack[i]: denotes the header at index i,
- stack.size: denotes the size of the header stack,
- stack.push_front(n): shifts stack "right" by n, making the n entries at the front of the stack invalid, and
- stack.pop_front(n): shifts stack "left" by n, making the n elements at the end of the stack invalid.
In addition, within a parser (described below), the following expressions may be used:
- stack.next: denotes the next element of the header stack that has not been populated using a call to extract(...), which is explained below, and
- stack.last: is a reference to the last element of the header stack that was previously populated using a call to extract(...).
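To make the shifting operations concrete, here is a small sketch that pushes a new entry onto the front of a stack. The header label_t and the control PushDemo are hypothetical, and control blocks themselves are covered in later lectures.

```p4
// Hypothetical one-field header used as the stack element.
header label_t {
    bit<8> value;
}

control PushDemo(inout label_t[4] labels, in bit<8> new_value) {
    apply {
        labels.push_front(1);         // shift right; labels[0] is now invalid
        labels[0].setValid();         // claim the freed slot
        labels[0].value = new_value;  // and fill it in
    }
}
```

Because the stack has a fixed size, the entry that was at the last index before the shift is discarded.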
A header union encodes a disjoint alternative between two or more headers:
header_union l3 {
    ipv4_t ipv4;
    ipv6_t ipv6;
}
At most one of the headers may be valid at run-time. The components of a header union can be accessed and modified using standard “dot” notation.
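The sketch below shows one way code might branch on which member of a union is valid. To keep it short, it uses deliberately simplified stand-in headers rather than full IPv4 and IPv6 definitions. Every name in it is hypothetical.

```p4
// Deliberately simplified stand-ins for real IPv4 and IPv6 headers.
header mini_ipv4_t { bit<8> ttl; }
header mini_ipv6_t { bit<8> hopLimit; }

header_union l3_t {
    mini_ipv4_t ipv4;
    mini_ipv6_t ipv6;
}

// Hypothetical control that reads whichever member is valid.
control HopsLeft(in l3_t l3, out bit<8> hops) {
    apply {
        if (l3.ipv4.isValid()) {
            hops = l3.ipv4.ttl;
        } else if (l3.ipv6.isValid()) {
            hops = l3.ipv6.hopLimit;
        } else {
            hops = 0;   // neither member is valid
        }
    }
}
```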
Other Types
P4 provides a number of other types including:
- Generics,
- Types for parsers, actions, tables, controls, and other program elements, and
- Types for extern functions and objects.
See the language specification document for details.
Parsers
The first piece of a P4 program is usually a parser that maps the bits in the actual packet into typed representations. A typical parser might be declared as follows:
parser P(packet_in packet,
         out headers_t headers,
         inout meta_t meta,
         inout standard_metadata_t std_meta) {
    ...
}
Here the packet argument is an object that encapsulates the packet
being parsed. It has a generic method extract that can be used to
populate headers. The headers, meta, and std_meta arguments are data
structures representing the parsed headers, along with
program-specific and standard metadata. Typically the types of headers
and meta are structs defined by the programmer, while the type of
std_meta is a struct provided by the library for the target
architecture (for example, v1model.p4).
The direction annotations out and inout indicate arguments that are
write-only and read-write respectively. There is also an in annotation
that indicates a read-only argument.
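The three directions can be seen side by side in a small sketch. The action below is hypothetical and exists only to show how each kind of parameter is used.

```p4
// Hypothetical action illustrating the three parameter directions.
action decrement_ttl(in bit<8> old_ttl,        // read-only input
                     out bit<8> new_ttl,       // result written by the action
                     inout bit<16> counter) {  // read, then updated in place
    new_ttl = old_ttl - 1;
    counter = counter + 1;
}
```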
Internally, a parser describes a state machine in which each state may:
- extract bits out of the packet header,
- branch on the values of data, and
- transition to the next state.
For example, the following parser recognizes Ethernet and IPv4 packets.
state start {
    transition parse_ethernet;
}
state parse_ethernet {
    packet.extract(headers.ethernet);
    transition select(headers.ethernet.etherType) {
        0x800 : parse_ipv4;
        default : accept;
    }
}
state parse_ipv4 {
    packet.extract(headers.ipv4);
    transition accept;
}
Parsers have several special states including the initial state
(start), which must be explicitly defined, as well as accepting
(accept) and rejecting (reject) final states.
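As a sketch of how the final states might be used, the parser below accepts only IPv4 packets and explicitly rejects everything else. The parser name and the single-field headers_t struct are assumptions made to keep the example self-contained.

```p4
#include <core.p4>

header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

// Hypothetical struct holding just the one header parsed here.
struct headers_t {
    ethernet_t ethernet;
}

parser StrictParser(packet_in packet, out headers_t headers) {
    state start {
        packet.extract(headers.ethernet);
        transition select(headers.ethernet.etherType) {
            0x0800 : accept;   // IPv4: parsing succeeds
            default : reject;  // anything else: parsing fails
        }
    }
}
```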
Example
Next, let us see how to parse packets with variable structure. We will work with a simple example involving source routing: each packet is either a standard IPv4 packet or a source-routed packet.
First, we will define types to represent Ethernet headers
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}
and source routing headers,
header srcRoute_t {
    bit<1>  bos;
    bit<15> port;
}
Intuitively, the port field encodes the port that the packet should be
forwarded out on, while the bos field is a bottom-of-stack marker that
is set to 1 on the last element of the stack.
We also define a struct to represent all headers:
struct headers {
    ethernet_t    ethernet;
    srcRoute_t[8] srcRoutes;
    ipv4_t        ipv4;
}
Note that srcRoutes is a stack containing up to 8 elements.
With these type definitions, we can define the parser itself:
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t standard_metadata) {
    state start {
        transition parse_ethernet;
    }
    state parse_ethernet {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            0x1234: parse_srcRouting;
            default: accept;
        }
    }
    state parse_srcRouting {
        packet.extract(hdr.srcRoutes.next);
        transition select(hdr.srcRoutes.last.bos) {
            1: parse_ipv4;
            default: parse_srcRouting;
        }
    }
    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
}
The most interesting state is parse_srcRouting, which repeatedly
extracts the next element of the hdr.srcRoutes stack until it either
runs out of space in the stack or finds an element with bos set to 1.
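Once the parser has filled the stack, later stages of the program could consume it one entry at a time. The following is only a sketch of that idea. The control name and the 9-bit egress_port parameter are assumptions standing in for target-specific metadata, and control blocks are covered in later lectures.

```p4
header srcRoute_t {
    bit<1>  bos;
    bit<15> port;
}

// Hypothetical control; egress_port stands in for target-specific metadata.
control SrcRouteHop(inout srcRoute_t[8] srcRoutes, out bit<9> egress_port) {
    apply {
        egress_port = 0;
        if (srcRoutes[0].isValid()) {
            egress_port = (bit<9>) srcRoutes[0].port;  // keep the low 9 bits
            srcRoutes.pop_front(1);                    // drop the consumed entry
        }
    }
}
```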
Discussion
- What kinds of errors can arise in a parser?
- Are there packet formats that would be difficult to express in P4?
Reading
For a full description of the P4 language, you may consult the language specification document.