Homework 4
Changelog
- v1: November 5, 2019
- v2: November 11, 2019: clarified
probe
header that should be returned in exercise 3.
Overview
In this homework assignment, you will implement several space-efficient network monitoring schemes in P4.
Due Date
- 11:59pm, November 14, 2018
Academic Integrity
This assignment must be completed individually. All work you submit
must be your own and sharing or receiving code is forbidden, with the
exception of your partner. Do not look for or submit code you find on
the Internet, and do not post solutions or partial solutions on the
discussion site. If you make use of any outside materials, you must
give attribution. You may ask general questions about the development
environment, p4c
, bmv2
, Mininet, etc., and you may discuss
high-level details of the exercises with your classmates. If you have
any questions about what is allowed and what is not allowed, please
ask the instructor first!
Starter Code
You can download starter code for this assignment as a zipfile from CMS.
Exercise 1: Warmup
To familiarize yourself with the starter code, please take a look at the
files in the homework04
directory including:
-
The
topology.json
JSON file, which describes the topology of the network. For this assignment we will use a single switch connected to eight hosts with addresses10.0.1.1
,10.0.1.2
, and so on through10.0.1.8
. -
The code in
switch.p4
, which implements a basic source routing scheme, similar to what we’ve seen in previous assignments. -
The
send.py
receive.py
, andprobe.py
scripts, which can be used to test your solutions.
Note that unlike previous assignments, we do not need a controller,
as switch.p4
is based on source routing and has no match-action
tables.
Next, let’s test that the starter code works:
% make
...
mininet> xterm h1 h2
Then execute ./receive.py
in the terminal for h1
and ./send.py
10.0.1.1 "Hello World" in the terminal for
h2. You should see that
the packet is successfully sent from
h2 to
h1`.
Finally, let’s add a simple counter to tabulate the total number of
bytes sent to each switch. The v1model.p4
architecture includes the
following declarations:
enum CounterType {
packets,
bytes,
packets_and_bytes
}
extern counter {
counter(bit<32> size, CounterType type);
void count(in bit<32> index);
}
To complete this exercise, simply instantiate a counter in the
MyIngress
control, and execute its count
method on every packet.
To read the value of the counter, you can use the simple_switch_CLI
command-line utility on switch s1
. For example, if your program has
a counter c
, then the command counter_read MyIngress.c
will print
its value to standard output.
To submit: Rename your solution counter.p4
and submit on CMS.
Exercise 2: Efficient Flow Monitoring
In this exercise, you will develop a more sophisticated monitoring
program in P4 that tracks traffic statistics at per-flow
granularity. For the purposes of this exercise, you may assume that
flows are identified by an IP source-destination pair. Note that even
with this simplification, the number of flows grows still
quadratically with the number of hosts in the network – e.g., even
if we assumed that all hosts have addresses of the form 10.0.1.*
there would be 65,025 possible flows! Hence, it would be quite
expensive to allocate a counter
object for every possible flow, and
infeasible in a larger network.
Instead, we will develop an approximate solution based on the
count-min sketch data structure we saw in lecture. You can implement
a sketch using the hash functions and register
objects provided the
V1Model architecture.
To answer frequency queries about a particular flow, we will use
special probe packets, which have an Ethernet type of 0x9999
followed by an instance of the following header type:
header probe_t {
bit<32> srcAddr;
bit<32> dstAddr;
bit<16> result;
}
Upon receipt of a packet containing probe header probe
, your program
should estimate the cumulative size of all packets in the flow from
probe.srcAddr
to probe.dstAddr
and then return the answer by
storing the answer in probe.result
and forwarding the packet back to
the host that issued the query. We have provided a script probe.py
that will send and print probe packets.
Implementation Steps
To complete this exericse, you should extend the starter code in
switch.p4
as follows:
-
Add the
probe_t
header and extend theheaders
andparser
appropriately. Extend the source routing forwarding logic to send probe packets back to the host that sent the query. (Hint: usestandard_metadata.ingress_port
.) -
Develop data structures and operations that implement the functionality of a count-min sketch. For the purposes of this assignment, your sketch should use 3 hash functions (e.g.,
crc16
,csum16
andidentity
) and registers of size6
. -
To estimate packet sizes, you may rely on the
totalLen
field in the IPv4 header. You do not need to count the traffic generated by the probe packets in the count-min sketch. -
Complete your implementation by adding packet sizes to the sketch for non-probe packets and querying the sketch for probe packets.
Testing
To test that your solution is working correctly, you can use the
supplied send.py
, receive.py
, and probe.py
scripts.
To submit: Rename your solution sketch.p4
and submit on CMS.
Karma: Accuracy Experiment
Note: Karma exercises are suggested problems that are completely optional. Feel free to try them if you are so inclined, or skip them if you are busy!
Because the count-min sketch is an approximate data structure, it does
not always return the most accurate answer. Find an input (i.e., a
list of specific invocations of send.py
followed by probe.py
) that
cause your sketch to overestimate the traffic in a given flow. You may
increase the number of hosts if it helps.
To submit: Assemble your solution into a file collision.zip
and
submit on CMS.
Exercise 3: Heavy Hitters
In this exercise, we will develop a way to answer “heavy hitter” queries. For the purpose of this exercise, to keep things simple, we will identify the heaviest senders of traffic rather than the traffic generated between each source-destation pair.
The main challenge is that a count-min sketch does not actually keep track of the values that were used to populate the counts in the sketch. Unless we were to somehow hash the entire space of possible values, there is no easy way to recover those values.
Your task in this exercise is to add extra state to your sketch.p4
program so it can compute the heaviest hitters. For simplicity, you
only have to compute the single heaviest hitter (in practice,
operators are typically interested in knowing the k
heaviest
hitters.)
More specifically, upon receipt of a packet containing the probe your
program should estimate the cumulative size of all packets sent by the
heaviest hitter as well as the IP address of the sender. You should
store the cumulative size in the probe.result
field and the IP
address of the sender in the probe.srcAddr
field. The
probe.dstAddr
field should not be modified. To finish the job,
forward the packet back to the host that issued the query. You can use
the same probe.py
script as before to send and print probe packets
to the switch, filling in the command-line arguments for the source
and destination addresses with dummy values (e.g., 127.0.0.1
).
Following are two possible implementation strategies you could consider:
-
Easy: Add a register to keep track of the heavy hitter, and update it on every packet. Because P4 lacks loops, you will have to iterate through the sketches and this register manually. Then modify the
probe
header as described above. -
Sophisticated: Decompose the space of possible flows into a tree of disjoint blocks of IP addresses, and allocate a count-min sketch at each level whose keys correspond to sets of addresses rather than individual addresses.
The root of the tree should track all IP addresses and the leaves of the tree should track individual IP addresses. Then, for each node, use a count-min sketch to keep track of the traffic generated by hosts in each set. For example, for our node topology, the tree might look like this:
sketch1: {h1-h8}
/ \
/ \
sketch2: {h1-h4} {h5-h8}
/ \ / \
/ \ / \
sketch3: {h1-h2} {h3-h4} {h5-h6} {h7-h8}
/ \ / \ / \ / \
/ \ / \ / \ / \
sketch3: {h1} {h2} {h3} {h4} {h5} {h6} {h7} {h8}
To process traffic from a given host, say h2
, we would update all of
the sketches from root to leaf–e.g., we would update sketch1
using
the key {h1-h8}
, sketch2
using the key {h1-h4}
, sketch3
using the key {h1-h2}
, and sketch4
using the key {h2}
.
Then, to answer a heavy hitter query, we would traverse the tree from root to leaf, following the branch associated the highest estimate at each step.
For simplicity, because our network only has eight hosts, you can write P4 code to do this analysis. In reality, one would likely transfer the values of the sketches to a general-purpose machine that could perform the required analysis.
To submit: Rename your solution heavy_hitters.p4
and submit it
on CMS.
Debriefing
-
How many hours did you spend on this assignment?
-
Would you rate it as easy, moderate, or difficult?
-
If you worked with a partner, please briefly describe both of your contributions to the solution you submitted.
-
How deeply do you feel you understand the material it covers (0%-100%)?
-
If you have any other comments, I would like to hear them! Please write them down or send email to
jnfoster@cs.cornell.edu
To submit: debriefing.txt