SIGCOMM 2019 Tutorial Edits (#272)
* Updated the utils/run_exercise.py to allow exercises to customize host configuration from the topology.json file. Now hosts can `ping` each other in the basic exercise. Other Linux utilities should work as well (e.g. iperf).

```
mininet> h1 ping h2
PING 10.0.2.2 (10.0.2.2) 56(84) bytes of data.
64 bytes from 10.0.2.2: icmp_seq=1 ttl=62 time=3.11 ms
64 bytes from 10.0.2.2: icmp_seq=2 ttl=62 time=2.34 ms
64 bytes from 10.0.2.2: icmp_seq=3 ttl=62 time=2.15 ms
^C
--- 10.0.2.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 2.153/2.540/3.118/0.416 ms
mininet> pingall
*** Ping: testing ping reachability
h1 -> h2 h3
h2 -> h1 h3
h3 -> h1 h2
*** Results: 0% dropped (6/6 received)
```

Only updated basic exercise, still need to update other exercises. Also, updated the root-bootstrap.sh because I was running into issues with latest version of vagrant.
* Accidentally added the solution to the basic exercise in the previous commit. Undoing that here ...
* Updated the topology.json file and table entries for the basic_tunnel exercise.
* Updated P4Runtime exercise with new topology and table entries.
* Fixed MAC addresses in P4Runtime exercise. It is working now.
* Fixed MAC addresses in P4Runtime exercise starter code
* Updated ECN exercise to use new topology.json file. Updated the table entries / MAC addresses as well.
* Updated the topology.json file and table entries for the MRI exercise.
* Updated source_routing exercise with new topology file and verified correct functionality.
* Updated load_balance exercise with new topology.
* Moved basic exercise triangle topology into a separate folder
* Added new topology for the basic exercise: a single pod of a fat-tree.
* Updated Makefiles and run_exercise.py to allow exercises to configure each switch with a different P4 program. This is mainly for the firewall exercise.
* Updated Makefiles of project to work with new utils/Makefile
* Updated load_balance and p4runtime exercise Makefiles
* Initial commit of the firewall exercise, which is a simple stateful firewall that uses a bloom filter. Need to update README files
* Initial commit of the path_monitor exercise. It is working but still need to update the README and figure out what we want the tutorial attendees to implement.
* Updated README file in firewall exercise. Also removed the bits from the starter code that we want the tutorial attendees to implement
* Renamed path_monitor exercise to link_monitor
* Updated the README in the link_monitor exercise and removed the bits from the starter code that we want the tutorial attendees to implement.
* Updated README for the firewall exercise
* Adding pod-topo.png image to basic exercise
* Added firewall-topo.png image to firewall exercise
* Added link-monitor-topo.png to link_monitor exercise
* Updated README files to point to topology images
* Updated top-level README to point to new exercises.
* Fixed link for VM dependencies script in README
* Updated bmv2/pi/p4c commits
* Updated README files for exercises to fix some typos and added a note about the V1Model architecture.
* Added a note about food for thought in the link_monitor README
* Updated the firewall.p4 program to use two register arrays rather than a single one. This is to make the design more portable to high line rate devices which can only support a single access to each register array.
* Minor fix to firewall exercise to get rid of compiler warning.
* Updated comment in firewall exercise.
* Minor (typo) fixes in the firewall ReadMe
* More info in firewall exercise ReadMe step 2
* Updated firewall.p4 to reuse direction variable
* More testing steps, small fixes in firewall exercise Readme
* Added food for thought to firewall Readme
* Cosmetic fixes to firewall ReadMe
* Made a few updates to the basic exercise README and added more details to the link_monitor exercise README. Also added a command to install grip when provisioning the VM. This could be useful for rendering the markdown README files offline.
* Updated top level README so it can be merged into the master branch.
* Moved cmd to install grip from root-bootstrap to user-bootstrap
exercises/link_monitor/README.md
@@ -0,0 +1,227 @@
# Implementing Link Monitoring

## Introduction

The objective of this exercise is to write a P4 program that enables
a host to monitor the utilization of all links in the network. This
exercise builds upon the basic IPv4 forwarding exercise so be sure
to complete that one before attempting this one. Specifically, we
will modify the basic P4 program to process a source routed probe
packet such that it is able to pick up the egress link utilization
at each hop and deliver it to a host for monitoring purposes.

Our probe packet will contain the following three header types:
```
// Top-level probe header, indicates how many hops this probe
// packet has traversed so far.
header probe_t {
    bit<8> hop_cnt;
}

// The data added to the probe by each switch at each hop.
header probe_data_t {
    bit<1>    bos;
    bit<7>    swid;
    bit<8>    port;
    bit<32>   byte_cnt;
    time_t    last_time;
    time_t    cur_time;
}

// Indicates the egress port the switch should send this probe
// packet out of. There is one of these headers for each hop.
header probe_fwd_t {
    bit<8> egress_spec;
}
```
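
To make the wire layout of `probe_data_t` concrete, here is a minimal Python sketch that packs and unpacks one such header on the host side. It assumes `time_t` is a 48-bit value, which this README does not state, so adjust `TIME_BITS` to match the typedef in your program. The `ProbeData` class and its methods are illustrative names, not part of `send.py` or `receive.py`.

```python
from dataclasses import dataclass

# Assumption: time_t is a 48-bit timestamp. The README does not define it.
TIME_BITS = 48
TIME_BYTES = TIME_BITS // 8
# One byte for bos+swid, one for port, four for byte_cnt, then two timestamps.
PROBE_DATA_LEN = 1 + 1 + 4 + 2 * TIME_BYTES


@dataclass
class ProbeData:
    """Hypothetical host-side mirror of the probe_data_t header."""
    bos: int
    swid: int
    port: int
    byte_cnt: int
    last_time: int
    cur_time: int

    def pack(self) -> bytes:
        # P4 headers are laid out most-significant bit first, so bos is
        # the top bit of the first byte and swid fills the remaining 7 bits.
        first = ((self.bos & 0x1) << 7) | (self.swid & 0x7F)
        return (
            bytes([first, self.port & 0xFF])
            + self.byte_cnt.to_bytes(4, "big")
            + self.last_time.to_bytes(TIME_BYTES, "big")
            + self.cur_time.to_bytes(TIME_BYTES, "big")
        )

    @classmethod
    def unpack(cls, raw: bytes) -> "ProbeData":
        if len(raw) < PROBE_DATA_LEN:
            raise ValueError("truncated probe_data header")
        t0 = 6
        t1 = t0 + TIME_BYTES
        return cls(
            bos=raw[0] >> 7,
            swid=raw[0] & 0x7F,
            port=raw[1],
            byte_cnt=int.from_bytes(raw[2:6], "big"),
            last_time=int.from_bytes(raw[t0:t1], "big"),
            cur_time=int.from_bytes(raw[t1:t1 + TIME_BYTES], "big"),
        )
```

Under the 48-bit assumption each `probe_data_t` occupies 18 bytes, so a probe that has crossed `n` hops carries `18 * n` bytes of measurement data.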

We will use the pod-topology for this exercise, which consists of
four hosts connected to four switches that are wired up as they
would be in a single pod of a fat tree topology.

In order to monitor the link utilization our switch will maintain
two register arrays:
* `byte_cnt_reg` - counts the number of bytes transmitted out of
  each port since the last probe packet was transmitted out of
  the port.
* `last_time_reg` - stores the last time that a probe packet was
  transmitted out of each port.
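
The behavior of the two register arrays can be sketched as a small Python model. This is only an illustration of the bookkeeping described above, not the P4 code you will write: the `PortMonitor` class and its method names are made up for this example, and whether the probe's own bytes are included in the count is a design choice this sketch leaves out.

```python
class PortMonitor:
    """Toy model of byte_cnt_reg and last_time_reg for a single switch."""

    def __init__(self, num_ports: int):
        # One register cell per egress port, all initially zero.
        self.byte_cnt_reg = [0] * num_ports
        self.last_time_reg = [0] * num_ports

    def on_data_packet(self, port: int, length: int) -> None:
        # Every non-probe packet leaving the port adds to the byte count.
        self.byte_cnt_reg[port] += length

    def on_probe(self, port: int, now: int):
        # A probe reads the accumulated state, then resets the counter
        # and records its own departure time for the next probe.
        sample = (self.byte_cnt_reg[port], self.last_time_reg[port], now)
        self.byte_cnt_reg[port] = 0
        self.last_time_reg[port] = now
        return sample
```

Each sample is a `(byte_cnt, last_time, cur_time)` tuple, which is the information a host needs to estimate the utilization of that port between two consecutive probes.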

Our P4 program will be written for the V1Model architecture implemented
on P4.org's bmv2 software switch. The architecture file for the V1Model
can be found at: /usr/local/share/p4c/p4include/v1model.p4. This file
describes the interfaces of the P4 programmable elements in the architecture,
the supported externs, as well as the architecture's standard metadata
fields. We encourage you to take a look at it.

> **Spoiler alert:** There is a reference solution in the `solution`
> sub-directory. Feel free to compare your implementation to the
> reference.

## Step 1: Run the (incomplete) starter code

The directory with this README contains a skeleton P4 program,
`link_monitor.p4`, which implements basic IPv4 forwarding, as well
as source routing of the probe packets. Your job will be to
extend this skeleton program to fill out the fields in the probe
packet.

Before that, let's compile and test the incomplete `link_monitor.p4`
program:

1. In your shell, run:
   ```bash
   make run
   ```
   This will:
   * compile `link_monitor.p4`, and
   * start the pod-topo in Mininet and configure all switches with
     the `link_monitor.p4` program + table entries, and
   * configure all hosts with the commands listed in
     [pod-topo/topology.json](./pod-topo/topology.json)

2. You should now see a Mininet command prompt. Open two terminals
   on `h1`:
   ```bash
   mininet> xterm h1 h1
   ```
3. In one of the xterms run the `send.py` script to start sending
   probe packets every second. Each of these probe packets takes the
   path indicated in link-monitor-topo.png.
   ```bash
   ./send.py
   ```
4. In the other terminal run the `receive.py` script to start
   receiving and parsing the probe packets. This allows us to monitor
   the link utilization within the network.
   ```bash
   ./receive.py
   ```
   The reported link utilization and the switch port numbers will
   always be 0 because the probe fields have not been filled out yet.

5. Run an iperf flow between h1 and h4:
   ```bash
   mininet> iperf h1 h4
   ```
6. Type `exit` to leave each xterm and the Mininet command line.
   Then, to stop Mininet:
   ```bash
   make stop
   ```
   And to delete all pcaps, build files, and logs:
   ```bash
   make clean
   ```

The measured link utilizations will not agree with what iperf reports
because the probe packet fields have not been populated yet. Your
goal is to fill out the probe packet fields so that the two
measurements agree.

### A note about the control plane

A P4 program defines a packet-processing pipeline, but the rules
within each table are inserted by the control plane. When a rule
matches a packet, its action is invoked with parameters supplied by
the control plane as part of the rule.

In this exercise, we have already implemented the control plane
logic for you. As part of bringing up the Mininet instance, the
`make run` command will install packet-processing rules in the tables of
each switch. These are defined in the `sX-runtime.json` files, where
`X` corresponds to the switch number.

**Important:** We use P4Runtime to install the control plane rules. The
contents of the `sX-runtime.json` files refer to specific names of tables, keys, and
actions, as defined in the P4Info file produced by the compiler (look for the
file `build/link_monitor.p4.p4info.txt` after executing `make run`). Any
changes in the P4 program that add or rename tables, keys, or actions
will need to be reflected in these `sX-runtime.json` files.
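
One way to catch a mismatch between a runtime file and the compiled program early is to compare names before starting Mininet. The Python sketch below is a hypothetical helper, not part of the exercise: it assumes the runtime file has a top-level `table_entries` list whose items carry `table` and `action_name` keys, which you should verify against your own `sX-runtime.json` files, and it expects you to supply the table and action names you read from the P4Info file.

```python
import json


def find_stale_entries(runtime_json: str, known_tables: set, known_actions: set) -> list:
    """Return (entry index, kind, name) for names missing from the program.

    Assumes a "table_entries" list with "table" and "action_name" keys;
    this layout is an assumption, so check it against your own files.
    """
    config = json.loads(runtime_json)
    problems = []
    for index, entry in enumerate(config.get("table_entries", [])):
        table = entry.get("table")
        action = entry.get("action_name")
        if table not in known_tables:
            problems.append((index, "table", table))
        if action is not None and action not in known_actions:
            problems.append((index, "action", action))
    return problems
```

An empty result means every entry refers to a name you listed. Anything else points at the entries to update after renaming a table or action.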

## Step 2: Implement Link Monitoring Logic

The `link_monitor.p4` file contains a skeleton P4 program with key pieces of
logic replaced by `TODO` comments. Your implementation should follow
the structure given in this file---replace each `TODO` with logic
implementing the missing piece.

Here are a few more details about the design:

**Parser**
* The parser has been extended to support parsing of the source routed probe packets.
  The parser is the most complicated part of the design so spend a bit of time
  reading over it. Note that it does not contain any TODO comments so there is
  nothing you need to change here.
* To parse the probe packets, we use the `hdr.probe.hop_cnt` to determine how many
  hops the packet has traversed prior to reaching the switch. If this is the first
  hop then there will not be any `probe_data` in the packet so we skip that state
  and transition directly to the `parse_probe_fwd` state. In the `parse_probe_fwd`
  state, we use the `hdr.probe.hop_cnt` field to figure out which `egress_spec`
  header field to use to perform forwarding and we save that port value into a
  metadata field which is subsequently used to perform forwarding.
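
The parsing steps described above can be modeled in a few lines of Python. This is a simplified sketch and not the parser from `link_monitor.p4`. It assumes each `probe_data` header is 18 bytes long (which follows from a 48-bit `time_t`, itself an assumption), that the `probe_data` stack ends at the entry whose `bos` bit is set, and that the `probe_fwd` entry at index `hop_cnt` is the one meant for this switch. The function name `parse_probe` is made up for this example.

```python
def parse_probe(payload: bytes, probe_data_len: int = 18):
    """Return (hop_cnt, number of probe_data headers, egress_spec).

    payload starts at the probe header, i.e. right after the headers
    that precede it on the wire.
    """
    if not payload:
        raise ValueError("empty probe payload")
    hop_cnt = payload[0]
    offset = 1
    num_data = 0
    if hop_cnt != 0:
        # Not the first hop: walk the probe_data stack until bottom of stack.
        bos = 0
        while bos == 0:
            if offset + probe_data_len > len(payload):
                raise ValueError("ran out of bytes before bottom of stack")
            bos = payload[offset] >> 7
            offset += probe_data_len
            num_data += 1
    # probe_fwd headers are one byte each; pick the one for this hop.
    fwd_index = offset + hop_cnt
    if fwd_index >= len(payload):
        raise ValueError("no probe_fwd header for this hop")
    return hop_cnt, num_data, payload[fwd_index]
```

For a fresh probe with `hop_cnt` equal to 0, the function skips the `probe_data` loop entirely and returns the first `egress_spec`, mirroring the direct transition to `parse_probe_fwd`.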

**Ingress Control**
* The ingress control block looks very similar to the `basic` exercise. The only
  difference is that the `apply` block contains another condition to forward probe
  packets using the `egress_spec` field extracted by the parser. It also increments
  the `hdr.probe.hop_cnt` field.

**Egress Control**
* This is where the interesting stateful processing occurs. It uses the
  `byte_cnt_reg` register to count the number of bytes that have passed through each
  port since the last probe packet passed through the port.
* It adds a new `probe_data` header to the packet and fills out the `bos`
  (bottom of stack) field, as well as the `swid` (switch ID) field.
* TODO: your job is to fill out the rest of the probe packet fields in order to
  ensure that you can properly measure link utilization.

**Deparser**
* Simply emits all headers in the correct order.
* Note that emitting a header stack will only emit the headers within the stack
  that are actually marked as valid.

## Step 3: Run your solution

Follow the instructions from Step 1. This time, the measured link
utilizations should agree with what `iperf` reports.
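
To compare the two measurements by hand, the fields of one `probe_data` header can be turned into a rate. The Python sketch below assumes the timestamps are in microseconds, so bits per microsecond equals Mbps. Confirm the timestamp unit for your target before relying on the numbers. The function name is illustrative and not taken from `receive.py`.

```python
def link_utilization_mbps(byte_cnt: int, last_time: int, cur_time: int) -> float:
    """Estimate egress link utilization in Mbps from one probe_data sample.

    Assumes last_time and cur_time are microsecond timestamps.
    """
    elapsed_us = cur_time - last_time
    if elapsed_us <= 0:
        # No time has elapsed (or the fields are still unset): nothing to report.
        return 0.0
    # 8 bits per byte; bits per microsecond is the same as megabits per second.
    return 8.0 * byte_cnt / elapsed_us
```

For example, 125000 bytes sent over one second (1000000 microseconds) corresponds to 1 Mbps under these assumptions.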

### Troubleshooting

There are several problems that might manifest as you develop your program:

1. `link_monitor.p4` might fail to compile. In this case, `make run` will
   report the error emitted from the compiler and halt.

2. `link_monitor.p4` might compile but fail to support the control plane
   rules in the `s1-runtime.json` through `s4-runtime.json` files that
   `make run` tries to install using P4Runtime. In this case, `make run` will
   report errors if control plane rules cannot be installed. Use these error
   messages to fix your `link_monitor.p4` implementation.

3. `link_monitor.p4` might compile, and the control plane rules might be
   installed, but the switch might not process packets in the desired
   way. The `logs/sX.log` files contain detailed logs describing
   how each switch processes each packet. The output is detailed and can
   help pinpoint logic errors in your implementation.

#### Cleaning up Mininet

In the latter two cases above, `make run` may leave a Mininet instance
running in the background. Use the following command to clean up
these instances:

```bash
make stop
```

### Food For Thought

Now that you've implemented this basic monitoring framework, can you
think of ways to leverage this information about link utilization
within the core of the network? For instance, how might you use this
data, either at the hosts or at the switches, to make real-time
load-balancing decisions?