One of the main purposes of <code>ovn-northd</code> is to populate the
<code>Logical_Flow</code> table in the <code>OVN_Southbound</code>
database. This section describes how <code>ovn-northd</code> does this
- for logical datapaths.
+ for switch and router logical datapaths.
</p>
- <h2>Ingress Table 0: Admission Control and Ingress Port Security</h2>
+ <h2>Logical Switch Datapaths</h2>
+
+ <h3>Ingress Table 0: Admission Control and Ingress Port Security</h3>
<p>
Ingress table 0 contains these logical flows:
be dropped.
</p>
- <h2>Ingress Table 1: <code>from-lport</code> Pre-ACLs</h2>
+ <h3>Ingress Table 1: <code>from-lport</code> Pre-ACLs</h3>
<p>
Ingress table 1 prepares flows for possible stateful ACL processing
the connection tracker before advancing to table 2.
</p>
- <h2>Ingress table 2: <code>from-lport</code> ACLs</h2>
+ <h3>Ingress Table 2: <code>from-lport</code> ACLs</h3>
<p>
Logical flows in this table closely reproduce those in the
</li>
</ul>
- <h2>Ingress Table 3: Destination Lookup</h2>
+ <h3>Ingress Table 3: ARP Responder</h3>
+
+ <p>
+ This table implements an ARP responder for known IPs. It contains these
+ logical flows:
+ </p>
+
+ <ul>
+ <li>
+ Priority-100 flows that skip the ARP responder if <code>inport</code> is
+ of type <code>localnet</code> and advance directly to table 4.
+ </li>
+
+ <li>
+ <p>
+ Priority-50 flows that match ARP requests to each known IP address
+ <var>A</var> of logical port <var>P</var> and respond directly with
+ ARP replies that carry the corresponding Ethernet address <var>E</var>:
+ </p>
+
+ <pre>
+eth.dst = eth.src;
+eth.src = <var>E</var>;
+arp.op = 2; /* ARP reply. */
+arp.tha = arp.sha;
+arp.sha = <var>E</var>;
+arp.tpa = arp.spa;
+arp.spa = <var>A</var>;
+outport = <var>P</var>;
+inport = ""; /* Allow sending out inport. */
+output;
+ </pre>
+
+ <p>
+ These flows are omitted for logical ports (other than router ports)
+ that are down.
+ </p>
+ </li>
+
+ <li>
+ One priority-0 fallback flow that matches all packets and advances to
+ table 4.
+ </li>
+ </ul>
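+ <p>
+ As a rough illustration only, the priority-50 actions above can be
+ modeled in Python. The <code>ArpPacket</code> class and
+ <code>arp_reply</code> function are hypothetical names invented for this
+ sketch; they are not part of OVN.
+ </p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpPacket:
    """Hypothetical model of the fields the ARP responder touches."""
    eth_src: str
    eth_dst: str
    op: int   # 1 = ARP request, 2 = ARP reply
    sha: str  # sender hardware address
    spa: str  # sender protocol address
    tha: str  # target hardware address
    tpa: str  # target protocol address


def arp_reply(request: ArpPacket, mac: str, ip: str) -> ArpPacket:
    """Build the reply the actions produce; `mac` is E and `ip` is A."""
    return ArpPacket(
        eth_dst=request.eth_src,  # eth.dst = eth.src;
        eth_src=mac,              # eth.src = E;
        op=2,                     # arp.op = 2;
        tha=request.sha,          # arp.tha = arp.sha;
        sha=mac,                  # arp.sha = E;
        tpa=request.spa,          # arp.tpa = arp.spa;
        spa=ip,                   # arp.spa = A;
    )
```

+ <p>
+ The sketch reads every field from the original request, which gives the
+ same result as applying the assignments in the listed order.
+ </p>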
+
+ <h3>Ingress Table 4: Destination Lookup</h3>
<p>
This table implements switching behavior. It contains these logical
</li>
</ul>
- <h2>Egress Table 0: <code>to-lport</code> Pre-ACLs</h2>
+ <h3>Egress Table 0: <code>to-lport</code> Pre-ACLs</h3>
<p>
This is similar to ingress table 1 except for <code>to-lport</code>
traffic.
</p>
- <h2>Egress Table 1: <code>to-lport</code> ACLs</h2>
+ <h3>Egress Table 1: <code>to-lport</code> ACLs</h3>
<p>
This is similar to ingress table 2 except for <code>to-lport</code> ACLs.
</p>
- <h2>Egress Table 2: Egress Port Security</h2>
+ <h3>Egress Table 2: Egress Port Security</h3>
<p>
This is similar to the ingress port security logic in ingress table 0,
<code>eth.src</code>. Second, packets directed to broadcast or multicast
<code>eth.dst</code> are always accepted instead of being subject to the
port security rules; this is implemented through a priority-100 flow that
- matches on <code>eth.dst[40]</code> with action <code>output;</code>.
+ matches on <code>eth.mcast</code> with action <code>output;</code>.
Finally, to ensure that even broadcast and multicast packets are not
delivered to disabled logical ports, a priority-150 flow for each
disabled logical <code>outport</code> overrides the priority-100 flow
with a <code>drop;</code> action.
</p>
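+ <p>
+ For readers unfamiliar with the multicast test, the following Python
+ sketch shows which bit is being examined, assuming that
+ <code>eth.mcast</code> corresponds to bit 40 of the 48-bit
+ <code>eth.dst</code> value. The helper name <code>is_eth_mcast</code> is
+ hypothetical.
+ </p>

```python
def is_eth_mcast(mac: str) -> bool:
    """Return True if the group (multicast) bit of a MAC address is set."""
    # Interpret the address as a 48-bit integer; bit 47 is the most
    # significant bit and bit 0 the least significant.
    value = int(mac.replace(":", ""), 16)
    # Bit 40 is the low-order bit of the first octet.
    return (value // 2**40) % 2 == 1
```

+ <p>
+ Broadcast (<code>ff:ff:ff:ff:ff:ff</code>) has this bit set, so a single
+ match covers both broadcast and multicast destinations.
+ </p>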
+
+ <h2>Logical Router Datapaths</h2>
+
+ <h3>Ingress Table 0: L2 Admission Control</h3>
+
+ <p>
+ This table drops packets that the router shouldn't see at all based on
+ their Ethernet headers. It contains the following flows:
+ </p>
+
+ <ul>
+ <li>
+ Priority-100 flows to drop packets with VLAN tags or multicast Ethernet
+ source addresses.
+ </li>
+
+ <li>
+ For each enabled router port <var>P</var> with Ethernet address
+ <var>E</var>, a priority-50 flow that matches <code>inport ==
+ <var>P</var> && (eth.mcast || eth.dst ==
+ <var>E</var>)</code>, with action <code>next;</code>.
+ </li>
+ </ul>
+
+ <p>
+ Other packets are implicitly dropped.
+ </p>
+
+ <h3>Ingress Table 1: IP Input</h3>
+
+ <p>
+ This table is the core of the logical router datapath functionality. It
+ contains the following flows to implement very basic IP host
+ functionality.
+ </p>
+
+ <ul>
+ <li>
+ <p>
+ L3 admission control: A priority-100 flow drops packets that match
+ any of the following:
+ </p>
+
+ <ul>
+ <li>
+ <code>ip4.src[28..31] == 0xe</code> (multicast source)
+ </li>
+ <li>
+ <code>ip4.src == 255.255.255.255</code> (broadcast source)
+ </li>
+ <li>
+ <code>ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8</code>
+ (localhost source or destination)
+ </li>
+ <li>
+ <code>ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8</code> (zero
+ network source or destination)
+ </li>
+ <li>
+ <code>ip4.src</code> is any IP address owned by the router.
+ </li>
+ <li>
+ <code>ip4.src</code> is the broadcast address of any IP network
+ known to the router.
+ </li>
+ </ul>
+ </li>
+
+ <li>
+ <p>
+ ICMP echo reply. These flows reply to ICMP echo requests received
+ for the router's IP address. Let <var>A</var> be an IP address or
+ broadcast address owned by a router port. Then, for each
+ <var>A</var>, a priority-90 flow matches on <code>ip4.dst ==
+ <var>A</var></code> and <code>icmp4.type == 8 && icmp4.code
+ == 0</code> (ICMP echo request). These flows use the following
+ actions where, if <var>A</var> is unicast, then <var>S</var> is
+ <var>A</var>, and if <var>A</var> is broadcast, <var>S</var> is the
+ router's IP address in <var>A</var>'s network:
+ </p>
+
+ <pre>
+ip4.dst = ip4.src;
+ip4.src = <var>S</var>;
+ip.ttl = 255;
+icmp4.type = 0;
+inport = ""; /* Allow sending out inport. */
+next;
+ </pre>
+
+ <p>
+ Similar flows match on <code>ip4.dst == 255.255.255.255</code> and
+ each individual <code>inport</code>, and use the same actions in
+ which <var>S</var> is a function of <code>inport</code>.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ ARP reply. These flows reply to ARP requests for the router's own IP
+ address. For each router port <var>P</var> that owns IP address
+ <var>A</var> and Ethernet address <var>E</var>, a priority-90 flow
+ matches <code>inport == <var>P</var> && arp.tpa ==
+ <var>A</var> && arp.op == 1</code> (ARP request) with the
+ following actions:
+ </p>
+
+ <pre>
+eth.dst = eth.src;
+eth.src = <var>E</var>;
+arp.op = 2; /* ARP reply. */
+arp.tha = arp.sha;
+arp.sha = <var>E</var>;
+arp.tpa = arp.spa;
+arp.spa = <var>A</var>;
+outport = <var>P</var>;
+inport = ""; /* Allow sending out inport. */
+output;
+ </pre>
+ </li>
+
+ <li>
+ <p>
+ UDP port unreachable. Priority-80 flows generate ICMP port
+ unreachable messages in reply to UDP datagrams directed to the
+ router's IP address. The logical router doesn't accept any UDP
+ traffic so it always generates such a reply.
+ </p>
+
+ <p>
+ These flows should not match IP fragments with nonzero offset.
+ </p>
+
+ <p>
+ Details TBD. Not yet implemented.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ TCP reset. Priority-80 flows generate TCP reset messages in reply to
+ TCP segments directed to the router's IP address. The logical
+ router doesn't accept any TCP traffic so it always generates such a
+ reply.
+ </p>
+
+ <p>
+ These flows should not match IP fragments with nonzero offset.
+ </p>
+
+ <p>
+ Details TBD. Not yet implemented.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ Protocol unreachable. Priority-70 flows generate ICMP protocol
+ unreachable messages in reply to packets directed to the router's IP
+ address on IP protocols other than UDP, TCP, and ICMP.
+ </p>
+
+ <p>
+ These flows should not match IP fragments with nonzero offset.
+ </p>
+
+ <p>
+ Details TBD. Not yet implemented.
+ </p>
+ </li>
+
+ <li>
+ Drop other IP traffic to this router. These flows drop any other
+ traffic destined to an IP address of this router that is not already
+ handled by one of the flows above, which amounts to ICMP (other than
+ echo requests) and fragments with nonzero offsets. For each IP address
+ <var>A</var> owned by the router, a priority-60 flow matches
+ <code>ip4.dst == <var>A</var></code> and drops the traffic.
+ </li>
+ </ul>
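+ <p>
+ The L3 admission control conditions listed at the start of this table
+ can be sketched as a single predicate. This is an illustrative model,
+ not the code <code>ovn-northd</code> uses; the function name
+ <code>l3_admission_drop</code> and its <code>router_interfaces</code>
+ parameter are assumptions made for the example.
+ </p>

```python
import ipaddress

LOOPBACK = ipaddress.IPv4Network("127.0.0.0/8")
ZERO_NET = ipaddress.IPv4Network("0.0.0.0/8")
BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def l3_admission_drop(src, dst, router_interfaces):
    """Return True if the priority-100 flow would drop the packet.

    router_interfaces is a hypothetical list of ipaddress.IPv4Interface
    values, one per address owned by the router (e.g. 192.168.1.1/24).
    """
    s = ipaddress.IPv4Address(src)
    d = ipaddress.IPv4Address(dst)
    if int(s) // 2**28 == 0xE:           # ip4.src[28..31] == 0xe
        return True
    if s == BROADCAST:                   # broadcast source
        return True
    if s in LOOPBACK or d in LOOPBACK:   # localhost source or destination
        return True
    if s in ZERO_NET or d in ZERO_NET:   # zero network source or destination
        return True
    for iface in router_interfaces:
        if s == iface.ip:                # source owned by the router
            return True
        if s == iface.network.broadcast_address:
            return True                  # source is a known network broadcast
    return False
```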
+
+ <p>
+ The flows above handle all of the traffic that might be directed to the
+ router itself. The following flows (with lower priorities) handle the
+ remaining traffic, potentially for forwarding:
+ </p>
+
+ <ul>
+ <li>
+ Drop Ethernet local broadcast. A priority-50 flow with match
+ <code>eth.bcast</code> drops traffic destined to the local Ethernet
+ broadcast address. By definition this traffic should not be forwarded.
+ </li>
+
+ <li>
+ Drop IP multicast. A priority-50 flow with match
+ <code>ip4.mcast</code> drops IP multicast traffic.
+ </li>
+
+ <li>
+ <p>
+ ICMP time exceeded. For each router port <var>P</var>, whose IP
+ address is <var>A</var>, a priority-40 flow with match <code>inport
+ == <var>P</var> && ip.ttl == {0, 1} &&
+ !ip.later_frag</code> matches packets whose TTL has expired, with the
+ following actions to send an ICMP time exceeded reply:
+ </p>
+
+ <pre>
+icmp4 {
+ icmp4.type = 11; /* Time exceeded. */
+ icmp4.code = 0; /* TTL exceeded in transit. */
+ ip4.dst = ip4.src;
+ ip4.src = <var>A</var>;
+ ip.ttl = 255;
+ next;
+};
+ </pre>
+
+ <p>
+ Not yet implemented.
+ </p>
+ </li>
+
+ <li>
+ TTL discard. A priority-30 flow with match <code>ip.ttl == {0,
+ 1}</code> and actions <code>drop;</code> drops other packets whose TTL
+ has expired and that should not receive an ICMP error reply (i.e.
+ fragments with nonzero offset).
+ </li>
+
+ <li>
+ Next table. A priority-0 flow matches all packets that aren't already
+ handled and uses actions <code>next;</code> to feed them to the ingress
+ table for routing.
+ </li>
+ </ul>
+
+ <h3>Ingress Table 2: IP Routing</h3>
+
+ <p>
+ A packet that arrives at this table is an IP packet that should be routed
+ to the address in <code>ip4.dst</code>. This table implements IP
+ routing, setting <code>reg0</code> to the next-hop IP address (leaving
+ <code>ip4.dst</code>, the packet's final destination, unchanged) and
+ advancing to the next table for ARP resolution.
+ </p>
+
+ <p>
+ This table contains the following logical flows:
+ </p>
+
+ <ul>
+ <li>
+ <p>
+ Routing table. For each route to IPv4 network <var>N</var> with
+ netmask <var>M</var>, a logical flow with match <code>ip4.dst ==
+ <var>N</var>/<var>M</var></code>, whose priority is the number of
+ 1-bits in <var>M</var>, has the following actions:
+ </p>
+
+ <pre>
+ip.ttl--;
+reg0 = <var>G</var>;
+next;
+ </pre>
+
+ <p>
+ (Ingress table 1 already verified that <code>ip.ttl--;</code> will
+ not yield a TTL exceeded error.)
+ </p>
+
+ <p>
+ If the route has a gateway, <var>G</var> is the gateway IP address,
+ otherwise it is <code>ip4.dst</code>.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ Destination unreachable. For each router port <var>P</var>, which
+ owns IP address <var>A</var>, a priority-0 logical flow with match
+ <code>inport == <var>P</var> && !ip.later_frag &&
+ !icmp</code> has the following actions:
+ </p>
+
+ <pre>
+icmp4 {
+ icmp4.type = 3; /* Destination unreachable. */
+ icmp4.code = 0; /* Network unreachable. */
+ ip4.dst = ip4.src;
+ ip4.src = <var>A</var>;
+ ip.ttl = 255;
+ next(2);
+};
+ </pre>
+
+ <p>
+ (The <code>!icmp</code> check prevents recursion if the destination
+ unreachable message itself cannot be routed.)
+ </p>
+
+ <p>
+ These flows are omitted if the logical router has a default route,
+ that is, a route with netmask 0.0.0.0.
+ </p>
+ </li>
+ </ul>
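+ <p>
+ Using the count of 1-bits in the netmask as the flow priority yields
+ longest-prefix matching, because the highest-priority matching flow wins.
+ The following Python sketch illustrates the idea under simplifying
+ assumptions; <code>build_flows</code> and <code>next_hop</code> are
+ hypothetical helpers, and routes are given as plain
+ <code>(cidr, gateway)</code> tuples.
+ </p>

```python
import ipaddress


def build_flows(routes):
    """Turn (cidr, gateway-or-None) routes into prioritized flows.

    Returns (priority, network, gateway) tuples ordered from highest to
    lowest priority, where priority is the number of 1-bits in the netmask.
    """
    flows = []
    for cidr, gateway in routes:
        net = ipaddress.IPv4Network(cidr)
        priority = bin(int(net.netmask)).count("1")
        flows.append((priority, net, gateway))
    return sorted(flows, key=lambda flow: flow[0], reverse=True)


def next_hop(flows, dst):
    """Return the value that would be stored in reg0, or None if no route."""
    address = ipaddress.IPv4Address(dst)
    for _priority, net, gateway in flows:
        if address in net:
            # With a gateway, G is the gateway; otherwise G is ip4.dst.
            return gateway if gateway is not None else dst
    return None  # no route matched; the destination unreachable flows apply
```

+ <p>
+ A default route has netmask 0.0.0.0 and therefore priority 0, so it is
+ consulted only when no more specific route matches.
+ </p>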
+
+ <h3>Ingress Table 3: ARP Resolution</h3>
+
+ <p>
+ Any packet that reaches this table is an IP packet whose next-hop IP
+ address is in <code>reg0</code>. (<code>ip4.dst</code> is the final
+ destination.) This table resolves the IP address in <code>reg0</code>
+ into an output port in <code>outport</code> and an Ethernet address in
+ <code>eth.dst</code>, using the following flows:
+ </p>
+
+ <ul>
+ <li>
+ <p>
+ Known MAC bindings. For each IP address <var>A</var> whose host is
+ known to have Ethernet address <var>HE</var> and reside on router
+ port <var>P</var> with Ethernet address <var>PE</var>, a priority-200
+ flow with match <code>reg0 == <var>A</var></code> has the following
+ actions:
+ </p>
+
+ <pre>
+eth.src = <var>PE</var>;
+eth.dst = <var>HE</var>;
+outport = <var>P</var>;
+output;
+ </pre>
+
+ <p>
+ MAC bindings can be known statically based on data in the
+ <code>OVN_Northbound</code> database. For router ports connected to
+ logical switches, MAC bindings can be known statically from the
+ <code>addresses</code> column in the <code>Logical_Port</code> table.
+ For router ports connected to other logical routers, MAC bindings can
+ be known statically from the <code>mac</code> and
+ <code>network</code> columns in the <code>Logical_Router_Port</code>
+ table.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ Unknown MAC bindings. For each non-gateway route to IPv4 network
+ <var>N</var> with netmask <var>M</var> on router port <var>P</var>
+ that owns IP address <var>A</var> and Ethernet address <var>E</var>,
+ a logical flow with match <code>ip4.dst ==
+ <var>N</var>/<var>M</var></code>, whose priority is the number of
+ 1-bits in <var>M</var>, has the following actions:
+ </p>
+
+ <pre>
+arp {
+ eth.dst = ff:ff:ff:ff:ff:ff;
+ eth.src = <var>E</var>;
+ arp.sha = <var>E</var>;
+ arp.tha = 00:00:00:00:00:00;
+ arp.spa = <var>A</var>;
+ arp.tpa = ip4.dst;
+ arp.op = 1; /* ARP request. */
+ outport = <var>P</var>;
+ output;
+};
+ </pre>
+
+ <p>
+ TBD: How to install MAC bindings when an ARP response comes back.
+ (Implement a "learn" action?)
+ </p>
+
+ <p>
+ Not yet implemented.
+ </p>
+ </li>
+ </ul>
+
+ <h3>Egress Table 0: Delivery</h3>
+
+ <p>
+ Packets that reach this table are ready for delivery. It contains
+ priority-100 logical flows that match packets on each enabled logical
+ router port, with action <code>output;</code>.
+ </p>
+
</manpage>