cascardo/ovs.git
10 years agoSet release date for 1.11.0. v1.11.0
Justin Pettit [Wed, 28 Aug 2013 21:32:27 +0000 (14:32 -0700)]
Set release date for 1.11.0.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif: Fix memory leak in handle_flow_miss().
Ben Pfaff [Tue, 20 Aug 2013 14:15:39 +0000 (07:15 -0700)]
ofproto-dpif: Fix memory leak in handle_flow_miss().

Every xlate_actions() needs a corresponding xlate_out_uninit(), but the
call in handle_flow_miss() lacked one.  struct xlate_out has a built-in
256-byte actions stub, so the bug only showed up for lots of actions.

Bug #19198.
Reported-by: Ronald Lee <ronaldlee@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif: Destroy bundle after moving its last port out.
Ben Pfaff [Wed, 14 Aug 2013 00:44:14 +0000 (17:44 -0700)]
ofproto-dpif: Destroy bundle after moving its last port out.

When the ofp_port argument to bundle_add_port() refers to an ofport_dpif
that already belongs to some other bundle, bundle_add_port() removed
the port from the other bundle, correctly, with bundle_del_port().
If the other bundle now contained no ports, however, this violated the
invariant that a bundle always contains at least one port.

Normally, this would get fixed up when the other bundle was processed
later during reconfiguration.  I haven't quite zeroed in on the exact
case where this is not true, but segfaults have happened here in
production, in particular when port adds and deletes happen simultaneously
and the new port reuses the OpenFlow port number of one of the deleted
ports.  It seems that the duplicate port number allows some port to rip
away the new port from its bundle without destroying that bundle.  I
suspect, therefore, that there is still a more subtle bug here, but I
hope that this will fix the segfault.

Bug #18967.
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodebian: Fix build with old versions of dpkg-buildflags.
Ben Pfaff [Tue, 13 Aug 2013 19:54:35 +0000 (12:54 -0700)]
debian: Fix build with old versions of dpkg-buildflags.

dpkg-buildflags has not always supported --export=configure, but commit
6c2d4c8780 (debian: Apply hardening options to build.) used it
unconditionally, causing the build to fail on old Debian distributions.
This fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-ofctl: Avoid groff warning due to too-long line.
Ben Pfaff [Mon, 12 Aug 2013 22:11:35 +0000 (15:11 -0700)]
ovs-ofctl: Avoid groff warning due to too-long line.

Avoids these warnings from groff:

<standard input>:1037: warning [p 14, 6.0i]: cannot adjust line
<standard input>:1037: warning [p 14, 6.2i]: can't break line

Found by lintian.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodebian: Apply hardening options to build.
Ben Pfaff [Mon, 12 Aug 2013 22:10:39 +0000 (15:10 -0700)]
debian: Apply hardening options to build.

Debian now encourages building every program with various GCC hardening
options.  This commit implements that recommendation for Open vSwitch.

See https://wiki.debian.org/Hardening for details.

Found by lintian.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: Always export the priority and skb_mark netlink attributes.
Andy Zhou [Sat, 3 Aug 2013 19:23:15 +0000 (12:23 -0700)]
odp-util: Always export the priority and skb_mark netlink attributes.

The current Netlink protocol allows a default value of zero if either mark
or priority is not specified (this is part of the ABI).  Until now, when
userspace serializes either the value or mask, it looked at the value and
omitted the netlink attribute if it is zero.  This is a bug because an
exact match on zero turns into a wildcard of the field.

These two fields (plus input port and EtherType) are special because they
can be omitted whereas most other values are required to be fully
specified.  These protocol variations tend to cause bugs (as above) when we
evolve the protocol because an exception that makes sense in one context
might not be logical in another.  Since the default value for mark and
priority are merely shorthands, we can push the protocol in a more
consistent direction by ignoring the shortcut and always serializing the
values.  This is what this commits does.

Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com added Jesse's text to the commit message]
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: avoid losing track of kernel flows upon reinstallation
Andy Zhou [Sat, 3 Aug 2013 03:22:17 +0000 (20:22 -0700)]
ofproto-dpif: avoid losing track of kernel flows upon reinstallation

This commit fixes a problem whereby userspace can lose track of a
flow installed in the kernel, instead believing that the flow is
not installed.  The most visible consequence of this bug was a
message in the ovs-vswitchd log warning about an unexpected flow
in the kernel.  Other possible consequences included loss of
statistics and failure to updates actions when the OpenFlow flow
table changed.

The problem arose in the following scenario.  Suppose userspace
sets up a kernel flow due to an arriving packet.  Before kernel
flow setup completes, another packet for that flow arrives.  The
kernel sends the new packet to userspace after userspace has
completed processing the batch of packets that set up the flow.
Userspace then attempts to reinstall the kernel flow.  This fails
with EEXIST, so userspace then marked the flow as not-installed,
even though it was successfully installed before and remains
installed.  The next time userspace dumped the kernel flow
table to gather statistics, it would complain about an unexpected
flow and delete it.

In practice, we have seen these messages with netperf TCP_CRR tests and
UDP stream tests.

This patch fixes the problem by changing userspace so that, once
it successfully installs a flow in the kernel, it will not reinstall
it when it sees another packet for the flow in userspace.  This
has the downside that, if something goes wrong and a flow
disappears from the kernel (e.g. ovs-dpctl del-flows), then userspace
won't reinstall it (until it tries to delete it).  (This is in fact
the reason why until now userspace reinstalled flows it knew it
already installed.)

Some more background may be warranted.  There are two EEXIST error
cases:

       1. A subfacet was installed successfully in a previous (recent)
          batch.  Now we've attempted to reinstall exactly the same
          subfacet in this batch.

       2. A subfacet was installed successfully in a previous (recent)
          batch or earlier in the current batch.  We've attempted to
          install a subfacet for an overlapping megaflow.

Before megaflows, installation errors were ignored completely.
Since megaflows were introduced, they have been handled by
considering on any installation error that the given subfacet is
not installed.  This works well for case #2 but causes case #1 to
yield unexpected flows, as described at the top of the commit
message.

This commit adds the wrinkle that we never try to reinstall
exactly the same subfacet that we know we installed successfully
earlier (and haven't deleted) unless its actions change.  This
ought to work just as well for case #2, and avoids the problem
with case #1.

Prepared with assistance from Ethan.

Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com rewrote the commit message]
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Always un-wildcard fields that are being set.
Justin Pettit [Sat, 3 Aug 2013 04:17:31 +0000 (21:17 -0700)]
ofproto-dpif: Always un-wildcard fields that are being set.

The ODP library has an optimization to not set a header if the field was
not changed, regardless of whether an action to set the field was
present.  That library is also responsible for un-wildcarding fields
that are bieng modified.  This leads to a problem where a packet matches
a flow that updates a field, but that particular packet's field already
has that value.  As such, an overly loose megaflow will be generated
that doesn't match on that field and the actions won't update it.  A
second packet that should have the field set will match that flow and
will not be modified.

This commit changes the behavior to always un-wildcard fields that are
being modified.  Since the ODP library updates the entire header if a
field in it is modified, and all those fields will be un-wildcarded, the
generated flows may be different.  However, they should be correct.

Bug #18946.

Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: Always serialize tunnel mask attributes.
Jesse Gross [Thu, 1 Aug 2013 23:17:47 +0000 (16:17 -0700)]
odp-util: Always serialize tunnel mask attributes.

A tunnel value attribute is not allowed to have an empty IP destination
address but this is legal for masks. This drops both the checks for
serializing masks and also the sanity checks on them.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agodatapath: Introduce is_mask when serializing netlink attributes.
Jesse Gross [Thu, 1 Aug 2013 23:17:46 +0000 (16:17 -0700)]
datapath: Introduce is_mask when serializing netlink attributes.

The intention is clearer than if we rederive it in every location.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoofproto-dpif-xlate: Unmask mark when used for tunnels.
Jesse Gross [Thu, 1 Aug 2013 20:31:28 +0000 (13:31 -0700)]
ofproto-dpif-xlate: Unmask mark when used for tunnels.

The tunnel lookup uses the skb_mark as part of the port find process
but it isn't unmasked along with the other fields. This adds it to
the list of significant fields.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agodatapath: Accept any 802.2 eth_type mask but override to be exact match
Andy Zhou [Thu, 1 Aug 2013 17:49:46 +0000 (10:49 -0700)]
datapath: Accept any 802.2 eth_type mask but override to be exact match

When key.eth_type is absent it is interpreted to be 802.2, which is
represented by a special value. In order to prevent inadvertant matches
on this opaque value, the mask is forced to be either fully wildcarded
or fully exact.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Accept any in_port mask but override to be exact match
Andy Zhou [Thu, 1 Aug 2013 17:49:45 +0000 (10:49 -0700)]
datapath: Accept any in_port mask but override to be exact match

Pre mega flow, netlink allows the in_port key attribute
to be missing. Missing in_port is interpreted as DP_MAX_PORTS.

For backward compatibility, mega flow implementation will always allow
the mask of in_port to be specified, as if the in_port key attribute
is always specified.

To prevent accidental match of the DP_MAX_PORTS, which value is opaque to
the user space, we will always force the mask to be exact match,
regardless of the value supplied by the netline message. Missing
in_port mask continue to mean wildcarded match, same as other masks.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Always allow tunnel mask to be specified in the netlink
Andy Zhou [Thu, 1 Aug 2013 03:39:49 +0000 (20:39 -0700)]
datapath: Always allow tunnel mask to be specified in the netlink

Netlink message usually only accpets a mask when there is a
corresponding key attribute. Tunnel mask and eth_type are the
only two expections so far.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoodp-util: fix bug in setting ipv4 frag flag mask
Andy Zhou [Wed, 31 Jul 2013 20:54:12 +0000 (13:54 -0700)]
odp-util: fix bug in setting ipv4 frag flag mask

This bug causes the flag mask to always mask only 1 bit, not the 2 bits
possible. While at it, make the top 6 bits exact match.

Bug #18834.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: always export priority and skb_mark in netlink message
Andy Zhou [Wed, 31 Jul 2013 02:49:12 +0000 (19:49 -0700)]
datapath: always export priority and skb_mark in netlink message

Handling of missing attributes in netlink can be tricky and turns out
to be error prone. The value (savings in netlink bandwidth) does not
seem to be significant enough to justify allowing them. This patch
series make both kernel and userspace always export priority and
skb_mark attribute. There will be follow on patches in the
direction of making all attributes explicit.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif: Only track drop flows that are installed
Andy Zhou [Tue, 30 Jul 2013 17:52:35 +0000 (10:52 -0700)]
ofproto-dpif: Only track drop flows that are installed

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Unset DPIF_FP_MODIFY flag when creating a new flo
Andy Zhou [Tue, 30 Jul 2013 17:52:34 +0000 (10:52 -0700)]
ofproto-dpif: Unset DPIF_FP_MODIFY flag when creating a new flo

Remove the DPIF_FP_MODIFY flag when creating a new flow. When flows arrive in
a batch, userspace may push down multiple unique flow definitions that
overlap when wildcards are applied. Kernels support flow wildcarding
will reject these flow as duplicates (EEXIST), which will be logged
at a lower logging level.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix missing VLAN netlink attribute handling
Andy Zhou [Mon, 29 Jul 2013 21:05:23 +0000 (14:05 -0700)]
datapath: Fix missing VLAN netlink attribute handling

Missing VLAN netlink attribute should be interpreted as exact match
of no VLAN tag, instead of wildcarded match for all VLAN tags.

Bug #18736.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: fix a bug in SF_FLOW_KEY_PUT macro
Andy Zhou [Mon, 29 Jul 2013 20:26:08 +0000 (13:26 -0700)]
datapath: fix a bug in SF_FLOW_KEY_PUT macro

This bug will cause mask values to corrupt the flow key value. So far
the bug has not showed up because we don't write mask value when
there is no mask Netlink attributes.  However, it needs to be fixed for
the next and future commits where we will start to set default
values for key and mask for missing Netlink attributes.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: list: Fix double fetch of pointer in hlist_entry_safe()
Pravin B Shelar [Fri, 26 Jul 2013 20:52:24 +0000 (13:52 -0700)]
datapath: list: Fix double fetch of pointer in hlist_entry_safe()

Following patch backports commit f65846a1800ef8c48d (list: Fix double
fetch of pointer in hlist_entry_safe()) from upstream kernel.
This patch fixes following panic. Thanks to Jesse for helping to
debug this issue.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000118
[129608.216422] IP: [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch]
[129608.216918] PGD 11500a067 PUD 120851067 PMD 0
[129608.216994] Oops: 0000 [#1] SMP
[129608.217049] CPU 0
[129608.218697]
[129608.218726] Pid: 0, comm: swapper/0 Tainted: G           O
3.2.39-server-nn21 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform
[129608.219288] RIP: 0010:[<ffffffffa02436da>]  [<ffffffffa02436da>]
ovs_masked_flow_lookup+0xda/0x140 [openvswitch]
[129608.219454] RSP: 0018:ffff88013fc03b60  EFLAGS: 00010282
[129608.219536] RAX: 0000000000000020 RBX: ffff880123087100 RCX:
ffff88012098e000
[129608.219719] RDX: ffff8800b3b0ca30 RSI: 000000000000010a RDI:
ffff88011df8c000
[129608.220121] RBP: ffff88013fc03c30 R08: 0000000000000001 R09:
0000000020069825
[129608.220287] R10: 0000000000000020 R11: 0000000000000001 R12:
ffff880036e1c6c0
[129608.220451] R13: ffff88013fc03b98 R14: 0000000000000024 R15:
ffffffffffffffe0
[129608.220618] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000)
knlGS:0000000000000000
[129608.220794] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[129608.220911] CR2: 0000000000000118 CR3: 00000001190c9000 CR4:
00000000000406f0
[129608.221122] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[129608.221320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[129608.221488] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000,
task ffffffff81c0d020)
[129608.221669] Stack:
[129608.221725]  0000000000000044 0000000000000020 ffff88013fc03c60
0000000000000000
[129608.221906]  0000000000000000 0000000000000000 0000000000000000
f014232200000002
[129608.222069]  1973f0142322015a 0000000600080000 1973f0140579f414
000000002f1dc7ec
[129608.222211] Call Trace:
[129608.222264]  <IRQ>
[129608.222316]  [<ffffffffa02445bd>] ovs_flow_lookup+0x5d/0x70
[openvswitch]
[129608.222411]  [<ffffffffa0242550>] ovs_dp_process_received_packet+0x70/0x110 [openvswitch]
[129608.222541]  [<ffffffff8104d6ec>] ? resched_task+0x2c/0x80
[129608.222644]  [<ffffffffa0249b20>] ? netdev_create+0x120/0x120
[openvswitch]
[129608.222743]  [<ffffffffa02483f8>] ovs_vport_receive+0x38/0x40
[openvswitch]
[129608.222838]  [<ffffffffa0249bc3>] netdev_frame_hook+0xa3/0xf0
[openvswitch]
[129608.222933]  [<ffffffffa0249b20>] ? netdev_create+0x120/0x120
[openvswitch]
[129608.223029]  [<ffffffff81539318>] __netif_receive_skb+0x1c8/0x620
[129608.223114]  [<ffffffff81325740>] ? map_single+0x60/0x60
[129608.223192]  [<ffffffff81539b71>] process_backlog+0xb1/0x190
[129608.223274]  [<ffffffff8153ae64>] net_rx_action+0x134/0x290
[129608.223355]  [<ffffffff8106e1a8>] __do_softirq+0xa8/0x210
[129608.223433]  [<ffffffff816458de>] ? _raw_spin_lock+0xe/0x20
[129608.223513]  [<ffffffff8164fcac>] call_softirq+0x1c/0x30
[129608.223590]  [<ffffffff81016215>] do_softirq+0x65/0xa0
[129608.223665]  [<ffffffff8106e58e>] irq_exit+0x8e/0xb0
[129608.223738]  [<ffffffff81650573>] do_IRQ+0x63/0xe0
[129608.223808]  [<ffffffff81645d6e>] common_interrupt+0x6e/0x6e
[129608.223887]  <EOI>
[129608.223933]  [<ffffffff8103ced6>] ? native_safe_halt+0x6/0x10
[129608.224014]  [<ffffffff8101c6a3>] default_idle+0x53/0x1d0
[129608.224092]  [<ffffffff81013236>] cpu_idle+0xd6/0x120
[129608.224167]  [<ffffffff8161785e>] rest_init+0x72/0x74
[129608.224252]  [<ffffffff81cfcbaa>] start_kernel+0x3b5/0x3c2
[129608.224331]  [<ffffffff81cfc347>]
x86_64_start_reservations+0x132/0x136
[129608.224421]  [<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140
[129608.224506]  [<ffffffff81cfc44d>] x86_64_start_kernel+0x102/0x111
[129608.224589] Code: 25 48 63 53 28 48 8d 42 01 48 c1 e0 04 49 01 c7 49
8b 07 48 85 c0 74 61 4d 8b 3f 48 c1 e2 04 48 83 c2 10 49 29 d7 4d 85 ff
74 26 <4d> 39 a7 38 01 00 00 75 cd 48 8b 95 38 ff ff ff 4c 89 ee 49 8d
[129608.224949] RIP  [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch]

Original commit msg:

list: Fix double fetch of pointer in hlist_entry_safe()

The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure.  This is OK for normal lock-based use because in
that case, the pointer cannot change.  However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time.  This use case can result in the following sequence
of events:

1.  CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
    pointer as sees that it is non-NULL.

2.  CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
    just fetched a pointer to.  Because this is the last entry
    in the list, the pointer fetched by CPU 0 is now NULL.

3.  CPU 0 refetches the pointer, obtains NULL, and then gets a
    NULL-pointer crash.

This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events.  Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.

Many thanks to Eric Dumazet for spotting root cause!

Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Li Zefan <lizefan@huawei.com>
Bug #17099

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: remove RCU annotation from flow->mask
Andy Zhou [Mon, 22 Jul 2013 22:08:00 +0000 (15:08 -0700)]
datapath: remove RCU annotation from flow->mask

After a mask is assigned to a flow, it will not change for the life of
the flow. Since flow access is protected by RCU lock, access to
flow->mask after getting a flow is always safe.

Suggested-by: Jesse Gross <jesse@nicira.com>
Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Conflicts:
datapath/datapath.c

10 years agodatapath: Add mask check during flow lookup
Andy Zhou [Fri, 19 Jul 2013 18:11:24 +0000 (11:11 -0700)]
datapath: Add mask check during flow lookup

A mega flow matches when the masked key matches and the mask applied
is the same as the mask used to create the mega flow.

This patch adds the implementation of the second match condition
mentioned above. Without this fix, mega flow lookup may result false
match.

Bug #18584

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoofproto-dpif: Make "ovs-appctl dpif/dump-flows" megaflow-aware.
Justin Pettit [Wed, 17 Jul 2013 07:19:49 +0000 (00:19 -0700)]
ofproto-dpif: Make "ovs-appctl dpif/dump-flows" megaflow-aware.

The "ovs-appctl dpif/dump-flows" command wasn't updated to print
megaflows, so the output would not include wildcards even though the
datapath may, so the output was inconsistent and confusing.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif: Expire fin_timeout actions when no previous timeout set.
Justin Pettit [Wed, 17 Jul 2013 21:22:28 +0000 (14:22 -0700)]
ofproto-dpif: Expire fin_timeout actions when no previous timeout set.

Commit e503cc199 (ofproto: Optimise OpenFlow flow expiry) optimized
OpenFlow flow expiration by putting expirable flows on a list.  However,
the list is only configured at rule creation time.  If the rule is
created without a timeout, but is later set by the fin_timeout action,
it will never expire.  This commit adds the rule to the list when the
action is triggered.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-ofctl: Bug fix in flow_format()
Andy Zhou [Tue, 16 Jul 2013 00:25:34 +0000 (17:25 -0700)]
ovs-ofctl: Bug fix in flow_format()

Fix a corner case bug where ARP packet with ARP opcode value of 1 would cause
tp_src and tp_dst to appear in the output string.

This bug caused some output from flow_format() to not be accepted by
'ovs-appctl ofproto/trace'

Added test coverage by using ARP opcode 1 in one unit test case, that
would have exposed the bug.

Bug #18334

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Don't put new subfacets as result of "userspace" action.
Justin Pettit [Tue, 16 Jul 2013 01:53:23 +0000 (18:53 -0700)]
ofproto-dpif: Don't put new subfacets as result of "userspace" action.

Don't install a flow if it's the result of the "userspace" action for an
already installed facet.  This can occur when a datapath flow with
wildcards has a "userspace" action and flows sent to userspace result in
a different subfacet, which will then be rejected as overlapping by the
datapath.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Co-authored-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: Use masked flow when validating actions.
Jesse Gross [Tue, 16 Jul 2013 04:30:59 +0000 (21:30 -0700)]
datapath: Use masked flow when validating actions.

It is important to validate flow actions to ensure that they do
not try to write off the end of a packet. The mechanism to do this
is to ensure that a flow is precise enough to describe valid vs.
invalid packets and only allowing actions on valid flows.

The introduction of megaflows broke this by using a narrow base
flow but a potentially wide match. This meant that while the
original flow was properly validated, later packets might not
conform to that flow and could be truncated. This switches to
using the masked flow instead, effectively requiring that all
possible matching packets be valid in order for a flow's actions
to be accepted.

This change only affects the flow setup path - executed packets
have always used the flow extracted from the packet and therefore
were properly validated.

Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Revert "datapath: rhel: Account for RHEL specific backports"
Pravin B Shelar [Mon, 15 Jul 2013 19:45:20 +0000 (12:45 -0700)]
datapath: Revert "datapath: rhel: Account for RHEL specific backports"

This reverts commit 752378e1cd1f133a8366fbacec3b281a45ff8268
(datapath: rhel: Account for RHEL specific backports).
Change related to netif_needs_gso() is cuasing panic on RHEL
and Xen platforms.  This way we revert back to use of ovs
skb_gso_segment() and netif_skb_features() which has required
compat code for gso for kernel older than 2.6.38.

<1>[  924.855722] BUG: unable to handle kernel NULL pointer dereference
at 000000a0
<1>[  924.855789] IP: [<f0337fb7>] netdev_send+0x77/0x340 [openvswitch]
<4>[  924.855849] *pdpt = 000000011bc66027 *pde = 0000000000000000
<0>[  924.855895] Oops: 0000 [#1] SMP
<0>[  924.855927] last sysfs file: /sys/class/net/lo/carrier
<4>[  924.856551] Pid: 17937, comm: vif Not tainted
(2.6.32.43-0.4.1.xs1.6.10.734.170748xen #1) VMware Virtual Platform
<4>[  924.856618] EIP: 0061:[<f0337fb7>] EFLAGS: 00010246 CPU: 0
<4>[  924.856659] EIP is at netdev_send+0x77/0x340 [openvswitch]
<4>[  924.856697] EAX: 00000000 EBX: dd4fb800 ECX: 00000000 EDX:
00000289
<4>[  924.856749] ESI: edd55a40 EDI: 000005dc EBP: df287aa8 ESP:
df287a7c
<4>[  924.856790]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
<0>[  924.856825] Process vif (pid: 17937, ti=df286000 task=ee88d570
task.ti=df286000)
<0>[  924.856880] Stack:
<4>[  924.856902]  00000000 be8b9067 00000000 e9b02000 dd523ac0 dd523b50
f033aef1 b2e77c64
<4>[  924.856966] <0> dd523ac0 ee902840 dd4fc58c df287ab8 f0336162
edd55a40 ee902840 df287ac8
<4>[  924.857043] <0> f032d684 0000001e ee10a300 df287b34 f032d6ef
0000001c 00000000 00000000
<0>[  924.858942] Call Trace:
<4>[  924.859553]  [<f033aef1>] ? flex_array_get+0x51/0x70 [openvswitch]
<4>[  924.860189]  [<f0336162>] ? ovs_vport_send+0x12/0x50 [openvswitch]
<4>[  924.860806]  [<f032d684>] ? do_output+0x34/0x50 [openvswitch]
<4>[  924.861444]  [<f032d6ef>] ? do_execute_actions+0x4f/0x830
[openvswitch]
<4>[  924.862047]  [<f032dfa0>] ? ovs_execute_actions+0x70/0xd0
[openvswitch]
<4>[  924.862636]  [<f032fd3f>] ?
ovs_dp_process_received_packet+0x8f/0xf0 [openvswitch]
<4>[  924.863774]  [<f033690e>] ? ovs_vport_receive+0x5e/0x70
[openvswitch]
<4>[  924.864354]  [<f0337d9f>] ? netdev_frame_hook+0x4f/0x90
[openvswitch]
<4>[  924.864918]  [<c034f7ab>] ? netif_receive_skb+0x1bb/0x6a0
<4>[  924.865469]  [<c03c193f>] ? vlan_gro_common+0x10f/0x230
<4>[  924.866007]  [<c034fd58>] ? napi_skb_finish+0x38/0x40
<4>[  924.866533]  [<c03c1e86>] ? vlan_gro_receive+0x76/0x80
<4>[  924.867055]  [<f05adba4>] ? e1000_receive_skb+0x74/0x80 [e1000]
<4>[  924.867571]  [<f05b2b67>] ? e1000_clean_rx_irq+0x1f7/0x3e0 [e1000]
<4>[  924.868084]  [<f05b2970>] ? e1000_clean_rx_irq+0x0/0x3e0 [e1000]
<4>[  924.869025]  [<f05b06ac>] ? e1000_poll+0x4c/0x1f0 [e1000]

--snip--

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agorhel, xenserver: Create /var/log/openvswitch directory.
Gurucharan Shetty [Sat, 13 Jul 2013 03:52:51 +0000 (20:52 -0700)]
rhel, xenserver: Create /var/log/openvswitch directory.

During installation create the /var/log/openvswitch directory
so that openvswitch startup script is able to write the ovs-ctl.log

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Zero-out subfacet counters when installation fails.
Justin Pettit [Fri, 12 Jul 2013 23:56:55 +0000 (16:56 -0700)]
ofproto-dpif: Zero-out subfacet counters when installation fails.

When deleting subfacets, subfacet_uninstall() asserts that the
subfacet's counters are zero to make sure we don't lose counters.  We
have seen cases where a subfacet cannot be installed, but the counters
have values.  This should not happen and indicates a bug, since we
shouldn't create a subfacet if the datapath has a flow.  A buggy
datapath could trigger this, so just zero out the counters and log an
error.

Bug #18460.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agodatapath: Use kernel eth_mac_addr() on old kernels.
Jesse Gross [Fri, 12 Jul 2013 17:02:15 +0000 (10:02 -0700)]
datapath: Use kernel eth_mac_addr() on old kernels.

The OVS MAC address set function was removed in favor of the version
in the kernel but the function pointer for older kernels was not
updated.

Reported-by: Cali Ente <calientepermanente@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years agodatapath: rhel: Account for RHEL specific backports
Thomas Graf [Tue, 9 Jul 2013 16:00:26 +0000 (18:00 +0200)]
datapath: rhel: Account for RHEL specific backports

The following symbols have been backported to RHEL and the kernel
version is no longer an accurate indicator for their presence:

 - skb_gso_segment
 - netif_skb_features
 - netif_needs_gso

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoFAQ: Add supported kernel versions for newer OVS releases.
Jesse Gross [Tue, 9 Jul 2013 21:11:12 +0000 (14:11 -0700)]
FAQ: Add supported kernel versions for newer OVS releases.

Reported-by: Kris zhang <zhang.kris@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
10 years agoovs-ctl.in: Increase the limit on file descriptors.
Gurucharan Shetty [Thu, 11 Jul 2013 16:10:41 +0000 (09:10 -0700)]
ovs-ctl.in: Increase the limit on file descriptors.

Testing shows that creation of 5000 internal ports and using it
to do some meaningful tasks works fine on a 12 cpu hardware.
Since a single port needs one file descriptor and a bridge
needs 3 file descriptors, we will have to increase the file
descriptor limit to a higher number from the current limit of 5000.
7500 feels like a decent increase with enough room for further
scale testing.

Bug #18383.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-vswitchd: Correct the documentation on the limits for bridges and ports.
Gurucharan Shetty [Thu, 11 Jul 2013 16:08:15 +0000 (09:08 -0700)]
ovs-vswitchd: Correct the documentation on the limits for bridges and ports.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
10 years agodatpath: Fix tunnel TTL flow rejection message.
Jesse Gross [Tue, 9 Jul 2013 23:59:56 +0000 (16:59 -0700)]
datpath: Fix tunnel TTL flow rejection message.

There is no default value for the tunnel TTL, so it must be
specified when setting up a new flow. However, the flow rejection
log message indicates that the TTL must be non-zero, which is not
true.

CC: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Always include tunnel TTL in serialized Netlink attributes.
Jesse Gross [Tue, 9 Jul 2013 23:47:01 +0000 (16:47 -0700)]
datapath: Always include tunnel TTL in serialized Netlink attributes.

There is no default value for the tunnel TTL so it must always be
included in flow keys sent from userspace to kernel. The kernel
should also respect this convertion when sending flows to userspace
by always including the TTL in tunnel flows.

CC: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: fix bugs in exporting netlink attributes
Andy Zhou [Tue, 9 Jul 2013 07:25:00 +0000 (00:25 -0700)]
datapath: fix bugs in exporting netlink attributes

Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Conflicts:
datapath/flow.c

10 years agodatapath: Fix Netlink error message header.
Jesse Gross [Tue, 9 Jul 2013 21:43:47 +0000 (14:43 -0700)]
datapath: Fix Netlink error message header.

CC: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Use DP_MAX_PORTS when no IN_PORT attribute is present.
Jesse Gross [Tue, 9 Jul 2013 21:21:26 +0000 (14:21 -0700)]
datapath: Use DP_MAX_PORTS when no IN_PORT attribute is present.

To indicate that a flow is not associated with any particular in port,
userspace may omit the IN_PORT attribute, which the kernel translates
internally to the special value DP_MAX_PORTS. After the megaflows
changes, this was no longer being done, resulting in it using port 0
(the internal port).

This also adopts a wildcarding scheme similar to 802.2 packets where
a mask can be specified for this non-existent key attribute but it
must either be completely wildcarded or completely exact match.

CC: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Make netlink error messages more consistent.
Jesse Gross [Mon, 8 Jul 2013 21:51:27 +0000 (14:51 -0700)]
datapath: Make netlink error messages more consistent.

Suggested-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: add netlink error message to help kernel userspace integration.
Andy Zhou [Wed, 3 Jul 2013 16:18:27 +0000 (09:18 -0700)]
datapath: add netlink error message to help kernel userspace integration.

When kernel rejects a netlink message, it usually returns EINVAL
error code to the userspace. The actual reason for rejecting the
netlinke message is not available, making it harder to debug netlink
issues.  This patch adds kernel log messages whenever a netlink message
is rejected with reasons. Those messages are logged at the info level.

Those messages are logged only once per message, to keep kernel log noise
level down. Reload the kernel module to re-enable already logged
messages.

The messages are meant to help developers to debug userspace and kernel
intergration issues. The actual message may change or be removed over time.
These messages are not expected to show up in a production environment.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoodp-util: Clearly indicate VID mask is printed in hex.
Justin Pettit [Mon, 1 Jul 2013 18:03:23 +0000 (11:03 -0700)]
odp-util: Clearly indicate VID mask is printed in hex.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoodp-util: Always encode mask of 0xffff for dl_type < ETH_TYPE_MIN.
Justin Pettit [Mon, 1 Jul 2013 17:43:18 +0000 (10:43 -0700)]
odp-util: Always encode mask of 0xffff for dl_type < ETH_TYPE_MIN.

For non-Ethernet II packets, we don't set an EtherType netlink attribute
and set the Ethertype mask attribute to 0xffff.  The code was encoding
whatever mask was passed in, which could lead to bugs if the caller
didn't know the userspace-kernel interface.

Found by inspection.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
10 years agoKeep all of tunnel metadata in flow.
Jarno Rajahalme [Mon, 6 May 2013 11:56:16 +0000 (14:56 +0300)]
Keep all of tunnel metadata in flow.

Do not clear tunnel metadata on tunnel input.

This is a backport of 4110a57 (Keep all of tunnel metadata in
flow.) to branch-1.11.  This fixes an issue with received tunnel
traffic and megaflows.  xlate_actions() calls tnl_port_should_receive()
to determine whether it should un-wildcard the tunnel fields.  Without
this commit, the original flow's tunnel information is zeroed and is
unavailable for tnl_port_should_receive()'s use to determine whether
it's a tunnel flow or not.  This resulted in some tunnel traffic
being wildcarded inappropriately.

Bug #18277.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Fix tunnel source port selection for mega flow
Andy Zhou [Tue, 2 Jul 2013 22:58:19 +0000 (15:58 -0700)]
datapath: Fix tunnel source port selection for mega flow

Tunnel source port selection was based on hash value cached in the
flow. This no longer works with mega flow, since all flows matching
a mega flow will be transmitted with the same tunnel source port.

This patch computes the tunnel source port at run time based on each
incoming packet. Packets belong to the same micro flow would still get
the same source port, but multiple micro flows hitting the same mega flow
can get different source ports.

Packets injected from the usespace will be assigned to the same
source port as if they are forwarded in the kernel.

Bug #18216

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif: Add ability to disable megaflows.
Justin Pettit [Sat, 29 Jun 2013 00:13:50 +0000 (17:13 -0700)]
ofproto-dpif: Add ability to disable megaflows.

Add new "dpif/disable-megaflows" and "dpif/enable-megaflows" commands to
ovs-appctl to disable and enable megaflows, respectively.  By default,
megaflows are enabled, and these commands should only be used for
debugging.

Feature #18265.

Requested-by: Ronghua Zhang <rzhang@vmware.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoflow: Don't assume non-IPv4 is IPv6 for un-wildcarding.
Justin Pettit [Fri, 28 Jun 2013 18:31:48 +0000 (11:31 -0700)]
flow: Don't assume non-IPv4 is IPv6 for un-wildcarding.

When determinining what fields to un-wildcard for the symmetric L4 hash,
don't include the IPv6 address fields if the packet isn't IPv6.

Reported-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoovs-appctl: allow ofproto/trace to take mega flow input
Andy Zhou [Fri, 28 Jun 2013 22:20:49 +0000 (15:20 -0700)]
ovs-appctl: allow ofproto/trace to take mega flow input

ofproto/trace command was not updated to take mega flow input, which made
it hard to use by cutting and pasting output from "ovs-dpctl dump-flows".

(The mask is not actually used for tracing.)

Bug #18267.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-dpctl: Fix mega flow output
Andy Zhou [Fri, 28 Jun 2013 05:02:58 +0000 (22:02 -0700)]
ovs-dpctl: Fix mega flow output

ovs-dpctl sometimes displays wildcarded fields as exact match. This
patch fixes those cases.

This patch implements the following logic. When OVS_FLOW_ATTR_MASK is
missing, the entire key attributes will be displayed as exact match fields.
When OVS_FLOW_ATTR_MASK is present, but some individual key attributes do
not have matching attributes in the mask, those key attributes will be
displayed as wildcarded fields.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agonetflow: Only un-wildcard IPv4 packets.
Justin Pettit [Thu, 27 Jun 2013 00:13:33 +0000 (17:13 -0700)]
netflow: Only un-wildcard IPv4 packets.

NetFlow v5 only supports IPv4, so don't bother un-wildcarding
non-IPv4 packets.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoflow: Only un-wildcard relevant IP headers.
Justin Pettit [Wed, 26 Jun 2013 23:37:16 +0000 (16:37 -0700)]
flow: Only un-wildcard relevant IP headers.

When determining the fields to un-wildcard, we need to be careful
about only un-wildcarding fields that are relevant.  Also, we
didn't properly handle IPv6 addresses.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoodp-util: Fix converting masked VLAN from flow.
Justin Pettit [Fri, 28 Jun 2013 00:57:57 +0000 (17:57 -0700)]
odp-util: Fix converting masked VLAN from flow.

When converting the VLAN from a flow to an ODP key, the processing logic
would always store the VLAN ethertype.  However, when handling a mask,
it should be a mask, not an ethertype.  And since we don't support
bit-wise masking of the ethertype, just make it an exact-match mask.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoodp-util: Correct printing the VLAN PCP mask.
Justin Pettit [Thu, 27 Jun 2013 22:02:12 +0000 (15:02 -0700)]
odp-util: Correct printing the VLAN PCP mask.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Remove old comment.
Justin Pettit [Thu, 27 Jun 2013 23:09:03 +0000 (16:09 -0700)]
ofproto-dpif: Remove old comment.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodatapath: Bug fix: Kernel rejects flow with valid vlan field
Andy Zhou [Wed, 26 Jun 2013 15:54:45 +0000 (08:54 -0700)]
datapath: Bug fix: Kernel rejects flow with valid vlan field

Bug #18233

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoDatapath: Bug fix: kernel rejects mega flow with encap masks
Andy Zhou [Wed, 26 Jun 2013 15:54:44 +0000 (08:54 -0700)]
Datapath: Bug fix: kernel rejects mega flow with encap masks

Bug #18233

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Convert IPv6 TCP and UDP port netlink attributes properly.
Justin Pettit [Thu, 27 Jun 2013 20:42:14 +0000 (13:42 -0700)]
datapath: Convert IPv6 TCP and UDP port netlink attributes properly.

The code that converts netlink attributes to a flow match always
stored TCP and UDP ports in the IPv4 structure.  This commit
properly puts TCP and UDP traffic into appropriate IPv4 and IPv6
structures.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif: Always un-wildcard 'dl_type'.
Justin Pettit [Tue, 25 Jun 2013 23:55:36 +0000 (16:55 -0700)]
ofproto-dpif: Always un-wildcard 'dl_type'.

We always look at the fragment status and often look at other L3
headers when processing the packet, so just un-wildcard the
Ethertype.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agotunnel: Only un-wildcard the ECN bits for IP traffic.
Justin Pettit [Tue, 25 Jun 2013 23:40:50 +0000 (16:40 -0700)]
tunnel: Only un-wildcard the ECN bits for IP traffic.

With tunnels carrying IP packets, ECN bits are always inherited by
the encapsulating tunnel.  However, it doesn't make sense to
unwildcard the inner packet's TOS fields if the packet is not IP.

Found by inspection.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agodatapath: Make OVS_ACTION_ATTR_USERSPACE action to send packet key
Andy Zhou [Tue, 25 Jun 2013 16:21:16 +0000 (09:21 -0700)]
datapath: Make OVS_ACTION_ATTR_USERSPACE action to send packet key

OVS_ACTION_ATTR_USERSPACE action was sending the key from the matching
flow. This works for exact match flows because flow keys are the
same as packet keys. However, it does not work with wildcarded flows as
the packet keys may be different than the flow keys. This patch uses
the packet keys carried in OVS_CB(skb) when calling output_userspace().

Bug #18163

Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Do not clear key in ovs_match_init()
Jesse Gross [Mon, 24 Jun 2013 19:21:29 +0000 (12:21 -0700)]
datapath: Do not clear key in ovs_match_init()

When executing packets sent from userspace, the majority of the
flow information is extracted from the packet itself and a small
amount of metadata supplied by userspace is added. However, when
adding this metadata, the extracted flow information is currently
being cleared.

This manifests in a problem when executing actions as elements of key are
used when verifying some actions. For example a dec_ttl action verifies the
proto of the flow. An example of a flow that fails as a result of this
problem is:

        ovs-ofctl add-flow br0 "ip actions=dec_ttl,normal"

This is a regression added by "datapath: Mega flow implementation",
a1c564be1e2ffc31f8da09ab654c8ed987907fe5.

CC: Andy Zhou <azhou@nicira.com>
Reported-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Fix a kernel crash caused by corrupted mask list.
Andy Zhou [Fri, 21 Jun 2013 23:07:08 +0000 (16:07 -0700)]
datapath: Fix a kernel crash caused by corrupted mask list.

When flow table is copied, the mask list from the old table
is not properly copied into the new table. The corrupted mask
list in the new table will lead to kernel crash. This patch
fixes this bug.

Bug #18110
Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agodatapath: Use a single attribute array for parsing values and masks.
Jesse Gross [Fri, 21 Jun 2013 00:08:09 +0000 (17:08 -0700)]
datapath: Use a single attribute array for parsing values and masks.

When parsing flow Netlink messages we currently have arrays to hold the
attribute pointers for both values and masks. This results in a large
stack, which some compilers warn about. It's not actually necessary
to have both arrays at the same time, so we can collapse this to a
single array.

Reported-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoofproto-dpif: Use megaflows.
Justin Pettit [Tue, 21 May 2013 00:48:49 +0000 (17:48 -0700)]
ofproto-dpif: Use megaflows.

The commit configures the masks generated from megaflows and pushes
them through the dpif layer.

With this commit and a wildcard supporting OVS kernel module,
ovs-vswitchd's flow setup rate is very close to that of the Linux
bridge.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodpif: Log flow masks for "put" and "dump_next".
Justin Pettit [Thu, 20 Jun 2013 20:43:56 +0000 (13:43 -0700)]
dpif: Log flow masks for "put" and "dump_next".

When debugging the system, it's useful to not just see the key but
also the mask.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Handle failed flow 'put's.
Justin Pettit [Tue, 11 Jun 2013 01:09:53 +0000 (18:09 -0700)]
ofproto-dpif: Handle failed flow 'put's.

If a flow cannot be installed in the datapath, we should notice
this and not treat it as installed.  This becomes an issue with
megaflows, since a batch of unique flows may come in that generate
a single new datapath megaflow that covers them.  Since userspace
doesn't know whether the datapath supports megaflows, each unique
flow will get a separate flow entry (which overlap when masks are
applied) and all except the first will get rejected by a megaflow-
supporting datapath as duplicates.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoodp-util: Introduce odp_flow_key_from_mask().
Justin Pettit [Thu, 6 Jun 2013 01:56:58 +0000 (18:56 -0700)]
odp-util: Introduce odp_flow_key_from_mask().

Add a new function for converting a mask into a set of
OVS_KEY_ATTR* attributes.  This will be useful in a future commit.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Return EEXIST on overlapping new flow request
Andy Zhou [Thu, 20 Jun 2013 23:36:04 +0000 (16:36 -0700)]
datapath: Return EEXIST on overlapping new flow request

Flow update still requires unmasked key to match. If not,
return EINVAL.

CC: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agoovs-dpctl: Add mega flow support
Andy Zhou [Wed, 19 Jun 2013 07:15:10 +0000 (07:15 +0000)]
ovs-dpctl: Add mega flow support

Added support to allow mega flow specified and displayed. ovs-dpctl tool
is mainly used as debugging tool.

This patch also implements the low level user space routines to send
and receive mega flow netlink messages. Those netlink suppor
routines are required for forthcoming user space mega flow patches.

Added a unit test to test parsing and display of mega flows.

Ethan contributed the ovs-dpctl mega flow output function.

Co-authored-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agodatapath: Mega flow implementation
Andy Zhou [Mon, 17 Jun 2013 14:51:00 +0000 (07:51 -0700)]
datapath: Mega flow implementation

Add wildcarded flow support in kernel datapath.

Wildcarded flow can improve OVS flow set up performance by avoid sending
matching new flows to the user space program. The exact performance boost
will largely dependent on wildcarded flow hit rate.

In case all new flows hits wildcard flows, the flow set up rate is
within 5% of that of linux bridge module.

Pravin has made significant contributions to this patch. Including API
clean ups and bug fixes.

Co-authored-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
[jesse: Additional documentation, fix memory leak, and improve validation.]
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agotests: Tolerate init process pid != 1.
James Page [Thu, 20 Jun 2013 21:31:52 +0000 (22:31 +0100)]
tests: Tolerate init process pid != 1.

On Ubuntu Saucy based desktops, upstart runs with user sessions
enabled which means that the init process under which a daemon
might run is not always pid = 1.

Instead of checking for pid = 1, check to ensure that the parent
pid of the monitor is not the pid of the shell that started it.

Signed-off-by: James Page <james.page@ubuntu.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agovswitchd: Update flow-eviction-threshold documentation
Joe Stringer [Thu, 20 Jun 2013 00:25:30 +0000 (09:25 +0900)]
vswitchd: Update flow-eviction-threshold documentation

Patch 27a88d1373cbfcceac6d901bbf1c17051aa7845f caused the vswitchd
documentation and the code to digress. This brings them back in line.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Tighten up megaflow wildcard handling.
Justin Pettit [Wed, 19 Jun 2013 06:55:47 +0000 (23:55 -0700)]
ofproto-dpif: Tighten up megaflow wildcard handling.

A number of use-cases weren't handled properly when determining what can
be wildcarded for megaflows.  This commit both catches additional fields
that cannot be wildcarded and loosens a few other cases.

Bug #17979

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
10 years agoovs-xapi-sync: Cache the bridge-id value for non nicira-bridge-id too.
Gurucharan Shetty [Sun, 16 Jun 2013 12:09:20 +0000 (05:09 -0700)]
ovs-xapi-sync: Cache the bridge-id value for non nicira-bridge-id too.

Currently we connect to xapi in case there are multiple
external_ids:xs-network-uuids to get the single bridge id everytime
we have a change in the database for all the interested columns in
ovs-xapi-sync. The xs-network-uuids value can also change whenever
new VLANs are added or deleted, which is a common use case. The
disadvantage with this approach is that we query XAPI more often
and set the bridge-id as "" if we don't get a valid response for
our query. This can take down the logical connectivity for all the
VMs on that xenserver.

Instead of looking at the PIF records for all the xs-network-uuids,
we can instead just look at the xapi record which has the same bridge
name as the OVS bridge name and then cache its uuid. This value will
hold true till the OVS bridge is recreated in which case we will re-read
the value.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
10 years agoofproto-dpif: Don't wildcard fields used in special processing.
Justin Pettit [Tue, 18 Jun 2013 01:07:33 +0000 (18:07 -0700)]
ofproto-dpif: Don't wildcard fields used in special processing.

A number of fields are looked at when determining whether special
processing (slow-path) is needed.  This commit removes wildcarding when
they were consulted.

Reported-by: Ethan Jackson <ethan@nicira.com>
Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoovs-xapi-sync: Retry getting bridge-ids in case xapi is not ready.
Gurucharan Shetty [Fri, 14 Jun 2013 18:23:25 +0000 (11:23 -0700)]
ovs-xapi-sync: Retry getting bridge-ids in case xapi is not ready.

When there are multiple xs-network-uuids set for a bridge,
we query xapi to get the record that does not have a VLAN
associated with it. For cases when xapi does not respond,
retry again after a second.

During the times when xapi does not respond, set the value
as external_ids:bridge_id "".

Bug #17877.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
10 years agoodp-util: Use proper formatting for ODP port number.
Jarno Rajahalme [Fri, 14 Jun 2013 14:09:34 +0000 (17:09 +0300)]
odp-util: Use proper formatting for ODP port number.

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto: Fix use of uninitialized local variable.
Jarno Rajahalme [Fri, 14 Jun 2013 14:09:33 +0000 (17:09 +0300)]
ofproto: Fix use of uninitialized local variable.

Also make the table id arithmetic less confusing.

Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agotunnel: Don't wildcard TTL and TOS in some circumstances.
Justin Pettit [Thu, 13 Jun 2013 23:46:33 +0000 (16:46 -0700)]
tunnel: Don't wildcard TTL and TOS in some circumstances.

For tunnels, we need to handle the facet's wildcards specially in a
couple of cases:

    - Don't wildcard TTL for facets if "ttl" option is "inherit".
    - Never wildcard the ECN bits, since they are always inherited.
    - Wildcard the rest of the TOS field if the "tos" option is "inherit".

Issue #17911

Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoovsdb-server: Preserve remotes across crash and restart.
Ben Pfaff [Thu, 13 Jun 2013 19:25:39 +0000 (12:25 -0700)]
ovsdb-server: Preserve remotes across crash and restart.

Commit b421d2af0ab (ovsdb-server: Add commands for adding and removing
remotes) made it possible to make ovsdb-server connect to OVS managers only
after ovs-vswitchd has completed its initial configuration.  But this
results in an undesirable effect: whenever ovsdb-server crashes, the
monitor restarts its, but ovsdb-server can no longer connect to the manager
because the remotes were added during runtime and that information is lost
during the crash.

This commit fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agoofproto-dpif: Fix format specifier.
Ben Pfaff [Tue, 11 Jun 2013 20:42:12 +0000 (13:42 -0700)]
ofproto-dpif: Fix format specifier.

list_size() returns a size_t, not a uint64_t.

Found by GCC.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif: Print slow path actions in dpif/dump-megaflows.
Justin Pettit [Wed, 12 Jun 2013 05:54:50 +0000 (22:54 -0700)]
ofproto-dpif: Print slow path actions in dpif/dump-megaflows.

It's often helpful to see what the slow path actions actual are.  Print
them when "ovs-appctl dpif/dump-megaflows" is called.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Use proper flow when using facets.
Justin Pettit [Wed, 12 Jun 2013 05:47:12 +0000 (22:47 -0700)]
ofproto-dpif: Use proper flow when using facets.

The handle_flow_miss_with_facet() function used the facet's flow
information instead of the missed flow.  This corrects that.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Never wildcard dl_type for "normal" action.
Justin Pettit [Wed, 12 Jun 2013 00:15:31 +0000 (17:15 -0700)]
ofproto-dpif: Never wildcard dl_type for "normal" action.

The is_gratuitous_arp() function is occasionally called when
processing the "normal" action.  The previous code only disabled
wildcarding the dl_type field when the function was called, but
since it runs occasionally, it could lead to inconsistencies in the
facet table.  This commit causes the dl_type to never be wildcarded
when the "normal" action is used.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Store relevant fields for wildcarding in facet.
Justin Pettit [Wed, 15 May 2013 01:24:43 +0000 (18:24 -0700)]
ofproto-dpif: Store relevant fields for wildcarding in facet.

Dynamically determines the flow fields that were relevant in
processing flows based on the OpenFlow flow table and switch
configuration.  The immediate use for this functionality is to
cache action translations for similar flows in facets.  This yields
a roughly 80% improvement in flow set up rates for a complicated
flow table.

More importantly, these wildcards will be used to determine what to
wildcard for the forthcoming kernel wildcard (megaflow) patches
that will allow wildcarding in the kernel, which will provide
significant flow set up improvements.

The approach to tracking fields and caching action translations in
facets was based on an impressive prototype by Ethan Jackson.

Co-authored-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoflow: Add new wildcard functions.
Ethan Jackson [Tue, 11 Jun 2013 05:48:58 +0000 (22:48 -0700)]
flow: Add new wildcard functions.

Rename the function flow_wildcards_combine() to flow_wildcards_and().
Add new flow_wildcards_or() and flow_hash_in_wildcards() functions.
These will be useful in a future patch.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoclassifier: Add 'wc' argument to classifier_lookup().
Ethan Jackson [Fri, 10 May 2013 02:15:54 +0000 (19:15 -0700)]
classifier: Add 'wc' argument to classifier_lookup().

A future commit will want to know what bits were significant during the
classifier lookup.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Co-authored-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoflow: Add new flow_wildcards_fold_minimask() function.
Ethan Jackson [Fri, 10 May 2013 02:14:20 +0000 (19:14 -0700)]
flow: Add new flow_wildcards_fold_minimask() function.

This function will be useful in a future commit.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Co-authored-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
10 years agoofproto-dpif: Consolidate facet stat logic.
Ethan Jackson [Wed, 29 May 2013 19:38:48 +0000 (12:38 -0700)]
ofproto-dpif: Consolidate facet stat logic.

The logic for updating statistics at the facet level had been
spread through ofproto-dpif in a rather confusing manner.  This
patch consolidates as much of this logic as is reasonable into
facet_push_stats().

On a side note, I'd expect this patch to have a marginal positive
performance impact when using learning (though I haven't bothered
to measure it).  It combines facet_learn() and facet_push_stats()
into one step allowing us to avoid a redundant xlate_actions().

Signed-off-by: Ethan Jackson <ethan@nicira.com>
10 years agoovsdb-idlc: Write a new-line at the end of "annotate" output.
Ben Pfaff [Mon, 10 Jun 2013 17:25:29 +0000 (10:25 -0700)]
ovsdb-idlc: Write a new-line at the end of "annotate" output.

Some tools do not like text files that lack a trailing new-line.  In
particular, Debian's dpkg-source utility complains about a missing new-line
in the file generated by ovsdb-idlc:

    dpkg-source: warning: file
    openvswitch-1.9.2+git20130605/lib/vswitch-idl.ovsidl has no final
    newline (either original or modified version)

This commit fixes the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agodpif-netdev: Don't run port names through netdev_vport_get_dpif_port().
Ben Pfaff [Thu, 6 Jun 2013 22:27:15 +0000 (15:27 -0700)]
dpif-netdev: Don't run port names through netdev_vport_get_dpif_port().

The ports that exist within a dpif have already been translated through
netdev_vport_get_dpif_port(), so there is no value to translating them
again in the interfaces that query or dump ports (and possibly a drawback
if somehow the translation could change).

After this change, dpif-netdev translates port names in just one place,
the port_add path, which makes dpif-netdev act the same way as dpif-linux
in this respect.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Do not give stats to rules bypassed by "drop" frag policy.
Ben Pfaff [Wed, 5 Jun 2013 17:11:55 +0000 (10:11 -0700)]
ofproto-dpif: Do not give stats to rules bypassed by "drop" frag policy.

When the OFPC_FRAG_DROP policy is in effect, IP fragments are supposed to
be dropped before they reach the flow table.  Open vSwitch properly dropped
IP fragments in this case, but still accounted them to the packet and byte
counters for the flow that they would have hit if the OFPC_FRAG_NX_MATCh
policy had been in effect.

Reported-by: love you <thunder.love07@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoofproto-dpif: Don't count misses in OpenFlow table stats.
Jesse Gross [Sat, 25 May 2013 00:01:34 +0000 (17:01 -0700)]
ofproto-dpif: Don't count misses in OpenFlow table stats.

Originally no rule existed for packets that did not match an
OpenFlow flow and therefore every packet with a rule could be
counted as a hit. However, newer versions of OVS have hidden
miss rules so this is no longer true. To return the correct
table stats, this subtracts packets that hit the miss rule
from the total and removes the separate counter.

Reported-by: love you <thunder.love07@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
10 years agopackets: Fix typo in reserved multicast Ethernet addresses.
Ben Pfaff [Tue, 28 May 2013 23:05:34 +0000 (16:05 -0700)]
packets: Fix typo in reserved multicast Ethernet addresses.

The reserved multicast Ethernet addresses begin with 01:80:c2, not
01:08:c2.

Reported-by: Padmanabhan Krishnan <kprad1@yahoo.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
10 years agoAlways use valid ids pointer in dec_ttl_cnt_ids_from_openflow()
Simon Horman [Mon, 3 Jun 2013 05:46:30 +0000 (14:46 +0900)]
Always use valid ids pointer in dec_ttl_cnt_ids_from_openflow()

Always update the ids pointer after calling ofpbuf_put()
to ensure that it is valid when accessed.

During testing a case came up where the call to ofpbuf_put() in the
for (i = 0; i < ids->n_controllers; i++) loop would cause the underlying
buffer to be reallocated. This resulted in ids->n_controllers being an
incorrect value, the loop continuing on longer than desired and finally a
segmentation fault.

Reported-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
10 years agocfm: Update netdev when changed.
Ethan Jackson [Mon, 3 Jun 2013 20:49:13 +0000 (13:49 -0700)]
cfm: Update netdev when changed.

If ofproto decided to change the netdev of a particular ofport,
cfm_demand mode would improperly continue using the old netdev to
collect stats.

Bug #17583.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
10 years agoINSTALL: Add documentaion for hot upgrades.
Gurucharan Shetty [Thu, 30 May 2013 09:07:15 +0000 (09:07 +0000)]
INSTALL: Add documentaion for hot upgrades.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>