cascardo/ovs.git
Daniele Di Proietto [Wed, 6 Apr 2016 01:02:14 +0000 (18:02 -0700)]
dpif-netdev: Fix race condition in pmd thread initialization.

The pmds and the main threads are synchronized using a condition
variable.  The main thread writes a new configuration, then it waits on
the condition variable.  A pmd thread reads the new configuration, then
it calls signal() on the condition variable. To make sure that the pmds
and the main thread have a consistent view, each signal() should be
backed by a wait().

Currently the first signal() doesn't have a corresponding wait().  If
the pmd thread takes a long time to start and the signal() is received
by a later wait, the threads will have an inconsistent view.

The commit fixes the problem by removing the first signal() from the
pmd thread.

This is hardly a problem on current master, because the main thread
will call the first wait() a long time after the creation of a pmd
thread.  It becomes a problem with the next commits.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Daniele Di Proietto [Wed, 6 Apr 2016 00:01:25 +0000 (17:01 -0700)]
dpif-netdev: Add functions to modify rxq without reloading pmd threads.

This commit introduces some functions to add/remove rxqs from pmd
threads without reloading them.  They will be used by next commits.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Daniele Di Proietto [Tue, 5 Apr 2016 20:14:56 +0000 (13:14 -0700)]
dpif-netdev: Factor out port_create() from do_add_port().

Instead of performing every operation inside do_add_port(), it seems
clearer to introduce port_create(), since we already have
port_destroy().

No functional change.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Daniele Di Proietto [Thu, 7 Apr 2016 19:54:10 +0000 (12:54 -0700)]
dpif-netdev: Remove unused 'index' in dp_netdev_pmd_thread.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Daniele Di Proietto [Tue, 5 Apr 2016 01:10:33 +0000 (18:10 -0700)]
dpif-netdev: Destroy 'port_mutex' in dp_netdev_free().

Found by inspection.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
YAMAMOTO Takashi [Fri, 20 May 2016 05:52:19 +0000 (05:52 +0000)]
netdev-native-tnl: Fix a build error on NetBSD 7.0

netinet/ip6.h is not a standalone header there.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Tested-by: Jeff Feng <jianhua@us.ibm.com>
Kevin Traynor [Thu, 19 May 2016 12:51:32 +0000 (13:51 +0100)]
netdev-dpdk: Improve pthread_getaffinity_np() fail handling.

Prevent pthread_setaffinity_np() being called with a potentially
invalid cpu_set_t and add a default (core 0x1).

Also, only call pthread_getaffinity_np() if no dpdk-lcore-mask is specified.

Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Kevin Traynor [Thu, 19 May 2016 12:51:31 +0000 (13:51 +0100)]
netdev-dpdk: Fix coremask logic.

Only set the thread affinity back to the pre rte_eal_init() value
when the user has not specified a coremask.

Fixes: 88964e6428dc("netdev-dpdk: Autofill lcore coremask if absent")
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ben Pfaff [Sun, 8 May 2016 17:34:10 +0000 (10:34 -0700)]
ofproto-dpif-xlate: Fix IGMP megaflow matching.

IGMP translation wasn't setting enough bits in the wildcards to ensure
that different packets were handled differently.

Reported-by: "O'Reilly, Darragh" <darragh.oreilly@hpe.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-April/021036.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Daniele Di Proietto [Wed, 18 May 2016 01:38:20 +0000 (18:38 -0700)]
dpif-netdev: Initialize packet RSS hash in dpif_netdev_execute().

The datapath code expects the RSS hash to always be initialized.  This
is enforced by checking in emc_processing() that the hash is valid, and
by computing a new one if it is not.

Unfortunately, there is another entry point to the datapath,
dpif_netdev_execute().  A packet generated by OVS (BFD frame,
packet-out from controller) doesn't have a valid RSS hash and so is
allowed to enter the datapath with an uninitialized hash value.

This commit recomputes the hash (if not valid) in dpif_netdev_execute().

The only place where we would use an invalid hash is netdev-vport, in
push_udp_header().  This caused an uninitialized memory read, and a
random value to be assigned to the outer tunnel header source port.

Reported-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
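
The "recompute the hash if not valid" step described in this commit can be modelled in a few lines. This is only an illustrative Python sketch, not the C code from dpif-netdev: the `Packet` class, `compute_hash()` and `ensure_rss_hash()` are hypothetical names, and CRC32 stands in for whatever hash the datapath really uses.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    """Hypothetical stand-in for a datapath packet."""
    five_tuple: tuple
    rss_hash: Optional[int] = None  # None models "hash not valid"


def compute_hash(five_tuple: tuple) -> int:
    # Assumption: any deterministic hash is good enough for the illustration.
    return zlib.crc32(repr(five_tuple).encode())


def ensure_rss_hash(pkt: Packet) -> int:
    """Recompute the hash only when the packet does not carry a valid one."""
    if pkt.rss_hash is None:
        pkt.rss_hash = compute_hash(pkt.five_tuple)
    return pkt.rss_hash
```

Calling `ensure_rss_hash()` at every entry point means later code, such as a tunnel header push that derives a source port from the hash, never sees an uninitialized value.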
Daniele Di Proietto [Wed, 18 May 2016 01:26:02 +0000 (18:26 -0700)]
dpif: Pass flow parameter to dpif_execute().

All the callers of the function already have a copy of the extracted
flow in their stack (or a few frames before).

This is useful for different reasons:
* It forces the callers to also call flow_extract() on the packet, which
  is necessary to initialize the l2,l3,l4 pointers.
* It will be used in the userspace datapath to generate the RSS hash by
  a following commit.
* It can be used by the userspace connection tracker to avoid extracting
  the l3 type again.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Daniele Di Proietto [Wed, 18 May 2016 02:18:51 +0000 (19:18 -0700)]
flow: Fix uninitialized reads in [mini]flow_hash_5tuple().

Almost every caller expects [mini]flow_hash_5tuple() to be able to deal
with all kinds of flows, not only TCP and UDP.

Currently, when dealing with non-L4 flows, the function may access
uninitialized memory.  This commit changes it to return early with
a partial hash value instead of reading uninitialized memory.

Found by valgrind.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
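
The early return with a partial hash can be sketched as follows. This is a Python model under stated assumptions, not the [mini]flow_hash_5tuple() implementation: `hash_add()`, `hash_5tuple()` and the `L4_PROTOS` set are hypothetical, and CRC32 is used only as a convenient mixing function.

```python
import zlib

# Assumption: protocols whose flows carry L4 ports (TCP, UDP, SCTP).
L4_PROTOS = {6, 17, 132}


def hash_add(h: int, value: int) -> int:
    """Mix one non-negative integer field into the running hash."""
    return zlib.crc32(value.to_bytes(8, "big"), h)


def hash_5tuple(src_ip, dst_ip, proto, src_port=None, dst_port=None, basis=0):
    h = hash_add(basis, src_ip)
    h = hash_add(h, dst_ip)
    h = hash_add(h, proto)
    if proto not in L4_PROTOS or src_port is None or dst_port is None:
        # Non-L4 flow: stop here and return the partial hash instead of
        # reading port fields that were never filled in.
        return h
    h = hash_add(h, src_port)
    return hash_add(h, dst_port)
```

With this shape an ICMP flow hashes only its addresses and protocol, while TCP and UDP flows still include both ports.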
Aaron Conole [Fri, 20 May 2016 15:52:59 +0000 (11:52 -0400)]
utilities/checkpatch.py: Check for appropriate bracing

Teach checkpatch.py to understand that if/for/while blocks should always
end with braces on the same line (if possible). This does not address
multi-line if/for/while blocks, but provides a point where such blocks
could be added.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
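
A single-line bracing check of the kind described above might look roughly like this. It is a simplified sketch and not the code added to checkpatch.py: the regular expression and the `missing_brace()` helper are hypothetical, and headers whose condition spans several lines are skipped, matching the limitation stated in the commit message.

```python
import re

# Hypothetical pattern: an if/for/while header, optionally after "} else".
_KEYWORD = re.compile(r"^\s*(?:}\s*else\s+)?(?:if|for|while)\s*\(")


def missing_brace(line: str) -> bool:
    """Return True if a one-line if/for/while header lacks a trailing '{'."""
    stripped = line.rstrip()
    if not _KEYWORD.match(stripped):
        return False
    # Only judge headers that close their condition on the same line;
    # multi-line conditions are out of scope for this sketch.
    if stripped.count("(") != stripped.count(")"):
        return False
    # Caveat: a bare "while (x);" closing a do-while would also be flagged.
    return not stripped.endswith("{")
```

A caller would run `missing_brace()` over the added lines of a patch and report each line for which it returns True.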
Ben Pfaff [Fri, 20 May 2016 14:49:02 +0000 (07:49 -0700)]
stp: Initialize mutex whenever we register unixctl command.

The stp/tcn command, which locks the mutex, was being registered without
initializing the mutex, so calling stp/tcn before STP was enabled on the
switch caused a crash.  This commit fixes the bug by initializing the mutex
at the same time we register the stp/tcn command.

Reported-by: Ding Zhi <zhi.ding@6wind.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-May/071381.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Quentin Monnet <quentin.monnet@6wind.com>
Ofer Ben-Yacov [Wed, 18 May 2016 15:29:13 +0000 (18:29 +0300)]
python: Add TCP passive-mode to IDL.

Requested-by: "D M, Vikas" <vikas.d-m@hpe.com>
Requested-by: "Kamat, Maruti Haridas" <maruti.kamat@hpe.com>
Requested-by: "Sukhdev Kapur" <sukhdev@arista.com>
Requested-by: "Migliaccio, Armando" <armando.migliaccio@hpe.com>
Signed-off-by: "Ofer Ben-Yacov" <ofer.benyacov@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Fri, 20 May 2016 14:50:46 +0000 (10:50 -0400)]
utilities/ovs-ctl.in: Only add_managers with vswitchd

The ovs-ctl script was changed recently to have per-service start/stop
control. However, when that change was made the add_managers() call was
overlooked. This results in calls to `ovs-ctl --no-ovs-vswitchd start`
telling the ovsdb-server to connect to the remote controllers.

The fix presented will defer signaling to remote managers until the
following are both true:
1. At least one of OVSDB_SERVER or OVS_VSWITCHD was told to start
2. Both daemons are running.

Fixes: 7fc28c50c012 ("ovs-ctl: Allow selective start for db and switch")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
YAMAMOTO Takashi [Fri, 13 May 2016 14:36:15 +0000 (14:36 +0000)]
utilities: Tweak python shebangs to use env

"python" command provided by pkg_alternatives is a shell script.
At least on NetBSD-7, execve can't execute scripts whose interpreter
is another shell script.  (While some "rich" shells like zsh seem
to handle the case by themselves, NetBSD's /bin/sh doesn't.)
Work around the issue by using the env command in the shebangs of
these scripts.

Noticed with the recent tunnel-push-pop.at tests using ovs-pcap command.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
YAMAMOTO Takashi [Fri, 13 May 2016 14:11:20 +0000 (14:11 +0000)]
ovn-controller-vtep.at: Pre-sort output before feeding to "sort -d"

NetBSD's "sort -d" preserves the order of lines which don't contain
alphanumerics or blanks, e.g. empty lines and [].
This means it sometimes preserves the unstable order of the list output.

Also, simply remove -d option where the expected output doesn't
include [].

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
YAMAMOTO Takashi [Fri, 13 May 2016 12:57:48 +0000 (12:57 +0000)]
ovsdb-server.at: Fix races

As ovsdb-server creates pid file before unixctl socket, waiting
for pid file creation is not enough.  Fix the race by retrying
with "version" command before assuming the server is up.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
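
The "retry until the server answers" idea can be expressed as a small polling helper. This is a generic sketch and not the test-suite macro changed by the commit: `wait_until()` and its parameters are hypothetical, and `probe` stands for any callable that would issue the "version" command and report whether it succeeded.

```python
import time


def wait_until(probe, timeout=10.0, interval=0.1,
               sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it returns true or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True       # the server answered, so it is really up
        if clock() >= deadline:
            return False      # give up; the caller decides how to fail
        sleep(interval)
```

Waiting on a successful request rather than on the pid file closes the window in which the pid file exists but the control socket does not.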
YAMAMOTO Takashi [Fri, 13 May 2016 11:42:55 +0000 (11:42 +0000)]
dpif: Remove a warning

Remove "attempted to unregister a datapath provider that is not registered"
warning.  It's normal for --enable-dummy=system with a userland-only build.
ovn-controller-vtep.at tests use the flag and fail on the extra warning.

Alternatively, we can make the tests ignore this specific warning.
But currently it doesn't make much sense as dp_unregister_provider
is only used for --enable-dummy.

Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Flavio Fernandes [Tue, 17 May 2016 01:02:52 +0000 (21:02 -0400)]
ovn test: add '-O OpenFlow13' to ovs-ofctl

Make the calls to ovs-ofctl in the test use the protocol parameter
'-O OpenFlow13', so that they are consistent with the existing dump-flows
invocations.

Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Flavio Fernandes [Tue, 17 May 2016 01:02:51 +0000 (21:02 -0400)]
ovn test: remove check for non-existing bridge in hv3

In OVN vtep test, the network topology is like this:

  hv1---\
         >-- [net1] <-- vtep --> [net2] <-- hv3
  hv2---/

The logical switch lsw0 created in this test has no logical
port corresponding to hv3, so that hypervisor does not have
any bridges created by OVN. With this test change, we are
replacing the 'show br-int' with a check to ensure that
'br-int' is not present.

Fixes: 8dab102238f0 ("ovn: Add more details to test output.")
Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Flavio Fernandes [Tue, 17 May 2016 01:02:50 +0000 (21:02 -0400)]
ovn test: improve vtep test description and fix typo

- Add vtep as keyword and in description of test 2028
- Fix minor typo: 'information'

Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Sun, 15 May 2016 15:52:33 +0000 (08:52 -0700)]
ovn-nbctl: Fix memory leak reported by Valgrind.

A "definitely lost" leak is reported by test 2026: ovn -- 3 HVs, 1 LS, 3 lports/HV.
  ds_put_char__ (dynamic-string.c:82)
  ds_put_char (dynamic-string.h:88)
  process_escape_args (process.c:103)
  main (ovn-nbctl.c:92)
Another leak with a similar pattern is shown at ovn-sbctl.c.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Pravin B Shelar [Wed, 18 May 2016 00:35:33 +0000 (17:35 -0700)]
tnl-ports: Handle STT ports.

STT uses a TCP port, so we need to filter traffic on the basis of TCP
port numbers.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:35:28 +0000 (17:35 -0700)]
tunnel: Add IP ECN related functions.

Set and get functions for the IP explicit congestion notification flag.
These functions will be used by the STT reassembly code.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:33:44 +0000 (17:33 -0700)]
dpif-netdev: Refactor userspace action

Large segment support needs to use this refactored function to
send individual segments.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:33:32 +0000 (17:33 -0700)]
dpif-netdev: Refactor fast path process function.

Once the datapath supports large packets, we need to segment a packet before
sending it to upcall.  Refactoring this code makes it a bit cleaner.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:33:10 +0000 (17:33 -0700)]
dpif-netdev: Fix memory leak in tunnel header push action.

In case of an error from netdev_push_header(), the batch of packets was not
freed.  This patch fixes the issue.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:32:37 +0000 (17:32 -0700)]
dpif-netdev: Fix memory leak in tunnel header pop action.

The tunnel header pop action can leak a batch of packets
in case of error.  This patch fixes the error code path.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:32:33 +0000 (17:32 -0700)]
dpif-netdev: create batch object

The DPDK datapath operates on batches of packets.  To pass a batch of
packets around we use a packet array and a count.  The next patch needs
to associate metadata with each batch of packets, so introduce
a batch structure to make handling the metadata easier.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:32:28 +0000 (17:32 -0700)]
dpif-netdev: rename packet_batch

The next patch introduces a new structure named packet_batch, so
rename the existing one to packet_batch_per_flow.
This does not change any functionality.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:32:23 +0000 (17:32 -0700)]
dp-packet: use packet reset function.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:32:17 +0000 (17:32 -0700)]
dp-packet: Add private data

This scratchpad can be used by any layer to keep private data.
STT will use it for TCP reassembly state.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:32:06 +0000 (17:32 -0700)]
netdev: Return number of packet from netdev_pop_header()

The current tunnel-pop API does not allow the netdev implementation
to retain a packet, but STT can keep a packet from a batch of packets
during TCP reassembly processing.  To return the exact count of
valid packets, STT needs the number of packets to be passed
by reference.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Pravin B Shelar [Wed, 18 May 2016 00:31:33 +0000 (17:31 -0700)]
netdev-vport: Factor-out tunnel Push-pop code into separate module.

It is better to move tunnel push-pop action specific functions into
a separate module.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Jarno Rajahalme [Wed, 18 May 2016 23:28:36 +0000 (16:28 -0700)]
nat: documentation and parsing fixes.

Add the missing NAT documentation to ovs-ofctl man page and add
validation of the NAT flags to NAT action decoding and parsing.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Joe Stringer [Sat, 14 May 2016 22:08:08 +0000 (15:08 -0700)]
ovs-dev.py: Update for python3.

Adapt to python-2.6+, including support for 3.

Signed-off-by: Joe Stringer <joe@ovn.org>
Joe Stringer [Sat, 14 May 2016 21:18:27 +0000 (14:18 -0700)]
ovs-dev.py: PEP-8ify.

Signed-off-by: Joe Stringer <joe@ovn.org>
Flavio Fernandes [Wed, 18 May 2016 15:00:49 +0000 (11:00 -0400)]
tests: Enable color output for unit tests, if available.

Reference thread in mailing list:
http://openvswitch.org/pipermail/discuss/2016-May/021339.html

Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
nickcooper-zhangtonghao [Fri, 6 May 2016 03:07:57 +0000 (23:07 -0400)]
ovs-vtep: Support running multiple ovs-vtep processes

Include ovs-vtep physical switch name as part of logical switch name to
support running multiple ovs-vtep processes sharing the same ovsdb and vswitchd.

Signed-off-by: nickcooper-zhangtonghao <nickcooper-zhangtonghao@opencloud.tech>
Tested-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Edward Aymerich [Mon, 2 May 2016 20:07:20 +0000 (14:07 -0600)]
tests: Add test for partial map updates.

Insert basic functionality for testing partial map updates
and add a new test table named "simple2".

Signed-off-by: Edward Aymerich <edward.aymerich@hpe.com>
Signed-off-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
Co-authored-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Edward Aymerich [Mon, 2 May 2016 20:01:46 +0000 (14:01 -0600)]
ovsdb-idlc.in: Autogenerate partial map updates functions.

Insert code that autogenerates the corresponding map functions to set and
delete elements in map columns.
Also insert descriptions for the autogenerated functions.

Signed-off-by: Edward Aymerich <edward.aymerich@hpe.com>
Signed-off-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
Co-authored-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Edward Aymerich [Mon, 2 May 2016 19:59:44 +0000 (13:59 -0600)]
ovsdb-idl: Add partial map updates functionality.

In the current implementation, every time an element of either a map or set
column has to be modified, the entire content of the column is sent to the
server to be updated. This is not a major problem if the information contained
in the column for the corresponding row is small, but there are cases where
these columns can have a significant number of elements per row, or these
values are updated frequently, so the cost of the modifications becomes
high in terms of time and bandwidth.

In this solution, the ovsdb-idl code is modified to use the RFC 7047 'mutate'
operation, to allow sending partial modifications on map columns to the server.
The functionality is exposed to clients in the vswitch idl. This was
implemented through map operations.

A map operation is defined as an insertion, update or deletion of a key-value
pair inside a map. The idea is to minimize the number of map operations
that are sent to the OVSDB server when a transaction is committed.

In order to keep track of the requested map operations, structs map_op and
map_op_list were defined with accompanying functions to manipulate them. These
functions make sure that only one operation is sent to the server for each
key-value pair that is modified, so multiple operations on a key-value pair are
collapsed into a single operation.

As an example, if a client using the IDL updates the value for the same key
several times, the functions will ensure that only the last value is sent to
the server, instead of multiple updates. Or, if the client inserts a key-value
pair and later on deletes the key before committing the transaction, then both
actions cancel out and no map operation is sent for that key.

To keep track of the desired map operations on each transaction, a list of map
operations (struct map_op_list) is created for every column on the row on which
a map operation is performed. When a new map operation is requested on the same
column, the corresponding map_op_list is checked to verify whether a previous
operation was performed on the same key, in the same transaction. If there is
no previous operation, then the new operation is just added into the list. But
if there was a previous operation on the same key, then the previous operation
is collapsed with the new operation into a single operation that preserves the
final result as if both operations were performed sequentially. This design
keeps a small memory footprint during transactions.

When a transaction is committed, the map operations lists are checked and
all map operations that belong to the same map are grouped together into a
single JSON RPC "mutate" operation, in which each map_op is transformed into
the necessary "insert" or "delete" mutators. Then the "mutate" operation is
added to the operations that will be sent to the server.

Once the transaction is finished, all map operation lists are cleared and
deleted, so the next transaction starts with a clean board for map operations.

Using different structures and logic to handle map operations, instead of
trying to force the current structures (like 'old' and 'new' datums in the row)
to handle them, ensures that map operations won't interfere with the current
logic to generate JSON messages for other operations, avoids duplicating the
whole map for just a few changes, and is faster for insert and delete
operations, because there is no need to maintain the invariants in the 'new'
datum.

Signed-off-by: Edward Aymerich <edward.aymerich@hpe.com>
Signed-off-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
Co-authored-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
[blp@ovn.org made style changes and factored out error checking]
Signed-off-by: Ben Pfaff <blp@ovn.org>
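
The collapsing of map operations and their translation into a "mutate" operation can be modelled as below. This is a Python sketch under assumptions, not the C code in ovsdb-idl: `MapOpList` and `to_mutations()` are hypothetical names. Only two collapse rules come from the commit message (repeated updates keep the last value; an insert followed by a delete cancels out); the remaining rules, and the choice to express an update as a delete followed by an insert, are assumptions made for the illustration. The JSON shapes follow the `["set", [...]]` and `["map", [[k, v], ...]]` notation of RFC 7047.

```python
INSERT, UPDATE, DELETE = "insert", "update", "delete"


class MapOpList:
    """Pending operations for one map column, at most one per key."""

    def __init__(self):
        self.ops = {}  # key -> (op_type, value)

    def add(self, op_type, key, value=None):
        prev = self.ops.get(key)
        if prev is None:
            self.ops[key] = (op_type, value)
            return
        prev_type = prev[0]
        if op_type == DELETE:
            if prev_type == INSERT:
                del self.ops[key]              # insert + delete cancel out
            else:
                self.ops[key] = (DELETE, None)
        elif prev_type == INSERT:
            self.ops[key] = (INSERT, value)    # still new to the server
        else:
            self.ops[key] = (UPDATE, value)    # keep only the last value


def to_mutations(column, op_list):
    """Group the collapsed operations into RFC 7047 style mutations."""
    deletes = [k for k, (t, _) in op_list.ops.items() if t in (UPDATE, DELETE)]
    inserts = [[k, v] for k, (t, v) in op_list.ops.items()
               if t in (INSERT, UPDATE)]
    mutations = []
    if deletes:
        mutations.append([column, "delete", ["set", deletes]])
    if inserts:
        mutations.append([column, "insert", ["map", inserts]])
    return mutations
```

The returned list would form the "mutations" member of a single "mutate" operation for the row, so the whole map never has to be resent.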
Ciara Loftus [Tue, 17 May 2016 13:28:39 +0000 (14:28 +0100)]
dpif-netlink: Only warn when OVS datapath Netlink family is unavailable.

OVS using DPDK (or the userspace datapath without DPDK) can still function
correctly without the module loaded.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ciara Loftus [Tue, 17 May 2016 13:28:38 +0000 (14:28 +0100)]
netdev: Initialise DPDK netdev classes only once

DPDK netdev classes were being initialised twice, resulting in warning
logs like so:

netdev|WARN|attempted to register duplicate netdev provider: dpdk

This commit removes one of the initialisation calls.

Fixes: 0692257923fe ("netdev: Fix potential deadlock.")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Alin Serdean [Wed, 11 May 2016 20:49:09 +0000 (20:49 +0000)]
appveyor: Update OpenSSL version

The OpenSSL version changed from 1.0.2g to 1.0.2h; this patch bumps the
version.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 17 May 2016 23:29:39 +0000 (16:29 -0700)]
ofproto-dpif-xlate: Fix compilation with GCC 4.6.

Without this change, GCC 4.6 reports:

ofproto/ofproto-dpif-xlate.c: In function ‘xlate_actions’:
ofproto/ofproto-dpif-xlate.c:5117:27: error: missing initializer
ofproto/ofproto-dpif-xlate.c:5117:27: error: (near initialization for
    ‘(anonymous).masks.vlan_tci’)

Reported-by: Joe Stringer <joe@ovn.org>
Reported-at: https://travis-ci.org/openvswitch/ovs/builds/130256491
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
William Tu [Sat, 30 Apr 2016 05:13:46 +0000 (22:13 -0700)]
tests: Add support for helgrind thread error detector.

Helgrind is a Valgrind tool for detecting thread errors, reporting three
classes of errors: misuses of the POSIX pthreads API, potential deadlocks
arising from lock ordering problems, and data races -- accessing memory
without adequate locking.  As with valgrind, users run "make check-helgrind"
and the results are saved in tests/testsuite.dir/<N>/helgrind.*.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Fri, 29 Apr 2016 17:11:25 +0000 (10:11 -0700)]
tests: Remove redundant ofport_request.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 17 May 2016 14:44:06 +0000 (07:44 -0700)]
pinctrl: Fix "sparse" warning.

The ofport member should be an ofp_port_t, since it represents an OpenFlow
port number.

Fixes: 0ee8aaf658dd ("ovn: Send GARP on localnet.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Dennis Sam [Wed, 11 May 2016 18:51:29 +0000 (11:51 -0700)]
vtep: Add other_config to Global table.

Extend the Global table to allow for additional configurations by re-using
the idea of an other_config column.

Signed-off-by: Dennis Sam <dsam@arista.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Fri, 13 May 2016 21:17:12 +0000 (14:17 -0700)]
ofproto-dpif-upcall: Fix UFID usage with flow_modify.

As per the delete_op_init{,__}() functions, the UFID should only be
passed down if ukey->ufid_present is set. Otherwise it is possible to
request a flow modification only using a UFID in a datapath that doesn't
support UFID, which will fail.

Fixes: 43b2f131a229 ("ofproto: Allow in-place modifications of datapath flows.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Thu, 12 May 2016 00:28:54 +0000 (17:28 -0700)]
dpctl: Sort port listing in "show" command.

The port listing did not consistently print in the same order.  While it
is a better user experience to see the ports printed in order, more
importantly, this fixes a unit test ("dpctl - add-if set-if del-if")
that would occasionally fail due to expecting that the ports are printed
in order.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Paul Boca [Wed, 27 Apr 2016 08:05:47 +0000 (08:05 +0000)]
datapath-windows: Validate Netlink packets' integrity.

Fix an access violation when trying to access a Netlink message obtained
with forged IOCTLs.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Sat, 23 Apr 2016 02:40:09 +0000 (19:40 -0700)]
classifier: Use ccmaps for staged lookup indices.

Use the new ccmap type instead of cmap for staged lookup indices to
fix the problem with slow removal of rules with large number of
duplicates.  This was problematic especially when many rules shared
the same match in packet metadata (e.g., a port number, but nothing
else), causing a large number of duplicates to be inserted into the
staged lookup index.  ccmap only keeps the count of inserted (hash)
values, so duplicates do not add any performance penalty.

Reported-by: Alok Kumar Maurya <alok-kumar.maurya@hpe.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Sat, 23 Apr 2016 02:40:09 +0000 (19:40 -0700)]
lib: Add new 'counting cmap' type.

cmap implements duplicates as linked lists, which causes removal of
rules to become O(n^2) with a large number of duplicates.  This patch
fixes the problem by introducing a new 'counting' variant of the cmap
(ccmap), which can be efficiently used to keep counts of inserted hash
values provided by the caller.  This does not require a node in the
user data structure, so this makes the user implementation a bit more
memory efficient, too.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
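
The idea of keeping only a count per hash value, instead of one linked node per duplicate, can be shown with a small Python model. This is not the ccmap implementation (which is a concurrent C data structure); `CountingMap` and its method names are hypothetical and a plain dict replaces the real hash table.

```python
class CountingMap:
    """Keeps how many times each caller-provided hash value was inserted."""

    def __init__(self):
        self._counts = {}

    def inc(self, hash_value: int) -> int:
        """Record one more insertion and return the new count."""
        n = self._counts.get(hash_value, 0) + 1
        self._counts[hash_value] = n
        return n

    def dec(self, hash_value: int) -> int:
        """Remove one insertion and return the remaining count.

        Raises KeyError if the hash value was never inserted.
        """
        n = self._counts[hash_value] - 1
        if n:
            self._counts[hash_value] = n
        else:
            del self._counts[hash_value]   # last duplicate is gone
        return n

    def find(self, hash_value: int) -> int:
        """Return the current count, or 0 if the hash value is absent."""
        return self._counts.get(hash_value, 0)
```

Removing one of many duplicates is a single counter decrement here, so the cost no longer grows with the number of rules sharing the same hash.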
Ramu Ramamurthy [Fri, 29 Apr 2016 00:23:59 +0000 (20:23 -0400)]
ovn: Fix localnet ports deletion and recreation sometimes after restart.

On graceful restart of ovn-controller, the chassis row is inserted in the
Chassis table. During this transaction, there is a window of time where an
idl row-read may not return the newly created row: the row should exist,
but the transaction is still in an incomplete state.  As a result,
get_chassis() in binding_run() returns a null chassis record, binding_run
exits early without creating local_datapaths, and patch_run deletes the
localnet patch ports. In a later run, the localnet patch ports are
recreated.

This is reproducible reliably, though not on every restart.  The fix is to
handle the case that the chassis record may be null in binding_run, and
still create local_datapaths.

Restart logs follow with commentary:

2016-04-28T18:35:42.448Z|00001|vlog|INFO|opened log file /home/ovs/ovs/tests/testsuite.dir/2035/hv/ovn-controller.log
2016-04-28T18:35:42.449Z|00002|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/db.sock: connecting...
2016-04-28T18:35:42.449Z|00003|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/db.sock: connected
2016-04-28T18:35:42.452Z|00004|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/ovn-sb/ovn-sb.sock: connecting...
2016-04-28T18:35:42.452Z|00005|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/ovn-sb/ovn-sb.sock: connected
2016-04-28T18:35:42.454Z|00006|ovsdb_idl|INFO|ovsdb_idl_txn_insert:
                Chassis row inserted into transaction above
2016-04-28T18:35:42.454Z|00007|binding|INFO|Claiming lport localvif2 for this chassis.
2016-04-28T18:35:42.454Z|00008|binding|INFO|Claiming lport localvif3 for this chassis.
2016-04-28T18:35:42.454Z|00009|binding|INFO|Claiming lport localcif4 for this chassis.
2016-04-28T18:35:42.454Z|00010|binding|INFO|Claiming lport localcif5 for this chassis.
2016-04-28T18:35:42.454Z|00011|binding|INFO|Claiming lport localcif1 for this chassis.
2016-04-28T18:35:42.454Z|00012|binding|INFO|Claiming lport localvif1 for this chassis.
2016-04-28T18:35:42.454Z|00013|binding|INFO|Claiming lport localvif201 for this chassis.
2016-04-28T18:35:42.454Z|00014|binding|INFO|Claiming lport localcif3 for this chassis.
2016-04-28T18:35:42.454Z|00015|binding|INFO|Claiming lport localcif2 for this chassis.
               Binding run found the chassis record and has claimed the vifs
2016-04-28T18:35:42.455Z|00016|ofctrl|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt: connecting to switch
2016-04-28T18:35:42.455Z|00017|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt: connecting...
2016-04-28T18:35:42.455Z|00018|pinctrl|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt: connecting to switch
2016-04-28T18:35:42.456Z|00019|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt: connecting...
2016-04-28T18:35:42.457Z|00020|ovsdb_idl|INFO|ovsdb_idl_row_clear_new:
                At this point read of Chassis table returns no rows, and
                the transaction status is still incomplete.
2016-04-28T18:35:42.457Z|00021|binding|INFO|no chassis rec!
                Binding run exits early because chassis_rec was null
2016-04-28T18:35:42.459Z|00022|patch|INFO|removing port patch-br-int-to-localnet201
2016-04-28T18:35:42.459Z|00023|patch|INFO|removing port patch-br-int-to-localnet1
2016-04-28T18:35:42.459Z|00024|patch|INFO|removing port patch-localnet1-to-br-int
2016-04-28T18:35:42.459Z|00025|patch|INFO|removing port patch-localnet201-to-br-int
               Localnet ports are removed above, because local_datapaths don't exist
2016-04-28T18:35:42.459Z|00026|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt: connected
2016-04-28T18:35:42.460Z|00027|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt: connected
2016-04-28T18:35:42.460Z|00028|ovsdb_idl|INFO|ovsdb_idl_row_create:
               Now, the transaction is complete

Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn: Send GARP on localnet.
Ramu Ramamurthy [Tue, 26 Apr 2016 21:31:07 +0000 (17:31 -0400)]
ovn: Send GARP on localnet.

In some use cases such as VM migration or when VMs reuse IP addresses, VMs
become unreachable externally because external switches/routers on localnet
have stale port-mac or ARP caches. The problem resolves after some time
when the caches age out, which could be minutes for port-mac bindings or
hours for ARP caches.

To fix this, send some gratuitous ARPs when a logical port on a localnet
datapath gets added. Such gratuitous ARPs help on a best-effort basis to
update the mac-port bindings and ARP caches of external switches and
routers on the localnet.

Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1545897
Reported-by: Kyle Mestery <mestery@mestery.com>
Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn: Move extract_lport_addresses
Ramu Ramamurthy [Tue, 26 Apr 2016 21:31:06 +0000 (17:31 -0400)]
ovn: Move extract_lport_addresses

Move the function extract_lport_addresses to a file
in ovn/lib since that function can be used by ovn-controller also
to parse addresses stored in the mac column of the
port_binding table. Currently that function is used only
in ovn_northd.

Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodaemon-unix: Properly handle missing users or groups.
Christian Ehrhardt [Mon, 25 Apr 2016 07:12:19 +0000 (09:12 +0200)]
daemon-unix: Properly handle missing users or groups.

From the manpages of getgrnam_r (getpwnam_r is similar):
"If no matching group record was found, these functions return 0 and
store NULL in *result."

The code checked only for errors, but a nonexistent user or group does not
set e != 0, so the code could try to set arbitrary uid/gid values.
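
A minimal sketch of the two-part check, assuming a hypothetical helper
named lookup_uid() (this is not the daemon-unix code): the lookup counts
as successful only when getpwnam_r() returns 0 and also stores a non-NULL
pointer in *result.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: stores the uid of user 'name' in '*uid'.
 * Returns true only if the call succeeded AND a record was found;
 * '*uid' is left untouched otherwise. */
static bool
lookup_uid(const char *name, uid_t *uid)
{
    struct passwd pwd, *result = NULL;
    size_t bufsize = 1024;
    char *buf = malloc(bufsize);
    bool found;
    int e;

    if (!buf) {
        return false;
    }

    /* ERANGE means the scratch buffer was too small: grow and retry. */
    while ((e = getpwnam_r(name, &pwd, buf, bufsize, &result)) == ERANGE) {
        char *bigger;

        bufsize *= 2;
        bigger = realloc(buf, bufsize);
        if (!bigger) {
            free(buf);
            return false;
        }
        buf = bigger;
    }

    /* e == 0 with result == NULL means "no such user", not success. */
    found = (e == 0 && result != NULL);
    if (found) {
        *uid = pwd.pw_uid;
    }
    free(buf);
    return found;
}
```

The same pattern applies to getgrnam_r() with struct group.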

Fixes: e91b927d lib/daemon: support --user option for all OVS daemon
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agomcast-snooping: Trigger revalidation when adding a new multicast group.
Ben Pfaff [Mon, 16 May 2016 20:13:36 +0000 (13:13 -0700)]
mcast-snooping: Trigger revalidation when adding a new multicast group.

Otherwise it takes a long time for flows to be updated when a new group
entry is added.

Reported-by: "O'Reilly, Darragh" <darragh.oreilly@hpe.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-May/021224.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: "O'Reilly, Darragh" <darragh.oreilly@hpe.com>
Tested-at: http://openvswitch.org/pipermail/discuss/2016-May/021244.html

8 years agoacinclude.m4: Fix skb_get_hash function detection
Markos Chandras [Tue, 10 May 2016 08:21:00 +0000 (09:21 +0100)]
acinclude.m4: Fix skb_get_hash function detection

Commit e2f3178f0582 ("datapath: Add support for kernel 3.14.") added
support for 3.14 kernels and a new OVS_GREP_IFELSE check for the
"skb_get_hash" function in the process. "skb_get_hash" was introduced
in the Linux kernel commit 3958afa1b272 ("net: Change skb_get_rxhash to
skb_get_hash") which exists in >=3.14 but the OVS_GREP_IFELSE macro
also matches the "skb_get_hash_raw" function which exists in older
kernels. As a result, the check makes the build system
behave as if the "skb_get_hash" function is available in these older
kernels leading to build failures. We fix this by explicitly checking
for "skb_get_hash(" which matches the function definition.
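
The false positive is plain substring matching, which can be sketched in
C with strstr().  The helper name and the two header lines below are
illustrative assumptions, not verbatim kernel source and not the
OVS_GREP_IFELSE implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative header lines (assumed, not copied from any kernel tree). */
static const char old_kernel_line[] =
    "static inline __u32 skb_get_hash_raw(const struct sk_buff *skb)";
static const char new_kernel_line[] =
    "static inline __u32 skb_get_hash(struct sk_buff *skb)";

/* Hypothetical stand-in for a grep-style check: true if 'pattern' occurs
 * anywhere in 'text'. */
static bool
text_mentions(const char *text, const char *pattern)
{
    return strstr(text, pattern) != NULL;
}
```

Searching for "skb_get_hash" matches both lines, while searching for
"skb_get_hash(" matches only the second one.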

Signed-off-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Jesse Gross <jesse@kernel.org>
8 years agoovsdb-server: Fix memory leak reported by Valgrind.
William Tu [Fri, 13 May 2016 17:33:07 +0000 (10:33 -0700)]
ovsdb-server: Fix memory leak reported by Valgrind.

Reported by test 1657: ovsdb-server/add-db and remove-db.
  ds_put_format (dynamic-string.c:142)
  query_db_remotes (ovsdb-server.c:798)
  reconfigure_remotes (ovsdb-server.c:988)
  main_loop (ovsdb-server.c:156)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-controller: Fix errors reported by Valgrind.
William Tu [Fri, 13 May 2016 18:58:43 +0000 (11:58 -0700)]
ovn-controller: Fix errors reported by Valgrind.

Fix two errors reported by test 2026: ovn -- 3 HVs, 1 LS, 3 lports/HV.
1. Conditional jump or move depends on uninitialised value(s)
    physical_run (physical.c:366)
    main (ovn-controller.c:382)
2. Use of uninitialised value of size 8
    bitmap_set1 (bitmap.h:97)
    update_ct_zones (binding.c:115)
    binding_run (binding.c:228)
    main (ovn-controller.c:362)

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto-dpif-xlate: Always generate wildcards.
Ben Pfaff [Sat, 23 Apr 2016 00:45:03 +0000 (17:45 -0700)]
ofproto-dpif-xlate: Always generate wildcards.

Until now, the flow translation code has tried to avoid constructing a
set of wildcards during translation in the cases where it can, because
wildcards are large and somewhat expensive.  However, this has problems
that we hadn't previously realized.  Specifically, the generated actions
can depend on the constructed wildcards, to decide which bits of a field
need to be set in a masked set_field action.  This means that in practice
translation needs to always construct the wildcards.

(It might be possible to avoid masked set_field when we're not constructing
wildcards, but this would mean that we'd generate different actions
depending on whether wildcards were being constructed, which seems rather
confusing at best.  Also, the cases in which we don't need wildcards anyway
are fairly obscure, meaning that the benefits of avoiding them in those
cases are minimal and that it's going to be hard to get test coverage.  The
latter is probably why we didn't notice this until now.)

Reported-by: William Tu <u9012063@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-April/069219.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Tested-by: William Tu <u9012063@gmail.com>
8 years agonetdev-dpdk: Fix locking during get_stats.
Joe Stringer [Tue, 10 May 2016 22:50:42 +0000 (15:50 -0700)]
netdev-dpdk: Fix locking during get_stats.

Clang complains:
lib/netdev-dpdk.c:1860:1: error: mutex 'dev->mutex' is not locked on every path
      through here [-Werror,-Wthread-safety-analysis]
}
^
lib/netdev-dpdk.c:1815:5: note: mutex acquired here
    ovs_mutex_lock(&dev->mutex);
    ^
./include/openvswitch/thread.h:60:9: note: expanded from macro 'ovs_mutex_lock'
        ovs_mutex_lock_at(mutex, OVS_SOURCE_LOCATOR)
        ^
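
The message does not show the fix itself.  As a generic sketch of the
pattern this kind of warning asks for, the function below funnels every
return path through a single unlock.  The struct, its fields, and the
function name are hypothetical, and plain pthread mutexes stand in for
ovs_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device with a mutex protecting its counters. */
struct toy_dev {
    pthread_mutex_t mutex;
    bool has_stats;
    int rx_packets;
};

/* Copies the counter to '*rx_packets'.  Returns 0 on success, -1 if the
 * device has no statistics.  Every path leaves through 'out', so the
 * mutex is released exactly once however the function exits. */
static int
toy_get_stats(struct toy_dev *dev, int *rx_packets)
{
    int error = 0;

    pthread_mutex_lock(&dev->mutex);
    if (!dev->has_stats) {
        error = -1;
        goto out;               /* Not a bare 'return': must unlock. */
    }
    *rx_packets = dev->rx_packets;

out:
    pthread_mutex_unlock(&dev->mutex);
    return error;
}
```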

Fixes: d6e3feb57c44 ("Add support for extended netdev statistics based on RFC 2819.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agotests: Add valgrind targets for ovn utilities and daemons.
Gurucharan Shetty [Thu, 12 May 2016 15:22:04 +0000 (08:22 -0700)]
tests: Add valgrind targets for ovn utilities and daemons.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
8 years agoofproto-dpif-upcall: Pass key to dpif_flow_get().
Joe Stringer [Tue, 10 May 2016 22:42:01 +0000 (15:42 -0700)]
ofproto-dpif-upcall: Pass key to dpif_flow_get().

Windows datapath folks have reported instances where OVS userspace will
pass down a flow_get request to the datapath using a UFID even though the
datapath has no support for UFIDs. Since commit e672ff9b4d22
("ofproto-dpif: Restore metadata and registers on recirculation."), if a
flow dump provides a flow that userspace isn't aware of, and the flow
dump doesn't provide actions for that flow, then userspace will attempt
a flow_get using just the UFID. This is because the ofproto-dpif layer
doesn't pass the key down to the dpif layer even if it's available.
Prior to the above commit, the codepath was only hit if the key was not
available, which would have implied UFID support. This assumption is now
broken: An empty set of actions could also trigger flow_get, and
datapaths without UFID support are free to pass up empty actions lists.

Pass down the flow key if available, and don't pass down the UFID if
unavailable to be more consistent with the usage of other dpif APIs
within this file.

Fixes: e672ff9b4d22 ("ofproto-dpif: Restore metadata and registers on recirculation.")
Reported-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agovtep: Add source node replication support.
Darrell Ball [Sat, 7 May 2016 16:21:21 +0000 (09:21 -0700)]
vtep: Add source node replication support.

This patch updates the vtep schema, vtep-ctl commands and vtep simulator
to support source node replication in addition to service node
replication per logical switch.  The default replication mode is service
node as that was the only mode previously supported.  Source node
replication mode is optionally configurable and clearing the replication
mode implicitly sets the replication mode back to a default of service
node.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Bruce Davie <bdavie@vmware.com>
Acked-by: Anupam Chanda <achanda@vmware.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
8 years agoofproto-dpif-xlate: fix for group liveness propagation
László Sürü [Wed, 11 May 2016 08:46:33 +0000 (08:46 +0000)]
ofproto-dpif-xlate: fix for group liveness propagation

According to OpenFlow v1.3.5 specification a group is considered live,
if it has at least one live bucket in it.  (6.5 Group Table
Modification Messages: "A group is considered live if a least one of
its buckets is live.")

However, the OVS implementation incorrectly reports a group as live when
no live bucket is found in the group_is_alive() function of
ofproto-dpif-xlate.c.

Instead it should return true only if a live bucket is found (that is
!= NULL).
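
A simplified sketch of the corrected rule, using hypothetical toy types
rather than the ofproto-dpif-xlate.c structures: the group is live
exactly when the search for a live bucket returns a non-NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down bucket: only its liveness matters here. */
struct toy_bucket {
    bool live;
};

/* Returns the first live bucket, or NULL if none is live. */
static const struct toy_bucket *
toy_first_live_bucket(const struct toy_bucket *buckets, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (buckets[i].live) {
            return &buckets[i];
        }
    }
    return NULL;
}

/* A group is live if and only if at least one of its buckets is live. */
static bool
toy_group_is_alive(const struct toy_bucket *buckets, size_t n)
{
    return toy_first_live_bucket(buckets, n) != NULL;
}
```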

Signed-off-by: László Sűrű <laszlo.suru@ericsson.com>
Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agotests: Fix tunnel push pop test failure.
Pravin B Shelar [Wed, 11 May 2016 17:46:30 +0000 (10:46 -0700)]
tests: Fix tunnel push pop test failure.

Sort the list of arp entries to get predictable output.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoofproto-dpif: Restore packet metadata when a continuation is resumed.
Numan Siddique [Tue, 10 May 2016 23:04:35 +0000 (16:04 -0700)]
ofproto-dpif: Restore packet metadata when a continuation is resumed.

Recirculations due to NXT_RESUME are failing if the packet metadata is not
restored prior to the packet execution.

Reported-at: http://openvswitch.org/pipermail/dev/2016-May/070723.html
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
8 years agoutil: Pass 128-bit arguments directly instead of using pointers.
Justin Pettit [Wed, 4 May 2016 01:20:51 +0000 (18:20 -0700)]
util: Pass 128-bit arguments directly instead of using pointers.

Commit f2d105b5 (ofproto-dpif-xlate: xlate ct_{mark, label} correctly.)
introduced the ovs_u128_and() function.  It directly takes ovs_u128
values as arguments instead of pointers to them.  As this is a more
direct way to deal with 128-bit values, modify the other utility
functions to do the same.
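
For illustration, a by-value helper might look like the sketch below.
The toy_u128 type and the function names are assumptions for this
example; they are not the ovs_u128 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value made of two 64-bit halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} toy_u128;

/* Takes and returns the values directly, so calls nest naturally:
 * toy_u128_equals(toy_u128_and(a, b), c). */
static inline toy_u128
toy_u128_and(toy_u128 a, toy_u128 b)
{
    toy_u128 r = { a.hi & b.hi, a.lo & b.lo };
    return r;
}

static inline bool
toy_u128_equals(toy_u128 a, toy_u128 b)
{
    return a.hi == b.hi && a.lo == b.lo;
}
```

Passing by value avoids temporaries whose only purpose is to have an
address to pass.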

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agosystem-traffic: Wait for availability of ftpd.
Joe Stringer [Thu, 5 May 2016 01:01:06 +0000 (18:01 -0700)]
system-traffic: Wait for availability of ftpd.

Some FTP tests had intermittent failures because the FTP daemons
might not load before the testsuite script iterated to running the
client. Add checks after launching FTP daemons to make these tests more
resilient.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agosystem-traffic: Wait for IPv6 connectivity.
Joe Stringer [Thu, 5 May 2016 01:01:05 +0000 (18:01 -0700)]
system-traffic: Wait for IPv6 connectivity.

Several of the tests have race conditions where the next step in the
test may run before the kernel actually provides IPv6 connectivity.
This causes intermittent testsuite failures. Some existing tests
would even sleep in an attempt to mitigate this issue.

Improve the resilience of these tests by waiting until IPv6 or FTP
connectivity is ready. This speeds the testsuite up by a couple of
percent.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agosystem-traffic: Drop auto ct helpers in namespaces.
Joe Stringer [Thu, 5 May 2016 01:01:03 +0000 (18:01 -0700)]
system-traffic: Drop auto ct helpers in namespaces.

Automatic helper assignment in conntrack can trigger an upstream bug
where namespace deletion followed by immediate unload of conntrack
helper modules may cause kernel crashes. Disable automatic helper
assignment within created namespaces to avoid this issue.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agotnl-neigh-cache: check for arp expiration.
Pravin B Shelar [Mon, 25 Apr 2016 22:58:33 +0000 (15:58 -0700)]
tnl-neigh-cache: check for arp expiration.

The neighbor entry expiry is only checked in the dpif-poll
event handler, but in the absence of any event we could keep
using an ARP entry forever. This patch changes it to check
expiration on each lookup.
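
The idea of checking expiry at lookup time can be sketched as follows.
The entry layout, the names, and the explicit 'now_ms' parameter are
assumptions for this example, not the tnl-neigh-cache code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry with an absolute expiration time. */
struct toy_neigh_entry {
    bool in_use;
    uint32_t ip;
    long long expires_ms;
};

/* Returns the entry for 'ip', or NULL if there is none or it has expired.
 * The expiry test runs on every lookup, so a stale entry is never
 * returned even if no periodic cleanup has run. */
static const struct toy_neigh_entry *
toy_neigh_lookup(const struct toy_neigh_entry *table, size_t n,
                 uint32_t ip, long long now_ms)
{
    for (size_t i = 0; i < n; i++) {
        const struct toy_neigh_entry *e = &table[i];

        if (e->in_use && e->ip == ip) {
            return now_ms < e->expires_ms ? e : NULL;
        }
    }
    return NULL;
}
```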

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev: Fix potential deadlock.
Ben Pfaff [Sat, 23 Apr 2016 00:03:22 +0000 (17:03 -0700)]
netdev: Fix potential deadlock.

Until now, netdev_class_mutex and route_table_mutex could be taken in
either order:

    * netdev_run() takes netdev_class_mutex, then netdev_vport_run() calls
      route_table_run(), which takes route_table_mutex.

    * route_table_init() takes route_table_mutex and then eventually calls
      netdev_open(), which takes netdev_class_mutex.

This commit fixes the problem by converting the netdev_classes hmap,
protected by netdev_class_mutex, into a cmap protected on the read
side by RCU.  Only a very small amount of code actually writes to the
cmap in question, so it's a lot easier to understand the locking rules
at that point.  In particular, there's no need to take netdev_class_mutex
from either netdev_run() or netdev_open(), so neither of the code paths
above determines a lock ordering any longer.

Reported-by: William Tu <u9012063@gmail.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-February/020216.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Tested-by: William Tu <u9012063@gmail.com>
8 years agocmap: New macro CMAP_INITIALIZER, for initializing an empty cmap.
Ben Pfaff [Fri, 22 Apr 2016 23:51:03 +0000 (16:51 -0700)]
cmap: New macro CMAP_INITIALIZER, for initializing an empty cmap.

Sometimes code is much simpler if we can statically initialize data
structures.  Until now, this has not been possible for cmap-based data
structures, so this commit introduces a CMAP_INITIALIZER macro.

This works by adding a singleton empty cmap_impl that simply forces the
first insertion into any cmap that points to it to allocate a real
cmap_impl.  There could be some risk that rogue code modifies the
singleton, so for safety it is also marked 'const' to allow the linker to
put it into a read-only page.
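
The singleton technique can be sketched with a much simpler container.
Everything below (toy_map, toy_impl, TOY_MAP_INITIALIZER) is a
hypothetical single-threaded stand-in, not the cmap implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_impl {
    size_t n;
    size_t capacity;
    int *values;
};

/* Shared, never-modified empty implementation.  Being 'const', it may be
 * placed in read-only memory. */
static const struct toy_impl toy_empty_impl = { 0, 0, NULL };

struct toy_map {
    const struct toy_impl *impl;
};

/* Static initializer: no init function call is needed. */
#define TOY_MAP_INITIALIZER { &toy_empty_impl }

static size_t
toy_map_count(const struct toy_map *map)
{
    return map->impl->n;
}

/* Appends 'value'.  The first insertion replaces the shared singleton
 * with a private heap allocation.  Returns 0 on success, -1 on failure. */
static int
toy_map_insert(struct toy_map *map, int value)
{
    struct toy_impl *impl;

    if (map->impl == &toy_empty_impl) {
        impl = malloc(sizeof *impl);
        if (!impl) {
            return -1;
        }
        impl->n = 0;
        impl->capacity = 4;
        impl->values = malloc(impl->capacity * sizeof *impl->values);
        if (!impl->values) {
            free(impl);
            return -1;
        }
        map->impl = impl;
    } else {
        /* Safe cast: any impl other than the singleton came from malloc. */
        impl = (struct toy_impl *) map->impl;
    }

    if (impl->n == impl->capacity) {
        size_t bigger = impl->capacity * 2;
        int *values = realloc(impl->values, bigger * sizeof *values);
        if (!values) {
            return -1;
        }
        impl->values = values;
        impl->capacity = bigger;
    }
    impl->values[impl->n++] = value;
    return 0;
}

/* Frees private storage and points the map back at the singleton. */
static void
toy_map_destroy(struct toy_map *map)
{
    if (map->impl != &toy_empty_impl) {
        struct toy_impl *impl = (struct toy_impl *) map->impl;
        free(impl->values);
        free(impl);
        map->impl = &toy_empty_impl;
    }
}
```

A file-scope "static struct toy_map m = TOY_MAP_INITIALIZER;" is then
usable immediately, and read-only operations on it never touch the heap.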

This adds a new OVS_ALIGNED_VAR macro with GCC and MSVC implementations.
The latter is based on Microsoft webpages, so developers who know Windows
might want to scrutinize it.

As examples of the kind of simplification this can make possible, this
commit removes an initialization function from ofproto-dpif-rid.c and a
call to cmap_init() from tnl-neigh-cache.c.  An upcoming commit will add
another user.

CC: Jarno Rajahalme <jarno@ovn.org>
CC: Gurucharan Shetty <guru@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
8 years agoofproto-dpif: Do not count resubmit to later tables against limit.
Ben Pfaff [Thu, 21 Apr 2016 17:50:17 +0000 (10:50 -0700)]
ofproto-dpif: Do not count resubmit to later tables against limit.

Open vSwitch must ensure that flow translation takes a finite amount of
time.  Until now it has implemented this by limiting the depth of
recursion.  The initial limit, in version 1.0.1, was no recursion at all,
and then over the years it has increased to 8 levels, then 16, then 32,
and 64 for the last few years.  Now reports are coming in that 64 levels
are inadequate for some OVN setups.  The natural inclination would be to
double the limit again to 128 levels.

This commit attempts another approach.  Instead of increasing the limit,
it reduces the class of resubmits that count against the limit.  Since the
goal for the depth limit is to prevent an infinite amount of work, it's
not necessary to count resubmits that can't lead to infinite work.  In
particular, a resubmit from a table numbered x to a table y > x cannot do
this, because any OpenFlow switch has a finite number of tables.  Because
in fact a resubmit (or goto_table) from one table to a later table is the
most common form of an OpenFlow pipeline, I suspect that this will greatly
alleviate the pressure to increase the depth limit.
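
A minimal sketch of the accounting rule, with hypothetical names and a
stripped-down context (this is not the ofproto-dpif-xlate code): only a
resubmit to the same or an earlier table consumes depth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_DEPTH 64        /* The limit quoted above. */

/* Hypothetical translation context holding only the depth counter. */
struct toy_xlate_ctx {
    int depth;
};

/* A resubmit from table x to table y > x cannot loop forever because the
 * number of tables is finite, so only y <= x counts against the limit. */
static bool
toy_resubmit_counts(uint8_t from_table, uint8_t to_table)
{
    return to_table <= from_table;
}

/* Call before following a resubmit.  Returns false if the limit would be
 * exceeded, in which case the resubmit must not be followed. */
static bool
toy_resubmit_enter(struct toy_xlate_ctx *ctx,
                   uint8_t from_table, uint8_t to_table)
{
    if (toy_resubmit_counts(from_table, to_table)) {
        if (ctx->depth >= TOY_MAX_DEPTH) {
            return false;
        }
        ctx->depth++;
    }
    return true;
}

/* Call after a followed resubmit returns. */
static void
toy_resubmit_leave(struct toy_xlate_ctx *ctx,
                   uint8_t from_table, uint8_t to_table)
{
    if (toy_resubmit_counts(from_table, to_table)) {
        ctx->depth--;
    }
}
```

In this sketch a pipeline that walks forward through every table never
increments the counter, while a rule that resubmits to its own table is
cut off after 64 nested levels.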

Reported-by: Guru Shetty <guru@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
8 years agoofproto-dpif: Rename "recurse" to "indentation".
Ben Pfaff [Thu, 21 Apr 2016 17:50:16 +0000 (10:50 -0700)]
ofproto-dpif: Rename "recurse" to "indentation".

The "recurse" member of struct xlate_in and struct xlate_ctx is used for
two purposes: to determine the amount of indentation in "ofproto/trace"
output and to limit the depth of recursion.  An upcoming commit will
separate these tasks, and so in preparation this commit renames "recurse"
to "indentation".

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
8 years agoovn-nbctl: Add sanity checking for lswitch-add.
Ben Pfaff [Sun, 8 May 2016 16:21:29 +0000 (09:21 -0700)]
ovn-nbctl: Add sanity checking for lswitch-add.

I don't think anyone really wants the painful behavior of creating multiple
logical switches with the same name to be the default.  This commit retains
the possibility of doing that in case someone really wants it, but refuses
by default for sanity.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agoovn-nbctl: Make error handling consistent with ovs-vsctl.
Ben Pfaff [Sun, 8 May 2016 16:21:41 +0000 (09:21 -0700)]
ovn-nbctl: Make error handling consistent with ovs-vsctl.

ovs-vsctl distinguishes between internal database inconsistencies, which
it logs, and errors in commands specified by the user, which cause fatal
exits.  ovn-nbctl wasn't as careful about this and tended to just log
everything.  This commit brings it up to the same standard as ovs-vsctl.

This commit also adds --if-exists and --may-exist options in the same kinds
of places as ovs-vsctl, to allow for scripting in cases where it's OK if
an operation has already occurred.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agoovn-nbctl: Mark lport-del commands as writing the database.
Ben Pfaff [Fri, 6 May 2016 17:54:04 +0000 (10:54 -0700)]
ovn-nbctl: Mark lport-del commands as writing the database.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agonetdev-dpdk: Print default vhost-sock-dir value & update documentation
Ciara Loftus [Fri, 6 May 2016 10:20:34 +0000 (11:20 +0100)]
netdev-dpdk: Print default vhost-sock-dir value & update documentation

When no vhost-sock-dir value is provided, print the default location.
Update the documentation to reflect the fact that vhost-sock-dir values
are now subdirectory locations rather than full paths.

Fixes: d8a8f353c23e ("netdev-dpdk: Restrict vhost_sock_dir")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoAdd support for extended netdev statistics based on RFC 2819.
mweglicx [Thu, 5 May 2016 08:46:01 +0000 (09:46 +0100)]
Add support for extended netdev statistics based on RFC 2819.

Implementation of new statistics extension for DPDK ports:
- Add new counters definition to netdev struct and open flow,
  based on RFC2819.
- Initialize netdev statistics as "filtered out"
  before passing it to particular netdev implementation
  (because of that change, statistics which are not
  collected are reported as filtered out, and some
  unit tests were modified in this respect).
- New statistics are retrieved using experimenter code and
  are printed as a result to ofctl dump-ports.
- New counters are available for OpenFlow 1.4+.
- Add new vendor id: INTEL_VENDOR_ID.
- New statistics are printed to output via ofctl only if those
  are present in reply message.
- Add new file header: include/openflow/intel-ext.h which
  contains new statistics definition.
- Extended statistics are implemented only for dpdk-physical
  and dpdk-vhost port types.
- Dpdk-physical implementation uses xstats to collect statistics.
- Dpdk-vhost implements only part of statistics (RX packet-size-based
  counters).

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
[blp@ovn.org made software devices more consistent]
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoAdd change tracking documentation
RYAN D. MOATS [Fri, 22 Apr 2016 21:35:37 +0000 (16:35 -0500)]
Add change tracking documentation

Change tracking is a bit different from what someone with
"classic" database experience might expect, so let's add
the knowledge gained from the experience of making change
tracking work for incremental processing.

Signed-off-by: RYAN D. MOATS <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-sbctl: Display correct ovnsb sock location in help message.
Hui Kang [Tue, 19 Apr 2016 17:50:25 +0000 (13:50 -0400)]
ovn-sbctl: Display correct ovnsb sock location in help message.

Signed-off-by: Hui Kang <kangh@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodpif-netdev: Fix dp_netdev_pmd_remove_flow().
Daniele Di Proietto [Tue, 3 May 2016 23:35:10 +0000 (16:35 -0700)]
dpif-netdev: Fix dp_netdev_pmd_remove_flow().

After removing a flow from the dpcls classifier there might still be
readers who have access to the flow, until the next grace period.

Setting flow->cr.mask to NULL can cause concurrent readers to crash,
so this commit avoids doing it.

The crash can be reproduced, for example, by invoking an operation
that causes datapath flows to be deleted (such as `ovs-appctl
upcall/enable-megaflows`) while traffic is running.

I think the assignment was intended just as a safety measure to catch
race conditions, and it should be safe to remove.

Here's a stack trace of a possible crash:

Program terminated with signal SIGSEGV, Segmentation fault.
rule=0x7f3ae8006190) at ../lib/dpif-netdev.c:4156
4156            if (OVS_UNLIKELY((value & *maskp++) != *keyp++)) {
(gdb) bt
rule=0x7f3ae8006190) at ../lib/dpif-netdev.c:4156
rules=0x7f3afa3f2e40, cnt=<optimized out>) at ../lib/dpif-netdev.c:4225
(pmd=pmd@entry=0x7f3afa3fc010, packets=packets@entry=0x7f3afa3fa420,
cnt=cnt@entry=32, keys=keys@entry=0x7f3afa3f6428,
batches=batches@entry=0x7f3afa3f4118,
n_batches=n_batches@entry=0x7f3afa3fa3b0)
    at ../lib/dpif-netdev.c:3483
(pmd=pmd@entry=0x7f3afa3fc010, packets=packets@entry=0x7f3afa3fa420,
cnt=<optimized out>, md_is_valid=md_is_valid@entry=false,
port_no=<optimized out>) at ../lib/dpif-netdev.c:3625
cnt=<optimized out>, packets=0x7f3afa3fa420, pmd=0x7f3afa3fc010) at
../lib/dpif-netdev.c:3642
rxq=<optimized out>, port=<optimized out>, port=<optimized out>) at
../lib/dpif-netdev.c:2574
../lib/dpif-netdev.c:2693
../lib/ovs-thread.c:340
pthread_create.c:312
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Fixes: 361d808dd9e4 ("flow: Split miniflow's map.")
CC: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
8 years agoovn-northd: Add support for static_routes.
Steve Ruan [Tue, 3 May 2016 12:06:50 +0000 (07:06 -0500)]
ovn-northd: Add support for static_routes.

Logical patch ports are used to connect logical routers
together. Static routes are used to select between different logical router
ports when exiting a logical router.

Reported-by: Na Zhu <nazhu@cn.ibm.com>
Reported-by: Dustin Lundquist <dlundquist@linux.vnet.ibm.com>
Reported-at:
https://bugs.launchpad.net/networking-ovn/+bug/1545140
https://bugs.launchpad.net/networking-ovn/+bug/1539347

Signed-off-by: Steve Ruan <ruansx@cn.ibm.com>
[guru@ovn.org provided the unit test.]
Co-authored-by: Gurucharan Shetty <guru@ovn.org>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
8 years agocheck-kmod: Remove all OVS modules in this target.
Joe Stringer [Tue, 3 May 2016 22:44:15 +0000 (15:44 -0700)]
check-kmod: Remove all OVS modules in this target.

The make check-kmod target would previously attempt to only remove the
openvswitch module, which would fail if any vport modules were loaded.
Remove those modules too, to allow the target to proceed.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
8 years agoclassifier: Remove rare optimization case.
Jarno Rajahalme [Wed, 4 May 2016 20:00:06 +0000 (13:00 -0700)]
classifier: Remove rare optimization case.

This optimization applied when a staged lookup index would narrow down
to a single rule, which happens sometimes in simple test cases, but
presumably less often in more populated flow tables.  The result of
this optimization allowed a bit more general megaflows, but the bit
patterns produced were sometimes cryptic.  Finally, a later fix to a
more important performance problem does not allow for this
optimization any more, so remove it now.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoclassifier: Remove logging.
Jarno Rajahalme [Wed, 4 May 2016 20:00:05 +0000 (13:00 -0700)]
classifier: Remove logging.

The only vlog line was a leftover from debugging.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoclassifier: Remove redundant index.
Jarno Rajahalme [Wed, 4 May 2016 20:00:05 +0000 (13:00 -0700)]
classifier: Remove redundant index.

The test for figuring out if the last index had the same fields as the
actual rules map was broken, resulting in an unnecessary index being
kept around.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agocompat: Remove skbuff header helper backports.
Joe Stringer [Tue, 3 May 2016 00:47:33 +0000 (17:47 -0700)]
compat: Remove skbuff header helper backports.

These have existed largely since v2.6.22, so it's well overdue.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agocompat: Remove unused ipv[46] backports.
Joe Stringer [Tue, 3 May 2016 00:47:32 +0000 (17:47 -0700)]
compat: Remove unused ipv[46] backports.

These pieces #if on kernel versions that have not been supported since
commit f2ab1536ddbc ("compat: Backport conntrack strictly to v3.10+.").

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agocompat: Document nf_defrag_ipv[46] backport.
Joe Stringer [Mon, 2 May 2016 18:19:18 +0000 (11:19 -0700)]
compat: Document nf_defrag_ipv[46] backport.

Document how the IP(6) defrag backport works, and do minor style cleanups.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Fix template leak in error cases.
Joe Stringer [Mon, 2 May 2016 18:19:17 +0000 (11:19 -0700)]
datapath: Fix template leak in error cases.

Upstream commit:
    openvswitch: Fix template leak in error cases.

    Commit 2f3ab9f9fc23 ("openvswitch: Fix helper reference leak") fixed a
    reference leak on helper objects, but inadvertently introduced a leak on
    the ct template.

    Previously, ct_info.ct->general.use was initialized to 0 by
    nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
    returned successfully. If an error occurred while adding the helper or
    adding the action to the actions buffer, the __ovs_ct_free_action()
    cleanup would use nf_ct_put() to free the entry; however, this relies on
    atomic_dec_and_test(ct_info.ct->general.use). This reference must be
    incremented first, or nf_ct_put() will never free it.

    Fix the issue by acquiring a reference to the template immediately after
    allocation.

    Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
    Fixes: 2f3ab9f9fc23 ("openvswitch: Fix helper reference leak")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 90c7afc96cbb ("openvswitch: Fix template leak in error cases.")
Fixes: 11251c170d92 ("datapath: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
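
The leak described in this commit can be reproduced in a small userspace
model. The sketch below is not the kernel code: struct tmpl, tmpl_alloc(),
tmpl_get(), tmpl_put() and error_path_frees() are hypothetical names, and
C11 atomics stand in for the kernel's atomic_dec_and_test(). It only shows
why a "put" on an object whose count is still 0 never frees it, and why
taking a reference right after allocation fixes the error path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted object such as the ct template. */
struct tmpl {
    atomic_int use;   /* Reference count; a fresh object starts at 0. */
    bool *freed;      /* Set to true when the object is released. */
};

/* Hypothetical allocator: like the template allocation described in the
 * commit message, it returns an object whose count is still 0. */
static struct tmpl *
tmpl_alloc(bool *freed)
{
    struct tmpl *t = malloc(sizeof *t);

    if (t) {
        atomic_init(&t->use, 0);
        t->freed = freed;
        *freed = false;
    }
    return t;
}

static void
tmpl_get(struct tmpl *t)
{
    atomic_fetch_add(&t->use, 1);
}

/* Releases the object only when the count drops from 1 to 0, which is
 * the "decrement and test for zero" behavior the put path relies on. */
static void
tmpl_put(struct tmpl *t)
{
    if (atomic_fetch_sub(&t->use, 1) == 1) {
        *t->freed = true;
        free(t);
    }
}

/* Runs a failing setup path, with or without taking a reference right
 * after allocation.  Returns true if the cleanup released the object. */
static bool
error_path_frees(bool take_ref_after_alloc)
{
    bool freed = false;
    struct tmpl *t = tmpl_alloc(&freed);

    if (!t) {
        return false;
    }
    if (take_ref_after_alloc) {
        tmpl_get(t);
    }
    /* ... a later setup step fails here, so cleanup runs ... */
    tmpl_put(t);
    if (!freed) {
        free(t);   /* Keep the demo itself leak-free. */
    }
    return freed;
}
```

In this model error_path_frees(false) corresponds to the behavior before
the fix (the count goes from 0 to -1 and nothing is released), and
error_path_frees(true) corresponds to acquiring the reference immediately
after allocation.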
8 years ago  datapath: Orphan skbs before IPv6 defrag
Joe Stringer [Mon, 2 May 2016 18:19:16 +0000 (11:19 -0700)]
datapath: Orphan skbs before IPv6 defrag

Upstream commit:
    openvswitch: Orphan skbs before IPv6 defrag

    This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always
    orphan skbs inside ip_defrag()").

    Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
    clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
    cloned (implicitly orphaning) prior to queueing for reassembly. As such,
    when the IPv6 message is eventually reassembled, the skb->sk for all
    fragments would be NULL. After that commit was introduced, rather than
    cloning, the original skbs were queued directly without orphaning. The
    end result is that all frags except for the first and last may have a
    socket attached.

    This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
    prevent BUG_ON(skb->sk) during a later call to ip6_fragment().

    kernel BUG at net/ipv6/ip6_output.c:631!
    [...]
    Call Trace:
     <IRQ>
     [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
     [<ffffffffa042c7c0>] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
     [<ffffffff810bb8a2>] ? __lock_is_held+0x52/0x70
     [<ffffffffa042c587>] ovs_fragment+0x1f7/0x280 [openvswitch]
     [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
     [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
     [<ffffffff81697ea0>] ? dst_discard_out+0x20/0x20
     [<ffffffff81697e80>] ? dst_ifdown+0x80/0x80
     [<ffffffffa042c703>] do_output.isra.28+0xf3/0x1b0 [openvswitch]
     [<ffffffffa042d279>] do_execute_actions+0x709/0x12c0 [openvswitch]
     [<ffffffffa04340a4>] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
     [<ffffffffa04340d1>] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
     [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
     [<ffffffffa042de75>] ovs_execute_actions+0x45/0x120 [openvswitch]
     [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
     [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
     [<ffffffffa042def4>] ovs_execute_actions+0xc4/0x120 [openvswitch]
     [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
     [<ffffffffa04337f2>] ? key_extract+0x442/0xc10 [openvswitch]
     [<ffffffffa043b26d>] ovs_vport_receive+0x5d/0xb0 [openvswitch]
     [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
     [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
     [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
     [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
     [<ffffffffa043c11d>] internal_dev_xmit+0x6d/0x150 [openvswitch]
     [<ffffffffa043c0b5>] ? internal_dev_xmit+0x5/0x150 [openvswitch]
     [<ffffffff8168fb5f>] dev_hard_start_xmit+0x2df/0x660
     [<ffffffff8168f5ea>] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
     [<ffffffff81690925>] __dev_queue_xmit+0x8f5/0x950
     [<ffffffff81690080>] ? __dev_queue_xmit+0x50/0x950
     [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
     [<ffffffff81690990>] dev_queue_xmit+0x10/0x20
     [<ffffffff8169a418>] neigh_resolve_output+0x178/0x220
     [<ffffffff81752759>] ? ip6_finish_output2+0x219/0x7b0
     [<ffffffff81752759>] ip6_finish_output2+0x219/0x7b0
     [<ffffffff817525a5>] ? ip6_finish_output2+0x65/0x7b0
     [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
     [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
     [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
     [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
     [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
     [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
     [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
     [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
     [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
     [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
     [<ffffffff817796cc>] icmpv6_push_pending_frames+0xac/0xe0
     [<ffffffff8177a4be>] icmpv6_echo_reply+0x42e/0x500
     [<ffffffff8177acbf>] icmpv6_rcv+0x4cf/0x580
     [<ffffffff81755ac7>] ip6_input_finish+0x1a7/0x690
     [<ffffffff81755925>] ? ip6_input_finish+0x5/0x690
     [<ffffffff817567a0>] ip6_input+0x30/0xa0
     [<ffffffff81755920>] ? ip6_rcv_finish+0x1a0/0x1a0
     [<ffffffff817557ce>] ip6_rcv_finish+0x4e/0x1a0
     [<ffffffff8175640f>] ipv6_rcv+0x45f/0x7c0
     [<ffffffff81755fe6>] ? ipv6_rcv+0x36/0x7c0
     [<ffffffff81755780>] ? ip6_make_skb+0x1c0/0x1c0
     [<ffffffff8168b649>] __netif_receive_skb_core+0x229/0xb80
     [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
     [<ffffffff8168c07f>] ? process_backlog+0x6f/0x230
     [<ffffffff8168bfb6>] __netif_receive_skb+0x16/0x70
     [<ffffffff8168c088>] process_backlog+0x78/0x230
     [<ffffffff8168c0ed>] ? process_backlog+0xdd/0x230
     [<ffffffff8168db43>] net_rx_action+0x203/0x480
     [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
     [<ffffffff817c156e>] __do_softirq+0xde/0x49f
     [<ffffffff81752768>] ? ip6_finish_output2+0x228/0x7b0
     [<ffffffff817c070c>] do_softirq_own_stack+0x1c/0x30
     <EOI>
     [<ffffffff8106f88b>] do_softirq.part.18+0x3b/0x40
     [<ffffffff8106f946>] __local_bh_enable_ip+0xb6/0xc0
     [<ffffffff81752791>] ip6_finish_output2+0x251/0x7b0
     [<ffffffff81754af1>] ? ip6_fragment+0xba1/0xc50
     [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
     [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
     [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
     [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
     [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
     [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
     [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
     [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
     [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
     [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
     [<ffffffff81778558>] rawv6_sendmsg+0xa28/0xe30
     [<ffffffff81719097>] ? inet_sendmsg+0xc7/0x1d0
     [<ffffffff817190d6>] inet_sendmsg+0x106/0x1d0
     [<ffffffff81718fd5>] ? inet_sendmsg+0x5/0x1d0
     [<ffffffff8166d078>] sock_sendmsg+0x38/0x50
     [<ffffffff8166d4d6>] SYSC_sendto+0xf6/0x170
     [<ffffffff8100201b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
     [<ffffffff8166e38e>] SyS_sendto+0xe/0x10
     [<ffffffff817bebe5>] entry_SYSCALL_64_fastpath+0x18/0xa8
    Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8#
    RIP  [<ffffffff8175468a>] ip6_fragment+0x73a/0xc50
     RSP <ffff880072803120>

    Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone
    operations")
Reported-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 49e261a8a21e ("openvswitch: Orphan skbs before IPv6 defrag")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
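
The effect of orphaning before queueing can be illustrated with a toy
model. Everything below is a hypothetical userspace sketch, loosely
modeled on what the kernel's skb_orphan() does (run the destructor, then
clear the owner); the toy_* types and functions are assumptions made for
illustration and are not the datapath code. It shows that fragments
queued without orphaning keep their socket pointer, while orphaned ones
reach reassembly with no owner attached.

```c
#include <assert.h>
#include <stddef.h>

/* Toy socket: only tracks how many buffers still reference it. */
struct toy_sock {
    int owned_bufs;
};

/* Toy packet buffer with an owner and a destructor, as in the text. */
struct toy_skb {
    struct toy_sock *sk;
    void (*destructor)(struct toy_skb *);
    struct toy_skb *next;
};

static void
toy_sock_release(struct toy_skb *skb)
{
    skb->sk->owned_bufs--;
}

static void
toy_skb_set_owner(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    skb->destructor = toy_sock_release;
    sk->owned_bufs++;
}

/* Detach the buffer from its owner: run the destructor, clear both. */
static void
toy_skb_orphan(struct toy_skb *skb)
{
    if (skb->destructor) {
        skb->destructor(skb);
        skb->destructor = NULL;
    }
    skb->sk = NULL;
}

/* Toy reassembly queue holding fragments in arrival order. */
struct toy_frag_queue {
    struct toy_skb *head;
    struct toy_skb *tail;
};

static void
toy_frag_queue_add(struct toy_frag_queue *q, struct toy_skb *skb,
                   int orphan)
{
    if (orphan) {
        toy_skb_orphan(skb);    /* The behavior the fix introduces. */
    }
    skb->next = NULL;
    if (q->tail) {
        q->tail->next = skb;
    } else {
        q->head = skb;
    }
    q->tail = skb;
}

/* Queues three owned fragments and returns how many still have a
 * socket attached once they sit in the reassembly queue. */
static int
toy_demo(int orphan)
{
    struct toy_sock sk = { 0 };
    struct toy_skb frags[3] = { { NULL, NULL, NULL } };
    struct toy_frag_queue q = { NULL, NULL };
    int owned = 0;

    for (size_t i = 0; i < 3; i++) {
        toy_skb_set_owner(&frags[i], &sk);
        toy_frag_queue_add(&q, &frags[i], orphan);
    }
    for (struct toy_skb *skb = q.head; skb; skb = skb->next) {
        owned += skb->sk != NULL;
    }
    return owned;
}
```

Here toy_demo(0) models queueing the original skbs directly, and
toy_demo(1) models orphaning each fragment first, so that a later check
equivalent to BUG_ON(skb->sk) would not trigger.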