cascardo/ovs.git
8 years agoSet release dates for 2.5.0. v2.5.0
Justin Pettit [Wed, 24 Feb 2016 12:25:49 +0000 (04:25 -0800)]
Set release dates for 2.5.0.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agorhel: provide our own SELinux custom policy package
Ansis Atteka [Tue, 19 Jan 2016 17:59:12 +0000 (09:59 -0800)]
rhel: provide our own SELinux custom policy package

CentOS, RHEL and Fedora distributions ship with their own Open vSwitch
SELinux policy that is too strict and prevents Open vSwitch from working
normally out of the box.

As a solution, this patch introduces a new package which will "loosen"
up "openvswitch_t" SELinux domain so that Open vSwitch could operate
normally.

Intended use-cases of this package are:
1. to allow users to install newer Open vSwitch on already released Fedora,
RHEL and CentOS distributions where the default Open vSwitch SELinux policy
that shipped with the corresponding Linux distribution is not up to date
and did not anticipate that a newer Open vSwitch version might need to
invoke new system calls or need to access certain system resources that
it did not before; and
2. to provide alternative means through which Open vSwitch developers
can proactively fix SELinux related policy issues without waiting for
corresponding Linux distribution maintainers to update their central
Open vSwitch SELinux policy.

This patch was tested on Fedora 23 and CentOS 7. I verified that Open vSwitch
can now create a Netlink socket on Fedora 23, and that the following error
messages no longer appear:

vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
ovs_numa|INFO|Discovered 2 CPU cores on NUMA node 0
ovs_numa|INFO|Discovered 1 NUMA nodes and 2 CPU cores
reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
netlink_socket|ERR|fcntl: Permission denied
dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist.
                 The Open vSwitch kernel module is probably not loaded.
dpif|WARN|failed to enumerate system datapaths: Permission denied
dpif|WARN|failed to create datapath ovs-system: Permission denied

I did not test all Open vSwitch features so there still could be some
OVS configuration that would get "Permission denied" errors.

Since Open vSwitch daemons on Ubuntu 15.10 run under the "unconfined"
SELinux domain by default, there is no need to create a similar Debian
package for Ubuntu: Open vSwitch already works on a default Ubuntu installation.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Acked-by: Flavio Leitner <fbl@sysclose.com>
8 years agoovsdb: avoid unnecessary call to ovsdb_monitor_get_update()
Andy Zhou [Mon, 22 Feb 2016 08:35:28 +0000 (00:35 -0800)]
ovsdb: avoid unnecessary call to ovsdb_monitor_get_update()

Optimize ovsdb_jsonrpc_monitor_flush_all() by not calling
ovsdb_monitor_get_update() on monitors that do not have any
unflushed updates.  This change saves CPU cycles in ovsdb-server's
main loop, but should not introduce any client-visible changes.

Reported-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Liran Schour <lirans@il.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovsdb: rename variables in ovsdb_monitor_get_update()
Andy Zhou [Mon, 22 Feb 2016 08:31:03 +0000 (00:31 -0800)]
ovsdb: rename variables in ovsdb_monitor_get_update()

'prev_txn' and 'next_txn' are more confusing than 'unflushed' and
'unflushed_next'. Rename them.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Liran Schour <lirans@il.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovsdb: Fix one off error in tracking monitor changes
Andy Zhou [Mon, 22 Feb 2016 08:24:06 +0000 (00:24 -0800)]
ovsdb: Fix one off error in tracking monitor changes

dbmon's changes should be stored with the next transaction number,
rather than the current transaction number.  This bug causes the
changes of a transaction stored in a monitor to go unnoticed by
the jsonrpc connection that is responsible for flushing the monitor
content.

However, the bug was not noticed until it was exposed by a later
optimization patch: "avoid unnecessary call to ovsdb_monitor_get_update()."
The lack of optimization means that the update is still generated
when 'unflushed' equals n_transactions + 1, which should have
indicated the monitor has been flushed already.
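
The relationship between 'unflushed' and the transaction counter can be
sketched as follows.  This is a minimal illustration, not the ovsdb-server
code: struct monitor_sketch and both helpers are hypothetical names, and the
monitor is reduced to a bare counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified monitor: only the transaction counter. */
struct monitor_sketch {
    uint64_t n_transactions;    /* Transactions committed so far. */
};

/* Commits one transaction and returns the number under which its changes
 * are stored: the *next* transaction number, which is also the value of
 * 'unflushed' held by a fully flushed connection. */
static uint64_t
monitor_sketch_commit(struct monitor_sketch *m)
{
    uint64_t key = m->n_transactions + 1;

    m->n_transactions++;
    return key;
}

/* A connection whose 'unflushed' equals n_transactions + 1 has already
 * sent everything, so no update needs to be generated for it. */
static bool
monitor_sketch_needs_update(const struct monitor_sketch *m,
                            uint64_t unflushed)
{
    return unflushed <= m->n_transactions;
}
```

Storing the changes under the current number instead would leave them one
below the 'unflushed' value that a flushed connection is waiting on.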

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Liran Schour <lirans@il.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoINSTALL.DPDK: Add notes regarding vhost multiq configuration.
Ian Stokes [Wed, 24 Feb 2016 17:30:57 +0000 (17:30 +0000)]
INSTALL.DPDK: Add notes regarding vhost multiq configuration.

Linux kernel network devices in a guest should have the number of
multi-purpose channels configured when used with DPDK multiqueue on the host.
This commit adds an example of how this can be done. Also add QEMU 2.5
requirements for multiqueue with DPDK in NEWS.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agonetdev-dpdk: vhost-user: Fix sending packets to queues not enabled by guest.
Ilya Maximets [Wed, 24 Feb 2016 14:14:43 +0000 (17:14 +0300)]
netdev-dpdk: vhost-user: Fix sending packets to queues not enabled by guest.

Currently the virtio driver in the guest operating system has to be configured
to use exactly the same number of queues. If the guest uses fewer queues,
some packets will get stuck in queues unused by the guest and will not be
received.

Fix that by using the new 'vring_state_changed' callback, which is
available for vhost-user since DPDK 2.2.
The implementation uses an additional mapping from configured TX queues to
those enabled by the virtio driver. This requires mandatory locking of TX queues
in __netdev_dpdk_vhost_send(), but this locking was almost always done anyway
because set_multiq is called with n_txq = 'ovs_numa_get_n_cores() + 1'.
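
One way to picture such a mapping is sketched below.  It is only an
illustration under simple assumptions: remap_txqs_sketch() and its
round-robin policy are hypothetical and are not taken from netdev-dpdk.

```c
#include <stdbool.h>

/* Fills 'map' so that map[i] names the queue actually used for configured
 * TX queue 'i': the queue itself if the guest enabled it, otherwise one of
 * the enabled queues chosen round-robin, or -1 if no queue is enabled. */
static void
remap_txqs_sketch(const bool *enabled, int n_txq, int *map)
{
    int n_enabled = 0;
    int k = 0;
    int i;

    for (i = 0; i < n_txq; i++) {
        if (enabled[i]) {
            n_enabled++;
        }
    }

    for (i = 0; i < n_txq; i++) {
        if (enabled[i]) {
            map[i] = i;
        } else if (!n_enabled) {
            map[i] = -1;
        } else {
            /* Advance to the next enabled queue; one exists. */
            while (!enabled[k]) {
                k = (k + 1) % n_txq;
            }
            map[i] = k;
            k = (k + 1) % n_txq;
        }
    }
}
```

Because several configured queues can now share one enabled queue, sends to
that queue have to be serialized, which is why the locking becomes mandatory.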

OVS_VHOST_MAX_QUEUE_NUM = 1024 chosen based on the fact that this is
the maximum number of queues supported by QEMU.

Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agonetdev-dpdk: Do not add vhost-user ports with '/' or '\' in name.
Daniele Di Proietto [Wed, 3 Feb 2016 01:24:32 +0000 (17:24 -0800)]
netdev-dpdk: Do not add vhost-user ports with '/' or '\' in name.

This check prevents an obvious way for a vhost-user socket to escape the
intended directory.

There might be other ways to escape the directory (none comes to mind at
the moment), but this is a problem that should be properly solved by
mandatory access control.

A similar check is done for a bridge name, since that name is used as
part of a socket as well.
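
A check of this kind might look like the sketch below.  The function name is
hypothetical, and the real check in netdev-dpdk may be written differently.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if 'name' contains neither '/' nor '\', so that it cannot
 * introduce extra path components when appended to a socket directory. */
static bool
name_is_path_safe_sketch(const char *name)
{
    return strpbrk(name, "/\\") == NULL;
}
```

For example, "vhost-user-1" passes, while "../escape" and "a\b" are rejected.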

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
8 years agoovs-ofctl.8: Clarify conntrack documentation.
Joe Stringer [Tue, 23 Feb 2016 21:26:29 +0000 (13:26 -0800)]
ovs-ofctl.8: Clarify conntrack documentation.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agolib: Fix netbsd compilation error.
Lance Richardson [Mon, 15 Feb 2016 15:08:51 +0000 (10:08 -0500)]
lib: Fix netbsd compilation error.

NetBSD requires <netinet/in.h> to be included before <netinet/ip6.h>.
Without this fix we have:

In file included from lib/netdev-vport.c:25:0:
/usr/include/netinet/ip6.h:82:18: error: field 'ip6_src' has incomplete type
/usr/include/netinet/ip6.h:83:18: error: field 'ip6_dst' has incomplete type

Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto-dpif-xlate: Fix crash when using multicast snooping.
Thadeu Lima de Souza Cascardo [Wed, 17 Feb 2016 14:43:56 +0000 (12:43 -0200)]
ofproto-dpif-xlate: Fix crash when using multicast snooping.

The revalidator thread may set may_learn and call xlate_actions with no packet
data. If the revalidated flow is IGMPv3 or MLD, vswitchd will crash when trying
to access the NULL packet.

Only process IGMP and MLD flows when there is a packet. This is similar to
the behavior we have for other special packets.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Reported-by: Yi Ba <yby.developer@yahoo.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-January/020023.html
Fixes: 06994f879c9d ("mcast-snooping: Add Multicast Listener Discovery support")
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agotests: Fix bug in testsuite introduced in backport.
Ben Pfaff [Tue, 23 Feb 2016 00:17:28 +0000 (16:17 -0800)]
tests: Fix bug in testsuite introduced in backport.

Found by travis.

Reported-by: Joe Stringer <joe@ovn.org>
Fixes: 8a133bb5cb (ofproto-dpif-xlate: Don't consider mirrors used when excluded by VLAN.)
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agotypes: Fix defined but not used warning.
William Tu [Fri, 19 Feb 2016 21:35:55 +0000 (13:35 -0800)]
types: Fix defined but not used warning.

warning: ‘OVS_BE128_MAX’ defined but not used [-Wunused-const-variable]
Found using CentOS 6.6 with gcc 6.0.0.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoINSTALL.DPDK.md: Correct mergeable buffers parameter.
Ian Stokes [Wed, 10 Feb 2016 10:50:54 +0000 (10:50 +0000)]
INSTALL.DPDK.md: Correct mergeable buffers parameter.

Update the mergeable buffers parameter in performance tuning
to the correct parameter, mrg_rxbuf.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoINSTALL.DPDK: Update details of XL710 restrictions for DPDK 2.2.
Ian Stokes [Tue, 9 Feb 2016 14:48:47 +0000 (14:48 +0000)]
INSTALL.DPDK: Update details of XL710 restrictions for DPDK 2.2.

DPDK 2.2 removes restrictions related to maximum number of TX
queues for XL710 devices. Update documentation to reflect these
changes.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoofproto-dpif-xlate: Don't consider mirrors used when excluded by VLAN.
Ben Pfaff [Sat, 6 Feb 2016 03:16:01 +0000 (19:16 -0800)]
ofproto-dpif-xlate: Don't consider mirrors used when excluded by VLAN.

Mirrors can be configured to select packets for mirroring on the basis
of multiple criteria: input ports, output ports, and VLANs.  A packet P
is to be mirrored if there exists a mirror M such that either:

    - P ingresses on an input port selected by M, or

    - P egresses on an output port selected by M

AND P is in a VLAN selected by M.

In addition, every mirror has a destination, which can be an output port
or an output VLAN.  Either way, if a packet is mirrored to a particular
destination, it is done only once, even if different mirrors both select
a packet and have the same destination.

Since commit 7efbc3b7c4006c (ofproto-dpif-xlate: Rewrite mirroring to better
fit flow translation.), these requirements have been implemented
incorrectly: if a packet satisfied one of the bulleted requirements
above for mirror M1, but not the VLAN selection requirement for M1,
then it was not sent to M1's destination, but it was still considered
as having been sent to M1's destination for the purpose of avoiding output
duplication.  Thus, if P satisfied *all* of the requirements for a
second mirror M2, and M1 and M2 had the same destination, the packet was
still not mirrored.  This commit fixes that problem.

(The issue only occurred if M1 happened to have a smaller index than
M2 in OVS's internal data structures.  That's just a matter of luck.)
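
The intended selection rule can be sketched as follows.  This is a simplified
model with hypothetical names, not the ofproto-dpif-xlate code: ports and
destinations are small indexes (0..31) kept in bitmaps, and each mirror
selects either one VLAN or all of them.

```c
#include <stdbool.h>
#include <stdint.h>

struct mirror_sketch {
    uint32_t port_mask;     /* Bit i set: port i is selected by this mirror. */
    int vlan;               /* Selected VLAN, or -1 to select every VLAN. */
    unsigned int dest;      /* Destination index, 0..31. */
};

/* Returns a bitmap of the destinations that receive one copy of a packet
 * seen on 'port' in 'vlan'.  A destination counts as used only when a
 * mirror selects the packet by port AND by VLAN, so a mirror that fails
 * the VLAN test cannot suppress output for a later mirror that shares its
 * destination. */
static uint32_t
mirror_destinations_sketch(const struct mirror_sketch *mirrors, int n,
                           unsigned int port, int vlan)
{
    uint32_t dests = 0;
    int i;

    for (i = 0; i < n; i++) {
        const struct mirror_sketch *m = &mirrors[i];
        bool port_ok = (m->port_mask >> port) & 1;
        bool vlan_ok = m->vlan < 0 || m->vlan == vlan;

        if (port_ok && vlan_ok) {
            /* OR-ing into a bitmap yields at most one copy per destination. */
            dests |= UINT32_C(1) << m->dest;
        }
    }
    return dests;
}
```

With M1 = {port 2, VLAN 10, dest 3} and M2 = {port 2, VLAN 20, dest 3}, a
packet on port 2 in VLAN 20 is mirrored to destination 3 exactly once.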

Reported-by: Huanle Han <hanxueluo@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-January/064531.html
Fixes: 7efbc3b7c4006c (ofproto-dpif-xlate: Rewrite mirroring to better fit flow translation.)
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agodatapath: lisp: Relax MTU constraints.
Joe Stringer [Sat, 13 Feb 2016 12:47:13 +0000 (04:47 -0800)]
datapath: lisp: Relax MTU constraints.

Currently, even if the entire path supports jumbo frames, the LISP netdev
limits the path MTU to 1500 bytes, and cannot be configured otherwise.
Relax the constraints on modifying the device MTU, and set it to the
maximum by default.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: stt: Relax MTU constraints.
Joe Stringer [Sat, 13 Feb 2016 12:32:36 +0000 (04:32 -0800)]
datapath: stt: Relax MTU constraints.

Currently, even if the entire path supports jumbo frames, the STT netdev
limits the path MTU to 1500 bytes, and cannot be configured otherwise.
Relax the constraints on modifying the device MTU, and set it to the
maximum by default.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: geneve: Refine MTU limit.
David Wragg [Thu, 18 Feb 2016 17:43:29 +0000 (17:43 +0000)]
datapath: geneve: Refine MTU limit.

Upstream commit:
    Calculate the maximum MTU taking into account the size of headers
    involved in GENEVE encapsulation, as for other tunnel types.

    Changes in v3:
    - Correct comment style
    Changes in v2:
    - Conform more closely to ip_tunnel_change_mtu
    - Exclude GENEVE options from max MTU calculation

Signed-off-by: David Wragg <david@weave.works>
Acked-by: Jesse Gross <jesse@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: aeee0e66c6b4 ("geneve: Refine MTU limit")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Set a large MTU on tunnel devices.
David Wragg [Wed, 10 Feb 2016 00:05:58 +0000 (00:05 +0000)]
datapath: Set a large MTU on tunnel devices.

Upstream commit:
    Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
    transmit vxlan packets of any size, constrained only by the ability to
    send out the resulting packets.  4.3 introduced netdevs corresponding
    to tunnel vports.  These netdevs have an MTU, which limits the size of
    a packet that can be successfully encapsulated.  The default MTU
    values are low (1500 or less), which is awkwardly small in the context
    of physical networks supporting jumbo frames, and leads to a
    conspicuous change in behaviour for userspace.

    Instead, set the MTU on openvswitch-created netdevs to be the relevant
    maximum (i.e. the maximum IP packet size minus any relevant overhead),
    effectively restoring the behaviour prior to 4.3.
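
As a worked example of "maximum IP packet size minus overhead", consider
VXLAN over IPv4.  The sketch below is illustrative arithmetic only: the
constant names are hypothetical, the header sizes assume no IPv4 options,
and the kernel's own calculation may account for overhead differently.

```c
enum {
    MAX_IP_PACKET_SKETCH = 65535,   /* 16-bit IPv4 total-length limit. */
    OUTER_IPV4_HDR_SKETCH = 20,     /* Outer IPv4 header, no options. */
    OUTER_UDP_HDR_SKETCH = 8,
    VXLAN_HDR_SKETCH = 8,
    INNER_ETH_HDR_SKETCH = 14       /* Encapsulated Ethernet header. */
};

/* Largest inner MTU that still fits in one outer IP packet, given the
 * total encapsulation overhead in bytes. */
static int
tunnel_max_mtu_sketch(int overhead)
{
    return MAX_IP_PACKET_SKETCH - overhead;
}
```

Under these assumptions the overhead is 20 + 8 + 8 + 14 = 50 bytes, giving a
maximum inner MTU of 65535 - 50 = 65485.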

Signed-off-by: David Wragg <david@weave.works>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 7e059158d57b ("vxlan, gre, geneve: Set a large MTU on ovs-created
tunnel devices")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: geneve: Relax MTU constraints.
David Wragg [Wed, 10 Feb 2016 00:05:57 +0000 (00:05 +0000)]
datapath: geneve: Relax MTU constraints.

Upstream commit:
    Allow the MTU of geneve devices to be set to large values, in order to
    exploit underlying networks with larger frame sizes.

    GENEVE does not have a fixed encapsulation overhead (an openvswitch
    rule can add variable length options), so there is no relevant maximum
    MTU to enforce.  A maximum of IP_MAX_MTU is used instead.
    Encapsulated packets that are too big for the underlying network will
    get dropped on the floor.

Signed-off-by: David Wragg <david@weave.works>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 55e5bfb53cff ("geneve: Relax MTU constraints")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: vxlan: Relax MTU constraints.
David Wragg [Wed, 10 Feb 2016 00:05:55 +0000 (00:05 +0000)]
datapath: vxlan: Relax MTU constraints.

Upstream commit:
    Allow the MTU of vxlan devices without an underlying device to be set
    to larger values (up to a maximum based on IP packet limits and vxlan
    overhead).

    Previously, their MTUs could not be set to higher than the
    conventional ethernet value of 1500.  This is a very arbitrary value
    in the context of vxlan, and prevented vxlan devices from being able
    to take advantage of jumbo frames etc.

    The default MTU remains 1500, for compatibility.

Signed-off-by: David Wragg <david@weave.works>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 72564b59ffc4 ("vxlan: Relax MTU constraints")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agotunneling: Disable IPv6 tunnel
Pravin B Shelar [Thu, 18 Feb 2016 02:36:01 +0000 (18:36 -0800)]
tunneling: Disable IPv6 tunnel

There are multiple issues in IPv6 userspace tunnel
implementation. Even the kernel module that ships with
2.5 does not support IPv6 tunneling. There is not
enough time to get all fixes in branch-2.5. So it makes
sense to disable the support on 2.5.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agonetdev-linux: Fix warning message.
Thadeu Lima de Souza Cascardo [Mon, 15 Feb 2016 17:14:21 +0000 (15:14 -0200)]
netdev-linux: Fix warning message.

Instead of reading

"error receiving Ethernet packet on Permission denied: ens3",

it should read

"error receiving Ethernet packet on ens3: Permission denied".

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetlink-socket: return correct error code when connect fails
Thadeu Lima de Souza Cascardo [Mon, 15 Feb 2016 17:13:30 +0000 (15:13 -0200)]
netlink-socket: return correct error code when connect fails

When connect and other calls fail after get_socket_rcvbuf, the return code would
be the rcvbuf size, not errno from the last call.
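
The pattern behind this bug can be sketched in isolation.  The function below
is hypothetical and simulates the two calls with plain parameters; it is not
the netlink-socket code.

```c
#include <errno.h>

/* 'rcvbuf_result' stands in for the result of querying the receive buffer:
 * a size on success or a negative errno value on failure.  'connect_errno'
 * stands in for a later call: 0 on success, otherwise its errno.
 * Returns 0 on success or a positive errno value. */
static int
sock_setup_sketch(int rcvbuf_result, int connect_errno)
{
    int retval = rcvbuf_result;

    if (retval < 0) {
        return -retval;
    }

    /* From here on 'retval' holds a buffer size, not an error code, so a
     * later failure must not return it unchanged. */
    if (connect_errno) {
        retval = connect_errno;     /* Error of the call that failed. */
        return retval;
    }
    return 0;
}
```

Without the reassignment, a failed connect would report the buffer size to
the caller as if it were an errno value.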

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
8 years agonetdev-dpdk: Fix dpdk_watchdog failure to quiesce.
Kevin Traynor [Fri, 5 Feb 2016 17:07:16 +0000 (17:07 +0000)]
netdev-dpdk: Fix dpdk_watchdog failure to quiesce.

Fix issue whereby vhost_thread is waiting for dpdk_watchdog
thread to quiesce and at the same time dpdk_watchdog thread
is waiting for vhost_thread to give up dpdk_mutex.

Reported-by: Patrik Andersson R <patrik.r.andersson@ericsson.com>
Signed-off-by: Patrik Andersson R <patrik.r.andersson@ericsson.com>
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agorhel: Add '--with dpdk' spec option to build DPDK-enabled packages
Panu Matilainen [Thu, 28 Jan 2016 12:23:52 +0000 (14:23 +0200)]
rhel: Add '--with dpdk' spec option to build DPDK-enabled packages

Requires DPDK >= 2.2 as that is the first version to have a standard
install layout which we can discover without help from user.
Additionally document the option in INSTALL.Fedora.md.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovsdb: Fix typo in libovsdb's pkg-config.
Ansari, Shad [Fri, 22 Jan 2016 20:06:28 +0000 (20:06 +0000)]
ovsdb: Fix typo in libovsdb's pkg-config.

Fix typo in the library name of pkg-config of libovsdb.

Reported-by: Javier Albornz <javier.albornoz@hpe.com>
Signed-off-by: Shad Ansari <shad.ansari@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-benchmark: Fix return value of do_poll.
William Tu [Thu, 21 Jan 2016 18:16:23 +0000 (10:16 -0800)]
ovs-benchmark: Fix return value of do_poll.

A positive number is returned when do_poll succeeds.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto-dpif-mirror: Fix bug that flag "need_revalidate" is never reset.
Huanle Han [Fri, 5 Feb 2016 23:43:25 +0000 (15:43 -0800)]
ofproto-dpif-mirror: Fix bug that flag "need_revalidate" is never reset.

Flag "need_revalidate" on mbridge is set to true when an ofbundle is
destroyed, and it is never reset. This causes the backer to revalidate and
the MAC learning table to be flushed every time 'ofproto_run' is called.

Signed-off-by: Huanle Han <hanxueluo@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-vswitchd: Preserve datapath ports across graceful shutdown.
Ben Pfaff [Thu, 4 Feb 2016 17:48:54 +0000 (09:48 -0800)]
ovs-vswitchd: Preserve datapath ports across graceful shutdown.

Until now, asking ovs-vswitchd to shut down gracefully, e.g. with
"ovs-appctl exit", would cause it to first remove all the ports from
kernel-based datapaths.  This has the unfortunate side effect that IP
addresses on any removed "internal" ports are lost, even if the ports are
added again when ovs-vswitchd is restarted.  This is long-standing
behavior, but it only became important when the OVS control scripts were
changed to try to do graceful shutdown first instead of using a signal.

This commit changes graceful shutdown so that it leaves ports in the
datapath, fixing the problem.

Fixes: 9b5422a98f8 (ovs-lib: Try to call exit before killing.)
Reported-by: Edgar Cantu <eocantu@us.ibm.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-January/020024.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Gurucharan Shetty <guru@ovn.org>
8 years agotravis: Update kernel matrix.
Joe Stringer [Thu, 24 Dec 2015 19:46:42 +0000 (11:46 -0800)]
travis: Update kernel matrix.

Remove v4.2 as it is EOL; Add v4.3 as we support this version in
OVS-2.5. Update other versions to the latest listed on kernel.org.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agodatapath: Re-designate OVS_FRAGMENT_BACKPORT.
Joe Stringer [Thu, 24 Dec 2015 21:09:38 +0000 (13:09 -0800)]
datapath: Re-designate OVS_FRAGMENT_BACKPORT.

Typically the way that we include backported code is by testing for
existence of the feature in the upstream codebase via header checks,
then attempt to use the upstream code as much as possible. However, for
the IP fragmentation handling backport we have an additional constraint
which is that we cannot support kernels older than Linux-3.10.

To date, OVS_FRAGMENT_BACKPORT has been defined to include the backport
of the IP fragmentation code for all kernels from 3.10 to 4.2, rather
than attempting to use the upstream code as much as possible. This patch
relaxes OVS_FRAGMENT_BACKPORT to only check the lower bound so that the
upstream code may be used in more circumstances.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use upstream ip_fragment().
Joe Stringer [Tue, 2 Feb 2016 23:19:02 +0000 (15:19 -0800)]
compat: Detect and use upstream ip_fragment().

Previously a version check was used to determine whether the upstream
ip_fragment() should be used or the backported version. The actual test
is for whether upstream commit d6b915e29f4a ("ip_fragment: don't forward
defragmented DF packet") is present, so test for that instead.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use inet_frag_queue->list_evictor.
Joe Stringer [Thu, 24 Dec 2015 18:41:35 +0000 (10:41 -0800)]
compat: Detect and use inet_frag_queue->list_evictor.

Kernels 3.17 to 4.2 have a work queue to evict old fragments, but do not
track these fragments in an eviction list. On these kernels, we detect
the absence of the list_evictor and provide one. This commit fixes the
reliance on kernel versions in the case that this functionality is
backported.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Rename OVS frag caches.
Joe Stringer [Tue, 2 Feb 2016 23:19:00 +0000 (15:19 -0800)]
compat: Rename OVS frag caches.

These should not have the same name as the upstream ones, to reduce
confusion when they are created. Rename them.

Suggested-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agodatapath: Fix kernel-4.3 build.
Joe Stringer [Tue, 2 Feb 2016 23:18:59 +0000 (15:18 -0800)]
datapath: Fix kernel-4.3 build.

Commit 792e5ed750ce ("datapath: inet: frag: Always orphan skbs inside
ip_defrag().") broke the build for OVS backport against kernel-4.3. Fix
the build.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agobridge: Do not add bridges with '/' in name.
Daniele Di Proietto [Tue, 2 Feb 2016 21:28:11 +0000 (13:28 -0800)]
bridge: Do not add bridges with '/' in name.

This effectively stops vswitchd from creating bridges with '/' in the
name. OVS used to print a warning but the bridge was created anyway.

This restriction is implemented because the bridge name is part of a
filesystem path.

This check is no substitute for Mandatory Access Control, but it
certainly helps to catch the error early.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
[blp@ovn.org added a test]
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Detect and handle errors in ofproto_port_add().
Ben Pfaff [Wed, 3 Feb 2016 01:57:46 +0000 (17:57 -0800)]
ofproto: Detect and handle errors in ofproto_port_add().

The update_port() function called in ofproto_port_add() can encounter
errors that prevent a port from being added, but nothing was checking for
the error and in fact update_port() didn't even pass the error along to
its caller.  This commit fixes the problem.

The scenario that led me to examine this code can be triggered as follows
from the sandbox, as long as you change --enable-dummy=override to
--enable-dummy=system in ovs-sandbox:

ovs-vsctl add-br br0
ovs-vsctl add-port br0 tun0 \
    -- set interface tun0 type=stt options:remote_ip=1.2.3.4
ovs-vsctl add-port br0 tun1 \
    -- set interface tun1 type=stt options:remote_ip=1.2.3.4

The second add-port will fail due to the duplicate tunnel options, but
ofproto_port_add() will not return the error.  Instead, it will report to
the caller that it succeeded and tell it that it has ofp_port OFPP_NONE
(65535), which is invalid and it obviously does not.  The result is that
you get bizarre log messages like this:

    tunnel|WARN|tun1: attempting to add tunnel port with same config as port 'tun0' (::->1.2.3.4, key=0, dp port=7471, pkt mark=0)
    ofproto|WARN|br0: could not add port tun1 (File exists)
    bridge|INFO|bridge br0: added interface tun1 on port 65535
    ofproto|WARN|br0: cannot configure bfd on nonexistent port 65535
    ofproto|WARN|br0: cannot configure LLDP on nonexistent port 65535
    ofproto|WARN|br0: cannot get STP status on nonexistent port 65535
    ofproto|WARN|br0: cannot get RSTP status on nonexistent port 65535
    ofproto|WARN|br0: cannot get STP stats on nonexistent port 65535
    ofproto|WARN|br0: cannot get STP stats on nonexistent port 65535

VMware-BZ: #1598643
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
8 years agodpif-netdev: Fix improper use of CMAP_FOR_EACH.
Daniele Di Proietto [Wed, 27 Jan 2016 02:53:52 +0000 (18:53 -0800)]
dpif-netdev: Fix improper use of CMAP_FOR_EACH.

It is ok to iterate a cmap with CMAP_FOR_EACH and remove elements with
cmap_remove(), but having quiescent states inside the loop might create
problems, since some of the postponed cleanup done inside the cmap might
be executed, freeing the memory that the iterator is using.

We had several of these errors in dpif-netdev, because when we rearrange
ports or threads we often need to wait on a condition variable (which
implies a quiescent state).

This problem caused iterations to skip elements or to list them twice,
resulting in the main thread waiting on a condition without anyone else
to signal.

Fix these cases by moving the possible quiescent states outside
CMAP_FOR_EACH loops.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agodpif-netdev: Delay packets' metadata initialization.
Daniele Di Proietto [Fri, 29 Jan 2016 01:47:51 +0000 (17:47 -0800)]
dpif-netdev: Delay packets' metadata initialization.

When a group of packets arrives from a port, we loop through them to
initialize metadata and then we loop through them again to extract the
flow and perform the exact match classification.

This commit combines the two loops into one, and initializes packet->md
in emc_processing() to improve performance.

Since emc_processing() might also be called after recirculation (in
which case the metadata is already valid), an extra parameter is added
to support both cases.

This commit also implements simple prefetching of packet metadata,
to further improve performance.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Andy Zhou <azhou@ovn.org>
Acked-by: Chandran, Sugesh <sugesh.chandran@intel.com>
8 years agocompat: Detect and use nf_ct_frag6_gather().
Joe Stringer [Fri, 8 Jan 2016 01:47:23 +0000 (17:47 -0800)]
compat: Detect and use nf_ct_frag6_gather().

This function is a likely candidate for backporting, and currently
relies on version checks to include the source or not. Grep for the
appropriate functions instead, and include the backport based on that.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use inet_getpeer_v4().
Joe Stringer [Fri, 8 Jan 2016 01:58:59 +0000 (17:58 -0800)]
compat: Detect and use inet_getpeer_v4().

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use __skb_dst_copy().
Joe Stringer [Thu, 24 Dec 2015 19:41:40 +0000 (11:41 -0800)]
compat: Detect and use __skb_dst_copy().

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use nf_connlabels_get().
Joe Stringer [Thu, 24 Dec 2015 19:34:35 +0000 (11:34 -0800)]
compat: Detect and use nf_connlabels_get().

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use nf_ipv6_ops->fragment.
Joe Stringer [Thu, 24 Dec 2015 19:32:38 +0000 (11:32 -0800)]
compat: Detect and use nf_ipv6_ops->fragment.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use struct nf_conntrack_zone.
Joe Stringer [Thu, 24 Dec 2015 19:29:34 +0000 (11:29 -0800)]
compat: Detect and use struct nf_conntrack_zone.

Rather than relying on version checks, detect the presence of this
structure and use it if available.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use inet_frags->lock.
Joe Stringer [Thu, 24 Dec 2015 19:06:18 +0000 (11:06 -0800)]
compat: Detect and use inet_frags->lock.

Prior to ab1c724f6330 ("inet: frag: use seqlock for hash rebuild")
upstream, a rwlock was used when rebuilding inet_frags. Rather than
using a version check to detect this, search for it in the header and
enable the code based on whether it exists.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use inet_frags->frags_work.
Joe Stringer [Thu, 24 Dec 2015 18:54:37 +0000 (10:54 -0800)]
compat: Detect and use inet_frags->frags_work.

Kernels 3.17 and newer have a work queue to evict old fragments, while
older kernel versions use an LRU in the fast path; see upstream commit
b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue").
This commit fixes the version checking so that rather than enabling the
code for either of these approaches using version checks, it is
triggered based on the presence of the work queue in "struct inet_frags".

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agocompat: Detect and use inet_frag_queue->last_in.
Joe Stringer [Thu, 24 Dec 2015 18:40:02 +0000 (10:40 -0800)]
compat: Detect and use inet_frag_queue->last_in.

Kernels 3.17 and older have this field, while newer kernels use the
'flags' field. Detect this in the build in case anyone backports this
change to an older kernel.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agonetdev-dpdk: Fix leak on netdev_dpdk_vhost_user_construct failure.
Ilya Maximets [Tue, 2 Feb 2016 11:02:16 +0000 (14:02 +0300)]
netdev-dpdk: Fix leak on netdev_dpdk_vhost_user_construct failure.

The memory pool for vhost-user ports was always created, even if
construction failed, and a message about successful socket creation was
still printed.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev-dpdk: Unlink vhost-user sockets on fatal signals.
Ilya Maximets [Tue, 2 Feb 2016 11:02:15 +0000 (14:02 +0300)]
netdev-dpdk: Unlink vhost-user sockets on fatal signals.

When OVS is killed, it may not call rte_vhost_driver_unregister()
for vhost-user ports. As a result, the corresponding socket will
remain in the system and opening that port after restart
will fail.

(Even after this patch this remains a problem for signals
that OVS does not or cannot catch, such as SIGSEGV and
SIGKILL.)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoodp-util: Fix formatting and parsing of 'frag' in tnl_push ipv4 argument.
Ben Pfaff [Mon, 1 Feb 2016 19:31:54 +0000 (11:31 -0800)]
odp-util: Fix formatting and parsing of 'frag' in tnl_push ipv4 argument.

ip_frag_off is an ovs_be16 so it must be converted between host and
network byte order for parsing and formatting.
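
As a rough illustration of why the conversion matters (a Python sketch, not the odp-util code; the helper names are made up for this example), a 16-bit field stored in network byte order reads back as a different number if it is interpreted as little-endian without conversion:

```python
import struct

IP_DF = 0x4000  # "don't fragment" bit of the IPv4 fragment offset field


def be16_from_host(value):
    """Serialize a host integer as a 16-bit network-order field."""
    return struct.pack("!H", value)


def host_from_be16(raw):
    """Convert a 16-bit network-order field back to a host integer."""
    return struct.unpack("!H", raw)[0]


wire = be16_from_host(IP_DF)

# Interpreting the same two bytes as little-endian, which is what a
# little-endian host sees if the conversion is skipped.
misread = struct.unpack("<H", wire)[0]

print(hex(host_from_be16(wire)), hex(misread))
```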

Reported-by: Dimitri John Ledkov <xnox@ubuntu.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-January/020072.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Dimitri John Ledkov <xnox@ubuntu.com>
8 years agoovn-northd: Don't set custom log level defaults.
Russell Bryant [Mon, 1 Feb 2016 14:58:22 +0000 (09:58 -0500)]
ovn-northd: Don't set custom log level defaults.

ovn-northd set some custom log level defaults, which I believe were
copied from ovs-vsctl.  Other daemons don't set this.  The difference in
behavior in ovn-northd vs other daemons has caused some confusion during
OpenStack+OVN development and testing, so make it consistent.

Reported-by: Ryan Moats <rmoats@us.ibm.com>
Reported-at: https://bugs.launchpad.net/bugs/1539994
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-By: Kyle Mestery <mestery@mestery.com>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoacinclude.m4: Fix dpdk build if -mssse3 not supported.
Ilya Maximets [Tue, 12 Jan 2016 11:15:39 +0000 (14:15 +0300)]
acinclude.m4: Fix dpdk build if -mssse3 not supported.

On arm/arm64:
gcc: error: unrecognized command line option '-mssse3'

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath: inet: frag: Always orphan skbs inside ip_defrag().
Joe Stringer [Fri, 29 Jan 2016 19:01:56 +0000 (11:01 -0800)]
datapath: inet: frag: Always orphan skbs inside ip_defrag().

When the linux stack is an endpoint connected to OVS which is performing
IP fragmentation via conntrack actions, it's possible to hit a kernel
BUG. The following upstream commit fixes the issue inside ip_defrag().
For the backport, we provide this inside ip_defrag() for kernels where we
currently backport that function, and also provide just the bugfix for
newer kernels, so we can continue to use upstream functionality as much
as possible.

Upstream commit:
    Later parts of the stack (including fragmentation) expect that there is
    never a socket attached to frag in a frag_list, however this invariant
    was not enforced on all defrag paths. This could lead to the
    BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
    end of this commit message.

    While the call could be added to openvswitch to fix this particular
    error, the head and tail of the frags list are already orphaned
    indirectly inside ip_defrag(), so it seems like the remaining fragments
    should all be orphaned in all circumstances.

    kernel BUG at net/ipv4/ip_output.c:586!
    [...]
    Call Trace:
     <IRQ>
     [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch]
     [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch]
     [<ffffffff81667830>] ? dst_discard_out+0x20/0x20
     [<ffffffff81667810>] ? dst_ifdown+0x80/0x80
     [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
     [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch]
     [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch]
     [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch]
     [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
     [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch]
     [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch]
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
     [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch]
     [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
     [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0
     [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0
     [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930
     [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930
     [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
     [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220
     [<ffffffff81661060>] dev_queue_xmit+0x10/0x20
     [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220
     [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590
     [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590
     [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590
     [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
     [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
     [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
     [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
     [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
     [<ffffffff816ab4c0>] ip_output+0x70/0x110
     [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
     [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
     [<ffffffff816abf89>] ip_send_skb+0x19/0x40
     [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
     [<ffffffff816df21a>] icmp_push_reply+0xea/0x120
     [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230
     [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70
     [<ffffffff816dfa06>] icmp_echo+0x36/0x70
     [<ffffffff816e0d11>] icmp_rcv+0x271/0x450
     [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0
     [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0
     [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0
     [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560
     [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560
     [<ffffffff816a5453>] ip_rcv+0x283/0x3e0
     [<ffffffff810b6302>] ? match_held_lock+0x192/0x200
     [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40
     [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0
     [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230
     [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
     [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60
     [<ffffffff8165e678>] process_backlog+0x78/0x230
     [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230
     [<ffffffff8165e355>] net_rx_action+0x155/0x400
     [<ffffffff8106b48c>] __do_softirq+0xcc/0x420
     [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590
     [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30
     <EOI>
     [<ffffffff8106b88e>] do_softirq+0x4e/0x60
     [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0
     [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590
     [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0
     [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
     [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
     [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
     [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
     [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
     [<ffffffff816ab4c0>] ip_output+0x70/0x110
     [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
     [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
     [<ffffffff816abf89>] ip_send_skb+0x19/0x40
     [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
     [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0
     [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0
     [<ffffffff8163e398>] sock_sendmsg+0x38/0x50
     [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270
     [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320
     [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40
     [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460
     [<ffffffff81204886>] ? __fget_light+0x66/0x90
     [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80
     [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20
     [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f
    Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f
     66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
    RIP  [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0
     RSP <ffff88006d603170>

    Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agodatapath: Fix IPv6 fragment expiry crash.
Joe Stringer [Wed, 27 Jan 2016 00:49:36 +0000 (00:49 +0000)]
datapath: Fix IPv6 fragment expiry crash.

Prior to a series of commits in 3.17 like the following, the model
used to manage and expire fragments was different. We already backport
several of these functions (See datapath/compat/inet_fragment.c) to do
things like allocate/evict/destroy frags and frag queues. In the IPv4
code, we use these. In most of the IPv6 cases, we already reuse these
also. However, for timed frag expiration we instead call the upstream
version of the function, which proceeds to use the upstream versions
of the functions we backport in inet_fragment.c. There can be some
discrepancy between the offsets used in these upstream versions vs. the
backport versions, so if you mix/match them then it leads to invalid
dereferences.

b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue")
ab1c724f6330 ("inet: frag: use seqlock for hash rebuild")

Fixes the following kernel oops on kernels < 3.17 when IPv6 fragments
are expired without reassembling the frame.

BUG: unable to handle kernel paging request at 00000006845d69a8
IP: [<ffffffff8172c09e>] _raw_spin_lock+0xe/0x50
...
Call Trace:
 <IRQ>
 [<ffffffff816a32d3>] inet_frag_kill+0x63/0x100
 [<ffffffff816ead93>] ip6_expire_frag_queue+0x63/0x110
 [<ffffffffa01130e6>] nf_ct_frag6_expire+0x26/0x30 [openvswitch]
 [<ffffffff810744f6>] call_timer_fn+0x36/0x100
 [<ffffffffa01130c0>] ? nf_ct_net_init+0x20/0x20 [openvswitch]
 [<ffffffff8107548f>] run_timer_softirq+0x1ef/0x2f0
 [<ffffffff8106cccc>] __do_softirq+0xec/0x2c0
 [<ffffffff8106d215>] irq_exit+0x105/0x110
 [<ffffffff81737095>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff81735a1d>] apic_timer_interrupt+0x6d/0x80
 <EOI>
 [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10
 [<ffffffff8101cb2f>] default_idle+0x1f/0xc0
 [<ffffffff8101d406>] arch_cpu_idle+0x26/0x30
 [<ffffffff810bf3a5>] cpu_startup_entry+0xc5/0x290
 [<ffffffff817122e7>] rest_init+0x77/0x80
 [<ffffffff81d34f70>] start_kernel+0x438/0x443

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agoovn: Remove top ovn directory from PATHs.
Ilya Maximets [Fri, 29 Jan 2016 09:20:13 +0000 (12:20 +0300)]
ovn: Remove top ovn directory from PATHs.

Since 5b5c922b0ca6 ("ovn-nbctl: Move ovn-nbctl to utilities directory.")
there are no more executables in the top ovn directory.

Removing this directory from PATHs helps to avoid problems when an
old executable ./ovn/ovn-nbctl is used instead of ./ovn/utilities/ovn-nbctl.

This may happen if the source directory was updated to commit 5b5c922b0ca6
without calling 'make clean'.

Fixes: 5b5c922b0ca6 ("ovn-nbctl: Move ovn-nbctl to utilities directory.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agodatapath: test for netlink_set_err returning void
Simon Horman [Fri, 27 Nov 2015 06:07:23 +0000 (22:07 -0800)]
datapath: test for netlink_set_err returning void

In v2.6.33 netlink_set_err returns void. However, 1a50307ba182 ("netlink:
fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()") was backported and
included in v2.6.33.2 and in that and subsequent v2.6.33 stable releases
netlink_set_err returns an int.

It seems plausible that there are other backports floating around. So check
for netlink_set_err returning void rather than including compatibility code
based on the version of the kernel.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agonetdev-dpdk: Add vhost-user multiqueue support
Flavio Leitner [Tue, 26 Jan 2016 18:58:14 +0000 (16:58 -0200)]
netdev-dpdk: Add vhost-user multiqueue support

Most network cards today support multiple receive
and transmit queues (MQ).  The core idea is that on packet
reception, a NIC can send different packets to different
queues to distribute processing among CPUs running in parallel.
The packet distribution is based on the result of a filter applied
to each packet's headers. The filter should keep all packets from
the same flow on the same queue to avoid re-ordering while
distributing different flows among all available queues.

This is how the packet moves in a typical vhost-user use-case:

NIC             OVS
DPDK port ==== bridge --- vhost-user ==== qemu ==== virtio eth0

The DPDK ports, OVS bridges, virtio network driver and,
recently, QEMU (vhost-user) support MQ.  This patch adds MQ
support to OVS that leverages the DPDK vhost library to implement
vhost-user interfaces.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <kevin.traynor@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoofproto-dpif-xlate: Do not execute resubmit again after recirculation.
Ben Pfaff [Wed, 27 Jan 2016 17:14:18 +0000 (09:14 -0800)]
ofproto-dpif-xlate: Do not execute resubmit again after recirculation.

Consider the following flow table:

    table=0 actions=resubmit(,1),2
    table=1 actions=debug_recirc

When debug_recirc triggers recirculation and we later resume processing,
only the output to port 2 should be executed, because the effects of
"resubmit" have already taken place.  However, until now, the "resubmit"
was added to the actions to execute post-recirculation, resulting in an
infinite loop.

Now consider this flow table (as seen in the "MPLS handling" test in
ofproto-dpif.at):

    table=0 actions=pop_mpls(0x0806),resubmit(,1)
    table=1 ip,nw_dst=1.2.3.4 actions=controller

Here, we do want to add the "resubmit" to the actions to execute
post-recirculation, since the "resubmit" cannot be processed until after
recirculation makes the nw_dst field available.

This commit fixes the problem in both cases.

Found when testing a feature based on recirculation.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agoREADME.ovs-vtep.md: Fix incorrect spacing.
Kyle Mestery [Wed, 27 Jan 2016 23:55:28 +0000 (17:55 -0600)]
README.ovs-vtep.md: Fix incorrect spacing.

This fixes a simple formatting issue with this file I noticed while reviewing
the example of experimenting with the OVS HW-VTEP simulator.

Signed-off-by: Kyle Mestery <mestery@mestery.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoNEWS: DPDK 2.2 is now required.
Flavio Leitner [Wed, 27 Jan 2016 16:18:09 +0000 (14:18 -0200)]
NEWS: DPDK 2.2 is now required.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodpif-netdev: Unique and sequential tx_qids.
Ilya Maximets [Tue, 26 Jan 2016 06:12:34 +0000 (09:12 +0300)]
dpif-netdev: Unique and sequential tx_qids.

Currently tx_qid is equal to pmd->core_id. This leads to unexpected
behavior if pmd-cpu-mask differs from '/(0*)(1|3|7)?(f*)/',
e.g. if core_ids are not sequential, or don't start from 0, or both.

Example:
starting 2 pmd threads with 1 port, 2 rxqs per port,
pmd-cpu-mask = 00000014 and let dev->real_n_txq = 2

In that case pmd_1->tx_qid = 2, pmd_2->tx_qid = 4 and
txq_needs_locking = true (if the device doesn't have
ovs_numa_get_n_cores()+1 queues).

Then, after truncating in netdev_dpdk_send__():
'qid = qid % dev->real_n_txq;'
pmd_1: qid = 2 % 2 = 0
pmd_2: qid = 4 % 2 = 0

So, both threads will call dpdk_queue_pkts() with the same qid = 0.
This is unexpected behavior if there are 2 tx queues in the device.
Queue #1 will not be used and both threads will lock queue #0
on each send.

Fix that by using sequential tx_qids.
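
The arithmetic in the example above can be reproduced with a small Python sketch. It only models the queue-id computation under the stated mask and queue count. It is not the dpif-netdev code, and the helper name is made up for this illustration.

```python
def core_ids_from_mask(mask):
    """Return the core ids whose bits are set in a pmd-cpu-mask value."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


PMD_CPU_MASK = 0x14   # cores 2 and 4, as in the example above
REAL_N_TXQ = 2        # dev->real_n_txq in the example above

cores = core_ids_from_mask(PMD_CPU_MASK)

# Old scheme: tx_qid is the core id, later truncated modulo real_n_txq.
old_qids = [core % REAL_N_TXQ for core in cores]

# New scheme: tx_qids are assigned sequentially, one per pmd thread.
new_qids = [index % REAL_N_TXQ for index, _ in enumerate(cores)]

print(cores, old_qids, new_qids)
```

With the core-id scheme both threads land on queue 0, while sequential ids spread the two threads over both queues.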

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agodpif-netdev: Rework of rx queue management.
Ilya Maximets [Tue, 26 Jan 2016 06:12:33 +0000 (09:12 +0300)]
dpif-netdev: Rework of rx queue management.

The current rx queue management model is buggy and will not work properly
without additional barriers and other synchronization between PMD
threads and the main thread.

Known bugs of the current model:
* While reloading, two PMD threads, one already reloaded and
  one not yet reloaded, can poll the same queue of the same port.
  This behavior may lead to dpdk driver failure, because the drivers
  are not thread-safe.
* The same bug as fixed in commit e4e74c3a2b
  ("dpif-netdev: Purge all ukeys when reconfigure pmd.") but
  reproduced while only reconfiguring pmd threads without
  restarting, because addition may change the sequence of
  other ports, which is important at the time of reconfiguration.

Introduce a new model, where the distribution of queues is made by the main
thread with minimal synchronization and without data races between
pmd threads. Also, this model should work faster, because only the
needed threads will be interrupted for reconfiguration and the total
computational complexity of reconfiguration is lower.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoovs-lib: Try to call exit before killing.
Ilya Maximets [Wed, 16 Dec 2015 12:32:21 +0000 (15:32 +0300)]
ovs-lib: Try to call exit before killing.

When OVS is killed, it may not free all allocated resources.

Example:
The socket for a vhost-user port will stay in the system
after 'systemctl stop openvswitch' and opening
that port after restart will fail.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoUpdate relevant artifacts to add support for DPDK v2.2.0.
mweglicx [Wed, 23 Dec 2015 10:20:22 +0000 (10:20 +0000)]
Update relevant artifacts to add support for DPDK v2.2.0.

Following changes have been applied:
 - INSTALL.DPDK.md: change DPDK version number,
 - build.sh: change DPDK version number.

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoofproto-dpif-xlate: Fix recirculation for resubmit to current table.
Ben Pfaff [Fri, 22 Jan 2016 23:58:55 +0000 (15:58 -0800)]
ofproto-dpif-xlate: Fix recirculation for resubmit to current table.

When recirculation defers actions for processing later, it decides
based on the actions being saved whether it needs to record the table
and cookie from which they originated.  Until now, it was thought that
this was only important for actions that send packets to the controller
(because those actions send the table ID and cookie).  This overlooked
a special case of the "resubmit" action which also depends on the
current table ID, which meant that this special case malfunctioned if
it came after recirculation.  This commit fixes the problem, and adds
a test.

Found while testing another feature under development.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agodatapath: compat: Add NULL check for tun-dst.
Pravin B Shelar [Thu, 21 Jan 2016 05:17:45 +0000 (21:17 -0800)]
datapath: compat: Add NULL check for tun-dst.

tun-dst could be NULL in the case of an incorrect action list
where the set tunnel action is missing but the packet is sent
to a tunnel vport.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoovn-controller: Update check for parent port.
Russell Bryant [Wed, 20 Jan 2016 16:17:58 +0000 (11:17 -0500)]
ovn-controller: Update check for parent port.

There were a couple of checks that checked for a parent port as the
field being non-NULL.  We should treat an empty string the same as NULL
for this field.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-nbctl: Update show format for addresses.
Russell Bryant [Thu, 14 Jan 2016 16:00:52 +0000 (11:00 -0500)]
ovn-nbctl: Update show format for addresses.

This patch updates the formatting for the Logical_Port addresses column
in the show command output.  Previously, output would look like:

  addresses: 00:00:00:00:00:01 192.168.1.1 00:00:00:00:00:01 192.168.1.2

Now it looks like:

  addresses: ["00:00:00:00:00:01 192.168.1.1", "00:00:00:00:00:01 192.168.1.2"]

The grouping of addresses is important, so it should be reflected in the
output.
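
The difference between the two formats can be sketched in Python. This is only an illustration: json.dumps is used as a stand-in that happens to produce the same bracketed, quoted layout, and it is an assumption that this matches how ovn-nbctl formats the column internally.

```python
import json

addresses = [
    "00:00:00:00:00:01 192.168.1.1",
    "00:00:00:00:00:01 192.168.1.2",
]

# Old style: entries joined with spaces, so the grouping is lost.
old_line = "addresses: " + " ".join(addresses)

# New style: each "MAC IP" entry stays visibly grouped.
new_line = "addresses: " + json.dumps(addresses)

print(old_line)
print(new_line)
```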

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-nbctl: Help catch lport-set-addresses mistakes.
Russell Bryant [Thu, 14 Jan 2016 15:47:18 +0000 (10:47 -0500)]
ovn-nbctl: Help catch lport-set-addresses mistakes.

While debugging a broken OVN environment yesterday, the problem turned
out to be invalid entries in the logical port addresses column.  In
particular, the following command had been used:

  $ ovn-nbctl lport-set-addresses lp0 MAC IP

instead of:

  $ ovn-nbctl lport-set-addresses lp0 "MAC IP"

This is really easy to mess up, so add some simple validation to the
lport-set-addresses command.  If the beginning of an argument is ever
an IP address, it's wrong.

In passing, also add a note to the ovn-nb db documentation to note that
the order of "MAC IP" is required, as "IP MAC" is not valid.
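
The rule described above (an argument that begins with an IP address is wrong) can be sketched in Python as follows. This is a hedged illustration, not the ovn-nbctl implementation: the function names are hypothetical, and using the ipaddress module (which also accepts IPv6) is an assumption of this sketch.

```python
import ipaddress


def starts_with_ip(arg):
    """Return True if the first space-separated token parses as an IP address."""
    first = arg.split(" ", 1)[0]
    try:
        ipaddress.ip_address(first)
    except ValueError:
        return False
    return True


def invalid_address_args(args):
    """Return the arguments that would be rejected by the check."""
    return [arg for arg in args if starts_with_ip(arg)]


# Unquoted "MAC IP" arrives as two arguments; the second one starts with an IP.
print(invalid_address_args(["00:00:00:00:00:01", "192.168.1.1"]))

# Quoted "MAC IP" arrives as one argument and passes.
print(invalid_address_args(["00:00:00:00:00:01 192.168.1.1"]))
```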

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath: Fix panic sending IP frags over tunnels.
Joe Stringer [Wed, 20 Jan 2016 23:26:49 +0000 (15:26 -0800)]
datapath: Fix panic sending IP frags over tunnels.

The entire OVS_GSO_CB was not preserved when handling IP fragments,
leading to the following NULL pointer dereference in ovs_stt_xmit(). Fix
this in the fragmentation handling code by preserving the whole CB.

BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
IP: [<ffffffffa0cfc5b1>] ovs_stt_xmit+0x61/0x260 [openvswitch]
Call Trace:
 [<ffffffff815f682e>] ? __alloc_skb+0x7e/0x2b0
 [<ffffffffa0cf1134>] ovs_vport_send+0x44/0xb0 [openvswitch]
 [<ffffffffa0ce241f>] ovs_vport_output+0x10f/0x190 [openvswitch]
 [<ffffffff8163fe98>] ip_fragment+0x238/0x870
 [<ffffffffa0ce2310>] ? do_output.isra.35+0x120/0x120 [openvswitch]
 [<ffffffffa0d02093>] ovs_fragment+0x283/0x292 [openvswitch]
 [<ffffffff81073ff7>] ? mod_timer_pending+0x67/0x1b0
 [<ffffffff8160e2d0>] ? dst_ifdown+0x90/0x90
 [<ffffffff8160e2d0>] ? dst_ifdown+0x90/0x90
 [<ffffffffa0b30165>] ? nfnetlink_has_listeners+0x15/0x20 [nfnetlink]
 [<ffffffffa0cdb164>] ? ctnetlink_conntrack_event+0x74/0x7ee [nf_conntrack_netlink]
 [<ffffffffa0b873cd>] ? nf_ct_deliver_cached_events+0xad/0xf0 [nf_conntrack]
 [<ffffffff81360331>] ? csum_partial+0x11/0x20
 [<ffffffffa0ce2747>] ? execute_masked_set_action+0x2a7/0xa60 [openvswitch]
 [<ffffffffa0ce22a8>] do_output.isra.35+0xb8/0x120 [openvswitch]
 [<ffffffffa0ce2ff4>] do_execute_actions+0xf4/0x7f0 [openvswitch]
 [<ffffffffa0ce3730>] ovs_execute_actions+0x40/0x130 [openvswitch]
 [<ffffffffa0ce7c69>] ovs_packet_cmd_execute+0x2b9/0x2e0 [openvswitch]
 [<ffffffff81634fad>] genl_family_rcv_msg+0x18d/0x370
 [<ffffffff81635190>] ? genl_family_rcv_msg+0x370/0x370
 [<ffffffff81635221>] genl_rcv_msg+0x91/0xd0
 [<ffffffff816332c9>] netlink_rcv_skb+0xa9/0xc0
 [<ffffffff816337c8>] genl_rcv+0x28/0x40
 [<ffffffff816329b5>] netlink_unicast+0xd5/0x1b0
 [<ffffffff81632d9e>] netlink_sendmsg+0x30e/0x680
 [<ffffffff8162fc84>] ? netlink_rcv_wake+0x44/0x60
 [<ffffffff81630d12>] ? netlink_recvmsg+0x1a2/0x3a0
 [<ffffffff815ed7fb>] sock_sendmsg+0x8b/0xc0
 [<ffffffff8114d06d>] ? __alloc_pages_nodemask+0x16d/0xac0
 [<ffffffff8101c4b9>] ? sched_clock+0x9/0x10
 [<ffffffff815edbc9>] ___sys_sendmsg+0x349/0x360
 [<ffffffff811f8a39>] ? ep_scan_ready_list.isra.7+0x199/0x1c0
 [<ffffffff8110705c>] ? acct_account_cputime+0x1c/0x20
 [<ffffffff811cd90f>] ? fget_light+0x8f/0xf0
 [<ffffffff815ee922>] __sys_sendmsg+0x42/0x80
 [<ffffffff815ee972>] SyS_sendmsg+0x12/0x20
 [<ffffffff8170f22f>] tracesys+0xe1/0xe6

VMware-BZ: #1587324
Fixes: a94ebc39996b ("datapath: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
8 years agotests: Set enable-dummy=system for ovn-controller-vtep tests.
Russell Bryant [Thu, 14 Jan 2016 20:07:59 +0000 (15:07 -0500)]
tests: Set enable-dummy=system for ovn-controller-vtep tests.

All of the ovn-controller-vtep tests were failing on my laptop due to an
unexpected message in the ovs-vswitchd log related to my VPN.  This
setting resolves it and makes all tests pass.

Fixes: 0c1e8a7d637e ("ovn-controller-vtep: Add gateway module.")
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Fix memory leak and memory exhaustion bugs in group_mod.
Ben Pfaff [Thu, 14 Jan 2016 06:15:09 +0000 (22:15 -0800)]
ofproto: Fix memory leak and memory exhaustion bugs in group_mod.

In handle_group_mod() cases where adding a group failed, nothing freed the
list of buckets, causing a leak.  The same was true in every case of
modifying a group.  This commit fixes the problem by changing add_group()
to never steal or free the buckets (modify_group() already acted this way)
and then making handle_group_mod() always free the buckets when it's done.

This approach might at first raise objections, because it makes add_group()
copy the buckets instead of just take the existing ones.  But it actually
fixes a worse problem too: when OF1.4+ REQUESTFORWARD is enabled, the
group_mod is reused for the request forwarding.  Until now, for a group_mod
that adds a new group and that has some buckets, the previous stealing of
buckets in add_group() meant that the group_mod's buckets were no longer
valid; in practice, the list of buckets became linked in a way that
iteration never terminated, which caused memory to be exhausted while
composing the requestforward message.  By making add_group() no longer
modify the group_mod, we also fix this problem.

The requestforward test in the testsuite did not find the latter problem
because it only added a group without any buckets.  This commit also
updates the testsuite to include a bucket in its group_mod, which would
have found the problem.
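
The ownership change can be modeled with a small Python sketch. It is a simplified illustration, not the ofproto code: the class and function names are hypothetical, and the invalidated bucket list is modeled as None rather than as the looping linked list described above.

```python
from dataclasses import dataclass, field


@dataclass
class GroupMod:
    """Hypothetical stand-in for a decoded group_mod message."""
    group_id: int
    buckets: list = field(default_factory=list)


def add_group_stealing(table, gm):
    """Old behavior (modeled): the table takes the message's bucket list."""
    table[gm.group_id] = gm.buckets
    gm.buckets = None  # the message's buckets are no longer valid


def add_group_copying(table, gm):
    """New behavior: the table gets its own copy; the message is untouched."""
    table[gm.group_id] = list(gm.buckets)


def compose_requestforward(gm):
    """Stand-in for reusing the same group_mod for request forwarding."""
    return ["bucket:" + bucket for bucket in gm.buckets]


table = {}
gm = GroupMod(group_id=1, buckets=["output:2"])
add_group_copying(table, gm)

# The message can still be reused after the group has been added.
print(compose_requestforward(gm))
```

With add_group_stealing() in place of add_group_copying(), the later compose_requestforward() call in this model fails because the message no longer holds a usable bucket list.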

Found by pain and suffering.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agoovn-tutorial: fix a typo
William Tu [Sun, 17 Jan 2016 01:23:15 +0000 (17:23 -0800)]
ovn-tutorial: fix a typo

switch_in_pre_acl -> switch_out_pre_acl

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agoovn: Use assigned Geneve class.
Jesse Gross [Thu, 14 Jan 2016 22:25:17 +0000 (14:25 -0800)]
ovn: Use assigned Geneve class.

The most recent version of the Geneve draft included an option
class assignment for OVN:
https://tools.ietf.org/html/draft-ietf-nvo3-geneve-01

As a result, we can stop using the experimental class and switch to
the allocated one (0x0102).

Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoovs-bugtool: Add conntrack output.
William Tu [Wed, 13 Jan 2016 23:51:44 +0000 (15:51 -0800)]
ovs-bugtool: Add conntrack output.

Add a script to show all the connection entries in the tracker.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Gurucharan Shetty <guru@ovn.org>
8 years agodatapath: STT: Fix nf-hook softlockup.
Pravin B Shelar [Thu, 14 Jan 2016 00:42:10 +0000 (16:42 -0800)]
datapath: STT: Fix nf-hook softlockup.

The nf-hook is not unregistered when an STT device is deleted, but when
a device is created a second time, the nf-hook is registered again,
which causes the following soft lockup.
This patch fixes it by registering the nf-hook only for the very
first STT device.

---8<---

BUG: soft lockup - CPU#1 stuck for 22s! [ovs-vswitchd:11293]
RIP: 0010:[<ffffffffa0e48308>]  [<ffffffffa0e48308>] nf_ip_hook+0xf8/0x180 [openvswitch]
Stack:
 <IRQ>
 [<ffffffff8163bf60>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff8163572a>] nf_iterate+0x9a/0xb0
 [<ffffffff8163bf60>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff816357bc>] nf_hook_slow+0x7c/0x120
 [<ffffffff8163bf60>] ? ip_rcv_finish+0x350/0x350
 [<ffffffff8163c343>] ip_local_deliver+0x73/0x80
 [<ffffffff8163bc8d>] ip_rcv_finish+0x7d/0x350
 [<ffffffff8163c5e8>] ip_rcv+0x298/0x3d0
 [<ffffffff81605f26>] __netif_receive_skb_core+0x696/0x880
 [<ffffffff81606128>] __netif_receive_skb+0x18/0x60
 [<ffffffff81606cce>] process_backlog+0xae/0x180
 [<ffffffff81606512>] net_rx_action+0x152/0x270
 [<ffffffff8106accc>] __do_softirq+0xec/0x300
 [<ffffffff81710a1c>] do_softirq_own_stack+0x1c/0x30

Fixes: fee43fa2 ("datapath: Fix deadlock on STT device destroy.")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Joe Stringer <joe@ovn.org>
8 years ago{lib, utilities}: Fix ct_state constants in docs.
Joe Stringer [Wed, 13 Jan 2016 18:59:03 +0000 (10:59 -0800)]
{lib, utilities}: Fix ct_state constants in docs.

These pieces of documentation were not updated when the CS_* flags were
reordered on the OpenFlow interface.

Fixes: 63bc9fb1c69f ("packets: Reorder CS_* flags to remove gap.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agonetdev-dpdk: Fix thread_is_pmd() symbol conflict.
Joe Stringer [Tue, 12 Jan 2016 19:32:41 +0000 (11:32 -0800)]
netdev-dpdk: Fix thread_is_pmd() symbol conflict.

DPDK build was broken after commit 2f8932e8403a ("poll: Suppress logging
for pmd threads.") due to the following error:

lib/netdev-dpdk.c:245:13: error: static declaration of ‘thread_is_pmd’
follows non-static declaration
lib/ovs-thread.h:526:6: note: previous declaration of ‘thread_is_pmd’
was here

The version used in this file operates in the fastpath, so it cannot
switch to using the newly introduced version; the new version lives
outside of the dpdk portions of OVS so its implementation cannot be
shared with this function. Rename it to resolve the conflict.

Fixes: 2f8932e8403a ("poll: Suppress logging for pmd threads.")
Suggested-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
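
The conflict described above can be reproduced in a few lines of C. The sketch below is illustrative only: `local_thread_is_pmd`, `mark_pmd_thread` and `fastpath_is_pmd` are hypothetical names, not the ones chosen in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stands in for the non-static declaration that a shared header provides. */
bool thread_is_pmd(void);

/* Hypothetical file-local state consulted on the fast path. */
static bool fastpath_is_pmd = false;

/* Marks the current context as a PMD thread; expected to run only once. */
static void mark_pmd_thread(void)
{
    assert(!fastpath_is_pmd);
    fastpath_is_pmd = true;
}

/* Writing 'static bool thread_is_pmd(void)' here would fail with "static
 * declaration follows non-static declaration".  Giving the file-local
 * helper a distinct (here hypothetical) name resolves the clash. */
static bool local_thread_is_pmd(void)
{
    return fastpath_is_pmd;
}
```

The file-local helper keeps its own cheap implementation, and the shared declaration stays untouched.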
8 years agodatapath: Fix deadlock on STT device destroy.
Pravin B Shelar [Tue, 12 Jan 2016 04:13:40 +0000 (20:13 -0800)]
datapath: Fix deadlock on STT device destroy.

STT unregisters its nf-hook when there are no other STT devices
left in the namespace. On some kernel versions the nf-unreg API
takes the RTNL lock, but that lock is already held in the tunnel device
destroy code path, which results in a deadlock. To fix the issue
I moved the unreg call into net-exit.

VMware-BZ: #1582410
Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
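
The locking problem can be modelled in user space. This is a minimal sketch, not kernel code: every `fake_*` name is hypothetical, the RTNL lock is reduced to a boolean, and the assumption that namespace exit runs without that lock held is inferred from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real kernel objects. */
struct fake_net {
    bool rtnl_held;        /* models the non-recursive RTNL lock */
    bool hook_registered;  /* models the netfilter hook state */
    int n_devices;
};

/* Models an unregister API that takes RTNL internally.  Returns false if
 * the lock is already held, which stands for the deadlock. */
static bool fake_unregister_hook(struct fake_net *net)
{
    if (net->rtnl_held) {
        return false;
    }
    net->rtnl_held = true;
    net->hook_registered = false;
    net->rtnl_held = false;
    return true;
}

/* Device destroy runs with RTNL held by the caller, so it only updates
 * the device count and leaves the hook alone. */
static void fake_dev_destroy(struct fake_net *net)
{
    assert(net->rtnl_held);
    net->n_devices--;
}

/* Namespace exit is assumed to run without RTNL held, so the hook can be
 * unregistered safely here. */
static bool fake_net_exit(struct fake_net *net)
{
    return net->hook_registered ? fake_unregister_hook(net) : true;
}
```

In this model the hook outlives the last device and is removed only when the namespace goes away.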
8 years agoovs-ofctl.8.in: Fix indentation.
Joe Stringer [Tue, 12 Jan 2016 00:43:52 +0000 (16:43 -0800)]
ovs-ofctl.8.in: Fix indentation.

This extraneous .RE caused the indentation for the subsequent actions to
drop back an extra step. Fix it.

Fixes: 8e53fe8cf7a1 ("Add connection tracking mark support.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agoofp-parse: Use xstrdup() instead of strdup().
Ben Pfaff [Mon, 11 Jan 2016 17:21:58 +0000 (09:21 -0800)]
ofp-parse: Use xstrdup() instead of strdup().

This avoids a null pointer dereference in the case of memory allocation
failure.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
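
The difference between the two calls can be sketched as follows. `sketch_xstrdup` is a hypothetical helper written for this illustration; it only assumes that the "x" variant aborts on allocation failure instead of returning NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates a string and never returns NULL.  On
 * allocation failure it aborts, so callers need no NULL check. */
static char *sketch_xstrdup(const char *s)
{
    assert(s != NULL);

    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (!copy) {
        fputs("out of memory\n", stderr);
        abort();
    }
    memcpy(copy, s, size);
    return copy;
}
```

With plain strdup(), a caller that writes into the result without checking it dereferences NULL when memory runs out. With the wrapper the process stops at the allocation site instead.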
8 years agopoll: Suppress logging for pmd threads.
Ilya Maximets [Tue, 22 Dec 2015 14:26:47 +0000 (17:26 +0300)]
poll: Suppress logging for pmd threads.

'Unreasonably long poll interval' warnings are expected for PMD threads.
Reporting high CPU usage for them is also unnecessary.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Add LSOv2 support for VXLAN
Alin Serdean [Fri, 11 Dec 2015 22:29:38 +0000 (22:29 +0000)]
datapath-windows: Add LSOv2 support for VXLAN

This patch adds LSO version 2 support for the windows datapath.
(https://msdn.microsoft.com/en-us/library/windows/hardware/ff568840%28v=vs.85%29.aspx)

Tested using psping and iperf3.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Fix small bug in GRE.
Alin Serdean [Fri, 11 Dec 2015 22:24:49 +0000 (22:24 +0000)]
datapath-windows: Fix small bug in GRE.

Allow GRE encapsulation to take place when we have a TCP payload
without LSO.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofproto: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 21:38:43 +0000 (13:38 -0800)]
ofproto: Fix memory leak reported by valgrind.

Test case 757: ofproto - table description (OpenFlow 1.4)
Call stacks:
    parse_ofp_table_vacancy (ofp-parse.c:896)
    parse_ofp_table_mod (ofp-parse.c:978)
    ofctl_mod_table (ovs-ofctl.c:2011)
    ovs_cmdl_run_command (command-line.c:121)
    main (ovs-ofctl.c:135)
Reason: return without freeing memory

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
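
A "return without freeing memory" leak usually has the shape below. This is a generic sketch and not the code from ofp-parse.c: `parse_setting`, `counted_strdup`, `counted_free` and `live_copies` are hypothetical names, and the counter exists only to make the leak observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static int live_copies;  /* outstanding allocations, for demonstration only */

static char *counted_strdup(const char *s)
{
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (!copy) {
        abort();
    }
    memcpy(copy, s, size);
    live_copies++;
    return copy;
}

static void counted_free(char *p)
{
    if (p) {
        free(p);
        live_copies--;
    }
}

/* Parses "on" or "off".  Returns 0 on success, -1 on any other input.
 * Parsers often work on a private copy of their argument; every exit
 * path has to release it. */
static int parse_setting(const char *arg, bool *enabled)
{
    assert(arg != NULL && enabled != NULL);

    char *copy = counted_strdup(arg);
    int error = 0;

    if (!strcmp(copy, "on")) {
        *enabled = true;
    } else if (!strcmp(copy, "off")) {
        *enabled = false;
    } else {
        error = -1;  /* a bare 'return -1;' here would leak 'copy' */
    }

    counted_free(copy);  /* single exit point frees on success and failure */
    return error;
}
```

Routing the error case through the same cleanup as the success case is one common way to close this kind of leak.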
8 years agorstp: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 21:38:42 +0000 (13:38 -0800)]
rstp: Fix memory leak reported by valgrind.

Test case 1650: RSTP Single bridge. Call stack:
    hmap_insert_at (hmap.h:235)
    rstp_port_set_port_number__ (rstp.c:744)
    rstp_add_port (rstp.c:1164)
    new_bridge (test-rstp.c:123)
    test_rstp_main (test-rstp.c:514)
    ovstest_wrapper_test_rstp_main__ (test-rstp.c:714)
    ovs_cmdl_run_command (command-line.c:121)
    main (ovstest.c:132)
Fix it by adding hmap_destroy() in rstp_unref().

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Daniele Venturino <daniele.venturino@m3s.it>
Signed-off-by: Ben Pfaff <blp@ovn.org>
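
The pattern behind this fix, destroying an embedded container when the last reference is dropped, can be sketched as follows. All names here (`bridge_sketch`, `port_table`, `counted_calloc`, and so on) are hypothetical, and a plain array stands in for the hash map.

```c
#include <assert.h>
#include <stdlib.h>

static int live_blocks;  /* outstanding allocations, for demonstration only */

static void *counted_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (!p) {
        abort();
    }
    live_blocks++;
    return p;
}

static void counted_free(void *p)
{
    if (p) {
        free(p);
        live_blocks--;
    }
}

/* Stand-in for a hash map that owns a separately allocated bucket array. */
struct port_table {
    int *buckets;
    size_t n_buckets;
};

struct bridge_sketch {
    int ref_cnt;
    struct port_table ports;
};

static struct bridge_sketch *bridge_create(void)
{
    struct bridge_sketch *b = counted_calloc(1, sizeof *b);
    b->ref_cnt = 1;
    b->ports.n_buckets = 4;
    b->ports.buckets = counted_calloc(b->ports.n_buckets,
                                      sizeof *b->ports.buckets);
    return b;
}

static void port_table_destroy(struct port_table *t)
{
    counted_free(t->buckets);
    t->buckets = NULL;
    t->n_buckets = 0;
}

/* Drops one reference.  On the last one, the embedded table must be
 * destroyed before the object itself is freed, or its buckets leak. */
static void bridge_unref(struct bridge_sketch *b)
{
    assert(b->ref_cnt > 0);
    if (--b->ref_cnt == 0) {
        port_table_destroy(&b->ports);
        counted_free(b);
    }
}
```

Freeing only the outer object would leave the bucket array allocated, which is the kind of leak valgrind reports here.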
8 years agoovs-ofctl: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 21:38:41 +0000 (13:38 -0800)]
ovs-ofctl: Fix memory leak reported by valgrind.

Reported by 348: ovs-ofctl parse-flows (skb_priority)
Reason: return without freeing memory

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agostream-ssl: Fix memory leak reported by valgrind.
William Tu [Thu, 7 Jan 2016 23:59:34 +0000 (15:59 -0800)]
stream-ssl: Fix memory leak reported by valgrind.

test case 1628: peer ca cert
    ASN1_item_dup
    do_ca_cert_bootstrap (stream-ssl.c:413)
    ssl_connect (stream-ssl.c:468)
    scs_connecting (stream.c:297)
    stream_connect (stream.c:320)
Fix by removing the X509_dup().

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agopython: Fix the TypeError exception seen when idl.Idl parses lock reply
Numan Siddique [Fri, 8 Jan 2016 06:29:47 +0000 (11:59 +0530)]
python: Fix the TypeError exception seen when idl.Idl parses lock reply

File "/usr/lib/python2.7/site-packages/ovs/db/idl.py", line 334,
in __parse_lock_notify
  self.__update_has_lock(self, new_has_lock)
TypeError: __update_has_lock() takes exactly 2 arguments (3 given)

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agoofproto-dpif-upcall: Don't delete modified ukeys.
Joe Stringer [Thu, 7 Jan 2016 19:47:46 +0000 (11:47 -0800)]
ofproto-dpif-upcall: Don't delete modified ukeys.

If revalidation returns the result UKEY_DELETE, then both the ukey and
its corresponding flow should be deleted. However, if revalidation
returns UKEY_MODIFY, the ukey itself should be modified in-place and
should not be deleted.

Fix this by only applying the ukey deletion to ukeys whose datapath
operations delete a flow.

This may fix statistics accounting issues in rare cases involving
OpenFlow rule modification where actions are updated but flows remain
the same.

Found by inspection.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
8 years agoofproto-dpif-upcall: Avoid double-delete of ukeys.
Ben Pfaff [Wed, 6 Jan 2016 23:44:39 +0000 (15:44 -0800)]
ofproto-dpif-upcall: Avoid double-delete of ukeys.

revalidate_sweep__() has two cases where it calls ukey_delete() to
remove a ukey from the umap via cmap_remove().  The first case is a direct
call to ukey_delete(), when !flow_exists.  The second case is an indirect
call via push_ukey_ops(), when result != UKEY_KEEP.  If both of these
conditions are simultaneously true, however, the code would call
ukey_delete() twice, causing an assertion failure in the second call.  This
commit fixes the problem by eliminating one of the calls.

The version tested by Ben Warren differs from this version, see:
    http://openvswitch.org/pipermail/dev/2016-January/064117.html

Reported-by: Keith Holleman <keith.holleman@gmail.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2015-December/019772.html
CC: Joe Stringer <joe@ovn.org>
VMware-BZ: #1579057
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Ben Warren <ben@skyportsystems.com>
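
The double-delete can be illustrated with a small model. This is a sketch under stated assumptions, not the code in revalidate_sweep__(): the `sketch_*` names are hypothetical, the result enum is reduced to two values, and the assertion in `sketch_ukey_delete` stands for the one that failed on the second call.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_result { SKETCH_KEEP, SKETCH_DELETE };

struct sketch_ukey {
    bool in_map;    /* still present in the map? */
    int n_deletes;  /* how often delete ran, for demonstration only */
};

/* Removing an entry that is no longer in the map is a bug, so assert. */
static void sketch_ukey_delete(struct sketch_ukey *ukey)
{
    assert(ukey->in_map);
    ukey->in_map = false;
    ukey->n_deletes++;
}

/* Two independent 'if' statements, one for !flow_exists and one for
 * result != SKETCH_KEEP, would each call the delete function when both
 * conditions hold.  Folding them into one test deletes at most once. */
static void sketch_sweep_one(struct sketch_ukey *ukey, bool flow_exists,
                             enum sketch_result result)
{
    if (!flow_exists || result != SKETCH_KEEP) {
        sketch_ukey_delete(ukey);
    }
}
```

The commit itself removes one of the two calls; merging the conditions as above is just one way to get the same single-delete behaviour in a model.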
8 years agoofproto-dpif-rid: Fix memory leak in recirc_state.
Ben Pfaff [Wed, 6 Jan 2016 00:51:54 +0000 (16:51 -0800)]
ofproto-dpif-rid: Fix memory leak in recirc_state.

recirc_state_clone() copies the stack and actions, but nothing ever freed
them.

CC: Jarno Rajahalme <jarno@ovn.org>
CC: Andy Zhou <azhou@ovn.org>
Reported-by: William Tu <u9012063@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2016-January/064040.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoofp-util: Avoid use-after-free error in ofputil_append_meter_config().
Ben Pfaff [Wed, 16 Dec 2015 06:51:29 +0000 (22:51 -0800)]
ofp-util: Avoid use-after-free error in ofputil_append_meter_config().

Reported-by: weizj <334965317@qq.com>
Reported-at: https://github.com/openvswitch/ovs/pull/97
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoodp-util: Fix memory leak reported by valgrind.
William Tu [Tue, 5 Jan 2016 00:18:41 +0000 (16:18 -0800)]
odp-util: Fix memory leak reported by valgrind.

Test case: OVS datapath key parsing and formatting (377)
Return without freeing buf:
    xmalloc(util.c:112)
    ofpbuf_init(ofpbuf.c:124)
    parse_odp_userspace_action(odp-util.c:987)
    parse_odp_action(odp-util.c:1552)
    odp_actions_from_string(odp-util.c:1721)
    parse_actions(test-odp.c:132)

Test case: OVS datapath actions parsing and formatting (380)
Exit without uninit in test-odp.c
    xrealloc(util.c:123)
    ofpbuf_resize__(ofpbuf.c:243)
    ofpbuf_put_uninit(ofpbuf.c:364)
    nl_msg_put_uninit(netlink.c:178)
    nl_msg_put_unspec_uninit(netlink.c:216)
    nl_msg_put_unspec(netlink.c:243)
    parse_odp_key_mask_attr(odp-util.c:3974)
    odp_flow_from_string(odp-util.c:4151)
    parse_keys(test-odp.c:49)
    test_odp_main(test-odp.c:237)
    ovstest_wrapper_test_odp_main__(test-odp.c:251)
    ovs_cmdl_run_command(command-line.c:121)
    main(ovstest.c:132)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodatapath-windows: Fix subscribe/unsubscribe packets
Alin Serdean [Mon, 4 Jan 2016 23:04:11 +0000 (23:04 +0000)]
datapath-windows: Fix subscribe/unsubscribe packets

The policy of the subscribe packets is defined by the following:
    const NL_POLICY policy[] =  {
        [OVS_NL_ATTR_PACKET_PID] = {.type = NL_A_U32 },
        [OVS_NL_ATTR_PACKET_SUBSCRIBE] = {.type = NL_A_U8 }
    };
Switch the value of the join operation with the one from the policy.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetlink-socket: Fix log message for subscribe/unsubscribe on Windows.
Alin Serdean [Mon, 4 Jan 2016 23:04:10 +0000 (23:04 +0000)]
netlink-socket: Fix log message for subscribe/unsubscribe on Windows.

The warning message reported the opposite of the operation that was performed.

Also use the error returned by nl_sock_subscribe_packet__.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-northd: Can't use ct() for router ports.
l0310 [Wed, 2 Dec 2015 11:20:07 +0000 (19:20 +0800)]
ovn-northd: Can't use ct() for router ports.

This patch ensures that we do not attempt to use connection tracking for
logical ports with type=router.  This does not work as the traffic
through a logical router port is not symmetric since logical routers are
distributed.  The result was that traffic between logical ports on
different hypervisors that went through a logical router would fail if
ACLs were in use.

GitHub-PR: #92
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1522022
Signed-off-by: l0310 <liw@dtdream.com>
[russell@ovn.org updated commit message, style tweaks]
Signed-off-by: Russell Bryant <russell@ovn.org>