cascardo/ovs.git
9 years agoSet release dates for 2.3.0. v2.3
Justin Pettit [Thu, 14 Aug 2014 18:03:39 +0000 (11:03 -0700)]
Set release dates for 2.3.0.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoIndicate that DPDK support in 2.3 is experimantal.
Justin Pettit [Thu, 14 Aug 2014 18:00:17 +0000 (11:00 -0700)]
Indicate that DPDK support in 2.3 is experimantal.

Users should consider a later version of OVS for DPDK support.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agobfd: Add configuration for setting and matching mac address.
Alex Wang [Fri, 8 Aug 2014 23:41:27 +0000 (16:41 -0700)]
bfd: Add configuration for setting and matching mac address.

This commit adds options for configuring the MAC addresses
in BFD state machine.  Therein, the "bfd_local_src_mac" and
"bfd_local_dst_mac" configure the MAC address for transmitted
BFD packets.  The "bfd_remote_dst_mac" configure the matching
of MAC address on received BFD packets.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
9 years agobfd: Flip the default value of bfd ip source and destination.
Alex Wang [Fri, 8 Aug 2014 22:50:14 +0000 (15:50 -0700)]
bfd: Flip the default value of bfd ip source and destination.

This commit flips the default value of bfd ip source and destination,
so that they match the default value of ip destination and source
of vtep schema.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
9 years agodpif-netdev: Avoid divide by zero.
Ben Pfaff [Wed, 6 Aug 2014 16:43:31 +0000 (09:43 -0700)]
dpif-netdev: Avoid divide by zero.

Otherwise creating the first dpif-netdev bridge fails because there are
no handlers:

  Program terminated with signal 8, Arithmetic exception.
  #0  0x080971e9 in dp_execute_cb (aux_=aux_@entry=0xffcfaa54,
      packet=packet@entry=0xffcfac54, md=md@entry=0xffcfac84,
      a=a@entry=0x8f58930, may_steal=false) at ../lib/dpif-netdev.c:2154
  #1  0x080b5adb in odp_execute_actions__ (dp=dp@entry=0xffcfaa54,
      packet=packet@entry=0xffcfac54, steal=steal@entry=false,
      md=md@entry=0xffcfac84, actions=actions@entry=0x8f58930,
      actions_len=actions_len@entry=20,
      dp_execute_action=dp_execute_action@entry=0x8097040 <dp_execute_cb>,
      more_actions=more_actions@entry=false) at ../lib/odp-execute.c:218
  #2  0x080b5def in odp_execute_actions (dp=dp@entry=0xffcfaa54,
      packet=packet@entry=0xffcfac54, steal=steal@entry=false,
      md=md@entry=0xffcfac84, actions=0x8f58930, actions_len=20,
      dp_execute_action=dp_execute_action@entry=0x8097040 <dp_execute_cb>)
      at ../lib/odp-execute.c:285
  #3  0x08095098 in dp_netdev_execute_actions (actions_len=<optimized out>,
      actions=<optimized out>, md=0xffcfac84, may_steal=false,
      packet=0xffcfac54, key=0xffcfaa5c, dp=<optimized out>)
      at ../lib/dpif-netdev.c:2227
  #4  dpif_netdev_execute (dpif=0x8f59598, execute=0xffcfac78)
      at ../lib/dpif-netdev.c:1551
  #5  0x0809a56c in dpif_execute (dpif=0x8f59598,
      execute=execute@entry=0xffcfac78) at ../lib/dpif.c:1227
  #6  0x08071071 in check_variable_length_userdata (backer=<optimized out>)
      at ../ofproto/ofproto-dpif.c:1040
  #7  open_dpif_backer (backerp=0x8f5834c, type=<optimized out>)
      at ../ofproto/ofproto-dpif.c:921
  #8  construct (ofproto_=0x8f581c0) at ../ofproto/ofproto-dpif.c:1120
  #9  0x080675e0 in ofproto_create (datapath_name=0x8f57310 "br0",
      datapath_type=<optimized out>, ofprotop=ofprotop@entry=0x8f576c8)
      at ../ofproto/ofproto.c:564
  #10 0x080529aa in bridge_reconfigure (ovs_cfg=ovs_cfg@entry=0x8f596d8)
      at ../vswitchd/bridge.c:572
  #11 0x08055e33 in bridge_run () at ../vswitchd/bridge.c:2339
  #12 0x0804cdbd in main (argc=9, argv=0xffcfb554)
      at ../vswitchd/ovs-vswitchd.c:116

This bug was introduced by commit 9bcefb956d8f (ofproto-dpif: fix an ovs
crash when dpif_recv_set returns error).

CC: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agoofproto-dpif: Use DPIF_FP_CREATE but not DPIF_FP_MODIFY.
Ben Pfaff [Sat, 2 Aug 2014 00:22:20 +0000 (17:22 -0700)]
ofproto-dpif: Use DPIF_FP_CREATE but not DPIF_FP_MODIFY.

A dpif reports EEXIST if a flow put operation that should create a new flow
instead attempts to modify an existing flow, or ENOENT if a flow put would
create a flow that overlaps some existing flow.  The latter does not always
indicate a bug in OVS userspace, because it can also mean that two
userspace OVS packet handler threads received packets that generated
the same megaflow for different microflows.  Until now, userspace has
logged this, which confuses users by making them suspect a bug.  We could
simply not log ENOENT in userspace, but that would suppress logging for
genuine bugs too.  Instead, this commit drops DPIF_FP_MODIFY from flow
put operations in ofproto-dpif, which transforms this particular
problem into EEXIST, which userspace already logs at "debug" level (see
flow_message_log_level()), effectively suppressing the logging in normal
circumstances.

It appears that in practice ofproto-dpif doesn't actually ever need to
modify flows in the datapath, only create and delete them, so this
shouldn't cause problems.

This backport squashes together commit a7d1bbdcfe4 (ofproto-dpif: Use
DPIF_FP_CREATE but not DPIF_FP_MODIFY.) and commit dc71c5dd0455
(ofproto-dpif: Fix tests broken by previous commit.) from master, with
resolved conflicts.

Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agodpif-netdev: Initialize upcall->packet when queuing to userspace.
Ben Pfaff [Mon, 4 Aug 2014 19:16:25 +0000 (12:16 -0700)]
dpif-netdev: Initialize upcall->packet when queuing to userspace.

Only the data and size members were being initialized, but all of them
should be.

This is inspired by commit b6f4590fa036 (dpif-netdev: Initialize
upcall->packet when queuing to userspace.) from master.  The background is
not exactly the same as on master (the commit that it references is not
on branch-2.3).

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoofproto-dpif: fix an ovs crash when dpif_recv_set returns error
Andy Zhou [Sun, 3 Aug 2014 23:16:40 +0000 (16:16 -0700)]
ofproto-dpif: fix an ovs crash when dpif_recv_set returns error

When dpif_recv_set returns an error, close_dpif_backer gets called,
which in term calls recirc_id_pool_destroy, which can lead to a crash
since recirc_id_pool_create() was not called before this patch.

Reported-by: Mukesh Hira <mhira@vmware.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: Drop packets when interdev is not up
Chunhe Li [Wed, 30 Jul 2014 01:49:01 +0000 (09:49 +0800)]
datapath: Drop packets when interdev is not up

If the internal device is not up, it should drop received
packets. Sometimes it receive the broadcast or multicast
packets, and the ip protocol stack will casue more cpu
usage wasted.

Signed-off-by: Chunhe Li <lichunhe@huawei.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoconnmgr: Make call to connmgr_wants_packet_in_on_miss() thread safe.
Alex Wang [Tue, 29 Jul 2014 17:50:07 +0000 (10:50 -0700)]
connmgr: Make call to connmgr_wants_packet_in_on_miss() thread safe.

connmgr_wants_packet_in_on_miss() is called by multiple threads
and thusly should be protected by the mutex.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
9 years agoconnmgr: Fix a typo.
Alex Wang [Tue, 29 Jul 2014 18:10:21 +0000 (11:10 -0700)]
connmgr: Fix a typo.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agoconnmgr: Only send role status messages to OpenFlow 1.4+ controllers.
Ben Pfaff [Tue, 22 Jul 2014 22:54:55 +0000 (15:54 -0700)]
connmgr: Only send role status messages to OpenFlow 1.4+ controllers.

Only OpenFlow 1.4 and later support role status messages, but this code
tried to send them to all controllers, which caused an assertion failure.

Also, add tests to check that role status messages work, and that they
don't cause trouble with OF1.2.

Reported-by: Anup Khadka <khadka.py@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agoconnmgr: Demote service controllers too in ofconn_set_role().
Ben Pfaff [Mon, 28 Jul 2014 17:17:56 +0000 (10:17 -0700)]
connmgr: Demote service controllers too in ofconn_set_role().

Service controllers can set their roles, so it's necessary to demote them
to slaves if another controller becomes master.  Unfortunately the
'controllers' hmap only contains primary controllers, so this was omitted.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agoovs-ofctl: Add --unixctl command line option.
Ben Pfaff [Mon, 28 Jul 2014 17:31:25 +0000 (10:31 -0700)]
ovs-ofctl: Add --unixctl command line option.

This matches the option offered by some other Open vSwitch daemons.  I
intend to use it in tests in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
9 years agoFix two memory leaks.
yinpeijun [Mon, 28 Jul 2014 07:21:17 +0000 (15:21 +0800)]
Fix two memory leaks.

Found by coverity.

Signed-off-by: yinpeijun <yinpeijun@huawei.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: Fix buffer overrun in mask array realloc.
Pravin B Shelar [Fri, 25 Jul 2014 23:22:46 +0000 (16:22 -0700)]
datapath: Fix buffer overrun in mask array realloc.

mask realloc copies elements from old array to new array. When
shrinking array it can go beyond allocated memory.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodatapath: Add NULL check for mask pointer.
Pravin B Shelar [Thu, 24 Jul 2014 20:32:35 +0000 (13:32 -0700)]
datapath: Add NULL check for mask pointer.

There is race in datapath when last mask in mask array deleted can
result in NULL pointer dereference in datapath.
datapath lookup does not check mask pointer if its index is less than
mask-array count. That is safe because delete operation moves last valid
pointer to the deleted element. But this does not work if we are
deleting last element in array. Following patch adds NULL check for the
mask pointer.
This patch also avoids accessing ma->count without any locks.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodatapath: reorder action netlink attribute definition for upstreaming
Andy Zhou [Thu, 24 Jul 2014 09:17:48 +0000 (02:17 -0700)]
datapath: reorder action netlink attribute definition for upstreaming

Keeping the order of netlink attribute definition in the order of
upstreaming is the best way to keep all released user space program
forward compatible with upstreamed kernel modules.

Adjust action netlink attribute order to match with the current
upstreaming plan.

Recirc and hash actions are introduced in branch 2.3, which will be
fixed by the patch. The MPLS actions have been released since
branch-2.1 but there is no kernel implementation of them prior to
branch 2.3. Thus the ordering change should not affect them.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agoroute-table: Make route-table module thread-safe.
Ryan Wilson [Thu, 29 May 2014 21:56:09 +0000 (14:56 -0700)]
route-table: Make route-table module thread-safe.

Since the use of xcache, the netdev struct can be freed by the
revalidator threads.  This fact also makes the following race possible:

1. Consider there is a gre tunnel, and datapath flows that go through
   the tunnel.  Now, assume user deletes the tunnel port.
2. The main thread closes all of its references to the corresponding
   netdev struct.
3. If the ukey for the tunnel datapath flows hold the last reference
   to the old port's netdev, the revalidator will then close it
   and remove the netlink notifier struct (if the netdev is the last
   vport netdev).
4. However, if the main thread executes the netdev_vport_run(), and
   sees the existence of netlink notifier struct (before revalidator
   frees it), it will try polling updates from the notifier socket.
   And when it polls the socket fd, the fd has already been freed,
   and poll will keep failing with "Bad file descriptor".

The following script could be used to reproduce the race:
- assume on a VM-VM setup, with the setup below:

ovs-vsctl add-br br-int -- add-port br-int p3
ovs-vsctl add-port br-int vif1 -- set int vif1 type=internal
ifconfig vif1 11.0.0.1 up; ifconfig eth3 3.3.3.1 up

- while keeping a ping from 11.0.0.1 to 11.0.0.2, run this loop:

for i in `seq 1000000`; do
    sleep 5; ovs-vsctl del-port p3;
    ovs-vsctl add-port br-int p3 -- set int p3  type=gre \
        options:remote_ip=3.3.3.2 options:key=1;
done

- after a while, the race should be triggered.  and the main thread
  should run at 100% cpu.

This race has already been fixed on master by commit 3c27dbe6 (route
-table: Make route-table module thread-safe.).  This commit backports
it to branch-2.3.

VMware-BZ: #1287360

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
9 years agoovsdb: Don't add ovsdb-server.c to libovsdb.
Gurucharan Shetty [Fri, 18 Jul 2014 01:15:17 +0000 (18:15 -0700)]
ovsdb: Don't add ovsdb-server.c to libovsdb.

Without this change, with shared libraries, VLOG
constructor for ovsdb-server would get called twice corrupting
the 'vlog_modules' list causing an infinite loop.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Reported-by: Gur Stavi <gstavi@mrv.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agostp: Make stp-disabled port forward stp bpdu packets.
Alex Wang [Wed, 16 Jul 2014 01:52:19 +0000 (18:52 -0700)]
stp: Make stp-disabled port forward stp bpdu packets.

Commit 0d1cee123a84 (stp: Fix bpdu tx problem in listening state)
makes ovs drop the stp bpdu packets if stp is not enabled on the
input port.  However, when pif bridge is used and stp is enabled
on the integration bridge.  The flow translation of stp bpdu
packets will go through a level of resubmission which changes
the input port to the corresponding peer port.  Since, the
patch port on the pif bridge does not have stp enabled, the
flow translation will drop the bpdu packets.

This commit fixes the issue by making ovs forward stp bpdu packets
on stp-disabled port.

VMware-BZ: #1284695

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agodatapath: Check for NULL upcall_portids.
Pravin B Shelar [Tue, 15 Jul 2014 18:12:11 +0000 (11:12 -0700)]
datapath: Check for NULL upcall_portids.

Following patch adds NULL check for memory allocated
by kmalloc.

Reported-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
9 years agonetlink-socket: Fix handling socket allocation failure in nl_dump_start().
Ben Pfaff [Mon, 14 Jul 2014 21:06:03 +0000 (14:06 -0700)]
netlink-socket: Fix handling socket allocation failure in nl_dump_start().

If nl_pool_alloc() failed, then 'dump' was not initialized at all and
further use of the dump would access uninitialized data, probably causing
a crash.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agodpif-linux: Avoid null dereference if all ports disappear.
Ben Pfaff [Mon, 14 Jul 2014 20:17:05 +0000 (13:17 -0700)]
dpif-linux: Avoid null dereference if all ports disappear.

When dpif_linux_refresh_channels() refreshes the set of channels when
the number of handlers changes, it destroys all the dpif's channels and
sets dpif->uc_array_size to 0.  If the port dump later in the function
turns up no ports (which generally indicates a bug), then no channels will
be allocated and thus dpif->uc_array_size will remain 0 and 'channels' will
be null in each handler.  This is self-consistent, at least, but
dpif_linux_port_get_pid__() was still willing in this situation to
try to access element 0 of the set of channels, dereferencing a null
pointer.

This fixes the problem.

I encountered this while looking at a bug that I had introduced during
development that caused the port dump to always be empty.  It would be
difficult to encounter in normal use.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agoofp-msgs: Correct code for queue configuration messages in OpenFlow 1.0.
Ben Pfaff [Mon, 14 Jul 2014 17:37:38 +0000 (10:37 -0700)]
ofp-msgs: Correct code for queue configuration messages in OpenFlow 1.0.

Reported-by: Simon Jouet <simon.jouet@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofp-util: Fix null pointer dereference in ofputil_pull_buckets().
Ben Pfaff [Mon, 14 Jul 2014 21:33:01 +0000 (14:33 -0700)]
ofp-util: Fix null pointer dereference in ofputil_pull_buckets().

Found by clang-analyzer.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agotests: Disable glibc memory checking under glibc <= 2.11.
Ben Pfaff [Fri, 11 Jul 2014 18:03:08 +0000 (11:03 -0700)]
tests: Disable glibc memory checking under glibc <= 2.11.

We noticed that the unit tests sometimes fail on XenServer inside glibc's
memory checker, in the free_check() function.  It turns out that the
glibc memory checker in glibc 2.11 and earlier had an internal race that
caused false positives in multithreaded programs.

This commit avoids the problem by disabling the glibc memory checker in
glibc 2.11 and earlier.

VMware-BZ: #1267127
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agonetlink-socket: Work around kernel Netlink dump thread races.
Ben Pfaff [Thu, 10 Jul 2014 23:48:16 +0000 (16:48 -0700)]
netlink-socket: Work around kernel Netlink dump thread races.

The Linux kernel Netlink implementation has two races that cause problems
for processes that attempt to dump a table in a multithreaded manner.

The first race is in the structure of the kernel netlink_recv() function.
This function pulls a message from the socket queue and, if there is none,
reports EAGAIN:
skb = skb_recv_datagram(sk, flags, noblock, &err);
if (skb == NULL)
goto out;
Only if a message is successfully read from the socket queue does the
function, toward the end, try to queue up a new message to be dumped:
if (nlk->cb && atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) {
ret = netlink_dump(sk);
if (ret) {
sk->sk_err = ret;
sk->sk_error_report(sk);
}
}
This means that if thread A reads a message from a dump, then thread B
attempts to read one before A queues up the next, B will get EAGAIN.  This
means that, following EAGAIN, B needs to wait until A returns to userspace
before it tries to read the socket again.  nl_dump_next() already does
this, using 'dump->status_seq' (although the need for it has never been
explained clearly, to my knowledge).

The second race is more serious.  Suppose thread X and thread Y both
simultaneously attempt to queue up a new message to be dumped, using the
call to netlink_dump() quoted above.  netlink_dump() begins with:
mutex_lock(nlk->cb_mutex);

cb = nlk->cb;
if (cb == NULL) {
err = -EINVAL;
goto errout_skb;
}
Suppose that X gets cb_mutex first and finds that the dump is complete.  It
will therefore, toward the end of netlink_dump(), clear nlk->cb to NULL to
indicate that no dump is in progress and release the mutex:
nlk->cb = NULL;
mutex_unlock(nlk->cb_mutex);
When Y grabs cb_mutex afterward, it will see that nlk->cb is NULL and
return -EINVAL as quoted above.  netlink_recv() stuffs -EINVAL in sk_err,
but that error is not reported immediately; instead, it is saved for the
next read from the socket.  Since Open vSwitch maintains a pool of Netlink
sockets, that next failure can crop up pretty much anywhere.  One of the
worst places for it to crop up is in the execution of a later transaction
(e.g. in nl_sock_transact_multiple__()), because userspace treats Netlink
transactions as idempotent and will re-execute them when socket errors
occur.  For a transaction that sends a packet, this causes packet
duplication, which we actually observed in practice.  (ENOBUFS should
actually cause transactions to be re-executed in many cases, but EINVAL
should not; this is a separate bug in the userspace netlink code.)

VMware-BZ: #1283188
Reported-and-tested-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agonetlink-socket: Fix sign of error code.
Ben Pfaff [Thu, 10 Jul 2014 21:32:10 +0000 (14:32 -0700)]
netlink-socket: Fix sign of error code.

Commit 8f20fd98db (netlink-socket: Work around upstream kernel Netlink
bug.) got the sign of the error code wrong, so that it reported e.g. -22
for EINVAL to nl_sock_recv__()'s caller, instead of 22.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agodatapath: add skb_clone NULL check in the recirc action.
Andy Zhou [Thu, 10 Jul 2014 07:30:27 +0000 (00:30 -0700)]
datapath: add skb_clone NULL check in the recirc action.

Refactoring recirc action implementation.

The main change is to fix a bug where the NULL check after skb clone()
call is missing.  The fix is to return -ENOMEM whenever skb_clone()
failed to create a clone.

Also rearranged adjacent code to improve readability.

Reported-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath/flow_netlink: Fix NDP flow mask validation
Daniele Di Proietto [Wed, 9 Jul 2014 22:22:14 +0000 (15:22 -0700)]
datapath/flow_netlink: Fix NDP flow mask validation

match_validate() enforce that a mask matching on NDP attributes has also an
exact match on ICMPv6 type.
The ICMPv6 type, which is 8-bit wide, is stored in the 'tp.src' field of
'struct sw_flow_key', which is 16-bit wide.
Therefore, an exact match on ICMPv6 type should only check the first 8 bits.

This commit fixes a bug that prevented flows with an exact match on NDP field
from being installed

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodpif-linux: Recheck the socket pointer existence before getting its pid.
Alex Wang [Tue, 8 Jul 2014 04:58:33 +0000 (21:58 -0700)]
dpif-linux: Recheck the socket pointer existence before getting its pid.

This commit fixes a race between port deletion and flow miss handling.
More specifically, a port could be removed by main thread while
the handler thread is handling the flow miss from it.  If the flow
requires slow path action, the handler thread will try querying a pid
from port's socket.  Since the port has been deleted, the query will
cause a dereference of NULL socket pointer.

This commit makes the handler thread recheck the socket pointer before
dereferencing it.

VMware-BZ: 1251981

Reported-by: Pratap Reddy <preddy@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agotests: Fix "megaflow disabled" test.
Joe Stringer [Tue, 8 Jul 2014 09:01:20 +0000 (09:01 +0000)]
tests: Fix "megaflow disabled" test.

The previous 'grep' logic would occasionally catch unintended flows.
There's no reason to check for these flows separately, so combine the
two checks.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
9 years agorevalidator: Revalidate missed flows.
Joe Stringer [Thu, 3 Jul 2014 08:29:26 +0000 (08:29 +0000)]
revalidator: Revalidate missed flows.

If the datapath doesn't dump a flow for some reason, and the current
dump is expected to revalidate all flows in the datapath, then perform
revalidation for those flows by fetching them during the sweep phase.
If revalidation is not required, then leave the flow in the datapath and
don't revalidate it.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodpif: Support fetching flow mask via dpif_flow_get().
Joe Stringer [Fri, 4 Jul 2014 08:37:12 +0000 (08:37 +0000)]
dpif: Support fetching flow mask via dpif_flow_get().

Change the interface to allow implementations to pass back a buffer, and
allow callers to specify which of actions, mask, and stats they wish to
receive. This will be used in the next commit.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoutil: fix compile warnings
Ansis Atteka [Tue, 8 Jul 2014 04:11:53 +0000 (04:11 +0000)]
util: fix compile warnings

This patch fixes two compile warnings introduced by commit
64b73291 ("util: create a copy of program_name"):
1. ../lib/util.c:457:5: error: passing argument 1 of 'free'
   discards 'const' qualifier from pointer target type; And
2. ../lib/util.c:463:5: error: ISO C90 forbids mixed declarations
   and code [-Werror=declaration-after-statement] (affected only
   branch-2.3 that is C90 compliant and not the master)

Reported-By: Joe Stringer <jstringer@nicira.com>
Reported-By: Lorand Jakab <lojakab@cisco.com>
Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agoutil: Suppress a warning by adding CONST_CAST
YAMAMOTO Takashi [Fri, 4 Jul 2014 14:14:57 +0000 (14:14 +0000)]
util: Suppress a warning by adding CONST_CAST

Suppress a gcc warning which was introduced by
commit e0b48482c16b6eaa7f14d8c7e7c6275528881b9e.
("util: create a copy of program_name")
I guess MSVC doesn't have a corresponding warning.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Lorand Jakab <lojakab@cisco.com>
9 years agoovs-thread: Implement OVS specific barrier.
Alex Wang [Thu, 29 May 2014 22:37:37 +0000 (15:37 -0700)]
ovs-thread: Implement OVS specific barrier.

Non-leader revalidator thread uses pthread_barrier_* functions in their
main loop to synchronize with leader thread.  However, since those threads
only call poll_block() intermittently, the poll interval check in
poll_block() can wrongly take the time since last call as poll interval
and issue the following warnings:

"Unreasonably long XXXXms poll interval".

To prevent it, this commit implements the barrier struct and operations
for OVS which allow thread to block on barrier via poll_block().

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoutil: create a copy of program_name
Ansis Atteka [Thu, 3 Jul 2014 20:41:02 +0000 (13:41 -0700)]
util: create a copy of program_name

Commit 8a9562 ("dpif-netdev: Add DPDK netdev.") reversed sequence
in which set_program_name() and proctitle_init() functions are
called.  This introduced a regression where program_name and argv_start
would point to exactly the same memory (previously both of these
pointers were pointing to different memory locations because
proctitle_init() would have beforehand created a copy of argv[0]
for the succeeding set_program_name() call).

This regression on my system caused ovs-vswitchd monitoring process to
show up without process name:

...  00:00:00 : monitoring pid 26308 (healthy)

Ps output was lacking process name because following code was
using overlapping memory for source and target buffer:.

proctitle_set(const char *format, ...)
{
    ...
    n = snprintf(argv_start, argv_size, "%s: ", program_name);

Overall C99 and POSIX standards state that behavior is undefined
if source and target buffers overlap.

Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
Acked-By: Ben Pfaff <blp@nicira.com>
9 years agonetlink-socket: Work around upstream kernel Netlink bug.
Ben Pfaff [Mon, 30 Jun 2014 21:57:42 +0000 (14:57 -0700)]
netlink-socket: Work around upstream kernel Netlink bug.

The upstream kernel net/netlink/af_netlink.c netlink_recvmsg() contains the
following code to refill the Netlink socket buffer with more dump skbs
while a dump is in progress:

if (nlk->cb && atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) {
ret = netlink_dump(sk);
if (ret) {
sk->sk_err = ret;
sk->sk_error_report(sk);
}
}

The netlink_dump() function that this calls returns a negative number on
error, the convention used throughout the kernel, and thus sk->sk_err
receives a negative value on error.

However, sk->sk_err is supposed to contain either 0 or a positive errno
value, as one can see from a quick "grep" through net for 'sk_err =', e.g.:

    ipv4/tcp.c:2067: sk->sk_err = ECONNRESET;
    ipv4/tcp.c:2069: sk->sk_err = ECONNRESET;
    ipv4/tcp_input.c:4106: sk->sk_err = ECONNREFUSED;
    ipv4/tcp_input.c:4109: sk->sk_err = EPIPE;
    ipv4/tcp_input.c:4114: sk->sk_err = ECONNRESET;
    netlink/af_netlink.c:741: sk->sk_err = ENOBUFS;
    netlink/af_netlink.c:1796: sk->sk_err = ENOBUFS;
    packet/af_packet.c:2476: sk->sk_err = ENETDOWN;
    unix/af_unix.c:341: other->sk_err = ECONNRESET;
    unix/af_unix.c:407: skpair->sk_err = ECONNRESET;

The result is that the next attempt to receive from the socket will return
the error to userspace with the wrong sign.

(The root of the error in this case is that multiple threads are attempting
to read a single flow dump from a shared fd.  That should work, but the
kernel has an internal race that can result in one or more of those threads
hitting the EINVAL case at the start of netlink_dump().  The EINVAL is
harmless in this case and userspace should be able to ignore it, but
reporting the EINVAL as if it were a 22-byte message received in userspace
throws a real wrench in the works.)

This bug makes me think that there are probably not many programs doing
multithreaded Netlink dumps.  Maybe it is good that we are considering
other approaches.

VMware-BZ: #1255704
Reported-by: Mihir Gangar <gangarm@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agorevalidator: Improve optimization to skip revalidation.
Joe Stringer [Wed, 2 Jul 2014 07:41:33 +0000 (07:41 +0000)]
revalidator: Improve optimization to skip revalidation.

The should_revalidate() optimisation introduced with commit 698ffe3623
(revalidator: Only revalidate high-throughput flows.) was a little
aggressive, occasionally deleting flows even when OVS is quite capable
of performing full revalidation.

This commit modifies the logic to:
* Firstly, check if we are likely to handle full revalidation, and
  attempt that instead.
* Secondly, fall back to the existing flow throughput estimations to
  determine whether to revalidate the flow or just delete it.

VMware-BZ: #1271926

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agodatapath: Use exact lookup for flow_get and flow_del.
Alex Wang [Mon, 30 Jun 2014 21:51:02 +0000 (14:51 -0700)]
datapath: Use exact lookup for flow_get and flow_del.

Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath.  And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.

This commit fixes the bug by making the kernel flow_del and flow_get
logic check all masks in that case.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: fix sparse warning in function tbl_mask_array_delete_mask()
Andy Zhou [Tue, 24 Jun 2014 06:00:55 +0000 (23:00 -0700)]
datapath: fix sparse warning in function tbl_mask_array_delete_mask()

Sparse gives "incompatible types in comparison expression (different
address spaces)" warning messages. Fix this by add rcu_dereference()
wrappers.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: keep mask array compact when deleting mask
Andy Zhou [Mon, 16 Jun 2014 19:45:04 +0000 (12:45 -0700)]
datapath: keep mask array compact when deleting mask

When deleting a mask from the mask array, we always move the last entry
into its current location. Another approach can be NULL in its
current place, and periodically compact it.

The approach taken by this patch is more efficient during run
time.  During look up, fast path packet don't have to skip over NULL
pointers.

A more important advantage of this approach is that it tries to
keep the mask array index stable by avoiding periodic index
reshuffle.

This patch implements an optimization to further promote index
stability.  By leaving the last entry value intact when moving it
to a new location, the old cache index can 'fix' themselves, by noticing
the index in the cache is outside the valid mask array region. The
new index can be found by scanning the mask pointer within the valid
rtegion.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: simplify ovs_flow_tbl_lookup_stats()
Andy Zhou [Fri, 6 Jun 2014 20:30:27 +0000 (13:30 -0700)]
datapath: simplify ovs_flow_tbl_lookup_stats()

Simplify flow mask cache replacement without using expensive atomic
memory access to the mask pointers.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: Change u64_stats_* to use _irq instead of _bh().
Jesse Gross [Mon, 30 Jun 2014 20:43:25 +0000 (13:43 -0700)]
datapath: Change u64_stats_* to use _irq instead of _bh().

The upstream u64_stats API has been changed to remove the _bh()
versions and switch all consumers to use IRQ safe variants instead.
This was done to be safe for netpoll generated packets, which can
occur in hard IRQ context. From a safety perspective, this doesn't
directly affect OVS since it doesn't support netpoll. However, this
change has been backported to older kernels so OVS needs to use the
new API to compile.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agorhel, xenserver: Run unit tests while creating rpms.
Gurucharan Shetty [Mon, 23 Jun 2014 21:05:00 +0000 (14:05 -0700)]
rhel, xenserver: Run unit tests while creating rpms.

For RHEL, Fedora and Xenserver, run unit tests while
building rpms.  This may catch some cross-platform bugs.

The commit also allows the users to optionally skip unit tests.
(On debian, the default is to run unit tests. For consistency,
do the same for rpms.)

VMware-BZ: 1267127

CC: Flavio Leitner <fbl@redhat.com>
CC: Ben Pfaff <blp@nicira.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Tested-by: Flavio Leitner <fbl@redhat.com>
9 years agotest-ovsdb: Workaround unicode bug in Python 2.4.x.
Gurucharan Shetty [Fri, 20 Jun 2014 20:13:51 +0000 (13:13 -0700)]
test-ovsdb: Workaround unicode bug in Python 2.4.x.

Run the following command on Xenserver:
PYTHONPATH=`pwd`/python/compat::`pwd`/python python ./tests/test-ovsdb.py \
parse-atoms '{"type": "string", "minLength": 2}'    \
'[""]'     '["a"]'     '["ab"]'     '["abc"]'     '["\ud834\udd1e"]'

And we get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\U0001d11e'
in position 23: ordinal not in range(128).

It looks like we are hitting the following bug:
http://bugs.python.org/issue2517

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-By: Ben Pfaff <blp@nicira.com>
9 years agosocket_util.py: Make set_dscp() python 2.4.3 compatible.
Gurucharan Shetty [Thu, 19 Jun 2014 17:38:20 +0000 (10:38 -0700)]
socket_util.py: Make set_dscp() python 2.4.3 compatible.

There is no 'errno' field in socket.error. Instead use the
get_exception_errno() function to get the error number.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib: Fix FreeBSD build.
Joe Stringer [Wed, 25 Jun 2014 07:41:41 +0000 (07:41 +0000)]
lib: Fix FreeBSD build.

Various recent commits have introduced build failures on FreeBSD. This
patch fixes them.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodpif-netdev: delete lost packets in dp_execute_cb()
Daniele Di Proietto [Wed, 25 Jun 2014 20:39:45 +0000 (13:39 -0700)]
dpif-netdev: delete lost packets in dp_execute_cb()

This commit fixes memory leaks in dp_execute_cb() in two cases:
    - when the output port cannot be found
    - when the recirculation depth is exceeded

Reported-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodpif-netdev: Fix memory leak in dpif_netdev_flow_put()
Ryan Wilson [Wed, 25 Jun 2014 20:05:17 +0000 (13:05 -0700)]
dpif-netdev: Fix memory leak in dpif_netdev_flow_put()

miniflow_destroy() needs to be called after using miniflow_init().
Otherwise, if the miniflow mallocs data, then a memory leak may
occur.

Found by inspection.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agojson: Fix parsing of strings that end with a backslash.
Ben Pfaff [Wed, 25 Jun 2014 18:39:25 +0000 (11:39 -0700)]
json: Fix parsing of strings that end with a backslash.

json_string_unescape() flagged a backslash at the end of a string as an
error, but of course "\\" is a valid string.  This fixes the problem.

VMware-BZ: #1275208
Reported-by: Michael Hu <mhu@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agodatapath: Rehash 16-bit skbuff hashes into 32 bits.
Jesse Gross [Wed, 25 Jun 2014 01:28:08 +0000 (18:28 -0700)]
datapath: Rehash 16-bit skbuff hashes into 32 bits.

Currently, if the network stack provides skb->rxhash then we use it,
otherwise we compute our own. However, on at least some versions of
RHEL/CentOS, the stack provides a hash that is 16 bits rather than
32 bits. In cases where we use the uppermost bits of the hash this
is particularly bad because we detect that a hash is present and we
use it rather than computing our own but the result is always zero.

This is particularly noticible with tunnel ports that use the hash
to generate a source port, such as VXLAN. On these kernels the tunnel
source port is always the minimum value. To solve this problem while
still taking advantage of the precomputed hash, this rehashes the
hash so that the entropy is spread throughout 32 bits.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
9 years agodpif: When executing actions needs help, use "set" action to set tunnel.
Ben Pfaff [Tue, 24 Jun 2014 23:39:33 +0000 (16:39 -0700)]
dpif: When executing actions needs help, use "set" action to set tunnel.

Open vSwitch userspace is able to implement some actions that the kernel
doesn't support, such as modifying ARP fields.  When it does this for a
tunneled packet, it needs to supply the tunnel information with a "set"
action, because the Linux kernel datapath throws away tunnel information
supplied in the OVS_PACKET_CMD_EXECUTE metadata argument.

VMware-BZ: #1270110
Reported-by: Srinivas Neginhal <sneginha@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
9 years agonetdev-vport: Fix use-after-free error in netdev_vport_route_changed().
Ben Pfaff [Tue, 24 Jun 2014 20:47:33 +0000 (13:47 -0700)]
netdev-vport: Fix use-after-free error in netdev_vport_route_changed().

We can't unlock the netdev's mutex after close the netdev, because closing
the netdev might destroy the mutex.

VMware-BZ: #1275187
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agostream-ssl: Enable TLSv1.1 and TLSv1.2.
Ben Pfaff [Fri, 13 Jun 2014 23:24:49 +0000 (16:24 -0700)]
stream-ssl: Enable TLSv1.1 and TLSv1.2.

The Open vSwitch SSL code was inadvertently enabling only TLSv1, not
later versions.  This commit should fix it.

See https://www.openssl.org/docs/ssl/SSL_CTX_new.html
and http://www.postgresql.org/message-id/20131203213049.GA8259@gmail.com
for more information.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Reported-by: Abhinav Singhal <Abhinav.Singhal@spirent.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agolib/classifier: Fix use of uninitialized memory.
Jarno Rajahalme [Fri, 13 Jun 2014 21:52:59 +0000 (14:52 -0700)]
lib/classifier: Fix use of uninitialized memory.

When reaching the end of a prefix trie, we checked one bit off the end
to the intended data.  However, since the trie node in that case has
NULLs for both edge links, this did not result in incorrect
functionality.

Found via check-valgrind.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/classifier: Clarify trie_lookup_value().
Jarno Rajahalme [Fri, 13 Jun 2014 21:52:59 +0000 (14:52 -0700)]
lib/classifier: Clarify trie_lookup_value().

trie_lookup_value() is easier to read with the local variable 'plen'
renamed as 'ofs'.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agodatapath: avoid memory corruption in queue_userspace_packet()
Andy Zhou [Thu, 12 Jun 2014 20:19:25 +0000 (13:19 -0700)]
datapath:  avoid memory corruption in queue_userspace_packet()

In queue_userspace_packet(), the ovs_nla_put_flow return value is
not checked. This is fine as long as key_attr_size() returns the
correct value. In case it does not, the current code may corrupt buffer
memory. Add a run time assertion catch this case to avoid silent
failure.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
9 years agodatapath: fix key size computation in key_attr_size()
Andy Zhou [Thu, 12 Jun 2014 20:07:01 +0000 (13:07 -0700)]
datapath: fix key size computation in key_attr_size()

The key_attr_size() was not updated when RECIRC_ID and DP_HASH
key fields are added to support recircualtion. This patch fixes it.

Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
VMware-BZ: 1266214

9 years agoFix log message weird suffixes.
Ben Pfaff [Wed, 11 Jun 2014 16:14:54 +0000 (09:14 -0700)]
Fix log message weird suffixes.

I think these were leftovers from the removal of %z for MSVC that happened
some time ago.

VMware-BZ: 1265762
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
9 years agonetflow: Fold netflow_expire() into netflow_flow_clear().
Anoob Soman [Tue, 20 May 2014 11:40:35 +0000 (12:40 +0100)]
netflow: Fold netflow_expire() into netflow_flow_clear().

netflow_flow_clear() asserted that no packets or bytes were included
in the statistics for the flow being cleared.  Before threading Open
vSwitch, this assertion was always true because netflow_expire() was
always called before calling netflow_flow_clear().  Since Open
vSwitch was threaded, however, it was possible that a packet arrived
after netflow_expire() but before netflow_flow_clear(), since each of
these function separately took the netflow mutex.

This commit fixes the problem by merging netflow_expire() into
netflow_flow_clear(), under a single acquisition of the netflow
mutex.

Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
[blp@nicira.com modified the patch to remove netflow_expire() and
 rewrote the commit message]
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto-dpif-rid: Fix memory leak in recirc_id_pool_destroy().
Ben Pfaff [Fri, 6 Jun 2014 01:46:23 +0000 (18:46 -0700)]
ofproto-dpif-rid: Fix memory leak in recirc_id_pool_destroy().

recirc_id_pool_create() allocates memory but recirc_id_pool_destroy() did
not destroy it.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
9 years agoofproto: Fix memory leak in ofproto_destroy().
Ben Pfaff [Fri, 6 Jun 2014 00:43:46 +0000 (17:43 -0700)]
ofproto: Fix memory leak in ofproto_destroy().

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
9 years agoofproto: Send monitor updates if a flow mod changes a rules actions
Simon Horman [Thu, 5 Jun 2014 09:54:47 +0000 (18:54 +0900)]
ofproto: Send monitor updates if a flow mod changes a rules actions

Without this change a monitor update will be sent when a flow mod changes
a rules cookie but not if only the actions are updated. This appears
to be a logic error.

I noticed this while working on implementing OpenFlow1.4 flow monitor
as an OpenFlow1.4 flow mod does not update a rules cookie.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agopoll-loop: Ignore 'wevent' in poll_fd_wait_at() on non-Windows.
Ben Pfaff [Wed, 4 Jun 2014 22:47:16 +0000 (15:47 -0700)]
poll-loop: Ignore 'wevent' in poll_fd_wait_at() on non-Windows.

'wevent' isn't actually used on non-Windows systems, but poll_fd_wait_at()
and find_poll_node() treat events with different 'wevent' as different, so
it seems better to make sure that 'wevent' doesn't matter.

Alternatively, one could ovs_assert(!wevent).  I guess that would catch
a caller accidentally swapping the 'fd' and 'wevent' arguments.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agostream-ssl: Always initialize wevent member, even on non-Windows.
Ben Pfaff [Thu, 5 Jun 2014 18:36:56 +0000 (11:36 -0700)]
stream-ssl: Always initialize wevent member, even on non-Windows.

Otherwise the indeterminate 'wevent' could frustrate poll_fd_wait_at()'s
attempt to merge "poll_node"s for the same fd.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agoflow: Fix use-after-free in flow_compose().
Ben Pfaff [Wed, 4 Jun 2014 22:42:13 +0000 (15:42 -0700)]
flow: Fix use-after-free in flow_compose().

flow_compose_l4() can cause 'b' to be reallocated, thus the network header
pointer needs to be refreshed afterward.

Found by valgrind in the IPv6 case.  I updated the IPv4 case too just in
case, and for consistency.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodpif-netdev: Fix another use-after-free in port_unref().
Ben Pfaff [Wed, 4 Jun 2014 22:41:09 +0000 (15:41 -0700)]
dpif-netdev: Fix another use-after-free in port_unref().

Commit 87400a3d4cc4a (dpif-netdev: Fix use-after-free in port_unref().)
fixed one use-after-free in the common case of port_unref().  However,
there was another, similar case: if port->netdev has no rxqs, then
the netdev_close() causes port->netdev to be destroyed and thus the
following call to netdev_n_rxq() accesses freed memory.  This commit fixes
the problem.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agoofproto-dpif: Fix uninitialized data in ofproto_dpif_execute_actions().
Ben Pfaff [Wed, 4 Jun 2014 22:38:34 +0000 (15:38 -0700)]
ofproto-dpif: Fix uninitialized data in ofproto_dpif_execute_actions().

The dp_hash and recirc_id fields weren't being initialized.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
9 years agodpif: Fix slow action handling for DP_HASH and RECIRC
Andy Zhou [Thu, 29 May 2014 00:00:48 +0000 (17:00 -0700)]
dpif: Fix slow action handling for DP_HASH and RECIRC

In case DP_HASH and RECIRC actions need to be executed in slow path,
current implementation simply don't handle them -- vswitchd simply
crashes. This patch fixes them by supply an implementation for them.

RECIRC will be handled by the datapath, same as the output action.

DP_HASH, on the other hand, is handled in the user space. Although the
resulting hash values may not match those computed by the datapath, it
is less expensive; current use case (bonding) does not require a strict
match to work properly.

Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
9 years agoofproto: Remove ofproto_refresh_rule().
Joe Stringer [Tue, 3 Jun 2014 09:09:59 +0000 (21:09 +1200)]
ofproto: Remove ofproto_refresh_rule().

The only user of this function was removed in the previous patch, so
remove it.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agoofproto-dpif-xlate: Cache full flowmod for learning.
Joe Stringer [Tue, 3 Jun 2014 08:44:35 +0000 (20:44 +1200)]
ofproto-dpif-xlate: Cache full flowmod for learning.

Caching the results of xlate_learn was previously dependent on the state
of the 'may_learn' flag. This meant that if the caller did not specify
that this flow may learn, then a learn entry would not be cached.
However, the xlate_cache tends to be used on a recurring basis, so
failing to cache the learn entry can provide unexpected behaviour later
on, particularly in corner cases.

Such a corner case occurred previously:-
* Revalidation was requested.
* A flow with a learn action was dumped.
* The flow had no packets.
* The flow's corresponding xcache was cleared, and the flow revalidated.
* The flow went on to receive packets after the xcache is re-created.

In this case, the xcache would be re-created, but would not refresh the
timeouts on the learnt flow until the next time it was cleared, even if
it received more traffic. This would cause flows to time out sooner than
expected. Symptoms of this bug may include unexpected forwarding
behaviour or extraneous statistics being attributed to the wrong flow.

This patch fixes the issue by caching the entire flow_mod, including
actions, upon translating an xlate_learn action. This is used to perform
a flow_mod from scratch with the original flow, rather than simply
refreshing the rule that was created during the creation of the xcache.

Bug #1252997.

Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agolib/ovs-rcu: Evaluate parameters before ovsrcu_set.
Jarno Rajahalme [Tue, 3 Jun 2014 15:35:16 +0000 (08:35 -0700)]
lib/ovs-rcu: Evaluate parameters before ovsrcu_set.

ovsrcu_set() looks like a function, so users are justified in expecting
the arguments to be evaluated before any of the body of ovsrcu_set().

With ovs-atomic-pthreads, a fallback ovs-atomics implementation used
when no C11 atomics are available or with GCC older than 4.0, the
prior definitions led to nested mutex locking when the 'VALUE'
argument was an atomic_read().  This is resolved by ensuring function
argument semantics for the parameters.  For non-GCC compilers this is
done with an inline function.  GCC implementation of ovs-rcu does not
fix the RCU variable type, so a GCC macro needs to be used instead.

Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
9 years agoovs-rcu: Remove the extra 'typedef' keyword.
Jarno Rajahalme [Tue, 3 Jun 2014 15:35:16 +0000 (08:35 -0700)]
ovs-rcu: Remove the extra 'typedef' keyword.

'struct ovsrcu_pointer' is always used with the 'struct' keyword, so
remove the unneeded 'typedef'.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
9 years agolib/flow: fix minimatch_extract() ICMPv6 parsing
Jarno Rajahalme [Mon, 2 Jun 2014 21:29:57 +0000 (14:29 -0700)]
lib/flow: fix minimatch_extract() ICMPv6 parsing

This patch addresses two bugs related to ICMPv6(NDP) packets:

- In miniflow_extract() push the words in the correct order
- In parse_icmpv6() use sizeof struct, not size of struct *

match_wc_init() has been modified, to include the nd_target field
when the transport layer protocol is ICMPv6

A testcase has been added to detect the bugs.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agobridge: Resend status changes to database if previous transaction
Ryan Wilson [Fri, 30 May 2014 19:49:45 +0000 (12:49 -0700)]
bridge: Resend status changes to database if previous transaction
was not successful.

Bridge, port and interface status changes are not sent to the
database if the connectivity and netdev sequence numbers have not
changed. However, if the previous database transaction fails, then
status changes will not be updated in the database until the
connectivity and netdev sequence numbers change again. This could
leave the database in an incorrect state for a long period of time.

This patch always sends status changes to the database if the last
transaction was not successful.

Bug #1256577
Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
9 years agoovsdb-idl: Add coverage counters for ovsdb commit return statuses.
Ryan Wilson [Fri, 30 May 2014 06:44:37 +0000 (23:44 -0700)]
ovsdb-idl: Add coverage counters for ovsdb commit return statuses.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
9 years agobridge: Initialize dscp for mgmt connections.
Gurucharan Shetty [Fri, 16 May 2014 19:04:00 +0000 (12:04 -0700)]
bridge: Initialize dscp for mgmt connections.

Without it, garbage values make it to set_dscp function
in Windows.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agorevalidator: Eliminate duplicate flow handling.
Joe Stringer [Wed, 28 May 2014 00:39:51 +0000 (12:39 +1200)]
revalidator: Eliminate duplicate flow handling.

A series of bugs have been identified recently that are caused by a
combination of the awkward flow dump API, possibility of duplicate flows
in a flow dump, and premature optimisation of the revalidator logic.
This patch attempts to simplify the revalidator logic by combining
multiple critical sections into one, which should make the state more
consistent.

The new flow of logic is:
+ Lookup the ukey.
+ If the ukey doesn't exist, create it.
+ Insert the ukey into the udpif. If we can't insert it, skip this flow.
+ Lock the ukey. If we can't lock it, skip it.
+ Determine if the ukey was already handled. If it has, skip it.
+ Revalidate.
+ Update ukey's fields (mark, flow_exists).
+ Unlock the ukey.

Previously, we would attempt process a flow without creating a ukey if
it hadn't been dumped before and it was due to be deleted. This patch
changes this to always create a ukey, allowing the ukey's
mutex to be used as the basis for preventing a flow from being handled
twice. This is expected to cause some minor performance regression for
cases like TCP_CRR, in favour of correctness and readability.

Bug #1252997.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agorhel: support persistent mac addresses on OVS bridges
Lars Kellogg-Stedman [Fri, 23 May 2014 21:14:35 +0000 (17:14 -0400)]
rhel: support persistent mac addresses on OVS bridges

This patch adds support for RHEL-derived systems (RHEL/CentOS/Fedora)
for setting the persistent MAC address of an OVS bridge via the MACADDR
setting in the interface configuration file.

Without this change, when an administrator provides MACADDR in the
interface configuration file that address will be set in ifup-eth using
the "ip link set" command.  While this appears to work, any operation
that updates the OVS configuration will cause the MAC address to revert.

Persistent MAC addresses must be set using ovs-vsctl.

(Resubmitted with whitespace and grammar corrections)

Signed-off-by: Lars Kellogg-Stedman <lars@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
9 years agodpif-linux: Fix revalidator core caused by flow dumps.
Ethan Jackson [Sat, 24 May 2014 01:07:25 +0000 (18:07 -0700)]
dpif-linux: Fix revalidator core caused by flow dumps.

The dpif flow dump API guarantees that keys it returns are not deleted
unless dpif_flow_dump_next_may_destroy_keys() warns that they might
be.  When dpif_linux_flow_dump_next() needed extra memory for a
datapath flow's actions, it would use a special temporary buffer.
Unfortunately, it also used memory from this temporary buffer for the
keys it returned.  Thus, on the next loop this memory would be freed,
breaking our invariant.

The correct solution to this problem is probably to fix this rather
awkward API.  However, this patch's solution is small and simple, so
it's fine for now.

The following is a valgrind stack trace which lead to finding this
bug.

Invalid read of size 2
  at 0x4C2D050: memcpy@@GLIBC_2.14 (in ...)
  by 0x4D40B8: dpif_linux_flow_to_ofpbuf (dpif-linux.c:2373)
  by 0x4D4DE4: dpif_linux_operate__ (dpif-linux.c:1394)
  by 0x4D5055: dpif_linux_operate (dpif-linux.c:1480)
  by 0x450D14: dpif_operate (dpif.c:1225)
  by 0x42D706: push_dump_ops__.isra.6 (ofproto-dpif-upcall.c:1362)
  by 0x42E5A8: revalidate.isra.11 (ofproto-dpif-upcall.c:1531)
  by 0x42EB0E: udpif_revalidator (ofproto-dpif-upcall.c:592)
  by 0x495070: ovsthread_wrapper (ovs-thread.c:302)
  by 0x5472E99: start_thread (pthread_create.c:308)
  by 0x5C7C4BC: clone (clone.S:112)
Address 0x227e12d0 is 160 bytes inside a block of size 4,976 free'd
  at 0x4C2A82E: free (in ...)
  by 0x494227: ofpbuf_uninit (ofpbuf.c:136)
  by 0x4D5D34: dpif_linux_flow_dump_next (ofpbuf.h:182)
  by 0x4509BA: dpif_flow_dump_next (dpif.c:1036)
  by 0x42E137: revalidate.isra.11 (ofproto-dpif-upcall.c:1451)
  by 0x42EB0E: udpif_revalidator (ofproto-dpif-upcall.c:592)
  by 0x495070: ovsthread_wrapper (ovs-thread.c:302)
  by 0x5472E99: start_thread (pthread_create.c:308)
  by 0x5C7C4BC: clone (clone.S:112)

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto-dpif-upcall: Fix additional use-after-free in revalidate().
Ben Pfaff [Wed, 21 May 2014 23:26:55 +0000 (16:26 -0700)]
ofproto-dpif-upcall: Fix additional use-after-free in revalidate().

Commit 1340ce0c175 (ofproto-dpif-upcall: Avoid use-after-free in
revalidate() corner cases.) fixed one use-after-free error in revalidate(),
but missed one more subtle case, in which dpif_linux_flow_dump_next()
attempts to retrieve actions for a flow that didn't have them in the main
dump result.  If retrieving those actions fails,
dpif_linux_flow_dump_next() goes on to the next flow, and as part of that
overwrites the old dumped flows in its buffer.  This is a problem because
dpif_linux_flow_dump_next_may_destroy_keys() would have indicated that
the old dumped flows would not have been destroyed, which means that the
data the caller relied on has gone away.  In the worst case, this causes
a segfault and core dump.

The best way to fix this problem is the refactoring that has already
happened on master in commit ac64794acb85 (dpif: Refactor flow dumping
interface to make better sense for batching.)  That is a fairly large
change, and not yet well-tested, so it doesn't yet seem suitable for a
stable branch.  For now, this commit refines the conditions that
dpif_linux_flow_dump_next_may_destroy_keys() uses, so that if the next
flow does not include actions it indicates that keys may be destroyed.

Thanks to Joe Stringer for suggesting this particular solution.

Bug #1249988.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agoofproto-dpif-xlate: Fix a bug.
Alex Wang [Thu, 22 May 2014 03:45:24 +0000 (20:45 -0700)]
ofproto-dpif-xlate: Fix a bug.

Commit b256dc525c8 (ofproto-dpif-xlate: Cache xlate_actions() effects.)
caches the variables needed for refreshing mac-learning table in
xlate_normal().  Wherein, the cache entry always records reference to
the original 'ofproto'.

When patch port is used to connect two 'ofproto's, packet goes through the
patch port will have two mac-learning cache entries created for each
'ofproto'.  So, each entry should reference to the corresponding 'ofproto'.
However, due to the bug mentioned above, all cache entries will refer to the
same 'ofproto'.  Subsequently, the mac-learning tables can be corrupted, which
causes connection loss.

This commit fixes the bug by making each cache entry refer to the correct
'ofproto'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agodpif-netdev: Fix memory leak.
Ben Pfaff [Wed, 21 May 2014 00:07:56 +0000 (17:07 -0700)]
dpif-netdev: Fix memory leak.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agodpif-netdev: Fix use-after-free in port_unref().
Ben Pfaff [Wed, 21 May 2014 00:09:59 +0000 (17:09 -0700)]
dpif-netdev: Fix use-after-free in port_unref().

When the last rxq is closed (which releases the rxq's internal reference
to its netdev) the next call to netdev_n_rxq() accesses freed memory.

Found by valgrind.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Reported-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
9 years agorevalidator: Re-fix a flow duplication bug.
Joe Stringer [Wed, 21 May 2014 02:49:04 +0000 (14:49 +1200)]
revalidator: Re-fix a flow duplication bug.

Commit 73a3c4757e59 (revalidator: Prevent handling the same flow twice.)
fixed a bug where duplicated flows could be deleted twice. Commit
7d1700980b5d (ofproto-dpif-upcall: Remove the flow_dumper thread.)
partially re-introduced this bug.

The bug would cause the logs to show messages such as
"failed to flow_get (No such file or directory) skb_priority(0),..."
"failed to flow_del (No such file or directory) skb_priority(0),..."

This patch fixes the issue again.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
9 years agoovs-ctl: Raise the limit on the number of open file descriptors.
Alex Wang [Tue, 20 May 2014 21:16:54 +0000 (14:16 -0700)]
ovs-ctl: Raise the limit on the number of open file descriptors.

Since the removal of dispatcher thread, OVS creates 'n-handler-threads'
file descriptors for each bridge port.  To allow more bridge ports
be supported, this commit raises the limit on the number of open file
descriptors from 7500 to 65535.

Bug #1254038.

Signed-off-by: Alex Wang <alexw@nicira.com>
9 years agoAUTHORS: Fix spelling of Anoob Soman's name.
Ben Pfaff [Tue, 20 May 2014 18:22:11 +0000 (11:22 -0700)]
AUTHORS: Fix spelling of Anoob Soman's name.

Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoacinclude.m4: Fix "sparse", via detection of GNU make "if" directive.
Ben Pfaff [Mon, 19 May 2014 14:52:21 +0000 (07:52 -0700)]
acinclude.m4: Fix "sparse", via detection of GNU make "if" directive.

Make treats tabs very differently from spaces at the beginning of a line,
so this test must use a tab instead of a space.  This partially reverts
commit a0903134d2d60 (acinclude.m4: Expand tabs).

Without this commit, the build system never enables checking with sparse
because it never detects that GNU make "if" works.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
9 years agoofproto-dpif: Install internal rule should not change the match content.
Andy Zhou [Sat, 10 May 2014 02:13:47 +0000 (19:13 -0700)]
ofproto-dpif: Install internal rule should not change the match content.

Without this patch, the match passed into to
ofproto_dpif_add_internal_flow() are modified. The mask of dl_type will
always be converted from wildcarded match into exact match due to
calling rule_dpif_lookup_in_table(). The fix makes sure
ofproto_dpif_add_internal_flow() does not change the original match,
and makes the match passed as const in the
ofproto_dpif_add_internal_flow() API.

This bug prevents bond module from properly tracking the post
recirculation rules installed in the internal table. The existing rule
is always deleted followed by reinstalling of the same rule.

The observable behavior of the bug is that bond module losses track
of the slave's stats, after the slave is rebalanced. Although traffic
flows through the slave just fine.

Bug #1229225

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoovs-atomic: Remove atomic_uint64_t and atomic_int64_t.
Simon Horman [Wed, 14 May 2014 07:19:35 +0000 (16:19 +0900)]
ovs-atomic: Remove atomic_uint64_t and atomic_int64_t.

Some concern has been raised by Ben Pfaff that atomic_uint64_t may not
be portable. In particular on 32bit platforms that do not have atomic
64bit integers.

Now that there are no longer any users of atomic_uint64_t remove it
entirely. Also remove atomic_int64_t which has no users.

Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto-dpif-upcall: Use atomic_long in struct udpif
Simon Horman [Wed, 14 May 2014 07:19:34 +0000 (16:19 +0900)]
ofproto-dpif-upcall: Use atomic_long in struct udpif

Some concern has been raised by Ben Pfaff that atomic_uint64_t may not
be portable. Accordingly, use atomic_ulong instead of atomic_uint64_t
in struct ofproto.

This is in preparation for removing atomic_uint64_t entirely.

Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
9 years agoofproto-dpif-upcall: Avoid use-after-free in revalidate() corner cases.
Ben Pfaff [Thu, 15 May 2014 22:52:17 +0000 (15:52 -0700)]
ofproto-dpif-upcall: Avoid use-after-free in revalidate() corner cases.

The loop in revalidate() needs to ensure that any data obtained from
dpif_flow_dump_next() is used before it is destroyed, as indicated by
dpif_flow_dump_next_may_destroy_keys().  In the common case, where
processing reaches the end of the main "while" loop, it does this, but
in two corner cases the code in the loop execute "continue;", which skipped
the check.  This commit fixes the problem.

Bug #1249988.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
9 years agodatapath: sample action without side effects
Simon Horman [Thu, 15 May 2014 00:05:03 +0000 (09:05 +0900)]
datapath: sample action without side effects

The sample action is rather generic, allowing arbitrary actions to be
executed based on a probability. However its use, within the Open vSwitch
code-base is limited: only a single user-space action is ever nested.

A consequence of the current implementation of sample actions is that
depending on weather the sample action executed (due to its probability)
any side-effects of nested actions may or may not be present before
executing subsequent actions.  This has the potential to complicate
verification of valid actions by the (kernel) datapath. And indeed adding
support for push and pop MPLS actions inside sample actions is one case
where such case.

In order to allow all supported actions to be continue to be nested inside
sample actions without the potential need for complex verification code
this patch changes the implementation of the sample action in the kernel
datapath so that sample actions are more like a function call and any side
effects of nested actions are not present when executing subsequent
actions.

With the above in mind the motivation for this change is twofold:

* To contain side-effects the sample action in the hope of making it
  easier to deal with in the future and;
* To avoid some rather complex verification code introduced in the MPLS
  datapath patch.

Some notes about the implementation:

* This patch silently changes the behaviour of sample actions whose nested
  actions have side-effects. There are no known users of such sample
  actions.

* sample() does not clone the skb for the only known use-case of the sample
  action: a single nested userspace action. In such a case a clone is not
  needed as the userspace action has no side effects.

  Given that there are no known users of other nested actions and in order
  to avoid the complexity of predicting if other sequences of actions have
  side-effects in such cases the skb is cloned.

* As sample() provides a cloned skb in the unlikely case where there are
  nested actions other than a single userspace action it is no longer
  necessary to clone the skb in do_execute_actions() when executing a
  recirculation action just because the keep_skb parameter is set: this
  parameter was only set when processing the nested actions of a sample
  action.  Moreover it is possible to remove the keep_skb parameter of
  do_execute_actions entirely.

* As sample() provides either a cloned skb or one that has had a
  reference taken (using keep_skb) to do_execute_actions()
  the original skb passed to sample() is never consumed. Thus the
  caller of sample() (also do_execute_actions()) can use its generic
  error handling to free the skb on error.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
9 years agoPrepare for 2.3.0.
Justin Pettit [Thu, 15 May 2014 21:09:39 +0000 (14:09 -0700)]
Prepare for 2.3.0.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
9 years agolib/classifier: Simpilify array ordering.
Jarno Rajahalme [Thu, 15 May 2014 02:53:51 +0000 (19:53 -0700)]
lib/classifier: Simpilify array ordering.

The terminology we used for subtable ordering ('splice', 'right
before') was inherited from an earlier use of a linked list, and
turned out to be confusing when applied to an array.  Also, we only
ever move one subtable earlier or later within the array, so we can
simplify the code as well.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
9 years agoovs-pki: Workaround lack of /dev/stdin in Windows.
Gurucharan Shetty [Mon, 12 May 2014 20:08:35 +0000 (13:08 -0700)]
ovs-pki: Workaround lack of /dev/stdin in Windows.

This lets us generate certs for unit tests on Windows

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
9 years agoodp-util: Fix a comment.
Joe Stringer [Mon, 12 May 2014 22:07:18 +0000 (10:07 +1200)]
odp-util: Fix a comment.

The comment was very specific to one user of the function, and had a
typo. This change reflects the wider effect of the case.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>