cascardo/linux.git
11 years agonet: fix psock_fanout selftest hash collision
Willem de Bruijn [Tue, 19 Mar 2013 20:42:44 +0000 (20:42 +0000)]
net: fix psock_fanout selftest hash collision

Fix flaky results with PACKET_FANOUT_HASH depending on whether the
two flows hash into the same packet socket or not.

Also adds tests for PACKET_FANOUT_LB and PACKET_FANOUT_CPU and
replaces the counting method with a packet ring.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: Get rid of compat defines in psock_fanout.c selftest.
David S. Miller [Tue, 19 Mar 2013 22:08:45 +0000 (18:08 -0400)]
net: Get rid of compat defines in psock_fanout.c selftest.

Reported-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: Fix failure string in net-socket selftests Makefile.
David S. Miller [Tue, 19 Mar 2013 21:05:50 +0000 (17:05 -0400)]
net: Fix failure string in net-socket selftests Makefile.

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agopacket: packet fanout rollover during socket overload
Willem de Bruijn [Tue, 19 Mar 2013 10:18:11 +0000 (10:18 +0000)]
packet: packet fanout rollover during socket overload

Changes:
  v3->v2: rebase (no other changes)
          passes selftest
  v2->v1: read f->num_members only once
          fix bug: test rollover mode + flag

Minimize packet drop in a fanout group. If one socket is full,
roll over packets to another from the group. Maintain flow
affinity during normal load using an rxhash fanout policy, while
dispersing unexpected traffic storms that hit a single cpu, such
as spoofed-source DoS flows. Rollover breaks affinity for flows
arriving at saturated sockets during those conditions.

The patch adds a fanout policy ROLLOVER that rotates between sockets,
filling each socket before moving to the next. It also adds a fanout
flag ROLLOVER. If passed along with any other fanout policy, the
primary policy is applied until the chosen socket is full. Then,
rollover selects another socket, to delay packet drop until the
entire system is saturated.
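
The selection logic described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `demo_sock`, `demo_fanout_select` and the fixed queue capacity are hypothetical names and assumptions, and the primary policy is reduced to an index supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet socket's receive queue. */
struct demo_sock {
    size_t queued;    /* packets currently queued */
    size_t capacity;  /* queue limit */
};

static bool demo_sock_has_room(const struct demo_sock *sk)
{
    return sk->queued < sk->capacity;
}

/*
 * Pick a socket for one packet. `primary` is the index chosen by the
 * primary policy (for example hash % num). If that socket is full and
 * the rollover flag is set, probe the other sockets, starting with
 * *last, the socket rollover used most recently.
 * Returns the chosen index, or -1 if the packet would be dropped.
 */
static int demo_fanout_select(const struct demo_sock *socks, size_t num,
                              size_t primary, bool rollover, size_t *last)
{
    assert(num > 0 && primary < num && *last < num);

    if (demo_sock_has_room(&socks[primary]))
        return (int)primary;
    if (!rollover)
        return -1;

    for (size_t n = 0; n < num; n++) {
        size_t i = (*last + n) % num;

        if (i != primary && demo_sock_has_room(&socks[i])) {
            *last = i;  /* greedy: try this socket first next time */
            return (int)i;
        }
    }
    return -1;  /* every socket in the group is full */
}
```

Starting the probe at the last successful socket is what makes the approach greedy: a packet is only dropped once all sockets in the group report a full queue.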

Probing sockets is not free. Selecting the last used socket, as
rollover does, is a greedy approach that maximizes chance of
success, at the cost of extreme load imbalance. In practice, with
sufficiently long queues to absorb bursts, sockets are drained in
parallel and load balance looks uniform in `top`.

To avoid contention, the patch scales counters with the number of
sockets and accesses them lock-free. Values are bounds checked to ensure
correctness.

Tested using an application with 9 threads pinned to CPUs, one socket
per thread and sufficient busywork per packet operation to limit each
thread to handling 32 Kpps. When sent 500 Kpps single UDP stream
packets, a FANOUT_CPU setup processes 32 Kpps in total without this
patch, 270 Kpps with the patch. Tested with read() and with a packet
ring (V1).

Also, passes psock_fanout.c unit test added to selftests.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: Add socket() system call self test.
David S. Miller [Tue, 19 Mar 2013 18:49:44 +0000 (14:49 -0400)]
net: Add socket() system call self test.

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm: use xfrm direction when lookup policy
Baker Zhang [Tue, 19 Mar 2013 04:24:30 +0000 (04:24 +0000)]
xfrm: use xfrm direction when lookup policy

Because the xfrm policy direction has the same value as the
corresponding flow direction, this problem has been masked so far.

In xfrm_lookup and __xfrm_policy_check, flow_cache_lookup is used to
accelerate the lookup.

Flow direction is given to flow_cache_lookup by policy_to_flow_dir.

When the flow cache lookup misses, the callback 'resolver' is called.

'resolver' requires xfrm direction,
so convert direction back to xfrm direction.
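
A small sketch of the round trip may help. The enums and helpers below are hypothetical demo names, and the flow direction values are deliberately offset so that a missing conversion becomes visible; according to the message above, the real values coincide, which is why the problem went unnoticed.

```c
#include <assert.h>

/* Demo values only: the two enums differ on purpose. */
enum demo_policy_dir { DEMO_POLICY_IN, DEMO_POLICY_OUT, DEMO_POLICY_FWD };
enum demo_flow_dir { DEMO_FLOW_IN = 10, DEMO_FLOW_OUT, DEMO_FLOW_FWD };

/* Conversion used when handing a direction to the flow cache. */
static enum demo_flow_dir demo_policy_to_flow_dir(enum demo_policy_dir dir)
{
    switch (dir) {
    case DEMO_POLICY_OUT:
        return DEMO_FLOW_OUT;
    case DEMO_POLICY_FWD:
        return DEMO_FLOW_FWD;
    case DEMO_POLICY_IN:
    default:
        return DEMO_FLOW_IN;
    }
}

/* Inverse conversion, needed before calling the resolver. */
static enum demo_policy_dir demo_flow_to_policy_dir(enum demo_flow_dir dir)
{
    switch (dir) {
    case DEMO_FLOW_OUT:
        return DEMO_POLICY_OUT;
    case DEMO_FLOW_FWD:
        return DEMO_POLICY_FWD;
    case DEMO_FLOW_IN:
    default:
        return DEMO_POLICY_IN;
    }
}

/* Resolver stand-in: returns 1 only for a valid policy direction. */
static int demo_resolver(int policy_dir)
{
    return policy_dir >= DEMO_POLICY_IN && policy_dir <= DEMO_POLICY_FWD;
}

/* Cache-miss path: convert back before invoking the resolver. */
static int demo_lookup_miss(enum demo_flow_dir flow_dir)
{
    return demo_resolver(demo_flow_to_policy_dir(flow_dir));
}
```

With the offset demo values, passing the flow direction straight to `demo_resolver()` fails, while the converted call succeeds.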

Signed-off-by: Baker Zhang <baker.zhang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/smsc911x: Use NULL instead of integer for pointer
Sachin Kamat [Mon, 18 Mar 2013 21:01:38 +0000 (21:01 +0000)]
net/smsc911x: Use NULL instead of integer for pointer

Silences the following sparse warning:
drivers/net/ethernet/smsc/smsc911x.c:2145:30:
warning: Using plain integer as NULL pointer

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomrf24j40: Fix byte-order of IEEE address
Alan Ott [Mon, 18 Mar 2013 12:06:43 +0000 (12:06 +0000)]
mrf24j40: Fix byte-order of IEEE address

Load the 64-bit Extended (IEEE) address into the hardware in the proper
byte order.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomrf24j40: Increase max SPI speed to 10MHz
Alan Ott [Mon, 18 Mar 2013 12:06:42 +0000 (12:06 +0000)]
mrf24j40: Increase max SPI speed to 10MHz

Upon consulting the datasheet further, it does indicate a maximum speed
for SCK at 10MHz.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomrf24j40: Warn if transmit interrupts timeout
Alan Ott [Mon, 18 Mar 2013 12:06:41 +0000 (12:06 +0000)]
mrf24j40: Warn if transmit interrupts timeout

Issue a warning if a transmit complete interrupt doesn't happen in time.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomrf24j40: pinctrl support
Alan Ott [Mon, 18 Mar 2013 12:06:40 +0000 (12:06 +0000)]
mrf24j40: pinctrl support

Activate pinctrl settings when used with a DT system.

Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocan: dump stack on protocol bugs
Oliver Hartkopp [Mon, 18 Mar 2013 07:52:06 +0000 (07:52 +0000)]
can: dump stack on protocol bugs

The rework of the kernel hlist implementation "hlist: drop the node parameter
from iterators" (b67bfe0d42cac56c512dd5da4b1b347a23f4b70a) created some
fallout in the form of non-matching comments and obsolete code.

In addition to the cleanup, this patch adds a WARN() statement to catch the
caller of the wrong filter removal request.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: add RSS capability for GRE traffic
Dmitry Kravkov [Mon, 18 Mar 2013 06:51:04 +0000 (06:51 +0000)]
bnx2x: add RSS capability for GRE traffic

The patch drives FW to perform RSS for GRE traffic,
based on inner headers.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: add CSUM and TSO support for encapsulation protocols
Dmitry Kravkov [Mon, 18 Mar 2013 06:51:03 +0000 (06:51 +0000)]
bnx2x: add CSUM and TSO support for encapsulation protocols

The patch utilizes FW offload capabilities for
encapsulation protocols.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: Fix a comment typo
Kusanagi Kouichi [Mon, 18 Mar 2013 02:59:52 +0000 (02:59 +0000)]
net: Fix a comment typo

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: ftgmac100: Use module_platform_driver()
Sachin Kamat [Mon, 18 Mar 2013 01:50:48 +0000 (01:50 +0000)]
net: ftgmac100: Use module_platform_driver()

module_platform_driver macro removes some boilerplate and makes
the code simpler.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Po-Yu Chuang <ratbert@faraday-tech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: ep93xx_eth: Use module_platform_driver()
Sachin Kamat [Mon, 18 Mar 2013 01:50:47 +0000 (01:50 +0000)]
net: ep93xx_eth: Use module_platform_driver()

module_platform_driver macro removes some boilerplate and makes
the code simpler.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: dm9000: Use module_platform_driver()
Sachin Kamat [Mon, 18 Mar 2013 01:50:46 +0000 (01:50 +0000)]
net: dm9000: Use module_platform_driver()

module_platform_driver macro removes some boilerplate and makes
the code simpler.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: neterion: replace ip_fast_csum with csum_replace2
Li RongQing [Sun, 17 Mar 2013 22:34:48 +0000 (22:34 +0000)]
net: neterion: replace ip_fast_csum with csum_replace2

replace ip_fast_csum with csum_replace2 to save cpu cycles
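
The saving comes from updating the checksum incrementally instead of summing the whole header again. The sketch below shows the idea in userspace C using the RFC 1624 update rule; the `demo_*` helpers are illustrative stand-ins, not the kernel functions, and they treat the header as an array of host-order 16-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit sum into 16 bits with end-around carry. */
static uint16_t demo_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full ones'-complement checksum over n 16-bit words. */
static uint16_t demo_csum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    assert(words);
    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~demo_fold(sum);
}

/*
 * Incremental update after one 16-bit word changes from old_val to
 * new_val, following RFC 1624: HC' = ~(~HC + ~m + m').
 */
static uint16_t demo_csum_replace2(uint16_t check, uint16_t old_val,
                                   uint16_t new_val)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~old_val;
    sum += new_val;
    return (uint16_t)~demo_fold(sum);
}
```

The full computation touches every word of the header, while the incremental form needs only the old checksum and the old and new values of the changed word.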

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotcp: Remove TCPCT
Christoph Paasch [Sun, 17 Mar 2013 08:23:34 +0000 (08:23 +0000)]
tcp: Remove TCPCT

TCPCT uses option-number 253, reserved for experimental use and should
not be used in production environments.
Further, TCPCT does not fully implement RFC 6013.

As a nice side-effect, removing TCPCT increases TCP's performance for
very short flows:

Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests
for files of 1KB size.

before this patch:
average (among 7 runs) of 20845.5 Requests/Second
after:
average (among 7 runs) of 21403.6 Requests/Second

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers: net: irda: use resource_size() in au1k_ir.c
Silviu-Mihai Popescu [Sat, 16 Mar 2013 21:03:46 +0000 (21:03 +0000)]
drivers: net: irda: use resource_size() in au1k_ir.c

This uses the resource_size() function instead of explicit computation.
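
For illustration, the explicit computation that such a helper replaces is the inclusive range length. The struct and function below are userspace stand-ins with hypothetical `demo_` names, assuming `end` is the last valid address of the region.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a resource range; `end` is inclusive. */
struct demo_resource {
    uint64_t start;
    uint64_t end;
};

/* Length of the inclusive range [start, end]. */
static uint64_t demo_resource_size(const struct demo_resource *res)
{
    assert(res && res->end >= res->start);
    return res->end - res->start + 1;
}
```

Keeping the `+ 1` in one helper avoids the off-by-one mistakes that open-coded `end - start` expressions invite.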

Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosfc: make local functions static
stephen hemminger [Sat, 16 Mar 2013 06:57:51 +0000 (06:57 +0000)]
sfc: make local functions static

Trivial sparse detected functions that should be static.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: fix some typos in netif features
Cong Wang [Sat, 16 Mar 2013 04:47:55 +0000 (04:47 +0000)]
net: fix some typos in netif features

Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
David S. Miller [Sun, 17 Mar 2013 16:58:47 +0000 (12:58 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch

Conflicts:
net/openvswitch/vport-internal_dev.c

Jesse Gross says:

====================
A couple of minor enhancements for net-next/3.10.  The largest is an
extension to allow variable length metadata to be passed to userspace
with packets.

There is a merge conflict in net/openvswitch/vport-internal_dev.c:
An existing commit modifies internal_dev_mac_addr() and a new commit
deletes it.  The new one is correct, so you can just remove that function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers:net: dma_alloc_coherent: use __GFP_ZERO instead of memset(, 0)
Joe Perches [Fri, 15 Mar 2013 07:23:58 +0000 (07:23 +0000)]
drivers:net: dma_alloc_coherent: use __GFP_ZERO instead of memset(, 0)

Reduce the number of calls required to alloc
a zeroed block of memory.

Trivially reduces overall object size.

Other changes around these removals
o Neaten call argument alignment
o Remove an unnecessary OOM message after dma_alloc_coherent failure
o Remove unnecessary gfp_t stack variable

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetpoll: use DEFINE_STATIC_SRCU() to define netpoll_srcu
Lai Jiangshan [Fri, 15 Mar 2013 06:50:52 +0000 (06:50 +0000)]
netpoll: use DEFINE_STATIC_SRCU() to define netpoll_srcu

DEFINE_STATIC_SRCU() defines the srcu struct and initializes it at build time.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: generalize forwarding tables
David Stevens [Fri, 15 Mar 2013 04:35:51 +0000 (04:35 +0000)]
vxlan: generalize forwarding tables

This patch generalizes VXLAN forwarding table entries allowing an administrator
to:
1) specify multiple destinations for a given MAC
2) specify alternate vni's in the VXLAN header
3) specify alternate destination UDP ports
4) use multicast MAC addresses as fdb lookup keys
5) specify multicast destinations
6) specify the outgoing interface for forwarded packets

The combination allows configuration of more complex topologies using VXLAN
encapsulation.

Changes since v1: rebase to 3.9.0-rc2

Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocaif: remove caif_shm
Erwan Yvin [Mon, 11 Mar 2013 03:13:03 +0000 (03:13 +0000)]
caif: remove caif_shm

caif_shm is an old implementation
caif_shm will be replaced by caif_virtio

[ As explained by Linus Walleij: "U5500 used this, but was cancelled
  and the silicon did not reach anyone outside ST-Ericsson.  Then for
  the next platforms, we have gone for the leaner & cleaner approach
  of using virtio, rpmesg and rproc." ]

Signed-off-by: Erwan Yvin <erwan.yvin@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Sjur Brendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv4: replace ip_fast_csum with csum_replace2
Li RongQing [Thu, 14 Mar 2013 22:50:18 +0000 (22:50 +0000)]
ipv4: replace ip_fast_csum with csum_replace2

replace ip_fast_csum with csum_replace2 to save cpu cycles

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodriver/qlogic: replace ip_fast_csum with csum_replace2
Li RongQing [Thu, 14 Mar 2013 22:50:07 +0000 (22:50 +0000)]
driver/qlogic: replace ip_fast_csum with csum_replace2

replace ip_fast_csum with csum_replace2 to save cpu cycles

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoDocumentation: fix neigh/default/gc_thresh1 default value.
Li RongQing [Thu, 14 Mar 2013 22:49:47 +0000 (22:49 +0000)]
Documentation: fix neigh/default/gc_thresh1 default value.

The default value is 128, not 256
#grep gc_thresh1 net/ -rI
net/decnet/dn_neigh.c: .gc_thresh1 = 128,
net/ipv6/ndisc.c: .gc_thresh1 =  128,
net/ipv4/arp.c: .gc_thresh1 = 128,

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers:net: Remove dma_alloc_coherent OOM messages
Joe Perches [Thu, 14 Mar 2013 13:07:21 +0000 (13:07 +0000)]
drivers:net: Remove dma_alloc_coherent OOM messages

I believe these error messages are already logged
on allocation failure by warn_alloc_failed and so
get a dump_stack on OOM.

Remove the unnecessary additional error logging.

Around these deletions:

o Alignment neatening.
o Remove unnecessary casts of dma_alloc_coherent.
o Hoist assigns from ifs.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: Use new F/W mailbox cmd to manipulate interrupts.
Somnath Kotur [Thu, 14 Mar 2013 02:42:07 +0000 (02:42 +0000)]
be2net: Use new F/W mailbox cmd to manipulate interrupts.

This is needed as the earlier method of manipulating this register via PCI
Config space is disallowed by certain Hypervisors.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: enable interrupts in be_probe() (RoCE and other ULPs need them)
Somnath Kotur [Thu, 14 Mar 2013 02:41:51 +0000 (02:41 +0000)]
be2net: enable interrupts in be_probe() (RoCE and other ULPs need them)

As the NIC PCI function may be used by other protocols, the chip interrupts
must be enabled in be_probe() itself rather than be_open().

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: clean leftover of COMPAT_NET_DEV_OPS removal
Fernando Luis Vazquez Cao [Wed, 13 Mar 2013 16:57:25 +0000 (16:57 +0000)]
net: clean leftover of COMPAT_NET_DEV_OPS removal

COMPAT_NET_DEV_OPS was removed a while back and with it the definition of
netdev_resync_ops() went away. Let's finish the clean-up.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoVSOCK: Support VM sockets connected to the hypervisor.
Reilly Grant [Thu, 14 Mar 2013 11:55:41 +0000 (11:55 +0000)]
VSOCK: Support VM sockets connected to the hypervisor.

The resource ID used for VM socket control packets (0) is already
used for the VMCI_GET_CONTEXT_ID hypercall so a new ID (15) must be
used when the guest sends these datagrams to the hypervisor.

The hypervisor context ID must also be removed from the internal
blacklist.

Signed-off-by: Reilly Grant <grantr@vmware.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetxen: write IP address to firmware when using bonding
nikolay@redhat.com [Tue, 12 Mar 2013 02:49:01 +0000 (02:49 +0000)]
netxen: write IP address to firmware when using bonding

This patch allows LRO aggregation on bonded devices that contain an
NX3031 device. It also adds a for_each_netdev_in_bond_rcu(bond, slave)
macro which executes for each slave that has bond as master.

V3: After testing and discussing this with Rajesh, I decided to keep the
    vlan ip cache and just rename it to ip_cache since it will store bond
    ip addresses too. A new master flag has been added to the ip cache to
    denote that the address has been added because of a master device.
    I've taken care of the enslave/release cases by checking for various
    combinations of events and flags (e.g. netxen has a master, it's a
    bond master and it's not marked as a slave means it is being enslaved
    and is dev_open()ed in bond_enslave).
    I've changed netxen_free_ip_list() to have a "master" parameter which
    causes all IP addresses marked as master to be deleted (used when a
    netxen is being released). I've made the patch use the new upper
    device API as well. The following cases were tested:
    - bond -> netxen
    - vlan -> netxen
    - vlan -> bond -> netxen

V2: Remove local ip caching, retrieve addresses dynamically and
    restore them if necessary.

Note: Tested with NX3031 adapter.

Tested-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Andy Gospodarek <agospoda@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Thu, 14 Mar 2013 15:47:15 +0000 (11:47 -0400)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- introduction of the new Network Coding component. This new mechanism aims to
  increase throughput by fusing multiple packets in one transmission.
- minor cleanups

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocsiostor: Cleanup chip specific operations.
Arvind Bhushan [Thu, 14 Mar 2013 05:09:08 +0000 (05:09 +0000)]
csiostor: Cleanup chip specific operations.

This patch removes chip specific operations from the common hardware
paths, and changes the Makefile to accommodate the new files.

Signed-off-by: Arvind Bhushan <arvindb@chelsio.com>
Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocsiostor: Header file modifications for chip support and bug fixes.
Arvind Bhushan [Thu, 14 Mar 2013 05:09:07 +0000 (05:09 +0000)]
csiostor: Header file modifications for chip support and bug fixes.

This patch defines the common operations to support multiple chips. It
includes common header file modifications to support the current chips
(T4 and T5). It also includes the following bug fixes:
- reconfirms the rnode state after an implicit logo.
- corrects the stats array size.
- sets up and checks flags correctly when coming up as master and finding
the card initialized

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Arvind Bhushan <arvindb@chelsio.com>
Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocsiostor: Add T5 adapter operations.
Arvind Bhushan [Thu, 14 Mar 2013 05:09:06 +0000 (05:09 +0000)]
csiostor: Add T5 adapter operations.

This patch creates a new file for T5 adapter operations.

Signed-off-by: Arvind Bhushan <arvindb@chelsio.com>
Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocsiostor: Segregate T4 adapter operations.
Arvind Bhushan [Thu, 14 Mar 2013 05:09:05 +0000 (05:09 +0000)]
csiostor: Segregate T4 adapter operations.

This patch separates T4 adapter operations into a new file.

Signed-off-by: Arvind Bhushan <arvindb@chelsio.com>
Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Fix onchip queue support for T5
Vipul Pandya [Thu, 14 Mar 2013 05:09:04 +0000 (05:09 +0000)]
RDMA/cxgb4: Fix onchip queue support for T5

T5 adapter does not support onchip queue memory. Present logic fails to
allocate QP for T5 and returns an error. Also, if module parameter ocqp_support
is zero then we are unable to allocate QP which should not be the case. Ideally
if ocqp_support parameter is 0 or onchip queue support is disabled then host QP
should be allocated before returning an error.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Bump tcam_full stat and WR reply timeout
Vipul Pandya [Thu, 14 Mar 2013 05:09:03 +0000 (05:09 +0000)]
RDMA/cxgb4: Bump tcam_full stat and WR reply timeout

Always bump the tcam_full stat. Also, bump wr reply timeout to 30 seconds.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Map pbl buffers for dma if using DSGL.
Vipul Pandya [Thu, 14 Mar 2013 05:09:02 +0000 (05:09 +0000)]
RDMA/cxgb4: Map pbl buffers for dma if using DSGL.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Use DSGLs for fastreg and adapter memory writes for T5.
Vipul Pandya [Thu, 14 Mar 2013 05:09:01 +0000 (05:09 +0000)]
RDMA/cxgb4: Use DSGLs for fastreg and adapter memory writes for T5.

It enables direct DMA by HW to memory region PBL arrays and fast register PBL
arrays from host memory, vs the T4 way of passing these arrays in the WR itself.
The result is lower latency for memory registration, and larger PBL array
support for fast register operations.

This patch also updates ULP_TX_MEM_WRITE command fields for T5. Ordering bit of
ULP_TX_MEM_WRITE is at bit position 22 in T5 and at 23 in T4.
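
A chip-dependent bit position like this is typically handled by selecting the shift at run time. The following is a minimal sketch under that assumption; `demo_chip`, `demo_order_bit_shift` and `demo_set_order_bit` are hypothetical names, not the driver's macros.

```c
#include <assert.h>
#include <stdint.h>

enum demo_chip { DEMO_CHIP_T4, DEMO_CHIP_T5 };

/* Ordering bit position: 23 on T4, 22 on T5 (per the text above). */
static unsigned int demo_order_bit_shift(enum demo_chip chip)
{
    assert(chip == DEMO_CHIP_T4 || chip == DEMO_CHIP_T5);
    return chip == DEMO_CHIP_T5 ? 22u : 23u;
}

/* Return cmd with the ordering bit set for the given chip. */
static uint32_t demo_set_order_bit(uint32_t cmd, enum demo_chip chip)
{
    return cmd | (UINT32_C(1) << demo_order_bit_shift(chip));
}
```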

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Add module_params to enable DB FC & Coalescing on T5
Vipul Pandya [Thu, 14 Mar 2013 05:09:00 +0000 (05:09 +0000)]
RDMA/cxgb4: Add module_params to enable DB FC & Coalescing on T5

Both DB Flow-Control and DB Coalescing are disabled by default on T5

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Turn off db coalescing when RDMA QPs are in use.
Vipul Pandya [Thu, 14 Mar 2013 05:08:59 +0000 (05:08 +0000)]
RDMA/cxgb4: Turn off db coalescing when RDMA QPs are in use.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRDMA/cxgb4: Add Support for Chelsio T5 adapter
Vipul Pandya [Thu, 14 Mar 2013 05:08:58 +0000 (05:08 +0000)]
RDMA/cxgb4: Add Support for Chelsio T5 adapter

Adds support for Chelsio T5 adapter.
Enables T5's Write Combining feature.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4vf: Add support for Chelsio T5 adapter
Santosh Rastapur [Thu, 14 Mar 2013 05:08:57 +0000 (05:08 +0000)]
cxgb4vf: Add support for Chelsio T5 adapter

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Disable SR-IOV support for PF4-7 for T5
Santosh Rastapur [Thu, 14 Mar 2013 05:08:56 +0000 (05:08 +0000)]
cxgb4: Disable SR-IOV support for PF4-7 for T5

All T5 adapters will only support VFs on PF0-3 despite the ability of the
hardware to support them on PF4-7.  This keeps our T4 and T5 adapters more
similar which simplifies host driver software.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Update driver version and description
Santosh Rastapur [Thu, 14 Mar 2013 05:08:55 +0000 (05:08 +0000)]
cxgb4: Update driver version and description

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Add T5 PCI ids
Santosh Rastapur [Thu, 14 Mar 2013 05:08:54 +0000 (05:08 +0000)]
cxgb4: Add T5 PCI ids

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Add T5 debugfs support
Santosh Rastapur [Thu, 14 Mar 2013 05:08:53 +0000 (05:08 +0000)]
cxgb4: Add T5 debugfs support

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Enable doorbell drop recovery only for T4 adapter
Santosh Rastapur [Thu, 14 Mar 2013 05:08:52 +0000 (05:08 +0000)]
cxgb4: Enable doorbell drop recovery only for T4 adapter

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Add T5 write combining support
Santosh Rastapur [Thu, 14 Mar 2013 05:08:51 +0000 (05:08 +0000)]
cxgb4: Add T5 write combining support

This patch implements a low latency Write Combining (aka Write Coalescing) work
request path. PCIE maps User Space Doorbell BAR2 region writes to the new
interface to SGE. SGE pulls a new message from PCIE new interface and if it's a
coalesced write work request then pushes it for processing. This patch copies
coalesced work request to memory mapped BAR2 space.

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Dump T5 registers
Santosh Rastapur [Thu, 14 Mar 2013 05:08:50 +0000 (05:08 +0000)]
cxgb4: Dump T5 registers

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Initialize T5
Santosh Rastapur [Thu, 14 Mar 2013 05:08:49 +0000 (05:08 +0000)]
cxgb4: Initialize T5

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Add macros, structures and inline functions for T5
Santosh Rastapur [Thu, 14 Mar 2013 05:08:48 +0000 (05:08 +0000)]
cxgb4: Add macros, structures and inline functions for T5

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agocxgb4: Add register definitions for T5
Santosh Rastapur [Thu, 14 Mar 2013 05:08:47 +0000 (05:08 +0000)]
cxgb4: Add register definitions for T5

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobatman-adv: network coding - receive coded packets and decode them
Martin Hundebøll [Fri, 25 Jan 2013 10:12:43 +0000 (11:12 +0100)]
batman-adv: network coding - receive coded packets and decode them

When receiving a network coded packet, the decoding buffer is searched
for a packet to use for decoding. The source, destination, and crc32 from
the coded packet are used to identify the wanted packet. The decoded
packet is passed to the usual unicast receiver function, as if it had
never been network coded.

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: network coding - save overheard and tx packets for decoding
Martin Hundebøll [Fri, 25 Jan 2013 10:12:42 +0000 (11:12 +0100)]
batman-adv: network coding - save overheard and tx packets for decoding

To be able to decode a network coded packet, a node must already know
one of the two coded packets. This is done by buffering skbs before
transmission and buffering packets sniffed with promiscuous mode from
other hosts.

Packets are kept in a buffer similar to the one with forward-skbs: A
hash table, where each entry, which corresponds to a src-dst pair, has a
linked list of packets.
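
The layout described above can be sketched as a small userspace data structure. All names here (`demo_buffer`, `demo_path`, `demo_packet`, the bucket count and the hash function) are assumptions for illustration; the real code stores skbs and uses the batman-adv hash helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ETH_ALEN 6
#define DEMO_BUCKETS 16

struct demo_packet {
    struct demo_packet *next;
    uint32_t id;                    /* stands in for the stored skb */
};

struct demo_path {                  /* one entry per src-dst pair */
    struct demo_path *next;         /* bucket chain */
    uint8_t src[DEMO_ETH_ALEN];
    uint8_t dst[DEMO_ETH_ALEN];
    struct demo_packet *packets;    /* linked list of packets */
};

struct demo_buffer {
    struct demo_path *buckets[DEMO_BUCKETS];
};

static unsigned int demo_hash(const uint8_t *src, const uint8_t *dst)
{
    unsigned int h = 0;

    for (int i = 0; i < DEMO_ETH_ALEN; i++)
        h = h * 31 + src[i];
    for (int i = 0; i < DEMO_ETH_ALEN; i++)
        h = h * 31 + dst[i];
    return h % DEMO_BUCKETS;
}

/* Find the entry for a src-dst pair, optionally creating it. */
static struct demo_path *demo_get_path(struct demo_buffer *buf,
                                       const uint8_t *src,
                                       const uint8_t *dst, int create)
{
    unsigned int b = demo_hash(src, dst);
    struct demo_path *p;

    assert(buf);
    for (p = buf->buckets[b]; p; p = p->next)
        if (!memcmp(p->src, src, DEMO_ETH_ALEN) &&
            !memcmp(p->dst, dst, DEMO_ETH_ALEN))
            return p;
    if (!create)
        return NULL;
    p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    memcpy(p->src, src, DEMO_ETH_ALEN);
    memcpy(p->dst, dst, DEMO_ETH_ALEN);
    p->next = buf->buckets[b];
    buf->buckets[b] = p;
    return p;
}

/* Store a packet on the list of its src-dst pair. 0 on success. */
static int demo_store_packet(struct demo_buffer *buf, const uint8_t *src,
                             const uint8_t *dst, uint32_t id)
{
    struct demo_path *path = demo_get_path(buf, src, dst, 1);
    struct demo_packet *pkt;

    if (!path)
        return -1;
    pkt = calloc(1, sizeof(*pkt));
    if (!pkt)
        return -1;
    pkt->id = id;
    pkt->next = path->packets;
    path->packets = pkt;
    return 0;
}

/* Count the packets buffered for a src-dst pair. */
static size_t demo_count_packets(struct demo_buffer *buf,
                                 const uint8_t *src, const uint8_t *dst)
{
    struct demo_path *path = demo_get_path(buf, src, dst, 0);
    size_t n = 0;

    for (struct demo_packet *p = path ? path->packets : NULL; p; p = p->next)
        n++;
    return n;
}
```

Note that the pair is directional in this sketch: packets from A to B and from B to A land in different entries. Freeing the entries is omitted for brevity.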

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: network coding - code and transmit packets if possible
Martin Hundebøll [Fri, 25 Jan 2013 10:12:41 +0000 (11:12 +0100)]
batman-adv: network coding - code and transmit packets if possible

Before adding forward-skbs to the coding buffer, the buffer is searched
for a potential coding opportunity. If one is found, the two packets are
network coded and transmitted right away. If not, the forward-skb is
added to the buffer.

Network coded packets are transmitted with information about the two
receivers and the two coded packets. The first receiver is given by the
MAC header, while the second is given in the payload/bat-header. The
second receiver uses promiscuous mode to receive the packet and check
the second destination.

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: network coding - buffer unicast packets before forward
Martin Hundebøll [Fri, 25 Jan 2013 10:12:40 +0000 (11:12 +0100)]
batman-adv: network coding - buffer unicast packets before forward

To be able to network code two packets, one packet must be buffered
until the next is available. This is done in a "coding buffer", which is
essentially a hash table with lists of packets. Each entry in the hash
table corresponds to a specific src-dst pair, which has a linked list of
packets that are buffered.

This patch adds skbs to the buffer just before forwarding them. The
buffer is traversed every 10 ms, and timed-out skbs are removed from the
buffer and transmitted. To allow experiments with the network coding
scheme, the timeout is tunable through a file in debugfs.
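
The periodic traversal can be sketched as a sweep over held packets. This is a simplified userspace model with hypothetical names: a flat array replaces the hash table, `sent[]` stands in for transmission, and the hold time is passed in as a parameter to mirror the tunable timeout. It assumes `now_ms` is never earlier than a packet's queue time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_held_pkt {
    uint32_t id;         /* stands in for the buffered skb */
    uint64_t queued_ms;  /* time the packet entered the buffer */
    bool in_use;
};

/*
 * One periodic pass over the buffer. Every in-use entry that has been
 * held for at least hold_ms is released: its id is appended to sent[]
 * and its slot is freed. Returns the number of packets released.
 */
static size_t demo_sweep(struct demo_held_pkt *slots, size_t n,
                         uint64_t now_ms, uint64_t hold_ms,
                         uint32_t *sent, size_t sent_cap)
{
    size_t released = 0;

    assert(slots && sent);
    for (size_t i = 0; i < n && released < sent_cap; i++) {
        if (!slots[i].in_use)
            continue;
        if (now_ms - slots[i].queued_ms < hold_ms)
            continue;  /* still waiting for a coding partner */
        sent[released++] = slots[i].id;
        slots[i].in_use = false;
    }
    return released;
}
```

A longer hold time gives packets more chances to find a coding partner at the cost of added forwarding delay, which is presumably why the timeout is exposed for experiments.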

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: network coding - detect coding nodes and remove these after timeout
Martin Hundebøll [Fri, 25 Jan 2013 10:12:39 +0000 (11:12 +0100)]
batman-adv: network coding - detect coding nodes and remove these after timeout

To use network coding efficiently, a relay must know when neighbor nodes
are likely to have enough information to be able to decode a network
coded packet. This is detected by using OGMs from batman-adv to discover
when one neighbor is in range of another neighbor. The relay checks the
TTL to detect when an OGM is forwarded from one neighbor by another
neighbor, and thereby knows that the two neighbors are in range and thus
overhear packets sent by each other.

This information is saved in the orig_node struct to be used when
searching for coding opportunities. Two lists are added to the
orig_node struct: One for neighbors that can hear the orig_node
(outgoing nc_nodes) and one for neighbors that the orig_node can hear
(incoming nc_nodes).

Information about nc_nodes is kept for 10 seconds and is available
through debugfs in batman_adv/nc_nodes to use when debugging network
coding.

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: network coding - add the initial infrastructure code
Martin Hundebøll [Fri, 25 Jan 2013 10:12:38 +0000 (11:12 +0100)]
batman-adv: network coding - add the initial infrastructure code

Network coding exploits the 802.11 shared medium to allow multiple
packets to be sent in a single transmission. In brief, a relay can XOR
two packets, and send the coded packet to two destinations. The
receivers can decode one of the original packets by XOR'ing the coded
packet with the other original packet. This will lead to increased
throughput in topologies where two packets cross one relay.
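
The XOR step can be sketched in a few lines of C. This is an illustration
only: nc_xor() is a hypothetical helper, not a function from the patch, and
it assumes both packets have the same length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: out = a XOR b, with all buffers len bytes long.
 * Coding and decoding are the same operation, because
 * (p1 ^ p2) ^ p1 == p2 and (p1 ^ p2) ^ p2 == p1. */
static void nc_xor(uint8_t *out, const uint8_t *a, const uint8_t *b,
		   size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		out[i] = a[i] ^ b[i];
}
```

The relay would call nc_xor(coded, p1, p2, len) once; a receiver that already
holds p1 recovers p2 with nc_xor(decoded, coded, p1, len).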

In a simple topology with three nodes, it takes four transmissions
without network coding to get one packet from Node A to Node B and one
from Node B to Node A:

 1.  Node A  ---- p1 --->  Node R                Node B
 2.  Node A                Node R  <--- p2 ----  Node B
 3.  Node A  <--- p2 ----  Node R                Node B
 4.  Node A                Node R  ---- p1 --->  Node B

With network coding, the relay only needs one transmission, which saves
us one slot of valuable airtime:

 1.  Node A  ---- p1 --->  Node R                Node B
 2.  Node A                Node R  <--- p2 ----  Node B
 3.  Node A  <- p1 x p2 -  Node R  - p1 x p2 ->  Node B

The same principle holds for a topology including five nodes. Here the
packets from Node A and Node B are overheard by Node C and Node D,
respectively. This allows Node R to send a network coded packet to save
one transmission:

   Node A                  Node B

    |     \              /    |
    |      p1          p2     |
    |       \          /      |
    p1       > Node R <       p2
    |                         |
    |         /      \        |
    |    p1 x p2    p1 x p2   |
    v       /          \      v
           /            \
   Node C <              > Node D

More information is available on the open-mesh.org wiki[1].

This patch adds the initial code to support network coding in
batman-adv. It sets up a worker thread to do housekeeping and adds a
sysfs file to enable/disable network coding. The feature is disabled by
default, as it requires a wifi-driver with working promiscuous mode, and
also because it adds a small delay at each hop.

[1] http://www.open-mesh.org/projects/batman-adv/wiki/Catwoman

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: don't use !! in bool conversion
Antonio Quartulli [Tue, 15 Jan 2013 12:17:19 +0000 (22:17 +1000)]
batman-adv: don't use !! in bool conversion

According to the C standard, any value different from 0 is converted to
'true' when cast to bool (whatever the width of the value).
Therefore all the "!!" conversions can be removed.
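
A small sketch of the difference between converting to bool and narrowing to
an integer type. The function names are hypothetical, and the second function
assumes an 8-bit unsigned char.

```c
#include <stdbool.h>

/* Conversion to bool yields 1 for any non-zero value, regardless of which
 * bit is set, so no "!!" is needed here. */
static bool flag_is_set(unsigned int flags, unsigned int mask)
{
	return flags & mask;
}

/* By contrast, narrowing to an integer type keeps only the low bits, so a
 * flag in bit 8 or above is lost. This is the case where "!!" matters. */
static unsigned char flag_truncated(unsigned int flags, unsigned int mask)
{
	return flags & mask;
}
```

With flags and mask both 0x100, flag_is_set() returns true while
flag_truncated() returns 0.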

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
11 years agobatman-adv: Return reason for failure in batadv_check_unicast_packet()
Martin Hundebøll [Sun, 13 Jan 2013 23:20:32 +0000 (00:20 +0100)]
batman-adv: Return reason for failure in batadv_check_unicast_packet()

batadv_check_unicast_packet() is changed to return a value based on the
reason to drop the packet, which will be useful information for
future users of batadv_check_unicast_packet().

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agobatman-adv: replace redundant primary_if_get calls
Marek Lindner [Sat, 12 Jan 2013 11:19:06 +0000 (19:19 +0800)]
batman-adv: replace redundant primary_if_get calls

The batadv_priv struct carries a pointer to its own interface
struct. Therefore, it is not necessary to retrieve the soft_iface
via the primary interface.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
11 years agotuntap: remove unused variable in __tun_detach()
Wei Yongjun [Wed, 13 Mar 2013 03:03:58 +0000 (03:03 +0000)]
tuntap: remove unused variable in __tun_detach()

The variable dev is initialized but never used
otherwise, so remove the unused variable.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosfc: remove duplicated include from efx.c
Wei Yongjun [Wed, 13 Mar 2013 03:02:20 +0000 (03:02 +0000)]
sfc: remove duplicated include from efx.c

Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Bump up the version to 5.1.37
Shahed Shaikh [Tue, 12 Mar 2013 09:02:17 +0000 (09:02 +0000)]
qlcnic: Bump up the version to 5.1.37

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Implement flash sysfs callback for 83xx adapter
Himanshu Madhani [Tue, 12 Mar 2013 09:02:16 +0000 (09:02 +0000)]
qlcnic: Implement flash sysfs callback for 83xx adapter

QLogic applications use these callbacks to perform

o  NIC Partitioning (NPAR) configuration and management
o  Diagnostic tests
o  Flash access and updates

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'cpsw'
David S. Miller [Wed, 13 Mar 2013 08:38:27 +0000 (04:38 -0400)]
Merge branch 'cpsw'

Mugunthan V N says:

====================
This patch series implements the following features in the CPSW driver
* get/set phy link settings
* interrupt pacing
* get phy id via ioctl cmd SIOCGMIIPHY

Changes from initial version
* Made active-slave common for cpts, ethtool and SIOCGMIIPHY ioctl
* Cleaned CPSW DT binding documentation by separating slave nodes
  under sub-section
* implemented get phy id via ioctl cmd SIOCGMIIPHY
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers: net: ethernet: cpsw: implement get phy_id via ioctl
Mugunthan V N [Mon, 11 Mar 2013 23:16:38 +0000 (23:16 +0000)]
drivers: net: ethernet: cpsw: implement get phy_id via ioctl

Implement get phy_id via ioctl SIOCGMIIPHY. In switch mode the active
phy_id is returned, and in dual EMAC mode the slave-specific phy_id is
returned.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodriver: net: ethernet: cpsw: implement interrupt pacing via ethtool
Mugunthan V N [Mon, 11 Mar 2013 23:16:37 +0000 (23:16 +0000)]
driver: net: ethernet: cpsw: implement interrupt pacing via ethtool

This patch implements support for the interrupt pacing block of CPSW via
ethtool. The interrupt pacing block is common to both Ethernet interfaces
in dual EMAC mode.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodriver: net: ethernet: cpsw: implement ethtool get/set phy setting
Mugunthan V N [Mon, 11 Mar 2013 23:16:36 +0000 (23:16 +0000)]
driver: net: ethernet: cpsw: implement ethtool get/set phy setting

This patch implements get/set of the phy settings via the ethtool APIs.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers: net: ethernet: cpsw: change cpts_active_slave to active_slave
Mugunthan V N [Mon, 11 Mar 2013 23:16:35 +0000 (23:16 +0000)]
drivers: net: ethernet: cpsw: change cpts_active_slave to active_slave

Change cpts_active_slave to active_slave so that the same DT property
can be used for ethtool and SIOCGMIIPHY.

CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodocumentation: dt: bindings: cpsw: cleanup documentation
Mugunthan V N [Mon, 11 Mar 2013 23:16:34 +0000 (23:16 +0000)]
documentation: dt: bindings: cpsw: cleanup documentation

Move all the slave node properties to a separate section to reduce the
confusion between slave node properties and cpsw node properties.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoks8851_mll: basic ethernet statistics
David J. Choi [Mon, 11 Mar 2013 16:22:54 +0000 (09:22 -0700)]
ks8851_mll: basic ethernet statistics

Collect basic Ethernet statistics on the ks8851_mll device.

Signed-off-by: David J. Choi <david.choi@micrel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodriver: isdn: hisax: remove cast for kmalloc/kzalloc return value
Zhang Yanfei [Mon, 11 Mar 2013 19:15:49 +0000 (19:15 +0000)]
driver: isdn: hisax: remove cast for kmalloc/kzalloc return value

Remove the cast on the kmalloc/kzalloc return value.
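
In C, a void pointer converts implicitly to any object pointer type, so the
cast adds nothing. A userspace sketch with malloc() standing in for
kmalloc(); struct channel and channel_alloc() are hypothetical names, not
taken from the hisax driver.

```c
#include <stdlib.h>

struct channel {
	int id;
};

static struct channel *channel_alloc(int id)
{
	/* No cast: the void * returned by malloc() converts implicitly. */
	struct channel *ch = malloc(sizeof(*ch));

	if (!ch)
		return NULL;
	ch->id = id;
	return ch;
}
```

Using sizeof(*ch) rather than sizeof(struct channel) also keeps the
allocation correct if the pointer's type changes later.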

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodriver: isdn: capi: remove cast for kmalloc return value
Zhang Yanfei [Mon, 11 Mar 2013 19:13:47 +0000 (19:13 +0000)]
driver: isdn: capi: remove cast for kmalloc return value

Remove the cast on the kmalloc return value.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomv643xx_eth with 88E1318S: support Wake on LAN
Michael Stapelberg [Mon, 11 Mar 2013 13:56:45 +0000 (13:56 +0000)]
mv643xx_eth with 88E1318S: support Wake on LAN

This has been tested on a qnap TS-119P II. Note that enabling WOL with
"ethtool -s eth0 wol g" is not enough; you also need to tell the PIC
microcontroller inside the qnap that WOL should be enabled by sending
0xF2 with qcontrol(1) and you have to disable EUP ("Energy-using
Products", a European power-saving thing) by sending 0xF4.

Signed-off-by: Michael Stapelberg <michael@stapelberg.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agophy: add set_wol/get_wol functions
Michael Stapelberg [Mon, 11 Mar 2013 13:56:44 +0000 (13:56 +0000)]
phy: add set_wol/get_wol functions

This allows ethernet drivers (such as the mv643xx_eth) to support
Wake on LAN on platforms where PHY registers have to be configured
for Wake on LAN (e.g. the Marvell Kirkwood based qnap TS-119P II).

Signed-off-by: Michael Stapelberg <michael@stapelberg.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotcp: TLP loss detection.
Nandita Dukkipati [Mon, 11 Mar 2013 10:00:44 +0000 (10:00 +0000)]
tcp: TLP loss detection.

This is the second of the TLP patch series; it augments the basic TLP
algorithm with a loss detection scheme.

This patch implements a mechanism for loss detection when a Tail
loss probe retransmission plugs a hole thereby masking packet loss
from the sender. The loss detection algorithm relies on counting
TLP dupacks as outlined in Sec. 3 of:
http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01

The basic idea is: the sender keeps track of a TLP "episode" upon
retransmission of a TLP packet. An episode ends when the sender receives
an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the
episode. We want to make sure that before the episode ends the sender
receives a "TLP dupack", indicating that the TLP retransmission was
unnecessary, so there was no loss/hole that needed plugging. If the
sender gets no TLP dupack before the end of the episode, then it reduces
ssthresh and the congestion window, because the TLP packet arriving at
the receiver probably plugged a hole.
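
The episode logic can be sketched as a small state machine. This is a
simplified model, not the patch itself: struct tlp_state and
tlp_process_ack() are hypothetical names, treating an ACK equal to
tlp_high_seq as the TLP dupack is an assumption, and halving the window is
only a placeholder for the real congestion window reduction.

```c
#include <stdbool.h>
#include <stdint.h>

struct tlp_state {
	uint32_t tlp_high_seq;	/* SND.NXT when the probe was sent; 0 = no episode */
	uint32_t cwnd;
	uint32_t ssthresh;
};

/* Wraparound-safe "a is after b" for 32-bit sequence numbers. */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Process one ACK. Returns true if the episode ended without a TLP dupack,
 * meaning the probe probably plugged a hole and the window was reduced. */
static bool tlp_process_ack(struct tlp_state *s, uint32_t ack, bool is_dupack)
{
	if (!s->tlp_high_seq)
		return false;		/* no episode in progress */

	if (is_dupack && ack == s->tlp_high_seq) {
		s->tlp_high_seq = 0;	/* probe was unnecessary, no loss */
		return false;
	}

	if (seq_after(ack, s->tlp_high_seq)) {
		s->tlp_high_seq = 0;	/* episode ends without a dupack */
		s->ssthresh = s->cwnd / 2 > 2 ? s->cwnd / 2 : 2;
		s->cwnd = s->ssthresh;	/* placeholder reduction */
		return true;
	}

	return false;			/* episode still open */
}
```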

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotcp: Tail loss probe (TLP)
Nandita Dukkipati [Mon, 11 Mar 2013 10:00:43 +0000 (10:00 +0000)]
tcp: Tail loss probe (TLP)

This patch series implements the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.

TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occurring due
to tail losses (losses at the end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.

PTO stands for probe timeout. It is a timer event that indicates
an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for the delayed
ACK timer when there is only one outstanding packet.

TLP Algorithm

On transmission of new data in Open state:
  -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
  -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
  -> PTO = min(PTO, RTO)

Conditions for scheduling PTO:
  -> Connection is in Open state.
  -> Connection is either cwnd limited or no new data to send.
  -> Number of probes per tail loss episode is limited to one.
  -> Connection is SACK enabled.

When PTO fires:
  new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

  no_new_packet:
    -> retransmit the last segment.
       Its ACK triggers FACK or early retransmit based recovery.

ACK path:
  -> rearm RTO at start of ACK processing.
  -> reschedule PTO if need be.
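
The PTO scheduling rule from the algorithm above can be written out as a
small function. This is a sketch in milliseconds, not the kernel code:
tlp_pto_ms() is a hypothetical name, and it assumes the smoothed RTT is used
wherever the listing says RTT or SRTT.

```c
/* Compute the probe timeout in milliseconds.
 * srtt_ms:      smoothed round-trip time
 * rto_ms:       current retransmission timeout
 * packets_out:  number of packets in flight */
static unsigned int tlp_pto_ms(unsigned int srtt_ms, unsigned int rto_ms,
			       unsigned int packets_out)
{
	unsigned int pto;

	if (packets_out == 1) {
		/* max(2*RTT, 1.5*RTT + 200ms): leave room for a delayed ACK */
		unsigned int two_rtt = 2 * srtt_ms;
		unsigned int delack = srtt_ms + srtt_ms / 2 + 200;

		pto = two_rtt > delack ? two_rtt : delack;
	} else {
		/* max(2*SRTT, 10ms) */
		pto = 2 * srtt_ms;
		if (pto < 10)
			pto = 10;
	}

	/* PTO = min(PTO, RTO) */
	return pto < rto_ms ? pto : rto_ms;
}
```

For example, with an SRTT of 100 ms and an RTO of 1000 ms, several packets in
flight give a PTO of 200 ms, while a single outstanding packet gives 350 ms.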

In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans:
  tcp_early_retrans == 0: disables TLP and ER.
  tcp_early_retrans == 1: enables RFC5827 ER.
  tcp_early_retrans == 2: delayed ER.
  tcp_early_retrans == 3: TLP and delayed ER. [DEFAULT]
  tcp_early_retrans == 4: TLP only.
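
The sysctl values listed above can be decoded into individual features as
follows. This is an illustrative mapping only; struct er_tlp_mode and
decode_early_retrans() are hypothetical names and do not appear in the patch.

```c
#include <stdbool.h>

struct er_tlp_mode {
	bool valid;		/* value is one of 0..4 */
	bool early_retrans;	/* RFC 5827 early retransmit */
	bool delayed_er;	/* early retransmit is delayed */
	bool tlp;		/* tail loss probe */
};

static struct er_tlp_mode decode_early_retrans(int value)
{
	struct er_tlp_mode m = { false, false, false, false };

	switch (value) {
	case 0:			/* TLP and ER disabled */
		m.valid = true;
		break;
	case 1:			/* RFC 5827 ER */
		m.valid = true;
		m.early_retrans = true;
		break;
	case 2:			/* delayed ER */
		m.valid = true;
		m.early_retrans = true;
		m.delayed_er = true;
		break;
	case 3:			/* TLP and delayed ER (default) */
		m.valid = true;
		m.early_retrans = true;
		m.delayed_er = true;
		m.tlp = true;
		break;
	case 4:			/* TLP only */
		m.valid = true;
		m.tlp = true;
		break;
	}
	return m;
}
```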

The TLP patch series has been extensively tested on Google Web servers.
It is most effective for short Web transactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agofec: Use devm_request_and_ioremap()
Fabio Estevam [Mon, 11 Mar 2013 07:32:55 +0000 (07:32 +0000)]
fec: Use devm_request_and_ioremap()

Using devm_request_and_ioremap() can make the code cleaner and simpler.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agofec: Remove unused pci header
Fabio Estevam [Mon, 11 Mar 2013 07:32:54 +0000 (07:32 +0000)]
fec: Remove unused pci header

PCI header is not needed, so get rid of it.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobridge: using for_each_set_bit to simplify the code
Wei Yongjun [Mon, 11 Mar 2013 05:45:23 +0000 (05:45 +0000)]
bridge: using for_each_set_bit to simplify the code

Use for_each_set_bit() to simplify the code.
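
For readers unfamiliar with the idiom, here is a userspace sketch of what
such an iterator does. It is a simplified single-word stand-in, not the
kernel macro (which takes a pointer to a bitmap); next_set_bit(),
for_each_set_bit_sketch() and sum_set_bit_positions() are hypothetical names.

```c
#include <limits.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return the first set bit at or after start, or nbits if there is none. */
static unsigned int next_set_bit(unsigned long word, unsigned int nbits,
				 unsigned int start)
{
	unsigned int bit;

	for (bit = start; bit < nbits; bit++)
		if (word & (1UL << bit))
			return bit;
	return nbits;
}

/* Visit every set bit of word in ascending order. */
#define for_each_set_bit_sketch(bit, word, nbits)		\
	for ((bit) = next_set_bit((word), (nbits), 0);		\
	     (bit) < (nbits);					\
	     (bit) = next_set_bit((word), (nbits), (bit) + 1))

/* Example user: add up the positions of all set bits. */
static unsigned int sum_set_bit_positions(unsigned long word)
{
	unsigned int bit, sum = 0;

	for_each_set_bit_sketch(bit, word, BITS_PER_WORD)
		sum += bit;
	return sum;
}
```

The loop body only ever sees set bits, which replaces an open-coded loop with
a test on each bit. Starting the search at a non-zero position gives the
behaviour of the "_from" variant.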

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobridge: using for_each_set_bit_from to simplify the code
Wei Yongjun [Mon, 11 Mar 2013 05:43:48 +0000 (05:43 +0000)]
bridge: using for_each_set_bit_from to simplify the code

Use for_each_set_bit_from() to simplify the code.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: use list_move instead of list_del/list_add
Wei Yongjun [Mon, 11 Mar 2013 04:40:14 +0000 (04:40 +0000)]
bnx2x: use list_move instead of list_del/list_add

Use list_move() instead of list_del() + list_add().
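
The equivalence is easy to see on a minimal circular doubly linked list. The
sketch below is a userspace model in the style of the kernel list API, not
the kernel implementation; the "_sketch" function names are hypothetical.

```c
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points to itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void list_add_sketch(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry from whatever list it is on. */
static void list_del_sketch(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Move entry from its current list to the front of another list. */
static void list_move_sketch(struct list_head *entry, struct list_head *head)
{
	list_del_sketch(entry);
	list_add_sketch(entry, head);
}
```

A single call states the intent (moving an element between lists) more
directly than the two-step delete and add.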

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: remove duplicated include from qlcnic_sysfs.c
Wei Yongjun [Mon, 11 Mar 2013 04:29:50 +0000 (04:29 +0000)]
qlcnic: remove duplicated include from qlcnic_sysfs.c

Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Restore FCoE 4-port devices support
Dmitry Kravkov [Mon, 11 Mar 2013 05:17:53 +0000 (05:17 +0000)]
bnx2x: Restore FCoE 4-port devices support

bnx2x FW 1.78.17 properly supports DCBX configuration for 4-port devices,
enabling FCoE support on 57840 boards.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: use FW 7.8.17
Dmitry Kravkov [Mon, 11 Mar 2013 05:17:52 +0000 (05:17 +0000)]
bnx2x: use FW 7.8.17

Update appropriate HSI files and adapt driver accordingly.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Avoid using zero MAC
Yuval Mintz [Mon, 11 Mar 2013 05:17:51 +0000 (05:17 +0000)]
bnx2x: Avoid using zero MAC

Prevent bnx2x devices which are used mainly for storage from using zero
MAC addresses as their primary MAC address.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Control SFP+ tap values via nvm config
Yaniv Rosner [Mon, 11 Mar 2013 05:17:50 +0000 (05:17 +0000)]
bnx2x: Control SFP+ tap values via nvm config

Configure SFP+ tap values to optimize link signal according to NVRAM setup.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Add EEE support for BCM84834
Yaniv Rosner [Mon, 11 Mar 2013 05:17:49 +0000 (05:17 +0000)]
bnx2x: Add EEE support for BCM84834

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Add RJ45 SFP module detection
Yaniv Rosner [Mon, 11 Mar 2013 05:17:48 +0000 (05:17 +0000)]
bnx2x: Add RJ45 SFP module detection

Add RJ45 SFP module detection. If the user sets a 10G link speed that the
module doesn't support, force the speed to 1G and notify the user.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Get gso_segs from FW
Yuval Mintz [Mon, 11 Mar 2013 05:17:47 +0000 (05:17 +0000)]
bnx2x: Get gso_segs from FW

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobnx2x: Control number of vfs dynamically
Ariel Elior [Mon, 11 Mar 2013 05:17:46 +0000 (05:17 +0000)]
bnx2x: Control number of vfs dynamically

1. Support sysfs interface for getting the maximal number of virtual functions
   of a given physical function.
2. Support sysfs interface for getting and setting the current number of
   virtual functions.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>