cascardo/linux.git
10 years agodt: Document a compatible entry for MDIO ethernet Phys
Jason Gunthorpe [Wed, 19 Mar 2014 22:15:23 +0000 (16:15 -0600)]
dt: Document a compatible entry for MDIO ethernet Phys

This describes a compatible entry of the form:
  ethernet-phy-idAAAA,BBBB
Which is modelled after the PCI structured compatible entry
(pciVVVV,DDDD.SSSS.ssss.RR)

If present the OF core will be able to use this information to
directly create the correct phy without auto probing the bus.
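
As a rough illustration of how such an entry could be decoded, the sketch below splits the string into its two 16-bit halves and joins them into one 32-bit PHY ID. It is written in Python for brevity and is not the OF core code. The function name, the reading of AAAA as the upper half and BBBB as the lower half, and the tolerance for a '.' separator are all assumptions.

```python
import re

# Form taken from the message above: ethernet-phy-idAAAA,BBBB
# Assumption: a '.' separator is tolerated as well as the ',' shown above.
_PHY_ID_RE = re.compile(r"^ethernet-phy-id([0-9a-fA-F]{4})[,.]([0-9a-fA-F]{4})$")


def phy_id_from_compatible(compat):
    """Return a 32-bit PHY ID for a structured compatible entry, else None.

    Assumption: AAAA holds the upper 16 bits and BBBB the lower 16 bits.
    """
    match = _PHY_ID_RE.match(compat)
    if match is None:
        return None  # not a structured entry; the caller would fall back to probing
    upper = int(match.group(1), 16)
    lower = int(match.group(2), 16)
    return (upper << 16) | lower
```

A string that does not match the structured form yields None, which stands in for "probe the bus as before".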

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'cdc-next'
David S. Miller [Thu, 20 Mar 2014 20:58:01 +0000 (16:58 -0400)]
Merge branch 'cdc-next'

Ben Chan says:

====================
Adjust MTU as indicated by MBIM extended functional descriptor.

The MBIM extended functional descriptor, defined in "Universal Serial Bus
Communications Class Subclass Specification for Mobile Broadband Interface
Model, Revision 1.0, Errata-1" by USB-IF, indicates the operator preferred MTU
value via a wMTU field.

This patch set ensures that the initial MTU value set by cdc_ncm on a MBIM net
device does not exceed the wMTU value, provided the MBIM device exposes a MBIM
extended functional descriptor.

* Changelog
v2: Fixed a le16_to_cpu conversion issue in patch 2/2 pointed out by
    Bjørn Mork <bjorn@mork.no>
v3: No code changes. Resubmitted to include patch 1/2 as suggested by
    David Miller <davem@davemloft.net>
v4: No code changes. Resubmitted as suggested by David Miller:
    - Added a summary of the patch set
    - Carried the ACK from Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    - Specified the tree (net-next) to apply the patch set to
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: cdc_ncm: respect operator preferred MTU reported by MBIM
Ben Chan [Wed, 19 Mar 2014 21:00:06 +0000 (14:00 -0700)]
net: cdc_ncm: respect operator preferred MTU reported by MBIM

According to "Universal Serial Bus Communications Class Subclass
Specification for Mobile Broadband Interface Model, Revision 1.0,
Errata-1" published by USB-IF, the wMTU field of the MBIM extended
functional descriptor indicates the operator preferred MTU for IP data
streams.

This patch modifies cdc_ncm_setup to ensure that the MTU value set on
the usbnet device does not exceed the operator preferred MTU indicated
by wMTU if the MBIM device exposes a MBIM extended functional
descriptor.
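
The rule described above amounts to a simple clamp. The following Python sketch models it outside the kernel; the function name and the convention of passing the raw two-byte wMTU field (or None when no extended descriptor is exposed) are assumptions made for illustration only.

```python
import struct


def effective_mtu(current_mtu, raw_wmtu=None):
    """Clamp an MTU to the operator preferred value, if one is reported.

    raw_wmtu is the wMTU field as two little-endian bytes, or None when the
    device exposes no MBIM extended functional descriptor (an assumption of
    this model, not a kernel interface).
    """
    if raw_wmtu is None:
        return current_mtu  # no descriptor: leave the MTU untouched
    (w_mtu,) = struct.unpack("<H", raw_wmtu)  # little-endian 16-bit to host integer
    return min(current_mtu, w_mtu)
```

The explicit little-endian unpack mirrors the le16_to_cpu conversion mentioned in the v2 changelog of the cover letter.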

Signed-off-by: Ben Chan <benchan@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoUSB: cdc: add MBIM extended functional descriptor structure
Ben Chan [Wed, 19 Mar 2014 21:00:05 +0000 (14:00 -0700)]
USB: cdc: add MBIM extended functional descriptor structure

This patch adds the MBIM extended functional descriptor structure
defined in "Universal Serial Bus Communications Class Subclass
Specification for Mobile Broadband Interface Model, Revision 1.0,
Errata-1" published by USB-IF.

Signed-off-by: Ben Chan <benchan@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'mlx4-next'
David S. Miller [Thu, 20 Mar 2014 20:19:41 +0000 (16:19 -0400)]
Merge branch 'mlx4-next'

Or Gerlitz says:

====================
mlx4: Add support for single port VFs

The mlx4 firmware and driver expose both ports of the device through one PCI function.

This can be non-optimal under virtualization schemes where the admin
would like the VF to expose one interface to the VM, etc.

This series from Matan Barak adds support for single ported VFs.

Since all VF interaction with the firmware passes through the PF
driver, we can present each VF with a single port, and further create
one set of VFs which act on port1 of the device and another set which
act on port2.

Series done against net-next commit 3ab428a "netfilter: Add missing
vmalloc.h include to nft_hash.c"

Roland, we are sending this through netdev, but if you have comments, we
would love to hear them.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4: Adapt num_vfs/probed_vf params for single port VF
Matan Barak [Wed, 19 Mar 2014 16:11:53 +0000 (18:11 +0200)]
net/mlx4: Adapt num_vfs/probed_vf params for single port VF

A new syntax is added for the module parameters num_vfs and probe_vf.

  num_vfs=p1,p2,p1+p2
  probe_vf=p1,p2,p1+p2

Where p1(2) is the number of VFs on / probed VFs for physical
port1(2) and p1+p2 is the number of dual port VFs.
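
To make the three-field syntax concrete, here is a small Python model of how such a value could be split. It is only a sketch: the function name, the dictionary keys, and the decision to reject anything other than exactly three fields are assumptions, and the driver's own parsing may differ.

```python
def parse_vf_param(value):
    """Split a 'p1,p2,dual' string into per-port and dual-port VF counts.

    Hypothetical helper: field 1 is the count for port1, field 2 the count
    for port2, field 3 the count of dual port VFs, as described above.
    """
    fields = value.split(",")
    if len(fields) != 3:
        raise ValueError("expected three comma-separated counts")
    port1, port2, dual = (int(field) for field in fields)
    if min(port1, port2, dual) < 0:
        raise ValueError("counts must not be negative")
    return {"port1": port1, "port2": port2, "dual": dual}
```

With this model, "2,3,1" describes two VFs on port1, three on port2 and one dual port VF.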

Single port VFs are currently supported only when the link type
for both ports of the device is Ethernet.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4: Adapt code for N-Port VF
Matan Barak [Wed, 19 Mar 2014 16:11:52 +0000 (18:11 +0200)]
net/mlx4: Adapt code for N-Port VF

Adds support for N-Port VFs. This includes:
1. Adding support in the wrapped FW command
   In wrapped commands, we need to verify and convert
   the slave's port into the real physical port.
   Furthermore, when sending the response back to the slave,
   a reverse conversion should be made.
2. Adjusting sqpn for QP1 para-virtualization
   The slave assumes that sqpn is used for QP1 communication.
   If the slave is assigned to a port != (first port), we need
   to adjust the sqpn that will direct its QP1 packets into the
   correct endpoint.
3. Adjusting gid[5] to modify the port for raw ethernet
   In B0 steering, gid[5] contains the port. It needs
   to be adjusted into the physical port.
4. Adjusting number of ports in the query / ports caps in the FW commands
   When a slave queries the hardware, it needs to view only
   the physical ports it's assigned to.
5. Adjusting the sched_qp according to the port number
   The QP port is encoded in the sched_qp, thus in modify_qp we need
   to encode the correct port in sched_qp.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4: Add utils for N-Port VFs
Matan Barak [Wed, 19 Mar 2014 16:11:51 +0000 (18:11 +0200)]
net/mlx4: Add utils for N-Port VFs

This patch adds the following utils:
1. Convert slave_id -> VF
2. Get the active ports by slave_id
3. Convert slave's port to real port
4. Get the slave's port from real port
5. Get all slaves that use the i'th real port
6. Get all slaves that use the i'th real port exclusively
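
The port conversions and slave lookups listed above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the mlx4 helpers themselves: all function names are hypothetical, ports are numbered from 1, and a slave's n'th port is assumed to map to the n'th lowest physical port it is assigned to.

```python
def slave_to_real_port(active_ports, slave_port):
    """Map a slave-relative port number (1-based) to a physical port."""
    ports = sorted(active_ports)
    if not 1 <= slave_port <= len(ports):
        raise ValueError("slave has no such port")
    return ports[slave_port - 1]


def real_to_slave_port(active_ports, real_port):
    """Reverse conversion: physical port back to the slave's own numbering."""
    ports = sorted(active_ports)
    if real_port not in ports:
        raise ValueError("slave is not assigned to this physical port")
    return ports.index(real_port) + 1


def slaves_using_port(slaves, real_port):
    """All slave ids whose active ports include real_port."""
    return sorted(s for s, ports in slaves.items() if real_port in ports)


def slaves_using_port_exclusively(slaves, real_port):
    """All slave ids whose only active port is real_port."""
    return sorted(s for s, ports in slaves.items() if set(ports) == {real_port})
```

Here `slaves` is a plain dict from slave id to the set of physical ports that slave may use, which is a simplification chosen for the example.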

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/mlx4: Add data structures to support N-Ports per VF
Matan Barak [Wed, 19 Mar 2014 16:11:50 +0000 (18:11 +0200)]
net/mlx4: Add data structures to support N-Ports per VF

Adds the required data structures to support VFs with N (1 or 2)
ports instead of always using the number of physical ports.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoIB/mlx4_ib: Adapt code to use caps.num_ports instead of a constant
Matan Barak [Wed, 19 Mar 2014 16:11:49 +0000 (18:11 +0200)]
IB/mlx4_ib: Adapt code to use caps.num_ports instead of a constant

Some code in the mlx4 IB driver stack assumed MLX4_MAX_PORTS ports.

Instead, we should only loop until the number of actual ports in
the device, which is stored in dev->caps.num_ports.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosmsc911x: Change clock warning message to debug level
Fabio Estevam [Wed, 19 Mar 2014 14:22:06 +0000 (11:22 -0300)]
smsc911x: Change clock warning message to debug level

Since passing the clock is not mandatory, change the warning message to debug,
so that we avoid getting the following clock failure message on every boot:

smsc911x: Driver version 2008-10-21
smsc911x smsc911x (unregistered net_device): couldn't get clock -2
libphy: smsc911x-mdio: probed

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: remove empty lines from tcp_syn_flood_action
Daniel Baluta [Wed, 19 Mar 2014 13:58:25 +0000 (15:58 +0200)]
net: remove empty lines from tcp_syn_flood_action

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Thu, 20 Mar 2014 18:19:45 +0000 (14:19 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to e100, igb, igbvf, ixgbe and ixgbevf.

Stefan adds an igb patch to enable stripping of VLAN header information
for packets bound for a VM on i350 hardware.

Joe Perches provides patches for e100, igb, igbvf, ixgbe and ixgbevf to
convert the use of __constant_<foo> to just <foo> to align with the rest
of the kernel.

Don provides two fixes for ixgbe.  The first resolves a link issue with DA
cables where we were not always freeing the firmware/software semaphore
after grabbing it.  The second stops caching whether the management firmware
was enabled; since this is not static, we really need to verify
with each check.

Jacob provides six fixes/cleanups for ixgbe.  Most notably, he corrects
stop_mac_link_on_d3() to check the Core Clock Disable bit before
stopping link and to fully check whether management firmware is running or
could be enabled before bringing down the link.  He also fixes flow control
autonegotiation for KR/KX/K4 interfaces: when setting up MAC link, the
cached autoc value and current autoc value were being incorrectly used to
determine whether link reset is required.

Emil provides a fix for ixgbe where there was a chance for aggressive
start_ndo_xmit() callers to sneak packets between enabling the Tx queues
and the link coming up.  To resolve this, move the call to enable Tx
queues to after the link is established.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoixgbe: enable tx queues after link up
Emil Tantilov [Thu, 20 Mar 2014 03:47:53 +0000 (03:47 +0000)]
ixgbe: enable tx queues after link up

This patch moves the call to enable Tx queues after the link is established.
Previously there was a chance for aggressive start_ndo_xmit() callers to
sneak packets between enabling the Tx queues and the link coming up.

In addition it replaces netif_tx_start_all_queues() with
netif_tx_wake_all_queues() to allow for flushing of the qdisc.

CC: Arun Sharma <asharma@fb.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: Stop cacheing if the MNG FW enabled
Don Skidmore [Thu, 27 Feb 2014 09:03:30 +0000 (09:03 +0000)]
ixgbe: Stop cacheing if the MNG FW enabled

We used to cache whether the MNG FW was enabled; however, since this isn't
static we really need to verify with each check.  This patch makes that
change.

CC: Arun Sharma <asharma@fb.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: clean up ixgbe_atr_compute_perfect_hash_82599
Jacob Keller [Sat, 22 Feb 2014 01:23:59 +0000 (01:23 +0000)]
ixgbe: clean up ixgbe_atr_compute_perfect_hash_82599

Rather than assign several parameters in a row, we should use a for
loop, which reduces code size.

CC: Arun Sharma <asharma@fb.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: use ixgbe_read_pci_cfg_word
Jacob Keller [Sat, 22 Feb 2014 01:23:58 +0000 (01:23 +0000)]
ixgbe: use ixgbe_read_pci_cfg_word

This patch replaces some direct uses of pci_read_config_word with the
protected ixgbe_read_pci_cfg_word function, which checks for whether the
adapter is removed when LER is enabled. We shouldn't use the
pci_read_config_word calls directly because of these checks.

This patch also cleans up an unnecessary save of a pointer to the mac
object, as our standard style is to just use the hw pointer.

CC: Arun Sharma <asharma@fb.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: remove unused media type
Jacob Keller [Sat, 22 Feb 2014 01:23:57 +0000 (01:23 +0000)]
ixgbe: remove unused media type

This patch reverts the addition of the fiber_fixed type, which ended up
never being used. We don't have plans to support this type going
forward, and there is no reason to keep an unused type around polluting
the code.

Reverts: 4e8e1bca6e2 ("ixgbe: add new media type")
CC: Arun Sharma <asharma@fb.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: fix ixgbe_setup_mac_link_82599 autoc variables
Jacob Keller [Sat, 22 Feb 2014 01:23:56 +0000 (01:23 +0000)]
ixgbe: fix ixgbe_setup_mac_link_82599 autoc variables

This patch fixes flow control autonegotiation for KR/KX/K4 interfaces.
When setting up MAC link, the cached autoc value and current autoc value
were being incorrectly used to determine whether link reset is required.
This resulted in the driver ignoring and discarding flow control
negotiation changes that occur since the caching happened, as well as
when the mac was being setup.

This patch also splits the assignments for the 3 autoc variables into
their own block, and adds a comment explaining what each one means, in
order to help keep logic more straightforward while reading the code.

CC: Arun Sharma <asharma@fb.com>
Reported-by: Sourav Chatterjee <sourav.chatterjee@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: fix ixgbe_stop_mac_link_on_d3_82599 to check mng correctly
Jacob Keller [Sat, 22 Feb 2014 01:23:55 +0000 (01:23 +0000)]
ixgbe: fix ixgbe_stop_mac_link_on_d3_82599 to check mng correctly

Previously, we did a full check to see if MNG FW was running. Instead,
we should only check to see whether it could be enabled. Since it may
become active while down, we don't want to bring the link down.

CC: Arun Sharma <asharma@fb.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: check Core Clock Disable bit
Jacob Keller [Sat, 22 Feb 2014 01:23:54 +0000 (01:23 +0000)]
ixgbe: check Core Clock Disable bit

This patch corrects the stop_mac_link_on_d3 function in ixgbe_82599 by
checking the Core Clock Disable bit before stopping link.

CC: Arun Sharma <asharma@fb.com>
Reported-by: Chris Pavlas <chris.pavlas@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: fix errors related to protected AUTOC calls
Don Skidmore [Wed, 19 Mar 2014 09:16:26 +0000 (09:16 +0000)]
ixgbe: fix errors related to protected AUTOC calls

Found several incorrect conditionals after calling the prot_autoc_*
functions. Likewise we weren't always freeing the FWSW semaphore after
grabbing it. This would lead to DA cables being unable to link, along with
possibly other errors.

CC: Arun Sharma <asharma@fb.com>
CC: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbevf: Convert uses of __constant_<foo> to <foo>
Joe Perches [Thu, 13 Mar 2014 05:19:30 +0000 (05:19 +0000)]
ixgbevf: Convert uses of __constant_<foo> to <foo>

The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: Convert uses of __constant_<foo> to <foo>
Joe Perches [Thu, 13 Mar 2014 05:19:25 +0000 (05:19 +0000)]
ixgbe: Convert uses of __constant_<foo> to <foo>

The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoigbvf: Convert uses of __constant_<foo> to <foo>
Joe Perches [Thu, 13 Mar 2014 05:19:19 +0000 (05:19 +0000)]
igbvf: Convert uses of __constant_<foo> to <foo>

The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoigb: Convert uses of __constant_<foo> to <foo>
Joe Perches [Thu, 13 Mar 2014 05:19:14 +0000 (05:19 +0000)]
igb: Convert uses of __constant_<foo> to <foo>

The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoe100: Convert uses of __constant_<foo> to <foo>
Joe Perches [Thu, 13 Mar 2014 05:19:09 +0000 (05:19 +0000)]
e100: Convert uses of __constant_<foo> to <foo>

The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoigb: enable VLAN stripping for VMs with i350
Stefan Assmann [Wed, 11 Dec 2013 22:10:12 +0000 (22:10 +0000)]
igb: enable VLAN stripping for VMs with i350

For i350, VLAN stripping for VMs is not enabled in the VMOLR register but in
the DVMOLR register, so make the changes accordingly. It's not necessary to
unset the E1000_VMOLR_STRVLAN bit on i350 as the hardware will simply ignore
it.

Without this change, if a VLAN is configured for a VF assigned to a guest
via, e.g.,
ip link set p1p1 vf 0 vlan 10
the VLAN tag will not be stripped from packets going into the VM, which it
should be because the VM itself is not aware of the VLAN at all.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoqeth: Fix IP version detection for VLAN traffic
Stefan Raspl [Wed, 19 Mar 2014 06:58:02 +0000 (07:58 +0100)]
qeth: Fix IP version detection for VLAN traffic

The current code would always return 0 for VLAN-encapsulated IP traffic.
One notable side effect was that VLAN traffic would never get prioritized
on OSD and OSX devices when priority queueing modes prio_queueing_tos or
prio_queueing_prec were enabled.
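
The idea behind the fix can be shown with a small frame parser: when the outer EtherType is the 802.1Q tag, the IP version has to be read from the EtherType that follows the 4-byte VLAN header. The Python sketch below is a stand-alone model, not the qeth code; the function name and the "return 0 for anything that is not IP" convention follow the description above, while everything else is an assumption.

```python
import struct

ETH_P_IP = 0x0800     # IPv4 EtherType
ETH_P_IPV6 = 0x86DD   # IPv6 EtherType
ETH_P_8021Q = 0x8100  # 802.1Q VLAN tag protocol identifier
ETH_HLEN = 14         # destination MAC + source MAC + EtherType
VLAN_HLEN = 4         # tag protocol identifier + tag control information


def ip_version(frame):
    """Return 4 or 6 for IP traffic, including VLAN-tagged frames, else 0."""
    if len(frame) < ETH_HLEN:
        return 0
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETH_P_8021Q:
        # Skip the VLAN tag and read the encapsulated EtherType instead.
        if len(frame) < ETH_HLEN + VLAN_HLEN:
            return 0
        (ethertype,) = struct.unpack_from("!H", frame, 12 + VLAN_HLEN)
    if ethertype == ETH_P_IP:
        return 4
    if ethertype == ETH_P_IPV6:
        return 6
    return 0
```

Without the VLAN branch, every tagged frame would fall through to the final return, which matches the behaviour described above.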

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqeth: Removed unused parameter
Stefan Raspl [Wed, 19 Mar 2014 06:58:01 +0000 (07:58 +0100)]
qeth: Removed unused parameter

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoqeth: make qeth_query_card_info_cb() static
Heiko Carstens [Wed, 19 Mar 2014 06:58:00 +0000 (07:58 +0100)]
qeth: make qeth_query_card_info_cb() static

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoaf_iucv: recvmsg problem for SOCK_STREAM sockets
Ursula Braun [Wed, 19 Mar 2014 06:57:59 +0000 (07:57 +0100)]
af_iucv: recvmsg problem for SOCK_STREAM sockets

Commit f9c41a62bba3f3f7ef3541b2a025e3371bcbba97 introduced
a problem for SOCK_STREAM sockets, when only part of the
incoming iucv message is received by user space. In this
case the remaining data of the iucv message is lost.
This patch makes sure an incompletely received iucv message
is queued back to the receive queue.
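
A toy model of the intended behaviour, in Python: when user space asks for fewer bytes than the head message holds, the remainder goes back to the front of the receive queue instead of being dropped. The function name and the use of a deque of byte strings are assumptions for the sake of the example; the real code works on socket buffers.

```python
from collections import deque


def recv_stream(queue, size):
    """Read up to size bytes from the head message of a stream receive queue."""
    if not queue:
        return b""
    message = queue.popleft()
    data, rest = message[:size], message[size:]
    if rest:
        # Partially consumed message: queue the remainder back at the head
        # so the next read continues where this one stopped.
        queue.appendleft(rest)
    return data
```

Reading 3 bytes from a queue holding b"hello" and b"world" returns b"hel" and leaves b"lo" at the head, ahead of b"world".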

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reported-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Thu, 20 Mar 2014 03:55:48 +0000 (23:55 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e, i40evf, e1000e, ixgbe and ixgbevf.

Mitch adds support for the VF link state ndo which allows the PF driver
to control the virtual link state of the VF devices.  Added
support for viewing and modifying RSS hash options and RSS hash look-up
table programming through ethtool for i40evf.  Fixed complaint about
the use of min() where min_t() should be used in i40evf.

Anjali adds support for ethtool -k option for NTUPLE control for i40e.

Elizabeth cleans up and refactors i40e_open() to separate out the VSI
code into its own i40e_vsi_open().

Jesse enables the hardware head write back feature in i40e and i40evf: rather
than updating the descriptor ring by marking each descriptor with a DD bit,
the hardware writes a memory location with an update to where the driver
should clean up to.  He also reduces context descriptors for i40e/i40evf,
since we do not need context descriptors for every packet, only for
TSO or timesync.

Dan Carpenter fixes a potential array underflow in i40e_vc_process_vf_msg().

Dave fixes an e1000e hardware unit hang where the check for pending Tx work
when link is lost was mistakenly moved to be done only when link is first
detected to be lost.  Fixed a problem with poor network performance on
certain silicon in e1000e when configured for 100M HDX performance.

Carolyn adds register defines needed for time sync functions and the code
to call the updated defines.

Jacob adds the ixgbe function for writing PCI config word and checks
whether the adapter has been removed first.

Mark adds the bit __IXGBEVF_REMOVING to indicate that the module is being
removed.  The __IXGBEVF_DOWN bit had been overloaded for this
purpose, but that leads to trouble.  The ixgbevf_down function can now
prevent multiple executions by doing test_and_set_bit on __IXGBEVF_DOWN.

v2:
- dropped Mitch's patch "i40evf: Support RSS option in ethtool"
  based on feedback from Ben Hutchings so that Mitch can re-work the
  patch solution

v3:
- removed unnecessary parentheses in patch 1 based on feedback from David
  Miller
- changed a macro to get the next queue to a function in patch 2 based on
  feedback from David Miller
- added blank lines after variable declaration and code in two functions
  in patch 6 based on feedback from David Miller
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoixgbevf: Protect ixgbevf_down with __IXGBEVF_DOWN bit
Mark Rustad [Tue, 4 Mar 2014 03:02:18 +0000 (03:02 +0000)]
ixgbevf: Protect ixgbevf_down with __IXGBEVF_DOWN bit

The ixgbevf_down function can now prevent multiple executions by
doing test_and_set_bit on __IXGBEVF_DOWN. This did not work before
introduction of the __IXGBEVF_REMOVING bit, because of overloading
of __IXGBEVF_DOWN. Also add smp_mb__before_clear_bit call before
clearing the __IXGBEVF_DOWN bit.
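
The guard pattern described above can be sketched as follows. This Python model is single-threaded and therefore not atomic, unlike the kernel's test_and_set_bit; the class, its fields and the teardown counter are hypothetical and exist only to show that a second call becomes a no-op.

```python
class AdapterState:
    """Toy stand-in for the adapter's state bits (names are illustrative)."""

    DOWN = 0
    REMOVING = 1

    def __init__(self):
        self.bits = 0
        self.teardowns = 0  # counts how often the teardown body actually ran

    def test_and_set_bit(self, bit):
        """Set the bit and return whether it was already set (not atomic here)."""
        was_set = bool(self.bits & (1 << bit))
        self.bits |= 1 << bit
        return was_set

    def clear_bit(self, bit):
        self.bits &= ~(1 << bit)


def down(state):
    """Run the teardown only if the DOWN bit was not already set."""
    if state.test_and_set_bit(AdapterState.DOWN):
        return False  # already down: nothing to do
    state.teardowns += 1
    return True
```

Because removal now has its own bit, DOWN can serve purely as this "already down" marker.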

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbevf: Indicate removal state explicitly
Mark Rustad [Tue, 4 Mar 2014 03:02:13 +0000 (03:02 +0000)]
ixgbevf: Indicate removal state explicitly

Add a bit, __IXGBEVF_REMOVING, to indicate that the module is being
removed. The __IXGBEVF_DOWN bit had been overloaded for this purpose,
but that leads to trouble. A few places now check both __IXGBEVF_DOWN
and __IXGBEVF_REMOVING.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoixgbe: add ixgbe_write_pci_cfg_word with ixgbe_removed check
Jacob Keller [Sat, 22 Feb 2014 01:23:53 +0000 (01:23 +0000)]
ixgbe: add ixgbe_write_pci_cfg_word with ixgbe_removed check

In line with the current use of ixgbe_read_pci_cfg_word, create a
similar function for writing PCI config, which checks whether the
adapter has been removed first, if Live Error Recovery has been enabled.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoigb: Add register defines needed for time sync functions
Carolyn Wyborny [Tue, 11 Mar 2014 06:15:37 +0000 (06:15 +0000)]
igb: Add register defines needed for time sync functions

This patch adds defines needed for implementing the auxiliary time sync
functions and also changes code to call the updated defines instead of
the old.

Reported-by: Richard Cochran <ricahrdcochran@gmail.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoe1000e: Fix Explicitly set Transmit Control Register
David Ertman [Mon, 13 Jan 2014 23:19:27 +0000 (23:19 +0000)]
e1000e: Fix Explicitly set Transmit Control Register

This patch causes the TCTL to be explicitly set to fix a problem with
poor network performance (throughput) on certain silicon when configured
for 100M HDX performance.

Cc: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Acked-by: Bruce W. Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoe1000e: Fix Hardware Unit Hang
David Ertman [Wed, 8 Jan 2014 01:07:55 +0000 (01:07 +0000)]
e1000e: Fix Hardware Unit Hang

The check for pending Tx work when link is lost was mistakenly moved to be
done only when link is first detected to be lost.  It turns out there is a
small window of opportunity for additional Tx work to get queued up shortly
after link is dropped.

Move the check back to the place it was before in the watchdog task.  Put in
additional debug information for other reset paths and a final catch-all for
false hangs in the scheduled function that prints out the hardware hang
message.

Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e/i40evf: Bump build versions
Catherine Sullivan [Fri, 14 Feb 2014 02:14:42 +0000 (02:14 +0000)]
i40e/i40evf: Bump build versions

Bump to version 0.3.36 for i40e and 0.9.16 for i40evf.

Change-ID: I7b4ff97b32d2825181803c03c316381a7608a618
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: potential array underflow in i40e_vc_process_vf_msg()
Dan Carpenter [Wed, 15 Jan 2014 06:43:39 +0000 (06:43 +0000)]
i40e: potential array underflow in i40e_vc_process_vf_msg()

If "vf_id" is smaller than hw->func_caps.vf_base_id then it leads to
an array underflow of the pf->vf[] array.  This is unlikely to happen
unless the hardware is bad, but it's a small change and it silences a
static checker warning.
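
The check being added boils down to validating the index before using it. A minimal Python model, with a hypothetical function name; the upper-bound test against num_vfs is an extra assumption of this sketch and is not stated in the message above.

```python
def local_vf_index(vf_id, vf_base_id, num_vfs):
    """Translate an absolute VF id into an index into the per-PF VF array.

    Returns None when the id falls below the base (the underflow case
    described above) or, as an added assumption, beyond the array.
    """
    local = vf_id - vf_base_id
    if local < 0 or local >= num_vfs:
        return None
    return local
```

For example, with a base id of 32 and four VFs, id 34 maps to index 2 while id 30 is rejected rather than indexing before the start of the array.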

Fixes: 7efa84b7abc1 ('i40e: support VFs on PFs other than 0')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e/i40evf: reduce context descriptors
Jesse Brandeburg [Fri, 14 Feb 2014 02:14:41 +0000 (02:14 +0000)]
i40e/i40evf: reduce context descriptors

We don't need context descriptors for every packet, only tso
or timesync.  This fixes a bug in the driver where it would
always add a context even if all the passed in values
to the context descriptor function were 0/default values.

Change-ID: I0101d2b893380707b5c2de61aab3e16d4310e9a1
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e/i40evf: enable hardware feature head write back
Jesse Brandeburg [Fri, 14 Feb 2014 02:14:40 +0000 (02:14 +0000)]
i40e/i40evf: enable hardware feature head write back

The hardware supports a feature to avoid updating the descriptor
ring by marking each descriptor with a DD bit, and instead
writes a memory location with an update to where the driver
should clean up to.  Enable this feature.
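
From the driver's point of view, cleaning with head write back means walking the ring from the last cleaned position up to the head value the hardware wrote to memory, without inspecting a DD bit per descriptor. The Python sketch below models that walk. The function name and the convention that head is the index one past the last completed descriptor are assumptions.

```python
def clean_tx_ring(ring_size, next_to_clean, head):
    """Return the descriptor indices to clean and the new next_to_clean.

    head is the value read from the write back location; it is assumed to
    point one past the last descriptor the hardware has finished with.
    """
    if not (0 <= next_to_clean < ring_size and 0 <= head < ring_size):
        raise ValueError("index outside the ring")
    cleaned = []
    index = next_to_clean
    while index != head:
        cleaned.append(index)
        index = (index + 1) % ring_size  # wrap around at the end of the ring
    return cleaned, index
```

On an 8-entry ring with next_to_clean at 6 and head at 2, descriptors 6, 7, 0 and 1 are cleaned.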

Change-ID: I5da4e0681f0b581a6401c950a81808792267fe57
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Refactor and cleanup i40e_open(), adding i40e_vsi_open()
Elizabeth Kappler [Sat, 15 Feb 2014 07:41:38 +0000 (07:41 +0000)]
i40e: Refactor and cleanup i40e_open(), adding i40e_vsi_open()

This patch cleans up and moves a portion of i40e_open to i40e_vsi_open,
in order to have a shorter vsi_open function that does only that.

Change-ID: I1c418dda94dcfc0eb7d4386a70c330692ef5ecc9
Signed-off-by: Elizabeth Kappler <elizabeth.m.kappler@intel.com>
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40e: Patch to enable Ethtool/netdev feature flag for NTUPLE control
Anjali Singhai Jain [Fri, 14 Feb 2014 02:14:38 +0000 (02:14 +0000)]
i40e: Patch to enable Ethtool/netdev feature flag for NTUPLE control

This enables option '-k/-K' in ethtool for NTUPLE control.
NTUPLE control requires a reset to take effect. When the feature is
turned off, the SW list of stored FD SB filters gets cleaned up.

Change-ID: I9d564b67a10d4afa11de3b320d601c3d2e6edc1f
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40evf: use min_t
Mitch Williams [Tue, 11 Feb 2014 08:27:52 +0000 (08:27 +0000)]
i40evf: use min_t

Checkpatch complained in an earlier patch about using min(), but that
change would have been completely unrelated to the point of that patch.
So fix it here.

Change-ID: I2cd87b39cfd406850d283b88f259757a6bcd14cd
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agoi40evf: correctly program RSS HLUT table
Mitch Williams [Tue, 11 Feb 2014 08:27:50 +0000 (08:27 +0000)]
i40evf: correctly program RSS HLUT table

The HLUT programming loop in i40evf_configure_rss was a) overly
complicated, and b) just plain broken. Most of the entries ended up being
not written at all, so most of the flows ended up at queue zero.

Refactor the HLUT programming loop to simply walk through the registers
and write four values to each one, incrementing through the number of
available queues.
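
A minimal userspace sketch of the refactored loop described above. It assumes
each 32-bit register holds four byte-wide queue indices; the function name
fill_rss_lut() and the plain array standing in for the HLUT registers are
hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the HLUT register writes: lut_regs[] plays the
 * role of the register file. Each register receives four consecutive queue
 * indices, one per byte (assumed layout), cycling through num_queues.
 */
static void fill_rss_lut(uint32_t *lut_regs, size_t num_regs,
			 unsigned int num_queues)
{
	unsigned int q = 0;

	/* A byte-wide field can only name queues 0..255. */
	assert(num_queues > 0 && num_queues <= 256);

	for (size_t i = 0; i < num_regs; i++) {
		uint32_t val = 0;

		for (unsigned int byte = 0; byte < 4; byte++) {
			val |= (uint32_t)q << (8 * byte);
			q = (q + 1) % num_queues;	/* wrap to queue zero */
		}
		lut_regs[i] = val;
	}
}
```

Every register is written exactly once, so no entry is left at its reset
value and flows are spread over all available queues.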

Change-ID: I75766179bc67e4e997187794f3144e28c83fd00d
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agomicrel: fix masking off LED bits
Sergei Shtylyov [Tue, 18 Mar 2014 23:58:16 +0000 (02:58 +0300)]
micrel: fix masking off LED bits

Commit 20d8435a1cff (phy: micrel: add of configuration for LED mode) made an
obvious mistake when masking off the LED mode bits: it forgot to apply a
bitwise NOT to the mask with which it ANDs the register value, so that
unrelated bits are cleared instead.
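
The difference can be sketched in plain C as follows. The field position
(LED_MODE_SHIFT, LED_MODE_MASK) and both helper names are hypothetical and
chosen only for illustration; they are not the driver's definitions.

```c
#include <stdint.h>

#define LED_MODE_SHIFT	4			/* hypothetical field position */
#define LED_MODE_MASK	(0x3u << LED_MODE_SHIFT)

/* Buggy variant: ANDs with the mask itself, keeping only the LED bits. */
static uint16_t set_led_mode_buggy(uint16_t reg, uint16_t mode)
{
	reg &= LED_MODE_MASK;
	return reg | ((uint16_t)(mode << LED_MODE_SHIFT) & LED_MODE_MASK);
}

/* Fixed variant: ANDs with the inverted mask, clearing only the LED bits. */
static uint16_t set_led_mode_fixed(uint16_t reg, uint16_t mode)
{
	reg &= (uint16_t)~LED_MODE_MASK;
	return reg | ((uint16_t)(mode << LED_MODE_SHIFT) & LED_MODE_MASK);
}
```

With the buggy variant every bit outside the LED field is lost and the old
LED bits survive; the fixed variant preserves the unrelated bits and replaces
only the field.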

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoi40e: support VF link state ndo
Mitch Williams [Tue, 11 Feb 2014 08:27:49 +0000 (08:27 +0000)]
i40e: support VF link state ndo

This netdev op allows the PF driver to control the virtual link state of
the VF devices. This can be used to deny naughty VF drivers access to
the wire, or to allow VFs (regardless of temperament) to communicate
with each other over the device's internal switch even though external
link is down.

Add the actual ndo function, and modify vc_notify_link_state to check
the link status of each VF before sending a message in the case when
physical link changes state.

Change-ID: Ib5a6924da78c540789f21d26b5e8086d71c29384
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
10 years agonetfilter: Add missing vmalloc.h include to nft_hash.c
David S. Miller [Wed, 19 Mar 2014 03:12:02 +0000 (23:12 -0400)]
netfilter: Add missing vmalloc.h include to nft_hash.c

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoieee802154: properly unshare skbs in ieee802154 *_rcv functions
Phoebe Buckheister [Mon, 17 Mar 2014 17:30:19 +0000 (18:30 +0100)]
ieee802154: properly unshare skbs in ieee802154 *_rcv functions

ieee802154 sockets do not properly unshare received skbs, which leads to
panics (at least) when they are used in conjunction with 6lowpan, so
run skb_share_check on received skbs.
6lowpan also contains a use-after-free, which is trivially fixed by
replacing the inlined skb_share_check with the explicit call.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Tested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge tag 'linux-can-next-for-3.15-20140317' of git://gitorious.org/linux-can/linux...
David S. Miller [Tue, 18 Mar 2014 19:20:00 +0000 (15:20 -0400)]
Merge tag 'linux-can-next-for-3.15-20140317' of git://gitorious.org/linux-can/linux-can-next

linux-can-next-for-3.15-20140317

Marc Kleine-Budde says:

====================
this is a pull request of three patches for net-next/master.

It consists of a patch by Oliver Hartkopp, which unifies the MTU
settings for CAN interfaces. A patch by Christopher R. Baker populates
netdev::dev_id for udev discrimination for multi interface CAN devices.
Alexander Shiyan contributes a patch for the mcp251x driver which fixes
the regulators operation if CONFIG_REGULATOR is not enabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agovia: fix a punctuation typo
wangweidong [Mon, 17 Mar 2014 07:52:17 +0000 (15:52 +0800)]
via: fix a punctuation typo

In general, after an assignment we use ';' instead of ','.
The comma does no harm here, though.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agodoc: update driver TX algorithm in timestamping.txt
Jakub Kicinski [Sun, 16 Mar 2014 19:32:48 +0000 (20:32 +0100)]
doc: update driver TX algorithm in timestamping.txt

Since cd4d8fdad1f1 ("net: kernel panic in dev_hard_start_xmit:
remove faulty software TX time stamping") dev_hard_start_xmit()
will not provide software timestamps. It's a responsibility of
the drivers to call skb_tx_timestamp() at the right time.

Cc: linux-doc@vger.kernel.org
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agobonding: ratelimit pr_warn()s in 802.3ad mode
Veaceslav Falico [Sun, 16 Mar 2014 16:55:03 +0000 (17:55 +0100)]
bonding: ratelimit pr_warn()s in 802.3ad mode

Only ratelimit the ones that might spam, omitting the ones from
enslave/deslave.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: sched: use no more than one page in struct fw_head
Eric Dumazet [Tue, 18 Mar 2014 03:20:49 +0000 (20:20 -0700)]
net: sched: use no more than one page in struct fw_head

In commit b4e9b520ca5d ("[NET_SCHED]: Add mask support to fwmark
classifier") Patrick added an u32 field in fw_head, making it slightly
bigger than one page.

Let's use 256 slots to make fw_hash() more straightforward, and move
@mask to the beginning of the structure as we often use a small number
of skb->mark. @mask and first hash buckets share the same cache line.

This brings back the memory usage to less than 4000 bytes, and permits
John to add a rcu_head at the end of the structure later without any
worry.
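
A rough userspace sketch of the resulting layout, under the assumption that
the head holds only @mask followed by 256 bucket pointers. The names
fw_head_sketch and fw_hash_sketch are hypothetical, and the folding hash is an
illustrative guess rather than a copy of the kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define HTSIZE 256			/* fixed number of hash slots */

struct fw_filter;			/* opaque here; only pointers are stored */

struct fw_head_sketch {
	uint32_t mask;			/* first, so it shares a cache line with ht[0..] */
	struct fw_filter *ht[HTSIZE];	/* 256 * 8 = 2048 bytes on 64-bit */
};

/* Fold the upper bits into the low byte, then pick one of HTSIZE slots. */
static uint32_t fw_hash_sketch(uint32_t handle)
{
	handle ^= handle >> 16;
	handle ^= handle >> 8;
	return handle % HTSIZE;
}
```

On a 64-bit build this structure is a little over 2 KB, which leaves room in
the page for an rcu_head appended later.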

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Tue, 18 Mar 2014 18:09:07 +0000 (14:09 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
One patch to rename a newly introduced struct. The rest is
the rework of the IPsec virtual tunnel interface for ipv6 to
support inter address family tunneling and namespace crossing.

1) Rename the newly introduced struct xfrm_filter to avoid a
   conflict with iproute2. From Nicolas Dichtel.

2) Introduce xfrm_input_afinfo to access the address family
   dependent tunnel callback functions properly.

3) Add and use an IPsec protocol multiplexer for ipv6.

4) Remove dst_entry caching. vti can look up multiple different
   dst entries, depending on the configured xfrm states. Therefore
   it does not make sense to cache a dst_entry.

5) Remove caching of flow information. vti6 does not use the
   tunnel endpoint addresses to do route and xfrm lookups.

6) Update the vti6 to use its own receive hook.

7) Remove the now unused xfrm_tunnel_notifier. This was used from vti
   and is replaced by the IPsec protocol multiplexer hooks.

8) Support inter address family tunneling for vti6.

9) Check if the tunnel endpoints of the xfrm state and the vti interface
   are matching and return an error otherwise.

10) Enable namespace crossing for vti devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet/i40e: Avoid double setting of NETIF_F_SG for the HW encapsulation feature mask
Or Gerlitz [Tue, 18 Mar 2014 08:36:45 +0000 (10:36 +0200)]
net/i40e: Avoid double setting of NETIF_F_SG for the HW encapsulation feature mask

The networking core does it for the driver during registration time.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoi40evf: Rename i40e_ptype_lookup i40evf_ptype_lookup
Eric W Biederman [Tue, 18 Mar 2014 07:26:50 +0000 (00:26 -0700)]
i40evf: Rename i40e_ptype_lookup i40evf_ptype_lookup

When compiling the i40e and the i40evf driver into the same kernel I get:
LD      drivers/net/ethernet/intel/built-in.o
drivers/net/ethernet/intel/i40evf/built-in.o:(.data+0x300): multiple definition of `i40e_ptype_lookup'
drivers/net/ethernet/intel/i40e/built-in.o:(.data+0x780): first defined here
make[3]: *** [drivers/net/ethernet/intel/built-in.o] Error 1
make[2]: *** [drivers/net/ethernet/intel] Error 2
make[1]: *** [drivers/net/ethernet/] Error 2
make: *** [sub-make] Error 2

Fix this by renaming the i40evf version of this structure from
i40e_ptype_lookup to i40evf_ptype_lookup.

This build failure was introduced in:
  commit 206812b5fccb808d1194344eaa942f68f59b2630
  Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
  i40e/i40evf: i40e implementation for skb_set_hash

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Eric W Biederman <ebiederm@xmission.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoe1000e: fix the build error when PM is disabled
Kevin Hao [Tue, 18 Mar 2014 07:26:49 +0000 (00:26 -0700)]
e1000e: fix the build error when PM is disabled

Commit 2800209994f8 (e1000e: Refactor PM flows) changed
SET_SYSTEM_SLEEP_PM_OPS to open-coded assignments, but forgot to
protect them with CONFIG_PM_SLEEP. This causes the following build
errors when PM is disabled:
drivers/net/ethernet/intel/e1000e/netdev.c:7079:13:
error: 'e1000e_pm_suspend' undeclared here (not in a function)
  .suspend = e1000e_pm_suspend,
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:7080:13:
error: 'e1000e_pm_resume' undeclared here (not in a function)
  .resume  = e1000e_pm_resume,
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:7082:11:
error: 'e1000e_pm_thaw' undeclared here (not in a function)
  .thaw  = e1000e_pm_thaw,
           ^
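
The kind of guard that avoids these errors can be sketched in userspace C as
below. The structure pm_ops_sketch and the drv_* callbacks are hypothetical
stand-ins, not the kernel's dev_pm_ops or the e1000e functions; the point is
only that the callbacks and the assignments referencing them sit under the
same config symbol.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PM operations table. */
struct pm_ops_sketch {
	const char *name;
	int (*suspend)(void);
	int (*resume)(void);
};

#ifdef CONFIG_PM_SLEEP
/* The callbacks only exist when sleep support is configured in. */
static int drv_suspend(void) { return 0; }
static int drv_resume(void) { return 0; }
#endif

static const struct pm_ops_sketch drv_pm_ops = {
	.name = "sketch",
#ifdef CONFIG_PM_SLEEP
	/* Guarded, so these names are never referenced when undefined. */
	.suspend = drv_suspend,
	.resume  = drv_resume,
#endif
};
```

Without CONFIG_PM_SLEEP the callbacks are left as NULL pointers and the file
still compiles.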
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoigb: remove references to long gone command line parameters
Fernando Luis Vazquez Cao [Tue, 18 Mar 2014 07:26:48 +0000 (00:26 -0700)]
igb: remove references to long gone command line parameters

Command line parameters QueuePairs, Node, EEE, DMAC and InterruptThrottleRate
do not exist these days. Remove all references to them in the Documentation
folder and update code comments.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'altera_tse'
David S. Miller [Tue, 18 Mar 2014 01:37:25 +0000 (21:37 -0400)]
Merge branch 'altera_tse'

Vince Bridgers says:

====================
Altera Triple Speed Ethernet (TSE) Driver

This is the version 6 submission for the Altera Triple Speed Ethernet (TSE)
driver. All comments received during the version 2, 3, 4, and 5 submissions
have been accepted. Please find the change log and a description of the
submission below.

If you find the submission acceptable, please consider this patch set for
inclusion into the Linux kernel.

V6: Address comments from V5 review
    - add call to skb_tx_timestamp in the drivers transmit path
    - correct use of unsigned int where it was cast to pointer. Use types
      appropriate for intended and correct use to let the compiler warn us
      when type usage is incorrect.
    - use correct semantics for pointer arithmetic in same code path

V5: Address comments from V4 review
    - Add descriptions of statistics to driver documentation. The statistics
      supported by the driver/controller map to IEEE and RFC statistics, and
      the names and mappings are described in the user documentation.
    - Change "unsigned int" to u32 in device structure definitions
    - Change use of netdev_warn to netif_warn in altera_sgdma.c
    - Change stat name rx_fifo_drops to ether_drops to match the event
      actually counted by the hardware.

V4: Address comments from V3 review
    - Change statistics names in ethtool module to follow common use in
      other ethernet drivers.
    - remove an unnecessary case in ethtool module
    - change logging to use netdev_* where possible instead of dev_*
    - remove logging for OOM errors since those are already logged

V3: Address comments from V2 review
    - Reorder patch submission so that net/ethernet Makefile and Kconfig
      are committed last, thus not breaking bisect
    - Use of_get_mac_address instead of of_get_property
    - Change supplemental and hash configuration bindings to boolean/empty,
      and more meaningful names
    - Add check for failure from calls to of_phy_connect and
      connect_local_phy
    - Correct code to find mdio child node
    - Update bindings document
    - Remove cast to u64 when not necessary
    - add use of const for statistics strings

V2: Address comments from initial RFC review.
    - The driver files were broken up by major sections of functionality.
      These include MSGDMA, SGDMA, Misc, and Main.
    - Add patch for MAINTAINERS file, add the maintainer for this submission
    - Use 32-bit lower/upper physical address accessor functions so the driver
      is 64-bit ready.
    - Use standard bindings where applicable. Especially phy-addr, and change
      "altr,rx-fifo-depth" to "rx-fifo-depth" and "altr,tx-fifo-depth" to
      "tx-fifo-depth".
    - Add use of max-frame-size property
    - Update bindings documents accordingly
    - Correct interrupt handler to use budget parameter in the conventional way
    - Use macros consistently to define bit fields across files
    - Correct include exclusion macro in altera_msgdmahw.h (typo)
    - Remove use of barriers, these were not necessary since the DMA APIs
      ensure memory & buffer consistency
    - Remove use of netif_carrier_off in driver
    - move probing of phy from the open function to the probe function
    - use of_get_phy_mode instead of custom function
    - Use the .data field in the device structure to obtain a pointer
      to SGDMA or MSGDMA device specific properties and functions.
    - remove custom function to access devicetree since Altera specific
      bindings requiring its use have been deprecated in favor of
      standard bindings.

The Altera TSE is a 10/100/1000 Mbps Ethernet soft IP component that can be
configured and synthesized using Quartus, and programmed into Altera FPGAs.
Two types of soft DMA IP components are supported by this driver - the Altera
SGDMA and the MSGDMA. The MSGDMA DMA component is preferred over the SGDMA,
since the SGDMA will be deprecated in favor of the MSGDMA. Software supporting
both is provided for customers still using the SGDMA and to demonstrate how
multiple types of DMA engines may be supported by the TSE driver in the event
customers wish to develop their own custom soft DMA engine for particular
applications.

The design has been tested on Altera's Cyclone 4, 5, and Cyclone 5 SOC
development kits using an ARM A9 processor and an Altera NIOS2 processor.
Differences in CPU/DMA coherency management and address alignment are
addressed by proper use of driver APIs and semantics.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: ethernet: Change Ethernet Makefile and Kconfig for Altera TSE driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:41 +0000 (17:52 -0500)]
net: ethernet: Change Ethernet Makefile and Kconfig for Altera TSE driver

This patch changes the Ethernet Makefile and Kconfig files to add the Altera
Ethernet driver component.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMAINTAINERS: Add entry for Altera Triple Speed Ethernet Driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:40 +0000 (17:52 -0500)]
MAINTAINERS: Add entry for Altera Triple Speed Ethernet Driver

Add a MAINTAINERS entry covering the Altera Triple Speed
Ethernet Driver, with support for the MSGDMA and SGDMA
soft DMA IP components.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoAltera TSE: Add Altera Ethernet Driver Makefile and Kconfig
Vince Bridgers [Mon, 17 Mar 2014 22:52:39 +0000 (17:52 -0500)]
Altera TSE: Add Altera Ethernet Driver Makefile and Kconfig

This patch adds the Altera Triple Speed Ethernet Makefile and
Kconfig file.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoAltera TSE: Add main and header file for Altera Ethernet Driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:38 +0000 (17:52 -0500)]
Altera TSE: Add main and header file for Altera Ethernet Driver

This patch adds the main driver and header file for the Altera Triple
Speed Ethernet driver.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoAltera TSE: Add Miscellaneous Files for Altera Ethernet Driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:37 +0000 (17:52 -0500)]
Altera TSE: Add Miscellaneous Files for Altera Ethernet Driver

This patch adds miscellaneous files for the Altera Ethernet Driver,
including ethtool support.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoAltera TSE: Add Altera Ethernet Driver SGDMA file components
Vince Bridgers [Mon, 17 Mar 2014 22:52:36 +0000 (17:52 -0500)]
Altera TSE: Add Altera Ethernet Driver SGDMA file components

This patch adds the SGDMA soft IP support for the Altera Triple
Speed Ethernet driver.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoAltera TSE: Add Altera Ethernet Driver MSGDMA File Components
Vince Bridgers [Mon, 17 Mar 2014 22:52:35 +0000 (17:52 -0500)]
Altera TSE: Add Altera Ethernet Driver MSGDMA File Components

This patch adds the MSGDMA soft IP support for the Altera Triple
Speed Ethernet driver.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoDocumentation: networking: Add Altera Ethernet (TSE) Documentation
Vince Bridgers [Mon, 17 Mar 2014 22:52:34 +0000 (17:52 -0500)]
Documentation: networking: Add Altera Ethernet (TSE) Documentation

This patch adds a bindings description for the Altera Triple Speed Ethernet
(TSE) driver. The bindings support the legacy SGDMA soft IP as well as the
preferred MSGDMA soft IP. The TSE can be configured and synthesized in soft
logic using Altera's Quartus toolchain. Please consult the bindings document
for supported options.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agodts: Add bindings for the Altera Triple Speed Ethernet driver
Vince Bridgers [Mon, 17 Mar 2014 22:52:33 +0000 (17:52 -0500)]
dts: Add bindings for the Altera Triple Speed Ethernet driver

This patch adds a bindings description for the Altera Triple Speed Ethernet
(TSE) driver. The bindings support the legacy SGDMA soft IP as well as the
preferred MSGDMA soft IP. The TSE can be configured and synthesized in soft
logic using Altera's Quartus toolchain. Please consult the bindings document
for supported options.

Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetfilter: conntrack: Fix UP builds
Eric Dumazet [Mon, 17 Mar 2014 20:37:53 +0000 (13:37 -0700)]
netfilter: conntrack: Fix UP builds

ARRAY_SIZE(nf_conntrack_locks) is undefined if spinlock_t is an
empty structure. Replace it by CONNTRACK_LOCKS
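
A small sketch of why the named constant is the safer choice. ARRAY_SIZE() is
sizeof(array) / sizeof(array[0]); when the element type is an empty structure
(as the message says spinlock_t can be), both sizes are zero and the division
is undefined. The value 1024 and the helper lock_index() are assumptions made
for illustration only.

```c
#define CONNTRACK_LOCKS 1024	/* assumed value, for illustration only */

/*
 * Map a hash to a lock slot using the constant rather than the size of
 * the lock array, so the result does not depend on sizeof(spinlock_t).
 */
static unsigned int lock_index(unsigned int hash)
{
	return hash % CONNTRACK_LOCKS;
}
```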

Fixes: 93bb0ceb75be ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'at86rf230'
David S. Miller [Mon, 17 Mar 2014 20:10:43 +0000 (16:10 -0400)]
Merge branch 'at86rf230'

Alexander Aring says:

====================
at86rf230: various fixes and devicetree support

This patch series fixes some bugs with the at86rf231 chip and cleans up some
code. It also adds devicetree support for the at86rf230 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoat86rf230: add support for devicetree
Alexander Aring [Sat, 15 Mar 2014 08:29:07 +0000 (09:29 +0100)]
at86rf230: add support for devicetree

This patch adds devicetree support for the at86rf230 driver.

Possible gpios to configure are "reset-gpio" and "sleep-gpio".
Also add support to configure the "irq-type" for the irq polarity
register.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoat86rf230: make reset pin optionally
Alexander Aring [Sat, 15 Mar 2014 08:29:06 +0000 (09:29 +0100)]
at86rf230: make reset pin optionally

This patch makes the reset pin optional. Some devices, like the atben
from qi-hardware, don't have an external reset pin. The usual way to
initiate a device reset is to turn power off/on for the atben device.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoat86rf230: change reset timings
Alexander Aring [Sat, 15 Mar 2014 08:29:05 +0000 (09:29 +0100)]
at86rf230: change reset timings

While running checkpatch on another patch I got:

"WARNING: msleep < 20ms can sleep for up to 20ms"

The datasheets of the at86rf231 and at86rf212 say that the minimum delay for
reset pulse width and SPI access latency after reset is 625 nanoseconds.

This patch removes the 1 millisecond sleep and replaces it with a 1
microsecond udelay, which should also be okay for the reset pulse width.

To change the state from RESET -> TRX_OFF the at86rf230 device needs 120
microseconds; this is the worst case of all at86rf* chips.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoat86rf230: move locking state in xmit
Alexander Aring [Sat, 15 Mar 2014 08:29:04 +0000 (09:29 +0100)]
at86rf230: move locking state in xmit

There is no need to lock the clearing of IRQ_TRX_END in status.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoat86rf230: fix unexpected state change
Alexander Aring [Sat, 15 Mar 2014 08:29:03 +0000 (09:29 +0100)]
at86rf230: fix unexpected state change

This patch fixes an unexpected state change for the at86rf231 chip.
We can't change into STATE_FORCE_TX_ON while the chip is in one of
SLEEP, P_ON, RESET, TRX_OFF, and all *_NOCLK states.

In this case we are in the TRX_OFF state. See datasheet [1] page 71 for
more information.

Without this patch you will get the following message on an at86rf231 device:

[   20.065218] unexpected state change: 8, asked for 4
[   20.070527] ------------[ cut here ]------------
[   20.075414] WARNING: CPU: 0 PID: 160 at net/mac802154/ieee802154_dev.c:43 mac802154_slave_open+0x70/0xb8()
[   20.085594] Modules linked in: autofs4
[   20.089667] CPU: 0 PID: 160 Comm: ifconfig Not tainted 3.14.0-20140108-1-00993-g905c192 #162
[   20.098612] [<c00127b8>] (unwind_backtrace) from [<c0010b1c>] (show_stack+0x10/0x14)
[   20.106819] [<c0010b1c>] (show_stack) from [<c0033838>] (warn_slowpath_common+0x60/0x80)
[   20.115311] [<c0033838>] (warn_slowpath_common) from [<c00338e8>] (warn_slowpath_null+0x18/0x20)
[   20.124590] [<c00338e8>] (warn_slowpath_null) from [<c057b7e8>] (mac802154_slave_open+0x70/0xb8)
[   20.133880] [<c057b7e8>] (mac802154_slave_open) from [<c0488a58>] (__dev_open+0xa8/0x108)
[   20.142553] [<c0488a58>] (__dev_open) from [<c0488cb0>] (__dev_change_flags+0x8c/0x148)
[   20.151051] [<c0488cb0>] (__dev_change_flags) from [<c0488d84>] (dev_change_flags+0x18/0x48)
[   20.159968] [<c0488d84>] (dev_change_flags) from [<c04e2e9c>] (devinet_ioctl+0x2b0/0x63c)
[   20.168623] [<c04e2e9c>] (devinet_ioctl) from [<c04712e4>] (sock_ioctl+0x23c/0x29c)
[   20.176727] [<c04712e4>] (sock_ioctl) from [<c00e3cb8>] (do_vfs_ioctl+0x4a8/0x578)
[   20.184671] [<c00e3cb8>] (do_vfs_ioctl) from [<c00e3dd4>] (SyS_ioctl+0x4c/0x78)
[   20.192402] [<c00e3dd4>] (SyS_ioctl) from [<c000da00>] (ret_fast_syscall+0x0/0x48)
[   20.200392] ---[ end trace 9a34542f4ea08e47 ]---

This patch was tested on at86rf231 and at86rf212.

[1] http://www.atmel.com/images/doc8111.pdf

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'sh_eth'
David S. Miller [Mon, 17 Mar 2014 20:06:48 +0000 (16:06 -0400)]
Merge branch 'sh_eth'

Sergei Shtylyov says:

====================
Beautify 'sh_eth' driver's messages

This patchset converts the driver to using netdev_*() and netif_*() to print out
its messages whenever possible.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosh_eth: fold netif_msg_*() and netdev_*() calls into netif_*() invocations
Sergei Shtylyov [Sat, 15 Mar 2014 00:30:59 +0000 (03:30 +0300)]
sh_eth: fold netif_msg_*() and netdev_*() calls into netif_*() invocations

Now that we call netdev_*() under netif_msg_*() checks, we can fold these into
netif_*() macro invocations.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosh_eth: convert dev_*() to netdev_*() calls
Sergei Shtylyov [Sat, 15 Mar 2014 00:29:14 +0000 (03:29 +0300)]
sh_eth: convert dev_*() to netdev_*() calls

Convert dev_*(&ndev->dev, ...) to netdev_*(ndev, ...) calls since they are a bit
shorter and at the same time give more information on a device.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosh_eth: convert pr_*() to netdev_*() calls
Sergei Shtylyov [Sat, 15 Mar 2014 00:27:54 +0000 (03:27 +0300)]
sh_eth: convert pr_*() to netdev_*() calls

Convert pr_*() to netdev_*() calls as the latter provide info on a device.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosh_eth: exit probe with unknown register layout
Sergei Shtylyov [Sat, 15 Mar 2014 00:11:24 +0000 (03:11 +0300)]
sh_eth: exit probe with unknown register layout

Exit the driver's probe() method when the register layout is unknown as the
driver would cause kernel oops in this case anyway.

While at it, move the corresponding error message printout and convert it from
pr_err() to dev_err().

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'netpoll-next'
David S. Miller [Mon, 17 Mar 2014 19:48:53 +0000 (15:48 -0400)]
Merge branch 'netpoll-next'

Eric W. Biederman says:

====================
netpoll: Cleanup received packet processing

This is the long-winded, careful, and polite version of removing the netpoll
receive packet processing.

First I untangle the code in small steps.  Then I modify the code to not
force reception and dropping of packets when we are transmitting a packet
with netpoll.  Finally I move all of the packet reception under
CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.

If someone wants to do a stable backport of these patches, it would
require backporting the first 18 patches that handle the budget == 0 in
the networking drivers, and the first 6 of these patches.

If anyone wants to resurrect netpoll packet reception someday it should
just be a matter of reverting the last patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)
Eric W. Biederman [Sat, 15 Mar 2014 03:51:52 +0000 (20:51 -0700)]
netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)

The netpoll packet receive code only becomes active if the netpoll
rx_skb_hook is implemented, and there is not a single implementation
of the netpoll rx_skb_hook in the kernel.

All of the out of tree implementations I have found call
netpoll_poll which was removed from the kernel in 2011, so this
change should not add any additional breakage.

There are problems with the netpoll packet receive code.  __netpoll_rx
does not call dev_kfree_skb_irq or dev_kfree_skb_any in hard irq
context.  netpoll_neigh_reply leaks every skb it receives.  Reception
of packets does not work successfully on stacked devices (aka bonding,
team, bridge, and vlans).

Given that the netpoll packet receive code is buggy, there are no
out of tree users that will be merged soon, and the code has
not been used in tree for a decade, let's just remove it.

Reverting this commit can serve as a starting point for anyone
who wants to resurrect netpoll packet reception support.

Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Move all receive processing under CONFIG_NETPOLL_TRAP
Eric W. Biederman [Sat, 15 Mar 2014 03:50:58 +0000 (20:50 -0700)]
netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP

Make rx_skb_hook and rx in struct netpoll depend on
CONFIG_NETPOLL_TRAP. Make rx_lock, rx_np, and neigh_tx in struct
netpoll_info depend on CONFIG_NETPOLL_TRAP.

Make the functions netpoll_rx_on, netpoll_rx, and netpoll_receive_skb
no-ops when CONFIG_NETPOLL_TRAP is not set.

Only build netpoll_neigh_reply, checksum_udp, service_neigh_queue,
pkt_is_ns, and __netpoll_rx when CONFIG_NETPOLL_TRAP is defined.

Add helper functions netpoll_trap_setup, netpoll_trap_setup_info,
netpoll_trap_cleanup, and netpoll_trap_cleanup_info that initialize
and cleanup the struct netpoll and struct netpoll_info receive
specific fields when CONFIG_NETPOLL_TRAP is enabled and do nothing
otherwise.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Consolidate neigh_tx processing in service_neigh_queue
Eric W. Biederman [Sat, 15 Mar 2014 03:50:25 +0000 (20:50 -0700)]
netpoll: Consolidate neigh_tx processing in service_neigh_queue

Move the bond slave device neigh_tx handling into service_neigh_queue.

In connection with neigh_tx processing remove unnecessary tests of
a NULL netpoll_info, as netpoll_poll_dev has already used
and thus verified the existence of the netpoll_info.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP
Eric W. Biederman [Sat, 15 Mar 2014 03:49:43 +0000 (20:49 -0700)]
netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP

Now that we no longer need to receive packets to safely drain the
network driver's receive queue, move netpoll_trap and netpoll_set_trap
under CONFIG_NETPOLL_TRAP.

Make netpoll_trap and netpoll_set_trap no-op inline functions
when CONFIG_NETPOLL_TRAP is not set.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Don't drop all received packets.
Eric W. Biederman [Sat, 15 Mar 2014 03:48:28 +0000 (20:48 -0700)]
netpoll: Don't drop all received packets.

Change the strategy of netpoll from dropping all packets received
during netpoll_poll_dev to calling napi poll with a budget of 0
(to avoid processing the driver's rx queue), and to ignore packets received
with netif_rx (those will safely be placed on the backlog queue).

All of the netpoll supporting drivers have been reviewed to ensure
either they use netif_rx or that a budget of 0 is supported by their
napi poll routine and that a budget of 0 will not process the driver's
rx queues.
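
As a rough illustration of what "a budget of 0 is supported" means for
a driver, here is a minimal sketch of a poll routine. It is not taken
from any real driver; struct ring_sketch and poll_sketch are
hypothetical names, and the counters stand in for real descriptor rings.

```c
/* Hypothetical per-device state: counts stand in for descriptor rings. */
struct ring_sketch {
    int tx_done_pending;   /* completed transmits waiting to be reclaimed */
    int rx_pending;        /* received packets waiting to be processed */
};

/* Returns the number of rx packets processed, never more than budget. */
static int poll_sketch(struct ring_sketch *ring, int budget)
{
    int work = 0;

    /* Transmit completion is not charged against the budget, so it
     * still runs when netpoll passes a budget of 0. */
    ring->tx_done_pending = 0;

    /* With budget == 0 this loop body never executes, so the rx queue
     * is left untouched for the normal NAPI path. */
    while (work < budget && ring->rx_pending > 0) {
        ring->rx_pending--;
        work++;
    }
    return work;
}
```

A routine written with a do/while loop, or one that always handles at
least one packet, would process rx packets even at budget 0, which is
the kind of problem the review was looking for.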

Not dropping packets makes NETPOLL_RX_DROP unnecessary, so it is removed.

npinfo->rx_flags is removed, as rx_flags with just the NETPOLL_RX_ENABLED
flag becomes just a redundant mirror of list_empty(&npinfo->rx_np).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Add netpoll_rx_processing
Eric W. Biederman [Sat, 15 Mar 2014 03:47:49 +0000 (20:47 -0700)]
netpoll: Add netpoll_rx_processing

Add a helper netpoll_rx_processing that reports when netpoll has
receive side processing to perform.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Warn if more packets are processed than are budgeted
Eric W. Biederman [Sat, 15 Mar 2014 03:47:15 +0000 (20:47 -0700)]
netpoll: Warn if more packets are processed than are budgeted

There is already a warning for this case in the normal netpoll path,
but put a copy here in case the way netpoll calls the poll functions
causes a different result.

netpoll will shortly call the napi poll routine with a budget of 0 to
avoid any rx packets being processed.  As nothing does that today,
we may encounter drivers that have problems, so a netpoll specific
warning seems desirable.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Visit all napi handlers in poll_napi
Eric W. Biederman [Sat, 15 Mar 2014 03:45:51 +0000 (20:45 -0700)]
netpoll: Visit all napi handlers in poll_napi

In poll_napi loop through all of the napi handlers even when the
budget falls to 0 to ensure that we process all of the tx_queues, and
so that we continue to call into drivers when our initial budget is 0.
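
The loop behaviour described above might look roughly like this. It is
a sketch under simplifying assumptions (no locking, no poll ownership
checks); the types and functions are hypothetical, and the over_budget
counter merely stands in for the kernel's warning.

```c
#include <stddef.h>

typedef int (*poll_fn_sketch)(void *ctx, int budget);

/* Hypothetical stand-in for one napi handler on a device. */
struct napi_sketch {
    poll_fn_sketch poll;
    void *ctx;
};

/* Call every handler, even once the budget has reached 0, so that each
 * one still gets a chance to clean its tx queue. Returns the budget left. */
static int poll_all_sketch(struct napi_sketch *napis, size_t n, int budget,
                           unsigned int *over_budget)
{
    for (size_t i = 0; i < n; i++) {
        int work = napis[i].poll(napis[i].ctx, budget);

        if (work > budget) {       /* handler ignored its budget */
            (*over_budget)++;
            work = budget;
        }
        budget -= work;            /* may become 0; the loop keeps going */
    }
    return budget;
}

/* Demo handler: counts its calls and processes one packet if allowed. */
static int demo_poll_sketch(void *ctx, int budget)
{
    int *calls = ctx;

    (*calls)++;
    return budget > 0 ? 1 : 0;
}
```

With three demo handlers and an initial budget of 1, all three are
still called; only the first is allowed to process a packet.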

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: Pass budget into poll_napi
Eric W. Biederman [Sat, 15 Mar 2014 03:45:17 +0000 (20:45 -0700)]
netpoll: Pass budget into poll_napi

This moves the control logic to the top level in netpoll_poll_dev
instead of having it dispersed throughout netpoll_poll_dev,
poll_napi and poll_one_napi.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev
Eric W. Biederman [Sat, 15 Mar 2014 03:44:37 +0000 (20:44 -0700)]
netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev

Today netpoll depends on setting NETPOLL_RX_DROP before networking
drivers receive packets in interrupt context so that the packets can
be dropped.  Move this setting into netpoll_poll_dev from
poll_one_napi so that if ndo_poll_controller happens to receive
packets we will drop the packets on the floor instead of letting the
packets bounce through the networking stack and potentially cause problems.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Mon, 17 Mar 2014 19:06:24 +0000 (15:06 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
most relevantly they are:

* cleanup to remove double semicolon from Stephen Hemminger.

* calm down sparse warning in xt_ipcomp, from Fan Du.

* nf_ct_labels support for nf_tables, from Florian Westphal.

* new macros to simplify rcu dereferences in the scope of nfnetlink
  and nf_tables, from Patrick McHardy.

* Accept queue and drop (including reason for drop) to verdict
  parsing in nf_tables, also from Patrick.

* Remove unused random seed initialization in nfnetlink_log, from
  Florian Westphal.

* Allow attaching user-specific information to nf_tables rules, useful
  to attach user comments to rules, from me.

* Return errors in ipset according to the manpage documentation, from
  Jozsef Kadlecsik.

* Fix coccinelle warnings related to incorrect bool type usage for ipset,
  from Fengguang Wu.

* Add hash:ip,mark set type to ipset, from Vytas Dauksa.

* Fix the message that ipset emitted for each netns that is created,
  from Ilia Mirkin.

* Add forceadd option to ipset, which evicts a random entry from the set
  if it becomes full, from Josh Hunt.

* Minor IPVS cleanups and fixes from Andi Kleen and Tingwei Liu.

* Improve conntrack scalability by removing a central spinlock, original
  work from Eric Dumazet. Jesper Dangaard Brouer took it over to address
  remaining issues. Several preparatory patches for this change come
  first.

* Rework nft_hash to resolve bugs (leaking chain, missing rcu synchronization
  on element removal, etc.), from Patrick McHardy.

* Restore context in the rule deletion path, as we now release rule objects
  synchronously, from Patrick McHardy. This gets back event notification for
  anonymous sets.

* Fix NAT family validation in nft_nat, also from Patrick.

* Improve scalability of xt_connlimit by using an array of spinlocks and
  by introducing a rb-tree of hashtables for faster lookup of accounted
  objects per network. This patch was preceded by several patches and
  refactorings to accommodate this change, including the use of kmem_cache,
  from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonetfilter: connlimit: use rbtree for per-host conntrack obj storage
Florian Westphal [Wed, 12 Mar 2014 22:49:51 +0000 (23:49 +0100)]
netfilter: connlimit: use rbtree for per-host conntrack obj storage

With current match design every invocation of the connlimit_match
function means we have to perform (number_of_conntracks % 256) lookups
in the conntrack table [ to perform GC/delete stale entries ].
This is also the reason why ____nf_conntrack_find() in perf top has
> 20% cpu time per core.

This patch changes the storage to rbtree which cuts down the number of
ct objects that need testing.

When looking up a new tuple, we only test the connections of the host
objects we visit while searching for the wanted host/network (or
the leaf we need to insert at).
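
The lookup described above can be sketched as a find-or-insert walk
over a search tree keyed by host/network. This is only an illustration:
a plain unbalanced binary search tree stands in for the kernel rbtree,
the garbage collection of stale entries is left out, and all names are
hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* One node per host/network; conns counts its tracked connections. */
struct host_node_sketch {
    uint32_t net;                          /* masked source address (key) */
    unsigned int conns;
    struct host_node_sketch *left, *right;
};

/* Walk from the root towards the wanted key. Only nodes on this path
 * are visited, which is what bounds the work per lookup. If the key is
 * missing, insert a new node at the leaf position reached. */
static struct host_node_sketch *
host_find_or_insert_sketch(struct host_node_sketch **root, uint32_t net)
{
    struct host_node_sketch **link = root;

    while (*link) {
        if (net < (*link)->net)
            link = &(*link)->left;
        else if (net > (*link)->net)
            link = &(*link)->right;
        else
            return *link;
    }
    *link = calloc(1, sizeof(**link));
    if (*link)
        (*link)->net = net;
    return *link;
}

/* Account one more connection for net; returns the new count, 0 on OOM. */
static unsigned int count_conn_sketch(struct host_node_sketch **root, uint32_t net)
{
    struct host_node_sketch *node = host_find_or_insert_sketch(root, net);

    return node ? ++node->conns : 0;
}
```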

The slot count is reduced to 32.  Increasing slot count doesn't
speed up things much because of rbtree nature.

before patch (50kpps rx, 10kpps tx):
+  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
+  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
+  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
+   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
+   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
+   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw

after (90kpps, 51kpps tx):
+  17.24%       swapper  [nf_conntrack]    [k] ____nf_conntrack_find
+   6.60%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
+   2.73%       swapper  [nf_conntrack]    [k] hash_conntrack_raw
+   2.36%       swapper  [xt_connlimit]    [k] count_tree

Obvious disadvantages compared to the previous version are the increase
in code complexity and the increased memory cost.

Partially based on Eric Dumazet's fq scheduler.

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: connlimit: make same_source_net signed
Florian Westphal [Wed, 12 Mar 2014 22:49:50 +0000 (23:49 +0100)]
netfilter: connlimit: make same_source_net signed

same_source_net currently returns 1 if the two addresses are in the same
network.  Make it work like mem/strcmp so it can be used as an rbtree
search function.
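
A three-way comparison of this kind could look roughly like the sketch
below. It compares masked addresses byte by byte and is not the code
from the patch; the function name and signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two addresses under a netmask, memcmp-style:
 * returns < 0, 0 or > 0 instead of a plain "same / not same" flag,
 * so a tree walk can decide whether to go left or right. */
static int same_source_net_sketch(const uint8_t *addr, const uint8_t *mask,
                                  const uint8_t *other, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t a = addr[i] & mask[i];
        uint8_t b = other[i] & mask[i];

        if (a != b)
            return a < b ? -1 : 1;
    }
    return 0;
}
```

A boolean result can only answer "found or not"; the sign is what makes
the function usable for ordering nodes in a search tree.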

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agonetfilter: connlimit: use keyed locks
Florian Westphal [Wed, 12 Mar 2014 22:49:49 +0000 (23:49 +0100)]
netfilter: connlimit: use keyed locks

connlimit currently suffers from spinlock contention, example for
4-core system with rps enabled:

+  20.84%   ksoftirqd/2  [kernel.kallsyms] [k] _raw_spin_lock_bh
+  20.76%   ksoftirqd/1  [kernel.kallsyms] [k] _raw_spin_lock_bh
+  20.42%   ksoftirqd/0  [kernel.kallsyms] [k] _raw_spin_lock_bh
+   6.07%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
+   6.07%   ksoftirqd/1  [nf_conntrack]    [k] ____nf_conntrack_find
+   5.97%   ksoftirqd/0  [nf_conntrack]    [k] ____nf_conntrack_find
+   2.47%   ksoftirqd/2  [nf_conntrack]    [k] hash_conntrack_raw
+   2.45%   ksoftirqd/0  [nf_conntrack]    [k] hash_conntrack_raw
+   2.44%   ksoftirqd/1  [nf_conntrack]    [k] hash_conntrack_raw

Keyed locks may allow parallel lookup/insert/delete if the entry is
hashed to another slot.  With patch:

+  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
+  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
+  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
+   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
+   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
+   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw
+   2.00%  ksoftirqd/1  [kernel.kallsyms] [k] __rcu_read_unlock

Improved rx processing rate from ~35kpps to ~50 kpps.
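
The idea of keyed locks can be sketched as an array of locks indexed by
the entry's hash. The example below is a userspace illustration using
C11 atomic flags as simple spinlocks; the slot and lock counts and all
names are assumptions, not values taken from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SLOT_COUNT_SKETCH 256u   /* assumed number of hash slots */
#define LOCK_COUNT_SKETCH 32u    /* assumed number of locks; divides SLOT_COUNT */

static atomic_flag slot_locks_sketch[LOCK_COUNT_SKETCH];
static unsigned int slot_counts_sketch[SLOT_COUNT_SKETCH];

static void locks_init_sketch(void)
{
    for (unsigned int i = 0; i < LOCK_COUNT_SKETCH; i++)
        atomic_flag_clear(&slot_locks_sketch[i]);
}

/* Because LOCK_COUNT divides SLOT_COUNT, two hashes that land in the
 * same slot always map to the same lock. */
static unsigned int lock_index_sketch(uint32_t hash)
{
    return hash % LOCK_COUNT_SKETCH;
}

/* Update one slot while holding only the lock selected by its hash.
 * Entries that hash to a different lock can be handled in parallel. */
static unsigned int slot_increment_sketch(uint32_t hash)
{
    atomic_flag *lock = &slot_locks_sketch[lock_index_sketch(hash)];
    unsigned int value;

    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;   /* spin until the lock is free */
    value = ++slot_counts_sketch[hash % SLOT_COUNT_SKETCH];
    atomic_flag_clear_explicit(lock, memory_order_release);
    return value;
}
```

Compared with a single global lock, contention only occurs when two
CPUs work on entries that share a lock index.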

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 years agocan: mcp251x: Fix regulators operation without CONFIG_REGULATOR
Alexander Shiyan [Fri, 14 Mar 2014 08:46:20 +0000 (12:46 +0400)]
can: mcp251x: Fix regulators operation without CONFIG_REGULATOR

If CONFIG_REGULATOR is not set, devm_regulator_get() returns NULL,
so use IS_ERR_OR_NULL() macro for checks.
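
For readers unfamiliar with the error-pointer convention, the following
userspace sketch mimics what such a check does. It is a re-creation for
illustration with hypothetical _sketch names, not the kernel's err.h.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr_sketch(long err)
{
    return (void *)(intptr_t)err;
}

/* Error pointers occupy the top MAX_ERRNO_SKETCH addresses. */
static inline int is_err_sketch(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* NULL is not an error pointer, so a plain is_err check lets it
 * through; the combined check rejects both cases. */
static inline int is_err_or_null_sketch(const void *ptr)
{
    return ptr == NULL || is_err_sketch(ptr);
}
```

A driver that only tests for an error pointer would treat a NULL return
as a valid regulator, which is the case this fix guards against.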

Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
10 years agocan: populate netdev::dev_id for udev discrimination
Christopher R. Baker [Sat, 8 Mar 2014 16:00:20 +0000 (11:00 -0500)]
can: populate netdev::dev_id for udev discrimination

My objective is to be able to totally discriminate CAN ports on multi-port
cards via udev so as to rename them to semantically interesting/unique names
for my system (e.g., "ecuCAN" and "auxCAN" instead of "can0" and "can1").

The following patch assigns the dev_id field to match the channel number on all
multi-channel devices. I can only test my two-port Peak PCI card, but it works
as expected: ATTRS{dev_id} now expresses the port number and my udev rules now
unambiguously pick out and rename my individual CAN ports.
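
Userspace can also read the attribute directly. The helper below is a
small sketch that parses the text of a dev_id attribute, assuming it is
a hexadecimal number such as "0x1" followed by a newline; the function
name is hypothetical, and reading the sysfs file itself is left out.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse the contents of a dev_id attribute (assumed hex, e.g. "0x1\n").
 * Returns 0 and stores the value in *out on success, -1 on bad input. */
static int parse_dev_id_sketch(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 16);   /* base 16 accepts a "0x" prefix */
    if (end == text || errno != 0)
        return -1;
    if (*end != '\0' && *end != '\n')
        return -1;
    *out = value;
    return 0;
}
```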

Signed-off-by: Christopher R. Baker <cbaker@rec.ri.cmu.edu>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> [PEAK PCAN-USB pro and EMS PCMCIA]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>