cascardo/linux.git
7 years agokernel/trace/bpf_trace.c: work around gcc-4.4.4 anon union initialization bug
Andrew Morton [Mon, 18 Jul 2016 22:50:58 +0000 (15:50 -0700)]
kernel/trace/bpf_trace.c: work around gcc-4.4.4 anon union initialization bug

kernel/trace/bpf_trace.c: In function 'bpf_event_output':
kernel/trace/bpf_trace.c:312: error: unknown field 'next' specified in initializer
kernel/trace/bpf_trace.c:312: warning: missing braces around initializer
kernel/trace/bpf_trace.c:312: warning: (near initialization for 'raw.frag.<anonymous>')

Fixes: 555c8a8623a3a87 ("bpf: avoid stack copy and use skb ctx for event output")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovirtio-net: Remove more stack DMA
Andy Lutomirski [Mon, 18 Jul 2016 22:34:49 +0000 (15:34 -0700)]
virtio-net: Remove more stack DMA

VLAN and MQ control was doing DMA from the stack.  Fix it.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Remove locking around txr->dev_state
Florian Fainelli [Mon, 18 Jul 2016 20:02:47 +0000 (13:02 -0700)]
bnxt_en: Remove locking around txr->dev_state

txr->dev_state was not consistently manipulated with the acquisition of
the per-queue lock, after further inspection the lock does not seem
necessary, either the value is read as BNXT_DEV_STATE_CLOSING or 0.

Reported-by: coverity (CID 1339583)
Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'frag-udp-tunneled-skbs'
David S. Miller [Tue, 19 Jul 2016 23:40:22 +0000 (16:40 -0700)]
Merge branch 'frag-udp-tunneled-skbs'

Shmulik Ladkani says:

====================
net: Consider fragmentation of udp tunneled skbs in 'ip_finish_output_gso'

Currently IP fragmentation of GSO segments that exceed dst mtu is
considered only in the ipv4 forwarding case.

There are cases where GSO skbs that are bridged and then udp-tunneled
may have gso_size exceeding the egress device mtu.
It makes sense to fragment them, as in the non GSOed code path.

The exact cases where this behavior is needed is described and addressed
in the 2nd patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation...
Shmulik Ladkani [Mon, 18 Jul 2016 11:49:34 +0000 (14:49 +0300)]
net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs

Given:
 - tap0 and vxlan0 are bridged
 - vxlan0 stacked on eth0, eth0 having small mtu (e.g. 1400)

Assume GSO skbs arriving from tap0 having a gso_size as determined by
user-provided virtio_net_hdr (e.g. 1460 corresponding to VM mtu of 1500).

After encapsulation these skbs have skb_gso_network_seglen that exceed
eth0's ip_skb_dst_mtu.

These skbs are accidentally passed to ip_finish_output2 AS IS.
Alas, each final segment (segmented either by validate_xmit_skb or by
hardware UFO) would be larger than eth0 mtu.
As a result, those above-mtu segments get dropped on certain networks.

This behavior is not aligned with the NON-GSO case:
Assume a non-gso 1500-sized IP packet arrives from tap0. After
encapsulation, the vxlan datagram is fragmented normally at the
ip_finish_output-->ip_fragment code path.

The expected behavior for the GSO case would be segmenting the
"gso-oversized" skb first, then fragmenting each segment according to
dst mtu, and finally passing the resulting fragments to ip_finish_output2.

'ip_finish_output_gso' already supports this "Slowpath" behavior,
according to the IPSKB_FRAG_SEGS flag, which is only set during ipv4
forwarding (not set in the bridged case).

In order to support the bridged case, we'll mark skbs arriving from an
ingress interface that get udp-encaspulated as "allowed to be fragmented",
causing their network_seglen to be validated by 'ip_finish_output_gso'
(and fragment if needed).

Note the TUNNEL_DONT_FRAGMENT tun_flag is still honoured (both in the
gso and non-gso cases), which serves users wishing to forbid
fragmentation at the udp tunnel endpoint.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/ipv4: Introduce IPSKB_FRAG_SEGS bit to inet_skb_parm.flags
Shmulik Ladkani [Mon, 18 Jul 2016 11:49:33 +0000 (14:49 +0300)]
net/ipv4: Introduce IPSKB_FRAG_SEGS bit to inet_skb_parm.flags

This flag indicates whether fragmentation of segments is allowed.

Formerly this policy was hardcoded according to IPSKB_FORWARDED (set by
either ip_forward or ipmr_forward).

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bnxt_en-NS2-Nitro'
David S. Miller [Tue, 19 Jul 2016 23:29:41 +0000 (16:29 -0700)]
Merge branch 'bnxt_en-NS2-Nitro'

Michael Chan says:

====================
bnxt_en: Add support for NS2 Nitro.

This series adds support for the embedded version of the
ethernet controller (Nitro) in the North Star 2 SoC.  There are a number
of features not supported and a software workaround for a hardware rx
bug is required for Nitro A0.  Please review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Add BCM58700 PCI device ID for NS2 Nitro.
Prashant Sreedharan [Mon, 18 Jul 2016 11:15:25 +0000 (07:15 -0400)]
bnxt_en: Add BCM58700 PCI device ID for NS2 Nitro.

A bridge device in NS2 has the same device ID as the ethernet controller.
Add check to avoid probing the bridge device.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Workaround Nitro A0 RX hardware bug (part 4).
Prashant Sreedharan [Mon, 18 Jul 2016 11:15:24 +0000 (07:15 -0400)]
bnxt_en: Workaround Nitro A0 RX hardware bug (part 4).

Allocate special vnic for dropping packets not matching the RX filters.
First vnic is for normal RX packets and the driver will drop all
packets on the 2nd vnic.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Workaround Nitro A0 hardware RX bug (part 3).
Prashant Sreedharan [Mon, 18 Jul 2016 11:15:23 +0000 (07:15 -0400)]
bnxt_en: Workaround Nitro A0 hardware RX bug (part 3).

Allocate napi for special vnic, packets arriving on this
napi will simply be dropped and the buffers will be replenished back
to the HW.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Workaround Nitro A0 hardware RX bug (part 2).
Prashant Sreedharan [Mon, 18 Jul 2016 11:15:22 +0000 (07:15 -0400)]
bnxt_en: Workaround Nitro A0 hardware RX bug (part 2).

The hardware is unable to drop rx packets not matching the RX filters.  To
workaround it, we create a special VNIC and configure the hardware to
direct all packets not matching the filters to it.  We then setup the
driver to drop packets received on this VNIC.

This patch creates the infrastructure for this VNIC, reserves a
completion ring, and rx rings.  Only shared completion ring mode is
supported.  The next 2 patches add a NAPI to handle packets from this
VNIC and the setup of the VNIC.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Workaround Nitro A0 hardware RX bug (part 1).
Prashant Sreedharan [Mon, 18 Jul 2016 11:15:21 +0000 (07:15 -0400)]
bnxt_en: Workaround Nitro A0 hardware RX bug (part 1).

Nitro A0 has a hardware bug in the rx path.  The workaround is to create
a special COS context as a path for non-RSS (non-IP) packets.  Without this
workaround, the chip may stall when receiving RSS and non-RSS packets.

Add infrastructure to allow 2 contexts (RSS and CoS) per VNIC.  Allocate
and configure the CoS context for Nitro A0.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnxt_en: Add basic support for Nitro in North Star 2.
Prashant Sreedharan [Mon, 18 Jul 2016 11:15:20 +0000 (07:15 -0400)]
bnxt_en: Add basic support for Nitro in North Star 2.

Nitro is the embedded version of the ethernet controller in the North
Star 2 SoC.  Add basic code to recognize the chip ID and disable
the features (ntuple, TPA, ring and port statistics) not supported on
Nitro A0.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'marvell-phy'
David S. Miller [Tue, 19 Jul 2016 23:05:57 +0000 (16:05 -0700)]
Merge branch 'marvell-phy'

Charles-Antoine Couret says:

====================
Marvell phy: fiber interface configuration

Another patchset to manage correctly the fiber link for some concerned Marvell's
phy like 88E1512.

This patchset fixed the commit log for the third and last commits and a comment
in the first commit.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMarvell phy: add functions to suspend and resume both interfaces: fiber and copper...
Charles-Antoine Couret [Tue, 19 Jul 2016 09:13:13 +0000 (11:13 +0200)]
Marvell phy: add functions to suspend and resume both interfaces: fiber and copper links.

These functions used standards registers in a different page
for both interfaces: copper and fiber.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMarvell phy: add configuration of autonegociation for fiber link.
Charles-Antoine Couret [Tue, 19 Jul 2016 09:13:12 +0000 (11:13 +0200)]
Marvell phy: add configuration of autonegociation for fiber link.

To be correctly initilized, the fiber interface needs
to be configured via autonegociation registers which use
some customs options or registers.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMarvell phy: add field to get errors from fiber link.
Charles-Antoine Couret [Tue, 19 Jul 2016 09:13:11 +0000 (11:13 +0200)]
Marvell phy: add field to get errors from fiber link.

Add support for the fiber receiver error counter in the
statistics. Rename the current counter which is for copper errors to
phy_receive_errors_copper, so it is easy to distinguish copper from
fiber.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMarvell phy: check link status in case of fiber link.
Charles-Antoine Couret [Tue, 19 Jul 2016 09:13:10 +0000 (11:13 +0200)]
Marvell phy: check link status in case of fiber link.

For concerned phy, the fiber link is checked before the copper link.
According to datasheet, the link which is up is enabled.

If both links are down, copper link would be used.
To detect fiber link status, we used the real time status
because of troubles with the copper method.

Tested with Marvell 88E1512.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'renesas-dma-channel'
David S. Miller [Tue, 19 Jul 2016 23:01:34 +0000 (16:01 -0700)]
Merge branch 'renesas-dma-channel'

Sergei Shtylyov says:

====================
Fix DMA channel misreporting for the Renesas Ethernet drivers

Here's a set of 2 patches against DaveM's 'net.git' repo fixing up the DMA
channel reporting by 'ifconfig'...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosh_eth: fix DMA channel misreporting
Sergei Shtylyov [Sun, 17 Jul 2016 13:09:19 +0000 (16:09 +0300)]
sh_eth: fix DMA channel misreporting

Currently 'ifconfig' for the Ethernet devices handled by this driver  shows
"DMA chan: ff" while the driver doesn't use any DMA channels. Not assigning
a value to 'net_device::dma' causes 'ifconfig'  to  correctly not report  a
DMA channel.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoravb: fix DMA channel misreporting
Sergei Shtylyov [Sun, 17 Jul 2016 13:08:43 +0000 (16:08 +0300)]
ravb: fix DMA channel misreporting

Currently 'ifconfig' for the Ethernet devices handled by this driver  shows
"DMA chan: ff" while the driver doesn't use any DMA channels. Not assigning
a value to 'net_device::dma' causes 'ifconfig'  to  correctly not report  a
DMA channel.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodrivers: atm: nicstar: Use the correct function to free some resources
Christophe Jaillet [Sun, 17 Jul 2016 07:03:24 +0000 (09:03 +0200)]
drivers: atm: nicstar: Use the correct function to free some resources

In 'get_scq', 'dma_alloc_coherent' has been used to allocate some
resources, so we need to free them using 'dma_free_coherent' instead
of 'kfree'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ti: cpmac: Use the correct function to free some resources.
Christophe Jaillet [Sun, 17 Jul 2016 06:15:50 +0000 (08:15 +0200)]
net: ti: cpmac: Use the correct function to free some resources.

In 'cpmac_open', 'dma_alloc_coherent' has been used to allocate some
resources, so we need to free them using 'dma_free_coherent' instead
of 'kfree'.

Also, we don't need to free these resources if the allocation has failed.
So I have slighly modified the goto label in this case.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomacvtap: correctly free skb during socket destruction
Jason Wang [Tue, 19 Jul 2016 03:02:59 +0000 (11:02 +0800)]
macvtap: correctly free skb during socket destruction

We should use kfree_skb() instead of kfree() to free an skb.

Fixes: 362899b8725b ("macvtap: switch to use skb array")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: marvell: pxa168_eth: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Sun, 17 Jul 2016 21:30:46 +0000 (23:30 +0200)]
net: ethernet: marvell: pxa168_eth: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: marvell: pxa168_eth: use phydev from struct net_device
Philippe Reynes [Sun, 17 Jul 2016 21:30:45 +0000 (23:30 +0200)]
net: ethernet: marvell: pxa168_eth: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: adi: bfin_mac: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Sat, 16 Jul 2016 23:10:15 +0000 (01:10 +0200)]
net: ethernet: adi: bfin_mac: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

There was a check on CAP_NET_ADMIN in bfin_mac_ethtool_setsettings,
but this check is already done in dev_ethtool, so no need to repeat
it before calling the generic function.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: adi: bfin_mac: use phydev from struct net_device
Philippe Reynes [Sat, 16 Jul 2016 23:10:14 +0000 (01:10 +0200)]
net: ethernet: adi: bfin_mac: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodwc_eth_qos: Remove deprecated create_singlethread_workqueue
Bhaktipriya Shridhar [Sat, 16 Jul 2016 08:23:28 +0000 (13:53 +0530)]
dwc_eth_qos: Remove deprecated create_singlethread_workqueue

alloc_workqueue replaces deprecated create_singlethread_workqueue().

A dedicated workqueue has been used since the workitem viz
lp->txtimeout_reinit is involved in reinitialization if a TX timeout
occurs, which is necessary to guarantee forward progress in packet
processing. As a network device can be used during memory reclaim, the
workqueue needs forward progress guarantee under memory pressure.
WQ_MEM_RECLAIM has been set to ensure this.

Since there is only a single work item, explicit concurrency limit is
unnecessary here.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: bpf_event_entry_gen's alloc needs to be in atomic context
Daniel Borkmann [Fri, 15 Jul 2016 23:15:55 +0000 (01:15 +0200)]
bpf: bpf_event_entry_gen's alloc needs to be in atomic context

Should have been obvious, only called from bpf() syscall via map_update_elem()
that calls bpf_fd_array_map_update_elem() under RCU read lock and thus this
must also be in GFP_ATOMIC, of course.

Fixes: 3b1efb196eee ("bpf, maps: flush own entries on perf map release")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: fix GSO for IPv6
Marcelo Ricardo Leitner [Fri, 15 Jul 2016 19:40:02 +0000 (16:40 -0300)]
sctp: fix GSO for IPv6

commit 90017accff61 ("sctp: Add GSO support") didn't register SCTP GSO
offloading for IPv6 and yet didn't put any restrictions on generating
GSO packets while in IPv6, which causes all IPv6 GSO'ed packets to be
silently dropped.

The fix is to properly register the offload this time.

Fixes: 90017accff61 ("sctp: Add GSO support")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: recvmsg should be able to run even if sock is in closing state
Marcelo Ricardo Leitner [Fri, 15 Jul 2016 19:38:19 +0000 (16:38 -0300)]
sctp: recvmsg should be able to run even if sock is in closing state

Commit d46e416c11c8 missed to update some other places which checked for
the socket being TCP-style AND Established state, as Closing state has
some overlapping with the previous understanding of Established.

Without this fix, one of the effects is that some already queued rx
messages may not be readable anymore depending on how the association
teared down, and sending may also not be possible if peer initiated the
shutdown.

Also merge two if() blocks into one condition on sctp_sendmsg().

Cc: Xin Long <lucien.xin@gmail.com>
Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: ax88172x: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Fri, 15 Jul 2016 13:25:36 +0000 (15:25 +0200)]
net: usb: ax88172x: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'hisilicon-mdio-femac'
David S. Miller [Sun, 17 Jul 2016 04:33:12 +0000 (21:33 -0700)]
Merge branch 'hisilicon-mdio-femac'

Dongpo Li says:

====================
Add Hisilicon MDIO bus driver and FEMAC driver

This patch set adds a Hisilicon MDIO bus driver and
a Fast Ethernet MAC(FEMAC) driver.
We also abstract a general interface "of_phy_get_and_connect"
for PHY connect. User will have no bother with getting
"phy-mode" and "phy-handle" any more.

Changes in v1:
- Pass private data structure instead of struct mii_bus
  in MDIO read and write operation.
- Return the error which devm_clk_get() gives when MDIO probe.
- Leave the clock unprepared and disabled on error when MDIO probe.
- Abstract a general interface "of_phy_get_and_connect" for PHY connect.
- Remove the "_reset" suffixes in "reset-names" property.
- Enable tx per-packet interrupt when tx fifo full.
- Remove pointless compatible and add SoC specific compatible.
- Declare only one clock in MAC dts documentation.
- Add standard unit suffixes for "phy-reset-delays".
- Use a smaller NAPI poll weight 16 for our Fast Ethernet MAC.
- Use phy_ethtool_{get|set}_link_ksettings for ethtool ops.
- Use phydev from struct net_device in MAC driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hisilicon: Add Fast Ethernet MAC driver
Dongpo Li [Fri, 15 Jul 2016 08:26:35 +0000 (16:26 +0800)]
net: hisilicon: Add Fast Ethernet MAC driver

This patch adds the Hisilicon Fast Ethernet MAC(FEMAC) driver.
The FEMAC supports max speed 100Mbps and has been used in many
Hisilicon SoC.

Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
Reviewed-by: Jiancheng Xue <xuejiancheng@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoof_mdio: Abstract a general interface for phy connect
Dongpo Li [Fri, 15 Jul 2016 08:26:34 +0000 (16:26 +0800)]
of_mdio: Abstract a general interface for phy connect

Abstract a general interface "of_phy_get_and_connect"
for PHY connect. User will have no bother with getting
"phy-mode" and "phy-handle" any more.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
Reviewed-by: Jiancheng Xue <xuejiancheng@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: Add MDIO bus driver for the Hisilicon FEMAC
Dongpo Li [Fri, 15 Jul 2016 08:26:33 +0000 (16:26 +0800)]
net: Add MDIO bus driver for the Hisilicon FEMAC

This patch adds a separate driver for the MDIO interface of the
Hisilicon Fast Ethernet MAC.

Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
Reviewed-by: Jiancheng Xue <xuejiancheng@hisilicon.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: cpsw: make TI_CPSW_PHY_SEL invisible
Uwe Kleine-König [Fri, 15 Jul 2016 08:12:15 +0000 (10:12 +0200)]
net: cpsw: make TI_CPSW_PHY_SEL invisible

TI_CPSW_PHY_SEL depended on TI_CPSW and was selected by the latter. So
there is no reason to have this symbol visible.

A further optimisation would be to put the code for both symbols into a
single module which would allow to not export at least cpsw_phy_sel()
and simplify the module load process.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agowan/fsl_ucc_hdlc: rewrite error handling to make it clearer
Zhao Qiang [Fri, 15 Jul 2016 02:38:25 +0000 (10:38 +0800)]
wan/fsl_ucc_hdlc: rewrite error handling to make it clearer

It was used err_xxx for labeled statement, it is
not easy to understand, now use free_xxx for labeled
statement.

Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agowan/fsl_ucc_hdlc: remove reduplicative freed memory 'uhdlc_priv'
Zhao Qiang [Fri, 15 Jul 2016 02:38:24 +0000 (10:38 +0800)]
wan/fsl_ucc_hdlc: remove reduplicative freed memory 'uhdlc_priv'

'uhdlc_priv' has freed twice, drop the first one.

Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ipmr/ip6mr: add support for keeping an entry age
Nikolay Aleksandrov [Thu, 14 Jul 2016 16:28:27 +0000 (19:28 +0300)]
net: ipmr/ip6mr: add support for keeping an entry age

In preparation for hardware offloading of ipmr/ip6mr we need an
interface that allows to check (and later update) the age of entries.
Relying on stats alone can show activity but not actual age of the entry,
furthermore when there're tens of thousands of entries a lot of the
hardware implementations only support "hit" bits which are cleared on
read to denote that the entry was active and shouldn't be aged out,
these can then be naturally translated into age timestamp and will be
compatible with the software forwarding age. Using a lastuse entry doesn't
affect performance because the members in that cache line are written to
along with the age.
Since all new users are encouraged to use ipmr via netlink, this is
exported via the RTA_EXPIRES attribute.
Also do a minor local variable declaration style adjustment - arrange them
longest to shortest.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Shrijeet Mukherjee <shm@cumulusnetworks.com>
CC: Satish Ashok <sashok@cumulusnetworks.com>
CC: Donald Sharp <sharpd@cumulusnetworks.com>
CC: David S. Miller <davem@davemloft.net>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
CC: James Morris <jmorris@namei.org>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agorndis_host: Set valid random MAC on buggy devices
Kristian Evensen [Thu, 14 Jul 2016 08:23:03 +0000 (10:23 +0200)]
rndis_host: Set valid random MAC on buggy devices

Some devices of the same type all export the same, random MAC address. This
behavior has been seen on the ZTE MF910, MF823 and MF831, and there are
probably more devices out there. Fix this by generating a valid random MAC
address if we read a random MAC from device.

Also, changed the memcpy() to ether_addr_copy(), as pointed out by
checkpatch.

Suggested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bridge-rx-simplify-fwd-consolidate'
David S. Miller [Sun, 17 Jul 2016 02:57:38 +0000 (19:57 -0700)]
Merge branch 'bridge-rx-simplify-fwd-consolidate'

Nikolay Aleksandrov says:

====================
net: bridge: simplify receive path and consolidate forwarding paths

This set tries to simplify the receive and forwarding paths. Patch 01 is
a trivial style adjustment, patch 02 removes one conditional from the
unicast fast path, patch 03 removes another conditional and more imporantly
removes the skb0/skb2 ambiguity about locally receiving the skb and
switches to a boolean called "local_rcv".
Patch 04 is the most important change which consolidates the forwarding
paths for locally originated and forwarded packets into __br_forward. This
allows us to remove the function pointers giving a minor performance boost,
more importantly it makes it much easier to reason about the forwarding
path and reduces the code duplication that was needed when making changes.
Also it allows the receive path to fully setup the environment prior to
calling any forwarding functions (i.e. to properly set unicast, local_rcv
and search for unicast/mcast dst).
Functionally everything should stay the same after this set.

I've done basic tests with unicast/multicast/broadcast Tx/Rx. Please
review carefully.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bridge: remove _deliver functions and consolidate forward code
Nikolay Aleksandrov [Thu, 14 Jul 2016 03:10:02 +0000 (06:10 +0300)]
net: bridge: remove _deliver functions and consolidate forward code

Before this patch we had two flavors of most forwarding functions -
_forward and _deliver, the difference being that the latter are used
when the packets are locally originated. Instead of all this function
pointer passing and code duplication, we can just pass a boolean noting
that the packet was locally originated and use that to perform the
necessary checks in __br_forward. This gives a minor performance
improvement but more importantly consolidates the forwarding paths.
Also add a kernel doc comment to explain the exported br_forward()'s
arguments.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bridge: drop skb2/skb0 variables and use a local_rcv boolean
Nikolay Aleksandrov [Thu, 14 Jul 2016 03:10:01 +0000 (06:10 +0300)]
net: bridge: drop skb2/skb0 variables and use a local_rcv boolean

Currently if the packet is going to be received locally we set skb0 or
sometimes called skb2 variables to the original skb. This can get
confusing and also we can avoid one conditional on the fast path by
simply using a boolean and passing it around. Thanks to Roopa for the
name suggestion.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bridge: rearrange flood vs unicast receive paths
Nikolay Aleksandrov [Thu, 14 Jul 2016 03:10:00 +0000 (06:10 +0300)]
net: bridge: rearrange flood vs unicast receive paths

This patch removes one conditional from the unicast path by using the fact
that skb is NULL only when the packet is multicast or is local.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bridge: minor style adjustments in br_handle_frame_finish
Nikolay Aleksandrov [Thu, 14 Jul 2016 03:09:59 +0000 (06:09 +0300)]
net: bridge: minor style adjustments in br_handle_frame_finish

Trivial style changes in br_handle_frame_finish.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp_timer.c: Add kernel-doc function descriptions
Richard Sailer [Sat, 16 Jul 2016 02:04:34 +0000 (04:04 +0200)]
tcp_timer.c: Add kernel-doc function descriptions

This adds kernel-doc style descriptions for 6 functions and
fixes 1 typo.

Signed-off-by: Richard Sailer <richard@weltraumpflege.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ti: cpmac: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Fri, 15 Jul 2016 10:39:02 +0000 (12:39 +0200)]
net: ethernet: ti: cpmac: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

There was a check on CAP_NET_ADMIN in cpmac_set_settings, but this
check is already done in dev_ethtool, so no need to repeat it before
calling the generic function.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ti: cpmac: use phydev from struct net_device
Philippe Reynes [Fri, 15 Jul 2016 10:39:01 +0000 (12:39 +0200)]
net: ethernet: ti: cpmac: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: amd: au1000_eth: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Fri, 15 Jul 2016 10:05:12 +0000 (12:05 +0200)]
net: ethernet: amd: au1000_eth: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

There was a check on CAP_NET_ADMIN in au1000_set_settings, but this
check is already done in dev_ethtool, so no need to repeat it before
calling the generic function.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: amd: au1000_eth: use phydev from struct net_device
Philippe Reynes [Fri, 15 Jul 2016 10:05:11 +0000 (12:05 +0200)]
net: ethernet: amd: au1000_eth: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: smsc9420: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Fri, 15 Jul 2016 08:36:21 +0000 (10:36 +0200)]
net: ethernet: smsc9420: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: smsc9420: use phydev from struct net_device
Philippe Reynes [Fri, 15 Jul 2016 08:36:20 +0000 (10:36 +0200)]
net: ethernet: smsc9420: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ethoc: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Fri, 15 Jul 2016 07:59:12 +0000 (09:59 +0200)]
net: ethernet: ethoc: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ethoc: use phydev from struct net_device
Philippe Reynes [Fri, 15 Jul 2016 07:59:11 +0000 (09:59 +0200)]
net: ethernet: ethoc: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: pasemi_mac: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Thu, 14 Jul 2016 21:44:53 +0000 (23:44 +0200)]
net: ethernet: pasemi_mac: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: pasemi_mac: use phydev from struct net_device
Philippe Reynes [Thu, 14 Jul 2016 21:44:52 +0000 (23:44 +0200)]
net: ethernet: pasemi_mac: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: xilinx: axienet: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Thu, 14 Jul 2016 17:45:58 +0000 (19:45 +0200)]
net: ethernet: xilinx: axienet: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: xilinx: axienet: use phydev from struct net_device
Philippe Reynes [Thu, 14 Jul 2016 17:45:57 +0000 (19:45 +0200)]
net: ethernet: xilinx: axienet: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: tc35815: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Thu, 14 Jul 2016 13:20:47 +0000 (15:20 +0200)]
net: ethernet: tc35815: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: tc35815: use phydev from struct net_device
Philippe Reynes [Thu, 14 Jul 2016 13:20:46 +0000 (15:20 +0200)]
net: ethernet: tc35815: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: fixup for tracepoint napi:napi_poll
Jesper Dangaard Brouer [Fri, 15 Jul 2016 21:55:20 +0000 (23:55 +0200)]
net: fixup for tracepoint napi:napi_poll

The recent change to tracepoint napi:napi_poll changed the order of
the parameters that perf scripts sees, the printk was correct.  The
problem was that the new parameters (work and budget) were pushed
in front of dev_name.

The new parameters obviously need to be appended to keep backward
compatible.

Fixes: 1db19db7f5ff ("net: tracepoint napi:napi_poll add work and budget")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomacvtap: switch to use skb array
Jason Wang [Fri, 15 Jul 2016 07:46:31 +0000 (03:46 -0400)]
macvtap: switch to use skb array

This patch switch to use skb array instead of sk_receive_queue to
avoid spinlock contentions. Tests shows about 21% improvements for
guest rx pps:

Before: 1472731 pkts/s
After:  1786289 pkts/s

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomacvtap: avoid hash calculating for single queue
Jason Wang [Fri, 15 Jul 2016 07:46:30 +0000 (03:46 -0400)]
macvtap: avoid hash calculating for single queue

We decide the rxq through calculating its hash which is not necessary
if we only have one rx queue. So this patch skip this and just return
queue 0. Test shows 22% improving on guest rx pps.

Before: 1201504 pkts/s
After:  1472731 pkts/s

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bpf-event-output-helper-improvements'
David S. Miller [Fri, 15 Jul 2016 21:23:56 +0000 (14:23 -0700)]
Merge branch 'bpf-event-output-helper-improvements'

Daniel Borkmann says:

====================
BPF event output helper improvements

This set adds improvements to the BPF event output helper to
support non-linear data sampling, here specifically, for skb
context. For details please see individual patches. The set
is based against net-next tree.

v1 -> v2:
  - Integrated and adapted Peter's diff into patch 1, updated
    the remaining ones accordingly. Thanks Peter!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: avoid stack copy and use skb ctx for event output
Daniel Borkmann [Thu, 14 Jul 2016 16:08:05 +0000 (18:08 +0200)]
bpf: avoid stack copy and use skb ctx for event output

This work addresses a couple of issues bpf_skb_event_output()
helper currently has: i) We need two copies instead of just a
single one for the skb data when it should be part of a sample.
The data can be non-linear and thus needs to be extracted via
bpf_skb_load_bytes() helper first, and then copied once again
into the ring buffer slot. ii) Since bpf_skb_load_bytes()
currently needs to be used first, the helper needs to see a
constant size on the passed stack buffer to make sure BPF
verifier can do sanity checks on it during verification time.
Thus, just passing skb->len (or any other non-constant value)
wouldn't work, but changing bpf_skb_load_bytes() is also not
the proper solution, since the two copies are generally still
needed. iii) bpf_skb_load_bytes() is just for rather small
buffers like headers, since they need to sit on the limited
BPF stack anyway. Instead of working around in bpf_skb_load_bytes(),
this work improves the bpf_skb_event_output() helper to address
all 3 at once.

We can make use of the passed in skb context that we have in
the helper anyway, and use some of the reserved flag bits as
a length argument. The helper will use the new __output_custom()
facility from perf side with bpf_skb_copy() as callback helper
to walk and extract the data. It will pass the data for setup
to bpf_event_output(), which generates and pushes the raw record
with an additional frag part. The linear data used in the first
frag of the record serves as programmatically defined meta data
passed along with the appended sample.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, perf: split bpf_perf_event_output
Daniel Borkmann [Thu, 14 Jul 2016 16:08:04 +0000 (18:08 +0200)]
bpf, perf: split bpf_perf_event_output

Split the bpf_perf_event_output() helper as a preparation into
two parts. The new bpf_perf_event_output() will prepare the raw
record itself and test for unknown flags from BPF trace context,
where the __bpf_perf_event_output() does the core work. The
latter will be reused later on from bpf_event_output() directly.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoperf, events: add non-linear data support for raw records
Daniel Borkmann [Thu, 14 Jul 2016 16:08:03 +0000 (18:08 +0200)]
perf, events: add non-linear data support for raw records

This patch adds support for non-linear data on raw records. It
extends raw records to have one or multiple fragments that will
be written linearly into the ring slot, where each fragment can
optionally have a custom callback handler to walk and extract
complex, possibly non-linear data.

If a callback handler is provided for a fragment, then the new
__output_custom() will be used instead of __output_copy() for
the perf_output_sample() part. perf_prepare_sample() does all
the size calculation only once, so perf_output_sample() doesn't
need to redo the same work anymore, meaning real_size and padding
will be cached in the raw record. The raw record becomes 32 bytes
in size without holes; to not increase it further and to avoid
doing unnecessary recalculations in fast-path, we can reuse
next pointer of the last fragment, idea here is borrowed from
ZERO_OR_NULL_PTR(), which should keep the perf_output_sample()
path for PERF_SAMPLE_RAW minimal.

This facility is needed for BPF's event output helper as a first
user that will, in a follow-up, add an additional perf_raw_frag
to its perf_raw_record in order to be able to more efficiently
dump skb context after a linear head meta data related to it.
skbs can be non-linear and thus need a custom output function to
dump buffers. Currently, the skb data needs to be copied twice;
with the help of __output_custom() this work only needs to be
done once. Future users could be things like XDP/BPF programs
that work on different context though and would thus also have
a different callback function.

The few users of raw records are adapted to initialize their frag
data from the raw record itself, no change in behavior for them.
The code is based upon a PoC diff provided by Peter Zijlstra [1].

  [1] http://thread.gmane.org/gmane.linux.network/421294

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agorxrpc: checking for IS_ERR() instead of NULL
Dan Carpenter [Thu, 14 Jul 2016 14:47:01 +0000 (15:47 +0100)]
rxrpc: checking for IS_ERR() instead of NULL

The rxrpc_lookup_peer() function returns NULL on error, it never returns
error pointers.

Fixes: 8496af50eb38 ('rxrpc: Use RCU to access a peer's service connection tree')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: phy: micrel: Add KSZ8041FTL fiber mode support
Philipp Zabel [Thu, 14 Jul 2016 14:29:43 +0000 (16:29 +0200)]
net: phy: micrel: Add KSZ8041FTL fiber mode support

We can't detect the FXEN (fiber mode) bootstrap pin, so configure
it via a boolean device tree property "micrel,fiber-mode".
If it is enabled, auto-negotiation is not supported.
The only available modes are 100base-fx (full duplex and half duplex).

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agowan/fsl_ucc_hdlc: info leak in uhdlc_ioctl()
Dan Carpenter [Thu, 14 Jul 2016 11:16:53 +0000 (14:16 +0300)]
wan/fsl_ucc_hdlc: info leak in uhdlc_ioctl()

There is a 2 byte struct whole after line.loopback so we need to clear
that out to avoid disclosing stack information.

Fixes: c19b6d246a35 ('drivers/net: support hdlc function for QE-UCC')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'rds-enable-mprds'
David S. Miller [Fri, 15 Jul 2016 18:36:58 +0000 (11:36 -0700)]
Merge branch 'rds-enable-mprds'

Sowmini Varadhan says:

====================
RDS: TCP: Enable mprds for rds-tcp

The third, and final, installment for mprds-tcp changes.

In Patch 3 of this set, if the transport support t_mp_capable,
we hash outgoing traffic across multiple paths.  Additionally, even if
the transport is MP capable, we may be peering with some node that does
not support mprds, or supports a different number of paths. This
necessitates RDS control plane changes so that both peers agree
on the number of paths to be used for the rds-tcp connection.
Patch 3 implements all these changes, which are documented in patch 5
of the series.

Patch 1 of this series is a bug fix for a race-condition
that has always existed, but is now more easily encountered with mprds.
Patch 2 is code refactoring. Patches 4 and 5 are Documentation updates.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoDocumentation: RDS: Document Multipath RDS (mprds)
Sowmini Varadhan [Thu, 14 Jul 2016 10:51:05 +0000 (03:51 -0700)]
Documentation: RDS: Document Multipath RDS (mprds)

Document the design of mprds, covering a brief description
of the motivation, data-structures and modifications to the
RDS control plane.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoDocumentation: RDS: updates for SO_RDS_TRANSPORT socket option
Sowmini Varadhan [Thu, 14 Jul 2016 10:51:04 +0000 (03:51 -0700)]
Documentation: RDS: updates for SO_RDS_TRANSPORT socket option

Update the documentation to describe the changes added by
commit 8ba38460f363 ("net/rds Add getsockopt support for SO_RDS_TRANSPORT")

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoRDS: TCP: Enable multipath RDS for TCP
Sowmini Varadhan [Thu, 14 Jul 2016 10:51:03 +0000 (03:51 -0700)]
RDS: TCP: Enable multipath RDS for TCP

Use RDS probe-ping to compute how many paths may be used with
the peer, and to synchronously start the multiple paths. If mprds is
supported, hash outgoing traffic to one of multiple paths in rds_sendmsg()
when multipath RDS is supported by the transport.

CC: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoRDS: TCP: Reduce code duplication in rds_tcp_reset_callbacks()
Sowmini Varadhan [Thu, 14 Jul 2016 10:51:02 +0000 (03:51 -0700)]
RDS: TCP: Reduce code duplication in rds_tcp_reset_callbacks()

Some code duplication in rds_tcp_reset_callbacks() can be avoided
by having the function call rds_tcp_restore_callbacks() and
rds_tcp_set_callbacks().

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoRDS: TCP: avoid bad page reference in rds_tcp_listen_data_ready
Sowmini Varadhan [Thu, 14 Jul 2016 10:51:01 +0000 (03:51 -0700)]
RDS: TCP: avoid bad page reference in rds_tcp_listen_data_ready

As the existing comments in rds_tcp_listen_data_ready() indicate,
it is possible under some race-windows to get to this function with the
accept() socket. If that happens, we could run into a sequence whereby

   thread 1 thread 2

rds_tcp_accept_one() thread
sets up new_sock via ->accept().
The sk_user_data is now
sock_def_readable
data comes in for new_sock,
->sk_data_ready is called, and
we land in rds_tcp_listen_data_ready
rds_tcp_set_callbacks()
takes the sk_callback_lock and
sets up sk_user_data to be the cp
read_lock sk_callback_lock
ready = cp
unlock sk_callback_lock
page fault on ready

In the above sequence, we end up with a panic on a bad page reference
when trying to execute (*ready)(). Instead we need to call
sock_def_readable() safely, which is what this patch achieves.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodevlink: fix trace format string
Arnd Bergmann [Thu, 14 Jul 2016 09:37:29 +0000 (11:37 +0200)]
devlink: fix trace format string

Including devlink.h on ARM and probably other 32-bit architectures results in
a harmless warning:

In file included from ../include/trace/define_trace.h:95:0,
                 from ../include/trace/events/devlink.h:51,
                 from ../net/core/devlink.c:30:
include/trace/events/devlink.h: In function 'trace_raw_output_devlink_hwmsg':
include/trace/events/devlink.h:42:12: error: format '%lu' expects argument of type 'long unsigned int', but argument 10 has type 'size_t {aka unsigned int}' [-Werror=format=]

The correct format string for 'size_t' is %zu, not %lu, this works on all
architectures.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: e5224f0fe2ac ("devlink: add hardware messages tracing facility")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotracing: change owner name to driver name for devlink hwmsg tracepoint
Jiri Pirko [Thu, 14 Jul 2016 09:37:28 +0000 (11:37 +0200)]
tracing: change owner name to driver name for devlink hwmsg tracepoint

Turned on that driver->owner which is struct module is not available when
modules are disabled. Better to depend on a driver name which is
always available.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: e5224f0fe2 ("devlink: add hardware messages tracing facility")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum_router: Return -ENOENT in case of error
Christophe Jaillet [Thu, 14 Jul 2016 06:18:45 +0000 (08:18 +0200)]
mlxsw: spectrum_router: Return -ENOENT in case of error

'vr' should be a valid pointer here, so returning 'PTR_ERR(vr)' is wrong.
Return an explicit error code (-ENOENT) instead.

Fixes: 61c503f976 ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ll_temac: use phy_ethtool_{get|set}_link_ksettings
Philippe Reynes [Wed, 13 Jul 2016 23:48:52 +0000 (01:48 +0200)]
net: ethernet: ll_temac: use phy_ethtool_{get|set}_link_ksettings

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ll_temac: use phydev from struct net_device
Philippe Reynes [Wed, 13 Jul 2016 23:48:51 +0000 (01:48 +0200)]
net: ethernet: ll_temac: use phydev from struct net_device

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'wireless-drivers-next-for-davem-2016-07-13' of git://git.kernel.org/pub...
David S. Miller [Thu, 14 Jul 2016 23:27:42 +0000 (16:27 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2016-07-13' of git://git./linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for 4.8

Major changes:

iwlwifi

* more work on the RX path for the 9000 device series
* some more dynamic queue allocation work
* SAR BIOS implementation
* some work on debugging capabilities
* added support for GCMP encryption
* data path rework in preparation for new HW
* some cleanup to remove transport dependency on mac80211
* support for MSIx in preparation for new HW
* lots of work in preparation for HW support (9000 and a000 series)

mwifiex

* implement get_tx_power and get_antenna cfg80211 operation callbacks

wl18xx

* add support for 64bit clock

rtl8xxxu

* aggregation support (optional for now)

Also wireless-drivers is merged to fix some conflicts.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'pktgen-scripts'
David S. Miller [Thu, 14 Jul 2016 22:19:52 +0000 (15:19 -0700)]
Merge branch 'pktgen-scripts'

Jesper Dangaard Brouer says:

====================
pktgen samples: new scripts and removing older samples

This patchset is adding some pktgen sample scripts that I've been
using for a while[1], and they seams to relevant for more people.

Patchset also remove some of the older style pktgen samples.

[1] https://github.com/netoptimizer/network-testing/tree/master/pktgen
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopktgen: remove sample script pktgen.conf-1-1-rdos
Jesper Dangaard Brouer [Wed, 13 Jul 2016 20:06:15 +0000 (22:06 +0200)]
pktgen: remove sample script pktgen.conf-1-1-rdos

Removing the pktgen sample script pktgen.conf-1-1-rdos, because
it does not contain anything that is not covered by the other and
newer style sample scripts.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopktgen: add sample script pktgen_sample05_flow_per_thread.sh
Jesper Dangaard Brouer [Wed, 13 Jul 2016 20:06:10 +0000 (22:06 +0200)]
pktgen: add sample script pktgen_sample05_flow_per_thread.sh

This pktgen sample script is useful for scalability testing a
receiver.  The script will simply generate one flow per
thread (option -t N) using the thread number as part of the
source IP-address.

The single flow sample (pktgen_sample03_burst_single_flow.sh)
have become quite popular, but it is important that developers
also make sure to benchmark scalability of multiple receive
queues.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopktgen: add sample script pktgen_sample04_many_flows.sh
Jesper Dangaard Brouer [Wed, 13 Jul 2016 20:06:04 +0000 (22:06 +0200)]
pktgen: add sample script pktgen_sample04_many_flows.sh

Adding a pktgen sample script that demonstrates how to use pktgen
for simulating flows.  Script will generate a certain number of
concurrent flows ($FLOWS) and each flow will contain $FLOWLEN
packets, which will be send back-to-back, before switching to a
new flow, due to flag FLOW_SEQ.

This script obsoletes the old sample script 'pktgen.conf-1-1-flows',
which is removed.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mlx5-bulk-flow-stats-sriov-tc-offloads'
David S. Miller [Thu, 14 Jul 2016 20:34:30 +0000 (13:34 -0700)]
Merge branch 'mlx5-bulk-flow-stats-sriov-tc-offloads'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 Bulk flow statistics and SRIOV TC offloads

This series from Amir and Or deals with two enhancements for the mlx5 TC offloads.

The 1st two patches add bulk reading of flow counters. Few bulk counter queries are
used instead of issuing thousands firmware commands per second to get statistics of all
flows set to HW.

The next patches add TC based SRIOV offloading to mlx5, as a follow up for the e-switch
offloads mode and the VF representors. When the e-switch is set to the (new) "offloads"
mode, we can now offload TC/flower drop and forward rules, the forward action we offload
is TC mirred/redirect.

The above is done by the VF representor netdevices exporting the setup_tc ndo where from
there we're re-using and enhancing the existing mlx5 TC offloads sub-module which now
works for both the NIC and the SRIOV cases.

The series is applied on top b38a75d2d324 ('mlxsw: core: Trace EMAD messages')
and it has no merge issues with the on-going net submission ('mlx5 tx timeout watchdog fixes')

V2:
    - Fixed compilation warning.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Add TC offload support for the VF representors netdevice
Or Gerlitz [Thu, 14 Jul 2016 07:32:46 +0000 (10:32 +0300)]
net/mlx5e: Add TC offload support for the VF representors netdevice

The VF representors support only TC filter/action offloads
(not mqprio) and this is enabled for them by default.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Add TC HW support for FDB (SRIOV e-switch) offloads
Or Gerlitz [Thu, 14 Jul 2016 07:32:45 +0000 (10:32 +0300)]
net/mlx5e: Add TC HW support for FDB (SRIOV e-switch) offloads

Enhance the TC offload code such that when the eswitch exists and it's
mode being SRIOV offloads, we do TC actions parsing and setup targeted
for eswitch. Next, we add the offloaded flow to the HW e-switch (fdb).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads
Or Gerlitz [Thu, 14 Jul 2016 07:32:44 +0000 (10:32 +0300)]
net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads

Add the setup code that parses the TC actions needed to support offloading drop
and mirred/redirect for SRIOV e-switch. We can redirect between two devices if
they belong to the same HW switch, compare the switchdev HW ID attribute to
enforce that.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/switchdev: Export the same parent ID service function
Or Gerlitz [Thu, 14 Jul 2016 07:32:43 +0000 (10:32 +0300)]
net/switchdev: Export the same parent ID service function

This helper serves to know if two switchdev port netdevices belong to the
same HW ASIC, e.g to figure out if forwarding offload is possible between them.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Adjustments in the TC offload code towards reuse for SRIOV
Or Gerlitz [Thu, 14 Jul 2016 07:32:42 +0000 (10:32 +0300)]
net/mlx5e: Adjustments in the TC offload code towards reuse for SRIOV

Towards reusing the TC offloads code for an SRIOV use-case, change some of the
helper functions to have _nic in their names so it's clear what's NIC unique
and what's general. Also group together the NIC related helpers so we can easily
branch per the use-case in downstream patch.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: E-Switch, Add API to configure rules for the offloaded mode
Or Gerlitz [Thu, 14 Jul 2016 07:32:41 +0000 (10:32 +0300)]
net/mlx5: E-Switch, Add API to configure rules for the offloaded mode

This allows for upper levels in the driver, e.g the TC offload code to add
e-switch offloaded steering rules. The caller provides the rule spec for
matching, action, source and destination vports.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: E-Switch, Use two priorities for SRIOV offloads mode
Or Gerlitz [Thu, 14 Jul 2016 07:32:40 +0000 (10:32 +0300)]
net/mlx5: E-Switch, Use two priorities for SRIOV offloads mode

In the offloads mode, some slow path rules are added by the driver (e.g
send-to-vport), while offloaded rules are to be added from upper layers.

The slow path rules have lower priority and we don't want matching on
offloaded rules to suffer from extra steering hops related to the slow
path rules.

We use two priorities, one for offloaded rules (fast path), and one for
the control rules (slow path). To allow for that, we enable two priorities
for the FDB namespace in the FS core code.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Offload TC flow counters only when supported
Or Gerlitz [Thu, 14 Jul 2016 07:32:39 +0000 (10:32 +0300)]
net/mlx5e: Offload TC flow counters only when supported

Currenly, the code that programs the flow actions into the firmware
doesn't check if was actually asked to offload the statistics, fix that.

Fixes: aad7e08d39bd ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: Introduce bulk reading of flow counters
Amir Vadai [Thu, 14 Jul 2016 07:32:38 +0000 (10:32 +0300)]
net/mlx5: Introduce bulk reading of flow counters

This commit utilize the ability of ConnectX-4 to bulk read flow counters.
Few bulk counter queries could be done instead of issuing thousands of
firmware commands per second to get statistics of all flows set to HW,
such as those programmed when we offload tc filters.

Counters are stored sorted by hardware id, and queried in blocks (id +
number of counters).

Due to hardware requirement, start of block and number of counters in a
block must be four aligned.

Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: Store counters in rbtree instead of list
Amir Vadai [Thu, 14 Jul 2016 07:32:37 +0000 (10:32 +0300)]
net/mlx5: Store counters in rbtree instead of list

In order to use bulk counters, we need to have counters sorted by id.

Signed-off-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'nps_ent-coding-style'
David S. Miller [Thu, 14 Jul 2016 03:59:07 +0000 (20:59 -0700)]
Merge branch 'nps_ent-coding-style'

Elad Kanfi says:

====================
Code style fixes

Fix all checkpatch warnings and errors, and reuse code
====================

Signed-off-by: David S. Miller <davem@davemloft.net>