git.cascardo.info Git - cascardo/linux.git/log

David Howells [Fri, 4 Mar 2016 15:54:27 +0000 (15:54 +0000)]

rxrpc: The protocol family should be set to PF_RXRPC not PF_UNIX

Fix the protocol family set in the proto_ops for rxrpc to be PF_RXRPC not
PF_UNIX.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 4 Mar 2016 15:53:46 +0000 (15:53 +0000)]

rxrpc: Keep the skb private record of the Rx header in host byte order

Currently, a copy of the Rx packet header is copied into the the sk_buff
private data so that we can advance the pointer into the buffer,
potentially discarding the original. At the moment, this copy is held in
network byte order, but this means we're doing a lot of unnecessary
translations.

The reasons it was done this way are that we need the values in network
byte order occasionally and we can use the copy, slightly modified, as part
of an iov array when sending an ack or an abort packet.

However, it seems more reasonable on review that it would be better kept in
host byte order and that we make up a new header when we want to send
another packet.

To this end, rename the original header struct to rxrpc_wire_header (with
BE fields) and institute a variant called rxrpc_host_header that has host
order fields. Change the struct in the sk_buff private data into an
rxrpc_host_header and translate the values when filling it in.

This further allows us to keep values kept in various structures in host
byte order rather than network byte order and allows removal of some fields
that are byteswapped duplicates.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 4 Mar 2016 15:53:46 +0000 (15:53 +0000)]

rxrpc: Rename call events to begin RXRPC_CALL_EV_

Rename call event names to begin RXRPC_CALL_EV_ to distinguish them from the
flags.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 4 Mar 2016 15:53:46 +0000 (15:53 +0000)]

rxrpc: Convert call flag and event numbers into enums

Convert call flag and event numbers into enums and move their definitions
outside of the struct.

Also move the call state enum outside of the struct and add an extra
element to count the number of states.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

David Howells [Fri, 4 Mar 2016 15:53:46 +0000 (15:53 +0000)]

rxrpc: Fix a case where a call event bit is being used as a flag bit

Fix a case where RXRPC_CALL_RELEASE (an event) is being used to specify a
flag bit. RXRPC_CALL_RELEASED should be used instead.

Signed-off-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

Alexandre Courbot [Fri, 26 Feb 2016 09:06:53 +0000 (18:06 +0900)]

gpu: host1x: Set DMA ops on device creation

Currently host1x-instanciated devices have their dma_ops left to NULL,
which makes any DMA operation (like buffer import) on ARM64 fallback
to the dummy_dma_ops and fail with an error.

This patch calls of_dma_configure() with the host1x node when creating
such a device, so the proper DMA operations are set.

Suggested-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

commit | commitdiff | tree

Alexandre Courbot [Fri, 26 Feb 2016 09:06:52 +0000 (18:06 +0900)]

gpu: host1x: Set DMA mask

The default DMA mask covers a 32 bits address range, but host1x devices
can address a larger range on TK1 and TX1. Set the DMA mask to the range
addressable when we use the IOMMU to prevent the use of bounce buffers.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

commit | commitdiff | tree

Steven Rostedt (Red Hat) [Thu, 3 Mar 2016 22:18:20 +0000 (17:18 -0500)]

tracing: Do not have 'comm' filter override event 'comm' field

Commit 9f61668073a8d "tracing: Allow triggers to filter for CPU ids and
process names" added a 'comm' filter that will filter events based on the
current tasks struct 'comm'. But this now hides the ability to filter events
that have a 'comm' field too. For example, sched_migrate_task trace event.
That has a 'comm' field of the task to be migrated.

echo 'comm == "bash"' > events/sched_migrate_task/filter

will now filter all sched_migrate_task events for tasks named "bash" that
migrates other tasks (in interrupt context), instead of seeing when "bash"
itself gets migrated.

This fix requires a couple of changes.

1) Change the look up order for filter predicates to look at the events
   fields before looking at the generic filters.

2) Instead of basing the filter function off of the "comm" name, have the
   generic "comm" filter have its own filter_type (FILTER_COMM). Test
   against the type instead of the name to assign the filter function.

3) Add a new "COMM" filter that works just like "comm" but will filter based
   on the current task, even if the trace event contains a "comm" field.

Do the same for "cpu" field, adding a FILTER_CPU and a filter "CPU".

Cc: stable@vger.kernel.org # v4.3+
Fixes: 9f61668073a8d "tracing: Allow triggers to filter for CPU ids and process names"
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

commit | commitdiff | tree

Ed Spiridonov [Fri, 4 Mar 2016 06:07:27 +0000 (09:07 +0300)]

can: mcp251x: avoid write to error flag register if it's unnecessary

Only two bits (RX0OVR and RX1OVR) are writable in EFLG, write is useless
if these bits aren't set.

Signed-off-by: Ed Spiridonov <edo.rus@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

commit | commitdiff | tree

Libin Yang [Fri, 4 Mar 2016 06:33:43 +0000 (14:33 +0800)]

ALSA: hda - hdmi defer to register acomp eld notifier

Defer to register acomp eld notifier until hdmi audio driver
is fully ready.

After registering eld notifier, gfx driver can use this
callback function to notify audio driver the monitor
connection event. However this action may happen when
audio driver is adding the pins or doing other initialization.
This is not always safe, however. For example, using
per_pin->lock before the lock is initialized.

Let's register the eld notifier after the initialization is done.

Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

commit | commitdiff | tree

Libin Yang [Fri, 4 Mar 2016 06:33:06 +0000 (14:33 +0800)]

ALSA: hda - hdmi add wmb barrier for audio component

To make sure audio_ptr is set before intel_audio_codec_enable()
or intel_audio_codec_disable() calling pin_eld_notify(),
this patch adds wmb barrier to prevent optimizing.

Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

commit | commitdiff | tree

Scott Wood [Thu, 3 Mar 2016 04:51:04 +0000 (22:51 -0600)]

powerpc/fsl-book3e: Avoid lbarx on e5500

lbarx/stbcx. are implemented on e6500, but not on e5500.
Likewise, SMT is on e6500, but not on e5500.

So, avoid executing an unimplemented instruction by only locking
when needed (i.e. in the presence of SMT).

Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Dave Airlie [Fri, 4 Mar 2016 03:51:53 +0000 (13:51 +1000)]

Merge tag 'drm-intel-fixes-2016-03-03' of git://anongit.freedesktop.org/drm-intel into drm-fixes

Small conflict as I had the balance in my tree already for testing.

* tag 'drm-intel-fixes-2016-03-03' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Balance assert_rpm_wakelock_held() for !IS_ENABLED(CONFIG_PM)
drm/i915/skl: Fix power domain suspend sequence

commit | commitdiff | tree

Filipe Manana [Wed, 2 Mar 2016 15:49:38 +0000 (15:49 +0000)]

Btrfs: fix loading of orphan roots leading to BUG_ON

When looking for orphan roots during mount we can end up hitting a
BUG_ON() (at root-item.c:btrfs_find_orphan_roots()) if a log tree is
replayed and qgroups are enabled. This is because after a log tree is
replayed, a transaction commit is made, which triggers qgroup extent
accounting which in turn does backref walking which ends up reading and
inserting all roots in the radix tree fs_info->fs_root_radix, including
orphan roots (deleted snapshots). So after the log tree is replayed, when
finding orphan roots we hit the BUG_ON with the following trace:

[118209.182438] ------------[ cut here ]------------
[118209.183279] kernel BUG at fs/btrfs/root-tree.c:314!
[118209.184074] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[118209.185123] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic ppdev xor raid6_pq evdev sg parport_pc parport acpi_cpufreq tpm_tis tpm psmouse
processor i2c_piix4 serio_raw pcspkr i2c_core button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata
virtio_pci virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
[118209.186318] CPU: 14 PID: 28428 Comm: mount Tainted: G        W       4.5.0-rc5-btrfs-next-24+ #1
[118209.186318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[118209.186318] task: ffff8801ec131040 ti: ffff8800af34c000 task.ti: ffff8800af34c000
[118209.186318] RIP: 0010:[<ffffffffa04237d7>]  [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
[118209.186318] RSP: 0018:ffff8800af34faa8  EFLAGS: 00010246
[118209.186318] RAX: 00000000ffffffef RBX: 00000000ffffffef RCX: 0000000000000001
[118209.186318] RDX: 0000000080000000 RSI: 0000000000000001 RDI: 00000000ffffffff
[118209.186318] RBP: ffff8800af34fb08 R08: 0000000000000001 R09: 0000000000000000
[118209.186318] R10: ffff8800af34f9f0 R11: 6db6db6db6db6db7 R12: ffff880171b97000
[118209.186318] R13: ffff8801ca9d65e0 R14: ffff8800afa2e000 R15: 0000160000000000
[118209.186318] FS:  00007f5bcb914840(0000) GS:ffff88023edc0000(0000) knlGS:0000000000000000
[118209.186318] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[118209.186318] CR2: 00007f5bcaceb5d9 CR3: 00000000b49b5000 CR4: 00000000000006e0
[118209.186318] Stack:
[118209.186318]  fffffbffffffffff 010230ffffffffff 0101000000000000 ff84000000000000
[118209.186318]  fbffffffffffffff 30ffffffffffffff 0000000000000101 ffff880082348000
[118209.186318]  0000000000000000 ffff8800afa2e000 ffff8800afa2e000 0000000000000000
[118209.186318] Call Trace:
[118209.186318]  [<ffffffffa042e2db>] open_ctree+0x1e37/0x21b9 [btrfs]
[118209.186318]  [<ffffffffa040a753>] btrfs_mount+0x97e/0xaed [btrfs]
[118209.186318]  [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
[118209.186318]  [<ffffffff8117b87e>] mount_fs+0x67/0x131
[118209.186318]  [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
[118209.186318]  [<ffffffffa0409f81>] btrfs_mount+0x1ac/0xaed [btrfs]
[118209.186318]  [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
[118209.186318]  [<ffffffff8108c26b>] ? lockdep_init_map+0xb9/0x1b3
[118209.186318]  [<ffffffff8117b87e>] mount_fs+0x67/0x131
[118209.186318]  [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
[118209.186318]  [<ffffffff81195637>] do_mount+0x8a6/0x9e8
[118209.186318]  [<ffffffff8119598d>] SyS_mount+0x77/0x9f
[118209.186318]  [<ffffffff81493017>] entry_SYSCALL_64_fastpath+0x12/0x6b
[118209.186318] Code: 64 00 00 85 c0 89 c3 75 24 f0 41 80 4c 24 20 20 49 8b bc 24 f0 01 00 00 4c 89 e6 e8 e8 65 00 00 85 c0 89 c3 74 11 83 f8 ef 75 02 <0f> 0b
4c 89 e7 e8 da 72 00 00 eb 1c 41 83 bc 24 00 01 00 00 00
[118209.186318] RIP  [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
[118209.186318]  RSP <ffff8800af34faa8>
[118209.230735] ---[ end trace 83938f987d85d477 ]---

So fix this by not treating the error -EEXIST, returned when attempting
to insert a root already inserted by the backref walking code, as an error.

The following test case for xfstests reproduces the bug:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      cd /
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_dm_target flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  _run_btrfs_util_prog quota enable $SCRATCH_MNT

  # Create 2 directories with one file in one of them.
  # We use these just to trigger a transaction commit later, moving the file from
  # directory a to directory b and doing an fsync against directory a.
  mkdir $SCRATCH_MNT/a
  mkdir $SCRATCH_MNT/b
  touch $SCRATCH_MNT/a/f
  sync

  # Create our test file with 2 4K extents.
  $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

  # Create a snapshot and delete it. This doesn't really delete the snapshot
  # immediately, just makes it inaccessible and invisible to user space, the
  # snapshot is deleted later by a dedicated kernel thread (cleaner kthread)
  # which is woke up at the next transaction commit.
  # A root orphan item is inserted into the tree of tree roots, so that if a
  # power failure happens before the dedicated kernel thread does the snapshot
  # deletion, the next time the filesystem is mounted it resumes the snapshot
  # deletion.
  _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
  _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap

  # Now overwrite half of the extents we wrote before. Because we made a snapshpot
  # before, which isn't really deleted yet (since no transaction commit happened
  # after we did the snapshot delete request), the non overwritten extents get
  # referenced twice, once by the default subvolume and once by the snapshot.
  $XFS_IO_PROG -c "pwrite -S 0xbb 4K 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

  # Now move file f from directory a to directory b and fsync directory a.
  # The fsync on the directory a triggers a transaction commit (because a file
  # was moved from it to another directory) and the file fsync leaves a log tree
  # with file extent items to replay.
  mv $SCRATCH_MNT/a/f $SCRATCH_MNT/a/b
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

  echo "File digest before power failure:"
  md5sum $SCRATCH_MNT/foobar | _filter_scratch

  # Now simulate a power failure and mount the filesystem to replay the log tree.
  # After the log tree was replayed, we used to hit a BUG_ON() when processing
  # the root orphan item for the deleted snapshot. This is because when processing
  # an orphan root the code expected to be the first code inserting the root into
  # the fs_info->fs_root_radix radix tree, while in reallity it was the second
  # caller attempting to do it - the first caller was the transaction commit that
  # took place after replaying the log tree, when updating the qgroup counters.
  _flakey_drop_and_remount

  echo "File digest before after failure:"
  # Must match what he got before the power failure.
  md5sum $SCRATCH_MNT/foobar | _filter_scratch

  _unmount_flakey
  status=0
  exit

Fixes: 2d9e97761087 ("Btrfs: use btrfs_get_fs_root in resolve_indirect_ref")
Cc: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>

commit | commitdiff | tree

Eric Dumazet [Wed, 2 Mar 2016 16:21:43 +0000 (08:21 -0800)]

net: sched: use pfifo_fast for non real queues

Some devices declare a high number of TX queues, then set a much
lower real_num_tx_queues

This cause setups using fq_codel, sfq or fq as the default qdisc to consume
more memory than really needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Ahern [Wed, 2 Mar 2016 19:30:07 +0000 (11:30 -0800)]

net: ipv6: Fix refcnt on host routes

Andrew and Ying Huang's test robot both reported usage count problems that
trace back to the 'keep address on ifdown' patch.

>From Andrew:
We execute CRIU test on linux-next. On the current linux-next kernel
they hangs on creating a network namespace.

The kernel log contains many massages like this:
[ 1036.122108] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[ 1046.165156] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[ 1056.210287] unregister_netdevice: waiting for lo to become free.
Usage count = 2

I tried to revert this patch and the bug disappeared.

Here is a set of commands to reproduce this bug:

[root@linux-next-test linux-next]# uname -a
Linux linux-next-test 4.5.0-rc6-next-20160301+ #3 SMP Wed Mar 2
17:32:18 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@linux-next-test ~]# unshare -n
[root@linux-next-test ~]# ip link set up dev lo
[root@linux-next-test ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root@linux-next-test ~]# logout
[root@linux-next-test ~]# unshare -n

-----

The problem is a change made to RTM_DELADDR case in __ipv6_ifa_notify that
was added in an early version of the offending patch and is no longer
needed.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Cc: Andrey Wagin <avagin@gmail.com>
Cc: Ying Huang <ying.huang@linux.intel.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Lada Trimasova [Thu, 3 Mar 2016 14:07:46 +0000 (17:07 +0300)]

net: ezchip: adapt driver to little endian architecture

Since ezchip network driver is written with big endian EZChip platform it
is necessary to add support for little endian architecture.

The first issue is that the order of the bits in a bit field is
implementation specific. So all the bit fields are removed.
Named constants are used to access necessary fields.

And the second one is that network byte order is big endian.
For example, data on ethernet is transmitted with most-significant
octet (byte) first. So in case of little endian architecture
it is important to swap data byte order when we read it from
register. In case of unaligned access we can use "get_unaligned_be32"
and in other case we can use function "ioread32_rep" which reads all
data from register and works either with little endian or big endian
architecture.

And then when we are going to write data to register we need to restore
byte order using the function "put_unaligned_be32" in case of
unaligned access and in other case "iowrite32_rep".

The last little fix is a space between type and pointer to observe
coding style.

Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Tal Zilcer <talz@ezchip.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Venkat Duvvuru [Wed, 2 Mar 2016 11:00:28 +0000 (06:00 -0500)]

be2net: don't enable multicast flag in be_enable_if_filters() routine

When the interface is opened (in be_open()) the routine
be_enable_if_filters() must be called to switch on the basic filtering
capabilities of an interface that are not changed at run-time.
These include the flags UNTAGGED, BROADCAST and PASS_L3L4_ERRORS.
Other flags such as MULTICAST and PROMISC must be enabled later by
be_set_rx_mode() based on the state in the netdev/adapter struct.

be_enable_if_filters() routine is wrongly trying to enable MULTICAST flag
without checking the current adapter state. This can cause the RX_FILTER
cmds to the FW to fail. This patch fixes this problem by only enabling
the basic filtering flags in be_enable_if_filters().

The VF must be able to issue RX_FILTER cmd with any filter flag, as long
as the PF allowed those flags (if_cap_flags) in the iface it provisioned
for the VF. This rule is applicable even when the VF doesn't have the
FILTMGMT privilege. There is a bug in BE3 FW that wrongly fails RX_FILTER
multicast programming cmds on VFs that don't have FILTMGMT privilege.
This patch also helps in insulating the VF driver from be_open failures due
to the FW bug. A fix for the BE3 FW issue will be available in
versions >= 11.0.283.0 and 10.6.334.0

Reported-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Dan Carpenter [Wed, 2 Mar 2016 10:11:10 +0000 (13:11 +0300)]

net: moxa: fix an error code

We accidentally return IS_ERR(priv->base) which is 1 instead of
PTR_ERR(priv->base) which is the error code.

Fixes: 6c821bd9edc9 ('net: Add MOXA ART SoCs ethernet driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Nimrod Andy [Wed, 2 Mar 2016 09:24:53 +0000 (17:24 +0800)]

MAINTAINERS: add maintainer entry for FREESCALE FEC ethernet driver

Add a maintainer entry for FREESCALE FEC ethernet driver and add myself
as a maintainer.

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Wed, 2 Mar 2016 01:32:08 +0000 (02:32 +0100)]

vxlan: fix missing options_len update on RX with collect metadata

When signalling to metadata consumers that the metadata_dst entry
carries additional GBP extension data for vxlan (TUNNEL_VXLAN_OPT),
the dst's vxlan_metadata information is populated, but options_len
is left to zero. F.e. in ovs, ovs_flow_key_extract() checks for
options_len before extracting the data through ip_tunnel_info_opts_get().

Geneve uses ip_tunnel_info_opts_set() helper in receive path, which
sets options_len internally, vxlan however uses ip_tunnel_info_opts(),
so when filling vxlan_metadata, we do need to update options_len.

Fixes: 4c22279848c5 ("ip-tunnel: Use API to access tunnel metadata options.")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Simon Horman [Wed, 2 Mar 2016 01:28:13 +0000 (10:28 +0900)]

sh_eth, ravb: Use ARCH_RENESAS

Make use of ARCH_RENESAS in place of ARCH_SHMOBILE.

This is part of an ongoing process to migrate from ARCH_SHMOBILE to
ARCH_RENESAS the motivation for which being that RENESAS seems to be a more
appropriate name than SHMOBILE for the majority of Renesas ARM based SoCs.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Arnd Bergmann [Wed, 2 Mar 2016 09:40:54 +0000 (10:40 +0100)]

net: mellanox: add DEVLINK dependencies

The new NET_DEVLINK infrastructure can be a loadable module, but the drivers
using it might be built-in, which causes link errors like:

drivers/net/built-in.o: In function `mlx4_load_one':
:(.text+0x2fbfda): undefined reference to `devlink_port_register'
:(.text+0x2fc084): undefined reference to `devlink_port_unregister'
drivers/net/built-in.o: In function `mlxsw_sx_port_remove':
:(.text+0x33a03a): undefined reference to `devlink_port_type_clear'
:(.text+0x33a04e): undefined reference to `devlink_port_unregister'

There are multiple ways to avoid this:

a) add 'depends on NET_DEVLINK || !NET_DEVLINK' dependencies
   for each user
b) use 'select NET_DEVLINK' from each driver that uses it
   and hide the symbol in Kconfig.
c) make NET_DEVLINK a 'bool' option so we don't have to
   list it as a dependency, and rely on the APIs to be
   stubbed out when it is disabled
d) use IS_REACHABLE() rather than IS_ENABLED() to check for
   NET_DEVLINK in include/net/devlink.h

This implements a variation of approach a) by adding an
intermediate symbol that drivers can depend on, and changes
the three drivers using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 09d4d087cd48 ("mlx4: Implement devlink interface")
Fixes: c4745500e988 ("mlxsw: Implement devlink interface")
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Christoph Hellwig [Wed, 2 Mar 2016 17:07:14 +0000 (18:07 +0100)]

block: support large requests in blk_rq_map_user_iov

This patch adds support for larger requests in blk_rq_map_user_iov by
allowing it to build multiple bios for a request. This functionality
used to exist for the non-vectored blk_rq_map_user in the past, and
this patch reuses the existing functionality for it on the unmap side,
which stuck around. Thanks to the iov_iter API supporting multiple
bios is fairly trivial, as we can just iterate the iov until we've
consumed the whole iov_iter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Christoph Hellwig [Thu, 3 Mar 2016 21:43:45 +0000 (14:43 -0700)]

block: fix blk_rq_get_max_sectors for driver private requests

Driver private request types should not get the artifical cap for the
FS requests. This is important to use the full device capabilities
for internal command or NVMe pass through commands.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Updated by me to use an explicit check for the one command type that
does support extended checking, instead of relying on the ordering
of the enum command values - as suggested by Keith.

Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Christoph Hellwig [Wed, 2 Mar 2016 17:07:12 +0000 (18:07 +0100)]

nvme: fix max_segments integer truncation

The block layer uses an unsigned short for max_segments. The way we
calculate the value for NVMe tends to generate very large 32-bit values,
which after integer truncation may lead to a zero value instead of
the desired outcome.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Christoph Hellwig [Wed, 2 Mar 2016 17:07:11 +0000 (18:07 +0100)]

nvme: set queue limits for the admin queue

Factor out a helper to set all the device specific queue limits and apply
them to the admin queue in addition to the I/O queues. Without this the
command size on the admin queue is arbitrarily low, and the missing
other limitations are just minefields waiting for victims.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Tejun Heo [Mon, 29 Feb 2016 23:28:53 +0000 (18:28 -0500)]

writeback: flush inode cgroup wb switches instead of pinning super_block

If cgroup writeback is in use, inodes can be scheduled for
asynchronous wb switching.  Before 5ff8eaac1636 ("writeback: keep
superblock pinned during cgroup writeback association switches"), this
could race with umount leading to super_block being destroyed while
inodes are pinned for wb switching.  5ff8eaac1636 fixed it by bumping
s_active while wb switches are in flight; however, this allowed
in-flight wb switches to make umounts asynchronous when the userland
expected synchronosity - e.g. fsck immediately following umount may
fail because the device is still busy.

This patch removes the problematic super_block pinning and instead
makes generic_shutdown_super() flush in-flight wb switches.  wb
switches are now executed on a dedicated isw_wq so that they can be
flushed and isw_nr_in_flight keeps track of the number of in-flight wb
switches so that flushing can be avoided in most cases.

v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE
    check in inode_switch_wbs() as Jan an Al suggested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes: 5ff8eaac1636 ("writeback: keep superblock pinned during cgroup writeback association switches")
Cc: stable@vger.kernel.org #v4.5
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:58 +0000 (09:15 -0700)]

NVMe: Fix 0-length integrity payload

A user could send a passthrough IO command with a metadata pointer to a
namespace without metadata. With metadata length of 0, kmalloc returns
ZERO_SIZE_PTR. Since that is not NULL, the driver would have set this as
the bio's integrity payload, which causes an access fault on completion.

This patch ignores the users metadata buffer if the namespace format
does not support separate metadata.

Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:57 +0000 (09:15 -0700)]

NVMe: Don't allow unsupported flags

The command flags can change the meaning of other fields in the command
that the driver is not prepared to handle. Specifically, the user could
passthrough an SGL flag, causing the controller to misinterpret the PRP
list the driver created, potentially corrupting memory or data.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:56 +0000 (09:15 -0700)]

NVMe: Move error handling to failed reset handler

This moves failed queue handling out of the namespace removal path and
into the reset failure path, fixing a hanging condition if the controller
fails or link down during del_gendisk. Previously the driver had to see
the controller as degraded prior to calling del_gendisk to setup the
queues to fail. But, if the controller happened to fail after this,
there was no task to end outstanding requests.

On failure, all namespace states are set to dead. This has capacity
revalidate to 0, and ends all new requests with error status.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:55 +0000 (09:15 -0700)]

NVMe: Simplify device reset failure

A reset failure schedules the device to unbind from the driver through
the pci driver's remove. This cleans up all intialization, so there is
no need to duplicate the potentially racy cleanup.

To help understand why a reset failed, the status is logged with the
existing warning message.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:54 +0000 (09:15 -0700)]

NVMe: Fix namespace removal deadlock

This patch makes nvme namespace removal lockless. It is up to the caller
to ensure no active namespace scanning is occuring. To ensure no scan
work occurs, the nvme pci driver adds a removing state to the controller
device to avoid queueing scan work during removal. The work is flushed
after setting the state, so no new scan work can be queued.

The lockless removal allows the driver to cleanup a namespace
request_queue if the controller fails during removal. Previously this
could deadlock trying to acquire the namespace mutex in order to handle
such events.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:53 +0000 (09:15 -0700)]

NVMe: Use IDA for namespace disk naming

A namespace may be detached from a controller, but a user may be holding
a reference to it. Attaching a new namespace with the same NSID will create
duplicate names when using the NSID to name the disk.

This patch uses an IDA that is released only when the last reference is
released instead of using the namespace ID.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Keith Busch [Wed, 24 Feb 2016 16:15:52 +0000 (09:15 -0700)]

NVMe: Don't unmap controller registers on reset

Unmapping the registers on reset or shutdown is not necessary. Keeping
the mapping simplifies reset handling.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Ming Lei [Fri, 26 Feb 2016 15:40:53 +0000 (23:40 +0800)]

block: merge: get the 1st and last bvec via helpers

This patch applies the two introduced helpers to
figure out the 1st and last bvec.

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Ming Lei [Fri, 26 Feb 2016 15:40:52 +0000 (23:40 +0800)]

block: get the 1st and last bvec via helpers

This patch applies the two introduced helpers to
figure out the 1st and last bvec, and fixes the
original way after bio splitting.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Ming Lei [Fri, 26 Feb 2016 15:40:51 +0000 (23:40 +0800)]

block: check virt boundary in bio_will_gap()

In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Ming Lei [Fri, 26 Feb 2016 15:40:50 +0000 (23:40 +0800)]

block: bio: introduce helpers to get the 1st and last bvec

The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core.

Unfortunately bio_will_gap() just figures out the last bvec via
'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously
wrong.

This patch introduces two helpers for getting the first and last
bvec of one bio for fixing the issue.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

commit | commitdiff | tree

Wolfram Sang [Tue, 1 Mar 2016 16:37:59 +0000 (17:37 +0100)]

net: ethernet: renesas: sh_eth: don't open code of_device_get_match_data()

This change will also make Coverity happy by avoiding a theoretical NULL
pointer dereference; yet another reason is to use the above helper function
to tighten the code and make it more readable.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Wolfram Sang [Tue, 1 Mar 2016 16:37:58 +0000 (17:37 +0100)]

net: ethernet: renesas: ravb_main: don't open code of_device_get_match_data()

This change will also make Coverity happy by avoiding a theoretical NULL
pointer dereference; yet another reason is to use the above helper function
to tighten the code and make it more readable.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Thomas Falcon [Tue, 1 Mar 2016 16:20:09 +0000 (10:20 -0600)]

ibmvnic: Fix ibmvnic_capability struct

The ibmvnic_capability struct was defined incorrectly. The last two
elements of the struct are in the wrong order. In addition, the number
element should be 64-bit. Byteswapping functions are updated
as well.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Florian Westphal [Tue, 1 Mar 2016 15:15:16 +0000 (16:15 +0100)]

ipv6: re-enable fragment header matching in ipv6_find_hdr

When ipv6_find_hdr is used to find a fragment header
(caller specifies target NEXTHDR_FRAGMENT) we erronously return
-ENOENT for all fragments with nonzero offset.

Before commit 9195bb8e381d, when target was specified, we did not
enter the exthdr walk loop as nexthdr == target so this used to work.

Now we do (so we can skip empty route headers). When we then stumble upon
a frag with nonzero frag_off we must return -ENOENT ("header not found")
only if the caller did not specifically request NEXTHDR_FRAGMENT.

This allows nfables exthdr expression to match ipv6 fragments, e.g. via

nft add rule ip6 filter input frag frag-off gt 0

Fixes: 9195bb8e381d ("ipv6: improve ipv6_find_hdr() to skip empty routing headers")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Bjørn Mork [Tue, 1 Mar 2016 13:31:02 +0000 (14:31 +0100)]

qmi_wwan: add Sierra Wireless EM74xx device ID

The MC74xx and EM74xx modules use different IDs by default, according
to the Lenovo EM7455 driver for Windows.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Parthasarathy Bhuvaragan [Tue, 1 Mar 2016 10:07:09 +0000 (11:07 +0100)]

tipc: Revert "tipc: use existing sk_write_queue for outgoing packet chain"

reverts commit 94153e36e709e ("tipc: use existing sk_write_queue for
outgoing packet chain")

In Commit 94153e36e709e, we assume that we fill & empty the socket's
sk_write_queue within the same lock_sock() session.

This is not true if the link is congested. During congestion, the
socket lock is released while we wait for the congestion to cease.
This implementation causes a nullptr exception, if the user space
program has several threads accessing the same socket descriptor.

Consider two threads of the same program performing the following:
     Thread1                                  Thread2
--------------------                    ----------------------
Enter tipc_sendmsg()                    Enter tipc_sendmsg()
lock_sock()                             lock_sock()
Enter tipc_link_xmit(), ret=ELINKCONG   spin on socket lock..
sk_wait_event()                             :
release_sock()                          grab socket lock
    :                                   Enter tipc_link_xmit(), ret=0
    :                                   release_sock()
Wakeup after congestion
lock_sock()
skb = skb_peek(pktchain);
!! TIPC_SKB_CB(skb)->wakeup_pending = tsk->link_cong;

In this case, the second thread transmits the buffers belonging to
both thread1 and thread2 successfully. When the first thread wakeup
after the congestion it assumes that the pktchain is intact and
operates on the skb's in it, which leads to the following exception:

[2102.439969] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[2102.440074] IP: [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
[2102.440074] PGD 3fa3f067 PUD 3fa6b067 PMD 0
[2102.440074] Oops: 0000 [#1] SMP
[2102.440074] CPU: 2 PID: 244 Comm: sender Not tainted 3.12.28 #1
[2102.440074] RIP: 0010:[<ffffffffa005f330>]  [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
[...]
[2102.440074] Call Trace:
[2102.440074]  [<ffffffff8163f0b9>] ? schedule+0x29/0x70
[2102.440074]  [<ffffffffa006a756>] ? tipc_node_unlock+0x46/0x170 [tipc]
[2102.440074]  [<ffffffffa005f761>] tipc_link_xmit+0x51/0xf0 [tipc]
[2102.440074]  [<ffffffffa006d8ae>] tipc_send_stream+0x11e/0x4f0 [tipc]
[2102.440074]  [<ffffffff8106b150>] ? __wake_up_sync+0x20/0x20
[2102.440074]  [<ffffffffa006dc9c>] tipc_send_packet+0x1c/0x20 [tipc]
[2102.440074]  [<ffffffff81502478>] sock_sendmsg+0xa8/0xd0
[2102.440074]  [<ffffffff81507895>] ? release_sock+0x145/0x170
[2102.440074]  [<ffffffff815030d8>] ___sys_sendmsg+0x3d8/0x3e0
[2102.440074]  [<ffffffff816426ae>] ? _raw_spin_unlock+0xe/0x10
[2102.440074]  [<ffffffff81115c2a>] ? handle_mm_fault+0x6ca/0x9d0
[2102.440074]  [<ffffffff8107dd65>] ? set_next_entity+0x85/0xa0
[2102.440074]  [<ffffffff816426de>] ? _raw_spin_unlock_irq+0xe/0x20
[2102.440074]  [<ffffffff8107463c>] ? finish_task_switch+0x5c/0xc0
[2102.440074]  [<ffffffff8163ea8c>] ? __schedule+0x34c/0x950
[2102.440074]  [<ffffffff81504e12>] __sys_sendmsg+0x42/0x80
[2102.440074]  [<ffffffff81504e62>] SyS_sendmsg+0x12/0x20
[2102.440074]  [<ffffffff8164aed2>] system_call_fastpath+0x16/0x1b

In this commit, we maintain the skb list always in the stack.

Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Thu, 3 Mar 2016 21:27:23 +0000 (16:27 -0500)]

Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2016-03-01

Here's our main set of Bluetooth & 802.15.4 patches for the 4.6 kernel.

- New Bluetooth HCI driver for Intel/AG6xx controllers
- New Broadcom ACPI IDs
- LED trigger support for indicating Bluetooth powered state
- Various fixes in mac802154, 6lowpan and related drivers
- New USB IDs for AR3012 Bluetooth controllers

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

John Fastabend [Mon, 29 Feb 2016 19:26:13 +0000 (11:26 -0800)]

net: relax setup_tc ndo op handle restriction

I added this check in setup_tc to multiple drivers,

if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)

Unfortunately restricting to TC_H_ROOT like this breaks the old
instantiation of mqprio to setup a hardware qdisc. This patch
relaxes the test to only check the type to make it equivalent
to the check before I broke it. With this the old instantiation
continues to work.

A good smoke test is to setup mqprio with,

# tc qdisc add dev eth4 root mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7

Fixes: e4c6734eaab9 ("net: rework ndo tc op to consume additional qdisc handle paramete")
Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
Reported-by: Jake Keller <jacob.e.keller@intel.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sakari Ailus [Thu, 3 Mar 2016 17:20:14 +0000 (14:20 -0300)]

[media] media: Sanitise the reserved fields of the G_TOPOLOGY IOCTL arguments

The argument structs are used in arrays for G_TOPOLOGY IOCTL. The
arguments themselves do not need to be aligned to a power of two, but
aligning them up to the largest basic type alignment (u64) on common ABIs
is a good thing to do.

The patch changes the size of the reserved fields to 5 or 6 u32's and
aligns the size of the struct to 8 bytes so we do no longer depend on the
compiler to perform the alignment.

While at it, add __attribute__ ((packed)) to these structs as well.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 20:54:39 +0000 (12:54 -0800)]

Merge tag 'pci-v4.5-fixes-4' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
"Freescale Layerscape host bridge driver:
    Fix MSG TLP drop setting (Minghuan Lian)

  TI Keystone host bridge driver:
    Fix MSI code that retrieves struct pcie_port pointer (Murali Karicheri)"

* tag 'pci-v4.5-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: layerscape: Fix MSG TLP drop setting
  PCI: keystone: Fix MSI code that retrieves struct pcie_port pointer

commit | commitdiff | tree

Benjamin Poirier [Mon, 29 Feb 2016 23:03:33 +0000 (15:03 -0800)]

mld, igmp: Fix reserved tailroom calculation

The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

The min() in the third line can be expanded into:
if mtu < skb_tailroom - tlen:
reserved_tailroom = skb_tailroom - mtu
else:
reserved_tailroom = tlen

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Greg Kroah-Hartman [Thu, 3 Mar 2016 20:37:21 +0000 (12:37 -0800)]

Merge tag 'usb-serial-4.5-rc7' of git://git./linux/kernel/git/johan/usb-serial into usb-linus

Jonan writes:

USB-serial fixes for v4.5-rc7

Here are some new device ids and a patch removing the mxu11x0 driver,
which turned out not to be needed.

Signed-off-by: Johan Hovold <johan@kernel.org>

commit | commitdiff | tree

Igal Liberman [Sun, 28 Feb 2016 21:59:53 +0000 (23:59 +0200)]

fsl/fman: Initialize fman->dev earlier

Currently, in a case of error, dev_err is using fman->dev
before its initialization and "(NULL device *)" is printed.
This patch fixes this issue.

Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gabriel Fernandez [Mon, 29 Feb 2016 16:18:22 +0000 (17:18 +0100)]

stmmac: Fix 'eth0: No PHY found' regression

This patch manages the case when you have an Ethernet MAC with
a "fixed link", and not connected to a normal MDIO-managed PHY device.

The test of phy_bus_name was not helpful because it was never affected
and replaced by the mdio test node.

Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 19:54:56 +0000 (11:54 -0800)]

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
- ARM/MIPS: Fixes for ioctls when copy_from_user returns nonzero
- x86: Small fix for Skylake TSC scaling
- x86: Improved fix for last week's missed hardware breakpoint bug

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: Update tsc multiplier on change.
  mips/kvm: fix ioctl error handling
  arm/arm64: KVM: Fix ioctl error handling
  KVM: x86: fix root cause for missed hardware breakpoints

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 19:39:11 +0000 (11:39 -0800)]

Merge tag 'gpio-v4.5-3' of git://git./linux/kernel/git/linusw/linux-gpio

Pull late GPIO fix from Linus Walleij:
"Regressions never arrive when you want them to, so here is a late fix
  for the Renesas RCAR GPIO driver.  It only affects that driver on the
  very specific Renesas platforms:

   - Fix a runtime PM suspend/resume bug in the RCAR driver"

* tag 'gpio-v4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: rcar: Add Runtime PM handling for interrupts

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 19:32:13 +0000 (11:32 -0800)]

Merge tag 'iommu-fixes-v4.5-rc6' of git://git./linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:
"One fix for Intel VT-d:

   - Use BUS_NOTIFY_REMOVED_DEVICE notifier to unbind a device from its
     domain _after_ it has been unbound from its driver.  This fixes a
     BUG_ON being triggered in the PCI hotplug path.

  And three for AMD IOMMU:

   - Add a workaround for a hardware issue with ATS in use

   - Fix ATS enable/disable balance when a device is removed

   - Fix a boot warning being triggered when the system has IOMMU
     performance counters and PCI device 00:00.0 is not covered by the
     IOMMU"

* tag 'iommu-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path
  iommu/amd: Detach device from domain before removal
  iommu/amd: Apply workaround for ATS write permission check
  iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 19:07:32 +0000 (11:07 -0800)]

Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull minor virtio/vhost fixes from Michael Tsirkin:
"This fixes two minor bugs: error handling in vhost, and capability
  processing in virtio"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: fix error path in vhost_init_used()
  virtio-pci: read the right virtio_pci_notify_cap field

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 18:56:17 +0000 (10:56 -0800)]

Merge tag 'vfio-v4.5-rc7' of git://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:
"Use -EFAULT for copy_to_user error in ioctl (Michael Tsirkin)"

* tag 'vfio-v4.5-rc7' of git://github.com/awilliam/linux-vfio:
vfio: fix ioctl error handling

commit | commitdiff | tree

Linus Torvalds [Thu, 3 Mar 2016 18:46:18 +0000 (10:46 -0800)]

Merge tag 'fbdev-fixes-4.5' of git://git./linux/kernel/git/tomba/linux

Pull fbdev fix from Tomi Valkeinen:
"Fix hang caused by fbconsole blink timer"

* tag 'fbdev-fixes-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
fbcon: set a default value to blink interval

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 3 Mar 2016 17:52:51 +0000 (14:52 -0300)]

[media] media.h: postpone connectors entities

The representation of external connections got some heated
discussions recently. As we're too close to the merge window,
let's not set those entities into a stone.

Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

commit | commitdiff | tree

Konstantin Khlebnikov [Sun, 31 Jan 2016 13:21:29 +0000 (16:21 +0300)]

ovl: copy new uid/gid into overlayfs runtime inode

Overlayfs must update uid/gid after chown, otherwise functions
like inode_owner_or_capable() will check user against stale uid.
Catched by xfstests generic/087, it chowns file and calls utimes.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>

commit | commitdiff | tree

Konstantin Khlebnikov [Sun, 31 Jan 2016 13:17:53 +0000 (16:17 +0300)]

ovl: ignore lower entries when checking purity of non-directory entries

After rename file dentry still holds reference to lower dentry from
previous location. This doesn't matter for data access because data comes
from upper dentry. But this stale lower dentry taints dentry at new
location and turns it into non-pure upper. Such file leaves visible
whiteout entry after remove in directory which shouldn't have whiteouts at
all.

Overlayfs already tracks pureness of file location in oe->opaque. This
patch just uses that for detecting actual path type.

Comment from Vivek Goyal's patch:

Here are the details of the problem. Do following.

$ mkdir upper lower work merged upper/dir/
$ touch lower/test
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=
work merged
$ mv merged/test merged/dir/
$ rm merged/dir/test
$ ls -l merged/dir/
/usr/bin/ls: cannot access merged/dir/test: No such file or directory
total 0
c????????? ? ? ? ? ? test

Basic problem seems to be that once a file has been unlinked, a whiteout
has been left behind which was not needed and hence it becomes visible.

Whiteout is visible because parent dir is of not type MERGE, hence
od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
passes on iterate handling directly to underlying fs. Underlying fs does
not know/filter whiteouts so it becomes visible to user.

Why did we leave a whiteout to begin with when we should not have.
ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
whiteout if file is pure upper. In this case file is not found to be pure
upper hence whiteout is left.

So why file was not PURE_UPPER in this case? I think because dentry is
still carrying some leftover state which was valid before rename. For
example, od->numlower was set to 1 as it was a lower file. After rename,
this state is not valid anymore as there is no such file in lower.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Viktor Stanchev <me@viktorstanchev.com>
Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>

commit | commitdiff | tree

Rui Wang [Fri, 8 Jan 2016 15:09:59 +0000 (23:09 +0800)]

ovl: fix getcwd() failure after unsuccessful rmdir

ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.

This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>

commit | commitdiff | tree

Konstantin Khlebnikov [Sun, 31 Jan 2016 13:22:16 +0000 (16:22 +0300)]

ovl: fix working on distributed fs as lower layer

This adds missing .d_select_inode into alternative dentry_operations.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 7c03b5d45b8e ("ovl: allow distributed fs as lower layer")
Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Tested-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # 4.2+

commit | commitdiff | tree

Or Gerlitz [Tue, 1 Mar 2016 16:52:23 +0000 (18:52 +0200)]

IB/core: Use GRH when the path hop-limit > 0

According to IBTA spec v1.3 section 12.7.19, QPs should use GRH when
the path returned by the SA has hop-limit > 0. Currently, we do that
only for the > 1 case, fix that.

Fixes: 6d969a471ba1 ('IB/sa: Add ib_init_ah_from_path()')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

commit | commitdiff | tree

Robert Jarzmik [Tue, 16 Feb 2016 21:54:02 +0000 (22:54 +0100)]

dmaengine: pxa_dma: fix cyclic transfers

While testing audio with pxa2xx-ac97, underrun were happening while the
user application was correctly feeding the music. Debug proved that the
cyclic transfer is not cyclic, ie. the last descriptor did not loop on
the first.

Another issue is that the descriptor length was always set to 8192,
because of an trivial operator issue.

This was tested on a pxa27x platform.

Fixes: a57e16cf0333 ("dmaengine: pxa: add pxa dmaengine driver")
Reported-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

commit | commitdiff | tree

Majd Dibbiny [Sun, 14 Feb 2016 16:35:52 +0000 (18:35 +0200)]

IB/{core, mlx5}: Fix input len in vendor part of create_qp/srq

Currently, the inlen field of the vendor's part of the command
doesn't match the command buffer. This happens because the inlen
accommodates ib_uverbs_cmd_hdr which is deducted from the in buffer.
This is problematic since the vendor function could be called either
from the legacy verb (where the input length mismatches the actual
length) or by the extended verb (where the length matches). The vendor
has no idea which function calls it and therefore has no way to know
how the length variable should be treated.

Fixing this by aligning the inlen to the correct length.

All vendor drivers either assumed that inlen >= sizeof(vendor_uhw_cmd)
or just failed wrongly (mlx5) and fixed in this patch.

Fixes: cfb5e088e26a ('IB/mlx5: Add CQE version 1 support to user QPs and SRQs')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

commit | commitdiff | tree

Majd Dibbiny [Sun, 14 Feb 2016 16:35:51 +0000 (18:35 +0200)]

IB/mlx5: Avoid using user-index for SRQs

Normal SRQs, unlike XRC SRQs, don't have user-index, therefore
avoid verifying it and using it.

Fixes: cfb5e088e26a ('IB/mlx5: Add CQE version 1 support to user QPs and SRQs')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

commit | commitdiff | tree

Ravi Bangoria [Wed, 2 Mar 2016 09:55:17 +0000 (15:25 +0530)]

powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event

When destroying a hw_breakpoint event, the kernel oopses as follows:

  Unable to handle kernel paging request for data at address 0x00000c07
  NIP [c0000000000291d0] arch_unregister_hw_breakpoint+0x40/0x60
  LR [c00000000020b6b4] release_bp_slot+0x44/0x80

Call chain:

  hw_breakpoint_event_init()
    bp->destroy = bp_perf_event_destroy;

  do_exit()
    perf_event_exit_task()
      perf_event_exit_task_context()
        WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
        perf_event_exit_event()
          free_event()
            _free_event()
              bp_perf_event_destroy() // event->destroy(event);
                release_bp_slot()
                  arch_unregister_hw_breakpoint()

perf_event_exit_task_context() sets child_ctx->task as TASK_TOMBSTONE
which is (void *)-1. arch_unregister_hw_breakpoint() tries to fetch
'thread' attribute of 'task' resulting in oops.

Peterz points out that the code shouldn't be using bp->ctx anyway, but
fixing that will require a decent amount of rework. So for now to fix
the oops, check if bp->ctx->task has been set to (void *)-1, before
dereferencing it. We don't use TASK_TOMBSTONE, because that would
require exporting it and it's supposed to be an internal detail.

Fixes: 63b6da39bb38 ("perf: Fix perf_event_exit_task() race")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Simon South [Thu, 3 Mar 2016 04:10:44 +0000 (23:10 -0500)]

ALSA: hda - Fix mic issues on Acer Aspire E1-472

This patch applies the microphone-related fix created for the Acer
Aspire E1-572 to the E1-472 as well, as it uses the same Realtek ALC282
CODEC and demonstrates the same issues.

This patch allows an external, headset microphone to be used and limits
the gain on the (quite noisy) internal microphone.

Signed-off-by: Simon South <simon@simonsouth.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

commit | commitdiff | tree

Hans Verkuil [Mon, 29 Feb 2016 08:02:47 +0000 (09:02 +0100)]

[media] media.h: use hex values for range offsets, move connectors base up.

Make the base offset hexadecimal to simplify debugging since the base
addresses are hex too.

The offsets for connectors is also changed to start after the 'reserved'
range 0x10000-0x2ffff.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

commit | commitdiff | tree

Dave Airlie [Thu, 3 Mar 2016 01:37:07 +0000 (11:37 +1000)]

Merge branch 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Fixes for radeon and amdgpu:
- Fix GPUVM flushing on CI and VI
- Misc DPM and Powerplay fixes
- VCE DPM fixes for CZ/ST
- DP hotplug fix

* 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: return from atombios_dp_get_dpcd only when error
  drm/amdgpu/cz: remove commented out call to enable vce pg
  drm/amdgpu/powerplay/cz: enable/disable vce dpm independent of vce pg
  drm/amdgpu/cz: enable/disable vce dpm even if vce pg is disabled
  drm/amdgpu/gfx8: specify which engine to wait before vm flush
  drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well
  drm/amd/powerplay: send event to notify powerplay all modules are initialized.
  drm/amd/powerplay: export AMD_PP_EVENT_COMPLETE_INIT task to amdgpu.
  drm/radeon/pm: update current crtc info after setting the powerstate
  drm/amdgpu/pm: update current crtc info after setting the powerstate

commit | commitdiff | tree

Todd E Brandt [Thu, 3 Mar 2016 00:05:29 +0000 (16:05 -0800)]

PM / sleep / x86: Fix crash on graph trace through x86 suspend

Pause/unpause graph tracing around do_suspend_lowlevel as it has
inconsistent call/return info after it jumps to the wakeup vector.
The graph trace buffer will otherwise become misaligned and
may eventually crash and hang on suspend.

To reproduce the issue and test the fix:
Run a function_graph trace over suspend/resume and set the graph
function to suspend_devices_and_enter. This consistently hangs the
system without this fix.

Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

commit | commitdiff | tree

Arnd Bergmann [Wed, 2 Mar 2016 22:24:33 +0000 (23:24 +0100)]

Merge tag 'renesas-dt-fixes2-for-v4.5' of git://git./linux/kernel/git/horms/renesas into fixes

Merge "Second Round of Renesas ARM Based SoC DT Fixes for v4.5" from Simon Horman:

* remove enable prop from HS-USB device node on porter board

* tag 'renesas-dt-fixes2-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: dts: porter: remove enable prop from HS-USB device node

commit | commitdiff | tree

Eric Engestrom [Mon, 29 Feb 2016 16:40:23 +0000 (16:40 +0000)]

ethernet/atl1c: remove left over dead code

Left over from c24588afc536a35c924d014f13b669b20ccf8553
("atl1c: using fixed TXQ configuration for l2cb and l1c")

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Engestrom [Mon, 29 Feb 2016 16:38:06 +0000 (16:38 +0000)]

net/ipv4: remove left over dead code

8cc785f6f429c2a3fb81745dc142cbd72a462c4a ("net: ipv4: make the ping
/proc code AF-independent") removed the code using it, but renamed this
variable instead of removing it.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Engestrom [Mon, 29 Feb 2016 16:36:38 +0000 (16:36 +0000)]

net/rtnetlink: remove dead code

3b766cd832328fcb87db3507e7b98cf42f21689d ("net/core: Add reading VF
statistics through the PF netdevice") added that variable but it's never
been used.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 2 Mar 2016 19:57:16 +0000 (14:57 -0500)]

Merge branch 'dwc_eth_qos'

Lars Persson says:

====================
dwc_eth_qos: stability fixes and support for CMA

This series has bug fixes for the dwc_eth_qos ethernet driver.

Mainly two stability fixes for problems found by Rabin Vincent:
- Successive starts and stops of the interface would trigger a DMA reset timeout.
- A race condition in the TX DMA handling could trigger a netdev watchdog
timeout.

The memory allocation was improved to support use of the CMA as DMA allocator
backend.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Lars Persson [Mon, 29 Feb 2016 15:22:34 +0000 (16:22 +0100)]

dwc_eth_qos: do phy_start before resetting hardware

This reverts the changed init order from commit 3647bc35bd42
("dwc_eth_qos: Reset hardware before PHY start") and makes another fix
for the race.

It turned out that the reset state machine of the dwceqos hardware
requires PHY clocks to be present in order to complete the reset
cycle.

To plug the race with the phy state machine we defer link speed
setting until the hardware init has finished.

Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Rabin Vincent [Mon, 29 Feb 2016 15:22:33 +0000 (16:22 +0100)]

dwc_eth_qos: use DWCEQOS_MSG_DEFAULT

Since debug is hardcoded to 3, the defaults in the DWCEQOS_MSG_DEFAULT
macro are never used, which does not seem to be the intended behaviour
here. Set debug to -1 like other drivers so that DWCEQOS_MSG_DEFAULT is
actually used by default.

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Rabin Vincent [Mon, 29 Feb 2016 15:22:32 +0000 (16:22 +0100)]

dwc_eth_qos: use GFP_KERNEL in dma_alloc_coherent()

Since we are in non-atomic context here we can pass GFP_KERNEL to
dma_alloc_coherent(). This enables use of the CMA.

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Lars Persson [Mon, 29 Feb 2016 15:22:31 +0000 (16:22 +0100)]

dwc_eth_qos: release descriptors outside netif_tx_lock

To prepare for using the CMA, we can not be in atomic context when
de-allocating DMA buffers.

The tx lock was needed only to protect the hw reset against the xmit
handler. Now we briefly grab the tx lock while stopping the queue to
make sure no thread is inside or will enter the xmit handler.

Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Rabin Vincent [Mon, 29 Feb 2016 15:22:30 +0000 (16:22 +0100)]

dwc_eth_qos: fix race condition in dwceqos_start_xmit

The xmit handler and the tx_reclaim tasklet had a race on the tx_free
variable which could lead to a tx timeout if tx_free was updated after
the tx complete interrupt.

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 2 Mar 2016 19:46:30 +0000 (14:46 -0500)]

Merge branch 'cxgb4-next'

Hariprasad Shenai says:

====================
cxgb4/cxgb4vf: Cleanup and minor fixes

This series sets FBMIN to 64 bytes for Chelsio's T6 series of adapters,
check to replenish fl is revised, some code cleanup in cxgb4vf sge
initialization code and removes dead code.

This patch series has been created against net-next tree and includes
patches on cxgb4 and cxgb4vf driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================

commit | commitdiff | tree

Hariprasad Shenai [Tue, 1 Mar 2016 11:49:36 +0000 (17:19 +0530)]

cxgb4vf: Remove dead functions collect_netdev_[um]c_list_addrs

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hariprasad Shenai [Tue, 1 Mar 2016 11:49:35 +0000 (17:19 +0530)]

cxgb4vf: Remove redundant adapter ready check during probe

Function t4vf_wait_dev_ready() is already called in t4vf_prep_adapter(),
no need to call it again in adap_init0().

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hariprasad Shenai [Tue, 1 Mar 2016 11:49:34 +0000 (17:19 +0530)]

cxgb4vf: Make sge init code more readable

Adds a new function t4vf_fl_pkt_align() and use the same in SGE
initialization code to find out freelist packet alignment

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hariprasad Shenai [Tue, 1 Mar 2016 11:49:33 +0000 (17:19 +0530)]

cxgb4/cxgb4vf: For T6 adapter, set FBMIN to 64 bytes

T4 and T5 hardware will not coalesce Free List PCI-E Fetch Requests if
the Host Driver provides more Free List Pointers than the Fetch Burst
Minimum value. So if we set FBMIN to 64 bytes and the Host Driver
supplies 128 bytes of Free List Pointer data, the hardware will issue two
64-byte PCI-E Fetch Requests rather than a single coallesced 128-byte
Fetch Request. T6 fixes this. So, for T4/T5 we set the FBMIN value to 128

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hariprasad Shenai [Tue, 1 Mar 2016 11:49:32 +0000 (17:19 +0530)]

cxgb4/cxgb4vf: Use fl capacity to check if fl needs to be replenished

Use freelist capacity instead of freelist size while checking, if
freelist needs to be refilled

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 2 Mar 2016 19:42:46 +0000 (14:42 -0500)]

Merge branch 'mlx4-fixes'

Or Gerlitz says:

====================
Mellanox 10/40G mlx4 driver fixes for 4.5-rc6

This series contains two fixes for the SRIOV HW LAG that was
introduced in 4.5-rc1 and one fix that allows to revoke the
administrative MAC that was assigned to VF through the PF.

The VF mac fix needs to go for stable too.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jack Morgenstein [Wed, 2 Mar 2016 15:47:46 +0000 (17:47 +0200)]

net/mlx4_core: Allow resetting VF admin mac to zero

The VF administrative mac addresses (stored in the PF driver) are
initialized to zero when the PF driver starts up.

These addresses may be modified in the PF driver through ndo calls
initiated by iproute2 or libvirt.

While we allow the PF/host to change the VF admin mac address from zero
to a valid unicast mac, we do not allow restoring the VF admin mac to
zero. We currently only allow changing this mac to a different unicast mac.

This leads to problems when libvirt scripts are used to deal with
VF mac addresses, and libvirt attempts to revoke the mac so this
host will not use it anymore.

Fix this by allowing resetting a VF administrative MAC back to zero.

Fixes: 8f7ba3ca12f6 ('net/mlx4: Add set VF mac address support')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reported-by: Moshe Levi <moshele@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Moni Shoua [Wed, 2 Mar 2016 15:47:45 +0000 (17:47 +0200)]

net/mlx4_core: Check the correct limitation on VFs for HA mode

The limit of 63 is only for virtual functions while the actual enforcement
was for VFs plus physical functions, fix that.

Fixes: e57968a10bc1 ('net/mlx4_core: Support the HA mode for SRIOV VFs too')
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jack Morgenstein [Wed, 2 Mar 2016 15:47:44 +0000 (17:47 +0200)]

net/mlx4_core: Fix lockdep warning in handling of mac/vlan tables

In the mac and vlan register/unregister/replace functions, the driver locks
the mac table mutex (or vlan table mutex) on both ports.

We move to use mutex_lock_nested() to prevent warnings, such as the one below.

[ 101.828445] =============================================
[ 101.834820] [ INFO: possible recursive locking detected ]
[ 101.841199] 4.5.0-rc2+ #49 Not tainted
[ 101.850251] ---------------------------------------------
[ 101.856621] modprobe/3054 is trying to acquire lock:
[ 101.862514] (&table->mutex#2){+.+.+.}, at: [<ffffffffa079c10e>] __mlx4_register_mac+0x87e/0xa90 [mlx4_core]
[ 101.874598]
[ 101.874598] but task is already holding lock:
[ 101.881703] (&table->mutex#2){+.+.+.}, at: [<ffffffffa079c0f0>] __mlx4_register_mac+0x860/0xa90 [mlx4_core]
[ 101.893776]
[ 101.893776] other info that might help us debug this:
[ 101.901658] Possible unsafe locking scenario:
[ 101.901658]
[ 101.908859] CPU0
[ 101.911923] ----
[ 101.914985] lock(&table->mutex#2);
[ 101.919595] lock(&table->mutex#2);
[ 101.924199]
[ 101.924199] * DEADLOCK *
[ 101.924199]
[ 101.931643] May be due to missing lock nesting notation

Fixes: 5f61385d2ebc ('net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Suggested-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Wed, 2 Mar 2016 19:37:27 +0000 (14:37 -0500)]

Merge branch 'mlx5-fixes'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 driver fixes

This series has few bug fixes for the mlx5 Ethernet driver.

Eran fixed a locking issue with time-stamping that could cause a
soft-lockup when time-stamping is enabled.

Gal fixed the rx/tx packets/bytes counters returned by the driver to
actually went through the network stack.

Tariq removed a poll CQ optimization which could lead the driver to
stop getting interrupts for some of the rings, and a did also fix to
HW LRO which is currently broken.

He also provided RSS and RX hash fixes for the case of changing the
number of rx rings the RX hash/RSS configuration will be out of sync.

The time stamping fix from Eran is not for -stable as the feature was
only introduced in 4.5 but all of the others are.

Changes fro V0:
- Eran addressed the irqsave/restore comments from "Dave" and fixed them.

This series is generated against net commit 4c0b6eaf373a 'net:
thunderx: Fix for Qset error due to CQ full'
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gal Pressman [Mon, 29 Feb 2016 19:17:15 +0000 (21:17 +0200)]

net/mlx5e: Provide correct packet/bytes statistics

Using the HW VPort counters for traffic (rx/tx packets/bytes)
statistics is wrong. This is because frames dropped due to steering or
out of buffer will be counted as received. To fix that, we move to use
the packet/bytes accounting done by the driver for what the netdev
reports out.

Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support [...]')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gal Pressman [Mon, 29 Feb 2016 19:17:14 +0000 (21:17 +0200)]

net/mlx5e: Add rx/tx bytes software counters

Sum up rx/tx bytes in software as we do for rx/tx packets, to be reported
in upcoming statistics fix.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tariq Toukan [Mon, 29 Feb 2016 19:17:13 +0000 (21:17 +0200)]

net/mlx5e: Correctly handle RSS indirection table when changing number of channels

Upon changing num_channels, reset the RSS indirection table to
match the new value.

Fixes: 2d75b2bc8a8c ('net/mlx5e: Add ethtool RSS configuration options')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tariq Toukan [Mon, 29 Feb 2016 19:17:12 +0000 (21:17 +0200)]

net/mlx5e: Fix ethtool RX hash func configuration change

We should modify TIRs explicitly to apply the new RSS configuration.
The light ndo close/open calls do not "refresh" them.

Fixes: 2d75b2bc8a8c ('net/mlx5e: Add ethtool RSS configuration options')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eran Ben Elisha [Mon, 29 Feb 2016 19:17:11 +0000 (21:17 +0200)]

net/mlx5e: Fix soft lockup when HW Timestamping is enabled

Readers/Writers lock for SW timecounter was acquired without disabling
interrupts on local CPU.

The problematic scenario:
* HW timestamping is enabled
* Timestamp overflow periodic service task is running on local CPU and
  holding write_lock for SW timecounter
* Completion arrives, triggers interrupt for local CPU.
  Interrupt routine calls napi_schedule(), which triggers rx/tx
  skb process.
  An attempt to read SW timecounter using read_lock is done, which is
  already locked by a writer on the same CPU and cause soft lockup.

Add irqsave/irqrestore for when using the readers/writers lock for
writing.

Fixes: ef9814deafd0 ('net/mlx5e: Add HW timestamping (TS) support')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tariq Toukan [Mon, 29 Feb 2016 19:17:10 +0000 (21:17 +0200)]

net/mlx5e: Fix LRO modify

Ethtool LRO enable/disable is broken, as of today we only modify TCP
TIRs in order to apply the requested configuration.

Hardware requires that all TIRs pointing to the same RQ should share the
same LRO configuration. For that all other TIRs' LRO fields must be
modified as well.

Fixes: 5c50368f3831 ('net/mlx5e: Light-weight netdev open/stop')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Unnamed repository; edit this file 'description' to name the repository.

RSS Atom