git.cascardo.info Git - cascardo/linux.git/log

net: ipv6: fallback to full lookup if table lookup is unsuitable

Commit 8c14586fc320 ("net: ipv6: Use passed in table for nexthop
lookups") introduced a regression: insertion of an IPv6 route in a table
not containing the appropriate connected route for the gateway but which
contained a non-connected route (like a default gateway) fails while it
was previously working:

    $ ip link add eth0 type dummy
    $ ip link set up dev eth0
    $ ip addr add 2001:db8::1/64 dev eth0
    $ ip route add ::/0 via 2001:db8::5 dev eth0 table 20
    $ ip route add 2001:db8:cafe::1/128 via 2001:db8::6 dev eth0 table 20
    RTNETLINK answers: No route to host
    $ ip -6 route show table 20
    default via 2001:db8::5 dev eth0  metric 1024  pref medium

After this patch, we get:

    $ ip route add 2001:db8:cafe::1/128 via 2001:db8::6 dev eth0 table 20
    $ ip -6 route show table 20
    2001:db8:cafe::1 via 2001:db8::6 dev eth0  metric 1024  pref medium
    default via 2001:db8::5 dev eth0  metric 1024  pref medium

Fixes: 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups")
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx5-fixes'

Or Gerlitz says:

====================
mlx5 fixes to 4.8-rc6

This series series has a fix from Roi to memory corruption bug in
the bulk flow counters code and two late and hopefully last fixes
from me to the new eswitch offloads code.

Series done over net commit 37dd348 "bna: fix crash in bnad_get_strings()"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: E-Switch, Handle mode change failures

E-switch mode changes involve creating HW tables, potentially allocating
netdevices, etc, and things can fail. Add an attempt to rollback to the
existing mode when changing to the new mode fails. Only if rollback fails,
getting proper SRIOV functionality requires module unload or sriov
disablement/enablement.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code

When enablement of the SRIOV e-switch in certain mode (switchdev or legacy)
fails, we must set the mode to none. Otherwise, we'll run into double free
based crashes when further attempting to deal with the e-switch (such
as when disabling sriov or unloading the driver).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Fix flow counter bulk command out mailbox allocation

The FW command output length should be only the length of struct
mlx5_cmd_fc_bulk out field. Failing to do so will cause the memcpy
call which is invoked later in the driver to write over wrong memory
address and corrupt kernel memory which results in random crashes.

This bug was found using the kernel address sanitizer (kasan).

Fixes: a351a1b03bf1 ('net/mlx5: Introduce bulk reading of flow counters')
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning

gic_raise_softirq() walks the list of cpus using for_each_cpu(), it calls
gic_compute_target_list() which advances the iterator by the number of
CPUs in the cluster.

If gic_compute_target_list() reaches the last CPU it leaves the iterator
pointing at the last CPU. This means the next time round the for_each_cpu()
loop cpumask_next() will be called with an invalid CPU.

This triggers a warning when built with CONFIG_DEBUG_PER_CPU_MAPS:
[    3.077738] GICv3: CPU1: found redistributor 1 region 0:0x000000002f120000
[    3.077943] CPU1: Booted secondary processor [410fd0f0]
[    3.078542] ------------[ cut here ]------------
[    3.078746] WARNING: CPU: 1 PID: 0 at ../include/linux/cpumask.h:121 gic_raise_softirq+0x12c/0x170
[    3.078812] Modules linked in:
[    3.078869]
[    3.078930] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-rc5+ #5188
[    3.078994] Hardware name: Foundation-v8A (DT)
[    3.079059] task: ffff80087a1a0080 task.stack: ffff80087a19c000
[    3.079145] PC is at gic_raise_softirq+0x12c/0x170
[    3.079226] LR is at gic_raise_softirq+0xa4/0x170
[    3.079296] pc : [<ffff0000083ead24>] lr : [<ffff0000083eac9c>] pstate: 200001c9
[    3.081139] Call trace:
[    3.081202] Exception stack(0xffff80087a19fbe0 to 0xffff80087a19fd10)

[    3.082269] [<ffff0000083ead24>] gic_raise_softirq+0x12c/0x170
[    3.082354] [<ffff00000808e614>] smp_send_reschedule+0x34/0x40
[    3.082433] [<ffff0000080e80a0>] resched_curr+0x50/0x88
[    3.082512] [<ffff0000080e89d0>] check_preempt_curr+0x60/0xd0
[    3.082593] [<ffff0000080e8a60>] ttwu_do_wakeup+0x20/0xe8
[    3.082672] [<ffff0000080e8bb8>] ttwu_do_activate+0x90/0xc0
[    3.082753] [<ffff0000080ea9a4>] try_to_wake_up+0x224/0x370
[    3.082836] [<ffff0000080eabc8>] default_wake_function+0x10/0x18
[    3.082920] [<ffff000008103134>] __wake_up_common+0x5c/0xa0
[    3.083003] [<ffff0000081031f4>] __wake_up_locked+0x14/0x20
[    3.083086] [<ffff000008103f80>] complete+0x40/0x60
[    3.083168] [<ffff00000808df7c>] secondary_start_kernel+0x15c/0x1d0
[    3.083240] [<00000000808911a4>] 0x808911a4
[    3.113401] Detected PIPT I-cache on CPU2

Avoid updating the iterator if the next call to cpumask_next() would
cause the for_each_cpu() loop to exit.

There is no change to gic_raise_softirq()'s behaviour, (cpumask_next()s
eventual call to _find_next_bit() will return early as start >= nbits),
this patch just silences the warning.

Fixes: 021f653791ad ("irqchip: gic-v3: Initial support for GICv3")
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1474306155-3303-1-git-send-email-james.morse@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
"20 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  rapidio/rio_cm: avoid GFP_KERNEL in atomic context
  Revert "ocfs2: bump up o2cb network protocol version"
  ocfs2: fix start offset to ocfs2_zero_range_for_truncate()
  cgroup: duplicate cgroup reference when cloning sockets
  mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting
  ocfs2: fix double unlock in case retry after free truncate log
  fanotify: fix list corruption in fanotify_get_response()
  fsnotify: add a way to stop queueing events on group shutdown
  ocfs2: fix trans extend while free cached blocks
  ocfs2: fix trans extend while flush truncate log
  ipc/shm: fix crash if CONFIG_SHMEM is not set
  mm: fix the page_swap_info() BUG_ON check
  autofs: use dentry flags to block walks during expire
  MAINTAINERS: update email for VLYNQ bus entry
  mm: avoid endless recursion in dump_page()
  mm, thp: fix leaking mapped pte in __collapse_huge_page_swapin()
  khugepaged: fix use-after-free in collapse_huge_page()
  MAINTAINERS: Maik has moved
  ocfs2/dlm: fix race between convert and migration
  mem-hotplug: don't clear the only node in new_node_page()

rapidio/rio_cm: avoid GFP_KERNEL in atomic context

As reported by Alexey Khoroshilov (https://lkml.org/lkml/2016/9/9/737):
riocm_send_close() is called from rio_cm_shutdown() under
spin_lock_bh(idr_lock), but riocm_send_close() uses a GFP_KERNEL
allocation.

Fix by taking riocm_send_close() outside of spinlock protected code.

[akpm@linux-foundation.org: remove unneeded `if (!list_empty())']
Link: http://lkml.kernel.org/r/20160915175402.10122-1-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "ocfs2: bump up o2cb network protocol version"

This reverts commit 38b52efd218b ("ocfs2: bump up o2cb network protocol
version").

This commit made rolling upgrade fail. When one node is upgraded to new
version with this commit, the remaining nodes will fail to establish
connections to it, then the application like VMs on the remaining nodes
can't be live migrated to the upgraded one. This will cause an outage.
Since negotiate hb timeout behavior didn't change without this commit,
so revert it.

Fixes: 38b52efd218bf ("ocfs2: bump up o2cb network protocol version")
Link: http://lkml.kernel.org/r/1471396924-10375-1-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix start offset to ocfs2_zero_range_for_truncate()

If we punch a hole on a reflink such that following conditions are met:

1. start offset is on a cluster boundary
2. end offset is not on a cluster boundary
3. (end offset is somewhere in another extent) or
   (hole range > MAX_CONTIG_BYTES(1MB)),

we dont COW the first cluster starting at the start offset.  But in this
case, we were wrongly passing this cluster to
ocfs2_zero_range_for_truncate() to zero out.  This will modify the
cluster in place and zero it in the source too.

Fix this by skipping this cluster in such a scenario.

To reproduce:

1. Create a random file of say 10 MB
     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
2. Reflink  it
     reflink -f 10MBfile reflnktest
3. Punch a hole at starting at cluster boundary  with range greater that
1MB. You can also use a range that will put the end offset in another
extent.
     fallocate -p -o 0 -l 1048615 reflnktest
4. sync
5. Check the  first cluster in the source file. (It will be zeroed out).
    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C

Link: http://lkml.kernel.org/r/1470957147-14185-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reported-by: Saar Maoz <saar.maoz@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Eric Ren <zren@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cgroup: duplicate cgroup reference when cloning sockets

When a socket is cloned, the associated sock_cgroup_data is duplicated
but not its reference on the cgroup. As a result, the cgroup reference
count will underflow when both sockets are destroyed later on.

Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
Link: http://lkml.kernel.org/r/20160914194846.11153-2-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting

During cgroup2 rollout into production, we started encountering css
refcount underflows and css access crashes in the memory controller.
Splitting the heavily shared css reference counter into logical users
narrowed the imbalance down to the cgroup2 socket memory accounting.

The problem turns out to be the per-cpu charge cache.  Cgroup1 had a
separate socket counter, but the new cgroup2 socket accounting goes
through the common charge path that uses a shared per-cpu cache for all
memory that is being tracked.  Those caches are safe against scheduling
preemption, but not against interrupts - such as the newly added packet
receive path.  When cache draining is interrupted by network RX taking
pages out of the cache, the resuming drain operation will put references
of in-use pages, thus causing the imbalance.

Disable IRQs during all per-cpu charge cache operations.

Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified hierarchy memory controller")
Link: http://lkml.kernel.org/r/20160914194846.11153-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix double unlock in case retry after free truncate log

If ocfs2_reserve_cluster_bitmap_bits() fails with ENOSPC, it will try to
free truncate log and then retry. Since ocfs2_try_to_free_truncate_log
will lock/unlock global bitmap inode, we have to unlock it before
calling this function. But when retry reserve and it fails with no
global bitmap inode lock taken, it will unlock again in error handling
branch and BUG.

This issue also exists if no need retry and then ocfs2_inode_lock fails.
So fix it.

Fixes: 2070ad1aebff ("ocfs2: retry on ENOSPC if sufficient space in truncate log")
Link: http://lkml.kernel.org/r/57D91939.6030809@huawei.com
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fanotify: fix list corruption in fanotify_get_response()

fanotify_get_response() calls fsnotify_remove_event() when it finds that
group is being released from fanotify_release() (bypass_perm is set).

However the event it removes need not be only in the group's notification
queue but it can have already moved to access_list (userspace read the
event before closing the fanotify instance fd) which is protected by a
different lock. Thus when fsnotify_remove_event() races with
fanotify_release() operating on access_list, the list can get corrupted.

Fix the problem by moving all the logic removing permission events from
the lists to one place - fanotify_release().

Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events")
Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fsnotify: add a way to stop queueing events on group shutdown

Implement a function that can be called when a group is being shutdown
to stop queueing new events to the group. Fanotify will use this.

Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events")
Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix trans extend while free cached blocks

The root cause of this issue is the same with the one fixed by the last
patch, but this time credits for allocator inode and group descriptor
may not be consumed before trans extend.

The following error was caught:

  WARNING: CPU: 0 PID: 2037 at fs/jbd2/transaction.c:269 start_this_handle+0x4c3/0x510 [jbd2]()
  Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_netfront parport_pc parport pcspkr i2c_piix4 i2c_core acpi_cpufreq ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
  CPU: 0 PID: 2037 Comm: rm Tainted: G        W       4.1.12-37.6.3.el6uek.bug24573128v2.x86_64 #2
  Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
  Call Trace:
    dump_stack+0x48/0x5c
    warn_slowpath_common+0x95/0xe0
    warn_slowpath_null+0x1a/0x20
    start_this_handle+0x4c3/0x510 [jbd2]
    jbd2__journal_restart+0x161/0x1b0 [jbd2]
    jbd2_journal_restart+0x13/0x20 [jbd2]
    ocfs2_extend_trans+0x74/0x220 [ocfs2]
    ocfs2_free_cached_blocks+0x16b/0x4e0 [ocfs2]
    ocfs2_run_deallocs+0x70/0x270 [ocfs2]
    ocfs2_commit_truncate+0x474/0x6f0 [ocfs2]
    ocfs2_truncate_for_delete+0xbd/0x380 [ocfs2]
    ocfs2_wipe_inode+0x136/0x6a0 [ocfs2]
    ocfs2_delete_inode+0x2a2/0x3e0 [ocfs2]
    ocfs2_evict_inode+0x28/0x60 [ocfs2]
    evict+0xab/0x1a0
    iput_final+0xf6/0x190
    iput+0xc8/0xe0
    do_unlinkat+0x1b7/0x310
    SyS_unlinkat+0x22/0x40
    system_call_fastpath+0x12/0x71
  ---[ end trace a62437cb060baa71 ]---
  JBD2: rm wants too many credits (149 > 128)

Link: http://lkml.kernel.org/r/1473674623-11810-2-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix trans extend while flush truncate log

Every time, ocfs2_extend_trans() included a credit for truncate log
inode, but as that inode had been managed by jbd2 running transaction
first time, it will not consume that credit until
jbd2_journal_restart().

Since total credits to extend always included the un-consumed ones,
there will be more and more un-consumed credit, at last
jbd2_journal_restart() will fail due to credit number over the half of
max transction credit.

The following error was caught when unlinking a large file with many
extents:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 13626 at fs/jbd2/transaction.c:269 start_this_handle+0x4c3/0x510 [jbd2]()
  Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport pcspkr i2c_piix4 i2c_core acpi_cpufreq ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
  CPU: 0 PID: 13626 Comm: unlink Tainted: G        W       4.1.12-37.6.3.el6uek.x86_64 #2
  Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
  Call Trace:
    dump_stack+0x48/0x5c
    warn_slowpath_common+0x95/0xe0
    warn_slowpath_null+0x1a/0x20
    start_this_handle+0x4c3/0x510 [jbd2]
    jbd2__journal_restart+0x161/0x1b0 [jbd2]
    jbd2_journal_restart+0x13/0x20 [jbd2]
    ocfs2_extend_trans+0x74/0x220 [ocfs2]
    ocfs2_replay_truncate_records+0x93/0x360 [ocfs2]
    __ocfs2_flush_truncate_log+0x13e/0x3a0 [ocfs2]
    ocfs2_remove_btree_range+0x458/0x7f0 [ocfs2]
    ocfs2_commit_truncate+0x1b3/0x6f0 [ocfs2]
    ocfs2_truncate_for_delete+0xbd/0x380 [ocfs2]
    ocfs2_wipe_inode+0x136/0x6a0 [ocfs2]
    ocfs2_delete_inode+0x2a2/0x3e0 [ocfs2]
    ocfs2_evict_inode+0x28/0x60 [ocfs2]
    evict+0xab/0x1a0
    iput_final+0xf6/0x190
    iput+0xc8/0xe0
    do_unlinkat+0x1b7/0x310
    SyS_unlink+0x16/0x20
    system_call_fastpath+0x12/0x71
  ---[ end trace 28aa7410e69369cf ]---
  JBD2: unlink wants too many credits (251 > 128)

Link: http://lkml.kernel.org/r/1473674623-11810-1-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipc/shm: fix crash if CONFIG_SHMEM is not set

Commit c01d5b300774 ("shmem: get_unmapped_area align huge page") makes
use of shm_get_unmapped_area() in shm_file_operations() unconditional to
CONFIG_MMU.

As Tony Battersby pointed this can lead NULL-pointer dereference on
machine with CONFIG_MMU=y and CONFIG_SHMEM=n. In this case ipc/shm is
backed by ramfs which doesn't provide f_op->get_unmapped_area for
configurations with MMU.

The solution is to provide dummy f_op->get_unmapped_area for ramfs when
CONFIG_MMU=y, which just call current->mm->get_unmapped_area().

Fixes: c01d5b300774 ("shmem: get_unmapped_area align huge page")
Link: http://lkml.kernel.org/r/20160912102704.140442-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Tony Battersby <tonyb@cybernetics.com>
Tested-by: Tony Battersby <tonyb@cybernetics.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org> [4.7.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: fix the page_swap_info() BUG_ON check

Commit 62c230bc1790 ("mm: add support for a filesystem to activate
swap files and use direct_IO for writing swap pages") replaced the
swap_aops dirty hook from __set_page_dirty_no_writeback() with
swap_set_page_dirty().

For normal cases without these special SWP flags code path falls back to
__set_page_dirty_no_writeback() so the behaviour is expected to be the
same as before.

But swap_set_page_dirty() makes use of the page_swap_info() helper to
get the swap_info_struct to check for the flags like SWP_FILE,
SWP_BLKDEV etc as desired for those features. This helper has
BUG_ON(!PageSwapCache(page)) which is racy and safe only for the
set_page_dirty_lock() path.

For the set_page_dirty() path which is often needed for cases to be
called from irq context, kswapd() can toggle the flag behind the back
while the call is getting executed when system is low on memory and
heavy swapping is ongoing.

This ends up with undesired kernel panic.

This patch just moves the check outside the helper to its users
appropriately to fix kernel panic for the described path. Couple of
users of helpers already take care of SwapCache condition so I skipped
them.

Link: http://lkml.kernel.org/r/1473460718-31013-1-git-send-email-santosh.shilimkar@oracle.com
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joe Perches <joe@perches.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jens Axboe <axboe@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org> [4.7.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

autofs: use dentry flags to block walks during expire

Somewhere along the way the autofs expire operation has changed to hold
a spin lock over expired dentry selection. The autofs indirect mount
expired dentry selection is complicated and quite lengthy so it isn't
appropriate to hold a spin lock over the operation.

Commit 47be61845c77 ("fs/dcache.c: avoid soft-lockup in dput()") added a
might_sleep() to dput() causing a WARN_ONCE() about this usage to be
issued.

But the spin lock doesn't need to be held over this check, the autofs
dentry info. flags are enough to block walks into dentrys during the
expire.

I've left the direct mount expire as it is (for now) because it is much
simpler and quicker than the indirect mount expire and adding spin lock
release and re-aquires would do nothing more than add overhead.

Fixes: 47be61845c77 ("fs/dcache.c: avoid soft-lockup in dput()")
Link: http://lkml.kernel.org/r/20160912014017.1773.73060.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: update email for VLYNQ bus entry

Link: http://lkml.kernel.org/r/1473218738-21836-1-git-send-email-f.fainelli@gmail.com
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: avoid endless recursion in dump_page()

dump_page() uses page_mapcount() to get mapcount of the page.
page_mapcount() has VM_BUG_ON_PAGE(PageSlab(page)) as mapcount doesn't
make sense for slab pages and the field in struct page used for other
information.

It leads to recursion if dump_page() called for slub page and DEBUG_VM
is enabled:

dump_page() -> page_mapcount() -> VM_BUG_ON_PAGE() -> dump_page -> ...

Let's avoid calling page_mapcount() for slab pages in dump_page().

Link: http://lkml.kernel.org/r/20160908082137.131076-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, thp: fix leaking mapped pte in __collapse_huge_page_swapin()

Currently, khugepaged does not permit swapin if there are enough young
pages in a THP. The problem is when a THP does not have enough young
pages, khugepaged leaks mapped ptes.

This patch prohibits leaking mapped ptes.

Link: http://lkml.kernel.org/r/1472820276-7831-1-git-send-email-ebru.akagunduz@gmail.com
Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

khugepaged: fix use-after-free in collapse_huge_page()

hugepage_vma_revalidate() tries to re-check if we still should try to
collapse small pages into huge one after the re-acquiring mmap_sem.

The problem Dmitry Vyukov reported[1] is that the vma found by
hugepage_vma_revalidate() can be suitable for huge pages, but not the
same vma we had before dropping mmap_sem. And dereferencing original
vma can lead to fun results..

Let's use vma hugepage_vma_revalidate() found instead of assuming it's the
same as what we had before the lock was dropped.

[1] http://lkml.kernel.org/r/CACT4Y+Z3gigBvhca9kRJFcjX0G70V_nRhbwKBU+yGoESBDKi9Q@mail.gmail.com

Link: http://lkml.kernel.org/r/20160907122559.GA6542@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: Maik has moved

Maik is no longer using the plusserver.de email, update with his
current email.

Link: http://lkml.kernel.org/r/1473007794-27960-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Cc: Maik Broemme <mbroemme@libmpq.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2/dlm: fix race between convert and migration

Commit ac7cf246dfdb ("ocfs2/dlm: fix race between convert and recovery")
checks if lockres master has changed to identify whether new master has
finished recovery or not. This will introduce a race that right after
old master does umount ( means master will change), a new convert
request comes.

In this case, it will reset lockres state to DLM_RECOVERING and then
retry convert, and then fail with lockres->l_action being set to
OCFS2_AST_INVALID, which will cause inconsistent lock level between
ocfs2 and dlm, and then finally BUG.

Since dlm recovery will clear lock->convert_pending in
dlm_move_lockres_to_recovery_list, we can use it to correctly identify
the race case between convert and recovery. So fix it.

Fixes: ac7cf246dfdb ("ocfs2/dlm: fix race between convert and recovery")
Link: http://lkml.kernel.org/r/57CE1569.8010704@huawei.com
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mem-hotplug: don't clear the only node in new_node_page()

Commit 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest
neighbor node when mem-offline") introduced new_node_page() for memory
hotplug.

In new_node_page(), the nid is cleared before calling
__alloc_pages_nodemask(). But if it is the only node of the system, and
the first round allocation fails, it will not be able to get memory from
an empty nodemask, and will trigger oom.

The patch checks whether it is the last node on the system, and if it
is, then don't clear the nid in the nodemask.

Fixes: 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline")
Link: http://lkml.kernel.org/r/1473044391.4250.19.camel@TP420
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Reported-by: John Allen <jallen@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/faddr2line: improve on base path filtering a bit

Due to our compiler include directives, the build pathnames for header
files often end up being of the form "$srcdir/./include/linux/xyz.h",
which ends up having that extra "." path component after the build base
in it.

Teach faddr2line to skip that too, to make code generated in inline
functions in header files match the filename for the regular C files.

Rabin Vincent pointed out that I can't make a stricter regexp match by
using the " at " prefix for the pathname, because that ends up being
locale-dependent. But this does require that the path match be preceded
by a space, to make it a bit more strict (that matters mainly if we
didn't find any base_dir at all, and we only end up with the "./" part
of the match)

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rabin Vincent <rabin@rab.in>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

blk-throttle: Extend slice if throttle group is not empty

Right now, if slice is expired, we start a new slice. If a bio is
queued, we keep on extending slice by throtle_slice interval (100ms).

This worked well as long as pending timer function got executed with-in
few milli seconds of scheduled time. But looks like with recent changes
in timer subsystem, slack can be much longer depending on the expiry time
of the scheduled timer.

commit 500462a9de65 ("timers: Switch to a non-cascading wheel")

This means, by the time timer function gets executed, it is possible the
delay from scheduled time is more than 100ms. That means current code
will conclude that existing slice has expired and a new one needs to
be started. New slice will be 100ms by default and that will not be
sufficient to meet rate requirement of group given the bio size and
bio will not be dispatched and we will start a new timer function to
wait. And when that timer expires, same process will repeat and we
will wait again and this can easily be an infinite loop.

Solve this issue by starting a new slice only if throttle gropup is
empty. If it is not empty, that means there should be an active slice
going on. Ideally it should not be expired but given the slack, it is
possible that it has expired.

Reported-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This fixes a potential weakness in IPsec CBC IV generation, as well as
  a number of issues that arose out of an OOM crash on ARM with CTR-mode
  AES"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: arm64/aes-ctr - fix NULL dereference in tail processing
  crypto: arm/aes-ctr - fix NULL dereference in tail processing
  crypto: skcipher - Fix blkcipher walk OOM crash
  crypto: echainiv - Replace chaining with multiplication

Merge tag 'drm-fixes-for-4.8-rc7' of git://people.freedesktop.org/~airlied/linux

Pull exynos and one stable ABI fix from Dave Airlie:
"One important drm 32/64 ABI fix came in so I'll dequeue what I have,
  the rest is just exynos runtime pm fixes, but the net removal of code
  seems like a win to me.

  I'm going to be sporadic this week due to school holidays, so if
  anything urgent turns up, Daniel will take care of it"

* tag 'drm-fixes-for-4.8-rc7' of git://people.freedesktop.org/~airlied/linux:
  drm: Only use compat ioctl for addfb2 on X86/IA64
  Subject: [PATCH, RESEND] drm: exynos: avoid unused function warning
  drm/exynos: g2d: fix system and runtime pm integration
  drm/exynos: rotator: fix system and runtime pm integration
  drm/exynos: gsc: fix system and runtime pm integration
  drm/exynos: fimc: fix system and runtime pm integration
  exynos-drm: Fix unsupported GEM memory type error message to be clear

scripts: add script for translating stack dump function offsets

addr2line doesn't work with KASLR addresses. Add a basic addr2line
wrapper script which takes the 'func+offset/size' format as input.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MIPS: Select HAVE_REGS_AND_STACK_ACCESS_API

Add lost Kconfig symbol. This should have been part of 40e084a506eb
('MIPS: Add uprobes support.').

Fixes: 40e084a506eb ('MIPS: Add uprobes support.')
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[media] cx23885/saa7134: assign q->dev to the PCI device

Fix a regression caused by commit 2bc46b3ad3c1 ("[media] media/pci:
convert drivers to use the new vb2_queue dev field").

Three places where q->dev should be set were missed, causing
a WARN.

Fixes: 2bc46b3ad3c1 ("[media] media/pci: convert drivers to use the new vb2_queue dev field").

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reported-by: Marton Balint <cus@passwd.hu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>

MIPS: Octeon: Fix platform bus probing

Commit 44a7185c2ae6 ("of/platform: Add common method to populate
default bus") added new arch_initcall of_platform_default_populate_init()
that will override device_initcall octeon_publish_devices(). This broke
many OCTEON boards as important devices are not getting probed anymore
(e.g. on EdgeRouter Lite the USB mass storage/rootfs is missing).

Fix by changing octeon_publish_devices() to arch_initcall.

Fixes: 44a7185c2ae6 ("of/platform: Add common method to populate default bus")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Rob Herring <robh@kernel.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14041/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Octeon: mangle-port: fix build failure with VDSO code

Commit 1685ddbe35cd ("MIPS: Octeon: Changes to support readq()/writeq()
usage.") added bitwise shift operations that assume that unsigned long
is always 64-bits. This broke the build of VDSO code, as it gets compiled
also in "faked" 32-bit mode. Althought the failing inline functions are
never executed in 32-bit mode, they still need to pass the compilation.
Fix by using 64-bit types explicitly.

The patch fixes the following build failure:

  CC      arch/mips/vdso/gettimeofday-o32.o
In file included from los/git/devel/linux/arch/mips/include/asm/io.h:32:0,
                 from los/git/devel/linux/arch/mips/include/asm/page.h:194,
                 from los/git/devel/linux/arch/mips/vdso/vdso.h:26,
                 from los/git/devel/linux/arch/mips/vdso/gettimeofday.c:11:
los/git/devel/linux/arch/mips/include/asm/mach-cavium-octeon/mangle-port.h: In function '__should_swizzle_bits':
los/git/devel/linux/arch/mips/include/asm/mach-cavium-octeon/mangle-port.h:19:40: error: right shift count >= width of type [-Werror=shift-count-overflow]
  unsigned long did = ((unsigned long)a >> 40) & 0xff;
                                        ^~

Fixes: 1685ddbe35cd ("MIPS: Octeon: Changes to support readq()/writeq() usage.")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14039/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)

cpu_has_fpu macro uses smp_processor_id() and is currently executed
with preemption enabled, that triggers the warning at runtime.

It is assumed throughout the kernel that if any CPU has an FPU, then all
CPUs would have an FPU as well, so it is safe to perform the check with
preemption enabled - change the code to use raw_ variant of the check to
avoid the warning.

Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/14125/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

can: flexcan: fix resume function

On a imx6ul-pico board the following error is seen during system suspend:

dpm_run_callback(): platform_pm_resume+0x0/0x54 returns -110
PM: Device 2090000.flexcan failed to resume: error -110

The reason for this suspend error is because when the CAN interface is not
active the clocks are disabled and then flexcan_chip_enable() will
always fail due to a timeout error.

In order to fix this issue, only call flexcan_chip_enable/disable()
when the CAN interface is active.

Based on a patch from Dong Aisheng in the NXP kernel.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

xfrm: Fix memory leak of aead algorithm name

commit 1a6509d99122 ("[IPSEC]: Add support for combined mode algorithms")
introduced aead. The function attach_aead kmemdup()s the algorithm
name during xfrm_state_construct().
However this memory is never freed.
Implementation has since been slightly modified in
commit ee5c23176fcc ("xfrm: Clone states properly on migration")
without resolving this leak.
This patch adds a kfree() call for the aead algorithm name.

Fixes: 1a6509d99122 ("[IPSEC]: Add support for combined mode algorithms")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Acked-by: Rami Rosen <roszenrami@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

mtd: nand: mxc: fix obiwan error in mxc_nand_v[12]_ooblayout_free() functions

commit a894cf6c5a82 ("mtd: nand: mxc: switch to mtd_ooblayout_ops")
introduced a regression accessing the OOB area from the mxc_nand
driver due to an Obiwan error in the mxc_nand_v[12]_ooblayout_free()
functions. They report a bogus oobregion { 64, 7 } which leads to
errors accessing bogus data when reading the oob area.

Prior to the commit the mtd-oobtest module could be run without any
errors. With the offending commit, this test fails with results like:
|Running mtd-oobtest
|
|=================================================
|mtd_oobtest: MTD device: 5
|mtd_oobtest: MTD device size 524288, eraseblock size 131072, page size 2048, count of eraseblocks 4, pages per eraseblock 64, OOB size 64
|mtd_test: scanning for bad eraseblocks
|mtd_test: scanned 4 eraseblocks, 0 are bad
|mtd_oobtest: test 1 of 5
|mtd_oobtest: writing OOBs of whole device
|mtd_oobtest: written up to eraseblock 0
|mtd_oobtest: written 4 eraseblocks
|mtd_oobtest: verifying all eraseblocks
|mtd_oobtest: error @addr[0x0:0x19] 0x9a -> 0x78 diff 0xe2
|mtd_oobtest: error @addr[0x0:0x1a] 0xcc -> 0x0 diff 0xcc
|mtd_oobtest: error @addr[0x0:0x1b] 0xe0 -> 0x85 diff 0x65
|mtd_oobtest: error @addr[0x0:0x1c] 0x60 -> 0x62 diff 0x2
|mtd_oobtest: error @addr[0x0:0x1d] 0x69 -> 0x45 diff 0x2c
|mtd_oobtest: error @addr[0x0:0x1e] 0xcd -> 0xa0 diff 0x6d
|mtd_oobtest: error @addr[0x0:0x1f] 0xf2 -> 0x60 diff 0x92
|mtd_oobtest: error: verify failed at 0x0
[...]

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Fixes: a894cf6c5a82 ("mtd: nand: mxc: switch to mtd_ooblayout_ops")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>

mtd: nand: fix chances to create incomplete ECC data when writing

When mtk_nfc_do_write_page() comparing the sector number,because the
sector number field is at the 12th-bit position of NFI_BYTELEN
register,the masked register should be shifted 12 bits before being
compared.The result of this bug may cause the second subpage has
incomplete ECC parity bytes.

Signed-off-by: RogerCC Lin <rogercc.lin@mediatek.com>
Fixes: 1d6b1e464950 ("mtd: mediatek: driver for MTK Smart Device")
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>

mtd: nand: fix generating over-boundary ECC data when writing

When mtk_ecc_encode() is writing the ECC parity data to the OOB
region,because each register is 4 bytes in length,but the len's unit is
in bytes,the operation in the for loop will cross the ECC's boundary.

Signed-off-by: RogerCC Lin <rogercc.lin@mediatek.com>
Fixes: 1d6b1e464950 ("mtd: mediatek: driver for MTK Smart Device")
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>

genirq: Skip chained interrupt trigger setup if type is IRQ_TYPE_NONE

There is no point in trying to configure the trigger of a chained
interrupt if no trigger information has been configured. At best
this is ignored, and at the worse this confuses the underlying
irqchip (which is likely not to handle such a thing), and
unnecessarily alarms the user.

Only apply the configuration if type is not IRQ_TYPE_NONE.

Fixes: 1e12c4a9393b ("genirq: Correctly configure the trigger on chained interrupts")
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/CAMuHMdVW1eTn20=EtYcJ8hkVwohaSuH_yQXrY2MGBEvZ8fpFOg@mail.gmail.com
Link: http://lkml.kernel.org/r/1474274967-15984-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

drm: Only use compat ioctl for addfb2 on X86/IA64

Similar to struct drm_update_draw, struct drm_mode_fb_cmd2 has an
unaligned 64 bit field (modifier). This get packed differently between
32 bit and 64 bit modes on architectures that can handle unaligned 64
bit access (X86 and IA64). Other architectures pack the structs the
same and don't need the compat wrapper. Use the same condition for
drm_mode_fb_cmd2 as we use for drm_update_draw.

Note that only the modifier will be packed differently between compat
and non-compat versions.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
[seanpaul added not at bottom of commit msg re: modifier]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1473801645-116011-1-git-send-email-hoegsberg@chromium.org
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'exynos-drm-fixes' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes

Just fixup to runtime pm usage and some cleanups.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  Subject: [PATCH, RESEND] drm: exynos: avoid unused function warning
  drm/exynos: g2d: fix system and runtime pm integration
  drm/exynos: rotator: fix system and runtime pm integration
  drm/exynos: gsc: fix system and runtime pm integration
  drm/exynos: fimc: fix system and runtime pm integration
  exynos-drm: Fix unsupported GEM memory type error message to be clear

Merge tag 'mac80211-for-davem-2016-09-16' of git://git./linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Two more fixes:
* reject aggregation sessions for TSID/TID 8-16 that we
can never use anyway and which could confuse drivers
* check return value of skb_linearize()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: fix PWRDWN into the PMT register for global unicast.

MAC devices use the RWKPKTEN and MGKPKTEN bits of the PMT Control/Status
register to generate power management events.
So this patch is to properly set the RWKPKTEN [BIT(2)] inside the
PMT register (needed in case of global unicast).

Reported-by: Aditi SHARMA <aditi-hed.sharma@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Subject: [PATCH, RESEND] drm: exynos: avoid unused function warning

When CONFIG_PM is not set, we get a warning about an unused function:

drivers/gpu/drm/exynos/exynos_drm_gsc.c:1219:12: error: 'gsc_clk_ctrl' defined but not used [-Werror=unused-function]
static int gsc_clk_ctrl(struct gsc_context *ctx, bool enable)
^~~~~~~~~~~~

This removes the two #ifdef checks in this file and instead marks the
functions as __maybe_unused, which is a more reliable way of doing the
same, allowing better build coverage and avoiding the warning above.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

Linux 4.8-rc7

Merge tag 'usb-4.8-rc7' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are two small fixes, and one new device id, for 4.8-rc7

  The fixes solve a build error that was reported in your tree for the
  blackfin arch, and resolve an issue with a number of broken USB
  devices that reported the wrong interval rate.  Included here is also
  a new device id for the usb-serial driver.

  All have been in linux-next with no reported issues"

* tag 'usb-4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: change bInterval default to 10 ms
  usb: musb: Fix tusb6010 compile error on blackfin
  USB: serial: simple: add support for another Infineon flashloader

Merge tag 'fixes-for-linus-v4.8-rc7' of git://git./linux/kernel/git/groeck/linux-staging

Pull uaccess fixes from Guenter Roeck:
"Two patches fixing problems introduced with copy_from_user changes"

* tag 'fixes-for-linus-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
openrisc: fix the fix of copy_from_user()
avr32: fix 'undefined reference to `___copy_from_user'

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"A couple of small fixes to x86 perf drivers:

   - Measure L2 for HW_CACHE* events on AMD

   - Fix the address filter handling in the intel/pt driver

   - Handle the BTS disabling at the proper place"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
  perf/x86/intel/pt: Do validate the size of a kernel address filter
  perf/x86/intel/pt: Fix kernel address filter's offset validation
  perf/x86/intel/pt: Fix an off-by-one in address filter configuration
  perf/x86/intel: Don't disable "intel_bts" around "intel" event batching

Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull SMP build fixlet from Thomas Gleixner:
"Add a missing include in cpuhotplug.h"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Include linux/types.h in linux/cpuhotplug.h

Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"Two patches from Boris which address a potential deadlock in the atmel
  irq chip driver"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/atmel-aic: Fix potential deadlock in ->xlate()
  genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers

openrisc: fix the fix of copy_from_user()

Since commit acb2505d0119 ("openrisc: fix copy_from_user()"),
copy_from_user() returns the number of bytes requested, not the
number of bytes not copied.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: acb2505d0119 ("openrisc: fix copy_from_user()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

avr32: fix 'undefined reference to `___copy_from_user'

avr32 builds fail with:

arch/avr32/kernel/built-in.o: In function `arch_ptrace':
(.text+0x650): undefined reference to `___copy_from_user'
arch/avr32/kernel/built-in.o:(___ksymtab+___copy_from_user+0x0): undefined
reference to `___copy_from_user'
kernel/built-in.o: In function `proc_doulongvec_ms_jiffies_minmax':
(.text+0x5dd8): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `proc_dointvec_minmax_sysadmin':
sysctl.c:(.text+0x6174): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `ptrace_has_cap':
ptrace.c:(.text+0x69c0): undefined reference to `___copy_from_user'
kernel/built-in.o:ptrace.c:(.text+0x6b90): more undefined references to
`___copy_from_user' follow

Fixes: 8630c32275ba ("avr32: fix copy_from_user()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Havard Skinnemoen <hskinnemoen@gmail.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

drm/exynos: g2d: fix system and runtime pm integration

Move code from system sleep pm to runtime pm callbacks to ensure proper
driver state preservation when device is under power domain. Then, use
generic helpers for using runtime pm for system sleep pm.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: rotator: fix system and runtime pm integration

Use generic helpers instead of open-coding usage of runtime pm for system
sleep pm, which was potentially broken for some corner cases.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: gsc: fix system and runtime pm integration

Use generic helpers instead of open-coding usage of runtime pm for system
sleep pm, which was potentially broken for some corner cases.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: fimc: fix system and runtime pm integration

Use generic helpers instead of open-coding usage of runtime pm for system
sleep pm, which was potentially broken for some corner cases.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

exynos-drm: Fix unsupported GEM memory type error message to be clear

Fix unsupported GEM memory type error message to include the memory type
information.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

fix iov_iter_fault_in_readable()

... by turning it into what used to be multipages counterpart

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'mmc-v4.8-rc6' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC fixes from Ulf Hansson:
"MMC host:
   - omap/omap_hsmmc: Initialize dma_slave_config to avoid random data
   - sdhci-st: Handle interconnect clock"

* tag 'mmc-v4.8-rc6' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: omap: Initialize dma_slave_config to avoid random data in it's fields
  mmc: omap_hsmmc: Initialize dma_slave_config to avoid random data
  mmc: sdhci-st: Handle interconnect clock
  dt-bindings: mmc: sdhci-st: Mention the discretionary "icn" clock

Merge tag 'powerpc-4.8-6' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
"Fixes for code merged this cycle:

   - Fix restore of SPRs upon wake up from hypervisor state loss from
     Gautham R  Shenoy
   - Fix the state of root PE from Gavin Shan
   - Detach from PE on releasing PCI device from Gavin Shan
   - Fix size of NUM_CPU_FTR_KEYS on 32-bit
   - Fix missed TCE invalidations that should fallback to OPAL"

* tag 'powerpc-4.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv/pci: Fix missed TCE invalidations that should fallback to OPAL
  powerpc/powernv: Detach from PE on releasing PCI device
  powerpc/powernv: Fix the state of root PE
  powerpc/kernel: Fix size of NUM_CPU_FTR_KEYS on 32-bit
  powerpc/powernv: Fix restore of SPRs upon wake up from hypervisor state loss

bna: fix crash in bnad_get_strings()

Commit 6e7333d "net: add rx_nohandler stat counter" added the new entry
rx_nohandler into struct rtnl_link_stats64. Unfortunately the bna
driver foolishly depends on the structure. It uses part of it for
ethtool statistics and it's not bad but the driver assumes its size
is constant as it defines string for each existing entry. The problem
occurs when the structure is extended because you need to modify bna
driver as well. If not any attempt to retrieve ethtool statistics results
in crash in bnad_get_strings().
The patch changes BNAD_ETHTOOL_STATS_NUM so it counts real number of
strings in the array and also removes rtnl_link_stats64 entries that
are not used in output and are always zero.

Fixes: 6e7333d "net: add rx_nohandler stat counter"
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bna: add missing per queue ethtool stat

Commit ba5ca784 "bna: check for dma mapping errors" added besides other
things a statistic that counts number of DMA buffer mapping failures
per each Rx queue. This counter is not included in ethtool stats output.

Fixes: ba5ca784 "bna: check for dma mapping errors"
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'iwlwifi-for-kalle-2016-09-15' of git://git./linux/kernel/git/iwlwifi/iwlwifi-fixes

* fix to prevent firmware crash when sending off-channel frames

sctp: fix SSN comparision

This function actually operates on u32 yet its paramteres were declared
as u16, causing integer truncation upon calling.

Note in patch context that ADDIP_SERIAL_SIGN_BIT is already 32 bits.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: Free skb on irda_accept error path.

skb is not freed if newsk is NULL. Rework the error path so free_skb is
unconditionally called on function exit.

Fixes: c3ea9fa27413 ("[IrDA] af_irda: IRDA_ASSERT cleanups")
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: avoid sk_forward_alloc overflows

A malicious TCP receiver, sending SACK, can force the sender to split
skbs in write queue and increase its memory usage.

Then, when socket is closed and its write queue purged, we might
overflow sk_forward_alloc (It becomes negative)

sk_mem_reclaim() does nothing in this case, and more than 2GB
are leaked from TCP perspective (tcp_memory_allocated is not changed)

Then warnings trigger from inet_sock_destruct() and
sk_stream_kill_queues() seeing a not zero sk_forward_alloc

All TCP stack can be stuck because TCP is under memory pressure.

A simple fix is to preemptively reclaim from sk_mem_uncharge().

This makes sure a socket wont have more than 2 MB forward allocated,
after burst and idle period.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix overflow in __tcp_retransmit_skb()

If a TCP socket gets a large write queue, an overflow can happen
in a test in __tcp_retransmit_skb() preventing all retransmits.

The flow then stalls and resets after timeouts.

Tested:

sysctl -w net.core.wmem_max=1000000000
netperf -H dest -- -s 1000000000

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xen-netback: fix error handling on netback_probe()

In case of error during netback_probe() (e.g. an entry missing on the
xenstore) netback_remove() is called on the new device, which will set
the device backend state to XenbusStateClosed by calling
set_backend_state(). However, the backend state wasn't initialized by
netback_probe() at this point, which will cause and invalid transaction
and set_backend_state() to BUG().

Initialize the backend state at the beginning of netback_probe() to
XenbusStateInitialising, and create two new valid state transitions on
set_backend_state(), from XenbusStateInitialising to XenbusStateClosed,
and from XenbusStateInitialising to XenbusStateInitWait.

Signed-off-by: Filipe Manco <filipe.manco@neclab.eu>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
"Small set of cifs fixes"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  Move check for prefix path to within cifs_get_root()
  Compare prepaths when comparing superblocks
  Fix memory leaks in cifs_do_mount()

Merge tag 'nfsd-4.8-2' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfix from Bruce Fields:
"Fix a memory corruption bug that I introduced in 4.7"

* tag 'nfsd-4.8-2' of git://linux-nfs.org/~bfields/linux:
svcauth_gss: Revert 64c59a3726f2 ("Remove unnecessary allocation")

Merge tag 'drm-fixes-for-4.8-rc6' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Two sets of i915 fixes, one set of vc4 crasher fixes, and a couple of
  atmel fixes.

  Nothing too out there at this stage, though I think some people are
  holidaying so it's been quiet enough"

* tag 'drm-fixes-for-4.8-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Ignore OpRegion panel type except on select machines
  Revert "drm/i915/psr: Make idle_frames sensible again"
  drm/i915: Restore lost "Initialized i915" welcome message
  drm/vc4: mark vc4_bo_cache_purge() static
  drm/i915: Add GEN7_PCODE_MIN_FREQ_TABLE_GT_RATIO_OUT_OF_RANGE to SNB
  drm/i915: disable 48bit full PPGTT when vGPU is active
  drm/i915: enable vGPU detection for all
  drm/atmel-hlcdc: Make ->reset() implementation static
  drm: atmel-hlcdc: Fix vertical scaling
  drm/vc4: Allow some more signals to be packed with uniform resets.
  drm/i915/dvo: Remove dangling call to drm_encoder_cleanup()

Merge tag 'pm-4.8-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
"More annotations of tracepoints in the runtime PM framework to prevent
  RCU from complaining when that code is invoked from the idle path
  (Paul McKenney)"

* tag 'pm-4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / runtime: Use _rcuidle for runtime suspend tracepoints

Merge tag 'drm-vc4-fixes-2016-09-14' of https://github.com/anholt/linux into drm-fixes

This pull request brings in a fix for crashes in X on VC4.

* tag 'drm-vc4-fixes-2016-09-14' of https://github.com/anholt/linux:
drm/vc4: mark vc4_bo_cache_purge() static
drm/vc4: Allow some more signals to be packed with uniform resets.

Merge tag 'drm-intel-fixes-2016-09-15' of git://anongit.freedesktop.org/drm-intel into drm-fixes

i915 fixes from Jani.

* tag 'drm-intel-fixes-2016-09-15' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Ignore OpRegion panel type except on select machines
  Revert "drm/i915/psr: Make idle_frames sensible again"
  drm/i915: Restore lost "Initialized i915" welcome message

Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
"Round three of 4.8 rc fixes.

  This is likely the last rdma pull request this cycle.  The new rxe
  driver had a few issues (you probably saw the boot bot bug report) and
  they should be addressed now.  There are a couple other fixes here,
  mainly mlx4.  There are still two outstanding issues that need
  resolved but I don't think their fix will make this kernel cycle.

  Summary:

   - Various fixes to rdmavt, ipoib, mlx5, mlx4, rxe"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/rdmavt: Don't vfree a kzalloc'ed memory region
  IB/rxe: Fix kmem_cache leak
  IB/rxe: Fix race condition between requester and completer
  IB/rxe: Fix duplicate atomic request handling
  IB/rxe: Fix kernel panic in udp_setup_tunnel
  IB/mlx5: Set source mac address in FTE
  IB/mlx5: Enable MAD_IFC commands for IB ports only
  IB/mlx4: Diagnostic HW counters are not supported in slave mode
  IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV
  IB/mlx4: Fix code indentation in QP1 MAD flow
  IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
  IB/ipoib: Don't allow MC joins during light MC flush
  IB/rxe: fix GFP_KERNEL in spinlock context

Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
"Here are a couple of bugfixes for v4.8-rc.

  Most of them have actually been around for a while this time but for
  some reason didn't get applied early on.  The shmobile regulator fix
  is the only one that isn't completely obvious.

  Device tree changes:
   - archtimer interrupts must be level triggered (multiple platforms)
   - fix for USB and MMC clocks on STiH410
   - fix split DT repository in case of raspberry-pi 3
   - a new use of skeleton.dtsi on arm64 has crept in after that was
     removed.

  defconfig updates:
   - xilinx vdma has a new Kconfig symbol name
   - keystone requires CONFIG_NOP_USB_XCEIV since v4.8-rc1

  Code fixes:
   - fix regulator quirk on shmobile
   - suspend-to-ram regression on EXYNOS

  Maintainer updates:
   - Javier Martinez Canillas is now a reviewer for Samsung EXYNOS"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: keystone: defconfig: Fix USB configuration
  arm64: dts: Fix broken architected timer interrupt trigger
  ARM: multi_v7_defconfig: update XILINX_VDMA
  ARM64: dts: bcm: Use a symlink to R-Pi dtsi files from arch=arm
  ARM: dts: Remove use of skeleton.dtsi from bcm283x.dtsi
  ARM: dts: STiH407-family: Provide interconnect clock for consumption in ST SDHCI
  ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
  ARM: shmobile: fix regulator quirk for Gen2
  ARM: EXYNOS: Clear OF_POPULATED flag from PMU node in IRQ init callback
  MAINTAINERS: Add myself as reviewer for Samsung Exynos support

Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"Most of this update are fixes primarily discovered from testing on the
  older StrongARM 1110 and PXA systems, as a result of recent interest
  from several people in these platforms:

   - Locomo interrupt handling incorrectly stores the handler data in
     the chip's private data slot: when Locomo is combined with an
     interrupt controller who's chip uses the chip private data, this
     leads to an oops.

   - SA1111 was missing a call to clk_disable() to clean up after a
     failed probe.

   - SA1111 and PCMCIA suspend/resume was broken:

     The PCMCIA "ds" layer was using the legacy bus suspend/resume
     methods, which the core PM code is no longer calling as a result of
     device_pm_check_callbacks() introduced in commit aa8e54b559479
     ("PM / sleep: Go direct_complete if driver has no callbacks").

     SA1111 was broken due to changes to PCMCIA which makes PCMCIA
     suspend itself later than the SA1111 code expects, and resume
     before the SA1111 code has initialised access to the pcmcia
     sub-device.

   - the default SA1111 interrupt mask polarity got messed up when it
     was converted to use a dynamic interrupt base number for its
     interrupts.

   - fix platform_get_irq() error code propagation, which was causing
     problems on platforms where the interrupt may not be available at
      probe time in DT setups.

   - fix the lack of clock to PCMCIA code on PXA platforms, which was
     omitted in conversions of PXA to CCF.

   - fix an oops in the PXA PCMCIA code caused by a previous commit not
     realising that Lubbock is different from the rest of the PXA PCMCIA
     drivers.

   - ensure that SA1111 low-level PCMCIA drivers propagate their error
     codes to the main probe function, rather than the driver silently
     accepting a failure.

   - fix the sa11xx debugfs reporting of timing information, which
     always indicated zero due to the clock being a factor of 1000 out.

   - fix the polarity of the status change signal reported from the
     sockets.

  Lastly, one ARM specific commit from Stefan Agner fixing the LPAE
  cache attributes"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: pxa/lubbock: add pcmcia clock
  ARM: locomo: fix locomo irq handling
  ARM: 8612/1: LPAE: initialize cache policy correctly
  ARM: sa1111: fix missing clk_disable()
  ARM: sa1111: fix pcmcia suspend/resume
  ARM: sa1111: fix pcmcia interrupt mask polarity
  ARM: sa1111: fix error code propagation in sa1111_probe()
  pcmcia: lubbock: fix sockets configuration
  pcmcia: sa1111: fix propagation of lowlevel board init return code
  pcmcia: soc_common: fix SS_STSCHG polarity
  pcmcia: sa11xx_base: add units to the timing information
  pcmcia: sa11xx_base: fix reporting of timing information
  pcmcia: ds: fix suspend/resume

IB/rdmavt: Don't vfree a kzalloc'ed memory region

The userspace memory region 'mr' is allocated with kzalloc in
__rvt_alloc_mr however it is incorrectly being freed with vfree in
__rvt_free_mr. Fix this by using kfree to free it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: Fix kmem_cache leak

Decrement qp reference when handling error path
in completer to prevent kmem_cache leak.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: Fix race condition between requester and completer

rxe_requester() is sending a pkt with rxe_xmit_packet() and
then calls rxe_update() to update the wqe and qp's psn values.
But sometimes the response is received before the requester
had time to update the wqe in which case the completer
acts on errornous wqe values.
This fix updates the wqe and qp before actually sending
the request and rolls back when xmit fails.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: Fix duplicate atomic request handling

When handling ack for atomic opcodes like "fetch&add"
or "cmp&swp", the method send_atomic_ack() saves the ack
before sending it, in case it gets lost and never reach the
requester. In which case the method duplicate_request()
will need to find it using the duplicated request.psn.
But send_atomic_ack() used a wrong psn value and thus
the above ack was never found.
This fix uses the ack.psn to locate the ack in case
its needed.
This fix also copies the ack packet to the skb's control buffer
since duplicate_request() will need it when calling rxe_xmit_packet()

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: Fix kernel panic in udp_setup_tunnel

Disable creation of a UDP socket for ipv6 when
CONFIG_IPV6 is not enabeld. Since udp_sock_create6()
returns 0 when CONFIG_IPV6 is not set

[   46.888632] IP: [<c220705a>] setup_udp_tunnel_sock+0x6/0x4f
[   46.891355] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[   46.893918] Oops: 0002 [#1] PREEMPT
[   46.896014] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-rc4-00001-g8700e3e #1
[   46.900280] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   46.904905] task: cf06c040 ti: cf05e000 task.ti: cf05e000
[   46.907854] EIP: 0060:[<c220705a>] EFLAGS: 00210246 CPU: 0
[   46.911137] EIP is at setup_udp_tunnel_sock+0x6/0x4f
[   46.914070] EAX: 00000044 EBX: 00000001 ECX: cf05fef0 EDX: ca8142e0
[   46.917236] ESI: c2c4505b EDI: cf05fef0 EBP: cf05fed0 ESP: cf05fed0
[   46.919836]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[   46.922046] CR0: 80050033 CR2: 000001fc CR3: 02cec000 CR4: 000006b0
[   46.924550] Stack:
[   46.926014]  cf05ff10 c1fd4657 ca8142e0 0000000a 00000000 00000000 0000b712 00000008
[   46.931274]  00000000 6bb5bd01 c1fd48de 00000000 00000000 cf05ff1c 00000000 00000000
[   46.936122]  cf05ff1c c1fd4bdf 00000000 cf05ff28 c2c4507b ffffffff cf05ff88 c2bf1c74
[   46.942350] Call Trace:
[   46.944403]  [<c1fd4657>] rxe_setup_udp_tunnel+0x8f/0x99
[   46.947689]  [<c1fd48de>] ? net_to_rxe+0x4e/0x4e
[   46.950567]  [<c1fd4bdf>] rxe_net_init+0xe/0xa4
[   46.953147]  [<c2c4507b>] rxe_module_init+0x20/0x4c
[   46.955448]  [<c2bf1c74>] do_one_initcall+0x89/0x113
[   46.957797]  [<c2bf15eb>] ? set_debug_rodata+0xf/0xf
[   46.959966]  [<c2bf1dbc>] ? kernel_init_freeable+0xbe/0x15b
[   46.962262]  [<c2bf1ddc>] kernel_init_freeable+0xde/0x15b
[   46.964418]  [<c232eb54>] kernel_init+0x8/0xd0
[   46.966618]  [<c2333122>] ret_from_kernel_thread+0xe/0x24
[   46.969592]  [<c232eb4c>] ? rest_init+0x6f/0x6f

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx5: Set source mac address in FTE

Set the source mac address in the FTE when L2 specification
is provided.

Fixes: 038d2ef87572 ('IB/mlx5: Add flow steering support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx5: Enable MAD_IFC commands for IB ports only

MAD_IFC command is supported only for physical functions (PF)
and when physical port is IB. The proposed fix enforces it.

Fixes: d603c809ef91 ("IB/mlx5: Fix decision on using MAD_IFC")
Reported-by: David Chang <dchang@suse.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Diagnostic HW counters are not supported in slave mode

Modify the mlx4_ib_diag_counters() to avoid the following error in the
hypervisor when the slave tries to query the hardware counters in SR-IOV
mode.

mlx4_core 0000:81:00.0: Unknown command:0x30 accepted from slave:1

Fixes: 3f85f2aaabf7 ("IB/mlx4: Add diagnostic hardware counters")
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV

When sending QP1 MAD packets which use a GRH, the source GID
(which consists of the 64-bit subnet prefix, and the 64 bit port GUID)
must be included in the packet GRH.

For SR-IOV, a GID cache is used, since the source GID needs to be the
slave's source GID, and not the Hypervisor's GID. This cache also
included a subnet_prefix. Unfortunately, the subnet_prefix field in
the cache was never initialized (to the default subnet prefix 0xfe80::0).
As a result, this field remained all zeroes. Therefore, when SR-IOV
was active, all QP1 packets which included a GRH had a source GID
subnet prefix of all-zeroes.

However, the subnet-prefix should initially be 0xfe80::0 (the default
subnet prefix). In addition, if OpenSM modifies a port's subnet prefix,
the new subnet prefix must be used in the GRH when sending QP1 packets.
To fix this we now initialize the subnet prefix in the SR-IOV GID cache
to the default subnet prefix. We update the cached value if/when OpenSM
modifies the port's subnet prefix. We take this cached value when sending
QP1 packets when SR-IOV is active.

Note that the value is stored as an atomic64. This eliminates any need
for locking when the subnet prefix is being updated.

Note also that we depend on the FW generating the "port management change"
event for tracking subnet-prefix changes performed by OpenSM. If running
early FW (before 2.9.4630), subnet prefix changes will not be tracked (but
the default subnet prefix still will be stored in the cache; therefore
users who do not modify the subnet prefix will not have a problem).
IF there is a need for such tracking also for early FW, we will add that
capability in a subsequent patch.

Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Fix code indentation in QP1 MAD flow

The indentation in the QP1 GRH flow in procedure build_mlx_header is
really confusing. Fix it, in preparation for a commit which touches
this code.

Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV

Because of an incorrect bit-masking done on the join state bits, when
handling a join request we failed to detect a difference between the
group join state and the request join state when joining as send only
full member (0x8). This caused the MC join request not to be sent.
This issue is relevant only when SRIOV is enabled and SM supports
send only full member.

This fix separates scope bits and join states bits a nibble each.

Fixes: b9c5d6a64358 ('IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/ipoib: Don't allow MC joins during light MC flush

This fix solves a race between light flush and on the fly joins.
Light flush doesn't set the device to down and unset IPOIB_OPER_UP
flag, this means that if while flushing we have a MC join in progress
and the QP was attached to BC MGID we can have a mismatches when
re-attaching a QP to the BC MGID.

The light flush would set the broadcast group to NULL causing an on
the fly join to rejoin and reattach to the BC MCG as well as adding
the BC MGID to the multicast list. The flush process would later on
remove the BC MGID and detach it from the QP. On the next flush
the BC MGID is present in the multicast list but not found when trying
to detach it because of the previous double attach and single detach.

[18332.714265] ------------[ cut here ]------------
[18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
...
[18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[18332.779411]  0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
[18332.784960]  0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
[18332.790547]  ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
[18332.796199] Call Trace:
[18332.798015]  [<ffffffff813fed47>] dump_stack+0x63/0x8c
[18332.801831]  [<ffffffff8109add1>] __warn+0xd1/0xf0
[18332.805403]  [<ffffffff8109aebd>] warn_slowpath_null+0x1d/0x20
[18332.809706]  [<ffffffffa025d90f>] ib_dealloc_pd+0xff/0x120 [ib_core]
[18332.814384]  [<ffffffffa04f3d7c>] ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
[18332.820031]  [<ffffffffa04ed648>] ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
[18332.825220]  [<ffffffffa04e62c8>] ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
[18332.830290]  [<ffffffffa04e656f>] ipoib_uninit+0x2f/0x40 [ib_ipoib]
[18332.834911]  [<ffffffff81772a8a>] rollback_registered_many+0x1aa/0x2c0
[18332.839741]  [<ffffffff81772bd1>] rollback_registered+0x31/0x40
[18332.844091]  [<ffffffff81773b18>] unregister_netdevice_queue+0x48/0x80
[18332.848880]  [<ffffffffa04f489b>] ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
[18332.853848]  [<ffffffffa04df1cd>] delete_child+0x7d/0xf0 [ib_ipoib]
[18332.858474]  [<ffffffff81520c08>] dev_attr_store+0x18/0x30
[18332.862510]  [<ffffffff8127fe4a>] sysfs_kf_write+0x3a/0x50
[18332.866349]  [<ffffffff8127f4e0>] kernfs_fop_write+0x120/0x170
[18332.870471]  [<ffffffff81207198>] __vfs_write+0x28/0xe0
[18332.874152]  [<ffffffff810e09bf>] ? percpu_down_read+0x1f/0x50
[18332.878274]  [<ffffffff81208062>] vfs_write+0xa2/0x1a0
[18332.881896]  [<ffffffff812093a6>] SyS_write+0x46/0xa0
[18332.885632]  [<ffffffff810039b7>] do_syscall_64+0x57/0xb0
[18332.889709]  [<ffffffff81883321>] entry_SYSCALL64_slow_path+0x25/0x25
[18332.894727] ---[ end trace 09ebbe31f831ef17 ]---

Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: fix GFP_KERNEL in spinlock context

There is skb_clone(skb, GFP_KERNEL) in spinlock context
in rxe_rcv_mcast_pkt().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

Merge tag 'usb-serial-4.8-rc7' of git://git./linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for v4.8-rc7

Here's another Infineon flashloader device id.

Signed-off-by: Johan Hovold <johan@kernel.org>

Merge tag 'samsung-fixes-4.8-2' of git://git./linux/kernel/git/krzk/linux into fixes

Pull "ARM: exynos: Fixes for v4.8, secound round" from Krzysztof Kozłowski:

1. A recent change in populating irqchip devices from Device Tree
   broke Suspend to RAM on Exynos boards due to lack of probing of
   PMU (Power Management Unit) driver.  Multiple drivers attach to
   the PMU's DT node: irqchip, clock controller and PMU platform
   driver for handling suspend.  The new irqchip code marked the
   PMU's DT node as OF_POPULATED but we need to attach to this
   node also PMU platform driver.

2. Add Javier as additional reviewer for Exynos patches.

* tag 'samsung-fixes-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: EXYNOS: Clear OF_POPULATED flag from PMU node in IRQ init callback
  MAINTAINERS: Add myself as reviewer for Samsung Exynos support

USB: change bInterval default to 10 ms

Some full-speed mceusb infrared transceivers contain invalid endpoint
descriptors for their interrupt endpoints, with bInterval set to 0.
In the past they have worked out okay with the mceusb driver, because
the driver sets the bInterval field in the descriptor to 1,
overwriting whatever value may have been there before.  However, this
approach was never sanctioned by the USB core, and in fact it does not
work with xHCI controllers, because they use the bInterval value that
was present when the configuration was installed.

Currently usbcore uses 32 ms as the default interval if the value in
the endpoint descriptor is invalid.  It turns out that these IR
transceivers don't work properly unless the interval is set to 10 ms
or below.  To work around this mceusb problem, this patch changes the
endpoint-descriptor parsing routine, making the default interval value
be 10 ms rather than 32 ms.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Wade Berrier <wberrier@gmail.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: Fix tusb6010 compile error on blackfin

We have CONFIG_BLACKFIN ifdef redefining all musb registers in
musb_regs.h and tusb6010.h is never included causing a build
error with blackfin-allmodconfig and COMPILE_TEST.

Let's fix the issue by not building tusb6010 if CONFIG_BLACKFIN
is selected.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2

While the Intel PMU monitors the LLC when perf enables the
HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
L1 instruction cache fetches (0x0080) and instruction cache misses
(0x0081) on the AMD PMU.

This is extremely confusing when monitoring the same workload across
Intel and AMD machines, since parameters like,

$ perf stat -e cache-references,cache-misses

measure completely different things.

Instead, make the AMD PMU measure instruction/data cache and TLB fill
requests to the L2 and instruction/data cache and TLB misses in the L2
when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled,
respectively. That way the events measure unified caches on both
platforms.

Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

configfs: Return -EFBIG from configfs_write_bin_file.

The check for writing more than cb_max_size bytes does not 'goto out' so
it is a no-op which allows users to vmalloc an arbitrary amount.

Fixes: 03607ace807b ("configfs: implement binary attributes")
Cc: stable@kernel.org
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>