cascardo/linux.git
7 years agos390/smp: clean up a condition
Dan Carpenter [Mon, 18 Jul 2016 07:02:56 +0000 (09:02 +0200)]
s390/smp: clean up a condition

I can never remember precedence rules.  Let's add some parenthesis so
this code is more clear.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cio/chp : Remove deprecated create_singlethread_workqueue
Bhaktipriya Shridhar [Sat, 16 Jul 2016 08:48:34 +0000 (14:18 +0530)]
s390/cio/chp : Remove deprecated create_singlethread_workqueue

The workqueue "chp_wq" is involved in performing pending
configure tasks for channel paths.

It has a single work item(&cfg_work) and hence doesn't require
ordering. Also, it is not being used on a memory reclaim path.
Hence, the singlethreaded workqueue has been replaced with the use of
system_wq.

System workqueues have been able to handle high level of concurrency
for a long time now and hence it's not required to have a singlethreaded
workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue
created with create_singlethread_workqueue(), system_wq allows multiple
work items to overlap executions even on the same CPU; however, a
per-cpu workqueue doesn't have any CPU locality or global ordering
guarantee unless the target CPU is explicitly specified and thus the
increase of local concurrency shouldn't make any difference.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/chsc: improve channel path descriptor determination
Sebastian Ott [Tue, 21 Jun 2016 14:26:25 +0000 (16:26 +0200)]
s390/chsc: improve channel path descriptor determination

When we fetch channel path descriptors via chsc we use a suboptimal
struct chsc_scpd and adjust that by casting the response to a generic
chsc_response_struct. Simplify the code by improving struct chsc_scpd.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/chsc: sanitize fmt check for chp_desc determination
Sebastian Ott [Tue, 21 Jun 2016 11:59:08 +0000 (13:59 +0200)]
s390/chsc: sanitize fmt check for chp_desc determination

When fetching channel path descriptors we've only evaluated the rfmt
parameter which could lead us to trigger the chsc even though the
machine doesn't support the specific format or to not trigger the
chsc and report a failure to userspace even though the machine would've
supported it.

Improve these checks and change the parameters of the in-kernel
user to be less confusing.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cio: make fmt1 channel path descriptor optional
Sebastian Ott [Thu, 14 Jul 2016 09:30:37 +0000 (11:30 +0200)]
s390/cio: make fmt1 channel path descriptor optional

Not all machines / hypervisors support the chsc commands to fetch
the fmt1 descriptor. When these commands fail the channel path would
currently not be available to linux.

Since users of these descriptors can already deal with invalid data
make fetching it optional. The only data that is mandatory for us is
the fmt0 channel path descriptor.

Also make the return code for missing facilities in
chsc_get_channel_measurement_chars consistent to other functions.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/chsc: fix ioctl CHSC_INFO_CU command
Sebastian Ott [Tue, 21 Jun 2016 10:33:30 +0000 (12:33 +0200)]
s390/chsc: fix ioctl CHSC_INFO_CU command

Via CHSC_INFO_CU we ought to provide userspace with control unit
configuration data. Due to an erroneous request code we trigger the
wrong chsc command. Fix this copy and paste error.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cio/device_ops: fix kernel doc
Sebastian Ott [Thu, 23 Jun 2016 09:09:26 +0000 (11:09 +0200)]
s390/cio/device_ops: fix kernel doc

Fix an incorrect kernel doc comment in device_ops.c. Also
provide proper function markups while at it.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cio: allow to reset channel measurement block
Sebastian Ott [Tue, 12 Jul 2016 17:57:57 +0000 (19:57 +0200)]
s390/cio: allow to reset channel measurement block

Prior to commit 1bc6664bdfb949bc69a08113801e7d6acbf6bc3f a call to
enable_cmf for a device for which channel measurement was already
enabled resulted in a reset of the measurement data.

What looked like bugs at the time (a 2nd allocation was triggered
but failed, reset was called regardless of previous failures, and
errors have not been reported to userspace) was actually something
at least one userspace tool depended on. Restore that behavior in
a sane way.

Fixes: 1bc6664bdfb ("s390/cio: use device_lock during cmb activation")
Cc: stable@vger.kernel.org #v4.4+
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/console: Make preferred console handling more consistent
Peter Oberparleiter [Thu, 7 Jul 2016 05:52:38 +0000 (07:52 +0200)]
s390/console: Make preferred console handling more consistent

Use the same code structure when determining preferred consoles for
Linux running as KVM guest as with Linux running in LPAR and z/VM
guest:

 - Extend the console_mode variable to cover vt220 and hvc consoles
 - Determine sensible console defaults in conmode_default()
 - Remove KVM-special handling in set_preferred_console()

Ensure that the sclp line mode console is also registered when the
vt220 console was selected to not change existing behavior that
someone might be relying on.

As an externally visible change, KVM guest users can now select
the 3270 or 3215 console devices using the conmode= kernel parameter,
provided that support for the corresponding driver was compiled into
the kernel.

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: fix gmap tlb flush issues
David Hildenbrand [Thu, 7 Jul 2016 08:44:10 +0000 (10:44 +0200)]
s390/mm: fix gmap tlb flush issues

__tlb_flush_asce() should never be used if multiple asce belong to a mm.

As this function changes mm logic determining if local or global tlb
flushes will be neded, we might end up flushing only the gmap asce on all
CPUs and a follow up mm asce flushes will only flush on the local CPU,
although that asce ran on multiple CPUs.

The missing tlb flushes will provoke strange faults in user space and even
low address protections in user space, crashing the kernel.

Fixes: 1b948d6caec4 ("s390/mm,tlb: optimize TLB flushing for zEC12")
Cc: stable@vger.kernel.org # 3.15+
Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: add support for 2GB hugepages
Gerald Schaefer [Mon, 4 Jul 2016 12:47:01 +0000 (14:47 +0200)]
s390/mm: add support for 2GB hugepages

This adds support for 2GB hugetlbfs pages on s390.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: have unique symbol for __switch_to address
Heiko Carstens [Thu, 30 Jun 2016 10:40:25 +0000 (12:40 +0200)]
s390: have unique symbol for __switch_to address

After linking there are several symbols for the same address that the
__switch_to symbol points to. E.g.:

000000000089b9c0 T __kprobes_text_start
000000000089b9c0 T __lock_text_end
000000000089b9c0 T __lock_text_start
000000000089b9c0 T __sched_text_end
000000000089b9c0 T __switch_to

When disassembling with "objdump -d" this results in a missing
__switch_to function. It would be named __kprobes_text_start
instead. To unconfuse objdump add a nop in front of the kprobes text
section. That way __switch_to appears again.

Obviously this solution is sort of a hack, since it also depends on
link order if this works or not. However it is the best I can come up
with for now.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cpuinfo: show maximum thread id
Heiko Carstens [Mon, 27 Jun 2016 11:19:22 +0000 (13:19 +0200)]
s390/cpuinfo: show maximum thread id

Expose the maximum thread id with /proc/cpuinfo.
With the new line the output looks like this:

vendor_id       : IBM/S390
bogomips per cpu: 20325.00
max thread id   : 1

With this new interface it is possible to always tell the correct
number of cpu threads potentially being used by the kernel.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/ptrace: clarify bits in the per_struct
Martin Schwidefsky [Tue, 28 Jun 2016 11:05:04 +0000 (13:05 +0200)]
s390/ptrace: clarify bits in the per_struct

The bits single_step and instruction_fetch lost their meaning
with git commit 5e9a26928f550157 "[S390] ptrace cleanup".
Clarify the comment for these two bits.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: stack address vs thread_info
Heiko Carstens [Mon, 27 Jun 2016 05:57:21 +0000 (07:57 +0200)]
s390: stack address vs thread_info

Avoid using the address of a process' thread_info structure as the
kernel stack address. This will break as soon as the thread_info
structure will be removed from the stack, and in addition it makes the
code a bit more understandable.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: remove pointless load within __switch_to
Heiko Carstens [Fri, 24 Jun 2016 08:02:38 +0000 (10:02 +0200)]
s390: remove pointless load within __switch_to

Remove a leftover from the code that transferred a couple of TIF bits
from the previous task to the next task.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: enable kcov support
Heiko Carstens [Mon, 20 Jun 2016 12:08:32 +0000 (14:08 +0200)]
s390: enable kcov support

Now that hopefully all inline assemblies have been converted to single
basic blocks we can enable kcov on s390.

Note that this patch does not disable as many files on s390 like the
x86 variant does. Right now I didn't see a reason to do that, however
additional files or directories can be excluded at any time.

The runtime overhead seems to be quite high.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cpumf: use basic block for ecctr inline assembly
Heiko Carstens [Mon, 20 Jun 2016 12:06:54 +0000 (14:06 +0200)]
s390/cpumf: use basic block for ecctr inline assembly

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/hypfs: use basic block for diag inline assembly
Heiko Carstens [Mon, 20 Jun 2016 12:06:09 +0000 (14:06 +0200)]
s390/hypfs: use basic block for diag inline assembly

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/sysinfo: use basic block for stsi inline assembly
Heiko Carstens [Mon, 20 Jun 2016 12:05:54 +0000 (14:05 +0200)]
s390/sysinfo: use basic block for stsi inline assembly

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/smp: use basic blocks for sigp inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 12:04:17 +0000 (14:04 +0200)]
s390/smp: use basic blocks for sigp inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cio: use basic blocks for i/o inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 12:03:52 +0000 (14:03 +0200)]
s390/cio: use basic blocks for i/o inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cio: use basic blocks for cmf inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 12:03:38 +0000 (14:03 +0200)]
s390/cio: use basic blocks for cmf inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pci: use basic blocks for pci inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 12:03:25 +0000 (14:03 +0200)]
s390/pci: use basic blocks for pci inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/iucv: use basic blocks for iucv inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 12:03:00 +0000 (14:03 +0200)]
s390/iucv: use basic blocks for iucv inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/crypto: use basic blocks for ap bus inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 12:00:27 +0000 (14:00 +0200)]
s390/crypto: use basic blocks for ap bus inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: use basic block for essa inline assembly
Heiko Carstens [Mon, 20 Jun 2016 11:31:55 +0000 (13:31 +0200)]
s390/mm: use basic block for essa inline assembly

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/lib: use basic blocks for inline assemblies
Heiko Carstens [Mon, 20 Jun 2016 11:07:42 +0000 (13:07 +0200)]
s390/lib: use basic blocks for inline assemblies

Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/uaccess: fix __put_get_user_asm define
Heiko Carstens [Mon, 20 Jun 2016 08:35:20 +0000 (10:35 +0200)]
s390/uaccess: fix __put_get_user_asm define

The __put_get_user_asm defines an inline assmembly which makes use of
the asm register construct. The parameters passed to that define may
also contain function calls.

It is a gcc restriction that between register asm statements and the
use of any such annotated variables function calls may clobber the
register / variable contents. Or in other words: gcc would generate
broken code.

This can be achieved e.g. with the following code:

    get_user(x, func() ? a : b);

where the call of func would clobber register zero which is used by
the __put_get_user_asm define.
To avoid this add two static inline functions which don't have these
side effects.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cpuinfo: rename cpu field to cpu number
Heiko Carstens [Tue, 21 Jun 2016 09:05:05 +0000 (11:05 +0200)]
s390/cpuinfo: rename cpu field to cpu number

The cpu field name within /proc/cpuinfo has a conflict with the
powerpc and sparc output where it contains the cpu model name.  So
rename the field name to cpu number which shouldn't generate any
conflicts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/perf: remove perf_release/reserver_sampling functions
Heiko Carstens [Sat, 11 Jun 2016 09:08:10 +0000 (11:08 +0200)]
s390/perf: remove perf_release/reserver_sampling functions

Now that the oprofile sampling code is gone there is only one user of
the sampling facility left. Therefore the reserve and release
functions can be removed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/oprofile: remove hardware sampler support
Heiko Carstens [Fri, 10 Jun 2016 08:40:06 +0000 (10:40 +0200)]
s390/oprofile: remove hardware sampler support

Remove hardware sampler support from oprofile module.

The oprofile user space utilty has been switched to use the kernel
perf interface, for which we also provide hardware sampling support.

In addition the hardware sampling support is also slightly broken: it
supports only 16 bits for the pid and therefore would generate wrong
results on machines which have a pid >64k.

Also the pt_regs structure which was passed to oprofile common code
cannot necessarily be used to generate sane backtraces, since the
task(s) in question may run while the samples are fed to oprofile.
So the result would be more or less random.

However given that the only user space tools switched to the perf
interface already four years ago the hardware sampler code seems to be
unused code, and therefore it should be reasonable to remove it.

The timer based oprofile support continues to work.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Andreas Arnez <arnez@linux.vnet.ibm.com>
Acked-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Acked-by: Robert Richter <rric@kernel.org>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: remove math emulation code
Heiko Carstens [Wed, 1 Jul 2015 11:26:51 +0000 (13:26 +0200)]
s390: remove math emulation code

The last in-kernel user is gone so we can finally remove this code.

Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: calculate loops_per_jiffies with fp instructions
Heiko Carstens [Wed, 1 Jul 2015 11:19:19 +0000 (13:19 +0200)]
s390: calculate loops_per_jiffies with fp instructions

Implement calculation of loops_per_jiffies with fp instructions which
are available on all 64 bit machines.
To save and restore floating point register context use the new vx support
functions.

Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: Updated kernel config files
Hendrik Brueckner [Wed, 24 Jun 2015 14:10:59 +0000 (16:10 +0200)]
s390: Updated kernel config files

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/crc32-vx: add crypto API module for optimized CRC-32 algorithms
Hendrik Brueckner [Tue, 28 Apr 2015 13:52:44 +0000 (15:52 +0200)]
s390/crc32-vx: add crypto API module for optimized CRC-32 algorithms

Add a crypto API module to access the vector extension based CRC-32
implementations.  Users can request the optimized implementation through
the shash crypto API interface.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/crc32-vx: use vector instructions to optimize CRC-32 computation
Hendrik Brueckner [Tue, 28 Apr 2015 10:29:06 +0000 (12:29 +0200)]
s390/crc32-vx: use vector instructions to optimize CRC-32 computation

Use vector instructions to optimize the computation of CRC-32 checksums.
An optimized version is provided for CRC-32 (IEEE 802.3 Ethernet) in
normal and bitreflected domain, as well as, for bitreflected CRC-32C
(Castagnoli).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/vx: add support functions for in-kernel FPU use
Hendrik Brueckner [Wed, 18 Feb 2015 13:46:00 +0000 (14:46 +0100)]
s390/vx: add support functions for in-kernel FPU use

Introduce the kernel_fpu_begin() and kernel_fpu_end() function
to enclose any in-kernel use of FPU instructions and registers.
In enclosed sections, you can perform floating-point or vector
(SIMD) computations.  The functions take care of saving and
restoring FPU register contents and controls.

For usage details, see the guidelines in arch/s390/include/asm/fpu/api.h

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: fix compile for PAGE_DEFAULT_KEY != 0
Heiko Carstens [Tue, 14 Jun 2016 04:55:43 +0000 (06:55 +0200)]
s390/mm: fix compile for PAGE_DEFAULT_KEY != 0

The usual problem for code that is ifdef'ed out is that it doesn't
compile after a while. That's also the case for the storage key
initialisation code, if it would be used (set PAGE_DEFAULT_KEY to
something not zero):

./arch/s390/include/asm/page.h: In function 'storage_key_init_range':
./arch/s390/include/asm/page.h:36:2: error: implicit declaration of function '__storage_key_init_range'

Since the code itself has been useful for debugging purposes several
times, remove the ifdefs and make sure the code gets compiler
coverage. The cost for this is eight bytes.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/topology: remove z10 special handling
Heiko Carstens [Wed, 25 May 2016 07:53:07 +0000 (09:53 +0200)]
s390/topology: remove z10 special handling

I don't have a z10 to test this anymore, so I have no idea if the code
works at all or even crashes. I can try to emulate, but it is just
guess work.

Nor do we know if the z10 special handling is performance wise still
better than the generic handling. There have been a lot of changes to
the scheduler.

Therefore let's play safe and remove the special handling.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/topology: add drawer scheduling domain level
Heiko Carstens [Wed, 25 May 2016 08:25:50 +0000 (10:25 +0200)]
s390/topology: add drawer scheduling domain level

The z13 machine added a fourth level to the cpu topology
information. The new top level is called drawer.

A drawer contains two books, which used to be the top level.

Adding this additional scheduling domain did show performance
improvements for some workloads of up to 8%, while there don't
seem to be any workloads impacted in a negative way.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agotopology/sysfs: provide drawer id and siblings attributes
Heiko Carstens [Sat, 21 May 2016 09:10:13 +0000 (11:10 +0200)]
topology/sysfs: provide drawer id and siblings attributes

The s390 cpu topology gained another hierarchy level. The top level is
now called drawer and contains several books. A book used to be the
top level.

In order to expose the cpu topology to user space allow to create new
sysfs attributes dependent on CONFIG_SCHED_DRAWER which an
architecture may define and select.

These additional attributes will be available:

/sys/devices/system/cpu/cpuX/topology/drawer_id
/sys/devices/system/cpu/cpuX/topology/drawer_siblings
/sys/devices/system/cpu/cpuX/topology/drawer_siblings_list

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/ipl: rename diagnose enums
Heiko Carstens [Sat, 11 Jun 2016 09:24:27 +0000 (11:24 +0200)]
s390/ipl: rename diagnose enums

Rename DIAG308_IPL and DIAG308_DUMP to DIAG308_LOAD_CLEAR and
DIAG308_LOAD_NORMAL_DUMP to better reflect the associated IPL
functions.

Suggested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/ipl: use load normal for LPAR re-ipl
Heiko Carstens [Fri, 10 Jun 2016 10:36:50 +0000 (12:36 +0200)]
s390/ipl: use load normal for LPAR re-ipl

Avoid clearing memory for CCW-type re-ipl within a logical
partition. This can save a significant amount of time if a logical
partition contains a lot of memory.

On the other hand we still clear memory if running within a second
level hypervisor, since the hypervisor can simply free all memory that
was used for the guest.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: avoid extable collisions
Heiko Carstens [Fri, 10 Jun 2016 07:57:05 +0000 (09:57 +0200)]
s390: avoid extable collisions

We have some inline assemblies where the extable entry points to a
label at the end of an inline assembly which is not followed by an
instruction.

On the other hand we have also inline assemblies where the extable
entry points to the first instruction of an inline assembly.

If a first type inline asm (extable point to empty label at the end)
would be directly followed by a second type inline asm (extable points
to first instruction) then we would have two different extable entries
that point to the same instruction but would have a different target
address.

This can lead to quite random behaviour, depending on sorting order.

I verified that we currently do not have such collisions within the
kernel. However to avoid such subtle bugs add a couple of nop
instructions to those inline assemblies which contain an extable that
points to an empty label.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/uaccess: use __builtin_expect for get_user/put_user
Heiko Carstens [Mon, 13 Jun 2016 08:17:20 +0000 (10:17 +0200)]
s390/uaccess: use __builtin_expect for get_user/put_user

We always expect that get_user and put_user return with zero. Give the
compiler a hint so it can slightly optimize the code and avoid
branches.
This is the same what x86 got with commit a76cf66e948a ("x86/uaccess:
Tell the compiler that uaccess is unlikely to fault").

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/uaccess: fix whitespace damage
Heiko Carstens [Fri, 10 Jun 2016 07:35:51 +0000 (09:35 +0200)]
s390/uaccess: fix whitespace damage

Fix some whitespace damage that was introduced by me with a
query-replace when removing 31 bit support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pci: ensure to not cross a dma segment boundary
Sebastian Ott [Fri, 3 Jun 2016 17:05:38 +0000 (19:05 +0200)]
s390/pci: ensure to not cross a dma segment boundary

When we use the iommu_area_alloc helper to get dma addresses
we specify the boundary_size parameter but not the offset (called
shift in this context).

As long as the offset (start_dma) is a multiple of the boundary
we're ok (on current machines start_dma always seems to be 4GB).

Don't leave this to chance and specify the offset for iommu_area_alloc.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pci: ensure page aligned dma start address
Sebastian Ott [Fri, 3 Jun 2016 17:03:12 +0000 (19:03 +0200)]
s390/pci: ensure page aligned dma start address

We don't have an architectural guarantee on the value of
the dma offset but rely on it to be at least page aligned.
Enforce page alignemt of start_dma.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: use SPARSE_IRQ
Sebastian Ott [Thu, 2 Jun 2016 12:57:17 +0000 (14:57 +0200)]
s390: use SPARSE_IRQ

Use dynamically allocated irq descriptors on s390 which allows
us to get rid of the s390 specific config option PCI_NR_MSI and
exploit more MSI interrupts. Also the size of the kernel image
is reduced by 131K (using performance_defconfig).

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/Documentation: improve sort command for trace buffer
Thomas Richter [Wed, 8 Jun 2016 11:58:22 +0000 (13:58 +0200)]
s390/Documentation: improve sort command for trace buffer

When s390 traces with hex_ascii or sprintf view are
extracted and sorted, use the sort option -s (stable)
to avoid multiple lines with the same time stamp being
sorted using the rest of the line as secondary key.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: use __section macro everywhere
Heiko Carstens [Tue, 7 Jun 2016 08:23:07 +0000 (10:23 +0200)]
s390: use __section macro everywhere

Small cleanup patch to use the shorter __section macro everywhere.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: add proper __ro_after_init support
Heiko Carstens [Tue, 7 Jun 2016 08:12:55 +0000 (10:12 +0200)]
s390: add proper __ro_after_init support

On s390 __ro_after_init is currently mapped to __read_mostly which
means that data marked as __ro_after_init will not be protected.

Reason for this is that the common code __ro_after_init implementation
is x86 centric: the ro_after_init data section was added to rodata,
since x86 enables write protection to kernel text and rodata very
late. On s390 we have write protection for these sections enabled with
the initial page tables. So adding the ro_after_init data section to
rodata does not work on s390.

In order to make __ro_after_init work properly on s390 move the
ro_after_init data, right behind rodata. Unlike the rodata section it
will be marked read-only later after all init calls happened.

This s390 specific implementation adds new __start_ro_after_init and
__end_ro_after_init labels. Everything in between will be marked
read-only after the init calls happened. In addition to the
__ro_after_init data move also the exception table there, since from a
practical point of view it fits the __ro_after_init requirements.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agovmlinux.lds.h: allow arch specific handling of ro_after_init data section
Heiko Carstens [Tue, 7 Jun 2016 10:20:51 +0000 (12:20 +0200)]
vmlinux.lds.h: allow arch specific handling of ro_after_init data section

commit c74ba8b3480d ("arch: Introduce post-init read-only memory")
introduced the __ro_after_init attribute which allows to add variables
to the ro_after_init data section.

This new section was added to rodata, even though it contains writable
data. This in turn causes problems on architectures which mark the
page table entries read-only that point to rodata very early.

This patch allows architectures to implement an own handling of the
.data..ro_after_init section.
Usually that would be:
- mark the rodata section read-only very early
- mark the ro_after_init section read-only within mark_rodata_ro

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: simplify the TLB flushing code
Martin Schwidefsky [Wed, 25 May 2016 07:45:26 +0000 (09:45 +0200)]
s390/mm: simplify the TLB flushing code

ptep_flush_lazy and pmdp_flush_lazy use mm->context.attach_count to
decide between a lazy TLB flush vs an immediate TLB flush. The field
contains two 16-bit counters, the number of CPUs that have the mm
attached and can create TLB entries for it and the number of CPUs in
the middle of a page table update.

The __tlb_flush_asce, ptep_flush_direct and pmdp_flush_direct functions
use the attach counter and a mask check with mm_cpumask(mm) to decide
between a local flush local of the current CPU and a global flush.

For all these functions the decision between lazy vs immediate and
local vs global TLB flush can be based on CPU masks. There are two
masks:  the mm->context.cpu_attach_mask with the CPUs that are actively
using the mm, and the mm_cpumask(mm) with the CPUs that have used the
mm since the last full flush. The decision between lazy vs immediate
flush is based on the mm->context.cpu_attach_mask, to decide between
local vs global flush the mm_cpumask(mm) is used.

With this patch all checks will use the CPU masks, the old counter
mm->context.attach_count with its two 16-bit values is turned into a
single counter mm->context.flush_count that keeps track of the number
of CPUs with incomplete page table updates. The sole user of this
counter is finish_arch_post_lock_switch() which waits for the end of
all page table updates.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agobitmap: bitmap_equal memcmp optimization
Martin Schwidefsky [Wed, 25 May 2016 07:32:20 +0000 (09:32 +0200)]
bitmap: bitmap_equal memcmp optimization

The bitmap_equal function has optimized code for small bitmaps with less
than BITS_PER_LONG bits. For larger bitmaps the out-of-line function
__bitmap_equal is called.

For a constant number of bits divisible by BITS_PER_LONG the memcmp
function can be used. For s390 gcc knows how to optimize this function,
memcmp calls with up to 256 bytes / 2048 bits are translated into a
single instruction.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: fix vunmap vs finish_arch_post_lock_switch
Martin Schwidefsky [Mon, 6 Jun 2016 08:30:45 +0000 (10:30 +0200)]
s390/mm: fix vunmap vs finish_arch_post_lock_switch

The vunmap_pte_range() function calls ptep_get_and_clear() without any
locking. ptep_get_and_clear() uses ptep_xchg_lazy()/ptep_flush_direct()
for the page table update. ptep_flush_direct requires that preemption
is disabled, but without any locking this is not the case. If the kernel
preempts the task while the attach_counter is increased an endless loop
in finish_arch_post_lock_switch() will occur the next time the task is
scheduled.

Add explicit preempt_disable()/preempt_enable() calls to the relevant
functions in arch/s390/mm/pgtable.c.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/time: remove ETR support
Martin Schwidefsky [Tue, 31 May 2016 13:06:51 +0000 (15:06 +0200)]
s390/time: remove ETR support

The External-Time-Reference (ETR) clock synchronization interface has
been superseded by Server-Time-Protocol (STP). Remove the outdated
ETR interface.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/time: add leap seconds to initial system time
Martin Schwidefsky [Tue, 31 May 2016 10:47:03 +0000 (12:47 +0200)]
s390/time: add leap seconds to initial system time

The PTFF instruction can be used to retrieve information about UTC
including the current number of leap seconds. Use this value to
convert the coordinated server time value of the TOD clock to a
proper UTC timestamp to initialize the system time. Without this
correction the system time will be off by the number of leap seonds
until it has been corrected via NTP.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/time: LPAR offset handling
Martin Schwidefsky [Tue, 31 May 2016 08:16:24 +0000 (10:16 +0200)]
s390/time: LPAR offset handling

It is possible to specify a user offset for the TOD clock, e.g. +2 hours.
The TOD clock will carry this offset even if the clock is synchronized
with STP. This makes the time stamps acquired with get_sync_clock()
useless as another LPAR migth use a different TOD offset.

Use the PTFF instrution to get the TOD epoch difference and subtract
it from the TOD clock value to get a physical timestamp. As the epoch
difference contains the sync check delta as well the LPAR offset value
to the physical clock needs to be refreshed after each clock
synchronization.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/time: move PTFF definitions
Martin Schwidefsky [Tue, 31 May 2016 07:23:37 +0000 (09:23 +0200)]
s390/time: move PTFF definitions

The PTFF instruction is not a function of ETR, rename and move the
PTFF definitions from etr.h to timex.h.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/time: STP sync clock correction
Martin Schwidefsky [Tue, 31 May 2016 07:09:57 +0000 (09:09 +0200)]
s390/time: STP sync clock correction

The sync clock operation of the channel subsystem call for STP delivers
the TOD clock difference as a result. Use this TOD clock difference
instead of the difference between the TOD timestamps before and after
the sync clock operation.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/kexec: fix crash on resize of reserved memory
Heiko Carstens [Tue, 31 May 2016 07:14:00 +0000 (09:14 +0200)]
s390/kexec: fix crash on resize of reserved memory

Reducing the size of reserved memory for the crash kernel will result
in an immediate crash on s390. Reason for that is that we do not
create struct pages for memory that is reserved. If that memory is
freed any access to struct pages which correspond to this memory will
result in invalid memory accesses and a kernel panic.

Fix this by properly creating struct pages when the system gets
initialized. Change the code also to make use of set_memory_ro() and
set_memory_rw() so page tables will be split if required.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/kexec: fix update of os_info crash kernel size
Heiko Carstens [Tue, 31 May 2016 07:13:59 +0000 (09:13 +0200)]
s390/kexec: fix update of os_info crash kernel size

Implement an s390 version of the weak crash_free_reserved_phys_range
function. This allows us to update the size of the reserved crash
kernel memory if it will be resized.

This was previously done with a call to crash_unmap_reserved_pages
from crash_shrink_memory which was removed with ("s390/kexec:
consolidate crash_map/unmap_reserved_pages() and
arch_kexec_protect(unprotect)_crashkres()")

Fixes: 7a0058ec7860 ("s390/kexec: consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres()")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: align swapper_pg_dir to 16k
Heiko Carstens [Sat, 28 May 2016 08:03:55 +0000 (10:03 +0200)]
s390/mm: align swapper_pg_dir to 16k

The segment/region table that is part of the kernel image must be
properly aligned to 16k in order to make the crdte inline assembly
work.
Otherwise it will calculate a wrong segment/region table start address
and access incorrect memory locations if the swapper_pg_dir is not
aligned to 16k.

Therefore define BSS_FIRST_SECTIONS in order to put the swapper_pg_dir
at the beginning of the bss section and also align the bss section to
16k just like other architectures did.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: dump_stack: fill in arch description
Christian Borntraeger [Tue, 24 May 2016 13:23:20 +0000 (15:23 +0200)]
s390: dump_stack: fill in arch description

Lets provide the basic machine information for dump_stack on
s390. This enables the "Hardware name:" line and results in
output like

[...]
Oops: 0004 ilc:2 [#1] SMP
Modules linked in:
CPU: 1 PID: 74 Comm: sh Not tainted 4.5.0+ #205
Hardware name: IBM              2964 NC9              704 (KVM)
[...]

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: use canonical include guard style
Daniel van Gerpen [Mon, 23 May 2016 13:19:38 +0000 (15:19 +0200)]
s390: use canonical include guard style

Signed-off-by: Daniel van Gerpen <daniel@vangerpen.de>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cpuinfo: show dynamic and static cpu mhz
Heiko Carstens [Thu, 14 Apr 2016 10:35:22 +0000 (12:35 +0200)]
s390/cpuinfo: show dynamic and static cpu mhz

Show the dynamic and static cpu mhz of each cpu. Since these values
are per cpu this requires a fundamental extension of the format of
/proc/cpuinfo.

Historically we had only a single line per cpu and a summary at the
top of the file. This format is hardly extendible if we want to add
more per cpu information.

Therefore this patch adds per cpu blocks at the end of /proc/cpuinfo:

cpu             : 0
cpu Mhz dynamic : 5504
cpu Mhz static  : 5504

cpu             : 1
cpu Mhz dynamic : 5504
cpu Mhz static  : 5504

cpu             : 2
cpu Mhz dynamic : 5504
cpu Mhz static  : 5504

cpu             : 3
cpu Mhz dynamic : 5504
cpu Mhz static  : 5504

Right now each block contains only the dynamic and static cpu mhz,
but it can be easily extended like on every other architecture.

This extension is supposed to be compatible with the old format.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/cpuinfo: print cache info and all single cpu lines on first iteration
Heiko Carstens [Thu, 14 Apr 2016 06:57:19 +0000 (08:57 +0200)]
s390/cpuinfo: print cache info and all single cpu lines on first iteration

Change the code to print all the current output during the first
iteration. This is a preparation patch for the upcoming per cpu block
extension to /proc/cpuinfo.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390: add explicit <linux/stringify.h> for jump label
Jason Baron [Fri, 20 May 2016 21:16:35 +0000 (17:16 -0400)]
s390: add explicit <linux/stringify.h> for jump label

Ensure that we always have __stringify().

Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pgtable: add mapping statistics
Heiko Carstens [Fri, 20 May 2016 06:08:14 +0000 (08:08 +0200)]
s390/pgtable: add mapping statistics

Add statistics that show how memory is mapped within the kernel
identity mapping. This is more or less the same like git
commit ce0c0e50f94e ("x86, generic: CPA add statistics about state
of direct mapping v4") for x86.

I also intentionally copied the lower case "k" within DirectMap4k vs
the upper case "M" and "G" within the two other lines. Let's have
consistent inconsistencies across architectures.

The output of /proc/meminfo now contains these additional lines:

DirectMap4k:        2048 kB
DirectMap1M:     3991552 kB
DirectMap2G:     4194304 kB

The implementation on s390 is lockless unlike the x86 version, since I
assume changes to the kernel mapping are a very rare event. Therefore
it really doesn't matter if these statistics could potentially be
inconsistent if read while kernel pages tables are being changed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/vmem: simplify vmem code for read-only mappings
Heiko Carstens [Tue, 10 May 2016 14:28:28 +0000 (16:28 +0200)]
s390/vmem: simplify vmem code for read-only mappings

For the kernel identity mapping map everything read-writeable and
subsequently call set_memory_ro() to make the ro section read-only.
This simplifies the code a lot.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pageattr: allow kernel page table splitting
Heiko Carstens [Tue, 17 May 2016 08:50:15 +0000 (10:50 +0200)]
s390/pageattr: allow kernel page table splitting

set_memory_ro() and set_memory_rw() currently only work on 4k
mappings, which is good enough for module code aka the vmalloc area.

However we stumbled already twice into the need to make this also work
on larger mappings:
- the ro after init patch set
- the crash kernel resize code

Therefore this patch implements automatic kernel page table splitting
if e.g. set_memory_ro() would be called on parts of a 2G mapping.
This works quite the same as the x86 code, but is much simpler.

In order to make this work and to be architecturally compliant we now
always use the csp, cspg or crdte instructions to replace valid page
table entries. This means that set_memory_ro() and set_memory_rw()
will be much more expensive than before. In order to avoid huge
latencies the code contains a couple of cond_resched() calls.

The current code only splits page tables, but does not merge them if
it would be possible.  The reason for this is that currently there is
no real life scenarion where this would really happen. All current use
cases that I know of only change access rights once during the life
time. If that should change we can still implement kernel page table
merging at a later time.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pgtable: make pmd and pud helper functions available
Heiko Carstens [Tue, 10 May 2016 08:34:47 +0000 (10:34 +0200)]
s390/pgtable: make pmd and pud helper functions available

Make pmd_wrprotect() and pmd_mkwrite() available independently from
CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLB_PAGE so these can be
used on the kernel mapping.

Also introduce a couple of pud helper functions, namely pud_pfn(),
pud_wrprotect(), pud_mkwrite(), pud_mkdirty() and pud_mkclean()
which only work on the kernel mapping.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: always use PAGE_KERNEL when mapping pages
Heiko Carstens [Thu, 19 May 2016 14:25:30 +0000 (16:25 +0200)]
s390/mm: always use PAGE_KERNEL when mapping pages

Always use PAGE_KERNEL when re-enabling pages within the kernel
mapping due to debug pagealloc. Without using this pgprot value
pte_mkwrite() and pte_wrprotect() won't work on such mappings after an
unmap -> map cycle anymore.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/vmem: make use of pte_clear()
Heiko Carstens [Tue, 17 May 2016 10:17:51 +0000 (12:17 +0200)]
s390/vmem: make use of pte_clear()

Use pte_clear() instead of open-coding it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pgtable: get rid of _REGION3_ENTRY_RO
Heiko Carstens [Fri, 20 May 2016 11:34:58 +0000 (13:34 +0200)]
s390/pgtable: get rid of _REGION3_ENTRY_RO

_REGION3_ENTRY_RO is a duplicate of _REGION_ENTRY_PROTECT.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/vmem: introduce and use SEGMENT_KERNEL and REGION3_KERNEL
Heiko Carstens [Wed, 11 May 2016 08:52:07 +0000 (10:52 +0200)]
s390/vmem: introduce and use SEGMENT_KERNEL and REGION3_KERNEL

Instead of open-coded SEGMENT_KERNEL and REGION3_KERNEL assignments use
defines.  Also to make e.g. pmd_wrprotect() work on the kernel mapping
a couple more flags must be set. Therefore add the missing flags also.

In order to make everything symmetrical this patch also adds software
dirty, young, read and write bits for region 3 table entries.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/vmem: align segment and region tables to 16k
Heiko Carstens [Fri, 13 May 2016 09:10:09 +0000 (11:10 +0200)]
s390/vmem: align segment and region tables to 16k

Usually segment and region tables are 16k aligned due to the way the
buddy allocator works.  This is not true for the vmem code which only
asks for a 4k alignment. In order to be consistent enforce a 16k
alignment here as well.

This alignment will be assumed and therefore is required by the
pageattr code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/pgtable: introduce and use generic csp inline asm
Heiko Carstens [Sat, 14 May 2016 08:46:33 +0000 (10:46 +0200)]
s390/pgtable: introduce and use generic csp inline asm

We have already two inline assemblies which make use of the csp
instruction. Since I need a third instance let's introduce a generic
inline assmebly which can be used by everyone.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/keyboard: use memdup_user_nul()
Muhammad Falak R Wani [Fri, 20 May 2016 13:21:20 +0000 (18:51 +0530)]
s390/keyboard: use memdup_user_nul()

Use memdup_user_nul to duplicate a memory region from user-space
to kernel-space and terminate with a NULL, instead of open coding
using kmalloc + copy_from_user and explicitly NULL terminating.

Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
[heiko.carstens@de.ibm.com: remove comment]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agoKVM: s390/mm: Fix CMMA reset during reboot
Christian Borntraeger [Mon, 13 Jun 2016 11:14:56 +0000 (13:14 +0200)]
KVM: s390/mm: Fix CMMA reset during reboot

commit 1e133ab296f ("s390/mm: split arch/s390/mm/pgtable.c") factored
out the page table handling code from __gmap_zap and  __s390_reset_cmma
into ptep_zap_unused and added a simple flag that tells which one of the
function (reset or not) is to be made. This also changed the behaviour,
as it also zaps unused page table entries on reset.
Turns out that this is wrong as s390_reset_cmma uses the page walker,
which DOES NOT take the ptl lock.

The most simple fix is to not do the zapping part on reset (which uses
the walker)

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 1e133ab296f ("s390/mm: split arch/s390/mm/pgtable.c")
Cc: stable@vger.kernel.org # 4.6+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agoLinux 4.7-rc3 v4.7-rc3
Linus Torvalds [Sun, 12 Jun 2016 14:20:35 +0000 (07:20 -0700)]
Linux 4.7-rc3

7 years agoMerge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Linus Torvalds [Sun, 12 Jun 2016 13:30:39 +0000 (06:30 -0700)]
Merge branch 'for-rc' of git://git./linux/kernel/git/rzhang/linux

Pull thermal management fixes from Zhang Rui:

 - fix an ordering issue in cpu cooling that cooling device is
   registered before it's ready (freq_table being populated).
   (Lukasz Luba)

 - fix a missing comment update (Caesar Wang)

* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: add the note for set_trip_temp
  thermal: cpu_cooling: fix improper order during initialization

7 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 12 Jun 2016 01:42:59 +0000 (18:42 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block layer fixes from Jens Axboe:
 "A small collection of fixes for the current series.  This contains:

   - Two fixes for xen-blkfront, from Bob Liu.

   - A bug fix for NVMe, releasing only the specific resources we
     requested.

   - Fix for a debugfs flags entry for nbd, from Josef.

   - Plug fix from Omar, fixing up a case of code being switched between
     two functions.

   - A missing bio_put() for the new discard callers of
     submit_bio_wait(), fixing a regression causing a leak of the bio.
     From Shaun.

   - Improve dirty limit calculation precision in the writeback code,
     fixing a case where setting a limit lower than 1% of memory would
     end up being zero.  From Tejun"

* 'for-linus' of git://git.kernel.dk/linux-block:
  NVMe: Only release requested regions
  xen-blkfront: fix resume issues after a migration
  xen-blkfront: don't call talk_to_blkback when already connected to blkback
  nbd: pass the nbd pointer for flags debugfs
  block: missing bio_put following submit_bio_wait
  blk-mq: really fix plug list flushing for nomerge queues
  writeback: use higher precision calculation in domain_dirty_limits()

7 years agoMerge tag 'gpio-v4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux...
Linus Torvalds [Sun, 12 Jun 2016 01:03:39 +0000 (18:03 -0700)]
Merge tag 'gpio-v4.7-3' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "A new bunch of GPIO fixes for v4.7.

  This time I am very grateful that Ricardo Ribalda Delgado went in and
  fixed my stupid refcounting mistakes in the removal path for GPIO
  chips.  I had a feeling something was wrong here and so it was.  It
  exploded on OMAP and it fixes their problem.  Now it should be (more)
  solid.

  The rest i compilation, Kconfig and driver fixes.  Some tagged for
  stable.

  Summary:

   - Fix a NULL pointer dereference when we are searching the GPIO
     device list but one of the devices have been removed (struct
     gpio_chip pointer is NULL).

   - Fix unaligned reference counters: we were ending on +3 after all
     said and done.  It should be 0.  Remove an extraneous get_device(),
     and call cdev_del() followed by device_del() in gpiochip_remove()
     instead and the count goes to zero and calls the release() function
     properly.

   - Fix a compile warning due to a missing #include in the OF/device
     tree portions.

   - Select ANON_INODES for GPIOLIB, we're using that for our character
     device.  Some randconfig tests disclosed the problem.

   - Make sure the Zynq driver clock runs also without CONFIG_PM enabled

   - Fix an off-by-one error in the 104-DIO-48E driver

   - Fix warnings in bcm_kona_gpio_reset()"

* tag 'gpio-v4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: bcm-kona: fix bcm_kona_gpio_reset() warnings
  gpio: select ANON_INODES
  gpio: include <linux/io-mapping.h> in gpiolib-of
  gpiolib: Fix unaligned used of reference counters
  gpiolib: Fix NULL pointer deference
  gpio: zynq: initialize clock even without CONFIG_PM
  gpio: 104-dio-48e: Fix control port offset computation off-by-one error

7 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 11 Jun 2016 18:42:08 +0000 (11:42 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two current fixes:

   - one affects Qemu CD ROM emulation, which stopped working after the
     updates in SCSI to require VPD pages from all conformant devices.

     Fix temporarily by blacklisting Qemu (we can relax later when they
     come into compliance).

   - The other is a fix to the optimal transfer size.  We set up a
     minefield for ourselves by being confused about whether the limits
     are in bytes or sectors (SCSI optimal is in blocks and the queue
     parameter is in bytes).

     This tries to fix the problem (wrong setting for queue limits
     max_sectors) and make the problem more obvious by introducing a
     wrapper function"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  sd: Fix rw_max for devices that report an optimal xfer size
  scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist

7 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 11 Jun 2016 18:24:54 +0000 (11:24 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - a bigger fix for i801 to finally be able to be loaded on some
   machines again

 - smaller driver fixes

 - documentation update because of a renamed file

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mux: reg: Provide of_match_table
  i2c: mux: refer to i2c-mux.txt
  i2c: octeon: Avoid printk after too long SMBUS message
  i2c: octeon: Missing AAK flag in case of I2C_M_RECV_LEN
  i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR

7 years agoMerge tag 'devicetree-fixes-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 11 Jun 2016 18:08:57 +0000 (11:08 -0700)]
Merge tag 'devicetree-fixes-for-4.7' of git://git./linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:

 - fix unflatten_dt_nodes when dad parameter is set.

 - add vendor prefixes for TechNexion and UniWest

 - documentation fix for Marvell BT

 - OF IRQ kerneldoc fixes

 - restrict CMA alignment adjustments to non dma-coherent

 - a couple of warning fixes in reserved-memory code

 - DT maintainers updates

* tag 'devicetree-fixes-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  drivers: of: add definition of early_init_dt_alloc_reserved_memory_arch
  drivers/of: Fix depth for sub-tree blob in unflatten_dt_nodes()
  drivers: of: Fix of_pci.h header guard
  dt-bindings: Add vendor prefix for TechNexion
  of: add vendor prefix for UniWest
  dt: bindings: fix documentation for MARVELL's bt-sd8xxx wireless device
  of: add missing const for of_parse_phandle_with_args() in !CONFIG_OF
  of: silence warnings due to max() usage
  drivers: of: of_reserved_mem: fixup the CMA alignment not to affect dma-coherent
  of: irq: fix of_irq_get[_byname]() kernel-doc
  MAINTAINERS: DeviceTree maintainer updates

7 years agoMerge tag '20160610_uvc_compat_for_linus' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Sat, 11 Jun 2016 17:55:30 +0000 (10:55 -0700)]
Merge tag '20160610_uvc_compat_for_linus' of git://git./linux/kernel/git/luto/linux

Pull uvc compat XU ioctl fixes from Andy Lutomirski:
 "uvc's compat XU ioctls go through tons of potentially buggy
  indirection.  The first patch removes the indirection.  The second one
  cleans up the code.

  Compile-tested only.  I have the hardware, but I have absolutely no
  idea what XU does, how to use it, what software to recompile as
  32-bit, or what to test in that software"

* tag '20160610_uvc_compat_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux:
  uvc_v4l2: Simplify compat ioctl implementation
  uvc: Forward compat ioctls to their handlers directly

7 years agouvc_v4l2: Simplify compat ioctl implementation
Andy Lutomirski [Thu, 12 May 2016 00:41:27 +0000 (17:41 -0700)]
uvc_v4l2: Simplify compat ioctl implementation

The uvc compat ioctl implementation seems to have copied user data
for no good reason.  Remove a bunch of copies.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
7 years agouvc: Forward compat ioctls to their handlers directly
Andy Lutomirski [Tue, 24 May 2016 22:13:02 +0000 (15:13 -0700)]
uvc: Forward compat ioctls to their handlers directly

The current code goes through a lot of indirection just to call a
known handler.  Simplify it: just call the handlers directly.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
7 years agoMerge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason...
Linus Torvalds [Fri, 10 Jun 2016 21:13:27 +0000 (14:13 -0700)]
Merge branch 'for-linus-4.7' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
 "Has some fixes and some new self tests for btrfs.  The self tests are
  usually disabled in the .config file (unless you're doing btrfs dev
  work), and this bunch is meant to find problems with the 64K page size
  patches.

  Jeff has a patch to help people see if they are using the hardware
  assist crc32c module, which really helps us nail down problems when
  people ask why crcs are using so much CPU.

  Otherwise, it's small fixes"

* 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: self-tests: Fix extent buffer bitmap test fail on BE system
  Btrfs: self-tests: Fix test_bitmaps fail on 64k sectorsize
  Btrfs: self-tests: Use macros instead of constants and add missing newline
  Btrfs: self-tests: Support testing all possible sectorsizes and nodesizes
  Btrfs: self-tests: Execute page straddling test only when nodesize < PAGE_SIZE
  btrfs: advertise which crc32c implementation is being used at module load
  Btrfs: add validadtion checks for chunk loading
  Btrfs: add more validation checks for superblock
  Btrfs: clear uptodate flags of pages in sys_array eb
  Btrfs: self-tests: Support non-4k page size
  Btrfs: Fix integer overflow when calculating bytes_per_bitmap
  Btrfs: test_check_exists: Fix infinite loop when searching for free space entries
  Btrfs: end transaction if we abort when creating uuid root
  btrfs: Use __u64 in exported linux/btrfs.h.

7 years agoMerge tag 'powerpc-4.7-3Michael Ellerman:' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 10 Jun 2016 19:23:49 +0000 (12:23 -0700)]
Merge tag 'powerpc-4.7-3Michael Ellerman:' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from
 - ptrace: Fix out of bounds array access warning from Khem Raj
 - pseries: Fix PCI config address for DDW from Gavin Shan
 - pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
   from Michael Ellerman
 - of: fix autoloading due to broken modalias with no 'compatible' from
   Wolfram Sang
 - radix: Fix always false comparison against MMU_NO_CONTEXT from Aneesh
   Kumar K.V
 - hash: Compute the segment size correctly for ISA 3.0 from Aneesh
   Kumar K.V
 - nohash: Fix build break with 64K pages from Michael Ellerman

* tag 'powerpc-4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/nohash: Fix build break with 64K pages
  powerpc/mm/hash: Compute the segment size correctly for ISA 3.0
  powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXT
  of: fix autoloading due to broken modalias with no 'compatible'
  powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
  powerpc/pseries: Fix PCI config address for DDW
  powerpc/ptrace: Fix out of bounds array access warning

7 years agoMerge tag 'hwmon-for-linus-v4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Jun 2016 19:18:34 +0000 (12:18 -0700)]
Merge tag 'hwmon-for-linus-v4.7-rc3' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - fix regression in fam15h_power driver

 - minor variable type fix in lm90 driver

 - document compatible statement for ina2xx driver

* tag 'hwmon-for-linus-v4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (lm90) use proper type for update_interval
  hwmon: (ina2xx) Document compatible for INA231
  hwmon: (fam15h_power) Disable preemption when reading registers

7 years agoMerge branch 'stacking-fixes' (vfs stacking fixes from Jann)
Linus Torvalds [Fri, 10 Jun 2016 19:10:02 +0000 (12:10 -0700)]
Merge branch 'stacking-fixes' (vfs stacking fixes from Jann)

Merge filesystem stacking fixes from Jann Horn.

* emailed patches from Jann Horn <jannh@google.com>:
  sched: panic on corrupted stack end
  ecryptfs: forbid opening files without mmap handler
  proc: prevent stacking filesystems on top

7 years agosched: panic on corrupted stack end
Jann Horn [Wed, 1 Jun 2016 09:55:07 +0000 (11:55 +0200)]
sched: panic on corrupted stack end

Until now, hitting this BUG_ON caused a recursive oops (because oops
handling involves do_exit(), which calls into the scheduler, which in
turn raises an oops), which caused stuff below the stack to be
overwritten until a panic happened (e.g.  via an oops in interrupt
context, caused by the overwritten CPU index in the thread_info).

Just panic directly.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoecryptfs: forbid opening files without mmap handler
Jann Horn [Wed, 1 Jun 2016 09:55:06 +0000 (11:55 +0200)]
ecryptfs: forbid opening files without mmap handler

This prevents users from triggering a stack overflow through a recursive
invocation of pagefault handling that involves mapping procfs files into
virtual memory.

Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoproc: prevent stacking filesystems on top
Jann Horn [Wed, 1 Jun 2016 09:55:05 +0000 (11:55 +0200)]
proc: prevent stacking filesystems on top

This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem.  There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.

(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)

Signed-off-by: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 10 Jun 2016 18:57:17 +0000 (11:57 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:
 "A fix for an issue that Alex saw whilst swapping with hardware
  access/dirty bit support enabled in the kernel: Fix a failure to fault
  in old pages on a write when CONFIG_ARM64_HW_AFDBM is enabled"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: always take dirty state from new pte in ptep_set_access_flags