cascardo/linux.git
7 years agopowerpc/bpf: Implement support for tail calls
Naveen N. Rao [Fri, 23 Sep 2016 20:35:01 +0000 (02:05 +0530)]
powerpc/bpf: Implement support for tail calls

Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
programs. This can be achieved either by:
(1) retaining the stack setup by the first eBPF program and having all
subsequent eBPF programs re-using it, or,
(2) by unwinding/tearing down the stack and having each eBPF program
deal with its own stack as it sees fit.

To ensure that this does not create loops, there is a limit to how many
tail calls can be done (currently 32). This requires the JIT'ed code to
maintain a count of the number of tail calls done so far.

Approach (1) is simple, but requires every eBPF program to have (almost)
the same prologue/epilogue, regardless of whether they need it. This is
inefficient for small eBPF programs which may not sometimes need a
prologue at all. As such, to minimize impact of tail call
implementation, we use approach (2) here which needs each eBPF program
in the chain to use its own prologue/epilogue. This is not ideal when
many tail calls are involved and when all the eBPF programs in the chain
have similar prologue/epilogue. However, the impact is restricted to
programs that do tail calls. Individual eBPF programs are not affected.

We maintain the tail call count in a fixed location on the stack and
updated tail call count values are passed in through this. The very
first eBPF program in a chain sets this up to 0 (the first 2
instructions). Subsequent tail calls skip the first two eBPF JIT
instructions to maintain the count. For programs that don't do tail
calls themselves, the first two instructions are NOPs.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/bpf: Introduce accessors for using the tmp local stack space
Naveen N. Rao [Fri, 23 Sep 2016 20:35:00 +0000 (02:05 +0530)]
powerpc/bpf: Introduce accessors for using the tmp local stack space

While at it, ensure that the location of the local save area is
consistent whether or not we setup our own stackframe. This property is
utilised in the next patch that adds support for tail calls.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n
Michael Ellerman [Fri, 30 Sep 2016 00:51:46 +0000 (10:51 +1000)]
powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n

The fadump code calls vmcore_cleanup() which only exists if
CONFIG_PROC_VMCORE=y. We don't want to depend on CONFIG_PROC_VMCORE,
because it's user selectable, so just wrap the call in an #ifdef.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: tm: Enable transactional memory (TM) lazily for userspace
Cyril Bur [Wed, 14 Sep 2016 08:02:16 +0000 (18:02 +1000)]
powerpc: tm: Enable transactional memory (TM) lazily for userspace

Currently the MSR TM bit is always set if the hardware is TM capable.
This adds extra overhead as it means the TM SPRS (TFHAR, TEXASR and
TFAIR) must be swapped for each process regardless of if they use TM.

For processes that don't use TM the TM MSR bit can be turned off
allowing the kernel to avoid the expensive swap of the TM registers.

A TM unavailable exception will occur if a thread does use TM and the
kernel will enable MSR_TM and leave it so for some time afterwards.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/tm: Add TM Unavailable Exception
Cyril Bur [Wed, 14 Sep 2016 08:02:15 +0000 (18:02 +1000)]
powerpc/tm: Add TM Unavailable Exception

If the kernel disables transactional memory (TM) and userspace still
tries TM related actions (TM instructions or TM SPR accesses) TM aware
hardware will cause the kernel to take a facility unavailable
exception.

Add checks for the exception being caused by illegal TM access in
userspace.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Rewrite comment entirely, bugs in it are mine]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Remove do_load_up_transact_{fpu,altivec}
Cyril Bur [Fri, 23 Sep 2016 06:18:26 +0000 (16:18 +1000)]
powerpc: Remove do_load_up_transact_{fpu,altivec}

Previous rework of TM code leaves these functions unused

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: tm: Rename transct_(*) to ck(\1)_state
Cyril Bur [Fri, 23 Sep 2016 06:18:25 +0000 (16:18 +1000)]
powerpc: tm: Rename transct_(*) to ck(\1)_state

Make the structures being used for checkpointed state named
consistently with the pt_regs/ckpt_regs.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: tm: Always use fp_state and vr_state to store live registers
Cyril Bur [Fri, 23 Sep 2016 06:18:24 +0000 (16:18 +1000)]
powerpc: tm: Always use fp_state and vr_state to store live registers

There is currently an inconsistency as to how the entire CPU register
state is saved and restored when a thread uses transactional memory
(TM).

Using transactional memory results in the CPU having duplicated
(almost) all of its register state. This duplication results in a set
of registers which can be considered 'live', those being currently
modified by the instructions being executed and another set that is
frozen at a point in time.

On context switch, both sets of state have to be saved and (later)
restored. These two states are often called a variety of different
things. Common terms for the state which only exists after the CPU has
entered a transaction (performed a TBEGIN instruction) in hardware are
'transactional' or 'speculative'.

Between a TBEGIN and a TEND or TABORT (or an event that causes the
hardware to abort), regardless of the use of TSUSPEND the
transactional state can be referred to as the live state.

The second state is often to referred to as the 'checkpointed' state
and is a duplication of the live state when the TBEGIN instruction is
executed. This state is kept in the hardware and will be rolled back
to on transaction failure.

Currently all the registers stored in pt_regs are ALWAYS the live
registers, that is, when a thread has transactional registers their
values are stored in pt_regs and the checkpointed state is in
ckpt_regs. A strange opposite is true for fp_state/vr_state. When a
thread is non transactional fp_state/vr_state holds the live
registers. When a thread has initiated a transaction fp_state/vr_state
holds the checkpointed state and transact_fp/transact_vr become the
structure which holds the live state (at this point it is a
transactional state).

This method creates confusion as to where the live state is, in some
circumstances it requires extra work to determine where to put the
live state and prevents the use of common functions designed (probably
before TM) to save the live state.

With this patch pt_regs, fp_state and vr_state all represent the
same thing and the other structures [pending rename] are for
checkpointed state.

Acked-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Add checks for transactional VSXs in signal contexts
Cyril Bur [Fri, 23 Sep 2016 06:18:23 +0000 (16:18 +1000)]
selftests/powerpc: Add checks for transactional VSXs in signal contexts

If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Add checks for transactional VMXs in signal contexts
Cyril Bur [Fri, 23 Sep 2016 06:18:22 +0000 (16:18 +1000)]
selftests/powerpc: Add checks for transactional VMXs in signal contexts

If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Add checks for transactional FPUs in signal contexts
Cyril Bur [Fri, 23 Sep 2016 06:18:21 +0000 (16:18 +1000)]
selftests/powerpc: Add checks for transactional FPUs in signal contexts

If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Add checks for transactional GPRs in signal contexts
Cyril Bur [Fri, 23 Sep 2016 06:18:20 +0000 (16:18 +1000)]
selftests/powerpc: Add checks for transactional GPRs in signal contexts

If a thread receives a signal while transactional the kernel creates a
second context to show the transactional state of the process. This
test loads some known values and waits for a signal and confirms that
the expected values are in the signal context.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Check that signals always get delivered
Cyril Bur [Fri, 23 Sep 2016 06:18:19 +0000 (16:18 +1000)]
selftests/powerpc: Check that signals always get delivered

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Add TM tcheck helpers in C
Cyril Bur [Fri, 23 Sep 2016 06:18:18 +0000 (16:18 +1000)]
selftests/powerpc: Add TM tcheck helpers in C

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Allow tests to extend their kill timeout
Cyril Bur [Fri, 23 Sep 2016 06:18:17 +0000 (16:18 +1000)]
selftests/powerpc: Allow tests to extend their kill timeout

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Introduce GPR asm helper header file
Cyril Bur [Fri, 23 Sep 2016 06:18:16 +0000 (16:18 +1000)]
selftests/powerpc: Introduce GPR asm helper header file

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Move VMX stack frame macros to header file
Cyril Bur [Fri, 23 Sep 2016 06:18:15 +0000 (16:18 +1000)]
selftests/powerpc: Move VMX stack frame macros to header file

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Rework FPU stack placement macros and move to header file
Cyril Bur [Fri, 23 Sep 2016 06:18:14 +0000 (16:18 +1000)]
selftests/powerpc: Rework FPU stack placement macros and move to header file

The FPU regs are placed at the top of the stack frame. Currently the
position expected to be passed to the macro. The macros now should be
passed the stack frame size and from there they can calculate where to
put the regs, this makes the use simpler.

Also move them to a header file to be used in an different area of the
powerpc selftests

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Check for VSX preservation across userspace preemption
Cyril Bur [Fri, 23 Sep 2016 06:18:13 +0000 (16:18 +1000)]
selftests/powerpc: Check for VSX preservation across userspace preemption

Ensure the kernel correctly switches VSX registers correctly. VSX
registers are all volatile, and despite the kernel preserving VSX
across syscalls, it doesn't have to. Test that during interrupts and
timeslices ending the VSX regs remain the same.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: signals: Stop using current in signal code
Cyril Bur [Fri, 23 Sep 2016 06:18:12 +0000 (16:18 +1000)]
powerpc: signals: Stop using current in signal code

Much of the signal code takes a pt_regs on which it operates. Over
time the signal code has needed to know more about the thread than
what pt_regs can supply, this information is obtained as needed by
using 'current'.

This approach is not strictly incorrect however it does mean that
there is now a hard requirement that the pt_regs being passed around
does belong to current, this is never checked. A safer approach is for
the majority of the signal functions to take a task_struct from which
they can obtain pt_regs and any other information they need. The
caveat that the task_struct they are passed must be current doesn't go
away but can more easily be checked for.

Functions called from outside powerpc signal code are passed a pt_regs
and they can confirm that the pt_regs is that of current and pass
current to other functions, furthurmore, powerpc signal functions can
check that the task_struct they are passed is the same as current
avoiding possible corruption of current (or the task they are passed)
if this assertion ever fails.

CC: paulus@samba.org
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Never giveup a reclaimed thread when enabling kernel {fp, altivec, vsx}
Cyril Bur [Fri, 23 Sep 2016 06:18:11 +0000 (16:18 +1000)]
powerpc: Never giveup a reclaimed thread when enabling kernel {fp, altivec, vsx}

After a thread is reclaimed from its active or suspended transactional
state the checkpointed state exists on CPU, this state (along with the
live/transactional state) has been saved in its entirety by the
reclaiming process.

There exists a sequence of events that would cause the kernel to call
one of enable_kernel_fp(), enable_kernel_altivec() or
enable_kernel_vsx() after a thread has been reclaimed. These functions
save away any user state on the CPU so that the kernel can use the
registers. Not only is this saving away unnecessary at this point, it
is actually incorrect. It causes a save of the checkpointed state to
the live structures within the thread struct thus destroying the true
live state for that thread.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Return the new MSR from msr_check_and_set()
Cyril Bur [Fri, 23 Sep 2016 06:18:10 +0000 (16:18 +1000)]
powerpc: Return the new MSR from msr_check_and_set()

msr_check_and_set() always performs a mfmsr() to determine if it needs
to perform an mtmsr(), as mfmsr() can be a costly operation
msr_check_and_set() could return the MSR now on the CPU to avoid
callers of msr_check_and_set having to make their own mfmsr() call.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Add check_if_tm_restore_required() to giveup_all()
Cyril Bur [Fri, 23 Sep 2016 06:18:09 +0000 (16:18 +1000)]
powerpc: Add check_if_tm_restore_required() to giveup_all()

giveup_all() causes FPU/VMX/VSX facilities to be disabled in a threads
MSR. If the thread performing the giveup was transactional, the kernel
must record which facilities were in use before the giveup as the
thread must have these facilities re-enabled on return to userspace.

>From process.c:
 /*
  * This is called if we are on the way out to userspace and the
  * TIF_RESTORE_TM flag is set.  It checks if we need to reload
  * FP and/or vector state and does so if necessary.
  * If userspace is inside a transaction (whether active or
  * suspended) and FP/VMX/VSX instructions have ever been enabled
  * inside that transaction, then we have to keep them enabled
  * and keep the FP/VMX/VSX state loaded while ever the transaction
  * continues.  The reason is that if we didn't, and subsequently
  * got a FP/VMX/VSX unavailable interrupt inside a transaction,
  * we don't know whether it's the same transaction, and thus we
  * don't know which of the checkpointed state and the transactional
  * state to use.
  */

Calling check_if_tm_restore_required() will set TIF_RESTORE_TM and
save the MSR if needed.

Fixes: c208505 ("powerpc: create giveup_all()")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use
Cyril Bur [Fri, 23 Sep 2016 06:18:08 +0000 (16:18 +1000)]
powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use

Comment from arch/powerpc/kernel/process.c:967:
 If userspace is inside a transaction (whether active or
 suspended) and FP/VMX/VSX instructions have ever been enabled
 inside that transaction, then we have to keep them enabled
 and keep the FP/VMX/VSX state loaded while ever the transaction
 continues.  The reason is that if we didn't, and subsequently
 got a FP/VMX/VSX unavailable interrupt inside a transaction,
 we don't know whether it's the same transaction, and thus we
 don't know which of the checkpointed state and the ransactional
 state to use.

restore_math() restore_fp() and restore_altivec() currently may not
restore the registers. It doesn't appear that this is more serious
than a performance penalty. If the math registers aren't restored the
userspace thread will still be run with the facility disabled.
Userspace will not be able to read invalid values. On the first access
it will take an facility unavailable exception and the kernel will
detected an active transaction, at which point it will abort the
transaction. There is the possibility for a pathological case
preventing any progress by transactions, however, transactions
are never guaranteed to make progress.

Fixes: 70fe3d9 ("powerpc: Restore FPU/VEC/VSX if previously used")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Fix data type for @r in pnv_ioda_parse_m64_window()
Gavin Shan [Tue, 2 Aug 2016 04:10:35 +0000 (14:10 +1000)]
powerpc/powernv: Fix data type for @r in pnv_ioda_parse_m64_window()

This fixes warning reported from sparse:

  pci-ioda.c:451:49: warning: incorrect type in argument 2 (different base types)

Fixes: 262af557dd75 ("powerpc/powernv: Enable M64 aperatus for PHB3")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
Gavin Shan [Tue, 2 Aug 2016 04:10:32 +0000 (14:10 +1000)]
powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()

This fixes the warnings reported from sparse:

  pci.c:312:33: warning: restricted __be64 degrades to integer
  pci.c:313:33: warning: restricted __be64 degrades to integer

Fixes: cee72d5bb489 ("powerpc/powernv: Display diag data on p7ioc EEH errors")
Cc: stable@vger.kernel.org # v3.3+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Specify proper data type for PCI_SLOT_ID_PREFIX
Gavin Shan [Tue, 2 Aug 2016 04:10:31 +0000 (14:10 +1000)]
powerpc/powernv: Specify proper data type for PCI_SLOT_ID_PREFIX

This fixes the warning reported from sparse:

  eeh-powernv.c:875:23: warning: constant 0x8000000000000000 is so big it is unsigned long

Fixes: ebe225312739 ("powerpc/powernv: Support PCI slot ID")
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
Gavin Shan [Tue, 2 Aug 2016 04:10:30 +0000 (14:10 +1000)]
powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()

The hub diag-data type is filled with big-endian data by OPAL call
opal_pci_get_hub_diag_data(). We need convert it to CPU-endian value
before using it. The issue is reported by sparse as pointed by Michael
Ellerman:

  eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer

This converts hub diag-data type to CPU-endian before using it in
pnv_eeh_get_and_dump_hub_diag().

Fixes: 2a485ad7c88d ("powerpc/powernv: Drop PHB operation next_error()")
Cc: stable@vger.kernel.org # v4.1+
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
Gavin Shan [Tue, 2 Aug 2016 04:10:29 +0000 (14:10 +1000)]
powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()

The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in
big-endian format. It should be converted to CPU-endian before it is
passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if
the PE is invalid one. As Michael Ellerman pointed out, the issue is
also detected by sparse:

  eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types)

This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it
should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed
PE number"), which was merged to 4.3 kernel.

Fixes: 71b540adffd9 ("powerpc/powernv: Don't escalate non-existing frozen PE")
Cc: stable@vger.kernel.org # v4.3+
Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agodrivers/pci/hotplug: Use of_property_read_u32() in powernv driver
Gavin Shan [Thu, 29 Sep 2016 05:52:04 +0000 (15:52 +1000)]
drivers/pci/hotplug: Use of_property_read_u32() in powernv driver

This replaces of_get_property() with of_property_read_u32() or
of_property_read_string() so that we needn't consider the endian
issue, the returned value always is in CPU-endian.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[mpe: Fold in the change to the "ibm,slot-surprise-pluggable" case]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put()
Andrew Donnellan [Fri, 29 Jul 2016 03:55:34 +0000 (13:55 +1000)]
cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put()

Rewrite the cxl_guest_init_afu() loop in cxl_of_probe() to use
for_each_child_of_node() rather than a hand-coded for loop.

Remove the useless of_node_put(afu_np) call after the loop, where it's
guaranteed that afu_np == NULL.

Reported-by: SF Markus Elfring <elfring@users.sourceforge.net>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: Flush PSL cache before resetting the adapter
Frederic Barrat [Mon, 3 Oct 2016 19:36:02 +0000 (21:36 +0200)]
cxl: Flush PSL cache before resetting the adapter

If the capi link is going down while the PSL owns a dirty cache line,
any access from the host for that data could lead to an Uncorrectable
Error.

So when resetting the capi adapter through sysfs, make sure the PSL
cache is flushed. It won't help if there are any active Process
Elements on the card, as the cache would likely get new dirty cache
lines immediately, but if resetting an idle adapter, it should avoid
any bad surprises from data left over from terminated Process Elements.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Set default CPU type to POWER8 for little endian builds
Anton Blanchard [Sun, 25 Sep 2016 12:35:41 +0000 (22:35 +1000)]
powerpc: Set default CPU type to POWER8 for little endian builds

We supported POWER7 CPUs for bootstrapping little endian, but the
target was always POWER8. Now that POWER7 specific issues are
impacting performance, change the default target to POWER8.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian
Anton Blanchard [Sun, 25 Sep 2016 12:35:40 +0000 (22:35 +1000)]
powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian

POWER8 handles unaligned accesses in little endian mode, but commit
0b5e6661ac69 ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on
little endian builds") disabled it for all.

The issue with unaligned little endian accesses is specific to POWER7,
so update the Kconfig check to match. Using the stat() testcase from
commit a75c380c7129 ("powerpc: Enable DCACHE_WORD_ACCESS on ppc64le"),
performance improves 15% on POWER8.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Remove static branch prediction in atomic{, 64}_add_unless
Anton Blanchard [Mon, 3 Oct 2016 06:03:03 +0000 (17:03 +1100)]
powerpc: Remove static branch prediction in atomic{, 64}_add_unless

I see quite a lot of static branch mispredictions on a simple
web serving workload. The issue is in __atomic_add_unless(), called
from _atomic_dec_and_lock(). There is no obvious common case, so it
is better to let the hardware predict the branch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: During context switch, check before setting mm_cpumask
Anton Blanchard [Mon, 3 Oct 2016 06:40:29 +0000 (17:40 +1100)]
powerpc: During context switch, check before setting mm_cpumask

During context switch, switch_mm() sets our current CPU in mm_cpumask.
We can avoid this atomic sequence in most cases by checking before
setting the bit.

Testing on a POWER8 using our context switch microbenchmark:

tools/testing/selftests/powerpc/benchmarks/context_switch \
--process --no-fp --no-altivec --no-vector

Performance improves 2%.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/eeh: Quieten EEH message when no adapters are found
Anton Blanchard [Sun, 2 Oct 2016 00:09:38 +0000 (11:09 +1100)]
powerpc/eeh: Quieten EEH message when no adapters are found

No real need for this to be pr_warn(), reduce it to pr_info().

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/configs: Enable Intel i40e on 64 bit configs
Anton Blanchard [Fri, 30 Sep 2016 10:39:53 +0000 (20:39 +1000)]
powerpc/configs: Enable Intel i40e on 64 bit configs

We are starting to see i40e adapters in recent machines, so enable
it in our configs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/configs: Change a few things from built in to modules
Anton Blanchard [Fri, 30 Sep 2016 10:39:52 +0000 (20:39 +1000)]
powerpc/configs: Change a few things from built in to modules

Change a few devices and filesystems that are seldom used any more
from built in to modules. This reduces our vmlinux about 500kB.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/configs: Bump kernel ring buffer size on 64 bit configs
Anton Blanchard [Fri, 30 Sep 2016 10:39:51 +0000 (20:39 +1000)]
powerpc/configs: Bump kernel ring buffer size on 64 bit configs

When we issue a system reset, every CPU in the box prints an Oops,
including a backtrace. Each of these can be quite large (over 4kB)
and we may end up wrapping the ring buffer and losing important
information.

Bump the base size from 128kB to 256kB and the per CPU size from
4kB to 8kB.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/configs: Enable VMX crypto
Anton Blanchard [Fri, 30 Sep 2016 10:39:50 +0000 (20:39 +1000)]
powerpc/configs: Enable VMX crypto

We see big improvements with the VMX crypto functions (often 10x or more),
so enable it as a module.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64: Align hot loops of memset() and backwards_memcpy()
Anton Blanchard [Thu, 4 Aug 2016 06:53:22 +0000 (16:53 +1000)]
powerpc/64: Align hot loops of memset() and backwards_memcpy()

Align the hot loops in our assembly implementation of memset()
and backwards_memcpy().

backwards_memcpy() is called from tcp_v4_rcv(), so we might
want to optimise this a little more.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Remove unused exception code, small cleanups
Nicholas Piggin [Wed, 21 Sep 2016 07:44:07 +0000 (17:44 +1000)]
powerpc/64s: Remove unused exception code, small cleanups

This was not done before the big patches because I only noticed
them afterwards. It has become much easier to see which handlers
are branched to from which exception vectors now, and to see
exactly what vector space is being used for what.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Use a single macro for both parts of OOL exception
Nicholas Piggin [Wed, 21 Sep 2016 07:44:06 +0000 (17:44 +1000)]
powerpc/64s: Use a single macro for both parts of OOL exception

Simple substitution. This is possible now that both parts of the OOL
initial handler get linked into their correct location.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Move __replay_interrupt function below handlers
Nicholas Piggin [Wed, 21 Sep 2016 07:44:05 +0000 (17:44 +1000)]
powerpc/64s: Move __replay_interrupt function below handlers

This is not an exception handler as such, it's called from
local_irq_enable(), not exception entry.

Also clean up some now redundant comments at the end of the
consolidation series.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate CBE Thermal 0x1800 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:44:04 +0000 (17:44 +1000)]
powerpc/64s: Consolidate CBE Thermal 0x1800 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Altivec 0x1700 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:44:03 +0000 (17:44 +1000)]
powerpc/64s: Consolidate Altivec 0x1700 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Debug 0x1600 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:44:02 +0000 (17:44 +1000)]
powerpc/64s: Consolidate Debug 0x1600 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Softpatch 0x1500 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:44:01 +0000 (17:44 +1000)]
powerpc/64s: Consolidate Softpatch 0x1500 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Instruction Breakpoint 0x1300 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:44:00 +0000 (17:44 +1000)]
powerpc/64s: Consolidate Instruction Breakpoint 0x1300 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate CBE System Error 0x1200 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:59 +0000 (17:43 +1000)]
powerpc/64s: Consolidate CBE System Error 0x1200 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Reserved 0xfa0-0x1200 interrupts
Nicholas Piggin [Wed, 21 Sep 2016 07:43:58 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Reserved 0xfa0-0x1200 interrupts

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Facility Unavailable 0xf80 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:57 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Facility Unavailable 0xf80 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Facility Unavailable 0xf60 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:56 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Facility Unavailable 0xf60 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate VSX Unavailable 0xf40 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:55 +0000 (17:43 +1000)]
powerpc/64s: Consolidate VSX Unavailable 0xf40 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Vector Unavailable 0xf20 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:54 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Vector Unavailable 0xf20 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Performance Monitor 0xf00 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:53 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Performance Monitor 0xf00 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Reserved 0xec0, 0xee0 interrupts
Nicholas Piggin [Wed, 21 Sep 2016 07:43:52 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Reserved 0xec0, 0xee0 interrupts

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Virtualization 0xea0 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:51 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Virtualization 0xea0 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Directed Hypervisor Doorbell 0xe80 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:50 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Directed Hypervisor Doorbell 0xe80 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Maintenance 0xe60 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:49 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Maintenance 0xe60 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Emulation Assistance 0xe40 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:48 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Emulation Assistance 0xe40 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Instruction Storage 0xe20 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:47 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Instruction Storage 0xe20 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Data Storage 0xe00 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:46 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Data Storage 0xe00 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Trace 0xd00 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:45 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Trace 0xd00 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate System Call 0xc00 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:44 +0000 (17:43 +1000)]
powerpc/64s: Consolidate System Call 0xc00 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Reserved 0xb00 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:43 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Reserved 0xb00 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Directed Privileged Doorbell 0xa00 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:42 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Directed Privileged Doorbell 0xa00 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Hypervisor Decrementer 0x980 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:41 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Hypervisor Decrementer 0x980 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Decrementer 0x900 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:40 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Decrementer 0x900 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate FP Unavailable 0x800 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:39 +0000 (17:43 +1000)]
powerpc/64s: Consolidate FP Unavailable 0x800 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Program 0x700 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:38 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Program 0x700 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Alignment 0x600 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:37 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Alignment 0x600 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate External 0x500 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:36 +0000 (17:43 +1000)]
powerpc/64s: Consolidate External 0x500 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Instruction Segment 0x480 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:35 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Instruction Segment 0x480 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Instruction Storage 0x400 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:34 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Instruction Storage 0x400 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Data Segment 0x380 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:33 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Data Segment 0x380 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Data Storage 0x300 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:32 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Data Storage 0x300 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate Machine Check 0x200 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:31 +0000 (17:43 +1000)]
powerpc/64s: Consolidate Machine Check 0x200 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate System Reset 0x100 interrupt
Nicholas Piggin [Wed, 21 Sep 2016 07:43:30 +0000 (17:43 +1000)]
powerpc/64s: Consolidate System Reset 0x100 interrupt

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Use gas sections for arranging exception vectors
Nicholas Piggin [Wed, 28 Sep 2016 01:31:48 +0000 (11:31 +1000)]
powerpc: Use gas sections for arranging exception vectors

Use assembler sections of fixed size and location to arrange the 64-bit
Book3S exception vector code (64-bit Book3E also uses it in head_64.S
for 0x0..0x100).

This allows better flexibility in arranging exception code and hiding
unimportant details behind macros.

Gas sections can be a bit painful to use this way, mainly because the
assembler does not know where they will be finally linked. Taking
absolute addresses requires a bit of trickery for example, but it can
be hidden behind macros for the most part.

Generated code is mostly the same except locations, offsets, alignments.

The "+ 0x2" is only required for the trap number / kvm exit number,
which gets loaded as a constant into a register.

Previously, code also used + 0x2 for label names, but we changed to
using "H" to distinguish HV case for that. Remove the last vestiges
of that.

__after_prom_start is taking absolute address of a label in another
fixed section. Newer toolchains seemed to compile this okay, but older
ones do not. FIXED_SYMBOL_ABS_ADDR is more foolproof, it just takes an
additional line to define.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64: Change the way relocation copy is calculated
Nicholas Piggin [Wed, 28 Sep 2016 01:31:47 +0000 (11:31 +1000)]
powerpc/64: Change the way relocation copy is calculated

With a subsequent patch to put text into different sections,
(_end - _stext) can no longer be computed at link time to determine
the end of the copy. Instead, calculate it at runtime with
(copy_to_here - _stext) + (_end - copy_to_here).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Consolidate exception handler alignment
Nicholas Piggin [Wed, 21 Sep 2016 07:43:28 +0000 (17:43 +1000)]
powerpc/64s: Consolidate exception handler alignment

Move exception handler alignment directives into the head-64.h macros,
beause they will no longer work in-place after the next patch. This
slightly changes functions that have alignments applied and therefore
code generation, which is why it was not done initially (see earlier
patch).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Add new exception vector macros
Michael Ellerman [Fri, 30 Sep 2016 09:43:18 +0000 (19:43 +1000)]
powerpc/64s: Add new exception vector macros

Create arch/powerpc/include/asm/head-64.h with macros that specify
an exception vector (name, type, location), which will be used to
label and lay out exceptions into the object file.

Naming is moved out of exception-64s.h, which is used to specify the
implementation of exception handlers.

objdump of generated code in exception vectors is unchanged except for
names. Alignment directives scattered around are annoying, but done
this way so that disassembly can verify identical instruction
generation before and after patch. These get cleaned up in future
patch.

We change the way KVMTEST works, explicitly passing EXC_HV or EXC_STD
rather than overloading the trap number. This removes the need to have
SOFTEN values for the overloaded trap numbers, eg. 0x502.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/vdso64: Use double word compare on pointers
Anton Blanchard [Sun, 25 Sep 2016 07:16:53 +0000 (17:16 +1000)]
powerpc/vdso64: Use double word compare on pointers

__kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to
check if the passed in pointer is non zero. cmpli maps to a 32 bit
compare on binutils, so we ignore the top 32 bits.

A simple test case can be created by passing in a bogus pointer with
the bottom 32 bits clear. Using a clk_id that is handled by the VDSO,
then one that is handled by the kernel shows the problem:

  printf("%d\n", clock_getres(CLOCK_REALTIME, (void *)0x100000000));
  printf("%d\n", clock_getres(CLOCK_BOOTTIME, (void *)0x100000000));

And we get:

  0
  -1

The bigger issue is if we pass a valid pointer with the bottom 32 bits
clear, in this case we will return success but won't write any data
to the pointer.

I stumbled across this issue because the LLVM integrated assembler
doesn't accept cmpli with 3 arguments. Fix this by converting them to
cmpldi.

Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel")
Cc: stable@vger.kernel.org # v2.6.15+
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoKVM: PPC: Book3S HV: Migrate pinned pages out of CMA
Balbir Singh [Tue, 6 Sep 2016 06:27:31 +0000 (16:27 +1000)]
KVM: PPC: Book3S HV: Migrate pinned pages out of CMA

When PCI Device pass-through is enabled via VFIO, KVM-PPC will
pin pages using get_user_pages_fast(). One of the downsides of
the pinning is that the page could be in CMA region. The CMA
region is used for other allocations like the hash page table.
Ideally we want the pinned pages to be from non CMA region.

This patch (currently only for KVM PPC with VFIO) forcefully
migrates the pages out (huge pages are omitted for the moment).
There are more efficient ways of doing this, but that might
be elaborate and might impact a larger audience beyond just
the kvm ppc implementation.

The magic is in new_iommu_non_cma_page() which allocates the
new page from a non CMA region.

I've tested the patches lightly at my end. The full solution
requires migration of THP pages in the CMA region. That work
will be done incrementally on top of this.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Merged via powerpc tree as that's where the changes are]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agodrivers/pci/hotplug: Support surprise hotplug in powernv driver
Gavin Shan [Wed, 28 Sep 2016 04:34:58 +0000 (14:34 +1000)]
drivers/pci/hotplug: Support surprise hotplug in powernv driver

This supports PCI surprise hotplug. The design is highlighted as
below:

   * The PCI slot's surprise hotplug capability is exposed through
     device node property "ibm,slot-surprise-pluggable", meaning
     PCI surprise hotplug will be disabled if skiboot doesn't support
     it yet.
   * The interrupt because of presence or link state change is raised
     on surprise hotplug event. One event is allocated and queued to
     the PCI slot for workqueue to pick it up and process in serialized
     fashion. The code flow for surprise hotplug is same to that for
     managed hotplug except: the affected PEs are put into frozen state
     to avoid unexpected EEH error reporting in surprise hot remove path.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agodrivers/pci/hotplug: Remove likely() and unlikely() in powernv driver
Gavin Shan [Wed, 28 Sep 2016 04:34:57 +0000 (14:34 +1000)]
drivers/pci/hotplug: Remove likely() and unlikely() in powernv driver

This removes likely() and unlikely() in pnv_php.c as the code isn't
running in hot path. Those macros to affect CPU's branch stream don't
help a lot for performance. I used them to identify the cases are
likely or unlikely to happen. No logical changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Unfreeze PE on allocation
Gavin Shan [Wed, 28 Sep 2016 04:34:56 +0000 (14:34 +1000)]
powerpc/powernv: Unfreeze PE on allocation

This unfreezes PE when it's initialized because the PE might be put
into frozen state in the last hot remove path. It's not harmful to
do so if the PE is already in unfrozen state.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/eeh: Export eeh_pe_state_mark()
Gavin Shan [Wed, 28 Sep 2016 04:34:55 +0000 (14:34 +1000)]
powerpc/eeh: Export eeh_pe_state_mark()

This exports eeh_pe_state_mark(). It will be used to mark the surprise
hot removed PE as isolated to avoid unexpected EEH error reporting in
surprise remove path.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/eeh: Export confirm_error_lock
Gavin Shan [Wed, 28 Sep 2016 04:34:54 +0000 (14:34 +1000)]
powerpc/eeh: Export confirm_error_lock

This exports @confirm_error_lock so that eeh_serialize_{lock, unlock}()
can be used to freeze the affected PE in PCI surprise hot remove path.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/eeh: Allow to freeze PE in eeh_pe_set_option()
Gavin Shan [Wed, 28 Sep 2016 04:34:53 +0000 (14:34 +1000)]
powerpc/eeh: Allow to freeze PE in eeh_pe_set_option()

Function eeh_pe_set_option() is used to apply the requested options
(enable, disable, unfreeze) in EEH virtualization path. The semantics
of this function isn't complete until freezing is supported.

This allows to freeze the indicated PE. The new semantics is going to
be used in PCI surprise hot remove path, to freeze removed PCI devices
(PE) to avoid unexpected EEH error reporting.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Call opal_pci_poll() if needed
Gavin Shan [Fri, 24 Jun 2016 06:44:19 +0000 (16:44 +1000)]
powerpc/powernv: Call opal_pci_poll() if needed

When issuing PHB reset, OPAL API opal_pci_poll() is called to drive
the state machine in OPAL forward. However, we needn't always call
the function under some circumstances like reset deassert.

This avoids calling opal_pci_poll() when OPAL_SUCCESS is returned
from opal_pci_reset(). Except the overhead introduced by additional
one unnecessary OPAL call, I didn't run into real issue because of
this.

Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/boot: Add support for XZ compression
Oliver O'Halloran [Thu, 22 Sep 2016 06:54:34 +0000 (16:54 +1000)]
powerpc/boot: Add support for XZ compression

This patch adds an option to use XZ compression for the kernel image.

Currently this is only enabled for 64-bit Book3S targets, which is
roughly equivalent to the platforms that use the kernel's zImage
wrapper, and that have been tested.

The bulk of the 32-bit platforms and 64-bit BookE use uboot images,
which relies on uboot implementing XZ. In future we can enable XZ
support for those targets once someone has tested it.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/boot: Add XZ support to the wrapper script
Oliver O'Halloran [Thu, 22 Sep 2016 06:54:33 +0000 (16:54 +1000)]
powerpc/boot: Add XZ support to the wrapper script

This modifies the wrapper script so that the -Z option takes an argument
to specify the compression type. It can either be 'gz', 'xz' or 'none'.

The legazy --no-gzip and -z options are still supported and will set the
compression to none and gzip respectively, but they are not documented.

Only XZ -6 is used for compression rather than XZ -9. Using compression
levels higher than 6 requires the decompressor to build a large (64MB)
dictionary when decompressing and some environments cannot satisfy such
large allocations (e.g. POWER 6 LPAR partition firmware).

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/boot: Remove the legacy gzip wrapper
Oliver O'Halloran [Thu, 22 Sep 2016 06:54:32 +0000 (16:54 +1000)]
powerpc/boot: Remove the legacy gzip wrapper

This code is no longer used and can be removed.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/boot: Use the pre-boot decompression API
Oliver O'Halloran [Thu, 22 Sep 2016 06:54:31 +0000 (16:54 +1000)]
powerpc/boot: Use the pre-boot decompression API

Currently the powerpc boot wrapper has its own wrapper around zlib to
handle decompressing gzipped kernels. The kernel decompressor library
functions now provide a generic interface that can be used in the
pre-boot environment. This allows boot wrappers to easily support
different compression algorithms. This patch converts the wrapper to use
this new API, but does not add support for using new algorithms.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/boot: Use CONFIG_KERNEL_GZIP
Oliver O'Halloran [Thu, 22 Sep 2016 06:54:30 +0000 (16:54 +1000)]
powerpc/boot: Use CONFIG_KERNEL_GZIP

Most architectures allow the compression algorithm used to produced the
vmlinuz image to be selected as a kernel config option. In preperation
for supporting algorithms other than gzip in the powerpc boot wrapper
the makefile needs to be modified to use these config options.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/boot: Add sed script
Oliver O'Halloran [Thu, 22 Sep 2016 06:54:29 +0000 (16:54 +1000)]
powerpc/boot: Add sed script

The powerpc boot wrapper is potentially compiled with a separate
toolchain and/or toolchain flags than the rest of the kernel. The usual
case is a 64-bit big endian kernel builds a 32-bit big endian wrapper.

The main problem with this is that the wrapper does not have access to
the kernel headers (without a lot of gross hacks). To get around this
the required headers are copied into the build directory via several sed
scripts which rewrite problematic includes. This patch moves these
fixups out of the makefile into a separate .sed script file to clean up
makefile slightly.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Reword first paragraph of change log a little]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Compile selftests against headers without AT_HWCAP2
Cyril Bur [Fri, 23 Sep 2016 06:18:07 +0000 (16:18 +1000)]
selftests/powerpc: Compile selftests against headers without AT_HWCAP2

It might be nice to compile selftests against older kernels and
headers but which may not have HWCAP2.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>