powerpc: Speed up clear_page by unrolling it
author Anton Blanchard <anton@samba.org>
Thu, 2 Oct 2014 05:44:21 +0000 (15:44 +1000)
committer Michael Ellerman <mpe@ellerman.id.au>
Thu, 2 Oct 2014 06:04:21 +0000 (16:04 +1000)
commit e35735b9a5d8d38d9ffe2f1f0cdcbb0d45c42eff
tree 643f44507bdd55a621f7b772d877054fa7869f6e
parent 2013add4ce73c93ae2148969a9ec3ecc8b1e26fa

Unroll clear_page 8 times. A simple microbenchmark which
allocates and frees a zeroed page:

	for (i = 0; i < iterations; i++) {
		unsigned long p = __get_free_page(GFP_KERNEL | __GFP_ZERO);
		free_page(p);
	}

improves 20% on POWER8.

This assumes cacheline sizes won't grow beyond 512 bytes or
page sizes won't drop below 1kB, which is unlikely, but we could
add a runtime check during early init if it makes people nervous.
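
The runtime check mentioned above could look roughly like the
following. This is a hypothetical sketch, not code from the patch:
the function name and the UNROLL macro are made up for illustration,
and it reads the assumption as "a page must hold a whole number of
cache lines, and that number must be a nonzero multiple of 8",
since an 8x unrolled loop would otherwise skip lines or run past
the end of the page.

```c
#include <stdbool.h>
#include <stddef.h>

#define UNROLL 8	/* unroll factor named in the commit message */

/*
 * Hypothetical early-init check: return true if a loop that clears
 * UNROLL cache lines per iteration covers the page exactly.
 */
static bool clear_page_unroll_ok(size_t page_size, size_t line_size)
{
	size_t lines;

	/* Reject a zero line size and pages that are not whole lines. */
	if (line_size == 0 || page_size % line_size != 0)
		return false;

	lines = page_size / line_size;

	/* Need at least one full iteration and no leftover lines. */
	return lines >= UNROLL && lines % UNROLL == 0;
}
```

With 4kB pages this accepts cache lines up to 512 bytes and rejects
1024 bytes; with 128 byte lines it accepts 1kB pages and rejects
512 byte pages, which matches the limits quoted above.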

Michael found that some versions of gcc produce quite bad code
(all multiplies), so we give gcc a hand by using shifts and adds.
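
To illustrate the shifts-and-adds idea, here is a portable sketch
of an 8x unrolled clear loop. It is not the code from the patch:
clear_page_sketch() and zero_line() are hypothetical names, memset()
stands in for the cache-line-zeroing instruction, and the page and
line sizes are passed as parameters rather than read from the
kernel's cache info. The eight offsets are derived from one line
size using only shifts, adds and one subtract.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a cache-line-zeroing instruction: zero one line at p. */
static void zero_line(unsigned char *p, size_t line_size)
{
	memset(p, 0, line_size);
}

/*
 * Clear a page eight cache lines per loop iteration.
 * Assumes the number of lines per page is a nonzero multiple of 8.
 */
static void clear_page_sketch(void *addr, size_t page_size, size_t line_size)
{
	unsigned char *p = addr;
	size_t lines = page_size / line_size;
	size_t iterations = lines / 8;

	/* Offsets built without multiplies. */
	size_t onex = line_size;
	size_t twox = onex << 1;
	size_t fourx = onex << 2;
	size_t eightx = onex << 3;

	assert(iterations > 0 && lines % 8 == 0);

	while (iterations--) {
		zero_line(p, line_size);			/* 0x */
		zero_line(p + onex, line_size);			/* 1x */
		zero_line(p + twox, line_size);			/* 2x */
		zero_line(p + twox + onex, line_size);		/* 3x */
		zero_line(p + fourx, line_size);		/* 4x */
		zero_line(p + fourx + onex, line_size);		/* 5x */
		zero_line(p + fourx + twox, line_size);		/* 6x */
		zero_line(p + eightx - onex, line_size);	/* 7x */
		p += eightx;	/* advance past the 8 lines just cleared */
	}
}
```

In plain C like this a compiler will usually pick good code on its
own; spelling the offsets out matters more when they are handed to
inline asm as separate operands.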

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
arch/powerpc/include/asm/page_64.h