fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y
author Andy Lutomirski <luto@kernel.org>
Fri, 16 Sep 2016 05:45:49 +0000 (22:45 -0700)
committer Ingo Molnar <mingo@kernel.org>
Fri, 16 Sep 2016 07:18:54 +0000 (09:18 +0200)
commit ac496bf48d97f2503eaa353996a4dd5e4383eaf0
tree d57eb8f649cc8268a3d3595f04bbe4bab72264ef
parent 68f24b08ee892d47bdef925d676e1ae1ccc316f8

vmalloc() is a bit slow, and pounding vmalloc()/vfree() will eventually
force a global TLB flush.

To reduce pressure on them, if CONFIG_VMAP_STACK=y, cache two thread
stacks per CPU.  This will let us quickly allocate a hopefully
cache-hot, TLB-hot stack under heavy forking workloads (shell script style).
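
The caching idea can be modelled in user space roughly as follows.  This
is only a sketch, not the code from the patch: the per-CPU storage is
simulated with a plain array indexed by a CPU number, malloc()/free()
stand in for the vmalloc()/vfree() slow path, and the names
NR_CPUS_SKETCH, STACK_SIZE_SKETCH, alloc_stack(), free_stack() and
slow_allocs are hypothetical.  Real kernel code would need proper
per-CPU accessors and care around preemption, which this model ignores.

```c
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH    4            /* hypothetical CPU count for the model */
#define NR_CACHED_STACKS  2            /* two cached stacks per CPU, as described */
#define STACK_SIZE_SKETCH (16 * 1024)  /* hypothetical stack size in bytes */

/* One small array of cached stacks per modelled CPU; NULL means "slot empty". */
static void *cached_stacks[NR_CPUS_SKETCH][NR_CACHED_STACKS];

/* Counts slow-path allocations so the effect of the cache can be observed. */
static unsigned long slow_allocs;

/* Take a stack from this CPU's cache if one is available, else allocate one. */
static void *alloc_stack(int cpu)
{
	for (int i = 0; i < NR_CACHED_STACKS; i++) {
		void *stack = cached_stacks[cpu][i];

		if (!stack)
			continue;
		cached_stacks[cpu][i] = NULL;  /* slot is now empty */
		return stack;                  /* fast path: recently used stack */
	}

	slow_allocs++;
	return malloc(STACK_SIZE_SKETCH);      /* stand-in for the vmalloc() path */
}

/* Park a stack in an empty slot of this CPU's cache, else really free it. */
static void free_stack(int cpu, void *stack)
{
	for (int i = 0; i < NR_CACHED_STACKS; i++) {
		if (cached_stacks[cpu][i])
			continue;
		cached_stacks[cpu][i] = stack; /* keep it for the next fork */
		return;
	}

	free(stack);                           /* stand-in for the vfree() path */
}
```

In this model, a fork-heavy loop that frees a stack and allocates one
again on the same CPU hits the cache and never reaches the slow path; a
third stack freed while both slots are full is released immediately.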

On my silly pthread_create() benchmark, it saves about 2 µs per
pthread_create()+join() with CONFIG_VMAP_STACK=y.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/94811d8e3994b2e962f88866290017d498eb069c.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
kernel/fork.c