crush: add chooseleaf_vary_r tunable
authorIlya Dryomov <ilya.dryomov@inktank.com>
Wed, 19 Mar 2014 14:58:37 +0000 (16:58 +0200)
committerSage Weil <sage@inktank.com>
Sat, 5 Apr 2014 04:07:26 +0000 (21:07 -0700)
commite2b149cc4ba00766aceb87950c6de72ea7fc8b2e
tree5f3d7b5dd55b7f75c412db786e1e6f4915ef9ed8
parent6ed1002f368c63ef79d7f659fcb4368a90098132
crush: add chooseleaf_vary_r tunable

The current crush_choose_firstn code will re-use the same 'r' value for
the recursive call.  That means that if we are hitting a collision or
rejection for some reason (say, an OSD that is marked out) and need to
retry, we will keep making the same (bad) choice in that recursive
selection.

Introduce a tunable that fixes that behavior by incorporating the parent
'r' value into the recursive starting point, so that a different path
will be taken in subsequent placement attempts.

Note that this was done from the get-go for the new crush_choose_indep
algorithm.

This was exposed by a user who was seeing PGs stuck in active+remapped
after reweight-by-utilization because the up set mapped to a single OSD.

Reflects ceph.git commit a8e6c9fbf88bad056dd05d3eb790e98a5e43451a.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
include/linux/crush/crush.h
net/ceph/crush/mapper.c