ocfs2/dlm: fix a race between purge and migration
authorXue jiufei <xuejiufei@huawei.com>
Thu, 14 Jan 2016 23:17:18 +0000 (15:17 -0800)
committerLinus Torvalds <torvalds@linux-foundation.org>
Fri, 15 Jan 2016 00:00:49 +0000 (16:00 -0800)
commit30bee898f86506893883ffb8db20d8101a29b5f5
treed1998a65a4373824d785773e6a9c2e7b504b01b5
parenta84ac334dcb44c76f0b051513a6c27a2d747f883
ocfs2/dlm: fix a race between purge and migration

We found a race between purge and migration when doing code review.
Node A put lockres to purgelist before receiving the migrate message
from node B which is the master.  Node A call dlm_mig_lockres_handler to
handle this message.

dlm_mig_lockres_handler
  dlm_lookup_lockres
  >>>>>> race window, dlm_run_purge_list may run and send
         deref message to master, waiting the response
  spin_lock(&res->spinlock);
  res->state |= DLM_LOCK_RES_MIGRATING;
  spin_unlock(&res->spinlock);
  dlm_mig_lockres_handler returns

  >>>>>> dlm_thread receives the response from master for the deref
  message and triggers the BUG because the lockres has the state
  DLM_LOCK_RES_MIGRATING with the following message:

dlm_purge_lockres:209 ERROR: 6633EB681FA7474A9C280A4E1A836F0F: res
M0000000000000000030c0300000000 in use after deref

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/ocfs2/dlm/dlmrecovery.c