ocfs2: dlm_request_all_locks() should deal with the status sent from target node
author Xue jiufei <xuejiufei@huawei.com>
Wed, 11 Sep 2013 21:19:46 +0000 (14:19 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Wed, 11 Sep 2013 22:56:31 +0000 (15:56 -0700)
dlm_request_all_locks() should check the status returned by the target
node when DLM_LOCK_REQUEST_MSG is sent successfully.  Otherwise the
recovery master falls into an endless loop, waiting for the other nodes
to send their locks and DLM_RECO_DATA_DONE_MSG.

        NodeA                                  NodeB
                                     selected as recovery master
                                     dlm_remaster_locks()
                                     ->dlm_request_all_locks()
                                     send DLM_LOCK_REQUEST_MSG to NodeA

Suppose NodeA cannot allocate memory when it processes this message.
dlm_request_all_locks_handler() does not queue
dlm_request_all_locks_worker and returns -ENOMEM.  NodeA will then never
send its locks or DLM_RECO_DATA_DONE_MSG to NodeB.

                                    NodeB does not check the status
                                    sent from NodeA, and falls into an
                                    endless loop waiting for the
                                    recovery state of NodeA to
                                    change.

Signed-off-by: joyce <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Jeff Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/ocfs2/dlm/dlmrecovery.c

index 773bd32..f945502 100644
@@ -787,6 +787,7 @@ static int dlm_request_all_locks(struct dlm_ctxt *dlm, u8 request_from,
 {
        struct dlm_lock_request lr;
        int ret;
+       int status;
 
        mlog(0, "\n");
 
@@ -800,13 +801,15 @@ static int dlm_request_all_locks(struct dlm_ctxt *dlm, u8 request_from,
 
        // send message
        ret = o2net_send_message(DLM_LOCK_REQUEST_MSG, dlm->key,
-                                &lr, sizeof(lr), request_from, NULL);
+                                &lr, sizeof(lr), request_from, &status);
 
        /* negative status is handled by caller */
        if (ret < 0)
                mlog(ML_ERROR, "%s: Error %d send LOCK_REQUEST to node %u "
                     "to recover dead node %u\n", dlm->name, ret,
                     request_from, dead_node);
+       else
+               ret = status;
        // return from here, then
        // sleep until all received or error
        return ret;