connmgr: Fix attempt to take mutex recursively when exiting fail-open.
authorBen Pfaff <blp@nicira.com>
Mon, 16 Dec 2013 18:58:29 +0000 (10:58 -0800)
committerBen Pfaff <blp@nicira.com>
Mon, 16 Dec 2013 18:59:38 +0000 (10:59 -0800)
If one configured a controller which does not exist, waited for the switch
to enter fail-open mode, and then deleted the controller, then
ofproto_set_controllers() would take ofproto_mutex and call
update_fail_open(), which would call fail_open_destroy(), which would call
fail_open_recover(), which would call ofproto_delete_flow(), which requires
ofproto_mutex not to be held since it eventually try to take it.  This
caused OVS to abort.

This fixes the problem by releasing ofproto_mutex earlier, since nothing
seems to require it being held so long (a comment in
connmgr_set_controllers() says that this is likely to be the case).

Better annotations would have found this problem at compile time.  A later
patch adds them.

Reported-by: ZhengLingyun <konghuarukhr@163.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
ofproto/connmgr.c

index 84c3a7b..48ee14c 100644 (file)
@@ -637,12 +637,13 @@ connmgr_set_controllers(struct connmgr *mgr,
 
     shash_destroy(&new_controllers);
 
+    ovs_mutex_unlock(&ofproto_mutex);
+
     update_in_band_remotes(mgr);
     update_fail_open(mgr);
     if (had_controllers != connmgr_has_controllers(mgr)) {
         ofproto_flush_flows(mgr->ofproto);
     }
-    ovs_mutex_unlock(&ofproto_mutex);
 }
 
 /* Drops the connections between 'mgr' and all of its primary and secondary