datapath: list: Fix double fetch of pointer in hlist_entry_safe()
authorPravin B Shelar <pshelar@nicira.com>
Fri, 26 Jul 2013 20:52:24 +0000 (13:52 -0700)
committerPravin B Shelar <pshelar@nicira.com>
Fri, 26 Jul 2013 15:24:23 +0000 (08:24 -0700)
Following patch backports commit f65846a1800ef8c48d (list: Fix double
fetch of pointer in hlist_entry_safe()) from upstream kernel.
This patch fixes following panic. Thanks to Jesse for helping to
debug this issue.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000118
[129608.216422] IP: [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch]
[129608.216918] PGD 11500a067 PUD 120851067 PMD 0
[129608.216994] Oops: 0000 [#1] SMP
[129608.217049] CPU 0
[129608.218697]
[129608.218726] Pid: 0, comm: swapper/0 Tainted: G           O
3.2.39-server-nn21 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform
[129608.219288] RIP: 0010:[<ffffffffa02436da>]  [<ffffffffa02436da>]
ovs_masked_flow_lookup+0xda/0x140 [openvswitch]
[129608.219454] RSP: 0018:ffff88013fc03b60  EFLAGS: 00010282
[129608.219536] RAX: 0000000000000020 RBX: ffff880123087100 RCX:
ffff88012098e000
[129608.219719] RDX: ffff8800b3b0ca30 RSI: 000000000000010a RDI:
ffff88011df8c000
[129608.220121] RBP: ffff88013fc03c30 R08: 0000000000000001 R09:
0000000020069825
[129608.220287] R10: 0000000000000020 R11: 0000000000000001 R12:
ffff880036e1c6c0
[129608.220451] R13: ffff88013fc03b98 R14: 0000000000000024 R15:
ffffffffffffffe0
[129608.220618] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000)
knlGS:0000000000000000
[129608.220794] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[129608.220911] CR2: 0000000000000118 CR3: 00000001190c9000 CR4:
00000000000406f0
[129608.221122] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[129608.221320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[129608.221488] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000,
task ffffffff81c0d020)
[129608.221669] Stack:
[129608.221725]  0000000000000044 0000000000000020 ffff88013fc03c60
0000000000000000
[129608.221906]  0000000000000000 0000000000000000 0000000000000000
f014232200000002
[129608.222069]  1973f0142322015a 0000000600080000 1973f0140579f414
000000002f1dc7ec
[129608.222211] Call Trace:
[129608.222264]  <IRQ>
[129608.222316]  [<ffffffffa02445bd>] ovs_flow_lookup+0x5d/0x70
[openvswitch]
[129608.222411]  [<ffffffffa0242550>] ovs_dp_process_received_packet+0x70/0x110 [openvswitch]
[129608.222541]  [<ffffffff8104d6ec>] ? resched_task+0x2c/0x80
[129608.222644]  [<ffffffffa0249b20>] ? netdev_create+0x120/0x120
[openvswitch]
[129608.222743]  [<ffffffffa02483f8>] ovs_vport_receive+0x38/0x40
[openvswitch]
[129608.222838]  [<ffffffffa0249bc3>] netdev_frame_hook+0xa3/0xf0
[openvswitch]
[129608.222933]  [<ffffffffa0249b20>] ? netdev_create+0x120/0x120
[openvswitch]
[129608.223029]  [<ffffffff81539318>] __netif_receive_skb+0x1c8/0x620
[129608.223114]  [<ffffffff81325740>] ? map_single+0x60/0x60
[129608.223192]  [<ffffffff81539b71>] process_backlog+0xb1/0x190
[129608.223274]  [<ffffffff8153ae64>] net_rx_action+0x134/0x290
[129608.223355]  [<ffffffff8106e1a8>] __do_softirq+0xa8/0x210
[129608.223433]  [<ffffffff816458de>] ? _raw_spin_lock+0xe/0x20
[129608.223513]  [<ffffffff8164fcac>] call_softirq+0x1c/0x30
[129608.223590]  [<ffffffff81016215>] do_softirq+0x65/0xa0
[129608.223665]  [<ffffffff8106e58e>] irq_exit+0x8e/0xb0
[129608.223738]  [<ffffffff81650573>] do_IRQ+0x63/0xe0
[129608.223808]  [<ffffffff81645d6e>] common_interrupt+0x6e/0x6e
[129608.223887]  <EOI>
[129608.223933]  [<ffffffff8103ced6>] ? native_safe_halt+0x6/0x10
[129608.224014]  [<ffffffff8101c6a3>] default_idle+0x53/0x1d0
[129608.224092]  [<ffffffff81013236>] cpu_idle+0xd6/0x120
[129608.224167]  [<ffffffff8161785e>] rest_init+0x72/0x74
[129608.224252]  [<ffffffff81cfcbaa>] start_kernel+0x3b5/0x3c2
[129608.224331]  [<ffffffff81cfc347>]
x86_64_start_reservations+0x132/0x136
[129608.224421]  [<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140
[129608.224506]  [<ffffffff81cfc44d>] x86_64_start_kernel+0x102/0x111
[129608.224589] Code: 25 48 63 53 28 48 8d 42 01 48 c1 e0 04 49 01 c7 49
8b 07 48 85 c0 74 61 4d 8b 3f 48 c1 e2 04 48 83 c2 10 49 29 d7 4d 85 ff
74 26 <4d> 39 a7 38 01 00 00 75 cd 48 8b 95 38 ff ff ff 4c 89 ee 49 8d
[129608.224949] RIP  [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch]

Original commit msg:

list: Fix double fetch of pointer in hlist_entry_safe()

The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure.  This is OK for normal lock-based use because in
that case, the pointer cannot change.  However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time.  This use case can result in the following sequence
of events:

1.  CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
    pointer as sees that it is non-NULL.

2.  CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
    just fetched a pointer to.  Because this is the last entry
    in the list, the pointer fetched by CPU 0 is now NULL.

3.  CPU 0 refetches the pointer, obtains NULL, and then gets a
    NULL-pointer crash.

This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events.  Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.

Many thanks to Eric Dumazet for spotting root cause!

Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Li Zefan <lizefan@huawei.com>
Bug #17099

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
datapath/linux/compat/include/linux/list.h

index 4446779..18cce8a 100644 (file)
@@ -5,7 +5,9 @@
 
 #ifndef hlist_entry_safe
 #define hlist_entry_safe(ptr, type, member) \
-       (ptr) ? hlist_entry(ptr, type, member) : NULL
+       ({ typeof(ptr) ____ptr = (ptr); \
+        ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
+        })
 
 #undef hlist_for_each_entry
 #define hlist_for_each_entry(pos, head, member)                                \