From: David S. Miller
Date: Wed, 31 Dec 2014 23:26:02 +0000 (-0500)
Subject: Merge branch 'fib_trie-next'
X-Git-Tag: v4.0-rc1~133^2~312
X-Git-Url: http://git.cascardo.info/?p=cascardo%2Flinux.git;a=commitdiff_plain;h=e495f78d787b56ad249946b191406f4521b58150

Merge branch 'fib_trie-next'

Alexander Duyck says:

====================
fib_trie: Reduce time spent in fib_table_lookup by 35 to 75%

These patches are meant to address several performance issues I have seen
in the fib_trie implementation, and fib_table_lookup specifically.  With
these changes in place I have seen a reduction of 35 to 75% in the total
time spent in fib_table_lookup, depending on the type of search being
performed.

On a VM running on my Core i7-4930K system with a trie of maximum depth 7,
this resulted in a reduction of over 370ns per packet in the total time to
process packets received from an ixgbe interface and route them to a dummy
interface.  This represents a failed lookup in the local trie followed by
a successful search in the main trie.

                              Baseline         Refactor
ixgbe->dummy routing          1.20Mpps         2.21Mpps
------------------------------------------------------------
processing time per packet       835ns            453ns
fib_table_lookup          50.1%  418ns     25.0%  113ns
check_leaf.isra.9          7.9%   66ns        --     --
ixgbe_clean_rx_irq         5.3%   44ns      9.8%   44ns
ip_route_input_noref       2.9%   25ns      4.6%   21ns
pvclock_clocksource_read   2.6%   21ns      4.6%   21ns
ip_rcv                     2.6%   22ns      4.0%   18ns

In the simple case of receiving a frame and dropping it before it can reach
the socket layer I saw a reduction of 40ns per packet.  This represents a
trip through the local trie with the correct leaf found with no need for
any backtracing.

                              Baseline         Refactor
ixgbe->local receive          2.65Mpps         2.96Mpps
------------------------------------------------------------
processing time per packet       377ns            337ns
fib_table_lookup          25.1%   95ns     25.8%   87ns
ixgbe_clean_rx_irq         8.7%   33ns      9.0%   30ns
check_leaf.isra.9          7.2%   27ns        --     --
ip_rcv                     5.7%   21ns      6.5%   22ns

These changes have resulted in several functions being inlined such as
check_leaf and fib_find_node, but due to the code simplification the
overall size of the code has been reduced.

   text    data     bss     dec     hex filename
  16932     376      16   17324    43ac net/ipv4/fib_trie.o - before
  15259     376       8   15643    3d1b net/ipv4/fib_trie.o - after

Changes since RFC:
 Replaced this_cpu_ptr with correct call to this_cpu_inc in patch 1
 Changed test for leaf_info mismatch to (key ^ n->key) & li->mask_plen in patch 10
====================

Signed-off-by: David S. Miller
---

e495f78d787b56ad249946b191406f4521b58150
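
Note on the leaf_info mismatch test listed under "Changes since RFC": the
expression (key ^ n->key) & li->mask_plen can be read as "do the lookup key
and the leaf key differ in any bit covered by the prefix mask?". The
following is a minimal user-space sketch of that idea, not the kernel code.
It assumes that mask_plen holds a host-byte-order netmask for the prefix
length; the helper names prefix_mask and prefix_mismatch are hypothetical
and do not appear in the patches.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a host-byte-order netmask for a prefix
 * length in the range 0..32 (e.g. 24 -> 0xFFFFFF00).  A length of 0 is
 * handled separately because shifting a 32-bit value by 32 is undefined.
 */
static inline uint32_t prefix_mask(unsigned int plen)
{
	assert(plen <= 32);
	return plen ? (uint32_t)(~(uint32_t)0 << (32 - plen)) : 0;
}

/*
 * Sketch of the mismatch test: XOR leaves a 1 in every bit position
 * where the two keys differ, and the AND keeps only the positions that
 * belong to the prefix.  Returns 1 when the keys differ inside the
 * prefix, 0 when the prefix bits match.
 */
static inline int prefix_mismatch(uint32_t key, uint32_t leaf_key,
				  uint32_t mask)
{
	return ((key ^ leaf_key) & mask) != 0;
}
```

With a leaf key of 0xC0A80100 (192.168.1.0) and prefix_mask(24), the key
0xC0A8012A (192.168.1.42) gives 0 because only host bits differ, while
0xC0A8022A (192.168.2.42) gives 1 because a prefix bit differs.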