net: sched: do not acquire qdisc spinlock in qdisc/class stats dump
authorEric Dumazet <edumazet@google.com>
Mon, 6 Jun 2016 16:37:16 +0000 (09:37 -0700)
committerDavid S. Miller <davem@davemloft.net>
Tue, 7 Jun 2016 23:37:14 +0000 (16:37 -0700)
commitedb09eb17ed89eaa82a52dd306beac93e292b485
tree1f241506d6b781b65d1033925f1c1ce6a39c3394
parentf9eb8aea2a1e12fc2f584d1627deeb957435a801
net: sched: do not acquire qdisc spinlock in qdisc/class stats dump

Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
agent [1] are problematic at scale :

For each qdisc/class found in the dump, we currently lock the root qdisc
spinlock in order to get stats. Sampling stats every 5 seconds from
thousands of HTB classes is a challenge when the root qdisc spinlock is
under high pressure. Not only the dumps take time, they also slow
down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.

An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
that might need the qdisc lock in fq_codel_dump_stats() and
fq_codel_dump_class_stats()

In v2 of this patch, I now use the Qdisc running seqcount to provide
consistent reads of packets/bytes counters, regardless of 32/64 bit arches.

I also changed rate estimators to use the same infrastructure
so that they no longer need to lock root qdisc lock.

[1]
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kevin Athey <kda@google.com>
Cc: Xiaotian Pei <xiaotian@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 files changed:
Documentation/networking/gen_stats.txt
include/net/gen_stats.h
include/net/sch_generic.h
net/core/gen_estimator.c
net/core/gen_stats.c
net/netfilter/xt_RATEEST.c
net/sched/act_api.c
net/sched/act_police.c
net/sched/sch_api.c
net/sched/sch_atm.c
net/sched/sch_cbq.c
net/sched/sch_drr.c
net/sched/sch_fq_codel.c
net/sched/sch_hfsc.c
net/sched/sch_htb.c
net/sched/sch_mq.c
net/sched/sch_mqprio.c
net/sched/sch_multiq.c
net/sched/sch_prio.c
net/sched/sch_qfq.c