cascardo/linux.git
13 years agonetdevice.h: Align struct net_device members
Joe Perches [Mon, 9 May 2011 17:42:46 +0000 (17:42 +0000)]
netdevice.h: Align struct net_device members

Save a bit of space.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Fix 'iph' use before set.
David S. Miller [Fri, 13 May 2011 03:03:46 +0000 (23:03 -0400)]
ipv4: Fix 'iph' use before set.

I swear none of my compilers warned about this, yet it is so
obvious.

> net/ipv4/ip_forward.c: In function 'ip_forward':
> net/ipv4/ip_forward.c:87: warning: 'iph' may be used uninitialized in this function

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-next-2.6
David S. Miller [Fri, 13 May 2011 03:01:55 +0000 (23:01 -0400)]
Merge branch 'master' of /linux/kernel/git/davem/net-next-2.6

13 years agoipv4: Elide use of rt->rt_dst in ip_forward()
David S. Miller [Thu, 12 May 2011 23:34:30 +0000 (19:34 -0400)]
ipv4: Elide use of rt->rt_dst in ip_forward()

No matter what kind of header mangling occurs due to IP options
processing, rt->rt_dst will always equal iph->daddr in the packet.

So we can safely use iph->daddr instead of rt->rt_dst here.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Simplify iph->daddr overwrite in ip_options_rcv_srr().
David S. Miller [Thu, 12 May 2011 23:30:58 +0000 (19:30 -0400)]
ipv4: Simplify iph->daddr overwrite in ip_options_rcv_srr().

We already copy the 4-byte nexthop from the options block into
local variable "nexthop" for the route lookup.

Re-use that variable instead of memcpy()'ing again when assigning
to iph->daddr after the route lookup succeeds.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Kill spurious opt->srr check in ip_options_rcv_srr().
David S. Miller [Thu, 12 May 2011 23:26:57 +0000 (19:26 -0400)]
ipv4: Kill spurious opt->srr check in ip_options_rcv_srr().

All call sites conditionalize the call to ip_options_rcv_srr()
with a check of opt->srr, so no need to check it again there.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobonding: convert to ndo_fix_features
Michał Mirosław [Sat, 7 May 2011 03:22:17 +0000 (03:22 +0000)]
bonding: convert to ndo_fix_features

This should also fix updating of vlan_features and propagating changes to
VLAN devices on the bond.

Side effect: it allows user to force-disable some offloads on the bond
interface.

Note: NETIF_F_VLAN_CHALLENGED is managed by bond_fix_features() now.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: introduce netdev_change_features()
Michał Mirosław [Sat, 7 May 2011 03:22:17 +0000 (03:22 +0000)]
net: introduce netdev_change_features()

It will be needed by bonding and other drivers changing vlan_features
after ndo_init callback.

As a bonus, this includes kernel-doc for netdev_update_features().

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoCDC NCM: Add mising short packet in cdc_ncm driver
Alexey Orishko [Fri, 6 May 2011 03:01:30 +0000 (03:01 +0000)]
CDC NCM: Add mising short packet in cdc_ncm driver

Changes:
- while making NTB, driver shall check if device dwNtbOutMaxSize is higher than
 host value and shall add a short packet if this is the case
- previous temporary patch for this issue is replaced by this one

Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipvs: Remove all remaining references to rt->rt_{src,dst}
Julian Anastasov [Tue, 10 May 2011 12:46:05 +0000 (12:46 +0000)]
ipvs: Remove all remaining references to rt->rt_{src,dst}

Remove all remaining references to rt->rt_{src,dst}
by using dest->dst_saddr to cache saddr (used for TUN mode).
For ICMP in FORWARD hook just restrict the rt_mode for NAT
to disable LOCALNODE. All other modes do not allow
IP_VS_RT_MODE_RDR, so we should be safe with the ICMP
forwarding. Using cp->daddr as replacement for rt_dst
is safe for all modes except BYPASS, even when cp->dest is
NULL because it is cp->daddr that is used to assign cp->dest
for sync-ed connections.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipvs: Eliminate rt->rt_dst usage in __ip_vs_get_out_rt().
David S. Miller [Mon, 9 May 2011 21:38:06 +0000 (14:38 -0700)]
ipvs: Eliminate rt->rt_dst usage in __ip_vs_get_out_rt().

We can simply track what destination address is used based upon which
code block is taken at the top of the function.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipvs: Use IP_VS_RT_MODE_* instead of magic constants.
David S. Miller [Thu, 12 May 2011 22:22:34 +0000 (18:22 -0400)]
ipvs: Use IP_VS_RT_MODE_* instead of magic constants.

[ Add some cases I missed, from Julian Anastasov ]

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotg3: Allow ethtool to enable/disable loopback.
Mahesh Bandewar [Sun, 8 May 2011 06:51:48 +0000 (06:51 +0000)]
tg3: Allow ethtool to enable/disable loopback.

This patch adds tg3_set_features() to handle loopback mode. Currently the
capability is added for the devices which support internal MAC loopback mode.
So when enabled, it enables internal-MAC loopback.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/irda/ircomm_tty.c: Use flip buffers to deliver data
Amit Virdi [Thu, 12 May 2011 01:04:40 +0000 (01:04 +0000)]
net/irda/ircomm_tty.c: Use flip buffers to deliver data

use tty_insert_flip_string and tty_flip_buffer_push to deliver incoming data
packets from the IrDA device instead of delivering the packets directly to the
line discipline. Following later approach resulted in warning "Sleeping function
called from invalid context".

Signed-off-by: Amit Virdi <amit.virdi@st.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: group FCoE related feature flags
Yi Zou [Mon, 9 May 2011 11:53:27 +0000 (11:53 +0000)]
net: group FCoE related feature flags

Michał Mirosław's patch (http://patchwork.ozlabs.org/patch/94421/) fixes the
issue (http://patchwork.ozlabs.org/patch/94188/) about not populating FCoE related
flags correctly on vlan devices. However, only NETIF_F_FCOE_CRC is part of the
NETIF_F_ALL_TX_OFFLOADS right now, where weed NETIF_F_FCOE_MTU and NETIF_F_FSO
as well.

Therefore, add NETIF_F_ALL_FCOE to indicate feature flags used by FCoE TX offloads.
These include NETIF_F_FCOE_CRC, NETIF_F_FCOE_MTU, and NETIF_F_FSO and add them to
be part of NETIF_F_ALL_TX_OFFLOADS. This would eventually make sure all FCoE needed
flags are populated properly to vlan devices.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: Fix vlan_features propagation
Michał Mirosław [Fri, 6 May 2011 07:56:29 +0000 (07:56 +0000)]
net: Fix vlan_features propagation

Fix VLAN features propagation for devices which change vlan_features.
For this to work, driver needs to make sure netdev_features_changed()
gets called after the change (it is e.g. after ndo_set_features()).

Side effect is that a user might request features that will never
be enabled on a VLAN device.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoethtool: bring back missing comma in netdev features strings
Franco Fichtner [Thu, 12 May 2011 06:42:04 +0000 (06:42 +0000)]
ethtool: bring back missing comma in netdev features strings

The issue was introduced in commit eed2a12f1ed9aabf.

Signed-off-by: Franco Fichtner <franco@lastsummer.de>
Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agogarp: remove last synchronize_rcu() call
Eric Dumazet [Thu, 12 May 2011 21:46:56 +0000 (17:46 -0400)]
garp: remove last synchronize_rcu() call

When removing last vlan from a device, garp_uninit_applicant() calls
synchronize_rcu() to make sure no user can still manipulate struct
garp_applicant before we free it.

Use call_rcu() instead, as a step to further net_device dismantle
optimizations.

Add the temporary garp_cleanup_module() function to make sure no pending
call_rcu() are left at module unload time [ this will be removed when
kfree_rcu() is available ]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agovmxnet3: Use single tx queue when CONFIG_PCI_MSI not defined
Shreyas Bhatewara [Tue, 10 May 2011 06:13:56 +0000 (06:13 +0000)]
vmxnet3: Use single tx queue when CONFIG_PCI_MSI not defined

Resending this patch with few changes.

Avoid multiple queues when MSI or MSI-X not available

Limit number of Tx queues to 1 if MSI/MSI-X support is not configured in
the kernel. This will make number of tx and rx queues equal when MSI/X
is not configured thus providing better performance.

Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: sctp_sendmsg: Don't test known non-null sinfo
Joe Perches [Thu, 12 May 2011 09:19:10 +0000 (09:19 +0000)]
sctp: sctp_sendmsg: Don't test known non-null sinfo

It's already known non-null above.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: sctp_sendmsg: Don't initialize default_sinfo
Joe Perches [Thu, 12 May 2011 11:27:20 +0000 (11:27 +0000)]
sctp: sctp_sendmsg: Don't initialize default_sinfo

This variable only needs initialization when cmsgs.info
is NULL.

Use memset to ensure padding is also zeroed so
kernel doesn't leak any data.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agol2tp: fix potential rcu race
Eric Dumazet [Wed, 11 May 2011 18:22:36 +0000 (18:22 +0000)]
l2tp: fix potential rcu race

While trying to remove useless synchronize_rcu() calls, I found l2tp is
indeed incorrectly using two of such calls, but also bumps tunnel
refcount after list insertion.

tunnel refcount must be incremented before being made publically visible
by rcu readers.

This fix can be applied to 2.6.35+ and might need a backport for older
kernels, since things were shuffled in commit fd558d186df2c
(l2tp: Split pppol2tp patch into separate l2tp and ppp parts)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: James Chapman <jchapman@katalix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Fix to prevent flooding of TX queue
Padmanabh Ratnakar [Tue, 10 May 2011 05:13:57 +0000 (05:13 +0000)]
be2net: Fix to prevent flooding of TX queue

Start/stop TX queue is controlled by TX queue "used" counter.
It is incremented while WRBs are posted to TX queue and
decremented when TX completions are received. This counter was
getting decremented before HW is informed about processing of TX
completions. As used counter is decremented, transmit function
posts new WRBs and creates completion queue full scenario in HW.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Use NTWK_RX_FILTER command for promiscous mode
Padmanabh Ratnakar [Tue, 10 May 2011 05:13:26 +0000 (05:13 +0000)]
be2net: Use NTWK_RX_FILTER command for promiscous mode

Use OPCODE_COMMON_NTWK_RX_FILTER command for promiscous mode as
OPCODE_ETH_PROMISCUOUS command is getting deprecated.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: In case of UE, do not dump registers for Lancer
Padmanabh Ratnakar [Tue, 10 May 2011 05:13:01 +0000 (05:13 +0000)]
be2net: In case of UE, do not dump registers for Lancer

In case of UE, do not dump registers for Lancer as they are not
supported.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Disable coalesce water mark mode of CQ for Lancer
Padmanabh Ratnakar [Tue, 10 May 2011 05:12:37 +0000 (05:12 +0000)]
be2net: Disable coalesce water mark mode of CQ for Lancer

Disable coalesce water mark mode of CQ for Lancer

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Handle error completion in Lancer
Padmanabh Ratnakar [Tue, 10 May 2011 05:12:17 +0000 (05:12 +0000)]
be2net: Handle error completion in Lancer

In Lancer if a frame is DMAed partially due to lack of RX buffers,
an error completion is sent with packet size as zero and num_recvd
indicating number of used buffers. These buffers need to be freed
and packet dropped.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-3.6
David S. Miller [Wed, 11 May 2011 18:26:15 +0000 (14:26 -0400)]
Merge branch 'master' of /linux/kernel/git/davem/net-3.6

Conflicts:
drivers/net/benet/be_main.c

13 years agoMerge branch 'tipc-May10-2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg...
David S. Miller [Wed, 11 May 2011 16:41:28 +0000 (12:41 -0400)]
Merge branch 'tipc-May10-2011' of git://git./linux/kernel/git/paulg/net-next-2.6

13 years agoMerge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6
David S. Miller [Tue, 10 May 2011 22:04:35 +0000 (15:04 -0700)]
Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6

13 years agoslcan: fix ldisc->open retval
Oliver Hartkopp [Tue, 10 May 2011 20:12:30 +0000 (13:12 -0700)]
slcan: fix ldisc->open retval

TTY layer expects 0 if the ldisc->open operation succeeded.

Reported-by: Matvejchikov Ilya <matvejchikov@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/usb: mark LG VL600 LTE modem ethernet interface as WWAN
Dan Williams [Mon, 9 May 2011 07:43:20 +0000 (07:43 +0000)]
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN

Like other mobile broadband device ethernet interfaces, mark the LG
VL600 with the 'wwan' devtype so userspace knows it needs additional
configuration via the AT port before the interface can be used.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoxfrm: Don't allow esn with disabled anti replay detection
Steffen Klassert [Mon, 9 May 2011 19:43:05 +0000 (19:43 +0000)]
xfrm: Don't allow esn with disabled anti replay detection

Unlike the standard case, disabled anti replay detection needs some
nontrivial extra treatment on ESN. RFC 4303 states:

Note: If a receiver chooses to not enable anti-replay for an SA, then
the receiver SHOULD NOT negotiate ESN in an SA management protocol.
Use of ESN creates a need for the receiver to manage the anti-replay
window (in order to determine the correct value for the high-order
bits of the ESN, which are employed in the ICV computation), which is
generally contrary to the notion of disabling anti-replay for an SA.

So return an error if an ESN state with disabled anti replay detection
is inserted for now and add the extra treatment later if we need it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoxfrm: Assign the inner mode output function to the dst entry
Steffen Klassert [Mon, 9 May 2011 19:36:38 +0000 (19:36 +0000)]
xfrm: Assign the inner mode output function to the dst entry

As it is, we assign the outer modes output function to the dst entry
when we create the xfrm bundle. This leads to two problems on interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is, that we might insert ipv4 packets in
netfilter6 and vice versa on interfamily scenarios.

With this patch we assign the inner mode output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We switch then to outer mode with the output_finish functions.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: dev_close() should check IFF_UP
Eric Dumazet [Tue, 10 May 2011 19:26:06 +0000 (12:26 -0700)]
net: dev_close() should check IFF_UP

Commit 443457242beb (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close()

Following actions trigger a BUG :

modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy

dev_close() must not close a non IFF_UP device.

With help from Frank Blaschka and Einar EL Lueck

Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agovlan: fix GVRP at dismantle time
Eric Dumazet [Tue, 10 May 2011 19:22:54 +0000 (12:22 -0700)]
vlan: fix GVRP at dismantle time

ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3    # driver providing eth2

 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 PGD 11d251067 PUD 11b9e0067 PMD 0
 Oops: 0000 [#1] SMP
 last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
 CPU 0
 Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]

 Pid: 11494, comm: rmmod Tainted: G        W   2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
 RIP: 0010:[<ffffffffa0030c9e>]  [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 RSP: 0018:ffff88007a19bae8  EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
 RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
 R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
 R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
 CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
 Stack:
  ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
  ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
  ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
 Call Trace:
  [<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
  [<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
  [<ffffffff8137e427>] __dev_close_many+0x87/0xe0
  [<ffffffff8137e507>] dev_close_many+0x87/0x110
  [<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
  [<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
  [<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
  [<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
  [<ffffffff81479d03>] notifier_call_chain+0x53/0x80
  [<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
  [<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
  [<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
  [<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
  [<ffffffff8137e85f>] rollback_registered+0x2f/0x40
  [<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
  [<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
  [<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]

We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: fix two lockdep splats
Eric Dumazet [Tue, 10 May 2011 03:55:03 +0000 (20:55 -0700)]
net: fix two lockdep splats

Commit e67f88dd12f6 (net: dont hold rtnl mutex during netlink dump
callbacks) switched rtnl protection to RCU, but we forgot to adjust two
rcu_dereference() lockdep annotations :

inet_get_link_af_size() or inet_fill_link_af() might be called with
rcu_read_lock or rtnl held, so use rcu_dereference_rtnl()
instead of rtnl_dereference()

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: xfrm: Eliminate ->rt_src reference in policy code.
David S. Miller [Mon, 9 May 2011 22:13:28 +0000 (15:13 -0700)]
ipv4: xfrm: Eliminate ->rt_src reference in policy code.

Rearrange xfrm4_dst_lookup() so that it works by calling a helper
function __xfrm_dst_lookup() that takes an explicit flow key storage
area as an argument.

Use this new helper in xfrm4_get_saddr() so we can fetch the selected
source address from the flow instead of from rt->rt_src

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoinfiniband: Remove rt->rt_src usage in addr4_resolve()
David S. Miller [Mon, 9 May 2011 21:52:02 +0000 (14:52 -0700)]
infiniband: Remove rt->rt_src usage in addr4_resolve()

Use an explicit flow key and fetch it from there.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: Remove rt->rt_src usage in sctp_v4_get_saddr()
David S. Miller [Mon, 9 May 2011 21:49:13 +0000 (14:49 -0700)]
sctp: Remove rt->rt_src usage in sctp_v4_get_saddr()

Flow key is available, so fetch it from there.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: udp: Eliminate remaining uses of rt->rt_src
David S. Miller [Mon, 9 May 2011 20:31:04 +0000 (13:31 -0700)]
ipv4: udp: Eliminate remaining uses of rt->rt_src

We already track and pass around the correct flow key,
so simply use it in udp_send_skb().

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: icmp: Eliminate remaining uses of rt->rt_src
David S. Miller [Mon, 9 May 2011 20:28:22 +0000 (13:28 -0700)]
ipv4: icmp: Eliminate remaining uses of rt->rt_src

On input packets, rt->rt_src always equals ip_hdr(skb)->saddr

Anything that mangles or otherwise changes the IP header must
relookup the route found at skb_rtable().  Therefore this
invariant must always hold true.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Pass explicit daddr arg to ip_send_reply().
David S. Miller [Mon, 9 May 2011 20:22:43 +0000 (13:22 -0700)]
ipv4: Pass explicit daddr arg to ip_send_reply().

This eliminates an access to rt->rt_src.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotipc: Revise timings used when sending link request messages
Allan Stephens [Fri, 22 Apr 2011 01:34:03 +0000 (20:34 -0500)]
tipc: Revise timings used when sending link request messages

Revises the algorithm governing the sending of link request messages
to take into account the number of nodes each bearer is currently in
contact with, and to ensure more rapid rediscovery of neighboring nodes
if a bearer fails and then recovers.

The discovery object now sends requests at least once a second if it
is not in contact with any other nodes, and at least once a minute if
it has at least one neighbor; if contact with the only neighbor is
lost, the object immediately reverts to its initial rapid-fire search
timing to accelerate the rediscovery process.

In addition, the discovery object now stops issuing link request
messages if it is in contact with the only neighboring node it is
configured to communicate with, since further searching is unnecessary.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Add monitoring of number of nodes discovered by bearer
Allan Stephens [Fri, 22 Apr 2011 00:05:25 +0000 (19:05 -0500)]
tipc: Add monitoring of number of nodes discovered by bearer

Augments TIPC's discovery object to track the number of neighboring nodes
having an active link to the associated bearer.

This means tipc_disc_update_link_req() becomes either one of:

       tipc_disc_add_dest()
or:
       tipc_disc_remove_dest()

depending on the code flow direction of things.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Enhance sending of discovery object link request messages
Allan Stephens [Thu, 21 Apr 2011 21:28:02 +0000 (16:28 -0500)]
tipc: Enhance sending of discovery object link request messages

Augments TIPC's discovery object to send its initial neighbor discovery
request message as soon as the associated bearer is created, rather than
waiting for its first periodic timeout to occur, thereby speeding up the
discovery process. Also adds a check to suppress the initial request or
subsequent requests if the bearer is blocked at the time the request is
scheduled for transmission.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Enhance handling of discovery object creation failures
Allan Stephens [Thu, 21 Apr 2011 18:58:26 +0000 (13:58 -0500)]
tipc: Enhance handling of discovery object creation failures

Modifies bearer creation and deletion code to improve handling of
scenarios when a neighbor discovery object cannot be created. The
creation routine now aborts the creation of a bearer if its discovery
object cannot be created, and deletes the newly created bearer, rather
than failing quietly and leaving an unusable bearer hanging around.

Since the exit via the goto label really isn't a definitive failure
in all cases, relabel it appropriately.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Introduce routine to enqueue a chain of messages on link tx queue
Allan Stephens [Thu, 21 Apr 2011 15:50:42 +0000 (11:50 -0400)]
tipc: Introduce routine to enqueue a chain of messages on link tx queue

Create a helper routine to enqueue a chain of sk_buffs to a link's
transmit queue.  It improves readability and the new function is
anticipated to be used more than just once in the future as well.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Avoid recomputation of outgoing message length
Allan Stephens [Thu, 21 Apr 2011 15:42:07 +0000 (10:42 -0500)]
tipc: Avoid recomputation of outgoing message length

Rework TIPC's message sending routines to take advantage of the total
amount of data value passed to it by the kernel socket infrastructure.
This change eliminates the need for TIPC to compute the size of outgoing
messages itself, as well as the check for an oversize message in
tipc_msg_build().  In addition, this change warrants an explanation:

   -     res = send_packet(NULL, sock, &my_msg, 0);
   +     res = send_packet(NULL, sock, &my_msg, bytes_to_send);

Previously, the final argument to send_packet() was ignored (since the
amount of data being sent was recalculated by a lower-level routine)
and we could just pass in a dummy value (0). Now that the
recalculation is being eliminated, the argument value being passed to
send_packet() is significant and we have to supply the actual amount
of data we want to send.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Abort excessive send requests as early as possible
Allan Stephens [Tue, 20 Apr 2010 21:58:24 +0000 (17:58 -0400)]
tipc: Abort excessive send requests as early as possible

Adds checks to TIPC's socket send routines to promptly detect and
abort attempts to send more than 66,000 bytes in a single TIPC
message or more than 2**31-1 bytes in a single TIPC byte stream request.
In addition, this ensures that the number of iovecs in a send request
does not exceed the limits of a standard integer variable.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Strengthen checks for neighboring node discovery
Allan Stephens [Wed, 20 Apr 2011 21:24:07 +0000 (16:24 -0500)]
tipc: Strengthen checks for neighboring node discovery

Enhances existing checks on the discovery domain associated with a TIPC
bearer. A bearer can no longer be configured to accept links from itself
only (which would be pointless), or to nodes outside its own cluster
(since multi-cluster support has now been removed from TIPC). Also, the
neighbor discovery routine now validates link setup requests against the
configured discovery domain for the bearer, rather than simply ensuring
the requesting node belongs to the node's own cluster.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: make zone/cluster mask constants a define
Paul Gortmaker [Tue, 19 Apr 2011 17:11:23 +0000 (13:11 -0400)]
tipc: make zone/cluster mask constants a define

This allows them to be available for easy re-use in other places
and avoids trivial mistakes caused by  "count the f's and 0's".

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Fix sk_buff leaks when link congestion is detected
Allan Stephens [Tue, 19 Apr 2011 14:17:58 +0000 (10:17 -0400)]
tipc: Fix sk_buff leaks when link congestion is detected

Modifies a TIPC send routine that did not discard the outgoing sk_buff
if it was not transmitted because of link congestion; this eliminates
the potential for buffer leakage in the many callers who did not clean up
the unsent buffer. (The two routines that previously did discard the unsent
buffer have been updated to eliminate their now-redundant clean up.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Update destination node field on incoming multicast messages
Allan Stephens [Mon, 18 Apr 2011 14:14:26 +0000 (10:14 -0400)]
tipc: Update destination node field on incoming multicast messages

Sets the destination node field of an incoming multicast message
to the receiving node's network address before handing off the message
to each receiving port. This ensures that, in the event the destination
port returns the message to the sender, the sender can identify which
node the destination port belonged to.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Fix problem with bundled multicast message
Allan Stephens [Mon, 18 Apr 2011 14:08:22 +0000 (10:08 -0400)]
tipc: Fix problem with bundled multicast message

Set the destination node and destination port fields of an outgoing
multicast message header to zero; this is necessary to ensure that
the receiving node can route the message properly if it was packed
into a bundle due to link congestion. (Previously, there was a chance
that the receiving node would send the unbundled message to a random
node & port, rather than processing the message itself.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Set name lookup scope field properly in all data messages
Allan Stephens [Sun, 17 Apr 2011 20:02:11 +0000 (16:02 -0400)]
tipc: Set name lookup scope field properly in all data messages

Ensures that all outgoing data messages have the "name lookup scope"
field of their header set correctly; that is, named multicast messages
now specify cluster-wide name lookup, while messages not using TIPC
naming zero out the lookup field.  (Previously, the lookup scope specified
for these types of messages was inherited from the last message sent
by the sending port.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Fix issues with fragmentation of an existing message buffer
Allan Stephens [Sun, 17 Apr 2011 17:06:23 +0000 (13:06 -0400)]
tipc: Fix issues with fragmentation of an existing message buffer

Modifies the routine that fragments an existing message buffer to
use similar logic to that used when generating fragments from an iovec.
The routine now creates a complete chain of fragments and adds them to
the link transmit queue as a unit, so that the link sends all fragments
or none; this prevents the incomplete transmission of a fragmented
message that might otherwise result because of link congestion or
memory exhaustion. This change also ensures that the counter recording
the number of fragmented messages sent by the link is now incremented
only if the message is actually sent.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Avoid pointless masking of fragmented message identifier
Allan Stephens [Sun, 17 Apr 2011 15:44:24 +0000 (11:44 -0400)]
tipc: Avoid pointless masking of fragmented message identifier

Eliminates code that restricts a link's counter of its fragmented
messages to a 16-bit value, since the counter value is automatically
restricted to this range when it is written into the message header.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Don't initialize link selector field in fragmented messages
Allan Stephens [Sun, 17 Apr 2011 14:29:16 +0000 (10:29 -0400)]
tipc: Don't initialize link selector field in fragmented messages

Eliminates code that sets the link selector field in the header of
fragmented messages, since this information is never referenced.
(The unnecessary initialization was harmless as it was over-written
by the fragmented message identifier value before the fragments were
transmitted.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Remove code to emulate loss of broadcast messages
Allan Stephens [Tue, 12 Apr 2011 18:59:03 +0000 (14:59 -0400)]
tipc: Remove code to emulate loss of broadcast messages

Eliminates optional code used to test TIPC's ability to recover
from lost broadcast messages. This code duplicates functionality
already provided by the network stack's QoS option "network emulator".

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Cosmetic consolidation of internal message type definitions
Allan Stephens [Fri, 8 Apr 2011 15:04:15 +0000 (11:04 -0400)]
tipc: Cosmetic consolidation of internal message type definitions

Half of the #define entries in msg.h were down at the bottom
of the header, instead of up at the top before any of the static
inlines etc.   Relocate them up to the top, to be consistent with
the other normal linux header file layout conventions.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Eliminate unused routing message definitions
Allan Stephens [Fri, 8 Apr 2011 14:59:04 +0000 (10:59 -0400)]
tipc: Eliminate unused routing message definitions

Gets rid of unused constants defining the types used in routing
messages. These messages no longer exist in TIPC now that multicluster
and multizone support has been eliminated.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Update comments in message header include file
Allan Stephens [Fri, 8 Apr 2011 14:50:52 +0000 (10:50 -0400)]
tipc: Update comments in message header include file

Removes comments in TIPC's message header include file that are
outdated and/or unnecessary. Also introduces short comments (or
supplements existing ones) to better describe several set of existing
symbolic constants.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agotipc: Drop __TIME__ usage
Michal Marek [Tue, 5 Apr 2011 14:59:16 +0000 (16:59 +0200)]
tipc: Drop __TIME__ usage

The kernel already prints its build timestamp during boot, no need to
repeat it in random drivers and produce different object files each
time.

Signed-off-by: Michal Marek <mmarek@suse.cz>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: netdev@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
13 years agonetfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0
Pablo Neira Ayuso [Tue, 10 May 2011 10:13:36 +0000 (12:13 +0200)]
netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0

This patch reverts a2361c8735e07322023aedc36e4938b35af31eb0:
"[PATCH] netfilter: xt_conntrack: warn about use in raw table"

Florian Wesphal says:
"... when the packet was sent from the local machine the skb
already has ->nfct attached, and -m conntrack seems to do
the right thing."

Acked-by: Jan Engelhardt <jengelh@medozas.de>
Reported-by: Florian Wesphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agonetfilter: IPv6: fix DSCP mangle code
Fernando Luis Vazquez Cao [Tue, 10 May 2011 08:00:21 +0000 (10:00 +0200)]
netfilter: IPv6: fix DSCP mangle code

The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agonetfilter: IPv6: initialize TOS field in REJECT target module
Fernando Luis Vazquez Cao [Tue, 10 May 2011 07:55:44 +0000 (09:55 +0200)]
netfilter: IPv6: initialize TOS field in REJECT target module

The IPv6 header is not zeroed out in alloc_skb so we must initialize
it properly unless we want to see IPv6 packets with random TOS fields
floating around. The current implementation resets the flow label
but this could be changed if deemed necessary.

We stumbled upon this issue when trying to apply a mangle rule to
the RST packet generated by the REJECT target module.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agoIPVS: init and cleanup restructuring
Hans Schillstrom [Tue, 3 May 2011 20:09:31 +0000 (22:09 +0200)]
IPVS: init and cleanup restructuring

DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.

IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.

This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.

The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
13 years agoIPVS: Change of socket usage to enable name space exit.
Hans Schillstrom [Tue, 3 May 2011 20:09:30 +0000 (22:09 +0200)]
IPVS: Change of socket usage to enable name space exit.

If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.

Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.

Thanks Eric W Biederman for your advices.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
13 years agonetfilter: ebtables: only call xt_compat_add_offset once per rule
Florian Westphal [Thu, 21 Apr 2011 08:58:25 +0000 (10:58 +0200)]
netfilter: ebtables: only call xt_compat_add_offset once per rule

The optimizations in commit 255d0dc34068a976
(netfilter: x_table: speedup compat operations) assume that
xt_compat_add_offset is called once per rule.

ebtables however called it for each match/target found in a rule.

The match/watcher/target parser already returns the needed delta, so it
is sufficient to move the xt_compat_add_offset call to a more reasonable
location.

While at it, also get rid of the unused COMPAT iterator macros.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
13 years agonetfilter: fix ebtables compat support
Eric Dumazet [Thu, 21 Apr 2011 08:57:21 +0000 (10:57 +0200)]
netfilter: fix ebtables compat support

commit 255d0dc34068a976 (netfilter: x_table: speedup compat operations)
made ebtables not working anymore.

1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call

Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
13 years agonetfilter: ctnetlink: fix timestamp support for new conntracks
Pablo Neira Ayuso [Thu, 21 Apr 2011 08:55:07 +0000 (10:55 +0200)]
netfilter: ctnetlink: fix timestamp support for new conntracks

This patch fixes the missing initialization of the start time if
the timestamp support is enabled.

libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp      6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
13 years agotulip: Use pr_<level> where appropriate
Joe Perches [Mon, 9 May 2011 09:45:23 +0000 (09:45 +0000)]
tulip: Use pr_<level> where appropriate

Use the current logging styles.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotulip: Convert uses of KERN_DEBUG
Joe Perches [Mon, 9 May 2011 09:45:22 +0000 (09:45 +0000)]
tulip: Convert uses of KERN_DEBUG

Convert logging messages to more current styles.

Added -DDEBUG to Makefile to maintain current message logging.
This could be converted to a specific CONFIG_TULIP_DEBUG option.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotulip: Convert printks to netdev_<level>
Joe Perches [Mon, 9 May 2011 09:45:21 +0000 (09:45 +0000)]
tulip: Convert printks to netdev_<level>

Use the current more descriptive logging styles.

Add pr_fmt and remove PFX where appropriate.
Use netif_<level>, netdev_<level>
Indent a few blocks in xircom_cb where appropriate.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotulip: xircom_cb: Convert #ifdef DEBUG blocks and enter/leave uses
Joe Perches [Mon, 9 May 2011 09:45:20 +0000 (09:45 +0000)]
tulip: xircom_cb: Convert #ifdef DEBUG blocks and enter/leave uses

Change the blocks that are guarded by #if DEBUG to
be #if defined DEBUG && DEBUG > 1 so that pr_debug
can be used later.

Remove enter/leave macros and uses.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'davem-next.r8169' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Mon, 9 May 2011 19:48:05 +0000 (12:48 -0700)]
Merge branch 'davem-next.r8169' of git://git./linux/kernel/git/romieu/netdev-2.6

13 years agopch_gbe: support ML7223 IOH
Tomoya [Mon, 9 May 2011 01:19:37 +0000 (01:19 +0000)]
pch_gbe: support ML7223 IOH

Support new device OKI SEMICONDUCTOR ML7223 IOH(Input/Output Hub).
The ML7223 IOH is for MP(Media Phone) use.
The ML7223 is companion chip for Intel Atom E6xx series.
The ML7223 is completely compatible for Intel EG20T PCH.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: add mac_pton() for parsing MAC address
Alexey Dobriyan [Sat, 7 May 2011 23:00:07 +0000 (23:00 +0000)]
net: add mac_pton() for parsing MAC address

mac_pton() parses MAC address in form XX:XX:XX:XX:XX:XX and only in that form.

mac_pton() doesn't dirty result until it's sure string representation is valid.

mac_pton() doesn't care about characters _after_ last octet,
it's up to caller to deal with it.

mac_pton() diverges from 0/-E return value convention.
Target usage:

if (!mac_pton(str, whatever->mac))
return -EINVAL;
/* ->mac being u8 [ETH_ALEN] is filled at this point. */
/* optionally check str[3 * ETH_ALEN - 1] for termination */

Use mac_pton() in pktgen and netconsole for start.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetconsole: switch to kstrto*() functions
Alexey Dobriyan [Sat, 7 May 2011 20:33:13 +0000 (20:33 +0000)]
netconsole: switch to kstrto*() functions

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: bonding: factor out rlock(bond->lock) in xmit path
Michał Mirosław [Sat, 7 May 2011 01:48:02 +0000 (01:48 +0000)]
net: bonding: factor out rlock(bond->lock) in xmit path

Pull read_lock(&bond->lock) and BOND_IS_OK() to bond_start_xmit() from
mode-dependent xmit functions.

netif_running() is always true in hard_start_xmit.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor8169: avoid late chip identifier initialisation.
Francois Romieu [Sun, 8 May 2011 15:47:36 +0000 (17:47 +0200)]
r8169: avoid late chip identifier initialisation.

Unknown 8168 chips did not have any PLL power method set as they
did not inherit a default family soon enough. Fix it.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agor8169: merge firmware information into the chipset description data.
Francois Romieu [Wed, 27 Apr 2011 06:22:39 +0000 (08:22 +0200)]
r8169: merge firmware information into the chipset description data.

- RTL_GIGA_MAC_NONE is a fake index so put it at the end of the
  enumeration and shift everybody.
- RTL_GIGA_MAC_VER_17 / RTL_GIGA_MAC_VER_16 ordering fixed. Though
  not wrong it was confusing enough to wonder if things were right.

Renaming rtl_chip_info was not strictly necessary. It allows to
check the patch for the correct use of the indexes though.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agor8169: provide some firmware information via ethtool.
Francois Romieu [Tue, 26 Apr 2011 16:58:59 +0000 (18:58 +0200)]
r8169: provide some firmware information via ethtool.

There is no real firmware version yet but the manpage of ethtool
is rather terse about the driver information.

Former output:
$ ethtool -i eth1
driver: r8169
version: 2.3LK-NAPI
firmware-version:
bus-info: 0000:01:00.0
$ ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version:
bus-info: 0000:03:00.0

Current output:
$ ethtool -i eth1
driver: r8169
version: 2.3LK-NAPI
firmware-version: N/A
bus-info: 0000:01:00.0

$ ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version: rtl_nic/rtl8168d-1.fw
bus-info: 0000:03:00.0

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Fixed-by Ciprian Docan <docan@eden.rutgers.edu>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Cc: Fejes József <fejes@joco.name>
Cc: Borislav Petkov <borislav.petkov@amd.com>
13 years agor8169: remove non-NAPI context invocation of rtl8169_rx_interrupt.
Francois Romieu [Tue, 15 Mar 2011 16:29:31 +0000 (17:29 +0100)]
r8169: remove non-NAPI context invocation of rtl8169_rx_interrupt.

Invocation of rtl8169_rx_interrupt from rtl8169_reset_task was originally
intended to retrieve as much packets as possible from the rx ring when a
reset was needed. Nowadays rtl8169_reset_task is only scheduled, with
some delay
a. from the tx timeout watchdog
b. when resuming
c. from rtl8169_rx_interrupt itself

It's dubious that the loss of outdated packets will matter much for a)
and b). c) does not need to call itself again.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agor8169: link speed selection timer rework.
Francois Romieu [Fri, 11 Mar 2011 20:07:11 +0000 (21:07 +0100)]
r8169: link speed selection timer rework.

The implementation was a bit krusty.

The 10s rtl8169_phy_timer timer has been (was ?) required with older
8169 for adequate phy operation when full gigabit is advertised in
autonegotiated mode. The timer does nothing if the link is up.
Otherwise it keeps resetting the phy until things improve.

- the device private data field phy_1000_ctrl_reg was used to
  schedule the timer. Avoid it and save a few bytes.

- rtl8169_set_settings
  pending timer is disabled before changing the link settings as
  rtl8169_phy_timer is not always needed (see the removed test in
  rtl8169_phy_timer).

- rtl8169_set_speed
  the requested link parameters may not match the chipset : bail out
  early on failure.

- rtl8169_open
  Calling rtl8169_request_timer is redundant with
  -> rtl8169_open
     -> rtl8169_init_phy
        -> rtl8169_set_speed
           -> mod_timer
  The latter always enables the phy timer whereas the former did not
  for RTL_GIGA_MAC_VER_01. It should not make things worse but only
  time will tell if reality agrees.

- rtl8169_request_timer : unused yet. Removed.

- rtl8169_delete_timer : useless. Bloat. Removed.

Side effect : the timer may kick in if the TBI is enabled. I do not
know if the TBI has ever been used in real life.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agor8169: rtl8169_set_speed_xmii cleanup.
Francois Romieu [Fri, 11 Mar 2011 19:30:24 +0000 (20:30 +0100)]
r8169: rtl8169_set_speed_xmii cleanup.

Shorten chipset version test.

No functional change.

Careful readers will notice that the 'supports_gmii' flag is deduced
from the device PCI id. Though less specific than the chipset related
RTL_GIGA_MAC_VER_XY, it is good enough to detect a GMII deprieved 810x.
Some features push for a device specific configuration (improved jumbo
frame support for instance). 'supports_gmii' will follow this path
if / when the device PCI id test stops working.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agor8169: remove some code duplication.
Francois Romieu [Fri, 29 Apr 2011 13:05:51 +0000 (15:05 +0200)]
r8169: remove some code duplication.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agor8169: style cleanups.
Francois Romieu [Fri, 1 Apr 2011 08:21:07 +0000 (10:21 +0200)]
r8169: style cleanups.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
13 years agoPCH_GbE : Fixed the issue of checksum judgment
Toshiharu Okada [Fri, 6 May 2011 02:53:56 +0000 (02:53 +0000)]
PCH_GbE : Fixed the issue of checksum judgment

The checksum judgment was mistaken.
  Judgment result
     0:Correct 1:Wrong

This patch fixes the issue.

Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoPCH_GbE : Fixed the issue of collision detection
Toshiharu Okada [Fri, 6 May 2011 02:53:51 +0000 (02:53 +0000)]
PCH_GbE : Fixed the issue of collision detection

The collision detection setting was invalid.
When collision occurred, because data was not resent,
there was an issue to which a transmitting throughput falls.

This patch enables the collision detection.

Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoNET: slip, fix ldisc->open retval
Matvejchikov Ilya [Fri, 6 May 2011 06:23:09 +0000 (06:23 +0000)]
NET: slip, fix ldisc->open retval

TTY layer expects 0 if the ldisc->open operation succeeded.

Signed-off-by : Matvejchikov Ilya <matvejchikov@gmail.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: Fixed bugs related to PVID.
Somnath Kotur [Wed, 4 May 2011 22:40:46 +0000 (22:40 +0000)]
be2net: Fixed bugs related to PVID.

Fixed bug to make sure 'pvid' retrieval will work on big endian hosts.
Fixed incorrect comparison between the Rx Completion's 16-bit VLAN TCI
and the PVID. Now comparing only the relevant 12 bits corresponding to
the VID.
Renamed 'vid' field under Rx Completion to 'vlan_tag' to reflect
accurate description.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoehea: fix wrongly reported speed and port
Kleber Sacilotto de Souza [Wed, 4 May 2011 13:05:11 +0000 (13:05 +0000)]
ehea: fix wrongly reported speed and port

Currently EHEA reports to ethtool as supporting 10M, 100M, 1G and
10G and connected to FIBRE independent of the hardware configuration.
However, when connected to FIBRE the only supported speed is 10G
full-duplex, and the other speeds and modes are only supported
when connected to twisted pair.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agovlan: remove one synchronize_net() call
Eric Dumazet [Mon, 9 May 2011 04:40:44 +0000 (04:40 +0000)]
vlan: remove one synchronize_net() call

At VLAN dismantle phase, unregister_vlan_dev() makes one
synchronize_net() call after vlan_group_set_device(grp, vlan_id, NULL).

This call can be safely removed because we are calling
unregister_netdevice_queue() to queue device for deletion, and this
process needs at least one rcu grace period to complete.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agogarp: remove one synchronize_rcu() call
Eric Dumazet [Mon, 9 May 2011 03:35:55 +0000 (03:35 +0000)]
garp: remove one synchronize_rcu() call

Speedup vlan dismantling in CONFIG_VLAN_8021Q_GVRP=y cases,
by using a call_rcu() to free the memory instead of waiting with
expensive synchronize_rcu() [ while RTNL is held ]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: use batched device unregister in veth and macvlan
Eric Dumazet [Sun, 8 May 2011 23:17:57 +0000 (23:17 +0000)]
net: use batched device unregister in veth and macvlan

veth devices dont use the batched device unregisters yet.

Since veth are a pair of devices, it makes sense to use a batch of two
unregisters, this roughly divides dismantle time by two.

Fix this by changing dellink() callers to always provide a non NULL
head. (Idea from Michał Mirosław)

This patch also handles macvlan case : We now dismantle all macvlans on
top of a lower dev at once.

Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Michał Mirosław <mirqus@gmail.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: Fix debug message args.
David S. Miller [Mon, 9 May 2011 04:14:41 +0000 (21:14 -0700)]
sctp: Fix debug message args.

I messed things up when I converted over to the transport
flow, I passed the ipv4 address value instead of it's address.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Pass flow key down into ip_append_*().
David S. Miller [Mon, 9 May 2011 00:24:10 +0000 (17:24 -0700)]
ipv4: Pass flow key down into ip_append_*().

This way rt->rt_dst accesses are unnecessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Pass flow keys down into datagram packet building engine.
David S. Miller [Mon, 9 May 2011 00:12:19 +0000 (17:12 -0700)]
ipv4: Pass flow keys down into datagram packet building engine.

This way ip_output.c no longer needs rt->rt_{src,dst}.

We already have these keys sitting, ready and waiting, on the stack or
in a socket structure.

Signed-off-by: David S. Miller <davem@davemloft.net>