Simple round-robin hardware TX scheduling can cause starvation of TX rings
with small packets when other TX rings have large TSO or jumbo packets.
In the simplest case, consider 2 TCP streams running in opposite
directions. The TSO TX traffic will hash to one ring and the ACKs for the
incoming data on a different TCP connection will hash to a different TX
ring. The hardware fetches one complete TSO packet (up to 64K data)
before servicing the other TX ring. When it gets to the other TX ring, it
will only fetch one packet (64-byte ACK packet in this case). After that,
it will switch back to the 1st ring filled with more TSO packets. Because
only one ACK can go out roughly every 500 usec in this case, the incoming
data rate becomes very low.
Update version to 3.125.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
by introducing tg3_stop() that does the opposite of tg3_start(). This
function will be useful when adding the support for changing the numbe
of rx and tx rings.
Reviewed-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
by introducing tg3_start() that handles all initialization steps from
IRQ allocation. This function will be needed when adding support for
changing the number of rx and tx rings.
Reviewed-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fib6_add_1() should consistently return errno pointers,
rather than a mixture of NULL and errno pointers.
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
this pull request is for net-next, for the v3.7 release cycle.
AnilKumar Ch contributed a fix for a segfault in the c_can driver,
which is triggered by an earlier commit [1] in net-next (so no backport
is needed).
...
[1] 4cdd34b can: c_can: Add runtime PM support to Bosch C_CAN/D_CAN controller
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables wake from system suspend on magic packet.
Patch updated to change BUG_ON to WARN_ON_ONCE.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch instructs the device to enter its lowest power SUSPEND2
state during system suspend.
This patch also explicitly wakes the device after resume, which
should address reports of the device not automatically coming
back after system suspend:
Patch updated to change BUG_ON to WARN_ON_ONCE.
http://code.google.com/p/chromium-os/issues/detail?id=31871
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds an explicit test that the READY bit is set on
the device when attempting to initialize it.
If this bit is clear then the device hasn't succesfully started
all its clocks, and this patch helps make the resulting logged
error more helpful.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables wake from system suspend on magic packet.
Patch updated to replace BUG_ON with WARN_ON_ONCE and return.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables the device to enter its lowest power SUSPEND2
state during system suspend, instead of staying up using full power.
Patch updated to not add two pointers to .suspend & .resume.
Patch updated to replace BUG_ON with WARN_ON_ONCE and return.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes an issue on some systems, where after suspend the
link is re-established but the ethernet interface does not resume.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds additional checks of the values returned by
smsc95xx_(read|write)_reg, and wraps their common patterns
in macros.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes unnecessary variables as smsc95xx_write_reg takes its
value by parameter. Early versions passed this parameter by
reference.
Also replace hardcoded interrupt status value with a #define
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
During init, the device reset is unexpected to complete immediately,
so sleep before testing the condition rather than after it.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/team/team.c
drivers/net/usb/qmi_wwan.c
net/batman-adv/bat_iv_ogm.c
net/ipv4/fib_frontend.c
net/ipv4/route.c
net/l2tp/l2tp_netlink.c
The team, fib_frontend, route, and l2tp_netlink conflicts were simply
overlapping changes.
qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.
With help from Antonio Quartulli.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David S Miller:
1) Netfilter xt_limit module can use uninitialized rules, from Jan
Engelhardt.
2) Wei Yongjun has found several more spots where error pointers were
treated as NULL/non-NULL and vice versa.
3) bnx2x was converted to pci_io{,un}map() but one remaining plain
iounmap() got missed. From Neil Horman.
4) Due to a fence-post type error in initialization of inetpeer entries
(which is where we store the ICMP rate limiting information), we can
erroneously drop ICMPs if the inetpeer was created right around when
jiffies wraps.
Fix from Nicolas Dichtel.
5) smsc75xx resume fix from Steve Glendinnig.
6) LAN87xx smsc chips need an explicit hardware init, from Marek Vasut.
7) qlcnic uses msleep() with locks held, fix from Narendra K.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netdev: octeon: fix return value check in octeon_mgmt_init_phy()
inetpeer: fix token initialization
qlcnic: Fix scheduling while atomic bug
bnx2: Clean up remaining iounmap
net: phy: smsc: Implement PHY config_init for LAN87xx
smsc75xx: fix resume after device reset
netdev: pasemi: fix return value check in pasemi_mac_phy_init()
team: fix return value check
l2tp: fix return value check
netfilter: xt_limit: have r->cost != 0 case work
Pull vfs fixes from Al Viro:
"A couple of fixes; one for automount/lazy umount race, another a
classic "we don't protect the refcount transition to zero with the
lock that protects looking for object in hash" kind of crap in lockd."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
close the race in nlmsvc_free_block()
do_add_mount()/umount -l races
Pull UML fixes from Richard Weinberger.
* 'for-linus-3.6-rc-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Preinclude include/linux/kern_levels.h
um: Fix IPC on um
um: kill thread->forking
um: let signal_delivered() do SIGTRAP on singlestepping into handler
um: don't leak floating point state and segment registers on execve()
um: take cleaning singlestep to start_thread()
Pull dm fixes from Alasdair G Kergon:
"A few fixes for problems discovered during the 3.6 cycle.
Of particular note, are fixes to the thin target's discard support,
which I hope is finally working correctly; and fixes for multipath
ioctls and device limits when there are no paths."
* tag 'dm-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm verity: fix overflow check
dm thin: fix discard support for data devices
dm thin: tidy discard support
dm: retain table limits when swapping to new table with no devices
dm table: clear add_random unless all devices have it set
dm: handle requests beyond end of device instead of using BUG_ON
dm mpath: only retry ioctl when no paths if queue_if_no_path set
dm thin: do not set discard_zeroes_data
Speculative cache pagecache lookups can elevate the refcount from
under us, so avoid the false positive. If the refcount is < 2 we'll be
notified by a VM_BUG_ON in put_page_testzero as there are two
put_page(src_page) in a row before returning from this function.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Petr Holasek <pholasek@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In case of error, the function of_phy_connect() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value
check should be replaced with NULL test.
dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm fixes from Dave Airlie:
"The three nouveau fixes quiten unneeded dmesg spam that people are
seeing and pondering,
The udl fix stops it from trying to driver monitors that are too big,
where we get a black screen.
And a vmware memory alloc problem."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/nvc0/fifo: ignore bits in PFIFO_INTR that aren't set in PFIFO_INTR_EN
drm/udl: limit modes to the sku pixel limits.
vmwgfx: corruption in vmw_event_fence_action_create()
drm/nvc0/ltcg: mask off intr 0x10
drm/nouveau: silence a debug message triggered by newer userspace
Pull USB fixes from Greg Kroah-Hartman:
"Here are two USB bugfixes for your 3.6-rc7 tree.
The OHCI fix has been reported a number of times and is a regression
from 3.5, and the patch that causes the regression was on the way to
the -stable trees before I was reminded (again) that this fix needed
to get to your tree soon.
The host controller bugfix was reported in older kernels as being
pretty easy to trigger, and has been tested by Red Hat and their
customers.
Both have been in the usb-next branch in the -next tree for a while
now, I just cherry-picked them out to get to you in time for the 3.6
release.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: Fix race condition when removing host controllers
USB: ohci-at91: fix null pointer in ohci_hcd_at91_overcurrent_irq
Also fix the calls to next_packet_size() for the pause case. This was
missed in 245baf983 ("ALSA: snd-usb: fix calls to next_packet_size").
Signed-off-by: Daniel Mack <zonque@gmail.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reported-and-tested-by: Christian Tefzer <ctrefzer@gmx.de>
Cc: stable@kernel.org
[ Taking directly because Takashi is on vacation - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ASoC update from Mark Brown:
"One small and obvious driver-specific fix.
Takashi is on vacation now so he asked me to send directly, it's a
pretty bad bug with low regression risk."
* tag 'asoc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound:
ASoC: wm2000: Correct register size
Remove two holes on 64bit arches, and put dev_list at the end of
napi_struct since its not used in fast path.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently use percpu order-0 pages in __netdev_alloc_frag
to deliver fragments used by __netdev_alloc_skb()
Depending on NIC driver and arch being 32 or 64 bit, it allows a page to
be split in several fragments (between 1 and 8), assuming PAGE_SIZE=4096
Switching to bigger pages (32768 bytes for PAGE_SIZE=4096 case) allows :
- Better filling of space (the ending hole overhead is less an issue)
- Less calls to page allocator or accesses to page->_count
- Could allow struct skb_shared_info futures changes without major
performance impact.
This patch implements a transparent fallback to smaller
pages in case of memory pressure.
It also uses a standard "struct page_frag" instead of a custom one.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When jiffies wraps around (for example, 5 minutes after the boot, see
INITIAL_JIFFIES) and peer has just been created, now - peer->rate_last can be
< XRLIM_BURST_FACTOR * timeout, so token is not set to the maximum value, thus
some icmp packets can be unexpectedly dropped.
Fix this case by initializing last_rate to 60 seconds in the past.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct sock *sk is not used inside tcp_v4_save_options. Thus it can be
removed.
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c0357e975a modified bnx2 to switch from
using ioremap/iounmap to pci_iomap/pci_iounmap. They missed a spot in the error
path of bnx2_init_one though. This patch just cleans that up.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Michael Chan <mcan@broadcom.com>
CC: "David S. Miller" <davem@davemloft.net>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_parse_header() callers provide 8 bytes of storage,
so it's not possible to store an IPv6 address.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull one more arm-soc bugfix from Olof Johansson:
"Here's a bugfix for orion5x. Without this, PCI doesn't initialize
properly because of too small coherent pool to cover the allocations
needed.
A similar fix has already been done on kirkwood."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: Orion5x: Fix too small coherent pool.
Pull ARM dma-mapping fix from Marek Szyprowski:
"This patch fixes a potential memory leak in the ARM dma-mapping code."
* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: Fix potential memory leak in atomic_pool_init()
Pull GPIO fix from Linus Walleij:
"A late GPIO fix: Roland Stigge found a problem in the LPC32xx driver
where a callback ignores one of its arguments. It needs to go into
stable too so sending this upstream immediately."
* tag 'gpio-fixes-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio-lpc32xx: Fix value handling of gpio_direction_output()
Pull two md bugfixes from NeilBrown:
"One (missing spinlock init) was only introduced recently. The other
has been present as long as raid10 has been supported, so is tagged
for -stable."
* tag 'md-3.6-fixes' of git://neil.brown.name/md:
md/raid10: fix "enough" function for detecting if array is failed.
md/raid5: add missing spin_lock_init.
Pull EDAC fixes from Mauro Carvalho Chehab:
"Three edac fixes at the memory enumeration logic:
- i3200_edac: Fixes a regression at the memory rank size, when the
memorias are dual-rank;
- i5000_edac: Fix a longstanding bug when calculating the memory
size: before Kernel 3.6, the memory size were right only
with one specific configuration;
- sb_edac: Fixes a bug since the initial release of the driver:
with 16GB DIMMs, there's an overflow at the memory size,
causing the number of pages per dimm (an unsigned value)
to have the highest bit equal to 1, effectively mangling
the memory size.
The third bug can potentially affect the error decoding logic as well."
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
sb_edac: Avoid overflow errors at memory size calculation
i5000: Fix the memory size calculation with 2R memories
i3200_edac: Fix memory rank size
It seems sk_init() has no value today and even does strange things :
# grep . /proc/sys/net/core/?mem_*
/proc/sys/net/core/rmem_default:212992
/proc/sys/net/core/rmem_max:131071
/proc/sys/net/core/wmem_default:212992
/proc/sys/net/core/wmem_max:131071
We can remove it completely.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC refuses to recognize that all error control flows do in fact
set err to something.
Add an explicit initialization to shut it up.
net/sched/sch_drr.c: In function ‘drr_enqueue’:
net/sched/sch_drr.c:359:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]
net/sched/sch_qfq.c: In function ‘qfq_enqueue’:
net/sched/sch_qfq.c:885:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC can't see that in all non-error-return paths we do in fact
set *using_dac to something.
Add an explicit initialization to remove this warning:
drivers/net/ethernet/brocade/bna/bnad.c: In function ‘bnad_pci_probe’:
drivers/net/ethernet/brocade/bna/bnad.c:3079:5: warning: ‘using_dac’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/ethernet/brocade/bna/bnad.c:3233:7: note: ‘using_dac’ was declared here
Signed-off-by: David S. Miller <davem@davemloft.net>