This is to workaround the llock/scond livelock
HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping
coherency transactions in the SCU. The exclusive line state keeps rotating
among contenting cores leading to a never ending cycle. So break the cycle
by deferring the retry of failed exclusive access (SCOND). The actual delay
needed is function of number of contending cores as well as the unrelated
coherency traffic from other cores. To keep the code simple, start off with
small delay of 1 which would suffice most cases and in case of contention
double the delay. Eventually the delay is sufficient such that the coherency
pipeline is drained, thus a subsequent exclusive access would succeed.
Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This reduces the diff in forth-coming patches and also helps understand
better the incremental changes to inline asm.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Extended testing of quad core configuration revealed that this fix was
insufficient. Specifically LTP open posix shm_op/23-1 would cause the
hardware livelock in llock/scond loop in update_cpu_load_active()
So remove this and make way for a proper workaround
This reverts commit a5c8b52abe.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
A quad core SMP build could get into hardware livelock with concurrent
LLOCK/SCOND. Workaround that by adding a PREFETCHW which is serialized by
SCU (System Coherency Unit). It brings the cache line in Exclusive state
and makes others invalidate their lines. This gives enough time for
winner to complete the LLOCK/SCOND, before others can get the line back.
The prefetchw in the ll/sc loop is not nice but this is the only
software workaround for current version of RTL.
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers
Since ARCv2 only provides load/load, store/store and all/all, we need
the full barrier
- LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified
values were lacking the explicit smp barriers.
- Non LLOCK/SCOND varaints don't need the explicit barriers since that
is implicity provided by the spin locks used to implement the
critical section (the spin lock barriers in turn are also fixed in
this commit as explained above
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull ARC updates from Vineet Gupta:
"Sorry for the late pull request. Current stuff was ready for a while
but I was hoping to squeeze in support for almost ready ARC SDP
platform (and avoid a 2nd pull request), however it seems there are
still some loose ends which warrant more time.
- Platform code reduction/moving-up (TB10X no longer needs any
callbacks)
- updated boot printing
- kgdb update for arc gdb 7.5
- bug fixes (some marked for stable)
- more code refactoring/consolidation"
* tag 'arc-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: boot: cpu feature print enhancements
ARC: boot: consolidate cross-checking of h/w and s/w
ARC: unbork FPU save/restore
ARC: remove extraneous __KERNEL__ guards
ARC: Update order of registers in KGDB to match GDB 7.5
ARC: Remove unneeded Kconfig entry NO_DMA
ARC: BUG() dumps stack after @msg (@msg now same as in generic BUG))
ARC: refactoring: reduce the scope of some local vars
ARC: remove gcc mpy heuristics
ARC: RIP @running_on_hw
ARC: Update comments about uncached address space
ARC: rename kconfig option for unaligned emulation
ARC: [nsimosci] Allow "headless" models to boot
ARC: [arcfpga] Get rid of ARC_BOARD_ANGEL4 and ARC_BOARD_ML509
ARC: [arcfpga] Remove more dead code
ARC: [plat*] move code out of .init_machine into common
ARC: [arcfpga] consolidate machine description, DT
ARC: Allow SMP kernel to build/boot on UP-only infrastructure
This covers the UP / SMP (with no hardware assist for atomic r-m-w) as
well as ARC700 LLOCK/SCOND insns based.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>