Linus on recursive lock
Unreliable Guide To Locking
How to Quantify Scalability


A program is synchronized if all accesses to shared data are ordered by synchronization operations.
A data reference is ordered by a synchronization operation if, in every possible
execution, a write of a variable by one processor and an access (either a read or a
write) of that variable by another processor are separated by a pair of synchroni-
zation operations, one executed after the write by the writing processor and one
executed before the access by the second processor???. Cases where variables may
be updated without ordering by synchronization are called data races because the
execution outcome depends on the relative speed of the processors, and, like
races in hardware design, the outcome is unpredictable, which leads to another
name for synchronized programs: data-race-free.
Scalability Techniques for Practical Synchronization Primitives
Synchronization Instructions

Why we need synchronization

a critical section is a piece of code that accesses a shared resource
(data structure or device) that must not be concurrently accessed by more than one thread of execution.

This overlap, where the result depends on the relative timing of multiple tasks, is called a race condition.

How to use synchronization mechanism

A must read bookUnreliable Guide To Locking – Rusty Russell

Common usage

Protect from interruption by hardware interrupts:
local_irq_disable(int irq) & local_irq_enable(int irq)
Protection from software interrupts:
local_bh_disable(void) & local_bh_enable(void)
Protection from other CPUs:
spin_lock(spinlock_t *) & spin_unlock(spinlock_t *)
Preemption by other user contexts:
preempt_disable(void) & preempt_enable(void)

What is synchronization in computer science

synchronization means be of the same time.
It means “make it synchronous”, something like coexistence.

Process synchronization refers to the idea that multiple processes are
to join up or handshake at a certain point, in order to reach an
agreement or commit to a certain sequence of action.
* Mutual exclusion – only one excution routine in critical section
Readers–writers problem
The Producer-Consumer Problem
* Cooperative
TCP Handshake
* Deadlock
Dining philosophers problem
* Starvation
* Prioirty inversion
* Busy waiting


Waiman Long: Recent Locking Improvements in the Linux Kernel

Memory barier



x86 locked atomic operations:

v3a chapter 8
* Guaranteed atomic operations
* Bus locking, using the LOCK# signal and the LOCK instruction prefix
* ??Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures(cache lock); this mechanism is present in the Pentium 4, Intel Xeon, and P6 family processors


all signle instructions is atomic, including cmpxchg
Anything is atomic if you do it with interrupts disabled on a UP system


cmpxchg isn’t atomic; it needs LOCK prefix.


x86 inc isn’t atomic; check atomic_inc.c


Atomic vs. Non-Atomic Operations

misalgined access

AMD 7.3.2 Access Atomicity
Also check Intel 8.1.1
naturally-aligned vs atomic vs cache

Why CAS?

Cache locking

v3a: 8.1.4
Effects of a LOCK Operation on Internal Processor Caches

cmpxchg vs xchg on LOCK prefix implicit

Check Linux SMP alternatives

interrupt/exception vs non-atomic operation

v3a 6.6
To allow the restarting of program or task following the handling of an exception or an interrupt, all exceptions
(except aborts) are guaranteed to report exceptions on an instruction boundary. All interrupts are guaranteed to be
taken on an instruction boundary.

Two processor writing to the same address

What happens when two CPU cores require write access to the same memory address at the same exact moment?
When an interrupt occurs, what happens to the process that is currently getting executed?

Local Interrupt Disabling

Disabling and Enabling Deferrable Functions

Spinlock - unsleepable mutex

MCS locks and qspinlocks
MCS locks on quora

Bit spinlock


Optimistic spin queue

Cancellable MCS spinlock rework
An Overview of Kernel Lock Improvements


Monitor lock before sleep.
Faireness: queue, rt mutex

RW Lock




Linus on semaphore
150 213 kernel/printk/printk.c <>
151 231 net/core/netpoll.c <>
it was implement based on spinlock.
it can have more than one holder at any time (the number decided at initialization time),
although it is most commonly used as a single-holder lock (a mutex).
if you can not get a semaphore, your task will put itself on the wait queue, and be woken
up the semaphore is released.
* xadd vs spinlock


RW semaphore

rwsem: Support optimistic spinning

RCU – lockless

Check The Journey to RCU for more details


list_add_tail_rcu(&p->tasks, &init_task.tasks); in copy_process
no rcu read block and unlock


##When to save irq rather than just disable irq, recusive?
local_irq_disable() used in the code path that never disabled interrupts.
local_irq_save(flags) used in the code path that already disabled interrupts.