When we talk about a memory model, we are referring to the memory consistency model, also called the memory ordering model.
How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs
1987 ~ 1990
Linearizability: A Correctness Condition for Concurrent Objects
Processor consistency: Cache Consistency and Sequential Consistency
Release consistency: Memory consistency and event ordering in scalable shared-memory multiprocessors
Proving sequential consistency of high-performance shared memories
TSO Sparc v8: A standard memory model called Total Store Ordering (TSO) is defined for SPARC
Formal Specification of Memory Models: covers the two store-ordered models, TSO and PSO, defined by Sun Microsystems' SPARC architecture.
2001 ~ Present
IA64 memory ordering
Motivation: hiding latency
▪ Why are we interested in relaxing ordering requirements?
- Specifically, hiding memory latency: overlap memory accesses with other operations
- Remember, memory access in a cache-coherent system may entail much more than simply reading bits from memory (finding data, sending invalidations, etc.)
Why TSO? Because on a multiprocessor the write buffer (store buffer) is no longer invisible: other processors can observe that a buffered write has not yet reached memory.
TSO abandons SC in order to allow the use of a FIFO write buffer.
An example: There’s no reason why performing event (2) (a read from B) needs to wait until event (1) (a write to A) completes. They don’t interfere with each other at all, and so should be allowed to run in parallel. See Memory Consistency Models: A Primer
Hide the write latency by putting the data in the store buffer.
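The example above is the classic store-buffering litmus test. Below is a minimal C++ sketch of it, not taken from the cited primer: the helper name `store_buffering` and the use of `std::atomic` are assumptions made for illustration. With `memory_order_seq_cst` on every access, the outcome r1 == 0 and r2 == 0 is forbidden. With release stores and acquire loads, which roughly correspond to plain stores and loads on a TSO machine, each store may still sit in a store buffer when the other thread's load executes, so both reads may return 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical helper: runs the store-buffering litmus test once.
//   Thread 0: A = 1; r1 = B        Thread 1: B = 1; r2 = A
// Returns the pair (r1, r2).
std::pair<int, int> store_buffering(std::memory_order store_order,
                                    std::memory_order load_order) {
    std::atomic<int> A{0};
    std::atomic<int> B{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t0([&] {
        A.store(1, store_order);   // event (1): write to A
        r1 = B.load(load_order);   // event (2): read from B
    });
    std::thread t1([&] {
        B.store(1, store_order);   // mirror image on the other thread
        r2 = A.load(load_order);
    });
    t0.join();
    t1.join();

    // Both threads have finished, so both results were written.
    assert(r1 != -1 && r2 != -1);
    return {r1, r2};
}
```

Calling `store_buffering(std::memory_order_seq_cst, std::memory_order_seq_cst)` never returns (0, 0). Calling it with `std::memory_order_release` and `std::memory_order_acquire` may return (0, 0), although a single run rarely shows it.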
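Why not read-write reordering?
Reordering a read with a later write gains nothing from a store buffer, which only delays writes, so TSO keeps read-to-write program order.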
Recommended by CAAQA. Observability in SC, TSO, PC: see the paragraph "Relaxing the Write to Read Program Order" in Shared Memory Consistency Models: A Tutorial
Memory Barriers: a Hardware View for Software Hackers - must read
‘A Summary of Relaxed Consistency’ CMUSlides
Total Store Ordering: Appendix K of the SPARC V8 architecture manual.
TSO in x86
TSO vs PC:
TSO and Peterson’s algorithm
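Peterson's algorithm is a standard illustration of why the store buffer matters: each thread writes its own flag and then reads the other thread's flag, which is exactly the write-to-read order that TSO relaxes. Below is a minimal C++ sketch, assuming sequentially consistent atomics (the default for `std::atomic`) so that each flag write stays ordered before the following read. `PetersonLock` and `peterson_count` are hypothetical names used only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical two-thread lock for illustration; thread ids are 0 and 1.
struct PetersonLock {
    std::atomic<bool> flag[2]{{false}, {false}};
    std::atomic<int> turn{0};

    void lock(int me) {
        int other = 1 - me;
        flag[me].store(true);   // seq_cst store: announce interest
        turn.store(other);      // give priority to the other thread
        // seq_cst keeps these loads ordered after the stores above.
        // With weaker orderings the flag store could still be buffered
        // while the other thread's flag is read, and both could enter.
        while (flag[other].load() && turn.load() == other) {
            std::this_thread::yield();
        }
    }

    void unlock(int me) { flag[me].store(false); }
};

// Two threads increment a plain counter under the lock.
long peterson_count(int iterations) {
    PetersonLock mutex;
    long counter = 0;  // ordinary data protected by the lock

    auto worker = [&](int me) {
        for (int i = 0; i < iterations; ++i) {
            mutex.lock(me);
            ++counter;
            mutex.unlock(me);
        }
    };
    std::thread t0(worker, 0);
    std::thread t1(worker, 1);
    t0.join();
    t1.join();

    assert(counter <= 2L * iterations);
    return counter;
}
```

With mutual exclusion intact, `peterson_count(n)` returns exactly `2 * n`.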
Weak consistency: Memory access buffering in multiprocessors
They distinguish between ordinary shared accesses and synchronization accesses, where the latter are used to control concurrency between several processes and to maintain the integrity of ordinary shared data.
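A minimal C++ sketch of this distinction, under the assumption that a spinlock flag plays the role of the synchronization variable and a plain counter is the ordinary shared data. The names `lock_flag`, `shared_counter`, `locked_increment`, and `run_two_workers` are hypothetical, and `std::atomic` with its default (sequentially consistent) ordering only approximates the synchronization accesses described in the paper.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> lock_flag{false};  // synchronization variable
long shared_counter = 0;             // ordinary shared data

void locked_increment(int times) {
    for (int i = 0; i < times; ++i) {
        // Synchronization access: atomically take the lock.
        while (lock_flag.exchange(true)) {
            std::this_thread::yield();
        }
        ++shared_counter;            // ordinary access, protected by the lock
        lock_flag.store(false);      // synchronization access: give it back
    }
}

long run_two_workers(int times) {
    shared_counter = 0;
    std::thread a(locked_increment, times);
    std::thread b(locked_increment, times);
    a.join();
    b.join();
    assert(!lock_flag.load());       // the lock was released by the last holder
    return shared_counter;
}
```

Only the accesses to `lock_flag` need strong ordering; accesses to `shared_counter` between two synchronization points may be buffered or reordered freely by the hardware.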
Firo: a must-read: Release consistency: Memory consistency and event ordering in scalable shared-memory multiprocessors
Must-read: Lockless Programming Considerations for Xbox 360 and Microsoft Windows
At the top right of page 6 of the release consistency paper:
Condition 3.1: Conditions for Release Consistency
(A) before an ordinary load or store access is allowed to perform with respect to any other processor, all previous acquire accesses must be performed, and
(B) before a release access is allowed to perform with respect to any other processor, all previous ordinary load and store accesses must be performed, and
(C) special accesses are processor consistent with respect to one another.
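Conditions (A) and (B) can be illustrated with a message-passing pattern. Below is a minimal sketch, assuming that C++ `memory_order_acquire` and `memory_order_release` are an acceptable approximation of the paper's acquire and release accesses. The function name `message_passing` and the value 42 are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: a producer hands one value to a consumer.
int message_passing() {
    int payload = 0;                 // ordinary data
    std::atomic<bool> ready{false};  // special (synchronization) variable
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // ordinary store
        // Release: the ordinary store above is performed before this
        // store becomes visible, in the spirit of condition (B).
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // Acquire: the ordinary load below cannot be performed until
        // this load has observed true, in the spirit of condition (A).
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                                // ordinary load
    });
    producer.join();
    consumer.join();

    assert(seen != -1);              // the consumer ran to completion
    return seen;
}
```

Because the consumer reads `payload` only after its acquire load has seen the producer's release store, `message_passing()` returns 42.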
Acquire and Release Semantics