c assembly locking cpu-architecture spinlock

How are spin locks implemented on platforms without CAS instructions?

On platforms that support a CAS instruction, the normal practice is to wrap it in a while loop. But platforms such as SPARC don't have atomic CAS instructions.

Solution

SPARC v8 (32-bit) and earlier lack CAS, but v9 (64-bit) does have CAS.

For a spin-lock, v7 and v8 provide LDSTUB, an atomic read-modify-write that loads an unsigned byte and stores 0xFF in its place. That does the lock phase of a spin-lock: if the byte loaded was not 0xFF, the lock was free and is now yours; otherwise somebody else holds it and you retry. An ordinary write of 0 (or anything not 0xFF) will unlock when using TSO; for PSO you need an STBAR before the write. [There is also SWAP, another atomic read-modify-write, which can be used in the same way.]
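As a rough sketch of the same idea in portable C11: atomic_flag is the standard's test-and-set primitive, and a compiler targeting SPARC v8 may well lower it to LDSTUB (that mapping is an assumption about the toolchain, not something guaranteed). The names tas_spinlock, TAS_SPINLOCK_INIT and the three functions are made up for illustration.

```c
#include <stdatomic.h>

/* Hypothetical test-and-set spin-lock: flag clear = unlocked, flag set = locked. */
typedef struct {
    atomic_flag held;
} tas_spinlock;

#define TAS_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Lock phase: atomically set the flag and look at the value it had before.
   If it was already set, somebody else holds the lock, so try again. */
void tas_spinlock_lock(tas_spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* spin */
    }
}

/* Single attempt: returns 1 if the lock was taken, 0 if it was already held. */
int tas_spinlock_trylock(tas_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Unlock phase: a plain store of "clear". Release ordering plays the role
   that the STBAR before the write plays on PSO. */
void tas_spinlock_unlock(tas_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Declare a lock as `tas_spinlock l = TAS_SPINLOCK_INIT;` and bracket the critical section with tas_spinlock_lock(&l) and tas_spinlock_unlock(&l).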

To implement CAS (and Fetch-Op) operations on v7/v8 you need an auxiliary spin-lock.
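A minimal sketch of what that can look like, assuming a single global auxiliary lock built on C11 atomic_flag. The names emu_aux_lock, emu_compare_and_swap and emu_fetch_add are hypothetical, and a real implementation might hash the address to one of several locks to reduce contention.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical auxiliary spin-lock guarding every emulated atomic operation. */
static atomic_flag emu_aux_lock = ATOMIC_FLAG_INIT;

static void emu_aux_acquire(void) {
    while (atomic_flag_test_and_set_explicit(&emu_aux_lock, memory_order_acquire)) {
        /* spin */
    }
}

static void emu_aux_release(void) {
    atomic_flag_clear_explicit(&emu_aux_lock, memory_order_release);
}

/* Emulated CAS: if *target == expected, store desired and return true;
   otherwise leave *target alone and return false. */
bool emu_compare_and_swap(int *target, int expected, int desired) {
    bool swapped = false;
    emu_aux_acquire();
    if (*target == expected) {
        *target = desired;
        swapped = true;
    }
    emu_aux_release();
    return swapped;
}

/* Emulated Fetch-Op (here: fetch-and-add). Returns the value before the add. */
int emu_fetch_add(int *target, int delta) {
    emu_aux_acquire();
    int old = *target;
    *target = old + delta;
    emu_aux_release();
    return old;
}
```

This only works if every access to the shared variable goes through helpers that take the same auxiliary lock; a plain store from elsewhere would race with the emulated operations.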

More generally:

(and as noted in comments) for "modern" devices, if CAS is not supported then some form of "LL/SC" probably is...

...and a CAS operation can be synthesized using LL/SC. [FWIW: LL/SC is more general than CAS and avoids the dreaded ABA problem that straight CAS is prone to :-(]

but otherwise, once you have a spin-lock you can simulate most things...

...but if the thread holding a spin-lock goes to sleep, everybody gets to wait :-(

Machines (now historic) which provide neither LL/SC nor hardware support for a spin-lock may well have sequentially-consistent memory, in which case you can implement a spin-lock in software using Peterson's Algorithm, or Burns', or others.
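For illustration, here is a sketch of Peterson's Algorithm for exactly two threads (ids 0 and 1). It uses C11 atomics with their default sequentially-consistent ordering as a stand-in for sequentially-consistent memory; with plain loads and stores on a modern weakly-ordered machine it would not be safe. The type and function names are made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-thread Peterson lock. */
typedef struct {
    atomic_bool interested[2];  /* interested[i]: thread i wants the lock */
    atomic_int  turn;           /* which thread must wait if both want it */
} peterson_lock;

void peterson_init(peterson_lock *l) {
    atomic_init(&l->interested[0], false);
    atomic_init(&l->interested[1], false);
    atomic_init(&l->turn, 0);
}

/* self must be 0 or 1. */
void peterson_acquire(peterson_lock *l, int self) {
    int other = 1 - self;
    atomic_store(&l->interested[self], true);  /* announce intent */
    atomic_store(&l->turn, other);             /* give priority to the other thread */
    /* Wait while the other thread wants the lock and has priority. */
    while (atomic_load(&l->interested[other]) && atomic_load(&l->turn) == other) {
        /* spin */
    }
}

void peterson_release(peterson_lock *l, int self) {
    atomic_store(&l->interested[self], false);
}
```

Note that only ordinary loads and stores are needed, no read-modify-write instruction at all, which is the point of using it on such machines.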