20 - Concurrency (4)

Outline

1) Summary
2) Mutual Exclusion
   a) Lock Variables with Hardware
   b) Strict Alternation
   c) Peterson's Solution

Announcements

- Mid-semester feedback survey is out.
  2 extra points on the midterm for 70% completion (by section)
- Reading: MOS 2.4.1 - 2.4.3

Summary

We saw last time a first attempt at guarding the critical section with software.

    void enter_critical()
    {
        while (lock != 0);
        lock = 1;
    }

    void exit_critical()
    {
        lock = 0;
    }

Problem: now the race condition is on the lock!

proc A and proc B each run:

    while (1) {
        while (lock != 0);
        lock = 1;
        do_critical();
        lock = 0;
        do_noncritical();
    }

If proc A is preempted after it sees lock == 0 but before it sets lock = 1, proc B can pass the same test: now both in critical section!

The lock variable uses busy-waiting to guard entry to the critical section.
- This is also called a spinlock.
- Wastes CPU time, so should only be used if wait time is guaranteed to be very short.

Hardware Lock Variables

Lock variables require hardware support to avoid the race condition.
Generally 2 options:
- TSL (Test-and-Set Lock) Instruction
- XCHG (Exchange/Swap) Instruction

TSL (Test-and-Set Lock) Instruction

    TSL reg, mem_address

Atomic read-modify-write. In a single step:
- copies current value at address to register
- sets value at address to True (non-zero)

CPU designers implement atomicity in hardware
- used to only lock the memory bus
- now requires modification to the cache coherency protocol

The entry and exit procedures become:

    enter_critical:
        TSL r1, lock_addr
        CMP r1, #0
        JNE enter_critical

    exit_critical:
        MOV lock_addr, #0

(it's still a spinlock)

Multiple processes now look like (proc A and proc B each run):

    while (1) {
    enter_critical:
        asm(TSL r1, lock);
        asm(CMP r1, #0);
        asm(JNE enter_critical);
        do_critical();
    exit_critical:
        asm(MOV lock, #0);
        do_noncritical();
    }

- if proc A runs first, TSL will set the lock to TRUE and set r1 to 0 (lock initialized to FALSE), and proc A will be able to enter its critical section
- if proc B preempts proc A, the result of TSL will have TRUE in r1, and proc B will continue looping
- the race condition is prevented no matter where preemption occurs in the lock checking
- proc B can continue after proc A unsets the lock

XCHG (Exchange/Swap) Instruction

    XCHG reg, lock_address

Alternative atomic read-modify-write:
- Swaps contents of register and memory address
- Slightly more flexible since the value can be chosen
- More or less the same as TSL

The entry and exit procedures:

    enter_critical:
        MOV r1, #1
        XCHG r1, lock
        CMP r1, #0
        JNE enter_critical

    exit_critical:
        MOV lock, #0

(again, still a spinlock)

Revisit our solution criteria:
1) Mutual Exclusion
2) No assumptions of CPU speed or # of cores
3) Progress (No Lockout)
4) Bounded Waiting (No Starvation)

Advantages:
- No race conditions
- Simple

Disadvantages:
- Requires hardware support (instruction set)
    - Not an unreasonable assumption these days
- Multicore scalability
    - Cache coherency hardware protocol becomes more complex with more cores and cache levels
    - Hurts performance if the lock is used often
- Starvation is theoretically possible
    - statistically unlikely, so works well enough in practice (think geometric distribution)

Strict Alternation

Can we get mutual exclusion without hardware support?

Idea: Use a single "turn" variable to indicate which process is currently allowed to enter its critical section.
- "turn" indicates which process gets to enter
    - e.g. turn = 0 => process 0 can enter, etc.
- processes strictly alternate execution
    - e.g. process 0 before 1, 1 before 2, etc.

Example

proc 0:

    while (1) {
        while (turn != 0);
        do_critical();
        turn = 1;
        do_noncritical();
    }

proc 1:

    while (1) {
        while (turn != 1);
        do_critical();
        turn = 0;
        do_noncritical();
    }

Example: more processes

proc 0:

    while (1) {
        while (turn != 0);
        do_critical();
        turn = 1;
        do_noncritical();
    }

proc 1:

    while (1) {
        while (turn != 1);
        do_critical();
        turn = 2;
        do_noncritical();
    }

proc 2:

    while (1) {
        while (turn != 2);
        do_critical();
        turn = 0;
        do_noncritical();
    }

Criteria:
1) Mutual Exclusion (No Race Condition)
2) No assumptions of CPU speed or # of cores
3) Progress (No Lockout)
4) Bounded Waiting (No Starvation)

Strict alternation fails Progress: a process that is not trying to enter its critical section can still hold the turn and block the others.

Peterson's Solution

Gary L. Peterson. "Myths About the Mutual Exclusion Problem". 1981.

Idea: add a "flags" array to the "turn" variable
- "flags" indicates if a process wants to enter
- Each process indicates it wants to enter its critical section, then tries to give the other process a turn first.
- A process busy-waits while it is the other process's turn and that process actually wants to use its turn.

The basic form (2 processes):

proc 0:

    while (1) {
        flags[0] = TRUE;
        turn = 1;
        while (turn == 1 && flags[1]);
        do_critical();
        flags[0] = FALSE;
        do_noncritical();
    }

proc 1:

    while (1) {
        flags[1] = TRUE;
        turn = 0;
        while (turn == 0 && flags[0]);
        do_critical();
        flags[1] = FALSE;
        do_noncritical();
    }

Peterson's algorithm was designed for 2 processes, but has been generalized to arbitrarily many with a little hardware support.
- See Filter Algorithm
- Analogy: processes race through waiting rooms.

Peterson's is correct under the assumption of a sequentially consistent memory model.
- Computation of multiple threads is identical to an interleaving in preserved order
- Violated in many modern architectures (cache coherence, out-of-order execution)

Note: there are other correct algorithms under the same assumption (e.g. Dekker's algorithm, Lamport's algorithm, ...)

Criteria:
1) Mutual Exclusion (No Race Condition)
2) No assumptions of CPU speed or # of cores
3) Progress (No Lockout)
4) Bounded Waiting (No Starvation)

Advantages:
- No race conditions*
- Satisfies all criteria*
- Fully software algorithm*

Disadvantages:
- *Thwarted by modern hardware
    - Multicore cache coherency, out-of-order execution
    - And still ultimately dependent on some hardware support
- Generalization is non-trivial
- Still a busy-waiting algorithm
    - Wasted CPU cycles...

Summary

Solution Criteria:
1) Mutual Exclusion (No Race Condition)
    - at most one process in critical section at a time
2) No assumptions of CPU speed or # of cores
    - correctness independent of execution speed or number of cores
3) Progress (No Lockout)
    - a process should not block other processes if it is not attempting to enter its critical section
4) Bounded Waiting (No Starvation)
    - a process must be guaranteed to enter its critical section within a fixed time

Busy Waiting Solutions:
- Incorrect solutions:
    - Software-only lock variables
    - Strict alternation
- Correct solutions:
    - Hardware-supported lock variables
    - Peterson's solution*
- Always waste time

Solutions to race conditions do not necessarily solve the Priority Inversion Problem.
- a lower priority process can block a higher priority one while in its critical section
- complicated, but usually resolved with temporary priority elevation
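
The TSL entry and exit procedures from the hardware lock variable slides can be tried out in portable C. The sketch below is only an illustration, not the course's reference code: it assumes C11 `<stdatomic.h>` and POSIX threads are available, and it uses the standard `atomic_flag_test_and_set`, which (like TSL) returns the old value and sets the flag in one atomic step. The names `lock_flag`, `worker`, `run_spinlock_demo`, and `ITERATIONS` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define ITERATIONS 100000   /* hypothetical workload size per thread */

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;  /* clear means unlocked */
static long shared_counter = 0;                   /* guarded by lock_flag */

/* Spin until the old value of the flag was clear (i.e. TSL returned 0). */
static void enter_critical(void)
{
    while (atomic_flag_test_and_set(&lock_flag)) {
        /* busy-wait: it's still a spinlock */
    }
}

/* Equivalent of MOV lock, #0. */
static void exit_critical(void)
{
    atomic_flag_clear(&lock_flag);
}

/* Each thread plays the role of proc A or proc B. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        enter_critical();
        shared_counter++;        /* the critical section */
        exit_critical();
    }
    return NULL;
}

/* Runs two workers and returns the final counter value. */
long run_spinlock_demo(void)
{
    pthread_t a, b;
    int rc;

    shared_counter = 0;
    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

Calling `run_spinlock_demo()` from a `main` and comparing the result with `2 * ITERATIONS` checks that no increment was lost. Depending on the toolchain, the program may need to be built with `-pthread`.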
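
Since Peterson's solution is only correct under sequential consistency, one way to experiment with it on modern hardware is to make `flags` and `turn` C11 atomics, whose default operations are sequentially consistent. This is a minimal sketch under that assumption, not Peterson's original formulation: it relies on C11 `<stdatomic.h>` and POSIX threads, and the names `peterson_enter`, `peterson_exit`, `peterson_worker`, `run_peterson_demo`, and `ROUNDS` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define ROUNDS 100000   /* hypothetical number of entries per thread */

static atomic_bool flags[2];    /* flags[i]: process i wants to enter */
static atomic_int turn;         /* whose turn it is to go first */
static long counter = 0;        /* guarded by the algorithm */

/* Announce interest, give the other process the turn, then wait. */
static void peterson_enter(int self)
{
    int other = 1 - self;
    atomic_store(&flags[self], true);
    atomic_store(&turn, other);
    /* Busy-wait while it is the other's turn and the other wants in. */
    while (atomic_load(&turn) == other && atomic_load(&flags[other])) {
        /* spin */
    }
}

static void peterson_exit(int self)
{
    atomic_store(&flags[self], false);
}

static void *peterson_worker(void *arg)
{
    int self = *(int *)arg;
    for (int i = 0; i < ROUNDS; i++) {
        peterson_enter(self);
        counter++;               /* the critical section */
        peterson_exit(self);
    }
    return NULL;
}

/* Runs process 0 and process 1 and returns the final counter value. */
long run_peterson_demo(void)
{
    pthread_t threads[2];
    int ids[2] = {0, 1};
    int rc;

    counter = 0;
    atomic_store(&flags[0], false);
    atomic_store(&flags[1], false);
    atomic_store(&turn, 0);
    for (int i = 0; i < 2; i++) {
        rc = pthread_create(&threads[i], NULL, peterson_worker, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++) {
        pthread_join(threads[i], NULL);
    }
    return counter;
}
```

Replacing the atomics with plain `int` variables reproduces the slide version, which compilers and out-of-order hardware are free to reorder. The sequentially consistent stores and loads used here compile to fence or locked instructions on common architectures, which is the hardware dependence the disadvantages list refers to.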