Detailed Java thread safety and lock mechanism

Critical section: 

A critical section is a program segment that accesses shared resources (such as a shared device or shared memory) that must not be accessed by multiple threads at the same time. When one thread is inside the critical section, other threads or processes must wait (for example, using a bounded-waiting scheme). A synchronization mechanism must be applied at the entry and exit of the critical section to ensure that the shared resources are accessed under mutual exclusion, for example a semaphore. A printer is a typical device that can only be accessed by a single thread at a time.

Mutex:

A mutex is a variable that can be in one of two states: unlocked or locked. A single bit would suffice to represent it, but in practice an integer is often used: 0 means unlocked and any other value means locked. A mutex comes with two procedures. When a thread (or process) needs to enter the critical section, it calls mutex_lock. If the mutex is currently unlocked (that is, the critical section is available), the call succeeds and the calling thread may enter the critical section.
On the other hand, if the mutex is already locked, the calling thread is blocked until the thread inside the critical section finishes and calls mutex_unlock. If multiple threads are blocked on the mutex, one of them is chosen at random and allowed to acquire the lock.
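A minimal sketch of this lock/unlock protocol in Java (the SpinMutex class is illustrative, not from the original text), using an AtomicInteger where 0 means unlocked and 1 means locked; unlike the blocking mutex described above, this simplified version spins instead of blocking:

    import java.util.concurrent.atomic.AtomicInteger;

    // 0 = unlocked, any other value = locked, as in the description above.
    class SpinMutex {
        private final AtomicInteger state = new AtomicInteger(0);

        void lock() {                            // the mutex_lock procedure
            while (!state.compareAndSet(0, 1)) { // succeeds only when unlocked
                Thread.yield();                  // spin instead of blocking
            }
        }

        void unlock() {                          // the mutex_unlock procedure
            state.set(0);                        // any spinning thread may now acquire
        }
    }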

Monitor:

A monitor is a program structure in which the subprograms (modules or objects) formed inside the structure are accessed by multiple worker threads under mutual exclusion. The shared resources involved are generally hardware devices or a group of variables.
A monitor guarantees that, at any point in time, at most one thread is executing one of its subroutines. Compared with concurrent programming that achieves mutually exclusive access by hand-modifying data structures, the monitor greatly simplifies programming: the various hardware and software resources in a system can be described abstractly by a data structure, that is, the resource is characterized by a small amount of information and the operations performed on it, while its internal structure and implementation details are ignored.
A shared data structure is used to abstractly represent the shared resources in the system, and the operations performed on that shared data structure are defined as a set of procedures.
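In Java, every object carries a built-in monitor, and synchronized instance methods of one object all enter that same monitor; a small illustrative sketch (the BankAccount class is an assumption, not from the source):

    // At most one thread at a time executes ANY synchronized method of a
    // given BankAccount instance: the object's monitor provides the
    // mutual exclusion the text describes.
    class BankAccount {
        private long balance; // the shared data the monitor protects

        public synchronized void deposit(long amount) {
            balance += amount;
        }

        public synchronized long getBalance() {
            return balance;
        }
    }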



Semaphore:

A semaphore is a facility used in a multithreaded environment to ensure that two or more critical code segments are not executed concurrently. Before entering a critical code segment, a thread must acquire the semaphore; once the critical code segment is finished, the thread must release it. Other threads wanting to enter the same critical code segment must wait until the first thread releases the semaphore. To implement this, a semaphore is created, and acquire and release operations are placed at the beginning and end of each critical code segment, all referring to the semaphore that was originally created.
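A brief sketch of this acquire/release protocol using java.util.concurrent.Semaphore (the PrinterPool class and its one-permit setup are illustrative assumptions; a larger permit count would admit that many threads at once):

    import java.util.concurrent.Semaphore;

    class PrinterPool {
        private final Semaphore permits = new Semaphore(1); // one printer

        void print(String job) throws InterruptedException {
            permits.acquire();            // block until the semaphore is available
            try {
                System.out.println("printing " + job); // critical code segment
            } finally {
                permits.release();        // always release, even on exception
            }
        }
    }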

CAS operation (Compare-and-Swap): 

CAS takes three operands: the memory value V, the expected old value A, and the new value B. If and only if the expected value A equals the memory value V is V updated to B; otherwise nothing is done. More details: zl198751.iteye.com/blog/184857...
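A quick demonstration of these semantics with java.util.concurrent.atomic.AtomicInteger, whose compareAndSet is a thin wrapper over the hardware CAS instruction:

    import java.util.concurrent.atomic.AtomicInteger;

    public class CasDemo {
        public static void main(String[] args) {
            AtomicInteger v = new AtomicInteger(5);     // memory value V = 5
            System.out.println(v.compareAndSet(5, 8));  // A == V: V becomes 8, prints true
            System.out.println(v.compareAndSet(5, 9));  // A != V (V is 8): prints false
            System.out.println(v.get());                // 8
        }
    }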



Reordering:

To improve performance, the compiler and the processor may reorder instructions when a program executes, which increases the apparent concurrency of the program and thus its performance. For multithreaded programs, however, reordering can make the program produce results we do not expect. Reordering happens at two levels, the compiler and the processor, and processor reordering further includes instruction-level reordering and memory-system reordering.

1. Thread and memory interaction

 

All variables (instance fields, static fields, and the elements that make up array objects, but not local variables or method parameters) are stored in main memory. Each thread has its own working memory, which holds copies of the main-memory variables the thread uses. All operations a thread performs on variables must happen in its working memory; a thread cannot read or write variables in main memory directly. Nor can threads access each other's working memory: variable values are passed between threads only through main memory.

The Java memory model defines eight operations: 

 

 

  • lock: acts on a variable in main memory; it marks the variable as exclusively owned by one thread.
  • unlock: acts on a variable in main memory; it releases a variable that is in the locked state so that other threads can lock it.
  • read: acts on a variable in main memory; it transfers the variable's value from main memory into the thread's working memory for the subsequent load action.
  • load: acts on a variable in working memory; it puts the value obtained by the read operation into the working-memory copy of the variable.
  • use: acts on a variable in working memory; it passes the variable's value from working memory to the execution engine.
  • assign: acts on a variable in working memory; it assigns a value received from the execution engine to a variable in working memory.
  • store: acts on a variable in working memory; it transfers the variable's value from working memory to main memory for the subsequent write operation.
  • write: acts on a variable in main memory; it writes the value obtained by the store operation into the variable in main memory.


Functions of the volatile keyword:

1) It guarantees that a new value is synchronized back to main memory immediately, and that the variable is refreshed from main memory immediately before each use.

2) Prohibit instruction reordering optimization. 

Note: the volatile keyword alone cannot guarantee correct operations on shared data in a multithreaded environment. Use it when all threads must learn about a state change immediately after it happens.
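A typical example of exactly this use case is a volatile stop flag (the Worker class is illustrative):

    // One thread signals a state change; all other threads must see it
    // immediately. volatile guarantees the visibility; nothing else is
    // being updated, so no further synchronization is needed.
    class Worker implements Runnable {
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                // ... do work ...
            }
        }

        void shutdown() {
            running = false; // immediately visible to the worker thread
        }
    }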

2. The three characteristics of concurrency



Atomicity

Atomicity refers to the smallest, indivisible unit of operation, that is, a single machine instruction. At any moment only one thread can perform a given atomic operation, so atomic operations are thread-safe.
In the Java memory model, the six operations read, load, assign, use, store, and write guarantee atomic access to variables.

For the two 64-bit data types long and double, the Java virtual machine is not required to make the read, load, store, and write operations atomic; this is the so-called non-atomic treatment of long and double. In practice, however, current commercial Java virtual machines all implement these four operations atomically for long and double, so reads and writes of basic data types in Java can be regarded as atomic operations.

Atomicity over a wider scope must be guaranteed by the lock and unlock operations, that is, by synchronized blocks.
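For instance, count++ compiles to a read, an add, and a write, so it is not atomic even though each individual read and write is; a sketch of restoring atomicity with a synchronized block (the Counter class is illustrative):

    class Counter {
        private int count;

        void increment() {
            synchronized (this) {   // compiled to monitorenter/monitorexit
                count++;            // read-add-write, now atomic as a unit
            }
        }
    }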

Visibility 

Visibility means that when a thread modifies the value of a shared variable, other threads can immediately learn about the modification. 
The Java memory model achieves visibility by synchronizing the new value back to the main memory after the variable is modified, and refreshing the variable value from the main memory before the variable is read. 

Visibility is guaranteed in Java by three keywords: volatile, final, and synchronized:

 

 

 

  • volatile: guarantees visibility by refreshing variable values, as described above.
  • synchronized: before a synchronization block locks a variable, the working-memory copy of the variable is cleared and its value is re-read from main memory; before unlocking, the value is synchronized back to main memory. This guarantees visibility.
  • final: once a final field has been initialized in the constructor, and the constructor has not let the this reference escape, the value of the final field is visible to other threads and can be accessed correctly without synchronization (as sketched below).
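A small sketch of the final-field guarantee (the Range class is illustrative); note that the constructor must not let this escape:

    // Once the constructor finishes without leaking `this`, other threads
    // see min and max correctly without any synchronization.
    class Range {
        private final int min;
        private final int max;

        Range(int min, int max) {
            this.min = min;
            this.max = max;
            // do NOT pass `this` to anything here, or the guarantee is lost
        }

        int min() { return min; }
        int max() { return max; }
    }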


Orderliness

Within a single thread, all operations appear to execute in order; observed from another thread, operations appear to execute out of order, because of the delay in synchronization between working memory and main memory.

Java uses the volatile and synchronized keywords to ensure the orderliness of operations between threads. 

 

 

  • volatile achieves ordering by prohibiting instruction reordering optimizations (see the double-checked locking sketch below).
  • synchronized achieves ordering by allowing only one thread at a time to lock a given variable.
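A classic illustration of ordering is double-checked locking, which is broken without volatile because the write to instance may be reordered to before the constructor completes; a sketch, assuming Java 5+ volatile semantics:

    class Singleton {
        private static volatile Singleton instance;

        private Singleton() { }

        static Singleton getInstance() {
            if (instance == null) {                 // first check, no lock
                synchronized (Singleton.class) {
                    if (instance == null) {         // second check, with lock
                        instance = new Singleton(); // volatile forbids reordering
                    }
                }
            }
            return instance;
        }
    }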


3. The implementation of Java threads

There are three ways to implement threads:

Kernel thread (Kernel Thread)

A kernel thread (Kernel-Level Thread, KLT) is a thread supported directly by the operating system kernel: the kernel switches between threads through its scheduler and is responsible for mapping the threads' work onto the processors. Programs generally do not use kernel threads directly; instead they use a high-level interface to kernel threads, the lightweight process (Light Weight Process, LWP). Lightweight processes are threads in the usual sense. Each lightweight process is backed by one kernel thread, so lightweight processes exist only where kernel threads are supported first. This 1:1 relationship between lightweight processes and kernel threads is called the one-to-one threading model (Windows and Linux use this model). Because each lightweight process consumes some kernel resources (such as the stack space of its kernel thread) and system calls are relatively expensive, the number of lightweight processes a system can support is limited.

User thread (User Thread)

Broadly speaking, any thread that is not a kernel thread can be regarded as a user thread (User Thread, UT). In the narrow sense, a user thread is built entirely on a thread library in user space; the system kernel cannot perceive that the threads exist. Creation, synchronization, destruction, and scheduling of user threads are completed entirely in user mode, without help from the kernel. If implemented properly, such threads never need to switch into kernel mode, so operations on them can be very fast and cheap, and much larger numbers of threads can be supported; the multithreading in some high-performance databases is implemented with user threads. The 1:N relationship between a process and its user threads is called the one-to-many threading model.

The advantage of user threads is that they need no kernel support; the disadvantage is that they get no kernel support either: every thread operation has to be handled by the user program itself, so programs implemented with user threads are generally more complicated, and fewer and fewer programs use user threads nowadays.

Mixed implementation (user threads plus lightweight processes)

This model has both user threads and lightweight processes. User threads are still built entirely in user space, while the lightweight processes supported by the operating system serve as a bridge between user threads and kernel threads. In this mixed mode the ratio of user threads to lightweight processes is not fixed, that is, M:N. Many Unix-family systems provide an M:N threading model implementation.

Java thread scheduling 

Before JDK 1.2, Java threads were implemented with user threads known as "green threads"; from JDK 1.2 on, the threading model was replaced by a native threading model based on the operating system. Therefore, in current JDK versions, the threading model supported by the operating system largely determines how Java virtual machine threads are mapped; there is no way to make this uniform across platforms. The Java Virtual Machine Specification does not mandate which threading model must be used.

There are two ways to schedule threads:

Cooperative: the execution time of a thread is controlled by the thread itself; after finishing its task, the thread actively notifies the system to switch to another thread. (Not recommended.)
Advantages: simple to implement; switches happen only at points the thread itself knows about, so there are no thread synchronization problems.
Disadvantages: execution time is uncontrollable; if a thread runs for a long time without yielding the CPU, the system may crash.

Preemptive: the execution time of each thread is allocated by the operating system, which hands out time slices; the thread that obtains a time slice runs, and when the slice is used up execution time is reallocated. Thread switching is not decided by the thread itself. (Java uses preemptive scheduling.)
Advantages: thread execution time is controlled by the system, so the system will not crash because one thread blocks.

4. The scheduling relationship of thread status in Java  
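The figure for this section is missing; as a small substitute sketch (assuming only standard java.lang.Thread APIs), the following program observes a few of the states defined by java.lang.Thread.State (NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED):

    public class StateDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(200);            // TIMED_WAITING while sleeping
                } catch (InterruptedException ignored) { }
            });
            System.out.println(t.getState());     // NEW
            t.start();
            Thread.sleep(50);
            System.out.println(t.getState());     // most likely TIMED_WAITING
            t.join();
            System.out.println(t.getState());     // TERMINATED
        }
    }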

 

5. Thread safety levels in Java

Immutable:

An immutable object can be a final basic type, or a final object whose behavior never affects its state. For example, String is immutable: substring returns a new String object. The various Number types such as BigInteger and BigDecimal are immutable as well; however, AtomicInteger and AtomicLong, which are also subtypes of Number, are not immutable. Their internal state is updated through Unsafe-based CAS operations, which guarantee atomicity, not immutability.

Absolute thread safety: 

regardless of the runtime environment, the caller does not need any additional synchronization measures. 

Relative thread safety:

This is thread safety in the usual sense: the class guarantees that each individual operation on the object is thread-safe, for example Vector, Hashtable, and the collections wrapped by Collections.synchronizedCollection. Compound operations on such objects may still need external synchronization, as shown below.
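A sketch of why "relative" matters (the lastOf helper is illustrative): size() and get() are each thread-safe on their own, but the check-then-act pair is not, since another thread may remove an element in between:

    import java.util.Vector;

    class VectorDemo {
        static Object lastOf(Vector<Object> v) {
            synchronized (v) {            // lock the whole compound operation
                int last = v.size() - 1;
                return last < 0 ? null : v.get(last);
            }
        }
    }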

Thread compatibility:

The object itself is not thread-safe, but safety can be achieved through synchronization at the call site. When we say a class is "not thread-safe", this is usually what we mean, for example ArrayList and HashMap.

Thread opposition: 

Code that cannot be used concurrently, regardless of whether the caller takes synchronization measures; Thread.suspend() and Thread.resume() used together are a classic example.

6. Thread-safe implementation

Mutual exclusion synchronization

When multiple threads access shared data, mutual exclusion guarantees that the data is used by only one thread at a time.
Critical sections, mutexes, and semaphores are all means of achieving this synchronization.
The most basic mutual exclusion primitive in Java is synchronized. After compilation it produces two bytecode instructions, monitorenter and monitorexit, each of which takes a reference-type parameter specifying the object to lock or unlock. The lock keeps a counter of how many times it has been acquired, and it must be released the same number of times before it returns to the unlocked state.
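A small sketch of that reentrancy (the Reentrant class is illustrative): the same thread takes the same monitor twice, and the counter ensures the lock is fully released only after the matching number of exits:

    class Reentrant {
        synchronized void outer() { // counter 0 -> 1 on entry
            inner();                // re-enters the monitor already held: 1 -> 2
        }                           // on return the counter drops back to 0

        synchronized void inner() { }
    }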

As already mentioned in the discussion of Java threads, Java threads are mapped onto the operating system's native threads, so blocking or waking a thread requires the operating system's help and a transition from user mode to kernel mode, which is very time-consuming; synchronized is therefore a heavyweight operation in the Java language. The virtual machine does apply some optimizations, such as inserting a spin-wait phase before actually blocking, to avoid frequent transitions into kernel mode.

The advantages of ReentrantLock compared to synchronized: 

 

 

  • Interruptible waiting: when the thread holding the lock does not release it for a long time, a waiting thread can choose to give up waiting.
  • Fair locking: a lock that is granted in the order in which threads requested it is called a fair lock. synchronized is an unfair lock; ReentrantLock can be made fair through its constructor, new ReentrantLock(boolean fair).
  • Binding multiple conditions: one lock can produce multiple Condition objects via newCondition(), which makes more complex thread coordination easy through await() and signal(); see the sketch below.
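A sketch combining the three points (the Mailbox class is illustrative): a fair lock, interruptible acquisition, and a dedicated condition for a "not empty" signal:

    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    class Mailbox {
        private final ReentrantLock lock = new ReentrantLock(true); // fair lock
        private final Condition notEmpty = lock.newCondition();
        private String message;

        String take() throws InterruptedException {
            lock.lockInterruptibly();    // a waiting thread may give up if interrupted
            try {
                while (message == null) {
                    notEmpty.await();    // wait on this specific condition
                }
                String m = message;
                message = null;
                return m;
            } finally {
                lock.unlock();
            }
        }

        void put(String m) {
            lock.lock();
            try {
                message = m;
                notEmpty.signal();       // wake one waiting taker
            } finally {
                lock.unlock();
            }
        }
    }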


Non-blocking synchronization

The main problem with mutual exclusion synchronization is the performance cost of blocking and waking threads, so it is also called blocking synchronization: a pessimistic concurrency strategy. With the development of hardware instruction sets we have another choice, an optimistic concurrency strategy based on conflict detection: perform the operation first; if no other thread contends for the shared data, the operation succeeds, and if there is contention, apply some compensation (most commonly, retrying until success). Many implementations of this optimistic strategy do not need to suspend threads, so this kind of synchronization is called non-blocking synchronization.

Such hardware instructions include:
1) Test-and-set
2) Fetch-and-add
3) Swap
4) Compare-and-swap (CAS)
5) Load-linked/store-conditional (LL/SC)

The last two are instructions added by modern processors. Since JDK 1.5, CAS operations are available in Java, exposed as wrappers around methods such as compareAndSwapInt() and compareAndSwapLong() in the sun.misc.Unsafe class. The virtual machine treats these methods specially: the just-in-time compiler turns them into platform-specific processor CAS instructions, so there is no method-call overhead and they can be considered unconditionally inlined.

Where i++ once needed explicit synchronization, a CAS-based class such as AtomicInteger now guarantees atomicity. CAS does, however, have the ABA problem, which can be solved with AtomicStampedReference (widely considered of limited practical value).
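A sketch of sidestepping ABA with AtomicStampedReference, where a version stamp changes on every update, so an A -> B -> A sequence no longer looks unchanged (values are illustrative; the small Integers used here are compared safely because of the Integer cache):

    import java.util.concurrent.atomic.AtomicStampedReference;

    public class AbaDemo {
        public static void main(String[] args) {
            AtomicStampedReference<Integer> ref =
                    new AtomicStampedReference<>(100, 0);      // value 100, stamp 0

            int stamp = ref.getStamp();
            ref.compareAndSet(100, 101, stamp, stamp + 1);     // A -> B, stamp 0 -> 1
            ref.compareAndSet(101, 100, stamp + 1, stamp + 2); // B -> A, stamp 1 -> 2

            // A CAS that remembers the old stamp now fails, exposing the change.
            boolean ok = ref.compareAndSet(100, 102, stamp, stamp + 1);
            System.out.println(ok); // false: value matches but stamp does not
        }
    }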

No synchronization 

Some code is inherently thread-safe and does not require synchronization. There are two categories as follows: 

Reentrant code: pure code that depends on no data stored on the heap and no shared system resources, takes all its state through parameters, and calls no non-reentrant methods. Its return result is predictable.

Thread-local storage: confine the visibility of shared data to a single thread, so no synchronization is needed and there is no data contention between threads. In Java, thread-local storage is provided by the java.lang.ThreadLocal class.
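A common sketch of such thread confinement via ThreadLocal, giving each thread its own non-thread-safe SimpleDateFormat (the DateFormatter class is illustrative; requires Java 8+ for withInitial):

    import java.text.SimpleDateFormat;
    import java.util.Date;

    class DateFormatter {
        private static final ThreadLocal<SimpleDateFormat> FMT =
                ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

        static String format(Date d) {
            return FMT.get().format(d); // each thread sees only its own copy
        }
    }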

7. The lock mechanism in Java

Pessimistic locking

Pessimistic locking assumes that concurrency conflicts will occur and blocks every operation that might violate data integrity: it assumes the probability that another thread will access or change the object you are working on is high, so the object is locked before you start changing it, and the lock is released only after you commit the changes.

Optimistic locking 

Optimistic locking assumes that no concurrency conflicts will occur, so it acquires and releases cheaply, checking for conflicts only when they actually arise.

Spin locks and adaptive spinning

Suspending and resuming a thread both require a transition into kernel mode, and these operations put great pressure on the concurrency performance of a system. In many applications, shared data stays locked only for a very short period of time, and it is not worthwhile to suspend and resume threads for such a short time. Instead, a thread requesting the lock can be made to wait a moment without giving up its processor time, by executing a busy loop (a spin).

By default a spin lock spins 10 times; this can be changed with the -XX:PreBlockSpin parameter.

Adaptive spin means that the spin time is no longer fixed, but is determined by the previous spin time on the same lock and the state of the lock owner. 

Lock elimination

When the virtual machine's just-in-time compiler runs, it eliminates locks on code that requires synchronization but is detected to have no possible contention on shared data. The main evidence for lock elimination comes from escape analysis. See the sketch below.
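A sketch of code eligible for lock elimination (the concat method is illustrative): StringBuffer.append is synchronized, but sb never escapes this method, so escape analysis lets the JIT drop those locks entirely:

    class ConcatDemo {
        static String concat(String a, String b, String c) {
            StringBuffer sb = new StringBuffer(); // local, never escapes:
            sb.append(a);                         // its lock can never be
            sb.append(b);                         // contended, so the JIT
            sb.append(c);                         // may remove it
            return sb.toString();
        }
    }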

Lock coarsening

If the virtual machine detects a series of consecutive operations that repeatedly lock and unlock the same object, it widens (coarsens) the scope of lock synchronization to cover the whole operation sequence, in order to reduce the performance cost of repeatedly acquiring and releasing the lock. See the sketch below.
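A sketch of a pattern eligible for lock coarsening (the appendAll method is illustrative): each append locks and unlocks the same object, and the JIT may hoist a single lock around the whole loop:

    class CoarsenDemo {
        static void appendAll(StringBuffer sb, String s, int n) {
            for (int i = 0; i < n; i++) {
                sb.append(s); // repeated lock/unlock on the same object;
            }                 // may be coarsened into one lock for the loop
        }
    }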

Lock upgrading

Java SE 1.6 introduced "biased locks" and "lightweight locks", so in Java SE 1.6 a lock can be in four states: unlocked, biased, lightweight, and heavyweight. A lock escalates gradually as contention increases. Locks can be upgraded but not downgraded: once a biased lock has been upgraded to a lightweight lock, it cannot revert to being biased. This upgrade-only policy is meant to improve the efficiency of acquiring and releasing locks.

 

Biased lock

The HotSpot authors found through research that in most cases a lock not only suffers no multithreaded contention but is always acquired multiple times by the same thread; biased locks were introduced to make lock acquisition cheaper for that thread. When a thread accesses a synchronized block and acquires the lock, the ID of the thread the lock is biased toward is stored in the lock record in the object header and the stack frame. Afterwards, that thread no longer needs CAS operations to lock and unlock when entering and leaving the synchronized block; it simply tests whether the Mark Word of the object header contains a bias pointing to the current thread. If the test succeeds, the thread has the lock. If it fails, the thread then tests whether the biased-lock flag in the Mark Word is set to 1 (indicating the lock is currently biased): if not, CAS is used to compete for the lock; if so, the thread attempts to use CAS to point the object header's bias to itself.

Revoking a biased lock: a biased lock uses a mechanism that releases the lock only when contention arises, so a thread holding a biased lock releases it only when some other thread tries to compete for it. Revocation must wait for a global safepoint (a point at which no bytecode is executing). The thread holding the biased lock is first suspended, then checked for liveness: if it is not active, the object header is set to the unlocked state; if it is still alive, the stack holding the biased lock is walked and the lock records of the biased object are traversed, after which the lock records in the stack and the Mark Word of the object header are either re-biased to another thread, reverted to the unlocked state, or marked as unsuitable for biasing; finally the suspended thread is woken up. In the figure below, thread 1 demonstrated the biased-lock initialization process and thread 2 demonstrated the biased-lock revocation process.

 

Turning off biased locking: biased locking is enabled by default in Java 6 and Java 7, but it is activated only a few seconds after the application starts. If necessary, the delay can be removed with the JVM parameter -XX:BiasedLockingStartupDelay=0. If you are sure that all the locks in your application are usually contended, you can turn off biased locking with the JVM parameter -XX:-UseBiasedLocking, in which case lightweight locking is used by default.

Lightweight lock

Lightweight locking: before a thread executes a synchronized block, the JVM first creates space for a lock record in the current thread's stack frame and copies the object header's Mark Word into it (officially called the Displaced Mark Word). The thread then attempts to use CAS to replace the Mark Word in the object header with a pointer to the lock record. If that succeeds, the current thread owns the lock; if it fails, other threads are competing for the lock, and the current thread tries to acquire it by spinning.

Lightweight unlocking: an atomic CAS operation is used to put the Displaced Mark Word back into the object header. If it succeeds, no contention occurred; if it fails, the lock is contended, and it inflates into a heavyweight lock. The figure below showed two threads competing for the lock at the same time, causing lock inflation.

 

Because spinning consumes CPU, once a lock has been upgraded to a heavyweight lock it is never restored to the lightweight state, in order to avoid useless spinning (for example, when the thread holding the lock is blocked). While a lock is in this state, any other thread trying to acquire it blocks; when the owner releases the lock these threads are woken, and the awakened threads begin a new round of competition for the lock.

Heavyweight lock:

The heavyweight lock is also called the object monitor (Monitor) in the JVM. It contains at least a queue of threads contending for the lock and a wait queue of blocked threads; the former provides mutual exclusion and the latter is used for thread cooperation.
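A sketch of both queues at work using the built-in monitor methods (the Latch class is illustrative): wait() parks the owner in the wait set, while threads that fail to enter the synchronized block sit in the contention queue:

    class Latch {
        private boolean released;

        synchronized void await() throws InterruptedException {
            while (!released) {
                wait();      // join the monitor's wait set, releasing the lock
            }
        }

        synchronized void release() {
            released = true;
            notifyAll();     // move waiters back to compete for the lock
        }
    }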

Reference for the biased lock and lightweight lock concepts: www.infoq.com/cn/articles...