Java Concurrent Programming Topic One (Challenges and Low-Level Implementation)

Part One: The Challenges of Concurrent Programming

The purpose of concurrent programming is to make programs run faster, but simply starting more threads does not maximize concurrent execution or performance.

1. Context switching

The CPU creates the illusion of concurrency by allocating time slices to threads. A time slice is the slice of CPU time allocated to each thread; because slices are very short, generally tens of milliseconds (ms), the CPU switches constantly between threads, making it feel as if multiple threads run simultaneously. The CPU cycles through tasks via a time-slice allocation algorithm: after the current task runs for one slice, the CPU switches to the next task, saving the state of the previous task first so that the task's state can be restored the next time it is switched back in. This process, from saving a task's state to reloading it, is one context switch.

Is multithreading necessarily faster? Not necessarily, because threads carry creation and context-switch overhead.

How to reduce context switching:

  1. Lock-free concurrent programming. Context switches occur when multiple threads compete for locks, so lock use can be avoided when multiple threads process data. For example, the data IDs can be segmented with a hash algorithm so that different threads process different segments of the data.
  2. CAS algorithm. Java's Atomic package uses the CAS algorithm to update data without locking.
  3. Use as few threads as possible. Avoid creating unnecessary threads; for example, creating many threads for only a few tasks leaves large numbers of threads waiting.
  4. Coroutines: schedule multiple tasks within a single thread, and maintain the switching between these tasks in that single thread.
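As a sketch of point 1, here is a minimal example of hash-segmenting data IDs so that each worker thread owns a disjoint slice and needs no locks. The class and method names (SegmentDemo, process) are illustrative, not from the original text.

```java
import java.util.stream.IntStream;

// Sketch: avoid locks by hash-segmenting IDs so each thread owns a disjoint slice.
public class SegmentDemo {
    static long[] process(int[] ids, int threads) throws InterruptedException {
        long[] totals = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t;
            workers[t] = new Thread(() -> {
                for (int id : ids)
                    if (Math.floorMod(id, threads) == slot) // this thread's segment only
                        totals[slot] += id;                 // no lock needed: slot is private
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return totals;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] ids = IntStream.range(0, 100).toArray();
        long[] totals = process(ids, 4);
        long sum = 0;
        for (long x : totals) sum += x;
        System.out.println(sum); // 0 + 1 + ... + 99 = 4950
    }
}
```

Because each thread writes only its own slot of the totals array, the threads never contend for a lock and never force context switches on each other.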

2. Deadlock and how to avoid it

  1. Avoid having one thread acquire multiple locks at the same time.
  2. Avoid having one thread occupy multiple resources inside a lock; try to have each lock protect only one resource.
  3. Prefer timed locks: use lock.tryLock(timeout) instead of the intrinsic lock mechanism.
  4. For database locks, locking and unlocking must happen on the same database connection, otherwise unlocking will fail.
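Rule 3 above can be sketched with java.util.concurrent.locks.ReentrantLock. The class and method names (TryLockDemo, transfer) are illustrative: the point is that acquiring the second lock with a timeout lets the thread back off instead of deadlocking.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of rule 3: acquire locks with a timeout and back off on failure,
// instead of blocking forever inside a synchronized block.
public class TryLockDemo {
    static final ReentrantLock LOCK_A = new ReentrantLock();
    static final ReentrantLock LOCK_B = new ReentrantLock();

    static boolean transfer() throws InterruptedException {
        if (LOCK_A.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                if (LOCK_B.tryLock(100, TimeUnit.MILLISECONDS)) {
                    try {
                        return true;          // both locks held: do the work
                    } finally {
                        LOCK_B.unlock();
                    }
                }
            } finally {
                LOCK_A.unlock();
            }
        }
        return false;                          // gave up instead of deadlocking
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(transfer());        // true here: no contention
    }
}
```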

Part Two: The Underlying Implementation of Java's Concurrency Mechanisms

Java code compiles to Java bytecode; the class loader loads the bytecode into the JVM, and the JVM executes it, ultimately translating it into assembly instructions that execute on the CPU. The concurrency mechanisms Java uses therefore depend on the JVM implementation and on CPU instructions.


1. volatile

volatile is a lightweight synchronized that guarantees the "visibility" of shared variables in multiprocessor development: when one thread modifies a shared variable, another thread can read the modified value. It causes no thread context switching or scheduling.

1.1 How to achieve visibility

When a shared variable modified by volatile is written, a second line of assembly code appears:

0x01a3de1d: movb $0x0,0x1104800(%esi);
0x01a3de24: lock addl $0x0,(%esp);

An instruction with the LOCK prefix does two things:

1) It writes the data of the current processor's cache line back to system memory.
2) This write-back invalidates the data cached for that memory address in other CPUs.

To improve processing speed, the processor does not communicate with memory directly; it first reads data from system memory into its internal cache (L1, L2, or others) before operating on it, and it is not known when the result will be written back to memory. When a write is made to a variable declared volatile, the JVM sends the processor a LOCK-prefixed instruction that writes the cache line containing the variable back to system memory. But even after the write-back, other processors' cached values may still be stale, which would cause problems in later computations. Therefore, to keep every processor's cache consistent on multiprocessors, a cache coherency protocol is implemented: each processor sniffs the data propagated on the bus to check whether its own cached values are out of date. When a processor finds that the memory address backing one of its cache lines has been modified, it marks that cache line invalid; when it next operates on that data, it re-reads it from system memory into its cache.
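A minimal sketch of the visibility guarantee, using an assumed demo class VolatileFlag: the reader thread spins until the volatile write becomes visible, and the happens-before rule guarantees it then also sees the payload written before the flag.

```java
// Visibility sketch: a volatile flag published by one thread is guaranteed
// to become visible to another; the payload write happens-before the flag write.
public class VolatileFlag {
    static volatile boolean ready = false;
    static int payload = 0;

    static int runOnce() throws InterruptedException {
        ready = false;
        payload = 0;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) { /* spin until the volatile write becomes visible */ }
            seen[0] = payload;   // guaranteed 42: payload write happens-before ready=true
        });
        reader.start();
        payload = 42;            // ordinary write...
        ready = true;            // ...published by the volatile write
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce()); // 42
    }
}
```

Without the volatile modifier on ready, the reader could in principle spin forever on a stale cached value.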

1.2 Optimization of the use of volatile

Can appending padding up to 64 bytes improve the efficiency of concurrent programming?

The cache lines of the processor's L1, L2, and L3 caches are 64 bytes wide. If a queue's head node and tail node together occupy less than 64 bytes, the processor reads them into the same cache line, and on a multiprocessor each processor caches the same head and tail nodes. When one processor tries to modify the head node, the entire cache line is locked, and under the cache coherency mechanism other processors are prevented from accessing the tail node in their own caches. Since enqueue and dequeue operations constantly modify the head and tail nodes, this severely hurts enqueue/dequeue efficiency on multiprocessors. Doug Lea fills the cache line by padding the nodes out to 64 bytes, preventing the head and tail nodes from being loaded into the same cache line, so that modifying one does not lock out the other.
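The padding idea can be sketched as below. This is an illustration, not Doug Lea's actual code: the filler fields assume 64-byte cache lines and 8-byte longs, and final field layout is up to the JVM (on modern JDKs the @jdk.internal.vm.annotation.Contended annotation is the supported way to request isolation).

```java
// Sketch of cache-line padding: keep two hot counters on different cache lines
// by inserting seven filler longs (7 x 8 bytes) between them.
public class PaddedCounters {
    volatile long head;                      // hot field 1
    long p1, p2, p3, p4, p5, p6, p7;         // padding: pushes tail onto another line
    volatile long tail;                      // hot field 2

    static long bump(PaddedCounters c, int n) throws InterruptedException {
        // Each field is written by exactly one thread, so the result is deterministic;
        // the padding only affects how often the threads invalidate each other's caches.
        Thread a = new Thread(() -> { for (int i = 0; i < n; i++) c.head++; });
        Thread b = new Thread(() -> { for (int i = 0; i < n; i++) c.tail++; });
        a.start(); b.start();
        a.join(); b.join();
        return c.head + c.tail;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(bump(new PaddedCounters(), 1000)); // 2000
    }
}
```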

2. synchronized

2.1 How it is used

For ordinary synchronization methods, the lock is the current instance object.

For static synchronization methods, the lock is the Class object of the current class.

For synchronized blocks, the lock is the object specified in the synchronized parentheses.

When a thread attempts to access a synchronized code block, it must first obtain the lock, and must release the lock when it exits or throws an exception.
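The three lock targets described above can be shown in one illustrative class (the name SyncForms is hypothetical):

```java
// The three forms of synchronized and the lock each one uses.
public class SyncForms {
    private int n;

    public synchronized void instanceMethod() { n++; } // lock: this (the instance)

    public static synchronized void staticMethod() { } // lock: SyncForms.class

    public void block() {
        synchronized (this) {                          // lock: the object in parentheses
            n++;
        }
    }

    public int value() {
        synchronized (this) { return n; }
    }

    public static void main(String[] args) {
        SyncForms s = new SyncForms();
        s.instanceMethod();
        s.block();
        SyncForms.staticMethod();
        System.out.println(s.value()); // 2
    }
}
```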

2.2 Implementation principle

JVM implements method synchronization and code block synchronization based on entering and exiting the Monitor object, but the implementation details of the two are different.

Code-block synchronization is implemented with the monitorenter and monitorexit instructions, while method synchronization relies on the ACC_SYNCHRONIZED flag in the method's access flags.

After compilation, the monitorenter instruction is inserted at the beginning of the synchronized block, and monitorexit instructions are inserted at the end of the block and at its exception exit points. The JVM guarantees that every monitorenter has a matching monitorexit paired with it.

Any object has a monitor associated with it, and when a monitor is held, it will be in a locked state. When the thread executes the monitorenter instruction, it will try to obtain the ownership of the monitor corresponding to the object, that is, try to obtain the lock of the object.

2.3 Java object header

The lock used by synchronized is stored in the Java object header. If the object is an array, the JVM also stores the array length in the header.

3. Upgrade and comparison of locks

There are 4 lock states, from lowest to highest: the lock-free state, the biased-lock state, the lightweight-lock state, and the heavyweight-lock state. These states escalate gradually as contention increases; a lock can be upgraded but never downgraded.

3.1 Biased lock

Empirical observation: in most cases, a lock is not contended and is acquired repeatedly by the same thread.

Acquiring a biased lock:

When a thread accesses a synchronized block and acquires the lock, it stores the biased thread ID in the lock record in the object header and in the stack frame. From then on, the thread does not need CAS operations to lock and unlock when entering and exiting the block; it simply tests whether the object header's Mark Word holds a bias toward the current thread. If the test succeeds, the thread has acquired the lock. If it fails, the thread checks whether the biased-lock flag in the Mark Word is set to 1 (indicating biased mode): if it is not set, CAS is used to compete for the lock; if it is set, the thread attempts to use CAS to point the object header's bias at itself.

Revoking a biased lock:

A biased lock is released only when contention appears, so when another thread tries to compete for it, the holding thread releases the lock. Revocation requires waiting for a global safepoint (a point in time at which no bytecode is executing). The JVM first suspends the thread holding the biased lock and checks whether it is still alive. If it is not active, the object header is set to the lock-free state. If it is still alive, the stack holding the biased lock is walked and the lock records of the biased object are traversed; the lock records in the stack and the object header's Mark Word are then either re-biased to another thread, reverted to lock-free, or the object is marked as unsuitable for biasing. Finally the suspended thread is resumed.

Enabling biased locking:

Biased locking is enabled by default in Java 6 and Java 7, but it is only activated a few seconds after the application starts.

3.2 Lightweight lock

Locking process:

Before executing the synchronized block, the JVM creates space for a lock record in the current thread's stack frame and copies the Mark Word from the object header into it; this copy is officially called the Displaced Mark Word. The thread then attempts to use CAS to replace the Mark Word in the object header with a pointer to the lock record. If this succeeds, the current thread acquires the lock; if it fails, another thread is competing for the lock, and the current thread tries to acquire it by spinning.

Unlocking process:

During lightweight unlocking, an atomic CAS operation replaces the Displaced Mark Word back into the object header. If it succeeds, no contention occurred. If it fails, the lock is contended, and it inflates into a heavyweight lock.

Because spinning consumes CPU, and to avoid useless spinning (for example, when the lock-holding thread is blocked), once a lock is upgraded to a heavyweight lock it never reverts to the lightweight state. In that state, other threads attempting to acquire the lock are blocked; when the holder releases the lock, the blocked threads are woken up and begin a new round of competition for it.
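The spinning described here can be illustrated with a toy CAS-based spin lock. This is an analogy for how a lightweight lock spins on acquisition, not JVM-internal code; all names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy spin lock built on CAS, analogous to the spinning a lightweight lock
// performs before inflating to a heavyweight lock.
public class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        while (!held.compareAndSet(false, true)) { /* spin, burning CPU */ }
    }

    void unlock() {
        held.set(false);
    }

    static int counter = 0;

    static int run() throws InterruptedException {
        counter = 0;
        SpinLock lock = new SpinLock();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                lock.lock();
                try { counter++; } finally { lock.unlock(); }
            }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // 20000: the spin lock made counter++ atomic
    }
}
```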

4. How atomic operations are implemented

How the processor implements them:

  1. Bus locking: Suppose multiple processors read variable i from their own caches at the same time, each adds 1 to it, and each writes the result back to system memory. To guarantee that this read-modify-write of the shared variable is atomic, CPU1 must be able to prevent CPU2 from operating on the cache entry for that memory address while CPU1 reads and modifies the shared variable.

    The processor uses a bus lock to solve this problem. A bus lock uses the LOCK# signal provided by the processor: when one processor asserts this signal on the bus, requests from other processors are blocked, and that processor can monopolize the shared memory.

  2. Cache locking: A bus lock locks the communication between CPU and memory, so no other processor can touch data at any memory address while the bus is locked; bus locking is therefore expensive. Current processors instead use cache locking for optimization in some situations. A "cache lock" means that if the memory region is cached in the processor's cache line and stays locked for the duration of the Lock operation, then when the locked operation writes back to memory, the processor does not assert the LOCK# signal on the bus. Instead it modifies the memory address internally and relies on the cache coherency mechanism to guarantee atomicity: cache coherency prevents data in a memory region cached by two or more processors from being modified simultaneously, and when the data of a locked cache line is written back, the corresponding cache lines in other processors are invalidated.

How Java implements them:

  1. Use cyclic CAS to achieve atomic operations: the JVM's CAS operations are implemented with the CMPXCHG instruction provided by the processor.

    Since Java 1.5, the JDK concurrency package has provided classes that support atomic operations, such as AtomicBoolean (an atomically updated boolean), AtomicInteger (an atomically updated int), and AtomicLong (an atomically updated long). These atomic wrapper classes also provide useful utility methods, such as atomically incrementing or decrementing the current value by 1.

    Three problems with implementing atomic operations via CAS:

    1) ABA problem

    2) Long spin time brings high overhead

    3) Atomicity is guaranteed only for a single shared variable

  2. Use the lock mechanism to achieve atomic operations
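A sketch of a cyclic-CAS increment, plus AtomicStampedReference, the JDK class typically used to mitigate the ABA problem by pairing the value with a version stamp. The class name CasDemo is illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicStampedReference;

public class CasDemo {
    // Cyclic CAS: retry until the compare-and-set succeeds.
    static int casIncrement(AtomicInteger n) {
        for (;;) {
            int current = n.get();
            int next = current + 1;
            if (n.compareAndSet(current, next)) // maps to CMPXCHG on x86
                return next;
        }
    }

    public static void main(String[] args) {
        AtomicInteger n = new AtomicInteger(0);
        System.out.println(casIncrement(n));    // 1

        // ABA mitigation: a stamp (version) rides along with the value, so a
        // CAS succeeds only if BOTH the value and the stamp still match.
        AtomicStampedReference<Integer> ref = new AtomicStampedReference<>(100, 0);
        int stamp = ref.getStamp();
        boolean ok = ref.compareAndSet(100, 101, stamp, stamp + 1);
        System.out.println(ok);                 // true
    }
}
```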

Part Three: The Java Memory Model

Two key issues: how threads communicate with each other and how threads synchronize with each other.

Synchronization refers to the mechanism a program uses to control the relative order of operations between different threads. In the shared-memory concurrency model, synchronization is explicit: the programmer must explicitly specify that a certain method or piece of code requires mutual exclusion between threads. In the message-passing concurrency model, because a message must be sent before it can be received, synchronization is implicit.

Shared variables between threads are stored in main memory (Main Memory); each thread has a private local memory (Local Memory), which stores that thread's copies of the shared variables it reads and writes.

Instruction reordering

To improve performance when executing a program, compilers and processors often reorder instructions. There are three types of reordering:

1) Compiler-optimization reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.

2) Instruction-level parallel reordering. Modern processors use instruction-level parallelism (Instruction-Level Parallelism, ILP) to overlap the execution of multiple instructions. If there is no data dependency, the processor can change the order in which the machine instructions corresponding to statements execute.

3) Memory-system reordering. Because the processor uses caches and read/write buffers, load and store operations can appear to execute out of order.

Part Four: Java Concurrency Fundamentals

The smallest unit scheduled by a modern operating system is the thread, also called a Light Weight Process. Multiple threads can be created within a process; each has its own counter, stack, and local variables, and all can access the process's shared memory variables. The processor switches among these threads at high speed, making it feel as if they execute simultaneously.

1. Why use multithreading

  1. More processor cores: a single-threaded program uses only one processor core when running, so adding cores cannot significantly improve its execution efficiency. If, instead, the program uses multithreading to distribute its computation logic across multiple cores, its processing time drops significantly, and it becomes more efficient as more cores are added.
  2. Faster response time: creating an order involves inserting order data, generating an order snapshot, emailing the seller, recording the quantity of goods sold, and so on. With multithreading, operations with weak consistency requirements, such as generating the snapshot and sending the email, can be dispatched to other threads (a message queue also works). The benefit is that the thread responding to the user request finishes as quickly as possible, shortening response time and improving user experience.
  3. A better programming model
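Point 1 above can be sketched by splitting a computation across a thread pool so that extra cores can help. The class ParallelSum is a hypothetical example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.LongStream;

// Sketch: distribute a sum over worker threads; each worker owns one chunk.
public class ParallelSum {
    static long sum(long[] data, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int chunk = (data.length + threads - 1) / threads;
        @SuppressWarnings("unchecked")
        Future<Long>[] parts = new Future[threads];
        for (int t = 0; t < threads; t++) {
            final int lo = t * chunk, hi = Math.min(data.length, lo + chunk);
            parts[t] = pool.submit(() -> {
                long s = 0;
                for (int i = lo; i < hi; i++) s += data[i];
                return s;
            });
        }
        long total = 0;
        for (Future<Long> f : parts) total += f.get(); // combine partial sums
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        long[] data = LongStream.rangeClosed(1, 1000).toArray();
        System.out.println(sum(data, 4)); // 500500
    }
}
```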

2. Thread priority

Modern operating systems schedule running threads by time division: the OS hands out time slices, each thread is allocated several slices, and when a thread's slices are used up, a thread switch occurs and the thread waits for its next allocation.

In Java, a thread's priority is controlled by an integer member variable priority, ranging from 1 to 10; after the thread is constructed it can be changed with the setPriority(int) method, and the default is 5. Higher-priority threads are allocated more time slices than lower-priority ones. When setting priorities, threads that block frequently (sleeping or doing I/O) should be given higher priority, while computation-heavy threads (needing a lot of CPU time) should be given lower priority, so that the processor is not monopolized.
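A small illustration of the priority API described above; note that the actual scheduling effect of a priority is platform-dependent, since the JVM maps it onto OS priorities.

```java
// Reading and setting the priority field described above.
public class PriorityDemo {
    public static void main(String[] args) {
        Thread t = new Thread(() -> { /* no-op */ });
        System.out.println(t.getPriority());   // 5: Thread.NORM_PRIORITY, the default
        t.setPriority(Thread.MAX_PRIORITY);    // range is 1 (MIN) to 10 (MAX)
        System.out.println(t.getPriority());   // 10
    }
}
```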

3. The states of a thread

A Java thread is in one of six states over its lifetime: NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, and TERMINATED.

4. Daemon thread

A Daemon thread is a supporting thread, used mainly for background scheduling and support work within a program. When no non-Daemon threads remain in a Java virtual machine, the JVM exits. A thread can be made a Daemon thread by calling Thread.setDaemon(true) before it is started.
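A minimal daemon sketch (DaemonDemo is a hypothetical name): the JVM exits when only daemon threads remain, cutting the daemon's loop off without running any cleanup it might have pending.

```java
public class DaemonDemo {
    public static void main(String[] args) {
        Thread d = new Thread(() -> {
            while (true) { /* background work that never finishes on its own */ }
        });
        d.setDaemon(true);                 // must be called before start()
        d.start();
        System.out.println(d.isDaemon()); // true
        // main, the last non-daemon thread, ends here, and the JVM exits,
        // abruptly terminating the daemon's infinite loop.
    }
}
```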

5. Starting and terminating threads

A thread is started by calling its start() method, and it terminates when the execution of its run() method completes.

5.1 Three ways to create a thread

```java
package server.doc.thread;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class ThreadTest {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        A a = new A();
        Thread threadA = new Thread(a);
        threadA.start();

        B b = new B();
        Thread threadB = new Thread(b);
        threadB.start();

        C c = new C();
        // FutureTask implements Runnable, so it can be passed to Thread
        FutureTask<Integer> integerFutureTask = new FutureTask<>(c);
        Thread threadC = new Thread(integerFutureTask);
        threadC.start();
        System.out.println(integerFutureTask.get()); // the return value is obtained via get()
    }
}

class A extends Thread {
    @Override
    public void run() {
        System.out.println("=======Inherit the Thread class to create a thread====");
    }
}

class B implements Runnable {
    @Override
    public void run() {
        System.out.println("=======Implement the Runnable interface to create a thread====");
    }
}

// Implement the Callable interface to create a thread; Integer is the return type
class C implements Callable<Integer> {
    @Override
    public Integer call() throws Exception {
        System.out.println("=======Implement the Callable interface to create a thread====");
        return 2;
    }
}
```

5.2 The source of Thread.start()

```java
// This method causes a new thread to be created and begin execution
public synchronized void start() {
    // threadStatus != 0 means the thread was already started: throw
    if (threadStatus != 0)
        throw new IllegalThreadStateException();

    group.add(this);

    // started is a flag, a common idiom: false before the action happens,
    // true after it succeeds
    boolean started = false;
    try {
        // start0() creates the new thread; by the time it returns, the new
        // thread may already be running the target. This line itself still
        // executes on the calling thread.
        start0();
        started = true;
    } finally {
        try {
            // if starting failed, remove the thread from its thread group
            if (!started) {
                group.threadStartFailed(this);
            }
            // Throwable catches things Exception cannot, such as Errors
        } catch (Throwable ignore) {
            /* do nothing. If start0 threw a Throwable then
               it will be passed up the call stack */
        }
    }
}

// the new thread is actually created by a native method
private native void start0();
```

5.3 Stop threads correctly

Interruption can be understood as a flag attribute of a thread, indicating whether a running thread has been interrupted by another thread. An interrupt is like one thread waving at another: other threads interrupt a thread by calling its interrupt() method.

The thread responds by checking whether it has been interrupted, using the isInterrupted() method. The static method Thread.interrupted() can also be called; it returns the current thread's interrupt status and resets (clears) the flag.

In principle, interrupt should be used to request a stop rather than to force one: this avoids data corruption and gives the thread time to finish its cleanup work.

```java
while (!Thread.currentThread().isInterrupted() && moreWorkToDo()) {
    // do more work
}
```

Once we call a thread's interrupt(), that thread's interrupt flag is set to true. Every thread has such a flag, and running code should check it periodically; a flag set to true means some other thread wants this one to terminate. Looking back at the loop above, the condition first checks Thread.currentThread().isInterrupted() and then checks whether there is more work to do.

Can the interrupt be perceived while the thread is sleeping?

Yes. If a blocking method such as sleep or wait has put the thread to sleep, and the sleeping thread is interrupted, the thread perceives the interrupt signal: an InterruptedException is thrown, and the interrupt signal is cleared at the same time (the flag is set back to false). So there is no need to worry that a thread in a long sleep will miss an interrupt; even while sleeping, it responds to the interrupt notification by throwing the exception.
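This behavior can be observed directly. In the hypothetical demo below, the sleeping thread catches InterruptedException, and its interrupt flag has already been cleared by the time the catch block runs.

```java
// Interrupting a sleeping thread: sleep() throws InterruptedException and,
// as described above, clears the interrupt flag before the catch block runs.
public class SleepInterrupt {
    static boolean[] observe() throws InterruptedException {
        final boolean[] result = new boolean[2]; // [caughtException, flagAfterCatch]
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                result[0] = true;                                   // interrupt was felt
                result[1] = Thread.currentThread().isInterrupted(); // false: already cleared
            }
        });
        t.start();
        Thread.sleep(100);   // give t time to reach sleep()
        t.interrupt();
        t.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = observe();
        System.out.println(r[0] + " " + r[1]); // true false
    }
}
```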

6. Communication between threads

  1. Volatile and synchronized keywords

Synchronized blocks are implemented with the monitorenter and monitorexit instructions, while synchronized methods rely on the ACC_SYNCHRONIZED flag among the method's modifiers. Either way, the essence is acquiring an object's monitor, and this acquisition is exclusive: only one thread at a time can hold the monitor of the object protected by synchronized. Every object has its own monitor; when the object's synchronized block or synchronized method is invoked, the executing thread must first obtain the object's monitor before it can enter. Threads that fail to obtain the monitor are blocked at the entrance of the synchronized block or method and enter the BLOCKED state.

  2. The wait/notify mechanism

The wait/notify mechanism works as follows: a thread A calls object O's wait() method and enters the waiting state, while another thread B calls object O's notify() or notifyAll() method; thread A, upon receiving the notification, returns from O's wait() and then performs its subsequent operations. The two threads complete their interaction through object O, and the relationship between wait() and notify/notifyAll() on the object acts like a signal that completes the handshake between the waiting party and the notifying party.

```java
// From a producer-consumer example.
// Inter-thread communication: threads alternately perform +1 and -1 on the same variable.
public class A {
    public static void main(String[] args) {
        Data data = new Data();
        new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                try { data.increment(); } catch (InterruptedException e) { e.printStackTrace(); }
            }
        }, "A").start();
        new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                try { data.decrement(); } catch (InterruptedException e) { e.printStackTrace(); }
            }
        }, "B").start();
        new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                try { data.increment(); } catch (InterruptedException e) { e.printStackTrace(); }
            }
        }, "C").start();
        new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                try { data.decrement(); } catch (InterruptedException e) { e.printStackTrace(); }
            }
        }, "D").start();
    }
}

// Pattern: 1. check whether to wait  2. do the work  3. notify other threads
class Data { // the shared resource class
    private int num = 0;

    public synchronized void increment() throws InterruptedException {
        while (num != 0) {   // using if instead of while lets spurious wakeups through
            this.wait();
        }
        num++;
        System.out.println(Thread.currentThread().getName() + "=====" + num);
        this.notifyAll();    // notify the other threads that +1 is complete
    }

    public synchronized void decrement() throws InterruptedException {
        while (num == 0) {
            this.wait();
        }
        num--;
        System.out.println(Thread.currentThread().getName() + "=====" + num);
        this.notifyAll();
    }
}
```
  3. The Thread.join() method
```java
package server.doc.thread;

public class JoinTest implements Runnable {
    @Override
    public void run() {
        System.out.println("join thread demo");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("main thread start...");
        JoinTest joinTest = new JoinTest();
        Thread thread = new Thread(joinTest);
        thread.setName("joinTest thread");
        thread.start();
        thread.join();
        System.out.println("main thread end");
    }
}
```

Without join(), the output may be:

main thread start...
main thread end
join thread demo

With join(), the output is:

main thread start...
join thread demo
main thread end

In other words, when the main thread calls thread.join(), it blocks until thread has finished executing (reached the terminated state), and only then continues.
```java
// join(long) source
public final synchronized void join(long millis) throws InterruptedException {
    long base = System.currentTimeMillis();
    long now = 0;

    // first check that the argument is legal
    if (millis < 0) {
        throw new IllegalArgumentException("timeout value is negative");
    }

    if (millis == 0) {
        // join() without a timeout is equivalent to calling wait(0) in a loop
        while (isAlive()) {
            wait(0);
        }
    } else {
        // isAlive(): the thread has been started and has not yet terminated
        while (isAlive()) {
            long delay = millis - now;
            if (delay <= 0) {
                break;
            }
            wait(delay);
            now = System.currentTimeMillis() - base;
        }
    }
}
```
  4. ThreadLocal (covered in a later article)

7. Why must wait(), notify(), and notifyAll() be called in a synchronized method/block?

The role of wait(): it puts the current thread to sleep until it is notified or interrupted. Before calling wait(), the thread must hold the object's object-level lock, that is, wait() can only be called inside a synchronized method or synchronized block. On entering wait(), the current thread releases the lock; before returning from wait(), it competes with other threads to reacquire the lock. If the appropriate lock is not held when wait() is called, an IllegalMonitorStateException is thrown.

Calling wait() releases the lock, and a lock can only be released after it has first been acquired.

The role of notify(): it wakes up one of the other waiting threads; which one is determined by the system. Before the call, the thread must likewise hold the object's object-level lock. If notify() is called without holding the appropriate lock, an IllegalMonitorStateException is also thrown.

notify() and notifyAll() hand the lock over to a thread waiting in wait() so that it can continue executing; without holding the lock, there is nothing to hand over to other threads.

8. Why wait/notify/notifyAll is defined in the Object class, while sleep is defined in the Thread class?

  1. Because every object in Java has a lock, called a monitor. Since every object can be locked, each object needs a place in its header to store lock information. The lock is object-level, not thread-level, and wait/notify/notifyAll are lock-level operations whose locks belong to objects, so it is most appropriate to define them in the Object class, the parent class of all classes.
  2. Because defining wait/notify/notifyAll in the Thread class would be very limiting. A thread may hold multiple locks in order to implement logic in which the locks cooperate; if wait were defined in Thread, how could one thread wait on several locks, and how would we know which lock a thread is waiting for? Since we are making the current thread wait on an object's lock, this is naturally achieved by operating on the object rather than on the thread.

9. The difference between wait and sleep

Similarities:

  1. Both cause the thread to block.
  2. Both can receive an interrupt notification.

Differences:

  1. In a synchronized block, sleep does not release the lock, while wait does. Hence wait must be used inside synchronized-protected code, while sleep has no such requirement.
  2. sleep must be given a duration and resumes automatically when the time expires; wait can be called without arguments, meaning it waits indefinitely.
  3. wait is a method of the Object class, while sleep is a method of the Thread class.