With autumn recruitment approaching and summer internships to prepare for, I wish everyone gets a little better every day!
This article summarizes interview questions on Java collections and will be updated daily~
1. How to make ArrayList thread safe?
The answer is as follows:

```java
// Method 1: Collections.synchronizedList; under the hood it wraps the
// list's set/add/remove methods in synchronized blocks
List<Object> list = Collections.synchronizedList(new ArrayList<>());

// Method 2: use the thread-safe CopyOnWriteArrayList, whose add/remove/set
// methods also lock internally: final ReentrantLock lock = this.lock;

// Method 3: write your own wrapper class that extends ArrayList and
// synchronizes add/set/remove as the business requires
```
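Method three above is only described in words; here is a minimal sketch of what such a wrapper class could look like (the class name SyncArrayList and the exact set of guarded methods are our own choices, not from the original article):

```java
import java.util.ArrayList;

// Hypothetical sketch of "method three": extend ArrayList and synchronize
// the mutating methods. (Reads such as get/iterator are left unguarded here
// for brevity; Collections.synchronizedList guards those too.)
class SyncArrayList<E> extends ArrayList<E> {
    @Override
    public synchronized boolean add(E e) {
        return super.add(e);
    }

    @Override
    public synchronized E set(int index, E element) {
        return super.set(index, element);
    }

    @Override
    public synchronized E remove(int index) {
        return super.remove(index);
    }

    public static void main(String[] args) {
        SyncArrayList<Integer> list = new SyncArrayList<>();
        list.add(1);
        list.add(2);
        list.set(0, 10);
        System.out.println(list); // [10, 2]
    }
}
```

Note that a real wrapper would also need to synchronize reads and iteration; this is exactly why Collections.synchronizedList is usually the better choice.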
2. Tell me about the difference between Vector and ArrayList?
- When the no-argument constructor is called, ArrayList starts with an empty backing array (capacity 0) and is given the default capacity of 10 when the first element is added; Vector's no-argument constructor allocates capacity 10 up front.
- When Vector expands, if the capacityIncrement value is not specified or is not greater than 0, the capacity doubles each time; otherwise it grows by capacityIncrement.
- When ArrayList expands, the new capacity is 1.5 times the original each time.
- Vector is a thread-safe collection, achieved by marking methods such as remove and add with the synchronized keyword; ArrayList is not thread-safe.
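The two growth formulas can be sketched as plain arithmetic (simplified versions of the JDK logic, not the actual source):

```java
// A small sketch of the two growth rules described above.
class GrowthDemo {
    // Vector: add capacityIncrement if it is > 0, otherwise double
    static int vectorGrow(int oldCap, int capacityIncrement) {
        return oldCap + ((capacityIncrement > 0) ? capacityIncrement : oldCap);
    }

    // ArrayList: grow to 1.5x via oldCap + (oldCap >> 1)
    static int arrayListGrow(int oldCap) {
        return oldCap + (oldCap >> 1);
    }

    public static void main(String[] args) {
        System.out.println(vectorGrow(10, 0));  // 20
        System.out.println(vectorGrow(10, 5));  // 15
        System.out.println(arrayListGrow(10));  // 15
    }
}
```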
Reference articles: ArrayList analysis of JDK collection source code (with example interview questions) , Vector analysis of JDK collection source code
3. Does CopyOnWriteArrayList need to expand when adding new elements? How is it done?
- The underlying array of CopyOnWriteArrayList is not expanded dynamically; its thread safety is guaranteed by a reentrant lock (ReentrantLock).
- When an element is added to CopyOnWriteArrayList, the thread first acquires the lock; the add method then creates a new array with capacity (old capacity + 1), copies the old array's data into it, and places the newly added element at the end of the new array.

The code is as follows:
```java
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        // copy the old array into a new array one slot larger
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}
```
CopyOnWriteArrayList is suitable for read-heavy, write-light scenarios (read-write separation), because every call to a method that modifies the array structure must recreate the array, so write performance is low!
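A small demo of the snapshot behavior that follows from this copy-on-write design (our own example): an iterator created before a write keeps reading the old array, because add() replaced the backing array rather than mutating it.

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

// Demonstrates the copy-on-write snapshot semantics of the iterator.
class CowDemo {
    // Returns how many elements an iterator taken before an add() sees.
    static int elementsSeenByOldIterator() {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        Iterator<String> it = list.iterator(); // snapshot of ["a"]
        list.add("b");                         // allocates a new backing array
        int seen = 0;
        while (it.hasNext()) { it.next(); seen++; }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(elementsSeenByOldIterator()); // 1
    }
}
```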
Article reference: CopyOnWriteArrayList analysis of JDK collection source code
4. What is the difference between HashMap and HashTable?
HashMap:
- HashMap: the bottom layer is based on array + linked list + red-black tree, and it is not thread-safe. The default capacity is 16, and null keys and values are allowed.
- The initial size is 16; expansion: newsize = oldsize << 1, and the size must be a power of 2.
- When the total number of elements in the map exceeds 75% of the Entry array, expansion is triggered. To keep linked lists short and distribute elements more evenly, the index is computed as index = hash & (tab.length - 1).
- Expansion applies to the entire map: each time, the elements in the original array are sequentially recalculated and reinserted.
HashTable:
- HashTable: the bottom layer is array + linked list. Neither key nor value may be null. It is thread-safe: data-modifying methods are marked synchronized, which locks the entire Hashtable, so efficiency is low; ConcurrentHashMap has optimized this.
- The initial size is 11; expansion: newsize = (tab.length << 1) + 1.
- Index calculation: index = (hash & 0x7FFFFFFF) % tab.length.
The difference between the two:

- HashMap is not thread-safe; HashTable is thread-safe (modified with synchronized).
- HashMap allows null as the key or value of an entry; Hashtable does not.
- HashMap recalculates the hash value (as in the code below), while Hashtable uses the hashCode directly:

```java
// The algorithm for recalculating the hash value in HashMap
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
```

- Hashtable inherits from the Dictionary class, while HashMap is an implementation class of the Map interface.
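The null-key difference can be verified directly (a small demo of standard behavior, not from the original article): HashMap stores a null key in bucket 0, while Hashtable calls key.hashCode() unconditionally and therefore throws NullPointerException.

```java
import java.util.HashMap;
import java.util.Hashtable;

// HashMap accepts null keys and values; Hashtable rejects a null key.
class NullKeyDemo {
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "boom");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        HashMap<String, String> hm = new HashMap<>();
        hm.put(null, "ok");  // null key allowed
        hm.put("k", null);   // null value allowed too
        System.out.println(hm.get(null));              // ok
        System.out.println(hashtableRejectsNullKey()); // true
    }
}
```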
The addressing algorithms HashMap and HashTable use to find the bucket index:

- HashMap: index = hash & (tab.length - 1)
- HashTable: index = (hash & 0x7FFFFFFF) % tab.length

Both formulas are designed to scatter the computed bucket indexes as much as possible, which reduces hash collisions.
In HashTable:

- 0x7FFFFFFF is 0111 1111 1111 1111 1111 1111 1111 1111: all 1s except the sign bit.
- (hash & 0x7FFFFFFF) always produces a non-negative integer.
- So the index computed by (hash & 0x7FFFFFFF) % tab.length is always non-negative and guaranteed to lie within the range of tab.length!
- An odd array length causes fewer hash conflicts in HashTable, while an even length causes more. That is why the initial capacity is 11 and the expansion is newsize = oldsize << 1 + 1 (i.e., 2n + 1), which keeps every expanded capacity odd.
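Two of the claims above, that masking equals modulo for power-of-two lengths, and that the 0x7FFFFFFF mask keeps the index non-negative, can be checked with a short sketch (our own demo, not JDK code):

```java
// Checks: for a power-of-two length and non-negative hash,
// hash & (len - 1) == hash % len; and (hash & 0x7FFFFFFF) % len is a
// valid non-negative index even for a negative hash code.
class IndexDemo {
    static boolean maskEqualsMod(int len, int upTo) {
        for (int hash = 0; hash < upTo; hash++) {
            if ((hash & (len - 1)) != (hash % len)) return false;
        }
        return true;
    }

    static int hashtableIndex(int hash, int len) {
        return (hash & 0x7FFFFFFF) % len;
    }

    public static void main(String[] args) {
        System.out.println(maskEqualsMod(16, 1000)); // true
        System.out.println(hashtableIndex(-7, 11));  // non-negative, below 11
    }
}
```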
In HashMap:

- The initial capacity is 16; when the number of stored entries reaches 0.75 times the array capacity, expansion is triggered.
- The bucket index formula is index = hash & (tab.length - 1). For this to scatter elements well and reduce hash conflicts, the capacity must be a power of 2 (hence even); expansion is newsize = oldsize << 1 (i.e., 2n), so the result stays even.
- In addition, when a bucket's linked list grows longer than 8 and the array length has reached 64, the list is converted into a red-black tree; when it drops below 6, the tree is converted back to a list.
- Before JDK 1.8, HashMap used head insertion for its linked lists; the advantage was efficiency, since there was no need to traverse the list before inserting.
- From JDK 1.8 on, tail insertion is used, because the list length must be checked anyway to decide whether it exceeds 8 and treeification should be considered.
- HashMap resolves hash conflicts using the chaining (linked list) method.
- HashMap inserts the data first and then checks whether expansion is needed!
Article reference: HashMap analysis of JDK collection source code (top) , HashMap analysis of JDK collection source code (bottom) , HashTable analysis of JDK collection source code
5. What is the difference between HashMap and TreeMap?
- HashMap was introduced above, so just look at TreeMap.
- The bottom layer of TreeMap is a balanced binary tree (a red-black tree), and it supports custom sorting rules: pass a Comparator via the constructor TreeMap(Comparator c) to conveniently sort the internal elements in various ways. Its performance is worse than HashMap's.
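A quick illustration of the custom sort order (our own example): passing a Comparator to the constructor makes TreeMap keep its keys in that order.

```java
import java.util.Comparator;
import java.util.TreeMap;

// TreeMap sorted by a custom Comparator: keys in descending order.
class TreeMapDemo {
    public static void main(String[] args) {
        TreeMap<Integer, String> map = new TreeMap<>(Comparator.reverseOrder());
        map.put(1, "one");
        map.put(3, "three");
        map.put(2, "two");
        System.out.println(map.firstKey()); // 3, the "smallest" key per the comparator
        System.out.println(map.keySet());   // [3, 2, 1]
    }
}
```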
6. The relationship between Set and Map
- At their core, neither stores duplicate elements; both hold a set of unique objects.
- Each implementation of Set corresponds to a wrapper around a Map.
- For example, HashSet wraps a HashMap at the bottom, and TreeSet wraps a TreeMap.
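This wrapping is easy to observe from HashSet's behavior (a small demo; the detail that add(e) delegates to the internal map's put(e, PRESENT) is from the JDK source):

```java
import java.util.HashSet;

// HashSet is a wrapper over HashMap: adding a duplicate returns false
// because put() found the key already present in the backing map.
class SetDemo {
    public static void main(String[] args) {
        HashSet<String> set = new HashSet<>();
        System.out.println(set.add("x")); // true, new key in the backing map
        System.out.println(set.add("x")); // false, key already present
        System.out.println(set.size());   // 1
    }
}
```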
7. Why does HashMap use a red-black tree at the bottom rather than another tree, such as a binary search tree? And why not use a red-black tree from the start, instead of only once the linked list reaches length 8 and the array capacity exceeds 64?
- A binary search tree can also degenerate into a linear structure in special cases (for example, when keys are inserted in sorted order). It then has the same deep-traversal problem as a long linked list, and lookups are slow.
- Red-black trees are used mainly to speed up data lookups. A red-black tree is a kind of balanced binary tree: after new data is inserted (new nodes start out red), it rebalances itself through left rotations, right rotations, and recoloring, solving the query-depth problem of a long singly linked list.
- A red-black tree is not used from the beginning because, when the list holds little data, traversing the linear list consumes fewer resources than traversing a red-black tree, so the linear list is used early on.
- The treeification threshold is 8 because extensive testing showed that value to be the most appropriate. Ideally, with random hash codes, the number of nodes in a hash bucket follows a Poisson distribution; by the Poisson formula, the probability of a list reaching 8 nodes is 0.00000006, which is low enough, and at 8 nodes the performance advantage of the red-black tree begins to show, so 8 is a reasonable number.
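The 0.00000006 figure can be reproduced from the Poisson formula (our own arithmetic sketch, using the parameter 0.5 cited in the JDK's HashMap implementation note for the default load factor):

```java
// Modeling bucket sizes as Poisson(0.5), the probability of a bucket
// reaching k nodes is P(k) = e^(-lambda) * lambda^k / k!.
// For k = 8 this is roughly 6e-8, i.e. about 0.00000006.
class PoissonDemo {
    static double poisson(double lambda, int k) {
        double factorial = 1;
        for (int i = 2; i <= k; i++) factorial *= i;
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        System.out.println(poisson(0.5, 8)); // about 6e-8
    }
}
```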
Since the red-black tree came up, the interviewer is bound to ask about its 5 properties:

- Property 1: Every node is either black or red.
- Property 2: The root node is black.
- Property 3: Every leaf node (NIL) is black.
- Property 4: Both children of every red node are black; two red nodes can never be adjacent.
- Property 5: Every path from any node to its leaf nodes contains the same number of black nodes, commonly known as the "black height".
An example red-black tree (see the diagram in the original article): the red-black tree is not a perfectly balanced binary search tree; the left subtree of the root node P can be visibly taller than the right subtree. But the number of black-node layers in the left and right subtrees is equal; that is, the path from any node to each of its leaf nodes contains the same number of black nodes (Property 5). That is why the red-black tree is called "black-perfectly balanced".
So when a hash conflict occurs in a bucket, why use a linked list first instead of the red-black tree directly?

- First, a linked list is not an array: its storage addresses are not contiguous, so retrieval must follow the next pointers one by one until the target is found.
- If the bucket has few hash conflicts, traversing the list costs little time. But once conflicts accumulate and a long list forms in the bucket, traversing it becomes expensive!
- If a long list is treeified into a red-black tree, lookups improve because a red-black tree is a special binary tree that supports binary search; ideally the query time drops from linear to logarithmic.
- The red-black tree is not used directly from the start because a tree structure occupies considerably more storage than a linked list, so while the list is short there is no need to treeify!

In short, the choice between the linked list and the red-black tree is entirely a trade-off between time efficiency and space cost~
Reference article: HashMap bottom redblack tree implementation (implement a simple redblack tree yourself)
8. Why must the capacity of HashMap be 2 to the power of N? What if the input value is not a power of 2, such as 10?
Question 1: Why must the capacity of HashMap be 2 to the power of N ?
- The core purpose is to distribute inserted nodes evenly and reduce hash conflicts.

The HashMap constructor can specify the initial capacity of the collection, for example:

```java
// Constructs an empty HashMap with the specified initial capacity
// and the default load factor (0.75).
HashMap(int initialCapacity)
```

When adding an element to the HashMap, its bucket position in the array must be determined from the hash value of the key (the addressing algorithm). To store data efficiently and reduce collisions, HashMap must distribute the data as evenly as possible, keeping every linked list roughly the same length. The key to achieving this is the algorithm that decides which bucket each piece of data is stored in.
This algorithm is essentially a modulo operation: index = hash % length, which HashMap implements in the bitwise form index = hash & (length - 1). Now consider an array length (number of buckets) that is not a power of 2: as the diagram in the original article shows, when the length is 9, (length - 1) is binary 1000, so different hash values can only ever be mapped to bucket 0 or bucket 8, and the remaining buckets are never used.
So let's summarize the reasons HashMap's array capacity uses a power of 2:

- When the bucket index is determined by the addressing calculation on the key's hash, a power-of-2 array length tab.length ensures that newly inserted data is distributed evenly and every bucket can receive data. If the length is not a power of 2, some buckets can never receive data, which wastes array space, while the remaining buckets suffer frequent hash conflicts.
- To map a hash to an array index, we would normally use the modulo operation index = hash % length. However, a computer performs modulo far less efficiently than bit operations, so the addressing is done as hash & (length - 1). When length is a power of 2 (and the hash is non-negative), the two methods give the same result: hash & (length - 1) == hash % length.

Therefore, HashMap's array capacity uses a power of 2 so that the addressing algorithm spreads newly inserted data evenly across the buckets and lowers the probability of frequent hash collisions in any one bucket. After all, the more conflicts a bucket accumulates, the longer its linked list grows, which greatly reduces retrieval efficiency (a direct array lookup is definitely much faster than walking a linked list).
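The uneven distribution with a non-power-of-two length can be demonstrated concretely (our own demo, comparing length 9 with length 16):

```java
import java.util.Set;
import java.util.TreeSet;

// With length 9, (length - 1) is binary 1000, so hash & 8 only ever yields
// 0 or 8: buckets 1..7 can never be addressed. Length 16 uses all buckets.
class MaskDemo {
    static Set<Integer> usedBuckets(int length) {
        Set<Integer> used = new TreeSet<>();
        for (int hash = 0; hash < 100; hash++) {
            used.add(hash & (length - 1));
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println(usedBuckets(9));         // [0, 8]
        System.out.println(usedBuckets(16).size()); // 16
    }
}
```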
Question 2: What happens if the initial capacity passed when creating a HashMap is 10 rather than a power of 2?

For example:

```java
HashMap<String, Integer> hashMap = new HashMap<>(10);
```
In this case, the two-parameter constructor of HashMap passes the requested capacity to the tableSizeFor method, which computes the smallest power of 2 greater than or equal to that capacity. The calculation is as follows:
```java
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
```
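To see the rounding behavior, here is a standalone copy of the method (with the bitwise OR-assignments written out and the MAXIMUM_CAPACITY clamp omitted) exercised on a few inputs:

```java
// Each result is the smallest power of two >= the requested capacity.
class TableSizeDemo {
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : n + 1; // MAXIMUM_CAPACITY clamp omitted here
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```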
This way, an input of 10 becomes an actual capacity of 16. The bit manipulation here is ingeniously designed; since this is an interview summary, I won't walk through the calculation step by step (the original article attaches a worked example diagram). For details, please refer to the source-code analysis article below:
Article reference: HashMap analysis of JDK collection source code (top)
9. How is HashMap designed to calculate the hash value of the key? Why should the high 16 bits of hashCode participate in the operation?
The method of recalculating the hash value in HashMap is as follows:
```java
static final int hash(Object key) {
    int h;
    // If the key is null, the hash value is 0.
    // Otherwise, call the key's hashCode() method, assign the result to h,
    // then XOR h with h unsigned-right-shifted by 16 bits to get the final
    // hash value. This makes the computed hashes more scattered; the more
    // scattered they are, the shorter each bucket's linked list, the fewer
    // red-black trees form later, and the faster retrieval becomes!
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
```
- As shown in the code: the high 16 bits of the key's hashCode are XORed with its low 16 bits, and the result is the new hash value.

The analysis of the method is as follows:

- key.hashCode() returns the key's hashcode, which can be treated as a randomly generated value.
- ^ (bitwise XOR) rule: on the same binary digit, identical bits give 0 and different bits give 1.
- h >>> 16 shifts the value of h right by 16 bits, unsigned.
Question: why do it this way?

We know that newly inserted data in HashMap must go through the addressing algorithm index = hash & (n - 1). If the array length n is very small, say n = 16, then n - 1 is 15, whose binary form is 1111, and ANDing such a value directly with the hashCode uses only the last 4 bits of the hash. If the high bits of the hash values vary a lot while the low bits vary little, hash conflicts become likely, so the high and low bits are mixed together here to solve this problem.
The original article includes an analysis chart here; it shows that if only the low bits of the hashCode are used, keys whose hash codes differ only in the high bits all collide in the same bucket, whereas XORing in the high 16 bits spreads them out.
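The effect can also be demonstrated without a chart (our own demo; the two constants are chosen so the hashes differ only above bit 15):

```java
// With a 16-slot table the mask uses only the low 4 bits, so two hashes
// that differ only in the high bits collide; after the XOR spread they
// land in different buckets.
class SpreadDemo {
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;
        int h1 = 0x10000; // the two differ only at bit 16 and above
        int h2 = 0x20000;
        System.out.println(h1 & (n - 1));         // 0
        System.out.println(h2 & (n - 1));         // 0, a collision
        System.out.println(spread(h1) & (n - 1)); // 1
        System.out.println(spread(h2) & (n - 1)); // 2, collision resolved
    }
}
```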
10. Tell me about your understanding of the hash algorithm? And what is hash conflict?
- The basic concept of hashing is to map an input of arbitrary length to a fixed-length output via a hash algorithm.
- In a program, you may encounter two different values that produce the same hash after the hash algorithm runs; this situation is called a hash conflict (hash collision).
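A concrete collision, using two well-known String keys:

```java
// Two distinct strings with equal hashCode(): "Aa" and "BB" both hash to
// 2112, so in a HashMap they share a bucket despite being different keys.
class CollisionDemo {
    public static void main(String[] args) {
        System.out.println("Aa".hashCode());   // 2112
        System.out.println("BB".hashCode());   // 2112
        System.out.println("Aa".equals("BB")); // false
    }
}
```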
Finally, I recommend a harder HashMap analysis point: why is the load factor set to 0.75?
 Reference article: Huawei interviewer: Why is the load factor of HashMap 0.75?
Supplementary question: during resize expansion, how are the elements of the old array migrated to the new array?

When HashMap expands, a rehash takes place: all elements in the old array are traversed and migrated into the expanded new array. There are three migration cases, analyzed separately below:
Case 1: the current bucket has no hash conflict and holds only one element:

In this case, HashMap computes the element's new position directly with e.hash & (newCap - 1). For example, when expanding from 16 to 32: after the rehash, the array length n has doubled, so the mask n - 1 gains one extra high bit, and the new index changes accordingly (the original article illustrates this with a diagram).
Explanation: in that diagram, 5 is the assumed original index, which verifies the rule: after expansion, a node either stays at its original position or moves to "original position + old capacity". Therefore, when HashMap expands, it does not need to recompute the whole hash; it only needs to check whether the newly significant bit of the hash value is 0 or 1. If it is 0, the index is unchanged; if it is 1, the index becomes "original position + old capacity" (the original article shows another diagram of the 16-to-32 expansion here). Precisely because of this ingenious bit check, expansion avoids recomputing every hash and also tends to split the nodes of an old bucket evenly between the two new buckets.
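The rule, stay put when the extra bit is 0 and move by oldCap when it is 1, can be verified against the full addressing formula (our own sketch):

```java
// After doubling from oldCap to newCap, a node's new index depends only
// on the extra hash bit e.hash & oldCap.
class ResizeIndexDemo {
    static int newIndex(int hash, int oldCap) {
        int oldIdx = hash & (oldCap - 1);
        return ((hash & oldCap) == 0) ? oldIdx : oldIdx + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        // 5 = 0 0101 and 21 = 1 0101 share the old index 5
        for (int h : new int[] {5, 21}) {
            int idx = newIndex(h, oldCap);
            // the shortcut agrees with the full formula h & (newCap - 1)
            System.out.println(h + " -> " + idx + " " + (idx == (h & (newCap - 1))));
        }
    }
}
```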
Case 2: a hash conflict has occurred in the current bucket and a linked list has formed, but not yet a red-black tree:

In this case, the bucket's list is split into a low list and a high list, which are placed into the new array in order after expansion. Rather than describe it in words, here is the code:
```java
// Note: before HashMap expands, table is not null
if (oldTab != null) {
    // Traverse every bucket of the old hash table and recompute
    // the new position of the elements in each bucket
    for (int j = 0; j < oldCap; ++j) {
        // current node
        Node<K,V> e;
        // There is data in this bucket, but it may be:
        // 1. a single entry, 2. a linked list, 3. a red-black tree
        if ((e = oldTab[j]) != null) {
            // Null out the old slot to help GC
            oldTab[j] = null;
            // Case 1: no next reference, so it is not a linked list;
            // the bucket holds a single key-value pair that can be placed
            // into the new table directly.
            // e.hash & (newCap - 1) yields one of two results:
            // 1. the same index as in the old table, or
            // 2. the old index j + the old capacity oldCap
            if (e.next == null)
                newTab[e.hash & (newCap - 1)] = e;
            // Case 2: the bucket has formed a red-black tree;
            // call the tree-specific method to split it.
            // (The red-black tree details deserve their own article
            // and can be skipped for now.)
            else if (e instanceof TreeNode)
                ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
            // Case 3: the bucket has formed a linked list
            else { // use the linked list to handle conflicts
                // Low list: stays at the same index j in the new array
                Node<K,V> loHead = null, loTail = null;
                // High list: moves to index j + oldCap, i.e. the current
                // index plus the length of the array before expansion
                Node<K,V> hiHead = null, hiTail = null;
                Node<K,V> next;
                // Compute each node's new position using the rule above
                do {
                    next = e.next;
                    // e.g. hash1   = ... 0 1111
                    //      oldCap  = ... 1 0000 (16)
                    // e.hash & oldCap == 0: the node keeps index j after
                    // the resize, so append it to the low list
                    if ((e.hash & oldCap) == 0) {
                        if (loTail == null)
                            loHead = e;
                        else
                            loTail.next = e;
                        loTail = e;
                    }
                    // e.g. hash2   = ... 1 1111
                    //      oldCap  = ... 1 0000 (16)
                    // e.hash & oldCap != 0: the node moves to index
                    // j + oldCap, so append it to the high list
                    else {
                        if (hiTail == null)
                            hiHead = e;
                        else
                            hiTail.next = e;
                        hiTail = e;
                    }
                } while ((e = next) != null);
                // Place the low list into bucket j
                if (loTail != null) {
                    loTail.next = null;
                    newTab[j] = loHead;
                }
                // Place the high list into bucket j + oldCap
                if (hiTail != null) {
                    hiTail.next = null;
                    newTab[j + oldCap] = hiHead;
                }
            }
        }
    }
}
```
Situation 3: a red-black tree has formed in the bucket:

If the interviewer asks about migrating the red-black tree, honestly, I would just give up; that part really is complicated.
If you want to fully understand the HashMap source code, please refer to these articles: HashMap analysis of JDK collection source code (top) , HashMap analysis of JDK collection source code (bottom) , HashMap bottom redblack tree implementation (implement a simple redblack tree yourself )
Summarizing these interview questions takes real time. The article will be updated from time to time, sometimes with several additions a day. If it helps you review and consolidate the knowledge points, please give it a like, comment, and share, and "a billion" more updates will follow!