HBase Tuning

1. Table design

1.1 Pre-Creating Regions

By default, HBase creates a single region when a table is created. When data is imported, all clients write to that one region, and the region does not split until it has grown large enough. One way to speed up bulk writes is to pre-create a number of empty regions, so that when data is written to HBase it is load-balanced across the cluster according to the region boundaries. Below is an example:

```java
/**
 * Create a table with multiple pre-split regions via Admin.createTable(HTableDescriptor, byte[][] splits)
 */
public static boolean createTable(Admin admin, HTableDescriptor table, byte[][] splits)
        throws IOException {
    try {
        admin.createTable(table, splits);
        return true;
    } catch (TableExistsException e) {
        System.out.println("table " + table.getNameAsString() + " already exists");
        // the table already exists...
        return false;
    }
}

/**
 * Compute the split keys between startKey and endKey.
 * Example: startKey=001, endKey=100, 10 regions -> [001,010] [011,020] ...
 */
public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
    byte[][] splits = new byte[numRegions - 1][];
    BigInteger lowestKey = new BigInteger(startKey, 16);
    BigInteger highestKey = new BigInteger(endKey, 16);
    BigInteger range = highestKey.subtract(lowestKey);
    BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
    lowestKey = lowestKey.add(regionIncrement);
    for (int i = 0; i < numRegions - 1; i++) {
        BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
        splits[i] = String.format("%016x", key).getBytes();
    }
    return splits;
}
```
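A minimal usage sketch, reusing the createTable and getHexSplits helpers above and assuming the classic HBase 1.x client API; the table name and column-family name are hypothetical examples:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;

public static void createPreSplitTable(Connection conn) throws IOException {
    try (Admin admin = conn.getAdmin()) {
        // Table and column-family names are hypothetical examples.
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
        desc.addFamily(new HColumnDescriptor("cf"));
        // Pre-split into 10 regions over the hex key range 0000000000000000 .. ffffffffffffffff
        byte[][] splits = getHexSplits("0000000000000000", "ffffffffffffffff", 10);
        createTable(admin, desc, splits);
    }
}
```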

1.2 Row Key

The row key in HBase is used to retrieve records from a table and supports three access patterns:

  • Access by a single row key: a get operation for one specific row key value;
  • Scan over a range of row keys: set startRowKey and endRowKey and scan within that range;
  • Full table scan: scan all rows in the table directly.

In HBase, the row key can be any string, with a maximum length of 64 KB; in practical applications it is usually 10 to 100 bytes. It is stored as a byte[] array and is generally designed with a fixed length.

Row keys are stored in lexicographic (byte) order.

When designing a row key, take full advantage of this sort order: store rows that are frequently read together next to each other, and keep data that is likely to be accessed soon close together as well.

For example: if the most recently written data in an HBase table is also the most likely to be read, consider making the timestamp part of the row key. Since row keys are sorted lexicographically, you can use Long.MAX_VALUE - timestamp as (part of) the row key, so that newly written data sorts first and can be located quickly on read.
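A small sketch of this reversed-timestamp pattern (the userId prefix, column family and qualifier are hypothetical examples; Put and Bytes are the standard HBase client utilities):

```java
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Build a row key of the form <userId>_<reversedTimestamp> so that the most
// recent rows for a user sort first in lexicographic order.
public static byte[] buildRowKey(String userId, long timestampMillis) {
    long reversed = Long.MAX_VALUE - timestampMillis;
    // Zero-pad the reversed timestamp so lexicographic order matches numeric order.
    return Bytes.toBytes(userId + "_" + String.format("%019d", reversed));
}

public static Put newEventPut(String userId, long timestampMillis, byte[] value) {
    Put put = new Put(buildRowKey(userId, timestampMillis));
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("event"), value); // family/qualifier are examples
    return put;
}
```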

Row key design principles:

  • 1. The shorter the row key, the better.

  • 2. Choose the value according to the query patterns the business needs to support.

  • 3. Row keys should be well distributed (hashed) across regions (see the salting sketch after this list), for example by:

    • Reversing the key (e.g. a reversed timestamp or reversed domain)
    • Hashing or salting a prefix
  • 4. Keep region data balanced (roughly 10~20 GB per region).
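A minimal salting sketch in plain Java (the two-digit salt width and the 100-bucket modulus are arbitrary illustrative choices, e.g. matching 100 pre-split regions):

```java
// Prefix the logical key with a short, deterministic salt so that writes are
// spread across pre-split regions instead of hot-spotting on a single region.
public static String saltedRowKey(String logicalKey) {
    // floorMod keeps the bucket non-negative even for negative hash codes
    int salt = Math.floorMod(logicalKey.hashCode(), 100);
    return String.format("%02d_%s", salt, logicalKey);
}
```

The trade-off is that a range scan over the logical key now has to fan out across all salt buckets.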

1.3 Column Family

Don't define too many column families in one table. At present, HBase does not handle tables with more than two or three column families well: when one column family is flushed, its neighboring column families in the same region are flushed along with it, so the more column families there are, the more I/O the system generates. If you are interested, you can run tests on your own HBase cluster and verify this from the resulting numbers.

1.4 In Memory

When creating a table, you can call HColumnDescriptor.setInMemory(true) to keep the column family's data in the RegionServer's block cache as much as possible, so that reads are more likely to be served from cache.

1.5 Max Version

When creating a table, you can call HColumnDescriptor.setMaxVersions(int maxVersions) to set the maximum number of versions kept for the data in the table. If you only need to keep the latest version, set setMaxVersions(1).

1.6 Time To Live

When creating a table, you can call HColumnDescriptor.setTimeToLive(int timeToLive) to set how long (in seconds) data in the table is kept; expired data is deleted automatically. For example, setTimeToLive(2 * 24 * 60 * 60) keeps data for two days.
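A combined sketch of the three settings above (1.4 to 1.6), assuming the classic HTableDescriptor/HColumnDescriptor API; the table and column-family names are hypothetical:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;

public static void createTunedTable(Admin admin) throws IOException {
    HColumnDescriptor cf = new HColumnDescriptor("cf");  // column-family name is a hypothetical example
    cf.setInMemory(true);                 // 1.4: prefer keeping this family's blocks in cache
    cf.setMaxVersions(1);                 // 1.5: keep only the latest version of each cell
    cf.setTimeToLive(2 * 24 * 60 * 60);   // 1.6: expire data after two days (value is in seconds)

    HTableDescriptor table = new HTableDescriptor(TableName.valueOf("mytable"));
    table.addFamily(cf);
    admin.createTable(table);
}
```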

1.7 Compact & Split

In HBase, an update is first written to the WAL log (HLog) and to memory (the MemStore); the data in the MemStore is kept sorted. When a MemStore accumulates past a certain threshold, a new MemStore is created and the old one is added to a flush queue, where a separate thread flushes it to disk as a StoreFile. At the same time, the system records a redo point in ZooKeeper, indicating that all changes before this moment have been persisted (minor compact). A StoreFile is read-only: once created it can never be modified, so an HBase update is really a continuous append operation. When the number of StoreFiles in a Store reaches a threshold, they are merged (major compact), and the modifications to the same key are merged together into one large StoreFile. When the size of a StoreFile reaches a threshold, it is split into two StoreFiles of equal size. Because table updates are constantly appended, a read request has to visit all StoreFiles and the MemStore of a Store and merge them by row key; since both StoreFiles and the MemStore are sorted, and StoreFiles carry an in-memory index, this merge process is usually fast. In practice, you can trigger a major compaction manually when necessary to merge the modifications of the same row key into one large StoreFile, and you can raise the StoreFile size threshold to reduce the frequency of splits.

To prevent too many small files (MemStores flushed to disk) from hurting query efficiency, HBase merges these small store files into larger ones when necessary; this process is called compaction. HBase has two kinds of compaction: minor compaction and major compaction.

Minor compaction: merges a smaller number of small, adjacent store files.

Major compaction merges all the store files of a Store into one. It can be triggered by the major_compact shell command, the majorCompact() API, or automatically by the region server (related parameters below):

```
hbase.hregion.majorcompaction        // interval between automatic major compactions, default 24 hours
hbase.hregion.majorcompaction.jitter // default 0.2; jitters the interval above so that region servers do not
                                     // all run major compaction at the same time. With the defaults (24 hours
                                     // and 0.2) the effective interval falls in the range 19.2 ~ 28.8 hours.
```
  • 1. Turn off automatic major compaction (set hbase.hregion.majorcompaction to 0)
  • 2. Trigger major compaction manually from your own code instead (see the sketch below)
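A minimal sketch of triggering a major compaction from code (assuming the Admin API used in the earlier examples and a hypothetical table name); from the HBase shell the equivalent is major_compact 'mytable':

```java
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;

// Ask the cluster to major-compact all regions of a table. The call is asynchronous:
// the region servers schedule and run the compaction in the background.
public static void requestMajorCompaction(Connection conn, String tableName) throws IOException {
    try (Admin admin = conn.getAdmin()) {
        admin.majorCompact(TableName.valueOf(tableName));
    }
}
```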

The triggering and file-selection logic of minor compactions is more involved and is governed by the following parameters:

```
hbase.hstore.compaction.min       // default 3; a minor compaction only starts once at least this many eligible store files exist
hbase.hstore.compaction.max       // default 10; at most this many store files are selected in one minor compaction
hbase.hstore.compaction.min.size  // store files smaller than this are always included in a minor compaction
hbase.hstore.compaction.max.size  // store files larger than this are always excluded from minor compaction
hbase.hstore.compaction.ratio     // store files are sorted by age (older to younger); minor compaction always starts from the older files
```

2. Write operation

2.1 Batch write

A record with a specified row key can be written to HBase by calling HTable.put(Put). HBase also provides HTable.put(List&lt;Put&gt;), which writes multiple rows in one batch from a list of Puts. The advantage is that the whole batch costs only one network I/O round trip, which can bring a noticeable performance improvement in scenarios where data must arrive quickly and the network RTT is high.
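A minimal batch-write sketch (assuming the classic Table client API; table, family and qualifier names are hypothetical):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public static void batchPut(Connection conn, List<String> rowKeys, byte[] value) throws IOException {
    try (Table table = conn.getTable(TableName.valueOf("mytable"))) {
        List<Put> puts = new ArrayList<>();
        for (String rowKey : rowKeys) {
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), value);
            puts.add(put);
        }
        // One call sends the whole batch: a single network round trip instead of one per row.
        table.put(puts);
    }
}
```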

2.2 HTable parameter setting

2.2.1 Auto Flush off

By calling HTable.setAutoFlush(false), you can turn off the write client's automatic flush, so that data is written to HBase in batches: instead of sending one update per put, a write request is only sent to the HBase server once the client-side write buffer is full. Auto flush is enabled by default.

2.2.2 Write Buffer (write buffer size)

By calling HTable.setWriteBufferSize(writeBufferSize) you can set the size of the HTable client's write buffer. If the new buffer size is smaller than the data currently in the write buffer, the buffer is flushed to the server. writeBufferSize is in bytes and can be set according to how much data is actually written.
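A sketch combining 2.2.1 and 2.2.2, using the older HTable API this article refers to (these calls are deprecated in HBase 1.x and replaced by BufferedMutator in 2.x); the table, family and qualifier names and the 6 MB buffer size are hypothetical examples:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public static void writeWithClientBuffer() throws IOException {
    HTable table = new HTable(HBaseConfiguration.create(), "mytable"); // hypothetical table name
    try {
        table.setAutoFlush(false);                 // 2.2.1: buffer puts on the client instead of one RPC per put
        table.setWriteBufferSize(6 * 1024 * 1024); // 2.2.2: 6 MB client write buffer (value is an example)
        for (int i = 0; i < 10000; i++) {
            Put put = new Put(Bytes.toBytes(String.format("row%08d", i)));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
            table.put(put);                        // buffered; sent only when the buffer fills up
        }
        table.flushCommits();                      // push whatever is still sitting in the buffer
    } finally {
        table.close();
    }
}
```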

2.2.3 WAL Flag (HLog)

In HBase, when a client submits data to a RegionServer in the cluster (a Put/Delete operation), the WAL (Write Ahead Log, i.e. the HLog; all regions on a RegionServer share one HLog) is written first. Only after the WAL write succeeds is the data written to the MemStore and the client notified that the write succeeded; if writing the WAL fails, the client is told that the write failed. The benefit is that data can be recovered if the RegionServer goes down. Therefore, for relatively unimportant data you can call Put.setWriteToWAL(false) or Delete.setWriteToWAL(false) to skip writing the WAL and thereby improve write performance.
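A minimal sketch; Put.setWriteToWAL(false) is the older call named above, and newer clients express the same thing with setDurability(Durability.SKIP_WAL), which is what this snippet uses (family/qualifier names are hypothetical):

```java
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public static Put unloggedPut(String rowKey, byte[] value) {
    Put put = new Put(Bytes.toBytes(rowKey));
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), value);
    // Skip the WAL for this mutation: faster, but the row is lost if the RegionServer
    // dies before the MemStore is flushed.
    put.setDurability(Durability.SKIP_WAL);
    return put;
}
```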

It is worth noting that turning off the WAL should be done with care: once a RegionServer goes down, any Put/Delete data that skipped the WAL cannot be recovered from it.

3. Read operation

3.1 Batch read

A single row can be fetched by calling HTable.get(Get) with the specified row key. HBase also provides HTable.get(List&lt;Get&gt;), which fetches multiple rows in one batch from a list of row keys. The advantage is that the whole batch costs only one network I/O round trip, which can bring a noticeable performance improvement in scenarios where data must be retrieved quickly and the network RTT is high.
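A batch-read sketch mirroring the batch write above (same hypothetical names):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public static Result[] batchGet(Connection conn, List<String> rowKeys) throws IOException {
    try (Table table = conn.getTable(TableName.valueOf("mytable"))) {
        List<Get> gets = new ArrayList<>();
        for (String rowKey : rowKeys) {
            gets.add(new Get(Bytes.toBytes(rowKey)));
        }
        // One RPC for the whole list; results come back in the same order as the Gets.
        return table.get(gets);
    }
}
```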

3.2 HTable parameter setting

3.2.1 Scanner Caching

The hbase.client.scanner.caching configuration item sets how many rows a scanner fetches from the server per round trip; by default it fetches one row at a time. Setting it to a reasonable value reduces the time spent in next() during a scan; the cost is that the client has to hold these cached rows in memory. It can be configured in three places:

  • 1) In HBase's conf configuration file;
  • 2) By calling HTable.setScannerCaching(int scannerCaching) for a table;
  • 3) By calling Scan.setCaching(int caching) for a single scan. The priority increases from 1) to 3), as sketched below.
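A sketch of the three levels, assuming the classic client API; hbase.client.scanner.caching is the real property name, the numeric values are arbitrary examples, and the per-table call only exists on the older HTable class:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scan;

public static Scan configureScannerCaching() throws IOException {
    // 1) Client-wide default via configuration
    Configuration conf = HBaseConfiguration.create();
    conf.setInt("hbase.client.scanner.caching", 100);

    // 2) Per-table default (older HTable API only)
    HTable table = new HTable(conf, "mytable");   // hypothetical table name
    table.setScannerCaching(200);

    // 3) Per-scan setting: highest priority, overrides 1) and 2)
    Scan scan = new Scan();
    scan.setCaching(500);
    return scan;
}
```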

3.2.2 Scan Attribute Selection

Specify the required Column Families (or columns) during a scan to reduce the amount of data transferred over the network; otherwise the scan returns all Column Families of every row by default.

3.2.3 Close ResultScanner

After fetching data through a scan, remember to close the ResultScanner; otherwise the corresponding server-side resources on the RegionServer cannot be released, which can cause problems.
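A sketch combining 3.2.2 and 3.2.3: restrict the scan to one column and let try-with-resources close the ResultScanner (table, family and qualifier names are hypothetical):

```java
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public static void scanOneColumn(Connection conn) throws IOException {
    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q")); // only ship this column back
    scan.setCaching(100);                                    // fetch 100 rows per round trip
    try (Table table = conn.getTable(TableName.valueOf("mytable"));
         ResultScanner scanner = table.getScanner(scan)) {   // both closed automatically
        for (Result result : scanner) {
            byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"));
            // process value ...
        }
    }
}
```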

3.3 Cache query results

For applications that query HBase frequently, consider adding a cache in the application layer: when a new query arrives, look it up in the cache first and return it directly on a hit without querying HBase; on a miss, issue the read to HBase and then cache the result in the application. For the cache replacement policy, a common strategy such as LRU can be used.
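A minimal application-side LRU cache sketch in plain Java (LinkedHashMap in access order); the capacity and the idea of keying by row key are illustrative choices, not something HBase provides:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A tiny LRU cache: a LinkedHashMap in access order evicts the least recently used
// entry once the configured capacity is exceeded.
public class QueryResultCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public QueryResultCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder = true turns this into an LRU map
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

Look the row key up in the cache before calling table.get(...), and put the extracted value into the cache after a miss.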

3.4 BlockCache mechanism

The RegionServer's memory is divided into two parts. (1) One part serves as the MemStore, used mainly for writes: a write request is first written to the MemStore, the RegionServer provides a MemStore for each region, and when a MemStore is full (64 MB) it is flushed to disk. When the total size of all MemStores exceeds the limit (heapsize * hbase.regionserver.global.memstore.upperLimit * 0.9), flushes are forced, starting with the largest MemStore, until the total drops below the limit. (2) The other part serves as the BlockCache, used mainly for reads. A read request a) first checks the MemStore, b) then checks the BlockCache if not found, and c) finally reads from disk if still not found, putting the result into the BlockCache. Because the BlockCache uses an LRU policy, once it reaches its upper limit (heapsize * hfile.block.cache.size * 0.85) the eviction mechanism kicks in and removes the oldest batch of data. A RegionServer has one BlockCache and N MemStores, and the sum of their configured sizes must not be greater than or equal to heapsize * 0.8, otherwise HBase will not start. The default is 0.2 for the BlockCache and 0.4 for the MemStore. For systems that care about read response time, the BlockCache can be set larger, for example BlockCache = 0.4 and MemStore = 0.39, to raise the cache hit rate.