I wanted to learn MySQL Server performance optimization like this at the beginning, but now it's like this!

I wanted to learn MySQL Server performance optimization like this at the beginning, but now it's like this!

Preface

This article mainly analyzes the implementation mechanism of MySQL Server (mysqld) and obtains some corresponding optimization suggestions. It mainly involves the installation of MySQL and the optimization of related parameter settings, but does not include the optimization of parameters related to storage engines other than mysqld. The related parameter settings of storage engines will be recommended in my next article "Optimization of Common Storage Engines" In the description.

1. MySQL installation optimization

Choose the right release version

Binary release version (including packaged specific binary version such as RPM)

Due to the open source nature of MySQL, not only MySQL AB provides a variety of binary distributions on multiple platforms for everyone to choose from, but also many third-party companies (or individuals) also provide us with many choices.

What benefits can we get by using the binary distribution version provided by MySQL AB?

  • Quickly complete the deployment of MySQL through a very simple installation method;
  • The installed version is a compiled version that has undergone a relatively complete function and performance test;
  • The compilation parameters used are more versatile and stable;
  • If you purchase MySQL service, you will be able to get MySQL technical support to the greatest extent;

Most of the MySQL distributions provided by third parties are made more or less targeted changes in the source code officially provided by MySQL AB, and then compiled. Some of these changes are improvements in certain functions, and some are improvements in the performance of certain write operations. There are also some releases provided by various OS vendors, which may be related to their own OS in some code aspects to make some corresponding low-level call adjustments, so that MySQL and their own OS can be more perfectly combined. Of course, there are also some third-party releases that have not touched a single line of MySQL code, but just made some relevant adjustments in the compilation parameters, and made MySQL perform better in certain specific scenarios.

In this way, it sounds as if the MySQL binary version issued by a third party is more attractive than the official binary distribution version provided by MySQL AB, so should we choose a binary distribution version provided by a third party? Don't worry, we need to further analyze the possible problems with third-party releases.

First of all, due to the changes made to MySQL by third-party releases, many of them are made in response to the specific scenarios of the publisher. Therefore, the third-party release version may not be suitable for the environment of all other users.

Secondly, since the publisher of the third-party release version is not necessarily a company (or individual) that is trustworthy enough, we don t know whether it has done enough comprehensive functional and performance tests before generating its own release version. , Will there be bugs that do not exist in the official release version of MySQL AB when we use it?

Finally, if we purchase MySQL-related services and use a third-party release version, when our system has problems, I am afraid that the support work of MySQL's support engineers will be greatly reduced, and they may even refuse to provide support.

If you can completely put aside the above concerns that may have hidden dangers, you can try to use the binary version that is not officially provided by MySQL AB, and choose a release version that may have more features or higher performance.

Before, I also did some tests and comparisons on various third-party binary distributions on the Internet, and found some relatively good versions. For example, the overall quality of Percona's release after integrating some excellent Patches is not bad. Use There are also more people. Of course, Percona not only distributes binary versions, but also distributes and integrates some excellent Patch source code packages. For those who want to make some patches provided by Percona, and those who want to compile by themselves to further optimize and customize MySQL, you can also download the source code package provided by Percona.

For the installation of the binary distribution version, for the installation itself, we basically don t have much to optimize. The only thing we can do is when we decide to choose a third-party distribution version, we can choose the right one according to our own environment and application characteristics. The optimized release version of our environment is installed.

Source code installation

Compared with the binary release version, if we choose to install through the source code, the adjustments we can make to MySQL during the installation process will be more and more flexible. Because we can compile from source code:

  • Choose a suitable compiler for your own hardware platform to optimize the compiled binary code;
  • Adjust relevant compilation parameters according to different software platform environments;
  • For our specific application scenarios, choose what components are needed and what components are not needed;
  • According to the data content we need to store, we choose to install only the character set we need;
  • Multiple MySQL can be installed on the same host;
  • And so on, other various adjustments that can be made according to specific application scenarios.

While the source code installation brings us more flexibility, it also brings us hidden dangers that may be introduced:

  • Insufficient understanding of the compilation parameters, resulting in improper use of the compilation parameters, may make the compiled binary code not stable enough;
  • The optimized parameters used to grasp the mistakes in your own application environment may make the system performance worse;
  • Another small problem that cannot be called a hidden danger is that source code compilation and installation will make the installation and deployment process more complicated and take longer;

The most important feature of installing through source code is that we can adjust the compilation parameters by ourselves and customize the installation results to the greatest extent. Below I will give a brief introduction to some of my optimization experience in compiling and installing through source code, hoping to be helpful to everyone.

When installing from source code, the most critical step is to configure the compilation parameters, that is, to execute various compilation options set by the configure command. We can execute by executing under the folder where the MySQL source code is located "

./configure help
"Get all the compilation parameter options that can be set, as follows:

`configure 'configures this package to adapt to many kinds of systems. Usage: ./configure [OPTION]... [VAR=VALUE]... ... ... Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX ... ... For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] ... ... Program names: --program-prefix=PREFIX prepend PREFIX to installed program names ... ... System types: --build=BUILD configure for building on BUILD [guessed] ... ... Optional Features: --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) ... ... Optional Packages: --with-charset=CHARSET ... ... --without-innodb Do not include the InnoDB table handler ... ... Some influential environment variables: CC C compiler command ... ... CCASFLAGS assembler compiler flags (defaults to CFLAGS) ... ... Copy code

Many of the above output content have been omitted, and you can get richer content output through your own testing. Here is a brief introduction to several more important compilation parameters:

  • "
    Prefix
    ": Set the installation path, the default is "/usr/local";
  • "
    Datadir
    ": Set the MySQL data file storage path;
  • "
    With-charset
    ": Set the default character set of the system;
  • "
    With-collation
    ": The default verification rule of the system;
  • "
    With-extra-charsets
    ": The character set that needs to be compiled and installed outside the default character set;
  • "
    With-unix-socket-path
    ": Set the socket file address;
  • "
    With-tcp-port
    ": Specify a specific listening port, the default is 3306;
  • "
    With-mysqld-user
    ": Specify the os user running mysqld, the default is mysql;
  • "
    Without-query-cache
    ": Disable the Query Cache function;
  • "
    Without-innodb
    ": Disable the Innodb storage engine;
  • "
    --with-partition
    ": Open the partition support feature in version 5.1;
  • "
    --enable-thread-safe-client
    ": Compile the client in thread mode;
  • "
    With-pthread
    ": Mandatory use of pthread thread library compilation;
  • "
    With-named-thread-libs
    ": Specify to use a specific thread library to compile;
  • "
    Without-debug
    ": Use non-debug mode;
  • "
    With-mysqld-ldflags
    ": Additional link parameters of mysqld;
  • "
    With-client-ldflags
    ": the additional link parameters of the client;

The above parameters are some of the more commonly used compilation parameters in source code installation. The first few compilation parameters are mainly used to facilitate our customization of our own system during installation and make the system more suitable for the relevant specifications of our own application environment. The environment is unified, and corresponding binary codes are generated according to actual needs. The latter parameters are mainly used to optimize the compilation results.

I think everyone should understand that generally speaking, the more complex a system is, the worse its performance will generally be. Therefore, when we install and compile MySQL, we should try to select only the components we need, install only the storage engine we need, and only compile the character set we need, so that our system can be as simple as possible, because such MySQL will also Give us the highest possible performance.

In addition, for some specific software environments, there may be a variety of thread library choices. If you have a better understanding of each thread library, you can set the compilation parameters to let MySQL use the most suitable thread library. Give full play to his best side in a specific environment.

The compilation parameters of the source code package will generate binary code in Debug mode by default, and the performance loss of Debug mode to MySQL is relatively large, so when we compile the product code to be installed, we must not forget to use "

Without-debug
The parameter disables the Debug mode.

and"

With-mysqld-ldflags
"with"
With-client-ldflags
"If the two compilation parameters are set to "
-all-static
", you can tell the compiler to compile in a static way to get the highest performance of the compiled code. Compared with the code compiled in a static way and dynamically compiled, the performance gap may reach as much as 5% to 10%.

As far as I am concerned, the most commonly used compilation configuration parameters are as follows, you can refer to the relevant content of your own additions and deletions:

. / configure --prefix=/usr/local/mysql/ --without-debug/ --without-bench/ --enable-thread-safe-client/ --enable-assembler/ --enable-profiling/ - with-mysqld-ldflags=-all-static/ --with-client-ldflags=-all-static/ --with-charset=latin1/ --with-extra-charset=utf8,gbk/ --with-innodb/ --with-csv-storage-engine/ --with-federated-storage-engine/ --with-mysqld-user=mysql/ --without-embedded-server/ --with-server-suffix=-community/ - -with-unix-socket-path =/usr/local/mysql/sock/mysql.sock duplicated code

2. MySQL log settings optimization

After installing MySQL, some optimizations and adjustments must be made to various parameter options of MySQL. Although the MySQL system is highly scalable, it can run efficiently in an environment with sufficient hardware resources, and it can also run well in an environment with few resources, but in any case, as sufficient hardware resources as possible are important Performance improvements are always helpful. In this section, we mainly analyze the impact of MySQL logs (mainly Binlog) on system performance, and draw corresponding optimization ideas based on the relevant characteristics of the logs.

The performance impact of the log

MySQL logs include Error Log, Update Log, Binlog, Query Log, Slow Query Log, etc. Of course, the update log is only available in the old version of MySQL, and it has been replaced by the binary log.

By default, the system only opens the error log and closes all other logs to achieve the goal of minimizing IO loss and improving system performance . However, in practical application scenarios that are generally slightly more important, at least the binary log needs to be turned on, because this is the basis for many MySQL storage engines to perform incremental backups, and it is also the basic condition for MySQL to achieve replication. Sometimes in order to further optimize the performance and locate the slower SQL statement, many systems will also open the slow query log to record the SQL statement whose execution time exceeds a certain value (set by us).

Under normal circumstances, few systems in the production system will open the query log. After the query log is opened, every query executed in MySQL will be recorded in the log, which will bring a relatively large IO burden to the system, but the actual benefits brought by it are not very large. Generally, only in the development and test environment, in order to locate which SQL statements are used in certain functions, the log will be opened in a short period of time for corresponding analysis. Therefore, in the MySQL system, the MySQL log (not including the log of each storage engine) that will affect the performance is mainly Binlog.

Binlog related parameters and optimization strategies

Let's first look at the relevant parameters of Binlog. You can get the relevant parameters of Binlog by executing the following commands. Of course, it also shows "

innodb_locks_unsafe_for_binlog
"This Binlog-related parameter unique to the Innodb storage engine:

mysql > show variables like '%binlog%' ; + --------------------------------+----- -------+ | Variable_name | Value | + --------------------------------+--- ---------+ | binlog_cache_size | 1048576 | | innodb_locks_unsafe_for_binlog | OFF | | max_binlog_cache_size | 4294967295 | | max_binlog_size | 1073741824 | | sync_binlog | 0 | + --------------------------------+------------+ copy Code

"

binlog_cache_size
": The size of the cache to accommodate binary log SQL statements during the transaction. The binary log cache is the memory allocated for each client under the premise that the server supports the transaction storage engine and the server has enabled the binary log (-log-bin option). Note , Each Client can allocate the set size
binlog cache
space. If multi-sentence transactions often appear in the system of readers and friends, you can try to increase the value of this value to get more performance. Of course, we can judge the current state by the following two state variables of MySQL
binlog_cache_size
Status:
Binlog_cache_use
with
Binlog_cache_disk_use
.

"

max_binlog_cache_size
":with"
binlog_cache_size
"Corresponding, but it represents the maximum cache memory size that binlog can use. When we execute multi-statement transactions,
max_binlog_cache_size
If it is not big enough, the system may report "
Multi-statement transaction required more than'max_binlog_cache_size' bytes of storage
"mistake.

"

max_binlog_size
": The maximum value of the Binlog log, generally set to 512M or 1G, but cannot exceed 1G. This size does not strictly control the Binlog size, especially when it reaches the Binlog closer to the end and encounters a larger transaction. In order to ensure the integrity of the transaction, the system cannot switch the log, and can only record all the SQL of the transaction into the current log until the end of the transaction.
Oracle
of
Redo
The log is a bit different, because Oracle's Redo log records the changes in the physical location of the data file, and it also records Redo and Undo related information, so whether the same transaction is in the same log is not critical to Oracle . What MySQL records in Binlog is database logic change information, which MySQL calls Event, which is actually a Query statement such as DML that brings database changes.

"

sync_binlog
": This parameter is very important for the MySQL system. It not only affects the performance loss that Binlog brings to MySQL, but also affects the integrity of the data in MySQL. For "
sync_binlog
The description of the various settings of the parameter is as follows:

  • sync_binlog=0
    , When the transaction is committed, MySQL does not refresh disk synchronization instructions such as fsync
    binlog_cache
    The information in the disk, and let
    Filesystem
    Decide when to synchronize, or synchronize to disk after the cache is full.
  • sync_binlog=n
    , After every n transaction commits, MySQL will perform a disk synchronization command such as fsync to force the data in binlog_cache to be written to disk.

The system default setting in MySQL is

sync_binlog=0
, That is, do not do any compulsory disk refresh command. At this time, the performance is the best, but the risk is also the greatest. Because once the system crashes,
binlog_cache
All binlog information in will be lost. When it is set to "1", it is the safest setting but with the greatest performance loss. Because when set to 1, even if the system crashes, it will be lost at most
binlog_cache
An unfinished transaction in, does not have any substantial impact on the actual data. Judging from past experience and related tests, for systems with high concurrent transactions, "
sync_binlog
"The gap between the write performance of the system set to 0 and set to 1 may be as high as 5 times or more.

As everyone knows, MySQL replication is actually achieved by replicating the Binlog on the Master side to the Slave side through the network by using the IO thread, and then parsing the logs in the Binlog through the SQL thread and then applying it to the database. Therefore, the size of the Binlog volume will have a direct impact on the IO thread and the network between Msater and Slave.

The amount of Binlog generated in MySQL cannot be changed. As long as our Query changes the data in the database, then the Event corresponding to the Query must be recorded in Binlog. So is there no way to optimize replication? Of course not. In the MySQL replication environment, there are actually 8 parameters that allow us to control the DB or Table that needs to be replicated or that need to be ignored and not replicated. They are:

  • Binlog_Do_DB : Set which database (Schema) needs to record Binlog;
  • Binlog_Ignore_DB : Set which databases (Schema) do not record Binlog;
  • Replicate_Do_DB : Set the database to be replicated (Schema), multiple DBs are separated by commas (",");
  • Replicate_Ignore_DB : set the database that can be ignored (Schema);
  • Replicate_Do_Table : Set the Table to be replicated;
  • Replicate_Ignore_Table : Set a table that can be ignored;
  • Replicate_Wild_Do_Table : Same function
    Replicate_Do_Table
    , But can be set with wildcards;
  • Replicate_Wild_Ignore_Table : Same function
    Replicate_Ignore_Table
    , Can be set with wildcards;

Through the above eight parameters, we can easily control the amount of Binlog from the Master end to the Slave end as little as possible according to actual needs, thereby reducing the network traffic from the Master end to the Slave end, reducing the IO amount of the IO thread, and Reduce the number of SQL threads to parse and apply SQL, and ultimately improve the data delay problem on the Slave.

In fact, the first two of the above eight parameters are set on the Master side, and the last six parameters are set on the Slave side. Although the first two parameters and the following six parameters are not directly related in function, similar functions can be enabled for optimizing MySQL Replication. Of course there are certain differences, the main differences are as follows:

  • If you set the first two parameters on the Master side, not only will the amount of IO brought by the Binlog record on the Master side be reduced, but also the IO thread on the Master side can reduce the amount of Binlog read, which is passed to the Binlog of the IO thread on the Slave side. The amount will naturally be less. The advantage of this is that it can reduce network IO, reduce the amount of IO on the Slave side IO thread, and reduce the amount of IO on the Slave side.
    SQL
    The workload of the thread, thereby optimizing the copy performance to the greatest extent. Of course, setting on the Master side also has certain drawbacks, because MySQL's judgment whether it is necessary to copy an Event is not based on the DB where the data changed by the Query that generated the Event is located, but based on the default Schema at the time the Query is executed, which is The DB specified when we log in or run "
    USE DATABASE
    Specified in the DB. Only when the current default DB is exactly the same as the DB set in the configuration, the IO thread will read the Event to the Slave IO thread. So if it appears in the default DB and settings in the system When the data of a table in the DB that needs to be copied is changed when the DB to be copied is different, the Event will not be copied to the Slave, which will cause the data on the Slave side and the data of the Master to be inconsistent The situation arises. Similarly, if the data in the schema that does not need to be copied is changed under the default schema, it will be copied to the Slave side. When the Slave side does not have the Schema, it will cause replication errors and stop;
  • If you set the following six parameters on the Slave side, it may be slightly inferior in terms of performance optimization than on the Master side, because the events that need or do not need to be copied will be read to the Slave side by the IO thread, so Not only increased the amount of network IO, but also increased the IO thread on the Slave side
    Relay Log
    The amount of writes. But it can still reduce the amount of log application of the Slave SQL thread on the Slave side. Although the performance is slightly inferior, setting the replication filter mechanism on the Slave side can ensure that there will be no inconsistencies or replication errors between the Slave and Master data due to the default Schema.

Slow Query Log related parameters and usage suggestions

Take a look again

Slow Query Log
Related parameter configuration. Sometimes, in order to locate the more efficient Query statements in the system, we need to open the slow query log, which is
Slow Query Log
. We can view the related settings of the system slow query log as follows:

MySQL > Show Variables like 'log_slow%' ; + ------------------ + ------- + | variable_name | the Value | + ------ ------------+-------+ | log_slow_queries | ON | + ------------------+---- ---+ 1 row in set ( 0.00 sec) MySQL > Show Variables like 'long_query%' ; + ----------------- + ------- + | variable_name | the Value | + ------- ----------+-------+ | long_query_time | 1 | + -----------------+------- + . 1 Row in SET ( 0.01 sec) copying the code

"

log_slow_queries
The parameter shows whether the system has been turned on
Slow Query Log
Function, while "
long_query_time
"The parameter tells us how long the execution time of the Slow Query record of the current system setting exceeds the Query. In the MySQL version released by MySQL AB
Slow Query Log
The minimum slow query time that can be set is 1 second, which may not fully meet our requirements in some cases. If you want to further shorten the time limit of slow queries, you can use the one provided by Percona
microslow-patch
(Piece becomes
msl Patch
) To break the limit.
msl patch
Not only can the slow query time be reduced to the millisecond level, but also the recorded SQL can be filtered through some specific rules, such as only records related to a certain table
Slow Query
And so on additional features. Considering the space issue, I won t introduce it here
msl patch
For more detailed functions and usage, please refer to the official introduction ( www.mysqlperformanceblog.com/2008/04/20/...

turn on

Slow Query Log
The overall impact of functions on system performance is not as great as Binlog. After all, the data volume of Slow Query Log is relatively small, and the IO loss it brings is also small. However, the system needs to calculate the execution time of each Query, so the consumption will always be Some, mainly CPU consumption. If your system has sufficient CPU resources, you don't need to care about this little loss, after all, it may bring us greater performance optimization gains. But if our CPU resources are also tight, we can turn off this function most of the time, and only need to turn on the Slow Query Log function intermittently to locate possible slow queries.

As other MySQL logs are rarely used (Query Log) or have little performance impact, we will not analyze too much here. As for the logs related to each storage engine, we will leave it in the next "Common Storage Engine Optimization" section and do the corresponding Analysis.

3. Query Cache optimization

When it comes to Query Cache, I am afraid that most people who have used MySQL will have more or less understanding, because in many people's eyes, it can help us produce a "qualitative" improvement in the performance of the database. But is this really the case? In this section, we will make some corresponding analysis on how to reasonably use MySQL's Query Cache and come up with some optimization suggestions.

Is Query Cache really a "shangfang sword"?

MySQL

Query Cache
The implementation principle is actually not particularly complicated. In simple terms, the Query statement requested by the client (of course limited to SELECT type Query) is calculated through a certain hash algorithm to obtain a hash value, which is stored in a hash bucket in. At the same time, the result set of the Query (
Result Set
) Is also stored in a memory Cache. Store
Query hash
The node where each hash value in the linked list of values is located also stores the corresponding Query
Result Set
The memory address where the Cache is located, and other related information such as the identifiers of all Tables involved in the Query. When the system receives any SELECT type Query, it first calculates its hash value, and then passes the hash value to
Query Cache
To match, if the exact same Query is found, the previous Cache will be directly
Result Set
Return to the client without any subsequent steps to complete this request. And after any piece of data in any table in the backend changes, it will also notify
Query Cache
, It is necessary to invalidate all the Caches of Query related to the Table, and release the memory address previously occupied so that other queries can be used later.

Judging from the above implementation principle, Query Cache is indeed a relatively simple implementation that brings huge performance benefits. But many people may have overlooked the negative impact of using QueryCache:

  • The hash operation of Query statement and hash lookup resource consumption. When we use Query Cache, after each SELECT type Query arrives in MySQL, it needs to perform a hash operation and then find out whether the Query cache exists. Although this hash operation algorithm may already be very efficient, the hash lookup process is also It has been optimized enough. The resources consumed for a query are indeed very, very small, but when we have thousands or even thousands of queries per second, we cannot completely ignore the resulting CPU consumption. .
  • Invalidation of Query Cache. If our table changes more frequently, it will cause a very high failure rate of Query Cache. The table change here not only refers to the change of the data in the table, but also includes any change in the structure or index. That is to say, every time we cache the Cache data in the Query Cache, it may be cleared soon after the data in the table is changed, and then the previous Cache cannot be used after the new same Query comes in.
  • The Result Set is cached in the Query Cache, not the data page . That is to say, there is a possibility that the same record will be cached multiple times. This causes excessive consumption of memory resources. Of course, some people may say that we can limit the size of the Query Cache. Yes, we can indeed limit the size of Query Cache, but in this way, Query Cache can easily be swapped out due to insufficient memory, resulting in a drop in hit rate.

Regarding the above three negative effects of Query Cache, if each of them is taken out separately, it will not cause much problem to the entire system, and it will not cause too many concerns about the use of Query Cache. However, when these three negative effects are considered together, I am afraid that Query Cache will no longer be the previous "sword of justice" in the minds of many people.

Appropriate use of Query Cache

Although the use of Query Cache will have some negative effects, we should also believe that its existence must have a certain value. We do not completely lose confidence in Query Cache because of the above three negative effects of Query Cache. As long as we understand the implementation principle of Query Cache, then we can use certain means to maximize the strengths and avoid weaknesses when using Query Cache, focus on its advantages, and effectively avoid its disadvantages.

First of all, we need to determine which tables are suitable for using Query and which tables are not suitable for use according to the Query Cache invalidation mechanism. The failure of Query Cache is mainly due to changes in the data of the Table on which Query depends, causing Query to fail.

Result Set
It may have been changed to cause all related Query Cache to fail, then we should avoid using it on the Query of the table with frequent changes, and should use it on the Query of the table with less frequent changes. There are two special SQL Hints (hints) for Query Cache in MySQL:
SQL_NO_CACHE
with
SQL_CACHE
Respectively represent mandatory not to use Query Cache and mandatory use of Query Cache. We can take advantage of these two
SQL Hint
, Let MySQL know which SQL we want to use Query Cache and which SQL do not use. This not only allows the query of the frequently changing table to waste the memory of the Query Cache, but also reduces the amount of Query Cache detection.

Secondly, for those data whose changes are very small and mostly static, we can add

SQL_CACHE
of
SQL Hint
, Force MySQL to use Query Cache to improve the query performance of the table.

Finally, some SQL Result Sets are very large. If Query Cache is used, it is easy to cause insufficient Cache memory, or flush out some old Caches before. For this type of query we have two methods to solve, one is to use

SQL_NO_CACHE
Parameter to force him not to use Query Cache and directly search from the actual data every time. Another way is to set "
query_cache_limit
"The parameter value controls the maximum Result Set of the Cache in the Query Cache. The system defaults to 1M (1048576). When the Result Set of a Query is greater than "
query_cache_limit
When the value is set, Query Cache will not Cache this Query.

Related system parameter variables and state variables of Query Cache

Let's first look at the system variables of Query Cache. You can obtain system parameter variables related to Query Cache in MySQL by executing the following commands:

mysql > show variables like '%query_cache%' ; + ------------------------------+------- ----+ | Variable_name | Value | + ------------------------------+-------- ---+ | have_query_cache | YES | | query_cache_limit | 1048576 | | query_cache_min_res_unit | 4096 | | query_cache_size | 268435456 | | query_cache_type | ON | | query_cache_wlock_invalidate | OFF | + ------------------------------+----------- + copy code
  • "
    have_query_cache
    ": Whether the MySQL supports Query Cache;
  • "
    query_cache_limit
    ": The maximum Result Set of a single Query stored in the Query Cache, the default is 1M;
  • "
    query_cache_min_res_unit
    ": The minimum memory size stored in each Result Set of the Query Cache, the default is 4k;
  • "
    query_cache_size
    ": The size of the Query Cache memory used in the system;
  • "
    query_cache_type
    ": Whether the Query Cache function is turned on in the system;
  • "
    query_cache_wlock_invalidate
    ": For the MyISAM storage engine, set when there is a WRITE LOCK on a table, whether the read request should wait for the WRITE LOCK to release the resource before querying or allow the result to be read directly from the Query Cache, the default is FALSE (you can directly Get results from Query Cache).

The settings of the above parameters are mainly "

query_cache_limit
"with"
query_cache_min_res_unit
"The settings of the two parameters need to be adjusted for the application. If we need the Result Set of the Cache to be generally very small (less than 4k), we can appropriately change the "
query_cache_min_res_unit
"The parameter should be adjusted smaller to avoid waste of memory,"
query_cache_limit
"The parameters do not need to be adjusted. If we need most of the Result Set of the Cache to be larger than 4k, it is best to change the "
query_cache_min_res_unit
"Adjust to the size of the Result Set,"
query_cache_limit
"The parameter should also be greater than the size of the Result Set. Of course, sometimes it may be difficult for us to accurately estimate the size of the Result Set, so when the Result Set is large, we don t have to change the "
query_cache_min_res_unit
"The setting is about the same size as each Result Set. It can be half or a quarter of the size of each result set. It is indeed impossible to achieve perfection without wasting any memory at all.

If we want to understand the usage of Query Cache, we can get it through the state variables related to Query Cache, for example, through the following command:

MySQL > Show Status like 'Qcache%' ; + ------------------------- + ----------------------- + | Variable_name | Value | + -------------------------+------------+ | Qcache_free_blocks | 7499 | | Qcache_free_memory | 190662000 | | Qcache_hits | 1888430018 | | Qcache_inserts | 1014096388 | | Qcache_lowmem_prunes | 106071885 | |Qcache_not_cached | 7951123988 | | Qcache_queries_in_cache | 19315 | | Qcache_total_blocks | 47870 | + -------------------------+--------- ---+ Copy code
  • "
    Qcache_free_blocks
    ": How many remaining blocks are currently in the Query Cache. If the value is displayed larger, it means that there are more memory fragments in the Query Cache, and you may need to find a suitable opportunity to sort it out ().
  • "
    Qcache_free_memory
    ": The current remaining memory size in the Query Cache. Through this parameter, we can more accurately observe whether the Query Cache memory size in the current system is sufficient, whether it needs to be increased or is too much;
  • "
    Qcache_hits
    ": How many hits. Through this parameter, we can see the basic effect of Query Cache;
  • "
    Qcache_inserts
    ": How many misses and then insert. Passed"
    Qcache_hits
    "with"
    Qcache_inserts
    "With two parameters, we can calculate the hit rate of Query Cache:

Query Cache hit rate =

Qcache_hits/(Qcache_hits + Qcache_inserts)

  • "
    Qcache_lowmem_prunes
    ": How many queries were cleared out of the Query Cache due to insufficient memory. Passed "
    Qcache_lowmem_prunes
    "with"
    Qcache_free_memory
    "Combined with each other, we can understand more clearly whether the memory size of Query Cache in our system is really sufficient, and whether Query is swapped out because of insufficient memory very frequently.
  • "
    Qcache_not_cached
    ":because
    query_cache_type
    Settings or the number of queries that cannot be cached;
  • "
    Qcache_queries_in_cache
    ": The number of Query cached in the current Query Cache;
  • "
    Qcache_total_blocks
    ": The number of blocks in the current Query Cache;

Restrictions of Query Cache

Because Query Cache stores logically structured Result Sets instead of physical data pages, it will also be subject to some specific restrictions while improving performance.

  • Versions prior to 5.1.17 cannot Cache to help Query for variables, but starting from version 5.1.17, Query Cache has begun to support Query for variables;
  • The external query SQL in all subqueries cannot be cached;
  • Query in Procedure, Function and Trigger cannot be cached;
  • Query that contains many other functions that may get different results each time it is executed cannot be cached.

In view of the above restrictions, in the process of using Query Cache, it is recommended to use it in a precise setting, so that only the data of the appropriate table can enter the Query Cache, and only some of the query results of the Query are cached.

4. MySQL Server other commonly used optimization

In addition to installation, logs, and Query Cache, there are many other aspects of settings that may affect the overall performance of MySQL Server, such as network connection, thread management, and table management. In this section, we will analyze other optimizable parts that may affect the performance of MySQL Server in addition to the previous sections.

Network connection and connection thread

Although the connection method of MySQL is not only through the network, but also through the named pipe, no matter what method is connected to MySQL, all the connections requested by the client are managed through the thread. Each client connection will have a corresponding generation of a connection thread. Let's first look at the performance configuration items connected to the network and their impact on performance.

  • max_conecctions
    : The maximum number of connections allowed by the entire MySQL;

This parameter mainly affects the concurrent processing capability of the entire MySQL application. When the actual number of connections required in the system is greater than

max_conecctions
In the case of MySQL, due to the setting restrictions of MySQL, the application will inevitably generate waiting for connection requests, thereby limiting the corresponding amount of concurrency. So in general, as long as the performance of the MySQL host allows, set this parameter as large as possible. Generally speaking, about 500 to 800 is a suitable reference value

  • max_user_connections
    : The maximum number of connections allowed by each user;

The above parameter limits the number of connections for the entire MySQL, and

max_user_connections
It is a connection restriction for a single user. In general, we may seldom use this restriction, and it may be needed only in some applications that provide MySQL data storage services or provide virtual hosting services. In addition to the restricted objects, other aspects and
max_connections
same. The setting of this parameter completely depends on the number of connected users of the application. For ordinary applications, there are no too many restrictions at all, and you can try to let it go.

  • net_buffer_length
    : In the network packet transmission, the initial size of the net buffer before the message is transmitted;

This parameter may mainly affect the efficiency of network transmission. Since this parameter is only the initial size of the message buffer, the main impact is that MySQL always needs to apply for multiple applications when our messages are large each time. Expand the buffer size. The default size of the system is 16KB. Generally speaking, it can meet most scenarios. Of course, if our queries are very small, the network transmission volume is very small each time, and the system memory is relatively short, the value can also be appropriately set Reduced to 8KB.

  • max_allowed_packet
    : In the network transmission, the maximum value of a message transmission at a time;

This parameter is the same as

net_buffer_length
Correspondingly, it is just the maximum value of net buffer. When our message transmission volume is greater than
net_buffer_length
When setting, MySQL will automatically increase the size of the net buffer until the buffer size reaches
max_allowed_packet
The set value. The default value of the system is 1MB, and the maximum value is 1GB. It must be set as a multiple of 1024 and the unit is byte.

  • back_log
    : The maximum number of connection requests allowed to be stored in the MySQL connection request waiting queue.

The connection request waiting queue actually means that when the number of client connection requests is too large at a certain time, the MySQL main thread cannot allocate (or create) a connection thread to each new connection request in time, and it has not been allocated to All requests of the connection thread will be stored in a waiting queue, which is the MySQL connection request queue. When our system has a large number of instantaneous connection requests, we should pay attention

back_log
Parameter setting. The system default value is 50, and the maximum can be set to 65535. When we increase
back_log
At the same time, you also need to limit the network monitoring queue at the OS level, because if the network monitoring setting of the OS is less than that of MySQL
back_log
When setting it up, we increase the "
back_log
"The setting is meaningless.

The main optimization settings related to network connection interaction are introduced above, and let's take a look at the connection thread corresponding to each client connection.

In MySQL, in order to improve the performance of the process of client request to create a connection as much as possible, a

Thread Cache
Pool, where idle connection threads are stored instead of being destroyed after completing the request. In this way, when there is a new connection request, MySQL will first check
Thread Cache
Whether there is an idle connection thread in the pool, if it exists, it will be taken out and used directly. If there is no idle connection thread, a new connection thread will be created. The system parameters and state variables related to the connection thread in MySQL are described as follows:

  • thread_cache_size
    : The number of connection threads that should be stored in the Thread Cache pool.

When the system is initially started, it will not be created immediately

thread_cache_size
The set number of connection threads are stored in the Thread Cache pool, but as connection threads are created and used, the used connection threads are slowly stored in it. When the stored connection thread reaches
thread_cache_size
After the value is set, MySQL will no longer save the connection threads that are used up.

If our application uses short connections,

Thread Cache
The effect of the pool is the most obvious. Because in short-connection database applications, the creation and destruction of database connections are very frequent. If MySQL needs to create and destroy the corresponding connection thread every time, then this resource consumption is actually very large, and when we use Up
Thread Cache
After that, since most of the connection threads are created and waiting to be used, they do not need to be re-created each time, and they do not need to be destroyed after use, so a lot of system resources can be saved. So in a short-connected application system,
thread_cache_size
The value of should be set relatively large, and should not be less than the actual number of concurrent requests of the application system to the database.

And if we are using a long connection,

Thread Cache
The effect of using a short connection may not be as great, but it is not completely useless. Because even if the application uses a long connection, it is difficult to ensure that all the connections they manage can be in a very stable state, and there will still be many connection closures and new creation operations. In some systems with high concurrency and a large number of application servers, it is very common to create and close connections ten times per minute. And if the connection pool management of the application server is not very good, and the connection pool is prone to jitter, the resulting connection creation and destruction operations will be more. So even in an application environment that uses a long connection,
Thread Cache
The utilization of the mechanism is still very helpful for performance. It s just that we don t need to change
thread_cache_size
The parameter setting is too large. Generally speaking, it should be between 50 and 100.

  • thread_stack
    : When each connection thread is created, the amount of memory MySQL allocates to him.

When MySQL creates a new connection thread, it needs to allocate a certain size of memory stack space to it in order to store the client's request Query and its own various status and processing information. But generally speaking, if you are not familiar with MySQL's connection thread processing mechanism, you should not easily adjust the size of this parameter. Using the system default value (192KB) can basically be used in all common application environments. If the value is set too small, it will affect the size of the Query content that the MySQL connection thread can process the client request, and the size of the Query content created by the user

Procedures
with
Functions
Wait.

The above are all how we can configure the network connection interaction and the performance-related parameters of the connection thread. Let's see how to check whether the above settings are reasonable and whether there is any need to adjust. We can get relevant status information by executing the following commands in the system to help you verify the rationality of the settings:

We now look at the setting values of system variables related to the connection thread:

MySQL > Show the Variables like 'the Thread%' ; + ------------------- + -------- + | variable_name | Value | + ---- ---------------+--------+ | thread_cache_size | 64 | | thread_stack | 196608 | + -------------- -----+--------+ Copy code

Let's take a look at the number of times the system is connected and the status value of the connection thread in the current system:

MySQL > Show Status like 'Connections' ; + --------------- + ------- + | variable_name | the Value | + ---------- -----+-------+ | Connections | 127 | + ---------------+-------+ mysql > show status like '%thread%' ; + ------------------------+-------+ | Variable_name | Value | + ------------------------+-------+ | Delayed_insert_threads | 0 | | Slow_launch_threads | 0 | | Threads_cached | 4 | | Threads_connected | 7 | | Threads_created | 11 | | Threads_running | 1 | + ------------------------+-------+ Copy code

Through the above command, we can see that the system has set up the Thread Cache pool to cache up to 32 connection threads. When each connection thread is created, the system allocates 192KB of memory stack to him. The system has received 127 connections from the client until now, and 11 connection threads have been created. However, the first 7 connection threads are in the state of connecting with the client, and only one of the 7 connection threads is in the active state. In other words, only one Russian request submitted by the client is being processed. In the Thread Cache pool, a total of 4 connection threads are Cached.

Through the analysis of system settings and current status, we can find that

thread_cache_size
The settings are sufficient, and even far greater than the needs of the system. So we can appropriately reduce
thread_cache_size
The setting, for example, set to 8 or 16. according to
Connections
with
Threads_created
With these two system status values, we can also calculate the Thread Cache hit rate of newly-created connections in the system, that is, the ratio of the number of connection threads obtained through the Thread Cache pool to the total number of connections received by the system, as follows:

Threads_Cache_Hit = (the Connections - Threads_created) / the Connections * 100 % duplicated code

We can calculate the Thread Cache hit rate in the above environment through the above calculation formula:

Thread_Cache_Hit = (127-12)/127 * 100% = 90.55%

Generally speaking, after the system has been running stably for a period of time, our Thread Cache hit rate should be maintained at around 90% or even higher to be considered normal. It can be seen that the Thread Cache hit ratio in the above environment is basically normal.

Table Cache related optimization

Let's first look at the related mechanism of MySQL to open the table. Due to the multi-threaded implementation mechanism, in order to improve performance as much as possible, each thread in MySQL independently opens the file descriptor of the table it needs, instead of sharing the file descriptor of the opened table. . Of course, there may be different processing methods for different storage engines. Such as MyISAM table, each client thread needs to open a file descriptor to open any data file of any MyISAM table, but if it is an index file, multiple threads can share the same index file descriptor. For Innodb storage engine, if we use a shared table space to store data, then we need to open fewer file descriptors, and if we use a dedicated table space to store data, the same, because of storage If there are more data files for table data, a lot of table file descriptors will also be opened. In addition to opening the actual table or index of the database, temporary files also need to use file descriptors, which will also occupy the system

open_files_limit
Of setting limits.

In order to solve the problem of opening table file descriptors too frequently, MySQL has implemented a Table Cache mechanism in the system, which is similar to the Thread Cache mechanism introduced earlier, mainly the descriptors of all table files opened by the Cache. It does not need to be reopened when the request is made, and it does not need to be closed immediately when the use is over. In this way, the resource consumption caused by frequently opening and closing file descriptors can be reduced. Let's take a look at the system parameters and state variables related to Table Cache.

In MySQL we pass

table_cache
(From MySQL 5.1.3 to
table_open_cache
) To set the number of open table file descriptors for our Cache in the system. Through the introduction in the official MySQL manual, we set
table_cache
It should pass when the size is
max_connections
The parameters are calculated, and the formula is as follows:

table_cache = max_connections * N; copy the code

Where N represents the maximum number of Tables contained in a single Query statement. But I personally understand that this calculation is not too accurate. The analysis is as follows:

first of all,

max_connections
It is the maximum number of connections that the system can accept at the same time, but these connections are not necessarily in the active state, that is to say, there may be many connections in the Sleep state. It is impossible to open any Table when the connection is in Sleep state.

Secondly, this N is the number of Tables contained in the Query that contains the most Tables in the Query, because we cannot ignore the opening of the index file. Although the index file can share the open connection descriptor between each connection thread, it is always needed. Moreover, if the access to each table in my Query is retrieved through the current index positioning, or even through multiple indexes, then the file descriptors that need to be opened for the execution of the Query are even more, which may be N Twice or even three times as much.

Finally, this calculation formula can only calculate the maximum number of descriptors that we need to open at the same time, and the setting of table_cache does not necessarily have to be set according to this limit value, because table_cache only sets the descriptors opened by Cache The size of the quantity, not the size of the maximum amount that can be opened.

Of course, the above is just my personal understanding, and it may not be too rigorous. If you feel that you have other understandings, you can bring them up and discuss them again.

We can view the setting of table_cache and the current usage in the system in the following ways:

MySQL > Show Variables like 'the table_cache' ; + --------------- + ------- + | variable_name | the Value | + ---------- -----+-------+ | table_cache | 512 | + ---------------+-------+ MySQL > Show Status like 'open_tables' ; + --------------- + ------- + | variable_name | the Value | + ---------- -----+-------+ | Open_tables | 6 | + ---------------+-------+ Copy code

The above result shows that the table_cache set by the system is 512, which means that in this MySQL, Table Cache can cache 512 open file descriptors; the current system has only 6 open descriptors.

So under what circumstances will the Cache descriptor in the Table Cache pool be closed? Generally speaking, there are mainly the following situations where the descriptors of the Cache will be closed:

  • The Cache pool of Table Cache is full, and when a connection thread needs to open a table that is not in Table Cache, MySQL will use a certain algorithm to close some descriptors that are not in use;
  • When we execute commands such as Flush Table, MySQL will close all file descriptors of Cache in the current Table Cache;
  • When the amount of Cache in Table Cache exceeds the value set by the table_cache parameter;

Sort Buffer, Join Buffer and Read Buffer

In MySQL, in addition to the various Caches introduced before, there are also two types of Buffers during the execution of the query that will affect the overall performance of the database.

mysql > show variables like '%buffer%' ; + -------------------------------+------ ----+ | Variable_name | Value | + -------------------------------+------- ---+ ... ... | join_buffer_size | 4190208 | ... ... | sort_buffer_size | 2097144 | + -------------------------------+----------+ copy Code
  • join_buffer_size
    : Buffer used when our Join is ALL, index, rang or index_merge;

In fact, this kind of Join is called Full Join. In fact, each table participating in Join needs a Join Buffer, so when Join appears, there are at least two. The Join Buffer setting is 4GB before MySQL version 5.1.23, but starting from 5.1.23 version, the 4BG limit can be exceeded on 64-bit platforms other than Windows. The system default is 128KB.

  • sort_buffer_size
    : Buffer used when sorting data in the system;

Sort Buffer is also for a single Thread, so when multiple Threads are sorted at the same time, multiple Sort Buffers will appear in the system. Generally, we can improve the processing performance of ORDER BY or GROUP BY by increasing the size of Sort Buffer. The default size of the system is 2MB, and the maximum limit is the same as Join Buffer. Before MySQL version 5.1.23, the maximum is 4GB. Starting from version 5.1.23, the 4GB limit can be exceeded on 64-bit platforms except Windows.

If there are few Join statements in the application system, you don t need to care too much

join_buffer_size
The size of the parameter is set, but if the Join statement is not very few, personal suggestion can be increased appropriately
join_buffer_size
The setting is about 1MB, if the memory is sufficient, it can even be set to 2MB. for
sort_buffer_size
In terms of parameters, generally setting between 2MB and 4MB can meet the needs of most applications. Of course, if the sorting in the application system is relatively large, the memory is sufficient and the amount of concurrency is not particularly large, you can continue to increase
sort_buffer_size
setting. When setting these two Buffers, the most important thing to pay attention to is not to forget that each Thread will create its own independent Buffer instead of the Buffer shared by the entire system. Don't cause insufficient system memory due to excessive settings.

summary

The performance improvement that can be brought by performance optimization through parameter settings may not produce a qualitative leap as many people imagine, unless the previous settings are seriously unreasonable. We cannot completely rely on performance tuning on the parameter adjustments after the database goes online through the DBA. Instead, we should minimize performance issues during the system design and development stages. Of course, it cannot be denied that parameter adjustment has a relatively large impact on system performance in certain scenarios, but it is only a few special cases after all.

Friends who like it can pay attention to it, the next article updates "Common Storage Engine Optimization"!