This article mainly analyzes the implementation mechanism of MySQL Server (mysqld) and obtains some corresponding optimization suggestions. It mainly involves the installation of MySQL and the optimization of related parameter settings, but does not include the optimization of parameters related to storage engines other than mysqld. The related parameter settings of storage engines will be recommended in my next article "Optimization of Common Storage Engines" In the description.
1. MySQL installation optimization
Choose the right release version
Binary release version (including packaged specific binary version such as RPM)
Due to the open source nature of MySQL, not only MySQL AB provides a variety of binary distributions on multiple platforms for everyone to choose from, but also many third-party companies (or individuals) also provide us with many choices.
What benefits can we get by using the binary distribution version provided by MySQL AB?
- Quickly complete the deployment of MySQL through a very simple installation method;
- The installed version is a compiled version that has undergone a relatively complete function and performance test;
- The compilation parameters used are more versatile and stable;
- If you purchase MySQL service, you will be able to get MySQL technical support to the greatest extent;
Most of the MySQL distributions provided by third parties are made more or less targeted changes in the source code officially provided by MySQL AB, and then compiled. Some of these changes are improvements in certain functions, and some are improvements in the performance of certain write operations. There are also some releases provided by various OS vendors, which may be related to their own OS in some code aspects to make some corresponding low-level call adjustments, so that MySQL and their own OS can be more perfectly combined. Of course, there are also some third-party releases that have not touched a single line of MySQL code, but just made some relevant adjustments in the compilation parameters, and made MySQL perform better in certain specific scenarios.
In this way, it sounds as if the MySQL binary version issued by a third party is more attractive than the official binary distribution version provided by MySQL AB, so should we choose a binary distribution version provided by a third party? Don't worry, we need to further analyze the possible problems with third-party releases.
First of all, due to the changes made to MySQL by third-party releases, many of them are made in response to the specific scenarios of the publisher. Therefore, the third-party release version may not be suitable for the environment of all other users.
Secondly, since the publisher of the third-party release version is not necessarily a company (or individual) that is trustworthy enough, we don t know whether it has done enough comprehensive functional and performance tests before generating its own release version. , Will there be bugs that do not exist in the official release version of MySQL AB when we use it?
Finally, if we purchase MySQL-related services and use a third-party release version, when our system has problems, I am afraid that the support work of MySQL's support engineers will be greatly reduced, and they may even refuse to provide support.
If you can completely put aside the above concerns that may have hidden dangers, you can try to use the binary version that is not officially provided by MySQL AB, and choose a release version that may have more features or higher performance.
Before, I also did some tests and comparisons on various third-party binary distributions on the Internet, and found some relatively good versions. For example, the overall quality of Percona's release after integrating some excellent Patches is not bad. Use There are also more people. Of course, Percona not only distributes binary versions, but also distributes and integrates some excellent Patch source code packages. For those who want to make some patches provided by Percona, and those who want to compile by themselves to further optimize and customize MySQL, you can also download the source code package provided by Percona.
For the installation of the binary distribution version, for the installation itself, we basically don t have much to optimize. The only thing we can do is when we decide to choose a third-party distribution version, we can choose the right one according to our own environment and application characteristics. The optimized release version of our environment is installed.
Source code installation
Compared with the binary release version, if we choose to install through the source code, the adjustments we can make to MySQL during the installation process will be more and more flexible. Because we can compile from source code:
- Choose a suitable compiler for your own hardware platform to optimize the compiled binary code;
- Adjust relevant compilation parameters according to different software platform environments;
- For our specific application scenarios, choose what components are needed and what components are not needed;
- According to the data content we need to store, we choose to install only the character set we need;
- Multiple MySQL can be installed on the same host;
- And so on, other various adjustments that can be made according to specific application scenarios.
While the source code installation brings us more flexibility, it also brings us hidden dangers that may be introduced:
- Insufficient understanding of the compilation parameters, resulting in improper use of the compilation parameters, may make the compiled binary code not stable enough;
- The optimized parameters used to grasp the mistakes in your own application environment may make the system performance worse;
- Another small problem that cannot be called a hidden danger is that source code compilation and installation will make the installation and deployment process more complicated and take longer;
The most important feature of installing through source code is that we can adjust the compilation parameters by ourselves and customize the installation results to the greatest extent. Below I will give a brief introduction to some of my optimization experience in compiling and installing through source code, hoping to be helpful to everyone.
When installing from source code, the most critical step is to configure the compilation parameters, that is, to execute various compilation options set by the configure command. We can execute by executing under the folder where the MySQL source code is located "
`configure 'configures this package to adapt to many kinds of systems. Usage: ./configure [OPTION]... [VAR=VALUE]... ... ... Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX ... ... For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] ... ... Program names: --program-prefix=PREFIX prepend PREFIX to installed program names ... ... System types: --build=BUILD configure for building on BUILD [guessed] ... ... Optional Features: --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) ... ... Optional Packages: --with-charset=CHARSET ... ... --without-innodb Do not include the InnoDB table handler ... ... Some influential environment variables: CC C compiler command ... ... CCASFLAGS assembler compiler flags (defaults to CFLAGS) ... ... Copy code
Many of the above output content have been omitted, and you can get richer content output through your own testing. Here is a brief introduction to several more important compilation parameters:
- "Prefix": Set the installation path, the default is "/usr/local";
- "Datadir": Set the MySQL data file storage path;
- "With-charset": Set the default character set of the system;
- "With-collation": The default verification rule of the system;
- "With-extra-charsets": The character set that needs to be compiled and installed outside the default character set;
- "With-unix-socket-path": Set the socket file address;
- "With-tcp-port": Specify a specific listening port, the default is 3306;
- "With-mysqld-user": Specify the os user running mysqld, the default is mysql;
- "Without-query-cache": Disable the Query Cache function;
- "Without-innodb": Disable the Innodb storage engine;
- "--with-partition": Open the partition support feature in version 5.1;
- "--enable-thread-safe-client": Compile the client in thread mode;
- "With-pthread": Mandatory use of pthread thread library compilation;
- "With-named-thread-libs": Specify to use a specific thread library to compile;
- "Without-debug": Use non-debug mode;
- "With-mysqld-ldflags": Additional link parameters of mysqld;
- "With-client-ldflags": the additional link parameters of the client;
The above parameters are some of the more commonly used compilation parameters in source code installation. The first few compilation parameters are mainly used to facilitate our customization of our own system during installation and make the system more suitable for the relevant specifications of our own application environment. The environment is unified, and corresponding binary codes are generated according to actual needs. The latter parameters are mainly used to optimize the compilation results.
I think everyone should understand that generally speaking, the more complex a system is, the worse its performance will generally be. Therefore, when we install and compile MySQL, we should try to select only the components we need, install only the storage engine we need, and only compile the character set we need, so that our system can be as simple as possible, because such MySQL will also Give us the highest possible performance.
In addition, for some specific software environments, there may be a variety of thread library choices. If you have a better understanding of each thread library, you can set the compilation parameters to let MySQL use the most suitable thread library. Give full play to his best side in a specific environment.
The compilation parameters of the source code package will generate binary code in Debug mode by default, and the performance loss of Debug mode to MySQL is relatively large, so when we compile the product code to be installed, we must not forget to use "
As far as I am concerned, the most commonly used compilation configuration parameters are as follows, you can refer to the relevant content of your own additions and deletions:
. / configure --prefix=/usr/local/mysql/ --without-debug/ --without-bench/ --enable-thread-safe-client/ --enable-assembler/ --enable-profiling/ - with-mysqld-ldflags=-all-static/ --with-client-ldflags=-all-static/ --with-charset=latin1/ --with-extra-charset=utf8,gbk/ --with-innodb/ --with-csv-storage-engine/ --with-federated-storage-engine/ --with-mysqld-user=mysql/ --without-embedded-server/ --with-server-suffix=-community/ - -with-unix-socket-path =/usr/local/mysql/sock/mysql.sock duplicated code
2. MySQL log settings optimization
After installing MySQL, some optimizations and adjustments must be made to various parameter options of MySQL. Although the MySQL system is highly scalable, it can run efficiently in an environment with sufficient hardware resources, and it can also run well in an environment with few resources, but in any case, as sufficient hardware resources as possible are important Performance improvements are always helpful. In this section, we mainly analyze the impact of MySQL logs (mainly Binlog) on system performance, and draw corresponding optimization ideas based on the relevant characteristics of the logs.
The performance impact of the log
MySQL logs include Error Log, Update Log, Binlog, Query Log, Slow Query Log, etc. Of course, the update log is only available in the old version of MySQL, and it has been replaced by the binary log.
By default, the system only opens the error log and closes all other logs to achieve the goal of minimizing IO loss and improving system performance . However, in practical application scenarios that are generally slightly more important, at least the binary log needs to be turned on, because this is the basis for many MySQL storage engines to perform incremental backups, and it is also the basic condition for MySQL to achieve replication. Sometimes in order to further optimize the performance and locate the slower SQL statement, many systems will also open the slow query log to record the SQL statement whose execution time exceeds a certain value (set by us).
Under normal circumstances, few systems in the production system will open the query log. After the query log is opened, every query executed in MySQL will be recorded in the log, which will bring a relatively large IO burden to the system, but the actual benefits brought by it are not very large. Generally, only in the development and test environment, in order to locate which SQL statements are used in certain functions, the log will be opened in a short period of time for corresponding analysis. Therefore, in the MySQL system, the MySQL log (not including the log of each storage engine) that will affect the performance is mainly Binlog.
Binlog related parameters and optimization strategies
Let's first look at the relevant parameters of Binlog. You can get the relevant parameters of Binlog by executing the following commands. Of course, it also shows "
mysql > show variables like '%binlog%' ; + --------------------------------+----- -------+ | Variable_name | Value | + --------------------------------+--- ---------+ | binlog_cache_size | 1048576 | | innodb_locks_unsafe_for_binlog | OFF | | max_binlog_cache_size | 4294967295 | | max_binlog_size | 1073741824 | | sync_binlog | 0 | + --------------------------------+------------+ copy Code
- sync_binlog=0, When the transaction is committed, MySQL does not refresh disk synchronization instructions such as fsyncbinlog_cacheThe information in the disk, and letFilesystemDecide when to synchronize, or synchronize to disk after the cache is full.
- sync_binlog=n, After every n transaction commits, MySQL will perform a disk synchronization command such as fsync to force the data in binlog_cache to be written to disk.
The system default setting in MySQL is
As everyone knows, MySQL replication is actually achieved by replicating the Binlog on the Master side to the Slave side through the network by using the IO thread, and then parsing the logs in the Binlog through the SQL thread and then applying it to the database. Therefore, the size of the Binlog volume will have a direct impact on the IO thread and the network between Msater and Slave.
The amount of Binlog generated in MySQL cannot be changed. As long as our Query changes the data in the database, then the Event corresponding to the Query must be recorded in Binlog. So is there no way to optimize replication? Of course not. In the MySQL replication environment, there are actually 8 parameters that allow us to control the DB or Table that needs to be replicated or that need to be ignored and not replicated. They are:
- Binlog_Do_DB : Set which database (Schema) needs to record Binlog;
- Binlog_Ignore_DB : Set which databases (Schema) do not record Binlog;
- Replicate_Do_DB : Set the database to be replicated (Schema), multiple DBs are separated by commas (",");
- Replicate_Ignore_DB : set the database that can be ignored (Schema);
- Replicate_Do_Table : Set the Table to be replicated;
- Replicate_Ignore_Table : Set a table that can be ignored;
- Replicate_Wild_Do_Table : Same functionReplicate_Do_Table, But can be set with wildcards;
- Replicate_Wild_Ignore_Table : Same functionReplicate_Ignore_Table, Can be set with wildcards;
Through the above eight parameters, we can easily control the amount of Binlog from the Master end to the Slave end as little as possible according to actual needs, thereby reducing the network traffic from the Master end to the Slave end, reducing the IO amount of the IO thread, and Reduce the number of SQL threads to parse and apply SQL, and ultimately improve the data delay problem on the Slave.
In fact, the first two of the above eight parameters are set on the Master side, and the last six parameters are set on the Slave side. Although the first two parameters and the following six parameters are not directly related in function, similar functions can be enabled for optimizing MySQL Replication. Of course there are certain differences, the main differences are as follows:
- If you set the first two parameters on the Master side, not only will the amount of IO brought by the Binlog record on the Master side be reduced, but also the IO thread on the Master side can reduce the amount of Binlog read, which is passed to the Binlog of the IO thread on the Slave side. The amount will naturally be less. The advantage of this is that it can reduce network IO, reduce the amount of IO on the Slave side IO thread, and reduce the amount of IO on the Slave side.SQLThe workload of the thread, thereby optimizing the copy performance to the greatest extent. Of course, setting on the Master side also has certain drawbacks, because MySQL's judgment whether it is necessary to copy an Event is not based on the DB where the data changed by the Query that generated the Event is located, but based on the default Schema at the time the Query is executed, which is The DB specified when we log in or run "USE DATABASESpecified in the DB. Only when the current default DB is exactly the same as the DB set in the configuration, the IO thread will read the Event to the Slave IO thread. So if it appears in the default DB and settings in the system When the data of a table in the DB that needs to be copied is changed when the DB to be copied is different, the Event will not be copied to the Slave, which will cause the data on the Slave side and the data of the Master to be inconsistent The situation arises. Similarly, if the data in the schema that does not need to be copied is changed under the default schema, it will be copied to the Slave side. When the Slave side does not have the Schema, it will cause replication errors and stop;
- If you set the following six parameters on the Slave side, it may be slightly inferior in terms of performance optimization than on the Master side, because the events that need or do not need to be copied will be read to the Slave side by the IO thread, so Not only increased the amount of network IO, but also increased the IO thread on the Slave side Relay LogThe amount of writes. But it can still reduce the amount of log application of the Slave SQL thread on the Slave side. Although the performance is slightly inferior, setting the replication filter mechanism on the Slave side can ensure that there will be no inconsistencies or replication errors between the Slave and Master data due to the default Schema.
Slow Query Log related parameters and usage suggestions
Take a look again
MySQL > Show Variables like 'log_slow%' ; + ------------------ + ------- + | variable_name | the Value | + ------ ------------+-------+ | log_slow_queries | ON | + ------------------+---- ---+ 1 row in set ( 0.00 sec) MySQL > Show Variables like 'long_query%' ; + ----------------- + ------- + | variable_name | the Value | + ------- ----------+-------+ | long_query_time | 1 | + -----------------+------- + . 1 Row in SET ( 0.01 sec) copying the code
As other MySQL logs are rarely used (Query Log) or have little performance impact, we will not analyze too much here. As for the logs related to each storage engine, we will leave it in the next "Common Storage Engine Optimization" section and do the corresponding Analysis.
3. Query Cache optimization
When it comes to Query Cache, I am afraid that most people who have used MySQL will have more or less understanding, because in many people's eyes, it can help us produce a "qualitative" improvement in the performance of the database. But is this really the case? In this section, we will make some corresponding analysis on how to reasonably use MySQL's Query Cache and come up with some optimization suggestions.
Is Query Cache really a "shangfang sword"?
Judging from the above implementation principle, Query Cache is indeed a relatively simple implementation that brings huge performance benefits. But many people may have overlooked the negative impact of using QueryCache:
- The hash operation of Query statement and hash lookup resource consumption. When we use Query Cache, after each SELECT type Query arrives in MySQL, it needs to perform a hash operation and then find out whether the Query cache exists. Although this hash operation algorithm may already be very efficient, the hash lookup process is also It has been optimized enough. The resources consumed for a query are indeed very, very small, but when we have thousands or even thousands of queries per second, we cannot completely ignore the resulting CPU consumption. .
- Invalidation of Query Cache. If our table changes more frequently, it will cause a very high failure rate of Query Cache. The table change here not only refers to the change of the data in the table, but also includes any change in the structure or index. That is to say, every time we cache the Cache data in the Query Cache, it may be cleared soon after the data in the table is changed, and then the previous Cache cannot be used after the new same Query comes in.
- The Result Set is cached in the Query Cache, not the data page . That is to say, there is a possibility that the same record will be cached multiple times. This causes excessive consumption of memory resources. Of course, some people may say that we can limit the size of the Query Cache. Yes, we can indeed limit the size of Query Cache, but in this way, Query Cache can easily be swapped out due to insufficient memory, resulting in a drop in hit rate.
Regarding the above three negative effects of Query Cache, if each of them is taken out separately, it will not cause much problem to the entire system, and it will not cause too many concerns about the use of Query Cache. However, when these three negative effects are considered together, I am afraid that Query Cache will no longer be the previous "sword of justice" in the minds of many people.
Appropriate use of Query Cache
Although the use of Query Cache will have some negative effects, we should also believe that its existence must have a certain value. We do not completely lose confidence in Query Cache because of the above three negative effects of Query Cache. As long as we understand the implementation principle of Query Cache, then we can use certain means to maximize the strengths and avoid weaknesses when using Query Cache, focus on its advantages, and effectively avoid its disadvantages.
First of all, we need to determine which tables are suitable for using Query and which tables are not suitable for use according to the Query Cache invalidation mechanism. The failure of Query Cache is mainly due to changes in the data of the Table on which Query depends, causing Query to fail.
Secondly, for those data whose changes are very small and mostly static, we can add
Finally, some SQL Result Sets are very large. If Query Cache is used, it is easy to cause insufficient Cache memory, or flush out some old Caches before. For this type of query we have two methods to solve, one is to use
Related system parameter variables and state variables of Query Cache
Let's first look at the system variables of Query Cache. You can obtain system parameter variables related to Query Cache in MySQL by executing the following commands:
mysql > show variables like '%query_cache%' ; + ------------------------------+------- ----+ | Variable_name | Value | + ------------------------------+-------- ---+ | have_query_cache | YES | | query_cache_limit | 1048576 | | query_cache_min_res_unit | 4096 | | query_cache_size | 268435456 | | query_cache_type | ON | | query_cache_wlock_invalidate | OFF | + ------------------------------+----------- + copy code
- "have_query_cache": Whether the MySQL supports Query Cache;
- "query_cache_limit": The maximum Result Set of a single Query stored in the Query Cache, the default is 1M;
- "query_cache_min_res_unit": The minimum memory size stored in each Result Set of the Query Cache, the default is 4k;
- "query_cache_size": The size of the Query Cache memory used in the system;
- "query_cache_type": Whether the Query Cache function is turned on in the system;
- "query_cache_wlock_invalidate": For the MyISAM storage engine, set when there is a WRITE LOCK on a table, whether the read request should wait for the WRITE LOCK to release the resource before querying or allow the result to be read directly from the Query Cache, the default is FALSE (you can directly Get results from Query Cache).
The settings of the above parameters are mainly "
If we want to understand the usage of Query Cache, we can get it through the state variables related to Query Cache, for example, through the following command:
MySQL > Show Status like 'Qcache%' ; + ------------------------- + ----------------------- + | Variable_name | Value | + -------------------------+------------+ | Qcache_free_blocks | 7499 | | Qcache_free_memory | 190662000 | | Qcache_hits | 1888430018 | | Qcache_inserts | 1014096388 | | Qcache_lowmem_prunes | 106071885 | |Qcache_not_cached | 7951123988 | | Qcache_queries_in_cache | 19315 | | Qcache_total_blocks | 47870 | + -------------------------+--------- ---+ Copy code
- "Qcache_free_blocks": How many remaining blocks are currently in the Query Cache. If the value is displayed larger, it means that there are more memory fragments in the Query Cache, and you may need to find a suitable opportunity to sort it out ().
- "Qcache_free_memory": The current remaining memory size in the Query Cache. Through this parameter, we can more accurately observe whether the Query Cache memory size in the current system is sufficient, whether it needs to be increased or is too much;
- "Qcache_hits": How many hits. Through this parameter, we can see the basic effect of Query Cache;
- "Qcache_inserts": How many misses and then insert. Passed"Qcache_hits"with"Qcache_inserts"With two parameters, we can calculate the hit rate of Query Cache:
Query Cache hit rate =
- "Qcache_lowmem_prunes": How many queries were cleared out of the Query Cache due to insufficient memory. Passed "Qcache_lowmem_prunes"with"Qcache_free_memory"Combined with each other, we can understand more clearly whether the memory size of Query Cache in our system is really sufficient, and whether Query is swapped out because of insufficient memory very frequently.
- "Qcache_not_cached":becausequery_cache_typeSettings or the number of queries that cannot be cached;
- "Qcache_queries_in_cache": The number of Query cached in the current Query Cache;
- "Qcache_total_blocks": The number of blocks in the current Query Cache;
Restrictions of Query Cache
Because Query Cache stores logically structured Result Sets instead of physical data pages, it will also be subject to some specific restrictions while improving performance.
- Versions prior to 5.1.17 cannot Cache to help Query for variables, but starting from version 5.1.17, Query Cache has begun to support Query for variables;
- The external query SQL in all subqueries cannot be cached;
- Query in Procedure, Function and Trigger cannot be cached;
- Query that contains many other functions that may get different results each time it is executed cannot be cached.
In view of the above restrictions, in the process of using Query Cache, it is recommended to use it in a precise setting, so that only the data of the appropriate table can enter the Query Cache, and only some of the query results of the Query are cached.
4. MySQL Server other commonly used optimization
In addition to installation, logs, and Query Cache, there are many other aspects of settings that may affect the overall performance of MySQL Server, such as network connection, thread management, and table management. In this section, we will analyze other optimizable parts that may affect the performance of MySQL Server in addition to the previous sections.
Network connection and connection thread
Although the connection method of MySQL is not only through the network, but also through the named pipe, no matter what method is connected to MySQL, all the connections requested by the client are managed through the thread. Each client connection will have a corresponding generation of a connection thread. Let's first look at the performance configuration items connected to the network and their impact on performance.
- max_conecctions: The maximum number of connections allowed by the entire MySQL;
This parameter mainly affects the concurrent processing capability of the entire MySQL application. When the actual number of connections required in the system is greater than
- max_user_connections: The maximum number of connections allowed by each user;
The above parameter limits the number of connections for the entire MySQL, and
- net_buffer_length: In the network packet transmission, the initial size of the net buffer before the message is transmitted;
This parameter may mainly affect the efficiency of network transmission. Since this parameter is only the initial size of the message buffer, the main impact is that MySQL always needs to apply for multiple applications when our messages are large each time. Expand the buffer size. The default size of the system is 16KB. Generally speaking, it can meet most scenarios. Of course, if our queries are very small, the network transmission volume is very small each time, and the system memory is relatively short, the value can also be appropriately set Reduced to 8KB.
- max_allowed_packet: In the network transmission, the maximum value of a message transmission at a time;
This parameter is the same as
- back_log: The maximum number of connection requests allowed to be stored in the MySQL connection request waiting queue.
The connection request waiting queue actually means that when the number of client connection requests is too large at a certain time, the MySQL main thread cannot allocate (or create) a connection thread to each new connection request in time, and it has not been allocated to All requests of the connection thread will be stored in a waiting queue, which is the MySQL connection request queue. When our system has a large number of instantaneous connection requests, we should pay attention
The main optimization settings related to network connection interaction are introduced above, and let's take a look at the connection thread corresponding to each client connection.
In MySQL, in order to improve the performance of the process of client request to create a connection as much as possible, a
- thread_cache_size: The number of connection threads that should be stored in the Thread Cache pool.
When the system is initially started, it will not be created immediately
If our application uses short connections,
And if we are using a long connection,
- thread_stack: When each connection thread is created, the amount of memory MySQL allocates to him.
When MySQL creates a new connection thread, it needs to allocate a certain size of memory stack space to it in order to store the client's request Query and its own various status and processing information. But generally speaking, if you are not familiar with MySQL's connection thread processing mechanism, you should not easily adjust the size of this parameter. Using the system default value (192KB) can basically be used in all common application environments. If the value is set too small, it will affect the size of the Query content that the MySQL connection thread can process the client request, and the size of the Query content created by the user
The above are all how we can configure the network connection interaction and the performance-related parameters of the connection thread. Let's see how to check whether the above settings are reasonable and whether there is any need to adjust. We can get relevant status information by executing the following commands in the system to help you verify the rationality of the settings:
We now look at the setting values of system variables related to the connection thread:
MySQL > Show the Variables like 'the Thread%' ; + ------------------- + -------- + | variable_name | Value | + ---- ---------------+--------+ | thread_cache_size | 64 | | thread_stack | 196608 | + -------------- -----+--------+ Copy code
Let's take a look at the number of times the system is connected and the status value of the connection thread in the current system:
MySQL > Show Status like 'Connections' ; + --------------- + ------- + | variable_name | the Value | + ---------- -----+-------+ | Connections | 127 | + ---------------+-------+ mysql > show status like '%thread%' ; + ------------------------+-------+ | Variable_name | Value | + ------------------------+-------+ | Delayed_insert_threads | 0 | | Slow_launch_threads | 0 | | Threads_cached | 4 | | Threads_connected | 7 | | Threads_created | 11 | | Threads_running | 1 | + ------------------------+-------+ Copy code
Through the above command, we can see that the system has set up the Thread Cache pool to cache up to 32 connection threads. When each connection thread is created, the system allocates 192KB of memory stack to him. The system has received 127 connections from the client until now, and 11 connection threads have been created. However, the first 7 connection threads are in the state of connecting with the client, and only one of the 7 connection threads is in the active state. In other words, only one Russian request submitted by the client is being processed. In the Thread Cache pool, a total of 4 connection threads are Cached.
Through the analysis of system settings and current status, we can find that
Threads_Cache_Hit = (the Connections - Threads_created) / the Connections * 100 % duplicated code
We can calculate the Thread Cache hit rate in the above environment through the above calculation formula:
Generally speaking, after the system has been running stably for a period of time, our Thread Cache hit rate should be maintained at around 90% or even higher to be considered normal. It can be seen that the Thread Cache hit ratio in the above environment is basically normal.
Table Cache related optimization
Let's first look at the related mechanism of MySQL to open the table. Due to the multi-threaded implementation mechanism, in order to improve performance as much as possible, each thread in MySQL independently opens the file descriptor of the table it needs, instead of sharing the file descriptor of the opened table. . Of course, there may be different processing methods for different storage engines. Such as MyISAM table, each client thread needs to open a file descriptor to open any data file of any MyISAM table, but if it is an index file, multiple threads can share the same index file descriptor. For Innodb storage engine, if we use a shared table space to store data, then we need to open fewer file descriptors, and if we use a dedicated table space to store data, the same, because of storage If there are more data files for table data, a lot of table file descriptors will also be opened. In addition to opening the actual table or index of the database, temporary files also need to use file descriptors, which will also occupy the system
In order to solve the problem of opening table file descriptors too frequently, MySQL has implemented a Table Cache mechanism in the system, which is similar to the Thread Cache mechanism introduced earlier, mainly the descriptors of all table files opened by the Cache. It does not need to be reopened when the request is made, and it does not need to be closed immediately when the use is over. In this way, the resource consumption caused by frequently opening and closing file descriptors can be reduced. Let's take a look at the system parameters and state variables related to Table Cache.
In MySQL we pass
table_cache = max_connections * N; copy the code
Where N represents the maximum number of Tables contained in a single Query statement. But I personally understand that this calculation is not too accurate. The analysis is as follows:
first of all,
Secondly, this N is the number of Tables contained in the Query that contains the most Tables in the Query, because we cannot ignore the opening of the index file. Although the index file can share the open connection descriptor between each connection thread, it is always needed. Moreover, if the access to each table in my Query is retrieved through the current index positioning, or even through multiple indexes, then the file descriptors that need to be opened for the execution of the Query are even more, which may be N Twice or even three times as much.
Finally, this calculation formula can only calculate the maximum number of descriptors that we need to open at the same time, and the setting of table_cache does not necessarily have to be set according to this limit value, because table_cache only sets the descriptors opened by Cache The size of the quantity, not the size of the maximum amount that can be opened.
Of course, the above is just my personal understanding, and it may not be too rigorous. If you feel that you have other understandings, you can bring them up and discuss them again.
We can view the setting of table_cache and the current usage in the system in the following ways:
MySQL > Show Variables like 'the table_cache' ; + --------------- + ------- + | variable_name | the Value | + ---------- -----+-------+ | table_cache | 512 | + ---------------+-------+ MySQL > Show Status like 'open_tables' ; + --------------- + ------- + | variable_name | the Value | + ---------- -----+-------+ | Open_tables | 6 | + ---------------+-------+ Copy code
The above result shows that the table_cache set by the system is 512, which means that in this MySQL, Table Cache can cache 512 open file descriptors; the current system has only 6 open descriptors.
So under what circumstances will the Cache descriptor in the Table Cache pool be closed? Generally speaking, there are mainly the following situations where the descriptors of the Cache will be closed:
- The Cache pool of Table Cache is full, and when a connection thread needs to open a table that is not in Table Cache, MySQL will use a certain algorithm to close some descriptors that are not in use;
- When we execute commands such as Flush Table, MySQL will close all file descriptors of Cache in the current Table Cache;
- When the amount of Cache in Table Cache exceeds the value set by the table_cache parameter;
Sort Buffer, Join Buffer and Read Buffer
In MySQL, in addition to the various Caches introduced before, there are also two types of Buffers during the execution of the query that will affect the overall performance of the database.
mysql > show variables like '%buffer%' ; + -------------------------------+------ ----+ | Variable_name | Value | + -------------------------------+------- ---+ ... ... | join_buffer_size | 4190208 | ... ... | sort_buffer_size | 2097144 | + -------------------------------+----------+ copy Code
- join_buffer_size: Buffer used when our Join is ALL, index, rang or index_merge;
In fact, this kind of Join is called Full Join. In fact, each table participating in Join needs a Join Buffer, so when Join appears, there are at least two. The Join Buffer setting is 4GB before MySQL version 5.1.23, but starting from 5.1.23 version, the 4BG limit can be exceeded on 64-bit platforms other than Windows. The system default is 128KB.
- sort_buffer_size: Buffer used when sorting data in the system;
Sort Buffer is also for a single Thread, so when multiple Threads are sorted at the same time, multiple Sort Buffers will appear in the system. Generally, we can improve the processing performance of ORDER BY or GROUP BY by increasing the size of Sort Buffer. The default size of the system is 2MB, and the maximum limit is the same as Join Buffer. Before MySQL version 5.1.23, the maximum is 4GB. Starting from version 5.1.23, the 4GB limit can be exceeded on 64-bit platforms except Windows.
If there are few Join statements in the application system, you don t need to care too much
The performance improvement that can be brought by performance optimization through parameter settings may not produce a qualitative leap as many people imagine, unless the previous settings are seriously unreasonable. We cannot completely rely on performance tuning on the parameter adjustments after the database goes online through the DBA. Instead, we should minimize performance issues during the system design and development stages. Of course, it cannot be denied that parameter adjustment has a relatively large impact on system performance in certain scenarios, but it is only a few special cases after all.
Friends who like it can pay attention to it, the next article updates "Common Storage Engine Optimization"!