Web server development summary

1. Overview

This article summarizes practical experience from several years of real-world network server development. It covers asynchronous connects, asynchronous domain name resolution, hot update, overload protection, network models and coroutines; it does not cover basics such as accept4 and epoll.

2. Writable events

Beginners often misunderstand writable events, or assume they serve no purpose. In fact, monitoring and handling writable events is essential in a network server: a writable event tells you that the connection can accept more outgoing data, and it is mainly used to detect when data cannot be sent immediately.

When there is data to send to a client, send it directly. If it cannot all be sent immediately, cache the remainder in the send buffer and start monitoring the connection's writable events. When the connection becomes writable, send the buffered data and then stop monitoring writable events (to avoid needless writable-event wakeups).

Note that for a given connection, buffered data must be sent before any new data. This is easy to overlook; I was bitten by it myself back in the day.

3. The connection buffer

For long-lived connections, maintaining per-connection network buffers is equally essential. Some network servers (such as the old QQ Pet access layer) keep no receive or send buffers per connection and do not listen for writable events when data temporarily cannot be sent: they receive data and process it directly, and if an incomplete packet is encountered during processing it is simply dropped. This can corrupt all subsequent packets on the connection and cause heavy packet loss; likewise, outgoing data that cannot be sent immediately is simply thrown away.

Each network connection should maintain its own receive and send buffers. When the connection is readable, read the data into the receive buffer first, then check whether a complete packet has arrived and process it. When sending, write directly to the socket where possible; if the data cannot be sent completely and immediately, buffer the remainder in the send buffer and send it when the connection becomes writable. Note that if the send buffer is not empty when new data needs to go out, the new data must not be sent directly: append it to the send buffer and send everything in order, otherwise the bytes will reach the peer out of order.

Connection buffers are often allocated with a slab strategy. You can implement the slab algorithm yourself (as memcached does), but it is usually better to use jemalloc or tcmalloc directly (as redis does).

4. Non-blocking accept

Even when a blocking listening socket has been reported ready, there is a small probability that accept will still block (for example, if the client resets the connection between the readiness notification and the accept call).

When the accept queue is empty, accept blocks on a blocking socket, while a non-blocking socket immediately returns EAGAIN. The listening socket should therefore be set non-blocking after bind and listen, and the return value checked on every accept.

5. Asynchronous connect

Network servers often need to connect to other remote servers, but blocking in connect is unacceptable for a server, so connects must be asynchronous.

For an asynchronous connect, first create a socket and set it non-blocking, then call connect on it. If connect returns 0, the connection was established immediately; otherwise errno tells you whether the connect failed or is still in progress. If errno is EINPROGRESS, the connect is still underway, and epoll should monitor the socket's writable events (note: writable, not readable). When the socket becomes writable, retrieve the error code with getsockopt, i.e. getsockopt(c->sfd, SOL_SOCKET, SO_ERROR, &err, (socklen_t*)&len);. If getsockopt returns 0 and err is 0, the connection was established successfully; otherwise it failed.

Because the network or the remote server may restart, a network server needs to reconnect automatically after a disconnect, while avoiding unbounded retries when the remote server is unavailable; a reconnection strategy is therefore required. Suppose at most M connections go to remote servers of the same type. When a valid connection is dropped, reconnect asynchronously at once if the current count (valid plus in-progress connections) is below M/2; if an asynchronous connect fails, do not retry it immediately, to prevent unbounded reconnection while the remote server is down. When a connection is needed, pick randomly among the M slots up to N times to find a valid connection, starting an asynchronous connect for any unavailable slot encountered; if N random picks find nothing, scan all M slots for a valid connection object.

6. Asynchronous domain name resolution

When only the remote server's domain name is known (as in the WeQuiz access layer), its IP address must be resolved before the asynchronous connect. As with connect, blocking resolution is a poor fit for a server.

Fortunately, glibc provides the getaddrinfo_a function for asynchronous domain name resolution. getaddrinfo_a can resolve either synchronously or asynchronously; with the GAI_NOWAIT mode it resolves asynchronously: the call returns immediately while resolution continues in the background. On completion it either raises a signal (SIGEV_SIGNAL) or starts a thread running a specified function (SIGEV_THREAD), according to the sigevent settings.

struct gaicb *gai = (struct gaicb *)calloc(1, sizeof(struct gaicb));
gai->ar_name = config_get_dns_url(); /* domain name to resolve */

struct sigevent sig;
memset(&sig, 0, sizeof(sig));
sig.sigev_notify = SIGEV_SIGNAL;
sig.sigev_value.sival_ptr = gai;
sig.sigev_signo = SIGRTMIN; /* caught via signalfd/epoll */

getaddrinfo_a(GAI_NOWAIT, &gai, 1, &sig);

For the signal raised on completion, the server needs to catch it and then extract the IP addresses. To handle network connections, inter-process communication, timers and signals uniformly inside the epoll framework, Linux provides eventfd, timerfd and signalfd. Here, create dns_signal_fd = signalfd(-1, &sigmask, SFD_NONBLOCK|SFD_CLOEXEC); and add it to epoll. When the signal fires after resolution completes, a readable event triggers on dns_signal_fd; read a signalfd_siginfo object with read, and use gai_error to check whether resolution succeeded. On success, traverse gai->ar_result to obtain the list of IP addresses.

7. Hot update

Hot update means updating the executable without affecting running logic (existing network connections are not dropped), while new work (such as new connections) is handled by the updated logic. Hot update matters most for access-layer servers (such as a game access server or nginx), because it avoids downtime for most releases: the server can be restarted at any time without disturbing the connections it is currently handling.

The main points of hot update in the WeQuiz mobile game access server:

(1) The parent process creates listen_socket and an eventfd, then forks a child process, watches for SIGUSR1 and waits for the child to exit; the child listens on listen_socket and the eventfd and enters its epoll loop.

(2) To update the executable, send SIGUSR1 to the parent. On receiving it, the parent notifies the child through the eventfd, and at the same time forks a new process and execv's the new executable. At this point there are two parent-child pairs.

(3) When the child receives the update notification on the eventfd through epoll, it stops listening and closes listen_socket and the eventfd. With listen_socket closed it can no longer take new connections, but existing connections and their processing are unaffected (still running the old logic). When all clients have disconnected, the epoll main loop exits and the child ends. Note that since the number of connections managed by epoll cannot be obtained from a system function, the application layer must track the current connection count itself and leave the epoll loop when it reaches 0.

(4) When the old parent sees the old child's exit, it exits too. Only one parent-child pair remains, and the hot update is complete.

8. Overload protection

For a simple network server, on the order of a million connections (with 8 GB of memory) and a hundred thousand concurrent requests (gigabit NIC) is basically achievable. But when the server's logic is complex or the exchanged packets are large, a server without overload protection may become unavailable, and for key servers in a system (such as the game access layer) overload can trigger an avalanche across the whole system.

Overload protection for network servers typically includes: the maximum number of files, the maximum number of connections, system load protection, system memory protection, connection expiration, the maximum number of connections per address, the maximum packet rate per connection, the maximum byte rate per connection, the maximum packet size, the maximum buffer size per connection, and address or id black/white lists.

(1) Maximum number of files

In main, setrlimit with RLIMIT_NOFILE can cap the number of files the server may use. Network servers also commonly use setrlimit to set the maximum core file size and similar limits.

(2) Maximum number of connections

Since the number of live connections cannot be obtained through epoll-related functions, the server must track it itself: increment the count when a connection is accepted and decrement it when the connection closes. After accept/accept4 accepts a connection, check whether the current count exceeds the maximum; if so, simply close the new connection.

(3) System load protection

Refresh the current system load value periodically with getloadavg, and after accept/accept4 accepts a connection, check whether the load exceeds the maximum (for example, number of CPUs * 0.8 * 1000); if so, simply close the connection.

(4) System memory protection

Compute the current memory statistics periodically by reading /proc/meminfo, and after accept/accept4 accepts a connection, check whether they exceed the configured limits (for example swap usage, available free memory, and the percentage of memory used); if so, simply close the connection.

g_sysguard_cached_mem_swapstat = totalswap == 0 ? 0 : (totalswap - freeswap) * 100 / totalswap;
g_sysguard_cached_mem_free = freeram + cachedram + bufferram;
g_sysguard_cached_mem_used = (totalram - freeram - bufferram - cachedram) * 100 / totalram;

(5) Connection expiration

A connection is expired when the client has not exchanged data with the server for a long time. To stop idle connections from tying up memory and other resources, the server needs a mechanism to clean them up. Common approaches use an ordered list or a red-black tree, but polling is rarely the best answer for a back-end server. The QQ Pet and WeQuiz access layers instead keep one timerfd per connection object; timerfd, as a timing mechanism, plugs straight into the epoll event framework. Whenever data arrives on the connection, timerfd_settime is called to push the idle deadline back; if the idle time elapses, epoll returns for the timerfd and the connection is closed directly. Although this was a first attempt (at least I have not seen it used in other projects), the access server has run stably ever since, so it can be used with confidence.


struct itimerspec timerfd_value;

timerfd_value.it_value.tv_sec = g_cached_time + settings.sysguard_limit_timeout;

timerfd_value.it_value.tv_nsec = 0;

timerfd_value.it_interval.tv_sec = settings.sysguard_limit_timeout;

timerfd_value.it_interval.tv_nsec = 0;

timerfd_settime(c->tfd, TFD_TIMER_ABSTIME, &timerfd_value, NULL);

add_event(c->tfd, AE_READABLE, c);

(6) Maximum number of connections per address

Maintain a hash table or red-black tree keyed by address with the connection count as value. After accept/accept4 accepts a connection, check whether the count for that address exceeds the limit (say 10); if so, simply close the connection.

(7) Maximum packet rate per connection

The connection object tracks the number of complete protocol packets seen per unit of time. After reading data, check whether a complete packet has arrived and, if so, increment the count; once the interval since the window started exceeds the unit of time, reset the count. If the number of complete packets in the window exceeds a soft limit (say 80), defer processing (receive into the read buffer but do not process or forward for now); if it exceeds the hard limit (say 100), disconnect directly. Of course, you can also skip the deferral stage and disconnect immediately.

(8) Maximum byte rate per connection

The maximum byte rate works essentially like the maximum packet rate above; the difference is that the packet rate counts complete packets per unit of time, while the byte rate counts buffered data bytes per unit of time.

(9) Maximum buffer size per connection

After the recv function reads network data, check the size of the connection's receive buffer; if it exceeds a limit (say 256 MB), the connection can be dropped. The same check can be applied to the connection's send buffer. In addition, after a complete packet has been read, check whether it exceeds the maximum packet size.

(10) Address or id black/white lists

Connection IP addresses or player ids can be black- or white-listed to deny service or to exempt them from overload limits. WeQuiz currently does not implement this.

In addition, socket options such as TCP_DEFER_ACCEPT and SO_KEEPALIVE can help screen out invalid clients and clean up dead connections. For example, with TCP_DEFER_ACCEPT enabled, a connection that completes the three-way handshake but sends no data stays in the accept queue; once enough such idle connections accumulate from the same client (around 16 on linux 2.6+), that client can no longer connect normally.

setsockopt(sfd, IPPROTO_TCP, TCP_DEFER_ACCEPT, (void*)&flags, sizeof(flags));

setsockopt(sfd, SOL_SOCKET, SO_KEEPALIVE, (int[]){1}, sizeof(int));

setsockopt(sfd, IPPROTO_TCP, TCP_KEEPIDLE, (int[]){600}, sizeof(int));

setsockopt(sfd, IPPROTO_TCP, TCP_KEEPINTVL, (int[]){30}, sizeof(int));

setsockopt(sfd, IPPROTO_TCP, TCP_KEEPCNT, (int[]){3}, sizeof(int));

9. Timeout and timer mechanisms

A timeout or timer mechanism is all but indispensable in a network server. For example, after receiving a request the server should add it to a timeout list so that, if asynchronous processing stalls, it can still reply to the client on timeout and clean up resources. For a game server, timeouts do not need a real timer: it is enough to maintain a timeout list and check it after each pass of the while loop or each epoll return.

Timer management commonly uses a minimum heap (as in libevent), a red-black tree (as in nginx) or a time wheel (as in the linux kernel).

An application-layer server does not need to implement a heap, red-black tree or time wheel itself: a multimap (red-black tree) from the STL or boost can manage timers, with the expiry time as key and the related object as value. The tree is sorted by key, so detection only needs to traverse from the first node until a key exceeds the current time; the first node's expiry can also serve as the epoll_wait timeout. In addition, large-zone logic servers or real-time battle servers often need persistent timers, which can be kept in shared memory through the boost library.

(1) Timer management object

typedef std::multimap<timer_key_t, timer_value_t> timer_map_t;

//! shared-memory (persistent) variant:
typedef boost::interprocess::multimap<timer_key_t, timer_value_t, std::less<timer_key_t>, shmem_allocator_t> timer_map_t;

(2) Timer category

class clock_timer_t {
public:
    static clock_timer_t &instance() { static clock_timer_t instance; return instance; }

    static uint64_t rdtsc() {
        uint32_t low, high;
        __asm__ volatile ("rdtsc" : "=a" (low), "=d" (high));
        return (uint64_t)high << 32 | low;
    }

    static uint64_t now_us() {
        struct timespec tv;
        clock_gettime(CLOCK_REALTIME, &tv);
        return tv.tv_sec * (uint64_t)1000000 + tv.tv_nsec / 1000;
    }

    uint64_t now_ms() {
        uint64_t tsc = rdtsc();
        if (likely(tsc - last_tsc <= kClockPrecisionDivTwo && tsc >= last_tsc)) {
            return last_time;
        }
        last_tsc = tsc;
        last_time = now_us() / 1000;
        return last_time;
    }

private:
    const static uint64_t kClockPrecisionDivTwo = 500000;
    uint64_t last_tsc;
    uint64_t last_time;

    clock_timer_t() : last_tsc(rdtsc()), last_time(now_us() / 1000) {}
    clock_timer_t(const clock_timer_t&);
    const clock_timer_t &operator=(const clock_timer_t&);
};


(3) The timeout detection function (called from the while or epoll loop); it can return either the set of expired objects or the time until the next expiry.

timer_values_t xxsvr_timer_t::process_timer()
{
    timer_values_t ret;
    timer_key_t current = clock_timer_t::instance().now_ms();
    timer_map_it it = timer_map->begin();
    while (it != timer_map->end()) {
        if (it->first > current) { //! alternatively: return it->first - current; // time to next expiry
            return ret; //! return the set of expired objects
        }
        ret.push_back(it->second); //! collect the expired object
        timer_map->erase(it++);    //! and remove it from the tree
    }
    return ret;
}


10. Network model

Linux offers several IO models: blocking, non-blocking, multiplexed, signal-driven and asynchronous, but not every IO model suits the network; asynchronous IO, for example, cannot be used everywhere in the networking path (more on this below). Combining these IO models with different designs yields the common asynchronous network models: reactor, proactor, half-async half-sync (hahs), leader-follower (lf), the multi-process asynchronous model, and distributed forms (server + workers), among others.

(1) reactor

The reactor model usually means a single-process, single-threaded, event-callback design built on IO multiplexing, with epoll as its representative. It is the most common model in network server development (redis, for example), especially for servers with relatively simple logic, where the bottleneck is not the CPU but the NIC (a gigabit card, say).

(2) proactor

The proactor model generally relies on asynchronous IO and is currently most common on Windows, e.g. IO completion ports (IOCP). On linux, aio can be used with socket descriptors, but not on macOS. I tried the socket + epoll + eventfd + aio combination and failed, while socket + sigio (mainstream on linux 2.4) + aio worked. In Linux server development, asynchronous IO is generally used only for reading files, as in nginx's filefd + O_DIRECT + posix_memalign + aio + eventfd + epoll mode (which can be disabled), and even then it is not necessarily faster than reading directly; for file writes and for the network, asynchronous IO is essentially unused.

(3) Semi-asynchronous and semi-synchronous (hahs)

The half-async half-sync model (HalfAsync-HalfSync) usually takes a single-process, multi-threaded form: a listening main thread plus a group of worker threads. The main thread accepts requests, picks a worker for each one (round-robin, for example), pushes the request onto that worker's queue, and notifies the worker, which processes it and responds. In hahs, every thread (main and workers alike) runs its own epoll loop; each worker owns a queue used for data exchange with the main thread, while notification between the main thread and workers usually goes over a pipe or eventfd, whose descriptor the worker's epoll watches. The hahs model is widely used, for example in memcached and thrift; the zeromq message queue also uses this kind of design.

//! main thread: main_thread_process
while (!quit) {
    ev_s = epoll_wait(...);
    for (i = 0; i < ev_s; i++) {
        if (events[i].data.fd == listen_fd) {
            accept4(...);
        } else if (events[i].events & EPOLLIN) {
            recv(...);
            select_worker(...);   //! pick a worker (e.g. round-robin)
            send_worker(...);     //! push the request onto its queue
            notify_worker(...);   //! wake it via pipe/eventfd
        }
    }
}

//! worker thread: worker_thread_process
while (!quit) {
    ev_s = epoll_wait(...);
    for (i = 0; i < ev_s; i++) {
        if (events[i].data.fd == notify_fd) {
            read(...);            //! consume the notification
            do_worker(...);       //! pop from the queue, process, reply
        }
    }
}



(4) Leader follower (lf)

The leader-follower model is also usually single-process, multi-threaded. The basic idea: one thread is the leader, and the remaining, essentially equal, threads are its followers; when a request arrives, the leader takes it, promotes one follower to be the new leader, and goes on to process the request. In implementation, every thread (leader and followers) runs the same epoll loop, waiting on an equal footing, and a lock lets the system pick the leading thread automatically. The lf model is widely used as well, for example in pcl and several java open-source frameworks. Both lf and hahs exploit multiple cores, which effectively raises concurrency for servers with complex logic; but where hahs needs the main thread to read data and feed it into worker queues, lf lets all threads use the epoll kernel queue directly and equally. I therefore use the lf model more often, for example in the access service layer of the QQ Pet and WeQuiz projects.

//! every thread runs the same loop; the mutex lets the kernel pick the leader
while (!quit) {
Loop:
    pthread_mutex_lock(&loop.mutex);
    while (loop.nready == 0 && quit == 0)
        loop.nready = epoll_wait(...);
    if (quit == 0) {
        int fd = loop.fired[--loop.nready];
        conn *c = loop.conns[fd];
        if (!c) { pthread_mutex_unlock(&loop.mutex); close(fd); goto Loop; }
        loop.conns[fd] = NULL; //! take ownership of the connection
        pthread_mutex_unlock(&loop.mutex); //! a follower becomes the new leader
        handle_conn(c); //! process outside the lock, then restore loop.conns[fd]
    } else {
        pthread_mutex_unlock(&loop.mutex);
    }
}

(5) Multi-process asynchronous model

The multi-process asynchronous model usually takes the form of a master process plus a group of worker processes, and is mainly used for stateless servers without shared data, such as the web servers nginx and lighttpd. The master mainly manages the worker group (hot updates, restarting a crashed worker, and so on), while the workers both listen and handle requests. This easily causes a thundering herd on accept, which can be avoided with an inter-process mutex (as nginx does).

In summary, each common network model has its strengths and weaknesses: reactor is simple, lf uses multiple cores, and so on. In practice, though, the performance of a single server (connection count, concurrency) often matters less than the linear scalability of the overall architecture (online game servers, for instance), with exceptions for specialized servers such as push servers biased toward connection count or web servers biased toward concurrency. It is also worth reading excellent open-source code such as nginx, zeromq, redis and memcached to sharpen technique and design: nginx can reach millions of connections and at least 500,000 RPS on a 10G network, and zeromq's rather unique design makes it one of the best message queues.

11. Coroutine

Coroutines are widely used in scripting languages such as python, lua and go, and linux natively supports C coroutines via ucontext. Coroutines combine cleanly with network frameworks such as epoll, libevent and nginx (gevent is one example). The usual approach: on receiving a request, create a coroutine to process it; when it hits a blocking operation (such as a call to a back-end service), save the context and switch back to the main loop; when the operation can proceed (the back-end replied, or it timed out), find the coroutine through the saved context and resume it. For the blocking network functions, a hook can be installed with dlsym: the hook calls the original function and switches coroutines when it would block, so the application layer can call blocking network functions directly without switching by hand.

Game servers are generally single-threaded and fully asynchronous, so direct use of coroutines is still relatively rare; but web-style applications invoked as cgi (game communities, operational events) are gradually adopting them. The QQ Pet community game, for example, originally used blocking request handling in apache+cgi/fcgi mode and could manage only about 300 concurrent requests per second; strace showed the time went almost entirely to waiting on the network. We needed a technique that kept the existing code largely intact yet raised throughput, which made coroutines the natural choice: new business was built on libevent+greenlet+python, old code was reused via nginx+module+ucontext, and in the end changing fewer than 20 lines of code improved performance twentyfold (the real business reached 8k QPS under siege load testing).

12. Other

Beyond writing code, a network server also involves building, debugging, optimization, stress testing and monitoring. As new project work has been heavy recently, these are only listed briefly here and will be summarized in more detail later.

(1) Building

I have always used cmake to build projects of all kinds (linux servers as well as Windows/macOS client programs). In my experience cmake is one of the best build tools, and it is widely used, for example by the mysql, cocos2dx and vtk projects.


add_executable(server server.c)

target_link_libraries(server pthread tcmalloc)

cmake .; make; make install

(2) Debugging

Most network server development and debugging can be done through logs, with gdb when needed; on Linux you can also do visual debugging with eclipse/gdb.

When the program crashes and a core file exists, debug it directly with gdb: bt full shows the full stack in detail, and f N jumps to a given frame. Without a core file, get the fault address from /var/log/messages and locate the offending code with addr2line or objdump.

For network servers, memory leak detection is also essential, and valgrind is the best memory leak detection tool.

In addition, other commonly used debugging tools (compilation phase and running phase) include nm, strings, strip, readelf, ldd, strace, ltrace and mtrace, etc.

(3) Optimization

Network server optimization involves multiple aspects such as algorithms and technologies.

On the algorithm side, choose the best algorithm for each scenario: the nine-grid (3x3) algorithm for vision management, skip lists for rankings, red-black trees for timer management, and so on. Lossy service is another way to reach a good solution, such as the lossy leaderboard service used in WeQuiz.

On the technical side, options include separating IO threads from logic, slab memory management (jemalloc, tcmalloc), socket functions (accept4, readv, writev, sendfile64), socket options (TCP_CORK, TCP_DEFER_ACCEPT, SO_KEEPALIVE, TCP_NODELAY, TCP_QUICKACK), newer mechanisms (aio, O_DIRECT, eventfd, clock_gettime), lock-free queues (CAS, boost::lockfree::spsc_queue, zmq::yqueue_t), asynchronous processing (asynchronous client libraries such as libdrizzle for mysql, the redis asynchronous interface, or the gevent framework), protocol choice (http or pb style), data storage form (mysql blob, mongodb bjson or pb), storage engines (mysql, redis, bitcask, leveldb), thundering-herd avoidance (via locking), user-space locks (application-layer CAS, as nginx does, which also ports well), network state machines, reference counting, time caching, CPU affinity, and plug-in modules (python, lua), among others.

Common profiling tools include valgrind, gprof and google-perftools. With valgrind's callgrind tool, for example, bracket the code you want to profile with CALLGRIND_START_INSTRUMENTATION; CALLGRIND_TOGGLE_COLLECT; before and CALLGRIND_TOGGLE_COLLECT; CALLGRIND_STOP_INSTRUMENTATION; after, then run valgrind --tool=callgrind --instr-atstart=no --collect-atstart=no ./webdir; the resulting profile can be viewed graphically with KCachegrind.

Besides improving runtime efficiency, development kits and open-source libraries can also raise development efficiency, for example using the boost library to manage variable-length objects in shared memory, or python and go frameworks.

(4) Pressure test

For web servers, the stress testing process is indispensable. It can be used to evaluate response time and throughput, and it can also effectively check for problems such as memory leaks, providing a basis for later correction and optimization.

For http servers, tools such as ab or siege are commonly used for stress testing, such as siege -c 500 -r 10000 -b -q .

For other types of servers, you generally need to write your own stress-testing client (redis, for example, ships its own benchmarking tool). A common approach is to create multiple threads, each of which uses libevent to create multiple connections and timers to make asynchronous requests.

In addition, if you need to test a large number of connections, you may need multiple clients or need to create multiple virtual IP addresses for the server.

(5) High availability

The high-availability strategies for servers include the master-slave mechanism (such as redis), the dual-master mechanism (such as mysql + keepalived), dynamic election (such as zookeeper) and the symmetric mechanism (such as dynamo). For example, the dual-master mechanism is often implemented with two equivalent machines sharing a VIP address plus a heartbeat mechanism, commonly via the keepalived service. Of course, it can also be implemented by the server itself: when the server starts, a parameter identifies whether it is the master or the slave, and on failover the roles switch, such as

void server_t::ready_as_master()
{
    primary = 1; backup = 0;
    system("/sbin/ifconfig eth0:havip broadcast netmask up"); //! bring up the virtual IP (address fields omitted in the source)
    system("/sbin/route add -host dev eth0:havip");
    system("/sbin/arping -I eth0 -c 3 -s"); //! gratuitous ARP so peers update their tables
}

void server_t::ready_as_slave()
{
    primary = 0; backup = 1;
    system("/sbin/ifconfig eth0:havip broadcast netmask down");
}

Of course, this is a relatively simple approach (the premise is that the master and standby machines can communicate normally); abnormal conditions such as a broken network cable between them are not considered. For those cases, you can consider combining dual-center control with multi-master election, and so on.

(6) Monitoring

Linux server monitoring tools are very rich, including ps, top, ping, traceroute, nslookup, tcpdump, netstat, ss, lsof, nc, vmstat, iostat, dstat, ifstat, mpstat, iotop, dmesg, gstack, sar (with -n/-u/-r/-b/-q, etc.) and /proc, and so on. For example, ps auxw shows the process state flag (typically D means blocked on IO, R means running or runnable on the cpu, S means sleeping and not yet woken, etc.), gstack pid shows the current stack of a process, ss -s shows connection statistics, sar -n DEV 3 3 shows packet volume, vmstat 1 5 shows memory and context switching, iotop or iostat -tdx 1 shows disk information, and mpstat 2 shows CPU information. In addition, sometimes the most effective source is the web server's own log files.

Thirteen, the end

In addition to the basic development techniques of a network server, the overall system architecture is even more important (such as linear scalability); I will summarize it in detail when there is time. For network game architecture, refer to the introductions to the WeQuiz mobile game server architecture and the QQPet architecture.



linux system analysis and bottleneck finding

What is load?

  1. top outputs "load average: aa bb cc" (or cat /proc/loadavg): the average number of processes waiting to run over a period of time. If this value is high, the system load is high.

  2. Why do processes wait for execution?
In a multitasking OS, process execution is time-shared by the kernel scheduler.
Process states (visible with ps auxw): 1. TASK_RUNNING (runnable or running), 2. interruptible sleep (waiting for an event), 3. uninterruptible sleep (such as waiting on a disk read), 4. stopped (cannot be scheduled), 5. zombie.
Only processes in states 1 and 3 are counted in the load average.
So: high load -> many waiting processes -> a large number of processes in states 1 and 3 -> state 1 is waiting for the CPU, state 3 is waiting for IO
----> either CPU load or IO load will drive the system load up and become the system bottleneck.

  3. What causes high CPU and IO, and which processes consume the most CPU?
1. ps (ps aux | sort -k3nr | head -n 5) and the top command:
top finds the process with the highest CPU usage (%CPU column);
top -H -p 14094 (-p monitors the specified process, -H shows the state of each thread in it);
ps -efL also lists threads.

2. vmstat judges whether the load is CPU or IO:
a high b column means many waiting processes;
frequent changes in the swap columns indicate insufficient memory;
frequent changes in the io columns indicate that IO may be the bottleneck — analyze bi (disk reads) and bo (disk writes).
3. You can use iostat to analyze whether IO is the bottleneck by checking the %idle column; if it is small, IO is frequent and may be the bottleneck.

4. Deciding whether the bottleneck is CPU or IO:
if io, swap and memory pressure are all low but CPU load is high, the CPU's computing power is insufficient for the intensive computation;
if free is small in vmstat and the si and so values are large, system memory is insufficient;
if CPU usage is low, vmstat's b column is relatively large, and %idle in iostat is relatively small, the bottleneck is IO.

5. Code-level analysis: profile the hot function calls and the time each call takes, with tools such as gprof.