Computer system structure review three

Computer system structure review three

This is the 27th day that I participated in the Wenwen Challenge. For details of the event, please view: Wenwen Challenge

Chapter 5: Storage System


5.1 Multi-level storage hierarchy: adopted by a group

Different technologies
The realized memory is composed of
Storage System.

5.2 Hit time: when

Time, required
interview time

5.3 Missing overhead:

The time to transfer the block containing the word to be accessed from m2 to M1
The time required to transmit a block of information
Sum. 3

5.4 Hit rate: When the CPU accesses the storage system, the probability of finding the required information in M1

5.5 Missing rate: refers to when the CPU fetches memory

Not found in M1
The probability of the required information---M1 is the memory closest to the CPU

5.6 Fully associative mapping:

Main memory
Any piece
Can be placed on
A location

5.7 Direct mapping:

Main memory
Each piece
Can only be placed on
A location

5.8 sets of associative mapping:

Main memory
Each piece
Can only be placed in Cache
one of

5.9 Write direct method: When performing a write operation, not only the data

Write to Cache
The corresponding block in and also
Write to the next level of memory

5.10 Write back method: only data

Write to Cache
Corresponding fast,
Don't write
Next level memory

The difference between direct writing and writing back method (test):

Write direct to v is slow, data is consistent
Write back method, v is fast, data is inconsistent

5.11 Mandatory Miss: When the first time

A block, the block
Not in Cache
In, the next level of storage needs to be transferred to Cache

5.12 Capacity miss: If all the blocks needed during program execution cannot be transferred into the Cache, when some blocks are replaced and accessed again, a miss will occur

5.13 Conflict misses: in

Group associative or direct mapping
If too many blocks are mapped to the same group, some blocks in the group will be replaced by other blocks.
, And then revisited

5.14 2:1Cache rule of thumb: capacity is N

Direct mapping
Miss rate and capacity with
N/2 two-way group associative Cache
The miss rate is about the same

5.15 Sacrifice Cache: Add one to the data path between Cache and the next level of memory

Fully connected
Small Cache
, This small Cache is called Sacrifice Cache

5.16 Association degree : When a data block is transferred from the main memory to the Cache, the Cache can be used for

The number of locations where the data block is stored

In group associative, the number of blocks in each group of Cache

5.17 System response time: the time for the computer to respond to the user's input or request

5.18 Reliability : refers to the system from

Some initial reference point
Ability to continuously provide services
, It is usually measured by mean time between failures

5.18 Availability : System

normal work
At the time
Two consecutive normal service intervals

5.19RAID: Redundant Array of Inexpensive Disks or Redundant Array of Independent Disks

Short answer questions:

1. What is the main contradiction of single-level memory? What methods are usually adopted to solve it?

The main contradiction:

  • The faster the speed, the higher the price per person.
  • The larger the capacity, the lower the price per person.
  • The larger the capacity, the slower the speed.


Adopt multi-level memory technology
, Forming a multi-level storage hierarchy

2. Which four problems should be solved in the storage hierarchy?

  • Mapping rules
    : When loading a block into a higher-level memory, where can it be placed?
  • Search algorithm
    : When the block to be accessed is in the higher-level memory, how to find the block?
  • Replacement algorithm
    : When failure occurs, which one should be replaced?
  • Write strategy
    : When doing write access, what should be done?

3. What are the address mapping methods? What are their advantages and disadvantages?

  • Fully associative mapping:

    Advantages: high utilization of Cache space, low block conflict rate



  • Direct mapping:

    Advantages: simple search mechanism, speed


    Disadvantages: Cache space utilization is low, and the probability of block collisions is high, so the cache failure rate is also high

  • Group associative mapping: it is a kind of direct mapping and fully associative mapping

4. What are the two main writing strategies? What are their advantages?

  • Direct write: easy to implement, and the data in the next level of memory is always the latest
  • Write back method : fast, "write" operation can be performed at the speed of Cache memory. Moreover, for multiple writes of the same unit, only the next level of memory needs to be written back. Some "writes" can only reach the Cache and not the main memory, so the memory frequency band used is lower.

5. What is the basic idea of pseudo-association?

When accessing the pseudo-associated Cache, the first step is to

Direct mapping
Access in the same way if
, Then take the accessed data from the corresponding block and send it to the cpu, the access is over, if
. Pseudo-associated Cache
Check another location of Cache
, To see if it matches, if this piece of
Identity match
, It is called a false hit; otherwise, the next level of memory has to be accessed.

  • When a false hit occurs, exchange the contents of the two blocks, and put the point that has just been visited in the first position (that is, press the block corresponding to the direct image)

The reason for swapping the two groups: This is based on the principle of locality. The block that has just been accessed is likely to be the next block to be accessed.

6. What is the basic idea of using secondary cache?

Through the gap between the original cache and storage

Add another level of cache
, Constitute a two-level cache. Make the first-level cache small enough to match its speed with the clock cycle of the fast cpu. At the same time, do the second level cache
big enough
Time, make it capture more
Actually miss the overhead.

7. What are the benefits of using a Cache with a small capacity and simple structure?

  • Reduce Cache hit time.
  • Small enough cache to be compatible with the processor
    Do on the same chip
    On to
    Due to off-chip access and increased time overhead.

8. What is the basic idea of "virtual index + physical identification" Cache?

  • Use directly
    Virtual address
    In-page displacement
    As an index to access the Cache, but
    Physical address
  • After the cpu sends a cache request, proceed
    Virtual->real address
    Changes can be made in parallel at the same time
    Logo reading
  • After the address change is completed, the obtained physical address and identification are performed

Supplementary test sites:

1. Design goals of multi-level memory

From the CPU point of view, the speed of changing the storage system is close to that of M1, while the capacity and price per bit are close to Mn.

2. The formula for calculating the average memory access time

Average access time:TAAverage access time: T_A,Hit time:T1Hit time: T_1 Failure Rate:FFailure rate: F ,Failure cost:TMFailure cost: T_M

Average memory access time = hit time + failure rate * failure cost


3. Cache block search method

  • Realized by looking up the table of contents
  • Parallel search and sequential search
    • Implementation method of parallel search: associative memory

Key examples:

Calculation formula:

  • Calculation formula of average memory access time (see above)

  • CPU time calculation formula

    CPUtime=IC (CPI The average number of memory accesses per instruction Miss rate Miss overhead) Clock cycle timeCPU time = IC * (CPI * average memory access times per instruction * miss rate * miss overhead) * clock cycle time

Chapter 7: Interconnection Functions


7.1 Interconnection network: it is a kind of

Switching element
According to certain
Topology and
A network constituted by a control method to realize the interconnection between nodes in a computer system

7.2 Interconnection function : reflects the corresponding replacement relationship or arrangement relationship between the network input end array and the output end array

Use the variable x to represent the input, use

Function f(x)
Indicates output

7.3 Network diameter: the distance between any two nodes in the interconnection network


7.4 Network scale: in the interconnected network

Number of nodes

7.5 Static interconnection network: between nodes

Connection path, and in operation
unable to be changed
network of

7.6 Dynamic interconnection network: by

Exchange switch
Composition, according to the requirements of the operating program
Network that changes connection status

Supplementary test sites:

1. 3.elements of the Internet

  • Interconnect structure
  • Switching element
  • way to control

Dividing width b: Cut the network composed of N nodes into two halves with the same number of nodes. In various cutting methods, the minimum number of edges along the incision is called the bisecting width of the network.

2. Features of common internetwork:

  1. Hypercube (multiple choice exam)

    • Is a binary n-cube structure
    • As long as the corresponding nodes in the two (n-1) cubes are connected by links, the scale can be formed asN=2nN=2^n -cubes of n,a total of2n 12^{n-1} link
  2. Linear array

    N nodes are connected in a row with N-1 links, the degree of internal nodes is d=2, the degree of end nodes is d=1, the diameter is D=N-1, and the division width is b=1.

  3. Ring and stringed ring

    Use a link to connect the two end nodes of the linear array to get the exchange. The network diameter of the two-way ring is D=N/2, and the network diameter of the one-way ring is D=N-2. The equal width of the ring b=2

  4. Cyclic shift network

    Network node degree d=2n-1, diameter D=n/2, network scaleN=2nN=2^n

  5. Tree and star

    Complete binary tree: node degree d=3, diameterD=2(k 1)D=2(k-1) , the equal division width b=1

    Star shape: d=N-1, diameter D=2, equal division width b=N/2 (rounded down)

  6. Mesh and ring mesh

    • A 2-dimensional grid-shaped network with a scale of N=n n: the degree of internal nodes d=4, the degree of edge nodes d=3, the degree of corner nodes d=2, and the network diameter D=2 *( n-1), equal division width b=n

    • An n n ring network, node degree: 4, network diameter: 2 (n/2) (rounded down), equal division width b=2n

3. 4 connection modes of 2*2 switch, 3 control modes of switch module

  • 4 connection methods:

Direct delivery, exchange, upload, download

  • 3 control modes of switch module
    • Level control: All switches of each level are controlled by only one control message, and these switches can only be in the same state at the same time
    • Unit control: each switch has an independent control signal, which can be in different states
    • Component level control: No.iiAll switches of level are used separatelyi+1i+1 signal control

4. Bus network, crossbar switch network, multi-level interconnection network

Key examples:

  • Cube0(x2x1x0)=x2x1!x0Cube_0(x_2x_1x_0)=x_2x_1!x_0(Ie rightx0x_0Negate)
  • Move the input terminal to the left in a circle; if k:lowkBit rotate one bit to the left; k:highkRotate one bit to the left\sigma: rotate the input terminal to the left by one bit; if/sigma_k: rotate the low k bit to the left by one bit;/sigma^k: rotate the high k bit to the left by one bit
  • :\beta: highest bit and the lowest bit are swapped; k:\beta_k: The highest bit and the lowest bit in the low k bits are swapped; k:\beta^k:the highest and lowest bits of the high k bits are swapped
  • :\rho: All orders are reversed; k:\rho_k: order in the low k bits is reversed; k\rho^k : the order in the high k bits is reversed
  • PM2+i=(x+2i)modN,PM2 i=(x 2i)modNPM2_{+i}=(x+2^i)modN,PM2_{-i}=(x-2^i)modN

Chapter 8: Multiprocessor


8.1 Centralized shared multiprocessor: It is generally composed of dozens of processors, and each processor shares a centralized physical memory. Because there is only a single main memory, and the relationship between this main memory and each processor is

, Also known as symmetric shared multiprocessor

8.2 Distributed shared multi-processor: its shared memory is distributed in each processor, each processor

Have their own local storage
To form a
"Processor-Memory" unit
, But the actual storage in these distributed processors is combined again
Unified addressing
, Logically form a shared memory. These processor memory units are connected together by an interconnection network, and each processor can access
In addition to the local storage, it can also be directly accessed through the Internet.
Remote storage in the processor storage unit.

8.3 Multi-Cache consistency: If shared data is allowed to enter the Cache, there may be a copy of the unified storage block in the Cache of multiple processors. When one of the processors modifies the data in its Cache, it will Makes the data in its Cache inconsistent with the data in other Caches. This is the Cache consistency problem of multiprocessors. To ensure that the data of multiple copies are consistent.

8.4 Writing an invalid agreement: the processor is in progress

Before the operation, copy all other caches
Void all

8.5 Write update protocol: when a processor performs a

When he put the data
To all other caches. These Caches update their copies with new data. Of course, if you know that there is no corresponding copy in other Caches, you don't need to broadcast and update.

Short answer questions:

1. In a machine with a distributed memory structure, what are the benefits of distributing the memory to each node?

  1. If most of the memory accesses are performed on the memory of the local node, the bandwidth requirements for the memory and the interconnection network can be reduced
  2. The access delay time to the local memory is small

2. In a machine with a distributed memory structure, what are the two current organization schemes for the memory address space?

  1. Treat all physically separated memories as one
    Unified shared logical space
    Mark, so that any processor can access any unit in the shared space, and the same physical address on different processors points to the same storage unit. The system of this type of computer is called a distributed shared memory system.
  2. Address the memory of each node as one
    Independent address space
    , The address spaces in different nodes are independent of each other. That is to say, the address space of the entire system is composed of multiple independent address spaces. The memory of each node can only be accessed by the local processor, and the remote processor cannot directly access it.

3. In a machine with a distributed memory structure, what are the two communication mechanisms corresponding to the two address space organization schemes? How are they realized?

  1. Shared memory communication mechanism (for computer systems that share address space)

    Pass between processors

    load and store instructions
    It is realized by reading/writing the same memory address.

  2. Messaging communication mechanism (for computer systems that use multiple independent address spaces)

    Data communication needs to pass between processors

    Explicitly pass the message
    To complete, this is called a messaging communication mechanism.

4, when implementing Cache coherency protocols, which have both tracking technology to share data state?

  • Directory protocol : The shared state of data blocks in physical storage is stored in a place called a directory. The implementation overhead of the directory protocol is slightly larger than that of the listening protocol, but it can be used to implement larger-scale multiprocessors.

  • Monitoring protocol : When the data in the physical memory is transferred to the Cache, its shared state information is placed in the Cache together with the data block. The system does not have a centralized state table. These Caches are usually connected to the shared memory bus. When a Cache needs to access the memory, it will broadcast the request on the bus, and other Cache controllers monitor the bus to determine whether they have the requested data block on the bus. If so, proceed accordingly.

5, directory protocols, Cache block which of the three states?

  • Sharing (S): The block is in
    one or more
    The processor has copies of this block, and these copies are the same as the block in memory.
  • Unbuffered (U): The block has not been transferred to the Cache. There is no copy of this block in the Cache of all processors.
  • Exclusive (E) :
    Only one
    The processor has a copy of this block, and the processor has written to it, and the data in the block in the memory is out of date. This processor is called the owner of the block.

Supplementary test sites:

1. Classify multiprocessors according to the memory structure:

  • Symmetrical shared memory multiprocessor
  • Distributed memory multiprocessor