Embedded Software Engineer Written Interview Guide-ARM System and Architecture

Hello, everyone. I am finally back! As soon as I submitted my thesis on the 19th, I was pulled away on a business trip. After a whole week of running around, I returned to school on the night of the 26th. I have been slacking and have not posted anything substantial for a long time, so today I am sharing a set of written interview questions about ARM. The content of this article has also been updated on GitHub.


ARM system and architecture

Hardware foundation

What are the similarities and differences between NAND FLASH and NOR FLASH?


The differences, item by item (NOR FLASH first, then NAND FLASH):

Read
    NOR FLASH: Works just like accessing SRAM; data at any address can be read at random. For example: unsigned short *pwAddr = (unsigned short *)0x02; unsigned short wVal; wVal = *pwAddr;
    NAND FLASH: Fast, but with strict timing requirements. Data must be read through a function: send the read command -> send the address -> check whether the NAND flash is ready -> read a page of data. Each step is carried out by operating registers, such as the data register NFDATA.
Write
    NOR FLASH: Slow. You need to erase before writing, because writing can only change 1->0 and only erasing can change 0->1.
    NAND FLASH: Fast. You need to erase before writing, because writing can only change 1->0 and only erasing can change 0->1.
Erase
    NOR FLASH: Very slow (5 s).
    NAND FLASH: Fast (3 ms).
XIP
    NOR FLASH: Yes, code can run directly from NOR FLASH.
    NAND FLASH: No.
Reliability
    NOR FLASH: Relatively high; the bit-flip rate is less than 10% of that of NAND FLASH.
    NAND FLASH: Relatively low; bit flips are more common, so error-checking measures must be taken.
Interface
    NOR FLASH: Same as a RAM interface, with separate address and data buses.
    NAND FLASH: I/O interface.
Number of erase cycles
    NOR FLASH: 10,000 ~ 100,000
    NAND FLASH: 100,000 ~ 1,000,000
Capacity
    NOR FLASH: Small, 1MB ~ 32MB.
    NAND FLASH: Large, 16MB ~ 512MB.
Main purpose
    NOR FLASH: Often used to store code and key data.
    NAND FLASH: Used to store data.

Note: Address 0 of NAND flash and address 0 of NOR flash do not conflict. NOR flash occupies a BANK address range, while NAND flash does not; its address 0 is internal to the chip.

Similarities

1. Both must be erased before writing, because a write operation can only change bits 1->0, while an erase operation sets all bits to 1.
2. Both are erased in units of blocks.
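
The erase-before-write rule above can be sketched as a small software model. This is only an illustration, not driver code: the 16-byte BLOCK_SIZE and the function names are hypothetical, and a real device is programmed through its own command sequence.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16u  /* hypothetical tiny erase block, for illustration only */

/* Erase works on a whole block and sets every bit to 1 (every byte to 0xFF). */
void flash_erase_block(uint8_t *block)
{
    memset(block, 0xFF, BLOCK_SIZE);
}

/* Programming can only clear bits (1 -> 0), so the value that ends up in
 * the cell is the AND of the old contents and the new data. */
void flash_program_byte(uint8_t *block, unsigned offset, uint8_t data)
{
    assert(offset < BLOCK_SIZE);
    block[offset] &= data;
}
```

Programming 0x5A and then 0xA5 into the same byte without an erase in between leaves 0x00 in this model, which is why the block has to be erased first.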

What are the relationships and differences between CPU, MPU, MCU, SOC and SOPC?

1. CPU (Central Processing Unit) is the computing core and control core of a computer. The CPU consists of an arithmetic unit, a controller, registers, and the buses that carry data, control and status signals between them. The operation of almost all CPUs can be divided into four stages: Fetch, Decode, Execute and Writeback. The CPU fetches an instruction from memory or cache, puts it into the instruction register, decodes it, and executes it. The so-called programmability of a computer mainly refers to programming the CPU.

2. MPU (Micro Processor Unit), called a microprocessor (not a microcontroller), usually denotes a powerful CPU (for now, think of it as an enhanced CPU), but it is not a chip designed for any specific computing purpose. Such chips are often the core CPUs of personal computers and high-end workstations. The most common microprocessors are Motorola's 68K series and Intel's X86 series.

3. MCU (Micro Control Unit), called a microcontroller, refers to the integration of a computer's CPU, RAM, ROM, timers/counters and various I/O interfaces onto a single chip, made possible by the emergence and development of large-scale integrated circuits. The result is a chip-level computer, such as the 51 and AVR. Because it contains RAM and ROM in addition to the CPU, you only need to add simple external components (resistors, capacitors) to run code. An MPU such as x86 or ARM cannot hold code by itself; it is just an enhanced CPU, so RAM and ROM have to be added.

The main difference between an MCU and an MPU is whether it can run code on its own. An MCU has internal RAM and ROM, while an MPU is an enhanced CPU that needs external RAM and ROM to run code.

4. SOC (System on Chip) refers to a system on a chip. An MCU is only a chip-level computer, while an SOC is a system-level chip. It has built-in RAM and ROM like an MCU (51, AVR) and at the same time is as powerful as an MPU (ARM). It can hold not only simple code but also system-level code, which means it can run an operating system (think of it as combining the integration of an MCU with the processing power of an MPU).

5. SOPC (System On a Programmable Chip) is a programmable system on a chip (an FPGA is one example). For the four types above, the hardware configuration is fixed: a 51 microcontroller is a 51 microcontroller and cannot become an AVR, and an AVR cannot become a 51. Their hardware is produced once with a mask, and the only thing that can change is the software configuration. Put plainly, you change the code: what used to drive a running LED pattern now drives a seven-segment display. With an SOPC, both the hardware configuration and the software configuration can be modified. The software side is the same as above and needs no further comment. The hardware, however, can be built by the user; in other words, the user builds the chip. Such a chip is called a "blank chip": it is not any particular chip until the hardware configuration is downloaded, after which it becomes the corresponding chip. It can be turned into a 51, an AVR, or even an ARM. SOPC is also based on SOC, so it is a system-level chip as well; remember that when you turn it into an ARM you must also add ROM, RAM and similar blocks, otherwise it is only an MPU.

What is cross compilation?

A compiler that runs in one computer environment can compile code that runs in another environment. We say that such a compiler supports cross compilation, and the compilation process is called cross compilation. Simply put, it means generating, on one platform, executable code for another platform.

Note that the so-called platform actually covers two concepts: architecture and operating system. The same architecture can run different operating systems; likewise, the same operating system can run on different architectures. For example, what we usually call the x86 Linux platform is shorthand for the Intel x86 architecture plus the Linux for x86 operating system, and the x86 WinNT platform is shorthand for the Intel x86 architecture plus the Windows NT for x86 operating system.

Why do I need to cross-compile?

Sometimes the target platform does not allow, or cannot support, installing the compiler we need, yet we need certain features of that compiler. Sometimes the target platform has too few resources to run the compiler. And sometimes the target platform has not been brought up yet: there is no operating system, so there is nothing for a compiler to run on.

Describe the difference between the ROM-based and the RAM-based operating modes of an embedded system.

RAM based

  1. Code stored on a hard disk or other medium must first be loaded into RAM, and the loading process generally includes a relocation step.
  2. It runs faster than ROM-based mode, but less RAM is available than in ROM-based mode, because all code and data must be held in RAM.

ROM based

  1. It runs slower than RAM-based mode, because variables and part of the code must be moved from storage (hard disk, flash) to RAM.

  2. More RAM is available than in RAM-based mode.

ARM processor

What are the Harvard structure and the von Neumann structure?


The von Neumann structure uses unified addressing for instructions and data and transfers both over the same bus. Instruction fetches and data accesses by the CPU cannot overlap.

The Harvard structure uses independent addressing for instructions and data and transfers them over two separate buses. Instruction fetches and data accesses by the CPU can overlap.

Pros and cons

The von Neumann structure is mainly used in general-purpose computers, where the code and data in memory need to be modified frequently. Unified addressing helps save resources.

The Harvard structure is mainly used in embedded computers, where the program is fixed in hardware. It offers higher reliability, higher speed and greater throughput.

What is ARM pipeline technology?

Pipeline technology shortens program execution time by having several functional units work in parallel, which improves the efficiency and throughput of the processor core. It has become one of the most important techniques in microprocessor design. The ARM7 processor core uses a typical three-stage pipeline with a von Neumann structure, and the ARM9 series uses a five-stage pipeline with a Harvard structure. Increasing the number of pipeline stages simplifies the logic of each stage and further improves processor performance.

PC stands for program counter. With a three-stage pipeline, each instruction passes through three stages: 1. fetch (load an instruction from memory); 2. decode (identify the instruction to be executed); 3. execute (process the instruction and write the result back to a register). R15 (PC) always points to the instruction being fetched, not to the instruction being executed or decoded. By convention, the instruction being executed is taken as the reference point and called the current (first) instruction, so the PC always points to the third instruction. In ARM state each instruction is 4 bytes long, so the PC always holds the address of the current instruction plus 8 bytes, that is: PC value = current program execution position + 8.

In other words, the three stages (fetch, decode, execute) work at the same time on three successive instructions. The PC points to the address being fetched, the instruction being decoded is at PC-4 (assuming ARM state, where an instruction occupies 4 bytes), and the instruction being executed is at PC-8. The address in the PC is therefore 8 greater than the address of the instruction currently being executed.

When an interrupt occurs, the processor finishes the current instruction and saves PC-4 in LR_irq. At that point the saved value is 4 bytes past the instruction that should execute next.

So if the handler returned by copying LR straight into the PC, one instruction in between would never be executed. The return therefore uses (lr here is LR_irq):

SUBS pc, lr, #4
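
The address arithmetic above can be written out as a short C sketch. It assumes ARM state (4-byte instructions) and the three-stage pipeline described in the text; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ARM_INSN_SIZE 4u  /* every ARM-state instruction is 4 bytes */

/* Value read from R15 by the instruction executing at exec_addr:
 * the fetch stage is two instructions ahead, hence +8. */
uint32_t arm_pc_seen_by(uint32_t exec_addr)
{
    assert((exec_addr & 3u) == 0);  /* ARM instructions are word aligned */
    return exec_addr + 2u * ARM_INSN_SIZE;
}

/* Address at which execution resumes after an IRQ, given the value
 * saved in LR_irq. Mirrors the effect on the PC of subtracting 4 from LR. */
uint32_t irq_resume_address(uint32_t lr_irq)
{
    return lr_irq - ARM_INSN_SIZE;
}
```

For an instruction at 0x8000, the PC reads as 0x8008; if LR_irq holds 0x800C, execution resumes at 0x8008.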

How many working modes does ARM have?

  1. User mode (USR)

    User mode is the mode in which user programs run. Code running in user mode has no permission to operate other hardware resources; it can only process its own data and cannot switch to another mode by itself. To access hardware resources or switch to another mode, it must go through a software interrupt or an exception.

  2. System mode (SYS)

    System mode is a privileged mode and is not subject to the restrictions of user mode. User mode and system mode share one set of registers, so in this mode the operating system can easily access the user mode registers, and some privileged operating system tasks can use this mode to access controlled resources.

    Note: User mode and system mode use the same registers, and neither has an SPSR (Saved Program Status Register), but system mode has higher privileges than user mode and can access all system resources.

  3. Interrupt mode (IRQ)

    Interrupt mode, also called ordinary interrupt mode, is used to handle general interrupt requests. The processor usually enters this mode automatically after hardware generates an interrupt signal. It is a privileged mode and can freely access system hardware resources.

  4. Fast interrupt mode (FIQ)

    Fast interrupt mode is the counterpart of ordinary interrupt mode. It is used to handle more time-critical interrupt requests and is mainly used for high-speed data transfer and channel processing. FIQ mode has its own private registers (R8~R14). When an interrupt occurs, using these private registers avoids saving and restoring some registers. If the exception handler uses registers other than its own physical registers, it must save and restore them.

  5. Supervisor mode (SVC)

    Supervisor mode is the default mode after the CPU powers on, so it is mainly used for system initialization. Software interrupts are also handled in this mode. When a user program in user mode requests hardware resources, it enters this mode through a software interrupt.

    Note: The system enters SVC mode on reset, at boot, or on a software interrupt.

  6. Abort mode (ABT)

    Abort mode is used to support virtual memory or memory protection. When a user program accesses an illegal address, or a memory address it has no permission to access, the processor enters this mode. The segmentation faults that often occur when programming under Linux are usually reported from this mode.

  7. Undefined mode (UND)

    Undefined mode is used to support software emulation of hardware coprocessors. The CPU enters undefined mode when it cannot recognize an instruction during the decode stage.

  1. All six modes other than user mode are called privileged modes. In a privileged mode, the program can freely read and modify the status register using:

a. MRS (copy the contents of the status register into a general register);

b. MSR (copy the contents of a general register into the status register).

Since the contents of the status register cannot be modified in place, you must first copy them to a general register, modify the general register, and then copy it back to the status register to complete the task of "modifying the status register".

  2. Apart from user mode and system mode, the remaining five modes are collectively referred to as exception modes.
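
As a rough illustration of the mode classification and of the read-modify-write pattern behind MRS/MSR, the sketch below decodes the mode field in CPSR[4:0]. The mode encodings are the standard ARM values; the function names are hypothetical. It treats every mode except User as privileged, and every mode except User and System as an exception mode. The code only manipulates a plain integer and does not touch a real CPSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu  /* mode field lives in CPSR[4:0] */

enum arm_mode {
    MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
    MODE_ABT = 0x17, MODE_UND = 0x1B, MODE_SYS = 0x1F
};

/* Extract the current mode from a CPSR value (what code does after MRS). */
enum arm_mode cpsr_mode(uint32_t cpsr)
{
    return (enum arm_mode)(cpsr & CPSR_MODE_MASK);
}

bool mode_is_privileged(enum arm_mode m) { return m != MODE_USR; }

bool mode_is_exception(enum arm_mode m)
{
    return m != MODE_USR && m != MODE_SYS;
}

/* Modify step between MRS and MSR: clear the mode field, then set the
 * new mode, leaving all other CPSR bits untouched. */
uint32_t cpsr_with_mode(uint32_t cpsr, enum arm_mode m)
{
    assert(((uint32_t)m & ~CPSR_MODE_MASK) == 0);
    return (cpsr & ~CPSR_MODE_MASK) | (uint32_t)m;
}
```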

How many 32-bit registers does Arm have?

The ARM processor has 37 registers in total: 31 general registers and 6 status registers.
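
One way to see where 31 and 6 come from is to count the physical copies of each register across the seven modes. The sketch below does that tally; the helper names are made up, and the per-register counts reflect the classic banking scheme (R8-R12 banked only for FIQ, R13-R14 banked per exception mode).

```c
#include <assert.h>

/* Number of physical copies of register Rn (n = 0..15). */
unsigned physical_copies(unsigned n)
{
    assert(n <= 15);
    if (n <= 7)  return 1;  /* R0-R7: shared by all modes */
    if (n <= 12) return 2;  /* R8-R12: one shared copy plus one for FIQ */
    if (n <= 14) return 6;  /* R13-R14: User/System, FIQ, IRQ, SVC, ABT, UND */
    return 1;               /* R15 (PC): a single copy */
}

unsigned total_general_registers(void)
{
    unsigned total = 0;
    for (unsigned n = 0; n <= 15; n++)
        total += physical_copies(n);
    return total;
}

unsigned total_status_registers(void)
{
    return 1 + 5;  /* one CPSR plus one SPSR per exception mode */
}
```

The tally is 8 + 10 + 12 + 1 = 31 general registers, plus 6 status registers, giving 37.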

What are the differences between the S3C2440 and the S3C6410?

  1. The main frequency is different: the 2440 runs at 400 MHz, the 6410 at 533/667 MHz;

  2. The processor core is different: the 2440 uses the ARM920T core, the 6410 uses the ARM1176ZJF core;

  3. The 6410 is much better than the 2440 at video processing. It has an internal video decoder that supports MPEG4 and other video formats;

  4. The 6410 supports hardware decoding and encoding of WMV9, Xvid, MPEG4, H.264 and other formats;

  5. The 6410 adds many expansion interfaces, such as TV-out, CF card and S-Video output;

  6. Its SPI, serial and SD interfaces are also richer than those of the 2440;

  7. The 6410 uses a DDR memory controller; the 2440 uses an SDRAM memory controller;

  8. The 6410 has a dual-bus architecture: one bus is used for memory and the other for Flash;

  9. The 6410's boot options are more flexible: it can boot from SD, NAND Flash, NOR Flash and OneNAND devices;

  10. The 6410's NAND Flash support covers both SLC and MLC, which greatly expands storage space;

  11. The 6410 has 8 DMA channels, including dedicated DMA channels for the LCD, UART and camera;

  12. The 6410 also supports 2D and 3D graphics acceleration.

How many instruction sets does ARM have?

Two: the Thumb instruction set and the ARM instruction set. ARM instructions are 32 bits long and Thumb instructions are 16 bits long. This allows ARM to execute both 16-bit and 32-bit instructions, which enhances the functionality of the ARM core.

The general registers are R0-R15. Which three categories can they be divided into?

The general registers R0-R15 can be divided into 3 categories:

  1. Unbanked registers R0-R7

    In all operating modes, each unbanked register refers to the same physical register, and none of them is reserved by the system for a special purpose. Therefore, when an interrupt or exception causes a switch to an exception mode, data in these registers may be corrupted, because every processor mode uses the same physical register.

  2. Banked registers R8-R14

    For banked registers, the physical register that is actually accessed depends on the current processor operating mode.

    R13 is normally used as the stack pointer. Users can also use other registers as the stack pointer, but in the Thumb instruction set some instructions require R13 to be the stack pointer.

    R14 is called the link register (LR). When a subroutine is called, R14 receives a copy of R15 (PC). When the subroutine finishes, the value of R14 is copied back to the PC; that is, R14 holds the return address.

  3. Program counter PC (R15)

    Register R15 is used as the program counter (PC). In ARM state, bits [1:0] are 0 and bits [31:2] hold the PC; in Thumb state, bit [0] is 0 and bits [31:1] hold the PC.

How many working states does the ARM processor have?

From a programming point of view, an ARM microprocessor generally has two working states, ARM and Thumb, and it can switch between them.

  1. ARM state: the processor executes 32-bit, word-aligned ARM instructions. The processor works in this state most of the time.

  2. Thumb state: the processor executes 16-bit, halfword-aligned Thumb instructions.

On ARM, how are parameters passed when a function is called?

When there are 4 or fewer parameters, they are passed in registers r0-r3; any parameters beyond the fourth are passed on the stack.
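
A small example may help make the rule concrete. The comments describe where word-sized arguments would be placed on an ARM target that follows the standard procedure call convention; on other hosts the code still compiles and runs, but the placement differs. Both function names and the struct are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Six int parameters: on ARM, a..d arrive in r0..r3,
 * while e and f are pushed onto the stack by the caller. */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}

/* A common way to keep the call in registers: pass one pointer
 * (in r0) to a structure that holds all the values. */
struct sample {
    int v[6];
};

int sum_packed(const struct sample *s)
{
    assert(s != NULL);
    int total = 0;
    for (size_t i = 0; i < 6; i++)
        total += s->v[i];
    return total;
}
```

Keeping hot functions to four or fewer word-sized parameters avoids the extra stack traffic.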

Why is the memory starting address of 2440 0x30000000?

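The S3C2440 address space is divided into eight fixed memory banks. Six of them can only be used for ROM or SRAM; only the last two (bank 6 and bank 7) can also be used for SDRAM. Bank 6 starts at 0x30000000, so that is where memory begins. The details are shown in the figure below.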
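
Since the figure is not reproduced here, the address can also be derived by arithmetic. The sketch assumes each bank spans 128 MB, as in the S3C2440 memory map; the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define S3C2440_BANK_SIZE 0x08000000u  /* assumed: 128 MB per bank */

/* Base address of bank 0..6. Bank 7 is left out because its start
 * address depends on the configured size of bank 6. */
uint32_t bank_base(unsigned bank)
{
    assert(bank <= 6);
    return bank * S3C2440_BANK_SIZE;
}
```

Bank 6 is the first SDRAM-capable bank, and 6 x 0x08000000 = 0x30000000.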

What are the three types of ARM coprocessor instructions? Please describe their functions.

ARM coprocessor instructions fall into the following three categories:

  1. Instructions with which the ARM processor initiates a data processing operation in the coprocessor (CDP).

  2. Instructions that transfer data between ARM processor registers and coprocessor registers (MCR, MRC).

  3. Instructions that transfer data between coprocessor registers and memory (LDC, STC).

What is PLL (Phase Locked Loop)?

Simply put, the input clock serves as a "reference source". A phase-locked loop does not simply reproduce a signal with the same frequency and phase; it is usually built into a frequency synthesis circuit to generate a signal of a different frequency that is phase-locked to the reference.

That sounds confusing, so here is an example: the reference crystal oscillator runs at 10 MHz, frequency synthesizer A uses this reference to generate a 900 MHz clock, and frequency synthesizer B generates a 1 GHz clock. Although the two frequencies differ, the signals are still coherent because they share the same reference source. Conversely, if the sources are different, the signals cannot be identical even at the same nominal frequency, because no two clocks in the world are exactly the same; there is always a slight frequency difference, which leads to phase drift. Many real applications require clocks derived from the same source, which is why phase-locked loops are so widely used. Another derived application of the phase-locked loop is coherent demodulation; you can look up the relevant material yourself.
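
The frequency relationship in the example can be checked with a simple model. This assumes an integer-N synthesizer where f_out = f_ref x N / R; the multiplier and divider values below are chosen only to reproduce the 900 MHz and 1 GHz figures, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Output frequency of an idealized integer-N PLL synthesizer:
 * the reference is divided by r_div and multiplied by n_mult. */
uint64_t pll_output_hz(uint64_t f_ref_hz, uint32_t n_mult, uint32_t r_div)
{
    assert(r_div != 0);
    return f_ref_hz * n_mult / r_div;
}
```

With a 10 MHz reference, N = 90 gives 900 MHz and N = 100 gives 1 GHz; both outputs are locked to the same reference.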

Interrupts and exceptions

What is the difference between an interrupt and an exception?

An interrupt is an electrical signal generated by external hardware that enters through the CPU's interrupt pin and interrupts the operation of the CPU.

An exception is an event that must be handled while software is running; the CPU automatically generates a trap that interrupts its own operation. Exceptions are synchronous with the processor clock, and are in fact also called synchronous interrupts. When the processor executes an incorrect instruction caused by a programming error, or a special condition occurs during execution that must be handled by the kernel, the processor generates an exception.

What is the difference between interrupt and DMA?

DMA: a hardware mechanism that allows bidirectional data transfer between peripherals and system memory without the participation of the CPU. Using DMA frees the CPU from the actual I/O data transfer, which greatly improves system throughput.

Interrupt: while the CPU is executing a program, an urgent event occurs; the CPU must suspend the current program and switch to handling the event. After handling is complete, the CPU returns to the point where the original program was interrupted and continues.

So the difference between interrupts and DMA is: DMA does not require CPU participation, while interrupts do.

Can an interrupt handler sleep, and why? Can the bottom half sleep?

  1. No process switch should occur during interrupt handling. In interrupt context, the only thing that can interrupt the current handler is a higher-priority interrupt; it cannot be interrupted by a process. If you sleep in interrupt context, there is no way to wake up again, because every wake_up_xxx function targets a specific process, and interrupt context has no notion of a process and no task_struct (the same is true for softirqs and tasklets). So if the handler really does sleep, for example by calling a routine that blocks, the kernel will almost certainly crash.

  2. schedule() saves the current process context (CPU register values, process state and stack contents) when switching processes, so that the process can be resumed later. When an interrupt occurs, the kernel first saves the context of the interrupted process (and restores it after the interrupt handler has run).

    But inside the interrupt handler, the CPU register values have certainly changed (most importantly the program counter PC and the stack pointer SP). If schedule() is called at this point because of a sleep or blocking operation, the context it saves is no longer that of the current process. Therefore, schedule() must not be called in an interrupt handler.

  3. In the 2.4 kernel, schedule() itself checks on entry whether it is running in interrupt context:

if (unlikely(in_interrupt())) BUG();

Therefore, forcibly calling schedule() there triggers a kernel BUG. In the 2.6.18 implementation of schedule(), however, this check is no longer present; the code has changed.

  4. The interrupt handler uses the kernel stack of the interrupted process but leaves no lasting effect on it, because the handler completely cleans up the part of the stack it used and restores it to the state before the interrupt.

  5. In interrupt context the kernel is not preemptible. Therefore, if the handler sleeps, the kernel is bound to hang.

What is the execution flow of the interrupt response?

Interrupt response process: the CPU accepts the interrupt -> saves the interrupted context and jumps to the interrupt handler -> executes the top half of the interrupt -> executes the bottom half of the interrupt -> restores the interrupted context.

When an exception occurs, what steps will the ARM microprocessor perform?

  1. Store the address of the next instruction in the corresponding link register LR, so that the program can resume from the correct position on exception return. If the exception is entered from ARM state, the LR holds the address of the next instruction (current PC + 4 or PC + 8, depending on the exception type). If it is entered from Thumb state, the LR holds the current PC with a matching offset, so that the exception handler does not need to determine which state the exception came from. For example, for the software interrupt exception SWI, the instruction MOVS PC, R14_svc always returns to the next instruction, regardless of whether the SWI was executed in ARM state or in Thumb state.
  2. Copy the CPSR to the corresponding SPSR.
  3. Force the mode bits of the CPSR to the value for the exception type.
  4. Force the PC to fetch the next instruction from the relevant exception vector address, thereby jumping to the corresponding exception handler.
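
The four steps can be mirrored in a small software model. This is a sketch, not a description of real hardware timing: struct cpu, its field names and take_exception are hypothetical, the banked LR and SPSR of the target mode are collapsed into single fields, and the caller supplies the return link value. The vector addresses and mode encodings are the standard ARM ones, and the model also masks IRQs on entry (and FIQs for an FIQ), which the processor does as part of the same sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu
#define CPSR_T_BIT     (1u << 5)  /* Thumb state */
#define CPSR_F_BIT     (1u << 6)  /* FIQ disable */
#define CPSR_I_BIT     (1u << 7)  /* IRQ disable */

enum exc_type { EXC_UNDEF, EXC_SWI, EXC_PABT, EXC_DABT, EXC_IRQ, EXC_FIQ, EXC_COUNT };

struct exc_info { uint32_t vector; uint32_t mode; };

static const struct exc_info exc_table[EXC_COUNT] = {
    [EXC_UNDEF] = { 0x04, 0x1B },  /* Undefined mode  */
    [EXC_SWI]   = { 0x08, 0x13 },  /* Supervisor mode */
    [EXC_PABT]  = { 0x0C, 0x17 },  /* Abort mode      */
    [EXC_DABT]  = { 0x10, 0x17 },  /* Abort mode      */
    [EXC_IRQ]   = { 0x18, 0x12 },  /* IRQ mode        */
    [EXC_FIQ]   = { 0x1C, 0x11 },  /* FIQ mode        */
};

/* Simplified CPU state: lr_exc and spsr_exc stand for the banked
 * LR and SPSR of whichever exception mode is entered. */
struct cpu { uint32_t pc, cpsr, lr_exc, spsr_exc; };

void take_exception(struct cpu *c, enum exc_type t, uint32_t return_link)
{
    assert(c != NULL && t < EXC_COUNT);
    c->lr_exc = return_link;                                    /* step 1 */
    c->spsr_exc = c->cpsr;                                      /* step 2 */
    c->cpsr = (c->cpsr & ~CPSR_MODE_MASK) | exc_table[t].mode;  /* step 3 */
    c->cpsr &= ~CPSR_T_BIT;        /* handlers start in ARM state */
    c->cpsr |= CPSR_I_BIT;         /* IRQs are masked on entry */
    if (t == EXC_FIQ)
        c->cpsr |= CPSR_F_BIT;     /* FIQ entry also masks FIQs */
    c->pc = exc_table[t].vector;                                /* step 4 */
}
```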

What should you pay attention to when writing an interrupt service routine? If there is a lot of work to do after the interrupt occurs, how do you handle it?

  1. An interrupt service routine should get in and get out quickly: collect the necessary information (including hardware information) as fast as possible in the routine, then exit the interrupt and do the rest of the work with a work queue or tasklet. This is the split into top half and bottom half.

  2. There must be no blocking operations in the interrupt service routine, because the CPU is fully occupied during the interrupt (there is no kernel scheduling); if the handler blocks, other processes cannot run.

  3. Pay attention to the return value of the interrupt service routine: use the macros defined by the operating system rather than self-defined values.

  4. If there is a lot to do, that work should be placed in the bottom half (tasklet, work queue, etc.).
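
The "quick in, quick out" pattern can be illustrated without any kernel API. The sketch below is a bare-metal style analogy, not Linux tasklet or work queue code: a hypothetical top half only stores the byte read from hardware in a ring buffer, and a bottom half drains the buffer later from ordinary context. All names and the buffer size are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u  /* must be a power of two */

static volatile uint8_t  ring[RING_SIZE];
static volatile unsigned head;  /* advanced only by the top half */
static volatile unsigned tail;  /* advanced only by the bottom half */

/* Top half: called from the ISR with the byte just read from hardware.
 * It only stores the byte and returns; no blocking, no heavy work. */
bool isr_top_half(uint8_t hw_byte)
{
    if (head - tail == RING_SIZE)
        return false;  /* buffer full: drop the byte */
    ring[head % RING_SIZE] = hw_byte;
    head++;
    return true;
}

/* Bottom half: runs later, outside interrupt context, and does the
 * time-consuming processing. Here it simply copies the data out. */
unsigned bottom_half_drain(uint8_t *out, unsigned max)
{
    assert(out != NULL);
    unsigned n = 0;
    while (n < max && tail != head) {
        out[n++] = ring[tail % RING_SIZE];
        tail++;
    }
    return n;
}
```

With one producer (the ISR) and one consumer (the deferred code), each index is written by one side only, so this simple model needs no lock.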

Why is FIQ faster than IRQ?

  1. ARM's FIQ mode provides more banked registers: R8 to R14 plus the SPSR. IRQ mode has fewer; R8, R9, R10, R11 and R12 have no banked copies in IRQ mode. This means that in IRQ mode the interrupt handler itself must save registers R8 to R12 on entry and restore them on exit. In FIQ mode these registers are banked, so the CPU switches to the FIQ copies automatically on entry and back again when leaving FIQ mode, without any explicit save or restore, which makes FIQ faster than IRQ here. Do not underestimate these registers: when the code is compiled, if the FIQ handler can do its work using only these banked registers, no general-purpose registers need to be pushed onto the stack, which also saves time.

  2. FIQ has a higher priority than IRQ. If an FIQ and an IRQ occur at the same time, the FIQ is handled first.

  3. In the Symbian system, when the CPU is in FIQ mode handling an FIQ, prefetch abort exceptions, undefined instruction exceptions and software interrupts are disabled, and all interrupts are masked. The FIQ therefore runs quickly and is not interrupted by other exceptions or interrupts, so it is faster than IRQ. IRQ is different: while ARM is in IRQ mode handling an IRQ, an incoming FIQ request preempts the running IRQ handler, and ARM switches to FIQ mode to handle the FIQ. This is another reason FIQ is much faster than IRQ.

  4. In addition, the entry address of FIQ is 0x1C and the entry address of IRQ is 0x18. Anyone who has written a complete system in assembly understands the difference: the slot at 0x18 has room for only one instruction, and so that it does not collide with the FIQ entry at 0x1C, that instruction can only be a jump. FIQ is different: no interrupt vector follows 0x1C, so the FIQ handler can be placed directly at 0x1C. Compared with IRQ, at least one jump instruction is saved.

Which is more efficient, interrupts or polling? How do you decide whether to implement a driver with interrupts or with polling?

With interrupts, the CPU passively receives a signal from the device; with polling, the CPU actively asks whether the device has a request.

Everything has two sides, so you cannot simply say that one of them is more efficient. If the device requests the CPU very frequently, or it is a network device that requests large amounts of data, polling is more efficient than interrupts. If it is an ordinary device that requests the CPU relatively rarely, interrupts are more efficient. It mainly depends on the frequency of requests.

Communication protocols

What are asynchronous transmission and synchronous transmission?

Asynchronous transmission: a typical byte-oriented form of input and output. Data is transmitted one byte at a time, and the transmission speed is low.

Synchronous transmission: requires an external clock signal for communication. Data bytes are grouped and sent together; such a group is called a frame. It is faster than asynchronous transmission.
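
As an illustration of byte-at-a-time asynchronous framing, the sketch below packs one byte into a UART-style frame. The 8N1 format (one start bit, eight data bits sent LSB first, one stop bit) is an assumption chosen for the example; the text above does not name a specific format, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 10-bit 8N1 frame. Bit i of the result is the i-th bit on the
 * wire: bit 0 = start (0), bits 1..8 = data LSB first, bit 9 = stop (1). */
uint16_t uart_frame_8n1(uint8_t data)
{
    return (uint16_t)((1u << 9) | ((uint16_t)data << 1));
}

/* Recover the data byte from a frame produced by uart_frame_8n1(). */
uint8_t uart_unframe_8n1(uint16_t frame)
{
    assert((frame & 1u) == 0);          /* start bit must be 0 */
    assert(((frame >> 9) & 1u) == 1u);  /* stop bit must be 1 */
    return (uint8_t)(frame >> 1);
}
```

Every byte carries its own start and stop bits, so 10 bits travel on the wire for 8 bits of data. That per-byte overhead is one reason asynchronous transmission is slower.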

What are the differences between the RS232 and RS485 communication interfaces?

  1. The transmission method is different. RS232 uses unbalanced transmission, so-called single-ended communication. RS485 uses balanced transmission, that is, differential transmission.

  2. The transmission distance is different. RS232 is suitable for communication between local devices, and the transmission distance is generally no more than 20 m. The transmission distance of RS485 ranges from tens of meters to thousands of meters.

  3. The number of devices is different. RS232 only allows one-to-one communication, while an RS485 interface allows up to 128 transceivers on the bus.

  4. The wiring is different. RS232 represents data by a voltage level, so each direction needs a single signal line and full duplex can be achieved with two lines. RS485 represents data by a differential level, so two wires are the minimum needed to transmit data, and four wires are needed for full duplex.

Summary: In a sense, there is nothing on the line but currents; RS232 and RS485 specify the wiring and the patterns in which these currents flow.
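
The single-ended versus differential distinction can be sketched as two receiver decision functions working on voltages in millivolts. The thresholds used are the usual nominal ones (beyond +/-3 V for RS232, beyond +/-200 mV differential for RS485), and the -7 V to +12 V input range is the usual RS485 common-mode range. The function names and the naming of the two RS485 inputs as non-inverting and inverting are assumptions made to avoid the A/B labeling differences between vendors.

```c
#include <assert.h>

/* RS232: one signal line measured against ground.
 * Returns 1 (mark), 0 (space), or -1 if the level is undefined. */
int rs232_decode_mv(int line_mv)
{
    if (line_mv <= -3000) return 1;  /* negative voltage = logic 1 */
    if (line_mv >=  3000) return 0;  /* positive voltage = logic 0 */
    return -1;
}

/* RS485: the receiver looks only at the difference between two wires.
 * Returns 1, 0, or -1 if the difference is inside the +/-200 mV band. */
int rs485_decode_mv(int noninv_mv, int inv_mv)
{
    /* Inputs are expected to stay within the common-mode range. */
    assert(noninv_mv >= -7000 && noninv_mv <= 12000);
    assert(inv_mv >= -7000 && inv_mv <= 12000);

    int diff = noninv_mv - inv_mv;
    if (diff >=  200) return 1;
    if (diff <= -200) return 0;
    return -1;
}
```

Noise that shifts both wires by the same amount leaves the difference unchanged, which is why the differential scheme tolerates much longer cables.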

SPI protocol

Application of SPI

The SPI (Serial Peripheral Interface) protocol is a communication protocol proposed by Motorola. It is a high-speed, full-duplex communication bus. The SPI bus is a synchronous serial peripheral interface that lets an MCU exchange information with various peripheral devices serially. It can connect directly to standard peripherals from many manufacturers, including FLASH, RAM, network controllers, LCD display drivers, A/D converters and other MCUs.


  1. MOSI (Master Output, Slave Input)

    Master device output/slave device input pin. The data of the host is output from this signal line, and the slave reads the data sent by the host from this signal line, that is, the direction of the data on this line is from the host to the slave.

  2. MISO(Master Input,, Slave Output)

    Master input/slave output pin. The host reads data from this signal line, and the data from the slave is output to the host via this signal line, that is, the direction of the data on this line is from the slave to the host.

  3. SCLK (Serial Clock)

    Clock signal line, used for communication data synchronization. It is generated by the communication host and determines the communication rate. The maximum clock frequency supported by different devices is different. For example, the maximum clock frequency of STM32 is fpclk/2. When communicating between two devices, the communication rate is limited by the low-speed device. .

  4. SS (Slave Select)

    The slave select signal line, often called the chip select line and also written NSS or CS; it is denoted NSS below. When multiple SPI slave devices are connected to one SPI master, the other signal lines (SCK, MOSI and MISO) of all devices are connected in parallel to the same SPI bus, so no matter how many slaves there are, only these 3 lines are shared. Each slave, however, has its own NSS line, and each NSS line occupies one pin of the master, so there are as many chip select lines as there are slave devices.

    The I2C protocol uses a device address to select a device on the bus and communicate with it. SPI has no device addresses; it uses the NSS lines instead. When the master wants to select a slave, it drives that slave's NSS line low; the slave is then selected (chip select is active), and the master starts SPI communication with it. SPI communication therefore takes NSS going low as the start signal and NSS going high as the end signal.

Protocol layer

The common connection methods between SPI communication devices are shown in the figure below:

The communication sequence of SPI communication is shown in the figure below:

  1. Communication start and stop signals

    At label 1 in the figure, the NSS signal line changes from high to low, which is the start signal of SPI communication. NSS is a signal line owned exclusively by each slave. When a slave detects the start signal on its NSS line, it knows it has been selected by the master and prepares to communicate. Where the NSS signal changes from low to high in the figure, that is the stop signal of SPI communication: the transfer is over and the slave is deselected.

  2. Data validity

    SPI uses the MOSI and MISO lines to transmit data and the SCK line for synchronization. MOSI and MISO each transfer one bit per SCK clock cycle, and input and output happen at the same time. The protocol does not mandate MSB first (high bit first) or LSB first (low bit first), but both devices must use the same order. MSB first, as in the figure above, is the usual choice.

    At labels 2, 3, 4 and 5 in the figure, the MOSI and MISO data change on the rising edge of SCK and are sampled on the falling edge. That is, the data on MOSI and MISO is valid at the falling edge of SCK, where a high level means "1" and a low level means "0". At other times the data is not valid, and MOSI and MISO prepare the next bit.

    Each SPI data unit can be 8 or 16 bits, and the number of units per transfer is not limited.

  3. CPOL (clock polarity)/CPHA (clock phase) and communication mode

    The timing described above is only one of the SPI communication modes. SPI has four modes, which differ in the SCK level when the bus is idle and in the moment at which data is sampled. To describe them, the concepts of "clock polarity CPOL" and "clock phase CPHA" are introduced here.

    The clock polarity CPOL is the level of the SCK line when the SPI device is idle (that is, the state of SCK while the NSS line is high, before communication starts). When CPOL=0, SCK is low in the idle state; when CPOL=1, SCK is high in the idle state.

    The clock phase CPHA is the moment of data sampling. When CPHA=0, the signal on the MOSI or MISO data line is sampled on the "odd edges" of SCK (the first, third, and so on). When CPHA=1, the data line is sampled on the "even edges" of SCK.
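The CPOL/CPHA rules above can be sketched as a few small C helpers. This is only an illustrative sketch: the function names are hypothetical, and the numbering mode = (CPOL << 1) | CPHA is the commonly used convention rather than something stated in this guide.

```c
#include <stdbool.h>

/* Mode number under the common convention: mode = (CPOL << 1) | CPHA. */
int spi_mode_number(int cpol, int cpha)
{
    return ((cpol & 1) << 1) | (cpha & 1);
}

/* Idle level of SCK: low (0) when CPOL=0, high (1) when CPOL=1. */
int spi_idle_level(int cpol)
{
    return cpol & 1;
}

/*
 * Returns true when data is sampled on a rising edge of SCK.
 * CPHA=0 samples on odd edges, CPHA=1 on even edges.
 * The first edge leaves the idle level, so it rises when CPOL=0 and
 * falls when CPOL=1; every even edge goes the opposite way.
 */
bool spi_samples_on_rising_edge(int cpol, int cpha)
{
    bool first_edge_rising = ((cpol & 1) == 0);
    return ((cpha & 1) == 0) ? first_edge_rising : !first_edge_rising;
}
```

With CPOL=0 and CPHA=1 the helper reports sampling on the falling edge, which matches the timing described for the figure earlier in this section.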

IIC protocol


The IIC (I2C) protocol is a serial bus made up of the data line SDA and the clock line SCL, over which data can be sent and received. It is a multi-master, half-duplex communication method.

Each device connected to the bus has a unique address. The bit rate can reach 100 kbit/s in standard mode, 400 kbit/s in fast mode, and 3.4 Mbit/s in high-speed mode.

The structure of the I2C bus system is as follows:

I2C timing introduction

1. Idle state

When the two signal lines SDA and SCL are both at high level, the bus is in the idle state.

2. Start signal

When SCL is high and SDA jumps from high to low, this is the start signal of the bus. It can only be initiated by the master, and only when the bus is idle, as shown in the following figure:

3. Stop signal

When SCL is high and SDA transitions from low to high, this is the stop signal of the bus, indicating that the data transfer has finished, as shown in the following figure:

4. Transmission data format

Once the start signal has been sent, data transmission begins. The transmitted data format is shown in the figure below:

While SCL is high, the value on SDA is read, so SDA must be stable during that time (if SDA changes while SCL is high, the change is interpreted as a start or stop signal).

While SCL is low, SDA is allowed to change level.

If the master or a slave needs to do something else (such as service an interrupt) during a transfer, it can actively hold SCL low to put the I2C bus into a waiting state. SCL is released when the processing is finished, and the transfer continues.
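The idle, start, stop and data-validity rules in items 1 to 4 can be sketched as a small classifier over sampled line levels. This is a minimal sketch, not a driver: the type and function names are hypothetical, and the levels are assumed to be plain 0/1 samples of SCL and of SDA before and after a change.

```c
/* Bus conditions recognisable from one SDA transition and the SCL level. */
typedef enum {
    I2C_EVENT_NONE,   /* SDA steady, or SDA changing while SCL is low (normal data change) */
    I2C_EVENT_START,  /* SDA falls while SCL is high */
    I2C_EVENT_STOP    /* SDA rises while SCL is high */
} i2c_event_t;

/* Classify what happened on the bus given SCL and SDA before/after. */
i2c_event_t i2c_classify(int scl, int sda_prev, int sda_now)
{
    if (scl && sda_prev && !sda_now) {
        return I2C_EVENT_START;
    }
    if (scl && !sda_prev && sda_now) {
        return I2C_EVENT_STOP;
    }
    return I2C_EVENT_NONE;
}

/* The bus is idle when both lines are high. */
int i2c_bus_idle(int scl, int sda)
{
    return scl && sda;
}
```

A change of SDA while SCL is low falls into I2C_EVENT_NONE, which is why data bits may only be changed during the low phase of the clock.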

5. Acknowledge signal ACK

Data on the I2C bus is transferred in 8-bit units (bytes). After the 8 data bits have been sent, the sender releases SDA during the 9th clock pulse. If the receiver has received the byte successfully, it outputs an ACK response. SDA high during the 9th clock means NACK, and SDA low means ACK.

PS: When the master is the receiver, after receiving the last byte it can send a stop signal directly to end the transfer without sending an ACK.

When the slave is the receiver and sends no ACK, the slave may be busy with something else, or the address did not match, or it does not support multi-master transmission. The master can then send a stop signal and send a start signal again to begin a new transfer.

6. Complete data transmission

As shown in the figure below, after the start signal an 8-bit address byte is sent, in which the eighth bit is the read/write flag for the device. The data follows immediately, and the transfer ends when the stop signal is sent.

PS: After a first operation, to switch to the other direction (for example from write to read), send a start signal again followed by the device address with the new direction bit. The switch is made without sending a stop signal.
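As a small illustration of the address byte described above, the sketch below packs a 7-bit device address and the direction flag into the first byte sent after the start signal. The function name and the macros are hypothetical; the direction encoding (0 means write, 1 means read) is the usual I2C convention.

```c
#include <stdint.h>

#define I2C_WRITE 0u  /* direction bit: master writes to the slave */
#define I2C_READ  1u  /* direction bit: master reads from the slave */

/*
 * Build the first byte of a transfer: the 7-bit device address in the
 * upper seven bits and the read/write flag in the lowest bit.
 */
uint8_t i2c_address_byte(uint8_t addr7, unsigned rw)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (rw & 1u));
}
```

For a hypothetical device at 7-bit address 0x50, i2c_address_byte(0x50, I2C_WRITE) gives 0xA0 and i2c_address_byte(0x50, I2C_READ) gives 0xA1.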

IIC transmission data format

1. Write operation

First the master chip sends a start signal, then a device address (which determines the chip to write to) and the direction (read/write: 0 means write, 1 means read). The slave responds (which shows whether the device exists), and then data can be transmitted. After each byte of data there must be a response signal (which shows whether the data was accepted) before the next byte is sent. Every transmitted byte is acknowledged by the receiver, and when all the data has been sent the master chip sends a stop signal.

White background: master to slave. Gray background: slave to master.

2. Read operation

First the master chip sends a start signal, then a device address (which determines the chip to read from) and the direction (read/write: 0 means write, 1 means read). The slave responds (which shows whether the device exists), and then data can be transmitted. After each byte of data there must be a response signal (which shows whether the data was accepted) before the next byte is sent. Every transmitted byte is acknowledged by the receiver, and when all the data has been transferred the master chip sends a stop signal.

White background: master to slave. Gray background: slave to master.


What is big endian in embedded programming? What is little endian?

Big-endian mode: The low-order byte is stored on the high address, and the high-order byte is stored on the low address.

Little-endian mode: The high-order byte is stored on the high address, and the low-order byte is stored on the low address.

STM32 is little-endian. For example, take u32 temp = 0X12345678; and suppose temp is at address 0X2000 0010. In memory it is stored as:

Address | HEX
0X2000 0010 | 78 56 34 12

Each hexadecimal digit represents 4 bits (half a byte), so 12 is one byte and 34 is one byte.

A little-endian CPU stores operands from low byte to high byte, while a big-endian CPU stores them from high byte to low byte. For example, the storage of the 16-bit number 0x1234 in little-endian CPU memory (assuming it starts at address 0x4000) is shown in Table 1, and its storage in big-endian CPU memory is shown in Table 2.

Table 1 Storage mode of 0x1234 in little-endian CPU memory

Memory address | Stored content
0x4000 | 0x34
0x4001 | 0x12

Table 2 Storage mode of 0x1234 in big-endian CPU memory

Memory address | Stored content
0x4000 | 0x12
0x4001 | 0x34

The storage mode of the 32-bit wide number 0x12345678 in the little-endian mode CPU memory (assuming it is stored from address 0x4000) is shown in Table 3, and the storage mode in the big-endian mode CPU memory is shown in Table 4.

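Table 3 Storage mode of 0x12345678 in little-endian CPU memory

Memory address | Stored content
0x4000 | 0x78
0x4001 | 0x56
0x4002 | 0x34
0x4003 | 0x12

Table 4 Storage mode of 0x12345678 in big-endian CPU memory

Memory address | Stored content
0x4000 | 0x12
0x4001 | 0x34
0x4002 | 0x56
0x4003 | 0x78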
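The two byte orders for a 32-bit value can also be written out explicitly in C. The sketch below is illustrative only (the function names are hypothetical): it fills a byte buffer using shifts, so the result does not depend on the endianness of the machine running it.

```c
#include <stdint.h>

/* Store v into p[0..3] with the low-order byte at the lowest address. */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Store v into p[0..3] with the high-order byte at the lowest address. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v);
}
```

For 0x12345678, store_le32 produces the bytes 78 56 34 12 from the lowest address upward, and store_be32 produces 12 34 56 78.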

Take the following program as an example:

#include <stdio.h>

struct mybitfields {
    unsigned short a : 4;
    unsigned short b : 5;
    unsigned short c : 7;
} test;

int main(void)
{
    int i;
    test.a = 2;
    test.b = 3;
    test.c = 0;
    i = *((short *)&test); // reinterpret the 16 bits of the struct as a short
    printf("%d\n", i);
    return 0;
}

The output of the program is 50 (on a little-endian machine whose compiler allocates bit-fields starting from the least significant bit, such as GCC on x86; bit-field layout is implementation-defined).

In the above example, sizeof(test) = 2. The declaration divides one short (a 16-bit block of memory) into 3 parts of 4, 5, and 7 bits. The assignment statement i = *((short *)&test); reinterprets those 16 bits of memory as a short.

The binary representation of a is 0000000000000010, whose low four bits are 0010. The binary representation of b is 0000000000000011, whose low five bits are 00011. The binary representation of c is 0000000000000000, whose low seven bits are 0000000.

80x86 machines are little-endian (keep this in mind when modifying a partition table), while some microcontroller toolchains, such as Keil C51 for the 8051, are big-endian. Little-endian means the low byte comes before the high byte, that is, the low byte sits at the low end of the memory address range. One way to remember it: little-endian puts the low-order part first, the opposite of the normal written order. Here a occupies the lowest bits, followed by b and then c, so the 16 bits read from high to low as c b a = 0000000 00011 0010, that is 0000000000110010, or decimal 50.
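The way the three fields combine into 50 can be checked with explicit shifts. The sketch below assumes the layout discussed above (a in the lowest 4 bits, then b in the next 5, then c in the top 7); that layout is what a typical little-endian compiler produces, but it is implementation-defined, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Pack three fields into 16 bits, lowest field first:
 * a -> bits 0..3, b -> bits 4..8, c -> bits 9..15.
 */
uint16_t pack_fields(unsigned a, unsigned b, unsigned c)
{
    return (uint16_t)((a & 0x0Fu)
                    | ((b & 0x1Fu) << 4)
                    | ((c & 0x7Fu) << 9));
}
```

With a = 2, b = 3, c = 0 this gives 2 + (3 << 4) = 2 + 48 = 50, the same value the bit-field program prints.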

Here is another example

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int uiVal_1 = 0x12345678;
    unsigned int uiVal_2 = 0;
    unsigned char aucVal[4] = {0x12, 0x34, 0x56, 0x78};
    unsigned short usVal_1 = 0;
    unsigned short usVal_2 = 0;

    memcpy(&uiVal_2, aucVal, sizeof(uiVal_2));
    usVal_1 = (unsigned short)uiVal_1; // truncation keeps the low 16 bits
    usVal_2 = (unsigned short)uiVal_2; // truncation keeps the low 16 bits
    printf("usVal_1:%x\n", usVal_1);
    printf("usVal_2:%x\n", usVal_2);
    return 0;
}

Little-endian mode stores the low byte at the low address and the high byte at the high address, so 0x12345678 is laid out as follows:

78 //Low address
56
34
12 //High address

The test machine is little-endian. In memory, with addresses increasing from left to right, the two variables hold:

uiVal_1: 78 56 34 12
uiVal_2: 12 34 56 78

Truncating to unsigned short keeps the two bytes at the low addresses, so the results are as follows:

usVal_1:5678
usVal_2:3412

How to judge whether a computer processor is big-endian or little-endian?

#include <stdio.h>

int checkCPU(void)
{
    union w {
        int a;
        char b;
    } c;
    c.a = 1;
    return (c.b == 1); // b overlays the lowest-addressed byte of a
}

int main(void)
{
    if (checkCPU())
        printf("little endian\n");
    else
        printf("big endian\n");
    return 0;
}

The author's processor is an Intel processor. Intel processors are little-endian, so the output of the program is: little endian

In the above code, checkCPU returns 0 if the processor is big-endian and 1 if it is little-endian. All members of a union are stored starting from the same low address. If you can use this property to write code that tells whether the CPU reads and writes memory in little-endian or big-endian mode, it will certainly impress the interviewer.

It can also be judged through a pointer. On a 32-bit system, short occupies two bytes and char occupies one byte, so the following method works as well.

#include <stdio.h>

int checkCPU(void)
{
    unsigned short usData = 0x1122;
    unsigned char *pucData = (unsigned char *)&usData;
    return (*pucData == 0x22); // the lowest address holds the low byte on little-endian
}

int main(void)
{
    if (checkCPU())
        printf("little endian\n");
    else
        printf("big endian\n");
    return 0;
}

On the same machine, the output of the program is: little endian

How to convert between big and little endian?

#include <stdio.h>

int swapInt32(int intValue)
{
    unsigned int u = (unsigned int)intValue; // shift as unsigned to avoid signed overflow
    return (int)(((u & 0x000000FFu) << 24) |
                 ((u & 0x0000FF00u) << 8) |
                 ((u & 0x00FF0000u) >> 8) |
                 ((u & 0xFF000000u) >> 24));
}

/* short type: */
unsigned short swapShort16(unsigned short shortValue)
{
    return (unsigned short)(((shortValue & 0x00FF) << 8) | ((shortValue & 0xFF00) >> 8));
}

/* float type: */
float swapFloat32(float floatValue)
{
    typedef union SWAP_UNION {
        float unionFloat;
        int unionInt;
    } SWAP_UNION;
    SWAP_UNION swapUnion;
    swapUnion.unionFloat = floatValue;
    swapUnion.unionInt = swapInt32(swapUnion.unionInt);
    return swapUnion.unionFloat;
}

/* double type: shifting is painful here, so swap byte by byte through pointers */
void swapDouble64(unsigned char *pIn, unsigned char *pOut)
{
    for (int i = 0; i < 8; i++)
        pOut[7 - i] = pIn[i];
}

int main(void)
{
    int x = 0x12345678;
    int y = swapInt32(x);
    printf("%x\r\n", y);
    return 0;
}

How to assign a value to the absolute address 0x100000?

*(unsigned int *)0x100000 = 1234;
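On real hardware such an access is usually made through a volatile pointer so that the compiler does not optimise it away. The sketch below is illustrative only: the helper names are hypothetical, and the address is passed as a parameter because writing to 0x100000 on a desktop operating system would most likely crash the program.

```c
#include <stdint.h>

/* Write a 32-bit value to an absolute address, e.g. a memory-mapped register. */
void write_u32_at(uintptr_t address, uint32_t value)
{
    *(volatile uint32_t *)address = value;
}

/* Read a 32-bit value back from an absolute address. */
uint32_t read_u32_at(uintptr_t address)
{
    return *(volatile uint32_t *)address;
}
```

On a target where 0x100000 is valid, writable memory, write_u32_at(0x100000, 1234) performs the assignment asked for in the question. On a PC the helpers can be tried safely with the address of an ordinary variable cast to uintptr_t.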

So what should I do if I want the program to jump to an absolute address of 0x100000 for execution?

(*(void (*)())0x100000)();

First, 0x100000 must be cast to a function pointer, namely:

(void (*)())0x100000

Then call it. The dereference must sit inside the outer parentheses, because the function call operator binds more tightly than the unary *:

(*(void (*)())0x100000)();

It can be seen more intuitively with typedef:

typedef void (*voidFuncPtr)(void);
(*(voidFuncPtr)0x100000)();
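The same cast-and-call idea can be tried safely on a PC by jumping to the address of an ordinary function instead of 0x100000. This is a minimal sketch with hypothetical names; converting between a function pointer and uintptr_t is implementation-defined, although it works on common platforms.

```c
#include <stdint.h>

typedef void (*void_func_ptr)(void);

/* Treat an absolute address as a function taking no arguments and call it. */
void jump_to(uintptr_t address)
{
    ((void_func_ptr)address)();
}

/* Demo target so the helper can be exercised without real hardware. */
int g_call_count = 0;

void demo_target(void)
{
    g_call_count++;
}
```

Calling jump_to((uintptr_t)&demo_target) increments g_call_count once. On a real target, jump_to(0x100000) would branch to whatever code is located there; on Cortex-M parts the address used for such a call normally needs bit 0 set to indicate Thumb state.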

Contact the author

About the author

While preparing for the autumn recruitment season, the author relied on this material and finally received offers from more than ten companies, including oppo, Xiaomi, Zhaoyi Innovation, Allwinner Technology, and Hikvision. This material is now being shared, and I hope it will be helpful to everyone!

If you find good material on the Internet, or run into knowledge points in a written test or interview that this material does not cover, you can contact me and I will sort it out for you.

If anything in the material is wrong or inappropriate, you can submit issues to me on github. Due to limited energy, I will only carefully maintain two platforms: github and my WeChat official account. Errata will also be updated on github.

github repository

This document has seven parts in total: C/C++, data structure and algorithm analysis, Arm system and architecture, Linux driver development, operating system, network programming, and real written examination questions from famous enterprises. All content will be updated to the github repository synchronously.



To add me on WeChat, use the WeChat ID LinuxDriverDev.