Common encryption, encoding, and hashing | Java development in practice

In day-to-day development we rely on encryption to keep programs and communication secure: for example, using asymmetric encryption when calling an interface, or encrypting important strings in a program so that they cannot be read after decompilation. This article looks at the common encryption methods, along with the encoding and hashing techniques that are often confused with them.

Symmetric encryption

  • The data is transformed with a secret key and an encryption algorithm, and the meaningless-looking result is the ciphertext; applying the same secret key with the decryption algorithm reverses the ciphertext and yields the original data.

The sender encrypts the data with the encryption algorithm and sends the ciphertext; after receiving it, the recipient decrypts it with the decryption algorithm.

  • Symmetric encryption can encrypt any binary data.

  • Classic algorithms: DES, AES

    DES has been abandoned because its key is too short (56 bits). A short key is easy to crack: in a brute-force attack the attacker simply tries every possible key, and the shorter the key, the sooner every candidate has been tried.

    The mainstream choice today is AES. Both DES and AES are symmetric encryption algorithms.
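The encrypt-then-decrypt flow described above can be sketched with the JDK's built-in AES support. This is a minimal sketch, not a production recipe: the class name `AesSketch`, the choice of GCM mode, the 128-bit key, and the "IV followed by ciphertext" message layout are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper class, for illustration only.
final class AesSketch {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // length of the authentication tag

    // Generates a random 128-bit AES key (the shared secret key).
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the random IV followed by the ciphertext, so the receiver can decrypt.
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = new byte[iv.length + body.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the front of the message and decrypts the rest with the same key.
    static byte[] decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] message = encrypt(key, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, message), StandardCharsets.UTF_8));
    }
}
```

Both sides must hold the same `SecretKey`; getting that key to the other party safely is the problem the next section addresses.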

Asymmetric encryption

The data is encrypted with the public key to get the ciphertext, and the ciphertext is decrypted with the private key to get the original data.

The difference from symmetric encryption is that the encryption key and the decryption key are different. In the simplified example below, decryption even runs the same algorithm as encryption, just with a different key.

Example: suppose the two parties only ever exchange the 10 characters 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Encryption key: add 4 to each character. Decryption key: add 6 to each character. Whenever a sum exceeds 9, only the last digit is kept (the overflow is discarded).

Send message: 110

Encryption: 554

Decryption: 5 + 6 = 11, which overflows to 1; the same holds for the middle digit; for the last digit, 4 + 6 = 10, which overflows to 0. The final result is 110.

Of course, this toy scheme does not stand up to scrutiny (anyone who knows the encryption key 4 can work out the decryption key 6), but it illustrates the core principle of asymmetric encryption, the most important part of which is overflow. Without overflow, asymmetric encryption would not be possible.
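The toy scheme can be written out in a few lines. This is only a sketch of the example above, not a real cipher; the class and method names are made up for illustration. "Overflow" here is simply arithmetic modulo 10.

```java
// Hypothetical class illustrating the +4 / +6 example; not a real cipher.
final class ToyShiftSketch {
    // Adds the key to every digit and keeps only the last digit (the overflow is dropped).
    static String apply(String digits, int key) {
        StringBuilder out = new StringBuilder();
        for (char c : digits.toCharArray()) {
            int d = c - '0';
            if (d < 0 || d > 9) {
                throw new IllegalArgumentException("digits only: " + digits);
            }
            out.append((d + key) % 10);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String cipher = apply("110", 4);  // encryption key: +4
        String plain = apply(cipher, 6);  // decryption key: +6
        System.out.println(cipher + " -> " + plain);  // prints "554 -> 110"
    }
}
```

The same `apply` method is used in both directions; only the key differs, and the two keys cancel out because 4 + 6 = 10 overflows back to 0.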

  • Question: once A and B have each other's keys, communicating through asymmetric encryption is no problem. But how do they get the keys to each other in the first place?

    A has its own encryption key and decryption key, and so does B.

    So how is the key-transmission problem solved? The answer is to publish the encryption key openly.

    To see why:

    Before communicating, A gives B its encryption key, and B gives A its encryption key.

    When A sends a message to B, A encrypts it with B's encryption key and sends the ciphertext to B. After receiving it, B decrypts it with its own locally kept decryption key. But what if C intercepts both the encryption key and the ciphertext in transit: can C decrypt the message? No, because the encryption key and the decryption key are different, so even with both intercepted items C cannot decrypt it.

  • In the above question:

    Encryption key corresponds to: public key

    The decryption key corresponds to: private key

    The public key can be published freely, but the private key must never be disclosed to anyone or transmitted.

  • Extended use

    Can the public key undo what the private key has done?

    Yes. In an algorithm such as RSA, you can first process the data with the private key to get a ciphertext, and then process that ciphertext with the public key to get the original data back. This is called signing and verification.

    Why?

    Because the original data can be recovered by applying the public key first and then the private key, it can equally be recovered by applying the private key first and then the public key.

  • Signing and verification

    Since the private key and the public key can each undo the other, asymmetric encryption can also be used for digital signatures.

    Signing: process the original data with the private key to get the signature data.

    Verification: process the signature data with the public key and check that the result is the original data.

    For example: after I sign a file, I get the signature data. If someone else holds that signature data and it verifies successfully against my public key, the file must have been signed by me. Because only I know the private key, nobody else can produce data that the public key turns back into the original data.

  • Use encryption and signature at the same time

    Consider the same communication between A and B as above.

    A sends a message to B, and C may intercept it along the way. C cannot decrypt the original data, but C can use B's public key to encrypt a forged message and send it to B.

    For example, C sends "lend me 30,000 yuan". B decrypts it with the private key, sees a request to borrow money that appears to come from A, and sends the money. B has been defrauded.

    How to solve it: use encryption + signature

    When sending a message, A encrypts it with the other party's public key and then signs it with A's own private key.

    After receiving the message, B decrypts it with B's own private key to get the original data, and must also verify the signature with the other party's public key.

    This way, C can still intercept the data and can still encrypt with the public key, but C cannot sign with A's private key, so the problem is solved.

  • Classic algorithms: RSA, DSA

    RSA: can be used for encryption, decryption, and signing.

    DSA: designed specifically for signing. Its advantage is that signing is faster.

  • Advantage: keys can be exchanged over an insecure network

  • Disadvantage: the calculations are complex, so performance is much worse than symmetric encryption
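The public-key encryption flow described in this section (encrypt with the receiver's public key, decrypt with the receiver's private key) can be sketched with the JDK's RSA support. The class name `RsaEncryptSketch`, the 2048-bit key size, and the OAEP padding choice are assumptions made for this sketch rather than anything prescribed by the article.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical helper class, for illustration only.
final class RsaEncryptSketch {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // B generates a key pair once; the public half can be published.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone holding B's public key can encrypt a (short) message for B.
    static byte[] encrypt(PublicKey receiverPublicKey, byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, receiverPublicKey);
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the holder of the matching private key can decrypt.
    static byte[] decrypt(PrivateKey receiverPrivateKey, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, receiverPrivateKey);
            return cipher.doFinal(ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair b = newKeyPair();
        byte[] ciphertext = encrypt(b.getPublic(), "hello B".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(b.getPrivate(), ciphertext), StandardCharsets.UTF_8));
    }
}
```

RSA can only encrypt a message shorter than its key size, and it is slow, which matches the disadvantage listed above. A common design is therefore to encrypt the bulk data with a symmetric key and use RSA only to protect that key.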

Key and login password

  • Key

    The key is the value that matches the ciphertext: with it, the ciphertext can be decrypted.

    • Scenario: used for encryption and decryption
    • Purpose: to ensure that the stolen data will not be read by others
  • Login password (password)

    It works like a passphrase: it is a means of identity verification.

    • Scenario: used for identity verification when entering a website or logging in
    • Purpose: the data provider protects the user's data and grants access only when "you are you"

Base64

Converts binary data into a string made up of 64 characters: the 26 uppercase and 26 lowercase letters (52 in total), the digits 0 to 9, and + and /, for a total of 64 characters.
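As a quick illustration, the JDK ships a Base64 implementation in `java.util.Base64`. The sketch below (the class name `Base64Sketch` is made up) encodes a few bytes and decodes them again; note that nothing secret is involved, so anyone can reverse it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
final class Base64Sketch {
    // Binary data in, text made of the 64-character alphabet (plus '=' padding) out.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // The reverse direction needs no key, only the code table.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        String text = encode("Man".getBytes(StandardCharsets.UTF_8));
        System.out.println(text);  // prints "TWFu"
        System.out.println(new String(decode(text), StandardCharsets.UTF_8));  // prints "Man"
        System.out.println(encode(new byte[300]).length());  // prints 400
    }
}
```

Every 3 input bytes become 4 output characters, which is why 300 bytes turn into a 400-character string.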

  • What is binary data

    Non-text data is binary data. For example, pictures, music, movies, etc. are all binary data.

  • Uses

    • Give the original data the properties of a string, so that it can be placed in a URL, saved to a text file, or sent through ordinary chat software.
    • Turn a string that is readable by the human eye into one that is not, reducing the risk of casual peeking
  • "Transmitting pictures with Base64 encryption is more secure and efficient": really?

    • Base64 provides no security at all: the original data can be recovered directly from the code table
    • The efficiency claim is also false. The Base64 string is larger than the original data (about one third larger), so it is less efficient, not more.
  • Variant: Base58

    Base58 removes 6 characters from Base64: 0, O, I, l, + and /. The first four were removed because they are easy to confuse with one another, and + and / were removed because they break double-click selection when copying.

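  • Variant: URL encoding

    Percent-encoding uses % to encode reserved and non-ASCII characters in a URL. For Base64 text that has to travel in a URL, + and / are replaced with two other characters (in the standard URL-safe Base64 alphabet, - and _).

    For example: blog.csdn.net/hahaha, where the path "hahaha" was originally written with the Chinese characters 哈哈哈

    Open the URL above in a browser, then copy it from the address bar and paste it somewhere else, and you get: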

    blog.csdn.net/%E5%93%88%E...

    The browser displays Chinese characters in the address bar, but even though what you see is Chinese, it has actually been converted.

    If you type text with a space in the middle into the browser, the space may be replaced with +, and / already has a special role as the path separator. This is why + and / have to be replaced with other characters.

    Purpose: Eliminate ambiguity and avoid parsing errors
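Both ideas from the URL encoding item can be tried out with the JDK. In the sketch below, `UrlSafeSketch` is a made-up class name. `URLEncoder` implements the form-encoding flavor of percent-encoding (it is the one that turns a space into +), and `Base64.getUrlEncoder()` uses the URL-safe alphabet in which + and / become - and _.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
final class UrlSafeSketch {
    // Percent-encodes text as UTF-8 (form-encoding rules: a space becomes '+').
    static String percentEncode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);  // UTF-8 is always available
        }
    }

    static String base64Standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Same encoding, but with '-' and '_' in place of '+' and '/'.
    static String base64UrlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("\u54c8"));  // the character 哈, prints "%E5%93%88"
        System.out.println(percentEncode("a b/c"));   // prints "a+b%2Fc"
        byte[] data = {(byte) 0xfb, (byte) 0xff};
        System.out.println(base64Standard(data));     // prints "+/8="
        System.out.println(base64UrlSafe(data));      // prints "-_8="
    }
}
```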

Compression and decompression

  • Compression: store data in another way to reduce storage space

  • Decompression: restore the compressed data to its original form for use

  • Common compression algorithms: DEFLATE, JPEG, MP3

    • DEFLATE: general-purpose lossless compression; it is what ZIP uses, so many files can be archived and compressed at the same time
    • JPEG: compresses pictures
    • MP3: compresses sound
  • Does compression count as encoding?

    • What does encoding mean?

      There is no official definition of encoding. A working one: convert A into B in such a way that B can be converted back into A, with no information lost and none added during the conversion. That is encoding.

    Lossless compression and decompression fit this description exactly, so compression is also a form of encoding. (Lossy formats such as JPEG and MP3 discard some information, so they fit only in a looser sense.)
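To make the "convert A to B and back without losing anything" point concrete, here is a minimal sketch of a DEFLATE round trip using `java.util.zip`. The class name `DeflateSketch` and the 1024-byte buffer size are arbitrary choices for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, for illustration only.
final class DeflateSketch {
    // Stores the data in a smaller form.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Restores the compressed data to exactly its original form.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = new byte[100_000];  // highly repetitive input (all zeros)
        byte[] packed = compress(raw);
        System.out.println(raw.length + " -> " + packed.length + " -> " + decompress(packed).length);
    }
}
```

Repetitive input shrinks dramatically, while data that is already compressed or random may not shrink at all.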

Encoding and decoding of media data

What does encoding and decoding mean for pictures, audio, and video?

  • Picture encoding: converting image data into the format of JPG, PNG and other files

    In essence, the data is converted into the corresponding format. For example, if a white pixel is represented by ffffff and a picture is 64*64 pixels in size, the raw data is ffffff repeated 64 * 64 times.

    To encode this 64*64 picture, you could define a format such as:

    YS: ffffff=64*64;

    A format like this is easy to read and to decode.

    Of course, this is only a simple example and real algorithms do not work like this, but it is enough to get the idea across.

  • Picture decoding: parse the data in JPG, PNG and other files into standard image data.

  • Audio and video encoding and decoding: similar to the above. Compression can be lossy or lossless; with lossy compression the cost is simply lower sound quality. The same applies to pictures: for example, if WeChat emoticons are limited to 1 MB, the picture has to be compressed. An image compressed with a good algorithm becomes smaller but still looks similar to the original.
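The toy `ffffff=64*64` idea from the picture-encoding item is close to what is usually called run-length encoding. The sketch below is a hypothetical illustration of that idea only: the `color=count;` text format and the class name `RunLengthSketch` are invented here and are not how JPG or PNG actually work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical toy format "color=count;" for illustration; real image formats differ.
final class RunLengthSketch {
    // Collapses each run of identical pixels into one "color=count;" entry.
    static String encode(List<String> pixels) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < pixels.size()) {
            String color = pixels.get(i);
            int run = 1;
            while (i + run < pixels.size() && pixels.get(i + run).equals(color)) {
                run++;
            }
            sb.append(color).append('=').append(run).append(';');
            i += run;
        }
        return sb.toString();
    }

    // Expands every "color=count;" entry back into individual pixels.
    static List<String> decode(String encoded) {
        List<String> pixels = new ArrayList<>();
        if (encoded.isEmpty()) {
            return pixels;
        }
        for (String entry : encoded.split(";")) {
            String[] parts = entry.split("=");
            int count = Integer.parseInt(parts[1]);
            for (int k = 0; k < count; k++) {
                pixels.add(parts[0]);
            }
        }
        return pixels;
    }

    public static void main(String[] args) {
        List<String> white = Collections.nCopies(64 * 64, "ffffff");
        System.out.println(encode(white));  // prints "ffffff=4096;"
    }
}
```

A 64*64 all-white picture collapses to a single entry, and decoding that entry reproduces all 4096 pixels.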

Serialization

The process of converting an object (usually in memory) into a sequence of bytes

Java serialization mechanism

  • Purpose: to allow things in memory to be stored and transmitted

  • Is serialization encoding?

    Strictly speaking, no. Encoding converts format A into format B, and the two can be converted back and forth freely, whereas serialization is the process of turning in-memory objects into bytes. In practice the two are almost the same; it depends on how you look at it.
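As a rough illustration of the Java serialization mechanism mentioned above, the sketch below turns an in-memory object into bytes and back using `ObjectOutputStream` and `ObjectInputStream`. The `User` class and the helper names are hypothetical examples.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
final class SerializationSketch {
    // Hypothetical example type; any class that implements Serializable works.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Object in memory -> sequence of bytes that can be stored or transmitted.
    static byte[] toBytes(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Sequence of bytes -> a new object with the same field values.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = toBytes(new User("Ann", 30));
        User copy = (User) fromBytes(data);
        System.out.println(copy.name + " " + copy.age + " (" + data.length + " bytes)");
    }
}
```

The deserialized `User` is a separate object with the same contents, which is what makes storing and transmitting in-memory data possible. Bytes from an untrusted source should not be passed to `readObject`.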

Hash

Converts data of any size into data within a fixed (usually small) range. Its main roles are as a digest and a digital fingerprint. For example, given 200 people, a hash could number them 001, 002, and so on; each number corresponds to a person, and that number is called the hash value.
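A minimal sketch of computing such a fingerprint with the JDK's `MessageDigest`, using SHA-256 as the algorithm. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
final class DigestSketch {
    // Returns the SHA-256 hash of the data as 64 lowercase hex characters.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);  // SHA-256 is required on every JDK
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc".getBytes(StandardCharsets.UTF_8)));
        System.out.println(sha256Hex(new byte[1_000_000]).length());  // prints 64
    }
}
```

Whether the input is 3 bytes or a million bytes, the output always has the same length, and changing a single input byte produces a completely different value.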

  • Classic algorithms: MD5, SHA1, SHA256, etc.

    A hash is defined by its algorithm, which computes the hash value for a given input. The algorithm must also keep the collision rate very low (a collision is when two different inputs produce the same hash value) and, for security uses, be hard to crack.

    Anyone who has studied Java knows hashCode, and you can override the hashCode method to compute a custom hash value, for example:

    public int hashCode(String source) {
        return source.length();
    }
    // Passing in "哈哈哈" (three characters) gives the hash value 3
    // Passing in "哈哈" (two characters) gives the hash value 2

    This simple algorithm does produce a hash value.

    Guaranteeing a very low collision rate, however, requires more advanced algorithms.

  • Practical uses

    • Data integrity verification

      For example, the author of a file publishes a 5 GB file together with its hash value. After downloading the file, you compute its hash value yourself. If it matches the one published by the author, the file is intact; otherwise the file may have been tampered with or corrupted.

    • Quick lookup: hashCode and HashMap

      HashMap principle

      The role and difference between hashCode and equals

      Anyone who has learned Java knows that hashCode and equals must be overridden together. Why?

      HashMap stores data as an array plus linked lists. Data is saved under a key, and keys must be unique. When you call put with a key (you can follow this in the source code), HashMap computes the hash value of the key from its hashCode and derives an array index from it. If nothing is stored at that index yet, there is no collision and the key and value are saved there. If something is already stored there, a hash collision has occurred, so equals is used to check whether the keys are actually equal: if they are not, the new entry is added alongside the existing one; if they are, the key is already present and only its value is replaced. This is why the two methods have to be consistent: equal keys must produce the same hashCode, otherwise they land in different places and are never compared with equals.

    • Privacy protection

      • Plaintext: some websites save user information in plaintext, meaning the account password is stored as-is and is visible in the database. If the database is leaked, others get your account password directly. That is the drawback of plaintext storage.

      • Non-plaintext: what does that mean? The password is hashed before it is stored. At login, the submitted password is hashed and compared with the stored hash: if they match, the password is correct; otherwise login fails. If the database is leaked, the attacker only gets hash values, which cannot be used directly, so this is safer.

      • With salt

        Because hashes such as MD5 are irreversible, the original password cannot be computed from the hash. But an attacker can hash frequently used passwords in advance and then compare those hashes against the leaked ones.

        For that reason, each website uses its own salt. When a password is saved, the hash of the password combined with the salt is computed and stored. Then, even if the hash values are obtained, they cannot be matched against precomputed hashes of common passwords, because the stored hash is salted.

        Therefore, the salt of each website must be strictly protected and must not be leaked.

    • Is hashing encoding?

      No. Hashing is irreversible: it only extracts features of the input and generates a hash value from them.

    • Is hashing encryption? Is MD5 encryption?

      No. Encryption is reversible: encrypted data can be restored through computation. Neither hashing in general nor MD5 in particular meets this condition; you can call them "irreversible conversions".

    • Hash and asymmetric encryption

      When signing with asymmetric encryption, the private key is applied to the original data to get the signature data. But if the file is very large, the signature data is also very large, which is wasteful.

      So a hash algorithm is added to the signing process, as follows:

      Use the hash algorithm to extract a digest of the original data, giving the hash value. Then process the hash value with the private key (this private-key step is the signing) to get the signature.

      When verifying: process the signature with the public key to recover the hash value, then compute the hash value of the original data. If the two match, verification succeeds; otherwise the file has been tampered with.

      This way, no matter how large the original data is, the signature data stays very small.
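The hash-then-sign process can be sketched with the JDK's `Signature` class, whose "SHA256withRSA" algorithm hashes the data with SHA-256 and signs the hash with the RSA private key in one step. The class name `SignSketch` and the 2048-bit key size are assumptions made for this illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper class, for illustration only.
final class SignSketch {
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hashes the data with SHA-256, then signs the hash with the private key.
    static byte[] sign(PrivateKey signerPrivateKey, byte[] data) {
        try {
            Signature signature = Signature.getInstance("SHA256withRSA");
            signature.initSign(signerPrivateKey);
            signature.update(data);
            return signature.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the hash of the data and checks it against the signature using the public key.
    static boolean verify(PublicKey signerPublicKey, byte[] data, byte[] signatureBytes) {
        try {
            Signature signature = Signature.getInstance("SHA256withRSA");
            signature.initVerify(signerPublicKey);
            signature.update(data);
            return signature.verify(signatureBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair a = newKeyPair();
        byte[] large = new byte[1_000_000];
        byte[] sig = sign(a.getPrivate(), large);
        System.out.println(sig.length + " " + verify(a.getPublic(), large, sig));  // prints "256 true"
        large[0] = 1;  // tamper with the data
        System.out.println(verify(a.getPublic(), large, sig));  // prints "false"
    }
}
```

With a 2048-bit key the signature is 256 bytes whether the data is a few bytes or a megabyte, and any change to the data makes verification fail.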

Character set

A mapping from integers to the text symbols used in the real world

  • Varieties

    • ASCII: 128 characters, 1 byte

    • ISO-8859: extends ASCII, 1 byte

    • Unicode: more than 130,000 characters, multi-byte

      • UTF-8: an encoding of Unicode
      • UTF-16: an encoding of Unicode
    • GBK/GB2312/GB18030

      Chinese national standards, multi-byte, character set + encoding
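The byte counts in the list above can be checked with the charsets that every JDK provides through `StandardCharsets`. This is a small sketch with a made-up class name; GBK is left out because it is not among the charsets guaranteed to be present on every Java runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CharsetSketch {
    // Number of bytes the text occupies when encoded with the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latinA = "A";
        String eAcute = "\u00e9";   // the character é
        String chinese = "\u4e2d";  // the character 中

        System.out.println(byteLength(latinA, StandardCharsets.US_ASCII));     // prints 1
        System.out.println(byteLength(eAcute, StandardCharsets.ISO_8859_1));   // prints 1
        System.out.println(byteLength(eAcute, StandardCharsets.UTF_8));        // prints 2
        System.out.println(byteLength(chinese, StandardCharsets.UTF_8));       // prints 3
        System.out.println(byteLength(chinese, StandardCharsets.UTF_16BE));    // prints 2
    }
}
```

The same character takes a different number of bytes depending on the encoding, which is why text has to be decoded with the same charset it was encoded with.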


If this article is helpful to you, it is a great honor. If there are any errors or questions in the article, you are welcome to raise them!