In embedded network devices, TLS is the usual answer to the security problems of plaintext transmission. It encrypts traffic so that messages cannot be read in transit, authenticates the peer so that the device does not connect to a phishing site, and verifies data integrity so that tampering is detected. However, encrypting and decrypting every message brings considerable performance overhead and memory usage. As a result, some devices with limited on-chip resources are forced to communicate in plaintext and accept the risk of privacy leakage. The following approaches can improve this situation:

  1. Use a streamlined TLS library

Libraries such as OpenSSL and JSSE (Java Secure Socket Extension), which run on general-purpose computing platforms such as x86, were designed from the start as comprehensive toolkits. They provide a wide range of cryptographic functions, support multiple cryptographic standards and technologies, and keep up with the latest developments in security protocols and algorithms. This breadth gives them very good compatibility and forward-looking coverage, but it comes at the cost of resource consumption: because they must support a large number of algorithms and protocols, they usually take up more memory and computing resources than a resource-constrained device can afford. For such devices, a streamlined TLS library is a more reasonable choice.

mbedTLS: A security library designed for embedded systems. It supports the TLS and DTLS protocols and emphasizes small code size and efficiency. Unneeded components can be trimmed through compile-time configuration, which further reduces memory usage.

Official code repository: https://github.com/Mbed-TLS/mbedtls.git

wolfSSL: Also optimized for embedded systems, this library provides a lightweight TLS implementation. It supports current TLS versions, including TLS 1.3, and offers advanced features such as support for hardware accelerators.

Official code repository: https://github.com/wolfSSL/wolfssl.git

tinydtls: This is a very small DTLS library, especially suitable for devices with extremely limited resources. It focuses on the DTLS protocol and is suitable for IoT scenarios that require lightweight secure communication.

Official code repository: https://github.com/eclipse/tinydtls.git

Although these streamlined TLS libraries may not offer as wide a feature set as OpenSSL, they meet the security needs of specific application scenarios while keeping resource consumption low, making the most of limited hardware resources without giving up security.

  2. Optimize the storage and use of certificates

A single TLS certificate is typically about 1 to 2 KB, excluding extension data. If the entire certificate chain is stored, the total size grows with every additional certificate. The encoding also affects the size: PEM, for example, is the Base64 text form of the binary DER encoding and is therefore roughly one third larger. On resource-constrained devices, authenticating directly with the complete certificate chain may not be the best choice. To reduce memory usage, consider the following strategies:

Compression algorithm: The certificate can be compressed (for example with gzip or zlib) before it is stored in on-chip memory, and decompressed whenever it is needed. This reduces the storage footprint, but it adds CPU overhead at runtime for decompression and requires a RAM buffer for the decompressed data, so the trade-off has to be judged against the actual performance of the device.
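
The compress-then-store idea can be sketched as follows. This is a minimal illustration using Python's standard `zlib` module, not a device implementation: on a microcontroller the same steps would be written in C against whatever compression library the firmware ships. The function names, the 4096-byte limit, and the sample bytes are assumptions made for the example. Keys and signatures inside a real certificate are high-entropy data, so the achievable saving may be modest and is worth measuring before adopting this approach.

```python
import zlib

def pack_certificate(cert: bytes) -> bytes:
    """Compress a certificate once, before it is written to on-chip storage."""
    return zlib.compress(cert, 9)  # level 9: smallest output, slowest to compress

def unpack_certificate(stored: bytes, max_size: int = 4096) -> bytes:
    """Decompress a stored certificate into at most max_size bytes.

    Raises ValueError if the stream is truncated or still has unread input
    after max_size bytes of output, so a bad blob cannot exhaust RAM.
    """
    inflater = zlib.decompressobj()
    cert = inflater.decompress(stored, max_size)
    if inflater.unconsumed_tail or not inflater.eof:
        raise ValueError("certificate does not fit the buffer or is truncated")
    return cert

# Placeholder bytes standing in for a certificate; not real certificate data.
placeholder_cert = bytes(range(256)) * 4
stored_blob = pack_certificate(placeholder_cert)
restored = unpack_certificate(stored_blob)
```

Capping the output size is a deliberate choice: the decompressed certificate has to fit a known RAM budget, and an oversized or corrupted blob is rejected instead of being expanded without limit.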

Chunked loading: Larger certificates or certificate chains can be loaded into memory in pieces. The certificate is split into small blocks, sized to the limit of a single transfer packet, and the blocks are loaded one at a time. This avoids the memory overflow that can occur when a large amount of data is loaded at once.
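
A rough sketch of block-by-block loading is shown below. It assumes a hypothetical `read_at(offset, length)` callback that returns bytes from storage and an arbitrary 256-byte block size. As the consumer it computes a SHA-256 fingerprint, simply because a hash can be fed incrementally; any parser or transmitter that accepts data in pieces would take its place on a real device, where the loop would be written in C.

```python
import hashlib
from typing import Callable, Iterator

CHUNK_SIZE = 256  # assumed block size; pick it from the device's packet or buffer limit

def iter_chunks(read_at: Callable[[int, int], bytes], total_len: int,
                chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the stored certificate in blocks of at most chunk_size bytes."""
    offset = 0
    while offset < total_len:
        length = min(chunk_size, total_len - offset)
        yield read_at(offset, length)
        offset += length

def fingerprint_in_chunks(read_at: Callable[[int, int], bytes], total_len: int,
                          chunk_size: int = CHUNK_SIZE) -> str:
    """SHA-256 over the certificate, consuming one block at a time."""
    digest = hashlib.sha256()
    for block in iter_chunks(read_at, total_len, chunk_size):
        digest.update(block)
    return digest.hexdigest()

# Placeholder storage standing in for on-chip memory; not real certificate data.
storage = bytes(range(250)) * 4  # 1000 bytes

def read_storage(offset: int, length: int) -> bytes:
    return storage[offset:offset + length]

block_sizes = [len(b) for b in iter_chunks(read_storage, len(storage))]
fingerprint = fingerprint_in_chunks(read_storage, len(storage))
```

The last block is shorter than the others whenever the total length is not a multiple of the block size, which is why the length is recomputed on every iteration.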

Read-only access: If the application never modifies the certificate content, it can use a pointer to the certificate's location in on-chip memory and access the data in read-only mode. This removes the step of copying the certificate into RAM entirely and greatly reduces runtime memory overhead.
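
In C firmware this amounts to handing the TLS stack a `const` pointer into the stored certificate; Mbed TLS, for instance, documents `mbedtls_x509_crt_parse_der_nocopy` for parsing a DER certificate without copying the buffer, though availability should be checked for the version in use. The idea itself can be illustrated with Python's `memoryview`, which gives a read-only window onto existing bytes without duplicating them. The store contents, offsets, and the `cert_view` helper below are assumptions made for the example.

```python
def cert_view(store: bytes, offset: int, length: int) -> memoryview:
    """Return a read-only, zero-copy window onto one certificate in the store."""
    if offset < 0 or length < 0 or offset + length > len(store):
        raise ValueError("certificate range lies outside the store")
    return memoryview(store)[offset:offset + length]

# Placeholder bytes standing in for certificates held in on-chip memory.
CERT_STORE = bytes(range(64)) * 8  # 512 bytes

view = cert_view(CERT_STORE, offset=128, length=64)
first_byte = view[0]                     # reads go straight to the underlying store
shares_storage = view.obj is CERT_STORE  # the view references the store, not a copy
```

Because the view is read-only, an accidental write raises an error instead of silently altering the stored certificate.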

These measures help TLS certificates to be managed and used effectively even in memory-constrained environments, without sacrificing system security or performance. When designing and implementing them, weigh the costs and benefits of each approach carefully against the specific application requirements and technical constraints.

  3. Hardware acceleration

As device security requirements increase, encryption algorithms keep evolving. These algorithms involve a large number of mathematical operations. If the operations depend entirely on a general-purpose CPU, they consume a great deal of processor time and can become a performance bottleneck, especially when many security requests are processed concurrently. Relying on the CPU alone for cryptographic computation is therefore inefficient. To improve processing efficiency and reduce power consumption, chip designers integrate dedicated hardware accelerators into the chip. Each accelerator is optimized for a specific type of cryptographic operation and completes it much faster and more efficiently than software can.

  • AES accelerator: performs AES symmetric encryption and decryption. AES is the most widely used block cipher standard, and a hardware accelerator significantly increases encryption and decryption speed, which is particularly important for applications that need real-time encryption, such as wireless communication and video streaming.
  • HASH accelerator: speeds up the calculation of hash functions such as SHA-256 (and legacy algorithms such as MD5). Hash functions generate message digests, ensure data integrity, and play a key role in digital signatures and authentication. A dedicated accelerator greatly increases hashing speed and thus speeds up the whole security protocol.
  • RSA accelerator: speeds up the large-integer operations of the RSA public-key algorithm. RSA is widely used in the SSL/TLS handshake, email encryption, and other fields. Because its large-integer arithmetic is computationally expensive, a dedicated accelerator significantly reduces the time needed for encryption and decryption.

In addition to the accelerators mentioned above, there are other types of dedicated hardware, such as ECC accelerators for elliptic curve cryptography and random number generators (RNG), which produce the high-quality random numbers that encryption algorithms crucially depend on.
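
From the software side, using an accelerator usually comes down to routing an operation to the hardware engine when one is present and falling back to the CPU otherwise. The sketch below shows that dispatch pattern for SHA-256 only. `CryptoBackend` and the `hw_sha256` hook are hypothetical names, and the "hardware" function is a stand-in that reuses `hashlib`, since no real engine exists in this example; on an actual chip the hook would call the vendor's driver from C.

```python
import hashlib
from typing import Callable, Optional

HashFn = Callable[[bytes], bytes]

class CryptoBackend:
    """Route SHA-256 to a hardware engine when available, else to software."""

    def __init__(self, hw_sha256: Optional[HashFn] = None) -> None:
        # hw_sha256 is a hypothetical driver hook; None means no accelerator.
        self._hw_sha256 = hw_sha256

    def sha256(self, data: bytes) -> bytes:
        if self._hw_sha256 is not None:
            return self._hw_sha256(data)       # offload to the accelerator
        return hashlib.sha256(data).digest()   # software fallback on the CPU

# Stand-in for a real driver: it only records the call and reuses hashlib,
# because no hash engine is available in this sketch.
offloaded = []

def fake_hw_sha256(data: bytes) -> bytes:
    offloaded.append(len(data))
    return hashlib.sha256(data).digest()

software = CryptoBackend()
accelerated = CryptoBackend(hw_sha256=fake_hw_sha256)
same_digest = software.sha256(b"hello") == accelerated.sha256(b"hello")
```

Keeping the fallback path means the same firmware can run on chip variants with and without the accelerator, and both paths must produce identical results.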

Integrating these dedicated hardware blocks into the chip greatly speeds up cryptographic computation and also lowers the power consumption of the overall system, because hardware accelerators are generally more energy efficient than software implementations. They also offload the main processor, leaving it free for other tasks and improving overall system responsiveness and stability. In summary, hardware accelerators are an effective answer to growing security and performance requirements.