Tuning Tor on Linux - High-Load Relays - Performance Optimization

I currently run a few high-load Tor relays and spend some of my time volunteering as a Tor relay operator. I'd like to share some of my notes here about optimizing Tor and Linux for better performance and blocking certain types of malicious traffic.

The heavy load on a Tor relay process has several causes, all rooted in the high network load and the computationally intensive tasks the process has to handle. In detail, the following happens in a Linux process that serves as a Tor relay:

1. **Network traffic handling**:
   - **Connection establishment and management**: Tor relays are constantly busy establishing and managing connections to other relays and clients. This requires the handling of TCP connections, which consumes CPU resources.
   - **Encryption and decryption**: Every data stream that passes through the Tor network is encrypted multiple times. The relay must therefore decrypt incoming data and encrypt outgoing data. These cryptographic operations are very computationally intensive.
   - **Data forwarding**: Relays must receive data packets and forward them to the next node in the network. This requires fast read and write operations on the network interfaces.

2. **Cryptographic tasks**:
   - **RSA encryption/decryption**: When establishing connections, Tor uses RSA (and, in the newer ntor handshake, Curve25519) for key agreement. These public-key operations are particularly CPU-intensive.
   - **AES encryption/decryption**: Tor uses AES to encrypt data streams. AES is efficient, but the computing requirements add up with high data volumes.
   - **HMAC calculations**: HMACs (Hash-based Message Authentication Codes) are used to ensure data integrity and authenticity, which also require computing power.

3. **Database and memory management**:
   - **Routing tables and network status**: Tor relays need to manage and regularly update large routing tables and network status information. This information is usually stored in memory structures that need to be searched and updated efficiently.
   - **Caching**: Relays keep frequently used data in memory to improve performance. However, this requires efficient memory management and increases memory requirements.

4. **Process and thread management**:
   - **Multithreading**: To handle the parallel connections and tasks, Tor relays often use multiple threads. This leads to additional CPU load due to the management and synchronization of the threads.
   - **Event loop handling**: Tor uses an event-driven programming model in which many small tasks are processed within an event loop. Managing these events and their scheduling also causes a certain load.

In summary, the main reasons for the high load of a Tor relay process are the intensive use of the CPU for cryptographic operations, the management of many concurrent network connections and the constant updating and management of routing and status information. These tasks are both computationally and memory intensive and can put a heavy strain on system resources.
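
Because the cryptographic operations dominate CPU usage, it is worth checking whether the CPU offers hardware AES acceleration before tuning anything else. A minimal check, assuming an x86 machine with a standard /proc filesystem and OpenSSL installed:

# Look for the AES-NI flag; no output means no hardware AES support
grep -m1 -o aes /proc/cpuinfo

# Benchmark the AES-CTR cipher that Tor's relay crypto relies on
openssl speed -evp aes-128-ctr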

The file /etc/sysctl.conf is used to tune Linux kernel parameters that affect performance. Below are the values I have used on my relays for years; I will go into the details of the individual values. The settings provided here were suggested by me and other Tor relay operators and have been adjusted over time as I monitored performance and saw improvements.

These are my successfully tested settings without the network optimizations. The network optimizations will follow in another post.

net.core.somaxconn=20480
vm.min_free_kbytes=65536
fs.file-max=64000
kernel.sysrq=0
vm.vfs_cache_pressure=150
vm.swappiness=10
vm.dirty_ratio=10
vm.dirty_background_ratio=5
vm.dirty_expire_centisecs=2000
vm.dirty_writeback_centisecs=800
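
To activate these values without a reboot, the file can simply be re-read with sysctl; the following assumes the values were placed in /etc/sysctl.conf as shown above:

# Apply the values from /etc/sysctl.conf to the running kernel
sudo sysctl -p /etc/sysctl.conf

# Verify a single parameter afterwards
sysctl net.core.somaxconn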

The file `/etc/sysctl.conf` is used to configure kernel parameters at runtime. Here is a short overview of the settings listed above:

1. **`net.core.somaxconn=20480`**:
   - This parameter sets the maximum queue length for pending connections in a socket in listen state. The default value is often 128. A higher value such as 20480 can help to handle a larger number of incoming connections, which is useful for servers with high load.

2. **`vm.min_free_kbytes=65536`**:
   - This parameter specifies the minimum amount of free memory in kilobytes that the system should always have available. This helps to prevent the system from stalling due to a lack of memory. A higher value can improve system performance, especially under high load.

3. **`fs.file-max=64000`**:
   - This setting specifies the maximum number of file descriptors that the system can open in total. A higher value allows more files to be opened simultaneously, which is important for systems with many simultaneous file accesses.

4. **`kernel.sysrq=0`**:
   - This setting disables the SysRq key function, which can send special commands directly to the kernel. This can be disabled for security reasons to prevent unauthorized access to kernel functions.

5. **`vm.vfs_cache_pressure=150`**:
   - This parameter controls how aggressively the system reclaims memory for the VFS (Virtual File System) cache. A value of 150 means that the system will be more aggressive in flushing the VFS cache, which may free up more memory for other applications, but could also affect the performance of file system operations.

6. **`vm.swappiness=10`**:
   - This parameter determines how often the system attempts to swap out memory to the swap area. A lower value (such as 10) means that the system swaps less aggressively, which can improve performance if enough RAM is available.

7. **`vm.dirty_ratio=10`**:
   - This parameter specifies the percentage of total memory that is allowed to be "dirty" (i.e. memory that has been modified but not yet written to disk) before the system begins to write the data to disk. A value of 10% means that the system will be more aggressive in writing dirty pages.

8. **`vm.dirty_background_ratio=5`**:
   - This parameter specifies the percentage of total memory that is allowed to be "dirty" before the background write process starts to write the dirty pages to disk. A value of 5% ensures that the write process runs more frequently in the background.

9. **`vm.dirty_expire_centisecs=2000`**:
   - This setting specifies, in hundredths of a second, how long data in memory may remain "dirty" before it must be written to disk. A value of 2000 means that data that has been dirty for longer than 20 seconds is forcibly written to the hard disk.

10. **`vm.dirty_writeback_centisecs=800`**:
    - This setting determines how often (in hundredths of a second) the background process that writes dirty pages to disk is executed. A value of 800 means that this process is executed every 8 seconds.

In summary, these settings help to optimize system performance and stability, especially for servers with high loads or special requirements.

The detailed information is shown below.

net.core.somaxconn=20480

The setting net.core.somaxconn=20480 specifies that up to 20,480 incoming connections can be held in the queue of a listening socket. This is particularly useful for servers under heavy load, as it helps to avoid connection losses and improve system performance when network traffic is high. However, this configuration requires that the system has sufficient resources to efficiently manage this high number of connections. By default, net.core.somaxconn is set to a value around 128 or 256 in many Linux distributions. The value 20480 is therefore a significant increase, which is configured for special high-load scenarios.

This parameter is a configuration in the Linux kernel that specifies the maximum number of pending connections that can be queued for a listening socket. It applies to connection-oriented (listening) sockets such as TCP and manages the listen backlog size, which is the number of incoming connections waiting to be accepted by the application.

When a socket is in the listening state, it can accept incoming connections. If the application does not accept these connections quickly enough, they can start to accumulate in a queue. The parameter `net.core.somaxconn` sets the maximum size of this queue, also known as the listen backlog.

The default value for `net.core.somaxconn` depends on the kernel version and distribution: historically it was 128, and since kernel 5.4 it is 4096. By increasing `net.core.somaxconn`, you can allow the kernel to queue more pending connections, which is useful for handling high connection rates or bursty traffic.

It's important to note that `net.core.somaxconn` does not limit the total number of connections; it only controls the number of connections that can be queued in the backlog. The actual number of connections that can be managed depends on other factors, such as the application's ability to accept connections and the available system resources, like file descriptors and memory.

**Recommendations and Considerations:**
- Increase this value if your system handles a high number of incoming connections. For example, a high-load web server might benefit from a value of 1024 or higher.
- Verification: If your server experiences connection resets or network instability with a high rate of incoming connections, consider increasing this value.
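
One way to check the effective value and to watch the accept queue of a running relay; the ORPort 9001 below is only an example and should be replaced by the port your relay actually listens on:

# Current kernel limit for the listen backlog
cat /proc/sys/net/core/somaxconn

# For listening TCP sockets, ss shows the current accept queue (Recv-Q) and its limit (Send-Q)
ss -lnt '( sport = :9001 )'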

vm.swappiness=10

The setting vm.swappiness=10 signals the Linux kernel to use swap space minimally and instead favor physical RAM as much as possible. This can improve system performance and responsiveness, especially in environments that require fast memory access times. It is a suitable setting for systems with sufficient RAM and for use cases where fast response times are important. Caution: with very low swappiness, there is a risk that the system reaches the limits of physical memory more quickly and may not be able to satisfy new memory requests before swap is used.

The `vm.swappiness` parameter in the Linux kernel determines the priority given to swapping out runtime memory compared to dropping pages from the system page cache.

The swappiness value ranges from 0 to 100. A lower value indicates that the kernel will try to avoid swapping as much as possible, while a higher value means the kernel will swap memory pages more readily.

The kernel's behavior is influenced by the current system load and the specific implementation, but generally:

- `vm.swappiness = 0`: The kernel will minimize swapping processes out of physical memory.
- `vm.swappiness = 100`: The kernel will aggressively swap processes out of physical memory to the swap disk.

For instance, a system with a swappiness value of 60 will swap pages more frequently than one with a value of 10. Lowering the swappiness value delays the use of swap space, which typically enhances system responsiveness.

It's important to note that setting `vm.swappiness = 0` doesn't completely disable swapping. The system will still use swap space when necessary, such as when it runs out of memory.
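
A quick way to inspect the current value, change it at runtime and see how much swap is actually in use (the permanent setting still belongs in /etc/sysctl.conf):

# Read the current swappiness
cat /proc/sys/vm/swappiness

# Set it to 10 for the running system (not persistent across reboots)
sudo sysctl -w vm.swappiness=10

# Show RAM and swap usage in megabytes
free -m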

vm.min_free_kbytes=65536

The setting vm.min_free_kbytes=65536 ensures that the Linux kernel always keeps at least 64 MB of RAM free. This serves as a protective buffer to improve system performance and stability, especially under high load or with intensive memory requirements. It helps to avoid bottlenecks and ensures that the system can respond quickly to memory requests.

The setting `vm.min_free_kbytes` in the Linux kernel determines how much free memory (in kilobytes) the system should always keep available. Here are some important aspects and the influence of this setting:

Free memory: vm.min_free_kbytes specifies the minimum amount of free memory the system will try to keep available at all times. If the free memory falls below this value, the kernel takes measures to reclaim memory, such as swapping pages out or dropping pages from the page cache.

Memory management: A higher setting of `vm.min_free_kbytes` can help to avoid memory management bottlenecks, especially under high load or on systems with many parallel processes. It ensures that there is always a certain buffer of free memory to quickly fulfill memory requests.

System stability: By reserving more free memory, the system can become more stable as critical processes are less likely to block or crash due to lack of memory.

Performance: A setting that is too low can have a negative impact on system performance, as the kernel is forced to urgently free memory more often. This can lead to a higher utilization of the swap memory and an increased CPU load. A setting that is too high, on the other hand, can lead to more memory remaining unused, which reduces efficiency.

Recommendation:
- For systems with high memory loads or those that require high reliability, it may be useful to increase `vm.min_free_kbytes`.
- The exact setting should be adjusted based on specific requirements and observation of system performance.

Summary:
The `vm.min_free_kbytes` setting is an important parameter to control memory management in the Linux system, which helps to ensure a minimum amount of free memory to improve system stability and performance under load conditions.
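
Whether the reserve is effective can be judged by comparing the configured value with the memory statistics the kernel reports:

# Configured minimum reserve (in kB)
cat /proc/sys/vm/min_free_kbytes

# Current free and available memory
grep -E 'MemFree|MemAvailable' /proc/meminfo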

fs.file-max=64000

The setting fs.file-max=64000 specifies that the Linux system can keep a maximum of 64,000 file descriptors open at the same time. This helps to increase the capacity of the system to support many simultaneous connections or open files. This configuration is particularly useful for high load servers and applications with intensive I/O requirements. However, it is important to optimize and regularly monitor this setting based on the specific requirements and resources of your system.

- File descriptors: A file descriptor is an abstract key that represents an open file or an open socket. Every file, every network socket and every other I/O object that is opened is assigned a unique file descriptor.

- Maximum number of open files: The value fs.file-max=64000 specifies that the system can keep a maximum of 64,000 file descriptors open at any one time. This is an upper limit that prevents too many files or sockets from being open at the same time, which could lead to system instability.

- Impact on system performance: Increased capacity: a higher maximum number of open files can increase the capacity of the system to support a larger number of simultaneously running applications and processes that need to open files and sockets.

- System resources: However, increasing this limit can also result in more system resources (e.g. memory) being required to manage these file descriptors.

Usage scenarios:
High-load servers: Servers that need to manage many simultaneous connections or many simultaneously opened files, such as web servers, database servers or file servers.
Applications with a high I/O load: Applications that process a large number of files or network connections simultaneously benefit from a higher number of permitted file descriptors.

Limitations and recommendations:
System optimization: It is important to optimize the number of maximum file descriptors based on the requirements of your applications and available system resources. Too high a setting could lead to unnecessary resource consumption.

Monitoring: It is recommended to monitor the usage of file descriptors to ensure that the system does not regularly reach the set limits, which could be an indicator that the limit should be increased further.
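
The kernel exposes the current file-descriptor usage, so the limit can be monitored as recommended. Note that the Tor process is additionally bound by its per-process limit (ulimit -n, or LimitNOFILE in a systemd unit); the check below assumes a single tor process is running:

# Allocated, unused and maximum file descriptors system-wide
cat /proc/sys/fs/file-nr

# Per-process limit of the running tor instance
grep 'open files' /proc/$(pidof -s tor)/limits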

kernel.sysrq=0

The SysRq key is a special key combination that communicates directly with the kernel and can execute various emergency commands. These commands can be used to restart the system, terminate processes, display memory information and perform other critical actions.

With the setting kernel.sysrq=0, all SysRq functions are deactivated: no SysRq commands can be executed, even if the corresponding key combination is used. This can increase system security, as it minimizes the risk of misuse of the SysRq functions. However, it also means that the useful diagnostic and emergency commands are not available, which is a trade-off to keep in mind, particularly in locked-down production environments.
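
To confirm that the key is actually disabled on the running system:

# 0 = SysRq completely disabled
cat /proc/sys/kernel/sysrq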

vm.vfs_cache_pressure=150

The vm.vfs_cache_pressure=150 setting in the Linux kernel affects how aggressively the system manages the cache of the Virtual File System (VFS) data structures. Here are the details and the impact of this specific setting:

VFS Cache: The Virtual File System (VFS) is an abstraction layer that helps the kernel deal with different file systems. The VFS cache stores metadata and other file system information to speed up access to files and directories.

Parameter:
vm.vfs_cache_pressure: This parameter controls how aggressively the kernel reduces the VFS cache compared to other caches (e.g. the page cache). A higher value means that the kernel is more aggressive in reducing the VFS cache, while a lower value means that the VFS cache is more likely to be retained compared to other caches.

Meaning of the value 150:
Increased aggressiveness: with the setting vm.vfs_cache_pressure=150, the VFS cache is reduced more aggressively than with the default setting (which is normally 100). This means that the kernel frees the VFS cache faster to use the memory for other purposes.

Impact on system performance:
Memory freeing: a higher value (such as 150) causes the kernel to free memory space in the VFS cache faster. This can be useful if the available RAM is limited and other applications or caches require more memory.

Reduced cache efficiency: The more aggressive release of the VFS cache can reduce the efficiency of file access, as less file system metadata is kept in the cache. This can lead to more frequent accesses to the file system, which can affect the performance of file-intensive applications.

System balance: A value of 150 indicates a prioritization of memory for other applications and caches at the expense of a smaller VFS cache.

Application scenarios:
Systems with limited RAM: On systems with limited RAM, a higher vfs_cache_pressure value may be useful to ensure that sufficient memory is available for other caches and applications.
Applications with low file access: Systems that rely less heavily on file system access can use a higher value to free up more memory for other purposes.
Servers with many concurrent processes: Servers running many concurrent processes with high memory requirements could benefit from more aggressive memory freeing.

Summary:
The vm.vfs_cache_pressure=150 setting specifies that the Linux kernel frees the VFS cache more aggressively to use memory for other purposes. This can be useful to optimize memory usage in systems with limited RAM, but comes at the expense of file access cache efficiency. This value prioritizes the release of VFS cache memory, which can be beneficial in certain scenarios, especially when memory is scarce.
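
Whether a value of 150 pays off can be judged by watching how large the dentry and inode caches actually become under load; one way to look at this (slabtop needs root):

# Current cache pressure setting
cat /proc/sys/vm/vfs_cache_pressure

# Size of the dentry and inode slab caches, printed once
sudo slabtop -o | grep -E 'dentry|inode_cache'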

vm.dirty_ratio=10


The default value for the vm.dirty_ratio parameter in the Linux kernel is usually 20, but this value can vary slightly depending on the Linux distribution and version.

With the default value of 20%, this means that the kernel starts to write data from memory to the hard disk as soon as 20% of the available RAM is occupied by dirty pages.

The setting vm.dirty_ratio=10 in the Linux kernel influences the behavior of the system with regard to the caching of data to be written to the hard disk (so-called "dirty pages"). Here are the details and the influence of this specific setting:


Dirty pages: dirty pages are memory pages that have been modified and whose changes have not yet been written to disk. These pages are cached in memory to optimize write operations and improve system performance.

Parameter:
vm.dirty_ratio: This parameter specifies the percentage of the total available memory that may be occupied by dirty pages before the kernel begins to forcibly write this data to the hard disk.

Meaning of the value 10:
10% of RAM: The setting vm.dirty_ratio=10 specifies that up to 10% of the total available RAM may be occupied by dirty pages before the kernel begins to write the data to the hard disk.

Influence on system performance:
Earlier write operations:

A lower value for vm.dirty_ratio means that the system will start writing data to disk earlier. This can lead to more frequent but smaller write operations.

Reduced risk of data loss: A lower value can reduce the risk of data loss in the event of a sudden system crash or power failure, as less unsaved data is kept in memory.

Memory availability: Less memory is reserved for dirty pages, leaving more memory available for other applications. This can be particularly beneficial for systems with limited RAM.

Application scenarios:

Systems with limited RAM: Systems with limited available RAM can benefit from a lower vm.dirty_ratio value, as more memory is available for other applications and caches.

Systems with higher risk of power outages: Systems that are more prone to sudden power outages or crashes may benefit from a lower value as the risk of data loss is reduced.

Applications with lower write requirements: Systems that do not perform intensive writes can safely use a lower value to maximize storage availability.

Summary:
The vm.dirty_ratio=10 setting specifies that up to 10% of the total available RAM may be occupied by dirty pages before the kernel starts writing this data to disk. This setting causes the kernel to start writing earlier, which reduces the risk of data loss and makes more memory available for other applications. It is particularly useful for systems with limited RAM or increased risk of power failures.
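
How many dirty pages actually accumulate can be read from /proc/meminfo; watching this while the relay is under load shows whether the 10% threshold is ever approached:

# Dirty and writeback pages right now (values in kB)
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Refresh the view every two seconds while the relay is busy
watch -n 2 "grep -E '^(Dirty|Writeback):' /proc/meminfo"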

vm.dirty_background_ratio=5

The default value for the parameter vm.dirty_background_ratio in the Linux kernel is usually 10, which means that the kernel starts writing dirty pages to the hard disk in the background as soon as 10% of the total available RAM is occupied by dirty pages.

Summary:

Default value: 10
Meaning: The kernel starts background writes for dirty pages as soon as 10% of the available RAM is occupied by these pages.

This default value ensures that writes are distributed evenly and memory is used efficiently, improving system responsiveness and reducing the risk of sudden load spikes due to large writes.

The setting vm.dirty_background_ratio=5 in the Linux kernel influences the behavior of the system with regard to caching and background writing of data that is to be written to the hard disk (so-called "dirty pages"). Here are the details and the influence of this specific setting:

Dirty pages: dirty pages are memory pages that have been modified and whose changes have not yet been written to disk. These pages are cached in memory to optimize write operations and improve system performance.

Parameter:
vm.dirty_background_ratio: This parameter specifies the percentage of the total available memory that may be occupied by dirty pages before the kernel starts writing this data to the hard disk in the background.

Meaning of the value 5:
5% of RAM: The setting vm.dirty_background_ratio=5 specifies that the kernel starts writing dirty pages to the hard disk in the background as soon as 5% of the total available RAM is occupied by dirty pages.

Influence on system performance:
Earlier background writing: A low value for vm.dirty_background_ratio means that the system starts writing dirty pages to the hard disk in the background earlier. This prevents too many dirty pages from accumulating in memory and ensures a more even distribution of the write load.

System responsiveness: Early background writing of dirty pages prevents sudden, large writes from occurring, which can improve system responsiveness.

Memory availability: Less memory is reserved for dirty pages, leaving more memory available for other applications. This can be particularly beneficial for systems with limited RAM.

Interaction with vm.dirty_ratio:
vm.dirty_ratio: This parameter specifies the percentage of memory at which the kernel is forced to write all dirty pages to disk to free up memory.
Combination: If vm.dirty_background_ratio is set to 5% and vm.dirty_ratio is left at its default of 20%, the kernel starts writing dirty pages in the background as soon as 5% of the RAM is occupied. When 20% is reached, the write operations are intensified to ensure that the memory is not excessively occupied.

Usage scenarios:
Systems with limited RAM: Systems with less available RAM can benefit from a lower vm.dirty_background_ratio value as it ensures that memory is used efficiently and dirty page writes are started early.
General system optimization: Systems that require an even write load and better responsiveness will benefit from a lower value as this helps to avoid sudden load spikes due to large writes.

Summary:
The setting vm.dirty_background_ratio=5 specifies that the Linux kernel starts writing dirty pages to disk in the background as soon as 5% of the total available RAM is occupied by dirty pages. This setting helps to distribute the write load more evenly and improves system responsiveness by ensuring that fewer dirty pages are kept in memory and that memory is used more efficiently.

The combination of vm.dirty_ratio=10 and vm.dirty_background_ratio=5 in my configuration ensures efficient memory management and improves system responsiveness. The kernel starts writing dirty pages in the background early (at 5% occupancy) and intensifies writing when 10% of RAM is occupied. This ensures that memory is used efficiently and that sufficient RAM remains available for other applications, while reducing the risk of data loss.
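
Both thresholds can be checked together; note that the byte-based counterparts (vm.dirty_bytes and vm.dirty_background_bytes) take precedence over the ratios whenever they are set to a non-zero value:

# Show the ratio-based thresholds ...
sysctl vm.dirty_background_ratio vm.dirty_ratio

# ... and the byte-based counterparts that override them when non-zero
sysctl vm.dirty_background_bytes vm.dirty_bytes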

vm.dirty_expire_centisecs=2000

The default value for the parameter vm.dirty_expire_centisecs in the Linux kernel is usually 3000 centiseconds, which corresponds to 30 seconds.

This value means that dirty pages (changed memory pages that have not yet been written to the hard disk) are considered "expired" and should be written to the hard disk if they have been held in memory for 30 seconds.

Summary:

Default value: 3000 centiseconds (30 seconds)
Meaning: Dirty pages are considered expired after 30 seconds and are selected for writing to the hard disk.

This default value represents a compromise that enables a balance between system performance and data security. By keeping dirty pages in memory for up to 30 seconds, write operations can be optimized and system performance improved while limiting the risk of data loss in the event of sudden system crashes.

The setting `vm.dirty_expire_centisecs=2000` in the Linux kernel determines how long dirty pages (modified memory pages that have not yet been written to disk) are allowed to remain in memory before they are considered "expired" and selected by the kernel for writing to disk. Here are the details and impact of this specific setting:

Dirty Pages:

Dirty pages are memory pages that have been modified but whose changes have not yet been written to disk. They are cached in memory to optimize write operations and improve system performance.

Parameter: `vm.dirty_expire_centisecs`: This parameter specifies the maximum time in hundredths of a second (1/100th of a second, or centiseconds) that dirty pages may remain in memory before they are considered "expired" and written to disk.

Meaning of the value 2000: 2000 centiseconds: The setting `vm.dirty_expire_centisecs=2000` specifies that dirty pages are considered expired after 2000 centiseconds (20 seconds) at the latest and are written to the hard disk.

Influence on system performance:
Earlier write operations: A lower value (compared to the default value of 3000 centiseconds) means that dirty pages are considered "expired" sooner and are written to the hard disk earlier. This can increase data integrity, as changes reach the disk faster.

Reduced risk of data loss: Writing dirty pages earlier reduces the risk of data loss in the event of a sudden system crash or power failure.
Increased writes: However, this could lead to more frequent writes, which could affect performance, especially on systems with high write loads.

Deployment scenarios:
 Data security: In environments where data security and consistency are important (e.g. database servers), a lower value could be used to ensure that changes are written to disk quickly.
 Systems with unstable power supplies: Systems that are more prone to sudden power outages or crashes may benefit from a lower value as the risk of data loss is reduced.

Summary:
The setting `vm.dirty_expire_centisecs=2000` specifies that dirty pages are considered expired and written to disk after 2000 centiseconds (20 seconds) at the latest. This increases the frequency of write operations and improves data security, as changes are persisted more quickly. This setting is particularly useful in environments that require a high level of data security or where the risk of data loss must be minimized.
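
The value can be read and changed at runtime; the persistent setting belongs in /etc/sysctl.conf as shown at the beginning:

# Read the current expiry time (in centiseconds)
cat /proc/sys/vm/dirty_expire_centisecs

# Apply 2000 (20 seconds) to the running kernel
sudo sysctl -w vm.dirty_expire_centisecs=2000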

vm.dirty_writeback_centisecs=800

The default value for the parameter vm.dirty_writeback_centisecs in the Linux kernel is usually 500 centiseconds, which corresponds to 5 seconds.

This parameter specifies the time interval at which the kernel periodically wakes the background writeback threads (historically the pdflush daemon) to write dirty pages (changed memory pages that have not yet been written to the hard disk) to the hard disk.

Summary:
Default value: 500 centiseconds (5 seconds)
Meaning: The kernel calls the background daemon every 5 seconds to write dirty pages to the hard disk.

This default value ensures that write operations are performed regularly to avoid overloading the memory with dirty pages and at the same time ensure a good balance between system performance and data security.

Low value

A low value means that the background daemon is activated more frequently to write dirty pages to the hard disk.

Advantages:

Data integrity: Writing to the hard disk more frequently can reduce the risk of data loss in the event of a sudden system crash or power failure.
More stable performance: Smaller, more frequent writes can be performed, resulting in a more evenly distributed write load and avoiding potential performance drops from large, sudden writes.

Disadvantages:

Increased write activity: more frequent writes to the hard disk can lead to increased write activity, which can affect performance on traditional hard disk drives (HDDs) and shorten the lifespan of SSDs.
Higher CPU load: Frequently calling the background daemon can increase the CPU load, which could affect the performance of other processes.

High value

A high value means that the background daemon is activated less frequently to write dirty pages to the hard disk.

Advantages:

Reduced write activity: Less frequent writes can reduce the write load on the hard disk, which can extend the service life of SSDs in particular.
Reduced CPU load: Calling the background daemon less frequently can reduce the CPU load, leaving more resources available for other processes.

Disadvantages:

Potential performance drops: less frequent but larger writes can lead to sudden performance drops, especially if many dirty pages need to be written at once.
Increased risk of data loss: Writing less frequently to the hard disk increases the risk of data loss in the event of a sudden system crash or power failure.

Summary

Low value: Better for data integrity and stable performance, but increased write activity and CPU load.
High value: Better for reduced write activity and CPU load, but potential performance degradation and increased risk of data loss.

The optimal setting depends on the specific requirements of your system. For systems that rely on maximum data security and consistent performance, a lower value is beneficial. For systems that need to minimize write activity and conserve CPU resources, a higher value is more suitable. It is important to monitor the effects of the change and adjust if necessary.

Summary:
The setting vm.dirty_writeback_centisecs=800 specifies that the background daemon is called every 8 seconds to write dirty pages to the hard disk. This setting influences the frequency and size of the write operations and thus the system performance, CPU load and data integrity. Such a value offers a compromise between frequent writes (for better data integrity) and less frequent writes (for reduced write activity and CPU load).
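
To see how the chosen interval behaves in practice, the setting can be read back and the actual write-out activity observed:

# Wakeup interval of the background writeback threads (in centiseconds)
cat /proc/sys/vm/dirty_writeback_centisecs

# Watch real block writes per second (the "bo" column) while the relay is running
vmstat 1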