
Linux Kernel Tuning for Redis

Configure Linux kernel parameters to prevent latency spikes, persistence failures, and connection drops in production Redis deployments—essential settings that override defaults hostile to in-memory databases.

Standard Linux kernel defaults are optimized for general workloads, not in-memory databases. Production Redis instances require specific kernel tuning to achieve consistent sub-millisecond latency.

Transparent Huge Pages (THP)

THP is the most critical setting. When enabled, it causes severe latency spikes during persistence operations.

The Problem:

Redis uses fork() to create child processes for RDB snapshots and AOF rewrites. Linux uses copy-on-write (COW) to share memory between parent and child. With THP enabled, the kernel uses 2MB pages instead of 4KB pages. When the parent modifies a single byte, the kernel must copy the entire 2MB page rather than a 4KB one, a 512x amplification.
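
To make the amplification concrete, the sketch below estimates how much memory the kernel copies when a given number of writes each land on a distinct shared page. The helper name cow_copied_mb and the 10,000-write figure are illustrative assumptions, and the result is a worst case in which no two writes touch the same page.

```shell
# Worst-case copy-on-write volume in MB: every write dirties a distinct page.
# cow_copied_mb is a hypothetical helper, not part of Redis or the kernel.
cow_copied_mb() {
    writes=$1      # number of writes, each assumed to hit a different page
    page_kb=$2     # page size in KB (4 for regular pages, 2048 for THP)
    echo $(( writes * page_kb / 1024 ))
}

# Example: 10,000 scattered writes during a background save
# cow_copied_mb 10000 4      # regular 4KB pages
# cow_copied_mb 10000 2048   # 2MB huge pages
```

With 4KB pages the example copies roughly 39 MB; with 2MB pages it copies 20,000 MB, which is the same ratio as the two page sizes.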

Symptoms:

- Latency spikes of 10-100x during BGSAVE or BGREWRITEAOF
- Memory usage doubling temporarily during persistence
- Client timeouts during background saves

Fix (required):

echo never > /sys/kernel/mm/transparent_hugepage/enabled

Make it permanent in /etc/rc.local or via systemd:

[Service]
ExecStartPre=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'

Redis logs a warning at startup if THP is enabled. Never ignore this warning.
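
The current THP mode can also be checked directly rather than waiting for the Redis warning. The kernel marks the active mode in square brackets in the sysfs file, for example `always madvise [never]`. The following is a minimal sketch; thp_mode is a hypothetical helper name, and its optional file argument is an assumption added so the parser can be exercised against a test file.

```shell
# Print the active THP mode, i.e. the word shown in [brackets].
# thp_mode is a hypothetical helper; the optional argument lets the
# parser be pointed at a different file (for example in a test).
thp_mode() {
    file=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}

# Example: fail a deployment check unless THP is disabled
# [ "$(thp_mode)" = "never" ] || echo "THP is still enabled" >&2
```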

Memory Overcommit

The fork() system call can fail if the kernel thinks there's insufficient memory.

The Problem:

When Redis forks, the kernel may pessimistically assume the child needs as much memory as the parent. On a 32GB instance with 30GB used, the fork can fail even though COW means minimal actual memory is needed.

Fix:

sysctl vm.overcommit_memory=1

This tells the kernel to grant memory allocation requests without checking whether enough memory is free, so the fork succeeds and COW keeps the child's actual usage small. To persist the setting, add to /etc/sysctl.conf:

vm.overcommit_memory=1

Swappiness

Redis data in swap is a catastrophic failure mode.

The Problem:

If any Redis memory pages are swapped to disk, access times go from nanoseconds to milliseconds—a 1,000,000x slowdown. A single swapped page can cause client timeouts.

Fix:

sysctl vm.swappiness=1

Setting to 0 is not recommended as it can cause OOM kills. Setting to 1 makes swapping extremely unlikely while still allowing the kernel to swap in desperate situations:

vm.swappiness=1
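
To see whether a running instance already has pages in swap, the VmSwap field of /proc/<pid>/status can be read. This is a small sketch; swap_kb is a hypothetical helper name, and the usage line assumes the server process is named redis-server.

```shell
# Print the swapped-out memory of a process in kB, taken from the
# VmSwap line of a /proc/<pid>/status file.
# swap_kb is a hypothetical helper name.
swap_kb() {
    awk '/^VmSwap:/ { print $2 }' "$1"
}

# Example (assumes the process name is redis-server; pgrep -o picks the
# oldest matching process):
# swap_kb "/proc/$(pgrep -o redis-server)/status"
```

Any value other than 0 means part of the process has been swapped out.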

TCP Stack Tuning

Under high connection rates, default TCP settings cause connection failures.

Connection Backlog:

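The kernel's limit on the listen queue for incoming connections (net.core.somaxconn) defaults to 128 on kernels before 5.4 and to 4096 on later ones. During traffic spikes, connections that overflow the queue are dropped before Redis sees them.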

sysctl net.core.somaxconn=65536
sysctl net.ipv4.tcp_max_syn_backlog=65536

Also set in redis.conf:

tcp-backlog 65536

The effective backlog is min(tcp-backlog, somaxconn).
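
The min() rule can be sketched as a small function that shows which of the two settings is the limiting one. effective_backlog is a hypothetical helper name used only for illustration.

```shell
# Print the backlog the kernel will actually apply: the smaller of the
# redis.conf tcp-backlog value and net.core.somaxconn.
# effective_backlog is a hypothetical helper name.
effective_backlog() {
    tcp_backlog=$1
    somaxconn=$2
    if [ "$tcp_backlog" -lt "$somaxconn" ]; then
        echo "$tcp_backlog"
    else
        echo "$somaxconn"
    fi
}

# Example: compare a configured tcp-backlog against the live kernel value
# effective_backlog 65536 "$(cat /proc/sys/net/core/somaxconn)"
```

If the result is lower than the configured tcp-backlog, the kernel setting is the bottleneck and raising tcp-backlog alone has no effect.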

TIME_WAIT Connections:

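With many short-lived connections, ports can be exhausted by TIME_WAIT sockets. tcp_tw_reuse lets the kernel reuse such sockets for new outgoing connections, so it matters most on hosts that open connections to Redis (application servers, proxies) rather than on the listening side: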

sysctl net.ipv4.tcp_tw_reuse=1
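
Before changing this setting, it can help to measure how many sockets are actually in TIME_WAIT. The sketch below counts them from `ss -tan` output, assuming the state appears in the first column as TIME-WAIT; count_time_wait is a hypothetical helper name.

```shell
# Count TIME-WAIT sockets in `ss -tan` output read from standard input.
# count_time_wait is a hypothetical helper name.
count_time_wait() {
    awk '$1 == "TIME-WAIT" { n++ } END { print n + 0 }'
}

# Example:
# ss -tan | count_time_wait
```

A count that approaches the size of the range in /proc/sys/net/ipv4/ip_local_port_range suggests port exhaustion is a real risk on that host.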

TCP Keepalive:

For long-lived connections through load balancers:

tcp-keepalive 300

Redis will send keepalives every 300 seconds to prevent idle connection termination.

File Descriptor Limits

Each Redis connection consumes a file descriptor.

Check Current Limits:

ulimit -n

Set Appropriately:

In /etc/security/limits.conf:

redis soft nofile 65536
redis hard nofile 65536

Or in the systemd service file:

[Service]
LimitNOFILE=65536

Redis's maxclients is limited by available file descriptors minus ~32 for internal use.
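
As a rough illustration of that relationship, the sketch below derives the highest usable maxclients from a descriptor limit. max_clients_for is a hypothetical helper name, and the reserve of exactly 32 descriptors is taken from the approximate figure above.

```shell
# Estimate the largest maxclients a given nofile limit can support,
# assuming Redis reserves 32 descriptors for internal use.
# max_clients_for is a hypothetical helper name.
max_clients_for() {
    nofile=$1
    reserved=32
    echo $(( nofile - reserved ))
}

# Example: check the limit of the current shell
# max_clients_for "$(ulimit -n)"
```

A limit of 65536 therefore leaves room for about 65504 clients, while a default limit of 1024 leaves room for only 992.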

Summary Configuration

Add to /etc/sysctl.conf:

# Redis kernel tuning
vm.overcommit_memory=1
vm.swappiness=1
net.core.somaxconn=65536
net.ipv4.tcp_max_syn_backlog=65536
net.ipv4.tcp_tw_reuse=1

Disable THP at boot:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Apply without reboot:

sysctl -p
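
After applying the settings, the live values can be compared against the intended ones. The following is a minimal sketch that reads each key from /proc/sys; check_sysctl is a hypothetical helper name, and its optional third argument (an alternative root directory) is an assumption added to make the function testable.

```shell
# check_sysctl KEY EXPECTED [ROOT]
# Compare a live kernel parameter with its expected value.
# check_sysctl is a hypothetical helper; ROOT defaults to /proc/sys.
check_sysctl() {
    key=$1
    expected=$2
    root=${3:-/proc/sys}
    # vm.swappiness -> /proc/sys/vm/swappiness
    path="$root/$(echo "$key" | tr . /)"
    actual=$(cat "$path" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
        echo "ok   $key=$actual"
    else
        echo "FAIL $key=${actual:-unreadable} (expected $expected)"
        return 1
    fi
}

# Example: verify the values from the summary above
# check_sysctl vm.overcommit_memory 1
# check_sysctl vm.swappiness 1
# check_sysctl net.core.somaxconn 65536
# check_sysctl net.ipv4.tcp_max_syn_backlog 65536
# check_sysctl net.ipv4.tcp_tw_reuse 1
```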

Verification

Check Redis startup logs for warnings:

grep -i warning /var/log/redis/redis-server.log

Common warnings to address:

- "THP is enabled" - disable transparent huge pages
- "overcommit_memory is set to 0" - set to 1
- "somaxconn is lower than" - increase TCP backlog

Memory Allocator

Redis uses jemalloc by default, which is well-suited for its allocation patterns. Key behaviors:

High-Water Mark: After deleting large amounts of data, jemalloc retains memory to satisfy future allocations. The process RSS reflects peak usage, not current dataset size.

Fragmentation: Monitor mem_fragmentation_ratio in INFO MEMORY:

- Below 1.0: Redis is using swap (critical)
- 1.0-1.5: Normal
- Above 1.5: Significant fragmentation
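
These thresholds can be turned into a simple monitoring check. The sketch below classifies a ratio passed as an argument; classify_fragmentation and its three output labels are hypothetical names chosen for illustration, and a ratio of exactly 1.5 is treated as normal.

```shell
# Classify a mem_fragmentation_ratio value using the thresholds above.
# classify_fragmentation and its output labels are hypothetical names.
classify_fragmentation() {
    awk -v r="$1" 'BEGIN {
        if (r < 1.0)       print "swap"
        else if (r <= 1.5) print "normal"
        else               print "fragmented"
    }'
}

# Example: read the ratio from a running instance (strips the trailing
# carriage return that redis-cli INFO output carries)
# ratio=$(redis-cli INFO memory | tr -d '\r' | awk -F: '/^mem_fragmentation_ratio:/ { print $2 }')
# classify_fragmentation "$ratio"
```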

If fragmentation is high:

MEMORY PURGE

This asks jemalloc to release unused dirty pages back to the OS. In Redis 4.0+, active defragmentation can also be enabled:

activedefrag yes

Source

Shopify RedisDays presentation, Redis documentation, and production post-mortems from Netflix, Uber, and AWS ElastiCache teams.
