Configure Linux kernel parameters to prevent latency spikes, persistence failures, and connection drops in production Redis deployments—essential settings that override defaults hostile to in-memory databases.
Standard Linux kernel defaults are optimized for general workloads, not in-memory databases. Production Redis instances require specific kernel tuning to achieve consistent sub-millisecond latency.
Transparent Huge Pages (THP) is the most critical setting. When enabled, it causes severe latency spikes during persistence operations.
The Problem:
Redis uses fork() to create child processes for RDB snapshots and AOF rewrites. Linux uses copy-on-write (COW) to share memory between parent and child. With THP enabled, the kernel uses 2MB pages instead of 4KB pages. When the parent modifies a single byte, the kernel must copy the entire 2MB page instead of a 4KB one: a 512x amplification.
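The amplification factor is simply the ratio of the two page sizes. A minimal sketch of that arithmetic follows; the function names cow_amplification and cow_copy_kb are hypothetical helpers for illustration, not part of Redis or the kernel.

```shell
# Ratio of huge page size to base page size, both given in KiB.
cow_amplification() {
    huge_kb=$1
    base_kb=$2
    echo $(( huge_kb / base_kb ))
}

# Worst-case KiB copied when the parent writes to N distinct pages
# while a child exists (assumes every write lands on a different page).
cow_copy_kb() {
    pages_touched=$1
    page_kb=$2
    echo $(( pages_touched * page_kb ))
}

cow_amplification 2048 4   # prints 512
cow_copy_kb 1000 4         # prints 4000    (4KB pages)
cow_copy_kb 1000 2048      # prints 2048000 (2MB pages)
```

Under that worst-case assumption, 1,000 scattered single-byte writes during a snapshot copy about 4MB with regular pages and about 2GB with huge pages.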
Symptoms:
- Latency spikes of 10-100x during BGSAVE or BGREWRITEAOF
- Memory usage doubling temporarily during persistence
- Client timeouts during background saves
Fix (required):
echo never > /sys/kernel/mm/transparent_hugepage/enabled
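Make it permanent in /etc/rc.local or via systemd. The write requires root, so if the unit runs Redis as an unprivileged user, prefix the command path with + (ExecStartPre=+/bin/sh ...) so that it runs with full privileges: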
[Service]
ExecStartPre=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
Redis logs a warning at startup if THP is enabled. Never ignore this warning.
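To confirm the setting took effect, read the same sysfs file back: the kernel lists all modes and marks the active one in square brackets. The sketch below extracts that value; thp_mode is a hypothetical helper name, and the check is skipped on systems where the file does not exist.

```shell
# Print the active THP mode, i.e. the value shown in [brackets].
# Reads the contents of the "enabled" file from standard input.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

THP_FILE=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$THP_FILE" ]; then
    mode=$(thp_mode < "$THP_FILE")
    if [ "$mode" = "never" ]; then
        echo "THP is disabled"
    else
        echo "THP mode is '$mode'; this guide recommends 'never'"
    fi
fi
```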
The fork() system call can fail if the kernel thinks there's insufficient memory.
The Problem:
When Redis forks, the kernel may pessimistically assume the child needs as much memory as the parent. On a 32GB instance with 30GB used, the fork can fail even though COW means minimal actual memory is needed.
Fix:
sysctl vm.overcommit_memory=1
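This tells the kernel to always grant memory allocation requests without checking available memory. For Redis this is safe in practice because COW means the child only needs copies of the pages the parent modifies during the save. To persist the setting, add to /etc/sysctl.conf: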
vm.overcommit_memory=1
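The 32GB example can be expressed as a simplified model of the pessimistic check: assume the child needs a full second copy of the dataset and ask whether that fits in free memory. This is an illustration only, not the kernel's actual heuristic, and fork_fits_pessimistic is a hypothetical name.

```shell
# Simplified model (NOT the kernel's real algorithm): does a full second
# copy of the dataset fit in the memory that is still free?
# Arguments are in GB. Returns 0 if it fits, 1 otherwise.
fork_fits_pessimistic() {
    used_gb=$1
    total_gb=$2
    free_gb=$(( total_gb - used_gb ))
    [ "$free_gb" -ge "$used_gb" ]
}

if fork_fits_pessimistic 30 32; then
    echo "fork allowed"
else
    echo "fork refused: 2 GB free, 30 GB assumed needed"
fi
```

With vm.overcommit_memory=1 no such check is made, so the fork proceeds and only the pages actually modified are ever copied.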
Redis data in swap is a catastrophic failure mode.
The Problem:
If any Redis memory pages are swapped to disk, access times for those pages go from nanoseconds to as much as milliseconds, a slowdown of up to roughly 1,000,000x. A single swapped page can cause client timeouts.
Fix:
sysctl vm.swappiness=1
Setting to 0 is not recommended as it can cause OOM kills. Setting to 1 makes swapping extremely unlikely while still allowing the kernel to swap in desperate situations. To persist the setting, add to /etc/sysctl.conf:
vm.swappiness=1
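Whether a running Redis process already has pages in swap can be read from the VmSwap field of /proc/<pid>/status. The sketch below assumes a single process named redis-server that pidof can find; vmswap_kb is a hypothetical helper name.

```shell
# Extract the VmSwap value (in kB) from /proc/<pid>/status-style input.
vmswap_kb() {
    awk '/^VmSwap:/ { print $2 }'
}

# Assumption: one redis-server process, discoverable with pidof.
pid=$(pidof redis-server 2>/dev/null | awk '{ print $1 }')
if [ -n "$pid" ] && [ -r "/proc/$pid/status" ]; then
    echo "redis-server swapped out: $(vmswap_kb < "/proc/$pid/status") kB"
fi
```

Any value other than 0 means some of the process's memory currently lives on disk.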
Under high connection rates, default TCP settings cause connection failures.
Connection Backlog:
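The kernel's limit on the listen queue for incoming connections (net.core.somaxconn) defaults to 128 on kernels before 5.4 and to 4096 from 5.4 onward. During traffic spikes, a full queue means new connections are dropped before Redis sees them.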
sysctl net.core.somaxconn=65536
sysctl net.ipv4.tcp_max_syn_backlog=65536
Also set in redis.conf:
tcp-backlog 65536
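The effective backlog is min(tcp-backlog, somaxconn). Some kernels reject somaxconn values above 65535; if sysctl reports an invalid argument, use 65535 for all three settings.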
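The min() rule is why raising only one of the two values has no effect. A small sketch, with effective_backlog as a hypothetical helper name:

```shell
# The kernel silently caps the backlog requested by Redis at somaxconn.
effective_backlog() {
    tcp_backlog=$1
    somaxconn=$2
    if [ "$tcp_backlog" -lt "$somaxconn" ]; then
        echo "$tcp_backlog"
    else
        echo "$somaxconn"
    fi
}

effective_backlog 65536 128    # prints 128: redis.conf raised, kernel not
effective_backlog 511 65536    # prints 511: kernel raised, redis.conf not
```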
TIME_WAIT Connections:
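With many short-lived connections, ports can be exhausted by TIME_WAIT sockets. The setting below lets the kernel reuse TIME_WAIT sockets for new outgoing connections, so it helps on hosts that open connections (application servers, proxies) rather than on connections the Redis server has accepted: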
sysctl net.ipv4.tcp_tw_reuse=1
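Before changing the setting, it helps to measure how many sockets are actually in TIME_WAIT. The sketch below counts them from ss -tan output, whose first column is the socket state; count_time_wait is a hypothetical helper name.

```shell
# Count lines whose first column (the socket state) is TIME-WAIT.
# Expects `ss -tan`-style output on standard input.
count_time_wait() {
    awk '$1 == "TIME-WAIT" { n++ } END { print n + 0 }'
}

if command -v ss >/dev/null 2>&1; then
    echo "TIME_WAIT sockets: $(ss -tan | count_time_wait)"
fi
```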
TCP Keepalive:
For long-lived connections through load balancers:
tcp-keepalive 300
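With this setting, Redis sends TCP keepalive probes on idle connections every 300 seconds, which also lets it detect dead peers. To prevent idle connection termination, the interval must be shorter than the load balancer's idle timeout.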
Each Redis connection consumes a file descriptor.
Check Current Limits:
ulimit -n
Set Appropriately:
In /etc/security/limits.conf:
redis soft nofile 65536
redis hard nofile 65536
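Or in the systemd service file (required when Redis is started by systemd, which does not apply /etc/security/limits.conf to services):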
[Service]
LimitNOFILE=65536
Redis's maxclients is limited by available file descriptors minus ~32 for internal use.
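Note that ulimit -n reports the limit of the current shell, which may differ from the limit of the running Redis process. The sketch below reads the soft limit from /proc/<pid>/limits-style input and derives the client ceiling using the reserve of 32 described above; soft_nofile and max_clients_for are hypothetical helper names.

```shell
# Soft "Max open files" limit from /proc/<pid>/limits-style input.
soft_nofile() {
    awk '/^Max open files/ { print $4 }'
}

# Highest maxclients a descriptor limit can support, reserving 32
# descriptors for Redis's internal use.
max_clients_for() {
    echo $(( $1 - 32 ))
}

max_clients_for 65536   # prints 65504
```

On a live host, assuming a single redis-server process, the first helper could be used as soft_nofile < /proc/$(pidof redis-server)/limits.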
Add to /etc/sysctl.conf:
# Redis kernel tuning
vm.overcommit_memory=1
vm.swappiness=1
net.core.somaxconn=65536
net.ipv4.tcp_max_syn_backlog=65536
net.ipv4.tcp_tw_reuse=1
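Disable THP (these writes do not survive a reboot, so run them at every boot, for example from /etc/rc.local or a systemd unit):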
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Apply without reboot:
sysctl -p
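After applying, each value can be read back and compared with what the configuration file requests. A minimal sketch, using the values from the consolidated snippet; check_setting and verify_redis_sysctls are hypothetical helper names, and a key that cannot be read is reported as "unknown".

```shell
# Compare one setting against its expected value.
check_setting() {
    name=$1
    expected=$2
    actual=$3
    if [ "$actual" = "$expected" ]; then
        echo "OK $name=$actual"
    else
        echo "MISMATCH $name=$actual (expected $expected)"
    fi
}

# Read back every key from the consolidated sysctl.conf snippet.
verify_redis_sysctls() {
    while read -r key want; do
        got=$(sysctl -n "$key" 2>/dev/null || echo unknown)
        check_setting "$key" "$want" "$got"
    done <<EOF
vm.overcommit_memory 1
vm.swappiness 1
net.core.somaxconn 65536
net.ipv4.tcp_max_syn_backlog 65536
net.ipv4.tcp_tw_reuse 1
EOF
}
```

Running verify_redis_sysctls prints one line per key, so any MISMATCH line points at a setting that did not take effect.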
Check Redis startup logs for warnings:
grep -i warning /var/log/redis/redis-server.log
Common warnings to address:
- "THP is enabled" - disable transparent huge pages
- "overcommit_memory is set to 0" - set to 1
- "somaxconn is lower than" - increase TCP backlog
Redis uses jemalloc by default, which is well-suited for its allocation patterns. Key behaviors:
High-Water Mark: After deleting large amounts of data, jemalloc retains memory to satisfy future allocations. The process RSS can therefore stay close to its peak rather than tracking the current dataset size.
Fragmentation: Monitor mem_fragmentation_ratio in INFO MEMORY:
- Below 1.0: RSS is smaller than the memory Redis has allocated, which usually means part of it has been swapped out (critical)
- 1.0-1.5: Normal
- Above 1.5: Significant fragmentation
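The ratio is used_memory_rss divided by used_memory, both reported by INFO MEMORY. The sketch below computes it and applies the thresholds listed above; fragmentation_ratio and classify_fragmentation are hypothetical helper names, and the labels are only shorthand for the three bands.

```shell
# mem_fragmentation_ratio = used_memory_rss / used_memory
fragmentation_ratio() {
    awk -v rss="$1" -v used="$2" 'BEGIN { printf "%.2f\n", rss / used }'
}

# Map a ratio onto the three bands from the list above.
classify_fragmentation() {
    awk -v r="$1" 'BEGIN {
        r = r + 0                      # force numeric comparison
        if (r < 1.0)       print "swap"
        else if (r <= 1.5) print "normal"
        else               print "fragmented"
    }'
}

classify_fragmentation "$(fragmentation_ratio 1200 1000)"   # prints normal
```

On a live instance the inputs can be taken from redis-cli INFO memory.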
If fragmentation is high:
MEMORY PURGE
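This asks jemalloc to release unused dirty pages back to the OS (it only has an effect when Redis is built with jemalloc). It reclaims retained memory but does not compact live allocations. For that, active defragmentation can be enabled in Redis 4.0+: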
activedefrag yes
Sources: Shopify RedisDays presentation, Redis documentation, and production post-mortems from Netflix, Uber, and AWS ElastiCache teams.