ByteByteGo
#Redis#System Design#Caching

Redis for System Design: Architecture, Persistence, Scaling, and Use Cases

This guide explores Redis's core architecture, including its single-threaded nature and in-memory storage, and details its persistence options and scaling strategies. Learn how Redis is effectively used for caching, rate limiting, and leaderboards in system design.

5 min read · AI Guide

Introduction

Redis is an in-memory data structure store that provides extremely low-latency access to diverse data types. It is a fundamental component in many system designs, enabling high-performance caching, real-time analytics, and efficient distributed coordination.

Configuration Checklist

| Element | Version / Link |
|---|---|
| Language / Runtime | C (Redis is written in C) |
| Main library | Redis |
| Required APIs | Redis commands (e.g., SET, GET, INCR, ZADD, ZREVRANGE, ZRANK) |
| Keys / credentials needed | Redis authentication password (typically configured in redis.conf) |

Step-by-Step Guide

Step 1 — Understanding Single-Threaded Command Execution

Redis is primarily single-threaded for command execution, meaning it processes one command at a time. This design choice simplifies concurrency control by eliminating the need for complex locks and ensuring predictable command order (First-In, First-Out). While Redis 6 and newer versions introduced I/O threads for networking, the core command logic remains single-threaded to maintain atomicity and consistency.

# Example Redis commands demonstrating sequential execution
# SET user:1:name "Alex" - Sets a string key
# INCR counter - Atomically increments a counter
# GET profile:42 - Retrieves a string key

# In a single-threaded model, these commands are processed one after another.
# If one command blocks, all subsequent commands will wait.
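The sequential model above can be sketched as a toy event loop in Python. This is an illustrative simulation, not how Redis is implemented: commands wait in a FIFO queue and are applied one at a time, so even a read-modify-write like INCR is safe because nothing else runs concurrently.

```python
from collections import deque

# In-memory key-value store, mutated only by the single command loop below.
store = {}

def execute(command):
    """Apply one command; no other command can interleave with it."""
    op = command[0]
    if op == "SET":
        _, key, value = command
        store[key] = value
        return "OK"
    if op == "GET":
        return store.get(command[1])
    if op == "INCR":
        key = command[1]
        # Read-modify-write, but atomic here: the loop runs one command at a time.
        store[key] = int(store.get(key, 0)) + 1
        return store[key]
    raise ValueError(f"unknown command: {op}")

# Commands are drained in FIFO order, mirroring Redis's command queue.
queue = deque([
    ("SET", "user:1:name", "Alex"),
    ("INCR", "counter"),
    ("INCR", "counter"),
    ("GET", "user:1:name"),
])

results = [execute(cmd) for cmd in queue]
print(results)  # ['OK', 1, 2, 'Alex']
```

Because execution is strictly sequential, the two INCRs can never race and lose an update, which is exactly the property the single-threaded design buys.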

Step 2 — Optimizing Latency with Pipelining and Transactions

To mitigate the impact of network latency in a single-threaded environment, Redis supports pipelining and transactions. Pipelining allows clients to send multiple commands to the server without waiting for a response to each, reducing network round-trip time. Transactions bundle a set of commands to be executed atomically, ensuring all commands in the transaction succeed or fail together.

# Example of Pipelining in a Redis client (conceptual)
# client.pipeline()
# client.get('profile:42')
# client.incr('counter')
# client.set('user:1:name', 'Alex')
# results = client.execute() # All commands sent in one round trip, responses received together

# Example of a Redis Transaction (MULTI/EXEC)
# client.multi()
# client.get('profile:42')
# client.incr('counter')
# client.set('user:1:name', 'Alex')
# results = client.exec() # Commands are queued and then executed atomically
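The latency benefit of pipelining can be sketched with simple arithmetic. The RTT and per-command times below are assumed illustrative values, not benchmarks: without pipelining each command pays a full network round trip, while a pipeline pays it once per batch.

```python
# Back-of-the-envelope model of pipelining (assumed numbers, not measurements).
RTT_MS = 1.0       # assumed network round-trip time
CMD_MS = 0.01      # assumed server-side processing time per command
N_COMMANDS = 100

# Without pipelining: one round trip per command.
sequential_ms = N_COMMANDS * (RTT_MS + CMD_MS)

# With pipelining: one round trip for the whole batch.
pipelined_ms = RTT_MS + N_COMMANDS * CMD_MS

print(f"sequential: {sequential_ms:.0f} ms, pipelined: {pipelined_ms:.0f} ms")
```

Under these assumptions the batch finishes roughly 50x faster, which is why pipelining matters even though the server-side work is identical.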

Step 3 — Configuring Persistence for Durability

Since Redis is an in-memory database, data is volatile and can be lost on machine crashes if not persisted. Redis offers several persistence options, each with different trade-offs between performance and durability.

Option A: No Persistence (Pure Cache)

Many teams use Redis as a pure cache with persistence turned off. The primary database remains the source of truth. If Redis crashes, the cache is lost but can be rebuilt by querying the database on demand. This offers the highest performance but zero durability for Redis itself.

# In redis.conf, ensure persistence is disabled for a pure cache setup
# save ""
# appendonly no

Option B: RDB Snapshots

RDB (Redis Database) persistence performs periodic snapshots of the dataset to disk. On restart, Redis loads the latest snapshot into memory. This allows Redis to come back with warm data, but any writes that occurred between the last snapshot and the crash are lost.

# In redis.conf, configure RDB snapshots
# save 900 1    # Snapshot if at least 1 key changed within 900 seconds (15 minutes)
# save 300 10   # Snapshot if at least 10 keys changed within 300 seconds (5 minutes)
# save 60 10000 # Snapshot if at least 10000 keys changed within 60 seconds (1 minute)
# dbfilename dump.rdb
# dir ./      # Directory for RDB files
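The trade-off of snapshotting can be demonstrated with a toy simulation (illustrative only, not the RDB binary format): the dataset is dumped to disk, the process "crashes", and restart recovers the snapshot while losing the write made after it.

```python
import json
import os
import tempfile

# Toy RDB-style persistence: dump the whole dataset, crash, reload.
snapshot_path = os.path.join(tempfile.mkdtemp(), "dump.json")

store = {"user:1:name": "Alex", "counter": 41}

# Periodic snapshot (what the `save` rules trigger in the background).
with open(snapshot_path, "w") as f:
    json.dump(store, f)

store["counter"] = 42  # a write after the snapshot...
store = None           # ...then the process "crashes"

# Restart: load the latest snapshot back into memory.
with open(snapshot_path) as f:
    store = json.load(f)

print(store["counter"])  # 41 -- the post-snapshot write was lost
```

This is the RDB failure mode in miniature: recovery is fast and the data is warm, but everything written since the last snapshot is gone.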

Option C: AOF (Append-Only File)

AOF persistence logs every write operation received by the server. When Redis restarts, it re-executes the commands in the AOF to rebuild the dataset. This offers stronger durability than RDB, with configurable sync policies.

# In redis.conf, enable AOF persistence
appendonly yes

# Configure appendfsync policy:
# appendfsync everysec # Default: fsync every second. Lose at most 1 second of data.
# appendfsync always   # fsync after every command. Strongest durability, but slower.
# appendfsync no       # OS flushes when it wants. Fastest, but least durable.

# auto-aof-rewrite-percentage 100 # Rewrite AOF when it grows by 100%
# auto-aof-rewrite-min-size 64mb # Minimum size for AOF rewrite
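The AOF recovery model can also be sketched in a few lines (illustrative only, not the real AOF format): every write is appended to a log before being applied, and a restart rebuilds the dataset by replaying the log from the beginning.

```python
# Toy AOF-style persistence: log every write, rebuild state by replay.
aof_log = []

def apply_command(store, command):
    """Apply one write command to an in-memory store."""
    op, key, *args = command
    if op == "SET":
        store[key] = args[0]
    elif op == "INCR":
        store[key] = int(store.get(key, 0)) + 1

def write(store, command):
    aof_log.append(command)        # append to the log first
    apply_command(store, command)  # then apply in memory

store = {}
write(store, ("SET", "user:1:name", "Alex"))
write(store, ("INCR", "counter"))
write(store, ("INCR", "counter"))

# "Crash" and restart: replay the log into a fresh dataset.
recovered = {}
for command in aof_log:
    apply_command(recovered, command)

print(recovered)  # {'user:1:name': 'Alex', 'counter': 2}
```

The durability window then comes down to how often the log is fsynced to disk, which is exactly what the appendfsync policies above control.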

Step 4 — Scaling Redis for Throughput and Capacity

Scaling Redis involves strategies to handle increased read/write traffic and larger datasets. Most deployments start with a single instance and scale as needed.

Option A: Replication for Read Scaling

To increase read throughput and provide high availability, replicas can be added. The primary node handles all writes, while replicas serve read traffic. Data is asynchronously replicated from the primary to its replicas. If the primary fails, a replica can be promoted to take its place.

# On the replica instance, use the replicaof command
# redis-cli replicaof <primary_ip> <primary_port>

# To check replication status
# redis-cli INFO replication
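Because replication is asynchronous, a replica can briefly serve stale data. A minimal sketch of that behavior (a simulation, not the Redis replication protocol): the primary acknowledges writes immediately and streams them to the replica, which applies them with a lag.

```python
# Toy async replication: primary acks first, replica applies later.
primary, replica, stream = {}, {}, []

def write_to_primary(key, value):
    primary[key] = value         # primary applies and acks immediately
    stream.append((key, value))  # replication happens in the background

def replicate_one():
    """Deliver the oldest pending write to the replica."""
    key, value = stream.pop(0)
    replica[key] = value

write_to_primary("counter", 1)
write_to_primary("counter", 2)

replicate_one()                  # only the first write has arrived
stale_read = replica["counter"]  # 1 -- the replica is lagging

replicate_one()                  # the replica catches up
fresh_read = replica["counter"]  # 2
```

Read-scaling with replicas therefore trades a small window of staleness for throughput, which is acceptable for caching but worth noting for read-your-writes flows.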

Option B: Client-Side Sharding for Write Scaling

When write volume exceeds a single instance's capacity, client-side sharding can be implemented. The client application is responsible for distributing keys across multiple independent Redis instances (shards). This approach avoids cross-node coordination overhead within Redis itself.

# Conceptual client-side sharding logic
import hashlib

def get_redis_shard(key, num_shards, redis_clients):
    # Use a stable hash: Python's built-in hash() is randomized per process,
    # so it would route the same key to different shards across restarts.
    digest = hashlib.md5(key.encode()).digest()
    shard_index = int.from_bytes(digest[:8], "big") % num_shards
    return redis_clients[shard_index]

# Example usage:
# redis_clients = [Redis(host='shard1_ip'), Redis(host='shard2_ip'), ...]
# client = get_redis_shard('user:123:profile', len(redis_clients), redis_clients)
# client.set('user:123:profile', 'data')

Option C: Redis Cluster for Automatic Sharding and Failover

Redis Cluster provides automatic sharding and failover capabilities, abstracting sharding logic from the client. It partitions data across multiple Redis instances and automatically handles node failures by promoting replicas. This offers a more managed scaling solution but introduces operational complexity.

# To start a Redis Cluster node (conceptual)
# redis-server --port 7000 --cluster-enabled yes --cluster-config-file nodes-7000.conf --cluster-node-timeout 5000 --appendonly yes

# To create the cluster (using redis-cli)
# redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 ... --cluster-replicas 1
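Under the hood, Redis Cluster assigns every key to one of 16384 hash slots using CRC16 of the key (or of the substring inside the first `{...}` hash tag, which lets related keys land on the same node). A small Python sketch of that slot calculation:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant), the checksum Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if (crc & 0x8000) else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of Redis Cluster's 16384 hash slots."""
    # If the key has a non-empty {...} hash tag, only the tag is hashed.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end > start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag map to the same slot, enabling multi-key operations.
print(key_slot("{user:1}:profile") == key_slot("{user:1}:settings"))  # True
```

Hash tags are how multi-key commands and transactions stay possible in a cluster: without a shared tag, related keys may land in different slots on different nodes.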

Comparison Tables

AOF appendfsync Policies

| Policy | Durability | Performance | Latency | Use Case |
|---|---|---|---|---|
| everysec | Lose at most 1 second of data | High throughput | Low | Most common; good balance |
| always | Zero data loss | Lowest throughput | Highest | Small datasets, critical data, latency not critical |
| no | OS-dependent data loss | Highest throughput | Lowest | Pure cache, non-critical ephemeral data |

Redis Scaling Approaches

| Approach | Read Throughput | Write Throughput | Availability | Complexity | Use Case |
|---|---|---|---|---|---|
| Single Instance | Moderate | Moderate | Low | Low | Small to medium workloads |
| Replication | High | Moderate (primary is the bottleneck) | High (failover) | Moderate | Read-heavy workloads, caching |
| Client-Side Sharding | High | High | Moderate (a shard failure affects a subset of keys) | Moderate (client manages sharding) | High write/read workloads, cache, ephemeral state |
| Redis Cluster | High | High | High (automatic failover) | High (managed cluster) | Large-scale, high-traffic, critical state requiring automatic management |

⚠️ Common Mistakes & Pitfalls

  1. Not configuring persistence for critical data: Relying on Redis for critical data without proper RDB or AOF configuration can lead to data loss on server crashes. Fix: Enable and correctly configure RDB snapshots or AOF persistence based on durability requirements.
  2. Blocking commands in a single-threaded Redis: Long-running commands (e.g., complex Lua scripts, large KEYS commands) can block the entire Redis instance, impacting all clients. Fix: Avoid long-running operations. Use SCAN instead of KEYS for iterating, optimize Lua scripts, or offload complex processing to background jobs.
  3. Over-reliance on Redis Cluster for simple use cases: Redis Cluster adds operational complexity. For many caching or ephemeral state workloads, client-side sharding or a single replicated instance is sufficient and simpler to manage. Fix: Evaluate if Redis Cluster's automatic sharding and failover are truly necessary for your specific guarantees; prefer simpler setups when possible.
  4. Not setting TTLs or eviction policies for caches: Without TTLs (Time To Live) or an eviction policy, a Redis cache can grow indefinitely, consuming all available memory and leading to out-of-memory errors. Fix: Set appropriate TTLs for cached keys using EXPIRE or SETEX, and configure a maxmemory limit with an eviction policy (e.g., allkeys-lru) in redis.conf.
  5. Ignoring network latency: Even with in-memory storage, network round-trip time can be a bottleneck. Sending many individual commands can accumulate latency. Fix: Utilize pipelining to batch multiple commands into a single network request, significantly reducing overall latency for bulk operations.
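Pitfall 4 can be addressed with a small redis.conf fragment; the 256mb cap is an illustrative value to size for your own workload:

```
# Cap memory usage and evict the least-recently-used keys across the whole keyspace
maxmemory 256mb
maxmemory-policy allkeys-lru
```

Per-key TTLs complement the eviction policy: set them with EXPIRE key seconds, or atomically at write time with SET key value EX seconds.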

Glossary

Single-threaded: A processing model where a single thread handles all incoming requests sequentially, ensuring atomicity and predictable execution order.
In-memory: Data is stored directly in the server's RAM (Random Access Memory), allowing for extremely fast read and write operations with sub-millisecond latency.
Pipelining: A client-side optimization technique where multiple commands are sent to the Redis server in a single batch without waiting for individual responses, reducing network overhead.
Sorted Set (ZSET): A Redis data structure that stores unique string members, each associated with a floating-point score, allowing members to be retrieved by score range or lexicographical order.
Atomic operation: An operation that is guaranteed to complete entirely or not at all, preventing partial updates and ensuring data consistency even with concurrent access.

Key Takeaways

  • Redis's single-threaded nature simplifies concurrency management and ensures atomic operations on individual keys.
  • Pipelining and transactions are crucial for minimizing network latency and ensuring atomicity for multi-command operations.
  • In-memory storage provides sub-millisecond response times, making Redis ideal for high-performance use cases.
  • Durability in Redis requires careful configuration of persistence options like RDB snapshots or AOF, or relying on an external durable database.
  • Replication enhances read throughput and availability, while sharding (client-side or Redis Cluster) scales write throughput and data capacity.
  • Redis is highly versatile, serving as an excellent solution for caching, distributed rate limiting, and real-time leaderboards due to its specialized data structures and atomic commands.
  • Choosing the right scaling and persistence strategy depends on the specific workload, durability requirements, and acceptable operational complexity.

Resources