Speaking RESP: The Redis Serialization Protocol
In the last chapter, we built an Event Loop that can accept TCP connections. But a connection is just a stream of raw bytes. To make it a database, we need a language. Redis speaks RESP (Redis Serialization Protocol).
1. Intuition: Why not JSON?
Why this exists in Redis
If you send `{"command": "SET", "key": "user", "value": "alice"}`, the server has to:
- Scan for matching brackets.
- Handle escaping.
- Allocate memory for the entire string before parsing.
Real-world problem it solves: Parsing overhead. At 100k requests/sec, even a microsecond of JSON parsing creates a massive bottleneck. RESP is designed to be O(1) to find the end of a field and O(N) to parse the value.
The Analogy
RESP is like a pre-measured recipe. Instead of "some flour", it says "500g: [500 grams of flour]". You don't need to look for where the flour ends; you already have the scale ready.
2. Internal Architecture
How Redis designs this feature
RESP is a binary-safe, human-readable (mostly) protocol that uses a Type Prefix and Length Prefix strategy.
Components Involved
- Prefix Decoders: Map characters like `*` or `$` to types.
- Length Parsers: Extract the subsequent byte count so the parser can skip straight to the payload instead of scanning for delimiters.
Trade-offs
- Pros: Extremely simple to implement, extremely fast to parse. Binary safe (you can send images).
- Cons: Not as compact as binary protocols like Protobuf or MessagePack.
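To make the type-prefix idea concrete, here is a sketch of how the five classic RESP2 wire types look, with a tiny dispatcher (the `typeOf` helper is illustrative, not part of the chapter's parser):

```javascript
// The five RESP2 wire types, keyed by their prefix byte.
// Every frame ends with \r\n; bulk strings and arrays add a length prefix.
const examples = {
  '+': '+OK\r\n',                            // Simple String
  '-': '-ERR unknown command\r\n',           // Error
  ':': ':1000\r\n',                          // Integer
  '$': '$5\r\nhello\r\n',                    // Bulk String (length-prefixed, binary safe)
  '*': '*2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n',   // Array of 2 bulk strings
};

// The first byte of any frame tells the parser which decoder to dispatch to.
function typeOf(frame) {
  return { '+': 'simple', '-': 'error', ':': 'integer', '$': 'bulk', '*': 'array' }[frame[0]];
}
```

This single-byte dispatch is exactly the "Prefix Decoder" component above: one table lookup, no scanning.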
3. End-to-End Flow (VERY IMPORTANT)
Client → TCP → Event Loop → Command Queue → Parser → Execution → Data Store → Response
For a simple SET mykey value command:
- Client: Wraps the command in an array: `*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$5\r\nvalue\r\n`.
- TCP: Sends the raw bytes to the server.
- Event Loop: Wakes up and reads the buffer.
- Parser:
  - Reads `*`. Knows an array is coming.
  - Reads `3`. Knows it needs to parse 3 more elements.
  - Reads `$`. Knows a bulk string is coming.
  - Reads `3`. Knows exactly 3 bytes of data follow.
- Execution: The parser emits `['SET', 'mykey', 'value']` to the command engine.
- Response: The engine writes back `+OK\r\n`.
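The client-side wrapping in the first step can be sketched as a small encoder (`encodeCommand` is a hypothetical helper, not part of the chapter's code):

```javascript
// Encode a command (array of strings) into a RESP array of bulk strings.
function encodeCommand(args) {
  let out = `*${args.length}\r\n`;
  for (const arg of args) {
    // Use byte length, not character length, so multi-byte UTF-8 stays binary safe.
    out += `$${Buffer.byteLength(arg)}\r\n${arg}\r\n`;
  }
  return out;
}

// encodeCommand(['SET', 'mykey', 'value'])
// → '*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$5\r\nvalue\r\n'
```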
4. Internals Breakdown
Data Structures: The Buffer Pointer
In C, Redis uses a read buffer and an offset pointer. It doesn't "split" strings, which would cause memory allocations. It simply looks at the buffer and says: "the value starts at index X and is Y bytes long."
Memory Behavior
Small simple strings are often allocated on the stack or reused from a pool. Bulk strings are allocated on the heap.
5. Node.js Reimplementation (Hands-on)
Step 1: The Setup
We need to handle the fact that TCP packets can be fragmented: one command may arrive split across two `data` events.
```js
class RESPParser {
  constructor() {
    // Accumulates raw bytes across 'data' events
    this.buffer = Buffer.alloc(0);
  }

  feed(data) {
    this.buffer = Buffer.concat([this.buffer, data]);
  }
}
```

Step 2: Core Logic (Parsing Bulk Strings)
Bulk strings are the bread and butter of Redis.
```js
parseBulkString() {
  // Expected format: $5\r\nhello\r\n
  const firstNewline = this.buffer.indexOf('\r\n');
  if (firstNewline === -1) return null; // Wait for more data

  const length = parseInt(this.buffer.slice(1, firstNewline).toString(), 10);
  const totalLength = firstNewline + 2 + length + 2; // header + payload + trailing \r\n
  if (this.buffer.length < totalLength) return null; // Wait for more data

  const value = this.buffer.slice(firstNewline + 2, firstNewline + 2 + length);
  this.buffer = this.buffer.slice(totalLength); // Consume the parsed bytes
  return value.toString();
}
```

Step 3: Command Pipeline
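The pipeline below calls `parser.parse()`, which we haven't written yet. Here is a minimal self-contained sketch, assuming every command arrives as a RESP array of bulk strings (in the chapter's class this method would live alongside `feed()` and `parseBulkString()`):

```javascript
class RESPParser {
  constructor() {
    this.buffer = Buffer.alloc(0);
  }

  feed(data) {
    this.buffer = Buffer.concat([this.buffer, data]);
  }

  // Returns one complete command (array of strings), or null if more bytes are needed.
  parse() {
    if (this.buffer.length === 0 || this.buffer[0] !== 0x2a) return null; // expect '*'
    const headerEnd = this.buffer.indexOf('\r\n');
    if (headerEnd === -1) return null;

    const count = parseInt(this.buffer.slice(1, headerEnd).toString(), 10);
    let offset = headerEnd + 2;
    const parts = [];

    for (let i = 0; i < count; i++) {
      // Each element: $<len>\r\n<payload>\r\n
      const lenEnd = this.buffer.indexOf('\r\n', offset);
      if (lenEnd === -1) return null;
      const len = parseInt(this.buffer.slice(offset + 1, lenEnd).toString(), 10);
      const payloadEnd = lenEnd + 2 + len;
      if (this.buffer.length < payloadEnd + 2) return null; // wait for more data
      parts.push(this.buffer.slice(lenEnd + 2, payloadEnd).toString());
      offset = payloadEnd + 2;
    }

    this.buffer = this.buffer.slice(offset); // consume the parsed command
    return parts;
  }
}
```

Note how every `return null` leaves the buffer untouched: a fragmented command simply waits for the next `feed()` call.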
```js
function handleConnection(socket) {
  const parser = new RESPParser();
  socket.on('data', (data) => {
    parser.feed(data);
    let command;
    while ((command = parser.parse())) {
      // execute(command)
      socket.write('+OK\r\n');
    }
  });
}
```

6. Performance, High Concurrency & Backpressure
High Concurrency Behavior
RESP is designed to be O(1) for finding the end of a bulk string (using the length prefix). This allows Redis to jump through 10,000 commands in a single TCP packet without scanning every byte.
Backpressure & Bottlenecks
- Backpressure: If a client sends a 512MB bulk string (the protocol's maximum), the parser must buffer the whole thing. If many clients do this at once, the server hits its memory limit.
- Bottlenecks: The primary bottleneck is Memory Bandwidth. Moving large strings from the network buffer to the data store consumes the memory bus.
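One cheap defense against the backpressure problem is to cap how much un-parsed data we are willing to buffer per connection. This is a sketch with a hypothetical `BoundedParser` and a configurable limit, loosely mirroring Redis's 512MB `proto-max-bulk-len` default:

```javascript
const MAX_BUFFER = 512 * 1024 * 1024; // default cap, mirroring Redis's 512MB bulk limit

class BoundedParser {
  constructor(limit = MAX_BUFFER) {
    this.limit = limit;
    this.buffer = Buffer.alloc(0);
  }

  // Throws instead of buffering without bound, so one client can't exhaust memory.
  feed(data) {
    if (this.buffer.length + data.length > this.limit) {
      throw new Error('Protocol error: request too large');
    }
    this.buffer = Buffer.concat([this.buffer, data]);
  }
}
```

On a real server you would catch this error, write a RESP error back, and close the connection.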
7. Redis vs. Our Implementation: What we Simplified
- Zero-Copy Parsing: Redis (C) parses the protocol in-place in the read buffer using pointers. Our Node.js implementation uses `Buffer.slice()` and `Buffer.concat()`, which create new memory objects and trigger Garbage Collection.
- Static Buffers: Redis reuses memory blocks for common responses like `+OK\r\n`. We create new strings for every response.
8. Why Redis is Optimized
Redis uses Length-Prefixing instead of Delimiters (like JSON's `"` or `{`). This means the parser knows exactly how many bytes to read before it even starts, allowing for extremely high-speed memory copies (memcpy).
- Optimization: Redis (C) uses `sscanf` and pointer arithmetic. It's zero-copy where possible. Our Node.js implementation uses `Buffer.concat()` and `indexOf()`, which create new objects and can be slower.
- Error Handling: Redis has very specific error types (e.g., `ERR syntax error`). We often simplify into a catch-all.
9. Summary & Key Takeaways
- Length-Prefixing is Key: It makes parsing predictable and memory-safe.
- Stateful Parsing: Essential for handling real-world TCP volatility.
- Binary Safety: Unlike HTTP, RESP doesn't care if your data contains null bytes or special characters.
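Binary safety is easy to demonstrate: because of the length prefix, a payload full of null bytes and embedded `\r\n` sequences round-trips untouched. A sketch using hypothetical `encodeBulk`/`decodeBulk` helpers:

```javascript
// Length-prefix a payload as a RESP bulk string; works for any bytes.
function encodeBulk(payload) {
  const buf = Buffer.isBuffer(payload) ? payload : Buffer.from(payload);
  return Buffer.concat([
    Buffer.from(`$${buf.length}\r\n`),
    buf,
    Buffer.from('\r\n'),
  ]);
}

// Decode it back: read the declared length, then copy exactly that many bytes.
// The payload's own \r\n bytes are never mistaken for the terminator.
function decodeBulk(frame) {
  const headerEnd = frame.indexOf('\r\n');
  const len = parseInt(frame.slice(1, headerEnd).toString(), 10);
  return frame.slice(headerEnd + 2, headerEnd + 2 + len);
}

// A payload containing null bytes and an embedded \r\n survives the round trip.
const nasty = Buffer.from([0x00, 0x0d, 0x0a, 0xff]);
```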
Next Step: Now that we can speak the protocol, let's implement the Storage Engine: GET, SET, and the Mystery of Key Expiration.