From Redis to Valkey: Engineering the Future of In-Memory Data 🚀

In the fast-moving world of software architecture, few events trigger as much immediate action as a major licensing shift. When Redis moved away from its open-source roots in March 2024, the community didn’t just complain—they built.

Thomas Betts recently sat down with Madelyn Olson, a maintainer of the Valkey project and Principal Engineer at AWS, to discuss how a dedicated group of engineers birthed a new industry standard in just eight days. Beyond the drama of the fork lies a story of deep technical refinement, where modernizing a 2009-era hashmap design led to staggering memory savings and performance gains.


🏗️ The 8-Day Origin Story: A Community United

The transition of Redis from a permissive BSD license to a commercial license (SSPL/RSAL) acted as a catalyst. Within just eight days of the announcement, a tight-knit community of major contributors from AWS, Alibaba, Ericsson, Tencent, Huawei, and Google rallied under the Linux Foundation to create Valkey.

Valkey isn’t just a reactive fork; it is an evolution. Since its inception 18 months ago, the team has moved at breakneck speed:

  • Version 7.2: The initial fork, maintaining full compatibility.
  • Version 8.0: The first major statement release, proving the community’s ability to innovate.
  • Version 9.0: Released in November, introducing cutting-edge performance features.

Today, Valkey powers major managed services like Amazon ElastiCache, Google Cloud Memorystore, and offerings from Aiven and Percona.


🔄 Seamless Migration: The “Nothing Happened” Success 🎯

For developers, the biggest hurdle to adopting new tech is often the migration path. Valkey eliminates this friction by acting as a drop-in replacement for Redis Open Source 7.2.

  • The Upgrade Process: Most users on older versions can move safely to Valkey. In managed environments, this is often a single-click operation with zero downtime.
  • Tooling Compatibility: Major clients like Jedis, redis-py, and Spring Data Redis work seamlessly with Valkey.
  • High Availability: Valkey supports online upgrades by attaching replicas to existing clusters, syncing data, and performing a failover.
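The replica-attach upgrade described above can be sketched with standard replication commands. The hostnames and ports below are placeholders, and the exact promotion step depends on your topology (standalone vs. cluster); treat this as an illustrative outline, not a runbook.

```shell
# On the new Valkey node: start replicating from the existing
# Redis OSS 7.2 primary (Valkey also accepts the legacy SLAVEOF form).
valkey-cli -h valkey-new -p 6379 REPLICAOF redis-old 6379

# Wait until the initial sync completes: look for
# master_link_status:up in the replication section.
valkey-cli -h valkey-new -p 6379 INFO replication

# Once clients have been repointed, promote the Valkey node to primary.
valkey-cli -h valkey-new -p 6379 REPLICAOF NO ONE
```

Because Valkey speaks the same protocol and replication format as Redis OSS 7.2, the old primary sees the Valkey node as an ordinary replica during the sync.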

Madelyn notes that the highest compliment the team receives is when users report that they learned nothing during migration—it just worked.


🛠️ Under the Hood: Modernizing the Hashmap 💾

While many view Valkey as a simple key-value store, Madelyn describes it more accurately as a hashmap over TCP. The real magic, however, lies in how it handles complex data types like sets, hashes, and sorted sets while managing clustering, replication, and durability.

The core engineering challenge involved modernizing a data structure designed in 2009. The team identified several bottlenecks in the original design:

  1. Excessive Memory Allocations: The old system used many small, independent allocations.
  2. Pointer Overhead: On a typical 64-bit system, every pointer costs 8 bytes. For small values (averaging around 100 bytes), these pointers add up to significant per-entry overhead.
  3. Hardware Mismatch: Modern hardware hasn’t necessarily gotten faster in clock speed, but it has become much better at parallel data processing (SIMD).

The Technical Solution:

  • Binary Index Tree: The team implemented this structure to allow random sampling across thousands of per-slot dictionaries, ensuring efficient expiration and eviction of data.
  • Swiss Tables & Linear Probing: Moving away from traditional linked lists, Valkey now uses a strategy inspired by Swiss Tables. This packs as many as seven pointers into a single 64-byte cache line.
  • SIMD Instructions: By using Single Instruction, Multiple Data, the engine can check all seven pointers simultaneously, drastically reducing backend stalls while waiting for main memory.

📈 Quantification of Impact: Performance by the Numbers 📊

The results of these low-level optimizations are profound. The team focused on throughput rather than just latency, as network hops typically dominate the latter.

  • Memory Efficiency: One customer with a workload of small keys and values (around 8 bytes each) saw a 40% reduction in memory usage.
  • Typical Savings: Even average users see an 8% reduction, delaying the need to scale up infrastructure.
  • Throughput Benchmarks:
    • 250,000 requests per second per core for standard workloads.
    • 1.2 million requests per second for hot keys (with a path to 1.4 million in upcoming releases).
  • Stability: Despite radical internal changes, the team maintained flat performance for core workloads, ensuring no regressions while saving memory.

🤖 The C vs. Rust Debate: A Pragmatic Stance 🦾

In an era where every new project seems to be written in Rust, Valkey remains firmly rooted in C. Madelyn, despite being a self-described believer in Rust, offers a thought-provoking take on why a rewrite isn’t always the answer.

  • The Risk of Porting: Rust is highly opinionated. Porting a massive C codebase often requires changing the entire structure, risking performance regressions and memory efficiency.
  • The Module Compromise: Valkey uses a plugin extensibility system where new features are written in Rust. For example, the LDAP authentication module is a clean, 300-line Rust implementation.
  • The Verdict: While new code should explore Rust, deep core infrastructure that is already well-tuned and dependency-less should stay in C to avoid unnecessary risk.

🌐 The Future of Valkey Governance 📡

Valkey operates under a Technical Steering Committee (TSC) comprising the original six creators. The project remains strictly vendor-neutral and is actively looking to expand its maintainer base.

From powering telecommunication equipment at Ericsson to running on a Steam Deck for conference demos, Valkey is proving that an open, community-driven approach can outpace proprietary shifts.

As Madelyn puts it, the goal is to keep making smarter calls on where to invest time in tech, ensuring Valkey remains the most reliable, high-performance cache in the ecosystem. 👾🎯
