Unlocking Peak Performance: A Deep Dive into QuestDB, Java, and the Tinker’s Mindset with Jaromir Hamala 🚀💡
Are you navigating the tricky waters of moving AI from a proof-of-concept to reliable production? You are absolutely not alone! Many engineering teams face this exact challenge. Senior engineers and architects at events like QCon AI Boston are already sharing how they’ve made that shift, revealing scaling patterns, lessons learned, and what they’d do differently. But what if the core of your high-performance system is built on a technology many still consider “slow”?
Meet Jaromir Hamala, a self-proclaimed “generalist tinker” with a knack for crafting highly efficient, high-intensity technology. Jaromir, who’s been coding since his ZX Spectrum days, currently pours his passion into QuestDB – a powerhouse time-series database.
QuestDB: A Three-Tiered Architecture for Blazing Fast Data 💾📈
QuestDB is a time-series database engineered for extreme ingestion rates and for bridging the gap between high-intensity data sources and data lakes. Think of it as an analytical database specialized in querying data along the time axis. It shines when you need aggregations over the last 24 hours, or data bucketed into specific time windows, because it can answer those queries without scanning all your data.
Its secret? A brilliant three-tiered storage system:
- Tier 1: The Write-Ahead Log (WAL) ✍️💨
  - This is QuestDB’s ingestion-optimized tier. Data streams in and is written append-only to disk, achieving mind-blowing ingestion rates of millions of rows per second.
- Tier 2: The Mid-Tier 📊🔍
  - Asynchronously, data from the WAL is transformed and organized into a query-optimized shape. Data files on disk are physically sorted by time, making time-based queries incredibly efficient. If you need the last hour’s data, QuestDB can find it with lightning speed. Queries here are standard SQL, making data interaction straightforward.
- Tier 3: Cold Storage 🧊📦
  - For high ingestion scenarios like IoT or financial exchange data, total datasets grow rapidly. QuestDB offloads older data to archiving-optimized cold storage, like S3, in Parquet format. This means you retain historical data cost-effectively, and you can even process these Parquet files with your existing tooling without going through QuestDB directly, seamlessly integrating with data lakes.
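To see why physically sorting data by time pays off, here is a minimal sketch in Java. It is not QuestDB code: the `TimeSortedColumn` class, its fields, and its methods are hypothetical names invented for illustration. It assumes timestamps are stored in ascending order, so a time-window query can binary-search for the window boundaries and touch only the rows inside them.

```java
// Hypothetical illustration only; this is not how QuestDB is implemented.
class TimeSortedColumn {
    private final long[] timestamps; // assumed sorted in ascending order
    private final double[] values;   // values[i] belongs to timestamps[i]

    TimeSortedColumn(long[] timestamps, double[] values) {
        if (timestamps.length != values.length) {
            throw new IllegalArgumentException("columns must have equal length");
        }
        this.timestamps = timestamps;
        this.values = values;
    }

    // Returns the first index whose timestamp is >= target (a lower bound).
    static int lowerBound(long[] sorted, long target) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Averages values with fromInclusive <= timestamp < toExclusive.
    // Only rows inside the window are read; the rest are never scanned.
    double averageBetween(long fromInclusive, long toExclusive) {
        int start = lowerBound(timestamps, fromInclusive);
        int end = lowerBound(timestamps, toExclusive);
        if (start >= end) {
            return Double.NaN; // no rows in the window
        }
        double sum = 0;
        for (int i = start; i < end; i++) {
            sum += values[i];
        }
        return sum / (end - start);
    }

    public static void main(String[] args) {
        TimeSortedColumn col = new TimeSortedColumn(
                new long[] {10, 20, 30, 40, 50},
                new double[] {1, 2, 3, 4, 5});
        System.out.println(col.averageBetween(20, 50));
    }
}
```

Locating the window costs two binary searches, so the work grows with the size of the window rather than with the size of the whole table.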
Java’s Need for Speed: Dispelling the Myth with QuestDB ⚡️☕
“Java isn’t fast,” they say. But QuestDB begs to differ! While its GitHub stats show that 90.8% of QuestDB’s codebase is Java, it’s not your typical, “idiomatic” Java. The secret sauce comes from its founder’s background in high-frequency trading (HFT).
In HFT, every millisecond counts, so developers employ specialized Java techniques:
- Avoiding Allocations: Minimizing object creation to prevent garbage collection pauses.
- Off-Heap Data Structures: Directly managing memory outside the Java heap.
- Pooling & Object Reuse: Recycling objects instead of constantly allocating new ones.
This “un-Java-like Java” approach, combined with strategic use of C, C++, Assembler, and Rust for specific components, allows QuestDB to achieve its incredible performance. It’s compelling evidence that Java can indeed be pretty fast for both ingestion and querying, often topping independent benchmarks!
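The three techniques listed above can be sketched together in a few lines. This is an illustration under stated assumptions, not QuestDB's implementation: the article says QuestDB's core relies on unsafe memory access, whereas this sketch swaps in the standard `ByteBuffer.allocateDirect` as a safer stand-in. The `OffHeapLongColumn` class and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration: a fixed-capacity column of longs stored off-heap.
class OffHeapLongColumn {
    private final ByteBuffer buffer; // direct buffer: memory lives outside the Java heap
    private int size;

    OffHeapLongColumn(int capacity) {
        this.buffer = ByteBuffer
                .allocateDirect(Math.multiplyExact(capacity, Long.BYTES))
                .order(ByteOrder.nativeOrder());
    }

    // Appends a primitive long; no boxing and no per-row object allocation.
    // Throws IndexOutOfBoundsException if the capacity is exceeded.
    void append(long value) {
        buffer.putLong(size * Long.BYTES, value);
        size++;
    }

    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return buffer.getLong(index * Long.BYTES);
    }

    int size() {
        return size;
    }

    // Resets the column so the same memory can be reused for the next batch.
    void clear() {
        size = 0;
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += buffer.getLong(i * Long.BYTES);
        }
        return total;
    }

    public static void main(String[] args) {
        OffHeapLongColumn column = new OffHeapLongColumn(1024);
        for (int batch = 0; batch < 3; batch++) {
            column.clear(); // reuse instead of allocating a new column per batch
            for (int i = 1; i <= 100; i++) {
                column.append(i);
            }
            System.out.println(column.sum());
        }
    }
}
```

The one allocation happens in the constructor. After that, appending and reading rows creates no garbage, and `clear()` lets a single instance serve many batches.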
Pushing the Limits: Java’s Future & Mechanical Sympathy ⚙️🔬
QuestDB’s core still relies on “old school unsafe ways” to achieve its speed. However, with recent upgrades (targeting Java 17 and soon 21), Jaromir and the team are excited about new Java developments:
- Vector API: Jaromir has experimented with the Vector API to bring vectorized execution to ARM architectures, mirroring the efficiency currently achieved with AVX2 instructions via QuestDB’s custom C++ JIT backend. While challenges like “warm-up time” for ad-hoc queries exist, the potential for vastly more efficient filtering over billions of rows is immense.
- Valhalla: This is expected to be a game-changer for value objects and precise memory layout control. QuestDB’s philosophy of “mechanical sympathy” (understanding and optimizing for the underlying hardware) often clashes with idiomatic Java’s abstractions. Valhalla promises to bridge this gap, allowing developers to write code that is clean, maintainable, and fast.
- Panama: This API could allow QuestDB to ditch JNI calls for memory mapping (mmap) and other native interactions. By providing safer, zero-cost abstractions for off-heap memory access, Panama aims to reduce the fragility and risks associated with direct unsafe operations.
Jaromir emphasizes that these techniques aren’t for everyone. Most developers building Spring applications won’t need them. But for those building the frameworks where performance is the key differentiator, these advancements are crucial for closing the gap between idiomatic and high-performing Java.
The Tinker’s Tales: Kernel Debugging & The 1 Billion Row Challenge 👾🛠️
Jaromir’s “tinker” spirit drives him to explore beyond the usual boundaries. He recounts a fascinating journey into the Linux kernel to debug a performance issue that froze his entire system. This led him to discover a kernel deadlock bug, debug it step-by-step in QEMU with GDB, and even “unfreeze” his computer by “lying to the kernel” – a testament to his deep curiosity and problem-solving skills, albeit a “party trick” not for production!
He also shared insights from the well-known 1 Billion Row Challenge, where he earned a bronze medal. His key takeaway: CPUs are highly parallel, out-of-order execution machines. He exploited this by duplicating operations within a single thread, giving the CPU independent work that its multiple arithmetic logic units (ALUs) can process in parallel. This experience was “priceless” for building intuition about hardware, even if the resulting code was “super ugly” and not maintainable for long-term projects. It highlighted the stark difference between “beautiful code” and “fast code” when squeezing every last drop of performance. The ultimate lesson: computers are extremely fast, and if you are not sabotaging them, they will surprise you!
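The idea of duplicating operations within one thread can be sketched with a simple sum. This is not Jaromir's challenge code, and the `IlpSum` class is a hypothetical name. The sketch assumes that a single accumulator forms one long dependency chain, while four independent accumulators give an out-of-order CPU four chains it can advance at the same time.

```java
// Hypothetical illustration of instruction-level parallelism in one thread.
class IlpSum {
    // Baseline: every addition depends on the result of the previous one.
    static long sumSingle(int[] data) {
        long total = 0;
        for (int value : data) {
            total += value;
        }
        return total;
    }

    // Four independent accumulators: the additions into a, b, c, and d do not
    // depend on each other, so the CPU is free to execute them in parallel.
    static long sumFourLanes(int[] data) {
        long a = 0, b = 0, c = 0, d = 0;
        int i = 0;
        int limit = data.length - 3;
        for (; i < limit; i += 4) {
            a += data[i];
            b += data[i + 1];
            c += data[i + 2];
            d += data[i + 3];
        }
        // Handle the tail when the length is not a multiple of four.
        for (; i < data.length; i++) {
            a += data[i];
        }
        return a + b + c + d;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_003];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 7;
        }
        System.out.println(sumSingle(data) == sumFourLanes(data));
    }
}
```

Both methods return the same result. Whether the four-lane version is actually faster depends on the JIT compiler and the hardware, since the JIT may already optimize the simple loop, so measure with a benchmark harness before drawing conclusions. It also shows the trade-off described above: the faster shape is noticeably less pleasant to read.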
AI: The New Frontier for Coders 🤖✨
Even a seasoned tinker like Jaromir embraces AI for coding. He uses tools like Codex and Claude for:
- Codebase Investigation: Quickly understanding complex projects like the HotSpot C2 compiler or the Linux kernel.
- Exploratory Work & Learning: Validating hypotheses and accelerating learning curves.
AI allows him to achieve in a Saturday what would previously take weeks of research. However, he also raises a thought-provoking concern: while AI is an amazing learning tool, for new developers, it might foster superficiality. The ease of generating code could undermine the discipline and deep concentration required to truly learn the craft, potentially impacting long-term motivation and understanding.
This new era, as Jaromir and Olimpio discuss, presents both incredible opportunities for acceleration and challenges around cognitive load and deep comprehension. As we move forward, striking a balance between leveraging AI’s power and cultivating genuine expertise will be key.
Jaromir Hamala’s journey with QuestDB, his relentless pursuit of performance, and his thoughtful insights into the future of tech remind us that the most exciting innovations often come from those who dare to tinker, question, and push the boundaries of what’s possible.