From Kubernetes Headaches to WebAssembly Wonders: The Quest for Seamless Autoscaling 🚀

Ever felt the sting of a slow-scaling application when demand spikes? You’re not alone! David, a seasoned infrastructure lead at Reich, recently took us on a fascinating journey, dissecting the often-frustrating world of autoscaling. His mission? To find that sweet spot of instantaneous, Lambda-like scaling, but without the hefty price tag. Buckle up, because we’re diving deep into the challenges of Kubernetes and the tantalizing promises of WebAssembly!

The Lambda Benchmark: Speedy Scaling, Steep Costs ⚡️💸

Let’s start with the gold standard: AWS Lambda. David shared a nail-biting anecdote where a rogue bug catapulted traffic from 50 to a staggering 1,300 concurrent requests in just two minutes. Lambdas, true to their reputation, handled this surge with flawless, near-instantaneous scaling. It’s impressive, no doubt!

However, there’s a catch. This lightning-fast agility comes at a significantly higher cost. Plus, Lambdas operate on a strict one-request-per-instance model. This is a far cry from the traditional Java developer’s comfort zone of large, robust servers handling multiple requests.
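That one-request-per-instance model is worth pausing on: under Lambda, your instance count tracks concurrency one-to-one, while a traditional threaded server amortizes many requests per machine. A minimal sketch, using the hypothetical thread-pool size of 100 (the 1,300-request figure is from David's anecdote; everything else here is illustrative):

```python
import math

def lambda_instances(concurrent_requests: int) -> int:
    # Lambda: strictly one in-flight request per instance,
    # so instance count equals concurrency.
    return concurrent_requests

def server_instances(concurrent_requests: int, threads_per_server: int) -> int:
    # Traditional model: each server soaks up a whole pool of requests.
    return math.ceil(concurrent_requests / threads_per_server)

# The surge from the anecdote: 1,300 concurrent requests.
print(lambda_instances(1300))       # 1300 billable instances
print(server_instances(1300, 100))  # 13 servers with 100-thread pools
```

The same traffic spike is absorbed by either 1,300 Lambda instances or a handful of beefy servers, which is exactly where the cost gap (and the Java developer's culture shock) comes from.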

Kubernetes Autoscaling: A Multi-Layered Battleground ⚔️

Kubernetes, while a powerhouse, can be a labyrinth when it comes to autoscaling. David highlighted several critical hurdles:

  • Node Scaling Delays: Out-of-the-box, scaling Kubernetes nodes can drag on for up to 9 minutes. Even with optimization, hitting the 2-minute mark is a good day. This delay can be a deal-breaker when milliseconds matter.
  • Pod Startup Complexities: The journey of a pod is fraught with delays. Attaching shared volumes, running multiple containers within a single pod, and the essential startup probes all add precious seconds. David wisely advises summing up all of these timeouts to get a realistic picture of your worst-case scaling speed.
  • Image Pull Bottlenecks: Even with clever tricks like proxies or preloading, pulling container images takes time. For larger images, expect to add a good half a minute to your scaling process – a stark contrast to serverless platforms that handle images almost instantly.
  • The Metric Maze: The Horizontal Pod Autoscaler (HPA) in Kubernetes is powerful, but choosing the right metric is a dark art. David’s Rails application example was eye-opening:
    • Memory showed zero correlation with latency spikes.
    • CPU was far too flaky to be a reliable indicator.
    • Even application-specific metrics, like average thread capacity, only offered a partial correlation. He cautioned that combining the wrong metrics can lead to nonsensical scaling behaviors. 🤯
  • Scaling Down: The Even Harder Problem: If scaling up is a challenge, David emphatically stated that scaling down is an even more difficult problem to solve effectively.
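Two of the points above can be made concrete. First, David's advice to sum every timeout in the pod's path. Second, the HPA's core formula, which the Kubernetes docs give as desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), and which shows why a flaky metric produces jumpy scaling. A minimal sketch (all timings and metric samples are hypothetical, for illustration only):

```python
import math

# --- Sum every delay on the scale-up path, per David's advice ---
# Hypothetical per-step timings in seconds; substitute your own.
startup_delays = {
    "node_provisioning": 120,  # a well-tuned node autoscaler
    "image_pull": 30,          # large image, even with a pull-through cache
    "volume_attach": 10,
    "startup_probe": 15,
}
print(f"worst-case scale-up: {sum(startup_delays.values())}s")  # 175s

# --- The HPA formula from the Kubernetes documentation ---
def hpa_desired_replicas(current: int, metric: float, target: float) -> int:
    return math.ceil(current * metric / target)

# Noisy CPU samples around a 50% target make the replica count
# bounce even when the real load is steady:
for cpu in (0.45, 0.90, 0.30):
    print(hpa_desired_replicas(4, cpu, 0.50))  # 4, then 8, then 3
```

The second half is why David found CPU "far too flaky": the formula is a simple ratio, so every wobble in the input metric is amplified straight into the replica count.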

Knative: Orchestration, Not a Performance Panacea ✨

Knative steps in as a compelling orchestration layer, offering features like queuing and scaling to zero. David described it as a “generally very nice project.” However, he clarified that it doesn’t magically boost scaling performance beyond what raw Kubernetes can achieve. It does, however, introduce its own layer of complexity, notably the need for a service mesh.

WebAssembly (Wasm): The Lightweight Serverless Frontier? 🌐

David’s exploration then led him to the exciting realm of Server-Side WebAssembly (Wasm). He sees it as a potential game-changer – a lightweight alternative to containers and even Knative. The allure? Sandboxed processes, dramatically smaller images, and the promise of lightning-fast startup times.

The Wasm Ecosystem: Maturing, But Not There Yet 🛠️

The Wasm ecosystem is still finding its feet. David recounted his struggles in 2023, finding it nearly impossible to build even moderately complex applications with Java or Go due to execution limits, niche internal boundaries, and a severe lack of mature SDKs. While the idea of leveraging the Kubernetes ecosystem with runtimes like Kasm was tempting, it proved experimental and prone to errors.

Progress and Lingering Hurdles 💡

Fast forward to today, and the landscape has shifted significantly. The Spin SDK has evolved, making it viable to work from examples, even if building complex apps from scratch remains a stretch. Key advancements include improvements to the WASI specification and the emergence of SpinKube and WasmCloud as pathways to integrate Wasm workloads with Kubernetes.

Despite this progress, David stressed that significant documentation gaps remain, posing a steep learning curve for newcomers. Hitting unexpected boundaries is a common roadblock. His conclusion? While Wasm holds immense potential for lightweight serverless applications, the ecosystem is not yet fully equipped to replicate Lambda-like performance across all use cases.

The Verdict: Lambdas Lead, Wasm Promises the Future 🌟

So, what’s the takeaway? Out-of-the-box Kubernetes autoscaling often yields “very strange values.” With extensive tuning, it can achieve “significantly better values,” and Knative offers valuable orchestration without a direct performance boost.

However, server-side Wasm emerges as the most promising avenue for the lightweight, scalable serverless future David envisions. Once the ecosystem matures, it could very well be the answer to our autoscaling prayers. And for those curious to explore, David pointed out a helpful tool called “container to Wasm” for emulating Wasm environments. The future of scaling is looking brighter, and potentially much lighter! ✨
