Beyond Tool Calls: Unleashing the Power of Code Generation with AI 🚀

Hey tech enthusiasts! Ever felt the frustration of AI struggling with complex tasks, especially when dealing with a multitude of services and tools? Sunil Pai, from Cloudflare and creator of PartyKit, dives deep into a revolutionary approach that’s changing the game: code generation. Forget the clunky JSON back-and-forth; Sunil introduces us to a world where AI writes code to get things done, unlocking unprecedented efficiency and flexibility.

The Tool-Calling Conundrum: Why JSON Isn’t Always the Answer 🤯

When you’re just starting with AI agents and a few tools, the traditional method of using JSON for tool calls works fine. However, as you scale up and integrate services like Google, Jira, and wikis, stuffing hundreds of tool definitions into the context becomes a bottleneck. Composing tools gets messy, because every intermediate result has to travel back through the model, and that constant back-and-forth slows everything down.

Sunil highlights a critical problem: tool calling gets weird at scale. Imagine needing to interact with a vast API surface, like Cloudflare’s with its 2,600 endpoints. Exposing a tool for each would result in an astronomical token count (around 1.2 million tokens for the first call!), making it practically impossible.

The Code Mode Revolution: AI as a Coder 💻✨

The solution? Instead of relying on JSON, Sunil’s team asked the model to generate code, typically JavaScript, that could then be executed in a controlled environment. This “code mode” brings several game-changing benefits:

  • Typed APIs & Syntax Checking: Generated code can target typed APIs (for example, via TypeScript definitions), so type errors and syntax errors can be caught before anything runs.
  • Leveraging Training Data: AI models are trained on massive datasets, including vast amounts of code. This means they can effectively leverage this knowledge to write functional code.
  • One-Shot Execution: Instead of multiple round trips, code execution allows for complex operations to be completed in a single run.
  • Fundamental Coding Capabilities: Code enables essential programming constructs like looping, state management, sequencing, and parallelization – capabilities that are natural for engineers but challenging for traditional tool-calling AI.
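To make the last two bullets concrete, here is a rough sketch of the kind of script a model might emit in code mode. The `IssueApi` interface, its methods, and the in-memory mock are hypothetical stand-ins invented for this example, not any real service's API.

```typescript
// Hypothetical typed API handed to the generated code; not a real service.
interface Issue { id: number; title: string; open: boolean; }
interface IssueApi {
  listIssues(): Promise<Issue[]>;
  closeIssue(id: number): Promise<Issue>;
}

// In-memory mock so the sketch runs on its own.
function createMockApi(seed: Issue[]): IssueApi {
  const issues = seed.map((i) => ({ ...i }));
  return {
    async listIssues() {
      return issues.map((i) => ({ ...i }));
    },
    async closeIssue(id) {
      const issue = issues.find((i) => i.id === id);
      if (!issue) throw new Error(`No issue with id ${id}`);
      issue.open = false;
      return { ...issue };
    },
  };
}

// What code-mode output might look like: sequencing, filtering, and
// parallel calls in one run, instead of one model round trip per call.
async function closeMatchingIssues(api: IssueApi, keyword: string): Promise<number[]> {
  const issues = await api.listIssues();
  const targets = issues.filter((i) => i.open && i.title.includes(keyword));
  const closed = await Promise.all(targets.map((i) => api.closeIssue(i.id)));
  return closed.map((i) => i.id);
}
```

With JSON tool calls, each `closeIssue` would typically be its own round trip through the model; here the loop and the parallelism live in the script.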

A Real-World Example: Taming the Cloudflare API 🌐🛠️

Sunil shares a compelling example from his work at Cloudflare. To manage the massive API surface, his colleague Matt Carey devised a brilliant strategy: expose only two tool calls – search and execute.

  • The search tool works against the entire OpenAPI JSON spec, so the spec itself never has to be loaded into the model’s context.
  • The execute tool then runs code against the API endpoints that search discovered.
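A minimal sketch of this two-tool shape, assuming a tiny hand-written spec. The `search` and `execute` signatures, the example paths, and the `callApi` stub are illustrative assumptions, not Cloudflare's actual implementation.

```typescript
// A tiny OpenAPI-like listing; a real spec would describe thousands of
// endpoints. Paths here are illustrative only.
interface Operation { method: string; path: string; summary: string; }
const spec: Operation[] = [
  { method: "GET", path: "/accounts/{account_id}/workers/scripts", summary: "List Workers scripts" },
  { method: "GET", path: "/zones", summary: "List zones" },
  { method: "POST", path: "/zones/{zone_id}/rules", summary: "Create an IP access rule" },
];

// Tool 1: search the spec. Only matching operations go back into the
// model's context, never the whole spec.
function search(query: string): Operation[] {
  const q = query.toLowerCase();
  return spec.filter((op) => (op.summary + " " + op.path).toLowerCase().indexOf(q) !== -1);
}

// Stub standing in for an authenticated HTTP client (hypothetical).
type ApiCall = (method: string, path: string) => string;
const callApi: ApiCall = (method, path) => `${method} ${path} -> ok`;

// Tool 2: execute a model-written function against the API client.
function execute<T>(generated: (call: ApiCall) => T): T {
  return generated(callApi);
}

const found = search("workers");
const results = execute((call) => found.map((op) => call(op.method, op.path)));
```

The tool descriptions stay the same size no matter how large the spec grows, which is where the token savings come from.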

This ingenious approach drastically reduced the initial token count from 1.2 million to a mere 1,000 – a staggering 99.9% reduction! With so little context consumed up front, working with a huge API becomes fast and practical.
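The arithmetic behind that figure, using only the numbers quoted in the talk (the per-endpoint average is derived here, not something Sunil stated):

```typescript
const endpoints = 2600;
const tokensOneToolPerEndpoint = 1200000;
const tokensSearchAndExecute = 1000;

// Average cost of describing one endpoint as its own tool: about 461.5 tokens.
const avgTokensPerEndpoint = tokensOneToolPerEndpoint / endpoints;

// Fraction of the initial context saved: about 0.99917.
const reduction = 1 - tokensSearchAndExecute / tokensOneToolPerEndpoint;

// Rounded to one decimal place: 99.9 (percent).
const reductionPercent = Math.round(reduction * 1000) / 10;
```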

Imagine a customer facing a DDoS attack. In a panic, they don’t have time for menu diving. A regular AI with tool calls would require about eight round trips to block offending IPs. With code generation, the AI can generate a single string of JavaScript code, execute it immediately against the API surface, and resolve the issue in one go.
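A sketch of what that single generated script might look like. `FirewallApi`, `listRecentRequests`, `blockIp`, and the request threshold are hypothetical names and values chosen only to make the example run; they are not Cloudflare API calls.

```typescript
interface RequestLog { ip: string; }
interface FirewallApi {
  listRecentRequests(): RequestLog[];
  blockIp(ip: string): void;
}

// In-memory stand-in so the sketch is runnable.
function createMockFirewall(logs: RequestLog[]): { api: FirewallApi; blocked: string[] } {
  const blocked: string[] = [];
  return {
    blocked,
    api: {
      listRecentRequests: () => logs,
      blockIp: (ip) => { blocked.push(ip); },
    },
  };
}

// The "generated" script: count requests per IP, then block every IP over
// the threshold, all in one execution rather than one round trip per step.
function blockOffenders(api: FirewallApi, threshold: number): string[] {
  const counts: Record<string, number> = {};
  for (const entry of api.listRecentRequests()) {
    counts[entry.ip] = (counts[entry.ip] || 0) + 1;
  }
  const offenders = Object.keys(counts).filter((ip) => counts[ip] > threshold);
  offenders.forEach((ip) => api.blockIp(ip));
  return offenders;
}
```

Listing, counting, filtering, and blocking would each be separate model turns with plain tool calls; here they are ordinary statements in one script.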

Sunil even provides a live demo of this “mythical server,” showcasing how the AI can list Cloudflare workers by generating and executing code, demonstrating the power of this approach even with a few hiccups on stage!

Bridging the Gap: Empowering Everyone with Code 👨‍💻👩‍💻

This shift to code generation has profound implications beyond just technical efficiency. Sunil argues that it breaks down the traditional dichotomy between technical users who can write code and non-technical users who rely on simplified interfaces.

Think about a programmer tasked with categorizing and renaming 200 photos. They’d open an IDE, write a script, perhaps use a vision model for captions, and get it done. Their mother, however, would have limited options – call someone or use a clunky, expensive app.

AI, with its ability to generate code from natural language prompts like “rename these files by date and location,” democratizes this power. Every human being on the planet now has access to a buddy that can spit out code that can interact with systems. This opens the door for highly personalized and dynamic user experiences.
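For a sense of scale, the script behind a prompt like that can be very small. A sketch, assuming each photo's date and location have already been extracted (say, from metadata or a vision model); `Photo` and `renameByDateAndLocation` are hypothetical names.

```typescript
// takenAt is assumed to be an ISO date string such as "2024-07-14".
interface Photo { file: string; takenAt: string; location: string; }

// Returns a map from the old file name to the proposed new one.
function renameByDateAndLocation(photos: Photo[]): Record<string, string> {
  const seen: Record<string, number> = {};
  const renames: Record<string, string> = {};
  for (const p of photos) {
    const dot = p.file.lastIndexOf(".");
    const ext = dot === -1 ? "" : p.file.slice(dot);
    // "Lake Como" becomes "lake-como"
    const slug = p.location.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");
    const base = `${p.takenAt}_${slug}`;
    // Number photos that share a date and place so names never collide.
    seen[base] = (seen[base] || 0) + 1;
    renames[p.file] = `${base}_${seen[base]}${ext}`;
  }
  return renames;
}
```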

The Tic-Tac-Toe Revelation: Inhabiting State Machines 👾

A fascinating anecdote involves Kenton Varda, the creator of Cloudflare Workers. He built a simple drawing canvas and asked an AI to play Tic-Tac-Toe. Instead of generating Tic-Tac-Toe code, the AI inspected the existing state of the system – an array of strokes on the canvas – recognized it as a Tic-Tac-Toe board, and made its move.

This emergent behavior is key. The AI wasn’t programmed to play Tic-Tac-Toe; it worked out how to interact with the system by reading its state. Sunil aptly describes this as the AI “inhabiting the state machine” rather than just generating a program. This is a radical departure from traditional software architecture.

The Rise of the Harness: A New Software Architecture 🏗️

The industry is rapidly developing what’s called a “harness” – a secure environment for executing AI-generated code. These harnesses are not just about generating code but also about providing a safe space to run it.

Key attributes of these sandboxed environments include:

  • Capability-Based Security: Start with zero capabilities and grant each one explicitly as needed, rather than starting from a container with pre-defined features and layering security on top.
  • Fast Initialization: Using technologies like V8 isolates for quick startup times.
  • Controlled Execution: Granting explicit API access and controlling all outgoing network connections. The recommended default is no outgoing fetches at all, only the APIs you have granted.
  • Absolute Observability: The ability to trace every action, understanding why and how code executed at any given time.

This harness architecture can be implemented using various technologies like V8 isolates, WebAssembly, or custom JavaScript interpreters. The goal is a fast, observable, and secure execution environment.
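A minimal sketch of the capability-passing and tracing ideas. `runWithCapabilities` and its types are hypothetical, and `new Function` is used only to keep the example short: it is not a sandbox, since the code it runs can still reach other globals. A real harness would rely on an isolate, WebAssembly, or a custom interpreter as described above.

```typescript
type Capability = (...args: unknown[]) => unknown;
type Capabilities = Record<string, Capability>;
interface TraceEntry { capability: string; args: unknown[]; }

function runWithCapabilities(
  source: string,
  granted: Capabilities,
): { value: unknown; trace: TraceEntry[] } {
  const trace: TraceEntry[] = [];
  const caps: Capabilities = {};
  // Wrap each granted capability so every call is recorded.
  Object.keys(granted).forEach((key) => {
    caps[key] = (...args: unknown[]) => {
      trace.push({ capability: key, args });
      return granted[key](...args);
    };
  });
  // The code is handed only `caps`, and `fetch` is shadowed with undefined
  // to mimic "no outgoing fetches". NOT real isolation (see note above).
  const fn = new Function("caps", "fetch", `"use strict"; ${source}`);
  const value: unknown = fn(Object.freeze(caps), undefined);
  return { value, trace };
}

const run = runWithCapabilities(
  "return caps.listWorkers().length;",
  { listWorkers: () => ["api-gateway", "image-resizer"] },
);
```

Here `run.value` is the script's return value and `run.trace` records which capabilities it touched, a toy version of the observability described above.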

Ambitious Futures: Long-Running Workflows & Generative UI ✨

This new paradigm unlocks ambitious possibilities:

  • Long-Running Workflows: Imagine workflows that run for days, months, or even years, carrying their state throughout their lifetime.
  • Generative UI for Every User: In e-commerce, for instance, instead of a one-size-fits-all UI, AI can generate perfectly customized interfaces for each user based on their context, preferences, and past behavior. This allows for hyper-personalization, surfacing relevant actions and information dynamically.
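One way to picture a workflow that carries its state: each step loads the persisted state, advances it, and saves it back, so the run can stop and resume arbitrarily later. `WorkflowState`, `Store`, and the step names are hypothetical, and an in-memory string stands in for durable storage.

```typescript
interface WorkflowState { step: number; log: string[]; }
interface Store {
  load(): WorkflowState | undefined;
  save(state: WorkflowState): void;
}

function createMemoryStore(): Store {
  let saved: string | undefined; // serialized, as durable storage would be
  return {
    load: () => (saved === undefined ? undefined : (JSON.parse(saved) as WorkflowState)),
    save: (state) => { saved = JSON.stringify(state); },
  };
}

const workflowSteps = ["charge-card", "ship-order", "request-review"];

// Runs at most `budget` steps, then returns; call again later to resume.
function resumeWorkflow(store: Store, budget: number): WorkflowState {
  const state = store.load() || { step: 0, log: [] };
  let remaining = budget;
  while (state.step < workflowSteps.length && remaining > 0) {
    state.log.push(workflowSteps[state.step]);
    state.step += 1;
    remaining -= 1;
    store.save(state); // checkpoint after every step
  }
  return state;
}
```

Because progress is checkpointed after every step, the gap between two `resumeWorkflow` calls can be seconds or months without changing the code.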

Sunil illustrates this with an e-commerce example: a user needs to return shoes and find a similar item under $100. The AI can generate the necessary code to perform this action on the fly, even if product engineers haven’t explicitly built that feature. This means generating completely different programs backed by your backend system for every single user.
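A sketch of the code a model might generate for that request. `ShopApi`, `startReturn`, `findSimilar`, and the mock catalog are hypothetical; a real backend would expose its own API.

```typescript
interface Product { sku: string; category: string; price: number; }
interface ShopApi {
  startReturn(sku: string): { returnId: string };
  findSimilar(sku: string): Product[];
}

// In-memory stand-in for the store backend.
function createMockShop(catalog: Product[]): ShopApi {
  return {
    startReturn: (sku) => ({ returnId: `ret-${sku}` }),
    findSimilar: (sku) => {
      const original = catalog.filter((p) => p.sku === sku)[0];
      if (!original) return [];
      return catalog.filter((p) => p.sku !== sku && p.category === original.category);
    },
  };
}

// Generated per request: start the return, then pick the cheapest similar
// product under the budget (undefined if nothing qualifies).
function returnAndReplace(shop: ShopApi, sku: string, maxPrice: number) {
  const { returnId } = shop.startReturn(sku);
  const candidates = shop
    .findSimilar(sku)
    .filter((p) => p.price < maxPrice)
    .sort((a, b) => a.price - b.price);
  return { returnId, suggestion: candidates.length > 0 ? candidates[0] : undefined };
}
```

Nobody built a "return and replace under a budget" feature here; the script composes two existing backend calls for this one user.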

Developer Experience for Agents: The New Frontier 🤖

As AI agents become more sophisticated, we need to consider their “developer experience.” This means:

  • Clear Documentation: Markdown docs are essential for agents to understand how to interact with systems.
  • Informative Errors: Errors should guide the agent on what to do next.
  • Discoverability: Search capabilities are crucial for agents to find the right tools and APIs.
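As an illustration of an informative error, here is a sketch in which a failure carries the next step and a docs pointer alongside the message. `AgentResult`, `getZoneId`, `listZones`, and the docs path are hypothetical names invented for this example.

```typescript
interface AgentSuccess<T> { ok: true; value: T; }
interface AgentFailure { ok: false; error: string; nextStep: string; docs: string; }
type AgentResult<T> = AgentSuccess<T> | AgentFailure;

// Made-up lookup table standing in for a real backend.
const knownZones: Record<string, string> = { "example.com": "zone-123" };

function getZoneId(domain: string): AgentResult<string> {
  if (Object.prototype.hasOwnProperty.call(knownZones, domain)) {
    return { ok: true, value: knownZones[domain] };
  }
  return {
    ok: false,
    error: `Unknown zone "${domain}".`,
    // Tell the agent what to try next, not just what went wrong.
    nextStep: `Call listZones() and retry with one of: ${Object.keys(knownZones).join(", ")}`,
    docs: "docs/zones.md",
  };
}
```

An agent that receives the failure can act on `nextStep` directly instead of guessing or asking the user.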

The core concept to embed is capability-based security. This isn’t limited to JavaScript; it applies to Python, WASM, and even Lisp. The key attributes remain sandboxing, capability-based security, embeddability for ephemeral execution, and absolute observability.

The End of the Distinction: Code is the New Interface 💡

For a long time, programmers had infinite power to interact with systems, while everyone else got buttons and forms. This distinction is dissolving. In this new world, code does the talking. It’s the mechanism for interacting with all your systems.

Sunil concludes by emphasizing that this is a new area of research and development. The future lies in empowering AI to generate and execute code, leading to a more dynamic, personalized, and powerful software landscape.

So, the next time you think about AI, remember that it’s not just about asking questions; it’s about empowering these agents to write the solutions. The possibilities are immense, and the journey is just beginning!

Come chat with Sunil at the pub if you want to dive deeper into this exciting evolution of software!
