Securing the Build: How to Protect Your Software’s Foundation 🛠️

Hey tech enthusiasts! Ever stopped to think about what happens before your favorite software hits your desktop or phone? The journey from source code to a polished application is a complex one, and a critical, yet often overlooked, stage is compilation. Today, we’re diving deep into why this stage is a prime target for attackers and how we can fortify it, thanks to some groundbreaking research presented at a recent tech conference.

The Software Supply Chain: A Journey with Vulnerabilities 🔗

The software supply chain is a multi-stage process that includes retrieving and reviewing source code, compilation, packaging, and finally, distribution. While significant efforts have been made to secure individual steps and the end-to-end chain with tools like Git, GitHub, TUF, Optin, SLSA, and in-toto, the compilation stage has remained a relatively vulnerable frontier.

Recent Attacks: A Wake-Up Call 🚨

The news has been rife with sophisticated attacks targeting the compilation pipeline:

  • SolarWinds Attack (2020): Attackers planted a malicious “helper” process on the build servers. During compilation it quietly swapped legitimate source files for malicious ones, then reverted them afterward. Reviewers saw nothing amiss, because the compromise existed only for the duration of the build.
  • 3CX Attack: This attack involved compromising an employee’s machine, gaining network access to build machines and signing hosts. From there, attackers injected malicious libraries into the build process, resulting in installers that were signed with legitimate certificates but contained malicious components.

These incidents highlight a crucial need: defenses must extend beyond verifying source code and final signing. We need to secure the build environment itself and produce verifiable evidence of exactly what was executed during compilation.

Why is Compilation Security So Challenging? 🤔

Securing the compilation process is a thorny problem due to a few key reasons:

  • The Expanding Trust Base 🧱: Compiling even a simple “hello world” program involves more than just trusting the compiler. Every operation relies on the underlying operating system, which in turn interacts with hardware and firmware. This creates an incredibly vast Trusted Computing Base (TCB). For instance, a minimal Linux distribution can have over 400 packages, meaning you’re implicitly trusting a massive software and hardware stack, each with its own trust assumptions. Attacks can target any layer, from hardware to the OS kernel and services.
  • Maintaining Trust in Complex Toolchains ⚙️: Large-scale, complex compiler toolchains, like those found in operating systems, make it incredibly difficult to trace the root of trust. Ensuring that this trust is securely propagated through the entire chain is a monumental task.

Existing Solutions and Their Limitations 🚧

While efforts have been made, current solutions have their drawbacks:

  • in-toto and SLSA: These frameworks provide signed, verifiable attestation reports and define assurance levels for build pipelines. They help establish the integrity of source code, build scripts, and compilers, but they do not protect the build environment itself.
  • Reproducible Builds: The idea is to have independent parties run the same build and compare the outputs. If the results are bit-for-bit identical, the build is considered trustworthy (a minimal version of this check is sketched below). However, this approach has limitations:
    • It requires at least one trustworthy builder.
    • Multiple parties need access to the source code.
    • Benign environmental differences can lead to variations, increasing audit overhead.
    • Crucially, it still relies on the same complex build environment, not solving the core problem.
  • Hermetic Builds: These builds run in isolated, sandboxed environments (containers or VMs) and aim for deterministic results. They depend on known versions of build tools. However, the limitation here is that auditing compilation behaviors or the build environment is difficult, often requiring multiple runs to verify outputs.

This leaves us in a situation where we lack a truly auditable and practical solution for secure compilation.
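
To make the reproducible-builds comparison concrete, here is a minimal sketch in Python: it hashes the artifacts produced by two independent builders and flags any mismatch for audit. The file paths are hypothetical, and this shows only the verification step, not a build system.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def builds_match(artifact_a: Path, artifact_b: Path) -> bool:
    """Reproducible-builds check: two independently built artifacts
    must be bit-for-bit identical for the build to be trusted."""
    return sha256_of(artifact_a) == sha256_of(artifact_b)

# Hypothetical artifact paths produced by two independent builders.
if builds_match(Path("builder-a/app.bin"), Path("builder-b/app.bin")):
    print("outputs identical: build considered reproducible")
else:
    print("digest mismatch: flag the build for audit")
```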

Introducing TriScale: A Secure and Auditable Framework ✨

To tackle these challenges, the presented research introduces TriScale, a novel framework designed for secure and auditable compilation. TriScale is built upon three core auditable components.

How TriScale Works: A Minimalist and Protected Approach 🛡️

TriScale places a minimal runtime within a hardware-protected execution environment. This runtime features a syscall mediation layer that redirects all compilation-related syscalls to an in-memory file system. This effectively isolates the compilation process from the outside world, allowing only controlled input/output operations.

From a system perspective, TriScale operates at the hardware level. By confining all operations within its protected environment, it prevents any interaction with the host operating system. This means malicious or unintended behaviors in the rest of the OS cannot influence the compilation.
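
As a rough, simplified illustration of the mediation idea (not TriScale’s actual implementation), the Python sketch below models the in-memory file system as a dictionary and a mediation function that serves only whitelisted file operations, rejecting anything that would reach outside the protected environment.

```python
from typing import Callable

# Hypothetical in-memory file system: path -> file contents.
in_memory_fs: dict[str, bytes] = {
    "/src/hello.c": b"int main(void) { return 0; }\n",
}

def fs_read(path: str) -> bytes:
    """Serve reads from the in-memory file system only."""
    return in_memory_fs[path]

def fs_write(path: str, data: bytes) -> int:
    """Capture writes in memory; nothing touches the host OS."""
    in_memory_fs[path] = data
    return len(data)

# Only file-system operations are permitted; everything else is blocked.
ALLOWED: dict[str, Callable] = {"read": fs_read, "write": fs_write}

def mediate(call: str, *args):
    """Mediation layer: redirect permitted calls to the in-memory file
    system and reject anything that would reach outside the enclave."""
    handler = ALLOWED.get(call)
    if handler is None:
        raise PermissionError(f"call '{call}' blocked by mediation layer")
    return handler(*args)

# A compiler's file accesses go through the mediator...
source = mediate("read", "/src/hello.c")
mediate("write", "/out/hello.o", b"\x7fELF...")
# ...while anything else, e.g. network access, is refused outright:
# mediate("connect", "203.0.113.1", 443)  # -> PermissionError
```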

The key benefits are:

  • Significantly Smaller TCB: The trusted computing base is drastically reduced compared to regular software stacks.
  • Auditable Process: TriScale can generate attestation reports, making the entire compilation process and environment auditable.

TriScale Prototype: Tracer 🚀

The team has developed a prototype called Tracer to demonstrate TriScale’s capabilities. Tracer leverages several cutting-edge technologies:

  • Intel SGX Enclaves: Provide hardware-protected execution environments.
  • SGX Remote Attestation: Allows a verifier to confirm that code is executing within a genuine enclave.
  • in-toto Attestation: Generates attestation reports that record the steps taken during compilation (a simplified example follows this list).
  • WebAssembly (Wasm): Enables safe, near-native execution of trusted code across platforms in a sandbox.
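
To give a feel for what such a record can look like, here is a simplified Python sketch modeled on the in-toto attestation Statement layout. The builder ID, build type, and predicate fields are illustrative placeholders rather than Tracer’s actual schema, and a real report would also be signed.

```python
import hashlib
import json

def attest_artifact(name: str, contents: bytes) -> dict:
    """Build a simplified in-toto-style statement binding an artifact's
    digest to metadata about the step that produced it."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{
            "name": name,
            "digest": {"sha256": hashlib.sha256(contents).hexdigest()},
        }],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            # Illustrative placeholders: which builder ran and what it did.
            "builder": {"id": "urn:example:tracer-enclave"},
            "buildType": "urn:example:in-enclave-compilation",
        },
    }

# Hypothetical compiled output; a real report would also be signed, e.g.
# with a key bound to the enclave's remote-attestation evidence.
statement = attest_artifact("hello", b"\x7fELF...")
print(json.dumps(statement, indent=2))
```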

The Tracer Execution Flow: A Three-Phase Journey 🗺️

The Tracer prototype follows a three-phase build process:

  1. Enclave Initialization: Users create an SGX enclave using a configuration file (e.g., a TOML file) that specifies runtime and build parameters, triggering enclave setup.
  2. Runtime and File System Setup: A customized Wasm runtime is launched inside the enclave. It instantiates an in-memory file system, preloading necessary compilation files, in-toto tools, and compilers.
  3. Controlled Compilation: The syscall mediation layer blocks any calls other than file system calls, which are redirected to the in-memory file system. The in-toto tools then record all intermediate and final build artifacts, ensuring all steps remain confined within the trusted environment.

Currently, the Tracer prototype supports a tiny C compiler (tinycc) and can produce native Linux binaries. The TOML configuration file lets users define build pipeline parameters, making the process highly configurable.
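
As an illustration of how such a configuration might drive the pipeline, the sketch below parses a hypothetical TOML file with Python’s standard tomllib module; the section and field names are invented for this example and are not Tracer’s actual schema.

```python
import tomllib  # standard library since Python 3.11

# Hypothetical build configuration; section and field names are
# invented for illustration only.
CONFIG = """
[runtime]
engine   = "wasm"
heap_mib = 256

[build]
compiler = "tinycc"
sources  = ["hello.c"]
output   = "hello"
attest   = true
"""

def load_build_config(text: str) -> dict:
    """Parse the TOML configuration and return the parameters an
    enclave launcher would use to set up the runtime and pipeline."""
    cfg = tomllib.loads(text)
    return {
        "engine": cfg["runtime"]["engine"],
        "heap_mib": cfg["runtime"]["heap_mib"],
        "compiler": cfg["build"]["compiler"],
        "sources": cfg["build"]["sources"],
        "output": cfg["build"]["output"],
        "attest": cfg["build"].get("attest", False),
    }

params = load_build_config(CONFIG)
print(params)  # {'engine': 'wasm', 'heap_mib': 256, 'compiler': 'tinycc', ...}
```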

The Future of Secure Compilation 🌐

The Tracer prototype, while currently Intel SGX-specific, is in principle portable to other hardware-based trusted execution environments. This research represents a significant step towards a future where software compilation is not a weak link but a robust and auditable foundation for the digital world. By isolating and scrutinizing the build process, we can build more secure and trustworthy software for everyone.

What are your thoughts on securing the compilation pipeline? Let us know in the comments below! 👇
