Where Do Your Source Attestations Live? Navigating the Labyrinth of Metadata 🗺️

Hey tech enthusiasts! Ever felt like you’re drowning in a sea of metadata, wondering where exactly to stash those crucial source attestations? You’re not alone! In a recent lightning talk, Billy Lynch from Chainguard tackled this very question, exploring strategies for storing and discovering these vital pieces of information. Let’s break down the key takeaways and ponder the future of source attestation storage. 💡

What Exactly is an Attestation, Anyway? 🤔

Before we dive into where to put them, let’s quickly define what we’re talking about. An attestation is essentially a statement of metadata about an artifact, typically signed, made by an entity we choose to trust. Think of it as a certificate of authenticity for a piece of software. The crucial part? Its meaning hinges on who is making the assertion and whether we trust them.

When it comes to source code, what do we typically care about?

  • Who made the change?
  • Who reviewed the change?
  • What CI checks were performed on the commit? (e.g., vulnerability scans, end-to-end tests)

These are the breadcrumbs we want to track, and the question is, where do we store them so they’re easily discoverable and trustworthy?
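
To make those three questions concrete, here is a minimal sketch of what one such statement could look like as data. It borrows the general shape of an in-toto statement (a subject plus a typed predicate), but the predicate type URL and the field names inside `predicate` are made up for illustration and are not from the talk. The statement is also unsigned here; in practice the trust comes from wrapping it in a signature from the entity making the claim.

```python
import json


def build_source_statement(commit_sha, author, reviewers, checks):
    """Return an in-toto-style statement dict describing one commit."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # The subject identifies the artifact: here, a single git commit.
        "subject": [{"digest": {"gitCommit": commit_sha}}],
        # Hypothetical predicate type, invented for this illustration.
        "predicateType": "https://example.com/source-review/v0",
        "predicate": {
            "author": author,               # who made the change
            "reviewers": sorted(reviewers),  # who reviewed the change
            "checks": checks,               # which CI checks ran, and how they ended
        },
    }


statement = build_source_statement(
    commit_sha="0123456789abcdef0123456789abcdef01234567",
    author="alice@example.com",
    reviewers=["bob@example.com"],
    checks={"vulnerability-scan": "passed", "e2e-tests": "passed"},
)
print(json.dumps(statement, indent=2))
```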

Strategy 1: Leverage Your Existing Code Review Process 👨‍💻

This is often the most intuitive starting point.

  • The Upside: You’re likely already doing code reviews via pull or merge requests. These platforms already capture who proposed a change, who approved it, and which automated checks passed. Great news – you’re probably already generating some attestations!
  • The Downside:
    • Source Provider Specific: This method is heavily tied to your specific platform (e.g., GitHub, GitLab).
    • Data Limitations: You’re limited to the data your provider makes available.
    • Branch Limitations: For instance, GitHub pull request data mostly covers commits that were merged into the main branch through a pull request, so attestations can be missing for work-in-progress or unmerged branches.
    • API Constraints: You’re subject to API rate limits and quotas.

While not perfect, it’s a solid starting point and definitely better than nothing.
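
As a rough sketch of how review data could be turned into those answers, the function below condenses three JSON payloads into "who wrote it, who approved it, what passed". The sample payloads are hand-written and heavily trimmed; the field names are meant to mirror GitHub's REST API objects for pull requests, reviews, and check runs, so verify them against the API documentation before relying on them. Fetching the payloads (and dealing with rate limits) is deliberately left out.

```python
def summarize_pull_request(pr, reviews, check_runs):
    """Condense forge review data into the facts we care about for a commit."""
    # A set removes duplicates when one reviewer approved more than once.
    approvers = sorted(
        {r["user"]["login"] for r in reviews if r["state"] == "APPROVED"}
    )
    passed = sorted(
        c["name"] for c in check_runs if c["conclusion"] == "success"
    )
    return {
        "commit": pr["merge_commit_sha"],
        "author": pr["user"]["login"],
        "approvers": approvers,
        "passed_checks": passed,
    }


# Hand-written sample payloads (assumed shapes, trimmed to the fields used).
sample_pr = {"merge_commit_sha": "abc123", "user": {"login": "alice"}}
sample_reviews = [
    {"user": {"login": "bob"}, "state": "COMMENTED"},
    {"user": {"login": "bob"}, "state": "APPROVED"},
    {"user": {"login": "carol"}, "state": "APPROVED"},
]
sample_checks = [
    {"name": "e2e-tests", "conclusion": "success"},
    {"name": "vuln-scan", "conclusion": "failure"},
]

summary = summarize_pull_request(sample_pr, sample_reviews, sample_checks)
print(summary)
```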

Strategy 2: Embrace the Git Repository Itself 💾

Why not store attestations directly where the code lives? This approach often involves using mechanisms like git notes or dedicating a separate ref space within the repository. Tools like gitsign and the SLSA source tool are exploring this.

  • The Upside:
    • Decoupled from Forges: You’re no longer solely reliant on GitHub or GitLab’s specific APIs.
    • Leverages Existing Infrastructure: Git is already designed to store content, so we can reuse these proven mechanisms.
    • Formalization Efforts: There’s ongoing work to standardize how this information can be stored within source repositories.
  • The Downside:
    • Permissioning Challenges: Source code providers might not offer granular write permissions for specific refs. This means if someone has write access, they might have access to more than just attestation refs, which can be a security concern.
    • User Experience: Managing separate refs can sometimes require additional tooling to provide a seamless user experience, as Git typically operates on one ref at a time.
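
Here is a small sketch of the git notes variant, assuming attestations are stored as JSON under a dedicated notes ref. The ref name `attestations` and the helper names are hypothetical, and this is not how any particular tool lays things out. The helpers only build the `git` command lines; `run` executes one inside a repository.

```python
import json
import subprocess

NOTES_REF = "attestations"  # hypothetical name; lives at refs/notes/attestations


def add_note_cmd(commit_sha, attestation):
    """Command that attaches an attestation (as JSON) to a commit via git notes."""
    payload = json.dumps(attestation, sort_keys=True)
    # -f overwrites an existing note for this commit under the same ref.
    return ["git", "notes", f"--ref={NOTES_REF}", "add", "-f", "-m", payload, commit_sha]


def show_note_cmd(commit_sha):
    """Command that prints the note stored for a commit, if there is one."""
    return ["git", "notes", f"--ref={NOTES_REF}", "show", commit_sha]


def push_notes_cmd(remote="origin"):
    """Notes refs are not pushed by default, so they are pushed explicitly."""
    return ["git", "push", remote, f"refs/notes/{NOTES_REF}"]


def run(cmd, repo_dir):
    """Run one of the commands above inside repo_dir and return its stdout."""
    result = subprocess.run(cmd, cwd=repo_dir, check=True, capture_output=True, text=True)
    return result.stdout


print(" ".join(show_note_cmd("HEAD")))
```

The explicit push step hints at both downsides above: whoever runs it needs write access to the repository, and readers have to fetch the extra ref before they see anything.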

Strategy 3: The Simplicity of Storage Buckets ☁️

This is a straightforward approach: just put your attestation files into a designated storage bucket.

  • The Upside:
    • No Repo Access Needed: This is a completely separate system, meaning you don’t need write permissions on the repository itself. This is particularly useful for consuming third-party open-source projects where you might not have write access.
    • Flexibility: You can store arbitrary files, making it easy to associate attestations with specific commit SHAs (e.g., attestation/<commit-sha>/).
  • The Downside:
    • Discovery Nightmare: The biggest hurdle here is discovery. Without a standardized format, path structure, or agreement on where to find this metadata, it becomes incredibly difficult for clients to locate and consume these attestations at scale. While great for personal use, it doesn’t meet the bar for widespread open-source adoption.
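
A quick sketch of the commit-keyed layout, using a local directory as a stand-in for a real bucket. The `attestation/<commit-sha>/` prefix comes from the example above; the file names and helper functions are assumptions for illustration. The discovery problem shows up immediately: a client can only call `list_attestations` if it already knows the bucket and the path convention.

```python
import json
import tempfile
from pathlib import Path


def attestation_key(commit_sha, name):
    """Object key for one attestation file belonging to a commit."""
    return f"attestation/{commit_sha}/{name}"


def put_attestation(bucket_root, commit_sha, name, attestation):
    """Write an attestation under the commit's prefix and return its path."""
    path = Path(bucket_root) / attestation_key(commit_sha, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(attestation))
    return path


def list_attestations(bucket_root, commit_sha):
    """Return the attestation file names stored for a commit (empty if none)."""
    prefix = Path(bucket_root) / "attestation" / commit_sha
    if not prefix.is_dir():
        return []
    return sorted(p.name for p in prefix.iterdir())


with tempfile.TemporaryDirectory() as bucket:
    put_attestation(bucket, "abc123", "review.json", {"approved_by": "bob"})
    put_attestation(bucket, "abc123", "scan.json", {"result": "passed"})
    print(list_attestations(bucket, "abc123"))
```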

Strategy 4: API-Driven Attestation Services 🌐

To address the discovery problem, an API-centric approach makes sense. Imagine a standardized API where you can request attestations for a given artifact.

  • The Upside:
    • Uniformity and Discoverability: Standardized formats and API endpoints mean you know exactly where and how to find attestations.
    • Leveraging Existing Standards: Tools like cosign and the OCI Referrers API are paving the way by defining how attestations and signatures can be associated with OCI images.
    • Platform Integration: GitHub has taken steps in this direction with its attestation API, part of its Sigstore integration.
  • The Downside:
    • API Limitations: Currently, GitHub’s attestation API only allows identities from GitHub itself (e.g., GitHub Actions) to upload. This limits its utility for external entities.
    • Standardization is Hard: Creating new, universally adopted standards is a significant undertaking, and we’re not quite there yet for source code attestations.
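
To show why a fixed endpoint helps with discovery, the sketch below builds lookup URLs from nothing but an artifact's identity. The OCI path follows the referrers endpoint in the OCI distribution spec; the GitHub path reflects my understanding of its attestations REST endpoint and should be treated as an assumption to check against the docs. The registry, repository, and digest values are placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode


def oci_referrers_url(registry, repository, digest, artifact_type=None):
    """URL listing artifacts (signatures, attestations) that refer to an image digest."""
    url = f"https://{registry}/v2/{repository}/referrers/{digest}"
    if artifact_type:
        # Optional filter so the registry only returns one kind of referrer.
        url += "?" + urlencode({"artifactType": artifact_type})
    return url


def github_attestations_url(owner, repo, subject_digest):
    """Assumed path of GitHub's attestation lookup for a subject digest."""
    return f"https://api.github.com/repos/{owner}/{repo}/attestations/{subject_digest}"


# Placeholder values, not real artifacts.
digest = "sha256:" + "0" * 64
print(oci_referrers_url("registry.example.com", "team/app", digest))
print(github_attestations_url("example-org", "example-repo", digest))
```

Compare this with the bucket approach: the client needs no out-of-band knowledge beyond the digest and the host, which is the uniformity the upside above describes.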

The Future is Likely Hybrid 🤝

As Billy Lynch highlighted, we’re still in the early days of figuring out the best strategies for source attestation storage. The consensus seems to be that there likely won’t be a single, perfect solution.

  • AI’s Influence: The rapid advancements in AI are forcing different communities and experts to converge and discuss how their respective domains (like security and software development) can interconnect more effectively. This is driving a timely conversation around attestation storage.
  • Diverse Needs: For instance, security teams grappling with executive orders and SBOM requirements need a clear place to store and manage a variety of attestations, from signed PDFs to build logs.
  • Practical Realities: As came up in the Q&A, organizations like Chainguard that rebuild open-source software cannot write to third-party repositories, which makes a Git-repo-centric approach a non-starter for publishing attestations about code they consume. This reinforces the need for solutions like storage buckets or standalone APIs.

Ultimately, we’ll likely see a landscape where different methods coexist, catering to various use cases and levels of integration. The journey to a robust and discoverable attestation ecosystem is ongoing, but the conversation is happening, and that’s a fantastic step forward! ✨

Keep an eye on these developments – the way we secure and trust our software supply chains depends on it! 🚀
