Unifying the Chaos: How Veon Breto Group Mastered Workflows with Argo 🚀
Hey tech enthusiasts! Ever felt like your company’s workflows are a tangled mess of legacy systems, custom scripts, and a hundred different ways to do the same thing? If so, you’re not alone. John Keates from the Veon Breto Group, a major e-commerce retail player in the Netherlands and neighboring countries, knows this pain all too well. His recent talk peeled back the curtain on how his team tackled this colossal challenge, moving from workflow anarchy to a streamlined, GitOps-driven, Argo-powered paradise.
John, a platform engineering wizard deeply entrenched in DevOps culture and cloud architecture, highlighted a universal truth: working in e-commerce retail means a lot of software. From mobile apps and websites to middleware, back-office systems, and even warehouse logistics, the entire vertical slice is brimming with code and, consequently, a lot of deployments.
The Workflow Wild West: A Legacy Labyrinth 🤯
For the Veon Breto Group, the journey began with a typical scenario. Workflows, the sequences of steps needed to achieve a goal, were everywhere. They weren’t just for CI/CD; they powered everything from granting file share access to onboarding new team members and managing data pipelines. The problem? These workflows were scattered across a bewildering array of systems:
- Service Desk tools: Often GUI-driven, drag-and-drop.
- CI/CD systems: Like Jenkins, which John humorously described as an “amalgamation of five different languages,” often maintained by engineers who “left 10 years ago.”
- Data Engineering pipelines: Handling vast datasets across different systems.
- Even cron jobs and Windows schedulers!
Imagine being a new engineer joining the company. You would have “over 100 guesses” to figure out where a specific workflow lives, how it works, and how to influence it. This fragmentation led to a “firefighting” culture where issues were reacted to, not structurally solved. As John put it, “when something is everyone’s problem, it’s no one’s problem.”
Why Not the Obvious Choices? 🤔
Before embarking on their ambitious journey, John’s team explored common alternatives:
- AWS Code Products: The company already uses AWS. However, AWS was deprecating its code products around the time they were considering them, making this a non-starter.
- GitHub Actions: A popular choice, but it presented specific challenges for Veon Breto Group. It felt too focused on individual repository control and CI/CD, not the broader platform-level orchestration they needed. Their developers are “much more productive” with an opinionated, “golden path” approach where “9 out of 10 times” applications are built using standard components. Requiring everyone to master GitHub Actions and manage dependencies felt like a recipe for repeating past Jenkins-era problems.
Enter the Argoverse: A Unified Vision ✨
The solution, it turned out, was already partly in their hands: Argo CD. Their teams loved it, and it had proven its worth by replacing many other tools. The logical next step? “More Argo. Even more Argo!”
John’s team decided to embrace the full Argo suite:
- Argo CD: Their existing beloved GitOps deployment tool.
- Argo Events: The glue that binds everything together. It listens for events (from GitHub webhooks, HTTP endpoints, etc.), structures them into the well-defined CloudEvents format, and passes them to sensors. Even if they hadn’t used Argo Workflows, Argo Events would have been a game-changer for reliable event handling.
- Argo Workflows: The orchestrator for executing complex, multi-step tasks, replacing the fragmented legacy systems.
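To make the CloudEvents point concrete, here is a minimal Python sketch of what "structuring an event" means. It is not Argo Events code. It only shows the envelope shape defined by the CloudEvents 1.0 spec, and the push payload fields, the `source` value, and the `com.example.git.push` type are made-up placeholders.

```python
import json
import uuid
from datetime import datetime, timezone


def to_cloudevent(source: str, event_type: str, payload: dict) -> dict:
    """Wrap a raw payload in a CloudEvents 1.0 envelope (structured JSON mode)."""
    return {
        # The four attributes the CloudEvents 1.0 spec requires.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes that make the event easier to consume.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": payload,
    }


# Hypothetical push payload; the field names are illustrative only.
push = {"repository": "shop/checkout", "ref": "refs/heads/main", "after": "3f2a9c1"}
event = to_cloudevent("github/shop/checkout", "com.example.git.push", push)
print(json.dumps(event, indent=2))
```

Because every event carries the same envelope, a consumer can filter on `type` and `source` without knowing anything about the system that produced it.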
The “Argo” brand itself provided significant traction. When teams heard “Argo,” they were eager to pilot the new solution, unlike obscure tools that might elicit a “call us when it’s done” response.
Building the Golden Path: Design Principles 🛣️
A key challenge was ensuring developers wouldn’t become “certified professional YAML engineers.” The goal was to provide a “golden path” – a set of sensible defaults that “just works by default” most of the time. Customization is possible for non-standard cases (like mono-repos), but those developers understand they’re stepping off the paved road.
Their design philosophy centered on decoupling and reusability:
- Event Ingestion: Argo Events listens for incoming events (e.g., a Git commit).
- Dispatcher Workflow: A lightweight Argo Workflow acts as a dispatcher. It receives the event, determines the intent, and then triggers the actual task-specific workflow. This decoupling allows for testing new workflow versions without impacting defaults.
- Cluster Workflow Templates: Workflows are deployed as Cluster Workflow Templates via Argo CD. This allows teams to reference standardized, pre-built workflow steps from a central library. Developers only need to specify the desired step and its parameters, rather than writing the entire workflow from scratch. This is crucial for simplifying developer experience.
- Leveraging Existing Cloud-Native Tools: The platform integrates with tools developers already use and trust:
- Kubernetes: For running pods and containers.
- Prometheus: For monitoring.
- Karpenter: For dynamic capacity management.
- Istio: For traffic, mTLS, and identity.
- IAM: For injecting short-lived, secure credentials into pods, replacing hard-coded secrets.
- GitOps with CRDs: Custom Resource Definitions (CRDs) are used to validate YAML files in repositories, ensuring configurations adhere to standards before deployment.
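The dispatcher idea from the list above can be sketched in a few lines of Python. This is an illustration of the pattern, not the team's implementation: the event shape, the routing table, the template names (`ci-build`, `release`), and the per-repository override mechanism are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Dispatch:
    template: str  # name of the task-specific template to run
    parameters: dict = field(default_factory=dict)


# Hypothetical routing table: event type -> default template name.
DEFAULT_TEMPLATES = {
    "git.push": "ci-build",
    "git.tag": "release",
}


def dispatch(event: dict, overrides: Optional[dict] = None) -> Dispatch:
    """Decide which workflow template should handle an incoming event."""
    event_type = event["type"]
    repo = event["data"]["repository"]
    # A per-repository override lets one team pilot a new template version
    # while everyone else stays on the default.
    template = (overrides or {}).get(repo) or DEFAULT_TEMPLATES.get(event_type)
    if template is None:
        raise ValueError(f"no workflow registered for event type {event_type!r}")
    return Dispatch(template, {"repo": repo, "ref": event["data"]["ref"]})


event = {
    "type": "git.push",
    "data": {"repository": "shop/checkout", "ref": "refs/heads/main"},
}
print(dispatch(event).template)                                     # ci-build
print(dispatch(event, {"shop/checkout": "ci-build-v2"}).template)   # ci-build-v2
```

Keeping the routing decision separate from the task itself is what makes it possible to try a new workflow version on one repository without touching the default path.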
Under the Hood: How Argo Makes It Happen 🛠️
Argo Events can provision the GitHub webhooks itself. When an event such as a GitHub push comes in, the event source places it on the event bus, structured as a CloudEvent. A sensor, configured to react to specific events, then triggers an Argo Workflow.
This workflow might be a simple “dispatcher” that analyzes the event and then launches a more complex, task-specific workflow using a Cluster Workflow Template. These templates abstract away the complexity. Developers simply reference a named template, provide a few parameters, and the system handles the execution. For example, a CI workflow might just specify “build this ref” and the underlying template handles all the Docker containerization, testing, and artifact creation.
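As a rough illustration of how little a developer has to supply, the Python sketch below builds a Workflow manifest that only names a template and passes parameters. The `workflowTemplateRef` field with `clusterScope: true` is how Argo Workflows points a Workflow at a ClusterWorkflowTemplate. The template name `ci-build` and the parameter names are hypothetical, and the talk did not show the team's actual manifests.

```python
import json


def workflow_from_template(template: str, parameters: dict) -> dict:
    """Build a Workflow manifest that delegates to a ClusterWorkflowTemplate."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"{template}-"},
        "spec": {
            # clusterScope: true tells Argo to resolve a ClusterWorkflowTemplate
            # rather than a namespaced WorkflowTemplate.
            "workflowTemplateRef": {"name": template, "clusterScope": True},
            "arguments": {
                "parameters": [
                    {"name": name, "value": str(value)}
                    for name, value in parameters.items()
                ]
            },
        },
    }


manifest = workflow_from_template(
    "ci-build", {"repo": "shop/checkout", "ref": "refs/heads/main"}
)
print(json.dumps(manifest, indent=2))
```

Everything about how the build actually runs lives in the referenced template, so the platform team can change it in one place.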
For feedback and notifications (e.g., a failed build), Argo Workflows can emit events back onto the bus, which other sensors can pick up to send messages to Slack, emails, or update Grafana dashboards.
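The feedback loop can be pictured as a small routing table outside the workflow. The Python sketch below is a conceptual model only: the event shape, the handler functions, and the routing rules are invented for the example, and in the real setup this role is played by Argo Events sensors rather than Python code. The phase names (`Failed`, `Error`, `Succeeded`) are the ones Argo Workflows reports.

```python
def slack_message(event: dict) -> str:
    return f"[slack] {event['workflow']} finished with phase {event['phase']}"


def dashboard_update(event: dict) -> str:
    return f"[dashboard] {event['workflow']}={event['phase']}"


# Each route pairs a predicate with the action to take when it matches.
ROUTES = [
    (lambda e: e["phase"] in ("Failed", "Error"), slack_message),
    (lambda e: True, dashboard_update),  # every run updates the dashboard
]


def route(event: dict, routes=ROUTES) -> list:
    """Return the notifications a workflow event should produce."""
    return [handler(event) for matches, handler in routes if matches(event)]


failed = {"workflow": "ci-build-x7k2p", "phase": "Failed"}
ok = {"workflow": "ci-build-q9m4z", "phase": "Succeeded"}
print(route(failed))
print(route(ok))
```

Adding a new notification channel then means adding a route, not editing every workflow.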
Navigating the Pitfalls: Lessons Learned 💡
John shared some crucial insights from their journey:
- Avoid Over-Engineering: It’s tempting to model every tiny decision point in a flowchart as a separate node in an Argo Workflow Directed Acyclic Graph (DAG). It is fun to build, but the result is an overly complex, hard-to-read, and potentially slow workflow.
- Avoid Under-Engineering: The opposite extreme – a single node doing everything – is equally problematic. You lose all visibility into what happened, making debugging a nightmare. The sweet spot provides enough detail for visibility without excessive complexity.
- Performance vs. Autonomy: While a 30-second workflow is faster than a 3-minute one, for autonomous runs (like a Git commit triggering a build), the user often doesn’t care about the micro-optimization. Focus on reliability and clear information over raw speed in such cases.
- Leverage Workflow Events: Don’t try to cram every notification or system integration into the DAG itself. Argo Workflows emit events, which Argo Events can then use to trigger external actions (like notifications or integrating with “90s APIs” of internal systems like Active Directory).
- Executor Plugins: John highly recommends using executor plugins for Argo Workflows to enhance functionality and simplify tasks.
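To illustrate the granularity trade-off from the first two lessons, here is a small Python sketch of a middle-ground DAG: one node per stage a developer would want to see pass or fail. The stage names are hypothetical, and the standard-library `graphlib` module stands in for the workflow engine purely to show which stages could run together.

```python
from graphlib import TopologicalSorter

# Each key is a stage; the set holds the stages it depends on.
# Five nodes: enough to see where a run failed, few enough to read at a glance.
ci_dag = {
    "checkout": set(),
    "build": {"checkout"},
    "unit-tests": {"build"},
    "image-scan": {"build"},
    "publish": {"unit-tests", "image-scan"},
}

sorter = TopologicalSorter(ci_dag)
sorter.prepare()

waves = []  # stages grouped by when they become runnable
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)

for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Collapsing all five stages into one node would hide which stage failed, while splitting each stage into a dozen decision nodes would bury that same information in noise.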
The Impact: A Brighter Future for Workflows 🌟
By adopting Argo CD, Argo Events, and Argo Workflows, Veon Breto Group is transforming its operational landscape. They are moving away from a reactive, fragmented approach to a proactive, unified, and stable platform. This not only significantly reduces the cognitive load and maintenance burden on the 8 engineers managing the tools but also empowers their 100 engineers to be more productive.
The result is a reliable, auditable, and developer-friendly workflow system that can handle “over 100 releases a day” without breaking the bank on per-minute cloud services. This strategic investment in a unified, GitOps-driven platform ensures Veon Breto Group can continue to innovate and deliver value, one streamlined workflow at a time. It’s a testament to the power of thoughtful platform engineering and the immense potential of the Argo ecosystem!