I Finally Automated My SDLC With AI Agents (And It Mostly Just Worked)

A couple of months ago I read about agentic workflows for software development—the idea of giving AI agents enough context and rules to handle entire chunks of your SDLC, not just autocomplete. It sat in my "try this later" pile until last week, when I finally had the time to set it up properly.

The setup: one file to rule them all

The core of the approach is deceptively simple: a standardized agent context file. I used to roll my own version of this—a CONTEXT.md I'd hand-write per project—but I've since switched to the emerging AGENTS.md standard, which is cross-platform and increasingly recognized across different agent tools.

Mine includes a project brief (what the codebase does, key architecture decisions, conventions) plus operating rules—how branches should be named, commit message formats, PR structure, that kind of thing. Basically everything I'd tell a new contractor on day one, written down once so the agent doesn't have to guess or get it wrong.
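To make the "operating rules" idea concrete, here is a minimal Python sketch of how rules like these could also be checked mechanically. The post does not spell out the actual conventions, so the branch pattern (`type/issue-number-slug`) and the commit subject pattern (`type(scope): summary`) are assumptions for illustration, as are the function names.

```python
import re

# Hypothetical conventions for illustration only; the real rules live in
# the project's own context file and may look quite different.
BRANCH_RE = re.compile(r"(feature|fix|chore)/\d+-[a-z0-9]+(?:-[a-z0-9]+)*")
COMMIT_RE = re.compile(
    r"(feat|fix|docs|chore|refactor|test)(\([a-z0-9-]+\))?: \S.{0,71}"
)


def check_branch(name: str) -> bool:
    """Return True if the branch name matches the assumed convention."""
    return BRANCH_RE.fullmatch(name) is not None


def check_commit_subject(subject: str) -> bool:
    """Return True if the commit subject line matches the assumed format."""
    return COMMIT_RE.fullmatch(subject) is not None
```

A check like this could run in a pre-push hook or CI job, so a rule the agent misses is caught by a deterministic check instead of by review.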

On top of that, I built out a structured workflow—using Cascade, the local agent in Windsurf (now rebranded as Devin Desktop by Cognition)—covering the full lifecycle. Instead of relying on loose chat prompts, I modularised the entire execution pipeline into declarative steps:

  • Create a GitHub ticket from a feature request or bug description

  • Create a branch following the naming convention

  • Spin up a work-in-progress (WIP) doc the agent references while building

  • Commit incrementally with consistent message formatting

  • Open a PR referencing the correct issue

  • Final cleanup pass before merge

To enforce this, I defined each of these steps as its own explicit configuration, kept in the repository's local config files instead of in ad hoc chat prompts.
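The six steps listed above could be modelled declaratively along these lines. This is a sketch in Python, not the format Cascade or Windsurf actually uses; the `Step` class, the step names, and the `next_step` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Step:
    """One declarative pipeline step (hypothetical structure)."""
    name: str
    instruction: str
    requires: Tuple[str, ...] = ()


# The steps mirror the list in the post; names and wording are illustrative.
PIPELINE = (
    Step("create_ticket", "Create a GitHub ticket from the request."),
    Step("create_branch", "Create a branch following the naming convention.",
         ("create_ticket",)),
    Step("start_wip_doc", "Start a WIP doc to reference while building.",
         ("create_branch",)),
    Step("commit", "Commit incrementally with consistent messages.",
         ("start_wip_doc",)),
    Step("open_pr", "Open a PR referencing the correct issue.", ("commit",)),
    Step("cleanup", "Run a final cleanup pass before merge.", ("open_pr",)),
)


def next_step(done: Set[str]) -> Optional[Step]:
    """Return the first unfinished step whose prerequisites are all done."""
    for step in PIPELINE:
        if step.name not in done and all(r in done for r in step.requires):
            return step
    return None
```

Keeping prerequisites explicit means the order is enforced by data, not by whatever the agent happens to remember from the prompt.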

The point wasn't to remove me from the loop—it was to remove the repetitive scaffolding from the loop. Ticket creation, branch naming, PR descriptions: all the stuff that's necessary but not where my brain adds value.

Does it work?

Mostly, yes. Once the agent context file was in place, the agent consistently followed naming conventions and produced PRs that looked like I wrote them, structurally speaking.

The WIP doc turned out to be more useful than expected. LLMs are stateless between calls, so everything the agent "remembers" has to fit in its context window, and on long multi-step tasks earlier instructions and decisions tend to get diluted as that context fills up. The WIP doc acts as externalized short-term memory: a running scratchpad the agent can write to and re-read mid-task, which cut down on it losing the thread on longer pieces of work.
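The scratchpad idea can be sketched in a few lines of Python. The file layout (markdown sections appended to a single file) and the helper names `append_note` and `recent_notes` are assumptions for illustration; in practice the agent edits the doc with its own file tools.

```python
from pathlib import Path


def append_note(wip_path: Path, heading: str, note: str) -> None:
    """Append a dated-free markdown section to the WIP doc (creates the file)."""
    with wip_path.open("a", encoding="utf-8") as f:
        f.write(f"## {heading}\n{note.strip()}\n\n")


def recent_notes(wip_path: Path, max_chars: int = 2000) -> str:
    """Return the tail of the WIP doc, capped so it stays cheap to re-read."""
    if not wip_path.exists():
        return ""
    text = wip_path.read_text(encoding="utf-8")
    return text[-max_chars:]
```

Capping what gets read back is a deliberate choice in this sketch: the point of the doc is to counter context dilution, so re-injecting the whole history on every step would work against it.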

It's not magic. I still review everything, and there's a real setup cost in writing a context file that's detailed enough to be useful without being so long the agent ignores half of it. But the repetitive, low-judgment parts of my workflow are now genuinely automated.

Bonus observations

Two things stood out while I was building this that go beyond my specific setup.

The rebrand landed mid-project: Right as I was getting this workflow dialed in, Windsurf rebranded to Devin Desktop and introduced Devin Local as the new default agent, with Cascade kept around as a legacy option for just a bit longer. I tried switching my workflow over, but Devin Local felt noticeably slower in practice than Cascade had been, so I've stuck with Cascade for now while it's still supported.

The bigger play: What's interesting is where Devin Desktop seems to be heading. It doesn't feel like it's trying to be the agent—it feels like it's positioning itself as a command center to coordinate multiple models/agents, picking the right one for the job rather than locking you into one.

That's a similar direction to what Ona (formerly Gitpod) is doing with their asynchronous background environments—an aside, but worth watching. If that's the trajectory, the agent context file approach becomes even more valuable: it's portable infrastructure that isn't tied to whichever agent happens to be fashionable this quarter.

What's next

I'm continuing to tweak these workflows to cover more edge cases—and ideally make them self-healing by passing deterministic validation/compiler feedback directly back into the execution loop, so the agent can recover from its own mistakes rather than getting stuck and waiting for me.
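One way such a feedback loop could look, as a rough Python sketch: run a deterministic check, hand its output to the agent, and retry a bounded number of times. The `attempt_fix` callback is a hypothetical stand-in for "ask the agent to fix this"; nothing here is a real Cascade or Devin API.

```python
import subprocess
import sys
from typing import Callable, List, Tuple


def run_validator(cmd: List[str]) -> Tuple[bool, str]:
    """Run a deterministic check (tests, linter, compiler) and capture output."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


def repair_loop(
    validate: Callable[[], Tuple[bool, str]],
    attempt_fix: Callable[[str], None],  # hypothetical hook into the agent
    max_attempts: int = 3,
) -> bool:
    """Validate, feed failures back to the fixer, and retry up to a limit."""
    for _ in range(max_attempts):
        ok, feedback = validate()
        if ok:
            return True
        attempt_fix(feedback)
    # Check once more after the last fix attempt before giving up.
    ok, _ = validate()
    return ok
```

A call might look like `repair_loop(lambda: run_validator([sys.executable, "-m", "pytest", "-q"]), attempt_fix=send_to_agent)`, where `send_to_agent` is again a placeholder and pytest is assumed to be installed. The attempt cap matters: without it, an agent that cannot fix the problem would loop forever when it should be escalating to a human.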

I'm also exploring more of Devin Desktop's other capabilities, while keeping an eye on competing tools to see how the multi-agent coordination space shakes out.

The longer-term vision: my role shifts toward setting the vision, owning the roadmap, and making the key architectural calls, with the agents handling execution underneath.
