How I Use Agentic AI for Development — A Workflow That Actually Holds Up in Production
There’s a version of “using AI for development” that most developers are familiar with: tab-completing a function, asking it to explain an error, maybe generating a boilerplate class. That’s where I started too. Nearly a year later, I barely recognise that workflow. What I do today is fundamentally different — not because the tools changed (though they did), but because my mental model of what AI is in the development process changed.
This isn’t a piece about AI hype. It’s a detailed, honest account of how I actually use agentic AI — primarily Claude and GitHub Copilot — to build backend systems and data/ML pipelines. Including the parts that are still hard.
Where It Started: Autocomplete and One-Off Questions
Like most developers, I started using AI in coding almost passively. GitHub Copilot sat in my editor and suggested completions. I’d occasionally paste a function into a chat window and ask what was wrong with it. Useful, occasionally impressive, but ultimately just a smarter Stack Overflow.
The shift wasn’t a single moment. It was gradual: a slow, accumulating realisation that the more context and structure I gave the AI, the more it could actually carry. At some point I stopped thinking of it as a lookup tool and started thinking of it as a collaborator that needed to be briefed properly.
That reframing changed everything.
The Workflow Today: From Idea to Compounded Knowledge
Here’s what the end-to-end process looks like now for any meaningful feature.
1. Brainstorming Through an Interview
I don’t start by writing a spec. I start by asking Claude to interview me about the feature.
This is deceptively powerful. Instead of me trying to articulate a half-formed idea into a structured document, the AI pulls the requirements out of me through questions. What problem are we solving? Who uses it? What are the edge cases? What are the constraints? What have we tried before?
By the end of that conversation, the shape of the feature is clear — not because the AI designed it, but because being asked good questions forced me to think it through properly.
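A concrete example helps here. The opening prompt is nothing elaborate; something along these lines works (the wording below is illustrative, not a fixed template I copy verbatim):

```
I want to build [feature] for [system]. Before we write any plan or
code, interview me about it. Ask one question at a time about the
problem, the users, the edge cases, the constraints, and anything
we've tried before. Keep going until you could write the spec yourself.
```

The "one question at a time" constraint matters: without it, the model tends to dump a checklist of questions up front instead of following the thread of the answers.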
2. Planning the Feature
With the brainstorm as context, I ask Claude to produce a structured implementation plan. This includes the breakdown of components, the sequence of development, the data flow, the test strategy, and — for larger features — a phased checklist that will later be used to track progress.
This plan is not final. It’s a first draft that I review critically.
3. Reviewing and Deepening the Plan with Parallel Research
After the initial plan is reviewed, I ask Claude to deepen it using multi-agent parallel research — essentially running concurrent threads of investigation across different aspects of the problem. For backend systems, this might mean simultaneously researching the best approach to a data schema, the right library for a pipeline stage, and the failure modes of a particular architecture pattern.
The output of this research feeds back into a revised, more detailed plan. I review it again. Only once I’m satisfied with the plan do I move to implementation.
This layered review — plan, then deepen, then review again — catches a surprising number of architectural issues before a single line of code is written.
4. Implementation and Testing
With a solid plan in place, I ask Claude to implement and test the feature. For large features with phased checklists, it marks each phase as completed in the planning document as it goes. This keeps both of us oriented on progress and makes it easy to resume after interruptions.
This is where GitHub Copilot becomes active in parallel — handling the in-editor completions and smaller in-context decisions while Claude manages the higher-level execution.
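For larger features, the phased checklist in the planning document is just a plain list that gets ticked off as work lands. A hypothetical excerpt (the phase names are invented for illustration):

```
Phase 1: schema migration and models        [done]
Phase 2: ingestion pipeline stage           [done]
Phase 3: API endpoint and validation        [in progress]
Phase 4: integration tests and load check   [ ]
```

Keeping this in the planning file rather than in chat history is what makes it survive interruptions: a fresh session can read the document and pick up at the right phase.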
5. Manual Review — And Why This Step Is Non-Negotiable
Here’s something I want to be direct about: AI-generated code requires rigorous review. It is not safe to assume correctness.
In practice, code reviews on AI-assisted features take longer than they would on purely human-written code — precisely because the volume of output is higher and the AI can confidently produce code that is subtly wrong. The speed at which AI writes code does not translate linearly into faster shipping, because the review burden scales with it.
What this means in practice: I test manually, I trace through logic carefully, and I treat the AI’s output as a strong first draft from a very fast junior engineer — not as a finished product. The discipline of this review step is what separates useful AI-assisted development from dangerous over-reliance.
6. The Compounding Layer — Where the Real Leverage Is
This is the part of my workflow that I think most developers skip, and it might be the most valuable part.
Once a feature is built and verified, I ask Claude to go back through everything we did — the brainstorm, the plan, the implementation decisions, the mistakes, the iterations — and document the learnings. This includes:
What approaches worked and why
What went wrong and what the correct approach turned out to be
Patterns specific to this codebase that the AI should know for next time
Dos and don’ts for future features in the same area
This document lives alongside the codebase. The next time I work on a related feature, I load these learnings into context at the start.
The effect is compounding. Each feature makes the AI’s work on the next feature more accurate, more codebase-aware, and less prone to the same class of mistakes. It also serves as living documentation — not written as an afterthought, but derived from what actually happened.
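To make this less abstract, here is the shape such a learnings document tends to take. The entries below are invented for illustration; real ones are specific to the codebase:

```
Feature: export pipeline

Worked:
- Streaming the export in chunks avoided the memory spike we hit
  with the first full-load approach.

Went wrong:
- First attempt assumed timestamps were UTC; they weren't. Correct
  approach: normalise at ingestion, never at read time.

Codebase patterns:
- All pipeline stages take and return the same envelope type; don't
  introduce ad-hoc dicts between stages.

Don't:
- Don't regenerate the schema file by hand; it's derived from models.
```

Loaded at the start of the next related feature, a file like this steers the AI far more effectively than any generic instruction to "follow best practices".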
What This Workflow Is Not
It’s not autonomous. At every stage — brainstorming, planning, implementation, review — I am involved and making judgement calls. The AI does not ship features. I do. The AI dramatically expands what I can think through and execute in a given window of time, but the accountability and the critical thinking stay with me.
It’s not always faster. Depending on the feature, AI-assisted development can feel slower than just writing the code myself, especially when the review cycle catches things that need rework. The value is not raw speed — it’s the quality of thinking that goes into the feature before implementation begins, and the knowledge that accumulates afterward.
What It Requires From You
To use AI this way, you need to be a capable enough engineer to know when it’s wrong. You need to review code, not just run it. You need to give it structured context, not lazy prompts. And you need to resist the temptation to skip the review step because the output looks right.
The developers who get the most out of agentic AI are not the ones who use it the most. They’re the ones who use it most deliberately.
Where This Is Heading
The tools are improving rapidly. Multi-agent coordination is getting more capable. Context windows are expanding. What requires careful orchestration today will likely become more fluid.
But the underlying discipline — plan before you build, review what you ship, and compound what you learn — will remain the difference between AI that accelerates good engineering and AI that accelerates the production of bugs.
That part is still on us.
