What 35% Faster Actually Looks Like
Putting Claude, Cursor, and Copilot in front of 50 engineers didn't make us faster on day one. The number came later — and it came from changing how we work, not what we type.
Editor’s note (draft): A starting draft written from Vincent’s experience to seed the blog. Review it, rewrite it in your own words, and delete this note before publishing.
When people hear that we cut development cycle time by 35% after rolling out AI tooling, they assume there was a moment — a launch, a license purchase, a Friday afternoon where everyone installed Cursor and Monday came in 35% faster. There wasn’t. The tools were the easy part. The 35% came from everything around them.
Here’s what actually happened, and what I’d tell any engineering leader about to do the same thing.
The first month, productivity went down
This is the part nobody puts in the case study. The first few weeks after we gave the org real access to Claude, Cursor, and Copilot, our throughput dipped. Engineers were learning to prompt. They were over-trusting completions in some places and ignoring genuinely good suggestions in others. Code review got slower because reviewers didn’t yet know how to read a diff that was 60% model-authored.
If you measure on week two and panic, you’ll kill the initiative right before it pays off. We told everyone up front: this is an investment with a J-curve. Expect to be worse before you’re better. Naming that in advance bought us the patience to get to the other side.
The tools don’t create the gains. The workflow does.
A model that writes code faster only helps if writing code was your bottleneck. For most teams, it isn’t. The bottleneck is everything around writing code: understanding the existing system, writing the tests, getting through review, untangling the flaky CI run, writing the third revision of the migration plan.
So we stopped framing AI as “autocomplete on steroids” and started pointing it at the connective tissue:
- LLM-assisted review — a first pass that catches the obvious before a human spends attention on it, so reviewers review judgment, not typos.
- Automated test generation — not to replace thinking about edge cases, but to get from “I have a plan for the tests” to “the tests exist” in minutes.
- Documentation on demand — the intelligent summary of a subsystem nobody has touched in two years, generated the moment someone has to touch it.
- AI pair programming — for the lonely, high-context work where a rubber duck that can actually read the codebase changes the pace of thought.
None of those is “the model writes the feature.” All of them remove a tax that compounds across 50 people.
Standards are the unlock, not the constraint
The instinct when you hand powerful tools to a large team is to lock them down. We did the opposite — but we were specific. We established prompt-engineering standards and AI tooling guidelines, and we trained the org on what good collaboration with a model actually looks like.
Concretely, that meant answering questions like: When is a model-generated test trustworthy enough to merge? What context do you owe a reviewer when a change is largely AI-authored? Where is the line where you stop prompting and start thinking? Codifying those answers turned 50 people improvising into 50 people improving the same way — which is the only way a number like 35% shows up at the org level instead of for one enthusiastic early adopter.
The goal was never to get individuals to feel faster. It was to make the system faster — predictably, measurably, for everyone, including the engineers who were skeptical.
What I’d measure
Cycle time, not lines of code. Lines of code is the vanity metric that AI makes meaningless: of course you can generate more of it. We watched the things that actually correlate with shipping: time from first commit to merged PR, review latency, and the rate of changes that had to be reverted. The last one matters most. Speed that buys you a wave of regressions isn’t speed; it’s debt with a higher interest rate. Our revert rate held steady while cycle time dropped. That’s the result I’m actually proud of.
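If you want to track the same three numbers, here is a minimal sketch of how they could be computed from pull request records. It is not the tooling we used; the `PullRequest` fields, the `summarize` and `percent_change` helpers, and the choice of the median are all assumptions made for illustration, and you would need to populate the records from whatever your Git host exports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class PullRequest:
    # Hypothetical record shape; map these from your own Git host's data.
    first_commit_at: datetime
    review_requested_at: datetime
    first_review_at: datetime
    merged_at: datetime
    reverted: bool = False


def hours(delta: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def summarize(prs: list[PullRequest]) -> dict[str, float]:
    """Median cycle time, median review latency, and revert rate for merged PRs."""
    if not prs:
        raise ValueError("need at least one merged PR")
    return {
        # Time from first commit to merged PR.
        "median_cycle_hours": median(
            hours(p.merged_at - p.first_commit_at) for p in prs
        ),
        # Time a PR waits between the review request and the first review.
        "median_review_latency_hours": median(
            hours(p.first_review_at - p.review_requested_at) for p in prs
        ),
        # Share of merged changes that later had to be reverted.
        "revert_rate": sum(p.reverted for p in prs) / len(prs),
    }


def percent_change(before: float, after: float) -> float:
    """Relative change from a baseline; negative means the metric dropped."""
    return (after - before) * 100 / before
```

Run `summarize` over a baseline window and a later window of the same length, then compare the two with `percent_change`. A drop in cycle time only counts if `revert_rate` stays flat across the same windows. The median is used here because a handful of long-lived PRs can distort a mean; that is a design choice, not a requirement.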
The part that surprised me
The biggest shift wasn’t velocity. It was where senior engineers spent their attention. When the model absorbs the mechanical work, the scarce resource becomes judgment — architecture, tradeoffs, knowing which problem is worth solving. The best people on the team got more leveraged, not less relevant. That’s the version of this future I want to build toward: not engineers replaced by models, but engineers freed to do the part that was always the actual job.
Thirty-five percent is a real number. But it’s a lagging indicator of a culture change, not a tool you can buy. Buy the tools anyway. Then do the harder work.