
The builder's new superpower
Across the N47 portfolio and in conversations with teams at foundation model labs, I keep hearing the same thing: software development is changing fundamentally, and faster than anyone predicted.
The builders I talk to are already living inside this change, figuring out what their roles look like on the other side.
I recently had a terrific conversation about this with Justin McCarthy, CTO of StrongDM, who has been both a thought leader and an action leader on making the software factory real. His three-person team, formed in July 2025, had functioning demos of a working system within three months.
His team's guiding question captures the shift perfectly: "Why am I doing this? The model should be doing this instead."
That question should haunt every builder right now. In the best possible way.
Three eras compressed into two years
Cursor's CEO Michael Truell recently mapped the progression of AI-assisted development into three eras.
- The first was tab autocomplete: helpful, but still the developer's hands on the keyboard.
- The second brought synchronous agents, where you'd prompt and direct AI through a conversation: useful, but you were still in the loop at every step.
- The third era, arriving now, is autonomous cloud agents that tackle larger tasks independently, iterate and test on their own, and return with reviewable artifacts: not diffs to approve line by line, but logs, video recordings, and live previews.
The developer's job has now shifted from guiding each line of code to defining the problem and setting review criteria.
The data inside Cursor tells the story. Agent users grew 15x in a single year. Thirty-five percent of Cursor's own merged PRs now come from agents operating autonomously in cloud VMs.
The tab era lasted two years. The synchronous agent era may not last one.
Code as commodity output
What McCarthy is doing at StrongDM takes this to its logical conclusion. His software factory treats generated code the way ML engineers treat model weights: as opaque outputs whose correctness is inferred from externally observable behavior. You don't review the code. You validate whether the software satisfies real user scenarios.
This is a philosophical break from decades of software engineering orthodoxy. Reading and writing code has been the bedrock skill since the industry began. McCarthy's team has deliberately abandoned that assumption.
Their system replaces traditional tests with end-to-end user scenarios stored outside the codebase, so agents can't game the validation. A Digital Twin Universe provides behavioral clones of services like Okta, Jira, and Slack, enabling thousands of validation runs per hour. And a probabilistic satisfaction metric replaces boolean pass/fail testing: "Of all observed trajectories, what fraction likely satisfy the user?"
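To make the satisfaction metric concrete, here is a minimal sketch in Python. It is an illustration, not StrongDM's implementation: the `Trajectory` structure, the per-run `satisfaction_score`, and the 0.5 threshold are all assumptions made for this example. It assumes each observed run of a scenario has already been given a score between 0 and 1 by some judge, and reports the fraction of runs judged likely to satisfy the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trajectory:
    """One observed run of a user scenario (hypothetical structure)."""
    scenario: str
    satisfaction_score: float  # assumed judged likelihood, in [0, 1], that the user is satisfied


def satisfaction(trajectories, threshold=0.5):
    """Return the fraction of observed trajectories judged likely to satisfy the user.

    The threshold is an assumption for this sketch: a trajectory counts as
    "likely satisfying" when its score is at or above it.
    """
    if not trajectories:
        raise ValueError("no trajectories observed")
    likely = sum(1 for t in trajectories if t.satisfaction_score >= threshold)
    return likely / len(trajectories)


# Illustrative scores only; these are not real measurements.
runs = [
    Trajectory("reset password", 0.9),
    Trajectory("reset password", 0.4),
    Trajectory("invite teammate", 0.8),
    Trajectory("invite teammate", 0.7),
]
print(satisfaction(runs))  # 0.75
```

Unlike a boolean test suite, the result is a rate: three of the four example runs clear the threshold, so the metric reads 0.75 rather than "fail".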
McCarthy's practical benchmark is provocative: "If you haven't spent at least $1,000 on tokens today per human engineer, your software factory has room for improvement."
The full stack that wins
Models are commoditizing fast, and code is becoming a commodity output.
But commodity inputs don't produce commodity outcomes. The builders who win in the factory era will stack three things: workflow ownership, proprietary data, and deep domain knowledge. Miss any one of these, and you're building on sand.
Workflow ownership
The software factory is the workflow layer, taking on the specialist grunt work that holds up everything downstream. I see the same pattern at Aurasell, one of our portfolio companies: AI agents handle the grind, so the team focuses entirely on customers. The factory applies that logic to development itself. Agents handle the code; builders focus on what the product needs to become.
Proprietary data
The data behind the task is what completes it accurately. StrongDM's Digital Twin Universe is exactly this: a proprietary data layer that lets agents validate against realistic behavioral clones. Without it, agents are guessing. With it, they converge on correctness.
Deep domain knowledge
Without deep domain knowledge, you can't know if the output is even right. The factory demands specifications precise enough for agents to execute against, scenarios that capture what real users care about, and the taste to spot when output is subtly wrong. That's the layer that can't be automated away.
Model companies see all three clearly, which is why they're racing to forward-integrate into white-collar work.
Conclusion
At N47, we've always believed that the product reveals everything about a builder's capabilities. Their judgment, their conviction, their ability to solve problems that matter. The software factory amplifies this belief.
When every team can generate code at near-zero marginal cost, what separates them is the quality of thinking about what gets built. The best builders have always cared more about whether the product solves a real problem well than about the volume of code shipped.
The factory makes that truth visible to everyone.
The biggest engineering teams won't win this era. The sharpest product thinkers will.


