The architecture, the vocabulary, the five-level framework, and why November 2025 was the inflection point that made this more than a thought experiment.
In early 2026, a small company called StrongDM revealed that three engineers had been building production software since July 2025 under two non-negotiable rules: code must not be written by humans, and code must not be reviewed by humans. After seven months operating under these constraints, they had shipped 16,000 lines of Rust, 9,500 lines of Go, and 6,700 lines of TypeScript, a complete three-layer production system. None of it was written by a person. Almost none of it was reviewed by one.
At roughly the same time, BCG Platinion reported that Spotify engineers had not written a single line of code since December 2025, and that the company was merging 650 AI-generated pull requests per month and had cut large-scale migration time by 90%. Separately, Anthropic acknowledged that 90% of the codebase powering Claude Code was written by Claude Code itself.
These are not demonstrations. They are production realities. And they represent a pattern with a name: the AI dark factory.
The Manufacturing Metaphor, and Why It Matters
The term borrows directly from industrial manufacturing, where a dark factory is a fully automated production facility that runs with the lights off because no human is present to need them. The most cited example is FANUC’s robotics plant in Oshino, Japan, which has been building robots using robots since 2001, running unsupervised for up to thirty days at a stretch.
The AI dark factory applies the same logic to software development. A specification goes in. Software comes out. The pipeline, not a human developer, handles everything in between: planning, implementation, testing, debugging, and in some configurations, deployment. The “lights” are off in the sense that no developer is watching each step as it happens.
The name is catching on precisely because it is visceral in a way that clinical alternatives are not. “Agentic development” is accurate but bloodless. “Dark factory” makes you feel the weight of what is actually changing.
The Five Levels: Where Most Organizations Actually Are
Dan Shapiro, CEO of Glowforge, published a framework in January 2026 that maps AI-assisted development onto five levels, borrowing from the automotive taxonomy for self-driving vehicles. It has become one of the most commonly referenced frameworks for positioning where an organization sits and what the path forward looks like.
0 – Spicy Autocomplete
AI suggests the next line or block as you type. The human drives every decision. The AI is a sophisticated completion engine. This describes most developers’ first contact with tools like GitHub Copilot in 2022–2023.
1 – AI-Assisted Coding
Developers describe what they want in natural language inside an IDE. The AI implements it. The human reviews, adjusts, and directs. Tools like Cursor and Windsurf operate primarily at this level. Most enterprise development teams in 2025 were here.
2 – Agentic Coding
The AI takes a task and works through it autonomously: running bash commands, editing files, searching documentation, debugging its own output, and iterating without waiting for human guidance at each step. Devin, Claude Code, and Codex operate at this level. The human sets the task; the AI executes it.
3 – Spec-Driven Development
Humans write detailed specifications covering what the software should do, how it should behave, and what the acceptance criteria are. Those specifications are handed to AI agents. Hours or days later, the human reviews results against the spec. The human is now a product manager, not a programmer. This is where the frontier teams were operating in late 2025.
4 – The Dark Factory
Specifications go in. Software comes out. The human role is defining what to build and why. The how is entirely autonomous. This is what StrongDM demonstrated operationally. Like the FANUC facility: dark, because humans are neither needed nor present in the production process.
Most organizations reading this are operating at Level 1 or 2. The gap between Level 2 and Level 4 is not primarily technological. It is architectural and organizational. The technology to reach Level 4 exists. What most organizations have not yet built are the patterns for structuring work so that the technology delivers reliably.
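To make the taxonomy concrete, here is a minimal Python sketch of the five levels as an ordered enum, plus a rough self-assessment. The `Workflow` fields and the `classify` rules are assumptions made for illustration, not part of Shapiro’s framework. In particular, using "does a person still review the code" to separate Level 3 from Level 4 is a simplification drawn from StrongDM’s two rules.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """The five levels described above; names follow the headings."""
    SPICY_AUTOCOMPLETE = 0
    AI_ASSISTED_CODING = 1
    AGENTIC_CODING = 2
    SPEC_DRIVEN_DEVELOPMENT = 3
    DARK_FACTORY = 4


@dataclass(frozen=True)
class Workflow:
    """Hypothetical self-assessment answers; field names are assumptions."""
    ai_implements_from_prompts: bool  # human describes, AI writes the code
    agent_iterates_unattended: bool   # agent runs, debugs, and retries alone
    humans_write_specs_only: bool     # humans author specs, not code
    humans_review_code: bool          # a person still reads generated code


def classify(w: Workflow) -> Level:
    """Map the answers to a level, checking the most autonomous cases first."""
    if w.humans_write_specs_only and w.agent_iterates_unattended:
        if w.humans_review_code:
            return Level.SPEC_DRIVEN_DEVELOPMENT
        return Level.DARK_FACTORY
    if w.agent_iterates_unattended:
        return Level.AGENTIC_CODING
    if w.ai_implements_from_prompts:
        return Level.AI_ASSISTED_CODING
    return Level.SPICY_AUTOCOMPLETE


# Example: a team using agents, with developers still framing tasks and reviewing.
team = Workflow(
    ai_implements_from_prompts=True,
    agent_iterates_unattended=True,
    humans_write_specs_only=False,
    humans_review_code=True,
)
level = classify(team)
print(level.name, "gap to dark factory:", Level.DARK_FACTORY - level)
```

Because `IntEnum` members are ordered integers, the distance between two levels is a plain subtraction. Real assessments will be messier than four yes/no answers, and a single organization can sit at different levels for different kinds of work.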
Why November 2025 Was the Inflection Point
The pattern is not new as a concept. Autonomous code generation has been a research direction for decades. What changed in late 2025 was that the reliability of long-horizon agentic execution crossed a threshold where building production systems this way became economically rational rather than merely theoretically interesting.
StrongDM’s team traces the start of the shift to an earlier moment: with Claude 3.5 Sonnet (October 2024), long-horizon agentic coding workflows began to compound correctness rather than compound error. By December 2024, the long-horizon coding performance was, in their words, unmistakable. In mid-2025, they formed a team with the explicit constraint of no human-written code and began building in earnest.
Simon Willison, one of the most careful observers in the developer-tooling space, described a similar inflection: November 2025 was when AI coding agents crossed from “mostly works” to “actually works” for complex, multi-step production tasks.
The consequence is that 2026 is the year this moves from a pioneering experiment to an organizational decision. Not every team should be running a dark factory. But every technology leader should be making a deliberate, informed choice about where on the spectrum they want to operate, and why.
What Makes It Actually Different from Agentic Coding
The distinction between Level 2 (agentic coding) and Level 4 (the dark factory) is not a matter of degree. It is a matter of architectural inversion.
In agentic coding, a human developer is still the primary decision-maker. They formulate tasks, review outputs, redirect the agent when it goes wrong, and maintain mental ownership of the system being built. The agent is a powerful execution tool under human direction.
In the dark factory, the pipeline is the decision-maker. Humans define the specification and evaluate the output. The pipeline determines how to satisfy the specification, what approaches to take, how to handle failures, and how to iterate toward a passing result. The human is not involved in the middle of the process. They provide inputs and judge outputs.
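The inversion described above can be sketched as a small control loop in Python. This is an illustration under stated assumptions, not StrongDM’s implementation: `Spec`, `Outcome`, `run_pipeline`, `requires`, and `toy_agent` are hypothetical names, the "artifact" is just a string, and the agent is a stand-in for a model call. The human supplies the spec and reads the final outcome; everything inside the loop happens without them.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A scenario inspects an artifact and returns None on success,
# or a failure message the agent can act on.
Scenario = Callable[[str], Optional[str]]
# An agent takes the spec text plus prior failures and returns a new artifact.
Agent = Callable[[str, List[str]], str]


@dataclass
class Spec:
    description: str
    scenarios: List[Scenario]


@dataclass
class Outcome:
    passed: bool
    artifact: str
    attempts: int
    failures: List[str] = field(default_factory=list)


def run_pipeline(spec: Spec, agent: Agent, max_attempts: int = 5) -> Outcome:
    """Generate, validate against the spec, and retry with feedback."""
    failures: List[str] = []
    artifact = ""
    for attempt in range(1, max_attempts + 1):
        artifact = agent(spec.description, failures)
        # Run every scenario and keep only the failure messages.
        failures = [msg for check in spec.scenarios
                    if (msg := check(artifact)) is not None]
        if not failures:
            return Outcome(True, artifact, attempt)
    return Outcome(False, artifact, max_attempts, failures)


def requires(keyword: str) -> Scenario:
    """Toy scenario: passes when the artifact contains the keyword as a word."""
    def check(artifact: str) -> Optional[str]:
        return None if keyword in artifact.split() else f"missing: {keyword}"
    return check


def toy_agent(description: str, failures: List[str]) -> str:
    """Stand-in for a model call: starts with 'login', adds whatever failed."""
    words = ["login"] + [msg.split(": ", 1)[1] for msg in failures]
    return " ".join(words)


spec = Spec("session handling", [requires("login"), requires("logout")])
outcome = run_pipeline(spec, toy_agent)
print(outcome.passed, outcome.attempts, outcome.artifact)
```

The design choice worth noticing is that failure messages flow back into the next attempt, so the pipeline, not a person, decides how to recover. The `max_attempts` bound is the one place the sketch hands control back to a human: an outcome with `passed=False` is the signal that the spec or the scenarios need attention.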
This inversion has profound implications for how work must be structured, what skills matter most, and what governance is required. It is not an upgrade to existing development practice. It is a replacement of the development workflow with a fundamentally different model.
The Spectrum That Actually Exists
Just as “dark factory” in manufacturing covers a range from hybrid operations with the lights merely dimmed to fully unmanned facilities, the AI dark factory pattern is a spectrum. Very few organizations will operate at pure Level 4 across their entire codebase. Many will operate Level 4 pipelines for specific categories of work while maintaining more human-involved processes elsewhere.
The strategically important question is not “are we a dark factory?” It is “which categories of our software production are candidates for autonomous pipeline execution, and which require sustained human judgment at each step?” That question, and the framework for answering it rigorously, is what the rest of this series addresses.
90% of Claude Code’s codebase written by Claude Code itself
650 AI-generated pull requests merged monthly at Spotify by early 2026
3 engineers running StrongDM’s fully autonomous dark factory since July 2025
5x productivity gain reported by BCG Platinion for organizations at dark factory level
The Competitive Stakes
BCG Platinion frames the strategic implication directly: when any organization can have agents build software, competitive advantage no longer lives in engineering headcount or execution speed. It lives in proprietary data, deep domain knowledge, ecosystem strength, and the quality of organizational intent. The moat is not the code. The moat is the specification and the scenario architecture that validates it.
The compression of competitive cycles is the other side of this. When your competitors can ship in days what used to take quarters, the cost of delay becomes existential. Organizations that master autonomous delivery do not just move faster. They force the entire competitive landscape to accelerate.
This is not a prediction. It is a current observation from organizations already operating at this level. The question for every technology leader is not whether to have a position on this. It is whether their position is deliberate or accidental.