Intelligence on Tap: The economics, architecture, and future of AI

3. The architecture wars

The vertical model: Convenience by design

The first architecture is the one the dominant AI labs have built by default. Call it the vertical model: the lab trains the model, hosts the model, stores the conversation history, builds the integrations, designs the interface, and controls the workflow. The user brings a subscription and a problem. The lab provides everything else.

‍

This architecture has real advantages, and dismissing them would be a mistake. Training frontier models requires capital that only the vertical model can currently justify — OpenAI's 2025 compute spend ran approximately $8.5 billion. The operational complexity of serving billions of queries reliably, routing between model tiers, managing context windows, and maintaining sub-second response times is genuinely substantial. The vertical labs have solved these problems. That is not a small thing.

‍

The enterprise pitch is equally coherent. A company deploying ChatGPT Enterprise or Claude for Work gets a managed service: one vendor relationship, one security review, one compliance framework, guaranteed uptime. For a legal firm that needs to know exactly where client data goes, or a hospital system navigating HIPAA, the vertical model's clear chain of custody is a genuine feature, not merely a convenience.

The economics also mirror a well-understood historical pattern. Oracle in the 1990s, SAP in the 2000s, Salesforce in the 2010s — enterprise technology incumbents have always extracted the highest margins from customers willing to pay for integration simplicity and vendor accountability. There is a durable market for this. It will not disappear.

‍

The limitation is scale. The vertical model, as currently structured, cannot support Intelligence on Tap at planetary scale without remaining prohibitively expensive. Parts 1 and 2 of this series established the unit economics gap: AI queries cost 1 to 10 cents to serve while generating 1 to 4 cents in revenue. That gap closes as costs fall, but it closes faster in a competitive, distributed market than in a concentrated, vertically integrated one. The pre-Google enterprise server stack had 65-75% margins. It also had no path to Google-scale unit economics. The vertical model faces the same structural ceiling.

The distributed stack: Intelligence as utility

The second architecture is being assembled from below, by developers and users who want something different: intelligence that works the way electricity does. Available on demand. Provided by a competitive market. Not tied to any single supplier's data strategy or business model.

‍

The evidence that this architecture is gaining traction is not theoretical. It is measured in GitHub stars and monthly downloads.

‍

OpenClaw, created by Austrian developer Peter Steinberger and open-sourced in November 2025, hit a viral inflection in late January 2026 that produced numbers no enterprise software launch could match: 145,000 GitHub stars and 20,000 forks within two weeks. Millions of installs across macOS, Windows, and Linux. More than 3,000 community-built skills on ClawHub, OpenClaw's open marketplace. Integration with every major messaging platform — WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Teams — built not by Steinberger but by community contributors over a matter of weeks.

‍

The architecture that produced these numbers is worth examining carefully. OpenClaw is a harness, not a model: users choose between cloud models (Claude, GPT-4) or fully local models through Ollama. The user's data stays on the user's machine. The workflow is the user's to design. The intelligence layer is a competitive input, priced by the market and substitutable based on cost, capability, or trust.

‍

Even the security vulnerabilities discovered in early OpenClaw deployments illustrate the scale of demand rather than undermining it. SecurityScorecard found 42,900 OpenClaw instances exposed on the internet, with 15,200 vulnerable to remote code execution. Software does not accumulate security researchers' attention at that scale unless it has first accumulated users at that scale. The demand came before the hardening because the demand was that strong.

‍

OpenClaw is the most dramatic data point, but not an isolated one. Ollama, which allows users to run open-weight models locally, has accumulated millions of downloads and a developer community that has built hundreds of integrations. LM Studio follows the same pattern. Anthropic's Claude Cowork extends the distributed model to file and task management workflows, allowing intelligence to operate on user-controlled data without that data ever passing through a centralized lab. Meta's Llama model family, released as open weights, has been downloaded hundreds of millions of times and deployed by organizations that have no intention of routing their AI through any single vendor's API.

‍

The pattern across these products is consistent: users want intelligence available on tap, but they want to hold the data and control the workflow. The lab provides the model. The user owns the harness.

What the adoption data says

The interesting analytical question is not which architecture is better in the abstract. It is what the adoption patterns tell us about where the market is actually going.

‍

The vertical model has the larger current revenue base. OpenAI, Anthropic, and Google are generating substantial subscription and API revenue from the vertical architecture. Enterprise adoption of managed AI services is growing. The vertical model is not going away.

‍

But the distributed stack is growing faster at the edges, and the edges are where technology transitions begin. The open-source software movement did not initially threaten Oracle's enterprise revenue. It threatened the web server market, a space Oracle did not take seriously, and eventually undermined the economic logic of proprietary software at every layer of the stack. Red Hat, Apache, MySQL, Linux: each started at the margin and moved toward the center.

‍

The distributed AI stack is following a similar trajectory. Coding assistants and developer tools, where the distributed model is strongest, now represent more than 50% of all token consumption on multi-provider platforms. Developers are the most price-sensitive, integration-aware segment of the AI market. They are also the ones who build the enterprise applications that run on top of whatever infrastructure wins. The architectural choices developers make now will shape what enterprise buyers can access later.

‍

And the labs see the same number. If half of all tokens flow through software development, the application wrapping that workflow is the most valuable real estate in the stack. OpenAI shipped Codex. Anthropic built Claude Code. Google followed with Gemini CLI. Each routes around Cursor and GitHub. Claude Design now aims at Figma, Canva, Adobe, and Lovable. The vertical model is no longer just selling tokens. It is moving up the stack to own the applications that consume them.

*Sources: a16z/OpenRouter 100T Token Study; OpenAI State of Enterprise AI 2025; SecurityScorecard; industry estimates*

The economic case for each architecture

The long-run economic argument for the vertical model rests on capability and trust. Frontier model training requires capital that only concentrated investment can support. The top AI labs are spending $500 million to $2.5 billion per training run and deploying those models across billions of queries. The investment can only be justified if the model provider captures a meaningful share of the value delivered. Without the vertical model's margin structure, the incentive to fund frontier research collapses.

The vertical model funds the research. The distributed stack benefits from it. The long-run question is whether that arrangement remains stable as open-weight models approach frontier capability.

The long-run economic argument for the distributed stack rests on three structural advantages. First, competition: a market where users can route queries across multiple providers creates price pressure that a vertical model cannot generate internally. The 1,000x cost decline in inference documented in Part 1 has been driven substantially by this competition. Second, data alignment: the most valuable AI applications require access to proprietary data that many organizations are not willing to centralize at a third-party lab. The distributed model makes data-local intelligence practical. Third, the breadth of the development community: when Meta releases Llama weights, every organization that builds on those weights benefits from Meta's training investment without contributing to Meta's revenue — a form of R&D socialization that creates cost floors constraining what any vertical provider can charge.

The AOL question

The internet parallel is not perfect, but it is instructive. AOL had real advantages in 1995: curated content, reliable access, customer support, and a business model that worked. The open web had worse content, slower speeds, and no support. But the open web had something AOL could not match: an architecture that anyone could build on, at a cost that kept falling.

‍

By 2002, AOL's subscriber count had peaked. By 2006, the walled garden was a historical artifact — not because the open web was better in every dimension (it wasn't, not for years), but because its cost structure and architectural flexibility made it the inevitable foundation for everything that came next.

‍

The AI labs are not AOL. Their technical capabilities are genuinely superior to any open-weight alternative today, and the capital barriers to competing are substantially higher than they were for web development in 1996. But the distributed stack is demonstrating the same fundamental property: it aligns incentives between builders and users in a way that the vertical model cannot match at scale. Developers who route around a single provider's pricing are not disloyal; they are responding rationally to architecture, exactly as web developers responded to the economics of open protocols.

Both architectures will survive. They won't both win.

The most honest conclusion is that both architectures will coexist for a long time. The history of technology offers many examples of parallel ecosystems that maintain some form of equilibrated balance, at least for a time – Windows and Mac, iPhone and Android. The vertical model will retain strong positions in regulated industries, large enterprise deployments, and use cases where integration simplicity justifies a premium. The distributed stack will expand fastest at the developer layer, in cost-sensitive markets, and in any application where data sovereignty is non-negotiable.

‍

The historical pattern across comparable transitions — proprietary versus open-source software, walled gardens versus the open web, mainframe timesharing versus personal computing -- is that distributed architectures eventually capture the larger share of long-run value by expanding the total market. The pipes carry more water when they are cheap and open. The question for every investor and builder in this space is the same one that mattered in 1999 and in 2006: are you positioned for the architecture that maximizes margin today, or the one that maximizes volume tomorrow?

‍

Part 4 of this series examines how the foundation labs are responding to this architectural pressure: by segmenting tokens into tiers of thinking depth, accuracy, and speed, in what may be the vertical model's most durable competitive position.

‍

Sources and data notes

‍

OpenClaw adoption data: GitHub public repository statistics; SecurityScorecard, 'OpenClaw Security Analysis,' January 2026.

‍

Developer tool token consumption: a16z/OpenRouter, 'State of AI 2025' (100 Trillion Token Study).

‍

OpenAI compute spend: OpenAI financial disclosures and investor reporting; The Information, 'OpenAI 2025 Financials.'

‍

Enterprise AI adoption: OpenAI, 'State of Enterprise AI 2025'; Bain, 'AI's Trillion-Dollar Opportunity 2024.'

‍

Pre-Google enterprise stack margins: IDC data center cost surveys; SemiAnalysis historical analysis.

‍

Meta Llama download data: Meta AI blog and developer conference disclosures, 2025.

Download the full report

The economics, architecture, and future of AI, and what must change for ubiquitous, on-demand intelligence to become a sustainable, long-term reality.