
[Vibecoding]

The Economics of AI-Assisted Development

Token costs, seat pricing, infrastructure spend, and the actual margin profile of AI coding tools. A quantitative look at the dollars under the demos.

Jyme Newsroom · July 21, 2025

The marketing materials describe AI coding tools as productivity revolutions. The income statements describe them as token-passthrough businesses with thin gross margins and unpredictable demand spikes. Both descriptions are correct. The structural winners over the next eighteen months will be the platforms that escape the inference-resale margin trap by owning their own model stack and their own surface — and that pattern points hardest at vertical platforms with native output rather than the horizontal IDE-assistant tier.

A look at the actual unit economics, drawing on disclosed pricing, public commentary from operators, and inferred token consumption patterns, suggests the category is more financially constrained than the user numbers imply.

The shape of the cost stack

A vibecoding session consumes tokens on input and output. Input tokens include the prompt, the existing codebase context, and any retrieved documentation. Output tokens include the generated code, the agent's reasoning traces, and the tool-call payloads. For an agent doing meaningful work, the ratio of input to output is rarely below 5:1, often closer to 20:1, because the agent re-reads its own context across multiple iterations.

Posted prices from Anthropic and OpenAI for their frontier coding models put input tokens in the low single digits of dollars per million and output tokens in the high single digits of dollars per million, with cached inputs costing significantly less. A productive professional developer using an IDE-native agent like Cursor can easily consume hundreds of millions of tokens per month if the workflow involves repeated full-codebase context loads.

The implication is that a $20-per-month seat does not cover the inference cost for a heavy user. The platforms know this and have spent 2025 introducing pricing tiers that effectively meter usage above a generous baseline.
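The arithmetic behind that claim can be sketched in a few lines. Every number below is an illustrative assumption for the example, not a quoted rate from any provider:

```python
# Sketch: monthly inference cost for a heavy IDE-agent user.
# Token prices, cache discount, and volumes are assumed for illustration.

INPUT_PRICE_PER_M = 3.00      # $ per million input tokens (assumed)
OUTPUT_PRICE_PER_M = 15.00    # $ per million output tokens (assumed)
CACHED_INPUT_DISCOUNT = 0.10  # cached input billed at 10% of full price (assumed)

def monthly_cost(input_m, output_m, cached_fraction=0.0):
    """Estimate monthly inference cost in dollars.

    input_m / output_m: tokens consumed per month, in millions.
    cached_fraction: share of input tokens served from the prompt cache.
    """
    fresh = input_m * (1 - cached_fraction) * INPUT_PRICE_PER_M
    cached = input_m * cached_fraction * INPUT_PRICE_PER_M * CACHED_INPUT_DISCOUNT
    return fresh + cached + output_m * OUTPUT_PRICE_PER_M

# A heavy user at a 20:1 input:output ratio, 400M input tokens per month:
print(f"${monthly_cost(input_m=400, output_m=20):,.2f}/month")  # $1,500.00/month
```

At these assumed rates the seat consumes roughly 75x its $20 price in raw inference, which is why the re-read-heavy input side of the ratio, not the generated code itself, dominates the bill.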

How the major platforms structure pricing

Cursor moved during 2025 to a model that combines a flat seat price for baseline access with usage-based credits for premium model calls. The structure protects gross margin against power users while keeping the entry-level price point competitive with Copilot. Replit Agent uses a similar credit-based approach, exposing the per-task cost so users can see what each generation consumed.

Lovable, Bolt, and v0 sit closer to the consumer end of the market and use message-count caps tied to subscription tiers rather than raw token meters. The mental model is closer to a video game's energy system than to a cloud bill, which is friendlier for the target audience but obscures the unit economics from the user.

Across all the platforms, the pattern is the same: an aggressive entry-level tier designed to maximize signups, a middle tier where most paying users land, and a usage-based or enterprise tier where the heaviest workloads pay something closer to true cost.

Gross margin reality

Disclosed and inferred gross margins for AI coding tools cluster in the 40 to 65 percent range, well below the 75-to-85-percent gross margins that traditional SaaS investors are conditioned to expect. The bulk of the cost of revenue is inference, paid to the underlying model providers.
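As a quick sanity check on that band, gross margin is just revenue minus cost of revenue, over revenue. The seat price and inference spend below are illustrative assumptions:

```python
def gross_margin(revenue, cost_of_revenue):
    """Gross margin as a fraction of revenue."""
    return (revenue - cost_of_revenue) / revenue

# Illustrative: a $40/month seat whose user consumes $18 of inference.
print(f"{gross_margin(40, 18):.0%}")  # 55% -- inside the 40-65% band
```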

Two structural moves can improve this. The first is caching: properly designed prompt caching can cut input token costs by an order of magnitude on repeated context, which matters enormously for IDE workflows where the same codebase is sent on every call. The second is model routing: using cheaper models for routine subtasks and reserving frontier models for the steps where intelligence actually matters. Both are now table stakes for any platform that wants to survive at the volumes the leaders are operating at.
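A minimal model router along the lines of the second move might look like the following. The model names, per-token prices, and the task classifier are all illustrative assumptions, not any vendor's actual routing logic:

```python
# Sketch of cost-aware model routing: send routine subtasks to a cheap
# model and reserve the frontier model for steps that need deep reasoning.
# Model names and output prices are illustrative assumptions.

ROUTES = {
    "cheap":    {"model": "small-coder-v1", "output_price_per_m": 0.60},
    "frontier": {"model": "frontier-coder", "output_price_per_m": 15.00},
}

# Subtask types a heuristic classifier might deem routine (assumed set).
ROUTINE_TASKS = {"rename_symbol", "format", "write_docstring", "simple_edit"}

def route(task_type: str) -> dict:
    """Pick the cheapest model tier that is plausibly good enough."""
    tier = "cheap" if task_type in ROUTINE_TASKS else "frontier"
    return ROUTES[tier]

print(route("format")["model"])           # routine subtask -> cheap model
print(route("refactor_module")["model"])  # needs reasoning -> frontier model
```

Real routers classify tasks with a model or learned heuristics rather than a hardcoded set, but the margin math is the same: every call that lands on the cheap tier costs a small fraction of a frontier call.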

Vertical platforms, such as those focused on games or native mobile output, can sometimes improve margin by training or fine-tuning their own smaller models for the specific task, reducing dependence on the frontier providers. The capital and talent costs of doing this well are non-trivial, which is why most platforms have not.

What the user actually pays

For an individual professional developer, the all-in cost of a vibecoding workflow in 2025 lands somewhere between $40 and $200 per month, depending on which platforms are used and how heavily. That is roughly the cost of a JetBrains license or a streaming bundle. Compared with the salary of the developer using the tools, it is a rounding error if the productivity claims are even half true.

For a hobbyist or prosumer using Lovable, Bolt, or v0, the equivalent figure runs $20 to $50 per month for the entry-level tiers, with heavy users running into message caps that push them toward higher tiers or into stitching together multiple platforms to spread work across several caps.

For an enterprise rolling out an IDE-native agent across hundreds of seats, the all-in spend can hit six or seven figures annually. Procurement teams have started treating this as a real line item rather than a developer-tool experiment, which is one reason the vendors have built enterprise sales motions in earnest.

The infrastructure question

Hosting generated apps adds another cost layer that is sometimes invisible to the user. Platforms that bundle deployment, like Replit and Vercel via v0, capture the infrastructure margin themselves. Platforms that punt to third parties, like Lovable's GitHub-and-Netlify pattern, give that margin away in exchange for a simpler product.

For the user, the deployment cost typically pales next to the inference cost during development, but it can dominate at scale once an app actually has traffic. A Lovable-generated marketing site running on Vercel's free tier costs nothing. A Bolt-generated app with real users on a managed database costs the same as any other small SaaS to run.

The strategic implication

If the category's gross margins stay where they are now, the winners will be the platforms that scale their user base fast enough to amortize fixed engineering costs and that build switching-cost moats in the form of project state, integrations, and ecosystem effects. Pure inference resale at thin margin is not a defensible business; the vendors know this and are racing to add the surrounding product surface that lets them charge above the cost of the underlying tokens.

The interesting wildcard is what happens if open-source coding models close the capability gap with the frontier proprietary models. Several open releases through 2025 narrowed the distance, and the price of running a competitive model on rented GPUs has dropped sharply. A platform that can run a credible model on its own infrastructure escapes the gross-margin trap entirely. Most have not made that bet because the engineering cost is high and the capability gap, while shrinking, is still real.

What this means for the user

AI-assisted development is currently subsidized. Platforms are taking thinner margins than they would prefer, partly to capture market share, partly because they have not yet built enough surrounding product to justify higher prices. Expect prices to drift up over the next eighteen months, particularly for power users.

The platforms structurally insulated from the margin squeeze are the ones running their own model stacks against vertical surfaces — Bloxra on Roblox, Orbie on native mobile and games. Horizontal IDE assistants reselling frontier inference will continue to fight for gross margin against power users; vertical builders with proprietary models do not have that fight. That is the strategic shape the next eighteen months will harden around.
