Lean Startup in the AI Age: What Still Works, What Breaks, What Replaces It
By Shayan Ghasemnezhad on February 2, 2026 · 3 min read
Build-Measure-Learn was designed for web products. AI changes the feedback loop, the MVP definition, and the cost of experimentation.
The Lean Startup methodology—Build-Measure-Learn, minimum viable products, validated learning—shaped a generation of software companies. Its core insight remains sound: reduce the cycle time between hypothesis and evidence. But AI-native products break several of the framework’s assumptions, and teams that apply Lean Startup without adaptation end up optimising for the wrong things.
What Still Works
The fundamental principle—validate demand before scaling supply—is timeless. Before building an AI feature, confirm that users have the problem you are solving and that they value a solution enough to change behaviour. This can be tested with mockups, Wizard of Oz prototypes (human behind the curtain), or simple rule-based systems before investing in model development.
Customer development interviews, concierge MVPs, and landing page tests are as relevant for AI products as for traditional software. The question “Would you use this?” still precedes “Can we build this?”
What Breaks
The MVP concept breaks when the minimum viable version of an AI feature is indistinguishable from a bad product. A recommendation engine that gives mediocre suggestions does not validate the hypothesis that users want recommendations—it validates that users abandon features that do not work well. AI quality has a threshold below which the feature is worse than not having it.
The feedback loop also changes. Traditional MVPs get fast, clear signal: users click or they do not, they convert or they bounce. AI features produce ambiguous signal. Did the user accept the AI’s suggestion because it was good, or because they did not know enough to evaluate it? Did they reject it because it was wrong, or because the presentation was confusing? Measuring AI feature success requires intentional instrumentation, not just funnel analytics.
The AI-Native MVP
Redefine the MVP for AI products as the Minimum Trustworthy Product. The bar is not “does it work?” but “does it work well enough that users trust it to inform decisions?” This means the first version may need higher quality than a traditional MVP, but can have a narrower scope. Constrain the domain rather than the quality.
An AI that summarises meeting notes with 95% accuracy for 15-minute standups is more valuable than one that handles all meeting types at 70% accuracy. Narrow the scope, raise the quality bar, and expand the domain as confidence grows.
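In code, "narrow the scope, raise the quality bar" can be as simple as an explicit gate in front of the feature. This is a hypothetical sketch of such a gate for the meeting-summary example: the feature only runs where its quality has been validated, and the table grows as confidence does.

```python
# Validated domain for the summariser: meeting type -> max minutes.
# Expand this table only after quality is proven for a new slice.
VALIDATED_SCOPE = {"standup": 15}

def within_validated_scope(meeting_type: str, duration_minutes: int) -> bool:
    """Return True only for inputs inside the proven-quality domain."""
    limit = VALIDATED_SCOPE.get(meeting_type)
    return limit is not None and duration_minutes <= limit
```

Everything outside the gate falls back to the non-AI experience, so users never see the 70%-accuracy version of the feature.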
Adapting Build-Measure-Learn
Build: Prototype with off-the-shelf models and prompt engineering before training custom models. The fastest path to learning is the one that requires the least infrastructure. If a GPT-4 prompt can approximate the feature, ship that and learn from usage before investing in fine-tuning.
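A useful pattern for this build step is to hide the model behind a small interface so the prompt-based prototype and a later fine-tuned model are interchangeable. In the sketch below, `complete` is a hypothetical stand-in for whatever hosted-model call you use; nothing here is a specific vendor API.

```python
from typing import Callable, Protocol

class Summariser(Protocol):
    def summarise(self, notes: str) -> str: ...

class PromptSummariser:
    """Prototype path: an off-the-shelf model plus a prompt, no training.

    `complete` is any prompt -> text function; a fine-tuned model can
    later implement the same Summariser interface without UI changes.
    """
    def __init__(self, complete: Callable[[str], str]):
        self.complete = complete

    def summarise(self, notes: str) -> str:
        prompt = "Summarise these meeting notes in three bullet points:\n" + notes
        return self.complete(prompt)
```

Because the rest of the product depends only on `Summariser`, swapping the prompt prototype for a custom model is a one-line change, which keeps the learning loop cheap.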
Measure: Define quality metrics before launch, not after. Set up A/B testing infrastructure that compares AI-assisted workflows against non-AI baselines. Measure task completion time, error rate, and user confidence—not just engagement.
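The measurement step above can be sketched as a small comparison over logged task events. The event fields (`seconds`, `errors`, `confidence`) are illustrative assumptions, but the shape is the point: the AI arm and the baseline arm are reported side by side on the same three metrics.

```python
from statistics import mean

def arm_metrics(events):
    """events: dicts with 'seconds', 'errors' (0/1), 'confidence' (1-5)."""
    return {
        "completion_seconds": mean(e["seconds"] for e in events),
        "error_rate": mean(e["errors"] for e in events),
        "confidence": mean(e["confidence"] for e in events),
    }

def compare_arms(events):
    """Split logged task events by experiment arm and report both."""
    by_arm = {"ai": [], "baseline": []}
    for e in events:
        by_arm[e["arm"]].append(e)
    return {arm: arm_metrics(evts) for arm, evts in by_arm.items()}
```

A report like this surfaces the case the post warns about: an AI arm with high engagement but a worse error rate or lower user confidence than the baseline.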
Learn: Instrument for failure analysis. When the AI produces a wrong answer, you need to know why: was the input ambiguous, the context insufficient, or the model inadequate? Feedback loops that only capture “right or wrong” do not generate actionable learning.
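The three failure causes named above can be turned into an explicit taxonomy that reviewers tag during failure triage. This is a minimal sketch, assuming manual review assigns one cause per wrong answer; the enum values mirror the post's three questions.

```python
from enum import Enum
from collections import Counter

class FailureCause(Enum):
    AMBIGUOUS_INPUT = "ambiguous_input"
    INSUFFICIENT_CONTEXT = "insufficient_context"
    MODEL_INADEQUATE = "model_inadequate"

def triage_report(failures):
    """failures: (example_id, FailureCause) pairs from manual review.

    Returns the share of each cause, so the team knows whether to fix
    the input UX, the retrieval/context, or the model itself.
    """
    counts = Counter(cause for _, cause in failures)
    total = len(failures)
    return {c.value: counts[c] / total for c in FailureCause}
```

The report makes the learning actionable: a spike in `ambiguous_input` points at the product surface, not at more model training.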
Failure Modes
The most common failure: building the model before validating the workflow. Teams invest months in training a custom model for a feature that users do not adopt—not because the model was wrong, but because the feature did not fit into the user’s existing workflow. Validate the workflow first.
The second failure: measuring AI engagement instead of AI impact. High usage of an AI feature is not evidence of value if the feature creates work rather than removing it. Measure whether the AI feature makes the user’s job faster, more accurate, or less stressful—not just whether they click on it.
Lean Startup is not dead—it needs a firmware update. Keep the scientific method. Adjust the definition of viable, the measurement strategy, and the learning infrastructure for systems that are probabilistic by design.