Spec-driven development is writing a structured, codebase-grounded specification that an AI agent can execute against and a human can review. The useful version is not a frozen document. It's a living, version-controlled artifact that guides work through four phases: Specify, Plan, Tasks, Implement.

Most of the bad advice on this topic starts with the wrong bottleneck. People talk like AI coding changed software because code got cheap. True, but incomplete. The painful truth is that coding is no longer the main constraint. Specification is.

That's why the backlash exists. You've seen it. “Waterfall with markdown.” “Productivity trap.” “Token-burning ceremony.” Some of that criticism is deserved. A bloated spec file that never touches reality is just old process with new branding. A giant markdown doc with no tests, no boundaries, and no tie to the codebase is useless.

But that's not what spec-driven development is.

For a solo builder using Cursor, Claude Code, Codex, or Gemini, the primary problem isn't getting code generated. It's getting the right code generated, in the right shape, with the right tradeoffs, without spending the next day cleaning up AI-made guesses. A good spec fixes that by turning a vague feature idea into something persistent, reviewable, and checkable. Not a PRD. Not a prompt. Not a chat transcript you'll never find again.

Is Spec-Driven Development Just Waterfall with Markdown?

The lazy criticism is easy to repeat. “SDD is just waterfall with markdown.” That sounds smart right up until you try to ship with AI agents and realize the actual problem is not too much specification. It is vague specification, stale specification, and missing specification.

If you write a giant upfront doc, freeze it, and force the build to follow it after reality changes, you recreated waterfall in a text file. Solo founders do this all the time with PRDs, Notion pages, and giant prompt dumps. The format is not the issue. The rigidity is.

That is why this debate keeps going in circles. “Specification” gets used to mean four different things at once. One Reddit thread in r/ChatGPTCoding called specification “the most overloaded term”. That description fits. Some people mean a few bullets for an LLM. Some mean a formal contract. Some mean a planning note. Some mean code should be generated from specs and little else.

A bad spec process feels like ceremony. A good one feels like compression.

The useful line is simple. Waterfall treats the plan as fixed. Spec-driven development treats the spec as a working control document. You update it when you learn, you review it before more code gets written, and you keep it close enough to the codebase that an agent can follow it without inventing missing pieces.

What solo builders should reject

Drop the bad comparisons and the picture gets clearer fast.

  • Waterfall locks decisions too early and rewards compliance over learning.
  • A PRD captures intent but usually stops before clear execution boundaries and testable acceptance criteria.
  • A chat log disappears into scrollback and leaves no durable artifact for the next agent run.
  • Prompt engineering tries to coax good output from incomplete context. SDD fixes the context first.

Birgitta Böckeler's taxonomy, published on Martin Fowler's site, gives this argument more precision than the usual social media takes. In her article on understanding SDD, she splits the space into spec-first, spec-anchored, and spec-as-source. That distinction matters because critics are often attacking the strictest version while solo founders are using a lighter one that stays editable.

The version that helps a solo founder

Use a narrower definition.

Spec-driven development is writing a structured, codebase-grounded spec that an AI agent can execute and you can audit. It lives in version control. It sets boundaries. It breaks the work into steps. It includes validation. It changes when the work changes.

That is not bureaucracy. It is how you stop burning hours on AI-generated rework.

How SDD Differs from TDD, BDD, and Agile Stories

The backlash against SDD usually comes from a category error.

Solo founders hear "spec-driven development" and assume someone is trying to bring back enterprise process with a new coat of paint. That misses the point. TDD, BDD, agile stories, PRDs, and prompts all describe part of the work. SDD tries to hold the whole job together in one document that an AI agent can execute and you can review after the fact.

That difference matters because solo builders do not have a product manager, a tech lead, and a QA team sitting in separate meetings. You have one brain, limited time, and an agent that will happily fill in blanks with nonsense.

Here is the clean separation:

Practice | Primary artifact | What drives implementation | Best use
SDD | Structured spec | Spec, constraints, tasks, and validation | Multi-step builds where AI needs durable context
TDD | Automated tests | Failing tests before code | Tight coding loops and correctness checks
BDD | Behavior scenarios | Shared examples of expected behavior | Clarifying user-visible behavior
Agile stories | Story or ticket | Desired outcome or business need | Prioritization and planning

TDD is too narrow to carry the whole job. It tells you what should fail first. It does not usually define system boundaries, file-level changes, rollout assumptions, or what the agent must leave alone.

BDD gives you clearer behavior. It still does not usually tell an AI agent how to break the work into ordered tasks across a real codebase.

Agile stories are even thinner. A story can explain why the feature matters, but "As a user, I want X" is not enough context for an agent to change six files without creating drift.

That is why the "waterfall with markdown" criticism falls apart for solo founders. Waterfall freezes decisions early and pushes discovery to the edges. Good SDD stays editable. It changes as you learn. The spec is a working control document, not a frozen handoff.

A practical test helps. If your artifact cannot answer these questions, it is not doing SDD yet:

  • What files should change?
  • What files must not change?
  • What constraints matter?
  • What steps should happen in what order?
  • How do you verify the work is done?

If those answers live across tickets, chats, memory, and half-written tests, your agent is operating on fragments.
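Here is a minimal sketch of that test as code, assuming you keep the spec in some structured form. The `Spec` class, its field names, and `unanswered_questions` are hypothetical, not part of any SDD tool. The point is only that each question maps to a field you can check for emptiness before handing the file to an agent.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical in-memory form of a spec file; field names are illustrative."""
    files_to_change: list[str] = field(default_factory=list)
    files_off_limits: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    ordered_steps: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)


# Maps each field to the question it answers.
QUESTIONS = {
    "files_to_change": "What files should change?",
    "files_off_limits": "What files must not change?",
    "constraints": "What constraints matter?",
    "ordered_steps": "What steps should happen in what order?",
    "verification": "How do you verify the work is done?",
}


def unanswered_questions(spec: Spec) -> list[str]:
    """Return the questions whose field in the spec is still empty."""
    return [q for name, q in QUESTIONS.items() if not getattr(spec, name)]


# A draft that names files and steps but nothing else.
draft = Spec(
    files_to_change=["src/auth/login.py"],
    ordered_steps=["Add button", "Handle callback"],
)
for question in unanswered_questions(draft):
    print("Missing:", question)
```

The draft above fails on three of the five questions, which is roughly what a ticket plus a chat log gives you.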

Addy Osmani gets the format right in How to Write a Good Spec for AI Agents. The value is not theory. The value is structure. Clear sections, explicit constraints, and concrete acceptance criteria give the model fewer places to improvise.

The broader shift shows up in Goptimise's AI development insights too. Teams are getting better results from AI by improving workflow structure, not by hunting for cleverer prompts. Solo founders need that lesson more than anyone because you feel every bad loop directly.

You can also see the same pattern in a practical AI agent workflow for solo builders. The wins come from handing the agent a stable operating document, then checking execution against it.

One sentence version:

TDD checks correctness, BDD clarifies behavior, agile stories frame priority, and SDD coordinates the entire build so an AI agent does not invent the missing parts.

That is the key distinction. SDD is not a replacement religion for software process. It is the missing layer between vague intent and reliable execution.

Why This Matters for AI Coding Agents

The problem starts before the first line of code.

If you've built with AI agents for more than a week, you know the pattern. You prompt a feature. The model builds something plausible. You test it. It missed a constraint you forgot to state. You patch the prompt. It fixes that and breaks something else. By the time it “works,” you've burned a pile of time on cleanup and drift.

That's why I think the core mistake isn't trusting AI too much. It's handing it vague intent and acting surprised when it improvises.

For a broader take on how teams are adapting their workflows around this shift, Goptimise's AI development insights are worth reading. The useful thread running through modern AI-assisted development is not magic prompting. It's better structure.

AI agents fail where specs are weak

Recent research gives this problem a more precise name. The technical distinction in spec-driven development is synchronization pressure: in spec-anchored workflows, a change has to update both the code and the spec, and automated checks fail when the two drift apart. An arXiv paper on executable specifications and synchronization describes these as executable specifications, where things like BDD scenarios or API contract tests turn the spec into a machine-checkable contract.

That's the part most solo builders miss.

A spec is not valuable because it's neat. It's valuable because it creates pressure against drift. It forces the AI and the human to stay aligned to the same declared behavior.
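One rough way to create that pressure is a check that fails when covered code changes without the spec changing alongside it. The sketch below is an illustration under stated assumptions, not the method from the paper: `spec_is_stale`, the spec path, and the glob pattern are all hypothetical, and the list of changed files would come from your own tooling (for example, the files touched by one commit).

```python
from fnmatch import fnmatch


def spec_is_stale(
    changed_files: list[str], spec_path: str, code_patterns: list[str]
) -> bool:
    """True when code covered by the spec changed but the spec file did not."""
    code_changed = any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in code_patterns
    )
    return code_changed and spec_path not in changed_files


# Hypothetical change set: auth code was edited, the spec was not.
changed = ["src/auth/google.py", "README.md"]
stale = spec_is_stale(changed, "specs/google-login.md", ["src/auth/*"])
print("spec drift" if stale else "in sync")
```

The sample change set touches `src/auth` without touching the spec, so the check reports drift. It only checks that the spec was edited, not that the edit is correct. Contract tests or BDD scenarios are what check the behavior itself.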

Here's a useful companion read on workflow design for agents: this guide to AI agent workflow patterns. The key idea is simple. Agents need persistent context more than they need clever prompts.

The spec becomes a contract

This is also why Thoughtworks put spec-driven development on its Technology Radar. Not because software teams suddenly rediscovered documentation, but because AI agents tend to lose context, invent structure, and fill blanks with confidence. A spec counters that by making the blanks smaller.

The practical impact shows up in a few places:

  • Scope control: The agent stops expanding the feature because the “not building” line is explicit.
  • Task decomposition: The work arrives pre-sliced instead of being improvised mid-flight.
  • Review quality: You can review the plan before reviewing the code.
  • Repeatability: The same spec can be used across sessions instead of being reconstructed from memory.

If the AI keeps surprising you, your inputs are still too loose.

That's what spec-driven development fixes for small teams and solo founders. Not coding speed. Decision clarity.

The Anatomy of a Minimal Viable Spec

Most specs are too long, too vague, or both.

For solo builders, the winning move is not “write the ultimate spec.” It's “write the smallest spec that prevents expensive mistakes.” The same way a product should have a minimum viable shape, your planning artifact should too. If you want a plain-language refresher on MVP thinking, Olvy's insights on Minimum Viable Product are useful because they keep the focus on what's necessary now, not what sounds impressive.

The minimal viable spec I recommend has five parts:

  1. TL;DR
  2. Scope
  3. Subtasks with acceptance criteria
  4. Assumptions tagged by risk
  5. Validation scenarios

If you want a deeper walkthrough on formatting, this guide to writing technical specifications complements the template below.

What your spec must include

Here's the standard I use.

  • TL;DR
    One sentence. One outcome. No adjectives.
    Example: “Add Google sign-in so users can create or access an account without a password.”

  • Scope
    Split this into Building and Not Building.
    This is where most wasted AI work dies early.

  • Subtasks with acceptance criteria
    Don't say “implement auth.” Break it apart. Name likely files or modules when you can. Define what done means.

  • Assumptions tagged by risk
    Tag each assumption as low, medium, or high risk in plain language.
    High-risk assumptions should trigger clarification before implementation.

  • Validation scenarios
    These are your executable intent checks. Not abstract quality goals. Real scenarios someone or something can verify.

Use this test: Could another person, or another model session, pick up this file tomorrow and build the same feature without guessing?
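The rule that high-risk assumptions should trigger clarification can be made mechanical. This is a small sketch, assuming assumptions are stored with a risk tag. The `Assumption` class, its `confirmed` flag, and the `blockers` function are hypothetical names for illustration, and the sample entries are paraphrased from the Google sign-in example used in this article.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    text: str
    risk: str  # "low", "medium", or "high"
    confirmed: bool = False  # hypothetical flag: set once you have verified it


def blockers(assumptions: list[Assumption]) -> list[Assumption]:
    """Return high-risk assumptions that nobody has confirmed yet."""
    return [a for a in assumptions if a.risk == "high" and not a.confirmed]


assumptions = [
    Assumption("Google OAuth credentials are already available.", "low"),
    Assumption("Current auth stack supports provider-based login.", "medium"),
    Assumption("Matching by email is acceptable for account linking.", "high"),
]

# Anything printed here should be resolved before the agent starts.
for a in blockers(assumptions):
    print("Clarify before implementing:", a.text)
```

Only the email-matching assumption blocks here. Low and medium risks stay in the spec as notes for review, but they do not stop the build.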

A concrete example

Below is a lean spec for Add social login with Google.

TL;DR

Allow users to sign in or register using their Google account.

Scope

Building

  • Google OAuth sign-in from login page
  • Google OAuth sign-in from signup page
  • Link returned Google identity to an existing user when emails match
  • Create a new user record when no account exists
  • Store Google provider ID, email, and avatar URL
  • Clear UI states for success and failure

Not Building

  • Apple login
  • Account linking from settings
  • Multi-provider identity management UI
  • Team or organization SSO
  • Custom onboarding after first Google sign-in

Subtasks

Frontend auth entry

  • Add “Sign in with Google” button to login and signup screens
  • Keep existing email/password flow untouched

Acceptance criteria

  • Button appears on both screens
  • Button starts OAuth flow
  • Existing auth form still works

OAuth callback handling

  • Handle success, cancellation, and provider error
  • Redirect authenticated user to the app home or intended destination

Acceptance criteria

  • Successful auth returns the user to the app in a signed-in state
  • Failed auth shows a clear error message
  • Cancellation does not create a user

User model updates

  • Extend user identity model to store provider metadata
  • Preserve support for local auth users

Acceptance criteria

  • User record can store Google provider ID
  • Existing users are not broken by schema changes
  • Matching email path does not create a duplicate account

Server-side auth logic

  • Verify identity returned from Google
  • Create or find user account
  • Start app session

Acceptance criteria

  • New Google user gets an account
  • Existing matching user gets signed in
  • Duplicate user creation is prevented

Assumptions

  • Low risk: Google OAuth credentials are already available.
  • Medium risk: Current auth stack can support provider-based login without large refactoring.
  • High risk: Matching by email is acceptable for account linking in this product.

Validation scenarios

  • User clicks Google sign-in on login page, approves access, returns signed in.
  • New user signs up with Google and receives a valid account.
  • Existing user with same email signs in and is linked correctly.
  • User cancels OAuth and sees a non-destructive error state.
  • Provider error does not create a broken session or partial user.

That's enough to hand to an AI agent.

Not because it's elegant. Because it removes the obvious places where the agent would otherwise guess. It defines the goal, the edges, the implementation slices, the risky assumptions, and the checks. That's what a useful spec does.
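To show how the validation scenarios become checkable, here is a sketch of the create-or-find logic with the scenarios written as assertions. Everything in it is an assumption for illustration: the in-memory user list stands in for a real database, `handle_google_callback` and the status strings are hypothetical, and the profile keys (`sub`, `email`, `picture`) are modeled on the claims Google returns. It also treats the provider's email as trustworthy, which is exactly the high-risk assumption flagged in the spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    email: str
    google_id: Optional[str] = None
    avatar_url: Optional[str] = None


def handle_google_callback(
    users: list[User], status: str, profile: Optional[dict] = None
) -> Optional[User]:
    """Return the signed-in user, or None when no session should start."""
    if status != "success" or profile is None:
        return None  # cancellation or provider error: create nothing
    email = profile["email"].lower()
    # Prefer the stable provider ID for returning Google users.
    user = next((u for u in users if u.google_id == profile["sub"]), None)
    if user is None:
        # Fall back to email matching (the spec's high-risk assumption).
        user = next((u for u in users if u.email.lower() == email), None)
    if user is None:
        user = User(id=len(users) + 1, email=email)
        users.append(user)
    user.google_id = profile["sub"]
    user.avatar_url = profile.get("picture")
    return user


users = [User(id=1, email="ana@example.com")]

# Existing user with the same email is linked, not duplicated.
linked = handle_google_callback(
    users, "success", {"sub": "g-1", "email": "Ana@example.com"}
)
assert linked is users[0] and linked.google_id == "g-1" and len(users) == 1

# New Google user gets an account.
new = handle_google_callback(users, "success", {"sub": "g-2", "email": "bo@example.com"})
assert new.id == 2 and len(users) == 2

# Returning Google user signs in without a second account.
again = handle_google_callback(users, "success", {"sub": "g-2", "email": "bo@example.com"})
assert again is new and len(users) == 2

# Cancellation and provider errors create no user and no session.
assert handle_google_callback(users, "cancelled") is None
assert handle_google_callback(users, "error") is None
assert len(users) == 2
print("all scenarios pass")
```

Each assertion traces back to one line in the spec. If the agent's real implementation cannot pass the equivalent checks against your actual stack, the spec tells you which scenario it broke.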

How to Adopt SDD Without the Bureaucracy

The internet loves turning a good habit into a religion.

If you're solo, you don't need a formal ritual around every feature. You need the minimum structure that keeps the AI from freelancing inside your codebase.

The best adoption path still follows Böckeler's taxonomy, published on Martin Fowler's site. It gives you three levels instead of one dogma: spec-first, spec-anchored, and spec-as-source. That matters because the usual recommendation is to start with the first level rather than jump straight to the most rigid version.

Start lighter than the internet tells you

For most solo founders, this progression works.

Stage | What you do | When it's enough
Spec-first | Write the spec before implementation | New features and medium-complexity changes
Spec-anchored | Keep spec and code in sync as the feature evolves | Core flows you'll revisit often
Spec-as-source | Treat the spec as the primary maintained artifact | Narrow cases where generation is highly disciplined

My opinion is blunt. Start with spec-first. Stay there longer than the hype cycle tells you. Only move to spec-anchored for parts of the product that keep changing and keep breaking. Most solo builders do not need spec-as-source across the whole app.

That's also where the “productivity trap” criticism gets traction. People adopt the heaviest version first, then wonder why they spend half the day grooming documents.

The traps that waste your time

You'll run into the same three problems fast.

  • Spec drift
    You changed code but didn't update the spec.
    Fix: update the spec in the same session as the behavior change.

  • Analysis paralysis
    You keep polishing the plan instead of shipping.
    Fix: stop when the next risky decision is clear enough to implement.

  • Gold-plated specs
    You document every edge case before proving the feature matters.
    Fix: only specify the core flow, key constraints, and likely failure paths first.

Write specs to reduce rework, not to perform seriousness.

A useful litmus test is simple. If the spec took longer to write than it saved in avoided churn, it was too heavy. If the AI still made major structural guesses, it was too thin.

The sweet spot is boring. That's why it works.

Choosing Your SDD Toolset as a Solo Builder

Tools matter less than people think. Shape matters more.

You do not need a dedicated platform to start. A markdown file in your repo is enough if you already know how to define scope, tasks, assumptions, and validation without wandering.

When markdown is enough

Use plain repo docs if:

  • Your feature is small: one route, one job, one integration, one model change.
  • You know the codebase well: the missing context lives in your head and that's still manageable.
  • You can review tightly: you're comfortable catching drift by hand.
  • You want zero overhead: no new interface, no workflow migration.

GitHub's Spec Kit is a strong reference point because it pushes a disciplined shape around specification and planning. Kiro and similar tools push further into dedicated workflows. OpenSpec is interesting if you care about standards and portability. Tools like Traycer, Vibe Kanban, and BMAD sit around adjacent workflow needs, with different tradeoffs around structure, planning, and execution style.

When you want more structure

You should look beyond raw markdown when one of these becomes true:

  • You keep rewriting the same kind of spec
  • You work in a brownfield codebase with lots of hidden constraints
  • Your agent sessions lose continuity
  • You want a repeatable intake process before implementation starts

That's where workflow-specific tools help. Some focus on generating structured specs. Some focus on orchestration. Some focus on standardization.

One starting point is this guide on spec-driven development with Claude Code, especially if Claude Code is your main execution environment.

Another is Tekk.coach. It connects to a GitHub repo, runs a structured interview, and produces a codebase-aware spec you then hand manually to tools like Cursor, Claude Code, Codex, or Gemini. It does not create PRs, and it does not orchestrate external coding agents. If you're a solo builder who doesn't want to invent the spec format from scratch every time, that's a practical lane.

The right choice is the one that makes specs easier to write, easier to review, and harder to ignore. If a tool adds process without increasing clarity, skip it.

Stop Prompting, Start Specifying

Prompting still matters. You should know the basics, and DocsBot's guide to essential prompt techniques is a decent refresher. But prompting is not enough once the work gets real.

Spec-driven development comes down to this: a persistent artifact that gives your AI agent less room to guess and gives you something concrete to review. Not a manifesto. Not a frozen plan. Not markdown theater.

If you build alone, this is the discipline that keeps AI useful after the demo.

The bottleneck isn't typing code anymore. It's deciding what should be built, what should not be built, and how you'll know it's correct.

Start there.

