Most advice on specs for AI agents is backwards.
People tell you to write more. More detail. More planning. More documents. More templates. Then they act surprised when builders call spec-driven development “waterfall with markdown,” or complain about frozen docs, spec drift, and token-burning ceremony. They're not wrong.
The fix isn't to abandon specs. The fix is to stop writing bloated ones.
A good spec for AI work is a minimal spec. Not vague. Not lazy. Minimal in the strict sense: keep only the constraints that are necessary to get a correct build. That matches the old standards-design rule from the Free Software Foundation Europe, “Remove everything that is not absolutely necessary,” a principle they tie to efficiency, learnability, simplicity, and longevity in technical formats, with open standards as the basis for interoperability and long-term access in systems like the web (FSFE on minimalistic standards).
My view is simple. Six sections, in order.
(1) TL;DR.
(2) Scope.
(3) Subtasks.
(4) Assumptions.
(5) Validation.
(6) Open questions.
Nothing else. Anything more is usually ceremony.
Table of Contents
- Specs Are a Trap And Why Minimal Specs Are the Escape
- The Six-Section Structure Defined
- Why AI Agents Need Scaffolding Not Blueprints
- Anatomy of a Minimal Spec
- A Real-World Minimal Spec for a New Feature
- Minimal Specs in Practice Tools and Frameworks
- Keeping Specs Alive Without the Ceremony
Specs Are a Trap And Why Minimal Specs Are the Escape
The backlash is deserved.
If you've tried heavyweight spec workflows, you've probably felt the drag. You ask an agent for a feature and end up buried under constitutions, plans, task trees, and markdown that nobody wants to reread. GitHub Spec Kit has pushed this style into the mainstream, and the criticism is predictable because a lot of it is fair. Some builders on Reddit have openly described that workflow as exhausting and too heavy for day-to-day shipping, especially when you're solo and the cost of reviewing docs starts to exceed the cost of reviewing code itself (OpenAIDev thread on Spec Kit overhead).
That doesn't mean specs are the problem. Bloat is the problem.
The useful question isn't “should you write specs?” It's “what is the smallest spec that still prevents rework?” That boundary matters. The original Min Specs method is blunt about it. Start with a longer list of must-dos and must-not-dos, then delete any rule that can be broken without losing the core purpose. The hard part is the deletion step: teams often stop at the long list and never test whether each rule is indispensable. The result is either mushy under-specification or ceremonial over-specification. Liberating Structures makes that exact point, and also notes that ambiguity and incomplete requirements drive rework, which is why “minimum” has to be disciplined, not merely short (Min Specs method).
The backlash is pointing at the wrong target
Smart critics aren't saying “clarity is bad.” They're saying this stuff turns into process theater fast.
Birgitta Böckeler's taxonomy is useful here. Spec-first, spec-anchored, and spec-as-source are not the same thing. A lot of the backlash is really against the most rigid version, where the spec becomes the center of the universe and every change has to pass through a markdown ritual. That's a valid concern. Specs can freeze. They can drift. They can become compliance artifacts instead of working documents.
Practical rule: If the spec takes longer to audit than the code, you've already lost.
Minimal specs solve the actual complaint
A minimal spec is not anti-agile. It's anti-noise.
That's why Addy Osmani's six-part structure lands so well with builders. It's small enough to hold in your head and concrete enough to execute. It gives the agent boundaries and gives you a review surface that isn't absurd. You can still iterate. You can still cut scope mid-flight. You can still throw away bad code.
What you stop doing is pretending a giant planning stack is progress.
The Six-Section Structure Defined
Addy Osmani's canonical framing is the cleanest version I've seen: write the spec like a working brief, not like a corporate artifact. The point is to give the agent enough structure to act without stuffing the prompt with junk it doesn't need (Addy Osmani on writing a good spec for AI agents).

Why six sections is enough
Here's the structure I'd use for nearly every feature:
1. TL;DR: One paragraph. What you're building, why it matters, and the core outcome.
2. Scope: Two lists. Building and Not Building. This kills accidental scope creep early.
3. Subtasks: A small set of vertical slices. Each one should include acceptance criteria and file references where possible.
4. Assumptions: Things you believe to be true, tagged by risk. If an assumption breaks, the build changes.
5. Validation: End-to-end scenarios. Not vague “test this” notes. Concrete checks a human can run.
6. Open Questions: Unresolved decisions that can block implementation or cause wrong decisions if left implicit.
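If it helps to see the six sections as data, here is a minimal sketch in TypeScript. The type and field names (`MinimalSpec`, `lintMinimalSpec`, and so on) are hypothetical and chosen for illustration; they are not a published schema from any of the tools or authors mentioned here. The lint function only checks structure, not whether the content is any good.

```typescript
// Hypothetical types: names are illustrative, not a published schema.
type RiskLevel = "low" | "medium" | "high";

interface Subtask {
  title: string;
  files: string[];              // file references, where known
  acceptanceCriteria: string[]; // should be non-empty for every subtask
}

interface Assumption {
  statement: string;
  risk: RiskLevel;
}

interface MinimalSpec {
  tldr: string;
  scope: { building: string[]; notBuilding: string[] };
  subtasks: Subtask[];
  assumptions: Assumption[];
  validation: string[];    // end-to-end scenarios a human can run
  openQuestions: string[]; // may legitimately be empty
}

// Returns human-readable problems. An empty array means the spec passes
// these structural checks; it says nothing about content quality.
function lintMinimalSpec(spec: MinimalSpec): string[] {
  const problems: string[] = [];
  if (spec.tldr.trim().length === 0) problems.push("TL;DR is empty");
  if (spec.scope.building.length === 0) problems.push("Scope: nothing under Building");
  if (spec.scope.notBuilding.length === 0) problems.push("Scope: nothing under Not Building");
  if (spec.subtasks.length === 0) problems.push("Subtasks: none defined");
  for (const task of spec.subtasks) {
    if (task.acceptanceCriteria.length === 0) {
      problems.push(`Subtask "${task.title}" has no acceptance criteria`);
    }
  }
  if (spec.validation.length === 0) problems.push("Validation: no scenarios");
  return problems;
}

// A tiny example spec, loosely based on a like-button feature.
const likeButtonSpec: MinimalSpec = {
  tldr: "Add a like button to blog posts for signed-in users.",
  scope: {
    building: ["like and unlike", "count display"],
    notBuilding: ["dislikes", "notification emails"],
  },
  subtasks: [
    {
      title: "Like endpoint",
      files: [],
      acceptanceCriteria: ["one like per user per post"],
    },
  ],
  assumptions: [{ statement: "Users are already authenticated on post pages", risk: "low" }],
  validation: ["Signed-in user clicks Like and the count increments"],
  openQuestions: ["Should anonymous users see the count?"],
};

const likeButtonSpecProblems = lintMinimalSpec(likeButtonSpec);
```

The design choice worth noticing: Open Questions is allowed to be empty, but Not Building is not. An empty Not Building list usually means nobody thought about the edges.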
That's enough because a minimal spec still needs the core controls of a real technical spec. Industry guidance is clear on the essentials: purpose and scope, functional requirements, technical standards, testing requirements, and delivery or support constraints are the threshold for a document a developer can build from without constant clarification (Heretto on technical specifications).
What this leaves out on purpose
You don't need a fake executive summary. You don't need narrative background unless it changes implementation. You don't need architecture fan fiction. You don't need a constitution for a one-person product team trying to ship a billing fix on Thursday night.
A lot of teams overstuff specs because they confuse completeness with usefulness.
Minimal specs are not short by accident. They're short because every surviving line has a job.
That's also why I wouldn't blindly copy the infographic-style categories people often use for project docs like motivation, goals, non-goals, requirements, milestones, and open questions. Those can help as brainstorming lenses, but for execution with AI agents, the six sections above are tighter. They map directly to what the agent needs next.
A recipe is a better model than a manifesto. Ingredients. Boundaries. Steps. Checks. Unknowns. Done.
Why AI Agents Need Scaffolding Not Blueprints
AI agents don't need a novel. They need a frame.
If you hand Cursor, Claude Code, Codex, or Gemini a one-line prompt, you get guesswork. If you hand them an overbuilt planning stack, you get diffusion. The model starts paying attention to details that don't matter yet, or worse, details you added because they sounded responsible. That's how you burn cycles on code you never asked for.

Specification limits beat wishful prompting
There's a useful quality-engineering analogy here. In Statistical Process Control, specification limits are the externally defined requirements, while control limits are derived from process behavior and are commonly set at ±3 sigma from the mean. That distinction matters because it separates “what must be true” from “what usually happens” (6Sigma on specification limits and control limits).
That's exactly what a minimal spec does for AI work. It defines the smallest set of essential limits that determine whether the output is acceptable.
Without those limits, you aren't guiding an agent. You're hoping.
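To make the SPC distinction concrete, here is a small numeric sketch. The latency samples and the 200 ms requirement are made-up illustrative values, and using the sample standard deviation for sigma is a simplification (real control charts often estimate sigma differently). The point is only that control limits come from the data, while specification limits come from outside it.

```typescript
interface Limits {
  lower: number;
  upper: number;
}

// Control limits: derived from observed behavior as mean ± 3 sigma.
// Sigma here is the sample standard deviation (a simplifying assumption).
function controlLimits(samples: number[]): Limits {
  if (samples.length < 2) {
    throw new Error("need at least two samples to estimate sigma");
  }
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  const squaredDeviations = samples.reduce((sum, x) => sum + (x - mean) * (x - mean), 0);
  const sigma = Math.sqrt(squaredDeviations / (samples.length - 1));
  return { lower: mean - 3 * sigma, upper: mean + 3 * sigma };
}

function withinLimits(value: number, limits: Limits): boolean {
  return value >= limits.lower && value <= limits.upper;
}

// Hypothetical response times in milliseconds (illustrative data only).
const latencySamplesMs = [100, 102, 98, 101, 99];

// "What usually happens": roughly 95.3 ms to 104.7 ms for this data.
const observedControl = controlLimits(latencySamplesMs);

// "What must be true": an externally imposed requirement (assumed value).
const requiredSpec: Limits = { lower: 0, upper: 200 };

// A 110 ms response is unusual for this process but still acceptable.
const unusualButAcceptable =
  !withinLimits(110, observedControl) && withinLimits(110, requiredSpec);
```

A 110 ms response falls outside the control limits and inside the specification limit. A minimal spec is the second kind of limit: it says what is acceptable, not what the agent typically produces.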
A good minimal spec answers questions like:
- What must exist: a button, endpoint, migration, state update
- What must not happen: schema changes, extra UI states, added dependencies
- What counts as done: visible behavior, expected API response, test pass condition
- What needs a human answer: unresolved product or edge-case decisions
That's scaffolding. It's enough to support construction. It's not a finished building plan.
Why this works better with real agent workflows
In practice, you want the agent to operate inside a narrow lane. That lane gets even clearer when your spec carries measurable requirements and explicit interface constraints. Best-practice technical spec guidance pushes exactly that: endpoint parameters, error codes, rate limits, timeout parameters, versioning rules, backward compatibility, and test procedures. Those details reduce interpretive drift and turn validation into something objective instead of debatable (Docsie glossary on technical specifications).
That matters beyond code generation. If you need to test browser flows, scrape stateful pages, or validate user-visible behavior, you may pair a coding agent with an AI browser agent. The point isn't tool hype. The point is that better specs make every downstream tool less confused.
For day-to-day execution, the better pattern is usually a planning layer plus a coding layer. Keep the spec stable enough to guide implementation, then hand it to the code agent in focused chunks. If your current workflow is loose and chat-driven, tightening it with a spec-first pass often works better than adding yet another giant prompt to the pile. The mechanics are similar to the approach described in this guide on an AI agent workflow for shipping features.
Blueprints try to pre-answer everything. Scaffolding answers only what must be true before work starts.
That difference is why minimal specs survive contact with real product work.
Anatomy of a Minimal Spec
The minimal spec format only works if each section carries its weight. If one section is vague, another section has to absorb the ambiguity, and that's when the document starts expanding for the wrong reasons.
The six parts that matter
1. TL;DR
This is the single paragraph version of the build. It should say what's changing, why it matters, and the success condition in plain language.
Good example:
“Add a like button to blog posts so signed-in users can like and unlike a post. Store the user's reaction, show the updated count immediately, and prevent duplicate likes from the same account.”
Bad TL;DRs sound like project blurbs. Good ones sound like implementation intent.
2. Scope
This is where you draw hard edges. Use Building and Not Building. If you skip this, the agent invents neighboring features because they feel adjacent.
Good example:
- Building: like and unlike, count display, authenticated user check
- Not Building: dislikes, reaction analytics, notification emails
Short is good here. Sharp is better.
3. Subtasks
This is the execution core. Break the work into vertical slices. Include file references if you know them. Every subtask needs acceptance criteria, or it's just a to-do item wearing a nicer shirt.
Good example:
- Add `POST /api/posts/:id/like`
- Update `PostFooter.tsx` to render current like state
- Add persistence logic in `likes.repository.ts`
Each subtask should be independently reviewable.
What good minimalism looks like in practice
4. Assumptions
Assumptions stop hidden decisions from sneaking into code. Tag them by risk. I like Low, Medium, High because it forces you to notice when half the feature depends on a guess.
Examples:
- Low risk: users are already authenticated on post pages
- Medium risk: existing post serializer can return like counts without performance issues
- High risk: current schema can support unique user-post likes without migration pain
If an assumption is fragile, name it. Don't bury it in prose.
5. Validation
This is where weak specs usually fail. “Test the feature” is not validation. Validation should be executable by a human and precise enough that two reviewers would agree on the outcome.
Use scenario language:
- Signed-in user opens a post with no prior like.
- User clicks Like.
- Button state changes immediately and count increments.
- Refreshing the page preserves the liked state.
- Clicking again removes the like and decrements the count.
That's enough for a human or an agent to check.
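Scenarios written this way also translate almost line for line into an automated check. The sketch below assumes a hypothetical in-memory `InMemoryLikeStore` standing in for real persistence; the class and method names are invented for illustration. The page-refresh step is approximated by reading the state back from the store, since there is no browser here.

```typescript
// Hypothetical in-memory stand-in for the likes persistence layer.
class InMemoryLikeStore {
  // postId -> set of userIds who liked it; a Set makes repeat likes a no-op.
  private likersByPost = new Map<string, Set<string>>();

  like(userId: string, postId: string): void {
    const likers = this.likersByPost.get(postId) ?? new Set<string>();
    likers.add(userId);
    this.likersByPost.set(postId, likers);
  }

  unlike(userId: string, postId: string): void {
    this.likersByPost.get(postId)?.delete(userId);
  }

  hasLiked(userId: string, postId: string): boolean {
    return this.likersByPost.get(postId)?.has(userId) ?? false;
  }

  countFor(postId: string): number {
    return this.likersByPost.get(postId)?.size ?? 0;
  }
}

// Walks the validation scenario step by step and returns the labels of any
// steps that failed. An empty array means every step passed.
function runLikeScenario(store: InMemoryLikeStore): string[] {
  const failures: string[] = [];
  const check = (label: string, passed: boolean): void => {
    if (!passed) failures.push(label);
  };
  const userId = "user-1";
  const postId = "post-1";

  check("post starts with no prior like",
    !store.hasLiked(userId, postId) && store.countFor(postId) === 0);

  store.like(userId, postId);
  check("like sets liked state and increments count",
    store.hasLiked(userId, postId) && store.countFor(postId) === 1);

  // Stand-in for "refresh the page": read the persisted state again.
  check("liked state persists on re-read", store.hasLiked(userId, postId));

  store.like(userId, postId);
  check("duplicate like does not change the count", store.countFor(postId) === 1);

  store.unlike(userId, postId);
  check("unlike clears liked state and decrements count",
    !store.hasLiked(userId, postId) && store.countFor(postId) === 0);

  return failures;
}

const likeScenarioFailures = runLikeScenario(new InMemoryLikeStore());
```

Each `check` label mirrors one line of the scenario, so a failure points straight back at the spec.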
6. Open Questions
This section exists so uncertainty stays visible. If you hide open questions inside assumptions, the agent tends to answer them implicitly.
Good examples:
- Should anonymous users see the count or only signed-in users?
- Do we soft-delete likes or hard-delete them?
- Do we need an index on the likes table now or only if query latency becomes visible?
One hard rule: If a question can materially change the implementation, keep it out of the body and put it in Open Questions.
A minimal spec is only complete if a developer can build from it without repeated clarification on the core path. That means preserving the implementation controls that matter and not confusing brevity with omission. If you want a deeper walkthrough of full technical spec structure, this guide on how to write technical specifications is worth keeping nearby.
A Real-World Minimal Spec for a New Feature
Heavy specs fail on small features all the time. The failure mode is familiar: people spend more time maintaining the document than shipping the change, the agent burns tokens restating obvious context, and the spec drifts the moment implementation meets the codebase.
A minimal spec earns its keep by surviving contact with the work. If this example feels almost too short, that is the point.

Example spec
TL;DR
Add a like button to each blog post page for signed-in users. A user can like a post once, remove their like, and see the count update without a full page reload. The goal is simple engagement feedback, not a broader reactions system.
Scope
Building
- Like and unlike action for authenticated users
- Visible like count on blog post pages
- Persisted user-post like state
- Optimistic UI update with rollback on failure
Not Building
- Dislike button
- Emoji reactions
- Push or email notifications
- Author analytics dashboard
- Anonymous likes
Subtasks
1. Backend endpoint and persistence
Add create/delete like behavior in `app/api/posts/[id]/like/route.ts` and persistence logic in `lib/likes.ts`.
Acceptance criteria: authenticated request creates one like per user per post; repeat like request does not create duplicates; unlike removes existing record; unauthenticated request returns a clear auth error.
2. Frontend button state
Update `components/PostActions.tsx` to render current count and user-like state.
Acceptance criteria: signed-in user sees Like or Liked state correctly on first render; clicking toggles state without page refresh; count updates immediately; error state restores previous UI state.
3. Data loading
Extend post detail loader in `lib/posts.ts` to return `likeCount` and `viewerHasLiked`.
Acceptance criteria: page render includes both values; pages without signed-in user still show count but no interactive action.
Assumptions
- Low risk: blog post pages already have access to current session data.
- Medium risk: existing post loader can be extended without touching unrelated listing pages.
- High risk: current database schema can support a unique `(user_id, post_id)` constraint without migration conflicts.
Validation
- Signed-in user opens a post they haven't liked, clicks Like, sees count increase and button switch to Liked.
- User refreshes the page and still sees Liked with the same count.
- User clicks again and the count decreases.
- Signed-out user opens the same post, sees the count, and cannot trigger a like action.
- Duplicate like attempts from the same account never create more than one active like for the same post.
Open Questions
- Should signed-out users see a disabled button or no button at all?
- Do we want soft delete history for likes, or is hard delete enough?
- Do we need an index beyond the uniqueness constraint for post pages with heavy read traffic?
Why this example works
This is the level of detail I trust with an AI agent and with another engineer. It pins down behavior, constraints, file touch points, and failure handling without turning a two-state feature into a mini product brief.
The trade-off is intentional. A heavier template such as GitHub Spec Kit can be useful when multiple teams, approvals, or long-lived design decisions are involved. For a feature like this, that extra structure usually becomes ceremony. The team starts filling boxes because the template asks for them, not because the implementation needs them.
The minimal version avoids that trap.
It gives the agent enough scaffolding to act with confidence, but not enough room to wander into fake completeness. The scope cuts off adjacent ideas. The subtasks map to real code changes instead of org chart handoffs. The assumptions name the places most likely to break. The validation steps are concrete enough to catch regressions. The open questions stop unresolved product calls from getting implicitly decided in code.
That last point matters more than spec advocates sometimes admit. Smart engineers push back on specs because bloated specs age badly. They remember stale requirements, fake certainty, and long documents nobody rereads. A short spec like this addresses that criticism directly. It stays cheap to update, so it has a chance of staying true.
If I were handing this to an agent, I would expect a usable first pass. If I were reviewing it as a human, I would know where to challenge the plan before code lands. That is the bar.
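For a sense of what a first pass at the "optimistic UI update with rollback on failure" item might look like, here is a sketch of the state logic only. It reuses the `likeCount` and `viewerHasLiked` names from the spec; everything else (`LikeViewState`, `LikeRequestOutcome`, the idea that the server echoes back a confirmed count) is an assumption made for illustration, not something the spec dictates.

```typescript
// State the button renders from. Field names follow the spec's loader output.
interface LikeViewState {
  likeCount: number;
  viewerHasLiked: boolean;
}

// Hypothetical shape of a settled request. Assumes the server returns the
// confirmed count on success; a real endpoint may respond differently.
type LikeRequestOutcome =
  | { ok: true; likeCount: number }
  | { ok: false };

// Step 1: compute what to render immediately, before the server responds.
function applyOptimisticToggle(current: LikeViewState): LikeViewState {
  const nowLiked = !current.viewerHasLiked;
  const delta = nowLiked ? 1 : -1;
  return {
    viewerHasLiked: nowLiked,
    likeCount: Math.max(0, current.likeCount + delta), // never show a negative count
  };
}

// Step 2: once the request settles, either keep the optimistic state
// (adopting the server's count) or roll back to the previous state.
function settleToggle(
  previous: LikeViewState,
  optimistic: LikeViewState,
  outcome: LikeRequestOutcome
): LikeViewState {
  if (outcome.ok === true) {
    return { ...optimistic, likeCount: outcome.likeCount };
  }
  return previous; // rollback on failure
}

// Usage: a failed request restores exactly what the user saw before the click.
const beforeClick: LikeViewState = { likeCount: 3, viewerHasLiked: false };
const shownImmediately = applyOptimisticToggle(beforeClick);
const afterFailure = settleToggle(beforeClick, shownImmediately, { ok: false });
```

Keeping both steps as pure functions is a deliberate choice in this sketch: the acceptance criterion "error state restores previous UI state" becomes a one-line assertion, independent of whichever component or fetch wrapper ends up calling it.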
Minimal Specs in Practice Tools and Frameworks
There isn't one right way to operationalize the minimal spec format. The right choice depends on whether you want total manual control, a heavier framework, or help generating codebase-aware specs.
Spec framework comparison
| Approach | Overhead | AI Agent Portability | Codebase Awareness | Learning Curve |
|---|---|---|---|---|
| DIY markdown in repo | Low | High | Manual | Low |
| GitHub Spec Kit | High | High | Partial, depends on workflow | Medium to high |
| Starter repo pattern like micmcc/spec-driven-development-starter | Low to medium | High | Manual | Low |
| Generated spec workflow with Tekk.coach | Medium | High, because you manually hand the spec to Cursor, Claude Code, Codex, or Gemini | Yes, from your GitHub repo | Low to medium |
What each approach gets right and wrong
DIY markdown is the cleanest fit for minimalists. A specs/ folder with one file per feature is enough. Repos like micmcc's spec-driven-development-starter show a practical shape for this. You keep control, keep the format lean, and avoid framework dogma.
The downside is discipline. You have to write well. You also have to keep the docs updated yourself.
GitHub Spec Kit is more formal. The upside is structure. The downside is obvious to anyone who's tried to use it on a small product: it can get heavy fast. The GitHub Spec Kit launch post lays out the broader workflow clearly, but that same breadth is why many solo builders bounce off it. The complaint isn't that it has ideas. The complaint is that it often asks for too much ceremony before useful code appears.
The criticism shows up in real user discussion too. Builders in the Claude Code community have shared leaner spec habits that look much closer to a compact execution brief than a full planning stack, because they want enough structure to guide the model without slowing themselves down (Claude Code discussion with a lean spec example).
Generated spec workflows sit in the middle. They can save time if the system reads the repo, asks clarifying questions, and outputs a six-section spec you can use. The risk is that generated docs can become polished fluff if the tool values appearance over implementation detail.
The format matters more than the tool. A bad tool can still produce a good spec if the structure is tight. A fancy tool can still waste your time if the structure is bloated.
For solo builders, that's the filter. Pick the lightest system that consistently produces buildable specs.
Keeping Specs Alive Without the Ceremony
Spec drift happens when the document tries to be history, plan, and compliance artifact at the same time.
A spec should stay alive only for as long as it improves implementation or review. After that, code and tests take over. Treating specs like stone tablets is what creates frozen docs nobody trusts. Treating them like disposable chat output creates the opposite problem. You lose the one artifact that tells the agent and the human what “done” meant.

A simple rule for avoiding spec drift
Use one rule.
Update the spec only when the change affects an unmet acceptance criterion, scope boundary, or open question.
That's it.
If you rename a variable, don't touch the spec. If you move a helper file, don't touch the spec. If you decide the feature now needs anonymous read-only access, yes, update the spec. If validation changed, yes. If a high-risk assumption broke, yes.
What to update and what to ignore
Keep the maintenance loop brutally small:
- Update TL;DR if the outcome changed.
- Update Scope if you added or removed a user-visible boundary.
- Update Subtasks only if the implementation path materially changed.
- Update Assumptions when a guess becomes fact, or fails.
- Update Validation when success conditions change.
- Update Open Questions when a decision is made or a new blocker appears.
Everything else belongs in code review, commit history, or tests.
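The maintenance loop above is small enough to write down as a lookup. This is a sketch with hypothetical change labels (`ChangeKind` and its values are invented for illustration); the mapping itself follows the list above, and anything unmapped returns `null`, meaning "leave the spec alone."

```typescript
type SpecSection =
  | "TL;DR"
  | "Scope"
  | "Subtasks"
  | "Assumptions"
  | "Validation"
  | "Open Questions";

// Hypothetical labels for kinds of change; the names are illustrative.
type ChangeKind =
  | "variable-renamed"
  | "helper-file-moved"
  | "outcome-changed"
  | "scope-boundary-changed"
  | "implementation-path-changed"
  | "assumption-confirmed-or-broken"
  | "success-condition-changed"
  | "decision-made-or-new-blocker";

// Only changes listed here touch the spec. Everything else belongs in
// code review, commit history, or tests.
const SECTION_FOR_CHANGE: Partial<Record<ChangeKind, SpecSection>> = {
  "outcome-changed": "TL;DR",
  "scope-boundary-changed": "Scope",
  "implementation-path-changed": "Subtasks",
  "assumption-confirmed-or-broken": "Assumptions",
  "success-condition-changed": "Validation",
  "decision-made-or-new-blocker": "Open Questions",
};

// Returns the section to update, or null when the spec should not change.
function specSectionToUpdate(change: ChangeKind): SpecSection | null {
  return SECTION_FOR_CHANGE[change] ?? null;
}

// Renaming a variable does not touch the spec; adding anonymous
// read-only access is a scope change and does.
const renameTarget = specSectionToUpdate("variable-renamed");
const anonymousAccessTarget = specSectionToUpdate("scope-boundary-changed");
```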
If you need a more traditional artifact for stakeholder alignment, use a lightweight planning doc or a compact PRD template. Just don't shove that material into the execution spec by default. A separate product requirements document template can help when you need broader product framing without polluting the implementation brief.
A living spec should be easy to trust because it is easy to update. If it feels expensive to maintain, it will rot.
The minimal spec format works because it gives you one small document that can survive contact with real feature work. Not because it's fashionable. Not because it sounds rigorous. Because it cuts away everything that doesn't directly help you ship.
If you want help producing this kind of spec without writing every line from scratch, Tekk.coach reads your GitHub repo, asks structured follow-up questions, and generates a codebase-aware spec in this style. Connect your GitHub repo. Describe the problem. Get a structured spec. Ship.
Part of the Spec-Driven Development pillar — a 52-page honest playbook on shipping with AI coding agents.

