Build an AI Sales Agent
Free spec to build an AI sales agent — a research-personalize-outreach loop with a reply classifier closing the cycle. Framework-agnostic: pick model, CRM, email provider, and signal source in STACK.md. Includes typed contracts, signal-driven personalization, RFC 8058 unsubscribe headers, hand-labeled classifier eval, and deliverability hardening.
How to use this spec
- Click any row above to open the full task — title, description, subtasks, AI instructions, the works. Same layout the product uses internally.
- Hit Copy as Prompt in the right sidebar of any task. You'll get the XML-wrapped prompt Tekk uses internally — paste it into Cursor, Claude Code, Codex, ChatGPT, or anywhere else and the agent has the full task context.
- Open in jumps the same prompt directly into v0, Lovable, Bolt, Magic Patterns, Replit, or Cursor with one click.
What you're building
Goal 1: Ship a working outbound sales agent that sends only signal-driven personalized emails, classifies every reply into five categories, hands off positive replies to a human within minutes, never re-emails an unsubscribed prospect, and survives production with idempotent sends, RFC 8058 unsubscribe headers, bounce-webhook suppression, and a hand-labeled classifier eval.
The architecture is a research-personalize-outreach loop with a reply classifier closing the cycle. Five pick-once tooling decisions in STACK.md (model, language, email provider, CRM, signal source) parameterize every later task. Acceptance bar: classifier hits ≥85% accuracy on positive_interest and ≥95% on unsubscribe across a 50-example hand-labeled set; dry-run campaign on 10 mocked prospects emits zero sends to suppressed addresses, cites a verified signal_id in every email body, and produces structured trace events one-per-loop-phase.
Architecture
flowchart LR CRM[(CRM<br/>system of record)] -->|list_prospects| ICP[ICP Filter<br/>passes_icp] ICP -->|prospect| Signal[Signal Scraper<br/>scrape_signal] Signal -->|BuyingSignal| Personalize[Personalizer<br/>structured output] Personalize -->|DraftEmail| Send[Email Sender<br/>idempotent + RFC 8058] Send -->|SendResult| Wait((wait N days)) Wait --> Classifier[Reply Classifier<br/>5 categories] Classifier -->|positive_interest| Handoff[Human Handoff] Classifier -->|soft_positive / objection| Drafted[Drafted Reply Queue] Classifier -->|ooo| Followup[Followup Brain<br/>should_followup] Classifier -->|unsubscribe| Suppress[Suppression List<br/>+ CRM do_not_contact] Followup -->|send_now + fresh signal| Personalize Followup -->|mark_dormant| CRM Send -.->|log_activity| CRM Classifier -.->|log_activity| CRM
The three load-bearing decisions: signal quality over volume (the personalizer raises NoSignalError when scrape_signal returns None — signal-less sends are blocked in code, not policy), reply classifier as the value gate (five-category routing turns the inbound reply flood into a triaged queue, with positive_interest reaching a human in minutes and unsubscribe immediately suppressing), and CRM as the system of record (the agent reads prospect state from the CRM at the top of every loop and only ever appends activities or sets lifecycle_stage='unsubscribed'; editor-set do_not_contact is never overridden).
When an autonomous sales agent makes sense (and when it doesn't)
The 2026 data is in: fully autonomous AI SDRs have not replaced human sales teams. AI cold-email reply rate is 4.1% vs human 5.2%; spam-flag rate is 8% AI vs 3% human; 47% of AI SDR deployments suffer domain reputation collapse inside 90 days. Build this when the loop is signal-driven and reply-triaged — not as a volume play.
✓ Build an AI sales agent when
- You have a verified-signal source. Funding announcements, hires, tech-stack changes, job posts. Signal-driven AI SDRs average 3.8% reply across 500K+ emails (Digital Applied 2026) because every message cites a real reason to reach out.
- Reply triage is your bottleneck, not send volume. Founders running their own outbound find the value gate is classifying replies fast and drafting context-aware responses — the classifier is where this agent earns its keep.
- You have a CRM as the system of record. HubSpot, Salesforce, Attio, Pipedrive, Close. The agent reads prospect state from the CRM and never overrides editor-set
do_not_contact— that boundary prevents the failure mode that ends customer relationships. - Hybrid human + AI math works for your team. Hybrid pods (1 human SDR : 2 AI seats) book 1.9x more meetings per dollar than pure-AI configurations (Digital Applied 2026). The agent does the loop; the human handles positive replies and the drafted-response queue.
- You can run the hand-labeled classifier eval. A 50-example reply set graded with hard accuracy bars (≥85% positive_interest, ≥95% unsubscribe) is the difference between catching demos and burning relationships. If you won't author the eval, you'll catch neither.
⚠ Hire (or keep) a human SDR when
- You can't run a 6-week domain warmup. 47% of AI SDR deployments collapse inside 90 days without it (Digital Applied 2026). Microsoft 365 inboxes are the strictest filter — most B2B prospects sit behind M365. Sending 500 cold emails from a fresh domain on day one is unrecoverable.
- Your team won't honor CAN-SPAM and RFC 8058. Penalties up to $53,088 per violation; Gmail/Yahoo/Microsoft require one-click unsubscribe + complaint rate <0.3% + bounce rate <2%. Systematic opt-out failures trigger enforcement and torch the domain — not just the campaign.
- Your ICP is broad or unverified. AI cold-email reply rate drops from 6.1% (SaaS) to 1.9% (financial services) — a 3.2x spread (Digital Applied 2026). Without a tight ICP, you're spending 6x more tokens and burning sender reputation for nothing.
- You're optimizing for volume over signal. Per-rep monthly sends went 1,150 → 7,400 with AI; raw reply rates fell 4.7% → 2.9%. The agent makes signal-less sends impossible by design; if you'd override that constraint, hire a human and send 50 thoughtful emails.
- You have no real-reply data to seed the classifier. The 50-example hand-labeled set is the load-bearing eval. With zero historical replies to draw from, the classifier is graded on synthetic data — false negatives on positive_interest (lost demos) and unsubscribe (domain reputation) won't surface until production.
What the community says
How to know it's working
Ship the spec, then measure on these criteria (the eval harness task grades them):
- Classifier eval: ≥85% accuracy on positive_interest across the 50-example hand-labeled set (false negatives cost demos)
- Classifier eval: ≥95% accuracy on unsubscribe across the 50-example hand-labeled set (false negatives cost the sending domain)
- End-to-end dry run: zero sends to any prospect in the suppression list across 10 mocked prospects (suppression is checked in code before the provider call)
- End-to-end dry run: every SendResult carries a signal_id matching a real BuyingSignal from the scrape (no fabricated personalization)
- Idempotency: re-running the loop on a partially-completed campaign produces zero duplicate sends (provider returns the same email_id for the same idempotency key)
- Deliverability: every outbound email carries List-Unsubscribe + List-Unsubscribe-Post headers (RFC 8058) plus a plain-text body alongside HTML
- Webhook compliance: email.bounced and email.complained events route to suppression-list append AND CRM do_not_contact=true within one request lifecycle
- Production hardening: regression run of run_e2e.py stays within ±5% of evals/baseline.json on p50 latency and median tokens-per-send
Sources
Every claim, pattern, and acceptance threshold on this page maps back to one of these. Read them before deviating from the spec.
- ↗ How 11x rebuilt their Alice agent: from ReAct to multi-agent with LangGraph ZenML LLMOps Database
- ↗ Clay — Achieving 10x growth with agentic sales prospecting OpenAI
- ↗ AI SDR Real Performance: 100K Email Analysis 2026 Digital Applied
- ↗ AI SDR Statistics 2026: 100+ Outbound Sales Data Points Digital Applied
- ↗ How to Achieve 90%+ Cold Email Deliverability in 2026 Instantly
- ↗ The 3-Step AI Reply Agent system Instantly
- ↗ kaymen99/sales-outreach-automation-langgraph GitHub
- ↗ AIXerum/AI-Sales-Outreach-Langgraph GitHub
- ↗ ladesai123/LangGraph-Lead-Generation GitHub
- ↗ browser-use/browser-use GitHub
- ↗ Building effective agents Anthropic Engineering
- ↗ AI SDRs do not solve the cold outreach problem Hacker News
- ↗ 11x has been claiming customers it doesn't have Hacker News
- ↗ Top AI SDR tools analysis Hacker News
- ↗ You can turn ANY AI SDR into a hacker Hacker News
- ↗ How 11x Rebuilt Their Alice Agent: From ReAct to Multi-Agent with LangGraph YouTube — LangChain Interrupt
- ↗ Resend Python SDK — transactional email with idempotency + webhooks Resend
- ↗ Claude Agent SDK — custom tools via in-process MCP server GitHub — anthropics/claude-agent-sdk-python
Build this in your codebase tonight
Sign up — Tekk reads your repo, picks your stack from the five decisions in STACK.md, and writes a personalized version of this 10-task spec. Same architecture, your patterns, your dependencies. Want to do it yourself? Open any task above and hit Copy as Prompt — paste into Cursor, Claude Code, or Codex.