Stripe Minions: One-Shot, End-to-End Coding Agents

Source: Part 1 | Part 2
Author: Alistair Gray (Stripe)
Date: Feb 9 and Feb 19, 2026


Key Metric

Over 1,300 merged PRs per week are completely minion-produced: human-reviewed, but containing no human-written code.

Why Custom Agents

  • Codebase: hundreds of millions of LOC across several large repos
  • Primarily Ruby (non-Rails) with Sorbet typing -- rare combo for LLMs
  • Countless homegrown libraries specific to Stripe
  • Processes >$1T/year in production
  • Philosophy: "if it's good for humans, it's good for LLMs, too"

Entry Points

  1. Slack (primary) -- tag the Slack app from any thread, full conversation context included
  2. CLI and web interfaces
  3. Internal tool integrations -- docs platform, feature flags, ticketing
  4. Automated triggers -- CI detects flaky tests -> auto-ticket with "launch minion" button

Monitoring & Output

  • Web interface for real-time observation of agent decisions
  • On completion: the minion creates a branch, pushes it to CI, and prepares a PR following Stripe's template
  • If code is good -> open PR for colleague review
  • If not -> provide additional instructions, minion pushes updates
  • Partially correct output used as foundation for focused human work

Devboxes (Cloud Dev Infrastructure)

| Property | Description |
| --- | --- |
| Parallelizability | Multiple agents on separate tasks simultaneously |
| Predictability | Standardized configs ensure consistent behavior |
| Isolation | Work confined to individual environments |

  • "Cattle, not pets" -- standardized, easy to replace
  • Warm pool achieves "hot and ready" in ~10 seconds
  • Pre-cloned repos, cached services
  • Isolated from production and internet
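
The warm-pool idea above can be sketched roughly as follows. This is a minimal illustration under assumptions, not Stripe's implementation: `Devbox`, `WarmPool`, and `_provision` are hypothetical names, and the slow provisioning work (cloning repos, warming caches) is reduced to a stub.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class Devbox:
    """A hypothetical, interchangeable dev environment ("cattle, not pets")."""
    box_id: int
    repos_cloned: bool = True      # stands in for pre-cloned repos
    services_cached: bool = True   # stands in for cached services


class WarmPool:
    """Keeps `target_size` pre-provisioned devboxes ready to hand out."""

    def __init__(self, target_size: int) -> None:
        self._ids = count(1)
        self._target_size = target_size
        self._ready = deque()
        self._refill()

    def _provision(self) -> Devbox:
        # In a real system this would be the slow step; doing it ahead of
        # time is what makes checkout fast.
        return Devbox(box_id=next(self._ids))

    def _refill(self) -> None:
        # Top the pool back up to its target size.
        while len(self._ready) < self._target_size:
            self._ready.append(self._provision())

    def checkout(self) -> Devbox:
        """Hand out a ready box and immediately replace it in the pool."""
        box = self._ready.popleft()
        self._refill()
        return box

    def ready_count(self) -> int:
        return len(self._ready)
```

Because every box is built from the same recipe, a caller never needs a particular box back; a used box can simply be discarded and the pool refilled.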

Agent Harness: Forked Goose

Internally forked Block's Goose (open-source coding agent) in late 2024. Key distinction: minions operate without human supervision -- full permissions, no confirmation prompts, safe within isolated devboxes.

Blueprints: The Orchestration Framework

Central architectural innovation. A "state machine that intermixes deterministic code nodes and free-flowing agent nodes."

Example node types:

  • Agentic nodes: "Implement task," "Fix CI failures" (free-form LLM reasoning)
  • Deterministic nodes: "Run linters," "Push changes," git ops, testing (guaranteed execution)

This hybrid approach reduces token consumption and improves reliability.
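
To make the state-machine idea concrete, here is a minimal sketch of a blueprint runner. It assumes a much simpler shape than whatever Stripe actually built: `Node`, `run_blueprint`, and the three node functions are hypothetical, and the agentic node is a stub that stands in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = dict  # shared scratch state passed between nodes (keys are hypothetical)


@dataclass
class Node:
    name: str
    kind: str  # "agentic" or "deterministic"
    run: Callable[[State], Optional[str]]  # returns the next node name, or None to stop


def run_blueprint(nodes: Dict[str, Node], start: str, state: State,
                  max_steps: int = 20) -> List[str]:
    """Walk the state machine from `start` and return the visited node names."""
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None and len(visited) < max_steps:
        node = nodes[current]
        visited.append(node.name)
        current = node.run(state)
    return visited


def implement_task(state: State) -> str:
    # Agentic node: a real one would let a model edit code freely.
    state["diff"] = "stub change for: " + state["task"]
    return "run_linters"


def run_linters(state: State) -> str:
    # Deterministic node: always runs, costs no tokens.
    state["lint_ok"] = bool(state.get("diff"))
    return "push_changes" if state["lint_ok"] else "implement_task"


def push_changes(state: State) -> None:
    # Deterministic node: the final step, so it returns no successor.
    state["pushed"] = True
    return None


BLUEPRINT = {
    "implement_task": Node("implement_task", "agentic", implement_task),
    "run_linters": Node("run_linters", "deterministic", run_linters),
    "push_changes": Node("push_changes", "deterministic", push_changes),
}
```

Calling `run_blueprint(BLUEPRINT, "implement_task", {"task": "fix flaky test"})` walks the three nodes in order. The `max_steps` cap is an added safeguard in this sketch so a lint/implement cycle cannot loop forever.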

Context: Rule Files

  • Directory-specific and pattern-based rules (not global -- would overwhelm context)
  • Standardized on Cursor's rule format
  • Synchronized across minions, Cursor, and Claude Code
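
As an illustration of why scoped rules keep context small, the sketch below selects only the rules whose pattern matches a file the agent is touching. It does not parse Cursor's actual rule file format; the `Rule` class, the rule names, and the patterns are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass(frozen=True)
class Rule:
    name: str
    glob: str   # pattern scoping the rule to a directory or file type
    text: str   # guidance that would be added to the agent's context


# Hypothetical rules, for illustration only.
RULES = [
    Rule("payments-conventions", "payments/*", "Use the payments team's helpers."),
    Rule("test-style", "*_test.rb", "Follow the shared test layout."),
    Rule("docs-tone", "docs/*", "Write in the docs style guide voice."),
]


def rules_for(paths: List[str], rules: List[Rule] = RULES) -> List[str]:
    """Return names of rules whose glob matches at least one touched path."""
    # fnmatch's `*` also matches "/", so "payments/*" covers nested files.
    return [r.name for r in rules if any(fnmatchcase(p, r.glob) for p in paths)]
```

A change confined to `payments/` would load one rule here instead of all three, which is the point of avoiding global rules.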

Context: MCP (Model Context Protocol)

"Toolshed" -- centralized internal MCP server with ~500 tools for internal systems and SaaS.

  • Minions receive an intentionally small subset by default
  • Per-user customizable additional tool sets
  • MCP is the common language across all Stripe agents
  • Deterministic pre-execution of relevant MCP tools for context hydration
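
The default-subset-plus-extras pattern can be sketched as a small set computation. Everything named here is an assumption for illustration: the tool names, the user name, and `tools_for` are hypothetical, and the sketch does not use any real MCP client or server API.

```python
from typing import List

# Hypothetical tool names standing in for a much larger catalogue.
CATALOGUE = frozenset({
    "code_search", "docs_lookup", "ticket_read", "feature_flag_read", "build_status",
})

# The intentionally small set every minion starts with.
DEFAULT_TOOLS = frozenset({"code_search", "docs_lookup"})

# Per-user additions; unknown names are dropped rather than trusted.
USER_EXTRAS = {
    "alice": frozenset({"ticket_read", "not_a_real_tool"}),
}


def tools_for(user: str) -> List[str]:
    """Return the sorted tool names exposed to a minion launched by `user`."""
    extras = USER_EXTRAS.get(user, frozenset())
    return sorted((DEFAULT_TOOLS | extras) & CATALOGUE)
```

Intersecting with the catalogue means a stale or misspelled entry in a user's configuration narrows the tool set instead of raising an error; whether that is the right failure mode is a design choice of this sketch.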

Feedback Loop

  1. Local linting -- heuristic-based, <5 seconds per push
  2. CI selective testing -- from 3M+ tests, only relevant ones run
  3. Autofixes -- many tests include autofixes, applied automatically
  4. Single retry -- one resolution attempt for remaining failures

Hard rule: "often one, at most two, CI runs" -- shift feedback left.
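
The loop above can be sketched as a bounded retry. This is an assumption-laden outline rather than Stripe's pipeline: `Failure`, `feedback_loop`, and the four callbacks are hypothetical, and the real linting, test selection, and autofix machinery are hidden behind those callbacks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Failure:
    test: str
    autofixable: bool


def feedback_loop(
    lint: Callable[[], bool],
    run_ci: Callable[[], List[Failure]],
    apply_autofix: Callable[[Failure], None],
    agent_fix: Callable[[List[Failure]], None],
    max_ci_runs: int = 2,
) -> dict:
    """Lint locally, then run CI at most `max_ci_runs` times."""
    # Cheap local check first, so obvious problems never reach CI.
    if not lint():
        return {"status": "lint_failed", "ci_runs": 0}

    ci_runs = 0
    failures: List[Failure] = []
    while ci_runs < max_ci_runs:
        failures = run_ci()
        ci_runs += 1
        if not failures:
            return {"status": "green", "ci_runs": ci_runs}
        if ci_runs == max_ci_runs:
            break  # budget spent: stop retrying and hand back to a human
        # Mechanical fixes are applied without involving the agent.
        for failure in failures:
            if failure.autofixable:
                apply_autofix(failure)
        remaining = [f for f in failures if not f.autofixable]
        if remaining:
            agent_fix(remaining)  # the single resolution attempt
    return {
        "status": "needs_human",
        "ci_runs": ci_runs,
        "failing": [f.test for f in failures],
    }
```

With the default budget of two runs, the agent gets exactly one chance to address failures that autofixes cannot resolve, and anything still red after the second run is returned for human attention.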

Security

  • Devboxes in QA environments
  • No access to production services or real user data
  • Internal control frameworks preventing destructive actions

[Figure: Stripe Minions architecture]