Agent Teams Report
A survey of how individuals and teams are running multi-agent coding setups (Feb 2026).
1. Boris Cherny -- Creator of Claude Code, Head of Claude Code @ Anthropic
Scale: 10-15 concurrent sessions, 20-27 PRs/day, 100% AI-written code
Business: Employee at Anthropic. Claude Code ~$1B annualized revenue in 6 months.
Boris (human)
|
|──> 5 terminal tabs (iTerm, OS notifications)
|──> 5-10 browser sessions (claude.ai/code)
|──> mobile sessions (fire-and-forget)
|
v
Each session = independent Claude Code instance
|
|── Model: Opus 4.5 + extended thinking (always)
|── CLAUDE.md: shared knowledge base (updated weekly)
|── Plan Mode first, then auto-accept
|
v
┌───────────────┐
│  PostToolUse  │ <-- formatting hooks fix style drift
│     hooks     │
└───────┬───────┘
|
v
┌───────────────┐
│ Verification │ <-- Chrome extension, agent self-tests
│ loops │
└───────┬───────┘
|
v
┌───────────────┐
│ PR │ <-- /commit-push-pr slash command
└───────────────┘
"Teleport" hands sessions between terminal ↔ browser ↔ mobile
Key practices:
- CLAUDE.md (not AGENTS.md) as living knowledge base -- errors get documented so they never repeat
- /permissions pre-allows safe bash commands
- Subagents: code-simplifier, verify-app
- 259 PRs in 30 days. 90% of Claude Code's own codebase written by Claude Code.
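The PostToolUse hook in the pipeline above is the piece that fixes style drift. As a toy sketch of the idea (the hook runner and formatter here are stand-ins, not Claude Code's actual hook configuration), a post-edit hook simply reruns a formatter over every file the tool touched:

```python
def post_tool_use(edited_files, formatter):
    """Toy PostToolUse hook: reformat every file an edit tool touched."""
    return {path: formatter(path) for path in edited_files}

# Stand-in formatter: report what a real one (black, prettier, gofmt) would do
results = post_tool_use(
    ["src/app.py", "src/util.py"],
    lambda path: f"formatted {path}",
)
```

Because the hook runs after every edit, style never drifts far enough for a human to notice in review.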
2. Claude Code Native Multi-Agent -- Four Layers
Status: Subagents + SDK stable, Agent Teams experimental Business: Part of Claude Code ($200/mo Max plan, or API usage)
Layer 1: SUBAGENTS (in-session)
────────────────────────────────
Parent agent
|
|── Task("Explore", "find all API routes") <-- Haiku, read-only
|── Task("code-reviewer", "review changes") <-- custom .claude/agents/*.md
|── Task("general", "refactor auth") <-- background, full tools
|
v
Results summarized back to parent
Subagents CANNOT talk to each other
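The fan-out above can be sketched with plain asyncio: each subagent runs in isolation and only a summary flows back to the parent. The helper names are illustrative, not the real Task API:

```python
import asyncio

async def subagent(kind: str, prompt: str) -> str:
    """Stand-in for Task(kind, prompt): runs in isolation, returns a summary."""
    await asyncio.sleep(0)  # placeholder for the actual model call
    return f"[{kind}] summary of: {prompt}"

async def parent() -> list[str]:
    # Subagents run concurrently but cannot message each other;
    # the parent only ever sees their summarized results.
    return await asyncio.gather(
        subagent("Explore", "find all API routes"),
        subagent("code-reviewer", "review changes"),
        subagent("general", "refactor auth"),
    )

summaries = asyncio.run(parent())
```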
Layer 2: AGENT TEAMS (cross-session, experimental)
───────────────────────────────────────────────────
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
tmux -CC
|
v
┌──────────────┐
│ Team Lead │ <-- Opus, plans work, assigns tasks
└──────┬───────┘
|
┌────┼────┐
v v v
[T1] [T2] [T3] <-- Teammates (Sonnet/Haiku), each in tmux pane
| | |
v v v
Shared task list + mailbox system
Direct inter-agent messaging
Dependencies: task A blocks task B
Display: in-process OR tmux split panes
Quality gates: TeammateIdle, TaskCompleted hooks
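The coordination primitives listed above -- shared task list, blocking dependencies, and a mailbox -- can be modeled in a few lines. This is a toy scheduler, not the actual Agent Teams implementation:

```python
from collections import defaultdict, deque

class TaskBoard:
    """Toy model of a shared task list with dependencies and a mailbox."""
    def __init__(self):
        self.deps = defaultdict(set)       # task -> unfinished prerequisites
        self.done = set()
        self.mailbox = defaultdict(deque)  # teammate -> pending messages

    def add(self, task, blocked_by=()):
        self.deps[task] = set(blocked_by)

    def ready(self):
        # Tasks whose prerequisites have all completed
        return [t for t, d in self.deps.items()
                if t not in self.done and d <= self.done]

    def complete(self, task):
        self.done.add(task)

    def send(self, teammate, msg):
        self.mailbox[teammate].append(msg)

board = TaskBoard()
board.add("design schema")
board.add("write migration", blocked_by=["design schema"])
board.send("T2", "schema draft ready for review")
```

Initially `board.ready()` returns only the schema task; once it is marked complete, the migration task unblocks.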
Layer 3: AGENT SDK (programmatic)
─────────────────────────────────
import asyncio

from claude_code import Agent

agents = [
    Agent("planner", model="opus", tools=[...]),
    Agent("coder", model="sonnet", tools=[...]),
    Agent("tester", model="haiku", tools=[...]),
]
results = await asyncio.gather(*[a.run(task) for a in agents])
Full control: hooks as callbacks, MCP, permissions, session resume
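A minimal sketch of the hooks-as-callbacks idea; the function and hook names here are invented rather than the SDK's real signatures:

```python
import asyncio

async def run_agent(name, task, hooks=None):
    """Hypothetical wrapper: fire user callbacks around an agent run."""
    hooks = hooks or {}
    if "pre_task" in hooks:
        hooks["pre_task"](name, task)
    await asyncio.sleep(0)              # placeholder for the real model call
    result = f"{name} completed: {task}"
    if "post_task" in hooks:
        hooks["post_task"](name, result)
    return result

log = []
hooks = {"pre_task":  lambda n, t: log.append(("start", n)),
         "post_task": lambda n, r: log.append(("done", n))}
result = asyncio.run(run_agent("tester", "run unit tests", hooks))
```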
Layer 4: GIT WORKTREES (manual)
───────────────────────────────
claude -w feature-1 & claude -w feature-2 & claude -w feature-3
| | |
v v v
.worktrees/feature-1 .worktrees/feature-2 .worktrees/feature-3
(independent branch) (independent branch) (independent branch)
Human merges when done. No coordination.
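This layer is plain git underneath. A small helper can build the equivalent `git worktree add` commands; the `.worktrees/` layout mirrors the diagram, and actually running them requires an existing repo:

```python
def worktree_cmds(features):
    """Build `git worktree add` commands, one isolated branch per feature."""
    return [
        ["git", "worktree", "add", "-b", f, f".worktrees/{f}"]
        for f in features
    ]

cmds = worktree_cmds(["feature-1", "feature-2", "feature-3"])
# Each command creates an independent working copy on its own branch,
# e.g. executed via subprocess.run(cmd, check=True) from the repo root.
```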
3. Simon Willison -- Parallel Agents, Different Models
Scale: 2-3 research projects/day across multiple agents
Business: Independent developer, creator of Datasette. No product to sell -- writes about what works.
Simon (human)
|
├──> Claude Code (Sonnet 4.5) <-- primary terminal agent
├──> Codex CLI (GPT-5-Codex) <-- second terminal agent
├──> Claude Code for Web <-- async, fire-and-forget
├──> Codex Cloud <-- async
└──> Jules <-- async
|
v
Each in separate terminal / browser tab
Isolation: fresh /tmp checkouts per task
No coordination framework -- human is the router
── tools ──────────────────────────
llm CLI <-- logs everything to SQLite, analyzed via Datasette
files-to-prompt <-- convert repo files to LLM context
shot-scraper <-- automated screenshots for visual testing
Key concepts:
- "Agents = models using tools in a loop" -- his canonical definition, distilled from the 211 competing definitions he collected
- Vibe Engineering (not vibe coding): 12 practices including automated tests, git discipline, code review
- Bottleneck is human review, not agent speed
- Skills > MCP for simplicity and low token overhead
- "Lethal trifecta" security model: private data + untrusted content + external communication = danger
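The lethal trifecta rule is mechanical enough to encode directly: an agent setup is dangerous only when all three risk factors are present at once.

```python
def lethal_trifecta(private_data: bool, untrusted_content: bool,
                    external_comm: bool) -> bool:
    """True when an agent setup combines all three risk factors."""
    return private_data and untrusted_content and external_comm

# A coding agent reading private repos plus untrusted web pages, with
# network access, trips the check; remove any one leg and it passes.
danger = lethal_trifecta(True, True, True)
safe = lethal_trifecta(True, True, False)
```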
4. dmux -- Parallel Agents via tmux + Worktrees
Scale: N concurrent agents, git worktree isolation per pane
Business: MIT, fully free. Creator (Justin Schroeder) monetizes FormKit Pro ($149-$1,250).
Open source: github.com/standardagents/dmux
dmux TUI
|
|──> press 'n'
|
v
┌──────────────────┐
│ AI-generate slug │ <-- OpenRouter (gpt-4o-mini)
└────────┬─────────┘
|
v
┌─────────────────┐
│ Create git │ <-- .dmux/worktrees/<slug>/
│ worktree │ full independent working copy
└────────┬────────┘
|
v
┌─────────────────┐
│ Split tmux pane │
│ Launch agent │ <-- claude/codex/opencode (--acceptEdits)
└────────┬────────┘
|
v
┌─────────────────┐
│ Agent works │ <-- status via LLM analysis of terminal (1s poll)
│ autonomously │
└────────┬────────┘
|
v
press 'm' to merge
|
v
┌─────────────────┐
│ AI commit msg │ <-- conventional commits via OpenRouter
│ Merge to main │
│ Remove worktree │
└─────────────────┘
Hooks: worktree_created, pre_merge, post_merge
A/B mode: two agents, same prompt, side-by-side
Web dashboard + REST API for programmatic control
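The 1-second status poll reduces to snapshotting pane text and classifying it. Here dmux's LLM-based classifier is stubbed with a keyword heuristic for illustration:

```python
def classify_status(terminal_text: str) -> str:
    """Stub for dmux's LLM status analysis (keyword heuristic here)."""
    lowered = terminal_text.lower()
    if "error" in lowered or "traceback" in lowered:
        return "needs attention"
    if "waiting for input" in lowered or lowered.rstrip().endswith("?"):
        return "waiting"
    return "working"

# In dmux this runs on a ~1s loop over each pane's captured output
status = classify_status("Running tests... 42 passed\nWaiting for input")
```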
5. OpenClaw -- Open-Source AI Agent Framework
Scale: 213K+ GitHub stars, 50+ integrations
Business: MIT license, free to self-host. OpenClaw Cloud planned at $39/mo. Real cost: $5-30/mo in LLM API fees.
Creator: Peter Steinberger (ex-PSPDFKit, acqui-hired by OpenAI Feb 2026)
User prompt
|
v
┌──────────────────┐
│ OpenClaw gateway │ <-- local-first, 50+ integrations
│ (agent router)   │ messaging, coding, browser, etc.
└────────┬─────────┘
|
┌─────┼─────┐
v v v
[Sub-1][Sub-2][Sub-3] <-- sub-agent collaboration
| | | 40% accuracy boost vs monolithic prompting
└─────┼─────┘
|
v
┌─────────────────┐
│ Output │ <-- declarative agent config in YAML
└─────────────────┘
Not primarily a coding tool -- general-purpose AI assistant
Can run with local models (Ollama + Llama 3.3) for $0/mo
Will remain open source under OpenAI stewardship
6. Superconductor -- Parallel Cloud Agents with Live Previews
Scale: N agents per ticket, cloud sandboxes, live browser previews
Business: Closed-source SaaS by Volition (Gradescope founders). BYOK model. Pricing undisclosed, early access.
Create ticket (informal)
|
v
┌─────────────────┐
│ Launch N agents │ <-- each in isolated cloud container
│ on same ticket │ (Modal / Morph Cloud)
└────────┬────────┘
|
┌─────┼─────┐
v v v
[Agent1][Agent2][Agent3] <-- Claude/Codex/Amp/Gemini
| | |
v v v
[Live] [Live] [Live] <-- browser previews ~30s
[prev] [prev] [prev]
| | |
└─────┼─────┘
|
v
┌──────────────────┐
│ Compare previews │ <-- visual diff, interact with each
│ Select best      │
│ One-click PR     │
└──────────────────┘
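The compare-and-select step is a best-of-N pattern. A sketch with a pluggable scoring function; the candidate data and scoring criteria here are invented:

```python
def select_best(candidates, score):
    """Run the same ticket through N agents, keep the highest-scoring result."""
    return max(candidates, key=score)

# Hypothetical results from three agents working the same ticket
results = [
    {"agent": "Agent1", "tests_passed": 10, "diff_lines": 120},
    {"agent": "Agent2", "tests_passed": 12, "diff_lines": 300},
    {"agent": "Agent3", "tests_passed": 12, "diff_lines": 90},
]

# Prefer more passing tests, then smaller diffs as the tiebreaker
best = select_best(results, lambda r: (r["tests_passed"], -r["diff_lines"]))
```

In Superconductor the "score" is a human interacting with live previews; the structure is the same fan-out-then-pick.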
7. 8090 Software Factory -- Enterprise Agent Platform
Scale: Multi-repo code modernization
Business: Proprietary. $200/seat/mo (Team), custom Enterprise, managed delivery from $1M/yr. Funded by Chamath Palihapitiya personally.
┌─────────────────┐
│ Refinery │ <-- reverse-engineer codebase into knowledge graph
└────────┬────────┘
|
v
┌─────────────────┐
│ Planner │ <-- AI generates migration/transformation plans
└────────┬────────┘
|
v
┌─────────────────┐
│ Foundry │ <-- specialized agents execute plan
│ (agent workers) │ across multiple repos
└────────┬────────┘
|
v
┌─────────────────┐
│ Validator │ <-- quality gate, CI, tests
└────────┬────────┘
|
v
┌─────────────────┐
│ Factory Line │ <-- full pipeline for enterprise
│ output: PRs │ code modernization at scale
└─────────────────┘
8. Terragon -- Background Fire-and-Forget (SHUT DOWN)
Scale: ~30 concurrent tasks/day, auto-PR creation
Business: SaaS subscription. Shut down Feb 9, 2026. Code released Apache-2.0.
Why: Native background agents from Claude Code and Codex commoditized the orchestration layer.
Create task (web / CLI / GitHub / Slack / mobile)
|
v
┌─────────────────┐
│ Cloud sandbox │ <-- isolated container, clone repo, create branch
└────────┬────────┘
|
v
┌─────────────────┐
│ Agent works │ <-- background, checkpoints pushed to GitHub
│ autonomously │ AI-generated commits
└────────┬────────┘
|
v
┌─────────────────┐
│ PR created │ <-- automatic when done
└────────┬────────┘
|
v
Human reviews and merges
DEAD: Codex reached 28% agent usage on Terragon within 1 month
Native background agents made the wrapper unnecessary
9. Vadim Strizheus -- "AI Employees" for VugolaAI
Scale: Claims 14 AI employees, 95% automated
Business: VugolaAI (video clipping/scheduling SaaS). Free tier. Solana token (VGLA).
Long-form video input
|
v
┌─────────────────┐
│ AI Moment │ <-- "AI employee" 1: detect viral-worthy segments
│ Detection │
└────────┬────────┘
|
v
┌─────────────────┐
│ Auto-Clipping │ <-- "AI employee" 2-N: extract, reframe, caption
│ + Captioning │
└────────┬────────┘
|
v
┌─────────────────┐
│ Branding + │ <-- template application
│ Formatting │
└────────┬────────┘
|
v
┌─────────────────┐
│ Multi-Platform │ <-- TikTok, YouTube, Instagram, X, LinkedIn
│ Scheduling │
└─────────────────┘
Note: Specific agent breakdown from video tweet, not independently verified.
The product itself IS the AI automation -- "employees" = AI pipeline stages.
10. Notable Voices
Francois Chollet (@fchollet)
"Sufficiently advanced agentic coding is essentially machine learning"
Does NOT run a multi-agent setup. Warns about maintaining "sprawling mess of AI-generated legacy code." Useful contrarian check.
Andrej Karpathy
Coined "vibe coding" (Feb 2025), then abandoned it for "agentic engineering" (Feb 2026). Evolution: accept all AI output → require specs, review, test suites.
Addy Osmani
Defined Conductor (sequential) vs Orchestrator (parallel) agent frameworks. Identified the "80% problem" -- last 20% takes as long as first 80%.
Comparison Matrix
| System | Type | Open Source | Pricing | Agents | Key Feature |
|---|---|---|---|---|---|
| Boris Cherny | Individual workflow | N/A (uses Claude Code) | $200/mo Max | 10-15 parallel CC | Teleport between devices |
| Claude Code Teams | Built-in | N/A (product feature) | $200/mo Max or API | N (tmux panes) | Shared task list + mailbox |
| Claude Agent SDK | Library | MIT | API usage | Programmatic | Full orchestration control |
| Simon Willison | Individual workflow | N/A | Multi-subscription | CC + Codex + async | Human as router |
| dmux | OSS tool | MIT | Free | N (tmux + worktrees) | A/B agent comparison |
| OpenClaw | OSS framework | MIT | Free / $39 Cloud | Sub-agents | 213K stars, joined OpenAI |
| Superconductor | SaaS | No | Undisclosed (BYOK) | N per ticket | Live browser previews |
| 8090 | Enterprise | No | $200/seat/mo+ | Factory Line | Knowledge graph + modernization |
| Terragon | SaaS (dead) | Apache-2.0 (post-shutdown) | Was subscription | Background agents | Shut down Feb 2026 |
| VugolaAI | Product | No | Free tier | 14 "AI employees" | Video pipeline automation |
Common Patterns
What works across all setups:
1. ISOLATION -- worktrees, containers, or separate sessions
   agents must not conflict with each other
2. PLAN FIRST -- Opus/expensive model plans, cheaper model executes
   Boris: Plan Mode → auto-accept
   Agent Teams: team lead plans, teammates execute
3. MEMORY -- CLAUDE.md / AGENTS.md / progress.txt
   errors documented so they never repeat
   updated by the agent, not the human
4. VERIFICATION -- automated tests, browser screenshots, self-review
   humans review throughput, not individual lines
5. MODEL TIERING -- Opus for planning ($$$), Sonnet for coding ($$), Haiku for tests ($)
   "correct answer costs less total iteration time than fast wrong ones"
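Model tiering reduces to a routing table keyed by task type. A minimal sketch -- the tier names match the pattern above, but the mapping itself is illustrative:

```python
MODEL_TIERS = {
    "plan": "opus",    # expensive, highest-quality reasoning
    "code": "sonnet",  # mid-tier workhorse
    "test": "haiku",   # cheap, high-volume checks
}

def route(task_type: str) -> str:
    """Pick the cheapest model tier that fits the task."""
    return MODEL_TIERS.get(task_type, "sonnet")  # default to the mid tier

model = route("plan")
```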
What doesn't work:
1. NO TESTS -- agents spiral without verification signals
2. NO MEMORY -- same mistakes repeat across sessions
3. SHARED STATE -- agents editing same files = merge hell
4. NO REVIEW -- "vibe coding" produces unmaintainable code (Chollet, Karpathy)
Business Model Summary
FREE / OSS:
  dmux (MIT) -- monetizes separately via FormKit Pro
  OpenClaw (MIT) -- Cloud tier planned $39/mo, creator joined OpenAI
  claude-flow (MIT) -- reputation/consulting play
  Ralph/Compound (MIT) -- promotes Amp (Sourcegraph)
  Terragon (Apache-2.0) -- released on shutdown
SAAS / COMMERCIAL:
  Superconductor -- BYOK, undisclosed platform fee, early access
  8090 -- $200/seat/mo, $1M/yr managed delivery
  VugolaAI -- free tier + crypto token (VGLA)
PLATFORM:
  Claude Code -- $200/mo Max plan or API usage (~$1B ARR)
  OpenAI Codex -- subscription + API
  GitHub Agent HQ -- Copilot subscription (multi-vendor agents)
The trend: orchestration tools struggle to monetize when platforms
add native multi-agent features (see: Terragon shutdown).
Survivors either go enterprise (8090) or stay free and build community (dmux, OpenClaw).