Sub-Agent Orchestration in Python: 5-Level Stack (2026)

Sub-agent orchestration in Python reached a turning point on June 12, 2026.

That day, Anthropic quietly shipped the most important infrastructure update of the year: sub-agents can now spawn their own sub-agents, five levels deep, inside Claude Code pipelines. No human in the loop. No manual handoffs. Recursive autonomy, natively supported.

If you missed it, you’re already behind.

This post gives you the complete Python architecture to operationalize that capability — a self-spawning agent chain that handles task dispatch, error recovery, and result aggregation without a single manual touch.


Why Sub-Agent Orchestration in Python Matters More Than Any Model Release

The frontier model race has plateaued. GPT vs. Opus is a less meaningful contest right now than the harness race — who builds the most capable system around the model.

Boris Cherny, the creator of Claude Code, said it plainly at the Code with Claude 2026 event in San Francisco: there is literally no manually written code anywhere inside Anthropic anymore. Agents coordinate over Slack. They write code in loops. They resolve issues across the full codebase autonomously.

That is not a marketing claim. That is the creator of the tool describing how the company that built it actually operates.

The question is not whether agentic pipelines are real. The question is whether you’ve built one yet.


The 5-Level Sub-Agent Stack: Architecture Overview

Effective sub-agent orchestration in Python starts with a clear depth hierarchy.

For the official release notes on background agent chains, see the Claude Code Week 24 changelog.

Claude Code Week 24 (June 8–12) introduced background chains capped at five levels deep. Here is what that architecture looks like in production:

Level 0 — Orchestrator Agent
  └── Level 1 — Domain Dispatcher
        ├── Level 2 — Research Sub-Agent
        │     └── Level 3 — Data Extraction Agent
        │           └── Level 4 — Validation Agent
        └── Level 2 — Writing Sub-Agent
              └── Level 3 — SEO Optimization Agent
                    └── Level 4 — Internal Link Resolver

Each level operates independently. Each spawns children only when it judges its own task too broad to finish within its token budget. The orchestrator at Level 0 never manually touches what happens below Level 1.

This is not a workflow diagram. This is a production architecture you can deploy today.


Production Python Source Code: Self-Spawning Agent Chain

The following implementation uses Anthropic’s Python SDK with tool-use to build a recursive sub-agent spawning system. Compatible with Claude Opus 4.6 and Sonnet 4.6.

Step 1 — Install dependencies

pip install anthropic python-dotenv

Step 2 — Environment configuration

# .env
ANTHROPIC_API_KEY=your_api_key_here
MAX_AGENT_DEPTH=5
TASK_TOKEN_THRESHOLD=2000

Step 3 — Core orchestrator engine

import anthropic
import os
import json
from dotenv import load_dotenv

load_dotenv()

client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

MAX_DEPTH = int(os.environ.get("MAX_AGENT_DEPTH", 5))
TOKEN_THRESHOLD = int(os.environ.get("TASK_TOKEN_THRESHOLD", 2000))

SPAWN_TOOL = {
    "name": "spawn_sub_agent",
    "description": "Spawn a child agent to handle a subtask. Use when the current task scope exceeds your token budget or requires specialized execution.",
    "input_schema": {
        "type": "object",
        "properties": {
            "subtask": {
                "type": "string",
                "description": "The specific subtask to delegate to the child agent."
            },
            "context": {
                "type": "string",
                "description": "Relevant context the child agent needs to execute independently."
            },
            "priority": {
                "type": "string",
                "enum": ["critical", "high", "standard"],
                "description": "Execution priority of this subtask."
            }
        },
        "required": ["subtask", "context", "priority"]
    }
}


def run_agent(task: str, context: str, depth: int = 0) -> dict:
    """
    Recursive agent executor with depth guard.
    Spawns child agents automatically when tool_use is triggered.
    """
    if depth >= MAX_DEPTH:
        return {
            "status": "depth_limit_reached",
            "depth": depth,
            "result": f"[DEPTH CAP] Task queued for human review: {task[:120]}"
        }

    print(f"\n[AGENT DEPTH {depth}] Executing: {task[:80]}...")

    messages = [
        {
            "role": "user",
            "content": f"""You are an autonomous agent at execution depth {depth} of {MAX_DEPTH}.

TASK: {task}

CONTEXT: {context}

If this task is too broad to complete in a single pass, use the spawn_sub_agent tool
to delegate subtasks to child agents. Each child agent operates independently and
returns a result you can incorporate.

If you can complete this task directly, do so without spawning children."""
        }
    ]

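    results = {"depth": depth, "task": task, "children": [], "output": None}

    # Assumed safety cap on tool-use round trips per agent; tune as needed.
    max_rounds = 5

    for _ in range(max_rounds):
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=TOKEN_THRESHOLD,
            tools=[SPAWN_TOOL],
            messages=messages
        )

        tool_results = []
        for block in response.content:
            if block.type == "text":
                # Keep the most recent text; the last round holds the final answer.
                results["output"] = block.text

            elif block.type == "tool_use" and block.name == "spawn_sub_agent":
                child_input = block.input
                print(f"\n  [SPAWN] Depth {depth} → spawning child agent")
                print(f"  [SUBTASK] {child_input['subtask'][:60]}...")
                print(f"  [PRIORITY] {child_input['priority']}")

                child_result = run_agent(
                    task=child_input["subtask"],
                    context=child_input["context"],
                    depth=depth + 1
                )
                results["children"].append(child_result)
                # Collect the child's result so the parent can incorporate it.
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(child_result)
                })

        # Stop once the model no longer asks for children.
        if response.stop_reason != "tool_use" or not tool_results:
            break

        # Feed the children's results back for the next round.
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    return results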


def orchestrate(master_task: str) -> None:
    """Entry point. Kicks off the Level 0 orchestrator."""
    print(f"\n{'='*60}")
    print("ORCHESTRATOR INITIALIZED")
    print(f"Master Task: {master_task[:80]}")
    print(f"Max Depth: {MAX_DEPTH} | Token Budget Per Agent: {TOKEN_THRESHOLD}")
    print(f"{'='*60}")

    result_tree = run_agent(
        task=master_task,
        context="Top-level orchestration session. Decompose aggressively.",
        depth=0
    )

    print(f"\n{'='*60}")
    print("EXECUTION COMPLETE — RESULT TREE:")
    print(json.dumps(result_tree, indent=2))
    print(f"{'='*60}\n")


if __name__ == "__main__":
    orchestrate(
        master_task="""Research the top 5 agentic AI frameworks released in Q2 2026,
summarize their key differentiators, and draft a comparison post optimized
for the keyword 'multi-agent orchestration frameworks 2026'."""
    )
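The orchestrator prints a nested result tree, but a downstream step usually wants the text in one flat list. The sketch below walks a tree of the shape run_agent returns (keys depth, output, children, plus result on depth-capped nodes) and collects every piece of text in depth-first order. The function name collect_outputs and the sample tree are assumptions made for illustration, not output from a real run.

```python
from __future__ import annotations


def collect_outputs(node: dict) -> list[tuple[int, str]]:
    """Depth-first walk returning (depth, text) for every node that has text."""
    found = []
    # Normal nodes store text under "output"; depth-capped nodes use "result".
    text = node.get("output") or node.get("result")
    if text:
        found.append((node.get("depth", 0), text))
    for child in node.get("children", []):
        found.extend(collect_outputs(child))
    return found


# Hand-written sample with the same keys as the result tree above.
sample_tree = {
    "depth": 0,
    "task": "master task",
    "output": "Plan",
    "children": [
        {
            "depth": 1,
            "task": "research",
            "output": "Research notes",
            "children": [
                {
                    "status": "depth_limit_reached",
                    "depth": 2,
                    "result": "[DEPTH CAP] Task queued for human review: extract",
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    for depth, text in collect_outputs(sample_tree):
        print(f"{'  ' * depth}[{depth}] {text}")
```

Keeping the depth next to each string lets a later step weight or filter results by level.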

What This Architecture Unlocks

This is what production sub-agent orchestration actually looks like at scale.

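A five-level chain with a 2,000-token output cap per agent gives you up to roughly 10,000 output tokens of distributed cognitive work per master task when each level runs one agent that answers in a single response, and more whenever a level fans out into several children. As written, the code above runs children one after another rather than in parallel, but it still runs without you at a keyboard. Here is what operators are running on this architecture right now: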

  • Automated content pipelines: Level 0 receives a topic brief. Level 2 sub-agents research independently. Level 3 agents draft sections. Level 4 validates internal links and SEO scoring.
  • B2B lead processing: Level 0 receives inbound lead data. Children qualify, score, draft personalized outreach, and log to CRM — zero human handoff.
  • Codebase maintenance: Exactly what Anthropic runs internally. Agents file issues, write fixes, open PRs, and resolve merge conflicts across the stack.
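The token budget arithmetic in this section depends on how many children each agent spawns. As a rough worked example, the sketch below computes a worst-case output-token total for a full tree, assuming every agent spawns the same number of children and answers in one response capped at the per-agent budget. The function name worst_case_output_tokens and the uniform branching factor are simplifying assumptions.

```python
def worst_case_output_tokens(levels: int, budget_per_agent: int, branching: int) -> int:
    """Upper bound on output tokens for a full tree with uniform branching.

    The agent count is the geometric sum 1 + b + b^2 + ... + b^(levels - 1).
    """
    agents = sum(branching ** level for level in range(levels))
    return agents * budget_per_agent


if __name__ == "__main__":
    for branching in (1, 2, 3):
        total = worst_case_output_tokens(levels=5, budget_per_agent=2000, branching=branching)
        print(f"branching={branching}: up to {total:,} output tokens")
```

With one child per level the bound is 5 × 2,000 = 10,000 tokens. Two children per agent gives 31 agents and 62,000 tokens, and three gives 121 agents and 242,000 tokens, so the branching factor matters far more than the per-agent cap. Input tokens are billed on top of this and are not counted here.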

Depth Guard and Self-Healing: The Non-Optional Layer

The code above includes a depth guard (if depth >= MAX_DEPTH) for a reason. Without a hard cap, recursive spawning can drain your entire API budget in a single runaway chain. Pair the guard with a handler that routes depth-capped tasks to human review in your production deployment:

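def handle_depth_overflow(result: dict, alert_channel: str = "ops-alerts") -> None:
    """
    Routes depth-capped tasks to a human review queue.
    Walks the whole result tree, because capped tasks sit in nested children.
    Extend to Slack, PagerDuty, or your own incident bus.
    """
    if result.get("status") == "depth_limit_reached":
        print(f"[ALERT] Depth overflow at level {result['depth']}")
        print(f"[QUEUE] Forwarded to {alert_channel}: {result['result']}")
        # slack_client.post_message(channel=alert_channel, text=result['result'])

    # Depth-capped nodes have no "children" key, so this loop ends there.
    for child in result.get("children", []):
        handle_depth_overflow(child, alert_channel)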

For a full self-healing logging implementation, see the Automated Logging Code post in this series.
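A depth cap limits how deep a chain goes but not how wide it fans out, so a second guard on the total number of agents is worth considering. The sketch below is one way to do that, not the method Anthropic uses. SpawnBudget and run_with_budget are hypothetical names, and run_with_budget is a stand-in that fakes children instead of calling the API, so the guard can be tried without spending tokens.

```python
class SpawnBudget:
    """Counts agent invocations and refuses new ones past a hard limit."""

    def __init__(self, max_agents: int) -> None:
        self.max_agents = max_agents
        self.used = 0

    def try_acquire(self) -> bool:
        """Reserve one agent slot; return False when the budget is spent."""
        if self.used >= self.max_agents:
            return False
        self.used += 1
        return True


def run_with_budget(task: str, depth: int, budget: SpawnBudget,
                    max_depth: int = 5, fanout: int = 2) -> dict:
    """Stand-in for run_agent: every agent 'spawns' `fanout` fake children."""
    # Check depth first so a depth-capped task never consumes a budget slot.
    if depth >= max_depth or not budget.try_acquire():
        return {"status": "skipped", "depth": depth, "task": task}
    children = [
        run_with_budget(f"{task}.{i}", depth + 1, budget, max_depth, fanout)
        for i in range(fanout)
    ]
    return {"status": "ok", "depth": depth, "task": task, "children": children}


if __name__ == "__main__":
    budget = SpawnBudget(max_agents=10)
    run_with_budget("root", depth=0, budget=budget)
    print(f"agents run: {budget.used} of a possible 31")
```

In a real deployment the same try_acquire check would sit at the top of run_agent, next to the existing depth guard.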


The Operator Mindset Shift

Most builders are still thinking about AI as a tool that assists them. The five-level sub-agent architecture forces a different mental model: you are the architect, not the executor.

Your job at Level 0 is to write a master task brief clear enough for an autonomous chain to decompose and execute without you. If your brief requires clarification mid-run, the chain fails. The quality of your upfront specification is now the primary leverage point in the entire stack.

This is not a small shift. This is the difference between someone who uses a spreadsheet and someone who designs the financial system the spreadsheet reports on.

For the cognitive infrastructure required to operate at this level — sustained focus, reduced decision fatigue, high-output mental states — the Circadian Rhythm System and Nootropic Protocol are the companion reads.


What to Build Next

The five-level cap is a forcing function, not a limitation. Here is the natural build sequence from this architecture:

  1. Deploy the orchestrator on a single task. Study the spawn tree output in your terminal.
  2. Wire Level 4 outputs into a persistent vector store for cross-session memory.
  3. Add the Metrics Code layer to monitor token consumption per depth level in real time.
  4. Schedule the orchestrator via cron or n8n webhook to run on a defined cadence — fully autonomous, no human trigger required.
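Step 3 in the list above, monitoring token consumption per depth level, can be prototyped before any dedicated metrics layer exists. The sketch below assumes each node in the result tree carries an output_tokens field, which the code in this post does not record. You would have to add it yourself, for example from the usage data returned with each API response. The function name tokens_by_depth and the sample numbers are illustrative only.

```python
from collections import defaultdict


def tokens_by_depth(node: dict, totals=None) -> dict:
    """Sum the assumed 'output_tokens' field of every node, grouped by depth."""
    if totals is None:
        totals = defaultdict(int)
    totals[node.get("depth", 0)] += node.get("output_tokens", 0)
    for child in node.get("children", []):
        tokens_by_depth(child, totals)
    return dict(totals)


# Made-up numbers; a real tree would carry measured usage per agent.
sample_tree = {
    "depth": 0,
    "output_tokens": 1200,
    "children": [
        {"depth": 1, "output_tokens": 800, "children": []},
        {
            "depth": 1,
            "output_tokens": 500,
            "children": [{"depth": 2, "output_tokens": 300}],
        },
    ],
}

if __name__ == "__main__":
    for depth, total in sorted(tokens_by_depth(sample_tree).items()):
        print(f"depth {depth}: {total} output tokens")
```

A per-depth breakdown like this shows quickly whether spend concentrates at the orchestrator or in the leaves.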

The architecture Anthropic built for itself is now available to every operator with an API key and a precise enough master task. The only question is whether you’ll build it before the person competing for the same contract does.


This post is part of The Agentic Protocol’s Phase 2 production code series. Each post ships a deployable architecture — no theory, no demos, no placeholders.

