01

Agent Orchestration

Real-world AI applications rarely involve a single agent. They require coordinated pipelines where agents hand off work, wait for dependencies, and recover from failures — all while a human operator maintains oversight. ThoughtSurgery makes orchestration a first-class citizen.

Pattern

Dependency-Aware Agent Graphs

Model your agent pipeline as a directed acyclic graph (DAG). Each node is an agent with its own AgentUIProvider and SSE stream. Edges represent data dependencies. ThoughtSurgery's session-scoped state isolation means agents never interfere with each other — and the human operator can intervene at any node independently.

📡 Data Ingest · Streaming 3 API sources · 1,247 tokens
🔍 Analyzer A · Sentiment analysis · 892 tokens
📊 Analyzer B · Waiting for Ingest · Queued
📝 Report Writer · Depends on A + B · Pending
🛑 Human Review · Checkpoint before publish · Approval required
Implementation — Orchestration Controller
// orchestration-controller.tsx
// Each agent runs in its own isolated provider with its own SSE stream.
// The controller manages the DAG topology and data handoffs.

import { AgentUIProvider, useAgentUI } from '@agentui/react';

interface AgentNode {
  id: string;
  endpoint: string;
  dependsOn: string[];      // IDs of upstream agents
  status: 'pending' | 'running' | 'done' | 'error';
}

const pipeline: AgentNode[] = [
  { id: 'ingest',   endpoint: '/api/stream?agent=ingest',   dependsOn: [],                  status: 'running' },
  { id: 'analyze-a', endpoint: '/api/stream?agent=analyze-a', dependsOn: ['ingest'],          status: 'running' },
  { id: 'analyze-b', endpoint: '/api/stream?agent=analyze-b', dependsOn: ['ingest'],          status: 'pending' },
  { id: 'writer',   endpoint: '/api/stream?agent=writer',   dependsOn: ['analyze-a', 'analyze-b'], status: 'pending' },
];

function OrchestrationDashboard() {
  return (
    <div className="dag-dashboard">
      {pipeline.map(node => (
        <AgentUIProvider key={node.id} endpoint={node.endpoint}>
          <AgentNodeCard node={node} />
        </AgentUIProvider>
      ))}
    </div>
  );
}

function AgentNodeCard({ node }: { node: AgentNode }) {
  // Each card gets its own independent agent state
  const { agentState, sendCommand, intervene } = useAgentUI();

  // A node is runnable only once every upstream dependency has finished
  const canRun = node.dependsOn.every(dep =>
    pipeline.find(n => n.id === dep)?.status === 'done'
  );

  return (
    <div className={`node-card ${node.status}`}>
      <h4>{node.id}</h4>
      <p>Status: {canRun ? agentState.status : 'blocked on dependencies'}</p>
      <p>Tokens: {agentState.tokens.length}</p>

      {/* Independent controls per agent */}
      <button onClick={() => sendCommand('PAUSE')}>Pause</button>
      <button onClick={() => intervene({
        systemPrompt: 'Focus on key metrics only.'
      })}>Redirect</button>
    </div>
  );
}
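The dashboard above only renders per-node state; advancing the graph takes a small scheduler. A dependency-free sketch, where `runAgent` stands in for whatever kicks off an agent's run (it is not an @agentui API):

```typescript
type Status = 'pending' | 'running' | 'done' | 'error';

interface DagNode {
  id: string;
  dependsOn: string[];
  status: Status;
}

// Run every node whose dependencies are done; repeat until the graph drains.
async function runDag(
  nodes: DagNode[],
  runAgent: (node: DagNode) => Promise<void>,
): Promise<void> {
  const byId = new Map(nodes.map(n => [n.id, n] as const));

  const ready = () =>
    nodes.filter(n =>
      n.status === 'pending' &&
      n.dependsOn.every(dep => byId.get(dep)?.status === 'done')
    );

  while (nodes.some(n => n.status === 'pending')) {
    const batch = ready();
    if (batch.length === 0) break; // blocked: an upstream node errored

    // Start every ready node in parallel; wait for the whole batch to settle
    await Promise.all(batch.map(async node => {
      node.status = 'running';
      try {
        await runAgent(node);
        node.status = 'done';
      } catch {
        node.status = 'error';
      }
    }));
  }
}
```

Failed nodes simply stall their downstream dependents, which is exactly the state the error-recovery pattern below intervenes on.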
Pattern

Graceful Error Recovery & Agent Swap

When an agent in the pipeline fails — rate limit, detected hallucination, timeout — the operator doesn't need to restart the entire workflow. ThoughtSurgery's handleIntervention() lets you swap the underlying model, inject corrective context, or skip the failed node entirely.

⚠️ Failure Detected
14:23:01 ✓ ingest Completed — 3 sources loaded
14:23:08 ✓ analyze-a Sentiment: positive (0.87)
14:23:12 ✗ analyze-b Rate limited by OpenAI (429)
14:23:12 ⏸ writer Blocked — waiting on analyze-b
🔄 Operator Intervenes
14:23:15 → operator Swapping analyze-b to Anthropic
14:23:16 ✓ analyze-b Restarted with Claude adapter
14:23:24 ✓ analyze-b Trend analysis complete
14:23:25 ▶ writer Dependencies met — resuming
Implementation — Runtime Adapter Swap
// Swap the backend of a failed agent at runtime
// This is the power of the Adapter pattern — the UI doesn't change at all.

async function handleAgentFailure(agentId: string, error: Error) {
  console.error(`Agent ${agentId} failed: ${error.message}`);

  // 1. The operator clicks "Swap to Anthropic" in the UI
  //    This triggers an intervention on the agent's session:
  await fetch('/api/stream/intervention', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      action: 'swap_adapter',
      payload: {
        agentId,
        newAdapter: 'anthropic',
        model: 'claude-sonnet-4-20250514',
        // Carry over the accumulated context
        preserveContext: true,
      }
    })
  });
}

// Server-side handler in your custom adapter:
class OrchestrationAdapter extends BaseAgentAdapter {
  async handleIntervention(ctx: SessionContext, action: string, payload: any) {
    if (action === 'swap_adapter') {
      // Disconnect the failed adapter
      await this.adapters[payload.agentId].disconnect(ctx);
      
      // Hot-swap to a new backend
      this.adapters[payload.agentId] = createAdapter(payload.newAdapter, {
        model: payload.model,
      });
      
      // Reconnect and resume — the UI sees a seamless recovery
      await this.adapters[payload.agentId].connect(ctx);
      this.emitEvent(ctx, {
        type: 'state_change',
        payload: { status: 'running', swappedTo: payload.newAdapter }
      });
    }
  }
}
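`createAdapter` is used above but never defined. The framework presumably ships something similar; a plain registry is enough to sketch the idea (the provider names and the `SwappableAdapter` shape are illustrative, not part of the API):

```typescript
interface SwappableAdapter {
  provider: string;
  model: string;
}

type AdapterFactory = (opts: { model: string }) => SwappableAdapter;

// Provider-name → factory. Hot-swapping is just a registry lookup.
const adapterRegistry = new Map<string, AdapterFactory>([
  ['openai', opts => ({ provider: 'openai', model: opts.model })],
  ['anthropic', opts => ({ provider: 'anthropic', model: opts.model })],
]);

function createAdapter(provider: string, opts: { model: string }): SwappableAdapter {
  const factory = adapterRegistry.get(provider);
  if (!factory) throw new Error(`Unknown adapter: ${provider}`);
  return factory(opts);
}
```

Registering a new provider is then a one-line addition, and the orchestration adapter never needs to know which concrete class it is holding.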
02

File Management

Agents that write to disk are dangerous. One concurrent write, one bad regex, one half-finished generation — and your file is corrupted. ThoughtSurgery treats file management as a first-class concern with AST-level parsing, per-file mutex locks, and a full edit history.

The Problem

Why String Replacement Destroys Files

Most agent frameworks use string.replace() or regex to edit files. This is catastrophically fragile. Here's what happens in production:

❌ Naive String Replace
Original:
# Project Setup
Install with `npm install`
# Project Goals
Build a scalable API
Agent runs: replaceAll("# Project", "# Updated Project")
Result:
# Updated Project Setup
Install with `npm install`
# Updated Project Goals ← UNINTENDED!
Build a scalable API
✅ ThoughtSurgery AST Edit
Original: (same file)
# Project Setup
Install with `npm install`
# Project Goals
Build a scalable API
Agent targets: heading[0] (by AST index)
Result:
# Updated Project Setup
Install with `npm install`
# Project Goals ← UNTOUCHED ✓
Build a scalable API
Pattern

The AST → Mutex → Write Pipeline

Every file edit goes through a three-stage pipeline that is structurally impossible to corrupt. Here's the exact flow that safeUpdateMarkdown() executes:

1
🔒

Acquire Mutex

Per-file lock prevents concurrent writes. If Agent B tries to edit while Agent A holds the lock, it queues and waits.

const release = await mutex.acquire();
2
🌳

Parse to AST

File is parsed into a tree of typed nodes (headings, paragraphs, code blocks). Edits target specific nodes by type and position.

const ast = processor.parse(content);
3
✍️

Mutate & Serialize

Your mutator function receives the tree, makes surgical changes, and the processor serializes it back to valid markdown.

mutator(ast); writeFileSync(path, stringify(ast));
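The three stages compose into one small wrapper. Below is a dependency-free sketch of the whole pipeline: the per-file promise-chain lock is real, while `read`/`write`/`parse`/`stringify` are injected stand-ins for fs and a markdown processor such as remark. This is an illustration of the shape, not the library's internals:

```typescript
// One promise-chain lock per file: acquire() resolves once earlier holders release.
class FileMutex {
  private tail: Promise<void> = Promise.resolve();

  acquire(): Promise<() => void> {
    let release!: () => void;
    const held = new Promise<void>(resolve => { release = resolve; });
    const acquired = this.tail.then(() => release);
    this.tail = this.tail.then(() => held);
    return acquired;
  }
}

const locks = new Map<string, FileMutex>();

// Lock → read → parse → mutate → serialize → write → unlock.
async function safeUpdate<Tree>(
  path: string,
  io: {
    read: (p: string) => Promise<string>;
    write: (p: string, content: string) => Promise<void>;
    parse: (content: string) => Tree;
    stringify: (tree: Tree) => string;
  },
  mutator: (tree: Tree) => void,
): Promise<void> {
  let mutex = locks.get(path);
  if (!mutex) { mutex = new FileMutex(); locks.set(path, mutex); }

  const release = await mutex.acquire();
  try {
    const tree = io.parse(await io.read(path));
    mutator(tree);
    await io.write(path, io.stringify(tree));
  } finally {
    release(); // always unlock, even if the mutator throws
  }
}
```

Because the lock wraps the full read-parse-write cycle, a second writer always sees the first writer's output, never a half-written intermediate state.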
Implementation — Real-World File Operations
import { safeUpdateMarkdown } from '@agentui/core/dist/ast-sync';

// ── Example 1: Insert a new section ──────────────────────────
await safeUpdateMarkdown('./docs/api-reference.md', (tree) => {
  // Find the "Methods" heading
  const methodsIdx = tree.children.findIndex(
    (n: any) => n.type === 'heading' && n.depth === 2
      && n.children[0]?.value === 'Methods'
  );
  if (methodsIdx === -1) return; // not found: bail out instead of splicing at 0

  // Insert a new method after the heading
  tree.children.splice(methodsIdx + 1, 0, {
    type: 'heading',
    depth: 3,
    children: [{ type: 'text', value: 'getSessionMetrics()' }]
  }, {
    type: 'paragraph',
    children: [{ type: 'text', value: 'Returns real-time metrics for an active session.' }]
  }, {
    type: 'code',
    lang: 'typescript',
    value: 'async getSessionMetrics(ctx: SessionContext): Promise<Metrics>'
  });
});

// ── Example 2: Update a table row ────────────────────────────
await safeUpdateMarkdown('./CHANGELOG.md', (tree) => {
  const table = tree.children.find((n: any) => n.type === 'table');
  if (table) {
    // Append a new row to the changelog table
    table.children.push({
      type: 'tableRow',
      children: [
        { type: 'tableCell', children: [{ type: 'text', value: 'v0.4.0' }] },
        { type: 'tableCell', children: [{ type: 'text', value: 'Added adapter hot-swap' }] },
        { type: 'tableCell', children: [{ type: 'text', value: '2026-04-01' }] },
      ]
    });
  }
});

// ── Example 3: Safe concurrent edits ─────────────────────────
// Both agents try to edit the same file simultaneously.
// The mutex ensures they execute sequentially — zero corruption.
await Promise.all([
  safeUpdateMarkdown('./report.md', (tree) => {
    // Agent 1: updates the summary section
    const summary = findSection(tree, 'Summary');
    summary.children[0].value = 'Updated executive summary.';
  }),
  safeUpdateMarkdown('./report.md', (tree) => {
    // Agent 2: updates the metrics section
    const metrics = findSection(tree, 'Metrics');
    metrics.children.push(createParagraph('Revenue: +23% QoQ'));
  }),
]);
// Both edits succeed. Neither corrupts the other.
Pattern

Building a File Version Timeline

By hooking into the adapter's event system, you can build a full version history of every file an agent touches. Each state_change event with a file_edit payload becomes a snapshot. The operator can review, diff, or roll back any edit.

Now
v4 — Agent updated Timeline section
+2 paragraphs, -1 heading · Writer Agent
View DiffRollback
2 min ago
v3 — Operator manually edited Budget
+1 table row · Human (Operator)
5 min ago
v2 — Agent added Goals section
+3 paragraphs, +1 heading · Research Agent
8 min ago
v1 — File created from template
Initial scaffold · System
Implementation — Version History via Events
// In your adapter — emit file edit events with version metadata
import { readFileSync, writeFileSync } from 'node:fs';

class FileTrackingAdapter extends BaseAgentAdapter {
  private versions: Map<string, FileVersion[]> = new Map();

  async editFile(ctx: SessionContext, filePath: string, edit: EditFn) {
    // 1. Snapshot the file before editing
    const before = readFileSync(filePath, 'utf-8');

    // 2. Perform the safe AST edit
    await safeUpdateMarkdown(filePath, edit);

    // 3. Snapshot after
    const after = readFileSync(filePath, 'utf-8');

    // 4. Store version and emit event
    const version: FileVersion = {
      timestamp: Date.now(),
      filePath,
      before,
      after,
      agentId: ctx.sessionId,
      diff: computeDiff(before, after),
    };

    const history = this.versions.get(filePath) ?? [];
    history.push(version);
    this.versions.set(filePath, history);

    // 5. UI receives this and renders the timeline
    this.emitEvent(ctx, {
      type: 'state_change',
      payload: {
        event: 'file_edit',
        filePath,
        version: this.versions.get(filePath)!.length,
        diff: version.diff,
      }
    });
  }

  // Rollback: restore the file as it was *after* a given version.
  // Version numbers are 1-based, matching the `version` field emitted above.
  async rollback(ctx: SessionContext, filePath: string, versionNum: number) {
    const target = this.versions.get(filePath)?.[versionNum - 1];
    if (target) {
      writeFileSync(filePath, target.after, 'utf-8');
      this.emitEvent(ctx, {
        type: 'state_change',
        payload: { event: 'file_rollback', filePath, restoredTo: versionNum }
      });
    }
  }
}
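`computeDiff` is assumed above. A minimal line-level summary is enough to drive timeline labels like "+3 / -1"; this sketch is a hypothetical helper, not the library's API:

```typescript
interface DiffSummary {
  added: number;
  removed: number;
}

// Count lines present in one version but not the other.
// A multiset comparison, not a positional diff: coarse but cheap,
// and sufficient for "+N added / -M removed" timeline badges.
function computeDiff(before: string, after: string): DiffSummary {
  const count = (lines: string[]) => {
    const m = new Map<string, number>();
    for (const l of lines) m.set(l, (m.get(l) ?? 0) + 1);
    return m;
  };
  const b = count(before.split('\n'));
  const a = count(after.split('\n'));

  let added = 0;
  for (const [line, n] of a) added += Math.max(0, n - (b.get(line) ?? 0));
  let removed = 0;
  for (const [line, n] of b) removed += Math.max(0, n - (a.get(line) ?? 0));

  return { added, removed };
}
```

For the "View Diff" panel itself you would swap in a proper diff algorithm, but the snapshot-plus-summary shape stays the same.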
03

Agent Mission Control

When you're running dozens of agents across multiple workflows, you need a command center — not a chatbot. ThoughtSurgery provides the primitives to build production-grade mission control dashboards with fleet monitoring, resource allocation, and global emergency controls.

Blueprint

Fleet Overview Dashboard

A mission control dashboard aggregates state from every agent into a single view. Each agent's agentState feeds into shared metrics: total tokens consumed, agents active vs idle, error rates, and estimated completion times.

🛰️ ThoughtSurgery Mission Control
12 Active · 3 Paused · 1 Error · 4 Idle
47,293 Total Tokens · 20 Agents · 94.2% Success Rate · ~3m Est. Completion
research-alpha · Analyzing patent corpus · 12,401 tok · ⏸ Pause
writer-beta · Drafting executive summary · 3,892 tok · ⏸ Pause
reviewer-gamma · Awaiting human approval · 890 tok · ✅ Approve
code-gen-delta · Rate limited (429) · 🔄 Retry
Implementation — Mission Control Aggregator
// mission-control.tsx
// Aggregate state from all agent sessions into a unified dashboard.

import { AgentUIProvider, useAgentUI } from '@agentui/react';
import { useState, useEffect, createContext, useContext } from 'react';

interface FleetMetrics {
  totalTokens: number;
  activeCount: number;
  pausedCount: number;
  errorCount: number;
  agents: AgentSummary[];
}

const FleetContext = createContext<FleetMetrics>(null!);

function MissionControl({ agents }: { agents: AgentConfig[] }) {
  const [fleet, setFleet] = useState<FleetMetrics>({
    totalTokens: 0, activeCount: 0, pausedCount: 0, errorCount: 0, agents: [],
  });

  return (
    <FleetContext.Provider value={fleet}>
      <FleetDashboard />
      
      {/* Each agent runs in isolation but reports to the fleet */}
      {agents.map(agent => (
        <AgentUIProvider key={agent.id} endpoint={agent.endpoint}>
          <AgentReporter 
            agentId={agent.id} 
            onStateChange={(summary) => {
              setFleet(prev => {
                // Upsert this agent's summary, then derive metrics from the new list
                const agents = prev.agents.some(a => a.id === agent.id)
                  ? prev.agents.map(a => a.id === agent.id ? summary : a)
                  : [...prev.agents, summary];
                return {
                  ...prev,
                  agents,
                  totalTokens: agents.reduce((sum, a) => sum + a.tokens, 0),
                  activeCount: agents.filter(a => a.status === 'running').length,
                  pausedCount: agents.filter(a => a.status === 'paused').length,
                  errorCount: agents.filter(a => a.status === 'error').length,
                };
              });
            }}
          />
        </AgentUIProvider>
      ))}
    </FleetContext.Provider>
  );
}

// Invisible component — just reports state upward
function AgentReporter({ agentId, onStateChange }) {
  const { agentState } = useAgentUI();
  
  useEffect(() => {
    onStateChange({
      id: agentId,
      status: agentState.status,
      tokens: agentState.tokens.length,
    });
  }, [agentState]);

  return null; // No UI — just a state bridge
}

// ── Global Emergency Controls ───────────────
function EmergencyPanel() {
  const fleet = useContext(FleetContext);
  
  const pauseAll = async () => {
    for (const agent of fleet.agents) {
      await fetch(`/api/stream/intervention?agent=${agent.id}`, {
        method: 'POST',
        body: JSON.stringify({ action: 'command', payload: { cmd: 'PAUSE' } })
      });
    }
  };

  return (
    <div className="emergency-panel">
      <button onClick={pauseAll} className="emergency-btn">
        🚨 PAUSE ALL AGENTS
      </button>
      <span>{fleet.activeCount} agents will be paused</span>
    </div>
  );
}
Pattern

Resource Governance & Rate Management

In production, every API call costs money. Mission control needs token budgets, rate limit awareness, and the ability to throttle or kill agents that over-consume. The adapter's handleIntervention() becomes your governance layer.

💰

Token Budgets

Set per-agent and global token limits. When an agent hits 80% of budget, auto-alert. At 100%, auto-pause.

73% of 50k budget

Rate Limits

Monitor API rate limits across providers. Automatically queue agents when approaching limits.

91% — Approaching limit
📊

Cost Tracking

Real-time cost estimation across all active sessions. Break down by agent, model, and task type.

$4.72 this session
Implementation — Token Budget Governance
// Governance layer — wraps any adapter with budget controls
class GovernedAdapter extends BaseAgentAdapter {
  private tokenBudgets: Map<string, { used: number; limit: number }> = new Map();

  // Wrap any existing adapter; budgets are enforced transparently
  constructor(private innerAdapter: BaseAgentAdapter) {
    super();
  }

  async connect(ctx: SessionContext) {
    // Initialize budget for this session
    this.tokenBudgets.set(ctx.sessionId, { used: 0, limit: 50000 });
    await this.innerAdapter.connect(ctx);

    // Intercept token events to track usage
    this.innerAdapter.onEvent(ctx, (event) => {
      if (event.type === 'token') {
        const budget = this.tokenBudgets.get(ctx.sessionId)!;
        budget.used += event.payload.length;

        // Auto-warn at 80%
        if (budget.used > budget.limit * 0.8) {
          this.emitEvent(ctx, {
            type: 'state_change',
            payload: { warning: 'budget_threshold', usage: budget.used / budget.limit }
          });
        }

        // Auto-pause at 100%
        if (budget.used >= budget.limit) {
          this.handleIntervention(ctx, 'command', { cmd: 'PAUSE' });
          this.emitEvent(ctx, {
            type: 'state_change',
            payload: { 
              status: 'paused',
              reason: 'Token budget exhausted',
              used: budget.used,
              limit: budget.limit
            }
          });
        }
      }

      // Forward to UI
      this.emitEvent(ctx, event);
    });
  }
}
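The governed adapter above covers token budgets; the rate-limit card can be handled the same way with a per-provider gate. A dependency-free token-bucket sketch follows (the provider names and limits are illustrative, not real quotas):

```typescript
// Token bucket per provider: `capacity` requests refill every `refillMs`.
// When the bucket is empty, take() queues the caller instead of erroring,
// so the agent simply waits its turn, mirroring the "queue agents" behavior.
class RateGate {
  private tokens: number;
  private waiting: Array<() => void> = [];

  constructor(private capacity: number, refillMs: number) {
    this.tokens = capacity;
    const timer = setInterval(() => this.refill(), refillMs);
    (timer as any).unref?.(); // don't keep the process alive for the timer (Node)
  }

  private refill() {
    this.tokens = this.capacity;
    while (this.tokens > 0 && this.waiting.length > 0) {
      this.tokens--;
      this.waiting.shift()!();
    }
  }

  take(): Promise<void> {
    if (this.tokens > 0) {
      this.tokens--;
      return Promise.resolve();
    }
    return new Promise(resolve => this.waiting.push(resolve));
  }

  get nearLimit(): boolean {
    return this.tokens <= this.capacity * 0.1; // the "approaching limit" warning
  }
}

// One gate per provider; agents `await gate.take()` before each request.
const providerGates = new Map<string, RateGate>([
  ['openai', new RateGate(60, 60_000)],
  ['anthropic', new RateGate(50, 60_000)],
]);
```

An adapter would call `await providerGates.get(provider)!.take()` before each upstream request, and surface `nearLimit` through a state_change event so the dashboard can show the warning bar.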
04

What You Can Build

The superpower of this framework is that it turns transparent, human-readable text files into a live, reactive database. In current AI stacks — LangChain + Postgres, AutoGPT in a terminal — the agent's "brain" is either a black box, locked behind a complex database query, or scrolling infinitely in a CLI where you can't touch it.

By building this file-backed wrapping engine, you enable applications where the human and the AI can physically collaborate on the exact same state at the exact same time.

Current Stacks

AI memory = something to be stored and retrieved

🗄️ Postgres · 📌 Pinecone · 💀 Black Box
ThoughtSurgery

AI memory = a shared, interactive workspace

📝 Markdown · 👁️ Human-Readable · 🤝 Collaborative
01
Full Example

"God Mode" Agent Controller

A Devin-style visual command center for local, autonomous agents

⚡ Agent Controller · ● Running
🔧 tools.json
Web Search: OFF
Code Execution: ON
File System: ON
Terminal: ON
Deploy to Prod: OFF
📋 tasks.md — Live Kanban
✅ Done
Clone repo
Install deps
🔄 In Progress
Refactor auth module
⏳ Queued
Write tests
Update docs
💡
Why it beats current stacks: If the agent is stuck in an infinite loop trying to scrape a website, the user doesn't have to Ctrl+C and kill the whole script. They just flip the "Web Search" toggle to Off in the UI. Your REST endpoint updates TOOLS.json. The agent's next loop sees the file change, drops the browser tool, and pivots. It's real-time, surgical steering.
📄 TASKS.md
📄 TOOLS.json
📄 CONFIG.md
02
Full Example

Collaborative Worldbuilder

An AI Scrivener — a persistent co-author, not just a chatbot

🗂 Workspace
📖 DRAFT.md
🧙 CHARACTERS.json
🌍 LORE.md AI editing…
📜 OUTLINE.md
🎭 THEMES.md
Chapter 3: The Crossing

Elena pressed her palm against the cold stone of the Archway. The runes flared — not the gentle blue of the old scripts, but a deep amber that pulsed like a heartbeat.

"This isn't Valdris magic," she whispered. The wind carried her words toward the Shattered Expanse, a name she'd only heard in the forbidden verses…


🤖 AI — Auto-updating LORE.md
NEW: The Shattered Expanse — Forbidden region east of the Archway. Referenced in Ch. 3.
UPD: Elena — Can sense magical signatures through touch. (Ch. 3 evidence)
NEW: Amber Runes — Distinct from Valdris blue. Unknown origin. Pulse rhythmically.
💡
Why it beats current stacks: Standard ChatGPT loses context over time and requires massive prompt injections. Standard vector databases are hard to manually edit if the AI hallucinates. Here, if the AI updates a character's trait incorrectly, the user just fixes the text box in the UI. The AST parser rewrites the Markdown safely, and the AI's "memory" is permanently corrected.
📄 DRAFT.md
📄 LORE.md
📄 CHARACTERS.json
03
Full Example

"Transparent" Second Brain

A privacy-first personal assistant via Obsidian-style markdown notes

📅 April 2026
📝 2026-04-10.md
# Daily Note
Morning standup with engineering team. Discussed Q2 roadmap.
## Invoice Summary 🤖 auto-added
Vendor: CloudServ Inc. Amount: $2,340.00. Due: Apr 20.
## TODO 🤖 auto-added
- [ ] Pay CloudServ invoice by Apr 20
- [ ] Follow up on Q2 budget allocation
🔗 Knowledge Graph
Apr 10
CloudServ
Q2 Roadmap
Invoice
💡
Why it beats current stacks: You aren't locked into a proprietary SaaS vendor's ecosystem. The entire backend is literally just a folder on the user's hard drive. Because of the SSE event bus, the moment the agent finishes processing the PDF and writes to the .md file, the UI graph instantly updates to show the new node — immediate visual feedback without a page refresh.
📄 2026-04-10.md
📄 invoices/cloudserv.pdf
📄 graph.json
04
Full Example

Visual Prompt Chain Builder

A drag-and-drop workflow designer for non-engineers

📥
User Input
trigger
🧠
Research Agent
persona: analyst
✍️
Summarize
prompt template
📊
Extract Data
prompt template
📤
Final Report
output → workflow.json
💡
Why it beats current stacks: Instead of compiling the visual graph to a proprietary cloud format, the UI simply serializes it into a clean AGENTS.md or workflow.json file in your workspace. The user gets a beautiful, intuitive builder, but the execution remains entirely local, file-based, and auditable. No vendor lock-in. No opaque cloud APIs. Just files.
📄 workflow.json
📄 AGENTS.md
📄 prompts/*.md
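Serialization here is deliberately simple: the canvas becomes plain JSON. A sketch of what might land in workflow.json; the field names are illustrative, not a fixed schema:

```typescript
interface WorkflowNode {
  id: string;
  kind: 'trigger' | 'agent' | 'prompt' | 'output';
  label: string;
  config?: Record<string, string>;
}

interface Workflow {
  version: number;
  nodes: WorkflowNode[];
  edges: Array<{ from: string; to: string }>;
}

// The canvas state becomes diffable, git-friendly JSON. No cloud format involved.
function serializeWorkflow(nodes: WorkflowNode[], edges: Workflow['edges']): string {
  const workflow: Workflow = { version: 1, nodes, edges };
  return JSON.stringify(workflow, null, 2);
}

// The five-node chain from the example above:
const workflowJson = serializeWorkflow(
  [
    { id: 'input', kind: 'trigger', label: 'User Input' },
    { id: 'research', kind: 'agent', label: 'Research Agent', config: { persona: 'analyst' } },
    { id: 'summarize', kind: 'prompt', label: 'Summarize' },
    { id: 'extract', kind: 'prompt', label: 'Extract Data' },
    { id: 'report', kind: 'output', label: 'Final Report' },
  ],
  [
    { from: 'input', to: 'research' },
    { from: 'research', to: 'summarize' },
    { from: 'research', to: 'extract' },
    { from: 'summarize', to: 'report' },
    { from: 'extract', to: 'report' },
  ],
);
```

Because the output is an ordinary file in the workspace, the runner, a code review, and the visual builder all read the same source of truth.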
🎯

The Bottom Line

Current stacks treat AI memory as something to be stored and retrieved. ThoughtSurgery treats AI memory as a shared, interactive workspace.

By solving the AST-parsing and file-watching hurdles, you allow developers to stop worrying about how to safely interrupt an agent, and start building beautiful, collaborative interfaces over them.

Build your mission.

Orchestration. File management. Mission control. And four production-grade application archetypes — all from a single, headless, framework-agnostic toolkit.