The Builder's Guide
Four in-depth explorations of the most complex problems in agentic UI development — and exactly how ThoughtSurgery solves each one with production-grade patterns and real-world examples.
Agent Orchestration
Real-world AI applications rarely involve a single agent. They require coordinated pipelines where agents hand off work, wait for dependencies, and recover from failures — all while a human operator maintains oversight. ThoughtSurgery makes orchestration a first-class citizen.
Dependency-Aware Agent Graphs
Model your agent pipeline as a directed acyclic graph (DAG). Each node is an agent with its own AgentUIProvider and SSE stream. Edges represent data dependencies. ThoughtSurgery's session-scoped state isolation means agents never interfere with each other — and the human operator can intervene at any node independently.
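Before wiring the UI, it helps to see the scheduling logic in isolation. The sketch below derives a valid execution order from the dependsOn edges, wave by wave. This is a hypothetical controller helper, not part of the library's API:

```typescript
// Hypothetical helper: derive an execution order from the DAG, wave by wave.
// A node becomes runnable once every upstream dependency has completed.
interface PipelineNode {
  id: string;
  dependsOn: string[];
}

function topoOrder(nodes: PipelineNode[]): string[] {
  const order: string[] = [];
  const done = new Set<string>();
  while (order.length < nodes.length) {
    const ready = nodes.filter(
      n => !done.has(n.id) && n.dependsOn.every(d => done.has(d))
    );
    if (ready.length === 0) throw new Error('Cycle detected in pipeline');
    for (const n of ready) {
      done.add(n.id);
      order.push(n.id);
    }
  }
  return order;
}
```

Run against the four-node pipeline below, this yields ingest first, the two analyzers next, and writer last.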
// orchestration-controller.tsx
// Each agent runs in its own isolated provider with its own SSE stream.
// The controller manages the DAG topology and data handoffs.
import { AgentUIProvider, useAgentUI } from '@agentui/react';
interface AgentNode {
id: string;
endpoint: string;
dependsOn: string[]; // IDs of upstream agents
status: 'pending' | 'running' | 'done' | 'error';
}
const pipeline: AgentNode[] = [
{ id: 'ingest', endpoint: '/api/stream?agent=ingest', dependsOn: [], status: 'running' },
{ id: 'analyze-a', endpoint: '/api/stream?agent=analyze-a', dependsOn: ['ingest'], status: 'running' },
{ id: 'analyze-b', endpoint: '/api/stream?agent=analyze-b', dependsOn: ['ingest'], status: 'pending' },
{ id: 'writer', endpoint: '/api/stream?agent=writer', dependsOn: ['analyze-a', 'analyze-b'], status: 'pending' },
];
function OrchestrationDashboard() {
return (
<div className="dag-dashboard">
{pipeline.map(node => (
<AgentUIProvider key={node.id} endpoint={node.endpoint}>
<AgentNodeCard node={node} />
</AgentUIProvider>
))}
</div>
);
}
function AgentNodeCard({ node }: { node: AgentNode }) {
// Each card gets its own independent agent state
const { agentState, sendCommand, intervene } = useAgentUI();
// Gate execution on upstream completion (wire canRun into your run controls)
const canRun = node.dependsOn.every(dep =>
  pipeline.find(n => n.id === dep)?.status === 'done'
);
return (
<div className={`node-card ${node.status}`}>
<h4>{node.id}</h4>
<p>Status: {agentState.status}</p>
<p>Tokens: {agentState.tokens.length}</p>
{/* Independent controls per agent */}
<button onClick={() => sendCommand('PAUSE')}>Pause</button>
<button onClick={() => intervene({
systemPrompt: 'Focus on key metrics only.'
})}>Redirect</button>
</div>
);
}
Graceful Error Recovery & Agent Swap
When an agent in the pipeline fails — rate limit, hallucination detected, timeout — the operator doesn't need to restart the entire workflow. ThoughtSurgery's handleIntervention() lets you swap the underlying model, inject corrective context, or skip the failed node entirely.
// Swap the backend of a failed agent at runtime
// This is the power of the Adapter pattern — the UI doesn't change at all.
async function handleAgentFailure(agentId: string, error: Error) {
console.error(`Agent ${agentId} failed: ${error.message}`);
// 1. The operator clicks "Swap to Anthropic" in the UI
// This triggers an intervention on the agent's session:
await fetch('/api/stream/intervention', {
method: 'POST',
body: JSON.stringify({
action: 'swap_adapter',
payload: {
agentId,
newAdapter: 'anthropic',
model: 'claude-sonnet-4-20250514',
// Carry over the accumulated context
preserveContext: true,
}
})
});
}
// Server-side handler in your custom adapter:
class OrchestrationAdapter extends BaseAgentAdapter {
async handleIntervention(ctx: SessionContext, action: string, payload: any) {
if (action === 'swap_adapter') {
// Disconnect the failed adapter
await this.adapters[payload.agentId].disconnect(ctx);
// Hot-swap to a new backend
this.adapters[payload.agentId] = createAdapter(payload.newAdapter, {
model: payload.model,
});
// Reconnect and resume — the UI sees a seamless recovery
await this.adapters[payload.agentId].connect(ctx);
this.emitEvent(ctx, {
type: 'state_change',
payload: { status: 'running', swappedTo: payload.newAdapter }
});
}
}
}
File Management
Agents that write to disk are dangerous. One concurrent write, one bad regex, one half-finished generation — and your file is corrupted. ThoughtSurgery treats file management as a first-class concern with AST-level parsing, per-file mutex locks, and a full edit history.
Why String Replacement Destroys Files
Most agent frameworks use string.replace() or regex to edit files. This is catastrophically fragile in production: the pattern matches an unintended occurrence, a concurrent write interleaves with yours, or a half-finished generation gets flushed to disk. The damage is silent until something downstream breaks.
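A contrived but representative sketch of the failure mode: string.replace() edits the first match it finds, not the one the agent meant.

```typescript
// Two sections happen to contain the same text. The agent intends to
// update the Deploy section, but replace() hits the Build section first.
const doc = [
  '## Build',
  'status: pending',
  '## Deploy',
  'status: pending',
].join('\n');

const edited = doc.replace('status: pending', 'status: done');

const lines = edited.split('\n');
// lines[1] is now 'status: done'     (wrong section changed)
// lines[3] is still 'status: pending' (the intended edit never happened)
```

An AST edit targets a node by type and position rather than by raw text, so this entire class of bug cannot occur.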
The AST → Mutex → Write Pipeline
Every file edit goes through a three-stage pipeline designed to make corruption structurally difficult. Here's the exact flow that safeUpdateMarkdown() executes:
Acquire Mutex
Per-file lock prevents concurrent writes. If Agent B tries to edit while Agent A holds the lock, it queues and waits.
Parse to AST
File is parsed into a tree of typed nodes (headings, paragraphs, code blocks). Edits target specific nodes by type and position.
Mutate & Serialize
Your mutator function receives the tree, makes surgical changes, and the processor serializes it back to valid markdown.
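The per-file lock in step 1 can be approximated with a promise chain. This is a hypothetical sketch, not the library's actual implementation:

```typescript
// Hypothetical per-file mutex: each path maps to the tail of a promise
// chain, so a new writer queues behind whoever currently holds the lock.
const locks = new Map<string, Promise<void>>();

async function withFileLock<T>(filePath: string, fn: () => Promise<T>): Promise<T> {
  const previous = locks.get(filePath) ?? Promise.resolve();
  let release!: () => void;
  locks.set(filePath, new Promise<void>(resolve => (release = resolve)));
  await previous; // wait for all earlier writers on this file
  try {
    return await fn(); // run the edit while holding the lock
  } finally {
    release(); // unblock the next queued writer
  }
}
```

Two concurrent withFileLock calls on the same path execute strictly one after the other, which is the guarantee the concurrent-edit example further down relies on.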
import { safeUpdateMarkdown } from '@agentui/core/dist/ast-sync';
// ── Example 1: Insert a new section ──────────────────────────
await safeUpdateMarkdown('./docs/api-reference.md', (tree) => {
// Find the "Methods" heading
const methodsIdx = tree.children.findIndex(
(n: any) => n.type === 'heading' && n.depth === 2
&& n.children[0]?.value === 'Methods'
);
// Insert a new method after the heading
tree.children.splice(methodsIdx + 1, 0, {
type: 'heading',
depth: 3,
children: [{ type: 'text', value: 'getSessionMetrics()' }]
}, {
type: 'paragraph',
children: [{ type: 'text', value: 'Returns real-time metrics for an active session.' }]
}, {
type: 'code',
lang: 'typescript',
value: 'async getSessionMetrics(ctx: SessionContext): Promise<Metrics>'
});
});
// ── Example 2: Update a table row ────────────────────────────
await safeUpdateMarkdown('./CHANGELOG.md', (tree) => {
const table = tree.children.find((n: any) => n.type === 'table');
if (table) {
// Append a new row to the changelog table
table.children.push({
type: 'tableRow',
children: [
{ type: 'tableCell', children: [{ type: 'text', value: 'v0.4.0' }] },
{ type: 'tableCell', children: [{ type: 'text', value: 'Added adapter hot-swap' }] },
{ type: 'tableCell', children: [{ type: 'text', value: '2026-04-01' }] },
]
});
}
});
// ── Example 3: Safe concurrent edits ─────────────────────────
// Both agents try to edit the same file simultaneously.
// The mutex ensures they execute sequentially — zero corruption.
await Promise.all([
safeUpdateMarkdown('./report.md', (tree) => {
// Agent 1: updates the summary section
// findSection and createParagraph are app-level helpers (not shown)
const summary = findSection(tree, 'Summary');
summary.children[0].value = 'Updated executive summary.';
}),
safeUpdateMarkdown('./report.md', (tree) => {
// Agent 2: updates the metrics section
const metrics = findSection(tree, 'Metrics');
metrics.children.push(createParagraph('Revenue: +23% QoQ'));
}),
]);
// Both edits succeed. Neither corrupts the other.
Building a File Version Timeline
By hooking into the adapter's event system, you can build a full version history of every file an agent touches. Each state_change event with a file_edit payload becomes a snapshot. The operator can review, diff, or roll back any edit.
// In your adapter — emit file edit events with version metadata
class FileTrackingAdapter extends BaseAgentAdapter {
private versions: Map<string, FileVersion[]> = new Map();
async editFile(ctx: SessionContext, filePath: string, edit: EditFn) {
// 1. Snapshot the file before editing
const before = readFileSync(filePath, 'utf-8');
// 2. Perform the safe AST edit
await safeUpdateMarkdown(filePath, edit);
// 3. Snapshot after
const after = readFileSync(filePath, 'utf-8');
// 4. Store version and emit event
const version: FileVersion = {
timestamp: Date.now(),
filePath,
before,
after,
agentId: ctx.sessionId,
diff: computeDiff(before, after), // computeDiff: your diff helper of choice
};
const history = this.versions.get(filePath) ?? [];
history.push(version);
this.versions.set(filePath, history);
// 5. UI receives this and renders the timeline
this.emitEvent(ctx, {
type: 'state_change',
payload: {
event: 'file_edit',
filePath,
version: this.versions.get(filePath)!.length,
diff: version.diff,
}
});
}
// Rollback to any previous version
async rollback(ctx: SessionContext, filePath: string, versionNum: number) {
const history = this.versions.get(filePath);
if (history && history[versionNum]) {
writeFileSync(filePath, history[versionNum].before, 'utf-8');
this.emitEvent(ctx, {
type: 'state_change',
payload: { event: 'file_rollback', filePath, restoredTo: versionNum }
});
}
}
}
Agent Mission Control
When you're running dozens of agents across multiple workflows, you need a command center — not a chatbot. ThoughtSurgery provides the primitives to build production-grade mission control dashboards with fleet monitoring, resource allocation, and global emergency controls.
Fleet Overview Dashboard
A mission control dashboard aggregates state from every agent into a single view. Each agent's agentState feeds into shared metrics: total tokens consumed, agents active vs idle, error rates, and estimated completion times.
// mission-control.tsx
// Aggregate state from all agent sessions into a unified dashboard.
import { AgentUIProvider, useAgentUI } from '@agentui/react';
import { useState, useEffect, createContext, useContext } from 'react';
interface AgentSummary {
  id: string;
  status: string;
  tokens: number;
}
interface FleetMetrics {
totalTokens: number;
activeCount: number;
pausedCount: number;
errorCount: number;
agents: AgentSummary[];
}
const FleetContext = createContext<FleetMetrics>(null!);
function MissionControl({ agents }: { agents: AgentConfig[] }) {
const [fleet, setFleet] = useState<FleetMetrics>({
totalTokens: 0, activeCount: 0, pausedCount: 0, errorCount: 0, agents: [],
});
return (
<FleetContext.Provider value={fleet}>
<FleetDashboard />
{/* Each agent runs in isolation but reports to the fleet */}
{agents.map(agent => (
<AgentUIProvider key={agent.id} endpoint={agent.endpoint}>
<AgentReporter
agentId={agent.id}
onStateChange={(summary) => {
setFleet(prev => {
  // Upsert this agent's summary, then recompute metrics from the fresh list
  const agents = prev.agents.some(a => a.id === agent.id)
    ? prev.agents.map(a => (a.id === agent.id ? summary : a))
    : [...prev.agents, summary];
  return {
    ...prev,
    agents,
    totalTokens: agents.reduce((sum, a) => sum + a.tokens, 0),
    activeCount: agents.filter(a => a.status === 'running').length,
  };
});
}}
/>
</AgentUIProvider>
))}
</FleetContext.Provider>
);
}
// Invisible component — just reports state upward
function AgentReporter({ agentId, onStateChange }: {
  agentId: string;
  onStateChange: (summary: AgentSummary) => void;
}) {
const { agentState } = useAgentUI();
useEffect(() => {
onStateChange({
id: agentId,
status: agentState.status,
tokens: agentState.tokens.length,
});
}, [agentState]);
return null; // No UI — just a state bridge
}
// ── Global Emergency Controls ───────────────
function EmergencyPanel() {
const fleet = useContext(FleetContext);
const pauseAll = async () => {
for (const agent of fleet.agents) {
await fetch(`/api/stream/intervention?agent=${agent.id}`, {
method: 'POST',
body: JSON.stringify({ action: 'command', payload: { cmd: 'PAUSE' } })
});
}
};
return (
<div className="emergency-panel">
<button onClick={pauseAll} className="emergency-btn">
🚨 PAUSE ALL AGENTS
</button>
<span>{fleet.activeCount} agents will be paused</span>
</div>
);
}
Resource Governance & Rate Management
In production, every API call costs money. Mission control needs token budgets, rate limit awareness, and the ability to throttle or kill agents that over-consume. The adapter's handleIntervention() becomes your governance layer.
Token Budgets
Set per-agent and global token limits. When an agent hits 80% of budget, auto-alert. At 100%, auto-pause.
Rate Limits
Monitor API rate limits across providers. Automatically queue agents when approaching limits.
Cost Tracking
Real-time cost estimation across all active sessions. Break down by agent, model, and task type.
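The cost-tracking card can be sketched as a simple aggregation over per-session usage. The per-model rates below are illustrative placeholders, not real prices:

```typescript
// Hypothetical cost estimator. Rates are illustrative placeholders;
// substitute your providers' actual per-1K-token pricing.
const RATE_PER_1K_TOKENS: Record<string, number> = {
  'model-a': 0.005,
  'model-b': 0.003,
};

interface SessionUsage {
  model: string;
  tokens: number;
}

function estimateFleetCost(sessions: SessionUsage[]): number {
  return sessions.reduce((total, s) => {
    const rate = RATE_PER_1K_TOKENS[s.model] ?? 0; // unknown models cost 0
    return total + (s.tokens / 1000) * rate;
  }, 0);
}
```

Breaking the total down by agent or task type is the same reduce with a groupBy in front of it.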
// Governance layer — wraps any adapter with budget controls
class GovernedAdapter extends BaseAgentAdapter {
private tokenBudgets: Map<string, { used: number; limit: number }> = new Map();
async connect(ctx: SessionContext) {
// Initialize budget for this session
this.tokenBudgets.set(ctx.sessionId, { used: 0, limit: 50000 });
await this.innerAdapter.connect(ctx);
// Intercept token events to track usage
this.innerAdapter.onEvent(ctx, (event) => {
if (event.type === 'token') {
const budget = this.tokenBudgets.get(ctx.sessionId)!;
budget.used += event.payload.length; // payload length as a proxy for token count
// Auto-warn at 80%
if (budget.used > budget.limit * 0.8) {
this.emitEvent(ctx, {
type: 'state_change',
payload: { warning: 'budget_threshold', usage: budget.used / budget.limit }
});
}
// Auto-pause at 100%
if (budget.used >= budget.limit) {
this.handleIntervention(ctx, 'command', { cmd: 'PAUSE' });
this.emitEvent(ctx, {
type: 'state_change',
payload: {
status: 'paused',
reason: 'Token budget exhausted',
used: budget.used,
limit: budget.limit
}
});
}
}
// Forward to UI
this.emitEvent(ctx, event);
});
}
}
What You Can Build
The superpower of this framework is that it turns transparent, human-readable text files into a live, reactive database. In current AI stacks — LangChain + Postgres, AutoGPT in a terminal — the agent's "brain" is either a black box, locked behind a complex database query, or scrolling infinitely in a CLI where you can't touch it.
By building this file-backed sync engine, you enable applications where the human and the AI can physically collaborate on the exact same state at the exact same time.
Current stacks: AI memory = something to be stored and retrieved.
ThoughtSurgery: AI memory = a shared, interactive workspace.
"God Mode" Agent Controller
A Devin-style visual command center for local, autonomous agents
When a local agent goes off the rails, the operator doesn't have to Ctrl+C and kill the whole script. They just flip the "Web Search" toggle to Off in the UI. Your REST endpoint updates TOOLS.json. The agent's next loop sees the file change, drops the browser tool, and pivots. It's real-time, surgical steering.
Collaborative Worldbuilder
An AI Scrivener — a persistent co-author, not just a chatbot
Elena pressed her palm against the cold stone of the Archway. The runes flared — not the gentle blue of the old scripts, but a deep amber that pulsed like a heartbeat.
"This isn't Valdris magic," she whispered. The wind carried her words toward the Shattered Expanse, a name she'd only heard in the forbidden verses…
"Transparent" Second Brain
A privacy-first personal assistant via Obsidian-style markdown notes
When the agent writes a new note to a .md file, the UI graph instantly updates to show the new node — immediate visual feedback without a page refresh.
Visual Prompt Chain Builder
A drag-and-drop workflow designer for non-engineers
Every block the user drags corresponds to an AGENTS.md or workflow.json file in your workspace. The user gets a beautiful, intuitive builder, but the execution remains entirely local, file-based, and auditable. No vendor lock-in. No opaque cloud APIs. Just files.
The Bottom Line
Current stacks treat AI memory as something to be stored and retrieved. ThoughtSurgery treats AI memory as a shared, interactive workspace.
By solving the AST-parsing and file-watching hurdles, you allow developers to stop worrying about how to safely interrupt an agent, and start building beautiful, collaborative interfaces over them.
Build your mission.
Orchestration. File management. Mission control. And four production-grade application archetypes — all from a single, headless, framework-agnostic toolkit.