The Builder's Guide
Four in-depth explorations of the most complex problems in agentic UI development — and exactly how ThoughtSurgery solves each one with production-grade patterns and real-world examples.
Agent Orchestration
Real-world AI applications rarely involve a single agent. They require coordinated pipelines where agents hand off work, wait for dependencies, and recover from failures — all while a human operator maintains oversight. ThoughtSurgery makes orchestration a first-class citizen.
Dependency-Aware Agent Graphs
Model your agent pipeline as a directed acyclic graph (DAG). Each node is an agent with its own AgentUIProvider and SSE stream. Edges represent data dependencies. ThoughtSurgery's session-scoped state isolation means agents never interfere with each other — and the human operator can intervene at any node independently.
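Before wiring the UI, it helps to see the scheduling logic in isolation. The sketch below derives a valid execution order from the dependsOn edges, wave by wave. This is a hypothetical controller helper, not part of the library's API:

```typescript
// Hypothetical helper: derive an execution order from the DAG, wave by wave.
// A node becomes runnable once every upstream dependency has completed.
interface PipelineNode {
  id: string;
  dependsOn: string[];
}

function topoOrder(nodes: PipelineNode[]): string[] {
  const order: string[] = [];
  const done = new Set<string>();
  while (order.length < nodes.length) {
    const ready = nodes.filter(
      n => !done.has(n.id) && n.dependsOn.every(d => done.has(d))
    );
    if (ready.length === 0) throw new Error('Cycle detected in pipeline');
    for (const n of ready) {
      done.add(n.id);
      order.push(n.id);
    }
  }
  return order;
}
```

Run against the four-node pipeline below, this yields ingest first, the two analyzers next, and writer last.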
// orchestration-controller.tsx
// Each agent runs in its own isolated provider with its own SSE stream.
// The controller manages the DAG topology and data handoffs.
import { AgentUIProvider, useAgentUI } from '@agentui/react';
interface AgentNode {
id: string;
endpoint: string;
dependsOn: string[]; // IDs of upstream agents
status: 'pending' | 'running' | 'done' | 'error';
}
const pipeline: AgentNode[] = [
{ id: 'ingest', endpoint: '/api/stream?agent=ingest', dependsOn: [], status: 'running' },
{ id: 'analyze-a', endpoint: '/api/stream?agent=analyze-a', dependsOn: ['ingest'], status: 'running' },
{ id: 'analyze-b', endpoint: '/api/stream?agent=analyze-b', dependsOn: ['ingest'], status: 'pending' },
{ id: 'writer', endpoint: '/api/stream?agent=writer', dependsOn: ['analyze-a', 'analyze-b'], status: 'pending' },
];
function OrchestrationDashboard() {
return (
<div className="dag-dashboard">
{pipeline.map(node => (
<AgentUIProvider key={node.id} endpoint={node.endpoint}>
<AgentNodeCard node={node} />
</AgentUIProvider>
))}
</div>
);
}
function AgentNodeCard({ node }: { node: AgentNode }) {
// Each card gets its own independent agent state
const { agentState, sendCommand, intervene } = useAgentUI();
// Gate execution on upstream completion (wire canRun into your run controls)
const canRun = node.dependsOn.every(dep =>
  pipeline.find(n => n.id === dep)?.status === 'done'
);
return (
<div className={`node-card ${node.status}`}>
<h4>{node.id}</h4>
<p>Status: {agentState.status}</p>
<p>Tokens: {agentState.tokens.length}</p>
{/* Independent controls per agent */}
<button onClick={() => sendCommand('PAUSE')}>Pause</button>
<button onClick={() => intervene({
systemPrompt: 'Focus on key metrics only.'
})}>Redirect</button>
</div>
);
}
Graceful Error Recovery & Agent Swap
When an agent in the pipeline fails — rate limit, hallucination detected, timeout — the operator doesn't need to restart the entire workflow. ThoughtSurgery's handleIntervention() lets you swap the underlying model, inject corrective context, or skip the failed node entirely.
// Swap the backend of a failed agent at runtime
// This is the power of the Adapter pattern — the UI doesn't change at all.
async function handleAgentFailure(agentId: string, error: Error) {
console.error(`Agent ${agentId} failed: ${error.message}`);
// 1. The operator clicks "Swap to Anthropic" in the UI
// This triggers an intervention on the agent's session:
await fetch('/api/stream/intervention', {
method: 'POST',
body: JSON.stringify({
action: 'swap_adapter',
payload: {
agentId,
newAdapter: 'anthropic',
model: 'claude-sonnet-4-20250514',
// Carry over the accumulated context
preserveContext: true,
}
})
});
}
// Server-side handler in your custom adapter:
class OrchestrationAdapter extends BaseAgentAdapter {
async handleIntervention(ctx: SessionContext, action: string, payload: any) {
if (action === 'swap_adapter') {
// Disconnect the failed adapter
await this.adapters[payload.agentId].disconnect(ctx);
// Hot-swap to a new backend
this.adapters[payload.agentId] = createAdapter(payload.newAdapter, {
model: payload.model,
});
// Reconnect and resume — the UI sees a seamless recovery
await this.adapters[payload.agentId].connect(ctx);
this.emitEvent(ctx, {
type: 'state_change',
payload: { status: 'running', swappedTo: payload.newAdapter }
});
}
}
}
File Management
Agents that write to disk are dangerous. One concurrent write, one bad regex, one half-finished generation — and your file is corrupted. ThoughtSurgery treats file management as a first-class concern with AST-level parsing, per-file mutex locks, and a full edit history.
Why String Replacement Destroys Files
Most agent frameworks use string.replace() or regex to edit files. This is catastrophically fragile in production: the pattern matches an unintended occurrence, a concurrent write interleaves with yours, or a half-finished generation gets flushed to disk. The damage is silent until something downstream breaks.
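A contrived but representative sketch of the failure mode: string.replace() edits the first match it finds, not the one the agent meant.

```typescript
// Two sections happen to contain the same text. The agent intends to
// update the Deploy section, but replace() hits the Build section first.
const doc = [
  '## Build',
  'status: pending',
  '## Deploy',
  'status: pending',
].join('\n');

const edited = doc.replace('status: pending', 'status: done');

const lines = edited.split('\n');
// lines[1] is now 'status: done'     (wrong section changed)
// lines[3] is still 'status: pending' (the intended edit never happened)
```

An AST edit targets a node by type and position rather than by raw text, so this entire class of bug cannot occur.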
The AST → Mutex → Write Pipeline
Every file edit goes through a three-stage pipeline designed to make corruption structurally difficult. Here's the exact flow that safeUpdateMarkdown() executes:
Acquire Mutex
Per-file lock prevents concurrent writes. If Agent B tries to edit while Agent A holds the lock, it queues and waits.
Parse to AST
File is parsed into a tree of typed nodes (headings, paragraphs, code blocks). Edits target specific nodes by type and position.
Mutate & Serialize
Your mutator function receives the tree, makes surgical changes, and the processor serializes it back to valid markdown.
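The per-file lock in step 1 can be approximated with a promise chain. This is a hypothetical sketch, not the library's actual implementation:

```typescript
// Hypothetical per-file mutex: each path maps to the tail of a promise
// chain, so a new writer queues behind whoever currently holds the lock.
const locks = new Map<string, Promise<void>>();

async function withFileLock<T>(filePath: string, fn: () => Promise<T>): Promise<T> {
  const previous = locks.get(filePath) ?? Promise.resolve();
  let release!: () => void;
  locks.set(filePath, new Promise<void>(resolve => (release = resolve)));
  await previous; // wait for all earlier writers on this file
  try {
    return await fn(); // run the edit while holding the lock
  } finally {
    release(); // unblock the next queued writer
  }
}
```

Two concurrent withFileLock calls on the same path execute strictly one after the other, which is the guarantee the concurrent-edit example further down relies on.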
import { safeUpdateMarkdown } from '@agentui/core/dist/ast-sync';
// ── Example 1: Insert a new section ──────────────────────────
await safeUpdateMarkdown('./docs/api-reference.md', (tree) => {
// Find the "Methods" heading
const methodsIdx = tree.children.findIndex(
(n: any) => n.type === 'heading' && n.depth === 2
&& n.children[0]?.value === 'Methods'
);
// Insert a new method after the heading
tree.children.splice(methodsIdx + 1, 0, {
type: 'heading',
depth: 3,
children: [{ type: 'text', value: 'getSessionMetrics()' }]
}, {
type: 'paragraph',
children: [{ type: 'text', value: 'Returns real-time metrics for an active session.' }]
}, {
type: 'code',
lang: 'typescript',
value: 'async getSessionMetrics(ctx: SessionContext): Promise<Metrics>'
});
});
// ── Example 2: Update a table row ────────────────────────────
await safeUpdateMarkdown('./CHANGELOG.md', (tree) => {
const table = tree.children.find((n: any) => n.type === 'table');
if (table) {
// Append a new row to the changelog table
table.children.push({
type: 'tableRow',
children: [
{ type: 'tableCell', children: [{ type: 'text', value: 'v0.4.0' }] },
{ type: 'tableCell', children: [{ type: 'text', value: 'Added adapter hot-swap' }] },
{ type: 'tableCell', children: [{ type: 'text', value: '2026-04-01' }] },
]
});
}
});
// ── Example 3: Safe concurrent edits ─────────────────────────
// Both agents try to edit the same file simultaneously.
// The mutex ensures they execute sequentially — zero corruption.
await Promise.all([
safeUpdateMarkdown('./report.md', (tree) => {
// Agent 1: updates the summary section
// findSection and createParagraph are app-level helpers (not shown)
const summary = findSection(tree, 'Summary');
summary.children[0].value = 'Updated executive summary.';
}),
safeUpdateMarkdown('./report.md', (tree) => {
// Agent 2: updates the metrics section
const metrics = findSection(tree, 'Metrics');
metrics.children.push(createParagraph('Revenue: +23% QoQ'));
}),
]);
// Both edits succeed. Neither corrupts the other.
Building a File Version Timeline
By hooking into the adapter's event system, you can build a full version history of every file an agent touches. Each state_change event with a file_edit payload becomes a snapshot. The operator can review, diff, or roll back any edit.
// In your adapter — emit file edit events with version metadata
class FileTrackingAdapter extends BaseAgentAdapter {
private versions: Map<string, FileVersion[]> = new Map();
async editFile(ctx: SessionContext, filePath: string, edit: EditFn) {
// 1. Snapshot the file before editing
const before = readFileSync(filePath, 'utf-8');
// 2. Perform the safe AST edit
await safeUpdateMarkdown(filePath, edit);
// 3. Snapshot after
const after = readFileSync(filePath, 'utf-8');
// 4. Store version and emit event
const version: FileVersion = {
timestamp: Date.now(),
filePath,
before,
after,
agentId: ctx.sessionId,
diff: computeDiff(before, after), // computeDiff: your diff helper of choice
};
const history = this.versions.get(filePath) ?? [];
history.push(version);
this.versions.set(filePath, history);
// 5. UI receives this and renders the timeline
this.emitEvent(ctx, {
type: 'state_change',
payload: {
event: 'file_edit',
filePath,
version: this.versions.get(filePath)!.length,
diff: version.diff,
}
});
}
// Rollback to any previous version
async rollback(ctx: SessionContext, filePath: string, versionNum: number) {
const history = this.versions.get(filePath);
if (history && history[versionNum]) {
writeFileSync(filePath, history[versionNum].before, 'utf-8');
this.emitEvent(ctx, {
type: 'state_change',
payload: { event: 'file_rollback', filePath, restoredTo: versionNum }
});
}
}
}
Agent Mission Control
When you're running dozens of agents across multiple workflows, you need a command center — not a chatbot. ThoughtSurgery provides the primitives to build production-grade mission control dashboards with fleet monitoring, resource allocation, and global emergency controls.
Fleet Overview Dashboard
A mission control dashboard aggregates state from every agent into a single view. Each agent's agentState feeds into shared metrics: total tokens consumed, agents active vs idle, error rates, and estimated completion times.
// mission-control.tsx
// Aggregate state from all agent sessions into a unified dashboard.
import { AgentUIProvider, useAgentUI } from '@agentui/react';
import { useState, useEffect, createContext, useContext } from 'react';
interface AgentSummary {
  id: string;
  status: string;
  tokens: number;
}
interface FleetMetrics {
totalTokens: number;
activeCount: number;
pausedCount: number;
errorCount: number;
agents: AgentSummary[];
}
const FleetContext = createContext<FleetMetrics>(null!);
function MissionControl({ agents }: { agents: AgentConfig[] }) {
const [fleet, setFleet] = useState<FleetMetrics>({
totalTokens: 0, activeCount: 0, pausedCount: 0, errorCount: 0, agents: [],
});
return (
<FleetContext.Provider value={fleet}>
<FleetDashboard />
{/* Each agent runs in isolation but reports to the fleet */}
{agents.map(agent => (
<AgentUIProvider key={agent.id} endpoint={agent.endpoint}>
<AgentReporter
agentId={agent.id}
onStateChange={(summary) => {
setFleet(prev => {
  // Upsert this agent's summary, then recompute metrics from the fresh list
  const agents = prev.agents.some(a => a.id === agent.id)
    ? prev.agents.map(a => (a.id === agent.id ? summary : a))
    : [...prev.agents, summary];
  return {
    ...prev,
    agents,
    totalTokens: agents.reduce((sum, a) => sum + a.tokens, 0),
    activeCount: agents.filter(a => a.status === 'running').length,
  };
});
}}
/>
</AgentUIProvider>
))}
</FleetContext.Provider>
);
}
// Invisible component — just reports state upward
function AgentReporter({ agentId, onStateChange }: {
  agentId: string;
  onStateChange: (summary: AgentSummary) => void;
}) {
const { agentState } = useAgentUI();
useEffect(() => {
onStateChange({
id: agentId,
status: agentState.status,
tokens: agentState.tokens.length,
});
}, [agentState]);
return null; // No UI — just a state bridge
}
// ── Global Emergency Controls ───────────────
function EmergencyPanel() {
const fleet = useContext(FleetContext);
const pauseAll = async () => {
for (const agent of fleet.agents) {
await fetch(`/api/stream/intervention?agent=${agent.id}`, {
method: 'POST',
body: JSON.stringify({ action: 'command', payload: { cmd: 'PAUSE' } })
});
}
};
return (
<div className="emergency-panel">
<button onClick={pauseAll} className="emergency-btn">
🚨 PAUSE ALL AGENTS
</button>
<span>{fleet.activeCount} agents will be paused</span>
</div>
);
}
Resource Governance & Rate Management
In production, every API call costs money. Mission control needs token budgets, rate limit awareness, and the ability to throttle or kill agents that over-consume. The adapter's handleIntervention() becomes your governance layer.
Token Budgets
Set per-agent and global token limits. When an agent hits 80% of budget, auto-alert. At 100%, auto-pause.
Rate Limits
Monitor API rate limits across providers. Automatically queue agents when approaching limits.
Cost Tracking
Real-time cost estimation across all active sessions. Break down by agent, model, and task type.
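The cost-tracking card can be sketched as a simple aggregation over per-session usage. The per-model rates below are illustrative placeholders, not real prices:

```typescript
// Hypothetical cost estimator. Rates are illustrative placeholders;
// substitute your providers' actual per-1K-token pricing.
const RATE_PER_1K_TOKENS: Record<string, number> = {
  'model-a': 0.005,
  'model-b': 0.003,
};

interface SessionUsage {
  model: string;
  tokens: number;
}

function estimateFleetCost(sessions: SessionUsage[]): number {
  return sessions.reduce((total, s) => {
    const rate = RATE_PER_1K_TOKENS[s.model] ?? 0; // unknown models cost 0
    return total + (s.tokens / 1000) * rate;
  }, 0);
}
```

Breaking the total down by agent or task type is the same reduce with a groupBy in front of it.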
// Governance layer — wraps any adapter with budget controls
class GovernedAdapter extends BaseAgentAdapter {
private tokenBudgets: Map<string, { used: number; limit: number }> = new Map();
async connect(ctx: SessionContext) {
// Initialize budget for this session
this.tokenBudgets.set(ctx.sessionId, { used: 0, limit: 50000 });
await this.innerAdapter.connect(ctx);
// Intercept token events to track usage
this.innerAdapter.onEvent(ctx, (event) => {
if (event.type === 'token') {
const budget = this.tokenBudgets.get(ctx.sessionId)!;
budget.used += event.payload.length; // payload length as a proxy for token count
// Auto-warn at 80%
if (budget.used > budget.limit * 0.8) {
this.emitEvent(ctx, {
type: 'state_change',
payload: { warning: 'budget_threshold', usage: budget.used / budget.limit }
});
}
// Auto-pause at 100%
if (budget.used >= budget.limit) {
this.handleIntervention(ctx, 'command', { cmd: 'PAUSE' });
this.emitEvent(ctx, {
type: 'state_change',
payload: {
status: 'paused',
reason: 'Token budget exhausted',
used: budget.used,
limit: budget.limit
}
});
}
}
// Forward to UI
this.emitEvent(ctx, event);
});
}
}
What You Can Build
The superpower of this framework is that it turns transparent, human-readable text files into a live, reactive database. In current AI stacks — LangChain + Postgres, AutoGPT in a terminal — the agent's "brain" is either a black box, locked behind a complex database query, or scrolling infinitely in a CLI where you can't touch it.
By building this file-backed sync engine, you enable applications where the human and the AI can physically collaborate on the exact same state at the exact same time.
Current stacks: AI memory = something to be stored and retrieved.
ThoughtSurgery: AI memory = a shared, interactive workspace.
"God Mode" Agent Controller
A Devin-style visual command center for local, autonomous agents
When a local agent goes off the rails, the operator doesn't have to Ctrl+C and kill the whole script. They just flip the "Web Search" toggle to Off in the UI. Your REST endpoint updates TOOLS.json. The agent's next loop sees the file change, drops the browser tool, and pivots. It's real-time, surgical steering.
Collaborative Worldbuilder
An AI Scrivener — a persistent co-author, not just a chatbot
Elena pressed her palm against the cold stone of the Archway. The runes flared — not the gentle blue of the old scripts, but a deep amber that pulsed like a heartbeat.
"This isn't Valdris magic," she whispered. The wind carried her words toward the Shattered Expanse, a name she'd only heard in the forbidden verses…
"Transparent" Second Brain
A privacy-first personal assistant via Obsidian-style markdown notes
When the agent writes a new note to a .md file, the UI graph instantly updates to show the new node — immediate visual feedback without a page refresh.
Visual Prompt Chain Builder
A drag-and-drop workflow designer for non-engineers
Every block the user drags corresponds to an AGENTS.md or workflow.json file in your workspace. The user gets a beautiful, intuitive builder, but the execution remains entirely local, file-based, and auditable. No vendor lock-in. No opaque cloud APIs. Just files.
The Bottom Line
Current stacks treat AI memory as something to be stored and retrieved. ThoughtSurgery treats AI memory as a shared, interactive workspace.
By solving the AST-parsing and file-watching hurdles, you allow developers to stop worrying about how to safely interrupt an agent, and start building beautiful, collaborative interfaces over them.
Build your mission.
Orchestration. File management. Mission control. And four production-grade application archetypes — all from a single, headless, framework-agnostic toolkit.