AI Agent Supervisor

Overview

My role: End-to-end, from product concept to deployed system.

Challenge: Run multiple parallel work streams (including AI agents) reliably across multiple machines, without losing time to context switching or fragile terminal sessions.

Solution: A resilient, web-native terminal environment built around long-lived worker processes ("guardians") managed by a supervisor service and accessed via an SPA frontend.

Impact: 3-5x more efficient use of developer time, achieved by keeping many sessions visible and switching instantly to whichever workload needs input.

Technologies: Axum / Rust, Svelte / TS, Tauri

Problem

When you run agentic CLI tools (or any semi-interactive workload), your work becomes a set of concurrent streams: some waiting for input, some running long tasks, some failing and needing intervention. Classic terminal multiplexers are stable, but they make it hard to:

  • keep many work streams visible at once
  • supervise streams across multiple machines
  • switch quickly to the next stream that needs attention

Architecture

The system is built around three pieces:

  • Guardians: long-lived worker processes. Each guardian owns a PTY and runs a single workload.
  • Supervisor: manages guardians, tracks state, and provides the web API for the UI.
  • Frontend: an SPA that provides the terminal experience in the browser; multiple machines appear as tabs.

The key reliability decision is that guardians run independently of the supervisor: the more crash-prone supervisor can die, restart, and re-adopt the existing guardians, so the workloads survive a supervisor failure.
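The re-adoption step can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed implementation: the `GuardianRecord` fields, the `Supervisor::readopt` function, and the idea that each guardian leaves behind a small record (id, pid, socket path) are all hypothetical, and the liveness probe is passed in as a closure so the sketch does not commit to any particular mechanism.

```rust
use std::collections::HashMap;

/// Hypothetical record a guardian leaves behind so that a restarted
/// supervisor can find it again. All field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct GuardianRecord {
    pub id: String,
    pub pid: u32,
    pub socket_path: String,
}

/// In-memory registry of the guardians the supervisor currently manages.
#[derive(Debug, Default)]
pub struct Supervisor {
    guardians: HashMap<String, GuardianRecord>,
}

impl Supervisor {
    /// Rebuild the registry after a supervisor restart.
    ///
    /// `is_alive` abstracts the liveness probe (for example, trying to
    /// connect to the guardian's socket). Records that fail the probe are
    /// not adopted; their ids are returned so the caller can clean up.
    pub fn readopt<F>(records: Vec<GuardianRecord>, is_alive: F) -> (Self, Vec<String>)
    where
        F: Fn(&GuardianRecord) -> bool,
    {
        let mut supervisor = Supervisor::default();
        let mut stale = Vec::new();
        for record in records {
            if is_alive(&record) {
                supervisor.guardians.insert(record.id.clone(), record);
            } else {
                stale.push(record.id);
            }
        }
        (supervisor, stale)
    }

    /// Ids of all adopted guardians, sorted for stable output.
    pub fn adopted_ids(&self) -> Vec<String> {
        let mut ids: Vec<String> = self.guardians.keys().cloned().collect();
        ids.sort();
        ids
    }
}
```

The point of the sketch is that the supervisor holds no state that cannot be rebuilt from the guardians themselves, which is what allows it to be the component that is permitted to crash.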

Workflow

In practice, this enables high-tempo work across multiple concurrent sessions: keep many sessions visible, and whenever one stream blocks on input, switch immediately to the next stream that can move forward. This is especially effective for SI work across multiple projects, or parallel work streams within the same project.
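One way to picture the "switch to the next stream" step is a small selection function over session states. The sketch below is illustrative only: `SessionState`, `Session`, and `next_needing_attention` are hypothetical names, and the policy (prefer whichever session has been waiting or failed the longest) is an assumption rather than the system's documented behavior.

```rust
/// Coarse session states, mirroring the streams described in the text:
/// running a long task, waiting for input, or failed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Running,
    WaitingForInput,
    Failed,
}

#[derive(Debug, Clone)]
pub struct Session {
    pub name: String,
    pub state: SessionState,
    /// Seconds since the session entered its current state.
    pub secs_in_state: u64,
}

/// Pick the session that has needed attention the longest,
/// skipping the one the user is currently looking at (if any).
pub fn next_needing_attention<'a>(
    sessions: &'a [Session],
    current: Option<&str>,
) -> Option<&'a Session> {
    sessions
        .iter()
        .filter(|s| Some(s.name.as_str()) != current)
        .filter(|s| matches!(s.state, SessionState::WaitingForInput | SessionState::Failed))
        .max_by_key(|s| s.secs_in_state)
}
```

Sessions that are still running are never selected, so the function returns `None` when nothing needs the user.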

Deployed features include:

  • an extremely resilient terminal environment
  • terminal group management for multi-project parallelism
  • a responsive web frontend to the running processes, reachable from the machine they run on or from any other machine through a secure connection
  • "multi-player" concurrent connections
  • full terminal scrollback

Browser access makes the system versatile and easy to reach from anywhere, although browsers impose some limitations.

The recently added Tauri native frontend significantly improves communication performance.
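The scrollback and "multi-player" features listed above fit together naturally if each guardian keeps its output in a sequence-numbered buffer, so that every connected client can replay from its own position. The sketch below shows that idea under loud assumptions: the `Scrollback` type, its methods, and the line-based granularity are hypothetical, and the fixed capacity is only there to keep the example small (the source describes scrollback as full, so a real implementation may retain everything or spill to disk).

```rust
use std::collections::VecDeque;

/// Sequence-numbered scrollback buffer (hypothetical design).
pub struct Scrollback {
    lines: VecDeque<String>,
    /// Maximum number of retained lines (an assumption of this sketch).
    capacity: usize,
    /// Sequence number of the oldest retained line.
    first_seq: u64,
}

impl Scrollback {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self {
            lines: VecDeque::with_capacity(capacity),
            capacity,
            first_seq: 0,
        }
    }

    /// Append a line, evicting the oldest one when the buffer is full.
    pub fn push(&mut self, line: impl Into<String>) {
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
            self.first_seq += 1;
        }
        self.lines.push_back(line.into());
    }

    /// Sequence number that the next pushed line will receive.
    pub fn next_seq(&self) -> u64 {
        self.first_seq + self.lines.len() as u64
    }

    /// Lines with sequence number >= `from_seq`, clamped to what is
    /// still retained. A newly connected client would pass 0; a
    /// reconnecting client would pass the last `next_seq` it saw.
    pub fn replay_from(&self, from_seq: u64) -> Vec<&str> {
        let skip = from_seq.saturating_sub(self.first_seq) as usize;
        self.lines.iter().skip(skip).map(String::as_str).collect()
    }
}
```

Because the buffer itself holds no per-client state, any number of clients can attach to the same session and each only needs to remember a single sequence number.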

Roadmap

Future AI integration can add higher-level capabilities on top of these primitives: assisting with managing work streams, detecting which sessions need attention, and proposing the next action.