b42ed192cfcb39ce923e480a3967ae0920db2df8
* iOS: Improve mission UI, add auto-reconnect, and refine input field
  - Fix missions showing "Default" label by using mission ID instead when no model override
  - Add ConnectionState enum to track SSE stream health with reconnecting/disconnected states
  - Implement automatic reconnection with exponential backoff (1s→30s)
  - Show connection status in toolbar when disconnecting, hide error bubbles for connection issues
  - Fix status event filtering to only apply to currently viewed mission
  - Reset run state when creating new mission or switching missions
  - Redesign input field to ChatGPT style: clean outline, no background fill, integrated send button

* Add real-time desktop streaming with WebSocket MJPEG
  Implements desktop streaming feature to watch the AI agent work in real-time:
  - Backend: WebSocket endpoint at /api/desktop/stream using MJPEG frames
  - iOS: Bottom sheet UI with play/pause, FPS and quality controls
  - Web: Side-by-side split view with toggleable desktop panel
  - Better OpenCode error messages for debugging

* Fix Bugbot review issues
  - Fix WebSocket reconnection on slider changes by using initial values for URL params
  - Fix iOS connected status set before WebSocket actually connects
  - Fix mission state mapping to properly handle waiting_for_tool state

* Change default model from Sonnet 4 to Opus 4.5
  Update DEFAULT_MODEL default value to claude-opus-4-5-20251101, the most capable model in the Claude family.

* Fix additional Bugbot review issues
  - Add onerror handler for image loading to prevent memory leaks
  - Reset isPaused on disconnect to avoid UI desync
  - Fix data race on backoff variable using nonisolated(unsafe)

* Address remaining Bugbot review issues
  - Make error filtering more specific to SSE reconnection errors only
  - Use refs for FPS/quality values to preserve current settings on reconnect

* Fix initial connection state and task cleanup
  - Start iOS connection state as disconnected until first event
  - Abort spawned tasks when WebSocket handler exits to prevent resource waste

* Fix connection state and backoff logic in iOS ControlView
  - Set connectionState to .disconnected on view disappear (was incorrectly .connected)
  - Only reset exponential backoff on successful (non-error) events to maintain proper backoff behavior when server is unavailable

* Fix fullscreen state sync and stale WebSocket callbacks
  - Web: Don't set fullscreen state synchronously; rely on event listeners
  - Web: Add fullscreenerror event handler to catch failed fullscreen requests
  - iOS: Add connection ID to prevent stale WebSocket callbacks from corrupting new connection state when reconnecting

* Fix user message not appearing when viewing parallel missions
  When switching to a parallel mission, currentMission was not being updated, causing viewingId != currentId. This made the event filter skip user_message events (which have mission_id: None from main session). Now always update currentMission when switching, ensuring the filter passes events correctly.

* Fix web dashboard showing "Agent is working..." for idle missions
  Two fixes:
  1. Set viewingMissionId immediately when loading mission from URL param
     - Previously viewingMissionId was null, falling back to global runState
     - Now it's set immediately so viewingMissionIsRunning checks runningMissions
  2. Add status event filtering by mission_id
     - Status events now only update runState if they match the viewing mission
     - Similar to iOS fix for cross-mission status contamination

* Fix mission not loading when accessed via URL before authentication
  When loading a mission via URL param (?mission=...), the initial API fetch would fail with 401 before the user authenticated. After login, nothing triggered a re-fetch of the mission data.
  Added auth retry mechanism:
  - Add signalAuthSuccess() to dispatch event after successful login
  - Add authRetryTrigger state and listener in control-client
  - Re-fetch mission and providers when auth succeeds

* Fix user message not appearing when viewing a specific mission
  The user_message SSE event was being sent with mission_id: None, causing it to be filtered out by the frontend when viewing a specific mission. Now we read the current_mission before emitting the event and include its ID, so the frontend correctly displays the user's message.

* Separate viewed mission from main mission to prevent event leaking
  - Thread mission_id through main control runs so assistant/thinking/tool events are tagged with the correct mission ID
  - Web: Track viewingMission separately from currentMission; filter SSE events by mission_id; revert to previous view on load failures
  - iOS: Track viewingMission separately from currentMission; filter SSE events by mission_id; restore previous view on load failures; parse depth from both 'depth' and 'current_depth' SSE fields
  - Update "Auto uses" label to Opus 4.5 on web
  This prevents mission switching from leaking messages or status updates across different missions when running parallel missions.

* Fix Bugbot review issues
  - Use getValidJwt() and getRuntimeApiBase() in desktop-stream.tsx instead of incorrect storage keys
  - Show error toast for mission load failures (except 401 auth errors) to fix silent failures for already-authenticated users

* Fix additional Bugbot review issues
  - Add connectionId guard to desktop stream WebSocket to prevent race conditions where stale onclose callbacks incorrectly set disconnected state after reconnection
  - Fix sync effect in control-client to only update viewingMission when viewingMissionId matches currentMission.id, preventing state corruption
  - Restore runState, queueLength, progress on iOS mission switch failure to avoid mismatched status indicators

* Add race condition guard to URL-based mission loading

* Fix data race in iOS reconnection backoff using OSAllocatedUnfairLock
  Replace nonisolated(unsafe) with proper thread-safe synchronization using OSAllocatedUnfairLock for the receivedSuccessfulEvent boolean that is written from the stream callback and read after completion.
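The reconnection behaviour described in the commit notes above (exponential backoff from 1s up to 30s, reset only after a successful event) can be sketched as follows. The iOS client itself is written in Swift; this Rust version is only an illustration, and the `Backoff` type and its method names are hypothetical rather than taken from the project.

```rust
use std::time::Duration;

/// Tracks the delay between reconnection attempts.
/// Hypothetical helper for illustration; not part of the project source.
#[derive(Debug, Clone)]
pub struct Backoff {
    initial: Duration,
    max: Duration,
    current: Duration,
}

impl Backoff {
    pub fn new(initial: Duration, max: Duration) -> Self {
        Self { initial, max, current: initial }
    }

    /// Returns the delay to wait before the next attempt, then doubles
    /// the stored delay, capping it at `max`.
    pub fn next_delay(&mut self) -> Duration {
        let delay = self.current;
        self.current = (self.current * 2).min(self.max);
        delay
    }

    /// Resets to the initial delay. Per the commit notes, this should be
    /// called only after a successful (non-error) event, so the delay keeps
    /// growing while the server is unavailable.
    pub fn reset(&mut self) {
        self.current = self.initial;
    }
}
```

With `Backoff::new(Duration::from_secs(1), Duration::from_secs(30))`, successive calls to `next_delay()` yield 1s, 2s, 4s, 8s, 16s, and then 30s for every later attempt until `reset()` is called.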
Open Agent
A minimal autonomous coding agent with full machine access, implemented in Rust.
Features
- HTTP API for task submission and monitoring
- Tool-based agent loop following the "tools in a loop" pattern
- Full toolset: file operations, terminal, machine-wide search, web access, git
- OpenRouter integration for LLM access (supports any model)
- SSE streaming for real-time task progress
- AI-maintainable Rust codebase with strong typing
Quick Start
Prerequisites
- Rust 1.70+ (install via rustup)
- An OpenRouter API key (available from openrouter.ai)
Installation
git clone <repo-url>
cd open_agent
cargo build --release
Running
# Set your API key
export OPENROUTER_API_KEY="sk-or-v1-..."
# Optional: configure model (default: anthropic/claude-sonnet-4.5)
export DEFAULT_MODEL="anthropic/claude-sonnet-4.5"
# Optional: default working directory for relative paths (absolute paths work everywhere)
# In production this is typically /root
export WORKING_DIR="."
# Start the server
cargo run --release
The server starts on http://127.0.0.1:3000 by default.
OpenCode Backend (External Agent)
Open Agent can delegate execution to an OpenCode server instead of using its built-in agent loop.
# Point to a running OpenCode server
export AGENT_BACKEND="opencode"
export OPENCODE_BASE_URL="http://127.0.0.1:4096"
# Optional: choose OpenCode agent (build/plan/etc)
export OPENCODE_AGENT="build"
# Optional: auto-allow all permissions for OpenCode sessions (default: true)
export OPENCODE_PERMISSIVE="true"
API Reference
Submit a Task
curl -X POST http://localhost:3000/api/task \
  -H "Content-Type: application/json" \
  -d '{"task": "Create a Python script that prints Hello World"}'
Response:
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending"
}
Get Task Status
curl http://localhost:3000/api/task/{id}
Response:
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "task": "Create a Python script that prints Hello World",
  "model": "openai/gpt-4.1-mini",
  "iterations": 3,
  "result": "I've created hello.py with a simple Hello World script...",
  "log": [...]
}
Stream Task Progress (SSE)
curl http://localhost:3000/api/task/{id}/stream
Events:
- log - Execution log entries (tool calls, results)
- done - Task completion with final status
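As a rough illustration of consuming this stream, the sketch below parses the standard Server-Sent Events wire format (`event:` and `data:` fields, with a blank line ending each event) using only the standard library. It assumes the server labels events with `event: log` and `event: done`; the payload shape is not specified here, so `data` is kept as an opaque string. `SseEvent` and `parse_sse` are hypothetical names.

```rust
/// One parsed Server-Sent Event. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
pub struct SseEvent {
    pub event: String,
    pub data: String,
}

/// Parses a complete SSE text body into events.
/// An event that is not terminated by a blank line is discarded.
pub fn parse_sse(body: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut name = String::new();
    let mut data: Vec<&str> = Vec::new();

    for line in body.lines() {
        if line.is_empty() {
            // A blank line dispatches the event, if any data was collected.
            if !data.is_empty() {
                let event = if name.is_empty() {
                    "message".to_string() // default event type in the SSE format
                } else {
                    name.clone()
                };
                events.push(SseEvent { event, data: data.join("\n") });
            }
            name.clear();
            data.clear();
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        // Split "field: value"; a single leading space in the value is dropped.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => name = value.to_string(),
            "data" => data.push(value),
            _ => {} // "id" and "retry" are ignored in this sketch
        }
    }
    events
}
```

A client would typically stop reading once it sees an event named `done`, and treat every `log` event before it as incremental progress.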
Health Check
curl http://localhost:3000/api/health
Available Tools
| Tool | Description |
|---|---|
| read_file | Read file contents (any path on the machine) with optional line range |
| write_file | Write/create files anywhere on the machine |
| delete_file | Delete files anywhere on the machine |
| list_directory | List directory contents anywhere on the machine |
| search_files | Search for files by name pattern (machine-wide; scope with path) |
| run_command | Execute shell commands (optionally in a specified cwd) |
| grep_search | Search file contents with regex (machine-wide; scope with path) |
| web_search | Search the web (DuckDuckGo) |
| fetch_url | Fetch URL contents |
| git_status | Get git status for any repo path |
| git_diff | Show git diff for any repo path |
| git_commit | Create git commits for any repo path |
| git_log | Show git log for any repo path |
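One way a toolset like this could be wired up is a name-to-implementation registry that the agent loop calls into. The sketch below is an assumption about the general shape, not the project's actual code: the `Tool` trait, `ToolRegistry`, and the string-map argument format are all hypothetical, and `ListDirectory` is a toy stand-in for the `list_directory` tool.

```rust
use std::collections::HashMap;

/// A tool the agent can invoke by name. Hypothetical trait for illustration.
pub trait Tool {
    fn name(&self) -> &'static str;
    fn execute(&self, args: &HashMap<String, String>) -> Result<String, String>;
}

/// Toy stand-in for `list_directory`: returns sorted entry names, one per line.
pub struct ListDirectory;

impl Tool for ListDirectory {
    fn name(&self) -> &'static str {
        "list_directory"
    }

    fn execute(&self, args: &HashMap<String, String>) -> Result<String, String> {
        let path = args.get("path").ok_or("missing 'path' argument")?;
        let mut names = Vec::new();
        for entry in std::fs::read_dir(path).map_err(|e| e.to_string())? {
            let entry = entry.map_err(|e| e.to_string())?;
            names.push(entry.file_name().to_string_lossy().into_owned());
        }
        names.sort();
        Ok(names.join("\n"))
    }
}

/// Maps tool names to implementations so a tool call can be dispatched by name.
pub struct ToolRegistry {
    tools: HashMap<&'static str, Box<dyn Tool>>,
}

impl ToolRegistry {
    pub fn new() -> Self {
        Self { tools: HashMap::new() }
    }

    pub fn register(&mut self, tool: Box<dyn Tool>) {
        let name = tool.name();
        self.tools.insert(name, tool);
    }

    /// Runs the named tool, or returns an error string for an unknown name.
    pub fn call(&self, name: &str, args: &HashMap<String, String>) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool.execute(args),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}
```

Returning errors as plain strings keeps the sketch short; the point is that an unknown tool name or a bad argument becomes a value the loop can hand back to the model instead of a crash.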
Configuration
| Variable | Default | Description |
|---|---|---|
| OPENROUTER_API_KEY | (required) | Your OpenRouter API key |
| DEFAULT_MODEL | anthropic/claude-sonnet-4.5 | Default LLM model |
| WORKING_DIR | . (dev) / /root (prod) | Default working directory for relative paths (agent still has full machine access) |
| HOST | 127.0.0.1 | Server bind address |
| PORT | 3000 | Server port |
| MAX_ITERATIONS | 50 | Max agent loop iterations |
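To make the defaults in the table concrete, here is a minimal sketch of loading these variables with fallbacks. It is not the project's actual configuration code: the `Config` struct, its field names, and `from_lookup` are hypothetical, and the dev default `.` is assumed for `WORKING_DIR`. Taking a lookup function rather than reading the environment directly keeps the logic testable.

```rust
use std::str::FromStr;

/// Settings from the configuration table. Hypothetical struct for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub openrouter_api_key: String,
    pub default_model: String,
    pub working_dir: String,
    pub host: String,
    pub port: u16,
    pub max_iterations: u32,
}

/// Parses `key` if it is set, otherwise returns `default`.
fn parse_or<T, F>(get: &F, key: &str, default: T) -> Result<T, String>
where
    T: FromStr,
    T::Err: std::fmt::Display,
    F: Fn(&str) -> Option<String>,
{
    match get(key) {
        Some(raw) => raw
            .parse::<T>()
            .map_err(|e| format!("invalid {key}={raw:?}: {e}")),
        None => Ok(default),
    }
}

impl Config {
    /// Builds a config from any key lookup (environment, map, test fixture).
    pub fn from_lookup<F>(get: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        Ok(Self {
            // The only variable without a default.
            openrouter_api_key: get("OPENROUTER_API_KEY")
                .ok_or("OPENROUTER_API_KEY is required")?,
            default_model: get("DEFAULT_MODEL")
                .unwrap_or_else(|| "anthropic/claude-sonnet-4.5".to_string()),
            working_dir: get("WORKING_DIR").unwrap_or_else(|| ".".to_string()),
            host: get("HOST").unwrap_or_else(|| "127.0.0.1".to_string()),
            port: parse_or(&get, "PORT", 3000)?,
            max_iterations: parse_or(&get, "MAX_ITERATIONS", 50)?,
        })
    }

    /// Reads the same variables from the process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

A malformed value such as `PORT=abc` is reported as an error rather than silently replaced by the default, which tends to surface typos at startup.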
Architecture
┌─────────────────┐     ┌─────────────────┐
│   HTTP Client   │────▶│   HTTP API      │
└─────────────────┘     │   (axum)        │
                        └────────┬────────┘
                                 │
                        ┌────────▼────────┐
                        │   Agent Loop    │◀──────┐
                        │                 │       │
                        └────────┬────────┘       │
                                 │                │
                   ┌─────────────┼─────────────┐  │
                   ▼             ▼             ▼  │
              ┌──────────┐  ┌──────────┐  ┌──────────┐
              │   LLM    │  │  Tools   │  │  Tools   │
              │(OpenRouter)│ │(file,git)│ │(term,web)│
              └──────────┘  └──────────┘  └──────────┘
                   │
                   └──────────────────────────────┘
                         (results fed back)
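The loop in the diagram can be sketched in a few lines: ask the model for the next step, run the requested tool, feed the result back, and stop on a final answer or when the iteration cap (MAX_ITERATIONS) is reached. Everything named here (`Step`, `Llm`, `run_agent`, `ScriptedLlm`) is hypothetical, and the model is replaced by a scripted stand-in so the example runs without network access.

```rust
use std::collections::VecDeque;

/// What the model asks for on each turn. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    ToolCall { tool: String, input: String },
    Final(String),
}

/// Minimal model interface: given the transcript so far, choose the next step.
pub trait Llm {
    fn next_step(&mut self, transcript: &[String]) -> Step;
}

/// Runs the loop until the model returns a final answer or the cap is hit.
/// On success, returns the answer and the number of iterations used.
pub fn run_agent(
    llm: &mut dyn Llm,
    run_tool: &dyn Fn(&str, &str) -> String,
    task: &str,
    max_iterations: usize,
) -> Result<(String, usize), String> {
    let mut transcript = vec![format!("task: {task}")];
    for iteration in 1..=max_iterations {
        match llm.next_step(&transcript) {
            Step::Final(answer) => return Ok((answer, iteration)),
            Step::ToolCall { tool, input } => {
                // Tool results are appended so the next model call can see them.
                let output = run_tool(&tool, &input);
                transcript.push(format!("{tool}({input}) -> {output}"));
            }
        }
    }
    Err(format!("no final answer after {max_iterations} iterations"))
}

/// Stand-in model that replays a fixed list of steps.
pub struct ScriptedLlm {
    pub steps: VecDeque<Step>,
}

impl Llm for ScriptedLlm {
    fn next_step(&mut self, _transcript: &[String]) -> Step {
        self.steps
            .pop_front()
            .unwrap_or_else(|| Step::Final("script exhausted".to_string()))
    }
}
```

In a real agent the `Llm` implementation would call the model provider and `run_tool` would dispatch to the tool implementations; the control flow stays the same.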
Development
# Run with debug logging
RUST_LOG=debug cargo run
# Run tests
cargo test
# Format code
cargo fmt
# Check for issues
cargo clippy
Dashboard (Bun)
The dashboard lives in dashboard/ and uses Bun as the package manager.
cd dashboard
bun install
PORT=3001 bun dev
Calibration (Trial-and-Error Tuning)
Open Agent supports empirical tuning of its difficulty (complexity) and cost estimation via a calibration harness.
Run calibrator
export OPENROUTER_API_KEY="sk-or-v1-..."
cargo run --release --bin calibrate -- --workspace ./.open_agent_calibration --model openai/gpt-4.1-mini --write-tuning
This writes a tuning file at ./.open_agent_calibration/.open_agent/tuning.json. Move/copy it to your real workspace as ./.open_agent/tuning.json to enable it.
License
MIT