Thomas Marchand 269958f0a9 Fix SSE event filtering and add Picture-in-Picture support (#25)
* Fix SSE event filtering race condition in mission views

Events were being filtered out during mission load due to a race condition where
viewingMissionId was set before currentMission finished loading. Now events only
get filtered when both IDs are set and different, allowing streaming updates to
display while missions are loading.

* Improve desktop stream UX with auto-open and auto-close

- Auto-extract display ID from desktop_start_session tool result
- Auto-open desktop stream when agent starts a desktop session
- Auto-close desktop stream when agent finishes (status becomes idle)
- Apply same improvements to both web and iOS dashboards

* Fix desktop display extraction from JSON string results

Tool results may be returned as JSON strings rather than parsed objects.
Handle both cases when extracting the display ID from desktop_start_session.

* Fix desktop stream staying open when status=idle during loading

The event filtering was updated to allow events through when currentMissionId
is null (during initial load), but the status application logic wasn't updated
to match. This created a window where tool_result could open the desktop stream
but status=idle wouldn't close it because shouldApplyStatus was false.

Now both the event filter and status application logic use consistent conditions:
allow when currentMissionId hasn't loaded yet.

* Fix desktop auto-open and add Picture-in-Picture support

- Use tool_result event's name field directly for desktop_start_session detection
  (fixes auto-open when tool_call event was filtered or missed)
- Add native Picture-in-Picture button to desktop stream
  - Converts canvas to video stream for OS-level floating window
  - Works outside the browser tab
  - Shows PiP button only when browser supports it

* Add iOS Picture-in-Picture support for desktop stream

- Implement AVSampleBufferDisplayLayer-based PiP for iOS
- Convert JPEG frames to CMSampleBuffer for PiP playback
- Add PiP buttons to desktop stream header and controls
- Fix web dashboard auto-open to use tool name from event data directly
- Add audio background mode to Info.plist for PiP support

* Fix React anti-patterns flagged by Bugbot

- Use itemsRef for synchronous read instead of calling state setters
  inside setItems updater callback (React strict mode safe)
- Attach PiP event listeners directly to video element instead of
  document, since these events don't bubble

* Fix PiP issues flagged by Bugbot

- iOS: Only disconnect stream onDisappear if PiP is not active,
  allowing stream to continue in PiP mode after sheet is dismissed
- Web: Stop existing stream tracks before creating new ones to
  prevent resource leaks on repeated PiP toggle

* Fix iOS PiP cleanup when stopped after view dismissal

- Add shouldDisconnectAfterPip flag to track deferred cleanup
- Set flag in onDisappear when PiP is active
- Clean up WebSocket and PiP resources when PiP stops if flag is set

* Fix additional PiP issues flagged by Bugbot

- iOS: Return actual isPaused state in PiP delegate using MainActor.assumeIsolated
- iOS: Add isPipReady flag and disable PiP button until setup completes
- Web: Don't forcibly exit PiP on unmount to match iOS behavior
2026-01-03 14:16:02 -08:00

Open Agent

A minimal autonomous coding agent with full machine access, implemented in Rust.

Features

  • HTTP API for task submission and monitoring
  • Tool-based agent loop following the "tools in a loop" pattern
  • Full toolset: file operations, terminal, machine-wide search, web access, git
  • OpenRouter integration for LLM access (supports any model)
  • SSE streaming for real-time task progress
  • AI-maintainable Rust codebase with strong typing

Quick Start

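Prerequisites

  • Rust toolchain with cargo (used by the build and run commands in this README)
  • An OpenRouter API key (OPENROUTER_API_KEY)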

Installation

git clone <repo-url>
cd open_agent
cargo build --release

Running

# Set your API key
export OPENROUTER_API_KEY="sk-or-v1-..."

# Optional: configure model (default: anthropic/claude-sonnet-4.5)
export DEFAULT_MODEL="anthropic/claude-sonnet-4.5"

# Optional: default working directory for relative paths (absolute paths work everywhere)
# In production this is typically /root
export WORKING_DIR="."

# Start the server
cargo run --release

The server starts on http://127.0.0.1:3000 by default.

OpenCode Backend (External Agent)

Open Agent can delegate execution to an OpenCode server instead of using its built-in agent loop.

# Point to a running OpenCode server
export AGENT_BACKEND="opencode"
export OPENCODE_BASE_URL="http://127.0.0.1:4096"

# Optional: choose OpenCode agent (build/plan/etc)
export OPENCODE_AGENT="build"

# Optional: auto-allow all permissions for OpenCode sessions (default: true)
export OPENCODE_PERMISSIVE="true"
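
As an illustration of how these variables might be interpreted at startup, here is a minimal sketch. The `Backend` enum and the `backend_from` / `backend_from_env` functions are hypothetical names, not the project's actual types. Two behaviors are assumptions: that an unset AGENT_BACKEND selects the built-in loop, and that any OPENCODE_PERMISSIVE value other than "false" counts as true.

```rust
/// Which execution backend to use (illustrative type, not the project's own).
#[derive(Debug, PartialEq)]
enum Backend {
    /// Built-in agent loop (assumed when AGENT_BACKEND is unset).
    BuiltIn,
    /// Delegate execution to an external OpenCode server.
    OpenCode {
        base_url: String,
        agent: Option<String>,
        permissive: bool,
    },
}

/// Resolve the backend through a lookup function.
/// Taking a closure instead of reading std::env directly keeps this testable.
fn backend_from<F>(get: F) -> Result<Backend, String>
where
    F: Fn(&str) -> Option<String>,
{
    match get("AGENT_BACKEND").as_deref() {
        None => Ok(Backend::BuiltIn),
        Some("opencode") => {
            let base_url = get("OPENCODE_BASE_URL")
                .ok_or_else(|| "OPENCODE_BASE_URL must be set".to_string())?;
            // Documented default is true; only an explicit "false" disables it (assumption).
            let permissive = get("OPENCODE_PERMISSIVE")
                .map(|v| v.trim().to_ascii_lowercase() != "false")
                .unwrap_or(true);
            Ok(Backend::OpenCode {
                base_url,
                agent: get("OPENCODE_AGENT"),
                permissive,
            })
        }
        Some(other) => Err(format!("unknown AGENT_BACKEND: {other}")),
    }
}

/// Convenience wrapper over the real process environment.
fn backend_from_env() -> Result<Backend, String> {
    backend_from(|key| std::env::var(key).ok())
}
```

Calling `backend_from_env()` once at startup and matching on the result keeps the backend choice in a single place.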

API Reference

Submit a Task

curl -X POST http://localhost:3000/api/task \
  -H "Content-Type: application/json" \
  -d '{"task": "Create a Python script that prints Hello World"}'

Response:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending"
}
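
The same submission can be made from Rust. The sketch below uses only the standard library and writes the HTTP/1.1 request by hand; a real client would more likely use an HTTP crate. The function names (`json_escape`, `build_submit_request`, `submit_task`) are hypothetical and not part of Open Agent.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Escape a string so it can be embedded in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the raw HTTP/1.1 request for POST /api/task.
fn build_submit_request(host: &str, port: u16, task: &str) -> String {
    let body = format!("{{\"task\": \"{}\"}}", json_escape(task));
    format!(
        "POST /api/task HTTP/1.1\r\nHost: {host}:{port}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}

/// Send the request and return the raw response (status line, headers, body).
/// "Connection: close" lets read_to_string finish once the server has replied.
fn submit_task(host: &str, port: u16, task: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect((host, port))?;
    stream.write_all(build_submit_request(host, port, task).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With the server running locally, `submit_task("127.0.0.1", 3000, "Create a Python script that prints Hello World")` returns the raw response text, which carries the JSON shown above in its body.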

Get Task Status

curl http://localhost:3000/api/task/{id}

Response:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "task": "Create a Python script that prints Hello World",
  "model": "openai/gpt-4.1-mini",
  "iterations": 3,
  "result": "I've created hello.py with a simple Hello World script...",
  "log": [...]
}

Stream Task Progress (SSE)

curl http://localhost:3000/api/task/{id}/stream

Events:

  • log - Execution log entries (tool calls, results)
  • done - Task completion with final status
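
A client consuming this stream needs to split it into events. The sketch below parses the standard server-sent events text format (`event:` and `data:` fields, with a blank line ending each event) and checks for the `done` event. The `SseEvent` type and the function names are hypothetical. The payload format is not documented here, so `data` is kept as an opaque string.

```rust
/// One server-sent event from the task stream (illustrative type).
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: String,
    data: String,
}

/// Parse all complete events out of a text buffer.
/// An unterminated trailing event (no blank line yet) is ignored.
fn parse_sse(input: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut name = String::new();
    let mut data: Vec<&str> = Vec::new();

    for line in input.lines() {
        if line.is_empty() {
            // A blank line dispatches the pending event, if it has data.
            if !data.is_empty() {
                let event = if name.is_empty() {
                    "message".to_string() // default event type in the SSE format
                } else {
                    name.clone()
                };
                events.push(SseEvent { event, data: data.join("\n") });
            }
            name.clear();
            data.clear();
        } else if line.starts_with(':') {
            // Comment line, often used as a keep-alive; skip it.
        } else {
            let (field, value) = match line.split_once(':') {
                Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
                None => (line, ""),
            };
            match field {
                "event" => name = value.to_string(),
                "data" => data.push(value),
                _ => {} // other fields (id, retry) are not needed here
            }
        }
    }
    events
}

/// True for the event that signals task completion.
fn is_done(event: &SseEvent) -> bool {
    event.event == "done"
}
```

A caller would typically append incoming bytes to a buffer, run `parse_sse` on the complete portion, and stop reading once `is_done` returns true.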

Health Check

curl http://localhost:3000/api/health

Available Tools

Tool            Description
read_file       Read file contents (any path on the machine) with optional line range
write_file      Write/create files anywhere on the machine
delete_file     Delete files anywhere on the machine
list_directory  List directory contents anywhere on the machine
search_files    Search for files by name pattern (machine-wide; scope with path)
run_command     Execute shell commands (optionally in a specified cwd)
grep_search     Search file contents with regex (machine-wide; scope with path)
web_search      Search the web (DuckDuckGo)
fetch_url       Fetch URL contents
git_status      Get git status for any repo path
git_diff        Show git diff for any repo path
git_commit      Create git commits for any repo path
git_log         Show git log for any repo path

Configuration

Variable            Default                       Description
OPENROUTER_API_KEY  (required)                    Your OpenRouter API key
DEFAULT_MODEL       anthropic/claude-sonnet-4.5   Default LLM model
WORKING_DIR         . (dev) / /root (prod)        Default working directory for relative paths (agent still has full machine access)
HOST                127.0.0.1                     Server bind address
PORT                3000                          Server port
MAX_ITERATIONS      50                            Max agent loop iterations
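
To make the defaults concrete, here is a minimal sketch of loading these settings with only the standard library. The `Config` struct and the `parse_or` / `load_config` functions are hypothetical names and may not match the project's code. The sketch uses the development default "." for WORKING_DIR.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Settings from the configuration table (illustrative type).
#[derive(Debug, PartialEq)]
struct Config {
    openrouter_api_key: String,
    default_model: String,
    working_dir: String,
    host: String,
    port: u16,
    max_iterations: u32,
}

/// Parse an optional variable, falling back to `default` when it is unset.
fn parse_or<T, F>(get: &F, key: &str, default: T) -> Result<T, String>
where
    T: FromStr,
    T::Err: Display,
    F: Fn(&str) -> Option<String>,
{
    match get(key) {
        Some(raw) => raw
            .trim()
            .parse::<T>()
            .map_err(|e| format!("invalid {key}={raw:?}: {e}")),
        None => Ok(default),
    }
}

/// Build a Config from any key lookup (a closure keeps this testable).
fn load_config<F>(get: F) -> Result<Config, String>
where
    F: Fn(&str) -> Option<String>,
{
    Ok(Config {
        // The only variable without a default.
        openrouter_api_key: get("OPENROUTER_API_KEY")
            .ok_or_else(|| "OPENROUTER_API_KEY is required".to_string())?,
        default_model: get("DEFAULT_MODEL")
            .unwrap_or_else(|| "anthropic/claude-sonnet-4.5".to_string()),
        working_dir: get("WORKING_DIR").unwrap_or_else(|| ".".to_string()),
        host: get("HOST").unwrap_or_else(|| "127.0.0.1".to_string()),
        port: parse_or(&get, "PORT", 3000)?,
        max_iterations: parse_or(&get, "MAX_ITERATIONS", 50)?,
    })
}

/// Convenience wrapper over the real process environment.
fn load_config_from_env() -> Result<Config, String> {
    load_config(|key| std::env::var(key).ok())
}
```

Failing fast on a missing key or an unparsable PORT surfaces configuration mistakes at startup rather than on the first request.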

Architecture

┌─────────────────┐     ┌─────────────────┐
│   HTTP Client   │────▶│   HTTP API      │
└─────────────────┘     │   (axum)        │
                        └────────┬────────┘
                                 │
                        ┌────────▼────────┐
                        │   Agent Loop    │◀──────────────┐
                        │                 │               │
                        └────────┬────────┘               │
                                 │                        │
                 ┌───────────────┼───────────────┐        │
                 ▼               ▼               ▼        │
          ┌────────────┐  ┌────────────┐  ┌────────────┐  │
          │    LLM     │  │   Tools    │  │   Tools    │  │
          │(OpenRouter)│  │ (file,git) │  │ (term,web) │  │
          └──────┬─────┘  └────────────┘  └────────────┘  │
                 │                                        │
                 └────────────────────────────────────────┘
                             (results fed back)
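
The loop in the diagram can be sketched in a few lines. This is a simplified illustration of the "tools in a loop" pattern, not Open Agent's actual implementation: the `Step`, `Llm`, `Tool`, `ScriptedLlm`, and `run_agent` names are hypothetical, and the LLM is replaced by a scripted stand-in so the example runs without network access.

```rust
use std::collections::{HashMap, VecDeque};

/// What the model asks for on each turn.
enum Step {
    CallTool { name: String, input: String },
    Finish(String),
}

/// Stand-in for the LLM client (OpenRouter in the diagram).
trait Llm {
    fn next_step(&mut self, transcript: &[String]) -> Step;
}

/// A tool maps a string input to a result or an error message.
type Tool = Box<dyn Fn(&str) -> Result<String, String>>;

/// Ask the model for a step, run the requested tool, feed the result back,
/// and repeat until the model finishes or the iteration cap is reached.
fn run_agent(
    llm: &mut dyn Llm,
    tools: &HashMap<String, Tool>,
    task: &str,
    max_iterations: usize,
) -> Result<String, String> {
    let mut transcript = vec![format!("task: {task}")];
    for _ in 0..max_iterations {
        match llm.next_step(&transcript) {
            Step::Finish(answer) => return Ok(answer),
            Step::CallTool { name, input } => {
                let result = match tools.get(&name) {
                    Some(tool) => {
                        tool(input.as_str()).unwrap_or_else(|e| format!("error: {e}"))
                    }
                    None => format!("error: unknown tool {name}"),
                };
                // Results are fed back so the next model call can see them.
                transcript.push(format!("{name}({input}) -> {result}"));
            }
        }
    }
    Err(format!("stopped after {max_iterations} iterations"))
}

/// Test double that replays a fixed list of steps.
struct ScriptedLlm {
    steps: VecDeque<Step>,
}

impl Llm for ScriptedLlm {
    fn next_step(&mut self, _transcript: &[String]) -> Step {
        self.steps
            .pop_front()
            .unwrap_or_else(|| Step::Finish(String::new()))
    }
}
```

Tool errors are returned to the model as text instead of aborting the loop, which gives the model a chance to recover. The iteration cap plays the role of MAX_ITERATIONS.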

Development

# Run with debug logging
RUST_LOG=debug cargo run

# Run tests
cargo test

# Format code
cargo fmt

# Check for issues
cargo clippy

Dashboard (Bun)

The dashboard lives in dashboard/ and uses Bun as the package manager.

cd dashboard
bun install
PORT=3001 bun dev

Calibration (Trial-and-Error Tuning)

Open Agent supports empirical tuning of its difficulty (complexity) and cost estimation via a calibration harness.

Run calibrator

export OPENROUTER_API_KEY="sk-or-v1-..."
cargo run --release --bin calibrate -- --workspace ./.open_agent_calibration --model openai/gpt-4.1-mini --write-tuning

This writes a tuning file to ./.open_agent_calibration/.open_agent/tuning.json. To enable it, copy the file into your real workspace as ./.open_agent/tuning.json.

License

MIT
