openagent

Author	SHA1	Message	Date
Thomas Marchand	3e38d2716e	Merge master into ios-mission-loading branch Resolve merge conflicts in ControlView.swift: - Keep improved SSE reconnection logic with successful event tracking - Keep better error filtering for SSE-specific errors - Keep initial disconnected state for connection indicator - Keep switch-based state mapping for running missions	2026-01-03 06:05:24 +00:00
Thomas Marchand	3fbe3cc662	Fix fullscreen state sync and stale WebSocket callbacks - Web: Don't set fullscreen state synchronously; rely on event listeners - Web: Add fullscreenerror event handler to catch failed fullscreen requests - iOS: Add connection ID to prevent stale WebSocket callbacks from corrupting new connection state when reconnecting	2026-01-02 23:32:38 +00:00
Thomas Marchand	82573f2587	Fix connection state and backoff logic in iOS ControlView - Set connectionState to .disconnected on view disappear (was incorrectly .connected) - Only reset exponential backoff on successful (non-error) events to maintain proper backoff behavior when server is unavailable	2026-01-02 23:18:13 +00:00
Thomas Marchand	3fd40035e3	Fix initial connection state and task cleanup - Start iOS connection state as disconnected until first event - Abort spawned tasks when WebSocket handler exits to prevent resource waste	2026-01-02 23:03:54 +00:00
Thomas Marchand	c898f68026	Address remaining Bugbot review issues - Make error filtering more specific to SSE reconnection errors only - Use refs for FPS/quality values to preserve current settings on reconnect	2026-01-02 22:44:14 +00:00
Thomas Marchand	d7007acc20	Fix additional Bugbot review issues - Add onerror handler for image loading to prevent memory leaks - Reset isPaused on disconnect to avoid UI desync - Fix data race on backoff variable using nonisolated(unsafe)	2026-01-02 22:32:03 +00:00
Thomas Marchand	0480828aa2	Change default model from Sonnet 4 to Opus 4.5 Update DEFAULT_MODEL default value to claude-opus-4-5-20251101, the most capable model in the Claude family.	2026-01-02 22:19:24 +00:00
Thomas Marchand	69e5c4d915	Fix Bugbot review issues - Fix WebSocket reconnection on slider changes by using initial values for URL params - Fix iOS connected status set before WebSocket actually connects - Fix mission state mapping to properly handle waiting_for_tool state	2026-01-02 22:12:39 +00:00
Thomas Marchand	c9a2ef45c7	Add real-time desktop streaming with WebSocket MJPEG Implements desktop streaming feature to watch the AI agent work in real-time: - Backend: WebSocket endpoint at /api/desktop/stream using MJPEG frames - iOS: Bottom sheet UI with play/pause, FPS and quality controls - Web: Side-by-side split view with toggleable desktop panel - Better OpenCode error messages for debugging	2026-01-02 21:40:00 +00:00
Thomas Marchand	3398dbe271	iOS: Improve mission UI, add auto-reconnect, and refine input field (#16) - Fix missions showing "Default" label by using mission ID instead when no model override - Add ConnectionState enum to track SSE stream health with reconnecting/disconnected states - Implement automatic reconnection with exponential backoff (1s→30s) - Show connection status in toolbar when disconnecting, hide error bubbles for connection issues - Fix status event filtering to only apply to currently viewed mission - Reset run state when creating new mission or switching missions - Redesign input field to ChatGPT style: clean outline, no background fill, integrated send button	2026-01-02 13:00:16 -08:00
Thomas Marchand	0a960e6381	iOS: Improve mission UI, add auto-reconnect, and refine input field - Fix missions showing "Default" label by using mission ID instead when no model override - Add ConnectionState enum to track SSE stream health with reconnecting/disconnected states - Implement automatic reconnection with exponential backoff (1s→30s) - Show connection status in toolbar when disconnecting, hide error bubbles for connection issues - Fix status event filtering to only apply to currently viewed mission - Reset run state when creating new mission or switching missions - Redesign input field to ChatGPT style: clean outline, no background fill, integrated send button	2026-01-02 20:59:36 +00:00
Thomas Marchand	48fbdfdc60	Remove Local Backend, make OpenCode the only execution path (#15) This refactor simplifies the architecture by: ## Backend Changes - Remove AgentBackend enum and dual-backend logic - Make OpenCode the sole execution backend - Change opencode_base_url from Option<String> to String with default - Update default model to claude-sonnet-4-20250514 ## Provider System - Add GET /api/providers endpoint for model discovery - Create .open_agent/providers.json config file - Support grouped models by provider with billing type metadata ## Code Cleanup - Delete SimpleAgent (src/agents/simple.rs) - Delete TaskExecutor (src/agents/leaf/) - Delete orchestrator module (src/agents/orchestrator/) - Keep LLM client (needed for memory embeddings) - Keep budget system (useful for cost tracking) - Keep tools module (for MCP API listing) ## Dashboard Updates - Add listProviders() API function - Update model selector to group by provider - Show billing type (subscription vs pay-per-token) ## Documentation - Update CLAUDE.md to reflect OpenCode-only architecture	2026-01-02 12:32:27 -08:00
Thomas Marchand	ce33838968	OpenCode refactor and mission tracking fixes (#14 ) * Fix missions staying Active after completion with OpenCode backend - Add TerminalReason::Completed variant for successful task completion - Set terminal_reason in OpenCodeAgent on success to trigger auto-complete - Update control.rs to explicitly handle Completed terminal reason - Update CLAUDE.md with OpenCode backend documentation * Improve iOS dashboard UI polish - Remove harsh input field border, use ultraThinMaterial background with subtle focus glow - Clean up model selector pills: remove ugly truncated mission IDs, increase padding - Remove agent working indicator border for cleaner look - Increase input area bottom padding for better thumb reach * Add real-time event streaming for OpenCode backend - Add SSE streaming support to OpenCodeClient via /event endpoint - Parse and forward OpenCode events (thinking, tool_call, tool_result) - Update OpenCodeAgent to consume stream and forward to control channel - Add fallback to blocking mode if SSE connection fails This enables live UI updates in the dashboard when using OpenCode backend. * Fix running mission tracking to use actual executing mission ID Track the mission ID that the main `running` task is actually working on separately from `current_mission`, which can change when the user creates a new mission. This ensures ListRunning and GracefulShutdown correctly identify which mission is being executed. 
* Add MCP server for desktop tools and Playwright integration - Create desktop-mcp binary that exposes i3/Xvfb desktop automation tools as an MCP server for use with OpenCode backend - Add opencode.json with both desktop and Playwright MCP configurations - Update deployment command to include desktop-mcp binary - Document available MCP tools in CLAUDE.md Desktop tools: start_session, stop_session, screenshot, type, click, mouse_move, scroll, i3_command, get_text * Document SSH key and desktop-mcp binary in production section - Add ~/.ssh/cursor as the SSH key for production access - Add desktop-mcp binary location to production table * Emphasize bun usage and add gitignore entries - Add clear instructions to ALWAYS use bun, never npm for dashboard - Gitignore .playwright-mcp/ directory (local MCP data) - Gitignore dashboard/package-lock.json (we use bun.lockb) * Add mission delete and cleanup features to web and iOS dashboards Backend (Rust): - Add delete_mission() and delete_empty_untitled_missions() to supabase.rs - Add DELETE /api/control/missions/:id endpoint with running mission guard - Add POST /api/control/missions/cleanup endpoint for bulk cleanup Web Dashboard (Next.js): - Add deleteMission() and cleanupEmptyMissions() API functions - Add delete button (trash icon) on hover for each mission row - Add "Cleanup Empty" button with sparkles icon in filters area - Fix analytics to compute stats from missions/runs data instead of broken /api/stats iOS Dashboard (Swift): - Add deleteMission() and cleanupEmptyMissions() to APIService - Add delete() HTTP helper method - Add swipe-to-delete on mission rows (disabled for active missions) - Add "Cleanup" button with sparkles icon and progress indicator - Add success banner with auto-dismiss after cleanup * Fix CancelMission and MCP notification parsing bugs - CancelMission now uses running_mission_id instead of current_mission to correctly identify the executing mission (fixes race condition when user creates new 
mission while another is running) - MCP server JsonRpcRequest.id field now has #[serde(default)] to handle JSON-RPC 2.0 notifications which don't have an id field * Fix running mission tracking bugs - delete_mission: Query control actor for actual running missions instead of using always-empty running_missions list - cleanup_empty_missions: Exclude running missions from cleanup to prevent deleting missions mid-execution - get_parallel_config: Query control actor for accurate running count - Task completion: Save running_mission_id before clearing and use it for persist and auto-complete (fixes race when user creates new mission while task is running) All endpoints now use ControlCommand::ListRunning to get accurate running state from the control actor loop. * Fix bugbot issues: analytics cost, browser cleanup, title truncation, history append - Add get_total_cost_cents() to supabase.rs for aggregating all run costs - Update /api/stats endpoint to return actual total cost from database - Fix analytics page to use stats endpoint for total cost (not limited to 100 runs) - Fix desktop_mcp.rs to save browser_pid to session file after launch - Fix mission title truncation to use safe_truncate_index and append "..." 
- Fix mission history to append to existing DB history instead of replacing (prevents data loss when CreateMission is called during task execution) * Fix history context contamination and cumulative thinking content - Only push to local history if completed mission matches current mission, preventing old mission exchanges from contaminating new mission context - Accumulate thinking content across iterations so frontend replacement shows all thinking, matching OpenCode backend behavior * Fix MCP notifications, orphaned processes, and shutdown persistence - MCP server no longer sends responses to JSON-RPC notifications (per spec) - Clean up Xvfb/i3/Chromium processes on partial session startup failure - Graceful shutdown only persists history if running mission matches current * Fix partial field selection deserialization in cleanup endpoint Use PartialMission struct for partial field queries to avoid deserialization failure when DbMission's required fields are missing. * Clarify analytics success rate measures missions not tasks Update labels to "Mission Success Rate" and "X missions completed" to make it clear the metric is mission-level, not task-level.	2026-01-02 09:45:01 -08:00
Thomas Marchand	3164febd57	OpenCode integration with real-time streaming (#13) * Fix missions staying Active after completion with OpenCode backend - Add TerminalReason::Completed variant for successful task completion - Set terminal_reason in OpenCodeAgent on success to trigger auto-complete - Update control.rs to explicitly handle Completed terminal reason - Update CLAUDE.md with OpenCode backend documentation * Improve iOS dashboard UI polish - Remove harsh input field border, use ultraThinMaterial background with subtle focus glow - Clean up model selector pills: remove ugly truncated mission IDs, increase padding - Remove agent working indicator border for cleaner look - Increase input area bottom padding for better thumb reach * Add real-time event streaming for OpenCode backend - Add SSE streaming support to OpenCodeClient via /event endpoint - Parse and forward OpenCode events (thinking, tool_call, tool_result) - Update OpenCodeAgent to consume stream and forward to control channel - Add fallback to blocking mode if SSE connection fails This enables live UI updates in the dashboard when using OpenCode backend. * Fix running mission tracking to use actual executing mission ID Track the mission ID that the main `running` task is actually working on separately from `current_mission`, which can change when the user creates a new mission. This ensures ListRunning and GracefulShutdown correctly identify which mission is being executed.	2026-01-02 08:20:02 +00:00
Thomas Marchand	6acab1da5c	Fix missions showing as Active after OpenCode completion (#12) * Fix missions staying Active after completion with OpenCode backend - Add TerminalReason::Completed variant for successful task completion - Set terminal_reason in OpenCodeAgent on success to trigger auto-complete - Update control.rs to explicitly handle Completed terminal reason - Update CLAUDE.md with OpenCode backend documentation * Improve iOS dashboard UI polish - Remove harsh input field border, use ultraThinMaterial background with subtle focus glow - Clean up model selector pills: remove ugly truncated mission IDs, increase padding - Remove agent working indicator border for cleaner look - Increase input area bottom padding for better thumb reach * Add real-time event streaming for OpenCode backend - Add SSE streaming support to OpenCodeClient via /event endpoint - Parse and forward OpenCode events (thinking, tool_call, tool_result) - Update OpenCodeAgent to consume stream and forward to control channel - Add fallback to blocking mode if SSE connection fails This enables live UI updates in the dashboard when using OpenCode backend. * Fix running mission tracking to use actual executing mission ID Track the mission ID that the main `running` task is actually working on separately from `current_mission`, which can change when the user creates a new mission. This ensures ListRunning and GracefulShutdown correctly identify which mission is being executed.	2026-01-02 08:16:50 +00:00
Thomas Marchand	640d2b39fd	Merge pull request #11 from lfglabs-dev/Th0rgal/open-code-refactor Add OpenCode integration for backend execution	2026-01-02 07:49:00 +00:00
Thomas Marchand	50e0b0df26	Add .env*.local to dashboard gitignore Vercel CLI automatically added this entry when pulling env vars.	2026-01-02 07:48:33 +00:00
Thomas Marchand	610b9366f2	Add OpenCode integration for backend execution - Add OpenCode HTTP client module (src/opencode/mod.rs) - Add OpenCodeAgent for delegating task execution (src/agents/opencode.rs) - Update config to support AGENT_BACKEND selection (opencode/local) - Fix path canonicalization for OpenCode directory requirement - Update routes to use OpenCodeAgent when backend=opencode	2026-01-02 07:39:24 +00:00
Thomas Marchand	4fa25b9d70	Merge pull request #10 from lfglabs-dev/Th0rgal/fix-image-auth Fix image preview authentication in FilePreviewModal	2025-12-26 19:49:52 +03:00
Thomas Marchand	17b021b313	Fix race condition causing blob URL memory leak Add staleness check to prevent updating state/refs after cleanup runs when path changes during an in-flight fetch.	2025-12-26 17:39:31 +01:00
Thomas Marchand	3f487829ea	Fix image preview authentication in FilePreviewModal The img tag cannot send custom HTTP headers, causing authenticated image requests to fail. Fetch the image as a blob with proper Bearer token authentication, then use a blob URL for the src attribute.	2025-12-26 11:07:12 +01:00
Thomas Marchand	0b46c4e91f	Merge pull request #9 from lfglabs-dev/Th0rgal/client-improvements Improve web and iOS clients with enhanced UX features	2025-12-26 12:50:30 +03:00
Thomas Marchand	7356aade92	Improve web and iOS clients with enhanced UX features Add timestamps to all messages, syntax-highlighted code blocks with copy buttons, file preview modal with syntax highlighting, analytics dashboard, quick action templates, and extended iOS ToolUI support for progress bars, alerts, and code blocks.	2025-12-26 10:09:45 +01:00
Thomas Marchand	37fede3105	Merge pull request #8 from lfglabs-dev/Th0rgal/agent-improvements Enhance agent capabilities with smart pivoting and model routing	2025-12-26 11:38:16 +03:00
Thomas Marchand	1289be8d44	Remove unused summarize_large_results config option The field was defined and configurable via SUMMARIZE_LARGE_RESULTS env var, but never actually used in any code path. LLM-based summarization of large tool results was not implemented. Remove to avoid misleading configuration.	2025-12-26 09:24:43 +01:00
Thomas Marchand	04e15e34cd	Fix blocker false positives and truncation char/byte mismatch - Remove generic "audit" keyword from Solidity task detection to avoid false positive TypeMismatch blockers on non-Solidity audit tasks - Add Solidity-specific keywords: .sol, evm, foundry, hardhat - Fix DeepSearch truncation check to compare char count (not byte count) to match the chars().take(10000) truncation logic - Add test for generic audit not triggering false positive	2025-12-26 09:10:23 +01:00
Thomas Marchand	a6346051c4	Fix env var thresholds for truncation not being applied MAX_TOOL_RESULT_CHARS env var was loaded into ExecutionThresholds but the truncation logic used ctx.config.context.max_tool_result_chars directly. Now thresholds properly override config default when set.	2025-12-26 08:58:47 +01:00
Thomas Marchand	8d2806fe29	Fix pre-existing test failures in budget and llm modules - benchmarks: Fix test_normalize_id to expect '/' to be preserved (provider prefix is needed for matching, only removes :, -, _, .) - learned: Fix test_select_model_prefers_high_success_low_cost to use values that correctly trigger the scoring formula behavior - retry: Fix test_budget_exhausted_with_progress by using 85% budget (condition is > 0.8, not >= 0.8) - error: Fix exponential_backoff to cap total delay (including jitter) at 60 seconds, not just the base delay before jitter is added	2025-12-26 08:44:08 +01:00
Thomas Marchand	620f35991f	Enhance agent capabilities with smart pivoting and adaptive model selection - Implement smart tool result handling with UTF-8-safe truncation - Add category-aware pivot prompts when agent gets stuck in loops - Wire up benchmark-based model routing for optimal task-type matching - Create 4 new composite tools (analyze_codebase, deep_search, prepare_project, debug_error) - Implement configurable execution thresholds via environment variables - Add blocker detection for early termination of impossible tasks - Improve tool failure tracking with cross-category fallback suggestions These improvements reduce iteration count, provide better guidance when stuck, and automatically select the right model for each task type.	2025-12-26 08:39:59 +01:00
Thomas Marchand	747b455a4f	Merge pull request #7 from lfglabs-dev/Th0rgal/fix-build Add TerminalReason enum to track execution failure modes	2025-12-25 22:53:03 +03:00
Thomas Marchand	f1f86d787e	Add missing RunningMissionsBar.swift to iOS Xcode project The file existed but wasn't included in the project.pbxproj, causing the build to fail with 'cannot find RunningMissionsBar in scope'.	2025-12-25 20:48:38 +01:00
Thomas Marchand	52ca4b00e9	Merge pull request #6 from lfglabs-dev/Th0rgal/claude-context-setup Add Claude context configuration files	2025-12-25 22:45:54 +03:00
Thomas Marchand	37dfac1472	Remove outdated leaf agent docs, reflect SimpleAgent architecture The agent system now uses SimpleAgent → TaskExecutor, not the old hierarchical orchestrator. Remove "Adding a New Leaf Agent" sections from both CLAUDE.md and cursor rules as they reference deprecated RootAgent/LeafAgent patterns.	2025-12-25 20:45:12 +01:00
Thomas Marchand	4a76da16f6	Expand Rust conventions with provability-first design principles Add pure functions, algebraic types, error handling examples, leaf agent creation guide, and enhanced design system notes.	2025-12-25 20:42:54 +01:00
Thomas Marchand	c8724621ca	Add TerminalReason enum and terminal_reason field to AgentResult This adds tracking of execution termination reasons (cancellation, budget exhaustion, LLM errors, stalling, infinite loops, max iterations) to properly distinguish between different failure modes in agent execution.	2025-12-25 20:42:42 +01:00
Thomas Marchand	fb2f3407b4	Add Claude context configuration files Add .claude/CLAUDE.md with project documentation (architecture, commands, conventions, env vars) and .claude/settings.json with tool permissions for streamlined agent development.	2025-12-25 20:41:20 +01:00
Thomas Marchand	812bc4dc08	Merge pull request #5 from lfglabs-dev/fixes ios fix	2025-12-25 12:04:57 +03:00
Thomas Marchand	96f0dad563	ios fix	2025-12-25 09:49:58 +01:00
Thomas Marchand	067afb28d0	Merge pull request #4 from lfglabs-dev/fixes Fixes	2025-12-25 11:32:56 +03:00
Thomas Marchand	950c238d6b	fix: interruption	2025-12-25 07:20:09 +01:00
Thomas Marchand	121cb2b7b9	fix: tool duration	2025-12-24 20:55:19 +01:00
Thomas Marchand	9abb646699	fix: open agent reports	2025-12-24 18:47:34 +01:00
Thomas Marchand	4c5e355640	fix: bugbot reported	2025-12-23 21:21:13 +01:00
Thomas Marchand	76fa7ebe89	feat: upload progress bar, URL download, and chunked uploads 1. Upload progress bar - shows real-time progress with bytes/percentage 2. URL download - paste any URL, server downloads directly (faster for large files) 3. Chunked uploads - files >10MB split into 5MB chunks with retry (3 attempts) Dashboard changes: - Progress bar UI with bytes transferred - Link icon button to paste URLs - Uses chunked upload for large files automatically Backend changes: - /api/fs/upload-chunk - receives file chunks - /api/fs/upload-finalize - assembles chunks into final file - /api/fs/download-url - server downloads from URL to filesystem	2025-12-23 19:30:06 +01:00
Thomas Marchand	ce6f552d4a	perf: skip SFTP for localhost file operations When CONSOLE_SSH_HOST is 127.0.0.1/localhost, use direct file operations instead of SSH/SFTP to itself. This makes uploads instant instead of going through the full SFTP overhead. Optimizes: upload, download, list, mkdir, rm	2025-12-23 19:09:21 +01:00
Thomas Marchand	4217dbe038	fix: dropdown resume option for blocked missions + accurate loop warning 1. Dropdown now shows "Continue Mission" for blocked status (was only showing "Reactivate" which doesn't provide resume context) 2. Loop warning message now accurately shows remaining attempts before termination instead of always saying "next call will terminate"	2025-12-23 17:37:17 +01:00
Thomas Marchand	71bcc57bf6	fix: add missing blocked/not_feasible status mappings in load_mission_from_db The helper was missing these cases, causing blocked missions to be loaded as active and breaking the resume check.	2025-12-23 13:20:03 +01:00
Thomas Marchand	cabb6f926d	fix: detect hallucinated Supabase image URLs in responses LLMs sometimes generate plausible-looking image URLs without actually uploading images. This adds validation to detect URLs that weren't in the pending_uploads list and warns the model to use actual tools.	2025-12-23 12:02:58 +01:00
Thomas Marchand	3578c5bb40	fix: improve infinite loop detection with earlier warning and context - Lower warning threshold from 3 to 2 repetitions - Lower force-complete threshold from 5 to 4 repetitions - Include last tool result in warning message so model sees WHY it's failing - Make warning message more actionable with specific suggestions	2025-12-23 10:54:02 +01:00
Thomas Marchand	261794ffe7	feat: filter and group models in dropdown - Remove llama models from selection - Remove OpenAI o-series models (o1, o3, etc.) - Group models by provider (Google, DeepSeek, Qwen, Anthropic, Mistral, OpenAI) - Sort within each category alphabetically	2025-12-23 08:54:38 +01:00


356 Commits