20 Commits
master ... wip

Author SHA1 Message Date
891df0c924 Add logout action and enhance OTP request with parameter metadata
- Introduced LogoutAction for clearing browser session state, including localStorage cleanup and optional logout button interaction.
- Added comprehensive CLAUDE_ENHANCED.md documentation covering architecture patterns, dependency injection, browser pool management, and development workflows.
- Enhanced RequestOtpAction with ActionParamInfo metadata for better API documentation and validation.
- Updated auth module exports to include the
2025-12-08 22:45:08 +00:00
07e7f4bfff Enhance OTP Store and Session Management Features
- Added support for multiple OTP store backends: in-memory and Redis, allowing for shared state across workers.
- Introduced a new `ActionDIContext` for typed dependency injection, improving modularity and type safety in action handling.
- Updated session management to include Auth0 JWT verification, enhancing security for user authentication.
- Refactored existing code to streamline OTP request handling and improve error diagnostics.
- Removed outdated test files and optimized existing tests for better coverage of new features.
2025-12-07 15:05:05 +00:00
f502286450 Update documentation, refactor string handling, and enhance logging
- Updated the development server documentation to reflect the new port configuration.
- Refactored string handling by replacing the centralized registry with dedicated selectors for better modularity and type safety.
- Enhanced logging throughout the application by integrating loguru for structured logging and improved context handling.
- Removed outdated files and streamlined the codebase for better maintainability.
- Added new HTML parsing utilities using BeautifulSoup to improve DOM traversal and element extraction.
- Updated various components to utilize the new string selectors, ensuring consistency across the codebase.
2025-12-07 14:16:27 +00:00
5cd1fa3532 Update development server port and enhance documentation
- Changed the default port for the development server from 8000 to 8765 in the documentation.
- Updated testing instructions to reflect the new server port.
- Improved comments and structure in various files for better clarity and maintainability.
- Refactored some logging statements for consistency and readability.
- Added new configuration options for browser host management, including session isolation settings.
- Enhanced type safety and modularity in several components, ensuring better integration and performance.
2025-12-07 13:11:30 +00:00
3ecf5077c2 Refactor and Enhance Codebase for Improved Modularity and Type Safety
- Analyzed and documented architectural improvements based on the `repomix-output.md`, focusing on reducing code duplication and enhancing type safety.
- Introduced a new `ConditionalCompositeAction` class to support runtime conditional execution of action steps.
- Refactored existing action and browser helper methods to improve modularity and maintainability.
- Updated GraphQL client integration for better connection pooling and resource management.
- Enhanced error handling and diagnostics across various components, ensuring clearer feedback during execution.
- Removed outdated playbook actions and streamlined the action registry for better clarity and performance.
- Updated configuration files to reflect changes in browser host management and session handling.
- Added new tests to validate the refactored components and ensure robust functionality.
2025-12-07 11:40:34 +00:00
9e2a9bf499 Implement Semantic Form Discovery Plan and Refactor Browser Elements
- Introduced a comprehensive implementation plan for semantic form discovery, addressing architectural gaps for LLM-driven contextual form filling.
- Created new modules for type guards and context building, enhancing schema-DOM reconciliation.
- Consolidated existing utilities to reduce code duplication and improve maintainability.
- Updated dropdown handling to support schema-aware selections and improved field inference logic.
- Enhanced diagnostics and browser element interactions, including ARIA label resolution and role-based detection.
- Removed outdated regex functions and tests, streamlining the codebase.
- Added new integration tests for the updated form extraction and interaction capabilities.
2025-12-06 02:44:29 +00:00
ad14cfa366 Add demonstration board configuration and operations
- Introduced a new board configuration for the Demonstration board, including a YAML entry for board settings.
- Implemented CRUD operations for managing demonstration items via GraphQL, including creation, retrieval, updating, and archiving.
- Added utility functions for seeding the demonstration board with sample data across various initiative groups.
- Enhanced the GraphQL client to support new queries and mutations specific to the Demonstration board.
- Updated documentation to reflect the new operations and usage patterns for the demonstration board.
2025-12-06 00:03:58 +00:00
3fa3326bf3 Update .gitignore to include 'data/' directory for better file management 2025-12-05 16:47:21 +00:00
304f03ab4a x 2025-12-05 16:47:02 +00:00
770f993c5f Implement session management features for persona authentication
- Introduced SessionManager and SessionStorage for handling session persistence.
- Added API endpoints for session management, including listing and retrieving session statuses.
- Enhanced login action to support session restoration and caching.
- Updated application configuration to include session management settings.
- Refactored existing actions and registry to integrate session management capabilities.
2025-12-05 16:46:53 +00:00
8125eaf7f3 Add form schema discovery and validation for board items
- Introduced new API endpoints for fetching board schemas and validating data against them.
- Added data models for form schema, field definitions, and validation errors.
- Implemented schema fetching and validation logic to ensure data integrity before item creation.
- Enhanced existing board item creation process with optional schema validation.
- Updated raindrop operations to include form schema handling and validation utilities.
2025-12-03 11:15:35 +00:00
60e89bf2af Add board configuration and CRUD operations for board items
- Introduced a new YAML configuration file for board settings, defining default values for intake requests.
- Implemented BoardStore to manage board configurations and provide access to board details.
- Added API routes for board item management, including creation, discovery, and retrieval of board items.
- Enhanced GraphQL client to support board item operations, ensuring integration with existing automation workflows.
- Updated documentation to reflect new board management capabilities and API usage patterns.
2025-12-03 08:40:26 +00:00
d0ca9c3aa7 Add bearer token handling to GraphQL client and enhance session management
- Introduced ExtractedToken dataclass for structured token extraction.
- Implemented discover_auth_tokens function to scan localStorage for auth-related keys.
- Enhanced extract_bearer_token function to prioritize token extraction based on common patterns.
- Updated GraphQLClient to accept bearer_token in headers for API requests.
- Improved JSON parsing and error handling in validate_persona_from_storage function.
2025-12-03 05:19:19 +00:00
f5e50f88f9 Enhance browser automation framework with new Material UI helpers and diagnostics capabilities. Introduce PageLike protocol for unified page handling, improve dropdown interactions, and add comprehensive diagnostics API for troubleshooting. Refactor actions to utilize new helpers and streamline form population processes. Update configuration for extension hosts and enhance logging for better error tracking. 2025-12-03 03:28:02 +00:00
fbc2dd3494 Refactor ExtensionLocator to support parent-child relationships and enhance element interaction methods. Introduce a selector chain for improved querying and update click, fill, and wait_for methods to utilize JavaScript for better performance and reliability. Simplify error handling and improve code readability. 2025-11-24 05:32:51 +00:00
2fad7ccbee Enhance browser automation by introducing PageLike protocol for unified page handling across Playwright and extension contexts. Update actions and helpers to utilize PageLike, improving dropdown and form interactions. Add new browser element helpers for streamlined UI automation. 2025-11-24 02:46:12 +00:00
648c657e04 Refactor sourcing request action to remove unused JavaScript for input discovery and streamline form fill logging. Adjust PageHelpers to improve dropdown interaction timing and ensure listbox presence during selection. 2025-11-24 01:48:03 +00:00
8c1546d394 Implement Terminator Bridge extension for browser automation, enabling WebSocket communication to avoid CDP page refresh issues. Update configuration for extension hosts, add new extension files, and enhance browser client functionality. Refactor actions to utilize the new extension client for improved UI interactions. 2025-11-24 01:40:12 +00:00
1aac3a9a3e Refactor login action to use persona store parameter, replace FillIntakeBasicAction with sourcing request action, and enhance browser client logging. Update intake strings and selectors for new sourcing request form fields. 2025-11-23 05:51:21 +00:00
ff3b9c7edb Update personas email, refactor action context types, and introduce PageHelpers for UI interactions. Added demo action for collapsing accordions and updated configuration URLs for staging environment. 2025-11-23 01:42:17 +00:00
165 changed files with 37884 additions and 684 deletions

BIN
._.DS_Store Normal file

Binary file not shown.

5
.gitignore vendored

@@ -6,7 +6,7 @@ repomix-output.md
logs/
# C extensions
*.so
data/
# Distribution / packaging
.Python
build/
@@ -144,3 +144,6 @@ uv.lock
# Local development
*.local
.envrc
# Session storage (auth tokens, sensitive data)
.sessions/

150
CLAUDE.md

@@ -4,7 +4,13 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## Project Overview
A FastAPI-based guided demo platform that automates browser interactions with Raindrop using Playwright. The app executes data-driven actions (stored in `ActionRegistry`) on behalf of personas that target configured browser hosts (CDP or headless). All configuration is externalized via YAML files and environment overrides.
A FastAPI-based guided demo platform that automates browser interactions with Raindrop using Playwright and browser extensions. The app executes data-driven actions (stored in `ActionRegistry`) on behalf of personas that target configured browser hosts (CDP, headless, or extension). All configuration is externalized via YAML files and environment overrides.
**Recent Enhancements:**
- **PageLike Protocol** — Unified interface supporting both Playwright Page and ExtensionPage for seamless switching between automation backends
- **PageHelpers Pattern** — High-level fluent API for browser interactions (wait utilities, UI operations, diagnostics)
- **Extension Mode** — WebSocket-based browser automation avoiding CDP page refresh issues, with React-compatible form filling
- **Browser Elements Package** — Reusable UI interaction helpers for dropdowns, forms, and complex UI patterns
**Entry Point:** `python -m guide` (runs `src/guide/main.py` → `guide.app.main:app`)
**Python Version:** 3.12+
@@ -22,7 +28,7 @@ basedpyright src
# Compile sanity check
python -m compileall src/guide
# Run development server (default: localhost:8000)
# Run development server (default: localhost:8765)
python -m guide
# or with custom host/port:
HOST=127.0.0.1 PORT=9000 python -m guide
@@ -41,9 +47,19 @@ HOST=127.0.0.1 PORT=9000 python -m guide
**Root module:** `src/guide/app/`
- **`actions/`** — Demo action implementations with auto-discovery via `@register_action` decorator. Auto-wired via `ActionRegistry` with dependency injection. Submodules: `base.py` (DemoAction, CompositeAction, ActionRegistry), `registry.py` (auto-discovery), `playbooks.py` (multi-step workflows like OnboardingFlow, FullDemoFlow), `auth/` (LoginAsPersonaAction), `intake/` (CreateIntakeAction), `sourcing/` (AddSupplierAction).
- **`actions/`** — Demo action implementations with auto-discovery via `@register_action` decorator. Auto-wired via `ActionRegistry` with dependency injection. Submodules: `base.py` (DemoAction, CompositeAction, ActionRegistry), `registry.py` (auto-discovery), `playbooks.py` (multi-step workflows like OnboardingFlow, FullDemoFlow), `diagnose_page.py` (diagnostic action for page inspection), `auth/` (LoginAsPersonaAction), `intake/` (CreateSourcingRequestAction), `sourcing/` (AddSupplierAction), `demo/` (demonstration actions like CollapseAccordionsDemoAction showcasing PageHelpers pattern).
- **`auth/`** — Pluggable MFA/auth helpers. `mfa.py` defines `MfaCodeProvider` interface; `DummyMfaCodeProvider` raises `NotImplementedError` (implement for production). `session.py` provides `ensure_persona()` for login flows.
- **`browser/`** — `BrowserPool` (persistent browser instances per host) + `BrowserClient` (context-managed page access). Handles both CDP attach and headless launch. `pool.py` manages lifecycle; `client.py` wraps for async context manager pattern. `diagnostics.py` captures screenshots, HTML, console logs for error debugging.
- **`browser/`** — Browser automation core with multiple layers:
- `pool.py` — `BrowserPool` manages persistent browser instances per host (lazy-initialized, allocates fresh contexts/pages per request)
- `client.py` — `BrowserClient` wraps BrowserPool with async context manager pattern
- `extension_client.py` — `ExtensionClient` provides WebSocket-based browser automation via Chrome extension to avoid CDP page refresh issues; includes `ExtensionPage` with Playwright-like API and React-compatible fill/type methods
- `types.py` — Protocol definitions: `PageLike` (unified interface for Playwright Page and ExtensionPage), `PageLocator` (protocol for locator objects)
- `helpers.py` — `PageHelpers` class for high-level page interactions (wait utilities, diagnostics capture, UI operations like fill_and_advance, search_and_select, click_and_wait, accordion collapse, dropdown operations)
- `wait.py` — Standalone wait and stability utilities (wait_for_selector, wait_for_navigation, wait_for_network_idle, wait_for_stable_page)
- `diagnostics.py` — Captures screenshots, HTML, console logs for error debugging
- `elements/` — Package for reusable UI interaction helpers:
- `dropdown.py` — Dropdown helpers optimized for extension mode (select_multi, select_single, select_combobox)
- `form.py` — Form fill helpers (extension-friendly wrappers for text, textarea, date, autocomplete)
- **`core/`** — App bootstrap: `config.py` (AppSettings with Pydantic v2, env prefix `RAINDROP_DEMO_`, YAML + JSON override cascade), `logging.py` (structured logging with request-scoped context variables).
- **`errors/`** — `GuideError` hierarchy (ConfigError, BrowserConnectionError, PersonaError, AuthError, MfaError, ActionExecutionError, GraphQLTransportError, GraphQLOperationError); routers normalize to HTTP responses with debug info.
- **`raindrop/`** — GraphQL client + operations. `graphql.py` (httpx-based HTTP client), `operations/` (intake.py, sourcing.py with query/mutation definitions), `generated/` (ariadne-codegen auto-generated Pydantic models), `queries/` (GraphQL query/mutation files).
@@ -53,9 +69,16 @@ HOST=127.0.0.1 PORT=9000 python -m guide
- **`api/`** — FastAPI routers in `routes/`. `health.py` (GET /healthz), `actions.py` (GET /actions, POST /actions/{id}/execute), `config.py` (GET /config/browser-hosts). Map requests → `ActionRegistry` → `BrowserClient` → `ActionEnvelope` responses with error capture.
**Config files (git-tracked):**
- `config/hosts.yaml` — Browser host targets (id, kind: cdp|headless, host, port, browser type).
- `config/hosts.yaml` — Browser host targets (id, kind: cdp|headless|extension, host, port, browser type).
- `config/personas.yaml` — Personas (id, role, email, login_method, browser_host_id).
**Extension files (git-tracked):**
- `extension/` — Terminator Bridge Chrome extension (Manifest V3)
- `manifest.json` — Extension configuration with debugger permissions
- `worker.js` — Service worker handling WebSocket and Chrome debugger API
- `content.js` — Content script for extension wake-up handshakes
- `README.md` — Extension documentation and protocol specification
**Config overrides (runtime only, never commit):**
- `RAINDROP_DEMO_BROWSER_HOSTS_JSON` — JSON array overrides `hosts.yaml`.
- `RAINDROP_DEMO_PERSONAS_JSON` — JSON array overrides `personas.yaml`.
@@ -80,7 +103,8 @@ HOST=127.0.0.1 PORT=9000 python -m guide
- Router generates `correlation_id` (UUID) and `ActionContext` (includes persona, host, params, correlation_id, shared_state dict for composite actions).
- `ActionRegistry.get(action_id)` retrieves action (auto-discovered, dependency-injected).
- `BrowserClient.open_page(host_id)` → allocates fresh context + page from BrowserPool. Reuses persistent browser instance for host; creates new page/context for isolation.
- `Action.run(page, context)` executes logic; may call `ensure_persona()` (login flow) before starting. For composite actions, passes shared_state dict to child actions.
- `Action.run(page, context)` executes logic with `PageLike`-typed page parameter (supports both Playwright Page and ExtensionPage); may call `ensure_persona()` (login flow) before starting. For composite actions, passes shared_state dict to child actions.
- Actions typically use `PageHelpers` wrapper for high-level interactions (wait utilities, UI operations, diagnostics).
- On error: captures debug info (screenshot, HTML, console logs) and returns with DebugInfo attached.
- Response: `ActionEnvelope` (status, correlation_id, result/error, debug_info).
@@ -93,10 +117,61 @@ HOST=127.0.0.1 PORT=9000 python -m guide
- Handles timeouts, connection errors, context cleanup.
- **Host Kind Resolution:**
- `kind: cdp` — connect to running Raindrop instance via Chrome DevTools Protocol (requires `host` + `port`). Errors surface as `BrowserConnectionError`.
- `kind: cdp` — connect to running Raindrop instance via Chrome DevTools Protocol (requires `host` + `port`). **WARNING:** Querying `browser.contexts` or `context.pages` triggers page refresh, closing modals and losing user state. Use `kind: extension` instead for modal interactions.
- `kind: headless` — launch Playwright browser (chromium/firefox/webkit); set `browser` field in config.
- `kind: extension` — connect to Chrome via Terminator Bridge extension using WebSocket. Provides Playwright-like API without CDP page refresh issues. Requires Chrome running with extension loaded.
- Always use `async with BrowserClient.open_page(host_id) as page:` to ensure proper cleanup (context manager unwinding in BrowserPool).
### Extension-Based Browser Automation (Solving CDP Page Refresh Problem)
**Problem:** CDP's `browser.contexts` and `context.pages` queries trigger page refreshes, causing modals to close and user state to be lost.
**Solution:** Use browser extension with Chrome's internal debugger API via WebSocket communication.
**Architecture:**
- **Python Side:** `ExtensionClient` acts as WebSocket **SERVER** listening on `0.0.0.0:17373`
- **Browser Side:** Terminator Bridge extension acts as WebSocket **CLIENT** connecting to server
- **Communication:** JSON messages with UUID-based request/response correlation
- **API:** Provides Playwright-like interface (`page.click()`, `page.fill()`, `page.evaluate()`, `page.locator()`)
- **Network:** Supports cross-network operation (Python on one machine, Chrome on another)
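**Correlation sketch (illustrative only):** UUID-based request/response correlation can be sketched without any WebSocket library. The `RequestCorrelator` class and the message fields (`id`, `command`, `params`, `result`) are hypothetical; the real wire format is defined by the extension's protocol specification.

```python
import asyncio
import json
import uuid
from typing import Any


class RequestCorrelator:
    """Match incoming responses to pending requests by UUID."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def build_request(self, command: str, **params: Any) -> tuple[str, asyncio.Future]:
        request_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        message = json.dumps({"id": request_id, "command": command, "params": params})
        return message, future

    def handle_response(self, raw: str) -> None:
        data = json.loads(raw)
        future = self._pending.pop(data.get("id"), None)
        if future is not None and not future.done():
            future.set_result(data.get("result"))


async def demo() -> Any:
    correlator = RequestCorrelator()
    message, future = correlator.build_request("evaluate", expression="document.title")
    # Simulate the extension replying with the same id it received.
    request_id = json.loads(message)["id"]
    correlator.handle_response(json.dumps({"id": request_id, "result": "Raindrop"}))
    return await future
```

Responses with an unknown `id` are ignored, so a late reply to a request that was already resolved cannot corrupt another request.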
**Key Files:**
- `src/guide/app/browser/extension_client.py` — `ExtensionClient` (WebSocket server) and `ExtensionPage` (Playwright-like API with React-compatible fill/type methods using native property descriptors)
- `src/guide/app/browser/types.py` — `PageLike` protocol for unified interface across Playwright and extension pages
- `extension/worker.js` — MV3 service worker handling WebSocket connection and debugger API
- `extension/content.js` — Content script for extension wake-up handshakes
- `extension/manifest.json` — Manifest V3 configuration
**Setup:**
1. Load extension in Chrome: `chrome://extensions` → "Load unpacked" → select `extension/` directory
2. Navigate Chrome to target page (e.g., `https://stg.raindrop.com/`)
3. Python code uses `ExtensionClient`:
```python
from guide.app.browser.extension_client import ExtensionClient

async with ExtensionClient() as client:
    page = await client.get_page()
    await page.click("button.submit")
    await page.fill("input[name='email']", "user@example.com")
    title = await page.evaluate("document.title")
```
**Benefits:**
- ✅ No page refresh when interacting with browser
- ✅ Modals stay open during automation
- ✅ User state preserved across operations
- ✅ Works across network (Python and Chrome on different machines)
- ✅ Familiar Playwright-like API
**Restrictions:**
- Chrome debugger cannot attach to restricted pages (`chrome://`, `chrome-extension://`, `devtools://`, `edge://`, `about:`)
- Active tab must be on a regular webpage
- Requires browser extension to be loaded and Chrome running
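**Pre-flight check sketch (illustrative only):** A caller could reject restricted URLs before asking the extension to attach. The `can_attach_debugger` helper is hypothetical; the scheme list mirrors the restrictions listed above.

```python
from urllib.parse import urlsplit

# Schemes the Chrome debugger cannot attach to, per the restrictions above.
RESTRICTED_SCHEMES = {"chrome", "chrome-extension", "devtools", "edge", "about"}


def can_attach_debugger(url: str) -> bool:
    """Return True when the URL looks like a regular, attachable page."""
    scheme = urlsplit(url).scheme.lower()
    return bool(scheme) and scheme not in RESTRICTED_SCHEMES
```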
**Testing:**
- `test_extension_client.py` — Basic connectivity and API validation
- `test_sourcing_form_extension.py` — Form filling demonstration without page refresh
### GraphQL & Data Layer
- `raindrop/graphql.py` — HTTP client (httpx, 10s timeout).
@@ -184,7 +259,7 @@ The project has completed **8 major refactoring phases** achieving full architec
- All linting passed
- 28 tests passing
- Zero code redundancy
- ~2,898 lines of production code
- ~5,229 lines of production code
## Git & PR Hygiene
@@ -240,16 +315,71 @@ class MyWorkflow(CompositeAction):
pass
```
### High-Level Browser Interactions (PageHelpers Pattern)
For actions requiring browser interactions, use the `PageHelpers` class for a fluent, high-level API:
```python
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike

@register_action
class MyAction(DemoAction):
    id = "my-action"
    description = "Example using PageHelpers"
    category = "demo"

    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        helpers = PageHelpers(page)

        # Wait for page stability
        await helpers.wait_for_stable()

        # Fill form and advance
        await helpers.fill_and_advance(
            selector="input[name='email']",
            value="user@example.com",
            next_selector="button.next",
        )

        # Search and select
        await helpers.search_and_select(
            search_input="input.search",
            query="example",
            result_selector="li.result",
        )

        # Collapse accordions
        result = await helpers.collapse_accordions("button.accordion")

        # Capture diagnostics on error
        if some_error:
            debug_info = await helpers.capture_diagnostics()

        return ActionResult(details={"status": "success"})
```
**Available PageHelpers methods:**
- Wait utilities: `wait_for_selector()`, `wait_for_network_idle()`, `wait_for_navigation()`, `wait_for_stable()`
- UI operations: `fill_and_advance()`, `search_and_select()`, `click_and_wait()`
- Accordion operations: `collapse_accordions()`
- Dropdown operations: `select_dropdown_options()`
- Diagnostics: `capture_diagnostics()`
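**Structural typing sketch (illustrative only):** The reason one helper can serve both a Playwright page and an extension page is structural typing: any object with the right async methods qualifies. `PageLikeSketch` below is a simplified stand-in with three methods, not the real `PageLike` protocol from `browser/types.py`, and `FakePage` and `submit_email` are hypothetical.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class PageLikeSketch(Protocol):
    async def click(self, selector: str) -> None: ...
    async def fill(self, selector: str, value: str) -> None: ...
    async def evaluate(self, expression: str) -> Any: ...


class FakePage:
    """Records calls; satisfies the protocol without inheriting from it."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    async def click(self, selector: str) -> None:
        self.calls.append(("click", selector))

    async def fill(self, selector: str, value: str) -> None:
        self.calls.append(("fill", selector, value))

    async def evaluate(self, expression: str) -> Any:
        self.calls.append(("evaluate", expression))
        return None


async def submit_email(page: PageLikeSketch, email: str) -> None:
    # Works with any backend that provides fill() and click().
    await page.fill("input[name='email']", email)
    await page.click("button.submit")
```

A fake like this also makes action logic testable without a running browser.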
## Quick Checklist (New Feature)
- [ ] Add action in `actions/` submodule (or submodule directory like `actions/intake/`); use `@register_action` decorator.
- [ ] Add action in `actions/` submodule (or submodule directory like `actions/intake/`, `actions/demo/`); use `@register_action` decorator.
- [ ] Type `page` parameter as `PageLike` (not `Page`) to support both Playwright and extension pages.
- [ ] Use `PageHelpers` wrapper for high-level browser interactions (wait utilities, UI operations, diagnostics).
- [ ] Add action-specific logic; keep it thin and testable. Use `strings/` for all selectors/copy.
- [ ] For complex UI interactions, consider using `browser/elements/` helpers (dropdown, form) or extending PageHelpers.
- [ ] Ensure persona/host exist in `config/hosts.yaml` + `config/personas.yaml` (or use env overrides).
- [ ] If action interacts with modals or requires no page refresh, use `kind: extension` browser host (requires Terminator Bridge extension loaded in Chrome).
- [ ] If action needs GraphQL, add query/mutation to `raindrop/operations/` + `.graphql` files in `raindrop/queries/`.
- [ ] If action needs UI strings, add to `strings/` submodules (selectors, labels, demo_texts).
- [ ] Run `basedpyright src` + `python -m compileall src/guide` (type check + syntax check).
- [ ] Run `pytest tests/` to ensure no regressions (28 tests must pass).
- [ ] Test via `python -m guide` + navigate to `http://localhost:8000/docs` to test endpoint.
- [ ] Test via `python -m guide` + navigate to `http://localhost:8765/docs` to test endpoint.
- [ ] If auth flow required, implement/mock MFA provider or use `DummyMfaCodeProvider` for testing.
- [ ] Review error handling; raise `GuideError` subclasses, not generic exceptions.
- [ ] Commit with descriptive message (e.g., `feat: add my-action`, `test: add my-action tests`).

359
CLAUDE_ENHANCED.md Normal file

@@ -0,0 +1,359 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
A FastAPI-based guided demo platform that automates browser interactions with Raindrop using Playwright and browser extensions. The app executes data-driven actions (stored in `ActionRegistry`) on behalf of personas that target configured browser hosts (CDP, headless, or extension). All configuration is externalized via YAML files and environment overrides.
**Core Architecture:**
- **Dependency Injection System** - Auto-wires actions with dependencies through constructor parameter matching
- **Session Management** - JWT verification, localStorage injection, and offline session validation
- **Multi-Mode Browser Pool** - CDP (cached pages), headless (fresh contexts), and extension modes
- **Form Field Intelligence** - Type inference, selector escaping, and helper dispatch for smart form filling
- **GraphQL Integration** - Auto token extraction, connection pooling, and structured error handling
**Entry Point:** `python -m guide` (runs `src/guide/main.py` → `guide.app.main:app`)
**Python Version:** 3.12+
**Key Dependencies:** FastAPI, Playwright, Pydantic v2, PyYAML, httpx
## Essential Commands
```bash
# Install dependencies
uv sync
# Type checking (required before commits)
basedpyright src
# Compile sanity check
python -m compileall src/guide
# Run development server (default: localhost:8765)
python -m guide
# or with custom host/port:
HOST=127.0.0.1 PORT=9000 python -m guide
# View API docs
# Navigate to http://localhost:8765/docs
# Key endpoints:
# GET /healthz # liveness check
# GET /actions # list action metadata
# POST /actions/{id}/execute # execute action; returns ActionEnvelope with correlation_id
# GET /config/browser-hosts # view current default + host map
```
## Code Structure
**Root module:** `src/guide/app/`
- **`actions/`** — Demo action implementations with auto-discovery via `@register_action` decorator. Auto-wired via `ActionRegistry` with dependency injection.
- **`auth/`** — Session management with JWT verification, localStorage injection, and MFA provider interfaces.
- **`browser/`** — Multi-mode browser automation core:
- `pool.py` — `BrowserPool` manages persistent browser instances per host with context allocation
- `client.py` — `BrowserClient` wraps BrowserPool with async context manager pattern
- `elements/` — Reusable UI interaction helpers with type inference and selector escaping
- `utils.py` — Cross-cutting utilities for JavaScript injection safety
- **`core/`** — App bootstrap: `config.py` (AppSettings with Pydantic v2), `logging.py` (structured logging)
- **`errors/`** — `GuideError` hierarchy with consistent HTTP response mapping
- **`raindrop/`** — GraphQL client with auto token extraction and connection pooling
- **`strings/`** — Centralized selectors, labels, and copy with domain-keyed lookups
- **`models/`** — Domain and persona models using Pydantic v2
- **`utils/`** — Shared helpers including retry logic and JWT verification
- **`api/`** — FastAPI routers mapping requests to ActionRegistry
## Architecture Patterns
### Application Startup & Dependency Wiring
The application follows a structured initialization flow in `main.py`:
1. **Entry point** (`main.py:17`) - Uvicorn loads FastAPI app from `guide.app.main:app`
2. **Load configuration** (`main.py:27`) - Settings loaded from YAML files and environment variables
3. **Create stores** (`main.py:28-29`) - PersonaStore and BoardStore from configuration
4. **Session manager** (`main.py:39`) - Creates SessionManager with storage and TTL
5. **Action registry** (`main.py:47`) - Auto-discovers actions via `actions/registry.py`
6. **Browser pool** (`main.py:55`) - Multi-mode browser instance management
7. **Keep-alive service** (`main.py:59`) - Background task to prevent CDP timeouts
**Dependency Injection Context:**
```python
registry = default_registry(
    persona_store,               # PersonaStore(settings)
    settings.raindrop_base_url,  # Raindrop GraphQL endpoint
    # Additional dependencies injected by parameter name matching
)
```
### Action Execution with Dependency Injection
Actions are resolved and executed through a structured pipeline:
1. **Request handling** (`actions.py:40`) - POST to `/actions/{action_id}/execute` endpoint
2. **Registry resolution** (`base.py:349`) - ActionRegistry.get() looks up action by ID
3. **Instantiation with DI** (`base.py:321`) - Inspects constructor signature and injects matching dependencies
4. **Page allocation** (`client.py:40`) - BrowserClient.open_page() delegates to pool
5. **Action execution** (`actions.py:64`) - Calls action.run() with page and context
**Dependency Injection Example:**
```python
@register_action
class LoginAsPersonaAction(DemoAction):
    def __init__(self, session_manager: SessionManager):
        # session_manager injected by matching parameter name
        self._session_manager = session_manager
```
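**Injection sketch (illustrative only):** Constructor injection by parameter name can be approximated with `inspect.signature`. The `instantiate_with_di` function and the two demo classes are hypothetical and only show the matching idea; the registry's real logic lives in `base.py` and may differ.

```python
import inspect
from typing import Any


def instantiate_with_di(cls: type, dependencies: dict[str, Any]) -> Any:
    """Build `cls`, passing only the dependencies its constructor names."""
    kwargs: dict[str, Any] = {}
    for name, param in inspect.signature(cls).parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args / **kwargs are not injectable by name
        if name in dependencies:
            kwargs[name] = dependencies[name]
        elif param.default is inspect.Parameter.empty:
            raise TypeError(f"{cls.__name__} requires unknown dependency {name!r}")
    return cls(**kwargs)


class FakeSessionManager:
    pass


class LoginActionSketch:
    def __init__(self, session_manager: FakeSessionManager) -> None:
        self.session_manager = session_manager
```

Dependencies the constructor does not name are simply ignored, so one shared dependency map can serve every action.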
### Browser Pool Context Allocation
The browser pool supports three modes with different allocation strategies:
**CDP Mode (browserless):**
- Queries `browser.contexts` on first use
- Caches page reference to avoid refresh on subsequent requests
- Optional storage clearing if `isolate` flag enabled
**Headless Mode:**
- Creates fresh browser context for complete isolation
- New page allocated per request
**Extension Mode:**
- Uses WebSocket-based Terminator Bridge extension
- Avoids CDP page refresh issues entirely
- Supports cross-network operation
**Context Allocation Flow:**
```python
# BrowserClient delegates to pool
context, page, should_close = await pool.allocate_context_and_page(
    host_id, storage_state=storage_state
)
# Pool routes to appropriate BrowserInstance by host_id
# Returns context, page, and cleanup flag
```
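**Allocation sketch (illustrative only):** The difference between the CDP and headless strategies is whether a page is cached or created per request. `PoolSketch` is a toy model that uses strings in place of real page objects; the meaning of the cleanup flag (close headless pages, keep CDP pages) is an assumption, and extension mode is left out.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PoolSketch:
    host_kinds: dict[str, str]
    _cdp_pages: dict[str, str] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def allocate(self, host_id: str) -> tuple[str, bool]:
        """Return (page, should_close) for the given host."""
        kind = self.host_kinds[host_id]
        if kind == "cdp":
            # Reuse the cached page so later requests do not re-query contexts.
            if host_id not in self._cdp_pages:
                self._cdp_pages[host_id] = f"cdp-page-{next(self._ids)}"
            return self._cdp_pages[host_id], False
        if kind == "headless":
            # Fresh page per request; the caller closes it afterwards.
            return f"headless-page-{next(self._ids)}", True
        raise ValueError(f"unsupported host kind in this sketch: {kind}")
```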
### Session Restoration with JWT Verification
Session management provides robust offline validation and restoration:
1. **Load session** (`session_manager.py:317`) - SessionStorage reads JSON from `.sessions/{persona_id}`
2. **Offline validation** (`session_manager.py:151`) - Checks TTL expiry and JWT token expiry
3. **Navigate to origin** (`session_manager.py:368`) - Page.goto(base_url) to establish context
4. **Token extraction** (`session.py:555`) - Scans localStorage for Auth0 SPA SDK keys
5. **JWT verification** (`utils/jwt.py:114`) - Fetches JWKS and verifies RS256 signature
6. **localStorage injection** (`session_manager.py:276`) - Restores session data into browser
7. **Persona verification** (`session.py:118`) - Detects current logged-in persona
**Reusable Token Extraction:**
```python
# Common pattern across modules
from guide.app.auth.session import extract_bearer_token
token = await extract_bearer_token(page)
# Scans localStorage for Auth0 keys and parses nested JSON structure
```
### Form Field Filling with Type Inference
Smart form filling reconciles GraphQL schemas with live DOM elements:
1. **Schema fetch** - Retrieves GraphQL schema for target board
2. **Form context build** (`context_builder.py:165`) - Extracts all form fields and matches to schema
3. **Type inference** (`field_inference.py:79`) - Inspects DOM for ARIA roles and MUI classes
4. **Helper selection** (`field_inference.py:243`) - Maps field types to helper functions
5. **Dispatch execution** - Routes to appropriate fill helper based on inferred type
**Field Type Inference:**
```python
# Reusable pattern for DOM inspection
field_type = await infer_type_from_element(page, selector)
# Returns: select_single, select_combobox, fill_with_react_events, etc.
helper = select_helper_for_type(field_type)
```
**Selector Safety:**
```python
# Critical utility for JavaScript injection safety
from guide.app.browser.utils import escape_selector
escaped = escape_selector(selector) # Escapes \, ', " for JS strings
```
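
A minimal sketch of what escaping `\`, `'`, and `"` for a JavaScript string literal can look like is shown here. The function name is hypothetical and this is not necessarily how `escape_selector` is implemented; the one detail that matters in any version is escaping backslashes first.

```python
def escape_for_js_string(selector: str) -> str:
    """Escape backslashes and both quote characters for use inside a JS string."""
    # Backslashes go first so the escapes added below are not escaped again.
    return (
        selector.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace('"', '\\"')
    )
```
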
### GraphQL Query Execution with Token Handling
The GraphQL client combines token discovery, connection pooling, response validation, typed errors, and retries:
1. **Auto token extraction** (`graphql.py:77`) - Reuses extract_bearer_token from auth/session
2. **Connection pooling** (`graphql.py:30`) - Persistent httpx.AsyncClient for efficiency
3. **Response validation** (`graphql.py:139`) - Pydantic TypeAdapter for structural validation
4. **Error handling** (`graphql.py:166`) - Raises typed exceptions with structured details
5. **Retry logic** (`utils/retry.py:83`) - Exponential backoff decorator for resilience
**GraphQL Execution Pattern:**
```python
# Auto token discovery from page context
client = GraphQLClient(base_url=settings.raindrop_base_url)
result = await client.execute(
    query=SOME_QUERY,
    bearer_token=None,  # Auto-extracted from page if available
    variables={"id": board_id},
)
```
### Reusable Utilities Distribution
**Selector Escaping Utility:**
- Core definition in `browser/utils.py` (`escape_selector()`)
- Used across multiple modules for safe JavaScript string interpolation
- Prevents a selector from breaking out of the JavaScript string it is interpolated into (selector injection)
**Retry Logic:**
- Exponential backoff implementation in `utils/retry.py`
- Used in GraphQL client and other network operations
- Configurable retry attempts and backoff strategy
**JWT Verification:**
- Auth0 JWKS fetching and RS256 signature verification
- Reusable across session management and token validation
- Handles key rotation and token expiry gracefully
## Development Workflow
1. **Edit code** (actions, browser logic, GraphQL ops, etc.)
2. **Run type check:** `basedpyright src` (catches generic types, missing annotations)
3. **Sanity compile:** `python -m compileall src/guide` (syntax check)
4. **Smoke test:** `python -m guide` then hit `/docs` or manual test via curl
5. **Review error handling:** ensure `GuideError` subclasses are raised, not generic exceptions
6. **Commit** with scoped, descriptive message
## Type & Linting Standards
- **Python 3.12+:** Use PEP 604 unions (`str | None`), built-in generics (`list[str]`, `dict[str, JSONValue]`)
- **Ban `Any` and `# type: ignore`:** Use type guards or Protocol instead
- **Pydantic v2:** Explicit types, model_validate for parsing, model_copy for immutable updates
- **Type checker:** Pyright (via basedpyright)
- **Docstrings:** Imperative style, document public APIs, include usage examples
## Error Handling & Logging
- Always raise `GuideError` subclasses (not generic `Exception`); routers translate to HTTP responses
- Log via `core/logging` (structured, levelled). Include persona/action IDs and host targets for traceability
- For browser flows, use Playwright traces (enabled by default in `BrowserClient`); disable only intentionally
- Validate external inputs early; surface schema/connection issues as `GuideError`
## Testing & Quality Gates
- **Minimum gate:** `basedpyright src` + `python -m compileall src/guide` before merge
- **Test Coverage:** Comprehensive unit and integration suites in `tests/` directory
- **Test Structure:**
- `tests/unit/` — Unit tests for strings registry, models, action registration
- `tests/integration/` — Integration tests for BrowserClient, BrowserPool, browser lifecycle
- `conftest.py` — Shared fixtures for mock objects and test setup
- Mock Playwright/GraphQL in tests; avoid real network/CDP calls
- Require deterministic fixtures; document any env vars needed in test module docstring
## Performance & Footprint
- Keep browser sessions short-lived; close contexts to avoid handle leaks
- Cache expensive GraphQL lookups (per-request OK, global only if safe)
- Don't widen dependencies without justification; stick to project pins in `pyproject.toml`
- Promptly close Playwright contexts/browser handles (wrapped in contextmanager; keep action code lean)
## Action Registration & Dependency Injection
### Registering a New Action
Use the `@register_action` decorator to auto-discover and register actions:
```python
from actions.base import DemoAction, register_action
@register_action
class MyAction(DemoAction):
    id = "my-action"
    description = "Does something cool"
    category = "demo"
    # Optional: Declare dependencies (auto-injected by ActionRegistry)
    def __init__(self, session_manager: SessionManager, persona_store: PersonaStore):
        self._session_manager = session_manager
        self.persona_store = persona_store
    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        # Implementation using injected dependencies
        restored = await self._session_manager.restore_session(page, context.persona)
        if not restored:
            # Handle session restoration failure
            pass
```
**Auto-Discovery:** `ActionRegistry` uses `pkgutil.walk_packages()` to discover all modules in `actions/` and collect all `@register_action` decorated classes.
**Dependency Injection:** Parameters in `__init__` are matched by name against DI context dict during action instantiation.
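
Name-based injection of this kind can be sketched with `inspect.signature`. The helper below is a simplified, standalone illustration (the function name and the use of `LookupError` are assumptions); it follows the same rule of skipping variadic parameters and parameters with defaults when the context has no matching key.

```python
import inspect
from inspect import Parameter
from typing import TypeVar

T = TypeVar("T")


def instantiate_with_di(cls: type[T], di_context: dict[str, object]) -> T:
    """Build `cls`, filling constructor parameters from `di_context` by name."""
    kwargs: dict[str, object] = {}
    # inspect.signature(cls) describes the constructor without `self`.
    for name, param in inspect.signature(cls).parameters.items():
        if name in di_context:
            kwargs[name] = di_context[name]
            continue
        is_variadic = param.kind in (Parameter.VAR_POSITIONAL, Parameter.VAR_KEYWORD)
        has_default = param.default is not Parameter.empty
        if not is_variadic and not has_default:
            raise LookupError(f"No dependency registered for parameter {name!r}")
    return cls(**kwargs)
```

Because matching is by parameter name, renaming a constructor argument without renaming the context key makes the lookup fail at instantiation time.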
### Multi-Step Workflows (CompositeAction)
For workflows spanning multiple steps with shared state:
```python
@register_action
class MyWorkflow(CompositeAction):
    id = "my-workflow"
    description = "Multi-step workflow"
    category = "demo"
    child_actions = ("step1-action", "step2-action", "step3-action")
    async def on_step_complete(self, step_id: str, result: ActionResult) -> None:
        # Update shared_state after each step
        # Accessed in child actions via context.shared_state dict
        pass
```
### Browser Interaction Patterns
**Safe Selector Usage:**
```python
from guide.app.browser.utils import escape_selector
# Always escape selectors before interpolating them into JavaScript
escaped_selector = escape_selector(selector)
result = await page.evaluate(f"document.querySelector('{escaped_selector}')")
```
**Form Field Type Inference:**
```python
from guide.app.browser.elements.field_inference import infer_type_from_element, select_helper_for_type
# Infer field type from DOM element
field_type = await infer_type_from_element(page, selector)
helper_name = select_helper_for_type(field_type)
# Dispatch to appropriate helper
if helper_name == "select_single":
    await select_single(page, selector, value)
elif helper_name == "fill_with_react_events":
    await fill_text(page, selector, value)
```
**GraphQL with Auto Token Management:**
```python
from guide.app.raindrop.graphql import GraphQLClient
from guide.app.auth.session import extract_bearer_token
# Token automatically extracted from page if available
client = GraphQLClient(base_url=settings.raindrop_base_url)
result = await client.execute(query=GET_BOARD, variables={"id": board_id})
```
## Quick Checklist (New Feature)
- [ ] Add action in `actions/` submodule with `@register_action` decorator
- [ ] Declare dependencies in `__init__` for automatic injection
- [ ] Use `PageLike` type for page parameter to support all browser modes
- [ ] Use `escape_selector()` for any JavaScript string interpolation
- [ ] For form interactions, use type inference and helper dispatch patterns
- [ ] For GraphQL operations, rely on auto token extraction from page context
- [ ] Run `basedpyright src` + `python -m compileall src/guide` for validation
- [ ] Test via `python -m guide` + navigate to `http://localhost:8765/docs`
- [ ] Review error handling; raise `GuideError` subclasses, not generic exceptions
- [ ] Commit with descriptive message following established patterns

63
config/boards.yaml Normal file

@@ -0,0 +1,63 @@
# Board configurations with default field values for intake requests
#
# Each board defines:
# - id: Unique board identifier (matches board.id in GraphQL)
# - name: Human-readable board name
# - instance_id: Raindrop instance ID (required for board item creation)
# - defaults: Default field values for new board items
#
# Field values in API requests override these defaults.
# Use demo_texts module patterns for dynamic values (dates, etc.)
boards:
  cleaningservices:
    id: 579
    name: cleaningservices
    key: CLEAN
    instance_id: 107
    defaults:
      # Status workflow
      status: "Not Started"
      group: "Backlog"
      # Planning fields
      new_run: "New"
      opex_capex: "OpEx"
      planned: "Planned"
      program: "LCOM"
      # Description (required field)
      description: "Intake request created via API automation"
      # Financial defaults (can be overridden)
      target_spend: 0
      baseline_spend: 0
  operations:
    id: 14
    name: Operations
    key: PL01
    instance_id: 107
    defaults:
      status: "Not Started"
      group: "Backlog"
      description: "Board item created via API automation"
  demonstration:
    id: 596
    name: Demonstration
    key: DEMON
    instance_id: 107
    defaults:
      group: "Backlog"
      description: "Board item created via API automation"
  # Add more boards as needed:
  # example_board:
  #   id: 123
  #   name: example
  #   key: EX
  #   instance_id: 107
  #   defaults:
  #     status: "Not Started"
  #     description: "Default description"


@@ -1,8 +1,27 @@
# Browser Host Configuration
#
# Host kinds:
# - extension: WebSocket-based automation via Terminator Bridge extension
# - cdp: Chrome DevTools Protocol connection to running browser
# - headless: Playwright-managed headless browser
#
# CDP Isolation (isolate flag):
# - isolate: false (default) — Reuse cached context/page for performance.
# State (cookies, localStorage) persists across actions.
# - isolate: true — Clear cookies/storage between requests for session isolation.
# Use when actions need clean state without modal/page refresh issues.
hosts:
demo-cdp:
demo-extension:
kind: extension
port: 17373
support-extension:
kind: extension
port: 17374
browserless-cdp:
kind: cdp
host: 192.168.50.185
port: 9223
headless-local:
kind: headless
host: browserless.lab # goes through Traefik
port: 80 # Traefik web entrypoint
cdp_url: ws://browserless.lab:80/ # explicit endpoint to avoid 0.0.0.0 from /json/version
browser: chromium
# isolate: false # uncomment to enable session isolation


@@ -1,9 +1,14 @@
personas:
buyer:
role: buyer
email: buyer.demo@example.com
email: travis@raindrop.com
login_method: mfa_email
browser_host_id: demo-cdp
browser_host_id: demo-extension
analyst:
role: analyst
email: rd.daya@sidepiece.rip
login_method: mfa_email
browser_host_id: support-extension
supplier:
role: supplier
email: supplier.demo@example.com

BIN
data/._.DS_Store Normal file

Binary file not shown.

7978
docs/api-1.json Normal file

File diff suppressed because it is too large Load Diff


@@ -0,0 +1,89 @@
# Spec Validation (2025-12-07)
Findings below are grounded in current source with inline excerpts.
## Functional Gaps
- Docling config is cosmetic: `extract_ui_elements` keeps `_docling_url`/`_api_key` params but the docstring says they are “Unused (maintained for signature compatibility)”, and the body only runs `_extract_elements_from_html(...)` on captured HTML (no HTTP call).
```python
# src/guide/app/browser/diagnostics.py
async def extract_ui_elements(..., _docling_url: str | None = None, _api_key: str | None = None):
    """... docling_url: Unused (maintained for signature compatibility)"""
    html = await capture_html(page)
    ui_elements = _extract_elements_from_html(html)
```
- OTP callbacks are process-local: the store is an in-memory dict plus a module-level singleton. Requests and callbacks on different workers won't see the same `_requests`.
```python
# src/guide/app/auth/otp_callback.py
class OtpCallbackStore:
    _requests: dict[str, OtpRequest]
...
_store: OtpCallbackStore | None = None
```
- GraphQL client rejects partial successes: any `errors` entry triggers `GraphQLOperationError`, even if `data` is present.
```python
# src/guide/app/raindrop/graphql.py
if validated.has_errors:
    raise errors.GraphQLOperationError(
        validated.first_error_message or "GraphQL operation failed",
        details={"errors": error_details},
    )
```
## Configuration & Portability
- Demonstration board bindings are hardcoded to a single environment.
```python
# src/guide/app/raindrop/operations/demonstration.py
DEMONSTRATION_BOARD_ID = 596
DEMONSTRATION_INSTANCE_ID = 107
```
- Redis URL currently unused: even with `RAINDROP_DEMO_REDIS_URL=redis://192.168.50.210:6379/4`, `AppSettings` has no Redis-related fields; only the documented keys (raindrop URLs, browser hosts, docling_*, session_*, n8n_*) are parsed, so the value is ignored at runtime.
```python
# src/guide/app/core/config.py
model_config = SettingsConfigDict(env_prefix="RAINDROP_DEMO_", ...)
# defined fields: raindrop_base_url, raindrop_graphql_url, browser_hosts, docling_*, session_*, n8n_* (no redis)
```
## Security Notes
- JWT expiry is parsed without signature verification and feeds offline session validation; a forged token with a far-future `exp` could keep a cached session “valid” until TTL expires.
```python
# src/guide/app/auth/session_manager.py
token_expires_at = self.parse_jwt_expiry(token.value)
...
if session.token_expires_at:
    token_remaining = session.token_expires_at - now
```
## Performance Considerations
- Accordion collapsing issues one Playwright hop per button (`buttons.nth(index)` inside a loop), which scales poorly on pages with many accordions.
```python
# src/guide/app/browser/elements/layout.py
for index in range(count):
    button: PageLocator = buttons.nth(index)
    if await icon.count() > 0:
        await button.click(timeout=max_wait)
```
- Form discovery walks every `[data-cy^="board-item-"]` node and marshals its metadata in one evaluate call; large boards will produce heavy payloads.
```javascript
// src/guide/app/browser/elements/form_discovery.py
const fields = container.querySelectorAll('[data-cy^=\"board-item-\"]');
return Array.from(fields)
.filter(field => field.offsetParent !== null)
```
## DX / DI Footnote
- Action DI requires constructor params to exist in the DI context unless they're varargs or have defaults; decorated `__init__` without `functools.wraps` can break injection.
```python
# src/guide/app/actions/base.py
for param_name, param in sig.parameters.items():
    if param_name in self._di_context:
        kwargs[param_name] = self._di_context[param_name]
    else:
        is_var_param = param.kind in (Parameter.VAR_POSITIONAL, Parameter.VAR_KEYWORD)
        has_default = cast(object, param.default) != Parameter.empty
        if not is_var_param and not has_default:
            raise errors.ActionExecutionError(...)
```

34
extension/README.md Normal file

@@ -0,0 +1,34 @@
# Terminator Bridge Extension (MV3)
Evaluates JavaScript in the active tab using Chrome DevTools Protocol without opening DevTools, bridged to a local WebSocket at `ws://127.0.0.1:17373`.
- Permissions: `debugger`, `tabs`, `scripting`, `activeTab`, `<all_urls>`
- Background service worker connects to the local WebSocket and handles `{ action: 'eval', id, code }` messages. Replies with `{ id, ok, result|error }`.
## Install (Load Unpacked)
1. Open `chrome://extensions` (or Edge: `edge://extensions`).
2. Enable Developer Mode.
3. Click "Load unpacked" and select this `browser-extension/` folder.
4. Keep the Extensions page open for easy reloading during development.
Alternatively, launch Chromium with:
```sh
chromium --load-extension=/absolute/path/to/browser-extension
```
## Protocol
- Request: `{ "id": "uuid", "action": "eval", "code": "document.title", "awaitPromise": true }`
- Response: `{ "id": "uuid", "ok": true, "result": "..." }` or `{ "id": "uuid", "ok": false, "error": "..." }`
Targets the active tab of the last focused window.
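
A client-side view of this protocol, sketched in Python: build the request envelope and unwrap the reply. The helper names are hypothetical and transport (the WebSocket itself) is left out; only the field names `id`, `action`, `code`, `awaitPromise`, `ok`, `result`, and `error` come from the message shapes above.

```python
import json
import uuid


def build_eval_request(code: str, await_promise: bool = True) -> str:
    """Serialize an `eval` request with a fresh UUID as its correlation id."""
    return json.dumps(
        {
            "id": str(uuid.uuid4()),
            "action": "eval",
            "code": code,
            "awaitPromise": await_promise,
        }
    )


def parse_eval_response(raw: str, expected_id: str) -> object:
    """Return `result` on success; raise if the id mismatches or `ok` is false."""
    message = json.loads(raw)
    if message.get("id") != expected_id:
        raise ValueError(f"Response id {message.get('id')!r} does not match request")
    if message.get("ok"):
        return message.get("result")
    raise RuntimeError(message.get("error", "unknown error"))
```
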
## Restrictions
The debugger cannot attach to restricted Chrome pages for security reasons:
- `chrome://` URLs (extensions, settings, etc.)
- `chrome-extension://` URLs
- `devtools://` URLs
- `edge://` URLs (if using Edge)
- `about:` pages
**Solution**: Navigate to a regular webpage (e.g., google.com, github.com) before running eval commands.


@@ -0,0 +1,5 @@
{
"manifest_version": 3,
"name": "Terminator Bridge",
"version": "0.23.19"
}

455
extension/content.js Normal file

@@ -0,0 +1,455 @@
/**
* Content Script for Terminator Bridge Extension.
*
* This script runs in the MAIN world (requires manifest world: "MAIN") to access
* React's native property descriptors for form filling.
*
* Responsibilities:
* 1. Wake-up handshakes for MV3 service worker
* 2. ActionDispatcher for structured commands from Python
* 3. React-compatible form filling using native property setters
*/
// ---------------------------------------------------------------------------
// Wake-up Handshake (Original functionality)
// ---------------------------------------------------------------------------
(function () {
const MSG = { type: "terminator_content_handshake" };
const sendHandshake = () => {
try {
chrome.runtime.sendMessage(MSG, () => {
// ignore response
});
} catch (_) {
// Ignore if not allowed on special pages
}
};
// Initial handshake as early as possible
sendHandshake();
// Wake-ups on focus/visibility
try {
document.addEventListener(
"visibilitychange",
() => {
if (document.visibilityState === "visible") {
sendHandshake();
}
},
{ capture: false, passive: true }
);
} catch (_) {}
try {
window.addEventListener("focus", sendHandshake, {
capture: true,
passive: true,
});
} catch (_) {}
try {
window.addEventListener("pageshow", sendHandshake, {
capture: true,
passive: true,
});
} catch (_) {}
})();
// ---------------------------------------------------------------------------
// React Utilities for Form Filling
// ---------------------------------------------------------------------------
const ReactUtils = {
/**
* Set value using React's native property setter.
* This is required for controlled components to properly update state.
*/
setNativeValue: (element, value) => {
// Determine the correct prototype based on element type
const prototype =
element.tagName === "TEXTAREA"
? window.HTMLTextAreaElement.prototype
: window.HTMLInputElement.prototype;
const nativeSetter = Object.getOwnPropertyDescriptor(prototype, "value")?.set;
if (nativeSetter) {
nativeSetter.call(element, value);
} else {
// Fallback to direct assignment
element.value = value;
}
},
/**
* Dispatch events to trigger React handlers.
*/
dispatchEvents: (element, { blur = true } = {}) => {
element.dispatchEvent(new Event("input", { bubbles: true }));
element.dispatchEvent(new Event("change", { bubbles: true }));
if (blur) {
element.blur();
}
},
/**
* Clear then fill an input (for masked inputs like currency/percentage).
*/
clearAndFill: (element, value) => {
element.focus();
element.select();
// Clear first
ReactUtils.setNativeValue(element, "");
element.dispatchEvent(new Event("input", { bubbles: true }));
// Set new value
ReactUtils.setNativeValue(element, value);
ReactUtils.dispatchEvents(element);
},
};
// ---------------------------------------------------------------------------
// DOM Utilities
// ---------------------------------------------------------------------------
const DOMUtils = {
/**
* Query element with multiple selector strategies.
*/
querySelector: (selector) => {
// Try standard querySelector first
let element = document.querySelector(selector);
if (element) return element;
// Try data-testid fallback
if (!selector.includes("[data-testid=")) {
element = document.querySelector(`[data-testid="${selector}"]`);
if (element) return element;
}
return null;
},
/**
* Find element containing specific text.
*/
findByText: (selector, text) => {
const elements = document.querySelectorAll(selector);
const normalizedText = text.toLowerCase().trim();
for (const el of elements) {
const elText = (el.textContent || "").toLowerCase().trim();
if (elText === normalizedText || elText.includes(normalizedText)) {
return el;
}
}
return null;
},
/**
* Wait for element to appear in DOM.
*/
waitForElement: async (selector, timeout = 5000, state = "attached") => {
const start = Date.now();
while (Date.now() - start < timeout) {
const element = DOMUtils.querySelector(selector);
if (state === "attached" && element) {
return element;
}
if (state === "visible" && element) {
const rect = element.getBoundingClientRect();
const style = window.getComputedStyle(element);
const isVisible =
rect.width > 0 &&
rect.height > 0 &&
style.display !== "none" &&
style.visibility !== "hidden" &&
style.opacity !== "0";
if (isVisible) return element;
}
if (state === "detached" && !element) {
return null;
}
await new Promise((r) => setTimeout(r, 100));
}
if (state === "detached") return null;
throw new Error(`Timeout waiting for selector: ${selector}`);
},
};
// ---------------------------------------------------------------------------
// Mouse Event Utilities
// ---------------------------------------------------------------------------
const MouseUtils = {
/**
* Click element with full mouse event sequence.
* MUI components require mousedown/mouseup/click sequence.
*/
clickWithEvents: (element, { focus = true } = {}) => {
if (focus && typeof element.focus === "function") {
element.focus();
}
const rect = element.getBoundingClientRect();
const centerX = rect.left + rect.width / 2;
const centerY = rect.top + rect.height / 2;
const eventOptions = {
bubbles: true,
cancelable: true,
view: window,
clientX: centerX,
clientY: centerY,
button: 0,
};
element.dispatchEvent(new MouseEvent("mousedown", eventOptions));
element.dispatchEvent(new MouseEvent("mouseup", eventOptions));
element.dispatchEvent(new MouseEvent("click", eventOptions));
return true;
},
};
// ---------------------------------------------------------------------------
// Keyboard Event Utilities
// ---------------------------------------------------------------------------
const KeyboardUtils = {
/**
* Send a keyboard event to the active element.
*/
sendKey: (key) => {
const keyCodes = {
ArrowDown: 40,
ArrowUp: 38,
Enter: 13,
Escape: 27,
Tab: 9,
};
const event = new KeyboardEvent("keydown", {
key,
code: key,
keyCode: keyCodes[key] || 0,
which: keyCodes[key] || 0,
bubbles: true,
cancelable: true,
composed: true,
});
if (document.activeElement) {
document.activeElement.dispatchEvent(event);
}
},
};
// ---------------------------------------------------------------------------
// Action Dispatcher - Handles commands from service worker
// ---------------------------------------------------------------------------
const Actions = {
/**
* Fill an input/textarea using React-compatible method.
*/
FILL: async ({ selector, value, clear = false }) => {
const element = DOMUtils.querySelector(selector);
if (!element) {
throw new Error(`Element not found: ${selector}`);
}
element.focus();
if (clear) {
ReactUtils.clearAndFill(element, value);
} else {
ReactUtils.setNativeValue(element, value);
ReactUtils.dispatchEvents(element);
}
return { success: true, selector, value };
},
/**
* Click an element with full mouse event sequence.
*/
CLICK: async ({ selector, focus = true }) => {
const element = DOMUtils.querySelector(selector);
if (!element) {
throw new Error(`Element not found: ${selector}`);
}
MouseUtils.clickWithEvents(element, { focus });
return { success: true, selector };
},
/**
* Find element by text content and click it.
*/
CLICK_TEXT: async ({ selector, text, focus = true }) => {
const element = DOMUtils.findByText(selector, text);
if (!element) {
throw new Error(`Element with text "${text}" not found in: ${selector}`);
}
MouseUtils.clickWithEvents(element, { focus });
return { success: true, selector, text };
},
/**
* Wait for an element to appear/disappear.
*/
WAIT_FOR_SELECTOR: async ({ selector, state = "attached", timeout = 5000 }) => {
const result = await DOMUtils.waitForElement(selector, timeout, state);
return {
success: true,
selector,
state,
found: result !== null,
};
},
/**
* Get page content (HTML).
*/
GET_CONTENT: async ({ mode = "html" }) => {
if (mode === "text") {
return { content: document.body.innerText };
}
return { content: document.documentElement.outerHTML };
},
/**
* Execute arbitrary JavaScript code.
* Use with caution - prefer structured actions when possible.
*/
EVAL: async ({ code, awaitPromise = true }) => {
// Indirect eval: runs the code in the global scope, not this function's scope
const result = (0, eval)(code);
if (awaitPromise && result instanceof Promise) {
return await result;
}
return result;
},
/**
* Send a keyboard key to the focused element.
*/
SEND_KEY: async ({ key }) => {
KeyboardUtils.sendKey(key);
return { success: true, key };
},
/**
* Check if an element is visible.
*/
IS_VISIBLE: async ({ selector }) => {
const element = DOMUtils.querySelector(selector);
if (!element) {
return { visible: false, exists: false };
}
const rect = element.getBoundingClientRect();
const style = window.getComputedStyle(element);
const visible =
rect.width > 0 &&
rect.height > 0 &&
style.display !== "none" &&
style.visibility !== "hidden" &&
style.opacity !== "0";
return { visible, exists: true };
},
/**
* Get text content of an element.
*/
GET_TEXT: async ({ selector }) => {
const element = DOMUtils.querySelector(selector);
if (!element) {
throw new Error(`Element not found: ${selector}`);
}
return {
text: element.textContent?.trim() || "",
innerText: element.innerText?.trim() || "",
};
},
/**
* Get input value.
*/
GET_VALUE: async ({ selector }) => {
const element = DOMUtils.querySelector(selector);
if (!element) {
throw new Error(`Element not found: ${selector}`);
}
return { value: element.value || "" };
},
/**
* Scroll element into view.
*/
SCROLL_INTO_VIEW: async ({ selector, behavior = "smooth", block = "center" }) => {
const element = DOMUtils.querySelector(selector);
if (!element) {
throw new Error(`Element not found: ${selector}`);
}
element.scrollIntoView({ behavior, block });
return { success: true, selector };
},
};
// ---------------------------------------------------------------------------
// Message Listener - Routes commands from service worker
// ---------------------------------------------------------------------------
chrome.runtime.onMessage.addListener((request, sender, sendResponse) => {
// Skip handshake messages (handled by wake-up code)
if (request.type === "terminator_content_handshake") {
return;
}
// Handle action dispatch
if (request.action && Actions[request.action]) {
const action = Actions[request.action];
const payload = request.payload || {};
action(payload)
.then((result) => {
sendResponse({ status: "success", result });
})
.catch((error) => {
sendResponse({ status: "error", error: error.message });
});
// Return true to indicate async response
return true;
}
// Unknown action
if (request.action) {
sendResponse({
status: "error",
error: `Unknown action: ${request.action}`,
});
return true;
}
});
// Log that content script loaded (only in development)
console.debug("[TerminatorBridge] Content script loaded with ActionDispatcher");


@@ -0,0 +1,332 @@
---
tool_name: execute_sequence
arguments:
variables:
release_url:
type: string
label: GitHub Release asset URL (zip)
default: "https://github.com/mediar-ai/terminator/releases/latest/download/terminator-browser-extension.zip"
extension_dir:
type: string
label: Folder to load (will be created by the download step)
default: "%TEMP%\\terminator-bridge"
zip_path:
type: string
label: Path to downloaded zip
default: "%TEMP%\\terminator-browser-extension.zip"
selectors:
address_bar: "role:Edit|name:Address and search bar"
dev_mode_toggle: "role:Button|name:Developer mode"
load_unpacked: "role:Button|name:Load unpacked"
folder_field: "role:Edit|name:Folder:"
select_folder_btn: "role:Button|name:Select Folder"
reload_button: "role:Button|name:Reload"
extensions_doc: "role:Document|name:Extensions"
steps:
# Download the extension zip via JavaScript (NodeJS environment)
- tool_name: run_command
arguments:
engine: javascript
run: |
const fs = require('fs');
const path = require('path');
const os = require('os');
(async () => {
const url = "${{release_url}}";
if (!url || !url.trim()) throw new Error('release_url is empty');
const isWin = process.platform === 'win32';
const tmp = isWin ? (process.env.TEMP || os.tmpdir()) : os.tmpdir();
const zipPath = path.join(tmp, 'terminator-browser-extension.zip');
const destDir = path.join(tmp, 'terminator-bridge');
const existedBefore = fs.existsSync(destDir);
try { fs.rmSync(destDir, { recursive: true, force: true }); } catch (_) {}
try { fs.mkdirSync(destDir, { recursive: true }); } catch (e) { throw new Error('Failed to create dest dir: ' + e.message); }
const res = await fetch(url);
if (!res.ok) throw new Error(`Download failed: ${res.status} ${res.statusText}`);
const arrayBuf = await res.arrayBuffer();
fs.writeFileSync(zipPath, Buffer.from(arrayBuf));
// Export values via ::set-env for the workflow engine AND return set_env for robust propagation
console.log(`::set-env name=zip_path::${zipPath}`);
console.log(`::set-env name=extension_dir::${destDir}`);
console.log(`::set-env name=is_update_mode::${existedBefore}`);
return { set_env: { zip_path: zipPath, extension_dir: destDir, is_update_mode: existedBefore } };
})();
delay_ms: 200
# Extract the downloaded zip to the destination folder (Windows + Unix)
- tool_name: run_command
arguments:
run: |
$ErrorActionPreference = 'Stop'
# Avoid template substitution issues: compute paths directly
$zip = Join-Path $env:TEMP 'terminator-browser-extension.zip'
$dest = Join-Path $env:TEMP 'terminator-bridge'
if (Test-Path $dest) { Remove-Item -Recurse -Force $dest }
New-Item -ItemType Directory -Force -Path $dest | Out-Null
Expand-Archive -Path $zip -DestinationPath $dest -Force
shell: powershell
delay_ms: 400
# Find the actual folder that contains manifest.json (some zips have a nested folder)
- tool_name: run_command
arguments:
engine: javascript
run: |
const fs = require('fs');
const path = require('path');
const os = require('os');
(async () => {
const isWin = process.platform === 'win32';
const root = isWin ? path.join(process.env.TEMP || os.tmpdir(), 'terminator-bridge') : path.join(os.tmpdir(), 'terminator-bridge');
const stack = [root];
let picked = null;
while (stack.length) {
const dir = stack.pop();
let entries;
try { entries = fs.readdirSync(dir, { withFileTypes: true }); } catch (_) { continue; }
if (entries.some(e => (e.isFile && e.isFile() && e.name.toLowerCase() === 'manifest.json') || (!e.isFile && !e.isDirectory && e.name && e.name.toLowerCase() === 'manifest.json'))) {
picked = dir; break;
}
for (const e of entries) {
if ((e.isDirectory && e.isDirectory()) || (e.isDirectory === true)) {
stack.push(path.join(dir, e.name));
}
}
}
if (!picked) {
console.log(`::set-env name=extension_dir_text::${root}`);
return { set_env: { extension_dir_text: root } };
}
console.log(`::set-env name=extension_dir_text::${picked}`);
return { set_env: { extension_dir_text: picked } };
})();
continue_on_error: false
delay_ms: 100
# Navigate directly to the Extensions page using browser navigation tool
- tool_name: navigate_browser
arguments:
url: "chrome://extensions"
browser: "chrome"
delay_ms: 1000
# Fallback: force the URL in the address bar if Chrome didn't navigate
- tool_name: wait_for_element
arguments:
selector: "${{ selectors.address_bar }}"
condition: "visible"
timeout_ms: 15000
continue_on_error: true
- tool_name: click_element
arguments:
selector: "${{ selectors.address_bar }}"
continue_on_error: true
- tool_name: type_into_element
arguments:
selector: "${{ selectors.address_bar }}"
text_to_type: "chrome://extensions"
clear_before_typing: true
verify_action: false
continue_on_error: true
- tool_name: press_key_global
arguments:
key: "{Enter}"
delay_ms: 800
continue_on_error: true
# Ensure Developer mode is ON (presence-based; do not trust is_toggled). Do NOT click "Load unpacked" here.
- tool_name: run_command
arguments:
engine: javascript
run: |
// Use terminator.js via global 'desktop'
const toggleSel = "role:Button|name:Developer mode";
const loadSel = "role:Button|name:Load unpacked";
try {
log('Waiting for chrome://extensions page to load...');
await sleep(2000);
// First, let's check if "Load unpacked" is already visible (Dev mode already on)
log('Checking if Developer mode is already enabled...');
let loadVisible = false;
try {
await desktop.locator(loadSel).first(2000);
loadVisible = true;
log('✓ Developer mode already enabled - Load unpacked button found');
} catch (_) {
log('Developer mode not enabled yet - need to toggle it');
}
if (!loadVisible) {
// Try to find and click the Developer mode toggle
log('Looking for Developer mode toggle...');
// Try multiple selector variations
const toggleSelectors = [
"role:Button|name:Developer mode",
"role:ToggleButton|name:Developer mode",
"role:Switch|name:Developer mode"
];
let devToggle = null;
for (const sel of toggleSelectors) {
try {
log(`Trying selector: ${sel}`);
devToggle = await desktop.locator(sel).first(3000);
log(`✓ Found Developer mode toggle with selector: ${sel}`);
break;
} catch (e) {
log(`Selector failed: ${sel} - ${e.message}`);
}
}
if (!devToggle) {
// Dump available elements for debugging
log('Could not find Developer mode toggle. Dumping available buttons...');
try {
const allButtons = await desktop.locator('role:Button').all(5000, 50);
log(`Found ${allButtons.length} buttons on page`);
for (let i = 0; i < Math.min(allButtons.length, 20); i++) {
const name = allButtons[i].name();
log(` Button ${i}: ${name}`);
}
} catch (dumpErr) {
log(`Failed to dump buttons: ${dumpErr.message}`);
}
throw new Error('Developer mode toggle not found');
}
// Click the toggle
log('Clicking Developer mode toggle...');
await devToggle.click();
await sleep(1000);
// Verify that Load unpacked appeared
try {
await desktop.locator(loadSel).first(3000);
log('✓ Developer mode enabled successfully - Load unpacked button appeared');
} catch (e) {
log('Warning: Load unpacked button did not appear after toggling Developer mode');
throw e;
}
}
} catch (error) {
log(`ERROR in Developer mode step: ${error.message}`);
throw error;
}
continue_on_error: false
delay_ms: 200
# Safely remove only the Terminator Bridge extension if present using JavaScript
- tool_name: run_command
arguments:
engine: javascript
run: |
// Find and remove only Terminator Bridge extension
const extensionName = "Terminator Bridge";
try {
// Wait a bit for extensions page to load
await sleep(1000);
// Look for all extension cards on the page
const allElements = await desktop.locator("role:Group").all();
log(`Found ${allElements.length} groups on extensions page`);
let terminatorFound = false;
// Search through elements to find Terminator Bridge
for (let element of allElements) {
try {
const name = await element.name();
const text = await element.value();
// Check if this element contains "Terminator Bridge" text
if ((name && name.includes(extensionName)) || (text && text.includes(extensionName))) {
log(`Found Terminator Bridge extension card`);
terminatorFound = true;
// Look for Remove button within this specific card
// Try to find the Remove button that's a child of this card
const removeButton = await element.locator("role:Button|name:Remove").first();
if (removeButton) {
log(`Found Remove button for Terminator Bridge, clicking it`);
await removeButton.click();
await sleep(500);
// Confirm removal in the dialog
await desktop.press_key("{Enter}");
log(`Confirmed removal of Terminator Bridge`);
await sleep(1000);
break;
} else {
log(`Remove button not found in Terminator Bridge card`);
}
}
} catch (e) {
// Skip elements that can't be read
continue;
}
}
if (!terminatorFound) {
log(`Terminator Bridge extension not found - probably not installed`);
}
} catch (error) {
log(`Error while trying to remove old extension: ${error.message}`);
log(`Continuing with installation anyway...`);
}
continue_on_error: true
delay_ms: 500
# Click Load unpacked, then handle folder picker dialog (Windows)
- tool_name: click_element
arguments:
selector: "${{ selectors.load_unpacked }}"
continue_on_error: false
delay_ms: 300
# Folder picker dialog (Windows)
- tool_name: wait_for_element
arguments:
selector: "${{ selectors.folder_field }}"
condition: "exists"
timeout_ms: 3000
continue_on_error: true
# Use the resolved folder containing manifest.json
- tool_name: type_into_element
arguments:
selector: "${{ selectors.folder_field }}"
text_to_type: "${{env.extension_dir_text}}"
clear_before_typing: true
verify_action: false
continue_on_error: true
- tool_name: click_element
arguments:
selector: "${{ selectors.select_folder_btn }}"
delay_ms: 1200
continue_on_error: true
# Verification: look for the Reload button that appears on unpacked extensions
- tool_name: wait_for_element
arguments:
selector: "${{ selectors.reload_button }}"
condition: "exists"
timeout_ms: 15000
stop_on_error: true
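
Review note: the "Find the actual folder that contains manifest.json" step in the workflow above walks the extracted archive depth-first and falls back to the extraction root when no manifest is found. A minimal Python sketch of the same search follows. The function name `find_manifest_dir` is hypothetical and not part of the workflow; it only illustrates the traversal.

```python
import os
import tempfile
from pathlib import Path


def find_manifest_dir(root: Path) -> Path:
    """Return the first directory under root that directly contains manifest.json.

    Falls back to root when no manifest is found, as the workflow step does.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable or missing directory: skip it
        # Case-insensitive match on regular files only
        if any(e.is_file() and e.name.lower() == "manifest.json" for e in entries):
            return current
        stack.extend(Path(e.path) for e in entries if e.is_dir())
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate a zip that unpacks into a nested folder
        nested = Path(tmp) / "terminator-bridge" / "extension"
        nested.mkdir(parents=True)
        (nested / "manifest.json").write_text("{}")
        print(find_manifest_dir(Path(tmp) / "terminator-bridge"))
```

Note that `is_file()` is called as a method here; checking the bare attribute would be truthy for every entry, directories included.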

38
extension/manifest.json Normal file

@@ -0,0 +1,38 @@
{
"action": {
"default_title": "Terminator Bridge"
},
"background": {
"service_worker": "worker.js",
"type": "module"
},
"content_scripts": [
{
"js": [
"content.js"
],
"match_about_blank": true,
"matches": [
"<all_urls>"
],
"run_at": "document_start",
"world": "MAIN"
}
],
"description": "Bridge to evaluate JS in the active tab via a local WebSocket (no DevTools UI).",
"host_permissions": [
"<all_urls>"
],
"manifest_version": 3,
"name": "Terminator Bridge",
"permissions": [
"debugger",
"tabs",
"scripting",
"activeTab",
"webNavigation",
"alarms",
"storage"
],
"version": "0.24.0"
}

1637
extension/worker.js Normal file

File diff suppressed because it is too large

45
plan.md Normal file

@@ -0,0 +1,45 @@
Validated assessment (updated 2025-12-07)
Executive context
- `create_app` wires settings, persona/board stores, session manager, action registry, browser pool/client, and GraphQL client into FastAPI app state (src/guide/app/main.py:22-59).
- CDP isolation reuses a cached context/page but clears state whenever `host_config.isolate` is true (src/guide/app/browser/pool.py:207-231).
Findings (grounded in code)
1) CDP isolation already clears storage (no bug to fix)
```python
# src/guide/app/browser/pool.py:207-231
if self.host_config.isolate:
await self._clear_cdp_storage(self._cdp_context, self._cdp_page)
...
await context.clear_cookies()
await page.evaluate("localStorage.clear(); sessionStorage.clear();")
```
Note: permissions are not cleared; add `clear_permissions()` if cross-action permission reset is required.
2) Replace regex-only HTML parsing with BeautifulSoup
```python
# src/guide/app/browser/diagnostics.py:1060-1105
input_pattern = r'<input[^>]*?(?:name|id|data-(?:cy|test(?:id)?))\\s*=\\s*["\\\']([^"\\\']+)["\\\'][^>]*>'
...
for match in re.finditer(button_pattern, html_content, re.IGNORECASE):
...
```
HTML parsing via regex will break on multiline attributes or nested quotes; switch to BeautifulSoup (preferred lightweight option) to extract inputs/buttons/selects/role="button" elements.
3) Trim duplicated selector registry and remove underscored alias exports
```python
# src/guide/app/strings/registry.py:51-96
description_field: ClassVar[str] = IntakeSelectors.DESCRIPTION_FIELD
...
page_header_title: ClassVar[str] = IntakeSelectors.PAGE_HEADER_TITLE
```
Every selector is re-assigned in `IntakeStrings` (and its peers), which doubles maintenance compared with exposing the selector classes directly. In addition, `__all__` re-exports underscored aliases in `browser/elements/mui.py` and `dropdown/__init__.py`. Plan: expose only canonical names and drop the leading-underscore exports to tighten the API surface.
4) Extension client hardcodes timeouts instead of using settings
```python
# src/guide/app/browser/extension_client.py:381-389
async def wait_for_selector(..., timeout: float | None = None, ...):
"""... default: 5000"""
timeout_ms = int(timeout) if timeout else 5000
```
`AppSettings.Timeouts` centralizes browser/extension timeouts (src/guide/app/core/config.py:70-104), but the extension client still defaults to 5000ms; consider plumbing `settings.timeouts` through for consistency.
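
A minimal sketch of what finding 4 could look like, under stated assumptions: `Timeouts`, its `extension_default_ms` field, and `resolve_timeout_ms` are hypothetical stand-ins, not the real `AppSettings.Timeouts` API. It also checks `is None` rather than truthiness, because the quoted `if timeout else 5000` treats an explicit `0` as "not set".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    """Hypothetical stand-in for the timeouts section of the app settings."""

    extension_default_ms: int = 5000


def resolve_timeout_ms(timeout: float | None, timeouts: Timeouts) -> int:
    """Prefer an explicit per-call timeout; otherwise use the configured default."""
    if timeout is None:
        return timeouts.extension_default_ms
    if timeout < 0:
        raise ValueError("timeout must be non-negative")
    return int(timeout)


if __name__ == "__main__":
    settings_timeouts = Timeouts(extension_default_ms=8000)
    print(resolve_timeout_ms(None, settings_timeouts))   # configured default
    print(resolve_timeout_ms(250.0, settings_timeouts))  # explicit per-call value
```

With this shape, `wait_for_selector` would only need a reference to the settings object to pick up a changed default.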


@@ -9,15 +9,24 @@ description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
"bs4>=0.0.2",
"cryptography>=42.0.0",
"faker>=38.2.0",
"fastapi>=0.121.3",
"graphql-core>=3.2.0",
"httpx>=0.27.0",
"loguru>=0.7.3",
"playwright>=1.56.0",
"pydantic>=2.12.4",
"pydantic-settings>=2.4.0",
"pyjwt>=2.8.0",
"python-dotenv>=1.2.1",
"pyyaml>=6.0.2",
"redis>=7.1.0",
"requests>=2.32.5",
"types-redis>=4.6.0.20241004",
"uvicorn>=0.30.6",
"websockets>=15.0.1",
]
[tool.hatch.build.targets.wheel]
@@ -30,6 +39,8 @@ dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.24.0",
"pytest-cov>=5.0.0",
"ruff>=0.14.6",
"pyrefly>=0.42.3",
]
[tool.pytest.ini_options]


@@ -1,2 +1 @@
"""Utility namespace for local hook shims."""


@@ -1,2 +1 @@
"""Hook utilities namespace."""


@@ -26,7 +26,7 @@
"includeLogsCount": 50
}
},
-"include": ["src/"],
+"include": ["src/", "extensions/"],
"ignore": {
"useGitignore": true,
"useDefaultPatterns": true,

1
scripts/__init__.py Normal file

@@ -0,0 +1 @@
"""Demo scripts for the guide application."""

874
scripts/seed_demo_board.py Normal file

@@ -0,0 +1,874 @@
#!/usr/bin/env python3
"""Seed the Demonstration board with compelling demo data.
Creates board items across 4 initiative groups:
1. Manufacturing BOM (Direct Materials) - DEMON-1xx
2. Corporate Event Program (Indirect) - DEMON-2xx
3. Services & Labor - DEMON-3xx
4. IT, SaaS, and Infrastructure - DEMON-4xx
Usage:
# Set bearer token (required)
export DEMO_BEARER_TOKEN="eyJ..."
# Seed all initiative groups
python scripts/seed_demo_board.py --group all
# Seed specific group
python scripts/seed_demo_board.py --group manufacturing
python scripts/seed_demo_board.py --group event
python scripts/seed_demo_board.py --group services
python scripts/seed_demo_board.py --group it
# Cleanup previous demo items first
python scripts/seed_demo_board.py --group all --cleanup
# Dry run (show what would be created)
python scripts/seed_demo_board.py --group all --dry-run
"""
import argparse
import asyncio
import logging
import os
import sys
from dataclasses import dataclass
from typing import Literal, cast
import httpx
# Add src to path for imports
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "src"))
from guide.app.raindrop.operations.demonstration import (
DEMONSTRATION_INSTANCE_ID,
GROUP_ARCHIVED,
GROUP_HIGH_PRIORITY,
GROUP_IN_PROGRESS,
GROUP_QUEUED,
DemonstrationItem,
archive_demonstration_item,
create_demonstration_item,
list_demonstration_items,
)
logger = logging.getLogger(__name__)
# Constants
GRAPHQL_URL = "https://raindrop-staging.hasura.app/v1/graphql"
DEMO_SEEDER_TAG = "demo-seeder"
# Type alias for initiative groups
InitiativeGroup = Literal["manufacturing", "event", "services", "it", "all"]
@dataclass(frozen=True)
class SupplierRef:
"""Reference to a supplier for board item data."""
id: int
name: str
city: str = ""
state_province: str = ""
country: str = ""
website: str = ""
def to_dict(self) -> dict[str, object]:
"""Convert to dict format for board item data (object, not list)."""
return {
"id": self.id,
"name": self.name,
"city": self.city,
"state_province": self.state_province,
"country": self.country,
"website": self.website,
"__typename": "supplier",
}
@dataclass(frozen=True)
class CommodityRef:
"""Reference to a commodity for board item data."""
id: int
name: str
instance_id: int
def to_dict(self) -> dict[str, object]:
"""Convert to dict format expected by board item data."""
return {
"id": self.id,
"name": self.name,
"__typename": "commodity",
"instance_id": self.instance_id,
}
# =============================================================================
# Initiative Group 1: Manufacturing BOM (Direct Materials)
# =============================================================================
def build_demon_101(
supplier: SupplierRef, commodity: CommodityRef
) -> dict[str, object]:
"""DEMON-101: Drone X Prototype BOM Sourcing."""
return {
"description": "Drone X Prototype BOM Sourcing",
"group": GROUP_HIGH_PRIORITY,
"f9": "DRONE-X-BOM-001",
"f17": "2026-03-15",
"target_date": "2026-06-30",
"f19": 2400000,
"f20": 1,
"f1": [commodity.to_dict()],
"f14": supplier.to_dict(),
}
def build_demon_102(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-102: Drone X Pilot Run Capacity Contracting."""
return {
"description": "Drone X Pilot Run Capacity Contracting",
"group": GROUP_QUEUED,
"f9": "DRONE-X-PILOT-001",
"f17": "2026-06-01", # 90 days out from prototype
"target_date": "2026-09-30",
"f19": 4800000, # $4.8M annual
"f20": 500, # Pilot run quantity
"f1": [commodity.to_dict()],
}
def build_demon_103(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-103: Battery Cell Qualification New Supplier Intro."""
return {
"description": "Battery Cell Qualification New Supplier Intro",
"group": GROUP_IN_PROGRESS,
"f9": "BATTERY-QUAL-001",
"f17": "2026-02-01",
"target_date": "2026-05-15",
"f19": 600000,
"f20": 10000, # Battery cells for qualification
"f1": [commodity.to_dict()],
}
# =============================================================================
# Initiative Group 2: Corporate Event Program (Indirect)
# =============================================================================
def build_demon_201(
supplier: SupplierRef, commodity: CommodityRef
) -> dict[str, object]:
"""DEMON-201: Global SKO Barcelona 2026 (Parent Record)."""
return {
"description": "Global SKO Barcelona 2026",
"group": GROUP_HIGH_PRIORITY,
"f9": "SKO-BCN-2026",
"f17": "2025-08-01",
"target_date": "2026-01-25",
"f19": 750000,
"f20": 650, # Attendees
"f1": [commodity.to_dict()],
"f14": supplier.to_dict(),
}
def build_demon_202(
supplier: SupplierRef, commodity: CommodityRef
) -> dict[str, object]:
"""DEMON-202: SKO Venue & Rooms Contracting."""
return {
"description": "SKO Venue & Rooms Contracting",
"group": GROUP_IN_PROGRESS,
"f9": "SKO-BCN-VENUE",
"f17": "2025-06-01",
"target_date": "2025-10-31",
"f19": 420000,
"f20": 1,
"f1": [commodity.to_dict()],
"f14": supplier.to_dict(),
}
def build_demon_203(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-203: SKO Catering Services."""
return {
"description": "SKO Catering Services",
"group": GROUP_QUEUED,
"f9": "SKO-BCN-CATERING",
"f17": "2025-10-01",
"target_date": "2026-01-15",
"f19": 150000,
"f20": 650, # Meals per attendee
"f1": [commodity.to_dict()],
}
def build_demon_204(
supplier: SupplierRef, commodity: CommodityRef
) -> dict[str, object]:
"""DEMON-204: SKO AV & Production."""
return {
"description": "SKO AV & Production",
"group": GROUP_IN_PROGRESS,
"f9": "SKO-BCN-AV",
"f17": "2025-07-01",
"target_date": "2026-01-20",
"f19": 180000,
"f20": 1,
"f1": [commodity.to_dict()],
"f14": supplier.to_dict(),
}
def build_demon_205(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-205: SKO Swag & Promotional Items."""
return {
"description": "SKO Swag & Promotional Items",
"group": GROUP_QUEUED,
"f9": "SKO-BCN-SWAG",
"f17": "2025-11-01",
"target_date": "2026-01-10",
"f19": 35000,
"f20": 650, # Items per attendee
"f1": [commodity.to_dict()],
}
# =============================================================================
# Initiative Group 3: Services & Labor
# =============================================================================
def build_demon_301(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-301: Contingent Labor Data Analyst (6 months)."""
return {
"description": "Contingent Labor Data Analyst (6 months)",
"group": GROUP_IN_PROGRESS,
"f9": "CW-DATA-ANALYST-001",
"f17": "2026-02-01",
"target_date": "2026-08-31",
"f19": 110000,
"f20": 6, # Months
"f1": [commodity.to_dict()],
}
def build_demon_302(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-302: Process Improvement Consulting Phase 2."""
return {
"description": "Process Improvement Consulting Phase 2",
"group": GROUP_HIGH_PRIORITY,
"f9": "CONSULTING-LEAN-P2",
"f17": "2026-03-01",
"target_date": "2026-06-30",
"f19": 275000,
"f20": 4, # Months
"f1": [commodity.to_dict()],
}
# =============================================================================
# Initiative Group 4: IT, SaaS, and Infrastructure
# =============================================================================
def build_demon_401(
supplier: SupplierRef, commodity: CommodityRef
) -> dict[str, object]:
"""DEMON-401: Renewal: SurveyMonkey Enterprise."""
return {
"description": "Renewal: SurveyMonkey Enterprise",
"group": GROUP_QUEUED,
"f9": "SAAS-SURVEYMONKEY-RENEW",
"f17": "2026-03-01",
"target_date": "2026-04-30",
"f19": 38000,
"f20": 1, # Annual subscription
"f1": [commodity.to_dict()],
"f14": supplier.to_dict(),
}
def build_demon_402(commodity: CommodityRef) -> dict[str, object]:
"""DEMON-402: New SaaS: Event Registration Platform."""
return {
"description": "New SaaS: Event Registration Platform",
"group": GROUP_HIGH_PRIORITY,
"f9": "SAAS-EVENT-REG-NEW",
"f17": "2025-09-01",
"target_date": "2025-11-30",
"f19": 95000,
"f20": 1,
"f1": [commodity.to_dict()],
}
def build_demon_403(
supplier: SupplierRef, commodity: CommodityRef
) -> dict[str, object]:
"""DEMON-403: Laptop Refresh Finance Team (Archived/Completed)."""
return {
"description": "Laptop Refresh Finance Team",
"group": GROUP_ARCHIVED,
"f9": "IT-LAPTOP-FIN-2025",
"f17": "2025-01-15",
"target_date": "2025-03-31",
"f19": 45000,
"f20": 25, # Laptops
"f1": [commodity.to_dict()],
"f14": supplier.to_dict(),
}
# =============================================================================
# Data Lookup Functions
# =============================================================================
async def lookup_supplier_by_name(
graphql_url: str,
bearer_token: str,
search: str,
) -> SupplierRef:
"""Look up a supplier by name search."""
query = """
query SearchBusiness($pattern: String!) {
business(limit: 20, where: {name: {_ilike: $pattern}}) {
id
name
city
state_province
country
website
}
}
"""
async with httpx.AsyncClient() as client:
resp = await client.post(
graphql_url,
json={"query": query, "variables": {"pattern": f"%{search}%"}},
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {bearer_token}",
},
timeout=30.0,
)
data = cast(dict[str, object], resp.json())
if errors := data.get("errors"):
error_list = cast(list[dict[str, object]], errors)
raise ValueError(f"GraphQL error: {error_list[0].get('message', 'Unknown')}")
data_section = cast(dict[str, object], data.get("data", {}))
businesses = cast(list[dict[str, object]], data_section.get("business", []))
if not businesses:
raise ValueError(f"No supplier found matching '{search}'")
first = businesses[0]
logger.info(f"Found supplier: {first['name']} (id: {first['id']})")
return SupplierRef(
id=int(cast(int, first["id"])),
name=str(first.get("name", "")),
city=str(first.get("city", "") or ""),
state_province=str(first.get("state_province", "") or ""),
country=str(first.get("country", "") or ""),
website=str(first.get("website", "") or ""),
)
async def lookup_commodities(
graphql_url: str,
bearer_token: str,
instance_id: int,
) -> dict[str, CommodityRef]:
"""Look up commodities and return categorized dict."""
query = """
query ListCommodities($instanceId: Int!, $limit: Int!) {
commodity(
where: { instance_id: { _eq: $instanceId } }
limit: $limit
order_by: { name: asc }
) {
id
name
instance_id
}
}
"""
async with httpx.AsyncClient() as client:
resp = await client.post(
graphql_url,
json={
"query": query,
"variables": {"instanceId": instance_id, "limit": 200},
},
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {bearer_token}",
},
timeout=30.0,
)
data = cast(dict[str, object], resp.json())
if errors := data.get("errors"):
error_list = cast(list[dict[str, object]], errors)
raise ValueError(f"GraphQL error: {error_list[0].get('message', 'Unknown')}")
data_section = cast(dict[str, object], data.get("data", {}))
commodities = cast(list[dict[str, object]], data_section.get("commodity", []))
result: dict[str, CommodityRef] = {}
# Keyword to category mapping (order matters - first match wins)
category_keywords: list[tuple[list[str], str]] = [
(["electronic", "pcba"], "electronics"),
(["battery", "cell"], "battery"),
(["event"], "event_services"),
(["telecom", "service"], "event_services"),
(["facilit", "real estate"], "facilities"),
(["food", "catering"], "food"),
(["marketing", "promo"], "marketing"),
(["contingent", "labor"], "contingent"),
(["consult", "professional"], "consulting"),
(["software", "saas", "it"], "it_software"),
(["hardware", "equipment"], "it_hardware"),
]
for c in commodities:
name_lower = str(c["name"]).lower()
ref = CommodityRef(
id=int(cast(int, c["id"])),
name=str(c["name"]),
instance_id=int(cast(int, c["instance_id"])),
)
# Categorize by keyword matching
for keywords, category in category_keywords:
if any(keyword in name_lower for keyword in keywords):
# Special case: event_services should not overwrite if already set
if category == "event_services" and category in result:
continue
result[category] = ref
break
# Log summary
logger.info(f"Found {len(result)} commodity categories: {', '.join(result.keys())}")
return result
# =============================================================================
# Cleanup and Creation Helpers
# =============================================================================
async def cleanup_demo_items(graphql_url: str, bearer_token: str) -> int:
"""Archive all items created by demo-seeder."""
items = await list_demonstration_items(
graphql_url=graphql_url,
bearer_token=bearer_token,
limit=100,
include_archived=False,
)
demo_items = [i for i in items if i.requested_by == DEMO_SEEDER_TAG]
archived_count = 0
for item in demo_items:
try:
_ = await archive_demonstration_item(
graphql_url=graphql_url,
bearer_token=bearer_token,
uuid=item.uuid,
)
logger.info(f"Archived {item.id}: {item.data.get('description', 'N/A')}")
archived_count += 1
except ValueError as e:
logger.warning(f"Failed to archive {item.id}: {e}")
return archived_count
async def create_item(
graphql_url: str,
bearer_token: str,
data: dict[str, object],
dry_run: bool = False,
) -> DemonstrationItem | None:
"""Create a demonstration board item."""
description = data.get("description", "Unnamed item")
if dry_run:
group = data.get("group", "Unknown")
value = data.get("f19", 0)
logger.info(f" [DRY RUN] {description} | {group} | ${value:,}")
return None
item = await create_demonstration_item(
graphql_url=graphql_url,
bearer_token=bearer_token,
data=data,
requested_by=DEMO_SEEDER_TAG,
)
logger.info(f"Created {item.id}: {description}")
return item
# =============================================================================
# Scenario Seed Functions
# =============================================================================
async def seed_manufacturing_group(
graphql_url: str,
bearer_token: str,
supplier: SupplierRef,
commodities: dict[str, CommodityRef],
dry_run: bool = False,
) -> list[DemonstrationItem]:
"""Seed Initiative Group 1: Manufacturing BOM (Direct Materials)."""
logger.info("\n=== Initiative Group 1: Manufacturing BOM ===")
items: list[DemonstrationItem] = []
default = commodities.get(
"contingent", CommodityRef(18, "Contingent Workforce", 107)
)
electronics = commodities.get("electronics", default)
battery = commodities.get("battery", electronics)
# DEMON-101
item = await create_item(
graphql_url, bearer_token, build_demon_101(supplier, electronics), dry_run
)
if item:
items.append(item)
# DEMON-102
item = await create_item(
graphql_url, bearer_token, build_demon_102(electronics), dry_run
)
if item:
items.append(item)
# DEMON-103
item = await create_item(
graphql_url, bearer_token, build_demon_103(battery), dry_run
)
if item:
items.append(item)
return items
async def seed_event_group(
graphql_url: str,
bearer_token: str,
supplier: SupplierRef,
commodities: dict[str, CommodityRef],
dry_run: bool = False,
) -> list[DemonstrationItem]:
"""Seed Initiative Group 2: Corporate Event Program (Indirect)."""
logger.info("\n=== Initiative Group 2: Corporate Event Program ===")
items: list[DemonstrationItem] = []
default = commodities.get(
"contingent", CommodityRef(18, "Contingent Workforce", 107)
)
event_services = commodities.get("event_services", default)
facilities = commodities.get("facilities", default)
food = commodities.get("food", default)
marketing = commodities.get("marketing", default)
# DEMON-201 (Parent)
item = await create_item(
graphql_url, bearer_token, build_demon_201(supplier, event_services), dry_run
)
if item:
items.append(item)
# DEMON-202
item = await create_item(
graphql_url, bearer_token, build_demon_202(supplier, facilities), dry_run
)
if item:
items.append(item)
# DEMON-203
item = await create_item(graphql_url, bearer_token, build_demon_203(food), dry_run)
if item:
items.append(item)
# DEMON-204
item = await create_item(
graphql_url, bearer_token, build_demon_204(supplier, event_services), dry_run
)
if item:
items.append(item)
# DEMON-205
item = await create_item(
graphql_url, bearer_token, build_demon_205(marketing), dry_run
)
if item:
items.append(item)
return items
async def seed_services_group(
graphql_url: str,
bearer_token: str,
commodities: dict[str, CommodityRef],
dry_run: bool = False,
) -> list[DemonstrationItem]:
"""Seed Initiative Group 3: Services & Labor."""
logger.info("\n=== Initiative Group 3: Services & Labor ===")
items: list[DemonstrationItem] = []
default = commodities.get(
"contingent", CommodityRef(18, "Contingent Workforce", 107)
)
contingent = commodities.get("contingent", default)
consulting = commodities.get("consulting", default)
# DEMON-301
item = await create_item(
graphql_url, bearer_token, build_demon_301(contingent), dry_run
)
if item:
items.append(item)
# DEMON-302
item = await create_item(
graphql_url, bearer_token, build_demon_302(consulting), dry_run
)
if item:
items.append(item)
return items
async def seed_it_group(
graphql_url: str,
bearer_token: str,
supplier: SupplierRef,
commodities: dict[str, CommodityRef],
dry_run: bool = False,
) -> list[DemonstrationItem]:
"""Seed Initiative Group 4: IT, SaaS, and Infrastructure."""
logger.info("\n=== Initiative Group 4: IT, SaaS & Infrastructure ===")
items: list[DemonstrationItem] = []
default = commodities.get(
"contingent", CommodityRef(18, "Contingent Workforce", 107)
)
it_software = commodities.get("it_software", default)
it_hardware = commodities.get("it_hardware", default)
# DEMON-401
item = await create_item(
graphql_url, bearer_token, build_demon_401(supplier, it_software), dry_run
)
if item:
items.append(item)
# DEMON-402
item = await create_item(
graphql_url, bearer_token, build_demon_402(it_software), dry_run
)
if item:
items.append(item)
# DEMON-403 (Archived)
item = await create_item(
graphql_url, bearer_token, build_demon_403(supplier, it_hardware), dry_run
)
if item:
items.append(item)
return items
# =============================================================================
# Main Entry Point
# =============================================================================
async def main(
group: InitiativeGroup,
cleanup: bool,
dry_run: bool,
bearer_token: str,
) -> int:
"""Main entry point."""
graphql_url = GRAPHQL_URL
# Cleanup if requested
if cleanup:
logger.info("Cleaning up previous demo items...")
archived = await cleanup_demo_items(graphql_url, bearer_token)
logger.info(f"Archived {archived} demo items")
# Look up reference data
logger.info("\nLooking up reference data...")
try:
supplier = await lookup_supplier_by_name(graphql_url, bearer_token, "Solid dba")
except ValueError as e:
logger.error(f"Failed to find supplier: {e}")
return 1
try:
commodities = await lookup_commodities(
graphql_url, bearer_token, DEMONSTRATION_INSTANCE_ID
)
except ValueError as e:
logger.error(f"Failed to find commodities: {e}")
return 1
# Create items
all_items: list[DemonstrationItem] = []
if group in ("manufacturing", "all"):
items = await seed_manufacturing_group(
graphql_url, bearer_token, supplier, commodities, dry_run
)
all_items.extend(items)
if group in ("event", "all"):
items = await seed_event_group(
graphql_url, bearer_token, supplier, commodities, dry_run
)
all_items.extend(items)
if group in ("services", "all"):
items = await seed_services_group(
graphql_url, bearer_token, commodities, dry_run
)
all_items.extend(items)
if group in ("it", "all"):
items = await seed_it_group(
graphql_url, bearer_token, supplier, commodities, dry_run
)
all_items.extend(items)
# Summary
logger.info("\n" + "=" * 50)
if dry_run:
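# create_item returns None in dry-run mode, so all_items is always empty here;
# count the planned items for the selected group instead.
planned = {"manufacturing": 3, "event": 5, "services": 2, "it": 3}
total = sum(planned.values()) if group == "all" else planned[group]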
logger.info(f"[DRY RUN] Would create {total} demo items")
else:
logger.info(f"SUCCESS: Created {len(all_items)} demo items")
return 0
def cli() -> None:
"""CLI entry point."""
parser = argparse.ArgumentParser(
description="Seed the Demonstration board with demo data",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
_ = parser.add_argument(
"--group",
choices=["manufacturing", "event", "services", "it", "all"],
default="all",
help="Which initiative group(s) to create (default: all)",
)
_ = parser.add_argument(
"--cleanup",
action="store_true",
help="Archive previous demo items before creating new ones",
)
_ = parser.add_argument(
"--dry-run",
action="store_true",
help="Show what would be created without actually creating",
)
_ = parser.add_argument(
"--token",
help="Bearer token (or set DEMO_BEARER_TOKEN env var)",
)
_ = parser.add_argument(
"-v",
"--verbose",
action="store_true",
help="Enable verbose logging",
)
args = parser.parse_args()
# Extract typed values
group_value = getattr(args, "group", "all")
cleanup_value = getattr(args, "cleanup", False)
dry_run_value = getattr(args, "dry_run", False)
verbose_value = getattr(args, "verbose", False)
token_value = getattr(args, "token", None)
group_str: str = str(group_value) if group_value else "all"
cleanup_bool: bool = bool(cleanup_value)
dry_run_bool: bool = bool(dry_run_value)
verbose_bool: bool = bool(verbose_value)
token_str: str | None = None
if token_value is not None and isinstance(token_value, str):
token_str = token_value
# Setup logging
log_level = logging.DEBUG if verbose_bool else logging.INFO
logging.basicConfig(
level=log_level,
format="%(levelname)s: %(message)s",
)
# Get bearer token
bearer_token = token_str or os.environ.get("DEMO_BEARER_TOKEN")
if not bearer_token:
logger.error("Bearer token required. Set DEMO_BEARER_TOKEN or use --token")
sys.exit(1)
# Validate group
valid_groups = ("manufacturing", "event", "services", "it", "all")
if group_str not in valid_groups:
logger.error(f"Invalid group: {group_str}")
sys.exit(1)
group_map: dict[str, InitiativeGroup] = {
"manufacturing": "manufacturing",
"event": "event",
"services": "services",
"it": "it",
"all": "all",
}
group_literal = group_map[group_str]
exit_code = asyncio.run(
main(
group=group_literal,
cleanup=cleanup_bool,
dry_run=dry_run_bool,
bearer_token=bearer_token,
)
)
sys.exit(exit_code)
if __name__ == "__main__":
cli()
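
Review note: the keyword table in `lookup_commodities` matches by substring, so the short keyword "it" also matches any commodity name that merely contains those two letters. A minimal sketch of the difference follows, using a trimmed copy of the table. The commodity names, `WORD_PATTERNS`, and both helper functions are hypothetical and only illustrate the matching behaviour; they are not part of the script.

```python
from __future__ import annotations

import re

# Trimmed copy of the keyword table from lookup_commodities (order matters).
SUBSTRING_KEYWORDS = [
    (["facilit", "real estate"], "facilities"),
    (["software", "saas", "it"], "it_software"),
]

# Hypothetical alternative: anchor each keyword at a word start, and require
# the short keyword "it" to be a whole word.
WORD_PATTERNS = [
    (re.compile(r"\bfacilit|\breal estate"), "facilities"),
    (re.compile(r"\bsoftware|\bsaas|\bit\b"), "it_software"),
]


def categorize_by_substring(name: str) -> str | None:
    """First-match-wins substring lookup, as in the seed script."""
    lowered = name.lower()
    for keywords, category in SUBSTRING_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return category
    return None


def categorize_by_word(name: str) -> str | None:
    """First-match-wins lookup using word-anchored patterns."""
    lowered = name.lower()
    for pattern, category in WORD_PATTERNS:
        if pattern.search(lowered):
            return category
    return None


if __name__ == "__main__":
    for sample in ("Utilities", "IT & SaaS", "Facilities Management"):
        print(sample, categorize_by_substring(sample), categorize_by_word(sample))
```

In this sketch "Utilities" is classified as `it_software` by the substring lookup and left uncategorized by the word-anchored one, while both agree on the other two names.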

4
src/guide/__main__.py Normal file

@@ -0,0 +1,4 @@
from guide.main import main
if __name__ == "__main__":
main()


@@ -1,3 +1,5 @@
from guide.app.actions.auth.login import LoginAsPersonaAction
+from guide.app.actions.auth.logout import LogoutAction
+from guide.app.actions.auth.request_otp import RequestOtpAction
-__all__ = ["LoginAsPersonaAction"]
+__all__ = ["LoginAsPersonaAction", "LogoutAction", "RequestOtpAction"]


@@ -1,35 +1,111 @@
from playwright.async_api import Page
"""Login action with session persistence support."""
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth import DummyMfaCodeProvider, ensure_persona
from guide.app import errors
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth import (
SessionManager,
login_with_otp_url,
)
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.models.personas import PersonaStore
from guide.app.models.personas import PersonaResolver, PersonaStore
@register_action
class LoginAsPersonaAction(DemoAction):
"""Log in as persona with session caching.
Attempts to restore session from disk before login.
Saves session to disk after successful login.
Request params:
url: One-time OTP auth URL (required for fresh login)
force_fresh_login: Skip session restoration (optional, default: false)
"""
id: ClassVar[str] = "auth.login_as_persona"
description: ClassVar[str] = "Log in as the specified persona using MFA."
description: ClassVar[str] = "Log in as persona with session caching."
category: ClassVar[str] = "auth"
_personas: PersonaStore
_mfa_provider: DummyMfaCodeProvider
_login_url: str
_session_manager: SessionManager
_persona_resolver: PersonaResolver
def __init__(self, personas: PersonaStore, login_url: str) -> None:
self._personas = personas
self._mfa_provider = DummyMfaCodeProvider()
self._login_url = login_url
def __init__(
self,
persona_store: PersonaStore,
session_manager: SessionManager,
persona_resolver: PersonaResolver,
) -> None:
"""Initialize login action.
Args:
persona_store: Store for looking up personas.
session_manager: Manager for session persistence.
persona_resolver: Resolver for email-based persona lookup.
"""
self._personas = persona_store
self._session_manager = session_manager
self._persona_resolver = persona_resolver
@override
async def run(self, page: Page, context: ActionContext) -> ActionResult:
if context.persona_id is None:
raise errors.PersonaError("persona_id is required for login action")
persona = self._personas.get(context.persona_id)
await ensure_persona(
page, persona, self._mfa_provider, login_url=self._login_url
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute login action with session handling.
Flow:
1. Resolve persona from persona_id or email param
2. Try to restore existing session (if not forcing fresh)
3. Validate restored session via DOM check
4. If invalid, perform fresh login via OTP URL
5. Save session after successful login
Supports two resolution modes:
- Traditional: context.persona_id set (existing behavior)
- Email-based: context.params["email"] with persona_id=None (for n8n flows)
"""
email = context.params.get("email")
if context.persona_id is not None:
persona = self._personas.get(context.persona_id)
elif email and isinstance(email, str):
persona = self._persona_resolver.resolve_by_email(email)
else:
raise errors.ActionExecutionError(
"Either persona_id or email param is required for login action",
details={"provided_params": list(context.params.keys())},
)
otp_url = context.params.get("url")
force_fresh = context.params.get("force_fresh_login", False)
# 1. Try to restore existing session (if not forcing fresh)
if not force_fresh:
restored = await self._session_manager.restore_session(page, persona)
if restored:
return restored
# 2. Perform fresh login via OTP URL
if not otp_url or not isinstance(otp_url, str):
raise errors.AuthError(
"url parameter required for fresh login",
details={"persona_id": persona.id},
)
success = await login_with_otp_url(page, otp_url, persona.email)
if not success:
raise errors.AuthError(
f"OTP login failed for {persona.email}",
details={"persona_id": persona.id, "email": persona.email},
)
# 3. Save session after successful login
if self._session_manager.auto_persist:
_ = await self._session_manager.save_session(page, persona, otp_url)
return ActionResult(
details={
"persona_id": persona.id,
"status": "logged_in",
}
)
return ActionResult(details={"persona_id": persona.id, "status": "logged_in"})


@@ -0,0 +1,103 @@
"""Logout action to clear browser session state."""
from typing import ClassVar, cast, override
from loguru import logger
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth.session import logout
from guide.app.browser.types import PageLike
from guide.app.core.config import AppSettings
from guide.app.models.domain import ActionContext, ActionResult
_JS_CLEAR_LOCAL_STORAGE = """
(() => {
try {
const count = localStorage.length;
localStorage.clear();
return { cleared: count, error: null };
} catch (e) {
return { cleared: 0, error: e.message };
}
})();
"""
@register_action
class LogoutAction(DemoAction):
"""Log out current user and clear browser session state.
Clears localStorage and optionally clicks logout button.
Use this to reset browser state before logging in as a different persona.
Request params:
navigate_first: URL to navigate to before clearing (optional)
click_logout: Whether to click logout button if found (default: true)
"""
id: ClassVar[str] = "auth.logout"
description: ClassVar[str] = "Log out and clear browser session state."
category: ClassVar[str] = "auth"
_settings: AppSettings
def __init__(self, settings: AppSettings) -> None:
"""Initialize logout action.
Args:
settings: Application settings.
"""
self._settings = settings
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute logout and clear session state.
Args:
page: Browser page instance.
context: Action context with params.
Returns:
ActionResult with logout status.
"""
navigate_url = context.params.get("navigate_first")
click_logout = context.params.get("click_logout", True)
# Navigate first if URL provided (needed for localStorage access)
if navigate_url and isinstance(navigate_url, str):
logger.info("Navigating to {} before logout", navigate_url)
_ = await page.goto(navigate_url)
elif page.url == "about:blank":
# Navigate to app base URL so we can access localStorage
base_url = self._settings.raindrop_base_url
logger.info("Navigating to {} for localStorage access", base_url)
_ = await page.goto(base_url)
# Clear localStorage
result = await page.evaluate(_JS_CLEAR_LOCAL_STORAGE)
cleared_count = 0
if isinstance(result, dict):
result_dict = cast(dict[str, object], result)
raw_cleared = result_dict.get("cleared")
if isinstance(raw_cleared, int):
cleared_count = raw_cleared
logger.info("Cleared {} localStorage items", cleared_count)
# Click logout button if requested
logout_clicked = False
if click_logout:
await logout(page)
logout_clicked = True
return ActionResult(
details={
"status": "logged_out",
"local_storage_cleared": cleared_count,
"logout_clicked": logout_clicked,
"url": page.url,
}
)
__all__ = ["LogoutAction"]
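Review note: the defensive parsing of the `page.evaluate` result could be factored into a small pure function, which makes it testable without a browser. A minimal sketch follows; `parse_cleared_count` is a hypothetical helper, and the explicit `bool` exclusion is an addition that the code in the diff does not make.

```python
def parse_cleared_count(result: object) -> int:
    """Return the 'cleared' count from a JS result dict, or 0 if malformed."""
    if not isinstance(result, dict):
        return 0
    raw = result.get("cleared")
    # bool is a subclass of int in Python, so exclude it explicitly;
    # the action in the diff only checks isinstance(raw, int).
    if isinstance(raw, int) and not isinstance(raw, bool):
        return raw
    return 0


print(parse_cleared_count({"cleared": 3, "error": None}))
print(parse_cleared_count(None))
```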


@@ -0,0 +1,650 @@
"""OTP request action with webhook callback flow.
Triggers OTP email by interacting with login page, sends webhook to n8n,
waits for n8n to find the OTP URL and call back, then completes login.
"""
import asyncio
import logging
from dataclasses import dataclass
from typing import ClassVar, Literal, cast, override
import httpx
from guide.app import errors
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth import (
SessionManager,
get_otp_callback_store,
login_with_otp_url,
login_with_verification_code,
)
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike
from guide.app.core.config import AppSettings
from guide.app.models.domain import ActionContext, ActionParamInfo, ActionResult
from guide.app.models.personas import DemoPersona, PersonaResolver
from guide.app.strings.selectors.login import LoginSelectors
@dataclass(frozen=True)
class OtpCredential:
"""OTP credential returned by n8n webhook.
Can be either a magic link URL or a verification code.
"""
type: Literal["url", "code"]
value: str
_logger = logging.getLogger(__name__)
async def trigger_otp_email(
page: PageLike,
email: str,
login_url: str,
) -> bool:
"""Navigate to login page and trigger OTP email.
Handles both dropdown and text input modes for email field.
Args:
page: Browser page instance.
email: Email address to request OTP for.
login_url: Base URL of the application (login page will be at /login).
Returns:
True if OTP was triggered successfully.
"""
helpers = PageHelpers(page)
# Navigate to login page
full_login_url = f"{login_url.rstrip('/')}/login"
_logger.info("Navigating to login page: %s", full_login_url)
_ = await page.goto(full_login_url, wait_until="networkidle")
_logger.info("Page loaded, current URL: %s", page.url)
await helpers.wait_for_stable()
# Wait for login container with longer timeout
_logger.info("Waiting for login container: %s", LoginSelectors.LOGIN_CONTAINER)
_ = await page.wait_for_selector(
LoginSelectors.LOGIN_CONTAINER, state="visible", timeout=30000
)
# Check if email field is a dropdown or text input
email_text_input = page.locator(LoginSelectors.EMAIL_TEXT_INPUT)
email_dropdown = page.locator(LoginSelectors.EMAIL_DROPDOWN)
if await email_dropdown.count() > 0:
# Dropdown mode - check if correct email is already selected
current_value = await email_dropdown.text_content()
if current_value and email.lower() in current_value.lower():
_logger.info("Email already selected in dropdown: %s", email)
else:
# Try to select from dropdown or clear and type
_logger.info("Email dropdown detected, attempting to select: %s", email)
# Click the dropdown to open options
await email_dropdown.click()
await helpers.wait_for_network_idle()
# Look for matching option (escape quotes in email for selector safety)
escaped_email = email.replace('"', '\\"').replace("'", "\\'")
option = page.locator(f'li:has-text("{escaped_email}")')
if await option.count() > 0:
await option.first.click()
_logger.info("Selected email from dropdown")
else:
# Email not in dropdown - need to clear and type
_logger.info(
"Email not in dropdown options, clearing to enable text input"
)
# Close dropdown by clicking elsewhere
await page.click(LoginSelectors.LOGIN_CONTAINER)
await helpers.wait_for_network_idle()
# Click clear button to enable text input
clear_btn = page.locator(LoginSelectors.EMAIL_CLEAR_BUTTON)
if await clear_btn.count() > 0:
await clear_btn.click()
await helpers.wait_for_network_idle()
# Now fill the text input
await page.fill(LoginSelectors.EMAIL_TEXT_INPUT, email)
_logger.info("Filled email in text input after clearing")
else:
_logger.warning("Could not find clear button to enable text input")
return False
elif await email_text_input.count() > 0:
# Text input mode - fill directly
_logger.info("Email text input detected, filling: %s", email)
await page.fill(LoginSelectors.EMAIL_TEXT_INPUT, email)
else:
_logger.error("Could not find email field (dropdown or text input)")
return False
# Click login button to trigger OTP
_ = await page.wait_for_selector(
LoginSelectors.LOGIN_BUTTON, state="visible", timeout=5000
)
_logger.info("Clicking login button to trigger OTP email")
await page.click(LoginSelectors.LOGIN_BUTTON)
await helpers.wait_for_network_idle()
# Check for errors
error_el = page.locator(LoginSelectors.ERROR_MESSAGE)
if await error_el.count() > 0:
error_text = await error_el.text_content()
if error_text and error_text.strip():
_logger.error("Login error: %s", error_text)
return False
_logger.info("OTP email triggered successfully for: %s", email)
return True
async def send_otp_webhook(
webhook_url: str,
correlation_id: str,
email: str,
callback_url: str,
timeout: float = 120.0,
) -> OtpCredential | None:
"""Send webhook to n8n to notify OTP was requested.
Supports two response modes:
1. Synchronous: n8n returns OTP credential (URL or code) in the HTTP response body
2. Async callback: n8n returns empty/ack response, calls back later
Args:
webhook_url: n8n webhook URL.
correlation_id: Unique ID to correlate callback.
email: Email OTP was requested for.
callback_url: URL for n8n to send OTP URL back (for async mode).
timeout: Request timeout in seconds (default 120s for sync mode).
Returns:
OtpCredential (url or code) if included in the response; None if the webhook failed or the response carried no credential (async mode).
"""
payload = {
"event": "otp_requested",
"correlation_id": correlation_id,
"email": email,
"callback_url": callback_url,
}
_logger.info(
"Sending OTP webhook to n8n: correlation_id=%s, email=%s",
correlation_id,
email,
)
try:
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
_logger.info("Sending POST to webhook_url: %s", webhook_url)
response = await client.post(webhook_url, json=payload)
_logger.info("Webhook response status: %s", response.status_code)
_ = response.raise_for_status()
_logger.info("OTP webhook sent successfully")
# Try to extract OTP credential from response body
credential = _extract_otp_credential_from_response(response)
if credential:
_logger.info(
"Received OTP %s in webhook response (sync mode)", credential.type
)
return credential
except httpx.HTTPError as exc:
_logger.error(
"Failed to send OTP webhook: %s (type: %s)", exc, type(exc).__name__
)
return None
except Exception as exc:
_logger.error(
"Unexpected error sending webhook: %s (type: %s)", exc, type(exc).__name__
)
return None
def _extract_credential_from_dict(data: dict[str, object]) -> OtpCredential | None:
"""Extract OTP credential from a dictionary.
Priority: URL first (navigate to magic link), then code (fallback).
The URL flow will request the code from n8n if needed after navigation.
Args:
data: Dictionary to search for credential fields.
Returns:
OtpCredential if found, None otherwise.
"""
# Check for URL fields first - navigate to magic link before using code
if (otp_url := data.get("otp_url")) and isinstance(otp_url, str):
return OtpCredential(type="url", value=otp_url)
if (access_url := data.get("access_url")) and isinstance(access_url, str):
return OtpCredential(type="url", value=access_url)
# Fallback: verification code (only if no URL provided)
if (code := data.get("verification_code")) and isinstance(code, str):
return OtpCredential(type="code", value=code)
return None
def _extract_otp_credential_from_response(
response: httpx.Response,
) -> OtpCredential | None:
"""Extract OTP credential (URL or verification code) from n8n webhook response.
Handles multiple response formats:
- {"verification_code": "..."}
- {"otp_url": "..."}
- {"access_url": "..."}
- {"output": {"verification_code": "..."}}
- {"output": {"otp_url": "..."}}
- {"output": {"access_url": "..."}}
Returns:
OtpCredential with type "code" or "url", or None if not found.
"""
try:
raw: object = cast(object, response.json())
except Exception:
return None
if not isinstance(raw, dict):
return None
data = cast(dict[str, object], raw)
# Check top-level fields first
if credential := _extract_credential_from_dict(data):
return credential
# Check nested output object (n8n format)
output = data.get("output")
if isinstance(output, dict):
output_dict = cast(dict[str, object], output)
return _extract_credential_from_dict(output_dict)
return None
@register_action
class RequestOtpAction(DemoAction):
"""Request OTP with webhook callback flow and session persistence.
Complete flow:
1. Try to restore existing session from disk
2. If no valid session, navigate to login page
3. Enter email and click login to trigger OTP
4. Send webhook to n8n with correlation_id
5. Wait for n8n to call back with OTP URL
6. Complete login with OTP URL
7. Save session to disk for future requests
Request params:
email: Email address to request OTP for (required)
callback_base_url: Base URL for callback endpoint (optional, defaults to localhost)
force_fresh_login: Skip session restoration (optional, default: false)
switch_user: Log out the current user before login (optional, default: false)
Requires:
- RAINDROP_DEMO_N8N_WEBHOOK_URL environment variable
"""
id: ClassVar[str] = "auth.request_otp"
description: ClassVar[str] = (
"Request OTP email and wait for callback with magic link."
)
category: ClassVar[str] = "auth"
long_description: ClassVar[str | None] = """
Authenticates a user via Auth0 passwordless (magic link) flow with n8n webhook integration.
**Flow:**
1. Checks for cached session - if valid and not expired, restores it instantly
2. Navigates to login page and enters email address
3. Triggers OTP email via Auth0
4. Sends webhook to n8n with correlation_id for email retrieval
5. Waits for n8n to find the magic link and call back
6. Completes login using the magic link URL
7. Saves session to disk for future requests (avoids re-authentication)
**Session Caching:**
Sessions are cached to `.sessions/{persona_id}.session.json` and reused until expired.
Use `force_fresh_login: true` to bypass cache, or `switch_user: true` to log out first.
**n8n Integration:**
Requires n8n workflow listening at RAINDROP_DEMO_N8N_WEBHOOK_URL that:
- Receives webhook with `correlation_id`, `email`, and `callback_url`
- Finds the OTP email in inbox
- Extracts magic link URL
- POSTs to callback_url with `{"correlation_id": "...", "otp_url": "..."}`
""".strip()
params_info: ClassVar[list[ActionParamInfo]] = [
ActionParamInfo(
name="email",
description="Email address to authenticate. Must be a valid Auth0 user.",
required=True,
example="travis@raindrop.com",
),
ActionParamInfo(
name="switch_user",
description="Log out the current user before login. Use when switching between personas on the same browser.",
required=False,
default="false",
example="true",
),
ActionParamInfo(
name="force_fresh_login",
description="Skip session cache and force fresh OTP authentication.",
required=False,
default="false",
example="true",
),
ActionParamInfo(
name="callback_base_url",
description="Base URL for n8n callback endpoint. Defaults to RAINDROP_DEMO_CALLBACK_BASE_URL.",
required=False,
default="http://localhost:8765",
example="https://demo.example.com",
),
]
example_request: ClassVar[dict[str, object] | None] = {
"persona_id": None,
"host_id": "browserless-cdp",
"params": {
"email": "travis@raindrop.com",
"switch_user": True,
},
}
example_response: ClassVar[dict[str, object] | None] = {
"status": "success",
"action_id": "auth.request_otp",
"correlation_id": "a5edbff3-99b3-4e86-aab5-1fcbdc910171",
"result": {
"email": "travis@raindrop.com",
"status": "logged_in",
"correlation_id": "a5edbff3-99b3-4e86-aab5-1fcbdc910171",
},
}
requires: ClassVar[list[str]] = [
"RAINDROP_DEMO_N8N_WEBHOOK_URL - n8n webhook endpoint for OTP retrieval",
"n8n workflow configured to find OTP emails and callback",
]
_settings: AppSettings
_session_manager: SessionManager
_persona_resolver: PersonaResolver
def __init__(
self,
settings: AppSettings,
session_manager: SessionManager,
persona_resolver: PersonaResolver,
) -> None:
"""Initialize with settings and session management.
Args:
settings: Application settings with n8n webhook URL.
session_manager: Manager for session persistence.
persona_resolver: Resolver for email-based persona lookup.
"""
self._settings = settings
self._session_manager = session_manager
self._persona_resolver = persona_resolver
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute OTP request flow with session persistence.
Args:
page: Browser page instance.
context: Action context with params.
Returns:
ActionResult with login status.
Raises:
ActionExecutionError: If OTP request fails.
AuthError: If login fails after receiving OTP URL.
"""
email = context.params.get("email")
if not email or not isinstance(email, str):
raise errors.ActionExecutionError(
"'email' param is required for OTP request",
details={"provided_params": list(context.params.keys())},
)
switch_user = context.params.get("switch_user", False)
# switch_user implies force_fresh_login (skip session restoration)
force_fresh = context.params.get("force_fresh_login", False) or switch_user
# If switching users, logout first to clear session state
if switch_user:
_logger.info("switch_user=True, clearing session before login")
base_url = self._settings.raindrop_base_url
helpers = PageHelpers(page)
# Navigate to app first
_ = await page.goto(base_url, wait_until="networkidle")
await helpers.wait_for_stable()
# Try to click logout button (may be in a menu)
logout_btn = page.locator('[data-test="auth-logout"]')
if await logout_btn.count() > 0:
_logger.info("Clicking logout button")
await logout_btn.click()
await helpers.wait_for_network_idle()
else:
# Logout button might be in user menu - try to find and click menu first
user_menu = page.locator('[data-test="user-menu"], [data-test="profile-menu"], button:has-text("Account")')
if await user_menu.count() > 0:
_logger.info("Opening user menu to find logout")
await user_menu.first.click()
await helpers.wait_for_network_idle()
# Now try logout button again
if await logout_btn.count() > 0:
await logout_btn.click()
await helpers.wait_for_network_idle()
# Clear localStorage to invalidate any remaining session
_ = await page.evaluate(
"(() => { const c = localStorage.length; localStorage.clear(); return c; })()"
)
_logger.info("Session cleared for user switch")
# Resolve persona from email for session management
persona: DemoPersona | None = None
try:
persona = self._persona_resolver.resolve_by_email(email)
except errors.PersonaError:
_logger.debug(
"No persona found for email %s, session caching disabled", email
)
# 0. Try to restore existing session (if not forcing fresh)
if persona and not force_fresh:
restored = await self._session_manager.restore_session(page, persona)
if restored:
return restored
# Continue with OTP flow if session restoration failed
webhook_url = self._settings.n8n_webhook_url
if not webhook_url:
raise errors.ConfigError(
"n8n_webhook_url not configured",
details={
"hint": "Set RAINDROP_DEMO_N8N_WEBHOOK_URL environment variable"
},
)
# Callback URL for n8n to send OTP back
callback_base = context.params.get(
"callback_base_url", self._settings.callback_base_url
)
callback_url = f"{str(callback_base).rstrip('/')}/auth/otp-callback"
correlation_id = context.correlation_id
store = get_otp_callback_store()
# 1. Trigger OTP email on login page
triggered = await trigger_otp_email(
page,
email,
self._settings.raindrop_base_url,
)
if not triggered:
raise errors.ActionExecutionError(
f"Failed to trigger OTP email for {email}",
details={"email": email},
)
# 1.5. Wait for email to arrive before notifying n8n
# This ensures n8n finds the fresh email, not an old one
email_delay = self._settings.n8n_otp_email_delay
_logger.info("Waiting %.1fs for OTP email to arrive...", email_delay)
await asyncio.sleep(email_delay)
# 2. Send webhook to n8n and check for sync response
timeout = self._settings.n8n_otp_callback_timeout
credential = await send_otp_webhook(
webhook_url,
correlation_id,
email,
callback_url,
timeout=float(timeout),
)
# 3. If no credential in response, fall back to async callback
if not credential:
_logger.info(
"No credential in webhook response, waiting for async callback"
)
_ = await store.register(correlation_id, email)
try:
otp_url = await store.wait_for_callback(correlation_id, timeout=timeout)
credential = OtpCredential(type="url", value=otp_url)
except TimeoutError as exc:
raise errors.ActionExecutionError(
f"Timeout waiting for OTP callback ({timeout}s)",
details={"correlation_id": correlation_id, "email": email},
) from exc
except ValueError as exc:
raise errors.ActionExecutionError(
f"OTP callback error: {exc}",
details={"correlation_id": correlation_id, "email": email},
) from exc
# 4. Complete login based on credential type
success = await self._complete_login(
page=page,
credential=credential,
email=email,
webhook_url=webhook_url,
correlation_id=correlation_id,
callback_url=callback_url,
timeout=float(timeout),
)
if not success:
raise errors.AuthError(
f"Login failed for {email}",
details={"email": email, "credential_type": credential.type},
)
# 5. Save session after successful login
if persona and self._session_manager.auto_persist:
_ = await self._session_manager.save_session(
page, persona, self._settings.raindrop_base_url
)
_logger.info("Saved session for persona %s", persona.id)
return ActionResult(
details={
"email": email,
"status": "logged_in",
"correlation_id": correlation_id,
}
)
async def _complete_login(
self,
page: PageLike,
credential: OtpCredential,
email: str,
webhook_url: str,
correlation_id: str,
callback_url: str,
timeout: float,
) -> bool:
"""Complete login with OTP credential (URL or verification code).
Handles two-phase flow:
- If credential is URL: navigate and login, detect if verification code page appears
- If credential is code: fill verification code directly
- If URL leads to verification code page: call webhook again for code
Args:
page: Browser page instance.
credential: OTP credential (url or code).
email: Email address for validation.
webhook_url: n8n webhook URL for re-fetch if needed.
correlation_id: Request correlation ID.
callback_url: Callback URL for async mode.
timeout: Timeout for webhook calls.
Returns:
True if login successful, False otherwise.
"""
if credential.type == "code":
# Direct verification code login
_logger.info("Using verification code for login: %s", email)
return await login_with_verification_code(page, credential.value, email)
# URL-based login
_logger.info("Using OTP URL for login: %s", email)
success = await login_with_otp_url(page, credential.value, email)
if success:
return True
# Check if page is asking for verification code (two-phase flow)
code_input = page.locator(LoginSelectors.VERIFICATION_CODE_INPUT)
if await code_input.count() > 0:
_logger.info(
"OTP URL led to verification code page, fetching code from n8n..."
)
# Call webhook again - n8n will return verification code this time
code_credential = await send_otp_webhook(
webhook_url,
correlation_id,
email,
callback_url,
timeout=timeout,
)
if code_credential and code_credential.type == "code":
_logger.info("Received verification code from n8n, completing login")
return await login_with_verification_code(
page, code_credential.value, email
)
_logger.error("Failed to get verification code from n8n webhook")
return False
# Login failed for other reasons
return False
__all__ = ["RequestOtpAction", "trigger_otp_email", "send_otp_webhook"]


@@ -1,13 +1,16 @@
from abc import ABC, abstractmethod
from collections.abc import Callable, Iterable, Mapping
from inspect import Parameter, signature
from typing import ClassVar, override, cast
from playwright.async_api import Page
from typing import ClassVar, cast, override
from guide.app import errors
from guide.app.models.domain import ActionContext, ActionMetadata, ActionResult
from guide.app.models.types import JSONValue
from guide.app.browser.types import PageLike
from guide.app.models.domain import (
ActionContext,
ActionMetadata,
ActionParamInfo,
ActionResult,
)
class DemoAction(ABC):
@@ -15,14 +18,33 @@ class DemoAction(ABC):
Actions inherit from this class to be discoverable and executable
by the ActionRegistry.
Required class variables:
id: Unique action identifier (e.g., 'auth.request_otp')
description: Brief one-line description
category: Category for grouping (e.g., 'auth', 'intake')
Optional class variables for rich documentation:
long_description: Detailed multi-line description with usage notes
params_info: List of ActionParamInfo documenting accepted parameters
example_request: Example request payload dict
example_response: Example response payload dict
requires: List of requirements (e.g., env vars, config)
"""
id: ClassVar[str]
description: ClassVar[str]
category: ClassVar[str]
# Optional documentation fields
long_description: ClassVar[str | None] = None
params_info: ClassVar[list[ActionParamInfo]] = []
example_request: ClassVar[dict[str, object] | None] = None
example_response: ClassVar[dict[str, object] | None] = None
requires: ClassVar[list[str]] = []
@abstractmethod
async def run(self, page: Page, context: ActionContext) -> ActionResult:
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute the action and return a result."""
...
@@ -85,7 +107,7 @@ class CompositeAction(DemoAction):
self.context: ActionContext | None = None
@override
async def run(self, page: Page, context: ActionContext) -> ActionResult:
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute all child actions in sequence.
Args:
@@ -97,7 +119,7 @@ class CompositeAction(DemoAction):
"""
self.context = context
results: dict[str, ActionResult] = {}
details: dict[str, JSONValue] = {}
details: dict[str, object] = {}
for step_id in self.child_actions:
try:
@@ -138,6 +160,103 @@ class CompositeAction(DemoAction):
pass # Default: no processing
class ConditionalCompositeAction(CompositeAction):
"""Composite action with runtime conditional step execution.
Extends CompositeAction to support skipping steps based on runtime conditions.
Override `should_execute_step()` to implement conditional logic.
Example:
@register_action
class ConditionalFlow(ConditionalCompositeAction):
id = "conditional-flow"
description = "Flow with conditional steps"
category = "flows"
child_actions = ("step-one", "step-two", "step-three")
@override
async def should_execute_step(
self, step_id: str, context: ActionContext
) -> bool:
if step_id == "step-two":
return context.params.get("include_step_two", True)
return True
"""
context: ActionContext | None
async def should_execute_step(self, step_id: str, context: ActionContext) -> bool:
"""Determine if a step should execute.
Override in subclasses to implement conditional logic based on
context.params, context.shared_state, or other runtime conditions.
Args:
step_id: The action ID of the step to potentially execute.
context: The current action context.
Returns:
True to execute the step, False to skip it.
"""
_ = step_id, context # Unused in base implementation
return True
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute child actions conditionally.
Args:
page: The Playwright page instance.
context: The action context (shared across all steps).
Returns:
ActionResult with combined status, details, and skipped steps.
"""
self.context = context
details: dict[str, object] = {}
skipped: list[str] = []
for step_id in self.child_actions:
if not await self.should_execute_step(step_id, context):
skipped.append(step_id)
details[step_id] = {"skipped": True}
continue
try:
action = self.registry.get(step_id)
result = await action.run(page, context)
details[step_id] = result.details
if result.status == "error":
return ActionResult(
status="error",
details={
"failed_step": step_id,
"steps": details,
"skipped": skipped,
},
error=result.error,
)
await self.on_step_complete(step_id, result)
except Exception as exc:
return ActionResult(
status="error",
details={
"failed_step": step_id,
"steps": details,
"skipped": skipped,
},
error=f"Exception in step '{step_id}': {exc}",
)
return ActionResult(
status="ok",
details={"steps": details, "skipped": skipped},
)
class ActionRegistry:
"""Manages action instances and metadata.
@@ -256,6 +375,19 @@ class ActionRegistry:
raise errors.ActionExecutionError(f"Unknown action '{action_id}'")
def _build_metadata(self, action: DemoAction) -> ActionMetadata:
"""Build ActionMetadata from an action instance."""
return ActionMetadata(
id=action.id,
description=action.description,
category=action.category,
long_description=getattr(action, "long_description", None),
params=getattr(action, "params_info", []),
example_request=getattr(action, "example_request", None),
example_response=getattr(action, "example_response", None),
requires=getattr(action, "requires", []),
)
def list_metadata(self) -> list[ActionMetadata]:
"""List metadata for all registered actions.
@@ -269,39 +401,21 @@ class ActionRegistry:
# Add explicit instances
for action in self._actions.values():
-metadata.append(
-ActionMetadata(
-id=action.id,
-description=action.description,
-category=action.category,
-)
-)
+metadata.append(self._build_metadata(action))
seen_ids.add(action.id)
# Add factory functions
for factory in self._factories.values():
action = factory()
if action.id not in seen_ids:
-metadata.append(
-ActionMetadata(
-id=action.id,
-description=action.description,
-category=action.category,
-)
-)
+metadata.append(self._build_metadata(action))
seen_ids.add(action.id)
# Add globally registered actions
for action_cls in get_registered_actions().values():
if action_cls.id not in seen_ids:
action = self._instantiate_with_di(action_cls)
-metadata.append(
-ActionMetadata(
-id=action.id,
-description=action.description,
-category=action.category,
-)
-)
+metadata.append(self._build_metadata(action))
seen_ids.add(action.id)
return metadata
@@ -310,6 +424,7 @@ class ActionRegistry:
__all__ = [
"DemoAction",
"CompositeAction",
"ConditionalCompositeAction",
"ActionRegistry",
"register_action",
"get_registered_actions",
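
Note on `_build_metadata`: it reads required attributes directly and optional ones through `getattr` with a default. A minimal sketch of that pattern follows. `Metadata`, `MinimalAction`, and `RichAction` are hypothetical stand-ins, not the project's `ActionMetadata` or `DemoAction`, and copying the list defaults is an assumption added here so callers never share a mutable default.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Metadata:
    """Hypothetical stand-in for ActionMetadata (field set reduced)."""

    id: str
    description: str
    category: str
    long_description: Optional[str] = None
    params: list = field(default_factory=list)
    requires: list = field(default_factory=list)


class MinimalAction:
    """Declares only the required attributes."""

    id = "demo.minimal"
    description = "Minimal action."
    category = "demo"


class RichAction(MinimalAction):
    """Also declares optional documentation attributes."""

    id = "demo.rich"
    long_description = "Longer text for API docs."
    requires = ["persona_store"]


def build_metadata(action: Any) -> Metadata:
    # Required attributes raise AttributeError if missing;
    # optional ones fall back to a neutral default.
    return Metadata(
        id=action.id,
        description=action.description,
        category=action.category,
        long_description=getattr(action, "long_description", None),
        params=list(getattr(action, "params_info", [])),
        requires=list(getattr(action, "requires", [])),
    )
```

With this shape, older actions that never declared the optional attributes keep working, and the three loops in `list_metadata` can share one builder.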


@@ -0,0 +1,5 @@
"""Contract-related demo actions."""
from guide.app.actions.contract.fill_contract import FillContractFormAction
__all__ = ["FillContractFormAction"]


@@ -0,0 +1,868 @@
import asyncio
import contextlib
import logging
from collections.abc import Awaitable
from typing import ClassVar, cast, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.elements import (
fill_date,
fill_text,
select_single,
select_typeahead,
)
from guide.app.browser.elements.dropdown import select_multi, click_with_mouse_events
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.demo_texts.contract import ContractTexts
from guide.app.strings.selectors.contract import ContractFormSelectors
from guide.app import errors
_logger = logging.getLogger(__name__)
@register_action
class FillContractFormAction(DemoAction):
"""Fill the contract form with demo values using dropdown helpers."""
id: ClassVar[str] = "fill-contract-form"
description: ClassVar[str] = "Populate the contract form with demo data."
category: ClassVar[str] = "contract"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
selections: dict[str, dict[str, object]] = {}
field_errors: dict[str, str] = {}
def primary_selector(selector: str) -> str:
return selector.split(",")[0].strip()
async def scroll_into_view(selector: str) -> None:
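# Escape backslashes and single quotes so the selector cannot break the JS string literal below
sel = primary_selector(selector).replace("\\", "\\\\").replace("'", "\\'")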
script = f"""
(() => {{
const root = document.querySelector('{sel}');
if (!root) return false;
// Skip if this is a collapse button
if (root.getAttribute('data-cy') === 'page-header-chervon-button') {{
return false;
}}
// Find the actual form element (input, combobox, or the root if it's already an input)
let el = root;
// If root is not an input/combobox, find the first input or combobox inside it
if (root.tagName !== 'INPUT' && root.tagName !== 'TEXTAREA' && root.getAttribute('role') !== 'combobox') {{
const input = root.querySelector('input, textarea, [role="combobox"]');
if (input) el = input;
}}
// Ensure we're not scrolling to an element inside a collapse button
const collapseBtn = el.closest('button[data-cy="page-header-chervon-button"]');
if (collapseBtn) {{
return false;
}}
const rect = el.getBoundingClientRect();
const inView = rect.top >= 0 && rect.left >= 0 &&
rect.bottom <= (window.innerHeight || document.documentElement.clientHeight) &&
rect.right <= (window.innerWidth || document.documentElement.clientWidth);
if (!inView) {{
el.scrollIntoView({{block: 'center', inline: 'nearest', behavior: 'auto'}});
}}
return true;
}})();
"""
with contextlib.suppress(Exception):
_ = await page.evaluate(script)
await page.wait_for_timeout(100)
async def attempt(
name: str,
coro: Awaitable[dict[str, object] | None | bool],
timeout: float = 0.6,
) -> None:
try:
result = await asyncio.wait_for(coro, timeout=timeout)
selections[name] = (
result if isinstance(result, dict) else {"result": result}
)
except Exception as exc: # noqa: BLE001
field_errors[name] = repr(exc)
_logger.warning("Field %s failed: %s", name, exc)
async def fill_if_present(selector: str, value: str) -> bool:
"""Fill the first input/textarea inside selector if it exists."""
base_selector = primary_selector(selector)
with contextlib.suppress(Exception):
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(base_selector, timeout=0.2)
await page.click(base_selector)
await page.fill(base_selector, value)
return True
try:
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(
f"{base_selector} input, {base_selector} textarea", timeout=0.2
)
await page.click(f"{base_selector} input, {base_selector} textarea")
await page.fill(
f"{base_selector} input, {base_selector} textarea", value
)
return True
except Exception:
# Try direct selector if it already points to input/textarea
try:
await page.fill(base_selector, value)
return True
except Exception:
return False
async def ensure_checked(selector: str) -> None:
"""Ensure a checkbox is checked using MUI-compatible method."""
with contextlib.suppress(Exception):
# Escape selector for JavaScript
sel_escaped = (
selector.replace("\\", "\\\\")
.replace("'", "\\'")
.replace('"', '\\"')
)
# Check if already checked (use .checked property, not getAttribute)
is_checked = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{sel_escaped}');
return el ? el.checked : false;
}})();
"""
)
if not is_checked:
# Click using mouse events for MUI compatibility
_ = await click_with_mouse_events(page, selector, focus_first=True)
await page.wait_for_timeout(100)
async def deselect_field(selector: str) -> None:
"""Blur/deselect a field to enable downstream fields."""
sel = primary_selector(selector)
field_selector_js = (
sel.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')
)
_ = await page.evaluate(
f"""
(() => {{
const root = document.querySelector('{field_selector_js}');
if (!root) return false;
// Blur any active input
const input = root.querySelector('input');
if (input && document.activeElement === input) {{
input.blur();
}}
// Blur any active combobox/select
const combobox = root.querySelector('[role="combobox"]');
if (combobox && document.activeElement === combobox) {{
combobox.blur();
}}
// Blur the root element if it's focused
if (document.activeElement === root) {{
root.blur();
}}
// Also blur document.activeElement if it's within the root
if (document.activeElement && root.contains(document.activeElement)) {{
document.activeElement.blur();
}}
return true;
}})();
"""
)
await page.wait_for_timeout(100) # Give form time to process the blur
def _js_escape(text: str) -> str:
return text.replace("\\", "\\\\").replace("'", "\\'")
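
Note on the escaping helpers: several functions in this file build JavaScript by interpolating a selector into a single-quoted literal and escape it by hand. One alternative, sketched below under the assumption that the script is passed to `page.evaluate` as plain source text, is to let `json.dumps` produce the literal. `js_query_selector` is a hypothetical helper name, not part of this change.

```python
import json


def js_query_selector(selector: str) -> str:
    """Return a JS expression that looks up `selector` (hypothetical helper)."""
    # json.dumps emits a double-quoted literal with quotes, backslashes,
    # and control characters escaped; with the default ensure_ascii=True
    # the output is plain ASCII and is also a valid JS string literal.
    return f"document.querySelector({json.dumps(selector)})"


def exists_script(selector: str) -> str:
    """Build a self-invoking script that reports whether the element exists."""
    return f"(() => {js_query_selector(selector)} !== null)();"
```

Because the quoting lives in one place, selectors such as `[data-cy='x']` or ones containing newlines need no per-call `replace` chains.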
async def read_values(selector: str) -> list[str]:
"""Read visible token/input text inside a dropdown field."""
sel = _js_escape(selector)
script = f"""
(() => {{
const root = document.querySelector('{sel}');
if (!root) return [];
const chips = Array.from(root.querySelectorAll('.MuiChip-label, [data-testid="Chip"] span, .MuiAutocomplete-tag, .MuiAutocomplete-tag span, .MuiAutocomplete-chip, .MuiAutocomplete-chip span'));
const selects = Array.from(root.querySelectorAll('[role="combobox"], .MuiSelect-select'));
const inputs = Array.from(root.querySelectorAll('input, textarea'));
const vals = [];
chips.forEach(c => vals.push((c.textContent || '').trim()));
selects.forEach(s => vals.push((s.textContent || s.value || '').trim()));
inputs.forEach(i => {{
if (i.value) vals.push(i.value.trim());
}});
return vals.filter(Boolean);
}})();
"""
try:
result = await page.evaluate(script)
if isinstance(result, list):
# Drop null entries and normalise the rest to strings
return [str(v) for v in cast(list[object], result) if v is not None]
return []
except Exception:
return []
async def wait_for_input_value(
selector: str, expected: str, timeout_ms: int = 1500
) -> bool:
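# Escape the selector for the single-quoted JS string literal used below
sel = primary_selector(selector).replace("\\", "\\\\").replace("'", "\\'")
loop = asyncio.get_running_loop()
deadline = loop.time() + timeout_ms / 1000
while loop.time() < deadline: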
val = await page.evaluate(
f"(function(){{ const el = document.querySelector('{sel}')?.querySelector('input, textarea') || document.querySelector('{sel}'); return el ? (el.value || el.textContent || '').trim() : null; }})();"
)
if isinstance(val, str) and expected.lower() in val.lower():
return True
await page.wait_for_timeout(80)
return False
async def set_input_value(name: str, selector: str, value: str) -> None:
sel = primary_selector(selector)
exists = await page.evaluate(
f"(function(){{return document.querySelector('{sel}') !== null; }})();"
)
if not exists:
selections[name] = {"missing_element": True}
field_errors[name] = "element_not_found"
return
await scroll_into_view(sel)
esc = value.replace("\\", "\\\\").replace("'", "\\'")
_ = await page.evaluate(
f"""
(function(){{
const root = document.querySelector('{sel}');
const el = root?.querySelector('input, textarea') || document.querySelector('{sel}')?.querySelector('input, textarea') || document.querySelector('{sel}');
if (!el) return false;
el.value = '{esc}';
el.dispatchEvent(new Event('input', {{ bubbles: true }}));
el.dispatchEvent(new Event('change', {{ bubbles: true }}));
el.dispatchEvent(new Event('blur', {{ bubbles: true }}));
return true;
}})();
"""
)
_ = await wait_for_input_value(sel, value, timeout_ms=1500)
selections[name] = {"values": await read_values(sel)}
async def wait_for_field_enabled(selector: str, timeout_ms: int = 5000) -> bool:
"""Wait for a field to become enabled (not disabled)."""
sel = primary_selector(selector)
sel_escaped = (
sel.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')
)
loop = asyncio.get_running_loop()
deadline = loop.time() + timeout_ms / 1000
while loop.time() < deadline:
is_enabled = await page.evaluate(
f"""
(() => {{
const root = document.querySelector('{sel_escaped}');
if (!root) return false;
// Check if field or its input is disabled
if (root.hasAttribute('disabled') || root.classList.contains('Mui-disabled')) return false;
const input = root.querySelector('input, [role="combobox"]');
if (input) {{
if (input.disabled || input.hasAttribute('aria-disabled') || input.classList.contains('Mui-disabled')) {{
return false;
}}
}}
return true;
}})();
"""
)
if is_enabled:
return True
await page.wait_for_timeout(100)
return False
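
Note on the polling loops: `wait_for_input_value` and `wait_for_field_enabled` both repeat the same deadline loop around a different check. A minimal sketch of a shared helper is shown below. `poll_until` and its parameters are hypothetical names, and it uses `asyncio.sleep` where the original waits with `page.wait_for_timeout`.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def poll_until(
    check: Callable[[], Awaitable[bool]],
    timeout_s: float = 1.5,
    interval_s: float = 0.08,
) -> bool:
    """Call `check` repeatedly until it returns True or the deadline passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s  # monotonic clock of the running loop
    while loop.time() < deadline:
        if await check():
            return True
        await asyncio.sleep(interval_s)
    return False
```

Each waiter would then only supply its own `check` coroutine, for example one that evaluates the "is enabled" script and returns its boolean result.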
async def scroll_modal() -> None:
"""Scroll the contract modal form content to ensure lazy fields render."""
scroll_script = """
() => {
const modal = document.querySelector('#modalgridcontent > div > form')?.parentElement ||
document.querySelector('#dialog_contractForm [data-cy="contract-form-base"]')?.parentElement ||
document.querySelector('#modalgridcontent') ||
document.querySelector('[role="dialog"] [data-cy="contract-form-base"]')?.closest('[role="dialog"]');
if (!modal) return false;
const step = modal.scrollHeight / 6;
for (let y = 0; y <= modal.scrollHeight; y += step) {
modal.scrollTo({ top: y, behavior: 'auto' });
}
modal.scrollTo({ top: modal.scrollHeight, behavior: 'auto' });
return true;
}
"""
with contextlib.suppress(Exception):
_ = await page.evaluate(scroll_script)
await page.wait_for_timeout(200)
try:
form_present = await page.evaluate(
"(function(){return document.querySelector('form[data-cy=\"contract-form-base\"]') !== null;})();"
)
if not form_present:
return ActionResult(
details={
"message": "Contract form not found on page; open the modal and retry.",
"error": "form_not_found",
"selections": selections or None,
}
)
await scroll_modal()
# Dropdowns & autocompletes (one field at a time)
# Contract type uses select_single (same pattern as sourcing intake)
contract_type_selector = primary_selector(
ContractFormSelectors.CONTRACT_TYPE_FIELD
)
exists = await page.evaluate(
f"(function(){{return document.querySelector('{contract_type_selector}') !== null; }})();"
)
if not exists:
selections["contract_type"] = {"missing_element": True}
field_errors["contract_type"] = "element_not_found"
_logger.warning(
"Contract type field not found: %s", contract_type_selector
)
else:
await scroll_into_view(contract_type_selector)
await page.wait_for_timeout(150)
# Verify field is enabled before attempting selection
is_enabled_check_result = await page.evaluate(
f"""
(function(){{
const root = document.querySelector('{contract_type_selector.replace("'", "\\'")}');
if (!root) return false;
// Check if root IS the combobox, or find combobox inside
const combo = root.getAttribute('role') === 'combobox' ? root : root.querySelector('[role="combobox"]');
if (!combo) return false;
// Treat the field as disabled only when aria-disabled is explicitly 'true'
return combo.getAttribute('aria-disabled') !== 'true';
}})();
"""
)
is_enabled_check = bool(is_enabled_check_result)
_logger.info(
"[FORM-FILL] Contract type field enabled: %s", is_enabled_check
)
# Retry selecting contract type until "Required" error disappears
max_retries = 5
retry_count = 0
contract_type_result = None
has_required_error = True
while retry_count < max_retries:
contract_type_result = await select_single(
page,
contract_type_selector,
ContractTexts.CONTRACT_TYPE,
)
_logger.info(
"[FORM-FILL] Contract type attempt %d: selected=%s not_found=%s available=%s",
retry_count + 1,
contract_type_result["selected"],
contract_type_result["not_found"],
contract_type_result["available"],
)
# Wait a bit for validation to update
await page.wait_for_timeout(200)
# Check if "Required" error message is still present
has_required_error = await page.evaluate(
f"""
(function(){{
const root = document.querySelector('{contract_type_selector.replace("'", "\\'")}');
if (!root) return false;
const helperText = root.closest('.MuiFormControl-root')?.querySelector('.MuiFormHelperText-root.Mui-error');
if (!helperText) return false;
const text = (helperText.textContent || '').trim();
return text === 'Required';
}})();
"""
)
if not has_required_error:
_logger.info(
"[FORM-FILL] Contract type 'Required' error cleared after %d attempts",
retry_count + 1,
)
break
retry_count += 1
if retry_count < max_retries:
_logger.info(
"[FORM-FILL] Contract type still shows 'Required' error, retrying..."
)
await page.wait_for_timeout(200)
# Deselect and try again
await deselect_field(contract_type_selector)
await page.wait_for_timeout(100)
if has_required_error:
_logger.warning(
"[FORM-FILL] Contract type 'Required' error still present after %d attempts",
max_retries,
)
# Verify the value was actually set
actual_value = await page.evaluate(
f"""
(function(){{
const root = document.querySelector('{contract_type_selector.replace("'", "\\'")}');
if (!root) return null;
// Check if root IS the combobox, or find combobox inside
const combo = root.getAttribute('role') === 'combobox' ? root : root.querySelector('[role="combobox"]');
return combo ? (combo.textContent || '').trim() : null;
}})();
"""
)
_logger.info(
"[FORM-FILL] Contract type actual value after selection: %s",
actual_value,
)
if contract_type_result:
selections["contract_type"] = {
"selected": contract_type_result["selected"],
"not_found": contract_type_result["not_found"],
"available": contract_type_result["available"],
}
if contract_type_result.get("not_found"):
field_errors["contract_type"] = (
f"not_found={contract_type_result['not_found']}"
)
raise errors.GuideError(
f"Contract type incomplete: {contract_type_result['not_found']}"
)
else:
selections["contract_type"] = {}
# Deselect field to enable downstream fields
await deselect_field(contract_type_selector)
# Commodities (multi) - use select_multi directly like sourcing intake
# Extract primary selector (first from comma-separated list) for proper popup button selector
commodities_field_selector = primary_selector(
ContractFormSelectors.CONTRACT_COMMODITIES_FIELD
)
# Verify element exists before attempting selection
exists = await page.evaluate(
f"(function(){{return document.querySelector('{commodities_field_selector}') !== null; }})();"
)
if not exists:
selections["contract_commodities"] = {"missing_element": True}
field_errors["contract_commodities"] = "element_not_found"
_logger.warning(
"Contract commodities field not found: %s",
commodities_field_selector,
)
else:
await scroll_into_view(commodities_field_selector)
# Wait a bit for the element to be ready after scrolling
await page.wait_for_timeout(150)
commodities_result = await select_multi(
page,
commodities_field_selector,
list(ContractTexts.CONTRACT_COMMODITIES),
)
_logger.info(
"[FORM-FILL] Contract commodities selected=%s not_found=%s available=%s",
commodities_result["selected"],
commodities_result["not_found"],
commodities_result["available"],
)
selections["contract_commodities"] = {
"selected": commodities_result["selected"],
"not_found": commodities_result["not_found"],
"available": commodities_result["available"],
}
if commodities_result["not_found"]:
field_errors["contract_commodities"] = (
f"not_found={commodities_result['not_found']}"
)
raise errors.GuideError(
f"Contract commodities incomplete: {commodities_result['not_found']}"
)
# Deselect field to enable downstream fields
await deselect_field(commodities_field_selector)
# Supplier contact - type-to-search autocomplete (requires typing to trigger API)
supplier_contact_selector = primary_selector(
ContractFormSelectors.SUPPLIER_CONTACT_FIELD
)
# Wait for field to become available (not disabled)
field_available = await wait_for_field_enabled(
supplier_contact_selector, timeout_ms=3000
)
if not field_available:
selections["supplier_contact"] = {
"missing_element": True,
"not_enabled": True,
}
field_errors["supplier_contact"] = "field_not_available"
_logger.warning(
"Supplier contact field not available: %s",
supplier_contact_selector,
)
else:
await scroll_into_view(supplier_contact_selector)
await page.wait_for_timeout(150)
contact_value = ContractTexts.SUPPLIER_CONTACT
supplier_result = await select_typeahead(
page,
supplier_contact_selector,
contact_value,
min_chars=3,
wait_ms=3000,
)
selections["supplier_contact"] = {
"selected": supplier_result["selected"],
"not_found": supplier_result["not_found"],
"available": supplier_result["available"],
}
_logger.info(
"[FORM-FILL] Supplier contact selected=%s not_found=%s available=%s",
supplier_result["selected"],
supplier_result["not_found"],
supplier_result["available"],
)
if supplier_result["not_found"]:
field_errors["supplier_contact"] = (
f"not_found={supplier_result['not_found']}"
)
# Deselect field to enable downstream fields
await deselect_field(supplier_contact_selector)
# Classification checkbox: preferred is index 2
await ensure_checked(
'#contract-form_business_att-checkbox-2 input[type="checkbox"]'
)
# Entity and Regions - single select like sourcing intake
entity_selector = primary_selector(
ContractFormSelectors.ENTITY_AND_REGIONS_FIELD
)
await scroll_into_view(entity_selector)
await page.wait_for_timeout(150)
entity_result = await select_single(
page,
entity_selector,
ContractTexts.ENTITY_AND_REGIONS,
)
selections["entity_and_regions"] = {
"selected": entity_result["selected"],
"not_found": entity_result["not_found"],
"available": entity_result["available"],
}
if entity_result["not_found"]:
field_errors["entity_and_regions"] = (
f"not_found={entity_result['not_found']}"
)
await deselect_field(entity_selector)
# Renewal Type - single select
renewal_type_selector = primary_selector(
ContractFormSelectors.RENEWAL_TYPE_FIELD
)
await scroll_into_view(renewal_type_selector)
await page.wait_for_timeout(150)
renewal_type_result = await select_single(
page,
renewal_type_selector,
ContractTexts.RENEWAL_TYPE,
)
selections["renewal_type"] = {
"selected": renewal_type_result["selected"],
"not_found": renewal_type_result["not_found"],
"available": renewal_type_result["available"],
}
if renewal_type_result["not_found"]:
field_errors["renewal_type"] = (
f"not_found={renewal_type_result['not_found']}"
)
await deselect_field(renewal_type_selector)
# Currency - single select
currency_selector = primary_selector(ContractFormSelectors.CURRENCY_FIELD)
await scroll_into_view(currency_selector)
await page.wait_for_timeout(150)
currency_result = await select_single(
page,
currency_selector,
ContractTexts.CURRENCY,
)
selections["currency"] = {
"selected": currency_result["selected"],
"not_found": currency_result["not_found"],
"available": currency_result["available"],
}
if currency_result["not_found"]:
field_errors["currency"] = f"not_found={currency_result['not_found']}"
await deselect_field(currency_selector)
# Payment Terms - single select
payment_terms_selector = primary_selector(
ContractFormSelectors.PAYMENT_TERMS_FIELD
)
await scroll_into_view(payment_terms_selector)
await page.wait_for_timeout(150)
payment_terms_result = await select_single(
page,
payment_terms_selector,
ContractTexts.PAYMENT_TERMS,
)
selections["payment_terms"] = {
"selected": payment_terms_result["selected"],
"not_found": payment_terms_result["not_found"],
"available": payment_terms_result["available"],
}
if payment_terms_result["not_found"]:
field_errors["payment_terms"] = (
f"not_found={payment_terms_result['not_found']}"
)
await deselect_field(payment_terms_selector)
# Payment Schedule - single select
payment_schedule_selector = primary_selector(
ContractFormSelectors.PAYMENT_SCHEDULE_FIELD
)
await scroll_into_view(payment_schedule_selector)
await page.wait_for_timeout(150)
payment_schedule_result = await select_single(
page,
payment_schedule_selector,
ContractTexts.PAYMENT_SCHEDULE,
)
selections["payment_schedule"] = {
"selected": payment_schedule_result["selected"],
"not_found": payment_schedule_result["not_found"],
"available": payment_schedule_result["available"],
}
if payment_schedule_result["not_found"]:
field_errors["payment_schedule"] = (
f"not_found={payment_schedule_result['not_found']}"
)
await deselect_field(payment_schedule_selector)
# Business Contact - type-to-search autocomplete
business_contact_selector = primary_selector(
ContractFormSelectors.BUSINESS_CONTACT_FIELD
)
await scroll_into_view(business_contact_selector)
await page.wait_for_timeout(150)
business_contact_result = await select_typeahead(
page,
business_contact_selector,
ContractTexts.BUSINESS_CONTACT,
min_chars=3,
wait_ms=3000,
)
selections["business_contact"] = {
"selected": business_contact_result["selected"],
"not_found": business_contact_result["not_found"],
"available": business_contact_result["available"],
}
if business_contact_result["not_found"]:
field_errors["business_contact"] = (
f"not_found={business_contact_result['not_found']}"
)
await deselect_field(business_contact_selector)
# Managing Department - single select
managing_dept_selector = primary_selector(
ContractFormSelectors.MANAGING_DEPARTMENT_FIELD
)
await scroll_into_view(managing_dept_selector)
await page.wait_for_timeout(150)
managing_dept_result = await select_single(
page,
managing_dept_selector,
ContractTexts.MANAGING_DEPARTMENT,
)
selections["managing_department"] = {
"selected": managing_dept_result["selected"],
"not_found": managing_dept_result["not_found"],
"available": managing_dept_result["available"],
}
if managing_dept_result["not_found"]:
field_errors["managing_department"] = (
f"not_found={managing_dept_result['not_found']}"
)
await deselect_field(managing_dept_selector)
# Funding Department - single select
funding_dept_selector = primary_selector(
ContractFormSelectors.FUNDING_DEPARTMENT_FIELD
)
await scroll_into_view(funding_dept_selector)
await page.wait_for_timeout(150)
funding_dept_result = await select_single(
page,
funding_dept_selector,
ContractTexts.FUNDING_DEPARTMENT,
)
selections["funding_department"] = {
"selected": funding_dept_result["selected"],
"not_found": funding_dept_result["not_found"],
"available": funding_dept_result["available"],
}
if funding_dept_result["not_found"]:
field_errors["funding_department"] = (
f"not_found={funding_dept_result['not_found']}"
)
await deselect_field(funding_dept_selector)
# Dates
await attempt(
"effective_date",
fill_date(
page,
ContractFormSelectors.EFFECTIVE_DATE_FIELD,
ContractTexts.EFFECTIVE_DATE,
),
)
await attempt(
"end_date",
fill_date(
page,
ContractFormSelectors.END_DATE_FIELD,
ContractTexts.END_DATE,
),
)
# Numeric/text inputs
await attempt(
"renewal_increase",
fill_text(
page,
ContractFormSelectors.RENEWAL_INCREASE_FIELD,
ContractTexts.RENEWAL_INCREASE,
),
)
await attempt(
"renewal_alert_days",
fill_text(
page,
ContractFormSelectors.RENEWAL_ALERT_DAYS_FIELD,
ContractTexts.RENEWAL_ALERT_DAYS,
),
)
# Notices not present on this view; skip to avoid selector errors
await attempt(
"total_value",
fill_text(
page,
ContractFormSelectors.TOTAL_VALUE,
ContractTexts.TOTAL_VALUE,
),
)
await attempt(
"budget",
fill_text(
page,
ContractFormSelectors.BUDGET_FIELD,
ContractTexts.BUDGET,
),
)
await attempt(
"project_name",
fill_if_present(
ContractFormSelectors.PROJECT_NAME_FIELD,
ContractTexts.PROJECT_NAME,
),
)
await attempt(
"master_project_name",
fill_if_present(
ContractFormSelectors.MASTER_PROJECT_NAME_FIELD,
ContractTexts.MASTER_PROJECT_NAME,
),
)
await attempt(
"rebate",
fill_text(
page,
ContractFormSelectors.REBATE_FIELD,
ContractTexts.REBATE,
),
)
await attempt(
"saving",
fill_text(
page,
ContractFormSelectors.SAVING_FIELD,
ContractTexts.SAVING,
),
)
await attempt(
"breach_notification",
fill_if_present(
ContractFormSelectors.BREACH_NOTIFICATION_FIELD,
ContractTexts.BREACH_NOTIFICATION,
),
)
# Boolean toggle
if ContractTexts.TERMINATE_FOR_CONVENIENCE:
try:
await page.click(
ContractFormSelectors.TERMINATE_FOR_CONVENIENCE_TOGGLE
)
except Exception as exc: # noqa: BLE001
_logger.warning(
"Failed to toggle terminate_for_convenience: %s", exc
)
await set_input_value(
"business_continuity",
ContractFormSelectors.BUSINESS_CONTINUITY_FIELD,
ContractTexts.BUSINESS_CONTINUITY,
)
await set_input_value(
"customer_data",
ContractFormSelectors.CUSTOMER_DATA_FIELD,
ContractTexts.CUSTOMER_DATA,
)
await set_input_value(
"reseller",
ContractFormSelectors.RESELLER_FIELD,
ContractTexts.RESELLER,
)
return ActionResult(
details={
"message": "Contract form filled",
"selections": selections,
"errors": field_errors or None,
}
)
except Exception as exc: # noqa: BLE001
_logger.exception("Contract form fill failed")
return ActionResult(
details={
"message": "Contract form fill failed",
"error": repr(exc),
"selections": selections,
"errors": field_errors or None,
}
)
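
Note on the single-select fields: entity, renewal type, currency, payment terms, payment schedule, and the two department fields all follow the same steps (select, record `selected`/`not_found`/`available`, record an error when something was not found). A table-driven sketch of that loop follows. `fill_single_selects` and `fake_select_single` are hypothetical; the fake stands in for `select_single`, and the list-valued result shape is an assumption based on how the results are read above. Scrolling and blurring are omitted.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

SelectFn = Callable[[Any, str, str], Awaitable[dict]]


async def fake_select_single(page: Any, selector: str, value: str) -> dict:
    """Hypothetical stand-in for select_single with a fixed option list."""
    available = ["USD", "Net 30"]
    if value in available:
        return {"selected": [value], "not_found": [], "available": available}
    return {"selected": [], "not_found": [value], "available": available}


async def fill_single_selects(
    page: Any,
    fields: list[tuple[str, str, str]],
    select: SelectFn,
) -> tuple[dict, dict]:
    """Run `select` for each (name, selector, value) row and collect results."""
    selections: dict[str, dict] = {}
    field_errors: dict[str, str] = {}
    for name, selector, value in fields:
        result = await select(page, selector, value)
        selections[name] = {
            key: result[key] for key in ("selected", "not_found", "available")
        }
        if result["not_found"]:
            field_errors[name] = f"not_found={result['not_found']}"
    return selections, field_errors
```

Adding a field then means adding one row to the table instead of another twenty-line block.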


@@ -0,0 +1,110 @@
"""Demo actions for POC features and testing PageHelpers patterns.
This module demonstrates how to use the PageHelpers class for high-level
browser interactions with minimal imports.
"""
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.helpers import PageHelpers, AccordionCollapseResult
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.selectors.common import CommonSelectors
@register_action
class CollapseAccordionsDemoAction(DemoAction):
"""Collapse all expanded accordion buttons on the current page.
This action demonstrates the PageHelpers pattern for browser interactions.
It finds all accordion buttons matching the selector and collapses those
that are currently expanded.
Supports optional custom selector and timeout via action parameters.
"""
id: ClassVar[str] = "demo.collapse-accordions"
description: ClassVar[str] = "Collapse all expanded accordion buttons on the page."
category: ClassVar[str] = "demo"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Collapse accordions on the current page.
Parameters:
selector (str, optional): CSS selector for accordion buttons
(default: page header accordion button with data-cy attribute)
timeout_ms (int, optional): Timeout for finding elements in ms
(default: 5000)
Returns:
ActionResult with details including:
- collapsed_count: Number of successfully collapsed buttons
- total_found: Total number of matching buttons found
- failed_indices: Comma-separated string of button indices that failed
- message: Human-readable summary
Note:
Expanded state detection is automatic via SVG icon data-testid.
The method only clicks buttons with KeyboardArrowUpOutlinedIcon.
"""
# Get selector from params or use default
selector: str = _coerce_to_str(
context.params.get("selector"),
CommonSelectors.PAGE_HEADER_ACCORDION,
)
# Get timeout from params or use default
timeout_ms: int = _coerce_to_int(context.params.get("timeout_ms"), 5000)
# Use PageHelpers for the interaction (single import!)
helpers = PageHelpers(page)
result: AccordionCollapseResult = await helpers.collapse_accordions(
selector, timeout_ms
)
# Extract result values
collapsed_count = result["collapsed_count"]
total_found = result["total_found"]
failed_indices_list = result["failed_indices"]
# Format failed indices as comma-separated string
failed_indices_str: str = ",".join(str(idx) for idx in failed_indices_list)
# Format result message
if total_found == 0:
message = f"No accordion buttons found with selector: {selector}"
elif collapsed_count == 0:
message = f"Found {total_found} accordions but failed to collapse any"
else:
message = f"Collapsed {collapsed_count} of {total_found} accordion(s)"
return ActionResult(
details={
"message": message,
"selector": selector,
"collapsed_count": collapsed_count,
"total_found": total_found,
"failed_indices": failed_indices_str,
}
)
def _coerce_to_str(value: object, default: str) -> str:
"""Coerce a value to str, or return default if None."""
return default if value is None else str(value)
def _coerce_to_int(value: object, default: int) -> int:
"""Coerce a value to int; return default if None or of an unsupported type."""
if value is None:
return default
if isinstance(value, int):
return value
if isinstance(value, str):
return int(value)
return int(value) if isinstance(value, float) else default
__all__ = ["CollapseAccordionsDemoAction"]
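
Note on `_coerce_to_int`: a string parameter is passed straight to `int()`, so a request with `timeout_ms="abc"` would raise `ValueError` rather than fall back to the default. A more tolerant variant is sketched below. `coerce_to_int` is a hypothetical name, and rejecting `bool` (a subclass of `int`) is an assumption about the intended behaviour, not something the change states.

```python
def coerce_to_int(value: object, default: int) -> int:
    """Coerce to int, falling back to `default` for anything unusable."""
    # bool is a subclass of int; treat True/False as unsupported input here.
    if value is None or isinstance(value, bool):
        return default
    if isinstance(value, (int, float, str)):
        try:
            return int(value)
        except (ValueError, OverflowError):
            # Non-numeric strings, NaN, and infinities land here.
            return default
    return default
```

Whether a malformed parameter should be ignored or rejected with a validation error is a design choice; the sketch only shows the lenient option.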


@@ -0,0 +1,67 @@
"""Typed dependency injection context for actions.
Provides a type-safe container for action dependencies, improving IDE support
and catching dependency mismatches at development time rather than runtime.
"""
from __future__ import annotations
from dataclasses import asdict, dataclass
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from guide.app.actions.base import ActionRegistry
from guide.app.auth import SessionManager
from guide.app.core.config import AppSettings
from guide.app.models.personas import PersonaResolver, PersonaStore
@dataclass(frozen=True)
class ActionDIContext:
"""Typed container for action dependencies.
This provides type-safe access to commonly injected dependencies.
Actions can declare constructor parameters matching these field names
to receive automatic dependency injection.
Example:
@register_action
class MyAction(DemoAction):
def __init__(self, persona_store: PersonaStore, settings: AppSettings):
self.persona_store = persona_store
self.settings = settings
"""
persona_store: PersonaStore
"""Persona store for user management."""
persona_resolver: PersonaResolver
"""Resolver for looking up personas by ID or email."""
login_url: str
"""Base URL for login actions."""
settings: AppSettings
"""Application settings for configuration access."""
session_manager: SessionManager
"""Session manager for auth session persistence."""
registry: ActionRegistry | None = None
"""Action registry (for CompositeAction child action lookups)."""
def as_dict(self) -> dict[str, object]:
"""Convert to dict for backward compatibility with existing DI system.
Returns:
Dictionary with all non-None dependencies.
"""
result: dict[str, object] = {}
data: dict[str, object] = asdict(self)
for key, value in data.items():
if value is not None:
result[key] = value
return result
__all__ = ["ActionDIContext"]
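
Note on `as_dict`: `dataclasses.asdict` recurses into nested dataclasses and deep-copies other values, so any dependency that is itself a dataclass would come back as a plain dict, and service objects would be copied. A shallow variant built on `dataclasses.fields` is sketched below. `Settings` and `Deps` are hypothetical stand-ins; whether `AppSettings` or the other injected types are dataclasses is not shown in this diff.

```python
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class Settings:
    """Hypothetical nested dataclass dependency."""

    login_url: str


@dataclass(frozen=True)
class Deps:
    """Hypothetical reduced version of ActionDIContext."""

    settings: Settings
    login_url: str
    registry: Optional[object] = None

    def as_dict(self) -> dict[str, object]:
        # Shallow conversion: values are returned as-is, not copied,
        # and fields set to None are left out.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


settings = Settings(login_url="https://example.invalid/login")
deps = Deps(settings=settings, login_url=settings.login_url)
shallow = deps.as_dict()
deep = asdict(deps)
```

Here `shallow["settings"]` is the original `Settings` instance, while `deep["settings"]` is a dict produced by the recursive conversion.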


@@ -0,0 +1,21 @@
"""Diagnostic actions for debugging browser automation issues.

Provides a unified DiagnoseAction that supports multiple diagnostic modes:
- connectivity: Test browser/extension connection
- field: Inspect a specific field's state
- dropdown: Inspect dropdown with options
- form: Diagnose all known form fields
- page: Inspect page structure and selectors
- selectors: Extract all UI element selectors from page

Also provides PingExtensionAction for minimal connectivity tests.

For REST API access, use the /diagnostics/* endpoints instead.
"""

from guide.app.actions.diagnose.ping import PingExtensionAction
from guide.app.actions.diagnose.unified import DiagnoseAction

__all__ = [
    "DiagnoseAction",
    "PingExtensionAction",
]


@@ -0,0 +1,139 @@
"""Diagnostic action to introspect GraphQL schema."""
from typing import ClassVar, cast, override
import httpx
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth.session import extract_bearer_token
from guide.app.browser.types import PageLike
from guide.app.core.config import load_settings
from guide.app.models.domain import ActionContext, ActionResult
_INTROSPECTION_QUERY = """
query IntrospectType($typeName: String!) {
__type(name: $typeName) {
name
kind
description
enumValues {
name
description
}
inputFields {
name
description
type {
name
kind
ofType {
name
kind
ofType {
name
kind
ofType {
name
kind
}
}
}
}
}
fields {
name
description
type {
name
kind
ofType {
name
kind
ofType {
name
kind
ofType {
name
kind
}
}
}
}
args {
name
type {
name
kind
ofType {
name
kind
}
}
}
}
}
}
"""
@register_action
class IntrospectSchemaAction(DemoAction):
"""Introspect GraphQL schema using bearer token from browser session."""
id: ClassVar[str] = "introspect-schema"
description: ClassVar[str] = "Introspect GraphQL schema for a specific type."
category: ClassVar[str] = "diagnose"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Query GraphQL schema for type information."""
# Get type name from params (default to Mutation to discover available mutations)
type_name = str(context.params.get("type_name", "Mutation"))
# Extract token from browser session
token_result = await extract_bearer_token(page)
if not token_result:
return ActionResult(
details={
"error": "No bearer token found in browser session",
"type_name": type_name,
}
)
# Run introspection query against GraphQL endpoint
settings = load_settings()
async with httpx.AsyncClient() as client:
resp = await client.post(
settings.raindrop_graphql_url,
json={
"query": _INTROSPECTION_QUERY,
"variables": {"typeName": type_name},
},
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {token_result.value}",
},
timeout=15.0,
)
data = cast(dict[str, object], resp.json())
# Extract schema data
data_section = data.get("data")
schema_type: object = None
if isinstance(data_section, dict):
data_dict = cast(dict[str, object], data_section)
schema_type = data_dict.get("__type")
errors = data.get("errors")
return ActionResult(
details={
"type_name": type_name,
"token_source": token_result.source_key,
"schema": schema_type,
"errors": errors,
"http_status": resp.status_code,
}
)


@@ -0,0 +1,204 @@
"""Diagnostic script to validate messaging XPath selectors against live page.
Run via: python -m guide.app.actions.diagnose.messaging_selectors
Requires Chrome with Terminator Bridge extension connected to a page
with the messaging UI (e.g., board view with chat panel).
"""
import asyncio
import logging
from guide.app.browser.extension_client import ExtensionClient, ExtensionPage
from guide.app.strings.selectors.messaging import MessagingSelectors
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
_logger = logging.getLogger(__name__)
# XPath selectors to validate
MESSAGING_SELECTORS: dict[str, str] = {
"notification_indicator": MessagingSelectors.NOTIFICATION_INDICATOR,
"modal_wrapper": MessagingSelectors.MODAL_WRAPPER,
"modal_close_button": MessagingSelectors.MODAL_CLOSE_BUTTON,
"chat_messages_container": MessagingSelectors.CHAT_MESSAGES_CONTAINER,
"chat_flyout_button": MessagingSelectors.CHAT_FLYOUT_BUTTON,
"chat_conversations_tab": MessagingSelectors.CHAT_CONVERSATIONS_TAB,
"chat_input": MessagingSelectors.CHAT_INPUT,
"send_button": MessagingSelectors.SEND_BUTTON,
}
async def validate_selector(
page: ExtensionPage, name: str, selector: str
) -> dict[str, object]:
"""Validate a single selector against the page.
Args:
page: ExtensionPage instance
name: Friendly name for the selector
selector: Playwright selector (xpath= or CSS)
Returns:
Dict with validation results
"""
result: dict[str, object] = {
"name": name,
"selector": selector,
"found": False,
"count": 0,
"visible": False,
"error": None,
}
# Handle XPath selectors (Playwright format: xpath=/...)
if selector.startswith("xpath="):
xpath = selector[6:] # Strip "xpath=" prefix
js_code = f"""
(() => {{
try {{
const result = document.evaluate(
'{xpath}',
document,
null,
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
null
);
const count = result.snapshotLength;
if (count === 0) {{
return {{ found: false, count: 0, visible: false }};
}}
const elem = result.snapshotItem(0);
const rect = elem.getBoundingClientRect();
const computed = window.getComputedStyle(elem);
const visible = rect.width > 0 && rect.height > 0 && computed.display !== 'none';
return {{
found: true,
count: count,
visible: visible,
tagName: elem.tagName,
className: elem.className || '',
id: elem.id || null,
rect: {{
top: Math.round(rect.top),
left: Math.round(rect.left),
width: Math.round(rect.width),
height: Math.round(rect.height)
}}
}};
}} catch (e) {{
return {{ found: false, count: 0, visible: false, error: e.message }};
}}
}})();
"""
else:
# CSS selector
js_code = f"""
(() => {{
try {{
const elements = document.querySelectorAll('{selector}');
const count = elements.length;
if (count === 0) {{
return {{ found: false, count: 0, visible: false }};
}}
const elem = elements[0];
const rect = elem.getBoundingClientRect();
const computed = window.getComputedStyle(elem);
const visible = rect.width > 0 && rect.height > 0 && computed.display !== 'none';
return {{
found: true,
count: count,
visible: visible,
tagName: elem.tagName,
className: elem.className || '',
id: elem.id || null,
rect: {{
top: Math.round(rect.top),
left: Math.round(rect.left),
width: Math.round(rect.width),
height: Math.round(rect.height)
}}
}};
}} catch (e) {{
return {{ found: false, count: 0, visible: false, error: e.message }};
}}
}})();
"""
raw_result = await page.evaluate(js_code)
if isinstance(raw_result, dict):
result["found"] = raw_result.get("found", False)
result["count"] = raw_result.get("count", 0)
result["visible"] = raw_result.get("visible", False)
result["error"] = raw_result.get("error")
if raw_result.get("tagName"):
result["tag"] = raw_result.get("tagName")
if raw_result.get("className"):
result["class"] = raw_result.get("className")
if raw_result.get("rect"):
result["rect"] = raw_result.get("rect")
return result
async def run_diagnostics() -> None:
"""Run diagnostics for all messaging selectors."""
_logger.info("Connecting to browser extension...")
async with ExtensionClient() as client:
page = await client.get_page()
_logger.info("Connected to browser")
# Get page info
url = await page.evaluate("window.location.href")
title = await page.evaluate("document.title")
_logger.info(f"Page: {title}")
_logger.info(f"URL: {url}")
_logger.info("-" * 60)
results: list[dict[str, object]] = []
for name, selector in MESSAGING_SELECTORS.items():
result = await validate_selector(page, name, selector)
results.append(result)
status = "✓" if result["found"] else "✗"
visible_status = "(visible)" if result["visible"] else "(hidden)"
count_str = f"[{result['count']}]" if result["count"] else ""
if result["found"]:
_logger.info(f"{status} {name}: FOUND {count_str} {visible_status}")
if result.get("tag"):
class_str = str(result.get("class", ""))[:50]
_logger.info(f" tag: {result['tag']}, class: {class_str}")
else:
_logger.warning(f"{status} {name}: NOT FOUND")
if result.get("error"):
_logger.warning(f" error: {result['error']}")
_logger.info("-" * 60)
# Summary
found_count = sum(bool(r["found"]) for r in results)
visible_count = sum(bool(r["visible"]) for r in results)
_logger.info(
f"Summary: {found_count}/{len(results)} found, {visible_count} visible"
)
# Report missing critical selectors
critical = [
"chat_flyout_button",
"chat_messages_container",
"chat_input",
"send_button",
]
found_names = {r["name"] for r in results if r["found"]}
if missing_critical := [n for n in critical if n not in found_names]:
_logger.warning(
f"Missing critical selectors: {', '.join(missing_critical)}"
)
if __name__ == "__main__":
asyncio.run(run_diagnostics())
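The JavaScript templates in `validate_selector` place the selector between single quotes, so a selector that itself contains a single quote would end the string literal early. One way to avoid this is to let `json.dumps` produce the JavaScript string literal. The sketch below shows the idea for the CSS branch only; `build_count_script` and the sample selector are hypothetical.

```python
import json


def build_count_script(selector: str) -> str:
    """Return JS that counts CSS matches, with the selector safely quoted."""
    # json.dumps emits a double-quoted, fully escaped string literal,
    # which is also a valid JavaScript string literal.
    selector_literal = json.dumps(selector)
    return f"(() => document.querySelectorAll({selector_literal}).length)();"


script = build_count_script("[data-cy='chat-input']")
print(script)
```

The same literal can be passed to `document.evaluate` for the XPath branch, since the quoting problem is identical there.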


@@ -0,0 +1,37 @@
"""Minimal ping action to test extension connectivity."""

from typing import ClassVar, override

from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult


@register_action
class PingExtensionAction(DemoAction):
    """Minimal action to test extension is responding."""

    id: ClassVar[str] = "ping-extension"
    description: ClassVar[str] = "Test extension connectivity with simple eval."
    category: ClassVar[str] = "diagnose"

    @override
    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        """Fetch the page title and URL as a minimal eval round trip."""
        try:
            title = await page.evaluate("document.title")
            url = await page.evaluate("window.location.href")
            return ActionResult(
                details={
                    "connected": True,
                    "title": title,
                    "url": url,
                }
            )
        except Exception as e:
            return ActionResult(
                details={
                    "connected": False,
                    "error": str(e),
                }
            )


@@ -0,0 +1,62 @@
"""Diagnostic action to run arbitrary GraphQL queries."""
from typing import ClassVar, cast, override
import httpx
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth.session import extract_bearer_token
from guide.app.browser.types import PageLike
from guide.app.core.config import load_settings
from guide.app.models.domain import ActionContext, ActionResult
@register_action
class RunGraphQLAction(DemoAction):
"""Run arbitrary GraphQL query using bearer token from browser session."""
id: ClassVar[str] = "run-graphql"
description: ClassVar[str] = "Run a GraphQL query with browser auth token."
category: ClassVar[str] = "diagnose"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute GraphQL query with authentication."""
query = context.params.get("query")
variables = context.params.get("variables", {})
if not query:
return ActionResult(details={"error": "Missing 'query' parameter"})
# Extract token from browser session
token_result = await extract_bearer_token(page)
if not token_result:
return ActionResult(
details={"error": "No bearer token found in browser session"}
)
# Run query against GraphQL endpoint
settings = load_settings()
async with httpx.AsyncClient() as client:
resp = await client.post(
settings.raindrop_graphql_url,
json={
"query": query,
"variables": variables,
},
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {token_result.value}",
},
timeout=30.0,
)
data = cast(dict[str, object], resp.json())
return ActionResult(
details={
"data": data.get("data"),
"errors": data.get("errors"),
"http_status": resp.status_code,
}
)


@@ -0,0 +1,51 @@
"""Diagnostic action to discover auth tokens in localStorage."""
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.auth.session import discover_auth_tokens, extract_bearer_token
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
@register_action
class DiscoverTokensAction(DemoAction):
"""Discover auth-related localStorage keys and extract bearer token."""
id: ClassVar[str] = "discover-tokens"
description: ClassVar[str] = "Discover auth-related localStorage keys."
category: ClassVar[str] = "diagnose"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Scan localStorage for auth tokens and attempt bearer extraction."""
try:
tokens = await discover_auth_tokens(page)
bearer = await extract_bearer_token(page)
masked_tokens: dict[str, str] = {
key: f"{value[:4]}...{value[-4:]}" if len(value) > 12 else "****"
for key, value in tokens.items()
}
return ActionResult(
details={
"discovered_keys": list(tokens.keys()),
"token_count": len(tokens),
"masked_values": masked_tokens,
"bearer_token_found": bearer is not None,
"bearer_source_key": bearer.source_key if bearer else None,
"bearer_preview": (
f"{bearer.value[:8]}...{bearer.value[-4:]}"
if bearer and len(bearer.value) > 16
else None
),
}
)
except Exception as e:
return ActionResult(
details={
"error": str(e),
"token_count": 0,
"bearer_token_found": False,
}
)
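The masking rule used for `masked_values` above (keep the first and last four characters of values longer than 12 characters, hide everything else) can be isolated into a helper and checked on its own. `mask_token` is a hypothetical name; the thresholds are copied from the comprehension in `run`.

```python
def mask_token(value: str, head: int = 4, tail: int = 4, min_len: int = 12) -> str:
    """Show only the ends of a secret; hide short values entirely."""
    if len(value) > min_len:
        return f"{value[:head]}...{value[-tail:]}"
    return "****"


print(mask_token("abcdefghijklmnopqrstuvwxyz"))  # abcd...wxyz
print(mask_token("short"))  # ****
```

Short values are fully hidden because showing eight characters of a twelve-character secret would reveal most of it.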


@@ -0,0 +1,317 @@
"""Unified diagnostic action for troubleshooting browser automation.
Provides a single parameterized action that can run different diagnostic modes:
- connectivity: Test browser/extension connection
- field: Inspect a specific field's state
- dropdown: Inspect dropdown with options
- form: Diagnose all known form fields
- page: Inspect page structure and selectors
- selectors: Extract all UI element selectors from page
"""
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.diagnostics import (
DiagnosticMode,
analyze_field_issues,
extract_ui_elements,
get_selected_chips,
inspect_dropdown,
inspect_field,
inspect_input,
inspect_page_structure,
test_connectivity,
)
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.selectors.intake import IntakeSelectors
# Known field registry - same as API endpoint
KNOWN_FIELDS: dict[str, str] = {
"commodity": IntakeSelectors.COMMODITY_FIELD,
"planned": IntakeSelectors.PLANNED_FIELD,
"regions": IntakeSelectors.REGIONS_FIELD,
"opex_capex": IntakeSelectors.OPEX_CAPEX_FIELD,
"entity": IntakeSelectors.ENTITY_FIELD,
"description": IntakeSelectors.DESCRIPTION_TEXTAREA,
"target_date": IntakeSelectors.TARGET_DATE_FIELD,
"requester": IntakeSelectors.REQUESTER_FIELD,
"owner": IntakeSelectors.ASSIGNED_OWNER_FIELD,
"reseller": IntakeSelectors.RESELLER_TEXTAREA,
}
def resolve_selector(field_or_selector: str) -> str:
"""Resolve field name to selector, or use as-is if already a selector."""
return KNOWN_FIELDS.get(field_or_selector, field_or_selector)
@register_action
class DiagnoseAction(DemoAction):
"""Unified diagnostic action for troubleshooting.
Supports multiple diagnostic modes via the 'mode' param:
- connectivity: Test browser/extension connection
- field: Inspect a specific field (requires 'selector' or 'field' param)
- dropdown: Inspect dropdown with options (requires 'selector' or 'field' param)
- form: Diagnose all known form fields
- page: Inspect page structure and selectors
- selectors: Extract all UI element selectors from page (optional 'filter' param)
Example params:
{"mode": "connectivity"}
{"mode": "field", "field": "commodity"}
{"mode": "dropdown", "selector": "[data-cy='my-field']", "open": true}
{"mode": "form"}
{"mode": "page"}
{"mode": "selectors", "filter": "contract"}
"""
id: ClassVar[str] = "diagnose"
description: ClassVar[str] = (
"Run diagnostics (mode: connectivity|field|dropdown|form|page|selectors)"
)
category: ClassVar[str] = "diagnose"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Run diagnostics based on mode parameter."""
params = context.params or {}
mode_str = str(params.get("mode", "connectivity"))
try:
mode = DiagnosticMode(mode_str)
except ValueError:
valid_modes = [m.value for m in DiagnosticMode]
return ActionResult(
status="error",
error=f"Invalid mode '{mode_str}'. Valid modes: {valid_modes}",
)
if mode == DiagnosticMode.CONNECTIVITY:
return await self._run_connectivity(page)
if mode == DiagnosticMode.FIELD:
return await self._run_field(page, params)
if mode == DiagnosticMode.DROPDOWN:
return await self._run_dropdown(page, params)
if mode == DiagnosticMode.FORM:
return await self._run_form(page)
if mode == DiagnosticMode.PAGE:
return await self._run_page(page)
if mode == DiagnosticMode.SELECTORS:
return await self._run_selectors(page, params)
return ActionResult(status="error", error=f"Unhandled mode: {mode}")
async def _run_connectivity(self, page: PageLike) -> ActionResult:
"""Test browser/extension connectivity."""
result = await test_connectivity(page)
return ActionResult(
details={
"mode": "connectivity",
"connected": result.connected,
"title": result.title,
"url": result.url,
"error": result.error,
}
)
async def _run_field(
self, page: PageLike, params: dict[str, object]
) -> ActionResult:
"""Inspect a specific field."""
field_param = params.get("field") or params.get("selector")
if not field_param:
return ActionResult(
status="error",
error="Field mode requires 'field' or 'selector' param",
)
selector = resolve_selector(str(field_param))
field_info = await inspect_field(page, selector)
input_info = await inspect_input(page, selector)
chips = await get_selected_chips(page, selector)
issues = analyze_field_issues(field_info, input_info, chips)
return ActionResult(
details={
"mode": "field",
"selector": selector,
"field": {
"exists": field_info.exists,
"visible": field_info.visible,
"tag_name": field_info.tag_name,
"error": field_info.error,
},
"input": {
"exists": input_info.exists,
"disabled": input_info.disabled,
"read_only": input_info.read_only,
"value": input_info.value,
"aria_expanded": input_info.aria_expanded,
"error": input_info.error,
},
"chips": {
"count": chips.count,
"values": [c.text for c in chips.chips],
},
"issues": issues,
}
)
async def _run_dropdown(
self, page: PageLike, params: dict[str, object]
) -> ActionResult:
"""Inspect dropdown with options."""
field_param = params.get("field") or params.get("selector")
if not field_param:
return ActionResult(
status="error",
error="Dropdown mode requires 'field' or 'selector' param",
)
selector = resolve_selector(str(field_param))
open_dropdown = bool(params.get("open", True))
result = await inspect_dropdown(page, selector, open_dropdown=open_dropdown)
issues = analyze_field_issues(result.field, result.input_element, result.chips)
options_data: list[dict[str, object]] = []
option_count = 0
if result.listbox_after_click:
option_count = result.listbox_after_click.option_count
options_data = [
{
"index": o.index,
"text": o.text,
"selected": o.aria_selected == "true",
}
for o in result.listbox_after_click.options
]
return ActionResult(
details={
"mode": "dropdown",
"selector": selector,
"field_exists": result.field.exists,
"field_visible": result.field.visible,
"input_disabled": result.input_element.disabled,
"chip_count": result.chips.count,
"chips": [c.text for c in result.chips.chips],
"dropdown_opened": result.click_success,
"click_target": result.click_target,
"option_count": option_count,
"options": options_data,
"component_structure": result.component_structure,
"issues": issues,
}
)
async def _run_form(self, page: PageLike) -> ActionResult:
"""Diagnose all known form fields."""
url_result = await page.evaluate("window.location.href")
url = str(url_result) if url_result else "unknown"
fields: dict[str, dict[str, object]] = {}
for field_name, selector in KNOWN_FIELDS.items():
field_info = await inspect_field(page, selector)
input_info = await inspect_input(page, selector)
chips = await get_selected_chips(page, selector)
issues = analyze_field_issues(field_info, input_info, chips)
fields[field_name] = {
"selector": selector,
"exists": field_info.exists,
"visible": field_info.visible,
"input_disabled": input_info.disabled if input_info.exists else None,
"chip_count": chips.count,
"chips": [c.text for c in chips.chips],
"issues": issues,
}
# Summary
found_count = sum(bool(f.get("exists")) for f in fields.values())
visible_count = sum(bool(f.get("visible")) for f in fields.values())
with_issues = sum(bool(f.get("issues")) for f in fields.values())
return ActionResult(
details={
"mode": "form",
"url": url,
"fields": fields,
"summary": {
"total": len(fields),
"found": found_count,
"visible": visible_count,
"with_issues": with_issues,
},
}
)
async def _run_page(self, page: PageLike) -> ActionResult:
"""Inspect page structure."""
result = await inspect_page_structure(page)
return ActionResult(
details={
"mode": "page",
"url": result.url,
"title": result.title,
"form_exists": result.form_exists,
"total_data_cy_count": result.total_data_cy_count,
"data_cy_elements": result.data_cy_elements,
}
)
async def _run_selectors(
self, page: PageLike, params: dict[str, object]
) -> ActionResult:
"""Extract UI element selectors from page."""
ui_elements = await extract_ui_elements(page)
if not ui_elements:
return ActionResult(
details={
"mode": "selectors",
"ui_elements": {},
"total_elements": 0,
"filtered_elements": 0,
}
)
# Apply filter if provided
filter_str = params.get("filter")
total_count = len(ui_elements)
if filter_str and isinstance(filter_str, str):
# Case-insensitive partial match on element names
filter_upper = filter_str.upper().replace("-", "_")
filtered_elements = {
name: selector
for name, selector in ui_elements.items()
if filter_upper in name.upper()
}
filtered_count = len(filtered_elements)
else:
filtered_elements = ui_elements
filtered_count = total_count
return ActionResult(
details={
"mode": "selectors",
"ui_elements": filtered_elements,
"total_elements": total_count,
"filtered_elements": filtered_count,
"filter_applied": filter_str or None,
}
)


@@ -0,0 +1,15 @@
"""Generic form automation actions.

These actions work with any entity type that has a form schema
(boards, contracts, sourcing events, etc.).
"""

from guide.app.actions.form.smart_fill import (
    InspectFormContextAction,
    SmartFillAction,
)

__all__ = [
    "InspectFormContextAction",
    "SmartFillAction",
]


@@ -0,0 +1,382 @@
"""Smart form filling using schema-DOM reconciliation.
Demonstrates the Semantic Bridge architecture for LLM-driven automation:
1. Fetch board schema from GraphQL
2. Build FormContext by reconciling schema with live DOM
3. Format for LLM consumption
4. Execute fills using dynamically-dispatched helpers
"""
from __future__ import annotations
import logging
from typing import ClassVar, cast, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.core.config import load_settings
from guide.app.browser.context_builder import (
FormContext,
build_form_context,
format_for_llm,
)
from guide.app.browser.elements.dropdown import (
select_combobox,
select_multi,
select_single,
)
from guide.app.browser.elements.dropdown.schema_aware import select_from_schema
from guide.app.browser.elements.inputs import fill_text
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.raindrop.operations.form_schema import (
FieldDef,
SchemaSourceType,
get_form_schema,
)
_logger = logging.getLogger(__name__)
async def _dispatch_fill(
page: PageLike,
selector: str,
value: str | list[str],
helper_name: str,
field_def: FieldDef | None = None,
) -> dict[str, object]:
"""Dispatch to appropriate fill helper based on helper_name.
Args:
page: Browser page instance.
selector: CSS selector for the field.
value: Value(s) to fill.
helper_name: Helper function name from FieldContext.helper.
field_def: Optional FieldDef for schema validation.
Returns:
Dict with selection/fill result.
"""
match helper_name:
case "select_single":
if field_def is not None:
# Use schema-aware selection with validation
result = await select_from_schema(page, selector, str(value), field_def)
return {
"selected": result["selected"],
"not_found": result["not_found"],
"validated": result["validated"],
"validation_warning": result["validation_warning"],
}
result = await select_single(page, selector, str(value))
return {
"selected": result["selected"],
"not_found": result["not_found"],
}
case "select_combobox":
result = await select_combobox(page, selector, str(value))
return {
"selected": result["selected"],
"not_found": result["not_found"],
}
case "select_multi":
values = value if isinstance(value, list) else [value]
result = await select_multi(page, selector, values)
return {
"selected": result["selected"],
"not_found": result["not_found"],
}
case "fill_with_react_events":
success = await fill_text(page, selector, str(value))
return {"filled": success, "value": str(value)}
case _:
_logger.warning(
"Unknown helper '%s', falling back to fill_text", helper_name
)
success = await fill_text(page, selector, str(value))
return {"filled": success, "value": str(value), "fallback": True}
@register_action
class SmartFillAction(DemoAction):
"""Fill form fields using schema-aware semantic matching.
This action demonstrates the Semantic Bridge architecture:
- Reconciles GraphQL schema with live DOM elements
- Generates LLM-consumable context with field metadata
- Dispatches to appropriate helpers based on field type
- Validates choices against schema where applicable
Required params:
board_id (int): Board ID to fetch schema for.
values (dict): Field values to fill, keyed by field_key (e.g., "f19").
Optional params:
graphql_url (str): Override GraphQL endpoint.
container_selector (str): Container to search for fields (default: "body").
dry_run (bool): If True, only build context without filling (for inspection).
"""
id: ClassVar[str] = "smart-fill"
description: ClassVar[str] = "Fill form using schema-DOM semantic bridge"
category: ClassVar[str] = "form"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
params = context.params
# Extract parameters
board_id = params.get("board_id")
if not isinstance(board_id, int):
return ActionResult(
status="error",
details={"error": "board_id (int) is required"},
)
raw_values = params.get("values", {})
if not isinstance(raw_values, dict):
return ActionResult(
status="error",
details={"error": "values must be a dict keyed by field_key"},
)
# Cast to typed dict for iteration
values_to_fill = cast(dict[str, str | list[str]], raw_values)
dry_run = bool(params.get("dry_run", False))
container_selector = str(params.get("container_selector", "body"))
# Get bearer token from page localStorage
bearer_token = await self._extract_bearer_token(page)
if not bearer_token:
return ActionResult(
status="error",
details={"error": "No bearer token found in page localStorage"},
)
# Determine GraphQL URL (from params or config default)
settings = load_settings()
graphql_url = str(params.get("graphql_url", settings.raindrop_graphql_url))
# 1. Fetch board schema from GraphQL
_logger.info("[SmartFill] Fetching schema for board_id=%d", board_id)
try:
schema = await get_form_schema(
graphql_url=graphql_url,
bearer_token=bearer_token,
source_type=SchemaSourceType.BOARD,
entity_key=board_id,
)
except ValueError as exc:
return ActionResult(
status="error",
details={"error": f"Failed to fetch schema: {exc}"},
)
# 2. Build semantic bridge (FormContext)
_logger.info("[SmartFill] Building form context from schema + DOM")
form_context = await build_form_context(page, schema, container_selector)
# 3. Format for LLM consumption
llm_context = format_for_llm(form_context)
# Log context for debugging
_logger.info(
"[SmartFill] Form context built: %d fields matched, %d unmatched schema, %d unmatched DOM",
len(form_context.fields),
len(form_context.unmatched_schema_fields),
len(form_context.unmatched_dom_fields),
)
if dry_run:
return ActionResult(
status="ok",
details={
"mode": "dry_run",
"schema_entity": schema.entity_name,
"llm_context": llm_context,
"unmatched_schema_fields": list(
form_context.unmatched_schema_fields
),
"unmatched_dom_fields": list(form_context.unmatched_dom_fields),
},
)
# 4. Execute fills
fill_results: dict[str, dict[str, object]] = {}
fill_errors: dict[str, str] = {}
for field_key, target_value in values_to_fill.items():
if field_key not in form_context.fields:
fill_errors[field_key] = "field_not_found_in_context"
continue
field_ctx = form_context.fields[field_key]
if not field_ctx.dom_selector:
fill_errors[field_key] = "no_dom_selector_for_field"
continue
if field_ctx.is_disabled:
fill_errors[field_key] = "field_is_disabled"
continue
# Get field def for schema validation (if available)
field_def = schema.get_field(field_key)
try:
_logger.info(
"[SmartFill] Filling %s (%s) with %s using %s",
field_key,
field_ctx.label,
target_value,
field_ctx.helper,
)
result = await _dispatch_fill(
page=page,
selector=field_ctx.dom_selector,
value=target_value,
helper_name=field_ctx.helper,
field_def=field_def,
)
fill_results[field_key] = result
except Exception as exc:
_logger.exception("Failed to fill %s", field_key)
fill_errors[field_key] = str(exc)
# Return ok with partial failures in details (ActionResult only allows ok/error)
# We return ok to indicate the action ran; check fields_failed for partial issues
return ActionResult(
status="ok",
details={
"schema_entity": schema.entity_name,
"fields_requested": list(values_to_fill.keys()),
"fields_filled": fill_results,
"fields_failed": fill_errors or None,
"partial_failure": bool(fill_errors),
"llm_context": llm_context,
},
)
async def _extract_bearer_token(self, page: PageLike) -> str | None:
"""Extract bearer token from page localStorage."""
try:
result = await page.evaluate(
"""
(() => {
// Try common storage keys for Raindrop auth
const keys = [
'access_token',
'auth_token',
'token',
'rd_access_token',
];
for (const key of keys) {
const val = localStorage.getItem(key);
if (val) return val;
}
// Try parsing auth state
const authState = localStorage.getItem('auth');
if (authState) {
try {
const parsed = JSON.parse(authState);
return parsed.access_token || parsed.token || null;
} catch (e) {}
}
return null;
})();
"""
)
return str(result) if isinstance(result, str) else None
except Exception:
return None
@register_action
class InspectFormContextAction(DemoAction):
"""Inspect form context without filling any fields.
Useful for debugging and understanding form structure.
Required params:
board_id (int): Board ID to fetch schema for.
Optional params:
graphql_url (str): Override GraphQL endpoint.
container_selector (str): Container to search for fields (default: "body").
"""
id: ClassVar[str] = "inspect-form-context"
description: ClassVar[str] = "Inspect form schema-DOM context for debugging"
category: ClassVar[str] = "diagnose"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
# Delegate to SmartFillAction with dry_run=True
params = dict(context.params)
params["dry_run"] = True
params["values"] = {}
smart_fill_context = ActionContext(
action_id=context.action_id,
persona_id=context.persona_id,
browser_host_id=context.browser_host_id,
params=params,
correlation_id=context.correlation_id,
shared_state=dict(context.shared_state),
)
smart_fill = SmartFillAction()
return await smart_fill.run(page, smart_fill_context)
def _build_llm_prompt(form_context: FormContext, user_intent: str) -> str:
"""Build LLM prompt for value generation (future use).
Args:
form_context: Merged schema + DOM context.
user_intent: Natural language description of what user wants.
Returns:
Formatted prompt for LLM.
"""
llm_data = format_for_llm(form_context)
fields_desc: list[str] = []
for key, field_info in llm_data.items():
label = str(field_info.get("label", ""))
ui_type = str(field_info.get("ui_type", ""))
desc = f"- {key} ({label}): {ui_type}"
allowed_values = field_info.get("allowed_values")
if allowed_values is not None and isinstance(allowed_values, list):
desc += f" - choices: {allowed_values}"
if field_info.get("is_required"):
desc += " [REQUIRED]"
fields_desc.append(desc)
return f"""You are filling a form for: {form_context.entity_name}
User intent: {user_intent}
Available fields:
{chr(10).join(fields_desc)}
Respond with a JSON object mapping field_key to value. Only include fields you want to fill.
For menu fields, use exact choice text. For multi-select, use a list.
Example response:
{{"f19": "Active", "f20": "Hardware Services"}}
"""
# Keep _build_llm_prompt available for future LLM integration
_ = _build_llm_prompt # Suppress unused warning
__all__ = [
"SmartFillAction",
"InspectFormContextAction",
]


@@ -1,3 +1,9 @@
-from guide.app.actions.intake.basic import FillIntakeBasicAction
+from guide.app.actions.intake.sourcing_request import (
+    FillIntakeBasicAction,
+    FillSourcingRequestAction,
+)
 
-__all__ = ["FillIntakeBasicAction"]
+__all__ = [
+    "FillIntakeBasicAction",
+    "FillSourcingRequestAction",
+]


@@ -1,23 +0,0 @@
from playwright.async_api import Page
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.registry import app_strings
@register_action
class FillIntakeBasicAction(DemoAction):
id: ClassVar[str] = "fill-intake-basic"
description: ClassVar[str] = (
"Fill the intake description and advance to the next step."
)
category: ClassVar[str] = "intake"
@override
async def run(self, page: Page, context: ActionContext) -> ActionResult:
description_val = app_strings.intake.conveyor_belt_request
await page.fill(app_strings.intake.description_field, description_val)
await page.click(app_strings.intake.next_button)
return ActionResult(details={"message": "Intake filled"})


@@ -0,0 +1,132 @@
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.elements.dropdown import (
select_combobox,
select_multi,
select_single,
)
from guide.app.browser.elements.inputs import fill_date, fill_textarea
from guide.app.browser.elements.mui import DropdownResult
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.demo_texts.intake import IntakeTexts
from guide.app.strings.selectors.intake import IntakeSelectors
@register_action
class FillIntakeBasicAction(DemoAction):
id: ClassVar[str] = "fill-intake-basic"
description: ClassVar[str] = (
"Fill the intake description and advance to the next step."
)
category: ClassVar[str] = "intake"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
description_val = IntakeTexts.CONVEYOR_BELT_REQUEST
await page.fill(IntakeSelectors.DESCRIPTION_FIELD, description_val)
await page.click(IntakeSelectors.NEXT_BUTTON)
return ActionResult(details={"message": "Intake filled"})
@register_action
class FillSourcingRequestAction(DemoAction):
"""Fill the complete sourcing request intake form with demo data.
A PageHelpers instance is created but not yet used; it is reserved for wait utilities and diagnostics capture.
"""
id: ClassVar[str] = "fill-sourcing-request"
description: ClassVar[str] = (
"Fill the complete sourcing request intake form with demo data."
)
category: ClassVar[str] = "intake"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Fill the sourcing request form with demo data.
Handles multi-select commodities, regions, and all other form fields.
This method does not itself call wait_for_network_idle() between dropdown selections; stabilization is left to the element helpers.
"""
_ = PageHelpers(page) # Reserved for future use
results: dict[str, DropdownResult] = {
"commodities": await select_multi(
page,
IntakeSelectors.COMMODITY_FIELD,
list(IntakeTexts.COMMODITY_REQUEST),
)
}
# Planned (single)
results["planned"] = await select_single(
page,
IntakeSelectors.PLANNED_FIELD,
IntakeTexts.PLANNED_REQUEST,
)
# Regions (multi)
results["regions"] = await select_multi(
page,
IntakeSelectors.REGIONS_FIELD,
list(IntakeTexts.REGIONS_REQUEST),
)
# OpEx/CapEx (combobox)
results["opex_capex"] = await select_combobox(
page,
IntakeSelectors.OPEX_CAPEX_FIELD,
IntakeTexts.OPEX_CAPEX_REQUEST,
)
# Target date
await fill_date(
page,
IntakeSelectors.TARGET_DATE_FIELD,
IntakeTexts.TARGET_DATE_REQUEST,
)
# Text areas
await fill_textarea(
page,
IntakeSelectors.DESCRIPTION_TEXTAREA,
IntakeTexts.DESCRIPTION_REQUEST,
)
await fill_textarea(
page,
IntakeSelectors.DESIRED_SUPPLIER_NAME_TEXTAREA,
IntakeTexts.DESIRED_SUPPLIER_NAME_REQUEST,
)
await fill_textarea(
page,
IntakeSelectors.DESIRED_SUPPLIER_CONTACT_TEXTAREA,
IntakeTexts.DESIRED_SUPPLIER_CONTACT_REQUEST,
)
await fill_textarea(
page,
IntakeSelectors.RESELLER_TEXTAREA,
IntakeTexts.RESELLER_REQUEST,
)
# Entity (autocomplete single-select)
results["entity"] = await select_single(
page,
IntakeSelectors.ENTITY_FIELD,
IntakeTexts.ENTITY_REQUEST,
)
return ActionResult(
details={
"message": "Sourcing request form filled",
"selection_results": {
k: {
"selected": v["selected"],
"not_found": v["not_found"],
"available": v.get("available", []),
}
for k, v in results.items()
},
}
)
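
Reviewer sketch: the `selection_results` comprehension above can be checked without a browser. `DropdownResult` below is a hypothetical stand-in for the type imported from `guide.app.browser.elements.mui`; its field types are assumed to be lists of strings, and `fields_with_missing_choices` is an illustrative helper that is not in the commit.

```python
from typing import TypedDict


class DropdownResult(TypedDict, total=False):
    """Hypothetical stand-in; the real type lives in guide.app.browser.elements.mui."""

    selected: list[str]
    not_found: list[str]
    available: list[str]


def summarize(results: dict[str, DropdownResult]) -> dict[str, dict[str, list[str]]]:
    """Mirror the dict comprehension used to build selection_results."""
    return {
        name: {
            "selected": result["selected"],
            "not_found": result["not_found"],
            # "available" is optional in the action, hence .get with a default.
            "available": result.get("available", []),
        }
        for name, result in results.items()
    }


def fields_with_missing_choices(results: dict[str, DropdownResult]) -> list[str]:
    """Names of fields where at least one requested choice was not found."""
    return [name for name, result in results.items() if result["not_found"]]


# Hypothetical demo data.
demo_results: dict[str, DropdownResult] = {
    "commodities": {"selected": ["Steel"], "not_found": ["Unobtainium"]},
    "regions": {"selected": ["EMEA"], "not_found": [], "available": ["EMEA", "APAC"]},
}
summary = summarize(demo_results)
missing = fields_with_missing_choices(demo_results)
```

Because the action always returns a success result, a caller that cares about partial fills would need a check along these lines on `not_found`.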


@@ -0,0 +1,5 @@
"""Messaging actions for chat panel interactions."""
from guide.app.actions.messaging.respond import RespondToMessageAction
__all__ = ["RespondToMessageAction"]


@@ -0,0 +1,225 @@
"""Message response action for chat panel interactions."""
from typing import ClassVar, override
from guide.app import errors
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.selectors.messaging import MessagingSelectors
@register_action
class RespondToMessageAction(DemoAction):
"""Open chat panel and send a message.
Handles the full messaging flow including:
1. Dismissing any blocking modals (common after email URL login)
2. Expanding the chat flyout if not visible
3. Switching to conversations tab
4. Typing and sending the message
Visibility checks are performed between each step to ensure the chat
panel remains accessible throughout the flow.
Request params:
message: str - Message text to send (required)
"""
id: ClassVar[str] = "messaging.respond"
description: ClassVar[str] = "Open chat panel, type message, and send."
category: ClassVar[str] = "messaging"
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute message response flow.
Args:
page: The browser page instance.
context: Action context with params.
Returns:
ActionResult with message_sent status and details.
Raises:
ActionExecutionError: If message param missing or chat elements not found.
"""
message = context.params.get("message")
if not message or not isinstance(message, str):
raise errors.ActionExecutionError(
"'message' param is required",
details={"provided_params": list(context.params.keys())},
)
helpers = PageHelpers(page)
# 1. Wait for page stability
await helpers.wait_for_stable()
# 2. Dismiss blocking modal if present (common after email URL login)
modal_dismissed = await self._dismiss_modal_if_present(page, helpers)
# 3. Ensure chat panel is visible (expand flyout + switch to conversations)
chat_expanded = await self._ensure_chat_visible(page, helpers)
# 4. Verify chat is still visible before typing
await self._verify_chat_visible(page, helpers, step="before_typing")
# 5. Type message into chat input
await page.fill(MessagingSelectors.CHAT_INPUT, message)
# 6. Verify chat is still visible before sending
await self._verify_chat_visible(page, helpers, step="before_send")
# 7. Send message
await page.click(MessagingSelectors.SEND_BUTTON)
# 8. Wait for network activity to settle
await helpers.wait_for_network_idle()
# 9. Final verification that chat is still visible
await self._verify_chat_visible(page, helpers, step="after_send")
return ActionResult(
details={
"message_sent": True,
"message_length": len(message),
"modal_dismissed": modal_dismissed,
"chat_expanded": chat_expanded,
}
)
async def _dismiss_modal_if_present(
self, page: PageLike, helpers: PageHelpers
) -> bool:
"""Dismiss blocking modal if present.
After logging in via emailed URL, a modal may appear that blocks
access to the page. This method detects and dismisses it.
Args:
page: The browser page instance.
helpers: PageHelpers instance for wait utilities.
Returns:
True if modal was dismissed, False otherwise.
"""
modal = page.locator(MessagingSelectors.MODAL_WRAPPER)
modal_count = await modal.count()
if modal_count > 0:
close_button = page.locator(MessagingSelectors.MODAL_CLOSE_BUTTON)
close_count = await close_button.count()
if close_count > 0:
await close_button.first.click()
# Wait for modal to close
_ = await page.wait_for_selector(
MessagingSelectors.MODAL_WRAPPER,
state="hidden",
timeout=5000,
)
# Wait for page to stabilize after modal close
await helpers.wait_for_stable()
return True
return False
async def _ensure_chat_visible(self, page: PageLike, helpers: PageHelpers) -> bool:
"""Ensure chat panel is visible, expanding if needed.
Checks whether the chat messages container is present in the DOM (count() > 0),
which is used as a proxy for visibility. If it is absent, clicks the
flyout button and switches to the conversations tab.
Args:
page: The browser page instance.
helpers: PageHelpers instance for wait utilities.
Returns:
True if chat was expanded, False if already visible.
"""
chat_container = page.locator(MessagingSelectors.CHAT_MESSAGES_CONTAINER)
container_count = await chat_container.count()
if container_count > 0:
# Chat already visible
return False
# Click flyout button to expand
flyout_button = page.locator(MessagingSelectors.CHAT_FLYOUT_BUTTON)
flyout_count = await flyout_button.count()
if flyout_count == 0:
raise errors.ActionExecutionError(
"Chat flyout button not found",
details={"selector": MessagingSelectors.CHAT_FLYOUT_BUTTON},
)
await flyout_button.first.click()
await helpers.wait_for_stable()
# Verify flyout expanded before proceeding
await self._verify_chat_visible(page, helpers, step="after_flyout_click")
# Switch to conversations tab
conversations_tab = page.locator(MessagingSelectors.CHAT_CONVERSATIONS_TAB)
tab_count = await conversations_tab.count()
if tab_count > 0:
await conversations_tab.first.click()
await helpers.wait_for_stable()
# Verify still visible after tab switch
await self._verify_chat_visible(page, helpers, step="after_tab_switch")
return True
async def _verify_chat_visible(
self, page: PageLike, helpers: PageHelpers, step: str
) -> None:
"""Verify chat panel is visible at a given step.
Checks that the chat messages container is present in the DOM (count() > 0; visibility itself is not tested).
If it is absent, attempts to re-expand the flyout once before failing.
Args:
page: The browser page instance.
helpers: PageHelpers instance for wait utilities.
step: Name of the current step (for error reporting).
Raises:
ActionExecutionError: If chat panel cannot be made visible.
"""
chat_container = page.locator(MessagingSelectors.CHAT_MESSAGES_CONTAINER)
container_count = await chat_container.count()
if container_count > 0:
# Chat is visible
return
# Chat not visible - attempt recovery by clicking flyout button
flyout_button = page.locator(MessagingSelectors.CHAT_FLYOUT_BUTTON)
flyout_count = await flyout_button.count()
if flyout_count > 0:
await flyout_button.first.click()
await helpers.wait_for_stable()
# Check again after recovery attempt
container_count = await chat_container.count()
if container_count > 0:
return
# Recovery failed - raise error with step context
raise errors.ActionExecutionError(
f"Chat panel not visible at step '{step}'",
details={
"step": step,
"container_selector": MessagingSelectors.CHAT_MESSAGES_CONTAINER,
"flyout_selector": MessagingSelectors.CHAT_FLYOUT_BUTTON,
},
)
__all__ = ["RespondToMessageAction"]
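
Reviewer sketch: `_verify_chat_visible` follows a "check, recover once, re-check, then fail" pattern. The toy model below reproduces that control flow without Playwright. `FakePanel` and `ChatNotVisibleError` are hypothetical stand-ins, and the sketch simplifies by assuming the flyout button always exists.

```python
import asyncio


class ChatNotVisibleError(RuntimeError):
    """Hypothetical stand-in for errors.ActionExecutionError."""


class FakePanel:
    """Toy page: the chat container appears only after enough flyout clicks."""

    def __init__(self, clicks_needed: int) -> None:
        self.clicks_needed = clicks_needed
        self.clicks = 0

    async def container_count(self) -> int:
        return 1 if self.clicks >= self.clicks_needed else 0

    async def click_flyout(self) -> None:
        self.clicks += 1


async def verify_chat_visible(panel: FakePanel, step: str) -> None:
    if await panel.container_count() > 0:
        return  # already present, nothing to do
    await panel.click_flyout()  # single recovery attempt
    if await panel.container_count() > 0:
        return
    raise ChatNotVisibleError(f"Chat panel not visible at step '{step}'")


async def demo() -> tuple[int, int, str]:
    visible = FakePanel(clicks_needed=0)
    await verify_chat_visible(visible, "before_typing")
    collapsed = FakePanel(clicks_needed=1)
    await verify_chat_visible(collapsed, "before_send")
    stuck = FakePanel(clicks_needed=2)
    try:
        await verify_chat_visible(stuck, "after_send")
        outcome = "ok"
    except ChatNotVisibleError as exc:
        outcome = str(exc)
    return visible.clicks, collapsed.clicks, outcome


demo_result = asyncio.run(demo())
```

Limiting recovery to one click keeps the failure mode bounded: a panel that does not appear after a single re-expand surfaces as an error tagged with the step name rather than as a retry loop.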


@@ -11,6 +11,9 @@ Example flows:
from typing import ClassVar, override
from guide.app.actions.base import CompositeAction, register_action
from guide.app.actions.playbooks.email_notification import (
EmailNotificationResponsePlaybook,
)
from guide.app.models.domain import ActionResult
@@ -98,4 +101,8 @@ class FullDemoFlowAction(CompositeAction):
self.context.shared_state["suppliers"] = suppliers
-__all__ = ["OnboardingFlowAction", "FullDemoFlowAction"]
+__all__ = [
+    "OnboardingFlowAction",
+    "FullDemoFlowAction",
+    "EmailNotificationResponsePlaybook",
+]


@@ -0,0 +1,585 @@
"""Email notification response playbook.
Receives n8n webhook payload for message notifications and executes:
1. Navigate directly to conversation URL (access_url)
2. Check if redirected to login (session invalid) or logged in as the wrong user
3. If so, authenticate via OTP and retry access_url
4. Dismiss any blocking modals
5. Ensure the chat panel is visible
6. Parse message context for LLM reply generation
7. Generate and submit the reply
"""
from typing import ClassVar, override
from urllib.parse import parse_qs, urlparse
from loguru import logger
from guide.app import errors
from guide.app.actions.base import ActionRegistry, DemoAction, register_action
from guide.app.auth.session import detect_current_persona
from guide.app.browser.elements._type_guards import (
get_str_from_dict,
is_dict_str_object,
is_list_of_objects,
)
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike
from guide.app.core.config import AppSettings
from guide.app.models.domain import ActionContext, ActionResult
from guide.app.strings.selectors.messaging import MessagingSelectors
from guide.app.utils.llm import ChatMessage, LLMClient
@register_action
class EmailNotificationResponsePlaybook(DemoAction):
"""Playbook for responding to message notification emails via n8n.
Request params (from n8n):
user_email: str - Email of user to authenticate as (required)
access_url: str - Conversation URL from notification email (required)
callback_base_url: str - Base URL for OTP callback (optional)
Flow:
1. Navigate directly to access_url
2. Check if redirected to /login (session invalid) or logged in as the wrong user
3. If so, authenticate via OTP then retry access_url
4. Dismiss blocking modal if present
5. Ensure the chat panel is visible
6. Parse message context for LLM reply generation
7. Generate and submit the reply
Browser host:
- Uses browserless-cdp (pass browser_host_id in request)
"""
id: ClassVar[str] = "playbook.email_notification_response"
description: ClassVar[str] = "Respond to message notification: navigate + auth if needed + reply"
category: ClassVar[str] = "playbooks"
_registry: ActionRegistry
_settings: AppSettings
def __init__(
self,
registry: ActionRegistry,
settings: AppSettings,
) -> None:
"""Initialize playbook with dependencies."""
self._registry = registry
self._settings = settings
@override
async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
"""Execute email notification response playbook."""
# Validate required params upfront
user_email = context.params.get("user_email")
if not user_email or not isinstance(user_email, str):
raise errors.ActionExecutionError(
"'user_email' param is required",
details={"provided_params": list(context.params.keys())},
)
access_url = context.params.get("access_url")
if not access_url or not isinstance(access_url, str):
raise errors.ActionExecutionError(
"'access_url' param is required",
details={"provided_params": list(context.params.keys())},
)
helpers = PageHelpers(page)
auth_was_required = False
# Step 1: Navigate directly to access_url
logger.info("Step 1: Navigating directly to: {}", access_url)
_ = await page.goto(access_url, wait_until="networkidle")
await helpers.wait_for_stable()
# Step 2: Check if we need to authenticate
# - Login page shown (no session)
# - Wrong user logged in (need to switch)
current_url = page.url
title_result = await page.evaluate("document.title")
page_title = str(title_result) if title_result else ""
on_login_page = "/login" in current_url or "login" in page_title.lower()
# Check current logged-in user via localStorage
current_user = await detect_current_persona(page)
wrong_user = bool(current_user and current_user.lower() != user_email.lower())
needs_auth = on_login_page or wrong_user
if needs_auth:
if wrong_user:
logger.info(
"Step 2: Wrong user logged in (current={}, need={})",
current_user, user_email
)
else:
logger.info("Step 2: Login required (title={})", page_title)
# Step 3: Authenticate via OTP (handles logout if needed)
logger.info("Step 3: Authenticating as {}", user_email)
auth_result = await self._authenticate(
page, context, user_email, switch_user=wrong_user
)
if auth_result.status == "error":
return auth_result
auth_was_required = True
# Retry navigation to access_url after auth
logger.info("Step 3b: Retrying navigation to: {}", access_url)
_ = await page.goto(access_url, wait_until="networkidle")
await helpers.wait_for_stable()
else:
logger.info("Step 2: Correct user already logged in ({})", current_user)
# Step 4: Dismiss blocking modal if present
logger.info("Step 4: Checking for blocking modal")
modal_dismissed = await self._dismiss_modal_if_present(page, helpers)
# Step 5: Ensure chat panel is visible
logger.info("Step 5: Ensuring chat panel is visible")
chat_expanded = await self._ensure_chat_visible(page, helpers)
# Step 6: Parse message context for LLM reply generation
logger.info("Step 6: Parsing message context")
message_context = await self._parse_message_context(page)
# Step 7: Generate and submit reply
logger.info("Step 7: Generating and submitting reply")
reply_result = await self._generate_and_send_reply(
page, helpers, message_context, access_url
)
if reply_result.status == "error":
return reply_result
# Merge reply details into final result
reply_details = reply_result.details or {}
message_verified = reply_details.get("message_verified", False)
return ActionResult(
details={
"user_email": user_email,
"access_url": access_url,
"auth_was_required": auth_was_required,
"modal_dismissed": modal_dismissed,
"chat_expanded": chat_expanded,
"messages_found": message_context.get("message_count", 0),
"generated_reply": reply_details.get("generated_reply"),
"text_entered": reply_details.get("text_entered"),
"message_verified": message_verified,
}
)
async def _authenticate(
self,
page: PageLike,
context: ActionContext,
email: str,
*,
switch_user: bool = False,
) -> ActionResult:
"""Authenticate user via request_otp action."""
try:
otp_action = self._registry.get("auth.request_otp")
except errors.ActionExecutionError:
return ActionResult(
status="error",
error="auth.request_otp action not available",
)
otp_context = ActionContext(
action_id="auth.request_otp",
persona_id=context.persona_id,
browser_host_id=context.browser_host_id,
params={
"email": email,
"callback_base_url": context.params.get(
"callback_base_url", self._settings.callback_base_url
),
"switch_user": switch_user,
},
)
return await otp_action.run(page, otp_context)
async def _dismiss_modal_if_present(
self,
page: PageLike,
helpers: PageHelpers,
) -> bool:
"""Dismiss blocking modals if present after navigation."""
dismissed = False
# Check for board item editor dialog first (common blocker)
board_dialog = page.locator(MessagingSelectors.BOARD_ITEM_DIALOG)
if await board_dialog.count() > 0:
logger.info("Board item editor dialog detected, attempting to close")
# Click the close button
close_btn = page.locator(MessagingSelectors.BOARD_ITEM_DIALOG_CLOSE)
try:
if await close_btn.count() > 0:
await close_btn.click()
await helpers.wait_for_stable()
# Wait for dialog to close
try:
_ = await page.wait_for_selector(
MessagingSelectors.BOARD_ITEM_DIALOG,
state="hidden",
timeout=3000,
)
dismissed = True
logger.info("Board item editor dialog closed via close button")
except Exception:
logger.warning("Dialog still visible after clicking close")
else:
logger.warning("Close button not found for board item dialog")
except Exception as exc:
logger.warning("Failed to close board item dialog: {}", exc)
# Check for modal wrapper
modal = page.locator(MessagingSelectors.MODAL_WRAPPER)
modal_count = await modal.count()
if modal_count > 0:
close_button = page.locator(MessagingSelectors.MODAL_CLOSE_BUTTON)
close_count = await close_button.count()
if close_count > 0:
logger.info("Dismissing blocking modal")
await close_button.first.click()
_ = await page.wait_for_selector(
MessagingSelectors.MODAL_WRAPPER,
state="hidden",
timeout=5000,
)
await helpers.wait_for_stable()
dismissed = True
return dismissed
async def _ensure_chat_visible(
self,
page: PageLike,
helpers: PageHelpers,
) -> bool:
"""Ensure chat panel is visible, expanding flyout if needed."""
chat_container = page.locator(MessagingSelectors.CHAT_MESSAGES_CONTAINER)
container_count = await chat_container.count()
if container_count > 0:
logger.info("Chat panel already visible")
return False
# Click flyout button to expand
logger.info("Chat not visible, clicking flyout button")
flyout_button = page.locator(MessagingSelectors.CHAT_FLYOUT_BUTTON)
flyout_count = await flyout_button.count()
if flyout_count == 0:
logger.warning("Chat flyout button not found")
return False
await flyout_button.first.click()
await helpers.wait_for_stable()
# Switch to conversations tab if present
conversations_tab = page.locator(MessagingSelectors.CHAT_CONVERSATIONS_TAB)
tab_count = await conversations_tab.count()
if tab_count > 0:
logger.info("Switching to conversations tab")
await conversations_tab.first.click()
await helpers.wait_for_stable()
return True
async def _parse_message_context(self, page: PageLike) -> dict[str, object]:
"""Parse the message panel to extract conversation context.
Extracts the previous message(s) for LLM context.
Uses the virtualized list structure:
#chat-messages-container > div > div > div:nth-child(n) > div
"""
# Check if chat panel is visible
chat_container = page.locator(MessagingSelectors.CHAT_MESSAGES_CONTAINER)
container_visible = await chat_container.count() > 0
if not container_visible:
logger.warning("Chat messages container not visible")
return {"error": "chat_not_visible", "messages": []}
# Wait for messages to render in the virtualized list
message_selector = MessagingSelectors.CHAT_MESSAGE_ITEM
message_locator = page.locator(message_selector)
try:
# Give virtualized list time to render (up to 5s)
await message_locator.first.wait_for(state="visible", timeout=5000)
logger.info("Messages rendered in chat container")
except Exception:
logger.warning("No messages rendered after waiting")
# Extract messages from chat container
messages: list[dict[str, str]] = []
try:
message_elements = await page.evaluate(
"""
(selector) => {
const container = document.querySelector('#chat-messages-container');
if (!container) return { error: 'container_not_found' };
const messageNodes = document.querySelectorAll(selector);
const messages = [];
messageNodes.forEach((node) => {
const text = node.textContent?.trim() || '';
if (text) {
messages.push({
sender: 'unknown',
text: text.slice(0, 500),
});
}
});
return {
messageCount: messageNodes.length,
messages: messages.slice(-10),
};
}
""",
message_selector,
)
# Parse the response
if is_dict_str_object(message_elements):
raw_messages = message_elements.get("messages")
if is_list_of_objects(raw_messages):
for elem in raw_messages:
if is_dict_str_object(elem):
text = get_str_from_dict(elem, "text", "")
if text:
messages.append({
"sender": get_str_from_dict(elem, "sender", "unknown"),
"text": text,
})
return {
"chat_visible": container_visible,
"messages": messages,
"message_count": len(messages),
}
except Exception as exc:
logger.warning("Failed to extract messages: {}", exc)
return {
"chat_visible": container_visible,
"messages": messages,
"message_count": len(messages),
}
async def _generate_and_send_reply(
self,
page: PageLike,
helpers: PageHelpers,
message_context: dict[str, object],
access_url: str,
) -> ActionResult:
"""Generate reply via LLM and send it.
Args:
page: Browser page.
helpers: PageHelpers instance.
message_context: Parsed message context from chat panel.
access_url: Original access URL (contains board ID).
Returns:
ActionResult with reply details.
"""
# Extract messages from context
raw_messages = message_context.get("messages")
messages: list[dict[str, str]] = []
if is_list_of_objects(raw_messages):
for m in raw_messages:
if is_dict_str_object(m):
messages.append({
"sender": get_str_from_dict(m, "sender", "unknown"),
"text": get_str_from_dict(m, "text", ""),
})
if not messages:
logger.warning("No messages found for context - cannot generate reply")
return ActionResult(
status="error",
error="No message context available for reply generation",
details=message_context,
)
# Extract board ID from URL for reply input selector
board_id = self._extract_board_id(access_url)
if not board_id:
return ActionResult(
status="error",
error="Could not extract board ID from access_url",
details={"access_url": access_url},
)
logger.info("Generating reply for board {} with {} messages", board_id, len(messages))
# Build conversation for LLM
llm_client = LLMClient.from_settings(self._settings)
conversation = self._build_llm_conversation(messages)
try:
llm_response = await llm_client.chat_completion(
conversation,
temperature=0.7,
max_tokens=500,
)
generated_reply = llm_response.content.strip()
logger.debug("LLM generated reply: {}", generated_reply)
except errors.LLMError as exc:
logger.exception("LLM request failed: {}", exc.message)
return ActionResult(
status="error",
error=f"LLM request failed: {exc.message}",
details={"messages_count": len(messages), **exc.details},
)
if not generated_reply:
return ActionResult(
status="error",
error="LLM returned empty reply",
details={"messages_count": len(messages)},
)
# Type reply into input field - use Playwright's fill which handles Slate editors
reply_input_selector = f"#{board_id}"
reply_input = page.locator(reply_input_selector)
text_entered = False
try:
# Click to focus the input first
await reply_input.click(timeout=5000)
# Use Playwright's fill which properly handles contenteditable
await reply_input.fill(generated_reply)
await helpers.wait_for_stable()
# Verify the text was actually entered
actual_text = await reply_input.text_content()
text_entered = bool(actual_text and generated_reply[:20] in actual_text)
if not text_entered:
logger.warning(
"Text entry verification failed - expected: {}, actual: {}",
generated_reply[:50],
(actual_text or "")[:50],
)
except Exception as exc:
logger.error("Failed to type reply into {}: {}", reply_input_selector, exc)
return ActionResult(
status="error",
error=f"Failed to type reply: {exc}",
details={
"selector": reply_input_selector,
"generated_reply": generated_reply,
},
)
# Click the send button
send_button = page.locator(MessagingSelectors.CHAT_SEND_BUTTON)
try:
await send_button.click(timeout=5000)
await helpers.wait_for_stable()
except Exception as exc:
logger.error("Send button click failed: {}", exc)
return ActionResult(
status="error",
error=f"Send button click failed: {exc}",
details={
"selector": MessagingSelectors.CHAT_SEND_BUTTON,
"generated_reply": generated_reply,
},
)
# Verify message was sent by checking if it appears in the message container
await page.wait_for_timeout(self._settings.timeouts.combobox_listbox)
verify_result = await page.evaluate(
"""(expectedText) => {
const messages = document.querySelectorAll("[data-cy^='chat-message-']");
const lastMessage = messages[messages.length - 1];
if (!lastMessage) return { verified: false };
const text = lastMessage.textContent || '';
return { verified: text.includes(expectedText.slice(0, 30)) };
}""",
generated_reply,
)
message_verified = (
is_dict_str_object(verify_result) and verify_result.get("verified") is True
)
if not message_verified:
logger.warning("Message verification failed after send")
return ActionResult(
details={
"board_id": board_id,
"messages_count": len(messages),
"generated_reply": generated_reply,
"text_entered": text_entered,
"message_verified": message_verified,
}
)
def _extract_board_id(self, url: str) -> str | None:
"""Extract board ID from access URL.
The board ID is in the 'id' or 'form_item' query parameter.
Example: ...?form_item=PL01-800&id=PL01-800... -> 'PL01-800'
"""
try:
parsed = urlparse(url)
params = parse_qs(parsed.query)
# Try 'id' first, then 'form_item'
if "id" in params and params["id"]:
return params["id"][0]
if "form_item" in params and params["form_item"]:
return params["form_item"][0]
except Exception as exc:
logger.warning("Failed to parse URL for board ID: {}", exc)
return None
def _build_llm_conversation(
self, messages: list[dict[str, str]]
) -> list[ChatMessage]:
"""Build LLM conversation from chat messages.
Creates a system prompt and includes recent messages as context.
"""
system_prompt = """You are a helpful assistant responding to workplace messages.
Generate a brief, professional reply to the most recent message in the conversation.
Keep your response concise (1-3 sentences) and contextually appropriate.
Do not include any greeting like "Hi" or sign-off - just the reply content."""
# Format messages as conversation context
context_parts: list[str] = []
for msg in messages:
sender = msg.get("sender", "Unknown")
text = msg.get("text", "")
context_parts.append(f"{sender}: {text}")
user_message = f"""Here is the recent conversation:
{chr(10).join(context_parts)}
Please generate a brief, professional reply to the most recent message."""
return [
ChatMessage(role="system", content=system_prompt),
ChatMessage(role="user", content=user_message),
]
__all__ = ["EmailNotificationResponsePlaybook"]


@@ -1,10 +1,14 @@
import contextlib
import importlib
import pkgutil
from pathlib import Path
from guide.app.actions import base
from guide.app.actions.base import ActionRegistry, CompositeAction, DemoAction
-from guide.app.models.personas import PersonaStore
+from guide.app.actions.di import ActionDIContext
+from guide.app.auth import SessionManager
+from guide.app.core.config import AppSettings
+from guide.app.models.personas import PersonaResolver, PersonaStore
def _discover_action_modules() -> None:
@@ -28,14 +32,16 @@ def _discover_action_modules() -> None:
if module_name.endswith(("base", "registry")):
continue
try:
with contextlib.suppress(Exception):
_ = importlib.import_module(module_name)
except Exception:
# Silently skip modules that fail to import
pass
-def default_registry(persona_store: PersonaStore, login_url: str) -> ActionRegistry:
+def default_registry(
+    persona_store: PersonaStore,
+    login_url: str,
+    settings: AppSettings,
+    session_manager: SessionManager,
+) -> ActionRegistry:
"""Create the default action registry with all registered actions.
Automatically discovers all action modules and registers them. Actions are
@@ -46,18 +52,27 @@ def default_registry(persona_store: PersonaStore, login_url: str) -> ActionRegis
Args:
persona_store: Persona store instance for user management actions
login_url: URL for login actions
settings: Application settings for actions that need configuration
session_manager: Session manager for auth session persistence
Returns:
Configured ActionRegistry ready for use
"""
_discover_action_modules()
-di_context: dict[str, object] = {
-    "persona_store": persona_store,
-    "login_url": login_url,
-}
+persona_resolver = PersonaResolver(persona_store)
-registry = ActionRegistry(di_context=di_context)
+# Build typed DI context (without registry initially)
+di_context = ActionDIContext(
+    persona_store=persona_store,
+    persona_resolver=persona_resolver,
+    login_url=login_url,
+    settings=settings,
+    session_manager=session_manager,
+    registry=None,  # Added after registry creation
+)
+registry = ActionRegistry(di_context=di_context.as_dict())
# Add registry to DI context so CompositeAction can access it
registry.set_di_dependency("registry", registry)
@@ -65,4 +80,10 @@ def default_registry(persona_store: PersonaStore, login_url: str) -> ActionRegis
return registry
-__all__ = ["default_registry", "ActionRegistry", "DemoAction", "CompositeAction"]
+__all__ = [
+    "default_registry",
+    "ActionRegistry",
+    "DemoAction",
+    "CompositeAction",
+    "ActionDIContext",
+]


@@ -1,10 +1,10 @@
-from playwright.async_api import Page
from typing import ClassVar, override
from guide.app.actions.base import DemoAction, register_action
+from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionContext, ActionResult
-from guide.app.strings.registry import app_strings
+from guide.app.strings.demo_texts.suppliers import SupplierTexts
+from guide.app.strings.selectors.sourcing import SourcingSelectors
@register_action
@@ -14,9 +14,9 @@ class AddThreeSuppliersAction(DemoAction):
category: ClassVar[str] = "sourcing"
@override
-async def run(self, page: Page, context: ActionContext) -> ActionResult:
-    suppliers = app_strings.sourcing.default_trio
+async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
+    suppliers = SupplierTexts.DEFAULT_TRIO
    for supplier in suppliers:
-        await page.fill(app_strings.sourcing.supplier_search_input, supplier)
-        await page.click(app_strings.sourcing.add_supplier_button)
+        await page.fill(SourcingSelectors.SUPPLIER_SEARCH_INPUT, supplier)
+        await page.click(SourcingSelectors.ADD_SUPPLIER_BUTTON)
return ActionResult(details={"added_suppliers": list(suppliers)})

src/guide/app/api/CLAUDE.md (new file, 194 lines)

@@ -0,0 +1,194 @@
# API Package - Coding Agent Guide
FastAPI router layer exposing demo automation capabilities via REST endpoints.
## Package Structure
```
api/
├── __init__.py # Main router aggregator
└── routes/
├── __init__.py # Route exports
├── health.py # GET /healthz - Liveness check
├── config.py # GET /config/browser-hosts - Config inspection
├── actions.py # Action execution endpoints
├── boards.py # Board item CRUD endpoints
└── diagnostics.py # Browser debugging endpoints
```
## Router Registration
All routes are aggregated in `api/__init__.py`:
```python
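from fastapi import APIRouter

from guide.app.api.routes import (
    actions,
    boards,
    config,
    diagnostics,
    health,
    keepalive,
    otp_callback,
    sessions,
)

router = APIRouter()
router.include_router(health.router)
router.include_router(actions.router)
router.include_router(boards.router)
router.include_router(config.router)
router.include_router(diagnostics.router)
router.include_router(keepalive.router)
router.include_router(sessions.router)
router.include_router(otp_callback.router)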
```
Include the aggregated router in the FastAPI app: `app.include_router(api.router)`
## Dependency Injection Pattern
Routes access app state via FastAPI dependencies:
```python
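from typing import Annotated, Protocol, cast

from fastapi import APIRouter, Depends, FastAPI, Request

from guide.app.browser.client import BrowserClient
from guide.app.core.config import AppSettings

router = APIRouter()


class AppStateProtocol(Protocol):
    browser_client: BrowserClient
    settings: AppSettings
    # ... other state


def _browser_client(request: Request) -> BrowserClient:
    # app.state is untyped, so cast it to the protocol for type-safe access
    app = cast(FastAPI, request.app)
    state = cast(AppStateProtocol, cast(object, app.state))
    return state.browser_client


BrowserDep = Annotated[BrowserClient, Depends(_browser_client)]


@router.post("/endpoint")
async def my_endpoint(host_id: str, browser: BrowserDep) -> dict[str, str]:
    # host_id is a plain query parameter here, for illustration only
    async with browser.open_page(host_id) as page:
        _ = page  # Use page
    return {"status": "ok"}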
```
**Available Dependencies:**
- `BrowserClient` - Browser automation pool
- `ActionRegistry` - Registered demo actions
- `PersonaStore` - Configured personas
- `BoardStore` - Configured boards
- `AppSettings` - Application configuration
## Route Modules
### Health (`health.py`)
```
GET /healthz → {"status": "ok"}
```
Simple liveness probe for orchestration.
### Config (`config.py`)
```
GET /config/browser-hosts → BrowserHostsResponse
```
Returns configured browser hosts (CDP, headless, extension).
### Actions (`actions.py`)
Core action execution engine.
```
GET /actions → List action metadata
POST /actions/{action_id}/execute → Execute action
```
**Execute Request:**
```json
{
"persona_id": "analyst",
"browser_host_id": "support-extension",
"params": {"key": "value"}
}
```
**Execute Response (`ActionEnvelope`):**
```json
{
"status": "success|error",
"action_id": "login-as-persona",
"correlation_id": "uuid",
"result": {...},
"error_code": "AUTH_ERROR",
"message": "Error details",
"debug_info": {"screenshot": "base64...", "html": "..."}
}
```
**Execution Flow:**
1. Resolve persona from `PersonaStore`
2. Resolve browser host (request → persona → default)
3. Create `ActionContext` with correlation ID
4. Open browser page via `BrowserClient`
5. Run `ensure_persona()` for login (skipped for extension hosts)
6. Execute `action.run(page, context)`
7. On error: capture diagnostics (screenshot, HTML, Docling UI elements)
8. Return `ActionEnvelope`
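Step 2 of the flow above can be sketched as a plain fallback chain. This is a minimal illustration, not the project's code: the `Persona` dataclass and the `demo-cdp` and `demo-headless` host IDs are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Persona:
    """Hypothetical stand-in for a configured persona."""

    id: str
    browser_host_id: Optional[str] = None


def resolve_host(
    request_host_id: Optional[str],
    persona: Optional[Persona],
    default_host_id: str,
) -> str:
    """Pick the first host ID that is set: request, then persona, then default."""
    persona_host_id = persona.browser_host_id if persona else None
    return request_host_id or persona_host_id or default_host_id


analyst = Persona(id="analyst", browser_host_id="support-extension")

# Host IDs other than "support-extension" are made up for this example.
print(resolve_host("demo-cdp", analyst, "demo-headless"))  # request wins
print(resolve_host(None, analyst, "demo-headless"))        # persona wins
print(resolve_host(None, None, "demo-headless"))           # default wins
```

After resolution, the chosen host ID should still be checked against the configured hosts so that a typo surfaces as a configuration error instead of a connection failure.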
### Boards (`boards.py`)
Board item management via GraphQL.
```
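GET /boards → List configured boards
GET /boards/{name} → Get board config
POST /boards/{name}/schema → Fetch board field schema from GraphQL
POST /boards/{name}/validate → Validate data against schema without creating
POST /boards/{name}/items → Create board item
POST /boards/discover → Query GraphQL for available boards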
```
**Create Board Item:**
```json
{
"browser_host_id": "support-extension",
"data": {"description": "Override defaults"},
"requested_by": "automation"
}
```
Board defaults from `config/boards.yaml` are merged with request data.
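The merge can be pictured as a dictionary update in which request values win. The sketch below assumes a shallow merge; the real behavior of the board model's merge method is not shown here, and the `merge_defaults` helper and the field names are hypothetical.

```python
def merge_defaults(
    defaults: dict[str, object],
    overrides: dict[str, object],
) -> dict[str, object]:
    """Return a new dict where request values replace board defaults.

    Assumption: a shallow merge. Nested values are replaced, not combined.
    """
    return {**defaults, **overrides}


# Hypothetical defaults, as they might appear in config/boards.yaml.
board_defaults = {"description": "Default description", "priority": "medium"}
request_data = {"description": "Override defaults"}

merged = merge_defaults(board_defaults, request_data)
print(merged)
```

Building a new dictionary leaves the configured defaults untouched, so one request cannot leak its overrides into the next.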
**Discover Boards:**
```json
{
"browser_host_id": "support-extension",
"name_filter": "operations"
}
```
Returns boards from GraphQL matching filter.
### Diagnostics (`diagnostics.py`)
Browser inspection endpoints for debugging.
```
GET /diagnostics/connectivity?host_id=X → Test browser connection
GET /diagnostics/field?field=X&host_id=Y → Inspect single field
GET /diagnostics/fields?browser_host_id=X → Inspect all known fields
GET /diagnostics/dropdown?field=X&open=true&host_id=Y → Dropdown state/options
GET /diagnostics/form?browser_host_id=X → Full form overview
GET /diagnostics/page?browser_host_id=X → Page structure
GET /diagnostics/selectors?browser_host_id=X → Available data-cy selectors
```
## Adding a New Route
1. Create `routes/my_route.py` with `APIRouter(prefix="/my-prefix", tags=["my-tag"])`
2. Define Pydantic request/response models
3. Create async endpoint using dependency injection
4. Register in `api/__init__.py`: `router.include_router(my_route.router)`
## Error Handling
Catch `GuideError` subclasses and convert to `ActionEnvelope` with error details.
**Error Types:** `ConfigError`, `AuthError`, `BrowserConnectionError`, `GraphQLOperationError`
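A rough sketch of the catch-and-convert pattern follows. The `GuideError` and `AuthError` classes here are simplified stand-ins written for this example (the real ones live in `guide.app.errors`), and the envelope is a plain dict rather than the `ActionEnvelope` model.

```python
from typing import Callable, Optional


class GuideError(Exception):
    """Hypothetical base error carrying a code, a message, and details."""

    code = "GUIDE_ERROR"

    def __init__(self, message: str, details: Optional[dict] = None) -> None:
        super().__init__(message)
        self.message = message
        self.details = details or {}


class AuthError(GuideError):
    code = "AUTH_ERROR"


def run_with_envelope(
    action_id: str,
    correlation_id: str,
    operation: Callable[[], dict],
) -> dict:
    """Run an operation and wrap its outcome in an envelope-shaped dict."""
    try:
        result = operation()
    except GuideError as exc:
        # Any subclass is caught here; its class-level code identifies the failure.
        return {
            "status": "error",
            "action_id": action_id,
            "correlation_id": correlation_id,
            "error_code": exc.code,
            "message": exc.message,
            "details": exc.details,
        }
    return {
        "status": "success",
        "action_id": action_id,
        "correlation_id": correlation_id,
        "result": result,
    }


def failing_login() -> dict:
    raise AuthError("No bearer token found in browser session")


envelope = run_with_envelope("login-as-persona", "demo-correlation-id", failing_login)
print(envelope["status"], envelope["error_code"])  # prints "error AUTH_ERROR"
```

Catching only the base class keeps unexpected exceptions visible: anything that is not a `GuideError` still propagates to the framework's default handler.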
## Request Context
Set logging context per-request via `LoggingManager.context` for tracing with correlation IDs.
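One common way to carry a correlation ID through a request is a context variable. The sketch below uses the standard `contextvars` module and assumes nothing about how `LoggingManager.context` is implemented; `correlation_id_var`, `format_log`, and `handle_request` are hypothetical names.

```python
import contextvars
import uuid
from typing import Optional

# Hypothetical module-level variable; the project's LoggingManager.context may differ.
correlation_id_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "correlation_id", default=None
)


def format_log(message: str) -> str:
    """Prefix a message with the current correlation ID, or '-' if none is set."""
    return f"[{correlation_id_var.get() or '-'}] {message}"


def handle_request(message: str) -> str:
    """Set a correlation ID for one request and restore the old value afterwards."""
    token = correlation_id_var.set(str(uuid.uuid4()))
    try:
        return format_log(message)
    finally:
        correlation_id_var.reset(token)


line = handle_request("executing action")
print(line)
print(format_log("outside any request"))  # prints "[-] outside any request"
```

Resetting the token in `finally` matters when a worker handles requests back to back: without it, a later log line could carry a stale ID.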
## Testing
Use `fastapi.testclient.TestClient` for route testing. Mock `BrowserClient` to avoid real browser connections.
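A hand-written fake is often enough to stand in for `BrowserClient`. The sketch below is standard-library only and shows the shape such a fake might take: `FakePage`, `FakeBrowserClient`, and the `#supplier-search` selector are hypothetical. In a route test, an instance would be placed on `app.state.browser_client` before the `TestClient` is created, since the dependencies read the client from app state.

```python
import asyncio
from contextlib import asynccontextmanager


class FakePage:
    """Records interactions instead of driving a real browser."""

    def __init__(self) -> None:
        self.filled: list[tuple[str, str]] = []

    async def fill(self, selector: str, value: str) -> None:
        self.filled.append((selector, value))


class FakeBrowserClient:
    """Hypothetical replacement exposing the same open_page() shape."""

    def __init__(self) -> None:
        self.opened_hosts: list[str] = []
        self.page = FakePage()

    @asynccontextmanager
    async def open_page(self, host_id: str):
        # Remember which host was requested, then hand out the fake page.
        self.opened_hosts.append(host_id)
        yield self.page


async def demo() -> FakeBrowserClient:
    client = FakeBrowserClient()
    async with client.open_page("support-extension") as page:
        await page.fill("#supplier-search", "Acme")
    return client


client = asyncio.run(demo())
print(client.opened_hosts, client.page.filled)
```

Because the fake records calls, a test can assert on which host was opened and which selectors were touched without any browser running.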


@@ -1,10 +1,24 @@
 from fastapi import APIRouter
-from guide.app.api.routes import actions, config, health
+from guide.app.api.routes import (
+    actions,
+    boards,
+    config,
+    diagnostics,
+    health,
+    keepalive,
+    otp_callback,
+    sessions,
+)
 router = APIRouter()
 router.include_router(health.router)
 router.include_router(actions.router)
+router.include_router(boards.router)
 router.include_router(config.router)
+router.include_router(diagnostics.router)
+router.include_router(keepalive.router)
+router.include_router(sessions.router)
+router.include_router(otp_callback.router)
 __all__ = ["router"]


@@ -1,3 +1,19 @@
-from guide.app.api.routes import actions, config, health
+from guide.app.api.routes import (
+    actions,
+    boards,
+    config,
+    diagnostics,
+    health,
+    keepalive,
+    sessions,
+)
-__all__ = ["actions", "config", "health"]
+__all__ = [
+    "actions",
+    "boards",
+    "config",
+    "diagnostics",
+    "health",
+    "keepalive",
+    "sessions",
+]


@@ -2,19 +2,129 @@ from typing import Annotated, Protocol, cast
from fastapi import APIRouter, Depends, Request
from fastapi import FastAPI
from loguru import logger
from guide.app.actions.registry import ActionRegistry
from guide.app.auth import DummyMfaCodeProvider, ensure_persona
from guide.app.browser.client import BrowserClient
from guide.app.browser.diagnostics import capture_all_diagnostics
from guide.app.browser.types import PageLike
from guide.app import errors
from guide.app.core.config import AppSettings
from guide.app.core.config import AppSettings, BrowserHostConfig, HostKind
from guide.app.core.logging import LoggingManager
from guide.app.models.domain import (
ActionContext,
ActionEnvelope,
ActionRequest,
ActionResult,
ActionStatus,
DebugInfo,
)
from guide.app.models.personas import PersonaStore
from guide.app.models.personas import DemoPersona, PersonaStore
def _resolve_target_host(
payload: ActionRequest,
persona: DemoPersona | None,
settings: AppSettings,
) -> str:
"""Resolve target browser host from request, persona, or default."""
target_host_id = payload.browser_host_id or (
persona.browser_host_id if persona else None
)
return target_host_id or settings.default_browser_host_id
def _validate_host_exists(
target_host_id: str,
persona: DemoPersona | None,
settings: AppSettings,
) -> None:
"""Validate browser host exists, raise ConfigError if not found."""
if target_host_id not in settings.browser_hosts:
available_hosts = list(settings.browser_hosts.keys())
source = (
"persona"
if persona and persona.browser_host_id == target_host_id
else "request"
)
raise errors.ConfigError(
f"Browser host '{target_host_id}' not found in configuration",
details={
"requested_host": target_host_id,
"available_hosts": available_hosts,
"source": source,
},
)
def _set_logging_context(
context: ActionContext,
action_id: str,
persona: DemoPersona | None,
target_host_id: str,
) -> None:
"""Set request context variables for logging."""
_ = LoggingManager.context.correlation_id.set(context.correlation_id)
_ = LoggingManager.context.action_id.set(action_id)
_ = LoggingManager.context.persona_id.set(persona.id if persona else None)
_ = LoggingManager.context.host_id.set(target_host_id)
def _is_extension_host(host_config: BrowserHostConfig | None) -> bool:
"""Check if host is an extension host."""
return host_config is not None and host_config.kind == HostKind.EXTENSION
async def _capture_error_diagnostics(
page_like: PageLike | None,
settings: AppSettings,
) -> DebugInfo | None:
"""Capture diagnostics for debugging, handling errors gracefully."""
if not page_like:
return None
_ = settings # Reserved for future diagnostic config
try:
return await capture_all_diagnostics(page_like)
except Exception as diag_exc:
logger.warning("Failed to capture diagnostics: {}", diag_exc)
return None
def _build_error_envelope(
action_id: str,
correlation_id: str,
error_code: str,
message: str,
details: dict[str, object] | None = None,
debug_info: DebugInfo | None = None,
) -> ActionEnvelope:
"""Build error response envelope."""
return ActionEnvelope(
status=ActionStatus.ERROR,
action_id=action_id,
correlation_id=correlation_id,
error_code=error_code,
message=message,
details=details,
debug_info=debug_info,
)
def _build_success_envelope(
action_id: str,
correlation_id: str,
result: ActionResult,
) -> ActionEnvelope:
"""Build success response envelope."""
return ActionEnvelope(
status=ActionStatus.SUCCESS,
action_id=action_id,
correlation_id=correlation_id,
result=result.details,
)
router = APIRouter()
@@ -69,14 +179,12 @@ async def execute_action(
     browser_client: BrowserDep,
     personas: PersonaDep,
     settings: SettingsDep,
-):
+) -> ActionEnvelope:
     """Execute a registered action with browser automation."""
     action = registry.get(action_id)
     persona = personas.get(payload.persona_id) if payload.persona_id else None
-    target_host_id = payload.browser_host_id or (
-        persona.browser_host_id if persona else None
-    )
-    target_host_id = target_host_id or settings.default_browser_host_id
+    target_host_id = _resolve_target_host(payload, persona, settings)
+    _validate_host_exists(target_host_id, persona, settings)
     context = ActionContext(
         action_id=action_id,
@@ -84,35 +192,41 @@ async def execute_action(
         browser_host_id=target_host_id,
         params=payload.params,
     )
+    _set_logging_context(context, action_id, persona, target_host_id)
-    # Set request context variables for logging
-    _ = LoggingManager.context.correlation_id.set(context.correlation_id)
-    _ = LoggingManager.context.action_id.set(action_id)
-    _ = LoggingManager.context.persona_id.set(persona.id if persona else None)
-    _ = LoggingManager.context.host_id.set(target_host_id)
-    mfa_provider = DummyMfaCodeProvider()
+    host_config = settings.browser_hosts.get(target_host_id)
+    is_extension = _is_extension_host(host_config)
+    page_like: PageLike | None = None
     try:
         async with browser_client.open_page(target_host_id) as page:
-            if persona:
+            page_like = cast(PageLike, cast(object, page))
+            if persona and not is_extension:
                 await ensure_persona(
-                    page, persona, mfa_provider, login_url=settings.raindrop_base_url
+                    page_like,
+                    persona,
+                    DummyMfaCodeProvider(),
+                    login_url=settings.raindrop_base_url,
                 )
+            result = await action.run(page_like, context)
+            if result.status == "error":
+                return _build_error_envelope(
+                    action_id,
+                    context.correlation_id,
+                    "ACTION_EXECUTION_FAILED",
+                    result.error or "Action returned error status",
+                    result.details,
+                )
-            result = await action.run(page, context)
     except errors.GuideError as exc:
-        return ActionEnvelope(
-            status=ActionStatus.ERROR,
-            action_id=action_id,
-            correlation_id=context.correlation_id,
-            error_code=exc.code,
-            message=exc.message,
-            details=exc.details,
+        debug_info = await _capture_error_diagnostics(page_like, settings)
+        return _build_error_envelope(
+            action_id,
+            context.correlation_id,
+            exc.code,
+            exc.message,
+            exc.details,
+            debug_info,
         )
-    return ActionEnvelope(
-        status=ActionStatus.SUCCESS,
-        action_id=action_id,
-        correlation_id=context.correlation_id,
-        result=result.details,
-    )
+    return _build_success_envelope(action_id, context.correlation_id, result)


@@ -0,0 +1,605 @@
"""Board item creation API routes."""
import logging
from typing import Annotated, Protocol, cast
from fastapi import APIRouter, Depends, Request
from fastapi import FastAPI
from pydantic import BaseModel, Field
from guide.app.auth.session import extract_bearer_token
from guide.app.browser.client import BrowserClient
from guide.app.browser.types import PageLike
from guide.app.core.config import AppSettings
from guide.app.core.logging import LoggingManager
from guide.app.models.boards import BoardStore
from guide.app.models.domain import ActionStatus
from guide.app.raindrop.operations import (
SchemaSourceType,
get_form_schema,
validate_form_data,
)
from guide.app.utils.ids import new_correlation_id
from guide.app import errors
import httpx
_logger = logging.getLogger(__name__)
router = APIRouter(prefix="/boards", tags=["boards"])
class AppStateProtocol(Protocol):
browser_client: BrowserClient
board_store: BoardStore
settings: AppSettings
def _browser_client(request: Request) -> BrowserClient:
app = cast(FastAPI, request.app)
state = cast(AppStateProtocol, cast(object, app.state))
return state.browser_client
def _board_store(request: Request) -> BoardStore:
app = cast(FastAPI, request.app)
state = cast(AppStateProtocol, cast(object, app.state))
return state.board_store
def _settings(request: Request) -> AppSettings:
app = cast(FastAPI, request.app)
state = cast(AppStateProtocol, cast(object, app.state))
return state.settings
BrowserDep = Annotated[BrowserClient, Depends(_browser_client)]
BoardStoreDep = Annotated[BoardStore, Depends(_board_store)]
SettingsDep = Annotated[AppSettings, Depends(_settings)]
class CreateBoardItemRequest(BaseModel):
"""Request body for creating a board item."""
browser_host_id: str = Field(
...,
description="Browser host ID to use for extracting auth token",
)
data: dict[str, object] = Field(
default_factory=dict,
description="Field values to override board defaults",
)
requested_by: str | None = Field(
default=None,
description="Requester identifier (defaults to 'api-automation')",
)
validate_before_create: bool = Field(
default=False,
description="Validate data against board schema before creating",
)
class CreateBoardItemResponse(BaseModel):
"""Response from board item creation."""
status: ActionStatus
correlation_id: str
board_item: dict[str, object] | None = None
error_code: str | None = None
message: str | None = None
validation_errors: list[dict[str, object]] | None = None
class BoardConfigResponse(BaseModel):
"""Response showing board configuration."""
name: str
id: int
key: str
instance_id: int
defaults: dict[str, object]
class DiscoveredBoard(BaseModel):
"""A board discovered from GraphQL."""
id: int
key: str
name: str
instance_id: int
class DiscoverBoardsRequest(BaseModel):
"""Request to discover boards from GraphQL."""
browser_host_id: str = Field(..., description="Browser host ID for auth token")
name_filter: str | None = Field(
None, description="Filter boards by name (case-insensitive)"
)
class DiscoverBoardsResponse(BaseModel):
"""Response from board discovery."""
boards: list[DiscoveredBoard]
count: int
# --- Schema Discovery Models ---
class FieldChoiceResponse(BaseModel):
"""A choice option for menu fields."""
text: str
value: str | None = None
color: str | None = None
class FormatOptionsResponse(BaseModel):
"""Field formatting options."""
currency: bool = False
decimals: int | None = None
prefix: str | None = None
suffix: str | None = None
class FieldDefResponse(BaseModel):
"""Definition of a form field."""
name: str
field_type: str
label: str
required: bool = False
choices: list[FieldChoiceResponse] = Field(default_factory=list)
format_options: FormatOptionsResponse = Field(default_factory=FormatOptionsResponse)
editable: bool = True
visible: bool = True
class BoardSchemaResponse(BaseModel):
"""Complete board form schema."""
board_id: int
board_name: str
fields: list[FieldDefResponse]
required_field_names: list[str]
menu_field_names: list[str]
class GetSchemaRequest(BaseModel):
"""Request for board schema."""
browser_host_id: str = Field(..., description="Browser host ID for auth token")
class ValidationErrorResponse(BaseModel):
"""A single validation error."""
field: str
error_type: str
message: str
expected: object = None
actual: object = None
class ValidateDataRequest(BaseModel):
"""Request to validate data against board schema."""
browser_host_id: str = Field(..., description="Browser host ID for auth token")
data: dict[str, object] = Field(..., description="Data to validate")
check_required: bool = Field(True, description="Check required fields")
check_choices: bool = Field(True, description="Check menu choices")
check_types: bool = Field(True, description="Check field types")
class ValidateDataResponse(BaseModel):
"""Response from data validation."""
valid: bool
errors: list[ValidationErrorResponse] = Field(default_factory=list)
merged_data: dict[str, object] = Field(
default_factory=dict,
description="Data merged with board defaults (what would be sent)",
)
@router.post("/discover", response_model=DiscoverBoardsResponse)
async def discover_boards(
payload: DiscoverBoardsRequest,
browser_client: BrowserDep,
settings: SettingsDep,
) -> DiscoverBoardsResponse:
"""Discover available boards from GraphQL.
Queries the Raindrop GraphQL API to list all non-archived boards.
Useful for finding board IDs to add to boards.yaml configuration.
"""
async with browser_client.open_page(payload.browser_host_id) as page:
page_like = cast(PageLike, cast(object, page))
token_result = await extract_bearer_token(page_like)
if not token_result:
raise errors.AuthError("No bearer token found in browser session")
# Build query with optional name filter
if payload.name_filter:
query = """
query ListBoards($name: String!) {
board(
where: { name: { _ilike: $name }, is_archived: { _eq: false } }
order_by: { name: asc }
limit: 100
) {
id
key
name
instance_id
}
}
"""
variables = {"name": f"%{payload.name_filter}%"}
else:
query = """
query ListBoards {
board(
where: { is_archived: { _eq: false } }
order_by: { name: asc }
limit: 100
) {
id
key
name
instance_id
}
}
"""
variables = {}
async with httpx.AsyncClient() as client:
resp = await client.post(
settings.raindrop_graphql_url,
json={"query": query, "variables": variables},
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {token_result.value}",
},
timeout=30.0,
)
response_data = cast(dict[str, object], resp.json())
if gql_errors := response_data.get("errors"):
error_list = cast(list[dict[str, object]], gql_errors)
first_error = error_list[0] if error_list else {}
raise errors.GraphQLOperationError(
str(first_error.get("message", "Unknown error")),
details=first_error,
)
data_section = response_data.get("data")
if not isinstance(data_section, dict):
raise errors.GraphQLOperationError("Unexpected response format")
boards_data = cast(dict[str, object], data_section).get("board", [])
if not isinstance(boards_data, list):
boards_data = []
boards = [
DiscoveredBoard(
id=int(cast(int, b["id"])),
key=str(b["key"]),
name=str(b["name"]),
instance_id=int(cast(int, b["instance_id"])),
)
for b in cast(list[dict[str, object]], boards_data)
]
return DiscoverBoardsResponse(boards=boards, count=len(boards))
@router.get("")
async def list_boards(board_store: BoardStoreDep) -> list[BoardConfigResponse]:
"""List all configured boards with their defaults."""
return [
BoardConfigResponse(
name=name,
id=board.id,
key=board.key,
instance_id=board.instance_id,
defaults=board.defaults,
)
for name, board in [(b.name, b) for b in board_store.list()]
]
@router.get("/{board_name}")
async def get_board(board_name: str, board_store: BoardStoreDep) -> BoardConfigResponse:
"""Get a specific board configuration."""
board = board_store.get(board_name)
return BoardConfigResponse(
name=board_name,
id=board.id,
key=board.key,
instance_id=board.instance_id,
defaults=board.defaults,
)
@router.post("/{board_name}/schema", response_model=BoardSchemaResponse)
async def get_board_schema(
board_name: str,
payload: GetSchemaRequest,
browser_client: BrowserDep,
board_store: BoardStoreDep,
settings: SettingsDep,
) -> BoardSchemaResponse:
"""Get the field schema for a board.
Fetches the board definition from GraphQL to retrieve field types,
required fields, choices for menus, and format options. Use this
to understand what data can/should be provided when creating items.
"""
board = board_store.get(board_name)
async with browser_client.open_page(payload.browser_host_id) as page:
page_like = cast(PageLike, cast(object, page))
token_result = await extract_bearer_token(page_like)
if not token_result:
raise errors.AuthError("No bearer token found in browser session")
schema = await get_form_schema(
graphql_url=settings.raindrop_graphql_url,
bearer_token=token_result.value,
source_type=SchemaSourceType.BOARD,
entity_key=board.id,
)
return BoardSchemaResponse(
board_id=board.id,
board_name=schema.entity_name,
fields=[
FieldDefResponse(
name=f.name,
field_type=f.field_type.value,
label=f.label,
required=f.required,
choices=[
FieldChoiceResponse(text=c.text, value=c.value, color=c.color)
for c in f.choices
],
format_options=FormatOptionsResponse(
currency=f.format_options.currency,
decimals=f.format_options.decimals,
prefix=f.format_options.prefix,
suffix=f.format_options.suffix,
),
editable=f.editable,
visible=f.visible,
)
for f in schema.fields
],
required_field_names=[f.name for f in schema.required_fields],
menu_field_names=[f.name for f in schema.menu_fields],
)
@router.post("/{board_name}/validate", response_model=ValidateDataResponse)
async def validate_board_data(
board_name: str,
payload: ValidateDataRequest,
browser_client: BrowserDep,
board_store: BoardStoreDep,
settings: SettingsDep,
) -> ValidateDataResponse:
"""Validate data against board schema before creating an item.
This endpoint:
1. Fetches the board schema from GraphQL
2. Merges request data with board defaults
3. Validates against required fields, menu choices, and types
4. Returns validation errors without actually creating the item
Use this to verify your data is correct before calling /items.
"""
board = board_store.get(board_name)
merged_data = board.merge_with_overrides(payload.data)
async with browser_client.open_page(payload.browser_host_id) as page:
page_like = cast(PageLike, cast(object, page))
token_result = await extract_bearer_token(page_like)
if not token_result:
raise errors.AuthError("No bearer token found in browser session")
schema = await get_form_schema(
graphql_url=settings.raindrop_graphql_url,
bearer_token=token_result.value,
source_type=SchemaSourceType.BOARD,
entity_key=board.id,
)
result = validate_form_data(
schema=schema,
data=merged_data,
check_required=payload.check_required,
check_choices=payload.check_choices,
check_types=payload.check_types,
)
return ValidateDataResponse(
valid=result.valid,
errors=[
ValidationErrorResponse(
field=e.field,
error_type=e.error_type,
message=e.message,
expected=e.expected,
actual=e.actual,
)
for e in result.errors
],
merged_data=merged_data,
)
@router.post("/{board_name}/items", response_model=CreateBoardItemResponse)
async def create_board_item(
board_name: str,
payload: CreateBoardItemRequest,
browser_client: BrowserDep,
board_store: BoardStoreDep,
settings: SettingsDep,
) -> CreateBoardItemResponse:
"""Create a board item with defaults from config, overridden by request data.
This endpoint:
1. Loads board configuration from boards.yaml
2. Merges default field values with request overrides
3. Extracts bearer token from browser session
4. Creates board item via GraphQL API
Args:
board_name: Name of the board (key in boards.yaml)
payload: Request with browser_host_id and optional field overrides
Returns:
Created board item data or error details
"""
correlation_id = new_correlation_id()
# Set logging context
_ = LoggingManager.context.correlation_id.set(correlation_id)
_ = LoggingManager.context.action_id.set("create-board-item")
_ = LoggingManager.context.host_id.set(payload.browser_host_id)
# Get board configuration
board = board_store.get(board_name)
# Merge defaults with overrides
merged_data = board.merge_with_overrides(payload.data)
# Get host config to verify it exists
host_config = settings.browser_hosts.get(payload.browser_host_id)
if not host_config:
known = ", ".join(settings.browser_hosts.keys()) or "<none>"
raise errors.ConfigError(
f"Unknown browser host '{payload.browser_host_id}'. Known: {known}"
)
page_like: PageLike | None = None
try:
async with browser_client.open_page(payload.browser_host_id) as page:
page_like = cast(PageLike, cast(object, page))
# Extract bearer token from browser session
token_result = await extract_bearer_token(page_like)
if not token_result:
return CreateBoardItemResponse(
status=ActionStatus.ERROR,
correlation_id=correlation_id,
error_code="AUTH_ERROR",
message="No bearer token found in browser session",
)
# Optional: Validate against schema before creating
if payload.validate_before_create:
schema = await get_form_schema(
graphql_url=settings.raindrop_graphql_url,
bearer_token=token_result.value,
source_type=SchemaSourceType.BOARD,
entity_key=board.id,
)
validation_result = validate_form_data(schema, merged_data)
if not validation_result.valid:
return CreateBoardItemResponse(
status=ActionStatus.ERROR,
correlation_id=correlation_id,
error_code="VALIDATION_ERROR",
message=f"Data validation failed: {len(validation_result.errors)} error(s)",
validation_errors=[
{
"field": e.field,
"error_type": e.error_type,
"message": e.message,
"expected": e.expected,
"actual": e.actual,
}
for e in validation_result.errors
],
)
# Build GraphQL mutation
mutation = """
mutation CreateBoardItem($object: board_item_insert_input!) {
insert_board_item_one(object: $object) {
uuid
id
board_id
board_name
data
requested_by
created_at
instance_id
}
}
"""
variables = {
"object": {
"board_id": board.id,
"instance_id": board.instance_id,
"data": merged_data,
"requested_by": payload.requested_by or "api-automation",
}
}
# Execute GraphQL request
async with httpx.AsyncClient() as client:
resp = await client.post(
settings.raindrop_graphql_url,
json={"query": mutation, "variables": variables},
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {token_result.value}",
},
timeout=30.0,
)
response_data = cast(dict[str, object], resp.json())
if gql_errors := response_data.get("errors"):
error_list = cast(list[dict[str, object]], gql_errors)
first_error = error_list[0] if error_list else {}
error_msg = str(first_error.get("message", "Unknown GraphQL error"))
return CreateBoardItemResponse(
status=ActionStatus.ERROR,
correlation_id=correlation_id,
error_code="GRAPHQL_ERROR",
message=error_msg,
)
# Extract created item
data_section = response_data.get("data")
if isinstance(data_section, dict):
data_dict = cast(dict[str, object], data_section)
created_item = data_dict.get("insert_board_item_one")
if isinstance(created_item, dict):
return CreateBoardItemResponse(
status=ActionStatus.SUCCESS,
correlation_id=correlation_id,
board_item=cast(dict[str, object], created_item),
)
return CreateBoardItemResponse(
status=ActionStatus.ERROR,
correlation_id=correlation_id,
error_code="UNEXPECTED_RESPONSE",
message="GraphQL response did not contain expected data",
)
except errors.GuideError as exc:
return CreateBoardItemResponse(
status=ActionStatus.ERROR,
correlation_id=correlation_id,
error_code=exc.code,
message=exc.message,
)


@@ -0,0 +1,486 @@
"""Diagnostics API endpoints for troubleshooting browser automation.
Provides read-only inspection endpoints for:
- Connectivity testing
- Field state inspection
- Dropdown diagnostics
- Form overview
- Page structure
"""
from typing import Protocol, cast
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel, Field
from guide.app.browser.client import BrowserClient
from guide.app.browser.diagnostics import (
analyze_field_issues,
extract_ui_elements,
get_selected_chips,
inspect_dropdown,
inspect_field,
inspect_input,
inspect_page_structure,
test_connectivity,
)
from guide.app.browser.types import PageLike
from guide.app.core.config import AppSettings
from guide.app.strings.selectors.intake import IntakeSelectors
class _AppStateProtocol(Protocol):
"""Protocol for app.state with typed attributes."""
browser_client: BrowserClient
settings: AppSettings
def _get_app_state(request: Request) -> _AppStateProtocol:
"""Extract typed app state from request.
FastAPI's request.app returns Starlette with untyped state.
This helper provides type-safe access to our app state.
"""
# request.app is typed as Starlette which has Any state
# We know at runtime this is our FastAPI app with typed state
return cast(_AppStateProtocol, request.app.state) # pyright: ignore[reportAny]
router = APIRouter(prefix="/diagnostics", tags=["diagnostics"])
# ---------------------------------------------------------------------------
# Response Models
# ---------------------------------------------------------------------------
class ConnectivityResponse(BaseModel):
"""Response for connectivity check."""
connected: bool
title: str | None = None
url: str | None = None
error: str | None = None
class FieldDiagnosticsResponse(BaseModel):
"""Response for field diagnostics."""
selector: str
exists: bool
visible: bool = False
tag_name: str | None = None
input_exists: bool = False
input_disabled: bool = False
chip_count: int = 0
chips: list[str] = Field(default_factory=list)
issues: list[str] = Field(default_factory=list)
error: str | None = None
class DropdownDiagnosticsResponse(BaseModel):
"""Response for dropdown diagnostics."""
selector: str
field_exists: bool
field_visible: bool = False
input_disabled: bool = False
chip_count: int = 0
chips: list[str] = Field(default_factory=list)
dropdown_opened: bool = False
option_count: int = 0
options: list[dict[str, object]] = Field(default_factory=list)
component_structure: dict[str, object] = Field(default_factory=dict)
issues: list[str] = Field(default_factory=list)
error: str | None = None
class FormDiagnosticsResponse(BaseModel):
"""Response for form diagnostics."""
url: str
fields: dict[str, FieldDiagnosticsResponse]
summary: dict[str, int]
class PageDiagnosticsResponse(BaseModel):
"""Response for page structure diagnostics."""
url: str
title: str
form_exists: bool
total_data_cy_count: int
data_cy_elements: list[dict[str, object]]
# ---------------------------------------------------------------------------
# Known Field Registry
# ---------------------------------------------------------------------------
# Map friendly field names to selectors from strings registry
KNOWN_FIELDS: dict[str, str] = {
"commodity": IntakeSelectors.COMMODITY_FIELD,
"planned": IntakeSelectors.PLANNED_FIELD,
"regions": IntakeSelectors.REGIONS_FIELD,
"opex_capex": IntakeSelectors.OPEX_CAPEX_FIELD,
"entity": IntakeSelectors.ENTITY_FIELD,
"description": IntakeSelectors.DESCRIPTION_TEXTAREA,
"target_date": IntakeSelectors.TARGET_DATE_FIELD,
"requester": IntakeSelectors.REQUESTER_FIELD,
"owner": IntakeSelectors.ASSIGNED_OWNER_FIELD,
"reseller": IntakeSelectors.RESELLER_TEXTAREA,
}
def resolve_selector(field_or_selector: str) -> str:
"""Resolve field name to selector, or use as-is if already a selector."""
return KNOWN_FIELDS.get(field_or_selector, field_or_selector)
# ---------------------------------------------------------------------------
# Endpoints
# ---------------------------------------------------------------------------
@router.get("/connectivity", response_model=ConnectivityResponse)
async def check_connectivity(
request: Request,
host_id: str | None = None,
) -> ConnectivityResponse:
"""Test browser/extension connectivity.
Args:
host_id: Optional browser host ID (uses default if not specified)
Returns:
Connectivity status with page title and URL if connected
"""
state = _get_app_state(request)
browser_client = state.browser_client
settings = state.settings
target_host = host_id or settings.default_browser_host_id
try:
async with browser_client.open_page(target_host) as raw_page:
page = cast(PageLike, cast(object, raw_page))
result = await test_connectivity(page)
return ConnectivityResponse(
connected=result.connected,
title=result.title,
url=result.url,
error=result.error,
)
except Exception as e:
return ConnectivityResponse(connected=False, error=str(e))
@router.get("/field", response_model=FieldDiagnosticsResponse)
async def diagnose_field(
request: Request,
field: str,
host_id: str | None = None,
) -> FieldDiagnosticsResponse:
"""Diagnose a specific field's state.
Args:
field: Field name (e.g., 'commodity') or CSS selector
host_id: Optional browser host ID
Returns:
Field diagnostics including visibility, input state, chips
"""
state = _get_app_state(request)
browser_client = state.browser_client
settings = state.settings
target_host = host_id or settings.default_browser_host_id
selector = resolve_selector(field)
try:
async with browser_client.open_page(target_host) as raw_page:
page = cast(PageLike, cast(object, raw_page))
field_info = await inspect_field(page, selector)
input_info = await inspect_input(page, selector)
chips = await get_selected_chips(page, selector)
issues = analyze_field_issues(field_info, input_info, chips)
return FieldDiagnosticsResponse(
selector=selector,
exists=field_info.exists,
visible=field_info.visible,
tag_name=field_info.tag_name,
input_exists=input_info.exists,
input_disabled=input_info.disabled,
chip_count=chips.count,
chips=[c.text for c in chips.chips],
issues=issues,
error=field_info.error or input_info.error,
)
except Exception as e:
return FieldDiagnosticsResponse(
selector=selector,
exists=False,
error=str(e),
)
@router.get("/dropdown", response_model=DropdownDiagnosticsResponse)
async def diagnose_dropdown(
request: Request,
field: str,
open: bool = True,
host_id: str | None = None,
) -> DropdownDiagnosticsResponse:
"""Diagnose a dropdown field's state and options.
Args:
field: Field name (e.g., 'commodity') or CSS selector
open: Whether to open the dropdown to inspect options (default: True)
host_id: Optional browser host ID
Returns:
Dropdown diagnostics including options and component structure
"""
state = _get_app_state(request)
browser_client = state.browser_client
settings = state.settings
target_host = host_id or settings.default_browser_host_id
selector = resolve_selector(field)
try:
async with browser_client.open_page(target_host) as raw_page:
page = cast(PageLike, cast(object, raw_page))
result = await inspect_dropdown(page, selector, open_dropdown=open)
issues = analyze_field_issues(
result.field, result.input_element, result.chips
)
options_list: list[dict[str, object]] = []
option_count = 0
if result.listbox_after_click:
option_count = result.listbox_after_click.option_count
options_list = [
{
"index": o.index,
"text": o.text,
"selected": o.aria_selected == "true",
}
for o in result.listbox_after_click.options
]
return DropdownDiagnosticsResponse(
selector=selector,
field_exists=result.field.exists,
field_visible=result.field.visible,
input_disabled=result.input_element.disabled,
chip_count=result.chips.count,
chips=[c.text for c in result.chips.chips],
dropdown_opened=result.click_success,
option_count=option_count,
options=options_list,
component_structure=result.component_structure,
issues=issues,
)
except Exception as e:
return DropdownDiagnosticsResponse(
selector=selector,
field_exists=False,
error=str(e),
)
@router.get("/form", response_model=FormDiagnosticsResponse)
async def diagnose_form(
request: Request,
host_id: str | None = None,
) -> FormDiagnosticsResponse:
"""Diagnose all known form fields.
Args:
host_id: Optional browser host ID
Returns:
Diagnostics for all known intake form fields
"""
state = _get_app_state(request)
browser_client = state.browser_client
settings = state.settings
target_host = host_id or settings.default_browser_host_id
fields: dict[str, FieldDiagnosticsResponse] = {}
url = "unknown"
try:
async with browser_client.open_page(target_host) as raw_page:
page = cast(PageLike, cast(object, raw_page))
# Get page URL
url_result = await page.evaluate("window.location.href")
url = str(url_result) if url_result else "unknown"
for field_name, selector in KNOWN_FIELDS.items():
field_info = await inspect_field(page, selector)
input_info = await inspect_input(page, selector)
chips = await get_selected_chips(page, selector)
issues = analyze_field_issues(field_info, input_info, chips)
fields[field_name] = FieldDiagnosticsResponse(
selector=selector,
exists=field_info.exists,
visible=field_info.visible,
tag_name=field_info.tag_name,
input_exists=input_info.exists,
input_disabled=input_info.disabled,
chip_count=chips.count,
chips=[c.text for c in chips.chips],
issues=issues,
error=field_info.error or input_info.error,
)
except Exception:
# Return whatever was collected so far; the summary flags the failure
# (the exception message itself is not surfaced to the caller)
return FormDiagnosticsResponse(
url=url,
fields=fields,
summary={"error": 1, "total": len(KNOWN_FIELDS)},
)
# Generate summary
found_count = sum(bool(f.exists) for f in fields.values())
visible_count = sum(bool(f.visible) for f in fields.values())
with_issues = sum(bool(f.issues) for f in fields.values())
return FormDiagnosticsResponse(
url=url,
fields=fields,
summary={
"total": len(fields),
"found": found_count,
"visible": visible_count,
"with_issues": with_issues,
},
)
@router.get("/page", response_model=PageDiagnosticsResponse)
async def diagnose_page(
request: Request,
host_id: str | None = None,
) -> PageDiagnosticsResponse:
"""Diagnose page structure and available selectors.
Args:
host_id: Optional browser host ID
Returns:
Page structure including data-cy elements
"""
state = _get_app_state(request)
browser_client = state.browser_client
settings = state.settings
target_host = host_id or settings.default_browser_host_id
try:
async with browser_client.open_page(target_host) as raw_page:
page = cast(PageLike, cast(object, raw_page))
result = await inspect_page_structure(page)
return PageDiagnosticsResponse(
url=result.url,
title=result.title,
form_exists=result.form_exists,
total_data_cy_count=result.total_data_cy_count,
data_cy_elements=result.data_cy_elements,
)
except Exception as e:
raise HTTPException(status_code=500, detail=str(e)) from e
@router.get("/fields", response_model=dict[str, str])
async def list_known_fields() -> dict[str, str]:
"""List all known field names and their selectors.
Returns:
Mapping of field names to CSS selectors
"""
return KNOWN_FIELDS
class SelectorsDiagnosticsResponse(BaseModel):
"""Response for selector extraction."""
ui_elements: dict[str, str]
total_elements: int
filtered_elements: int
filter_applied: str | None = None
error: str | None = None
@router.get("/selectors", response_model=SelectorsDiagnosticsResponse)
async def extract_selectors(
request: Request,
filter: str | None = None,
host_id: str | None = None,
) -> SelectorsDiagnosticsResponse:
"""Extract UI element selectors from the current page.
Uses regex patterns to identify interactive elements (inputs, buttons, selects)
with their CSS selectors based on data-cy, data-testid, name, id, or aria-label.
Args:
filter: Optional filter string for partial match on element names
host_id: Optional browser host ID
Returns:
Extracted UI elements with their CSS selectors
"""
state = _get_app_state(request)
browser_client = state.browser_client
settings = state.settings
target_host = host_id or settings.default_browser_host_id
try:
async with browser_client.open_page(target_host) as raw_page:
page = cast(PageLike, cast(object, raw_page))
ui_elements = await extract_ui_elements(page)
if not ui_elements:
return SelectorsDiagnosticsResponse(
ui_elements={},
total_elements=0,
filtered_elements=0,
)
total_count = len(ui_elements)
if filter:
filter_upper = filter.upper().replace("-", "_")
filtered_elements = {
name: selector
for name, selector in ui_elements.items()
if filter_upper in name
}
filtered_count = len(filtered_elements)
else:
filtered_elements = ui_elements
filtered_count = total_count
return SelectorsDiagnosticsResponse(
ui_elements=filtered_elements,
total_elements=total_count,
filtered_elements=filtered_count,
filter_applied=filter,
)
except Exception as e:
return SelectorsDiagnosticsResponse(
ui_elements={},
total_elements=0,
filtered_elements=0,
error=str(e),
)


@@ -0,0 +1,58 @@
"""Keep-alive status endpoint for monitoring browserless session health."""

from typing import Annotated, Protocol, cast

from fastapi import APIRouter, Depends, FastAPI, Request

from guide.app.browser.keepalive import KeepAliveService
from guide.app.models.domain import (
    KeepAliveHostStatusDTO,
    KeepAliveStatusResponse,
)


class AppStateProtocol(Protocol):
    """Protocol for app state with keep-alive service."""

    keepalive_service: KeepAliveService


def _keepalive_service(request: Request) -> KeepAliveService:
    """Extract keep-alive service from app state."""
    app = cast(FastAPI, request.app)
    state = cast(AppStateProtocol, cast(object, app.state))
    return state.keepalive_service


KeepAliveDep = Annotated[KeepAliveService, Depends(_keepalive_service)]

router = APIRouter(prefix="/keep-alive", tags=["keep-alive"])


@router.get("/status", response_model=KeepAliveStatusResponse)
async def get_keepalive_status(service: KeepAliveDep) -> KeepAliveStatusResponse:
    """Get keep-alive status for all browserless hosts.

    Returns the current status of the keep-alive service including:
    - Status of each browserless host (last ping time, connection state, errors)
    - Service configuration (ping interval)
    - Whether the service is currently running
    """
    status = service.get_status()
    hosts = [
        KeepAliveHostStatusDTO(
            host_id=host.host_id,
            last_ping=host.last_ping.isoformat() if host.last_ping else None,
            is_connected=host.is_connected,
            last_error=host.last_error,
        )
        for host in status.hosts
    ]
    return KeepAliveStatusResponse(
        hosts=hosts,
        interval_seconds=status.interval_seconds,
        is_running=status.is_running,
        started_at=status.started_at.isoformat() if status.started_at else None,
    )


@@ -0,0 +1,104 @@
"""OTP callback endpoint for n8n webhook responses.
Receives OTP URLs from n8n after it searches for and extracts
the magic link from email.
"""

import logging
from typing import Annotated

from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel

from guide.app.auth.otp_callback import OtpStoreBackend, get_otp_callback_store

_logger = logging.getLogger(__name__)

router = APIRouter(prefix="/auth", tags=["auth"])


class OtpCallbackRequest(BaseModel):
    """Request body for OTP callback from n8n."""

    correlation_id: str
    """Correlation ID from the original OTP request."""

    otp_url: str | None = None
    """The OTP magic link URL extracted from email."""

    error: str | None = None
    """Error message if OTP URL could not be retrieved."""


class OtpCallbackResponse(BaseModel):
    """Response for OTP callback."""

    status: str
    """Status of the callback. Currently always 'fulfilled'; an unknown
    correlation ID raises a 404 instead of returning 'not_found'."""

    correlation_id: str
    """The correlation ID that was processed."""


def _get_store() -> OtpStoreBackend:
    """Dependency to get the OTP callback store."""
    return get_otp_callback_store()


StoreDep = Annotated[OtpStoreBackend, Depends(_get_store)]


@router.post("/otp-callback", response_model=OtpCallbackResponse)
async def otp_callback(
    payload: OtpCallbackRequest,
    store: StoreDep,
) -> OtpCallbackResponse:
    """Receive OTP URL callback from n8n.

    Called by n8n after it:
    1. Receives webhook notification that OTP was requested
    2. Searches inbox for OTP email
    3. Extracts magic link URL

    Args:
        payload: Callback request with correlation_id and otp_url or error.
        store: The OTP callback store.

    Returns:
        Response indicating that the callback was fulfilled.

    Raises:
        HTTPException: 400 if neither otp_url nor error is provided;
            404 if no pending request matches correlation_id.
    """
    _logger.info(
        "Received OTP callback: correlation_id=%s, has_url=%s, error=%s",
        payload.correlation_id,
        bool(payload.otp_url),
        payload.error,
    )
    if not payload.otp_url and not payload.error:
        raise HTTPException(
            status_code=400,
            detail="Either otp_url or error must be provided",
        )
    fulfilled = await store.fulfill(
        correlation_id=payload.correlation_id,
        otp_url=payload.otp_url,
        error=payload.error,
    )
    if not fulfilled:
        raise HTTPException(
            status_code=404,
            detail=f"No pending OTP request for correlation_id: {payload.correlation_id}",
        )
    return OtpCallbackResponse(
        status="fulfilled",
        correlation_id=payload.correlation_id,
    )


__all__ = ["router"]


@@ -0,0 +1,98 @@
"""Session management API endpoints."""

from typing import Annotated, Protocol, cast

from fastapi import APIRouter, Depends, FastAPI, HTTPException, Request
from pydantic import BaseModel

from guide.app.auth import SessionManager, SessionValidationResult


class AppStateProtocol(Protocol):
    """Protocol for app state with session manager."""

    session_manager: SessionManager


def _session_manager(request: Request) -> SessionManager:
    """Extract session manager from app state."""
    app = cast(FastAPI, request.app)
    state = cast(AppStateProtocol, cast(object, app.state))
    return state.session_manager


SessionManagerDep = Annotated[SessionManager, Depends(_session_manager)]


class SessionStatusResponse(BaseModel):
    """Response for session status endpoint."""

    persona_id: str
    email: str
    is_valid: bool
    reason: str | None = None
    remaining_seconds: int | None = None
    created_at: str
    expires_at: str
    origin_url: str


class SessionListResponse(BaseModel):
    """Response for session list endpoint."""

    sessions: list[str]


router = APIRouter(prefix="/sessions", tags=["sessions"])


@router.get("/", response_model=SessionListResponse)
async def list_sessions(session_manager: SessionManagerDep) -> SessionListResponse:
    """List all stored session persona IDs."""
    sessions = session_manager.list_sessions()
    return SessionListResponse(sessions=sessions)


@router.get("/{persona_id}", response_model=SessionStatusResponse)
async def get_session_status(
    persona_id: str,
    session_manager: SessionManagerDep,
) -> SessionStatusResponse:
    """Get session status for a persona.

    Returns session metadata and validation status.
    """
    session = session_manager.load_session(persona_id)
    if session is None:
        raise HTTPException(
            status_code=404,
            detail=f"No session found for persona '{persona_id}'",
        )
    validation: SessionValidationResult = session_manager.validate_offline(session)
    return SessionStatusResponse(
        persona_id=session.persona_id,
        email=session.email,
        is_valid=validation.is_valid,
        reason=validation.reason,
        remaining_seconds=validation.remaining_seconds,
        created_at=session.created_at.isoformat(),
        expires_at=session.session_ttl_expires_at.isoformat(),
        origin_url=session.origin_url,
    )


@router.delete("/{persona_id}")
async def delete_session(
    persona_id: str,
    session_manager: SessionManagerDep,
) -> dict[str, str]:
    """Delete/invalidate session for a persona."""
    if not session_manager.invalidate(persona_id):
        raise HTTPException(
            status_code=404,
            detail=f"No session found for persona '{persona_id}'",
        )
    return {"status": "deleted", "persona_id": persona_id}


@@ -1,16 +1,37 @@
from guide.app.auth.mfa import DummyMfaCodeProvider, MfaCodeProvider
from guide.app.auth.otp_callback import (
OtpCallbackStore,
OtpRequest,
get_otp_callback_store,
)
from guide.app.auth.session import (
detect_current_persona,
ensure_persona,
login_with_mfa,
login_with_otp_url,
login_with_verification_code,
logout,
validate_persona_from_storage,
)
from guide.app.auth.session_manager import SessionManager
from guide.app.auth.session_models import SessionData, SessionValidationResult
from guide.app.auth.session_storage import SessionStorage
__all__ = [
"DummyMfaCodeProvider",
"MfaCodeProvider",
"OtpCallbackStore",
"OtpRequest",
"SessionData",
"SessionManager",
"SessionStorage",
"SessionValidationResult",
"detect_current_persona",
"ensure_persona",
"get_otp_callback_store",
"login_with_mfa",
"login_with_otp_url",
"login_with_verification_code",
"logout",
"validate_persona_from_storage",
]


@@ -0,0 +1,568 @@
"""OTP callback store for async correlation between request and webhook response.
Enables the two-phase OTP flow:
1. Action triggers OTP email, sends webhook to n8n, waits for callback
2. n8n finds OTP URL, calls callback endpoint with correlation_id
3. Action receives OTP URL and completes login
Supports two backends:
- Memory (default): In-process dict, suitable for single-worker deployments
- Redis: Shared state across workers, suitable for multi-worker deployments
"""
import asyncio
import json
import logging
from dataclasses import dataclass, field
from datetime import UTC, datetime
from typing import TYPE_CHECKING, Literal, Protocol, cast
if TYPE_CHECKING:
import redis.asyncio
_logger = logging.getLogger(__name__)
@dataclass
class OtpRequest:
"""Pending OTP request awaiting callback."""
correlation_id: str
email: str
created_at: datetime = field(default_factory=lambda: datetime.now(UTC))
event: asyncio.Event = field(default_factory=asyncio.Event)
otp_url: str | None = None
error: str | None = None
@dataclass
class OtpRequestData:
"""Serializable OTP request data (without asyncio.Event)."""
correlation_id: str
email: str
created_at: str # ISO format string
otp_url: str | None = None
error: str | None = None
def to_json(self) -> str:
"""Serialize to JSON string."""
return json.dumps(
{
"correlation_id": self.correlation_id,
"email": self.email,
"created_at": self.created_at,
"otp_url": self.otp_url,
"error": self.error,
}
)
@classmethod
def from_json(cls, data: str) -> "OtpRequestData":
"""Deserialize from JSON string."""
parsed = cast(dict[str, object], json.loads(data))
correlation_id = parsed.get("correlation_id")
email = parsed.get("email")
created_at = parsed.get("created_at")
otp_url = parsed.get("otp_url")
error = parsed.get("error")
return cls(
correlation_id=str(correlation_id) if correlation_id else "",
email=str(email) if email else "",
created_at=str(created_at) if created_at else "",
otp_url=str(otp_url) if otp_url else None,
error=str(error) if error else None,
)
class OtpStoreBackend(Protocol):
"""Protocol for OTP callback store backends."""
async def register(self, correlation_id: str, email: str) -> OtpRequest:
"""Register a new OTP request."""
...
async def wait_for_callback(
self,
correlation_id: str,
timeout: float | None = None,
) -> str:
"""Wait for OTP URL callback."""
...
async def fulfill(
self,
correlation_id: str,
otp_url: str | None = None,
error: str | None = None,
) -> bool:
"""Fulfill a pending OTP request."""
...
async def cleanup_expired(self, max_age_seconds: float = 300) -> int:
"""Remove expired requests."""
...
class MemoryOtpStore:
"""In-memory OTP callback store for single-worker deployments.
Usage:
store = MemoryOtpStore()
# In action: register and wait (both methods are coroutines)
request = await store.register(correlation_id, email)
otp_url = await store.wait_for_callback(correlation_id, timeout=120)
# In callback endpoint: fulfill
await store.fulfill(correlation_id, otp_url)
"""
_requests: dict[str, OtpRequest]
_lock: asyncio.Lock
_default_timeout: float
def __init__(self, default_timeout: float = 300.0) -> None:
"""Initialize the callback store.
Args:
default_timeout: Default timeout in seconds for waiting (5 min default).
"""
self._requests = {}
self._lock = asyncio.Lock()
self._default_timeout = default_timeout
async def register(self, correlation_id: str, email: str) -> OtpRequest:
"""Register a new OTP request.
Args:
correlation_id: Unique ID to correlate request with callback.
email: Email address OTP was requested for.
Returns:
The registered OtpRequest.
"""
async with self._lock:
# Opportunistically purge expired requests to bound memory growth:
# run whenever the pending count is a non-zero multiple of 10
if len(self._requests) > 0 and len(self._requests) % 10 == 0:
_ = await self._cleanup_expired_unlocked()
request = OtpRequest(correlation_id=correlation_id, email=email)
self._requests[correlation_id] = request
_logger.info("Registered OTP request: %s for %s", correlation_id, email)
return request
async def wait_for_callback(
self,
correlation_id: str,
timeout: float | None = None,
) -> str:
"""Wait for OTP URL callback.
Args:
correlation_id: The request correlation ID.
timeout: Timeout in seconds (default: store default).
Returns:
The OTP URL from callback.
Raises:
TimeoutError: If callback not received within timeout.
ValueError: If correlation_id not found or callback had error.
"""
timeout = timeout or self._default_timeout
async with self._lock:
request = self._requests.get(correlation_id)
if not request:
raise ValueError(
f"No pending request for correlation_id: {correlation_id}"
)
_logger.info(
"Waiting for OTP callback: %s (timeout=%ss)", correlation_id, timeout
)
try:
_ = await asyncio.wait_for(request.event.wait(), timeout=timeout)
except asyncio.TimeoutError:
await self._cleanup(correlation_id)
raise TimeoutError(
f"OTP callback timeout after {timeout}s for {correlation_id}"
) from None
if request.error:
await self._cleanup(correlation_id)
raise ValueError(f"OTP callback error: {request.error}")
if not request.otp_url:
await self._cleanup(correlation_id)
raise ValueError(f"OTP callback received but no URL for {correlation_id}")
await self._cleanup(correlation_id)
_logger.info("OTP callback received for: %s", correlation_id)
return request.otp_url
async def fulfill(
self,
correlation_id: str,
otp_url: str | None = None,
error: str | None = None,
) -> bool:
"""Fulfill a pending OTP request with URL or error.
Args:
correlation_id: The request correlation ID.
otp_url: The OTP URL from n8n.
error: Error message if OTP retrieval failed.
Returns:
True if request was found and fulfilled, False otherwise.
"""
async with self._lock:
request = self._requests.get(correlation_id)
if not request:
_logger.warning("No pending request for callback: %s", correlation_id)
return False
request.otp_url = otp_url
request.error = error
_ = request.event.set()
_logger.info(
"Fulfilled OTP request: %s (url=%s, error=%s)",
correlation_id,
bool(otp_url),
error,
)
return True
async def _cleanup(self, correlation_id: str) -> None:
"""Remove completed request from store."""
async with self._lock:
_ = self._requests.pop(correlation_id, None)
async def _cleanup_expired_unlocked(self, max_age_seconds: float = 300) -> int:
"""Remove expired requests older than max_age (must hold lock).
Args:
max_age_seconds: Maximum age in seconds.
Returns:
Number of requests cleaned up.
"""
now = datetime.now(UTC)
expired: list[str] = []
for cid, request in self._requests.items():
age = (now - request.created_at).total_seconds()
if age > max_age_seconds:
expired.append(cid)
for cid in expired:
_ = self._requests.pop(cid, None)
if expired:
_logger.info("Cleaned up %d expired OTP requests", len(expired))
return len(expired)
async def cleanup_expired(self, max_age_seconds: float = 300) -> int:
"""Remove expired requests older than max_age.
Args:
max_age_seconds: Maximum age in seconds.
Returns:
Number of requests cleaned up.
"""
async with self._lock:
return await self._cleanup_expired_unlocked(max_age_seconds)
class RedisOtpStore:
"""Redis-backed OTP callback store for multi-worker deployments.
Uses Redis for shared state across workers. Implements a polling pattern
because an asyncio.Event cannot be shared across processes.
Usage:
store = RedisOtpStore(redis_url="redis://localhost:6379/0")
# In action: register and wait
await store.register(correlation_id, email)
otp_url = await store.wait_for_callback(correlation_id, timeout=120)
# In callback endpoint: fulfill
await store.fulfill(correlation_id, otp_url)
"""
_redis_url: str
_default_timeout: float
_ttl_seconds: int
_poll_interval: float
_key_prefix: str
def __init__(
self,
redis_url: str,
default_timeout: float = 300.0,
ttl_seconds: int = 600,
poll_interval: float = 0.5,
key_prefix: str = "otp_callback:",
) -> None:
"""Initialize the Redis OTP callback store.
Args:
redis_url: Redis connection URL.
default_timeout: Default timeout in seconds for waiting (5 min default).
ttl_seconds: TTL for Redis keys (auto-cleanup, 10 min default).
poll_interval: Polling interval in seconds.
key_prefix: Prefix for Redis keys.
"""
self._redis_url = redis_url
self._default_timeout = default_timeout
self._ttl_seconds = ttl_seconds
self._poll_interval = poll_interval
self._key_prefix = key_prefix
def _get_key(self, correlation_id: str) -> str:
"""Get Redis key for correlation ID."""
return f"{self._key_prefix}{correlation_id}"
async def _get_redis(self) -> "redis.asyncio.Redis[str]":
"""Get Redis connection."""
from redis.asyncio import Redis
client: Redis[str] = Redis.from_url(self._redis_url, decode_responses=True)
return client
async def register(self, correlation_id: str, email: str) -> OtpRequest:
"""Register a new OTP request in Redis.
Args:
correlation_id: Unique ID to correlate request with callback.
email: Email address OTP was requested for.
Returns:
The registered OtpRequest.
"""
now = datetime.now(UTC)
data = OtpRequestData(
correlation_id=correlation_id,
email=email,
created_at=now.isoformat(),
)
client = await self._get_redis()
try:
key = self._get_key(correlation_id)
_ = await client.setex(key, self._ttl_seconds, data.to_json())
_logger.info(
"Registered OTP request in Redis: %s for %s", correlation_id, email
)
finally:
await client.close()
# Return OtpRequest with local Event for in-process compatibility
return OtpRequest(
correlation_id=correlation_id,
email=email,
created_at=now,
)
async def wait_for_callback(
self,
correlation_id: str,
timeout: float | None = None,
) -> str:
"""Wait for OTP URL callback using polling.
Args:
correlation_id: The request correlation ID.
timeout: Timeout in seconds (default: store default).
Returns:
The OTP URL from callback.
Raises:
TimeoutError: If callback not received within timeout.
ValueError: If correlation_id not found or callback had error.
"""
timeout = timeout or self._default_timeout
key = self._get_key(correlation_id)
start_time = asyncio.get_running_loop().time()
_logger.info(
"Waiting for OTP callback (Redis): %s (timeout=%ss)",
correlation_id,
timeout,
)
client = await self._get_redis()
try:
while True:
elapsed = asyncio.get_running_loop().time() - start_time
if elapsed >= timeout:
_ = await client.delete(key)
raise TimeoutError(
f"OTP callback timeout after {timeout}s for {correlation_id}"
)
raw_data = await client.get(key)
if not raw_data:
# Key may have expired (TTL) or was never registered
raise TimeoutError(
f"OTP request expired or not found for {correlation_id} (key may have exceeded TTL before callback arrived)"
)
data = OtpRequestData.from_json(raw_data)
# Check if fulfilled (has otp_url or error)
if data.otp_url or data.error:
_ = await client.delete(key)
if data.error:
raise ValueError(f"OTP callback error: {data.error}")
if not data.otp_url:
raise ValueError(
f"OTP callback received but no URL for {correlation_id}"
)
_logger.info(
"OTP callback received (Redis) for: %s", correlation_id
)
return data.otp_url
# Not fulfilled yet, wait and poll again
await asyncio.sleep(self._poll_interval)
finally:
await client.close()
async def fulfill(
self,
correlation_id: str,
otp_url: str | None = None,
error: str | None = None,
) -> bool:
"""Fulfill a pending OTP request with URL or error.
Args:
correlation_id: The request correlation ID.
otp_url: The OTP URL from n8n.
error: Error message if OTP retrieval failed.
Returns:
True if request was found and fulfilled, False otherwise.
"""
key = self._get_key(correlation_id)
client = await self._get_redis()
try:
raw_data = await client.get(key)
if not raw_data:
_logger.warning(
"No pending request in Redis for callback: %s", correlation_id
)
return False
data = OtpRequestData.from_json(raw_data)
data.otp_url = otp_url
data.error = error
# Update with fulfilled data (SETEX resets the TTL to the full ttl_seconds)
_ = await client.setex(key, self._ttl_seconds, data.to_json())
_logger.info(
"Fulfilled OTP request in Redis: %s (url=%s, error=%s)",
correlation_id,
bool(otp_url),
error,
)
return True
finally:
await client.close()
async def cleanup_expired(self, max_age_seconds: float = 300) -> int:
"""Remove expired requests (no-op for Redis, TTL handles cleanup).
Redis keys are automatically expired via TTL set during register.
This method exists for interface compatibility.
Args:
max_age_seconds: Maximum age in seconds (ignored, TTL used instead).
Returns:
Always 0 (Redis handles cleanup automatically).
"""
# Redis TTL handles cleanup automatically
# Parameter exists for interface compatibility
_ = max_age_seconds
return 0
# Backward compatibility alias
OtpCallbackStore = MemoryOtpStore
# Type alias for store backends
OtpStore = MemoryOtpStore | RedisOtpStore
# Global singleton instance
_store: OtpStore | None = None
_store_backend: Literal["memory", "redis"] = "memory"
_store_redis_url: str | None = None
def configure_otp_store(
backend: Literal["memory", "redis"] = "memory",
redis_url: str | None = None,
) -> None:
"""Configure the OTP store backend.
Must be called before get_otp_callback_store() to take effect.
Args:
backend: Backend type ("memory" or "redis").
redis_url: Redis URL (required if backend is "redis").
Raises:
ValueError: If backend is "redis" but no redis_url provided.
"""
global _store_backend, _store_redis_url, _store
if backend == "redis" and not redis_url:
raise ValueError("redis_url is required when backend is 'redis'")
_store_backend = backend
_store_redis_url = redis_url
_store = None # Reset singleton to pick up new config
_logger.info("Configured OTP store backend: %s", backend)
def get_otp_callback_store() -> OtpStore:
"""Get the global OTP callback store instance."""
global _store
if _store is None:
if _store_backend == "redis" and _store_redis_url:
_store = RedisOtpStore(redis_url=_store_redis_url)
_logger.info("Created Redis OTP callback store")
else:
_store = MemoryOtpStore()
_logger.info("Created Memory OTP callback store")
return _store
__all__ = [
"OtpCallbackStore",
"OtpRequest",
"OtpRequestData",
"OtpStoreBackend",
"MemoryOtpStore",
"RedisOtpStore",
"OtpStore",
"configure_otp_store",
"get_otp_callback_store",
]


@@ -1,49 +1,715 @@
from playwright.async_api import Page
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import cast
from loguru import logger
from guide.app.auth.mfa import MfaCodeProvider
from guide.app.browser.types import PageLike
from guide.app.core.config import Timeouts
from guide.app.errors import PersonaError
from guide.app.models.personas.models import DemoPersona
from guide.app.strings.registry import app_strings
from guide.app.strings.selectors.auth import Auth0ErrorSelectors, AuthSelectors
from guide.app.utils.jwt import is_jwt_format as _is_jwt_format
# Module-level default timeouts for backward compatibility
_DEFAULT_TIMEOUTS = Timeouts()
async def detect_current_persona(page: Page) -> str | None:
"""Return the email/identifier of the currently signed-in user, if visible."""
element = page.locator(app_strings.auth.current_user_display)
if await element.count() == 0:
async def _check_for_auth_errors(
page: PageLike,
selectors: tuple[str, ...],
context: str,
) -> str | None:
"""Check page for authentication error messages.
Iterates through error selectors and returns the first matching error text.
Args:
page: Browser page to check.
selectors: Tuple of selectors to check for errors.
context: Context string for logging (e.g., "OTP login", "verification code").
Returns:
Error text if found, None otherwise.
"""
for selector in selectors:
error_el = page.locator(selector)
if await error_el.count() > 0:
error_text = await error_el.first.text_content()
logger.warning(
"{} error: {} (selector: {})",
context,
error_text,
selector,
)
return error_text or "Unknown error"
return None
@dataclass(frozen=True)
class ExtractedToken:
"""Bearer token extracted from browser session."""
value: str
source_key: str
extracted_at: datetime
def _extract_email_from_auth0_user(value: str) -> str | None:
"""Extract email from Auth0 SPA SDK user cache entry.
Auth0 SPA SDK stores user info with structure:
{"body": {"email": "...", "name": "...", ...}, "expiresAt": ...}
or sometimes just {"email": "...", ...}
"""
try:
loaded = cast(
dict[str, object] | list[object] | str | int | float | bool | None,
json.loads(value),
)
except json.JSONDecodeError:
return None
- text = await element.first.text_content()
- if text is None:
if not isinstance(loaded, dict):
return None
- prefix = app_strings.auth.current_user_display_prefix
- if prefix and text.startswith(prefix):
- return text.removeprefix(prefix).strip()
- return text.strip()
# Try nested body structure first (Auth0 SPA SDK pattern)
body = loaded.get("body")
if isinstance(body, dict):
body_dict = cast(dict[str, object], body)
if isinstance(email := body_dict.get("email"), str) and email:
return email
# Try direct email field
return email if isinstance(email := loaded.get("email"), str) and email else None
def _extract_email_from_json(value: str) -> str | None:
"""Try to extract an email from a JSON object value."""
try:
loaded = cast(
dict[str, object] | list[object] | str | int | float | bool | None,
json.loads(value),
)
except json.JSONDecodeError:
return None
if not isinstance(loaded, dict):
return None
# Check common email field names
email_fields = ["email", "user_email", "userEmail", "sub", "preferred_username"]
for field in email_fields:
email_value = loaded.get(field)
if isinstance(email_value, str) and "@" in email_value:
return email_value
return None
async def detect_current_persona(page: PageLike) -> str | None:
"""Return the email/identifier of the currently signed-in user from localStorage.
Scans localStorage for Auth0 SPA SDK user keys or other auth-related keys
that may contain user email information.
"""
tokens = await discover_auth_tokens(page)
logger.debug(
"Discovered {} auth-related localStorage keys: {}",
len(tokens),
list(tokens.keys()),
)
if not tokens:
logger.debug("No auth tokens found in localStorage")
return None
# First priority: Plain 'email' key (direct email string)
if "email" in tokens:
email_value = tokens["email"]
if "@" in email_value:
logger.info(
"Detected persona from localStorage 'email' key: {}", email_value
)
return email_value
# Second priority: Auth0 SPA SDK user keys (@@user@@ keys contain profile info)
for key, value in tokens.items():
if "@@auth0spajs@@" in key and "@@user@@" in key:
logger.debug("Found Auth0 SPA SDK user key: {}", key)
if email := _extract_email_from_auth0_user(value):
logger.info("Detected persona from Auth0 user cache: {}", email)
return email
# Third priority: Any auth-related key containing user email
for key, value in tokens.items():
if email := _extract_email_from_json(value):
logger.info("Detected persona from localStorage key '{}': {}", key, email)
return email
logger.debug("No email found in any localStorage keys")
return None
async def login_with_mfa(
- page: Page, email: str, mfa_provider: MfaCodeProvider, login_url: str | None = None
page: PageLike,
email: str,
mfa_provider: MfaCodeProvider,
login_url: str | None = None,
) -> None:
- if login_url:
- _response = await page.goto(login_url)
- del _response
- await page.fill(app_strings.auth.email_input, email)
- await page.click(app_strings.auth.send_code_button)
"""Log in with MFA. Only proceeds if email input exists after navigation."""
email_input = page.locator(AuthSelectors.EMAIL_INPUT)
# Check if we need to navigate to the login page
if await email_input.count() == 0:
if login_url:
logger.debug("Navigating to login URL: {}", login_url)
_response = await page.goto(login_url)
del _response
# Check again after navigation - user might already be logged in
if await email_input.count() == 0:
logger.debug(
"No email input found after navigation - user already logged in"
)
return
else:
logger.debug("No login URL and no email input - user already logged in")
return
logger.info("Starting MFA login for: {}", email)
await page.fill(AuthSelectors.EMAIL_INPUT, email)
await page.click(AuthSelectors.SEND_CODE_BUTTON)
logger.debug("Sent MFA code request, waiting for code")
code = mfa_provider.get_code(email)
- await page.fill(app_strings.auth.code_input, code)
- await page.click(app_strings.auth.submit_button)
await page.fill(AuthSelectors.CODE_INPUT, code)
await page.click(AuthSelectors.SUBMIT_BUTTON)
logger.info("MFA login submitted for: {}", email)
- async def logout(page: Page) -> None:
- await page.click(app_strings.auth.logout_button)
async def logout(page: PageLike) -> None:
"""Log out if the logout button exists."""
logout_btn = page.locator(AuthSelectors.LOGOUT_BUTTON)
if await logout_btn.count() > 0:
logger.info("Logging out current user")
await logout_btn.click()
else:
logger.debug("No logout button found - user may not be logged in")
async def login_with_verification_code(
page: PageLike,
verification_code: str,
expected_email: str | None = None,
*,
timeouts: Timeouts | None = None,
) -> bool:
"""Authenticate via verification code input on Auth0 page.
Used when Auth0 shows "Enter your email code to log in" page
instead of processing a magic link directly.
Args:
page: Browser page already on the Auth0 verification code page.
verification_code: 6-digit code from email.
expected_email: If provided, validate logged-in user matches this email.
timeouts: Optional Timeouts instance for centralized configuration.
Returns:
True if authentication successful, False otherwise.
"""
from guide.app.browser.wait import wait_for_stable_page
from guide.app.strings.selectors.login import LoginSelectors
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
logger.info("Starting verification code login flow")
# Wait for code input field
code_input = page.locator(LoginSelectors.VERIFICATION_CODE_INPUT)
try:
await code_input.wait_for(
state="visible", timeout=effective_timeouts.element_default
)
except Exception as exc:
logger.error("Verification code input not found: {}", exc)
return False
# Fill verification code
logger.info("Filling verification code")
await code_input.fill(verification_code)
# Click submit button
submit_btn = page.locator(LoginSelectors.VERIFICATION_CODE_SUBMIT)
if await submit_btn.count() > 0:
logger.info("Clicking verification code submit button")
await submit_btn.first.click()
else:
logger.warning("No verification code submit button found")
return False
# Wait for auth redirect
await wait_for_stable_page(page, stability_check_ms=8000)
# Log current URL after submission
logger.info("URL after verification code submission: {}", page.url)
# Check for error messages
if await _check_for_auth_errors(
page, Auth0ErrorSelectors.VERIFICATION_CODE_ERRORS, "Verification code"
):
return False
# Optionally validate logged-in user
if expected_email:
current = await detect_current_persona(page)
if current and current.lower() == expected_email.lower():
logger.info("Verification code login successful for: {}", expected_email)
return True
logger.warning(
"Verification code login validation failed - expected: {}, detected: {}",
expected_email,
current,
)
return False
logger.info("Verification code login completed (no email validation requested)")
return True
async def login_with_otp_url(
page: PageLike,
otp_url: str,
expected_email: str | None = None,
) -> bool:
"""Authenticate via one-time passwordless URL.
Navigate to the OTP URL and complete authentication.
Some OTP flows show a confirmation page requiring a button click.
Args:
page: Browser page.
otp_url: One-time auth URL with verification code (e.g., passwordless link).
expected_email: If provided, validate logged-in user matches this email.
Returns:
True if authentication successful, False otherwise.
"""
from guide.app.browser.wait import wait_for_stable_page
logger.info("Starting OTP login flow")
logger.info("OTP URL: {}", otp_url)
# Navigate to OTP URL
try:
_ = await page.goto(otp_url)
except Exception as exc:
logger.error("Failed to navigate to OTP URL: {}", exc)
return False
# Wait for page to stabilize
await wait_for_stable_page(page, stability_check_ms=2000)
# Check for error messages (e.g., expired OTP)
if await _check_for_auth_errors(
page, Auth0ErrorSelectors.OTP_URL_ERRORS, "OTP navigation"
):
return False
# Check for OTP confirmation page - Auth0 shows "Almost there" with a LOG IN button
button_clicked = False
for selector in Auth0ErrorSelectors.LOGIN_BUTTON_SELECTORS:
login_btn = page.locator(selector)
if await login_btn.count() > 0:
logger.info("Clicking OTP confirmation button: {}", selector)
await login_btn.first.click()
button_clicked = True
# Wait for auth redirect after clicking - Auth0 can be slow
await wait_for_stable_page(page, stability_check_ms=8000)
break
if not button_clicked:
logger.info("No OTP confirmation button found - may have auto-redirected")
# Log current URL after click attempt to diagnose redirect issues
logger.info("URL after OTP login flow: {}", page.url)
# Check for errors after clicking (in case click triggered error)
if await _check_for_auth_errors(
page, Auth0ErrorSelectors.OTP_URL_ERRORS, "OTP after click"
):
return False
# Optionally validate logged-in user matches expected email
if expected_email:
# Debug: log current URL and localStorage keys
try:
current_url = page.url
logger.info("Post-login URL: {}", current_url)
ls_keys = await page.evaluate("Object.keys(localStorage)")
logger.info("localStorage keys: {}", ls_keys)
except Exception as debug_exc:
logger.warning("Debug info collection failed: {}", debug_exc)
current = await detect_current_persona(page)
if current and current.lower() == expected_email.lower():
logger.info("OTP login successful for: {}", expected_email)
return True
# Capture page content on failure for debugging
try:
page_title = await page.evaluate("document.title || ''")
body_text = await page.evaluate(
"document.body?.innerText?.substring(0, 500) || ''"
)
logger.warning(
"OTP login validation failed - expected: {}, detected: {}, page_title: {}, body_preview: {}",
expected_email,
current,
page_title,
body_text,
)
except Exception:
logger.warning(
"OTP login validation failed - expected: {}, detected: {}",
expected_email,
current,
)
return False
logger.info("OTP login completed (no email validation requested)")
return True
async def ensure_persona(
- page: Page,
page: PageLike,
persona: DemoPersona,
mfa_provider: MfaCodeProvider,
login_url: str | None = None,
) -> None:
"""Ensure the browser is logged in as the specified persona."""
logger.debug("Ensuring persona: {}", persona.email)
current = await detect_current_persona(page)
if current and current.lower() == persona.email.lower():
logger.debug("Already logged in as: {}", persona.email)
return
if current:
logger.info("Switching persona from {} to {}", current, persona.email)
else:
logger.info("Logging in as persona: {}", persona.email)
await logout(page)
await login_with_mfa(page, persona.email, mfa_provider, login_url=login_url)
async def _read_local_storage_value(page: PageLike, key: str) -> str | None:
"""Return raw localStorage value for key, or None if missing or inaccessible."""
js = f"""
(() => {{
try {{
const raw = window.localStorage.getItem({json.dumps(key)});
return raw === null ? null : raw;
}} catch (err) {{
return null;
}}
}})();
"""
result = await page.evaluate(js)
return str(result) if result is not None else None
async def validate_persona_from_storage(
page: PageLike,
expected_email: str,
*,
storage_key: str,
email_field: str = "email",
) -> None:
"""Ensure the active user in localStorage matches ``expected_email``.
This is intended for extension hosts where the browser is already logged in
and we want to sanity-check the active persona without navigating or
mutating session state. Raises PersonaError if the stored email is missing
or does not case-insensitively equal ``expected_email``.
Args:
page: Page-like object (Playwright Page or ExtensionPage).
expected_email: Email address that must be active in the browser.
storage_key: localStorage key that contains the user payload.
email_field: Field name inside the stored JSON object that holds email.
"""
raw = await _read_local_storage_value(page, storage_key)
if raw is None:
raise PersonaError(
"No user found in localStorage",
details={"storage_key": storage_key, "expected_email": expected_email},
)
parsed: str | dict[str, object]
try:
# json.loads returns Any; cast to union of possible JSON types
loaded = cast(
dict[str, object] | list[object] | str | int | float | bool | None,
json.loads(raw),
)
parsed = loaded if isinstance(loaded, (dict, str)) else raw
except json.JSONDecodeError:
parsed = raw
# Allow either JSON object with an email field or a plain string payload.
stored_email: str | None
if isinstance(parsed, dict):
email_value = parsed.get(email_field)
stored_email = email_value if isinstance(email_value, str) else None
else:
stored_email = parsed
if stored_email is None:
payload_type_name = "dict" if isinstance(parsed, dict) else "str"
raise PersonaError(
"localStorage user record does not contain an email",
details={
"storage_key": storage_key,
"email_field": email_field,
"expected_email": expected_email,
"payload_type": payload_type_name,
},
)
if stored_email.lower() != expected_email.lower():
raise PersonaError(
"Active browser user email does not match expected persona",
details={
"storage_key": storage_key,
"email_field": email_field,
"expected_email": expected_email,
"actual_email": stored_email,
},
)
_JS_DISCOVER_AUTH_TOKENS = """
(() => {
try {
if (typeof localStorage === 'undefined') return {};
const patterns = ['token', 'jwt', 'bearer', 'auth', 'access', 'session', 'credential', 'email', 'user'];
const results = {};
for (let i = 0; i < localStorage.length; i++) {
const key = localStorage.key(i);
if (!key) continue;
const lowerKey = key.toLowerCase();
if (patterns.some(p => lowerKey.includes(p))) {
try {
const value = localStorage.getItem(key);
if (value) results[key] = value;
} catch (e) {}
}
}
return results;
} catch (e) {
// localStorage not accessible (about:blank, restricted pages)
return {};
}
})();
"""
_JS_LIST_ALL_LOCALSTORAGE_KEYS = """
(() => {
try {
if (typeof localStorage === 'undefined') return [];
const keys = [];
for (let i = 0; i < localStorage.length; i++) {
const key = localStorage.key(i);
if (key) keys.push(key);
}
return keys;
} catch (e) {
return [];
}
})();
"""
async def discover_auth_tokens(page: PageLike) -> dict[str, str]:
"""Scan localStorage for auth-related keys and return all matches.
Search for keys containing patterns: 'token', 'jwt', 'bearer', 'auth',
'access', 'session', 'credential', 'email', or 'user'.
Args:
page: Page-like object (Playwright Page or ExtensionPage).
Returns:
Dictionary mapping localStorage key names to their values.
Returns empty dict if localStorage is not accessible (e.g., about:blank).
"""
try:
# Log all keys for diagnostic purposes (DEBUG level)
all_keys = await page.evaluate(_JS_LIST_ALL_LOCALSTORAGE_KEYS)
if isinstance(all_keys, list):
keys_list = cast(list[str], all_keys)
logger.debug("All localStorage keys ({}): {}", len(keys_list), keys_list)
result = await page.evaluate(_JS_DISCOVER_AUTH_TOKENS)
if not isinstance(result, dict):
logger.debug("localStorage scan returned non-dict result")
return {}
result_dict = cast(dict[str, object], result)
return {
key: str(value) for key, value in result_dict.items() if value is not None
}
except Exception as exc:
# Handle cases where page.evaluate fails (restricted pages, closed pages)
logger.debug("localStorage access failed: {}", exc)
return {}
def _extract_token_from_json(value: str) -> str | None:
"""Try to extract a token from a JSON object value."""
try:
# json.loads returns Any; cast to union of possible JSON types
loaded = cast(
dict[str, object] | list[object] | str | int | float | bool | None,
json.loads(value),
)
except json.JSONDecodeError:
return None
if not isinstance(loaded, dict):
return None
parsed_dict = loaded
# Priority order for token fields within JSON
token_fields = [
"access_token",
"accessToken",
"token",
"bearer",
"id_token",
"idToken",
"jwt",
]
for field in token_fields:
token_value = parsed_dict.get(field)
if isinstance(token_value, str) and token_value:
return token_value
return None
def _extract_auth0_access_token(value: str) -> str | None:
"""Extract access_token from Auth0 SPA SDK cache entry.
Auth0 SPA SDK stores tokens as JSON with structure:
{"body": {"access_token": "...", "id_token": "...", ...}, "expiresAt": ...}
"""
try:
loaded = cast(
dict[str, object] | list[object] | str | int | float | bool | None,
json.loads(value),
)
except json.JSONDecodeError:
return None
if not isinstance(loaded, dict):
return None
# Auth0 SPA SDK structure: {"body": {"access_token": "..."}}
body = loaded.get("body")
if isinstance(body, dict):
body_dict = cast(dict[str, object], body)
access_token = body_dict.get("access_token")
if isinstance(access_token, str) and access_token:
return access_token
return None
async def extract_bearer_token(page: PageLike) -> ExtractedToken | None:
"""Extract bearer token from localStorage, trying common key patterns.
Priority order for token selection:
1. Auth0 SPA SDK audience-scoped keys (contain access_token for API calls)
2. Keys containing 'access_token' or 'accessToken'
3. Keys containing 'bearer'
4. Keys containing 'token' (excluding 'refresh')
5. JWT-formatted values in any auth-related key
For each key, if the value is JSON, attempts to extract token fields.
Otherwise uses the raw value if it looks like a token.
Args:
page: Page-like object (Playwright Page or ExtensionPage).
Returns:
ExtractedToken with the token value and source key, or None if not found.
"""
tokens = await discover_auth_tokens(page)
if not tokens:
return None
# First priority: Auth0 SPA SDK audience-scoped keys (NOT @@user@@ keys)
# These keys contain the access_token needed for API calls
for key, value in tokens.items():
if "@@auth0spajs@@" in key and "@@user@@" not in key:
access_token = _extract_auth0_access_token(value)
if access_token and _is_jwt_format(access_token):
return ExtractedToken(
value=access_token,
source_key=key,
extracted_at=datetime.now(timezone.utc),
)
# Second priority: Standard key pattern matching
priority_patterns: list[tuple[str, bool]] = [
("access_token", False), # (pattern, exclude_refresh)
("accesstoken", False),
("bearer", False),
("id_token", False),
("idtoken", False),
("token", True), # exclude refresh tokens
]
for pattern, exclude_refresh in priority_patterns:
for key, value in tokens.items():
lower_key = key.lower()
if pattern not in lower_key:
continue
if exclude_refresh and "refresh" in lower_key:
continue
# Try to extract from JSON first
if extracted := _extract_token_from_json(value):
return ExtractedToken(
value=extracted,
source_key=key,
extracted_at=datetime.now(timezone.utc),
)
# Use raw value if it looks like a token (JWT or long alphanumeric)
if _is_jwt_format(value) or (len(value) > 20 and value.isalnum()):
return ExtractedToken(
value=value,
source_key=key,
extracted_at=datetime.now(timezone.utc),
)
# Fallback: find any JWT-formatted value
for key, value in tokens.items():
if (extracted := _extract_token_from_json(value)) and _is_jwt_format(extracted):
return ExtractedToken(
value=extracted,
source_key=key,
extracted_at=datetime.now(timezone.utc),
)
if _is_jwt_format(value):
return ExtractedToken(
value=value,
source_key=key,
extracted_at=datetime.now(timezone.utc),
)
return None


@@ -0,0 +1,455 @@
"""Core session management service."""
from datetime import datetime, timedelta, timezone
from typing import TYPE_CHECKING, cast
from guide.app.utils.jwt import (
parse_jwt_expiry as _parse_jwt_expiry,
verify_jwt_signature,
VerifiedJWTClaims,
)
from guide.app import errors
from loguru import logger
from playwright.async_api import BrowserContext
if TYPE_CHECKING:
from playwright._impl._api_structures import SetCookieParam
from guide.app.auth.session_models import SessionData, SessionValidationResult
from guide.app.auth.session_storage import SessionStorage
from guide.app.browser.types import PageLike
from guide.app.models.domain import ActionResult
from guide.app.models.personas.models import DemoPersona
from guide.app.utils.urls import extract_base_url
_JS_GET_ALL_LOCAL_STORAGE = """
(() => {
const items = {};
for (let i = 0; i < localStorage.length; i++) {
const key = localStorage.key(i);
if (key) {
const value = localStorage.getItem(key);
if (value !== null) items[key] = value;
}
}
return items;
})();
"""
_JS_SET_LOCAL_STORAGE = """
(items) => {
for (const [key, value] of Object.entries(items)) {
localStorage.setItem(key, value);
}
}
"""
class SessionManager:
"""Manage browser session persistence for personas."""
_storage: SessionStorage
_ttl: timedelta
_auto_persist: bool
_auth0_domain: str | None
_auth0_audience: str | None
def __init__(
self,
storage: SessionStorage,
ttl: timedelta,
auto_persist: bool = True,
auth0_domain: str | None = None,
auth0_audience: str | None = None,
) -> None:
"""Initialize session manager.
Args:
storage: Session storage backend.
ttl: Time-to-live for sessions.
auto_persist: Whether to auto-save sessions after login.
auth0_domain: Auth0 tenant domain for JWT verification.
auth0_audience: Expected audience claim in JWT.
"""
self._storage = storage
self._ttl = ttl
self._auto_persist = auto_persist
self._auth0_domain = auth0_domain
self._auth0_audience = auth0_audience
@property
def auto_persist(self) -> bool:
"""Return whether auto-persist is enabled."""
return self._auto_persist
@property
def jwt_verification_enabled(self) -> bool:
"""Return whether JWT signature verification is enabled."""
return self._auth0_domain is not None
def verify_token(self, token: str) -> VerifiedJWTClaims:
"""Verify JWT token signature and return validated claims.
Requires auth0_domain to be configured. Raises AuthError on failure.
Args:
token: JWT token string.
Returns:
VerifiedJWTClaims with validated token data.
Raises:
AuthError: If verification fails or auth0_domain not configured.
"""
if not self._auth0_domain:
raise errors.AuthError(
"JWT verification requires auth0_domain to be configured",
details={"configured": False},
)
return verify_jwt_signature(
token,
auth0_domain=self._auth0_domain,
audience=self._auth0_audience,
)
def load_session(self, persona_id: str) -> SessionData | None:
"""Load session from storage.
Args:
persona_id: ID of the persona.
Returns:
SessionData if found, None otherwise.
"""
return self._storage.load(persona_id)
def validate_offline(
self,
session: SessionData,
ttl_buffer_seconds: int = 30,
) -> SessionValidationResult:
"""Validate session without browser access.
Checks TTL expiry and JWT token expiry if available.
Applies a buffer to prevent race conditions where session
expires during navigation/injection (typically 5-10 seconds).
Args:
session: Session data to validate.
ttl_buffer_seconds: Minimum remaining TTL required (default 30s).
Sessions with less remaining time are considered invalid
to prevent expiry during restoration.
Returns:
Validation result with is_valid flag and reason.
"""
now = datetime.now(timezone.utc)
# Check TTL expiry (with buffer to prevent race conditions)
remaining = session.session_ttl_expires_at - now
remaining_seconds = int(remaining.total_seconds())
if remaining_seconds <= ttl_buffer_seconds:
return SessionValidationResult(
is_valid=False,
reason="session_ttl_expired"
if remaining_seconds <= 0
else "session_ttl_near_expiry",
remaining_seconds=max(0, remaining_seconds),
)
# Check JWT token expiry if available (also with buffer)
if session.token_expires_at:
token_remaining = session.token_expires_at - now
token_remaining_seconds = int(token_remaining.total_seconds())
if token_remaining_seconds <= ttl_buffer_seconds:
return SessionValidationResult(
is_valid=False,
reason="token_expired"
if token_remaining_seconds <= 0
else "token_near_expiry",
remaining_seconds=max(0, token_remaining_seconds),
)
return SessionValidationResult(
is_valid=True,
reason=None,
remaining_seconds=remaining_seconds,
)
async def save_session(
self,
page: PageLike,
persona: DemoPersona,
origin_url: str | None = None,
) -> SessionData:
"""Extract and save session from browser.
Args:
page: Browser page to extract session from.
persona: Persona the session belongs to.
origin_url: URL the session was created on.
Returns:
Saved SessionData.
"""
now = datetime.now(timezone.utc)
# Extract localStorage
local_storage = await self._extract_local_storage(page)
# Extract cookies (Playwright pages only)
cookies = await self._extract_cookies(page)
# Try to extract bearer token and verify/parse expiry
bearer_token_value: str | None = None
bearer_token_source_key: str | None = None
token_expires_at: datetime | None = None
from guide.app.auth.session import extract_bearer_token
token = await extract_bearer_token(page)
if token:
bearer_token_value = token.value
bearer_token_source_key = token.source_key
# Verify token signature if auth0_domain is configured (strict mode)
if self.jwt_verification_enabled:
logger.debug("Verifying JWT signature for persona {}", persona.id)
verified_claims = self.verify_token(token.value)
token_expires_at = verified_claims.exp
logger.info(
"JWT verified for persona {}: sub={}, exp={}",
persona.id,
verified_claims.sub,
verified_claims.exp,
)
else:
# Fallback: parse expiry without verification (legacy mode)
logger.warning(
"JWT verification disabled - using unverified expiry for {}",
persona.id,
)
token_expires_at = self.parse_jwt_expiry(token.value)
# Determine origin URL
resolved_origin_url: str = origin_url or ""
if not resolved_origin_url:
try:
resolved_origin_url = page.url or "unknown"
except Exception:
resolved_origin_url = "unknown"
session = SessionData(
persona_id=persona.id,
email=persona.email,
cookies=cookies,
local_storage=local_storage,
bearer_token_value=bearer_token_value,
bearer_token_source_key=bearer_token_source_key,
created_at=now,
last_validated_at=now,
token_expires_at=token_expires_at,
session_ttl_expires_at=now + self._ttl,
origin_url=resolved_origin_url,
)
_ = self._storage.save(session)
return session
async def inject_local_storage(
self,
page: PageLike,
session: SessionData,
) -> None:
"""Inject localStorage from session into page.
Args:
page: Page to inject localStorage into.
session: Session data containing localStorage.
"""
if not session.local_storage:
return
_ = await page.evaluate(_JS_SET_LOCAL_STORAGE, session.local_storage)
logger.debug(
"Injected {} localStorage items for persona '{}'",
len(session.local_storage),
session.persona_id,
)
async def inject_into_context(
self,
context: BrowserContext,
session: SessionData,
) -> None:
"""Inject cookies from session into browser context.
Args:
context: Playwright browser context.
session: Session data containing cookies.
"""
if not session.cookies:
return
# Cast cookies to SetCookieParam sequence - structure is compatible
cookies_typed = cast("list[SetCookieParam]", session.cookies)
await context.add_cookies(cookies_typed)
logger.debug(
"Injected {} cookies for persona '{}'",
len(session.cookies),
session.persona_id,
)
def invalidate(self, persona_id: str) -> bool:
"""Remove session from storage.
Args:
persona_id: ID of the persona.
Returns:
True if session was deleted, False if it didn't exist.
"""
return self._storage.delete(persona_id)
async def restore_session(
self,
page: PageLike,
persona: DemoPersona,
) -> ActionResult | None:
"""Try to restore existing session from disk.
Loads session from storage, validates it, injects into browser,
and verifies the login state. Automatically invalidates expired
or invalid sessions.
Warning:
If the base URL redirects to a different origin (e.g., SSO provider),
localStorage injection will fail silently. Ensure origin_url remains
on the application domain when capturing sessions.
Args:
page: Browser page instance.
persona: Persona to restore session for.
Returns:
ActionResult if session was restored successfully, None otherwise.
"""
session = self.load_session(persona.id)
if not session:
logger.debug("No saved session found for persona {}", persona.id)
return None
validation = self.validate_offline(session)
if not validation.is_valid:
logger.info(
"Saved session for {} is invalid: {}",
persona.id,
validation.reason,
)
_ = self.invalidate(persona.id)
return None
# Extract base URL from origin
try:
base_url = extract_base_url(session.origin_url)
except ValueError as exc:
logger.warning(
"Invalid origin URL in session for {}: {}",
persona.id,
exc,
)
_ = self.invalidate(persona.id)
return None
# Navigate to base URL first (required for localStorage to work)
_ = await page.goto(base_url)
# Inject session localStorage
await self.inject_local_storage(page, session)
# Reload page to pick up injected session (using evaluate to avoid full navigation)
# This triggers the app to read localStorage without clearing it like goto() would
_ = await page.evaluate("window.location.reload()")
# Wait for page to finish loading after reload
await page.wait_for_load_state("networkidle")
# Validate session is actually working
from guide.app.auth.session import detect_current_persona
current = await detect_current_persona(page)
if current and current.lower() == persona.email.lower():
logger.info("Restored session for persona {}", persona.id)
return ActionResult(
details={
"persona_id": persona.id,
"email": persona.email,
"status": "session_restored",
"remaining_seconds": validation.remaining_seconds,
}
)
# Session injection failed - invalidate it
logger.warning(
"Session injection failed for {} (expected: {}, got: {})",
persona.id,
persona.email,
current,
)
_ = self.invalidate(persona.id)
return None
def list_sessions(self) -> list[str]:
"""List all stored session persona IDs.
Returns:
List of persona IDs with stored sessions.
"""
return self._storage.list_sessions()
def parse_jwt_expiry(self, token: str) -> datetime | None:
"""Parse exp claim from JWT payload.
Decodes the JWT payload (base64) without verification to extract
the expiration timestamp.
Args:
token: JWT token string.
Returns:
Expiration datetime if found, None otherwise.
"""
return _parse_jwt_expiry(token)
async def _extract_local_storage(self, page: PageLike) -> dict[str, str]:
"""Extract all localStorage items from page."""
result = await page.evaluate(_JS_GET_ALL_LOCAL_STORAGE)
if not isinstance(result, dict):
return {}
# Result from JS is dict[str, str] for localStorage
return {str(k): str(v) for k, v in cast("dict[str, object]", result).items()}
async def _extract_cookies(
self, page: PageLike
) -> list[dict[str, str | int | float | bool | None]]:
"""Extract cookies from page context if available."""
# Only Playwright pages have context with cookies
from playwright.async_api import Page
if isinstance(page, Page):
cookies = await page.context.cookies()
# Convert to list of dicts for serialization
# Cookie values are str, int, float, bool, or None
return [
cast("dict[str, str | int | float | bool | None]", dict(c))
for c in cookies
]
# ExtensionPage doesn't support cookie extraction - session will be localStorage-only
logger.warning(
"Cookie extraction not supported for {} - session will use localStorage only",
type(page).__name__,
)
return []


@@ -0,0 +1,32 @@
"""Pydantic models for session persistence."""
from datetime import datetime
from pydantic import BaseModel, Field
# Cookie values can be str, int, float, bool, or None
type CookieValue = str | int | float | bool | None
class SessionData(BaseModel):
"""Persistent session data for a persona."""
persona_id: str
email: str
cookies: list[dict[str, CookieValue]] = Field(default_factory=list)
local_storage: dict[str, str] = Field(default_factory=dict)
bearer_token_value: str | None = None
bearer_token_source_key: str | None = None
created_at: datetime
last_validated_at: datetime
token_expires_at: datetime | None = None
session_ttl_expires_at: datetime
origin_url: str
class SessionValidationResult(BaseModel):
"""Result of session validation check."""
is_valid: bool
reason: str | None = None
remaining_seconds: int | None = None


@@ -0,0 +1,113 @@
"""File-based session storage backend."""
import logging
from pathlib import Path
from guide.app.auth.session_models import SessionData
_logger = logging.getLogger(__name__)
class SessionStorage:
"""Manage session storage files for personas.
Storage location: {base_dir}/{persona_id}.session.json
"""
_base_dir: Path
def __init__(self, base_dir: Path) -> None:
"""Initialize session storage.
Args:
base_dir: Directory to store session files.
"""
self._base_dir = base_dir
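def _ensure_dir(self) -> None:
"""Ensure storage directory exists, requesting owner-only (0o700) permissions.
The mode applies only to a newly created leaf directory and is masked by the
process umask; missing parents and pre-existing directories keep their own mode.
"""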
self._base_dir.mkdir(parents=True, exist_ok=True, mode=0o700)
def _session_path(self, persona_id: str) -> Path:
"""Get path to session file for persona."""
safe_id = "".join(c if c.isalnum() or c in "-_" else "_" for c in persona_id)
return self._base_dir / f"{safe_id}.session.json"
def save(self, session: SessionData) -> Path:
"""Save session data to disk atomically.
Uses temp file + rename pattern to prevent corruption from
concurrent writes or crashes mid-write.
Args:
session: Session data to persist.
Returns:
Path to the saved session file.
"""
self._ensure_dir()
path = self._session_path(session.persona_id)
temp_path = path.with_suffix(".tmp")
# Write to temp file first
_ = temp_path.write_text(session.model_dump_json(indent=2))
# Atomic rename (POSIX-compliant)
_ = temp_path.replace(path)
_logger.info("Saved session for persona '%s' to %s", session.persona_id, path)
return path
def load(self, persona_id: str) -> SessionData | None:
"""Load session data from disk.
Args:
persona_id: ID of the persona to load session for.
Returns:
SessionData if found and valid, None otherwise.
"""
path = self._session_path(persona_id)
if not path.exists():
return None
try:
return SessionData.model_validate_json(path.read_text())
except (ValueError, OSError) as exc:
_logger.warning(
"Invalid/corrupted session file for '%s': %s - deleting",
persona_id,
exc,
)
# Delete corrupted session file to prevent repeated failures
path.unlink(missing_ok=True)
return None
def delete(self, persona_id: str) -> bool:
"""Delete session file for persona.
Args:
persona_id: ID of the persona to delete session for.
Returns:
True if session was deleted, False if it didn't exist.
"""
path = self._session_path(persona_id)
if path.exists():
path.unlink()
_logger.info("Deleted session for persona '%s'", persona_id)
return True
return False
def list_sessions(self) -> list[str]:
"""List all stored session persona IDs.
Returns:
List of persona IDs with stored sessions.
"""
if not self._base_dir.exists():
return []
return [
p.stem.removesuffix(".session")
for p in self._base_dir.glob("*.session.json")
]
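
The temp-file-plus-rename pattern used by `save` can be sketched in isolation as follows. This is a minimal stdlib-only sketch, not the module's code: `atomic_write_text` and `safe_session_name` are hypothetical names. One deliberate difference is labeled in the comments: the sketch creates a uniquely named temp file with `tempfile.mkstemp`, on the assumption that two writers for the same persona should not share a single `.tmp` path.

```python
import contextlib
import os
import tempfile
from pathlib import Path


def safe_session_name(persona_id: str) -> str:
    """Replace anything other than alphanumerics, '-' and '_' with '_'."""
    safe_id = "".join(c if c.isalnum() or c in "-_" else "_" for c in persona_id)
    return f"{safe_id}.session.json"


def atomic_write_text(path: Path, text: str) -> Path:
    """Write text to a temp file in the target directory, then rename it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: a unique temp name per write (the diff uses a fixed ".tmp" path).
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # os.replace overwrites the destination; on POSIX the rename is atomic
        # when source and destination are on the same filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Do not leave a stray temp file behind if the write or rename failed
        with contextlib.suppress(OSError):
            os.unlink(tmp_name)
        raise
    return path


if __name__ == "__main__":
    target = Path(tempfile.mkdtemp()) / safe_session_name("demo persona")
    print(atomic_write_text(target, '{"persona_id": "demo persona"}'))
```

Keeping the temp file in the same directory as the final file matters: a rename across filesystems is not atomic.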


@@ -0,0 +1,11 @@
"""Browser module exports.
Note: PageHelpers is NOT exported from this module to avoid import-time
schema generation issues with Pydantic. Import directly:
from guide.app.browser.helpers import PageHelpers
"""
from guide.app.browser.client import BrowserClient
from guide.app.browser.pool import BrowserPool
__all__ = ["BrowserClient", "BrowserPool"]


@@ -1,10 +1,19 @@
import contextlib
from collections.abc import AsyncIterator
from pathlib import Path
from typing import TYPE_CHECKING
from loguru import logger
from playwright.async_api import Page
from guide.app.browser.extension_client import ExtensionPage
from guide.app.browser.pool import BrowserPool
if TYPE_CHECKING:
from playwright.async_api import StorageState
else:
StorageState = dict[str, object] # Runtime fallback
class BrowserClient:
"""Provides page access via a persistent browser pool with context isolation.
@@ -28,31 +37,45 @@ class BrowserClient:
self.pool: BrowserPool = pool
@contextlib.asynccontextmanager
- async def open_page(self, host_id: str | None = None) -> AsyncIterator[Page]:
+ async def open_page(
+ self,
+ host_id: str | None = None,
+ storage_state: StorageState | str | Path | None = None,
+ ) -> AsyncIterator[Page | ExtensionPage]:
"""Get a fresh page from the pool with guaranteed isolation.
- Allocates a new context and page for this request. The context is closed
- after the with block completes, ensuring complete isolation from other
- requests.
+ For headless mode: allocates a new context and page, closes context after use.
+ For CDP mode: uses existing context and page, does not close context.
Args:
host_id: The host identifier, or None for the default host
+ storage_state: Optional Playwright storage_state to initialize context.
+ Only applies to headless mode.
Yields:
- A Playwright Page instance with a fresh, isolated context
+ A Playwright Page instance
Raises:
ConfigError: If the host_id is invalid or not configured
BrowserConnectionError: If the browser connection fails
"""
- context, page = await self.pool.allocate_context_and_page(host_id)
+ logger.info("[BrowserClient] open_page called for host_id: {}", host_id)
+ context, page, should_close = await self.pool.allocate_context_and_page(
+ host_id, storage_state=storage_state
+ )
+ logger.info(
+ "[BrowserClient] Got page from pool, should_close: {}", should_close
+ )
try:
yield page
finally:
- # Explicitly close the context to ensure complete cleanup
- # and prevent state leakage to subsequent requests
- with contextlib.suppress(Exception):
- await context.close()
+ logger.info("[BrowserClient] Cleaning up, should_close: {}", should_close)
+ # Only close context for headless mode (not CDP/extension)
+ if should_close and context is not None:
+ with contextlib.suppress(Exception):
+ await context.close()
__all__ = ["BrowserClient"]


@@ -0,0 +1,294 @@
"""Schema-DOM reconciliation for LLM-driven form filling.
Provides utilities to:
- Map GraphQL FormSchema to live DOM elements
- Build merged context for LLM consumption
- Generate structured JSON for form automation
"""
from __future__ import annotations
from dataclasses import dataclass
from guide.app.browser.elements.field_inference import (
HelperFunction,
select_helper_for_type,
)
from guide.app.browser.elements.form_discovery import (
FormField,
extract_accessible_name,
extract_all_form_fields,
extract_field_value,
)
from guide.app.browser.types import PageLike
from guide.app.raindrop.operations.form_schema import FieldDef, FieldType, FormSchema
# ---------------------------------------------------------------------------
# Type Mapping: Schema FieldType -> UI Automation Type
# ---------------------------------------------------------------------------
_SCHEMA_TO_UI_TYPE: dict[FieldType, str] = {
FieldType.TEXT: "text",
FieldType.TEXTAREA: "textarea",
FieldType.MENU: "select",
FieldType.NUMBER: "number",
FieldType.DATE: "date",
FieldType.CHECKBOX: "checkbox",
FieldType.USER: "autocomplete",
FieldType.SUPPLIER: "autocomplete",
FieldType.COMMODITY: "autocomplete",
FieldType.DEPARTMENT: "autocomplete",
FieldType.CONTRACTS: "autocomplete",
FieldType.RELATIONSHIP: "autocomplete",
FieldType.ATTACHMENT: "file",
FieldType.UNKNOWN: "text",
}
def get_ui_type(field_type: FieldType) -> str:
"""Map schema field type to UI automation type."""
return _SCHEMA_TO_UI_TYPE.get(field_type, "text")
def get_helper_for_field(field_type: FieldType) -> HelperFunction:
"""Get automation helper function for field type."""
ui_type = get_ui_type(field_type)
return select_helper_for_type(ui_type)
# ---------------------------------------------------------------------------
# Data Structures
# ---------------------------------------------------------------------------
@dataclass(frozen=True)
class FieldContext:
"""Merged schema + DOM context for a single field."""
field_key: str # e.g., "f19" (from schema)
label: str # "Estimated Value" (from schema)
schema_type: FieldType # Original schema type (menu, user, etc.)
ui_type: str # Automation type (select, autocomplete, text)
helper: HelperFunction # Exact helper function to use
dom_selector: str | None # CSS selector if DOM match found
dom_label: str | None # Label from DOM (for verification)
current_value: list[str] | str | None # From DOM (list for multi-select chips)
is_required: bool # From schema
is_disabled: bool # From DOM
allowed_values: tuple[str, ...] | None # For menu fields (from schema)
@dataclass(frozen=True)
class FormContext:
"""Complete form context for LLM consumption."""
entity_type: str
entity_id: int | str
entity_name: str
fields: dict[str, FieldContext]
unmatched_dom_fields: tuple[str, ...] # DOM fields without schema match
unmatched_schema_fields: tuple[str, ...] # Schema fields without DOM match
# ---------------------------------------------------------------------------
# Matching Algorithm
# ---------------------------------------------------------------------------
def _normalize_label(label: str) -> str:
"""Normalize label for fuzzy matching."""
return label.lower().strip().replace("_", " ").replace("-", " ")
def _match_field_to_dom(
field_def: FieldDef,
dom_fields: list[FormField],
) -> FormField | None:
"""Match a schema field to a DOM field using priority-based matching.
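Matching Strategy (priority order):
1. data-cy substring match: schema field name appears anywhere in data-cy
2. Field key pattern in data-cy: "-{name}" or "_{name}" (already covered by 1)
3. Label text match: normalized schema label equals normalized DOM label
4. Fuzzy label match: one normalized label contains the other
Note: substring matching means a short key such as "f1" also matches a
data-cy value containing "f19"; the first DOM field that matches wins.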
"""
field_name = field_def.name
field_label = field_def.label
# Priority 1 & 2: data-cy matching
for dom_field in dom_fields:
data_cy = dom_field.get("data_cy")
if isinstance(data_cy, str) and data_cy:
# Exact field name in data-cy
if field_name in data_cy:
return dom_field
# Field key pattern: board-item-field-{type}-{key}
if f"-{field_name}" in data_cy or f"_{field_name}" in data_cy:
return dom_field
# Priority 3: Exact label match (case-insensitive)
normalized_schema_label = _normalize_label(field_label)
for dom_field in dom_fields:
dom_label = dom_field.get("label", "")
if _normalize_label(dom_label) == normalized_schema_label:
return dom_field
# Priority 4: Fuzzy label match (contains)
for dom_field in dom_fields:
dom_label = dom_field.get("label", "")
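normalized_dom_label = _normalize_label(dom_label)
# Skip empty labels: "" is a substring of every string and would match anything
if not normalized_dom_label or not normalized_schema_label:
continue
if normalized_schema_label in normalized_dom_label:
return dom_field
if normalized_dom_label in normalized_schema_label:
return dom_field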
return None
async def build_form_context(
page: PageLike,
schema: FormSchema,
container_selector: str = "body",
) -> FormContext:
"""Build merged context from schema and live DOM.
Args:
page: Browser page instance
schema: Pre-fetched FormSchema from GraphQL
container_selector: CSS selector for form container
Returns:
FormContext with merged schema + DOM data
"""
# Extract all DOM fields
dom_fields = await extract_all_form_fields(page, container_selector)
# Track matching
matched_dom_indices: set[int] = set()
field_contexts: dict[str, FieldContext] = {}
unmatched_schema: list[str] = []
# Match each schema field to DOM
for field_def in schema.fields:
if matched_dom := _match_field_to_dom(field_def, dom_fields):
# Mark as matched
for i, df in enumerate(dom_fields):
if df is matched_dom: # identity check; comparing data_cy would mis-mark fields that lack one
matched_dom_indices.add(i)
break
# Get current value if selector available
current_value: list[str] | str | None = None
dom_label: str | None = None
is_disabled = matched_dom.get("disabled", False)
if matched_dom.get("selector"):
current_value = await extract_field_value(page, matched_dom["selector"])
dom_label = await extract_accessible_name(page, matched_dom["selector"])
# Build allowed values from schema choices
allowed_values: tuple[str, ...] | None = None
if field_def.choices:
allowed_values = tuple(c.text for c in field_def.choices)
field_contexts[field_def.name] = FieldContext(
field_key=field_def.name,
label=field_def.label,
schema_type=field_def.field_type,
ui_type=get_ui_type(field_def.field_type),
helper=get_helper_for_field(field_def.field_type),
dom_selector=matched_dom.get("selector"),
dom_label=dom_label or matched_dom.get("label"),
current_value=current_value,
is_required=field_def.required,
is_disabled=is_disabled,
allowed_values=allowed_values,
)
else:
unmatched_schema.append(field_def.name)
# Collect unmatched DOM fields
unmatched_dom: list[str] = []
for i, dom_field in enumerate(dom_fields):
if i not in matched_dom_indices:
if data_cy := dom_field.get("data_cy"):
unmatched_dom.append(data_cy)
return FormContext(
entity_type=schema.entity_type,
entity_id=schema.entity_id,
entity_name=schema.entity_name,
fields=field_contexts,
unmatched_dom_fields=tuple(unmatched_dom),
unmatched_schema_fields=tuple(unmatched_schema),
)
async def get_form_context_from_schema(
page: PageLike,
schema: FormSchema,
container_selector: str = "body",
) -> FormContext:
"""Build form context from pre-fetched schema.
This is the primary API. Callers fetch the schema separately
using raindrop/operations/form_schema.py, then pass it here.
Args:
page: Browser page instance
schema: Pre-fetched FormSchema from GraphQL
container_selector: CSS selector for form container
Returns:
FormContext with merged schema + DOM data
"""
return await build_form_context(page, schema, container_selector)
# ---------------------------------------------------------------------------
# LLM Context Generation
# ---------------------------------------------------------------------------
def format_for_llm(context: FormContext) -> dict[str, dict[str, object]]:
"""Format form context as LLM-consumable JSON.
Returns dict keyed by field_key with:
- label: Display name
- schema_type: Original field type from schema
- ui_type: Type for automation (aligns with select_helper_for_type)
- helper: Exact helper function to call
- selector: CSS selector for interaction
- current_value: Current value if any
- is_required: Whether field is required
- allowed_values: Valid choices for menu fields
"""
return {
field_key: {
"label": field.label,
"schema_type": field.schema_type.value, # e.g., "menu", "user"
"ui_type": field.ui_type, # e.g., "select", "autocomplete"
"helper": field.helper, # e.g., "select_single", "select_combobox"
"selector": field.dom_selector,
"current_value": field.current_value,
"is_required": field.is_required,
"is_disabled": field.is_disabled,
"allowed_values": list(field.allowed_values)
if field.allowed_values
else None,
}
for field_key, field in context.fields.items()
if field.dom_selector # Only include fields we can interact with
}
__all__ = [
"FieldContext",
"FormContext",
"build_form_context",
"get_form_context_from_schema",
"format_for_llm",
"get_ui_type",
"get_helper_for_field",
]

File diff suppressed because it is too large


@@ -0,0 +1,130 @@
"""Browser element helpers for MUI component automation.
This package provides extension-friendly helpers for interacting with Material UI components:
- **mui.py** - Base utilities (escape_selector, click_with_mouse_events, React helpers)
- **dropdown.py** - Dropdown/Select/Autocomplete interactions
- **inputs.py** - Input primitive helpers (fill_text, fill_textarea, etc.)
- **form_automation.py** - Orchestration for multi-field form filling
Usage:
from guide.app.browser.elements import select_combobox, fill_text
from guide.app.browser.elements.mui import fill_with_react_events
"""
# Base utilities from mui.py
from guide.app.browser.elements.mui import (
DropdownResult,
click_with_mouse_events,
escape_selector,
fill_with_react_events,
clear_and_fill_with_react_events,
send_key,
ensure_listbox,
check_listbox_visible,
check_field_disabled,
enable_field,
)
# Dropdown helpers
from guide.app.browser.elements.dropdown import (
select_multi,
select_single,
select_combobox,
select_single_choice,
select_mui_options,
select_autocomplete,
select_typeahead,
)
# Input primitive helpers
from guide.app.browser.elements.inputs import (
fill_text,
fill_textarea,
fill_date,
fill_autocomplete,
)
# Form discovery helpers
from guide.app.browser.elements.form_discovery import (
FormField,
extract_accessible_name,
extract_all_data_cy_selectors,
extract_form_field_metadata,
extract_all_form_fields,
extract_field_value,
extract_field_state,
)
# Field inference helpers
from guide.app.browser.elements.field_inference import (
infer_type_from_selector,
infer_type_from_element,
infer_type_from_xpath,
select_helper_for_type,
infer_with_fallback,
infer_with_confidence,
build_field_selector,
)
# Form automation helpers
from guide.app.browser.elements.form_automation import (
FormDataGenerator,
FormValue,
discover_and_fill_form,
fill_field,
auto_populate_form,
validate_populated_form,
smart_fill_field,
)
__all__ = [
# Base utilities
"DropdownResult",
"click_with_mouse_events",
"escape_selector",
"fill_with_react_events",
"clear_and_fill_with_react_events",
"send_key",
"ensure_listbox",
"check_listbox_visible",
"check_field_disabled",
"enable_field",
# Dropdown helpers
"select_multi",
"select_single",
"select_combobox",
"select_single_choice",
"select_mui_options",
"select_autocomplete",
"select_typeahead",
# Form helpers
"fill_text",
"fill_textarea",
"fill_date",
"fill_autocomplete",
# Form discovery
"FormField",
"extract_accessible_name",
"extract_all_data_cy_selectors",
"extract_form_field_metadata",
"extract_all_form_fields",
"extract_field_value",
"extract_field_state",
# Field inference
"infer_type_from_selector",
"infer_type_from_element",
"infer_type_from_xpath",
"select_helper_for_type",
"infer_with_fallback",
"infer_with_confidence",
"build_field_selector",
# Form automation
"FormDataGenerator",
"FormValue",
"discover_and_fill_form",
"fill_field",
"auto_populate_form",
"validate_populated_form",
"smart_fill_field",
]


@@ -0,0 +1,46 @@
"""Shared type guards for browser element operations.
Provides type-safe extraction of values from JavaScript evaluation results,
which return untyped dict[str, object] structures from page.evaluate() calls.
"""
from __future__ import annotations
from typing import TypeGuard
def is_dict_str_object(obj: object) -> TypeGuard[dict[str, object]]:
"""Type guard to check if object is dict[str, object]."""
return isinstance(obj, dict)
def is_list_of_objects(obj: object) -> TypeGuard[list[object]]:
"""Type guard to check if object is a list."""
return isinstance(obj, list)
def get_str_from_dict(d: dict[str, object], key: str, default: str = "") -> str:
"""Safely extract string value from dict."""
val = d.get(key)
return str(val) if isinstance(val, str) else default
def get_str_or_none_from_dict(d: dict[str, object], key: str) -> str | None:
"""Safely extract string or None value from dict."""
val = d.get(key)
return str(val) if isinstance(val, str) else None
def get_bool_from_dict(d: dict[str, object], key: str, default: bool = False) -> bool:
"""Safely extract bool value from dict."""
val = d.get(key)
return bool(val) if isinstance(val, bool) else default
__all__ = [
"is_dict_str_object",
"is_list_of_objects",
"get_str_from_dict",
"get_str_or_none_from_dict",
"get_bool_from_dict",
]


@@ -0,0 +1,77 @@
"""Dropdown helpers for MUI Select and Autocomplete components.
This package handles dropdown/select interactions for:
- MUI Autocomplete (searchable, multi-select with chips)
- MUI Select (non-searchable combobox dropdowns)
- Type-to-search autocomplete (API-triggered search)
Imports base utilities from mui.py (escape_selector, click_with_mouse_events, etc.)
"""
# Public APIs
from guide.app.browser.elements.dropdown.autocomplete import (
select_multi,
select_single,
)
from guide.app.browser.elements.dropdown.combobox import (
select_combobox,
select_single_choice,
select_mui_options,
select_autocomplete,
)
from guide.app.browser.elements.dropdown.typeahead import select_typeahead
from guide.app.browser.elements.dropdown._close import close_all_dropdowns
from guide.app.browser.elements.dropdown._helpers import get_listbox_id
from guide.app.browser.elements.dropdown.schema_aware import (
select_from_schema,
SchemaAwareDropdownResult,
)
# Re-exports from mui.py for backward compatibility
from guide.app.browser.elements.mui import (
DropdownResult,
escape_selector,
click_with_mouse_events,
send_key,
ensure_listbox,
check_listbox_visible,
check_field_disabled,
)
# Backward compatibility aliases (internal helpers - keep exports but don't encourage use)
_escape_selector = escape_selector
_send_key = send_key
_wait_for_role_option = ensure_listbox
_ensure_listbox = ensure_listbox
_check_listbox_visible = check_listbox_visible
_check_field_disabled = check_field_disabled
__all__ = [
# Public APIs
"select_multi",
"select_single",
"select_combobox",
"select_single_choice",
"select_mui_options",
"select_autocomplete",
"select_typeahead",
"select_from_schema",
"close_all_dropdowns",
"get_listbox_id",
"SchemaAwareDropdownResult",
# Re-exports from mui.py for backward compatibility
"DropdownResult",
"escape_selector",
"click_with_mouse_events",
"send_key",
"ensure_listbox",
"check_listbox_visible",
"check_field_disabled",
# Backward compatibility - internal helpers (keep exports but don't encourage use)
"_escape_selector",
"_send_key",
"_wait_for_role_option",
"_ensure_listbox",
"_check_listbox_visible",
"_check_field_disabled",
]


@@ -0,0 +1,92 @@
"""Universal dropdown close helper.
Handles closing MUI dropdowns and marking listboxes to prevent stale queries.
"""
import contextlib
import logging
from typing import cast
from guide.app.browser.types import (
PageLike,
PageWithTrustedClick,
supports_trusted_click,
)
_logger = logging.getLogger(__name__)
async def close_all_dropdowns(page: PageLike) -> None:
"""Close any open dropdowns/listboxes on the page.
Uses trusted click (if available) to trigger MUI ClickAwayListener.
Falls back to blur for Playwright mode.
Also marks all listboxes with data-dropdown-closed to prevent stale queries.
"""
_logger.debug("[Dropdown] Closing all open dropdowns")
# Mark ALL current listboxes and their options as closed/stale
# so subsequent queries don't find them (prevents cross-dropdown leakage)
_ = await page.evaluate(
"""
(() => {
const listboxes = document.querySelectorAll('[role="listbox"]:not([data-dropdown-closed])');
listboxes.forEach(listbox => {
listbox.setAttribute('data-dropdown-closed', 'true');
listbox.style.setProperty('display', 'none', 'important');
listbox.style.setProperty('visibility', 'hidden', 'important');
listbox.style.setProperty('pointer-events', 'none', 'important');
// Mark all options as stale to prevent cross-dropdown leakage
listbox.querySelectorAll('[role="option"]').forEach(opt => {
opt.setAttribute('data-dropdown-stale', 'true');
});
});
return listboxes.length;
})();
"""
)
# Check if trusted_click is available (extension mode)
if supports_trusted_click(page):
coords = await page.evaluate(
"""
(() => {
const targets = [
document.querySelector('h2'),
document.querySelector('.MuiDialogTitle-root'),
document.querySelector('.MuiDialogContent-root'),
];
for (const target of targets) {
if (target) {
const rect = target.getBoundingClientRect();
return { x: rect.left + 10, y: rect.top + 10 };
}
}
return { x: 100, y: 100 };
})();
"""
)
if isinstance(coords, dict):
coords_dict = cast(dict[str, object], coords)
x_val = coords_dict.get("x", 100)
y_val = coords_dict.get("y", 100)
x = float(x_val) if isinstance(x_val, (int, float)) else 100.0
y = float(y_val) if isinstance(y_val, (int, float)) else 100.0
with contextlib.suppress(Exception):
trusted_page: PageWithTrustedClick = page
await trusted_page.trusted_click(x, y)
await page.wait_for_timeout(150)
else:
# Fallback for Playwright - blur only
with contextlib.suppress(Exception):
_ = await page.evaluate(
"""
(() => {
if (document.activeElement && document.activeElement.tagName !== 'BODY') {
document.activeElement.blur();
}
return true;
})();
"""
)
await page.wait_for_timeout(100)


@@ -0,0 +1,770 @@
"""Internal helper functions for dropdown operations.
Shared utilities for autocomplete, combobox, and typeahead modules.
"""
import asyncio
import contextlib
import logging
from typing import cast
from guide.app.browser.types import PageLike
from guide.app.browser.elements.mui import (
escape_selector,
send_key,
check_listbox_visible,
click_with_mouse_events,
)
from guide.app.core.config import Timeouts
from guide.app.errors import ActionExecutionError
_logger = logging.getLogger(__name__)
# Module-level default timeouts for backward compatibility
_DEFAULT_TIMEOUTS = Timeouts()
# ---------------------------------------------------------------------------
# Shared Utilities
# ---------------------------------------------------------------------------
async def get_listbox_id(page: PageLike, selector: str) -> str | None:
"""Get listbox ID from aria-controls or aria-owns attributes.
Args:
page: Browser page instance
selector: CSS selector for the field container
Returns:
The listbox element ID if found, None otherwise
"""
escaped = escape_selector(selector)
result = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{escaped}');
const input = el?.querySelector('input') || el;
return input?.getAttribute('aria-controls') ||
input?.getAttribute('aria-owns') || null;
}})()
"""
)
return str(result) if isinstance(result, str) else None
# ---------------------------------------------------------------------------
# Autocomplete Helpers
# ---------------------------------------------------------------------------
async def is_dropdown_open(page: PageLike, field_selector: str) -> bool:
"""Check if an autocomplete dropdown is already open."""
field_selector_js = escape_selector(field_selector)
_logger.info("[Dropdown] Checking if dropdown is open for: %s", field_selector)
result = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const input = field.querySelector('input');
if (input && input.getAttribute('aria-expanded') === 'true') {{
return true;
}}
const listboxId = input?.getAttribute('aria-controls') || input?.getAttribute('aria-owns');
if (listboxId) {{
const listbox = document.getElementById(listboxId);
if (listbox && listbox.offsetParent !== null) {{
return true;
}}
}}
const anyListbox = document.querySelector('[role="listbox"]');
if (anyListbox && anyListbox.offsetParent !== null) {{
const rect = anyListbox.getBoundingClientRect();
if (rect.width > 0 && rect.height > 0) {{
return true;
}}
}}
return false;
}})();
"""
)
is_open = bool(result)
_logger.info(
"[Dropdown] Dropdown open check result: %s for %s", is_open, field_selector
)
return is_open
async def open_dropdown(
page: PageLike,
field_selector: str,
popup_button_selector: str,
*,
timeouts: Timeouts | None = None,
) -> None:
"""Open an autocomplete dropdown.
Checks if already open first to avoid toggling closed.
Closes any other open dropdown before opening this one.
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
if await is_dropdown_open(page, field_selector):
_logger.debug("[Dropdown] Already open, skipping click for: %s", field_selector)
return
if await check_listbox_visible(page):
_logger.info("[Dropdown] Another listbox is open, closing it with Tab + blur")
_ = await page.evaluate(
"""
(() => {
const expandedInputs = Array.from(document.querySelectorAll('input[aria-expanded="true"]'));
for (const input of expandedInputs) {
const eventProps = {
key: 'Tab',
code: 'Tab',
keyCode: 9,
which: 9,
bubbles: true,
cancelable: true,
composed: true
};
input.dispatchEvent(new KeyboardEvent('keydown', eventProps));
input.dispatchEvent(new KeyboardEvent('keyup', eventProps));
input.blur();
}
if (document.activeElement && document.activeElement.tagName !== 'BODY') {
document.activeElement.blur();
}
document.body.focus();
return true;
})();
"""
)
await page.wait_for_timeout(250)
if await check_listbox_visible(page):
_logger.warning(
"[Dropdown] Listbox still open after Tab+blur - proceeding anyway"
)
field_selector_js = escape_selector(field_selector)
_ = await page.evaluate(
f"""
(() => {{
const root = document.querySelector('{field_selector_js}');
if (!root) return false;
const input = root.querySelector('input');
if (input && input.disabled) {{
input.disabled = false;
input.removeAttribute('aria-disabled');
}}
const popupBtn = root.querySelector('.MuiAutocomplete-popupIndicator');
if (popupBtn && popupBtn.disabled) {{
popupBtn.disabled = false;
popupBtn.classList.remove('Mui-disabled');
popupBtn.removeAttribute('aria-disabled');
}}
return true;
}})();
"""
)
await page.wait_for_timeout(50)
try:
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(
popup_button_selector, timeout=effective_timeouts.dropdown_field
)
popup_selector_js = escape_selector(popup_button_selector)
can_click = await page.evaluate(
f"""
(() => {{
const btn = document.querySelector('{popup_selector_js}');
return btn && !btn.disabled && !btn.classList.contains('Mui-disabled');
}})();
"""
)
if can_click:
await page.click(popup_button_selector)
else:
raise ActionExecutionError("Popup button not available")
except Exception:
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(
f"{field_selector} input", timeout=effective_timeouts.dropdown_field
)
await page.click(f"{field_selector} input")
await send_key(page, "ArrowDown")
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(
field_selector, timeout=effective_timeouts.dropdown_field
)
await page.click(field_selector)
await send_key(page, "ArrowDown")
await page.wait_for_timeout(100)
await page.wait_for_timeout(50)
async def get_options(
page: PageLike, field_selector: str
) -> list[dict[str, str | int]] | None:
"""Get options from a listbox associated with a field."""
field_selector_js = escape_selector(field_selector)
result = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
const input = field ? field.querySelector('input') : null;
const listboxId = input?.getAttribute('aria-controls') || input?.getAttribute('aria-owns');
let listbox = listboxId ? document.getElementById(listboxId) : null;
if (!listbox || listbox.hasAttribute('data-dropdown-closed')) {{
listbox = document.querySelector('[role="listbox"]:not([data-dropdown-closed])');
}}
const optionNodes = listbox ? Array.from(listbox.querySelectorAll('[role="option"]')) : [];
return optionNodes.map((opt, index) => ({{ index: index, text: (opt.textContent || '').trim() }}));
}})()
"""
)
return cast(list[dict[str, str | int]] | None, result)
async def get_listbox_options(page: PageLike) -> dict[str, object]:
"""Get listbox options with visibility status."""
result = await page.evaluate(
"""
(() => {
const listbox = document.querySelector('[role="listbox"]:not([data-dropdown-closed])');
if (!listbox) return {visible: false, reason: 'no_listbox', options: []};
const rect = listbox.getBoundingClientRect();
const hasSize = rect.width > 0 && rect.height > 0;
const notHidden = listbox.style.display !== 'none' && listbox.style.visibility !== 'hidden';
const isVisible = hasSize && notHidden;
const options = Array.from(listbox.querySelectorAll('[role="option"]'))
.map(o => ({ text: (o.textContent || '').trim(), data: o.getAttribute('data-value') || '' }));
return {visible: isVisible, reason: isVisible ? 'visible' : 'hidden', options: options};
})();
"""
)
return (
cast(dict[str, object], result)
if isinstance(result, dict)
else {"visible": False, "reason": "no_listbox", "options": []}
)
async def clear_chips(page: PageLike, field_selector: str) -> None:
"""Clear chips in an autocomplete field if a clear indicator exists."""
with contextlib.suppress(Exception):
await page.click(f"{field_selector} [aria-label='Clear']")
await page.wait_for_timeout(50)
# ---------------------------------------------------------------------------
# Combobox Helpers
# ---------------------------------------------------------------------------
async def find_combobox_selector(page: PageLike, field_selector: str) -> str:
"""Find the combobox selector based on field_selector type."""
field_selector_js = escape_selector(field_selector)
element_info = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{field_selector_js}');
if (!el) return {{found: false}};
return {{
found: true,
tagName: el.tagName,
role: el.getAttribute('role') || '',
name: el.getAttribute('name') || '',
isInput: el.tagName === 'INPUT',
isLabel: el.tagName === 'LABEL',
isCombobox: el.getAttribute('role') === 'combobox'
}};
}})();
"""
)
element_info_dict = (
cast(dict[str, object], element_info) if isinstance(element_info, dict) else {}
)
if not element_info_dict.get("found"):
return f"{field_selector} [role='combobox']"
is_input = bool(element_info_dict.get("isInput", False))
is_label = bool(element_info_dict.get("isLabel", False))
is_combobox = bool(element_info_dict.get("isCombobox", False))
if is_combobox:
return field_selector
elif is_input:
combobox_via_parent = await page.evaluate(
f"""
(() => {{
const input = document.querySelector('{field_selector_js}');
if (!input) return '';
const formControl = input.closest('.MuiFormControl-root, .MuiInputBase-root');
if (formControl) {{
const combo = formControl.querySelector('[role="combobox"]');
if (combo) {{
if (combo.id) return '#' + combo.id;
return '[role="combobox"]';
}}
}}
return '';
}})();
"""
)
if combobox_via_parent and isinstance(combobox_via_parent, str):
return combobox_via_parent
# General-sibling selector; '~' also covers the adjacent-sibling ('+') case.
return f"{field_selector} ~ [role='combobox']"
elif is_label:
return f"{field_selector} + * [role='combobox'], {field_selector} ~ * [role='combobox']"
else:
return f"{field_selector} [role='combobox']"
async def open_combobox_dropdown(
page: PageLike,
combobox_selector: str,
field_selector: str,
popup_button_selector: str,
*,
timeouts: Timeouts | None = None,
) -> bool:
"""Open combobox dropdown using multiple strategies. Returns True if opened."""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
if await check_listbox_visible(page):
_logger.debug("[Dropdown] Combobox already open, skipping click")
return True
try:
_ = await page.wait_for_selector(
combobox_selector, timeout=effective_timeouts.dropdown_field
)
_ = await click_with_mouse_events(page, combobox_selector)
await page.wait_for_timeout(800)
if await check_listbox_visible(page):
return True
field_selector_js = escape_selector(field_selector)
is_expanded_result = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const combo = field.querySelector('[role="combobox"]');
if (!combo) return false;
const expanded = combo.getAttribute('aria-expanded');
return expanded === 'true';
}})();
"""
)
if bool(is_expanded_result):
return True
except Exception as e:
_logger.warning("[Dropdown] Combobox click failed: %s", e)
icon_selectors = [
popup_button_selector,
f"{field_selector} svg[data-testid='ArrowDropDownIcon']",
f"{field_selector} .MuiSelect-icon",
"#RDGridFormFieldGroup_groupbody > div:nth-child(1) > div > div > svg[data-testid='ArrowDropDownIcon']",
]
for icon_sel in icon_selectors:
try:
_ = await page.wait_for_selector(
icon_sel, timeout=effective_timeouts.dropdown_icon
)
_ = await click_with_mouse_events(page, icon_sel, focus_first=False)
await page.wait_for_timeout(400)
if await check_listbox_visible(page):
return True
except Exception:
continue
return False
async def wait_for_dropdown_options(
page: PageLike,
wait_ms: int,
field_selector: str,
combobox_selector: str,
) -> tuple[bool, list[str]]:
"""Wait for dropdown options to appear. Returns (visible, available_options)."""
# Inside a coroutine, get_running_loop() is the preferred way to reach the loop clock.
loop = asyncio.get_running_loop()
start = loop.time()
while (loop.time() - start) < 2.0 and not await check_listbox_visible(page):
await asyncio.sleep(0.1)
start = loop.time()
while (loop.time() - start) * 1000 < wait_ms:
result_dict = await get_listbox_options(page)
options_list = cast(list[dict[str, str]] | None, result_dict.get("options"))
if options_list and len(options_list) > 0:
available = [str(opt.get("text", "")) for opt in options_list]
return True, available
await asyncio.sleep(0.1)
_logger.warning(
"[Dropdown] No options found within %dms, retrying click for field: %s",
wait_ms,
field_selector,
)
try:
retry_selector = combobox_selector or f"{field_selector} [role='combobox']"
_ = await click_with_mouse_events(page, retry_selector)
await page.wait_for_timeout(800)
result_dict = await get_listbox_options(page)
options_list = cast(list[dict[str, str]] | None, result_dict.get("options"))
if options_list and len(options_list) > 0:
available = [str(opt.get("text", "")) for opt in options_list]
return True, available
except Exception as e:
_logger.warning("[Dropdown] Retry click also failed: %s", e)
return False, []
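# Illustrative sketch, not part of this commit: the two polling loops in
# wait_for_dropdown_options follow the same "check, sleep, re-check until a
# deadline" pattern, which could be factored into one helper. The name
# `poll_until` and its parameters are hypothetical.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def poll_until(
    predicate: Callable[[], Awaitable[bool]],
    timeout_s: float,
    interval_s: float = 0.1,
) -> bool:
    """Await `predicate` repeatedly until it returns True or the deadline passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        # Check first so a condition that already holds returns immediately.
        if await predicate():
            return True
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(interval_s)
```

# Under these assumptions, the first loop above would read roughly as
# `await poll_until(lambda: check_listbox_visible(page), timeout_s=2.0)`.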
async def select_combobox_option(
page: PageLike, field_selector: str, value: str
) -> bool:
"""Select an option from the open dropdown. Returns True if selected."""
_ = field_selector # Reserved for future scope filtering
value_escaped = escape_selector(value)
match_result = await page.evaluate(
f"""
(() => {{
const target = '{value_escaped}'.toLowerCase();
const options = Array.from(document.querySelectorAll('[role="option"]'));
let matchIndex = -1;
let match = null;
for (let i = 0; i < options.length; i++) {{
const opt = options[i];
const text = (opt.textContent || '').trim().toLowerCase();
const dataVal = (opt.getAttribute('data-value') || '').trim().toLowerCase();
if (text === target || dataVal === target || text.includes(target) || target.includes(text.split(' - ')[0])) {{
matchIndex = i;
match = opt;
break;
}}
}}
if (!match) {{
return {{found: false, text: 'No matching option found', available: options.map(o => (o.textContent || '').trim())}};
}}
match.scrollIntoView({{block: 'center', behavior: 'instant'}});
match.setAttribute('data-dropdown-match', 'true');
return {{
found: true,
index: matchIndex,
text: (match.textContent || '').trim(),
dataValue: match.getAttribute('data-value') || ''
}};
}})();
"""
)
match_dict = (
cast(dict[str, object], match_result) if isinstance(match_result, dict) else {}
)
if not match_dict.get("found"):
return False
try:
await page.click('[role="option"][data-dropdown-match="true"]')
selected = True
except Exception:
try:
option_index_val = match_dict.get("index", 0)
option_index = (
int(option_index_val)
if isinstance(option_index_val, (int, float))
else 0
)
await page.click(f'[role="option"]:nth-child({option_index + 1})')
selected = True
except Exception:
js_clicked = await page.evaluate(
"""
(() => {
const match = document.querySelector('[role="option"][data-dropdown-match="true"]');
if (match) {
match.click();
return true;
}
return false;
})();
"""
)
# Report success only if the JS fallback actually found and clicked the option.
selected = bool(js_clicked)
_ = await page.evaluate(
"""
(() => {
const match = document.querySelector('[role="option"][data-dropdown-match="true"]');
if (match) {
match.removeAttribute('data-dropdown-match');
}
})();
"""
)
return selected
async def verify_combobox_selection(
page: PageLike, field_selector: str, value: str
) -> bool:
"""Verify that the combobox selection was set correctly."""
value_escaped = escape_selector(value)
field_selector_js = escape_selector(field_selector)
_ = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const combo = field.querySelector('[role="combobox"]');
const hiddenInput = field.querySelector('input[type="hidden"], input[name]');
if (combo && hiddenInput) {{
const comboText = (combo.textContent || '').trim();
const options = Array.from(document.querySelectorAll('[role="option"]'));
const match = options.find(opt => {{
const text = (opt.textContent || '').trim();
return text === comboText || comboText.includes(text) || text.includes(comboText);
}});
if (match) {{
const dataValue = match.getAttribute('data-value') || comboText;
const nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLInputElement.prototype,
'value'
)?.set;
if (nativeSetter) {{
nativeSetter.call(hiddenInput, dataValue);
}} else {{
hiddenInput.value = dataValue;
}}
hiddenInput.dispatchEvent(new Event('input', {{ bubbles: true }}));
hiddenInput.dispatchEvent(new Event('change', {{ bubbles: true }}));
}}
}}
return true;
}})();
"""
)
await page.wait_for_timeout(100)
actually_selected = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const combo = field.querySelector('[role="combobox"]');
if (!combo) return false;
const comboText = (combo.textContent || '').trim();
const target = '{value_escaped}';
return comboText === target ||
comboText.toLowerCase() === target.toLowerCase() ||
comboText.includes(target) ||
target.includes(comboText.split(' - ')[0]);
}})();
"""
)
if not actually_selected:
await page.wait_for_timeout(200)
actually_selected = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const combo = field.querySelector('[role="combobox"]');
if (!combo) return false;
const comboText = (combo.textContent || '').trim();
const target = '{value_escaped}';
return comboText === target ||
comboText.toLowerCase() === target.toLowerCase() ||
comboText.includes(target) ||
target.includes(comboText.split(' - ')[0]);
}})();
"""
)
return bool(actually_selected)
async def force_set_combobox_value(
page: PageLike, field_selector: str, value: str
) -> bool:
"""Force-set combobox value as fallback. Returns True if set."""
value_escaped = escape_selector(value)
field_selector_js = escape_selector(field_selector)
selected = (
await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const native = field.querySelector('input[name], input[type="hidden"]');
const combo = field.querySelector('[role="combobox"]');
const val = '{value_escaped}';
const options = Array.from(document.querySelectorAll('[role="option"]'));
const match = options.find(opt => {{
const text = (opt.textContent || '').trim().toLowerCase();
return text === val.toLowerCase() || text.includes(val.toLowerCase()) || val.toLowerCase().includes(text.split(' - ')[0]);
}});
if (match) {{
match.click();
return true;
}}
if (native) {{
const nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLInputElement.prototype,
'value'
)?.set;
if (nativeSetter) {{
nativeSetter.call(native, val);
}} else {{
native.value = val;
}}
native.dispatchEvent(new Event('input', {{ bubbles: true }}));
native.dispatchEvent(new Event('change', {{ bubbles: true }}));
}}
if (combo) {{
combo.textContent = val;
combo.dispatchEvent(new Event('change', {{ bubbles: true }}));
}}
return true;
}})();
"""
)
or False
)
if selected:
await page.wait_for_timeout(200)
actually_selected = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const combo = field.querySelector('[role="combobox"]');
if (!combo) return false;
const comboText = (combo.textContent || '').trim();
const target = '{value_escaped}';
return comboText === target || comboText.includes(target) || target.includes(comboText.split(' - ')[0]);
}})();
"""
)
selected = actually_selected or False
return bool(selected)
async def open_mui_dropdown(
page: PageLike,
field_selector: str,
popup_button_selector: str,
*,
timeouts: Timeouts | None = None,
) -> bool:
"""Open MUI dropdown by clicking popup button or field. Returns True if clicked."""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
if await check_listbox_visible(page):
_logger.debug("[Dropdown] MUI dropdown already open, skipping click")
return True
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(
popup_button_selector, timeout=effective_timeouts.dropdown_icon
)
clicked = await click_with_mouse_events(
page, popup_button_selector, focus_first=True
)
if clicked:
return True
with contextlib.suppress(Exception):
_ = await page.wait_for_selector(
field_selector, timeout=effective_timeouts.dropdown_icon
)
clicked = await click_with_mouse_events(page, field_selector, focus_first=True)
if clicked:
return True
return False
async def search_and_click_mui_option(
page: PageLike,
target_value: str,
allow_substring: bool,
) -> tuple[bool, list[str]]:
"""Search for and click a MUI option. Returns (success, available_options)."""
tgt = (target_value or "").strip()
esc = tgt.replace("\\", "\\\\").replace("'", "\\'")
search_result = await page.evaluate(
f"""
(() => {{
const opts = Array.from(document.querySelectorAll(
'[role="option"], ' +
'li[role="option"], ' +
'.MuiMenuItem-root, ' +
'[id*="option-"], ' +
'#menu-renewal_type li, ' +
'#menu-payment_terms li, ' +
'#menu-payment_schedule li'
));
const optionTexts = opts.map(o => (o.textContent || '').trim());
const tgt = '{esc}'.trim().toLowerCase();
const match = opts.find(o => {{
const txt = (o.textContent || '').trim().toLowerCase();
if (txt === tgt) return true;
if ({"true" if allow_substring else "false"}) {{
return txt.includes(tgt) || tgt.includes(txt);
}}
return false;
}});
if (!match) {{
return {{success: false, availableOptions: optionTexts}};
}}
match.scrollIntoView({{block: 'nearest'}});
match.click();
return {{success: true, availableOptions: optionTexts}};
}})();
"""
)
if isinstance(search_result, dict):
search_dict = cast(dict[str, object], search_result)
success = bool(search_dict.get("success", False))
found_options_raw: object = search_dict.get("availableOptions", [])
if isinstance(found_options_raw, list):
found_options: list[str] = [
str(opt) for opt in cast(list[object], found_options_raw)
]
else:
found_options = []
return success, found_options
return False, []
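
# Illustrative sketch, not part of this commit: search_and_click_mui_option
# escapes the target by hand (backslashes and single quotes only), so a value
# containing a newline would still break the generated script. One possible
# alternative is to let json.dumps produce the whole JavaScript string
# literal. The helper names below are hypothetical.

```python
import json


def js_string_literal(value: str) -> str:
    """Return `value` as a quoted JavaScript string literal."""
    # json.dumps escapes quotes, backslashes and control characters; with the
    # default ensure_ascii=True it also escapes non-ASCII code points.
    return json.dumps(value)


def build_exact_match_js(target: str) -> str:
    """Build an IIFE that reports whether any option's text equals `target`."""
    literal = js_string_literal(target.strip().lower())
    return (
        "(() => {\n"
        f"  const tgt = {literal};\n"
        "  return Array.from(document.querySelectorAll('[role=\"option\"]'))\n"
        "    .some(o => (o.textContent || '').trim().toLowerCase() === tgt);\n"
        "})();"
    )
```

# Because the literal carries its own quotes, the template needs no
# surrounding '...' around the interpolated value.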

@@ -0,0 +1,328 @@
"""MUI Autocomplete selection handlers.
Handles multi-select autocomplete components with chips.
"""
import contextlib
import logging
from typing import cast
from guide.app.browser.extension_client import ExtensionPage
from guide.app.browser.types import PageLike
from guide.app.browser.elements.mui import (
DropdownResult,
escape_selector,
ensure_listbox,
)
from guide.app.browser.elements.dropdown._close import close_all_dropdowns
from guide.app.browser.elements.dropdown._helpers import (
open_dropdown,
get_options,
)
_logger = logging.getLogger(__name__)
def _build_click_option_js(field_selector: str, escaped_value: str) -> str:
"""Build JavaScript to find and click an option by exact or segment match.
Args:
field_selector: CSS selector for the autocomplete wrapper
escaped_value: Escaped value to search for
Returns:
JavaScript code string that returns true if clicked, false otherwise
"""
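# Escape the selector so quotes inside it cannot terminate the JS string literal.
field_selector_js = escape_selector(field_selector)
return f"""
(() => {{
const field = document.querySelector('{field_selector_js}');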
const input = field ? field.querySelector('input') : null;
const listboxId = input?.getAttribute('aria-controls') || input?.getAttribute('aria-owns');
const listbox = listboxId ? document.getElementById(listboxId) : document.querySelector('[role="listbox"]');
if (!listbox) return false;
const options = Array.from(listbox.querySelectorAll('[role="option"]'));
const target = '{escaped_value}'.toLowerCase();
const match = options.find(opt => {{
const text = (opt.textContent || '').trim();
const lower = text.toLowerCase();
const firstSeg = lower.split(' - ')[0];
const secondSeg = lower.split(' - ')[1] || '';
return lower === target || firstSeg === target || secondSeg === target;
}});
if (!match) return false;
match.click();
return true;
}})();
"""
def _build_partial_match_js(escaped_value: str) -> str:
"""Build JavaScript for partial text match click.
Args:
escaped_value: Escaped value to search for
Returns:
JavaScript code string that returns true if clicked, false otherwise
"""
return f"""
(() => {{
const target = '{escaped_value}'.toLowerCase();
const options = Array.from(document.querySelectorAll('[role="option"]'));
const match = options.find(opt => (opt.textContent || '').trim().toLowerCase().includes(target));
if (!match) return false;
match.click();
return true;
}})();
"""
async def _try_click_option(
page: PageLike, field_selector: str, escaped_value: str
) -> bool:
"""Try to click an option matching the value.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
escaped_value: Escaped value to search for
Returns:
True if option was clicked, False otherwise
"""
js_code = _build_click_option_js(field_selector, escaped_value)
result = await page.evaluate(js_code)
return bool(result)
async def _try_partial_match(page: PageLike, escaped_value: str) -> bool:
"""Try partial text match as fallback.
Args:
page: PageLike instance
escaped_value: Escaped value to search for
Returns:
True if option was clicked, False otherwise
"""
js_code = _build_partial_match_js(escaped_value)
result = await page.evaluate(js_code)
return bool(result)
async def _close_with_trusted_click(page: PageLike) -> None:
"""Close dropdown using trusted click if available (extension mode).
Args:
page: PageLike instance
"""
coords = await page.evaluate(
"""
(() => {
const targets = [
document.querySelector('h2'),
document.querySelector('.MuiDialogTitle-root'),
document.querySelector('.MuiDialogContent-root'),
];
for (const target of targets) {
if (target) {
const rect = target.getBoundingClientRect();
return { x: rect.left + 10, y: rect.top + 10 };
}
}
return { x: 100, y: 100 };
})();
"""
)
_logger.info("[Dropdown] Coordinates: %s", coords)
if not coords or not isinstance(coords, dict):
return
coords_dict = cast(dict[str, object], coords)
x_val = coords_dict.get("x", 100)
y_val = coords_dict.get("y", 100)
x = float(x_val) if isinstance(x_val, (int, float)) else 100.0
y = float(y_val) if isinstance(y_val, (int, float)) else 100.0
_logger.info("[Dropdown] Trusted click at (%s, %s)", x, y)
try:
if isinstance(page, ExtensionPage):
await page.trusted_click(x, y)
_logger.info("[Dropdown] Trusted click completed")
await page.wait_for_timeout(200)
except Exception as e:
_logger.error("[Dropdown] Trusted click FAILED: %s", e)
async def _select_value_with_retry(
page: PageLike,
field_selector: str,
popup_button_selector: str,
target_value: str,
) -> bool:
"""Try to select a value with one retry.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
popup_button_selector: Selector for popup trigger button
target_value: Value to select
Returns:
True if value was selected, False otherwise
"""
escaped = escape_selector(target_value)
# Ensure listbox is open
if not await ensure_listbox(page):
await open_dropdown(page, field_selector, popup_button_selector)
_ = await ensure_listbox(page)
_ = await get_options(page, field_selector)
# First attempt
if await _try_click_option(page, field_selector, escaped):
await page.wait_for_timeout(200)
_logger.debug("[Dropdown] Selected '%s' on first attempt", target_value)
return True
# Retry: reopen dropdown and try again
await open_dropdown(page, field_selector, popup_button_selector)
_ = await ensure_listbox(page)
if await _try_click_option(page, field_selector, escaped):
await page.wait_for_timeout(200)
_logger.debug("[Dropdown] Selected '%s' on retry", target_value)
return True
_logger.warning("[Dropdown] Failed to select '%s'", target_value)
return False
async def _resolve_unmatched_values(
page: PageLike,
field_selector: str,
popup_button_selector: str,
not_found: list[str],
) -> tuple[list[str], list[str]]:
"""Try partial matching for unmatched values.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
popup_button_selector: Selector for popup trigger button
not_found: List of values that weren't found with exact match
Returns:
Tuple of (newly_selected, still_unresolved)
"""
if not not_found:
return [], []
await open_dropdown(page, field_selector, popup_button_selector)
_ = await ensure_listbox(page)
newly_selected: list[str] = []
still_unresolved: list[str] = []
for val in not_found:
escaped = escape_selector(val)
if await _try_partial_match(page, escaped):
newly_selected.append(val)
else:
still_unresolved.append(val)
return newly_selected, still_unresolved
async def _close_dropdown(page: PageLike, field_selector: str) -> None:
"""Close the dropdown using appropriate method.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
"""
_logger.info("[Dropdown] Closing dropdown after %s", field_selector)
has_trusted = hasattr(page, "trusted_click")
_logger.info("[Dropdown] Has trusted_click: %s", has_trusted)
if has_trusted:
await _close_with_trusted_click(page)
else:
_logger.info("[Dropdown] Using fallback click (no trusted_click)")
with contextlib.suppress(Exception):
await page.click("h2")
await page.wait_for_timeout(150)
await close_all_dropdowns(page)
async def select_multi(
page: PageLike, field_selector: str, values: list[str]
) -> DropdownResult:
"""Select multiple values from a MUI Autocomplete field.
Handles multi-select autocomplete components with chips.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
values: List of values to select
Returns:
DropdownResult with selected, not_found, and available lists
"""
popup_button_selector = f"{field_selector} .MuiAutocomplete-popupIndicator"
await page.wait_for_timeout(150) # Brief stabilization
await open_dropdown(page, field_selector, popup_button_selector)
await page.wait_for_timeout(300) # Wait for dropdown animation
_ = await ensure_listbox(page)
# Collect available options
options = await get_options(page, field_selector)
available = [str(opt["text"]) for opt in (options or []) if "text" in opt]
# Select each value with retry
selected: list[str] = []
not_found: list[str] = []
for target_value in values:
if await _select_value_with_retry(
page, field_selector, popup_button_selector, target_value
):
selected.append(target_value)
else:
not_found.append(target_value)
# Try partial matching for unresolved values
newly_selected, still_unresolved = await _resolve_unmatched_values(
page, field_selector, popup_button_selector, not_found
)
selected.extend(newly_selected)
await _close_dropdown(page, field_selector)
return {
"selected": selected,
"not_found": still_unresolved,
"available": available,
}
async def select_single(
page: PageLike, field_selector: str, value: str
) -> DropdownResult:
"""Select a single value from a MUI Autocomplete field.
Convenience wrapper around select_multi for single selections.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
value: Value to select
Returns:
DropdownResult with selected, not_found, and available lists
"""
return await select_multi(page, field_selector, [value])
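
# Illustrative sketch, not part of this commit: the exact-or-segment rule that
# _build_click_option_js applies in the browser, mirrored in plain Python so
# the matching behaviour can be unit-tested without a page. The function name
# `option_matches` is hypothetical.

```python
def option_matches(option_text: str, target: str) -> bool:
    """Return True if `target` equals the option text or one of its first two segments."""
    lower = option_text.strip().lower()
    wanted = target.lower()
    # Options are assumed to look like "CODE - Description", as in the JS above.
    segments = lower.split(" - ")
    first = segments[0]
    second = segments[1] if len(segments) > 1 else ""
    return wanted in (lower, first, second)
```

# A partial word such as "Corp" does not match here; that case is left to the
# separate partial-match fallback.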

@@ -0,0 +1,215 @@
"""MUI Select/Combobox selection handlers.
Handles non-searchable Select components with role=combobox.
"""
import contextlib
import logging
from guide.app.browser.types import PageLike
from guide.app.browser.elements.mui import (
DropdownResult,
send_key,
ensure_listbox,
check_field_disabled,
enable_field,
)
from guide.app.browser.elements.dropdown._close import close_all_dropdowns
from guide.app.browser.elements.dropdown._helpers import (
find_combobox_selector,
open_combobox_dropdown,
wait_for_dropdown_options,
select_combobox_option,
verify_combobox_selection,
force_set_combobox_value,
open_mui_dropdown,
search_and_click_mui_option,
get_options,
clear_chips,
)
_logger = logging.getLogger(__name__)
async def select_combobox(
page: PageLike,
field_selector: str,
value: str,
*,
wait_ms: int = 2500,
) -> DropdownResult:
"""Select a value from a MUI Select combobox dropdown.
Handles non-searchable Select components with role=combobox.
Args:
page: PageLike instance
field_selector: CSS selector for the select wrapper
value: Value to select
wait_ms: Maximum wait time for options to appear
Returns:
DropdownResult with selected, not_found, and available lists
"""
await enable_field(page, field_selector)
popup_button_selector = f"{field_selector} svg[data-testid='ArrowDropDownIcon']"
_ = await page.evaluate(
"""
(() => {
const body = document.querySelector('body, #body');
if (body) {
body.click();
}
return true;
})();
"""
)
await page.wait_for_timeout(50)
combobox_selector = await find_combobox_selector(page, field_selector)
_ = await open_combobox_dropdown(
page, combobox_selector, field_selector, popup_button_selector
)
listbox_visible, available = await wait_for_dropdown_options(
page, wait_ms, field_selector, combobox_selector
)
if not listbox_visible:
_logger.error(
"[Dropdown] Cannot select value - dropdown never opened for field: %s",
field_selector,
)
return {
"selected": [],
"not_found": [value],
"available": [],
}
selected = False
try:
selected = await select_combobox_option(page, field_selector, value)
if selected:
await page.wait_for_timeout(300)
selected = await verify_combobox_selection(page, field_selector, value)
except Exception as e:
_logger.warning("[Dropdown] Option selection failed: %s", e)
selected = False
if not selected and listbox_visible:
try:
selected = await force_set_combobox_value(page, field_selector, value)
except Exception as e:
_logger.warning("[Dropdown] Force-set failed: %s", e)
elif not selected:
_logger.warning(
"[Dropdown] Cannot force-set - dropdown never opened for: %s",
field_selector,
)
# Auto-close all dropdowns to ensure clean state for next operation
await close_all_dropdowns(page)
return {
"selected": [value] if selected else [],
"not_found": [] if selected else [value],
"available": available,
}
# Alias for clarity
select_single_choice = select_combobox
async def select_mui_options(
page: PageLike,
field_selector: str,
values: list[str],
*,
clear_first: bool = False,
allow_substring: bool = False,
) -> DropdownResult:
"""Select options from a MUI Autocomplete/Select by visible text.
Generic handler that works with both Autocomplete and Select components.
Args:
page: PageLike instance
field_selector: CSS selector for the field wrapper
values: List of values to select
clear_first: Whether to clear existing chips first
allow_substring: Whether to allow substring matching
Returns:
DropdownResult with selected, not_found, and available lists
"""
if clear_first:
await clear_chips(page, field_selector)
selected: list[str] = []
not_found: list[str] = []
available: list[str] = []
for target_value in values:
popup_button_selector = (
f"{field_selector} [data-testid='ArrowDropDownIcon'], "
f"{field_selector} .MuiAutocomplete-popupIndicator, "
f"{field_selector} .MuiSelect-icon, "
f"{field_selector} [role='combobox'], "
f"{field_selector}"
)
if await check_field_disabled(page, field_selector):
_logger.warning(
"[select_mui_options] Field is disabled, skipping: %s", field_selector
)
not_found.append(target_value)
continue
_logger.info(
"[select_mui_options] Attempting to open dropdown for: %s", field_selector
)
_ = await open_mui_dropdown(page, field_selector, popup_button_selector)
await page.wait_for_timeout(800)
_ = await ensure_listbox(page, timeout_ms=2500)
options = await get_options(page, field_selector) or []
available.extend([str(o["text"]) for o in options if "text" in o])
clicked, _ = await search_and_click_mui_option(
page, target_value, allow_substring
)
_logger.info(
"[select_mui_options] Option click result for '%s': %s",
target_value,
clicked,
)
if not clicked:
with contextlib.suppress(Exception):
await page.fill(
f"{field_selector} input, {field_selector} textarea, {field_selector}",
target_value,
)
await send_key(page, "Enter")
await page.wait_for_timeout(50)
clicked = True
if clicked:
selected.append(target_value)
else:
not_found.append(target_value)
await page.wait_for_timeout(40)
# Close any open dropdowns to ensure clean state for next operation
await close_all_dropdowns(page)
return {"selected": selected, "not_found": not_found, "available": available}
# Alias for clarity
select_autocomplete = select_mui_options
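
# Illustrative sketch, not part of this commit: select_mui_options extends
# `available` once per target value, so the returned list can repeat the same
# option texts when several values are requested. A caller that wants each
# option once could de-duplicate while keeping first-seen order. The helper
# name is hypothetical.

```python
def dedupe_preserving_order(items: list[str]) -> list[str]:
    """Drop repeated entries while keeping the first occurrence of each."""
    # dict preserves insertion order, so fromkeys keeps first-seen ordering.
    return list(dict.fromkeys(items))
```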

@@ -0,0 +1,117 @@
"""Schema-aware dropdown selection with validation.
Provides dropdown selection with optional GraphQL schema validation
for menu/select fields. Validates choices against schema definitions
while allowing dynamic fields (user, supplier, etc.) to pass through.
"""
from __future__ import annotations
from typing import TypedDict
from guide.app.browser.elements.dropdown.autocomplete import select_single
from guide.app.browser.types import PageLike
from guide.app.raindrop.operations.form_schema import FieldDef, FieldType
# Dynamic field types that don't have static choices in schema
_DYNAMIC_FIELD_TYPES = frozenset(
{
FieldType.USER,
FieldType.SUPPLIER,
FieldType.RELATIONSHIP,
FieldType.CONTRACTS,
FieldType.COMMODITY,
FieldType.DEPARTMENT,
}
)
class SchemaAwareDropdownResult(TypedDict):
"""Result of a schema-validated dropdown selection operation."""
selected: list[str]
not_found: list[str]
available: list[str]
validated: bool
validation_warning: str | None
async def select_from_schema(
page: PageLike,
selector: str,
value: str,
field_def: FieldDef | None = None,
*,
strict_validation: bool = False,
) -> SchemaAwareDropdownResult:
"""Select value with optional schema validation.
Validates the value against GraphQL schema choices before attempting selection.
Dynamic field types (USER, SUPPLIER, etc.) skip validation since
their options are API-driven and not enumerated in the schema.
Args:
page: Browser page instance.
selector: CSS selector for the dropdown field.
value: Value to select.
field_def: Optional FieldDef for validation.
strict_validation: If True, raise ValueError on invalid choices.
Returns:
SchemaAwareDropdownResult with selection info and validation status:
- selected: list of successfully selected values
- not_found: list of values not found in dropdown
- available: list of available options
- validated: True if value was validated against schema
- validation_warning: warning message if value not in choices
Raises:
ValueError: If strict_validation is True and value not in choices.
Example:
# With schema validation
field_def = schema.get_field("status")
result = await select_from_schema(page, selector, "Active", field_def)
if result["validation_warning"]:
logger.warning(result["validation_warning"])
# Without schema (no validation; selection still goes through select_single)
result = await select_from_schema(page, selector, "Some Value")
"""
# Determine if field has static choices we can validate against
has_static_choices = (
field_def is not None
and len(field_def.choices) > 0
and field_def.field_type not in _DYNAMIC_FIELD_TYPES
)
validation_warning: str | None = None
if has_static_choices and field_def is not None:
# Validate against both choice.text and choice.value
valid_texts = {c.text for c in field_def.choices}
valid_values = {c.value for c in field_def.choices if c.value}
all_valid = valid_texts | valid_values
if value not in all_valid:
validation_warning = (
f"Value '{value}' not in schema choices: {sorted(valid_texts)}"
)
if strict_validation:
raise ValueError(validation_warning)
# Proceed with selection regardless of validation
result = await select_single(page, selector, value)
# Return extended result with validation info
return SchemaAwareDropdownResult(
selected=result["selected"],
not_found=result["not_found"],
available=result["available"],
validated=has_static_choices and validation_warning is None,
validation_warning=validation_warning,
)
__all__ = ["select_from_schema", "SchemaAwareDropdownResult"]
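
# Illustrative sketch, not part of this commit: the validation step of
# select_from_schema in isolation, so the warning text can be checked without
# a browser page. `Choice` is a hypothetical stand-in for the schema's choice
# objects (only `text` and `value` are assumed), and `check_choice` is a
# hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Choice:
    """Hypothetical schema choice with display text and an optional value."""

    text: str
    value: str = ""


def check_choice(value: str, choices: list[Choice]) -> Optional[str]:
    """Return a warning message when `value` matches no choice, else None."""
    valid_texts = {c.text for c in choices}
    valid_values = {c.value for c in choices if c.value}
    # Accept either the display text or the underlying value.
    if value in valid_texts | valid_values:
        return None
    return f"Value '{value}' not in schema choices: {sorted(valid_texts)}"
```

# As in the function above, the comparison is case-sensitive and the warning
# lists display texts only.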

@@ -0,0 +1,271 @@
"""Type-to-search autocomplete selection handler.
Handles autocomplete fields that require typing to trigger API search.
"""
import asyncio
import logging
from guide.app.browser.types import PageLike
from guide.app.browser.elements.mui import (
DropdownResult,
escape_selector,
check_listbox_visible,
)
from guide.app.browser.elements.dropdown._close import close_all_dropdowns
_logger = logging.getLogger(__name__)
async def select_typeahead(
page: PageLike,
field_selector: str,
value: str,
*,
min_chars: int = 3,
wait_ms: int = 3000,
) -> DropdownResult:
"""Select a value from a type-to-search autocomplete field.
Unlike regular autocomplete fields that show all options on click,
type-to-search fields require typing the first few characters to trigger
an API call or filter, then selecting from the filtered results.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
value: Full value to select (e.g., "John Smith")
min_chars: Minimum characters to type before waiting for results (default 3)
wait_ms: Maximum wait time for options to appear after typing
Returns:
DropdownResult with selected, not_found, and available lists
"""
_logger.info("[Typeahead] Starting selection for '%s' in %s", value, field_selector)
# Close any existing open dropdowns first
if await check_listbox_visible(page):
_logger.debug("[Typeahead] Closing existing dropdown before starting")
await close_all_dropdowns(page)
await page.wait_for_timeout(150)
# Find and focus the input
field_selector_js = escape_selector(field_selector)
input_found = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const input = field.querySelector('input') || field;
if (input.tagName === 'INPUT') {{
input.focus();
return true;
}}
return false;
}})();
"""
)
if not input_found:
_logger.error("[Typeahead] Could not find input in field: %s", field_selector)
return {"selected": [], "not_found": [value], "available": []}
await page.wait_for_timeout(100)
# Clear any existing value and type the search characters
search_text = value[:min_chars]  # slicing already handles values shorter than min_chars
_logger.info("[Typeahead] Typing search text: '%s'", search_text)
# Clear existing value first
_ = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const input = field.querySelector('input') || field;
if (input.tagName !== 'INPUT') return false;
input.focus();
input.select();
const nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLInputElement.prototype, 'value'
)?.set;
if (nativeSetter) {{
nativeSetter.call(input, '');
}} else {{
input.value = '';
}}
input.dispatchEvent(new Event('input', {{ bubbles: true }}));
return true;
}})();
"""
)
await page.wait_for_timeout(100)
# Type character by character with keyboard events to trigger debounced search
value_escaped = escape_selector(search_text)
typed = await page.evaluate(
f"""
(() => {{
const field = document.querySelector('{field_selector_js}');
if (!field) return false;
const input = field.querySelector('input') || field;
if (input.tagName !== 'INPUT') return false;
input.focus();
const searchText = '{value_escaped}';
// Type each character with keyboard events
for (let i = 0; i < searchText.length; i++) {{
const char = searchText[i];
// Dispatch keydown
input.dispatchEvent(new KeyboardEvent('keydown', {{
key: char,
code: 'Key' + char.toUpperCase(),
keyCode: char.charCodeAt(0),
which: char.charCodeAt(0),
bubbles: true,
cancelable: true
}}));
// Update value using native setter
const nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLInputElement.prototype, 'value'
)?.set;
const newValue = input.value + char;
if (nativeSetter) {{
nativeSetter.call(input, newValue);
}} else {{
input.value = newValue;
}}
// Dispatch input event (React listens to this)
input.dispatchEvent(new InputEvent('input', {{
data: char,
inputType: 'insertText',
bubbles: true,
cancelable: true
}}));
// Dispatch keyup
input.dispatchEvent(new KeyboardEvent('keyup', {{
key: char,
code: 'Key' + char.toUpperCase(),
keyCode: char.charCodeAt(0),
which: char.charCodeAt(0),
bubbles: true,
cancelable: true
}}));
}}
// Final change event
input.dispatchEvent(new Event('change', {{ bubbles: true }}));
return true;
}})();
"""
)
if not typed:
_logger.error("[Typeahead] Failed to type search text")
return {"selected": [], "not_found": [value], "available": []}
# Wait for dropdown options to appear (API response)
_logger.info("[Typeahead] Waiting for dropdown options (max %dms)", wait_ms)
# get_running_loop() is the preferred way to reach the loop from inside a coroutine
loop = asyncio.get_running_loop()
start = loop.time()
options_found = False
available: list[str] = []
while (loop.time() - start) * 1000 < wait_ms:
# Check for options using multiple strategies (listbox may not always be present)
options_result = await page.evaluate(
"""
(() => {
// Strategy 1: Standard listbox
let listbox = document.querySelector('[role="listbox"]:not([data-dropdown-closed])');
if (listbox) {
const opts = Array.from(listbox.querySelectorAll('[role="option"]'));
if (opts.length > 0) {
return opts.map(o => (o.textContent || '').trim());
}
}
// Strategy 2: MUI Autocomplete popper (may not have role="listbox")
const popper = document.querySelector('.MuiAutocomplete-popper');
if (popper) {
const opts = Array.from(popper.querySelectorAll('[role="option"], li'));
if (opts.length > 0) {
return opts.map(o => (o.textContent || '').trim());
}
}
// Strategy 3: Any visible options on page
const anyOpts = Array.from(document.querySelectorAll('[role="option"]'));
if (anyOpts.length > 0) {
return anyOpts.map(o => (o.textContent || '').trim());
}
return [];
})();
"""
)
if isinstance(options_result, list):
from typing import cast
options_list = cast(list[object], options_result)
options_count = len(options_list)
if options_count > 0:
available = [str(opt) for opt in options_list]
_logger.info(
"[Typeahead] Found %d options: %s", options_count, available[:5]
)
options_found = True
break
await asyncio.sleep(0.15)
if not options_found:
_logger.warning(
"[Typeahead] No options appeared after typing '%s'", search_text
)
return {"selected": [], "not_found": [value], "available": []}
# Click the matching option (same pattern as select_multi)
target_escaped = escape_selector(value)
_logger.info("[Typeahead] Clicking matching option for '%s'", value)
clicked = await page.evaluate(
f"""
(() => {{
const target = '{target_escaped}'.toLowerCase();
// Find options from any source
const options = Array.from(document.querySelectorAll('[role="option"]'));
if (options.length === 0) return false;
// Find matching option
const match = options.find(opt => {{
const text = (opt.textContent || '').trim().toLowerCase();
const firstSeg = text.split(' - ')[0];
return text === target || firstSeg === target ||
text.includes(target) || target.includes(firstSeg);
}});
if (match) {{
match.scrollIntoView({{block: 'center', behavior: 'instant'}});
match.click();
return true;
}}
// If no exact match, click first option (typing already filtered)
options[0].scrollIntoView({{block: 'center', behavior: 'instant'}});
options[0].click();
return true;
}})();
"""
)
await page.wait_for_timeout(250)
if clicked:
_logger.info("[Typeahead] Successfully selected '%s'", value)
await close_all_dropdowns(page)
return {"selected": [value], "not_found": [], "available": available}
_logger.warning("[Typeahead] Failed to select '%s'", value)
await close_all_dropdowns(page)
return {"selected": [], "not_found": [value], "available": available}


@@ -0,0 +1,327 @@
"""Field type inference utilities for mapping selectors to automation helpers.
Provides utilities to:
- Infer field types from selectors, elements, or XPath
- Map field types to appropriate helper functions
- Combine multiple inference strategies for robustness
"""
from __future__ import annotations
from typing import Literal
from guide.app.browser.elements._type_guards import (
get_bool_from_dict,
get_str_from_dict,
is_dict_str_object,
is_list_of_objects,
)
from guide.app.browser.types import PageLike
HelperFunction = Literal[
"select_single",
"select_combobox",
"select_multi",
"fill_with_react_events",
"unknown",
]
def infer_type_from_selector(selector: str) -> str:
    """Infer field type from CSS selector pattern.

    Checks run in order, so the first matching pattern wins.

    Args:
        selector: CSS selector string

    Returns:
        Inferred field type (select, autocomplete, text, number, user, relationship, unknown)
    """
    selector_lower = selector.lower()
    # Data-cy pattern-based inference
    if "field-menu" in selector_lower:
        return "select"
    if "field-user" in selector_lower:
        return "user"
    if "field-text" in selector_lower:
        return "text"
    # Any occurrence of "number" counts, which also covers "-number-"
    if "number" in selector_lower:
        return "number"
    if "-supplier-" in selector_lower:
        return "autocomplete"
    if any(
        pattern in selector_lower for pattern in ["-contracts-", "-events-", "-order-"]
    ):
        return "relationship"
    # Fallback
    return "unknown"
async def infer_type_from_element(page: PageLike, selector: str) -> str:
"""Infer field type from live browser element.
Args:
page: Browser page instance
selector: CSS selector for the element
Returns:
Inferred field type
"""
from guide.app.browser.utils import escape_selector
escaped = escape_selector(selector)
result = await page.evaluate(
f"""
((selector) => {{
const el = document.querySelector(selector);
if (!el) return null;
const input = el.querySelector('input') || el;
const autocomplete = el.querySelector('.MuiAutocomplete-root');
const selectRoot = el.querySelector('.MuiSelect-root');
return {{
tag_name: el.tagName.toLowerCase(),
// PRIMARY: Role-based detection (W3C standard)
role: el.getAttribute('role'),
input_role: input.getAttribute('role'),
aria_controls: input.getAttribute('aria-controls'),
aria_owns: input.getAttribute('aria-owns'),
aria_expanded: input.getAttribute('aria-expanded'),
aria_haspopup: input.getAttribute('aria-haspopup'),
// FALLBACK: MUI class detection (for pages missing ARIA)
has_autocomplete_class: !!autocomplete,
has_select_class: !!selectRoot,
has_autocomplete_parent: !!el.closest('.MuiAutocomplete-root'),
has_select_parent: !!el.closest('.MuiSelect-root'),
// Existing fields
type_attr: input.getAttribute('type'),
classes: Array.from(el.classList),
data_cy: el.getAttribute('data-cy') || el.closest('[data-cy]')?.getAttribute('data-cy')
}};
}})('{escaped}');
"""
)
if not result or not is_dict_str_object(result):
return "unknown"
# Get data-cy for semantic type detection
data_cy = get_str_from_dict(result, "data_cy")
# PRIMARY: Role-based detection (most reliable, W3C standard)
input_role = get_str_from_dict(result, "input_role")
if input_role == "combobox":
# Distinguish between select and autocomplete based on data-cy
if "field-menu" in data_cy:
return "select"
if "user" in data_cy:
return "user"
if "supplier" in data_cy:
return "autocomplete"
if any(pattern in data_cy for pattern in ["contracts", "events", "orders"]):
return "relationship"
return "autocomplete"
# Check for ARIA popup indicators
aria_controls = get_str_from_dict(result, "aria_controls")
aria_owns = get_str_from_dict(result, "aria_owns")
if aria_controls or aria_owns:
# Has popup association - likely autocomplete/select
if "field-menu" in data_cy:
return "select"
return "autocomplete"
# FALLBACK: MUI class detection (for pages without proper ARIA)
has_autocomplete_class = get_bool_from_dict(result, "has_autocomplete_class")
has_select_class = get_bool_from_dict(result, "has_select_class")
if has_autocomplete_class:
if "user" in data_cy:
return "user"
if "supplier" in data_cy:
return "autocomplete"
if any(pattern in data_cy for pattern in ["contracts", "events", "orders"]):
return "relationship"
return "autocomplete"
if has_select_class:
return "select"
if get_bool_from_dict(result, "has_autocomplete_parent"):
if "user" in data_cy:
return "user"
if "supplier" in data_cy:
return "autocomplete"
if any(pattern in data_cy for pattern in ["contracts", "events", "orders"]):
return "relationship"
return "autocomplete"
if get_bool_from_dict(result, "has_select_parent"):
return "select"
# Check classes for MUI components (last resort class detection)
classes_obj = result.get("classes")
if classes_obj is not None and is_list_of_objects(classes_obj):
classes: list[str] = [
str(class_item)
for class_item in classes_obj
if isinstance(class_item, (str, int, float, bool))
]
if any("MuiAutocomplete-root" in c for c in classes):
return "autocomplete"
if any("MuiSelect-select" in c for c in classes):
return "select"
# Check input type attribute
type_attr = get_str_from_dict(result, "type_attr")
if type_attr == "number":
return "number"
if type_attr == "text":
return "text"
if type_attr == "email":
return "text"
# Check tag name
tag_name = get_str_from_dict(result, "tag_name")
if tag_name == "textarea":
return "textarea"
if tag_name == "input":
return "text"
return "unknown"
def infer_type_from_xpath(xpath: str) -> str:
    """Infer field type from XPath selector.

    Args:
        xpath: XPath selector string

    Returns:
        Inferred field type
    """
    xpath_lower = xpath.lower()
    if 'role="combobox"' in xpath_lower:
        return "select" if "muiselect" in xpath_lower else "autocomplete"
    # Data-cy pattern inference
    if "field-user" in xpath_lower:
        return "user"
    if "field-menu" in xpath_lower:
        return "select"
    return "text" if "field-text" in xpath_lower else "unknown"
def select_helper_for_type(field_type: str) -> HelperFunction:
    """Map field type to appropriate helper function.

    Args:
        field_type: Field type (select, autocomplete, text, etc.)

    Returns:
        Helper function name to use
    """
    helper_mapping: dict[str, HelperFunction] = {
        "select": "select_single",
        "autocomplete": "select_combobox",
        "text": "fill_with_react_events",
        "textarea": "fill_with_react_events",
        "number": "fill_with_react_events",
        "user": "select_combobox",
        "relationship": "select_multi",
    }
    return helper_mapping.get(field_type, "unknown")
async def infer_with_fallback(
    page: PageLike,
    selector: str,
    strategies: list[str] | None = None,
) -> tuple[str, HelperFunction]:
    """Infer field type using multiple strategies with fallback.

    Args:
        page: Browser page instance
        selector: CSS selector for the field
        strategies: List of strategies to try (default: ["selector", "element", "default"])

    Returns:
        Tuple of (field_type, helper_function)
    """
    if strategies is None:
        strategies = ["selector", "element", "default"]
    field_type = "unknown"
    for strategy in strategies:
        if strategy == "selector":
            field_type = infer_type_from_selector(selector)
            if field_type != "unknown":
                break
        elif strategy == "element":
            field_type = await infer_type_from_element(page, selector)
            if field_type != "unknown":
                break
        elif strategy == "default":
            field_type = "text"
            break
    helper = select_helper_for_type(field_type)
    return (field_type, helper)
async def infer_with_confidence(
    page: PageLike,
    selector: str,
) -> tuple[str, HelperFunction, float]:
    """Infer field type with confidence score.

    Args:
        page: Browser page instance
        selector: CSS selector for the field

    Returns:
        Tuple of (field_type, helper_function, confidence)
    """
    # Try selector-based inference (high confidence if matched)
    selector_type = infer_type_from_selector(selector)
    if selector_type != "unknown":
        helper = select_helper_for_type(selector_type)
        return (selector_type, helper, 0.9)
    # Try element-based inference (medium confidence)
    element_type = await infer_type_from_element(page, selector)
    if element_type != "unknown":
        helper = select_helper_for_type(element_type)
        return (element_type, helper, 0.7)
    # Fallback to text (low confidence)
    helper = select_helper_for_type("text")
    return ("text", helper, 0.3)
def build_field_selector(data_cy: str, child_element: str | None = None) -> str:
"""Build CSS selector from data-cy attribute.
Args:
data_cy: Data-cy attribute value
child_element: Optional child element selector (e.g., "input", "input[role='combobox']")
Returns:
Complete CSS selector
"""
field_selector = f'div[data-cy="{data_cy}"]'
return f"{field_selector} {child_element}" if child_element else field_selector


@@ -0,0 +1,257 @@
"""Automated form population using discovered fields and generated test data.
Provides utilities to:
- Automatically discover form fields
- Generate appropriate test data
- Populate forms dynamically
- Validate populated data
"""
from __future__ import annotations
from typing import Protocol
from guide.app.browser.elements.dropdown import (
select_combobox,
select_multi,
select_single,
)
from guide.app.browser.elements.field_inference import (
infer_with_fallback,
select_helper_for_type,
)
from guide.app.browser.elements.form_discovery import (
FormField,
extract_all_form_fields,
)
from guide.app.browser.elements.mui import fill_with_react_events
from guide.app.browser.types import PageLike
# Type alias for form field values
FormValue = str | int | list[str] | bool
class FormDataGenerator(Protocol):
    """Protocol for form data generators.

    Defines the interface for generating test data for form fields.
    Implementations can use Faker, static configs, or LLM calls.
    """

    def generate_form_data(self, field_types: dict[str, str]) -> dict[str, FormValue]:
        """Generate form data based on field types.

        Args:
            field_types: Dict mapping field names to field types
                (e.g., {"status": "select", "description": "text"})

        Returns:
            Dict of field names to generated values
        """
        ...
async def discover_and_fill_form(
    page: PageLike,
    form_data: dict[str, FormValue],
    container_selector: str = "body",
) -> dict[str, bool]:
    """Discover all form fields and fill them with provided data.

    Args:
        page: Browser page instance
        form_data: Dict mapping field names (or data-cy values) to values
        container_selector: Container to search for fields (default: body)

    Returns:
        Dict mapping field data-cy to success status
    """
    # Discover all fields
    fields = await extract_all_form_fields(page, container_selector)
    results: dict[str, bool] = {}
    for field in fields:
        data_cy = field["data_cy"]
        if not data_cy:
            continue
        # Find value for this field
        # Try exact data-cy match first, then field name from data-cy
        field_name = data_cy.split("-")[-1]  # Extract last part as field name
        # Test key membership rather than truthiness so that falsy values
        # (0, False, "") supplied under the data-cy key are still used
        value = (
            form_data[data_cy]
            if data_cy in form_data
            else form_data.get(field_name)
        )
        if value is None:
            continue
        # Fill the field
        success = await fill_field(page, field, value)
        results[data_cy] = success
    return results
async def fill_field(
    page: PageLike,
    field: FormField,
    value: str | int | list[str],
) -> bool:
    """Fill a single form field with appropriate helper.

    Args:
        page: Browser page instance
        field: FormField metadata
        value: Value to fill

    Returns:
        True if successful, False otherwise
    """
    field_type = field["field_type"]
    selector = field["selector"]
    # Get appropriate helper
    helper_name = select_helper_for_type(field_type)
    try:
        if helper_name == "select_single":
            if isinstance(value, list):
                value = value[0] if value else ""
            result = await select_single(page, selector, str(value))
            return result.get("selected", []) != []
        elif helper_name == "select_combobox":
            if isinstance(value, list):
                value = value[0] if value else ""
            result = await select_combobox(page, selector, str(value))
            return result.get("selected", []) != []
        elif helper_name == "select_multi":
            if not isinstance(value, list):
                value = [str(value)]
            result = await select_multi(page, selector, [str(v) for v in value])
            return result.get("selected", []) != []
        elif helper_name == "fill_with_react_events":
            success = await fill_with_react_events(
                page, f"{selector} input", str(value)
            )
            return bool(success)
        return False
    except Exception:
        return False
async def auto_populate_form(
    page: PageLike,
    generator: FormDataGenerator,
    container_selector: str = "body",
) -> dict[str, FormValue]:
    """Automatically discover and populate form with generated test data.

    Args:
        page: Browser page instance
        generator: FormDataGenerator implementation for generating test values
        container_selector: Container to search for fields

    Returns:
        Dict of generated form data
    """
    # Discover all fields
    fields = await extract_all_form_fields(page, container_selector)
    # Build field schema for data generator
    field_schema: dict[str, str] = {}
    for field in fields:
        if data_cy := field["data_cy"]:
            field_name = data_cy.split("-")[-1]
            field_schema[field_name] = field["field_type"]
    # Generate test data using provided generator
    form_data = generator.generate_form_data(field_schema)
    # Fill the form
    _ = await discover_and_fill_form(page, form_data, container_selector)
    return form_data
async def validate_populated_form(
    page: PageLike,
    expected_data: dict[str, FormValue],
    container_selector: str = "body",
) -> dict[str, bool]:
    """Validate that form was populated correctly.

    Args:
        page: Browser page instance
        expected_data: Expected field values
        container_selector: Container to search for fields

    Returns:
        Dict mapping field names to validation status
    """
    from guide.app.browser.elements.form_discovery import extract_field_value

    # Discover all fields
    fields = await extract_all_form_fields(page, container_selector)
    validation_results: dict[str, bool] = {}
    for field in fields:
        data_cy = field["data_cy"]
        if not data_cy:
            continue
        # Find expected value; test key membership rather than truthiness so
        # that falsy expectations (0, False, "") are still validated
        field_name = data_cy.split("-")[-1]
        expected_value = (
            expected_data[data_cy]
            if data_cy in expected_data
            else expected_data.get(field_name)
        )
        if expected_value is None:
            continue
        # Get actual value
        actual_value = await extract_field_value(page, field["selector"])
        # Validate
        if isinstance(expected_value, list):
            # For multi-select, check if any value matches
            validation_results[field_name] = any(
                str(v) in str(actual_value) for v in expected_value
            )
        else:
            validation_results[field_name] = str(expected_value) in str(actual_value)
    return validation_results
async def smart_fill_field(
page: PageLike,
selector: str,
value: str | int | list[str],
) -> bool:
"""Intelligently fill a field by inferring its type.
Args:
page: Browser page instance
selector: CSS selector for the field
value: Value to fill
Returns:
True if successful, False otherwise
"""
# Infer field type with fallback strategies
field_type, _ = await infer_with_fallback(page, selector)
# Build FormField for fill_field
field = FormField(
label="",
selector=selector,
field_type=field_type,
data_cy=None,
required=False,
disabled=False,
)
return await fill_field(page, field, value)


@@ -0,0 +1,472 @@
"""Form discovery utilities for extracting form metadata from HTML or live browser.
Provides utilities to:
- Extract form fields from raw HTML
- Query live browser DOM for form metadata
- Detect field types, states, and attributes
"""
from __future__ import annotations
from typing import TypedDict
from guide.app.browser.elements._type_guards import (
get_bool_from_dict,
get_str_from_dict,
get_str_or_none_from_dict,
is_dict_str_object,
is_list_of_objects,
)
from guide.app.browser.utils import escape_selector
from guide.app.browser.types import PageLike
# ---------------------------------------------------------------------------
# JavaScript Templates
# ---------------------------------------------------------------------------
_JS_ACCESSIBLE_NAME = """
((selector) => {
const field = document.querySelector(selector);
if (!field) return null;
const input = field.querySelector('input, textarea, [role="combobox"]');
if (!input) return null;
// Priority 1: Direct label association (HTMLInputElement.labels)
if (input.labels && input.labels.length > 0) {
return input.labels[0].innerText.trim();
}
// Priority 2: aria-label attribute
const ariaLabel = input.getAttribute('aria-label');
if (ariaLabel) return ariaLabel.trim();
// Priority 3: aria-labelledby reference
const labelledBy = input.getAttribute('aria-labelledby');
if (labelledBy) {
const labelEl = document.getElementById(labelledBy);
if (labelEl) return labelEl.innerText.trim();
}
// Priority 4: MUI FormControl fallback
const formControl = field.closest('.MuiFormControl-root') || field;
const labelEl = formControl.querySelector('label');
if (labelEl) return labelEl.textContent.trim();
return null;
})
"""
# ---------------------------------------------------------------------------
# Data Structures
# ---------------------------------------------------------------------------
class FormField(TypedDict):
    """Extracted form field metadata."""

    label: str
    selector: str
    field_type: str  # text, textarea, number, select, autocomplete, user, relationship
    data_cy: str | None
    required: bool
    disabled: bool
# ---------------------------------------------------------------------------
# Accessible Name Extraction
# ---------------------------------------------------------------------------
async def extract_accessible_name(page: PageLike, selector: str) -> str | None:
"""Extract computed accessible name for a form field.
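Resolution order (a simplified approximation; the W3C accessible name
computation itself gives aria-labelledby and aria-label precedence over a
native label):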
1. input.labels[0].innerText (native label association)
2. aria-label attribute
3. aria-labelledby -> getElementById
4. MUI FormControl fallback
Args:
page: Browser page instance
selector: CSS selector for the field container
Returns:
Accessible name string or None if not found
"""
escaped = escape_selector(selector)
result = await page.evaluate(f"{_JS_ACCESSIBLE_NAME}('{escaped}')")
return str(result) if isinstance(result, str) else None
# ---------------------------------------------------------------------------
# Data-cy and Field Discovery
# ---------------------------------------------------------------------------
async def extract_all_data_cy_selectors(page: PageLike) -> list[str]:
"""Extract all data-cy selectors from page.
Args:
page: Browser page instance
Returns:
List of data-cy attribute values
"""
result = await page.evaluate(
"""
(() => {
const elements = document.querySelectorAll('[data-cy]');
return Array.from(elements).map(el => el.getAttribute('data-cy'));
})();
"""
)
if is_list_of_objects(result):
data_cy_list: list[str] = [
str(list_item)
for list_item in result
if list_item is not None and isinstance(list_item, (str, int, float, bool))
]
return data_cy_list
return []
async def extract_form_field_metadata(
page: PageLike, selector: str
) -> FormField | None:
"""Extract complete metadata for a form field.
Args:
page: Browser page instance
selector: CSS selector for the field container
Returns:
FormField metadata or None if field not found
"""
escaped = escape_selector(selector)
result = await page.evaluate(
f"""
((selector) => {{
const field = document.querySelector(selector);
if (!field) return null;
const input = field.querySelector('input, textarea, [role="combobox"]');
const autocomplete = field.querySelector('.MuiAutocomplete-root');
const select = field.querySelector('[role="combobox"]');
// Priority-based label resolution (W3C accessible name)
let label = "";
// 1. Direct label association (HTMLInputElement.labels)
if (input && input.labels && input.labels.length > 0) {{
label = input.labels[0].innerText;
}}
// 2. ARIA label
else if (input?.hasAttribute('aria-label')) {{
label = input.getAttribute('aria-label');
}}
// 3. ARIA labelledby
else if (input?.hasAttribute('aria-labelledby')) {{
const labelId = input.getAttribute('aria-labelledby');
const labelEl = document.getElementById(labelId);
if (labelEl) label = labelEl.innerText;
}}
// 4. MUI FormControl fallback
else {{
const formControl = field.closest('.MuiFormControl-root') || field;
const labelEl = formControl.querySelector('label');
if (labelEl) label = labelEl.textContent;
}}
// Determine type (preserve fields for _infer_field_type_from_metadata)
let type = 'unknown';
if (autocomplete) type = 'autocomplete';
else if (select) type = 'select';
else if (input) type = input.getAttribute('type') || 'text';
return {{
data_cy: field.getAttribute('data-cy'),
label: (label || '').trim(),
// PRESERVED: Fields required by _infer_field_type_from_metadata
type: type,
has_autocomplete: !!autocomplete,
has_select: !!select,
input_type: input ? input.getAttribute('type') : null,
// NEW: Role-based fields for enhanced inference
role: input?.getAttribute('role'),
aria_controls: input?.getAttribute('aria-controls'),
aria_owns: input?.getAttribute('aria-owns'),
// Existing fields
required: input ? input.hasAttribute('required') : false,
disabled: field.classList.contains('Mui-disabled'),
}};
}})('{escaped}');
"""
)
if not result or not is_dict_str_object(result):
return None
# Safely extract values with proper type narrowing
label = get_str_from_dict(result, "label")
data_cy = get_str_or_none_from_dict(result, "data_cy")
required = get_bool_from_dict(result, "required")
disabled = get_bool_from_dict(result, "disabled")
metadata_for_inference: dict[str, str | bool] = {
dict_key: dict_val
for dict_key, dict_val in result.items()
if isinstance(dict_val, (str, bool))
}
field_type = _infer_field_type_from_metadata(metadata_for_inference)
return FormField(
label=label,
selector=selector,
field_type=field_type,
data_cy=data_cy,
required=required,
disabled=disabled,
)
async def extract_all_form_fields(
page: PageLike,
container_selector: str = "body",
*,
limit: int | None = None,
include_metadata: bool = True,
) -> list[FormField]:
"""Extract visible form fields from page.
Filters out hidden elements (offsetParent === null) to avoid matching
duplicate/hidden fields that React may render for mobile/desktop variants.
Args:
page: Browser page instance
container_selector: Container to search within (default: body)
limit: Maximum number of fields to return (None = all). Use to reduce
payload size for large boards.
include_metadata: If False, only extract data_cy and label (lighter payload).
Returns:
List of FormField metadata for visible fields only
"""
# Build JS limit clause: .slice(0, N) or empty string
limit_clause = f".slice(0, {limit})" if limit is not None else ""
# Full metadata extraction (original behavior)
if include_metadata:
result = await page.evaluate(
f"""
((containerSelector) => {{
const container = document.querySelector(containerSelector);
if (!container) return [];
const fields = container.querySelectorAll('[data-cy^="board-item-"]');
return Array.from(fields)
.filter(field => field.offsetParent !== null)
{limit_clause}
.map(field => {{
const label = field.querySelector('label');
const input = field.querySelector('input');
const autocomplete = field.querySelector('.MuiAutocomplete-root');
const select = field.querySelector('[role="combobox"]');
let type = 'unknown';
if (autocomplete) type = 'autocomplete';
else if (select) type = 'select';
else if (input) type = input.getAttribute('type') || 'text';
return {{
data_cy: field.getAttribute('data-cy'),
label: label ? label.textContent.trim() : '',
type: type,
required: input ? input.hasAttribute('required') : false,
disabled: field.classList.contains('Mui-disabled')
}};
}});
}})('{escape_selector(container_selector)}');
"""
)
else:
# Lightweight extraction - only essential fields (smaller payload)
result = await page.evaluate(
f"""
((containerSelector) => {{
const container = document.querySelector(containerSelector);
if (!container) return [];
const fields = container.querySelectorAll('[data-cy^="board-item-"]');
return Array.from(fields)
.filter(field => field.offsetParent !== null)
{limit_clause}
.map(field => {{
const label = field.querySelector('label');
return {{
data_cy: field.getAttribute('data-cy'),
label: label ? label.textContent.trim() : ''
}};
}});
}})('{escape_selector(container_selector)}');
"""
)
# Convert to FormField TypedDicts
form_fields: list[FormField] = []
if is_list_of_objects(result):
for list_item in result:
if not is_dict_str_object(list_item):
continue
data_cy = get_str_or_none_from_dict(list_item, "data_cy")
selector = f'div[data-cy="{data_cy}"]' if data_cy else ""
metadata_for_inference: dict[str, str | bool] = {
dict_key: dict_val
for dict_key, dict_val in list_item.items()
if isinstance(dict_val, (str, bool))
}
field_type = _infer_field_type_from_metadata(metadata_for_inference)
label = get_str_from_dict(list_item, "label")
required = get_bool_from_dict(list_item, "required")
disabled = get_bool_from_dict(list_item, "disabled")
form_fields.append(
FormField(
label=label,
selector=selector,
field_type=field_type,
data_cy=data_cy,
required=required,
disabled=disabled,
)
)
return form_fields
async def extract_field_value(page: PageLike, selector: str) -> list[str] | str | None:
"""Extract current value of form field, handling multi-select chips.
Args:
page: Browser page instance
selector: CSS selector for the field
Returns:
- list[str] for multi-select fields (MUI Autocomplete chips)
- str for single-value fields
- None if field not found or empty
"""
escaped = escape_selector(selector)
result = await page.evaluate(
f"""
((selector) => {{
const field = document.querySelector(selector);
if (!field) return null;
// Check for MUI Autocomplete chips (multi-select)
const chips = field.querySelectorAll('.MuiChip-label');
if (chips.length > 0) {{
return Array.from(chips).map(c => c.textContent.trim());
}}
// Single input value
const input = field.querySelector('input');
if (input && input.value) return input.value;
// Combobox text (for selects)
const select = field.querySelector('[role="combobox"]');
if (select) {{
const text = select.textContent.trim();
// Exclude placeholder text
if (text && text !== 'Select...' && text !== 'Choose...') {{
return text;
}}
}}
return null;
}})('{escaped}');
"""
)
# Handle multi-select (list of chip labels)
if is_list_of_objects(result):
return [str(item) for item in result if isinstance(item, str)]
return str(result) if isinstance(result, str) else None
async def extract_field_state(page: PageLike, selector: str) -> dict[str, bool]:
"""Extract field state (disabled, required, error, focused).
Args:
page: Browser page instance
selector: CSS selector for the field
Returns:
Dict with state flags
"""
result = await page.evaluate(
f"""
((selector) => {{
const field = document.querySelector(selector);
if (!field) return null;
const input = field.querySelector('input');
return {{
disabled: field.classList.contains('Mui-disabled'),
required: input ? input.hasAttribute('required') : false,
has_error: field.classList.contains('Mui-error'),
is_focused: field.classList.contains('Mui-focused')
}};
}})('{escape_selector(selector)}');
"""
)
if is_dict_str_object(result):
state_dict: dict[str, bool] = {
dict_key: dict_val
for dict_key, dict_val in result.items()
if isinstance(dict_val, bool)
}
return state_dict
return {}
# Helper functions
def _infer_field_type_from_metadata(metadata: dict[str, str | bool]) -> str:
"""Infer field type from extracted metadata."""
data_cy = metadata.get("data_cy", "")
# Use data-cy prefix to infer type
if "field-menu" in str(data_cy):
return "select"
if "field-user" in str(data_cy):
return "user"
if "field-text" in str(data_cy):
return "text"
if "number" in str(data_cy):
return "number"
if any(
pattern in str(data_cy) for pattern in ["-supplier-", "contracts-", "events-"]
) and metadata.get("has_autocomplete"):
return "autocomplete" if "-supplier-" in str(data_cy) else "relationship"
# Fall back to metadata analysis
if metadata.get("has_autocomplete"):
return "autocomplete"
if metadata.get("has_select"):
return "select"
if metadata.get("input_type") == "number":
return "number"
if metadata.get("input_type") == "text":
return "text"
if metadata.get("type") == "autocomplete":
return "autocomplete"
return "select" if metadata.get("type") == "select" else "unknown"


@@ -0,0 +1,60 @@
"""Input primitive helpers (extension-friendly).
This module provides atomic input-related utilities that complement the base MUI helpers.
For orchestration (multi-field form filling), see form_automation.py.
Imports directly from sibling modules to avoid circular imports.
"""
from guide.app.browser.elements.dropdown import select_single
from guide.app.browser.elements.mui import fill_with_react_events
from guide.app.browser.types import PageLike
async def fill_text(page: PageLike, selector: str, value: str) -> None:
"""Fill a text input field.
Args:
page: PageLike instance
selector: CSS selector for the input
value: Value to fill
"""
result = await fill_with_react_events(page, selector, value)
_ = result # Result captured but not used
async def fill_textarea(page: PageLike, selector: str, value: str) -> None:
"""Fill a textarea field.
Args:
page: PageLike instance
selector: CSS selector for the textarea
value: Value to fill
"""
result = await fill_with_react_events(
page, selector, value, element_type="textarea"
)
_ = result
async def fill_date(page: PageLike, selector: str, value: str) -> None:
"""Fill a date input field.
Args:
page: PageLike instance
selector: CSS selector for the date input
value: Date value (format depends on input type)
"""
await page.fill(selector, value)
async def fill_autocomplete(page: PageLike, field_selector: str, value: str) -> None:
"""Fill an autocomplete field by selecting from dropdown options.
Args:
page: PageLike instance
field_selector: CSS selector for the autocomplete wrapper
value: Value to select
"""
result = await select_single(page, field_selector, value)
_ = result
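
Review note: the fill helpers above pass `value` down to code that interpolates it into a JavaScript snippet. One option for making that interpolation robust against quotes and newlines is to emit the literal with `json.dumps`, since a JSON string is also a valid JavaScript string literal. The sketch below is an assumption about how this could look, not the repository's implementation; `build_fill_script` is a hypothetical helper and it sets `el.value` directly rather than using the React native setter.

```python
import json


def build_fill_script(selector: str, value: str) -> str:
    """Build a JS snippet with the selector and value embedded as JSON literals."""
    # json.dumps adds the surrounding double quotes and escapes quotes,
    # backslashes, and control characters such as newlines.
    selector_js = json.dumps(selector)
    value_js = json.dumps(value)
    return f"""
(() => {{
    const el = document.querySelector({selector_js});
    if (!el) return false;
    el.value = {value_js};
    el.dispatchEvent(new Event('input', {{ bubbles: true }}));
    return true;
}})();
"""


script = build_fill_script("input[name='q']", "it's\na \"test\"")
```

With this approach no hand-written `replace` chain is needed, and multi-line textarea values stay on a single source line inside the generated script.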


@@ -0,0 +1,150 @@
"""Layout-oriented element helpers (accordions, panels, etc.)."""
import logging
from typing import TypedDict, cast
from playwright.async_api import Error as PlaywrightError
from playwright.async_api import TimeoutError as PlaywrightTimeoutError
from guide.app import errors
from guide.app.browser.types import PageLike, PageLocator
from guide.app.core.config import Timeouts
_logger = logging.getLogger(__name__)
class AccordionCollapseResult(TypedDict):
"""Result from collapsing accordions."""
collapsed_count: int
total_found: int
failed_indices: list[int]
class Accordion:
"""Collapse/expand helpers for accordion buttons.
Optimized to minimize DOM queries:
- 1 query: count buttons and batch-find expanded indices via a single JS evaluation
- X queries: click only expanded buttons (X <= total count)
"""
page: PageLike
timeouts: Timeouts
expanded_icon_selector: str
def __init__(
self,
page: PageLike,
*,
timeouts: Timeouts | None = None,
expanded_icon_selector: str = 'svg[data-testid="KeyboardArrowUpOutlinedIcon"]',
) -> None:
self.page = page
self.timeouts = timeouts or Timeouts()
self.expanded_icon_selector = expanded_icon_selector
async def _get_expanded_indices(
self,
buttons_selector: str,
) -> tuple[int, list[int]]:
"""Find all expanded accordion indices in a single DOM query.
Returns:
Tuple of (total_count, expanded_indices).
"""
icon_selector = self.expanded_icon_selector
# Single JS evaluation replaces N individual icon.count() calls
result = await self.page.evaluate(
"""
([buttonsSelector, iconSelector]) => {
const buttons = document.querySelectorAll(buttonsSelector);
const expanded = [];
buttons.forEach((btn, i) => {
if (btn.querySelector(iconSelector)) {
expanded.push(i);
}
});
return { total: buttons.length, expanded };
}
""",
[buttons_selector, icon_selector],
)
# Type-narrow the result from JS evaluation
if isinstance(result, dict):
result_dict = cast(dict[str, object], result)
total_raw = result_dict.get("total", 0)
total = int(total_raw) if isinstance(total_raw, (int, float)) else 0
raw_expanded = result_dict.get("expanded", [])
if isinstance(raw_expanded, list):
expanded: list[int] = []
raw_list = cast(list[object], raw_expanded)
for item in raw_list:
if isinstance(item, (int, float)):
expanded.append(int(item))
return (total, expanded)
return (0, [])
async def collapse_all(
self,
buttons_selector: str,
timeout_ms: int | None = None,
) -> AccordionCollapseResult:
"""Collapse all expanded accordion buttons that match selector.
Optimized: Uses a single JS evaluation to find expanded buttons,
then clicks only those (reduces N+2 queries to 1+X where X = expanded count).
"""
# Single query to get count and expanded indices
total_count, expanded_indices = await self._get_expanded_indices(
buttons_selector
)
if total_count == 0:
return {"collapsed_count": 0, "total_found": 0, "failed_indices": []}
if not expanded_indices:
_logger.debug("No expanded accordions found among %d buttons", total_count)
return {
"collapsed_count": 0,
"total_found": total_count,
"failed_indices": [],
}
buttons = self.page.locator(buttons_selector)
collapsed_count = 0
failed_indices: list[int] = []
max_wait = (
timeout_ms if timeout_ms is not None else self.timeouts.element_default
)
# Click only the expanded buttons (no extra DOM queries per button)
for index in expanded_indices:
button: PageLocator = buttons.nth(index)
try:
await button.click(timeout=max_wait)
collapsed_count += 1
except (PlaywrightTimeoutError, PlaywrightError) as exc:
_logger.debug("Failed to collapse accordion %d: %s", index, exc)
failed_indices.append(index)
if expanded_indices and collapsed_count == 0:
raise errors.ActionExecutionError(
f"Failed to collapse any accordions (found {len(expanded_indices)} expanded, all failed)",
details={
"selector": buttons_selector,
"found_count": total_count,
"expanded_count": len(expanded_indices),
"failed_indices": ",".join(str(i) for i in failed_indices),
},
)
return {
"collapsed_count": collapsed_count,
"total_found": total_count,
"failed_indices": failed_indices,
}
__all__ = ["Accordion", "AccordionCollapseResult"]
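
Review note: the type-narrowing step in `_get_expanded_indices` can be exercised without a browser if it is pulled into a pure function. A minimal sketch follows, assuming a hypothetical `narrow_expanded` helper. Unlike the method above, it deliberately rejects booleans, because `bool` is a subclass of `int` in Python and would otherwise pass the `isinstance` check.

```python
def _is_number(value: object) -> bool:
    """True for int/float values, excluding bool (a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def narrow_expanded(result: object) -> tuple[int, list[int]]:
    """Turn an untyped JS evaluation result into (total, expanded_indices)."""
    if not isinstance(result, dict):
        return (0, [])
    total_raw = result.get("total", 0)
    total = int(total_raw) if _is_number(total_raw) else 0
    raw_expanded = result.get("expanded", [])
    if not isinstance(raw_expanded, list):
        # Same fallback as the method above: a malformed payload counts as empty.
        return (0, [])
    expanded = [int(item) for item in raw_expanded if _is_number(item)]
    return (total, expanded)
```

A unit test can then feed it payloads such as `{"total": 3, "expanded": [0, 2.0, "x"]}` and check that non-numeric entries are dropped.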


@@ -0,0 +1,533 @@
"""Base utilities for Material UI component automation.
This module provides foundational helpers shared across all MUI element modules:
- Selector escaping for JavaScript injection safety
- Mouse event sequences for MUI interaction compatibility
- Keyboard event dispatching
- React native property setter for controlled component state updates
- Listbox/option waiting utilities
Individual component modules (dropdown, text, checkbox, etc.) import from here.
"""
import asyncio
import contextlib
import logging
from typing import TypedDict
from guide.app.browser.types import PageLike
from guide.app.browser.utils import escape_js_string, escape_selector
from guide.app.core.config import Timeouts
_logger = logging.getLogger(__name__)
# Module-level default timeouts for backward compatibility
_DEFAULT_TIMEOUTS = Timeouts()
# ---------------------------------------------------------------------------
# Shared Types
# ---------------------------------------------------------------------------
class DropdownResult(TypedDict):
"""Result of a dropdown selection operation."""
selected: list[str]
not_found: list[str]
available: list[str]
# ---------------------------------------------------------------------------
# Keyboard Events
# ---------------------------------------------------------------------------
async def send_key(page: PageLike, key: str) -> None:
"""Send a keyboard event to the currently focused element.
Dispatches keydown and keyup events with proper keyCode for MUI components.
Args:
page: PageLike instance (Playwright Page or ExtensionPage)
key: Key name (ArrowDown, ArrowUp, Enter, Escape, Tab)
"""
keycode_map = {"ArrowDown": 40, "ArrowUp": 38, "Enter": 13, "Escape": 27, "Tab": 9}
keycode = keycode_map.get(key, 0)
_ = await page.evaluate(
f"""
(() => {{
const el = document.activeElement;
if (!el) return false;
const eventProps = {{
key: '{escape_js_string(key)}',
code: '{escape_js_string(key)}',
keyCode: {keycode},
which: {keycode},
bubbles: true,
cancelable: true,
composed: true
}};
// Dispatch both keydown and keyup for complete key press
el.dispatchEvent(new KeyboardEvent('keydown', eventProps));
el.dispatchEvent(new KeyboardEvent('keyup', eventProps));
return true;
}})();
"""
)
# Backward compatibility alias
_send_key = send_key
# ---------------------------------------------------------------------------
# Mouse Events
# ---------------------------------------------------------------------------
async def click_with_mouse_events(
page: PageLike,
selector: str,
*,
focus_first: bool = True,
) -> bool:
"""Click an element using full mouse event sequence.
MUI components often require mousedown/mouseup/click sequence rather than
just .click() to properly register user interactions and open dropdowns.
Args:
page: PageLike instance
selector: CSS selector for the element to click
focus_first: Whether to focus the element before clicking
Returns:
True if element was found and clicked, False otherwise
"""
selector_escaped = escape_selector(selector)
result = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{selector_escaped}');
if (!el) return false;
{"el.focus();" if focus_first else ""}
const rect = el.getBoundingClientRect();
const centerX = rect.left + rect.width / 2;
const centerY = rect.top + rect.height / 2;
el.dispatchEvent(new MouseEvent('mousedown', {{
bubbles: true, cancelable: true, view: window,
clientX: centerX, clientY: centerY, button: 0
}}));
el.dispatchEvent(new MouseEvent('mouseup', {{
bubbles: true, cancelable: true, view: window,
clientX: centerX, clientY: centerY, button: 0
}}));
el.dispatchEvent(new MouseEvent('click', {{
bubbles: true, cancelable: true, view: window,
clientX: centerX, clientY: centerY, button: 0
}}));
return true;
}})();
"""
)
return bool(result)
# ---------------------------------------------------------------------------
# React State Helpers
# ---------------------------------------------------------------------------
async def fill_with_react_events(
page: PageLike,
selector: str,
value: str,
*,
element_type: str = "input",
) -> bool:
"""Fill an input using React-compatible native property setter.
React controlled components require using the native value setter to
properly trigger state updates. This function:
1. Uses Object.getOwnPropertyDescriptor to get the native setter
2. Calls setter with new value
3. Dispatches input/change events to trigger React handlers
4. Blurs to trigger validation
Args:
page: PageLike instance
selector: CSS selector for the input element
value: Value to set
element_type: Element type for prototype lookup (input, textarea)
Returns:
True if element was found and filled, False otherwise
"""
selector_escaped = escape_selector(selector)
# The value is a plain JS string, not a CSS selector, so use the string escaper
value_escaped = escape_js_string(value)
prototype = (
"HTMLTextAreaElement" if element_type == "textarea" else "HTMLInputElement"
)
result = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{selector_escaped}');
if (!el) return false;
el.focus();
const nativeSetter = Object.getOwnPropertyDescriptor(
window.{prototype}.prototype,
'value'
)?.set;
if (nativeSetter) {{
nativeSetter.call(el, '{value_escaped}');
}} else {{
el.value = '{value_escaped}';
}}
el.dispatchEvent(new Event('input', {{ bubbles: true }}));
el.dispatchEvent(new Event('change', {{ bubbles: true }}));
el.blur();
return true;
}})();
"""
)
return bool(result)
async def clear_and_fill_with_react_events(
page: PageLike,
selector: str,
value: str,
) -> bool:
"""Clear input then fill using React-compatible setter.
Useful for masked inputs (e.g., '0%', '$0.00') that need clearing first.
Args:
page: PageLike instance
selector: CSS selector for the input element
value: Value to set after clearing
Returns:
True if successful, False otherwise
"""
selector_escaped = escape_selector(selector)
# The value is a plain JS string, not a CSS selector, so use the string escaper
value_escaped = escape_js_string(value)
result = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{selector_escaped}');
if (!el) return false;
el.focus();
el.select();
const nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLInputElement.prototype,
'value'
)?.set;
// Clear first
if (nativeSetter) {{
nativeSetter.call(el, '');
}} else {{
el.value = '';
}}
el.dispatchEvent(new Event('input', {{ bubbles: true }}));
// Set new value
if (nativeSetter) {{
nativeSetter.call(el, '{value_escaped}');
}} else {{
el.value = '{value_escaped}';
}}
el.dispatchEvent(new Event('input', {{ bubbles: true }}));
el.dispatchEvent(new Event('change', {{ bubbles: true }}));
el.blur();
return true;
}})();
"""
)
return bool(result)
# ---------------------------------------------------------------------------
# Listbox/Option Utilities (for dropdowns)
# ---------------------------------------------------------------------------
async def wait_for_role_option(
page: PageLike,
timeout_ms: int | None = None,
*,
listbox_id: str | None = None,
timeouts: Timeouts | None = None,
) -> bool:
"""Wait for role=option elements to appear in the DOM.
Args:
page: PageLike instance
timeout_ms: Maximum wait time in milliseconds (default from Timeouts.listbox_wait)
listbox_id: Optional listbox ID to scope the search (prevents cross-dropdown leakage)
timeouts: Optional Timeouts instance for centralized configuration
Returns:
True if options appeared, False if timeout
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
effective_timeout_ms = (
timeout_ms if timeout_ms is not None else effective_timeouts.listbox_wait
)
start = asyncio.get_running_loop().time()
# Build scoped query based on listbox_id
if listbox_id:
escaped_id = escape_js_string(listbox_id)
query = f"""
(() => {{
const listbox = document.getElementById('{escaped_id}');
if (!listbox) return false;
return listbox.querySelector('[role="option"]:not([data-dropdown-stale])') !== null;
}})();
"""
else:
# Global fallback: find any visible, non-stale option in a non-closed listbox
query = """
(() => {
const listbox = document.querySelector('[role="listbox"]:not([data-dropdown-closed])');
if (!listbox) return false;
return listbox.querySelector('[role="option"]:not([data-dropdown-stale])') !== null;
})();
"""
while (asyncio.get_running_loop().time() - start) * 1000 < effective_timeout_ms:
exists = await page.evaluate(query)
if exists:
return True
await asyncio.sleep(0.05)
return False
# Backward compatibility alias
_wait_for_role_option = wait_for_role_option
async def ensure_listbox(
page: PageLike,
timeout_ms: int | None = None,
*,
listbox_id: str | None = None,
timeouts: Timeouts | None = None,
) -> bool:
"""Ensure a listbox with options is visible.
Tries Playwright's wait_for_selector first, falls back to polling.
When listbox_id is provided, scopes the search to that specific listbox
to prevent cross-dropdown leakage in multi-dropdown scenarios.
Args:
page: PageLike instance
timeout_ms: Maximum wait time in milliseconds (default from Timeouts.listbox_wait)
listbox_id: Optional listbox ID from aria-controls to scope the search
timeouts: Optional Timeouts instance for centralized configuration
Returns:
True if listbox is visible, False otherwise
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
effective_timeout_ms = (
timeout_ms if timeout_ms is not None else effective_timeouts.listbox_wait
)
# When scoped to a specific listbox, skip generic wait_for_selector
# and use our scoped polling directly
if listbox_id:
return await wait_for_role_option(
page,
effective_timeout_ms,
listbox_id=listbox_id,
timeouts=effective_timeouts,
)
# Unscoped fallback: try Playwright wait then polling
with contextlib.suppress(Exception):
if hasattr(page, "wait_for_selector"):
# Use scoped selector to avoid stale options
selector = '[role="listbox"]:not([data-dropdown-closed]) [role="option"]:not([data-dropdown-stale])'
_ = await page.wait_for_selector(selector, timeout=effective_timeout_ms)
return True
return await wait_for_role_option(
page, effective_timeout_ms, timeouts=effective_timeouts
)
# Backward compatibility alias
_ensure_listbox = ensure_listbox
async def check_listbox_visible(page: PageLike) -> bool:
"""Check if a listbox is currently visible on the page.
Checks for: offsetParent, display, visibility, dimensions, and not manually closed.
Args:
page: PageLike instance
Returns:
True if listbox is visible, False otherwise
"""
result = await page.evaluate(
"""
(() => {
// Find listbox that's not manually closed
const listbox = document.querySelector('[role="listbox"]:not([data-dropdown-closed])');
if (!listbox) return false;
const rect = listbox.getBoundingClientRect();
return listbox.offsetParent !== null &&
listbox.style.display !== 'none' &&
listbox.style.visibility !== 'hidden' &&
rect.width > 0 && rect.height > 0;
})();
"""
)
return bool(result)
# Backward compatibility alias
_check_listbox_visible = check_listbox_visible
# ---------------------------------------------------------------------------
# Field State Utilities
# ---------------------------------------------------------------------------
async def check_field_disabled(page: PageLike, wrapper_selector: str) -> bool:
"""Check if a MUI field is disabled.
Checks wrapper, input, and combobox for disabled state.
Args:
page: PageLike instance
wrapper_selector: CSS selector for the field wrapper
Returns:
True if field is disabled or the wrapper is not found, False otherwise
"""
wrapper_escaped = escape_selector(wrapper_selector)
result = await page.evaluate(
f"""
(() => {{
const root = document.querySelector('{wrapper_escaped}');
if (!root) return true;
if (root.hasAttribute('disabled') || root.classList.contains('Mui-disabled')) return true;
const input = root.querySelector('input');
if (input && (input.disabled || input.hasAttribute('aria-disabled'))) return true;
const combobox = root.querySelector('[role="combobox"]');
if (combobox && combobox.hasAttribute('aria-disabled')) return true;
return false;
}})();
"""
)
return bool(result)
# Backward compatibility alias
_check_field_disabled = check_field_disabled
async def enable_field(page: PageLike, wrapper_selector: str) -> None:
"""Enable a disabled MUI field by removing disabled attributes.
Handles both direct combobox elements and wrapper containers.
Args:
page: PageLike instance
wrapper_selector: CSS selector for the field wrapper
"""
wrapper_escaped = escape_selector(wrapper_selector)
_ = await page.evaluate(
f"""
(() => {{
const el = document.querySelector('{wrapper_escaped}');
if (!el) return false;
// Check if element itself is a combobox
if (el.getAttribute('role') === 'combobox') {{
el.removeAttribute('aria-disabled');
el.removeAttribute('disabled');
el.classList.remove('Mui-disabled');
return true;
}}
// Field selector is a container - find and enable children
const input = el.querySelector('input');
if (input && input.disabled) {{
input.disabled = false;
input.removeAttribute('aria-disabled');
}}
const select = el.querySelector('[role="combobox"]');
if (select) {{
select.removeAttribute('aria-disabled');
select.classList.remove('Mui-disabled');
}}
const popupBtn = el.querySelector('.MuiAutocomplete-popupIndicator');
if (popupBtn) {{
popupBtn.disabled = false;
popupBtn.classList.remove('Mui-disabled');
popupBtn.removeAttribute('aria-disabled');
}}
return true;
}})();
"""
)
await page.wait_for_timeout(50)
# Backward compatibility aliases
_enable_combobox = enable_field
__all__ = [
# Types
"DropdownResult",
# Selector utilities (import directly from browser.utils)
"escape_selector",
"escape_js_string",
# Keyboard
"send_key",
"_send_key",
# Mouse
"click_with_mouse_events",
# React state
"fill_with_react_events",
"clear_and_fill_with_react_events",
# Listbox utilities
"wait_for_role_option",
"_wait_for_role_option",
"ensure_listbox",
"_ensure_listbox",
"check_listbox_visible",
"_check_listbox_visible",
# Field state
"check_field_disabled",
"_check_field_disabled",
"enable_field",
"_enable_combobox",
]
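
Review note: `wait_for_role_option` and the extension's `wait_for_selector` both hand-roll the same deadline loop. A minimal sketch of a shared helper is shown below; `poll_until` is a hypothetical name, and the 50 ms default interval is taken from the `asyncio.sleep(0.05)` call above.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def poll_until(
    check: Callable[[], Awaitable[bool]],
    *,
    timeout_ms: int,
    interval_s: float = 0.05,
) -> bool:
    """Await check() repeatedly until it returns True or the deadline passes."""
    loop = asyncio.get_running_loop()
    # Compute the deadline once, using the loop's monotonic clock.
    deadline = loop.time() + timeout_ms / 1000
    while loop.time() < deadline:
        if await check():
            return True
        await asyncio.sleep(interval_s)
    return False
```

A caller would wrap the page query in a small coroutine, for example one that awaits `page.evaluate(query)` and returns `bool(...)` of the result, and pass it as `check`.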


@@ -0,0 +1,956 @@
"""WebSocket server for Terminator Bridge browser extension.
Provides a Playwright-like API that translates Python calls into JavaScript
executed via the browser extension, avoiding CDP page refresh issues.
"""
import asyncio
import contextlib
import json
import uuid
from typing import Protocol, cast
from loguru import logger
from websockets.asyncio.server import Server, ServerConnection, serve
from guide.app.browser.types import PageLocator
from guide.app.core.config import DEFAULT_EXTENSION_PORT, Timeouts
from guide.app.errors import ActionExecutionError, BrowserConnectionError
# JSON-serializable values that can be returned from JavaScript evaluation
type JSONValue = (
str | int | float | bool | None | dict[str, JSONValue] | list[JSONValue]
)
class PageWithExtensions(Protocol):
"""Protocol for Page objects that support extension-specific methods.
This protocol defines the extended interface available when using
the extension client, including methods not present in standard Playwright.
"""
async def click(self, selector: str) -> None:
"""Click an element matching the selector."""
...
async def fill(self, selector: str, value: str) -> None:
"""Fill an input element with a value."""
...
async def type(self, selector: str, text: str) -> None:
"""Type text into an element."""
...
async def evaluate(self, expression: str) -> JSONValue:
"""Evaluate JavaScript expression in page context."""
...
async def wait_for_timeout(self, timeout: int) -> None:
"""Wait for a specified timeout in milliseconds."""
...
async def click_element_with_text(
self, selector: str, text: str, timeout: int = 5000
) -> None:
"""Click an element matching selector that contains specific text.
Waits for the element to appear before clicking.
"""
...
class ExtensionPage:
"""Playwright-like Page interface using browser extension for execution.
Provides familiar Playwright API methods that automatically generate
and execute JavaScript via the Terminator Bridge extension.
Usage:
async with ExtensionClient() as client:
page = await client.get_page()
await page.click("button.submit")
await page.fill("input[name='email']", "user@example.com")
title = await page.evaluate("document.title")
"""
def __init__(
self, client: "ExtensionClient", *, timeouts: Timeouts | None = None
) -> None:
self._client: ExtensionClient = client
self._timeouts: Timeouts = timeouts or Timeouts()
self._pending: dict[str, asyncio.Future[JSONValue]] = {}
self._current_url: str = ""
@property
def url(self) -> str:
"""Get current page URL (cached from last navigation/check)."""
return self._current_url
def _escape_selector(self, selector: str) -> str:
"""Escape a CSS selector for use in JavaScript single-quoted strings."""
from guide.app.browser.utils import escape_selector
return escape_selector(selector)
async def _send_command(
self,
action: str,
payload: dict[str, JSONValue],
*,
timeout: float | None = None,
) -> JSONValue:
"""Send structured command to extension via ActionDispatcher.
This method sends commands that can be handled by the content script's
ActionDispatcher, avoiding inline JavaScript generation.
Args:
action: Action name (FILL, CLICK, WAIT_FOR_SELECTOR, etc.)
payload: Action-specific parameters
timeout: Maximum time to wait for response (seconds)
Returns:
Result from action execution
Raises:
BrowserConnectionError: If command times out
Exception: If action fails
"""
request_id = str(uuid.uuid4())
future: asyncio.Future[JSONValue] = asyncio.Future()
self._pending[request_id] = future
message = {
"id": request_id,
"action": action,
"payload": payload,
}
logger.debug("[Extension] Sending {}: {}", action, payload)
await self._client.send_message(cast(dict[str, JSONValue], message))
effective_timeout = (
timeout if timeout is not None else self._timeouts.extension_command_s
)
try:
return await asyncio.wait_for(future, timeout=effective_timeout)
except asyncio.TimeoutError as e:
_ = self._pending.pop(request_id, None)
raise BrowserConnectionError(
f"Extension command timeout after {effective_timeout}s: {action}",
details={
"action": action,
"payload": payload,
"timeout_seconds": effective_timeout,
},
) from e
async def click(self, selector: str) -> None:
"""Click an element matching the selector.
Uses debugger API with full mouse event sequence to avoid content script
dependency issues.
Args:
selector: CSS selector for the element to click
Raises:
Exception: If element not found or click fails
"""
selector_escaped = self._escape_selector(selector)
js_code = f"""
(() => {{
try {{
const el = document.querySelector('{selector_escaped}');
if (!el) return {{success: false, error: 'Element not found'}};
// Focus first (some elements need focus before click)
if (typeof el.focus === 'function') {{
el.focus();
}}
// Get position for mouse events
const rect = el.getBoundingClientRect();
const centerX = rect.left + rect.width / 2;
const centerY = rect.top + rect.height / 2;
const eventOptions = {{
bubbles: true,
cancelable: true,
view: window,
clientX: centerX,
clientY: centerY,
button: 0,
}};
el.dispatchEvent(new MouseEvent('mousedown', eventOptions));
el.dispatchEvent(new MouseEvent('mouseup', eventOptions));
el.dispatchEvent(new MouseEvent('click', eventOptions));
return {{success: true}};
}} catch (e) {{
return {{success: false, error: e.message}};
}}
}})();
"""
result = await self.eval_js(js_code)
if isinstance(result, dict) and not result.get("success"):
raise ActionExecutionError(
f"Click failed: {result.get('error', 'unknown')}",
details={"selector": selector, "result": result},
)
async def fill(self, selector: str, value: str) -> None:
"""Fill an input element with a value (React-compatible).
Uses debugger API with React native property setter to avoid content script
dependency issues. This ensures values persist when React re-renders.
Args:
selector: CSS selector for the input element
value: Value to fill
Raises:
Exception: If element not found or fill fails
"""
selector_escaped = self._escape_selector(selector)
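# Escape backslashes, quotes, and line breaks so multi-line values
# do not terminate the single-quoted JS string literal early
value_escaped = (
value.replace("\\", "\\\\")
.replace("'", "\\'")
.replace('"', '\\"')
.replace("\n", "\\n")
.replace("\r", "\\r")
)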
js_code = f"""
(() => {{
const el = document.querySelector('{selector_escaped}');
if (!el) throw new Error('Element not found: {selector_escaped}');
el.focus();
const tagName = el.tagName.toLowerCase();
let nativeSetter;
if (tagName === 'textarea') {{
nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLTextAreaElement.prototype, 'value'
)?.set;
}} else {{
nativeSetter = Object.getOwnPropertyDescriptor(
window.HTMLInputElement.prototype, 'value'
)?.set;
}}
if (nativeSetter) {{
nativeSetter.call(el, '{value_escaped}');
}} else {{
el.value = '{value_escaped}';
}}
el.dispatchEvent(new Event('input', {{ bubbles: true }}));
el.dispatchEvent(new Event('change', {{ bubbles: true }}));
el.blur();
return true;
}})();
"""
_ = await self.eval_js(js_code)
async def type(self, selector: str, text: str) -> None:
"""Type text into an element (React-compatible).
Uses debugger API with React native property setter to avoid content script
dependency issues. This ensures typed values persist when React re-renders.
Args:
selector: CSS selector for the element
text: Text to type
Raises:
Exception: If element not found or typing fails
"""
# type() and fill() use the same implementation - both set input value
await self.fill(selector, text)
async def evaluate(self, expression: str, arg: object = None) -> JSONValue:
"""Evaluate JavaScript expression in page context.
Args:
expression: JavaScript expression to evaluate
arg: Optional argument to pass to the expression
Returns:
Result of the expression (must be JSON-serializable)
"""
if arg is not None:
# Wrap expression as a function call with argument
arg_json = json.dumps(arg)
code = f"({expression})({arg_json})"
return await self.eval_js(code)
return await self.eval_js(expression)
async def click_element_with_text(
self, selector: str, text: str, timeout: int | None = None
) -> None:
"""Click an element matching selector that contains specific text.
Uses content script's CLICK_TEXT action with element search.
Args:
selector: CSS selector for elements to search
text: Text content to match
timeout: Maximum time to wait in milliseconds (default from Timeouts.element_default)
Raises:
Exception: If no matching element found within timeout
"""
effective_timeout_ms = (
timeout if timeout is not None else self._timeouts.element_default
)
_ = await self._send_command(
"CLICK_TEXT",
{"selector": selector, "text": text},
timeout=effective_timeout_ms / 1000, # Convert ms to seconds
)
async def wait_for_timeout(self, timeout: int) -> None:
"""Wait for a specified timeout.
Args:
timeout: Timeout in milliseconds
"""
await asyncio.sleep(timeout / 1000)
async def trusted_click(self, x: float, y: float) -> None:
"""Perform a trusted click at specific coordinates.
Uses Chrome's Input.dispatchMouseEvent to produce events with isTrusted=true.
This is required for MUI ClickAwayListener which ignores untrusted events.
Args:
x: X coordinate (viewport-relative)
y: Y coordinate (viewport-relative)
"""
_ = await self._send_command(
"trusted_click",
{"x": x, "y": y},
timeout=self._timeouts.extension_trusted_click_s,
)
async def goto(
self,
_url: str,
*,
_timeout: float | None = None,
_wait_until: str | None = None,
_referer: str | None = None,
) -> None:
"""No-op navigation for extension mode.
In extension mode, the user manually navigates their browser to the desired
page BEFORE calling the API. Navigation is completely unnecessary and would
only cause unwanted page refreshes that close modals and lose state.
This method exists only for API compatibility with Playwright's Page interface.
Args:
_url: URL to navigate to (ignored)
_timeout: Maximum navigation time in milliseconds (ignored)
_wait_until: When to consider navigation complete (ignored)
_referer: Referer header value (ignored)
"""
def locator(self, selector: str) -> "PageLocator":
"""Create a locator for the given selector.
Args:
selector: CSS selector
Returns:
ExtensionLocator instance
"""
return cast(PageLocator, cast(object, ExtensionLocator(self, selector)))
async def wait_for_selector(
self,
selector: str,
*,
timeout: float | None = None,
state: str | None = None,
strict: bool | None = None,
) -> object | None:
"""Wait for an element matching selector to appear.
Uses debugger API with polling loop for reliable element detection.
Polls every 100ms instead of waiting the full timeout.
Args:
selector: CSS selector
timeout: Maximum time to wait in milliseconds (default from Timeouts.element_default)
state: State to wait for (attached, visible, detached); defaults to attached
strict: Whether to use strict mode (ignored)
Returns:
None, both when the state is reached and when the timeout elapses
(unlike Playwright, no error is raised on timeout)
"""
timeout_ms = int(timeout) if timeout else self._timeouts.element_default
poll_interval_ms = 100 # Fixed 100ms polling interval
wait_state = state or "attached"
_ = strict # Reserved for parameter compatibility
selector_escaped = self._escape_selector(selector)
start_time = asyncio.get_running_loop().time()
while (asyncio.get_running_loop().time() - start_time) * 1000 < timeout_ms:
with contextlib.suppress(Exception):
result = await self.eval_js(f"""
(() => {{
const el = document.querySelector('{selector_escaped}');
if (!el) return {{exists: false}};
const rect = el.getBoundingClientRect();
const style = window.getComputedStyle(el);
const isVisible = rect.width > 0 && rect.height > 0 &&
style.display !== 'none' &&
style.visibility !== 'hidden' &&
style.opacity !== '0';
return {{exists: true, visible: isVisible}};
}})();
""")
if isinstance(result, dict):
exists = bool(result.get("exists", False))
visible = bool(result.get("visible", False))
if wait_state == "detached" and not exists:
return None
if wait_state == "attached" and exists:
return None
if wait_state == "visible" and exists and visible:
return None
# Poll at intervals instead of waiting full timeout
await asyncio.sleep(poll_interval_ms / 1000)
return None
async def wait_for_load_state(
self,
_state: str | None = None,
*,
timeout: float | None = None,
) -> None:
"""Wait for page load state (no-op for extension mode).
Args:
_state: Load state to wait for (ignored)
timeout: Maximum time to wait (ignored)
"""
_ = timeout # Unused but required for PageLike protocol
async def content(self) -> str:
"""Get page HTML content.
Uses content script's GET_CONTENT action.
Returns:
HTML content of the page
"""
result = await self._send_command("GET_CONTENT", {"mode": "html"})
if isinstance(result, dict) and "content" in result:
return str(result["content"])
return str(result) if result else ""
async def screenshot(self, *, _full_page: bool = False) -> bytes:
"""Capture page screenshot (not supported in extension mode).
Args:
_full_page: Whether to capture full page (ignored)
Raises:
NotImplementedError: Always; screenshots are not supported in extension mode
"""
raise NotImplementedError("Screenshots not supported in extension mode")
    async def eval_js(self, code: str, await_promise: bool = True) -> JSONValue:
        """Execute JavaScript code via extension and return result.

        Args:
            code: JavaScript code to execute
            await_promise: Whether to await promises in the code

        Returns:
            Result from JavaScript execution

        Raises:
            BrowserConnectionError: If the extension is not connected or does
                not respond within the command timeout
            Exception: If the extension reports an execution error
        """
        request_id = str(uuid.uuid4())
        future: asyncio.Future[JSONValue] = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        message = {
            "id": request_id,
            "action": "eval",
            "code": code,
            "awaitPromise": await_promise,
        }
        logger.debug("[Extension] Sending eval: {}...", code[:100])
        eval_timeout = self._timeouts.extension_command_s
        try:
            await self._client.send_message(cast(dict[str, JSONValue], message))
            return await asyncio.wait_for(future, timeout=eval_timeout)
        except asyncio.TimeoutError as e:
            raise BrowserConnectionError(
                f"Extension eval timeout after {eval_timeout}s: {code[:100]}",
                details={"code_preview": code[:100], "timeout_seconds": eval_timeout},
            ) from e
        finally:
            # Drop the pending entry on every exit path (send failure, timeout,
            # cancellation); this is a no-op once handle_response has popped it.
            _ = self._pending.pop(request_id, None)
def handle_response(self, data: dict[str, JSONValue]) -> None:
"""Handle response from extension.
Args:
data: Response data from extension
"""
request_id = data.get("id")
if not request_id or request_id not in self._pending:
return
future = self._pending.pop(request_id)
if future.done():
return
if data.get("ok"):
future.set_result(data.get("result"))
else:
error_msg = data.get("error", "Unknown error")
future.set_exception(Exception(f"Extension error: {error_msg}"))
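Review note: `eval_js` and `handle_response` together form a request/response correlation over the WebSocket, with each request carrying a UUID and a future parked in `_pending` until the matching reply arrives. The sketch below isolates that pattern so it can be exercised without a browser. `PendingRequests` and `demo` are hypothetical names introduced for illustration only, and the reply is simulated with `call_later` instead of a real extension message.

```python
import asyncio
import uuid


class PendingRequests:
    """Hypothetical helper: maps request IDs to futures awaiting a reply."""

    def __init__(self) -> None:
        self._pending: dict = {}

    def create(self) -> tuple:
        """Register a new request and return (request_id, future)."""
        request_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        return request_id, future

    def resolve(self, data: dict) -> bool:
        """Complete the future matching data["id"]; return False if unknown."""
        request_id = data.get("id")
        future = self._pending.pop(request_id, None)
        if future is None or future.done():
            return False
        if data.get("ok"):
            future.set_result(data.get("result"))
        else:
            error = data.get("error", "Unknown error")
            future.set_exception(RuntimeError(f"Extension error: {error}"))
        return True

    def discard(self, request_id: str) -> None:
        """Forget a request, e.g. after a timeout or a failed send."""
        self._pending.pop(request_id, None)

    def __len__(self) -> int:
        return len(self._pending)


async def demo() -> tuple:
    pending = PendingRequests()
    request_id, future = pending.create()
    # Simulated reply: stands in for the extension answering over the socket.
    reply = {"id": request_id, "ok": True, "result": 42}
    asyncio.get_running_loop().call_later(0.01, pending.resolve, reply)
    try:
        result = await asyncio.wait_for(future, timeout=1.0)
    finally:
        pending.discard(request_id)
    return result, len(pending)
```

Calling `asyncio.run(demo())` should yield the simulated result together with an empty pending map, which is the property worth checking in the real class: no entry should outlive its request.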
class ExtensionLocator:
"""Playwright-like Locator interface for element operations.
Provides methods for interacting with located elements.
"""
def __init__(
self,
page: ExtensionPage,
selector: str,
index: int | None = None,
parent: "ExtensionLocator | None" = None,
) -> None:
self._page: ExtensionPage = page
self._selector: str = selector
self._index: int | None = index
self._parent: ExtensionLocator | None = parent
def _selector_chain(self) -> list[dict[str, int | str | None]]:
"""Build a selector chain from root to this locator."""
chain: list[dict[str, int | str | None]] = []
node: ExtensionLocator | None = self
while node:
chain.append({"selector": node._selector, "index": node._index})
node = node._parent
chain.reverse()
return chain
def _build_chain_eval(self, body: str, empty_expr: str) -> str:
"""Generate JS that resolves the locator chain then runs body."""
chain = json.dumps(self._selector_chain())
return f"""
const chain = {chain};
let contexts = [document];
for (const step of chain) {{
let next = [];
for (const ctx of contexts) {{
next.push(...ctx.querySelectorAll(step.selector));
}}
if (step.index !== null && step.index !== undefined) {{
const el = next[step.index];
next = el ? [el] : [];
}}
contexts = next;
}}
if (!contexts.length) {{ {empty_expr} }}
const target = contexts[0];
{body}
"""
def locator(self, selector: str) -> "PageLocator":
"""Find a child element within this locator.
Keeps the selector chain (including nth index) so the child
query is scoped to the same parent element.
"""
return cast(
PageLocator,
cast(object, ExtensionLocator(self._page, selector, None, self)),
)
async def click(self, *, timeout: int | float | None = None) -> None:
"""Click the located element.
Args:
timeout: Maximum time to wait in milliseconds (ignored for extension mode)
"""
_ = timeout # Unused but required for PageLocator protocol compatibility
js_code = self._build_chain_eval(
"target.click(); return true;",
"throw new Error('Element not found');",
)
_ = await self._page.eval_js(js_code)
    async def fill(self, value: str) -> None:
        """Fill the located element with a value."""
        # json.dumps yields a quoted JS string literal and also escapes
        # newlines and control characters, which manual quote escaping misses.
        value_literal = json.dumps(value)
        js_code = self._build_chain_eval(
            f"""
            const el = target;
            const tagName = el.tagName.toLowerCase();
            let nativeSetter;
            if (tagName === 'textarea') {{
                nativeSetter = Object.getOwnPropertyDescriptor(
                    window.HTMLTextAreaElement.prototype,
                    'value'
                ).set;
            }} else if (tagName === 'input') {{
                nativeSetter = Object.getOwnPropertyDescriptor(
                    window.HTMLInputElement.prototype,
                    'value'
                ).set;
            }} else {{
                throw new Error('Element must be input or textarea');
            }}
            nativeSetter.call(el, {value_literal});
            el.dispatchEvent(new Event('input', {{ bubbles: true }}));
            el.dispatchEvent(new Event('change', {{ bubbles: true }}));
            return true;
            """,
            "throw new Error('Element not found');",
        )
        _ = await self._page.eval_js(js_code)
    async def type(self, text: str) -> None:
        """Type text into the located element.

        Sets the value in a single step, exactly as fill() does; individual
        keystrokes are not simulated.
        """
        await self.fill(text)
async def wait_for(self, state: str = "visible", timeout: int = 5000) -> None:
"""Wait for the element to reach a specific state."""
chain = json.dumps(self._selector_chain())
js_code = f"""
const chain = {chain};
const state = '{state}';
const timeout = {timeout};
const startTime = Date.now();
function resolveChain() {{
let contexts = [document];
for (const step of chain) {{
let next = [];
for (const ctx of contexts) {{
next.push(...ctx.querySelectorAll(step.selector));
}}
if (step.index !== null && step.index !== undefined) {{
const el = next[step.index];
next = el ? [el] : [];
}}
contexts = next;
}}
return contexts;
}}
function checkState() {{
const matches = resolveChain();
if (state === 'visible') {{
return matches.some(el => el && el.offsetParent !== null);
}} else if (state === 'hidden') {{
return matches.length === 0 || matches.every(el => el && el.offsetParent === null);
}} else if (state === 'attached') {{
return matches.length > 0;
}} else if (state === 'detached') {{
return matches.length === 0;
}}
return matches.length > 0;
}}
return new Promise((resolve, reject) => {{
const interval = setInterval(() => {{
if (checkState()) {{
clearInterval(interval);
resolve(true);
}} else if (Date.now() - startTime > timeout) {{
clearInterval(interval);
reject(new Error('Timeout waiting for element state: ' + state));
}}
}}, 100);
}});
"""
_ = await self._page.eval_js(js_code, await_promise=True)
    async def count(self) -> int:
        """Get the count of matching elements."""
        # Reuse the shared chain resolver instead of duplicating its JS.
        js_code = self._build_chain_eval("return contexts.length;", "return 0;")
        result = await self._page.eval_js(js_code)
        return int(result) if isinstance(result, (int, float)) else 0
async def text_content(self) -> str | None:
"""Get the text content of the located element."""
js_code = self._build_chain_eval(
"return target ? target.textContent : null;", "return null;"
)
result = await self._page.eval_js(js_code)
return str(result) if result is not None else None
def nth(self, index: int) -> "PageLocator":
"""Get the nth matching element."""
return cast(
PageLocator,
cast(
object,
ExtensionLocator(self._page, self._selector, index, self._parent),
),
)
@property
def first(self) -> "PageLocator":
"""Get the first matching element."""
return self.nth(0)
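Review note: `ExtensionLocator` never holds a DOM handle. It records a chain of `{selector, index}` steps from the root locator down, and serializes that chain with `json.dumps` into the JavaScript that resolves it. The sketch below models only the chain bookkeeping, with no page or extension involved. `ChainLocator` is a hypothetical stand-in for illustration, not a class from this PR.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChainLocator:
    """Hypothetical stand-in for the chain bookkeeping in ExtensionLocator."""

    selector: str
    index: Optional[int] = None
    parent: Optional["ChainLocator"] = None

    def nth(self, index: int) -> "ChainLocator":
        # Same selector and parent, narrowed to a single match.
        return ChainLocator(self.selector, index, self.parent)

    def locator(self, selector: str) -> "ChainLocator":
        # The child keeps this locator (including its index) as its parent.
        return ChainLocator(selector, None, self)

    def chain(self) -> list:
        """Return the steps ordered from the root locator to this one."""
        steps = []
        node: Optional[ChainLocator] = self
        while node is not None:
            steps.append({"selector": node.selector, "index": node.index})
            node = node.parent
        steps.reverse()
        return steps

    def to_js_literal(self) -> str:
        # JSON is a safe way to embed the chain in generated JavaScript,
        # since quotes inside selectors are escaped automatically.
        return json.dumps(self.chain())


row_link = ChainLocator("table tr").nth(2).locator("a[title='Edit']")
```

Here `row_link.chain()` lists the indexed `table tr` step first and the unindexed link step second, so the generated JavaScript scopes the link query to the third row only.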
class ExtensionClient:
"""WebSocket server for Terminator Bridge extension.
Starts a WebSocket server that the browser extension connects to
and provides page access for automation.
Usage:
async with ExtensionClient() as client:
page = await client.get_page()
await page.click("button")
"""
def __init__(
self,
host: str = "0.0.0.0",
port: int = DEFAULT_EXTENSION_PORT,
*,
timeouts: Timeouts | None = None,
) -> None:
self._host: str = host
self._port: int = port
self._timeouts: Timeouts = timeouts or Timeouts()
self._server: Server | None = None
self._ws: ServerConnection | None = None
self._page: ExtensionPage | None = None
self._connected: asyncio.Event = asyncio.Event()
async def __aenter__(self) -> "ExtensionClient":
await self.start()
return self
async def __aexit__(
self,
exc_type: type[BaseException] | None,
exc_val: BaseException | None,
exc_tb: object,
) -> None:
await self.close()
async def start(self) -> None:
"""Start the WebSocket server and wait for extension to connect."""
logger.info("Starting WebSocket server at {}:{}", self._host, self._port)
try:
self._server = await serve(self._handle_connection, self._host, self._port)
except Exception as exc: # pragma: no cover - error path is exercised in tests via raised subclass
raise BrowserConnectionError(
"Failed to start extension WebSocket server",
details={"host": self._host, "port": self._port, "reason": str(exc)},
) from exc
logger.info("WebSocket server listening at ws://{}:{}", self._host, self._port)
logger.info("Waiting for extension to connect...")
# Wait for extension to connect (with timeout)
connection_timeout = self._timeouts.extension_connection_s
try:
_ = await asyncio.wait_for(
self._connected.wait(), timeout=connection_timeout
)
logger.info("Extension connected successfully")
except (
asyncio.TimeoutError
) as exc: # pragma: no cover - covered by unit test raising connection error
await self.close()
raise BrowserConnectionError(
f"Extension did not connect within {connection_timeout} seconds. Make sure Chrome is running with the Terminator Bridge extension loaded.",
details={
"host": self._host,
"port": self._port,
"timeout_seconds": connection_timeout,
},
) from exc
async def close(self) -> None:
"""Close the WebSocket server."""
if self._ws:
await self._ws.close()
self._ws = None
if self._server:
self._server.close()
await self._server.wait_closed()
logger.info("WebSocket server closed")
async def get_page(self) -> ExtensionPage:
"""Get the page interface for browser automation.
Returns:
ExtensionPage instance for automation
Raises:
BrowserConnectionError: If extension not connected
"""
if not self._page:
raise BrowserConnectionError(
"Extension not connected",
details={"host": self._host, "port": self._port},
)
return self._page
async def _handle_connection(self, websocket: ServerConnection) -> None:
"""Handle incoming WebSocket connection from extension.
Args:
websocket: WebSocket connection from extension
"""
logger.info("Extension connected")
self._ws = websocket
self._page = ExtensionPage(self, timeouts=self._timeouts)
# Set connected flag only AFTER page is initialized to prevent race condition
self._connected.set()
ws_timeout = self._timeouts.websocket_receive_s
try:
# Use timeout to detect hung connections
while True:
try:
# Timeout triggers ping to check connection health
message = await asyncio.wait_for(
websocket.recv(), timeout=ws_timeout
)
try:
data = cast(dict[str, JSONValue], json.loads(message))
if self._page:
self._page.handle_response(data)
except json.JSONDecodeError:
logger.warning("Invalid JSON from extension: {}", message)
except Exception as e:
logger.error("Error handling extension message: {}", e)
except asyncio.TimeoutError:
# No message in timeout period - check if connection is still alive
logger.debug(
"No message from extension for {}s, sending ping", ws_timeout
)
try:
_ = await websocket.ping()
except Exception:
logger.warning("Extension ping failed, connection may be dead")
break
except Exception as e:
logger.error("Extension connection error: {}", e)
finally:
logger.info("Extension disconnected")
# Ensure WebSocket is properly closed before clearing reference
with contextlib.suppress(Exception):
if self._ws:
await self._ws.close()
self._ws = None
self._connected.clear()
async def send_message(self, message: dict[str, JSONValue]) -> None:
"""Send message to connected extension.
Args:
message: Message to send
Raises:
BrowserConnectionError: If extension not connected
"""
if not self._ws:
raise BrowserConnectionError(
"Extension not connected",
details={"host": self._host, "port": self._port},
)
await self._ws.send(json.dumps(message))
__all__ = ["ExtensionClient", "ExtensionPage", "ExtensionLocator", "PageWithExtensions"]
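Review note: `ExtensionPage.wait_for_selector` in this file polls a page-side check at a fixed interval until a deadline passes, treating evaluation errors as "not ready yet". A generic version of that loop might look like the sketch below. `poll_until` is a hypothetical helper, and returning `False` on timeout is an assumption made here so callers can tell the two outcomes apart (the method in this PR returns `None` in both cases).

```python
import asyncio
from collections.abc import Awaitable, Callable


async def poll_until(
    predicate: Callable[[], Awaitable[bool]],
    *,
    timeout_ms: int = 5000,
    interval_ms: int = 100,
) -> bool:
    """Poll predicate until it returns True or timeout_ms elapses.

    Returns True if the predicate succeeded, False if the deadline passed.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_ms / 1000
    while True:
        try:
            if await predicate():
                return True
        except Exception:
            # Transient evaluation errors count as "not ready yet".
            pass
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(interval_ms / 1000)
```

The predicate always runs at least once, even with a zero timeout, which avoids reporting a timeout for a condition that already holds.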


@@ -0,0 +1,30 @@
"""High-level page interaction helpers for demo actions.
Composes small, focused mixins so each responsibility stays isolated while
keeping the public API stable for actions.
"""
from guide.app.browser.elements.layout import Accordion, AccordionCollapseResult
from guide.app.browser.mixins import DiagnosticsMixin, InteractionMixin, WaitMixin
from guide.app.browser.types import PageLike
from guide.app.core.config import Timeouts
class PageHelpers(WaitMixin, DiagnosticsMixin, InteractionMixin):
"""High-level page interaction wrapper for demo actions."""
def __init__(self, page: PageLike, *, timeouts: Timeouts | None = None) -> None:
self.page: PageLike = page
self.timeouts: Timeouts = timeouts or Timeouts()
async def collapse_accordions(
self,
selector: str,
timeout_ms: int | None = None,
) -> AccordionCollapseResult:
"""Collapse all expanded accordion buttons matching selector."""
accordion = Accordion(self.page, timeouts=self.timeouts)
return await accordion.collapse_all(selector, timeout_ms=timeout_ms)
__all__ = ["PageHelpers", "AccordionCollapseResult"]
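Review note: `PageHelpers` lists `WaitMixin` before `InteractionMixin`, and that order matters because `InteractionMixin` ships placeholder `wait_for_*` methods that raise `NotImplementedError`. Python's method resolution order picks the first base that defines a method, so the real implementation wins only while `WaitMixin` comes first. The sketch below reproduces the arrangement with simplified, hypothetical bodies (a string stands in for the page object).

```python
import asyncio
from typing import Protocol


class _HasPage(Protocol):
    """Structural type for anything exposing a `page` attribute."""

    page: str


class WaitMixin:
    async def wait_for_network_idle(self: _HasPage) -> str:
        return f"idle on {self.page}"


class InteractionMixin:
    page: str

    async def wait_for_network_idle(self) -> str:
        # Placeholder, shadowed when WaitMixin precedes this class in the MRO.
        raise NotImplementedError("Requires WaitMixin")

    async def click_and_wait(self, selector: str) -> str:
        return f"clicked {selector}; " + await self.wait_for_network_idle()


class Helpers(WaitMixin, InteractionMixin):
    """WaitMixin is listed first so its wait methods take precedence."""

    def __init__(self, page: str) -> None:
        self.page = page


outcome = asyncio.run(Helpers("demo").click_and_wait("#go"))
```

Swapping the base order to `(InteractionMixin, WaitMixin)` would make `click_and_wait` hit the placeholder and raise, so the order in `PageHelpers` is load-bearing rather than cosmetic.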


@@ -0,0 +1,218 @@
"""HTML parsing utilities using BeautifulSoup for reliable DOM traversal.
Provides robust HTML element extraction using BeautifulSoup instead of regex patterns.
This module is designed to extract interactive UI elements (inputs, buttons, selects,
textareas) and build CSS selectors with a defined priority order.
"""
from typing import Final
from bs4 import BeautifulSoup, Tag
# Selector attribute priority order (highest to lowest)
SELECTOR_PRIORITY: Final[tuple[str, ...]] = (
"data-cy",
"data-testid",
"data-test",
"id",
"name",
"aria-label",
)
def extract_form_elements(html: str) -> dict[str, str]:
"""Extract interactive UI elements with CSS selectors via DOM traversal.
Finds all input, button, select, textarea, and role="button" elements
and maps them to semantic names with CSS selectors.
Args:
html: HTML string to parse
Returns:
Dictionary mapping semantic element names to CSS selectors
"""
soup = BeautifulSoup(html, "html.parser")
elements: dict[str, str] = {}
name_counts: dict[str, int] = {}
# Standard form elements
for tag in soup.find_all(["input", "button", "select", "textarea"]):
_add_element(tag, elements, name_counts)
# Role=button elements (custom buttons implemented with divs/spans)
for tag in soup.find_all(attrs={"role": "button"}):
_add_element(tag, elements, name_counts, prefix="ROLE_BUTTON_")
return elements
def extract_modal_content(html: str) -> str | None:
"""Extract displayed modal content, ignoring hidden modals.
Searches for modal elements using common patterns (modal-wrapper class,
modal ID, role=dialog, MuiModal class) and returns the first visible one.
Args:
html: HTML string to parse
Returns:
HTML string of the modal content if found and visible, None otherwise
"""
soup = BeautifulSoup(html, "html.parser")
# Check for modals in priority order using direct CSS class/attribute checks
# 1. Check for modal-wrapper class
for modal in soup.find_all("div", class_=True):
classes = modal.get("class")
if isinstance(classes, list) and any(
"modal-wrapper" in str(c) for c in classes
):
if not _is_hidden(modal):
return str(modal)
# 2. Check for modal in ID
for modal in soup.find_all("div", id=True):
elem_id = modal.get("id")
if isinstance(elem_id, str) and "modal" in elem_id.lower():
if not _is_hidden(modal):
return str(modal)
# 3. Check for role=dialog
for modal in soup.find_all("div", attrs={"role": "dialog"}):
if not _is_hidden(modal):
return str(modal)
# 4. Check for MuiModal class
for modal in soup.find_all("div", class_=True):
classes = modal.get("class")
if isinstance(classes, list) and any("MuiModal" in str(c) for c in classes):
if not _is_hidden(modal):
return str(modal)
return None
def build_selector(tag: Tag) -> str | None:
"""Build CSS selector using priority order.
Creates a CSS selector from the element's attributes, preferring
data-cy > data-testid > data-test > id > name > aria-label.
Args:
tag: BeautifulSoup Tag element
Returns:
CSS selector string or None if no suitable attribute found
"""
for attr in SELECTOR_PRIORITY:
value = tag.get(attr)
if value:
# Handle list values (e.g., class attributes)
if isinstance(value, list):
value = value[0] if value else None
if value:
if attr == "id":
return f"#{value}"
return f'[{attr}="{value}"]'
return None
def _is_hidden(elem: Tag) -> bool:
    """Check if element is hidden via style, aria-hidden, or hidden attribute.

    Args:
        elem: BeautifulSoup Tag element

    Returns:
        True if element appears to be hidden
    """
    # Check inline style for display:none. Whitespace is stripped first so
    # "display: none" matches, while an unrelated "none" elsewhere in the
    # style (e.g. "display: block; pointer-events: none") does not.
    style = elem.get("style", "")
    if isinstance(style, str) and "display:none" in "".join(style.split()).lower():
        return True
    # Check aria-hidden
    if elem.get("aria-hidden") == "true":
        return True
    # Check hidden attribute
    return elem.has_attr("hidden")
def _get_element_identifier(tag: Tag) -> str | None:
"""Get the primary identifier for an element based on selector priority.
Args:
tag: BeautifulSoup Tag element
Returns:
The identifier string or None if no suitable attribute found
"""
for attr in SELECTOR_PRIORITY:
value = tag.get(attr)
if value:
if isinstance(value, list):
value = value[0] if value else None
if value:
return str(value)
return None
def _normalize_name(name: str) -> str:
"""Normalize a name to a valid identifier format.
Args:
name: Raw name string
Returns:
Normalized uppercase name with safe characters
"""
return name.upper().replace("-", "_").replace(".", "_").replace(" ", "_")
def _add_element(
tag: Tag,
elements: dict[str, str],
name_counts: dict[str, int],
prefix: str = "",
) -> None:
"""Add element to dict with duplicate handling.
Extracts the selector and semantic name, handles duplicates by adding
numeric suffixes.
Args:
tag: BeautifulSoup Tag element
elements: Output dictionary to update
name_counts: Counter dict for duplicate handling
prefix: Optional prefix for semantic name (e.g., "ROLE_BUTTON_")
"""
selector = build_selector(tag)
if not selector:
return
identifier = _get_element_identifier(tag)
if not identifier:
return
# Build semantic name with tag prefix for non-input elements
tag_prefix = ""
if tag.name != "input":
tag_prefix = f"{tag.name.upper()}_"
base_name = f"{prefix}{tag_prefix}{_normalize_name(identifier)}"
# Handle duplicates with numeric suffixes
if base_name not in name_counts:
name_counts[base_name] = 1
elements[base_name] = selector
else:
name_counts[base_name] += 1
elements[f"{base_name}_{name_counts[base_name]}"] = selector
__all__ = [
"SELECTOR_PRIORITY",
"extract_form_elements",
"extract_modal_content",
"build_selector",
]
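Review note: the selector priority and duplicate-name handling in this module can be illustrated without BeautifulSoup. The sketch below applies the same rules to plain attribute dictionaries. `pick_selector` and `unique_names` are hypothetical simplifications of `build_selector` and `_add_element`; they skip list-valued attributes and the tag-name prefix that the real functions handle.

```python
from typing import Optional

# Same order as SELECTOR_PRIORITY in the module under review.
SELECTOR_PRIORITY = ("data-cy", "data-testid", "data-test", "id", "name", "aria-label")


def pick_selector(attrs: dict) -> Optional[str]:
    """Return a CSS selector from the highest-priority attribute present."""
    for attr in SELECTOR_PRIORITY:
        value = attrs.get(attr)
        if value:
            return f"#{value}" if attr == "id" else f'[{attr}="{value}"]'
    return None


def unique_names(identifiers: list) -> list:
    """Normalize identifiers and suffix repeats with _2, _3, and so on."""
    counts: dict = {}
    names = []
    for identifier in identifiers:
        base = identifier.upper().replace("-", "_").replace(".", "_").replace(" ", "_")
        counts[base] = counts.get(base, 0) + 1
        names.append(base if counts[base] == 1 else f"{base}_{counts[base]}")
    return names
```

Note that the first occurrence keeps the bare name and the second becomes `NAME_2`, so there is never a `NAME_1` key in the resulting mapping.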


@@ -0,0 +1,274 @@
"""Keep-alive service for browserless CDP sessions.
This module provides a background task that periodically refreshes browserless
CDP browser sessions to prevent session timeouts. It proactively establishes
connections to browserless hosts and reloads pages at a configurable interval
(every 9 minutes by default).
"""
import asyncio
import contextlib
from dataclasses import dataclass, field
from datetime import UTC, datetime
from typing import TYPE_CHECKING
from loguru import logger
from playwright.async_api import Page
from guide.app.core.config import HostKind
if TYPE_CHECKING:
from guide.app.browser.pool import BrowserPool
DEFAULT_KEEPALIVE_INTERVAL_SECONDS = 540 # 9 minutes
@dataclass
class KeepAliveHostStatus:
"""Status of a single browserless host."""
host_id: str
last_ping: datetime | None = None
is_connected: bool = False
last_error: str | None = None
@dataclass
class KeepAliveServiceStatus:
"""Overall status of the keep-alive service."""
hosts: list[KeepAliveHostStatus] = field(default_factory=list)
interval_seconds: int = DEFAULT_KEEPALIVE_INTERVAL_SECONDS
is_running: bool = False
started_at: datetime | None = None
class KeepAliveService:
"""Background service that keeps browserless CDP sessions alive.
Periodically pings browserless CDP hosts by reloading pages to prevent
session timeouts. Proactively establishes connections if not yet connected.
"""
_pool: "BrowserPool"
_interval: int
_task: asyncio.Task[None] | None
_status: dict[str, KeepAliveHostStatus]
_started_at: datetime | None
_running: bool
def __init__(
self,
browser_pool: "BrowserPool",
interval_seconds: int = DEFAULT_KEEPALIVE_INTERVAL_SECONDS,
) -> None:
"""Initialize the keep-alive service.
Args:
browser_pool: The browser pool to manage
interval_seconds: Interval between keep-alive pings (default: 540 = 9 minutes)
"""
self._pool = browser_pool
self._interval = interval_seconds
self._task = None
self._status = {}
self._started_at = None
self._running = False
def is_browserless_host(self, host_id: str) -> bool:
"""Check if a host is a browserless CDP host.
Args:
host_id: The host identifier to check
Returns:
True if the host is a browserless CDP host
"""
if "browserless" not in host_id.lower():
return False
host_config = self._pool.settings.browser_hosts.get(host_id)
if not host_config:
return False
return host_config.kind == HostKind.CDP
def get_browserless_host_ids(self) -> list[str]:
"""Get all browserless CDP host IDs from configuration.
Returns:
List of browserless CDP host IDs
"""
return [
host_id
for host_id in self._pool.settings.browser_hosts
if self.is_browserless_host(host_id)
]
    async def _allocate_and_reload(self, host_id: str) -> None:
        """Allocate a page from the pool and reload it.

        Args:
            host_id: The host identifier
        """
        context, page, should_close = await self._pool.allocate_context_and_page(
            host_id
        )
        try:
            if isinstance(page, Page):
                _ = await page.reload(timeout=30000)
        finally:
            # Close the temporary context even if the reload fails
            if should_close and context is not None:
                await context.close()
def _update_status(
self, host_id: str, *, is_connected: bool, error: str | None = None
) -> None:
"""Update the status for a host.
Args:
host_id: The host identifier
is_connected: Whether the host is connected
error: Error message if any
"""
self._status[host_id] = KeepAliveHostStatus(
host_id=host_id,
last_ping=datetime.now(UTC),
is_connected=is_connected,
last_error=error,
)
async def ping_host(self, host_id: str) -> None:
"""Refresh a single browserless host by reloading the page.
Proactively establishes connection if not yet connected.
Args:
host_id: The host identifier to ping
"""
logger.debug("[KEEPALIVE] Pinging host '{}'", host_id)
try:
instance = self._pool.get_instance(host_id)
# Proactively establish connection if not yet connected
if instance is None or instance.browser is None:
logger.info(
"[KEEPALIVE] Host '{}' not connected, establishing connection",
host_id,
)
await self._allocate_and_reload(host_id)
self._update_status(host_id, is_connected=True)
logger.info("[KEEPALIVE] Host '{}' connected and pinged", host_id)
return
# Reconnect if browser connection is stale
if not instance.browser.is_connected():
logger.warning(
"[KEEPALIVE] Host '{}' disconnected, reconnecting", host_id
)
await self._allocate_and_reload(host_id)
self._update_status(host_id, is_connected=True)
logger.info("[KEEPALIVE] Host '{}' reconnected and pinged", host_id)
return
# Reload existing cached page or allocate new one
cached_page = self._pool.get_cached_cdp_page(host_id)
if cached_page is not None:
_ = await cached_page.reload(timeout=30000)
logger.debug("[KEEPALIVE] Reloaded cached page for host '{}'", host_id)
else:
await self._allocate_and_reload(host_id)
self._update_status(host_id, is_connected=True)
except Exception as exc:
error_msg = str(exc)
logger.error("[KEEPALIVE] Failed to ping host '{}': {}", host_id, error_msg)
self._update_status(host_id, is_connected=False, error=error_msg)
async def ping_all(self) -> None:
"""Ping all browserless CDP hosts."""
host_ids = self.get_browserless_host_ids()
if not host_ids:
logger.debug("[KEEPALIVE] No browserless hosts configured")
return
logger.info("[KEEPALIVE] Pinging {} browserless hosts", len(host_ids))
for host_id in host_ids:
await self.ping_host(host_id)
async def _run_loop(self) -> None:
"""Background loop that runs keep-alive pings at the configured interval."""
logger.info(
"[KEEPALIVE] Starting keep-alive loop (interval: {}s)", self._interval
)
# Initial ping on startup
await self.ping_all()
while True:
await asyncio.sleep(self._interval)
await self.ping_all()
def start(self) -> None:
"""Start the background keep-alive task."""
if self._task is not None and not self._task.done():
logger.warning("[KEEPALIVE] Already running")
return
self._running = True
self._started_at = datetime.now(UTC)
self._task = asyncio.create_task(self._run_loop())
logger.info("[KEEPALIVE] Service started")
async def stop(self) -> None:
"""Stop and cleanup the background keep-alive task."""
self._running = False
if self._task is None:
return
_ = self._task.cancel()
with contextlib.suppress(asyncio.CancelledError):
await self._task
self._task = None
logger.info("[KEEPALIVE] Service stopped")
def get_status(self) -> KeepAliveServiceStatus:
"""Get the current status of the keep-alive service.
Returns:
Status including all tracked hosts and service state
"""
host_ids = self.get_browserless_host_ids()
# Ensure all browserless hosts have status entries
hosts: list[KeepAliveHostStatus] = []
for host_id in host_ids:
if host_id in self._status:
hosts.append(self._status[host_id])
else:
hosts.append(
KeepAliveHostStatus(
host_id=host_id,
last_ping=None,
is_connected=False,
last_error=None,
)
)
return KeepAliveServiceStatus(
hosts=hosts,
interval_seconds=self._interval,
is_running=self._running,
started_at=self._started_at,
)
__all__ = [
"KeepAliveService",
"KeepAliveHostStatus",
"KeepAliveServiceStatus",
"DEFAULT_KEEPALIVE_INTERVAL_SECONDS",
]
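Review note: the start/stop lifecycle of `KeepAliveService` (one immediate ping, then a sleep-and-ping loop in a background task that is cancelled on stop) can be tried in isolation. The sketch below is a minimal version under stated assumptions: `PeriodicTask` is a hypothetical name, the callback replaces `ping_all`, and the interval is shortened to milliseconds purely for demonstration.

```python
import asyncio
import contextlib
from collections.abc import Awaitable, Callable
from typing import Optional


class PeriodicTask:
    """Hypothetical reduction of the keep-alive loop to its lifecycle."""

    def __init__(
        self, callback: Callable[[], Awaitable[None]], interval_seconds: float
    ) -> None:
        self._callback = callback
        self._interval = interval_seconds
        self._task: Optional[asyncio.Task] = None

    async def _run_loop(self) -> None:
        await self._callback()  # Initial run on startup
        while True:
            await asyncio.sleep(self._interval)
            await self._callback()

    def start(self) -> None:
        """Start the loop; a second call while running is ignored."""
        if self._task is not None and not self._task.done():
            return
        self._task = asyncio.create_task(self._run_loop())

    async def stop(self) -> None:
        """Cancel the loop and wait for it to finish unwinding."""
        if self._task is None:
            return
        self._task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await self._task
        self._task = None

    @property
    def running(self) -> bool:
        return self._task is not None and not self._task.done()


async def demo() -> tuple:
    calls = 0

    async def ping() -> None:
        nonlocal calls
        calls += 1

    task = PeriodicTask(ping, interval_seconds=0.01)
    task.start()
    task.start()  # Ignored: already running
    await asyncio.sleep(0.06)
    await task.stop()
    return calls, task.running
```

Deriving `running` from the task state, rather than from a separate boolean, keeps the reported status accurate even if the loop exits unexpectedly.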


@@ -0,0 +1,7 @@
"""Composable mixins used by PageHelpers to keep responsibilities focused."""
from guide.app.browser.mixins.diagnostics import DiagnosticsMixin
from guide.app.browser.mixins.interaction import InteractionMixin
from guide.app.browser.mixins.wait import WaitMixin
__all__ = ["WaitMixin", "DiagnosticsMixin", "InteractionMixin"]


@@ -0,0 +1,33 @@
"""Diagnostics helpers mixed into PageHelpers."""
from typing import TYPE_CHECKING, Protocol
if TYPE_CHECKING:
from guide.app.browser.types import PageLike
class _DiagnosticsProtocol(Protocol):
"""Protocol for classes that mix in DiagnosticsMixin."""
@property
def page(self) -> "PageLike": ...
class DiagnosticsMixin:
"""Capture diagnostics (HTML, screenshot, console logs).
This mixin expects `page` to be set by the class that uses it.
"""
    def __init_subclass__(cls, **kwargs: object) -> None:
        """Pass subclass creation through to the next class in the MRO.

        No attribute validation is performed here.
        """
        super().__init_subclass__(**kwargs)
async def capture_diagnostics(self: _DiagnosticsProtocol):
"""Capture all diagnostic information (screenshot, HTML, logs)."""
from guide.app.browser.diagnostics import capture_all_diagnostics
return await capture_all_diagnostics(self.page)
__all__ = ["DiagnosticsMixin"]


@@ -0,0 +1,100 @@
"""High-level interaction helpers mixed into PageHelpers."""
from typing import TYPE_CHECKING, Protocol
from guide.app.browser.elements import dropdown
if TYPE_CHECKING:
from guide.app.browser.types import PageLike
class _InteractionMixinProtocol(Protocol):
"""Protocol for classes that mix in InteractionMixin."""
@property
def page(self) -> "PageLike": ...
async def wait_for_network_idle(self, timeout_ms: int | None = None) -> None:
"""Wait for network idle state."""
...
async def wait_for_stable(
self,
stability_check_ms: int | None = None,
samples: int = 3,
) -> None:
"""Wait for page stability."""
...
class InteractionMixin:
"""Shared interaction patterns built on top of PageLike primitives."""
    def __init_subclass__(cls, **kwargs: object) -> None:
        """Pass subclass creation through to the next class in the MRO.

        No attribute validation is performed here.
        """
        super().__init_subclass__(**kwargs)
async def wait_for_network_idle(self, _timeout_ms: int | None = None) -> None:
"""Abstract - implemented by WaitMixin."""
raise NotImplementedError("Requires WaitMixin")
async def wait_for_stable(
self,
_stability_check_ms: int | None = None,
_samples: int = 3,
) -> None:
"""Abstract - implemented by WaitMixin."""
raise NotImplementedError("Requires WaitMixin")
async def fill_and_advance(
self: _InteractionMixinProtocol,
selector: str,
value: str,
next_selector: str,
wait_for_idle: bool = True,
) -> None:
await self.page.fill(selector, value)
await self.page.click(next_selector)
if wait_for_idle:
await self.wait_for_network_idle()
async def search_and_select(
self: _InteractionMixinProtocol,
search_input: str,
query: str,
result_selector: str,
index: int = 0,
) -> None:
await self.page.fill(search_input, query)
await self.wait_for_network_idle()
results = self.page.locator(result_selector)
await results.nth(index).click()
async def click_and_wait(
self: _InteractionMixinProtocol,
selector: str,
wait_for_idle: bool = True,
wait_for_stable: bool = False,
) -> None:
await self.page.click(selector)
if wait_for_idle:
await self.wait_for_network_idle()
if wait_for_stable:
await self.wait_for_stable()
async def select_dropdown_options(
self: _InteractionMixinProtocol,
field_selector: str,
target_values: list[str],
close_after: bool = True,
) -> dict[str, list[str]]:
_ = close_after # Reserved for future implementation
result = await dropdown.select_multi(self.page, field_selector, target_values)
return {
"selected": result["selected"],
"not_found": result["not_found"],
"available_options": result.get("available", []),
}
__all__ = ["InteractionMixin"]


@@ -0,0 +1,75 @@
"""Wait-related helpers mixed into PageHelpers."""
from typing import TYPE_CHECKING, Protocol
from guide.app.browser.wait import (
wait_for_network_idle,
wait_for_navigation,
wait_for_selector,
wait_for_stable_page,
)
if TYPE_CHECKING:
from guide.app.browser.types import PageLike
from guide.app.core.config import Timeouts
class _WaitMixinProtocol(Protocol):
"""Protocol for classes that mix in WaitMixin."""
@property
def page(self) -> "PageLike": ...
@property
def timeouts(self) -> "Timeouts": ...
class WaitMixin:
"""Provide wait utilities that reuse shared timeout configuration."""
def __init_subclass__(cls, **kwargs: object) -> None:
"""Cooperate with other __init_subclass__ hooks; no attribute checks are performed here."""
super().__init_subclass__(**kwargs)
async def wait_for_selector(
self: _WaitMixinProtocol, selector: str, timeout_ms: int | None = None
) -> None:
await wait_for_selector(
self.page,
selector,
timeout_ms=timeout_ms,
timeouts=self.timeouts,
)
async def wait_for_network_idle(
self: _WaitMixinProtocol, timeout_ms: int | None = None
) -> None:
await wait_for_network_idle(
self.page,
timeout_ms=timeout_ms,
timeouts=self.timeouts,
)
async def wait_for_navigation(
self: _WaitMixinProtocol, timeout_ms: int | None = None
) -> None:
await wait_for_navigation(
self.page,
timeout_ms=timeout_ms,
timeouts=self.timeouts,
)
async def wait_for_stable(
self: _WaitMixinProtocol,
stability_check_ms: int | None = None,
samples: int = 3,
) -> None:
await wait_for_stable_page(
self.page,
stability_check_ms=stability_check_ms,
samples=samples,
timeouts=self.timeouts,
)
__all__ = ["WaitMixin"]


@@ -6,13 +6,21 @@ expensive overhead of launching/connecting to browsers on each request.
Architecture:
- BrowserPool: Manages the lifecycle of browser instances by host
- Per host: Single persistent browser connection
- Per action: Fresh BrowserContext for complete isolation
- No page/context pooling: Each action gets a clean slate
- Per action: Context behavior depends on host kind and configuration
Isolation Modes:
- Headless: Fresh context per action (complete isolation)
- Extension: Fresh page per action (no CDP refresh issue)
- CDP with isolate=False: Cached context/page for performance (state persists)
- CDP with isolate=True: Cached context/page with storage clearing between requests
"""
import asyncio
import contextlib
import logging
from pathlib import Path
from typing import TYPE_CHECKING, TypeAlias
from loguru import logger
from playwright.async_api import (
Browser,
BrowserContext,
@@ -20,60 +28,229 @@ from playwright.async_api import (
Playwright,
async_playwright,
)
from playwright._impl._errors import TargetClosedError
from guide.app.core.config import AppSettings, BrowserHostConfig, HostKind
if TYPE_CHECKING:
from playwright.async_api import StorageState
else:
StorageState = dict[str, object] # Runtime fallback
from guide.app.core.config import (
AppSettings,
BrowserHostConfig,
DEFAULT_EXTENSION_PORT,
HostKind,
)
from guide.app import errors
from guide.app.browser.extension_client import ExtensionClient, ExtensionPage
_logger = logging.getLogger(__name__)
PageLike: TypeAlias = Page | ExtensionPage
class BrowserInstance:
"""Manages a single browser connection and its lifecycle.
Creates fresh contexts for each request to ensure complete isolation
between actions. No context pooling or reuse.
Context behavior depends on host kind:
- Headless: Fresh context per request (complete isolation)
- Extension: Fresh page per request (no context concept)
- CDP: Cached context/page with optional storage clearing (isolate flag)
"""
def __init__(
self, host_id: str, host_config: BrowserHostConfig, browser: Browser
self,
host_id: str,
host_config: BrowserHostConfig,
browser: Browser | None = None,
extension_client: ExtensionClient | None = None,
) -> None:
"""Initialize a browser instance for a host.
Args:
host_id: The host identifier
host_config: The host configuration
browser: The Playwright browser instance
browser: The Playwright browser instance (for CDP/headless)
extension_client: The extension client (for extension hosts)
"""
self.host_id: str = host_id
self.host_config: BrowserHostConfig = host_config
self.browser: Browser = browser
self.browser: Browser | None = browser
self.extension_client: ExtensionClient | None = extension_client
# Cache CDP context and page to avoid re-querying (which causes refresh)
self._cdp_context: BrowserContext | None = None
self._cdp_page: Page | None = None
# Note: Extension mode doesn't need page caching - no CDP refresh issue
async def allocate_context_and_page(self) -> tuple[BrowserContext, Page]:
async def allocate_context_and_page(
self,
storage_state: StorageState | str | Path | None = None,
) -> tuple[BrowserContext | None, PageLike, bool]:
"""Allocate a fresh context and page for this request.
Both CDP and headless modes create new contexts for complete isolation.
For CDP mode: returns cached context and page from initial connection.
For headless mode: creates a new context for complete isolation.
For extension mode: returns None context and extension page.
Args:
storage_state: Optional Playwright storage_state to initialize context.
Only applies to headless mode.
Returns:
Tuple of (context, page) - caller must close context when done
Tuple of (context, page, should_close):
- context: Browser context (None for extension hosts)
- page: Page instance
- should_close: True if context should be closed after use, False for CDP/extension
Raises:
BrowserConnectionError: If context/page creation fails
"""
try:
context = await self.browser.new_context()
page = await context.new_page()
return context, page
if self.host_config.kind == HostKind.EXTENSION:
return await self._allocate_extension_page()
if self.host_config.kind == HostKind.CDP:
return await self._allocate_cdp_page()
return await self._allocate_headless_page(storage_state=storage_state)
except errors.BrowserConnectionError:
raise
except Exception as exc:
raise errors.BrowserConnectionError(
f"Failed to allocate page for host {self.host_id}",
details={"host_id": self.host_id, "host_kind": self.host_config.kind},
) from exc
async def _allocate_extension_page(self) -> tuple[None, ExtensionPage, bool]:
"""Allocate extension page - no caching needed unlike CDP mode."""
logger.info("[EXTENSION-{}] allocate_context_and_page called", self.host_id)
if self.extension_client is None:
raise errors.BrowserConnectionError(
f"Extension client not initialized for host {self.host_id}",
details={"host_id": self.host_id},
)
# Get fresh page each time - no caching needed for extension mode
# (CDP caches to avoid page refresh on context/page queries)
page = await self.extension_client.get_page()
logger.info("[EXTENSION-{}] Returning fresh extension page", self.host_id)
return None, page, False
async def _allocate_cdp_page(self) -> tuple[BrowserContext, PageLike, bool]:
logger.info("[CDP-{}] allocate_context_and_page called", self.host_id)
# Check if cached page is still valid
if self._cdp_page is not None and self._cdp_page.is_closed():
logger.warning(
"[CDP-{}] Cached page is closed, clearing cache", self.host_id
)
self._cdp_context = None
self._cdp_page = None
# Check if browser connection is still alive
if self.browser is not None and not self.browser.is_connected():
logger.warning(
"[CDP-{}] Browser disconnected, clearing cache", self.host_id
)
self._cdp_context = None
self._cdp_page = None
raise errors.BrowserConnectionError(
f"CDP browser disconnected for host {self.host_id}",
details={"host_id": self.host_id},
)
if self._cdp_context is None or self._cdp_page is None:
browser = self._require_browser()
logger.info(
"[CDP-{}] First access - querying browser.contexts", self.host_id
)
contexts = browser.contexts
logger.info("[CDP-{}] Got {} contexts", self.host_id, len(contexts))
if not contexts:
raise errors.BrowserConnectionError(
f"No contexts available in CDP browser for host {self.host_id}",
details={"host_id": self.host_id},
)
context = contexts[0]
pages = context.pages
logger.info(
"[CDP-{}] Got {} pages: {}",
self.host_id,
len(pages),
[p.url for p in pages],
)
if not pages:
raise errors.BrowserConnectionError(
f"No pages available in CDP browser context for host {self.host_id}",
details={"host_id": self.host_id},
)
non_devtools_pages = [
p for p in pages if not p.url.startswith("devtools://")
]
if not non_devtools_pages:
raise errors.BrowserConnectionError(
"No application pages found in CDP browser (only devtools pages)",
details={"host_id": self.host_id, "pages": [p.url for p in pages]},
)
self._cdp_context = context
self._cdp_page = non_devtools_pages[-1]
logger.info("[CDP-{}] Cached page: {}", self.host_id, self._cdp_page.url)
else:
logger.info(
"[CDP-{}] Using cached page: {}", self.host_id, self._cdp_page.url
)
# Assert non-None for type narrowing (values are either cached or were just assigned above)
assert self._cdp_context is not None, "CDP context should be initialized"
assert self._cdp_page is not None, "CDP page should be initialized"
# Clear storage if isolate mode is enabled for this host
if self.host_config.isolate:
await self._clear_cdp_storage(self._cdp_context, self._cdp_page)
return self._cdp_context, self._cdp_page, False
async def _allocate_headless_page(
self,
storage_state: StorageState | str | Path | None = None,
) -> tuple[BrowserContext, Page, bool]:
"""Allocate headless page with optional storage state for session restoration."""
browser = self._require_browser()
context = await browser.new_context(storage_state=storage_state)
page = await context.new_page()
return context, page, True
async def _clear_cdp_storage(self, context: BrowserContext, page: Page) -> None:
"""Clear cookies and storage for CDP isolation mode.
This provides session isolation without full context recreation,
avoiding the page refresh that would occur from re-querying contexts.
"""
logger.debug("[CDP-{}] Clearing storage for isolation", self.host_id)
await context.clear_cookies()
await page.evaluate("localStorage.clear(); sessionStorage.clear();")
def _require_browser(self) -> Browser:
if self.browser is None:
raise errors.BrowserConnectionError(
f"Browser not initialized for host {self.host_id}",
details={"host_id": self.host_id},
)
return self.browser
@property
def cached_cdp_page(self) -> Page | None:
"""Get the cached CDP page if available."""
return self._cdp_page
async def close(self) -> None:
"""Close the browser connection."""
with contextlib.suppress(Exception):
await self.browser.close()
if self.browser:
await self.browser.close()
with contextlib.suppress(Exception):
if self.extension_client:
await self.extension_client.close()
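The allocation methods above now return a three-element tuple, and the `should_close` flag tells the caller whether it owns the context. A minimal sketch of how a caller might honor that flag, assuming a hypothetical `allocated_page` context manager and fake allocator objects (none of these names appear in the change itself):

```python
import asyncio
from contextlib import asynccontextmanager


class FakeContext:
    """Hypothetical stand-in for a browser context."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class FakeAllocator:
    """Hypothetical allocator returning a fixed (context, page, should_close) tuple."""

    def __init__(self, context, should_close: bool) -> None:
        self.context = context
        self.should_close = should_close

    async def allocate_context_and_page(self):
        return self.context, "page", self.should_close


@asynccontextmanager
async def allocated_page(allocator):
    context, page, should_close = await allocator.allocate_context_and_page()
    try:
        yield page
    finally:
        # Only headless contexts are owned by the caller; CDP contexts are
        # cached for reuse and extension hosts have no context at all.
        if should_close and context is not None:
            await context.close()


async def use(allocator) -> str:
    async with allocated_page(allocator) as page:
        return page


headless = FakeAllocator(FakeContext(), should_close=True)
cdp = FakeAllocator(FakeContext(), should_close=False)
extension = FakeAllocator(None, should_close=False)
results = [asyncio.run(use(a)) for a in (headless, cdp, extension)]
```

Wrapping the tuple in a context manager keeps the close decision in one place, so individual actions do not need to know which host kind they are running against.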
class BrowserPool:
@@ -94,17 +271,43 @@ class BrowserPool:
self._instances: dict[str, BrowserInstance] = {}
self._playwright: Playwright | None = None
self._closed: bool = False
# Per-host locks to prevent race conditions during concurrent allocation
self._locks: dict[str, asyncio.Lock] = {}
async def initialize(self) -> None:
"""Initialize the browser pool.
Starts the Playwright instance. Browser connections are created lazily
on first request to avoid startup delays.
Starts the Playwright instance and eagerly connects to all CDP hosts
to avoid page refreshes during user interactions.
"""
if self._playwright is not None:
return
# Warn if no browser hosts configured
if not self.settings.browser_hosts:
logger.warning(
"No browser hosts configured. Actions requiring browser access will fail."
)
self._playwright = await async_playwright().start()
_logger.info("Browser pool initialized")
# Eagerly connect to all CDP hosts to avoid refresh on first use
for host_id, host_config in self.settings.browser_hosts.items():
if host_config.kind == HostKind.CDP:
try:
instance = await self._create_instance(host_id, host_config)
self._instances[host_id] = instance
# Eagerly cache the page reference to avoid querying on first request
_ = await instance.allocate_context_and_page()
logger.info(
"Eagerly connected to CDP host '{}' and cached page", host_id
)
except Exception as exc:
logger.warning(
"Failed to eagerly connect to CDP host '{}': {}", host_id, exc
)
logger.info("Browser pool initialized")
async def close(self) -> None:
"""Close all browser connections and the Playwright instance."""
@@ -121,30 +324,33 @@ class BrowserPool:
with contextlib.suppress(Exception):
await self._playwright.stop()
self._playwright = None
_logger.info("Browser pool closed")
logger.info("Browser pool closed")
async def allocate_context_and_page(
self, host_id: str | None = None
) -> tuple[BrowserContext, Page]:
self,
host_id: str | None = None,
storage_state: StorageState | str | Path | None = None,
) -> tuple[BrowserContext | None, PageLike, bool]:
"""Allocate a fresh context and page for the specified host.
Lazily creates browser connections on first request per host.
Automatically reconnects if the connection has gone stale.
Args:
host_id: The host identifier, or None for the default host
storage_state: Optional Playwright storage_state to initialize context.
Only applies to headless mode.
Returns:
Tuple of (context, page) - caller must close context when done
Tuple of (context, page, should_close):
- context: Browser context (None for extension hosts)
- page: Page instance
- should_close: True if context should be closed after use, False for CDP/extension
Raises:
ConfigError: If the host_id is invalid or not configured
BrowserConnectionError: If the browser connection fails
"""
if self._playwright is None:
raise errors.ConfigError(
"Browser pool not initialized. Call initialize() first."
)
resolved_id = host_id or self.settings.default_browser_host_id
host_config = self.settings.browser_hosts.get(resolved_id)
if not host_config:
@@ -153,57 +359,164 @@ class BrowserPool:
f"Unknown browser host '{resolved_id}'. Known: {known}"
)
# Get or create the browser instance for this host
if resolved_id not in self._instances:
instance = await self._create_instance(resolved_id, host_config)
self._instances[resolved_id] = instance
# Extension hosts don't require Playwright initialization
if host_config.kind != HostKind.EXTENSION and self._playwright is None:
raise errors.ConfigError(
"Browser pool not initialized. Call initialize() first."
)
return await self._instances[resolved_id].allocate_context_and_page()
# Get or create per-host lock to prevent race conditions
if resolved_id not in self._locks:
self._locks[resolved_id] = asyncio.Lock()
async with self._locks[resolved_id]:
# Get or create the browser instance for this host
if resolved_id not in self._instances:
instance = await self._create_instance(resolved_id, host_config)
self._instances[resolved_id] = instance
try:
return await self._instances[resolved_id].allocate_context_and_page(
storage_state=storage_state
)
except (TargetClosedError, errors.BrowserConnectionError) as exc:
# Connection is stale - evict and reconnect once
logger.warning(
"Stale connection detected for host '{}', reconnecting: {}",
resolved_id,
exc,
)
await self._evict_instance(resolved_id)
instance = await self._create_instance(resolved_id, host_config)
self._instances[resolved_id] = instance
try:
return await instance.allocate_context_and_page(
storage_state=storage_state
)
except (TargetClosedError, errors.BrowserConnectionError) as retry_exc:
# Second failure - evict again to prevent lockout on future requests
await self._evict_instance(resolved_id)
raise errors.BrowserConnectionError(
f"Host '{resolved_id}' unavailable after reconnection attempt",
details={
"host_id": resolved_id,
"host_kind": host_config.kind.value,
"original_error": str(exc),
"retry_error": str(retry_exc),
},
) from retry_exc
async def _evict_instance(self, host_id: str) -> None:
"""Evict and close a stale browser instance."""
if host_id in self._instances:
instance = self._instances.pop(host_id)
with contextlib.suppress(Exception):
await instance.close()
logger.info("Evicted stale browser instance for host '{}'", host_id)
async def _create_instance(
self, host_id: str, host_config: BrowserHostConfig
) -> BrowserInstance:
"""Create a new browser instance for the given host."""
assert self._playwright is not None
if host_config.kind == HostKind.EXTENSION:
# Create and start extension client
port = host_config.port or DEFAULT_EXTENSION_PORT
extension_client = ExtensionClient(port=port)
try:
await extension_client.start()
except errors.BrowserConnectionError as exc:
await extension_client.close()
raise errors.BrowserConnectionError(
f"Cannot start extension host '{host_id}'",
details={
"host_id": host_id,
"host": "0.0.0.0",
"port": port,
"reason": exc.message,
**exc.details,
},
) from exc
except (
Exception
) as exc: # pragma: no cover - safety net for unexpected failures
await extension_client.close()
raise errors.BrowserConnectionError(
f"Cannot start extension host '{host_id}'",
details={"host_id": host_id, "host": "0.0.0.0", "port": port},
) from exc
instance = BrowserInstance(
host_id, host_config, browser=None, extension_client=extension_client
)
logger.info(
"Created extension client instance for host '{}' ({})",
host_id,
host_config.kind,
)
return instance
if self._playwright is None:
raise errors.ConfigError(
"Browser pool not initialized. Call initialize() first."
)
if host_config.kind == HostKind.CDP:
browser = await self._connect_cdp(host_config)
else:
browser = await self._launch_headless(host_config)
instance = BrowserInstance(host_id, host_config, browser)
_logger.info(
f"Created browser instance for host '{host_id}' ({host_config.kind})"
instance = BrowserInstance(host_id, host_config, browser=browser)
logger.info(
"Created browser instance for host '{}' ({})", host_id, host_config.kind
)
return instance
async def _connect_cdp(self, host_config: BrowserHostConfig) -> Browser:
"""Connect to a CDP host."""
assert self._playwright is not None
"""Connect to a CDP host.
if not host_config.host or host_config.port is None:
raise errors.ConfigError("CDP host requires 'host' and 'port' fields.")
Supports either a host/port pair (Chrome-style /json/version) or an
explicit websocket/CDP URL for gateways like Browserless that return a
non-routable `webSocketDebuggerUrl`.
"""
if self._playwright is None:
raise errors.ConfigError(
"Browser pool not initialized. Call initialize() first."
)
target_url = host_config.cdp_url
details: dict[str, object] = {}
if not target_url:
if not host_config.host or host_config.port is None:
raise errors.ConfigError(
"CDP host requires 'host' and 'port' fields when cdp_url is not provided."
)
target_url = f"http://{host_config.host}:{host_config.port}"
details |= {"host": host_config.host, "port": host_config.port}
else:
details["cdp_url"] = target_url
cdp_url = f"http://{host_config.host}:{host_config.port}"
try:
browser = await self._playwright.chromium.connect_over_cdp(cdp_url)
_logger.info(f"Connected to CDP endpoint: {cdp_url}")
browser = await self._playwright.chromium.connect_over_cdp(target_url)
logger.info("Connected to CDP endpoint: {}", target_url)
return browser
except Exception as exc:
raise errors.BrowserConnectionError(
f"Cannot connect to CDP endpoint {cdp_url}",
details={"host": host_config.host, "port": host_config.port},
f"Cannot connect to CDP endpoint {target_url}", details=details
) from exc
async def _launch_headless(self, host_config: BrowserHostConfig) -> Browser:
"""Launch a headless browser."""
assert self._playwright is not None
if self._playwright is None:
raise errors.ConfigError(
"Browser pool not initialized. Call initialize() first."
)
browser_type = self._resolve_browser_type(host_config.browser)
try:
browser = await browser_type.launch(headless=True)
_logger.info(
f"Launched headless browser: {host_config.browser or 'chromium'}"
logger.info(
"Launched headless browser: {}", host_config.browser or "chromium"
)
return browser
except Exception as exc:
@@ -214,7 +527,10 @@ class BrowserPool:
def _resolve_browser_type(self, browser: str | None):
"""Resolve browser type from configuration."""
assert self._playwright is not None
if self._playwright is None:
raise errors.ConfigError(
"Browser pool not initialized. Call initialize() first."
)
desired = (browser or "chromium").lower()
if desired == "chromium":
@@ -225,5 +541,30 @@ class BrowserPool:
return self._playwright.webkit
raise errors.ConfigError(f"Unsupported browser type '{browser}'")
def get_instance(self, host_id: str) -> BrowserInstance | None:
"""Get the browser instance for a host if it exists.
Args:
host_id: The host identifier
Returns:
The BrowserInstance if it exists, None otherwise
"""
return self._instances.get(host_id)
def get_cached_cdp_page(self, host_id: str) -> Page | None:
"""Get the cached CDP page for a host if available.
Args:
host_id: The host identifier
Returns:
The cached Page if available, None otherwise
"""
instance = self._instances.get(host_id)
if instance is None:
return None
return instance.cached_cdp_page
__all__ = ["BrowserPool", "BrowserInstance"]
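The pool's allocation path combines two ideas: a per-host lock so concurrent requests do not create duplicate connections, and a single evict-and-reconnect attempt when the cached connection turns out to be stale. A minimal sketch of that control flow follows, assuming hypothetical `PoolSketch`, `FakeInstance`, and `StaleConnection` names in place of the real pool, instance, and Playwright errors:

```python
import asyncio


class StaleConnection(Exception):
    """Hypothetical stand-in for TargetClosedError / BrowserConnectionError."""


class FakeInstance:
    def __init__(self, healthy: bool) -> None:
        self.healthy = healthy

    async def allocate(self) -> str:
        if not self.healthy:
            raise StaleConnection("target closed")
        return "page"


class PoolSketch:
    def __init__(self, connect) -> None:
        self._connect = connect  # async callable: host_id -> FakeInstance
        self._instances: dict[str, FakeInstance] = {}
        self._locks: dict[str, asyncio.Lock] = {}

    async def allocate(self, host_id: str) -> str:
        # No await between lookup and insert, so this is safe on one event loop.
        lock = self._locks.setdefault(host_id, asyncio.Lock())
        async with lock:
            if host_id not in self._instances:
                self._instances[host_id] = await self._connect(host_id)
            try:
                return await self._instances[host_id].allocate()
            except StaleConnection:
                # Evict the stale instance and reconnect exactly once.
                self._instances.pop(host_id, None)
                instance = await self._connect(host_id)
                self._instances[host_id] = instance
                try:
                    return await instance.allocate()
                except StaleConnection:
                    # Evict again so the next request starts from a clean slate.
                    self._instances.pop(host_id, None)
                    raise


def make_connect(stale_first: bool, log: list[str]):
    async def connect(host_id: str) -> FakeInstance:
        await asyncio.sleep(0)  # yield, as a real connection attempt would
        healthy = not (stale_first and not log)
        log.append(host_id)
        return FakeInstance(healthy)

    return connect


async def demo():
    retry_log: list[str] = []
    retried = await PoolSketch(make_connect(True, retry_log)).allocate("cdp")

    shared_log: list[str] = []
    pool = PoolSketch(make_connect(False, shared_log))
    pages = await asyncio.gather(pool.allocate("cdp"), pool.allocate("cdp"))
    return retried, retry_log, pages, shared_log


retried, retry_log, pages, shared_log = asyncio.run(demo())
```

In the first scenario the connect function is called twice (stale, then healthy); in the second, two concurrent requests share one connection because the second waits on the lock until the first has stored the instance.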


@@ -0,0 +1,136 @@
"""Type definitions for browser automation interfaces."""
from typing import Literal, Protocol, TypeGuard, runtime_checkable
class PageLocator(Protocol):
"""Protocol for locator objects returned by page.locator()."""
async def count(self) -> int:
"""Get the count of matching elements."""
...
@property
def first(self) -> "PageLocator":
"""Get the first matching element."""
...
def nth(self, index: int) -> "PageLocator":
"""Get the nth matching element."""
...
def locator(self, selector: str) -> "PageLocator":
"""Create a locator for a descendant element."""
...
async def text_content(self) -> str | None:
"""Get the text content of the element."""
...
async def click(self, *, timeout: int | float | None = None) -> None:
"""Click the element."""
...
async def fill(self, value: str, *, timeout: int | float | None = None) -> None:
"""Fill the element with a value."""
...
async def wait_for(
self,
*,
state: Literal["attached", "detached", "hidden", "visible"] | None = None,
timeout: int | float | None = None,
) -> None:
"""Wait for the element to match the given state."""
...
@runtime_checkable
class PageLike(Protocol):
"""Protocol for page-like objects that support common browser automation operations.
This protocol allows both Playwright's Page and ExtensionPage to be used
interchangeably in actions and auth functions.
"""
@property
def url(self) -> str:
"""Get current page URL."""
...
async def fill(self, selector: str, value: str) -> None:
"""Fill an input element with a value."""
...
async def click(self, selector: str) -> None:
"""Click an element matching the selector."""
...
async def goto(
self,
url: str,
*,
timeout: float | None = None,
wait_until: Literal["commit", "domcontentloaded", "load", "networkidle"]
| None = None,
referer: str | None = None,
) -> object | None:
"""Navigate to a URL."""
...
async def evaluate(self, expression: str, arg: object = None) -> object:
"""Evaluate JavaScript expression in page context."""
...
def locator(self, selector: str) -> PageLocator:
"""Create a locator for the given selector."""
...
async def wait_for_selector(
self,
selector: str,
*,
timeout: float | None = None,
state: Literal["attached", "detached", "hidden", "visible"] | None = None,
strict: bool | None = None,
) -> object | None:
"""Wait for an element matching selector to appear."""
...
async def wait_for_timeout(self, timeout: int) -> None:
"""Wait for a specified timeout in milliseconds."""
...
async def wait_for_load_state(
self,
state: Literal["domcontentloaded", "load", "networkidle"] | None = None,
*,
timeout: float | None = None,
) -> None:
"""Wait for page load state."""
...
async def content(self) -> str:
"""Get page HTML content."""
...
async def screenshot(self, *, full_page: bool = False) -> bytes:
"""Capture page screenshot."""
...
@runtime_checkable
class PageWithTrustedClick(PageLike, Protocol):
"""Page-like object that supports trusted_click (extension-only)."""
async def trusted_click(self, x: float, y: float) -> None:
"""Perform a trusted click at specific coordinates."""
...
def supports_trusted_click(page: PageLike) -> TypeGuard["PageWithTrustedClick"]:
"""Type guard for objects that implement trusted_click."""
return hasattr(page, "trusted_click")
__all__ = ["PageLike", "PageLocator", "PageWithTrustedClick", "supports_trusted_click"]


@@ -0,0 +1,34 @@
"""Browser utility functions.
Shared utilities used across browser automation modules.
These are placed in a neutral location to avoid circular imports.
"""
def escape_selector(selector: str) -> str:
"""Escape a CSS selector for safe use in JavaScript code.
Handles backslashes, single quotes, and double quotes.
Args:
selector: Raw CSS selector string
Returns:
Escaped selector safe for JS string interpolation
"""
return selector.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')
def escape_js_string(value: str) -> str:
"""Escape a string for safe use in JavaScript single-quoted strings.
Args:
value: Raw string value
Returns:
Escaped string safe for JS interpolation
"""
return value.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')
__all__ = ["escape_selector", "escape_js_string"]
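A short sketch of how these escapes behave when a selector is interpolated into a JavaScript snippet. `escape_js_string` is copied from the module above; `query_selector_js` and `query_selector_js_json` are hypothetical helpers, and the `json.dumps` variant is shown only as a possible alternative, not as something this change uses.

```python
import json


def escape_js_string(value: str) -> str:
    # Backslashes first, so the backslashes added for quotes are not doubled.
    return value.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')


def query_selector_js(selector: str) -> str:
    """Hypothetical helper: embed a selector in a single-quoted JS literal."""
    return f"document.querySelector('{escape_js_string(selector)}')"


def query_selector_js_json(selector: str) -> str:
    """Alternative: json.dumps emits a double-quoted literal and escapes newlines."""
    return f"document.querySelector({json.dumps(selector)})"


print(query_selector_js("input[name='q']"))
print(query_selector_js_json("input[name='q']"))
```

The replace-based escape leaves newlines untouched, so a value containing a line break would still produce an invalid single-quoted literal; `json.dumps` covers that case at the cost of always using double quotes.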


@@ -6,78 +6,102 @@ and verifying visual stability before proceeding with actions.
import asyncio
from playwright.async_api import Page, TimeoutError as PlaywrightTimeoutError
from playwright.async_api import TimeoutError as PlaywrightTimeoutError
from guide.app import errors
from guide.app.browser.types import PageLike
from guide.app.core.config import Timeouts
from guide.app.utils.retry import retry_async
_DEFAULT_TIMEOUTS = Timeouts()
async def wait_for_selector(
page: Page,
page: PageLike,
selector: str,
timeout_ms: int = 5000,
timeout_ms: int | None = None,
*,
timeouts: Timeouts | None = None,
) -> None:
"""Wait for an element matching selector to be present in DOM.
Args:
page: The Playwright page instance
selector: CSS or Playwright selector string
timeout_ms: Maximum time to wait in milliseconds (default: 5000)
timeout_ms: Maximum time to wait in milliseconds
timeouts: Optional Timeouts configuration to use for defaults
Raises:
GuideError: If selector not found within timeout
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
max_wait = (
timeout_ms if timeout_ms is not None else effective_timeouts.element_default
)
try:
_ = await page.wait_for_selector(selector, timeout=timeout_ms)
_ = await page.wait_for_selector(selector, timeout=max_wait)
except PlaywrightTimeoutError as exc:
msg = f"Selector '{selector}' not found within {timeout_ms}ms"
msg = f"Selector '{selector}' not found within {max_wait}ms"
raise errors.GuideError(msg) from exc
async def wait_for_navigation(
page: Page,
timeout_ms: int = 5000,
page: PageLike,
timeout_ms: int | None = None,
*,
timeouts: Timeouts | None = None,
) -> None:
"""Wait for page navigation to complete.
Args:
page: The Playwright page instance
timeout_ms: Maximum time to wait in milliseconds (default: 5000)
timeout_ms: Maximum time to wait in milliseconds
timeouts: Optional Timeouts configuration to use for defaults
Raises:
GuideError: If navigation does not complete within timeout
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
max_wait = timeout_ms if timeout_ms is not None else effective_timeouts.network_idle
try:
await page.wait_for_load_state("networkidle", timeout=timeout_ms)
await page.wait_for_load_state("networkidle", timeout=max_wait)
except PlaywrightTimeoutError as exc:
msg = f"Page navigation did not complete within {timeout_ms}ms"
msg = f"Page navigation did not complete within {max_wait}ms"
raise errors.GuideError(msg) from exc
async def wait_for_network_idle(
page: Page,
timeout_ms: int = 5000,
page: PageLike,
timeout_ms: int | None = None,
*,
timeouts: Timeouts | None = None,
) -> None:
"""Wait for network to become idle (no active requests).
Args:
page: The Playwright page instance
timeout_ms: Maximum time to wait in milliseconds (default: 5000)
timeout_ms: Maximum time to wait in milliseconds
timeouts: Optional Timeouts configuration to use for defaults
Raises:
GuideError: If network does not idle within timeout
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
max_wait = timeout_ms if timeout_ms is not None else effective_timeouts.network_idle
try:
await page.wait_for_load_state("networkidle", timeout=timeout_ms)
await page.wait_for_load_state("networkidle", timeout=max_wait)
except PlaywrightTimeoutError as exc:
msg = f"Network did not idle within {timeout_ms}ms"
msg = f"Network did not idle within {max_wait}ms"
raise errors.GuideError(msg) from exc
async def is_page_stable(
page: Page,
stability_check_ms: int = 500,
page: PageLike,
stability_check_ms: int | None = None,
samples: int = 3,
*,
timeouts: Timeouts | None = None,
) -> bool:
"""Check if page is visually stable (DOM not changing).
@@ -86,12 +110,19 @@ async def is_page_stable(
Args:
page: The Playwright page instance
stability_check_ms: Delay between samples in milliseconds (default: 500)
stability_check_ms: Delay between samples in milliseconds
samples: Number of stable samples required (default: 3)
timeouts: Optional Timeouts configuration to use for defaults
Returns:
True if page is stable, False otherwise
"""
effective_timeouts = timeouts or _DEFAULT_TIMEOUTS
interval_ms = (
stability_check_ms
if stability_check_ms is not None
else effective_timeouts.stability_check
)
try:
previous_content: str | None = None
@@ -102,19 +133,24 @@ async def is_page_stable(
return False
previous_content = current_content
await asyncio.sleep(stability_check_ms / 1000)
await asyncio.sleep(interval_ms / 1000)
return True
except Exception:
# If we can't check stability, assume page is stable
return True
except Exception as exc:
# If we can't check stability, log warning and return False to be safe
import logging
logging.getLogger(__name__).warning(f"Page stability check failed: {exc}")
return False
@retry_async(retries=3, delay_seconds=0.2)
async def wait_for_stable_page(
page: Page,
stability_check_ms: int = 500,
page: PageLike,
stability_check_ms: int | None = None,
samples: int = 3,
*,
timeouts: Timeouts | None = None,
) -> None:
"""Wait for page to become visually stable, with retries.
@@ -123,13 +159,19 @@ async def wait_for_stable_page(
Args:
page: The Playwright page instance
stability_check_ms: Delay between samples in milliseconds (default: 500)
stability_check_ms: Delay between samples in milliseconds
samples: Number of stable samples required (default: 3)
timeouts: Optional Timeouts configuration to use for defaults
Raises:
GuideError: If page does not stabilize after retries
"""
stable = await is_page_stable(page, stability_check_ms, samples)
stable = await is_page_stable(
page,
stability_check_ms,
samples,
timeouts=timeouts,
)
if not stable:
msg = "Page did not stabilize after retries"
raise errors.GuideError(msg)
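The stability check compares successive snapshots of the page and the wrapper retries it a few times before giving up. The loop body is elided from this hunk, so the following is only a sketch under stated assumptions: `read_content` is a hypothetical async callable standing in for whatever snapshot the real function takes, and `wait_until_stable` is a plain loop standing in for the `retry_async` decorator, whose exact behavior is not shown here.

```python
import asyncio


async def is_stable(read_content, interval_ms: int, samples: int = 3) -> bool:
    """Return True if `samples` consecutive snapshots are identical."""
    previous = None
    for _ in range(samples):
        current = await read_content()
        if previous is not None and current != previous:
            return False
        previous = current
        await asyncio.sleep(interval_ms / 1000)
    return True


async def wait_until_stable(
    read_content, interval_ms: int, samples: int = 3, retries: int = 3
) -> int:
    """Retry the stability check; return the attempt number that succeeded."""
    for attempt in range(1, retries + 1):
        if await is_stable(read_content, interval_ms, samples):
            return attempt
    raise RuntimeError("Page did not stabilize after retries")


def make_reader(snapshots: list[str]):
    """Yield the given snapshots in order, then repeat the last one forever."""
    it = iter(snapshots)

    async def read() -> str:
        return next(it, snapshots[-1])

    return read


def counting_reader():
    """A reader whose content changes on every call (never stable)."""
    state = {"n": 0}

    async def read() -> str:
        state["n"] += 1
        return str(state["n"])

    return read


attempts = asyncio.run(wait_until_stable(make_reader(["a", "b"]), interval_ms=1))
```

Here the first attempt sees the content change from "a" to "b" and fails, and the second attempt sees three identical snapshots and succeeds. A reader that never settles exhausts the retries and raises.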


@@ -4,18 +4,29 @@ import os
from enum import Enum
from pathlib import Path
from collections.abc import Mapping
from typing import ClassVar, TypeAlias, cast
from typing import ClassVar, Literal, Protocol, TypeAlias, TypeVar, cast
from pydantic import BaseModel, Field
from pydantic import AliasChoices, BaseModel, Field
from pydantic_settings import BaseSettings, SettingsConfigDict
from guide.app.models.boards.models import BoardConfig
from guide.app.models.personas.models import DemoPersona
class _ModelWithId(Protocol):
"""Protocol for models that have an id attribute."""
id: str
_T = TypeVar("_T", bound=BaseModel)
_logger = logging.getLogger(__name__)
CONFIG_DIR = Path(__file__).resolve().parents[4] / "config"
HOSTS_FILE = CONFIG_DIR / "hosts.yaml"
PERSONAS_FILE = CONFIG_DIR / "personas.yaml"
BOARDS_FILE = CONFIG_DIR / "boards.yaml"
JsonRecord: TypeAlias = dict[str, object]
RecordList: TypeAlias = list[JsonRecord]
@@ -27,20 +38,90 @@ def _coerce_mapping(mapping: Mapping[object, object]) -> dict[str, object]:
class HostKind(str, Enum):
"""Browser host kind: CDP or headless."""
"""Browser host kind: CDP, headless, or extension."""
CDP = "cdp"
HEADLESS = "headless"
EXTENSION = "extension"
# Default port for extension WebSocket server
DEFAULT_EXTENSION_PORT = 17373
class BrowserHostConfig(BaseModel):
"""Configuration for a browser host (CDP or headless)."""
"""Configuration for a browser host (CDP, headless, or extension).
For CDP hosts, the `isolate` flag controls session isolation:
- isolate=False (default): Reuse cached context/page for performance
- isolate=True: Clear storage between requests for session isolation
"""
id: str
kind: HostKind
host: str | None = None
port: int | None = None
port: int | None = Field(default=None, ge=1, le=65535)
cdp_url: str | None = None # explicit CDP endpoint override
browser: str | None = None # chromium/firefox/webkit for headless
isolate: bool = False
"""For CDP hosts: clear cookies/storage between requests for isolation."""
class Timeouts(BaseModel):
"""Centralized timeout configuration for browser interactions.
All timeout values are in milliseconds unless otherwise noted.
"""
# Browser element operations (milliseconds)
element_default: int = 5000
"""Default timeout for element wait operations."""
network_idle: int = 5000
"""Timeout for network idle detection."""
animation: int = 300
"""Animation transition duration."""
stability_check: int = 500
"""Interval for page stability checks."""
# UI component timeouts (milliseconds)
listbox_wait: int = 600
"""Timeout for dropdown listbox appearance."""
dropdown_field: int = 1000
"""Timeout for dropdown field interactions."""
dropdown_icon: int = 500
"""Timeout for dropdown icon/indicator interactions."""
combobox_listbox: int = 2500
"""Extended timeout for combobox listbox population."""
# Extension client timeouts (seconds for consistency with asyncio)
extension_connection_s: float = 10.0
"""Timeout waiting for extension to connect."""
extension_command_s: float = 30.0
"""Timeout for extension command/eval execution."""
extension_trusted_click_s: float = 10.0
"""Timeout for trusted click operations."""
websocket_receive_s: float = 60.0
"""Timeout for WebSocket message receive (triggers ping)."""
# GraphQL/HTTP timeouts (seconds)
graphql_base_s: float = 10.0
"""Base HTTP client timeout for GraphQL."""
graphql_operation_s: float = 30.0
"""Extended timeout for GraphQL operations."""
# Debounce delays (milliseconds)
debounce_network: int = 100
"""Delay for network request stabilization."""
debounce_typeahead: int = 150
"""Delay for typeahead input debouncing."""
scroll_wait: int = 50
"""Brief pause for smooth scrolling animations."""
# Auth flow timeouts (seconds)
auth_stability_s: float = 8.0
"""Page stability check after Auth0 redirects."""
otp_callback_default_s: float = 120.0
"""Default timeout for OTP callback wait."""
class AppSettings(BaseSettings):
@@ -57,13 +138,79 @@ class AppSettings(BaseSettings):
model_config: ClassVar[SettingsConfigDict] = SettingsConfigDict(
env_prefix="RAINDROP_DEMO_",
case_sensitive=False,
env_file=".env",
env_file_encoding="utf-8",
extra="ignore",
)
raindrop_base_url: str = "https://app.raindrop.com"
raindrop_graphql_url: str = "https://app.raindrop.com/graphql"
default_browser_host_id: str = "demo-cdp"
# Raindrop URLs (use RAINDROP_STAGING_* env vars)
raindrop_base_url: str = Field(
default="https://stg.raindrop.com",
validation_alias=AliasChoices("RAINDROP_STAGING_BASE_URL", "raindrop_base_url"),
)
"""Base URL for Raindrop app (login page, etc.)"""
raindrop_graphql_url: str = Field(
default="https://raindrop-staging.hasura.app/v1/graphql",
validation_alias=AliasChoices(
"RAINDROP_STAGING_GRAPHQL_URL", "raindrop_graphql_url"
),
)
"""GraphQL API endpoint."""
# Browser configuration
default_browser_host_id: str = "browserless-cdp"
browser_hosts: dict[str, BrowserHostConfig] = Field(default_factory=dict)
personas: dict[str, DemoPersona] = Field(default_factory=dict)
boards: dict[str, BoardConfig] = Field(default_factory=dict)
timeouts: Timeouts = Field(default_factory=Timeouts)
# Session Management
session_storage_dir: Path = Field(default=Path(".sessions"))
"""Directory to store session files (relative to project root)."""
session_ttl_minutes: int = Field(default=60)
"""Session time-to-live in minutes before requiring re-authentication."""
session_auto_persist: bool = Field(default=True)
"""Automatically save sessions after successful login."""
# n8n Integration (use RAINDROP_DEMO_N8N_* env vars)
n8n_webhook_url: str | None = Field(default=None)
"""Webhook URL for n8n OTP request notifications."""
n8n_otp_callback_timeout: int = Field(default=300)
"""Timeout in seconds waiting for OTP callback from n8n (5 min default)."""
n8n_otp_email_delay: float = Field(default=15.0)
"""Delay in seconds after triggering OTP email before notifying n8n."""
callback_base_url: str = Field(default="http://localhost:8765")
"""Base URL for callback endpoints (used by n8n to call back)."""
# Auth0 JWT Verification (use RAINDROP_DEMO_AUTH0_* env vars)
auth0_domain: str | None = Field(default=None)
"""Auth0 tenant domain for JWT signature verification (e.g., 'your-tenant.auth0.com')."""
auth0_audience: str | None = Field(default=None)
"""Expected audience claim in JWT (optional, validates if set)."""
# OTP Store Backend (use RAINDROP_DEMO_OTP_* or RAINDROP_DEMO_REDIS_* env vars)
otp_store_backend: Literal["memory", "redis"] = Field(default="memory")
"""OTP callback store backend: 'memory' (single-worker) or 'redis' (multi-worker)."""
redis_url: str | None = Field(default=None)
"""Redis connection URL for shared state (required if otp_store_backend='redis')."""
# Demonstration Board Configuration (overridable via env vars)
demonstration_board_id: int = Field(default=596)
"""Board ID for the Demonstration board (DEMON). Override for different environments."""
demonstration_instance_id: int = Field(default=107)
"""Instance ID for the Demonstration board. Override for different environments."""
# LLM Configuration (OpenAI-compatible endpoint)
llm_base_url: str = Field(default="http://bifrost.lab/v1")
"""Base URL for OpenAI-compatible LLM endpoint."""
llm_model: str = Field(
default="fireworks_ai/accounts/fireworks/models/qwen3-235b-a22b-instruct-2507"
)
"""Model identifier for LLM requests."""
llm_api_key: str | None = Field(default=None)
"""API key for LLM endpoint (optional if endpoint doesn't require auth)."""
llm_timeout_s: float = Field(default=60.0)
"""Timeout in seconds for LLM requests."""
def _load_yaml_file(path: Path) -> dict[str, object]:
@@ -180,12 +327,46 @@ def _normalize_persona_records(data: object) -> RecordList:
return records
def _normalize_board_records(data: object) -> dict[str, JsonRecord]:
"""Normalize board records from YAML format.
Processes data into a dict of typed records keyed by board name.
Board records use the key as the board name identifier.
"""
# Extract content from wrapper mapping if present
content: object
if isinstance(data, Mapping):
data_mapping = cast(Mapping[object, object], data)
mapping = _coerce_mapping(data_mapping)
content = mapping.get("boards", mapping)
else:
content = data
records: dict[str, JsonRecord] = {}
# Process mapping format (dict of boards keyed by name)
if isinstance(content, Mapping):
content_mapping = cast(Mapping[object, object], content)
mapping_content = _coerce_mapping(content_mapping)
for key, value in mapping_content.items():
# Check if value is a mapping before using it as one
if isinstance(value, Mapping):
value_mapping = cast(Mapping[object, object], value)
record: JsonRecord = _coerce_mapping(value_mapping)
else:
record = {}
# Store by board name (key in YAML)
records[key] = record
return records
def load_settings() -> AppSettings:
"""Load application settings from files and environment.
Configuration is loaded in this order (later overrides earlier):
1. Default values in AppSettings model
2. YAML files (config/hosts.yaml, config/personas.yaml)
2. YAML files (config/hosts.yaml, config/personas.yaml, config/boards.yaml)
3. Environment variables (RAINDROP_DEMO_* prefix)
4. JSON overrides via environment (RAINDROP_DEMO_BROWSER_HOSTS_JSON, etc.)
"""
@@ -194,6 +375,7 @@ def load_settings() -> AppSettings:
# Load from YAML files
hosts_data = _load_yaml_file(HOSTS_FILE)
personas_data = _load_yaml_file(PERSONAS_FILE)
boards_data = _load_yaml_file(BOARDS_FILE)
hosts_dict: dict[str, BrowserHostConfig] = {}
for record in _normalize_host_records(hosts_data):
@@ -205,49 +387,74 @@ def load_settings() -> AppSettings:
persona = DemoPersona.model_validate(record)
personas_dict[persona.id] = persona
boards_dict: dict[str, BoardConfig] = {}
for name, record in _normalize_board_records(boards_data).items():
board = BoardConfig.model_validate(record)
boards_dict[name] = board
# Load JSON overrides from environment
if browser_hosts_json := os.environ.get("RAINDROP_DEMO_BROWSER_HOSTS_JSON"):
try:
# Validate JSON is a list and process each record
decoded = cast(object, json.loads(browser_hosts_json))
if not isinstance(decoded, list):
raise ValueError(
"RAINDROP_DEMO_BROWSER_HOSTS_JSON must be a JSON array"
)
# Iterate only over validated list
decoded_list = cast(list[object], decoded)
for item in decoded_list:
if isinstance(item, Mapping):
host = BrowserHostConfig.model_validate(item)
hosts_dict[host.id] = host
_load_json_override_records(
browser_hosts_json,
"RAINDROP_DEMO_BROWSER_HOSTS_JSON must be a JSON array",
BrowserHostConfig,
hosts_dict,
)
except (json.JSONDecodeError, ValueError) as exc:
_logger.warning(f"Failed to parse RAINDROP_DEMO_BROWSER_HOSTS_JSON: {exc}")
if personas_json := os.environ.get("RAINDROP_DEMO_PERSONAS_JSON"):
try:
# Validate JSON is a list and process each record
decoded = cast(object, json.loads(personas_json))
if not isinstance(decoded, list):
raise ValueError("RAINDROP_DEMO_PERSONAS_JSON must be a JSON array")
# Iterate only over validated list
decoded_list = cast(list[object], decoded)
for item in decoded_list:
if isinstance(item, Mapping):
persona = DemoPersona.model_validate(item)
personas_dict[persona.id] = persona
_load_json_override_records(
personas_json,
"RAINDROP_DEMO_PERSONAS_JSON must be a JSON array",
DemoPersona,
personas_dict,
)
except (json.JSONDecodeError, ValueError) as exc:
_logger.warning(f"Failed to parse RAINDROP_DEMO_PERSONAS_JSON: {exc}")
# Update settings with loaded configuration
if hosts_dict or personas_dict:
if hosts_dict or personas_dict or boards_dict:
settings = settings.model_copy(
update={
"browser_hosts": hosts_dict or settings.browser_hosts,
"personas": personas_dict or settings.personas,
"boards": boards_dict or settings.boards,
}
)
_logger.info(
f"Loaded {len(settings.browser_hosts)} browser hosts, {len(settings.personas)} personas"
f"Loaded {len(settings.browser_hosts)} browser hosts, {len(settings.personas)} personas, {len(settings.boards)} boards"
)
return settings
def _load_json_override_records(
json_str: str,
error_message: str,
model_class: type[_T],
target_dict: dict[str, _T],
) -> None:
"""Extract and validate records from JSON string into target dictionary.
Args:
json_str: JSON string containing an array of records
error_message: Error message to raise if JSON is not an array
model_class: Pydantic model class to validate each record
target_dict: Dictionary to update with validated records
"""
# Validate JSON is a list and process each record
decoded = cast(object, json.loads(json_str))
if not isinstance(decoded, list):
raise ValueError(error_message)
# Iterate only over validated list
decoded_list = cast(list[object], decoded)
for item in decoded_list:
if isinstance(item, Mapping):
record = model_class.model_validate(item)
# Both BrowserHostConfig and DemoPersona have id: str per contract
# Cast through object to satisfy type checker (structural typing enforced at runtime)
record_with_id = cast(_ModelWithId, cast(object, record))
target_dict[record_with_id.id] = record
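
The override helper above parses a JSON array, skips non-mapping entries, and merges each validated record into the target dictionary under its `id`. A minimal stdlib-only sketch of that merge behaviour follows; `HostStub` and `load_overrides` are hypothetical stand-ins for illustration (a dataclass replaces the Pydantic model), not names from this change.

```python
import json
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass
class HostStub:
    """Hypothetical stand-in for a Pydantic model such as BrowserHostConfig."""

    id: str
    kind: str


def load_overrides(json_str: str, target: dict[str, HostStub]) -> None:
    """Merge a JSON array of records into target, keyed by each record's id."""
    decoded = json.loads(json_str)
    if not isinstance(decoded, list):
        raise ValueError("override must be a JSON array")
    for item in decoded:
        # Entries that are not JSON objects are skipped silently.
        if isinstance(item, Mapping):
            record = HostStub(**item)
            # A record with an existing id replaces the earlier entry.
            target[record.id] = record


hosts = {"demo-cdp": HostStub(id="demo-cdp", kind="cdp")}
load_overrides(
    '[{"id": "demo-cdp", "kind": "headless"}, "ignored", {"id": "ext", "kind": "extension"}]',
    hosts,
)
print(sorted(hosts))
print(hosts["demo-cdp"].kind)
```

Because entries are merged by `id`, an override can replace a single YAML-defined host without restating the others; a non-array payload raises `ValueError`, which the caller in the diff logs as a warning.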


@@ -1,9 +1,17 @@
"""Centralized loguru logging configuration.
Provides structured JSON file logging and pretty console output with
request context injection via context variables.
"""
import contextvars
import json
import logging
import sys
from pathlib import Path
from typing import override
from loguru import logger
class _ContextVars:
"""Container for request-scoped logging context variables."""
@@ -22,66 +30,109 @@ class _ContextVars:
)
class ContextJsonFormatter(logging.Formatter):
"""JSON formatter that includes request context variables in every log entry."""
class InterceptHandler(logging.Handler):
"""Redirect standard library logging to loguru."""
@override
def format(self, record: logging.LogRecord) -> str:
"""Format the log record as JSON with context variables."""
log_data: dict[str, object] = {
"timestamp": self.formatTime(record, datefmt="%Y-%m-%dT%H:%M:%S"),
"level": record.levelname,
"logger": record.name,
"module": record.module,
"line": record.lineno,
"msg": record.getMessage(),
}
def emit(self, record: logging.LogRecord) -> None:
"""Intercept logging records and forward to loguru."""
try:
level = logger.level(record.levelname).name
except ValueError:
level = record.levelno
# Add request context fields if set
frame, depth = logging.currentframe(), 2
while frame is not None and frame.f_code.co_filename == logging.__file__:
frame = frame.f_back
depth += 1
logger.opt(depth=depth, exception=record.exc_info).log(
level, record.getMessage()
)
def setup_logger() -> None:
"""Configure loguru with console and file sinks.
Console: Pretty colored output to stderr (INFO level)
File: JSON-structured logs to logs/app_logs.log (DEBUG level, 10MB rotation, 7 day retention)
Also intercepts standard library logging (e.g., from requests, urllib) and redirects to loguru.
Context variables (correlation_id, action_id, persona_id, host_id) are automatically
included in JSON logs by a sink filter that copies them into each record's extra dict.
"""
# Remove default handler
logger.remove()
# Console sink: pretty colored output
_ = logger.add(
sys.stderr,
colorize=True,
format="<green>{time:HH:mm:ss}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>",
level="INFO",
)
# File sink: JSON-structured logs with context injection
log_dir = Path("logs")
log_dir.mkdir(exist_ok=True)
log_file = log_dir / "app_logs.log"
def inject_context(record: object) -> bool:
"""Filter function that injects context variables into record before serialization.
Args:
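record: Loguru record, a dict whose "extra" key holds the bound context.
"""
# Loguru passes the record to filter functions as a plain dict, so "extra"
# is a key rather than an attribute; getattr() would always return None
# here and the context would never be injected.
extra_attr = record.get("extra") if isinstance(record, dict) else None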
if extra_attr is None:
return True
# Ensure we have a dict before mutating it
if not isinstance(extra_attr, dict):
return True
# Directly mutate the extra dict that loguru provides
# The isinstance check ensures it's a dict, so we can safely assign to it
if correlation_id := _ContextVars.correlation_id.get():
log_data["correlation_id"] = correlation_id
extra_attr["correlation_id"] = correlation_id
if action_id := _ContextVars.action_id.get():
log_data["action_id"] = action_id
extra_attr["action_id"] = action_id
if persona_id := _ContextVars.persona_id.get():
log_data["persona_id"] = persona_id
extra_attr["persona_id"] = persona_id
if host_id := _ContextVars.host_id.get():
log_data["host_id"] = host_id
extra_attr["host_id"] = host_id
return True # Always log this record
# Add exception info if present
if record.exc_info:
log_data["exc_info"] = self.formatException(record.exc_info)
_ = logger.add(
str(log_file),
rotation="10 MB",
retention="7 days",
level="DEBUG",
serialize=True, # JSON formatting
enqueue=True, # Route messages through a multiprocess-safe, non-blocking queue
filter=inject_context,
)
return json.dumps(log_data)
# Intercept standard library logging
logging.basicConfig(handlers=[InterceptHandler()], level=0, force=True)
# Suppress noisy third-party loggers
for logger_name in ["httpx", "httpcore", "playwright"]:
logging.getLogger(logger_name).setLevel(logging.WARNING)
class LoggingManager:
"""Manages structured JSON logging with request context injection."""
"""Manages structured logging with request context injection."""
context: type[_ContextVars] = _ContextVars
@staticmethod
def configure(level: int | str = logging.INFO) -> None:
"""Configure JSON logging with structured output and context injection.
All log entries will include:
- ISO timestamp
- Log level
- Logger name, module, line number
- Request context (if set via context variables)
Args:
level: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
"""
root_logger = logging.getLogger()
root_logger.setLevel(level)
root_logger.handlers.clear()
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(ContextJsonFormatter())
root_logger.addHandler(handler)
# Backward compatibility: expose configure_logging function
def configure_logging(level: int | str = logging.INFO) -> None:
"""Configure JSON logging (wrapper for LoggingManager.configure)."""
LoggingManager.configure(level)
def configure_logging(_level: int | str = logging.INFO) -> None:
"""Configure logging (wrapper for setup_logger).
Note: Level parameter is ignored as loguru handles levels per sink.
"""
setup_logger()


@@ -6,6 +6,7 @@ from guide.app.errors.exceptions import (
GraphQLOperationError,
GraphQLTransportError,
GuideError,
LLMError,
MfaError,
PersonaError,
)
@@ -21,6 +22,7 @@ __all__ = [
"ActionExecutionError",
"GraphQLTransportError",
"GraphQLOperationError",
"LLMError",
"guide_error_handler",
"unhandled_error_handler",
]

Some files were not shown because too many files have changed in this diff.