Travis Vasceannie f502286450 Update documentation, refactor string handling, and enhance logging
- Updated the development server documentation to reflect the new port configuration.
- Refactored string handling by replacing the centralized registry with dedicated selectors for better modularity and type safety.
- Enhanced logging throughout the application by integrating loguru for structured logging and improved context handling.
- Removed outdated files and streamlined the codebase for better maintainability.
- Added new HTML parsing utilities using BeautifulSoup to improve DOM traversal and element extraction.
- Updated various components to utilize the new string selectors, ensuring consistency across the codebase.
2025-12-07 14:16:27 +00:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

A FastAPI-based guided demo platform that automates browser interactions with Raindrop using Playwright and browser extensions. The app executes data-driven actions (stored in ActionRegistry) on behalf of personas that target configured browser hosts (CDP, headless, or extension). All configuration is externalized via YAML files and environment overrides.

Recent Enhancements:

  • PageLike Protocol — Unified interface supporting both Playwright Page and ExtensionPage for seamless switching between automation backends
  • PageHelpers Pattern — High-level fluent API for browser interactions (wait utilities, UI operations, diagnostics)
  • Extension Mode — WebSocket-based browser automation avoiding CDP page refresh issues, with React-compatible form filling
  • Browser Elements Package — Reusable UI interaction helpers for dropdowns, forms, and complex UI patterns

Entry Point: python -m guide (runs src/guide/main.py → guide.app.main:app)
Python Version: 3.12+
Key Dependencies: FastAPI, Playwright, Pydantic v2, PyYAML, httpx

Essential Commands

# Install dependencies
uv sync

# Type checking (required before commits)
basedpyright src

# Compile sanity check
python -m compileall src/guide

# Run development server (default: localhost:8765)
python -m guide
# or with custom host/port:
HOST=127.0.0.1 PORT=9000 python -m guide

# View API docs
# Navigate to http://localhost:8765/docs

# Key endpoints:
# GET  /healthz                           # liveness check
# GET  /actions                           # list action metadata
# POST /actions/{id}/execute              # execute action; returns ActionEnvelope with correlation_id
# GET  /config/browser-hosts              # view current default + host map

Code Structure

Root module: src/guide/app/

  • actions/ — Demo action implementations with auto-discovery via @register_action decorator. Auto-wired via ActionRegistry with dependency injection. Submodules: base.py (DemoAction, CompositeAction, ActionRegistry), registry.py (auto-discovery), playbooks.py (multi-step workflows like OnboardingFlow, FullDemoFlow), diagnose_page.py (diagnostic action for page inspection), auth/ (LoginAsPersonaAction), intake/ (CreateSourcingRequestAction), sourcing/ (AddSupplierAction), demo/ (demonstration actions like CollapseAccordionsDemoAction showcasing PageHelpers pattern).
  • auth/ — Pluggable MFA/auth helpers. mfa.py defines MfaCodeProvider interface; DummyMfaCodeProvider raises NotImplementedError (implement for production). session.py provides ensure_persona() for login flows.
  • browser/ — Browser automation core with multiple layers:
    • pool.py — BrowserPool manages persistent browser instances per host (lazy-initialized, allocates fresh contexts/pages per request)
    • client.py — BrowserClient wraps BrowserPool with async context manager pattern
    • extension_client.py — ExtensionClient provides WebSocket-based browser automation via Chrome extension to avoid CDP page refresh issues; includes ExtensionPage with Playwright-like API and React-compatible fill/type methods
    • types.py — Protocol definitions: PageLike (unified interface for Playwright Page and ExtensionPage), PageLocator (protocol for locator objects)
    • helpers.py — PageHelpers class for high-level page interactions (wait utilities, diagnostics capture, UI operations like fill_and_advance, search_and_select, click_and_wait, accordion collapse, dropdown operations)
    • wait.py — Standalone wait and stability utilities (wait_for_selector, wait_for_navigation, wait_for_network_idle, wait_for_stable_page)
    • diagnostics.py — Captures screenshots, HTML, console logs for error debugging
    • elements/ — Package for reusable UI interaction helpers:
      • dropdown.py — Dropdown helpers optimized for extension mode (select_multi, select_single, select_combobox)
      • form.py — Form fill helpers (extension-friendly wrappers for text, textarea, date, autocomplete)
  • core/ — App bootstrap: config.py (AppSettings with Pydantic v2, env prefix RAINDROP_DEMO_, YAML + JSON override cascade), logging.py (structured logging with request-scoped context variables).
  • errors/ — GuideError hierarchy (ConfigError, BrowserConnectionError, PersonaError, AuthError, MfaError, ActionExecutionError, GraphQLTransportError, GraphQLOperationError); routers normalize to HTTP responses with debug info.
  • raindrop/ — GraphQL client + operations. graphql.py (httpx-based HTTP client), operations/ (intake.py, sourcing.py with query/mutation definitions), generated/ (ariadne-codegen auto-generated Pydantic models), queries/ (GraphQL query/mutation files).
  • strings/ — Flattened registry (Phase 1 refactoring complete). registry.py enforces domain-keyed lookups; missing keys raise immediately. No inline UI strings in actions. Submodules: selectors/ (CSS, data-testid, aria), labels/ (UI element names), demo_texts/ (copy snippets).
  • models/ — Domain and persona models using Pydantic v2. domain/models.py (ActionRequest, ActionContext, ActionResult, ActionMetadata, ActionEnvelope, DebugInfo, BrowserHostDTO, BrowserHostsResponse). personas/models.py (DemoPersona with auto-coerced enums PersonaRole, LoginMethod; unified from Phase 2A refactoring). personas/store.py (PersonaStore in-memory registry from AppSettings).
  • utils/ — Shared helpers. Keep <300 LoC per file; avoid circular imports. Modules: ids.py (UUID/correlation ID), env.py (environment utilities), retry.py (backoff + jitter), timing.py (rate-limiting).
  • api/ — FastAPI routers in routes/. health.py (GET /healthz), actions.py (GET /actions, POST /actions/{id}/execute), config.py (GET /config/browser-hosts). Map requests → ActionRegistry → BrowserClient → ActionEnvelope responses with error capture.

Config files (git-tracked):

  • config/hosts.yaml — Browser host targets (id, kind: cdp|headless|extension, host, port, browser type).
  • config/personas.yaml — Personas (id, role, email, login_method, browser_host_id).

Extension files (git-tracked):

  • extension/ — Terminator Bridge Chrome extension (Manifest V3)
    • manifest.json — Extension configuration with debugger permissions
    • worker.js — Service worker handling WebSocket and Chrome debugger API
    • content.js — Content script for extension wake-up handshakes
    • README.md — Extension documentation and protocol specification

Config overrides (runtime only, never commit):

  • RAINDROP_DEMO_BROWSER_HOSTS_JSON — JSON array overrides hosts.yaml.
  • RAINDROP_DEMO_PERSONAS_JSON — JSON array overrides personas.yaml.
  • RAINDROP_DEMO_RAINDROP_BASE_URL — Override default https://app.raindrop.com.
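
A minimal sketch of how such an override could be resolved, assuming the JSON array replaces the YAML-loaded hosts wholesale rather than merging with them (this file does not say which). The function name resolve_browser_hosts and the sample host entries are hypothetical; the real logic lives in core/config.py.

```python
import json
from collections.abc import Mapping

HOSTS_ENV_VAR = "RAINDROP_DEMO_BROWSER_HOSTS_JSON"


def resolve_browser_hosts(
    yaml_hosts: list[dict[str, object]],
    environ: Mapping[str, str],
) -> list[dict[str, object]]:
    """Return the env-provided JSON array if set, else the YAML-loaded hosts."""
    raw = environ.get(HOSTS_ENV_VAR)
    if not raw:
        return yaml_hosts
    parsed = json.loads(raw)
    if not isinstance(parsed, list):
        # Assumption: anything other than a JSON array is a config error.
        raise ValueError(f"{HOSTS_ENV_VAR} must be a JSON array")
    return parsed


# Stand-in for the parsed contents of config/hosts.yaml.
yaml_hosts = [{"id": "local-headless", "kind": "headless", "browser": "chromium"}]
override = '[{"id": "remote-cdp", "kind": "cdp", "host": "10.0.0.5", "port": 9222}]'
hosts = resolve_browser_hosts(yaml_hosts, {HOSTS_ENV_VAR: override})
```

Passing the environment as a mapping (instead of reading os.environ inside the function) keeps the resolution step deterministic in tests.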

Architecture Patterns

App Initialization (main.py → create_app)

  1. Load AppSettings (env + YAML + JSON overrides).
  2. Build PersonaStore from config.
  3. Build ActionRegistry with auto-discovered actions (via @register_action decorator + pkgutil.walk_packages()). DI context includes persona store, Raindrop URL, etc. Auto-instantiates actions with parameter matching.
  4. Create BrowserPool (manages persistent browser instances per host, allocates fresh contexts/pages per request).
  5. Create BrowserClient (wraps BrowserPool with async context manager).
  6. Stash instances on app.state for dependency injection in routers.
  7. Register error handlers (GuideError → HTTP + debug info; unhandled → 500 + logging).

Action Execution Pipeline

  • Request: POST /actions/{action_id}/execute with ActionRequest (persona_id, host_id, params).
  • Router resolves persona + host from config → validates persona exists.
  • Router generates correlation_id (UUID) and ActionContext (includes persona, host, params, correlation_id, shared_state dict for composite actions).
  • ActionRegistry.get(action_id) retrieves action (auto-discovered, dependency-injected).
  • BrowserClient.open_page(host_id) → allocates fresh context + page from BrowserPool. Reuses persistent browser instance for host; creates new page/context for isolation.
  • Action.run(page, context) executes logic with PageLike-typed page parameter (supports both Playwright Page and ExtensionPage); may call ensure_persona() (login flow) before starting. For composite actions, passes shared_state dict to child actions.
  • Actions typically use PageHelpers wrapper for high-level interactions (wait utilities, UI operations, diagnostics).
  • On error: captures debug info (screenshot, HTML, console logs) and returns with DebugInfo attached.
  • Response: ActionEnvelope (status, correlation_id, result/error, debug_info).

Browser Host Resolution & Pooling

  • BrowserPool Architecture:

    • One persistent browser instance per host (lazy-initialized on first request).
    • Fresh context + page allocated per request for isolation (no cross-request state).
    • Proper cleanup via async context manager pattern.
    • Handles timeouts, connection errors, context cleanup.
  • Host Kind Resolution:

    • kind: cdp — connect to running Raindrop instance via Chrome DevTools Protocol (requires host + port). WARNING: Querying browser.contexts or context.pages triggers page refresh, closing modals and losing user state. Use kind: extension instead for modal interactions.
    • kind: headless — launch Playwright browser (chromium/firefox/webkit); set browser field in config.
    • kind: extension — connect to Chrome via Terminator Bridge extension using WebSocket. Provides Playwright-like API without CDP page refresh issues. Requires Chrome running with extension loaded.
    • Always use async with BrowserClient.open_page(host_id) as page: to ensure proper cleanup (context manager unwinding in BrowserPool).
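
To make the pooling behaviour concrete, here is a minimal sketch with a fake browser in place of Playwright. BrowserPoolSketch and FakeBrowser are hypothetical names, not the project's classes; the point is the shape: one lazily created browser per host, a fresh context per request, and cleanup in a finally block.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


class FakeBrowser:
    """Stand-in for a persistent browser instance."""

    def __init__(self, host_id: str) -> None:
        self.host_id = host_id
        self.contexts_created = 0
        self.open_contexts = 0

    async def new_context(self) -> str:
        self.contexts_created += 1
        self.open_contexts += 1
        return f"{self.host_id}-ctx-{self.contexts_created}"

    async def close_context(self, context: str) -> None:
        self.open_contexts -= 1


class BrowserPoolSketch:
    def __init__(self) -> None:
        self._browsers: dict[str, FakeBrowser] = {}
        self._lock = asyncio.Lock()

    async def _get_browser(self, host_id: str) -> FakeBrowser:
        # Lazy initialization: the browser is created on first request only.
        async with self._lock:
            if host_id not in self._browsers:
                self._browsers[host_id] = FakeBrowser(host_id)
            return self._browsers[host_id]

    @asynccontextmanager
    async def open_page(self, host_id: str) -> AsyncIterator[str]:
        browser = await self._get_browser(host_id)
        context = await browser.new_context()
        try:
            yield context
        finally:
            # Runs on normal exit and on exceptions, so contexts never leak.
            await browser.close_context(context)
```

Because cleanup sits in the finally block, an action that raises still releases its context, which is why the async with form is recommended.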

Extension-Based Browser Automation (Solving CDP Page Refresh Problem)

Problem: CDP's browser.contexts and context.pages queries trigger page refreshes, causing modals to close and user state to be lost.

Solution: Use browser extension with Chrome's internal debugger API via WebSocket communication.

Architecture:

  • Python Side: ExtensionClient acts as WebSocket SERVER listening on 0.0.0.0:17373
  • Browser Side: Terminator Bridge extension acts as WebSocket CLIENT connecting to server
  • Communication: JSON messages with UUID-based request/response correlation
  • API: Provides Playwright-like interface (page.click(), page.fill(), page.evaluate(), page.locator())
  • Network: Supports cross-network operation (Python on one machine, Chrome on another)
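
UUID-based request/response correlation can be sketched without any network code. The message shape used here (id, method, params, result, error) is an assumption for illustration; the actual protocol is specified in extension/README.md. RequestCorrelator is a hypothetical name.

```python
import asyncio
import json
import uuid


class RequestCorrelator:
    """Match out-of-order responses to their requests by UUID."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future[object]] = {}

    def create_request(
        self, method: str, params: dict[str, object]
    ) -> tuple[str, asyncio.Future[object]]:
        request_id = str(uuid.uuid4())
        future: asyncio.Future[object] = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        message = json.dumps({"id": request_id, "method": method, "params": params})
        return message, future

    def handle_response(self, raw: str) -> None:
        response = json.loads(raw)
        future = self._pending.pop(response["id"], None)
        if future is None or future.done():
            return  # Unknown or already-resolved request; ignore.
        if "error" in response:
            future.set_exception(RuntimeError(str(response["error"])))
        else:
            future.set_result(response.get("result"))


async def demo() -> tuple[object, object, int]:
    correlator = RequestCorrelator()
    click_msg, click_done = correlator.create_request("click", {"selector": "button.submit"})
    eval_msg, eval_done = correlator.create_request("evaluate", {"expression": "document.title"})
    # Responses arrive in reverse order; the IDs still pair them correctly.
    correlator.handle_response(json.dumps({"id": json.loads(eval_msg)["id"], "result": "Title"}))
    correlator.handle_response(json.dumps({"id": json.loads(click_msg)["id"], "result": None}))
    return await click_done, await eval_done, len(correlator._pending)
```

In a real client the serialized message would be written to the WebSocket and handle_response would be called from the receive loop.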

Key Files:

  • src/guide/app/browser/extension_client.py — ExtensionClient (WebSocket server) and ExtensionPage (Playwright-like API with React-compatible fill/type methods using native property descriptors)
  • src/guide/app/browser/types.py — PageLike protocol for unified interface across Playwright and extension pages
  • extension/worker.js — MV3 service worker handling WebSocket connection and debugger API
  • extension/content.js — Content script for extension wake-up handshakes
  • extension/manifest.json — Manifest V3 configuration

Setup:

  1. Load extension in Chrome: chrome://extensions → "Load unpacked" → select extension/ directory
  2. Navigate Chrome to target page (e.g., https://stg.raindrop.com/)
  3. Python code uses ExtensionClient:
    from guide.app.browser.extension_client import ExtensionClient
    
    async with ExtensionClient() as client:
        page = await client.get_page()
        await page.click("button.submit")
        await page.fill("input[name='email']", "user@example.com")
        title = await page.evaluate("document.title")
    

Benefits:

  • No page refresh when interacting with browser
  • Modals stay open during automation
  • User state preserved across operations
  • Works across network (Python and Chrome on different machines)
  • Familiar Playwright-like API

Restrictions:

  • Chrome debugger cannot attach to restricted pages (chrome://, chrome-extension://, devtools://, edge://, about:)
  • Active tab must be on a regular webpage
  • Requires browser extension to be loaded and Chrome running

Testing:

  • test_extension_client.py — Basic connectivity and API validation
  • test_sourcing_form_extension.py — Form filling demonstration without page refresh

GraphQL & Data Layer

  • raindrop/graphql.py — HTTP client (httpx, 10s timeout).
  • raindrop/operations/ — query/mutation definitions + Pydantic-validated response models.
  • raindrop/generated/ — Auto-generated Pydantic models (via ariadne-codegen) from Raindrop GraphQL schema. Type-safe; auto-synced with schema.
  • raindrop/queries/ — GraphQL query/mutation .graphql files (single source of truth for operations).
  • Validate all responses with Pydantic models; schema mismatches → GuideError.
  • Never embed tokens/URLs; pull from AppSettings (env-driven).
  • Transport errors → GraphQLTransportError; operation errors → GraphQLOperationError (includes details from server).
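
The transport/operation split above might look like the following sketch. The exception classes are local stand-ins named after the project's hierarchy, and unwrap_graphql_response is a hypothetical helper; only the top-level data and errors keys are taken from the GraphQL response format.

```python
class GuideError(Exception):
    """Local stand-in for the project's base error."""


class GraphQLTransportError(GuideError):
    """The HTTP exchange itself failed."""


class GraphQLOperationError(GuideError):
    """The server answered, but the operation reported errors."""

    def __init__(self, message: str, details: object) -> None:
        super().__init__(message)
        self.details = details


def unwrap_graphql_response(
    status_code: int, payload: dict[str, object]
) -> dict[str, object]:
    """Return the data object or raise the matching GuideError subclass."""
    if status_code >= 400:
        raise GraphQLTransportError(f"HTTP {status_code} from GraphQL endpoint")
    errors = payload.get("errors")
    if errors:
        raise GraphQLOperationError("GraphQL operation failed", details=errors)
    data = payload.get("data")
    if not isinstance(data, dict):
        raise GraphQLOperationError("GraphQL response has no data object", details=None)
    return data
```

The returned dict would then be passed to the relevant Pydantic model's model_validate so that schema mismatches surface early.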

Selector & String Management (strings/)

  • Flattened Registry (Phase 1 refactoring complete): All selectors, labels, copy in strings/ submodules; no 3-layer wrappers.
  • Direct field access: app_strings.intake.field (was app_strings.intake.selectors.field). Registry enforces domain-keyed lookups; missing keys raise immediately.
  • Organize into: selectors/ (CSS, data-testid, aria attributes), labels/ (UI element names), demo_texts/ (copy/text snippets).
  • Selectors should be reusable and labeled; avoid brittle text selectors—prefer data-testid or aria labels.
  • No inline UI strings in action code; all strings centralized in strings/ for easy maintenance and i18n readiness.
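
One way to get direct field access where missing keys fail immediately is frozen dataclasses, sketched below. This is an illustration only, not the project's registry.py; the class names, field names, and selector values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IntakeStrings:
    # Hypothetical selectors; prefer data-testid over text selectors.
    description_field: str = '[data-testid="intake-description"]'
    submit_button: str = '[data-testid="intake-submit"]'


@dataclass(frozen=True)
class AppStrings:
    intake: IntakeStrings = field(default_factory=IntakeStrings)


app_strings = AppStrings()
selector = app_strings.intake.submit_button
```

An unknown name such as app_strings.intake.missing raises AttributeError at the point of use, and a type checker flags it before the code runs.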

Development Workflow

  1. Edit code (actions, browser logic, GraphQL ops, etc.).
  2. Run type check: basedpyright src (catches generic types, missing annotations).
  3. Sanity compile: python -m compileall src/guide (syntax check).
  4. Smoke test: run python -m guide, then open /docs or test manually via curl.
  5. Review error handling: ensure GuideError subclasses are raised, not generic exceptions.
  6. Commit with scoped, descriptive message (e.g., feat: add auth login action, chore: tighten typing).

Type & Linting Standards

  • Python 3.12+: Use PEP 604 unions (str | None), built-in generics (list[str], dict[str, JSONValue]).
  • Ban Any and # type: ignore: Use type guards or Protocol instead.
  • Pydantic v2: Explicit types, model_validate for parsing, model_copy for immutable updates.
  • Type checker: Pyright (via basedpyright).
  • Docstrings: Imperative style, document public APIs, include usage examples.

Error Handling & Logging

  • Always raise GuideError subclasses (not generic Exception); routers translate to HTTP responses.
  • Log via core/logging (structured, levelled). Include persona/action IDs and host targets for traceability.
  • For browser flows, use Playwright traces (enabled by default in BrowserClient); disable only intentionally.
  • Validate external inputs early; surface schema/connection issues as GuideError.

Testing & Quality Gates

  • Minimum gate: basedpyright src + python -m compileall src/guide before merge.
  • Test Coverage: 28 passing tests across unit and integration suites (in tests/ directory).
  • Test Structure:
    • tests/unit/ — Unit tests for strings registry, models (persona/host validation), action registration.
    • tests/integration/ — Integration tests for BrowserClient, BrowserPool, and browser lifecycle.
    • conftest.py — Shared fixtures: mock_browser_hosts, mock_personas, app_settings, persona_store, action_registry, action_context, mock_page, mock_browser_client, mock_browser_pool.
    • Pytest configured with asyncio_mode=auto for async test support.
  • Mock Playwright/GraphQL in tests; avoid real network/CDP calls.
  • Require deterministic fixtures; document any env vars needed in test module docstring.
  • No loops or conditionals in tests; use @pytest.mark.parametrize for multiple cases.

MFA & Auth

  • Default DummyMfaCodeProvider raises NotImplementedError.
  • For real runs, implement a provider and wire it in core/config.py or auth/ modules.
  • ensure_persona() in actions calls the provider; stub or override for demo/test execution.
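
A minimal sketch of a pluggable provider, assuming the interface exposes a single get_code(persona_id) method (the real MfaCodeProvider signature in auth/mfa.py may differ). StaticMfaCodeProvider is a hypothetical test double and should only ever hold fake codes.

```python
from typing import Protocol


class MfaCodeProvider(Protocol):
    """Assumed shape of the provider interface."""

    def get_code(self, persona_id: str) -> str: ...


class DummyMfaCodeProvider:
    def get_code(self, persona_id: str) -> str:
        raise NotImplementedError("Implement an MFA provider for production use")


class StaticMfaCodeProvider:
    """Test double returning pre-seeded fake codes per persona."""

    def __init__(self, codes: dict[str, str]) -> None:
        self._codes = dict(codes)

    def get_code(self, persona_id: str) -> str:
        try:
            return self._codes[persona_id]
        except KeyError:
            raise LookupError(f"No MFA code configured for {persona_id!r}") from None


def login_code(provider: MfaCodeProvider, persona_id: str) -> str:
    # Callers depend on the protocol, so providers can be swapped freely.
    return provider.get_code(persona_id)
```

Because the protocol is structural, a test can pass StaticMfaCodeProvider({"buyer": "000000"}) wherever the default provider is expected.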

Performance & Footprint

  • Keep browser sessions short-lived; close contexts to avoid handle leaks.
  • Cache expensive GraphQL lookups (per-request OK, global only if safe).
  • Don't widen dependencies without justification; stick to project pins in pyproject.toml.
  • Promptly close Playwright contexts/browser handles (wrapped in contextmanager; keep action code lean).

Refactoring Status (100% Complete)

The project has completed 8 major refactoring phases achieving full architectural modernization:

Phase     | Focus                                    | Status
Phase 1   | Strings Registry Flattening              | Complete
Phase 2A  | Persona Model Unification                | Complete
Phase 2B  | BrowserHostDTO Separation                | Complete
Phase 3   | Action Registration Standardization      | Complete
Phase 4   | GraphQL Code Generation (ariadne-codegen) | Complete
Phase 5   | Browser Context Isolation                | Complete
Phase 6-8 | Error Handling & Logging Enhancements    | Complete

Quality Metrics:

  • Zero type errors (basedpyright)
  • All linting passed
  • 28 tests passing
  • Zero code redundancy
  • ~5,229 lines of production code

Git & PR Hygiene

  • Scoped, descriptive commits (e.g., feat: add sourcing action, fix: handle missing persona host).
  • PRs should state changes, commands run, new config entries (hosts/personas).
  • Link related issues; include screenshots/logs for UI or API behavior changes.
  • Never commit credentials, MFA tokens, or sensitive config; use env overrides.

Action Registration & Dependency Injection

Registering a New Action

Use the @register_action decorator to auto-discover and register actions:

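# Import paths follow the Code Structure section above.
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.types import PageLike
from guide.app.models.domain.models import ActionContext, ActionResult
from guide.app.models.personas.store import PersonaStore

@register_action
class MyAction(DemoAction):
    id = "my-action"
    description = "Does something cool"
    category = "demo"

    # Optional: Declare dependencies (auto-injected by ActionRegistry)
    def __init__(self, persona_store: PersonaStore) -> None:
        self.persona_store = persona_store

    # Type the page as PageLike so both Playwright and extension pages work
    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        # Implementation
        ...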

Auto-Discovery: ActionRegistry uses pkgutil.walk_packages() to discover all modules in actions/ and collect all @register_action decorated classes. No manual registration needed.

Dependency Injection: Parameters in __init__ are matched by name against DI context dict. Example: persona_store parameter → resolved from DI context during action instantiation.
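
Name-based injection of this kind is commonly built on inspect.signature. The sketch below assumes that approach; the instantiate helper and the example classes are hypothetical, and the real registry may handle defaults and errors differently.

```python
import inspect


def instantiate(cls: type, di_context: dict[str, object]) -> object:
    """Construct cls, filling constructor parameters by name from di_context."""
    kwargs: dict[str, object] = {}
    for name, param in inspect.signature(cls).parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args / **kwargs cannot be matched by name
        if name in di_context:
            kwargs[name] = di_context[name]
        elif param.default is inspect.Parameter.empty:
            raise LookupError(f"{cls.__name__} needs {name!r}, not in DI context")
    return cls(**kwargs)


class NeedsStore:
    def __init__(self, persona_store: dict[str, str]) -> None:
        self.persona_store = persona_store


class NoDeps:
    pass


di_context: dict[str, object] = {
    "persona_store": {"buyer": "buyer@example.com"},
    "raindrop_url": "https://app.raindrop.com",
}
action = instantiate(NeedsStore, di_context)
```

Entries in the context that no parameter asks for (raindrop_url here) are ignored, so the context can grow without touching existing actions.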

Multi-Step Workflows (CompositeAction)

For workflows spanning multiple steps with shared state:

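from guide.app.actions.base import CompositeAction, register_action
from guide.app.models.domain.models import ActionResult

@register_action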
class MyWorkflow(CompositeAction):
    id = "my-workflow"
    description = "Multi-step workflow"
    category = "demo"

    child_actions = ("step1-action", "step2-action", "step3-action")

    async def on_step_complete(self, step_id: str, result: ActionResult) -> None:
        # Optional: Update shared_state after each step
        # Accessed in child actions via context.shared_state dict
        pass

High-Level Browser Interactions (PageHelpers Pattern)

For actions requiring browser interactions, use the PageHelpers class for a fluent, high-level API:

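from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike
from guide.app.models.domain.models import ActionContext, ActionResult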

@register_action
class MyAction(DemoAction):
    id = "my-action"
    description = "Example using PageHelpers"
    category = "demo"

    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        helpers = PageHelpers(page)

        # Wait for page stability
        await helpers.wait_for_stable()

        # Fill form and advance
        await helpers.fill_and_advance(
            selector="input[name='email']",
            value="user@example.com",
            next_selector="button.next",
        )

        # Search and select
        await helpers.search_and_select(
            search_input="input.search",
            query="example",
            result_selector="li.result",
        )

        # Collapse accordions
        result = await helpers.collapse_accordions("button.accordion")

        # Capture diagnostics on error (some_error is a placeholder for your own failure check)
        if some_error:
            debug_info = await helpers.capture_diagnostics()

        return ActionResult(details={"status": "success"})

Available PageHelpers methods:

  • Wait utilities: wait_for_selector(), wait_for_network_idle(), wait_for_navigation(), wait_for_stable()
  • UI operations: fill_and_advance(), search_and_select(), click_and_wait()
  • Accordion operations: collapse_accordions()
  • Dropdown operations: select_dropdown_options()
  • Diagnostics: capture_diagnostics()
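
To show why helpers written against a protocol work with either backend, here is a minimal sketch of a fill_and_advance style helper. The two-method PageLike below is a cut-down stand-in for the real protocol in browser/types.py, and RecordingPage is a hypothetical fake; the real helper likely adds waits between the two calls.

```python
import asyncio
from typing import Protocol


class PageLike(Protocol):
    """Cut-down stand-in: only the two methods this helper needs."""

    async def fill(self, selector: str, value: str) -> None: ...
    async def click(self, selector: str) -> None: ...


class RecordingPage:
    """Fake page that records calls; satisfies PageLike structurally."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    async def fill(self, selector: str, value: str) -> None:
        self.calls.append(("fill", selector, value))

    async def click(self, selector: str) -> None:
        self.calls.append(("click", selector))


async def fill_and_advance(
    page: PageLike, selector: str, value: str, next_selector: str
) -> None:
    """Fill one field, then click the control that moves to the next step."""
    await page.fill(selector, value)
    await page.click(next_selector)


page = RecordingPage()
asyncio.run(
    fill_and_advance(page, "input[name='email']", "user@example.com", "button.next")
)
```

The same function would accept a Playwright Page or an ExtensionPage, since both expose fill and click with compatible signatures according to this file.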

Quick Checklist (New Feature)

  • Add the action under actions/ (or in a subpackage such as actions/intake/ or actions/demo/) and decorate it with @register_action.
  • Type page parameter as PageLike (not Page) to support both Playwright and extension pages.
  • Use PageHelpers wrapper for high-level browser interactions (wait utilities, UI operations, diagnostics).
  • Add action-specific logic; keep it thin and testable. Use strings/ for all selectors/copy.
  • For complex UI interactions, consider using browser/elements/ helpers (dropdown, form) or extending PageHelpers.
  • Ensure persona/host exist in config/hosts.yaml + config/personas.yaml (or use env overrides).
  • If action interacts with modals or requires no page refresh, use kind: extension browser host (requires Terminator Bridge extension loaded in Chrome).
  • If action needs GraphQL, add query/mutation to raindrop/operations/ + .graphql files in raindrop/queries/.
  • If action needs UI strings, add to strings/ submodules (selectors, labels, demo_texts).
  • Run basedpyright src + python -m compileall src/guide (type check + syntax check).
  • Run pytest tests/ to ensure no regressions (28 tests must pass).
  • Run python -m guide and open http://localhost:8765/docs to exercise the endpoint.
  • If auth flow required, implement/mock MFA provider or use DummyMfaCodeProvider for testing.
  • Review error handling; raise GuideError subclasses, not generic exceptions.
  • Commit with descriptive message (e.g., feat: add my-action, test: add my-action tests).