CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
A FastAPI-based guided demo platform that automates browser interactions with Raindrop using Playwright and browser extensions. The app executes data-driven actions (stored in ActionRegistry) on behalf of personas that target configured browser hosts (CDP, headless, or extension). All configuration is externalized via YAML files and environment overrides.
Recent Enhancements:
- PageLike Protocol — Unified interface supporting both Playwright Page and ExtensionPage for seamless switching between automation backends
- PageHelpers Pattern — High-level fluent API for browser interactions (wait utilities, UI operations, diagnostics)
- Extension Mode — WebSocket-based browser automation avoiding CDP page refresh issues, with React-compatible form filling
- Browser Elements Package — Reusable UI interaction helpers for dropdowns, forms, and complex UI patterns
Entry Point: python -m guide (runs src/guide/main.py → guide.app.main:app)
Python Version: 3.12+
Key Dependencies: FastAPI, Playwright, Pydantic v2, PyYAML, httpx
Essential Commands
# Install dependencies
uv sync
# Type checking (required before commits)
basedpyright src
# Compile sanity check
python -m compileall src/guide
# Run development server (default: localhost:8765)
python -m guide
# or with custom host/port:
HOST=127.0.0.1 PORT=9000 python -m guide
# View API docs
# Navigate to http://localhost:8765/docs
# Key endpoints:
# GET /healthz # liveness check
# GET /actions # list action metadata
# POST /actions/{id}/execute # execute action; returns ActionEnvelope with correlation_id
# GET /config/browser-hosts # view current default + host map
Code Structure
Root module: src/guide/app/
- actions/: Demo action implementations with auto-discovery via @register_action decorator. Auto-wired via ActionRegistry with dependency injection. Submodules:
  - base.py (DemoAction, CompositeAction, ActionRegistry)
  - registry.py (auto-discovery)
  - playbooks.py (multi-step workflows like OnboardingFlow, FullDemoFlow)
  - diagnose_page.py (diagnostic action for page inspection)
  - auth/ (LoginAsPersonaAction)
  - intake/ (CreateSourcingRequestAction)
  - sourcing/ (AddSupplierAction)
  - demo/ (demonstration actions like CollapseAccordionsDemoAction showcasing PageHelpers pattern)
- auth/: Pluggable MFA/auth helpers. mfa.py defines the MfaCodeProvider interface; DummyMfaCodeProvider raises NotImplementedError (implement for production). session.py provides ensure_persona() for login flows.
- browser/: Browser automation core with multiple layers:
  - pool.py: BrowserPool manages persistent browser instances per host (lazy-initialized, allocates fresh contexts/pages per request)
  - client.py: BrowserClient wraps BrowserPool with async context manager pattern
  - extension_client.py: ExtensionClient provides WebSocket-based browser automation via Chrome extension to avoid CDP page refresh issues; includes ExtensionPage with Playwright-like API and React-compatible fill/type methods
  - types.py: Protocol definitions: PageLike (unified interface for Playwright Page and ExtensionPage), PageLocator (protocol for locator objects)
  - helpers.py: PageHelpers class for high-level page interactions (wait utilities, diagnostics capture, UI operations like fill_and_advance, search_and_select, click_and_wait, accordion collapse, dropdown operations)
  - wait.py: Standalone wait and stability utilities (wait_for_selector, wait_for_navigation, wait_for_network_idle, wait_for_stable_page)
  - diagnostics.py: Captures screenshots, HTML, console logs for error debugging
  - elements/: Package for reusable UI interaction helpers:
    - dropdown.py: Dropdown helpers optimized for extension mode (select_multi, select_single, select_combobox)
    - form.py: Form fill helpers (extension-friendly wrappers for text, textarea, date, autocomplete)
- core/: App bootstrap: config.py (AppSettings with Pydantic v2, env prefix RAINDROP_DEMO_, YAML + JSON override cascade), logging.py (structured logging with request-scoped context variables).
- errors/: GuideError hierarchy (ConfigError, BrowserConnectionError, PersonaError, AuthError, MfaError, ActionExecutionError, GraphQLTransportError, GraphQLOperationError); routers normalize to HTTP responses with debug info.
- raindrop/: GraphQL client + operations. graphql.py (httpx-based HTTP client), operations/ (intake.py, sourcing.py with query/mutation definitions), generated/ (ariadne-codegen auto-generated Pydantic models), queries/ (GraphQL query/mutation files).
- strings/: Flattened registry (Phase 1 refactoring complete). registry.py enforces domain-keyed lookups; missing keys raise immediately. No inline UI strings in actions. Submodules: selectors/ (CSS, data-testid, aria), labels/ (UI element names), demo_texts/ (copy snippets).
- models/: Domain and persona models using Pydantic v2. domain/models.py (ActionRequest, ActionContext, ActionResult, ActionMetadata, ActionEnvelope, DebugInfo, BrowserHostDTO, BrowserHostsResponse). personas/models.py (DemoPersona with auto-coerced enums PersonaRole, LoginMethod; unified from Phase 2A refactoring). personas/store.py (PersonaStore in-memory registry from AppSettings).
- utils/: Shared helpers. Keep <300 LoC per file; avoid circular imports. Modules: ids.py (UUID/correlation ID), env.py (environment utilities), retry.py (backoff + jitter), timing.py (rate-limiting).
- api/: FastAPI routers in routes/. health.py (GET /healthz), actions.py (GET /actions, POST /actions/{id}/execute), config.py (GET /config/browser-hosts). Map requests → ActionRegistry → BrowserClient → ActionEnvelope responses with error capture.
Config files (git-tracked):
- config/hosts.yaml: Browser host targets (id, kind: cdp|headless|extension, host, port, browser type).
- config/personas.yaml: Personas (id, role, email, login_method, browser_host_id).
Extension files (git-tracked):
- extension/: Terminator Bridge Chrome extension (Manifest V3)
  - manifest.json: Extension configuration with debugger permissions
  - worker.js: Service worker handling WebSocket and Chrome debugger API
  - content.js: Content script for extension wake-up handshakes
  - README.md: Extension documentation and protocol specification
Config overrides (runtime only, never commit):
- RAINDROP_DEMO_BROWSER_HOSTS_JSON: JSON array that overrides hosts.yaml.
- RAINDROP_DEMO_PERSONAS_JSON: JSON array that overrides personas.yaml.
- RAINDROP_DEMO_RAINDROP_BASE_URL: Overrides the default https://app.raindrop.com.
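The override order for browser hosts can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's AppSettings code: `resolve_hosts` and the sample host entries are hypothetical, the YAML file is represented by an already-parsed list so the snippet needs only the standard library, and the override is assumed to replace the YAML list wholesale rather than merge with it.

```python
import json
from collections.abc import Mapping

ENV_HOSTS_KEY = "RAINDROP_DEMO_BROWSER_HOSTS_JSON"


def resolve_hosts(
    yaml_hosts: list[dict[str, object]],
    env: Mapping[str, str],
) -> list[dict[str, object]]:
    """Return the JSON override when the env var is set, else the YAML hosts."""
    raw = env.get(ENV_HOSTS_KEY)
    if not raw:
        return yaml_hosts
    parsed = json.loads(raw)
    if not isinstance(parsed, list):
        # Assumption: a non-array override is a configuration error.
        raise ValueError(f"{ENV_HOSTS_KEY} must contain a JSON array")
    return parsed


# Hypothetical sample data; real entries live in config/hosts.yaml.
yaml_hosts = [{"id": "local-headless", "kind": "headless", "browser": "chromium"}]
override = '[{"id": "demo-cdp", "kind": "cdp", "host": "127.0.0.1", "port": 9222}]'

print(resolve_hosts(yaml_hosts, {}))
print(resolve_hosts(yaml_hosts, {ENV_HOSTS_KEY: override}))
```

Passing the environment as a mapping (instead of reading `os.environ` inside the function) keeps the sketch easy to test without touching process state.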
Architecture Patterns
App Initialization (main.py → create_app)
- Load AppSettings (env + YAML + JSON overrides).
- Build PersonaStore from config.
- Build ActionRegistry with auto-discovered actions (via @register_action decorator + pkgutil.walk_packages()). DI context includes persona store, Raindrop URL, etc. Auto-instantiates actions with parameter matching.
- Create BrowserPool (manages persistent browser instances per host, allocates fresh contexts/pages per request).
- Create BrowserClient (wraps BrowserPool with async context manager).
- Stash instances on app.state for dependency injection in routers.
- Register error handlers (GuideError → HTTP + debug info; unhandled → 500 + logging).
Action Execution Pipeline
- Request: POST /actions/{action_id}/execute with ActionRequest (persona_id, host_id, params).
- Router resolves persona + host from config → validates persona exists.
- Router generates correlation_id (UUID) and ActionContext (includes persona, host, params, correlation_id, shared_state dict for composite actions).
- ActionRegistry.get(action_id) retrieves action (auto-discovered, dependency-injected).
- BrowserClient.open_page(host_id) → allocates fresh context + page from BrowserPool. Reuses persistent browser instance for host; creates new page/context for isolation.
- Action.run(page, context) executes logic with PageLike-typed page parameter (supports both Playwright Page and ExtensionPage); may call ensure_persona() (login flow) before starting. For composite actions, passes shared_state dict to child actions.
- Actions typically use PageHelpers wrapper for high-level interactions (wait utilities, UI operations, diagnostics).
- On error: captures debug info (screenshot, HTML, console logs) and returns with DebugInfo attached.
- Response: ActionEnvelope (status, correlation_id, result/error, debug_info).
Browser Host Resolution & Pooling
- BrowserPool Architecture:
  - One persistent browser instance per host (lazy-initialized on first request).
  - Fresh context + page allocated per request for isolation (no cross-request state).
  - Proper cleanup via async context manager pattern.
  - Handles timeouts, connection errors, context cleanup.
- Host Kind Resolution:
  - kind: cdp connects to a running Raindrop instance via Chrome DevTools Protocol (requires host + port). WARNING: Querying browser.contexts or context.pages triggers page refresh, closing modals and losing user state. Use kind: extension instead for modal interactions.
  - kind: headless launches a Playwright browser (chromium/firefox/webkit); set the browser field in config.
  - kind: extension connects to Chrome via the Terminator Bridge extension using WebSocket. Provides Playwright-like API without CDP page refresh issues. Requires Chrome running with extension loaded.
- Always use async with BrowserClient.open_page(host_id) as page: to ensure proper cleanup (context manager unwinding in BrowserPool).
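The pooling behaviour described above (one lazily created browser per host, a fresh context per request, cleanup on exit) might look roughly like this. It is a standard-library sketch, not the real BrowserPool: `PoolSketch` and `FakeBrowser` are hypothetical names, and a counter stands in for Playwright contexts.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


class FakeBrowser:
    """Hypothetical stand-in for a Playwright browser; counts open contexts."""

    def __init__(self, host_id: str) -> None:
        self.host_id = host_id
        self.open_contexts = 0


class PoolSketch:
    def __init__(self) -> None:
        self._browsers: dict[str, FakeBrowser] = {}
        self.launches = 0

    def _browser_for(self, host_id: str) -> FakeBrowser:
        # Lazy initialization: the browser is created on the first request only.
        if host_id not in self._browsers:
            self._browsers[host_id] = FakeBrowser(host_id)
            self.launches += 1
        return self._browsers[host_id]

    @asynccontextmanager
    async def open_page(self, host_id: str) -> AsyncIterator[str]:
        browser = self._browser_for(host_id)
        browser.open_contexts += 1  # fresh context + page per request
        try:
            yield f"page-on-{host_id}"
        finally:
            browser.open_contexts -= 1  # cleanup runs even if the action raises


async def demo() -> tuple[int, int]:
    pool = PoolSketch()
    for _ in range(3):
        async with pool.open_page("local-headless"):
            pass
    try:
        async with pool.open_page("local-headless"):
            raise RuntimeError("action failed")
    except RuntimeError:
        pass
    return pool.launches, pool._browser_for("local-headless").open_contexts


launches, open_contexts = asyncio.run(demo())
print(launches, open_contexts)  # 1 0
```

Four requests produce a single launch and leave no context open, including the request that raised.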
Extension-Based Browser Automation (Solving CDP Page Refresh Problem)
Problem: CDP's browser.contexts and context.pages queries trigger page refreshes, causing modals to close and user state to be lost.
Solution: Use browser extension with Chrome's internal debugger API via WebSocket communication.
Architecture:
- Python Side: ExtensionClient acts as WebSocket SERVER listening on 0.0.0.0:17373
- Browser Side: Terminator Bridge extension acts as WebSocket CLIENT connecting to server
- Communication: JSON messages with UUID-based request/response correlation
- API: Provides Playwright-like interface (page.click(), page.fill(), page.evaluate(), page.locator())
- Network: Supports cross-network operation (Python on one machine, Chrome on another)
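UUID-based request/response correlation can be illustrated without a real WebSocket. In the sketch below the message shape (`id`, `method`, `params`, `result`) and the `CorrelatorSketch` class are assumptions made for illustration; the actual protocol is specified in the extension README. A list stands in for the socket, and the fake extension answers in reverse order to show that replies are matched by ID rather than by arrival order.

```python
import asyncio
import json
import uuid
from collections.abc import Awaitable, Callable


class CorrelatorSketch:
    """Matches JSON responses to pending requests by UUID (hypothetical shape)."""

    def __init__(self, send: Callable[[str], Awaitable[None]]) -> None:
        self._send = send
        self._pending: dict[str, asyncio.Future[object]] = {}

    async def call(self, method: str, params: dict[str, object],
                   timeout: float = 1.0) -> object:
        request_id = str(uuid.uuid4())
        future: asyncio.Future[object] = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            await self._send(json.dumps({"id": request_id, "method": method, "params": params}))
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(request_id, None)  # never leak pending entries

    def on_message(self, raw: str) -> None:
        message = json.loads(raw)
        future = self._pending.get(message.get("id"))
        if future is not None and not future.done():
            future.set_result(message.get("result"))


async def demo() -> list[object]:
    outbox: list[str] = []

    async def send(raw: str) -> None:
        outbox.append(raw)

    correlator = CorrelatorSketch(send)
    first = asyncio.create_task(correlator.call("evaluate", {"expression": "document.title"}))
    second = asyncio.create_task(correlator.call("click", {"selector": "button.submit"}))
    while len(outbox) < 2:  # wait until both requests have been "sent"
        await asyncio.sleep(0)

    # Fake extension: reply in reverse order.
    for raw in reversed(outbox):
        request = json.loads(raw)
        correlator.on_message(
            json.dumps({"id": request["id"], "result": f"done:{request['method']}"})
        )
    return [await first, await second]


results = asyncio.run(demo())
print(results)  # ['done:evaluate', 'done:click']
```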
Key Files:
- src/guide/app/browser/extension_client.py: ExtensionClient (WebSocket server) and ExtensionPage (Playwright-like API with React-compatible fill/type methods using native property descriptors)
- src/guide/app/browser/types.py: PageLike protocol for unified interface across Playwright and extension pages
- extension/worker.js: MV3 service worker handling WebSocket connection and debugger API
- extension/content.js: Content script for extension wake-up handshakes
- extension/manifest.json: Manifest V3 configuration
Setup:
- Load extension in Chrome: chrome://extensions → "Load unpacked" → select extension/ directory
- Navigate Chrome to target page (e.g., https://stg.raindrop.com/)
- Python code uses ExtensionClient:

import asyncio

from guide.app.browser.extension_client import ExtensionClient

async def main() -> None:
    async with ExtensionClient() as client:
        page = await client.get_page()
        await page.click("button.submit")
        await page.fill("input[name='email']", "user@example.com")
        title = await page.evaluate("document.title")

asyncio.run(main())
Benefits:
- ✅ No page refresh when interacting with browser
- ✅ Modals stay open during automation
- ✅ User state preserved across operations
- ✅ Works across network (Python and Chrome on different machines)
- ✅ Familiar Playwright-like API
Restrictions:
- Chrome debugger cannot attach to restricted pages (chrome://, chrome-extension://, devtools://, edge://, about:)
- Active tab must be on a regular webpage
- Requires browser extension to be loaded and Chrome running
Testing:
- test_extension_client.py: Basic connectivity and API validation
- test_sourcing_form_extension.py: Form filling demonstration without page refresh
GraphQL & Data Layer
- raindrop/graphql.py: HTTP client (httpx, 10s timeout).
- raindrop/operations/: query/mutation definitions + Pydantic-validated response models.
- raindrop/generated/: Auto-generated Pydantic models (via ariadne-codegen) from Raindrop GraphQL schema. Type-safe; auto-synced with schema.
- raindrop/queries/: GraphQL query/mutation .graphql files (single source of truth for operations).
- Validate all responses with Pydantic models; schema mismatches → GuideError.
- Never embed tokens/URLs; pull from AppSettings (env-driven).
- Transport errors → GraphQLTransportError; operation errors → GraphQLOperationError (includes details from server).
Selector & String Management (strings/)
- Flattened Registry (Phase 1 refactoring complete): All selectors, labels, copy in strings/ submodules; no 3-layer wrappers.
- Direct field access: app_strings.intake.field (was app_strings.intake.selectors.field). Registry enforces domain-keyed lookups; missing keys raise immediately.
- Organize into: selectors/ (CSS, data-testid, aria attributes), labels/ (UI element names), demo_texts/ (copy/text snippets).
- Selectors should be reusable and labeled; avoid brittle text selectors and prefer data-testid or aria labels.
- No inline UI strings in action code; all strings centralized in strings/ for easy maintenance and i18n readiness.
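One way to get flat, domain-keyed access that fails immediately on a missing key is a set of frozen dataclasses. This is a sketch of the idea only: the `IntakeStrings` fields and selector values are invented for illustration, and the real registry in strings/registry.py may be built differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IntakeStrings:
    """Hypothetical intake domain; field names and values are illustrative only."""
    description_field: str = "[data-testid='intake-description']"
    submit_button: str = "[aria-label='Submit request']"


@dataclass(frozen=True)
class AppStrings:
    """Flat registry: one attribute per domain, one attribute per string."""
    intake: IntakeStrings = field(default_factory=IntakeStrings)


app_strings = AppStrings()

# Direct, flat access.
print(app_strings.intake.submit_button)

# A missing key fails at the lookup site instead of yielding an empty selector.
try:
    app_strings.intake.cancel_button  # type checkers flag this line as well
except AttributeError as exc:
    print(f"missing string: {exc}")
```

Because the lookups are plain attributes, a type checker such as basedpyright can catch a misspelled key before the code runs.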
Development Workflow
- Edit code (actions, browser logic, GraphQL ops, etc.).
- Run type check: basedpyright src (catches generic types, missing annotations).
- Sanity compile: python -m compileall src/guide (syntax check).
- Smoke test: python -m guide then hit /docs or manual test via curl.
- Review error handling: ensure GuideError subclasses are raised, not generic exceptions.
- Commit with scoped, descriptive message (e.g., feat: add auth login action, chore: tighten typing).
Type & Linting Standards
- Python 3.12+: Use PEP 604 unions (str | None), built-in generics (list[str], dict[str, JSONValue]).
- Ban Any and # type: ignore: Use type guards or Protocol instead.
- Pydantic v2: Explicit types, model_validate for parsing, model_copy for immutable updates.
- Type checker: Pyright (via basedpyright).
- Docstrings: Imperative style, document public APIs, include usage examples.
Error Handling & Logging
- Always raise GuideError subclasses (not generic Exception); routers translate to HTTP responses.
- Log via core/logging (structured, levelled). Include persona/action IDs and host targets for traceability.
- For browser flows, use Playwright traces (enabled by default in BrowserClient); disable only intentionally.
- Validate external inputs early; surface schema/connection issues as GuideError.
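Request-scoped logging context of the kind described for core/logging can be approximated with `contextvars`. The sketch below uses the standard `logging` module purely for illustration; the variable names, logger name, and log format are assumptions, and the project's own logging module may be implemented with a different library.

```python
import contextvars
import io
import logging

# Hypothetical context variables; "-" marks "no active request".
correlation_id_var: contextvars.ContextVar[str] = contextvars.ContextVar(
    "correlation_id", default="-"
)
persona_id_var: contextvars.ContextVar[str] = contextvars.ContextVar(
    "persona_id", default="-"
)


class ContextFilter(logging.Filter):
    """Copy the current request context onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id_var.get()
        record.persona_id = persona_id_var.get()
        return True


stream = io.StringIO()  # captured in memory so the output can be inspected
handler = logging.StreamHandler(stream)
handler.addFilter(ContextFilter())
handler.setFormatter(logging.Formatter(
    "%(levelname)s %(message)s correlation_id=%(correlation_id)s persona_id=%(persona_id)s"
))

logger = logging.getLogger("guide.sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Simulate one request: set the context, log, then restore the previous values.
correlation_token = correlation_id_var.set("req-123")
persona_token = persona_id_var.set("buyer")
logger.info("action started")
correlation_id_var.reset(correlation_token)
persona_id_var.reset(persona_token)
logger.info("outside request")

print(stream.getvalue())
```

Action code never passes IDs to the logger explicitly; whatever is set for the current request is attached automatically.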
Testing & Quality Gates
- Minimum gate: basedpyright src + python -m compileall src/guide before merge.
- Test Coverage: 28 passing tests across unit and integration suites (in tests/ directory).
- Test Structure:
  - tests/unit/: Unit tests for strings registry, models (persona/host validation), action registration.
  - tests/integration/: Integration tests for BrowserClient, BrowserPool, and browser lifecycle.
  - conftest.py: Shared fixtures (mock_browser_hosts, mock_personas, app_settings, persona_store, action_registry, action_context, mock_page, mock_browser_client, mock_browser_pool).
  - Pytest configured with asyncio_mode=auto for async test support.
- Mock Playwright/GraphQL in tests; avoid real network/CDP calls.
- Require deterministic fixtures; document any env vars needed in test module docstring.
- No loops or conditionals in tests; use @pytest.mark.parametrize for multiple cases.
MFA & Auth
- Default DummyMfaCodeProvider raises NotImplementedError.
- For real runs, implement a provider and wire it in core/config.py or auth/ modules.
- ensure_persona() in actions calls the provider; stub or override for demo/test execution.
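A provider interface of this kind is often expressed as a Protocol. The sketch below is an assumption about shape, not the contents of auth/mfa.py: the `get_code` method name, the `StaticMfaCodeProvider` class, and the sample code are hypothetical, and only the behaviour of the dummy provider (raising NotImplementedError) comes from the text above.

```python
from typing import Protocol


class MfaCodeProvider(Protocol):
    """Assumed interface shape; the real method name may differ."""

    def get_code(self, persona_id: str) -> str: ...


class DummyMfaCodeProvider:
    """Mirrors the documented default: no real codes available."""

    def get_code(self, persona_id: str) -> str:
        raise NotImplementedError("wire a real MFA provider for production runs")


class StaticMfaCodeProvider:
    """Hypothetical test-only provider returning preconfigured codes."""

    def __init__(self, codes: dict[str, str]) -> None:
        self._codes = codes

    def get_code(self, persona_id: str) -> str:
        return self._codes[persona_id]


def login_code(provider: MfaCodeProvider, persona_id: str) -> str:
    """Stand-in for the point where a login flow asks the provider for a code."""
    return provider.get_code(persona_id)


print(login_code(StaticMfaCodeProvider({"buyer": "000000"}), "buyer"))
```

Because the Protocol is structural, test doubles do not need to inherit from anything to satisfy it.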
Performance & Footprint
- Keep browser sessions short-lived; close contexts to avoid handle leaks.
- Cache expensive GraphQL lookups (per-request OK, global only if safe).
- Don't widen dependencies without justification; stick to project pins in pyproject.toml.
- Promptly close Playwright contexts/browser handles (wrapped in contextmanager; keep action code lean).
Refactoring Status (100% Complete)
The project has completed 8 major refactoring phases achieving full architectural modernization:
| Phase | Focus | Status |
|---|---|---|
| Phase 1 | Strings Registry Flattening | ✅ Complete |
| Phase 2A | Persona Model Unification | ✅ Complete |
| Phase 2B | BrowserHostDTO Separation | ✅ Complete |
| Phase 3 | Action Registration Standardization | ✅ Complete |
| Phase 4 | GraphQL Code Generation (ariadne-codegen) | ✅ Complete |
| Phase 5 | Browser Context Isolation | ✅ Complete |
| Phase 6-8 | Error Handling & Logging Enhancements | ✅ Complete |
Quality Metrics:
- Zero type errors (basedpyright)
- All linting passed
- 28 tests passing
- Zero code redundancy
- ~5,229 lines of production code
Git & PR Hygiene
- Scoped, descriptive commits (e.g., feat: add sourcing action, fix: handle missing persona host).
- PRs should state changes, commands run, new config entries (hosts/personas).
- Link related issues; include screenshots/logs for UI or API behavior changes.
- Never commit credentials, MFA tokens, or sensitive config; use env overrides.
Action Registration & Dependency Injection
Registering a New Action
Use the @register_action decorator to auto-discover and register actions:
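# Import paths follow the package layout described under Code Structure
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.types import PageLike
from guide.app.models.domain.models import ActionContext, ActionResult
from guide.app.models.personas.store import PersonaStore

@register_action
class MyAction(DemoAction):
    id = "my-action"
    description = "Does something cool"
    category = "demo"

    # Optional: Declare dependencies (auto-injected by ActionRegistry)
    def __init__(self, persona_store: PersonaStore) -> None:
        self.persona_store = persona_store

    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        # Implementation
        ...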
Auto-Discovery: ActionRegistry uses pkgutil.walk_packages() to discover all modules in actions/ and collect all @register_action decorated classes. No manual registration needed.
Dependency Injection: Parameters in __init__ are matched by name against DI context dict. Example: persona_store parameter → resolved from DI context during action instantiation.
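The two mechanisms above (a decorator that collects classes, and constructor parameters matched by name against a DI context) can be sketched with `inspect.signature`. This is an illustration only: the module-level list, the `instantiate` helper, and the sample classes are hypothetical, and package walking with pkgutil is left out because it needs a real package on disk.

```python
import inspect

_REGISTERED: list[type] = []


def register_action(cls: type) -> type:
    """Collect decorated classes at import time (sketch of the decorator's role)."""
    _REGISTERED.append(cls)
    return cls


def instantiate(cls: type, di_context: dict[str, object]) -> object:
    """Pass each __init__ parameter the DI entry that has the same name."""
    kwargs: dict[str, object] = {}
    for name, param in inspect.signature(cls).parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args / **kwargs are not injectable by name
        if name in di_context:
            kwargs[name] = di_context[name]
        elif param.default is inspect.Parameter.empty:
            raise LookupError(f"{cls.__name__} needs '{name}', which is not in the DI context")
    return cls(**kwargs)


@register_action
class LoginSketch:
    id = "login-sketch"

    def __init__(self, persona_store: dict[str, str]) -> None:
        self.persona_store = persona_store


@register_action
class PingSketch:
    id = "ping-sketch"  # no dependencies declared


di_context: dict[str, object] = {
    "persona_store": {"buyer": "buyer@example.com"},
    "raindrop_base_url": "https://app.raindrop.com",
}
registry = {cls.id: instantiate(cls, di_context) for cls in _REGISTERED}
print(sorted(registry))  # ['login-sketch', 'ping-sketch']
```

Entries in the DI context that no constructor asks for (here `raindrop_base_url`) are simply ignored, while a required parameter with no matching entry fails at startup rather than at request time.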
Multi-Step Workflows (CompositeAction)
For workflows spanning multiple steps with shared state:
from guide.app.actions.base import CompositeAction, register_action
from guide.app.models.domain.models import ActionResult

@register_action
class MyWorkflow(CompositeAction):
    id = "my-workflow"
    description = "Multi-step workflow"
    category = "demo"
    child_actions = ("step1-action", "step2-action", "step3-action")

    async def on_step_complete(self, step_id: str, result: ActionResult) -> None:
        # Optional: Update shared_state after each step
        # Accessed in child actions via context.shared_state dict
        pass
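How shared state lets a later step build on an earlier one can be shown with plain coroutines. The sketch is deliberately simplified and makes assumptions: steps receive the shared dictionary directly rather than an ActionContext, and `run_composite` plus the step names are hypothetical, not the CompositeAction implementation.

```python
import asyncio
from collections.abc import Awaitable, Callable

SharedState = dict[str, object]
Step = Callable[[SharedState], Awaitable[dict[str, object]]]


async def create_request(shared_state: SharedState) -> dict[str, object]:
    # The first step publishes a value for later steps.
    shared_state["request_id"] = "REQ-1"
    return {"created": True}


async def add_supplier(shared_state: SharedState) -> dict[str, object]:
    # The second step consumes what the first step stored.
    return {"attached_to": shared_state["request_id"]}


async def run_composite(
    steps: dict[str, Step],
) -> tuple[dict[str, dict[str, object]], SharedState]:
    shared_state: SharedState = {}
    results: dict[str, dict[str, object]] = {}
    for step_id, step in steps.items():  # dicts preserve insertion order
        results[step_id] = await step(shared_state)
    return results, shared_state


results, shared_state = asyncio.run(
    run_composite({"create-request": create_request, "add-supplier": add_supplier})
)
print(results)
```

The ordering of child steps matters: swapping the two steps here would raise a KeyError because `request_id` would not exist yet.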
High-Level Browser Interactions (PageHelpers Pattern)
For actions requiring browser interactions, use the PageHelpers class for a fluent, high-level API:
from guide.app.actions.base import DemoAction, register_action
from guide.app.browser.helpers import PageHelpers
from guide.app.browser.types import PageLike
from guide.app.models.domain.models import ActionContext, ActionResult

@register_action
class MyAction(DemoAction):
    id = "my-action"
    description = "Example using PageHelpers"
    category = "demo"

    async def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        helpers = PageHelpers(page)
        try:
            # Wait for page stability
            await helpers.wait_for_stable()
            # Fill form and advance
            await helpers.fill_and_advance(
                selector="input[name='email']",
                value="user@example.com",
                next_selector="button.next",
            )
            # Search and select
            await helpers.search_and_select(
                search_input="input.search",
                query="example",
                result_selector="li.result",
            )
            # Collapse accordions
            await helpers.collapse_accordions("button.accordion")
        except Exception:
            # Capture diagnostics on error, then re-raise for the error handlers
            await helpers.capture_diagnostics()
            raise
        return ActionResult(details={"status": "success"})
Available PageHelpers methods:
- Wait utilities: wait_for_selector(), wait_for_network_idle(), wait_for_navigation(), wait_for_stable()
- UI operations: fill_and_advance(), search_and_select(), click_and_wait()
- Accordion operations: collapse_accordions()
- Dropdown operations: select_dropdown_options()
- Diagnostics: capture_diagnostics()
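Helpers like these can work with either backend because they depend on a structural interface rather than a concrete page class. The sketch below illustrates that idea with a cut-down protocol; `PageLikeSketch`, `RecordingPage`, and the free function `fill_and_advance` are hypothetical simplifications, not the definitions in browser/types.py or browser/helpers.py.

```python
import asyncio
from typing import Protocol


class PageLikeSketch(Protocol):
    """Cut-down interface: only the two calls this helper relies on."""

    async def click(self, selector: str) -> None: ...
    async def fill(self, selector: str, value: str) -> None: ...


class RecordingPage:
    """Fake backend that records calls; satisfies the protocol structurally."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    async def click(self, selector: str) -> None:
        self.calls.append(("click", selector))

    async def fill(self, selector: str, value: str) -> None:
        self.calls.append(("fill", selector, value))


async def fill_and_advance(
    page: PageLikeSketch, selector: str, value: str, next_selector: str
) -> None:
    """Fill one field, then click the control that advances the form."""
    await page.fill(selector, value)
    await page.click(next_selector)


page = RecordingPage()
asyncio.run(fill_and_advance(page, "input[name='email']", "user@example.com", "button.next"))
print(page.calls)
```

Any object with matching `click` and `fill` coroutines is accepted, which is also what makes a recording fake like this usable in tests.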
Quick Checklist (New Feature)
- Add action in actions/ submodule (or submodule directory like actions/intake/, actions/demo/); use @register_action decorator.
- Type page parameter as PageLike (not Page) to support both Playwright and extension pages.
- Use PageHelpers wrapper for high-level browser interactions (wait utilities, UI operations, diagnostics).
- Add action-specific logic; keep it thin and testable. Use strings/ for all selectors/copy.
- For complex UI interactions, consider using browser/elements/ helpers (dropdown, form) or extending PageHelpers.
- Ensure persona/host exist in config/hosts.yaml + config/personas.yaml (or use env overrides).
- If action interacts with modals or requires no page refresh, use kind: extension browser host (requires Terminator Bridge extension loaded in Chrome).
- If action needs GraphQL, add query/mutation to raindrop/operations/ + .graphql files in raindrop/queries/.
- If action needs UI strings, add to strings/ submodules (selectors, labels, demo_texts).
- Run basedpyright src + python -m compileall src/guide (type check + syntax check).
- Run pytest tests/ to ensure no regressions (28 tests must pass).
- Test via python -m guide + navigate to http://localhost:8765/docs to test endpoint.
- If auth flow required, implement/mock MFA provider or use DummyMfaCodeProvider for testing.
- Review error handling; raise GuideError subclasses, not generic exceptions.
- Commit with descriptive message (e.g., feat: add my-action, test: add my-action tests).