2025-11-21 21:16:24 +00:00


Here's a PRD you can hand to Future-You (or a code assistant) and not want to scream at it later.


PRD: Raindrop Demo Automation Service (FastAPI + Playwright/CDP)

  • Owner: You
  • Service Name (working): raindrop-demo-automation
  • Primary Host: 192.168.50.151 (homelab server)
  • Clients: Stream Deck (HTTP actions), CLI tools, or other automation

1. Background & Problem

You run live Raindrop demos and already:

  • Launch Chrome with CDP (remote debugging) enabled.
  • Manually drive the UI, assisted by Stream Deck macros today.
  • Have Python/Playwright scripts that can attach to that browser and perform actions.

Limitations right now:

  • Scripts run locally, tied to one machine.
  • There isn't a single, extensible service orchestrating “demo actions” (fill forms, advance steps, etc.).
  • Strings, selectors, and values are hardcoded and scattered across scripts, making it easy for a code assistant to duplicate or diverge.

You want:

  • A FastAPI service on your homelab (192.168.50.151) that exposes HTTP endpoints for demo actions.
  • Config that supports multiple demo machines (desktop, laptop via VPN) and is easy to switch.
  • A modular architecture: small packages, short modules, clear facades, zero inline magic values.
  • All input strings / UI texts / selectors in a single package to avoid duplication and make future tooling straightforward.

2. Goals

Functional Goals

  1. Trigger demo actions via HTTP

    • Stream Deck calls endpoints like POST /actions/fill-intake-basic/execute or POST /actions/add-suppliers/execute.
    • Actions attach to an existing CDP-enabled browser and run against the current Raindrop tab.
  2. Support multiple browser hosts

    • Each demo machine (desktop, laptop via VPN) is modeled as a Browser Host with:

      • ID (e.g., desktop, laptop, lab-vm)
      • Host/IP (e.g., 192.168.50.100)
      • CDP port (e.g., 9222)
    • The API caller can specify which host to target per request (query param or payload).

  3. Encapsulate all demo strings and selectors

    • No “free-floating” strings in action code for:

      • Input text (demo narratives, supplier names, event names).
      • UI labels / button text / placeholder strings.
      • CSS/XPath selectors.
    • These live in a central strings package.

  4. Extensible actions architecture

    • Easy to add new actions (e.g., “create 3-bids-and-buy event”, “run 3-way match”) without copy-paste.
    • A registry/facade manages all actions by ID.

Non-Goals

  • No UI or dashboard for now (everything via HTTP / Stream Deck).
  • No multi-tenant security model beyond basic network trust.
  • No scheduling / long-running workflows (actions are short-lived scripts).

3. Users & Usage

Primary User

  • Demo Host (you): uses Stream Deck to trigger HTTP calls to the service while driving a Raindrop demo.

Usage Scenarios

  1. Simple Intake Fill

    • You navigate to the Raindrop intake screen.
    • Hit a Stream Deck button that calls POST /actions/fill-intake-basic/execute targeting your current machine.
    • The service attaches to your browser and fills out description + moves to the next step.
  2. Three Suppliers Auto-Add

    • On the “Suppliers” step of a sourcing event, you hit a Stream Deck button.
    • The service adds three predefined suppliers.
  3. Multi-host Setup

    • Some days you demo from desktop; some days from laptop via VPN.
    • You switch the Stream Deck config or action payload to target browser_host_id=laptop.

4. High-Level Architecture

Components

  1. FastAPI Application (app)

    • Exposes REST endpoints.
    • Handles auth (if/when needed), routing, and validation, and returns structured responses.
  2. Config & Settings (app.core.config)

    • Uses Pydantic BaseSettings + optional YAML for:

      • Known browser hosts (id, host, port).
      • Default browser host.
      • Raindrop tenant URL.
    • No hardcoded values in code; read from config/env.

  3. Strings Package (app.strings)

    • Contains all:

      • Demo text (descriptions, comments).
      • UI labels & button names.
      • CSS/Playwright selectors.
    • Structured by domain (intake, sourcing, payables).

  4. Browser Connector (app.browser)

    • Encapsulates Playwright CDP connections.

    • Provides a BrowserClient or BrowserFacade:

      • connect(browser_host) → returns a connection.
      • get_raindrop_page() → returns a Page to act on (e.g., the tab whose URL matches Raindrop host).
    • This is the only layer that knows about CDP endpoints.

  5. Action Framework (app.actions)

    • Base DemoAction protocol/class:

      • id: str
      • run(page, context) -> ActionResult
    • Each domain package implements focused actions:

      • intake.py (fill forms, advance steps).
      • sourcing.py (add suppliers, configure event).
      • navigation.py (jump to certain pages).
    • An ActionRegistry maps action IDs → action objects.

  6. Domain Models (app.domain)

    • Typed models with Pydantic for:

      • ActionRequest, ActionResponse.
      • BrowserHost and config structures.
      • ActionContext (host id, session info, optional parameters).

5. Detailed Design

5.1 Directory Layout (Python Package)

Example layout (you can tweak):

app/
  __init__.py
  main.py               # FastAPI app instance
  api/
    __init__.py
    routes_actions.py   # /actions endpoints
    routes_config.py    # optional: expose config/browser hosts
    routes_health.py    # /healthz
  core/
    __init__.py
    config.py           # Pydantic settings and config loading
    logging.py          # logging setup
  domain/
    __init__.py
    models.py           # ActionRequest, ActionResponse, BrowserHost, ActionContext
    enums.py            # Action IDs, maybe host status enums
  browser/
    __init__.py
    client.py           # BrowserFacade/BrowserClient implementation
    page_selector.py    # logic to pick the correct Raindrop tab
  actions/
    __init__.py
    base.py             # DemoAction interface, ActionRegistry
    intake.py           # intake-related actions
    sourcing.py         # sourcing-related actions
    navigation.py       # navigation actions
  strings/
    __init__.py
    selectors.py        # all CSS/xpath selectors
    labels.py           # visible UI labels/button names
    demo_texts.py       # all pre-baked text content
config/
  strings.yaml          # optional external strings source
  hosts.yaml            # browser host definitions (desktop, laptop, etc.)

Guideline: Each module should be small, focused, and under ~200-300 lines. If a module grows, split it further (e.g., intake_create.py, intake_approve.py).


5.2 Config & Settings

Use Pydantic BaseSettings to load:

  • Environment variables (for secrets, host IP).
  • YAML/JSON for structured config (hosts, string groups).

Example conceptual model:

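# app/core/config.py
# Assumes Pydantic v2, where BaseSettings lives in the pydantic-settings package.
# On Pydantic v1, import BaseSettings from pydantic and use an inner `class Config`.
from typing import Dict

from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict

class BrowserHostConfig(BaseModel):
    # One CDP-enabled demo machine; a plain model, since it is nested config.
    id: str
    host: str
    cdp_port: int

class AppSettings(BaseSettings):
    raindrop_base_url: str
    default_browser_host_id: str
    browser_hosts: Dict[str, BrowserHostConfig]  # keyed by id

    # Environment variables are read with this prefix, e.g. RAINDROP_DEMO_RAINDROP_BASE_URL.
    model_config = SettingsConfigDict(env_prefix="RAINDROP_DEMO_")
    # Optionally load from config/hosts.yaml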

Requirements:

  • No raw IPs or ports in code. IP 192.168.50.151 is used at deployment level (e.g., uvicorn bind host), not in business logic.

  • Changing default host or adding a laptop host should mean:

    • Update hosts.yaml and/or env var.
    • Restart service, no code changes.
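
To illustrate the "default host, overridable per request" requirement, here is a minimal stdlib-only sketch of host resolution. The names `BrowserHost`, `UnknownBrowserHostError`, `resolve_browser_host`, and `EXAMPLE_HOSTS` are hypothetical, a dataclass stands in for the Pydantic model, and the laptop hostname is made-up example data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BrowserHost:
    """Stand-in for BrowserHostConfig; the real service would use the Pydantic model."""
    id: str
    host: str
    cdp_port: int


class UnknownBrowserHostError(KeyError):
    """Raised when a request names a host id that is not configured."""


def resolve_browser_host(
    hosts: Dict[str, BrowserHost],
    default_id: str,
    requested_id: Optional[str] = None,
) -> BrowserHost:
    """Return the requested host, falling back to the configured default."""
    host_id = requested_id or default_id
    try:
        return hosts[host_id]
    except KeyError:
        raise UnknownBrowserHostError(host_id) from None


# Example data only. In the service these values would come from hosts.yaml
# or environment variables, never from code.
EXAMPLE_HOSTS = {
    "desktop": BrowserHost(id="desktop", host="192.168.50.100", cdp_port=9222),
    "laptop": BrowserHost(id="laptop", host="laptop.vpn.example", cdp_port=9222),
}
```

With this shape, switching the default machine only changes the `default_id` value that is read from config; callers that pass `requested_id` are unaffected.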

5.3 Strings Package

Objective: Any textual thing that might be typed into or read from the UI lives here.

Submodules:

  1. selectors.py

    • All selectors used by Playwright:

      • e.g., INTAKE_DESCRIPTION_FIELD, BUTTON_NEXT, SUPPLIER_SEARCH_INPUT.
    • Prefer centrally named constants:

      • selectors.INTAKE.DESCRIPTION_FIELD
      • selectors.SOURCING.SUPPLIER_SEARCH_INPUT
    • Keep selectors DRY, referenced by actions.

  2. labels.py

    • Just the user-visible string labels: e.g., "Next", "Submit", "Suppliers".
    • Some selectors may be derived from labels (e.g., Playwright get_by_role("button", name=labels.NEXT_BUTTON)).
  3. demo_texts.py

    • All “scripted” text you want to appear in demos:

      • Intakes (“500 tons of conveyor belts…”).
      • Event names.
      • Supplier names.
    • Grouped by scenario:

      • INTAKE.CONVEYOR_BELT_REQUEST
      • SOURCING.THREE_BIDS_EVENT_NAME
      • SUPPLIERS.DEFAULT_TRIO = ["Demo Supplier A", "Demo Supplier B", "Demo Supplier C"]

Optional: load from external strings.yaml but always surfaced through app.strings to keep a single import point.

Rule: Actions must never use raw strings for content or selectors directly; they import from app.strings.
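
As a rough illustration of the single import point, the three submodules could expose nested namespaces like the ones below. This is only a sketch: the selector values are placeholders (the real ones depend on Raindrop's DOM), the conveyor belt text is kept truncated exactly as it appears in this spec, and `supplier_search_steps` is a hypothetical helper. A tuple is used for the supplier trio so the constant cannot be mutated by accident.

```python
from types import SimpleNamespace
from typing import List, Tuple

# Would live in app/strings/selectors.py. Placeholder selector values.
selectors = SimpleNamespace(
    INTAKE=SimpleNamespace(
        DESCRIPTION_FIELD="[data-testid='intake-description']",
    ),
    SOURCING=SimpleNamespace(
        SUPPLIER_SEARCH_INPUT="[data-testid='supplier-search']",
    ),
)

# Would live in app/strings/labels.py. User-visible labels only.
labels = SimpleNamespace(
    NEXT_BUTTON="Next",
    SUBMIT_BUTTON="Submit",
    SUPPLIERS_STEP="Suppliers",
)

# Would live in app/strings/demo_texts.py. Scripted demo content.
demo_texts = SimpleNamespace(
    INTAKE=SimpleNamespace(
        CONVEYOR_BELT_REQUEST="500 tons of conveyor belts…",
    ),
    SUPPLIERS=SimpleNamespace(
        DEFAULT_TRIO=("Demo Supplier A", "Demo Supplier B", "Demo Supplier C"),
    ),
)


def supplier_search_steps() -> List[Tuple[str, str]]:
    """Pair the supplier search selector with each scripted supplier name."""
    field = selectors.SOURCING.SUPPLIER_SEARCH_INPUT
    return [(field, name) for name in demo_texts.SUPPLIERS.DEFAULT_TRIO]
```

An action would then iterate over `supplier_search_steps()` instead of spelling out supplier names or selectors itself.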


5.4 Browser Connector

app.browser.client.BrowserClient (or BrowserFacade):

Responsibilities:

  • Resolve BrowserHost config by id.

  • Connect to CDP endpoint:

    • http://{host}:{cdp_port} via playwright.chromium.connect_over_cdp.
  • Resolve the correct Raindrop page:

    • Prefer a page whose URL contains settings.raindrop_base_url.
    • If multiple, pick the last active or last created (simple heuristic).
  • Provide a simple interface to actions:

    • get_raindrop_page() returns a Playwright Page.
    • It should handle errors gracefully: no host, cannot connect, no pages.

Actions use only this abstraction; they never touch raw CDP URLs.
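
The endpoint format and the tab-picking heuristic can be sketched without Playwright as two pure functions. The function names are hypothetical, and the sketch assumes the page URLs are listed in creation order, so "last match" approximates "last created". In the real connector, the endpoint string would be passed to `playwright.chromium.connect_over_cdp(...)` and the URLs would be read from each page's `url` property.

```python
from typing import Optional, Sequence


def cdp_endpoint(host: str, cdp_port: int) -> str:
    """Build the CDP endpoint URL for a browser host."""
    return f"http://{host}:{cdp_port}"


def pick_raindrop_page_index(
    page_urls: Sequence[str],
    raindrop_base_url: str,
) -> Optional[int]:
    """Return the index of the last page whose URL contains the base URL.

    Returns None when no open tab matches, so the caller can report a
    structured "no Raindrop page" error instead of guessing.
    """
    for index in range(len(page_urls) - 1, -1, -1):
        if raindrop_base_url in page_urls[index]:
            return index
    return None
```

Keeping the heuristic as a pure function over URLs makes it easy to unit test without a running browser; only the thin wrapper around it needs Playwright.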


5.5 Action Framework

Base interface (app.actions.base):

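from typing import Protocol
from playwright.sync_api import Page
from app.domain.models import ActionContext, ActionResult  # module location assumed

class DemoAction(Protocol):
    id: str

    def run(self, page: Page, context: ActionContext) -> ActionResult:
        ...

  • ActionContext includes: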

    • browser_host_id: str
    • Optional parameters (e.g., override event name).
    • Correlation id for logging.

Action Registry:

  • Maintains a mapping of action_id → DemoAction instance.

  • Prevents duplication: all action IDs are declared in one place.

  • Provides methods:

    • get(action_id: str) -> DemoAction
    • list() -> List[ActionMetadata]
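
A minimal registry along these lines might look like the sketch below. It is stdlib-only and makes a few assumptions: `register` is a hypothetical method name, the `description` and `category` attributes are assumed so that `list()` can return the metadata the /actions endpoint needs, and duplicate IDs are rejected with a `ValueError`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol


@dataclass(frozen=True)
class ActionMetadata:
    id: str
    description: str
    category: str


class DemoAction(Protocol):
    # description and category are assumptions added for listing purposes.
    id: str
    description: str
    category: str

    def run(self, page: Any, context: Any) -> Any:
        ...


class ActionRegistry:
    """Maps action IDs to action instances and rejects duplicate IDs."""

    def __init__(self) -> None:
        self._actions: Dict[str, DemoAction] = {}

    def register(self, action: DemoAction) -> None:
        if action.id in self._actions:
            raise ValueError(f"Duplicate action id: {action.id}")
        self._actions[action.id] = action

    def get(self, action_id: str) -> DemoAction:
        # Raises KeyError for unknown IDs; the API layer can map that to a 404.
        return self._actions[action_id]

    def list(self) -> List[ActionMetadata]:
        return [
            ActionMetadata(id=a.id, description=a.description, category=a.category)
            for a in sorted(self._actions.values(), key=lambda a: a.id)
        ]
```

Failing loudly on a duplicate ID at registration time is one way to enforce the "declared in one place" rule, since a copy-pasted action surfaces at startup, not mid-demo.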

Example Actions:

  • FillIntakeBasicAction

    • Uses selectors.INTAKE.DESCRIPTION_FIELD.
    • Uses demo_texts.INTAKE.CONVEYOR_BELT_REQUEST.
    • Calls page.fill() + page.click().
  • AddThreeSuppliersAction

    • Uses demo_texts.SUPPLIERS.DEFAULT_TRIO.

Each action file should be short (one or a few related actions) and import everything from strings + browser.
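
As a sketch of how small an action can stay, here is one possible shape for FillIntakeBasicAction. Everything beyond the names used in this spec is an assumption: `PageLike` is a hypothetical protocol covering only the `fill` and `click` methods of Playwright's Page, the selector values are placeholders, the inline `IntakeSelectors` and `IntakeTexts` classes stand in for imports from app.strings, and the `description` override parameter is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Protocol


class PageLike(Protocol):
    """The subset of Playwright's Page API this action relies on."""

    def fill(self, selector: str, value: str) -> None: ...
    def click(self, selector: str) -> None: ...


# Stand-ins for app.strings; placeholder values, not real Raindrop selectors.
class IntakeSelectors:
    DESCRIPTION_FIELD = "[data-testid='intake-description']"
    NEXT_BUTTON = "[data-testid='intake-next']"


class IntakeTexts:
    CONVEYOR_BELT_REQUEST = "500 tons of conveyor belts…"


@dataclass
class ActionContext:
    browser_host_id: str
    correlation_id: str
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ActionResult:
    status: str
    details: Dict[str, Any] = field(default_factory=dict)


class FillIntakeBasicAction:
    id = "fill-intake-basic"

    def run(self, page: PageLike, context: ActionContext) -> ActionResult:
        # Allow an optional per-request override, else use the scripted text.
        description = context.params.get(
            "description", IntakeTexts.CONVEYOR_BELT_REQUEST
        )
        page.fill(IntakeSelectors.DESCRIPTION_FIELD, description)
        page.click(IntakeSelectors.NEXT_BUTTON)
        return ActionResult(status="ok", details={"filled": "description"})
```

Because the action depends only on `fill` and `click`, it can be exercised in tests with a fake page object that records calls, with no browser attached.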


5.6 FastAPI Layer & API Design

Endpoints

  1. GET /healthz

    • Returns { "status": "ok" }.
  2. GET /actions

    • Returns a list of available actions:

      • id, description, category (e.g., intake, sourcing).
  3. POST /actions/{action_id}/execute

    • Request body:

      • browser_host_id (optional → use default).
      • params (optional dict, action-specific).
    • Behavior:

      • Look up host; create ActionContext.
      • Use BrowserClient to connect to host and get Raindrop page.
      • Run the action; return result status + optional metadata.
    • Response body:

      • status (ok / error).
      • action_id.
      • browser_host_id.
      • details (optional).
  4. GET /config/browser-hosts (optional)

    • Returns configured browser hosts and default host.
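
The behavior described for POST /actions/{action_id}/execute can be sketched as a framework-free function that a FastAPI route would delegate to. This is an assumption-heavy outline, not the service's implementation: actions are modeled as plain callables, `handle_execute` and `_response` are hypothetical names, the error codes UNKNOWN_ACTION and UNKNOWN_BROWSER_HOST are invented, and `get_raindrop_page` is injected so the flow can run without a browser.

```python
import uuid
from typing import Any, Callable, Dict, Mapping, Optional, Set

# An action here is any callable taking (page, context) and returning details.
ActionFn = Callable[[Any, Dict[str, Any]], Optional[Dict[str, Any]]]


def _response(
    status: str,
    action_id: str,
    host_id: str,
    details: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    return {
        "status": status,
        "action_id": action_id,
        "browser_host_id": host_id,
        "details": details,
    }


def handle_execute(
    action_id: str,
    body: Mapping[str, Any],
    actions: Mapping[str, ActionFn],
    known_host_ids: Set[str],
    default_host_id: str,
    get_raindrop_page: Callable[[str], Any],
) -> Dict[str, Any]:
    # 1. Look up the host, falling back to the configured default.
    host_id = body.get("browser_host_id") or default_host_id
    if action_id not in actions:
        return _response("error", action_id, host_id, {"error": "UNKNOWN_ACTION"})
    if host_id not in known_host_ids:
        return _response("error", action_id, host_id, {"error": "UNKNOWN_BROWSER_HOST"})

    # 2. Build the context handed to the action.
    context = {
        "browser_host_id": host_id,
        "params": dict(body.get("params") or {}),
        "correlation_id": uuid.uuid4().hex,
    }

    # 3. Connect, pick the Raindrop page, run the action.
    page = get_raindrop_page(host_id)
    details = actions[action_id](page, context)
    return _response("ok", action_id, host_id, details)
```

Keeping this logic outside the route function means the HTTP layer only translates between requests, this function, and status codes.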

Telemetry & Error Handling:

  • Every request logs:

    • action_id, browser_host_id, correlation_id, duration_ms, result.
  • In case of error, return 4xx/5xx with structured error JSON:

    • e.g., {"error": "BROWSER_CONNECT_FAILED", "message": "Cannot connect to 192.168.50.100:9222"}.
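
One way to keep error payloads and per-request logging uniform is a pair of small helpers like the ones below. Treat this as a sketch: only BROWSER_CONNECT_FAILED comes from this spec, while the other error codes, the status code mapping, the logger name, and the helper names `error_response` and `log_action` are assumptions.

```python
import json
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator, Tuple

logger = logging.getLogger("raindrop_demo")

# Hypothetical mapping from error code to HTTP status.
ERROR_STATUS = {
    "UNKNOWN_ACTION": 404,
    "UNKNOWN_BROWSER_HOST": 404,
    "NO_RAINDROP_PAGE": 409,
    "BROWSER_CONNECT_FAILED": 502,
}


def error_response(code: str, message: str) -> Tuple[int, Dict[str, str]]:
    """Return (http_status, json_body); unknown codes fall back to 500."""
    return ERROR_STATUS.get(code, 500), {"error": code, "message": message}


@contextmanager
def log_action(
    action_id: str,
    browser_host_id: str,
    correlation_id: str,
) -> Iterator[Dict[str, object]]:
    """Log one JSON line per request with duration_ms and result."""
    record: Dict[str, object] = {
        "action_id": action_id,
        "browser_host_id": browser_host_id,
        "correlation_id": correlation_id,
        "result": "ok",
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception:
        record["result"] = "error"
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 1)
        logger.info(json.dumps(record))
```

A route would wrap the action call in `with log_action(...)` and convert caught failures with `error_response(...)`, so every request produces exactly one log line whether it succeeds or fails.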

6. Networking & Deployment

  • Backend Location: FastAPI app running on your homelab server (192.168.50.151).

  • Bind Address: 0.0.0.0 (so local network & VPN clients can hit it).

  • Port: configurable via env (e.g., default 8000).

  • Demo Machines:

    • Desktop/laptop each runs Chrome with:

      • --remote-debugging-port=9222
      • Known IP or DNS reachable from 192.168.50.151.
    • hosts.yaml defines each as a BrowserHost.

  • Stream Deck:

    • Configured to call e.g.:

      • http://192.168.50.151:8000/actions/fill-intake-basic/execute?browser_host_id=desktop
      • or with JSON body specifying host.
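
For CLI tools or quick manual testing, the same call a Stream Deck button makes could be built with the standard library. This sketch uses the JSON-body variant; `build_execute_request` is a hypothetical helper, and port 8000 is the example default mentioned above.

```python
import json
import urllib.request
from typing import Any, Dict, Optional
from urllib.parse import quote


def build_execute_request(
    base_url: str,
    action_id: str,
    browser_host_id: Optional[str] = None,
    params: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /actions/{action_id}/execute."""
    body: Dict[str, Any] = {}
    if browser_host_id is not None:
        body["browser_host_id"] = browser_host_id
    if params:
        body["params"] = params
    url = f"{base_url.rstrip('/')}/actions/{quote(action_id, safe='')}/execute"
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send it, pass the request to `urllib.request.urlopen(request, timeout=5)`; omitting `browser_host_id` lets the service fall back to its default host.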

7. Extensibility & Code-Assistant Friendliness

Extensibility

  • To add a new action:

    1. Add any new strings/selectors to app.strings.
    2. Implement a small DemoAction in the right actions/* module.
    3. Register it in the ActionRegistry.
    4. Optionally expose documentation via /actions.
  • To support a new demo machine:

    • Add BrowserHostConfig entry in hosts.yaml.
    • Restart service.

Code-Assistant Guardrails

  • All core primitive operations (connect to browser, select Raindrop page, use selectors, pick texts) are centralized:

    • browser.client for CDP.
    • strings for text and selectors.
    • actions.base & ActionRegistry for action definitions.
  • PRD requirement: modules must be small and self-contained; code assistants should:

    • Prefer using existing BrowserClient instead of re-implementing CDP logic.
    • Prefer using strings instead of writing raw strings.
    • Use ActionRegistry for action lookup.

8. Security & Safety

  • Runs on trusted internal network/VPN.

  • Optional enhancements:

    • Simple API key header for Stream Deck / other clients.
    • IP allowlist (only allow local subnet/VPN).
  • No credentials in code:

    • Any secrets (e.g., if you ever login from the service) stored in env, not in repo.
  • Actions should never perform destructive or irreversible operations in Raindrop unless explicitly designed and named that way (e.g., submit-event, delete-draft).
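
If you later add the optional API key and IP allowlist, the checks themselves can be tiny. The sketch below is stdlib-only and the function names are hypothetical; in the service, the expected key would come from an environment variable and the networks from config.

```python
import hmac
import ipaddress
from typing import Iterable, Optional


def api_key_is_valid(provided: Optional[str], expected: str) -> bool:
    """Compare keys in constant time; a missing or empty key is rejected."""
    if not provided:
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def client_is_allowed(client_ip: str, allowed_networks: Iterable[str]) -> bool:
    """Return True when client_ip falls inside any allowed CIDR network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable addresses are treated as not allowed.
        return False
    return any(
        address in ipaddress.ip_network(network) for network in allowed_networks
    )
```

For example, `client_is_allowed("192.168.50.151", ["192.168.50.0/24"])` accepts a homelab address, assuming the local subnet is a /24; the actual subnet and any VPN range are deployment details this spec does not state.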


9. Acceptance Criteria

  1. End-to-end happy path:

    • From desktop, with Chrome CDP running, you trigger fill-intake-basic via Stream Deck and the intake form is filled and advanced.
  2. Multi-host support:

    • You can successfully run the same action against desktop and laptop by only changing browser_host_id.
  3. Strings centralization:

    • A grep for known demo text ("conveyor belts", "Demo Supplier A") returns only app/strings/* files.
    • No selectors appear directly in actions/*.
  4. Small module sizes:

    • No file exceeds an agreed limit (e.g. 300 LoC), except possibly domain/models.py if that's acceptable.
  5. Extensibility check:

    • You can add a new action (e.g., open-intake-dashboard) following documented steps without touching existing actions, and it appears in /actions.
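
Criteria 3 and 4 lend themselves to an automated check. The following is a rough sketch of such a script, assuming it is pointed at the app/ directory; `files_over_limit` and `stray_strings` are hypothetical names, and the script only covers literal demo text, not selectors.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def files_over_limit(root: Path, max_lines: int = 300) -> List[Tuple[str, int]]:
    """List (relative_path, line_count) for Python files above max_lines."""
    offenders = []
    for path in sorted(root.rglob("*.py")):
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count > max_lines:
            offenders.append((path.relative_to(root).as_posix(), line_count))
    return offenders


def stray_strings(
    root: Path,
    needles: Iterable[str],
    allowed_dir: str = "strings",
) -> List[Tuple[str, str]]:
    """List (relative_path, needle) hits found outside root/allowed_dir."""
    wanted = list(needles)
    hits = []
    for path in sorted(root.rglob("*.py")):
        relative = path.relative_to(root)
        if relative.parts[0] == allowed_dir:
            continue  # demo text is expected to live here
        text = path.read_text(encoding="utf-8")
        hits.extend((relative.as_posix(), needle) for needle in wanted if needle in text)
    return hits
```

Running both functions in CI, or as a pre-commit step, and failing on any non-empty result would turn these two criteria into a standing guardrail for code assistants as well as for you.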