noteflow/CLAUDE.md
2026-01-02 04:22:40 +00:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

NoteFlow is an intelligent meeting notetaker: local-first audio capture + navigable recall + evidence-linked summaries. It is a client-server system built around a gRPC API for bidirectional audio streaming and transcription. The repository includes:

  • A Python backend (src/noteflow/) hosting the gRPC server, domain logic, and infrastructure adapters.
  • A Tauri + React desktop client (client/) that uses Rust for IPC and a React UI (Vite).

The gRPC schema is the shared contract between backend and client; keep proto changes in sync across Python, Rust, and TypeScript.

Quick Orientation (Start Here)

  • Backend server entry point: python -m noteflow.grpc.server (implementation in src/noteflow/grpc/service.py).
  • Tauri/React client: cd client && npm run dev (web), npm run tauri dev (desktop).
  • Tauri IPC bridge: TypeScript adapters in client/src/api/tauri-adapter.ts invoke Rust commands in client/src-tauri/src/commands/.
  • Protobuf schema and generated Python stubs live in src/noteflow/grpc/proto/.

Build and Development Commands

# Install (editable with dev dependencies)
python -m pip install -e ".[dev]"

# Run gRPC server
python -m noteflow.grpc.server --help

# Run Tauri + React client UI
cd client
npm install
npm run dev
# Desktop Tauri dev (requires Rust toolchain)
npm run tauri dev
cd -

# Tests
pytest                           # Full suite
pytest -m "not integration"      # Skip external-service tests
pytest tests/domain/             # Run specific test directory
pytest -k "test_segment"         # Run by pattern

# Linting and type checking
ruff check .                     # Lint
ruff check --fix .               # Autofix
mypy src/noteflow                # Strict type checks
basedpyright                     # Additional type checks

# Client lint/format (from client/)
npm run lint                      # Biome
npm run lint:fix                  # Biome autofix
npm run format                    # Prettier
npm run format:check              # Prettier check
npm run lint:rs                   # Clippy (Rust)
npm run format:rs                 # rustfmt (Rust)

# Docker development
docker compose up -d postgres    # PostgreSQL with health checks
python scripts/dev_watch_server.py  # Auto-reload server (watches src/)

Docker Development

IMPORTANT: The server runs with hot-reload enabled. Assume Docker services are always running. Never restart, rebuild, or stop containers without explicit user permission—doing so disrupts the development workflow.

# Start PostgreSQL (with pgvector)
docker compose up -d postgres

# Dev container (VS Code) - full GUI environment
# .devcontainer/ includes PortAudio, GTK, pystray, pynput support
code .  # Open in VS Code, select "Reopen in Container"

# Development server with auto-reload
python scripts/dev_watch_server.py  # Uses watchfiles, monitors src/ and alembic.ini

Dev container features: dbus-x11, GTK-3, libgl1 for system tray and hotkey support.

Forbidden Docker Operations (without explicit permission)

  • docker compose build — rebuilds images, disrupts running containers
  • docker compose up / down / restart — starts/stops services
  • docker stop / docker kill — kills running containers
  • Any command that would interrupt the hot-reload server

Code changes are automatically picked up by the watchfiles-based hot-reload. If you need to suggest a Docker operation, ask the user first.

Architecture

src/noteflow/
├── domain/           # Entities + ports (see Domain Package Structure below)
├── application/      # Use-cases/services (Meeting, Recovery, Export, Summarization, Trigger, Webhook, Calendar, Retention, NER)
├── infrastructure/   # Implementations
│   ├── audio/        # sounddevice capture, ring buffer, VU levels, playback, buffered writer
│   ├── asr/          # faster-whisper engine, VAD segmenter, streaming
│   ├── diarization/  # Speaker diarization (streaming: diart, offline: pyannote.audio)
│   ├── summarization/# Multi-provider summarization (CloudProvider, OllamaProvider) + citation verification
│   ├── triggers/     # Auto-start signal providers (calendar, audio activity, foreground app)
│   ├── persistence/  # SQLAlchemy + asyncpg + pgvector, Alembic migrations
│   ├── security/     # keyring keystore, AES-GCM encryption
│   ├── crypto/       # Cryptographic utilities
│   ├── export/       # Markdown/HTML/PDF export
│   ├── webhooks/     # Webhook executor with retry logic and HMAC signing
│   ├── converters/   # ORM ↔ domain entity converters (including webhook, NER, calendar, integration)
│   ├── calendar/     # OAuth manager, Google/Outlook calendar adapters
│   ├── ner/          # Named entity recognition engine (spaCy)
│   ├── observability/# OpenTelemetry tracing, usage event tracking
│   ├── metrics/      # Metric collection utilities
│   ├── logging/      # Log buffer and utilities
│   └── platform/     # Platform-specific code
├── grpc/             # Proto definitions, server, client, meeting store, modular mixins
├── cli/              # CLI tools (retention management, model commands)
└── config/           # Pydantic settings (NOTEFLOW_ env vars) + feature flags

Frontend (Tauri + React) lives outside the Python package:

client/
├── src/              # React UI, state, and view components
├── src-tauri/        # Rust/Tauri shell, IPC commands, gRPC client
├── e2e/              # Playwright tests
├── package.json      # Vite + test/lint scripts
└── vite.config.ts    # Vite configuration

Key patterns:

  • Hexagonal architecture: domain → application → infrastructure
  • Repository pattern with Unit of Work (SQLAlchemyUnitOfWork)
  • gRPC bidirectional streaming for audio → transcript flow
  • Protocol-based DI (see domain/ports/ and infrastructure protocols.py files)
  • Modular gRPC mixins for separation of concerns (see below)

Domain Package Structure

domain/
├── entities/         # Core domain entities
│   ├── meeting.py    # Meeting, MeetingId, MeetingState
│   ├── segment.py    # Segment, WordTiming
│   ├── summary.py    # Summary, KeyPoint, ActionItem
│   ├── annotation.py # Annotation
│   ├── named_entity.py # NamedEntity for NER results
│   └── integration.py# Integration, IntegrationType, IntegrationStatus
├── identity/         # User/workspace identity (Sprint 16)
│   ├── roles.py      # WorkspaceRole enum with permission checks
│   ├── context.py    # UserContext, WorkspaceContext, ProjectContext, OperationContext
│   └── entities.py   # User, Workspace, WorkspaceMembership domain entities
├── webhooks/         # Webhook domain
│   ├── events.py     # WebhookEventType, WebhookConfig, WebhookDelivery, payload classes
│   └── constants.py  # Webhook-related constants
├── triggers/         # Trigger detection domain
│   ├── entities.py   # Trigger, TriggerAction, TriggerSignal
│   └── ports.py      # TriggerProvider protocol
├── summarization/    # Summarization domain
│   └── ports.py      # SummarizationProvider protocol
├── ports/            # Repository protocols
│   ├── repositories/ # Organized by concern
│   │   ├── transcript.py  # MeetingRepository, SegmentRepository, SummaryRepository
│   │   ├── asset.py       # AssetRepository for file management
│   │   ├── background.py  # DiarizationJobRepository, EntityRepository
│   │   ├── external.py    # WebhookRepository, IntegrationRepository, PreferencesRepository
│   │   └── identity.py    # UserRepository, WorkspaceRepository protocols
│   ├── unit_of_work.py    # UnitOfWork protocol with supports_* capability checks
│   ├── diarization.py     # DiarizationEngine protocol
│   ├── ner.py             # NEREngine protocol
│   └── calendar.py        # CalendarProvider protocol
├── utils/            # Domain utilities
│   └── time.py       # utc_now() helper
├── errors.py         # Domain-specific exceptions
└── value_objects.py  # Shared value objects

gRPC Mixin Architecture

The gRPC server uses modular mixins for maintainability:

grpc/_mixins/
├── streaming/        # ASR streaming (package)
│   ├── _mixin.py     # Main StreamingMixin class
│   ├── _session.py   # Session management
│   ├── _asr.py       # ASR processing
│   ├── _processing.py# Audio processing pipeline
│   ├── _partials.py  # Partial transcript handling
│   ├── _cleanup.py   # Resource cleanup
│   └── _types.py     # Type definitions
├── diarization/      # Speaker diarization (package)
│   ├── _mixin.py     # Main DiarizationMixin class
│   ├── _jobs.py      # Background job management
│   ├── _refinement.py# Offline refinement
│   ├── _streaming.py # Real-time diarization
│   ├── _speaker.py   # Speaker assignment
│   ├── _status.py    # Job status tracking
│   └── _types.py     # Type definitions
├── summarization.py  # Summary generation (separates LLM inference from DB transactions)
├── meeting.py        # Meeting lifecycle (create, get, list, delete, stop)
├── annotation.py     # Segment annotations CRUD
├── export.py         # Markdown/HTML/PDF document export
├── entities.py       # Named entity extraction operations
├── calendar.py       # Calendar sync operations
├── webhooks.py       # Webhook management operations
├── preferences.py    # User preferences operations
├── observability.py  # Usage tracking, metrics operations
├── sync.py           # State synchronization operations
├── diarization_job.py# Diarization job status/management
├── converters.py     # Protobuf ↔ domain entity converters
├── errors.py         # gRPC error helpers (abort_not_found, abort_invalid_argument)
├── protocols.py      # ServicerHost protocol for mixin composition
└── _audio_helpers.py # Audio utility functions

grpc/interceptors/    # gRPC server interceptors (Sprint 16)
└── identity.py       # Identity context propagation (request_id, user_id, workspace_id)

Client-side mixins for the Python gRPC client:

grpc/_client_mixins/
├── streaming.py      # Client streaming operations
├── meeting.py        # Meeting CRUD operations
├── diarization.py    # Diarization requests
├── export.py         # Export requests
├── annotation.py     # Annotation operations
├── converters.py     # Response converters
└── protocols.py      # ClientHost protocol

Each mixin is typed against the ServicerHost protocol rather than a concrete class, which lets NoteFlowServicer compose them cleanly.

Service Injection Pattern

Services are injected through a three-tier pattern:

  1. ServicerHost Protocol (protocols.py) — declares required service attributes
  2. NoteFlowServicer (service.py) — accepts services via __init__ and stores as instance attributes
  3. NoteFlowServer (server.py) — creates/initializes services and passes to servicer

Example: _webhook_service, _summarization_service, _ner_service all follow this pattern.
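
The three tiers above can be sketched in a few lines. This is a simplified illustration, not the project's code: WebhookService here is a stand-in class, MeetingMixin.stop_meeting is a hypothetical method, and the real constructors take more arguments.

```python
from __future__ import annotations

from typing import Protocol


class WebhookService:
    """Stand-in for an application service (hypothetical, simplified)."""

    def __init__(self) -> None:
        self.fired: list[str] = []

    def trigger_meeting_completed(self, meeting_id: str) -> None:
        self.fired.append(meeting_id)


class ServicerHost(Protocol):
    """Tier 1: declares the service attributes that mixins may rely on."""

    _webhook_service: WebhookService | None


class MeetingMixin:
    """A mixin typed against ServicerHost instead of a concrete servicer."""

    def stop_meeting(self: ServicerHost, meeting_id: str) -> str:
        # Optional services are checked before use.
        if self._webhook_service is not None:
            self._webhook_service.trigger_meeting_completed(meeting_id)
        return f"stopped:{meeting_id}"


class NoteFlowServicer(MeetingMixin):
    """Tier 2: accepts services via __init__ and stores them as attributes."""

    def __init__(self, webhook_service: WebhookService | None = None) -> None:
        self._webhook_service = webhook_service


class NoteFlowServer:
    """Tier 3: creates services and passes them to the servicer."""

    def __init__(self) -> None:
        self.webhook_service = WebhookService()
        self.servicer = NoteFlowServicer(webhook_service=self.webhook_service)
```

Because the mixin only sees the protocol, a servicer built without a webhook service still works; the optional dependency is simply skipped.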

Client Architecture (Tauri + React)

client/src/
├── api/              # API layer with adapters and connection management
│   ├── tauri-adapter.ts    # Main Tauri IPC adapter
│   ├── mock-adapter.ts     # Mock adapter for testing
│   ├── cached-adapter.ts   # Caching layer
│   ├── connection-state.ts # Connection state machine
│   ├── reconnection.ts     # Auto-reconnection logic
│   ├── interface.ts        # Adapter interface definition
│   └── types/              # API type definitions
├── hooks/            # Custom React hooks
│   ├── use-diarization.ts  # Speaker diarization state
│   ├── use-cloud-consent.ts# Cloud provider consent
│   ├── use-webhooks.ts     # Webhook management
│   ├── use-oauth-flow.ts   # OAuth authentication
│   ├── use-calendar-sync.ts# Calendar synchronization
│   ├── use-entity-extraction.ts # NER operations
│   └── ...                 # Additional hooks
├── contexts/         # React contexts
│   └── connection-context.tsx # gRPC connection context
├── components/       # React components
│   ├── ui/           # Reusable UI components (shadcn/ui)
│   ├── recording/    # Recording-specific components
│   ├── settings/     # Settings panel components
│   └── analytics/    # Analytics visualizations
├── pages/            # Route pages
├── lib/              # Utilities
│   ├── tauri-helpers.ts    # Tauri utility functions
│   ├── tauri-events.ts     # Tauri event handling
│   ├── cache/              # Client-side caching
│   ├── config/             # Configuration management
│   └── ...                 # Format, crypto, utils
└── types/            # Shared TypeScript types

Rust/Tauri backend:

client/src-tauri/src/
├── commands/         # Tauri IPC command handlers
│   ├── recording/    # Recording commands (capture, device, audio)
│   ├── triggers/     # Trigger detection commands
│   ├── meeting.rs    # Meeting CRUD
│   ├── diarization.rs# Diarization operations
│   ├── calendar.rs   # Calendar sync
│   ├── webhooks.rs   # Webhook management
│   └── ...           # Export, annotation, preferences, etc.
├── grpc/             # gRPC client
│   ├── client/       # Client implementations by domain
│   └── types/        # Rust type definitions
├── state/            # Runtime state management
│   ├── app_state.rs  # Main application state
│   ├── preferences.rs# User preferences
│   ├── playback.rs   # Audio playback state
│   └── types.rs      # State type definitions
├── audio/            # Audio capture and playback
├── cache/            # Memory caching
├── crypto/           # Cryptographic operations
├── events/           # Tauri event emission
├── triggers/         # Trigger detection
├── main.rs           # Application entry point
└── lib.rs            # Library exports

Database

PostgreSQL with pgvector extension. Async SQLAlchemy with asyncpg driver.

# Alembic migrations
alembic upgrade head
alembic revision --autogenerate -m "description"

Connection via NOTEFLOW_DATABASE_URL env var or settings.

Testing Conventions

  • Test files: test_*.py, functions: test_*
  • Markers: @pytest.mark.slow (model loading), @pytest.mark.integration (external services)
  • Integration tests use testcontainers for PostgreSQL
  • Asyncio auto-mode enabled
  • React unit tests use Vitest; e2e tests use Playwright in client/e2e/.

Quality Gates

After any non-trivial changes, run the quality and test smell suite:

pytest tests/quality/                # Test smell detection (23 checks)

This suite enforces:

  • No assertion roulette (multiple assertions without messages)
  • No conditional test logic (loops/conditionals with assertions)
  • No cross-file fixture duplicates (consolidate to conftest.py)
  • No unittest-style assertions (use plain assert)
  • Proper fixture typing and scope
  • No pytest.raises without match=
  • And 17 other test quality checks

Fixtures like crypto, meetings_dir, and mock_uow are provided by tests/conftest.py — do not redefine them in test files.

Proto/gRPC

Proto definitions: src/noteflow/grpc/proto/noteflow.proto

Generated files excluded from lint: *_pb2.py, *_pb2_grpc.py

Regenerate after proto changes:

python -m grpc_tools.protoc -I src/noteflow/grpc/proto \
  --python_out=src/noteflow/grpc/proto \
  --grpc_python_out=src/noteflow/grpc/proto \
  src/noteflow/grpc/proto/noteflow.proto

Sync points (high risk of breakage):

  • Rust gRPC types are generated at build time by client/src-tauri/build.rs. Keep Rust DTOs aligned with proto changes.
  • Frontend enums/DTOs in client/src/types/ mirror proto enums and backend domain types; update these when proto enums change.
  • When adding or renaming RPCs, update: server mixins, src/noteflow/grpc/client.py, Tauri command handlers, and client/src/api/tauri-adapter.ts.

Common Pitfalls & Change Checklist

Proto / API evolution

  • Update the schema in src/noteflow/grpc/proto/noteflow.proto first; treat it as the source of truth.
  • Regenerate Python stubs (*_pb2.py, *_pb2_grpc.py) and verify imports still align in src/noteflow/grpc/.
  • Ensure the gRPC server mixins in src/noteflow/grpc/_mixins/ implement new/changed RPCs.
  • Update src/noteflow/grpc/client.py (Python client wrapper) to match the RPC signature and response types.
  • Update Tauri/Rust command handlers (client/src-tauri/src/commands/) and any Rust gRPC types used by commands.
  • Update TypeScript adapters in client/src/api/tauri-adapter.ts and shared DTOs/enums in client/src/types/ and client/src/api/types/.
  • Add/adjust tests on both sides (backend unit/integration + client unit tests) when changing payload shapes.

Database schema & migrations

  • Schema changes belong in src/noteflow/infrastructure/persistence/ plus an Alembic migration in src/noteflow/infrastructure/persistence/migrations/.
  • Use alembic revision --autogenerate only after updating models; review the migration for correctness.
  • Set NOTEFLOW_DATABASE_URL when running integration tests; without it, the default behavior may fall back to in-memory storage.
  • Update repository/UnitOfWork implementations if new tables or relations are introduced.
  • If you add fields used by export/summarization, ensure converters in infrastructure/converters/ are updated too.

Client sync points (Rust + TS)

  • Tauri IPC surfaces (Rust commands) must match the TypeScript calls in client/src/api/tauri-adapter.ts.
  • Rust gRPC types are generated by client/src-tauri/build.rs; verify the proto path if you move proto files.
  • Frontend enums in client/src/types/ and client/src/api/types/ mirror backend/proto enums; keep them aligned to avoid silent UI bugs.

Code Style

  • Python 3.12+, 100-char line length
  • Strict mypy (allow type: ignore[code] only with comment explaining why)
  • Ruff for linting (E, W, F, I, B, C4, UP, SIM, RUF)
  • Module soft limit 500 LoC, hard limit 750 LoC
  • Frontend formatting uses Prettier (single quotes, 100 char width); linting uses Biome.
  • Rust formatting uses rustfmt; linting uses clippy via the client scripts.

Automated Enforcement (Hookify Rules)

This project uses hookify rules to enforce code quality standards. These rules are defined in .claude/hookify*.local.md and automatically block or warn on violations.

Required: Use Makefile Targets

Always use Makefile targets instead of running tools directly. The Makefile orchestrates quality checks across Python, TypeScript, and Rust.

# Primary quality commands
make quality          # Run ALL quality checks (TS + Rust + Python)
make quality-py       # Python only: lint + type-check + test-quality
make quality-ts       # TypeScript only: type-check + lint + test-quality
make quality-rs       # Rust only: clippy + lint

# Python-specific
make lint-py          # Run Pyrefly linter
make type-check-py    # Run basedpyright
make test-quality-py  # Run pytest tests/quality/
make lint-fix-py      # Auto-fix Ruff + Sourcery issues

# TypeScript-specific
make type-check       # Run tsc --noEmit
make lint             # Run Biome linter
make lint-fix         # Auto-fix Biome issues
make test-quality     # Run Vitest quality tests

# Rust-specific
make clippy           # Run Clippy linter
make lint-rs          # Run code quality script
make fmt-rs           # Format with rustfmt

# Formatting
make fmt              # Format all code (Biome + rustfmt)
make fmt-check        # Check all formatting

# E2E Tests
make e2e              # Playwright tests (requires frontend on :5173)
make e2e-grpc         # Rust gRPC integration tests

Blocked Operations

The following operations are automatically blocked:

Protected Files (Require Explicit Permission)

File/Directory | What's Blocked | Why
Makefile | All modifications (Edit, Write, bash redirects) | Build system is protected
tests/quality/ | All modifications except baselines.json | Quality gate infrastructure
pyproject.toml, ruff.toml, pyrightconfig.json | Edits | Python linter config is protected
biome.json, tsconfig.json, .eslintrc* | Edits | Frontend linter config is protected
.rustfmt.toml, .clippy.toml | Edits | Rust linter config is protected

Forbidden Code Patterns (Python Files Only)

Pattern | Why Blocked | Alternative
Type suppression comments (# type: ignore, # pyright: ignore) | Bypasses type safety | Fix the actual type error, use cast() as last resort
Any type annotations | Creates type safety holes | Use specific types, Protocol, TypeVar, or object
Magic numbers in assignments | Hidden intent, hard to maintain | Define named constants with typing.Final
Loops/conditionals in tests | Non-deterministic tests | Use @pytest.mark.parametrize
Multiple assertions without messages | Hard to debug failures | Add assertion messages or separate tests
Duplicate fixture definitions | Maintenance burden | Use fixtures from tests/conftest.py
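
Two of the alternatives listed above (named constants with typing.Final, and a Protocol in place of Any) can be illustrated as follows. The names MAX_TITLE_LENGTH, HasTitle, Note, and display_title are hypothetical and the limit of 80 is an arbitrary example value.

```python
from dataclasses import dataclass
from typing import Final, Protocol

# Named constants instead of magic numbers (values are illustrative only).
MAX_TITLE_LENGTH: Final = 80
ELLIPSIS: Final = "..."


class HasTitle(Protocol):
    """Structural type used in place of Any: anything with a str title."""

    title: str


@dataclass
class Note:
    """Example object that satisfies HasTitle without inheriting from it."""

    title: str


def display_title(item: HasTitle) -> str:
    """Return the title, truncated to MAX_TITLE_LENGTH characters."""
    if len(item.title) <= MAX_TITLE_LENGTH:
        return item.title
    return item.title[: MAX_TITLE_LENGTH - len(ELLIPSIS)] + ELLIPSIS
```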

Blocked Fixtures (Already in tests/conftest.py)

Do not redefine these fixtures—they are globally available:

  • mock_uow, crypto, meetings_dir, webhook_config, webhook_config_all_events
  • sample_datetime, calendar_settings, meeting_id, sample_meeting, recording_meeting
  • mock_grpc_context, mock_asr_engine, mock_optional_extras

Warnings (Non-Blocking)

Trigger | Warning
Creating new source files | Search for existing similar code first
Files exceeding 500 lines | Consider refactoring into a package

Quality Gate Requirement

Before completing any code changes, you must run:

make quality

This is enforced by the require-make-quality hook. All quality checks must pass before completion.

Policy: No Ignoring Pre-existing Issues

If you encounter lint errors, type errors, or test failures—even if they existed before your changes—you must either:

  1. Fix immediately (for simple issues)
  2. Add to todo list (for complex issues)
  3. Launch a subagent to fix (for parallelizable work)

Claiming something is "pre-existing" or "out of scope" is not a valid reason to ignore it.

Feature Flags

Optional features controlled via NOTEFLOW_FEATURE_* environment variables:

Flag | Default | Controls | Prerequisites
NOTEFLOW_FEATURE_TEMPLATES_ENABLED | true | AI summarization templates | -
NOTEFLOW_FEATURE_PDF_EXPORT_ENABLED | true | PDF export format | WeasyPrint installed
NOTEFLOW_FEATURE_NER_ENABLED | false | Named entity extraction | spaCy model downloaded
NOTEFLOW_FEATURE_CALENDAR_ENABLED | false | Calendar sync | OAuth credentials configured
NOTEFLOW_FEATURE_WEBHOOKS_ENABLED | true | Webhook notifications | -

Access via get_feature_flags().<flag_name> or get_settings().feature_flags.<flag_name>. Features with external dependencies default to false.
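
As a rough illustration of how environment-driven flags with these defaults behave, here is a standard-library sketch. It is not the project's Pydantic settings code: the attribute names, the load_feature_flags helper, and the set of strings treated as true are all assumptions.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from dataclasses import dataclass

# Assumption: these strings count as "true"; the real settings parser may differ.
_TRUTHY = frozenset({"1", "true", "yes", "on"})
_PREFIX = "NOTEFLOW_FEATURE_"


def _read_flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Return the boolean value of one flag, or its default when unset."""
    raw = env.get(_PREFIX + name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class FeatureFlags:
    """Hypothetical attribute names mirroring the documented flags."""

    templates_enabled: bool
    pdf_export_enabled: bool
    ner_enabled: bool
    calendar_enabled: bool
    webhooks_enabled: bool


def load_feature_flags(env: Mapping[str, str] | None = None) -> FeatureFlags:
    """Build flags from an environment mapping using the documented defaults."""
    source = os.environ if env is None else env
    return FeatureFlags(
        templates_enabled=_read_flag(source, "TEMPLATES_ENABLED", True),
        pdf_export_enabled=_read_flag(source, "PDF_EXPORT_ENABLED", True),
        ner_enabled=_read_flag(source, "NER_ENABLED", False),
        calendar_enabled=_read_flag(source, "CALENDAR_ENABLED", False),
        webhooks_enabled=_read_flag(source, "WEBHOOKS_ENABLED", True),
    )
```

Passing an explicit mapping instead of reading os.environ keeps the function easy to test without mutating process state.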

Spikes (De-risking Experiments)

spikes/ contains validated platform experiments with FINDINGS.md:

  • spike_01_ui_tray_hotkeys/ - System tray and global hotkey integration
  • spike_02_audio_capture/ - sounddevice + PortAudio
  • spike_03_asr_latency/ - faster-whisper benchmarks (0.05x real-time)
  • spike_04_encryption/ - keyring + AES-GCM (826 MB/s throughput)

Key Subsystems

Speaker Diarization

  • Streaming: diart for real-time speaker detection during recording
  • Offline: pyannote.audio for post-meeting refinement (higher quality)
  • gRPC: RefineSpeakerDiarization (background job), GetDiarizationJobStatus (polling), RenameSpeaker

Summarization

  • Providers: CloudProvider (Anthropic/OpenAI), OllamaProvider (local), MockProvider (testing)
  • Templates: Configurable tone (professional/casual/technical), format (bullet_points/narrative/structured), verbosity (minimal/balanced/detailed)
  • Citation verification: Links summary claims to transcript evidence
  • Consent: Cloud providers require explicit user consent (stored in user_preferences)
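
The core of citation verification, checking that every cited segment ID exists in the transcript, might look like the sketch below. CitedPoint and find_invalid_citations are hypothetical names, and the project's verify_citations() may have a different signature and return type.

```python
from __future__ import annotations

from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedPoint:
    """Hypothetical summary claim with the segment IDs it cites."""

    text: str
    segment_ids: tuple[int, ...]


def find_invalid_citations(
    points: Iterable[CitedPoint], known_segment_ids: set[int]
) -> dict[str, list[int]]:
    """Map each claim's text to the cited segment IDs missing from the transcript."""
    invalid: dict[str, list[int]] = {}
    for point in points:
        missing = [sid for sid in point.segment_ids if sid not in known_segment_ids]
        if missing:
            invalid[point.text] = missing
    return invalid
```

An empty result means every claim points at real transcript evidence.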

Export

  • Formats: Markdown, HTML, PDF (via WeasyPrint)
  • Content: Transcript with timestamps, speaker labels, summary with key points and action items
  • gRPC: ExportTranscript RPC with ExportFormat enum
  • PDF styling: Embedded CSS for professional document layout

Named Entity Recognition (NER)

Automatic extraction of people, companies, products, locations from transcripts.

  • Engine: spaCy pipelines (en_core_web_sm, or the transformer-based en_core_web_trf)
  • Categories: person, company, product, technical, acronym, location, date, other
  • Segment tracking: Entities link back to source segment_ids for navigation
  • Confidence scores: Model confidence for each extracted entity
  • Pinning: Users can pin (confirm) entities for future reference
  • gRPC: ExtractEntities RPC with optional force_refresh
  • Caching: Entities persisted in named_entities table, cached until refresh

Trigger Detection

  • Signals: Calendar proximity, audio activity, foreground app detection
  • Actions: IGNORE, NOTIFY, AUTO_START with confidence thresholds
  • Client integration: Background polling with dialog prompts (start/snooze/dismiss)
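
Mapping a confidence score to one of the three actions can be sketched as a pair of thresholds. The threshold values and the decide_action helper are assumptions for illustration; how the project combines its signals into a single confidence is not shown here.

```python
from enum import Enum
from typing import Final


class TriggerAction(Enum):
    IGNORE = "ignore"
    NOTIFY = "notify"
    AUTO_START = "auto_start"


# Hypothetical thresholds; the real values come from trigger settings.
NOTIFY_THRESHOLD: Final = 0.5
AUTO_START_THRESHOLD: Final = 0.8


def decide_action(confidence: float) -> TriggerAction:
    """Pick the strongest action whose threshold the confidence reaches."""
    if confidence >= AUTO_START_THRESHOLD:
        return TriggerAction.AUTO_START
    if confidence >= NOTIFY_THRESHOLD:
        return TriggerAction.NOTIFY
    return TriggerAction.IGNORE
```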

Webhooks

Automated HTTP notifications for meeting lifecycle events.

  • Events: meeting.completed, summary.generated, recording.started, recording.stopped
  • Delivery: Exponential backoff retries (configurable max_retries, default 3)
  • Security: HMAC-SHA256 signing via X-NoteFlow-Signature header when secret configured
  • Headers: X-NoteFlow-Event (event type), X-NoteFlow-Delivery (unique delivery ID)
  • Fire-and-forget: Webhook failures never block primary RPC operations
  • Persistence: webhook_configs stores URL/events/secret, webhook_deliveries logs delivery attempts
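
The signing and retry behavior described above can be sketched with the standard library. The header names come from this document; the hex-digest encoding of the signature, the 1-second base delay, and the helper names are assumptions, so check infrastructure/webhooks/executor.py for the actual format.

```python
from __future__ import annotations

import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Return the HMAC-SHA256 of the body as a hex string (encoding assumed)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def build_headers(
    event_type: str, delivery_id: str, body: bytes, secret: str | None = None
) -> dict[str, str]:
    """Build delivery headers; the signature is added only when a secret is set."""
    headers = {
        "Content-Type": "application/json",
        "X-NoteFlow-Event": event_type,
        "X-NoteFlow-Delivery": delivery_id,
    }
    if secret:
        headers["X-NoteFlow-Signature"] = sign_payload(secret, body)
    return headers


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Receiver-side check using a constant-time comparison."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


def backoff_delays(max_retries: int = 3, base_seconds: float = 1.0) -> list[float]:
    """Delay before each retry, doubling every attempt (base delay assumed)."""
    return [base_seconds * (2**attempt) for attempt in range(max_retries)]
```

A receiver would recompute the signature over the raw request body and compare it with verify_signature before trusting the payload.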

Key files:

  • domain/webhooks/events.py — WebhookEventType, WebhookConfig, WebhookDelivery, payload dataclasses
  • infrastructure/webhooks/executor.py — HTTP client with retry logic
  • application/services/webhook_service.py — orchestrates delivery to registered webhooks

Integrations

OAuth-based external service connections (calendar providers, etc.).

  • Types: calendar (Google, Outlook)
  • Status tracking: pending, connected, error, disconnected
  • Secure storage: OAuth tokens stored in integration_secrets table
  • Sync history: integration_sync_runs tracks each sync operation

ORM models in persistence/models/integrations/:

  • IntegrationModel — provider config and status
  • IntegrationSecretModel — encrypted OAuth tokens
  • CalendarEventModel — cached calendar events
  • MeetingCalendarLinkModel — links meetings to calendar events

Shared Utilities & Factories

Factories

Location | Function | Purpose
infrastructure/summarization/factory.py | create_summarization_service() | Auto-configured service with provider detection
infrastructure/persistence/database.py | create_async_engine() | SQLAlchemy async engine from settings
infrastructure/persistence/database.py | create_async_session_factory() | Session factory from DB URL
config/settings.py | get_settings() | Cached Settings from env vars
config/settings.py | get_trigger_settings() | Cached TriggerSettings from env vars

Converters

Location | Class/Function | Purpose
infrastructure/converters/orm_converters.py | OrmConverter | ORM ↔ domain entities (Meeting, Segment, Summary, etc.)
infrastructure/converters/asr_converters.py | AsrConverter | ASR DTOs → domain WordTiming
infrastructure/converters/webhook_converters.py | WebhookConverter | ORM ↔ domain (WebhookConfig, WebhookDelivery)
grpc/_mixins/converters.py | meeting_to_proto(), segment_to_proto_update() | Domain → protobuf messages
grpc/_mixins/converters.py | create_segment_from_asr() | ASR result → Segment with word timings

Repository Base (persistence/repositories/_base.py)

Method | Purpose
_execute_scalar() | Single result query (or None)
_execute_scalars() | All scalar results from query
_add_and_flush() | Add model and flush to DB
_delete_and_flush() | Delete model and flush

Security Helpers (infrastructure/security/keystore.py)

Function | Purpose
_decode_and_validate_key() | Validate base64 key, check size
_generate_key() | Generate 256-bit key as (bytes, base64_str)
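
A minimal sketch of what these two helpers do, using only the standard library. The function names are written without the leading underscore to mark them as illustrations; the error messages and exact validation rules are assumptions.

```python
import base64
import binascii
import secrets
from typing import Final

KEY_SIZE_BYTES: Final = 32  # 256 bits


def generate_key() -> tuple[bytes, str]:
    """Return a random 256-bit key as (raw bytes, base64 string)."""
    key = secrets.token_bytes(KEY_SIZE_BYTES)
    return key, base64.b64encode(key).decode("ascii")


def decode_and_validate_key(encoded: str) -> bytes:
    """Decode a base64 key and reject anything that is not exactly 32 bytes."""
    try:
        key = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("key is not valid base64") from exc
    if len(key) != KEY_SIZE_BYTES:
        raise ValueError(f"expected {KEY_SIZE_BYTES} bytes, got {len(key)}")
    return key
```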

Export Helpers (infrastructure/export/_formatting.py)

Function | Purpose
format_timestamp() | Seconds → MM:SS or HH:MM:SS
format_datetime() | Datetime → display string
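
The seconds-to-timestamp conversion can be sketched as below. This is an illustration of the MM:SS / HH:MM:SS rule, not the code in _formatting.py; truncating fractional seconds and clamping negatives to zero are assumptions.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS, switching to HH:MM:SS from one hour upward."""
    total = max(0, int(seconds))  # assumption: truncate fractions, clamp negatives
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"
```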

Summarization (infrastructure/summarization/)

Location | Function | Purpose
_parsing.py | build_transcript_prompt() | Transcript with segment markers for LLM
_parsing.py | parse_llm_response() | JSON → Summary entity
citation_verifier.py | verify_citations() | Validate segment_ids exist

Diarization (infrastructure/diarization/assigner.py)

Function | Purpose
assign_speaker() | Speaker for time range from turns
assign_speakers_batch() | Batch speaker assignment
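
One common way to pick a speaker for a time range is to choose the turn with the largest overlap. The sketch below assumes that strategy; SpeakerTurn and assign_speaker_by_overlap are hypothetical names, and assigner.py may use a different rule or signature.

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerTurn:
    """Hypothetical diarization turn: one speaker talking from start to end."""

    speaker: str
    start: float
    end: float


def assign_speaker_by_overlap(
    start: float, end: float, turns: Sequence[SpeakerTurn]
) -> str | None:
    """Return the speaker whose turn overlaps [start, end] the most, if any."""
    best_speaker: str | None = None
    best_overlap = 0.0
    for turn in turns:
        overlap = min(end, turn.end) - max(start, turn.start)
        if overlap > best_overlap:
            best_overlap = overlap
            best_speaker = turn.speaker
    return best_speaker
```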

Triggers (infrastructure/triggers/calendar.py)

Function | Purpose
parse_calendar_events() | Parse events from config/env

Webhooks (infrastructure/webhooks/executor.py)

Class/Method | Purpose
WebhookExecutor | HTTP delivery with retry logic and HMAC signing
WebhookExecutor.deliver() | Deliver payload to webhook URL with exponential backoff
WebhookExecutor._build_headers() | Build headers including X-NoteFlow-Signature

Recovery Service (application/services/recovery_service.py)

Method | Purpose
recover_all() | Orchestrate meeting + job recovery
RecoveryResult | Dataclass with recovery counts

Webhook Service (application/services/webhook_service.py)

Method | Purpose
register_webhook() | Register a webhook configuration
trigger_meeting_completed() | Fire webhooks on meeting completion
trigger_summary_generated() | Fire webhooks on summary generation
trigger_recording_started/stopped() | Fire webhooks on recording lifecycle

Unit of Work Repositories

The UnitOfWork protocol provides access to all repositories:

Property | Repository | Supports In-Memory
meetings | MeetingRepository | Yes
segments | SegmentRepository | Yes
summaries | SummaryRepository | Yes
annotations | AnnotationRepository | Yes
diarization_jobs | DiarizationJobRepository | Yes
preferences | PreferencesRepository | Yes
entities | EntityRepository | Yes
integrations | IntegrationRepository | DB only
webhooks | WebhookRepository | Yes

Check capability with supports_* properties (e.g., uow.supports_webhooks).
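
A capability check before touching a DB-only repository might look like this sketch. The supports_integrations attribute follows the supports_* naming above; the list_names() method and the Fake* classes are hypothetical stand-ins, not the project's repository API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class IntegrationRepository(Protocol):
    def list_names(self) -> list[str]: ...  # hypothetical method


class UnitOfWorkLike(Protocol):
    supports_integrations: bool
    integrations: IntegrationRepository | None


@dataclass
class FakeIntegrationRepository:
    names: list[str] = field(default_factory=list)

    def list_names(self) -> list[str]:
        return list(self.names)


@dataclass
class FakeUnitOfWork:
    """Stand-in UoW; an in-memory one would report supports_integrations=False."""

    supports_integrations: bool
    integrations: FakeIntegrationRepository | None = None


def integration_names(uow: UnitOfWorkLike) -> list[str]:
    """Return integration names, or an empty list when the backend lacks them."""
    if not uow.supports_integrations or uow.integrations is None:
        return []
    return uow.integrations.list_names()
```

Checking the flag first lets the same calling code run against both the in-memory and the database-backed unit of work.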

Known Issues

See docs/triage.md for tracked technical debt. See docs/sprints/ for feature implementation plans and status.

Resolved:

  • Server-side state volatility → Diarization jobs persisted to DB
  • Hardcoded directory paths → asset_path column added to meetings
  • Synchronous blocking in async gRPC → run_in_executor for diarization
  • Summarization consent not persisted → Stored in user_preferences table
  • VU meter update throttling → 20fps throttle implemented
  • Webhook infrastructure missing → Full webhook subsystem with executor, service, and repository
  • Integration/OAuth token storage → IntegrationSecretModel for secure token storage
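
The run_in_executor fix listed above follows a general asyncio pattern: push blocking work onto a thread so the event loop keeps serving other coroutines. The sketch below demonstrates the pattern with a sleep standing in for diarization; blocking_inference and run_without_blocking are hypothetical names.

```python
import asyncio
import time


def blocking_inference(duration: float) -> str:
    """Stand-in for CPU-bound or blocking work such as offline diarization."""
    time.sleep(duration)
    return "done"


async def run_without_blocking(duration: float) -> tuple[str, int]:
    """Run the blocking call in the default executor while a heartbeat ticks."""
    loop = asyncio.get_running_loop()
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(heartbeat())
    try:
        # The event loop stays free while the worker thread sleeps.
        result = await loop.run_in_executor(None, blocking_inference, duration)
    finally:
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)
    return result, ticks
```

If blocking_inference were called directly inside the coroutine, the heartbeat would not advance until it returned.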

MCP Tools Reference

Tool | Use Case
firecrawl_scrape | Extract content from a single URL (markdown, HTML, screenshots)
firecrawl_search | Web search with optional content extraction; supports operators (site:, inurl:, -exclude)
firecrawl_map | Discover all URLs on a website before scraping
firecrawl_crawl | Multi-page crawl with depth/limit controls (use sparingly—token-heavy)
firecrawl_extract | LLM-powered structured data extraction with JSON schema

Workflow: Search first without scrapeOptions, then scrape relevant URLs individually.

Sequential Thinking (Reasoning)

Tool | Use Case
sequentialthinking | Break down complex problems into numbered thought steps; supports revision, branching, and hypothesis verification

When to use: Multi-step analysis, design decisions, debugging with course-correction, architecture planning.

Context7 (Library Documentation)

Tool | Use Case
resolve-library-id | Convert package name → Context7 library ID (required first step)
get-library-docs | Fetch up-to-date docs; mode='code' for API/examples, mode='info' for concepts

Workflow: resolve-library-id("fastapi") → get-library-docs("/tiangolo/fastapi", topic="dependencies").

Serena (Semantic Code Tools)

Navigation (prefer over grep/read for code):

Tool | Use Case
get_symbols_overview | List top-level symbols in a file (classes, functions, variables)
find_symbol | Search by name path pattern; use depth for children, include_body for source
find_referencing_symbols | Find all references to a symbol across the codebase
search_for_pattern | Regex search across files (for non-symbol searches)

Editing (atomic symbol-level changes):

Tool | Use Case
replace_symbol_body | Replace entire symbol definition
insert_before_symbol / insert_after_symbol | Add code adjacent to a symbol
rename_symbol | Rename across entire codebase

Memory (persistent cross-session knowledge):

Tool | Use Case
list_memories | Show available project memories
read_memory / write_memory | Retrieve or store project knowledge

Guardrails (call before major actions):

Tool | Use Case
think_about_collected_information | Validate research completeness after search sequences
think_about_task_adherence | Verify alignment before editing code
think_about_whether_you_are_done | Confirm task completion

Principle: Use symbolic tools over file reads; retrieve symbol bodies only when needed for understanding or editing.