vasceannie/noteflow


Travis Vasceannie e4b2c733d5 oh boy

2026-01-02 04:22:40 +00:00

35 KiB


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

NoteFlow is an intelligent meeting notetaker: local-first audio capture + navigable recall + evidence-linked summaries. It is a client-server system built around a gRPC API for bidirectional audio streaming and transcription. The repository includes:

A Python backend (src/noteflow/) hosting the gRPC server, domain logic, and infrastructure adapters.
A Tauri + React desktop client (client/) that uses Rust for IPC and a React UI (Vite).

The gRPC schema is the shared contract between backend and client; keep proto changes in sync across Python, Rust, and TypeScript.

Quick Orientation (Start Here)

Backend server entry point: python -m noteflow.grpc.server (implementation in src/noteflow/grpc/service.py).
Tauri/React client: cd client && npm run dev (web), npm run tauri dev (desktop).
Tauri IPC bridge: TypeScript adapters in client/src/api/tauri-adapter.ts invoke Rust commands in client/src-tauri/src/commands/.
Protobuf schema and generated Python stubs live in src/noteflow/grpc/proto/.

Build and Development Commands

# Install (editable with dev dependencies)
python -m pip install -e ".[dev]"

# Run gRPC server
python -m noteflow.grpc.server --help

# Run Tauri + React client UI
cd client
npm install
npm run dev
# Desktop Tauri dev (requires Rust toolchain)
npm run tauri dev
cd -

# Tests
pytest                           # Full suite
pytest -m "not integration"      # Skip external-service tests
pytest tests/domain/             # Run specific test directory
pytest -k "test_segment"         # Run by pattern

# Linting and type checking
ruff check .                     # Lint
ruff check --fix .               # Autofix
mypy src/noteflow                # Strict type checks
basedpyright                     # Additional type checks

# Client lint/format (from client/)
npm run lint                      # Biome
npm run lint:fix                  # Biome autofix
npm run format                    # Prettier
npm run format:check              # Prettier check
npm run lint:rs                   # Clippy (Rust)
npm run format:rs                 # rustfmt (Rust)

# Docker development
docker compose up -d postgres    # PostgreSQL with health checks
python scripts/dev_watch_server.py  # Auto-reload server (watches src/)

Docker Development

IMPORTANT: The server runs with hot-reload enabled. Assume Docker services are always running. Never restart, rebuild, or stop containers without explicit user permission—doing so disrupts the development workflow.

# Start PostgreSQL (with pgvector)
docker compose up -d postgres

# Dev container (VS Code) - full GUI environment
# .devcontainer/ includes PortAudio, GTK, pystray, pynput support
code .  # Open in VS Code, select "Reopen in Container"

# Development server with auto-reload
python scripts/dev_watch_server.py  # Uses watchfiles, monitors src/ and alembic.ini

Dev container features: dbus-x11, GTK-3, libgl1 for system tray and hotkey support.

Forbidden Docker Operations (without explicit permission)

docker compose build — rebuilds images, disrupts running containers
docker compose up / down / restart — starts/stops services
docker stop / docker kill — kills running containers
Any command that would interrupt the hot-reload server

Code changes are automatically picked up by the watchfiles-based hot-reload. If you need to suggest a Docker operation, ask the user first.

Architecture

src/noteflow/
├── domain/           # Entities + ports (see Domain Package Structure below)
├── application/      # Use-cases/services (Meeting, Recovery, Export, Summarization, Trigger, Webhook, Calendar, Retention, NER)
├── infrastructure/   # Implementations
│   ├── audio/        # sounddevice capture, ring buffer, VU levels, playback, buffered writer
│   ├── asr/          # faster-whisper engine, VAD segmenter, streaming
│   ├── diarization/  # Speaker diarization (streaming: diart, offline: pyannote.audio)
│   ├── summarization/# Multi-provider summarization (CloudProvider, OllamaProvider) + citation verification
│   ├── triggers/     # Auto-start signal providers (calendar, audio activity, foreground app)
│   ├── persistence/  # SQLAlchemy + asyncpg + pgvector, Alembic migrations
│   ├── security/     # keyring keystore, AES-GCM encryption
│   ├── crypto/       # Cryptographic utilities
│   ├── export/       # Markdown/HTML/PDF export
│   ├── webhooks/     # Webhook executor with retry logic and HMAC signing
│   ├── converters/   # ORM ↔ domain entity converters (including webhook, NER, calendar, integration)
│   ├── calendar/     # OAuth manager, Google/Outlook calendar adapters
│   ├── ner/          # Named entity recognition engine (spaCy)
│   ├── observability/# OpenTelemetry tracing, usage event tracking
│   ├── metrics/      # Metric collection utilities
│   ├── logging/      # Log buffer and utilities
│   └── platform/     # Platform-specific code
├── grpc/             # Proto definitions, server, client, meeting store, modular mixins
├── cli/              # CLI tools (retention management, model commands)
└── config/           # Pydantic settings (NOTEFLOW_ env vars) + feature flags

Frontend (Tauri + React) lives outside the Python package:

client/
├── src/              # React UI, state, and view components
├── src-tauri/        # Rust/Tauri shell, IPC commands, gRPC client
├── e2e/              # Playwright tests
├── package.json      # Vite + test/lint scripts
└── vite.config.ts    # Vite configuration

Key patterns:

Hexagonal architecture: domain → application → infrastructure
Repository pattern with Unit of Work (SQLAlchemyUnitOfWork)
gRPC bidirectional streaming for audio → transcript flow
Protocol-based DI (see domain/ports/ and infrastructure protocols.py files)
Modular gRPC mixins for separation of concerns (see below)

Domain Package Structure

domain/
├── entities/         # Core domain entities
│   ├── meeting.py    # Meeting, MeetingId, MeetingState
│   ├── segment.py    # Segment, WordTiming
│   ├── summary.py    # Summary, KeyPoint, ActionItem
│   ├── annotation.py # Annotation
│   ├── named_entity.py # NamedEntity for NER results
│   └── integration.py# Integration, IntegrationType, IntegrationStatus
├── identity/         # User/workspace identity (Sprint 16)
│   ├── roles.py      # WorkspaceRole enum with permission checks
│   ├── context.py    # UserContext, WorkspaceContext, ProjectContext, OperationContext
│   └── entities.py   # User, Workspace, WorkspaceMembership domain entities
├── webhooks/         # Webhook domain
│   ├── events.py     # WebhookEventType, WebhookConfig, WebhookDelivery, payload classes
│   └── constants.py  # Webhook-related constants
├── triggers/         # Trigger detection domain
│   ├── entities.py   # Trigger, TriggerAction, TriggerSignal
│   └── ports.py      # TriggerProvider protocol
├── summarization/    # Summarization domain
│   └── ports.py      # SummarizationProvider protocol
├── ports/            # Repository protocols
│   ├── repositories/ # Organized by concern
│   │   ├── transcript.py  # MeetingRepository, SegmentRepository, SummaryRepository
│   │   ├── asset.py       # AssetRepository for file management
│   │   ├── background.py  # DiarizationJobRepository, EntityRepository
│   │   ├── external.py    # WebhookRepository, IntegrationRepository, PreferencesRepository
│   │   └── identity.py    # UserRepository, WorkspaceRepository protocols
│   ├── unit_of_work.py    # UnitOfWork protocol with supports_* capability checks
│   ├── diarization.py     # DiarizationEngine protocol
│   ├── ner.py             # NEREngine protocol
│   └── calendar.py        # CalendarProvider protocol
├── utils/            # Domain utilities
│   └── time.py       # utc_now() helper
├── errors.py         # Domain-specific exceptions
└── value_objects.py  # Shared value objects

gRPC Mixin Architecture

The gRPC server uses modular mixins for maintainability:

grpc/_mixins/
├── streaming/        # ASR streaming (package)
│   ├── _mixin.py     # Main StreamingMixin class
│   ├── _session.py   # Session management
│   ├── _asr.py       # ASR processing
│   ├── _processing.py# Audio processing pipeline
│   ├── _partials.py  # Partial transcript handling
│   ├── _cleanup.py   # Resource cleanup
│   └── _types.py     # Type definitions
├── diarization/      # Speaker diarization (package)
│   ├── _mixin.py     # Main DiarizationMixin class
│   ├── _jobs.py      # Background job management
│   ├── _refinement.py# Offline refinement
│   ├── _streaming.py # Real-time diarization
│   ├── _speaker.py   # Speaker assignment
│   ├── _status.py    # Job status tracking
│   └── _types.py     # Type definitions
├── summarization.py  # Summary generation (separates LLM inference from DB transactions)
├── meeting.py        # Meeting lifecycle (create, get, list, delete, stop)
├── annotation.py     # Segment annotations CRUD
├── export.py         # Markdown/HTML/PDF document export
├── entities.py       # Named entity extraction operations
├── calendar.py       # Calendar sync operations
├── webhooks.py       # Webhook management operations
├── preferences.py    # User preferences operations
├── observability.py  # Usage tracking, metrics operations
├── sync.py           # State synchronization operations
├── diarization_job.py# Diarization job status/management
├── converters.py     # Protobuf ↔ domain entity converters
├── errors.py         # gRPC error helpers (abort_not_found, abort_invalid_argument)
├── protocols.py      # ServicerHost protocol for mixin composition
└── _audio_helpers.py # Audio utility functions

grpc/interceptors/    # gRPC server interceptors (Sprint 16)
└── identity.py       # Identity context propagation (request_id, user_id, workspace_id)
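
The identity interceptor is described as propagating request_id, user_id, and workspace_id. A minimal sketch of that idea using only the standard library's contextvars is shown below. It does not use the real gRPC interceptor API, and the metadata keys (`x-request-id`, `x-user-id`, `x-workspace-id`) and all names ending in `Sketch` are hypothetical.

```python
import contextvars
from collections.abc import Callable
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentitySketch:
    """Hypothetical identity record carried for the duration of one request."""

    request_id: str
    user_id: Optional[str] = None
    workspace_id: Optional[str] = None


# Context-local storage: each request (or asyncio task) sees its own value.
_current_identity: "contextvars.ContextVar[Optional[IdentitySketch]]" = (
    contextvars.ContextVar("identity", default=None)
)


def current_identity() -> Optional[IdentitySketch]:
    """Read the identity for the request currently being handled."""
    return _current_identity.get()


def intercept(metadata: dict[str, str], handler: Callable[[], str]) -> str:
    """Set the identity from request metadata, run the handler, then restore."""
    identity = IdentitySketch(
        request_id=metadata.get("x-request-id", "unknown"),  # assumed key name
        user_id=metadata.get("x-user-id"),  # assumed key name
        workspace_id=metadata.get("x-workspace-id"),  # assumed key name
    )
    token = _current_identity.set(identity)
    try:
        return handler()
    finally:
        # Always restore the previous value so identities never leak between calls.
        _current_identity.reset(token)
```

Handlers deeper in the call stack can then call `current_identity()` instead of threading the identity through every function signature.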

Client-side mixins for the Python gRPC client:

grpc/_client_mixins/
├── streaming.py      # Client streaming operations
├── meeting.py        # Meeting CRUD operations
├── diarization.py    # Diarization requests
├── export.py         # Export requests
├── annotation.py     # Annotation operations
├── converters.py     # Response converters
└── protocols.py      # ClientHost protocol

Each mixin operates on the ServicerHost protocol, enabling clean composition in NoteFlowServicer.
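
As a rough illustration of mixins typed against a host protocol, the sketch below annotates `self` with a protocol so a type checker can confirm each mixin uses only what the host declares. The class names and the `_meetings` attribute are hypothetical and are not taken from the repository.

```python
from typing import Protocol


class ServicerHostSketch(Protocol):
    """Hypothetical stand-in for ServicerHost: state every mixin may rely on."""

    _meetings: dict[str, str]


class MeetingMixinSketch:
    """Mixin that only touches attributes declared on the host protocol."""

    def get_title(self: ServicerHostSketch, meeting_id: str) -> str:
        # Annotating `self` with the protocol lets the type checker reject
        # any access to attributes the host does not declare.
        return self._meetings.get(meeting_id, "<unknown>")


class StatsMixinSketch:
    """A second, independent mixin sharing the same host contract."""

    def count_meetings(self: ServicerHostSketch) -> int:
        return len(self._meetings)


class ServicerSketch(MeetingMixinSketch, StatsMixinSketch):
    """Composes the mixins and supplies the state they expect."""

    def __init__(self) -> None:
        self._meetings = {"m1": "Weekly sync"}
```

Because each mixin depends on the protocol rather than on the concrete servicer, mixins can be added or removed without importing one another.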

Service Injection Pattern

Services are injected through a three-tier pattern:

ServicerHost Protocol (protocols.py) — declares required service attributes
NoteFlowServicer (service.py) — accepts services via __init__ and stores as instance attributes
NoteFlowServer (server.py) — creates/initializes services and passes to servicer

Example: _webhook_service, _summarization_service, _ner_service all follow this pattern.
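
The three tiers above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the classes are hypothetical stand-ins, and only the `_webhook_service` attribute name comes from this document.

```python
from typing import Optional, Protocol


class WebhookServiceSketch:
    """Hypothetical service; records events instead of sending HTTP requests."""

    def __init__(self) -> None:
        self.fired: list[str] = []

    def trigger(self, event: str) -> None:
        self.fired.append(event)


class HostSketch(Protocol):
    """Tier 1: the protocol declares which service attributes must exist."""

    _webhook_service: Optional[WebhookServiceSketch]


class ServicerSketch:
    """Tier 2: the servicer accepts services in __init__ and stores them."""

    def __init__(self, webhook_service: Optional[WebhookServiceSketch] = None) -> None:
        self._webhook_service = webhook_service

    def stop_meeting(self, meeting_id: str) -> None:
        # Optional services are checked before use, so the servicer still
        # works when a service was not configured.
        if self._webhook_service is not None:
            self._webhook_service.trigger(f"meeting.completed:{meeting_id}")


class ServerSketch:
    """Tier 3: the server builds the services and hands them to the servicer."""

    def __init__(self) -> None:
        self.webhook_service = WebhookServiceSketch()
        self.servicer = ServicerSketch(webhook_service=self.webhook_service)
```

In tests, the same constructor makes it straightforward to pass a fake service or none at all.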

Client Architecture (Tauri + React)

client/src/
├── api/              # API layer with adapters and connection management
│   ├── tauri-adapter.ts    # Main Tauri IPC adapter
│   ├── mock-adapter.ts     # Mock adapter for testing
│   ├── cached-adapter.ts   # Caching layer
│   ├── connection-state.ts # Connection state machine
│   ├── reconnection.ts     # Auto-reconnection logic
│   ├── interface.ts        # Adapter interface definition
│   └── types/              # API type definitions
├── hooks/            # Custom React hooks
│   ├── use-diarization.ts  # Speaker diarization state
│   ├── use-cloud-consent.ts# Cloud provider consent
│   ├── use-webhooks.ts     # Webhook management
│   ├── use-oauth-flow.ts   # OAuth authentication
│   ├── use-calendar-sync.ts# Calendar synchronization
│   ├── use-entity-extraction.ts # NER operations
│   └── ...                 # Additional hooks
├── contexts/         # React contexts
│   └── connection-context.tsx # gRPC connection context
├── components/       # React components
│   ├── ui/           # Reusable UI components (shadcn/ui)
│   ├── recording/    # Recording-specific components
│   ├── settings/     # Settings panel components
│   └── analytics/    # Analytics visualizations
├── pages/            # Route pages
├── lib/              # Utilities
│   ├── tauri-helpers.ts    # Tauri utility functions
│   ├── tauri-events.ts     # Tauri event handling
│   ├── cache/              # Client-side caching
│   ├── config/             # Configuration management
│   └── ...                 # Format, crypto, utils
└── types/            # Shared TypeScript types

Rust/Tauri backend:

client/src-tauri/src/
├── commands/         # Tauri IPC command handlers
│   ├── recording/    # Recording commands (capture, device, audio)
│   ├── triggers/     # Trigger detection commands
│   ├── meeting.rs    # Meeting CRUD
│   ├── diarization.rs# Diarization operations
│   ├── calendar.rs   # Calendar sync
│   ├── webhooks.rs   # Webhook management
│   └── ...           # Export, annotation, preferences, etc.
├── grpc/             # gRPC client
│   ├── client/       # Client implementations by domain
│   └── types/        # Rust type definitions
├── state/            # Runtime state management
│   ├── app_state.rs  # Main application state
│   ├── preferences.rs# User preferences
│   ├── playback.rs   # Audio playback state
│   └── types.rs      # State type definitions
├── audio/            # Audio capture and playback
├── cache/            # Memory caching
├── crypto/           # Cryptographic operations
├── events/           # Tauri event emission
├── triggers/         # Trigger detection
├── main.rs           # Application entry point
└── lib.rs            # Library exports

Database

PostgreSQL with pgvector extension. Async SQLAlchemy with asyncpg driver.

# Alembic migrations
alembic upgrade head
alembic revision --autogenerate -m "description"

Connection via NOTEFLOW_DATABASE_URL env var or settings.

Testing Conventions

Test files: test_*.py, functions: test_*
Markers: @pytest.mark.slow (model loading), @pytest.mark.integration (external services)
Integration tests use testcontainers for PostgreSQL
Asyncio auto-mode enabled
React unit tests use Vitest; e2e tests use Playwright in client/e2e/.

Quality Gates

After any non-trivial changes, run the quality and test smell suite:

pytest tests/quality/                # Test smell detection (23 checks)

This suite enforces:

No assertion roulette (multiple assertions without messages)
No conditional test logic (loops/conditionals with assertions)
No cross-file fixture duplicates (consolidate to conftest.py)
No unittest-style assertions (use plain assert)
Proper fixture typing and scope
No pytest.raises without match=
And 17 other test quality checks

Fixtures like crypto, meetings_dir, and mock_uow are provided by tests/conftest.py — do not redefine them in test files.

Proto/gRPC

Proto definitions: src/noteflow/grpc/proto/noteflow.proto
Generated files excluded from lint: *_pb2.py, *_pb2_grpc.py

Regenerate after proto changes:

python -m grpc_tools.protoc -I src/noteflow/grpc/proto \
  --python_out=src/noteflow/grpc/proto \
  --grpc_python_out=src/noteflow/grpc/proto \
  src/noteflow/grpc/proto/noteflow.proto

Sync points (high risk of breakage):

Rust gRPC types are generated at build time by client/src-tauri/build.rs. Keep Rust DTOs aligned with proto changes.
Frontend enums/DTOs in client/src/types/ mirror proto enums and backend domain types; update these when proto enums change.
When adding or renaming RPCs, update: server mixins, src/noteflow/grpc/client.py, Tauri command handlers, and client/src/api/tauri-adapter.ts.

Common Pitfalls & Change Checklist

Proto / API evolution

Update the schema in src/noteflow/grpc/proto/noteflow.proto first; treat it as the source of truth.
Regenerate Python stubs (*_pb2.py, *_pb2_grpc.py) and verify imports still align in src/noteflow/grpc/.
Ensure the gRPC server mixins in src/noteflow/grpc/_mixins/ implement new/changed RPCs.
Update src/noteflow/grpc/client.py (Python client wrapper) to match the RPC signature and response types.
Update Tauri/Rust command handlers (client/src-tauri/src/commands/) and any Rust gRPC types used by commands.
Update TypeScript adapters in client/src/api/tauri-adapter.ts and shared DTOs/enums in client/src/types/ and client/src/api/types/.
Add/adjust tests on both sides (backend unit/integration + client unit tests) when changing payload shapes.

Database schema & migrations

Schema changes belong in src/noteflow/infrastructure/persistence/ plus an Alembic migration in src/noteflow/infrastructure/persistence/migrations/.
Use alembic revision --autogenerate only after updating models; review the migration for correctness.
Set NOTEFLOW_DATABASE_URL when running integration tests; without it, the default behavior may fall back to in-memory storage.
Update repository/UnitOfWork implementations if new tables or relations are introduced.
If you add fields used by export/summarization, ensure converters in infrastructure/converters/ are updated too.

Client sync points (Rust + TS)

Tauri IPC surfaces (Rust commands) must match the TypeScript calls in client/src/api/tauri-adapter.ts.
Rust gRPC types are generated by client/src-tauri/build.rs; verify the proto path if you move proto files.
Frontend enums in client/src/types/ and client/src/api/types/ mirror backend/proto enums; keep them aligned to avoid silent UI bugs.

Code Style

Python 3.12+, 100-char line length
Strict mypy (allow type: ignore[code] only with comment explaining why)
Ruff for linting (E, W, F, I, B, C4, UP, SIM, RUF)
Module soft limit 500 LoC, hard limit 750 LoC
Frontend formatting uses Prettier (single quotes, 100 char width); linting uses Biome.
Rust formatting uses rustfmt; linting uses clippy via the client scripts.

Automated Enforcement (Hookify Rules)

This project uses hookify rules to enforce code quality standards. These rules are defined in .claude/hookify*.local.md and automatically block or warn on violations.

Required: Use Makefile Targets

Always use Makefile targets instead of running tools directly. The Makefile orchestrates quality checks across Python, TypeScript, and Rust.

# Primary quality commands
make quality          # Run ALL quality checks (TS + Rust + Python)
make quality-py       # Python only: lint + type-check + test-quality
make quality-ts       # TypeScript only: type-check + lint + test-quality
make quality-rs       # Rust only: clippy + lint

# Python-specific
make lint-py          # Run Pyrefly linter
make type-check-py    # Run basedpyright
make test-quality-py  # Run pytest tests/quality/
make lint-fix-py      # Auto-fix Ruff + Sourcery issues

# TypeScript-specific
make type-check       # Run tsc --noEmit
make lint             # Run Biome linter
make lint-fix         # Auto-fix Biome issues
make test-quality     # Run Vitest quality tests

# Rust-specific
make clippy           # Run Clippy linter
make lint-rs          # Run code quality script
make fmt-rs           # Format with rustfmt

# Formatting
make fmt              # Format all code (Biome + rustfmt)
make fmt-check        # Check all formatting

# E2E Tests
make e2e              # Playwright tests (requires frontend on :5173)
make e2e-grpc         # Rust gRPC integration tests

Blocked Operations

The following operations are automatically blocked:

Protected Files (Require Explicit Permission)

File/Directory	What's Blocked	Why
`Makefile`	All modifications (Edit, Write, bash redirects)	Build system is protected
`tests/quality/`	All modifications except `baselines.json`	Quality gate infrastructure
`pyproject.toml`, `ruff.toml`, `pyrightconfig.json`	Edits	Python linter config is protected
`biome.json`, `tsconfig.json`, `.eslintrc*`	Edits	Frontend linter config is protected
`.rustfmt.toml`, `.clippy.toml`	Edits	Rust linter config is protected

Forbidden Code Patterns (Python Files Only)

Pattern	Why Blocked	Alternative
Type suppression comments (`# type: ignore`, `# pyright: ignore`)	Bypasses type safety	Fix the actual type error, use `cast()` as last resort
`Any` type annotations	Creates type safety holes	Use specific types, `Protocol`, `TypeVar`, or `object`
Magic numbers in assignments	Hidden intent, hard to maintain	Define named constants with `typing.Final`
Loops/conditionals in tests	Non-deterministic tests	Use `@pytest.mark.parametrize`
Multiple assertions without messages	Hard to debug failures	Add assertion messages or separate tests
Duplicate fixture definitions	Maintenance burden	Use fixtures from `tests/conftest.py`
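
Two of the alternatives in the table, named constants with `typing.Final` and a `Protocol` in place of `Any`, might look like the sketch below. The constant values and all names are illustrative assumptions, not values from the codebase.

```python
from collections.abc import Iterable
from dataclasses import dataclass
from typing import Final, Protocol

# Named constants instead of bare numbers in assignments (illustrative values).
SAMPLE_RATE_HZ: Final = 16_000
CHUNK_MS: Final = 100
MS_PER_SECOND: Final = 1_000


class HasDuration(Protocol):
    """Structural type used where `Any` might otherwise be tempting."""

    duration_seconds: float


@dataclass(frozen=True)
class ClipSketch:
    """Hypothetical value that satisfies HasDuration structurally."""

    duration_seconds: float


def samples_per_chunk() -> int:
    """Number of audio samples in one chunk, derived from the named constants."""
    return SAMPLE_RATE_HZ * CHUNK_MS // MS_PER_SECOND


def total_duration(items: Iterable[HasDuration]) -> float:
    """Sum durations of anything exposing `duration_seconds`."""
    return sum(item.duration_seconds for item in items)
```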

Blocked Fixtures (Already in `tests/conftest.py`)

Do not redefine these fixtures—they are globally available:

mock_uow, crypto, meetings_dir, webhook_config, webhook_config_all_events
sample_datetime, calendar_settings, meeting_id, sample_meeting, recording_meeting
mock_grpc_context, mock_asr_engine, mock_optional_extras

Warnings (Non-Blocking)

Trigger	Warning
Creating new source files	Search for existing similar code first
Files exceeding 500 lines	Consider refactoring into a package

Quality Gate Requirement

Before completing any code changes, you must run:

make quality

This is enforced by the require-make-quality hook. All quality checks must pass before completion.

Policy: No Ignoring Pre-existing Issues

If you encounter lint errors, type errors, or test failures—even if they existed before your changes—you must either:

Fix immediately (for simple issues)
Add to todo list (for complex issues)
Launch a subagent to fix (for parallelizable work)

Claiming something is "pre-existing" or "out of scope" is not a valid reason to ignore it.

Feature Flags

Optional features controlled via NOTEFLOW_FEATURE_* environment variables:

Flag	Default	Controls	Prerequisites
`NOTEFLOW_FEATURE_TEMPLATES_ENABLED`	`true`	AI summarization templates	—
`NOTEFLOW_FEATURE_PDF_EXPORT_ENABLED`	`true`	PDF export format	WeasyPrint installed
`NOTEFLOW_FEATURE_NER_ENABLED`	`false`	Named entity extraction	spaCy model downloaded
`NOTEFLOW_FEATURE_CALENDAR_ENABLED`	`false`	Calendar sync	OAuth credentials configured
`NOTEFLOW_FEATURE_WEBHOOKS_ENABLED`	`true`	Webhook notifications	—

Access via get_feature_flags().<flag_name> or get_settings().feature_flags.<flag_name>. In the table above, the flags that need external setup beyond a package install (NER and calendar sync) default to false.
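
To make the environment-variable mapping concrete, here is a standard-library sketch that reads the five flags with the defaults from the table. The project itself uses Pydantic settings, so this is not its implementation; `FeatureFlagsSketch`, `load_feature_flags`, and the set of accepted truthy strings are assumptions.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass, fields
from typing import Optional

ENV_PREFIX = "NOTEFLOW_FEATURE_"
# Assumption: which strings count as "true" is not specified in this document.
TRUTHY = frozenset({"1", "true", "yes", "on"})


@dataclass(frozen=True)
class FeatureFlagsSketch:
    """Defaults mirror the feature flag table."""

    templates_enabled: bool = True
    pdf_export_enabled: bool = True
    ner_enabled: bool = False
    calendar_enabled: bool = False
    webhooks_enabled: bool = True


def load_feature_flags(env: Optional[Mapping[str, str]] = None) -> FeatureFlagsSketch:
    """Build flags from NOTEFLOW_FEATURE_* variables, falling back to defaults."""
    source = os.environ if env is None else env
    values: dict[str, bool] = {}
    for field in fields(FeatureFlagsSketch):
        raw = source.get(ENV_PREFIX + field.name.upper())
        if raw is None:
            values[field.name] = field.default  # variable unset: keep the default
        else:
            values[field.name] = raw.strip().lower() in TRUTHY
    return FeatureFlagsSketch(**values)
```

Passing an explicit mapping, as in `load_feature_flags({"NOTEFLOW_FEATURE_NER_ENABLED": "true"})`, keeps the function easy to exercise without touching the real environment.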

Spikes (De-risking Experiments)

spikes/ contains validated platform experiments with FINDINGS.md:

spike_01_ui_tray_hotkeys/ - System tray and global hotkey integration
spike_02_audio_capture/ - sounddevice + PortAudio
spike_03_asr_latency/ - faster-whisper benchmarks (0.05x real-time)
spike_04_encryption/ - keyring + AES-GCM (826 MB/s throughput)

Key Subsystems

Speaker Diarization

Streaming: diart for real-time speaker detection during recording
Offline: pyannote.audio for post-meeting refinement (higher quality)
gRPC: RefineSpeakerDiarization (background job), GetDiarizationJobStatus (polling), RenameSpeaker

Summarization

Providers: CloudProvider (Anthropic/OpenAI), OllamaProvider (local), MockProvider (testing)
Templates: Configurable tone (professional/casual/technical), format (bullet_points/narrative/structured), verbosity (minimal/balanced/detailed)
Citation verification: Links summary claims to transcript evidence
Consent: Cloud providers require explicit user consent (stored in user_preferences)

Export

Formats: Markdown, HTML, PDF (via WeasyPrint)
Content: Transcript with timestamps, speaker labels, summary with key points and action items
gRPC: ExportTranscript RPC with ExportFormat enum
PDF styling: Embedded CSS for professional document layout

Named Entity Recognition (NER)

Automatic extraction of people, companies, products, and locations from transcripts.

Engine: spaCy with either en_core_web_sm (small CPU pipeline) or en_core_web_trf (transformer-based pipeline)
Categories: person, company, product, technical, acronym, location, date, other
Segment tracking: Entities link back to source segment_ids for navigation
Confidence scores: Model confidence for each extracted entity
Pinning: Users can pin (confirm) entities for future reference
gRPC: ExtractEntities RPC with optional force_refresh
Caching: Entities persisted in named_entities table, cached until refresh

Trigger Detection

Signals: Calendar proximity, audio activity, foreground app detection
Actions: IGNORE, NOTIFY, AUTO_START with confidence thresholds
Client integration: Background polling with dialog prompts (start/snooze/dismiss)
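
One way to picture confidence thresholds mapping to the three actions is the sketch below. The threshold values, the rule of summing signal weights, and the names are all assumptions made for illustration; the document does not specify how signals are combined.

```python
from enum import Enum
from typing import Final


class TriggerActionSketch(Enum):
    """Mirrors the three documented actions."""

    IGNORE = "ignore"
    NOTIFY = "notify"
    AUTO_START = "auto_start"


# Hypothetical thresholds; the real values are not given in this document.
NOTIFY_THRESHOLD: Final = 0.5
AUTO_START_THRESHOLD: Final = 0.8
MAX_CONFIDENCE: Final = 1.0


def decide(signal_weights: list[float]) -> TriggerActionSketch:
    """Combine signal weights (assumed additive, capped at 1.0) into an action."""
    confidence = min(MAX_CONFIDENCE, sum(signal_weights))
    if confidence >= AUTO_START_THRESHOLD:
        return TriggerActionSketch.AUTO_START
    if confidence >= NOTIFY_THRESHOLD:
        return TriggerActionSketch.NOTIFY
    return TriggerActionSketch.IGNORE
```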

Webhooks

Automated HTTP notifications for meeting lifecycle events.

Events: meeting.completed, summary.generated, recording.started, recording.stopped
Delivery: Exponential backoff retries (configurable max_retries, default 3)
Security: HMAC-SHA256 signing via X-NoteFlow-Signature header when secret configured
Headers: X-NoteFlow-Event (event type), X-NoteFlow-Delivery (unique delivery ID)
Fire-and-forget: Webhook failures never block primary RPC operations
Persistence: webhook_configs stores URL/events/secret, webhook_deliveries logs delivery attempts
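
The signing and header behavior described above can be sketched with the standard library. The header names come from this document; the signature encoding (a bare hex digest of the raw body) and the function names are assumptions, so check infrastructure/webhooks/executor.py for the actual format.

```python
import hashlib
import hmac
from typing import Optional


def sign_payload(secret: str, body: bytes) -> str:
    """Hex HMAC-SHA256 digest of the raw request body (assumed encoding)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def build_headers(
    event: str, delivery_id: str, body: bytes, secret: Optional[str]
) -> dict[str, str]:
    """Build delivery headers; the signature is added only when a secret is set."""
    headers = {
        "Content-Type": "application/json",
        "X-NoteFlow-Event": event,
        "X-NoteFlow-Delivery": delivery_id,
    }
    if secret:
        headers["X-NoteFlow-Signature"] = sign_payload(secret, body)
    return headers


def verify_signature(secret: str, body: bytes, received: str) -> bool:
    """Receiver side: recompute the digest and compare in constant time."""
    return hmac.compare_digest(sign_payload(secret, body), received)
```

A receiver would call `verify_signature` on the exact bytes it received, before parsing the JSON, because re-serializing the payload can change the digest.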

Key files:

domain/webhooks/events.py — WebhookEventType, WebhookConfig, WebhookDelivery, payload dataclasses
infrastructure/webhooks/executor.py — HTTP client with retry logic
application/services/webhook_service.py — orchestrates delivery to registered webhooks

Integrations

OAuth-based external service connections (calendar providers, etc.).

Types: calendar (Google, Outlook)
Status tracking: pending, connected, error, disconnected
Secure storage: OAuth tokens stored in integration_secrets table
Sync history: integration_sync_runs tracks each sync operation

ORM models in persistence/models/integrations/:

IntegrationModel — provider config and status
IntegrationSecretModel — encrypted OAuth tokens
CalendarEventModel — cached calendar events
MeetingCalendarLinkModel — links meetings to calendar events

Shared Utilities & Factories

Factories

Location	Function	Purpose
`infrastructure/summarization/factory.py`	`create_summarization_service()`	Auto-configured service with provider detection
`infrastructure/persistence/database.py`	`create_async_engine()`	SQLAlchemy async engine from settings
`infrastructure/persistence/database.py`	`create_async_session_factory()`	Session factory from DB URL
`config/settings.py`	`get_settings()`	Cached Settings from env vars
`config/settings.py`	`get_trigger_settings()`	Cached TriggerSettings from env vars

Converters

Location	Class/Function	Purpose
`infrastructure/converters/orm_converters.py`	`OrmConverter`	ORM ↔ domain entities (Meeting, Segment, Summary, etc.)
`infrastructure/converters/asr_converters.py`	`AsrConverter`	ASR DTOs → domain WordTiming
`infrastructure/converters/webhook_converters.py`	`WebhookConverter`	ORM ↔ domain (`WebhookConfig`, `WebhookDelivery`)
`grpc/_mixins/converters.py`	`meeting_to_proto()`, `segment_to_proto_update()`	Domain → protobuf messages
`grpc/_mixins/converters.py`	`create_segment_from_asr()`	ASR result → Segment with word timings

Repository Base (`persistence/repositories/_base.py`)

Method	Purpose
`_execute_scalar()`	Single result query (or None)
`_execute_scalars()`	All scalar results from query
`_add_and_flush()`	Add model and flush to DB
`_delete_and_flush()`	Delete model and flush

Security Helpers (`infrastructure/security/keystore.py`)

Function	Purpose
`_decode_and_validate_key()`	Validate base64 key, check size
`_generate_key()`	Generate 256-bit key as `(bytes, base64_str)`
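
A standard-library sketch of what these two helpers are described as doing follows. The function names carry a `_sketch` suffix to mark them as illustrative, and the error messages and use of standard (not URL-safe) base64 are assumptions.

```python
import base64
import binascii
import secrets

KEY_SIZE_BYTES = 32  # 256 bits


def generate_key_sketch() -> tuple[bytes, str]:
    """Generate a random 256-bit key and return it as (bytes, base64_str)."""
    key = secrets.token_bytes(KEY_SIZE_BYTES)
    return key, base64.b64encode(key).decode("ascii")


def decode_and_validate_key_sketch(encoded: str) -> bytes:
    """Decode a base64 key and reject malformed input or the wrong size."""
    try:
        key = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("key is not valid base64") from exc
    if len(key) != KEY_SIZE_BYTES:
        raise ValueError(f"expected {KEY_SIZE_BYTES} bytes, got {len(key)}")
    return key
```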

Export Helpers (`infrastructure/export/_formatting.py`)

Function	Purpose
`format_timestamp()`	Seconds → `MM:SS` or `HH:MM:SS`
`format_datetime()`	Datetime → display string
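
The seconds-to-timestamp conversion might be implemented along these lines. This is a sketch, not the repository's `format_timestamp()`: truncating fractional seconds, clamping negatives to zero, and zero-padding the hours are assumptions.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3600


def format_timestamp_sketch(seconds: float) -> str:
    """Format seconds as MM:SS, or HH:MM:SS once the value reaches one hour."""
    total = max(0, int(seconds))  # truncate fractions, clamp negatives
    hours, remainder = divmod(total, SECONDS_PER_HOUR)
    minutes, secs = divmod(remainder, SECONDS_PER_MINUTE)
    if hours:
        return f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"
```

For example, `format_timestamp_sketch(75.4)` gives `01:15` and `format_timestamp_sketch(3725)` gives `01:02:05`.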

Summarization (`infrastructure/summarization/`)

Location	Function	Purpose
`_parsing.py`	`build_transcript_prompt()`	Transcript with segment markers for LLM
`_parsing.py`	`parse_llm_response()`	JSON → Summary entity
`citation_verifier.py`	`verify_citations()`	Validate segment_ids exist
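
Validating that cited segment IDs exist can be pictured as a set-membership check. In the sketch below, the input shape (claim text mapped to a list of integer segment IDs) and the function name are assumptions; the real `verify_citations()` operates on the project's Summary entity.

```python
def verify_citations_sketch(
    cited: dict[str, list[int]], valid_segment_ids: set[int]
) -> dict[str, list[int]]:
    """Return, per claim, the cited segment IDs that are not in the transcript."""
    invalid: dict[str, list[int]] = {}
    for claim, segment_ids in cited.items():
        missing = [sid for sid in segment_ids if sid not in valid_segment_ids]
        if missing:
            invalid[claim] = missing
    return invalid
```

An empty result means every citation points at a real segment; any entry identifies a claim whose evidence link cannot be followed.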

Diarization (`infrastructure/diarization/assigner.py`)

Function	Purpose
`assign_speaker()`	Speaker for time range from turns
`assign_speakers_batch()`	Batch speaker assignment
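
A common way to pick a speaker for a time range is to choose the diarization turn with the largest temporal overlap. The sketch below uses that rule as an assumption; whether assigner.py uses the same rule, and the `SpeakerTurnSketch` type, are not confirmed by this document.

```python
from collections.abc import Iterable
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeakerTurnSketch:
    """Hypothetical diarization turn: one speaker talking from start to end."""

    speaker: str
    start: float
    end: float


def assign_speaker_sketch(
    start: float, end: float, turns: Iterable[SpeakerTurnSketch]
) -> Optional[str]:
    """Return the speaker whose turn overlaps [start, end] the most, if any."""
    best_speaker: Optional[str] = None
    best_overlap = 0.0
    for turn in turns:
        # Overlap of two intervals; zero or negative means they do not intersect.
        overlap = min(end, turn.end) - max(start, turn.start)
        if overlap > best_overlap:
            best_speaker, best_overlap = turn.speaker, overlap
    return best_speaker
```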

Triggers (`infrastructure/triggers/calendar.py`)

Function	Purpose
`parse_calendar_events()`	Parse events from config/env

Webhooks (`infrastructure/webhooks/executor.py`)

Class/Method	Purpose
`WebhookExecutor`	HTTP delivery with retry logic and HMAC signing
`WebhookExecutor.deliver()`	Deliver payload to webhook URL with exponential backoff
`WebhookExecutor._build_headers()`	Build headers including `X-NoteFlow-Signature`
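
Exponential backoff for delivery retries can be sketched as follows. Only the default of 3 retries comes from this document; the base delay, the doubling factor, and the function names are assumptions, and the HTTP call is replaced by an injected coroutine so the sketch stays self-contained.

```python
import asyncio
from collections.abc import Awaitable, Callable

DEFAULT_MAX_RETRIES = 3  # documented default
BASE_DELAY_SECONDS = 1.0  # assumption: not stated in this document


def backoff_delays(
    max_retries: int = DEFAULT_MAX_RETRIES, base: float = BASE_DELAY_SECONDS
) -> list[float]:
    """Delay before each retry: base, 2*base, 4*base, ..."""
    return [base * (2**attempt) for attempt in range(max_retries)]


async def deliver_with_retry(
    send: Callable[[], Awaitable[bool]],
    max_retries: int = DEFAULT_MAX_RETRIES,
    base: float = BASE_DELAY_SECONDS,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> tuple[bool, int]:
    """Try once, then retry with growing delays. Returns (success, attempts)."""
    attempts = 0
    for delay in [0.0, *backoff_delays(max_retries, base)]:
        if delay:
            await sleep(delay)  # wait only before retries, not the first attempt
        attempts += 1
        if await send():
            return True, attempts
    return False, attempts
```

Injecting `sleep` makes the retry schedule testable without real waiting.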

Recovery Service (`application/services/recovery_service.py`)

Method	Purpose
`recover_all()`	Orchestrate meeting + job recovery
`RecoveryResult`	Dataclass with recovery counts

Webhook Service (`application/services/webhook_service.py`)

Method	Purpose
`register_webhook()`	Register a webhook configuration
`trigger_meeting_completed()`	Fire webhooks on meeting completion
`trigger_summary_generated()`	Fire webhooks on summary generation
`trigger_recording_started/stopped()`	Fire webhooks on recording lifecycle

Unit of Work Repositories

The UnitOfWork protocol provides access to all repositories:

Property	Repository	Supports In-Memory
`meetings`	`MeetingRepository`	Yes
`segments`	`SegmentRepository`	Yes
`summaries`	`SummaryRepository`	Yes
`annotations`	`AnnotationRepository`	Yes
`diarization_jobs`	`DiarizationJobRepository`	Yes
`preferences`	`PreferencesRepository`	Yes
`entities`	`EntityRepository`	Yes
`integrations`	`IntegrationRepository`	DB only
`webhooks`	`WebhookRepository`	Yes

Check capability with supports_* properties (e.g., uow.supports_webhooks).
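
A capability check of this kind might be used as in the sketch below, where a caller degrades gracefully when the in-memory unit of work lacks a repository. The classes and the string-based "repository" are simplified, hypothetical stand-ins; only the `supports_*` naming convention comes from this document.

```python
from typing import Protocol


class UnitOfWorkSketch(Protocol):
    """Hypothetical slice of the UnitOfWork protocol."""

    @property
    def supports_integrations(self) -> bool: ...

    @property
    def integrations(self) -> list[str]: ...


class InMemoryUowSketch:
    """In-memory variant: integrations are DB only, so the flag is False."""

    supports_integrations = False

    @property
    def integrations(self) -> list[str]:
        raise NotImplementedError("integrations require a database")


class DatabaseUowSketch:
    """Database-backed variant with the integrations repository available."""

    supports_integrations = True

    def __init__(self) -> None:
        self._integrations = ["google-calendar"]

    @property
    def integrations(self) -> list[str]:
        return self._integrations


def list_integrations(uow: UnitOfWorkSketch) -> list[str]:
    """Check the capability flag before touching an optional repository."""
    if not uow.supports_integrations:
        return []
    return uow.integrations
```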

Known Issues

See docs/triage.md for tracked technical debt. See docs/sprints/ for feature implementation plans and status.

Resolved:

~~Server-side state volatility~~ → Diarization jobs persisted to DB
~~Hardcoded directory paths~~ → asset_path column added to meetings
~~Synchronous blocking in async gRPC~~ → run_in_executor for diarization
~~Summarization consent not persisted~~ → Stored in user_preferences table
~~VU meter update throttling~~ → 20fps throttle implemented
~~Webhook infrastructure missing~~ → Full webhook subsystem with executor, service, and repository
~~Integration/OAuth token storage~~ → IntegrationSecretModel for secure token storage
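
The run_in_executor fix mentioned above follows a general asyncio pattern: blocking work is moved to a thread pool so the event loop keeps serving other requests. A minimal sketch, with `time.sleep` standing in for diarization inference and all function names hypothetical:

```python
import asyncio
import time


def blocking_diarize(duration: float) -> str:
    """Stand-in for blocking model inference."""
    time.sleep(duration)
    return "done"


async def refine(duration: float) -> str:
    """Run the blocking call in the default thread pool executor."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_diarize, duration)


async def main() -> tuple[str, int]:
    """Show that the event loop stays responsive while the work runs."""
    ticks = 0
    task = asyncio.create_task(refine(0.05))
    while not task.done():
        await asyncio.sleep(0.01)  # other coroutines can run here
        ticks += 1
    return task.result(), ticks
```

Calling `blocking_diarize` directly inside a coroutine would instead stall every other RPC handled by the same loop for the full duration.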

MCP Tools Reference

Firecrawl (Web Scraping & Search)

Tool	Use Case
`firecrawl_scrape`	Extract content from a single URL (markdown, HTML, screenshots)
`firecrawl_search`	Web search with optional content extraction; supports operators (`site:`, `inurl:`, `-exclude`)
`firecrawl_map`	Discover all URLs on a website before scraping
`firecrawl_crawl`	Multi-page crawl with depth/limit controls (use sparingly—token-heavy)
`firecrawl_extract`	LLM-powered structured data extraction with JSON schema

Workflow: Search first without scrapeOptions, then scrape relevant URLs individually.

Sequential Thinking (Reasoning)

Tool	Use Case
`sequentialthinking`	Break down complex problems into numbered thought steps; supports revision, branching, and hypothesis verification

When to use: Multi-step analysis, design decisions, debugging with course-correction, architecture planning.

Context7 (Library Documentation)

Tool	Use Case
`resolve-library-id`	Convert package name → Context7 library ID (required first step)
`get-library-docs`	Fetch up-to-date docs; `mode='code'` for API/examples, `mode='info'` for concepts

Workflow: resolve-library-id("fastapi") → get-library-docs("/tiangolo/fastapi", topic="dependencies").

Serena (Semantic Code Tools)

Navigation (prefer over grep/read for code):

Tool	Use Case
`get_symbols_overview`	List top-level symbols in a file (classes, functions, variables)
`find_symbol`	Search by name path pattern; use `depth` for children, `include_body` for source
`find_referencing_symbols`	Find all references to a symbol across the codebase
`search_for_pattern`	Regex search across files (for non-symbol searches)

Editing (atomic symbol-level changes):

Tool	Use Case
`replace_symbol_body`	Replace entire symbol definition
`insert_before_symbol` / `insert_after_symbol`	Add code adjacent to a symbol
`rename_symbol`	Rename across entire codebase

Memory (persistent cross-session knowledge):

Tool	Use Case
`list_memories`	Show available project memories
`read_memory` / `write_memory`	Retrieve or store project knowledge

Guardrails (call before major actions):

Tool	Use Case
`think_about_collected_information`	Validate research completeness after search sequences
`think_about_task_adherence`	Verify alignment before editing code
`think_about_whether_you_are_done`	Confirm task completion

Principle: Use symbolic tools over file reads; retrieve symbol bodies only when needed for understanding or editing.
