Travis Vasceannie 4e1921b939 Enhance documentation and configuration for NoteFlow
- Added a comprehensive MCP Tools Reference section in `CLAUDE.md`, detailing various tools and their use cases for web scraping, reasoning, library documentation, semantic code tools, and more.
- Updated `example.env` to reflect new configuration options for retention, diarization, encryption, and desktop client settings, improving clarity and usability.
- Introduced a new `roadmap.md` file outlining the feature gap analysis and development roadmap for NoteFlow, focusing on core pipeline completion and future enhancements.
- Updated submodule reference in the `client` directory to the latest commit.
2025-12-23 21:34:06 +00:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

NoteFlow is an intelligent meeting notetaker: local-first audio capture + navigable recall + evidence-linked summaries. It is a client-server system built around a gRPC API for bidirectional audio streaming and transcription. The repository includes:

  • A Python backend (src/noteflow/) hosting the gRPC server, domain logic, and infrastructure adapters.
  • A Tauri + React desktop client (client/) that uses Rust for IPC and a React UI (Vite).

The gRPC schema is the shared contract between backend and client; keep proto changes in sync across Python, Rust, and TypeScript.

Quick Orientation (Start Here)

  • Backend server entry point: python -m noteflow.grpc.server (implementation in src/noteflow/grpc/service.py).
  • Tauri/React client: cd client && npm run dev (web), npm run tauri dev (desktop).
  • Tauri IPC bridge: TypeScript calls in client/src/lib/tauri.ts map to Rust commands in client/src-tauri/src/commands/.
  • Protobuf schema and generated Python stubs live in src/noteflow/grpc/proto/.

Build and Development Commands

# Install (editable with dev dependencies)
python -m pip install -e ".[dev]"

# Run gRPC server
python -m noteflow.grpc.server --help

# Run Tauri + React client UI
cd client
npm install
npm run dev
# Desktop Tauri dev (requires Rust toolchain)
npm run tauri dev
cd -

# Tests
pytest                           # Full suite
pytest -m "not integration"      # Skip external-service tests
pytest tests/domain/             # Run specific test directory
pytest -k "test_segment"         # Run by pattern

# Linting and type checking
ruff check .                     # Lint
ruff check --fix .               # Autofix
mypy src/noteflow                # Strict type checks
basedpyright                     # Additional type checks

# Client lint/format (from client/)
npm run lint                      # Biome
npm run lint:fix                  # Biome autofix
npm run format                    # Prettier
npm run format:check              # Prettier check
npm run lint:rs                   # Clippy (Rust)
npm run format:rs                 # rustfmt (Rust)

# Docker development
docker compose up -d postgres    # PostgreSQL with health checks
python scripts/dev_watch_server.py  # Auto-reload server (watches src/)

Docker Development

# Start PostgreSQL (with pgvector)
docker compose up -d postgres

# Dev container (VS Code) - full GUI environment
# .devcontainer/ includes PortAudio, GTK, pystray, pynput support
code .  # Open in VS Code, select "Reopen in Container"

# Development server with auto-reload
python scripts/dev_watch_server.py  # Uses watchfiles, monitors src/ and alembic.ini

Dev container features: dbus-x11, GTK-3, libgl1 for system tray and hotkey support.

Architecture

src/noteflow/
├── domain/           # Entities (meeting, segment, annotation, summary, triggers) + ports
├── application/      # Use-cases/services (MeetingService, RecoveryService, ExportService, SummarizationService, TriggerService)
├── infrastructure/   # Implementations
│   ├── audio/        # sounddevice capture, ring buffer, VU levels, playback, buffered writer
│   ├── asr/          # faster-whisper engine, VAD segmenter, streaming
│   ├── diarization/  # Speaker diarization (streaming: diart, offline: pyannote.audio)
│   ├── summarization/# Multi-provider summarization (CloudProvider, OllamaProvider) + citation verification
│   ├── triggers/     # Auto-start signal providers (calendar, audio activity, foreground app)
│   ├── persistence/  # SQLAlchemy + asyncpg + pgvector, Alembic migrations
│   ├── security/     # keyring keystore, AES-GCM encryption
│   ├── export/       # Markdown/HTML export
│   └── converters/   # ORM ↔ domain entity converters
├── grpc/             # Proto definitions, server, client, meeting store, modular mixins
└── config/           # Pydantic settings (NOTEFLOW_ env vars)

Frontend (Tauri + React) lives outside the Python package:

client/
├── src/              # React UI, state, and view components
├── src-tauri/        # Rust/Tauri shell, IPC commands, gRPC client
├── e2e/              # Playwright tests
├── package.json      # Vite + test/lint scripts
└── vite.config.ts    # Vite configuration

Key patterns:

  • Hexagonal architecture: domain → application → infrastructure
  • Repository pattern with Unit of Work (SQLAlchemyUnitOfWork)
  • gRPC bidirectional streaming for audio → transcript flow
  • Protocol-based DI (see domain/ports/ and infrastructure protocols.py files)
  • Modular gRPC mixins for separation of concerns (see below)

gRPC Mixin Architecture

The gRPC server uses modular mixins for maintainability:

grpc/_mixins/
├── streaming.py      # ASR streaming, audio processing, partial buffers
├── diarization.py    # Speaker diarization jobs (background refinement, job TTL)
├── summarization.py  # Summary generation (separates LLM inference from DB transactions)
├── meeting.py        # Meeting lifecycle (create, get, list, delete)
├── annotation.py     # Segment annotations CRUD
├── export.py         # Markdown/HTML document export
├── converters.py     # Protobuf ↔ domain entity converters
└── protocols.py      # ServicerHost protocol for mixin composition

Each mixin operates on the ServicerHost protocol, enabling clean composition in NoteFlowServicer.
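A minimal sketch of this composition style (the method names and the single `meetings` attribute are illustrative assumptions; the real host protocol and mixins are much richer):

```python
# Illustrative mixin-over-protocol composition: each mixin annotates
# `self` with the host protocol, and the concrete servicer inherits all
# mixins while providing the shared state.
from typing import Protocol


class ServicerHost(Protocol):
    """Shared surface every mixin may rely on."""

    meetings: dict[str, str]


class MeetingMixin:
    def create_meeting(self: ServicerHost, meeting_id: str, title: str) -> None:
        self.meetings[meeting_id] = title


class ExportMixin:
    def export_markdown(self: ServicerHost, meeting_id: str) -> str:
        return f"# {self.meetings[meeting_id]}"


class NoteFlowServicer(MeetingMixin, ExportMixin):
    """Concrete servicer satisfies ServicerHost and composes the mixins."""

    def __init__(self) -> None:
        self.meetings: dict[str, str] = {}
```

Annotating `self` with the protocol lets each mixin be type-checked in isolation without inheriting from a shared base class.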

Client Architecture (Tauri + React)

  • React components are in client/src/components/, shared UI types in client/src/types/, and Zustand state in client/src/store/.
  • Tauri IPC calls live in client/src/lib/tauri.ts and map to Rust handlers in client/src-tauri/src/commands/.
  • Rust application entry points are client/src-tauri/src/main.rs and client/src-tauri/src/lib.rs; shared runtime state is in client/src-tauri/src/state/.

Database

PostgreSQL with pgvector extension. Async SQLAlchemy with asyncpg driver.

# Alembic migrations
alembic upgrade head
alembic revision --autogenerate -m "description"

Connection via NOTEFLOW_DATABASE_URL env var or settings.

Testing Conventions

  • Test files: test_*.py, functions: test_*
  • Markers: @pytest.mark.slow (model loading), @pytest.mark.integration (external services)
  • Integration tests use testcontainers for PostgreSQL
  • Asyncio auto-mode enabled
  • React unit tests use Vitest; e2e tests use Playwright in client/e2e/.

Proto/gRPC

Proto definitions: src/noteflow/grpc/proto/noteflow.proto
Generated files excluded from lint: *_pb2.py, *_pb2_grpc.py

Regenerate after proto changes:

python -m grpc_tools.protoc -I src/noteflow/grpc/proto \
  --python_out=src/noteflow/grpc/proto \
  --grpc_python_out=src/noteflow/grpc/proto \
  src/noteflow/grpc/proto/noteflow.proto

Sync points (high risk of breakage):

  • Rust gRPC types are generated at build time by client/src-tauri/build.rs. Keep Rust DTOs aligned with proto changes.
  • Frontend enums/DTOs in client/src/types/ mirror proto enums and backend domain types; update these when proto enums change.
  • When adding or renaming RPCs, update: server mixins, src/noteflow/grpc/client.py, Tauri command handlers, and client/src/lib/tauri.ts.

Common Pitfalls & Change Checklist

Proto / API evolution

  • Update the schema in src/noteflow/grpc/proto/noteflow.proto first; treat it as the source of truth.
  • Regenerate Python stubs (*_pb2.py, *_pb2_grpc.py) and verify imports still align in src/noteflow/grpc/.
  • Ensure the gRPC server mixins in src/noteflow/grpc/_mixins/ implement new/changed RPCs.
  • Update src/noteflow/grpc/client.py (Python client wrapper) to match the RPC signature and response types.
  • Update Tauri/Rust command handlers (client/src-tauri/src/commands/) and any Rust gRPC types used by commands.
  • Update TypeScript wrappers in client/src/lib/tauri.ts and shared DTOs/enums in client/src/types/.
  • Add/adjust tests on both sides (backend unit/integration + client unit tests) when changing payload shapes.

Database schema & migrations

  • Schema changes belong in src/noteflow/infrastructure/persistence/ plus an Alembic migration in src/noteflow/infrastructure/persistence/migrations/.
  • Use alembic revision --autogenerate only after updating models; review the migration for correctness.
  • Keep NOTEFLOW_DATABASE_URL in mind when running integration tests; default behavior may fall back to in-memory storage.
  • Update repository/UnitOfWork implementations if new tables or relations are introduced.
  • If you add fields used by export/summarization, ensure converters in infrastructure/converters/ are updated too.

Client sync points (Rust + TS)

  • Tauri IPC surfaces (Rust commands) must match the TypeScript calls in client/src/lib/tauri.ts.
  • Rust gRPC types are generated by client/src-tauri/build.rs; verify the proto path if you move proto files.
  • Frontend enums in client/src/types/ mirror backend/proto enums; keep them aligned to avoid silent UI bugs.

Code Style

  • Python 3.12+, 100-char line length
  • Strict mypy (allow type: ignore[code] only with comment explaining why)
  • Ruff for linting (E, W, F, I, B, C4, UP, SIM, RUF)
  • Module soft limit 500 LoC, hard limit 750 LoC
  • Frontend formatting uses Prettier (single quotes, 100 char width); linting uses Biome.
  • Rust formatting uses rustfmt; linting uses clippy via the client scripts.

Spikes (De-risking Experiments)

spikes/ contains validated platform experiments with FINDINGS.md:

  • spike_02_audio_capture/ - sounddevice + PortAudio
  • spike_03_asr_latency/ - faster-whisper benchmarks (0.05x real-time)
  • spike_04_encryption/ - keyring + AES-GCM (826 MB/s throughput)

Key Subsystems

Speaker Diarization

  • Streaming: diart for real-time speaker detection during recording
  • Offline: pyannote.audio for post-meeting refinement (higher quality)
  • gRPC: RefineSpeakerDiarization (background job), GetDiarizationJobStatus (polling), RenameSpeaker

Summarization

  • Providers: CloudProvider (Anthropic/OpenAI), OllamaProvider (local), MockProvider (testing)
  • Citation verification: Links summary claims to transcript evidence
  • Consent: Cloud providers require explicit user consent (persisted in the user_preferences table)

Trigger Detection

  • Signals: Calendar proximity, audio activity, foreground app detection
  • Actions: IGNORE, NOTIFY, AUTO_START with confidence thresholds
  • Client integration: Background polling with dialog prompts (start/snooze/dismiss)
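The threshold-to-action mapping can be sketched as below. The specific threshold values and function name are assumptions for illustration; the real TriggerService fuses multiple signals and reads thresholds from TriggerSettings.

```python
# Hypothetical confidence-threshold mapping for trigger actions.
from enum import Enum


class TriggerAction(Enum):
    IGNORE = "ignore"
    NOTIFY = "notify"
    AUTO_START = "auto_start"


def decide_action(
    confidence: float,
    notify_threshold: float = 0.5,
    auto_start_threshold: float = 0.85,
) -> TriggerAction:
    """Map a fused signal confidence to an action, highest tier first."""
    if confidence >= auto_start_threshold:
        return TriggerAction.AUTO_START
    if confidence >= notify_threshold:
        return TriggerAction.NOTIFY
    return TriggerAction.IGNORE
```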

Shared Utilities & Factories

Factories

| Location | Function | Purpose |
| --- | --- | --- |
| infrastructure/summarization/factory.py | create_summarization_service() | Auto-configured service with provider detection |
| infrastructure/persistence/database.py | create_async_engine() | SQLAlchemy async engine from settings |
| infrastructure/persistence/database.py | create_async_session_factory() | Session factory from DB URL |
| config/settings.py | get_settings() | Cached Settings from env vars |
| config/settings.py | get_trigger_settings() | Cached TriggerSettings from env vars |

Converters

| Location | Class/Function | Purpose |
| --- | --- | --- |
| infrastructure/converters/orm_converters.py | OrmConverter | ORM ↔ domain entities (Meeting, Segment, Summary, etc.) |
| infrastructure/converters/asr_converters.py | AsrConverter | ASR DTOs → domain WordTiming |
| grpc/_mixins/converters.py | meeting_to_proto(), segment_to_proto_update() | Domain → protobuf messages |
| grpc/_mixins/converters.py | create_segment_from_asr() | ASR result → Segment with word timings |

Repository Base (persistence/repositories/_base.py)

| Method | Purpose |
| --- | --- |
| _execute_scalar() | Single result query (or None) |
| _execute_scalars() | All scalar results from query |
| _add_and_flush() | Add model and flush to DB |
| _delete_and_flush() | Delete model and flush |

Security Helpers (infrastructure/security/keystore.py)

| Function | Purpose |
| --- | --- |
| _decode_and_validate_key() | Validate base64 key, check size |
| _generate_key() | Generate 256-bit key as (bytes, base64_str) |

Export Helpers (infrastructure/export/_formatting.py)

| Function | Purpose |
| --- | --- |
| format_timestamp() | Seconds → MM:SS or HH:MM:SS |
| format_datetime() | Datetime → display string |
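The likely shape of the timestamp helper, per the table above (rounding and over-24-hour behavior are assumptions):

```python
# Hedged sketch of format_timestamp: MM:SS below one hour, else HH:MM:SS.
def format_timestamp(seconds: float) -> str:
    """Render a duration in seconds as MM:SS or HH:MM:SS."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"
```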

Summarization (infrastructure/summarization/)

| Location | Function | Purpose |
| --- | --- | --- |
| _parsing.py | build_transcript_prompt() | Transcript with segment markers for LLM |
| _parsing.py | parse_llm_response() | JSON → Summary entity |
| citation_verifier.py | verify_citations() | Validate segment_ids exist |

Diarization (infrastructure/diarization/assigner.py)

| Function | Purpose |
| --- | --- |
| assign_speaker() | Speaker for time range from turns |
| assign_speakers_batch() | Batch speaker assignment |

Triggers (infrastructure/triggers/calendar.py)

| Function | Purpose |
| --- | --- |
| parse_calendar_events() | Parse events from config/env |

Recovery Service (application/services/recovery_service.py)

| Method/Class | Purpose |
| --- | --- |
| recover_all() | Orchestrate meeting + job recovery |
| RecoveryResult | Dataclass with recovery counts |

Known Issues

See docs/triage.md for tracked technical debt.

Resolved:

  • Server-side state volatility → Diarization jobs persisted to DB
  • Hardcoded directory paths → asset_path column added to meetings
  • Synchronous blocking in async gRPC → run_in_executor for diarization
  • Summarization consent not persisted → Stored in user_preferences table
  • VU meter update throttling → 20fps throttle implemented

MCP Tools Reference

| Tool | Use Case |
| --- | --- |
| firecrawl_scrape | Extract content from a single URL (markdown, HTML, screenshots) |
| firecrawl_search | Web search with optional content extraction; supports operators (site:, inurl:, -exclude) |
| firecrawl_map | Discover all URLs on a website before scraping |
| firecrawl_crawl | Multi-page crawl with depth/limit controls (use sparingly—token-heavy) |
| firecrawl_extract | LLM-powered structured data extraction with JSON schema |

Workflow: Search first without scrapeOptions, then scrape relevant URLs individually.

Sequential Thinking (Reasoning)

| Tool | Use Case |
| --- | --- |
| sequentialthinking | Break down complex problems into numbered thought steps; supports revision, branching, and hypothesis verification |

When to use: Multi-step analysis, design decisions, debugging with course-correction, architecture planning.

Context7 (Library Documentation)

| Tool | Use Case |
| --- | --- |
| resolve-library-id | Convert package name → Context7 library ID (required first step) |
| get-library-docs | Fetch up-to-date docs; mode='code' for API/examples, mode='info' for concepts |

Workflow: resolve-library-id("fastapi") → get-library-docs("/tiangolo/fastapi", topic="dependencies").

Serena (Semantic Code Tools)

Navigation (prefer over grep/read for code):

| Tool | Use Case |
| --- | --- |
| get_symbols_overview | List top-level symbols in a file (classes, functions, variables) |
| find_symbol | Search by name path pattern; use depth for children, include_body for source |
| find_referencing_symbols | Find all references to a symbol across the codebase |
| search_for_pattern | Regex search across files (for non-symbol searches) |

Editing (atomic symbol-level changes):

| Tool | Use Case |
| --- | --- |
| replace_symbol_body | Replace entire symbol definition |
| insert_before_symbol / insert_after_symbol | Add code adjacent to a symbol |
| rename_symbol | Rename across entire codebase |

Memory (persistent cross-session knowledge):

| Tool | Use Case |
| --- | --- |
| list_memories | Show available project memories |
| read_memory / write_memory | Retrieve or store project knowledge |

Guardrails (call before major actions):

| Tool | Use Case |
| --- | --- |
| think_about_collected_information | Validate research completeness after search sequences |
| think_about_task_adherence | Verify alignment before editing code |
| think_about_whether_you_are_done | Confirm task completion |

Principle: Use symbolic tools over file reads; retrieve symbol bodies only when needed for understanding or editing.