Add summarization and trigger services

- Introduced `SummarizationService` and `TriggerService` to orchestrate summarization and trigger detection functionalities.
- Added new modules for summarization, including citation verification and cloud-based summarization providers.
- Implemented trigger detection based on audio activity and foreground application status.
- Updated project configuration to include new dependencies for summarization and trigger functionalities.
- Created tests for summarization and trigger services to ensure functionality and reliability.
2025-12-18 00:08:51 +00:00
parent b36ee5c211
commit 4eef1b3be6
49 changed files with 15909 additions and 4256 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,103 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+NoteFlow is an intelligent meeting notetaker: local-first audio capture + navigable recall + evidence-linked summaries. Client-server architecture using gRPC for bidirectional audio streaming and transcription.
+
+## Build and Development Commands
+
+```bash
+# Install (editable with dev dependencies)
+python -m pip install -e ".[dev]"
+
+# Run gRPC server
+python -m noteflow.grpc.server --help
+
+# Run Flet client UI
+python -m noteflow.client.app --help
+
+# Tests
+pytest                           # Full suite
+pytest -m "not integration"      # Skip external-service tests
+pytest tests/domain/             # Run specific test directory
+pytest -k "test_segment"         # Run by pattern
+
+# Linting and type checking
+ruff check .                     # Lint
+ruff check --fix .               # Autofix
+mypy src/noteflow                # Strict type checks
+basedpyright                     # Additional type checks
+```
+
+## Architecture
+
+```
+src/noteflow/
+├── domain/           # Entities (meeting, segment, annotation, summary) + ports (repository interfaces)
+├── application/      # Use-cases/services (MeetingService, RecoveryService, ExportService)
+├── infrastructure/   # Implementations
+│   ├── audio/        # sounddevice capture, ring buffer, VU levels, playback
+│   ├── asr/          # faster-whisper engine, VAD segmenter, streaming
+│   ├── persistence/  # SQLAlchemy + asyncpg + pgvector, Alembic migrations
+│   ├── security/     # keyring keystore, AES-GCM encryption
+│   ├── export/       # Markdown/HTML export
+│   └── converters/   # ORM ↔ domain entity converters
+├── grpc/             # Proto definitions, server, client, meeting store
+├── client/           # Flet UI app + components (transcript, VU meter, playback)
+└── config/           # Pydantic settings (NOTEFLOW_ env vars)
+```
+
+**Key patterns:**
+- Hexagonal architecture: domain → application → infrastructure
+- Repository pattern with Unit of Work (`SQLAlchemyUnitOfWork`)
+- gRPC bidirectional streaming for audio → transcript flow
+- Protocol-based DI (see `domain/ports/` and infrastructure `protocols.py` files)
+
+## Database
+
+PostgreSQL with pgvector extension. Async SQLAlchemy with asyncpg driver.
+
+```bash
+# Alembic migrations
+alembic upgrade head
+alembic revision --autogenerate -m "description"
+```
+
+Connection via `NOTEFLOW_DATABASE_URL` env var or settings.
+
+## Testing Conventions
+
+- Test files: `test_*.py`, functions: `test_*`
+- Markers: `@pytest.mark.slow` (model loading), `@pytest.mark.integration` (external services)
+- Integration tests use testcontainers for PostgreSQL
+- Asyncio auto-mode enabled
+
+## Proto/gRPC
+
+Proto definitions: `src/noteflow/grpc/proto/noteflow.proto`
+Generated files excluded from lint: `*_pb2.py`, `*_pb2_grpc.py`
+
+Regenerate after proto changes:
+```bash
+python -m grpc_tools.protoc -I src/noteflow/grpc/proto \
+  --python_out=src/noteflow/grpc/proto \
+  --grpc_python_out=src/noteflow/grpc/proto \
+  src/noteflow/grpc/proto/noteflow.proto
+```
+
+## Code Style
+
+- Python 3.12+, 100-char line length
+- Strict mypy (allow `type: ignore[code]` only with comment explaining why)
+- Ruff for linting (E, W, F, I, B, C4, UP, SIM, RUF)
+- Module soft limit 500 LoC, hard limit 750 LoC
+
+## Spikes (De-risking Experiments)
+
+`spikes/` contains validated platform experiments with `FINDINGS.md`:
+- `spike_01_ui_tray_hotkeys/` - Flet + pystray + pynput (requires X11)
+- `spike_02_audio_capture/` - sounddevice + PortAudio
+- `spike_03_asr_latency/` - faster-whisper benchmarks (0.05x real-time)
+- `spike_04_encryption/` - keyring + AES-GCM (826 MB/s throughput)
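
Note on the "Key patterns" list in the CLAUDE.md hunk above: the combination of protocol-based DI with a repository and Unit of Work can be illustrated roughly as follows. This is a minimal sketch and not code from this commit. The `Meeting` fields, the method names (`add`, `get`, `commit`, `create_meeting`) and the in-memory classes are all hypothetical, and the sketch is synchronous for brevity, whereas the project describes async SQLAlchemy. Only the names `MeetingService` and the Unit of Work idea come from the file itself.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Meeting:
    """Hypothetical, pared-down domain entity (fields are assumptions)."""
    id: str
    title: str


class MeetingRepository(Protocol):
    """Port: what the application layer needs from storage (assumed shape)."""

    def add(self, meeting: Meeting) -> None: ...
    def get(self, meeting_id: str) -> Optional[Meeting]: ...


class UnitOfWork(Protocol):
    """Port: groups repository writes into one commit-or-discard scope."""

    meetings: MeetingRepository

    def __enter__(self) -> "UnitOfWork": ...
    def __exit__(self, *exc: object) -> None: ...
    def commit(self) -> None: ...


class InMemoryMeetingRepository:
    """Illustrative adapter; satisfies MeetingRepository structurally."""

    def __init__(self, store: dict[str, Meeting]) -> None:
        self._store = store

    def add(self, meeting: Meeting) -> None:
        self._store[meeting.id] = meeting

    def get(self, meeting_id: str) -> Optional[Meeting]:
        return self._store.get(meeting_id)


class InMemoryUnitOfWork:
    """Illustrative stand-in for a database-backed unit of work."""

    def __init__(self) -> None:
        self.committed: dict[str, Meeting] = {}
        self._staged: dict[str, Meeting] = {}
        self.meetings = InMemoryMeetingRepository(self._staged)

    def _reset_staged(self) -> None:
        # Keep the same dict object so the repository's reference stays valid.
        self._staged.clear()
        self._staged.update(self.committed)

    def __enter__(self) -> "InMemoryUnitOfWork":
        self._reset_staged()
        return self

    def __exit__(self, *exc: object) -> None:
        # Anything not committed inside the block is discarded.
        self._reset_staged()

    def commit(self) -> None:
        self.committed = dict(self._staged)


class MeetingService:
    """Application service: depends only on the UnitOfWork protocol."""

    def __init__(self, uow: UnitOfWork) -> None:
        self._uow = uow

    def create_meeting(self, meeting_id: str, title: str) -> Meeting:
        meeting = Meeting(id=meeting_id, title=title)
        with self._uow as uow:
            uow.meetings.add(meeting)
            uow.commit()
        return meeting
```

Because the service is typed against the protocol rather than a concrete class, a test can pass `InMemoryUnitOfWork()` where production code would presumably pass the SQLAlchemy-backed implementation, with no inheritance required on either side.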