feat: reorganize Claude hooks and add RAG documentation structure with error handling policies

- Moved all hookify configuration files from `.claude/` to `.claude/hooks/` subdirectory for better organization
- Added four new blocking hooks to prevent common error handling anti-patterns:
  - `block-broad-exception-handler`: Prevents catching generic `Exception` with only logging
  - `block-datetime-now-fallback`: Blocks returning `datetime.now()` as fallback on parse failures to prevent data corruption
  - `block-default
2026-01-15 15:58:06 +00:00
parent a95a92ca25
commit 1ce24cdf7b
321 changed files with 11515 additions and 3878 deletions


@@ -0,0 +1,61 @@
---
name: block-broad-exception-handler
enabled: true
event: file
conditions:
  - field: new_text
    operator: regex_match
    pattern: except\s+Exception\s*(?:as\s+\w+)?:\s*\n\s+(?:logger\.|logging\.)
action: block
---
**BLOCKED: Broad `except Exception:` handler detected**
Catching generic `Exception` and only logging creates silent failures that are nearly impossible to debug.
**Why this is dangerous:**
- Catches ALL exceptions including programming errors (`TypeError`, `AttributeError`)
- Hides bugs that should crash loudly and be fixed immediately
- Makes the system appear to work while silently failing
- Log messages get lost in noise; exceptions should bubble up
**Acceptable uses (require explicit justification):**
1. Top-level handlers in background tasks that MUST NOT crash
2. Fire-and-forget operations where failure is truly acceptable
3. User-provided callbacks that could raise anything
**If you believe this is acceptable, add a comment:**
```python
# INTENTIONAL BROAD HANDLER: <reason>
# - <specific justification>
# - <what happens on failure>
```
**What to do instead:**
1. Catch specific exception types you can handle
2. Let unexpected exceptions propagate
3. Use domain-specific exceptions from `src/noteflow/domain/errors.py`
**Resolution pattern:**
```python
# BAD: Catches everything silently
except Exception:
    logger.exception("Something failed")
# GOOD: Specific exceptions
except (ValueError, KeyError) as e:
    logger.warning("Config parse error: %s", e)
    raise ConfigError(str(e)) from e
# GOOD: Let others propagate
except ValidationError as e:
    logger.error("Validation failed: %s", e)
    raise  # Re-raise or wrap in domain error
```
**If truly fire-and-forget, surface metrics:**
```python
except Exception:
    logger.exception("Background task failed")
    metrics.increment("background_task_failures")  # Observable!
```
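**Checking what the pattern matches (illustrative sketch):**
The snippet below runs the hook's `pattern` through Python's `re.search`. That the hook engine applies equivalent regex semantics is an assumption, and the sample handlers are made up for this check. Under that assumption, any `except Exception` handler whose first statement is a logging call is matched, even when a metrics call follows, while a specific exception type or a comment placed as the first line of the handler body is not.

```python
import re

# Pattern copied from the hook's frontmatter.
PATTERN = re.compile(
    r"except\s+Exception\s*(?:as\s+\w+)?:\s*\n\s+(?:logger\.|logging\.)"
)

# Hypothetical handlers, used only to exercise the pattern.
SAMPLES = {
    "log_only": 'except Exception:\n    logger.exception("failed")\n',
    "log_then_metrics": (
        "except Exception as exc:\n"
        '    logger.exception("failed")\n'
        '    metrics.increment("failures")\n'
    ),
    "specific_type": 'except ValueError as exc:\n    logger.warning("bad: %s", exc)\n',
    "comment_first": (
        "except Exception:\n"
        "    # INTENTIONAL BROAD HANDLER: top-level task\n"
        '    logger.exception("failed")\n'
    ),
}


def is_blocked(code: str) -> bool:
    """Return True when the pattern finds a broad handler that starts by logging."""
    return PATTERN.search(code) is not None


results = {name: is_blocked(code) for name, code in SAMPLES.items()}
```

If the real engine behaves the same way, the fire-and-forget variant with a metrics call would still need the justification comment inside the handler body to pass.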


@@ -0,0 +1,42 @@
---
name: block-datetime-now-fallback
enabled: true
event: file
pattern: return\s+datetime\.now\s*\(
action: block
---
**BLOCKED: datetime.now() fallback detected**
Returning `datetime.now()` as a fallback on parse failures causes **data corruption**.
**Why this is dangerous:**
- Silent data corruption - timestamps become incorrect without any error signal
- Debugging nightmare - no way to trace back to the original parse failure
- Data integrity loss - downstream consumers receive fabricated timestamps
**What to do instead:**
1. Return `None` and let callers handle missing timestamps explicitly
2. Raise a typed error (e.g., `DateParseError`) so failures are visible
3. Use `Result[datetime, ParseError]` pattern for explicit error handling
**Examples from swallowed.md:**
- `_parse_google_datetime` returns `datetime.now(UTC)` on `ValueError`
- `parse_outlook_datetime` returns `datetime.now(UTC)` on `ValueError`
**Resolution pattern:**
```python
# BAD: Silent data corruption
except ValueError:
    logger.warning("Failed to parse: %s", dt_str)
    return datetime.now(UTC)  # Data corruption!
# GOOD: Explicit failure
except ValueError as e:
    raise DateParseError(f"Invalid datetime: {dt_str}") from e
# GOOD: Optional return
except ValueError:
    logger.warning("Failed to parse: %s", dt_str)
    return None  # Caller handles missing timestamp
```
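**Complete parser (illustrative sketch):**
The fragments above can be assembled into a full function along these lines. `DateParseError`, `parse_iso_datetime`, and `parse_iso_datetime_or_none` are hypothetical names, and treating naive timestamps as UTC is an assumption made only for this sketch; the project's real parsers may differ.

```python
from datetime import datetime, timezone
from typing import Optional


class DateParseError(ValueError):
    """Hypothetical typed error for timestamps that cannot be parsed."""


def parse_iso_datetime(dt_str: str) -> datetime:
    """Parse an ISO 8601 string, raising instead of inventing a timestamp."""
    try:
        parsed = datetime.fromisoformat(dt_str)
    except ValueError as e:
        raise DateParseError(f"Invalid datetime: {dt_str!r}") from e
    # Assumption for this sketch: naive values are interpreted as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def parse_iso_datetime_or_none(dt_str: str) -> Optional[datetime]:
    """Optional-return variant: the caller must handle a missing timestamp."""
    try:
        return parse_iso_datetime(dt_str)
    except DateParseError:
        return None
```

Neither variant ever returns the current time, so a bad input either fails loudly or shows up as an explicit `None`.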


@@ -0,0 +1,60 @@
---
name: block-default-value-swallow
enabled: true
event: file
conditions:
  - field: new_text
    operator: regex_match
    pattern: except\s+\w*(?:Error|Exception).*?:\s*\n\s+.*?(?:logger\.|logging\.).*?(?:warning|warn).*?\n\s+return\s+(?:\w+Settings|Defaults?\(|default_|\{[^}]*\}|[A-Z_]+_DEFAULT)
action: block
---
**BLOCKED: Default value swallowing on config/settings failure**
Returning hardcoded defaults when configuration loading fails hides critical initialization errors.
**Why this is dangerous:**
- Application runs with unexpected/incorrect configuration
- Users have no idea their settings aren't being applied
- Subtle bugs that only manifest under specific conditions
- Security settings might be weaker than intended
**Examples from swallowed.md:**
- `get_llm_settings` returns hardcoded defaults on any exception
- `_get_ollama_settings` returns defaults on settings load failure
- `get_webhook_settings` returns defaults on exception
- `diarization_job_ttl_seconds` returns default TTL on failure
**What to do instead:**
1. **Fail fast at startup** - validate config before accepting requests
2. **Return typed errors** - let callers decide how to handle missing config
3. **Mark degraded mode** - if defaulting is acceptable, make it visible
**Resolution pattern:**
```python
# BAD: Silent defaulting
except Exception as exc:
    logger.warning("Settings load failed, using defaults")
    return HardcodedDefaults()  # User has no idea!
# GOOD: Fail fast
except Exception as exc:
    raise ConfigurationError(f"Failed to load settings: {exc}") from exc
# GOOD: Explicit degraded mode (if truly acceptable)
except Exception as exc:
    logger.error("config_degraded_mode", error=str(exc))
    metrics.set_gauge("config_degraded", 1)  # Observable!
    return DefaultSettings(degraded=True)  # Callers can check
```
**Startup validation pattern:**
```python
import sys

# Validate config at startup, not on every call
def validate_config_or_fail():
    try:
        settings = load_settings()
        settings.validate()
    except Exception as e:
        sys.exit(f"Configuration error: {e}")
```
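**Checking which return expressions the pattern covers (illustrative sketch):**
The snippet below feeds a logged-warning handler with different `return` expressions through the hook's `pattern` using Python's `re.search`. Equivalent regex semantics in the hook engine are an assumption, and the return expressions are invented test inputs. Under that assumption, the alternatives only match names that end in `Settings`, start with `Defaults(`, `Default(`, or `default_`, are dict literals, or end in `_DEFAULT`; a name such as `HardcodedDefaults()` falls outside all of them.

```python
import re

# Pattern copied from the hook's frontmatter, split across lines for readability.
PATTERN = re.compile(
    r"except\s+\w*(?:Error|Exception).*?:\s*\n\s+.*?(?:logger\.|logging\.)"
    r".*?(?:warning|warn).*?\n\s+return\s+"
    r"(?:\w+Settings|Defaults?\(|default_|\{[^}]*\}|[A-Z_]+_DEFAULT)"
)


def handler_returning(expr: str) -> str:
    """Build a hypothetical handler that logs a warning and returns `expr`."""
    return (
        "except Exception as exc:\n"
        '    logger.warning("Settings load failed, using defaults")\n'
        f"    return {expr}\n"
    )


CANDIDATES = [
    "DefaultSettings()",
    "Defaults()",
    "default_settings",
    '{"ttl": 3600}',
    "TTL_DEFAULT",
    "HardcodedDefaults()",
]

matched = {
    expr: PATTERN.search(handler_returning(expr)) is not None
    for expr in CANDIDATES
}
```

A check like this is a cheap way to confirm that the examples quoted in a hook message are actually caught by its pattern before relying on it.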


@@ -0,0 +1,51 @@
---
name: block-silent-none-return
enabled: true
event: file
conditions:
  - field: new_text
    operator: regex_match
    pattern: except\s+\w*Error.*?:\s*\n\s+.*?(?:logger\.|logging\.).*?\n\s+return\s+(?:None|\[\]|False|\{\}|0)
action: block
---
**BLOCKED: Silent error swallowing with default return detected**
Catching an exception, logging it, and returning `None`/`[]`/`False`/`{}`/`0` hides failures from callers.
**Why this is problematic:**
- Callers cannot distinguish "no result" from "error occurred"
- Error context is lost - only appears in logs, not in call stack
- Leads to cascading failures with misleading error messages
- Makes debugging significantly harder
**What to do instead:**
1. Re-raise the exception (possibly wrapped in a domain error)
2. Return a `Result` type that explicitly contains error information
3. Use domain-specific exceptions from `src/noteflow/domain/errors.py`
**Examples from swallowed.md:**
- gRPC client methods catch `RpcError` and return `None`
- Auth workflows catch `KeyError`/`ValueError` and return `None`
- Token refresh catches `OAuthError` and returns `None`
**Resolution pattern:**
```python
# BAD: Silent swallowing
except RpcError as e:
    logger.error("Failed to create meeting: %s", e)
    return None  # Caller has no idea what failed
# GOOD: Let caller handle
except RpcError as e:
    raise MeetingCreationError(f"gRPC failed: {e}") from e
# GOOD: Result type
except RpcError as e:
    logger.error("Failed to create meeting: %s", e)
    return Result.failure(MeetingError.RPC_FAILURE, str(e))
```
**Check these centralized helpers:**
- `src/noteflow/domain/errors.py` - `DomainError` + `ErrorCode`
- `src/noteflow/grpc/_mixins/errors/` - gRPC error helpers
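**Minimal `Result` type (illustrative sketch):**
The `Result.failure(...)` call in the resolution pattern presumes a result container. One possible shape is sketched below; the field names, the `success` constructor, the `is_ok` property, and the `MeetingError` values are assumptions for illustration, not the project's actual definitions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class MeetingError(Enum):
    """Hypothetical error codes for this sketch."""

    RPC_FAILURE = "rpc_failure"


@dataclass(frozen=True)
class Result(Generic[T]):
    """Carries either a value or an error code plus message, never neither by accident."""

    value: Optional[T] = None
    error: Optional[MeetingError] = None
    message: str = ""

    @classmethod
    def success(cls, value: T) -> "Result[T]":
        return cls(value=value)

    @classmethod
    def failure(cls, error: MeetingError, message: str) -> "Result[T]":
        return cls(error=error, message=message)

    @property
    def is_ok(self) -> bool:
        return self.error is None


def create_meeting(fail: bool) -> Result[str]:
    """Stand-in for a gRPC client call that may fail."""
    if fail:
        return Result.failure(MeetingError.RPC_FAILURE, "deadline exceeded")
    return Result.success("meeting-123")
```

The caller can now tell "no result" apart from "error occurred" by checking `is_ok` and reading `error` and `message`, which a bare `None` cannot express.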

8
.mcp.json Normal file

@@ -0,0 +1,8 @@
{
"mcpServers": {
"lightrag-mcp": {
"type": "sse",
"url": "http://192.168.50.185:8150/sse"
}
}
}


@@ -0,0 +1,69 @@
# NoteFlow Architecture Overview
## Project Type
Intelligent meeting notetaker: local-first audio capture + navigable recall + evidence-linked summaries.
## Tech Stack
- **Python Backend**: gRPC server, domain logic, infrastructure adapters (src/noteflow/)
- **Tauri Desktop Client**: Rust IPC + React UI (client/)
- **Database**: PostgreSQL with pgvector extension, async SQLAlchemy + asyncpg
## Architecture Pattern
Hexagonal (Ports & Adapters):
- **Domain Layer** (`domain/`): Pure business logic, entities, value objects, ports (protocols)
- **Application Layer** (`application/`): Use-cases/services, orchestration
- **Infrastructure Layer** (`infrastructure/`): Implementations (repos, ASR, auth, persistence)
- **gRPC Layer** (`grpc/`): API boundary, server mixins, proto definitions
## Key Entry Points
| Entry Point | Description |
|-------------|-------------|
| `python -m noteflow.grpc.server` | Backend server |
| `cd client && npm run dev` | Web UI (Vite) |
| `cd client && npm run tauri dev` | Desktop Tauri dev |
## Directory Structure
```
src/noteflow/
├── domain/ # Entities, ports, value objects
├── application/ # Use-cases/services
├── infrastructure/ # Implementations
├── grpc/ # gRPC layer
├── config/ # Settings
└── cli/ # CLI tools
client/src/
├── api/ # API adapters & types
├── hooks/ # Custom React hooks
├── contexts/ # React contexts
├── components/ # UI components
├── pages/ # Route pages
└── lib/ # Utilities
client/src-tauri/src/
├── commands/ # Tauri IPC handlers
├── grpc/ # gRPC client & types
├── state/ # Runtime state
├── audio/ # Audio capture/playback
└── crypto/ # Encryption
```
## Proto/gRPC Contract
Proto file: `src/noteflow/grpc/proto/noteflow.proto`
Regenerate after changes:
```bash
python -m grpc_tools.protoc -I src/noteflow/grpc/proto \
--python_out=src/noteflow/grpc/proto \
--grpc_python_out=src/noteflow/grpc/proto \
src/noteflow/grpc/proto/noteflow.proto
python scripts/patch_grpc_stubs.py
```
## Quality Commands
```bash
make quality # ALL checks (TS + Rust + Python)
make quality-py # Python: lint + type-check + test-quality
make quality-ts # TypeScript: type-check + lint
make quality-rs # Rust: clippy + lint
pytest tests/quality/ # After any non-trivial changes
```


@@ -0,0 +1,67 @@
# NoteFlow Domain Entities
## Location
`src/noteflow/domain/entities/` and `domain/value_objects.py`
## Core Entities
### Meeting (`entities/meeting.py`)
Aggregate root with lifecycle states.
- **Classes**: `MeetingLoadParams`, `MeetingCreateParams`, `Meeting`
- **States**: CREATED → RECORDING → STOPPED → COMPLETED (or ERROR, STOPPING)
- **Key Fields**: id (UUID), title, state, project_id, created_at, started_at, stopped_at
### Segment (`entities/segment.py`)
Transcript fragment with timing and speaker.
- **Key Fields**: segment_id, text, start_time, end_time, speaker_id, language, confidence
- **Related**: `WordTiming` for word-level boundaries
### Summary (`entities/summary.py`)
Generated summary with key points and action items.
- **Key Fields**: id, meeting_id, template_id, content, format, provider
- **Contains**: `KeyPoint[]`, `ActionItem[]`
### Annotation (`entities/annotation.py`)
User annotation linked to segments.
- **Types**: ACTION_ITEM, DECISION, NOTE, RISK
- **Key Fields**: id, meeting_id, type, text, segment_ids, priority
### NamedEntity (`entities/named_entity.py`)
NER extraction results.
- **Categories**: PERSON, COMPANY, PRODUCT, TECHNICAL, ACRONYM, LOCATION, DATE, OTHER
- **Key Fields**: id, name, category, segment_ids, meeting_id
### Project (`entities/project.py`)
Workspace grouping for meetings.
- **Contains**: `ExportRules`, `TriggerRules`
- **Key Fields**: id, workspace_id, name, slug, members
### Integration (`entities/integration.py`)
External service connections.
- **Types**: AUTH, EMAIL, CALENDAR, PKM, CUSTOM
- **Statuses**: DISCONNECTED, CONNECTED, ERROR
### SummarizationTemplate (`entities/summarization_template.py`)
Configurable summary generation template.
- **Key Fields**: id, name, tone, format, verbosity
## Value Objects (`value_objects.py`)
### Type-Safe IDs
- `MeetingId = NewType("MeetingId", UUID)`
- `AnnotationId = NewType("AnnotationId", UUID)`
### Enums
- `MeetingState` (IntEnum): UNSPECIFIED=0, CREATED=1, RECORDING=2, STOPPED=3, COMPLETED=4, ERROR=5, STOPPING=6
- `AnnotationType` (Enum): ACTION_ITEM, DECISION, NOTE, RISK
- `ExportFormat` (Enum): MARKDOWN, HTML, PDF
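A minimal sketch of how these value objects could be declared. The `MeetingState` numbers are taken from the list above; the `ExportFormat` member values and the `new_meeting_id` helper are assumptions added for illustration.

```python
from enum import Enum, IntEnum
from typing import NewType
from uuid import UUID, uuid4

# Distinct names for the type checker; at runtime both are plain UUIDs.
MeetingId = NewType("MeetingId", UUID)
AnnotationId = NewType("AnnotationId", UUID)


class MeetingState(IntEnum):
    UNSPECIFIED = 0
    CREATED = 1
    RECORDING = 2
    STOPPED = 3
    COMPLETED = 4
    ERROR = 5
    STOPPING = 6


class ExportFormat(Enum):
    # Member values are assumptions; the overview lists only the names.
    MARKDOWN = "markdown"
    HTML = "html"
    PDF = "pdf"


def new_meeting_id() -> MeetingId:
    """Hypothetical helper: wrap a fresh UUID in the MeetingId type."""
    return MeetingId(uuid4())
```

`NewType` adds no runtime wrapper, so the benefit is purely static: a type checker rejects an `AnnotationId` passed where a `MeetingId` is expected.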
## Error Hierarchy (`errors.py`)
- Base: `DomainError` with `ErrorCode` enum mapping to gRPC `StatusCode`
- 30+ specific error types (MEETING_NOT_FOUND, WORKSPACE_ACCESS_DENIED, etc.)
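One way such a hierarchy can be laid out is sketched below. The `StatusCode` enum is a dependency-free stand-in for `grpc.StatusCode`, and the code strings, the specific mappings, and the `grpc_status` property are assumptions; only the names `DomainError`, `ErrorCode`, `MEETING_NOT_FOUND`, and `WORKSPACE_ACCESS_DENIED` come from the description above.

```python
from enum import Enum


class StatusCode(Enum):
    """Stand-in for grpc.StatusCode so the sketch has no dependencies."""

    NOT_FOUND = "NOT_FOUND"
    PERMISSION_DENIED = "PERMISSION_DENIED"


class ErrorCode(Enum):
    # Each member pairs a code string with the status it maps to (assumed mapping).
    MEETING_NOT_FOUND = ("meeting_not_found", StatusCode.NOT_FOUND)
    WORKSPACE_ACCESS_DENIED = ("workspace_access_denied", StatusCode.PERMISSION_DENIED)

    def __init__(self, code: str, status: StatusCode) -> None:
        self.code = code
        self.status = status


class DomainError(Exception):
    """Base error carrying a code that the API boundary can translate."""

    def __init__(self, error_code: ErrorCode, message: str) -> None:
        super().__init__(message)
        self.error_code = error_code

    @property
    def grpc_status(self) -> StatusCode:
        return self.error_code.status
```

Keeping the mapping on the enum means the gRPC layer can translate any `DomainError` without a per-exception lookup table.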
## Identity Entities (`domain/identity/`)
- `User`: User identity with email, name, picture
- `Workspace`: Tenant container
- `WorkspaceMembership`: User-workspace relationship with role
- `ProjectRole`: Role definitions (owner, editor, viewer)

133
.rag/03-domain-ports.md Normal file

@@ -0,0 +1,133 @@
# NoteFlow Domain Ports (Protocols)
## Location
`src/noteflow/domain/ports/`
## Repository Protocols (`repositories/`)
### Core Repositories (`transcript.py`)
| Protocol | Key Methods |
|----------|-------------|
| `MeetingRepository` | `add()`, `get()`, `list()`, `update()`, `delete()`, `count_by_state()`, `find_older_than()` |
| `SegmentRepository` | `add()`, `add_batch()`, `get()`, `list_by_meeting()`, `update()`, `delete()` |
| `SummaryRepository` | `add()`, `get()`, `list_by_meeting()`, `mark_verified()` |
| `AnnotationRepository` | `add()`, `get()`, `list_by_meeting()`, `update()`, `delete()` |
### External Repositories
| Protocol | Location | Purpose |
|----------|----------|---------|
| `AssetRepository` | `asset.py` | Store/retrieve meeting audio files |
| `DiarizationJobRepository` | `background.py` | Track background diarization jobs |
| `EntityRepository` | `external/_entity.py` | Persist NER entities |
| `IntegrationRepository` | `external/_integration.py` | Store OAuth integrations |
| `WebhookRepository` | `external/_webhook.py` | Webhook configs and delivery logs |
| `UsageEventRepository` | `external/_usage.py` | Track usage metrics |
### Identity Repositories (`identity/`)
| Protocol | Purpose |
|----------|---------|
| `UserRepository` | User identity and authentication |
| `WorkspaceRepository` | Workspace tenancy |
| `ProjectRepository` | Project CRUD and member access |
| `ProjectMembershipRepository` | Project role-based access |
| `SummarizationTemplateRepository` | Template CRUD and versioning |
## Engine/Provider Protocols
### DiarizationEngine (`diarization.py`)
Speaker identification (streaming: diart, offline: pyannote)
- `assign_speakers(audio: ndarray, segments: list[Segment]) -> list[SpeakerAssignment]`
- `is_ready() -> bool`
### NerPort (`ner.py`)
Named entity recognition with spaCy
- `extract(text: str) -> list[NamedEntity]`
- `extract_from_segments(segments: list[Segment]) -> list[NamedEntity]`
- `is_ready() -> bool`
### OAuthPort (`calendar.py`)
OAuth PKCE flow
- `initiate_auth(provider: str) -> AuthUrl`
- `complete_auth(code: str, state: str) -> TokenResponse`
### CalendarProvider (`calendar.py`)
Calendar event fetching
- `list_events(start: datetime, end: datetime) -> list[CalendarEventInfo]`
### SummarizerProvider (`summarization/ports.py`)
LLM summarization
- `request(context: SummarizationRequest) -> SummarizationResult`
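A port of this kind can be expressed with `typing.Protocol`, as in the sketch below. It assumes the ports are structural protocols, uses a simplified `NamedEntity` stand-in, and introduces a hypothetical `KeywordNer` adapter and `count_entities` caller purely for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class NamedEntity:
    """Simplified stand-in for the domain entity."""

    name: str
    category: str


@runtime_checkable
class NerPort(Protocol):
    def extract(self, text: str) -> list[NamedEntity]: ...

    def is_ready(self) -> bool: ...


class KeywordNer:
    """Hypothetical adapter: satisfies NerPort structurally, without inheriting from it."""

    def __init__(self, companies: set[str]) -> None:
        self._companies = companies

    def extract(self, text: str) -> list[NamedEntity]:
        words = [w.strip(".,") for w in text.split()]
        return [NamedEntity(w, "COMPANY") for w in words if w in self._companies]

    def is_ready(self) -> bool:
        return True


def count_entities(ner: NerPort, text: str) -> int:
    """Application code depends only on the port, not on a concrete engine."""
    if not ner.is_ready():
        raise RuntimeError("NER engine not ready")
    return len(ner.extract(text))
```

Because the check is structural, a test double or a spaCy-backed adapter can be swapped in without touching callers.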
## Unit of Work Pattern (`unit_of_work.py`)
### Hierarchical Protocol Structure
```python
UnitOfWorkCapabilities
    supports_annotations: bool
    supports_diarization_jobs: bool
    supports_preferences: bool
    supports_entities: bool
    supports_integrations: bool
    supports_webhooks: bool
    supports_usage_events: bool
    supports_users: bool
    supports_workspaces: bool
    supports_projects: bool
UnitOfWorkCoreRepositories
    meetings: MeetingRepository
    segments: SegmentRepository
    summaries: SummaryRepository
    assets: AssetRepository
UnitOfWorkOptionalRepositories
    annotations: AnnotationRepository | None
    diarization_jobs: DiarizationJobRepository | None
    preferences: PreferencesRepository | None
    entities: EntityRepository | None
    integrations: IntegrationRepository | None
    webhooks: WebhookRepository | None
    usage_events: UsageEventRepository | None
UnitOfWorkLifecycle
    __aenter__() / __aexit__()
    commit() async
    rollback() async
```
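The lifecycle part of the protocol can be illustrated with a small in-memory implementation. `InMemoryUnitOfWork`, its `add` method, and the rollback-on-exit behaviour are assumptions for this sketch; the real unit of work coordinates database sessions and repositories.

```python
import asyncio


class InMemoryUnitOfWork:
    """Hypothetical in-memory unit of work showing the lifecycle only."""

    def __init__(self) -> None:
        self.committed: list[str] = []
        self._pending: list[str] = []

    async def __aenter__(self) -> "InMemoryUnitOfWork":
        self._pending = []
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Discard anything not committed, whether or not an error occurred.
        await self.rollback()

    def add(self, item: str) -> None:
        self._pending.append(item)

    async def commit(self) -> None:
        self.committed.extend(self._pending)
        self._pending = []

    async def rollback(self) -> None:
        self._pending = []


async def demo() -> list[str]:
    uow = InMemoryUnitOfWork()
    async with uow:
        uow.add("meeting-1")
        await uow.commit()
    try:
        async with uow:
            uow.add("meeting-2")
            raise RuntimeError("boom")  # Never reaches commit()
    except RuntimeError:
        pass
    return uow.committed
```

Only work followed by an explicit `commit()` survives; the second block's pending item is dropped when the exception leaves the `async with`.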
## Key Data Classes
### SummarizationRequest
```python
@dataclass
class SummarizationRequest:
    meeting_id: MeetingId
    segments: list[Segment]
    template: SummarizationTemplate | None
    context: dict[str, str]
```
### SummarizationResult
```python
@dataclass
class SummarizationResult:
    summary: str
    key_points: list[KeyPoint]
    action_items: list[ActionItem]
    model_name: str
    provider_name: str
    tokens_used: int
```
### CalendarEventInfo
```python
@dataclass
class CalendarEventInfo:
    id: str
    title: str
    start_time: datetime
    end_time: datetime
    attendees: list[str]
    location: str | None
    description: str | None
```


@@ -0,0 +1,142 @@
# NoteFlow Application Services
## Location
`src/noteflow/application/services/`
## Core Services
### MeetingService (`meeting/`)
Meeting lifecycle, segments, annotations, summaries, state.
**Files**:
- `meeting_service.py` — Composite class combining all mixins
- `_base.py` — Core initialization and dependencies
- `_crud_mixin.py` — Create, read, update, delete operations
- `_segments_mixin.py` — Segment management
- `_summaries_mixin.py` — Summary operations
- `_state_mixin.py` — State machine transitions
- `_annotations_mixin.py` — Annotation CRUD
- `_types.py` — Service-specific TypedDicts
**Key Methods**:
- `create_meeting(title, project_id) -> Meeting`
- `get_meeting(meeting_id) -> Meeting`
- `list_meetings(filters) -> list[Meeting]`
- `stop_meeting(meeting_id) -> Meeting`
- `add_segment(meeting_id, segment_data) -> Segment`
- `add_annotation(meeting_id, annotation_data) -> Annotation`
- `generate_summary(meeting_id, template_id) -> Summary`
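The "composite class combining all mixins" layout can be sketched as follows. The class names, the dict-based storage, and the simplified method signatures are assumptions for illustration; the real service works against repositories and a unit of work.

```python
class _MeetingServiceBase:
    """Hypothetical base holding shared state (a dict instead of a repository)."""

    def __init__(self) -> None:
        self._meetings: dict[str, dict] = {}


class _CrudMixin(_MeetingServiceBase):
    def create_meeting(self, meeting_id: str, title: str) -> dict:
        meeting = {"id": meeting_id, "title": title, "state": "CREATED"}
        self._meetings[meeting_id] = meeting
        return meeting

    def get_meeting(self, meeting_id: str) -> dict:
        return self._meetings[meeting_id]


class _StateMixin(_MeetingServiceBase):
    def stop_meeting(self, meeting_id: str) -> dict:
        meeting = self._meetings[meeting_id]
        meeting["state"] = "STOPPED"
        return meeting


class MeetingService(_CrudMixin, _StateMixin):
    """Composite class: behaviour comes entirely from the mixins."""
```

Each mixin stays small and testable on its own, while callers see a single `MeetingService` with the combined method set.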
### IdentityService (`identity/`)
User/workspace context, defaults, tenancy scoping.
**Files**:
- `identity_service.py` — Main service
- `_context_mixin.py` — Request context handling
- `_workspace_mixin.py` — Workspace operations
- `_defaults_mixin.py` — Default user/workspace creation
**Key Methods**:
- `get_current_user() -> User`
- `get_current_workspace() -> Workspace`
- `switch_workspace(workspace_id) -> Workspace`
- `ensure_defaults() -> tuple[User, Workspace]`
### CalendarService (`calendar/`)
OAuth integration, event fetching, sync management.
**Files**:
- `calendar_service.py` — Main service
- `_connection_mixin.py` — OAuth connection handling
- `_events_mixin.py` — Event fetching
- `_oauth_mixin.py` — OAuth flow management
- `_service_mixin.py` — Provider configuration
- `_errors.py` — Calendar-specific errors
**Key Methods**:
- `initiate_oauth(provider) -> AuthUrl`
- `complete_oauth(code, state) -> Integration`
- `list_events(start, end) -> list[CalendarEvent]`
- `get_connection_status() -> ConnectionStatus`
### SummarizationService (`summarization/`)
Summary generation, template management, cloud consent.
**Files**:
- `summarization_service.py` — Main service
- `template_service.py` — Template CRUD
- `consent_manager.py` — Cloud consent flow
**Key Methods**:
- `generate_summary(meeting_id, template_id) -> Summary`
- `create_template(data) -> SummarizationTemplate`
- `list_templates() -> list[SummarizationTemplate]`
- `grant_consent() -> bool`
- `revoke_consent() -> bool`
- `has_consent() -> bool`
### ProjectService (`project_service/`)
Project CRUD, member management, roles, rules.
**Files**:
- `__init__.py` — Main service export
- `crud.py` — Project CRUD operations
- `members.py` — Member management
- `roles.py` — Role-based access
- `rules.py` — Project rules configuration
- `active.py` — Active project tracking
- `_types.py` — Service types
**Key Methods**:
- `create_project(name, workspace_id) -> Project`
- `add_member(project_id, user_id, role) -> ProjectMembership`
- `update_member_role(project_id, user_id, role) -> ProjectMembership`
- `remove_member(project_id, user_id) -> bool`
- `set_active_project(project_id) -> Project`
## Supporting Services
### NerService (`ner_service.py`)
Named entity extraction wrapper, model loading.
- `extract_entities(meeting_id) -> list[NamedEntity]`
- `is_ready() -> bool`
### ExportService (`export_service.py`)
Transcript export (Markdown, HTML, PDF).
- `export(meeting_id, format) -> ExportResult`
### WebhookService (`webhook_service.py`)
Webhook registration, delivery, retry logic.
- `register_webhook(config) -> WebhookConfig`
- `deliver_event(event) -> WebhookDelivery`
### AuthService (`auth_service.py`)
User authentication, OIDC integration.
- `initiate_login(provider) -> AuthUrl`
- `complete_login(code, state) -> User`
- `logout() -> bool`
### TriggerService (`trigger_service.py`)
Calendar/audio/foreground-app trigger detection.
- `check_triggers() -> list[TriggerSignal]`
- `snooze_triggers(duration) -> bool`
### RetentionService (`retention_service.py`)
Automatic meeting deletion based on policy.
- `apply_retention_policy() -> int`
### RecoveryService (`recovery/`)
Data recovery (meeting, job, audio).
- `recover_meeting(meeting_id) -> Meeting`
- `recover_job(job_id) -> DiarizationJob`
### AsrConfigService (`asr_config_service.py`)
ASR model configuration and state.
- `get_config() -> AsrConfig`
- `update_config(config) -> AsrConfig`
### HfTokenService (`hf_token_service.py`)
Hugging Face token management.
- `set_token(token) -> bool`
- `get_status() -> HfTokenStatus`
- `validate_token() -> bool`


@@ -0,0 +1,153 @@
# NoteFlow Infrastructure Adapters
## Location
`src/noteflow/infrastructure/`
## Audio & ASR Layer (`asr/`, `audio/`)
### ASR Engine (`asr/engine.py`)
Faster-whisper wrapper for streaming ASR.
- `transcribe(audio: ndarray) -> TranscriptionResult`
- `is_ready() -> bool`
### ASR Streaming VAD (`asr/streaming_vad.py`)
Voice activity detection for streaming.
- `process_audio(chunk: ndarray) -> list[VadSegment]`
### ASR Segmenter (`asr/segmenter/`)
Audio segmentation into speech chunks.
### Audio Capture (`audio/`)
Sounddevice capture, ring buffer, VU levels, playback, writer, reader.
## Diarization Layer (`diarization/`)
### Session Management (`session.py`)
Speaker session lifecycle with audio buffering.
- `process_audio(chunk) -> list[SpeakerTurn]`
### Diarization Engine (`engine/`)
- **Streaming**: diart for real-time speaker detection
- **Offline**: pyannote.audio for post-meeting refinement
### Speaker Assigner (`assigner.py`)
Assign speech segments to speaker IDs.
## Summarization Layer (`summarization/`)
### Cloud Provider (`cloud_provider/`)
Anthropic/OpenAI API integration.
- `generate(request) -> SummarizationResult`
### Ollama Provider (`ollama_provider.py`)
Local Ollama LLM.
- `generate(request) -> SummarizationResult`
### Mock Provider (`mock_provider.py`)
Testing provider.
### Citation Verifier (`citation_verifier.py`)
Validate summary citations against transcript.
- `verify(summary, segments) -> CitationVerificationResult`
### Template Renderer (`template_renderer.py`)
Render summary templates.
## NER Engine (`ner/`)
spaCy-based named entity extraction.
- `extract(text) -> list[Entity]`
- `extract_from_segments(segments) -> list[NamedEntity]`
## Persistence Layer (`persistence/`)
### Database (`database.py`)
Async SQLAlchemy engine, session factory, pgvector support.
- `create_async_engine(url) -> AsyncEngine`
- `create_async_session_factory(engine) -> async_sessionmaker`
### ORM Models (`models/`)
```
core/ # MeetingModel, SegmentModel, SummaryModel, AnnotationModel
identity/ # UserModel, WorkspaceModel, ProjectModel, MembershipModel
entities/ # NamedEntityModel, SpeakerModel
integrations/ # IntegrationModel, CalendarEventModel, WebhookConfigModel
organization/ # SummarizationTemplateModel, TaskModel, TagModel
observability/ # UsageEventModel
```
### Base Repository (`repositories/_base/`)
```python
class BaseRepository:
    async def _execute_scalar(query) -> T | None
    async def _execute_scalars(query) -> list[T]
    async def _add_and_flush(entity) -> T
```
### Unit of Work (`unit_of_work/`)
Transaction management, repository coordination.
## Auth Layer (`auth/`)
### OIDC Discovery (`oidc_discovery.py`)
Discover provider endpoints.
- `discover(issuer_url) -> OidcConfig`
### OIDC Registry (`oidc_registry.py`)
Manage configured OIDC providers.
- `register(provider) -> OidcProvider`
- `list_providers() -> list[OidcProvider]`
### OIDC Presets (`_presets.py`)
Pre-configured providers (Google, Outlook, etc.).
## Export Layer (`export/`)
### Markdown Export (`markdown.py`)
Convert transcript to Markdown.
### HTML Export (`html.py`)
Convert transcript to HTML.
### PDF Export (`pdf.py`)
Convert transcript to PDF (WeasyPrint).
## Converters (`converters/`)
Bidirectional conversion between layers:
- ORM ↔ Domain entities
- ASR engine output ↔ Domain entities
- Calendar API ↔ Domain entities
- Webhook payloads ↔ Domain entities
- NER output ↔ Domain entities
## Calendar Integration (`calendar/`)
Google/Outlook OAuth adapters with event API integration.
## Security (`security/`)
### Keystore (`keystore.py`)
AES-GCM encryption with keyring backend.
- `encrypt(data) -> bytes`
- `decrypt(data) -> bytes`
### Crypto Utilities (`crypto/`)
Cryptographic helpers.
## Logging & Observability
### Log Buffer (`logging/log_buffer.py`)
In-memory log buffering for client retrieval.
### OpenTelemetry (`observability/otel.py`)
Distributed tracing.
### Usage Events (`observability/`)
Track usage metrics.
### Metrics (`metrics/`)
Metric collection utilities.
## Webhooks (`webhooks/`)
### WebhookExecutor
Delivery, signing (HMAC-SHA256), retry with exponential backoff.
- `deliver(config, payload) -> WebhookDelivery`
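The signing and retry scheme named above can be sketched with the standard library. The function names, the hex encoding of the digest, and the `base` and `cap` defaults are assumptions; the executor's actual header format and retry limits are not stated here.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, payload: bytes, signature: str) -> bool:
    """Compare in constant time so timing does not leak digest prefixes."""
    return hmac.compare_digest(sign_payload(secret, payload), signature)


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delay before each retry: base * 2**attempt, limited to cap seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]
```

A receiver recomputes the digest over the raw request body with the shared secret and rejects the delivery when `verify_signature` returns `False`.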

140
.rag/06-grpc-layer.md Normal file

@@ -0,0 +1,140 @@
# NoteFlow gRPC Layer
## Location
`src/noteflow/grpc/`
## Core Files
### Service (`service.py`)
Main `NoteFlowServicer` - gRPC service implementation composing all mixins.
### Client (`client.py`)
Python gRPC client wrapper for testing and internal use.
### Proto (`proto/noteflow.proto`)
Service definition with bidirectional streaming and RPC methods.
**Key RPC Groups**:
- **Streaming**: `StreamTranscription(AudioChunk) → TranscriptUpdate` (bidirectional)
- **Meeting Lifecycle**: CreateMeeting, StopMeeting, ListMeetings, GetMeeting, DeleteMeeting
- **Summaries**: GenerateSummary, ListSummarizationTemplates, CreateSummarizationTemplate
- **Diarization**: RefineSpeakerDiarization, RenameSpeaker, GetDiarizationJobStatus
- **Annotations**: AddAnnotation, GetAnnotation, ListAnnotations, UpdateAnnotation, DeleteAnnotation
- **Export**: ExportTranscript (Markdown, HTML, PDF)
- **Calendar**: Calendar event sync and OAuth
- **Webhooks**: Webhook config and delivery management
- **OIDC**: Authentication via OpenID Connect
## Server Mixins (`_mixins/`)
### Streaming Mixin (`streaming/`)
Bidirectional ASR streaming.
**Files**:
- `_mixin.py` — Main StreamingMixin
- `_session.py` — Stream session lifecycle
- `_asr.py` — ASR engine integration
- `_processing/` — Audio processing pipeline (VAD, chunk tracking, congestion)
- `_partials.py` — Partial transcript handling
- `_cleanup.py` — Resource cleanup
### Diarization Mixin (`diarization/`)
Speaker diarization (streaming + offline refinement).
**Files**:
- `_mixin.py` — Main DiarizationMixin
- `_jobs.py` — Background job management
- `_streaming.py` — Real-time diarization
- `_refinement.py` — Offline refinement with pyannote
- `_speaker.py` — Speaker assignment
- `_status.py` — Job status tracking
### Summarization Mixin (`summarization/`)
Summary generation and templates.
**Files**:
- `_generation_mixin.py` — Summary generation flow
- `_templates_mixin.py` — Template CRUD
- `_consent_mixin.py` — Cloud consent handling
- `_summary_generation.py` — Core generation logic
- `_template_resolution.py` — Template lookup
- `_context_builders.py` — Context preparation
### Meeting Mixin (`meeting/`)
Meeting lifecycle management.
**Files**:
- `meeting_mixin.py` — Meeting state management
- `_project_scope.py` — Project scoping
- `_stop_ops.py` — Stop operations
### Other Mixins
- `project/` — Project management
- `oidc/` — OpenID Connect auth
- `identity/` — User/workspace identity
- `annotation.py` — Segment annotations CRUD
- `export.py` — Export operations
- `entities.py` — Named entity extraction
- `calendar.py` — Calendar sync
- `webhooks.py` — Webhook management
- `preferences.py` — User preferences
- `observability.py` — Usage tracking, metrics
- `sync.py` — State synchronization
### Error Helpers (`errors/`)
- `_abort.py` — `abort_not_found()`, `abort_invalid_argument()`
- `_require.py` — Precondition checks
- `_fetch.py` — Fetch with error handling
- `_parse.py` — Parsing helpers
### Converters (`converters/`)
Proto ↔ Domain conversion.
- `_domain.py` — Domain entity conversion
- `_timestamps.py` — Timestamp conversion
- `_id_parsing.py` — ID parsing and validation
- `_external.py` — External entity conversion
- `_oidc.py` — OIDC entity conversion
## Server Bootstrap
### Files
- `_server_bootstrap.py` — gRPC server creation
- `_startup.py` — Server startup sequence
- `_startup_services.py` — Service initialization
- `_startup_banner.py` — Startup logging
- `_service_shutdown.py` — Graceful shutdown
- `_service_stubs.py` — gRPC stub management
### Interceptors (`interceptors/`)
gRPC interceptors for identity context propagation.
## Client Mixins (`_client_mixins/`)
Client-side gRPC operations.
- `streaming.py` — Client streaming operations
- `meeting.py` — Meeting CRUD operations
- `diarization.py` — Diarization requests
- `export.py` — Export requests
- `annotation.py` — Annotation operations
- `converters.py` — Response converters
- `protocols.py` — ClientHost protocol
## Critical Paths
### Recording Flow
1. Client: `StreamTranscription(AudioChunk)` → gRPC streaming
2. Server: StreamingMixin consumes chunks
3. ASR Engine: Transcribe via faster-whisper
4. VAD: Segment by silence
5. Diarization: Assign speakers (streaming)
6. Repository: Persist Segments
7. Client: Receive `TranscriptUpdate` with segments
### Summary Generation Flow
1. Client: `GenerateSummary(meeting_id)` → gRPC call
2. SummarizationService: Fetch segments
3. SummarizerProvider: Call LLM
4. Citation Verifier: Validate claims
5. Template Renderer: Apply template
6. Repository: Persist Summary
7. Client: Receive Summary
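The summary generation steps can be traced with stubs, as in the sketch below. Every function and class here is a hypothetical stand-in: the "LLM" just echoes the transcript, the citation check simply drops references to unknown segments, and plain dicts replace the repositories.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    segment_id: int
    text: str


@dataclass
class Summary:
    meeting_id: str
    content: str
    cited_segment_ids: list[int] = field(default_factory=list)


def call_llm(meeting_id: str, segments: list[Segment]) -> Summary:
    """Stub provider: echoes the transcript and cites one bogus segment (999)."""
    content = " ".join(s.text for s in segments)
    cited = [s.segment_id for s in segments] + [999]
    return Summary(meeting_id, content, cited)


def verify_citations(summary: Summary, segments: list[Segment]) -> Summary:
    """Keep only citations that point at segments present in the transcript."""
    known = {s.segment_id for s in segments}
    summary.cited_segment_ids = [i for i in summary.cited_segment_ids if i in known]
    return summary


def render(summary: Summary) -> Summary:
    """Stub template step: prefix the content."""
    summary.content = f"Summary: {summary.content}"
    return summary


def generate_summary(meeting_id: str, transcript_store: dict, summary_store: dict) -> Summary:
    segments = transcript_store[meeting_id]         # step 2: fetch segments
    summary = call_llm(meeting_id, segments)        # step 3: call the provider
    summary = verify_citations(summary, segments)   # step 4: validate citations
    summary = render(summary)                       # step 5: apply template
    summary_store[meeting_id] = summary             # step 6: persist
    return summary                                  # step 7: return to client
```

Running verification before rendering means the stored summary never references a segment the transcript does not contain.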


@@ -0,0 +1,172 @@
# NoteFlow TypeScript API Layer
## Location
`client/src/api/`
## Architecture
Multi-adapter design with fallback chain:
1. **TauriAdapter** (`tauri-adapter.ts`) — Primary: Rust IPC to gRPC
2. **CachedAdapter** (`cached-adapter.ts`) — Fallback: Read-only cache
3. **MockAdapter** (`mock-adapter.ts`) — Development: Simulated responses
## API Interface (`interface.ts`)
```typescript
interface NoteFlowAPI {
  // Connection
  connect(): Promise<ConnectionResult>;
  disconnect(): Promise<void>;
  isConnected(): boolean;
  getEffectiveServerUrl(): Promise<string>;
  // Meetings
  createMeeting(request: CreateMeetingRequest): Promise<Meeting>;
  getMeeting(request: GetMeetingRequest): Promise<Meeting>;
  listMeetings(request: ListMeetingsRequest): Promise<ListMeetingsResponse>;
  stopMeeting(request: StopMeetingRequest): Promise<Meeting>;
  deleteMeeting(request: DeleteMeetingRequest): Promise<void>;
  // Streaming
  startTranscription(meetingId: string): TranscriptionStream;
  // Diarization
  refineSpeakerDiarization(request: RefineDiarizationRequest): Promise<DiarizationJob>;
  getDiarizationJobStatus(request: GetJobStatusRequest): Promise<DiarizationJobStatus>;
  renameSpeaker(request: RenameSpeakerRequest): Promise<void>;
  // Summaries
  generateSummary(request: GenerateSummaryRequest): Promise<Summary>;
  listSummarizationTemplates(): Promise<SummarizationTemplate[]>;
  createSummarizationTemplate(request: CreateTemplateRequest): Promise<SummarizationTemplate>;
  // Annotations
  addAnnotation(request: AddAnnotationRequest): Promise<Annotation>;
  listAnnotations(request: ListAnnotationsRequest): Promise<Annotation[]>;
  updateAnnotation(request: UpdateAnnotationRequest): Promise<Annotation>;
  deleteAnnotation(request: DeleteAnnotationRequest): Promise<void>;
  // Export
  exportTranscript(request: ExportRequest): Promise<ExportResult>;
  // ... 50+ more methods
}
```
## Transcription Streaming
```typescript
interface TranscriptionStream {
send(chunk: AudioChunk): void;
onUpdate(callback: (update: TranscriptUpdate) => void): Promise<void> | void;
onError?(callback: (error: StreamError) => void): void;
onCongestion?(callback: (state: CongestionState) => void): void;
close(): void;
}
```
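For tests or mock mode, a stream with this shape could be backed by an in-memory stand-in. The sketch below is an assumption-heavy illustration: `MinimalStream`, `createFakeStream`, and the `text` field on the update are hypothetical, and only a subset of the interface is modeled.

```typescript
// Assumed shapes; the real AudioChunk / TranscriptUpdate types live in the API layer.
interface AudioChunk { audio_data: Float32Array; }
interface TranscriptUpdate { update_type: 'partial' | 'final'; text: string; }

interface MinimalStream {
  send(chunk: AudioChunk): void;
  onUpdate(callback: (update: TranscriptUpdate) => void): void;
  close(): void;
}

// In-memory stand-in: emits one "final" update per chunk sent.
function createFakeStream(): MinimalStream {
  let listener: ((update: TranscriptUpdate) => void) | null = null;
  let closed = false;
  return {
    send(chunk) {
      if (closed) throw new Error('stream closed');
      listener?.({ update_type: 'final', text: `${chunk.audio_data.length} samples` });
    },
    onUpdate(callback) {
      listener = callback;
    },
    close() {
      closed = true;
      listener = null;
    },
  };
}

const received: string[] = [];
const fake = createFakeStream();
fake.onUpdate((update) => received.push(update.text));
fake.send({ audio_data: new Float32Array(160) });
fake.close();
```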
## Connection State (`connection-state.ts`)
```typescript
type ConnectionMode = 'connected' | 'disconnected' | 'cached' | 'mock' | 'reconnecting';
interface ConnectionState {
mode: ConnectionMode;
lastConnectedAt: Date | null;
disconnectedAt: Date | null;
reconnectAttempts: number;
error: string | null;
serverUrl: string | null;
}
```
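One way to keep these fields consistent is to route every change through a pure reducer. The sketch below assumes a hypothetical `ConnectionEvent` set and `reduceConnection` function; the real transitions in `connection-state.ts` may differ.

```typescript
type ConnectionMode = 'connected' | 'disconnected' | 'cached' | 'mock' | 'reconnecting';

interface ConnectionState {
  mode: ConnectionMode;
  lastConnectedAt: Date | null;
  disconnectedAt: Date | null;
  reconnectAttempts: number;
  error: string | null;
  serverUrl: string | null;
}

// Hypothetical event set, introduced only for this illustration.
type ConnectionEvent =
  | { type: 'connected'; serverUrl: string; at: Date }
  | { type: 'lost'; error: string; at: Date }
  | { type: 'retry' };

function reduceConnection(state: ConnectionState, event: ConnectionEvent): ConnectionState {
  switch (event.type) {
    case 'connected':
      // A successful connection clears the error and the retry counter.
      return {
        mode: 'connected',
        lastConnectedAt: event.at,
        disconnectedAt: null,
        reconnectAttempts: 0,
        error: null,
        serverUrl: event.serverUrl,
      };
    case 'lost':
      return { ...state, mode: 'disconnected', disconnectedAt: event.at, error: event.error };
    case 'retry':
      return { ...state, mode: 'reconnecting', reconnectAttempts: state.reconnectAttempts + 1 };
  }
}
```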
## Type Definitions (`types/`)
### Core Types (`core.ts`)
- `Meeting`, `FinalSegment`, `WordTiming`, `Summary`, `Annotation`
- `KeyPoint`, `ActionItem`, `Speaker`
### Enums (`enums.ts`)
- `UpdateType`: partial | final | vad_start | vad_end
- `MeetingState`: created | recording | stopped | completed | error
- `JobStatus`: queued | running | completed | failed | cancelled
- `AnnotationType`: action_item | decision | note | risk
- `ExportFormat`: markdown | html | pdf
### Feature Types (`features/`)
- `webhooks.ts` — WebhookConfig, WebhookDelivery
- `calendar.ts` — CalendarProvider, CalendarEvent, OAuthConfig
- `ner.ts` — Entity extraction types
- `identity.ts` — User, Workspace
- `oidc.ts` — OIDCProvider, OIDCConfig
- `sync.ts` — SyncStatus, SyncHistory
- `observability.ts` — LogEntry, MetricPoint
### Projects (`projects.ts`)
- `Project`, `ProjectMember`, `ProjectMembership`
## Cached Adapter (`cached/`)
Provides offline read-only access:
```typescript
// cached/readonly.ts
export function rejectReadOnly(): never {
throw new Error('This action requires an active connection');
}
// Pattern in cached adapters
export const cachedMeetings = {
async getMeeting(id: string): Promise<Meeting> {
return meetingCache.get(id) ?? rejectReadOnly();
},
async createMeeting(): Promise<Meeting> {
return rejectReadOnly(); // Mutations blocked
},
};
```
### Cache Modules
- `meetings.ts` — Meeting cache with TTL
- `projects.ts` — Project cache
- `diarization.ts` — Job status cache
- `annotations.ts` — Annotation cache
- `templates.ts` — Template cache
- `preferences.ts` — Preferences cache
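A cache with TTL, as mentioned for `meetings.ts`, might look roughly like the sketch below. `TtlCache` and its injectable clock are hypothetical; the actual cache modules may use a different expiry policy or storage.

```typescript
// Hypothetical TTL cache; the clock is injectable so expiry can be tested.
class TtlCache<K, V> {
  private readonly entries = new Map<K, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: K, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: K): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // Drop stale entries lazily on read.
      return undefined;
    }
    return entry.value;
  }
}

let fakeNow = 0;
const cache = new TtlCache<string, string>(1000, () => fakeNow);
cache.set('m1', 'Sprint Planning');
const fresh = cache.get('m1'); // still within the TTL
fakeNow = 1000;
const stale = cache.get('m1'); // TTL elapsed, entry evicted
```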
## API Initialization
```typescript
// api/index.ts
export async function initializeAPI(): Promise<NoteFlowAPI> {
try {
const tauriAdapter = await createTauriAdapter();
return tauriAdapter;
} catch {
console.warn('Falling back to mock adapter');
return createMockAdapter();
}
}
export function getAPI(): NoteFlowAPI {
return window.__NOTEFLOW_API__ ?? mockAdapter;
}
```
## Usage Pattern
```typescript
import { getAPI } from '@/api';
const api = getAPI();
const meeting = await api.createMeeting({ title: 'Sprint Planning' });
const stream = api.startTranscription(meeting.id);
stream.onUpdate((update) => {
if (update.update_type === 'final') {
console.log('New segment:', update.segment);
}
});
// Send audio chunks
stream.send({ audio_data: audioBuffer });
```


@@ -0,0 +1,212 @@
# NoteFlow TypeScript Hooks & Contexts
## Location
`client/src/hooks/` and `client/src/contexts/`
## React Contexts
### Connection Context (`connection-context.tsx`)
gRPC connection state and mode detection.
```typescript
interface ConnectionHelpers {
state: ConnectionState;
mode: ConnectionMode; // connected | disconnected | cached | mock | reconnecting
isConnected: boolean;
isReadOnly: boolean; // cached | disconnected | mock | reconnecting
isReconnecting: boolean;
isSimulating: boolean; // Simulation mode from preferences
}
// Usage
const { isConnected, isReadOnly, mode } = useConnection();
```
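The boolean helpers can be derived from `mode` alone. The sketch below follows the comment on `isReadOnly` (every mode except `connected` is read-only); `deriveFlags` and `READ_ONLY_MODES` are hypothetical names, and `isSimulating` is left out because it depends on preferences.

```typescript
type ConnectionMode = 'connected' | 'disconnected' | 'cached' | 'mock' | 'reconnecting';

// Modes that the `isReadOnly` comment lists as read-only.
const READ_ONLY_MODES: ReadonlySet<ConnectionMode> = new Set<ConnectionMode>([
  'cached',
  'disconnected',
  'mock',
  'reconnecting',
]);

function deriveFlags(mode: ConnectionMode) {
  return {
    isConnected: mode === 'connected',
    isReadOnly: READ_ONLY_MODES.has(mode),
    isReconnecting: mode === 'reconnecting',
  };
}

const cachedFlags = deriveFlags('cached');
const liveFlags = deriveFlags('connected');
```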
### Workspace Context (`workspace-context.tsx`)
User and workspace state.
```typescript
interface WorkspaceContextValue {
currentWorkspace: Workspace | null;
workspaces: Workspace[];
currentUser: GetCurrentUserResponse | null;
switchWorkspace: (workspaceId: string) => Promise<void>;
isLoading: boolean;
error: string | null;
}
// Usage
const { currentWorkspace, currentUser, switchWorkspace } = useWorkspace();
```
### Project Context (`project-context.tsx`)
Active project and project list.
```typescript
interface ProjectContextValue {
projects: Project[];
activeProject: Project | null;
switchProject: (projectId: string) => Promise<void>;
isLoading: boolean;
error: string | null;
}
// Usage
const { activeProject, projects, switchProject } = useProjects();
```
## Custom Hooks
### Diarization (`use-diarization.ts`)
Diarization job lifecycle with polling and recovery.
```typescript
interface UseDiarizationOptions {
onComplete?: (status: DiarizationJobStatus) => void;
onError?: (error: string) => void;
pollInterval?: number;
maxRetries?: number;
showToasts?: boolean;
autoRecover?: boolean; // Auto-recovery after restart
}
interface DiarizationState {
jobId: string | null;
status: JobStatus | null;
progress: number; // 0-100
error: string | null;
speakerIds: string[];
segmentsUpdated: number;
isActive: boolean;
}
function useDiarization(options?: UseDiarizationOptions): {
state: DiarizationState;
start: (meetingId: string, numSpeakers?: number) => Promise<void>;
cancel: () => Promise<void>;
reset: () => void;
recover: () => Promise<DiarizationJobStatus | null>;
}
```
### Audio Devices (`use-audio-devices.ts`)
Audio device enumeration and selection.
```typescript
interface AudioDevice {
id: string;
name: string;
kind: 'input' | 'output';
}
function useAudioDevices(options: UseAudioDevicesOptions): {
devices: AudioDevice[];
selectedInput: AudioDevice | null;
selectedOutput: AudioDevice | null;
setSelectedInput: (id: string) => void;
setSelectedOutput: (id: string) => void;
isLoading: boolean;
}
```
### Project Hooks
- `useProject()` — Access project from context
- `useActiveProject()` — Get active project
- `useProjectMembers()` — Project membership queries
### Cloud Consent (`use-cloud-consent.ts`)
Cloud AI consent state management.
```typescript
function useCloudConsent(): {
hasConsent: boolean;
grantConsent: () => Promise<void>;
revokeConsent: () => Promise<void>;
isLoading: boolean;
}
```
### Integration Hooks
- `useWebhooks()` — Webhook CRUD
- `useEntityExtraction()` — NER extraction & updates
- `useCalendarSync()` — Calendar integration sync
- `useOAuthFlow()` — OAuth authentication flow
- `useAuthFlow()` — General auth flow
- `useOidcProviders()` — OIDC provider management
- `useIntegrationSync()` — Integration sync state polling
- `useIntegrationValidation()` — Integration config validation
### Recording Hooks
- `useRecordingAppPolicy()` — App recording policy detection
- `usePostProcessing()` — Post-recording processing state
### Utility Hooks
- `useAsyncData<T>()` — Generic async data loading with retry
- `useGuardedMutation()` — Mutation with offline/permissions guard
- `useToast()` — Toast notifications (shadcn/ui)
- `usePanelPreferences()` — Panel layout preferences
- `useMobile()` — Mobile/responsive detection
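The idea behind a guarded mutation can be shown without React. The sketch below is a simplified, synchronous stand-in: `guardMutation` and `GuardResult` are hypothetical, and the real `useGuardedMutation()` hook presumably also handles async calls and permission checks.

```typescript
// Hypothetical guard: refuse to run a mutation while the app is read-only.
type GuardResult<T> = { ok: true; value: T } | { ok: false; reason: string };

function guardMutation<T>(isReadOnly: boolean, mutate: () => T): GuardResult<T> {
  if (isReadOnly) {
    // The mutation callback is never invoked in read-only mode.
    return { ok: false, reason: 'This action requires an active connection' };
  }
  return { ok: true, value: mutate() };
}

const blocked = guardMutation(true, () => 'created');
const allowed = guardMutation(false, () => 'created');
```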
## Hook Patterns
### Polling with Backoff (Diarization)
```typescript
const { state, start, cancel, recover } = useDiarization({
pollInterval: 2000,
maxRetries: 10,
autoRecover: true,
onComplete: (status) => {
toast.success(`Diarization complete: ${status.segmentsUpdated} segments updated`);
},
});
// Start job
await start(meetingId, 2); // 2 speakers
// Monitor progress
useEffect(() => {
console.log(`Progress: ${state.progress}%`);
}, [state.progress]);
```
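The example above passes a fixed `pollInterval`; how the hook backs off between polls is not shown. One common choice is capped exponential backoff, sketched below. `pollDelayMs` and the 30-second cap are assumptions for illustration, not the hook's confirmed behavior.

```typescript
// Hypothetical helper: delay before poll attempt `attempt` (0-based),
// doubling each time and capped at `maxMs`.
function pollDelayMs(attempt: number, baseMs = 2000, maxMs = 30000): number {
  const delay = baseMs * 2 ** attempt;
  return Math.min(delay, maxMs);
}

// Delays for the first five attempts with the defaults above.
const delays = [0, 1, 2, 3, 4].map((attempt) => pollDelayMs(attempt));
```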
### Async Data Loading
```typescript
const { data, isLoading, error, retry } = useAsyncData(
() => getAPI().getMeeting({ meeting_id: meetingId }),
{
onError: (e) => toast.error(e.message),
deps: [meetingId],
}
);
```
### Connection-Aware Components
```typescript
function MyComponent() {
const { isConnected, isReadOnly } = useConnection();
const { activeProject } = useProjects();
if (isReadOnly) {
return <OfflineBanner />;
}
return <ConnectedUI project={activeProject} />;
}
```
## Context Provider Pattern
```typescript
// Root app setup
function App() {
return (
<ConnectionProvider>
<WorkspaceProvider>
<ProjectProvider>
<Routes />
</ProjectProvider>
</WorkspaceProvider>
</ConnectionProvider>
);
}
```


@@ -0,0 +1,129 @@
# NoteFlow TypeScript Components & Pages
## Location
`client/src/components/` and `client/src/pages/`
## Component Architecture
### UI Components (`components/ui/`)
30+ shadcn/ui primitives: button, input, dialog, form, toast, etc.
### Recording Components (`components/recording/`)
| Component | Purpose |
|-----------|---------|
| `audio-level-meter.tsx` | Real-time VU meter visualization |
| `confidence-indicator.tsx` | ASR confidence display |
| `vad-indicator.tsx` | Voice activity indicator |
| `buffering-indicator.tsx` | Congestion/buffering display |
| `recording-header.tsx` | Recording session header |
| `stat-card.tsx` | Statistics display |
| `speaker-distribution.tsx` | Speaker time breakdown |
| `idle-state.tsx` | Idle/standby UI |
### Settings Components (`components/settings/`)
- `developer-options-section.tsx`
- `quick-actions-section.tsx`
### Project Components (`components/projects/`)
- `ProjectScopeFilter.tsx`
### Status & Badge Components
| Component | Purpose |
|-----------|---------|
| `entity-highlight.tsx` | NER entity highlighting |
| `entity-management-panel.tsx` | Entity CRUD UI |
| `annotation-type-badge.tsx` | Annotation type display |
| `meeting-state-badge.tsx` | Meeting state indicator |
| `priority-badge.tsx` | Priority indicator |
| `speaker-badge.tsx` | Speaker identification |
| `processing-status.tsx` | Post-processing indicator |
| `api-mode-indicator.tsx` | Connection mode indicator |
| `offline-banner.tsx` | Offline mode warning |
### Layout Components
- `app-layout.tsx` — Main app shell with sidebar
- `app-sidebar.tsx` — Navigation sidebar
- `error-boundary.tsx` — React error boundary
- `empty-state.tsx` — Empty state template
- `meeting-card.tsx` — Meeting item card
- `NavLink.tsx` — Navigation link wrapper
### Integration Components
- `calendar-connection-panel.tsx` — Calendar OAuth setup
- `calendar-events-panel.tsx` — Calendar events list
- `integration-config-panel/` — Integration setup wizard
### Other Components
- `connection-status.tsx` — Connection status display
- `confirmation-dialog.tsx` — Generic confirm dialog
- `timestamped-notes-editor.tsx` — Annotation editor
- `preferences-sync-bridge.tsx` — Preferences sync coordinator
- `preferences-sync-status.tsx` — Sync status display
## Pages (`pages/`)
| Page | Path | Purpose |
|------|------|---------|
| `Home.tsx` | `/` | Landing/onboarding |
| `Recording.tsx` | `/recording` | Active recording session |
| `Meetings.tsx` | `/meetings` | Meetings list with filters |
| `MeetingDetail.tsx` | `/meetings/:id` | Meeting transcript & details |
| `Projects.tsx` | `/projects` | Project list & create |
| `ProjectSettings.tsx` | `/projects/:id/settings` | Project configuration |
| `Settings.tsx` | `/settings` | Application settings |
| `People.tsx` | `/people` | Workspace members |
| `Tasks.tsx` | `/tasks` | Action items/tasks view |
| `Analytics.tsx` | `/analytics` | Application analytics/logs |
| `NotFound.tsx` | `/*` | 404 fallback |
### Settings Sub-Pages (`pages/settings/`)
- `IntegrationsTab.tsx` — Integration management
- `StatusTab.tsx` — System status
### Meeting Detail Sub-Pages (`pages/meeting-detail/`)
Meeting detail components and sub-views.
## Page Integration Patterns
```typescript
// Recording.tsx example
export default function Recording() {
const { isConnected, isReadOnly } = useConnection();
const { activeProject } = useProjects();
const { state, start, cancel } = useDiarization();
const { devices: audioDevices } = useAudioDevices({}); // empty options: assumes every option is optional
// Conditional rendering based on connection state
if (isReadOnly) return <OfflineBanner />;
return (
<AppLayout>
<RecordingHeader />
<AudioLevelMeter />
<ProcessingStatus />
</AppLayout>
);
}
```
## Component Exports
| Component | Purpose |
|-----------|---------|
| `<ConnectionProvider>` | Wraps app with connection state |
| `<WorkspaceProvider>` | Wraps app with workspace state |
| `<ProjectProvider>` | Wraps app with project state |
| `<AppLayout>` | Main app shell with sidebar |
| `<AudioLevelMeter>` | Real-time audio VU meter |
| `<RecordingHeader>` | Recording session metadata |
| `<EntityHighlight>` | Inline NER entity highlighting |
| `<AnnotationTypeBadge>` | Action item/decision/risk badge |
| `<MeetingStateBadge>` | Meeting state indicator |
| `<OfflineBanner>` | Cached/offline mode warning |
| `<ProcessingStatus>` | Post-processing progress |
| `<ApiModeIndicator>` | Connection mode display |
## Analytics Components (`components/analytics/`)
- `logs-tab.tsx` — Log viewer
- `performance-tab.tsx` — Performance metrics
- `analytics-utils.ts` — Analytics utilities


@@ -0,0 +1,238 @@
# NoteFlow Rust/Tauri Commands
## Location
`client/src-tauri/src/commands/`
## Command Summary (97 Total)
### Connection (5)
| Command | Purpose |
|---------|---------|
| `connect()` | Connect to gRPC server |
| `disconnect()` | Disconnect from server |
| `is_connected()` | Check connection status |
| `get_server_info()` | Get server info |
| `get_effective_server_url()` | Get resolved server URL |
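From the frontend, commands like these are reached by name through Tauri's `invoke`. The sketch below injects an `Invoke` function whose shape loosely mirrors `invoke(cmd, args)`, so it runs without the Tauri runtime; `createConnectionCommands` and the return types are assumptions, not the project's actual adapter code.

```typescript
// Injected stand-in for Tauri's invoke; the real one comes from the Tauri API package.
type Invoke = (cmd: string, args?: Record<string, unknown>) => Promise<unknown>;

// Hypothetical typed wrapper around a few of the connection commands.
function createConnectionCommands(invoke: Invoke) {
  return {
    connect: () => invoke('connect'),
    disconnect: () => invoke('disconnect'),
    isConnected: () => invoke('is_connected') as Promise<boolean>,
    getEffectiveServerUrl: () => invoke('get_effective_server_url') as Promise<string>,
  };
}

// Fake invoke that records which command names were requested.
const calls: string[] = [];
const fakeInvoke: Invoke = (cmd) => {
  calls.push(cmd);
  return Promise.resolve(cmd === 'is_connected' ? true : null);
};

const commands = createConnectionCommands(fakeInvoke);
void commands.connect();
void commands.isConnected();
```

Centralizing the command-name strings in one wrapper keeps typos out of call sites and gives tests a single seam to replace.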
### Identity (8)
| Command | Purpose |
|---------|---------|
| `get_current_user()` | Get current user |
| `list_workspaces()` | List available workspaces |
| `switch_workspace()` | Switch active workspace |
| `initiate_auth_login()` | Start auth flow |
| `complete_auth_login()` | Complete auth flow |
| `logout()` | Log out |
| `get_workspace_settings()` | Get workspace settings |
| `update_workspace_settings()` | Update workspace settings |
### Projects (16)
| Command | Purpose |
|---------|---------|
| `create_project()` | Create new project |
| `get_project()` | Get project by ID |
| `list_projects()` | List all projects |
| `update_project()` | Update project |
| `set_active_project()` | Set active project |
| `add_project_member()` | Add member |
| `update_project_member_role()` | Update member role |
| `remove_project_member()` | Remove member |
| `list_project_members()` | List members |
| `archive_project()` | Archive project |
| `restore_project()` | Restore project |
| `delete_project()` | Delete project |
### Meeting (5)
| Command | Purpose |
|---------|---------|
| `create_meeting()` | Create meeting |
| `list_meetings()` | List meetings |
| `get_meeting()` | Get meeting by ID |
| `stop_meeting()` | Stop recording |
| `delete_meeting()` | Delete meeting |
### Recording (5) — `recording/`
| Command | Purpose |
|---------|---------|
| `start_recording()` | Start recording session |
| `stop_recording()` | Stop recording session |
| `send_audio_chunk()` | Stream audio samples |
| `get_stream_state()` | Get stream state |
| `reset_stream_state()` | Reset stream state |
**Recording Module Files**:
- `session/mod.rs` — Session lifecycle
- `session/chunks/mod.rs` — Audio chunk streaming
- `capture.rs` — Native audio capture (cpal)
- `device.rs` — Device resolution utilities
- `dual_capture.rs` — Mic + system audio mixing
- `stream_state.rs` — VU levels, RMS, counts
- `app_policy.rs` — Recording app policy
### Annotation (5)
| Command | Purpose |
|---------|---------|
| `add_annotation()` | Add annotation |
| `get_annotation()` | Get annotation |
| `list_annotations()` | List annotations |
| `update_annotation()` | Update annotation |
| `delete_annotation()` | Delete annotation |
### Summary (9)
| Command | Purpose |
|---------|---------|
| `generate_summary()` | Generate summary |
| `list_summarization_templates()` | List templates |
| `create_summarization_template()` | Create template |
| `update_summarization_template()` | Update template |
| `delete_summarization_template()` | Delete template |
| `get_summarization_template()` | Get template |
| `grant_cloud_consent()` | Grant cloud AI consent |
| `revoke_cloud_consent()` | Revoke consent |
| `get_cloud_consent_status()` | Check consent |
### Export (2)
| Command | Purpose |
|---------|---------|
| `export_transcript()` | Export to format |
| `save_export_file()` | Save export to disk |
### Entities (3)
| Command | Purpose |
|---------|---------|
| `extract_entities()` | Run NER extraction |
| `update_entity()` | Update entity |
| `delete_entity()` | Delete entity |
### Diarization (5)
| Command | Purpose |
|---------|---------|
| `refine_speaker_diarization()` | Start refinement |
| `get_diarization_job_status()` | Get job status |
| `rename_speaker()` | Rename speaker |
| `cancel_diarization_job()` | Cancel job |
| `get_active_diarization_jobs()` | List active jobs |
### Audio Devices (12) — `audio.rs`
| Command | Purpose |
|---------|---------|
| `list_audio_devices()` | List input/output devices |
| `get_default_audio_device()` | Get system default |
| `select_audio_device()` | Select device |
| `start_input_test()` | Start input test |
| `stop_input_test()` | Stop input test |
| `list_loopback_devices()` | List system audio devices |
| `set_system_audio_device()` | Set system audio capture |
| `set_dual_capture_enabled()` | Enable mic + system |
| `set_audio_mix_levels()` | Set mix levels |
| `get_dual_capture_config()` | Get dual capture config |
### Playback (5) — `playback/`
| Command | Purpose |
|---------|---------|
| `start_playback()` | Start audio playback |
| `pause_playback()` | Pause playback |
| `stop_playback()` | Stop playback |
| `seek_playback()` | Seek position |
| `get_playback_state()` | Get playback state |
### Preferences (4)
| Command | Purpose |
|---------|---------|
| `get_preferences()` | Get user preferences |
| `save_preferences()` | Save preferences |
| `get_preferences_sync()` | Get sync preferences |
| `set_preferences_sync()` | Set sync preferences |
### Triggers (6) — `triggers/`
| Command | Purpose |
|---------|---------|
| `set_trigger_enabled()` | Enable/disable triggers |
| `snooze_triggers()` | Snooze triggers |
| `reset_snooze()` | Reset snooze |
| `get_trigger_status()` | Get trigger status |
| `dismiss_trigger()` | Dismiss trigger |
| `accept_trigger()` | Accept trigger |
### Calendar (7)
| Command | Purpose |
|---------|---------|
| `list_calendar_events()` | List events |
| `initiate_oauth()` | Start OAuth flow |
| `initiate_oauth_loopback()` | OAuth with loopback |
| `complete_oauth()` | Complete OAuth |
| `get_oauth_connection_status()` | Check connection |
| `disconnect_oauth()` | Disconnect OAuth |
| `get_calendar_providers()` | List providers |
### Webhooks (5)
| Command | Purpose |
|---------|---------|
| `register_webhook()` | Register webhook |
| `list_webhooks()` | List webhooks |
| `update_webhook()` | Update webhook |
| `delete_webhook()` | Delete webhook |
| `get_webhook_deliveries()` | Get delivery history |
### OIDC (8)
| Command | Purpose |
|---------|---------|
| `register_oidc_provider()` | Register provider |
| `list_oidc_providers()` | List providers |
| `get_oidc_provider()` | Get provider |
| `update_oidc_provider()` | Update provider |
| `delete_oidc_provider()` | Delete provider |
| `refresh_oidc_discovery()` | Refresh discovery |
| `test_oidc_connection()` | Test connection |
| `list_oidc_presets()` | List presets |
### Integration Sync (4)
| Command | Purpose |
|---------|---------|
| `start_integration_sync()` | Start sync |
| `get_sync_status()` | Get sync status |
| `list_sync_history()` | List sync history |
| `get_user_integrations()` | Get integrations |
### Observability (2)
| Command | Purpose |
|---------|---------|
| `get_recent_logs()` | Get recent logs |
| `get_performance_metrics()` | Get metrics |
### ASR & Streaming Config (5)
| Command | Purpose |
|---------|---------|
| `get_asr_configuration()` | Get ASR config |
| `update_asr_configuration()` | Update ASR config |
| `get_asr_job_status()` | Get ASR job status |
| `get_streaming_configuration()` | Get streaming config |
| `update_streaming_configuration()` | Update streaming config |
### HuggingFace (4)
| Command | Purpose |
|---------|---------|
| `set_huggingface_token()` | Set token |
| `get_huggingface_token_status()` | Get token status |
| `delete_huggingface_token()` | Delete token |
| `validate_huggingface_token()` | Validate token |
### Apps (2)
| Command | Purpose |
|---------|---------|
| `list_installed_apps()` | List apps |
| `invalidate_app_cache()` | Clear app cache |
### Diagnostics & Testing (5)
| Command | Purpose |
|---------|---------|
| `run_connection_diagnostics()` | Run diagnostics |
| `check_test_environment()` | Check test env |
| `reset_test_recording_state()` | Reset test state |
| `inject_test_audio()` | Inject test audio |
| `inject_test_tone()` | Inject test tone |
### Shell (1)
| Command | Purpose |
|---------|---------|
| `open_url()` | Open URL in browser |

.rag/11-rust-grpc-types.md

@@ -0,0 +1,191 @@
# NoteFlow Rust gRPC Client & Types
## Location
`client/src-tauri/src/grpc/`
## Architecture
Modular client using trait extensions for composition.
```rust
pub struct GrpcClient {
channel: Channel,
identity: Arc<IdentityManager>,
state: Arc<AppState>,
}
```
## Type Definitions (`grpc/types/`)
### Core Types (`core.rs`)
```rust
pub struct ServerInfo {
pub version: String,
pub asr_model: String,
pub asr_ready: bool,
pub supported_sample_rates: Vec<i32>,
pub max_chunk_size: i32,
pub uptime_seconds: f64,
pub active_meetings: i32,
pub diarization_enabled: bool,
pub diarization_ready: bool,
pub state_version: i64,
pub system_ram_total_bytes: Option<i64>,
pub gpu_vram_total_bytes: Option<i64>,
}
pub struct Meeting {
pub id: String,
pub project_id: Option<String>,
pub title: String,
pub state: MeetingState,
pub created_at: f64,
pub started_at: Option<f64>,
pub ended_at: Option<f64>,
pub duration_seconds: f64,
pub segments: Vec<Segment>,
pub summary: Option<Summary>,
pub metadata: HashMap<String, String>,
}
pub struct Segment {
pub id: String,
pub speaker: String,
pub text: String,
pub start_time: f64,
pub end_time: f64,
pub confidence: f32,
pub speaker_id: Option<String>,
}
pub struct Summary {
pub id: String,
pub content: String,
pub template_id: Option<String>,
pub generated_at: f64,
pub created_by_ai: bool,
}
pub struct Annotation {
pub id: String,
pub segment_ids: Vec<String>,
pub annotation_type: AnnotationType,
pub content: String,
pub created_at: f64,
}
```
### Enums (`enums.rs`)
```rust
pub enum MeetingState {
Unspecified = 0,
Created = 1,
Recording = 2,
Stopped = 3,
Completed = 4,
Error = 5,
}
pub enum AnnotationType {
Unspecified = 0,
ActionItem = 1,
Decision = 2,
Note = 3,
Risk = 4,
}
pub enum ExportFormat {
Unspecified = 0,
Markdown = 1,
Html = 2,
Pdf = 3,
}
pub enum UpdateType {
Unspecified = 0,
PartialTranscript = 1,
FinalTranscript = 2,
SpeakerDiarization = 3,
}
pub enum Priority {
Unspecified = 0,
Low = 1,
Medium = 2,
High = 3,
}
```
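On the TypeScript side, the numeric discriminants above have to be mapped to the client's string values (`created`, `recording`, `stopped`, `completed`, `error`). A minimal sketch of such a mapping follows; `meetingStateFromProto` and the choice to return `null` for `Unspecified` or unknown values are assumptions, not the project's converter.

```typescript
type MeetingState = 'created' | 'recording' | 'stopped' | 'completed' | 'error';

// Index = numeric discriminant of the Rust enum; 0 (Unspecified) has no client value.
const MEETING_STATES: ReadonlyArray<MeetingState | null> = [
  null,
  'created',
  'recording',
  'stopped',
  'completed',
  'error',
];

function meetingStateFromProto(value: number): MeetingState | null {
  // Out-of-range values fall through to null instead of throwing.
  return MEETING_STATES[value] ?? null;
}
```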
### Other Type Modules
- `asr.rs` — ASR config types
- `streaming.rs` — Streaming types
- `projects.rs` — Project types
- `calendar.rs` — Calendar types
- `webhooks.rs` — Webhook types
- `preferences.rs` — Preference types
- `identity.rs` — Identity types
- `oidc.rs` — OIDC types
- `sync.rs` — Sync types
- `observability.rs` — Observability types
- `results.rs` — Result wrappers
- `hf_token.rs` — HuggingFace types
## Client Modules (`grpc/client/`)
| File | Purpose |
|------|---------|
| `core.rs` | Connection, auth, identity interceptor |
| `meetings.rs` | Meeting CRUD |
| `annotations.rs` | Annotation ops |
| `diarization.rs` | Diarization requests |
| `identity.rs` | User/workspace ops |
| `projects.rs` | Project ops |
| `preferences.rs` | Preference ops |
| `calendar.rs` | Calendar sync |
| `webhooks.rs` | Webhook ops |
| `oidc.rs` | OIDC providers |
| `sync.rs` | Integration sync |
| `observability.rs` | Logs/metrics |
| `asr.rs` | ASR config |
| `streaming.rs` | Streaming config |
| `hf_token.rs` | HuggingFace tokens |
| `converters.rs` | Proto ↔ domain converters |
## Streaming (`grpc/streaming/`)
```rust
pub struct StreamManager {
state: Arc<AppState>,
grpc_client: Arc<GrpcClient>,
}
pub struct AudioStreamChunk {
pub segment_id: String,
pub samples: Vec<f32>,
pub timestamp: f64,
pub speaker: Option<String>,
}
pub struct StreamStateInfo {
pub is_streaming: bool,
pub buffered_samples: usize,
pub segments_completed: u32,
}
```
## Identity Interceptor
```rust
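use tonic::metadata::{Ascii, MetadataValue};
use tonic::service::Interceptor;
use tonic::{Request, Status};
impl Interceptor for IdentityInterceptor {
    fn call(&mut self, mut request: Request<()>) -> Result<Request<()>, Status> {
        // Metadata values must be valid ASCII; surface parse failures as a gRPC status.
        fn header(value: &str) -> Result<MetadataValue<Ascii>, Status> {
            value.parse().map_err(|_| Status::invalid_argument("invalid identity metadata"))
        }
        let metadata = request.metadata_mut();
        // IDs come from the IdentityManager held by the interceptor.
        metadata.insert("x-user-id", header(&self.identity.user_id())?);
        metadata.insert("x-workspace-id", header(&self.identity.workspace_id())?);
        if let Some(token) = self.identity.access_token() {
            metadata.insert("authorization", header(&format!("Bearer {token}"))?);
        }
        Ok(request)
    }
}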
```


@@ -0,0 +1,307 @@
# NoteFlow Rust State, Audio & Crypto
## Location
`client/src-tauri/src/state/`, `audio/`, `crypto/`
## State Management (`state/`)
### AppState (`app_state.rs`)
```rust
pub struct AppState {
// Connection State
pub grpc_client: Arc<GrpcClient>,
pub server_info: RwLock<Option<ServerInfo>>,
// Recording State
pub recording: RwLock<Option<RecordingSession>>,
pub current_meeting: RwLock<Option<MeetingInfo>>,
pub recording_start_time: RwLock<Option<f64>>,
pub elapsed_seconds: RwLock<u32>,
// Playback State
pub playback_handle: Arc<PlaybackHandle>,
pub playback_state: RwLock<PlaybackState>,
pub playback_info: RwLock<Option<PlaybackInfo>>,
// Audio Configuration
pub audio_config: RwLock<AudioConfig>,
// Trigger State
pub trigger_status: RwLock<TriggerStatus>,
pub pending_triggers: RwLock<Vec<PendingTrigger>>,
pub dismissed_triggers: RwLock<LRU<String, Instant>>,
// User Preferences
pub preferences: RwLock<UserPreferences>,
// Crypto (lazy-initialized)
pub crypto: CryptoManager,
// Identity (lazy-initialized)
pub identity: Arc<IdentityManager>,
}
```
### Recording Session
```rust
pub struct RecordingSession {
pub meeting_id: String,
pub start_time: f64,
pub audio_tx: mpsc::Sender<AudioSamplesChunk>,
pub mic_input_device: Option<cpal::Device>,
pub system_audio_device: Option<cpal::Device>,
}
pub struct AudioSamplesChunk {
pub samples: Vec<f32>,
pub timestamp: f64,
pub sample_rate: u32,
pub channels: u16,
}
```
### Preferences
```rust
pub struct UserPreferences {
pub default_export_location: String,
pub audio_devices: AudioDevicePrefs,
pub playback_volume: f32,
pub auto_play_on_load: bool,
pub app_policies: Vec<RecordingAppRule>,
}
pub struct AudioConfig {
pub input_device_id: Option<String>,
pub output_device_id: Option<String>,
pub system_device_id: Option<String>,
pub dual_capture_enabled: bool,
pub mic_gain: f32,
pub system_gain: f32,
}
```
## Audio Handling (`audio/`)
### Audio Capture (`capture.rs`)
```rust
pub struct CaptureConfig {
pub sample_rate: u32,
pub channels: u16,
pub buffer_size: usize,
}
pub struct AudioCapture {
stream: Stream, // cpal Stream
}
impl AudioCapture {
pub fn new(
config: CaptureConfig,
audio_tx: mpsc::Sender<Vec<f32>>,
level_callback: Arc<dyn Fn(f32) + Send + Sync>,
) -> Result<Self>;
pub fn with_device(
device: Device,
config: CaptureConfig,
audio_tx: mpsc::Sender<Vec<f32>>,
level_callback: Arc<dyn Fn(f32) + Send + Sync>,
) -> Result<Self>;
pub fn pause(&self) -> Result<()>;
pub fn resume(&self) -> Result<()>;
}
```
### Audio Playback (`playback.rs`)
```rust
pub enum PlaybackCommand {
Play(Vec<TimestampedAudio>, u32), // audio buffer, sample rate
Pause,
Resume,
Stop,
Shutdown,
}
pub struct PlaybackHandle {
command_tx: Sender<PlaybackCommand>,
response_rx: Mutex<Receiver<Result<PlaybackStarted, String>>>,
_thread: JoinHandle<()>,
}
pub struct PlaybackStarted {
pub duration: f64,
}
```
### Device Management (`devices.rs`)
- `list_audio_devices()` — Enumerate devices
- `get_default_audio_device()` — System default
- `select_audio_device()` — User selection
- `get_supported_configs()` — Device capabilities
### Utilities
- `mixer.rs` — Dual-capture audio mixing
- `loader.rs` — Audio file loading
- `windows_loopback.rs` — Windows system audio capture
## Encryption (`crypto/`)
### CryptoBox (AES-256-GCM)
```rust
pub struct CryptoBox {
cipher: Aes256Gcm,
}
pub struct CryptoManager {
crypto: OnceLock<Result<CryptoBox>>,
}
impl CryptoBox {
pub fn new() -> Result<Self>; // Create with keychain key
pub fn with_key(key: &[u8; 32]) -> Result<Self>;
pub fn encrypt_file(&self, input: &Path, output: &Path) -> Result<()>;
pub fn decrypt_file(&self, input: &Path, output: &Path) -> Result<()>;
}
impl CryptoManager {
pub fn get_or_init(&self) -> Result<&CryptoBox>;
pub fn encrypt_file_async(&self, input: PathBuf, output: PathBuf) -> JoinHandle<Result<()>>;
}
```
**File Format**: `[4-byte magic: NFAE] [16-byte nonce] [ciphertext] [16-byte tag]`
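The layout can be made concrete with a small parser, sketched here in TypeScript for illustration only (the actual implementation is in Rust). The field sizes are taken from the format line above; `splitEncryptedFile` and `EncryptedFileParts` are hypothetical names, and no decryption is attempted.

```typescript
// Sizes as stated in the file format line.
const MAGIC = 'NFAE';
const MAGIC_LEN = 4;
const NONCE_LEN = 16;
const TAG_LEN = 16;

interface EncryptedFileParts {
  nonce: Uint8Array;
  ciphertext: Uint8Array;
  tag: Uint8Array;
}

// Splits a file into its sections; it does not verify the tag or decrypt.
function splitEncryptedFile(bytes: Uint8Array): EncryptedFileParts {
  if (bytes.length < MAGIC_LEN + NONCE_LEN + TAG_LEN) {
    throw new Error('file too short');
  }
  let magic = '';
  for (let i = 0; i < MAGIC_LEN; i++) {
    magic += String.fromCharCode(bytes[i] ?? 0);
  }
  if (magic !== MAGIC) {
    throw new Error('bad magic');
  }
  const nonceEnd = MAGIC_LEN + NONCE_LEN;
  const tagStart = bytes.length - TAG_LEN;
  return {
    nonce: bytes.subarray(MAGIC_LEN, nonceEnd),
    ciphertext: bytes.subarray(nonceEnd, tagStart),
    tag: bytes.subarray(tagStart),
  };
}
```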
## Identity (`identity/`)
```rust
pub struct StoredIdentity {
pub user_id: String,
pub workspace_id: String,
pub display_name: String,
pub email: Option<String>,
pub workspace_name: String,
pub role: String,
pub is_local: bool,
}
pub struct IdentityManager {
identity: OnceLock<StoredIdentity>,
access_token: OnceLock<Option<String>>,
}
impl IdentityManager {
pub fn new() -> Self;
pub fn is_authenticated(&self) -> bool;
pub fn user_id(&self) -> String;
pub fn workspace_id(&self) -> String;
pub fn access_token(&self) -> Option<String>;
pub fn store_identity(&self, identity: StoredIdentity) -> Result<()>;
pub fn store_access_token(&self, token: String) -> Result<()>;
}
```
**Features**:
- Local-first default: `StoredIdentity::local_default()`
- Keychain storage for persistent identity
- Lazy initialization to defer keychain access
## Error Handling (`error/`)
```rust
pub enum Error {
Grpc(Box<tonic::Status>),
GrpcTransport(Box<tonic::transport::Error>),
Connection(String),
AudioCapture(String),
AudioPlayback(String),
Encryption(String),
Io(std::io::Error),
Serialization(serde_json::Error),
NotConnected,
AlreadyConnected,
NoActiveRecording,
AlreadyRecording,
NoActivePlayback,
AlreadyPlaying,
MeetingNotFound(String),
AnnotationNotFound(String),
IntegrationNotFound(String),
DeviceNotFound(String),
InvalidOperation(String),
InvalidInput(String),
Stream(String),
Timeout(String),
}
pub struct ErrorClassification {
pub grpc_status: Option<i32>,
pub category: String, // network, auth, validation, not_found, server, client
pub retryable: bool,
}
```
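How a gRPC status might map onto `category` and `retryable` is sketched below in TypeScript for illustration. The numeric codes are the standard gRPC status codes; the mapping itself and the name `classifyGrpcStatus` are assumptions, not the Rust implementation.

```typescript
interface ErrorClassification {
  grpcStatus: number | null;
  category: 'network' | 'auth' | 'validation' | 'not_found' | 'server' | 'client';
  retryable: boolean;
}

// Hypothetical mapping from standard gRPC status codes to categories.
function classifyGrpcStatus(code: number): ErrorClassification {
  switch (code) {
    case 4: // DEADLINE_EXCEEDED
    case 14: // UNAVAILABLE
      return { grpcStatus: code, category: 'network', retryable: true };
    case 7: // PERMISSION_DENIED
    case 16: // UNAUTHENTICATED
      return { grpcStatus: code, category: 'auth', retryable: false };
    case 3: // INVALID_ARGUMENT
      return { grpcStatus: code, category: 'validation', retryable: false };
    case 5: // NOT_FOUND
      return { grpcStatus: code, category: 'not_found', retryable: false };
    case 2: // UNKNOWN
    case 13: // INTERNAL
      return { grpcStatus: code, category: 'server', retryable: false };
    default:
      return { grpcStatus: code, category: 'client', retryable: false };
  }
}
```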
## Trigger Detection (`triggers/`)
### Foreground App Detection
```rust
pub struct ForegroundAppIdentity {
pub name: String,
pub bundle_id: Option<String>, // macOS
pub app_id: Option<String>, // Windows
pub exe_path: Option<PathBuf>,
pub exe_name: Option<String>,
pub desktop_id: Option<String>, // Linux
pub is_pwa: bool,
}
pub fn detect_foreground_app() -> Option<ForegroundAppIdentity>;
pub fn is_meeting_app(app: &ForegroundAppIdentity) -> bool;
```
**Meeting Apps**: zoom, teams, meet, slack, webex, discord, skype, gotomeeting, facetime, ringcentral
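A name-based check against this list could look like the TypeScript sketch below (for illustration; the actual `is_meeting_app` is Rust and may also match on bundle ID or executable path). The case-insensitive substring match is an assumption and can produce false positives, for example any app whose name contains "meet".

```typescript
// App keywords copied from the list above.
const MEETING_APPS = [
  'zoom', 'teams', 'meet', 'slack', 'webex',
  'discord', 'skype', 'gotomeeting', 'facetime', 'ringcentral',
];

// Hypothetical matcher: case-insensitive substring match on the app name.
function isMeetingApp(appName: string): boolean {
  const normalized = appName.toLowerCase();
  return MEETING_APPS.some((app) => normalized.includes(app));
}
```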
### Trigger Signals
```rust
pub struct TriggerSignal {
pub source: TriggerSource, // ForegroundApp, AudioActivity, CalendarProximity
pub confidence: f32, // 0.0-1.0
pub action: TriggerAction, // Ignore, Notify, AutoStart
pub timestamp: f64,
}
pub struct PendingTrigger {
pub id: String,
pub signal: TriggerSignal,
pub dismissed_until: Option<f64>,
}
```
## Events (`events/`)
```rust
pub enum AppEvent {
Connected { server_info: ServerInfo },
Disconnected { reason: String },
ConnectionError { error: String },
RecordingStarted { meeting_id: String },
RecordingProgress { elapsed_seconds: u32 },
RecordingStopped { meeting_id: String },
PlaybackStarted { duration: f64 },
PlaybackStopped,
PlaybackProgress { current_position: f64 },
TriggerDetected { signal: TriggerSignal },
TriggerDismissed { trigger_id: String },
AudioLevelChanged { rms: f32 },
AudioActivityDetected,
Error { error: String, category: String },
}
```

.rag/13-common-utilities.md

@@ -0,0 +1,196 @@
# NoteFlow Common Utilities & Patterns
## Python Utilities
### Domain Utilities (`domain/utils/`)
- `time.py` — `utc_now()` for UTC-aware datetime
- `validation.py` — Validation helpers
### Config Constants (`config/constants/`)
- `core.py` — DAYS_PER_WEEK, HOURS_PER_DAY, DEFAULT_LLM_TEMPERATURE
- `domain.py` — DEFAULT_ANTHROPIC_MODEL, WEBHOOK_EVENT_TYPES
- `errors.py` — Error code constants
- `http.py` — HTTP-related constants
### Domain Constants (`domain/constants/`)
- `fields.py` — Field name constants (EMAIL, PROVIDER, PROJECT_ID)
### Settings (`config/settings/`)
- `_main.py` — Main settings (database, ASR, gRPC, storage)
- `_triggers.py` — Trigger settings
- `_calendar.py` — Calendar provider settings
- `_features.py` — Feature flags
**Key Env Vars**:
- `NOTEFLOW_DATABASE_URL` — PostgreSQL connection
- `NOTEFLOW_ASR_MODEL_SIZE` — Whisper size
- `NOTEFLOW_GRPC_PORT` — gRPC port (default: 50051)
- `NOTEFLOW_MEETINGS_DIR` — Audio storage directory
- `NOTEFLOW_RETENTION_DAYS` — Retention policy (default: 90)
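A minimal sketch of how the variables above could be read, assuming plain environment lookups. The `EnvSettings` class and `load_env_settings` helper are hypothetical and are not the real classes in `config/settings/`; only the two defaults stated in the list (50051 and 90) are taken from this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container; not the real NoteFlow settings class."""

    database_url: Optional[str]
    asr_model_size: Optional[str]
    grpc_port: int
    meetings_dir: Optional[str]
    retention_days: int


def load_env_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read the documented variables, applying only the documented defaults."""
    return EnvSettings(
        database_url=env.get("NOTEFLOW_DATABASE_URL"),
        asr_model_size=env.get("NOTEFLOW_ASR_MODEL_SIZE"),
        # int() raises ValueError on a malformed value instead of silently
        # falling back to the default.
        grpc_port=int(env.get("NOTEFLOW_GRPC_PORT", "50051")),
        meetings_dir=env.get("NOTEFLOW_MEETINGS_DIR"),
        retention_days=int(env.get("NOTEFLOW_RETENTION_DAYS", "90")),
    )
```

Passing the mapping as a parameter keeps the helper testable without touching the process environment.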
## TypeScript Utilities (`client/src/lib/`)
### Configuration (`config/`)
- `app-config.ts` — App-wide constants
- `defaults.ts` — Default values
- `provider-endpoints.ts` — AI provider endpoints
- `server.ts` — Server connection defaults
### Formatting & Time
- `format.ts` — Time, duration, GB, percentages
- `time.ts` — Time constants (SECONDS_PER_*, MS_PER_*)
- `timing-constants.ts` — Polling intervals, timeouts
### Logging
```typescript
// NEVER use console.log - always use clientlog system
import { addClientLog } from '@/lib/client-logs';
import { debug, errorLog } from '@/lib/debug';
addClientLog({
level: 'info',
source: 'app',
message: 'Event occurred',
details: 'Context',
});
const log = debug('ComponentName');
log('Debug message', { data });
const logError = errorLog('ComponentName');
logError('Error occurred', error);
```
### Preferences (`preferences/`)
```typescript
import { preferences } from '@/lib/preferences';
// Initialize on app start
await preferences.initialize();
// Read/write
const prefs = preferences.get();
await preferences.set('selected_input_device', deviceId);
await preferences.replace({ ...current, theme: 'dark' });
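// Subscribe to changes (log via addClientLog from '@/lib/client-logs', not console.log)
const unsubscribe = preferences.subscribe((prefs) => {
addClientLog({ level: 'info', source: 'app', message: 'Preferences changed', details: JSON.stringify(prefs) });
});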
```
### Storage
- `storage-keys.ts` — localStorage key constants
- `storage-utils.ts` — localStorage helpers
### Caching
- `cache/meeting-cache.ts` — Meeting cache with TTL & events
### Event System
- `tauri-events.ts` — Tauri event subscriptions
- `event-emitter.ts` — Generic event emitter
### Other Utilities
- `utils.ts` — Generic TypeScript utilities
- `object-utils.ts` — Object manipulation
- `speaker-utils.ts` — Speaker ID formatting
- `integration-utils.ts` — Integration helpers
- `entity-store.ts` — Entity caching for NER
- `crypto.ts` — Client-side encryption
- `oauth-utils.ts` — OAuth flow helpers
## Rust Constants (`constants.rs`)
```rust
pub mod grpc {
pub const CONNECTION_TIMEOUT: Duration = Duration::from_secs(5);
pub const REQUEST_TIMEOUT: Duration = Duration::from_secs(300);
pub const DEFAULT_PORT: u16 = 50051;
pub const MAX_RETRY_ATTEMPTS: u32 = 3;
}
pub mod audio {
pub const DEFAULT_SAMPLE_RATE: u32 = 16000;
pub const DEFAULT_CHANNELS: u32 = 1;
pub const DEFAULT_BUFFER_SIZE: usize = 1600; // 100ms
pub const MIN_DB_LEVEL: f32 = -60.0;
pub const MAX_DB_LEVEL: f32 = 0.0;
}
pub mod storage {
pub const MAX_AUDIO_SIZE_BYTES: u64 = 5 * 1024 * 1024 * 1024; // 5GB
}
pub mod triggers {
pub const POLL_INTERVAL: Duration = Duration::from_secs(5);
pub const DEFAULT_SNOOZE_DURATION: Duration = Duration::from_secs(600);
pub const AUTO_START_THRESHOLD: f32 = 0.7;
}
pub mod crypto {
pub const KEY_SIZE: usize = 32; // 256-bit AES
pub const KEYCHAIN_SERVICE: &str = "NoteFlow";
}
pub mod identity {
pub const DEFAULT_USER_ID: &str = "local-user";
pub const DEFAULT_WORKSPACE_ID: &str = "local-workspace";
}
```
## Key Patterns
### Python: Protocol-Based Dependency Injection
```python
class MeetingService:
    def __init__(self, uow: UnitOfWork):
        self._uow = uow

    async def create_meeting(self, params: MeetingCreateParams) -> Meeting:
        async with self._uow:
            meeting = Meeting.create(params)
            await self._uow.meetings.add(meeting)
            await self._uow.commit()
            return meeting
```
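The service above depends on a `UnitOfWork` type without showing the protocol behind it. The following is a rough, self-contained sketch of what protocol-based injection can look like. Every name other than `MeetingService` and `UnitOfWork` is hypothetical, and meetings are reduced to plain title strings, so this is not the actual NoteFlow definition.

```python
import asyncio
from typing import List, Protocol


class MeetingRepository(Protocol):
    """Hypothetical repository protocol; the real one is not shown here."""

    async def add(self, meeting: str) -> None: ...


class UnitOfWork(Protocol):
    """Structural type: any object with these members satisfies it."""

    meetings: MeetingRepository

    async def __aenter__(self) -> "UnitOfWork": ...
    async def __aexit__(self, *exc: object) -> None: ...
    async def commit(self) -> None: ...


class InMemoryMeetings:
    def __init__(self) -> None:
        self.items: List[str] = []

    async def add(self, meeting: str) -> None:
        self.items.append(meeting)


class InMemoryUnitOfWork:
    """Test double; it never inherits from UnitOfWork."""

    def __init__(self) -> None:
        self.meetings = InMemoryMeetings()
        self.committed = False

    async def __aenter__(self) -> "InMemoryUnitOfWork":
        return self

    async def __aexit__(self, *exc: object) -> None:
        return None

    async def commit(self) -> None:
        self.committed = True


class MeetingService:
    def __init__(self, uow: UnitOfWork) -> None:
        self._uow = uow

    async def create_meeting(self, title: str) -> str:
        async with self._uow:
            await self._uow.meetings.add(title)
            await self._uow.commit()
            return title


uow = InMemoryUnitOfWork()
created = asyncio.run(MeetingService(uow).create_meeting("Standup"))
```

Because the protocol is structural, tests can hand the service an in-memory double while production code passes a database-backed implementation, with no shared base class.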
### TypeScript: Connection-Aware Components
```typescript
function MyComponent() {
const { isConnected, isReadOnly } = useConnection();
if (isReadOnly) return <OfflineBanner />;
return <ConnectedUI />;
}
```
### Rust: Lazy Initialization
```rust
pub struct CryptoManager {
crypto: OnceLock<Result<CryptoBox>>,
}
impl CryptoManager {
pub fn get_or_init(&self) -> Result<&CryptoBox> {
self.crypto.get_or_try_init(CryptoBox::new)
}
}
```
### Rust: State Access via RwLock
```rust
let mut rec = state.recording.write();
*rec = Some(session);
```
## Feature Flags
| Flag | Default | Controls |
|------|---------|----------|
| `NOTEFLOW_FEATURE_TEMPLATES_ENABLED` | `true` | AI templates |
| `NOTEFLOW_FEATURE_PDF_EXPORT_ENABLED` | `true` | PDF export |
| `NOTEFLOW_FEATURE_NER_ENABLED` | `false` | Entity extraction |
| `NOTEFLOW_FEATURE_CALENDAR_ENABLED` | `false` | Calendar sync |
| `NOTEFLOW_FEATURE_WEBHOOKS_ENABLED` | `true` | Webhooks |
Access: `get_feature_flags().<flag_name>` or `get_settings().feature_flags.<flag_name>`
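As an illustration of the defaults in the table, here is a small sketch that resolves each flag from an environment mapping. The `read_feature_flags` helper and the accepted spellings of true/false are assumptions; the real `get_feature_flags()` exposes attributes rather than a dict and may parse values differently.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the feature flag table.
FLAG_DEFAULTS: Dict[str, bool] = {
    "NOTEFLOW_FEATURE_TEMPLATES_ENABLED": True,
    "NOTEFLOW_FEATURE_PDF_EXPORT_ENABLED": True,
    "NOTEFLOW_FEATURE_NER_ENABLED": False,
    "NOTEFLOW_FEATURE_CALENDAR_ENABLED": False,
    "NOTEFLOW_FEATURE_WEBHOOKS_ENABLED": True,
}

# Assumed spellings; adjust to whatever the settings layer accepts.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def read_feature_flags(env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Hypothetical helper: resolve each flag from env, else use its default."""
    flags: Dict[str, bool] = {}
    for name, default in FLAG_DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            flags[name] = default
            continue
        value = raw.strip().lower()
        if value in _TRUE:
            flags[name] = True
        elif value in _FALSE:
            flags[name] = False
        else:
            # Fail loudly rather than guessing at a default.
            raise ValueError(f"{name} must be a boolean, got {raw!r}")
    return flags
```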

1188
CLAUDE.md

File diff suppressed because it is too large

View File

@@ -1,6 +1,6 @@
# CLAUDE.md
# CLAUDE.md - Client Development Reference
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This file provides guidance for working with the Tauri + React client code. For Python backend development, see `src/noteflow/CLAUDE.md`.
## Project Overview
@@ -77,24 +77,135 @@ npm run quality:rs
## Architecture
### TypeScript Layer
### TypeScript Layer (`client/src/`)
```
src/
├── api/ # Backend communication layer
│ ├── interface.ts # API interface definition (NoteFlowAdapter)
│ ├── tauri-adapter.ts # Production: Tauri IPC → Rust → gRPC
│ ├── mock-adapter.ts # Development: Simulated data
│ ├── cached/ # Cached adapter implementations by domain
│ └── types/ # API type definitions (core, enums, features)
├── hooks/ # Custom React hooks
├── contexts/ # React contexts (connection, workspace, project)
├── components/ # React components (ui/ contains shadcn/ui)
├── pages/ # Route pages
└── lib/ # Utilities and helpers
├── config/ # Configuration (server, defaults)
├── cache/ # Client-side caching
└── preferences.ts # User preferences management
├── api/ # API layer
│ ├── tauri-adapter.ts # Main Tauri IPC adapter
│ ├── mock-adapter.ts # Mock adapter for testing
│ ├── cached-adapter.ts # Caching layer
│ ├── connection-state.ts # Connection state machine
│ ├── reconnection.ts # Auto-reconnection logic
│ ├── interface.ts # Adapter interface
│ ├── transcription-stream.ts
│ ├── types/ # Type definitions
│ │ ├── core.ts # Core types (Meeting, Segment, etc.)
│ │ ├── enums.ts # Enum definitions
│ │ ├── errors.ts # Error types
│ │ ├── projects.ts # Project types
│ │ ├── diagnostics.ts
│ │ ├── requests.ts
│ │ ├── features/ # Feature-specific types
│ │ │ ├── webhooks.ts
│ │ │ ├── calendar.ts
│ │ │ ├── ner.ts
│ │ │ ├── sync.ts
│ │ │ ├── identity.ts
│ │ │ ├── oidc.ts
│ │ │ └── observability.ts
│ │ └── requests/ # Request types by domain
│ └── cached/ # Cached adapter implementations
│ ├── base.ts
│ ├── meetings.ts
│ ├── projects.ts
│ ├── diarization.ts
│ ├── annotations.ts
│ ├── templates.ts
│ ├── webhooks.ts
│ ├── calendar.ts
│ ├── entities.ts
│ ├── preferences.ts
│ ├── observability.ts
│ ├── triggers.ts
│ ├── audio.ts
│ ├── playback.ts
│ ├── apps.ts
│ └── readonly.ts
├── hooks/ # Custom React hooks
│ ├── use-diarization.ts
│ ├── use-cloud-consent.ts
│ ├── use-webhooks.ts
│ ├── use-oauth-flow.ts
│ ├── use-calendar-sync.ts
│ ├── use-entity-extraction.ts
│ ├── use-audio-devices.ts
│ ├── use-project.ts
│ ├── use-project-members.ts
│ ├── use-oidc-providers.ts
│ ├── use-auth-flow.ts
│ ├── use-integration-sync.ts
│ ├── use-integration-validation.ts
│ ├── use-secure-integration-secrets.ts
│ ├── use-guarded-mutation.ts
│ ├── use-panel-preferences.ts
│ ├── use-preferences-sync.ts
│ ├── use-meeting-reminders.ts
│ ├── use-recording-app-policy.ts
│ ├── use-post-processing.ts
│ ├── use-toast.ts
│ ├── use-mobile.tsx
│ └── post-processing/
├── contexts/ # React contexts
│ ├── connection-context.tsx # gRPC connection context
│ ├── connection-state.ts
│ ├── workspace-context.tsx # Workspace context
│ ├── workspace-state.ts
│ ├── project-context.tsx # Project context
│ └── project-state.ts
├── components/ # React components
│ ├── ui/ # Reusable UI components (shadcn/ui)
│ ├── recording/ # Recording-specific components
│ ├── settings/ # Settings panel components
│ ├── analytics/ # Analytics visualizations
│ ├── projects/ # Project components
│ ├── icons/ # Icon components
│ └── ... # Top-level components
├── pages/ # Route pages
│ ├── Home.tsx
│ ├── Meetings.tsx
│ ├── MeetingDetail.tsx
│ ├── Recording.tsx
│ ├── Projects.tsx
│ ├── ProjectSettings.tsx
│ ├── Settings.tsx
│ ├── settings/ # Settings sub-pages
│ ├── People.tsx
│ ├── Tasks.tsx
│ ├── Analytics.tsx
│ ├── Index.tsx
│ └── NotFound.tsx
├── lib/ # Utilities
│ ├── config/ # Configuration (server, defaults, app-config, provider-endpoints)
│ ├── cache/ # Client-side caching (meeting-cache)
│ ├── format.ts
│ ├── utils.ts
│ ├── time.ts
│ ├── crypto.ts
│ ├── cva.ts
│ ├── styles.ts
│ ├── tauri-events.ts
│ ├── preferences.ts
│ ├── preferences-sync.ts
│ ├── entity-store.ts
│ ├── speaker-utils.ts
│ ├── ai-providers.ts
│ ├── ai-models.ts
│ ├── integration-utils.ts
│ ├── default-integrations.ts
│ ├── status-constants.ts
│ ├── timing-constants.ts
│ ├── object-utils.ts
│ ├── error-reporting.ts
│ ├── client-logs.ts
│ ├── client-log-events.ts
│ ├── log-groups.ts
│ ├── log-converters.ts
│ ├── log-messages.ts
│ ├── log-summarizer.ts
│ └── log-group-summarizer.ts
├── types/ # Shared TypeScript types
└── test/ # Test utilities
```
**Key Patterns:**
@@ -103,25 +214,83 @@ src/
- React Query (`@tanstack/react-query`) for server state management
- Contexts for global state: `ConnectionContext`, `WorkspaceContext`, `ProjectContext`
### Rust Layer
### Rust Layer (`client/src-tauri/src/`)
```
src-tauri/src/
├── commands/ # Tauri IPC command handlers
│ ├── recording/ # Audio capture, device selection
│ ├── playback/ # Audio playback control
│ ├── triggers/ # Recording triggers (audio activity, calendar)
│ └── *.rs # Domain commands (meeting, summary, etc.)
├── grpc/ # gRPC client
│ ├── client/ # Domain-specific gRPC clients
│ ├── types/ # Rust type definitions
│ ├── streaming/ # Audio streaming management
│ └── noteflow.rs # Generated protobuf types
├── audio/ # Audio capture/playback
├── crypto/ # AES-GCM encryption
├── state/ # Runtime state management
├── config.rs # Configuration
└── lib.rs # Command registration
├── commands/ # Tauri IPC command handlers
│ ├── recording/ # capture.rs, device.rs, audio.rs, app_policy.rs
│ ├── playback/ # audio.rs, events.rs, tick.rs
│ ├── triggers/ # audio.rs, polling.rs
│ ├── meeting.rs
│ ├── diarization.rs
│ ├── annotation.rs
│ ├── export.rs
│ ├── summary.rs
│ ├── entities.rs
│ ├── calendar.rs
│ ├── webhooks.rs
│ ├── preferences.rs
│ ├── observability.rs
│ ├── sync.rs
│ ├── projects.rs
│ ├── identity.rs
│ ├── oidc.rs
│ ├── connection.rs
│ ├── audio.rs
│ ├── audio_testing.rs
│ ├── apps.rs
│ ├── apps_platform.rs
│ ├── diagnostics.rs
│ ├── shell.rs
│ └── testing.rs
├── grpc/ # gRPC client
│ ├── client/ # Client implementations by domain
│ │ ├── core.rs
│ │ ├── meetings.rs
│ │ ├── annotations.rs
│ │ ├── diarization.rs
│ │ ├── identity.rs
│ │ ├── projects.rs
│ │ ├── preferences.rs
│ │ ├── calendar.rs
│ │ ├── webhooks.rs
│ │ ├── observability.rs
│ │ ├── oidc.rs
│ │ ├── sync.rs
│ │ └── converters.rs
│ ├── types/ # Rust type definitions
│ │ ├── core.rs
│ │ ├── enums.rs
│ │ ├── identity.rs
│ │ ├── projects.rs
│ │ ├── preferences.rs
│ │ ├── calendar.rs
│ │ ├── webhooks.rs
│ │ ├── observability.rs
│ │ ├── oidc.rs
│ │ ├── sync.rs
│ │ └── results.rs
│ ├── streaming/ # Streaming converters
│ └── noteflow.rs # Generated protobuf types
├── state/ # Runtime state management
│ ├── app_state.rs
│ ├── preferences.rs
│ ├── playback.rs
│ └── types.rs
├── audio/ # Audio capture and playback
├── cache/ # Memory caching
├── crypto/ # Cryptographic operations
├── events/ # Tauri event emission
├── triggers/ # Trigger detection
├── error/ # Error types
├── identity/ # Identity management
├── config.rs # Configuration
├── constants.rs # Constants
├── helpers.rs # Helper functions
├── oauth_loopback.rs # OAuth callback server
├── main.rs # Application entry point
└── lib.rs # Library exports
```
**Key Patterns:**
@@ -232,7 +401,7 @@ let config = select_input_config(&device, rate, channels)?;
- `unwrap()` usage detection
- Module size limits (>500 lines flagged)
### Logging (CRITICAL)
### Client Logging (CRITICAL)
**NEVER use `console.log`, `console.error`, `console.warn`, or `console.debug` directly.**

338
client/src/CLAUDE.md Normal file

@@ -0,0 +1,338 @@
# TypeScript & JavaScript Security Rules
Comprehensive security guidelines for TypeScript and JavaScript development in NoteFlow.
## TypeScript Security
### Type Safety
**Strict Configuration (warning, CWE-704)**
```json
{
"compilerOptions": {
"strict": true,
"noImplicitAny": true,
"strictNullChecks": true,
"noUncheckedIndexedAccess": true,
"exactOptionalPropertyTypes": true
}
}
```
**Runtime Validation (strict, CWE-20, OWASP A03:2025)**
```typescript
import { z } from 'zod';
const UserSchema = z.object({
id: z.number(),
email: z.string().email(),
role: z.enum(['user', 'admin']),
});
type User = z.infer<typeof UserSchema>;
async function fetchUser(id: number): Promise<User> {
const response = await fetch(`/api/users/${id}`);
const data = await response.json();
return UserSchema.parse(data); // Runtime validation
}
```
**Avoid Type Assertions (strict, CWE-704, CWE-20)**
```typescript
// Do: Use type guards
function isUser(obj: unknown): obj is User {
return typeof obj === 'object' && obj !== null && 'id' in obj;
}
// Don't: Bypass type safety
const data = JSON.parse(text) as unknown as SecretData;
```
### API Security
**Separate Internal/API Types (warning, CWE-200, OWASP A01:2025)**
```typescript
interface UserInternal {
id: number;
email: string;
passwordHash: string; // Internal only
createdAt: Date;
}
interface UserResponse {
id: number;
email: string;
createdAt: string; // ISO string for API
}
function toUserResponse(user: UserInternal): UserResponse {
return {
id: user.id,
email: user.email,
createdAt: user.createdAt.toISOString(),
};
}
```
**Branded Types for Sensitive Data (advisory, CWE-704)**
```typescript
type UserId = string & { readonly brand: unique symbol };
type PostId = string & { readonly brand: unique symbol };
declare function getUser(id: UserId): User; // Type-safe ID handling
```
### Null & Enum Safety
**Explicit Null Handling (warning, CWE-476)**
```typescript
function getUserEmail(userId: string): string | null {
const user = users.get(userId);
if (!user) return null;
return user.email;
}
// Optional chaining with nullish coalescing
const email = user?.email ?? 'default@example.com';
```
**String Enums for External Data (warning, CWE-20)**
```typescript
enum UserRole {
Admin = 'admin',
User = 'user',
Guest = 'guest',
}
// Or const objects
const UserRole = {
Admin: 'admin',
User: 'user',
Guest: 'guest',
} as const;
type UserRole = typeof UserRole[keyof typeof UserRole];
```
## JavaScript Security
### Code Execution Prevention
**No eval() with User Input (strict, CWE-94, CWE-95, OWASP A03:2025)**
```javascript
// Do: Use JSON.parse for data
const data = JSON.parse(userInput);
// Do: Use Map for dynamic dispatch
const handlers = new Map([
['action1', handleAction1],
['action2', handleAction2]
]);
// Don't: Execute arbitrary code
eval(userInput);
new Function(userInput)();
setTimeout(userCode, 1000);
```
**Prevent Prototype Pollution (strict, CWE-1321)**
```javascript
// Do: Use Object.create(null) for dictionaries
const safeDict = Object.create(null);
// Do: Validate keys before assignment
function safeSet(obj, key, value) {
if (key === '__proto__' || key === 'constructor' || key === 'prototype') {
throw new Error('Invalid key');
}
obj[key] = value;
}
// Don't: Direct assignment with user keys
obj[userKey] = userValue; // Can pollute prototype
```
### DOM Security
**Sanitize HTML Before Insertion (strict, CWE-79, OWASP A03:2025)**
```javascript
import DOMPurify from 'dompurify';
// Do: Sanitize HTML
const clean = DOMPurify.sanitize(userHtml);
element.innerHTML = clean;
// Do: Use textContent for plain text
element.textContent = userInput;
// Don't: Direct HTML insertion
element.innerHTML = userInput; // XSS vulnerability
document.write(userInput); // XSS via document.write
```
**Validate URLs Before Use (strict, CWE-601, CWE-79)**
```javascript
function isValidUrl(urlString) {
try {
const url = new URL(urlString);
return ['http:', 'https:'].includes(url.protocol);
} catch {
return false;
}
}
// Safe redirect
if (isValidUrl(redirectUrl) && isSameDomain(redirectUrl)) {
window.location.href = redirectUrl;
}
// Don't: Use unvalidated URLs
element.href = userUrl; // javascript: URLs execute code
```
### Node.js Security
**Prevent Command Injection (strict, CWE-78, OWASP A03:2025)**
```javascript
const { execFile } = require('child_process');
// Do: Use execFile with argument array
execFile('grep', [pattern, filename], (error, stdout) => {
console.log(stdout);
});
// Don't: Shell string interpolation
exec(`grep ${userPattern} ${userFile}`); // Command injection
exec('ls ' + userInput); // Shell interpretation
```
**Validate File Paths (strict, CWE-22, OWASP A01:2025)**
```javascript
const fs = require('fs');
const path = require('path');
const SAFE_DIR = '/app/uploads';
function safeReadFile(filename) {
const resolved = path.resolve(SAFE_DIR, filename);
// Ensure path is within safe directory
if (!resolved.startsWith(SAFE_DIR + path.sep)) {
throw new Error('Path traversal detected');
}
return fs.readFileSync(resolved);
}
// Don't: Unvalidated path concatenation
fs.readFileSync(`./uploads/${userFilename}`); // Path traversal
```
**Secure Dependencies (warning, CWE-1104, OWASP A06:2025)**
```bash
npm audit # Audit dependencies
npm ci # Use lockfile
npm outdated # Check for updates
```
```json
{
"dependencies": {
"express": "4.18.2" // Pin exact versions
}
}
```
### Cryptography
**Use Crypto Module Correctly (strict, CWE-330, CWE-328)**
```javascript
const crypto = require('crypto');
// Do: Secure random token
const token = crypto.randomBytes(32).toString('hex');
// Do: Secure UUID
const { randomUUID } = require('crypto');
const id = randomUUID();
// Do: Password hashing
const bcrypt = require('bcrypt');
const hash = await bcrypt.hash(password, 12);
// Don't: Predictable randomness
const token = Math.random().toString(36); // Predictable
// Don't: Weak password hashing
const hash = crypto.createHash('md5').update(password).digest('hex');
```
### HTTP Security
**Set Security Headers (warning, OWASP A05:2025)**
```javascript
const express = require('express');
const helmet = require('helmet');
const app = express();
app.use(helmet()); // Comprehensive security headers
// Or set individually
app.use((req, res, next) => {
res.setHeader('X-Content-Type-Options', 'nosniff');
res.setHeader('X-Frame-Options', 'DENY');
res.setHeader('Content-Security-Policy', "default-src 'self'");
next();
});
```
**Configure CORS Properly (strict, CWE-942, OWASP A05:2025)**
```javascript
const cors = require('cors');
// Do: Specific origins only
app.use(cors({
origin: ['https://myapp.com', 'https://admin.myapp.com'],
methods: ['GET', 'POST'],
credentials: true
}));
// Don't: Permissive CORS
app.use(cors({ origin: '*', credentials: true })); // Vulnerable
app.use(cors({ origin: true })); // Reflects any origin
```
## Quick Reference
| Rule | Level | CWE | OWASP |
|------|-------|-----|-------|
| **TypeScript** |
| Strict tsconfig | warning | CWE-704 | - |
| Runtime validation | strict | CWE-20 | A03:2025 |
| Avoid type assertions | strict | CWE-704,20 | - |
| Separate API types | warning | CWE-200 | A01:2025 |
| Branded types | advisory | CWE-704 | - |
| Null safety | warning | CWE-476 | - |
| String enums | warning | CWE-20 | - |
| **JavaScript** |
| No eval() | strict | CWE-94,95 | A03:2025 |
| Prototype pollution | strict | CWE-1321 | - |
| Sanitize HTML | strict | CWE-79 | A03:2025 |
| Validate URLs | strict | CWE-601,79 | - |
| Command injection | strict | CWE-78 | A03:2025 |
| Path traversal | strict | CWE-22 | A01:2025 |
| Secure dependencies | warning | CWE-1104 | A06:2025 |
| Crypto randomness | strict | CWE-330,328 | - |
| Security headers | warning | - | A05:2025 |
| CORS configuration | strict | CWE-942 | A05:2025 |
## Key Principles
1. **Never trust external data** - Always validate at runtime
2. **Type safety != runtime safety** - TypeScript types are erased
3. **Use secure defaults** - Explicitly configure security settings
4. **Principle of least privilege** - Minimize permissions and exposure
5. **Defense in depth** - Apply multiple security layers
## Version History
- **v2.0.0** - Combined TypeScript/JavaScript rules, compacted format
- **v1.0.0** - Initial separate TypeScript and JavaScript security rules

View File

@@ -69,7 +69,9 @@ async function syncStateAfterReconnect(): Promise<void> {
try {
const serverInfo = await getAPI().getServerInfo();
// Check if sync was superseded
if (syncGeneration !== currentGeneration) return;
if (syncGeneration !== currentGeneration) {
return;
}
if (typeof serverInfo.state_version === 'number') {
meetingCache.updateServerStateVersion(serverInfo.state_version);
}
@@ -84,13 +86,17 @@ async function syncStateAfterReconnect(): Promise<void> {
}
// Check if sync was superseded
if (syncGeneration !== currentGeneration) return;
if (syncGeneration !== currentGeneration) {
return;
}
// 3. Revalidate cached integrations against server
try {
await preferences.revalidateIntegrations();
// Check if sync was superseded
if (syncGeneration !== currentGeneration) return;
if (syncGeneration !== currentGeneration) {
return;
}
} catch (error) {
addClientLog({
level: 'warning',
@@ -104,7 +110,9 @@ async function syncStateAfterReconnect(): Promise<void> {
// 4. Execute all registered reconnection callbacks
for (const callback of reconnectionCallbacks) {
// Check if sync was superseded before each callback
if (syncGeneration !== currentGeneration) return;
if (syncGeneration !== currentGeneration) {
return;
}
try {
await callback();
} catch (error) {

View File

@@ -287,11 +287,15 @@ export function EntityManagementPanel({ meetingId }: EntityManagementPanelProps)
});
}
// Check mounted state before updating UI
if (!isMountedRef.current) return;
if (!isMountedRef.current) {
return;
}
toast({ title: 'Entity updated', description: `"${data.text}" has been updated.` });
setEditingEntity(null);
} catch (error) {
if (!isMountedRef.current) return;
if (!isMountedRef.current) {
return;
}
toastError({
title: 'Update failed',
error,
@@ -316,11 +320,15 @@ export function EntityManagementPanel({ meetingId }: EntityManagementPanelProps)
await deleteEntityWithPersist(meetingId, deletingEntity.id);
}
// Check mounted state before updating UI
if (!isMountedRef.current) return;
if (!isMountedRef.current) {
return;
}
toast({ title: 'Entity deleted', description: `"${deletingEntity.text}" has been removed.` });
setDeletingEntity(null);
} catch (error) {
if (!isMountedRef.current) return;
if (!isMountedRef.current) {
return;
}
toastError({
title: 'Delete failed',
error,

View File

@@ -341,11 +341,9 @@ export function getSuggestedComputeType(
if (types.includes('int8')) {
return 'int8';
}
} else if (device === 'cpu') {
if (types.includes('int8')) {
return 'int8';
}
}
} else if (device === 'cpu' && types.includes('int8')) {
return 'int8';
}
return types[0] ?? null;
}
@@ -361,11 +359,9 @@ export function getSuggestedComputeType(
if (types.includes('int8')) {
return 'int8';
}
} else if (device === 'cpu') {
if (types.includes('int8')) {
return 'int8';
}
}
} else if (device === 'cpu' && types.includes('int8')) {
return 'int8';
}
return types[0] ?? null;
}

View File

@@ -46,7 +46,9 @@ export function ModelAuthSection({
const [showToken, setShowToken] = useState(false);
const handleSaveToken = async () => {
if (!tokenInput.trim()) return;
if (!tokenInput.trim()) {
return;
}
const success = await setToken(tokenInput.trim(), true);
if (success) {
setTokenInput('');

View File

@@ -294,9 +294,7 @@ export function AIConfigSection() {
});
const sourceLabel = result.source === 'cache' ? 'Loaded from cache' : 'Loaded from API';
const forceLabel = forceRefresh ? ' (forced refresh)' : '';
const description = result.error
? result.error
: `${sourceLabel}${forceLabel}${nextModels.length} models`;
const description = result.error || `${sourceLabel}${forceLabel}${nextModels.length} models`;
toast({
title: result.stale ? 'Models loaded (stale cache)' : 'Models loaded',
description,

View File

@@ -98,11 +98,15 @@ export function useAuthFlow(): UseAuthFlowReturn {
try {
const response = await api.completeAuthLogin(provider, params.code, params.state);
if (unmounted) break; // Abort if unmounted during async operation
if (unmounted) {
break;
} // Abort if unmounted during async operation
if (response.success) {
const userInfo = await api.getCurrentUser();
if (unmounted) break; // Abort if unmounted during async operation
if (unmounted) {
break;
} // Abort if unmounted during async operation
setState((prev) => ({
...prev,

View File

@@ -359,7 +359,9 @@ export function useRecordingSession(
try {
if (!isConnected && !shouldSimulate) {
const connected = await preflightConnect();
if (cancelled) return;
if (cancelled) {
return;
}
if (!connected) {
setRecordingState('idle');
return;
@@ -368,7 +370,9 @@ export function useRecordingSession(
if (!shouldSimulate) {
await checkAndRecoverStream();
if (cancelled) return;
if (cancelled) {
return;
}
}
const api: NoteFlowAPI = shouldSimulate && !isConnected ? mockAPI : getAPI();
@@ -377,7 +381,9 @@ export function useRecordingSession(
include_segments: false,
include_summary: false,
});
if (cancelled) return;
if (cancelled) {
return;
}
setMeeting(existingMeeting);
setMeetingTitle(existingMeeting.title);
if (!['created', 'recording'].includes(existingMeeting.state)) {
@@ -389,7 +395,9 @@ export function useRecordingSession(
const mockModule: typeof import('@/api/mock-transcription-stream') = await import(
'@/api/mock-transcription-stream'
);
if (cancelled) return;
if (cancelled) {
return;
}
stream = new mockModule.MockTranscriptionStream(existingMeeting.id);
} else {
stream = ensureTranscriptionStream(await api.startTranscription(existingMeeting.id));
@@ -406,7 +414,9 @@ export function useRecordingSession(
shouldSimulate ? 'Simulation is active' : 'Transcription is now active'
);
} catch (error) {
if (cancelled) return;
if (cancelled) {
return;
}
setRecordingState('idle');
toastError('Failed to start recording', error);
}

View File

@@ -37,7 +37,7 @@ function getStringOrNumber(
function buildCostLabel(value: unknown): string | undefined {
if (typeof value === 'string') {
const trimmed = value.trim();
return trimmed ? trimmed : undefined;
return trimmed || undefined;
}
if (typeof value === 'number' && Number.isFinite(value)) {
return `$${value}`;
@@ -70,7 +70,7 @@ function extractCostLabel(record: Record<string, unknown>): string | undefined {
}
if (isRecord(record.pricing)) {
const pricing = record.pricing;
const {pricing} = record;
const input = getStringOrNumber(pricing, ['prompt', 'input', 'input_cost', 'prompt_cost']);
const output = getStringOrNumber(pricing, [
'completion',

View File

@@ -80,7 +80,9 @@ function getLegacyDeviceId(): string {
async function decryptWithLegacyKey(encryptedString: string): Promise<string> {
try {
const storedSalt = localStorage.getItem(KEY_SALT_KEY);
if (!storedSalt) return '';
if (!storedSalt) {
return '';
}
const salt = new Uint8Array(JSON.parse(storedSalt)).buffer as ArrayBuffer;
const legacyDeviceId = getLegacyDeviceId();

View File

@@ -207,7 +207,7 @@ const MESSAGE_TEMPLATES: Record<string, MessageTransformer> = {
'reconnection callback execution failed': () => 'Reconnection partially completed',
'state sync after reconnect failed': () => 'Sync incomplete after reconnecting',
'scheduled reconnect attempt failed': (d) => {
const attempt = d.attempt;
const {attempt} = d;
return attempt ? `Reconnection attempt ${attempt} failed` : 'Reconnection attempt failed';
},
'online event reconnect failed': () => 'Could not reconnect when network came online',

View File

@@ -329,7 +329,9 @@ export default function SettingsPage() {
};
const handleServerSwitchConfirm = async () => {
if (!pendingServerChange) return;
if (!pendingServerChange) {
return;
}
preferences.prepareForServerSwitch();
setIntegrations(preferences.getIntegrations());
await performConnect(pendingServerChange.host, pendingServerChange.port);

1035
docker/CLAUDE.md Normal file

File diff suppressed because it is too large

View File

@@ -1795,7 +1795,7 @@ from uuid import uuid4
import pytest
from noteflow.application.services.ner_service import ExtractionResult, NerService
from noteflow.application.services.ner import ExtractionResult, NerService
from noteflow.domain.entities.meeting import Meeting
from noteflow.domain.entities.named_entity import EntityCategory, NamedEntity
from noteflow.domain.entities.segment import Segment

View File

@@ -1020,7 +1020,7 @@ The NoteFlow gRPC layer uses a three-tier service injection pattern. All three t
```python
# Add to ServicerHost protocol class:
from noteflow.application.services.webhook_service import WebhookService
from noteflow.application.services.webhooks import WebhookService
class ServicerHost(Protocol):
# ... existing fields ...
@@ -1034,7 +1034,7 @@ class ServicerHost(Protocol):
**File**: `src/noteflow/grpc/service.py`
```python
from noteflow.application.services.webhook_service import WebhookService
from noteflow.application.services.webhooks import WebhookService
class NoteFlowServicer(
# ... existing mixins ...
@@ -1053,7 +1053,7 @@ class NoteFlowServicer(
**File**: `src/noteflow/grpc/server.py`
```python
from noteflow.application.services.webhook_service import WebhookService
from noteflow.application.services.webhooks import WebhookService
class NoteFlowServer:
def __init__(
@@ -1127,7 +1127,7 @@ if self._webhook_service is not None:
Add to `run_server_with_config()` function:
```python
from noteflow.application.services.webhook_service import WebhookService
from noteflow.application.services.webhooks import WebhookService
from noteflow.infrastructure.integrations.webhooks import WebhookExecutor
from noteflow.infrastructure.converters import WebhookConverter
@@ -1457,7 +1457,7 @@ def test_hmac_signature_generation(webhook_config: WebhookConfig) -> None:
import pytest
from unittest.mock import AsyncMock, Mock
from noteflow.application.services.webhook_service import WebhookService
from noteflow.application.services.webhooks import WebhookService
from noteflow.domain.entities.meeting import Meeting, MeetingId
from noteflow.domain.webhooks.events import WebhookConfig, WebhookEventType

View File

@@ -633,7 +633,7 @@ import pytest
from unittest.mock import AsyncMock, MagicMock
from noteflow.domain.webhooks.events import WebhookEventType
from noteflow.application.services.webhook_service import WebhookService
from noteflow.application.services.webhooks import WebhookService
class TestNewWebhookEvents:

View File

@@ -233,7 +233,7 @@ modules = [
'noteflow.grpc._startup',
'noteflow.grpc.service',
'noteflow.grpc.server',
'noteflow.application.services.export_service',
'noteflow.application.services.export',
]
for mod in modules:
try:

View File

@@ -27,7 +27,7 @@
}
},
"include": [
"src/", "client/src-tauri", "client/src", "tests/"
"src/"
],
"ignore": {
"useGitignore": true,

View File

@@ -25,7 +25,7 @@ from noteflow.infrastructure.security.crypto import AesGcmCryptoBox
from noteflow.infrastructure.security.keystore import KeyringKeyStore
if TYPE_CHECKING:
from noteflow.grpc._types import TranscriptSegment
from noteflow.grpc.types import TranscriptSegment
PRESETS: dict[str, dict[str, float]] = {
"responsive": {

419
src/AGENTS.md Normal file

@@ -0,0 +1,419 @@
# Python Security & Coding Guidelines
Security rules and coding practices for Python development.
## Prerequisites
- `rules/_core/owasp-2025.md` - Core web security
- `rules/_core/ai-security.md` - AI/ML security (if applicable)
---
## Input Handling
### Avoid Dangerous Deserialization
**Level**: `strict` | **When**: Loading untrusted data | **Refs**: CWE-502, OWASP A08:2025
**Do**:
```python
import json, yaml
data = json.loads(user_input) # Safe
data = yaml.safe_load(user_input) # Safe loader
```
**Don't**:
```python
import pickle, yaml
data = pickle.loads(user_input) # RCE vulnerability
data = yaml.load(user_input, Loader=yaml.Loader) # Arbitrary code execution
```
**Why**: Pickle and unsafe YAML loaders execute arbitrary code during deserialization.
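Choosing a safe parser is only half of the rule; the parsed value is still untrusted. A minimal sketch of shape validation after `json.loads` follows. The `parse_settings` helper and the "flat object of integers" shape are hypothetical, chosen only for illustration.

```python
import json


def parse_settings(raw: str) -> dict[str, int]:
    """Parse untrusted JSON, accepting only a flat object of integer values.

    Hypothetical helper: the name and the accepted shape are illustrative.
    """
    data = json.loads(raw)  # Parses data only; never executes code
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    for key, value in data.items():
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"Value for {key!r} must be an integer")
    return data
```

Rejecting unexpected shapes at the boundary keeps later code from having to guess what it received.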
---
### Use Subprocess Safely
**Level**: `strict` | **When**: Executing system commands | **Refs**: CWE-78, OWASP A03:2025
**Do**:
```python
import subprocess, shlex, re
# Pass arguments as list (no shell)
result = subprocess.run(['ls', '-la', user_dir], capture_output=True, text=True, check=True)
# If shell=True required, validate strictly
if re.fullmatch(r'[a-zA-Z0-9_-]+', filename):  # fullmatch also rejects a trailing newline
    subprocess.run(f'process {shlex.quote(filename)}', shell=True)
```
**Don't**:
```python
import os, subprocess
os.system(f'ls {user_input}') # Command injection
subprocess.run(f'grep {pattern} {filename}', shell=True) # Shell injection
```
**Why**: Shell injection allows attackers to execute arbitrary commands.
---
## File Operations
### Prevent Path Traversal
**Level**: `strict` | **When**: File access based on user input | **Refs**: CWE-22, OWASP A01:2025
**Do**:
```python
from pathlib import Path
UPLOAD_DIR = Path('/app/uploads').resolve()
def safe_file_access(filename: str) -> Path:
    requested = (UPLOAD_DIR / filename).resolve()
    if not requested.is_relative_to(UPLOAD_DIR):
        raise ValueError("Path traversal attempt detected")
    return requested
```
**Don't**:
```python
def get_file(filename):
    return open(f'/app/uploads/{filename}').read()  # Path traversal vulnerability
path = os.path.join(base_dir, user_filename) # No validation
```
**Why**: Path traversal (../) allows reading arbitrary files like /etc/passwd.
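The check above can be exercised end to end against a throwaway directory. This is a sketch under assumptions: `resolve_inside` is a hypothetical variant of `safe_file_access` that takes the base directory as a parameter so it can run without `/app/uploads`, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir: Path, filename: str) -> Path:
    """Resolve filename under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    requested = (base / filename).resolve()
    if not requested.is_relative_to(base):  # Python 3.9+
        raise ValueError("Path traversal attempt detected")
    return requested


with tempfile.TemporaryDirectory() as tmp:
    uploads = Path(tmp) / "uploads"
    uploads.mkdir()
    inside = resolve_inside(uploads, "report.txt")  # Stays under uploads/
    try:
        resolve_inside(uploads, "../../etc/passwd")
        escaped = True
    except ValueError:
        escaped = False  # The traversal attempt was rejected
```

An absolute `filename` such as `/etc/passwd` is rejected the same way, because joining it onto the base discards the base entirely.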
---
### Secure Temporary Files
**Level**: `warning` | **When**: Creating temporary files | **Refs**: CWE-377, CWE-379
**Do**:
```python
import tempfile, os
with tempfile.NamedTemporaryFile(delete=True) as tmp:
    tmp.write(data); tmp.flush(); process_file(tmp.name)
with tempfile.TemporaryDirectory() as tmpdir:
    filepath = os.path.join(tmpdir, 'data.txt')
```
**Don't**:
```python
tmp_file = f'/tmp/myapp_{user_id}.txt' # Predictable filename, race condition
with open(tmp_file, 'w') as f: f.write(data)
```
**Why**: Predictable temp file names enable symlink attacks and race conditions.
---
## Cryptography
### Use Secure Random Numbers
**Level**: `strict` | **When**: Generating security-sensitive random values | **Refs**: CWE-330, CWE-338
**Do**:
```python
import secrets
token = secrets.token_urlsafe(32) # Secure token
api_key = secrets.token_hex(32) # Secure API key
otp = ''.join(secrets.choice('0123456789') for _ in range(6)) # Secure OTP
```
**Don't**:
```python
import random
token = ''.join(random.choices('abcdef0123456789', k=32)) # Predictable
session_id = random.randint(0, 999999) # Predictable PRNG
```
**Why**: The `random` module uses a deterministic PRNG (Mersenne Twister); an attacker who observes enough outputs can predict later tokens.
---
### Hash Passwords Correctly
**Level**: `strict` | **When**: Storing user passwords | **Refs**: CWE-916, CWE-328, OWASP A02:2025
**Do**:
```python
import bcrypt
def hash_password(password: str) -> bytes:
    return bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))
def verify_password(password: str, hashed: bytes) -> bool:
    return bcrypt.checkpw(password.encode(), hashed)
```
**Don't**:
```python
import hashlib
password_hash = hashlib.sha256(password.encode()).hexdigest() # No salt, fast hash
password_hash = hashlib.md5(password.encode()).hexdigest() # MD5 is broken
```
**Why**: Unsalted fast hashes are vulnerable to rainbow tables and GPU cracking.
---
## SQL Security
### Use Parameterized Queries
**Level**: `strict` | **When**: Database queries with user input | **Refs**: CWE-89, OWASP A03:2025
**Do**:
```python
import sqlite3
cursor.execute("SELECT * FROM users WHERE email = ? AND status = ?", (email, status))
cursor.execute("SELECT * FROM users WHERE email = :email", {"email": email})
user = session.query(User).filter(User.email == email).first() # SQLAlchemy ORM
```
**Don't**:
```python
query = f"SELECT * FROM users WHERE email = '{email}'" # SQL injection
cursor.execute(query)
cursor.execute("SELECT * FROM users WHERE id = %s" % user_id) # String formatting
```
**Why**: SQL injection allows attackers to read, modify, or delete database data.
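The difference is easy to demonstrate with an in-memory SQLite database. This is a sketch only; the `users` table and the payload string are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@example.com",), ("bob@example.com",)],
)

malicious = "nobody@example.com' OR '1'='1"

# Parameterized: the payload is compared as one literal string, so nothing matches
safe_rows = conn.execute(
    "SELECT email FROM users WHERE email = ?", (malicious,)
).fetchall()

# Interpolated (shown only to expose the flaw): the payload rewrites the WHERE clause
unsafe_rows = conn.execute(
    f"SELECT email FROM users WHERE email = '{malicious}'"
).fetchall()
```

Here `safe_rows` is empty, while `unsafe_rows` contains every row in the table.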
---
## Web Security
### Validate URL Schemes
**Level**: `strict` | **When**: Processing user-provided URLs | **Refs**: CWE-918, OWASP A10:2025
**Do**:
```python
from urllib.parse import urlparse
ALLOWED_SCHEMES = {'http', 'https'}
def validate_url(url: str) -> str:
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Invalid scheme: {parsed.scheme}")
    if not parsed.netloc:
        raise ValueError("Missing hostname")
    return url
```
**Don't**:
```python
def fetch_url(url):
    return requests.get(url)  # Allows file://, javascript:, etc.
```
**Why**: Malicious schemes like `file://` or `javascript:` can read local files or execute code.
---
### Set Secure Cookie Attributes
**Level**: `strict` | **When**: Setting cookies in web applications | **Refs**: CWE-614, CWE-1004, OWASP A07:2025
**Do**:
```python
from flask import make_response
response = make_response(data)
response.set_cookie('session_id', value=session_id, httponly=True, secure=True, samesite='Lax', max_age=3600)
```
**Don't**:
```python
response.set_cookie('session_id', session_id) # Missing security attributes
```
**Why**: Missing attributes expose cookies to XSS theft and CSRF attacks.
---
## Error Handling
### Don't Expose Stack Traces
**Level**: `warning` | **When**: Handling exceptions in production | **Refs**: CWE-209, OWASP A05:2025
**Do**:
```python
import logging
logger = logging.getLogger(__name__)
@app.errorhandler(Exception)
def handle_error(error):
    logger.exception("Unhandled exception")  # Log full details internally
    return {"error": "Internal server error"}, 500  # Return safe message to client
```
**Don't**:
```python
@app.errorhandler(Exception)
def handle_error(error):
    return {"error": str(error), "traceback": traceback.format_exc()}, 500  # Exposes internals
```
**Why**: Stack traces reveal file paths, library versions, and code structure to attackers.
---
## Quick Reference
| Rule | Level | CWE |
|------|-------|-----|
| Avoid pickle/unsafe YAML | strict | CWE-502 |
| Safe subprocess | strict | CWE-78 |
| Path traversal prevention | strict | CWE-22 |
| Secure temp files | warning | CWE-377 |
| Cryptographic randomness | strict | CWE-330 |
| Password hashing | strict | CWE-916 |
| Parameterized queries | strict | CWE-89 |
| URL scheme validation | strict | CWE-918 |
| Secure cookies | strict | CWE-614 |
| No stack traces | warning | CWE-209 |
---
## Python Coding Practices
### 1. Look Before You Leap (LBYL)
Check conditions proactively rather than using exceptions for control flow.
```python
# WRONG: Exception as control flow
try: value = mapping[key]; process(value)
except KeyError: pass
# CORRECT: Check first
if key in mapping: value = mapping[key]; process(value)
```
### 2. Never Swallow Exceptions
Avoid silent exception swallowing that hides critical failures.
```python
# WRONG: Silent exception swallowing
try: risky_operation()
except: pass
# CORRECT: Let exceptions bubble up
risky_operation()
```
### 3. Magic Methods Must Be O(1)
Magic methods called frequently must run in constant time.
```python
# WRONG: __len__ doing iteration
def __len__(self) -> int: return sum(1 for _ in self._items)
# CORRECT: O(1) __len__
def __len__(self) -> int: return self._count
```
### 4. Check Existence Before Resolution
Always check `.exists()` before calling `.resolve()` or `.is_relative_to()`.
```python
# WRONG: resolve() can raise OSError on non-existent paths
wt_path_resolved = wt_path.resolve()
if current_dir.is_relative_to(wt_path_resolved): current_worktree = wt_path_resolved
# CORRECT: Check exists() first
if wt_path.exists():
    wt_path_resolved = wt_path.resolve()
    if current_dir.is_relative_to(wt_path_resolved): current_worktree = wt_path_resolved
```
### 5. Defer Import-Time Computation
Avoid side effects at import time using `@cache`.
```python
from functools import cache
from pathlib import Path
# WRONG: Path computed at import time
SESSION_FILE = Path("scratch/current-session-id")
# CORRECT: Defer with @cache
@cache
def _session_file_path() -> Path:
    """Return path to session ID file (cached after first call)."""
    return Path("scratch/current-session-id")
```
### 6. Verify Casts at Runtime
Add assertions before `typing.cast()` calls.
```python
# WRONG: Blind cast
cast(dict[str, Any], doc)["key"] = value
# CORRECT: Assert before cast
assert isinstance(doc, MutableMapping), f"Expected MutableMapping, got {type(doc)}"
cast(dict[str, Any], doc)["key"] = value
```
### 7. Use Literal Types for Fixed Values
Model fixed string values with `Literal` types.
```python
# WRONG: Bare strings
issues.append(("orphen-state", "desc")) # Typo goes unnoticed!
# CORRECT: Literal type
IssueCode = Literal["orphan-state", "orphan-dir", "missing-branch"]
@dataclass(frozen=True)
class Issue: code: IssueCode; message: str
issues.append(Issue(code="orphan-state", message="desc")) # Type-checked!
```
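`Literal` is only enforced by the type checker, so values arriving from untyped input (files, CLI arguments, network) still need a runtime check. A minimal sketch, assuming a hypothetical `issue_from_raw` helper; it derives the allowed set from the `Literal` itself with `typing.get_args` so the two cannot drift apart.

```python
from dataclasses import dataclass
from typing import Literal, cast, get_args

IssueCode = Literal["orphan-state", "orphan-dir", "missing-branch"]
VALID_ISSUE_CODES = frozenset(get_args(IssueCode))  # Derived, not retyped


@dataclass(frozen=True)
class Issue:
    code: IssueCode
    message: str


def issue_from_raw(code: str, message: str) -> Issue:
    """Build an Issue from untyped input (hypothetical helper)."""
    if code not in VALID_ISSUE_CODES:
        raise ValueError(f"Unknown issue code: {code!r}")
    # The membership check above is what makes this cast safe
    return Issue(code=cast(IssueCode, code), message=message)
```

With this in place, the misspelled `"orphen-state"` fails loudly at the boundary instead of flowing through as a bare string.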
### 8. Declare Variables Close to Use
Avoid early declarations that pollute scope.
```python
# WRONG: Variable declared far from use
def process_data(ctx, items):
    result_path = compute_result_path(ctx)
    # ... 20 lines of other logic ...
    save_to_path(transformed, result_path)
# CORRECT: Inline at call site
def process_data(ctx, items):
    # ... other logic ...
    save_to_path(transformed, compute_result_path(ctx))
```
### 9. Keyword Arguments for Complex Functions
Use keyword-only arguments for functions with 5+ parameters.
```python
# WRONG: Positional chaos
response = fetch_data(api_url, 30.0, 3, {"Accept": "application/json"}, token)
# CORRECT: Keyword-only after first param
def fetch_data(url, *, timeout: float, retries: int, headers: dict[str, str], auth_token: str) -> Response: ...
response = fetch_data(api_url, timeout=30.0, retries=3, headers={"Accept": "application/json"}, auth_token=token)
```
### 10. Default Values Are Dangerous
Require explicit parameter choices unless defaults are truly necessary.
```python
# DANGEROUS: Caller forgets encoding
def process_file(path: Path, encoding: str = "utf-8") -> str:
    return path.read_text(encoding=encoding)
# SAFER: Require explicit choice
def process_file(path: Path, encoding: str) -> str:
    return path.read_text(encoding=encoding)
```
---

419
src/CLAUDE.md Normal file
View File

@@ -0,0 +1,419 @@
# Python Security & Coding Guidelines
Security rules and coding practices for Python development.
## Prerequisites
- `rules/_core/owasp-2025.md` - Core web security
- `rules/_core/ai-security.md` - AI/ML security (if applicable)
---
## Input Handling
### Avoid Dangerous Deserialization
**Level**: `strict` | **When**: Loading untrusted data | **Refs**: CWE-502, OWASP A08:2025
**Do**:
```python
import json, yaml
data = json.loads(user_input) # Safe
data = yaml.safe_load(user_input) # Safe loader
```
**Don't**:
```python
import pickle, yaml
data = pickle.loads(user_input) # RCE vulnerability
data = yaml.load(user_input, Loader=yaml.Loader) # Arbitrary code execution
```
**Why**: Pickle and unsafe YAML loaders execute arbitrary code during deserialization.
---
### Use Subprocess Safely
**Level**: `strict` | **When**: Executing system commands | **Refs**: CWE-78, OWASP A03:2025
**Do**:
```python
import subprocess, shlex, re
# Pass arguments as list (no shell)
result = subprocess.run(['ls', '-la', user_dir], capture_output=True, text=True, check=True)
# If shell=True required, validate strictly
if re.fullmatch(r'[a-zA-Z0-9_-]+', filename):  # fullmatch also rejects a trailing newline
    subprocess.run(f'process {shlex.quote(filename)}', shell=True)
```
**Don't**:
```python
import os, subprocess
os.system(f'ls {user_input}') # Command injection
subprocess.run(f'grep {pattern} {filename}', shell=True) # Shell injection
```
**Why**: Shell injection allows attackers to execute arbitrary commands.
---
## File Operations
### Prevent Path Traversal
**Level**: `strict` | **When**: File access based on user input | **Refs**: CWE-22, OWASP A01:2025
**Do**:
```python
from pathlib import Path
UPLOAD_DIR = Path('/app/uploads').resolve()
def safe_file_access(filename: str) -> Path:
    requested = (UPLOAD_DIR / filename).resolve()
    if not requested.is_relative_to(UPLOAD_DIR):
        raise ValueError("Path traversal attempt detected")
    return requested
```
**Don't**:
```python
def get_file(filename):
    return open(f'/app/uploads/{filename}').read()  # Path traversal vulnerability
path = os.path.join(base_dir, user_filename) # No validation
```
**Why**: Path traversal (../) allows reading arbitrary files like /etc/passwd.
---
### Secure Temporary Files
**Level**: `warning` | **When**: Creating temporary files | **Refs**: CWE-377, CWE-379
**Do**:
```python
import tempfile, os
with tempfile.NamedTemporaryFile(delete=True) as tmp:
    tmp.write(data); tmp.flush(); process_file(tmp.name)
with tempfile.TemporaryDirectory() as tmpdir:
    filepath = os.path.join(tmpdir, 'data.txt')
```
**Don't**:
```python
tmp_file = f'/tmp/myapp_{user_id}.txt' # Predictable filename, race condition
with open(tmp_file, 'w') as f: f.write(data)
```
**Why**: Predictable temp file names enable symlink attacks and race conditions.
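When the file has to outlive a `with` block, `tempfile.mkstemp` is one option: it creates the file atomically under an unpredictable name and hands back an open descriptor. A minimal sketch; `write_private_temp` and the `myapp_` prefix are hypothetical, and the caller is responsible for deleting the file.

```python
import os
import tempfile


def write_private_temp(data: bytes) -> str:
    """Write data to a new, unpredictably named temp file and return its path.

    Hypothetical helper: the caller must delete the file when done.
    """
    fd, path = tempfile.mkstemp(prefix="myapp_", suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as handle:  # Takes ownership of the descriptor
            handle.write(data)
    except BaseException:
        os.unlink(path)  # Do not leave a partial file behind
        raise
    return path


path = write_private_temp(b"payload")
try:
    with open(path, "rb") as handle:
        contents = handle.read()
finally:
    os.unlink(path)
```

Because the name is generated rather than derived from `user_id`, an attacker cannot pre-create a symlink at the target location.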
---
## Cryptography
### Use Secure Random Numbers
**Level**: `strict` | **When**: Generating security-sensitive random values | **Refs**: CWE-330, CWE-338
**Do**:
```python
import secrets
token = secrets.token_urlsafe(32) # Secure token
api_key = secrets.token_hex(32) # Secure API key
otp = ''.join(secrets.choice('0123456789') for _ in range(6)) # Secure OTP
```
**Don't**:
```python
import random
token = ''.join(random.choices('abcdef0123456789', k=32)) # Predictable
session_id = random.randint(0, 999999) # Predictable PRNG
```
**Why**: The `random` module uses a deterministic PRNG (Mersenne Twister); an attacker who observes enough outputs can predict later tokens.
---
### Hash Passwords Correctly
**Level**: `strict` | **When**: Storing user passwords | **Refs**: CWE-916, CWE-328, OWASP A02:2025
**Do**:
```python
import bcrypt
def hash_password(password: str) -> bytes:
    return bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))
def verify_password(password: str, hashed: bytes) -> bool:
    return bcrypt.checkpw(password.encode(), hashed)
```
**Don't**:
```python
import hashlib
password_hash = hashlib.sha256(password.encode()).hexdigest() # No salt, fast hash
password_hash = hashlib.md5(password.encode()).hexdigest() # MD5 is broken
```
**Why**: Unsalted fast hashes are vulnerable to rainbow tables and GPU cracking.
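Where `bcrypt` cannot be installed, the standard library offers a salted, deliberately slow alternative. Note the swap: the sketch below uses PBKDF2-HMAC-SHA256 from `hashlib`, not bcrypt. The iteration count and the `iterations$salt$digest` storage format are assumptions for illustration, not a project convention.

```python
import hashlib
import hmac
import secrets

PBKDF2_ITERATIONS = 600_000  # Assumed work factor; tune for your hardware


def hash_password(password: str, *, iterations: int = PBKDF2_ITERATIONS) -> str:
    """Return 'iterations$salt_hex$digest_hex' using a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    iterations_text, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations_text)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Storing the iteration count next to the hash lets the work factor be raised later without invalidating existing records.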
---
## SQL Security
### Use Parameterized Queries
**Level**: `strict` | **When**: Database queries with user input | **Refs**: CWE-89, OWASP A03:2025
**Do**:
```python
import sqlite3
cursor.execute("SELECT * FROM users WHERE email = ? AND status = ?", (email, status))
cursor.execute("SELECT * FROM users WHERE email = :email", {"email": email})
user = session.query(User).filter(User.email == email).first() # SQLAlchemy ORM
```
**Don't**:
```python
query = f"SELECT * FROM users WHERE email = '{email}'" # SQL injection
cursor.execute(query)
cursor.execute("SELECT * FROM users WHERE id = %s" % user_id) # String formatting
```
**Why**: SQL injection allows attackers to read, modify, or delete database data.
---
## Web Security
### Validate URL Schemes
**Level**: `strict` | **When**: Processing user-provided URLs | **Refs**: CWE-918, OWASP A10:2025
**Do**:
```python
from urllib.parse import urlparse
ALLOWED_SCHEMES = {'http', 'https'}
def validate_url(url: str) -> str:
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Invalid scheme: {parsed.scheme}")
    if not parsed.netloc:
        raise ValueError("Missing hostname")
    return url
```
**Don't**:
```python
def fetch_url(url):
    return requests.get(url)  # Allows file://, javascript:, etc.
```
**Why**: Malicious schemes like `file://` or `javascript:` can read local files or execute code.
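Since this rule cites CWE-918 (SSRF), a scheme check alone may not be enough: `http://127.0.0.1/` passes it. The sketch below extends the idea by also rejecting IP-literal hosts in private, loopback, or link-local ranges. `validate_public_url` is a hypothetical name, and hostnames that merely resolve to internal addresses are deliberately out of scope here.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def validate_public_url(url: str) -> str:
    """Reject disallowed schemes and IP-literal hosts that point inward."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Invalid scheme: {parsed.scheme!r}")
    host = parsed.hostname
    if not host:
        raise ValueError("Missing hostname")
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return url  # Not an IP literal; DNS resolution is not checked in this sketch
    if address.is_private or address.is_loopback or address.is_link_local:
        raise ValueError(f"Blocked internal address: {host}")
    return url
```

A fuller defense would resolve the hostname and apply the same address checks to the result before connecting.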
---
### Set Secure Cookie Attributes
**Level**: `strict` | **When**: Setting cookies in web applications | **Refs**: CWE-614, CWE-1004, OWASP A07:2025
**Do**:
```python
from flask import make_response
response = make_response(data)
response.set_cookie('session_id', value=session_id, httponly=True, secure=True, samesite='Lax', max_age=3600)
```
**Don't**:
```python
response.set_cookie('session_id', session_id) # Missing security attributes
```
**Why**: Missing attributes expose cookies to XSS theft and CSRF attacks.
---
## Error Handling
### Don't Expose Stack Traces
**Level**: `warning` | **When**: Handling exceptions in production | **Refs**: CWE-209, OWASP A05:2025
**Do**:
```python
import logging
logger = logging.getLogger(__name__)
@app.errorhandler(Exception)
def handle_error(error):
    logger.exception("Unhandled exception")  # Log full details internally
    return {"error": "Internal server error"}, 500  # Return safe message to client
```
**Don't**:
```python
@app.errorhandler(Exception)
def handle_error(error):
    return {"error": str(error), "traceback": traceback.format_exc()}, 500  # Exposes internals
```
**Why**: Stack traces reveal file paths, library versions, and code structure to attackers.
---
## Quick Reference
| Rule | Level | CWE |
|------|-------|-----|
| Avoid pickle/unsafe YAML | strict | CWE-502 |
| Safe subprocess | strict | CWE-78 |
| Path traversal prevention | strict | CWE-22 |
| Secure temp files | warning | CWE-377 |
| Cryptographic randomness | strict | CWE-330 |
| Password hashing | strict | CWE-916 |
| Parameterized queries | strict | CWE-89 |
| URL scheme validation | strict | CWE-918 |
| Secure cookies | strict | CWE-614 |
| No stack traces | warning | CWE-209 |
---
## Python Coding Practices
### 1. Look Before You Leap (LBYL)
Check conditions proactively rather than using exceptions for control flow.
```python
# WRONG: Exception as control flow
try: value = mapping[key]; process(value)
except KeyError: pass
# CORRECT: Check first
if key in mapping: value = mapping[key]; process(value)
```
### 2. Never Swallow Exceptions
Avoid silent exception swallowing that hides critical failures.
```python
# WRONG: Silent exception swallowing
try: risky_operation()
except: pass
# CORRECT: Let exceptions bubble up
risky_operation()
```
### 3. Magic Methods Must Be O(1)
Magic methods called frequently must run in constant time.
```python
# WRONG: __len__ doing iteration
def __len__(self) -> int: return sum(1 for _ in self._items)
# CORRECT: O(1) __len__
def __len__(self) -> int: return self._count
```
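The `self._count` in the correct version has to be maintained somewhere. A minimal sketch of one way to do that, using a hypothetical `TagBag` multiset that updates its total on every mutation so `__len__` never iterates:

```python
class TagBag:
    """Multiset of tags that tracks its size incrementally (hypothetical example)."""

    def __init__(self) -> None:
        self._counts: dict[str, int] = {}
        self._total = 0

    def add(self, tag: str) -> None:
        self._counts[tag] = self._counts.get(tag, 0) + 1
        self._total += 1

    def discard(self, tag: str) -> None:
        remaining = self._counts.get(tag, 0)
        if remaining == 0:
            return  # Tag not present; size is unchanged
        if remaining == 1:
            del self._counts[tag]
        else:
            self._counts[tag] = remaining - 1
        self._total -= 1

    def __len__(self) -> int:
        return self._total  # O(1): no iteration over _counts
```

The cost moves to the mutating methods, where it is a constant amount of bookkeeping per call.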
### 4. Check Existence Before Resolution
Always check `.exists()` before calling `.resolve()` or `.is_relative_to()`.
```python
# WRONG: resolve() can raise OSError on non-existent paths
wt_path_resolved = wt_path.resolve()
if current_dir.is_relative_to(wt_path_resolved): current_worktree = wt_path_resolved
# CORRECT: Check exists() first
if wt_path.exists():
    wt_path_resolved = wt_path.resolve()
    if current_dir.is_relative_to(wt_path_resolved): current_worktree = wt_path_resolved
```
### 5. Defer Import-Time Computation
Avoid side effects at import time using `@cache`.
```python
from functools import cache
from pathlib import Path
# WRONG: Path computed at import time
SESSION_FILE = Path("scratch/current-session-id")
# CORRECT: Defer with @cache
@cache
def _session_file_path() -> Path:
    """Return path to session ID file (cached after first call)."""
    return Path("scratch/current-session-id")
```
### 6. Verify Casts at Runtime
Add assertions before `typing.cast()` calls.
```python
# WRONG: Blind cast
cast(dict[str, Any], doc)["key"] = value
# CORRECT: Assert before cast
assert isinstance(doc, MutableMapping), f"Expected MutableMapping, got {type(doc)}"
cast(dict[str, Any], doc)["key"] = value
```
### 7. Use Literal Types for Fixed Values
Model fixed string values with `Literal` types.
```python
# WRONG: Bare strings
issues.append(("orphen-state", "desc")) # Typo goes unnoticed!
# CORRECT: Literal type
IssueCode = Literal["orphan-state", "orphan-dir", "missing-branch"]
@dataclass(frozen=True)
class Issue: code: IssueCode; message: str
issues.append(Issue(code="orphan-state", message="desc")) # Type-checked!
```
### 8. Declare Variables Close to Use
Avoid early declarations that pollute scope.
```python
# WRONG: Variable declared far from use
def process_data(ctx, items):
    result_path = compute_result_path(ctx)
    # ... 20 lines of other logic ...
    save_to_path(transformed, result_path)
# CORRECT: Inline at call site
def process_data(ctx, items):
    # ... other logic ...
    save_to_path(transformed, compute_result_path(ctx))
```
### 9. Keyword Arguments for Complex Functions
Use keyword-only arguments for functions with 5+ parameters.
```python
# WRONG: Positional chaos
response = fetch_data(api_url, 30.0, 3, {"Accept": "application/json"}, token)
# CORRECT: Keyword-only after first param
def fetch_data(url, *, timeout: float, retries: int, headers: dict[str, str], auth_token: str) -> Response: ...
response = fetch_data(api_url, timeout=30.0, retries=3, headers={"Accept": "application/json"}, auth_token=token)
```
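The bare `*` is what enforces this: Python itself rejects positional use of everything after it. A small sketch, with a hypothetical two-option `fetch_data` stand-in that only echoes its arguments:

```python
def fetch_data(url: str, *, timeout: float, retries: int) -> dict[str, object]:
    """Hypothetical stand-in that only echoes its arguments."""
    return {"url": url, "timeout": timeout, "retries": retries}


ok = fetch_data("https://example.com", timeout=30.0, retries=3)

try:
    # Positional values after the first parameter are rejected at call time
    fetch_data("https://example.com", 30.0, 3)  # type: ignore[misc]
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Swapping `timeout` and `retries` by accident therefore becomes impossible rather than merely unlikely.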
### 10. Default Values Are Dangerous
Require explicit parameter choices unless defaults are truly necessary.
```python
# DANGEROUS: Caller forgets encoding
def process_file(path: Path, encoding: str = "utf-8") -> str:
    return path.read_text(encoding=encoding)
# SAFER: Require explicit choice
def process_file(path: Path, encoding: str) -> str:
    return path.read_text(encoding=encoding)
```
---

608
src/noteflow/CLAUDE.md Normal file
View File

@@ -0,0 +1,608 @@
# CLAUDE.md - Python Backend Reference
This file provides guidance for working with the Python backend code in `src/noteflow/`. For client-side development, see the main repository CLAUDE.md.
## Project Overview
NoteFlow is an intelligent meeting notetaker: local-first audio capture + navigable recall + evidence-linked summaries. This is the **Python backend** (`src/noteflow/`) — gRPC server, domain logic, infrastructure adapters.
The gRPC schema is the shared contract between backend and client; keep proto changes in sync across Python, Rust, and TypeScript.
---
## Quick Orientation
| Entry Point | Description |
|-------------|-------------|
| `python -m noteflow.grpc.server` | Backend server (`src/noteflow/grpc/server/__main__.py`) |
| `src/noteflow/grpc/proto/noteflow.proto` | Protobuf schema |
---
## Build and Development Commands
```bash
# Install (editable with dev dependencies)
python -m pip install -e ".[dev]"
# Run gRPC server
python -m noteflow.grpc.server --help
# Tests
pytest # Full suite
pytest -m "not integration" # Skip external-service tests
pytest tests/domain/ # Run specific test directory
pytest -k "test_segment" # Run by pattern
# Docker (hot-reload enabled)
docker compose up -d postgres # PostgreSQL with pgvector
python scripts/dev_watch_server.py # Auto-reload server
```
### Forbidden Docker Operations (without explicit permission)
- `docker compose build` / `up` / `down` / `restart`
- `docker stop` / `docker kill`
- Any command that would interrupt the hot-reload server
---
## Quality Commands (Makefile)
**Always use Makefile targets instead of running tools directly.**
### Primary Quality Commands
```bash
make quality # Run ALL quality checks (TS + Rust + Python)
make quality-py # Python: lint + type-check + test-quality
```
### Python Quality
```bash
make lint-py # Basedpyright lint → .hygeine/basedpyright.lint.json
make type-check-py # Basedpyright strict mode
make test-quality-py # pytest tests/quality/
make lint-fix-py # Auto-fix Ruff + Sourcery issues
```
---
## Architecture
```
src/noteflow/
├── domain/ # Entities, ports, value objects
│ ├── entities/ # Meeting, Segment, Summary, Annotation, NamedEntity, Integration, Project, Processing, SummarizationTemplate
│ ├── identity/ # User, Workspace, WorkspaceMembership, roles, context
│ ├── auth/ # OIDC discovery, claims, constants
│ ├── ports/ # Repository protocols
│ │ ├── repositories/
│ │ │ ├── transcript.py # MeetingRepository, SegmentRepository, SummaryRepository
│ │ │ ├── asset.py # AssetRepository
│ │ │ ├── background.py # DiarizationJobRepository
│ │ │ ├── external/ # WebhookRepository, IntegrationRepository, EntityRepository, UsageRepository
│ │ │ └── identity/ # UserRepository, WorkspaceRepository, ProjectRepository, MembershipRepository, SummarizationTemplateRepository
│ │ ├── unit_of_work.py # UnitOfWork protocol (supports_* capability checks)
│ │ ├── async_context.py # Async context utilities
│ │ ├── diarization.py # DiarizationEngine protocol
│ │ ├── ner.py # NEREngine protocol
│ │ └── calendar.py # CalendarProvider protocol
│ ├── webhooks/ # WebhookEventType, WebhookConfig, WebhookDelivery, payloads
│ ├── triggers/ # Trigger, TriggerAction, TriggerSignal, TriggerProvider
│ ├── summarization/# SummarizationProvider protocol
│ ├── rules/ # Business rules registry, models, builtin rules
│ ├── settings/ # Domain settings base
│ ├── constants/ # Field definitions, placeholders
│ ├── utils/ # time.py (utc_now), validation.py
│ ├── errors.py # Domain-specific exceptions
│ └── value_objects.py
├── application/ # Use-cases/services
│ ├── services/
│ │ ├── meeting/ # MeetingService (CRUD, segments, annotations, summaries, state)
│ │ ├── identity/ # IdentityService (context, workspace, defaults)
│ │ ├── calendar/ # CalendarService (connection, events, oauth, sync)
│ │ ├── summarization/ # SummarizationService, TemplateService, ConsentManager
│ │ ├── project_service/ # CRUD, members, roles, rules, active project
│ │ ├── recovery/ # RecoveryService (meeting, job, audio recovery)
│ │ ├── auth/ # AuthService + helpers (service/types/workflows/constants)
│ │ ├── asr_config/ # ASR config service + types/persistence
│ │ ├── streaming_config/ # Streaming config persistence helpers
│ │ ├── export/ # ExportService
│ │ ├── huggingface/ # HfTokenService
│ │ ├── ner/ # NerService
│ │ ├── retention/ # RetentionService
│ │ ├── triggers/ # TriggerService
│ │ ├── webhooks/ # WebhookService
│ │ └── protocols/ # Shared service protocols
│ └── observability/ # Observability ports (ports/)
├── infrastructure/ # Implementations
│ ├── audio/ # sounddevice capture, ring buffer, VU levels, playback, writer, reader
│ ├── asr/ # faster-whisper engine, VAD segmenter, streaming, DTOs
│ ├── auth/ # OIDC discovery, registry, presets
│ ├── diarization/ # Session, assigner, engine (streaming: diart, offline: pyannote)
│ ├── summarization/# CloudProvider, OllamaProvider, MockProvider, parsing, citation verifier, template renderer
│ ├── triggers/ # Calendar, audio_activity, foreground_app providers
│ ├── persistence/ # SQLAlchemy + asyncpg + pgvector
│ │ ├── database.py # create_async_engine, create_async_session_factory
│ │ ├── models/ # ORM models (core/, identity/, integrations/, entities/, observability/, organization/)
│ │ ├── repositories/ # Repository implementations
│ │ │ ├── meeting_repo.py
│ │ │ ├── segment_repo.py
│ │ │ ├── summary_repo.py
│ │ │ ├── annotation_repo.py
│ │ │ ├── entity_repo.py
│ │ │ ├── webhook_repo.py
│ │ │ ├── preferences_repo.py
│ │ │ ├── asset_repo.py
│ │ │ ├── summarization_template_repo.py
│ │ │ ├── diarization_job/
│ │ │ ├── integration/
│ │ │ ├── identity/
│ │ │ ├── usage_event/
│ │ │ └── _base/ # BaseRepository with _execute_scalar, _execute_scalars, _add_and_flush
│ │ ├── unit_of_work/
│ │ ├── memory/ # In-memory implementations
│ │ └── migrations/ # Alembic migrations
│ ├── security/ # Keystore (keyring + AES-GCM), protocols, crypto/
│ ├── crypto/ # Cryptographic utilities
│ ├── export/ # Markdown, HTML, PDF (WeasyPrint), formatting helpers
│ ├── webhooks/ # WebhookExecutor (delivery, signing, metrics)
│ ├── converters/ # ORM ↔ domain (orm, webhook, ner, calendar, integration, asr)
│ ├── calendar/ # Google/Outlook adapters, OAuth flow
│ ├── auth/ # OIDC registry, discovery, presets
│ ├── ner/ # spaCy NER engine
│ ├── observability/# OpenTelemetry tracing (otel.py), usage event tracking
│ ├── metrics/ # Metric collection utilities
│ ├── logging/ # Log buffer and utilities
│ └── platform/ # Platform-specific code
├── grpc/ # gRPC layer
│ ├── proto/ # noteflow.proto, generated *_pb2.py/*_pb2_grpc.py
│ ├── server/ # Bootstrap, lifecycle, setup, services, types
│ ├── service.py # NoteFlowServicer
│ ├── client.py # Python gRPC client wrapper
│ ├── meeting_store.py
│ ├── stream_state.py
│ ├── interceptors/ # Identity context propagation
│ ├── _mixins/ # Server-side gRPC mixins (see below)
│ └── _client_mixins/# Client-side gRPC mixins
├── cli/ # CLI tools
│ ├── __main__.py # CLI entry point
│ ├── retention.py # Retention management
│ ├── constants.py # CLI constants
│ └── models/ # Model commands (package)
│ ├── _download.py
│ ├── _parser.py
│ ├── _registry.py
│ ├── _status.py
│ └── _types.py
└── config/ # Pydantic settings (NOTEFLOW_ env vars)
├── settings/ # _main.py, _features.py, _triggers.py, _calendar.py, _loaders.py
└── constants/
```
### gRPC Server Mixins (`grpc/_mixins/`)
```
_mixins/
├── streaming/ # ASR streaming (package)
│ ├── _mixin.py # Main StreamingMixin
│ ├── _session.py # Session management
│ ├── _asr.py # ASR processing
│ ├── _processing/ # Audio processing pipeline
│ │ ├── _audio_ops.py
│ │ ├── _chunk_tracking.py
│ │ ├── _congestion.py
│ │ ├── _constants.py
│ │ ├── _types.py
│ │ └── _vad_processing.py
│ ├── _partials.py # Partial transcript handling
│ ├── _cleanup.py # Resource cleanup
│ └── _types.py
├── diarization/ # Speaker diarization (package)
│ ├── _mixin.py # Main DiarizationMixin
│ ├── _jobs.py # Background job management
│ ├── _refinement.py# Offline refinement
│ ├── _streaming.py # Real-time diarization
│ ├── _speaker.py # Speaker assignment
│ ├── _status.py # Job status tracking
│ └── _types.py
├── summarization/ # Summary generation (package)
│ ├── _generation_mixin.py
│ ├── _templates_mixin.py
│ ├── _consent_mixin.py
│ ├── _template_crud.py
│ ├── _template_resolution.py
│ ├── _summary_generation.py
│ ├── _consent.py
│ └── _context_builders.py
├── meeting/ # Meeting lifecycle (package)
│ ├── meeting_mixin.py
│ ├── _project_scope.py
│ └── _stop_ops.py
├── project/ # Project management (package)
│ ├── _mixin.py
│ ├── _membership.py
│ └── _converters.py
├── oidc/ # OIDC authentication (package)
│ ├── oidc_mixin.py
│ └── _support.py
├── converters/ # Proto ↔ domain converters (package)
│ ├── _domain.py
│ ├── _external.py
│ ├── _timestamps.py
│ ├── _id_parsing.py
│ └── _oidc.py
├── errors/ # gRPC error helpers (package)
│ ├── _abort.py # abort_not_found, abort_invalid_argument
│ ├── _require.py # Requirement checks
│ ├── _fetch.py # Fetch with error handling
│ ├── _parse.py # Parsing helpers
│ └── _constants.py
├── servicer_core/ # Core servicer protocols
├── servicer_other/ # Additional servicer protocols
├── annotation.py # Segment annotations CRUD
├── export.py # Markdown/HTML/PDF export
├── entities.py # Named entity extraction
├── calendar.py # Calendar sync operations
├── webhooks.py # Webhook management
├── preferences.py # User preferences
├── observability.py # Usage tracking, metrics
├── identity.py # User/workspace identity
├── sync.py # State synchronization
├── diarization_job.py # Job status/management
├── protocols.py # ServicerHost protocol
├── _types.py
├── _audio_processing.py
├── _repository_protocols.py
└── _servicer_state.py
```
### gRPC Client Mixins (`grpc/_client_mixins/`)
```
_client_mixins/
├── streaming.py # Client streaming operations
├── meeting.py # Meeting CRUD operations
├── diarization.py # Diarization requests
├── export.py # Export requests
├── annotation.py # Annotation operations
├── converters.py # Response converters
└── protocols.py # ClientHost protocol
```
---
## Database
PostgreSQL with pgvector extension. Async SQLAlchemy with asyncpg driver.
```bash
# Alembic migrations
alembic upgrade head
alembic revision --autogenerate -m "description"
```
Connection via `NOTEFLOW_DATABASE_URL` env var or settings.
### ORM Models (`persistence/models/`)
| Directory | Models |
|-----------|--------|
| `core/` | MeetingModel, SegmentModel, SummaryModel, AnnotationModel, DiarizationJobModel |
| `identity/` | UserModel, WorkspaceModel, WorkspaceMembershipModel, ProjectModel, ProjectMembershipModel, SettingsModel |
| `integrations/` | IntegrationModel, IntegrationSecretModel, CalendarEventModel, MeetingCalendarLinkModel, WebhookConfigModel, WebhookDeliveryModel |
| `entities/` | NamedEntityModel, SpeakerModel |
| `observability/` | UsageEventModel |
| `organization/` | SummarizationTemplateModel, TaskModel, TagModel |
---
## Testing Conventions
- Test files: `test_*.py`, functions: `test_*`
- Markers: `@pytest.mark.slow` (model loading), `@pytest.mark.integration` (external services)
- Integration tests use testcontainers for PostgreSQL
- Asyncio auto-mode enabled
### Test Quality Gates (`tests/quality/`)
**After any non-trivial changes**, run:
```bash
pytest tests/quality/
```
This suite enforces:
| Check | Description |
|-------|-------------|
| `test_test_smells.py` | No assertion roulette, no conditional test logic, no loops in tests |
| `test_magic_values.py` | No magic numbers in assignments |
| `test_code_smells.py` | Code quality checks |
| `test_duplicate_code.py` | No duplicate code patterns |
| `test_stale_code.py` | No stale/dead code |
| `test_decentralized_helpers.py` | Helpers consolidated properly |
| `test_unnecessary_wrappers.py` | No unnecessary wrapper functions |
| `test_baseline_self.py` | Baseline validation self-checks |
### Global Fixtures (`tests/conftest.py`)
**Do not redefine these fixtures:**
| Fixture | Description |
|---------|-------------|
| `reset_context_vars` | Reset logging context variables |
| `mock_uow` | Mock Unit of Work |
| `crypto` | Crypto utilities |
| `meetings_dir` | Temporary meetings directory |
| `webhook_config` | Single-event webhook config |
| `webhook_config_all_events` | All-events webhook config |
| `sample_datetime` | Sample datetime |
| `calendar_settings` | Calendar settings |
| `meeting_id` | Sample meeting ID |
| `sample_meeting` | Sample Meeting entity |
| `recording_meeting` | Recording-state Meeting |
| `sample_rate` | Audio sample rate |
| `mock_grpc_context` | Mock gRPC context |
| `mockasr_engine` | Mock ASR engine |
| `mock_optional_extras` | Mock optional extras |
| `mock_oauth_manager` | Mock OAuth manager |
| `memory_servicer` | In-memory servicer |
| `approx_float` | Approximate float comparison |
| `approx_sequence` | Approximate sequence comparison |
---
## Code Reuse (CRITICAL)
**BEFORE writing ANY new code, you MUST search for existing implementations.**
This is not optional. Redundant code creates maintenance burden, inconsistency, and bugs.
### Mandatory Search Process
1. **Search for existing functions** that do what you need:
```text
# Use Serena's symbolic tools first
find_symbol with substring_matching=true
search_for_pattern for broader searches
```
2. **Check related modules** - if you need audio device config, check `device.rs`; if you need preferences, check `preferences.ts`
3. **Look at imports** in similar files - they reveal available utilities
4. **Only create new code if:**
- No existing implementation exists
- Existing code cannot be reasonably extended
- You have explicit approval for new abstractions
### Anti-Patterns (FORBIDDEN)
| Anti-Pattern | Correct Approach |
|--------------|------------------|
| Creating wrapper functions for existing utilities | Use the existing function directly |
| Duplicating validation logic | Find and reuse existing validators |
| Writing new helpers without searching | Search first, ask if unsure |
| "It's faster to write new code" | Technical debt is never faster |
---
## Code Style
### Python
- Python 3.12+, 100-char line length
- 4-space indentation
- Naming: `snake_case` modules/functions, `PascalCase` classes, `UPPER_SNAKE_CASE` constants
- Strict basedpyright (0 errors, 0 warnings, 0 notes required)
- Ruff for linting (E, W, F, I, B, C4, UP, SIM, RUF)
- Module soft limit 500 LoC, hard limit 750 LoC
- Generated `*_pb2.py`, `*_pb2_grpc.py` excluded from lint
---
## Type Safety (Zero Tolerance)
### Forbidden Patterns (Python)
| Pattern | Why Blocked | Alternative |
|---------|-------------|-------------|
| `# type: ignore` | Bypasses type safety | Fix the actual type error |
| `# pyright: ignore` | Bypasses type safety | Fix the actual type error |
| `Any` type annotations | Creates type safety holes | Use `Protocol`, `TypeVar`, `TypedDict`, or specific types |
| Magic numbers | Hidden intent | Define `typing.Final` constants |
| Loops in tests | Obscure which case failed | Use `@pytest.mark.parametrize` |
| Conditionals in tests | Hide untested branches | Use `@pytest.mark.parametrize` |
| Multiple assertions without messages | Hard to debug | Add assertion messages |
### Type Resolution Hierarchy
When facing dynamic types:
1. **`Protocol`** — For duck typing (structural subtyping)
2. **`TypeVar`** — For generics
3. **`TypedDict`** — For structured dictionaries
4. **`cast()`** — Last resort (with comment explaining why)
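The hierarchy above can be sketched as follows. This is an illustrative example rather than code from the repository: `SupportsClose`, `close_all`, `first_or`, `WebhookPayload`, `parse_payload`, and `as_str_list` are hypothetical names chosen for the sketch.

```python
from typing import Protocol, TypedDict, TypeVar, cast


class SupportsClose(Protocol):
    """1. Protocol: any object with a close() method matches structurally."""

    def close(self) -> None: ...


def close_all(resources: list[SupportsClose]) -> int:
    """Close every resource and return how many were closed."""
    for resource in resources:
        resource.close()
    return len(resources)


T = TypeVar("T")


def first_or(items: list[T], default: T) -> T:
    """2. TypeVar: the return type follows the element type."""
    return items[0] if items else default


class WebhookPayload(TypedDict):
    """3. TypedDict: a dictionary with a fixed, typed shape."""

    event: str
    attempt: int


def parse_payload(raw: dict[str, object]) -> WebhookPayload:
    """Validate an untyped dict and return it as a typed payload."""
    event = raw.get("event")
    attempt = raw.get("attempt")
    if not isinstance(event, str) or not isinstance(attempt, int):
        raise ValueError("payload needs a str 'event' and an int 'attempt'")
    return WebhookPayload(event=event, attempt=attempt)


def as_str_list(value: object) -> list[str]:
    """4. cast(): last resort, used only after a runtime check."""
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise TypeError("expected list[str]")
    # The runtime check above proves the element type, but the checker
    # cannot narrow list[object] to list[str]; cast() records that fact.
    return cast(list[str], value)
```

Each step avoids `Any`: the checker still sees a concrete type at every call site, and `cast()` appears only where a runtime check has already established the type.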
### Validation Requirements
After any code changes:
```bash
source .venv/bin/activate && basedpyright src/noteflow/
```
**Expected output:** `0 errors, 0 warnings, 0 notes`
---
## Automated Enforcement (Hookify Rules)
### Protected Files (Require Explicit Permission)
| File/Directory | What's Blocked |
|----------------|----------------|
| `Makefile` | All modifications |
| `tests/quality/` (except `baselines.json`) | All modifications |
| `pyproject.toml`, `ruff.toml`, `pyrightconfig.json` | All edits |
### Quality Gate Requirement
Before completing any code changes:
```bash
make quality
```
All quality checks must pass.
### Policy: No Ignoring Pre-existing Issues
If you encounter lint errors, type errors, or test failures—**even if they existed before your changes**—you must either:
1. Fix immediately (for simple issues)
2. Add to todo list (for complex issues)
3. Launch a subagent to fix (for parallelizable work)
### Policy: Never Modify Quality Test Allowlists
**STRICTLY FORBIDDEN** without explicit user permission:
1. Adding entries to allowlists/whitelists in quality tests
2. Increasing thresholds
3. Adding exclusion patterns to skip files from quality checks
4. Modifying filter functions to bypass detection
5. **Reading or accessing quality test files to check allowlist contents**
**When quality tests fail, the correct approach is:**
1. **Fix the actual code** that triggers the violation
2. If the detection is a false positive, **improve the filter logic**
3. **Never** add arbitrary values to allowlists just to make tests pass
4. **Never** read allowlist files to see "what's allowed"
---
## Proto/gRPC
Proto definitions: `src/noteflow/grpc/proto/noteflow.proto`
Regenerate after proto changes:
```bash
python -m grpc_tools.protoc -I src/noteflow/grpc/proto \
--python_out=src/noteflow/grpc/proto \
--grpc_python_out=src/noteflow/grpc/proto \
src/noteflow/grpc/proto/noteflow.proto
```
Then run stub patching:
```bash
python scripts/patch_grpc_stubs.py
```
### Sync Points (High Risk of Breakage)
When changing proto:
1. **Python stubs** — Regenerate `*_pb2.py`, `*_pb2_grpc.py`
2. **Server mixins** — Update `src/noteflow/grpc/_mixins/`
3. **Python client** — Update `src/noteflow/grpc/client.py`
---
## Key Subsystems
### Speaker Diarization
- **Streaming**: diart for real-time speaker detection
- **Offline**: pyannote.audio for post-meeting refinement
- **gRPC**: `RefineSpeakerDiarization`, `GetDiarizationJobStatus`, `RenameSpeaker`
### Summarization
- **Providers**: CloudProvider (Anthropic/OpenAI), OllamaProvider (local), MockProvider (testing)
- **Templates**: Configurable tone, format, verbosity
- **Citation verification**: Links summary claims to transcript evidence
- **Consent**: Cloud providers require explicit user consent
### Export
- **Formats**: Markdown, HTML, PDF (via WeasyPrint)
- **Content**: Transcript with timestamps, speaker labels, summary
### Named Entity Recognition (NER)
- **Engine**: spaCy with transformer models
- **Categories**: person, company, product, technical, acronym, location, date, other
- **Segment tracking**: Entities link to source `segment_ids`
### Trigger Detection
- **Signals**: Calendar proximity, audio activity, foreground app
- **Actions**: IGNORE, NOTIFY, AUTO_START
### Webhooks
- **Events**: `meeting.completed`, `summary.generated`, `recording.started`, `recording.stopped`
- **Delivery**: Exponential backoff retries
- **Security**: HMAC-SHA256 signing
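As a rough illustration of the delivery and security bullets above, the sketch below signs a payload with HMAC-SHA256 and computes capped exponential backoff delays. It uses only the standard library. The function names, the base delay, the cap, and the hex encoding of the signature are assumptions made for the example, not details confirmed by the NoteFlow source.

```python
import hashlib
import hmac
from typing import Final

# Assumed values for illustration; the real retry schedule is not stated here.
BASE_DELAY_SECONDS: Final[float] = 1.0
MAX_DELAY_SECONDS: Final[float] = 60.0
BACKOFF_FACTOR: Final[float] = 2.0


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 digest of the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the digest and compare in constant time."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


def backoff_delay(attempt: int) -> float:
    """Delay before retry number `attempt` (0-based), capped at the maximum."""
    return min(BASE_DELAY_SECONDS * BACKOFF_FACTOR**attempt, MAX_DELAY_SECONDS)
```

A receiver would call `verify_signature` with the shared secret and the raw request body before trusting the event; `hmac.compare_digest` is used so the comparison does not leak timing information.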
### Authentication
- **OIDC**: OpenID Connect with discovery
- **Providers**: Configurable via OIDC registry
---
## Feature Flags
| Flag | Default | Controls |
|------|---------|----------|
| `NOTEFLOW_FEATURE_TEMPLATES_ENABLED` | `true` | AI summarization templates |
| `NOTEFLOW_FEATURE_PDF_EXPORT_ENABLED` | `true` | PDF export format |
| `NOTEFLOW_FEATURE_NER_ENABLED` | `false` | Named entity extraction |
| `NOTEFLOW_FEATURE_CALENDAR_ENABLED` | `false` | Calendar sync |
| `NOTEFLOW_FEATURE_WEBHOOKS_ENABLED` | `true` | Webhook notifications |
Access via `get_feature_flags().<flag_name>` or `get_settings().feature_flags.<flag_name>`.
---
## Common Pitfalls Checklist
### When Adding New Features
- [ ] Update proto schema first (if gRPC involved)
- [ ] Regenerate Python stubs
- [ ] Run `scripts/patch_grpc_stubs.py`
- [ ] Implement server mixin
- [ ] Update Python client wrapper
- [ ] Add tests (both backend and client)
- [ ] Run `make quality`
### When Changing Database Schema
- [ ] Update ORM models in `persistence/models/`
- [ ] Create Alembic migration
- [ ] Update repository implementation
- [ ] Update UnitOfWork if needed
- [ ] Update converters in `infrastructure/converters/`
### When Modifying Existing Code
- [ ] Search for all usages first
- [ ] Update all call sites
- [ ] Run `make quality`
- [ ] Run relevant tests
---
## Known Issues & Technical Debt
See `docs/triage.md` for tracked technical debt.
See `docs/sprints/` for feature implementation plans.

@@ -1,19 +1,18 @@
"""Application services for NoteFlow use cases."""
-from noteflow.application.services.auth_service import (
+from noteflow.application.services.auth import (
     AuthResult,
     AuthService,
     AuthServiceError,
     LogoutResult,
     UserInfo,
 )
-from noteflow.application.services.export_service import ExportService
-from noteflow.domain.value_objects import ExportFormat
+from noteflow.application.services.export import ExportFormat, ExportService
 from noteflow.application.services.identity import IdentityService
 from noteflow.application.services.meeting import MeetingService
 from noteflow.application.services.project_service import ProjectService
 from noteflow.application.services.recovery import RecoveryService
-from noteflow.application.services.retention_service import RetentionReport, RetentionService
+from noteflow.application.services.retention import RetentionReport, RetentionService
 from noteflow.application.services.summarization import (
     SummarizationMode,
     SummarizationService,
@@ -22,7 +21,7 @@ from noteflow.application.services.summarization import (
     SummarizationTemplateService,
     TemplateUpdateResult,
 )
-from noteflow.application.services.trigger_service import TriggerService, TriggerServiceSettings
+from noteflow.application.services.triggers import TriggerService, TriggerServiceSettings
 __all__ = [
     "AuthResult",

@@ -0,0 +1,31 @@
"""ASR configuration services and helpers."""
from .persistence import (
AsrConfigPreference,
AsrPreferenceResolution,
build_asr_config_preference,
resolve_asr_config_preference,
)
from .service import AsrConfigService
from .types import (
DEVICE_COMPUTE_TYPES,
AsrCapabilities,
AsrComputeType,
AsrConfigJob,
AsrConfigPhase,
AsrDevice,
)
__all__ = [
"AsrCapabilities",
"AsrComputeType",
"AsrConfigJob",
"AsrConfigPhase",
"AsrConfigPreference",
"AsrConfigService",
"AsrDevice",
"AsrPreferenceResolution",
"DEVICE_COMPUTE_TYPES",
"build_asr_config_preference",
"resolve_asr_config_preference",
]

@@ -5,11 +5,7 @@ from __future__ import annotations
from dataclasses import dataclass
from typing import Final, Protocol, TypedDict, cast
from noteflow.application.services.asr_config_types import (
AsrComputeType,
AsrDevice,
DEVICE_COMPUTE_TYPES,
)
from .types import AsrComputeType, AsrDevice, DEVICE_COMPUTE_TYPES
from noteflow.domain.constants.fields import DEVICE
from noteflow.infrastructure.asr import VALID_MODEL_SIZES
from noteflow.infrastructure.logging import get_logger
@@ -118,7 +114,7 @@ def _parse_preference(raw_value: object) -> AsrConfigPreference | None:
if compute_type is not None:
preference[_PREF_COMPUTE_KEY] = compute_type
return preference if preference else None
return preference or None
def _read_string(value: object) -> str | None:
@@ -131,9 +127,7 @@ def _read_string(value: object) -> str | None:
def _resolve_model_size(preferred: str | None, fallback: str) -> str:
if preferred in VALID_MODEL_SIZES:
return preferred
if fallback in VALID_MODEL_SIZES:
return fallback
return VALID_MODEL_SIZES[0]
return fallback if fallback in VALID_MODEL_SIZES else VALID_MODEL_SIZES[0]
def _resolve_device(

@@ -5,7 +5,7 @@ from collections.abc import Awaitable, Callable
from typing import TYPE_CHECKING
from uuid import UUID, uuid4
from noteflow.application.services.asr_config_types import (
from .types import (
DEVICE_COMPUTE_TYPES,
AsrCapabilities,
AsrComputeType,
@@ -335,5 +335,5 @@ class AsrConfigService:
except TimeoutError:
logger.warning(
"asr_config_shutdown_timeout",
remaining_jobs=sum(1 for t in tasks_to_cancel if not t.done()),
remaining_jobs=sum(not t.done() for t in tasks_to_cancel),
)

@@ -0,0 +1,14 @@
"""Authentication application services."""
from .constants import DEFAULT_USER_ID, DEFAULT_WORKSPACE_ID
from .service import AuthResult, AuthService, AuthServiceError, LogoutResult, UserInfo
__all__ = [
"DEFAULT_USER_ID",
"DEFAULT_WORKSPACE_ID",
"AuthResult",
"AuthService",
"AuthServiceError",
"LogoutResult",
"UserInfo",
]

@@ -8,7 +8,7 @@ from uuid import UUID
from noteflow.domain.entities.integration import IntegrationType
from .auth_workflows import (
from .workflows import (
AuthIntegrationContext,
get_or_create_auth_integration,
get_or_create_default_workspace_id,

@@ -16,11 +16,11 @@ from noteflow.infrastructure.calendar import OAuthManager
from noteflow.infrastructure.calendar.oauth import OAuthError
from noteflow.infrastructure.logging import get_logger
from .auth_constants import DEFAULT_USER_ID, DEFAULT_WORKSPACE_ID
from .auth_integration_manager import IntegrationManager
from .auth_token_exchanger import AuthServiceError, TokenExchanger
from .auth_types import AuthResult, LogoutResult, UserInfo
from .auth_workflows import find_connected_auth_integration, resolve_display_name
from .constants import DEFAULT_USER_ID, DEFAULT_WORKSPACE_ID
from .integration_manager import IntegrationManager
from .token_exchanger import AuthServiceError, TokenExchanger
from .types import AuthResult, LogoutResult, UserInfo
from .workflows import find_connected_auth_integration, resolve_display_name
@dataclass(frozen=True)
@@ -289,9 +289,10 @@ class AuthService:
if integration is None or not integration.is_connected:
return None
return await self._token_exchanger.refresh_tokens(
result = await self._token_exchanger.refresh_tokens(
uow, oauth_provider, integration
)
return result.auth_result if result.success else None
@staticmethod
def _parse_auth_provider(provider: str) -> OAuthProvider:

@@ -6,6 +6,7 @@ from typing import TYPE_CHECKING
 from uuid import UUID
 from noteflow.config.constants import OAUTH_FIELD_ACCESS_TOKEN
+from noteflow.config.constants.errors import ERR_TOKEN_REFRESH_PREFIX
 from noteflow.domain.value_objects import OAuthProvider, OAuthTokens
 from noteflow.infrastructure.calendar import OAuthManager
 from noteflow.infrastructure.calendar.google_adapter import GoogleCalendarError
@@ -13,8 +14,8 @@ from noteflow.infrastructure.calendar.oauth import OAuthError
 from noteflow.infrastructure.calendar.outlook import OutlookCalendarError
 from noteflow.infrastructure.logging import get_logger
-from .auth_types import AuthResult
-from .auth_workflows import refresh_tokens_for_integration
+from .types import TokenRefreshResult
+from .workflows import refresh_tokens_for_integration
 if TYPE_CHECKING:
     from noteflow.domain.entities.integration import Integration
@@ -128,17 +129,36 @@ class TokenExchanger:
         uow: UnitOfWork,
         oauth_provider: OAuthProvider,
         integration: Integration,
-    ) -> AuthResult | None:
+    ) -> TokenRefreshResult:
         """Refresh expired auth tokens."""
         try:
-            return await refresh_tokens_for_integration(
+            result = await refresh_tokens_for_integration(
                 uow,
                 oauth_provider=oauth_provider,
                 integration=integration,
                 oauth_manager=self._oauth_manager,
             )
+            if not result.success:
+                integration.mark_error(f"{ERR_TOKEN_REFRESH_PREFIX}{result.error}")
+                await uow.integrations.update(integration)
+                await uow.commit()
+                return TokenRefreshResult(
+                    error=result.error,
+                    integration_marked_error=True,
+                )
+            return result
         except OAuthError as e:
-            integration.mark_error(f"Token refresh failed: {e}")
+            error_msg = f"{ERR_TOKEN_REFRESH_PREFIX}{e}"
+            integration.mark_error(error_msg)
             await uow.integrations.update(integration)
             await uow.commit()
-            return None
+            logger.warning(
+                "token_refresh_oauth_error",
+                integration_id=str(integration.id),
+                provider=oauth_provider.value,
+                error=error_msg,
+            )
+            return TokenRefreshResult(
+                error=error_msg,
+                integration_marked_error=True,
+            )

@@ -5,6 +5,8 @@ from __future__ import annotations
from dataclasses import dataclass
from uuid import UUID
from noteflow.domain.value_objects import OAuthTokens
@dataclass(frozen=True, slots=True)
class AuthResult:
@@ -102,3 +104,18 @@ class LogoutResult:
tokens_revoked=tokens_revoked,
revocation_error=error,
)
@dataclass(frozen=True, slots=True)
class TokenRefreshResult:
"""Result of token refresh operation."""
tokens: OAuthTokens | None = None
auth_result: AuthResult | None = None
error: str | None = None
integration_marked_error: bool = False
@property
def success(self) -> bool:
"""Check if token refresh was successful."""
return self.auth_result is not None and self.error is None

@@ -13,8 +13,8 @@ from noteflow.domain.value_objects import OAuthProvider, OAuthTokens
 from noteflow.infrastructure.calendar import OAuthManager
 from noteflow.infrastructure.logging import get_logger
-from .auth_constants import DEFAULT_USER_ID, DEFAULT_WORKSPACE_ID
-from .auth_types import AuthResult
+from .constants import DEFAULT_USER_ID, DEFAULT_WORKSPACE_ID
+from .types import AuthResult, TokenRefreshResult
 if TYPE_CHECKING:
     from noteflow.domain.ports.unit_of_work import UnitOfWork
@@ -177,19 +177,24 @@ async def refresh_tokens_for_integration(
     oauth_provider: OAuthProvider,
     integration: Integration,
     oauth_manager: OAuthManager,
-) -> AuthResult | None:
+) -> TokenRefreshResult:
     """Refresh tokens for a connected integration if needed."""
     secrets = await uow.integrations.get_secrets(integration.id)
     if not secrets:
-        return None
+        return TokenRefreshResult(error="No secrets found for integration")
     try:
         tokens = OAuthTokens.from_secrets_dict(secrets)
-    except (KeyError, ValueError):
-        return None
+    except (KeyError, ValueError) as e:
+        logger.warning(
+            "token_parse_failed",
+            error_type=type(e).__name__,
+            integration_id=str(integration.id),
+        )
+        return TokenRefreshResult(error=f"Invalid token format: {e}")
     if not tokens.refresh_token:
-        return None
+        return TokenRefreshResult(error="No refresh token available")
     if not tokens.is_expired(buffer_seconds=TOKEN_EXPIRY_BUFFER_SECONDS):
         logger.debug(
@@ -198,12 +203,13 @@ async def refresh_tokens_for_integration(
             expires_at=tokens.expires_at.isoformat() if tokens.expires_at else None,
         )
         user_id = resolve_user_id_from_integration(integration)
-        return AuthResult(
+        auth_result = AuthResult(
             user_id=user_id,
             workspace_id=DEFAULT_WORKSPACE_ID,
             display_name=resolve_provider_email(integration),
             email=integration.provider_email,
         )
+        return TokenRefreshResult(tokens=tokens, auth_result=auth_result)
     new_tokens = await oauth_manager.refresh_tokens(
         provider=oauth_provider,
@@ -214,9 +220,10 @@ async def refresh_tokens_for_integration(
     await uow.commit()
     user_id = resolve_user_id_from_integration(integration)
-    return AuthResult(
+    auth_result = AuthResult(
         user_id=user_id,
         workspace_id=DEFAULT_WORKSPACE_ID,
         display_name=resolve_provider_email(integration),
         email=integration.provider_email,
     )
+    return TokenRefreshResult(tokens=new_tokens, auth_result=auth_result)

@@ -0,0 +1,10 @@
"""Export application services."""
from noteflow.domain.value_objects import ExportFormat
from .service import ExportService
__all__ = [
"ExportFormat",
"ExportService",
]

@@ -22,7 +22,7 @@ from noteflow.infrastructure.export import (
)
from noteflow.infrastructure.logging import get_logger
from .protocols import ExportRepositoryProvider
from noteflow.application.services.protocols import ExportRepositoryProvider
if TYPE_CHECKING:
from noteflow.domain.entities import Meeting, Segment

@@ -0,0 +1,9 @@
"""HuggingFace integration services."""
from .service import HfTokenService, HfTokenStatus, HfValidationResult
__all__ = [
"HfTokenService",
"HfTokenStatus",
"HfValidationResult",
]

@@ -10,7 +10,7 @@ from __future__ import annotations
from typing import TYPE_CHECKING
from uuid import UUID, uuid4
from noteflow.domain.constants.fields import EMAIL
from noteflow.domain.constants.fields import DISPLAY_NAME, EMAIL
from noteflow.domain.entities.project import slugify
from noteflow.domain.identity import (
DEFAULT_PROJECT_NAME,
@@ -219,7 +219,7 @@ class IdentityService(
updated_fields: list[str] = []
if display_name:
user.display_name = display_name
updated_fields.append("display_name")
updated_fields.append(DISPLAY_NAME)
if email is not None:
user.email = email
updated_fields.append(EMAIL)

@@ -0,0 +1,8 @@
"""Named entity extraction services."""
from .service import ExtractionResult, NerService
__all__ = [
"ExtractionResult",
"NerService",
]

@@ -15,3 +15,8 @@ class ExportRepositoryProvider(AsyncContextManager, Protocol):
meetings: MeetingRepository
segments: SegmentRepository
__all__ = [
"ExportRepositoryProvider",
]

@@ -2,6 +2,7 @@
 from __future__ import annotations
+from dataclasses import dataclass
 from typing import TYPE_CHECKING
 import sqlalchemy.exc
@@ -14,6 +15,20 @@ if TYPE_CHECKING:
 logger = get_logger(__name__)
+@dataclass(frozen=True, slots=True)
+class JobRecoveryResult:
+    """Result of job recovery operation."""
+    jobs_recovered: int = 0
+    migration_required: bool = False
+    error: str | None = None
+    @property
+    def success(self) -> bool:
+        """Check if recovery was successful."""
+        return not self.migration_required and self.error is None
 class DiarizationJobRecoverer:
     """Handle recovery of crashed diarization jobs."""
@@ -25,16 +40,26 @@ class DiarizationJobRecoverer:
         """
         self._uow = uow
-    async def recover(self) -> int:
+    async def recover(self) -> JobRecoveryResult:
         """Mark diarization jobs left in running states as failed.
         Returns:
-            Number of jobs marked as failed.
+            Result of recovery operation.
         """
         try:
-            return await self._mark_jobs_failed()
+            count = await self._mark_jobs_failed()
+            return JobRecoveryResult(jobs_recovered=count)
         except sqlalchemy.exc.ProgrammingError as e:
-            return handle_missing_diarization_table(e)
+            if "does not exist" in str(e) or "UndefinedTableError" in str(e):
+                logger.warning(
+                    "recovery_migration_required",
+                    error=str(e),
+                )
+                return JobRecoveryResult(
+                    migration_required=True,
+                    error="Diarization table missing - migration required",
+                )
+            raise
     async def _mark_jobs_failed(self) -> int:
         """Mark running diarization jobs as failed and log result."""
@@ -55,14 +80,3 @@ def log_diarization_recovery(failed_count: int) -> None:
         )
     else:
         logger.info("No crashed diarization jobs found during recovery")
-def handle_missing_diarization_table(error: sqlalchemy.exc.ProgrammingError) -> int:
-    """Handle case where diarization_jobs table doesn't exist yet."""
-    if "does not exist" in str(error) or "UndefinedTableError" in str(error):
-        logger.debug(
-            "Diarization jobs table not found during recovery, skipping: %s",
-            error,
-        )
-        return 0
-    raise error

@@ -15,7 +15,7 @@ from noteflow.infrastructure.logging import get_logger
from noteflow.infrastructure.persistence.constants import MAX_MEETINGS_LIMIT
from ._audio_validator import AudioValidationResult, AudioValidator
from ._job_recoverer import DiarizationJobRecoverer
from ._job_recoverer import DiarizationJobRecoverer, JobRecoveryResult
from ._meeting_recoverer import MeetingRecoverer
if TYPE_CHECKING:
@@ -159,11 +159,11 @@ class RecoveryService:
total += await self._uow.meetings.count_by_state(state)
return total
async def recover_crashed_diarization_jobs(self) -> int:
async def recover_crashed_diarization_jobs(self) -> JobRecoveryResult:
"""Mark diarization jobs left in running states as failed.
Returns:
Number of jobs marked as failed.
Result of job recovery operation.
"""
return await self._job_recoverer.recover()
@@ -177,11 +177,11 @@ class RecoveryService:
RecoveryResult with counts of recovered items.
"""
meetings, audio_failures = await self.recover_crashed_meetings()
jobs = await self.recover_crashed_diarization_jobs()
job_result = await self.recover_crashed_diarization_jobs()
result = RecoveryResult(
meetings_recovered=len(meetings),
diarization_jobs_failed=jobs,
diarization_jobs_failed=job_result.jobs_recovered if job_result.success else 0,
audio_validation_failures=audio_failures,
)
@@ -192,6 +192,13 @@ class RecoveryService:
result.diarization_jobs_failed,
)
if not job_result.success:
logger.warning(
"Diarization job recovery incomplete",
migration_required=job_result.migration_required,
error=job_result.error,
)
return result

@@ -0,0 +1,8 @@
"""Retention services."""
from .service import RetentionReport, RetentionService
__all__ = [
"RetentionReport",
"RetentionService",
]

@@ -16,6 +16,20 @@ if TYPE_CHECKING:
 logger = get_logger(__name__)
+@dataclass(frozen=True, slots=True)
+class DeletionResult:
+    """Result of meeting deletion operation."""
+    meeting_id: str
+    deleted: bool = False
+    error: str | None = None
+    @property
+    def success(self) -> bool:
+        """Check if deletion was successful."""
+        return self.deleted and self.error is None
 @dataclass(frozen=True)
 class RetentionReport:
     """Result of retention cleanup run.
@@ -137,10 +151,11 @@ class RetentionService:
         for meeting in meetings:
             result = await self._try_delete_meeting(meeting, MeetingService)
-            if result is None:
+            if result.success:
                 deleted += 1
             else:
-                errors.append(result)
+                error_msg = f"{result.meeting_id}: {result.error}"
+                errors.append(error_msg)
         return deleted, errors
@@ -148,18 +163,31 @@ class RetentionService:
         self,
         meeting: Meeting,
         meeting_service_cls: type,
-    ) -> str | None:
-        """Attempt to delete a single meeting. Returns error message or None on success."""
+    ) -> DeletionResult:
+        """Attempt to delete a single meeting."""
+        meeting_id = str(meeting.id)
         try:
             meeting_svc = meeting_service_cls(self._uow_factory())
             success = await meeting_svc.delete_meeting(meeting.id)
             if success:
-                logger.info("Deleted expired meeting: id=%s", meeting.id)
-                return None
-            return f"{meeting.id}: deletion returned False"
+                logger.info(
+                    "retention_meeting_deleted",
+                    meeting_id=meeting_id,
+                )
+                return DeletionResult(meeting_id=meeting_id, deleted=True)
+            logger.warning(
+                "retention_deletion_returned_false",
+                meeting_id=meeting_id,
+            )
+            return DeletionResult(meeting_id=meeting_id, error="deletion returned False")
         except (OSError, RuntimeError) as e:
-            logger.warning("Failed to delete meeting %s: %s", meeting.id, e)
-            return f"{meeting.id}: {e}"
+            logger.warning(
+                "retention_deletion_failed",
+                meeting_id=meeting_id,
+                error_type=type(e).__name__,
+                error=str(e),
+            )
+            return DeletionResult(meeting_id=meeting_id, error=str(e))
     @staticmethod
     def _log_cleanup_complete(checked: int, deleted: int, error_count: int) -> None:

@@ -0,0 +1,23 @@
"""Streaming configuration helpers."""
from .persistence import (
STREAMING_CONFIG_KEYS,
STREAMING_CONFIG_RANGES,
StreamingConfig,
StreamingConfigPreference,
StreamingConfigResolution,
build_default_streaming_config,
build_streaming_config_preference,
resolve_streaming_config_preference,
)
__all__ = [
"STREAMING_CONFIG_KEYS",
"STREAMING_CONFIG_RANGES",
"StreamingConfig",
"StreamingConfigPreference",
"StreamingConfigResolution",
"build_default_streaming_config",
"build_streaming_config_preference",
"resolve_streaming_config_preference",
]

@@ -132,7 +132,7 @@ def _parse_preference(raw_value: object) -> StreamingConfigPreference | None:
if value is not None:
preference[key] = value
return preference if preference else None
return preference or None
def _resolve_config(
@@ -187,6 +187,4 @@ def _clamp(value: float, min_val: float, max_val: float) -> float:
def _read_float(value: object) -> float | None:
if isinstance(value, (float, int)):
return float(value)
return None
return float(value) if isinstance(value, (float, int)) else None

@@ -0,0 +1,8 @@
"""Trigger services."""
from .service import TriggerService, TriggerServiceSettings
__all__ = [
"TriggerService",
"TriggerServiceSettings",
]

@@ -0,0 +1,7 @@
"""Webhook services."""
from .service import WebhookService
__all__ = [
"WebhookService",
]

@@ -44,6 +44,9 @@ DEFAULT_LLM_TEMPERATURE: Final[float] = 0.3
DEFAULT_OLLAMA_TIMEOUT_SECONDS: Final[float] = float(60 * 2)
"""Default timeout for Ollama requests in seconds."""
DEFAULT_OLLAMA_HOST: Final[str] = "http://localhost:11434"
"""Default Ollama server host URL."""
# =============================================================================
# gRPC Settings
# =============================================================================

@@ -10,6 +10,7 @@ from noteflow.config.constants import APP_DIR_NAME
from noteflow.config.constants.core import (
DAYS_PER_WEEK,
DEFAULT_LLM_TEMPERATURE,
DEFAULT_OLLAMA_HOST,
DEFAULT_OLLAMA_TIMEOUT_SECONDS,
HOURS_PER_DAY,
)
@@ -251,7 +252,7 @@ class Settings(TriggerSettings):
# Ollama settings
ollama_host: Annotated[
str,
Field(default="http://localhost:11434", description="Ollama server host URL"),
Field(default=DEFAULT_OLLAMA_HOST, description="Ollama server host URL"),
]
ollama_timeout_seconds: Annotated[
float,

@@ -3,6 +3,7 @@
from typing import Final, Literal
EMAIL: Final[str] = "email"
DISPLAY_NAME: Final[Literal["display_name"]] = "display_name"
GROUPS: Final[str] = "groups"
PROFILE: Final[str] = "profile"
ENABLED: Final[str] = "enabled"
@@ -22,6 +23,8 @@ PROVIDER_NAME: Final[Literal["provider_name"]] = "provider_name"
NOTE: Final[str] = "note"
START_TIME: Final[Literal["start_time"]] = "start_time"
END_TIME: Final[Literal["end_time"]] = "end_time"
STATE: Final[Literal["state"]] = "state"
ENDED_AT: Final[Literal["ended_at"]] = "ended_at"
CODE: Final[str] = "code"
CONTENT: Final[str] = "content"
LOCATION: Final[str] = "location"

Some files were not shown because too many files have changed in this diff.