chore: update logging configuration and enhance project structure
- Added a new logging configuration to improve observability across services.
- Introduced a `.repomixignore` file to exclude generated and unnecessary files from Repomix output.
- Updated `pyproject.toml` to include additional paths for script discovery.
- Refreshed the client submodule reference for compatibility with recent changes.

All quality checks pass.
@@ -9,7 +9,7 @@ conditions:
    pattern: tests?/.*\.py$
  - field: new_text
    operator: regex_match
-   pattern: \b(for|while)\s+[^:]+:[\s\S]*?(assert|pytest\.raises)|if\s+[^:]+:[\s\S]*?(assert|pytest\.raises)
+   pattern: \b(for|while|if)\s+[^:]+:[\s\S]*?(assert|pytest\.raises)
---

🚫 **Test Quality Violation: Loops or Conditionals in Tests**
.repomixignore
@@ -1,4 +1,116 @@
# Add patterns to ignore here, one per line
# Example:
# *.log
# tmp/
# Generated protobuf files (large, auto-generated)
**/*_pb2.py
**/*_pb2_grpc.py
**/*.pb2.py
**/*.pb2_grpc.py

# Lock files (very large, not needed for code understanding)
uv.lock
**/Cargo.lock
**/package-lock.json
**/bun.lockb
**/yarn.lock
**/*.lock

# Binary/image files
**/*.png
**/*.jpg
**/*.jpeg
**/*.gif
**/*.ico
**/*.svg
**/*.icns
**/*.webp
client/app-icon.png
client/public/favicon.ico
client/public/placeholder.svg

# Build artifacts and generated code
**/target/
**/gen/
**/dist/
**/build/
**/.vite/
**/node_modules/
**/__pycache__/
**/*.egg-info/
**/.pytest_cache/
**/.mypy_cache/
**/.ruff_cache/
**/coverage/
**/htmlcov/
**/playwright-report/
**/test-results/

# Documentation (verbose, can be referenced separately)
docs/
**/*.md
!README.md
!AGENTS.md
!CLAUDE.md

# Benchmark files
.benchmarks/
**/*.json
!package.json
!tsconfig*.json
!biome.json
!components.json
!compose.yaml
!alembic.ini
!pyproject.toml
!repomix.config.json

# Large API spec file
noteflow-api-spec.json

# IDE and editor files
.vscode/
.idea/
*.swp
*.swo
*.swn
*.code-workspace

# Temporary and scratch files
*.tmp
*.temp
scratch.md
repomix-output.md

# Environment files
.env
.env.*
!.env.example

# Logs
logs/
*.log

# Spikes (experimental code)
spikes/

# Python virtual environment
.venv/
venv/
env/

# OS files
.DS_Store
._*
.Spotlight-V100
.Trashes
Thumbs.db
ehthumbs.db
*~

# Git files
.git/
.gitmodules

# Claude/Serena project files (internal tooling)
.claude/
.serena/

# Dev container configs (not needed for code understanding)
.devcontainer/
Submodule client updated: d85f9edd6d...4e52a319fb
docs/spec.md
@@ -0,0 +1,415 @@
|
||||
# NoteFlow Spec Validation (2025-12-31)
|
||||
|
||||
This document validates the previous spec review against the current repository state and
|
||||
adds concrete evidence with file locations and excerpts. Locations are given as
|
||||
`path:line` within this repo.
|
||||
|
||||
## Corrections vs prior spec (validated)
|
||||
|
||||
- Background tasks: diarization jobs are already tracked and cancelled on shutdown; the
|
||||
untracked task issue is specific to integration sync tasks.
|
||||
- Webhook executor already uses per-request timeouts and truncates response bodies; gaps
|
||||
are delivery-id tracking and connection limits, not retry logic itself.
|
||||
- Outlook adapter error handling is synchronous-safe with `response.text`, but lacks
|
||||
explicit timeouts, pagination, and bounded error body logging.
|
||||
|
||||
---
|
||||
|
||||
## High-impact findings (confirmed/updated)
|
||||
|
||||
### 1) Timestamp representations are inconsistent across the gRPC schema
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:217`
|
||||
```proto
|
||||
// Creation timestamp (Unix epoch seconds)
|
||||
double created_at = 4;
|
||||
```
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:745`
|
||||
```proto
|
||||
// Start time (Unix timestamp seconds)
|
||||
int64 start_time = 3;
|
||||
```
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:1203`
|
||||
```proto
|
||||
// Start timestamp (ISO 8601)
|
||||
string started_at = 7;
|
||||
```
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:149`
|
||||
```proto
|
||||
// Server-side processing timestamp
|
||||
double server_timestamp = 5;
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- Multiple time encodings (double seconds, int64 seconds, ISO strings) force
|
||||
per-field conversions and increase client/server mismatch risk.
|
||||
|
||||
Recommendations:
|
||||
- Standardize absolute time to `google.protobuf.Timestamp` and durations to
|
||||
`google.protobuf.Duration` in new fields or v2 messages.
|
||||
- Keep legacy fields for backward compatibility and deprecate them in comments.
|
||||
- Provide helper conversions in `src/noteflow/grpc/_mixins/converters.py` to reduce
|
||||
repeated ad-hoc conversions.
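
A minimal sketch of what such shared converters could look like, assuming the standard `protobuf` runtime; the function names are illustrative, not the existing `converters.py` API:

```python
from datetime import datetime, timezone

from google.protobuf.timestamp_pb2 import Timestamp


def datetime_to_proto(dt: datetime) -> Timestamp:
    """Convert an aware (or naive-UTC) datetime into a protobuf Timestamp."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    ts = Timestamp()
    ts.FromDatetime(dt)
    return ts


def proto_to_datetime(ts: Timestamp) -> datetime:
    """Convert a protobuf Timestamp back into an aware UTC datetime."""
    # ToDatetime() returns a naive UTC datetime; attach tzinfo explicitly.
    return ts.ToDatetime().replace(tzinfo=timezone.utc)


def epoch_seconds_to_proto(seconds: float) -> Timestamp:
    """Bridge legacy double/int64 epoch-second fields to Timestamp."""
    return datetime_to_proto(datetime.fromtimestamp(seconds, tz=timezone.utc))
```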
|
||||
|
||||
---
|
||||
|
||||
### 2) UpdateAnnotation uses sentinel defaults with no presence tracking
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:502`
|
||||
```proto
|
||||
message UpdateAnnotationRequest {
|
||||
// Updated type (optional, keeps existing if not set)
|
||||
AnnotationType annotation_type = 2;
|
||||
|
||||
// Updated text (optional, keeps existing if empty)
|
||||
string text = 3;
|
||||
|
||||
// Updated start time (optional, keeps existing if 0)
|
||||
double start_time = 4;
|
||||
|
||||
// Updated end time (optional, keeps existing if 0)
|
||||
double end_time = 5;
|
||||
|
||||
// Updated segment IDs (replaces existing)
|
||||
repeated int32 segment_ids = 6;
|
||||
}
|
||||
```
|
||||
- `src/noteflow/grpc/_mixins/annotation.py:127`
|
||||
```python
|
||||
# Update fields if provided
|
||||
if request.annotation_type != noteflow_pb2.ANNOTATION_TYPE_UNSPECIFIED:
|
||||
annotation.annotation_type = proto_to_annotation_type(request.annotation_type)
|
||||
if request.text:
|
||||
annotation.text = request.text
|
||||
if request.start_time > 0:
|
||||
annotation.start_time = request.start_time
|
||||
if request.end_time > 0:
|
||||
annotation.end_time = request.end_time
|
||||
if request.segment_ids:
|
||||
annotation.segment_ids = list(request.segment_ids)
|
||||
```
|
||||
- Contrast: presence-aware optional fields already exist elsewhere:
|
||||
`src/noteflow/grpc/proto/noteflow.proto:973`
|
||||
```proto
|
||||
message UpdateWebhookRequest {
|
||||
// Updated URL (optional)
|
||||
optional string url = 2;
|
||||
// Updated name (optional)
|
||||
optional string name = 4;
|
||||
// Updated enabled status (optional)
|
||||
optional bool enabled = 6;
|
||||
}
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- You cannot clear text to an empty string or set a time to 0 intentionally.
|
||||
- `segment_ids` cannot be cleared because an empty list is treated as "no update".
|
||||
|
||||
Recommendations:
|
||||
- Introduce a patch-style request with `google.protobuf.FieldMask` (or `optional` fields)
|
||||
and keep the legacy fields for backward compatibility.
|
||||
- If you keep legacy fields, add explicit `clear_*` flags for fields that need clearing.
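
For illustration, a handler against a hypothetical presence-aware v2 request (proto3 `optional` fields plus an assumed `clear_segment_ids` flag) could look like this:

```python
def apply_annotation_patch(annotation, request) -> None:
    """Apply only the fields the client explicitly set (proto3 optional presence)."""
    # HasField() reports presence for proto3 `optional` fields, so empty strings
    # and zero timestamps become legitimate, intentional updates.
    if request.HasField("text"):
        annotation.text = request.text              # "" is now a valid value
    if request.HasField("start_time"):
        annotation.start_time = request.start_time  # 0.0 is now a valid value
    if request.HasField("end_time"):
        annotation.end_time = request.end_time
    if request.clear_segment_ids:                   # assumed explicit clear flag
        annotation.segment_ids = []
    elif request.segment_ids:
        annotation.segment_ids = list(request.segment_ids)
```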
|
||||
|
||||
---
|
||||
|
||||
### 3) TranscriptUpdate payload is ambiguous without `oneof`
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:136`
|
||||
```proto
|
||||
message TranscriptUpdate {
|
||||
string meeting_id = 1;
|
||||
UpdateType update_type = 2;
|
||||
string partial_text = 3;
|
||||
FinalSegment segment = 4;
|
||||
double server_timestamp = 5;
|
||||
}
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- The schema allows both `partial_text` and `segment` or neither, even when
|
||||
`update_type` implies one payload. Clients must defensively branch.
|
||||
|
||||
Recommendations:
|
||||
- Add a new `TranscriptUpdateV2` with `oneof payload { PartialTranscript partial = 4; FinalSegment segment = 5; }`
|
||||
and a new RPC (e.g., `StreamTranscriptionV2`) to avoid breaking existing clients.
|
||||
- Prefer `google.protobuf.Timestamp` for `server_timestamp` in the v2 message.
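
With a `oneof`, generated Python code exposes `WhichOneof`, so client handling collapses to one explicit branch. A sketch against the proposed (not yet existing) v2 message, with illustrative field names:

```python
def describe_update(update) -> str:
    """Summarize a hypothetical TranscriptUpdateV2 by its oneof payload."""
    case = update.WhichOneof("payload")  # name of the set field, or None
    if case == "partial":
        return f"partial transcript: {update.partial.text!r}"  # field names assumed
    if case == "segment":
        return "final segment"
    return "empty update"  # neither set; the ambiguity is now explicit and local
```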
|
||||
|
||||
---
|
||||
|
||||
### 4) Background task tracking is inconsistent
|
||||
|
||||
Status: Partially confirmed.
|
||||
|
||||
Evidence (tracked + cancelled diarization tasks):
|
||||
- `src/noteflow/grpc/_mixins/diarization/_jobs.py:130`
|
||||
```python
|
||||
# Create background task and store reference for potential cancellation
|
||||
task = asyncio.create_task(self._run_diarization_job(job_id, num_speakers))
|
||||
self._diarization_tasks[job_id] = task
|
||||
```
|
||||
- `src/noteflow/grpc/service.py:445`
|
||||
```python
|
||||
for job_id, task in list(self._diarization_tasks.items()):
|
||||
if not task.done():
|
||||
task.cancel()
|
||||
with contextlib.suppress(asyncio.CancelledError):
|
||||
await task
|
||||
```
|
||||
|
||||
Evidence (untracked sync tasks):
|
||||
- `src/noteflow/grpc/_mixins/sync.py:109`
|
||||
```python
|
||||
sync_task = asyncio.create_task(
|
||||
self._perform_sync(integration_id, sync_run.id, str(provider)),
|
||||
name=f"sync-{sync_run.id}",
|
||||
)
|
||||
# Add callback to clean up on completion
|
||||
sync_task.add_done_callback(lambda _: None)
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- Sync tasks are not stored for cancellation on shutdown and exceptions are not
|
||||
centrally observed (even if `_perform_sync` handles most errors).
|
||||
|
||||
Recommendations:
|
||||
- Add a shared background-task registry (or a `TaskGroup`) in the servicer and
|
||||
register sync tasks so they can be cancelled on shutdown.
|
||||
- Use a done-callback that logs uncaught exceptions and removes the task from the registry.
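
A minimal sketch of such a registry, assuming it would be owned by the servicer alongside the existing `_diarization_tasks` dict; names are illustrative:

```python
import asyncio
import logging
from collections.abc import Coroutine
from typing import Any

logger = logging.getLogger(__name__)


class BackgroundTasks:
    """Track fire-and-forget tasks so they can be observed and cancelled on shutdown."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task[None]] = set()

    def spawn(self, coro: Coroutine[Any, Any, None], *, name: str) -> asyncio.Task[None]:
        task = asyncio.create_task(coro, name=name)
        self._tasks.add(task)
        task.add_done_callback(self._on_done)
        return task

    def _on_done(self, task: asyncio.Task[None]) -> None:
        # Remove from the registry and surface any uncaught exception.
        self._tasks.discard(task)
        if not task.cancelled() and task.exception() is not None:
            logger.error("Background task %s failed", task.get_name(), exc_info=task.exception())

    async def shutdown(self) -> None:
        """Cancel everything still running; mirrors the diarization shutdown loop."""
        for task in list(self._tasks):
            task.cancel()
        await asyncio.gather(*self._tasks, return_exceptions=True)
```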
|
||||
|
||||
---
|
||||
|
||||
### 5) Segmenter leading buffer uses O(n) `pop(0)` in a hot path
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/infrastructure/asr/segmenter.py:233`
|
||||
```python
|
||||
while total_duration > self.config.leading_buffer and self._leading_buffer:
|
||||
removed = self._leading_buffer.pop(0)
|
||||
self._leading_buffer_samples -= len(removed)
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- `pop(0)` shifts the list each time, causing O(n) behavior under sustained audio streaming.
|
||||
|
||||
Recommendations:
|
||||
- Replace the list with `collections.deque` and use `popleft()` for O(1) removals.
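
The eviction loop then becomes the following (a sketch with illustrative names, not the segmenter's actual fields):

```python
from collections import deque


def trim_leading_buffer(buffer: deque, total_samples: int, max_samples: int) -> int:
    """Drop the oldest chunks until the buffer fits; return the updated sample count."""
    while total_samples > max_samples and buffer:
        removed = buffer.popleft()  # O(1); list.pop(0) shifts the whole list instead
        total_samples -= len(removed)
    return total_samples
```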
|
||||
|
||||
---
|
||||
|
||||
### 6) ChunkedAssetReader lacks strict bounds checks for chunk framing
|
||||
|
||||
Status: Partially confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/infrastructure/security/crypto.py:279`
|
||||
```python
|
||||
length_bytes = self._handle.read(4)
|
||||
if len(length_bytes) < 4:
|
||||
break # End of file
|
||||
|
||||
chunk_length = struct.unpack(">I", length_bytes)[0]
|
||||
chunk_data = self._handle.read(chunk_length)
|
||||
if len(chunk_data) < chunk_length:
|
||||
raise ValueError("Truncated chunk")
|
||||
|
||||
nonce = chunk_data[:NONCE_SIZE]
|
||||
ciphertext = chunk_data[NONCE_SIZE:-TAG_SIZE]
|
||||
tag = chunk_data[-TAG_SIZE:]
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- There is no explicit guard for `chunk_length < NONCE_SIZE + TAG_SIZE`, which can
|
||||
create invalid slices and decryption failures.
|
||||
- A short read of the 1-byte version header in `open()` is not checked before unpacking.
|
||||
|
||||
Recommendations:
|
||||
- Add a `read_exact()` helper and validate `chunk_length >= NONCE_SIZE + TAG_SIZE`.
|
||||
- Treat partial length headers as errors (or explicitly document EOF behavior).
|
||||
- Consider optional AAD (chunk index/version) to detect reordering if needed.
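
A sketch of the stricter framing reads, mirroring the excerpt above (the nonce/tag sizes shown here are assumptions; the module's real constants should be used):

```python
import struct
from typing import BinaryIO

NONCE_SIZE = 12  # assumed AES-GCM nonce size
TAG_SIZE = 16    # assumed AES-GCM tag size


def read_exact(handle: BinaryIO, count: int) -> bytes:
    """Read exactly count bytes or raise, so truncation is never silent."""
    data = handle.read(count)
    if len(data) != count:
        raise ValueError(f"Truncated read: wanted {count} bytes, got {len(data)}")
    return data


def read_chunk(handle: BinaryIO) -> tuple[bytes, bytes, bytes] | None:
    """Read one length-prefixed chunk; return (nonce, ciphertext, tag) or None at clean EOF."""
    length_bytes = handle.read(4)
    if not length_bytes:
        return None  # clean end of file
    if len(length_bytes) < 4:
        raise ValueError("Truncated chunk length header")
    chunk_length = struct.unpack(">I", length_bytes)[0]
    if chunk_length < NONCE_SIZE + TAG_SIZE:
        raise ValueError(f"Chunk too small to contain nonce and tag: {chunk_length}")
    chunk = read_exact(handle, chunk_length)
    return chunk[:NONCE_SIZE], chunk[NONCE_SIZE:-TAG_SIZE], chunk[-TAG_SIZE:]
```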
|
||||
|
||||
---
|
||||
|
||||
## Medium-priority, but worth fixing
|
||||
|
||||
### 7) gRPC size limits are defined in multiple places
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/grpc/service.py:86`
|
||||
```python
|
||||
MAX_CHUNK_SIZE: Final[int] = 1024 * 1024 # 1MB
|
||||
```
|
||||
- `src/noteflow/config/constants.py:27`
|
||||
```python
|
||||
MAX_GRPC_MESSAGE_SIZE: Final[int] = 100 * 1024 * 1024
|
||||
```
|
||||
- `src/noteflow/grpc/server.py:158`
|
||||
```python
|
||||
self._server = grpc.aio.server(
|
||||
options=[
|
||||
("grpc.max_send_message_length", 100 * 1024 * 1024),
|
||||
("grpc.max_receive_message_length", 100 * 1024 * 1024),
|
||||
],
|
||||
)
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- Multiple sources of truth can drift and the service advertises `MAX_CHUNK_SIZE`
|
||||
without enforcing it in the streaming path.
|
||||
|
||||
Recommendations:
|
||||
- Move message size and chunk size into `Settings` and use them consistently in
|
||||
`server.py` and `service.py`.
|
||||
- Enforce chunk size in streaming handlers and surface the same value in `ServerInfo`.
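
A sketch of a single source of truth once the values live in one place (the constants are illustrative stand-ins for `Settings` fields):

```python
from typing import Final

from grpc import aio

# Illustrative stand-ins; in practice these would come from Settings.
MAX_GRPC_MESSAGE_SIZE: Final[int] = 100 * 1024 * 1024
MAX_CHUNK_SIZE: Final[int] = 1024 * 1024


def build_server(max_message_size: int = MAX_GRPC_MESSAGE_SIZE) -> aio.Server:
    """Create the aio server with both message limits drawn from one value."""
    return aio.server(
        options=[
            ("grpc.max_send_message_length", max_message_size),
            ("grpc.max_receive_message_length", max_message_size),
        ],
    )
```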
|
||||
|
||||
---
|
||||
|
||||
### 8) Outlook adapter lacks explicit timeouts and pagination handling
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/infrastructure/calendar/outlook_adapter.py:81`
|
||||
```python
|
||||
async with httpx.AsyncClient() as client:
|
||||
response = await client.get(url, params=params, headers=headers)
|
||||
|
||||
if response.status_code != HTTP_STATUS_OK:
|
||||
error_msg = response.text
|
||||
logger.error("Microsoft Graph API error: %s", error_msg)
|
||||
raise OutlookCalendarError(f"{ERR_API_PREFIX}{error_msg}")
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- No explicit timeouts or connection limits are set.
|
||||
- Graph API frequently paginates via `@odata.nextLink`.
|
||||
- Error bodies are logged in full (could be large).
|
||||
|
||||
Recommendations:
|
||||
- Configure `httpx.AsyncClient(timeout=..., limits=httpx.Limits(...))`.
|
||||
- Implement pagination with `@odata.nextLink` to honor `limit` correctly.
|
||||
- Truncate error bodies before logging and raise a bounded error message.
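
A sketch of the suggested shape, with assumed timeout/limit values and the standard Graph `@odata.nextLink` pagination:

```python
import httpx

GRAPH_TIMEOUT = httpx.Timeout(10.0, connect=5.0)
GRAPH_LIMITS = httpx.Limits(max_connections=10, max_keepalive_connections=5)
MAX_ERROR_BODY = 2048  # truncate error bodies before logging or raising


async def fetch_events(url: str, headers: dict[str, str], limit: int) -> list[dict]:
    """Follow @odata.nextLink pages until `limit` events are collected."""
    events: list[dict] = []
    async with httpx.AsyncClient(timeout=GRAPH_TIMEOUT, limits=GRAPH_LIMITS) as client:
        next_url: str | None = url
        while next_url and len(events) < limit:
            response = await client.get(next_url, headers=headers)
            if response.status_code != 200:
                raise RuntimeError(f"Graph API error: {response.text[:MAX_ERROR_BODY]}")
            payload = response.json()
            events.extend(payload.get("value", []))
            next_url = payload.get("@odata.nextLink")
    return events[:limit]
```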
|
||||
|
||||
---
|
||||
|
||||
### 9) Webhook executor: delivery ID is not recorded, and client limits are missing
|
||||
|
||||
Status: Partially confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/infrastructure/webhooks/executor.py:255`
|
||||
```python
|
||||
delivery_id = str(uuid4())
|
||||
headers = {
|
||||
HTTP_HEADER_WEBHOOK_DELIVERY: delivery_id,
|
||||
HTTP_HEADER_WEBHOOK_TIMESTAMP: timestamp,
|
||||
}
|
||||
```
|
||||
- `src/noteflow/infrastructure/webhooks/executor.py:306`
|
||||
```python
|
||||
return WebhookDelivery(
|
||||
id=uuid4(),
|
||||
webhook_id=config.id,
|
||||
event_type=event_type,
|
||||
...
|
||||
)
|
||||
```
|
||||
- Client is initialized without limits:
|
||||
`src/noteflow/infrastructure/webhooks/executor.py:103`
|
||||
```python
|
||||
self._client = httpx.AsyncClient(timeout=self._timeout)
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- The delivery ID sent to recipients is not stored in delivery records, making
|
||||
correlation harder.
|
||||
- Connection pooling limits are unspecified.
|
||||
|
||||
Recommendations:
|
||||
- Reuse `delivery_id` as `WebhookDelivery.id` or add a dedicated field to persist it.
|
||||
- Add `httpx.Limits` (max connections/keepalive) and consider retrying with `Retry-After`
|
||||
for 429s.
|
||||
- Include `delivery_id` in logs and any audit trail fields.
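
A small sketch of both points (the header name and pool sizes are assumptions):

```python
from uuid import uuid4

import httpx


def build_client(timeout_seconds: float) -> httpx.AsyncClient:
    """Executor HTTP client with an explicit, bounded connection pool."""
    return httpx.AsyncClient(
        timeout=httpx.Timeout(timeout_seconds),
        limits=httpx.Limits(max_connections=20, max_keepalive_connections=10),
    )


def new_delivery(webhook_id, event_type):
    """Generate the delivery ID once and reuse it for the header and the stored record."""
    delivery_id = uuid4()
    headers = {"X-Webhook-Delivery": str(delivery_id)}  # header name assumed
    record = {"id": delivery_id, "webhook_id": webhook_id, "event_type": event_type}
    return headers, record
```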
|
||||
|
||||
---
|
||||
|
||||
### 10) OpenTelemetry exporter uses `insecure=True`
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/infrastructure/observability/otel.py:99`
|
||||
```python
|
||||
otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint, insecure=True)
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- TLS is disabled unconditionally when OTLP is configured, even in production.
|
||||
|
||||
Recommendations:
|
||||
- Make `insecure` a settings flag or infer it from the endpoint scheme.
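
For example, the flag could default to the endpoint scheme (a sketch, not the current `otel.py` code):

```python
from urllib.parse import urlparse

from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter


def build_span_exporter(otlp_endpoint: str, *, insecure: bool | None = None) -> OTLPSpanExporter:
    """Only fall back to plaintext when the endpoint is explicitly http:// or a setting says so."""
    if insecure is None:
        insecure = urlparse(otlp_endpoint).scheme == "http"
    return OTLPSpanExporter(endpoint=otlp_endpoint, insecure=insecure)
```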
|
||||
|
||||
---
|
||||
|
||||
## Cross-cutting design recommendations
|
||||
|
||||
### 11) Replace stringly-typed statuses with enums in proto
|
||||
|
||||
Status: Confirmed.
|
||||
|
||||
Evidence:
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:1191`
|
||||
```proto
|
||||
// Status: "running", "success", "error"
|
||||
string status = 3;
|
||||
```
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:856`
|
||||
```proto
|
||||
// Connection status: disconnected, connected, error
|
||||
string status = 2;
|
||||
```
|
||||
|
||||
Why it matters:
|
||||
- Clients must match string literals and risk typos or unsupported values.
|
||||
|
||||
Recommendations:
|
||||
- Introduce enums (e.g., `SyncRunStatus`, `OAuthConnectionStatus`) with explicit values
|
||||
and migrate clients gradually via new fields or v2 messages.
|
||||
|
||||
---
|
||||
|
||||
### 12) Test targets to cover the highest-risk changes
|
||||
|
||||
Status: Recommendation.
|
||||
|
||||
Existing coverage highlights:
|
||||
- Segmenter fuzz tests already exist: `tests/stress/test_segmenter_fuzz.py`.
|
||||
- Crypto chunk reader integrity tests exist: `tests/stress/test_audio_integrity.py`.
|
||||
|
||||
Suggested additions:
|
||||
- A gRPC proto-level test for patch semantics on `UpdateAnnotation` once a mask/optional
|
||||
field approach is introduced.
|
||||
- A sync task lifecycle test that asserts background tasks are cancelled on shutdown.
|
||||
- An Outlook adapter test that simulates `@odata.nextLink` pagination.
|
||||
|
||||
---
|
||||
|
||||
## Small, low-risk cleanup opportunities
|
||||
|
||||
- Consider replacing `Delete*Response { bool success }` in new RPCs with
|
||||
`google.protobuf.Empty` to reduce payload variability.
|
||||
- Audit other timestamp fields (`double` vs `int64` vs `string`) and normalize when
|
||||
introducing new API versions.
|
||||
|
||||
|
||||
@@ -42,14 +42,61 @@ npm run quality:all # TS + Rust quality
|
||||
|
||||
### Code Limits
|
||||
|
||||
-| Metric | Soft Limit | Hard Limit | Location |
-|--------|------------|------------|----------|
-| Module lines | 500 | 750 | `test_code_smells.py` |
-| Function lines | 50 (tests), 75 (src) | — | `test_code_smells.py` |
-| Function complexity | 15 | — | `test_code_smells.py` |
-| Parameters | 7 | — | `test_code_smells.py` |
-| Class methods | 20 | — | `test_code_smells.py` |
-| Nesting depth | 5 | — | `test_code_smells.py` |
+| Metric | Threshold | Max Violations | Location |
+|--------|-----------|----------------|----------|
+| Module lines (soft) | 500 | 5 | `test_code_smells.py` |
+| Module lines (hard) | 750 | 0 | `test_code_smells.py` |
+| Function lines (src) | 75 | 7 | `test_code_smells.py` |
+| Function lines (tests) | 50 | 3 | `test_test_smells.py` |
+| Function complexity | 15 | 2 | `test_code_smells.py` |
+| Parameters | 7 | 35 | `test_code_smells.py` |
+| Class methods | 20 | 1 | `test_code_smells.py` |
+| Class lines (god class) | 500 | 1 | `test_code_smells.py` |
+| Nesting depth | 5 | 2 | `test_code_smells.py` |
+| Feature envy | 5+ accesses | 5 | `test_code_smells.py` |
||||
|
||||
### Magic Values & Literals (`test_magic_values.py`)
|
||||
|
||||
| Rule | Max Allowed | Target | Description |
|
||||
|------|-------------|--------|-------------|
|
||||
| Magic numbers (>100) | 10 | 0 | Use named constants |
|
||||
| Repeated string literals | 30 | 0 | Extract to constants |
|
||||
| Hardcoded paths | 0 | 0 | Use Path objects/config |
|
||||
| Hardcoded credentials | 0 | 0 | Use env vars/secrets |
|
||||
|
||||
### Stale Code (`test_stale_code.py`)
|
||||
|
||||
| Rule | Max Allowed | Target | Description |
|
||||
|------|-------------|--------|-------------|
|
||||
| Stale TODO/FIXME comments | 10 | 0 | Address or remove |
|
||||
| Commented-out code blocks | 0 | 0 | Remove dead code |
|
||||
| Unused imports | 5 | 0 | Remove or use |
|
||||
| Unreachable code | 0 | 0 | Remove dead paths |
|
||||
| Deprecated patterns | 5 | 0 | Modernize code |
|
||||
|
||||
### Duplicate Code (`test_duplicate_code.py`)
|
||||
|
||||
| Rule | Max Allowed | Target | Description |
|
||||
|------|-------------|--------|-------------|
|
||||
| Duplicate function bodies | 1 | 0 | Extract shared functions |
|
||||
| Repeated code patterns | 177 | 50 | Refactor to reduce duplication |
|
||||
|
||||
### Unnecessary Wrappers (`test_unnecessary_wrappers.py`)
|
||||
|
||||
| Rule | Max Allowed | Target | Description |
|
||||
|------|-------------|--------|-------------|
|
||||
| Trivial wrapper functions | varies | 0 | Remove or add value |
|
||||
| Alias imports | varies | 0 | Import directly |
|
||||
| Redundant type aliases | 2 | 0 | Use original types |
|
||||
| Passthrough classes | 1 | 0 | Flatten hierarchy |
|
||||
|
||||
### Decentralized Helpers (`test_decentralized_helpers.py`)
|
||||
|
||||
| Rule | Max Allowed | Target | Description |
|
||||
|------|-------------|--------|-------------|
|
||||
| Scattered helper functions | 15 | 5 | Consolidate to utils |
|
||||
| Utility modules not centralized | 0 | 0 | Move to shared location |
|
||||
| Duplicate helper implementations | 25 | 0 | Deduplicate |
|
||||
|
||||
### Test Requirements
|
||||
|
||||
@@ -57,27 +104,33 @@ npm run quality:all # TS + Rust quality
|
||||
|
||||
| Rule | Max Allowed | Target | File |
|
||||
|------|-------------|--------|------|
|
||||
| Assertion roulette (>3 assertions without msg) | 25 | 0 | `test_test_smells.py` |
|
||||
| Conditional test logic | 15 | 0 | `test_test_smells.py` |
|
||||
| Assertion roulette (>3 assertions without msg) | 50 | 0 | `test_test_smells.py` |
|
||||
| Conditional test logic | 40 | 0 | `test_test_smells.py` |
|
||||
| Empty tests | 0 | 0 | `test_test_smells.py` |
|
||||
| Sleepy tests (time.sleep) | 3 | 0 | `test_test_smells.py` |
|
||||
| Tests without assertions | 3 | 0 | `test_test_smells.py` |
|
||||
| Tests without assertions | 5 | 0 | `test_test_smells.py` |
|
||||
| Redundant assertions | 0 | 0 | `test_test_smells.py` |
|
||||
| Print statements in tests | 3 | 0 | `test_test_smells.py` |
|
||||
| Print statements in tests | 5 | 0 | `test_test_smells.py` |
|
||||
| Skipped tests without reason | 0 | 0 | `test_test_smells.py` |
|
||||
| Exception handling (try/except) | 3 | 0 | `test_test_smells.py` |
|
||||
| Magic numbers in assertions | 25 | 10 | `test_test_smells.py` |
|
||||
| Duplicate test names | 5 | 0 | `test_test_smells.py` |
|
||||
| Exception handling (broad try/except) | 3 | 0 | `test_test_smells.py` |
|
||||
| Magic numbers in assertions | 50 | 10 | `test_test_smells.py` |
|
||||
| Sensitive equality (str/repr compare) | 10 | 0 | `test_test_smells.py` |
|
||||
| Eager tests (>10 method calls) | 10 | 0 | `test_test_smells.py` |
|
||||
| Duplicate test names | 15 | 0 | `test_test_smells.py` |
|
||||
| Hardcoded test data paths | 0 | 0 | `test_test_smells.py` |
|
||||
| Long test methods (>50 lines) | 3 | 0 | `test_test_smells.py` |
|
||||
| unittest-style assertions | 0 | 0 | `test_test_smells.py` |
|
||||
| Fixtures without type hints | 5 | 0 | `test_test_smells.py` |
|
||||
| Unused fixture parameters | 3 | 0 | `test_test_smells.py` |
|
||||
| pytest.raises without match= | 20 | 0 | `test_test_smells.py` |
|
||||
| Session fixtures with mutation | 0 | 0 | `test_test_smells.py` |
|
||||
| Fixtures without type hints | 10 | 0 | `test_test_smells.py` |
|
||||
| Unused fixture parameters | 5 | 0 | `test_test_smells.py` |
|
||||
| Fixtures with wrong scope | 5 | 0 | `test_test_smells.py` |
|
||||
| Conftest fixture duplication | 0 | 0 | `test_test_smells.py` |
|
||||
| pytest.raises without match= | 50 | 0 | `test_test_smells.py` |
|
||||
| Cross-file fixture duplicates | 0 | 0 | `test_test_smells.py` |
|
||||
|
||||
**Reduction schedule**:
|
||||
- After each sprint, reduce non-zero thresholds by 20% (rounded down)
|
||||
- Goal: All thresholds at target values by Sprint 6
|
||||
- Goal: All thresholds at target values by Sprint 8
|
||||
|
||||
### Docstring Requirements
|
||||
|
||||
@@ -134,12 +187,13 @@ npm run quality:all # TS + Rust quality
|
||||
| Repeated strings | >3 occurrences | Extract to constants |
|
||||
| TODO/FIXME comments | >10 | Address or remove |
|
||||
| Long functions | >100 lines | Split into helpers |
|
||||
-| Deep nesting | >5 levels (20 spaces) | Flatten control flow |
+| Deep nesting | >7 levels (28 spaces) | Flatten control flow |
| unwrap() calls | >20 | Use ? or expect() |
-| clone() per file | >10 | Review ownership |
+| clone() per file | >10 suspicious | Review ownership (excludes Arc::clone, handles) |
|
||||
| Parameters | >5 | Use struct/builder |
|
||||
| Duplicate error messages | >2 | Use error enum |
|
||||
| File size | >500 lines | Split module |
|
||||
| Scattered helpers | >10 files | Consolidate format_/parse_/convert_ functions |
|
||||
|
||||
### Clippy Enforcement
|
||||
|
||||
|
||||
docs/sprints/sprint_logging_centralization.md
@@ -0,0 +1,503 @@
|
||||
# Sprint: Centralized Logging Infrastructure
|
||||
|
||||
| Attribute | Value |
|
||||
|-----------|-------|
|
||||
| **Size** | L (Large) |
|
||||
| **Phase** | Infrastructure |
|
||||
| **Prerequisites** | None |
|
||||
| **Owner** | TBD |
|
||||
|
||||
---
|
||||
|
||||
## Open Issues
|
||||
|
||||
| Issue | Blocking? | Resolution Path |
|
||||
|-------|-----------|-----------------|
|
||||
| LogBuffer integration | No | Adapt LogBuffer to consume structlog events |
|
||||
| CLI modules use Rich Console | No | Ensure no conflicts with structlog Rich renderer |
|
||||
|
||||
---
|
||||
|
||||
## Validation Status
|
||||
|
||||
| Component | Exists | Notes |
|
||||
|-----------|--------|-------|
|
||||
| `LogBuffer` / `LogBufferHandler` | Yes | Needs adaptation for structlog |
|
||||
| `get_logging_context()` | Yes | Context vars for request_id, user_id, workspace_id |
|
||||
| OTEL trace context capture | Yes | `_get_current_trace_context()` in log_buffer.py |
|
||||
| Rich dependency | Yes | `rich>=14.2.0` in pyproject.toml |
|
||||
| structlog dependency | No | Must add to pyproject.toml |
|
||||
|
||||
---
|
||||
|
||||
## Objective
|
||||
|
||||
Centralize NoteFlow's logging infrastructure using **structlog** with dual output: **Rich console rendering** for development and **JSON output** for observability/OTEL integration. Migrate all 71 existing files from stdlib `logging` to structlog while preserving existing context propagation and OTEL trace correlation.
|
||||
|
||||
---
|
||||
|
||||
## Key Decisions
|
||||
|
||||
| Decision | Choice | Rationale |
|
||||
|----------|--------|-----------|
|
||||
| **Library** | structlog + Rich | Structured logging with context binding; Rich console renderer included |
|
||||
| **Output Strategy** | Dual simultaneous | JSON to file/collector AND Rich to console always |
|
||||
| **Context Handling** | Auto-inject + override | Leverage existing context vars; allow per-call extras |
|
||||
| **Migration Scope** | Full migration | Convert all 71 files to `structlog.get_logger()` |
|
||||
| **stdlib Bridge** | Yes | Use `structlog.stdlib` for seamless integration |
|
||||
|
||||
---
|
||||
|
||||
## What Already Exists
|
||||
|
||||
### Reusable Assets
|
||||
|
||||
| Asset | Location | Reuse Strategy |
|
||||
|-------|----------|----------------|
|
||||
| Context variables | `infrastructure/logging/structured.py` | Inject via structlog processor |
|
||||
| LogEntry dataclass | `infrastructure/logging/log_buffer.py` | Adapt as structlog processor output |
|
||||
| LogBuffer | `infrastructure/logging/log_buffer.py` | Create structlog processor that feeds LogBuffer |
|
||||
| OTEL trace extraction | `infrastructure/logging/log_buffer.py:132-156` | Convert to structlog processor |
|
||||
| Observability setup | `infrastructure/observability/otel.py` | Integrate with structlog OTEL processor |
|
||||
|
||||
### Current Logging Patterns
|
||||
|
||||
```python
|
||||
# Pattern 1: Module-level logger (71 files)
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.info("message", extra={...})
|
||||
|
||||
# Pattern 2: %-style formatting (widespread)
|
||||
logger.warning("Failed to process %s: %s", item_id, error)
|
||||
|
||||
# Pattern 3: Exception logging
|
||||
logger.exception("Operation failed")
|
||||
|
||||
# Pattern 4: CLI modules (2 files)
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Scope
|
||||
|
||||
### Task Breakdown
|
||||
|
||||
| Task | Effort | Description |
|
||||
|------|--------|-------------|
|
||||
| **T1: Core Configuration** | M | Create `configure_logging()` with dual output |
|
||||
| **T2: Structlog Processors** | M | Build processor chain (context, OTEL, timestamps) |
|
||||
| **T3: Rich Renderer Integration** | S | Configure structlog's ConsoleRenderer with Rich |
|
||||
| **T4: JSON Renderer** | S | Configure JSONRenderer for observability |
|
||||
| **T5: LogBuffer Processor** | M | Create processor that feeds existing LogBuffer |
|
||||
| **T6: Context Injection Processor** | S | Processor using `get_logging_context()` |
|
||||
| **T7: OTEL Span Processor** | S | Extract trace_id/span_id from current span |
|
||||
| **T8: Entry Point Updates** | S | Update `grpc/server.py`, CLI entry points |
|
||||
| **T9: Migration Script** | M | AST-based migration of 71 files |
|
||||
| **T10: File Migration (Batch 1)** | L | Migrate application/services (12 files) |
|
||||
| **T11: File Migration (Batch 2)** | L | Migrate infrastructure/* (35 files) |
|
||||
| **T12: File Migration (Batch 3)** | L | Migrate grpc/* (24 files) |
|
||||
| **T13: Test Updates** | M | Update test fixtures and assertions |
|
||||
| **T14: Documentation** | S | Update CLAUDE.md and add logging guide |
|
||||
|
||||
**Total Effort:** XL (spans multiple sessions)
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
### Processor Chain
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Structlog Processor Chain │
|
||||
├─────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ 1. filter_by_level ─► Skip if level too low │
|
||||
│ 2. add_logger_name ─► Add logger name to event │
|
||||
│ 3. add_log_level ─► Add level string │
|
||||
│ 4. PositionalArgumentsFormatter ─► Handle %-style formatting │
|
||||
│ 5. TimeStamper(fmt="iso") ─► ISO 8601 timestamp │
|
||||
│ 6. add_noteflow_context ─► request_id, user_id, workspace_id │
|
||||
│ 7. add_otel_trace_context ─► trace_id, span_id, parent_span_id │
|
||||
│ 8. CallsiteParameterAdder ─► filename, func_name, lineno │
|
||||
│ 9. StackInfoRenderer ─► Stack traces if requested │
|
||||
│ 10. format_exc_info ─► Exception formatting │
|
||||
│ 11. UnicodeDecoder ─► Decode bytes to str │
|
||||
│ │
|
||||
│ ┌─────────────────────┐ ┌─────────────────────┐ │
|
||||
│ │ Rich Console │ │ JSON File/OTLP │ │
|
||||
│ │ (dev.ConsoleRenderer)│ │ (JSONRenderer) │ │
|
||||
│ └──────────┬──────────┘ └──────────┬──────────┘ │
|
||||
│ │ │ │
|
||||
│ ▼ ▼ │
|
||||
│ StreamHandler FileHandler / LogBuffer │
|
||||
│ (stderr, TTY) (noteflow.log / in-memory) │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Module Structure
|
||||
|
||||
```
|
||||
infrastructure/logging/
|
||||
├── __init__.py # Public API exports
|
||||
├── config.py # NEW: configure_logging(), LoggingConfig
|
||||
├── processors.py # NEW: Custom processors (context, OTEL, LogBuffer)
|
||||
├── handlers.py # NEW: Dual-output handler configuration
|
||||
├── structured.py # KEEP: Context variables (minimal changes)
|
||||
└── log_buffer.py # ADAPT: LogBuffer processor integration
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Domain Model
|
||||
|
||||
### LoggingConfig (New)
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class LoggingConfig:
|
||||
"""Configuration for centralized logging."""
|
||||
|
||||
level: str = "INFO"
|
||||
json_file: Path | None = None # None = no file output
|
||||
enable_console: bool = True
|
||||
enable_json_console: bool = False # Force JSON even on TTY
|
||||
enable_log_buffer: bool = True
|
||||
enable_otel_context: bool = True
|
||||
enable_noteflow_context: bool = True
|
||||
console_colors: bool = True # Set False to force plain output; Rich auto-detects TTY
|
||||
```
|
||||
|
||||
### Custom Processors (New)
|
||||
|
||||
```python
|
||||
def add_noteflow_context(
|
||||
logger: WrappedLogger,
|
||||
method_name: str,
|
||||
event_dict: EventDict,
|
||||
) -> EventDict:
|
||||
"""Inject request_id, user_id, workspace_id from context vars."""
|
||||
ctx = get_logging_context()
|
||||
for key, value in ctx.items():
|
||||
if value is not None and key not in event_dict:
|
||||
event_dict[key] = value
|
||||
return event_dict
|
||||
|
||||
|
||||
def add_otel_trace_context(
|
||||
logger: WrappedLogger,
|
||||
method_name: str,
|
||||
event_dict: EventDict,
|
||||
) -> EventDict:
|
||||
"""Inject OpenTelemetry trace/span IDs if available."""
|
||||
try:
|
||||
from opentelemetry import trace
|
||||
|
||||
span = trace.get_current_span()
|
||||
if span.is_recording():
|
||||
ctx = span.get_span_context()
|
||||
event_dict["trace_id"] = format(ctx.trace_id, "032x")
|
||||
event_dict["span_id"] = format(ctx.span_id, "016x")
|
||||
parent = getattr(span, "parent", None)
|
||||
if parent:
|
||||
event_dict["parent_span_id"] = format(parent.span_id, "016x")
|
||||
except ImportError:
|
||||
pass
|
||||
return event_dict
|
||||
|
||||
|
||||
def log_buffer_processor(
|
||||
logger: WrappedLogger,
|
||||
method_name: str,
|
||||
event_dict: EventDict,
|
||||
) -> EventDict:
|
||||
"""Feed structured event to LogBuffer for UI streaming."""
|
||||
buffer = get_log_buffer()
|
||||
buffer.append(
|
||||
LogEntry(
|
||||
timestamp=event_dict.get("timestamp", datetime.now(UTC)),
|
||||
level=event_dict.get("level", "info"),
|
||||
source=event_dict.get("logger", ""),
|
||||
message=event_dict.get("event", ""),
|
||||
details={k: str(v) for k, v in event_dict.items()
|
||||
if k not in ("timestamp", "level", "logger", "event")},
|
||||
trace_id=event_dict.get("trace_id"),
|
||||
span_id=event_dict.get("span_id"),
|
||||
)
|
||||
)
|
||||
return event_dict
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuration API
|
||||
|
||||
### Primary Entry Point
|
||||
|
||||
```python
|
||||
# infrastructure/logging/config.py
|
||||
|
||||
def configure_logging(
|
||||
config: LoggingConfig | None = None,
|
||||
*,
|
||||
level: str = "INFO",
|
||||
json_file: Path | None = None,
|
||||
) -> None:
|
||||
"""Configure centralized logging with dual output.
|
||||
|
||||
Call once at application startup (e.g., in grpc/server.py main()).
|
||||
|
||||
Args:
|
||||
config: Full configuration object, or use keyword args.
|
||||
level: Log level (DEBUG, INFO, WARNING, ERROR).
|
||||
json_file: Optional path for JSON log file.
|
||||
"""
|
||||
if config is None:
|
||||
config = LoggingConfig(level=level, json_file=json_file)
|
||||
|
||||
shared_processors = _build_processor_chain(config)
|
||||
|
||||
# Configure structlog
|
||||
structlog.configure(
|
||||
processors=shared_processors + [
|
||||
structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
|
||||
],
|
||||
wrapper_class=structlog.stdlib.BoundLogger,
|
||||
logger_factory=structlog.stdlib.LoggerFactory(),
|
||||
cache_logger_on_first_use=True,
|
||||
)
|
||||
|
||||
# Configure stdlib logging handlers
|
||||
_configure_handlers(config, shared_processors)
|
||||
```
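
`_build_processor_chain()` is referenced above but not shown; a plausible sketch that mirrors the processor-chain diagram (the custom processors and `LoggingConfig` are the ones defined elsewhere in this document):

```python
import structlog
from structlog.typing import Processor

# add_noteflow_context, add_otel_trace_context, log_buffer_processor and
# LoggingConfig are the objects defined elsewhere in this sprint document.


def _build_processor_chain(config: LoggingConfig) -> list[Processor]:
    """Assemble the shared processors in the order shown in the chain diagram."""
    processors: list[Processor] = [
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.TimeStamper(fmt="iso"),
    ]
    if config.enable_noteflow_context:
        processors.append(add_noteflow_context)
    if config.enable_otel_context:
        processors.append(add_otel_trace_context)
    if config.enable_log_buffer:
        processors.append(log_buffer_processor)
    processors += [
        structlog.processors.CallsiteParameterAdder(),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.UnicodeDecoder(),
    ]
    return processors
```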
|
||||
|
||||
### Usage After Migration
|
||||
|
||||
```python
|
||||
# Before (current)
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.info("Processing meeting %s", meeting_id)
|
||||
|
||||
# After (migrated)
|
||||
import structlog
|
||||
logger = structlog.get_logger()
|
||||
logger.info("processing_meeting", meeting_id=meeting_id)
|
||||
|
||||
# Or with bound context
|
||||
logger = structlog.get_logger().bind(meeting_id=meeting_id)
|
||||
logger.info("processing_started")
|
||||
logger.info("processing_completed", segments=42)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Migration Strategy
|
||||
|
||||
### Phase 1: Infrastructure (T1-T8)
|
||||
|
||||
1. Add `structlog>=24.0` to `pyproject.toml`
|
||||
2. Create `infrastructure/logging/config.py` with `configure_logging()`
|
||||
3. Create `infrastructure/logging/processors.py` with custom processors
|
||||
4. Create `infrastructure/logging/handlers.py` for handler setup
|
||||
5. Update entry points to call `configure_logging()`
|
||||
|
||||
### Phase 2: Automated Migration (T9)
|
||||
|
||||
Create AST-based migration script:
|
||||
|
||||
```python
|
||||
# scripts/migrate_logging.py
|
||||
|
||||
"""Migrate stdlib logging to structlog.
|
||||
|
||||
Transforms:
|
||||
import logging
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.info("message %s", arg)
|
||||
|
||||
To:
|
||||
import structlog
|
||||
logger = structlog.get_logger()
|
||||
logger.info("message", arg=arg)
|
||||
"""
|
||||
```
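
The full rewrite is out of scope here, but the discovery half of the script could be as simple as the following sketch (the rewrite step itself would likely need `libcst` or similar to preserve formatting):

```python
import ast
from pathlib import Path


def find_stdlib_loggers(root: Path) -> list[tuple[Path, int]]:
    """Report every logging.getLogger(...) call site as (file, line) pairs."""
    hits: list[tuple[Path, int]] = []
    for path in root.rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "getLogger"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "logging"
            ):
                hits.append((path, node.lineno))
    return hits
```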
|
||||
|
||||
### Phase 3: Batch Migration (T10-T12)
|
||||
|
||||
| Batch | Files | Priority | Notes |
|
||||
|-------|-------|----------|-------|
|
||||
| **Batch 1** | `application/services/*` (12) | High | Core business logic |
|
||||
| **Batch 2** | `infrastructure/*` (35) | Medium | Infrastructure adapters |
|
||||
| **Batch 3** | `grpc/*` (24) | High | API layer, interceptors |
|
||||
|
||||
### Rollback Strategy
|
||||
|
||||
- Keep stdlib logging configured as structlog backend
|
||||
- If issues arise, revert `configure_logging()` to stdlib-only mode
|
||||
- Migration is reversible via git; no database changes
|
||||
|
||||
---
|
||||
|
||||
## Deliverables
|
||||
|
||||
### New Files
|
||||
|
||||
- [ ] `src/noteflow/infrastructure/logging/config.py`
|
||||
- [ ] `src/noteflow/infrastructure/logging/processors.py`
|
||||
- [ ] `src/noteflow/infrastructure/logging/handlers.py`
|
||||
- [ ] `scripts/migrate_logging.py`
|
||||
- [ ] `docs/guides/logging.md`
|
||||
|
||||
### Modified Files
|
||||
|
||||
- [ ] `pyproject.toml` — add structlog dependency
|
||||
- [ ] `src/noteflow/infrastructure/logging/__init__.py` — export new API
|
||||
- [ ] `src/noteflow/infrastructure/logging/log_buffer.py` — adapt for structlog
|
||||
- [ ] `src/noteflow/grpc/server.py` — call `configure_logging()`
|
||||
- [ ] `src/noteflow/cli/retention.py` — remove `basicConfig`, use structlog
|
||||
- [ ] `src/noteflow/cli/models.py` — remove `basicConfig`, use structlog
|
||||
- [ ] 71 files with `logging.getLogger()` — migrate to structlog
|
||||
|
||||
### Tests
|
||||
|
||||
- [ ] `tests/infrastructure/logging/test_config.py`
|
||||
- [ ] `tests/infrastructure/logging/test_processors.py`
|
||||
- [ ] `tests/infrastructure/logging/test_handlers.py`
|
||||
- [ ] Update existing tests that assert on log output
|
||||
|
||||
---
|
||||
|
||||
## Test Strategy
|
||||
|
||||
### Unit Tests
|
||||
|
||||
```python
|
||||
# tests/infrastructure/logging/test_processors.py
|
||||
|
||||
@pytest.fixture
|
||||
def mock_context_vars(monkeypatch):
|
||||
"""Set up context variables for testing."""
|
||||
monkeypatch.setattr("noteflow.infrastructure.logging.structured.request_id_var",
|
||||
ContextVar("request_id", default="test-req-123"))
|
||||
# ...
|
||||
|
||||
def test_add_noteflow_context_injects_request_id(mock_context_vars):
|
||||
"""Verify context vars are injected into event dict."""
|
||||
event_dict = {"event": "test"}
|
||||
result = add_noteflow_context(None, "info", event_dict)
|
||||
assert result["request_id"] == "test-req-123"
|
||||
|
||||
def test_add_otel_trace_context_graceful_without_otel():
|
||||
"""Verify processor works when OpenTelemetry not installed."""
|
||||
event_dict = {"event": "test"}
|
||||
result = add_otel_trace_context(None, "info", event_dict)
|
||||
assert "trace_id" not in result # Graceful degradation
|
||||
|
||||
@pytest.mark.parametrize("level,expected", [
|
||||
("DEBUG", True),
|
||||
("INFO", True),
|
||||
("WARNING", True),
|
||||
("ERROR", True),
|
||||
])
|
||||
def test_configure_logging_accepts_all_levels(level, expected):
|
||||
"""Verify all log levels are accepted."""
|
||||
config = LoggingConfig(level=level)
|
||||
configure_logging(config)
|
||||
# Assert no exception raised
|
||||
```
|
||||
|
||||
### Integration Tests
|
||||
|
||||
```python
|
||||
# tests/infrastructure/logging/test_integration.py
|
||||
|
||||
@pytest.mark.integration
|
||||
def test_dual_output_produces_both_formats(tmp_path, capsys):
|
||||
"""Verify console and JSON outputs are produced simultaneously."""
|
||||
json_file = tmp_path / "test.log"
|
||||
configure_logging(LoggingConfig(
|
||||
level="INFO",
|
||||
json_file=json_file,
|
||||
enable_console=True,
|
||||
))
|
||||
|
||||
logger = structlog.get_logger("test")
|
||||
logger.info("test_event", key="value")
|
||||
|
||||
# Verify console output (Rich formatted)
|
||||
captured = capsys.readouterr()
|
||||
assert "test_event" in captured.err
|
||||
|
||||
# Verify JSON file output
|
||||
with open(json_file) as f:
|
||||
log_line = json.loads(f.readline())
|
||||
assert log_line["event"] == "test_event"
|
||||
assert log_line["key"] == "value"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quality Gates
|
||||
|
||||
### Exit Criteria
|
||||
|
||||
- [ ] All 71 files migrated to structlog
|
||||
- [ ] Dual output working (Rich console + JSON)
|
||||
- [ ] Context variables auto-injected (request_id, user_id, workspace_id)
|
||||
- [ ] OTEL trace/span IDs appear in logs when tracing enabled
|
||||
- [ ] LogBuffer receives structured events for UI streaming
|
||||
- [ ] No `logging.basicConfig()` calls remain
|
||||
- [ ] All tests pass
|
||||
- [ ] `ruff check` and `basedpyright` pass
|
||||
- [ ] Documentation updated
|
||||
|
||||
### Performance Requirements
|
||||
|
||||
- Log emission overhead < 10μs per call
|
||||
- No blocking I/O in hot paths (async file writes)
|
||||
- Memory-bounded LogBuffer (existing 1000-entry limit)
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
### New Dependencies
|
||||
|
||||
```toml
|
||||
# pyproject.toml
|
||||
[project]
|
||||
dependencies = [
|
||||
# ... existing ...
|
||||
"structlog>=24.0",
|
||||
]
|
||||
```
|
||||
|
||||
### Compatibility Notes
|
||||
|
||||
- structlog 24.0+ required for `ProcessorFormatter` improvements
|
||||
- Rich 14.2.0 already installed (compatible)
|
||||
- OpenTelemetry integration optional (graceful degradation)
|
||||
|
||||
---
|
||||
|
||||
## Risks & Mitigations
|
||||
|
||||
| Risk | Impact | Mitigation |
|
||||
|------|--------|------------|
|
||||
| Migration breaks existing log parsing | Medium | Maintain JSON schema compatibility |
|
||||
| Performance regression | Low | Benchmark before/after; structlog is fast |
|
||||
| Rich console conflicts with existing CLI usage | Low | CLI modules already use Rich; test integration |
|
||||
| OTEL context not propagating | Medium | Integration tests with mock tracer |
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- [structlog Documentation](https://www.structlog.org/)
|
||||
- [structlog + Rich Integration](https://www.structlog.org/en/stable/console-output.html)
|
||||
- [structlog + OTEL](https://www.structlog.org/en/stable/frameworks.html#opentelemetry)
|
||||
- [Existing LogBuffer Implementation](../src/noteflow/infrastructure/logging/log_buffer.py)
|
||||
docs/sprints/sprint_logging_centralization_PLAN.md
@@ -0,0 +1,744 @@
|
||||
# Centralized Logging Migration - Agent-Driven Execution Plan
|
||||
|
||||
> **Sprint Reference**: `docs/sprints/sprint_logging_centralization.md`
|
||||
> **Technical Debt**: `docs/triage.md`
|
||||
> **Quality Gates**: `docs/sprints/QUALITY_STANDARDS.md`
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This plan orchestrates the migration from stdlib `logging` to `structlog` using specialized agents for discovery, validation, and implementation. Each phase is designed for parallel execution where possible.
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ CRITICAL: Quality Enforcement Rules
|
||||
|
||||
### ABSOLUTE PROHIBITIONS
|
||||
|
||||
1. **NEVER modify quality test thresholds** - If violations exceed thresholds, FIX THE CODE, not the tests
|
||||
2. **NEVER claim errors are "preexisting"** without baseline proof - Capture baselines BEFORE any changes
|
||||
3. **NEVER batch quality checks** - Run `make quality-py` after EVERY file modification
|
||||
4. **NEVER skip quality gates** to "fix later" - All code must pass before proceeding
|
||||
|
||||
### MANDATORY QUALITY CHECKPOINTS
|
||||
|
||||
**After EVERY code change** (not "at the end of a phase"):
|
||||
|
||||
```bash
|
||||
# Run after EACH file edit - no exceptions
|
||||
make quality-py
|
||||
```
|
||||
|
||||
This runs:
|
||||
- `ruff check .` — Linting (ALL code, not just tests)
|
||||
- `basedpyright` — Type checking (ALL code, not just tests)
|
||||
- `pytest tests/quality/ -q` — Code smell detection (ALL code)
|
||||
|
||||
### BASELINE CAPTURE (Required Before Phase 2)
|
||||
|
||||
```bash
|
||||
# Capture baseline BEFORE any migration work
|
||||
make quality-py 2>&1 | tee /tmp/quality_baseline_$(date +%Y%m%d_%H%M%S).log
|
||||
|
||||
# Record current threshold violations
|
||||
pytest tests/quality/ -v --tb=no | grep -E "(PASSED|FAILED|violations)" > /tmp/threshold_baseline.txt
|
||||
```
|
||||
|
||||
Any NEW violations introduced during migration are **agent responsibility** and must be fixed immediately.
|
||||
|
||||
### CODE QUALITY STANDARDS (Apply to ALL Code)
|
||||
|
||||
These apply to **infrastructure modules, services, gRPC handlers, processors** — not just tests:
|
||||
|
||||
| Rule | Applies To | Enforcement |
|
||||
|------|-----------|-------------|
|
||||
| No `# type: ignore` | ALL Python code | `basedpyright` |
|
||||
| No `Any` type | ALL Python code | `basedpyright` |
|
||||
| Union syntax `str \| None` | ALL Python code | `ruff UP` |
|
||||
| Module < 500 lines (soft) | ALL modules | `tests/quality/test_code_smells.py` |
|
||||
| Module < 750 lines (hard) | ALL modules | `tests/quality/test_code_smells.py` |
|
||||
| Function < 75 lines | ALL functions | `tests/quality/test_code_smells.py` |
|
||||
| Complexity < 15 | ALL functions | `tests/quality/test_code_smells.py` |
|
||||
| Parameters ≤ 7 | ALL functions | `tests/quality/test_code_smells.py` |
|
||||
| No magic numbers > 100 | ALL code | `tests/quality/test_magic_values.py` |
|
||||
| No hardcoded paths | ALL code | `tests/quality/test_magic_values.py` |
|
||||
| No repeated string literals | ALL code | `tests/quality/test_magic_values.py` |
|
||||
| No stale TODO/FIXME | ALL code | `tests/quality/test_stale_code.py` |
|
||||
| No commented-out code | ALL code | `tests/quality/test_stale_code.py` |
|
||||
|
||||
---
|
||||
|
||||
## Phase 0: Pre-Flight Validation
|
||||
|
||||
### Agent Task: Dependency Audit
|
||||
|
||||
| Agent | Purpose | Deliverable |
|
||||
|-------|---------|-------------|
|
||||
| `Explore` | Verify structlog compatibility with existing Rich usage | Compatibility report |
|
||||
| `Explore` | Locate all `logging.basicConfig()` calls | File list with line numbers |
|
||||
| `Explore` | Find LogBuffer integration points | Integration map |
|
||||
|
||||
**Commands to validate:**
|
||||
```bash
|
||||
# Verify Rich is installed
|
||||
python -c "import rich; print(rich.__version__)"
|
||||
|
||||
# Dry-run structlog install
|
||||
uv pip install --dry-run "structlog>=24.0"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Discovery & Target Mapping
|
||||
|
||||
### 1.1 Agent: Locate All Logging Usage
|
||||
|
||||
**Objective**: Build comprehensive map of all 71+ files using stdlib logging.
|
||||
|
||||
**Agent Type**: `Explore` (thorough mode)
|
||||
|
||||
**Queries**:
|
||||
1. "Find all files with `import logging` in src/noteflow/"
|
||||
2. "Find all `logging.getLogger(__name__)` patterns"
|
||||
3. "Find all `logger.info/debug/warning/error/exception` calls with their argument patterns"
|
||||
4. "Identify %-style formatting vs f-string usage in log calls"
|
||||
|
||||
**Expected Output Structure**:
|
||||
```yaml
|
||||
discovery:
|
||||
files_with_logging: 71
|
||||
patterns:
|
||||
module_logger: 68 # logger = logging.getLogger(__name__)
|
||||
basic_config: 2 # logging.basicConfig()
|
||||
percent_style: 45 # logger.info("msg %s", arg)
|
||||
fstring_style: 23 # logger.info(f"msg {arg}")
|
||||
exception_calls: 12 # logger.exception()
|
||||
by_layer:
|
||||
application: 12
|
||||
infrastructure: 35
|
||||
grpc: 24
|
||||
```
|
||||
|
||||
### 1.2 Agent: Map Critical Logging Gaps (from triage.md)
|
||||
|
||||
**Agent Type**: `Explore` (thorough mode)
|
||||
|
||||
**Objective**: Validate each issue in triage.md still exists and capture exact locations.
|
||||
|
||||
**Target Categories**:
|
||||
|
||||
| Category | File Pattern | Agent Query |
|
||||
|----------|--------------|-------------|
|
||||
| Network/External | `*_provider.py`, `*_adapter.py` | "Find async HTTP calls without timing logs" |
|
||||
| Blocking Ops | `*_engine.py`, `*_service.py` | "Find `run_in_executor` calls without duration logging" |
|
||||
| Silent Failures | `repositories/*.py` | "Find try/except blocks that return None without logging" |
|
||||
| State Transitions | `_mixins/*.py` | "Find state assignments without transition logs" |
|
||||
| DB Operations | `repositories/*.py`, `unit_of_work.py` | "Find commit/rollback without logging" |
|
||||
|
||||
### 1.3 Agent: Context Variable Analysis
|
||||
|
||||
**Agent Type**: `feature-dev:code-explorer`
|
||||
|
||||
**Objective**: Trace `get_logging_context()` usage for processor design.
|
||||
|
||||
**Tasks**:
|
||||
1. Find all `request_id_var`, `user_id_var`, `workspace_id_var` usages
|
||||
2. Map where context is SET vs where it's READ
|
||||
3. Identify any gaps in context propagation
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Infrastructure Implementation
|
||||
|
||||
### 2.1 Create Core Configuration Module
|
||||
|
||||
**File**: `src/noteflow/infrastructure/logging/config.py`
|
||||
|
||||
**Agent Type**: `feature-dev:code-architect`
|
||||
|
||||
**Design Constraints** (from QUALITY_STANDARDS.md):
|
||||
- No `Any` types
|
||||
- No `# type: ignore` without justification
|
||||
- All public functions must have return type annotations
|
||||
- Docstrings written imperatively
|
||||
|
||||
**Implementation Spec**:
|
||||
```python
|
||||
"""Centralized logging configuration with dual output.
|
||||
|
||||
Configures structlog with Rich console + JSON file output.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import structlog
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
from structlog.typing import Processor
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class LoggingConfig:
|
||||
"""Configuration for centralized logging."""
|
||||
|
||||
level: str = "INFO"
|
||||
json_file: Path | None = None
|
||||
enable_console: bool = True
|
||||
enable_json_console: bool = False
|
||||
enable_log_buffer: bool = True
|
||||
enable_otel_context: bool = True
|
||||
enable_noteflow_context: bool = True
|
||||
console_colors: bool = True
|
||||
|
||||
|
||||
def configure_logging(
|
||||
config: LoggingConfig | None = None,
|
||||
*,
|
||||
level: str = "INFO",
|
||||
json_file: Path | None = None,
|
||||
) -> None:
|
||||
"""Configure centralized logging with dual output.
|
||||
|
||||
Call once at application startup.
|
||||
|
||||
Args:
|
||||
config: Full configuration object, or use keyword args.
|
||||
level: Log level (DEBUG, INFO, WARNING, ERROR).
|
||||
json_file: Optional path for JSON log file.
|
||||
"""
|
||||
...
|
||||
```
|
||||
|
||||
### 2.2 Create Custom Processors Module
|
||||
|
||||
**File**: `src/noteflow/infrastructure/logging/processors.py`
|
||||
|
||||
**Agent Type**: `feature-dev:code-architect`
|
||||
|
||||
**Processors to Implement**:
|
||||
|
||||
| Processor | Source | Purpose |
|
||||
|-----------|--------|---------|
|
||||
| `add_noteflow_context` | New | Inject request_id, user_id, workspace_id |
|
||||
| `add_otel_trace_context` | Adapt from `log_buffer.py:132-156` | Inject trace_id, span_id |
|
||||
| `log_buffer_processor` | New | Feed events to existing LogBuffer |
|
||||
|
||||
**Quality Requirements**:
|
||||
- Each processor must be a pure function
|
||||
- Must handle missing context gracefully (no exceptions)
|
||||
- Must include type annotations for all parameters
|
||||
|
||||
### 2.3 Create Handlers Module
|
||||
|
||||
**File**: `src/noteflow/infrastructure/logging/handlers.py`
|
||||
|
||||
**Agent Type**: `feature-dev:code-architect`
|
||||
|
||||
**Responsibilities**:
|
||||
- Configure Rich ConsoleRenderer for TTY
|
||||
- Configure JSONRenderer for file/OTEL
|
||||
- Wire both to stdlib logging handlers
|
||||
|
||||
### 2.4 Adapt LogBuffer
|
||||
|
||||
**File**: `src/noteflow/infrastructure/logging/log_buffer.py`
|
||||
|
||||
**Agent Type**: `feature-dev:code-reviewer` (review current implementation first)
|
||||
|
||||
**Changes Required**:
|
||||
1. Create structlog processor that feeds LogBuffer
|
||||
2. Convert `LogEntry` creation to use structlog event_dict
|
||||
3. Preserve existing `_get_current_trace_context()` logic
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Entry Point Integration
|
||||
|
||||
### 3.1 Agent: Locate Entry Points
|
||||
|
||||
**Agent Type**: `Explore`
|
||||
|
||||
**Query**: "Find all main() functions and startup initialization in src/noteflow/"
|
||||
|
||||
**Expected Entry Points**:
|
||||
- `src/noteflow/grpc/server.py` - Main server
|
||||
- `src/noteflow/cli/retention.py` - CLI tool
|
||||
- `src/noteflow/cli/models.py` - CLI tool
|
||||
|
||||
### 3.2 Integration Tasks
|
||||
|
||||
| File | Change | Validation |
|
||||
|------|--------|------------|
|
||||
| `grpc/server.py` | Add `configure_logging()` call before server start | Server logs in both formats |
|
||||
| `cli/retention.py` | Remove `basicConfig()`, add `configure_logging()` | CLI logs correctly |
|
||||
| `cli/models.py` | Remove `basicConfig()`, add `configure_logging()` | CLI logs correctly |
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Automated Migration Script
|
||||
|
||||
### 4.1 Migration Script Design
|
||||
|
||||
**File**: `scripts/migrate_logging.py`
|
||||
|
||||
**Agent Type**: `feature-dev:code-architect`
|
||||
|
||||
**Transformations**:
|
||||
|
||||
```python
|
||||
# Transform 1: Import statement
|
||||
# Before: import logging
|
||||
# After: import structlog
|
||||
|
||||
# Transform 2: Logger creation
|
||||
# Before: logger = logging.getLogger(__name__)
|
||||
# After: logger = structlog.get_logger()
|
||||
|
||||
# Transform 3: %-style formatting
|
||||
# Before: logger.info("Processing %s for %s", item_id, user_id)
|
||||
# After: logger.info("processing", item_id=item_id, user_id=user_id)
|
||||
|
||||
# Transform 4: Exception logging
|
||||
# Before: logger.exception("Failed to process")
|
||||
# After: logger.exception("processing_failed")
|
||||
```
|
||||
|
||||
**Quality Requirements**:
|
||||
- Must preserve semantic meaning
|
||||
- Must handle all patterns found in Phase 1
|
||||
- Must be idempotent (safe to run multiple times)
|
||||
- Must generate report of changes
|
||||
|
||||
### 4.2 Validation Agent
|
||||
|
||||
**Agent Type**: `agent-code-quality`
|
||||
|
||||
**Post-Migration Checks**:
|
||||
1. Run `ruff check` on migrated files
|
||||
2. Run `basedpyright` on migrated files
|
||||
3. Verify no `import logging` remains (except stdlib bridge)
|
||||
4. Verify all `logger.` calls use keyword arguments
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Batch Migration Execution
|
||||
|
||||
### 5.1 Batch 1: Application Services (12 files)
|
||||
|
||||
**Agent Type**: `agent-python-executor`
|
||||
|
||||
**Files** (to be confirmed by discovery agent):
|
||||
```
|
||||
src/noteflow/application/services/
|
||||
├── meeting_service.py
|
||||
├── recovery_service.py
|
||||
├── export_service.py
|
||||
├── summarization_service.py
|
||||
├── trigger_service.py
|
||||
├── webhook_service.py
|
||||
├── calendar_service.py
|
||||
├── retention_service.py
|
||||
├── ner_service.py
|
||||
└── ...
|
||||
```
|
||||
|
||||
**⚠️ Execution Strategy (PER-FILE, NOT PER-BATCH)**:
|
||||
```bash
|
||||
# For EACH file in the batch:
|
||||
|
||||
# 1. Migrate ONE file
|
||||
python scripts/migrate_logging.py src/noteflow/application/services/meeting_service.py
|
||||
|
||||
# 2. IMMEDIATELY run quality check
|
||||
make quality-py
|
||||
|
||||
# 3. If NEW violations introduced:
|
||||
# - FIX THEM NOW
|
||||
# - Re-run make quality-py
|
||||
# - Do NOT proceed until clean
|
||||
|
||||
# 4. Only then migrate next file
|
||||
python scripts/migrate_logging.py src/noteflow/application/services/recovery_service.py
|
||||
make quality-py
|
||||
# ... repeat for each file
|
||||
|
||||
# 5. After ALL files in batch pass individually:
|
||||
pytest tests/application/ -v
|
||||
```
|
||||
|
||||
**PROHIBITED**: Running migration script on entire batch then checking quality once
|
||||
|
||||
### 5.2 Batch 2: Infrastructure (35 files)
|
||||
|
||||
**Agent Type**: `agent-python-executor`
|
||||
|
||||
**Subdirectories**:
|
||||
- `audio/` - capture, writer, playback
|
||||
- `asr/` - engine, segmenter
|
||||
- `diarization/` - engine, session
|
||||
- `summarization/` - providers, parsing
|
||||
- `persistence/` - database, repositories, unit_of_work
|
||||
- `triggers/` - calendar, audio, app
|
||||
- `webhooks/` - executor
|
||||
- `calendar/` - adapters, oauth
|
||||
- `ner/` - engine
|
||||
- `export/` - markdown, html, pdf
|
||||
- `security/` - keystore
|
||||
- `observability/` - otel
|
||||
|
||||
**⚠️ Same per-file workflow as Batch 1:**
|
||||
```bash
|
||||
# For EACH of the 35 files:
|
||||
# 1. Migrate ONE file
|
||||
# 2. make quality-py
|
||||
# 3. Fix any NEW violations
|
||||
# 4. Proceed only when clean
|
||||
```
|
||||
|
||||
### 5.3 Batch 3: gRPC Layer (24 files)
|
||||
|
||||
**Agent Type**: `agent-python-executor`
|
||||
|
||||
**Components**:
|
||||
- `server.py`, `service.py`, `client.py`
|
||||
- `_mixins/` - all mixins
|
||||
- `interceptors/` - identity interceptor
|
||||
- `_client_mixins/` - client mixins
|
||||
|
||||
**⚠️ Same per-file workflow as Batch 1:**
|
||||
```bash
|
||||
# For EACH of the 24 files:
|
||||
# 1. Migrate ONE file
|
||||
# 2. make quality-py
|
||||
# 3. Fix any NEW violations
|
||||
# 4. Proceed only when clean
|
||||
```
|
||||
|
||||
### Migration Abort Conditions
|
||||
|
||||
**STOP IMMEDIATELY if any of these occur:**
|
||||
|
||||
1. **Threshold modification detected** - Any change to `tests/quality/*.py` threshold values
|
||||
2. **Cumulative violations > 5** - Too many unfixed violations accumulating
|
||||
3. **Type errors without fix** - `basedpyright` errors not immediately addressed
|
||||
4. **Baseline not captured** - Starting migration without `/tmp/quality_baseline.log`
|
||||
|
||||
**Recovery**: Revert all changes since last known-good state, re-capture baseline, restart
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Test Updates & Validation
|
||||
|
||||
### 6.1 Agent: Update Test Fixtures
|
||||
|
||||
**Agent Type**: `agent-testing-architect`
|
||||
|
||||
**Tasks**:
|
||||
1. Create `tests/infrastructure/logging/conftest.py` with shared fixtures
|
||||
2. Update any tests asserting on log output
|
||||
3. Add integration tests for dual output
|
||||
|
||||
**Fixture Requirements** (per QUALITY_STANDARDS.md):
|
||||
```python
|
||||
@pytest.fixture
|
||||
def logging_config() -> LoggingConfig:
|
||||
"""Provide test logging configuration."""
|
||||
return LoggingConfig(
|
||||
level="DEBUG",
|
||||
enable_console=False, # Suppress console in tests
|
||||
enable_log_buffer=True,
|
||||
)
|
||||
```
|
||||
|
||||
### 6.2 Agent: Write Unit Tests
|
||||
|
||||
**Agent Type**: `agent-testing-architect`
|
||||
|
||||
**Test Files to Create**:
|
||||
- `tests/infrastructure/logging/test_config.py`
|
||||
- `tests/infrastructure/logging/test_processors.py`
|
||||
- `tests/infrastructure/logging/test_handlers.py`
|
||||
|
||||
**Test Requirements** (per QUALITY_STANDARDS.md):
|
||||
- No loops in tests
|
||||
- No conditionals in tests
|
||||
- Use `pytest.mark.parametrize` for multiple cases
|
||||
- Use `pytest.param` with descriptive IDs
|
||||
- All fixtures must have type hints
|
||||
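As an illustration of those requirements, a processor test might look like this (`add_service_context` and its default value are assumed names for a Phase 2 processor, not existing code):

```python
import pytest

from noteflow.infrastructure.logging.processors import add_service_context  # assumed name


@pytest.mark.parametrize(
    ("event_dict", "expected_service"),
    [
        pytest.param({}, "noteflow", id="empty-event-gets-default-service"),
        pytest.param({"service": "custom"}, "custom", id="existing-service-preserved"),
    ],
)
def test_add_service_context(event_dict: dict[str, object], expected_service: str) -> None:
    """Verify the processor injects service metadata without raising."""
    result = add_service_context(None, "info", event_dict)

    assert result["service"] == expected_service
```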
|
||||
### 6.3 Quality Gate Execution
|
||||
|
||||
**Commands**:
|
||||
```bash
|
||||
# Run quality checks
|
||||
pytest tests/quality/ -v
|
||||
|
||||
# Run new logging tests
|
||||
pytest tests/infrastructure/logging/ -v
|
||||
|
||||
# Run full test suite
|
||||
pytest -m "not integration" -v
|
||||
|
||||
# Type checking
|
||||
basedpyright src/noteflow/infrastructure/logging/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Documentation & Cleanup
|
||||
|
||||
### 7.1 Documentation Updates
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| `CLAUDE.md` | Add logging configuration section |
|
||||
| `docs/guides/logging.md` | Create usage guide (NEW) |
|
||||
| `docs/triage.md` | Mark resolved issues |
|
||||
|
||||
### 7.2 Cleanup Tasks
|
||||
|
||||
**Agent Type**: `Explore`
|
||||
|
||||
**Verification Queries**:
|
||||
1. "Confirm no `logging.basicConfig()` calls remain"
|
||||
2. "Confirm no `logging.getLogger(__name__)` patterns remain"
|
||||
3. "Confirm all files use `structlog.get_logger()`"
|
||||
|
||||
---
|
||||
|
||||
## Execution Order & Dependencies
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
P0[Phase 0: Pre-Flight] --> P1[Phase 1: Discovery]
|
||||
P1 --> P2[Phase 2: Infrastructure]
|
||||
P2 --> P3[Phase 3: Entry Points]
|
||||
P3 --> P4[Phase 4: Migration Script]
|
||||
P4 --> P5A[Phase 5.1: App Services]
|
||||
P5A --> P5B[Phase 5.2: Infrastructure]
|
||||
P5B --> P5C[Phase 5.3: gRPC]
|
||||
P5C --> P6[Phase 6: Tests]
|
||||
P6 --> P7[Phase 7: Docs]
|
||||
```
|
||||
|
||||
**Parallelization Opportunities**:
|
||||
- Phase 2 modules (config.py, processors.py, handlers.py) can be developed in parallel
|
||||
- Batch migrations can be parallelized across different directories
|
||||
- Test writing can happen in parallel with Phase 5 batches
|
||||
|
||||
---
|
||||
|
||||
## Agent Orchestration Protocol
|
||||
|
||||
### MANDATORY: Quality Gate After Every Edit
|
||||
|
||||
**Every agent MUST run this after each file modification:**
|
||||
|
||||
```bash
|
||||
# IMMEDIATE - after every file edit
|
||||
make quality-py
|
||||
|
||||
# If ANY failure:
|
||||
# 1. FIX THE VIOLATION IMMEDIATELY
|
||||
# 2. Do NOT proceed to next file
|
||||
# 3. Do NOT claim "preexisting" without baseline proof
|
||||
```
|
||||
|
||||
### Agent Workflow Pattern
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ FOR EACH FILE MODIFICATION: │
|
||||
│ │
|
||||
│ 1. Read current file state │
|
||||
│ 2. Make edit │
|
||||
│ 3. Run: make quality-py │
|
||||
│ 4. IF FAIL → Fix immediately, go to step 3 │
|
||||
│ 5. IF PASS → Proceed to next edit │
|
||||
│ │
|
||||
│ NEVER: Skip step 3-4 │
|
||||
│ NEVER: Batch multiple edits before quality check │
|
||||
│ NEVER: Change threshold values in tests/quality/ │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Discovery Phase
|
||||
```bash
|
||||
# Capture baseline FIRST
|
||||
make quality-py 2>&1 | tee /tmp/quality_baseline.log
|
||||
|
||||
# Launch exploration agents in parallel
|
||||
# Agent: Explore (thorough)
|
||||
# Queries:
|
||||
# - "Find all files with 'import logging' in src/noteflow/"
|
||||
# - "Map logging patterns by file type"
|
||||
# - "Find silent error handlers returning None"
|
||||
```
|
||||
|
||||
### Implementation Phase (Per-File Quality Gates)
|
||||
|
||||
```bash
|
||||
# For EACH new file created:
|
||||
|
||||
# Step 1: Create file
|
||||
# Step 2: IMMEDIATELY run quality check
|
||||
make quality-py
|
||||
|
||||
# Step 3: If violations, fix before creating next file
|
||||
# Step 4: Only proceed when clean
|
||||
|
||||
# Example sequence for config.py:
|
||||
# - Write config.py
|
||||
# - make quality-py ← MUST PASS
|
||||
# - Write processors.py
|
||||
# - make quality-py ← MUST PASS
|
||||
# - Write handlers.py
|
||||
# - make quality-py ← MUST PASS
|
||||
```
|
||||
|
||||
### Migration Phase (Per-File Quality Gates)
|
||||
|
||||
```bash
|
||||
# For EACH migrated file:
|
||||
|
||||
# Step 1: Migrate single file
|
||||
# Step 2: IMMEDIATELY run quality check
|
||||
make quality-py
|
||||
|
||||
# Step 3: If new violations introduced, fix before next file
|
||||
# Compare against baseline to identify NEW vs preexisting
|
||||
|
||||
# Example: Migrating meeting_service.py
|
||||
# - Edit meeting_service.py (logging → structlog)
|
||||
# - make quality-py
|
||||
# - If new violations: FIX THEM
|
||||
# - Only then proceed to next service file
|
||||
```
|
||||
|
||||
### Continuous Validation Commands
|
||||
|
||||
```bash
|
||||
# Run continuously during development
|
||||
watch -n 30 'make quality-py'
|
||||
|
||||
# Or after each save (if using editor hooks)
|
||||
# VSCode: tasks.json with "runOn": "save"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Risk Mitigation
|
||||
|
||||
| Risk | Mitigation | Agent Responsibility |
|
||||
|------|------------|---------------------|
|
||||
| Migration breaks existing log parsing | Maintain JSON schema compatibility | `agent-code-quality` |
|
||||
| Rich console conflicts | Test CLI integration early | `Explore` |
|
||||
| OTEL context not propagating | Integration tests with mock tracer | `agent-testing-architect` |
|
||||
| Performance regression | Benchmark before/after | `agent-feasibility` |
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- [ ] All 71 files migrated to structlog
|
||||
- [ ] Zero `logging.basicConfig()` calls remain
|
||||
- [ ] Zero `logging.getLogger(__name__)` patterns remain
|
||||
- [ ] Dual output working (Rich console + JSON)
|
||||
- [ ] Context variables auto-injected
|
||||
- [ ] OTEL trace/span IDs appear when tracing enabled
|
||||
- [ ] LogBuffer receives structured events
|
||||
- [ ] All quality checks pass (`pytest tests/quality/`)
|
||||
- [ ] All type checks pass (`basedpyright`)
|
||||
- [ ] Documentation updated
|
||||
|
||||
---
|
||||
|
||||
## Appendix A: Quality Compliance Checklist
|
||||
|
||||
Per `docs/sprints/QUALITY_STANDARDS.md`:
|
||||
|
||||
### 🚫 PROHIBITED ACTIONS (Violation = Immediate Rollback)
|
||||
|
||||
- [ ] **NEVER** modify threshold values in `tests/quality/*.py`
|
||||
- [ ] **NEVER** add `# type: ignore` without explicit user approval
|
||||
- [ ] **NEVER** use `Any` type
|
||||
- [ ] **NEVER** skip `make quality-py` after a file edit
|
||||
- [ ] **NEVER** blame "preexisting issues" without baseline comparison
|
||||
|
||||
### ALL CODE Requirements (Not Just Tests)
|
||||
|
||||
These apply to `config.py`, `processors.py`, `handlers.py`, AND all migrated files:
|
||||
|
||||
| Requirement | Check Command | Applies To |
|
||||
|-------------|---------------|------------|
|
||||
| No `# type: ignore` | `basedpyright` | ALL `.py` files |
|
||||
| No `Any` types | `basedpyright` | ALL `.py` files |
|
||||
| Union syntax `X \| None` | `ruff check` | ALL `.py` files |
|
||||
| Module < 500 lines | `pytest tests/quality/` | ALL modules |
|
||||
| Function < 75 lines | `pytest tests/quality/` | ALL functions |
|
||||
| Complexity < 15 | `pytest tests/quality/` | ALL functions |
|
||||
| Parameters ≤ 7 | `pytest tests/quality/` | ALL functions |
|
||||
| No magic numbers | `pytest tests/quality/` | ALL code |
|
||||
| No hardcoded paths | `pytest tests/quality/` | ALL code |
|
||||
| Docstrings imperative | Manual review | ALL public APIs |
|
||||
|
||||
### Test-Specific Requirements
|
||||
|
||||
These apply ONLY to test files in `tests/`:
|
||||
|
||||
- [ ] No loops around assertions
|
||||
- [ ] No conditionals around assertions
|
||||
- [ ] `pytest.mark.parametrize` for multiple cases
|
||||
- [ ] `pytest.raises` with `match=` parameter
|
||||
- [ ] All fixtures have type hints
|
||||
- [ ] Fixtures in conftest.py (not duplicated)
|
||||
|
||||
### Per-Edit Verification Workflow
|
||||
|
||||
```bash
|
||||
# After EVERY edit (not batched):
|
||||
make quality-py
|
||||
|
||||
# Expected output for clean code:
|
||||
# === Ruff (Python Lint) ===
|
||||
# All checks passed!
|
||||
# === Basedpyright ===
|
||||
# 0 errors, 0 warnings, 0 informations
|
||||
# === Python Test Quality ===
|
||||
# XX passed in X.XXs
|
||||
|
||||
# If ANY failure: FIX IMMEDIATELY before next edit
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Appendix B: Threshold Values (READ-ONLY Reference)
|
||||
|
||||
**⚠️ These values are READ-ONLY. Agents MUST NOT modify them.**
|
||||
|
||||
From `tests/quality/test_code_smells.py`:
|
||||
```python
|
||||
# DO NOT CHANGE THESE VALUES
|
||||
MODULE_SOFT_LIMIT = 500
|
||||
MODULE_HARD_LIMIT = 750
|
||||
FUNCTION_LINE_LIMIT = 75
|
||||
COMPLEXITY_LIMIT = 15
|
||||
PARAMETER_LIMIT = 7
|
||||
```
|
||||
|
||||
From `tests/quality/test_magic_values.py`:
|
||||
```python
|
||||
# DO NOT CHANGE THESE VALUES
|
||||
MAX_MAGIC_NUMBERS = 10
|
||||
MAX_REPEATED_STRINGS = 30
|
||||
MAX_HARDCODED_PATHS = 0
|
||||
```
|
||||
|
||||
If your code exceeds these limits, **refactor the code**, not the thresholds.
|
||||
904
docs/sprints/sprint_quality_suite_hardening.md
Normal file
904
docs/sprints/sprint_quality_suite_hardening.md
Normal file
@@ -0,0 +1,904 @@
|
||||
# Sprint: Quality Suite Hardening
|
||||
|
||||
## Overview
|
||||
|
||||
**Goal**: Transform the quality test suite from threshold-based enforcement to baseline-based enforcement, fix detection holes, and add self-tests to prevent regression.
|
||||
|
||||
**Priority**: High - Quality gates are the primary defense against technical debt creep
|
||||
|
||||
**Estimated Effort**: Medium (2-3 days)
|
||||
|
||||
## Problem Statement
|
||||
|
||||
The current quality suite has several weaknesses that make it "gameable" or prone to silent failures:
|
||||
|
||||
1. **Threshold-Based Enforcement**: Using `max_allowed = N` caps that drift over time
|
||||
2. **Silent Parse Failures**: `except SyntaxError: continue` hides unparseable files
|
||||
3. **Detection Holes**: Some rules are compiled but not applied (skipif), or have logic bugs (hardcoded paths)
|
||||
4. **Allowlist Maintenance Sink**: Magic values allowlists will grow unbounded
|
||||
5. **Inconsistent File Discovery**: Multiple `find_python_files()` implementations with different excludes
|
||||
6. **No Self-Tests**: Quality detectors can silently degrade
|
||||
|
||||
## Proposed Architecture
|
||||
|
||||
### 1. Baseline-Based Enforcement System
|
||||
|
||||
Replace all `assert len(violations) <= N` with baseline comparison:
|
||||
|
||||
```
|
||||
tests/quality/
|
||||
├── __init__.py
|
||||
├── _baseline.py # NEW: Baseline loading and comparison
|
||||
├── baselines.json # NEW: Frozen violation snapshots
|
||||
├── test_baseline_self.py # NEW: Self-tests for baseline system
|
||||
├── test_code_smells.py # MODIFIED: Use baseline enforcement
|
||||
├── test_stale_code.py # MODIFIED: Use baseline enforcement
|
||||
├── test_test_smells.py # MODIFIED: Use baseline enforcement
|
||||
├── test_magic_values.py # MODIFIED: Use baseline enforcement
|
||||
├── test_duplicate_code.py # MODIFIED: Use baseline enforcement
|
||||
├── test_unnecessary_wrappers.py # MODIFIED: Use baseline enforcement
|
||||
├── test_decentralized_helpers.py # MODIFIED: Use baseline enforcement
|
||||
└── _helpers.py # NEW: Centralized file discovery
|
||||
```
|
||||
|
||||
### 2. Stable Violation IDs
|
||||
|
||||
Violation IDs must be stable across refactors (avoid line-number-only IDs):
|
||||
|
||||
| Rule Category | ID Format |
|
||||
|---------------|-----------|
|
||||
| Function-level | `rule|relative_path|function_name` |
|
||||
| Class-level | `rule|relative_path|class_name` |
|
||||
| Line-level | `rule|relative_path|content_hash` |
|
||||
| Wrapper | `thin_wrapper|relative_path|function_name|wrapped_call` |
|
||||
|
||||
### 3. Baseline JSON Structure
|
||||
|
||||
```json
|
||||
{
  "schema_version": 1,
  "generated_at": "2025-12-31T00:00:00Z",
  "rules": {
    "high_complexity": [
      "high_complexity|src/noteflow/infrastructure/summarization/_parsing.py|parse_llm_response",
      "high_complexity|src/noteflow/grpc/_mixins/streaming/_mixin.py|StreamTranscription"
    ],
    "thin_wrapper": [
      "thin_wrapper|src/noteflow/config/settings.py|get_settings|_load_settings"
    ],
    "stale_todo": [
      "stale_todo|src/noteflow/grpc/service.py|hash:abc123"
    ]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
### Phase 1: Foundation (Day 1)
|
||||
|
||||
#### Task 1.1: Create Baseline Infrastructure
|
||||
|
||||
**File**: `tests/quality/_baseline.py`
|
||||
|
||||
```python
|
||||
"""Baseline-based quality enforcement infrastructure.
|
||||
|
||||
This module provides the foundation for "no new debt" quality gates.
|
||||
Instead of allowing N violations, we compare against a frozen baseline
|
||||
of existing violations. Any new violation fails immediately.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import hashlib
|
||||
import json
|
||||
from dataclasses import dataclass
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
BASELINE_PATH = Path(__file__).parent / "baselines.json"
|
||||
SCHEMA_VERSION = 1
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class Violation:
|
||||
"""Represents a quality rule violation with stable identity."""
|
||||
|
||||
rule: str
|
||||
relative_path: str
|
||||
identifier: str # function/class name or content hash
|
||||
detail: str = "" # optional detail (wrapped call, metric value, etc.)
|
||||
|
||||
@property
|
||||
def stable_id(self) -> str:
|
||||
"""Generate stable ID for baseline comparison."""
|
||||
parts = [self.rule, self.relative_path, self.identifier]
|
||||
if self.detail:
|
||||
parts.append(self.detail)
|
||||
return "|".join(parts)
|
||||
|
||||
def __str__(self) -> str:
|
||||
"""Human-readable representation."""
|
||||
return f"{self.relative_path}:{self.identifier} [{self.rule}]"
|
||||
|
||||
|
||||
@dataclass
|
||||
class BaselineResult:
|
||||
"""Result of baseline comparison."""
|
||||
|
||||
new_violations: list[Violation]
|
||||
fixed_violations: list[str] # IDs that were in baseline but not found
|
||||
current_count: int
|
||||
baseline_count: int
|
||||
|
||||
@property
|
||||
def passed(self) -> bool:
|
||||
"""True if no new violations introduced."""
|
||||
return len(self.new_violations) == 0
|
||||
|
||||
|
||||
def load_baseline() -> dict[str, set[str]]:
|
||||
"""Load baseline violations from JSON file."""
|
||||
if not BASELINE_PATH.exists():
|
||||
return {}
|
||||
|
||||
data = json.loads(BASELINE_PATH.read_text(encoding="utf-8"))
|
||||
|
||||
# Version check
|
||||
if data.get("schema_version", 0) != SCHEMA_VERSION:
|
||||
raise ValueError(
|
||||
f"Baseline schema version mismatch: "
|
||||
f"expected {SCHEMA_VERSION}, got {data.get('schema_version')}"
|
||||
)
|
||||
|
||||
return {rule: set(ids) for rule, ids in data.get("rules", {}).items()}
|
||||
|
||||
|
||||
def save_baseline(violations_by_rule: dict[str, list[Violation]]) -> None:
|
||||
"""Save current violations as new baseline.
|
||||
|
||||
This should only be called manually when intentionally updating the baseline.
|
||||
"""
|
||||
data = {
|
||||
"schema_version": SCHEMA_VERSION,
|
||||
"generated_at": datetime.now(timezone.utc).isoformat(),
|
||||
"rules": {
|
||||
rule: sorted(v.stable_id for v in violations)
|
||||
for rule, violations in violations_by_rule.items()
|
||||
}
|
||||
}
|
||||
|
||||
BASELINE_PATH.write_text(
|
||||
json.dumps(data, indent=2, sort_keys=True) + "\n",
|
||||
encoding="utf-8"
|
||||
)
|
||||
|
||||
|
||||
def assert_no_new_violations(
|
||||
rule: str,
|
||||
current_violations: list[Violation],
|
||||
*,
|
||||
max_new_allowed: int = 0,
|
||||
) -> BaselineResult:
|
||||
"""Assert no new violations beyond the frozen baseline.
|
||||
|
||||
Args:
|
||||
rule: The rule name (e.g., "high_complexity", "thin_wrapper")
|
||||
current_violations: List of violations found in current scan
|
||||
max_new_allowed: Allow up to N new violations (default 0)
|
||||
|
||||
Returns:
|
||||
BaselineResult with comparison details
|
||||
|
||||
Raises:
|
||||
AssertionError: If new violations exceed max_new_allowed
|
||||
"""
|
||||
baseline = load_baseline()
|
||||
allowed_ids = baseline.get(rule, set())
|
||||
|
||||
current_ids = {v.stable_id for v in current_violations}
|
||||
|
||||
new_ids = current_ids - allowed_ids
|
||||
fixed_ids = allowed_ids - current_ids
|
||||
|
||||
new_violations = [v for v in current_violations if v.stable_id in new_ids]
|
||||
|
||||
result = BaselineResult(
|
||||
new_violations=sorted(new_violations, key=lambda v: v.stable_id),
|
||||
fixed_violations=sorted(fixed_ids),
|
||||
current_count=len(current_violations),
|
||||
baseline_count=len(allowed_ids),
|
||||
)
|
||||
|
||||
if len(new_violations) > max_new_allowed:
|
||||
message_parts = [
|
||||
f"[{rule}] {len(new_violations)} NEW violations introduced "
|
||||
f"(baseline: {len(allowed_ids)}, current: {len(current_violations)}):",
|
||||
]
|
||||
for v in new_violations[:20]:
|
||||
message_parts.append(f" + {v}")
|
||||
|
||||
if fixed_ids:
|
||||
message_parts.append(f"\nFixed {len(fixed_ids)} violations (can update baseline):")
|
||||
            for fid in sorted(fixed_ids)[:5]:
|
||||
message_parts.append(f" - {fid}")
|
||||
|
||||
raise AssertionError("\n".join(message_parts))
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def content_hash(content: str, length: int = 8) -> str:
|
||||
"""Generate short hash of content for stable line-level IDs."""
|
||||
return hashlib.sha256(content.encode()).hexdigest()[:length]
|
||||
```
|
||||
|
||||
#### Task 1.2: Create Centralized File Discovery
|
||||
|
||||
**File**: `tests/quality/_helpers.py`
|
||||
|
||||
```python
|
||||
"""Centralized helpers for quality tests.
|
||||
|
||||
All quality tests should use these helpers to ensure consistent
|
||||
file discovery and avoid gaps in coverage.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import ast
|
||||
from pathlib import Path
|
||||
|
||||
# Root paths
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.parent
|
||||
SRC_ROOT = PROJECT_ROOT / "src" / "noteflow"
|
||||
TESTS_ROOT = PROJECT_ROOT / "tests"
|
||||
|
||||
# Excluded patterns (generated code)
|
||||
GENERATED_PATTERNS = {"*_pb2.py", "*_pb2_grpc.py", "*_pb2.pyi"}
|
||||
|
||||
# Excluded directories
|
||||
EXCLUDED_DIRS = {".venv", "__pycache__", "node_modules", ".git"}
|
||||
|
||||
|
||||
def find_source_files(
|
||||
root: Path = SRC_ROOT,
|
||||
*,
|
||||
include_tests: bool = False,
|
||||
include_conftest: bool = False,
|
||||
include_migrations: bool = False,
|
||||
include_quality: bool = False,
|
||||
) -> list[Path]:
|
||||
"""Find Python source files with consistent exclusions.
|
||||
|
||||
Args:
|
||||
root: Root directory to search
|
||||
include_tests: Include test files (test_*.py)
|
||||
include_conftest: Include conftest.py files
|
||||
include_migrations: Include Alembic migration files
|
||||
include_quality: Include tests/quality/ files
|
||||
|
||||
Returns:
|
||||
List of Path objects for matching files
|
||||
"""
|
||||
files: list[Path] = []
|
||||
|
||||
for py_file in root.rglob("*.py"):
|
||||
# Skip excluded directories
|
||||
if any(d in py_file.parts for d in EXCLUDED_DIRS):
|
||||
continue
|
||||
|
||||
# Skip generated files
|
||||
if any(py_file.match(p) for p in GENERATED_PATTERNS):
|
||||
continue
|
||||
|
||||
# Skip conftest unless included
|
||||
if not include_conftest and py_file.name == "conftest.py":
|
||||
continue
|
||||
|
||||
# Skip migrations unless included
|
||||
if not include_migrations and "migrations" in py_file.parts:
|
||||
continue
|
||||
|
||||
# Skip tests unless included
|
||||
if not include_tests and "tests" in py_file.parts:
|
||||
continue
|
||||
|
||||
# Skip quality tests unless included (prevents recursion)
|
||||
if not include_quality and "quality" in py_file.parts:
|
||||
continue
|
||||
|
||||
files.append(py_file)
|
||||
|
||||
return sorted(files)
|
||||
|
||||
|
||||
def find_test_files(
|
||||
root: Path = TESTS_ROOT,
|
||||
*,
|
||||
include_quality: bool = False,
|
||||
) -> list[Path]:
|
||||
"""Find test files with consistent exclusions.
|
||||
|
||||
Args:
|
||||
root: Root directory to search
|
||||
include_quality: Include tests/quality/ files
|
||||
|
||||
Returns:
|
||||
List of test file paths
|
||||
"""
|
||||
files: list[Path] = []
|
||||
|
||||
for py_file in root.rglob("test_*.py"):
|
||||
# Skip excluded directories
|
||||
if any(d in py_file.parts for d in EXCLUDED_DIRS):
|
||||
continue
|
||||
|
||||
# Skip quality tests unless included
|
||||
if not include_quality and "quality" in py_file.parts:
|
||||
continue
|
||||
|
||||
files.append(py_file)
|
||||
|
||||
return sorted(files)
|
||||
|
||||
|
||||
def parse_file_safe(file_path: Path) -> tuple[ast.AST | None, str | None]:
|
||||
"""Parse a Python file, returning AST or error message.
|
||||
|
||||
Unlike bare `ast.parse`, this never silently fails.
|
||||
|
||||
Returns:
|
||||
(ast, None) on success
|
||||
(None, error_message) on failure
|
||||
"""
|
||||
try:
|
||||
source = file_path.read_text(encoding="utf-8")
|
||||
tree = ast.parse(source)
|
||||
return tree, None
|
||||
except SyntaxError as e:
|
||||
return None, f"{file_path}: SyntaxError at line {e.lineno}: {e.msg}"
|
||||
except Exception as e:
|
||||
return None, f"{file_path}: {type(e).__name__}: {e}"
|
||||
|
||||
|
||||
def relative_path(file_path: Path, root: Path = PROJECT_ROOT) -> str:
    """Get path relative to the given root (project root by default) for stable IDs."""
    try:
        return str(file_path.relative_to(root))
    except ValueError:
        return str(file_path)
|
||||
```
|
||||
|
||||
#### Task 1.3: Create Self-Tests for Quality Infrastructure
|
||||
|
||||
**File**: `tests/quality/test_baseline_self.py`
|
||||
|
||||
```python
|
||||
"""Self-tests for quality infrastructure.
|
||||
|
||||
These tests ensure the quality detectors themselves work correctly.
|
||||
This prevents the quality suite from silently degrading.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import ast
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
from tests.quality._baseline import (
|
||||
Violation,
|
||||
assert_no_new_violations,
|
||||
content_hash,
|
||||
load_baseline,
|
||||
)
|
||||
from tests.quality._helpers import parse_file_safe
|
||||
|
||||
|
||||
class TestParseFileSafe:
|
||||
"""Tests for safe file parsing."""
|
||||
|
||||
def test_valid_python_parses(self, tmp_path: Path) -> None:
|
||||
"""Valid Python code should parse successfully."""
|
||||
file = tmp_path / "valid.py"
|
||||
file.write_text("def foo(): pass\n")
|
||||
|
||||
tree, error = parse_file_safe(file)
|
||||
|
||||
assert tree is not None
|
||||
assert error is None
|
||||
|
||||
def test_syntax_error_returns_message(self, tmp_path: Path) -> None:
|
||||
"""Syntax errors should return descriptive message, not raise."""
|
||||
file = tmp_path / "invalid.py"
|
||||
file.write_text("def foo(\n") # Incomplete
|
||||
|
||||
tree, error = parse_file_safe(file)
|
||||
|
||||
assert tree is None
|
||||
assert error is not None
|
||||
assert "SyntaxError" in error
|
||||
|
||||
|
||||
class TestViolation:
|
||||
"""Tests for Violation dataclass."""
|
||||
|
||||
def test_stable_id_format(self) -> None:
|
||||
"""Stable ID should include all components."""
|
||||
v = Violation(
|
||||
rule="thin_wrapper",
|
||||
relative_path="src/foo.py",
|
||||
identifier="my_func",
|
||||
detail="wrapped_call",
|
||||
)
|
||||
|
||||
assert v.stable_id == "thin_wrapper|src/foo.py|my_func|wrapped_call"
|
||||
|
||||
def test_stable_id_without_detail(self) -> None:
|
||||
"""Stable ID should work without detail."""
|
||||
v = Violation(
|
||||
rule="high_complexity",
|
||||
relative_path="src/bar.py",
|
||||
identifier="complex_func",
|
||||
)
|
||||
|
||||
assert v.stable_id == "high_complexity|src/bar.py|complex_func"
|
||||
|
||||
|
||||
class TestContentHash:
|
||||
"""Tests for content hashing."""
|
||||
|
||||
def test_same_content_same_hash(self) -> None:
|
||||
"""Same content should produce same hash."""
|
||||
content = "# TODO: fix this"
|
||||
|
||||
assert content_hash(content) == content_hash(content)
|
||||
|
||||
def test_different_content_different_hash(self) -> None:
|
||||
"""Different content should produce different hash."""
|
||||
assert content_hash("foo") != content_hash("bar")
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Detector Self-Tests
|
||||
# =============================================================================
|
||||
|
||||
|
||||
class TestSkipifDetection:
|
||||
"""Self-tests for skipif detection (prevents the hole we found)."""
|
||||
|
||||
def test_detects_skip_without_reason(self) -> None:
|
||||
"""Should detect @pytest.mark.skip without reason."""
|
||||
code = '''
|
||||
@pytest.mark.skip
|
||||
def test_something():
|
||||
pass
|
||||
'''
|
||||
# This is what the detector should catch
|
||||
import re
|
||||
skip_pattern = re.compile(r"@pytest\.mark\.skip\s*(?:\(\s*\))?$", re.MULTILINE)
|
||||
|
||||
matches = skip_pattern.findall(code)
|
||||
assert len(matches) == 1
|
||||
|
||||
def test_detects_skip_with_empty_parens(self) -> None:
|
||||
"""Should detect @pytest.mark.skip() with empty parens."""
|
||||
code = "@pytest.mark.skip()\ndef test_foo(): pass"
|
||||
import re
|
||||
skip_pattern = re.compile(r"@pytest\.mark\.skip\s*(?:\(\s*\))?$", re.MULTILINE)
|
||||
|
||||
assert skip_pattern.search(code) is not None
|
||||
|
||||
    def test_detects_skipif_without_reason(self) -> None:
        """Should detect @pytest.mark.skipif without reason keyword."""
        import re

        # The current code compiles a skipif pattern but never uses it - this is the bug!
        # This test validates what SHOULD happen: flag skipif only when reason= is absent.
        skipif_pattern = re.compile(
            r"@pytest\.mark\.skipif\s*\((?![^)]*reason=)[^)]*\)",
            re.MULTILINE,
        )

        code = '@pytest.mark.skipif(sys.platform == "win32")\ndef test_foo(): pass'
        assert skipif_pattern.search(code) is not None

        # With reason= present, the detector must not flag the decorator.
        code_with_reason = '@pytest.mark.skipif(sys.platform == "win32", reason="Windows")'
        assert skipif_pattern.search(code_with_reason) is None
|
||||
|
||||
|
||||
class TestHardcodedPathDetection:
|
||||
"""Self-tests for hardcoded path detection (fixes the split bug)."""
|
||||
|
||||
def test_detects_home_path(self) -> None:
|
||||
"""Should detect /home/user paths."""
|
||||
import re
|
||||
pattern = r'["\']\/(?:home|usr|var|etc|opt|tmp)\/\w+'
|
||||
|
||||
line = 'PATH = "/home/user/data"'
|
||||
assert re.search(pattern, line) is not None
|
||||
|
||||
    def test_ignores_path_in_comment(self) -> None:
        """Should ignore paths that appear after # comment."""
        import re
        pattern = r'["\']\/(?:home|usr|var|etc|opt|tmp)\/\w+'

        line = '# Example: PATH = "/home/user/data"'
        match = re.search(pattern, line)
        assert match is not None  # the regex alone still matches inside comments

        # The bug: line.split(pattern) splits on the LITERAL string, not the regex.
        # The CORRECT check compares the comment position against the match position:
        comment_pos = line.find("#")
        assert comment_pos != -1
        assert comment_pos < match.start(), "Path is inside a comment and must be ignored"
|
||||
|
||||
def test_detects_path_with_inline_comment_after(self) -> None:
|
||||
"""Path before inline comment should still be detected."""
|
||||
import re
|
||||
pattern = r'["\']\/(?:home|usr|var|etc|opt|tmp)\/\w+'
|
||||
|
||||
line = 'PATH = "/home/user/thing" # legit comment'
|
||||
match = re.search(pattern, line)
|
||||
|
||||
assert match is not None
|
||||
# Comment is AFTER the match, so this should be flagged
|
||||
comment_pos = line.find("#")
|
||||
assert comment_pos > match.start(), "Comment should be after the path"
|
||||
|
||||
|
||||
class TestThinWrapperDetection:
|
||||
"""Self-tests for thin wrapper detection."""
|
||||
|
||||
def test_detects_simple_passthrough(self) -> None:
|
||||
"""Should detect simple return-only wrappers."""
|
||||
code = '''
|
||||
def wrapper():
|
||||
return wrapped()
|
||||
'''
|
||||
tree = ast.parse(code)
|
||||
func = tree.body[0]
|
||||
assert isinstance(func, ast.FunctionDef)
|
||||
|
||||
# The body has one statement (Return with Call)
|
||||
assert len(func.body) == 1
|
||||
stmt = func.body[0]
|
||||
assert isinstance(stmt, ast.Return)
|
||||
assert isinstance(stmt.value, ast.Call)
|
||||
|
||||
def test_detects_await_passthrough(self) -> None:
|
||||
"""Should detect async return await wrappers."""
|
||||
code = '''
|
||||
async def wrapper():
|
||||
return await wrapped()
|
||||
'''
|
||||
tree = ast.parse(code)
|
||||
func = tree.body[0]
|
||||
assert isinstance(func, ast.AsyncFunctionDef)
|
||||
|
||||
stmt = func.body[0]
|
||||
assert isinstance(stmt, ast.Return)
|
||||
# The value is Await wrapping Call
|
||||
assert isinstance(stmt.value, ast.Await)
|
||||
assert isinstance(stmt.value.value, ast.Call)
|
||||
|
||||
def test_ignores_wrapper_with_logic(self) -> None:
|
||||
"""Should ignore wrappers that add logic."""
|
||||
code = '''
|
||||
def wrapper(x):
|
||||
if x:
|
||||
return wrapped()
|
||||
return None
|
||||
'''
|
||||
tree = ast.parse(code)
|
||||
func = tree.body[0]
|
||||
|
||||
# Multiple statements = not a thin wrapper
|
||||
assert len(func.body) > 1
|
||||
```
|
||||
|
||||
### Phase 2: Fix Detection Holes (Day 1-2)
|
||||
|
||||
#### Task 2.1: Fix skipif Detection Bug
|
||||
|
||||
**File**: `tests/quality/test_test_smells.py`
|
||||
|
||||
The current code compiles the skipif pattern but never uses it:
|
||||
|
||||
```python
|
||||
# CURRENT (broken):
|
||||
skip_pattern = re.compile(r"@pytest\.mark\.skip\s*(?:\(\s*\))?$", re.MULTILINE)
|
||||
re.compile( # <-- compiled but NOT assigned!
|
||||
r"@pytest\.mark\.skipif\s*\([^)]*\)\s*$", re.MULTILINE
|
||||
)
|
||||
|
||||
# FIXED:
|
||||
skip_pattern = re.compile(r"@pytest\.mark\.skip\s*(?:\(\s*\))?$", re.MULTILINE)
|
||||
skipif_pattern = re.compile(
    r"@pytest\.mark\.skipif\s*\((?![^)]*reason=)[^)]*\)",
    re.MULTILINE
)
|
||||
```
|
||||
|
||||
Then use both patterns in the detection loop.
|
||||
|
||||
#### Task 2.2: Fix Hardcoded Path Detection Bug
|
||||
|
||||
**File**: `tests/quality/test_magic_values.py`
|
||||
|
||||
The current code has a logic bug with `line.split(pattern)`:
|
||||
|
||||
```python
|
||||
# CURRENT (broken):
|
||||
if re.search(pattern, line):
|
||||
if "test" not in line.lower() and "#" not in line.split(pattern)[0]:
|
||||
# line.split(pattern) splits on LITERAL string, not regex!
|
||||
violations.append(...)
|
||||
|
||||
# FIXED:
|
||||
match = re.search(pattern, line)
|
||||
if match:
|
||||
# Check if # appears BEFORE the match
|
||||
comment_pos = line.find("#")
|
||||
if comment_pos != -1 and comment_pos < match.start():
|
||||
continue # Path is in comment, skip
|
||||
if "test" not in line.lower():
|
||||
violations.append(...)
|
||||
```
|
||||
|
||||
#### Task 2.3: Fix Silent SyntaxError Handling
|
||||
|
||||
Replace all `except SyntaxError: continue` with error collection:
|
||||
|
||||
```python
|
||||
# CURRENT (silent failure):
|
||||
for py_file in find_python_files(src_root):
|
||||
source = py_file.read_text(encoding="utf-8")
|
||||
try:
|
||||
tree = ast.parse(source)
|
||||
except SyntaxError:
|
||||
continue # <-- Silent skip!
|
||||
|
||||
# FIXED (fail loudly):
|
||||
from tests.quality._helpers import parse_file_safe
|
||||
|
||||
parse_errors: list[str] = []
|
||||
|
||||
for py_file in find_python_files(src_root):
|
||||
tree, error = parse_file_safe(py_file)
|
||||
if error:
|
||||
parse_errors.append(error)
|
||||
continue
|
||||
# ... process tree ...
|
||||
|
||||
# At the end of the test:
|
||||
assert not parse_errors, (
|
||||
f"Quality scan hit {len(parse_errors)} parse error(s):\n"
|
||||
+ "\n".join(parse_errors)
|
||||
)
|
||||
```
|
||||
|
||||
### Phase 3: Migrate Tests to Baseline (Day 2)
|
||||
|
||||
#### Task 3.1: Migrate High-Impact Tests First
|
||||
|
||||
Priority order (highest gaming risk first):
|
||||
|
||||
1. `test_no_stale_todos` - Easy to add TODOs
|
||||
2. `test_no_trivial_wrapper_functions` - High cap (42)
|
||||
3. `test_no_high_complexity_functions` - Complexity creep
|
||||
4. `test_no_long_parameter_lists` - High cap (35)
|
||||
5. `test_no_repeated_code_patterns` - Very high cap (177)
|
||||
|
||||
Example migration for `test_no_stale_todos`:
|
||||
|
||||
```python
|
||||
# BEFORE:
|
||||
def test_no_stale_todos() -> None:
|
||||
# ... detection logic ...
|
||||
max_allowed = 10
|
||||
assert len(stale_comments) <= max_allowed, ...
|
||||
|
||||
# AFTER:
|
||||
from tests.quality._baseline import Violation, assert_no_new_violations, content_hash
|
||||
from tests.quality._helpers import find_source_files, parse_file_safe, relative_path
|
||||
|
||||
def test_no_stale_todos() -> None:
|
||||
violations: list[Violation] = []
|
||||
parse_errors: list[str] = []
|
||||
|
||||
for py_file in find_source_files():
|
||||
lines = py_file.read_text(encoding="utf-8").splitlines()
|
||||
rel_path = relative_path(py_file)
|
||||
|
||||
for i, line in enumerate(lines, start=1):
|
||||
match = stale_pattern.search(line)
|
||||
if match:
|
||||
tag = match.group(1).upper()
|
||||
|
||||
violations.append(
|
||||
Violation(
|
||||
rule="stale_todo",
|
||||
relative_path=rel_path,
|
||||
identifier=content_hash(f"{i}:{line.strip()}"),
|
||||
detail=tag,
|
||||
)
|
||||
)
|
||||
|
||||
assert not parse_errors, "\n".join(parse_errors)
|
||||
assert_no_new_violations("stale_todo", violations)
|
||||
```
|
||||
|
||||
#### Task 3.2: Generate Initial Baseline
|
||||
|
||||
After migrating all tests, generate the baseline:
|
||||
|
||||
```bash
|
||||
# Run with special env var to generate baseline
|
||||
QUALITY_GENERATE_BASELINE=1 pytest tests/quality/ -v
|
||||
|
||||
# Or use a management script
|
||||
python -m tests.quality._baseline --generate
|
||||
```
|
||||
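One way to wire the env-var path, as a sketch: each quality test hands its violations to a session-level collector, and a conftest hook writes the baseline only when `QUALITY_GENERATE_BASELINE=1` (the collector helper below is an assumption, not existing code):

```python
# tests/quality/conftest.py (sketch)
import os

import pytest

from tests.quality._baseline import Violation, save_baseline

_collected: dict[str, list[Violation]] = {}


def record_violations(rule: str, violations: list[Violation]) -> None:
    """Called by each quality test before asserting against the baseline."""
    _collected[rule] = violations


@pytest.hookimpl(trylast=True)
def pytest_sessionfinish(session: pytest.Session, exitstatus: int) -> None:
    """Freeze the collected violations when baseline generation is requested."""
    if os.environ.get("QUALITY_GENERATE_BASELINE") == "1" and _collected:
        save_baseline(_collected)
```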
|
||||
### Phase 4: Advanced Improvements (Day 3)
|
||||
|
||||
#### Task 4.1: Replace Magic Value Allowlists with "Must Be Named" Rule
|
||||
|
||||
Instead of maintaining `ALLOWED_NUMBERS` and `ALLOWED_STRINGS`, use:
|
||||
|
||||
```python
|
||||
def test_no_repeated_literals() -> None:
|
||||
"""Detect literals that repeat and should be constants.
|
||||
|
||||
    Rule: Any literal that appears three or more times within the codebase
    should be promoted to a named constant.
|
||||
|
||||
Universal exceptions (0, 1, -1, "utf-8", HTTP verbs) are allowed.
|
||||
"""
|
||||
UNIVERSAL_EXCEPTIONS = {
|
||||
0, 1, 2, -1, # Universal integers
|
||||
        0.0, 1.0,  # Universal floats
        "", " ", "\n", "\t",  # Universal strings
        "utf-8",
|
||||
"GET", "POST", "PUT", "DELETE", "PATCH",
|
||||
}
|
||||
|
||||
literal_occurrences: dict[object, list[Violation]] = defaultdict(list)
|
||||
|
||||
for py_file in find_source_files():
|
||||
# ... collect literals ...
|
||||
for node in ast.walk(tree):
|
||||
if isinstance(node, ast.Constant):
|
||||
value = node.value
|
||||
if value in UNIVERSAL_EXCEPTIONS:
|
||||
continue
|
||||
if isinstance(value, str) and len(value) < 3:
|
||||
continue # Short strings OK
|
||||
# ... add to occurrences ...
|
||||
|
||||
# Flag any literal appearing 3+ times
|
||||
violations = []
|
||||
for value, occurrences in literal_occurrences.items():
|
||||
if len(occurrences) >= 3:
|
||||
violations.extend(occurrences)
|
||||
|
||||
assert_no_new_violations("repeated_literal", violations)
|
||||
```
|
||||
|
||||
#### Task 4.2: Add CODEOWNERS Protection
|
||||
|
||||
**File**: `.github/CODEOWNERS`
|
||||
|
||||
```
|
||||
# Quality suite requires maintainer approval
|
||||
tests/quality/ @your-team/maintainers
|
||||
tests/quality/baselines.json @your-team/maintainers
|
||||
```
|
||||
|
||||
### Phase 5: Documentation and Rollout
|
||||
|
||||
#### Task 5.1: Update QUALITY_STANDARDS.md
|
||||
|
||||
Add section explaining baseline enforcement:
|
||||
|
||||
````markdown
|
||||
## Baseline-Based Quality Gates
|
||||
|
||||
Quality tests use baseline enforcement instead of fixed caps:
|
||||
|
||||
- **No new debt**: Adding any new violation fails immediately
|
||||
- **Baseline file**: `tests/quality/baselines.json` freezes existing violations
|
||||
- **Reducing debt**: Fix violations, then update baseline to remove entries
|
||||
- **Protected file**: Baseline changes require maintainer approval
|
||||
|
||||
### Updating the Baseline
|
||||
|
||||
When you've fixed violations and want to update the baseline:
|
||||
|
||||
```bash
|
||||
# Regenerate baseline with current violations
|
||||
python -m pytest tests/quality/ --generate-baseline
|
||||
|
||||
# Review and commit
|
||||
git diff tests/quality/baselines.json
|
||||
git add tests/quality/baselines.json
|
||||
git commit -m "chore: reduce quality baseline (N violations fixed)"
|
||||
```
|
||||
|
||||
### Adding Exceptions (Rare)
|
||||
|
||||
If a new violation is intentional:
|
||||
|
||||
1. Document why in a comment near the code
|
||||
2. Add the stable ID to `baselines.json`
|
||||
3. Get maintainer approval for the change
|
||||
````
|
||||
|
||||
## Migration Strategy
|
||||
|
||||
### Step 1: Add Infrastructure (No Behavior Change)
|
||||
|
||||
1. Add `_baseline.py` and `_helpers.py`
|
||||
2. Add `test_baseline_self.py`
|
||||
3. Verify self-tests pass
|
||||
|
||||
### Step 2: Fix Detection Bugs
|
||||
|
||||
1. Fix skipif detection
|
||||
2. Fix hardcoded path comment handling
|
||||
3. Fix silent SyntaxError handling
|
||||
4. Verify with self-tests
|
||||
|
||||
### Step 3: Parallel Run Period
|
||||
|
||||
1. Add baseline checks alongside existing caps
|
||||
2. Both must pass (cap AND baseline)
|
||||
3. Monitor for issues
|
||||
|
||||
### Step 4: Remove Caps
|
||||
|
||||
1. Remove `max_allowed` assertions
|
||||
2. Baseline becomes sole enforcement
|
||||
3. Generate and commit initial baseline
|
||||
|
||||
### Step 5: Reduce Baseline Over Time
|
||||
|
||||
1. Sprint goals include "reduce N violations"
|
||||
2. Update baseline when fixes land
|
||||
3. Celebrate progress!
|
||||
|
||||
## Success Metrics
|
||||
|
||||
| Metric | Current | Target |
|
||||
|--------|---------|--------|
|
||||
| Parse error handling | Silent skip | Fail loudly |
|
||||
| Enforcement mechanism | Threshold caps | Baseline comparison |
|
||||
| Detection holes | 2+ known | 0 known |
|
||||
| Self-test coverage | 0% | 10+ detector tests |
|
||||
| Baseline violations | N/A | Tracked and decreasing |
|
||||
|
||||
## Risks and Mitigations
|
||||
|
||||
| Risk | Mitigation |
|
||||
|------|------------|
|
||||
| Baseline file conflicts | Small file, clear ownership |
|
||||
| Too strict initially | Start with current counts frozen |
|
||||
| Self-tests incomplete | Add tests as holes are found |
|
||||
| Agent edits baseline | CODEOWNERS + branch protection |
|
||||
|
||||
## References
|
||||
|
||||
- [Test Smells](https://testsmells.org/)
|
||||
- [xUnit Test Patterns](http://xunitpatterns.com/)
|
||||
- [Quality Debt](https://martinfowler.com/bliki/TechnicalDebt.html)
|
||||
155
docs/sprints/sprint_quality_suite_hardening_PLAN.md
Normal file
155
docs/sprints/sprint_quality_suite_hardening_PLAN.md
Normal file
@@ -0,0 +1,155 @@
|
||||
# Sprint Plan: Quality Suite Hardening
|
||||
|
||||
## Summary
|
||||
|
||||
Transform quality tests from threshold-based (`max_allowed = N`) to baseline-based enforcement, fix detection holes, and add self-tests.
|
||||
|
||||
## Execution Checklist
|
||||
|
||||
### Phase 1: Foundation Infrastructure
|
||||
|
||||
- [ ] Create `tests/quality/_baseline.py` with `Violation`, `assert_no_new_violations()`, `content_hash()`
|
||||
- [ ] Create `tests/quality/_helpers.py` with centralized `find_source_files()`, `parse_file_safe()`
|
||||
- [ ] Create `tests/quality/baselines.json` (empty initially, schema v1)
|
||||
- [ ] Create `tests/quality/test_baseline_self.py` with infrastructure self-tests
|
||||
|
||||
### Phase 2: Fix Detection Holes
|
||||
|
||||
- [ ] Fix `test_no_ignored_tests_without_reason`: Add missing `skipif_pattern` variable and usage
|
||||
- [ ] Fix `test_no_hardcoded_paths`: Replace `line.split(pattern)` with `match.start()` comparison
|
||||
- [ ] Replace all `except SyntaxError: continue` with `parse_file_safe()` + error collection
|
||||
- [ ] Add self-tests for each fixed detector
|
||||
|
||||
### Phase 3: Migrate to Baseline Enforcement
|
||||
|
||||
Priority order (highest gaming risk):
|
||||
|
||||
1. [ ] `test_no_stale_todos` (cap: 10)
|
||||
2. [ ] `test_no_trivial_wrapper_functions` (cap: 42)
|
||||
3. [ ] `test_no_high_complexity_functions` (cap: 2)
|
||||
4. [ ] `test_no_long_parameter_lists` (cap: 35)
|
||||
5. [ ] `test_no_repeated_code_patterns` (cap: 177)
|
||||
6. [ ] `test_no_god_classes` (cap: 1)
|
||||
7. [ ] `test_no_deep_nesting` (cap: 2)
|
||||
8. [ ] `test_no_long_methods` (cap: 7)
|
||||
9. [ ] `test_no_feature_envy` (cap: 5)
|
||||
10. [ ] `test_no_orphaned_imports` (cap: 5)
|
||||
11. [ ] `test_no_deprecated_patterns` (cap: 5)
|
||||
12. [ ] `test_no_assertion_roulette` (cap: 50)
|
||||
13. [ ] `test_no_conditional_test_logic` (cap: 40)
|
||||
14. [ ] `test_no_sleepy_tests` (cap: 3)
|
||||
15. [ ] `test_no_unknown_tests` (cap: 5)
|
||||
16. [ ] `test_no_redundant_prints` (cap: 5)
|
||||
17. [ ] `test_no_exception_handling_in_tests` (cap: 3)
|
||||
18. [ ] `test_no_magic_numbers_in_assertions` (cap: 50)
|
||||
19. [ ] `test_no_sensitive_equality` (cap: 10)
|
||||
20. [ ] `test_no_eager_tests` (cap: 10)
|
||||
21. [ ] `test_no_duplicate_test_names` (cap: 15)
|
||||
22. [ ] `test_no_long_test_methods` (cap: 3)
|
||||
23. [ ] `test_fixtures_have_type_hints` (cap: 10)
|
||||
24. [ ] `test_no_unused_fixture_parameters` (cap: 5)
|
||||
25. [ ] `test_fixture_scope_appropriate` (cap: 5)
|
||||
26. [ ] `test_no_pytest_raises_without_match` (cap: 50)
|
||||
27. [ ] `test_no_magic_numbers` (cap: 10)
|
||||
28. [ ] `test_no_repeated_string_literals` (cap: 30)
|
||||
29. [ ] `test_no_alias_imports` (cap: 10)
|
||||
30. [ ] `test_no_redundant_type_aliases` (cap: 2)
|
||||
31. [ ] `test_no_passthrough_classes` (cap: 1)
|
||||
32. [ ] `test_no_duplicate_function_bodies` (cap: 1)
|
||||
33. [ ] `test_helpers_not_scattered` (cap: 15)
|
||||
34. [ ] `test_no_duplicate_helper_implementations` (cap: 25)
|
||||
35. [ ] `test_module_size_limits` (soft cap: 5, hard: 0)
|
||||
|
||||
### Phase 4: Generate Initial Baseline
|
||||
|
||||
- [ ] Run all quality tests to collect current violations
|
||||
- [ ] Generate `baselines.json` with frozen violation IDs
|
||||
- [ ] Verify all tests pass with baseline enforcement
|
||||
- [ ] Remove `max_allowed` assertions from all tests
|
||||
|
||||
### Phase 5: Advanced Improvements (Optional)
|
||||
|
||||
- [ ] Replace magic value allowlists with "must be named" rule
|
||||
- [ ] Add `.github/CODEOWNERS` for `tests/quality/` protection
|
||||
- [ ] Update `docs/sprints/QUALITY_STANDARDS.md` with baseline workflow
|
||||
|
||||
## Files to Create
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `tests/quality/_baseline.py` | Baseline loading, comparison, violation types |
|
||||
| `tests/quality/_helpers.py` | Centralized file discovery, safe parsing |
|
||||
| `tests/quality/baselines.json` | Frozen violation snapshot |
|
||||
| `tests/quality/test_baseline_self.py` | Self-tests for infrastructure |
|
||||
|
||||
## Files to Modify
|
||||
|
||||
| File | Changes |
|
||||
|------|---------|
|
||||
| `tests/quality/test_code_smells.py` | Use `_helpers`, baseline enforcement |
|
||||
| `tests/quality/test_stale_code.py` | Use `_helpers`, baseline enforcement |
|
||||
| `tests/quality/test_test_smells.py` | Fix skipif bug, use baseline |
|
||||
| `tests/quality/test_magic_values.py` | Fix path bug, use baseline |
|
||||
| `tests/quality/test_unnecessary_wrappers.py` | Use `_helpers`, baseline |
|
||||
| `tests/quality/test_duplicate_code.py` | Use `_helpers`, baseline |
|
||||
| `tests/quality/test_decentralized_helpers.py` | Use `_helpers`, baseline |
|
||||
|
||||
## Key Design Decisions
|
||||
|
||||
### Stable Violation IDs
|
||||
|
||||
```
|
||||
{rule}|{relative_path}|{identifier}[|{detail}]
|
||||
|
||||
Examples:
|
||||
- high_complexity|src/noteflow/grpc/service.py|StreamTranscription
|
||||
- thin_wrapper|src/noteflow/config/settings.py|get_settings|_load_settings
|
||||
- stale_todo|src/noteflow/cli/main.py|hash:a1b2c3d4
|
||||
```
|
||||
|
||||
### Baseline JSON Schema
|
||||
|
||||
```json
|
||||
{
|
||||
"schema_version": 1,
|
||||
"generated_at": "ISO8601",
|
||||
"rules": {
|
||||
"rule_name": ["stable_id_1", "stable_id_2"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Parse Error Handling
|
||||
|
||||
```python
|
||||
# Never silently skip
|
||||
tree, error = parse_file_safe(file_path)
|
||||
if error:
|
||||
parse_errors.append(error)
|
||||
continue
|
||||
|
||||
# Fail at end if any errors
|
||||
assert not parse_errors, "\n".join(parse_errors)
|
||||
```
|
||||
|
||||
## Verification Commands
|
||||
|
||||
```bash
|
||||
# Run quality tests
|
||||
pytest tests/quality/ -v
|
||||
|
||||
# Generate baseline (after migration)
|
||||
QUALITY_GENERATE_BASELINE=1 pytest tests/quality/ -v
|
||||
|
||||
# Check for new violations only
|
||||
pytest tests/quality/ -v --tb=short
|
||||
```
|
||||
|
||||
## Success Criteria
|
||||
|
||||
1. All quality tests pass with baseline enforcement
|
||||
2. No `max_allowed` caps remain in test code
|
||||
3. Self-tests cover all detection mechanisms
|
||||
4. Parse errors fail loudly instead of silently
|
||||
5. Detection holes (skipif, hardcoded paths) are fixed
|
||||
6. `baselines.json` tracks all existing violations
|
||||
265
docs/sprints/sprint_spec_validation_fixes.md
Normal file
265
docs/sprints/sprint_spec_validation_fixes.md
Normal file
@@ -0,0 +1,265 @@
|
||||
# Sprint: Spec Validation Fixes
|
||||
|
||||
> **Source**: `docs/spec.md` (2025-12-31 validation)
|
||||
> **Quality Gates**: `docs/sprints/QUALITY_STANDARDS.md`
|
||||
> **Status**: Planning
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This sprint addresses 12 findings from the spec validation document, ranging from gRPC schema inconsistencies to performance issues and security gaps. Each finding has been validated against the current codebase with exact file locations and evidence.
|
||||
|
||||
---
|
||||
|
||||
## Priority Classification
|
||||
|
||||
### P0 - Critical (Security/Data Integrity)
|
||||
|
||||
| ID | Finding | Risk | Effort |
|
||||
|----|---------|------|--------|
|
||||
| #6 | ChunkedAssetReader lacks bounds checks | Data corruption, decryption failures | Medium |
|
||||
| #10 | OTEL exporter uses `insecure=True` | Telemetry data exposed in transit | Low |
|
||||
|
||||
### P1 - High (API Contract/Correctness)
|
||||
|
||||
| ID | Finding | Risk | Effort |
|
||||
|----|---------|------|--------|
|
||||
| #1 | Timestamp representations inconsistent | Client/server mismatch, conversion errors | High |
|
||||
| #2 | UpdateAnnotation sentinel defaults | Cannot clear fields intentionally | Medium |
|
||||
| #3 | TranscriptUpdate ambiguous without `oneof` | Clients must defensively branch | Medium |
|
||||
| #11 | Stringly-typed statuses | Typos, unsupported values at runtime | Medium |
|
||||
|
||||
### P2 - Medium (Reliability/Performance)
|
||||
|
||||
| ID | Finding | Risk | Effort |
|
||||
|----|---------|------|--------|
|
||||
| #4 | Background task tracking inconsistent | Sync tasks not cancelled on shutdown | Medium |
|
||||
| #5 | Segmenter O(n) `pop(0)` in hot path | Performance degradation under load | Low |
|
||||
| #7 | gRPC size limits in multiple places | Configuration drift | Low |
|
||||
| #8 | Outlook adapter lacks timeouts/pagination | Hangs, incomplete data | Medium |
|
||||
| #9 | Webhook delivery ID not recorded | Correlation impossible | Low |
|
||||
|
||||
### P3 - Low (Test Coverage)
|
||||
|
||||
| ID | Finding | Risk | Effort |
|
||||
|----|---------|------|--------|
|
||||
| #12 | Test targets for high-risk changes | Regression risk | Medium |
|
||||
|
||||
---
|
||||
|
||||
## Detailed Findings
|
||||
|
||||
### #1 Timestamp Representations Inconsistent
|
||||
|
||||
**Status**: Confirmed
|
||||
**Locations**:
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:217` - `double created_at`
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:745` - `int64 start_time`
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:1203` - `string started_at` (ISO 8601)
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:149` - `double server_timestamp`
|
||||
|
||||
**Impact**: Multiple time encodings force per-field conversions and increase mismatch risk.
|
||||
|
||||
**Solution**:
|
||||
1. Add `google.protobuf.Timestamp` fields in new/v2 messages
|
||||
2. Deprecate legacy fields with comments
|
||||
3. Add helper conversions in `src/noteflow/grpc/_mixins/converters.py`
|
||||
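The converter helpers could be as small as this sketch (helper names are placeholders; the plan only commits to putting them in `converters.py`):

```python
from datetime import datetime, timezone

from google.protobuf.timestamp_pb2 import Timestamp


def datetime_to_timestamp(dt: datetime) -> Timestamp:
    """Convert an aware datetime into a protobuf Timestamp."""
    ts = Timestamp()
    ts.FromDatetime(dt.astimezone(timezone.utc))
    return ts


def timestamp_to_datetime(ts: Timestamp) -> datetime:
    """Convert a protobuf Timestamp into an aware UTC datetime."""
    return ts.ToDatetime().replace(tzinfo=timezone.utc)
```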
|
||||
---
|
||||
|
||||
### #2 UpdateAnnotation Sentinel Defaults
|
||||
|
||||
**Status**: Confirmed
|
||||
**Locations**:
|
||||
- `src/noteflow/grpc/proto/noteflow.proto:502` - Message definition
|
||||
- `src/noteflow/grpc/_mixins/annotation.py:127` - Handler logic
|
||||
|
||||
**Impact**: Cannot clear text to empty string, set time to 0, or clear segment_ids.
|
||||
|
||||
**Solution**:
|
||||
1. Add `optional` keyword to fields (proto3 presence tracking)
|
||||
2. Use `HasField()` checks in handler instead of sentinel comparisons
|
||||
3. Add `clear_*` flags for backward compatibility
|
||||
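With proto3 `optional` fields, the handler-side patch logic reduces to presence checks, roughly as below (the domain `Annotation` attributes and the `clear_segment_ids` flag are illustrative; the real field names live in `noteflow.proto` and `annotation.py`):

```python
def apply_annotation_patch(annotation: "Annotation", request: "UpdateAnnotationRequest") -> None:
    """Apply only the fields the client explicitly set (proto3 presence tracking)."""
    if request.HasField("text"):
        annotation.text = request.text  # empty string now legitimately means "clear the text"
    if request.HasField("start_time"):
        annotation.start_time = request.start_time  # 0 is a valid value, not a sentinel
    if request.clear_segment_ids:
        annotation.segment_ids = []
    elif request.segment_ids:
        annotation.segment_ids = list(request.segment_ids)
```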
|
||||
---
|
||||
|
||||
### #3 TranscriptUpdate Ambiguous Without `oneof`
|
||||
|
||||
**Status**: Confirmed
|
||||
**Location**: `src/noteflow/grpc/proto/noteflow.proto:136`
|
||||
|
||||
**Impact**: Schema allows both `partial_text` and `segment` or neither.
|
||||
|
||||
**Solution**:
|
||||
1. Create `TranscriptUpdateV2` with `oneof payload`
|
||||
2. Add new RPC `StreamTranscriptionV2`
|
||||
3. Use `google.protobuf.Timestamp` for `server_timestamp`
|
||||
|
||||
---
|
||||
|
||||
### #4 Background Task Tracking Inconsistent
|
||||
|
||||
**Status**: Partially confirmed
|
||||
**Locations**:
|
||||
- `src/noteflow/grpc/_mixins/diarization/_jobs.py:130` - Tracked tasks
|
||||
- `src/noteflow/grpc/_mixins/sync.py:109` - Untracked sync tasks
|
||||
|
||||
**Impact**: Sync tasks not cancelled on shutdown, exceptions not observed.
|
||||
|
||||
**Solution**:
|
||||
1. Add shared `BackgroundTaskRegistry` in servicer
|
||||
2. Register sync tasks for cancellation
|
||||
3. Add done-callback for exception logging
|
||||
|
||||
---
|
||||
|
||||
### #5 Segmenter O(n) `pop(0)` in Hot Path
|
||||
|
||||
**Status**: Confirmed
|
||||
**Location**: `src/noteflow/infrastructure/asr/segmenter.py:233`
|
||||
|
||||
**Impact**: O(n) behavior under sustained audio streaming.
|
||||
|
||||
**Solution**:
|
||||
1. Replace `list` with `collections.deque`
|
||||
2. Use `popleft()` for O(1) removals
|
||||
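A sketch of the change (the segmenter's real attribute names and frame type will differ):

```python
from collections import deque


class PendingAudioBuffer:
    """Sketch: deque-backed FIFO so the segmenter's hot path stays O(1)."""

    def __init__(self) -> None:
        self._frames: deque[bytes] = deque()

    def push(self, frame: bytes) -> None:
        self._frames.append(frame)

    def pop_oldest(self) -> bytes | None:
        # Replaces list.pop(0), which shifts every remaining element on each call.
        return self._frames.popleft() if self._frames else None
```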
|
||||
---
|
||||
|
||||
### #6 ChunkedAssetReader Lacks Bounds Checks
|
||||
|
||||
**Status**: Partially confirmed
|
||||
**Location**: `src/noteflow/infrastructure/security/crypto.py:279`
|
||||
|
||||
**Impact**: No guard for `chunk_length < NONCE_SIZE + TAG_SIZE`, invalid slices possible.
|
||||
|
||||
**Solution**:
|
||||
1. Add `read_exact()` helper
|
||||
2. Validate `chunk_length >= NONCE_SIZE + TAG_SIZE`
|
||||
3. Treat partial length headers as errors
|
||||
4. Consider optional AAD for chunk index
|
||||
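A sketch of points 1–3, assuming chunks framed as `[4-byte big-endian length][nonce][ciphertext+tag]` (the actual framing and constants must be checked against `crypto.py`):

```python
import struct
from typing import BinaryIO

NONCE_SIZE = 12
TAG_SIZE = 16
MAX_CHUNK_LENGTH = 16 * 1024 * 1024  # sanity cap; tune to the real chunk size


def read_exact(stream: BinaryIO, size: int) -> bytes:
    """Read exactly `size` bytes or raise instead of returning a short read."""
    data = stream.read(size)
    if len(data) != size:
        raise ValueError(f"Truncated chunk: expected {size} bytes, got {len(data)}")
    return data


def read_chunk(stream: BinaryIO) -> tuple[bytes, bytes] | None:
    """Return (nonce, ciphertext_with_tag) for the next chunk, or None at clean EOF."""
    header = stream.read(4)
    if not header:
        return None  # clean end of file
    if len(header) != 4:
        raise ValueError("Truncated chunk length header")
    (chunk_length,) = struct.unpack(">I", header)
    if chunk_length < NONCE_SIZE + TAG_SIZE or chunk_length > MAX_CHUNK_LENGTH:
        raise ValueError(f"Invalid chunk length: {chunk_length}")
    body = read_exact(stream, chunk_length)
    return body[:NONCE_SIZE], body[NONCE_SIZE:]
```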
|
||||
---
|
||||
|
||||
### #7 gRPC Size Limits in Multiple Places
|
||||
|
||||
**Status**: Confirmed
|
||||
**Locations**:
|
||||
- `src/noteflow/grpc/service.py:86` - `MAX_CHUNK_SIZE = 1MB`
|
||||
- `src/noteflow/config/constants.py:27` - `MAX_GRPC_MESSAGE_SIZE = 100MB`
|
||||
- `src/noteflow/grpc/server.py:158` - Hardcoded in options
|
||||
|
||||
**Impact**: Multiple sources of truth can drift.
|
||||
|
||||
**Solution**:
|
||||
1. Move to `Settings` class
|
||||
2. Use consistently in `server.py` and `service.py`
|
||||
3. Enforce chunk size in streaming handlers
|
||||
4. Surface in `ServerInfo`
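
A sketch of steps 1-2 in the pydantic-settings style already used for `Settings`; the field names, defaults, and standalone class are assumptions for illustration:

```python
# Illustrative centralization of the limits; names and defaults are assumptions.
from typing import Annotated

from pydantic import Field
from pydantic_settings import BaseSettings


class GrpcLimits(BaseSettings):
    max_grpc_message_size: Annotated[
        int,
        Field(default=100 * 1024 * 1024, description="Max gRPC message size in bytes"),
    ]
    max_chunk_size: Annotated[
        int,
        Field(default=1 * 1024 * 1024, description="Max audio chunk size in bytes"),
    ]


def server_options(limits: GrpcLimits) -> list[tuple[str, int]]:
    """Build grpc.aio.server options from a single source of truth."""
    return [
        ("grpc.max_send_message_length", limits.max_grpc_message_size),
        ("grpc.max_receive_message_length", limits.max_grpc_message_size),
    ]
```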

---

### #8 Outlook Adapter Lacks Timeouts/Pagination

**Status**: Confirmed
**Location**: `src/noteflow/infrastructure/calendar/outlook_adapter.py:81`

**Impact**: No timeouts, no pagination via `@odata.nextLink`, unbounded error logging.

**Solution**:
1. Configure `httpx.AsyncClient(timeout=..., limits=...)`
2. Implement pagination with `@odata.nextLink`
3. Truncate error bodies before logging
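
A sketch of steps 1-2; the Graph endpoint, response fields, and limit values are assumptions, not the adapter's real interface:

```python
# Illustrative pagination with explicit timeouts; endpoint and limits are assumptions.
import httpx

_TIMEOUT = httpx.Timeout(10.0, connect=5.0)
_LIMITS = httpx.Limits(max_connections=10, max_keepalive_connections=5)


async def fetch_all_events(base_url: str, headers: dict[str, str]) -> list[dict[str, object]]:
    """Follow @odata.nextLink pages until the listing is exhausted."""
    events: list[dict[str, object]] = []
    url: str | None = f"{base_url}/me/events"
    async with httpx.AsyncClient(timeout=_TIMEOUT, limits=_LIMITS) as client:
        while url:
            response = await client.get(url, headers=headers)
            response.raise_for_status()
            payload = response.json()
            events.extend(payload.get("value", []))
            url = payload.get("@odata.nextLink")  # absent on the last page
    return events
```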

---

### #9 Webhook Delivery ID Not Recorded

**Status**: Partially confirmed
**Locations**:
- `src/noteflow/infrastructure/webhooks/executor.py:255` - ID generated
- `src/noteflow/infrastructure/webhooks/executor.py:306` - Different ID in record
- `src/noteflow/infrastructure/webhooks/executor.py:103` - No client limits

**Impact**: The delivery ID sent to recipients is not the ID stored in the delivery record, so deliveries cannot be correlated.

**Solution**:
1. Reuse `delivery_id` as `WebhookDelivery.id`
2. Add `httpx.Limits` configuration
3. Include `delivery_id` in logs
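
A sketch of reusing one ID end to end; the header name and the persistence callback are illustrative, not the executor's real API:

```python
# Illustrative end-to-end delivery ID; header name and `save` callback are assumptions.
from uuid import uuid4

import httpx


async def deliver(url: str, payload: dict[str, object], client: httpx.AsyncClient, save) -> None:
    delivery_id = uuid4()
    response = await client.post(
        url,
        json=payload,
        headers={"X-Delivery-Id": str(delivery_id)},  # what the recipient sees
    )
    # Persist the same ID so the stored record correlates with the request.
    await save(delivery_id=delivery_id, status_code=response.status_code)
```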

---

### #10 OTEL Exporter Uses `insecure=True`

**Status**: Confirmed
**Location**: `src/noteflow/infrastructure/observability/otel.py:99`

**Impact**: TLS disabled unconditionally, even in production.

**Solution**:
1. Add `NOTEFLOW_OTEL_INSECURE` setting
2. Infer from endpoint scheme (`http://` vs `https://`)
3. Default to secure
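
A sketch of the inference in steps 1-3; the argument name mirrors the proposed setting and the exporter wiring is omitted:

```python
# Illustrative TLS-mode decision; exporter construction is out of scope here.
def resolve_insecure(endpoint: str, otel_insecure: bool | None) -> bool:
    """Decide whether to disable TLS for the OTLP exporter."""
    if otel_insecure is not None:
        return otel_insecure               # explicit setting always wins
    return endpoint.startswith("http://")  # otherwise infer from scheme; https stays secure
```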

---

### #11 Stringly-Typed Statuses

**Status**: Confirmed
**Locations**:
- `src/noteflow/grpc/proto/noteflow.proto:1191` - `string status` for sync
- `src/noteflow/grpc/proto/noteflow.proto:856` - `string status` for OAuth

**Impact**: Clients must match raw string literals, which risks typos.

**Solution**:
1. Add `SyncRunStatus` enum
2. Add `OAuthConnectionStatus` enum
3. Migrate via new fields or v2 messages
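
A Python-side illustration only; the member values are guesses, and the real fix defines the enum in the proto so every client shares it:

```python
# Illustrative status enum; member values are assumptions.
from enum import Enum


class SyncRunStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"

    @classmethod
    def from_wire(cls, value: str) -> "SyncRunStatus":
        """Parse the legacy string field, failing loudly on unknown values."""
        try:
            return cls(value)
        except ValueError as exc:
            raise ValueError(f"Unknown sync status: {value!r}") from exc
```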

---

### #12 Test Targets for High-Risk Changes

**Status**: Recommendation
**Existing Coverage**:
- `tests/stress/test_segmenter_fuzz.py`
- `tests/stress/test_audio_integrity.py`

**Suggested Additions**:
1. gRPC proto-level test for patch semantics on `UpdateAnnotation`
2. Sync task lifecycle test for shutdown cancellation
3. Outlook adapter test for `@odata.nextLink` pagination

---

## Success Criteria

- [ ] All P0 findings resolved
- [ ] All P1 findings resolved or have v2 migration path
- [ ] All P2 findings resolved
- [ ] Test coverage for high-risk changes
- [ ] Zero new quality threshold violations
- [ ] All type checks pass (`basedpyright`)
- [ ] Documentation updated

---

## Dependencies

- Proto regeneration affects Rust/TS clients
- Backward compatibility required for existing API consumers
- Feature flags for v2 migrations where applicable

---

## Risk Assessment

| Risk | Mitigation |
|------|------------|
| Proto changes break clients | Deprecate + add new fields (no removal) |
| Performance regression from deque | Benchmark before/after |
| OTEL secure default breaks dev | Make configurable with sane defaults |
| Task registry overhead | Lightweight set-based tracking |
docs/sprints/sprint_spec_validation_fixes_PLAN.md: new file, 1417 lines (diff suppressed because it is too large)
@@ -29,6 +29,8 @@ dependencies = [
    "authlib>=1.6.6",
    "rich>=14.2.0",
    "types-psutil>=7.2.0.20251228",
    # Structured logging
    "structlog>=24.0",
]

[project.optional-dependencies]
@@ -203,6 +205,7 @@ disable_error_code = ["import-untyped"]
[tool.basedpyright]
pythonVersion = "3.12"
typeCheckingMode = "standard"
extraPaths = ["scripts"]
reportMissingTypeStubs = false
reportUnknownMemberType = false
reportUnknownArgumentType = false
@@ -12,9 +12,9 @@
|
||||
"files": true,
|
||||
"removeComments": true,
|
||||
"removeEmptyLines": true,
|
||||
"compress": true,
|
||||
"compress": false,
|
||||
"topFilesLength": 5,
|
||||
"showLineNumbers": false,
|
||||
"showLineNumbers": true,
|
||||
"truncateBase64": false,
|
||||
"copyToClipboard": false,
|
||||
"tokenCountTree": false,
|
||||
@@ -26,11 +26,67 @@
|
||||
"includeLogsCount": 50
|
||||
}
|
||||
},
|
||||
"include": ["src/"],
|
||||
"include": [
|
||||
"tests/quality"
|
||||
],
|
||||
"ignore": {
|
||||
"useGitignore": true,
|
||||
"useDefaultPatterns": true,
|
||||
"customPatterns": []
|
||||
"customPatterns": [
|
||||
"**/*_pb2.py",
|
||||
"**/*_pb2_grpc.py",
|
||||
"**/*.pb2.py",
|
||||
"**/*.pb2_grpc.py",
|
||||
"**/*.pyi",
|
||||
"**/noteflow.rs",
|
||||
"**/noteflow_pb2.py",
|
||||
"src/noteflow_pb2.py",
|
||||
"client/src-tauri/src/grpc/noteflow.rs",
|
||||
"src/noteflow/grpc/proto/noteflow_pb2.py",
|
||||
"src/noteflow/grpc/proto/noteflow_pb2_grpc.py",
|
||||
"src/noteflow/grpc/proto/noteflow_pb2.pyi",
|
||||
"**/node_modules/**",
|
||||
"**/target/**",
|
||||
"**/gen/**",
|
||||
"**/__pycache__/**",
|
||||
"**/*.pyc",
|
||||
"**/.pytest_cache/**",
|
||||
"**/.mypy_cache/**",
|
||||
"**/.ruff_cache/**",
|
||||
"**/dist/**",
|
||||
"**/build/**",
|
||||
"**/.vite/**",
|
||||
"**/coverage/**",
|
||||
"**/htmlcov/**",
|
||||
"**/playwright-report/**",
|
||||
"**/test-results/**",
|
||||
"uv.lock",
|
||||
"**/Cargo.lock",
|
||||
"**/package-lock.json",
|
||||
"**/bun.lockb",
|
||||
"**/yarn.lock",
|
||||
"**/*.lock",
|
||||
"**/*.lockb",
|
||||
"**/*.png",
|
||||
"**/*.jpg",
|
||||
"**/*.jpeg",
|
||||
"**/*.gif",
|
||||
"**/*.ico",
|
||||
"**/*.svg",
|
||||
"**/*.icns",
|
||||
"**/*.webp",
|
||||
"**/*.xml",
|
||||
"**/icons/**",
|
||||
"**/public/**",
|
||||
"client/app-icon.png",
|
||||
"**/*.md",
|
||||
".benchmarks/**",
|
||||
"noteflow-api-spec.json",
|
||||
"scratch.md",
|
||||
"repomix-output.md",
|
||||
"**/logs/**",
|
||||
"**/status_line.json"
|
||||
]
|
||||
},
|
||||
"security": {
|
||||
"enableSecurityCheck": true
|
||||
|
||||
@@ -6,7 +6,6 @@ Uses existing Integration entity and IntegrationRepository for persistence.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
@@ -22,6 +21,7 @@ from noteflow.infrastructure.calendar import (
|
||||
from noteflow.infrastructure.calendar.google_adapter import GoogleCalendarError
|
||||
from noteflow.infrastructure.calendar.oauth_manager import OAuthError
|
||||
from noteflow.infrastructure.calendar.outlook_adapter import OutlookCalendarError
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable
|
||||
@@ -29,7 +29,7 @@ if TYPE_CHECKING:
|
||||
from noteflow.config.settings import CalendarIntegrationSettings
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class CalendarServiceError(Exception):
|
||||
|
||||
@@ -15,12 +15,15 @@ from noteflow.infrastructure.export import (
|
||||
PdfExporter,
|
||||
TranscriptExporter,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.entities import Meeting, Segment
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
from noteflow.domain.value_objects import MeetingId
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class ExportFormat(Enum):
|
||||
"""Supported export formats."""
|
||||
@@ -83,17 +86,43 @@ class ExportService:
|
||||
Raises:
|
||||
ValueError: If meeting not found.
|
||||
"""
|
||||
logger.info(
|
||||
"Starting transcript export",
|
||||
meeting_id=str(meeting_id),
|
||||
format=fmt.value,
|
||||
)
|
||||
async with self._uow:
|
||||
found_meeting = await self._uow.meetings.get(meeting_id)
|
||||
if not found_meeting:
|
||||
from noteflow.config.constants import ERROR_MSG_MEETING_PREFIX
|
||||
|
||||
msg = f"{ERROR_MSG_MEETING_PREFIX}{meeting_id} not found"
|
||||
logger.warning(
|
||||
"Export failed: meeting not found",
|
||||
meeting_id=str(meeting_id),
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
segments = await self._uow.segments.get_by_meeting(meeting_id)
|
||||
segment_count = len(segments)
|
||||
logger.debug(
|
||||
"Retrieved segments for export",
|
||||
meeting_id=str(meeting_id),
|
||||
segment_count=segment_count,
|
||||
)
|
||||
|
||||
exporter = self._get_exporter(fmt)
|
||||
return exporter.export(found_meeting, segments)
|
||||
result = exporter.export(found_meeting, segments)
|
||||
|
||||
content_size = len(result) if isinstance(result, bytes) else len(result.encode("utf-8"))
|
||||
logger.info(
|
||||
"Transcript export completed",
|
||||
meeting_id=str(meeting_id),
|
||||
format=fmt.value,
|
||||
segment_count=segment_count,
|
||||
content_size_bytes=content_size,
|
||||
)
|
||||
return result
|
||||
|
||||
async def export_to_file(
|
||||
self,
|
||||
@@ -114,22 +143,60 @@ class ExportService:
|
||||
Raises:
|
||||
ValueError: If meeting not found or format cannot be determined.
|
||||
"""
|
||||
logger.info(
|
||||
"Starting file export",
|
||||
meeting_id=str(meeting_id),
|
||||
output_path=str(output_path),
|
||||
format=fmt.value if fmt else "inferred",
|
||||
)
|
||||
|
||||
# Determine format from extension if not provided
|
||||
if fmt is None:
|
||||
fmt = self._infer_format_from_extension(output_path.suffix)
|
||||
logger.debug(
|
||||
"Format inferred from extension",
|
||||
extension=output_path.suffix,
|
||||
inferred_format=fmt.value,
|
||||
)
|
||||
|
||||
content = await self.export_transcript(meeting_id, fmt)
|
||||
|
||||
# Ensure correct extension
|
||||
exporter = self._get_exporter(fmt)
|
||||
original_path = output_path
|
||||
if output_path.suffix != exporter.file_extension:
|
||||
output_path = output_path.with_suffix(exporter.file_extension)
|
||||
logger.debug(
|
||||
"Adjusted file extension",
|
||||
original_path=str(original_path),
|
||||
adjusted_path=str(output_path),
|
||||
expected_extension=exporter.file_extension,
|
||||
)
|
||||
|
||||
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
if isinstance(content, bytes):
|
||||
output_path.write_bytes(content)
|
||||
else:
|
||||
output_path.write_text(content, encoding="utf-8")
|
||||
try:
|
||||
if isinstance(content, bytes):
|
||||
output_path.write_bytes(content)
|
||||
else:
|
||||
output_path.write_text(content, encoding="utf-8")
|
||||
|
||||
file_size = output_path.stat().st_size
|
||||
logger.info(
|
||||
"File export completed",
|
||||
meeting_id=str(meeting_id),
|
||||
output_path=str(output_path),
|
||||
format=fmt.value,
|
||||
file_size_bytes=file_size,
|
||||
)
|
||||
except OSError as exc:
|
||||
logger.error(
|
||||
"File write failed",
|
||||
meeting_id=str(meeting_id),
|
||||
output_path=str(output_path),
|
||||
error=str(exc),
|
||||
)
|
||||
raise
|
||||
|
||||
return output_path
|
||||
|
||||
def _infer_format_from_extension(self, extension: str) -> ExportFormat:
|
||||
@@ -153,12 +220,23 @@ class ExportService:
|
||||
".htm": ExportFormat.HTML,
|
||||
EXPORT_EXT_PDF: ExportFormat.PDF,
|
||||
}
|
||||
fmt = extension_map.get(extension.lower())
|
||||
normalized_ext = extension.lower()
|
||||
fmt = extension_map.get(normalized_ext)
|
||||
if fmt is None:
|
||||
logger.warning(
|
||||
"Unrecognized file extension for format inference",
|
||||
extension=extension,
|
||||
supported_extensions=list(extension_map.keys()),
|
||||
)
|
||||
raise ValueError(
|
||||
f"Cannot infer format from extension '{extension}'. "
|
||||
f"Supported: {', '.join(extension_map.keys())}"
|
||||
)
|
||||
logger.debug(
|
||||
"Format inference successful",
|
||||
extension=normalized_ext,
|
||||
inferred_format=fmt.value,
|
||||
)
|
||||
return fmt
|
||||
|
||||
def get_supported_formats(self) -> list[tuple[str, str]]:
|
||||
|
||||
@@ -7,7 +7,6 @@ Following hexagonal architecture:
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID, uuid4
|
||||
|
||||
@@ -23,6 +22,7 @@ from noteflow.domain.identity import (
|
||||
WorkspaceContext,
|
||||
WorkspaceRole,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.persistence.models import (
|
||||
DEFAULT_USER_ID,
|
||||
DEFAULT_WORKSPACE_ID,
|
||||
@@ -33,7 +33,7 @@ if TYPE_CHECKING:
|
||||
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class IdentityService:
|
||||
@@ -64,6 +64,7 @@ class IdentityService:
|
||||
"""
|
||||
if not uow.supports_users:
|
||||
# Return a synthetic context for memory mode
|
||||
logger.debug("Memory mode: returning synthetic default user context")
|
||||
return UserContext(
|
||||
user_id=UUID(DEFAULT_USER_ID),
|
||||
display_name=DEFAULT_USER_DISPLAY_NAME,
|
||||
@@ -71,6 +72,7 @@ class IdentityService:
|
||||
|
||||
user = await uow.users.get_default()
|
||||
if user:
|
||||
logger.debug("Found existing default user: %s", user.id)
|
||||
return UserContext(
|
||||
user_id=user.id,
|
||||
display_name=user.display_name,
|
||||
@@ -110,6 +112,7 @@ class IdentityService:
|
||||
"""
|
||||
if not uow.supports_workspaces:
|
||||
# Return a synthetic context for memory mode
|
||||
logger.debug("Memory mode: returning synthetic default workspace context")
|
||||
return WorkspaceContext(
|
||||
workspace_id=UUID(DEFAULT_WORKSPACE_ID),
|
||||
workspace_name=DEFAULT_WORKSPACE_NAME,
|
||||
@@ -118,6 +121,11 @@ class IdentityService:
|
||||
|
||||
workspace = await uow.workspaces.get_default_for_user(user_id)
|
||||
if workspace:
|
||||
logger.debug(
|
||||
"Found existing default workspace for user %s: %s",
|
||||
user_id,
|
||||
workspace.id,
|
||||
)
|
||||
membership = await uow.workspaces.get_membership(workspace.id, user_id)
|
||||
role = WorkspaceRole(membership.role.value) if membership else WorkspaceRole.OWNER
|
||||
return WorkspaceContext(
|
||||
@@ -169,10 +177,22 @@ class IdentityService:
|
||||
user = await self.get_or_create_default_user(uow)
|
||||
|
||||
if workspace_id:
|
||||
logger.info(
|
||||
"Resolving context for explicit workspace_id=%s, user_id=%s",
|
||||
workspace_id,
|
||||
user.user_id,
|
||||
)
|
||||
ws_context = await self._get_workspace_context(uow, workspace_id, user.user_id)
|
||||
else:
|
||||
logger.debug("No workspace_id provided, using default workspace")
|
||||
ws_context = await self.get_or_create_default_workspace(uow, user.user_id)
|
||||
|
||||
logger.debug(
|
||||
"Resolved operation context: user=%s, workspace=%s, request_id=%s",
|
||||
user.user_id,
|
||||
ws_context.workspace_id,
|
||||
request_id,
|
||||
)
|
||||
return OperationContext(
|
||||
user=user,
|
||||
workspace=ws_context,
|
||||
@@ -200,24 +220,38 @@ class IdentityService:
|
||||
PermissionError: If user not a member.
|
||||
"""
|
||||
if not uow.supports_workspaces:
|
||||
logger.debug("Memory mode: returning synthetic workspace context for %s", workspace_id)
|
||||
return WorkspaceContext(
|
||||
workspace_id=workspace_id,
|
||||
workspace_name=DEFAULT_WORKSPACE_NAME,
|
||||
role=WorkspaceRole.OWNER,
|
||||
)
|
||||
|
||||
logger.debug("Looking up workspace %s for user %s", workspace_id, user_id)
|
||||
workspace = await uow.workspaces.get(workspace_id)
|
||||
if not workspace:
|
||||
from noteflow.config.constants import ERROR_MSG_WORKSPACE_PREFIX
|
||||
|
||||
logger.warning("Workspace not found: %s", workspace_id)
|
||||
msg = f"{ERROR_MSG_WORKSPACE_PREFIX}{workspace_id} not found"
|
||||
raise ValueError(msg)
|
||||
|
||||
membership = await uow.workspaces.get_membership(workspace_id, user_id)
|
||||
if not membership:
|
||||
logger.warning(
|
||||
"Permission denied: user %s is not a member of workspace %s",
|
||||
user_id,
|
||||
workspace_id,
|
||||
)
|
||||
msg = f"User not a member of workspace {workspace_id}"
|
||||
raise PermissionError(msg)
|
||||
|
||||
logger.debug(
|
||||
"Workspace access granted: user=%s, workspace=%s, role=%s",
|
||||
user_id,
|
||||
workspace_id,
|
||||
membership.role,
|
||||
)
|
||||
return WorkspaceContext(
|
||||
workspace_id=workspace.id,
|
||||
workspace_name=workspace.name,
|
||||
@@ -243,9 +277,18 @@ class IdentityService:
|
||||
List of workspaces.
|
||||
"""
|
||||
if not uow.supports_workspaces:
|
||||
logger.debug("Memory mode: returning empty workspace list")
|
||||
return []
|
||||
|
||||
return await uow.workspaces.list_for_user(user_id, limit, offset)
|
||||
workspaces = await uow.workspaces.list_for_user(user_id, limit, offset)
|
||||
logger.debug(
|
||||
"Listed %d workspaces for user %s (limit=%d, offset=%d)",
|
||||
len(workspaces),
|
||||
user_id,
|
||||
limit,
|
||||
offset,
|
||||
)
|
||||
return workspaces
|
||||
|
||||
async def create_workspace(
|
||||
self,
|
||||
@@ -316,9 +359,15 @@ class IdentityService:
|
||||
User if found, None otherwise.
|
||||
"""
|
||||
if not uow.supports_users:
|
||||
logger.debug("Memory mode: users not supported, returning None")
|
||||
return None
|
||||
|
||||
return await uow.users.get(user_id)
|
||||
user = await uow.users.get(user_id)
|
||||
if user:
|
||||
logger.debug("Found user: %s", user_id)
|
||||
else:
|
||||
logger.debug("User not found: %s", user_id)
|
||||
return user
|
||||
|
||||
async def update_user_profile(
|
||||
self,
|
||||
@@ -347,15 +396,27 @@ class IdentityService:
|
||||
|
||||
user = await uow.users.get(user_id)
|
||||
if not user:
|
||||
logger.warning("User not found for profile update: %s", user_id)
|
||||
return None
|
||||
|
||||
updated_fields: list[str] = []
|
||||
if display_name:
|
||||
user.display_name = display_name
|
||||
updated_fields.append("display_name")
|
||||
if email is not None:
|
||||
user.email = email
|
||||
updated_fields.append("email")
|
||||
|
||||
if not updated_fields:
|
||||
logger.debug("No fields to update for user %s", user_id)
|
||||
return user
|
||||
|
||||
updated = await uow.users.update(user)
|
||||
await uow.commit()
|
||||
|
||||
logger.info("Updated user profile: %s", user_id)
|
||||
logger.info(
|
||||
"Updated user profile: user_id=%s, fields=%s",
|
||||
user_id,
|
||||
", ".join(updated_fields),
|
||||
)
|
||||
return updated
|
||||
|
||||
@@ -5,7 +5,6 @@ Orchestrates meeting-related use cases with persistence.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import Sequence
|
||||
from datetime import UTC, datetime
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -20,6 +19,7 @@ from noteflow.domain.entities import (
|
||||
WordTiming,
|
||||
)
|
||||
from noteflow.domain.value_objects import AnnotationId, AnnotationType
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence as SequenceType
|
||||
@@ -27,7 +27,7 @@ if TYPE_CHECKING:
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
from noteflow.domain.value_objects import MeetingId, MeetingState
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class MeetingService:
|
||||
@@ -64,6 +64,7 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
saved = await self._uow.meetings.create(meeting)
|
||||
await self._uow.commit()
|
||||
logger.info("Created meeting", meeting_id=str(saved.id), title=title, state=saved.state.value)
|
||||
return saved
|
||||
|
||||
async def get_meeting(self, meeting_id: MeetingId) -> Meeting | None:
|
||||
@@ -76,7 +77,12 @@ class MeetingService:
|
||||
Meeting if found, None otherwise.
|
||||
"""
|
||||
async with self._uow:
|
||||
return await self._uow.meetings.get(meeting_id)
|
||||
meeting = await self._uow.meetings.get(meeting_id)
|
||||
if meeting is None:
|
||||
logger.debug("Meeting not found", meeting_id=str(meeting_id))
|
||||
else:
|
||||
logger.debug("Retrieved meeting", meeting_id=str(meeting_id), state=meeting.state.value)
|
||||
return meeting
|
||||
|
||||
async def list_meetings(
|
||||
self,
|
||||
@@ -97,12 +103,14 @@ class MeetingService:
|
||||
Tuple of (meeting sequence, total matching count).
|
||||
"""
|
||||
async with self._uow:
|
||||
return await self._uow.meetings.list_all(
|
||||
meetings, total = await self._uow.meetings.list_all(
|
||||
states=states,
|
||||
limit=limit,
|
||||
offset=offset,
|
||||
sort_desc=sort_desc,
|
||||
)
|
||||
logger.debug("Listed meetings", count=len(meetings), total=total, limit=limit, offset=offset)
|
||||
return meetings, total
|
||||
|
||||
async def start_recording(self, meeting_id: MeetingId) -> Meeting | None:
|
||||
"""Start recording a meeting.
|
||||
@@ -116,11 +124,14 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
meeting = await self._uow.meetings.get(meeting_id)
|
||||
if meeting is None:
|
||||
logger.warning("Cannot start recording: meeting not found", meeting_id=str(meeting_id))
|
||||
return None
|
||||
|
||||
previous_state = meeting.state.value
|
||||
meeting.start_recording()
|
||||
await self._uow.meetings.update(meeting)
|
||||
await self._uow.commit()
|
||||
logger.info("Started recording", meeting_id=str(meeting_id), from_state=previous_state, to_state=meeting.state.value)
|
||||
return meeting
|
||||
|
||||
async def stop_meeting(self, meeting_id: MeetingId) -> Meeting | None:
|
||||
@@ -137,13 +148,15 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
meeting = await self._uow.meetings.get(meeting_id)
|
||||
if meeting is None:
|
||||
logger.warning("Cannot stop meeting: not found", meeting_id=str(meeting_id))
|
||||
return None
|
||||
|
||||
# Graceful shutdown: RECORDING -> STOPPING -> STOPPED
|
||||
meeting.begin_stopping()
|
||||
previous_state = meeting.state.value
|
||||
meeting.begin_stopping() # RECORDING -> STOPPING -> STOPPED
|
||||
meeting.stop_recording()
|
||||
await self._uow.meetings.update(meeting)
|
||||
await self._uow.commit()
|
||||
logger.info("Stopped meeting", meeting_id=str(meeting_id), from_state=previous_state, to_state=meeting.state.value)
|
||||
return meeting
|
||||
|
||||
async def complete_meeting(self, meeting_id: MeetingId) -> Meeting | None:
|
||||
@@ -158,11 +171,14 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
meeting = await self._uow.meetings.get(meeting_id)
|
||||
if meeting is None:
|
||||
logger.warning("Cannot complete meeting: not found", meeting_id=str(meeting_id))
|
||||
return None
|
||||
|
||||
previous_state = meeting.state.value
|
||||
meeting.complete()
|
||||
await self._uow.meetings.update(meeting)
|
||||
await self._uow.commit()
|
||||
logger.info("Completed meeting", meeting_id=str(meeting_id), from_state=previous_state, to_state=meeting.state.value)
|
||||
return meeting
|
||||
|
||||
async def delete_meeting(self, meeting_id: MeetingId) -> bool:
|
||||
@@ -181,16 +197,14 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
meeting = await self._uow.meetings.get(meeting_id)
|
||||
if meeting is None:
|
||||
logger.warning("Cannot delete meeting: not found", meeting_id=str(meeting_id))
|
||||
return False
|
||||
|
||||
# Delete filesystem assets (use stored asset_path if different from meeting_id)
|
||||
await self._uow.assets.delete_meeting_assets(meeting_id, meeting.asset_path)
|
||||
|
||||
# Delete DB record (cascade handles children)
|
||||
success = await self._uow.meetings.delete(meeting_id)
|
||||
if success:
|
||||
await self._uow.commit()
|
||||
logger.info("Deleted meeting %s", meeting_id)
|
||||
logger.info("Deleted meeting", meeting_id=str(meeting_id), title=meeting.title)
|
||||
|
||||
return success
|
||||
|
||||
@@ -240,25 +254,15 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
saved = await self._uow.segments.add(meeting_id, segment)
|
||||
await self._uow.commit()
|
||||
logger.debug("Added segment", meeting_id=str(meeting_id), segment_id=segment_id, start=start_time, end=end_time)
|
||||
return saved
|
||||
|
||||
async def add_segments_batch(
|
||||
self,
|
||||
meeting_id: MeetingId,
|
||||
segments: Sequence[Segment],
|
||||
) -> Sequence[Segment]:
|
||||
"""Add multiple segments in batch.
|
||||
|
||||
Args:
|
||||
meeting_id: Meeting identifier.
|
||||
segments: Segments to add.
|
||||
|
||||
Returns:
|
||||
Added segments.
|
||||
"""
|
||||
async def add_segments_batch(self, meeting_id: MeetingId, segments: Sequence[Segment]) -> Sequence[Segment]:
|
||||
"""Add multiple segments in batch."""
|
||||
async with self._uow:
|
||||
saved = await self._uow.segments.add_batch(meeting_id, segments)
|
||||
await self._uow.commit()
|
||||
logger.debug("Added segments batch", meeting_id=str(meeting_id), count=len(segments))
|
||||
return saved
|
||||
|
||||
async def get_segments(
|
||||
@@ -339,19 +343,18 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
saved = await self._uow.summaries.save(summary)
|
||||
await self._uow.commit()
|
||||
logger.info("Saved summary", meeting_id=str(meeting_id), provider=provider_name or "unknown", model=model_name or "unknown")
|
||||
return saved
|
||||
|
||||
async def get_summary(self, meeting_id: MeetingId) -> Summary | None:
|
||||
"""Get summary for a meeting.
|
||||
|
||||
Args:
|
||||
meeting_id: Meeting identifier.
|
||||
|
||||
Returns:
|
||||
Summary if exists, None otherwise.
|
||||
"""
|
||||
"""Get summary for a meeting."""
|
||||
async with self._uow:
|
||||
return await self._uow.summaries.get_by_meeting(meeting_id)
|
||||
summary = await self._uow.summaries.get_by_meeting(meeting_id)
|
||||
if summary is None:
|
||||
logger.debug("Summary not found", meeting_id=str(meeting_id))
|
||||
else:
|
||||
logger.debug("Retrieved summary", meeting_id=str(meeting_id), provider=summary.provider_name or "unknown")
|
||||
return summary
|
||||
|
||||
# Annotation methods
|
||||
|
||||
@@ -392,6 +395,14 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
saved = await self._uow.annotations.add(annotation)
|
||||
await self._uow.commit()
|
||||
logger.info(
|
||||
"Added annotation",
|
||||
meeting_id=str(meeting_id),
|
||||
annotation_id=str(annotation.id),
|
||||
annotation_type=annotation_type.value,
|
||||
start_time=start_time,
|
||||
end_time=end_time,
|
||||
)
|
||||
return saved
|
||||
|
||||
async def get_annotation(self, annotation_id: AnnotationId) -> Annotation | None:
|
||||
@@ -455,6 +466,12 @@ class MeetingService:
|
||||
async with self._uow:
|
||||
updated = await self._uow.annotations.update(annotation)
|
||||
await self._uow.commit()
|
||||
logger.info(
|
||||
"Updated annotation",
|
||||
annotation_id=str(annotation.id),
|
||||
meeting_id=str(annotation.meeting_id),
|
||||
annotation_type=annotation.annotation_type.value,
|
||||
)
|
||||
return updated
|
||||
|
||||
async def delete_annotation(self, annotation_id: AnnotationId) -> bool:
|
||||
@@ -470,4 +487,10 @@ class MeetingService:
|
||||
success = await self._uow.annotations.delete(annotation_id)
|
||||
if success:
|
||||
await self._uow.commit()
|
||||
logger.info("Deleted annotation", annotation_id=str(annotation_id))
|
||||
else:
|
||||
logger.warning(
|
||||
"Cannot delete annotation: not found",
|
||||
annotation_id=str(annotation_id),
|
||||
)
|
||||
return success
|
||||
|
||||
@@ -7,12 +7,12 @@ Orchestrates NER extraction, caching, and persistence following hexagonal archit
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.config.settings import get_feature_flags
|
||||
from noteflow.domain.entities.named_entity import NamedEntity
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable, Sequence
|
||||
@@ -24,7 +24,7 @@ if TYPE_CHECKING:
|
||||
|
||||
UoWFactory = Callable[[], SqlAlchemyUnitOfWork]
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -4,7 +4,7 @@ from __future__ import annotations
|
||||
|
||||
from uuid import UUID
|
||||
|
||||
from noteflow.config.constants import ERROR_MSG_PROJECT_PREFIX
|
||||
from noteflow.config.constants import ERROR_MSG_PROJECT_PREFIX, ERROR_MSG_WORKSPACE_PREFIX
|
||||
from noteflow.domain.entities.project import Project
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
|
||||
@@ -43,7 +43,7 @@ class ActiveProjectMixin:
|
||||
|
||||
workspace = await uow.workspaces.get(workspace_id)
|
||||
if workspace is None:
|
||||
msg = f"Workspace {workspace_id} not found"
|
||||
msg = f"{ERROR_MSG_WORKSPACE_PREFIX}{workspace_id} not found"
|
||||
raise ValueError(msg)
|
||||
|
||||
if project_id is not None:
|
||||
@@ -92,7 +92,7 @@ class ActiveProjectMixin:
|
||||
|
||||
workspace = await uow.workspaces.get(workspace_id)
|
||||
if workspace is None:
|
||||
msg = f"Workspace {workspace_id} not found"
|
||||
msg = f"{ERROR_MSG_WORKSPACE_PREFIX}{workspace_id} not found"
|
||||
raise ValueError(msg)
|
||||
|
||||
active_project_id: UUID | None = None
|
||||
|
||||
@@ -2,17 +2,17 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID, uuid4
|
||||
|
||||
from noteflow.domain.entities.project import Project, ProjectSettings, slugify
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class ProjectCrudMixin:
|
||||
|
||||
@@ -2,17 +2,17 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
from noteflow.domain.identity import ProjectMembership, ProjectRole
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class ProjectMembershipMixin:
|
||||
|
||||
@@ -6,7 +6,6 @@ Optionally validate audio file integrity for crashed meetings.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from dataclasses import dataclass
|
||||
from datetime import UTC, datetime
|
||||
from pathlib import Path
|
||||
@@ -15,12 +14,13 @@ from typing import TYPE_CHECKING, ClassVar
|
||||
import sqlalchemy.exc
|
||||
|
||||
from noteflow.domain.value_objects import MeetingState
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.entities import Meeting
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
|
||||
@@ -2,17 +2,18 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import Callable
|
||||
from dataclasses import dataclass
|
||||
from datetime import UTC, datetime, timedelta
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.entities import Meeting
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
|
||||
@@ -5,7 +5,6 @@ Coordinate provider selection, consent handling, and citation verification.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -19,6 +18,7 @@ from noteflow.domain.summarization import (
|
||||
SummarizationRequest,
|
||||
SummarizationResult,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Awaitable, Callable, Sequence
|
||||
@@ -33,7 +33,7 @@ if TYPE_CHECKING:
|
||||
# Type alias for consent persistence callback
|
||||
ConsentPersistCallback = Callable[[bool], Awaitable[None]]
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class SummarizationMode(Enum):
|
||||
|
||||
@@ -5,17 +5,17 @@ Orchestrate trigger detection with rate limiting and snooze support.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import time
|
||||
from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.domain.triggers.entities import TriggerAction, TriggerDecision, TriggerSignal
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.triggers.ports import SignalProvider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.config.constants import DEFAULT_MEETING_TITLE
|
||||
@@ -14,13 +13,15 @@ from noteflow.domain.webhooks import (
|
||||
WebhookConfig,
|
||||
WebhookDelivery,
|
||||
WebhookEventType,
|
||||
payload_to_dict,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.webhooks import WebhookExecutor
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.entities.meeting import Meeting
|
||||
|
||||
_logger = logging.getLogger(__name__)
|
||||
_logger = get_logger(__name__)
|
||||
|
||||
|
||||
class WebhookService:
|
||||
@@ -95,7 +96,7 @@ class WebhookService:
|
||||
|
||||
return await self._deliver_to_all(
|
||||
WebhookEventType.MEETING_COMPLETED,
|
||||
payload.to_dict(),
|
||||
payload_to_dict(payload),
|
||||
)
|
||||
|
||||
async def trigger_summary_generated(
|
||||
@@ -123,7 +124,7 @@ class WebhookService:
|
||||
|
||||
return await self._deliver_to_all(
|
||||
WebhookEventType.SUMMARY_GENERATED,
|
||||
payload.to_dict(),
|
||||
payload_to_dict(payload),
|
||||
)
|
||||
|
||||
async def trigger_recording_started(
|
||||
@@ -149,7 +150,7 @@ class WebhookService:
|
||||
|
||||
return await self._deliver_to_all(
|
||||
WebhookEventType.RECORDING_STARTED,
|
||||
payload.to_dict(),
|
||||
payload_to_dict(payload),
|
||||
)
|
||||
|
||||
async def trigger_recording_stopped(
|
||||
@@ -178,7 +179,7 @@ class WebhookService:
|
||||
|
||||
return await self._deliver_to_all(
|
||||
WebhookEventType.RECORDING_STOPPED,
|
||||
payload.to_dict(),
|
||||
payload_to_dict(payload),
|
||||
)
|
||||
|
||||
async def _deliver_to_all(
|
||||
|
||||
@@ -9,12 +9,18 @@ import sys
|
||||
|
||||
from rich.console import Console
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
console = Console()
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def main() -> None:
|
||||
"""Dispatch to appropriate subcommand CLI."""
|
||||
logger.info("cli_invoked", argv=sys.argv)
|
||||
|
||||
if len(sys.argv) < 2:
|
||||
logger.debug("cli_no_command", message="No command provided, showing help")
|
||||
console.print("[bold]NoteFlow CLI[/bold]")
|
||||
console.print()
|
||||
console.print("Available commands:")
|
||||
@@ -32,19 +38,31 @@ def main() -> None:
|
||||
sys.exit(1)
|
||||
|
||||
command = sys.argv[1]
|
||||
subcommand_args = sys.argv[2:]
|
||||
|
||||
# Remove the command from argv so submodule parsers work correctly
|
||||
sys.argv = [sys.argv[0], *sys.argv[2:]]
|
||||
sys.argv = [sys.argv[0], *subcommand_args]
|
||||
|
||||
if command == "retention":
|
||||
from noteflow.cli.retention import main as retention_main
|
||||
logger.debug("cli_dispatch", command=command, subcommand_args=subcommand_args)
|
||||
try:
|
||||
from noteflow.cli.retention import main as retention_main
|
||||
|
||||
retention_main()
|
||||
retention_main()
|
||||
except Exception:
|
||||
logger.exception("cli_command_failed", command=command)
|
||||
raise
|
||||
elif command == "models":
|
||||
from noteflow.cli.models import main as models_main
|
||||
logger.debug("cli_dispatch", command=command, subcommand_args=subcommand_args)
|
||||
try:
|
||||
from noteflow.cli.models import main as models_main
|
||||
|
||||
models_main()
|
||||
models_main()
|
||||
except Exception:
|
||||
logger.exception("cli_command_failed", command=command)
|
||||
raise
|
||||
else:
|
||||
logger.warning("cli_unknown_command", command=command)
|
||||
console.print(f"[red]Unknown command:[/red] {command}")
|
||||
console.print("Available commands: retention, models")
|
||||
sys.exit(1)
|
||||
|
||||
@@ -7,7 +7,6 @@ Usage:
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import logging
|
||||
import subprocess
|
||||
import sys
|
||||
from dataclasses import dataclass, field
|
||||
@@ -15,12 +14,10 @@ from dataclasses import dataclass, field
|
||||
from rich.console import Console
|
||||
|
||||
from noteflow.config.constants import SPACY_MODEL_LG, SPACY_MODEL_SM
|
||||
from noteflow.infrastructure.logging import configure_logging, get_logger
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
|
||||
)
|
||||
logger = logging.getLogger(__name__)
|
||||
configure_logging()
|
||||
logger = get_logger(__name__)
|
||||
console = Console()
|
||||
|
||||
# Constants to avoid magic strings
|
||||
|
||||
@@ -7,20 +7,17 @@ Usage:
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import logging
|
||||
import sys
|
||||
|
||||
from rich.console import Console
|
||||
|
||||
from noteflow.application.services import RetentionService
|
||||
from noteflow.config.settings import get_settings
|
||||
from noteflow.infrastructure.logging import configure_logging, get_logger
|
||||
from noteflow.infrastructure.persistence.unit_of_work import create_uow_factory
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
|
||||
)
|
||||
logger = logging.getLogger(__name__)
|
||||
configure_logging()
|
||||
logger = get_logger(__name__)
|
||||
console = Console()
|
||||
|
||||
|
||||
|
||||
@@ -222,3 +222,28 @@ SCHEMA_TYPE_BOOLEAN: Final[str] = "boolean"
|
||||
|
||||
SCHEMA_TYPE_ARRAY_ITEMS: Final[str] = "items"
|
||||
"""JSON schema type name for array items."""
|
||||
|
||||
# Log event names - centralized to avoid repeated strings
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS: Final[str] = "database_required_for_annotations"
|
||||
"""Log event when annotations require database persistence."""
|
||||
|
||||
LOG_EVENT_ANNOTATION_NOT_FOUND: Final[str] = "annotation_not_found"
|
||||
"""Log event when annotation lookup fails."""
|
||||
|
||||
LOG_EVENT_INVALID_ANNOTATION_ID: Final[str] = "invalid_annotation_id"
|
||||
"""Log event when annotation ID is invalid."""
|
||||
|
||||
LOG_EVENT_SERVICE_NOT_ENABLED: Final[str] = "service_not_enabled"
|
||||
"""Log event when a service feature is not enabled."""
|
||||
|
||||
LOG_EVENT_WEBHOOK_REGISTRATION_FAILED: Final[str] = "webhook_registration_failed"
|
||||
"""Log event when webhook registration fails."""
|
||||
|
||||
LOG_EVENT_WEBHOOK_UPDATE_FAILED: Final[str] = "webhook_update_failed"
|
||||
"""Log event when webhook update fails."""
|
||||
|
||||
LOG_EVENT_WEBHOOK_DELETE_FAILED: Final[str] = "webhook_delete_failed"
|
||||
"""Log event when webhook deletion fails."""
|
||||
|
||||
LOG_EVENT_INVALID_WEBHOOK_ID: Final[str] = "invalid_webhook_id"
|
||||
"""Log event when webhook ID is invalid."""
|
||||
|
||||
@@ -485,6 +485,23 @@ class Settings(TriggerSettings):
|
||||
Field(default=120.0, ge=10.0, le=600.0, description="Timeout for Ollama requests"),
|
||||
]
|
||||
|
||||
# OpenTelemetry settings
|
||||
otel_endpoint: Annotated[
|
||||
str | None,
|
||||
Field(default=None, description="OTLP endpoint for telemetry export"),
|
||||
]
|
||||
otel_insecure: Annotated[
|
||||
bool | None,
|
||||
Field(
|
||||
default=None,
|
||||
description="Use insecure (non-TLS) connection. If None, inferred from endpoint scheme",
|
||||
),
|
||||
]
|
||||
otel_service_name: Annotated[
|
||||
str,
|
||||
Field(default="noteflow", description="Service name for OpenTelemetry resource"),
|
||||
]
|
||||
|
||||
@property
|
||||
def database_url_str(self) -> str:
|
||||
"""Return database URL as string."""
|
||||
|
||||
@@ -23,6 +23,8 @@ from .events import (
|
||||
WebhookDelivery,
|
||||
WebhookEventType,
|
||||
WebhookPayload,
|
||||
WebhookPayloadDict,
|
||||
payload_to_dict,
|
||||
)
|
||||
|
||||
__all__ = [
|
||||
@@ -48,4 +50,7 @@ __all__ = [
|
||||
"WebhookDelivery",
|
||||
"WebhookEventType",
|
||||
"WebhookPayload",
|
||||
"WebhookPayloadDict",
|
||||
# Helpers
|
||||
"payload_to_dict",
|
||||
]
|
||||
|
||||
@@ -6,10 +6,10 @@ infrastructure/persistence/models/integrations/webhook.py for seamless conversio
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass, field
|
||||
from dataclasses import asdict, dataclass, field
|
||||
from datetime import datetime
|
||||
from enum import Enum
|
||||
from typing import Any
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID, uuid4
|
||||
|
||||
from noteflow.domain.utils.time import utc_now
|
||||
@@ -18,6 +18,31 @@ from noteflow.domain.webhooks.constants import (
|
||||
DEFAULT_WEBHOOK_TIMEOUT_MS,
|
||||
)
|
||||
|
||||
# Type alias for JSON-serializable webhook payload values
|
||||
# Webhook payloads use flat structures with primitive types
|
||||
type WebhookPayloadValue = str | int | float | bool | None
|
||||
type WebhookPayloadDict = dict[str, WebhookPayloadValue]
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from typing import TypeVar
|
||||
|
||||
_PayloadT = TypeVar("_PayloadT", bound="WebhookPayload")
|
||||
|
||||
|
||||
def payload_to_dict(payload: WebhookPayload) -> WebhookPayloadDict:
|
||||
"""Convert webhook payload dataclass to typed dictionary.
|
||||
|
||||
Uses dataclasses.asdict() for conversion, filtering out None values
|
||||
to keep payloads compact.
|
||||
|
||||
Args:
|
||||
payload: Any WebhookPayload subclass instance.
|
||||
|
||||
Returns:
|
||||
Dictionary with non-None field values.
|
||||
"""
|
||||
return {k: v for k, v in asdict(payload).items() if v is not None}
|
||||
|
||||
|
||||
class WebhookEventType(Enum):
|
||||
"""Types of webhook trigger events."""
|
||||
@@ -134,7 +159,7 @@ class WebhookDelivery:
|
||||
id: UUID
|
||||
webhook_id: UUID
|
||||
event_type: WebhookEventType
|
||||
payload: dict[str, Any]
|
||||
payload: WebhookPayloadDict
|
||||
status_code: int | None
|
||||
response_body: str | None
|
||||
error_message: str | None
|
||||
@@ -147,7 +172,7 @@ class WebhookDelivery:
|
||||
cls,
|
||||
webhook_id: UUID,
|
||||
event_type: WebhookEventType,
|
||||
payload: dict[str, Any],
|
||||
payload: WebhookPayloadDict,
|
||||
*,
|
||||
status_code: int | None = None,
|
||||
response_body: str | None = None,
|
||||
@@ -197,6 +222,8 @@ class WebhookDelivery:
|
||||
class WebhookPayload:
|
||||
"""Base webhook event payload.
|
||||
|
||||
Use payload_to_dict() helper for JSON serialization.
|
||||
|
||||
Attributes:
|
||||
event: Event type identifier string.
|
||||
timestamp: ISO 8601 formatted event timestamp.
|
||||
@@ -207,18 +234,6 @@ class WebhookPayload:
|
||||
timestamp: str
|
||||
meeting_id: str
|
||||
|
||||
def to_dict(self) -> dict[str, Any]:
|
||||
"""Convert to dictionary for JSON serialization.
|
||||
|
||||
Returns:
|
||||
Dictionary representation of the payload.
|
||||
"""
|
||||
return {
|
||||
"event": self.event,
|
||||
"timestamp": self.timestamp,
|
||||
"meeting_id": self.meeting_id,
|
||||
}
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class MeetingCompletedPayload(WebhookPayload):
|
||||
@@ -236,21 +251,6 @@ class MeetingCompletedPayload(WebhookPayload):
|
||||
segment_count: int
|
||||
has_summary: bool
|
||||
|
||||
def to_dict(self) -> dict[str, Any]:
|
||||
"""Convert to dictionary for JSON serialization.
|
||||
|
||||
Returns:
|
||||
Dictionary representation including meeting details.
|
||||
"""
|
||||
base = WebhookPayload.to_dict(self)
|
||||
return {
|
||||
**base,
|
||||
"title": self.title,
|
||||
"duration_seconds": self.duration_seconds,
|
||||
"segment_count": self.segment_count,
|
||||
"has_summary": self.has_summary,
|
||||
}
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class SummaryGeneratedPayload(WebhookPayload):
|
||||
@@ -268,21 +268,6 @@ class SummaryGeneratedPayload(WebhookPayload):
|
||||
key_points_count: int
|
||||
action_items_count: int
|
||||
|
||||
def to_dict(self) -> dict[str, Any]:
|
||||
"""Convert to dictionary for JSON serialization.
|
||||
|
||||
Returns:
|
||||
Dictionary representation including summary details.
|
||||
"""
|
||||
base = WebhookPayload.to_dict(self)
|
||||
return {
|
||||
**base,
|
||||
"title": self.title,
|
||||
"executive_summary": self.executive_summary,
|
||||
"key_points_count": self.key_points_count,
|
||||
"action_items_count": self.action_items_count,
|
||||
}
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class RecordingPayload(WebhookPayload):
|
||||
@@ -295,14 +280,3 @@ class RecordingPayload(WebhookPayload):
|
||||
|
||||
title: str
|
||||
duration_seconds: float | None = None
|
||||
|
||||
def to_dict(self) -> dict[str, Any]:
|
||||
"""Convert to dictionary for JSON serialization.
|
||||
|
||||
Returns:
|
||||
Dictionary representation including recording details.
|
||||
"""
|
||||
result = {**WebhookPayload.to_dict(self), "title": self.title}
|
||||
if self.duration_seconds is not None:
|
||||
result["duration_seconds"] = self.duration_seconds
|
||||
return result
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc
|
||||
@@ -13,11 +12,12 @@ from noteflow.grpc._client_mixins.converters import (
|
||||
)
|
||||
from noteflow.grpc._types import AnnotationInfo
|
||||
from noteflow.grpc.proto import noteflow_pb2
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.grpc._client_mixins.protocols import ClientHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class AnnotationClientMixin:
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc
|
||||
@@ -10,11 +9,12 @@ import grpc
|
||||
from noteflow.grpc._client_mixins.converters import job_status_to_str
|
||||
from noteflow.grpc._types import DiarizationResult, RenameSpeakerResult
|
||||
from noteflow.grpc.proto import noteflow_pb2
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.grpc._client_mixins.protocols import ClientHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class DiarizationClientMixin:
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc
|
||||
@@ -10,11 +9,12 @@ import grpc
|
||||
from noteflow.grpc._client_mixins.converters import export_format_to_proto
|
||||
from noteflow.grpc._types import ExportResult
|
||||
from noteflow.grpc.proto import noteflow_pb2
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.grpc._client_mixins.protocols import ClientHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class ExportClientMixin:
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc
|
||||
@@ -10,11 +9,12 @@ import grpc
|
||||
from noteflow.grpc._client_mixins.converters import proto_to_meeting_info
|
||||
from noteflow.grpc._types import MeetingInfo, TranscriptSegment
|
||||
from noteflow.grpc.proto import noteflow_pb2
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.grpc._client_mixins.protocols import ClientHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class MeetingClientMixin:
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import queue
|
||||
import threading
|
||||
import time
|
||||
@@ -15,6 +14,7 @@ from noteflow.config.constants import DEFAULT_SAMPLE_RATE
|
||||
from noteflow.grpc._config import STREAMING_CONFIG
|
||||
from noteflow.grpc._types import ConnectionCallback, TranscriptCallback, TranscriptSegment
|
||||
from noteflow.grpc.proto import noteflow_pb2
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
@@ -22,7 +22,7 @@ if TYPE_CHECKING:
|
||||
|
||||
from noteflow.grpc._client_mixins.protocols import ClientHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class StreamingClientMixin:
|
||||
|
||||
@@ -6,13 +6,14 @@ These are pure functions that operate on audio data without state.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import struct
|
||||
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def resample_audio(
|
||||
|
||||
@@ -7,8 +7,14 @@ from uuid import uuid4
|
||||
|
||||
import grpc.aio
|
||||
|
||||
from noteflow.config.constants import (
|
||||
LOG_EVENT_ANNOTATION_NOT_FOUND,
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS,
|
||||
LOG_EVENT_INVALID_ANNOTATION_ID,
|
||||
)
|
||||
from noteflow.domain.entities import Annotation
|
||||
from noteflow.domain.value_objects import AnnotationId
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .converters import (
|
||||
@@ -22,6 +28,8 @@ from .errors import abort_database_required, abort_invalid_argument, abort_not_f
|
||||
if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Entity type names for error messages
|
||||
_ENTITY_ANNOTATION = "Annotation"
|
||||
_ENTITY_ANNOTATIONS = "Annotations"
|
||||
@@ -42,6 +50,10 @@ class AnnotationMixin:
|
||||
"""Add an annotation to a meeting."""
|
||||
async with self._create_repository_provider() as repo:
|
||||
if not repo.supports_annotations:
|
||||
logger.error(
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS,
|
||||
meeting_id=request.meeting_id,
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_ANNOTATIONS)
|
||||
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context)
|
||||
@@ -58,6 +70,14 @@ class AnnotationMixin:
|
||||
|
||||
saved = await repo.annotations.add(annotation)
|
||||
await repo.commit()
|
||||
logger.info(
|
||||
"annotation_added",
|
||||
annotation_id=str(saved.id),
|
||||
meeting_id=str(meeting_id),
|
||||
annotation_type=annotation_type.value,
|
||||
start_time=saved.start_time,
|
||||
end_time=saved.end_time,
|
||||
)
|
||||
return annotation_to_proto(saved)
|
||||
|
||||
async def GetAnnotation(
|
||||
@@ -68,16 +88,34 @@ class AnnotationMixin:
|
||||
"""Get an annotation by ID."""
|
||||
async with self._create_repository_provider() as repo:
|
||||
if not repo.supports_annotations:
|
||||
logger.error(
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_ANNOTATIONS)
|
||||
|
||||
try:
|
||||
annotation_id = parse_annotation_id(request.annotation_id)
|
||||
except ValueError:
|
||||
logger.error(
|
||||
LOG_EVENT_INVALID_ANNOTATION_ID,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_invalid_argument(context, "Invalid annotation_id")
|
||||
|
||||
annotation = await repo.annotations.get(annotation_id)
|
||||
if annotation is None:
|
||||
logger.error(
|
||||
LOG_EVENT_ANNOTATION_NOT_FOUND,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_not_found(context, _ENTITY_ANNOTATION, request.annotation_id)
|
||||
logger.debug(
|
||||
"annotation_retrieved",
|
||||
annotation_id=str(annotation_id),
|
||||
meeting_id=str(annotation.meeting_id),
|
||||
annotation_type=annotation.annotation_type.value,
|
||||
)
|
||||
return annotation_to_proto(annotation)
|
||||
|
||||
async def ListAnnotations(
|
||||
@@ -88,11 +126,16 @@ class AnnotationMixin:
|
||||
"""List annotations for a meeting."""
|
||||
async with self._create_repository_provider() as repo:
|
||||
if not repo.supports_annotations:
|
||||
logger.error(
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS,
|
||||
meeting_id=request.meeting_id,
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_ANNOTATIONS)
|
||||
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context)
|
||||
# Check if time range filter is specified
|
||||
if request.start_time > 0 or request.end_time > 0:
|
||||
has_time_filter = request.start_time > 0 or request.end_time > 0
|
||||
if has_time_filter:
|
||||
annotations = await repo.annotations.get_by_time_range(
|
||||
meeting_id,
|
||||
request.start_time,
|
||||
@@ -101,6 +144,14 @@ class AnnotationMixin:
|
||||
else:
|
||||
annotations = await repo.annotations.get_by_meeting(meeting_id)
|
||||
|
||||
logger.debug(
|
||||
"annotations_listed",
|
||||
meeting_id=str(meeting_id),
|
||||
count=len(annotations),
|
||||
has_time_filter=has_time_filter,
|
||||
start_time=request.start_time if has_time_filter else None,
|
||||
end_time=request.end_time if has_time_filter else None,
|
||||
)
|
||||
return noteflow_pb2.ListAnnotationsResponse(
|
||||
annotations=[annotation_to_proto(a) for a in annotations]
|
||||
)
|
||||
@@ -113,15 +164,27 @@ class AnnotationMixin:
|
||||
"""Update an existing annotation."""
|
||||
async with self._create_repository_provider() as repo:
|
||||
if not repo.supports_annotations:
|
||||
logger.error(
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_ANNOTATIONS)
|
||||
|
||||
try:
|
||||
annotation_id = parse_annotation_id(request.annotation_id)
|
||||
except ValueError:
|
||||
logger.error(
|
||||
LOG_EVENT_INVALID_ANNOTATION_ID,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_invalid_argument(context, "Invalid annotation_id")
|
||||
|
||||
annotation = await repo.annotations.get(annotation_id)
|
||||
if annotation is None:
|
||||
logger.error(
|
||||
LOG_EVENT_ANNOTATION_NOT_FOUND,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_not_found(context, _ENTITY_ANNOTATION, request.annotation_id)
|
||||
|
||||
# Update fields if provided
|
||||
@@ -138,6 +201,12 @@ class AnnotationMixin:
|
||||
|
||||
updated = await repo.annotations.update(annotation)
|
||||
await repo.commit()
|
||||
logger.info(
|
||||
"annotation_updated",
|
||||
annotation_id=str(annotation_id),
|
||||
meeting_id=str(updated.meeting_id),
|
||||
annotation_type=updated.annotation_type.value,
|
||||
)
|
||||
return annotation_to_proto(updated)
|
||||
|
||||
async def DeleteAnnotation(
|
||||
@@ -148,15 +217,31 @@ class AnnotationMixin:
|
||||
"""Delete an annotation."""
|
||||
async with self._create_repository_provider() as repo:
|
||||
if not repo.supports_annotations:
|
||||
logger.error(
|
||||
LOG_EVENT_DATABASE_REQUIRED_FOR_ANNOTATIONS,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_ANNOTATIONS)
|
||||
|
||||
try:
|
||||
annotation_id = parse_annotation_id(request.annotation_id)
|
||||
except ValueError:
|
||||
logger.error(
|
||||
LOG_EVENT_INVALID_ANNOTATION_ID,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_invalid_argument(context, "Invalid annotation_id")
|
||||
|
||||
success = await repo.annotations.delete(annotation_id)
|
||||
if success:
|
||||
await repo.commit()
|
||||
logger.info(
|
||||
"annotation_deleted",
|
||||
annotation_id=str(annotation_id),
|
||||
)
|
||||
return noteflow_pb2.DeleteAnnotationResponse(success=True)
|
||||
logger.error(
|
||||
LOG_EVENT_ANNOTATION_NOT_FOUND,
|
||||
annotation_id=request.annotation_id,
|
||||
)
|
||||
await abort_not_found(context, _ENTITY_ANNOTATION, request.annotation_id)
|
||||
|
||||
@@ -9,10 +9,13 @@ import grpc.aio
|
||||
from noteflow.application.services.calendar_service import CalendarServiceError
|
||||
from noteflow.domain.entities.integration import IntegrationStatus
|
||||
from noteflow.domain.value_objects import OAuthProvider
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .errors import abort_internal, abort_invalid_argument, abort_unavailable
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
_ERR_CALENDAR_NOT_ENABLED = "Calendar integration not enabled"
|
||||
|
||||
if TYPE_CHECKING:
|
||||
@@ -50,12 +53,20 @@ class CalendarMixin:
|
||||
) -> noteflow_pb2.ListCalendarEventsResponse:
|
||||
"""List upcoming calendar events from connected providers."""
|
||||
if self._calendar_service is None:
|
||||
logger.warning("calendar_list_events_unavailable", reason="service_not_enabled")
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
provider = request.provider if request.provider else None
|
||||
hours_ahead = request.hours_ahead if request.hours_ahead > 0 else None
|
||||
limit = request.limit if request.limit > 0 else None
|
||||
|
||||
logger.debug(
|
||||
"calendar_list_events_request",
|
||||
provider=provider,
|
||||
hours_ahead=hours_ahead,
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
try:
|
||||
events = await self._calendar_service.list_calendar_events(
|
||||
provider=provider,
|
||||
@@ -63,6 +74,7 @@ class CalendarMixin:
|
||||
limit=limit,
|
||||
)
|
||||
except CalendarServiceError as e:
|
||||
logger.error("calendar_list_events_failed", error=str(e), provider=provider)
|
||||
await abort_internal(context, str(e))
|
||||
|
||||
proto_events = [
|
||||
@@ -80,6 +92,12 @@ class CalendarMixin:
|
||||
for event in events
|
||||
]
|
||||
|
||||
logger.info(
|
||||
"calendar_list_events_success",
|
||||
provider=provider,
|
||||
event_count=len(proto_events),
|
||||
)
|
||||
|
||||
return noteflow_pb2.ListCalendarEventsResponse(
|
||||
events=proto_events,
|
||||
total_count=len(proto_events),
|
||||
@@ -92,21 +110,38 @@ class CalendarMixin:
|
||||
) -> noteflow_pb2.GetCalendarProvidersResponse:
|
||||
"""Get available calendar providers with authentication status."""
|
||||
if self._calendar_service is None:
|
||||
logger.warning("calendar_providers_unavailable", reason="service_not_enabled")
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
logger.debug("calendar_get_providers_request")
|
||||
|
||||
providers = []
|
||||
for provider_name, display_name in [
|
||||
(OAuthProvider.GOOGLE.value, "Google Calendar"),
|
||||
(OAuthProvider.OUTLOOK.value, "Microsoft Outlook"),
|
||||
]:
|
||||
status = await self._calendar_service.get_connection_status(provider_name)
|
||||
is_authenticated = status.status == IntegrationStatus.CONNECTED.value
|
||||
providers.append(
|
||||
noteflow_pb2.CalendarProvider(
|
||||
name=provider_name,
|
||||
is_authenticated=status.status == IntegrationStatus.CONNECTED.value,
|
||||
is_authenticated=is_authenticated,
|
||||
display_name=display_name,
|
||||
)
|
||||
)
|
||||
logger.debug(
|
||||
"calendar_provider_status",
|
||||
provider=provider_name,
|
||||
is_authenticated=is_authenticated,
|
||||
status=status.status,
|
||||
)
|
||||
|
||||
authenticated_count = sum(1 for p in providers if p.is_authenticated)
|
||||
logger.info(
|
||||
"calendar_get_providers_success",
|
||||
total_providers=len(providers),
|
||||
authenticated_count=authenticated_count,
|
||||
)
|
||||
|
||||
return noteflow_pb2.GetCalendarProvidersResponse(providers=providers)
|
||||
|
||||
@@ -117,16 +152,34 @@ class CalendarMixin:
|
||||
) -> noteflow_pb2.InitiateOAuthResponse:
|
||||
"""Start OAuth flow for a calendar provider."""
|
||||
if self._calendar_service is None:
|
||||
logger.warning("oauth_initiate_unavailable", reason="service_not_enabled")
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
logger.debug(
|
||||
"oauth_initiate_request",
|
||||
provider=request.provider,
|
||||
has_redirect_uri=bool(request.redirect_uri),
|
||||
)
|
||||
|
||||
try:
|
||||
auth_url, state = await self._calendar_service.initiate_oauth(
|
||||
provider=request.provider,
|
||||
redirect_uri=request.redirect_uri if request.redirect_uri else None,
|
||||
)
|
||||
except CalendarServiceError as e:
|
||||
logger.error(
|
||||
"oauth_initiate_failed",
|
||||
provider=request.provider,
|
||||
error=str(e),
|
||||
)
|
||||
await abort_invalid_argument(context, str(e))
|
||||
|
||||
logger.info(
|
||||
"oauth_initiate_success",
|
||||
provider=request.provider,
|
||||
state=state,
|
||||
)
|
||||
|
||||
return noteflow_pb2.InitiateOAuthResponse(
|
||||
auth_url=auth_url,
|
||||
state=state,
|
||||
@@ -139,8 +192,15 @@ class CalendarMixin:
|
||||
) -> noteflow_pb2.CompleteOAuthResponse:
|
||||
"""Complete OAuth flow with authorization code."""
|
||||
if self._calendar_service is None:
|
||||
logger.warning("oauth_complete_unavailable", reason="service_not_enabled")
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
logger.debug(
|
||||
"oauth_complete_request",
|
||||
provider=request.provider,
|
||||
state=request.state,
|
||||
)
|
||||
|
||||
try:
|
||||
success = await self._calendar_service.complete_oauth(
|
||||
provider=request.provider,
|
||||
@@ -148,6 +208,11 @@ class CalendarMixin:
|
||||
state=request.state,
|
||||
)
|
||||
except CalendarServiceError as e:
|
||||
logger.warning(
|
||||
"oauth_complete_failed",
|
||||
provider=request.provider,
|
||||
error=str(e),
|
||||
)
|
||||
return noteflow_pb2.CompleteOAuthResponse(
|
||||
success=False,
|
||||
error_message=str(e),
|
||||
@@ -156,6 +221,12 @@ class CalendarMixin:
|
||||
# Get the provider email after successful connection
|
||||
status = await self._calendar_service.get_connection_status(request.provider)
|
||||
|
||||
logger.info(
|
||||
"oauth_complete_success",
|
||||
provider=request.provider,
|
||||
email=status.email,
|
||||
)
|
||||
|
||||
return noteflow_pb2.CompleteOAuthResponse(
|
||||
success=success,
|
||||
provider_email=status.email or "",
|
||||
@@ -168,10 +239,25 @@ class CalendarMixin:
|
||||
) -> noteflow_pb2.GetOAuthConnectionStatusResponse:
|
||||
"""Get OAuth connection status for a provider."""
|
||||
if self._calendar_service is None:
|
||||
logger.warning("oauth_status_unavailable", reason="service_not_enabled")
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
logger.debug(
|
||||
"oauth_status_request",
|
||||
provider=request.provider,
|
||||
integration_type=request.integration_type or "calendar",
|
||||
)
|
||||
|
||||
info = await self._calendar_service.get_connection_status(request.provider)
|
||||
|
||||
logger.info(
|
||||
"oauth_status_retrieved",
|
||||
provider=request.provider,
|
||||
status=info.status,
|
||||
has_email=bool(info.email),
|
||||
has_error=bool(info.error_message),
|
||||
)
|
||||
|
||||
return noteflow_pb2.GetOAuthConnectionStatusResponse(
|
||||
connection=_build_oauth_connection(info, request.integration_type or "calendar")
|
||||
)
|
||||
@@ -183,8 +269,16 @@ class CalendarMixin:
|
||||
) -> noteflow_pb2.DisconnectOAuthResponse:
|
||||
"""Disconnect OAuth integration and revoke tokens."""
|
||||
if self._calendar_service is None:
|
||||
logger.warning("oauth_disconnect_unavailable", reason="service_not_enabled")
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
logger.debug("oauth_disconnect_request", provider=request.provider)
|
||||
|
||||
success = await self._calendar_service.disconnect(request.provider)
|
||||
|
||||
if success:
|
||||
logger.info("oauth_disconnect_success", provider=request.provider)
|
||||
else:
|
||||
logger.warning("oauth_disconnect_failed", provider=request.provider)
|
||||
|
||||
return noteflow_pb2.DisconnectOAuthResponse(success=success)
|
||||
|
||||
@@ -3,15 +3,19 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import time
|
||||
from datetime import UTC, datetime
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
from google.protobuf.timestamp_pb2 import Timestamp
|
||||
|
||||
from noteflow.application.services.export_service import ExportFormat
|
||||
from noteflow.domain.entities import Annotation, Meeting, Segment, Summary, WordTiming
|
||||
from noteflow.domain.value_objects import AnnotationId, AnnotationType, MeetingId
|
||||
from noteflow.infrastructure.converters import AsrConverter
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .errors import _AbortableContext
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.infrastructure.asr.dto import AsrResult
|
||||
@@ -38,7 +42,7 @@ def parse_meeting_id(meeting_id_str: str) -> MeetingId:
|
||||
|
||||
async def parse_meeting_id_or_abort(
|
||||
meeting_id_str: str,
|
||||
context: object,
|
||||
context: _AbortableContext,
|
||||
) -> MeetingId:
|
||||
"""Parse meeting ID or abort with INVALID_ARGUMENT.
|
||||
|
||||
@@ -46,7 +50,7 @@ async def parse_meeting_id_or_abort(
|
||||
|
||||
Args:
|
||||
meeting_id_str: Meeting ID as string (UUID format).
|
||||
context: gRPC servicer context.
|
||||
context: gRPC servicer context with abort capability.
|
||||
|
||||
Returns:
|
||||
MeetingId value object.
|
||||
@@ -317,3 +321,85 @@ def proto_to_export_format(proto_format: int) -> ExportFormat:
|
||||
if proto_format == noteflow_pb2.EXPORT_FORMAT_PDF:
|
||||
return ExportFormat.PDF
|
||||
return ExportFormat.MARKDOWN # Default to Markdown
|
||||
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# Timestamp Conversion Helpers
|
||||
# -----------------------------------------------------------------------------
|
||||
|
||||
|
||||
def datetime_to_proto_timestamp(dt: datetime) -> Timestamp:
|
||||
"""Convert datetime to protobuf Timestamp.
|
||||
|
||||
Args:
|
||||
dt: Datetime to convert (should be timezone-aware).
|
||||
|
||||
Returns:
|
||||
Protobuf Timestamp message.
|
||||
"""
|
||||
ts = Timestamp()
|
||||
ts.FromDatetime(dt)
|
||||
return ts
|
||||
|
||||
|
||||
def proto_timestamp_to_datetime(ts: Timestamp) -> datetime:
|
||||
"""Convert protobuf Timestamp to datetime.
|
||||
|
||||
Args:
|
||||
ts: Protobuf Timestamp message.
|
||||
|
||||
Returns:
|
||||
Timezone-aware datetime (UTC).
|
||||
"""
|
||||
return ts.ToDatetime().replace(tzinfo=UTC)
|
||||
|
||||
|
||||
def epoch_seconds_to_datetime(seconds: float) -> datetime:
|
||||
"""Convert Unix epoch seconds to datetime.
|
||||
|
||||
Args:
|
||||
seconds: Unix epoch seconds (float for sub-second precision).
|
||||
|
||||
Returns:
|
||||
Timezone-aware datetime (UTC).
|
||||
"""
|
||||
return datetime.fromtimestamp(seconds, tz=UTC)
|
||||
|
||||
|
||||
def datetime_to_epoch_seconds(dt: datetime) -> float:
|
||||
"""Convert datetime to Unix epoch seconds.
|
||||
|
||||
Args:
|
||||
dt: Datetime to convert.
|
||||
|
||||
Returns:
|
||||
Unix epoch seconds as float.
|
||||
"""
|
||||
return dt.timestamp()
|
||||
|
||||
|
||||
def iso_string_to_datetime(iso_str: str) -> datetime:
|
||||
"""Parse ISO 8601 string to datetime.
|
||||
|
||||
Args:
|
||||
iso_str: ISO 8601 formatted string.
|
||||
|
||||
Returns:
|
||||
Timezone-aware datetime (UTC if no timezone in string).
|
||||
"""
|
||||
dt = datetime.fromisoformat(iso_str.replace("Z", "+00:00"))
|
||||
if dt.tzinfo is None:
|
||||
dt = dt.replace(tzinfo=UTC)
|
||||
return dt
|
||||
|
||||
|
||||
def datetime_to_iso_string(dt: datetime) -> str:
|
||||
"""Format datetime as ISO 8601 string.
|
||||
|
||||
Args:
|
||||
dt: Datetime to format.
|
||||
|
||||
Returns:
|
||||
ISO 8601 formatted string with timezone.
|
||||
"""
|
||||
return dt.isoformat()
|
||||
|
||||
@@ -3,7 +3,6 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID, uuid4
|
||||
|
||||
@@ -11,6 +10,7 @@ import grpc
|
||||
|
||||
from noteflow.domain.utils import utc_now
|
||||
from noteflow.domain.value_objects import MeetingState
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.persistence.repositories import DiarizationJob
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
@@ -21,7 +21,7 @@ from ._types import DIARIZATION_TIMEOUT_SECONDS, GrpcContext
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def create_diarization_error_response(
|
||||
|
||||
@@ -3,9 +3,10 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
from ._jobs import JobsMixin
|
||||
from ._refinement import RefinementMixin
|
||||
@@ -16,7 +17,7 @@ from ._types import GrpcContext
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class DiarizationMixin(
|
||||
|
||||
@@ -2,13 +2,13 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import numpy as np
|
||||
|
||||
from noteflow.infrastructure.audio.reader import MeetingAudioReader
|
||||
from noteflow.infrastructure.diarization import SpeakerTurn
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..converters import parse_meeting_id_or_none
|
||||
from ._speaker import apply_speaker_to_segment
|
||||
@@ -16,7 +16,7 @@ from ._speaker import apply_speaker_to_segment
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class RefinementMixin:
|
||||
|
||||
@@ -85,10 +85,10 @@ class SpeakerMixin:
|
||||
"""
|
||||
if not request.old_speaker_id or not request.new_speaker_name:
|
||||
await abort_invalid_argument(
|
||||
context, "old_speaker_id and new_speaker_name are required" # type: ignore[arg-type]
|
||||
context, "old_speaker_id and new_speaker_name are required"
|
||||
)
|
||||
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context) # type: ignore[arg-type]
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context)
|
||||
|
||||
updated_count = 0
|
||||
|
||||
|
||||
@@ -2,10 +2,10 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.domain.utils import utc_now
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.persistence.repositories import DiarizationJob
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
@@ -15,7 +15,7 @@ from ._types import DIARIZATION_TIMEOUT_SECONDS
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class JobStatusMixin:
|
||||
|
||||
@@ -3,19 +3,19 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from functools import partial
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.persistence.repositories import StreamingTurn
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class StreamingDiarizationMixin:
|
||||
|
||||
@@ -4,13 +4,13 @@ from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import contextlib
|
||||
import logging
|
||||
from datetime import timedelta
|
||||
from typing import TYPE_CHECKING, Protocol
|
||||
|
||||
import grpc
|
||||
|
||||
from noteflow.domain.utils.time import utc_now
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .errors import ERR_CANCELLED_BY_USER, abort_not_found
|
||||
@@ -18,7 +18,7 @@ from .errors import ERR_CANCELLED_BY_USER, abort_not_found
|
||||
if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
# Diarization job TTL default (1 hour in seconds)
|
||||
|
||||
@@ -2,12 +2,13 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
import grpc.aio
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .converters import parse_meeting_id_or_abort
|
||||
from .errors import (
|
||||
@@ -25,7 +26,7 @@ if TYPE_CHECKING:
|
||||
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class EntitiesMixin:
|
||||
|
||||
@@ -9,6 +9,7 @@ import grpc.aio
|
||||
|
||||
from noteflow.application.services.export_service import ExportFormat, ExportService
|
||||
from noteflow.config.constants import EXPORT_EXT_HTML, EXPORT_EXT_PDF, EXPORT_FORMAT_HTML
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .converters import parse_meeting_id_or_abort, proto_to_export_format
|
||||
@@ -17,6 +18,8 @@ from .errors import ENTITY_MEETING, abort_not_found
|
||||
if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Format metadata lookup
|
||||
_FORMAT_METADATA: dict[ExportFormat, tuple[str, str]] = {
|
||||
ExportFormat.MARKDOWN: ("Markdown", ".md"),
|
||||
@@ -40,6 +43,13 @@ class ExportMixin:
|
||||
"""Export meeting transcript to specified format."""
|
||||
# Map proto format to ExportFormat
|
||||
fmt = proto_to_export_format(request.format)
|
||||
fmt_name, fmt_ext = _FORMAT_METADATA.get(fmt, ("Unknown", ""))
|
||||
|
||||
logger.info(
|
||||
"Export requested: meeting_id=%s format=%s",
|
||||
request.meeting_id,
|
||||
fmt_name,
|
||||
)
|
||||
|
||||
# Use unified repository provider - works with both DB and memory
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context)
|
||||
@@ -55,16 +65,28 @@ class ExportMixin:
|
||||
# PDF returns bytes which must be base64-encoded for gRPC string transport
|
||||
if isinstance(result, bytes):
|
||||
content = base64.b64encode(result).decode("ascii")
|
||||
content_size = len(result)
|
||||
else:
|
||||
content = result
|
||||
content_size = len(content)
|
||||
|
||||
# Get format metadata
|
||||
fmt_name, fmt_ext = _FORMAT_METADATA.get(fmt, ("Unknown", ""))
|
||||
logger.info(
|
||||
"Export completed: meeting_id=%s format=%s bytes=%d",
|
||||
request.meeting_id,
|
||||
fmt_name,
|
||||
content_size,
|
||||
)
|
||||
|
||||
return noteflow_pb2.ExportTranscriptResponse(
|
||||
content=content,
|
||||
format_name=fmt_name,
|
||||
file_extension=fmt_ext,
|
||||
)
|
||||
except ValueError:
|
||||
except ValueError as exc:
|
||||
logger.error(
|
||||
"Export failed: meeting_id=%s format=%s error=%s",
|
||||
request.meeting_id,
|
||||
fmt_name,
|
||||
str(exc),
|
||||
)
|
||||
await abort_not_found(context, ENTITY_MEETING, request.meeting_id)
|
||||
|
||||
@@ -3,7 +3,6 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
@@ -16,7 +15,7 @@ from noteflow.config.constants import (
|
||||
from noteflow.domain.entities import Meeting
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
from noteflow.domain.value_objects import MeetingState
|
||||
from noteflow.infrastructure.logging import get_workspace_id
|
||||
from noteflow.infrastructure.logging import get_logger, get_workspace_id
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .converters import meeting_to_proto, parse_meeting_id_or_abort
|
||||
@@ -25,7 +24,7 @@ from .errors import ENTITY_MEETING, abort_invalid_argument, abort_not_found
|
||||
if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Timeout for waiting for stream to exit gracefully
|
||||
STOP_WAIT_TIMEOUT_SECONDS: float = 2.0
|
||||
@@ -91,6 +90,10 @@ class MeetingMixin:
|
||||
try:
|
||||
project_id = UUID(request.project_id)
|
||||
except ValueError:
|
||||
logger.warning(
|
||||
"CreateMeeting: invalid project_id format",
|
||||
project_id=request.project_id,
|
||||
)
|
||||
await abort_invalid_argument(context, f"{ERROR_INVALID_PROJECT_ID_PREFIX}{request.project_id}")
|
||||
|
||||
async with self._create_repository_provider() as repo:
|
||||
@@ -104,6 +107,12 @@ class MeetingMixin:
|
||||
)
|
||||
saved = await repo.meetings.create(meeting)
|
||||
await repo.commit()
|
||||
logger.info(
|
||||
"Meeting created",
|
||||
meeting_id=str(saved.id),
|
||||
title=saved.title or DEFAULT_MEETING_TITLE,
|
||||
project_id=str(project_id) if project_id else None,
|
||||
)
|
||||
return meeting_to_proto(saved)
|
||||
|
||||
async def StopMeeting(
|
||||
@@ -117,6 +126,7 @@ class MeetingMixin:
|
||||
and waits briefly for it to exit before closing resources.
|
||||
"""
|
||||
meeting_id = request.meeting_id
|
||||
logger.info("StopMeeting requested", meeting_id=meeting_id)
|
||||
|
||||
# Signal stop to active stream and wait for graceful exit
|
||||
if meeting_id in self._active_streams:
|
||||
@@ -138,50 +148,49 @@ class MeetingMixin:
|
||||
async with self._create_repository_provider() as repo:
|
||||
meeting = await repo.meetings.get(parsed_meeting_id)
|
||||
if meeting is None:
|
||||
logger.warning("StopMeeting: meeting not found", meeting_id=meeting_id)
|
||||
await abort_not_found(context, ENTITY_MEETING, meeting_id)
|
||||
|
||||
# Idempotency guard: return success if already stopped/stopping/completed
|
||||
if meeting.state in (
|
||||
MeetingState.STOPPED,
|
||||
MeetingState.STOPPING,
|
||||
MeetingState.COMPLETED,
|
||||
):
|
||||
previous_state = meeting.state.value
|
||||
|
||||
# Idempotency: return success if already stopped/stopping/completed
|
||||
terminal_states = (MeetingState.STOPPED, MeetingState.STOPPING, MeetingState.COMPLETED)
|
||||
if meeting.state in terminal_states:
|
||||
logger.debug("StopMeeting: already terminal", meeting_id=meeting_id, state=meeting.state.value)
|
||||
return meeting_to_proto(meeting)
|
||||
|
||||
try:
|
||||
# Graceful shutdown: RECORDING -> STOPPING -> STOPPED
|
||||
meeting.begin_stopping()
|
||||
meeting.begin_stopping() # RECORDING -> STOPPING -> STOPPED
|
||||
meeting.stop_recording()
|
||||
except ValueError as e:
|
||||
logger.error("StopMeeting: invalid transition", meeting_id=meeting_id, state=previous_state, error=str(e))
|
||||
await abort_invalid_argument(context, str(e))
|
||||
await repo.meetings.update(meeting)
|
||||
# Clean up streaming diarization turns if DB supports it
|
||||
if repo.supports_diarization_jobs:
|
||||
await repo.diarization_jobs.clear_streaming_turns(meeting_id)
|
||||
await repo.commit()
|
||||
|
||||
# Trigger webhooks (fire-and-forget)
|
||||
if self._webhook_service is not None:
|
||||
try:
|
||||
await self._webhook_service.trigger_recording_stopped(
|
||||
meeting_id=meeting_id,
|
||||
title=meeting.title or DEFAULT_MEETING_TITLE,
|
||||
duration_seconds=meeting.duration_seconds or 0.0,
|
||||
)
|
||||
# INTENTIONAL BROAD HANDLER: Fire-and-forget webhook
|
||||
# - Webhook failures must never block StopMeeting RPC
|
||||
except Exception:
|
||||
logger.exception("Failed to trigger recording.stopped webhooks")
|
||||
|
||||
try:
|
||||
await self._webhook_service.trigger_meeting_completed(meeting)
|
||||
# INTENTIONAL BROAD HANDLER: Fire-and-forget webhook
|
||||
# - Webhook failures must never block StopMeeting RPC
|
||||
except Exception:
|
||||
logger.exception("Failed to trigger meeting.completed webhooks")
|
||||
|
||||
logger.info("Meeting stopped", meeting_id=meeting_id, from_state=previous_state, to_state=meeting.state.value)
|
||||
await self._fire_stop_webhooks(meeting)
|
||||
return meeting_to_proto(meeting)
|
||||
|
||||
async def _fire_stop_webhooks(self: ServicerHost, meeting: Meeting) -> None:
|
||||
"""Trigger webhooks for meeting stop (fire-and-forget)."""
|
||||
if self._webhook_service is None:
|
||||
return
|
||||
try:
|
||||
await self._webhook_service.trigger_recording_stopped(
|
||||
meeting_id=str(meeting.id),
|
||||
title=meeting.title or DEFAULT_MEETING_TITLE,
|
||||
duration_seconds=meeting.duration_seconds or 0.0,
|
||||
)
|
||||
except Exception:
|
||||
logger.exception("Failed to trigger recording.stopped webhooks")
|
||||
try:
|
||||
await self._webhook_service.trigger_meeting_completed(meeting)
|
||||
except Exception:
|
||||
logger.exception("Failed to trigger meeting.completed webhooks")
|
||||
|
||||
async def ListMeetings(
|
||||
self: ServicerHost,
|
||||
request: noteflow_pb2.ListMeetingsRequest,
|
||||
@@ -211,6 +220,14 @@ class MeetingMixin:
|
||||
sort_desc=sort_desc,
|
||||
project_id=project_id,
|
||||
)
|
||||
logger.debug(
|
||||
"ListMeetings returned",
|
||||
count=len(meetings),
|
||||
total=total,
|
||||
limit=limit,
|
||||
offset=offset,
|
||||
project_id=str(project_id) if project_id else None,
|
||||
)
|
||||
return noteflow_pb2.ListMeetingsResponse(
|
||||
meetings=[meeting_to_proto(m, include_segments=False) for m in meetings],
|
||||
total_count=total,
|
||||
@@ -222,10 +239,17 @@ class MeetingMixin:
|
||||
context: grpc.aio.ServicerContext,
|
||||
) -> noteflow_pb2.Meeting:
|
||||
"""Get meeting details."""
|
||||
logger.debug(
|
||||
"GetMeeting requested",
|
||||
meeting_id=request.meeting_id,
|
||||
include_segments=request.include_segments,
|
||||
include_summary=request.include_summary,
|
||||
)
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context)
|
||||
async with self._create_repository_provider() as repo:
|
||||
meeting = await repo.meetings.get(meeting_id)
|
||||
if meeting is None:
|
||||
logger.warning("GetMeeting: meeting not found", meeting_id=request.meeting_id)
|
||||
await abort_not_found(context, ENTITY_MEETING, request.meeting_id)
|
||||
# Load segments if requested
|
||||
if request.include_segments:
|
||||
@@ -247,10 +271,13 @@ class MeetingMixin:
|
||||
context: grpc.aio.ServicerContext,
|
||||
) -> noteflow_pb2.DeleteMeetingResponse:
|
||||
"""Delete a meeting."""
|
||||
logger.info("DeleteMeeting requested", meeting_id=request.meeting_id)
|
||||
meeting_id = await parse_meeting_id_or_abort(request.meeting_id, context)
|
||||
async with self._create_repository_provider() as repo:
|
||||
success = await repo.meetings.delete(meeting_id)
|
||||
if success:
|
||||
await repo.commit()
|
||||
logger.info("Meeting deleted", meeting_id=request.meeting_id)
|
||||
return noteflow_pb2.DeleteMeetingResponse(success=True)
|
||||
logger.warning("DeleteMeeting: meeting not found", meeting_id=request.meeting_id)
|
||||
await abort_not_found(context, ENTITY_MEETING, request.meeting_id)
|
||||
|
||||
@@ -4,11 +4,12 @@ from __future__ import annotations
|
||||
|
||||
import hashlib
|
||||
import json
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc.aio
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .errors import abort_database_required, abort_failed_precondition
|
||||
|
||||
@@ -16,7 +17,7 @@ if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Entity type names for error messages
|
||||
_ENTITY_PREFERENCES = "Preferences"
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
@@ -10,6 +9,7 @@ import grpc.aio
|
||||
|
||||
from noteflow.config.constants import ERROR_INVALID_PROJECT_ID_PREFIX
|
||||
from noteflow.domain.errors import CannotArchiveDefaultProjectError
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
from ..errors import (
|
||||
@@ -29,7 +29,7 @@ if TYPE_CHECKING:
|
||||
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
async def _require_project_service(
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import AsyncIterator
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
@@ -10,6 +9,7 @@ import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from noteflow.domain.entities import Segment
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
from ..converters import (
|
||||
@@ -21,7 +21,7 @@ from ..converters import (
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
async def process_audio_segment(
|
||||
|
||||
@@ -2,13 +2,14 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def cleanup_stream_resources(host: ServicerHost, meeting_id: str) -> None:
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import AsyncIterator
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
@@ -10,6 +9,8 @@ import grpc.aio
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
from .._audio_helpers import convert_audio_format
|
||||
from ..errors import abort_failed_precondition, abort_invalid_argument
|
||||
@@ -29,7 +30,7 @@ from ._types import StreamSessionInit
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class StreamingMixin:
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import AsyncIterator
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
@@ -10,6 +9,8 @@ import grpc.aio
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ...proto import noteflow_pb2
|
||||
from .._audio_helpers import convert_audio_format, decode_audio_chunk, validate_stream_format
|
||||
from ..converters import create_vad_update
|
||||
@@ -20,7 +21,7 @@ from ._partials import clear_partial_buffer, maybe_emit_partial
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
async def process_stream_chunk(
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc
|
||||
@@ -10,6 +9,7 @@ import grpc.aio
|
||||
|
||||
from noteflow.config.constants import DEFAULT_MEETING_TITLE, ERROR_MSG_MEETING_PREFIX
|
||||
from noteflow.infrastructure.diarization import SpeakerTurn
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..converters import parse_meeting_id_or_none
|
||||
from ..errors import abort_failed_precondition
|
||||
@@ -18,7 +18,7 @@ from ._types import StreamSessionInit
|
||||
if TYPE_CHECKING:
|
||||
from ..protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class StreamSessionManager:
|
||||
@@ -87,7 +87,7 @@ class StreamSessionManager:
|
||||
return StreamSessionInit(
|
||||
next_segment_id=0,
|
||||
error_code=grpc.StatusCode.NOT_FOUND,
|
||||
error_message=f"Meeting {meeting_id} not found",
|
||||
error_message=f"{ERROR_MSG_MEETING_PREFIX}{meeting_id} not found",
|
||||
)
|
||||
|
||||
dek, wrapped_dek, dek_updated = host._ensure_meeting_dek(meeting)
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import grpc.aio
|
||||
@@ -10,6 +9,7 @@ import grpc.aio
|
||||
from noteflow.domain.entities import Segment, Summary
|
||||
from noteflow.domain.summarization import ProviderUnavailableError
|
||||
from noteflow.domain.value_objects import MeetingId
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.summarization._parsing import build_style_prompt
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
@@ -21,7 +21,7 @@ if TYPE_CHECKING:
|
||||
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class SummarizationMixin:
|
||||
|
||||
@@ -3,13 +3,13 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
import grpc.aio
|
||||
|
||||
from noteflow.domain.entities import SyncRun
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .errors import (
|
||||
@@ -24,7 +24,7 @@ from .errors import (
|
||||
if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
_ERR_CALENDAR_NOT_ENABLED = "Calendar integration not enabled"
|
||||
|
||||
@@ -49,77 +49,64 @@ class SyncMixin:
|
||||
request: noteflow_pb2.StartIntegrationSyncRequest,
|
||||
context: grpc.aio.ServicerContext,
|
||||
) -> noteflow_pb2.StartIntegrationSyncResponse:
|
||||
"""Start a sync operation for an integration.
|
||||
|
||||
Creates a sync run record and kicks off the actual sync asynchronously.
|
||||
"""
|
||||
"""Start a sync operation for an integration."""
|
||||
if self._calendar_service is None:
|
||||
await abort_unavailable(context, _ERR_CALENDAR_NOT_ENABLED)
|
||||
|
||||
try:
|
||||
integration_id = UUID(request.integration_id)
|
||||
except ValueError:
|
||||
await abort_invalid_argument(
|
||||
context,
|
||||
f"Invalid integration_id format: {request.integration_id}",
|
||||
)
|
||||
await abort_invalid_argument(context, f"Invalid integration_id format: {request.integration_id}")
|
||||
return noteflow_pb2.StartIntegrationSyncResponse()
|
||||
|
||||
# Verify integration exists
|
||||
async with self._create_repository_provider() as uow:
|
||||
integration = await uow.integrations.get(integration_id)
|
||||
|
||||
# Fallback: if integration not found by ID, try looking up by provider name
|
||||
# This handles cases where frontend uses local IDs that don't match backend
|
||||
integration, integration_id = await self._resolve_integration(uow, integration_id, context, request)
|
||||
if integration is None:
|
||||
# Try to find connected calendar integration by provider (google/outlook)
|
||||
from noteflow.domain.value_objects import OAuthProvider
|
||||
|
||||
for provider_name in [OAuthProvider.GOOGLE, OAuthProvider.OUTLOOK]:
|
||||
candidate = await uow.integrations.get_by_provider(
|
||||
provider=provider_name,
|
||||
integration_type="calendar",
|
||||
)
|
||||
if candidate is not None and candidate.is_connected:
|
||||
integration = candidate
|
||||
integration_id = integration.id
|
||||
break
|
||||
|
||||
if integration is None:
|
||||
await abort_not_found(context, ENTITY_INTEGRATION, request.integration_id)
|
||||
return noteflow_pb2.StartIntegrationSyncResponse()
|
||||
return noteflow_pb2.StartIntegrationSyncResponse()
|
||||
|
||||
provider = integration.config.get("provider") if integration.config else None
|
||||
if not provider:
|
||||
await abort_failed_precondition(
|
||||
context,
|
||||
"Integration provider not configured",
|
||||
)
|
||||
await abort_failed_precondition(context, "Integration provider not configured")
|
||||
return noteflow_pb2.StartIntegrationSyncResponse()
|
||||
|
||||
# Create sync run
|
||||
sync_run = SyncRun.start(integration_id)
|
||||
sync_run = await uow.integrations.create_sync_run(sync_run)
|
||||
await uow.commit()
|
||||
|
||||
# Cache the sync run for quick status lookups
|
||||
cache = self._ensure_sync_runs_cache()
|
||||
cache[sync_run.id] = sync_run
|
||||
|
||||
# Fire off async sync task (store reference to prevent GC)
|
||||
sync_task = asyncio.create_task(
|
||||
asyncio.create_task(
|
||||
self._perform_sync(integration_id, sync_run.id, str(provider)),
|
||||
name=f"sync-{sync_run.id}",
|
||||
)
|
||||
# Add callback to clean up on completion
|
||||
sync_task.add_done_callback(lambda _: None)
|
||||
|
||||
).add_done_callback(lambda _: None)
|
||||
logger.info("Started sync run %s for integration %s", sync_run.id, integration_id)
|
||||
return noteflow_pb2.StartIntegrationSyncResponse(sync_run_id=str(sync_run.id), status="running")
|
||||
|
||||
return noteflow_pb2.StartIntegrationSyncResponse(
|
||||
sync_run_id=str(sync_run.id),
|
||||
status="running",
|
||||
)
|
||||
async def _resolve_integration(
|
||||
self: ServicerHost,
|
||||
uow: object,
|
||||
integration_id: UUID,
|
||||
context: grpc.aio.ServicerContext,
|
||||
request: noteflow_pb2.StartIntegrationSyncRequest,
|
||||
) -> tuple[object | None, UUID]:
|
||||
"""Resolve integration by ID with provider fallback.
|
||||
|
||||
Returns (integration, resolved_id) tuple. Returns (None, id) if not found after aborting.
|
||||
"""
|
||||
from noteflow.domain.value_objects import OAuthProvider
|
||||
|
||||
integration = await uow.integrations.get(integration_id)
|
||||
if integration is not None:
|
||||
return integration, integration_id
|
||||
|
||||
# Fallback: try connected calendar integrations by provider
|
||||
for provider_name in [OAuthProvider.GOOGLE, OAuthProvider.OUTLOOK]:
|
||||
candidate = await uow.integrations.get_by_provider(provider=provider_name, integration_type="calendar")
|
||||
if candidate is not None and candidate.is_connected:
|
||||
return candidate, candidate.id
|
||||
|
||||
await abort_not_found(context, ENTITY_INTEGRATION, request.integration_id)
|
||||
return None, integration_id
|
||||
|
||||
async def _perform_sync(
|
||||
self: ServicerHost,
|
||||
|
||||
@@ -8,16 +8,26 @@ from uuid import UUID
|
||||
|
||||
import grpc.aio
|
||||
|
||||
from noteflow.config.constants import (
|
||||
LOG_EVENT_INVALID_WEBHOOK_ID,
|
||||
LOG_EVENT_WEBHOOK_DELETE_FAILED,
|
||||
LOG_EVENT_WEBHOOK_REGISTRATION_FAILED,
|
||||
LOG_EVENT_WEBHOOK_UPDATE_FAILED,
|
||||
)
|
||||
from noteflow.domain.errors import ErrorCode
|
||||
from noteflow.domain.utils.time import utc_now
|
||||
from noteflow.domain.webhooks.events import (
|
||||
WebhookConfig,
|
||||
WebhookDelivery,
|
||||
WebhookEventType,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
from ..proto import noteflow_pb2
|
||||
from .errors import abort_database_required, abort_invalid_argument, abort_not_found
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .protocols import ServicerHost
|
||||
|
||||
@@ -85,28 +95,30 @@ class WebhooksMixin:
|
||||
"""Register a new webhook configuration."""
|
||||
# Validate URL
|
||||
if not request.url or not request.url.startswith(("http://", "https://")):
|
||||
await abort_invalid_argument(
|
||||
context, "URL must start with http:// or https://"
|
||||
)
|
||||
logger.error(LOG_EVENT_WEBHOOK_REGISTRATION_FAILED, reason="invalid_url", url=request.url)
|
||||
await abort_invalid_argument(context, "URL must start with http:// or https://")
|
||||
|
||||
# Validate events
|
||||
if not request.events:
|
||||
logger.error(LOG_EVENT_WEBHOOK_REGISTRATION_FAILED, reason="no_events", url=request.url)
|
||||
await abort_invalid_argument(context, "At least one event type required")
|
||||
|
||||
try:
|
||||
events = _parse_events(list(request.events))
|
||||
except ValueError as exc:
|
||||
logger.error(LOG_EVENT_WEBHOOK_REGISTRATION_FAILED, reason="invalid_event_type", url=request.url, error=str(exc))
|
||||
await abort_invalid_argument(context, f"Invalid event type: {exc}")
|
||||
|
||||
try:
|
||||
workspace_id = UUID(request.workspace_id)
|
||||
except ValueError:
|
||||
from noteflow.config.constants import ERROR_INVALID_WORKSPACE_ID_FORMAT
|
||||
|
||||
logger.error(LOG_EVENT_WEBHOOK_REGISTRATION_FAILED, reason="invalid_workspace_id", workspace_id=request.workspace_id)
|
||||
await abort_invalid_argument(context, ERROR_INVALID_WORKSPACE_ID_FORMAT)
|
||||
|
||||
async with self._create_repository_provider() as uow:
|
||||
if not uow.supports_webhooks:
|
||||
logger.error(LOG_EVENT_WEBHOOK_REGISTRATION_FAILED, reason=ErrorCode.DATABASE_REQUIRED.code)
|
||||
await abort_database_required(context, _ENTITY_WEBHOOKS)
|
||||
|
||||
config = WebhookConfig.create(
|
||||
@@ -120,6 +132,7 @@ class WebhooksMixin:
|
||||
)
|
||||
saved = await uow.webhooks.create(config)
|
||||
await uow.commit()
|
||||
logger.info("webhook_registered", webhook_id=str(saved.id), workspace_id=str(workspace_id), url=request.url, name=saved.name)
|
||||
return _webhook_config_to_proto(saved)
|
||||
|
||||
async def ListWebhooks(
|
||||
@@ -130,6 +143,10 @@ class WebhooksMixin:
|
||||
"""List registered webhooks."""
|
||||
async with self._create_repository_provider() as uow:
|
||||
if not uow.supports_webhooks:
|
||||
logger.error(
|
||||
"webhook_list_failed",
|
||||
reason=ErrorCode.DATABASE_REQUIRED.code,
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_WEBHOOKS)
|
||||
|
||||
if request.enabled_only:
|
||||
@@ -137,6 +154,11 @@ class WebhooksMixin:
|
||||
else:
|
||||
webhooks = await uow.webhooks.get_all()
|
||||
|
||||
logger.debug(
|
||||
"webhooks_listed",
|
||||
count=len(webhooks),
|
||||
enabled_only=request.enabled_only,
|
||||
)
|
||||
return noteflow_pb2.ListWebhooksResponse(
|
||||
webhooks=[_webhook_config_to_proto(w) for w in webhooks],
|
||||
total_count=len(webhooks),
|
||||
@@ -151,14 +173,29 @@ class WebhooksMixin:
|
||||
try:
|
||||
webhook_id = _parse_webhook_id(request.webhook_id)
|
||||
except ValueError:
|
||||
logger.error(
|
||||
LOG_EVENT_WEBHOOK_UPDATE_FAILED,
|
||||
reason=LOG_EVENT_INVALID_WEBHOOK_ID,
|
||||
webhook_id=request.webhook_id,
|
||||
)
|
||||
await abort_invalid_argument(context, _ERR_INVALID_WEBHOOK_ID)
|
||||
|
||||
async with self._create_repository_provider() as uow:
|
||||
if not uow.supports_webhooks:
|
||||
logger.error(
|
||||
LOG_EVENT_WEBHOOK_UPDATE_FAILED,
|
||||
reason=ErrorCode.DATABASE_REQUIRED.code,
|
||||
webhook_id=str(webhook_id),
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_WEBHOOKS)
|
||||
|
||||
config = await uow.webhooks.get_by_id(webhook_id)
|
||||
if config is None:
|
||||
logger.error(
|
||||
LOG_EVENT_WEBHOOK_UPDATE_FAILED,
|
||||
reason="not_found",
|
||||
webhook_id=str(webhook_id),
|
||||
)
|
||||
await abort_not_found(context, _ENTITY_WEBHOOK, request.webhook_id)
|
||||
|
||||
# Build updates dict with proper typing
|
||||
@@ -184,6 +221,12 @@ class WebhooksMixin:
|
||||
updated = replace(config, **updates, updated_at=utc_now())
|
||||
saved = await uow.webhooks.update(updated)
|
||||
await uow.commit()
|
||||
|
||||
logger.info(
|
||||
"webhook_updated",
|
||||
webhook_id=str(webhook_id),
|
||||
updated_fields=list(updates.keys()),
|
||||
)
|
||||
return _webhook_config_to_proto(saved)
|
||||
|
||||
async def DeleteWebhook(
|
||||
@@ -195,14 +238,36 @@ class WebhooksMixin:
|
||||
try:
|
||||
webhook_id = _parse_webhook_id(request.webhook_id)
|
||||
except ValueError:
|
||||
logger.error(
|
||||
LOG_EVENT_WEBHOOK_DELETE_FAILED,
|
||||
reason=LOG_EVENT_INVALID_WEBHOOK_ID,
|
||||
webhook_id=request.webhook_id,
|
||||
)
|
||||
await abort_invalid_argument(context, _ERR_INVALID_WEBHOOK_ID)
|
||||
|
||||
async with self._create_repository_provider() as uow:
|
||||
if not uow.supports_webhooks:
|
||||
logger.error(
|
||||
LOG_EVENT_WEBHOOK_DELETE_FAILED,
|
||||
reason=ErrorCode.DATABASE_REQUIRED.code,
|
||||
webhook_id=str(webhook_id),
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_WEBHOOKS)
|
||||
|
||||
deleted = await uow.webhooks.delete(webhook_id)
|
||||
await uow.commit()
|
||||
|
||||
if deleted:
|
||||
logger.info(
|
||||
"webhook_deleted",
|
||||
webhook_id=str(webhook_id),
|
||||
)
|
||||
else:
|
||||
logger.error(
|
||||
LOG_EVENT_WEBHOOK_DELETE_FAILED,
|
||||
reason="not_found",
|
||||
webhook_id=str(webhook_id),
|
||||
)
|
||||
return noteflow_pb2.DeleteWebhookResponse(success=deleted)
|
||||
|
||||
async def GetWebhookDeliveries(
|
||||
@@ -214,15 +279,32 @@ class WebhooksMixin:
|
||||
try:
|
||||
webhook_id = _parse_webhook_id(request.webhook_id)
|
||||
except ValueError:
|
||||
logger.error(
|
||||
"webhook_deliveries_query_failed",
|
||||
reason=LOG_EVENT_INVALID_WEBHOOK_ID,
|
||||
webhook_id=request.webhook_id,
|
||||
)
|
||||
await abort_invalid_argument(context, _ERR_INVALID_WEBHOOK_ID)
|
||||
|
||||
limit = min(request.limit or 50, 500)
|
||||
|
||||
async with self._create_repository_provider() as uow:
|
||||
if not uow.supports_webhooks:
|
||||
logger.error(
|
||||
"webhook_deliveries_query_failed",
|
||||
reason=ErrorCode.DATABASE_REQUIRED.code,
|
||||
webhook_id=str(webhook_id),
|
||||
)
|
||||
await abort_database_required(context, _ENTITY_WEBHOOKS)
|
||||
|
||||
deliveries = await uow.webhooks.get_deliveries(webhook_id, limit=limit)
|
||||
|
||||
logger.debug(
|
||||
"webhook_deliveries_queried",
|
||||
webhook_id=str(webhook_id),
|
||||
count=len(deliveries),
|
||||
limit=limit,
|
||||
)
|
||||
return noteflow_pb2.GetWebhookDeliveriesResponse(
|
||||
deliveries=[_webhook_delivery_to_proto(d) for d in deliveries],
|
||||
total_count=len(deliveries),
|
||||
|
||||
@@ -6,7 +6,6 @@ clean separation of concerns for server initialization.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import sys
|
||||
from typing import TypedDict
|
||||
|
||||
@@ -30,6 +29,7 @@ from noteflow.config.settings import (
|
||||
from noteflow.domain.entities.integration import IntegrationStatus
|
||||
from noteflow.grpc._config import DiarizationConfig, GrpcServerConfig
|
||||
from noteflow.infrastructure.diarization import DiarizationEngine
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.ner import NerEngine
|
||||
from noteflow.infrastructure.persistence.database import (
|
||||
create_engine_and_session_factory,
|
||||
@@ -49,7 +49,7 @@ class DiarizationEngineKwargs(TypedDict, total=False):
|
||||
min_speakers: int
|
||||
max_speakers: int
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
async def _auto_enable_cloud_llm(
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import queue
|
||||
import threading
|
||||
import time
|
||||
@@ -16,6 +15,7 @@ from noteflow.config.constants import DEFAULT_SAMPLE_RATE
|
||||
from noteflow.grpc._config import STREAMING_CONFIG
|
||||
from noteflow.grpc.client import TranscriptSegment
|
||||
from noteflow.grpc.proto import noteflow_pb2
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
@@ -24,7 +24,7 @@ if TYPE_CHECKING:
|
||||
from noteflow.grpc.client import ConnectionCallback, TranscriptCallback
|
||||
from noteflow.grpc.proto import noteflow_pb2_grpc
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import queue
|
||||
import threading
|
||||
from typing import TYPE_CHECKING, Final
|
||||
@@ -29,6 +28,7 @@ from noteflow.grpc._types import (
|
||||
TranscriptSegment,
|
||||
)
|
||||
from noteflow.grpc.proto import noteflow_pb2, noteflow_pb2_grpc
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
@@ -48,7 +48,7 @@ __all__ = [
|
||||
"TranscriptSegment",
|
||||
]
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
DEFAULT_SERVER: Final[str] = "localhost:50051"
|
||||
CHUNK_TIMEOUT: Final[float] = 0.1 # Timeout for getting chunks from queue
|
||||
|
||||
@@ -6,7 +6,6 @@ by extracting from metadata and setting context variables.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import Awaitable, Callable
|
||||
from typing import TypeVar
|
||||
|
||||
@@ -15,12 +14,13 @@ from grpc import aio
|
||||
|
||||
from noteflow.infrastructure.logging import (
|
||||
generate_request_id,
|
||||
get_logger,
|
||||
request_id_var,
|
||||
user_id_var,
|
||||
workspace_id_var,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Metadata keys for identity context
|
||||
METADATA_REQUEST_ID = "x-request-id"
|
||||
|
||||
@@ -4,7 +4,6 @@ from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import logging
|
||||
import signal
|
||||
import time
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -15,7 +14,7 @@ from pydantic import ValidationError
|
||||
from noteflow.config.settings import get_feature_flags, get_settings
|
||||
from noteflow.infrastructure.asr import FasterWhisperEngine
|
||||
from noteflow.infrastructure.asr.engine import VALID_MODEL_SIZES
|
||||
from noteflow.infrastructure.logging import LogBufferHandler
|
||||
from noteflow.infrastructure.logging import LoggingConfig, configure_logging, get_logger
|
||||
from noteflow.infrastructure.persistence.unit_of_work import SqlAlchemyUnitOfWork
|
||||
from noteflow.infrastructure.summarization import create_summarization_service
|
||||
|
||||
@@ -49,7 +48,7 @@ if TYPE_CHECKING:
|
||||
from noteflow.config.settings import Settings
|
||||
from noteflow.infrastructure.diarization import DiarizationEngine
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class NoteFlowServer:
|
||||
@@ -515,15 +514,9 @@ def main() -> None:
|
||||
"""Entry point for NoteFlow gRPC server."""
|
||||
args = _parse_args()
|
||||
|
||||
# Configure logging
|
||||
log_level = logging.DEBUG if args.verbose else logging.INFO
|
||||
logging.basicConfig(
|
||||
level=log_level,
|
||||
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
|
||||
)
|
||||
root_logger = logging.getLogger()
|
||||
if not any(isinstance(handler, LogBufferHandler) for handler in root_logger.handlers):
|
||||
root_logger.addHandler(LogBufferHandler(level=log_level))
|
||||
# Configure centralized logging with structlog
|
||||
log_level = "DEBUG" if args.verbose else "INFO"
|
||||
configure_logging(LoggingConfig(level=log_level))
|
||||
|
||||
# Load settings from environment
|
||||
try:
|
||||
|
||||
@@ -4,7 +4,6 @@ from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import contextlib
|
||||
import logging
|
||||
import time
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING, ClassVar, Final
|
||||
@@ -20,6 +19,7 @@ from noteflow.infrastructure.asr import Segmenter, SegmenterConfig, StreamingVad
|
||||
from noteflow.infrastructure.audio.partial_buffer import PartialAudioBuffer
|
||||
from noteflow.infrastructure.audio.writer import MeetingAudioWriter
|
||||
from noteflow.infrastructure.diarization import DiarizationSession
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.persistence.memory import MemoryUnitOfWork
|
||||
from noteflow.infrastructure.persistence.repositories import DiarizationJob
|
||||
from noteflow.infrastructure.persistence.unit_of_work import SqlAlchemyUnitOfWork
|
||||
@@ -59,7 +59,7 @@ if TYPE_CHECKING:
|
||||
from noteflow.infrastructure.asr import FasterWhisperEngine
|
||||
from noteflow.infrastructure.diarization import DiarizationEngine, SpeakerTurn
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class NoteFlowServicer(
|
||||
@@ -140,11 +140,6 @@ class NoteFlowServicer(
|
||||
self._crypto = AesGcmCryptoBox(self._keystore)
|
||||
self._audio_writers: dict[str, MeetingAudioWriter] = {}
|
||||
|
||||
# Initialize all state dictionaries
|
||||
self._init_streaming_state_dicts()
|
||||
|
||||
def _init_streaming_state_dicts(self) -> None:
|
||||
"""Initialize all streaming state dictionaries."""
|
||||
# VAD and segmentation state per meeting
|
||||
self._vad_instances: dict[str, StreamingVad] = {}
|
||||
self._segmenters: dict[str, Segmenter] = {}
|
||||
|
||||
@@ -6,18 +6,19 @@ Provides Whisper-based transcription with word-level timestamps.
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from collections.abc import Iterator
|
||||
from functools import partial
|
||||
from typing import TYPE_CHECKING, Final
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from noteflow.infrastructure.asr.dto import AsrResult, WordTiming
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Available model sizes
|
||||
VALID_MODEL_SIZES: Final[tuple[str, ...]] = (
|
||||
|
||||
@@ -5,6 +5,7 @@ Manages speech segment boundaries using Voice Activity Detection.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from collections import deque
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum, auto
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -75,7 +76,8 @@ class Segmenter:
|
||||
_leading_duration: float = field(default=0.0, init=False)
|
||||
|
||||
# Audio buffers with cached sample counts for O(1) length lookups
|
||||
_leading_buffer: list[NDArray[np.float32]] = field(default_factory=list, init=False)
|
||||
# Using deque for _leading_buffer enables O(1) popleft() vs O(n) pop(0)
|
||||
_leading_buffer: deque[NDArray[np.float32]] = field(default_factory=deque, init=False)
|
||||
_leading_buffer_samples: int = field(default=0, init=False)
|
||||
_speech_buffer: list[NDArray[np.float32]] = field(default_factory=list, init=False)
|
||||
_speech_buffer_samples: int = field(default=0, init=False)
|
||||
@@ -238,9 +240,9 @@ class Segmenter:
|
||||
# Calculate total buffer duration using cached sample count
|
||||
total_duration = self._leading_buffer_samples / self.config.sample_rate
|
||||
|
||||
# Trim to configured leading buffer size
|
||||
# Trim to configured leading buffer size (O(1) with deque.popleft)
|
||||
while total_duration > self.config.leading_buffer and self._leading_buffer:
|
||||
removed = self._leading_buffer.pop(0)
|
||||
removed = self._leading_buffer.popleft()
|
||||
self._leading_buffer_samples -= len(removed)
|
||||
total_duration = self._leading_buffer_samples / self.config.sample_rate
|
||||
|
||||
|
||||
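The Segmenter hunk above swaps the list-backed `_leading_buffer` for `collections.deque` and keeps a cached sample count, so trimming the leading buffer removes the oldest block in O(1) instead of shifting the whole list. A standalone sketch of that trimming loop, assuming made-up buffer contents and a 0.5 s window (not the actual Segmenter class):

```python
from collections import deque

import numpy as np

# Hypothetical leading buffer holding float32 audio blocks, oldest first.
leading_buffer: deque[np.ndarray] = deque(
    np.zeros(1600, dtype=np.float32) for _ in range(10)
)
leading_buffer_samples = sum(len(block) for block in leading_buffer)

SAMPLE_RATE = 16_000
MAX_LEADING_SECONDS = 0.5  # assumed config value for illustration

# Trim oldest blocks until the buffered duration fits the configured window.
# deque.popleft() is O(1); list.pop(0) would shift every remaining element.
duration = leading_buffer_samples / SAMPLE_RATE
while duration > MAX_LEADING_SECONDS and leading_buffer:
    removed = leading_buffer.popleft()
    leading_buffer_samples -= len(removed)
    duration = leading_buffer_samples / SAMPLE_RATE

print(f"kept {len(leading_buffer)} blocks, {duration:.2f}s of audio")
```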
@@ -5,7 +5,6 @@ Provide cross-platform audio input capture with device handling.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import time
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
@@ -14,11 +13,12 @@ import sounddevice as sd
|
||||
|
||||
from noteflow.config.constants import DEFAULT_SAMPLE_RATE
|
||||
from noteflow.infrastructure.audio.dto import AudioDeviceInfo, AudioFrameCallback
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from numpy.typing import NDArray
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class SoundDeviceCapture:
|
||||
|
||||
@@ -5,7 +5,6 @@ Provide cross-platform audio output playback from ring buffer audio.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import threading
|
||||
from collections.abc import Callable
|
||||
from enum import Enum, auto
|
||||
@@ -16,11 +15,12 @@ import sounddevice as sd
|
||||
from numpy.typing import NDArray
|
||||
|
||||
from noteflow.config.constants import DEFAULT_SAMPLE_RATE, POSITION_UPDATE_INTERVAL
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.infrastructure.audio.dto import TimestampedAudio
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class PlaybackState(Enum):
|
||||
|
||||
@@ -7,7 +7,6 @@ Reuses ChunkedAssetReader from security/crypto.py for decryption.
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
@@ -15,12 +14,13 @@ import numpy as np
|
||||
|
||||
from noteflow.config.constants import DEFAULT_SAMPLE_RATE
|
||||
from noteflow.infrastructure.audio.dto import TimestampedAudio
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.security.crypto import ChunkedAssetReader
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.infrastructure.security.crypto import AesGcmCryptoBox
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class MeetingAudioReader:
|
||||
|
||||
@@ -4,7 +4,6 @@ from __future__ import annotations
|
||||
|
||||
import io
|
||||
import json
|
||||
import logging
|
||||
import threading
|
||||
from datetime import UTC, datetime
|
||||
from pathlib import Path
|
||||
@@ -17,6 +16,7 @@ from noteflow.config.constants import (
|
||||
DEFAULT_SAMPLE_RATE,
|
||||
PERIODIC_FLUSH_INTERVAL_SECONDS,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.security.crypto import ChunkedAssetWriter
|
||||
|
||||
if TYPE_CHECKING:
|
||||
@@ -24,7 +24,7 @@ if TYPE_CHECKING:
|
||||
|
||||
from noteflow.infrastructure.security.crypto import AesGcmCryptoBox
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class MeetingAudioWriter:
|
||||
|
||||
@@ -6,17 +6,17 @@ Fetches and parses OIDC provider configuration from
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import httpx
|
||||
|
||||
from noteflow.domain.auth.oidc import OidcDiscoveryConfig
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.auth.oidc import OidcProviderConfig
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class OidcDiscoveryError(Exception):
|
||||
|
||||
@@ -6,7 +6,6 @@ like Authentik, Authelia, Keycloak, Auth0, Okta, and Azure AD.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from dataclasses import dataclass, field
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
@@ -20,11 +19,12 @@ from noteflow.infrastructure.auth.oidc_discovery import (
|
||||
OidcDiscoveryClient,
|
||||
OidcDiscoveryError,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.domain.ports.unit_of_work import UnitOfWork
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
|
||||
@@ -5,7 +5,6 @@ Implements CalendarPort for Google Calendar using the Google Calendar API v3.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from datetime import UTC, datetime, timedelta
|
||||
|
||||
import httpx
|
||||
@@ -21,8 +20,9 @@ from noteflow.config.constants import (
|
||||
)
|
||||
from noteflow.domain.ports.calendar import CalendarEventInfo, CalendarPort
|
||||
from noteflow.domain.value_objects import OAuthProvider
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class GoogleCalendarError(Exception):
|
||||
|
||||
@@ -8,7 +8,6 @@ from __future__ import annotations
|
||||
|
||||
import base64
|
||||
import hashlib
|
||||
import logging
|
||||
import secrets
|
||||
from datetime import UTC, datetime, timedelta
|
||||
from typing import TYPE_CHECKING, ClassVar
|
||||
@@ -26,11 +25,12 @@ from noteflow.config.constants import (
|
||||
)
|
||||
from noteflow.domain.ports.calendar import OAuthPort
|
||||
from noteflow.domain.value_objects import OAuthProvider, OAuthState, OAuthTokens
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.config.settings import CalendarIntegrationSettings
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class OAuthError(Exception):
|
||||
|
||||
@@ -5,8 +5,8 @@ Implements CalendarPort for Outlook using Microsoft Graph API.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from datetime import UTC, datetime, timedelta
|
||||
from typing import Final
|
||||
|
||||
import httpx
|
||||
|
||||
@@ -21,14 +21,35 @@ from noteflow.config.constants import (
|
||||
)
|
||||
from noteflow.domain.ports.calendar import CalendarEventInfo, CalendarPort
|
||||
from noteflow.domain.value_objects import OAuthProvider
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# HTTP client configuration
|
||||
GRAPH_API_TIMEOUT: Final[float] = 30.0 # seconds
|
||||
MAX_CONNECTIONS: Final[int] = 10
|
||||
MAX_ERROR_BODY_LENGTH: Final[int] = 500
|
||||
GRAPH_API_MAX_PAGE_SIZE: Final[int] = 100 # Graph API maximum
|
||||
|
||||
|
||||
class OutlookCalendarError(Exception):
|
||||
"""Outlook Calendar API error."""
|
||||
|
||||
|
||||
def _truncate_error_body(body: str) -> str:
|
||||
"""Truncate error body to prevent log bloat.
|
||||
|
||||
Args:
|
||||
body: Raw error response body.
|
||||
|
||||
Returns:
|
||||
Truncated body with indicator if truncation occurred.
|
||||
"""
|
||||
if len(body) <= MAX_ERROR_BODY_LENGTH:
|
||||
return body
|
||||
return body[:MAX_ERROR_BODY_LENGTH] + "... (truncated)"
|
||||
|
||||
|
||||
class OutlookCalendarAdapter(CalendarPort):
|
||||
"""Microsoft Graph Calendar API adapter.
|
||||
|
||||
@@ -46,10 +67,13 @@ class OutlookCalendarAdapter(CalendarPort):
|
||||
) -> list[CalendarEventInfo]:
|
||||
"""Fetch upcoming calendar events from Outlook Calendar.
|
||||
|
||||
Implements pagination via @odata.nextLink to ensure all events
|
||||
within the limit are retrieved.
|
||||
|
||||
Args:
|
||||
access_token: Microsoft Graph OAuth token with Calendars.Read scope.
|
||||
hours_ahead: Hours to look ahead from current time.
|
||||
limit: Maximum events to return (capped by Graph API).
|
||||
limit: Maximum events to return.
|
||||
|
||||
Returns:
|
||||
List of Outlook calendar events ordered by start datetime.
|
||||
@@ -61,11 +85,18 @@ class OutlookCalendarAdapter(CalendarPort):
|
||||
start_time = now.strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
end_time = (now + timedelta(hours=hours_ahead)).strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
|
||||
url = f"{self.GRAPH_API_BASE}/me/calendarView"
|
||||
params: dict[str, str | int] = {
|
||||
headers = {
|
||||
HTTP_AUTHORIZATION: f"{HTTP_BEARER_PREFIX}{access_token}",
|
||||
"Prefer": 'outlook.timezone="UTC"',
|
||||
}
|
||||
|
||||
# Initial page request
|
||||
page_size = min(limit, GRAPH_API_MAX_PAGE_SIZE)
|
||||
url: str | None = f"{self.GRAPH_API_BASE}/me/calendarView"
|
||||
params: dict[str, str | int] | None = {
|
||||
"startDateTime": start_time,
|
||||
"endDateTime": end_time,
|
||||
"$top": limit,
|
||||
"$top": page_size,
|
||||
"$orderby": "start/dateTime",
|
||||
"$select": (
|
||||
"id,subject,start,end,location,bodyPreview,"
|
||||
@@ -73,26 +104,36 @@ class OutlookCalendarAdapter(CalendarPort):
|
||||
),
|
||||
}
|
||||
|
||||
headers = {
|
||||
HTTP_AUTHORIZATION: f"{HTTP_BEARER_PREFIX}{access_token}",
|
||||
"Prefer": 'outlook.timezone="UTC"',
|
||||
}
|
||||
all_events: list[CalendarEventInfo] = []
|
||||
|
||||
async with httpx.AsyncClient() as client:
|
||||
response = await client.get(url, params=params, headers=headers)
|
||||
async with httpx.AsyncClient(
|
||||
timeout=httpx.Timeout(GRAPH_API_TIMEOUT),
|
||||
limits=httpx.Limits(max_connections=MAX_CONNECTIONS),
|
||||
) as client:
|
||||
while url is not None:
|
||||
response = await client.get(url, params=params, headers=headers)
|
||||
|
||||
if response.status_code == HTTP_STATUS_UNAUTHORIZED:
|
||||
raise OutlookCalendarError(ERR_TOKEN_EXPIRED)
|
||||
if response.status_code == HTTP_STATUS_UNAUTHORIZED:
|
||||
raise OutlookCalendarError(ERR_TOKEN_EXPIRED)
|
||||
|
||||
if response.status_code != HTTP_STATUS_OK:
|
||||
error_msg = response.text
|
||||
logger.error("Microsoft Graph API error: %s", error_msg)
|
||||
raise OutlookCalendarError(f"{ERR_API_PREFIX}{error_msg}")
|
||||
if response.status_code != HTTP_STATUS_OK:
|
||||
error_body = _truncate_error_body(response.text)
|
||||
logger.error("Microsoft Graph API error: %s", error_body)
|
||||
raise OutlookCalendarError(f"{ERR_API_PREFIX}{error_body}")
|
||||
|
||||
data = response.json()
|
||||
items = data.get("value", [])
|
||||
data = response.json()
|
||||
items = data.get("value", [])
|
||||
|
||||
return [self._parse_event(item) for item in items]
|
||||
for item in items:
|
||||
all_events.append(self._parse_event(item))
|
||||
if len(all_events) >= limit:
|
||||
return all_events
|
||||
|
||||
# Check for next page
|
||||
url = data.get("@odata.nextLink")
|
||||
params = None # nextLink includes query params
|
||||
|
||||
return all_events
|
||||
|
||||
async def get_user_email(self, access_token: str) -> str:
|
||||
"""Get authenticated user's email address.
|
||||
@@ -110,16 +151,19 @@ class OutlookCalendarAdapter(CalendarPort):
|
||||
params = {"$select": "mail,userPrincipalName"}
|
||||
headers = {HTTP_AUTHORIZATION: f"{HTTP_BEARER_PREFIX}{access_token}"}
|
||||
|
||||
async with httpx.AsyncClient() as client:
|
||||
async with httpx.AsyncClient(
|
||||
timeout=httpx.Timeout(GRAPH_API_TIMEOUT),
|
||||
limits=httpx.Limits(max_connections=MAX_CONNECTIONS),
|
||||
) as client:
|
||||
response = await client.get(url, params=params, headers=headers)
|
||||
|
||||
if response.status_code == HTTP_STATUS_UNAUTHORIZED:
|
||||
raise OutlookCalendarError(ERR_TOKEN_EXPIRED)
|
||||
|
||||
if response.status_code != HTTP_STATUS_OK:
|
||||
error_msg = response.text
|
||||
logger.error("Microsoft Graph API error: %s", error_msg)
|
||||
raise OutlookCalendarError(f"{ERR_API_PREFIX}{error_msg}")
|
||||
error_body = _truncate_error_body(response.text)
|
||||
logger.error("Microsoft Graph API error: %s", error_body)
|
||||
raise OutlookCalendarError(f"{ERR_API_PREFIX}{error_body}")
|
||||
|
||||
data = response.json()
|
||||
# Prefer mail, fall back to userPrincipalName
|
||||
|
||||
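The Outlook `list_events` change above keeps requesting pages via `@odata.nextLink` until either the caller's limit is reached or the provider stops returning a next page. A minimal sketch of that pagination shape against a generic JSON API — the helper name and endpoint are hypothetical; `@odata.nextLink`, `value`, and `$top` follow the Graph conventions used in the hunk:

```python
import httpx


async def fetch_all(url: str, token: str, limit: int) -> list[dict]:
    """Follow @odata.nextLink pages until `limit` items are collected."""
    headers = {"Authorization": f"Bearer {token}"}
    params: dict[str, str | int] | None = {"$top": min(limit, 100)}
    items: list[dict] = []

    async with httpx.AsyncClient(timeout=httpx.Timeout(30.0)) as client:
        next_url: str | None = url
        while next_url is not None:
            response = await client.get(next_url, params=params, headers=headers)
            response.raise_for_status()
            data = response.json()
            items.extend(data.get("value", []))
            if len(items) >= limit:
                return items[:limit]
            # nextLink already embeds the query string, so drop the params.
            next_url = data.get("@odata.nextLink")
            params = None
    return items
```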
@@ -8,6 +8,7 @@ from __future__ import annotations
|
||||
from typing import TYPE_CHECKING
|
||||
from uuid import UUID
|
||||
|
||||
from noteflow.config.constants import RULE_FIELD_DESCRIPTION
|
||||
from noteflow.domain.ports.calendar import CalendarEventInfo
|
||||
from noteflow.infrastructure.triggers.calendar import CalendarEvent
|
||||
|
||||
@@ -70,7 +71,7 @@ class CalendarEventConverter:
|
||||
"calendar_id": calendar_id,
|
||||
"calendar_name": calendar_name,
|
||||
"title": event.title,
|
||||
"description": event.description,
|
||||
RULE_FIELD_DESCRIPTION: event.description,
|
||||
"start_time": event.start_time,
|
||||
"end_time": event.end_time,
|
||||
"location": event.location,
|
||||
|
||||
@@ -8,7 +8,6 @@ Requires optional dependencies: pip install noteflow[diarization]
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import os
|
||||
import warnings
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -16,6 +15,7 @@ from typing import TYPE_CHECKING
|
||||
from noteflow.config.constants import DEFAULT_SAMPLE_RATE, ERR_HF_TOKEN_REQUIRED
|
||||
from noteflow.infrastructure.diarization.dto import SpeakerTurn
|
||||
from noteflow.infrastructure.diarization.session import DiarizationSession
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
@@ -24,7 +24,7 @@ if TYPE_CHECKING:
|
||||
from numpy.typing import NDArray
|
||||
from pyannote.core import Annotation
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class DiarizationEngine:
|
||||
|
||||
@@ -6,20 +6,20 @@ without cross-talk. Shared models are loaded once and reused across sessions.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from collections.abc import Sequence
|
||||
from dataclasses import dataclass, field
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.config.constants import DEFAULT_SAMPLE_RATE
|
||||
from noteflow.infrastructure.diarization.dto import SpeakerTurn
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
from diart import SpeakerDiarization
|
||||
from numpy.typing import NDArray
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -1,6 +1,15 @@
|
||||
"""Logging infrastructure for NoteFlow."""
|
||||
"""Logging infrastructure for NoteFlow.
|
||||
|
||||
This module provides centralized logging with structlog, supporting:
|
||||
- Dual output (Rich console for development, JSON for observability)
|
||||
- Automatic context injection (request_id, user_id, workspace_id)
|
||||
- OpenTelemetry trace correlation
|
||||
- In-memory log buffer for UI streaming
|
||||
"""
|
||||
|
||||
from .config import LoggingConfig, configure_logging, get_logger
|
||||
from .log_buffer import LogBuffer, LogBufferHandler, LogEntry, get_log_buffer
|
||||
from .processors import add_noteflow_context, add_otel_trace_context
|
||||
from .structured import (
|
||||
generate_request_id,
|
||||
get_logging_context,
|
||||
@@ -16,8 +25,13 @@ __all__ = [
|
||||
"LogBuffer",
|
||||
"LogBufferHandler",
|
||||
"LogEntry",
|
||||
"LoggingConfig",
|
||||
"add_noteflow_context",
|
||||
"add_otel_trace_context",
|
||||
"configure_logging",
|
||||
"generate_request_id",
|
||||
"get_log_buffer",
|
||||
"get_logger",
|
||||
"get_logging_context",
|
||||
"get_request_id",
|
||||
"get_user_id",
|
||||
|
||||
src/noteflow/infrastructure/logging/config.py (new file, +185)
@@ -0,0 +1,185 @@
|
||||
"""Centralized logging configuration with dual output.
|
||||
|
||||
Configure structlog with Rich console + JSON file output for both development
|
||||
and observability use cases.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import sys
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import structlog
|
||||
|
||||
from .processors import build_processor_chain
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
from structlog.typing import Processor
|
||||
|
||||
# Default log level constant
|
||||
_DEFAULT_LEVEL = "INFO"
|
||||
|
||||
# Rich console width for traceback formatting
|
||||
_RICH_CONSOLE_WIDTH = 120
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class LoggingConfig:
|
||||
"""Configuration for centralized logging.
|
||||
|
||||
Attributes:
|
||||
level: Minimum log level (DEBUG, INFO, WARNING, ERROR).
|
||||
json_file: Optional path for JSON log file output.
|
||||
enable_console: Enable Rich console output.
|
||||
enable_json_console: Force JSON output to console (for production).
|
||||
enable_log_buffer: Feed logs to in-memory LogBuffer for UI streaming.
|
||||
enable_otel_context: Include OpenTelemetry trace/span IDs.
|
||||
enable_noteflow_context: Include request_id, user_id, workspace_id.
|
||||
console_colors: Enable Rich colors (auto-detect TTY if not set).
|
||||
"""
|
||||
|
||||
level: str = _DEFAULT_LEVEL
|
||||
json_file: Path | None = None
|
||||
enable_console: bool = True
|
||||
enable_json_console: bool = False
|
||||
enable_log_buffer: bool = True
|
||||
enable_otel_context: bool = True
|
||||
enable_noteflow_context: bool = True
|
||||
console_colors: bool = True
|
||||
|
||||
|
||||
# Log level name to constant mapping
|
||||
_LEVEL_MAP: dict[str, int] = {
|
||||
"DEBUG": logging.DEBUG,
|
||||
"INFO": logging.INFO,
|
||||
"WARNING": logging.WARNING,
|
||||
"ERROR": logging.ERROR,
|
||||
"CRITICAL": logging.CRITICAL,
|
||||
}
|
||||
|
||||
|
||||
def _get_log_level(level_name: str) -> int:
|
||||
"""Convert level name to logging constant."""
|
||||
return _LEVEL_MAP.get(level_name.upper(), logging.INFO)
|
||||
|
||||
|
||||
def _create_renderer(config: LoggingConfig) -> Processor:
|
||||
"""Create the appropriate renderer based on configuration.
|
||||
|
||||
Uses Rich console rendering for TTY output with colors and formatting,
|
||||
JSON for non-TTY or production environments.
|
||||
"""
|
||||
if config.enable_json_console or not sys.stderr.isatty():
|
||||
return structlog.processors.JSONRenderer()
|
||||
|
||||
# Use Rich console renderer for beautiful TTY output
|
||||
from rich.console import Console
|
||||
from rich.traceback import install as install_rich_traceback
|
||||
|
||||
# Install Rich traceback handler for better exception formatting
|
||||
install_rich_traceback(show_locals=False, width=_RICH_CONSOLE_WIDTH, suppress=[structlog])
|
||||
|
||||
Console(stderr=True, force_terminal=config.console_colors)
|
||||
return structlog.dev.ConsoleRenderer(
|
||||
colors=config.console_colors,
|
||||
exception_formatter=structlog.dev.rich_traceback,
|
||||
)
|
||||
|
||||
|
||||
def _configure_structlog(processors: Sequence[Processor]) -> None:
|
||||
"""Configure structlog with the processor chain."""
|
||||
structlog.configure(
|
||||
processors=[*processors, structlog.stdlib.ProcessorFormatter.wrap_for_formatter],
|
||||
wrapper_class=structlog.stdlib.BoundLogger,
|
||||
logger_factory=structlog.stdlib.LoggerFactory(),
|
||||
cache_logger_on_first_use=True,
|
||||
)
|
||||
|
||||
|
||||
def _setup_handlers(
|
||||
config: LoggingConfig,
|
||||
log_level: int,
|
||||
processors: Sequence[Processor],
|
||||
renderer: Processor,
|
||||
) -> None:
|
||||
"""Configure and attach handlers to the root logger."""
|
||||
formatter = structlog.stdlib.ProcessorFormatter(
|
||||
foreign_pre_chain=processors,
|
||||
processors=[structlog.stdlib.ProcessorFormatter.remove_processors_meta, renderer],
|
||||
)
|
||||
|
||||
root_logger = logging.getLogger()
|
||||
root_logger.setLevel(log_level)
|
||||
|
||||
# Clear existing handlers
|
||||
for handler in root_logger.handlers[:]:
|
||||
root_logger.removeHandler(handler)
|
||||
|
||||
if config.enable_console:
|
||||
console_handler = logging.StreamHandler(sys.stderr)
|
||||
console_handler.setFormatter(formatter)
|
||||
console_handler.setLevel(log_level)
|
||||
root_logger.addHandler(console_handler)
|
||||
|
||||
if config.json_file is not None:
|
||||
json_formatter = structlog.stdlib.ProcessorFormatter(
|
||||
foreign_pre_chain=processors,
|
||||
processors=[
|
||||
structlog.stdlib.ProcessorFormatter.remove_processors_meta,
|
||||
structlog.processors.JSONRenderer(),
|
||||
],
|
||||
)
|
||||
file_handler = logging.FileHandler(config.json_file)
|
||||
file_handler.setFormatter(json_formatter)
|
||||
file_handler.setLevel(log_level)
|
||||
root_logger.addHandler(file_handler)
|
||||
|
||||
if config.enable_log_buffer:
|
||||
from .log_buffer import LogBufferHandler, get_log_buffer
|
||||
|
||||
buffer_handler = LogBufferHandler(buffer=get_log_buffer(), level=log_level)
|
||||
root_logger.addHandler(buffer_handler)
|
||||
|
||||
|
||||
def configure_logging(
|
||||
config: LoggingConfig | None = None,
|
||||
*,
|
||||
level: str = _DEFAULT_LEVEL,
|
||||
json_file: Path | None = None,
|
||||
) -> None:
|
||||
"""Configure centralized logging with dual output.
|
||||
|
||||
Call once at application startup. Configures both structlog and stdlib
|
||||
logging for seamless integration.
|
||||
|
||||
Args:
|
||||
config: Full configuration object, or use keyword args.
|
||||
level: Log level (DEBUG, INFO, WARNING, ERROR).
|
||||
json_file: Optional path for JSON log file.
|
||||
"""
|
||||
if config is None:
|
||||
config = LoggingConfig(level=level, json_file=json_file)
|
||||
|
||||
log_level = _get_log_level(config.level)
|
||||
processors = build_processor_chain(config)
|
||||
renderer = _create_renderer(config)
|
||||
|
||||
_configure_structlog(processors)
|
||||
_setup_handlers(config, log_level, processors, renderer)
|
||||
|
||||
|
||||
def get_logger(name: str | None = None) -> structlog.stdlib.BoundLogger:
|
||||
"""Get a structlog logger instance.
|
||||
|
||||
Args:
|
||||
name: Optional logger name (defaults to calling module).
|
||||
|
||||
Returns:
|
||||
Configured structlog BoundLogger.
|
||||
"""
|
||||
return structlog.get_logger(name)
|
||||
src/noteflow/infrastructure/logging/processors.py (new file, +139)
@@ -0,0 +1,139 @@
|
||||
"""Custom structlog processors for NoteFlow logging.
|
||||
|
||||
Provide context injection, OpenTelemetry integration, and LogBuffer feeding.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import TYPE_CHECKING, Final
|
||||
|
||||
import structlog
|
||||
|
||||
# Log field name constants (avoid repeated string literals)
|
||||
_TRACE_ID: Final = "trace_id"
|
||||
_SPAN_ID: Final = "span_id"
|
||||
_PARENT_SPAN_ID: Final = "parent_span_id"
|
||||
_HEX_32: Final = "032x"
|
||||
_HEX_16: Final = "016x"
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
from structlog.typing import EventDict, Processor, WrappedLogger
|
||||
|
||||
from .config import LoggingConfig
|
||||
|
||||
|
||||
def add_noteflow_context(
|
||||
logger: WrappedLogger,
|
||||
method_name: str,
|
||||
event_dict: EventDict,
|
||||
) -> EventDict:
|
||||
"""Inject request_id, user_id, workspace_id from context vars.
|
||||
|
||||
Only adds values that are set and not already present in the event.
|
||||
|
||||
Args:
|
||||
logger: The wrapped logger instance.
|
||||
method_name: Name of the log method called.
|
||||
event_dict: Current event dictionary.
|
||||
|
||||
Returns:
|
||||
Updated event dictionary with context values.
|
||||
"""
|
||||
from .structured import get_logging_context
|
||||
|
||||
ctx = get_logging_context()
|
||||
for key, value in ctx.items():
|
||||
if value is not None and key not in event_dict:
|
||||
event_dict[key] = value
|
||||
return event_dict
|
||||
|
||||
|
||||
def add_otel_trace_context(
|
||||
logger: WrappedLogger,
|
||||
method_name: str,
|
||||
event_dict: EventDict,
|
||||
) -> EventDict:
|
||||
"""Inject OpenTelemetry trace/span IDs if available.
|
||||
|
||||
Gracefully handles missing OpenTelemetry installation.
|
||||
|
||||
Args:
|
||||
logger: The wrapped logger instance.
|
||||
method_name: Name of the log method called.
|
||||
event_dict: Current event dictionary.
|
||||
|
||||
Returns:
|
||||
Updated event dictionary with trace context.
|
||||
"""
|
||||
try:
|
||||
from opentelemetry import trace
|
||||
|
||||
span = trace.get_current_span()
|
||||
if span is not None and span.is_recording():
|
||||
ctx = span.get_span_context()
|
||||
if ctx is not None and ctx.is_valid:
|
||||
event_dict[_TRACE_ID] = format(ctx.trace_id, _HEX_32)
|
||||
event_dict[_SPAN_ID] = format(ctx.span_id, _HEX_16)
|
||||
# Parent span ID if available
|
||||
parent = getattr(span, "parent", None)
|
||||
if parent is not None:
|
||||
parent_ctx = getattr(parent, _SPAN_ID, None)
|
||||
if parent_ctx is not None:
|
||||
event_dict[_PARENT_SPAN_ID] = format(parent_ctx, _HEX_16)
|
||||
except ImportError:
|
||||
pass
|
||||
except (AttributeError, TypeError):
|
||||
# Graceful degradation for edge cases
|
||||
pass
|
||||
return event_dict
|
||||
|
||||
|
||||
def build_processor_chain(config: LoggingConfig) -> Sequence[Processor]:
|
||||
"""Build the structlog processor chain based on configuration.
|
||||
|
||||
Args:
|
||||
config: Logging configuration.
|
||||
|
||||
Returns:
|
||||
Sequence of processors in execution order.
|
||||
"""
|
||||
processors: list[Processor] = [
|
||||
# Filter by level early
|
||||
structlog.stdlib.filter_by_level,
|
||||
# Add standard fields
|
||||
structlog.stdlib.add_logger_name,
|
||||
structlog.stdlib.add_log_level,
|
||||
# Handle %-style formatting from legacy code
|
||||
structlog.stdlib.PositionalArgumentsFormatter(),
|
||||
# ISO 8601 timestamp
|
||||
structlog.processors.TimeStamper(fmt="iso"),
|
||||
]
|
||||
|
||||
# Context injection (optional based on config)
|
||||
if config.enable_noteflow_context:
|
||||
processors.append(add_noteflow_context)
|
||||
|
||||
if config.enable_otel_context:
|
||||
processors.append(add_otel_trace_context)
|
||||
|
||||
# Additional standard processors
|
||||
processors.extend([
|
||||
# Add callsite information (file, function, line)
|
||||
structlog.processors.CallsiteParameterAdder(
|
||||
parameters=[
|
||||
structlog.processors.CallsiteParameter.FILENAME,
|
||||
structlog.processors.CallsiteParameter.FUNC_NAME,
|
||||
structlog.processors.CallsiteParameter.LINENO,
|
||||
]
|
||||
),
|
||||
# Stack traces if requested
|
||||
structlog.processors.StackInfoRenderer(),
|
||||
# Exception formatting
|
||||
structlog.processors.format_exc_info,
|
||||
# Decode bytes to strings
|
||||
structlog.processors.UnicodeDecoder(),
|
||||
])
|
||||
|
||||
return processors
|
||||
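Processors in this chain are plain callables of `(logger, method_name, event_dict) -> event_dict`, so a project-specific field can be injected the same way `add_noteflow_context` is. A hedged sketch of one extra processor — the field name and version constant are made up for illustration:

```python
from __future__ import annotations

from structlog.typing import EventDict, WrappedLogger

SERVICE_VERSION = "0.0.0-dev"  # hypothetical build metadata


def add_service_version(
    logger: WrappedLogger, method_name: str, event_dict: EventDict
) -> EventDict:
    """Stamp every event with the running service version."""
    event_dict.setdefault("service_version", SERVICE_VERSION)
    return event_dict


# It would be appended to the chain alongside the context processors,
# e.g. processors.append(add_service_version) inside build_processor_chain.
```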
@@ -3,14 +3,15 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import time
|
||||
from collections import deque
|
||||
from dataclasses import dataclass
|
||||
|
||||
import psutil
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
|
||||
@@ -6,7 +6,6 @@ Provides named entity extraction with lazy model loading and segment tracking.
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from functools import partial
|
||||
from typing import TYPE_CHECKING, Final
|
||||
|
||||
@@ -17,11 +16,12 @@ from noteflow.config.constants import (
|
||||
SPACY_MODEL_TRF,
|
||||
)
|
||||
from noteflow.domain.entities.named_entity import EntityCategory, NamedEntity
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from spacy.language import Language
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Map spaCy entity types to our categories
|
||||
_SPACY_CATEGORY_MAP: Final[dict[str, EntityCategory]] = {
|
||||
|
||||
@@ -7,12 +7,13 @@ this module gracefully degrades to no-op behavior.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from contextlib import AbstractContextManager
|
||||
from functools import cache
|
||||
from typing import Protocol, cast
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Track whether OpenTelemetry is available and configured
|
||||
_otel_configured: bool = False
|
||||
@@ -47,6 +48,7 @@ def configure_observability(
|
||||
*,
|
||||
enable_grpc_instrumentation: bool = True,
|
||||
otlp_endpoint: str | None = None,
|
||||
otlp_insecure: bool | None = None,
|
||||
) -> bool:
|
||||
"""Initialize OpenTelemetry trace and metrics providers.
|
||||
|
||||
@@ -58,6 +60,8 @@ def configure_observability(
|
||||
service_name: Service name for resource identification.
|
||||
enable_grpc_instrumentation: Whether to auto-instrument gRPC.
|
||||
otlp_endpoint: Optional OTLP endpoint for exporting telemetry.
|
||||
otlp_insecure: Use insecure connection. If None, infers from endpoint
|
||||
scheme (http:// = insecure, https:// = secure).
|
||||
|
||||
Returns:
|
||||
True if configuration succeeded, False if OTel is not available.
|
||||
@@ -96,9 +100,20 @@ def configure_observability(
|
||||
)
|
||||
from opentelemetry.sdk.trace.export import BatchSpanProcessor
|
||||
|
||||
otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint, insecure=True)
|
||||
# Determine insecure mode: explicit setting or infer from scheme
|
||||
if otlp_insecure is not None:
|
||||
use_insecure = otlp_insecure
|
||||
else:
|
||||
# Infer from endpoint scheme: http:// = insecure, https:// = secure
|
||||
use_insecure = otlp_endpoint.startswith("http://")
|
||||
|
||||
otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint, insecure=use_insecure)
|
||||
tracer_provider.add_span_processor(BatchSpanProcessor(otlp_exporter))
|
||||
logger.info("OTLP trace exporter configured: %s", otlp_endpoint)
|
||||
logger.info(
|
||||
"OTLP trace exporter configured: %s (insecure=%s)",
|
||||
otlp_endpoint,
|
||||
use_insecure,
|
||||
)
|
||||
except ImportError:
|
||||
logger.warning("OTLP exporter not available, traces will not be exported")
|
||||
|
||||
|
||||
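With the change above, the exporter's `insecure` flag is either passed explicitly or inferred from the endpoint scheme. A minimal call sketch using the signature shown in this hunk; the endpoint value is illustrative (4317 is the conventional OTLP gRPC port):

```python
from noteflow.infrastructure.observability.otel import configure_observability

# http:// scheme => insecure channel is inferred when otlp_insecure is left as None.
ok = configure_observability(
    service_name="noteflow-server",
    otlp_endpoint="http://localhost:4317",
)
if not ok:
    print("OpenTelemetry not installed; running without traces")
```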
@@ -16,6 +16,7 @@ from noteflow.application.observability.ports import (
|
||||
UsageEvent,
|
||||
UsageEventSink,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.observability.otel import _check_otel_available
|
||||
|
||||
if TYPE_CHECKING:
|
||||
@@ -25,7 +26,7 @@ if TYPE_CHECKING:
|
||||
SqlAlchemyUsageEventRepository,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class LoggingUsageEventSink:
|
||||
|
||||
@@ -3,7 +3,6 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
@@ -18,10 +17,12 @@ from sqlalchemy.ext.asyncio import (
|
||||
create_async_engine as sa_create_async_engine,
|
||||
)
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.config import Settings
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def create_async_engine(settings: Settings) -> AsyncEngine:
|
||||
|
||||
@@ -1,13 +1,13 @@
|
||||
"""File system asset repository."""
|
||||
|
||||
import logging
|
||||
import shutil
|
||||
from pathlib import Path
|
||||
|
||||
from noteflow.domain.ports.repositories import AssetRepository
|
||||
from noteflow.domain.value_objects import MeetingId
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class FileSystemAssetRepository(AssetRepository):
|
||||
|
||||
@@ -50,15 +50,15 @@ class SqlAlchemyProjectRepository(BaseRepository):
|
||||
if settings.export_rules is not None:
|
||||
export_data: dict[str, object] = {}
|
||||
if settings.export_rules.default_format is not None:
|
||||
export_data["default_format"] = settings.export_rules.default_format.value
|
||||
export_data[RULE_FIELD_DEFAULT_FORMAT] = settings.export_rules.default_format.value
|
||||
if settings.export_rules.include_audio is not None:
|
||||
export_data["include_audio"] = settings.export_rules.include_audio
|
||||
export_data[RULE_FIELD_INCLUDE_AUDIO] = settings.export_rules.include_audio
|
||||
if settings.export_rules.include_timestamps is not None:
|
||||
export_data["include_timestamps"] = settings.export_rules.include_timestamps
|
||||
export_data[RULE_FIELD_INCLUDE_TIMESTAMPS] = settings.export_rules.include_timestamps
|
||||
if settings.export_rules.template_id is not None:
|
||||
export_data["template_id"] = str(settings.export_rules.template_id)
|
||||
export_data[RULE_FIELD_TEMPLATE_ID] = str(settings.export_rules.template_id)
|
||||
if export_data:
|
||||
data["export_rules"] = export_data
|
||||
data[RULE_FIELD_EXPORT_RULES] = export_data
|
||||
|
||||
if settings.trigger_rules is not None:
|
||||
trigger_data: dict[str, object] = {}
|
||||
|
||||
@@ -71,11 +71,18 @@ class SqlAlchemyWorkspaceRepository(BaseRepository):
|
||||
app_match_patterns=trigger_data.get(RULE_FIELD_APP_MATCH_PATTERNS),
|
||||
)
|
||||
|
||||
# Extract and validate optional settings with type narrowing
|
||||
rag_enabled_raw = data.get("rag_enabled")
|
||||
rag_enabled = rag_enabled_raw if isinstance(rag_enabled_raw, bool) else None
|
||||
|
||||
template_raw = data.get("default_summarization_template")
|
||||
template = template_raw if isinstance(template_raw, str) else None
|
||||
|
||||
return WorkspaceSettings(
|
||||
export_rules=export_rules,
|
||||
trigger_rules=trigger_rules,
|
||||
rag_enabled=data.get("rag_enabled"), # type: ignore[arg-type]
|
||||
default_summarization_template=data.get("default_summarization_template"), # type: ignore[arg-type]
|
||||
rag_enabled=rag_enabled,
|
||||
default_summarization_template=template,
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
|
||||
@@ -5,7 +5,6 @@ Provides AES-GCM encryption for audio data with envelope encryption.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import secrets
|
||||
import struct
|
||||
from collections.abc import Iterator
|
||||
@@ -15,23 +14,45 @@ from typing import TYPE_CHECKING, BinaryIO, Final
|
||||
from cryptography.exceptions import InvalidTag
|
||||
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
|
||||
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.security.protocols import EncryptedChunk
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from noteflow.infrastructure.security.keystore import InMemoryKeyStore, KeyringKeyStore
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Constants
|
||||
KEY_SIZE: Final[int] = 32 # 256-bit key
|
||||
NONCE_SIZE: Final[int] = 12 # 96-bit nonce for AES-GCM
|
||||
TAG_SIZE: Final[int] = 16 # 128-bit authentication tag
|
||||
MIN_CHUNK_LENGTH: Final[int] = NONCE_SIZE + TAG_SIZE # Minimum valid encrypted chunk
|
||||
|
||||
# File format magic number and version
|
||||
FILE_MAGIC: Final[bytes] = b"NFAE" # NoteFlow Audio Encrypted
|
||||
FILE_VERSION: Final[int] = 1
|
||||
|
||||
|
||||
def _read_exact(handle: BinaryIO, size: int, description: str) -> bytes:
|
||||
"""Read exactly size bytes or raise ValueError.
|
||||
|
||||
Args:
|
||||
handle: File handle to read from.
|
||||
size: Number of bytes to read.
|
||||
description: Description for error message.
|
||||
|
||||
Returns:
|
||||
Exactly size bytes.
|
||||
|
||||
Raises:
|
||||
ValueError: If fewer than size bytes available.
|
||||
"""
|
||||
data = handle.read(size)
|
||||
if len(data) < size:
|
||||
raise ValueError(f"Truncated {description}: expected {size}, got {len(data)}")
|
||||
return data
|
||||
|
||||
|
||||
class AesGcmCryptoBox:
|
||||
"""AES-GCM based encryption with envelope encryption.
|
||||
|
||||
@@ -263,7 +284,14 @@ class ChunkedAssetReader:
|
||||
self._handle = None
|
||||
raise ValueError(f"Invalid file format: expected {FILE_MAGIC!r}, got {magic!r}")
|
||||
|
||||
version = struct.unpack("B", self._handle.read(1))[0]
|
||||
try:
|
||||
version_bytes = _read_exact(self._handle, 1, "version header")
|
||||
except ValueError as e:
|
||||
self._handle.close()
|
||||
self._handle = None
|
||||
raise ValueError(f"Invalid file format: {e}") from e
|
||||
|
||||
version = struct.unpack("B", version_bytes)[0]
|
||||
if version != FILE_VERSION:
|
||||
self._handle.close()
|
||||
self._handle = None
|
||||
@@ -279,15 +307,22 @@ class ChunkedAssetReader:
|
||||
while True:
|
||||
# Read chunk length
|
||||
length_bytes = self._handle.read(4)
|
||||
if len(length_bytes) == 0:
|
||||
break # Clean end of file
|
||||
if len(length_bytes) < 4:
|
||||
break # End of file
|
||||
raise ValueError("Truncated chunk length header")
|
||||
|
||||
chunk_length = struct.unpack(">I", length_bytes)[0]
|
||||
|
||||
# Validate minimum chunk size (nonce + tag at minimum)
|
||||
if chunk_length < MIN_CHUNK_LENGTH:
|
||||
raise ValueError(
|
||||
f"Invalid chunk length {chunk_length}: "
|
||||
f"minimum is {MIN_CHUNK_LENGTH} (nonce + tag)"
|
||||
)
|
||||
|
||||
# Read chunk data
|
||||
chunk_data = self._handle.read(chunk_length)
|
||||
if len(chunk_data) < chunk_length:
|
||||
raise ValueError("Truncated chunk")
|
||||
chunk_data = _read_exact(self._handle, chunk_length, "chunk data")
|
||||
|
||||
# Parse chunk (nonce + ciphertext + tag)
|
||||
nonce = chunk_data[:NONCE_SIZE]
|
||||
|
||||
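The reader hunks above validate a simple on-disk framing: a 4-byte big-endian length prefix followed by nonce + ciphertext + tag, with `NONCE_SIZE + TAG_SIZE` as the minimum valid length. A standalone byte-level sketch of that framing, assuming a single frame in memory (not the actual ChunkedAssetReader, which also checks the NFAE magic and version byte):

```python
import struct

NONCE_SIZE = 12   # 96-bit AES-GCM nonce
TAG_SIZE = 16     # 128-bit authentication tag
MIN_CHUNK_LENGTH = NONCE_SIZE + TAG_SIZE


def split_chunk(frame: bytes) -> tuple[bytes, bytes]:
    """Split one length-prefixed frame into (nonce, ciphertext_with_tag)."""
    (length,) = struct.unpack(">I", frame[:4])
    if length < MIN_CHUNK_LENGTH:
        raise ValueError(f"Invalid chunk length {length}: minimum is {MIN_CHUNK_LENGTH}")
    body = frame[4 : 4 + length]
    if len(body) < length:
        raise ValueError("Truncated chunk")
    # AES-GCM decryption consumes ciphertext and tag together, so keep them joined.
    return body[:NONCE_SIZE], body[NONCE_SIZE:]
```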
@@ -5,7 +5,6 @@ Provides secure master key storage using OS credential stores.
|
||||
|
||||
import base64
|
||||
import binascii
|
||||
import logging
|
||||
import os
|
||||
import secrets
|
||||
import stat
|
||||
@@ -15,8 +14,9 @@ from typing import Final
|
||||
import keyring
|
||||
|
||||
from noteflow.config.constants import APP_DIR_NAME
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
# Constants
|
||||
KEY_SIZE: Final[int] = 32 # 256-bit key
|
||||
|
||||
@@ -3,12 +3,12 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
from datetime import UTC, datetime
|
||||
from typing import TYPE_CHECKING, TypedDict, cast
|
||||
|
||||
from noteflow.domain.entities import ActionItem, KeyPoint, Summary
|
||||
from noteflow.domain.summarization import InvalidResponseError
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
@@ -16,7 +16,7 @@ if TYPE_CHECKING:
|
||||
from noteflow.domain.entities import Segment
|
||||
from noteflow.domain.summarization import SummarizationRequest
|
||||
|
||||
_logger = logging.getLogger(__name__)
|
||||
_logger = get_logger(__name__)
|
||||
|
||||
|
||||
class _KeyPointData(TypedDict, total=False):
|
||||
|
||||
@@ -1,17 +1,16 @@
|
||||
"""Factory for creating configured SummarizationService instances."""
|
||||
|
||||
import logging
|
||||
|
||||
from noteflow.application.services.summarization_service import (
|
||||
SummarizationMode,
|
||||
SummarizationService,
|
||||
SummarizationServiceSettings,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.summarization.citation_verifier import SegmentCitationVerifier
|
||||
from noteflow.infrastructure.summarization.mock_provider import MockSummarizer
|
||||
from noteflow.infrastructure.summarization.ollama_provider import OllamaSummarizer
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def create_summarization_service(
|
||||
|
||||
@@ -3,15 +3,12 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import time
|
||||
from datetime import UTC, datetime
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.domain.entities import Summary
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
from noteflow.domain.summarization import (
|
||||
InvalidResponseError,
|
||||
ProviderUnavailableError,
|
||||
@@ -19,6 +16,7 @@ from noteflow.domain.summarization import (
|
||||
SummarizationResult,
|
||||
SummarizationTimeoutError,
|
||||
)
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.summarization._parsing import (
|
||||
SYSTEM_PROMPT,
|
||||
build_transcript_prompt,
|
||||
@@ -28,6 +26,8 @@ from noteflow.infrastructure.summarization._parsing import (
|
||||
if TYPE_CHECKING:
|
||||
import ollama
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def _get_ollama_settings() -> tuple[str, float, float]:
|
||||
"""Get Ollama settings with fallback defaults.
|
||||
|
||||
@@ -7,7 +7,6 @@ This is a best-effort heuristic: it combines (a) system output activity and
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import time
|
||||
from dataclasses import dataclass, field
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -15,6 +14,7 @@ from typing import TYPE_CHECKING
|
||||
from noteflow.config.constants import DEFAULT_SAMPLE_RATE
|
||||
from noteflow.domain.triggers.entities import TriggerSignal, TriggerSource
|
||||
from noteflow.infrastructure.audio.levels import RmsLevelProvider
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
from noteflow.infrastructure.triggers.audio_activity import (
|
||||
AudioActivityProvider,
|
||||
AudioActivitySettings,
|
||||
@@ -24,7 +24,7 @@ if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
from numpy.typing import NDArray
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
@@ -12,6 +12,7 @@ from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
from noteflow.domain.triggers.entities import TriggerSignal, TriggerSource
|
||||
from noteflow.infrastructure.logging import get_logger
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy as np
|
||||
@@ -19,6 +20,8 @@ if TYPE_CHECKING:
|
||||
|
||||
from noteflow.infrastructure.audio import RmsLevelProvider
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
class AudioActivitySettings:
|
||||
@@ -71,6 +74,20 @@ class AudioActivityProvider:
|
||||
self._settings = settings
|
||||
self._history: deque[tuple[float, bool]] = deque(maxlen=self._settings.max_history)
|
||||
self._lock = threading.Lock()
|
||||
self._last_signal_state: bool = False
|
||||
self._frame_count: int = 0
|
||||
self._active_frame_count: int = 0
|
||||
|
||||
logger.info(
|
||||
"Audio activity provider initialized",
|
||||
enabled=settings.enabled,
|
||||
threshold_db=settings.threshold_db,
|
||||
window_seconds=settings.window_seconds,
|
||||
min_active_ratio=settings.min_active_ratio,
|
||||
min_samples=settings.min_samples,
|
||||
max_history=settings.max_history,
|
||||
weight=settings.weight,
|
||||
)
|
||||
|
||||
@property
|
||||
def source(self) -> TriggerSource:
|
||||
@@ -98,6 +115,20 @@ class AudioActivityProvider:
|
||||
is_active = db >= self._settings.threshold_db
|
||||
with self._lock:
|
||||
self._history.append((timestamp, is_active))
|
||||
self._frame_count += 1
|
||||
if is_active:
|
||||
self._active_frame_count += 1
|
||||
|
||||
# Log summary every 100 frames to avoid spam
|
||||
if self._frame_count % 100 == 0:
|
||||
logger.debug(
|
||||
"Audio activity update summary",
|
||||
frame_count=self._frame_count,
|
||||
active_frames=self._active_frame_count,
|
||||
history_size=len(self._history),
|
||||
last_db=round(db, 1),
|
||||
last_active=is_active,
|
||||
)
|
||||
|
||||
def get_signal(self) -> TriggerSignal | None:
|
||||
"""Get current signal if sustained activity detected.
|
||||
@@ -113,21 +144,62 @@ class AudioActivityProvider:
|
||||
history = list(self._history)
|
||||
|
||||
if len(history) < self._settings.min_samples:
|
||||
logger.debug(
|
||||
"Insufficient samples for signal evaluation",
|
||||
history_size=len(history),
|
||||
min_samples=self._settings.min_samples,
|
||||
)
|
||||
return None
|
||||
|
||||
# Prune old samples outside window
|
||||
now = time.monotonic()
|
||||
cutoff = now - self._settings.window_seconds
|
||||
recent = [(ts, active) for ts, active in history if ts >= cutoff]
|
||||
pruned_count = len(history) - len(recent)
|
||||
|
||||
if pruned_count > 0:
|
||||
logger.debug(
|
||||
"Pruned old samples from history",
|
||||
pruned_count=pruned_count,
|
||||
remaining_count=len(recent),
|
||||
window_seconds=self._settings.window_seconds,
|
||||
)
|
||||
|
||||
if len(recent) < self._settings.min_samples:
|
||||
logger.debug(
|
||||
"Insufficient recent samples after pruning",
|
||||
recent_count=len(recent),
|
||||
min_samples=self._settings.min_samples,
|
||||
)
|
||||
return None
|
||||
|
||||
# Calculate activity ratio
|
||||
active_count = sum(bool(active) for _, active in recent)
|
||||
ratio = active_count / len(recent)
|
||||
signal_detected = ratio >= self._settings.min_active_ratio
|
||||
|
||||
if ratio < self._settings.min_active_ratio:
|
||||
# Log state transitions (signal detected vs not)
|
||||
if signal_detected != self._last_signal_state:
|
||||
if signal_detected:
|
||||
logger.info(
|
||||
"Audio activity signal detected",
|
||||
activity_ratio=round(ratio, 3),
|
||||
min_active_ratio=self._settings.min_active_ratio,
|
||||
active_count=active_count,
|
||||
sample_count=len(recent),
|
||||
weight=self.max_weight,
|
||||
)
|
||||
else:
|
||||
logger.info(
|
||||
"Audio activity signal cleared",
|
||||
activity_ratio=round(ratio, 3),
|
||||
min_active_ratio=self._settings.min_active_ratio,
|
||||
active_count=active_count,
|
||||
sample_count=len(recent),
|
||||
)
|
||||
self._last_signal_state = signal_detected
|
||||
|
||||
if not signal_detected:
|
||||
return None
|
||||
|
||||
return TriggerSignal(source=self.source, weight=self.max_weight)
|
||||
@@ -139,4 +211,13 @@ class AudioActivityProvider:
|
||||
def clear_history(self) -> None:
|
||||
"""Clear activity history. Useful when recording starts."""
|
||||
with self._lock:
|
||||
previous_size = len(self._history)
|
||||
self._history.clear()
|
||||
self._frame_count = 0
|
||||
self._active_frame_count = 0
|
||||
self._last_signal_state = False
|
||||
|
||||
logger.debug(
|
||||
"Audio activity history cleared",
|
||||
previous_size=previous_size,
|
||||
)
|
||||
|
||||
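The AudioActivityProvider changes above log at INFO only when the detected/cleared state flips and emit a DEBUG summary every 100 frames, so steady-state activity does not flood the log. A detached sketch of that edge-triggered pattern — the class name and method are hypothetical, only the logging shape mirrors the hunk:

```python
from noteflow.infrastructure.logging import get_logger

logger = get_logger(__name__)


class EdgeTriggeredFlag:
    """Log only when a boolean condition changes state, not on every sample."""

    def __init__(self) -> None:
        self._last_state = False

    def update(self, active: bool, ratio: float) -> None:
        if active != self._last_state:
            if active:
                logger.info("signal detected", activity_ratio=round(ratio, 3))
            else:
                logger.info("signal cleared", activity_ratio=round(ratio, 3))
            self._last_state = active
```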