# Technical Debt Triage

This document tracks known issues, technical debt, and areas needing improvement.

---

## Insufficient Logging - Comprehensive Audit

**Discovered:** 2025-12-31
**Last Verified:** 2026-01-03
**Sprint Docs:**
- `docs/sprints/sprint_logging_gap_remediation_p1.md`
- `docs/sprints/sprint_logging_gap_remediation_p2.md`

**Impact:** Caused ~1 hour of debugging when Ollama 120s timeout appeared as migration hang
**Total Issues Found:** 100+

---

## 1. Network/External Service Connections

### 1.1 CRITICAL: Ollama Availability Check - Silent 120s Timeout

**File:** `src/noteflow/infrastructure/summarization/ollama_provider.py:101-115`

```python
@property
def is_available(self) -> bool:
    try:
        client = self._get_client()
        client.list()  # Silent 120-second timeout!
        return True
    except (ConnectionError, TimeoutError, ...):
        return False
```

**Status (2026-01-03):** Resolved — `log_timing` added around availability check.

**Fix:**

```python
@property
def is_available(self) -> bool:
    try:
        logger.info("Checking Ollama availability at %s (timeout: %.0fs)...", self._host, self._timeout)
        client = self._get_client()
        client.list()
        logger.info("Ollama server is available")
        return True
    except TimeoutError:
        logger.warning("Ollama server timeout at %s after %.0fs", self._host, self._timeout)
        return False
    except (ConnectionError, RuntimeError, OSError) as e:
        logger.debug("Ollama server unreachable at %s: %s", self._host, e)
        return False
```

---

### 1.2 Cloud Summarization API Calls - No Request Logging

**File:** `src/noteflow/infrastructure/summarization/cloud_provider.py:238-282`

```python
def _call_openai(self, user_prompt: str, system_prompt: str) -> tuple[str, int | None]:
    try:
        response = client.chat.completions.create(...)  # No timing logged
    except TimeoutError as e:
        raise SummarizationTimeoutError(...)  # No duration logged
```

**Status (2026-01-03):** Resolved — `log_timing` wraps OpenAI/Anthropic calls and response logging added.
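
The `log_timing` helper named in the status lines is not shown in this document. As a rough sketch of what such a helper might look like, a context manager can wrap any call and log its start, duration, and failure. The signature, the log message wording, and the `openai_call` label below are assumptions for illustration, not the project's actual API.

```python
import logging
import time
from contextlib import contextmanager
from typing import Iterator

logger = logging.getLogger(__name__)


@contextmanager
def log_timing(operation: str, **context: object) -> Iterator[None]:
    """Hypothetical helper: log start, duration, and failure of a block."""
    details = " ".join(f"{key}={value}" for key, value in context.items())
    logger.info("Starting %s: %s", operation, details)
    start = time.perf_counter()
    try:
        yield
    except TimeoutError:
        # Timeouts get their own message so they stand out in the logs.
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.error("%s timeout after %.2fms", operation, elapsed_ms)
        raise
    except Exception as exc:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.error("%s failed after %.2fms: %s", operation, elapsed_ms, exc)
        raise
    else:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s completed: duration_ms=%.2f", operation, elapsed_ms)
```

A call site would then read `with log_timing("openai_call", model=self._model): ...`, which keeps the timing logic out of each provider method.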
**Fix:** Add `logger.info("Initiating OpenAI API call: model=%s", self._model)` before call, log duration after.

---

### 1.3 Google Calendar API - No Request Logging

**File:** `src/noteflow/infrastructure/calendar/google_adapter.py:76-91`

```python
async with httpx.AsyncClient() as client:
    response = await client.get(url, params=params, headers=headers)  # No logging
```

**Status (2026-01-03):** Resolved — request timing logged via `log_timing`.

**Fix:** Log request start, duration, and response status.

---

### 1.4 OAuth Token Refresh - Missing Timing

**File:** `src/noteflow/infrastructure/calendar/oauth_manager.py:211-222`

```python
async def refresh_tokens(...) -> OAuthTokens:
    response = await client.post(token_url, data=data)  # No timing
```

**Status (2026-01-03):** Resolved — refresh timing logged via `log_timing`.

---

### 1.5 Webhook Delivery - Missing Initial Request Log

**File:** `src/noteflow/infrastructure/webhooks/executor.py:107-237`

```python
async def deliver(...) -> WebhookDelivery:
    for attempt in range(1, max_retries + 1):
        _logger.debug("Webhook delivery attempt %d/%d", attempt, max_retries)  # DEBUG only
```

**Status (2026-01-03):** Resolved — info log at delivery start + completion.

---

### 1.6 Database Connection Creation - No Logging

**File:** `src/noteflow/infrastructure/persistence/database.py:85-116`

```python
def create_engine_and_session_factory(...):
    engine = sa_create_async_engine(database_url, pool_size=pool_size, ...)
    # No logging of connection parameters
```

**Status (2026-01-03):** Resolved — engine creation logged with masked URL.

---

### 1.7 Rust gRPC Client Connection - No Tracing

**File:** `client/src-tauri/src/grpc/client/core.rs:174-197`

```rust
async fn perform_connect(&self) -> Result {
    let channel = endpoint.connect().await  // No tracing before/after
        .map_err(|e| Error::Connection(...))?;
```

**Status (2026-01-03):** Not implemented — see P1 sprint.

---

## 2. Blocking/Long-Running Operations

### 2.1 NER Service - Silent Model Warmup

**File:** `src/noteflow/application/services/ner_service.py:185-197`

```python
await loop.run_in_executor(
    None,
    lambda: self._ner_engine.extract("warm up"),  # No logging
)
```

**Status (2026-01-03):** Not implemented — see P1 sprint.

---

### 2.2 ASR Transcription - No Duration Logging

**File:** `src/noteflow/infrastructure/asr/engine.py:156-177`

```python
async def transcribe_async(...) -> list[AsrResult]:
    return await loop.run_in_executor(None, ...)  # No timing
```

**Status (2026-01-03):** Not implemented — see P1 sprint.

---

### 2.3 Diarization - Missing Blocking Operation Logging

**File:** `src/noteflow/infrastructure/diarization/engine.py:299-347`

```python
def diarize_full(...) -> Sequence[SpeakerTurn]:
    logger.debug("Running offline diarization on %.2fs audio", ...)  # DEBUG only
    annotation = self._offline_pipeline(waveform, ...)
    # No end logging
```

**Status (2026-01-03):** Resolved — `log_timing` wraps diarization.

---

### 2.4 Diarization Job Timeout - No Pre-Timeout Context

**File:** `src/noteflow/grpc/_mixins/diarization/_jobs.py:173-186`

```python
async with asyncio.timeout(DIARIZATION_TIMEOUT_SECONDS):
    updated_count = await self.refine_speaker_diarization(...)
    # No logging of timeout value before entering block
```

**Status (2026-01-03):** Resolved — timeout value logged in job handler.

---

## 3. Error Handling - Silent Failures

### 3.1 Silent ValueError Returns

**Files:**
- `src/noteflow/grpc/_mixins/meeting.py:64-67` - workspace UUID parse
- `src/noteflow/grpc/_mixins/converters.py:76-79` - meeting ID parse
- `src/noteflow/grpc/_mixins/diarization/_jobs.py:84-87` - meeting ID validation
- `src/noteflow/infrastructure/triggers/calendar.py:141-144` - datetime parse

```python
try:
    UUID(workspace_id)
except ValueError:
    return None  # Silent failure, no logging
```

**Status (2026-01-03):** Not implemented — add WARN + redaction (P1 sprint).
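
The "WARN + redaction" remediation for the silent `ValueError` returns could be sketched roughly as follows. The helper names `redact` and `parse_workspace_id`, the number of characters kept, and the message wording are all hypothetical; the actual P1 change may differ.

```python
import logging
from typing import Optional
from uuid import UUID

logger = logging.getLogger(__name__)


def redact(value: str, keep: int = 4) -> str:
    """Hypothetical helper: keep only a short prefix so logs do not leak full IDs."""
    if len(value) <= keep:
        return "***"
    return f"{value[:keep]}***"


def parse_workspace_id(workspace_id: str) -> Optional[UUID]:
    """Parse a workspace UUID, warning on bad input instead of failing silently."""
    try:
        return UUID(workspace_id)
    except ValueError:
        # The raw value is redacted because it comes from the caller.
        logger.warning("Invalid workspace_id (redacted): %s", redact(workspace_id))
        return None
```

The return value is unchanged (`None` on bad input), so callers keep working while the failure becomes visible in the logs.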

---

### 3.2 Silent Settings Fallbacks

**Files:**
- `src/noteflow/infrastructure/webhooks/executor.py:56-65`
- `src/noteflow/infrastructure/summarization/ollama_provider.py:44-48`
- `src/noteflow/infrastructure/summarization/cloud_provider.py:48-52`
- `src/noteflow/grpc/_mixins/diarization_job.py:63-66`

```python
except Exception:
    return DEFAULT_VALUES  # No logging that fallback occurred
```

**Status (2026-01-03):** Not implemented — add warning logs (P1 sprint).

---

### 3.3 gRPC Client Stub Unavailable - Silent Returns

**Files:** `src/noteflow/grpc/_client_mixins/*.py` (multiple locations)

```python
if not self._stub:
    return None  # No logging of connection issue
```

**Status (2026-01-03):** Not implemented — add rate-limited warn log (P1 sprint).

---

## 4. State Transitions and Lifecycle

### 4.1 Meeting State Changes Not Logged

**Status (2026-01-03):** Resolved — meeting service logs transitions.

---

### 4.2 Diarization Job State - Missing Previous State

**File:** `src/noteflow/grpc/_mixins/diarization/_jobs.py:147-171`

```python
await repo.diarization_jobs.update_status(job_id, JOB_STATUS_RUNNING, ...)
```

**Status (2026-01-03):** Resolved — state transitions logged.

---

### 4.3 Segmenter State Machine - No Transition Logging

**File:** `src/noteflow/infrastructure/asr/segmenter.py:121-127`

```python
if is_speech:
    self._state = SegmenterState.SPEECH
    # No logging of IDLE -> SPEECH
```

**Status (2026-01-03):** Not implemented — see P1 sprint.

---

### 4.4 Stream Cleanup - No Logging

**File:** `src/noteflow/grpc/_mixins/streaming/_cleanup.py:14-34`

```python
def cleanup_stream_resources(host, meeting_id):
    # Multiple cleanup operations, no completion log
    host._active_streams.discard(meeting_id)
```

---

### 4.5 Diarization Session Close - DEBUG Only

**File:** `src/noteflow/infrastructure/diarization/session.py:145-159`

**Status (2026-01-03):** Not implemented — see P2 sprint.

```python
def close(self) -> None:
    logger.debug("Session %s closed", self.meeting_id)  # Should be INFO
```

---

### 4.6 Background Task Spawning - No Task ID

**File:** `src/noteflow/grpc/_mixins/diarization/_jobs.py:130-132`

**Status (2026-01-03):** Not implemented — see P2 sprint.

```python
task = asyncio.create_task(self._run_diarization_job(job_id, num_speakers))
self._diarization_tasks[job_id] = task
# No logging of task creation
```

---

### 4.7 Audio Flush Thread - No Start/End Logging

**File:** `src/noteflow/infrastructure/audio/writer.py:135-157`

**Status (2026-01-03):** Not implemented — see P2 sprint.

```python
self._flush_thread.start()  # No logging
# ...
def _periodic_flush_loop(self):
    while not self._stop_flush.wait(...):
        # No entry/exit logging for loop
```

---

## 5. Database Operations

### 5.1 BaseRepository - No Query Timing

**File:** `src/noteflow/infrastructure/persistence/repositories/_base.py`

**Status (2026-01-03):** Not implemented — see P2 sprint.

All methods (`_execute_scalar`, `_execute_scalars`, `_add_and_flush`, `_delete_and_flush`, `_add_all_and_flush`, `_execute_update`, `_execute_delete`) have no timing or logging.

---

### 5.2 Unit of Work - No Transaction Logging

**File:** `src/noteflow/infrastructure/persistence/unit_of_work.py:220-296`

**Status (2026-01-03):** Not implemented — see P2 sprint.

---

### 5.3 Repository CRUD Operations - No Logging

**Files:**
- `meeting_repo.py` - create, update, delete, list_all
- `segment_repo.py` - add_batch, update_embedding, update_speaker
- `summary_repo.py` - save (upsert with cascades)
- `diarization_job_repo.py` - create, mark_running_as_failed, prune_completed
- `entity_repo.py` - save_batch, delete_by_meeting
- `webhook_repo.py` - create, add_delivery
- `integration_repo.py` - set_secrets
- `usage_event_repo.py` - add_batch, delete_before
- `preferences_repo.py` - set_bulk

**Status (2026-01-03):** Not implemented — see P2 sprint.

---

## 6. File System Operations

### 6.1 Meeting Directory Creation - Not Logged

**File:** `src/noteflow/infrastructure/audio/writer.py:109-111`

**Status (2026-01-03):** Resolved — audio writer open logs meeting and dir.

```python
self._meeting_dir.mkdir(parents=True, exist_ok=True)  # No logging
```

---

### 6.2 Manifest Read/Write - Not Logged

**File:** `src/noteflow/infrastructure/audio/writer.py:122-123`

**Status (2026-01-03):** Partially implemented — open logged, manifest write still unlogged (P2 sprint).

```python
manifest_path.write_text(json.dumps(manifest, indent=2))  # No logging
```

---

### 6.3 Asset Deletion - Silent No-Op

**File:** `src/noteflow/infrastructure/persistence/repositories/asset_repo.py:49-51`

**Status (2026-01-03):** Not implemented — see P2 sprint.

```python
if meeting_dir.exists():
    shutil.rmtree(meeting_dir)
    logger.info("Deleted meeting assets at %s", meeting_dir)
# No log when directory doesn't exist
```

---

## 7. Export Operations

### 7.1 PDF Export - No Timing

**File:** `src/noteflow/infrastructure/export/pdf.py:161-186`

```python
def export(self, meeting, segments) -> bytes:
    pdf_bytes = weasy_html(string=html_content).write_pdf()  # No timing
    return pdf_bytes
```

**Status (2026-01-03):** Not implemented — see P2 sprint.

---

### 7.2 Markdown/HTML Export - No Logging

**Files:** `markdown.py:37-89`, `html.py:158-187`

**Status (2026-01-03):** Not implemented — see P2 sprint.

No logging of export operations.

---

## 8. Initialization Sequences

### 8.1 Lazy Model Loading - Not Logged at Load Time

**Files:**
- `NerEngine._ensure_loaded()` - spaCy model load
- `DiarizationEngine` - pyannote model load
- `OllamaSummarizer._get_client()` - client creation

**Status (2026-01-03):** Partially implemented — some model loads logged, NER warmup not logged (P1 sprint).

---

### 8.2 Singleton Creation - Silent

**File:** `src/noteflow/infrastructure/metrics/collector.py:168-178`

**Status (2026-01-03):** Not implemented — out of P1/P2 scope unless needed.

```python
def get_metrics_collector() -> MetricsCollector:
    global _metrics_collector
    if _metrics_collector is None:
        _metrics_collector = MetricsCollector()  # No logging
    return _metrics_collector
```

---

### 8.3 Provider Registration - DEBUG Level

**File:** `src/noteflow/application/services/summarization_service.py:119-127`

**Status (2026-01-03):** Partially implemented — still debug in factory registration; consider if INFO needed.

```python
def register_provider(self, mode, provider):
    logger.debug("Registered %s provider", mode.value)  # Should be INFO at startup
```

---

## Summary Statistics

| Category | Issue Count | Severity |
|----------|-------------|----------|
| Network/External Services | 7 | CRITICAL (mostly resolved) |
| Blocking/Long-Running | 4 | HIGH (partially unresolved) |
| Error Handling | 10+ | HIGH (partially unresolved) |
| State Transitions | 7 | MEDIUM (partially unresolved) |
| Database Operations | 30+ | MEDIUM (unresolved) |
| File System | 3 | LOW (partially unresolved) |
| Export | 3 | LOW (unresolved) |
| Initialization | 5 | MEDIUM (partially unresolved) |
| **Total** | **100+** | - |

---

## Recommended Logging Pattern

For all async/blocking operations:

```python
logger.info("Starting <operation>: context=%s", context)
start = time.perf_counter()
try:
    result = await some_operation()
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.info("<operation> completed: result_count=%d, duration_ms=%.2f", len(result), elapsed_ms)
except TimeoutError:
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.error("<operation> timeout after %.2fms", elapsed_ms)
    raise
except Exception as e:
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.error("<operation> failed after %.2fms: %s", elapsed_ms, e)
    raise
```

---

## Priority Fixes

### P0 - Fix Immediately

1. (Resolved) Ollama `is_available` timeout logging
2. (Resolved) Summarization factory timing
3. (Resolved) Database migration progress logging

### P1 - Fix This Sprint

4. (Resolved) All external HTTP calls (calendar, OAuth, webhooks)
5. All `run_in_executor` calls (ASR, NER, diarization)
6. Silent ValueError returns

### P2 - Fix Next Sprint

7. Repository CRUD logging
8. State transition logging (segmenter + diarization session)
9. Background task lifecycle logging

---

## Resolved Issues

- ~~Server-side state volatility~~ → Diarization jobs persisted to DB
- ~~Hardcoded directory paths~~ → `asset_path` column added to meetings
- ~~Synchronous blocking in async gRPC~~ → `run_in_executor` for diarization
- ~~Summarization consent not persisted~~ → Stored in `user_preferences` table
- ~~VU meter update throttling~~ → 20fps throttle implemented
- ~~Webhook infrastructure missing~~ → Full webhook subsystem implemented
- ~~Integration/OAuth token storage~~ → `IntegrationSecretModel` for secure storage