Technical Debt Triage
This document tracks known issues, technical debt, and areas needing improvement.
Insufficient Logging - Comprehensive Audit
Discovered: 2025-12-31
Last Verified: 2026-01-03
Sprint Docs:
- docs/sprints/sprint_logging_gap_remediation_p1.md
- docs/sprints/sprint_logging_gap_remediation_p2.md
Impact: Caused ~1 hour of debugging when an Ollama 120s timeout appeared as a migration hang
Total Issues Found: 100+
1. Network/External Service Connections
1.1 CRITICAL: Ollama Availability Check - Silent 120s Timeout
File: src/noteflow/infrastructure/summarization/ollama_provider.py:101-115
@property
def is_available(self) -> bool:
    try:
        client = self._get_client()
        client.list()  # Silent 120-second timeout!
        return True
    except (ConnectionError, TimeoutError, ...):
        return False
Status (2026-01-03): Resolved — log_timing added around availability check.
Fix:
@property
def is_available(self) -> bool:
    try:
        logger.info("Checking Ollama availability at %s (timeout: %.0fs)...", self._host, self._timeout)
        client = self._get_client()
        client.list()
        logger.info("Ollama server is available")
        return True
    except TimeoutError:
        logger.warning("Ollama server timeout at %s after %.0fs", self._host, self._timeout)
        return False
    except (ConnectionError, RuntimeError, OSError) as e:
        logger.debug("Ollama server unreachable at %s: %s", self._host, e)
        return False
1.2 Cloud Summarization API Calls - No Request Logging
File: src/noteflow/infrastructure/summarization/cloud_provider.py:238-282
def _call_openai(self, user_prompt: str, system_prompt: str) -> tuple[str, int | None]:
    try:
        response = client.chat.completions.create(...)  # No timing logged
    except TimeoutError as e:
        raise SummarizationTimeoutError(...)  # No duration logged
Status (2026-01-03): Resolved — log_timing wraps OpenAI/Anthropic calls and response logging added.
Fix: Add logger.info("Initiating OpenAI API call: model=%s", self._model) before the call and log the duration afterwards; a sketch of the pattern follows.
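A minimal sketch of that pattern, assuming the provider holds an OpenAI v1 client on self._client and the model name on self._model (attribute names hypothetical). The actual fix routes through the shared log_timing helper, so this is illustrative only.

import logging
import time

logger = logging.getLogger(__name__)

def _call_openai(self, user_prompt: str, system_prompt: str) -> tuple[str, int | None]:
    logger.info("Initiating OpenAI API call: model=%s", self._model)
    start = time.perf_counter()
    response = self._client.chat.completions.create(
        model=self._model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    elapsed_ms = (time.perf_counter() - start) * 1000
    usage = getattr(response, "usage", None)
    tokens = getattr(usage, "total_tokens", None)
    logger.info("OpenAI call completed: model=%s, total_tokens=%s, duration_ms=%.1f", self._model, tokens, elapsed_ms)
    return response.choices[0].message.content or "", tokens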
1.3 Google Calendar API - No Request Logging
File: src/noteflow/infrastructure/calendar/google_adapter.py:76-91
async with httpx.AsyncClient() as client:
    response = await client.get(url, params=params, headers=headers)  # No logging
Status (2026-01-03): Resolved — request timing logged via log_timing.
Fix: Log the request start, duration, and response status code (see the sketch below).
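A sketch of that request/response logging, assuming the adapter already builds url, params, and headers as shown above; the real fix uses log_timing, and the fetch_events helper name below is hypothetical.

import logging
import time

import httpx

logger = logging.getLogger(__name__)

async def fetch_events(url: str, params: dict, headers: dict) -> httpx.Response:
    logger.info("Calendar API request: GET %s", url)
    start = time.perf_counter()
    async with httpx.AsyncClient() as client:
        response = await client.get(url, params=params, headers=headers)
    logger.info(
        "Calendar API response: status=%d, duration_ms=%.1f",
        response.status_code,
        (time.perf_counter() - start) * 1000,
    )
    return response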
1.4 OAuth Token Refresh - Missing Timing
File: src/noteflow/infrastructure/calendar/oauth_manager.py:211-222
async def refresh_tokens(...) -> OAuthTokens:
    response = await client.post(token_url, data=data)  # No timing
Status (2026-01-03): Resolved — refresh timing logged via log_timing.
1.5 Webhook Delivery - Missing Initial Request Log
File: src/noteflow/infrastructure/webhooks/executor.py:107-237
async def deliver(...) -> WebhookDelivery:
    for attempt in range(1, max_retries + 1):
        _logger.debug("Webhook delivery attempt %d/%d", attempt, max_retries)  # DEBUG only
Status (2026-01-03): Resolved — info log at delivery start + completion.
1.6 Database Connection Creation - No Logging
File: src/noteflow/infrastructure/persistence/database.py:85-116
def create_engine_and_session_factory(...):
    engine = sa_create_async_engine(database_url, pool_size=pool_size, ...)
    # No logging of connection parameters
Status (2026-01-03): Resolved — engine creation logged with masked URL.
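A sketch of one way to log engine creation with a masked URL, assuming SQLAlchemy 1.4+ (render_as_string(hide_password=True) hides credentials); the wrapper name here is hypothetical and the real fix may differ in shape.

import logging

from sqlalchemy.engine import make_url
from sqlalchemy.ext.asyncio import create_async_engine

logger = logging.getLogger(__name__)

def create_engine_with_logging(database_url: str, pool_size: int = 5):
    # Mask credentials before the URL ever reaches the log.
    masked_url = make_url(database_url).render_as_string(hide_password=True)
    logger.info("Creating async engine: url=%s, pool_size=%d", masked_url, pool_size)
    return create_async_engine(database_url, pool_size=pool_size)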
1.7 Rust gRPC Client Connection - No Tracing
File: client/src-tauri/src/grpc/client/core.rs:174-197
async fn perform_connect(&self) -> Result<ServerInfo> {
    let channel = endpoint.connect().await  // No tracing before/after
        .map_err(|e| Error::Connection(...))?;
Status (2026-01-03): Not implemented — see P1 sprint.
2. Blocking/Long-Running Operations
2.1 NER Service - Silent Model Warmup
File: src/noteflow/application/services/ner_service.py:185-197
await loop.run_in_executor(
    None,
    lambda: self._ner_engine.extract("warm up"),  # No logging
)
Status (2026-01-03): Not implemented — see P1 sprint.
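A sketch of the warmup logging this item calls for, assuming the service exposes the existing self._ner_engine attribute; timing the executor call makes slow spaCy model loads visible at startup.

import asyncio
import logging
import time

logger = logging.getLogger(__name__)

async def warm_up(self) -> None:
    loop = asyncio.get_running_loop()
    logger.info("Warming up NER model...")
    start = time.perf_counter()
    # Run the blocking warmup off the event loop, as the service already does.
    await loop.run_in_executor(None, lambda: self._ner_engine.extract("warm up"))
    logger.info("NER model warmup completed in %.2fs", time.perf_counter() - start)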
2.2 ASR Transcription - No Duration Logging
File: src/noteflow/infrastructure/asr/engine.py:156-177
async def transcribe_async(...) -> list[AsrResult]:
    return await loop.run_in_executor(None, ...)  # No timing
Status (2026-01-03): Not implemented — see P1 sprint.
2.3 Diarization - Missing Blocking Operation Logging
File: src/noteflow/infrastructure/diarization/engine.py:299-347
def diarize_full(...) -> Sequence[SpeakerTurn]:
    logger.debug("Running offline diarization on %.2fs audio", ...)  # DEBUG only
    annotation = self._offline_pipeline(waveform, ...)  # No end logging
Status (2026-01-03): Resolved — log_timing wraps diarization.
2.4 Diarization Job Timeout - No Pre-Timeout Context
File: src/noteflow/grpc/_mixins/diarization/_jobs.py:173-186
async with asyncio.timeout(DIARIZATION_TIMEOUT_SECONDS):
    updated_count = await self.refine_speaker_diarization(...)
# No logging of timeout value before entering block
Status (2026-01-03): Resolved — timeout value logged in job handler.
3. Error Handling - Silent Failures
3.1 Silent ValueError Returns
Files:
- src/noteflow/grpc/_mixins/meeting.py:64-67 - workspace UUID parse
- src/noteflow/grpc/_mixins/converters.py:76-79 - meeting ID parse
- src/noteflow/grpc/_mixins/diarization/_jobs.py:84-87 - meeting ID validation
- src/noteflow/infrastructure/triggers/calendar.py:141-144 - datetime parse
try:
    UUID(workspace_id)
except ValueError:
    return None  # Silent failure, no logging
Status (2026-01-03): Not implemented — add WARN + redaction (P1 sprint).
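A sketch of the WARN-plus-redaction pattern the P1 sprint calls for, using a hypothetical parse_workspace_id helper: log only a short prefix of the offending value so malformed IDs are visible without the full identifier landing in the log.

import logging
from uuid import UUID

logger = logging.getLogger(__name__)

def parse_workspace_id(workspace_id: str) -> UUID | None:
    try:
        return UUID(workspace_id)
    except ValueError:
        # Redact: log only the first few characters of the bad value.
        logger.warning("Invalid workspace UUID (prefix=%r)", workspace_id[:8])
        return None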
3.2 Silent Settings Fallbacks
Files:
- src/noteflow/infrastructure/webhooks/executor.py:56-65
- src/noteflow/infrastructure/summarization/ollama_provider.py:44-48
- src/noteflow/infrastructure/summarization/cloud_provider.py:48-52
- src/noteflow/grpc/_mixins/diarization_job.py:63-66
except Exception:
    return DEFAULT_VALUES  # No logging that fallback occurred
Status (2026-01-03): Not implemented — add warning logs (P1 sprint).
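A sketch of the warning-on-fallback pattern, with a hypothetical load_timeout helper and DEFAULT_TIMEOUT_SECONDS constant standing in for whichever setting each call site actually reads:

import logging

logger = logging.getLogger(__name__)

DEFAULT_TIMEOUT_SECONDS = 120.0  # hypothetical default

def load_timeout(settings) -> float:
    try:
        return float(settings.ollama_timeout_seconds)
    except Exception as exc:  # mirror the existing broad catch
        logger.warning("Using default timeout %.0fs (settings unavailable: %s)", DEFAULT_TIMEOUT_SECONDS, exc)
        return DEFAULT_TIMEOUT_SECONDS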
3.3 gRPC Client Stub Unavailable - Silent Returns
Files: src/noteflow/grpc/_client_mixins/*.py (multiple locations)
if not self._stub:
    return None  # No logging of connection issue
Status (2026-01-03): Not implemented — add rate-limited warn log (P1 sprint).
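One way to implement the rate-limited warning the P1 sprint calls for; the module-level throttle and the 30-second interval below are assumptions, not the project's chosen values.

import logging
import time

logger = logging.getLogger(__name__)

_last_stub_warning = 0.0
_WARN_INTERVAL_SECONDS = 30.0

def warn_stub_unavailable(method: str) -> None:
    # Throttle so a disconnected client cannot flood the log on every call.
    global _last_stub_warning
    now = time.monotonic()
    if now - _last_stub_warning >= _WARN_INTERVAL_SECONDS:
        _last_stub_warning = now
        logger.warning("gRPC stub unavailable; skipping %s (client not connected)", method)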
4. State Transitions and Lifecycle
4.1 Meeting State Changes Not Logged
Status (2026-01-03): Resolved — meeting service logs transitions.
4.2 Diarization Job State - Missing Previous State
File: src/noteflow/grpc/_mixins/diarization/_jobs.py:147-171
await repo.diarization_jobs.update_status(job_id, JOB_STATUS_RUNNING, ...)
Status (2026-01-03): Resolved — state transitions logged.
4.3 Segmenter State Machine - No Transition Logging
File: src/noteflow/infrastructure/asr/segmenter.py:121-127
if is_speech:
    self._state = SegmenterState.SPEECH  # No logging of IDLE -> SPEECH
Status (2026-01-03): Not implemented — see P1 sprint.
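A sketch of a centralized transition helper (the _transition name is hypothetical). DEBUG level is likely appropriate here since VAD-driven transitions can be frequent; the point is that every state change is logged in exactly one place.

import logging

logger = logging.getLogger(__name__)

def _transition(self, new_state: "SegmenterState") -> None:
    # Single choke point for state changes so each one is logged.
    if new_state is not self._state:
        logger.debug("Segmenter state: %s -> %s", self._state.name, new_state.name)
        self._state = new_state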
4.4 Stream Cleanup - No Logging
File: src/noteflow/grpc/_mixins/streaming/_cleanup.py:14-34
def cleanup_stream_resources(host, meeting_id):
    # Multiple cleanup operations, no completion log
    host._active_streams.discard(meeting_id)
4.5 Diarization Session Close - DEBUG Only
File: src/noteflow/infrastructure/diarization/session.py:145-159
Status (2026-01-03): Not implemented — see P2 sprint.
def close(self) -> None:
    logger.debug("Session %s closed", self.meeting_id)  # Should be INFO
4.6 Background Task Spawning - No Task ID
File: src/noteflow/grpc/_mixins/diarization/_jobs.py:130-132
Status (2026-01-03): Not implemented — see P2 sprint.
task = asyncio.create_task(self._run_diarization_job(job_id, num_speakers))
self._diarization_tasks[job_id] = task # No logging of task creation
4.7 Audio Flush Thread - No Start/End Logging
File: src/noteflow/infrastructure/audio/writer.py:135-157
Status (2026-01-03): Not implemented — see P2 sprint.
self._flush_thread.start()  # No logging
# ...
def _periodic_flush_loop(self):
    while not self._stop_flush.wait(...):
        # No entry/exit logging for loop
5. Database Operations
5.1 BaseRepository - No Query Timing
File: src/noteflow/infrastructure/persistence/repositories/_base.py
Status (2026-01-03): Not implemented — see P2 sprint.
All methods (_execute_scalar, _execute_scalars, _add_and_flush, _delete_and_flush, _add_all_and_flush, _execute_update, _execute_delete) have no timing or logging.
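Since every BaseRepository helper needs the same timing, a decorator is one lightweight option; this is a sketch with a hypothetical timed_query name, not the pattern the P2 sprint has committed to.

import functools
import logging
import time

logger = logging.getLogger(__name__)

def timed_query(func):
    # Log the duration of an async repository helper at DEBUG level.
    @functools.wraps(func)
    async def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = await func(self, *args, **kwargs)
        logger.debug(
            "%s.%s took %.2fms",
            type(self).__name__,
            func.__name__,
            (time.perf_counter() - start) * 1000,
        )
        return result
    return wrapper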
5.2 Unit of Work - No Transaction Logging
File: src/noteflow/infrastructure/persistence/unit_of_work.py:220-296
Status (2026-01-03): Not implemented — see P2 sprint.
5.3 Repository CRUD Operations - No Logging
Files:
- meeting_repo.py - create, update, delete, list_all
- segment_repo.py - add_batch, update_embedding, update_speaker
- summary_repo.py - save (upsert with cascades)
- diarization_job_repo.py - create, mark_running_as_failed, prune_completed
- entity_repo.py - save_batch, delete_by_meeting
- webhook_repo.py - create, add_delivery
- integration_repo.py - set_secrets
- usage_event_repo.py - add_batch, delete_before
- preferences_repo.py - set_bulk
Status (2026-01-03): Not implemented — see P2 sprint.
6. File System Operations
6.1 Meeting Directory Creation - Not Logged
File: src/noteflow/infrastructure/audio/writer.py:109-111
Status (2026-01-03): Resolved — audio writer open logs meeting and dir.
self._meeting_dir.mkdir(parents=True, exist_ok=True) # No logging
6.2 Manifest Read/Write - Not Logged
File: src/noteflow/infrastructure/audio/writer.py:122-123
Status (2026-01-03): Partially implemented — open logged, manifest write still unlogged (P2 sprint).
manifest_path.write_text(json.dumps(manifest, indent=2)) # No logging
6.3 Asset Deletion - Silent No-Op
File: src/noteflow/infrastructure/persistence/repositories/asset_repo.py:49-51
Status (2026-01-03): Not implemented — see P2 sprint.
if meeting_dir.exists():
    shutil.rmtree(meeting_dir)
    logger.info("Deleted meeting assets at %s", meeting_dir)
# No log when directory doesn't exist
7. Export Operations
7.1 PDF Export - No Timing
File: src/noteflow/infrastructure/export/pdf.py:161-186
def export(self, meeting, segments) -> bytes:
    pdf_bytes = weasy_html(string=html_content).write_pdf()  # No timing
    return pdf_bytes
Status (2026-01-03): Not implemented — see P2 sprint.
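A sketch of the export logging to add: record only the output size and the render duration, never the rendered content. The wrapper name is hypothetical; in practice the logging would live inside export() itself.

import logging
import time

logger = logging.getLogger(__name__)

def export_with_logging(exporter, meeting, segments) -> bytes:
    start = time.perf_counter()
    pdf_bytes = exporter.export(meeting, segments)
    logger.info(
        "PDF export completed: size_bytes=%d, duration_ms=%.1f",
        len(pdf_bytes),
        (time.perf_counter() - start) * 1000,
    )
    return pdf_bytes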
7.2 Markdown/HTML Export - No Logging
Files: markdown.py:37-89, html.py:158-187
Status (2026-01-03): Not implemented — see P2 sprint.
No logging of export operations.
8. Initialization Sequences
8.1 Lazy Model Loading - Not Logged at Load Time
Files:
- NerEngine._ensure_loaded() - spaCy model load
- DiarizationEngine - pyannote model load
- OllamaSummarizer._get_client() - client creation
Status (2026-01-03): Partially implemented — some model loads logged, NER warmup not logged (P1 sprint).
8.2 Singleton Creation - Silent
File: src/noteflow/infrastructure/metrics/collector.py:168-178
Status (2026-01-03): Not implemented — out of P1/P2 scope unless needed.
def get_metrics_collector() -> MetricsCollector:
    global _metrics_collector
    if _metrics_collector is None:
        _metrics_collector = MetricsCollector()  # No logging
    return _metrics_collector
8.3 Provider Registration - DEBUG Level
File: src/noteflow/application/services/summarization_service.py:119-127
Status (2026-01-03): Partially implemented — still DEBUG in factory registration; consider whether INFO is needed.
def register_provider(self, mode, provider):
    logger.debug("Registered %s provider", mode.value)  # Should be INFO at startup
Summary Statistics
| Category | Issue Count | Severity |
|---|---|---|
| Network/External Services | 7 | CRITICAL (mostly resolved) |
| Blocking/Long-Running | 4 | HIGH (partially unresolved) |
| Error Handling | 10+ | HIGH (partially unresolved) |
| State Transitions | 7 | MEDIUM (partially unresolved) |
| Database Operations | 30+ | MEDIUM (unresolved) |
| File System | 3 | LOW (partially unresolved) |
| Export | 3 | LOW (unresolved) |
| Initialization | 5 | MEDIUM (partially unresolved) |
| Total | 100+ | - |
Recommended Logging Pattern
For all async/blocking operations:
logger.info("Starting <operation>: context=%s", context)
start = time.perf_counter()
try:
result = await some_operation()
elapsed_ms = (time.perf_counter() - start) * 1000
logger.info("<Operation> completed: result_count=%d, duration_ms=%.2f", len(result), elapsed_ms)
except TimeoutError:
elapsed_ms = (time.perf_counter() - start) * 1000
logger.error("<Operation> timeout after %.2fms", elapsed_ms)
raise
except Exception as e:
elapsed_ms = (time.perf_counter() - start) * 1000
logger.error("<Operation> failed after %.2fms: %s", elapsed_ms, e)
raise
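Several resolved items above reference a shared log_timing helper. Its actual signature is not shown in this document; the context manager below is a hypothetical sketch of how the same pattern can be packaged so call sites stay to one line.

import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)

@contextmanager
def log_timing(operation: str, **context):
    # Hypothetical sketch; the project's real helper may differ in name and behavior.
    logger.info("Starting %s: %s", operation, context)
    start = time.perf_counter()
    try:
        yield
    except Exception:
        logger.error("%s failed after %.2fms", operation, (time.perf_counter() - start) * 1000)
        raise
    else:
        logger.info("%s completed in %.2fms", operation, (time.perf_counter() - start) * 1000)

A call site would then read, for example, with log_timing("diarization", meeting_id=meeting_id): ... so the start/end/failure logs come for free.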
Priority Fixes
P0 - Fix Immediately
- (Resolved) Ollama is_available timeout logging
- (Resolved) Summarization factory timing
- (Resolved) Database migration progress logging
P1 - Fix This Sprint
- (Resolved) All external HTTP calls (calendar, OAuth, webhooks)
- All run_in_executor calls (ASR, NER, diarization)
- Silent ValueError returns
P2 - Fix Next Sprint
- Repository CRUD logging
- State transition logging (segmenter + diarization session)
- Background task lifecycle logging
Resolved Issues
- Server-side state volatility → Diarization jobs persisted to DB
- Hardcoded directory paths → asset_path column added to meetings
- Synchronous blocking in async gRPC → run_in_executor for diarization
- Summarization consent not persisted → Stored in user_preferences table
- VU meter update throttling → 20fps throttle implemented
- Webhook infrastructure missing → Full webhook subsystem implemented
- Integration/OAuth token storage → IntegrationSecretModel for secure storage