noteflow/docs/sprints/sprint_logging_gap_remediation_p2.md
Travis Vasceannie 0c1dbb362f feat: add structured logging for persistence and export operations
- Introduced logging for CRUD operations in repositories to enhance visibility into database interactions.
- Implemented timing logs for BaseRepository and UnitOfWork lifecycle events to track performance.
- Added export logging for size and duration without exposing sensitive content.
- Promoted logging levels for specific operations to improve clarity and reduce noise in logs.
- Established a framework for consistent logging practices across persistence and export functionalities.
2026-01-14 01:18:44 +00:00


Sprint: Logging Gap Remediation (P2 - Persistence/Exports)

Size: L | Owner: Platform | Prerequisites: P1 logging gaps resolved
Phase: Observability - Data & Lifecycle


Open Issues & Prerequisites

⚠️ Review Date: 2026-01-03 — Verification complete, scope needs prioritization.

Blocking Issues

| ID | Issue | Status | Resolution |
|----|-------|--------|------------|
| B1 | Log volume for repository CRUD operations | Pending | Decide sampling/level policy |
| B2 | Sensitive data in repository logs | Pending | Redaction and field allowlist |
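
One possible shape for the B2 resolution is a field allowlist applied before anything reaches the logger. This is a minimal sketch, not the project's implementation: `ALLOWED_LOG_FIELDS`, `allowlisted`, and `log_mutation` are hypothetical names, and it uses the standard `logging` module rather than the project's `get_logger`, whose signature is not shown here.

```python
import logging
from collections.abc import Mapping

# Hypothetical allowlist: only these field names may appear in repository logs.
ALLOWED_LOG_FIELDS = frozenset(
    {"entity", "entity_id", "operation", "row_count", "duration_ms"}
)

logger = logging.getLogger("noteflow.allowlist.example")


def allowlisted(fields: Mapping[str, object]) -> dict[str, object]:
    """Keep only explicitly allowed fields; everything else is dropped."""
    return {key: value for key, value in fields.items() if key in ALLOWED_LOG_FIELDS}


def log_mutation(operation: str, entity: str, **fields: object) -> dict[str, object]:
    """Log a mutation with allowlisted fields and return what was logged."""
    safe = allowlisted({"operation": operation, "entity": entity, **fields})
    # The structured payload travels in `extra`; the message stays constant.
    logger.info("repository mutation", extra={"fields": safe})
    return safe
```

An allowlist fails closed: a new column such as a meeting title stays out of the logs until someone adds it deliberately, which a denylist would not guarantee.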

Design Gaps to Address

| ID | Gap | Resolution |
|----|-----|------------|
| G1 | Consistent DB timing strategy across BaseRepository and UoW | Add log_timing helpers or per-method timing |
| G2 | Export logs should include size without dumping content | Log byte count + segment count only |
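
G2 could look roughly like the wrapper below: it measures the render call and logs only the byte count, segment count, and duration. The function name `export_with_metrics`, the `render` callable, and the field names are assumptions for illustration; the real exporters in pdf.py, markdown.py, and html.py may be structured differently.

```python
import logging
import time
from collections.abc import Callable, Sequence

logger = logging.getLogger("noteflow.export.example")


def export_with_metrics(
    render: Callable[[Sequence[str]], bytes],
    segments: Sequence[str],
    export_format: str,
) -> bytes:
    """Run an export and log its size and duration, never its content."""
    started = time.perf_counter()
    payload = render(segments)
    duration_ms = (time.perf_counter() - started) * 1000.0
    logger.info(
        "export completed",
        extra={
            "export_format": export_format,
            "byte_count": len(payload),       # size of the rendered output
            "segment_count": len(segments),   # number of input segments
            "duration_ms": round(duration_ms, 2),
        },
    )
    return payload
```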

Prerequisite Verification

| Prerequisite | Status | Notes |
|--------------|--------|-------|
| Logging helpers available | | log_timing, get_logger |
| State transition logger | | log_state_transition |

Validation Status (2026-01-03)

PARTIALLY IMPLEMENTED

| Component | Status | Notes |
|-----------|--------|-------|
| DB migrations lifecycle logs | Partial | Migration start/end logged; repo/UoW still silent |
| Audio writer open logging | Partial | Open/flush errors logged, but thread lifecycle unlogged |

NOT IMPLEMENTED

| Component | Status | Notes |
|-----------|--------|-------|
| BaseRepository query timing | Not implemented | src/noteflow/infrastructure/persistence/repositories/_base.py |
| UnitOfWork lifecycle logs | Not implemented | src/noteflow/infrastructure/persistence/unit_of_work.py |
| Repository CRUD logging | Not implemented | meeting_repo.py, segment_repo.py, summary_repo.py, etc. |
| Asset deletion no-op logging | Not implemented | src/noteflow/infrastructure/persistence/repositories/asset_repo.py |
| Export timing/logging | Not implemented | pdf.py, markdown.py, html.py |
| Diarization session close log level | Not implemented | src/noteflow/infrastructure/diarization/session.py uses debug |
| Background task lifecycle logs | Not implemented | src/noteflow/grpc/_mixins/diarization/_jobs.py task creation missing |

Downstream impact: Limited visibility into DB performance, export latency, and lifecycle cleanup.


Objective

Add structured logging for persistence, export, and lifecycle operations so DB performance issues and long-running exports are diagnosable without ad-hoc debugging.


Key Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Repository logging level | INFO for mutations, DEBUG for reads | Avoid log noise while capturing state changes |
| Timing strategy | log_timing around DB write batches | Consistent duration metrics without per-row spam |
| Export logging | Log sizes and durations only | Avoid dumping user content |
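
The first two decisions can be sketched together: one timing block per write batch, logged at a level chosen by operation type. The project's `log_timing` helper is not shown in this document, so `timed_operation` below is a hypothetical stand-in built on the standard library, and the `MUTATIONS` set is an assumption.

```python
import logging
import time
from collections.abc import Iterator
from contextlib import contextmanager

logger = logging.getLogger("noteflow.timing.example")

# Assumed operation names; anything else is treated as a read.
MUTATIONS = frozenset({"create", "update", "delete"})


def level_for(operation: str) -> int:
    """INFO for mutations, DEBUG for reads."""
    return logging.INFO if operation in MUTATIONS else logging.DEBUG


@contextmanager
def timed_operation(operation: str, entity: str, row_count: int) -> Iterator[None]:
    """Time a whole batch and emit one record, not one per row."""
    started = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - started) * 1000.0
        logger.log(
            level_for(operation),
            "db operation finished",
            extra={
                "operation": operation,
                "entity": entity,
                "row_count": row_count,
                "duration_ms": round(duration_ms, 2),
            },
        )
```

Wrapping a 50-row insert in `with timed_operation("create", "segment", row_count=50):` yields a single INFO record for the batch. The record is emitted in `finally`, so a failed batch is timed as well.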

What Already Exists

| Asset | Location | Implication |
|-------|----------|-------------|
| Migration logging | src/noteflow/infrastructure/persistence/database.py | Reuse for DB lifecycle logs |
| Log helpers | src/noteflow/infrastructure/logging/* | Standardize on structured logging |

Scope

| Task | Effort | Notes |
|------|--------|-------|
| Infrastructure Layer | | |
| Add BaseRepository timing wrappers | M | _execute_* methods emit duration |
| Add UnitOfWork lifecycle logs | S | `__aenter__`, commit, rollback, exit |
| Add CRUD mutation logs in repositories | L | Create/Update/Delete summary logs |
| Add asset deletion no-op log | S | Log when directory missing |
| Add export timing logs | M | PDF/Markdown/HTML export duration + size |
| Promote diarization session close to INFO | S | session.py |
| Log diarization job task creation | S | grpc/_mixins/diarization/_jobs.py |
| Add audio flush thread lifecycle logs | S | infrastructure/audio/writer.py |

Total Effort: L (4-8 hours)


Deliverables

Backend

Infrastructure Layer:

  • src/noteflow/infrastructure/persistence/repositories/_base.py — timing logs for DB operations
  • src/noteflow/infrastructure/persistence/unit_of_work.py — session/commit/rollback logs
  • src/noteflow/infrastructure/persistence/repositories/*_repo.py — mutation logging
  • src/noteflow/infrastructure/persistence/repositories/asset_repo.py — no-op delete log
  • src/noteflow/infrastructure/export/pdf.py — duration + byte-size log
  • src/noteflow/infrastructure/export/markdown.py — export count log
  • src/noteflow/infrastructure/export/html.py — export count log
  • src/noteflow/infrastructure/diarization/session.py — info-level close log
  • src/noteflow/grpc/_mixins/diarization/_jobs.py — background task creation log
  • src/noteflow/infrastructure/audio/writer.py — flush thread lifecycle logs
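
For the asset_repo.py item, a no-op delete log might look like the sketch below. `delete_asset_dir` and its boolean return value are hypothetical; the actual repository method and its signature are not shown in this document, and the INFO level is an assumption.

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger("noteflow.assets.example")


def delete_asset_dir(directory: Path) -> bool:
    """Delete an asset directory; log explicitly when there is nothing to delete."""
    if not directory.exists():
        # Without this record, a missing directory looks like a successful delete.
        logger.info("asset deletion skipped", extra={"asset_dir": str(directory)})
        return False
    shutil.rmtree(directory)
    logger.info("asset directory deleted", extra={"asset_dir": str(directory)})
    return True
```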

Test Strategy

Core test cases

  • Repositories: caplog validates mutation logging for create/update/delete
  • UnitOfWork: log emitted on commit/rollback paths
  • Exports: ensure logs include duration and output size (bytes/segments)
  • Lifecycle: diarization session close emits info log
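
The repository case above can be sketched as follows. In the real suite the pytest `caplog` fixture would do the capturing; here a small standard-library handler stands in so the example runs on its own. `captured_records` and `fake_delete` are hypothetical helpers, not project code.

```python
import logging
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def captured_records(
    logger_name: str, level: int = logging.DEBUG
) -> Iterator[list[logging.LogRecord]]:
    """Collect records emitted on one logger for the duration of the block."""
    records: list[logging.LogRecord] = []

    class _ListHandler(logging.Handler):
        def emit(self, record: logging.LogRecord) -> None:
            records.append(record)

    target = logging.getLogger(logger_name)
    handler = _ListHandler()
    previous_level = target.level
    target.addHandler(handler)
    target.setLevel(level)
    try:
        yield records
    finally:
        target.removeHandler(handler)
        target.setLevel(previous_level)


def fake_delete(meeting_id: str) -> None:
    """Stand-in for a repository delete that logs a mutation."""
    logging.getLogger("noteflow.repo.example").info(
        "meeting deleted", extra={"operation": "delete", "entity_id": meeting_id}
    )


def test_delete_emits_info_log() -> None:
    with captured_records("noteflow.repo.example") as records:
        fake_delete("m-1")
    assert len(records) == 1
    assert records[0].levelno == logging.INFO
    # Structured fields passed via `extra` become attributes on the record.
    assert getattr(records[0], "entity_id") == "m-1"
```

Asserting on record attributes rather than on the formatted message keeps the test stable if the message wording changes.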

Quality Gates

  • Logging includes structured fields and avoids payload content
  • No new # type: ignore or Any introduced
  • pytest passes for touched modules
  • ruff check + mypy pass

Post-Sprint

  • Assess performance impact of repo timing logs
  • Consider opt-in logging for high-volume read paths
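
If read-path logging turns out to be too noisy, one option to evaluate is sampling below INFO. The sketch below is only an assumption about how that could work: `EveryNthFilter` is a hypothetical `logging.Filter` that passes every record at INFO or above and every nth record below it. The counter is not thread-safe and would need a lock in concurrent use.

```python
import logging


class EveryNthFilter(logging.Filter):
    """Pass all records at INFO or above; pass every nth record below INFO."""

    def __init__(self, n: int) -> None:
        super().__init__()
        if n < 1:
            raise ValueError("n must be >= 1")
        self._n = n
        self._seen = 0

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.INFO:
            return True  # mutations and warnings are never sampled away
        self._seen += 1
        # Keep the 1st, (n+1)th, (2n+1)th ... low-level record.
        return (self._seen - 1) % self._n == 0
```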