# Python Tests Development Guide

## Overview

The test suite (`tests/`) provides comprehensive coverage for the NoteFlow Python backend. Tests are organized to mirror the source structure and enforce quality gates through baseline comparison.

**Architecture:** pytest + pytest-asyncio + testcontainers for integration

## Directory Structure
```text
tests/
├── application/          # Service and use case tests
├── benchmarks/           # Performance benchmarking tests
├── config/               # Configuration tests
├── domain/               # Domain entity and value object tests
├── fixtures/             # Test data files (audio samples)
│   └── audio/            # Audio fixture files
├── grpc/                 # gRPC servicer and streaming tests
├── infrastructure/       # Infrastructure adapter tests
│   ├── asr/              # Speech-to-text engine tests
│   ├── audio/            # Audio processing tests
│   ├── auth/             # Authentication/OAuth tests
│   ├── calendar/         # Calendar integration tests
│   ├── diarization/      # Speaker diarization tests
│   ├── export/           # Export functionality tests
│   ├── gpu/              # GPU hardware detection tests
│   ├── metrics/          # Metrics and observability tests
│   ├── ner/              # Named Entity Recognition tests
│   ├── observability/    # OpenTelemetry instrumentation tests
│   ├── persistence/      # Database ORM tests
│   ├── security/         # Encryption/crypto tests
│   ├── summarization/    # Summary generation tests
│   ├── triggers/         # Window trigger detection tests
│   └── webhooks/         # Webhook delivery tests
├── integration/          # Full system integration tests (PostgreSQL)
├── quality/              # Code quality and static analysis tests
│   ├── _detectors/       # Quality rule detector implementations
│   └── baselines.json    # Frozen violation baseline
├── stress/               # Stress, concurrency, and fuzz tests
└── conftest.py           # Root-level shared fixtures
```
## Running Tests

```bash
# All tests
pytest

# Skip slow tests (model loading)
pytest -m "not slow"

# Integration tests only
pytest -m integration

# Quality gate checks
pytest tests/quality/

# Stress/fuzz tests
pytest tests/stress/

# Specific test file
pytest tests/domain/test_meeting.py

# With coverage
pytest --cov=src/noteflow --cov-report=html
```
## Pytest Markers

| Marker | Purpose |
|---|---|
| `@pytest.mark.slow` | Model loading, GPU operations |
| `@pytest.mark.integration` | External services (PostgreSQL) |
| `@pytest.mark.stress` | Stress and concurrency tests |
Usage:

```python
@pytest.mark.slow
@pytest.mark.parametrize("model_size", ["tiny", "base"])
def test_asr_model_loading(model_size: str) -> None:
    ...
```
## Key Fixtures

### Root Fixtures (conftest.py)

| Fixture | Scope | Purpose |
|---|---|---|
| `mock_optional_extras` | session | Mocks `openai`, `anthropic`, `ollama` (collection time) |
| `reset_context_vars` | function | Isolates logging context variables |
| `mock_uow` | function | Full UnitOfWork mock with all repos |
| `meeting_id` | function | MeetingId (UUID-based) |
| `sample_meeting` | function | Meeting in CREATED state |
| `recording_meeting` | function | Meeting in RECORDING state |
| `sample_rate` | function | 16000 (DEFAULT_SAMPLE_RATE) |
| `crypto` | function | AesGcmCryptoBox with InMemoryKeyStore |
| `meetings_dir` | function | Temporary meetings directory |
| `webhook_config` | function | WebhookConfig for MEETING_COMPLETED |
| `mock_grpc_context` | function | Mock gRPC ServicerContext |
| `mock_asr_engine` | function | Mock ASR engine |
| `memory_servicer` | function | In-memory NoteFlowServicer |
Utility functions (not fixtures):

- `approx_float(expected, rel, abs)` — Type-safe `pytest.approx` wrapper
- `approx_sequence(expected, rel, abs)` — Float sequence comparison
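For context, a minimal usage sketch; it assumes the `(expected, rel, abs)` signature above accepts keyword arguments, and the import path is an assumption:

```python
import math

from tests.conftest import approx_float  # import path is an assumption


def test_rms_level() -> None:
    # approx_float wraps pytest.approx with explicit tolerances.
    rms = math.sqrt(sum(x * x for x in [0.5, -0.5]) / 2)
    assert rms == approx_float(0.5, rel=1e-6)
```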
### Integration Fixtures (integration/conftest.py)

| Fixture | Purpose |
|---|---|
| `session_factory` | PostgreSQL async_sessionmaker (testcontainers) |
| `session` | Individual AsyncSession with rollback |
| `persisted_meeting` | Created and persisted Meeting |
| `stopped_meeting_with_segments` | Meeting in STOPPED state with speaker segments |
| `audio_fixture_path` | Path to sample_discord.wav |
| `audio_samples` | Normalized float32 array |
### Quality Support Files (quality/)

| File | Purpose |
|---|---|
| `baselines.json` | Frozen violation counts by rule |
| `_baseline.py` | Baseline comparison utilities |
| `_detectors/` | AST-based detection implementations |
## Testing Patterns

### 1. Declarative Tests (No Loops)

CRITICAL: Test functions must NOT contain loops. Use `parametrize`:

```python
# ✅ CORRECT: Parametrized test
@pytest.mark.parametrize(
    ("text", "expected_category"),
    [
        ("John Smith", EntityCategory.PERSON),
        ("Apple Inc.", EntityCategory.COMPANY),
        ("Python", EntityCategory.TECHNICAL),
    ],
)
def test_extract_entity_category(text: str, expected_category: EntityCategory) -> None:
    result = extract_entity(text)
    assert result.category == expected_category


# ❌ WRONG: Loop in test
def test_extract_entity_categories() -> None:
    test_cases = [("John Smith", EntityCategory.PERSON), ...]
    for text, expected in test_cases:  # FORBIDDEN
        result = extract_entity(text)
        assert result.category == expected
```
### 2. Async Testing

```python
# Async fixtures
@pytest.fixture
async def persisted_meeting(session: AsyncSession) -> Meeting:
    meeting = Meeting.create(...)
    session.add(MeetingModel.from_entity(meeting))
    await session.commit()
    return meeting


# Async tests (auto-marked by pytest-asyncio)
async def test_fetch_meeting(uow: UnitOfWork, persisted_meeting: Meeting) -> None:
    result = await uow.meetings.get(persisted_meeting.id)
    assert result is not None
```
### 3. Mock Strategies

Repository mocks:

```python
@pytest.fixture
def mock_meetings_repo() -> AsyncMock:
    repo = AsyncMock(spec=MeetingRepository)
    repo.get = AsyncMock(return_value=None)
    repo.create = AsyncMock()
    return repo
```

Service mocks with side effects:

```python
@pytest.fixture
def mock_executor(captured_payloads: list) -> AsyncMock:
    async def capture_delivery(config, event_type, payload):
        captured_payloads.append(payload)
        return WebhookDelivery(status_code=200, ...)

    executor = AsyncMock(spec=WebhookExecutor)
    executor.deliver = AsyncMock(side_effect=capture_delivery)
    return executor
```
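The `captured_payloads` fixture that `mock_executor` depends on is defined elsewhere; a minimal sketch of such a companion fixture and a test that consumes it (the test body is illustrative):

```python
@pytest.fixture
def captured_payloads() -> list:
    # Function-scoped list the mock appends to, keeping tests isolated.
    return []


async def test_delivery_captures_payload(
    mock_executor: AsyncMock,
    captured_payloads: list,
) -> None:
    await mock_executor.deliver(None, "MEETING_COMPLETED", {"id": 1})
    assert captured_payloads == [{"id": 1}]
```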
### 4. Quality Gate Pattern

Tests in `tests/quality/` use baseline comparison:

```python
def test_no_high_complexity_functions() -> None:
    violations = collect_high_complexity(parse_errors=[])
    assert_no_new_violations("high_complexity", violations)
```

Behavior:

- Fails if NEW violations are introduced
- Passes if violations match or decrease from baseline
- Baseline is frozen in `baselines.json`
### 5. Integration Test Setup

```python
@pytest.fixture(scope="session")
async def session_factory() -> AsyncGenerator[async_sessionmaker, None]:
    _, database_url = get_or_create_container()  # testcontainers
    engine = create_test_engine(database_url)
    async with engine.begin() as conn:
        await initialize_test_schema(conn)
        yield create_test_session_factory(engine)
        await cleanup_test_schema(conn)
    await engine.dispose()
```
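The per-test `session` fixture listed in the integration fixture table builds on this factory; a minimal sketch, assuming the rollback-per-test behavior it describes:

```python
@pytest.fixture
async def session(
    session_factory: async_sessionmaker,
) -> AsyncGenerator[AsyncSession, None]:
    # One AsyncSession per test; rolling back keeps the shared
    # containerized database clean between tests.
    async with session_factory() as db_session:
        yield db_session
        await db_session.rollback()
```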
## Quality Tests (tests/quality/)

### Quality Rules

| Test File | Rule | Description |
|---|---|---|
| `test_code_smells.py` | high_complexity | Cyclomatic complexity > threshold |
| `test_code_smells.py` | god_class | Classes with too many methods |
| `test_code_smells.py` | deep_nesting | Nesting depth > 7 levels |
| `test_code_smells.py` | long_method | Methods > 50 lines |
| `test_test_smells.py` | test_loop | Loops in test assertions |
| `test_test_smells.py` | test_conditional | Conditionals in assertions |
| `test_magic_values.py` | magic_number | Unextracted numeric constants |
| `test_duplicate_code.py` | code_duplication | Repeated code blocks |
| `test_stale_code.py` | dead_code | Unreachable code |
| `test_unnecessary_wrappers.py` | thin_wrapper | Unnecessary wrapper functions |
| `test_decentralized_helpers.py` | helper_sprawl | Unconsolidated helpers |
| `test_baseline_self.py` | baseline_integrity | Baseline file validation |
### Baseline System

```python
# _baseline.py
@dataclass
class Violation:
    rule: str
    relative_path: str
    identifier: str
    detail: str | None = None

    @property
    def stable_id(self) -> str:
        """Unique identifier: rule|relative_path|identifier[|detail]"""
        ...


def assert_no_new_violations(rule: str, violations: list[Violation]) -> None:
    """Compares current violations against frozen baseline."""
    baseline = load_baseline()
    result = compare_violations(baseline[rule], violations)
    assert result.passed, f"New violations: {result.new_violations}"
```
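The `stable_id` body is elided above; based on its docstring format `rule|relative_path|identifier[|detail]`, a plausible reconstruction might be:

```python
@property
def stable_id(self) -> str:
    """Unique identifier: rule|relative_path|identifier[|detail]"""
    parts = [self.rule, self.relative_path, self.identifier]
    if self.detail is not None:
        parts.append(self.detail)  # detail suffix only when present
    return "|".join(parts)
```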
### Running Quality Checks

```bash
# All quality tests
pytest tests/quality/

# Specific rule
pytest tests/quality/test_code_smells.py

# Update baseline (REQUIRES APPROVAL)
# Do NOT do this without explicit permission
```
## Fixture Scope Guidelines

| Scope | Use When |
|---|---|
| `session` | Expensive setup (DB containers, ML models) |
| `module` | Shared across a test module (NER engine) |
| `function` | Per-test isolation (default) |
| `autouse=True` | Auto-inject into all tests |
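For illustration, a minimal sketch of how each scope is declared (all fixture names here are hypothetical):

```python
from collections.abc import Iterator

import pytest


@pytest.fixture(scope="session")
def expensive_resource() -> dict:
    # Created once per test run; suited to DB containers or ML models.
    return {"connected": True}


@pytest.fixture(scope="module")
def shared_engine() -> dict:
    # Rebuilt once per test module (e.g. an NER engine).
    return {"model": "loaded"}


@pytest.fixture  # scope="function" is the default: per-test isolation
def fresh_state() -> dict:
    return {}


@pytest.fixture(autouse=True)
def reset_context() -> Iterator[None]:
    # autouse fixtures run for every test without being requested.
    yield
```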
## Adding New Tests

### 1. Choose Location

Mirror the source structure:

- `src/noteflow/domain/entities/meeting.py` → `tests/domain/test_meeting.py`
- `src/noteflow/infrastructure/ner/engine.py` → `tests/infrastructure/ner/test_engine.py`
### 2. Use Existing Fixtures

Check `conftest.py` files for reusable fixtures before creating new ones.
### 3. Follow Patterns

```python
# tests/<module>/test_my_feature.py
import pytest

from noteflow.<module> import MyFeature


class TestMyFeature:
    """Tests for MyFeature class."""

    def test_basic_operation(self, sample_meeting: Meeting) -> None:
        """Test basic operation with sample meeting."""
        feature = MyFeature()
        result = feature.process(sample_meeting)
        assert result.success is True

    @pytest.mark.parametrize(
        ("input_value", "expected"),
        [
            ("valid", True),
            ("invalid", False),
        ],
    )
    def test_validation(self, input_value: str, expected: bool) -> None:
        """Test validation with various inputs."""
        result = MyFeature.validate(input_value)
        assert result == expected

    @pytest.mark.slow
    def test_with_model_loading(self) -> None:
        """Test requiring ML model loading."""
        ...
```
### 4. Add Fixtures to conftest.py

```python
# tests/<module>/conftest.py
import pytest

from noteflow.<module> import MyFeature


@pytest.fixture
def my_feature() -> MyFeature:
    """Configured MyFeature instance."""
    return MyFeature(config=test_config)
```
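Tests in the same package can then request the fixture by parameter name; no import is needed:

```python
def test_my_feature_processes(my_feature: MyFeature) -> None:
    # pytest injects the conftest fixture by matching the argument name.
    assert my_feature is not None
```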
## Forbidden Patterns

### ❌ Loops in Test Assertions

```python
# WRONG
def test_items() -> None:
    for item in items:
        assert item.valid


# RIGHT: Use parametrize
@pytest.mark.parametrize("item", items)
def test_item_valid(item: Item) -> None:
    assert item.valid
```
### ❌ Conditionals in Assertions

```python
# WRONG
def test_result() -> None:
    if condition:
        assert result == expected_a
    else:
        assert result == expected_b

# RIGHT: Separate tests or parametrize
```
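For example, the branch above folds into a single parametrized test (the `compute` helper and expected values are illustrative stand-ins):

```python
@pytest.mark.parametrize(
    ("condition", "expected"),
    [
        (True, expected_a),
        (False, expected_b),
    ],
)
def test_result(condition: bool, expected: object) -> None:
    result = compute(condition)  # illustrative stand-in for the code under test
    assert result == expected
```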
### ❌ Modifying Quality Baselines

NEVER add entries to `baselines.json` without explicit approval. Fix the actual code violation instead.
### ❌ Using print() for Debugging

```python
# WRONG
def test_something() -> None:
    print(f"Debug: {result}")  # Will be captured, not shown


# RIGHT: Use pytest's capsys or assert messages
def test_something(capsys) -> None:
    ...
    captured = capsys.readouterr()
```
## Key Files Reference

| File | Purpose |
|---|---|
| `conftest.py` | Root fixtures (524 lines) |
| `quality/baselines.json` | Frozen violation baseline |
| `quality/_baseline.py` | Baseline comparison utilities |
| `quality/_detectors/` | AST-based detectors |
| `integration/conftest.py` | PostgreSQL testcontainers setup |
| `stress/conftest.py` | Stress test fixtures |
| `fixtures/audio/` | Audio sample files |
## See Also

- `/src/noteflow/CLAUDE.md` — Python backend standards
- `/pyproject.toml` — pytest configuration
- `/support/` — Test support utilities (non-test code)