Python Tests Development Guide

Overview

The test suite (tests/) provides comprehensive coverage for the NoteFlow Python backend. Tests are organized to mirror the source structure and enforce quality gates through baseline comparison.

Architecture: pytest + pytest-asyncio + testcontainers for integration


Directory Structure

tests/
├── application/          # Service and use case tests
├── benchmarks/          # Performance benchmarking tests
├── config/              # Configuration tests
├── domain/              # Domain entity and value object tests
├── fixtures/            # Test data files (audio samples)
│   └── audio/           # Audio fixture files
├── grpc/                # gRPC servicer and streaming tests
├── infrastructure/      # Infrastructure adapter tests
│   ├── asr/             # Speech-to-text engine tests
│   ├── audio/           # Audio processing tests
│   ├── auth/            # Authentication/OAuth tests
│   ├── calendar/        # Calendar integration tests
│   ├── diarization/     # Speaker diarization tests
│   ├── export/          # Export functionality tests
│   ├── gpu/             # GPU hardware detection tests
│   ├── metrics/         # Metrics and observability tests
│   ├── ner/             # Named Entity Recognition tests
│   ├── observability/   # OpenTelemetry instrumentation tests
│   ├── persistence/     # Database ORM tests
│   ├── security/        # Encryption/crypto tests
│   ├── summarization/   # Summary generation tests
│   ├── triggers/        # Window trigger detection tests
│   └── webhooks/        # Webhook delivery tests
├── integration/         # Full system integration tests (PostgreSQL)
├── quality/             # Code quality and static analysis tests
│   ├── _detectors/      # Quality rule detector implementations
│   └── baselines.json   # Frozen violation baseline
├── stress/              # Stress, concurrency, and fuzz tests
└── conftest.py          # Root-level shared fixtures

Running Tests

# All tests
pytest

# Skip slow tests (model loading)
pytest -m "not slow"

# Integration tests only
pytest -m integration

# Quality gate checks
pytest tests/quality/

# Stress/fuzz tests
pytest tests/stress/

# Specific test file
pytest tests/domain/test_meeting.py

# With coverage
pytest --cov=src/noteflow --cov-report=html

Pytest Markers

| Marker | Purpose |
| --- | --- |
| `@pytest.mark.slow` | Model loading, GPU operations |
| `@pytest.mark.integration` | External services (PostgreSQL) |
| `@pytest.mark.stress` | Stress and concurrency tests |

Usage:

@pytest.mark.slow
@pytest.mark.parametrize("model_size", ["tiny", "base"])
def test_asr_model_loading(model_size: str) -> None:
    ...

Key Fixtures

Root Fixtures (conftest.py)

| Fixture | Scope | Purpose |
| --- | --- | --- |
| `mock_optional_extras` | session | Mocks openai, anthropic, ollama (at collection time) |
| `reset_context_vars` | function | Isolates logging context variables |
| `mock_uow` | function | Full UnitOfWork mock with all repos |
| `meeting_id` | function | MeetingId (UUID-based) |
| `sample_meeting` | function | Meeting in CREATED state |
| `recording_meeting` | function | Meeting in RECORDING state |
| `sample_rate` | function | 16000 (DEFAULT_SAMPLE_RATE) |
| `crypto` | function | AesGcmCryptoBox with InMemoryKeyStore |
| `meetings_dir` | function | Temporary meetings directory |
| `webhook_config` | function | WebhookConfig for MEETING_COMPLETED |
| `mock_grpc_context` | function | Mock gRPC ServicerContext |
| `mock_asr_engine` | function | Mock ASR engine |
| `memory_servicer` | function | In-memory NoteFlowServicer |

Utility Functions (not fixtures):

  • approx_float(expected, rel, abs) — Type-safe pytest.approx wrapper (see the sketch below)
  • approx_sequence(expected, rel, abs) — Float sequence comparison
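A minimal usage sketch for the float helper, assuming a hypothetical normalize_confidence function under test and illustrative tolerances:

def test_confidence_is_normalized() -> None:
    score = normalize_confidence(0.4999999)  # hypothetical helper under test
    # approx_float wraps pytest.approx with explicit relative/absolute tolerances
    assert score == approx_float(0.5, rel=1e-6, abs=1e-9)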

Integration Fixtures (integration/conftest.py)

| Fixture | Purpose |
| --- | --- |
| `session_factory` | PostgreSQL async_sessionmaker (testcontainers) |
| `session` | Individual AsyncSession with rollback |
| `persisted_meeting` | Created and persisted Meeting |
| `stopped_meeting_with_segments` | Meeting in STOPPED state with speaker segments |
| `audio_fixture_path` | Path to sample_discord.wav |
| `audio_samples` | Normalized float32 array |
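A typical integration test combines these fixtures with the integration marker. The sketch below assumes the MeetingModel ORM class and a directly comparable id column, which may differ from the real schema:

from sqlalchemy import select


@pytest.mark.integration
async def test_persisted_meeting_is_queryable(
    session: AsyncSession, persisted_meeting: Meeting
) -> None:
    # Query through the real PostgreSQL session provided by testcontainers
    result = await session.execute(
        select(MeetingModel).where(MeetingModel.id == persisted_meeting.id)
    )
    assert result.scalar_one_or_none() is not None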

Quality Fixtures (quality/)

| Fixture | Purpose |
| --- | --- |
| `baselines.json` | Frozen violation counts by rule |
| `_baseline.py` | Baseline comparison utilities |
| `_detectors/` | AST-based detection implementations |

Testing Patterns

1. Declarative Tests (No Loops)

CRITICAL: Test functions must NOT contain loops. Use parametrize:

# ✅ CORRECT: Parametrized test
@pytest.mark.parametrize(
    ("text", "expected_category"),
    [
        ("John Smith", EntityCategory.PERSON),
        ("Apple Inc.", EntityCategory.COMPANY),
        ("Python", EntityCategory.TECHNICAL),
    ],
)
def test_extract_entity_category(text: str, expected_category: EntityCategory) -> None:
    result = extract_entity(text)
    assert result.category == expected_category


# ❌ WRONG: Loop in test
def test_extract_entity_categories() -> None:
    test_cases = [("John Smith", EntityCategory.PERSON), ...]
    for text, expected in test_cases:  # FORBIDDEN
        result = extract_entity(text)
        assert result.category == expected

2. Async Testing

# Async fixtures
@pytest.fixture
async def persisted_meeting(session: AsyncSession) -> Meeting:
    meeting = Meeting.create(...)
    session.add(MeetingModel.from_entity(meeting))
    await session.commit()
    return meeting


# Async tests (auto-marked by pytest-asyncio)
async def test_fetch_meeting(uow: UnitOfWork, persisted_meeting: Meeting) -> None:
    result = await uow.meetings.get(persisted_meeting.id)
    assert result is not None

3. Mock Strategies

Repository Mocks:

@pytest.fixture
def mock_meetings_repo() -> AsyncMock:
    repo = AsyncMock(spec=MeetingRepository)
    repo.get = AsyncMock(return_value=None)
    repo.create = AsyncMock()
    return repo

Service Mocks with Side Effects:

@pytest.fixture
def mock_executor(captured_payloads: list) -> AsyncMock:
    async def capture_delivery(config, event_type, payload):
        captured_payloads.append(payload)
        return WebhookDelivery(status_code=200, ...)

    executor = AsyncMock(spec=WebhookExecutor)
    executor.deliver = AsyncMock(side_effect=capture_delivery)
    return executor
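A test can then drive the code under test and assert on what the mock captured. A sketch, assuming a hypothetical notify_meeting_completed entry point and a captured_payloads list fixture:

async def test_delivery_is_recorded(
    mock_executor: AsyncMock,
    captured_payloads: list,
    webhook_config: WebhookConfig,
    sample_meeting: Meeting,
) -> None:
    # notify_meeting_completed is a hypothetical service entry point
    await notify_meeting_completed(mock_executor, webhook_config, sample_meeting)
    # The side effect recorded exactly one delivery payload
    assert len(captured_payloads) == 1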

4. Quality Gate Pattern

Tests in tests/quality/ use baseline comparison:

def test_no_high_complexity_functions() -> None:
    violations = collect_high_complexity(parse_errors=[])
    assert_no_new_violations("high_complexity", violations)

Behavior:

  • Fails if NEW violations are introduced
  • Passes if violations match or decrease from baseline
  • Baseline is frozen in baselines.json

5. Integration Test Setup

@pytest.fixture(scope="session")
async def session_factory() -> AsyncGenerator[async_sessionmaker, None]:
    _, database_url = get_or_create_container()  # testcontainers
    engine = create_test_engine(database_url)
    async with engine.begin() as conn:
        await initialize_test_schema(conn)
    yield create_test_session_factory(engine)
    async with engine.begin() as conn:
        await cleanup_test_schema(conn)
    await engine.dispose()
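The per-test session fixture builds on the factory and discards leftover state after each test. This is a sketch of that shape, assuming rollback-based isolation; the real fixture may instead wrap each test in an outer transaction:

@pytest.fixture
async def session(
    session_factory: async_sessionmaker,
) -> AsyncGenerator[AsyncSession, None]:
    async with session_factory() as db_session:
        try:
            yield db_session
        finally:
            # Discard uncommitted changes left by the test
            await db_session.rollback()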

Quality Tests (tests/quality/)

Quality Rules

| Test File | Rule | Description |
| --- | --- | --- |
| `test_code_smells.py` | high_complexity | Cyclomatic complexity > threshold |
| `test_code_smells.py` | god_class | Classes with too many methods |
| `test_code_smells.py` | deep_nesting | Nesting depth > 7 levels |
| `test_code_smells.py` | long_method | Methods > 50 lines |
| `test_test_smells.py` | test_loop | Loops in test assertions |
| `test_test_smells.py` | test_conditional | Conditionals in assertions |
| `test_magic_values.py` | magic_number | Unextracted numeric constants |
| `test_duplicate_code.py` | code_duplication | Repeated code blocks |
| `test_stale_code.py` | dead_code | Unreachable code |
| `test_unnecessary_wrappers.py` | thin_wrapper | Unnecessary wrapper functions |
| `test_decentralized_helpers.py` | helper_sprawl | Unconsolidated helpers |
| `test_baseline_self.py` | baseline_integrity | Baseline file validation |
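Each rule is backed by a small AST-based detector under _detectors/. The following is an illustrative sketch of how a deep-nesting check might work, not the project's actual implementation:

import ast


def find_deeply_nested_functions(source: str, max_depth: int = 7) -> list[str]:
    """Return names of functions whose block nesting exceeds max_depth (sketch)."""
    nested_blocks = (ast.If, ast.For, ast.While, ast.With, ast.Try)

    def depth(node: ast.AST, current: int = 0) -> int:
        # Recurse into children, counting each nested block statement along the path
        child_depths = [
            depth(child, current + isinstance(child, nested_blocks))
            for child in ast.iter_child_nodes(node)
        ]
        return max(child_depths, default=current)

    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and depth(node) > max_depth
    ]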

Baseline System

# _baseline.py
@dataclass
class Violation:
    rule: str
    relative_path: str
    identifier: str
    detail: str | None = None

    @property
    def stable_id(self) -> str:
        """Unique identifier: rule|relative_path|identifier[|detail]"""
        ...

def assert_no_new_violations(rule: str, violations: list[Violation]) -> None:
    """Compares current violations against frozen baseline."""
    baseline = load_baseline()
    result = compare_violations(baseline[rule], violations)
    assert result.passed, f"New violations: {result.new_violations}"
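The comparison itself reduces to set arithmetic over stable IDs. A condensed sketch of the idea, with a hypothetical ComparisonResult shape (the real types and signatures may differ):

from dataclasses import dataclass, field


@dataclass
class ComparisonResult:
    passed: bool
    new_violations: set[str] = field(default_factory=set)


def compare_violations(
    baseline_ids: list[str], current: list[Violation]
) -> ComparisonResult:
    current_ids = {violation.stable_id for violation in current}
    new_ids = current_ids - set(baseline_ids)  # anything not in the frozen baseline
    # Fixed violations simply disappear; only newly introduced ones fail the gate
    return ComparisonResult(passed=not new_ids, new_violations=new_ids)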

Running Quality Checks

# All quality tests
pytest tests/quality/

# Specific rule
pytest tests/quality/test_code_smells.py

# Update baseline (REQUIRES APPROVAL)
# Do NOT do this without explicit permission

Fixture Scope Guidelines

| Scope | Use When |
| --- | --- |
| session | Expensive setup (DB containers, ML models) |
| module | Shared across a test module (e.g., NER engine) |
| function | Per-test isolation (default) |
| autouse=True | Auto-injected into every test |
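For example, an expensive engine can be shared at module scope while an autouse fixture performs per-test cleanup. A sketch with hypothetical names (NerEngine, clear_entity_cache, and the model name are illustrative):

from collections.abc import Iterator

import pytest


@pytest.fixture(scope="module")
def ner_engine() -> NerEngine:
    """Load the NER model once per test module."""
    return NerEngine.load("en_core_web_sm")  # hypothetical engine; model name is illustrative


@pytest.fixture(autouse=True)
def _clear_entity_cache() -> Iterator[None]:
    """Runs around every test without being requested explicitly."""
    yield
    clear_entity_cache()  # hypothetical cleanup helper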

Adding New Tests

1. Choose Location

Mirror the source structure:

  • src/noteflow/domain/entities/meeting.py → tests/domain/test_meeting.py
  • src/noteflow/infrastructure/ner/engine.py → tests/infrastructure/ner/test_engine.py

2. Use Existing Fixtures

Check conftest.py files for reusable fixtures before creating new ones.

3. Follow Patterns

# tests/<module>/test_my_feature.py
import pytest
from noteflow.<module> import MyFeature


class TestMyFeature:
    """Tests for MyFeature class."""

    def test_basic_operation(self, sample_meeting: Meeting) -> None:
        """Test basic operation with sample meeting."""
        feature = MyFeature()
        result = feature.process(sample_meeting)
        assert result.success is True

    @pytest.mark.parametrize(
        ("input_value", "expected"),
        [
            ("valid", True),
            ("invalid", False),
        ],
    )
    def test_validation(self, input_value: str, expected: bool) -> None:
        """Test validation with various inputs."""
        result = MyFeature.validate(input_value)
        assert result == expected

    @pytest.mark.slow
    def test_with_model_loading(self) -> None:
        """Test requiring ML model loading."""
        ...

4. Add Fixtures to conftest.py

# tests/<module>/conftest.py
import pytest
from noteflow.<module> import MyFeature


@pytest.fixture
def my_feature() -> MyFeature:
    """Configured MyFeature instance."""
    return MyFeature(config=test_config)  # test_config: shared config defined elsewhere in this conftest

Forbidden Patterns

Loops in Test Assertions

# WRONG
def test_items() -> None:
    for item in items:
        assert item.valid

# RIGHT: Use parametrize
@pytest.mark.parametrize("item", items)
def test_item_valid(item: Item) -> None:
    assert item.valid

Conditionals in Assertions

# WRONG
def test_result() -> None:
    if condition:
        assert result == expected_a
    else:
        assert result == expected_b

# RIGHT: Separate tests or parametrize
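# For example (compute_result, expected_a, and expected_b are illustrative):
@pytest.mark.parametrize(
    ("condition", "expected"),
    [
        (True, expected_a),
        (False, expected_b),
    ],
)
def test_result(condition: bool, expected: object) -> None:
    result = compute_result(condition)  # hypothetical function under test
    assert result == expected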

Modifying Quality Baselines

NEVER add entries to baselines.json without explicit approval. Fix the actual code violation instead.

Using print() for Debugging

# WRONG
def test_something() -> None:
    print(f"Debug: {result}")  # Will be captured, not shown

# RIGHT: Use pytest's capsys or assert messages
def test_something(capsys: pytest.CaptureFixture[str]) -> None:
    ...
    captured = capsys.readouterr()
    assert "expected output" in captured.out

Key Files Reference

| File | Purpose |
| --- | --- |
| `conftest.py` | Root fixtures (524 lines) |
| `quality/baselines.json` | Frozen violation baseline |
| `quality/_baseline.py` | Baseline comparison utilities |
| `quality/_detectors/` | AST-based detectors |
| `integration/conftest.py` | PostgreSQL testcontainers setup |
| `stress/conftest.py` | Stress test fixtures |
| `fixtures/audio/` | Audio sample files |

See Also

  • /src/noteflow/CLAUDE.md — Python backend standards
  • /pyproject.toml — pytest configuration
  • /support/ — Test support utilities (non-test code)