BizBud Testing Architecture: A Guide for Developers and LLMs

This document provides a comprehensive overview of the testing architecture for the BizBud project. It is designed to help developers and LLM code assistants quickly understand how to write, run, and maintain high-quality tests.

1. How to Run Tests

All tests are executed using pytest from the project root directory.

# Activate the virtual environment
source .venv/bin/activate

# Run all tests
pytest

# Run only unit tests
pytest -m unit

# Run only integration tests
pytest -m integration

# Run a specific test file
pytest tests/unit_tests/nodes/core/test_error.py

# Run tests and skip slow ones
pytest -m "not slow"

2. Guiding Principles

  • Clarity and Readability: Tests should be easy to understand.
  • Isolation: Unit tests must not have external dependencies (network, database). Mocks and stubs are used to achieve this.
  • Automation: All tests are designed to be run automatically in a CI/CD pipeline.
  • Convention over Configuration: The testing framework follows established conventions to minimize setup.
  • No Loops in Tests: Test functions should never contain loops - use fixtures or parametrization instead.
  • No Conditionals in Tests: Test functions should never contain if/else statements - separate into distinct test functions.
  • Single Purpose: Each test should have a single, clear assertion or responsibility.

3. Core Testing Patterns & Abstractions

To write effective tests, it's crucial to understand these core patterns.

Pattern 1: Centralized Configuration Fixture

Tests rely on a consistent, mock configuration provided by the base_config_dict fixture. This ensures that tests run in a predictable environment without depending on local .env files.

  • Source: tests/helpers/fixtures/config_fixtures.py
  • Usage: This fixture is often used by other fixtures to construct a validated AppConfig object for the application state.
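
A minimal usage sketch (hedged: the AppConfig import path and direct keyword construction are assumptions for illustration, not the exact project API):

# Hedged sketch: base_config_dict is the shared mock configuration fixture;
# the import path for AppConfig is a hypothetical placeholder.
from bizbud.config import AppConfig  # hypothetical import path

def test_app_config_builds_from_mock(base_config_dict):
    config = AppConfig(**base_config_dict)
    assert isinstance(config, AppConfig)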

Pattern 2: State Scaffolding with Fixtures and Factories

Given the graph-based architecture, setting up the initial state is the most common testing task. We use a combination of fixtures and factories for this.

  • State Fixtures: For common scenarios, dedicated state fixtures provide a ready-to-use state dictionary (e.g., base_state, research_state).
    • Source: tests/helpers/fixtures/state_fixtures.py
    • Example: def test_something(research_state): ...
  • State Factories: For tests requiring custom state, factory fixtures allow you to build a state dictionary with specific overrides.
    • Source: tests/helpers/fixtures/factory_fixtures.py
    • Example:
      def test_with_custom_query(research_state_factory):
          # Create a research state with a specific query
          custom_state = research_state_factory(query="my custom query")
          # ... run test with custom_state
      

Pattern 3: Parametrized Testing for Multiple Scenarios

When testing multiple data sets or scenarios, use @pytest.mark.parametrize instead of loops:

# ❌ DON'T: Use loops in test functions
def test_user_processing_bad():
    users = get_users()
    for user in users:
        result = process_user(user)
        assert result.success

# ✅ DO: Use parametrization
@pytest.mark.parametrize("user", get_users())
def test_user_processing(user):
    result = process_user(user)
    assert result.success

# ✅ ALTERNATIVE: Use a fixture and a single all() assertion
def test_user_processing_batch(users):
    results = [process_user(user) for user in users]
    assert all(result.success for result in results)

Pattern 4: Separate Test Functions for Different Scenarios

When testing different conditions, create separate test functions instead of using conditionals:

# ❌ DON'T: Use conditionals in tests
def test_api_response_bad():
    response = api_call()
    if response.status_code == 200:
        assert "data" in response.json()
    elif response.status_code == 404:
        assert "error" in response.json()

# ✅ DO: Separate test functions with fixtures
from unittest.mock import Mock

@pytest.fixture
def success_response():
    return Mock(status_code=200, json=lambda: {"data": []})

@pytest.fixture
def not_found_response():
    return Mock(status_code=404, json=lambda: {"error": "Not found"})

def test_successful_api_response(success_response):
    assert "data" in success_response.json()

def test_not_found_api_response(not_found_response):
    assert "error" in not_found_response.json()

Pattern 5: Complex Data Preparation in Fixtures

Move all complex setup logic to fixtures, keeping test functions clean:

# ✅ Complex setup in fixtures is acceptable
@pytest.fixture
def complex_test_data():
    """Generate complex nested test data."""
    data = {"categories": [], "items": []}
    
    # Complex setup logic is acceptable in fixtures
    categories = ["electronics", "books", "clothing"]
    for category in categories:
        cat_data = {"name": category, "items": []}
        
        for i in range(3):
            item = {
                "id": f"{category}_{i}",
                "name": f"{category.title()} Item {i}",
                "price": (i + 1) * 10.0,
                "category": category
            }
            cat_data["items"].append(item)
            data["items"].append(item)
        
        data["categories"].append(cat_data)
    
    return data

def test_data_structure(complex_test_data):
    assert len(complex_test_data["categories"]) == 3
    assert len(complex_test_data["items"]) == 9

Pattern 6: Async Fixture Patterns

For async operations, use async fixtures:

@pytest.fixture
async def async_test_data():
    """Async fixture for data that requires async setup."""
    data = []
    
    # Async setup operations
    for i in range(5):
        item = await create_async_item(i)
        data.append(item)
    
    return data

@pytest.fixture
async def connected_client():
    """Async fixture for connected client."""
    client = AsyncClient()
    await client.connect()
    
    yield client
    
    await client.disconnect()
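
A test consumes these async fixtures like any other fixture. The sketch below assumes pytest-asyncio runs in auto mode (so the plain @pytest.fixture decorators above are collected as async fixtures); fetch_items is a hypothetical method on AsyncClient used only for illustration.

# Hedged sketch: assumes pytest-asyncio auto mode; in strict mode the fixtures
# above would need @pytest_asyncio.fixture and the test an explicit marker.
async def test_connected_client_round_trip(connected_client, async_test_data):
    items = await connected_client.fetch_items()  # hypothetical method
    assert len(items) == len(async_test_data)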

4. Directory Structure

The tests/ directory is organized by function and test type:

  • tests/unit_tests/: Contains tests for individual components (e.g., functions, classes) in complete isolation. These are fast, focused, and must not have external dependencies like network or database access.
  • tests/integration_tests/: For tests that verify the interaction between multiple components. These tests ensure that different parts of the application work together as expected. External services are typically mocked or stubbed.
  • tests/e2e/: Holds end-to-end tests that simulate a full user workflow from start to finish. These are the most comprehensive tests, validating the entire application stack.
  • tests/helpers/: A crucial directory for shared testing infrastructure, including:
    • assertions/: Custom assertion functions for more descriptive test failures.
    • factories/: Reusable functions for creating test data and objects.
    • fixtures/: Shared Pytest fixtures that can be used across the entire test suite.
    • mocks/: Pre-configured mock objects and patchers.
  • tests/cassettes/: Stores vcrpy cassettes, which are recordings of live HTTP interactions. Using cassettes allows API-dependent tests to run quickly and deterministically without making actual network calls.
  • tests/manual/: Contains scripts for manual testing, debugging, and verification. These are not part of the automated test suite and are used for ad-hoc checks or to diagnose complex issues.
  • tests/meta/: Includes "meta-tests" that validate the testing architecture itself. These tests enforce conventions, ensure fixtures are correctly configured, and maintain the overall integrity of the test suite.
  • tests/conftest.py: The root Pytest configuration file. It defines project-wide fixtures, hooks, and custom markers, making them globally available.

5. Core Technologies & Libraries

  • Test Runner: Pytest is the primary framework for writing and running tests.
  • Asynchronous Testing: pytest-asyncio is used to test async code, with the anyio_backend fixture configured to use asyncio.
  • Mocking:
    • unittest.mock: Used for standard mocking of objects and functions.
    • vcrpy: Used to record and replay HTTP interactions, ensuring tests that involve APIs are fast and reliable. Cassettes are stored in tests/cassettes/ (a usage sketch follows this list).
  • Assertions: Custom assertion helpers are defined in tests/helpers/assertions/ to provide more descriptive failure messages.
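
A hedged sketch of a cassette-backed test, assuming the standard vcrpy decorator API; fetch_listing is a hypothetical client function, and the project may wrap cassette handling in a shared fixture instead:

import vcr

# Hedged sketch: the cassette path mirrors tests/cassettes/; fetch_listing is a
# hypothetical API call used only for illustration.
@vcr.use_cassette("tests/cassettes/fetch_listing.yaml")
def test_fetch_listing_replays_cassette():
    result = fetch_listing("coffee shop")
    assert result["status"] == "ok"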

6. Key Fixtures & Environment

Fixtures provide a fixed baseline for tests. The test environment is automatically configured by the setup_test_environment fixture in tests/conftest.py, which sets ENVIRONMENT="test" and other variables for every test session.

  • Session-Scoped: anyio_backend, setup_test_environment
  • Function-Scoped: clean_state provides a fresh, default state for each test.
  • Imported Fixtures: The root conftest.py globally imports fixtures from tests/helpers/fixtures/ (config_fixtures, factory_fixtures, mock_fixtures, state_fixtures), making them available to all tests without explicit imports.
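
For orientation, a minimal sketch of what a session-scoped environment fixture like setup_test_environment might look like (the real implementation in tests/conftest.py may set additional variables and restore state differently):

import os
import pytest

# Hedged sketch: illustrative only; the real fixture lives in tests/conftest.py.
@pytest.fixture(scope="session", autouse=True)
def setup_test_environment():
    original = dict(os.environ)
    os.environ["ENVIRONMENT"] = "test"
    yield
    os.environ.clear()
    os.environ.update(original)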

7. Test Markers

Custom Pytest markers are used to categorize tests:

  • @pytest.mark.unit: For unit tests.
  • @pytest.mark.integration: For integration tests.
  • @pytest.mark.e2e: For end-to-end tests.
  • @pytest.mark.slow: For tests that are known to be slow.
  • @pytest.mark.web: For tests requiring a live internet connection (used sparingly).
  • @pytest.mark.browser: For tests that require browser automation.
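
Markers are applied as stacked decorators and selected with pytest's -m flag; the sketch below is illustrative and assumes the markers are registered in tests/conftest.py as noted in section 4:

import pytest

# Hedged sketch: `pytest -m integration` selects this test,
# while `pytest -m "not slow"` skips it.
@pytest.mark.integration
@pytest.mark.slow
def test_research_pipeline_round_trip(research_state):
    ...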

8. CI/CD Integration

[Placeholder] This section should describe how tests are run in the CI/CD pipeline (e.g., GitHub Actions). It should specify which test suites are run, how coverage is measured, and any other relevant details.

9. Test Policy Compliance & Quality Standards

No-Loops and No-Conditionals Policy

Our testing framework enforces strict quality standards to ensure maintainable, readable, and reliable tests:

Policy Rules:

  • No loops in test functions: Use @pytest.mark.parametrize or fixtures instead
  • No conditionals in test functions: Create separate test functions for different scenarios
  • All complex logic belongs in fixtures: Test functions should contain only assertions

Common Violations and Fixes:

Violation: Result Collection Validation

# ❌ VIOLATION: Loop in test function
def test_multiple_results():
    results = get_results()
    for result in results:
        assert result["status"] == "success"

# ✅ FIX: Use generator expression
def test_multiple_results():
    results = get_results()
    assert all(r["status"] == "success" for r in results)

# ✅ ALTERNATIVE: Parametrized testing
@pytest.mark.parametrize("result", get_results())
def test_individual_result(result):
    assert result["status"] == "success"

Violation: Conditional Logic

# ❌ VIOLATION: Conditional in test function
def test_response_handling():
    response = get_response()
    if "data" in response:
        assert len(response["data"]) > 0
    else:
        assert "error" in response

# ✅ FIX: Separate test functions
def test_successful_response():
    response = get_successful_response()  # Fixture
    assert "data" in response
    assert len(response["data"]) > 0

def test_error_response():
    response = get_error_response()  # Fixture
    assert "error" in response

Fixture Architecture Guidelines

Fixture Categories and Naming Conventions

  • test_* - Raw test data
  • mock_* - Mocked services/objects
  • *_config - Configuration objects
  • *_client - Client connections
  • *_scenario - Parametrized scenarios
  • prepared_* - Data with preprocessing
  • clean_* - Clean state fixtures
  • populated_* - Pre-populated fixtures

Fixture Scoping for Performance

# Use appropriate scopes to optimize performance
@pytest.fixture(scope="session")  # Expensive setup, shared across tests
def expensive_resource():
    pass

@pytest.fixture(scope="module")   # Shared within test module
def module_config():
    pass

@pytest.fixture(scope="function") # Default, isolated per test
def test_data():
    pass

Mock and Patch Fixtures

from unittest.mock import patch

@pytest.fixture
def mock_external_service():
    """Mock external service responses."""
    with patch('app.services.external_service') as mock:
        mock.get_data.return_value = {"result": "mocked"}
        mock.post_data.return_value = {"status": "success"}
        yield mock

@pytest.fixture
def failing_external_service():
    """Mock external service that fails."""
    with patch('app.services.external_service') as mock:
        mock.get_data.side_effect = ConnectionError("Service unavailable")
        mock.post_data.side_effect = TimeoutError("Request timeout")
        yield mock

10. Writing New Tests: Best Practices

  1. Choose the Right Location: Place your test file in the appropriate directory (unit_tests, integration_tests, e2e) mirroring the source code structure.
  2. Follow Policy Rules: Never use loops or conditionals in test functions - use fixtures and parametrization instead.
  3. Use Core Patterns: Leverage state fixtures and factories to set up test conditions.
  4. Isolate Tests: For unit tests, aggressively mock external dependencies, including file system access, network requests, and database calls.
  5. Use Factories: For creating complex data objects, use the factories located in tests/helpers/factories/.
  6. Add Markers: Apply the appropriate markers (unit, integration, etc.) to your test functions.
  7. Write Clear Assertions: Use standard assert statements. For complex validations, consider adding a helper to tests/helpers/assertions/ (a sketch follows this list).
  8. Single Purpose: Each test should validate one specific behavior or outcome.
  9. Descriptive Names: Test and fixture names should clearly describe what they test or provide.
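
For item 7, a custom assertion helper in tests/helpers/assertions/ might take the following shape (the helper name and state keys are hypothetical):

# Hedged sketch: provides a descriptive failure message; names are illustrative.
def assert_state_has_results(state: dict, minimum: int = 1) -> None:
    results = state.get("results", [])
    assert len(results) >= minimum, (
        f"Expected at least {minimum} result(s), got {len(results)}: {results!r}"
    )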

Migration Checklist for Existing Tests

When updating existing tests to comply with policy:

  • Move all loops to fixtures or use parametrize
  • Replace conditionals with separate test functions
  • Extract data generation to fixtures
  • Use appropriate fixture scopes
  • Add type hints to fixtures (see the sketch after this checklist)
  • Follow naming conventions
  • Ensure test isolation
  • Maintain test coverage
  • Update test documentation
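
For the type-hint and naming-convention items, a compliant fixture might look like this (all names are illustrative):

import pytest

# Hedged sketch: follows the test_* convention for raw data and carries an
# explicit return type hint.
@pytest.fixture
def test_invoice_payload() -> dict[str, float]:
    return {"subtotal": 100.0, "tax": 8.0, "total": 108.0}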

11. Quality Assurance and CI Integration

Pre-commit Hooks

The project includes automated policy enforcement through pre-commit hooks:

- id: test-policy-check
  name: Test Policy Compliance
  entry: python scripts/check_test_policy.py
  language: python
  files: ^tests/.*\.py$
  args: [--strict]

Linting Integration

Test policy compliance is integrated with our standard linting tools:

  • Pyrefly: Advanced type checking with test policy validation
  • Ruff: Fast linting with custom test pattern detection
  • BasedPyright: Strict type checking including test patterns

Performance Benefits

Following these patterns provides measurable benefits:

  • Parallelization: Tests without loops/conditionals can run in parallel more effectively
  • Debugging: Failed tests have clearer, more isolated failure points
  • Maintenance: Consistent patterns make tests easier to understand and modify
  • Reliability: Reduced complexity leads to more stable test execution