BizBud Testing Architecture: A Guide for Developers and LLMs
This document provides a comprehensive overview of the testing architecture for the BizBud project. It is designed to help developers and LLM code assistants quickly understand how to write, run, and maintain high-quality tests.
1. How to Run Tests
All tests are executed using pytest from the project root directory.
# Activate the virtual environment
source .venv/bin/activate
# Run all tests
pytest
# Run only unit tests
pytest -m unit
# Run only integration tests
pytest -m integration
# Run a specific test file
pytest tests/unit_tests/nodes/core/test_error.py
# Run tests and skip slow ones
pytest -m "not slow"
2. Guiding Principles
- Clarity and Readability: Tests should be easy to understand.
- Isolation: Unit tests must not have external dependencies (network, database). Mocks and stubs are used to achieve this.
- Automation: All tests are designed to be run automatically in a CI/CD pipeline.
- Convention over Configuration: The testing framework follows established conventions to minimize setup.
- No Loops in Tests: Test functions should never contain loops - use fixtures or parametrization instead.
- No Conditionals in Tests: Test functions should never contain if/else statements - separate into distinct test functions.
- Single Purpose: Each test should have a single, clear assertion or responsibility.
3. Core Testing Patterns & Abstractions
To write effective tests, it's crucial to understand these core patterns.
Pattern 1: Centralized Configuration Fixture
Tests rely on a consistent, mock configuration provided by the base_config_dict fixture. This ensures that tests run in a predictable environment without depending on local .env files.
- Source: tests/helpers/fixtures/config_fixtures.py
- Usage: This fixture is often used by other fixtures to construct a validated AppConfig object for the application state (see the sketch below).
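For orientation, such a fixture usually just returns a plain dictionary of test-safe settings. The sketch below is hypothetical; the keys shown are placeholders, not the project's actual configuration schema:

```python
import pytest

@pytest.fixture
def base_config_dict():
    """Hypothetical sketch of a centralized mock configuration (keys are assumptions)."""
    return {
        "environment": "test",
        "llm": {"provider": "fake", "model": "test-model"},
        "timeouts": {"default_seconds": 5},
    }
```

Downstream fixtures can then validate this dictionary into an AppConfig instance before attaching it to the application state.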
Pattern 2: State Scaffolding with Fixtures and Factories
Given the graph-based architecture, setting up the initial state is the most common testing task. We use a combination of fixtures and factories for this.
- State Fixtures: For common scenarios, dedicated state fixtures provide a ready-to-use state dictionary (e.g., base_state, research_state).
  - Source: tests/helpers/fixtures/state_fixtures.py
  - Example: def test_something(research_state): ...
- State Factories: For tests requiring custom state, factory fixtures allow you to build a state dictionary with specific overrides (a sketch follows this list).
  - Source: tests/helpers/fixtures/factory_fixtures.py
  - Example:

        def test_with_custom_query(research_state_factory):
            # Create a research state with a specific query
            custom_state = research_state_factory(query="my custom query")
            # ... run test with custom_state
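A factory fixture of this kind is commonly implemented as a closure that merges per-test overrides into defaults. This is a hypothetical sketch; the default keys are assumptions about the state shape rather than the project's actual schema:

```python
import pytest

@pytest.fixture
def research_state_factory():
    """Hypothetical sketch: build a research state dict with per-test overrides."""
    def _make_state(**overrides):
        state = {
            "query": "default query",  # assumed default fields
            "results": [],
            "errors": [],
        }
        state.update(overrides)
        return state
    return _make_state
```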
Pattern 3: Parametrized Testing for Multiple Scenarios
When testing multiple data sets or scenarios, use @pytest.mark.parametrize instead of loops:
# ❌ DON'T: Use loops in test functions
def test_user_processing_bad():
    users = get_users()
    for user in users:
        result = process_user(user)
        assert result.success

# ✅ DO: Use parametrization
@pytest.mark.parametrize("user", get_users())
def test_user_processing(user):
    result = process_user(user)
    assert result.success

# ✅ ALTERNATIVE: Use fixtures with generator expressions
def test_user_processing_batch(users):
    results = [process_user(user) for user in users]
    assert all(result.success for result in results)
Pattern 4: Separate Test Functions for Different Scenarios
When testing different conditions, create separate test functions instead of using conditionals:
# ❌ DON'T: Use conditionals in tests
def test_api_response_bad():
    response = api_call()
    if response.status_code == 200:
        assert "data" in response.json()
    elif response.status_code == 404:
        assert "error" in response.json()

# ✅ DO: Separate test functions with fixtures
@pytest.fixture
def success_response():
    return Mock(status_code=200, json=lambda: {"data": []})

@pytest.fixture
def not_found_response():
    return Mock(status_code=404, json=lambda: {"error": "Not found"})

def test_successful_api_response(success_response):
    assert "data" in success_response.json()

def test_not_found_api_response(not_found_response):
    assert "error" in not_found_response.json()
Pattern 5: Complex Data Preparation in Fixtures
Move all complex setup logic to fixtures, keeping test functions clean:
# ✅ Complex setup in fixtures is acceptable
@pytest.fixture
def complex_test_data():
    """Generate complex nested test data."""
    data = {"categories": [], "items": []}
    # Complex setup logic is acceptable in fixtures
    categories = ["electronics", "books", "clothing"]
    for category in categories:
        cat_data = {"name": category, "items": []}
        for i in range(3):
            item = {
                "id": f"{category}_{i}",
                "name": f"{category.title()} Item {i}",
                "price": (i + 1) * 10.0,
                "category": category,
            }
            cat_data["items"].append(item)
            data["items"].append(item)
        data["categories"].append(cat_data)
    return data

def test_data_structure(complex_test_data):
    assert len(complex_test_data["categories"]) == 3
    assert len(complex_test_data["items"]) == 9
Pattern 6: Async Fixture Patterns
For async operations, use async fixtures:
@pytest.fixture
async def async_test_data():
    """Async fixture for data that requires async setup."""
    data = []
    # Async setup operations
    for i in range(5):
        item = await create_async_item(i)
        data.append(item)
    return data

@pytest.fixture
async def connected_client():
    """Async fixture for connected client."""
    client = AsyncClient()
    await client.connect()
    yield client
    await client.disconnect()
4. Directory Structure
The tests/ directory is organized by function and test type:
- tests/unit_tests/: Contains tests for individual components (e.g., functions, classes) in complete isolation. These are fast, focused, and must not have external dependencies like network or database access.
- tests/integration_tests/: For tests that verify the interaction between multiple components. These tests ensure that different parts of the application work together as expected. External services are typically mocked or stubbed.
- tests/e2e/: Holds end-to-end tests that simulate a full user workflow from start to finish. These are the most comprehensive tests, validating the entire application stack.
- tests/helpers/: A crucial directory for shared testing infrastructure, including:
  - assertions/: Custom assertion functions for more descriptive test failures.
  - factories/: Reusable functions for creating test data and objects.
  - fixtures/: Shared Pytest fixtures that can be used across the entire test suite.
  - mocks/: Pre-configured mock objects and patchers.
- tests/cassettes/: Stores vcrpy cassettes, which are recordings of live HTTP interactions. Using cassettes allows API-dependent tests to run quickly and deterministically without making actual network calls.
- tests/manual/: Contains scripts for manual testing, debugging, and verification. These are not part of the automated test suite and are used for ad-hoc checks or to diagnose complex issues.
- tests/meta/: Includes "meta-tests" that validate the testing architecture itself. These tests enforce conventions, ensure fixtures are correctly configured, and maintain the overall integrity of the test suite.
- tests/conftest.py: The root Pytest configuration file. It defines project-wide fixtures, hooks, and custom markers, making them globally available.
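In summary, the layout looks roughly like this (only the entries described above are shown):

```
tests/
├── unit_tests/
├── integration_tests/
├── e2e/
├── helpers/
│   ├── assertions/
│   ├── factories/
│   ├── fixtures/
│   └── mocks/
├── cassettes/
├── manual/
├── meta/
└── conftest.py
```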
5. Core Technologies & Libraries
- Test Runner: Pytest is the primary framework for writing and running tests.
- Asynchronous Testing: pytest-asyncio is used to test async code, with the anyio_backend fixture configured to use asyncio.
- Mocking:
  - unittest.mock: Used for standard mocking of objects and functions.
  - vcrpy: Used to record and replay HTTP interactions, ensuring tests that involve APIs are fast and reliable. Cassettes are stored in tests/cassettes/ (illustrated below).
- Assertions: Custom assertion helpers are defined in tests/helpers/assertions/ to provide more descriptive failure messages.
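As an illustration only, a cassette-backed test typically looks like the sketch below; the cassette name, record mode, and the requests call are assumptions rather than project conventions:

```python
import requests
import vcr

# Hypothetical sketch: replay HTTP traffic from a cassette stored in tests/cassettes/.
my_vcr = vcr.VCR(cassette_library_dir="tests/cassettes", record_mode="once")

@my_vcr.use_cassette("example_api_call.yaml")
def test_fetch_remote_data():
    # On the first (recording) run this hits the network; afterwards it replays the cassette.
    response = requests.get("https://api.example.com/items")
    assert response.status_code == 200
```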
6. Key Fixtures & Environment
Fixtures provide a fixed baseline for tests. The test environment is automatically configured by the setup_test_environment fixture in tests/conftest.py, which sets ENVIRONMENT="test" and other variables for every test session.
- Session-Scoped: anyio_backend, setup_test_environment (sketched below)
- Function-Scoped: clean_state provides a fresh, default state for each test.
- Imported Fixtures: The root conftest.py globally imports fixtures from tests/helpers/fixtures/ (config_fixtures, factory_fixtures, mock_fixtures, state_fixtures), making them available to all tests without explicit imports.
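The actual fixture lives in tests/conftest.py; as a rough, hypothetical sketch of the shape such a fixture usually takes (the extra variable name here is a placeholder, not a documented setting):

```python
import pytest

@pytest.fixture(scope="session", autouse=True)
def setup_test_environment():
    """Hypothetical sketch: pin test-safe environment variables for the whole session."""
    mp = pytest.MonkeyPatch()
    mp.setenv("ENVIRONMENT", "test")
    mp.setenv("LOG_LEVEL", "DEBUG")  # placeholder variable for illustration
    yield
    mp.undo()
```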
7. Test Markers
Custom Pytest markers are used to categorize tests:
- @pytest.mark.unit: For unit tests.
- @pytest.mark.integration: For integration tests.
- @pytest.mark.e2e: For end-to-end tests.
- @pytest.mark.slow: For tests that are known to be slow.
- @pytest.mark.web: For tests requiring a live internet connection (used sparingly).
- @pytest.mark.browser: For tests that require browser automation (a usage example follows).
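For illustration, markers are applied as decorators and can be combined; the test below is a placeholder, not an existing test:

```python
import pytest

@pytest.mark.integration
@pytest.mark.slow
def test_full_research_flow_placeholder():
    # Selected by `pytest -m integration`, excluded by `pytest -m "not slow"`.
    assert True  # stand-in for a real, single-purpose assertion
```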
8. CI/CD Integration
[Placeholder] This section should describe how tests are run in the CI/CD pipeline (e.g., GitHub Actions). It should specify which test suites are run, how coverage is measured, and any other relevant details.
9. Test Policy Compliance & Quality Standards
No-Loops and No-Conditionals Policy
Our testing framework enforces strict quality standards to ensure maintainable, readable, and reliable tests:
Policy Rules:
- No loops in test functions: Use @pytest.mark.parametrize or fixtures instead
- No conditionals in test functions: Create separate test functions for different scenarios
- All complex logic belongs in fixtures: Test functions should contain only assertions
Common Violations and Fixes:
Violation: Result Collection Validation
# ❌ VIOLATION: Loop in test function
def test_multiple_results():
    results = get_results()
    for result in results:
        assert result["status"] == "success"

# ✅ FIX: Use generator expression
def test_multiple_results():
    results = get_results()
    assert all(r["status"] == "success" for r in results)

# ✅ ALTERNATIVE: Parametrized testing
@pytest.mark.parametrize("result", get_results())
def test_individual_result(result):
    assert result["status"] == "success"
Violation: Conditional Logic
# ❌ VIOLATION: Conditional in test function
def test_response_handling():
    response = get_response()
    if "data" in response:
        assert len(response["data"]) > 0
    else:
        assert "error" in response

# ✅ FIX: Separate test functions
def test_successful_response():
    response = get_successful_response()  # Fixture
    assert "data" in response
    assert len(response["data"]) > 0

def test_error_response():
    response = get_error_response()  # Fixture
    assert "error" in response
Fixture Architecture Guidelines
Fixture Categories and Naming Conventions
- test_* - Raw test data
- mock_* - Mocked services/objects
- *_config - Configuration objects
- *_client - Client connections
- *_scenario - Parametrized scenarios
- prepared_* - Data with preprocessing
- clean_* - Clean state fixtures
- populated_* - Pre-populated fixtures (examples below)
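Purely as a naming illustration (none of these fixtures are guaranteed to exist in the repo):

```python
import pytest
from unittest.mock import Mock

@pytest.fixture
def mock_payment_service():
    """mock_*: a mocked service/object."""
    service = Mock()
    service.charge.return_value = {"status": "success"}
    return service

@pytest.fixture
def prepared_invoice_data():
    """prepared_*: data with preprocessing already applied."""
    raw = {"total": 125.0}
    return {**raw, "total_cents": int(raw["total"] * 100)}

@pytest.fixture
def clean_cart_state():
    """clean_*: a fresh, empty state for each test."""
    return {"items": [], "total": 0.0}
```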
Fixture Scoping for Performance
# Use appropriate scopes to optimize performance
@pytest.fixture(scope="session")  # Expensive setup, shared across tests
def expensive_resource():
    pass

@pytest.fixture(scope="module")  # Shared within test module
def module_config():
    pass

@pytest.fixture(scope="function")  # Default, isolated per test
def test_data():
    pass
Mock and Patch Fixtures
@pytest.fixture
def mock_external_service():
    """Mock external service responses."""
    with patch('app.services.external_service') as mock:
        mock.get_data.return_value = {"result": "mocked"}
        mock.post_data.return_value = {"status": "success"}
        yield mock

@pytest.fixture
def failing_external_service():
    """Mock external service that fails."""
    with patch('app.services.external_service') as mock:
        mock.get_data.side_effect = ConnectionError("Service unavailable")
        mock.post_data.side_effect = TimeoutError("Request timeout")
        yield mock
10. Writing New Tests: Best Practices
- Choose the Right Location: Place your test file in the appropriate directory (unit_tests, integration_tests, e2e), mirroring the source code structure.
- Follow Policy Rules: Never use loops or conditionals in test functions; use fixtures and parametrization instead.
- Use Core Patterns: Leverage state fixtures and factories to set up test conditions.
- Isolate Tests: For unit tests, aggressively mock external dependencies, including file system access, network requests, and database calls.
- Use Factories: For creating complex data objects, use the factories located in tests/helpers/factories/.
- Add Markers: Apply the appropriate markers (unit, integration, etc.) to your test functions.
- Write Clear Assertions: Use standard assert statements. For complex validations, consider adding a helper to tests/helpers/assertions/.
- Single Purpose: Each test should validate one specific behavior or outcome.
- Descriptive Names: Test and fixture names should clearly describe what they test or provide (a combined example follows).
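Putting these practices together, a unit test might look like the sketch below; research_state is the shared fixture from Section 3, and the asserted "query" key is an assumption about the state shape used only for illustration:

```python
import pytest

@pytest.mark.unit
def test_research_state_includes_query(research_state):
    """Single-purpose check on the prepared research state."""
    assert isinstance(research_state.get("query"), str)
```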
Migration Checklist for Existing Tests
When updating existing tests to comply with policy:
- ✅ Move all loops to fixtures or use parametrize
- ✅ Replace conditionals with separate test functions
- ✅ Extract data generation to fixtures
- ✅ Use appropriate fixture scopes
- ✅ Add type hints to fixtures
- ✅ Follow naming conventions
- ✅ Ensure test isolation
- ✅ Maintain test coverage
- ✅ Update test documentation
11. Quality Assurance and CI Integration
Pre-commit Hooks
The project includes automated policy enforcement through pre-commit hooks:
- id: test-policy-check
  name: Test Policy Compliance
  entry: python scripts/check_test_policy.py
  language: python
  files: ^tests/.*\.py$
  args: [--strict]
Linting Integration
Test policy compliance is integrated with our standard linting tools:
- Pyrefly: Advanced type checking with test policy validation
- Ruff: Fast linting with custom test pattern detection
- BasedPyright: Strict type checking including test patterns
Performance Benefits
Following these patterns provides measurable benefits:
- Parallelization: Tests without loops/conditionals can run in parallel more effectively
- Debugging: Failed tests have clearer, more isolated failure points
- Maintenance: Consistent patterns make tests easier to understand and modify
- Reliability: Reduced complexity leads to more stable test execution