Crash Test Suite

Overview

This crash test suite is designed to identify potential failure points in the biz-budz system by simulating crash scenarios, resource exhaustion, and edge-case inputs. The tests help ensure the system remains resilient and stable under adverse conditions.

Test Categories

1. Memory Exhaustion Tests (test_memory_exhaustion.py)

  • Critical Level: HIGH
  • Purpose: Test memory management and resource exhaustion scenarios
  • Coverage:
    • Service factory memory leaks
    • LLM client large prompt memory usage
    • Database connection pool exhaustion
    • Vector store large batch operations
    • Concurrent operation memory pressure
    • JSON serialization memory usage
    • Thread pool resource exhaustion
    • Async task proliferation
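
A minimal sketch of the pattern these tests follow, using psutil (already a test dependency) to bound memory growth; the workload is a hypothetical stand-in, not the suite's actual code:

import psutil

def test_large_allocation_stays_under_budget():
    """Sketch: assert that a heavy workload stays within a memory budget."""
    process = psutil.Process()
    baseline = process.memory_info().rss

    # Hypothetical stand-in for an LLM large-prompt or big-batch operation
    payloads = ["x" * 1_000_000 for _ in range(100)]  # roughly 100 MB of strings

    growth = process.memory_info().rss - baseline
    assert growth < 500 * 1024 * 1024, f"memory grew by {growth} bytes"

    del payloads  # release before the next test runs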

2. Configuration Validation Tests (test_config_validation.py)

  • Critical Level: HIGH
  • Purpose: Test configuration validation and invalid config scenarios
  • Coverage:
    • Missing required configuration sections
    • Invalid LLM configuration values
    • Invalid database configuration values
    • Invalid vector store configuration values
    • Missing environment variables
    • Malformed YAML configuration
    • File permission issues
    • Invalid data types in configuration
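
For example, a malformed-YAML case can be exercised with pytest's built-in tmp_path fixture; the config keys shown here are illustrative:

import pytest
import yaml

def test_truncated_yaml_raises_parse_error(tmp_path):
    """Sketch: a truncated config file should fail at parse time, not later."""
    config_file = tmp_path / "config.yaml"
    config_file.write_text("llm:\n  provider: [unclosed")  # unterminated flow sequence

    with pytest.raises(yaml.YAMLError):
        yaml.safe_load(config_file.read_text())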

3. Database Failure Tests (test_database_failures.py)

  • Critical Level: HIGH
  • Purpose: Test database connection failures and recovery
  • Coverage:
    • Initial connection failures
    • Connection timeouts
    • Connection pool exhaustion
    • Query execution failures
    • Transaction failures and rollbacks
    • Connection leak detection
    • Database maintenance mode
    • Schema change handling
    • Disk full scenarios
    • Permission denied errors
    • Deadlock detection
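
A sketch of the transient-failure-and-recovery pattern, using unittest.mock.AsyncMock in place of a real driver (the three-attempt retry policy is assumed for illustration):

import pytest
from unittest.mock import AsyncMock

@pytest.mark.asyncio
async def test_retry_recovers_from_transient_outage():
    """Sketch: fail twice, succeed on the third attempt."""
    db = AsyncMock()
    db.connect.side_effect = [ConnectionRefusedError(), ConnectionRefusedError(), "conn"]

    result = None
    for _ in range(3):  # hypothetical retry policy
        try:
            result = await db.connect()
            break
        except ConnectionRefusedError:
            continue

    assert result == "conn"
    assert db.connect.call_count == 3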

4. LLM Service Failure Tests (test_llm_service_failures.py)

  • Critical Level: HIGH
  • Purpose: Test LLM service failures and timeout scenarios
  • Coverage:
    • API timeouts
    • Rate limiting
    • Authentication failures
    • Service unavailability
    • Network connection failures
    • Malformed API responses
    • Streaming response failures
    • Token limit exceeded
    • Quota exceeded
    • Model not found errors
    • Multiple provider fallback
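
Timeouts can be simulated without touching a real provider; the complete() coroutine here is a hypothetical stand-in for a provider call:

import asyncio
import pytest

@pytest.mark.asyncio
async def test_hung_provider_call_times_out():
    """Sketch: a call that never returns should be cut off by a timeout guard."""
    async def complete(prompt):  # hypothetical stand-in for a provider call
        await asyncio.sleep(60)  # simulated hung API

    with pytest.raises(asyncio.TimeoutError):
        await asyncio.wait_for(complete("prompt"), timeout=0.1)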

5. Concurrency and Race Condition Tests (test_concurrency_races.py)

  • Critical Level: HIGH
  • Purpose: Test concurrent processing and race conditions
  • Coverage:
    • Service factory concurrent initialization
    • Database connection pool races
    • Redis concurrent operations
    • LLM client concurrent requests
    • Search/extraction orchestrator concurrency
    • Thread pool race conditions
    • Async task cancellation races
    • Service initialization order races
    • Shared state race conditions
    • Resource cleanup races
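
A sketch of the double-initialization check: fifty concurrent callers should see exactly one instance (the factory here is a stand-in, guarded with an asyncio.Lock):

import asyncio
import pytest

@pytest.mark.asyncio
async def test_concurrent_factory_initializes_once():
    """Sketch: concurrent callers must share a single instance."""
    instance = None
    init_count = 0
    lock = asyncio.Lock()

    async def get_service():  # hypothetical stand-in for the service factory
        nonlocal instance, init_count
        async with lock:
            if instance is None:
                await asyncio.sleep(0)  # yield to expose ordering races
                init_count += 1
                instance = object()
        return instance

    results = await asyncio.gather(*(get_service() for _ in range(50)))
    assert init_count == 1
    assert all(r is results[0] for r in results)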

6. State Corruption Tests (test_state_corruption.py)

  • Critical Level: HIGH
  • Purpose: Test state management corruption and recovery
  • Coverage:
    • Deep copy corruption
    • JSON serialization corruption
    • Pickle corruption
    • Type corruption
    • Missing field corruption
    • Memory corruption
    • Concurrent modification corruption
    • Graph propagation corruption
    • Deserialization corruption
    • Validation corruption detection
    • Recovery mechanisms
    • Checkpoint corruption
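
A sketch of a serialization-corruption check; the state dict mirrors the kind of fields the pipeline carries but is purely illustrative:

import json
import pytest

def test_unserializable_state_fails_loudly():
    """Sketch: a set sneaking into state must fail at dump time, not corrupt silently."""
    state = {"query": "acme corp", "attempts": 3, "seen_urls": {"https://a", "https://b"}}

    with pytest.raises(TypeError):  # json cannot encode sets
        json.dumps(state)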

7. Network Failure Tests (test_network_failures.py)

  • Critical Level: HIGH
  • Purpose: Test network failures and external service unavailability
  • Coverage:
    • DNS resolution failures
    • Connection timeouts
    • SSL certificate errors
    • Network unreachable scenarios
    • HTTP server errors
    • Redis connection failures
    • Vector store connection failures
    • Search API unavailability
    • Web scraping failures
    • Intermittent network failures
    • Network partitions
    • Bandwidth limitations
    • Proxy failures
    • Firewall blocking
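
DNS failures can be injected by patching socket.getaddrinfo, so no real network is touched:

import socket
import pytest
from unittest.mock import patch

def test_dns_failure_propagates_as_gaierror():
    """Sketch: name-resolution errors should surface as a recognizable exception."""
    with patch("socket.getaddrinfo", side_effect=socket.gaierror("name not known")):
        with pytest.raises(socket.gaierror):
            socket.getaddrinfo("nonexistent.example.invalid", 443)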

8. Malformed Input Tests (test_malformed_input.py)

  • Critical Level: MEDIUM
  • Purpose: Test malformed input and edge case data handling
  • Coverage:
    • Empty input handling
    • Extremely long input
    • Unicode and special characters
    • Malformed JSON input
    • Malformed XML input
    • Malformed HTML input
    • Injection attempts (SQL, XSS, Command)
    • Binary data input
    • Circular reference input
    • Malformed search queries
    • Malformed LLM prompts
    • Malformed URL input
    • Deeply nested input
    • Mixed encoding input
    • State corruption with malformed input
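
A parametrized sketch; sanitize_query() is a hypothetical stand-in for whatever input-handling function is under test:

import pytest

def sanitize_query(raw: str) -> str:
    """Hypothetical stand-in for the system's real input sanitizer."""
    return raw.replace("\x00", "").strip()[:10_000]

MALFORMED_INPUTS = [
    "",                            # empty
    "A" * 1_000_000,               # extremely long
    "'; DROP TABLE users; --",     # SQL injection attempt
    "<script>alert(1)</script>",   # XSS attempt
    "caf\u00e9 \U0001f4a5 \x00",   # unicode, emoji, NUL byte
]

@pytest.mark.parametrize("raw", MALFORMED_INPUTS)
def test_malformed_input_never_crashes(raw):
    """Sketch: malformed input should be normalized, never raise."""
    result = sanitize_query(raw)
    assert isinstance(result, str)
    assert "\x00" not in result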

9. File System Error Tests (test_filesystem_errors.py)

  • Critical Level: MEDIUM
  • Purpose: Test file system errors and permission issues
  • Coverage:
    • File not found errors
    • Permission denied (read/write)
    • Directory not found errors
    • Disk full errors
    • File already exists errors
    • File locked errors
    • File too large errors
    • Invalid file path errors
    • Circular symlink errors
    • File system corruption
    • Network drive unavailability
    • File handle exhaustion
    • Concurrent file access
    • Cache directory creation failure
    • Log file rotation failure
    • Temporary file cleanup failure
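
Permission errors are safest to simulate with a mock rather than actual chmod tricks; the path is illustrative:

import pytest
from unittest.mock import patch

def test_permission_denied_raises_cleanly():
    """Sketch: a read-protected file should raise PermissionError, not crash opaquely."""
    with patch("builtins.open", side_effect=PermissionError("denied")):
        with pytest.raises(PermissionError):
            open("/var/log/app.log", "r")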

Running the Tests

Prerequisites

  1. Python Environment: Python 3.12+ with development dependencies installed
  2. Test Dependencies: pytest, pytest-asyncio, pytest-json-report, psutil
  3. Virtual Environment: Activate your virtual environment
  4. Configuration: Ensure test configuration is properly set up

Running All Tests

# Run all crash tests
python tests/crash_tests/run_crash_tests.py

# Run with help
python tests/crash_tests/run_crash_tests.py --help

Running Individual Test Suites

# Run specific test file
pytest tests/crash_tests/test_memory_exhaustion.py -v

# Run with detailed output
pytest tests/crash_tests/test_database_failures.py -v --tb=long

# Run with markers
pytest tests/crash_tests/ -m "not slow" -v

Running Tests in CI/CD

# Run in CI environment
python tests/crash_tests/run_crash_tests.py --ci

# Generate JUnit XML report
pytest tests/crash_tests/ --junitxml=crash_test_results.xml

Test Output

Console Output

  • Real-time test execution progress
  • Pass/fail status for each test suite
  • Summary statistics
  • Critical failure highlights

Report Files

  • crash_test_report.json: Detailed JSON report with all test results
  • crash_test_summary.md: Human-readable markdown summary
  • Individual test JSON reports (temporary)

Exit Codes

  • 0: All tests passed
  • 1: Some tests failed (non-critical)
  • 2: Critical failures detected
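
These codes make it easy to gate a pipeline stage. A sketch of a wrapper that consumes them — the runner path and --ci flag come from this README, while the block/review policy is an assumption:

import subprocess
import sys

result = subprocess.run([sys.executable, "tests/crash_tests/run_crash_tests.py", "--ci"])
if result.returncode == 2:
    raise SystemExit("critical crash-test failures detected: block deployment")
if result.returncode == 1:
    print("non-critical failures: review before the next release")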

Test Configuration

Environment Variables

# Required for some tests
export OPENAI_API_KEY=your_key_here
export ANTHROPIC_API_KEY=your_key_here
export TAVILY_API_KEY=your_key_here
export DATABASE_URL=your_db_url
export REDIS_URL=your_redis_url
export QDRANT_URL=your_qdrant_url

Test Settings

  • Timeout: Individual test timeout (default: 300s)
  • Retry: Number of retries for flaky tests (default: 3)
  • Parallel: Number of parallel test workers (default: auto)
  • Verbose: Detailed output level (default: normal)

Interpreting Results

Critical Failures

  • High Priority: Immediate attention required
  • Impact: System may crash or become unstable
  • Action: Fix before deployment

Non-Critical Failures

  • Medium Priority: Should be addressed soon
  • Impact: Degraded performance or functionality
  • Action: Include in next sprint

Expected Behaviors

Some tests are designed to trigger specific error conditions:

  • Memory exhaustion tests may consume significant memory
  • Network tests may have longer execution times
  • Database tests may require specific database states
  • Concurrency tests may show occasional race conditions

Monitoring and Alerting

Metrics to Monitor

  • Memory Usage: Peak memory consumption during tests
  • CPU Usage: Peak CPU usage during concurrent tests
  • Database Connections: Connection pool usage
  • Network Requests: External service call patterns
  • Error Rates: Frequency of different error types

Alerting Thresholds

  • Memory Usage: > 80% of available memory
  • CPU Usage: > 90% for extended periods
  • Database Connections: > 80% of pool size
  • Error Rate: > 5% of requests failing
  • Response Time: > 10s for critical operations

Maintenance

Regular Updates

  • Update test cases as new failure scenarios are discovered
  • Add tests for new features and services
  • Update expected behaviors as system evolves
  • Review and update critical failure thresholds

Test Hygiene

  • Remove obsolete tests
  • Update mock configurations
  • Refresh test data
  • Validate test environment setup

Best Practices

Test Design

  • Isolation: Each test should be independent
  • Cleanup: Proper resource cleanup after tests
  • Deterministic: Tests should produce consistent results
  • Focused: Each test should verify one specific failure scenario

Resource Management

  • Memory: Monitor memory usage during tests
  • Cleanup: Ensure proper cleanup of test resources
  • Timeouts: Set appropriate timeouts for different scenarios
  • Parallel Execution: Balance parallelism with resource constraints
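
The cleanup guarantee is easiest to get from a yield fixture, whose teardown runs even when the test body fails; a minimal sketch:

import shutil
import tempfile
import pytest

@pytest.fixture
def scratch_dir():
    """Sketch: give each test a private directory and always remove it."""
    path = tempfile.mkdtemp(prefix="crash_test_")
    yield path
    shutil.rmtree(path, ignore_errors=True)  # runs even if the test failed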

Error Handling

  • Expected Errors: Distinguish between expected and unexpected errors
  • Recovery: Test recovery mechanisms where applicable
  • Logging: Comprehensive logging for debugging failures
  • Graceful Degradation: Verify graceful degradation behaviors
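
Asserting the specific expected exception keeps unexpected errors visible; a sketch with a hypothetical failing operation:

import pytest

def fragile_operation():
    """Hypothetical operation with a known, expected failure mode."""
    raise TimeoutError("upstream did not respond")

def test_expected_error_is_asserted_precisely():
    """Sketch: match the exact error so unexpected ones still fail the test."""
    with pytest.raises(TimeoutError, match="did not respond"):
        fragile_operation()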

Troubleshooting

Common Issues

  1. Memory Exhaustion During Tests

    • Reduce parallel test execution
    • Increase system memory allocation
    • Check for memory leaks in test setup
  2. Database Connection Failures

    • Verify database service is running
    • Check connection pool configuration
    • Ensure test database permissions
  3. Network Timeout Issues

    • Increase timeout values for network tests
    • Check internet connectivity
    • Verify external service availability
  4. File System Permission Errors

    • Ensure test has appropriate file permissions
    • Check disk space availability
    • Verify temporary directory access

Debugging Tips

  • Use verbose output (-v) for detailed test information
  • Check individual test logs for specific failures
  • Review system resource usage during test execution
  • Verify test environment configuration

Contributing

When adding new crash tests:

  1. Follow Naming Convention: test_[category]_[scenario].py
  2. Add Documentation: Document test purpose and coverage
  3. Include in Suite: Add to run_crash_tests.py test suites
  4. Set Critical Level: Assign appropriate critical level
  5. Update README: Update this documentation

Example Test Structure

import pytest
from unittest.mock import Mock, patch

class TestNewFailureScenario:
    """Test new failure scenario."""

    @pytest.fixture
    def mock_config(self):
        """Mock configuration for testing."""
        return Mock()

    @pytest.mark.asyncio
    async def test_specific_failure_case(self, mock_config):
        """Test specific failure case."""
        # Arrange
        # Act
        # Assert
        pass

License

This crash test suite is part of the biz-budz project and follows the same license terms.