Crash Test Suite
Overview
This crash test suite identifies potential failure points in the biz-budz system by simulating crash scenarios, resource exhaustion, and edge cases. The tests help ensure the system remains resilient and stable under adverse conditions.
Test Categories
1. Memory Exhaustion Tests (test_memory_exhaustion.py)
- Critical Level: HIGH
- Purpose: Test memory management and resource exhaustion scenarios
- Coverage:
- Service factory memory leaks
- LLM client large prompt memory usage
- Database connection pool exhaustion
- Vector store large batch operations
- Concurrent operation memory pressure
- JSON serialization memory usage
- Thread pool resource exhaustion
- Async task proliferation
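A minimal sketch of one such check, using psutil (already listed under Test Dependencies) to assert that repeated service creation does not grow resident memory unboundedly. The `create_service` helper and the 50 MB threshold are illustrative assumptions, not the suite's actual values:

```python
import gc

import psutil


def create_service() -> dict:
    """Stand-in for a service factory call; replace with the real factory."""
    return {"buffer": bytearray(1024)}


def test_service_factory_memory_growth():
    """Repeated service creation should not grow resident memory unboundedly."""
    process = psutil.Process()
    gc.collect()
    baseline = process.memory_info().rss

    for _ in range(1_000):
        service = create_service()
        del service

    gc.collect()
    growth_mb = (process.memory_info().rss - baseline) / (1024 * 1024)
    # Illustrative threshold; tune to the real factory's footprint.
    assert growth_mb < 50, f"Resident memory grew by {growth_mb:.1f} MB"
```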
2. Configuration Validation Tests (test_config_validation.py)
- Critical Level: HIGH
- Purpose: Test configuration validation and invalid config scenarios
- Coverage:
- Missing required configuration sections
- Invalid LLM configuration values
- Invalid database configuration values
- Invalid vector store configuration values
- Missing environment variables
- Malformed YAML configuration
- File permission issues
- Invalid data types in configuration
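As a hedged illustration, a validation test of this shape feeds malformed YAML and a config missing a required section to the loader. `load_config` and the `llm` section name are hypothetical stand-ins for the project's real loader and schema, and PyYAML is assumed to be available since the project uses YAML configuration:

```python
import pytest
import yaml  # assumed available, since the project uses YAML configuration


def load_config(raw: str) -> dict:
    """Hypothetical loader: parse YAML and require an 'llm' section."""
    config = yaml.safe_load(raw)
    if not isinstance(config, dict) or "llm" not in config:
        raise ValueError("missing required 'llm' section")
    return config


def test_malformed_yaml_is_rejected():
    with pytest.raises(yaml.YAMLError):
        yaml.safe_load("llm: [1, 2")  # unclosed flow sequence


def test_missing_required_section_is_rejected():
    with pytest.raises(ValueError):
        load_config("database:\n  url: sqlite://")
```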
3. Database Failure Tests (test_database_failures.py)
- Critical Level: HIGH
- Purpose: Test database connection failures and recovery
- Coverage:
- Initial connection failures
- Connection timeouts
- Connection pool exhaustion
- Query execution failures
- Transaction failures and rollbacks
- Connection leak detection
- Database maintenance mode
- Schema change handling
- Disk full scenarios
- Permission denied errors
- Deadlock detection
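A hedged sketch of a connection-failure test: the `connect_with_retry` helper below is hypothetical, written only to show the mock-the-driver-and-assert-on-recovery pattern these tests follow.

```python
import pytest
from unittest.mock import Mock


def connect_with_retry(connect, attempts: int = 3):
    """Hypothetical helper: retry a failing connect call a fixed number of times."""
    last_error = None
    for _ in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
    raise last_error


def test_connection_recovers_after_transient_failure():
    # First two attempts fail, the third succeeds.
    connect = Mock(side_effect=[ConnectionError(), ConnectionError(), "connection"])
    assert connect_with_retry(connect) == "connection"
    assert connect.call_count == 3


def test_connection_failure_is_surfaced_when_retries_are_exhausted():
    connect = Mock(side_effect=ConnectionError("database unavailable"))
    with pytest.raises(ConnectionError):
        connect_with_retry(connect, attempts=3)
```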
4. LLM Service Failure Tests (test_llm_service_failures.py)
- Critical Level: HIGH
- Purpose: Test LLM service failures and timeout scenarios
- Coverage:
- API timeouts
- Rate limiting
- Authentication failures
- Service unavailability
- Network connection failures
- Malformed API responses
- Streaming response failures
- Token limit exceeded
- Quota exceeded
- Model not found errors
- Multiple provider fallback
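For instance, a timeout-and-fallback test might look like the sketch below; the `call_with_fallback` helper and the provider callables are illustrative assumptions rather than the project's real LLM client API:

```python
import asyncio

import pytest


async def call_with_fallback(providers, prompt: str, timeout: float = 0.05) -> str:
    """Hypothetical helper: try each provider in order, skipping ones that time out."""
    for provider in providers:
        try:
            return await asyncio.wait_for(provider(prompt), timeout=timeout)
        except asyncio.TimeoutError:
            continue
    raise RuntimeError("all providers timed out")


async def slow_provider(prompt: str) -> str:
    await asyncio.sleep(10)  # simulates a hung API call; cancelled by the timeout
    return "never returned"


async def fast_provider(prompt: str) -> str:
    return f"answer to: {prompt}"


@pytest.mark.asyncio
async def test_timeout_falls_back_to_secondary_provider():
    result = await call_with_fallback([slow_provider, fast_provider], "ping")
    assert result == "answer to: ping"


@pytest.mark.asyncio
async def test_all_providers_timing_out_raises():
    with pytest.raises(RuntimeError):
        await call_with_fallback([slow_provider], "ping")
```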
5. Concurrency and Race Condition Tests (test_concurrency_races.py)
- Critical Level: HIGH
- Purpose: Test concurrent processing and race conditions
- Coverage:
- Service factory concurrent initialization
- Database connection pool races
- Redis concurrent operations
- LLM client concurrent requests
- Search/extraction orchestrator concurrency
- Thread pool race conditions
- Async task cancellation races
- Service initialization order races
- Shared state race conditions
- Resource cleanup races
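Race-condition tests generally hammer a shared resource from many tasks and assert that synchronization holds. The sketch below, using a plain counter guarded by an asyncio.Lock, shows the pattern in miniature and is not code from the suite:

```python
import asyncio

import pytest


class Counter:
    """Shared state guarded by a lock; without the lock, concurrent updates can race."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = asyncio.Lock()

    async def increment(self) -> None:
        async with self._lock:
            current = self.value
            await asyncio.sleep(0)  # yield to other tasks, widening the race window
            self.value = current + 1


@pytest.mark.asyncio
async def test_concurrent_increments_do_not_lose_updates():
    counter = Counter()
    await asyncio.gather(*(counter.increment() for _ in range(200)))
    assert counter.value == 200
```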
6. State Corruption Tests (test_state_corruption.py)
- Critical Level: HIGH
- Purpose: Test state management corruption and recovery
- Coverage:
- Deep copy corruption
- JSON serialization corruption
- Pickle corruption
- Type corruption
- Missing field corruption
- Memory corruption
- Concurrent modification corruption
- Graph propagation corruption
- Deserialization corruption
- Validation corruption detection
- Recovery mechanisms
- Checkpoint corruption
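A hedged sketch of a corruption-detection test: it round-trips a state dict through JSON, tampers with a required field, and asserts the validator notices. `validate_state` and the field names are assumptions made for illustration only:

```python
import json

import pytest

REQUIRED_FIELDS = {"query", "results", "step"}  # illustrative schema


def validate_state(state: dict) -> None:
    """Hypothetical validator: reject states with missing fields or wrong types."""
    missing = REQUIRED_FIELDS - state.keys()
    if missing:
        raise ValueError(f"state is missing fields: {sorted(missing)}")
    if not isinstance(state["results"], list):
        raise ValueError("state field 'results' has the wrong type")


def test_json_round_trip_preserves_valid_state():
    state = {"query": "acme corp", "results": [], "step": 1}
    restored = json.loads(json.dumps(state))
    validate_state(restored)
    assert restored == state


def test_missing_field_corruption_is_detected():
    state = {"query": "acme corp", "results": [], "step": 1}
    corrupted = {k: v for k, v in state.items() if k != "results"}
    with pytest.raises(ValueError):
        validate_state(corrupted)
```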
7. Network Failure Tests (test_network_failures.py)
- Critical Level: HIGH
- Purpose: Test network failures and external service unavailability
- Coverage:
- DNS resolution failures
- Connection timeouts
- SSL certificate errors
- Network unreachable scenarios
- HTTP server errors
- Redis connection failures
- Vector store connection failures
- Search API unavailability
- Web scraping failures
- Intermittent network failures
- Network partitions
- Bandwidth limitations
- Proxy failures
- Firewall blocking
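Network tests typically patch the transport layer rather than touching the real network. The sketch below simulates a DNS resolution failure by patching socket.getaddrinfo; `fetch_host_ip` is a hypothetical helper and the pattern, not the suite's exact code, is the point:

```python
import socket
from unittest.mock import patch

import pytest


def fetch_host_ip(hostname: str) -> str:
    """Hypothetical helper that resolves a hostname before making a request."""
    try:
        return socket.getaddrinfo(hostname, 443)[0][4][0]
    except socket.gaierror as exc:
        raise RuntimeError(f"DNS resolution failed for {hostname}") from exc


def test_dns_resolution_failure_is_wrapped():
    with patch("socket.getaddrinfo", side_effect=socket.gaierror("Name or service not known")):
        with pytest.raises(RuntimeError, match="DNS resolution failed"):
            fetch_host_ip("api.example.invalid")
```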
8. Malformed Input Tests (test_malformed_input.py)
- Critical Level: MEDIUM
- Purpose: Test malformed input and edge case data handling
- Coverage:
- Empty input handling
- Extremely long input
- Unicode and special characters
- Malformed JSON input
- Malformed XML input
- Malformed HTML input
- Injection attempts (SQL, XSS, Command)
- Binary data input
- Circular reference input
- Malformed search queries
- Malformed LLM prompts
- Malformed URL input
- Deeply nested input
- Mixed encoding input
- State corruption with malformed input
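A short sketch of the input-hardening pattern these tests follow; `sanitize_query` and its length limit are hypothetical, shown only to illustrate asserting on empty, oversized, and injection-style inputs:

```python
import pytest

MAX_QUERY_LENGTH = 1_000  # illustrative limit


def sanitize_query(query: str) -> str:
    """Hypothetical helper: reject empty or oversized queries, drop non-printable characters."""
    if not query or not query.strip():
        raise ValueError("query must not be empty")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError("query exceeds maximum length")
    return "".join(ch for ch in query if ch.isprintable() or ch.isspace())


@pytest.mark.parametrize("bad_query", ["", "   ", "x" * (MAX_QUERY_LENGTH + 1)])
def test_empty_and_oversized_queries_are_rejected(bad_query):
    with pytest.raises(ValueError):
        sanitize_query(bad_query)


def test_injection_style_input_is_handled_without_crashing():
    # The goal is graceful handling, not crashing, on hostile-looking input.
    result = sanitize_query("'; DROP TABLE companies; -- <script>alert(1)</script>")
    assert "DROP TABLE" in result  # passed through as inert text, not executed
```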
9. File System Error Tests (test_filesystem_errors.py)
- Critical Level: MEDIUM
- Purpose: Test file system errors and permission issues
- Coverage:
- File not found errors
- Permission denied (read/write)
- Directory not found errors
- Disk full errors
- File already exists errors
- File locked errors
- File too large errors
- Invalid file path errors
- Circular symlink errors
- File system corruption
- Network drive unavailability
- File handle exhaustion
- Concurrent file access
- Cache directory creation failure
- Log file rotation failure
- Temporary file cleanup failure
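File system failures are easiest to provoke with pytest's built-in tmp_path fixture plus mocking. The sketch below covers a missing file and a simulated permission error; `read_cache_file` is a hypothetical helper:

```python
from pathlib import Path
from unittest.mock import patch

import pytest


def read_cache_file(path: Path) -> str:
    """Hypothetical helper that loads a cache entry from disk."""
    return path.read_text(encoding="utf-8")


def test_missing_cache_file_raises_file_not_found(tmp_path):
    with pytest.raises(FileNotFoundError):
        read_cache_file(tmp_path / "does_not_exist.json")


def test_permission_denied_is_surfaced(tmp_path):
    cache_file = tmp_path / "cache.json"
    cache_file.write_text("{}", encoding="utf-8")
    # Simulate a permission error; chmod-based tests are unreliable when run as root.
    with patch.object(Path, "read_text", side_effect=PermissionError("denied")):
        with pytest.raises(PermissionError):
            read_cache_file(cache_file)
```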
Running the Tests
Prerequisites
- Python Environment: Python 3.12+ with development dependencies installed
- Test Dependencies: pytest, pytest-asyncio, pytest-json-report, psutil
- Virtual Environment: Activate your virtual environment
- Configuration: Ensure test configuration is properly set up
Running All Tests
# Run all crash tests
python tests/crash_tests/run_crash_tests.py
# Run with help
python tests/crash_tests/run_crash_tests.py --help
Running Individual Test Suites
# Run specific test file
pytest tests/crash_tests/test_memory_exhaustion.py -v
# Run with detailed output
pytest tests/crash_tests/test_database_failures.py -v --tb=long
# Run with markers
pytest tests/crash_tests/ -m "not slow" -v
Running Tests in CI/CD
# Run in CI environment
python tests/crash_tests/run_crash_tests.py --ci
# Generate JUnit XML report
pytest tests/crash_tests/ --junitxml=crash_test_results.xml
Test Output
Console Output
- Real-time test execution progress
- Pass/fail status for each test suite
- Summary statistics
- Critical failure highlights
Report Files
- crash_test_report.json: Detailed JSON report with all test results
- crash_test_summary.md: Human-readable Markdown summary
- Individual test JSON reports (temporary)
Exit Codes
- 0: All tests passed
- 1: Some tests failed (non-critical)
- 2: Critical failures detected
Test Configuration
Environment Variables
# Required for some tests
export OPENAI_API_KEY=your_key_here
export ANTHROPIC_API_KEY=your_key_here
export TAVILY_API_KEY=your_key_here
export DATABASE_URL=your_db_url
export REDIS_URL=your_redis_url
export QDRANT_URL=your_qdrant_url
Test Settings
- Timeout: Individual test timeout (default: 300s)
- Retry: Number of retries for flaky tests (default: 3)
- Parallel: Number of parallel test workers (default: auto)
- Verbose: Detailed output level (default: normal)
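How these settings are wired up lives in run_crash_tests.py. As one hedged illustration of the retry setting only, a flaky test can be wrapped in a small retry helper like the following; this is a sketch, not the runner's actual mechanism:

```python
import functools
import time


def retry_flaky(attempts: int = 3, delay: float = 0.1):
    """Re-run a flaky test up to `attempts` times before reporting failure."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(attempts):
                try:
                    return test_func(*args, **kwargs)
                except AssertionError as exc:
                    last_error = exc
                    time.sleep(delay)
            raise last_error
        return wrapper
    return decorator


@retry_flaky(attempts=3)
def test_occasionally_flaky_timing_assertion():
    # Placeholder for a timing-sensitive assertion that may need a second attempt.
    assert time.monotonic() > 0
```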
Interpreting Results
Critical Failures
- High Priority: Immediate attention required
- Impact: System may crash or become unstable
- Action: Fix before deployment
Non-Critical Failures
- Medium Priority: Should be addressed soon
- Impact: Degraded performance or functionality
- Action: Include in next sprint
Expected Behaviors
Some tests are designed to trigger specific error conditions:
- Memory exhaustion tests may consume significant memory
- Network tests may have longer execution times
- Database tests may require specific database states
- Concurrency tests may show occasional race conditions
Monitoring and Alerting
Metrics to Monitor
- Memory Usage: Peak memory consumption during tests
- CPU Usage: Peak CPU usage during concurrent tests
- Database Connections: Connection pool usage
- Network Requests: External service call patterns
- Error Rates: Frequency of different error types
Alerting Thresholds
- Memory Usage: > 80% of available memory
- CPU Usage: > 90% for extended periods
- Database Connections: > 80% of pool size
- Error Rate: > 5% of requests failing
- Response Time: > 10s for critical operations
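These thresholds can be checked mechanically against the generated crash_test_report.json. The report's exact schema is not documented here, so the `metrics` key and field names below are illustrative assumptions:

```python
import json

# Illustrative thresholds mirroring the list above.
THRESHOLDS = {
    "peak_memory_percent": 80.0,
    "peak_cpu_percent": 90.0,
    "db_pool_usage_percent": 80.0,
    "error_rate_percent": 5.0,
    "max_response_time_seconds": 10.0,
}


def check_thresholds(report_path: str = "crash_test_report.json") -> list[str]:
    """Return a list of threshold violations found in the report (assumed schema)."""
    with open(report_path, encoding="utf-8") as fh:
        metrics = json.load(fh).get("metrics", {})
    return [
        f"{name}={metrics[name]} exceeds {limit}"
        for name, limit in THRESHOLDS.items()
        if metrics.get(name, 0) > limit
    ]


if __name__ == "__main__":
    for violation in check_thresholds():
        print("ALERT:", violation)
```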
Maintenance
Regular Updates
- Update test cases as new failure scenarios are discovered
- Add tests for new features and services
- Update expected behaviors as the system evolves
- Review and update critical failure thresholds
Test Hygiene
- Remove obsolete tests
- Update mock configurations
- Refresh test data
- Validate test environment setup
Best Practices
Test Design
- Isolation: Each test should be independent
- Cleanup: Proper resource cleanup after tests
- Deterministic: Tests should produce consistent results
- Focused: Each test should verify one specific failure scenario
Resource Management
- Memory: Monitor memory usage during tests
- Cleanup: Ensure proper cleanup of test resources
- Timeouts: Set appropriate timeouts for different scenarios
- Parallel Execution: Balance parallelism with resource constraints
Error Handling
- Expected Errors: Distinguish between expected and unexpected errors
- Recovery: Test recovery mechanisms where applicable
- Logging: Comprehensive logging for debugging failures
- Graceful Degradation: Verify graceful degradation behaviors
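To make the expected-versus-unexpected distinction concrete, a hedged sketch: assert on the specific exception the scenario is meant to trigger, and let anything else fail the test loudly. `flush_cache` is a hypothetical function used only for illustration:

```python
import pytest


def flush_cache(entries: dict) -> None:
    """Hypothetical operation that rejects non-dict input."""
    if not isinstance(entries, dict):
        raise TypeError("entries must be a dict")


def test_expected_error_is_asserted_explicitly():
    # Expected: the exact exception the scenario is meant to trigger.
    with pytest.raises(TypeError, match="must be a dict"):
        flush_cache(["not", "a", "dict"])


def test_valid_input_does_not_raise():
    # Unexpected errors are not swallowed; any exception here fails the test.
    flush_cache({"key": "value"})
```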
Troubleshooting
Common Issues
- Memory Exhaustion During Tests
  - Reduce parallel test execution
  - Increase system memory allocation
  - Check for memory leaks in test setup
- Database Connection Failures
  - Verify the database service is running
  - Check connection pool configuration
  - Ensure test database permissions
- Network Timeout Issues
  - Increase timeout values for network tests
  - Check internet connectivity
  - Verify external service availability
- File System Permission Errors
  - Ensure the test has appropriate file permissions
  - Check disk space availability
  - Verify temporary directory access
Debugging Tips
- Use verbose output (-v) for detailed test information
- Check individual test logs for specific failures
- Review system resource usage during test execution
- Verify test environment configuration
Contributing
When adding new crash tests:
- Follow Naming Convention: test_[category]_[scenario].py
- Add Documentation: Document test purpose and coverage
- Include in Suite: Add to the run_crash_tests.py test suites
- Set Critical Level: Assign an appropriate critical level
- Update README: Update this documentation
Example Test Structure
import pytest
from unittest.mock import Mock, patch


class TestNewFailureScenario:
    """Test new failure scenario."""

    @pytest.fixture
    def mock_config(self):
        """Mock configuration for testing."""
        return Mock()

    @pytest.mark.asyncio
    async def test_specific_failure_case(self, mock_config):
        """Test specific failure case."""
        # Arrange
        # Act
        # Assert
        pass
License
This crash test suite is part of the biz-budz project and follows the same license terms.