Files

yangdx fc9f7c705e Fix linting

2025-11-18 08:07:54 +08:00

8.4 KiB

Raw Permalink Blame History

Workspace Isolation Test Suite

Overview

Comprehensive test coverage for LightRAG's workspace isolation feature, ensuring that different workspaces (projects) can coexist independently without data contamination or resource conflicts.

Test Architecture

Design Principles

Concurrency-Based Assertions: Instead of timing-based tests (which are flaky), we measure actual concurrent lock holders
Timeline Validation: Finite state machine validates proper sequential execution
Performance Metrics: Each test reports execution metrics for debugging and optimization
Configurable Stress Testing: Environment variables control test intensity

Test Categories

1. Data Isolation Tests

Tests: 1, 4, 8, 9, 10 Purpose: Verify that data in one workspace doesn't leak into another

Test 1: Pipeline Status Isolation - Core shared data structures remain separate
Test 4: Multi-Workspace Concurrency - Concurrent operations don't interfere
Test 8: Update Flags Isolation - Flag management respects workspace boundaries
Test 9: Empty Workspace Standardization - Edge case handling for empty workspace strings
Test 10: JsonKVStorage Integration - Storage layer properly isolates data

2. Lock Mechanism Tests

Tests: 2, 5, 6 Purpose: Validate that locking mechanisms allow parallelism across workspaces while enforcing serialization within workspaces

Test 2: Lock Mechanism - Different workspaces run in parallel, same workspace serializes
Test 5: Re-entrance Protection - Prevent deadlocks from re-entrant lock acquisition
Test 6: Namespace Lock Isolation - Different namespaces within same workspace are independent

3. Backward Compatibility Tests

Test: 3 Purpose: Ensure legacy code without workspace parameters still functions correctly

Default workspace fallback behavior
Empty workspace handling
None vs empty string normalization

4. Error Handling Tests

Test: 7 Purpose: Validate guardrails for invalid configurations

Missing workspace validation
Workspace normalization
Edge case handling

5. End-to-End Integration Tests

Test: 11 Purpose: Validate complete LightRAG workflows maintain isolation

Full document insertion pipeline
File system separation
Data content verification

Running Tests

Basic Usage

# Run all workspace isolation tests
pytest tests/test_workspace_isolation.py -v

# Run specific test
pytest tests/test_workspace_isolation.py::test_lock_mechanism -v

# Run with detailed output
pytest tests/test_workspace_isolation.py -v -s

Environment Configuration

Stress Testing

Enable stress testing with configurable number of workers:

# Enable stress mode with default 3 workers
LIGHTRAG_STRESS_TEST=true pytest tests/test_workspace_isolation.py -v

# Custom number of workers (e.g., 10)
LIGHTRAG_STRESS_TEST=true LIGHTRAG_TEST_WORKERS=10 pytest tests/test_workspace_isolation.py -v

Keep Test Artifacts

Preserve temporary directories for manual inspection:

# Keep test artifacts (useful for debugging)
LIGHTRAG_KEEP_ARTIFACTS=true pytest tests/test_workspace_isolation.py -v

Combined Example

# Stress test with 20 workers and keep artifacts
LIGHTRAG_STRESS_TEST=true \
LIGHTRAG_TEST_WORKERS=20 \
LIGHTRAG_KEEP_ARTIFACTS=true \
pytest tests/test_workspace_isolation.py::test_lock_mechanism -v -s

CI/CD Integration

# Recommended CI/CD command (no artifacts, default workers)
pytest tests/test_workspace_isolation.py -v --tb=short

Test Implementation Details

Helper Functions

`_measure_lock_parallelism`

Measures actual concurrency rather than wall-clock time.

Returns:

max_parallel: Peak number of concurrent lock holders
timeline: Ordered list of (task_name, event) tuples
metrics: Dict with performance data (duration, concurrency, workers)

Example:

workload = [
    ("task1", "workspace1", "namespace"),
    ("task2", "workspace2", "namespace"),
]
max_parallel, timeline, metrics = await _measure_lock_parallelism(workload)

# Assert on actual behavior, not timing
assert max_parallel >= 2  # Two different workspaces should run concurrently

`_assert_no_timeline_overlap`

Validates sequential execution using finite state machine.

Validates:

No overlapping lock acquisitions
Proper lock release ordering
All locks properly released

Example:

timeline = [
    ("task1", "start"),
    ("task1", "end"),
    ("task2", "start"),
    ("task2", "end"),
]
_assert_no_timeline_overlap(timeline)  # Passes - no overlap

timeline_bad = [
    ("task1", "start"),
    ("task2", "start"),  # ERROR: task2 started before task1 ended
    ("task1", "end"),
]
_assert_no_timeline_overlap(timeline_bad)  # Raises AssertionError

Configuration Variables

Variable	Type	Default	Description
`LIGHTRAG_STRESS_TEST`	bool	`false`	Enable stress testing mode
`LIGHTRAG_TEST_WORKERS`	int	`3`	Number of parallel workers in stress mode
`LIGHTRAG_KEEP_ARTIFACTS`	bool	`false`	Keep temporary test directories

Performance Benchmarks

Expected Performance (Reference System)

Test 1-9: < 1s each
Test 10: < 2s (includes file I/O)
Test 11: < 5s (includes full RAG pipeline)
Total Suite: < 15s

Stress Test Performance

With LIGHTRAG_TEST_WORKERS=10:

Test 2 (Parallel): ~0.05s (10 workers, all concurrent)
Test 2 (Serial): ~0.10s (2 workers, serialized)

Troubleshooting

Common Issues

Flaky Test Failures

Symptom: Tests pass locally but fail in CI/CD Cause: System under heavy load, timing-based assertions Solution: Our tests use concurrency-based assertions, not timing. If failures persist, check the timeline output in error messages.

Resource Cleanup Errors

Symptom: "Directory not empty" or "Cannot remove directory" Cause: Concurrent test execution or OS file locking Solution: Run tests serially (pytest -n 1) or use LIGHTRAG_KEEP_ARTIFACTS=true to inspect state

Lock Timeout Errors

Symptom: "Lock acquisition timeout" Cause: Deadlock or resource starvation Solution: Check test output for deadlock patterns, review lock acquisition order

Debug Tips

Enable verbose output:

pytest tests/test_workspace_isolation.py -v -s

Run single test with artifacts:

LIGHTRAG_KEEP_ARTIFACTS=true pytest tests/test_workspace_isolation.py::test_json_kv_storage_workspace_isolation -v -s

Check performance metrics: Look for the "Performance:" lines in test output showing duration and concurrency.
Inspect timeline on failure: Timeline data is included in assertion error messages.

Contributing

Adding New Tests

Follow naming convention: test_<feature>_<aspect>
Add purpose/scope comments: Explain what and why
Use helper functions: _measure_lock_parallelism, _assert_no_timeline_overlap
Document assertions: Explain expected behavior in assertions
Update this README: Add test to appropriate category

Test Template

@pytest.mark.asyncio
async def test_new_feature():
    """
    Brief description of what this test validates.
    """
    # Purpose: Why this test exists
    # Scope: What functions/classes this tests
    print("\n" + "=" * 60)
    print("TEST N: Feature Name")
    print("=" * 60)

    # Test implementation
    # ...

    print("✅ PASSED: Feature Name")
    print(f"   Validation details")

Test Coverage Matrix

Component	Data Isolation	Lock Mechanism	Backward Compat	Error Handling	E2E
shared_storage	✅ T1, T4	✅ T2, T5, T6	✅ T3	✅ T7	✅ T11
update_flags	✅ T8	-	-	-	-
JsonKVStorage	✅ T10	-	-	-	✅ T11
LightRAG Core	-	-	-	-	✅ T11
Namespace	✅ T9	-	✅ T3	✅ T7	-

Legend: T# = Test number

Version History

v2.0 (2025-01-18): Added performance metrics, stress testing, configurable cleanup
v1.0 (Initial): Basic workspace isolation tests with timing-based assertions

8.4 KiB Raw Permalink Blame History