ROCm Support Implementation Checklist
This checklist tracks the implementation progress for Sprint 18.5.
Phase 1: Device Abstraction Layer
1.1 GPU Detection Module
- Create `src/noteflow/infrastructure/gpu/__init__.py`
- Create `src/noteflow/infrastructure/gpu/detection.py`
  - Implement `GpuBackend` enum (NONE, CUDA, ROCM, MPS)
  - Implement `GpuInfo` dataclass
  - Implement `detect_gpu_backend()` function
  - Implement `get_gpu_info()` function
  - Add ROCm version detection via `torch.version.hip`
- Create `tests/infrastructure/gpu/test_detection.py`
  - Test no-torch case
  - Test CUDA detection
  - Test ROCm detection (HIP check)
  - Test MPS detection
  - Test CPU fallback
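The detection items above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the enum's string values, the `torch_module` injection parameter (added so the function can be tested without a GPU), and the precedence order are all assumptions. It relies on `torch.version.hip` being a version string on ROCm builds of PyTorch and `None` otherwise.

```python
import enum
import importlib
from typing import Any, Optional


class GpuBackend(enum.Enum):
    # Member names come from the checklist; the string values are assumed.
    NONE = "none"
    CUDA = "cuda"
    ROCM = "rocm"
    MPS = "mps"


def detect_gpu_backend(torch_module: Optional[Any] = None) -> GpuBackend:
    """Return the backend PyTorch can see, or NONE (CPU fallback)."""
    torch = torch_module
    if torch is None:
        try:
            torch = importlib.import_module("torch")
        except ImportError:
            # No-torch case: nothing to probe, so report no GPU.
            return GpuBackend.NONE

    if torch.cuda.is_available():
        # ROCm builds answer through the torch.cuda API; a non-empty
        # torch.version.hip is what tells them apart from CUDA builds.
        if getattr(torch.version, "hip", None):
            return GpuBackend.ROCM
        return GpuBackend.CUDA

    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return GpuBackend.MPS
    return GpuBackend.NONE
```

Passing a stand-in object for `torch_module` is one way the CUDA, ROCm, MPS, and CPU-fallback test cases listed above might be exercised on a machine without any GPU.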
1.2 Domain Types
- Create `src/noteflow/domain/ports/gpu.py`
  - Export `GpuBackend` enum
  - Export `GpuInfo` type
  - Define `GpuDetectionProtocol`
1.3 ASR Device Types
- Update `src/noteflow/application/services/asr_config/types.py`
  - Add `ROCM = "rocm"` to `AsrDevice` enum
  - Add ROCm entry to `DEVICE_COMPUTE_TYPES` mapping
  - Update `AsrCapabilities` dataclass with `rocm_available` and `gpu_backend` fields
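For orientation, the type changes in 1.3 might look something like the sketch below. Only `ROCM = "rocm"`, `DEVICE_COMPUTE_TYPES`, `rocm_available`, and `gpu_backend` come from the checklist. The other enum members, the compute-type tuples, and the field defaults are placeholders chosen for illustration, and the real `AsrCapabilities` presumably has further fields that are omitted here.

```python
import enum
from dataclasses import dataclass
from typing import Dict, Tuple


class AsrDevice(str, enum.Enum):
    CPU = "cpu"    # assumed existing member
    CUDA = "cuda"  # assumed existing member
    ROCM = "rocm"  # new member named in the checklist


# Placeholder values: which compute types each device should offer is an
# assumption here, not something the checklist specifies.
DEVICE_COMPUTE_TYPES: Dict[AsrDevice, Tuple[str, ...]] = {
    AsrDevice.CPU: ("int8", "float32"),
    AsrDevice.CUDA: ("int8", "float16", "float32"),
    AsrDevice.ROCM: ("int8", "float16", "float32"),
}


@dataclass(frozen=True)
class AsrCapabilities:
    # Existing fields omitted; only the two new fields are shown.
    rocm_available: bool = False
    gpu_backend: str = "none"
```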
1.4 Diarization Device Mixin
- Update `src/noteflow/infrastructure/diarization/engine/_device_mixin.py`
  - Add ROCm detection in `_detect_available_device()`
  - Maintain backward compatibility with "cuda" device string
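The backward-compatibility item matters because ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` API, so the device string handed to PyTorch stays "cuda" even when the logical backend is ROCm. A minimal sketch of keeping the two notions apart, assuming a hypothetical helper named `to_torch_device()` that does not exist in the codebase:

```python
def to_torch_device(logical_device: str) -> str:
    """Map a logical device name to the device string PyTorch expects."""
    normalized = logical_device.strip().lower()
    if normalized in ("cuda", "rocm"):
        # ROCm is addressed as "cuda" inside PyTorch (HIP backs the same API).
        return "cuda"
    if normalized in ("cpu", "mps"):
        return normalized
    raise ValueError(f"Unsupported device: {logical_device!r}")
```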
1.5 System Metrics
- Update `src/noteflow/infrastructure/metrics/system_resources.py`
  - Handle ROCm VRAM queries (same API as CUDA via HIP)
  - Add `gpu_backend` field to metrics
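Because the VRAM query goes through the same `torch.cuda` API on both vendors, one code path can serve both and only the reported backend label differs. The sketch below assumes a hypothetical function `query_gpu_memory()` and hypothetical key names other than `gpu_backend`. It takes the torch module as an argument so it can be tried without a GPU.

```python
from typing import Any, Dict, Optional


def query_gpu_memory(torch_module: Any) -> Optional[Dict[str, Any]]:
    """Return free/total VRAM in bytes plus the backend label, or None without a GPU."""
    if not torch_module.cuda.is_available():
        return None
    # torch.cuda.mem_get_info() returns (free_bytes, total_bytes) for the
    # current device.
    free_bytes, total_bytes = torch_module.cuda.mem_get_info()
    backend = "rocm" if getattr(torch_module.version, "hip", None) else "cuda"
    return {
        "gpu_backend": backend,
        "vram_free_bytes": free_bytes,    # hypothetical key name
        "vram_total_bytes": total_bytes,  # hypothetical key name
    }
```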
1.6 gRPC Proto
- Update `src/noteflow/grpc/proto/noteflow.proto`
  - Add `ASR_DEVICE_ROCM = 3` to `AsrDevice` enum
  - Add `rocm_available` field to `AsrConfiguration`
  - Add `gpu_backend` field to `AsrConfiguration`
- Regenerate Python stubs
- Run `scripts/patch_grpc_stubs.py`
1.7 Phase 1 Tests
- Run `pytest tests/infrastructure/gpu/`
- Run `make quality-py`
- Verify no regressions in CUDA detection
Phase 2: ASR Engine Protocol
2.1 Engine Protocol Definition
- Extend `src/noteflow/infrastructure/asr/protocols.py` (or relocate to `domain/ports`)
  - Reuse `AsrResult`/`WordTiming` from `infrastructure/asr/dto.py`
  - Add `device` property (logical device: cpu/cuda/rocm)
  - Add `compute_type` property
  - Confirm `model_size` + `is_loaded` already covered
  - Add optional `transcribe_file()` helper (if needed)
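As a rough picture of the property surface described above, the protocol could be declared with `typing.Protocol`. This sketch covers only the four properties named in the checklist. The return types are assumptions, and the loading and transcription methods are left out because their signatures are not given here.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class AsrEngine(Protocol):
    """Property surface only; load/transcribe methods are omitted in this sketch."""

    @property
    def device(self) -> str:
        """Logical device: "cpu", "cuda", or "rocm"."""
        ...

    @property
    def compute_type(self) -> str: ...

    @property
    def model_size(self) -> Optional[str]: ...

    @property
    def is_loaded(self) -> bool: ...
```

With `runtime_checkable`, a compliance test can use `isinstance(engine, AsrEngine)`, which checks that the attributes are present but not their types. A static checker such as `basedpyright` covers the type side.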
2.2 Refactor FasterWhisperEngine
- Update `src/noteflow/infrastructure/asr/engine.py`
  - Ensure compliance with `AsrEngine`
  - Add explicit type annotations
  - Document as CUDA/CPU backend
- Create `tests/infrastructure/asr/test_protocol_compliance.py`
  - Verify `FasterWhisperEngine` implements protocol
2.3 PyTorch Whisper Engine (Fallback)
- Create `src/noteflow/infrastructure/asr/pytorch_engine.py`
  - Implement `WhisperPyTorchEngine` class
  - Implement all protocol methods
  - Handle device placement (cuda/rocm/cpu)
  - Support all compute types
- Create `tests/infrastructure/asr/test_pytorch_engine.py`
  - Test model loading
  - Test transcription
  - Test device handling
2.4 Engine Factory
- Create `src/noteflow/infrastructure/asr/factory.py`
  - Implement `create_asr_engine()` function
  - Implement `_resolve_device()` helper
  - Implement `_create_cpu_engine()` helper
  - Implement `_create_cuda_engine()` helper
  - Implement `_create_rocm_engine()` helper
  - Define `EngineCreationError` exception
- Create `tests/infrastructure/asr/test_factory.py`
  - Test auto device resolution
  - Test explicit device selection
  - Test fallback behavior
  - Test error cases
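The factory items above suggest a resolve-then-build flow with a fallback. The sketch below illustrates that shape under stated assumptions: the `builders` mapping stands in for the per-device `_create_*_engine()` helpers, and the "auto" keyword, the preference order, and the CPU fallback policy are guesses rather than the planned behavior.

```python
from typing import Callable, Dict, Iterable

EngineBuilder = Callable[[], object]


class EngineCreationError(Exception):
    """Raised when no ASR engine can be built for the requested device."""


def _resolve_device(requested: str, available: Iterable[str]) -> str:
    """Turn "auto" into a concrete device; validate explicit requests."""
    known = set(available)
    if requested == "auto":
        for candidate in ("cuda", "rocm", "cpu"):  # assumed preference order
            if candidate in known:
                return candidate
        raise EngineCreationError("No ASR engine builders are registered")
    if requested not in known:
        raise EngineCreationError(f"Device {requested!r} is not available")
    return requested


def create_asr_engine(device: str, builders: Dict[str, EngineBuilder]) -> object:
    """Build an engine for `device`, falling back to CPU if a GPU build fails."""
    resolved = _resolve_device(device, builders)
    try:
        return builders[resolved]()
    except Exception as exc:
        if resolved == "cpu" or "cpu" not in builders:
            raise EngineCreationError(f"Failed to create {resolved} engine") from exc
    # The GPU engine failed and a CPU builder exists: try the fallback.
    try:
        return builders["cpu"]()
    except Exception as exc:
        raise EngineCreationError("CPU fallback engine failed") from exc
```

Injecting the builders keeps the four test cases listed above (auto resolution, explicit selection, fallback, errors) independent of any real model loading.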
2.5 Update Engine Manager
- Update `src/noteflow/application/services/asr_config/_engine_manager.py`
  - Add `detect_rocm_available()` method
  - Update `build_capabilities()` for ROCm
  - Update `check_configuration()` for ROCm validation
  - Use factory for engine creation in `build_engine_for_job()`
- Update `tests/application/test_asr_config_service.py`
  - Add ROCm detection tests
  - Add ROCm validation tests
2.6 Phase 2 Tests
- Run full ASR test suite
- Run `make quality-py`
- Verify CUDA path unchanged
Phase 3: ROCm-Specific Engine
3.1 ROCm Engine Implementation
- Create `src/noteflow/infrastructure/asr/rocm_engine.py`
  - Implement `FasterWhisperRocmEngine` class
  - Handle CTranslate2-ROCm import with fallback
  - Implement all protocol methods
  - Add ROCm-specific optimizations
- Create `tests/infrastructure/asr/test_rocm_engine.py`
  - Test import fallback behavior
  - Test engine creation (mock)
  - Test protocol compliance
3.2 Update Factory for ROCm
- Update `src/noteflow/infrastructure/asr/factory.py`
  - Add ROCm engine import with graceful fallback
  - Log warning when falling back to PyTorch
- Update factory tests for ROCm path
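One way the graceful import fallback might be structured is a guarded `importlib.import_module` call that logs a warning and returns `None`, leaving the caller to pick the PyTorch engine. The helper name `load_rocm_engine_class()` is hypothetical, and the default module path is inferred from the file path in 3.1 rather than confirmed.

```python
import importlib
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def load_rocm_engine_class(
    module_name: str = "noteflow.infrastructure.asr.rocm_engine",
    class_name: str = "FasterWhisperRocmEngine",
) -> Optional[type]:
    """Return the ROCm engine class, or None if its module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "ROCm engine module %s is unavailable; falling back to PyTorch Whisper",
            module_name,
        )
        return None
    return getattr(module, class_name, None)
```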
3.3 ROCm Installation Detection
- Update `src/noteflow/infrastructure/gpu/detection.py`
  - Add `is_ctranslate2_rocm_available()` function
  - Add `get_rocm_version()` function
- Add corresponding tests
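For the version helper, a plausible approach is to read `torch.version.hip`, as mentioned in 1.1. The sketch below assumes that source, plus a hypothetical `torch_module` parameter for testing. How `is_ctranslate2_rocm_available()` should probe CTranslate2 is not specified in this checklist, so it is not sketched.

```python
import importlib
from typing import Any, Optional


def get_rocm_version(torch_module: Optional[Any] = None) -> Optional[str]:
    """Return the HIP/ROCm version string reported by PyTorch, or None."""
    torch = torch_module
    if torch is None:
        try:
            torch = importlib.import_module("torch")
        except ImportError:
            return None
    # torch.version.hip is None on non-ROCm builds of PyTorch.
    hip_version = getattr(getattr(torch, "version", None), "hip", None)
    return str(hip_version) if hip_version else None
```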
3.4 Phase 3 Tests
- Run ROCm-specific tests (skip if no ROCm)
- Run `make quality-py`
- Test on AMD hardware (if available)
Phase 4: Configuration & Distribution
4.1 Feature Flag
- Update `src/noteflow/config/settings/_features.py`
  - Add `NOTEFLOW_FEATURE_ROCM_ENABLED` flag
  - Document in settings
- Update any feature flag guards
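A feature flag guard could be as small as the sketch below. It reads the environment variable directly, which may not match how the project's settings module actually loads flags. The helper name, the accepted truthy spellings, and the default of disabled are assumptions.

```python
import os
from typing import Mapping, Optional

_TRUTHY = frozenset({"1", "true", "yes", "on"})


def rocm_feature_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when NOTEFLOW_FEATURE_ROCM_ENABLED is set to a truthy value."""
    source = os.environ if env is None else env
    raw = source.get("NOTEFLOW_FEATURE_ROCM_ENABLED", "")
    # Unset or unrecognized values leave the feature disabled (assumed default).
    return raw.strip().lower() in _TRUTHY
```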
4.2 gRPC Config Handlers
- Update `src/noteflow/grpc/mixins/asr_config.py`
  - Handle ROCm device in `GetAsrConfiguration()`
  - Handle ROCm device in `UpdateAsrConfiguration()`
  - Add ROCm to capabilities response
- Update tests in `tests/grpc/test_asr_config.py`
4.3 Dependencies
- Update `pyproject.toml`
  - Add `rocm` extras group
  - Add `openai-whisper` as optional dependency
  - Document ROCm installation in comments
- Create `requirements-rocm.txt` (optional)
4.4 Docker ROCm Image
- Create `docker/Dockerfile.rocm`
  - Base on `rocm/pytorch` image
  - Install NoteFlow with ROCm extras
  - Configure for GPU access
- Update `compose.yaml` (and/or add `compose.rocm.yaml`) with ROCm profile
- Test Docker image build
4.5 Documentation
- Create `docs/installation/rocm.md`
  - System requirements
  - PyTorch ROCm installation
  - CTranslate2-ROCm installation (optional)
  - Docker usage
  - Troubleshooting
- Update main README with ROCm section
- Update `CLAUDE.md` with ROCm notes
4.6 Phase 4 Tests
- Run full test suite
- Run `make quality`
- Build ROCm Docker image
- Test on AMD hardware
Final Validation
Quality Gates
- `pytest tests/quality/` passes
- `make quality-py` passes
- `make quality` passes (full stack)
- Proto regenerated correctly
- No type errors (`basedpyright`)
- No lint errors (`ruff`)
Functional Validation
- CUDA path works (no regression)
- CPU path works (no regression)
- ROCm detection works
- PyTorch fallback works
- gRPC configuration works
- Device switching works
Documentation
- Sprint README complete
- Implementation checklist complete
- Installation guide complete
- API documentation updated
Notes
Files Created
| File | Status |
|---|---|
| `src/noteflow/domain/ports/gpu.py` | ❌ |
| `src/noteflow/domain/ports/asr.py` | optional (only if relocating protocol) |
| `src/noteflow/infrastructure/gpu/__init__.py` | ❌ |
| `src/noteflow/infrastructure/gpu/detection.py` | ❌ |
| `src/noteflow/infrastructure/asr/pytorch_engine.py` | ❌ |
| `src/noteflow/infrastructure/asr/rocm_engine.py` | ❌ |
| `src/noteflow/infrastructure/asr/factory.py` | ❌ |
| `docker/Dockerfile.rocm` | ❌ |
| `docs/installation/rocm.md` | ❌ |
Files Modified
| File | Status |
|---|---|
| `application/services/asr_config/types.py` | ❌ |
| `application/services/asr_config/_engine_manager.py` | ❌ |
| `infrastructure/diarization/engine/_device_mixin.py` | ❌ |
| `infrastructure/metrics/system_resources.py` | ❌ |
| `infrastructure/asr/engine.py` | ❌ |
| `infrastructure/asr/protocols.py` | ❌ |
| `grpc/proto/noteflow.proto` | ❌ |
| `grpc/mixins/asr_config.py` | ❌ |
| `config/settings/_features.py` | ❌ |
| `pyproject.toml` | ❌ |