# SPRINT-GAP-002: State Synchronization Gaps

| Attribute | Value |
|-----------|-------|
| **Sprint** | GAP-002 |
| **Size** | M (Medium) |
| **Owner** | TBD |
| **Phase** | Hardening |
| **Prerequisites** | None |

## Open Issues

- [ ] Determine cache invalidation strategy (push vs poll vs hybrid)
- [ ] Define acceptable staleness window for meeting data
- [ ] Decide on WebSocket/SSE vs polling for real-time updates

## Validation Status

| Component | Exists | Needs Work |
|-----------|--------|------------|
| Meeting cache | Yes | Needs invalidation |
| Sync run cache (backend) | Yes | Client unaware of TTL |
| Active project resolution | Yes | Client unaware of implicit selection |
| Integration ID validation | Yes | Partial implementation |

## Objective

Ensure consistent state between backend and client by implementing proper cache invalidation, explicit state communication, and recovery mechanisms for stale data.

## Key Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Cache invalidation | Event-driven + polling fallback | Real-time when connected, polling for recovery |
| Staleness window | 30 seconds | Balance freshness vs server load |
| Active project sync | Explicit API response | Server should return resolved project_id |
| Sync run status | Polling with backoff | Already implemented, needs resilience |

## What Already Exists

### Backend State Management
- `_sync_runs` in-memory cache with 60-second TTL (`sync.py:146`)
- Active project resolution at meeting creation time
- Diarization job status tracking (DB + memory fallback)

### Client State Management
- `meetingCache` in `lib/cache/meeting-cache.ts`
- Connection state machine in `connection-state.ts`
- Reconnection logic in `reconnection.ts`
- Integration ID caching in preferences

## Identified Issues

### 1. Meeting Cache Never Invalidates (High)

**Location**: `client/src/api/tauri-adapter.ts:257-286`

```typescript
async createMeeting(request: CreateMeetingRequest): Promise<Meeting> {
  const meeting = await invoke<Meeting>(TauriCommands.CREATE_MEETING, {...});
  meetingCache.cacheMeeting(meeting);  // Cached
  return meeting;
}
```

**Problem**: Meetings are cached on create/fetch but never invalidated:
- Server-side state changes (stop, complete) not reflected
- Another client's modifications invisible
- Stale data shown after server restart

**Impact**: Users see outdated meeting states, segments, summaries.

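One way to give the meeting cache an invalidation path is to timestamp each entry and treat anything older than the staleness window as a miss. The following is a minimal sketch under stated assumptions: `TimedMeetingCache`, `CachedMeeting`, and the injected `now` clock are hypothetical names rather than the existing `meetingCache` API, the `state` field stands in for the real `Meeting` shape, and the 30-second window is the value from the Key Decisions table.

```typescript
// Hypothetical sketch: not the existing meetingCache implementation.
interface CachedMeeting {
  id: string;
  state: string; // stand-in for the real Meeting fields
}

interface CacheEntry {
  meeting: CachedMeeting;
  cachedAtMs: number;
}

const STALENESS_WINDOW_MS = 30_000; // 30 seconds, per Key Decisions

class TimedMeetingCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly now: () => number;

  // The clock is injected so tests can control time.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  cacheMeeting(meeting: CachedMeeting): void {
    this.entries.set(meeting.id, { meeting, cachedAtMs: this.now() });
  }

  // Returns the meeting only while it is inside the staleness window.
  getFresh(id: string): CachedMeeting | undefined {
    const entry = this.entries.get(id);
    if (entry === undefined) return undefined;
    if (this.now() - entry.cachedAtMs > STALENESS_WINDOW_MS) {
      this.entries.delete(id); // stale: caller should refetch from the server
      return undefined;
    }
    return entry.meeting;
  }

  invalidate(id: string): void {
    this.entries.delete(id);
  }

  invalidateAll(): void {
    this.entries.clear();
  }
}
```

Injecting the clock keeps the expiry logic testable without real timers. On reconnect, the client could call `invalidateAll()` and refetch the active meeting.
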
### 2. Sync Run Cache TTL Invisible to Client (Medium)

**Location**: `src/noteflow/grpc/_mixins/sync.py:143-146`

```python
finally:
    # Clean up cache after a delay (keep for status queries)
    await asyncio.sleep(60)
    cache.pop(sync_run_id, None)
```

**Problem**: The backend drops a sync run from the cache 60 seconds after the run finishes, but the client:
- Continues to poll `GetSyncStatus` expecting data
- Receives NOT_FOUND after the TTL expires
- Cannot distinguish "completed and expired" from "never existed"

### 3. Active Project Silently Resolved (Medium)

**Location**: `src/noteflow/grpc/_mixins/meeting.py:100-101`

```python
if project_id is None:
    project_id = await _resolve_active_project_id(self, repo)
```

**Problem**: When client doesn't send `project_id`:
- Server resolves from workspace context
- Client doesn't know which project was used
- UI may show meeting in wrong project context

### 4. Integration ID Validation Fire-and-Forget (Low)

**Location**: `client/src/lib/preferences.ts:234`

```typescript
validateCachedIntegrations().catch(() => {});
```

**Problem**: Integration validation errors are silently ignored:
- Stale integration IDs remain in cache
- Operations fail with confusing errors later
- No user notification of invalid cached data

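Instead of swallowing the rejection, the call site could route it to a small reporter that UI code subscribes to. This sketch assumes nothing about the real preferences module: `onIntegrationValidationFailure` and `reportValidationFailure` are hypothetical helpers, and only `validateCachedIntegrations` comes from the excerpt above.

```typescript
// Hypothetical sketch: these helpers do not exist in the codebase today.
type ValidationFailureListener = (message: string) => void;

const validationFailureListeners = new Set<ValidationFailureListener>();

// Registers a listener and returns an unsubscribe function.
function onIntegrationValidationFailure(listener: ValidationFailureListener): () => void {
  validationFailureListeners.add(listener);
  return () => {
    validationFailureListeners.delete(listener);
  };
}

// Normalizes an unknown rejection value and notifies every listener.
function reportValidationFailure(error: unknown): void {
  const message = error instanceof Error ? error.message : String(error);
  validationFailureListeners.forEach((listener) => {
    listener(message);
  });
}

// Intended call site, replacing the empty catch handler:
// validateCachedIntegrations().catch(reportValidationFailure);
```

A toast or settings banner could subscribe through `onIntegrationValidationFailure`, which keeps the preferences module free of UI dependencies.
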
### 5. Reconnection Doesn't Sync State (Medium)

**Location**: `client/src/api/reconnection.ts:49-53`

```typescript
try {
  await getAPI().connect();
  resetReconnectAttempts();
  setConnectionMode('connected');
  setConnectionError(null);
} catch (error) { ... }
```

**Problem**: After reconnection:
- Active streams are not recovered
- Meeting states may be stale
- No synchronization of in-flight operations

## Scope

### Task Breakdown

| Task | Effort | Description |
|------|--------|-------------|
| Add meeting cache invalidation | M | Invalidate on reconnect, periodic refresh |
| Return resolved project_id in responses | S | Backend returns actual project_id used |
| Add sync run expiry to response | S | Include `expires_at` field |
| Add cache version header | S | Server sends version, client invalidates on mismatch |
| Implement state sync on reconnect | M | Refresh critical state after connection restored |
| Surface validation errors | S | Emit events for integration validation failures |

### Files to Modify

**Backend:**
- `src/noteflow/grpc/_mixins/meeting.py` - Return resolved project_id
- `src/noteflow/grpc/_mixins/sync.py` - Add expiry info to response
- `src/noteflow/grpc/proto/noteflow.proto` - Add fields

**Client:**
- `client/src/lib/cache/meeting-cache.ts` - Add invalidation
- `client/src/api/reconnection.ts` - Sync state on reconnect
- `client/src/lib/preferences.ts` - Surface validation errors
- `client/src/hooks/use-sync-status.ts` - Handle expiry

## API Schema Changes

### Meeting Response Enhancement

```protobuf
message CreateMeetingResponse {
  Meeting meeting = 1;
  // New: Explicit resolved project context
  optional string resolved_project_id = 2;
}
```

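On the client, the new field could be folded into project context roughly as follows. This is a sketch, assuming the generated client exposes the field as an optional `resolvedProjectId`; `CreateMeetingResponseLike`, `ProjectContextUpdate`, and `projectContextFor` are hypothetical names.

```typescript
// Hypothetical client-side shape mirroring the proposed proto message.
interface CreateMeetingResponseLike {
  meeting: { id: string };
  resolvedProjectId?: string;
}

interface ProjectContextUpdate {
  projectId: string | null;
  // True when the server chose a project the client did not ask for.
  resolvedByServer: boolean;
}

// Decides which project the UI should show the new meeting under.
function projectContextFor(
  requestedProjectId: string | null,
  response: CreateMeetingResponseLike,
): ProjectContextUpdate {
  const resolved = response.resolvedProjectId ?? null;
  if (resolved === null) {
    // Server without the new field: fall back to what was requested.
    return { projectId: requestedProjectId, resolvedByServer: false };
  }
  return {
    projectId: resolved,
    resolvedByServer: requestedProjectId !== resolved,
  };
}
```

Because the field is optional, a client built this way keeps working against a server that has not shipped the change yet.
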
### Sync Status Response Enhancement

```protobuf
message GetSyncStatusResponse {
  string status = 1;
  // Existing fields...

  // New: When this sync run expires from cache
  optional string expires_at = 10;
  // New: Distinguish "not found" reasons
  optional string not_found_reason = 11;  // "expired" | "never_existed"
}
```

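A client could map these fields to a small view model so that an expired run renders as an informational message rather than an error. The sketch below makes several assumptions that the schema does not settle: that `expires_at` is an ISO 8601 timestamp, that the server sends a normal response with `not_found_reason` set instead of a bare gRPC NOT_FOUND status, and that the generated client uses camelCase names. `SyncStatusLike`, `SyncStatusView`, and `interpretSyncStatus` are hypothetical.

```typescript
// Hypothetical client-side view of the proposed GetSyncStatusResponse fields.
interface SyncStatusLike {
  status: string;
  expiresAt?: string; // assumed ISO 8601 timestamp
  notFoundReason?: string; // "expired" | "never_existed"
}

type SyncStatusView =
  | { kind: "active"; status: string; msUntilExpiry: number | null }
  | { kind: "expired" }
  | { kind: "unknown" };

function interpretSyncStatus(response: SyncStatusLike, nowMs: number): SyncStatusView {
  if (response.notFoundReason === "expired") return { kind: "expired" };
  if (response.notFoundReason === "never_existed") return { kind: "unknown" };

  if (response.expiresAt === undefined) {
    return { kind: "active", status: response.status, msUntilExpiry: null };
  }
  const expiresAtMs = Date.parse(response.expiresAt);
  if (isNaN(expiresAtMs)) {
    // Unparseable timestamp: report expiry as unknown rather than guessing.
    return { kind: "active", status: response.status, msUntilExpiry: null };
  }
  // Clamp at zero so callers can use the value directly as a polling budget.
  return {
    kind: "active",
    status: response.status,
    msUntilExpiry: Math.max(0, expiresAtMs - nowMs),
  };
}
```

A polling hook could stop scheduling requests once `msUntilExpiry` reaches zero and show "Sync info expired" for the `expired` case.
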
### Cache Versioning

```protobuf
message ServerInfo {
  // Existing fields...

  // New: Increment on breaking state changes
  int64 state_version = 10;
}
```

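Client-side handling of `state_version` can be as small as remembering the last value seen and clearing caches when it changes. This is a sketch under assumptions: `StateVersionTracker` and `Invalidatable` are hypothetical names, and the version is modelled as a `number` for brevity even though generated clients often surface `int64` as a string or `bigint`.

```typescript
// Hypothetical sketch of client-side handling for the proposed state_version field.
interface Invalidatable {
  invalidateAll(): void;
}

class StateVersionTracker {
  private lastSeen: number | null = null;
  private readonly caches: Invalidatable[];

  constructor(caches: Invalidatable[]) {
    this.caches = caches;
  }

  // Call with the state_version from each ServerInfo response.
  // Returns true when the caches were invalidated.
  observe(serverVersion: number): boolean {
    const previous = this.lastSeen;
    this.lastSeen = serverVersion;
    if (previous === null || previous === serverVersion) {
      return false; // first observation or unchanged: nothing to invalidate
    }
    for (const cache of this.caches) {
      cache.invalidateAll();
    }
    return true;
  }
}
```

The first observation only records the version. If caches are persisted across app restarts, invalidating on the first observation as well may be the safer choice.
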
## Migration Strategy

### Phase 1: Add Expiry Information (Low Risk)
- Add `expires_at` to sync run responses
- Client shows "Sync info expired" instead of error
- No breaking changes

### Phase 2: Add Resolved IDs (Low Risk)
- Return resolved `project_id` in meeting responses
- Client updates UI context accordingly
- Backward compatible (optional field)

### Phase 3: Implement Cache Invalidation (Medium Risk)
- Add cache version to server info
- Client invalidates on version mismatch
- Add event-driven invalidation for critical updates

### Phase 4: Reconnection Sync (Medium Risk)
- Refresh active meeting state on reconnect
- Notify user of any state changes
- Handle conflicts gracefully

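For the Phase 4 notification step, the client needs to know what changed while it was disconnected. One possible shape is a pure diff between the cached snapshot and the one fetched after reconnect. The field names below (`state`, `segmentCount`) and the `diffMeeting` helper are illustrative assumptions, not the real `Meeting` type.

```typescript
// Hypothetical sketch: field names are illustrative, not the real Meeting type.
interface MeetingSnapshot {
  id: string;
  state: string;
  segmentCount: number;
}

interface MeetingChange {
  field: "state" | "segmentCount";
  before: string | number;
  after: string | number;
}

// Compares the cached snapshot with the one fetched after reconnect.
function diffMeeting(cached: MeetingSnapshot, fresh: MeetingSnapshot): MeetingChange[] {
  if (cached.id !== fresh.id) {
    throw new Error(`Cannot diff different meetings: ${cached.id} vs ${fresh.id}`);
  }
  const changes: MeetingChange[] = [];
  if (cached.state !== fresh.state) {
    changes.push({ field: "state", before: cached.state, after: fresh.state });
  }
  if (cached.segmentCount !== fresh.segmentCount) {
    changes.push({
      field: "segmentCount",
      before: cached.segmentCount,
      after: fresh.segmentCount,
    });
  }
  return changes;
}
```

An empty result means the cached copy was still accurate, so no notification is needed. A non-empty result can drive both the cache update and a user-facing message.
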
## Deliverables

### Backend
- [ ] Return resolved `project_id` in `CreateMeeting` response
- [ ] Add `expires_at` to sync status responses
- [ ] Add `state_version` to server info
- [ ] Emit events for state changes (future: WebSocket)

### Client
- [ ] Meeting cache invalidation on reconnect
- [ ] Meeting cache periodic refresh (30s for active meeting)
- [ ] Handle sync run expiry gracefully
- [ ] Update context with resolved project_id
- [ ] Surface integration validation errors
- [ ] State synchronization on reconnect

### Tests
- [ ] Integration test: meeting state sync after disconnect
- [ ] Integration test: sync run expiry handling
- [ ] Unit test: cache invalidation triggers
- [ ] E2E test: multi-client state consistency

## Test Strategy

### Fixtures
- Mock server with controllable state version
- Multi-client simulation
- Network partition simulation

### Test Cases

| Case | Input | Expected |
|------|-------|----------|
| Meeting modified by server | Create, modify via API, refresh | Client shows updated state |
| Sync run expires | Start sync, wait 70s after it completes, check status | Graceful "expired" message |
| Reconnection | Disconnect, modify, reconnect | State synchronized |
| Active project | Create meeting without project_id | Response includes resolved project_id |
| Cache version bump | Server restart with new version | Client invalidates caches |

## Quality Gates

- [ ] No stale meeting states shown after reconnection
- [ ] Sync run expiry handled gracefully (no error dialogs)
- [ ] Active project always known to client
- [ ] Integration validation errors surface to user
- [ ] All cache operations have invalidation path
- [ ] Tests cover multi-client scenarios