UN-2830 Suppress OpenTelemetry EventLogger LogRecord deprecation warning in prompt-service
Suppressed the LogDeprecatedInitWarning from OpenTelemetry SDK 1.37.0's EventLogger.emit()
method which still uses deprecated trace_id/span_id/trace_flags parameters instead of the
context parameter.
This is a temporary workaround for an upstream bug in OpenTelemetry Python SDK where
EventLogger.emit() was not updated when LogRecord API was changed to accept context parameter.
The warning only appears in prompt-service because llama-index emits OpenTelemetry Events.
See: https://github.com/open-telemetry/opentelemetry-python/issues/4328
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
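As a sketch of how such a suppression can be scoped narrowly, Python's warnings filter can silence only the offending message while leaving other deprecation warnings visible (the message pattern below is an assumption, not the SDK's actual text):

```python
import warnings

def count_visible_warnings() -> int:
    """Emit two DeprecationWarnings; only the one whose message does not
    match the ignore pattern should be recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Hypothetical pattern: ignore only the trace_id deprecation notice.
        warnings.filterwarnings(
            "ignore", message=r".*trace_id.*", category=DeprecationWarning
        )
        warnings.warn("trace_id parameter is deprecated", DeprecationWarning)
        warnings.warn("unrelated deprecation", DeprecationWarning)
        return len(caught)
```

A filter this specific can be dropped cleanly once the upstream fix lands.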
- Moved redis_client.py and redis_queue_client.py from workers/shared/cache/ to unstract/core/src/unstract/core/cache/
- Created new cache module in unstract-core with __init__.py exports
- Updated all import statements across workers to use unstract.core.cache instead of shared.cache
- Updated imports in log_consumer tasks and cache_backends to use new location
- Maintains backward compatibility by keeping same API and functionality
This commit implements multiple defensive fixes to prevent race conditions when duplicate
workers process the same file execution, which was causing:
1. COMPLETED/ERROR status being overwritten with PENDING→EXECUTING by late workers
2. Premature chord callback cleanup (FileExpired errors) while workers still processing
3. Missing FINALIZATION stage detection during grace period
4. Incorrect COMPLETED:SUCCESS status for failed executions
Fixes implemented:
- Pre-creation: Check FileExecutionStatusTracker (Redis) + DB before PENDING update
- Grace period: Extended duplicate detection to include FINALIZATION stage
- Destination: Wait for DESTINATION_PROCESSING lock release (prevents chord cleanup)
- Terminal states: Handle both COMPLETED and ERROR in processor.py
- Status tracking: Set COMPLETED:FAILED for failed executions (was SUCCESS)
- Refactoring: Consolidated completed_ttl to method initialization
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* UN-2901 [FIX] Prevent invalid status updates (EXECUTING/ERROR) from duplicate file processing runs
Fixes race condition where late-arriving workers overwrite COMPLETED status
with invalid EXECUTING or ERROR states, causing files to appear failed/stuck
even though processing succeeded.
Changes:
- FileAPIClient: Fixed URL construction and method call bugs
- Fresh DB validation: Check current status before updating to EXECUTING
- Grace period optimization: Early exit when duplicate detected during tool polling
- File count accuracy: Include skipped files in total_files calculation
Impact: Files now correctly maintain COMPLETED status; no duplicate processing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
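The fresh-DB-validation guard above can be sketched as follows (status names are taken from the message; the helper name is mine):

```python
# Terminal states must never be overwritten by a late-arriving worker.
TERMINAL_STATUSES = frozenset({"COMPLETED", "ERROR"})

def may_transition_to_executing(current_status: str) -> bool:
    """Return True only if updating to EXECUTING would not clobber a
    terminal status set by the worker that actually finished the file."""
    return current_status not in TERMINAL_STATUSES
```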
* CodeRabbit comments addressed
---------
Co-authored-by: Claude <noreply@anthropic.com>
* UN-2901 [FIX] Container startup race condition with polling grace period
* UN-2901 [FIX] Add Redis retry resilience and fix container failure detection
- Add configurable Redis retry decorator with exponential backoff
- Fix critical bug where containers that never start are marked as SUCCESS
- Add robust env var validation for retry configuration
- Apply retry logic to FileExecutionStatusTracker and ToolExecutionTracker
- Document REDIS_RETRY_MAX_ATTEMPTS and REDIS_RETRY_BACKOFF_FACTOR env vars
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
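A minimal sketch of such a retry decorator (names and defaults mirror the env vars above but are assumptions): max_retries counts retries after the initial attempt, with exponential backoff between tries.

```python
import time
from functools import wraps

def redis_retry(max_retries: int = 4, backoff_factor: float = 0.5):
    """Retry a Redis operation on ConnectionError with exponential
    backoff; max_retries + 1 total attempts are made."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except ConnectionError:
                    if attempt == max_retries:
                        raise  # exhausted: surface the last failure
                    time.sleep(backoff_factor * (2 ** attempt))
        return wrapper
    return decorator
```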
* UN-2901 [FIX] Address CodeRabbitAI review feedback for race condition fix
This commit addresses all valid CodeRabbitAI review comments on PR #1602:
1. **Fix retry loop semantics**: Changed retry loop to use range(max_retries + 1)
where max_retries means "retries after initial attempt", not total attempts.
Updated default from 5 to 4 (total 5 attempts) for clarity.
2. **Fix TypeError in file_execution_tracker.py**: Fixed json.loads() receiving
dict instead of string by using string fallback values.
3. **Fix unsafe env var parsing**: Added _safe_get_env_int/_safe_get_env_float
helpers with validation and fallback to defaults with warning logs.
4. **Fix status None check**: Added defensive None check before calling .get()
on status dict in grace period reset logic.
5. **Update sample.env defaults**: Changed REDIS_RETRY_MAX_ATTEMPTS from 5 to 4
and updated comments to clarify retry semantics.
6. **Improve transient failure handling**: Changed logger.error to logger.warning
for transient status fetch failures, added sleep before continue to respect
polling interval and avoid API hammering.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
UN-2897 [FIX] Prevent SIGSEGV from Secret Manager gRPC in Google Drive connector
Add module-level cache for client_secrets to prevent repeated Secret Manager
API calls which use gRPC and are not fork-safe. This addresses the second
root cause discovered after initial lazy initialization fix.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
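The module-level cache can be sketched as follows (names are hypothetical): the secret is fetched once and reused, so forked Celery workers avoid re-invoking the gRPC-based Secret Manager client, which is not fork-safe.

```python
# Module-level cache: survives for the life of the process, populated
# at most once per secret name.
_client_secrets_cache: dict = {}

def get_client_secrets(secret_name: str, fetch_fn) -> str:
    """Return the cached secret, calling fetch_fn only on a cache miss."""
    if secret_name not in _client_secrets_cache:
        _client_secrets_cache[secret_name] = fetch_fn(secret_name)
    return _client_secrets_cache[secret_name]
```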
* Update tox.ini to document libmagic1 system dependency requirement
* [CI] Consolidate test reports into single PR comment
This change reduces noise in PR comments by combining multiple test
reports (runner and sdk1) into a single consolidated comment.
Changes:
- Add combine-test-reports.sh script to merge test reports
- Update CI workflow to combine reports before posting
- Replace separate runner/sdk1 comment steps with single combined step
- Summary section shows test counts (passed/failed/total) collapsed by default
- Full detailed reports available in collapsible sections
- Simplify job summary to use combined report
Benefits:
- Single PR comment instead of multiple separate comments
- Cleaner PR comment section with less clutter
- Easy-to-read summary with detailed inspection on demand
- Maintains all existing test information
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix test count extraction in combine-test-reports script
The previous version was using text pattern matching which incorrectly
extracted test counts. This fix properly parses the pytest-md-report
markdown table format by:
- Finding the table header row to determine column positions
- Locating the TOTAL row (handles both TOTAL and **TOTAL** formatting)
- Extracting values from the passed, failed, and SUBTOTAL columns
- Using proper table parsing instead of pattern matching
This resolves the issue where the summary showed incorrect counts
(23 for both runner and sdk1 instead of the actual 11 and 66).
Fixes: Test count summary in PR comments now shows correct values
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix LaTeX formatting in pytest-md-report test count extraction
pytest-md-report wraps all table values in LaTeX formatting:
$$\textcolor{...}{\tt{VALUE}}$$
The previous fix attempted to parse the table but didn't handle
the LaTeX formatting, causing it to extract 0 for all counts.
Changes:
- Add strip_latex() function to extract values from LaTeX wrappers
- Update grep pattern to match TOTAL row specifically (not SUBTOTAL in header)
- Apply LaTeX stripping to all extracted values before parsing
This fix was tested locally with tox-generated reports and correctly
shows: Runner=11 passed, SDK1=66 passed
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
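The unwrapping can be sketched in Python (the shell script's actual regex is not reproduced here; this pattern is an approximation of the same idea):

```python
import re

# Matches $$\textcolor{...}{\tt{VALUE}}$$ and captures VALUE.
_LATEX_CELL = re.compile(r"\$\$\\textcolor\{[^}]*\}\{\\tt\{([^}]*)\}\}\$\$")

def strip_latex(cell: str) -> str:
    r"""Unwrap a pytest-md-report cell like $$\textcolor{green}{\tt{11}}$$
    to '11'; pass plain cells through unchanged (whitespace stripped)."""
    match = _LATEX_CELL.search(cell)
    return match.group(1) if match else cell.strip()
```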
---------
Co-authored-by: Claude <noreply@anthropic.com>
UN-2897 [FIX] Google Drive connector SIGSEGV crashes in Celery ForkPoolWorker
Implements lazy initialization for Google Drive API client to prevent
segmentation faults when Celery forks worker processes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
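The lazy-initialization pattern can be sketched as follows (the class name and placeholder client are hypothetical): the client is built on first use, i.e. after Celery has forked, rather than at import or constructor time.

```python
class DriveClientHolder:
    """Defer building the API client until first access, so it is
    constructed inside the forked worker process."""

    def __init__(self, credentials):
        self._credentials = credentials
        self._client = None  # not built yet

    @property
    def client(self):
        if self._client is None:
            self._client = self._build_client()
        return self._client

    def _build_client(self):
        # Placeholder: the real connector would construct the Google
        # Drive API client from self._credentials here.
        return {"credentials": self._credentials}
```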
* Removing redundant validations
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-2893 [FIX] Fix duplicate process handling status updates and UI error logs
Prevent duplicate worker processes from updating file execution status
and showing UI error logs during GKE race conditions.
- Added is_duplicate_skip flag to FileProcessingResult dataclass
- Fixed destination_processed default value for correct duplicate detection
- Skip status updates and UI logs when duplicate is detected
- Only first worker updates status, second worker silently exits
* Converted logger.error calls to logger.exception
* Converted remaining error logs to use logger.exception
* UN-2882 [FIX] Fix BigQuery float precision issue in PARSE_JSON for metadata serialization
- Added BigQuery-specific float sanitization with IEEE 754 double precision safe zone
- Consolidated duplicate float sanitization logic into shared utilities
- Fixed insertion errors caused by floats with >15 significant figures
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix truthiness check for empty values in JSON serialization
- Changed 'if sanitized_value' to 'if sanitized_value is not None'
- Prevents empty dicts {}, empty lists [], and zero values from becoming None
- Addresses CodeRabbit AI feedback on PR #1593
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
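The difference between the two checks can be sketched as (the function name is mine):

```python
def serialize_value(sanitized_value, null_marker="null"):
    """'if sanitized_value' would discard {}, [], 0 and "" as falsy;
    checking 'is not None' preserves them and only maps true None."""
    if sanitized_value is not None:
        return sanitized_value
    return null_marker
```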
---------
Co-authored-by: Claude <noreply@anthropic.com>
* feat/smart-table-extractor
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address review comments
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address review comments
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address review comments
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com>
* UN-2889 [FIX] Handle Celery logger with empty request_id to prevent SIGSEGV crashes
- Simplified logging filters into RequestIDFilter and OTelFieldFilter
- Removed custom DjangoStyleFormatter and StructuredFormatter classes
- Removed Celery's worker_log_format config that created formatters without filters
- Removed LOG_FORMAT environment variable and all format options
- All workers now use single standardized format with filters always applied
* Addressed CodeRabbit comment
* Addressed CodeRabbit comment
UN-2885 [FIX] Fix MinIO cleanup failure due to incorrect organization ID format
Fixed MinIO path mismatch causing cleanup failures by correcting the
organization_id format returned in workflow execution context.
Changed backend/workflow_manager/internal_views.py:443 from returning
numeric organization.id (e.g., "2") to string organization.organization_id
(e.g., "org_HJqNgodUQsA99S3A") in _get_organization_context() method.
This ensures MinIO cleanup uses the correct organization ID format that
matches the paths where tools write execution files.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* Fix BigQuery float precision issue by normalizing floats before JSON serialization
- Added _sanitize_floats_for_database() helper function to recursively normalize
float values to 6 decimal precision using string formatting
- Modified _add_processing_columns() to sanitize metadata before json.dumps()
- Fixes BigQuery insertion failures caused by floats that can't round-trip
through string representation (e.g., 22.770092)
- Solution normalizes internal binary representation via float(f"{x:.6f}")
- Handles edge cases: NaN and Infinity converted to None
- Works recursively on nested dicts/lists
- Backward compatible, preserves meaningful precision
- Protects all database types (BigQuery, PostgreSQL, MySQL, Snowflake)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Addressed PR comments: applied the sanitize method to the data field too
* moved math import to top of the file
---------
Co-authored-by: Claude <noreply@anthropic.com>
- Remove X-Organization-ID from session headers in _setup_session()
- Remove X-Organization-ID from set_organization_context() method
- Update clear_organization_context() to only clear instance variables
- Use per-request headers in _make_request() to prevent pollution
This prevents callback workers from inheriting wrong organization context
when using shared HTTP sessions with singleton pattern.
Fixes: UN-2877
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
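The per-request header approach can be sketched as follows (the function name is mine): copy the shared session headers and add X-Organization-ID only for the current request, so the singleton session is never mutated.

```python
def build_request_headers(session_headers: dict, organization_id=None) -> dict:
    """Return per-request headers; the shared session headers are copied,
    never mutated, so no tenant context leaks between workers."""
    headers = dict(session_headers)
    if organization_id:
        headers["X-Organization-ID"] = organization_id
    return headers
```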
- Add lock TTL (10 min default) to prevent deadlock if worker crashes during destination processing
- Add error check before marking DESTINATION_PROCESSING SUCCESS to ensure no false success status
- Add DESTINATION_PROCESSING_LOCK_TTL_IN_SECOND=600 to workers/sample.env
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fixed stage order: DESTINATION_PROCESSING(2) → FINALIZATION(3) → COMPLETED(4)
- Implemented atomic lock via DESTINATION_PROCESSING IN_PROGRESS status
- Added 3-minute TTL for COMPLETED stage to optimize Redis memory
- Removed conflicting stage status updates from service.py
- Enhanced status history tracking with complete IN_PROGRESS → SUCCESS transitions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixed parameter name from 'exclude_execution_id' to 'current_execution_id'
in worker API client to match backend endpoint expectations. This allows
worker retries after pod crashes to properly exclude current execution
from duplicate detection.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* feat/smart-table-extractor
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address review comments
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address review comments
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address review comments
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Gayathri <142381512+gaya3-zipstack@users.noreply.github.com>
* UN-2871 [FEATURE] Add shared workflow executions filter to enable multi-user access
Update WorkflowExecutionManager.for_user() to include executions from workflows shared with users, ensuring consistent access control across workflow and execution models.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update backend/workflow_manager/workflow_v2/models/execution.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Rahul Johny <116638720+johnyrahul@users.noreply.github.com>
* UN-2871 [FIX] Move Q import to top-level for PEP8 compliance
Move django.db.models.Q import from function-level to module-level to comply with linting standards and improve code organization.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* UN-2871 [SECURITY] Fix execution filtering to respect independent workflow and deployment sharing
Update WorkflowExecutionManager.for_user() to properly handle independent sharing between workflows and API deployments/pipelines. Previous implementation only checked workflow sharing, allowing users to see executions for unshared deployments.
Key changes:
- Add separate filters for API deployments and pipelines access
- Implement proper logic for independent sharing scenarios:
* Workflow shared + no pipeline -> User sees workflow-level executions
* API/Pipeline shared (regardless of workflow) -> User sees those executions
* Both shared -> User sees all related executions
* Neither shared -> User cannot see executions
This ensures users can only view executions for resources they have explicit access to, preventing unauthorized data exposure.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
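The sharing rules above reduce to a small truth table; the condensation into this form is mine and only illustrative of the stated scenarios:

```python
def can_view_executions(workflow_shared: bool, has_pipeline: bool,
                        pipeline_shared: bool) -> bool:
    """When an API deployment/pipeline exists, its own sharing governs
    access; only when none exists does workflow sharing alone decide."""
    if has_pipeline:
        return pipeline_shared
    return workflow_shared
```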
* UN-2871 [PERF] Optimize ExecutionFilter to use EXISTS instead of values_list for large datasets
Replace inefficient values_list() queries with EXISTS subqueries in filter_execution_entity().
This significantly improves performance when filtering by entity type on large datasets.
Performance improvements:
- API filter: Uses EXISTS check instead of fetching all API deployment IDs
- ETL filter: Uses EXISTS check instead of fetching all ETL pipeline IDs
- TASK filter: Uses EXISTS check instead of fetching all TASK pipeline IDs
- Workflow filter: Simplified to use isnull check (removed redundant workflow_id filter)
EXISTS is more efficient because:
1. Stops at first match (short-circuits)
2. Doesn't transfer data from database to application
3. Better query optimizer hints for the database
4. Reduced memory usage
The queryset is already filtered by user permissions via get_queryset(),
so this change only optimizes the entity type filtering step.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
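A raw-SQL illustration of why EXISTS beats values_list(): the subquery can stop at the first matching row and no ID list is transferred to the application. The schema is a toy stand-in for the Django models, not the real tables.

```python
import sqlite3

# Toy schema standing in for executions and API deployments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE execution (id INTEGER PRIMARY KEY, workflow_id INTEGER);
    CREATE TABLE api_deployment (id INTEGER PRIMARY KEY, workflow_id INTEGER);
    INSERT INTO execution VALUES (1, 10), (2, 20);
    INSERT INTO api_deployment VALUES (1, 10);
""")

def api_execution_ids(connection) -> list:
    """Filter executions via a correlated EXISTS subquery, the shape
    Django emits for Exists(...) with OuterRef."""
    rows = connection.execute("""
        SELECT e.id FROM execution e
        WHERE EXISTS (
            SELECT 1 FROM api_deployment d
            WHERE d.workflow_id = e.workflow_id
        )
        ORDER BY e.id
    """).fetchall()
    return [row[0] for row in rows]
```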
* UN-2871 [FIX] Move Exists and OuterRef imports to module level for PEP8 compliance
Move django.db.models.Exists and OuterRef imports from function-level to module-level to comply with linting standards and improve code organization.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Signed-off-by: Rahul Johny <116638720+johnyrahul@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* UN-2867 [FIX] Fixed webhook notifications showing incorrect pipeline name
- Added pipeline name fetch from Pipeline/APIDeployment models in callback worker
- Replaced workflow name fallback with explicit "Unknown API"/"Unknown Pipeline" values
- Extracted pipeline name fetching logic into reusable _fetch_pipeline_name_from_api() method
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Converted some error logs to logger.exception for better debugging
* Replaced logger.warning with logger.exception
* Addressed CodeRabbitAI's comments
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
UN-2869 [FIX] Add broker heartbeat configuration to prevent RabbitMQ connection timeouts
This fix addresses false duplicate file detection caused by stale IN_PROGRESS
records when RabbitMQ disconnects idle workers after 60 seconds.
Changes:
- Added broker_heartbeat=30s to WorkerCeleryConfig in workers/shared/models/worker_models.py
- Configurable via CELERY_BROKER_HEARTBEAT env var (default: 30s)
- Prevents RabbitMQ connection drops during long-running tasks
- Eliminates stale cache/DB entries that cause false duplicate detection
Technical Details:
- RabbitMQ default timeout: 60 seconds
- Recommended heartbeat: 30 seconds (half of timeout)
- Uses get_celery_setting() for hierarchical config: worker-specific -> global -> default
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
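Reading the setting can be sketched like this (the helper name is mine; the env var and 30s default come from the commit): the heartbeat sits at roughly half RabbitMQ's 60s idle timeout so a beat always lands before the broker drops the connection.

```python
import os

def get_broker_heartbeat(default: int = 30) -> int:
    """Resolve the broker heartbeat from CELERY_BROKER_HEARTBEAT,
    falling back to 30s (half the RabbitMQ 60s timeout)."""
    return int(os.environ.get("CELERY_BROKER_HEARTBEAT", str(default)))
```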
* UN-2866 [MISC] Add database duplicate file check in file_processing worker
This adds a defensive double-safeguard layer to prevent duplicate file
processing at the file_processing worker level, complementing the existing
duplicate check in the general worker.
Changes:
- Added _check_file_already_active_in_db() helper function that checks
database (not Redis) for files already being processed in other executions
- Modified _pre_create_file_executions() to check for duplicates before
creating file execution records
- Duplicates create WorkflowFileExecution records marked as ERROR status
(not skipped) with detailed error message for full audit trail
- Added UI logging via log_file_processing_error() for user visibility
- Modified _process_individual_files() to handle skipped duplicates gracefully
- Updated _compile_batch_result() to include skipped_duplicates count
Benefits:
- Double protection against race conditions (catches edge cases)
- Full audit trail - every file has a database record with explanation
- Better user experience - users see why files were skipped in UI
- Transparent failure counting - duplicates counted as failed with reason
- Database-only check (efficient, no Redis overhead)
Implementation Details:
- Only checks database (Redis already updated by general worker)
- Fail-safe approach - allows processing if check fails
- Creates ERROR records for duplicates instead of silently skipping
- Proper logging at both console and UI levels
- Skipped duplicates included in failed_files count
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* UN-2866 [MISC] Enhanced duplicate file detection with Redis-first check and stable identifier tracking
- Redis-first duplicate check with DB fallback for faster detection
- Stable identifier tracking (uuid:path) to prevent misclassification
- Reuse ActiveFileManager methods for consistency
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
UN-2865 [FIX] Remove premature COMPLETED status update in general worker
This fix addresses a critical bug where the general worker incorrectly
marked workflow executions as COMPLETED immediately after orchestrating
async file processing, while files were still being processed.
Changes:
- Removed WorkflowExecutionStatusUpdate that set status to COMPLETED
- Removed incorrect execution_time update (only orchestration time)
- Removed incorrect total_files calculation
- Updated comments to clarify orchestration vs execution completion
- Updated logging to reflect async orchestration behavior
The callback worker now properly handles setting the final COMPLETED or
ERROR status after all files finish processing, matching the pattern
used by the API deployment worker.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
UN-2862 [FIX] Remove legacy QUEUED and CANCELED statuses
- Remove QUEUED and CANCELED enum values from ExecutionStatus
- Remove legacy status mappings from 4 worker status mapping files
- Remove dead LEGACY_STATUS_MAPPING dict (never used)
- Remove unreachable QUEUED/CANCELED mappings from callback tasks
- Fix unused QUEUED default parameter to PENDING in internal_client
- Add human-readable labels to ExecutionStatus.choices using .title()
- Remove unmigrated API constraint from file_execution model
All legacy status references cleaned up while maintaining backward compatibility
through string-based legacy mappings in callback tasks.
Co-authored-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Updated WorkflowEndpointViewSet to use Workflow.objects.for_user() method
instead of filtering by workflow__created_by. This ensures that workflow
endpoints are visible for both owned and shared workflows, maintaining
consistency with WorkflowViewSet behavior.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* UN-2860 [FIX] Fixed active file cache not preventing duplicate file processing
Fixed critical bug where ActiveFileFilter cache checks were failing to detect
files already being processed, causing duplicate file processing in concurrent
workflow executions.
Key fixes:
- Fixed cache data access: Extract execution_id from nested cache structure
(cached_data["data"]["execution_id"] instead of cached_data["execution_id"])
- Changed cache status from "EXECUTING" to "PENDING" for queued files
- Increased MAX_ACTIVE_FILE_CACHE_TTL from 1hr to 2hrs for resource-constrained environments
- Added cache cleanup in finally blocks to prevent stale entries
- Fixed cache key format consistency (hash-based) between backend and workers
- Optimized DB queries to skip files already found in cache
- Removed ~370 lines of dead code (file_management_utils.py and unused methods)
Root cause: RedisCacheBackend wraps data in {data: {...}, cached_at, ttl} but
filter_pipeline was accessing execution_id directly instead of from nested data key.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
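The root-cause fix can be sketched as (the accessor name is mine; the wrapper shape is quoted from the commit):

```python
def extract_execution_id(cached_data: dict):
    """RedisCacheBackend wraps payloads as
    {"data": {...}, "cached_at": ..., "ttl": ...}, so execution_id lives
    under the nested "data" key, not at the top level."""
    return (cached_data.get("data") or {}).get("execution_id")
```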
* Addressed CodeRabbit comments
* Made the DB param optional for cache clearing
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* UN-2793 [FEAT] Add retry logic with exponential backoff to SDK1
Implemented automatic retry logic for platform and prompt service calls
with configurable exponential backoff, comprehensive test coverage, and
CI integration.
Features:
- Exponential backoff with jitter for transient failures
- Configurable via environment variables (MAX_RETRIES, MAX_TIME, BASE_DELAY, etc.)
- Retries ConnectionError, Timeout, HTTPError (502/503/504), OSError
- 67 tests with 100% pass rate
- CI integration with test reporting
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* [SECURITY] Use full commit SHA for sticky-pull-request-comment action
Replace tag reference with full commit SHA for better security:
- marocchino/sticky-pull-request-comment@v2 → @7737449 (v2.9.4)
This prevents potential supply chain attacks where tags could be moved
to point to malicious code. Commit SHAs are immutable.
Fixes SonarQube security hotspot for external GitHub action.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* [FIX] Allow retryable HTTP errors (502/503/504) to propagate for retry
Fixed HTTPError handling in _get_adapter_configuration to check status
codes and re-raise retryable errors (502, 503, 504) so the retry
decorator can handle them. Non-retryable errors are still converted
to SdkError as before.
Changes:
- Check HTTPError status code before converting to SdkError
- Re-raise HTTPError for 502/503/504 to allow retry decorator to retry
- Added parametrized test for all retryable status codes (502, 503, 504)
- All 12 platform tests pass
This fixes a bug where 502/503/504 errors were not being retried
because they were converted to SdkError before the retry decorator
could see them.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* [FIX] Use pytest.approx() for floating point comparisons in tests
Replaced direct equality comparisons (==) with pytest.approx() for
floating point values to avoid precision issues and satisfy SonarQube
code quality check (python:S1244).
Changes in test_retry_utils.py:
- test_exponential_backoff_without_jitter: Use pytest.approx() for 1.0, 2.0, 4.0, 8.0
- test_max_delay_cap: Use pytest.approx() for 5.0
This is the proper way to compare floating point values in tests,
accounting for floating point precision limitations.
All 4 TestCalculateDelay tests pass.
Fixes SonarQube: python:S1244 - Do not perform equality checks with
floating point values.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
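A delay calculation consistent with the tests named above can be sketched as (the 60.0 default cap and jitter range are assumptions):

```python
import random

def calculate_delay(attempt: int, base_delay: float = 1.0,
                    max_delay: float = 60.0, jitter: bool = False) -> float:
    """Exponential backoff base_delay * 2**attempt, capped at max_delay,
    with optional additive jitter for de-synchronizing retries."""
    delay = min(base_delay * (2 ** attempt), max_delay)
    if jitter:
        delay += random.uniform(0, delay)
    return delay
```

Powers of two and small integers are exact in binary floating point, so the assertions below compare exactly; pytest.approx() remains the right tool for derived values.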
* minor: Addressed code smells, ruff fixes
* misc: Fixed tox config for sdk1 tests
* misc: Ruff issues fixed
* misc: tox tests fixed
* Updated prompt-service lock file for venv
* updated lock files for backend and prompt-service
* UN-2793 [FEAT] Update to unstract-sdk v0.78.0 with retry logic support (#1567)
[FEAT] Update unstract-sdk to v0.78.0 across all services and tools
- Updated unstract-sdk dependency from v0.77.3 to v0.78.0 in all pyproject.toml files
- Main repository, backend, workers, platform-service, prompt-service
- filesystem and tool-registry modules
- Updated tool requirements.txt files (structure, classifier, text_extractor)
- Bumped tool versions in properties.json:
- Structure tool: 0.0.88 → 0.0.89
- Classifier tool: 0.0.68 → 0.0.69
- Text extractor tool: 0.0.64 → 0.0.65
- Updated tool versions in backend/sample.env and public_tools.json
- Regenerated all uv.lock files with new SDK version
This update brings in the retry logic with exponential backoff from unstract-sdk v0.78.0
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
---------
Signed-off-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
* Added file storage deps for sdk1
* Removed specific file storage deps from tomls and grouped them together with unstract-sdk1
* Commit uv.lock changes
* Pinned versions for azure-identity and azure-mgmt-apimanagement in backend's pyproject.toml
---------
Co-authored-by: pk-zipstack <187789273+pk-zipstack@users.noreply.github.com>
* UN-2850 [FIX] Update ETL webhook response status change for scheduled executions
- Add INPROGRESS status to NotificationStatus enum for scheduled executions
- Add SCHEDULED_EXECUTION source to NotificationSource enum
- Fix notification status mapping for backward compatibility:
- API workflows use COMPLETED status
- ETL/TASK workflows use SUCCESS status
- Update Slack webhook formatting to inline display
- Standardize key naming in Slack notifications (ID → Id)
- Trigger INPROGRESS notification when scheduled pipeline starts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Committed previously missed changes
---------
Co-authored-by: Claude <noreply@anthropic.com>
* UN-2854 [MISC] Set CELERY_TASK_REJECT_ON_WORKER_LOST to false
- Change default value from True to False in worker_models.py
- Update sample.env to reflect new default (false)
- Fix JSON credential quoting in sample.env (double to single quotes)
- Prevents duplicate task processing on worker connection loss
- Matches backend behavior (which never had this issue)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Corrected the wrongly added task_reject_on_worker_lost default value
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix(sdk1): resolve 185 ruff linting issues across multiple categories
Systematically fixed all ruff linting errors in the sdk1 directory by category:
- I001: Fixed 38 import sorting issues
- ANN201/ANN202/ANN204: Added missing return type annotations
- ANN003: Added type annotations for **kwargs parameters
- ANN001: Added type annotations for function arguments
- D415: Fixed docstring punctuation (first line ending)
- B006: Resolved 15 mutable argument default issues
- B007: Fixed 1 unused loop control variable
- B008: Corrected function calls in argument defaults
- B024: Fixed abstract base class declarations
* Fixed D205, B904, D107, F821, E501, D200, D411 lint issues
* Fixed lint issues
* Fixed lint issues
* Fixed all lint issues
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Added ANN101 to ignore list in ruff linter
* Added UP038 and ANN102 to ruff ignore list
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fixed issue with pycln in base.py
* Removed global ignore for "UP038" and "C901"; added a comment in the flagged code block (C901) to suppress the issue.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* added ignore for UP038 in tools.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Addressed CodeRabbit comments
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* modified tool.py to ignore UP038 issue
* Addressed CodeRabbit suggestions
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Gayathri <142381512+gaya3-zipstack@users.noreply.github.com>
* UN-2470 [MISC] Remove Django dependency from Celery workers
This commit introduces a new worker architecture that decouples
Celery workers from Django where possible, enabling support for
gevent/eventlet pool types and reducing worker startup overhead.
Key changes:
- Created separate worker modules (api-deployment, callback, file_processing, general)
- Added internal API endpoints for worker communication
- Implemented Django-free task execution where appropriate
- Added shared utilities and client facades
- Updated container configurations for new worker architecture
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix pre-commit issues: file permissions and ruff errors
Set up Docker for the new workers
- Add executable permissions to worker entrypoint files
- Fix import order in namespace package __init__.py
- Remove unused variable api_status in general worker
- Address ruff E402 and F841 errors
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* refactored Dockerfiles, misc fixes
* flexibility on celery run commands
* added debug logs
* handled filehistory for API
* cleanup
* cleanup
* cloud plugin structure
* minor changes in import plugin
* added notification and logger workers under new worker module
* add docker compatibility for new workers
* handled docker issues
* log consumer worker fixes
* added scheduler worker
* minor env changes
* cleanup the logs
* minor changes in logs
* resolved scheduler worker issues
* cleanup and refactor
* ensuring backward compatibility with existing workers
* added configuration internal apis and cache utils
* optimization
* Fix API client singleton pattern to share HTTP sessions
- Fix flawed singleton implementation that was trying to share BaseAPIClient instances
- Now properly shares HTTP sessions between specialized clients
- Eliminates 6x BaseAPIClient initialization by reusing the same underlying session
- Should reduce API deployment orchestration time by ~135ms (from 6 clients to 1 session)
- Added debug logging to verify singleton pattern activation
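The shared-session singleton described above can be sketched roughly as below. The class names and the stand-in session type are illustrative assumptions, not the actual worker client code:

```python
import threading


class _FakeSession:
    """Stand-in for an HTTP session (e.g. requests.Session) to keep the
    sketch dependency-free."""


class SharedSessionFactory:
    """Lazily creates one session and hands the same instance to every client."""

    _session = None
    _lock = threading.Lock()

    @classmethod
    def get_session(cls):
        # Double-checked locking: create the session once, reuse it everywhere.
        if cls._session is None:
            with cls._lock:
                if cls._session is None:
                    cls._session = _FakeSession()
        return cls._session


class BaseAPIClient:
    def __init__(self):
        # Each specialized client is its own instance, but all share one
        # underlying session, so connection pools and setup cost are reused.
        self.session = SharedSessionFactory.get_session()
```

Sharing the session rather than the client instances is what avoids the repeated `BaseAPIClient` initialization cost the commit describes.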
* cleanup and structuring
* cleanup in callback
* file system connectors issue
* celery env values changes
* optional gossip
* variables for sync, mingle and gossip
* Fix for file type check
* Task pipeline issue resolving
* handled failed API deployment response
* Task pipeline fixes
* updated file history cleanup with active file execution
* pipeline status update and workflow UI page execution
* cleanup and resolving conflicts
* remove unstract-core from connectors
* Commit uv.lock changes
* uv lock updates
* resolve migration issues
* defer connector metadata
* Fix connector migration for production scale
- Add encryption key handling with defer() to prevent decryption failures
- Add final cleanup step to fix duplicate connector names
- Optimize for large datasets with batch processing and bulk operations
- Ensure unique constraint in migration 0004 can be created successfully
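The duplicate-name cleanup with batch processing can be sketched without Django as follows; rows are modeled as plain dicts, and all names (including the "(n)" rename scheme) are assumptions for illustration only:

```python
from collections import defaultdict


def dedupe_connector_names(rows: list[dict], batch_size: int = 1000) -> list[dict]:
    """Rename duplicate connector names as 'name (2)', 'name (3)', ... in
    stable order, processing rows in batches to bound memory per pass."""
    seen: dict[str, int] = defaultdict(int)
    out: list[dict] = []
    for start in range(0, len(rows), batch_size):  # batch boundary
        for row in rows[start:start + batch_size]:
            seen[row["name"]] += 1
            n = seen[row["name"]]
            new_name = row["name"] if n == 1 else f'{row["name"]} ({n})'
            out.append({**row, "name": new_name})
    return out
```

In the real migration the same idea would run over the ORM with bulk updates, so that the unique constraint in migration 0004 can be created without collisions.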
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* HITL fixes
* minor fixes on HITL
* api_hub related changes
* dockerfile fixes
* api client cache fixes with actual response class
* fix: tags and llm_profile_id
* optimized clear cache
* cleanup
* enhanced logs
* added more handling for is-file-dir checks and added loggers
* cleanup the runplatform script
* exempt internal APIs from CSRF
* SonarCloud issues
* SonarCloud issues
* resolving SonarCloud issues
* resolving SonarCloud issues
* Delta: added batch size fix in workers
* comments addressed
* celery configurational changes for new workers
* fixes in callback regarding the pipeline type check
* change internal URL registry logic
* gitignore changes
* gitignore changes
* addressing PR comments and code cleanup
* adding missed profiles for v2
* SonarCloud blocker issues resolved
* implement OTel
* Commit uv.lock changes
* handle execution time and some cleanup
* adding user_data in metadata. PR: https://github.com/Zipstack/unstract/pull/1544
* scheduler backward compatibility
* replace user_data with custom_data
* Commit uv.lock changes
* celery worker command issue resolved
* enhance package imports in connectors by changing to lazy imports
* Update runner.py by removing the otel from it
Update runner.py by removing the otel from it
Signed-off-by: ali <117142933+muhammad-ali-e@users.noreply.github.com>
* added delta changes
* handle errors to destination DB
* resolve tool instance ID validation and HITL queue name in API
* handled direct execution from workflow page to worker and logs
* handle cost logs
* Update health.py
Signed-off-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* minor log changes
* introducing log consumer scheduler for bulk creates, and socket.emit from worker for WS
* Commit uv.lock changes
* time limit or timeout celery config cleanup
* implemented redis client class in worker
* pipeline status enum mismatch
* notification worker fixes
* resolve uv lock conflicts
* workflow log fixes
* resolved WS channel name issue; handle Redis being down in status tracker and remove Redis keys
* default TTL changed for unified logs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: ali <117142933+muhammad-ali-e@users.noreply.github.com>
Signed-off-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>