Files

Chandrasekharan M 0c0c8c1034 UN-2793 [FEAT] Add retry logic with exponential backoff to SDK1 (#1564)

* UN-2793 [FEAT] Add retry logic with exponential backoff to SDK1

Implemented automatic retry logic for platform and prompt service calls
with configurable exponential backoff, comprehensive test coverage, and
CI integration.

Features:
- Exponential backoff with jitter for transient failures
- Configurable via environment variables (MAX_RETRIES, MAX_TIME, BASE_DELAY, etc.)
- Retries ConnectionError, Timeout, HTTPError (502/503/504), OSError
- 67 tests with 100% pass rate
- CI integration with test reporting
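
The backoff schedule named in the feature list can be sketched as below. This is a minimal illustration under stated assumptions, not the SDK's actual implementation: the function name `calculate_delay`, its parameters, the default values, and the 25% jitter range are all hypothetical.

```python
import random


def calculate_delay(
    attempt: int,
    base_delay: float = 1.0,
    multiplier: float = 2.0,
    max_delay: float = 60.0,
    jitter: bool = True,
) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based).

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Exponential growth: base_delay, base_delay * 2, base_delay * 4, ...
    delay = base_delay * (multiplier ** attempt)
    # Cap the delay so late attempts do not wait unboundedly long.
    delay = min(delay, max_delay)
    if jitter:
        # Add up to 25% random extra wait so many clients do not retry in lockstep.
        delay += delay * random.uniform(0.0, 0.25)
    return delay


# Without jitter the schedule is deterministic.
schedule = [calculate_delay(n, jitter=False) for n in range(4)]
```

With jitter disabled, the first four delays double from the base delay, and `max_delay` bounds any later attempt.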

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* [SECURITY] Use full commit SHA for sticky-pull-request-comment action

Replace tag reference with full commit SHA for better security:
- marocchino/sticky-pull-request-comment@v2 → @7737449 (v2.9.4)

This prevents potential supply chain attacks where tags could be moved
to point to malicious code. Commit SHAs are immutable.

Fixes SonarQube security hotspot for external GitHub action.

* [FIX] Allow retryable HTTP errors (502/503/504) to propagate for retry

Fixed HTTPError handling in _get_adapter_configuration to check status
codes and re-raise retryable errors (502, 503, 504) so the retry
decorator can handle them. Non-retryable errors are still converted
to SdkError as before.

Changes:
- Check HTTPError status code before converting to SdkError
- Re-raise HTTPError for 502/503/504 to allow retry decorator to retry
- Added parametrized test for all retryable status codes (502, 503, 504)
- All 12 platform tests pass

This fixes a bug where 502/503/504 errors were not being retried
because they were converted to SdkError before the retry decorator
could see them.
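
The fix described above might look roughly like the following sketch. The `HTTPError` and `SdkError` classes here are stand-ins defined only so the example is self-contained, and `fetch_configuration` is a hypothetical name, not the SDK's `_get_adapter_configuration`.

```python
RETRYABLE_STATUS_CODES = {502, 503, 504}


class HTTPError(Exception):
    """Stand-in for an HTTP client error that carries a status code."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


class SdkError(Exception):
    """Stand-in for the SDK's own error type."""


def fetch_configuration(request):
    """Call `request()` and translate only non-retryable HTTP errors."""
    try:
        return request()
    except HTTPError as e:
        if e.status_code in RETRYABLE_STATUS_CODES:
            # Re-raise unchanged so an outer retry decorator can see it.
            raise
        # Anything else is treated as permanent: convert to SdkError.
        raise SdkError(f"Request failed: {e}") from e
```

The key design point is the order of checks: the status code is inspected before any conversion, so transient gateway errors keep their original type.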

* [FIX] Use pytest.approx() for floating point comparisons in tests

Replaced direct equality comparisons (==) with pytest.approx() for
floating point values to avoid precision issues and satisfy SonarQube
code quality check (python:S1244).

Changes in test_retry_utils.py:
- test_exponential_backoff_without_jitter: Use pytest.approx() for 1.0, 2.0, 4.0, 8.0
- test_max_delay_cap: Use pytest.approx() for 5.0

This is the proper way to compare floating point values in tests,
accounting for floating point precision limitations.

All 4 TestCalculateDelay tests pass.

Fixes SonarQube: python:S1244 - Do not perform equality checks with
floating point values.

* minor: Addressed code smells, ruff fixes

* misc: Fixed tox config for sdk1 tests

* misc: Ruff issues fixed

* misc: tox tests fixed

* prompt service lock file for venv

* updated lock files for backend and prompt-service

* UN-2793 [FEAT] Update to unstract-sdk v0.78.0 with retry logic support (#1567)

[FEAT] Update unstract-sdk to v0.78.0 across all services and tools

- Updated unstract-sdk dependency from v0.77.3 to v0.78.0 in all pyproject.toml files
  - Main repository, backend, workers, platform-service, prompt-service
  - filesystem and tool-registry modules
- Updated tool requirements.txt files (structure, classifier, text_extractor)
- Bumped tool versions in properties.json:
  - Structure tool: 0.0.88 → 0.0.89
  - Classifier tool: 0.0.68 → 0.0.69
  - Text extractor tool: 0.0.64 → 0.0.65
- Updated tool versions in backend/sample.env and public_tools.json
- Regenerated all uv.lock files with new SDK version

This update brings in the retry logic with exponential backoff from unstract-sdk v0.78.0

---------

Signed-off-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>

2025-10-09 10:48:19 +05:30

| Name | Last commit | Date |
|---|---|---|
| src | UN-2793 [FEAT] Add retry logic with exponential backoff to SDK1 (#1564) | 2025-10-09 10:48:19 +05:30 |
| tests | Initial commit on Unstract | 2024-02-25 16:19:36 +05:30 |
| __init__.py | Initial commit on Unstract | 2024-02-25 16:19:36 +05:30 |
| .dockerignore | Initial commit on Unstract | 2024-02-25 16:19:36 +05:30 |
| .gitignore | Initial commit on Unstract | 2024-02-25 16:19:36 +05:30 |
| Dockerfile | UN-2812 [MISC] Update Dockerfiles with latest configurations and fix hadolint warnings (#1551) | 2025-09-29 16:38:12 +05:30 |
| README.md | [FEATURE] Remote storage flag removal for remote storage (#1101) | 2025-02-19 11:47:28 +05:30 |
| requirements.txt | UN-2793 [FEAT] Add retry logic with exponential backoff to SDK1 (#1564) | 2025-10-09 10:48:19 +05:30 |
| sample.env | [FEATURE] Remote storage flag removal for remote storage (#1101) | 2025-02-19 11:47:28 +05:30 |

README.md

Text Extractor Tool

The Text Extractor Tool extracts text from documents, converting them into plain-text versions. For example, it can convert a PDF or an image to text.

Required Environment Variables

| Variable | Description |
|---|---|
| `PLATFORM_SERVICE_HOST` | The host where the platform service is running |
| `PLATFORM_SERVICE_PORT` | The port where the platform service is listening |
| `PLATFORM_SERVICE_API_KEY` | The API key for the platform |
| `EXECUTION_DATA_DIR` | The directory on the filesystem that contains the files for tool execution |
| `X2TEXT_HOST` | The host where the x2text service is running |
| `X2TEXT_PORT` | The port where the x2text service is listening |
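
Before running the tool, it can help to check that all of these variables are set. The snippet below is a small sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of the tool or the SDK.

```python
import os

# Variable names taken from the table of required environment variables.
REQUIRED_ENV_VARS = (
    "PLATFORM_SERVICE_HOST",
    "PLATFORM_SERVICE_PORT",
    "PLATFORM_SERVICE_API_KEY",
    "EXECUTION_DATA_DIR",
    "X2TEXT_HOST",
    "X2TEXT_PORT",
)


def missing_env_vars(environ=None) -> list[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a dictionary instead of relying on `os.environ` makes the check easy to test without touching the real environment.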

Setting Up a Dev Environment

Set up a virtual environment and activate it:

```
python -m venv .venv
source .venv/bin/activate
```

Install the dependencies for the tool. There are two options:

- Install the PyPI version:
```
pip install -r requirements.txt
```
- To use the local development version of the unstract-sdk, install it from your local repository. Replace the path below with the path to your local repository:
```
pip install -e ~/path_to_repo/sdks/.
```

Tool execution preparation

Load the environment variables for the tool: make a copy of the sample.env file, name it .env, and fill in the required values. The SDK loads them using python-dotenv.
Update the tool's data_dir, the directory pointed to by the EXECUTION_DATA_DIR environment variable. This has to be done before each tool execution because the tool updates the INFILE and METADATA.json.

Run SPEC command

Returns the JSON schema for the tool's runtime-configurable settings.

```
python main.py --command SPEC
```

Run PROPERTIES command

Describes metadata for the tool, such as its version, description, inputs, and outputs.

```
python main.py --command PROPERTIES
```

Run ICON command

Returns the SVG icon for the tool, used by Unstract's frontend.

```
python main.py --command ICON
```

Run VARIABLES command

Returns the runtime variables (environment variables) that the tool uses.

```
python main.py --command VARIABLES
```

Run RUN command

The schema of the JSON required for settings can be found by running the SPEC command. Alternatively, if you have access to the code base, it is located in the config folder as spec.json.

```
python main.py \
    --command RUN \
    --settings '{
        "extractorId": "<extractor_id of adapter>"
        }' \
    --workflow-id '00000000-0000-0000-0000-000000000000' \
    --log-level DEBUG
```
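
When the RUN command is launched from a script, building the argument list in Python avoids shell-quoting mistakes in the JSON settings. The following is a sketch only: `build_run_command` is a hypothetical helper, and it assumes the script runs from the folder containing `main.py`.

```python
import json
import sys


def build_run_command(
    extractor_id: str,
    workflow_id: str,
    log_level: str = "DEBUG",
) -> list[str]:
    """Build the argument list for the tool's RUN command (hypothetical helper)."""
    # json.dumps guarantees the settings string is valid JSON.
    settings = json.dumps({"extractorId": extractor_id})
    return [
        sys.executable,
        "main.py",
        "--command",
        "RUN",
        "--settings",
        settings,
        "--workflow-id",
        workflow_id,
        "--log-level",
        log_level,
    ]


cmd = build_run_command(
    "<extractor_id of adapter>",
    "00000000-0000-0000-0000-000000000000",
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; because no shell is involved, the settings JSON needs no extra quoting.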

Testing the tool from its docker image

Build the tool docker image from the folder containing the Dockerfile with:

```
docker build -t unstract/tool-example:0.0.1 .
```

Make sure the directory pointed to by EXECUTION_DATA_DIR has the required information for the tool to run and that necessary services such as the platform-service are up. To test the tool from its docker image, run the following command:

```
docker run -it \
    --network unstract-network \
    --env-file .env \
    -v "$(pwd)"/data_dir:/app/data_dir \
    unstract/tool-example:0.0.1 \
    --command RUN \
    --settings '{
        "extractorId": "<extractor_id of adapter>"
        }' \
    --workflow-id '00000000-0000-0000-0000-000000000000' \
    --log-level DEBUG
```