# LightRAG Offline Deployment Guide

This guide provides comprehensive instructions for deploying LightRAG in offline environments where internet access is limited or unavailable. If you deploy LightRAG with Docker, you do not need this document: the LightRAG Docker image is pre-configured for offline operation.

> Software packages that require `transformers`, `torch`, or `cuda` are not included in the offline dependency group. Consequently, document extraction tools such as Docling, as well as local LLM frameworks such as Hugging Face Transformers and LMDeploy, are outside the scope of offline installation support. Such compute-intensive services should not be embedded directly in LightRAG; Docling will be decoupled and deployed as a standalone service.

## Table of Contents

- [Overview](#overview)
- [Quick Start](#quick-start)
- [Layered Dependencies](#layered-dependencies)
- [Tiktoken Cache Management](#tiktoken-cache-management)
- [Complete Offline Deployment Workflow](#complete-offline-deployment-workflow)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
- [Additional Resources](#additional-resources)
- [Support](#support)

## Overview

LightRAG uses dynamic package installation (via `pipmaster`) to enable optional features at runtime, based on file types and configuration. In offline environments these dynamic installations fail, so this guide shows you how to pre-install all necessary dependencies and cache files ahead of time.

### What Gets Dynamically Installed?

LightRAG dynamically installs packages for:

- **Storage Backends**: `redis`, `neo4j`, `pymilvus`, `pymongo`, `asyncpg`, `qdrant-client`
- **LLM Providers**: `openai`, `anthropic`, `ollama`, `zhipuai`, `aioboto3`, `voyageai`, `llama-index`, `lmdeploy`, `transformers`, `torch`
- **Tiktoken Models**: BPE encoding models downloaded from the OpenAI CDN

**Note**: Document processing dependencies (`pypdf`, `python-docx`, `python-pptx`, `openpyxl`) are now pre-installed with the `api` extras group and no longer require dynamic installation.
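These runtime installs follow a simple check-then-install pattern. The sketch below illustrates it, assuming `pipmaster`'s `is_installed`/`install` helpers; `ensure_package` is a hypothetical wrapper for this guide, and the package name is only an example:

```python
# Illustrative sketch of the dynamic-install pattern (not LightRAG's exact code).
# ensure_package() is a hypothetical helper introduced for this guide.
import pipmaster as pm

def ensure_package(package: str) -> None:
    """Install `package` at runtime if it is missing."""
    if not pm.is_installed(package):
        # pm.install() shells out to pip, which needs network access --
        # this is the step that fails in an offline environment.
        pm.install(package)

# Example: selecting a Redis storage backend would trigger this on first use.
ensure_package("redis")
```

Pre-installing the dependency groups described below makes the `is_installed` check succeed, so the pip call is never reached.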
## Quick Start

### Option 1: Using pip with Offline Extras

```bash
# Online environment: Install all offline dependencies
pip install lightrag-hku[offline]

# Download tiktoken cache
lightrag-download-cache

# Create offline package
pip download lightrag-hku[offline] -d ./offline-packages
tar -czf lightrag-offline.tar.gz ./offline-packages ~/.tiktoken_cache

# Transfer to offline server
scp lightrag-offline.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf lightrag-offline.tar.gz
pip install --no-index --find-links=./offline-packages lightrag-hku[offline]
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache
```

### Option 2: Using Requirements Files

```bash
# Online environment: Download packages
pip download -r requirements-offline.txt -d ./packages

# Transfer to offline server
tar -czf packages.tar.gz ./packages
scp packages.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf packages.tar.gz
pip install --no-index --find-links=./packages -r requirements-offline.txt
```

## Layered Dependencies

LightRAG provides flexible dependency groups for different use cases:

### Available Dependency Groups

| Group | Description | Use Case |
| ----- | ----------- | -------- |
| `api` | API server + document processing | FastAPI server with PDF, DOCX, PPTX, XLSX support |
| `offline-storage` | Storage backends | Redis, Neo4j, MongoDB, PostgreSQL, etc. |
| `offline-llm` | LLM providers | OpenAI, Anthropic, Ollama, etc. |
| `offline` | Complete offline package | API + Storage + LLM (all features) |

**Note**: Document processing (PDF, DOCX, PPTX, XLSX) is included in the `api` extras group. The previous `offline-docs` group has been merged into `api` for better integration.

> Software packages that require `transformers`, `torch`, or `cuda` are not included in the offline dependency group.

### Installation Examples

```bash
# Install API with document processing
pip install lightrag-hku[api]

# Install API and storage backends
pip install lightrag-hku[api,offline-storage]

# Install all offline dependencies (recommended for offline deployment)
pip install lightrag-hku[offline]
```

### Using Individual Requirements Files

```bash
# Storage backends only
pip install -r requirements-offline-storage.txt

# LLM providers only
pip install -r requirements-offline-llm.txt

# All offline dependencies
pip install -r requirements-offline.txt
```

## Tiktoken Cache Management

Tiktoken downloads BPE encoding models on first use. In offline environments, you must pre-download these models.

### Using the CLI Command

After installing LightRAG, use the built-in command:

```bash
# Download to default location (see output for exact path)
lightrag-download-cache

# Download to specific directory
lightrag-download-cache --cache-dir ./tiktoken_cache

# Download specific models only
lightrag-download-cache --models gpt-4o-mini gpt-4
```

### Default Models Downloaded

- `gpt-4o-mini` (LightRAG default)
- `gpt-4o`
- `gpt-4`
- `gpt-3.5-turbo`
- `text-embedding-ada-002`
- `text-embedding-3-small`
- `text-embedding-3-large`

### Setting Cache Location in Offline Environment

```bash
# Option 1: Environment variable (temporary)
export TIKTOKEN_CACHE_DIR=/path/to/tiktoken_cache

# Option 2: Add to ~/.bashrc or ~/.zshrc (persistent)
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
source ~/.bashrc

# Option 3: Copy to default location
cp -r /path/to/tiktoken_cache ~/.tiktoken_cache/
```
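### Managing the Cache from Python

If you prefer to manage the cache from Python rather than the CLI, the snippet below sketches the underlying mechanism. Run it in an online environment to populate the cache directory, or in the offline environment to confirm the cached models resolve without network access (the model names are taken from the default list above):

```python
# Minimal sketch: populate (online) or verify (offline) the tiktoken cache.
# TIKTOKEN_CACHE_DIR must point at the cache before an encoding is loaded.
import os

os.environ["TIKTOKEN_CACHE_DIR"] = os.path.expanduser("~/.tiktoken_cache")

import tiktoken

for model in ("gpt-4o-mini", "text-embedding-3-large"):
    # Online: downloads the BPE file into TIKTOKEN_CACHE_DIR.
    # Offline: loads it from the cache, or raises if the file is missing.
    enc = tiktoken.encoding_for_model(model)
    print(f"✓ {model}: encoded {len(enc.encode('offline check'))} tokens")
```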
## Complete Offline Deployment Workflow

### Step 1: Prepare in Online Environment

```bash
# 1. Install LightRAG with offline dependencies
pip install lightrag-hku[offline]

# 2. Download tiktoken cache
lightrag-download-cache --cache-dir ./offline_cache/tiktoken

# 3. Download all Python packages
pip download lightrag-hku[offline] -d ./offline_cache/packages

# 4. Create archive for transfer
tar -czf lightrag-offline-complete.tar.gz ./offline_cache

# 5. Verify contents
tar -tzf lightrag-offline-complete.tar.gz | head -20
```

### Step 2: Transfer to Offline Environment

```bash
# Using scp
scp lightrag-offline-complete.tar.gz user@offline-server:/tmp/

# Or using USB/physical media
# Copy lightrag-offline-complete.tar.gz to USB drive
```

### Step 3: Install in Offline Environment

```bash
# 1. Extract archive
cd /tmp
tar -xzf lightrag-offline-complete.tar.gz

# 2. Install Python packages
pip install --no-index \
    --find-links=/tmp/offline_cache/packages \
    lightrag-hku[offline]

# 3. Set up tiktoken cache
mkdir -p ~/.tiktoken_cache
cp -r /tmp/offline_cache/tiktoken/* ~/.tiktoken_cache/
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache

# 4. Add to shell profile for persistence
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
```

### Step 4: Verify Installation

```bash
# Test Python import
python -c "from lightrag import LightRAG; print('✓ LightRAG imported')"

# Test tiktoken
python -c "from lightrag.utils import TiktokenTokenizer; t = TiktokenTokenizer(); print('✓ Tiktoken working')"

# Test optional dependencies (if installed)
python -c "import docling; print('✓ Docling available')"
python -c "import redis; print('✓ Redis available')"
```

## Troubleshooting

### Issue: Tiktoken fails with network error

**Problem**: `Unable to load tokenizer for model gpt-4o-mini`

**Solution**:

```bash
# Ensure TIKTOKEN_CACHE_DIR is set
echo $TIKTOKEN_CACHE_DIR

# Verify cache files exist
ls -la ~/.tiktoken_cache/

# If empty, you need to download the cache in an online environment first
```

### Issue: Dynamic package installation fails

**Problem**: `Error installing package xxx`

**Solution**:

```bash
# Pre-install the specific package you need
# For API with document processing:
pip install lightrag-hku[api]

# For storage backends:
pip install lightrag-hku[offline-storage]

# For LLM providers:
pip install lightrag-hku[offline-llm]
```

### Issue: Missing dependencies at runtime

**Problem**: `ModuleNotFoundError: No module named 'xxx'`

**Solution**:

```bash
# Check what you have installed
pip list | grep -i xxx

# Install the missing component
pip install lightrag-hku[offline]  # Install all offline deps
```

### Issue: Permission denied on tiktoken cache

**Problem**: `PermissionError: [Errno 13] Permission denied`

**Solution**:

```bash
# Ensure the cache directory has correct permissions
chmod 755 ~/.tiktoken_cache
chmod 644 ~/.tiktoken_cache/*

# Or use a user-writable directory
export TIKTOKEN_CACHE_DIR=~/my_tiktoken_cache
mkdir -p ~/my_tiktoken_cache
```

## Best Practices

1. **Test in Online Environment First**: Always test your complete setup in an online environment before going offline.
2. **Keep Cache Updated**: Periodically update your offline cache when new models are released.
3. **Document Your Setup**: Keep notes on which optional dependencies you actually need.
4. **Version Pinning**: Consider pinning specific versions in production:
   ```bash
   pip freeze > requirements-production.txt
   ```
5. **Minimal Installation**: Only install what you need:
   ```bash
   # If you only need API with document processing
   pip install lightrag-hku[api]
   # Then manually add a specific LLM provider:
   pip install openai
   ```

## Additional Resources

- [LightRAG GitHub Repository](https://github.com/HKUDS/LightRAG)
- [Docker Deployment Guide](./DockerDeployment.md)
- [API Documentation](../lightrag/api/README.md)

## Support

If you encounter issues not covered in this guide:

1. Check the [GitHub Issues](https://github.com/HKUDS/LightRAG/issues)
2. Review the [project documentation](../README.md)
3. Create a new issue with your offline deployment details