# LightRAG Offline Deployment Guide
This guide provides comprehensive instructions for deploying LightRAG in offline environments where internet access is limited or unavailable.
If you deploy LightRAG using Docker, there is no need to refer to this document, as the LightRAG Docker image is pre-configured for offline operation.
Software packages requiring `transformers`, `torch`, or `cuda` are not included in the offline dependency group. Consequently, document extraction tools such as Docling, as well as local LLM frameworks like Hugging Face Transformers and LMDeploy, are outside the scope of offline installation support. These compute-intensive services should not be integrated into LightRAG directly; Docling will be decoupled and deployed as a standalone service.
## Table of Contents

- [Overview](#overview)
- [Quick Start](#quick-start)
- [Layered Dependencies](#layered-dependencies)
- [Tiktoken Cache Management](#tiktoken-cache-management)
- [Complete Offline Deployment Workflow](#complete-offline-deployment-workflow)
- [Troubleshooting](#troubleshooting)
## Overview
LightRAG uses dynamic package installation (`pipmaster`) for optional features based on file types and configuration. In offline environments, these dynamic installations will fail. This guide shows you how to pre-install all necessary dependencies and cache files.
### What Gets Dynamically Installed?
LightRAG dynamically installs packages for:
- **Storage Backends**: `redis`, `neo4j`, `pymilvus`, `pymongo`, `asyncpg`, `qdrant-client`
- **LLM Providers**: `openai`, `anthropic`, `ollama`, `zhipuai`, `aioboto3`, `voyageai`, `llama-index`, `lmdeploy`, `transformers`, `torch`
- **Tiktoken Models**: BPE encoding models downloaded from the OpenAI CDN
**Note**: Document processing dependencies (`pypdf`, `python-docx`, `python-pptx`, `openpyxl`) are now pre-installed with the `api` extras group and no longer require dynamic installation.
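If your environment is already partially provisioned, a quick audit shows which of these distributions are present before you go offline. A minimal sketch using the package names listed above:

```bash
# List installed packages and filter for the optional backends/providers
pip list --format=freeze 2>/dev/null | grep -Ei \
  '^(redis|neo4j|pymilvus|pymongo|asyncpg|qdrant-client|openai|anthropic|ollama)='
```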
## Quick Start

### Option 1: Using pip with Offline Extras
```bash
# Online environment: Install all offline dependencies
pip install lightrag-hku[offline]

# Download tiktoken cache
lightrag-download-cache

# Create offline package
pip download lightrag-hku[offline] -d ./offline-packages
tar -czf lightrag-offline.tar.gz ./offline-packages ~/.tiktoken_cache

# Transfer to offline server
scp lightrag-offline.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf lightrag-offline.tar.gz
pip install --no-index --find-links=./offline-packages lightrag-hku[offline]
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache
```
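After the offline install, two quick sanity checks catch most packaging problems (`pip check` validates declared dependency constraints without touching the network):

```bash
pip check                                   # declared dependencies all satisfied?
python -c "import lightrag; print('✓ ok')"  # the package itself imports
```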
### Option 2: Using Requirements Files
```bash
# Online environment: Download packages
pip download -r requirements-offline.txt -d ./packages

# Transfer to offline server
tar -czf packages.tar.gz ./packages
scp packages.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf packages.tar.gz
pip install --no-index --find-links=./packages -r requirements-offline.txt
```
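Before transferring the packages directory, you can verify it is self-sufficient by resolving the install against it alone. A sketch, assuming a pip recent enough to support `--dry-run` (22.2+):

```bash
# Resolves the full install from ./packages only; fails fast if a wheel is missing
pip install --dry-run --no-index --find-links=./packages \
    -r requirements-offline.txt
```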
## Layered Dependencies
LightRAG provides flexible dependency groups for different use cases:
### Available Dependency Groups
| Group | Description | Use Case |
|---|---|---|
| `api` | API server + document processing | FastAPI server with PDF, DOCX, PPTX, XLSX support |
| `offline-storage` | Storage backends | Redis, Neo4j, MongoDB, PostgreSQL, etc. |
| `offline-llm` | LLM providers | OpenAI, Anthropic, Ollama, etc. |
| `offline` | Complete offline package | API + Storage + LLM (all features) |
**Note**: Document processing (PDF, DOCX, PPTX, XLSX) is included in the `api` extras group. The previous `offline-docs` group has been merged into `api` for better integration.
Software packages requiring `transformers`, `torch`, or `cuda` will not be included in the offline dependency group.
### Installation Examples
```bash
# Install API with document processing
pip install lightrag-hku[api]

# Install API and storage backends
pip install lightrag-hku[api,offline-storage]

# Install all offline dependencies (recommended for offline deployment)
pip install lightrag-hku[offline]
```
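One shell-specific pitfall: zsh treats the square brackets in extras as a glob pattern, so quote the argument there (plain bash usually passes it through unchanged):

```bash
# zsh: quote the requirement so [...] is not parsed as a character class
pip install 'lightrag-hku[api,offline-storage]'
```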
### Using Individual Requirements Files
```bash
# Storage backends only
pip install -r requirements-offline-storage.txt

# LLM providers only
pip install -r requirements-offline-llm.txt

# All offline dependencies
pip install -r requirements-offline.txt
```
## Tiktoken Cache Management
Tiktoken downloads BPE encoding models on first use. In offline environments, you must pre-download these models.
### Using the CLI Command
After installing LightRAG, use the built-in command:
```bash
# Download to default location (see output for exact path)
lightrag-download-cache

# Download to specific directory
lightrag-download-cache --cache-dir ./tiktoken_cache

# Download specific models only
lightrag-download-cache --models gpt-4o-mini gpt-4
```
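The cache is simply tiktoken's standard download cache, so if you prefer, you can pre-warm it with tiktoken directly. A minimal manual sketch, assuming the `tiktoken` package is installed in the online environment:

```bash
export TIKTOKEN_CACHE_DIR=./tiktoken_cache
mkdir -p "$TIKTOKEN_CACHE_DIR"
# First use of an encoding downloads its BPE file into the cache dir
python -c "import tiktoken; tiktoken.encoding_for_model('gpt-4o-mini')"
ls "$TIKTOKEN_CACHE_DIR"   # cached BPE files are stored under hashed names
```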
### Default Models Downloaded
- `gpt-4o-mini` (LightRAG default)
- `gpt-4o`
- `gpt-4`
- `gpt-3.5-turbo`
- `text-embedding-ada-002`
- `text-embedding-3-small`
- `text-embedding-3-large`
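These models share a small number of BPE files (the gpt-4o family uses one encoding; the gpt-4, gpt-3.5, and embedding models another), so the cache stays compact. You can inspect the mapping with tiktoken itself; run this online, since resolving an encoding may trigger a download:

```bash
python -c "
import tiktoken
for m in ['gpt-4o-mini', 'gpt-4o', 'gpt-4', 'gpt-3.5-turbo',
          'text-embedding-3-small']:
    # encoding_for_model maps a model name to its BPE encoding
    print(m, '->', tiktoken.encoding_for_model(m).name)
"
```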
### Setting Cache Location in Offline Environment
```bash
# Option 1: Environment variable (temporary)
export TIKTOKEN_CACHE_DIR=/path/to/tiktoken_cache

# Option 2: Add to ~/.bashrc or ~/.zshrc (persistent)
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
source ~/.bashrc

# Option 3: Copy to default location
cp -r /path/to/tiktoken_cache ~/.tiktoken_cache/
```
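Whichever option you choose, you can confirm the cache actually satisfies tiktoken without network access:

```bash
TIKTOKEN_CACHE_DIR=~/.tiktoken_cache python -c "
import tiktoken
enc = tiktoken.encoding_for_model('gpt-4o-mini')
print('✓ tokenizer loaded,', len(enc.encode('hello world')), 'tokens')
"
```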
## Complete Offline Deployment Workflow

### Step 1: Prepare in Online Environment
```bash
# 1. Install LightRAG with offline dependencies
pip install lightrag-hku[offline]

# 2. Download tiktoken cache
lightrag-download-cache --cache-dir ./offline_cache/tiktoken

# 3. Download all Python packages
pip download lightrag-hku[offline] -d ./offline_cache/packages

# 4. Create archive for transfer
tar -czf lightrag-offline-complete.tar.gz ./offline_cache

# 5. Verify contents
tar -tzf lightrag-offline-complete.tar.gz | head -20
```
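One caveat at this step: `pip download` fetches wheels that match the machine it runs on. If the offline server differs in OS, CPU architecture, or Python version, pass explicit target tags; a sketch using standard pip options, with example values you should replace with your target's:

```bash
# Fetch wheels for the target platform rather than the current one
pip download lightrag-hku[offline] -d ./offline_cache/packages \
    --only-binary=:all: \
    --python-version 3.11 \
    --platform manylinux2014_x86_64
```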
### Step 2: Transfer to Offline Environment
```bash
# Using scp
scp lightrag-offline-complete.tar.gz user@offline-server:/tmp/

# Or using USB/physical media
# Copy lightrag-offline-complete.tar.gz to USB drive
```
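Especially with physical media, it is worth confirming the archive survived the transfer. A standard checksum pair works:

```bash
# Online side, before transfer:
sha256sum lightrag-offline-complete.tar.gz > lightrag-offline-complete.sha256

# Offline side, after transfer (ship the .sha256 file alongside the archive):
sha256sum -c lightrag-offline-complete.sha256
```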
### Step 3: Install in Offline Environment
```bash
# 1. Extract archive
cd /tmp
tar -xzf lightrag-offline-complete.tar.gz

# 2. Install Python packages
pip install --no-index \
    --find-links=/tmp/offline_cache/packages \
    lightrag-hku[offline]

# 3. Set up tiktoken cache
mkdir -p ~/.tiktoken_cache
cp -r /tmp/offline_cache/tiktoken/* ~/.tiktoken_cache/
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache

# 4. Add to shell profile for persistence
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
```
### Step 4: Verify Installation
```bash
# Test Python import
python -c "from lightrag import LightRAG; print('✓ LightRAG imported')"

# Test tiktoken
python -c "from lightrag.utils import TiktokenTokenizer; t = TiktokenTokenizer(); print('✓ Tiktoken working')"

# Test optional dependencies (if installed)
python -c "import docling; print('✓ Docling available')"
python -c "import redis; print('✓ Redis available')"
```
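The per-package checks extend naturally to a loop. The module names below are the ones this guide mentions (note that `qdrant-client` imports as `qdrant_client`); trim the list to what you actually installed:

```bash
for mod in lightrag tiktoken redis neo4j pymilvus pymongo asyncpg \
           qdrant_client openai anthropic ollama; do
  python -c "import $mod" >/dev/null 2>&1 && echo "✓ $mod" || echo "✗ $mod missing"
done
```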
## Troubleshooting
### Issue: Tiktoken fails with network error

**Problem**: `Unable to load tokenizer for model gpt-4o-mini`

**Solution**:

```bash
# Ensure TIKTOKEN_CACHE_DIR is set
echo $TIKTOKEN_CACHE_DIR

# Verify cache files exist
ls -la ~/.tiktoken_cache/

# If empty, you need to download the cache in an online environment first
```
### Issue: Dynamic package installation fails

**Problem**: `Error installing package xxx`

**Solution**:

```bash
# Pre-install the specific package you need

# For API with document processing:
pip install lightrag-hku[api]

# For storage backends:
pip install lightrag-hku[offline-storage]

# For LLM providers:
pip install lightrag-hku[offline-llm]
```
### Issue: Missing dependencies at runtime

**Problem**: `ModuleNotFoundError: No module named 'xxx'`

**Solution**:

```bash
# Check what you have installed
pip list | grep -i xxx

# Install missing component
pip install lightrag-hku[offline]  # Install all offline deps
```
### Issue: Permission denied on tiktoken cache

**Problem**: `PermissionError: [Errno 13] Permission denied`

**Solution**:

```bash
# Ensure cache directory has correct permissions
chmod 755 ~/.tiktoken_cache
chmod 644 ~/.tiktoken_cache/*

# Or use a user-writable directory
export TIKTOKEN_CACHE_DIR=~/my_tiktoken_cache
mkdir -p ~/my_tiktoken_cache
```
## Best Practices

1. **Test in Online Environment First**: Always test your complete setup in an online environment before going offline.

2. **Keep Cache Updated**: Periodically update your offline cache when new models are released.

3. **Document Your Setup**: Keep notes on which optional dependencies you actually need.

4. **Version Pinning**: Consider pinning specific versions in production:

   ```bash
   pip freeze > requirements-production.txt
   ```

5. **Minimal Installation**: Only install what you need:

   ```bash
   # If you only need API with document processing
   pip install lightrag-hku[api]
   # Then manually add a specific LLM client:
   pip install openai
   ```
## Additional Resources

### Support
If you encounter issues not covered in this guide:
- Check the GitHub Issues
- Review the project documentation
- Create a new issue with your offline deployment details