Fix linting

yangdx
2025-12-24 17:20:36 +08:00
parent 833ddf4931
commit 14d9865670


@@ -7,7 +7,7 @@
### Core Metrics
| Metric | What It Measures | Good Score |
-|--------|-----------------|-----------|
+| ------ | ---------------- | ---------- |
| **Faithfulness** | Is the answer factually accurate based on retrieved context? | > 0.80 |
| **Answer Relevance** | Is the answer relevant to the user's question? | > 0.80 |
| **Context Recall** | Was all relevant information retrieved from documents? | > 0.80 |
@@ -103,7 +103,7 @@ results/
The evaluation script supports command-line arguments for easy configuration:
| Argument | Short | Default | Description |
-|----------|-------|---------|-------------|
+| -------- | ----- | ------- | ----------- |
| `--dataset` | `-d` | `sample_dataset.json` | Path to test dataset JSON file |
| `--ragendpoint` | `-r` | `http://localhost:9621` or `$LIGHTRAG_API_URL` | LightRAG API endpoint URL |
@@ -153,7 +153,7 @@ The evaluation framework supports customization through environment variables:
- Non-compatible endpoints will cause evaluation failures
| Variable | Default | Description |
-|----------|---------|-------------|
+| -------- | ------- | ----------- |
| **LLM Configuration** | | |
| `EVAL_LLM_MODEL` | `gpt-4o-mini` | LLM model used for RAGAS evaluation |
| `EVAL_LLM_BINDING_API_KEY` | falls back to `OPENAI_API_KEY` | API key for LLM evaluation |
@@ -262,7 +262,7 @@ EVAL_LLM_TIMEOUT=180 # 3-minute timeout per request
**Common Issues and Solutions:**
| Issue | Solution |
-|-------|----------|
+| ----- | -------- |
| **Warning: "LM returned 1 generations instead of 3"** | Reduce `EVAL_MAX_CONCURRENT` to 1 or decrease `EVAL_QUERY_TOP_K` |
| **Context Precision returns NaN** | Lower `EVAL_QUERY_TOP_K` to reduce LLM calls per test case |
| **Rate limit errors (429)** | Increase `EVAL_LLM_MAX_RETRIES` and decrease `EVAL_MAX_CONCURRENT` |
@@ -302,7 +302,7 @@ EVAL_LLM_TIMEOUT=180 # 3-minute timeout per request
### What Low Scores Mean
| Metric | Low Score Indicates |
-|--------|-------------------|
+| ------ | ------------------- |
| **Faithfulness** | Responses contain hallucinations or incorrect information |
| **Answer Relevance** | Answers don't match what users asked |
| **Context Recall** | Missing important information in retrieval |