2025-09-18 07:54:36 | INFO | ingest_pipeline.cli.tui.utils.runners | Initializing collection management TUI
2025-09-18 07:54:36 | INFO | ingest_pipeline.cli.tui.utils.runners | Scanning available storage backends
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/.well-known/openid-configuration "HTTP/1.1 404 Not Found"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/meta "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | ingest_pipeline.cli.tui.utils.runners | weaviate connected successfully
2025-09-18 07:54:36 | INFO | ingest_pipeline.cli.tui.utils.runners | open_webui connected successfully
2025-09-18 07:54:36 | INFO | ingest_pipeline.cli.tui.utils.runners | r2r connected successfully
2025-09-18 07:54:36 | INFO | ingest_pipeline.cli.tui.utils.runners | Launching TUI with 3 backend(s): weaviate, open_webui, r2r
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 07:54:36 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/d06bd108-ae7f-44f4-92fb-2ac556784920 "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/139c04d5-7d38-4595-8e12-79a67fd731e7 "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/96832710-8146-4e3b-88f3-4b3929f67dbf "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/dade78d9-9893-4966-bd4b-31f1c1635cfa "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/721c1517-b2cd-482d-bd1c-f99571f0f31f "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/f867530b-5eea-43bf-8257-d3da497cb10b "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/cbd4ae82-6fdd-4a4e-a4d5-d0b97ae988fd "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:54:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:55:00 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 07:55:00 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=fa923688-217a-41ce-a381-ae4bb8e4d40c "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:00 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 07:55:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 07:55:01 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/ffa24f3c-fb6a-4fb5-b929-225eac154755
2025-09-18 07:55:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ffa24f3c-fb6a-4fb5-b929-225eac154755/set_state "HTTP/1.1 201 Created"
2025-09-18 07:55:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/ffa24f3c-fb6a-4fb5-b929-225eac154755 "HTTP/1.1 200 OK"
2025-09-18 07:55:01 | INFO | prefect.flow_runs | Beginning flow run 'colossal-swan' for flow 'ingestion_pipeline'
2025-09-18 07:55:01 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/ffa24f3c-fb6a-4fb5-b929-225eac154755
2025-09-18 07:55:01 | INFO | prefect.flow_runs | Starting ingestion from https://r2r-docs.sciphi.ai/introduction
2025-09-18 07:55:01 | INFO | prefect.flow_runs | Validating source...
2025-09-18 07:55:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=067c7220-07bc-4299-acbb-ebb65e47b26f "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/78f3cfb6-1339-49c6-89f4-c38effea29e4 "HTTP/1.1 200 OK"
2025-09-18 07:55:03 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=6975811f-b28f-4a9a-8e7e-40a210313c82 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:03 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=5f81018b-d377-438b-bdaa-dd7f02a1b29f "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=2ece8111-a752-4a05-95e9-58a000f64d68 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:05 | INFO | prefect.flow_runs | Ingesting documents...
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/task_runs/ "HTTP/1.1 201 Created"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 07:55:05 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/092dd5b6-0f86-4e27-94ae-28c7638e7c40
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/092dd5b6-0f86-4e27-94ae-28c7638e7c40/set_state "HTTP/1.1 201 Created"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/092dd5b6-0f86-4e27-94ae-28c7638e7c40 "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | prefect.flow_runs | Beginning subflow run 'amiable-marmoset' for flow 'firecrawl_to_r2r'
2025-09-18 07:55:05 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/092dd5b6-0f86-4e27-94ae-28c7638e7c40
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/bac48c85-e6dc-4da0-99d5-6f26e027cabb "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=022cacb0-b4d4-4989-aade-55300219df5e "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:05 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=f2bf95d5-c836-4266-8148-336bb7c622fc "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:05 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/map "HTTP/1.1 200 OK"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:06 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:06 | INFO | prefect.flow_runs | Discovered 5 unique URLs from Firecrawl map
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=6fdc6a20-d4d7-4b74-8567-745ae21ed80e "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c5c726b4-805a-5e22-ad13-323750b25efa "HTTP/1.1 404 Not Found"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/a534965a-9da2-566e-a9ad-3e0da59bd3ae "HTTP/1.1 404 Not Found"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/8af54b00-fe82-55c5-a1a5-fd0544139b62 "HTTP/1.1 404 Not Found"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/9a2d0156-602f-5e4a-a8e1-22edd4c987e6 "HTTP/1.1 404 Not Found"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c01a1979-1dba-5731-bc71-39daff2e6ca2 "HTTP/1.1 404 Not Found"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:06 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:06 | INFO | prefect.flow_runs | Scraping 1 batches of Firecrawl pages
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=2f238b05-eae3-44e7-8fe9-8a43fad6a505 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:06 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 07:55:11 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 07:55:12 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 07:55:14 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 07:55:14 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 07:55:16 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 07:55:16 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:16 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:16 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=b09316a7-665d-4ef4-9d9d-8e4fcdb17aa2 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:16 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:16 | INFO | prefect.task_runs | Task run failed with exception: 1 validation error for Document
metadata.author
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type - Retry 1/1 will start 10 second(s) from now
2025-09-18 07:55:17 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 07:55:26 | ERROR | prefect.task_runs | Task run failed with exception: 1 validation error for Document
metadata.author
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type - Retries are exhausted
Traceback (most recent call last):
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1459, in run_context
    yield self
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1538, in run_task_async
    await engine.call_task_fn(txn)
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1476, in call_task_fn
    result = await call_with_parameters(self.task.fn, parameters)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/ingest_pipeline/flows/ingestion.py", line 149, in annotate_firecrawl_metadata_task
    documents = [ingestor.create_document(page, job) for page in pages]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/ingest_pipeline/ingestors/firecrawl.py", line 489, in create_document
    return Document(
           ^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/pydantic/main.py", line 253, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for Document
metadata.author
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:26 | ERROR | prefect.task_runs | Finished in state Failed('Task run encountered an exception ValidationError: 1 validation error for Document\nmetadata.author\n Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type')
2025-09-18 07:55:26 | ERROR | prefect.flow_runs | Encountered exception during execution: 1 validation error for Document
metadata.author
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type
Traceback (most recent call last):
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/flow_engine.py", line 1357, in run_context
    yield self
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/flow_engine.py", line 1419, in run_flow_async
    await engine.call_flow_fn()
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/flow_engine.py", line 1371, in call_flow_fn
    result = await call_with_parameters(self.flow.fn, self.parameters)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/ingest_pipeline/flows/ingestion.py", line 467, in firecrawl_to_r2r_flow
    documents = await annotate_firecrawl_metadata_task(scraped_pages, job)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1540, in run_task_async
    return engine.state if return_type == "state" else await engine.result()
                                                        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1087, in result
    raise self._raised
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1459, in run_context
    yield self
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1538, in run_task_async
    await engine.call_task_fn(txn)
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/prefect/task_engine.py", line 1476, in call_task_fn
    result = await call_with_parameters(self.task.fn, parameters)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/ingest_pipeline/flows/ingestion.py", line 149, in annotate_firecrawl_metadata_task
    documents = [ingestor.create_document(page, job) for page in pages]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/ingest_pipeline/ingestors/firecrawl.py", line 489, in create_document
    return Document(
           ^^^^^^^^^
  File "/home/vasceannie/projects/rag-manager/.venv/lib/python3.12/site-packages/pydantic/main.py", line 253, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for Document
metadata.author
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/092dd5b6-0f86-4e27-94ae-28c7638e7c40/set_state "HTTP/1.1 201 Created"
2025-09-18 07:55:26 | ERROR | prefect.flow_runs | Finished in state Failed('Flow run encountered an exception: ValidationError: 1 validation error for Document\nmetadata.author\n Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]\n For further information visit https://errors.pydantic.dev/2.11/v/string_type')
2025-09-18 07:55:26 | INFO | prefect.flow_runs | Ingestion failed: 1 validation error for Document
metadata.author
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=2ac58f3e-8c81-4c9e-9a68-b5c84902da18 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 07:55:26 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 07:55:26 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ffa24f3c-fb6a-4fb5-b929-225eac154755/set_state "HTTP/1.1 201 Created"
2025-09-18 07:55:26 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 07:55:27 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:00:45 | INFO | ingest_pipeline.cli.tui.utils.runners | Shutting down storage connections
2025-09-18 08:00:45 | INFO | ingest_pipeline.cli.tui.utils.runners | All storage connections closed gracefully
2025-09-18 08:00:49 | INFO | ingest_pipeline.cli.tui.utils.runners | Initializing collection management TUI
2025-09-18 08:00:49 | INFO | ingest_pipeline.cli.tui.utils.runners | Scanning available storage backends
2025-09-18 08:00:49 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/.well-known/openid-configuration "HTTP/1.1 404 Not Found"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/meta "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | ingest_pipeline.cli.tui.utils.runners | weaviate connected successfully
2025-09-18 08:00:50 | INFO | ingest_pipeline.cli.tui.utils.runners | open_webui connected successfully
2025-09-18 08:00:50 | INFO | ingest_pipeline.cli.tui.utils.runners | r2r connected successfully
2025-09-18 08:00:50 | INFO | ingest_pipeline.cli.tui.utils.runners | Launching TUI with 3 backend(s): weaviate, open_webui, r2r
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:00:50 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/d06bd108-ae7f-44f4-92fb-2ac556784920 "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/139c04d5-7d38-4595-8e12-79a67fd731e7 "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/96832710-8146-4e3b-88f3-4b3929f67dbf "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/dade78d9-9893-4966-bd4b-31f1c1635cfa "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/721c1517-b2cd-482d-bd1c-f99571f0f31f "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/f867530b-5eea-43bf-8257-d3da497cb10b "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/cbd4ae82-6fdd-4a4e-a4d5-d0b97ae988fd "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:00:51 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:01:14 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:01:14 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=4e57b817-47a8-4468-b030-48bcb4a52c2f "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:14 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:01:14 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:14 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/063d86dc-a190-4be9-a56a-de1d1257478f
2025-09-18 08:01:14 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/063d86dc-a190-4be9-a56a-de1d1257478f/set_state "HTTP/1.1 201 Created"
2025-09-18 08:01:14 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/063d86dc-a190-4be9-a56a-de1d1257478f "HTTP/1.1 200 OK"
2025-09-18 08:01:14 | INFO | prefect.flow_runs | Beginning flow run 'ingenious-meerkat' for flow 'ingestion_pipeline'
2025-09-18 08:01:14 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/063d86dc-a190-4be9-a56a-de1d1257478f
2025-09-18 08:01:14 | INFO | prefect.flow_runs | Starting ingestion from https://r2r-docs.sciphi.ai/introduction
2025-09-18 08:01:14 | INFO | prefect.flow_runs | Validating source...
2025-09-18 08:01:15 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=e5c33d4d-db9c-4a37-8a41-3d553a584909 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:15 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:15 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/78f3cfb6-1339-49c6-89f4-c38effea29e4 "HTTP/1.1 200 OK"
2025-09-18 08:01:16 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=d85fd4d3-4dd6-451f-ac17-8d3bdc4bf6ab "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:16 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=128eff67-fb5c-49f0-a29b-633de309b56a "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=61d9033b-b8e2-4740-bf02-ad5534afc98e "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:19 | INFO | prefect.flow_runs | Ingesting documents...
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/task_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:19 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/f79e3c88-5696-47b7-804f-52e63e119d4f
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/f79e3c88-5696-47b7-804f-52e63e119d4f/set_state "HTTP/1.1 201 Created"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/f79e3c88-5696-47b7-804f-52e63e119d4f "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | prefect.flow_runs | Beginning subflow run 'scarlet-kestrel' for flow 'firecrawl_to_r2r'
2025-09-18 08:01:19 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/f79e3c88-5696-47b7-804f-52e63e119d4f
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/bac48c85-e6dc-4da0-99d5-6f26e027cabb "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=a4940a61-03b2-42fd-8be5-c56fb41c5fbb "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:19 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=73eee395-facc-4155-8860-f0bb18505773 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/map "HTTP/1.1 200 OK"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:20 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:20 | INFO | prefect.flow_runs | Discovered 5 unique URLs from Firecrawl map
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=a15d4179-100a-4e62-9c29-96d5dc33f25b "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c5c726b4-805a-5e22-ad13-323750b25efa "HTTP/1.1 404 Not Found"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/a534965a-9da2-566e-a9ad-3e0da59bd3ae "HTTP/1.1 404 Not Found"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/8af54b00-fe82-55c5-a1a5-fd0544139b62 "HTTP/1.1 404 Not Found"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/9a2d0156-602f-5e4a-a8e1-22edd4c987e6 "HTTP/1.1 404 Not Found"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c01a1979-1dba-5731-bc71-39daff2e6ca2 "HTTP/1.1 404 Not Found"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:20 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:20 | INFO | prefect.flow_runs | Scraping 1 batches of Firecrawl pages
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=ab85073c-948a-4dfd-a69a-88908220a6d7 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:20 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:21 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:24 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:01:25 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:01:26 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:01:28 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:01:29 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:01:29 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:29 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:29 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=cd0f7f38-7c7d-4c70-8b71-96747cc6f47d "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:29 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:31 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:31 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:01:33 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:01:35 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:01:36 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:38 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=8a7fc241-e552-41df-a026-f108da848756 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:01:38 | INFO | prefect.task_runs | Using collection ID: 866022d4-9a5d-4ff2-9609-1412502d44a1 for collection: r2r
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document with ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:01:38 | INFO | prefect.task_runs | Built metadata for document c5c726b4-805a-5e22-ad13-323750b25efa: {'source_url': 'https://r2r-docs.sciphi.ai/introduction', 'content_type': 'text/markdown', 'word_count': 296, 'char_count': 3000, 'timestamp': '2025-09-18T08:01:29.271720+00:00', 'ingestion_source': 'web', 'title': 'Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced all-in-one AI Retrieval-Augmented Generation (RAG) solution with multimodal content ingestion, hybrid search, configurable GraphRAG, and a Deep Research API for complex queries.'}
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document c5c726b4-805a-5e22-ad13-323750b25efa with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:01:38 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:01:38 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction", "content_type": "text/markdown", "word_count": 296, "char_count": 3000, "timestamp": "2025-09-18T08:01:29.271720+00:00", "ingestion_source": "web", "title": "Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced all-in-one AI Retrieval-Augmented Generation (RAG) solution with multimodal content ingestion, hybrid search, configurable GraphRAG, and a Deep Research API for complex queries."}
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:01:38 | INFO | prefect.task_runs | R2R returned document ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:01:38 | INFO | prefect.task_runs | Document c5c726b4-805a-5e22-ad13-323750b25efa should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document with ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:01:38 | INFO | prefect.task_runs | Built metadata for document a534965a-9da2-566e-a9ad-3e0da59bd3ae: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/system', 'content_type': 'text/markdown', 'word_count': 146, 'char_count': 1275, 'timestamp': '2025-09-18T08:01:29.271879+00:00', 'ingestion_source': 'web', 'title': 'System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'Learn about the R2R system architecture'}
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document a534965a-9da2-566e-a9ad-3e0da59bd3ae with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:01:38 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:01:38 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/system", "content_type": "text/markdown", "word_count": 146, "char_count": 1275, "timestamp": "2025-09-18T08:01:29.271879+00:00", "ingestion_source": "web", "title": "System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "Learn about the R2R system architecture"}
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:01:38 | INFO | prefect.task_runs | R2R returned document ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:01:38 | INFO | prefect.task_runs | Document a534965a-9da2-566e-a9ad-3e0da59bd3ae should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document with ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:01:38 | INFO | prefect.task_runs | Built metadata for document 8af54b00-fe82-55c5-a1a5-fd0544139b62: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/whats-new', 'content_type': 'text/markdown', 'word_count': 42, 'char_count': 350, 'timestamp': '2025-09-18T08:01:29.271949+00:00', 'ingestion_source': 'web', 'title': "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", 'description': 'Release notes for version 0.3.5 of an advanced AI retrieval system featuring Agentic RAG with improved API, SSE streaming output, and enhanced citations.'}
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document 8af54b00-fe82-55c5-a1a5-fd0544139b62 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:01:38 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:01:38 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/whats-new", "content_type": "text/markdown", "word_count": 42, "char_count": 350, "timestamp": "2025-09-18T08:01:29.271949+00:00", "ingestion_source": "web", "title": "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "Release notes for version 0.3.5 of an advanced AI retrieval system featuring Agentic RAG with improved API, SSE streaming output, and enhanced citations."}
2025-09-18 08:01:38 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:01:38 | INFO | prefect.task_runs | R2R returned document ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:01:38 | INFO | prefect.task_runs | Document 8af54b00-fe82-55c5-a1a5-fd0544139b62 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document with ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:01:38 | INFO | prefect.task_runs | Built metadata for document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/what-is-r2r', 'content_type': 'text/markdown', 'word_count': 444, 'char_count': 3541, 'timestamp': '2025-09-18T08:01:29.272137+00:00', 'ingestion_source': 'web', 'title': 'What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications through Retrieval-Augmented Generation (RAG) with a RESTful API.'}
2025-09-18 08:01:38 | INFO | prefect.task_runs | Creating document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:01:38 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:01:38 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/what-is-r2r", "content_type": "text/markdown", "word_count": 444, "char_count": 3541, "timestamp": "2025-09-18T08:01:29.272137+00:00", "ingestion_source": "web", "title": "What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications through Retrieval-Augmented Generation (RAG) with a RESTful API."}
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:01:39 | INFO | prefect.task_runs | R2R returned document ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:01:39 | INFO | prefect.task_runs | Document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:01:39 | INFO | prefect.task_runs | Creating document with ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:01:39 | INFO | prefect.task_runs | Built metadata for document c01a1979-1dba-5731-bc71-39daff2e6ca2: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/rag', 'content_type': 'text/markdown', 'word_count': 632, 'char_count': 4228, 'timestamp': '2025-09-18T08:01:29.272373+00:00', 'ingestion_source': 'web', 'title': 'More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) using the R2R system, covering setup, configuration, and operational details.'}
2025-09-18 08:01:39 | INFO | prefect.task_runs | Creating document c01a1979-1dba-5731-bc71-39daff2e6ca2 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:01:39 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:01:39 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/rag", "content_type": "text/markdown", "word_count": 632, "char_count": 4228, "timestamp": "2025-09-18T08:01:29.272373+00:00", "ingestion_source": "web", "title": "More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) using the R2R system, covering setup, configuration, and operational details."}
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:01:39 | INFO | prefect.task_runs | R2R returned document ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:01:39 | INFO | prefect.task_runs | Document c01a1979-1dba-5731-bc71-39daff2e6ca2 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:39 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:39 | INFO | prefect.flow_runs | Upserted 5 documents into R2R (0 failed)
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/f79e3c88-5696-47b7-804f-52e63e119d4f/set_state "HTTP/1.1 201 Created"
2025-09-18 08:01:39 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=3aa36664-8d4e-4c80-9482-0dee2f2b003f "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:01:39 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:01:39 | INFO | prefect.flow_runs | Ingestion completed: 5 processed, 0 failed
2025-09-18 08:01:39 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/063d86dc-a190-4be9-a56a-de1d1257478f/set_state "HTTP/1.1 201 Created"
2025-09-18 08:01:39 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:01:41 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:07:27 | INFO | ingest_pipeline.cli.tui.utils.runners | Shutting down storage connections
2025-09-18 08:07:27 | INFO | ingest_pipeline.cli.tui.utils.runners | All storage connections closed gracefully
2025-09-18 08:07:31 | INFO | ingest_pipeline.cli.tui.utils.runners | Initializing collection management TUI
2025-09-18 08:07:31 | INFO | ingest_pipeline.cli.tui.utils.runners | Scanning available storage backends
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/.well-known/openid-configuration "HTTP/1.1 404 Not Found"
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/meta "HTTP/1.1 200 OK"
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:31 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:31 | INFO | ingest_pipeline.cli.tui.utils.runners | weaviate connected successfully
2025-09-18 08:07:31 | INFO | ingest_pipeline.cli.tui.utils.runners | open_webui connected successfully
2025-09-18 08:07:31 | INFO | ingest_pipeline.cli.tui.utils.runners | r2r connected successfully
2025-09-18 08:07:31 | INFO | ingest_pipeline.cli.tui.utils.runners | Launching TUI with 3 backend(s): weaviate, open_webui, r2r
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/d06bd108-ae7f-44f4-92fb-2ac556784920 "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/139c04d5-7d38-4595-8e12-79a67fd731e7 "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/96832710-8146-4e3b-88f3-4b3929f67dbf "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/dade78d9-9893-4966-bd4b-31f1c1635cfa "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/721c1517-b2cd-482d-bd1c-f99571f0f31f "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/f867530b-5eea-43bf-8257-d3da497cb10b "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/cbd4ae82-6fdd-4a4e-a4d5-d0b97ae988fd "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:07:32 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=92c2d296-3666-4e3d-894d-76aa2ba37134 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:01 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/34fad9cf-0b69-46da-88a8-755ea10237a1
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/34fad9cf-0b69-46da-88a8-755ea10237a1/set_state "HTTP/1.1 201 Created"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/34fad9cf-0b69-46da-88a8-755ea10237a1 "HTTP/1.1 200 OK"
2025-09-18 08:08:01 | INFO | prefect.flow_runs | Beginning flow run 'uppish-cow' for flow 'ingestion_pipeline'
2025-09-18 08:08:01 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/34fad9cf-0b69-46da-88a8-755ea10237a1
2025-09-18 08:08:01 | INFO | prefect.flow_runs | Starting ingestion from https://r2r-docs.sciphi.ai/introduction
2025-09-18 08:08:01 | INFO | prefect.flow_runs | Validating source...
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=933e7349-accd-47ba-a443-ad004dc80204 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:01 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/78f3cfb6-1339-49c6-89f4-c38effea29e4 "HTTP/1.1 200 OK"
2025-09-18 08:08:03 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=816af358-1638-4ced-97b0-e06686fc1a81 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:03 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:06 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=89f5424f-d39b-4d5a-9d2c-795ff23f7ac9 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=11dfd0a9-209c-432e-b1fe-973b00a779d4 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:07 | INFO | prefect.flow_runs | Ingesting documents...
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/task_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:07 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/3208cd24-4801-43da-886f-d4bcde98c727
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/3208cd24-4801-43da-886f-d4bcde98c727/set_state "HTTP/1.1 201 Created"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/3208cd24-4801-43da-886f-d4bcde98c727 "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | prefect.flow_runs | Beginning subflow run 'magnificent-mouflon' for flow 'firecrawl_to_r2r'
2025-09-18 08:08:07 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/3208cd24-4801-43da-886f-d4bcde98c727
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/bac48c85-e6dc-4da0-99d5-6f26e027cabb "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=b85fa79b-e50a-43ab-9cbc-6d356f9d9c02 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:07 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=23f93c72-be3b-4e0e-99a9-6aa26dcb687d "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:07 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/map "HTTP/1.1 200 OK"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:08 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:08 | INFO | prefect.flow_runs | Discovered 5 unique URLs from Firecrawl map
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=942907ec-99f4-4122-b494-aac403d7a710 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c5c726b4-805a-5e22-ad13-323750b25efa "HTTP/1.1 404 Not Found"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/a534965a-9da2-566e-a9ad-3e0da59bd3ae "HTTP/1.1 404 Not Found"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/8af54b00-fe82-55c5-a1a5-fd0544139b62 "HTTP/1.1 404 Not Found"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/9a2d0156-602f-5e4a-a8e1-22edd4c987e6 "HTTP/1.1 404 Not Found"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c01a1979-1dba-5731-bc71-39daff2e6ca2 "HTTP/1.1 404 Not Found"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:08 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:08 | INFO | prefect.flow_runs | Scraping 1 batches of Firecrawl pages
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=9dc7984e-4dec-4eb3-95b1-35b24c8fb708 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:08 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:09 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:12 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:08:13 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:08:15 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:08:15 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:08:17 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:08:17 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:17 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:17 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=e67effe9-bc4d-4eb3-9621-ed8ed19057e3 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:17 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:18 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:08:19 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:19 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:08:20 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:08:21 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:23 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=a58ca065-e7eb-4180-bda4-1b1b31068c37 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:08:23 | INFO | prefect.task_runs | Using collection ID: 866022d4-9a5d-4ff2-9609-1412502d44a1 for collection: r2r
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document with ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:08:23 | INFO | prefect.task_runs | Built metadata for document c5c726b4-805a-5e22-ad13-323750b25efa: {'source_url': 'https://r2r-docs.sciphi.ai/introduction', 'content_type': 'text/markdown', 'word_count': 296, 'char_count': 3000, 'timestamp': '2025-09-18T08:08:17.530458+00:00', 'ingestion_source': 'web', 'title': 'Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval-augmented generation system with multimodal content ingestion, hybrid search, configurable GraphRAG, and a Deep Research API for complex queries.'}
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document c5c726b4-805a-5e22-ad13-323750b25efa with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:08:23 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:08:23 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction", "content_type": "text/markdown", "word_count": 296, "char_count": 3000, "timestamp": "2025-09-18T08:08:17.530458+00:00", "ingestion_source": "web", "title": "Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval-augmented generation system with multimodal content ingestion, hybrid search, configurable GraphRAG, and a Deep Research API for complex queries."}
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:08:23 | INFO | prefect.task_runs | R2R returned document ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:08:23 | INFO | prefect.task_runs | Document c5c726b4-805a-5e22-ad13-323750b25efa should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document with ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:08:23 | INFO | prefect.task_runs | Built metadata for document a534965a-9da2-566e-a9ad-3e0da59bd3ae: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/system', 'content_type': 'text/markdown', 'word_count': 146, 'char_count': 1275, 'timestamp': '2025-09-18T08:08:17.530646+00:00', 'ingestion_source': 'web', 'title': 'System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system with agentic RAG capabilities, built on a modular service-oriented architecture with RESTful API, vector storage, and support for hybrid search and GraphRAG.'}
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document a534965a-9da2-566e-a9ad-3e0da59bd3ae with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:08:23 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:08:23 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/system", "content_type": "text/markdown", "word_count": 146, "char_count": 1275, "timestamp": "2025-09-18T08:08:17.530646+00:00", "ingestion_source": "web", "title": "System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system with agentic RAG capabilities, built on a modular service-oriented architecture with RESTful API, vector storage, and support for hybrid search and GraphRAG."}
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:08:23 | INFO | prefect.task_runs | R2R returned document ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:08:23 | INFO | prefect.task_runs | Document a534965a-9da2-566e-a9ad-3e0da59bd3ae should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document with ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:08:23 | INFO | prefect.task_runs | Built metadata for document 8af54b00-fe82-55c5-a1a5-fd0544139b62: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/whats-new', 'content_type': 'text/markdown', 'word_count': 42, 'char_count': 350, 'timestamp': '2025-09-18T08:08:17.530703+00:00', 'ingestion_source': 'web', 'title': "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", 'description': 'Version 0.3.5 release notes for an advanced AI retrieval system featuring Agentic RAG with RESTful API, SSE streaming, and improved citations.'}
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document 8af54b00-fe82-55c5-a1a5-fd0544139b62 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:08:23 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:08:23 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/whats-new", "content_type": "text/markdown", "word_count": 42, "char_count": 350, "timestamp": "2025-09-18T08:08:17.530703+00:00", "ingestion_source": "web", "title": "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "Version 0.3.5 release notes for an advanced AI retrieval system featuring Agentic RAG with RESTful API, SSE streaming, and improved citations."}
2025-09-18 08:08:23 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:08:23 | INFO | prefect.task_runs | R2R returned document ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:08:23 | INFO | prefect.task_runs | Document 8af54b00-fe82-55c5-a1a5-fd0544139b62 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document with ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:08:23 | INFO | prefect.task_runs | Built metadata for document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/what-is-r2r', 'content_type': 'text/markdown', 'word_count': 444, 'char_count': 3541, 'timestamp': '2025-09-18T08:08:17.530910+00:00', 'ingestion_source': 'web', 'title': 'What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications.'}
2025-09-18 08:08:23 | INFO | prefect.task_runs | Creating document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:08:23 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:08:23 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/what-is-r2r", "content_type": "text/markdown", "word_count": 444, "char_count": 3541, "timestamp": "2025-09-18T08:08:17.530910+00:00", "ingestion_source": "web", "title": "What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications."}
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:08:24 | INFO | prefect.task_runs | R2R returned document ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:08:24 | INFO | prefect.task_runs | Document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:08:24 | INFO | prefect.task_runs | Creating document with ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:08:24 | INFO | prefect.task_runs | Built metadata for document c01a1979-1dba-5731-bc71-39daff2e6ca2: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/rag', 'content_type': 'text/markdown', 'word_count': 632, 'char_count': 4228, 'timestamp': '2025-09-18T08:08:17.531237+00:00', 'ingestion_source': 'web', 'title': 'More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) with the R2R system, covering setup, configuration, and operational details.'}
2025-09-18 08:08:24 | INFO | prefect.task_runs | Creating document c01a1979-1dba-5731-bc71-39daff2e6ca2 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:08:24 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:08:24 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/rag", "content_type": "text/markdown", "word_count": 632, "char_count": 4228, "timestamp": "2025-09-18T08:08:17.531237+00:00", "ingestion_source": "web", "title": "More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) with the R2R system, covering setup, configuration, and operational details."}
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:08:24 | INFO | prefect.task_runs | R2R returned document ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:08:24 | INFO | prefect.task_runs | Document c01a1979-1dba-5731-bc71-39daff2e6ca2 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:24 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:24 | INFO | prefect.flow_runs | Upserted 5 documents into R2R (0 failed)
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/3208cd24-4801-43da-886f-d4bcde98c727/set_state "HTTP/1.1 201 Created"
2025-09-18 08:08:24 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=58784f1e-c888-42c3-83f3-d212c901c316 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:08:24 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:08:24 | INFO | prefect.flow_runs | Ingestion completed: 5 processed, 0 failed
2025-09-18 08:08:24 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/34fad9cf-0b69-46da-88a8-755ea10237a1/set_state "HTTP/1.1 201 Created"
2025-09-18 08:08:24 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:08:25 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:05 | INFO | ingest_pipeline.cli.tui.utils.runners | Shutting down storage connections
2025-09-18 08:13:05 | INFO | ingest_pipeline.cli.tui.utils.runners | All storage connections closed gracefully
2025-09-18 08:13:09 | INFO | ingest_pipeline.cli.tui.utils.runners | Initializing collection management TUI
2025-09-18 08:13:09 | INFO | ingest_pipeline.cli.tui.utils.runners | Scanning available storage backends
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/.well-known/openid-configuration "HTTP/1.1 404 Not Found"
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/meta "HTTP/1.1 200 OK"
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:10 | INFO | ingest_pipeline.cli.tui.utils.runners | weaviate connected successfully
2025-09-18 08:13:10 | INFO | ingest_pipeline.cli.tui.utils.runners | open_webui connected successfully
2025-09-18 08:13:10 | INFO | ingest_pipeline.cli.tui.utils.runners | r2r connected successfully
2025-09-18 08:13:10 | INFO | ingest_pipeline.cli.tui.utils.runners | Launching TUI with 3 backend(s): weaviate, open_webui, r2r
2025-09-18 08:13:10 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 08:13:10 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:13:10 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:13:10 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 08:13:10 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/d06bd108-ae7f-44f4-92fb-2ac556784920 "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/139c04d5-7d38-4595-8e12-79a67fd731e7 "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/96832710-8146-4e3b-88f3-4b3929f67dbf "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/dade78d9-9893-4966-bd4b-31f1c1635cfa "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/721c1517-b2cd-482d-bd1c-f99571f0f31f "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/f867530b-5eea-43bf-8257-d3da497cb10b "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/cbd4ae82-6fdd-4a4e-a4d5-d0b97ae988fd "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:11 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:12 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:12 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:12 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:31 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:13:31 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=d782cd1e-1eea-48fe-8e20-cc6b73e7352b "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:31 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:13:31 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:31 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/813e6876-948a-4833-a855-a88ef455dcf8
2025-09-18 08:13:31 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/813e6876-948a-4833-a855-a88ef455dcf8/set_state "HTTP/1.1 201 Created"
2025-09-18 08:13:31 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/813e6876-948a-4833-a855-a88ef455dcf8 "HTTP/1.1 200 OK"
2025-09-18 08:13:31 | INFO | prefect.flow_runs | Beginning flow run 'jade-skylark' for flow 'ingestion_pipeline'
2025-09-18 08:13:31 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/813e6876-948a-4833-a855-a88ef455dcf8
2025-09-18 08:13:31 | INFO | prefect.flow_runs | Starting ingestion from https://r2r-docs.sciphi.ai/introduction
2025-09-18 08:13:31 | INFO | prefect.flow_runs | Validating source...
2025-09-18 08:13:32 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=30fd7b21-d90e-49b1-bdb9-306689a9ad01 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:32 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:32 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/78f3cfb6-1339-49c6-89f4-c38effea29e4 "HTTP/1.1 200 OK"
2025-09-18 08:13:33 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=afdcdbd5-ed46-442d-9311-d9c83fc14a13 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:33 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:35 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=ff9eea92-2967-42da-8673-fdf63dd27b57 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=15f58d95-731b-4c7c-b42e-86054b548ef2 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:35 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:35 | INFO | prefect.flow_runs | Ingesting documents...
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/task_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:35 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:36 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/c962cb9b-f332-4862-a9ff-5e57b06c49ed
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/c962cb9b-f332-4862-a9ff-5e57b06c49ed/set_state "HTTP/1.1 201 Created"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/c962cb9b-f332-4862-a9ff-5e57b06c49ed "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | prefect.flow_runs | Beginning subflow run 'winged-wren' for flow 'firecrawl_to_r2r'
2025-09-18 08:13:36 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/c962cb9b-f332-4862-a9ff-5e57b06c49ed
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flows/bac48c85-e6dc-4da0-99d5-6f26e027cabb "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=23d7e58d-c5e9-464b-9d53-fd89a06f7198 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:36 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=47567ea9-ee2d-4c18-991b-a5bf46b10fd8 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:36 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/map "HTTP/1.1 200 OK"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:37 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:37 | INFO | prefect.flow_runs | Discovered 5 unique URLs from Firecrawl map
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=5a6c2608-7305-45c9-a9c2-7908e76dbc4d "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c5c726b4-805a-5e22-ad13-323750b25efa "HTTP/1.1 404 Not Found"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/a534965a-9da2-566e-a9ad-3e0da59bd3ae "HTTP/1.1 404 Not Found"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/8af54b00-fe82-55c5-a1a5-fd0544139b62 "HTTP/1.1 404 Not Found"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/9a2d0156-602f-5e4a-a8e1-22edd4c987e6 "HTTP/1.1 404 Not Found"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c01a1979-1dba-5731-bc71-39daff2e6ca2 "HTTP/1.1 404 Not Found"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:37 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:37 | INFO | prefect.flow_runs | Scraping 1 batches of Firecrawl pages
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=3735d66c-52e4-4085-97f9-e0b381fccf96 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:37 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:38 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:39 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:13:40 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:13:40 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:13:42 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:13:43 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:13:43 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:43 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:43 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=7f757009-47c5-460c-b7b8-a865c295f794 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:43 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:44 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:13:44 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:13:45 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:13:46 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:13:47 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:48 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=bb287f37-852d-4b01-9eef-b1a1d23eafa2 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:13:48 | INFO | prefect.task_runs | Using collection ID: 866022d4-9a5d-4ff2-9609-1412502d44a1 for collection: r2r
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document with ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:13:48 | INFO | prefect.task_runs | Built metadata for document c5c726b4-805a-5e22-ad13-323750b25efa: {'source_url': 'https://r2r-docs.sciphi.ai/introduction', 'content_type': 'text/markdown', 'word_count': 296, 'char_count': 3000, 'timestamp': '2025-09-18T08:13:43.092808+00:00', 'ingestion_source': 'web', 'title': 'Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system that provides Retrieval-Augmented Generation capabilities with a RESTful API, featuring multimodal content ingestion, hybrid search, configurable GraphRAG, and a Deep Research API for complex queries.'}
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document c5c726b4-805a-5e22-ad13-323750b25efa with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:13:48 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:13:48 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction", "content_type": "text/markdown", "word_count": 296, "char_count": 3000, "timestamp": "2025-09-18T08:13:43.092808+00:00", "ingestion_source": "web", "title": "Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system that provides Retrieval-Augmented Generation capabilities with a RESTful API, featuring multimodal content ingestion, hybrid search, configurable GraphRAG, and a Deep Research API for complex queries."}
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:13:48 | INFO | prefect.task_runs | R2R returned document ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:13:48 | INFO | prefect.task_runs | Document c5c726b4-805a-5e22-ad13-323750b25efa should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document with ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:13:48 | INFO | prefect.task_runs | Built metadata for document a534965a-9da2-566e-a9ad-3e0da59bd3ae: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/system', 'content_type': 'text/markdown', 'word_count': 146, 'char_count': 1275, 'timestamp': '2025-09-18T08:13:43.092979+00:00', 'ingestion_source': 'web', 'title': 'System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system with a modular architecture supporting both simple RAG applications and complex production-grade systems with features like hybrid search and GraphRAG.'}
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document a534965a-9da2-566e-a9ad-3e0da59bd3ae with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:13:48 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:13:48 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/system", "content_type": "text/markdown", "word_count": 146, "char_count": 1275, "timestamp": "2025-09-18T08:13:43.092979+00:00", "ingestion_source": "web", "title": "System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system with a modular architecture supporting both simple RAG applications and complex production-grade systems with features like hybrid search and GraphRAG."}
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:13:48 | INFO | prefect.task_runs | R2R returned document ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:13:48 | INFO | prefect.task_runs | Document a534965a-9da2-566e-a9ad-3e0da59bd3ae should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document with ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:13:48 | INFO | prefect.task_runs | Built metadata for document 8af54b00-fe82-55c5-a1a5-fd0544139b62: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/whats-new', 'content_type': 'text/markdown', 'word_count': 42, 'char_count': 350, 'timestamp': '2025-09-18T08:13:43.093036+00:00', 'ingestion_source': 'web', 'title': "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", 'description': 'Release notes for version 0.3.5 of an advanced AI retrieval system featuring improved Agentic RAG API, SSE streaming output, enhanced citations, and minor bug fixes.'}
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document 8af54b00-fe82-55c5-a1a5-fd0544139b62 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:13:48 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:13:48 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/whats-new", "content_type": "text/markdown", "word_count": 42, "char_count": 350, "timestamp": "2025-09-18T08:13:43.093036+00:00", "ingestion_source": "web", "title": "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "Release notes for version 0.3.5 of an advanced AI retrieval system featuring improved Agentic RAG API, SSE streaming output, enhanced citations, and minor bug fixes."}
2025-09-18 08:13:48 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:13:48 | INFO | prefect.task_runs | R2R returned document ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:13:48 | INFO | prefect.task_runs | Document 8af54b00-fe82-55c5-a1a5-fd0544139b62 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:13:48 | INFO | prefect.task_runs | Creating document with ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:13:49 | INFO | prefect.task_runs | Built metadata for document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/what-is-r2r', 'content_type': 'text/markdown', 'word_count': 444, 'char_count': 3541, 'timestamp': '2025-09-18T08:13:43.093218+00:00', 'ingestion_source': 'web', 'title': 'What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications.'}
2025-09-18 08:13:49 | INFO | prefect.task_runs | Creating document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:13:49 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:13:49 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/what-is-r2r", "content_type": "text/markdown", "word_count": 444, "char_count": 3541, "timestamp": "2025-09-18T08:13:43.093218+00:00", "ingestion_source": "web", "title": "What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications."}
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:13:49 | INFO | prefect.task_runs | R2R returned document ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:13:49 | INFO | prefect.task_runs | Document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:13:49 | INFO | prefect.task_runs | Creating document with ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:13:49 | INFO | prefect.task_runs | Built metadata for document c01a1979-1dba-5731-bc71-39daff2e6ca2: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/rag', 'content_type': 'text/markdown', 'word_count': 632, 'char_count': 4228, 'timestamp': '2025-09-18T08:13:43.093435+00:00', 'ingestion_source': 'web', 'title': 'More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) with the R2R system, explaining how it combines large language models with precise information retrieval from documents.'}
2025-09-18 08:13:49 | INFO | prefect.task_runs | Creating document c01a1979-1dba-5731-bc71-39daff2e6ca2 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:13:49 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:13:49 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/rag", "content_type": "text/markdown", "word_count": 632, "char_count": 4228, "timestamp": "2025-09-18T08:13:43.093435+00:00", "ingestion_source": "web", "title": "More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) with the R2R system, explaining how it combines large language models with precise information retrieval from documents."}
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:13:49 | INFO | prefect.task_runs | R2R returned document ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:13:49 | INFO | prefect.task_runs | Document c01a1979-1dba-5731-bc71-39daff2e6ca2 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:49 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:49 | INFO | prefect.flow_runs | Upserted 5 documents into R2R (0 failed)
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/c962cb9b-f332-4862-a9ff-5e57b06c49ed/set_state "HTTP/1.1 201 Created"
2025-09-18 08:13:49 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=59db8ba2-e01a-4e42-baf8-e2dabaaf4757 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:13:49 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:13:49 | INFO | prefect.flow_runs | Ingestion completed: 5 processed, 0 failed
2025-09-18 08:13:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/813e6876-948a-4833-a855-a88ef455dcf8/set_state "HTTP/1.1 201 Created"
2025-09-18 08:13:49 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:13:50 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=06321c35-36eb-442c-9e78-513baef02343 "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:46 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/3d6f8223-0b7e-43fd-a1f4-c102f3fc8919
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/3d6f8223-0b7e-43fd-a1f4-c102f3fc8919/set_state "HTTP/1.1 201 Created"
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/3d6f8223-0b7e-43fd-a1f4-c102f3fc8919 "HTTP/1.1 200 OK"
2025-09-18 08:28:46 | INFO | prefect.flow_runs | Beginning flow run 'magic-deer' for flow 'ingestion_pipeline'
2025-09-18 08:28:46 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/3d6f8223-0b7e-43fd-a1f4-c102f3fc8919
2025-09-18 08:28:46 | INFO | prefect.flow_runs | Starting ingestion from https://r2r-docs.sciphi.ai/introduction
2025-09-18 08:28:46 | INFO | prefect.flow_runs | Validating source...
2025-09-18 08:28:46 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:48 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=5a229335-b24f-4b76-9ae1-0cae020e3cfe "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:28:49 | INFO | prefect.flow_runs | Ingesting documents...
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/task_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flows/ "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:49 | INFO | prefect.engine | View at http://prefect.lab/runs/flow-run/c9d6f1ef-c902-4ad7-8cad-7c7b07a0c013
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/c9d6f1ef-c902-4ad7-8cad-7c7b07a0c013/set_state "HTTP/1.1 201 Created"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/flow_runs/c9d6f1ef-c902-4ad7-8cad-7c7b07a0c013 "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | prefect.flow_runs | Beginning subflow run 'tourmaline-beluga' for flow 'firecrawl_to_r2r'
2025-09-18 08:28:49 | INFO | prefect.flow_runs | View at http://prefect.lab/runs/flow-run/c9d6f1ef-c902-4ad7-8cad-7c7b07a0c013
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:28:49 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:28:49 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/map "HTTP/1.1 200 OK"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:28:50 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:28:50 | INFO | prefect.flow_runs | Discovered 5 unique URLs from Firecrawl map
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c5c726b4-805a-5e22-ad13-323750b25efa "HTTP/1.1 404 Not Found"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/a534965a-9da2-566e-a9ad-3e0da59bd3ae "HTTP/1.1 404 Not Found"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/8af54b00-fe82-55c5-a1a5-fd0544139b62 "HTTP/1.1 404 Not Found"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/9a2d0156-602f-5e4a-a8e1-22edd4c987e6 "HTTP/1.1 404 Not Found"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/documents/c01a1979-1dba-5731-bc71-39daff2e6ca2 "HTTP/1.1 404 Not Found"
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:28:50 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:28:50 | INFO | prefect.flow_runs | Scraping 1 batches of Firecrawl pages
2025-09-18 08:28:50 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:52 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:53 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:28:55 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:28:55 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:28:57 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:28:58 | INFO | httpx | HTTP Request: POST http://crawl.lab:30002/v2/scrape "HTTP/1.1 200 OK"
2025-09-18 08:28:58 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:28:58 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:28:58 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:28:58 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:28:59 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:29:00 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:29:01 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:29:02 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:29:03 | INFO | httpx | HTTP Request: POST http://llm.lab/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-18 08:29:03 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:29:03 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:29:03 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:29:03 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 08:29:03 | INFO | prefect.task_runs | Using collection ID: 866022d4-9a5d-4ff2-9609-1412502d44a1 for collection: r2r
2025-09-18 08:29:03 | INFO | prefect.task_runs | Creating document with ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:29:03 | INFO | prefect.task_runs | Built metadata for document c5c726b4-805a-5e22-ad13-323750b25efa: {'source_url': 'https://r2r-docs.sciphi.ai/introduction', 'content_type': 'text/markdown', 'word_count': 296, 'char_count': 3000, 'timestamp': '2025-09-18T08:28:58.061547+00:00', 'ingestion_source': 'web', 'title': 'Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval-augmented generation system with multimodal content ingestion, hybrid search, and a Deep Research API for complex queries.'}
2025-09-18 08:29:03 | INFO | prefect.task_runs | Creating document c5c726b4-805a-5e22-ad13-323750b25efa with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:29:03 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:29:03 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction", "content_type": "text/markdown", "word_count": 296, "char_count": 3000, "timestamp": "2025-09-18T08:28:58.061547+00:00", "ingestion_source": "web", "title": "Introduction | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval-augmented generation system with multimodal content ingestion, hybrid search, and a Deep Research API for complex queries."}
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:29:04 | INFO | prefect.task_runs | R2R returned document ID: c5c726b4-805a-5e22-ad13-323750b25efa
2025-09-18 08:29:04 | INFO | prefect.task_runs | Document c5c726b4-805a-5e22-ad13-323750b25efa should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document with ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:29:04 | INFO | prefect.task_runs | Built metadata for document a534965a-9da2-566e-a9ad-3e0da59bd3ae: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/system', 'content_type': 'text/markdown', 'word_count': 146, 'char_count': 1275, 'timestamp': '2025-09-18T08:28:58.061702+00:00', 'ingestion_source': 'web', 'title': 'System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system with a modular architecture supporting retrieval-augmented generation, vector storage, and GraphRAG capabilities.'}
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document a534965a-9da2-566e-a9ad-3e0da59bd3ae with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:29:04 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:29:04 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/system", "content_type": "text/markdown", "word_count": 146, "char_count": 1275, "timestamp": "2025-09-18T08:28:58.061702+00:00", "ingestion_source": "web", "title": "System | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system with a modular architecture supporting retrieval-augmented generation, vector storage, and GraphRAG capabilities."}
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:29:04 | INFO | prefect.task_runs | R2R returned document ID: a534965a-9da2-566e-a9ad-3e0da59bd3ae
2025-09-18 08:29:04 | INFO | prefect.task_runs | Document a534965a-9da2-566e-a9ad-3e0da59bd3ae should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document with ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:29:04 | INFO | prefect.task_runs | Built metadata for document 8af54b00-fe82-55c5-a1a5-fd0544139b62: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/whats-new', 'content_type': 'text/markdown', 'word_count': 42, 'char_count': 350, 'timestamp': '2025-09-18T08:28:58.061749+00:00', 'ingestion_source': 'web', 'title': "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", 'description': 'This document announces version 0.3.5 of an advanced AI retrieval system with Agentic Retrieval-Augmented Generation (RAG) and a RESTful API, featuring improved API, SSE streaming output, and enhanced citations.'}
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document 8af54b00-fe82-55c5-a1a5-fd0544139b62 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:29:04 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:29:04 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/whats-new", "content_type": "text/markdown", "word_count": 42, "char_count": 350, "timestamp": "2025-09-18T08:28:58.061749+00:00", "ingestion_source": "web", "title": "What's New | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "This document announces version 0.3.5 of an advanced AI retrieval system with Agentic Retrieval-Augmented Generation (RAG) and a RESTful API, featuring improved API, SSE streaming output, and enhanced citations."}
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:29:04 | INFO | prefect.task_runs | R2R returned document ID: 8af54b00-fe82-55c5-a1a5-fd0544139b62
2025-09-18 08:29:04 | INFO | prefect.task_runs | Document 8af54b00-fe82-55c5-a1a5-fd0544139b62 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document with ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:29:04 | INFO | prefect.task_runs | Built metadata for document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/what-is-r2r', 'content_type': 'text/markdown', 'word_count': 444, 'char_count': 3541, 'timestamp': '2025-09-18T08:28:58.061954+00:00', 'ingestion_source': 'web', 'title': 'What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications.'}
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:29:04 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:29:04 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/what-is-r2r", "content_type": "text/markdown", "word_count": 444, "char_count": 3541, "timestamp": "2025-09-18T08:28:58.061954+00:00", "ingestion_source": "web", "title": "What is R2R? | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "R2R is an advanced AI retrieval system that provides infrastructure and tools for implementing efficient, scalable, and reliable AI-powered document understanding in applications."}
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:29:04 | INFO | prefect.task_runs | R2R returned document ID: 9a2d0156-602f-5e4a-a8e1-22edd4c987e6
2025-09-18 08:29:04 | INFO | prefect.task_runs | Document 9a2d0156-602f-5e4a-a8e1-22edd4c987e6 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document with ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:29:04 | INFO | prefect.task_runs | Built metadata for document c01a1979-1dba-5731-bc71-39daff2e6ca2: {'source_url': 'https://r2r-docs.sciphi.ai/introduction/rag', 'content_type': 'text/markdown', 'word_count': 632, 'char_count': 4228, 'timestamp': '2025-09-18T08:28:58.062194+00:00', 'ingestion_source': 'web', 'title': 'More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.', 'description': 'This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) using the R2R system, covering setup, configuration, and operational details.'}
2025-09-18 08:29:04 | INFO | prefect.task_runs | Creating document c01a1979-1dba-5731-bc71-39daff2e6ca2 with collection_ids: [866022d4-9a5d-4ff2-9609-1412502d44a1]
2025-09-18 08:29:04 | INFO | prefect.task_runs | Sending to R2R - files keys: ['raw_text', 'metadata', 'id', 'ingestion_mode', 'collection_ids']
2025-09-18 08:29:04 | INFO | prefect.task_runs | Metadata JSON: {"source_url": "https://r2r-docs.sciphi.ai/introduction/rag", "content_type": "text/markdown", "word_count": 632, "char_count": 4228, "timestamp": "2025-09-18T08:28:58.062194+00:00", "ingestion_source": "web", "title": "More about RAG | The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.", "description": "This document provides a comprehensive guide to implementing and configuring Retrieval-Augmented Generation (RAG) using the R2R system, covering setup, configuration, and operational details."}
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://r2r.lab/v3/documents "HTTP/1.1 202 Accepted"
2025-09-18 08:29:04 | INFO | prefect.task_runs | R2R returned document ID: c01a1979-1dba-5731-bc71-39daff2e6ca2
2025-09-18 08:29:04 | INFO | prefect.task_runs | Document c01a1979-1dba-5731-bc71-39daff2e6ca2 should be assigned to collection 866022d4-9a5d-4ff2-9609-1412502d44a1 via creation API
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:29:04 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:29:04 | INFO | prefect.flow_runs | Upserted 5 documents into R2R (0 failed)
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/c9d6f1ef-c902-4ad7-8cad-7c7b07a0c013/set_state "HTTP/1.1 201 Created"
2025-09-18 08:29:04 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/admin/version "HTTP/1.1 200 OK"
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/increment "HTTP/1.1 200 OK"
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: GET http://prefect.lab/api/csrf-token?client=dad10334-91ca-42b5-bc9d-0adba08e2ebd "HTTP/1.1 422 Unprocessable Entity"
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/concurrency_limits/decrement "HTTP/1.1 200 OK"
2025-09-18 08:29:04 | INFO | prefect.task_runs | Finished in state Completed()
2025-09-18 08:29:04 | INFO | prefect.flow_runs | Ingestion completed: 5 processed, 0 failed
2025-09-18 08:29:04 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/flow_runs/3d6f8223-0b7e-43fd-a1f4-c102f3fc8919/set_state "HTTP/1.1 201 Created"
2025-09-18 08:29:04 | INFO | prefect.flow_runs | Finished in state Completed()
2025-09-18 08:29:06 | INFO | httpx | HTTP Request: POST http://prefect.lab/api/logs/ "HTTP/1.1 201 Created"
2025-09-18 22:21:09 | INFO | ingest_pipeline.cli.tui.utils.runners | Initializing collection management TUI
2025-09-18 22:21:09 | INFO | ingest_pipeline.cli.tui.utils.runners | Scanning available storage backends
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/.well-known/openid-configuration "HTTP/1.1 404 Not Found"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/meta "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://chat.lab/api/v1/knowledge/list "HTTP/1.1 401 Unauthorized"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | ingest_pipeline.cli.tui.utils.runners | weaviate connected successfully
2025-09-18 22:21:09 | WARNING | ingest_pipeline.cli.tui.utils.runners | open_webui connection failed
2025-09-18 22:21:09 | INFO | ingest_pipeline.cli.tui.utils.runners | r2r connected successfully
2025-09-18 22:21:09 | INFO | ingest_pipeline.cli.tui.utils.runners | Launching TUI with 2 backend(s): weaviate, r2r
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://weaviate.yo/v1/schema "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: POST http://weaviate.yo/v1/graphql "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 22:21:09 | INFO | httpx | HTTP Request: GET http://r2r.lab/v3/collections "HTTP/1.1 200 OK"
2025-09-18 23:06:37 | INFO | ingest_pipeline.cli.tui.utils.runners | Shutting down storage connections
2025-09-18 23:06:37 | INFO | ingest_pipeline.cli.tui.utils.runners | All storage connections closed gracefully