Here are examples of **metadata tagging** in Open WebUI and Weaviate, showing how to associate tags and structured metadata with knowledge objects for RAG and semantic search.

***

### Example: Metadata Tagging in Open WebUI

You send a document to the Open WebUI API endpoint, attaching metadata and tags in the `content` field as a JSON string:

```http
POST http://localhost/api/v1/documents/create
Content-Type: application/json

{
  "name": "policy_doc_2022",
  "title": "2022 Policy Handbook",
  "collection_name": "company_handbooks",
  "filename": "policy_2022.pdf",
  "content": "{\"tags\": [\"policy\", \"2022\", \"hr\"], \"source_url\": \"https://example.com/policy_2022.pdf\", \"author\": \"Jane Doe\"}"
}
```

- The `"tags"` field is a list of classification labels (policy, 2022, hr).
- The `"source_url"` and `"author"` fields provide additional metadata useful for retrieval, audit, and filtering.[1][2]

For pipeline-based ingestion, you might design a function that extracts metadata and attaches it to each chunk before vectorization:

```python
# document_url, document_author, and chunk come from your own ingestion step.
metadata = {
    "tags": ["policy", "2022"],
    "source_url": document_url,
    "author": document_author,
}
# embed_with_metadata is a placeholder for a function you would write yourself.
embed_with_metadata(chunk, metadata)
```

This metadata becomes part of your retrieval context in RAG workflows.[1]

***

### Example: Metadata Tagging in Weaviate

In Weaviate, metadata and tags are defined as properties in the schema and attached to each object when it is added:

**Schema definition:**

```json
{
  "class": "Document",
  "properties": [
    {"name": "title", "dataType": ["text"]},
    {"name": "tags", "dataType": ["text[]"]},
    {"name": "source_url", "dataType": ["text"]},
    {"name": "author", "dataType": ["text"]}
  ]
}
```

**Object creation example:**

```python
import weaviate

# Weaviate Python client v3 syntax; the URL assumes a local instance.
client = weaviate.Client("http://localhost:8080")

client.data_object.create(
    data_object={
        "title": "2022 Policy Handbook",
        "tags": ["policy", "2022", "hr"],
        "source_url": "https://example.com/policy_2022.pdf",
        "author": "Jane Doe",
    },
    class_name="Document",
)
```

- The `"tags"` field is a text array, well suited to filtering and faceting.
- The other fields store provenance metadata, which supports filtered queries and data governance.[3][4][5]

**Query with metadata filtering:**

```python
# Weaviate Python client v3: filters are passed with with_where(),
# and ContainsAny takes its list of values under "valueText".
result = (
    client.query
    .get("Document", ["title", "tags", "author"])
    .with_where({
        "path": ["tags"],
        "operator": "ContainsAny",
        "valueText": ["policy", "hr"],
    })
    .do()
)
```

This retrieves documents tagged with either "policy" or "hr".[3][4]

***

Both platforms support **metadata tagging** for documents, which enables filtered retrieval and richer context in RAG workflows.[1][2][3][4][5]

[1](https://www.reddit.com/r/OpenWebUI/comments/1hmmg9a/how_to_handle_metadata_during_vectorization/)
[2](https://github.com/open-webui/open-webui/discussions/4692)
[3](https://stackoverflow.com/questions/75006703/query-large-list-of-metadate-in-weaviate)
[4](https://weaviate.io/blog/enterprise-workflow-langchain-weaviate)
[5](https://docs.weaviate.io/academy/py/zero_to_mvp/schema_and_imports/schema)
[6](https://docs.weaviate.io/weaviate/api/graphql/additional-properties)
[7](https://weaviate.io/blog/sycamore-and-weaviate)
[8](https://docs.llamaindex.ai/en/stable/examples/vector_stores/WeaviateIndex_auto_retriever/)
[9](https://forum.weaviate.io/t/recommendations-for-metadata-or-knowledge-graphs/960)
[10](https://weaviate.io/blog/agent-workflow-automation-n8n-weaviate)
[11](https://github.com/open-webui/open-webui/discussions/9804)
[12](https://docs.quarkiverse.io/quarkus-langchain4j/dev/rag-weaviate.html)
[13](https://github.com/weaviate/weaviate-examples)
[14](https://docs.openwebui.com/getting-started/api-endpoints/)
[15](https://weaviate.io/blog/hybrid-search-for-web-developers)
[16](https://dev.to/stephenc222/how-to-use-weaviate-to-store-and-query-vector-embeddings-4b9b)
[17](https://helpdesk.egnyte.com/hc/en-us/articles/360035813612-Using-Metadata-in-the-WebUI)
[18](https://docs.datadoghq.com/integrations/weaviate/)
[19](https://docs.openwebui.com/features/)
[20](https://documentation.suse.com/suse-ai/1.0/html/openwebui-configuring/index.html)
[21](https://docs.openwebui.com/getting-started/env-configuration/)
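
***

The Open WebUI request above carries its metadata as a JSON string nested inside the JSON body, which is easy to get wrong when the quotes are escaped by hand. Below is a minimal sketch of building that body programmatically. The helper name `build_document_payload` is hypothetical and not part of any Open WebUI client; the field names are simply copied from the request shown earlier.

```python
import json


def build_document_payload(name, title, collection_name, filename, metadata):
    """Build a request body whose "content" field holds the metadata as a JSON string.

    Hypothetical helper: the field names mirror the example request,
    not a documented client API.
    """
    return {
        "name": name,
        "title": title,
        "collection_name": collection_name,
        "filename": filename,
        # Serialize the metadata once here; the quotes are escaped
        # automatically when the outer body is serialized below.
        "content": json.dumps(metadata),
    }


payload = build_document_payload(
    name="policy_doc_2022",
    title="2022 Policy Handbook",
    collection_name="company_handbooks",
    filename="policy_2022.pdf",
    metadata={
        "tags": ["policy", "2022", "hr"],
        "source_url": "https://example.com/policy_2022.pdf",
        "author": "Jane Doe",
    },
)

# The outer serialization yields the text you would send as the request body.
body = json.dumps(payload, indent=2)

# Reading it back takes two decoding steps: the body, then the content string.
decoded_metadata = json.loads(json.loads(body)["content"])
```

Serializing in two steps keeps the metadata a plain dictionary until the last moment, so tags can be added or validated before the request body is produced.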