RAG source resolution in chat completion pipeline:
- backend/open_webui/retrieval/utils.py (lines 963-965, 1063-1068, 1126-1131 in get_sources_from_items)
Affected versions: the current main branch (commit 6fdd19bf1) and likely all versions with RAG functionality.
The get_sources_from_items function resolves file and knowledge base references into vector search queries during chat completion. Three of the five code paths perform vector store queries without any authorization check, allowing users to extract content from files and knowledge bases they do not have access to.
| Path | Lines | Access Check |
|------|-------|-------------|
| type: "file", full-context | 1044-1050 | ✅ has_access_to_file |
| type: "file", non-full-context (default) | 1063-1068 | ❌ None |
| type: "collection" | 1070-1118 | ✅ Present |
| type: "text" with collection_name | 963-965 | ❌ None |
| Bare collection_name/collection_names | 1126-1131 | ❌ None |
The three unprotected paths pass user-supplied collection names directly to query_collection(), which searches the vector store without verifying that the requesting user may read the underlying resource. Collection names follow predictable formats: file-<file_id> for files and the knowledge base UUID for knowledge bases.
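
The missing control can be illustrated with a minimal sketch of a filter applied to collection names before they reach the vector store. This is not the project's actual fix. The function `authorized_collections` and the two callbacks `can_read_file` and `can_read_knowledge` are hypothetical names introduced here; a real patch would presumably reuse the existing checks such as has_access_to_file, whose signature is not shown in this advisory.

```python
from typing import Callable, Iterable, List

# Naming convention described in the advisory: file collections are "file-<file_id>",
# knowledge base collections are named by the knowledge base UUID.
FILE_PREFIX = "file-"


def authorized_collections(
    collection_names: Iterable[str],
    user_id: str,
    can_read_file: Callable[[str, str], bool],       # hypothetical: (file_id, user_id) -> bool
    can_read_knowledge: Callable[[str, str], bool],  # hypothetical: (kb_id, user_id) -> bool
) -> List[str]:
    """Return only the collection names the user is allowed to query."""
    allowed = []
    for name in collection_names:
        if name.startswith(FILE_PREFIX):
            permitted = can_read_file(name[len(FILE_PREFIX):], user_id)
        else:
            permitted = can_read_knowledge(name, user_id)
        if permitted:
            allowed.append(name)
    return allowed


# Toy access-control data for illustration only.
file_acl = {("f1", "alice")}
kb_acl = {("kb-1", "alice")}

requested = ["file-f1", "file-f2", "kb-1", "kb-2"]
alice_view = authorized_collections(
    requested, "alice",
    lambda fid, uid: (fid, uid) in file_acl,
    lambda kid, uid: (kid, uid) in kb_acl,
)
bob_view = authorized_collections(
    requested, "bob",
    lambda fid, uid: (fid, uid) in file_acl,
    lambda kid, uid: (kid, uid) in kb_acl,
)
print(alice_view)  # ['file-f1', 'kb-1']
print(bob_view)    # []
```

Filtering at the point where collection names are resolved, rather than in each caller, would cover all three unprotected paths with one check, since they all hand their names to the same query function.
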
| Metric | Value | Rationale |
|--------|-------|-----------|
| Attack Vector | Network (N) | Exploited remotely via chat completion API |
| Attack Complexity | Low (L) | Single API call with a known resource ID |
| Privileges Required | Low (L) | Requires a valid user account |
| User Interaction | None (N) | No victim interaction required |
| Scope | Unchanged (U) | Impact within the application's data boundary |
| Confidentiality | High (H) | Full content of private files/knowledge bases extractable |
| Integrity | None (N) | No data modification |
| Availability | None (N) | No denial of service |
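
The metric values above can be turned into a numeric base score. The advisory does not name a CVSS version; the sketch below assumes CVSS v3.1 and implements only the Scope Unchanged branch of its base score equations. The function names are illustrative.

```python
import math

# CVSS v3.1 numeric weights (Privileges Required values are those for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """CVSS v3.1 base score for vectors with Scope: Unchanged."""
    w = WEIGHTS
    iss = 1 - (1 - w["CIA"][c]) * (1 - w["CIA"][i]) * (1 - w["CIA"][a])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"][av] * w["AC"][ac] * w["PR"][pr] * w["UI"][ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:N / AC:L / PR:L / UI:N / S:U / C:H / I:N / A:N, as listed in the table.
score = base_score_scope_unchanged("N", "L", "L", "N", "H", "N", "N")
print(score)  # 6.5
```

Under that assumption the vector scores 6.5, which falls in the Medium band (4.0 to 6.9) and appears consistent with the MODERATE severity recorded in the advisory metadata.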
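The attacker needs only the ID of a file they are not authorized to read, for example one whose access has been revoked (its collection name is file-<file_id>). They then send a chat completion request that references it:
POST /api/chat/completions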
{
  "model": "any-accessible-model",
  "messages": [{"role": "user", "content": "What does this document say about pricing?"}],
  "files": [{"type": "file", "id": "<revoked_file_id>"}]
}
The non-full-context file path builds the collection name file-<id> and queries the vector store with no access check.
The same attack works via {"type": "text", "collection_name": "<knowledge_base_id>"} for knowledge bases.
{
"github_reviewed": true,
"github_reviewed_at": "2026-05-08T20:03:09Z",
"cwe_ids": [
"CWE-862"
],
"severity": "MODERATE",
"nvd_published_at": null
}