mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-29 15:31:05 +08:00
fix(agent): enforce document access on POST /api/v1/agents/rerun (#15145)
## Related issues

Closes #15144

### What problem does this PR solve?

`POST /api/v1/agents/rerun` loaded a pipeline operation log by UUID via `PipelineOperationLogService.get_documents_info` with no authorization, then wiped chunks, reset document counters, deleted tasks, and re-queued dataflow for the victim document. Any authenticated user who knew a victim's pipeline log id could disrupt parsing on documents they did not own.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Changes

| File | Change |
|------|--------|
| `api/apps/restful_apis/agent_api.py` | Call `DocumentService.accessible(doc["id"], tenant_id)` before destructive rerun operations; deny with generic `"Document not found."` |
| `test/unit_test/api/apps/restful_apis/test_rerun_agent_authorization.py` | Unit tests: cross-tenant log rejected, missing/unauthorized same message, authorized rerun proceeds |

### Security notes

- **CWE-639:** Closes cross-tenant pipeline rerun / chunk wipe via leaked log UUID.
- `tenant_id` from `@add_tenant_id_to_kwargs` is `current_user.id`; `DocumentService.accessible` covers team-shared KBs.

### Test plan

- [ ] `pytest test/unit_test/api/apps/restful_apis/test_rerun_agent_authorization.py`
- [ ] Manual: attacker cannot rerun victim pipeline log id

```bash
cd ragflow
uv run pytest test/unit_test/api/apps/restful_apis/test_rerun_agent_authorization.py -q
```

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
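The access check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `agent_api.py`: `InMemoryStore`, `rerun`, and the response dictionaries are hypothetical stand-ins, and the ownership test is simplified (the commit message states that the real `DocumentService.accessible` also covers team-shared KBs). The point it shows is that the check runs before any destructive step and that a missing log and an unauthorized document produce the same generic message.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the document, pipeline-log, and chunk tables."""
    doc_owner: dict = field(default_factory=dict)   # doc_id -> tenant_id
    log_to_doc: dict = field(default_factory=dict)  # log_id -> doc_id
    chunks: dict = field(default_factory=dict)      # doc_id -> list of chunks

    def accessible(self, doc_id: str, tenant_id: str) -> bool:
        # Simplified assumption: only the owning tenant has access.
        return self.doc_owner.get(doc_id) == tenant_id


def rerun(store: InMemoryStore, log_id: str, tenant_id: str) -> dict:
    """Rerun a pipeline log only if the caller may access its document."""
    doc_id = store.log_to_doc.get(log_id)
    # Same generic answer for "no such log" and "not your document",
    # so the response does not reveal whether the log id exists.
    if doc_id is None or not store.accessible(doc_id, tenant_id):
        return {"ok": False, "message": "Document not found."}
    # Destructive work happens only after the check has passed.
    store.chunks[doc_id] = []
    return {"ok": True, "message": "rerun queued"}


store = InMemoryStore(
    doc_owner={"doc-1": "victim"},
    log_to_doc={"log-1": "doc-1"},
    chunks={"doc-1": ["chunk-a", "chunk-b"]},
)
denied = rerun(store, "log-1", "attacker")
missing = rerun(store, "no-such-log", "attacker")
chunks_after_denial = list(store.chunks["doc-1"])
allowed = rerun(store, "log-1", "victim")
```

In this sketch the attacker's call leaves the victim's chunks untouched, and only the owner's call clears them.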
@@ -14,6 +14,7 @@
 # limitations under the License.
 #
 
+import hashlib
 import os
 import shutil
 import tiktoken
@@ -26,7 +27,8 @@ def _ensure_tiktoken_cache() -> str:
     os.environ["TIKTOKEN_CACHE_DIR"] = cache_dir
 
     bundled_encoding_path = get_project_base_directory("ragflow_deps", "cl100k_base.tiktoken")
-    cached_encoding_path = os.path.join(cache_dir, "9b5ad71b2ce5302211f9c61530b329a4922fc6a4")
+    encoding_url = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
+    cached_encoding_path = os.path.join(cache_dir, hashlib.sha1(encoding_url.encode()).hexdigest())
 
     if os.path.exists(bundled_encoding_path) and not os.path.exists(cached_encoding_path):
         shutil.copyfile(bundled_encoding_path, cached_encoding_path)
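The second hunk swaps a hard-coded cache file name for one derived from the encoding URL. A minimal sketch of that derivation, assuming the file name is the SHA-1 hex digest of the URL string as the diff shows; `cache_file_name` and `cached_encoding_path` are hypothetical helper names used only for illustration:

```python
import hashlib
import os

ENCODING_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def cache_file_name(url: str) -> str:
    """Return the SHA-1 hex digest of the URL, used here as the cache file name."""
    return hashlib.sha1(url.encode()).hexdigest()


def cached_encoding_path(cache_dir: str, url: str = ENCODING_URL) -> str:
    """Join the cache directory with the URL-derived file name."""
    return os.path.join(cache_dir, cache_file_name(url))


name = cache_file_name(ENCODING_URL)
path = cached_encoding_path("/tmp/tiktoken_cache")
```

Deriving the name from the URL keeps the path in step with the URL if it ever changes, instead of relying on a literal that has to be updated by hand.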