mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-07-04 18:45:38 +08:00
## Summary While fixing #16467 (IterationItem crash on `@` in user-defined output keys), an audit of `agent/**/*.py` revealed **three additional sites** with the same vulnerability. This PR hardens all of them with `maxsplit=1` and adds regression tests. This is **defense-in-depth hardening**, not a behavior change. The current `variable_ref_patt` regex constrains `var_nm` to `[A-Za-z0-9_.-]+`, so single-`@` templates resolve exactly as before. The `maxsplit=1` only kicks in if the trailing side itself contains `@` — currently unreachable from the public DSL surface, but trivially exploitable the moment a user-defined output key happens to contain `@` (e.g. `user@email`) or the regex is ever relaxed. > **Note on issue scope**: The primary fix for #16467 (the `list_tenant_added_models` `ValueError` crash on `@` in model names) is in PR #16468. This PR is a **follow-up hardening sweep** of the same vulnerability class found in `agent/` during that audit; it does not duplicate or replace #16468. ## Sites hardened | File | Line | Method | |------|------|--------| | `agent/canvas.py` | 206 | `Graph.get_variable_value` | | `agent/canvas.py` | 256 | `Graph.set_variable_value` | | `agent/component/base.py` | 533 | `ComponentBase.get_input_elements_from_text` | | `agent/component/iterationitem.py` | 88 | `IterationItem.output_collation` | All now use `split("@", 1)` with an inline comment explaining the rationale. The trailing side keeps any embedded `@`. ## Sites already safe (audited but left alone) | File | Reason safe | |------|------------| | `agent/canvas.py:708` (`is_reff`) | Pre-checks `len(arr) != 2` | | `agent/component/categorize.py` | Uses `rsplit` | | `agent/component/iteration.py` | Pre-validates via regex | | Other call sites | `rsplit` or regex pre-validation | ## Regression tests 9 new tests across 2 files, all `pytest.mark.p2`: | File | Tests | |------|-------| | `test/unit_test/agent/test_canvas_at_split.py` | 6 — `get_variable_value`, `set_variable_value`, round-trip, single-`@`, missing-component | | `test/unit_test/agent/component/test_iterationitem_at_split.py` | 3 — `output_collation` with `@` in var, single-`@`, non-matching cid | Each test was **verified to fail with `ValueError: too many values to unpack (expected 2)`** when the corresponding fix is temporarily reverted, confirming the tests actually catch the bug rather than just exercising the happy path. ## Test results ``` 9 passed in 0.04s ``` Full agent unit suite also clean (38 passed, 3 skipped; 6 unrelated pre-existing collection errors from missing `peewee`/`requests` in local venv — not caused by this PR). ## Related - Issue: #16467 - Primary fix PR: #16468 (closes the issue) - This PR: defense-in-depth follow-up, intentionally non-blocking on #16467 --------- Co-authored-by: skbs-eng <skbs-eng@users.noreply.github.com>