mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-07-01 08:15:44 +08:00
## Summary
Aligns the **Go agent runtime/canvas/components/tools** behavior with
the **Python `agent/` implementation** so the same stored canvas DSL
produces the same execution result on either side. Every component,
tool, and runtime primitive in `internal/agent/` is now driven by the
same semantics as its Python counterpart — variable resolution, template
substitution, control flow, error reporting, retry/cancel, and stream
event shapes.
The **retrieval component is the one explicit exception** in this PR. It
is being reworked in a separate change and is excluded from this
alignment pass; the wrapper slot (`universe_a_wrappers.go →
newRetrievalComponent`) is preserved.
## Scope of alignment
### Components (all aligned with `agent/component/`)
`Begin` · `Message` · `LLM` (incl. ChatTemplateKwargs,
MessageHistoryWindowSize, VisualFiles, Cite, OutputStructure,
JSONOutput, TopP, MaxRetries, DelayAfterError, credentials) · `Agent`
(react + tool artifact capture + `Reset()` interface-assert) · `Switch`
(12/12 operators, Python-equivalent semantics) · `Categorize` · `Invoke`
· `Iteration` · `Loop` (macro-expansion through `workflowx.AddLoopNode`)
· `UserFillUp` (Python-equivalent interrupt/resume via eino
`compose.Interrupt`/`ResumeWithData`) · `FillUp` · `DataOperations` ·
`ListOperations` · `StringTransform` · `VariableAggregator` ·
`VariableAssigner` · `Browser` (full stagehand runtime parity) ·
`DocsGenerator` · `ExcelProcessor`.
### Tools (aligned with `agent/tools/`, except `Retrieval`)
`Retrieval` (wrapper slot only — logic out of scope) · `MCPToolAdapter`
(streamable-HTTP) · `CodeExec` (sandbox bridge with
`code_exec_contract.go` matching Python contract) · `AkShare` · `ArXiv`
· `Crawler` · `DeepL` · `DuckDuckGo` · `Email` · `ExeSQL` · `GitHub` ·
`Google` · `GoogleScholar` · `Jin10` · `PubMed` · `QWeather` · `SearXNG`
· `Tavily` · `Tushare` · `Wencai` · `Wikipedia` · `YahooFinance` —
uniform `eino tool.InvokableTool` interface, SSRF protection, shared
HTTP client.
### Canvas execution engine (`internal/agent/canvas/`)
Aligned with Python's `agent/canvas.py`:
- **Scheduler** (`scheduler.go`): state pre/post handlers, node lambdas,
per-component timeout resolver (4-level: per-class env → per-class table
→ uniform env → 600s fallback), `legacyNoOpNames`.
- **Loop subgraph** (`loop_subgraph.go`): Python-equivalent
`AddLoopNode` macro expansion + condition translation.
- **Multibranch** (`multibranch.go`): `Switch` / `Categorize` routing
via `compose.NewGraphMultiBranch` — same branch selection semantics as
Python.
- **Parallel subgraph** (`parallel_subgraph.go`): matches Python's
parallel fan-out contract.
- **Interrupt/Resume** (`interrupt_resume.go`): `UserFillUpNodeBody` /
`IsInterruptError` / `ExtractInterruptContexts` — replaces the
deprecated Python sentinel chain with eino's native interrupt API,
preserving the same external behavior.
- **Checkpoint** (`checkpoint_store.go`): `RedisCheckPointStore`
Get/Set/Delete, with business metadata (status / canvas_id /
parent_run_id) on a parallel Redis Hash.
- **RunTracker** (`run_tracker.go`): Start / MarkSucceeded / MarkFailed
/ MarkCancelled / AttachCheckpoint — same lifecycle as the Python run
record.
- **Cancel** (`cancel.go`): Redis pub/sub watch.
- **Stream** (`stream.go`): SSE channel with `messages` / `waiting` /
`errors` / `done` events, same shape as Python's `agent.canvas.RunEvent`
payload.
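The four-level timeout lookup described for the scheduler can be sketched as follows. This is a minimal illustration, not the code in `scheduler.go`: the per-class environment variable name (`COMPONENT_EXEC_TIMEOUT_<CLASS>`) and the contents of `PER_CLASS_TIMEOUTS` are assumptions made for the example. Only `COMPONENT_EXEC_TIMEOUT` and the 600s fallback appear in the source.

```python
import os

# Hypothetical per-class defaults, for illustration only.
PER_CLASS_TIMEOUTS = {"LLM": 900}
FALLBACK_SECONDS = 600


def _positive_int(raw):
    """Parse raw as a positive int; return None if it is missing or unusable."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None
    return value if value > 0 else None


def resolve_timeout(component_name, env=None):
    """Resolve a timeout in seconds: class env, class table, uniform env, fallback."""
    env = os.environ if env is None else env

    # 1. Per-class environment override (variable name is an assumption).
    per_class = _positive_int(env.get(f"COMPONENT_EXEC_TIMEOUT_{component_name.upper()}"))
    if per_class is not None:
        return per_class

    # 2. Per-class table.
    if component_name in PER_CLASS_TIMEOUTS:
        return PER_CLASS_TIMEOUTS[component_name]

    # 3. Uniform environment variable shared by every component.
    uniform = _positive_int(env.get("COMPONENT_EXEC_TIMEOUT"))
    if uniform is not None:
        return uniform

    # 4. Hard-coded fallback.
    return FALLBACK_SECONDS
```

Passing `env` explicitly keeps the lookup easy to test without mutating the process environment.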
### DSL bridge (`internal/agent/dsl/`)
- `normalize.go`: v1↔v2 collapsed into a single wire format — Python and
Go consume the same stored JSON.
- `reset.go`: per-run state reset matches Python's `Canvas.reset()`
semantics.
- Testdata mirrors Python's `agent_msg.json` / `all.json` / etc.
### Runtime (`internal/agent/runtime/`)
- `CanvasState` / `NewCanvasState` / `GetVar` / `SetVar` / `ReadVars`:
same `{{cpn_id@param}}` resolution model.
- `ResolveTemplate` (regex fast path + gonja fallback) — Python
Jinja-style semantics.
- `selector.go`, `metrics.go`, `component.go`: shared runtime contracts.
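As a rough illustration of the regex fast path for `{{cpn_id@param}}` references, the sketch below substitutes references from a flat dictionary. The accepted character set for identifiers, the flat `variables` mapping, and the choice to raise `KeyError` on an unknown reference are all assumptions for the example. The gonja fallback is not modelled.

```python
import re

# Matches {{name}} or {{cpn_id@param}}; the allowed characters are an assumption.
_REF = re.compile(r"\{\{\s*([A-Za-z0-9_.:\-]+(?:@[A-Za-z0-9_.\-]+)?)\s*\}\}")


def resolve_template(template, variables):
    """Replace every {{ref}} in template with the matching value from variables."""

    def substitute(match):
        key = match.group(1)
        if key not in variables:
            # Surface unresolvable references instead of silently dropping them.
            raise KeyError(f"unresolvable template reference: {key}")
        value = variables[key]
        return "" if value is None else str(value)

    return _REF.sub(substitute, template)
```

For example, `resolve_template("Hi {{sys.query}}", {"sys.query": "there"})` returns `"Hi there"`.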
## Out of scope (intentionally)
- **`Retrieval` component logic** — wrapped only; full parity lands in a
follow-up PR.
- **Frontend** — only minor dsl-bridge / canvas UX fixes ride along.
- **CLI / admin / model registry** — orthogonal to agent behavior.
## How alignment is verified
`internal/service/agent_run_e2e_test.go` exercises the **full production
chain** against real Python-shaped DSL fixtures:
```
loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL →
canvas.Compile → cc.Workflow.Invoke → answer extraction
```
using in-memory SQLite + miniredis (no Docker). Covers:
- `TestRunAgent_RealCanvas_BeginMessage` — happy path, `{{sys.query}}`
resolution
- `TestRunAgent_RealCanvas_WaitForUserResume` — two-run resume cycle
(Python-equivalent)
- `TestRunAgent_RealCanvas_CompileFails` — unknown component name →
sanitized error (Python-equivalent)
- `TestRunAgent_RealCanvas_InvokeFails` — unresolvable template ref
(Python-equivalent)
- `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` —
Start→AttachCheckpoint→MarkSucceeded lifecycle
`internal/handler/agent_test.go` — SSE streaming parity (`Content-Type:
text/event-stream`, `data: {…}\n\n`, trailing `data: [DONE]\n\n`,
OpenAI-compatible non-stream `choices`).
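The SSE framing checked by the handler test can be sketched like this. The event payload shape used in the usage note is hypothetical. Only the `data: {…}\n\n` framing and the trailing `data: [DONE]\n\n` sentinel come from the description above.

```python
import json


def sse_frames(events):
    """Yield one SSE frame per event, followed by the [DONE] sentinel."""
    for event in events:
        # Each frame is a single "data:" line terminated by a blank line.
        yield f"data: {json.dumps(event, ensure_ascii=False)}\n\n"
    yield "data: [DONE]\n\n"
```

A consumer can split the stream on blank lines, strip the `data: ` prefix, and stop at `[DONE]`; for instance, `list(sse_frames([{"event": "message"}]))` produces two frames, the JSON payload and the sentinel.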
`internal/agent/canvas/fixture_compile_test.go` + per-component tests
pin the Python-equivalent outputs.
```
go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/
```
## Design reference
`docs/develop/agent-go-port-design.md` (1329 lines, last cross-checked
2026-06-17) — module layout, per-component / per-tool inventory,
corner-case catalogue, and the actionable backlog (Section 14, including
the retrieval alignment follow-up).
---------
Co-authored-by: Claude <noreply@anthropic.com>
228 lines
7.0 KiB
Python
from abc import ABC
import os

from agent.component.base import ComponentBase, ComponentParamBase
from api.utils.api_utils import timeout


class ListOperationsParam(ComponentParamBase):
    """
    Define the List Operations component parameters.
    """
    def __init__(self):
        super().__init__()
        self.query = ""
        self.operations = "nth"
        self.n = 0
        self.strict = False
        self.sort_method = "asc"
        # Comma-separated list of map keys to sort by (primary,
        # tiebreak, ...). Empty / unset falls back to the legacy
        # full-hashable-key behaviour (sort by the lexicographically
        # first field). Mirrors internal/agent/component/list_operations.go
        # parseSortByFieldList + opSort's SortBy path.
        self.sort_by = ""
        self.filter = {
            "operator": "=",
            "value": ""
        }
        self.outputs = {
            "result": {
                "value": [],
                "type": "Array of ?"
            },
            "first": {
                "value": "",
                "type": "?"
            },
            "last": {
                "value": "",
                "type": "?"
            }
        }

    def check(self):
        self.check_empty(self.query, "query")
        self.check_valid_value(
            self.operations,
            "Support operations",
            ["nth", "head", "tail", "filter", "sort", "drop_duplicates"],
        )

    def get_input_form(self) -> dict[str, dict]:
        return {}


class ListOperations(ComponentBase, ABC):
    component_name = "ListOperations"

    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
    def _invoke(self, **kwargs):
        self.input_objects = []
        inputs = getattr(self._param, "query", None)
        self.inputs = self._canvas.get_variable_value(inputs)
        if not isinstance(self.inputs, list):
            raise TypeError("The input of List Operations should be an array.")
        self.set_input_value(inputs, self.inputs)
        if self._param.operations == "nth":
            self._nth()
        elif self._param.operations == "head":
            self._head()
        elif self._param.operations == "tail":
            self._tail()
        elif self._param.operations == "filter":
            self._filter()
        elif self._param.operations == "sort":
            self._sort()
        elif self._param.operations == "drop_duplicates":
            self._drop_duplicates()


    def _coerce_n(self):
        try:
            return int(getattr(self._param, "n", 0))
        except Exception:
            return 0

    def _is_strict(self):
        strict = getattr(self._param, "strict", False)
        if isinstance(strict, str):
            return strict.strip().lower() in {"1", "true", "yes", "on"}
        return bool(strict)

    def _set_outputs(self, outputs):
        self._param.outputs["result"]["value"] = outputs
        self._param.outputs["first"]["value"] = outputs[0] if outputs else None
        self._param.outputs["last"]["value"] = outputs[-1] if outputs else None

    def _raise_strict_range_error(self, operation, n):
        raise ValueError(
            f"{operation} requires n to be within the valid range in strict mode, got {n}."
        )

    def _nth(self):
        n = self._coerce_n()
        strict = self._is_strict()
        if n == 0:
            if strict:
                self._raise_strict_range_error("nth", n)
            outputs = []
        elif n > 0:
            if n <= len(self.inputs):
                outputs = [self.inputs[n - 1]]
            elif strict:
                self._raise_strict_range_error("nth", n)
            else:
                outputs = []
        else:
            if abs(n) <= len(self.inputs):
                outputs = [self.inputs[n]]
            elif strict:
                self._raise_strict_range_error("nth", n)
            else:
                outputs = []
        self._set_outputs(outputs)

    def _head(self):
        n = self._coerce_n()
        strict = self._is_strict()
        if strict:
            if 1 <= n <= len(self.inputs):
                outputs = self.inputs[:n]
            else:
                self._raise_strict_range_error("head", n)
        else:
            if n < 1:
                outputs = []
            else:
                outputs = self.inputs[:n]
        self._set_outputs(outputs)

    def _tail(self):
        n = self._coerce_n()
        strict = self._is_strict()
        if strict:
            if 1 <= n <= len(self.inputs):
                outputs = self.inputs[-n:]
            else:
                self._raise_strict_range_error("tail", n)
        else:
            if n < 1:
                outputs = []
            else:
                outputs = self.inputs[-n:]
        self._set_outputs(outputs)

    def _filter(self):
        operator = self._param.filter["operator"]
        value = self._param.filter["value"]
        self._set_outputs(
            [i for i in self.inputs if self._eval(self._norm(i), operator, value)]
        )

    def _norm(self, v):
        s = "" if v is None else str(v)
        return s

    def _eval(self, v, operator, value):
        if operator == "=":
            return v == value
        elif operator == "≠":
            return v != value
        elif operator == "contains":
            return value in v
        elif operator == "start with":
            return v.startswith(value)
        elif operator == "end with":
            return v.endswith(value)
        else:
            return False

    def _sort(self):
        items = self.inputs or []
        method = getattr(self._param, "sort_method", "asc") or "asc"
        reverse = method == "desc"

        if not items:
            self._set_outputs([])
            return

        first = items[0]

        if isinstance(first, dict):
            sort_by_raw = getattr(self._param, "sort_by", "") or ""
            sort_by = [k.strip() for k in sort_by_raw.split(",") if k.strip()]
            if sort_by:
                outputs = sorted(
                    items,
                    key=lambda x: tuple(x.get(k) for k in sort_by),
                    reverse=reverse,
                )
            else:
                outputs = sorted(
                    items,
                    key=lambda x: self._hashable(x),
                    reverse=reverse,
                )
        else:
            outputs = sorted(items, reverse=reverse)

        self._set_outputs(outputs)

    def _drop_duplicates(self):
        seen = set()
        outs = []
        for item in self.inputs:
            k = self._hashable(item)
            if k in seen:
                continue
            seen.add(k)
            outs.append(item)
        self._set_outputs(outs)

    def _hashable(self, x):
        if isinstance(x, dict):
            return tuple(sorted((k, self._hashable(v)) for k, v in x.items()))
        if isinstance(x, (list, tuple)):
            return tuple(self._hashable(v) for v in x)
        if isinstance(x, set):
            return tuple(sorted(self._hashable(v) for v in x))
        return x

    def thoughts(self) -> str:
        return "ListOperation in progress"