ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 15:31:05 +08:00

Author	SHA1	Message	Date
philluiz2323	e256d91ade	fix: guard SSRF in ExeSQL agent tool DB host (#15609 ) ### What problem does this PR solve? Closes #15608. The ExeSQL agent tool (`agent/tools/exesql.py`) opens database connections to a node-author-controlled host/port with no SSRF validation. The sibling `test_db_connection` endpoint already validates the host via `common.ssrf_guard.assert_host_is_safe` (added by PR #14860), but the tool that actually performs the connection at agent run time was left unguarded — so the guard is bypassed simply by running the agent. An agent author can point the host at `127.0.0.1`, `169.254.169.254` (cloud metadata), or any internal RFC1918 host/port, turning ExeSQL into an internal port-scanner / metadata-fetch primitive. ### Fix Mirror the accepted endpoint guard: validate (and resolve) the host once, before the `db_type` dispatch, and connect to the validated public IP so a later DNS change cannot rebind the host to an internal address. - Add `from common.ssrf_guard import assert_host_is_safe`. - `safe_host = assert_host_is_safe(self._param.host)` before the dispatch (rejects loopback, link-local/metadata, RFC1918, and unresolvable hosts). - Substitute the validated IP into all 6 driver branches: mysql/mariadb, oceanbase, postgres, mssql, trino, IBM DB2. Adds `test/unit_test/agent/tools/test_exesql_ssrf.py` covering loopback, link-local/metadata, RFC1918, and empty-host rejection (before any connection), plus an allowed host dialing the validated IP. ### Validation - `python3 -m py_compile agent/tools/exesql.py` - `ruff check agent/tools/exesql.py test/unit_test/agent/tools/test_exesql_ssrf.py` - `pytest test/unit_test/agent/tools/test_exesql_ssrf.py` — 5 passed ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-06-29 09:45:16 +08:00
cleanjunc	14174b2364	fix(agent): add HTTP timeout to external API tools (#15436) ### What problem does this PR solve? Closes #15435. Several agent tools call external HTTP APIs through `requests` with no request timeout. When an upstream host accepts the connection but never responds (a slow or overloaded API, a half-open connection, a stuck load balancer), the call blocks forever. These tools run inside agent canvas execution, so a single stalled socket freezes the entire agent run with no recovery. Ten call sites were affected: - `agent/tools/qweather.py` (4 calls) - `agent/tools/jin10.py` (4 calls) - `agent/tools/tushare.py` (1 call) - `agent/tools/github.py` (1 call) The `github.py` tool already carried the `@timeout` decorator from `common/connection_utils.py`, but that does not protect against this case. In the default configuration the decorator waits on its result queue with no timeout, and a daemon thread blocked inside a socket read cannot be killed, so the run still hangs. The per-request timeout added here is what actually bounds the call. This is the same bug class as the merged Go stream timeout fix, surfacing in the Python tool layer. Changes: - Pass `timeout=DEFAULT_TIMEOUT` on all 10 calls, reusing the existing shared constant in `common/http_client.py` (configurable via `HTTP_CLIENT_TIMEOUT`) so there is one source of truth rather than scattered literals. - Add an AST-based unit test at `test/unit_test/agent/tools/test_http_timeout.py` that scans every tool module and fails if any `requests` or `httpx` request call omits a `timeout`, guarding current and future call sites. Verification: - Reproduced the indefinite block against a stalling local server, and confirmed that adding a timeout raises `ReadTimeout` promptly. - Confirmed that the `@timeout` decorator, in its default configuration, does not interrupt a blocked request that has no timeout. - The new test flags exactly the 10 original call sites on the pre-fix code and passes (22 modules) after the fix.
### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-06-29 09:45:16 +08:00
Muhammad Furqan	fe14cc35cf	fix(agent/tools): DeepL component fails validation and drops errors (#16332 ) ### What problem does this PR solve? `DeepLParam.check()` validated `self.top_n`, but DeepL has no such parameter (it is not defined on the param class or its base), so `check()` always raised `AttributeError` and a DeepL component could never pass validation. Removed the bogus `top_n` check. Also fixed the `_run` except branch, which computed `be_output("Error...")` but never returned it, silently dropping the error message. Closes #16329 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Add test cases ### Testing Added `test/unit_test/agent/component/test_deepl.py` covering `DeepLParam.check()` with valid defaults and rejection of invalid source/target languages.	2026-06-25 14:40:56 +08:00
Muhammad Furqan	3747a6bfeb	fix(agent/tools): PubMed tool always returns "Unknown Authors" (#16330 ) ### What problem does this PR solve? Fixes the PubMed tool always emitting `Authors: Unknown Authors`. The `safe_find` closure in `_format_pubmed_content` was hardcoded to search from the article root, so the per-author `LastName`/`ForeName` lookups never matched. `safe_find` now accepts an optional `base` node (defaults to `child`, preserving the existing field lookups), and the author loop passes the current `<Author>` element. Closes #16328 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Add test cases ### Testing Added `test/testcases/test_web_api/test_canvas_app/test_pubmed_unit.py` covering per-author parsing, intact title/journal/DOI fields, and the no-authors fallback. Before: `Authors: Unknown Authors` After: `Authors: Furqan Khan, Jane Smith`	2026-06-25 14:34:37 +08:00
Willsgao	824c88423c	fix(agent): log Wikipedia disambiguation and page errors instead of s… (#16207 ) ## Problem The Wikipedia tool silently swallows all exceptions with `except Exception: pass`, making it impossible to debug failures when fetching Wikipedia pages. ## Fix Replace the bare `except Exception: pass` with specific exception handling: - `DisambiguationError`: log available options - `PageError`: log page not found - `Exception`: log unexpected errors with full traceback Co-authored-by: wills <willsgao@163.com> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-06-25 13:10:29 +08:00
Ilya Bogin	10d02e54a8	Add Keenable web search tool to the agent (#16233 ) Adds Keenable as a web search tool in the agent, alongside the existing Tavily/DuckDuckGo/SearXNG/Google tools. The main difference from the other search tools is that it doesn't require an API key. By default it uses Keenable's keyless public endpoint, so it works out of the box. Providing a key (in the tool config) switches to the authenticated endpoint and lifts the rate limits. ### Changes - Backend: `agent/tools/keenable.py` — `KeenableSearch`, follows the Tavily/DuckDuckGo tool shape (results go through `_retrieve_chunks`). Auto-registered by `agent/tools/__init__.py`. - Frontend: wired into the agent builder — operator + icon, config form (optional API key, search mode, site filter, top N), the search tool menu, and the existing api_key export sanitizer. ### Config - API key: optional. Blank = keyless free tier; set it to lift limits / enable `realtime` mode. - `site`: restrict to a single domain. - `mode`: `pro` (default) or `realtime`. ### Notes `KEENABLE_API_URL` can override the API base (HTTPS enforced; defaults to `https://api.keenable.ai`). The tool only sends the query (no URL fetch), so there's no SSRF surface. Verified the frontend with `vite build` and the backend search path against the public endpoint.	2026-06-25 12:12:28 +08:00
Lynn	dc4b82523b	Feat: tenant llm provider (#14595 ) ### What problem does this PR solve? Python implementation of the Go-based model_provider API suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: bill <yibie_jingnian@163.com>	2026-05-29 17:39:41 +08:00
nickmopen	43cbfd447a	Fix: ExeSQL node continues on per-statement SQL errors (#15140 ) Wrap per-statement execution in both the generic and IBM DB2 loops so a failing statement reports a friendly "SQL Execution Failed" message and continues, instead of letting a raw driver exception abort the node and discard results from statements that already succeeded. Rolls back after a failure so PostgreSQL's aborted-transaction state does not cascade into every subsequent statement in the batch. ### What problem does this PR solve? Closes #14737 The ExeSQL agent node splits its input on `;` and runs each statement in a loop. Both execution loops — the generic one (`cursor.execute`) and the IBM DB2 one (`ibm_db.exec_immediate`) — were wrapped only in a `try/finally` for resource cleanup, with no `except` around statement execution. As a result, when any single statement failed (e.g. the reporter's MSSQL `('42S02', "[42S02] ... 对象名 'ASSET_AUDIT' 无效")`): - The raw, unformatted driver exception bubbled up and the node failed with an ugly `_ERROR` instead of friendly information. - The whole node aborted — results from statements that had already succeeded were discarded, and the remaining statements in the batch never ran. The reporter confirmed this was the real pain point: "after reporting an exception, the previous normal query cannot be executed properly … Do not interrupt the workflow for any issues." Connection-level failures were already wrapped with a friendly `"Database Connection Failed!"` prefix — only per-statement execution errors were missed. This PR wraps per-statement execution in `try/except` in both loops. A failing statement now: - records a friendly `SQL Execution Failed: <sql>\n<error>` entry into the `json` and `formalized_content` outputs (the actual DB error is kept so the user can see what failed), and - `continue`s to the next statement — so earlier results survive and later statements still run. 
After a failure in the generic loop, the connection is rolled back so PostgreSQL's aborted-transaction state does not cascade into every subsequent statement in the batch. The node returns normally (no `_ERROR` raised), so the agent workflow proceeds instead of halting. Connection failures remain fatal (correct — nothing can run without a connection). The pre-existing `break` on `cursor.rowcount == 0` is intentionally left unchanged; it is out of scope for this fix. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-27 16:37:14 +08:00
Magicbook1108	b28e134944	Feat: add local & ssh provider in admin panel (#15039 ) ### What problem does this PR solve? Feat: add local & ssh provider in admin panel ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-20 16:56:20 +08:00
plind	7edabdf7c3	fix(retrieval): keep manual metadata filter reusable inside Iteration (#14849) ## What problem does this PR solve? Closes #12582. When a Retrieval component sits inside an Iteration with a manual metadata filter that references the iteration variable (e.g. `{IterationItem:abc@item}`), every iteration reuses the value resolved on the first pass. Root cause: [`_resolve_manual_filter` in `agent/tools/retrieval.py`](https://github.com/infiniflow/ragflow/blob/main/agent/tools/retrieval.py#L144-L171) mutated `flt["value"]` in place. The `filters` list passed in is the live `self._param.meta_data_filter["manual"]` (see [`apply_meta_data_filter` in `common/metadata_utils.py:257-261`](https://github.com/infiniflow/ragflow/blob/main/common/metadata_utils.py#L257-L261)), so after the first iteration the param dict permanently held the resolved string instead of the original variable reference.
```text
iter #1: flt["value"] = "{IterationItem:abc@item}"  → resolved to "AI"
         after mutation: flt["value"] = "AI"        ← written back into _param
iter #2: flt["value"] = "AI"                        ← no {…} matches
         retrieval keeps filtering by "AI" forever
```
This PR returns a shallow copy with the resolved value instead, leaving the original filter (and its variable reference) intact for the next iteration. ## Type of change - [x] Bug fix (non-breaking change which fixes an issue) ## Test plan - [ ] Build an agent: `Agent (structured output → list of areas) → Iteration → Retrieval (manual filter: Area = {IterationItem/Item}) → Message`. Run with a multi-area query and confirm each iteration's Retrieval result matches its own item, not the first item. - [ ] Regression: Retrieval with a manual metadata filter outside an Iteration still resolves the variable correctly on each request. - [ ] Regression: Retrieval with no metadata filter and with `auto` / `semi_auto` filters behaves unchanged.	2026-05-19 15:08:31 +08:00
wdeveloper16	14c0985182	feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767) Closes #14753 ## What changed
| File | Change |
|---|---|
| `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` |
| `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` |
| `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix legs |
| `CLAUDE.md` | dev setup command + requirements note updated |
| `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` → `from enum import StrEnum` |
| `agent/tools/code_exec.py` | same |
`StrEnum` has been in the stdlib since Python 3.11, so the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - "graspologic hard-blocks 3.13": the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - "free-threading throughput gains for I/O-bound workload": Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.	2026-05-15 14:40:53 +08:00
yingjianzh	4c68a6b86c	fix(agent): pass top_k and fix similarity weight slider behavior (#14760 ) ### What problem does this PR solve? This PR fixes two issues in Agent Retrieval behavior and configuration UX: 1. `top_k` configured in Agent Retrieval was not passed down to the backend retriever call, so retrieval could ignore the configured vector recall limit. 2. Similarity weight slider semantics were confusing in Agent forms because the Agent field stores `keywords_similarity_weight` while UI interactions were interpreted as vector weight. This could cause displayed values and actual behavior to diverge. This PR ensures Agent retrieval uses configured `top_k`, and makes the slider behavior consistent and explicit for both vector and keyword weight modes. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-15 10:49:14 +08:00
eviaaaaa	63df01fe3f	fix(agent): handle duplicate MCP tool names (#14217 ) ### What problem does this PR solve? When multiple MCP servers expose tools with the same name, the agent currently registers those tools using their original MCP names. This can lead to two issues: - later MCP tools may overwrite earlier ones in the agent tool map - duplicate function names may be exposed to the LLM This PR fixes duplicate MCP tool-name handling by applying the same indexed naming strategy already used for native agent tools. Native tools are exposed with generated names such as `<tool_name>_<index>` to avoid collisions, and MCP tools now follow the same convention for consistency. Specifically, this PR: - assigns unique indexed function names to MCP tools exposed to the LLM - preserves each MCP tool's original server-side name in an `MCPToolBinding` - dispatches MCP calls using the original MCP tool name while keeping the indexed name in the agent tool map - allows MCP metadata conversion to override only the OpenAI function name without modifying the original MCP tool metadata ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Validation The validation was performed using two MCP servers. Both servers exposed a tool with the same name: `mcp0`. Both tools take no input parameters. MCP Server One: <img width="1780" height="625" alt="ONE" src="https://github.com/user-attachments/assets/801a2654-fc10-4b71-b31c-81841fd40c55" /> MCP Server Two: <img width="1777" height="624" alt="Second" src="https://github.com/user-attachments/assets/c095151d-7bdf-47c8-9bfe-6aaf4a01b944" /> Before the fix: When invoking `mcp0`, only the `mcp0` tool from the MCP server injected later could be called successfully. As shown below, both `mcp0` tools were present, but only the later-registered one was actually invokable. 
<img width="694" height="935" alt="Three" src="https://github.com/user-attachments/assets/3b9d7ab2-1765-492c-b8e0-bf05a69933ca" /> After the fix: Both `mcp0` tools can now be invoked correctly. <img width="737" height="1095" alt="F" src="https://github.com/user-attachments/assets/6e896627-2b7f-41bb-becc-daa0c73ff58f" /> <img width="730" height="1090" alt="six" src="https://github.com/user-attachments/assets/aba75593-26ae-4e3b-951d-b45ff177fd32" />	2026-05-14 15:28:39 +08:00
buua436	daf8a58c4b	Fix: add codeexec attachments output (#14787 ) ### What problem does this PR solve? add codeexec attachments output ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-11 19:16:33 +08:00
box4wangjing	292b0b8bce	chore: fix some comments to improve readability (#14756 ) ### What problem does this PR solve? fix some comments to improve readability ### Type of change - [x] Documentation Update --------- Signed-off-by: box4wangjing <box4wangjing@outlook.com>	2026-05-11 16:48:48 +08:00
sxxtony	59c35100c5	Perf: push metadata filters down to Elasticsearch (#14576 ) ### What problem does this PR solve? Fixes #14412. `common.metadata_utils.meta_filter` evaluates user-defined metadata conditions in Python after `DocMetadataService.get_flatted_meta_by_kbs` loads the entire `meta_fields` table into memory. Past a few thousand documents per knowledge base this becomes a memory bottleneck and a wasted ES round-trip — every filter request currently fetches up to 10000 metadata rows even when the resulting `doc_ids` list is tiny. This PR adds an ES push-down path that translates the same filter language into a `bool` query and returns just the matching document IDs. Changes - `common/metadata_es_filter.py` (new): pure-Python translator from the RAGflow filter list to ES DSL. Covers every operator the in-memory path supports (`=`, `≠`, `>`, `<`, `≥`, `≤`, `in`, `not in`, `contains`, `not contains`, `start with`, `end with`, `empty`, `not empty`) with `case_insensitive: true` on `prefix` and `wildcard` for parity with the existing lower-cased Python comparisons. User wildcard metacharacters are escaped before being injected into `wildcard` patterns. Negative operators (`≠`, `not in`, `not contains`, ranges) are wrapped with an `exists` guard so they do not accidentally match documents missing the key, matching the legacy `if k not in metas` behaviour. - `api/db/services/doc_metadata_service.py`: new `DocMetadataService.filter_doc_ids_by_meta_pushdown(kb_ids, filters, logic)` that returns the doc IDs ES matched, or `None` to signal the caller should fall back to the in-memory path. Returns `None` when the active doc store is Infinity (`meta_fields` is a JSON column, not a dotted-object mapping), when any filter cannot be expressed in DSL (`UnsupportedMetaFilter`), or when the ES request or metadata index lookup errors. - `common/metadata_utils.py`: `apply_meta_data_filter` accepts an optional `kb_ids` argument. 
When supplied, conditions go through push-down first via a new `_try_meta_pushdown` helper; on `None` the function falls back to the original `meta_filter` call. Default behaviour is unchanged for callers that don't pass `kb_ids`. - Updated all four callers (`agent/tools/retrieval.py`, `api/db/services/dialog_service.py` ×2, `api/apps/services/dataset_api_service.py`, `api/apps/sdk/session.py`) to forward `kb_ids` so the push-down path is exercised in production. - `test/unit_test/common/test_metadata_es_filter.py` (new): 35 unit tests covering every operator's DSL shape, value coercion (`ast.literal_eval`, lowercasing, ISO-date pass-through), wildcard escaping, OR-logic wrapping that protects negative clauses, and the doc-ID extractor. Behaviour preserved - The in-memory `meta_filter` is untouched and still services every fallback case (Infinity backend, unknown operators, ES outages). - The eligibility / credibility / issue-multiplier semantics described in the LLM-driven `auto` and `semi_auto` modes still hand the LLM the full in-memory `metas` dict to choose conditions from. Only the evaluation of those generated conditions is pushed down. - Existing tests in `test/unit_test/common/test_metadata_filter_operators.py` continue to pass (14/14). Test plan - `pytest test/unit_test/common/test_metadata_es_filter.py` — 35 passed. - `pytest test/unit_test/common/test_metadata_filter_operators.py` — 14 passed. - `ruff check` clean on every modified file. - Reviewer please validate the ES query shapes against a live cluster — particularly `case_insensitive` on `wildcard` and `prefix` (requires ES 7.10+) and the `exists` + `must_not` pairing for `≠`. Notes - The first cut caps each push-down request at 10000 results, matching the existing `get_flatted_meta_by_kbs` limit, and logs a warning when the cap is hit. A `search_after` follow-up would let us drop the cap entirely once the push-down path is validated. 
- Operator parity with the in-memory path is exact for the canonical unicode operators (`≥`, `≤`, `≠`) used internally; the ASCII aliases (`>=`, `<=`, `!=`) are normalised by `convert_conditions` before they reach the translator. ### Type of change - [x] Performance Improvement --------- Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>	2026-05-07 21:23:43 +08:00
Magicbook1108	c29335cbff	Feat: support local provider for code exec component & remove some outdated models (#14637 ) ### What problem does this PR solve? Feat: support local provider for code exec component & remove some outdated models ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-07 21:23:13 +08:00
buua436	e6e80041f5	Fix: agent toolcall null response & schema validation & DeepSeek think history (#14425 ) ### What problem does this PR solve? agent toolcall null response & schema validation & DeepSeek think history ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-28 17:09:08 +08:00
Jack	290f0294d6	Refactor: migrate artifact API (#14348 ) ### What problem does this PR solve? Before migration: GET /v1/document/artifact/<filename> After migration: GET /api/v1/documents/artifact/<filename> ### Type of change - [x] Refactoring	2026-04-27 15:19:41 +08:00
Xing Hong	fb95136f39	Fix: validate URL scheme and resolved IP before crawling to prevent SSRF (#14090 ) ### What problem does this PR solve? The POST /upload_info?url=<url> endpoint accepted a user-supplied URL and passed it directly to AsyncWebCrawler without any validation. There were no restrictions on URL scheme, destination hostname, or resolved IP address. This allowed any authenticated user to instruct the server to make outbound HTTP requests to internal infrastructure — including RFC 1918 private networks, loopback addresses, and cloud metadata services such as http://169.254.169.254 — effectively using the server as a proxy for internal network reconnaissance or credential theft. This PR adds an SSRF guard (_validate_url_for_crawl) that runs before any crawl is initiated. It enforces an allowlist of safe schemes (http/https), resolves the hostname at validation time, and rejects any URL whose resolved IP falls within a private or reserved network range. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-25 14:30:15 +08:00
akie	3911d90993	Fix: agent application can not show Cite (#14047 ) Close #14018 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Problem In Agent applications, even with the cite option enabled, only inline [ID: x] citation markers are visible (showing chunk content on hover). The Agent does not display the referenced file cards below the response, unlike Chat applications. ### Root Cause The Agent's Retrieval tool (agent/tools/retrieval.py) calls retriever.retrieval() with aggs=False, which means the retrieval results do not include doc_aggs (document aggregation) data. Without doc_aggs, the frontend ReferenceDocumentList component has no data to render the file cards. In contrast, the Chat application (api/db/services/dialog_service.py) calls the same retriever.retrieval() method with aggs=True. ### Fix Changed aggs=False to aggs=True in agent/tools/retrieval.py so that document aggregation data is returned along with the retrieved chunks.	2026-04-13 11:06:14 +08:00
Magicbook1108	9ce293a736	Refact: update exesql notification (#14027 ) ### What problem does this PR solve? Refact: update exesql notification ### Type of change - [x] Refactoring	2026-04-10 13:42:57 +08:00
balibabu	38acf34724	Fix: The agent selected a knowledge base, but the API returned the error: "No dataset is selected". (#13950 ) ### What problem does this PR solve? Fix: The agent selected a knowledge base, but the API returned the error: "No dataset is selected". ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: balibabu <assassin_cike@163.com>	2026-04-07 14:16:37 +08:00
Yongteng Lei	112007243d	Refa: refine code_exec component (#13925 ) ### What problem does this PR solve? Refine code_exec component. ### Type of change - [x] Refactoring	2026-04-07 11:48:29 +08:00
Lynn	db57155b30	Fix: get user_id from variables (#13716) ### What problem does this PR solve? Get `user_id` from the canvas variables when the input value is a `{}` pattern. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-20 23:39:34 +08:00
Yongteng Lei	dd839f30e8	Fix: code supports matplotlib (#13724 ) ### What problem does this PR solve? Code as "final" node: ![img_v3_02vs_aece4caf-8403-4939-9e68-9845a22c2cfg](https://github.com/user-attachments/assets/9d87b8df-da6b-401c-bf6d-8b807fe92c22) Code as "mid" node: ![img_v3_02vv_f74f331f-d755-44ab-a18c-96fff8cbd34g](https://github.com/user-attachments/assets/c94ef3f9-2a6c-47cb-9d2b-19703d2752e4) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-20 20:32:00 +08:00
Lynn	02070bab2a	Feat: record user_id in memory (#13585 ) ### What problem does this PR solve? Get user_id from canvas and record it. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-13 15:38:35 +08:00
Eden	ab6ca75245	fix(agent): ensure database connections are properly closed in ExeSQL tool (#13427 ) ## Summary Fix a database connection and cursor resource leak in the ExeSQL agent tool. When SQL execution raises an exception (for example syntax error or missing table), the existing code path skips `cursor.close()` and `db.close()`, causing database connections to accumulate over time. This can eventually lead to connection exhaustion in long-running agent workflows. ## Root Cause The cleanup logic for database cursors and connections is placed after the SQL execution loop without `try/finally` protection. If an exception occurs during `cursor.execute()`, `fetchmany()`, or result processing, the cleanup code is not reached and the connection remains open. The same issue also exists in the IBM DB2 execution path where `ibm_db.close(conn)` may be skipped when exceptions occur. ## Fix - Wrap SQL execution logic in `try/finally` blocks to guarantee resource cleanup. - Ensure `cursor.close()` and `db.close()` are always executed. - Add explicit `db.close()` when `db.cursor()` creation fails. - Remove redundant close calls in early-return branches since `finally` now handles cleanup. ## Impact - No change to normal execution behavior. - Ensures database resources are always released when errors occur. - Prevents connection leaks in long-running workflows. - Only affects `agent/tools/exesql.py`. ## Testing Manual test scenarios: 1. Valid SQL execution 2. SQL syntax error 3. Query against a non-existing table 4. Execution cancellation during query In all scenarios the database cursor and connection are properly closed. Code quality checks: - `ruff check` passed - No new warnings introduced	2026-03-09 10:36:02 +08:00
Lynn	62cb292635	Feat/tenant model (#13072) ### What problem does this PR solve? Add an `id` to the `tenant_llm` table and apply it in `LLMBundle`. ### Type of change - [x] Refactoring --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Liu An <asiro@qq.com>	2026-03-05 17:27:17 +08:00
Magicbook1108	1aa49a11f0	Feat: support AWS SES smtp (#13195 ) ### What problem does this PR solve? Support AWS SES smtp #13179 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-02-26 13:49:53 +08:00
Magicbook1108	98e1d5aa5c	Refact: switch from google-generativeai to google-genai (#13140) ### What problem does this PR solve? Refact: switch from google-generativeai to google-genai #13132 Refact: comment out unused pywencai. ### Type of change - [x] Refactoring	2026-02-24 10:28:33 +08:00
Carve_	ee23b9eb63	feature:Add OceanBase Support to Text-to-SQL Agent (#12919) ### What problem does this PR solve? Close #12768. This PR adds OceanBase support to RAGFlow’s Text-to-SQL (ExeSQL) component. OceanBase is integrated via MySQL compatibility mode, and the UI `db_type` options are updated accordingly. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Changes Backend - Add `oceanbase` `db_type` validation and connection logic in `exesql.py` and reuse existing MySQL compatibility mode Frontend - Add OceanBase option to the ExeSQL `db_type` selector ### How to test 1. Configure OceanBase connection in ExeSQL node (host/port/user/password/database) 2. Input: “Show 10 rows from test table” 3. Generated SQL: `SELECT * FROM test LIMIT 10;` 4. Query executes successfully and results are returned ### Screenshots - ExeSQL db_type includes OceanBase - ExeSQL test OceanBase connection - Query results from OceanBase shown in UI	2026-01-31 15:03:40 +08:00
eviaaaaa	c59ae4c7c2	Fix: codeExec return types & error handling; Update Spark model mappings (#12896 ) ## What problem does this PR solve? This PR addresses three specific issues to improve agent reliability and model support: 1. `codeExec` Output Limitation: Previously, the `codeExec` tool was strictly limited to returning `string` types. I updated the output constraint to `object` to support structured data (Dicts, Lists, etc.) required for complex downstream tasks. 2. `codeExec` Error Handling: Improved the execution logic so that when runtime errors occur, the tool captures the exception and returns the error message as the output instead of causing the process to abort or fail silently. 3. Spark Model Configuration: - Added support for the `MAX-32k` model variant. - Fixed the `Spark-Lite` mapping from `general` to `lite` to match the latest API specifications. ## Type of change - [x] Bug Fix (fixes execution logic and model mapping) - [x] New Feature / Enhancement (adds model support and improves tool flexibility) ## Key Changes ### `agent/tools/code_exec.py` - Changed the output type definition from `string` to `object`. - Refactored the execution flow to gracefully catch exceptions and return error messages as part of the tool output. ### `rag/llm/chat_model.py` - Added `"Spark-Max-32K": "max-32k"` to the model list. - Updated `"Spark-Lite"` value from `"general"` to `"lite"`. ## Checklist - [x] My code follows the style guidelines of this project. - [x] I have performed a self-review of my own code. Signed-off-by: evilhero <2278596667@qq.com>	2026-01-29 19:22:35 +08:00
qinling0210	9a5208976c	Put document metadata in ES/Infinity (#12826 ) ### What problem does this PR solve? Put document metadata in ES/Infinity. Index name of meta data: ragflow_doc_meta_{tenant_id} ### Type of change - [x] Refactoring	2026-01-28 13:29:34 +08:00
Zhichang Yu	fd11aca8e5	feat: Implement pluggable multi-provider sandbox architecture (#12820 ) ## Summary Implement a flexible sandbox provider system supporting both self-managed (Docker) and SaaS (Aliyun Code Interpreter) backends for secure code execution in agent workflows. Key Changes: - ✅ Aliyun Code Interpreter provider using official `agentrun-sdk>=0.0.16` - ✅ Self-managed provider with gVisor (runsc) security - ✅ Arguments parameter support for dynamic code execution - ✅ Database-only configuration (removed fallback logic) - ✅ Configuration scripts for quick setup Issue #12479 ## Features ### 🔌 Provider Abstraction Layer 1. Self-Managed Provider (`agent/sandbox/providers/self_managed.py`) - Wraps existing executor_manager HTTP API - gVisor (runsc) for secure container isolation - Configurable pool size, timeout, retry logic - Languages: Python, Node.js, JavaScript - ⚠️ Requires: gVisor installation, Docker, base images 2. Aliyun Code Interpreter (`agent/sandbox/providers/aliyun_codeinterpreter.py`) - SaaS integration using official agentrun-sdk - Serverless microVM execution with auto-authentication - Hard timeout: 30 seconds max - Credentials: `AGENTRUN_ACCESS_KEY_ID`, `AGENTRUN_ACCESS_KEY_SECRET`, `AGENTRUN_ACCOUNT_ID`, `AGENTRUN_REGION` - Automatically wraps code to call `main()` function 3. E2B Provider (`agent/sandbox/providers/e2b.py`) - Placeholder for future integration ### ⚙️ Configuration System - `conf/system_settings.json`: Default provider = `aliyun_codeinterpreter` - `agent/sandbox/client.py`: Enforces database-only configuration - Admin UI: `/admin/sandbox-settings` - Configuration validation via `validate_config()` method - Health checks for all providers ### 🎯 Key Capabilities Arguments Parameter Support: All providers support passing arguments to `main()` function: ```python # User code def main(name: str, count: int) -> dict: return {"message": f"Hello {name}!" 
* count} # Executed with: arguments={"name": "World", "count": 3} # Result: {"message": "Hello World!Hello World!Hello World!"} ``` Self-Describing Providers: Each provider implements `get_config_schema()` returning form configuration for Admin UI Error Handling: Structured `ExecutionResult` with stdout, stderr, exit_code, execution_time ## Configuration Scripts Two scripts for quick Aliyun sandbox setup: Shell Script (requires jq): ```bash source scripts/configure_aliyun_sandbox.sh ``` Python Script (interactive): ```bash python3 scripts/configure_aliyun_sandbox.py ``` ## Testing ```bash # Unit tests uv run pytest agent/sandbox/tests/test_providers.py -v # Aliyun provider tests uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter.py -v # Integration tests (requires credentials) uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py -v # Quick SDK validation python3 agent/sandbox/tests/verify_sdk.py ``` Test Coverage: - 30 unit tests for provider abstraction - Provider-specific tests for Aliyun - Integration tests with real API - Security tests for executor_manager ## Documentation - `docs/develop/sandbox_spec.md` - Complete architecture specification - `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration from legacy sandbox - `agent/sandbox/tests/QUICKSTART.md` - Quick start guide - `agent/sandbox/tests/README.md` - Testing documentation ## Breaking Changes ⚠️ Migration Required: 1. Directory Move: `sandbox/` → `agent/sandbox/` - Update imports: `from sandbox.` → `from agent.sandbox.` 2. Mandatory Configuration: - SystemSettings must have `sandbox.provider_type` configured - Removed fallback default values - Configuration must exist in database (from `conf/system_settings.json`) 3. Aliyun Credentials: - Requires `AGENTRUN_` environment variables (not `ALIYUN_`) - `AGENTRUN_ACCOUNT_ID` is now required (Aliyun primary account ID) 4. 
Self-Managed Provider: - gVisor (runsc) must be installed for security - Install: `go install gvisor.dev/gvisor/runsc@latest` ## Database Schema Changes ```python # SystemSettings.value: CharField → TextField api/db/db_models.py: Changed for unlimited config length # SystemSettingsService.get_by_name(): Fixed query precision api/db/services/system_settings_service.py: startswith → exact match ``` ## Files Changed ### Backend (Python) - `agent/sandbox/providers/base.py` - SandboxProvider ABC interface - `agent/sandbox/providers/manager.py` - ProviderManager - `agent/sandbox/providers/self_managed.py` - Self-managed provider - `agent/sandbox/providers/aliyun_codeinterpreter.py` - Aliyun provider - `agent/sandbox/providers/e2b.py` - E2B provider (placeholder) - `agent/sandbox/client.py` - Unified client (enforces DB-only config) - `agent/tools/code_exec.py` - Updated to use provider system - `admin/server/services.py` - SandboxMgr with registry & validation - `admin/server/routes.py` - 5 sandbox API endpoints - `conf/system_settings.json` - Default: aliyun_codeinterpreter - `api/db/db_models.py` - TextField for SystemSettings.value - `api/db/services/system_settings_service.py` - Exact match query ### Frontend (TypeScript/React) - `web/src/pages/admin/sandbox-settings.tsx` - Settings UI - `web/src/services/admin-service.ts` - Sandbox service functions - `web/src/services/admin.service.d.ts` - Type definitions - `web/src/utils/api.ts` - Sandbox API endpoints ### Documentation - `docs/develop/sandbox_spec.md` - Architecture spec - `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration guide - `agent/sandbox/tests/QUICKSTART.md` - Quick start - `agent/sandbox/tests/README.md` - Testing guide ### Configuration Scripts - `scripts/configure_aliyun_sandbox.sh` - Shell script (jq) - `scripts/configure_aliyun_sandbox.py` - Python script ### Tests - `agent/sandbox/tests/test_providers.py` - 30 unit tests - `agent/sandbox/tests/test_aliyun_codeinterpreter.py` - Provider tests - 
`agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py` - Integration tests - `agent/sandbox/tests/verify_sdk.py` - SDK validation ## Architecture ``` Admin UI → Admin API → SandboxMgr → ProviderManager → [SelfManaged\|Aliyun\|E2B] ↓ SystemSettings ``` ## Usage ### 1. Configure Provider Via Admin UI: 1. Navigate to `/admin/sandbox-settings` 2. Select provider (Aliyun Code Interpreter / Self-Managed) 3. Fill in configuration 4. Click "Test Connection" to verify 5. Click "Save" to apply Via Configuration Scripts: ```bash # Aliyun provider export AGENTRUN_ACCESS_KEY_ID="xxx" export AGENTRUN_ACCESS_KEY_SECRET="yyy" export AGENTRUN_ACCOUNT_ID="zzz" export AGENTRUN_REGION="cn-shanghai" source scripts/configure_aliyun_sandbox.sh ``` ### 2. Restart Service ```bash cd docker docker compose restart ragflow-server ``` ### 3. Execute Code in Agent ```python from agent.sandbox.client import execute_code result = execute_code( code='def main(name: str) -> dict: return {"message": f"Hello {name}!"}', language="python", timeout=30, arguments={"name": "World"} ) print(result.stdout) # {"message": "Hello World!"} ``` ## Troubleshooting ### "Container pool is busy" (Self-Managed) - Cause: Pool exhausted (default: 1 container in `.env`) - Fix: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` to 5+ ### "Sandbox provider type not configured" - Cause: Database missing configuration - Fix: Run config script or set via Admin UI ### "gVisor not found" - Cause: runsc not installed - Fix: `go install gvisor.dev/gvisor/runsc@latest && sudo cp ~/go/bin/runsc /usr/local/bin/` ### Aliyun authentication errors - Cause: Wrong environment variable names - Fix: Use `AGENTRUN_` prefix (not `ALIYUN_`) ## Checklist - [x] All tests passing (30 unit tests + integration tests) - [x] Documentation updated (spec, migration guide, quickstart) - [x] Type definitions added (TypeScript) - [x] Admin UI implemented - [x] Configuration validation - [x] Health checks implemented - [x] Error handling with 
structured results - [x] Breaking changes documented - [x] Configuration scripts created - [x] gVisor requirements documented Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-28 13:28:21 +08:00
Kevin Hu	927db0b373	Refa: asyncio.to_thread to ThreadPoolExecutor to break thread limitat… (#12716 ) ### Type of change - [x] Refactoring	2026-01-20 13:29:37 +08:00
E.G	9da48ab0bd	fix: Handle NaN/Infinity values in ExeSQL JSON response (#12666 ) ## Summary Fixes #12631 When SQL query results contain NaN (Not a Number) or Infinity values (e.g., from division by zero or other calculations), the JSON serialization would fail because NaN and Infinity are not valid JSON values. This caused the agent interface to show an 'undefined' error, as described in the issue where `EXAMINE_TIMES` became `NaN` and broke the JSON parsing. ## Root Cause The `convert_decimals` function in `exesql.py` was only handling `Decimal` types, but not `float` values that could be `NaN` or `Infinity`. When these invalid JSON values were serialized: ```json {"EXAMINE_TIMES": NaN} // Invalid JSON! ``` The frontend JSON parser would fail, causing the 'undefined' error. ## Solution Extended `convert_decimals` to detect `float` values and convert `NaN`/`Infinity` to `null` before JSON serialization: ```python if isinstance(obj, float): if math.isnan(obj) or math.isinf(obj): return None return obj ``` This ensures all SQL results can be properly serialized to valid JSON. --- This is a Gittensor contribution. gittensor:user:GlobalStar117 Co-authored-by: GlobalStar117 <GlobalStar117@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-01-19 12:46:06 +08:00
Kevin Hu	9a10558f80	Refa: async retrieval process. (#12629 ) ### Type of change - [x] Refactoring - [x] Performance Improvement	2026-01-15 12:28:49 +08:00
Kevin Hu	23a9544b73	Fix: toc async issue. (#12485 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-07 15:35:30 +08:00
Kevin Hu	461c81e14a	Fix: KG search issue. (#12364 ) ### What problem does this PR solve? Close #12347 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-31 14:40:27 +08:00
Lynn	7498bc63a3	Fix: judge retrieval from (#12223 ) ### What problem does this PR solve? Judge retrieval from in retrieval component, and fix bug in message component ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-26 13:01:46 +08:00
Lynn	6e9691a419	Feat: message manage (#12196 ) ### What problem does this PR solve? Manage messages and use them in agents. Issue #4213 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-25 21:18:13 +08:00
Yongteng Lei	0f0fb53256	Refa: refactor metadata filter (#11907 ) ### What problem does this PR solve? Refactor metadata filter. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-12 17:12:38 +08:00
Kevin Hu	ea4a5cd665	Fix: tokenizer issue. (#11902 ) #11786 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-11 17:38:17 +08:00
TeslaZY	c610bb605a	Added semi-automatic mode to the metadata filter (#11886 ) ### What problem does this PR solve? Retrieval metadata filtering gains a semi-automatic mode, in which users manually select the metadata keys that are passed to the LLM to generate filter conditions. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 10:45:21 +08:00
Jin Hai	43f51baa96	Fix errors (#11804 ) ### What problem does this PR solve? 1. typos 2. grammar errors. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-08 12:21:18 +08:00
Yongteng Lei	27b0550876	Refa: cleanup synchronous functions in agent_with_tools (#11736 ) ### What problem does this PR solve? Cleanup synchronous functions in agent_with_tools. ### Type of change - [x] Refactoring	2025-12-04 14:15:05 +08:00
Yongteng Lei	e3f40db963	Refa: make RAGFlow more asynchronous 2 (#11689 ) ### What problem does this PR solve? Make RAGFlow more asynchronous 2. #11551, #11579, #11619. ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-12-03 14:19:53 +08:00
Kevin Hu	b5ad7b7062	Feat: support TOC transformer. (#11685 ) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-03 12:27:50 +08:00
Kevin Hu	a6681d6366	Revert "Refa: make RAGFlow more asynchronous 2" (#11669 ) Reverts infiniflow/ragflow#11664	2025-12-02 19:42:05 +08:00
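
Note on commit 9da48ab0bd (NaN/Infinity handling in ExeSQL): the message quotes only the `float` branch of the fix. The sketch below shows how such a check might sit inside a recursive sanitizer run before JSON serialization. It is an illustration under stated assumptions, not the code in `exesql.py`: the function name `sanitize_for_json`, the conversion of `Decimal` to `float`, and the sample row are all hypothetical.

```python
import json
import math
from decimal import Decimal


def sanitize_for_json(obj):
    """Recursively replace values that strict JSON cannot represent.

    Hypothetical helper; the real convert_decimals in exesql.py may differ.
    """
    if isinstance(obj, Decimal):
        # Assumption: Decimals become floats, then go through the NaN check below.
        obj = float(obj)
    if isinstance(obj, float):
        # This is the branch quoted in the commit message.
        if math.isnan(obj) or math.isinf(obj):
            return None
        return obj
    if isinstance(obj, dict):
        return {key: sanitize_for_json(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_for_json(value) for value in obj]
    return obj


# Made-up query result containing the problem values.
rows = [{"EXAMINE_TIMES": float("nan"), "RATIO": Decimal("1.5"), "PEAK": float("inf")}]

# allow_nan=False makes json.dumps raise if any NaN/Infinity slipped through.
payload = json.dumps(sanitize_for_json(rows), allow_nan=False)
print(payload)  # [{"EXAMINE_TIMES": null, "RATIO": 1.5, "PEAK": null}]
```

Passing `allow_nan=False` is a design choice for the sketch: by default Python's `json.dumps` emits the bare tokens `NaN` and `Infinity`, which strict JSON parsers reject, so the strict flag turns any missed value into an immediate error on the server side.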


93 Commits