Commit Graph

7158 Commits

Author SHA1 Message Date
monsterDavid
7da4f200e5 fix(agent): enable MCP file preview via doc_id (#15399)
## Summary
MCP-wrapped agents could only force-download files looked up by
`doc_id`. This adds an explicit preview path and inline response headers
for previewable file types.

- **New** `GET /api/v1/agents/attachments/{attachment_id}/preview` —
inline preview for PDFs, images, and other safe types (pass `ext` and/or
`mime_type`)
- **Improved** `GET /api/v1/documents/{doc_id}/preview` — sets inline
disposition using the document filename
- **Improved** attachment download routing — resolves `mime_type` /
`ext` query params (no default `markdown`), supports
`disposition=inline`
- **DocGenerator output** — includes URL-encoded `preview_url` for MCP
clients
- **Legacy `/document/download/...` aliases** — still use download
semantics; MCP clients should call `/preview` explicitly
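
The inline-versus-download distinction above can be sketched as follows. This is a minimal illustration, not the PR's code: the `content_disposition` helper and the `PREVIEWABLE_MIME_TYPES` allow-list are hypothetical names, and the exact set of "safe types" is an assumption.

```python
from urllib.parse import quote

# Hypothetical allow-list; the PR only says "PDFs, images, and other safe types".
PREVIEWABLE_MIME_TYPES = {"application/pdf", "image/png", "image/jpeg", "image/gif", "text/plain"}


def content_disposition(filename: str, mime_type: str, disposition: str = "attachment") -> str:
    """Build a Content-Disposition value, falling back to download for unsafe types."""
    if disposition == "inline" and mime_type not in PREVIEWABLE_MIME_TYPES:
        disposition = "attachment"
    # RFC 5987 style parameter: percent-encode the UTF-8 filename.
    return f"{disposition}; filename*=UTF-8''{quote(filename, safe='')}"


print(content_disposition("report 2026.pdf", "application/pdf", "inline"))
print(content_disposition("page.html", "text/html", "inline"))
```

Under these assumptions a PDF requested with `disposition=inline` is served inline, while a type outside the allow-list keeps download semantics.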

Fixes #15398

## Test plan
- [x] `pytest test/unit_test/api/utils/test_file_response_headers.py`
(6/6)

---------

Co-authored-by: MkDev11 <mkdev11@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ling Qin <qinling0210@163.com>
2026-07-03 19:56:01 +08:00
maoyifeng
0f4f2135f3 Go:cli move _order _columns sort group (#16615)
### Summary
1. Move common functions to format.go
2. Replace spaces in show names with _
3. Move the _order and _columns column sort group
4. Add an empty DAO enterprise file
2026-07-03 19:37:53 +08:00
Jin Hai
6b571694df Go: Update error info (#16619)
As title.

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-03 19:37:25 +08:00
S
1861087787 fix(agent): defend against @ in var names at all template-split sites (#16469)
## Summary

While fixing #16467 (IterationItem crash on `@` in user-defined output
keys), an audit of `agent/**/*.py` revealed **three additional sites**
with the same vulnerability. This PR hardens all of them with
`maxsplit=1` and adds regression tests.

This is **defense-in-depth hardening**, not a behavior change. The
current `variable_ref_patt` regex constrains `var_nm` to
`[A-Za-z0-9_.-]+`, so single-`@` templates resolve exactly as before.
The `maxsplit=1` only kicks in if the trailing side itself contains `@`
— currently unreachable from the public DSL surface, but trivially
exploitable the moment a user-defined output key happens to contain `@`
(e.g. `user@email`) or the regex is ever relaxed.

> **Note on issue scope**: The primary fix for #16467 (the
`list_tenant_added_models` `ValueError` crash on `@` in model names) is
in PR #16468. This PR is a **follow-up hardening sweep** of the same
vulnerability class found in `agent/` during that audit; it does not
duplicate or replace #16468.

## Sites hardened

| File | Line | Method |
|------|------|--------|
| `agent/canvas.py` | 206 | `Graph.get_variable_value` |
| `agent/canvas.py` | 256 | `Graph.set_variable_value` |
| `agent/component/base.py` | 533 | `ComponentBase.get_input_elements_from_text` |
| `agent/component/iterationitem.py` | 88 | `IterationItem.output_collation` |

All now use `split("@", 1)` with an inline comment explaining the
rationale. The trailing side keeps any embedded `@`.
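
A minimal sketch of the difference, assuming a reference of the form `component@variable`. The function names and the `cpn_id` variable are hypothetical and are not taken from `agent/`:

```python
def parse_ref_naive(ref: str) -> tuple[str, str]:
    # Pre-fix shape: unpacking fails when the reference holds more than one "@".
    cpn_id, var_nm = ref.split("@")
    return cpn_id, var_nm


def parse_ref_hardened(ref: str) -> tuple[str, str]:
    # Split on the first "@" only; the trailing side keeps any embedded "@".
    cpn_id, var_nm = ref.split("@", 1)
    return cpn_id, var_nm


print(parse_ref_hardened("Iteration:abc@user@email"))

try:
    parse_ref_naive("Iteration:abc@user@email")
except ValueError as exc:
    print("naive split failed:", exc)
```

Single-`@` references give the same result in both versions, which is why the change is hardening rather than a behavior change.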

## Sites already safe (audited but left alone)

| File | Reason safe |
|------|------------|
| `agent/canvas.py:708` (`is_reff`) | Pre-checks `len(arr) != 2` |
| `agent/component/categorize.py` | Uses `rsplit` |
| `agent/component/iteration.py` | Pre-validates via regex |
| Other call sites | `rsplit` or regex pre-validation |

## Regression tests

9 new tests across 2 files, all `pytest.mark.p2`:

| File | Tests |
|------|-------|
| `test/unit_test/agent/test_canvas_at_split.py` | 6: `get_variable_value`, `set_variable_value`, round-trip, single-`@`, missing-component |
| `test/unit_test/agent/component/test_iterationitem_at_split.py` | 3: `output_collation` with `@` in var, single-`@`, non-matching cid |

Each test was **verified to fail with `ValueError: too many values to
unpack (expected 2)`** when the corresponding fix is temporarily
reverted, confirming the tests actually catch the bug rather than just
exercising the happy path.

## Test results

```
9 passed in 0.04s
```

Full agent unit suite also clean (38 passed, 3 skipped; 6 unrelated
pre-existing collection errors from missing `peewee`/`requests` in local
venv — not caused by this PR).

## Related

- Issue: #16467
- Primary fix PR: #16468 (closes the issue)
- This PR: defense-in-depth follow-up, intentionally non-blocking on
#16467

---------

Co-authored-by: skbs-eng <skbs-eng@users.noreply.github.com>
2026-07-03 19:26:27 +08:00
Haruko386
fd7fb6669a fix: cannot get query in agent-log (#16610)
### Summary

As title

bug:


fixed:
<img width="1827" height="1286" alt="image"
src="https://github.com/user-attachments/assets/0cdc391c-43d7-4330-bc34-3aefe5d4f4ee"
/>
2026-07-03 18:56:32 +08:00
Jin Hai
83d09b16ce Fix Go: list providers order issue. (#16616)
### Summary

As title.

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-03 18:27:32 +08:00
Haruko386
dde8b6d54c fix: get team's search in own search-list (#16599)
### Summary

As title:
2026-07-03 18:26:03 +08:00
Haruko386
226d0ff77c fix: get team member's chat (#16597)
### Summary

As title
2026-07-03 18:25:31 +08:00
Haruko386
488574fd80 fix: get all memory in team with permission=me (#16593)
### Summary

As title:
2026-07-03 18:25:04 +08:00
Yingfeng
706fa4e87a Feat: add gbrain compile template for session/memory data (#16613) 2026-07-03 18:22:29 +08:00
qinling0210
ffc4d29a06 Port agent PRs to GO - 3 (#16596)
### Summary

Port
https://github.com/infiniflow/ragflow/pull/16415
https://github.com/infiniflow/ragflow/pull/16417
2026-07-03 18:03:23 +08:00
Yingfeng
8db68e3eec Refactor(harness): remove naive inline graph engine, unify graph execution under single pregel engine (#16608) 2026-07-03 17:50:30 +08:00
Muhammad Furqan
3cba34d67f fix(agent/tools): port Crawler to ToolBase so it can load and run (#16415)
### What problem does this PR solve?

Closes #16414.

The **Crawler** agent tool (`agent/tools/crawler.py`) was never ported
to the modern `ToolBase`/`_invoke` interface during the agent module
redesign, so it was broken in three independent ways:

1. **Crashed on construction.** `CrawlerParam` extends `ToolParamBase`,
whose `__init__` reads `self.meta["parameters"]`, but `CrawlerParam`
defined no `meta`. Constructing it raised `AttributeError:
'CrawlerParam' object has no attribute 'meta'`. Because
`agent/canvas.py` instantiates `component_class(component_name +
"Param")()` while loading a canvas, **any agent containing a Crawler
node failed to load.**
2. **`_invoke` missing.** It extends `ToolBase` (whose `invoke()`
dispatches to `self._invoke`) but only implemented the legacy `_run`, so
`_invoke` resolved to `ComponentBase._invoke` → `NotImplementedError`.
3. **`be_output` removed.** `_run` called `Crawler.be_output(...)`,
which no longer exists on the base classes.
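
The construction crash in item 1 can be reproduced with stand-in classes. This is a sketch under stated assumptions: the `ToolParamBase` below is a simplified stand-in that only mimics reading `self.meta["parameters"]`, and the shape of the `meta` dict is hypothetical.

```python
class ToolParamBase:
    # Stand-in for the real base class: its __init__ reads self.meta["parameters"].
    def __init__(self):
        self.inputs = dict(self.meta["parameters"])


class BrokenCrawlerParam(ToolParamBase):
    def __init__(self):
        super().__init__()  # no meta defined, so the base class raises AttributeError


class PortedCrawlerParam(ToolParamBase):
    def __init__(self):
        # meta is assigned before super().__init__() so the base class can read it.
        self.meta = {
            "name": "crawler",
            "parameters": {"query": {"type": "string", "default": "{sys.query}", "required": True}},
        }
        super().__init__()


try:
    BrokenCrawlerParam()
except AttributeError as exc:
    print("construction failed:", exc)

print(sorted(PortedCrawlerParam().inputs))
```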

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Changes

- Add a `ToolMeta` to `CrawlerParam` (defined before
`super().__init__()`, matching every other ported tool such as
`ArXivParam`/`TavilyExtractParam`) advertising a required `query`
parameter — the URL to crawl, default `{sys.query}`, consistent with the
`{sys.query}` convention shared by the other tools.
- Replace the legacy `_run`/`be_output` with `_invoke`/`set_output`,
writing the extracted page content to `formalized_content` (errors
surfaced via `_ERROR`), consistent with the other tools.
- Preserve the existing SSRF guard (`assert_url_is_safe` +
`pin_dns_global`).
- Add regression tests
(`test/unit_test/agent/component/test_crawler.py`) covering param
construction, validation, and the tool descriptor.

Same class of defect as #16329 (DeepL). Backend-only; no frontend
changes.

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-03 17:15:48 +08:00
Jin Hai
1880e65e99 Go: refactor (#16602)
### Summary

1. update doc
2. refactor route code

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-03 17:00:43 +08:00
chanx
79518973e5 Fix: optimize folder data handling in MoveDialog component (#16580) 2026-07-03 16:13:57 +08:00
euvre
4effd057f0 i18n: localize visual input file label in agent form (#16594) 2026-07-03 15:31:27 +08:00
Jin Hai
a4c370c5ba Go: fix 'list services' (#16598)
### Summary

```
RAGFlow(admin)> list services;
+-----------------------------------------------------------------------------+-----------+----+---------------+------+---------------+-----------+
| extra                                                                       | host      | id | name          | port | service_type  | status    |
+-----------------------------------------------------------------------------+-----------+----+---------------+------+---------------+-----------+
| map[database:1 mq_type:redis password:infini_rag_flow]                      | localhost | 0  | redis         | 6379 | message_queue | alive     |
| map[password:infini_rag_flow retrieval_type:elasticsearch username:elastic] | localhost | 1  | elasticsearch | 1200 | retrieval     | alive     |
|                                                                             | 0.0.0.0   | 2  | nats          | 4222 | message_queue | CONNECTED |
| map[meta_type:mysql password:infini_rag_flow username:root]                 | localhost | 3  | mysql         | 3306 | meta_data     | alive     |
| map[password:infini_rag_flow store_type:minio user:rag_flow]                | localhost | 4  | minio         | 9000 | file_store    | alive     |
+-----------------------------------------------------------------------------+-----------+----+---------------+------+---------------+-----------+

```

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-03 15:08:37 +08:00
euvre
994b603374 fix: prevent duplicate MCP server name when editing (#16588) 2026-07-03 14:30:43 +08:00
euvre
7b341539e7 fix: prevent exporting empty MCP server selection (#16589) 2026-07-03 14:22:17 +08:00
Hz_
ac5d0c4615 fix(go-file): KB counter drift when deleting files with linked documents (#16584)
## Summary

Use `DocumentService.RemoveDocumentKeepFile` when deleting files that
are linked to documents.

## Change

- inject `DocumentService` into `FileService`
- replace direct document deletion in `deleteSingleFile`
- remove the obsolete file-local engine deletion helper

## Result

Deleting a file now cleans up linked documents through the same service
path used elsewhere, keeping KB counters and document engine cleanup
consistent.
2026-07-03 14:07:54 +08:00
Haruko386
ee942711c4 fix: unable to fetch tools for MCP (#16583) 2026-07-03 14:05:42 +08:00
Haruko386
b2e4740acd fix: unable to import mcp from local (#16590)
### Summary

As title
2026-07-03 14:05:07 +08:00
Haruko386
383d059969 fix: agent chat completions cannot be used (#16570)
### Summary

As title
<img width="2370" height="2039" alt="image"
src="https://github.com/user-attachments/assets/4cccf543-3908-49ee-8101-c5068fbf53ec"
/>
2026-07-03 13:25:14 +08:00
euvre
e65bac238e fix: preserve existing links when bulk linking files to knowledge bases (#16587) 2026-07-03 13:17:19 +08:00
Wang Qi
6a4b9be426 Refactor: reformat all code for lefthook using ruff and gofmt (#16585) dev-20260703-2 2026-07-03 12:53:39 +08:00
Yingfeng
19fcb4a981 Fix harness DAG slow-branch test caused by nil initialization of pregel engine (#16591) dev-20260703 2026-07-03 12:53:25 +08:00
euvre
918229613a fix: prevent duplicate 'skills' and '.knowledgebase' folders caused by race conditions (#16568) 2026-07-03 12:06:45 +08:00
Muhammad Furqan
83540185e1 fix(agent/tools): port AkShare to ToolBase so it works as an Agent tool (#16417)
### What problem does this PR solve?

Closes #16416.

The **AkShare** agent tool (`agent/tools/akshare.py`) was never ported
to the modern `ToolBase`/`_invoke` interface during the agent module
redesign and was still written against the removed legacy
`_run`/`be_output` API, so it was non-functional:

1. **Adding it to an Agent raised `AttributeError`.** `AkShare` extended
`ComponentBase` (not `ToolBase`) and `AkShareParam` defined no `meta`,
so it had no `get_meta()`. `agent/component/agent_with_tools.py` builds
each tool's function descriptor via `cpn.get_meta()`, so constructing an
Agent that includes the AkShare tool raised `AttributeError: 'AkShare'
object has no attribute 'get_meta'`.
2. **It could never run.** `invoke()` dispatches to `self._invoke`, but
`AkShare` only implemented the legacy `_run`, so `_invoke` fell through
to `ComponentBase._invoke` → `NotImplementedError`. `_run` also called
`be_output(...)`, which no longer exists on the base classes.
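
The dispatch problem in item 2 can be sketched with simplified stand-ins. The classes below are hypothetical and only mirror the `invoke` to `_invoke` relationship described above; they are not the real `ComponentBase`.

```python
class ComponentBase:
    def invoke(self, **kwargs):
        # The public entry point always dispatches to _invoke.
        return self._invoke(**kwargs)

    def _invoke(self, **kwargs):
        raise NotImplementedError


class LegacyTool(ComponentBase):
    def _run(self, **kwargs):
        # Legacy hook: invoke() never calls this.
        return "legacy result"


class PortedTool(ComponentBase):
    def __init__(self):
        self.outputs = {}

    def set_output(self, key, value):
        self.outputs[key] = value

    def _invoke(self, **kwargs):
        # Ported shape: implement _invoke and write results through set_output.
        self.set_output("formalized_content", f"result for {kwargs.get('query', '')}")
        return self.outputs["formalized_content"]


try:
    LegacyTool().invoke(query="600519")
except NotImplementedError:
    print("legacy tool cannot run")

print(PortedTool().invoke(query="600519"))
```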

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Changes

- Port `AkShareParam` to `ToolParamBase` with a `ToolMeta` (defined
before `super().__init__()`, matching `ArXivParam`/`TavilyExtractParam`)
exposing a required `query` parameter — the stock symbol to look up,
default `{sys.query}`. `query` matches the `{sys.query}` convention
shared by the other tools.
- Rewrite the component with `_invoke`/`set_output("formalized_content",
...)` (errors surfaced via `_ERROR`), keeping `top_n` and importing
`akshare` lazily.
- Add regression tests
(`test/unit_test/agent/component/test_akshare.py`) covering param
construction, validation, and the tool descriptor.

Same class of defect as #16329 (DeepL) and #16414 (Crawler).
Backend-only; no frontend changes.

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-03 11:39:26 +08:00
Jin Hai
1aa8abe373 Go: file syncer service framework (#16579)
### Summary

Run `./ragflow_main --syncer` to start the file syncer.

The YAML config file has the following settings:
```
file_syncer:
  max_concurrent_syncs: 4 # concurrent file sync threads
  sync_interval: 3 # check interval

```

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-03 11:14:02 +08:00
Kevin Hu
62f94cd59b Feat: Add knowledge compilation workflows (#16515)
## Summary
- Add knowledge compilation template APIs, services, and builtin
template seed data
- Add advanced knowledge compile structure/artifact/RAPTOR workflow
support
- Update parsing, dataset/document APIs, and supporting services for
compilation workflows
2026-07-02 23:22:07 +08:00
Jin Hai
7d64a78f83 Go: unify three services into one binary (#16462)
### Summary

Plan to start api_server, admin_server and ingestor in one binary:
- ./ragflow_main --admin
- ./ragflow_main --api
- ./ragflow_main --ingestor

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-02 21:21:10 +08:00
Liu An
32c5cb16e9 Docs: Update version references to v0.26.3 in READMEs and docs (#16574) v0.26.3 2026-07-02 20:55:15 +08:00
Wang Qi
93f6d647d4 Fix the sandbox exec image cannot show and download (#16577) 2026-07-02 20:49:51 +08:00
maoyifeng
4a81b9cfde fix workflow file type identification (#16576)
2026-07-02 20:41:14 +08:00
Lynn
bc54903bf6 Fix: display model_id in memory_list (#16567) 2026-07-02 20:28:27 +08:00
chanx
9a6d30bfe6 Fix: send agent log date filters as local wall-clock strings (#16575) 2026-07-02 20:23:15 +08:00
qinling0210
dcbd0d260c Port agent PRs to GO - 2 (#16565)
### Summary

Port the following PRs to GO in this PR

https://github.com/infiniflow/ragflow/pull/16420
https://github.com/infiniflow/ragflow/pull/13295
2026-07-02 20:20:11 +08:00
qinling0210
24118ac0d1 Fix chat thinking & Figure issue in GO (#16558)
### Summary

Fix chat thinking & Figure issue
2026-07-02 20:19:50 +08:00
Hz_
42aba36c1b fix(go): chunk stats after chunk deletion (#16553)
## Summary
- Decrement document and knowledgebase chunk counts after chunks are
deleted
- Keep token counts unchanged because deleted chunk token totals are not
available
- Add tests for stats update, zero-delete behavior, error handling, and
transaction rollback
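
The counter update described above can be sketched in a language-neutral way. The actual change is in Go; the Python below is only an illustration, and the `Stats` type and `apply_chunk_deletion` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    chunk_num: int
    token_num: int


def apply_chunk_deletion(doc: Stats, kb: Stats, deleted: int) -> None:
    """Decrement chunk counts on the document and its knowledge base."""
    if deleted <= 0:
        return  # zero-delete: nothing to update
    doc.chunk_num -= deleted
    kb.chunk_num -= deleted
    # token_num is left untouched: per-chunk token totals are not available here.


doc, kb = Stats(chunk_num=10, token_num=500), Stats(chunk_num=40, token_num=2000)
apply_chunk_deletion(doc, kb, deleted=3)
print(doc, kb)
```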
2026-07-02 19:54:42 +08:00
Hz_
dfd95c9c5c fix(go): Add tenant filter to file queries (#16526)
## Summary

- Add `tenant_id` filtering to `FileDAO.Query`.
- Pass tenant IDs through existing file query call sites.
- Prevent cross-tenant filename and folder duplicate checks.
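
Why the tenant filter matters for duplicate checks can be shown with an in-memory sketch. The real change is in the Go `FileDAO.Query`; the `FileRow` type and the helpers below are hypothetical and written in Python only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRow:
    tenant_id: str
    parent_id: str
    name: str


def query_files(rows, tenant_id, parent_id, name):
    # Every lookup is scoped to the caller's tenant.
    return [r for r in rows if r.tenant_id == tenant_id and r.parent_id == parent_id and r.name == name]


def is_duplicate(rows, tenant_id, parent_id, name):
    return bool(query_files(rows, tenant_id, parent_id, name))


rows = [FileRow("tenant-a", "root", "report.pdf")]
print(is_duplicate(rows, "tenant-a", "root", "report.pdf"))
print(is_duplicate(rows, "tenant-b", "root", "report.pdf"))
```

Without the `tenant_id` condition, the second check would report a duplicate for a file that belongs to another tenant.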
2026-07-02 19:54:22 +08:00
Jin Hai
11dfea489d Fix Go: fix minio port issue (#16552)
### Summary

1. The `MINIO_PORT` environment variable is used for external MinIO
access, so it should not be used in the Go config.
2. After the RAGFlow 1.0 release, `MINIO_PORT` will be used internally
by Docker Compose, and a new environment variable,
`MINIO_EXTERNAL_PORT`, will be used for external access.

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-02 19:15:58 +08:00
euvre
fc9116578c Fix: PDF page count detection for compressed PDFs (#16487) 2026-07-02 19:08:49 +08:00
Wang Qi
f7e39a09dc Fix graphrag generate error - AttributeError: 'RedisDB' object has no attribute 'mget' (#16573) 2026-07-02 19:06:16 +08:00
Jack
6ea95807be Fix: disable agent tests (#16562)
### Summary

Per discussion with @yuzhichang, disable the agent tests for now.


https://github.com/infiniflow/ragflow/actions/runs/28562749273/job/84704079689?pr=16521
[0.094ms] [rows:0] SELECT * FROM `tenant_model_instance` WHERE
provider_id = "provider-1" AND instance_name = "default" ORDER BY
`tenant_model_instance`.`id` LIMIT 1
  --- FAIL: TestInvoke_ProxyDNSPin (2.00s)
invoke_test.go:375: dial error = Invoke: do: Get "http://8.8.8.8/api":
context deadline exceeded; want pinned proxy IP 192.88.99.1:9999
(connection-refused is acceptable; an absent IP means the dialer fell
through to the default resolver and the pinning regression went
undetected)
2026/07/02 14:34:58
/home/infiniflow/runners_work/tower01-9dd627fd9c44/ragflow/ragflow/internal/dao/kb.go:79
record not found
[0.358ms] [rows:0] SELECT * FROM `knowledgebase` WHERE id = "da1" AND
status = "1" ORDER BY `knowledgebase`.`id` LIMIT 1
2026/07/02 14:34:58
/home/infiniflow/runners_work/tower01-9dd627fd9c44/ragflow/ragflow/internal/dao/kb.go:79
record not found
[0.283ms] [rows:0] SELECT * FROM `knowledgebase` WHERE id = "da1" AND
status = "1" ORDER BY `knowledgebase`.`id` LIMIT 1
2026/07/02 14:34:58
/home/infiniflow/runners_work/tower01-9dd627fd9c44/ragflow/ragflow/internal/dao/kb.go:79
record not found
[0.523ms] [rows:0] SELECT * FROM `knowledgebase` WHERE id = "da1" AND
status = "1" ORDER BY `knowledgebase`.`id` LIMIT 1
2026/07/02 14:34:58 ExpectPing will have no effect as monitoring pings
is disabled. Use MonitorPingsOption to enable.
  FAIL
  FAIL    ragflow/internal/agent/component    2.759s
  ok      ragflow/internal/agent/component/io    0.026s
2026-07-02 18:50:20 +08:00
Renzo
7d422ba67d feat(go): implement chatbots/<dialog_id>/info and searchbots/detail (#15420)
### What problem does this PR solve?

Part of #15240 (rewriting the RAGFlow API server in Go).

Implements the two public bot endpoints from
`api/apps/restful_apis/bot_api.py`:

- **`GET /api/v1/chatbots/<dialog_id>/info`** (`chatbots_inputs`) —
returns `{title, avatar, prologue, has_tavily_key}` for a dialog the
authenticated tenant owns (tenant match + `status == VALID`), otherwise
`"Authentication error: no access to this chatbot!"`.
- **`GET /api/v1/searchbots/detail`** (`detail_share_embedded`) —
returns search-app detail for a `search_id` the tenant can access.
Permission is checked across the tenant's joined tenants; denial returns
`"Has no permission for this operation."` (operating error, `data:
false`) and a missing app returns `"Can't find this Search App!"`.

Both endpoints authenticate with an SDK **beta token** (`Authorization:
Bearer <beta>`) rather than a session — the token is resolved to a
tenant via `APIToken.query(beta=token)`, backed by a new
`APITokenDAO.GetByBeta`. Because they perform their own token-based
auth, the routes are registered on the unauthenticated route group
(mirroring the Python blueprint, which has no `@login_required`).
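
The token-to-tenant resolution can be sketched as below. This is an illustration only: `BETA_TOKENS` is a hypothetical in-memory stand-in for the `APIToken` lookup, `tenant_from_authorization` is a hypothetical helper, and the case-insensitive scheme check is an assumption.

```python
from typing import Optional

# Hypothetical stand-in for the APIToken table: beta token -> tenant id.
BETA_TOKENS = {"beta-abc123": "tenant-1"}


def tenant_from_authorization(header: Optional[str]) -> Optional[str]:
    """Resolve an 'Authorization: Bearer <beta>' header to a tenant id, or None."""
    if not header:
        return None
    scheme, _, token = header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return BETA_TOKENS.get(token)


print(tenant_from_authorization("Bearer beta-abc123"))
print(tenant_from_authorization("Bearer unknown"))
```

A `None` result corresponds to the authentication-error path; the handler performs this check itself because the route sits on the unauthenticated group.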

Both live in a new `internal/handler/bot.go` + `internal/service/bot.go`
since they share the same source module. Handler unit tests cover the
auth, success, and error-mapping paths.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ling Qin <qinling0210@163.com>
2026-07-02 18:46:00 +08:00
Jack
7ae18a45ee Fix: correct download_deps.py path in error messages and add native libs doc (#16557)
## Summary

Fix error messages in `build.sh` and add documentation in
`internal/development.md` for downloading native static libraries
(pdfium, pdf_oxide, office_oxide).

## Changes

- `build.sh`: change error hint from `uv run download_deps.py` to `uv
run ragflow_deps/download_deps.py` (correct path from project root)
- `internal/development.md`: add section 2.1 documenting how to download
native libs and install lld
2026-07-02 18:41:39 +08:00
maoyifeng
3e7e5f4f6a add web and build start steps (#16572)
### Summary

update ci
2026-07-02 18:17:06 +08:00
writinwaters
ce8941ded4 Docs: Added v0.26.3 release notes. (#16566) 2026-07-02 17:50:14 +08:00
chanx
16b8c79a2b Fix: hide model settings button and related functionality (#16563) 2026-07-02 17:49:52 +08:00
chanx
2ef78189ce Fix: pass mcp to useExportMcp for correct JSON export filename (#16564) 2026-07-02 17:49:46 +08:00