Commit Graph

6321 Commits

Author SHA1 Message Date
BitToby
bd4ce39038 Go: implement provider: Perplexity (#15008)
## What
- Add Perplexity as a chat and embedding provider backed by its
OpenAI-compatible `/chat/completions` and `/v1/embeddings` APIs
- Register Perplexity in the Go model factory and provider config
- Support non-streaming chat, SSE streaming chat, embeddings, model
listing, and connection checks

Refs #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:33:02 +08:00
dripsmvcp
d5ba14a128 feat(go): implement provider Astraflow (#15062) (#15064)
- Adds an `Astraflow` Go driver so the new API server can route
Astraflow (UCloud ModelVerse) chat instances, matching the existing
Python `AstraflowChat` (`rag/llm/chat_model.py:1237`). Follows the same
SaaS-driver shape used for Avian, Novita, TogetherAI, Replicate,
DeepInfra, Upstage, and LongCat.

Closes #15062

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:32:56 +08:00
dripsmvcp
5a18df0fd0 Go: implement provider: Avian (#15045)
Closes #15044.

Avian was listed unchecked in the Go-rewrite tracker #14736 and already
had an llm_factories.json entry with 4 preconfigured chat models
(deepseek-v3.2, kimi-k2.5, glm-5, minimax-m2.5), but the Go API server
had no driver to route them. The Python side has supported Avian at
rag/llm/chat_model.py:1220 (AvianChat) via the LiteLLM openai/ provider
with default base https://api.avian.io/v1.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:32:49 +08:00
sxxtony
7740ec6c95 Go: implement Embed (embeddings) in Replicate driver (#15073)
### What problem does this PR solve?

`ReplicateModel.Embed` in `internal/entity/models/replicate.go` was a
`"replicate, no such method"` stub. Tracking issue #14736 lists
Replicate's embedding surface as not implemented. This PR wires it up
against Replicate's documented embedding schema.

Until this PR, a tenant who selected a Replicate embedding model got the
sentinel error on every embed call.

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:32:41 +08:00
天海蒼灆
3e5b11a523 Feat(browser control):Add new agent component 'browser' to control browser by AI (#14888)
### What problem does this PR solve?
This PR adds a new `Browser` operator to Agent workflows, enabling
prompt-driven browser automation in RAGFlow.Technically based
‘Browser-Use’

It includes:
- Backend browser component execution with tenant LLM integration
- Upload source support (file IDs, URLs, variables, CSV/JSON array)
- Downloaded file persistence to RAGFlow storage
- Frontend node/operator integration, form config, icon, and i18n
updates
- Unit tests for upload/download and ID parsing logic
- Dependency and Docker updates for browser-use runtime support

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-21 15:32:32 +08:00
Stephen Hu
da112e3db0 Refactor:improve dify retrieval logic (#15036)
### What problem does this PR solve?

improve dify retrieval logic for o(n) io to o(1)

### Type of change
- [x] Refactoring
2026-05-21 15:32:24 +08:00
Kevin Hu
e7544562cc Feat: @tool decorator for chat-model tool registration (#15047)
## Summary

- Adds a lightweight `@tool` decorator and `FunctionToolSession` adapter
in `rag/llm/tool_decorator.py` that let callers register plain Python
functions as LLM tools without hand-writing OpenAI function schemas or
building an MCP-style session.
- Refactors `Base.bind_tools` and `LiteLLMBase.bind_tools` in
`rag/llm/chat_model.py` to accept either the new decorator form
`bind_tools(tools=[fn1, fn2])` or the existing `(toolcall_session,
tools_schemas)` form, so existing agent/dialog call-sites in
`agent/component/agent_with_tools.py`, `api/db/services/llm_service.py`,
and `api/db/services/dialog_service.py` are unaffected.
- Adds 8 unit tests in `test/unit_test/rag/llm/test_tool_decorator.py`
covering schema shape, required/optional inference, sync + async
dispatch, and bad-input rejection.

## Usage

```python
from rag.llm.tool_decorator import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    :param city: City name to look up.
    """
    return f"{city}: 21 C, partly cloudy"

chat_mdl.bind_tools(tools=[get_weather])
ans, tk = await chat_mdl.async_chat_with_tools(system, history)
```

The decorator introspects `inspect.signature` + type hints + the
docstring (`:param name:` style) and attaches an OpenAI-format
`openai_schema` to the callable. `FunctionToolSession` duck-types the
existing `ToolCallSession` protocol, dispatching async callables
directly and sync ones through `thread_pool_exec` so the event loop is
never blocked.

## Design notes

- `tool_decorator.py` deliberately does **not** live inside
`rag/llm/__init__.py` to avoid forcing every consumer through the heavy
provider auto-discovery loop and to sidestep a circular import
(`__init__.py` imports `chat_model`, which would otherwise need symbols
from `__init__.py`).
- `FunctionToolSession` is duck-typed against
`common.mcp_tool_call_conn.ToolCallSession` rather than explicitly
inheriting from it, so importing the decorator doesn't pull the MCP
client SDK into the import graph.
- Docstring parsing is intentionally minimal (`:param name:` only) to
keep this dependency-free; Google/NumPy styles can be added later via
`docstring_parser` if needed.

## Test plan

- [x] `python -m pytest test/unit_test/rag/llm/test_tool_decorator.py
-v` — 8 passed
- [x] `python -m pytest test/unit_test/rag/llm/
--ignore=test/unit_test/rag/llm/test_perplexity_embed.py` — 11 passed
(the ignored test has a pre-existing `numpy` import that's unrelated)
- [ ] Reviewer: smoke-test the new path end-to-end with a live model via
`chat_mdl.bind_tools(tools=[my_fn])` to confirm the OpenAI-format
schemas pass through unchanged

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 15:32:17 +08:00
bitloi
a6186244ee fix: handle missing SDK authorization headers (#15050)
### What problem does this PR solve?

Closes #15048.

Several SDK session routes in `api/apps/sdk/session.py` called
`.split()` directly on `request.headers.get("Authorization")`. When
clients omitted the header, the handlers raised `AttributeError` before
returning the existing `Authorization is not valid!` response.

This PR centralizes SDK Authorization parsing in a small helper and
keeps the existing error response for missing, empty, or malformed
headers.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Tests

- `ZHIPU_AI_API_KEY=dummy uv run --python 3.13 --group test pytest
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py::test_sdk_session_routes_missing_authorization_unit
-q`
- `uv run --python 3.13 --group test ruff check api/apps/sdk/session.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `python3 -m py_compile api/apps/sdk/session.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `git diff --check`
2026-05-21 15:32:00 +08:00
kingloon
da4eaf9fb0 Fix: remove duplicate function definitions (#15063)
### What problem does this PR solve?

Remove duplicate function definitions in
`api/db/services/dialog_service.py`.

**Problem:** Two helper functions were defined twice in the same file,
but with different parameter orders:

- First definition (line 57):
`_resolve_reference_metadata(request_payload=None, config=None)`
- Second definition (line 136): `_resolve_reference_metadata(config,
request_payload=None)`

**Solution:** Keep the second definition (which is actually used by
other modules) and remove the first one to avoid confusion.

Additionally, remove duplicate `_enrich_chunks_with_document_metadata`
definition (keep line 140 version).
<img width="1584" height="313" alt="image"
src="https://github.com/user-attachments/assets/7daee832-244f-4bb2-8488-e3b65012a3f9"
/>
<img width="1672" height="359" alt="image"
src="https://github.com/user-attachments/assets/4fd2f523-273c-4b20-a7c9-ab35740b7834"
/>


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2026-05-21 15:31:51 +08:00
kpdev
6932615852 fix(api): infer /documents/{id}/download Content-Type from filename when ext is omitted (#15052) (#15053)
## Summary

- Align **GET `/api/v1/documents/<doc_id>/download`** with
**`/preview`**: resolve extension and MIME type from the stored document
name when the **`ext` query parameter is omitted**, instead of
defaulting to `markdown`.
- When **`?ext=`** is present, behavior stays the same as before
(explicit extension / `Content-Type` mapping).
- Enforce the same access + document lookup pattern as preview
(**`accessible`** + **`get_by_id`**).
- Extend unit tests for the no-`ext` PDF filename case.

## Test plan

- [x] `uv run pytest
test/testcases/test_web_api/test_document_app/test_document_metadata.py::TestDocumentMetadataUnit::test_download_attachment_success_and_exception_unit`
- [x] Optional: `curl -sSI` against
`/api/v1/documents/<pdf_doc_id>/download` without `ext` and confirm
`Content-Type: application/pdf`

Fixes #15052.
2026-05-21 15:31:36 +08:00
dripsmvcp
440153c378 fix(api): check kb ownership in /dify/retrieval (#15028)
POST /api/v1/dify/retrieval resolved the caller via @apikey_required
(injecting tenant_id) but then fetched the requested knowledge_id with
no tenant filter and ran the full retrieval pipeline against
kb.tenant_id (the owner). Any valid Dify-compatible API key could
retrieve chunks from any tenant whose KB UUID was known. Adds the
missing ownership check.

## Root Cause
api/apps/sdk/dify_retrieval.py line 253:
KnowledgebaseService.get_by_id(kb_id) fetched the KB by id alone, then
the handler used kb.tenant_id (the OWNER) to build the embedding model
and call the retriever. The caller tenant_id was only used downstream at
line 278 for retrieval_by_children, well after cross-tenant data was
already retrieved.

grep confirmed there was no KnowledgebaseService.accessible call
anywhere in the handler.

## Fix
Two-line guard immediately after the existing get_by_id lookup,
mirroring the pattern PR #14749 lands for the sibling sdk/doc.py routes
(download, parse, stop_parsing, retrieval_test):

    e, kb = KnowledgebaseService.get_by_id(kb_id)
    if not e:
return build_error_result(message="Knowledgebase not found!",
code=RetCode.NOT_FOUND)
+   if not KnowledgebaseService.accessible(kb_id, tenant_id):
+ return build_error_result(message="No authorization.",
code=RetCode.AUTHENTICATION_ERROR)
    if kb.tenant_embd_id:
        ...

KnowledgebaseService.accessible already handles solo-tenant ownership,
team membership via TenantService.get_joined_tenants_by_user_id, and the
permission=ME distinction. No behavior change for legitimate callers;
cross-tenant callers now receive RetCode.AUTHENTICATION_ERROR (109).

## Test Plan
- [x] Regression test added:
test/unit_test/api/apps/sdk/test_dify_retrieval.py
- test_cross_tenant_request_is_rejected -- attacker tenant calling owner
tenant KB gets 109; retriever is not invoked
- test_same_tenant_request_succeeds -- owner tenant gets the records
back
- test_missing_knowledge_base_returns_not_found -- missing KB returns
404 BEFORE the access check fires (legit callers see the clearer
message)
- [x] All 3 tests pass after the fix
- [x] Cross-tenant test FAILS on pre-fix main (KeyError on result[code]
because handler leaks records dict instead of returning auth error)
- [x] ruff check clean on both changed files
- [x] No drive-by reformatting in dify_retrieval.py -- only the 2 added
lines

### Post-fix output

    test_cross_tenant_request_is_rejected           PASSED [ 33%]
    test_same_tenant_request_succeeds               PASSED [ 66%]
    test_missing_knowledge_base_returns_not_found   PASSED [100%]

============================== 3 passed in 0.04s
===============================

Closes #15027
2026-05-21 13:29:00 +08:00
Chan
0c93161a14 fix: prevent session user_id spoofing via request body (#15077)
### What problem does this PR solve?

Closes #15076 

Two endpoints in `api/apps/restful_apis/chat_api.py` accepted a
`user_id` field from the request body and used it directly when creating
a session:

```python
# before (vulnerable)
"user_id": req.get("user_id", current_user.id)          # create_session
conv = await _create_session_for_completion(chat_id, dia, req.get("user_id", current_user.id))  # session_completion
```

Any authenticated caller could supply an arbitrary `user_id` and have
the new session attributed to a different user — effectively spoofing
session ownership. Both call sites are now fixed to always use
`current_user.id`, which is set by the authentication middleware and
cannot be tampered with via the request payload.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Changes

| File | Change |
|------|--------|
| `api/apps/restful_apis/chat_api.py` | Remove `req.get("user_id", ...)`
fallback in `create_session` and `session_completion`; always use
`current_user.id` |
|
`test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
| Add `test_create_session_user_id_not_spoofable` and
`test_session_completion_user_id_not_spoofable` (both `@pytest.mark.p2`)
|

### Testing

Two new unit tests assert that a `user_id` value supplied in the request
body is silently ignored and the session is always owned by the
authenticated user:

```
test_create_session_user_id_not_spoofable
test_session_completion_user_id_not_spoofable
```

Run with:

```bash
uv run pytest test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py -k "spoofable" -v
```
2026-05-21 13:28:14 +08:00
web-dev0521
2d3a1a4483 feat(go-models): add Azure OpenAI model driver (#15022)
## What problem does this PR solve?

Closes #15021.

The Go model-provider layer had no support for **Azure OpenAI**. Azure
OpenAI is *not* a drop-in base-URL swap of the OpenAI driver — it
differs in authentication, endpoint structure, and how models are listed
— so it needs its own `ModelDriver` implementation.

## Type of change

- [x] New feature (non-breaking change which adds functionality)

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 11:52:56 +08:00
Renzo
c7ac9b7171 Go: implement provider: GPUStack (chat) (#15024)
### What problem does this PR solve?

Fixes #15023

GPUStack is listed as unchecked in the Go-rewrite tracker #14736, and
`internal/service/llm.go:171` already classifies it as a self-deployed
provider alongside Ollama, Xinference, LocalAI, and LM Studio — but
`internal/entity/models/` had no `gpustack.go` driver, so the new Go API
server could not route GPUStack instances. This PR adds the chat surface
for GPUStack so it lines up with the existing self-hosted Go drivers.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 11:49:18 +08:00
Renzo
394cd5d116 Go: implement Embed in Xinference driver (#14932)
## Summary

- Replaces the `"no such method"` stub on `XinferenceModel.Embed`
(`internal/entity/models/xinference.go`) with a real implementation
against Xinference's OpenAI-compatible `/v1/embeddings` endpoint.
- Adds the `"embedding": "v1/embeddings"` URL suffix to
`conf/models/xinference.json`.
- Mirrors the Python `XinferenceEmbed` class in
`rag/llm/embedding_model.py:407` for payload shape (OpenAI-compatible
`model + input` → `data[*].index + data[*].embedding`) and tolerates the
same no-auth default Xinference deployments use. Authorization is only
sent when a non-empty API key is configured, via the existing
`setXinferenceAuth` helper.
- Reuses the existing `normalizeXinferenceBaseURL` + `baseURLForRegion`
helpers so both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1`
resolve to the same `/v1/embeddings` target without doubled `/v1`.
- Validates response indices — duplicate, missing, or out-of-range
`data[*].index` values fail with a clear error rather than silently
producing misaligned vectors.
- Returns `[]EmbeddingData` in original input order (placed by `Index`)
so downstream callers can index positionally without re-sorting.
- Forwards `EmbeddingConfig.Dimension` as `dimensions` when `> 0`,
matching the OpenAI cluster pattern.

Closes #14810

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 11:47:30 +08:00
Renzo
fec0b968e7 Go: implement Rerank in Novita driver (#15014)
### What problem does this PR solve?

Fixes #15012

The Novita Go driver landed in #14850 and shipped a stub `Rerank` method
that returned `"novita, no such method"`, so Novita could not be used as
a rerank provider in RAGFlow. This PR fills that gap, in the same way
#14895 filled the Embed gap on the same driver.

Novita exposes a public rerank endpoint at `POST
https://api.novita.ai/openai/v1/rerank` that accepts the
Cohere-compatible request shape (`{model, query, documents, top_n}`)
with `Authorization: Bearer <api_key>`. `baai/bge-reranker-v2-m3` is
documented in Novita's model library with a 1024-token limit.
2026-05-21 10:19:17 +08:00
Renzo
536ed07d27 Go: implement Rerank in Xinference driver (#15032)
### What problem does this PR solve?

Fixes #14816

The Xinference Go driver landed chat in #14938 and Embed is in review in
#14932, but `Rerank` shipped as a stub that returns `"xinference, no
such method"`. Tenants who launch a rerank model with `--model-type
rerank` on their Xinference instance cannot route it through the Go API
server. This PR fills the gap.

Xinference exposes an OpenAI-compatible REST API. The rerank endpoint is
at `POST <base>/v1/rerank` and accepts the Cohere-shaped body `{model,
query, documents, top_n}`, returning `{results: [{index,
relevance_score}]}` — the same wire shape used by the merged NVIDIA
(#14778), Aliyun (#14676), Gitee (#14656), ZhipuAI (#14608), Novita
(#15014), and LocalAI (#14813) Rerank implementations. Documented in
[Xinference rerank
docs](https://inference.readthedocs.io/en/v1.6.1/models/model_abilities/rerank.html);
the [builtin rerank model
catalog](https://inference.readthedocs.io/en/stable/models/builtin/rerank/)
lists `bge-reranker-base`, `bge-reranker-large`, `bge-reranker-v2-m3`,
and others.
2026-05-21 10:14:30 +08:00
sxxtony
63db30f0d9 Go: implement provider: n1n.ai (#15010)
### What problem does this PR solve?

Add a Go driver for **n1n.ai** (https://docs.n1n.ai), one of the
unchecked providers on the umbrella tracking issue #14736. n1n.ai is an
OpenAI-compatible aggregator hosting a 450+ model catalog (GPT, Claude,
Gemini, DeepSeek, Kimi, Qwen, embedding + reranker families) under
`https://api.n1n.ai/v1`.

Until this PR, a tenant who configured `n1n` as a model provider in the
Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.

---------

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
2026-05-21 10:13:15 +08:00
Jack Storment
dc01e0e51c Go: implement Embed (embeddings) in TogetherAI driver (#15017)
### What problem does this PR solve?

Fixes #15015

The TogetherAI Go driver in `internal/entity/models/togetherai.go`
shipped a stub `Embed` method that returned `"TogetherAI, no such
method"`, so TogetherAI could not be used as an embedding provider in
RAGFlow. This PR fills that gap.

TogetherAI exposes a public OpenAI-compatible embeddings endpoint at
`POST https://api.together.ai/v1/embeddings` that accepts the standard
`{model, input}` shape with `Authorization: Bearer <api_key>` (confirmed
in TogetherAI's official docs:
https://docs.together.ai/docs/embeddings-overview). Documented embedding
models include `intfloat/multilingual-e5-large-instruct`,
`BAAI/bge-large-en-v1.5`, and `BAAI/bge-base-en-v1.5`.

### Changes

- `internal/entity/models/togetherai.go`: implement
`TogetherAIModel.Embed`.
- Validate inputs (api key, model name) and short-circuit on empty
texts.
  - Resolve region with the existing `baseURLForRegion` helper.
  - Build URL from `URLSuffix.Embedding`.
- Send `{model, input}` POST body, add `dimensions` when
`embeddingConfig.Dimension > 0` (matches the pattern in #14735).
  - Bearer auth + JSON content type, mirroring the chat path.
- Parse `{data: [{embedding, index}]}` and reorder by `index`, rejecting
out-of-range indices, duplicates, and missing entries so the output
always lines up with the input. Same shape as the merged Mistral,
Upstage, and Novita Embed implementations.
- `conf/models/togetherai.json`:
  - Add `"embedding": "embeddings"` to `url_suffix`.
- Add default embedding model entries for
`intfloat/multilingual-e5-large-instruct`, `BAAI/bge-large-en-v1.5`, and
`BAAI/bge-base-en-v1.5`.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-20 20:48:44 +08:00
chanx
aa0a3d6988 Fix: The logs on the data source details page are not fully displayed. (#15056)
### What problem does this PR solve?

Fix: The logs on the data source details page are not fully displayed.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-20 20:32:31 +08:00
qinling0210
dbef3e361f Update chunk/metadata cli (#15055)
### What problem does this PR solve?

Update chunk/metadata cli

### Type of change

- [ ] Refactoring
2026-05-20 20:32:06 +08:00
Jin Hai
90c76e73d0 Docs: Update version references to v0.25.5 in READMEs and docs (#15059)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
v0.25.5
2026-05-20 20:05:45 +08:00
writinwaters
31bf29bc38 Docs: Initial draft of v0.25.5 release notes. (#15058)
### What problem does this PR solve?

Intial draft of v0.25.5 release notes.

### Type of change

- [x] Documentation Update
2026-05-20 19:44:40 +08:00
Haruko386
4a91ca5349 Go: implement provider: MinerU_Local (#15051)
### What problem does this PR solve?
1. Add model types when add model
---
```
RAGFlow(user)> add model 'pipeline' to provider 'mineru_local' instance 'test' with tokens 131072 doc_parse;
SUCCESS
```

2. implement provider: MinerU_Local
---
**Verified from CLI**
```
RAGFlow(user)> parse with 'pipeline@test@mineru_local' file './internal/test.pdf'
+--------------------------------------+
| task_id                              |
+--------------------------------------+
| c7260e31-b6e2-4b36-955d-e9c60510c669 |
+--------------------------------------+
RAGFlow(user)> show 'test@mineru_local' task 'c7260e31-b6e2-4b36-955d-e9c60510c669'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content                                                                                                                                                                                                                                                          | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Bingxin Ke Anton Obukhov Shengyu Huang Nando Metzger Rodrigo Caye Daudt Konrad Schindler Photogrammetry and Remote Sensing, ETH Zurich ¨

![](images/ae256101419715b544d13722...  | 1     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-20 19:21:57 +08:00
Wang Qi
6ce76e6799 Fix discord async issue (#15054)
### What problem does this PR solve?

RuntimeError: Cannot run the event loop while another loop is running

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-20 19:21:19 +08:00
Magicbook1108
b28e134944 Feat: add local & ssh provider in admin panel (#15039)
### What problem does this PR solve?

Feat: add local & ssh provider in admin panel

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-20 16:56:20 +08:00
bitloi
6499bce2a6 fix: Langfuse chat observation (#15026)
### What problem does this PR solve?

Closes #15025

Langfuse-enabled `dialog_service.async_chat()` regressed to
`langfuse_tracer.start_generation(...)` after the earlier Langfuse v4
migration. Langfuse v4 uses `start_observation(as_type="generation")`,
so the remaining `start_generation` call can fail when chat tracing is
enabled.

This restores the migrated `start_observation(as_type="generation")`
call for chat observations while preserving the existing trace context,
model, input payload, and update/end flow. It also adds a regression
test with a fake Langfuse v4-style client that exposes
`start_observation()` but not `start_generation()`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Tests

- `.venv/bin/pytest
test/unit_test/api/db/services/test_dialog_service_final_answer.py -q`
- `.venv/bin/ruff check api/db/services/dialog_service.py
test/unit_test/api/db/services/test_dialog_service_final_answer.py`
2026-05-20 15:01:19 +08:00
balibabu
1ed8a118cf Fix: The folder tree menu for moving folders cannot be scrolled. (#15037)
### What problem does this PR solve?

Fix: The folder tree menu for moving folders cannot be scrolled.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-20 14:59:36 +08:00
bitloi
d69518ea42 fix(go): guard custom base URL driver creation (#15030)
### What problem does this PR solve?

Closes #15029.

Some custom `base_url` paths in `ModelProviderService` call
`NewInstance(newURL)` and then immediately invoke methods on the
returned driver. Several real Go model drivers still return `nil` from
`NewInstance`, so those paths can panic instead of returning a normal
error.

This PR:

- centralizes custom base URL driver creation in `model_service.go`
- skips request-local driver creation when `base_url` is blank or
whitespace
- preserves the existing region key behavior when building the
request-local base URL map
- returns a clear error when the provider driver is missing or
`NewInstance` returns `nil`
- routes list/check/task and active model paths through the guarded
helper
- adds focused unit coverage for empty-region preservation, regional
base URLs, blank base URLs, nil drivers, and nil `NewInstance` results

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Test plan

- [x] `git diff --check upstream/main...HEAD`
- [x] `/root/go/bin/gofmt -w internal/service/model_service.go
internal/service/model_service_test.go`
- [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go test
./internal/service -run TestNewModelDriverForBaseURL -count=1 -vet=off`
- [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go build
./internal/service/... ./internal/entity/models/...`

Note: the same targeted `go test` command without `-vet=off` is
currently blocked by an existing unrelated vet finding in
`internal/service/llm.go:355` (`non-constant format string in call to
fmt.Errorf`).
2026-05-20 14:58:20 +08:00
Idriss Sbaaoui
aea90f4e39 Feat: add new tests and tescases for restful api suite (#15038)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-20 14:56:55 +08:00
Haruko386
2836a934b5 Go: implement provider: 302.AI and JieKou-AI (#15034)
### What problem does this PR solve?

This PR implement implement provider 302.AI and JieKouAI

**The following functionalities are now supported:**

**302.ai**

- [x] chat / think chat / stream chat / stream think chat
- [x] Embedding
- [x] ASR
- [x] ListModels
- [x] Provider connection checking
- [x] Balance
- [x] Rerank
- [x] OCR
- [x] Doc Parse
- [x] Show task 
- [ ]  ~~List Tasks!~~
- [ ] TTS

**JieKouAI**

- [x] chat / think chat / stream chat / stream think chat
- [x] Embedding
- [x] Rerank
- [x] ListModels

**Verified examples from the CLI:**
```palintext
# jiekouAI

RAGFlow(user)> stream think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi'
Thinking: Let me think about how to respond to this simple greeting. The user just said "Hi", which is a basic and friendly way to start a conversation. I should respond in a similarly warm and welcoming manner.First, I need to acknowledge their greeting and reciprocate with enthusiasm. Something like "Hello!" or "Hi there!" would work well to create a positive atmosphere right from the start.Next, I should make it clear that I'm ready to help. Since they haven't asked anything specific yet, I'll keep it open-ended and inviting. Perhaps offering assistance with a question or task would encourage them to engage further.I should also maintain a professional yet approachable tone. Being an AI assistant, I want to convey that I'm knowledgeable and capable, but also friendly and easy to talk to.Let me put this all together into a concise response. I'll start with a cheerful greeting, express my readiness to help, and finish with an open invitation for them to share what's on their mind. This should create a welcoming environment for whatever they want to discuss next.
Answer: ! I'm Claude, an AI assistant created by Anthropic. I'm here to help you with information, answer questions, or assist you with tasks. What can I help you with today?

RAGFlow(user)> think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi'
Thinking: Let me consider how to respond to this greeting. The user initiated with a simple "Hi," so a friendly and open response would be most appropriate to encourage further conversation. I should maintain a welcoming tone while offering assistance.

The response should accomplish a few key things: return the greeting warmly, show openness to conversation, and offer specific ways I can help. This approach demonstrates both approachability and usefulness.

I'll start with a greeting in return, then express my availability to help, and finish by suggesting some areas where I can provide assistance. This creates a natural flow from acknowledgment to support.

It's important to keep the response concise but inviting. Since the user hasn't specified their needs yet, I'll present a few broad categories of assistance to spark their thinking about what they might want to discuss or ask about.

The response should end with an encouraging note that prompts them to share what's on their mind, keeping the conversational ball in their court while making it clear I'm ready to engage with whatever they need.
Answer: Hello! How can I help you today? Whether you have questions, need information, or just want to chat, I'm here to assist.

RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test@jiekouai' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 3072      | 0     |
| 3072      | 1     |
+-----------+-------+

RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'baai/bge-reranker-v2-m3@test@jiekouai' top 3
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0     | 0.9830034       |
| 2     | 0.06399203      |
| 1     | 0.04665664      |
+-------+-----------------+


# 302.ai

RAGFlow(user)> think chat with 'kimi-k2.6@test@302.ai' message 'who r u'
Thinking: The user is asking "who r u" which is a casual way of asking "who are you." I need to identify myself as an AI assistant created by Moonshot AI. I should be friendly, concise, and helpful.

Key points to include:
- I am Kimi, an AI assistant made by Moonshot AI
- I can help with various tasks like answering questions, writing, analysis, coding, etc.
- Keep it casual but informative since the user used "r u" (text speak)

I should not:
- Pretend to be human
- Claim to have personal experiences or emotions
- Be overly formal or robotic

Simple, friendly response is best.
Answer: I'm Kimi, an AI assistant made by Moonshot AI. I can help you with answering questions, writing, coding, analysis, or just chatting. What can I do for you?
Time: 17.687750

RAGFlow(user)> stream think chat with 'kimi-k2.6@test@302.ai' message 'who r u'
Thinking:  user asked "who r u" which is a casual way of asking "who are you." I should introduce myself as Kimi, an AI assistant developed by Moonshot AI. I need to be friendly, concise, and accurate. I should mention my capabilities briefly and keep the tone helpful. Since the user used casual text speak ("r u"), I can match that energy with a friendly but still informative tone.Key points:- I'm Kimi, an AI assistant made by Moonshot AI- I can help with various tasks like answering questions, writing, coding, analysis, etc.- Keep it brief but warm- Don't claim to be human- Don't over-explainDraft:"I'm Kimi, an AI assistant created by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other tasks. What can I do for you?"This is good - direct, accurate, and inviting.
Answer:  Kimi, an AI assistant made by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other stuff. What can I do for you?
Time: 14.912576

RAGFlow(user)> asr with 'whisper-v3-turbo@test@302.ai' audio './internal/test.wav' param ''
+---------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                |
+---------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired |
+---------------------------------------------------------------------------------------------------------------------+

RAGFlow(user)> ocr with 'mistral-ocr-latest@test@302.ai' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Bingxin Ke

Nando Metzger

Anton Obukhov

Rodrigo Caye Daudt

Shengyu Huang

Konrad Schindler

Photogrammetry and Remote Sensing, ETH Zürich

![img-0.jpeg](img-0.jpeg)
Figur...  |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


RAGFlow(user)> parse with 'vlm@test@302.ai' file 'https://arxiv.org/pdf/2505.09358'
+--------------------------------------+
| task_id                              |
+--------------------------------------+
| 6de6eae6-c122-4b67-91e8-b061a0b8c087 |
+--------------------------------------+
RAGFlow(user)> show 'test@302.ai' task '6de6eae6-c122-4b67-91e8-b061a0b8c087'
+----------------------------------------------------------------------------+-------+
| content                                                                    | index |
+----------------------------------------------------------------------------+-------+
| https://file.302.ai/gpt/imgs/20260519/b340fdff4774699c287fe4ee4658b317.zip | 0     |
+----------------------------------------------------------------------------+-------+

RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'jina-embeddings-v3@test@302.ai' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 1024      | 0     |
| 1024      | 1     |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'jina-reranker-v2-base-multilingual@test@302.ai' top 3;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0     | 0.74167407      |
| 2     | 0.18832397      |
| 1     | 0.15713684      |
+-------+-----------------+
```


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-20 14:10:15 +08:00
qinling0210
77834870fc Refact functions in engine in GO (#14981)
### What problem does this PR solve?

Refact functions in engine in GO
### Type of change

- [x] Refactoring
2026-05-19 17:34:59 +08:00
Idriss Sbaaoui
6b2fcb4116 Feat: add new tests and tescases for restful api suite (#14996)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-19 17:17:31 +08:00
plind
6796a47b8d feat(sdk): make Begin inputs discoverable on Session.ask (#14842)
### What problem does this PR solve?

Closes #14751.

The user reported that after adding a variable (e.g. `key1`) to an
agent's **Begin** component, the Python SDK gave them no way to pass it:
their call `session.ask(question=user_question, stream=False)` had no
parameter for `key1`, and the `ask()` signature was just `(question,
stream, **kwargs)` with a docstring that only described streaming
behavior.

The functionality already works — `_ask_agent` does
`json_data.update(kwargs)` and the server reads `inputs` from the
request body at `agent_api.py:902`. The canonical shape is also in the
public API docs (`docs/references/python_api_reference.md:1817-1840`):

```python
session.ask(
    "",
    stream=False,
    inputs={"line_var": {"type": "line", "value": "I am line_var"}},
    return_trace=True,
)
```

But because `inputs`, `release`, and `return_trace` were hidden behind
`**kwargs`, they did not appear in IDE signature help, and the docstring
did not mention them. Users had no path from "I added a key in the UI"
to "I need to pass `inputs=...` with this exact shape."

This PR promotes the three most relevant Begin-related arguments to
named parameters and rewrites the docstring with a worked example.

### What this PR changes

- `sdk/python/ragflow_sdk/modules/session.py`:
- `Session.ask()` signature becomes `ask(question="", stream=False,
inputs=None, release=None, return_trace=None, **kwargs)`.
- These three new named params are forwarded into the existing `kwargs`
dict before dispatch, so the wire format and downstream behavior are
unchanged.
- Docstring rewritten in numpy style, including the structured `{"type":
..., "value": ...}` shape that the Begin component requires (see
`agent/component/begin.py:45-60`).

No backend changes. `**kwargs` is preserved for forward compatibility
with other body fields (`session_id`, `files`, `user_id`,
`custom_header`, …).

### Test plan

- [ ] `session.ask(question="hi", stream=False)` — existing call still
works
- [ ] `session.ask("", stream=False, inputs={"key1": {"type": "line",
"value": "v"}})` — Begin component receives `key1 = "v"`
- [ ] `session.ask("", stream=True, return_trace=True)` — streaming
response includes trace events
- [ ] IDE / `help(Session.ask)` now shows `inputs`, `release`,
`return_trace` with descriptions

### Type of change

- [x] Refactoring
- [x] Documentation Update
2026-05-19 16:14:57 +08:00
Rene Arredondo
f58e0b3eca Feat: VLM image descriptions in MinerU parser (#14869) (#14946)
## Summary

Closes #14869.

Adds VLM-based semantic descriptions to **image chunks produced by the
MinerU parser**, closing a long-standing parity gap with the deepdoc
parser's `VisionFigureParser`. A maintainer flagged this in #13342
("We may add the VLM enhancement to MinerU parser as well") and an
earlier proposal exists in #13824; this PR lands the change end-to-end
inside the existing parser plumbing.

## Why

Today the MinerU parser returns image chunks containing only the
native `image_caption` and `image_footnote` strings from MinerU's
JSON. When neither is present (or when both are sparse), the chunk
carries effectively no searchable content for the figure and
retrieval misses it entirely. Users who configured a local VLM
(reporter's case: Gemma-4-31B) had to post-process MinerU's
`tmp/*.json` themselves.

The deepdoc parser already solves this via
[`VisionFigureParser`](deepdoc/parser/figure_parser.py): when the
tenant has an `IMAGE2TEXT` model configured, each figure gets a
semantic description merged into its chunk. This PR brings the same
behavior to MinerU.

## What changed

### `deepdoc/parser/mineru_parser.py`

- **New method `_enhance_images_with_vlm(outputs, vision_model,
callback=None)`** —
  collects every `IMAGE` block with a readable `img_path`, runs
  `rag.app.picture.vision_llm_chunk` in a 10-worker
  `ThreadPoolExecutor` using the existing
  `vision_llm_figure_describe_prompt`, and writes the result back as
  `vlm_description`. Per-image failures are logged and skipped — they
  never abort the run.
- **`_transfer_to_sections` (IMAGE branch)** — folds
  `vlm_description` into the section text alongside caption +
  footnote, so the description becomes part of the chunk and is
  searchable / retrievable.
- **`parse_pdf`** — after `_read_output`, calls
  `_enhance_images_with_vlm(outputs, vision_model, callback=callback)`
  when a `vision_model` kwarg is supplied. Wrapped in `try / except`
  so a VLM outage cannot break parsing.

### `rag/app/naive.py` (`by_mineru`)

After successfully resolving the MinerU OCR parser, also resolves the
tenant's default `LLMType.IMAGE2TEXT` model via
`get_tenant_default_model_by_type`, wraps it in an `LLMBundle`, and
injects it as `kwargs["vision_model"]` before delegating to
`parse_pdf`.

## Behavior

| Tenant config | Behavior |
|---|---|
| `IMAGE2TEXT` model configured | MinerU image chunks contain `caption +
footnote + VLM description`. Retrieval against figures now actually
works. |
| No `IMAGE2TEXT` model configured | Exact same output as today (caption
+ footnote only). Lookup fails silently with an info log; no error, no
regression. |
| VLM call fails for a single image | That image silently falls back to
caption + footnote; other images proceed. |
| Caller already passes `vision_model` in kwargs | We don't override it
— `if "vision_model" not in kwargs` guards the lookup. |

## Files

- `deepdoc/parser/mineru_parser.py` (+56)
- `rag/app/naive.py` (+13)
2026-05-19 16:08:10 +08:00
Idriss Sbaaoui
95b56e73f2 Feat: add new tests and tescases for restful api suite (#14993)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-19 15:43:15 +08:00
Rene Arredondo
ce3402cbb9 Fix: restore saved api_key fallback in add_llm (#14921) (#14941)
## Summary

Closes #14921.

Reconfiguring an existing LLM provider to enable **tool call** or
**vision** fails with `Your API key is invalid. Fail to access model.`
even when the saved API key is correct. The most visible report is
VLLM ("Cannot add vllm model" once `--enable-auto-tool-choice` /
vision is toggled on), but the bug applies to every provider whose
api_key field stays blank in edit mode.

## Root cause

PR #14885 ("Fix: llm add api key overridden") removed the existing-key
lookup in `api/apps/llm_app.py::add_llm`. The intent was correct —
stop the saved key from clobbering a user-provided new one — but the
removal was unconditional, so the edit path now has no fallback at all:

1. `web/src/pages/user-setting/setting-model/hooks.tsx:230` sets the
   initial `api_key` form value to `''` in edit mode (the real key is
   never returned to the browser).
2. The user toggles `is_tools` / `vision` without retyping the key.
3. `hooks.tsx:183-185` strips the empty `api_key` from the payload.
4. `add_llm` defaults to the placeholder `"x"`
   (`api/apps/llm_app.py:182`).
5. The upstream provider rejects `"x"` with `Your API key is invalid`.

## Fix

Restore the fallback **narrowly**, before any factory-specific handler
runs:

- If `req.get("api_key") is None`, look up the tenant's existing record
  (using the correctly suffixed `llm_name` for VLLM /
  OpenAI-API-Compatible / LocalAI / HuggingFace).
- Decode the saved blob with `_decode_api_key_config` and write **only
  the decoded `api_key` string** back into `req["api_key"]`. Never use
  the raw JSON payload — that was the exact thing PR #14885 was trying
  to avoid.
- When the user **does** type a new key, `req.get("api_key")` is not
  `None` and the fallback is skipped, so PR #14885's fix is preserved.

| Scenario | Before this PR | After this PR |
|---|---|---|
| Plain factory (VLLM, Ollama, …), retype key | OK | OK |
| Plain factory, blank key in edit (the bug) | Fails with "API key is
invalid" | Recovers saved key, validates against the real one |
| OpenRouter / Bedrock, change `provider_order` only | Fails |
`apikey_json([...])` rebuilds the JSON with saved `api_key` + new field
|
| User clears the form and types a brand-new key | OK (key replaced) |
OK (key replaced — fallback skipped) |

## Files changed

- `api/apps/llm_app.py` — restored fallback in `add_llm` (no other call
sites touched).

## Test plan

- [ ] Add a VLLM chat model with a valid api_key, no toggles → save
succeeds.
- [ ] Edit the same model, toggle **tool call** on, leave api_key blank
      → save succeeds, validation runs against the saved key.
- [ ] Edit again, toggle **vision** on (model_type → `image2text`),
      leave api_key blank → save succeeds.
- [ ] Edit again and **type a new api_key** → the new key replaces the
      saved one (`is None` check skips the fallback). Verify via the DB
      row or by deliberately typing a wrong key and observing the
      validation failure.
- [ ] Repeat the blank-key edit with **OpenRouter**, changing only
      `provider_order` → resulting api_key JSON contains the saved
      `api_key` and the new `provider_order`.
- [ ] First-time add of a new model name → no existing record, fallback
      no-ops, behaves as before.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2026-05-19 15:32:09 +08:00
tmimmanuel
243d9ed281 Add TogetherAI chat provider (#14957)
## What
- Add TogetherAI as a chat provider backed by its OpenAI-compatible
`/v1/chat/completions` API
- Register TogetherAI in the Go model factory and provider config
- Support non-streaming chat, SSE streaming chat, model listing, and
connection checks

## Notes
- Uses the current TogetherAI OpenAI-compatible base URL
`https://api.together.ai/v1`
- Forwards documented chat parameters from `ChatConfig`: `max_tokens`,
`temperature`, `top_p`, `stop`, and GPT-OSS `reasoning_effort`
- Routes Together reasoning traces from `reasoning` /
`reasoning_content` into `ReasonContent`

## Tests
- `go test -vet=off -run TestTogetherAI -count=1
./internal/entity/models`
- `go test -vet=off -count=1 ./internal/entity/models`

Refs #14736
2026-05-19 15:10:42 +08:00
tmimmanuel
09a06f1b00 Go: implement provider: Xinference (#14938)
### What problem does this PR solve?

Closes #14808.

Adds a Go model driver for Xinference so self-hosted Xinference chat
models can be used through the Go provider layer instead of falling
through to the dummy driver. Xinference exposes an OpenAI-compatible API
under `/v1`; the driver accepts either a root endpoint such as
`http://127.0.0.1:9997` or an OpenAI-compatible endpoint such as
`http://127.0.0.1:9997/v1` and normalizes it before calling chat or
model-listing routes.

### What is changed?

- Add `internal/entity/models/xinference.go` implementing `ModelDriver`
for Xinference chat.
- Route provider name `xinference` in
`internal/entity/models/factory.go`.
- Add `conf/models/xinference.json` as a local provider config.
- Add focused unit tests in `internal/entity/models/xinference_test.go`.

Initial method coverage:

- `ChatWithMessages`: POST `/v1/chat/completions`.
- `ChatStreamlyWithSender`: SSE streaming from `/v1/chat/completions`.
- `ListModels`: GET `/v1/models`.
- `CheckConnection`: lightweight `ListModels` probe.
- Optional auth: send `Authorization: Bearer <api_key>` only when a
non-empty key is configured, matching Xinference no-auth and
auth-enabled deployments.
- `Balance`, `Embed`, `Rerank`, ASR, TTS, and OCR return `no such
method` for this initial chat-provider PR.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)

### Tests

- `go test -vet=off -run TestXinference -count=1
./internal/entity/models/...`
- `go test -vet=off -count=1 ./internal/entity/models/...`

### References

- Xinference docs:
https://inference.readthedocs.io/zh-cn/latest/index.html
- OpenAI-compatible chat usage:
https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html
- API key auth:
https://inference.readthedocs.io/zh-cn/latest/user_guide/auth_system.html

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-19 15:10:13 +08:00
plind
7edabdf7c3 fix(retrieval): keep manual metadata filter reusable inside Iteration (#14849)
## What problem does this PR solve?

Closes #12582.

When a Retrieval component sits inside an Iteration with a **manual**
metadata filter that references the iteration variable (e.g.
`{IterationItem:abc@item}`), every iteration reuses the value resolved
on the **first** pass.

Root cause: [`_resolve_manual_filter` in
`agent/tools/retrieval.py`](https://github.com/infiniflow/ragflow/blob/main/agent/tools/retrieval.py#L144-L171)
mutated `flt["value"]` in place. The `filters` list passed in is the
live `self._param.meta_data_filter["manual"]` (see
[`apply_meta_data_filter` in
`common/metadata_utils.py:257-261`](https://github.com/infiniflow/ragflow/blob/main/common/metadata_utils.py#L257-L261)),
so after the first iteration the param dict permanently held the
resolved string instead of the original variable reference.

```text
iter #1: flt["value"] = "{IterationItem:abc@item}"  →  resolved to "AI"
         after mutation: flt["value"] = "AI"        ← written back into _param

iter #2: flt["value"] = "AI"                         ← no {…} matches
         retrieval keeps filtering by "AI" forever
```

This PR returns a shallow copy with the resolved value instead, leaving
the original filter (and its variable reference) intact for the next
iteration.

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)

## Test plan

- [ ] Build an agent: `Agent (structured output → list of areas) →
Iteration → Retrieval (manual filter: Area = {IterationItem/Item}) →
Message`. Run with a multi-area query and confirm each iteration's
Retrieval result matches its own item, not the first item.
- [ ] Regression: Retrieval with a manual metadata filter outside an
Iteration still resolves the variable correctly on each request.
- [ ] Regression: Retrieval with no metadata filter and with `auto` /
`semi_auto` filters behave unchanged.
2026-05-19 15:08:31 +08:00
plind
f169ab4b39 feat(tts): cache synthesized speech in Redis to avoid redundant calls (#14851)
## What problem does this PR solve?

Closes #12017.

TTS output is deterministic for a given `(model, text)` pair, so
re-running the same text through the same TTS model produces the same
bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on
every request. That's slow and wastes provider quota whenever the same
assistant response is replayed, shared across users, or repeated within
a session.

### Change

New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl,
cleaned_text)`:

- **Key:** `tts:cache:{model_id}:{sha256(text)}` — separate namespace
per model, identical cleaned text reuses a single entry across both call
sites.
- **Value:** the hex-encoded audio blob both call sites already
returned. No format change for downstream consumers.
- **TTL:** 7 days by default, configurable via
`RAGFLOW_TTS_CACHE_TTL_SECONDS`.
- **Failure modes:** a Redis hiccup falls back to direct synthesis; a
failed synthesis still returns `None` (existing contract preserved).


[`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724)
and
[`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380)
now route through the helper; the per-file bytes-accumulation/hex-encode
loop has been removed in favor of one shared implementation.

## Type of change

- [x] New Feature (non-breaking change which adds functionality)

## Test plan

- [ ] **Cache hit, chat path:** Configure a dialog with TTS enabled, ask
the same question twice with `stream=false`. Verify the second response
returns the same `audio_binary` and that the second invocation doesn't
hit the TTS provider (e.g., observe provider-side logs / usage counters;
check no `LLMBundle.tts can't update token usage` log line on the second
run).
- [ ] **Cache hit, agent path:** Same exercise via a Conversational
Agent that includes a Message component playing back the answer.
- [ ] **Cache isolation per model:** Switch tenant's `tts_id` between
two models, run the same text against each — confirm the second model's
first synthesis still happens (no cross-model hits).
- [ ] **TTL override:** Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm
the entry expires after 2 minutes.
- [ ] **Redis unavailable:** Stop Redis (or break the connection).
Verify the TTS endpoint still works — synthesis falls back to direct
calls, with a `TTS cache lookup failed` / `TTS cache store failed`
warning logged.
- [ ] **Failure path:** Configure a TTS model with an invalid API key,
ensure the response still returns successfully with `audio_binary=None`
(no regression vs. current behavior).
2026-05-19 14:20:40 +08:00
OrbisAI Security
f17a66d4f0 fix: the opencc c library uses fgets() to read dicti... in text.c (#13970)
## Summary
Fix critical severity security issue in
`internal/cpp/opencc/dictionary/text.c`.

## Vulnerability
| Field | Value |
|-------|-------|
| **ID** | V-001 |
| **Severity** | CRITICAL |
| **Scanner** | multi_agent_ai |
| **Rule** | `V-001` |
| **File** | `internal/cpp/opencc/dictionary/text.c:107` |

**Description**: The OpenCC C library uses fgets() to read dictionary
and configuration files without proper bounds validation on subsequent
buffer operations. While fgets() itself is bounds-checked, the sprintf()
call at config_reader.c:174 constructs file paths by concatenating
home_path and filename without verifying the result fits in pkg_filename
buffer. An attacker providing malformed OpenCC configuration files with
excessively long path components can overflow the fixed-size buffer,
overwriting adjacent memory including return addresses and function
pointers.

## Changes
- `internal/cpp/opencc/config_reader.c`
- `internal/cpp/opencc/dictionary/text.c`
- `internal/cpp/opencc/utils.c`

## Verification
- [x] Build passes
- [x] Scanner re-scan confirms fix
- [x] LLM code review passed

---
*Automated security fix by [OrbisAI Security](https://orbisappsec.com)*


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved error detection and handling for malformed configuration and
dictionary entries during file parsing.
* Enhanced memory cleanup in error recovery paths to prevent potential
issues.
* Strengthened robustness of string operations and buffer handling
throughout the library.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-15.us-west-2.compute.internal>
2026-05-19 13:55:33 +08:00
刘康伟
c6e3a2e713 Fix: MinerU vlm-http-client backend output file detection (#14240)
## Problem
When using MinerU with `vlm-http-client` backend, the parser fails to
find the output files because they are located in a `vlm/` subdirectory,
but the `_read_output`
  method doesn't check this location.

  ## Error Message
  [ERROR]MinerU not found.
  [MinerU] Missing output file, tried: ...

  ## Root Cause
The MinerU API with `vlm-http-client` backend returns output files in
the following structure:
  output_dir/
    vlm/
      filename_content_list.json
      filename.md
      images/

  However, the `_read_output` method in `mineru_parser.py` only checks:
  1. `output_dir/filename_content_list.json`
  2. `output_dir/sanitized_filename_content_list.json`
3. `output_dir/sanitized_filename/sanitized_filename_content_list.json`

  It doesn't check the `vlm/` subdirectory.

  ## Solution
  Added two additional fallback paths to check the `vlm/` subdirectory:
  - `output_dir/vlm/filename_content_list.json`
  - `output_dir/vlm/sanitized_filename_content_list.json`

  ## Testing
Tested with MinerU API using `vlm-http-client` backend. The parser now
successfully finds and processes the output files.

  ## Related
  This issue occurs specifically when using:
  - MinerU backend: `vlm-http-client`
  - MinerU server URL configured for remote vLLM inference
2026-05-19 12:28:31 +08:00
buua436
87d22a4415 Fix: agent session log message (#14991)
### What problem does this PR solve?

agent session log message
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-19 12:00:02 +08:00
tmimmanuel
4c9529ef36 Add Replicate chat provider (#14958)
## What
- Add Replicate as a chat provider backed by the documented predictions
API
- Register Replicate in the Go model factory and provider config
- Support non-streaming chat through sync predictions, polling fallback,
streaming through `urls.stream`, model listing, and connection checks

## Notes
- Uses `POST /v1/predictions` with Replicate model identifiers in
`version`, which supports official and community model identifiers
- Maps RAGFlow messages into Replicate prompt-shaped inputs (`prompt`,
optional `system_prompt`) and forwards common documented LLM inputs:
`max_new_tokens`, `temperature`, `top_p`
- Preserves whitespace in SSE output chunks and emits RAGFlow `[DONE]`
at stream completion

## Tests
- `go test -vet=off -run TestReplicate -count=1
./internal/entity/models`
- `go test -vet=off -count=1 ./internal/entity/models`

Refs #14736
2026-05-19 11:10:36 +08:00
Haruko386
db9e782747 Go: implement provider: MinerU (#14990)
### What problem does this PR solve?

Implement MinerU Provider

**The following functionalities are now supported:**

**MinerU**
----
- [x] Parse file
- [x] Show task
- [ ] ~~List tasks~~

**Verified examples from the CLI:**
```plaintext
RAGFlow(user)> parse with 'vlm@test@mineru' file 'https://arxiv.org/pdf/2505.09358'
+--------------------------------------+
| task_id                              |
+--------------------------------------+
| 142ac8ea-d9d0-4a68-a2d1-d3af67635dc9 |
+--------------------------------------+

RAGFlow(user)> show 'test@mineru' task '142ac8ea-d9d0-4a68-a2d1-d3af67635dc9'
+--------------------------------------------+-------+
| content                                    | index |
+--------------------------------------------+-------+
| Task is running... Progress: 17 / 18 pages | 0     |
+--------------------------------------------+-------+

RAGFlow(user)> show 'test@mineru' task '142ac8ea-d9d0-4a68-a2d1-d3af67635dc9'
+--------------------------------------------------------------------------------------------+-------+
| content                                                                                    | index |
+--------------------------------------------------------------------------------------------+-------+
| https://cdn-mineru.openxlab.org.cn/pdf/2026-05-18/142ac8ea-d9d0-4a68-a2d1-d3af67635dc9.zip | 0     |
+--------------------------------------------------------------------------------------------+-------+

```


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-19 10:49:33 +08:00
kingloon
525a87be0f Misc: fix some typos (#14987)
### What problem does this PR solve?

Fix minor code quality issues:

1. Fix typo in assertion error message: "Can't fine" → "Can't find"
2. Remove duplicate line in common/connection_utils.py

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2026-05-19 10:47:06 +08:00
jony376
198f3c4b9a Fix: validate memory tenant model IDs on update and enforce tenant scope in memory pipeline (#14923)
### Related issues

Closes #14922

### What problem does this PR solve?

`POST /memories` already resolves `tenant_llm_id` and `tenant_embd_id`
through `ensure_tenant_model_id_for_params`, but `PUT
/memories/<memory_id>` accepted client-supplied `tenant_llm_id` /
`tenant_embd_id` without checking that those `tenant_llm` rows belong to
the memory owner’s tenant. A caller could persist another tenant’s row
IDs and later trigger extraction or embedding that loaded foreign model
credentials via `get_model_config_by_id(tenant_model_id)` with no tenant
allow-list.

This change aligns the update path with create: updates that change
models must go through `llm_id` / `embd_id` and
`ensure_tenant_model_id_for_params` scoped to the **memory’s**
`tenant_id` (not only the current user, so team-access cases stay
correct). Direct `tenant_*` fields in the body without `llm_id` /
`embd_id` are rejected. As defense in depth, `memory_message_service`
passes `allowed_tenant_ids` / `requester_tenant_id` into
`get_model_config_by_id` for LLM and embedding resolution so mismatched
IDs cannot be used even if bad data existed. A regression test rejects
payloads that set only `tenant_llm_id` / `tenant_embd_id`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: jony376 <jony376@gmail.com>
2026-05-19 10:11:46 +08:00
Magicbook1108
b69a6a5d80 Feat: full optimization on connector dashboard (#14979)
### What problem does this PR solve?

This PR improves the connector dashboard task management experience and
adds better visibility into connector execution logs.

### Overview:

#### Before
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052"
/>

#### Now:
<img width="700" alt="Screenshot from 2026-05-18 16-31-30"
src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627"
/>

### 1. Add a new logging page to the connector dashboard

A new logging page has been added so users can view connector task
execution logs directly from the connector dashboard.

### 2. Merge the Resume button into Confirm

The separate **Resume** button has been removed. The **Confirm** button
now represents different actions depending on the current task state:

- **Save**: Save form changes and reschedule tasks.
- **Stop**: Cancel currently scheduled or running tasks.
- **Resume**: Create new scheduled tasks after the previous tasks have
been stopped.
- **Start**: Start tasks when no task has been started yet.

### 3. Separate syncing and pruning tasks

Connector tasks are now separated into **syncing** and **pruning**.

Pruning is controlled by the **Sync deleted files** option:

- When **Sync deleted files** is disabled, only syncing tasks are shown.
- When **Sync deleted files** is enabled, both syncing and pruning tasks
are shown.

**Now: Sync deleted files disabled**

<img width="700" alt="Sync deleted files disabled"
src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d"
/>

**Now: Sync deleted files enabled**

<img width="700" alt="Sync deleted files enabled"
src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296"
/>

### 4. Update logs in backend

<img width="700" alt="image"
src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2"
/>

### 5. Remove connector resume API

- Removed: `POST /v1/connectors/<connector_id>/resume`
- Replaced by: `PATCH /v1/connectors/<connector_id>`


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-19 10:07:11 +08:00
buua436
41a9fc0030 Go: add dataset graph api (#14984)
### What problem does this PR solve?

add dataset graph api

### Type of change

- [x] Refactoring
2026-05-18 20:02:53 +08:00