### What problem does this PR solve?
LLM/chat and search UIs render Markdown in several places (document
preview, floating chat widget, next-search, etc.). Plugin lists and
behavior were duplicated or inconsistent, and single newlines in model
output were not always rendered as visible line breaks, which hurts
readability for chat-style content.
This PR centralizes shared **remark/rehype** configuration (including
**`remark-breaks`** for newline handling) and wires the main Markdown
surfaces to use it, so behavior is consistent and easier to maintain.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
This PR implements OCR for Baidu and Mistral, adds a PaddleOCR
provider, and implements ASR for Cohere.
**Verified examples from the CLI:**
```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke
Nando Metzger
Photogra
Anton Obukhov
Rodrigo Caye Daudt
netry and Remote Sensing,
Shengyu Huang
Konrad Schindler
ETH Zürich
<div style="text-align: c... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# Cohere
RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text |
+-----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Closes #14753
## What changed
| File | Change |
|---|---|
| `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` |
| `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` |
| `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix legs |
| `CLAUDE.md` | dev setup command + requirements note updated |
| `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` → `from enum import StrEnum` |
| `agent/tools/code_exec.py` | same |
`StrEnum` has been in the stdlib since Python 3.11 — the `strenum`
backport package is no longer needed once the floor is 3.13.
## Why uv.lock is not regenerated
`uv lock --python 3.13` fails because:
1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0`
2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels)
depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0`
3. These two constraints are irreconcilable on Python 3.13
The lockfile regeneration requires loosening the `numpy` upper bound in
the `infiniflow/graspologic` fork. Once that fork commit is updated and
the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will
succeed.
## RFC corrections
Two claims in the original RFC (#14753) did not hold up under code
review:
- **"graspologic hard-blocks 3.13"** — the infiniflow fork at the pinned
commit has no `<3.13` Python constraint. The blocker is the transitive
`numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a
direct Python version cap.
- **"free-threading throughput gains for I/O-bound workload"** — Python
3.13 free-threading requires a special `--disable-gil` build and
provides no benefit for async I/O code (the GIL is already released
during I/O). The real motivation is forward compatibility and improved
error messages.
## Summary
- Rename misspelled attribute `model_speciess` to `model_species` across
4 files
- The extra `s` is a typo — `species` is already plural
## Test plan
- [ ] Verify PDF parsing with laws/manual/paper parser types still works
correctly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: yuj <yuj@ztjzsoft.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
This PR implements TTS and ASR for SiliconFlow and TTS for StepFun.
**The following functionalities are now supported:**
**SiliconFlow:**
- [x] Text To Speech
- [x] Audio To Text
- [x] Stream Audio To Text
**StepFun:**
- [x] Audio To Text
- [x] Stream Audio To Text
**Verified examples from the CLI:**
```plaintext
# SiliconFlow
RAGFlow(user)> tts with 'FunAudioLLM/CosyVoice2-0.5B@test@Siliconflow' text 'hello? show yourself' play format 'wav' param '{"voice": "fnlp/MOSS-TTSD-v0.5:alex"}'
SUCCESS
RAGFlow(user)> asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param ''
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> stream asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param ''
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
```
+---------------------+----------------------------------+-------------+-----------------+---------+--------+------+
| created_at | id | meta_fields | name | size | status | type |
+---------------------+----------------------------------+-------------+-----------------+---------+--------+------+
| 2026-05-08 19:35:08 | f6aa38bb4ad111f1ba6338a74640adcc | map[] | abc.pdf | 3387987 | 1 | pdf |
+---------------------+----------------------------------+-------------+-----------------+---------+--------+------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Users frequently ask how to use Ollama for local LLM inference with
RAGFlow. This FAQ entry provides step-by-step instructions for setting
up Ollama as a local model provider.
### Type of change
- [x] Documentation update
### Description
Adds a new FAQ entry: "How do I use Ollama with RAGFlow for local LLM
inference?"
Covers:
1. Starting Ollama and pulling a model
2. Configuring Ollama as a model provider in RAGFlow Settings
3. Using the Ollama model in an assistant
### What problem does this PR solve?
Closes #14878.
`VllmModel.Rerank()` in
[internal/entity/models/vllm.go:551](internal/entity/models/vllm.go#L551)
is currently a stub returning `nil, fmt.Errorf("%s, Rerank not
implemented", z.Name())`, and
[conf/models/vllm.json](conf/models/vllm.json) is missing a `rerank`
entry in `url_suffix`. Chat (long-standing) and embeddings (#14688)
already work, so rerank is the last missing leg of the retrieval
pipeline for operators running everything on a single self-hosted vLLM
server — today they have to point rerank at a different provider, which
defeats the point of a fully local deployment.
Upstream vLLM has supported a Jina/Cohere-compatible `POST /v1/rerank`
endpoint since v0.7
([vllm-project/vllm#12376](https://github.com/vllm-project/vllm/pull/12376)).
The request/response shape is essentially identical to the NVIDIA driver
landed in #14778, so this PR mirrors that structure with two
vLLM-specific adjustments.
This PR replaces the stub with a real implementation against vLLM's
`/v1/rerank`:
- `POST {baseURL}/rerank`
- Request body: `{"model": "<modelName>", "query": "<query>",
"documents": [...], "top_n": <int>}` — documents are a flat `[]string`,
**not** wrapped as `{text: "..."}` like NVIDIA's `/ranking`.
- Response body: `{"results": [{"index": int, "relevance_score": float},
...]}` (Jina-compatible; the optional `document` field is ignored since
callers reconstruct text via `Index`).
- `Authorization: Bearer <ApiKey>` is set **only when `APIConfig.ApiKey`
is non-empty**, matching the existing `Embed`/`ListModels` behaviour in
this file. vLLM is a local driver and can be deployed without an API
key.
The return shape matches the existing `*RerankResponse` contract used by
the NVIDIA ([nvidia.go:461](internal/entity/models/nvidia.go#L461)),
Aliyun ([aliyun.go:507](internal/entity/models/aliyun.go#L507)), and
ZhipuAI ([zhipu-ai.go:554](internal/entity/models/zhipu-ai.go#L554))
drivers, i.e. `Data []RerankResult` carrying `{Index, RelevanceScore}`
in the API's ranking order. Callers that need original-input order sort
by `Index`.
Behaviour requirements from the issue, all covered:
1. Empty `documents` → returns `&RerankResponse{}` without an HTTP call.
2. Missing `modelName` → `"model name is required"` validation error.
3. `rerankConfig.TopN` honored when `0 < TopN < len(documents)`;
otherwise `top_n` defaults to `len(documents)` so callers get a score
per input.
4. Non-200 responses return an error including upstream status and body
(`"vLLM rerank API error: <status>, body: <body>"`).
5. Response `index` values are bounds-checked against `len(documents)`.
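Requirements 3 and 5 above can be sketched as follows. This is a Python illustration of the rules as stated, not the Go implementation in `vllm.go`; the function names are hypothetical.

```python
def effective_top_n(configured_top_n, num_documents):
    # Honor TopN only when 0 < TopN < len(documents);
    # otherwise ask for one score per input document.
    if configured_top_n and 0 < configured_top_n < num_documents:
        return configured_top_n
    return num_documents


def parse_results(results, num_documents):
    # Keep the API's ranking order, but reject indexes that do not
    # point at one of the submitted documents.
    parsed = []
    for item in results:
        index = item["index"]
        if not 0 <= index < num_documents:
            raise ValueError(
                f"rerank index {index} out of range for {num_documents} documents"
            )
        parsed.append((index, item["relevance_score"]))
    return parsed
```

Callers that need original-input order can sort the parsed pairs by their first element.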
**Scope:**
- [internal/entity/models/vllm.go](internal/entity/models/vllm.go) —
replaces the `Rerank` stub at line 551 with a real implementation; adds
`vllmRerankRequest`/`vllmRerankResponse` types for the slim subset of
the payload we need. Region/baseURL resolution, 30s context timeout,
conditional bearer header, and error wrapping all follow the existing
patterns in this file.
- [conf/models/vllm.json](conf/models/vllm.json) — adds `"rerank":
"rerank"` to `url_suffix`, joined to the operator-configured vLLM base
URL the same way the NVIDIA driver joins at
[nvidia.go:485](internal/entity/models/nvidia.go#L485).
-
[internal/entity/models/vllm_rerank_test.go](internal/entity/models/vllm_rerank_test.go)
— adds 7 `httptest`-backed tests mirroring `nvidia_rerank_test.go`:
happy path (out-of-order ranking → Index preservation), `top_n` clamp to
`RerankConfig.TopN`, empty-documents short-circuit, missing-model-name
validation, HTTP error propagation, out-of-range index rejection, and a
vLLM-specific `TestVllmRerankWithoutAPIKey` locking in the optional-auth
behaviour that distinguishes this driver from NVIDIA.
**Out of scope:** no interface change, no DDL, no frontend change. Chat,
embeddings, and balance paths are untouched. No new user-facing docs
required beyond the existing rerank model setup page — vLLM joins the
list of providers whose rerank model can be selected once `/v1/rerank`
is exposed by the server.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fixes #13975
## Problem
The GitHub data source connector had both `include_pull_requests` and
`include_issues` defaulting to `false` in both the frontend form and the
backend sync code. This meant that with the default configuration, **no
content was synced at all** from a GitHub repository — silently
producing zero results.
Additionally, the form field labels contained a typo: "Inlcude" instead
of "Include".
## Solution
- Changed `include_pull_requests` default from `false` to `true` in the
frontend form fields and default values
- Changed `include_issues` default from `false` to `true` in the
frontend form fields and default values
- Changed both backend defaults in `sync_data_source.py` from `False` to
`True`
- Fixed label typos: "Inlcude Pull Requests" → "Include Pull Requests"
and "Inlcude Issues" → "Include Issues"
This makes the GitHub connector consistent with the GitLab connector,
which already defaults `include_mrs`, `include_issues`, and
`include_code_files` all to `true`.
## Testing
- The connector now syncs both pull requests and issues by default when
a new GitHub data source is created
- Users who want to exclude PRs or issues can uncheck the corresponding
checkboxes in the form
Co-authored-by: octo-patch <octo-patch@github.com>
## What problem does this PR solve?
Closes #13384.
The `/api/v1/agentbots/<agent_id>/completions` non-streaming path
returned the first yielded SSE chunk and exited:
```python
async for answer in agent_completion(objs[0].tenant_id, agent_id, **req):
    return get_result(data=answer)  # returns on the first yielded chunk
```
That meant structured output, the full assistant message, and reference
data were all dropped when an agent was called with `stream=false`.
Streaming worked because each event was forwarded individually;
non-streaming was returning a raw SSE-formatted string from a single
early event.
The v1 endpoint at
[`agent_api.py:1006-1050`](https://github.com/infiniflow/ragflow/blob/main/api/apps/restful_apis/agent_api.py#L1006-L1050)
already handles this correctly. This PR mirrors that aggregation in the
SDK beta endpoint: parse each SSE line, accumulate `content` from
`message` events, merge `reference`, collect `outputs.structured` from
each `node_finished` event keyed by `component_id`, and attach all of
them to the final response.
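The aggregation described above might look roughly like this. It is a simplified sketch, not the code in `agent_api.py`: the function name is hypothetical, and the exact event layout (`data:` lines carrying JSON with `event` and `data` keys) is an assumption based on the field names mentioned in this description.

```python
import json


def aggregate_sse(chunks):
    """Fold SSE chunks into one non-streaming payload (assumed event shape)."""
    content = ""
    reference = {}
    structured = {}
    for chunk in chunks:
        for line in chunk.splitlines():
            if not line.startswith("data:"):
                continue
            try:
                event = json.loads(line[len("data:"):].strip())
            except json.JSONDecodeError:
                continue  # skip malformed or non-JSON lines
            if not isinstance(event, dict):
                continue
            data = event.get("data")
            if not isinstance(data, dict):
                continue
            if event.get("event") == "message":
                # Accumulate assistant text and merge reference data.
                content += data.get("content", "")
                if isinstance(data.get("reference"), dict):
                    reference.update(data["reference"])
            elif event.get("event") == "node_finished":
                # Collect structured output keyed by component_id.
                outputs = data.get("outputs")
                if isinstance(outputs, dict) and outputs.get("structured"):
                    structured[data.get("component_id")] = outputs["structured"]
    result = {"content": content, "reference": reference}
    if structured:
        result["structured"] = structured  # omitted when no node emitted any
    return result
```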
## Type of change
- [x] Bug fix (non-breaking change which fixes an issue)
## Test plan
- [ ] Build an agent with a node that emits structured output, call
`POST /api/v1/agentbots/<agent_id>/completions` with `stream=false` and
a beta API token, verify `data.structured.<component_id>` is present in
the response.
- [ ] Same agent with `stream=true` — verify behavior is unchanged.
- [ ] Agent without structured output — verify `data.structured` is
omitted, `content` and `reference` still aggregated correctly.
### What problem does this PR solve?
```
RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。
——佚名 |
+----------------------------------------------------------+
RAGFlow(user)> list 'test@gitee' tasks;
+---------+----------------------------------+
| status | task_id |
+---------+----------------------------------+
| success | C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5 |
+---------+----------------------------------+
RAGFlow(user)> show 'test@gitee' task 'C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # PDF 1: Purpose of RAGFlow
RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine designed to turn raw documents into reliable context for large language models.Its purpose is to make it practical to build an Al assistant that can ans... | 1 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Harden `api/utils/configs.deserialize_b64` so that it always routes
pickle data through the existing `RestrictedUnpickler`
(`restricted_loads`) rather than falling back to bare `pickle.loads()`.
- **CWE-502** — Deserialization of Untrusted Data
- **File / function**: `api/utils/configs.py` → `deserialize_b64`
- **Caller**: `SerializedField.python_value` in `api/db/db_models.py`
(invoked by Peewee whenever a pickled DB column is read)
## The issue
Before this change, `deserialize_b64` consulted a
`use_deserialize_safe_module` config flag that **defaults to `False`**
and is not set anywhere in the repository:
```python
use_deserialize_safe_module = get_base_config('use_deserialize_safe_module', False)
if use_deserialize_safe_module:
    return restricted_loads(src)
return pickle.loads(src)  # <-- default path
```
So the default code path was unrestricted `pickle.loads()` on bytes read
from a MySQL `SerializedField(serialized_type=PICKLE)` column. Any
attacker who can influence those bytes (SQL injection elsewhere,
compromised DB credentials, a backup restored from an untrusted source,
or a compromised replication peer) can craft a pickle payload that
achieves arbitrary code execution on the ragflow application server when
the field is next read.
Today no model in-tree instantiates a `SerializedField` with the default
PICKLE type — only `JsonSerializedField` is used in practice — so the
attack surface is currently **latent** rather than actively reachable
through an HTTP endpoint. But the insecure-by-default behaviour is a
sharp edge: any future field that uses the default PICKLE serialization
would silently inherit RCE-on-read semantics.
## The fix
```diff
-    use_deserialize_safe_module = get_base_config(
-        'use_deserialize_safe_module', False)
-    if use_deserialize_safe_module:
-        return restricted_loads(src)
-    return pickle.loads(src)
+    return restricted_loads(src)
```
`restricted_loads` is the existing `RestrictedUnpickler` already defined
in the same file, which limits permitted modules to `numpy` and
`rag_flow`. The config flag (and the now-dead `get_base_config` import)
are removed.
Diff is 1 insertion / 6 deletions, scoped to a single function.
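For readers unfamiliar with the pattern, here is a minimal sketch of an allow-listing unpickler built on `pickle.Unpickler.find_class`. It is not a copy of the helper in `api/utils/configs.py`: the allow-list constant and the top-level-package check are assumptions, and the probe payload resolves to the harmless `os.getcwd` rather than a shell command.

```python
import io
import os
import pickle

# Assumption: an allow-list of top-level packages, as described above.
SAFE_MODULES = {"numpy", "rag_flow"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve a global only if its top-level package is allow-listed.
        if module.split(".")[0] in SAFE_MODULES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(src: bytes):
    return RestrictedUnpickler(io.BytesIO(src)).load()


class Probe:
    # Harmless stand-in for a malicious payload: would call os.getcwd().
    def __reduce__(self):
        return (os.getcwd, ())


try:
    restricted_loads(pickle.dumps(Probe()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Plain containers and primitives never go through `find_class`, so they still round-trip unchanged.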
## Testing
- Built a malicious pickle whose `__reduce__` resolves to
`posix.system('id')`. Pre-fix: executes. Post-fix: `restricted_loads`
raises `UnpicklingError: global 'posix.system' is forbidden`.
- Round-tripped a benign `numpy.ndarray` through `serialize_b64` →
`deserialize_b64`. Values preserved bit-for-bit.
- Confirmed `use_deserialize_safe_module` is not set in any config file
in the tree, so removing the flag does not change any operator-facing
knob that was actually in use.
## A note on `restricted_loads` itself
The existing `SECURITY.md` notes that `restricted_loads`'s `numpy`
allow-list can still be reached via `numpy.f2py.diagnose.run_command`.
This PR does **not** attempt to fix that — it is a separate hardening
question about tightening the allow-list to specific symbols rather than
whole modules. The change here strictly improves on the status quo (bare
`pickle.loads`) and brings the default path in line with what the
`restricted_loads` helper was clearly designed for. Happy to follow up
with a separate PR narrowing the allow-list if that direction is
welcome.
## Adversarial review
Before submitting, we tried to argue this finding away. The two
strongest objections are (1) "no field uses PICKLE today, so this is
unreachable" — true, but the default behaviour of a security-sensitive
helper still matters because new fields silently inherit it; and (2)
"the attacker already needs DB write access, which is game over" —
partially true, but pickle-RCE meaningfully escalates *data tampering*
into *code execution on the application host* (filesystem, internal
network, in-process secrets), which is not equivalent. The fix is one
line of real code, has no behavioural cost for legitimate callers, and
removes an insecure default. We decided it was worth filing.
---
<sub>_Submitted by Sebastion — autonomous open-source security research
from [Foundation Machines](https://foundationmachines.ai). Free for
public repos via the [Sebastion AI GitHub
App](https://github.com/marketplace/sebastion-ai)._</sub>
### What problem does this PR solve?
This PR fixes two issues in Agent Retrieval behavior and configuration
UX:
1. `top_k` configured in Agent Retrieval was not passed down to the
backend retriever call, so retrieval could ignore the configured vector
recall limit.
2. Similarity weight slider semantics were confusing in Agent forms
because the Agent field stores `keywords_similarity_weight` while UI
interactions were interpreted as vector weight. This could cause
displayed values and actual behavior to diverge.
This PR ensures Agent retrieval uses configured `top_k`, and makes the
slider behavior consistent and explicit for both vector and keyword
weight modes.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Fixes **Tongyi-Qianwen** (`QWenEmbed`) text embeddings when the
configured `base_url` points at DashScope **international**
(`dashscope-intl.aliyuncs.com`) or **China** (`dashscope.aliyuncs.com`)
hosts, including values copied from Model Studio that use the
**OpenAI-compatible** path (`.../compatible-mode/v1`).
- The `dashscope` Python SDK (`TextEmbedding.call`) expects the
**native** HTTP root (`https://<host>/api/v1`), not the
OpenAI-compatible base URL. Without mapping, international accounts
could hit the wrong host or path.
## Implementation
- Added `_dashscope_native_http_api_url()` to normalize known DashScope
hosts to `.../api/v1`, and wired `QWenEmbed` to set
`dashscope.base_http_api_url` before each embedding call (document and
query).
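A rough sketch of the normalization idea, using only the two hosts named in the summary. The function name and the fallback of returning `None` for unknown hosts are assumptions for illustration, not the behavior of `_dashscope_native_http_api_url()` itself.

```python
from urllib.parse import urlsplit

# Hosts named in the summary above.
DASHSCOPE_HOSTS = {"dashscope.aliyuncs.com", "dashscope-intl.aliyuncs.com"}


def to_native_api_url(base_url):
    """Map a known DashScope URL to its native root; return None otherwise."""
    if not base_url:
        return None
    host = urlsplit(base_url).hostname
    if host in DASHSCOPE_HOSTS:
        # Drop any OpenAI-compatible path and use the native HTTP root.
        return f"https://{host}/api/v1"
    return None
```

For example, a value copied as `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` would map to `https://dashscope-intl.aliyuncs.com/api/v1` under this sketch.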
## Notes
- In-code comments document the Tongyi-Qianwen / DashScope intl vs CN
behavior for future maintainers.
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
## Problem
When a user uploads a file attachment in their first message (Q1) and
then sends a follow-up message (Q2) that triggers a backend response,
the uploaded file attachment disappears from Q1 in the chat UI.
Fixes #13959
## Root Cause
In `single-chat-box.tsx`, a `useEffect` hook syncs `derivedMessages`
from `conversation?.messages` whenever the conversation data changes
(e.g., after a new assistant reply arrives):
```typescript
useEffect(() => {
  const messages = conversation?.messages;
  if (Array.isArray(messages)) {
    setDerivedMessages(messages); // ← overwrites local state
  }
}, [conversation?.messages, setDerivedMessages]);
```
The problem is that `conversation.messages` comes from the server, which
stores messages as plain JSON. Browser `File` objects (uploaded by the
user) cannot be serialized to JSON, so they are never stored on the
server. Each time the server data is applied to local state, the `files`
array on the user's first message is lost.
## Fix
Instead of replacing the local messages wholesale, preserve any `files`
entries from the previous local state by ID before applying the server
data:
```typescript
useEffect(() => {
  const messages = conversation?.messages;
  if (Array.isArray(messages)) {
    setDerivedMessages((prevMessages) => {
      // Index locally held File objects by message ID.
      const filesMap = new Map(
        prevMessages
          .filter((m) => m.files?.length)
          .map((m) => [m.id, m.files]),
      );
      if (filesMap.size === 0) {
        return messages;
      }
      // Re-attach local files to the matching server messages.
      return messages.map((m) => ({
        ...m,
        files: filesMap.get(m.id) ?? m.files,
      }));
    });
  }
}, [conversation?.messages, setDerivedMessages]);
```
This is a minimal, targeted fix: when there are no local files to
preserve the behavior is identical to before (early return with plain
assignment). When local file objects exist they are re-attached to the
corresponding server messages by ID.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved search query processing to properly handle special characters
and apostrophes in search terms and synonyms.
* Fixed chat message file attachments to persist when syncing with
server.
* **Refactor**
* Simplified OCR detection return values by removing timing metadata.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: ximi <octo-patch@github.com>
### What problem does this PR solve?
Add a document download endpoint and refactor the existing download function.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: This enables SelectWithSearch to search by label.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
This PR implements TTS for the FishAudio and MiniMax providers and ASR
for FishAudio.
**The following functionalities are now supported:**
**FishAudio:**
- [x] Text To Speech
- [x] Stream Text To Speech
- [x] Audio To Text
**OpenRouter:**
- [x] Text To Speech
**Verified examples from the CLI:**
```plaintext
# FishAudio
RAGFlow(user)> tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav
SUCCESS
RAGFlow(user)> stream tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav
SUCCESS
RAGFlow(user)> asr with 'transcribe-1@test@fishaudio' audio './internal/test.wav' param '{"language": "en", "ignore_timestamps": true}'
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
## Summary
- Fixes resource-management bugs in the `POST
/agents/test_db_connection` endpoint where database connections could be
left open on error (part of #14750)
## Changes
- `api/apps/restful_apis/agent_api.py` — `test_db_connection`:
- mysql / mariadb / oceanbase / postgres: replaced bare `db.connect()` /
`db.close()` fallthrough with `with db.connection_context()` and a probe
`SELECT 1` — guaranteed close on both success and exception
- mssql: nested `try/finally` blocks so `cursor.close()` and
`db.close()` are always called even when `cursor.execute()` raises
- trino: wrapped cursor ops in `try/finally` for the same reason
- Removed the `if req["db_type"] != "mssql": db.connect(); db.close()`
shared fallthrough block — each branch now owns its teardown
- Consolidated to a single `return get_json_result(...)` after the
if/elif chain
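The nested `try/finally` shape used for the mssql and trino branches can be illustrated with the stdlib `sqlite3` driver. This is a stand-in sketch only: the function name is hypothetical and the real endpoint uses different drivers.

```python
import sqlite3


def probe(db_path=":memory:", sql="SELECT 1"):
    db = sqlite3.connect(db_path)
    try:
        cursor = db.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchone()
        finally:
            cursor.close()  # runs even when execute() raises
    finally:
        db.close()  # runs on both success and exception
```

Each `finally` owns exactly one resource, so a failure in `cursor.execute()` still closes the cursor and then the connection before the exception propagates.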
### What problem does this PR solve?
When multiple MCP servers expose tools with the same name, the agent
currently registers those tools using their original MCP names. This can
lead to two issues:
- later MCP tools may overwrite earlier ones in the agent tool map
- duplicate function names may be exposed to the LLM
This PR fixes duplicate MCP tool-name handling by applying the same
indexed naming strategy already used for native agent tools. Native
tools are exposed with generated names such as `<tool_name>_<index>` to
avoid collisions, and MCP tools now follow the same convention for
consistency.
Specifically, this PR:
- assigns unique indexed function names to MCP tools exposed to the LLM
- preserves each MCP tool's original server-side name in an
`MCPToolBinding`
- dispatches MCP calls using the original MCP tool name while keeping
the indexed name in the agent tool map
- allows MCP metadata conversion to override only the OpenAI function
name without modifying the original MCP tool metadata
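
A minimal sketch of the indexed naming idea, assuming a simple running counter and a simplified binding type. `ToolBinding` and `register_tools` are hypothetical names for illustration; the real `MCPToolBinding` and index scheme may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolBinding:
    """Hypothetical stand-in for MCPToolBinding: remembers where a tool lives."""

    server: str
    original_name: str


def register_tools(servers):
    """Build a tool map keyed by unique indexed names.

    `servers` is a list of (server_name, [tool_name, ...]) pairs.
    """
    tool_map = {}
    index = 0
    for server, tool_names in servers:
        for name in tool_names:
            exposed = f"{name}_{index}"  # name shown to the LLM
            tool_map[exposed] = ToolBinding(server, name)
            index += 1
    return tool_map


tool_map = register_tools([("server_one", ["mcp0"]), ("server_two", ["mcp0"])])
```

Dispatch would then look up the exposed name in `tool_map` and call the bound server with `original_name`, so neither `mcp0` tool overwrites the other.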
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Validation
The validation was performed using two MCP servers. Both servers exposed
a tool with the same name: `mcp0`. Both tools take no input parameters.
**MCP Server One:**
<img width="1780" height="625" alt="ONE"
src="https://github.com/user-attachments/assets/801a2654-fc10-4b71-b31c-81841fd40c55"
/>
**MCP Server Two:**
<img width="1777" height="624" alt="Second"
src="https://github.com/user-attachments/assets/c095151d-7bdf-47c8-9bfe-6aaf4a01b944"
/>
**Before the fix:**
When invoking `mcp0`, only the `mcp0` tool from the MCP server injected
later could be called successfully. As shown below, both `mcp0` tools
were present, but only the later-registered one was actually invokable.
<img width="694" height="935" alt="Three"
src="https://github.com/user-attachments/assets/3b9d7ab2-1765-492c-b8e0-bf05a69933ca"
/>
**After the fix:**
Both `mcp0` tools can now be invoked correctly.
<img width="737" height="1095" alt="F"
src="https://github.com/user-attachments/assets/6e896627-2b7f-41bb-becc-daa0c73ff58f"
/>
<img width="730" height="1090" alt="six"
src="https://github.com/user-attachments/assets/aba75593-26ae-4e3b-951d-b45ff177fd32"
/>
### What problem does this PR solve?
Fixes #14866.
Previously, `DocumentService.increment_chunk_num` and
`decrement_chunk_num` updated the `Document` row and its parent
`Knowledgebase` row in two separate, non-transactional statements. If
the second update failed (DB error, connection drop, etc.) after the
first one succeeded, the document and knowledge base chunk/token
counters would drift apart and stay inconsistent.
There was also a behavioral asymmetry between the two methods:
- `increment_chunk_num` only logged a warning when the document row was
missing and returned a value that callers usually treated as success.
- `decrement_chunk_num` raised `LookupError` in the same situation.
This PR makes the counter updates atomic and aligns the missing-document
behavior between the two methods:
- Wrap the `Document` and `Knowledgebase` updates in
`increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()`
block so both succeed or both roll back together.
- Raise `LookupError` from `increment_chunk_num` when the target
document no longer exists, matching `decrement_chunk_num`.
- Update `reset_document_for_reparse` in `document_api_service.py` to
catch the new `LookupError` and return a proper "Document not found!"
API error instead of propagating the exception.
No schema changes, no API contract changes for the success path; only
the failure mode for a missing document during reparse is now a clean
error response instead of an uncaught exception.
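
The both-or-neither behavior can be illustrated with the standard-library `sqlite3` module instead of the project's ORM. This is a sketch under assumptions: the table layout is simplified, and raising `LookupError` for a missing knowledge base row is added here only to trigger a rollback in the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, chunk_num INTEGER)")
conn.execute("CREATE TABLE knowledgebase (id TEXT PRIMARY KEY, chunk_num INTEGER)")
conn.execute("INSERT INTO document VALUES ('d1', 0)")
conn.execute("INSERT INTO knowledgebase VALUES ('kb1', 0)")
conn.commit()


def increment_chunk_num(conn, doc_id, kb_id, chunk_num):
    """Update both counters in one transaction, or neither."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE document SET chunk_num = chunk_num + ? WHERE id = ?",
            (chunk_num, doc_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"Document {doc_id} not found")
        cur = conn.execute(
            "UPDATE knowledgebase SET chunk_num = chunk_num + ? WHERE id = ?",
            (chunk_num, kb_id),
        )
        if cur.rowcount == 0:
            # Assumption for the demo: treat a missing parent row as an error.
            raise LookupError(f"Knowledgebase {kb_id} not found")


def chunk_num_of(conn, table, row_id):
    """Read a counter back; `table` is a trusted literal in this sketch."""
    row = conn.execute(
        f"SELECT chunk_num FROM {table} WHERE id = ?", (row_id,)
    ).fetchone()
    return row[0]
```

If the second update fails, the first one is rolled back, so the document counter never moves ahead of the knowledge base counter.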
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Replace `gen_conf={}` with `gen_conf=None` + guard in
`rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat,
LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat
classes)
- Replace `doc_ids=[]` with `doc_ids=None` + guard in
`api/db/services/document_service.py` (1 instance)
- Mutable default arguments are shared across all calls, causing
potential cross-request state contamination
- See Python docs:
https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects
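
A small sketch of why the change matters. `chat_shared` and `chat_safe` are hypothetical functions, not the methods in `chat_model.py`; they only show how a mutable default leaks state between calls and how the `None` guard avoids it.

```python
def chat_shared(prompt, gen_conf={}):
    """Buggy pattern: the same dict object is reused on every call."""
    gen_conf.setdefault("calls", 0)
    gen_conf["calls"] += 1
    return gen_conf


def chat_safe(prompt, gen_conf=None):
    """Fixed pattern: a fresh dict is created when none is passed."""
    if gen_conf is None:
        gen_conf = {}
    gen_conf.setdefault("calls", 0)
    gen_conf["calls"] += 1
    return gen_conf
```

Calling `chat_shared` twice without arguments returns the same dict with an accumulating counter, while each call to `chat_safe` starts from an empty dict.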
## Test plan
- [x] Verify LLM calls work with and without explicit gen_conf
- [x] No behavior change for existing callers — `None` is replaced with
`{}` at function entry
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Closes #14853
The `/agents/download` and `/agents/<agent_id>/upload` endpoints in the
agent API are missing `@login_required` and `@add_tenant_id_to_kwargs`
decorators, allowing unauthenticated access. This is a security issue —
any user can upload files to or download files from an agent without
being logged in. Additionally, the upload endpoint bypasses canvas
access control (`@_require_canvas_access_async`).
This PR adds the missing authentication and authorization decorators to
both endpoints and replaces the manual `user_id` / `created_by` lookups
with the `tenant_id` provided by the auth middleware, making these
endpoints consistent with the rest of the agent API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
In `api/utils/web_utils.py`, `__get_pdf_from_html()` creates a Chrome
WebDriver but only calls `driver.quit()` inside the `TimeoutException`
handler. If the page element becomes stale before the timeout (no
exception raised), the WebDriver is never quit, leaking the Chrome
browser process and returning `None`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes
- Move the PDF printing logic and `driver.quit()` outside the `except`
block so they execute on all code paths
- Use `try/finally` to ensure `driver.quit()` is always called, even if
the `Page.printToPDF` DevTools call fails
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary
`Graph.set_variable_param_value()` in `agent/canvas.py` has a bug in its
nested path traversal logic. The `for` loop iterates through **all**
keys in the path (including the last one), descending into every level.
After the loop, it then tries to set `cur[keys[-1]] = value`, but `cur`
has already descended one level too deep.
**Example:** For `path = "a.b"`, `value = "hello"`:
- **Before (bug):** `obj["a"]["b"]` becomes `{"b": "hello"}` instead of
`"hello"`
- **After (fix):** `obj["a"]["b"]` becomes `"hello"` as expected
The fix changes `for key in keys:` to `for key in keys[:-1]:`, so the
loop only navigates to the parent dict, and the final key is set
directly. This is consistent with how the read-side counterpart
`get_variable_param_value()` works.
This method is called by `set_variable_value()` when assigning to nested
variable paths (e.g., `component@root.nested.key`), which is used by the
`VariableAssigner` component.
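
The off-by-one can be reproduced in a few lines. This sketch assumes the traversal uses `dict.setdefault` to create missing levels; the function names are illustrative and not the actual `Graph` methods.

```python
def set_nested_buggy(obj, path, value):
    """Descends through every key, including the last one."""
    keys = path.split(".")
    cur = obj
    for key in keys:
        cur = cur.setdefault(key, {})
    cur[keys[-1]] = value  # cur is already one level too deep


def set_nested_fixed(obj, path, value):
    """Descends only to the parent dict, then sets the final key."""
    keys = path.split(".")
    cur = obj
    for key in keys[:-1]:
        cur = cur.setdefault(key, {})
    cur[keys[-1]] = value


buggy, fixed = {}, {}
set_nested_buggy(buggy, "a.b", "hello")
set_nested_fixed(fixed, "a.b", "hello")
# buggy == {"a": {"b": {"b": "hello"}}}
# fixed == {"a": {"b": "hello"}}
```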
## Test plan
- [ ] Create a canvas with a VariableAssigner that writes to a nested
path (e.g., `component@obj.nested.key`)
- [ ] Verify the value is set correctly at the expected path, not
wrapped in an extra dict layer
- [ ] Verify single-key paths (e.g., `component@key`) still work
correctly
## Summary by CodeRabbit
* **Bug Fixes**
* Fixed a bug in variable parameter assignment where nested structures
were being incorrectly modified, ensuring values are now properly set at
their intended locations without unintended overwrites.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
This PR implements TTS for the MiniMax provider and adds CLI testing for TTS.
**The following functionalities are now supported:**
**MiniMax:**
- [x] Chat / Stream Chat
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [x] Text To Speech
- [ ] OCRFile
- [ ] ~~Audio To Text~~
- [ ] ~~Balance~~
**Verified examples from the CLI:**
```plaintext
RAGFlow(user)> tts with 'speech-2.8-hd@test@minimax' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"voice_setting": {"voice_id": "English_radiant_girl", "speed": 1, "vol": 1, "pitch": 0}, "audio_setting": {"sample_rate": 32000, "bitrate": 128000, "format": "wav", "channel": 1}, "output_format": "hex"}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/speech-2.8-hd_output.wav
SUCCESS
RAGFlow(user)> stream tts with 'speech-2.8-hd@test@minimax' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"voice_setting": {"voice_id": "English_radiant_girl", "speed": 1, "vol": 1, "pitch": 0}, "audio_setting": {"sample_rate": 32000, "bitrate": 128000, "format": "wav", "channel": 1}, "output_format": "hex"}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/speech-2.8-hd_output.wav
SUCCESS
```
- `play`: play the audio in the CLI
- `save 'PATH_TO_SAVE'`: save the audio file to the given directory
- `format`: save the file as `wav` or `mp3`
- `param`: JSON that follows the provider's official request body
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #9837
The Langfuse integration currently only sends the output text to
`langfuse_generation.update()` without including token usage
information. This means Langfuse cannot track input/output token
consumption for cost analysis and monitoring.
### Solution
Add the `usage` parameter to `langfuse_generation.update()` with:
- `input`: approximate input token count from `message_fit_in()`
- `output`: approximate output token count from
`num_tokens_from_string(answer)`
- `total`: sum of input and output
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
`assert "string"` always passes in Python because non-empty strings
are truthy. This silently skips input validation:
- **variable_assigner.py line 51**: `assert "Variable is not
complete."` → `raise ValueError("Variable is not complete.")`
- **loop.py line 59**: `assert "Loop Variable is not complete."` →
`raise ValueError("Loop Variable is not complete.")`
Without this fix, incomplete variables pass validation silently and
cause a confusing KeyError on the next line.
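
A minimal reproduction of the pattern. The function names and the `if not variable` condition are illustrative and not taken from `variable_assigner.py` or `loop.py`.

```python
def validate_buggy(variable):
    """The assert checks a non-empty string literal, which is always truthy."""
    if not variable:
        assert "Variable is not complete."  # never fails
    return True


def validate_fixed(variable):
    """Raises explicitly, so incomplete input is actually rejected."""
    if not variable:
        raise ValueError("Variable is not complete.")
    return True
```

`validate_buggy(None)` returns `True`, while `validate_fixed(None)` raises `ValueError`. Raising explicitly also keeps the check active when Python runs with `-O`, which strips `assert` statements.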
## Description
This PR fixes critical bugs and improves the robustness of the RAG
reranking module while maintaining **100% backward compatibility** with
all existing functionality and providers.
## Key Changes
1. **Network Stability**: Added 30s timeout to all API requests to
prevent service blocking
2. **Boundary Protection**: Added empty query/text validation for all
rerank models
3. **Response Fault Tolerance**: Replaced hardcoded key access with
`.get()` to avoid KeyError crashes
4. **Bug Fixes**:
   - Fixed `Ai302Rerank` (completely non-functional before)
   - Fixed `GPUStackRerank` incorrect exception catching
   - Fixed `_normalize_rank` empty array crash
5. **Code Specification**: Added type annotations, standardized
unimplemented class prompts
## Compatibility
- ✅ No changes to any class/method names
- ✅ All rerank providers (Jina/Cohere/NVIDIA/HuggingFace etc.) work as
before
- ✅ No breaking changes, zero impact on existing workflows
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.25.3 to v0.25.4
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Adds gpt-5.4-mini and gpt-5.4-nano to the OpenAI model list.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a Go driver for Voyage AI (https://voyageai.com), one of the
unchecked providers on the umbrella tracking issue #14736. Voyage AI is
**embed + rerank only** — no chat, no streaming, no `/v1/models`
endpoint. It's the first provider in the Go layer of this shape.
Until this PR, a tenant who configured `voyage` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.
### What this PR includes
- New `internal/entity/models/voyage.go` with a `VoyageModel`
implementing the `ModelDriver` interface.
- New `conf/models/voyage.json` with 6 embedding models (`voyage-3.5`,
`voyage-3.5-lite`, `voyage-3-large`, `voyage-code-3`, `voyage-law-2`,
`voyage-finance-2`) and 2 rerank models (`rerank-2`, `rerank-2-lite`).
- `factory.go`: route `"voyage"` to `NewVoyageModel`.
- `internal/entity/models/voyage_test.go`: 19 unit tests.
### How the driver works
- **Embed**: `POST /v1/embeddings`. Response is OpenAI-shaped (`{data:
[{embedding, index, object, text}], model, usage}`). Driver reorders by
`index`, rejects duplicate / out-of-range / missing slots, and
short-circuits empty input without an HTTP call.
- **Rerank**: `POST /v1/rerank`. Voyage uses **`top_k`** as the request
param name (not `top_n` like Aliyun/SiliconFlow); the driver translates
`RerankConfig.TopN` → `top_k`. Response is Cohere-shaped (`{data:
[{relevance_score, index}], model}`), so the existing
`RerankResponse{Data: []RerankResult{Index, RelevanceScore}}` shape fits
cleanly.
- **`ListModels`**: returns a hardcoded list of `voyageKnownModels`.
Voyage does **not** expose `/v1/models` (probed live, returns 404), so
the driver synthesizes the list from the same set the config ships. New
upstream models are added by extending one slice.
- **`CheckConnection`**: pings a 1-input embed call against
`voyage-3.5`. Without `/v1/models`, this is the cheapest way to verify
the API key + network path before a tenant tries a real workload.
- **`ChatWithMessages` / `ChatStreamlyWithSender` / `Balance` /
`TranscribeAudio` / `AudioSpeech` / `OCRFile`**: all return `"no such
method"`. Voyage does not host any of these surfaces.
No interface change. No new dependencies.
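
The reorder-by-`index` validation described for Embed can be sketched in Python. The driver itself is written in Go, so this only illustrates the logic; `reorder_embeddings` is a hypothetical name and the error messages are made up.

```python
def reorder_embeddings(data, expected):
    """Place each embedding at its `index`, rejecting malformed responses.

    `data` mirrors the response's `data` array: dicts with `index` and
    `embedding`. `expected` is the number of inputs that were sent.
    """
    if expected == 0:
        return []  # empty input: nothing to reorder
    slots = [None] * expected
    for item in data:
        idx = item["index"]
        if not 0 <= idx < expected:
            raise ValueError(f"index {idx} out of range")
        if slots[idx] is not None:
            raise ValueError(f"duplicate index {idx}")
        slots[idx] = item["embedding"]
    missing = [i for i, slot in enumerate(slots) if slot is None]
    if missing:
        raise ValueError(f"missing slots {missing}")
    return slots
```

Checking all three failure modes before returning means a caller never receives a vector list whose positions do not line up with its inputs.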
### How was this tested?
**19 unit tests** in `internal/entity/models/voyage_test.go` — all pass
on go 1.25:
```
$ go test -vet=off -run TestVoyage -count=1 ./internal/entity/models/...
ok ragflow/internal/entity/models 0.036s
```
Coverage: Name; Embed (happy path, reorder, empty-input, missing
key/model, duplicate index, out-of-range index, missing slot); Rerank
(happy path with `top_k` assertion, default-to-len-documents, empty
documents, out-of-range index); ListModels (static list, missing key);
CheckConnection (happy, 401); chat methods sentinels; Balance sentinel;
audio/OCR sentinels.
`go build ./internal/entity/models/...` exits 0.
**Live integration test** against `api.voyageai.com`:
```
=== RUN TestVoyageLiveSmoke
[OK] Name() = "voyage"
[OK] ListModels (static): 8 models -> [voyage-3.5 voyage-3.5-lite voyage-3-large voyage-code-3 voyage-law-2 voyage-finance-2 rerank-2 rerank-2-lite]
[OK] CheckConnection
[OK] Embed vectors=3 dim=1024 indices=[0 1 2]
[OK] Embed(empty) -> 0 vectors
[OK] Rerank results=3 scores=[0.8125 0.59765625 0.39453125]
[OK] ChatWithMessages returns voyage, no such method
[OK] Balance returns voyage, no such method
VOYAGE LIVE SMOKE PASSED
--- PASS: TestVoyageLiveSmoke (0.81s)
```
What the live run proves on the wire:
- Auth (`Bearer <key>`) accepted by `api.voyageai.com`.
- Embed `voyage-3.5` on 3 inputs returns 3 vectors at dim 1024 with
`index` field preserved as `[0, 1, 2]` — the reorder-by-index code is
exercised on real data.
- Empty input short-circuits without an HTTP call (mock server would
have been hit if it did).
- Rerank `rerank-2` on 3 docs returns 3 real `relevance_score` floats
`[0.8125, 0.598, 0.395]`. The `top_k` translation works on the live
wire.
- All sentinel methods return the documented `"no such method"` strings.
### Note on PR history
This branch was previously named for LocalAI Embed work which is now
consolidated into PR #14813. The branch was reset to `upstream/main` and
rebuilt for Voyage. Diff against `main` is a clean +838 lines across 4
files.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Tracking: #14736
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Closes #14774.
Adds free-form tags on agents (UserCanvas) with full UI + API:
- Stored as comma-separated `tags` column on `UserCanvas` with online
migration.
- New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT
/v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query.
- "Edit tags" item in agent dropdown opens a chip-style editor dialog;
tags render as badges on each agent card.
- New "Tags" facet in the agents filter bar, with counts.
## Implementation notes
- **Tag matching is exact-token**: the SQL filter wraps stored tags as
`,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`.
- **Server-side normalization** in `UserCanvasService.update_tags`:
dedup (case-insensitive), per-tag cap of 64 chars, total length capped
at 512 chars to fit the column, commas inside tag values are replaced
with spaces.
- **Tenant authorization**: `PUT /v1/agent/<id>/tags` gates on
`UserCanvasService.accessible(canvas_id, tenant_id)`.
- **Tag listing scope**: `UserCanvasService.list_tags` follows the same
own + team-shared rule as `get_by_tenant_ids`.
- **i18n**: keys added to `en.ts` and `zh.ts` only (per project
convention; other locales fall back).
- **`HomeCard`** gets a non-breaking `extra?: ReactNode` slot for the
chip row; no `src/components/ui/` files modified.
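
The normalization and exact-token rules above can be sketched as plain Python. `normalize_tags` and `has_tag` are hypothetical helpers, not the code in `UserCanvasService`, and the order of the steps (comma replacement, truncation, dedup) is an assumption.

```python
def normalize_tags(tags, max_tag_len=64, max_total_len=512):
    """Return a comma-separated string that fits the column limits."""
    seen = set()
    result = []
    total = 0
    for raw in tags:
        tag = raw.replace(",", " ").strip()[:max_tag_len]
        key = tag.lower()
        if not tag or key in seen:
            continue  # skip empties and case-insensitive duplicates
        added = len(tag) + (1 if result else 0)  # +1 for the separator
        if total + added > max_total_len:
            break  # silently truncate once the column would overflow
        seen.add(key)
        result.append(tag)
        total += added
    return ",".join(result)


def has_tag(stored, tag):
    """Exact-token match: wrap both sides in commas before comparing."""
    return f",{tag}," in f",{stored},"
```

Wrapping the stored value in commas is what keeps `ml` from matching `ml-ops`: the search string `,ml,` requires a delimiter on both sides of the token.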
## Test plan
- [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column
exists (`DESCRIBE user_canvas`).
- [ ] Agents page renders cards normally (no console error from missing
field).
- [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog
was unmounting with the dropdown).
- [ ] Typing a tag without pressing Enter and clicking Save persists it
(regression: last typed tag was being dropped).
- [ ] Chip input supports Enter/comma to commit, Backspace on empty to
remove, `×` to remove individual chip.
- [ ] Tag containing a comma sent via API is stored with the comma
replaced by a space.
- [ ] 20 long tags sent via API does not error (length cap silently
truncates).
- [ ] "Tags" filter in the filter bar shows counts and narrows the list.
- [ ] Filtering by `ml` does **not** return agents tagged `ml-ops`.
- [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc.
- [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or
no permission.`
Introduce comprehensive floating widget customization: add new widget
settings (title, subtitle, footer, colors, mute, streaming) with types
and defaults, and expose them via EmbedDialog UI (split into Embed Setup
and Widget Customization tabs). Persist and load settings through Agent
page by reading/writing globals and wiring an onSaveWidgetSettings
handler to setAgent; show a loading ButtonLoading for saving. Update
embed iframe query params and FloatingChatWidget to honor URL params
(colors, text, mute/streaming) with validation/normalization, color
darkening for gradients, footer link normalization, and improved
styling. Also add copy-to-clipboard in message toolbar, adjust syntax
highlighter layout and Copy button, and add i18n key for muteWidget.
### What problem does this PR solve?
Adds a few fields to the embed widget modal to customize the appearance
of the floating widget when embedded into a page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Noah <Noah.Thompson@ecn.forces.gc.ca>
## Summary
- Convert `pdfplumber.open()` to use `with` context manager in
`api/utils/file_utils.py` (`thumbnail_img` function)
- If any exception occurs between `open()` and `close()`, the PDF file
handle leaks
- The rest of the codebase (e.g. `read_potential_broken_pdf` in the same
file) already uses `with pdfplumber.open(...)` correctly
## Test plan
- [x] PDF thumbnail generation works correctly with context manager
- [x] Resources properly cleaned up on exceptions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
# feat: Add Generic REST API Connector
## What problem does this PR solve?
RAGFlow supports many specific data source connectors (MySQL, Slack,
Google Drive, etc.), but there was no way to connect an arbitrary REST
API as a data source. Users with custom or third-party APIs had to write
a new connector class for each one.
This PR adds a **generic, configuration-driven REST API connector** that
lets users connect any REST API as a data source entirely through the UI
— no code changes needed per API.
---
## Features
### Core Connector (`common/data_source/rest_api_connector.py`)
- Implements `LoadConnector` and `PollConnector` interfaces for full and
incremental sync
- **Configurable authentication:** None, API Key (custom header), Bearer
Token, Basic Auth
- **Pluggable pagination:** Page-based, Offset-based, Cursor-based, or
None
- Smart page-size inference from user's query parameters to avoid
duplicate/conflicting params
- Configurable request delay between pages to prevent API rate limiting
- Auto-detection of the items array in JSON responses (`items`,
`results`, `data`, `records`, or first list found)
- **Advanced field mapping** with dot-notation (`country.name`), array
wildcards (`newsType[*].name`), type hints, and default values
- Optional content template rendering (`"Title: {title}\nBody: {body}"`)
- HTML stripping for content fields
- Stable document IDs via `hash128` from a configurable ID field or
auto-generated from item content
- Pydantic configuration schema with automatic coercion of UI string
inputs to dicts/lists
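
A minimal sketch of how dot-notation and `[*]` wildcards might be resolved against one JSON item. `extract` is a hypothetical helper; the connector's real mapper also handles type hints and templates, which are omitted here.

```python
def extract(item, path, default=None):
    """Resolve paths like 'country.name' or 'newsType[*].name'.

    Returns a list when the path contains a wildcard, a single value
    otherwise, and `default` when nothing matches.
    """
    current = [item]
    saw_wildcard = False
    for part in path.split("."):
        wildcard = part.endswith("[*]")
        key = part[:-3] if wildcard else part
        next_values = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue
            value = node[key]
            if wildcard:
                if isinstance(value, list):
                    next_values.extend(value)  # fan out over array elements
            else:
                next_values.append(value)
        saw_wildcard = saw_wildcard or wildcard
        current = next_values
    if not current:
        return default
    return current if saw_wildcard else current[0]


item = {
    "country": {"name": "France"},
    "newsType": [{"name": "politics"}, {"name": "economy"}],
}
```

With this item, `extract(item, "country.name")` yields the single string and `extract(item, "newsType[*].name")` yields the list of names.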
### Backend Registration (`rag/svr/sync_data_source.py`,
`common/constants.py`, `common/data_source/config.py`)
- `REST_API` sync class wired into RAGFlow's `func_factory`
- Full sync (`load_from_state`) and incremental polling (`poll_source`)
support
- Credentials and config passed from task to connector following
existing patterns (MySQL, SeaFile, etc.)
### Test Connection Endpoint (`api/apps/connector_app.py`)
- `POST /v1/connector/<id>/test` validates config schema,
authentication, and API connectivity without triggering a sync
- Clear error messages for auth failures vs. config issues
### Frontend UI (`web/src/pages/user-setting/data-source/constant/`)
- **Postman-style configuration:** Base URL, Query Parameters (key=value
per line), Auth, Content Fields, Metadata Fields, Pagination Type
- Auth-type-aware form: fields for API key header/value, Bearer token,
or Basic username/password appear only when relevant
- **Advanced Settings** toggle for: Custom Headers, Max Pages, Request
Delay, Poll Timestamp Field, Request Body (POST)
- Connector icon (SVG) and i18n strings (English)
- **"Test Connection"** button to validate before syncing
---
## Controls & Safety
- Configurable max pages safety cap (default: 1000, adjustable in UI)
- Configurable request delay between pages (default: 0.5s, adjustable in
UI)
- Auth errors (401/403) fail immediately without retries; transient
errors retry with exponential backoff
- Diagnostic logging: auth setup confirmation, request details on
failure, content field extraction status
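
The retry policy can be sketched as follows. The exception classes, `fetch_with_retry`, and the default attempt count are assumptions for illustration; the connector's actual retry parameters are not stated above.

```python
import time


class AuthError(Exception):
    """Hypothetical error for 401/403 responses."""


class TransientError(Exception):
    """Hypothetical error for timeouts, 5xx responses, and similar."""


def fetch_with_retry(fetch, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `fetch`, retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except AuthError:
            raise  # bad credentials will not fix themselves: fail at once
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```

Passing `sleep` as a parameter keeps the function testable without real delays.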
---
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Visual Screenshots of Features
<img width="482" height="510" alt="Screenshot 2026-03-11 at 5 19 52 PM"
src="https://github.com/user-attachments/assets/dcb7ab4a-1622-44f3-bb02-d6f0527314c4"
/>
(Connector can be configured within the external data sources tab)
Configuration Parameters:
<img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 46 PM"
src="https://github.com/user-attachments/assets/5e154e71-4ab5-4872-bfb2-04f02b73c18a"
/>
<img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 54 PM"
src="https://github.com/user-attachments/assets/00cb14b7-0bcf-4b94-9d71-34e93369ecb2"
/>
Connection can be tested before attaching to dataset:
<img width="981" height="681" alt="Screenshot 2026-03-11 at 5 21 40 PM"
src="https://github.com/user-attachments/assets/aaa6eeeb-89a7-4349-bc34-2423bf8be9ee"
/>
Ingestion tested with API connector (works perfectly fine):
<img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 22 30 PM"
src="https://github.com/user-attachments/assets/afcd0d58-cadd-4152-badc-d2f14d96fbec"
/>
Search & Retrieval works as well with metadata flow:
<img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 23 05 PM"
src="https://github.com/user-attachments/assets/d41ee935-dcf7-4456-b317-22a76ca032c0"
/>
---------
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### Related issues
Closes #14781
### What problem does this PR solve?
Some retrieval endpoints accepted caller-supplied `tenant_rerank_id` and
resolved it through `get_model_config_by_id(...)`. That helper loaded
`TenantLLM` rows by global database id and returned decoded model
configuration without checking whether the model belonged to the
authenticated tenant or the dataset owner tenant.
This meant dataset access was validated, but rerank-model selection was
not. A caller who knew or could guess another tenant's
`tenant_rerank_id` could attempt retrieval with a foreign rerank model
config, creating a cross-tenant authorization gap for model usage.
This PR closes that gap by making `tenant_rerank_id` resolution
tenant-aware across the retrieval paths that accept it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Solution
- Extend `get_model_config_by_id(...)` to accept an optional
`allowed_tenant_ids` set and reject `TenantLLM` rows whose `tenant_id`
is outside that set.
- Pass the allowed tenant scope from retrieval endpoints that accept
`tenant_rerank_id`:
  - `api/apps/sdk/doc.py`
  - `api/apps/sdk/session.py`
  - `api/apps/services/dataset_api_service.py`
- Use the authenticated tenant plus dataset-owner tenant ids already
derived by each retrieval flow as the authorization boundary for rerank
model selection.
- Add focused unit coverage to assert unauthorized `tenant_rerank_id`
values are rejected and that the allowed tenant set is propagated
correctly.
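
The tenant-aware lookup can be sketched with an in-memory table. `resolve_model_config`, the row layout, and the choice of `PermissionError` are assumptions for illustration; the real helper reads `TenantLLM` rows from the database and may signal rejection differently.

```python
# Hypothetical rows keyed by global database id.
TENANT_LLM_ROWS = {
    1: {"tenant_id": "tenant_a", "llm_name": "rerank-x"},
    2: {"tenant_id": "tenant_b", "llm_name": "rerank-y"},
}


def resolve_model_config(model_id, allowed_tenant_ids=None):
    """Return the model row only if its owner is in the allowed set."""
    row = TENANT_LLM_ROWS.get(model_id)
    if row is None:
        raise LookupError(f"model {model_id} not found")
    if allowed_tenant_ids is not None and row["tenant_id"] not in allowed_tenant_ids:
        raise PermissionError(f"model {model_id} is outside the allowed tenants")
    return row
```

Keeping `allowed_tenant_ids` optional preserves the old behavior for callers that do not pass it, while retrieval endpoints can supply the authenticated tenant plus the dataset-owner tenants.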
### Testing
- `python -m py_compile` on:
  - `api/db/joint_services/tenant_model_service.py`
  - `api/apps/services/dataset_api_service.py`
  - `api/apps/sdk/doc.py`
  - `api/apps/sdk/session.py`
- Added unit tests in:
  - `test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py`
  - `test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
### Notes for reviewers
- This change is intentionally narrow: it affects only the
`tenant_rerank_id` path, not the normal `rerank_id` name-based
resolution path.
- Local lint/syntax checks passed.
- Full pytest execution could not be completed in this environment
because the local test runtime is missing `strenum`, so the route-test
files fail during collection before exercising the updated cases.
---------
Co-authored-by: jony376 <jony376@gmail.com>