ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-05 02:55:48 +08:00

Author	SHA1	Message	Date
carlos4s	2eba2c4d75	Add Anthropic Go model provider (#14940 ) ### What problem does this PR solve? Adds the missing Anthropic provider implementation for the Go model provider layer. Closes #14939 ### What changed - Add `conf/models/anthropic.json` with Anthropic Claude chat/vision models and API endpoints. - Add `internal/entity/models/anthropic.go` implementing non-streaming Messages API chat, model listing, and connection checking. - Register `anthropic` in the Go model factory. - Add httptest coverage for headers, payload mapping, response parsing, validation errors, provider errors, model listing, connection checking, factory registration, and unsupported methods. ### Notes Streaming chat is left as an explicit `no such method` follow-up because this initial provider focuses on non-streaming chat and connection checking. ### Tests - `docker run --rm -v /home/ubuntu/Documents/gitTensor_repos/carlos/ragflow:/work -v /tmp/ragflow-go-cache:/go/pkg/mod -v /tmp/ragflow-go-build:/root/.cache/go-build -w /work golang:1.25 go test -vet=off ./internal/entity/models -run Anthropic -count=1 -v` - `docker run --rm -v /home/ubuntu/Documents/gitTensor_repos/carlos/ragflow:/work -v /tmp/ragflow-go-cache:/go/pkg/mod -v /tmp/ragflow-go-build:/root/.cache/go-build -w /work golang:1.25 go test -vet=off ./internal/entity -count=1` - `git diff --check` - `jq . conf/models/anthropic.json >/dev/null` Plain `go test ./internal/entity/models` currently hits pre-existing unrelated vet findings in other provider files (`baidu.go`, `cohere.go`, `fishaudio.go`, `openrouter.go`). --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 12:03:33 +08:00
Jake Armstrong	fe1433d1ff	Go: add Jina chat completions support (#14935 ) ### What problem does this PR solve? This PR adds non-streaming chat support for the Jina Go model provider. The Jina provider was added with embedding, rerank, model listing, and connection checking, but `ChatWithMessages` still returned a not-implemented error even though Jina exposes an OpenAI-compatible `/v1/chat/completions` endpoint. Closes #14933 The following functionalities are now supported: ### Jina: - [x] Chat - [ ] Stream Chat - [x] Embedding - [x] Rerank - [x] Model listing - [x] Provider connection checking - [ ] Balance ### Implementation details: - Implements `JinaModel.ChatWithMessages` - Sends `Authorization: Bearer <api-key>` and JSON chat completion requests - Validates API key, model name, messages, and configured region before making requests - Forwards supported chat config fields: `max_tokens`, `temperature`, `top_p`, and `stop` - Parses the first chat completion choice into `ChatResponse.Answer` - Adds `jina-ai/jina-vlm` as a chat-capable model in `conf/models/jina.json` - Adds focused unit tests for request construction, auth, response parsing, validation errors, provider errors, and region handling Verification: ```plaintext docker run --rm -v $PWD:/repo -w /repo golang:1.25 sh -c '/usr/local/go/bin/gofmt -w internal/entity/models/jina.go internal/entity/models/jina_test.go && /usr/local/go/bin/go test -vet=off ./internal/entity/models -run TestJina -count=1' ok ragflow/internal/entity/models 0.037s ``` Note: `go test ./internal/entity/models -run TestJina -count=1` currently hits unrelated existing vet findings in other provider files, so the focused Jina tests were run with `-vet=off`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 12:03:12 +08:00
Panda Dev	6794ad2f70	Go: implement Embed (embeddings) in Novita driver (#14895 ) ### What problem does this PR solve? Fixes #14893 The Novita Go driver landed in #14850 and shipped a stub `Embed` method that returned `"novita, no such method"`, so Novita could not be used as an embedding provider in RAGFlow. This PR fills that gap. Novita exposes a public embeddings endpoint at `POST https://api.novita.ai/v3/embeddings` that accepts the standard OpenAI-compatible request shape (`{model, input}`) with `Authorization: Bearer <api_key>`. Two embedding models are documented in Novita's model library: `baai/bge-m3` (multilingual, 8192 tokens) and `baai/bge-large-en-v1.5`. ### Changes - `internal/entity/models/novita.go`: implement `NovitaModel.Embed`. - Validate inputs (api key, model name) and short-circuit on empty texts. - Resolve region with the existing `baseURLForRegion` helper. - Build URL from `URLSuffix.Embedding` (the embeddings path lives under `/v3/`, separate from the chat path under `/openai/v1/`). - Send `{model, input}` POST body, add `dimensions` when `embeddingConfig.Dimension > 0` (matches the pattern in #14735). - Bearer auth + JSON content type, mirroring the chat path. - Parse `{data: [{embedding, index}]}` and reorder by `index`, rejecting out-of-range indices, duplicates, and missing entries so the output always lines up with the input. Same shape as the merged Mistral and Upstage Embed implementations. - `conf/models/novita.json`: - Add `"embedding": "v3/embeddings"` to `url_suffix`. - Add default embedding model entries for `baai/bge-m3` and `baai/bge-large-en-v1.5` so they appear in the model picker. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 12:02:28 +08:00
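The index-based reordering described in the Novita `Embed` commit above can be sketched as follows. The driver itself is written in Go; this is only a language-neutral illustration in Python, and the function name `reorder_embeddings` and its error messages are hypothetical, not taken from `novita.go`.

```python
def reorder_embeddings(items, expected):
    """Place each embedding at the position given by its `index` field.

    `items` is assumed to be the parsed `data` array of the response,
    i.e. a list of {"embedding": [...], "index": int} dicts.
    Hypothetical helper; not the actual Go implementation.
    """
    ordered = [None] * expected
    for item in items:
        idx = item["index"]
        # Reject indices that do not correspond to any input text.
        if not 0 <= idx < expected:
            raise ValueError(f"embedding index {idx} out of range")
        # Reject a second embedding for the same input.
        if ordered[idx] is not None:
            raise ValueError(f"duplicate embedding index {idx}")
        ordered[idx] = item["embedding"]
    # Reject responses that skipped one or more inputs.
    missing = [i for i, emb in enumerate(ordered) if emb is None]
    if missing:
        raise ValueError(f"missing embeddings for inputs {missing}")
    return ordered
```

With these checks, the returned list has exactly one embedding per input text, in input order, or the call fails loudly.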
Idriss Sbaaoui	e98f3e5c0d	Fix session deletion leaking chat-upload blobs (#14969 ) ### What problem does this PR solve? This fixes a bug where files uploaded in chat were left in storage after the session was deleted. It now removes those chat-uploaded blobs during session deletion. fixes #14965 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 11:14:27 +08:00
qinling0210	9d94527b1d	Bump to infinity v0.7.0 (#14968 ) ### What problem does this PR solve? Upgrade infinity ### Type of change - [x] Refactoring	2026-05-18 10:25:59 +08:00
07heco	e194027b01	refactor: optimize BaseTitleChunker to improve RAG document chunk quality (#14247 ) ## RAG Optimization Description Optimize the core `BaseTitleChunker` in `rag/flow/chunker/title_chunker/common.py` to improve RAG document chunking quality and retrieval accuracy. ## Key Changes 1. Format-branched text processing: Preserve original whitespace & indentation for Markdown/HTML payloads to maintain document semantics and chunk fidelity; only perform full whitespace cleaning on plain text content. 2. Empty chunk filtering: Thoroughly filter invalid pure-blank lines to reduce noisy data in vector database. 3. Code deduplication: Unified markdown/text/html payload extraction logic, removed redundant repeated code blocks. 4. None serialization fix: Avoid converting `None` value into literal `"None"` string in chunk text fields. 5. Production logging: Added input/output line count logging for filter logic, observable in online environment. 6. 100% backward compatible: No changes to chunking hierarchy rules, output format and all existing workflows. ## RAG Business Value - Preserves document format fidelity for structured Markdown/HTML files - Reduces invalid noisy chunks → improves RAG retrieval precision - Cleans plain text data → optimizes vector embedding quality - Improves code maintainability with no breaking changes - Provides observable logging for chunk filtering behavior ## Compatibility - ✅ No API changes - ✅ No chunk logic modifications - ✅ All document parsing/chunking workflows unaffected - ✅ All pre-checks passed, no code conflicts ### Type of change - [x] Refactoring - [x] Performance Improvement	2026-05-18 10:00:18 +08:00
Ricardo-M-L	ff318aba7a	fix: correct literal_eval dispatch and bool isinstance ordering in agent components (#13988 ) ## Summary This PR fixes three bugs in agent components: ### Bug 1: `DataOperations._invoke()` dispatches `"literal_eval"` to wrong handler File: `agent/component/data_operations.py`, line 76 The `_invoke()` method compares `self._param.operations` against `"recursive_eval"` (line 76), but the valid value defined in `DataOperationsParam.__init__()` (line 29) and validated in `check()` (line 43) is `"literal_eval"`. As a result, selecting the `literal_eval` operation from the frontend never matches, and `_literal_eval()` is never called. Fix: Change `"recursive_eval"` to `"literal_eval"` in the dispatch condition. ### Bug 2: `VariableAssigner._clear()`: `bool` branch unreachable File: `agent/component/variable_assigner.py`, lines 95–100 In Python, `bool` is a subclass of `int`, so `isinstance(True, int)` is `True`. The `isinstance(variable, int)` check on line 95 therefore catches boolean values before the `isinstance(variable, bool)` check on line 99, making the bool branch unreachable. A boolean variable would be cleared to `0` instead of `False`. Fix: Move the `isinstance(variable, bool)` check before `isinstance(variable, int)`. ### Bug 3: `LoopItem.evaluate_condition()`: `bool` branch unreachable File: `agent/component/loopitem.py`, lines 67–93 Same issue as Bug 2: `isinstance(var, (int, float))` on line 67 catches boolean values before `isinstance(var, bool)` on line 85. Boolean variables would be evaluated with numeric operators (`=`, `≠`, `>`, etc.) instead of boolean operators (`is`, `is not`). Fix: Move the `isinstance(var, bool)` check before `isinstance(var, (int, float))`. 
## Test plan - [ ] Verify `DataOperations` with `literal_eval` operation correctly invokes `_literal_eval()` - [ ] Verify `VariableAssigner._clear()` returns `False` for boolean variables (not `0`) - [ ] Verify `LoopItem.evaluate_condition()` uses boolean operators for `True`/`False` values 🤖 Generated with [Claude Code](https://claude.com/claude-code) ## Summary by CodeRabbit * Bug Fixes * Fixed operation routing logic to correctly dispatch the "literal_eval" operation to its handler. * Refactor * Reorganized conditional branch ordering in agent components to improve code structure and maintainability without affecting functional behavior. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-18 09:58:45 +08:00
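The `bool`/`int` ordering issue behind Bugs 2 and 3 in the commit above can be reproduced in a few lines. This is a minimal sketch: `clear_value` is a hypothetical stand-in, not the actual body of `VariableAssigner._clear()`, and the set of handled types is an assumption.

```python
def clear_value(variable):
    """Return an 'empty' value of the same type as `variable`.

    Hypothetical helper for illustration only.
    """
    # bool must be tested first: bool is a subclass of int, so
    # isinstance(True, int) is True and an earlier int check would
    # swallow boolean values.
    if isinstance(variable, bool):
        return False
    if isinstance(variable, int):
        return 0
    if isinstance(variable, float):
        return 0.0
    if isinstance(variable, str):
        return ""
    if isinstance(variable, list):
        return []
    if isinstance(variable, dict):
        return {}
    return None
```

If the first two checks were swapped, `clear_value(True)` would return the integer `0` rather than `False`, which is the behaviour the commit describes and fixes.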
小熊	09d45046e5	Feat/web markdown UI updates (#14214 ) ### What problem does this PR solve? LLM/chat and search UIs render Markdown in several places (document preview, floating chat widget, next-search, etc.). Plugin lists and behavior were duplicated or inconsistent, and single newlines in model output were not always rendered as visible line breaks, which hurts readability for chat-style content. This PR centralizes shared remark/rehype configuration (including `remark-breaks` for newline handling) and wires the main Markdown surfaces to use it, so behavior is consistent and easier to maintain. ### Type of change - [x] Refactoring --------- Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>	2026-05-15 22:29:44 +08:00
Haruko386	bf41d35729	Go: implement PaddleOCR provider and implement ASR for CoHere (#14954 ) ### What problem does this PR solve? This PR implements OCR for Baidu and Mistral, adds a PaddleOCR provider, and implements ASR for Cohere. Verified examples from the CLI: ``` RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... 
\| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... 
\| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ # PaddleOCR RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf' +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Bingxin Ke Nando Metzger Photogra Anton Obukhov Rodrigo Caye Daudt netry and Remote Sensing, Shengyu Huang Konrad Schindler ETH Zürich <div style="text-align: c... \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ # Cohere RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}' +-----------------------------------------------------------------------------------------------------------------------+ \| text \| +-----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. 
\| +-----------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-15 18:41:43 +08:00
wdeveloper16	14c0985182	feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767 ) Closes #14753 ## What changed \| File \| Change \| \|---\|---\| \| `pyproject.toml` \| `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` \| \| `Dockerfile` \| `uv python install 3.13`, `uv sync --python 3.13` \| \| `.github/workflows/tests.yml` \| `uv sync --python 3.13` on both matrix legs \| \| `CLAUDE.md` \| dev setup command + requirements note updated \| \| `deepdoc/parser/mineru_parser.py` \| `from strenum import StrEnum` → `from enum import StrEnum` \| \| `agent/tools/code_exec.py` \| same \| `StrEnum` has been in the stdlib since Python 3.11 — the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - "graspologic hard-blocks 3.13" — the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - "free-threading throughput gains for I/O-bound workload" — Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.	2026-05-15 14:40:53 +08:00
Ricardo-M-L	cb606e1c38	fix: correct attribute name typo model_speciess to model_species (#13929 ) ## Summary - Rename misspelled attribute `model_speciess` to `model_species` across 4 files - The extra `s` is a typo — `species` is already plural ## Test plan - [ ] Verify PDF parsing with laws/manual/paper parser types still works correctly 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: yuj <yuj@ztjzsoft.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 14:19:41 +08:00
Haruko386	c2863173b0	Go: implement TTS, ASR for Siliconflow and TTs for StepFun (#14944 ) ### What problem does this PR solve? This PR implements TTS and ASR for SiliconFlow and TTS for StepFun. The following functionalities are now supported: SiliconFlow: - [x] Text To Speech - [x] Audio To Text - [x] Stream Audio To Text StepFun: - [x] Audio To Text - [x] Stream Audio To Text Verified examples from the CLI: ```plaintext # SiliconFlow RAGFlow(user)> tts with 'FunAudioLLM/CosyVoice2-0.5B@test@Siliconflow' text 'hello? show yourself' play format 'wav' param '{"voice": "fnlp/MOSS-TTSD-v0.5:alex"}' SUCCESS RAGFlow(user)> asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param '' +----------------------------------------------------------------------------------------------------------------------+ \| text \| +----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +----------------------------------------------------------------------------------------------------------------------+ RAGFlow(user)> stream asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param '' +----------------------------------------------------------------------------------------------------------------------+ \| text \| +----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +----------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-15 14:03:33 +08:00
Jin Hai	335dd5a263	Go: add cli command, list dataset documents (#14948 ) ### What problem does this PR solve? ``` +---------------------+----------------------------------+-------------+-----------------+---------+--------+------+ \| created_at \| id \| meta_fields \| name \| size \| status \| type \| +---------------------+----------------------------------+-------------+-----------------+---------+--------+------+ \| 2026-05-08 19:35:08 \| f6aa38bb4ad111f1ba6338a74640adcc \| map[] \| abc.pdf \| 3387987 \| 1 \| pdf \| +---------------------+----------------------------------+-------------+-----------------+---------+--------+------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-15 14:00:45 +08:00
SnakeEye-sudo (Er. Sangam Krishna)	1a25191b13	docs: add FAQ entry for using Ollama with RAGFlow (#14557 ) ### What problem does this PR solve? Users frequently ask how to use Ollama for local LLM inference with RAGFlow. This FAQ entry provides step-by-step instructions for setting up Ollama as a local model provider. ### Type of change - [x] Documentation update ### Description Adds a new FAQ entry: "How do I use Ollama with RAGFlow for local LLM inference?" Covers: 1. Starting Ollama and pulling a model 2. Configuring Ollama as a model provider in RAGFlow Settings 3. Using the Ollama model in an assistant	2026-05-15 13:54:09 +08:00
Hunnyboy1217	86bcf9767d	Go: implement Rerank in vLLM driver (#14878 ) (#14880 ) ### What problem does this PR solve? Closes #14878. `VllmModel.Rerank()` in [internal/entity/models/vllm.go:551](internal/entity/models/vllm.go#L551) is currently a stub returning `nil, fmt.Errorf("%s, Rerank not implemented", z.Name())`, and [conf/models/vllm.json](conf/models/vllm.json) is missing a `rerank` entry in `url_suffix`. Chat (long-standing) and embeddings (#14688) already work, so rerank is the last missing leg of the retrieval pipeline for operators running everything on a single self-hosted vLLM server — today they have to point rerank at a different provider, which defeats the point of a fully local deployment. Upstream vLLM has supported a Jina/Cohere-compatible `POST /v1/rerank` endpoint since v0.7 ([vllm-project/vllm#12376](https://github.com/vllm-project/vllm/pull/12376)). The request/response shape is essentially identical to the NVIDIA driver landed in #14778, so this PR mirrors that structure with two vLLM-specific adjustments. This PR replaces the stub with a real implementation against vLLM's `/v1/rerank`: - `POST {baseURL}/rerank` - Request body: `{"model": "<modelName>", "query": "<query>", "documents": [...], "top_n": <int>}` — documents are a flat `[]string`, not wrapped as `{text: "..."}` like NVIDIA's `/ranking`. - Response body: `{"results": [{"index": int, "relevance_score": float}, ...]}` (Jina-compatible; the optional `document` field is ignored since callers reconstruct text via `Index`). - `Authorization: Bearer <ApiKey>` is set only when `APIConfig.ApiKey` is non-empty, matching the existing `Embed`/`ListModels` behaviour in this file. vLLM is a local driver and can be deployed without an API key. 
The return shape matches the existing `RerankResponse` contract used by the NVIDIA ([nvidia.go:461](internal/entity/models/nvidia.go#L461)), Aliyun ([aliyun.go:507](internal/entity/models/aliyun.go#L507)), and ZhipuAI ([zhipu-ai.go:554](internal/entity/models/zhipu-ai.go#L554)) drivers, i.e. `Data []RerankResult` carrying `{Index, RelevanceScore}` in the API's ranking order. Callers that need original-input order sort by `Index`. Behaviour requirements from the issue, all covered: 1. Empty `documents` → returns `&RerankResponse{}` without an HTTP call. 2. Missing `modelName` → `"model name is required"` validation error. 3. `rerankConfig.TopN` honored when `0 < TopN < len(documents)`; otherwise `top_n` defaults to `len(documents)` so callers get a score per input. 4. Non-200 responses return an error including upstream status and body (`"vLLM rerank API error: <status>, body: <body>"`). 5. Response `index` values are bounds-checked against `len(documents)`. Scope:* - [internal/entity/models/vllm.go](internal/entity/models/vllm.go) — replaces the `Rerank` stub at line 551 with a real implementation; adds `vllmRerankRequest`/`vllmRerankResponse` types for the slim subset of the payload we need. Region/baseURL resolution, 30s context timeout, conditional bearer header, and error wrapping all follow the existing patterns in this file. - [conf/models/vllm.json](conf/models/vllm.json) — adds `"rerank": "rerank"` to `url_suffix`, joined to the operator-configured vLLM base URL the same way the NVIDIA driver joins at [nvidia.go:485](internal/entity/models/nvidia.go#L485). 
- [internal/entity/models/vllm_rerank_test.go](internal/entity/models/vllm_rerank_test.go) — adds 7 `httptest`-backed tests mirroring `nvidia_rerank_test.go`: happy path (out-of-order ranking → Index preservation), `top_n` clamp to `RerankConfig.TopN`, empty-documents short-circuit, missing-model-name validation, HTTP error propagation, out-of-range index rejection, and a vLLM-specific `TestVllmRerankWithoutAPIKey` locking in the optional-auth behaviour that distinguishes this driver from NVIDIA. Out of scope: no interface change, no DDL, no frontend change. Chat, embeddings, and balance paths are untouched. No new user-facing docs required beyond the existing rerank model setup page — vLLM joins the list of providers whose rerank model can be selected once `/v1/rerank` is exposed by the server. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-15 13:27:22 +08:00
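The request shaping and response validation listed in the vLLM rerank commit above can be sketched as below. The driver is written in Go; this Python version only illustrates the rules (model-name validation, `top_n` clamping, index bounds check), and both function names are hypothetical. The empty-`documents` short-circuit is assumed to happen in the caller, before any request is built.

```python
def build_rerank_request(model, query, documents, top_n=0):
    """Build the JSON body for `POST {baseURL}/rerank`.

    Hypothetical helper mirroring the rules described in the commit.
    """
    if not model:
        raise ValueError("model name is required")
    n_docs = len(documents)
    # Honor top_n only when 0 < top_n < len(documents); otherwise ask
    # for one score per input document.
    effective_top_n = top_n if 0 < top_n < n_docs else n_docs
    return {
        "model": model,
        "query": query,
        "documents": list(documents),  # flat list of strings
        "top_n": effective_top_n,
    }


def parse_rerank_results(body, n_docs):
    """Return (index, relevance_score) pairs in the API's ranking order."""
    pairs = []
    for item in body.get("results", []):
        idx = item["index"]
        # Reject indices that do not point at an input document.
        if not 0 <= idx < n_docs:
            raise ValueError(f"rerank index {idx} out of range")
        pairs.append((idx, float(item["relevance_score"])))
    return pairs
```

Callers that need the scores in input order can sort the returned pairs by their first element.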
Octopus	eaa5d9921b	fix: enable GitHub connector to sync PRs and issues by default (#14062 ) Fixes #13975 ## Problem The GitHub data source connector had both `include_pull_requests` and `include_issues` defaulting to `false` in both the frontend form and the backend sync code. This meant that with the default configuration, no content was synced at all from a GitHub repository — silently producing zero results. Additionally, the form field labels contained a typo: "Inlcude" instead of "Include". ## Solution - Changed `include_pull_requests` default from `false` to `true` in the frontend form fields and default values - Changed `include_issues` default from `false` to `true` in the frontend form fields and default values - Changed both backend defaults in `sync_data_source.py` from `False` to `True` - Fixed label typos: "Inlcude Pull Requests" → "Include Pull Requests" and "Inlcude Issues" → "Include Issues" This makes the GitHub connector consistent with the GitLab connector, which already defaults `include_mrs`, `include_issues`, and `include_code_files` all to `true`. ## Testing - The connector now syncs both pull requests and issues by default when a new GitHub data source is created - Users who want to exclude PRs or issues can uncheck the corresponding checkboxes in the form Co-authored-by: octo-patch <octo-patch@github.com>	2026-05-15 13:26:31 +08:00
plind	c9622d0924	fix(agentbot): aggregate structured output in non-streaming completions (#14848 ) ## What problem does this PR solve? Closes #13384. The `/api/v1/agentbots/<agent_id>/completions` non-streaming path returned the first yielded SSE chunk and exited: ```python async for answer in agent_completion(objs[0].tenant_id, agent_id, **req): return get_result(data=answer) ``` That meant structured output, the full assistant message, and reference data were all dropped when an agent was called with `stream=false`. Streaming worked because each event was forwarded individually; non-streaming was returning a raw SSE-formatted string from a single early event. The v1 endpoint at [`agent_api.py:1006-1050`](https://github.com/infiniflow/ragflow/blob/main/api/apps/restful_apis/agent_api.py#L1006-L1050) already handles this correctly. This PR mirrors that aggregation in the SDK beta endpoint: parse each SSE line, accumulate `content` from `message` events, merge `reference`, collect `outputs.structured` from each `node_finished` event keyed by `component_id`, and attach all of them to the final response. ## Type of change - [x] Bug fix (non-breaking change which fixes an issue) ## Test plan - [ ] Build an agent with a node that emits structured output, call `POST /api/v1/agentbots/<agent_id>/completions` with `stream=false` and a beta API token, verify `data.structured.<component_id>` is present in the response. - [ ] Same agent with `stream=true` — verify behavior is unchanged. - [ ] Agent without structured output — verify `data.structured` is omitted, `content` and `reference` still aggregated correctly.	2026-05-15 12:42:33 +08:00
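The aggregation described in the agentbot commit above (parse each SSE line, accumulate `content`, merge `reference`, collect structured outputs per `component_id`) might look roughly like the sketch below. It is synchronous for simplicity, whereas the endpoint iterates an async generator. The event layout (`event` and `data` keys, `data:` line prefix), the use of `dict.update` for merging references, and the function name are assumptions, not the code in `agent_api.py`.

```python
import json


def aggregate_sse_chunks(chunks):
    """Fold a sequence of SSE-formatted strings into one response dict.

    Hypothetical helper; the event schema is assumed for illustration.
    """
    content_parts, reference, structured = [], {}, {}
    for chunk in chunks:
        for line in chunk.splitlines():
            if not line.startswith("data:"):
                continue
            try:
                event = json.loads(line[len("data:"):].strip())
            except json.JSONDecodeError:
                continue  # skip keep-alives and malformed payloads
            data = event.get("data") if isinstance(event, dict) else None
            if not isinstance(data, dict):
                continue
            if event.get("event") == "message":
                content_parts.append(data.get("content") or "")
                if isinstance(data.get("reference"), dict):
                    reference.update(data["reference"])
            elif event.get("event") == "node_finished":
                outputs = data.get("outputs")
                component_id = data.get("component_id")
                if isinstance(outputs, dict) and outputs.get("structured") and component_id:
                    structured[component_id] = outputs["structured"]
    result = {"content": "".join(content_parts), "reference": reference}
    if structured:
        # Omitted entirely when no node emitted structured output.
        result["structured"] = structured
    return result
```

Returning only after the whole stream has been consumed is the key difference from the earlier behaviour, which returned on the first yielded chunk.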
Jin Hai	3a5df08c76	Go: add file parse command (#14892 ) ### What problem does this PR solve? ``` RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png' +----------------------------------------------------------+ \| text \| +----------------------------------------------------------+ \| 生活不是等待风暴过去，而是学会在雨中翩翩起舞。 ——佚名 \| +----------------------------------------------------------+ RAGFlow(user)> list 'test@gitee' tasks; +---------+----------------------------------+ \| status \| task_id \| +---------+----------------------------------+ \| success \| C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5 \| +---------+----------------------------------+ RAGFlow(user)> show 'test@gitee' task 'C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5'; +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ \| content \| index \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ \| # PDF 1: Purpose of RAGFlow RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine designed to turn raw documents into reliable context for large language models.Its purpose is to make it practical to build an Al assistant that can ans... \| 1 \| +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-15 12:29:52 +08:00
Sebastion	547b8cf9d8	security: always use RestrictedUnpickler in deserialize_b64 (CWE-502) (#14803 ) ## Summary Harden `api/utils/configs.deserialize_b64` so that it always routes pickle data through the existing `RestrictedUnpickler` (`restricted_loads`) rather than falling back to bare `pickle.loads()`. - CWE-502 — Deserialization of Untrusted Data - File / function: `api/utils/configs.py` → `deserialize_b64` - Caller: `SerializedField.python_value` in `api/db/db_models.py` (invoked by Peewee whenever a pickled DB column is read) ## The issue Before this change, `deserialize_b64` consulted a `use_deserialize_safe_module` config flag that defaults to `False` and is not set anywhere in the repository: ```python use_deserialize_safe_module = get_base_config('use_deserialize_safe_module', False) if use_deserialize_safe_module: return restricted_loads(src) return pickle.loads(src) # <-- default path ``` So the default code path was unrestricted `pickle.loads()` on bytes read from a MySQL `SerializedField(serialized_type=PICKLE)` column. Any attacker who can influence those bytes (SQL injection elsewhere, compromised DB credentials, a backup restored from an untrusted source, or a compromised replication peer) can craft a pickle payload that achieves arbitrary code execution on the ragflow application server when the field is next read. Today no model in-tree instantiates a `SerializedField` with the default PICKLE type — only `JsonSerializedField` is used in practice — so the attack surface is currently latent rather than actively reachable through an HTTP endpoint. But the insecure-by-default behaviour is a sharp edge: any future field that uses the default PICKLE serialization would silently inherit RCE-on-read semantics. 
## The fix ```diff - use_deserialize_safe_module = get_base_config( - 'use_deserialize_safe_module', False) - if use_deserialize_safe_module: - return restricted_loads(src) - return pickle.loads(src) + return restricted_loads(src) ``` `restricted_loads` is the existing `RestrictedUnpickler` already defined in the same file, which limits permitted modules to `numpy` and `rag_flow`. The config flag (and the now-dead `get_base_config` import) are removed. Diff is 1 insertion / 6 deletions, scoped to a single function. ## Testing - Built a malicious pickle whose `__reduce__` resolves to `posix.system('id')`. Pre-fix: executes. Post-fix: `restricted_loads` raises `UnpicklingError: global 'posix.system' is forbidden`. - Round-tripped a benign `numpy.ndarray` through `serialize_b64` → `deserialize_b64`. Values preserved bit-for-bit. - Confirmed `use_deserialize_safe_module` is not set in any config file in the tree, so removing the flag does not change any operator-facing knob that was actually in use. ## A note on `restricted_loads` itself The existing `SECURITY.md` notes that `restricted_loads`'s `numpy` allow-list can still be reached via `numpy.f2py.diagnose.run_command`. This PR does not attempt to fix that — it is a separate hardening question about tightening the allow-list to specific symbols rather than whole modules. The change here strictly improves on the status quo (bare `pickle.loads`) and brings the default path in line with what the `restricted_loads` helper was clearly designed for. Happy to follow up with a separate PR narrowing the allow-list if that direction is welcome. ## Adversarial review Before submitting, we tried to argue this finding away. 
The two strongest objections are (1) "no field uses PICKLE today, so this is unreachable" — true, but the default behaviour of a security-sensitive helper still matters because new fields silently inherit it; and (2) "the attacker already needs DB write access, which is game over" — partially true, but pickle-RCE meaningfully escalates data tampering into code execution on the application host (filesystem, internal network, in-process secrets), which is not equivalent. The fix is one line of real code, has no behavioural cost for legitimate callers, and removes an insecure default. We decided it was worth filing. --- <sub>_Submitted by Sebastion — autonomous open-source security research from [Foundation Machines](https://foundationmachines.ai). Free for public repos via the [Sebastion AI GitHub App](https://github.com/marketplace/sebastion-ai)._</sub>	2026-05-15 10:58:27 +08:00
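The allow-list idea behind this commit can be illustrated with a minimal sketch. This is not the `RestrictedUnpickler` from `api/utils/configs.py`: the class name, the helper `restricted_loads_sketch`, and the allow-list contents are assumptions chosen so the example runs with the standard library only (the commit states that the real allow-list permits the `numpy` and `rag_flow` modules).

```python
import io
import os
import pickle

# Hypothetical allow-list of (module, name) pairs, for illustration only.
ALLOWED_GLOBALS = {("builtins", "set"), ("builtins", "frozenset")}


class SketchRestrictedUnpickler(pickle.Unpickler):
    """Refuses to resolve any global that is not explicitly allow-listed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads_sketch(data: bytes):
    """Drop-in replacement for pickle.loads that uses the restricted unpickler."""
    return SketchRestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    """Pickles to a call of os.system('id'); dumping it does not execute anything."""

    def __reduce__(self):
        return (os.system, ("id",))
```

With bare `pickle.loads`, loading `pickle.dumps(Malicious())` would run the command; with the sketch above, `find_class` rejects the lookup before anything is called. Note that allow-listing specific `(module, name)` pairs is narrower than allow-listing whole modules, which is the follow-up direction the commit message mentions.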
yingjianzh	4c68a6b86c	fix(agent): pass top_k and fix similarity weight slider behavior (#14760 ) ### What problem does this PR solve? This PR fixes two issues in Agent Retrieval behavior and configuration UX: 1. `top_k` configured in Agent Retrieval was not passed down to the backend retriever call, so retrieval could ignore the configured vector recall limit. 2. Similarity weight slider semantics were confusing in Agent forms because the Agent field stores `keywords_similarity_weight` while UI interactions were interpreted as vector weight. This could cause displayed values and actual behavior to diverge. This PR ensures Agent retrieval uses configured `top_k`, and makes the slider behavior consistent and explicit for both vector and keyword weight modes. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-15 10:49:14 +08:00
sham-sr	ef2969a462	fix(llm): Tongyi-Qianwen embeddings use correct DashScope native API for intl URLs (#14784 ) ## Summary - Fixes Tongyi-Qianwen (`QWenEmbed`) text embeddings when the configured `base_url` points at DashScope international (`dashscope-intl.aliyuncs.com`) or China (`dashscope.aliyuncs.com`) hosts, including values copied from Model Studio that use the OpenAI-compatible path (`.../compatible-mode/v1`). - The `dashscope` Python SDK (`TextEmbedding.call`) expects the native HTTP root (`https://<host>/api/v1`), not the OpenAI-compatible base URL. Without mapping, international accounts could hit the wrong host or path. ## Implementation - Added `_dashscope_native_http_api_url()` to normalize known DashScope hosts to `.../api/v1`, and wired `QWenEmbed` to set `dashscope.base_http_api_url` before each embedding call (document and query). ## Notes - In-code comments document the Tongyi-Qianwen / DashScope intl vs CN behavior for future maintainers. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-15 10:07:48 +08:00
Octopus	d887b578c5	fix: preserve uploaded file attachments after subsequent assistant messages (#13993 ) ## Problem When a user uploads a file attachment in their first message (Q1) and then sends a follow-up message (Q2) that triggers a backend response, the uploaded file attachment disappears from Q1 in the chat UI. Fixes #13959 ## Root Cause In `single-chat-box.tsx`, a `useEffect` hook syncs `derivedMessages` from `conversation?.messages` whenever the conversation data changes (e.g., after a new assistant reply arrives): ```typescript useEffect(() => { const messages = conversation?.messages; if (Array.isArray(messages)) { setDerivedMessages(messages); // ← overwrites local state } }, [conversation?.messages, setDerivedMessages]); ``` The problem is that `conversation.messages` comes from the server, which stores messages as plain JSON. Browser `File` objects (uploaded by the user) cannot be serialized to JSON, so they are never stored on the server. Each time the server data is applied to local state, the `files` array on the user's first message is lost. ## Fix Instead of replacing the local messages wholesale, preserve any `files` entries from the previous local state by ID before applying the server data: ```typescript useEffect(() => { const messages = conversation?.messages; if (Array.isArray(messages)) { setDerivedMessages((prevMessages) => { const filesMap = new Map( prevMessages .filter((m) => m.files?.length) .map((m) => [m.id, m.files]), ); if (filesMap.size === 0) { return messages; } return messages.map((m) => ({ ...m, files: filesMap.get(m.id) ?? m.files, })); }); } }, [conversation?.messages, setDerivedMessages]); ``` This is a minimal, targeted fix: when there are no local files to preserve the behavior is identical to before (early return with plain assignment). When local file objects exist they are re-attached to the corresponding server messages by ID. 
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Improved search query processing to properly handle special characters and apostrophes in search terms and synonyms. * Fixed chat message file attachments to persist when syncing with server. * Refactor * Simplified OCR detection return values by removing timing metadata. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: ximi <octo-patch@github.com>	2026-05-15 09:53:35 +08:00
buua436	58819f5d3e	fix: add document download endpoint and refactor existing download function (#14927 ) ### What problem does this PR solve? Adds a document download endpoint and refactors the existing download function. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-15 09:36:58 +08:00
writinwaters	5a5bbee948	Doc: Finalized v0.25.4 release notes (#14929 ) ### What problem does this PR solve? v0.25.4 release notes. Final. ### Type of change - [x] Documentation Update	2026-05-14 21:08:39 +08:00
balibabu	41072ed44d	Feat: This enables SelectWithSearch to search by label. (#14925 ) ### What problem does this PR solve? Feat: This enables SelectWithSearch to search by label. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: balibabu <assassin_cike@163.com>	2026-05-14 20:33:11 +08:00
Haruko386	106f4b777e	Go: implement TTS for fishaudio, openrouter and asr for fishaudio (#14926 ) ### What problem does this PR solve? This PR implements TTS for the FishAudio and OpenRouter providers and ASR for FishAudio. The following functionalities are now supported: FishAudio: - [x] Text To Speech - [x] Stream Text To Speech - [x] Audio To Text OpenRouter: - [x] Text To Speech Verified examples from the CLI: ```plaintext FishAudio RAGFlow(user)> tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}' Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav SUCCESS RAGFlow(user)> stream tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' 
play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}' Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav SUCCESS RAGFlow(user)> asr with 'transcribe-1@test@fishaudio' audio './internal/test.wav' param '{"language": "en", "ignore_timestamps": true}' +----------------------------------------------------------------------------------------------------------------------+ \| text \| +----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +----------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-14 18:58:00 +08:00
wdeveloper16	a98994ff91	fix: close db connections reliably in test_db_connection (#14777 ) ## Summary - Fixes resource-management bugs in the `POST /agents/test_db_connection` endpoint where database connections could be left open on error (part of #14750) ## Changes - `api/apps/restful_apis/agent_api.py` — `test_db_connection`: - mysql / mariadb / oceanbase / postgres: replaced bare `db.connect()` / `db.close()` fallthrough with `with db.connection_context()` and a probe `SELECT 1` — guaranteed close on both success and exception - mssql: nested `try/finally` blocks so `cursor.close()` and `db.close()` are always called even when `cursor.execute()` raises - trino: wrapped cursor ops in `try/finally` for the same reason - Removed the `if req["db_type"] != "mssql": db.connect(); db.close()` shared fallthrough block — each branch now owns its teardown - Consolidated to a single `return get_json_result(...)` after the if/elif chain	2026-05-14 16:45:44 +08:00
eviaaaaa	63df01fe3f	fix(agent): handle duplicate MCP tool names (#14217 ) ### What problem does this PR solve? When multiple MCP servers expose tools with the same name, the agent currently registers those tools using their original MCP names. This can lead to two issues: - later MCP tools may overwrite earlier ones in the agent tool map - duplicate function names may be exposed to the LLM This PR fixes duplicate MCP tool-name handling by applying the same indexed naming strategy already used for native agent tools. Native tools are exposed with generated names such as `<tool_name>_<index>` to avoid collisions, and MCP tools now follow the same convention for consistency. Specifically, this PR: - assigns unique indexed function names to MCP tools exposed to the LLM - preserves each MCP tool's original server-side name in an `MCPToolBinding` - dispatches MCP calls using the original MCP tool name while keeping the indexed name in the agent tool map - allows MCP metadata conversion to override only the OpenAI function name without modifying the original MCP tool metadata ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Validation The validation was performed using two MCP servers. Both servers exposed a tool with the same name: `mcp0`. Both tools take no input parameters. MCP Server One: <img width="1780" height="625" alt="ONE" src="https://github.com/user-attachments/assets/801a2654-fc10-4b71-b31c-81841fd40c55" /> MCP Server Two: <img width="1777" height="624" alt="Second" src="https://github.com/user-attachments/assets/c095151d-7bdf-47c8-9bfe-6aaf4a01b944" /> Before the fix: When invoking `mcp0`, only the `mcp0` tool from the MCP server injected later could be called successfully. As shown below, both `mcp0` tools were present, but only the later-registered one was actually invokable. 
<img width="694" height="935" alt="Three" src="https://github.com/user-attachments/assets/3b9d7ab2-1765-492c-b8e0-bf05a69933ca" /> After the fix: Both `mcp0` tools can now be invoked correctly. <img width="737" height="1095" alt="F" src="https://github.com/user-attachments/assets/6e896627-2b7f-41bb-becc-daa0c73ff58f" /> <img width="730" height="1090" alt="six" src="https://github.com/user-attachments/assets/aba75593-26ae-4e3b-951d-b45ff177fd32" />	2026-05-14 15:28:39 +08:00
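The indexed naming strategy described in this commit can be sketched as follows. This is a hedged illustration, not the agent's code: `ToolBinding` is a hypothetical stand-in for `MCPToolBinding`, `register_tools` is an invented helper, and using the running size of the tool map as the index is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolBinding:
    """Hypothetical stand-in for MCPToolBinding: remembers the server-side name."""
    server: str
    original_name: str


def register_tools(servers):
    """Map unique, indexed function names to bindings that keep the original name.

    `servers` maps a server label to the list of tool names it exposes.
    """
    tool_map = {}
    for server, names in servers.items():
        for name in names:
            exposed = f"{name}_{len(tool_map)}"  # name shown to the LLM
            tool_map[exposed] = ToolBinding(server, name)
    return tool_map


tools = register_tools({"one": ["mcp0"], "two": ["mcp0"]})
# Dispatch looks up the exposed name but calls the server with original_name.
binding = tools["mcp0_1"]
```

Because the exposed name is only a key in the agent's map, both `mcp0` tools stay addressable, and each call is still dispatched under the name the MCP server actually knows.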
dale053	bd99a22661	fix: atomic chunk/token counter updates for documents and knowledge b… (#14867 ) ### What problem does this PR solve? Fixes #14866. Previously, `DocumentService.increment_chunk_num` and `decrement_chunk_num` updated the `Document` row and its parent `Knowledgebase` row in two separate, non-transactional statements. If the second update failed (DB error, connection drop, etc.) after the first one succeeded, the document and knowledge base chunk/token counters would drift apart and stay inconsistent. There was also a behavioral asymmetry between the two methods: - `increment_chunk_num` only logged a warning when the document row was missing and returned a value that callers usually treated as success. - `decrement_chunk_num` raised `LookupError` in the same situation. This PR makes the counter updates atomic and aligns the missing-document behavior between the two methods: - Wrap the `Document` and `Knowledgebase` updates in `increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()` block so both succeed or both roll back together. - Raise `LookupError` from `increment_chunk_num` when the target document no longer exists, matching `decrement_chunk_num`. - Update `reset_document_for_reparse` in `document_api_service.py` to catch the new `LookupError` and return a proper "Document not found!" API error instead of propagating the exception. No schema changes, no API contract changes for the success path; only the failure mode for a missing document during reparse is now a clean error response instead of an uncaught exception. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 14:48:52 +08:00
buua436	3c68ad03be	Go: update user settings fields (#14918 ) ### What problem does this PR solve? Updates user settings fields. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 14:47:15 +08:00
Ethan T.	ba8cb9dd4a	fix: replace mutable default arguments with None in LLM chat models (#13513 ) ## Summary - Replace `gen_conf={}` with `gen_conf=None` + guard in `rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat, LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat classes) - Replace `doc_ids=[]` with `doc_ids=None` + guard in `api/db/services/document_service.py` (1 instance) - Mutable default arguments are shared across all calls, causing potential cross-request state contamination - See Python docs: https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects ## Test plan - [x] Verify LLM calls work with and without explicit gen_conf - [x] No behavior change for existing callers — `None` is replaced with `{}` at function entry 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 14:46:47 +08:00
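The shared-default behaviour this commit removes is easy to reproduce. The two functions below are hypothetical stand-ins (they are not the signatures in `rag/llm/chat_model.py`); they only show why `gen_conf={}` leaks state between calls and how the `None` guard avoids it.

```python
def chat_shared(prompt, gen_conf={}):
    # The default dict is created once, when the function is defined,
    # so every call that omits gen_conf mutates the same object.
    gen_conf.setdefault("history", []).append(prompt)
    return gen_conf


def chat_fixed(prompt, gen_conf=None):
    # A fresh dict is created per call when the caller passes nothing.
    if gen_conf is None:
        gen_conf = {}
    gen_conf.setdefault("history", []).append(prompt)
    return gen_conf


first = chat_shared("a")
second = chat_shared("b")   # same dict as `first`, now holding both prompts
```

In a server process, the first form means one request's configuration can show up in another request, which is the cross-request contamination the summary describes.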
buua436	0450400efd	Go: fix LastLoginTime update (#14917 ) ### What problem does this PR solve? Fixes the LastLoginTime update. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 14:46:39 +08:00
dale053	714f777fa0	Fix: missing authentication on agent file upload and download endpoints (#14854 ) ### What problem does this PR solve? Closes #14853 The `/agents/download` and `/agents/<agent_id>/upload` endpoints in the agent API are missing `@login_required` and `@add_tenant_id_to_kwargs` decorators, allowing unauthenticated access. This is a security issue — any user can upload files to or download files from an agent without being logged in. Additionally, the upload endpoint bypasses canvas access control (`@_require_canvas_access_async`). This PR adds the missing authentication and authorization decorators to both endpoints and replaces the manual `user_id` / `created_by` lookups with the `tenant_id` provided by the auth middleware, making these endpoints consistent with the rest of the agent API. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 13:48:41 +08:00
buua436	f0122179dd	GO: align time units with Python and centralize timestamp injection in BaseModel (#14875 ) ### What problem does this PR solve? Aligns time units with Python and centralizes timestamp injection in BaseModel. ### Type of change - [x] Refactoring	2026-05-14 13:46:46 +08:00
buua436	82e06db8c3	Doc: code component output section (#14915 ) ### What problem does this PR solve? code component output section ### Type of change - [x] Documentation Update	2026-05-14 13:42:40 +08:00
Ricardo-M-L	48b4aa3e93	Fix WebDriver resource leak in HTML-to-PDF conversion (#14310 ) ### What problem does this PR solve? In `api/utils/web_utils.py`, `__get_pdf_from_html()` creates a Chrome WebDriver but only calls `driver.quit()` inside the `TimeoutException` handler. If the page element becomes stale before the timeout (no exception raised), the WebDriver is never quit, leaking the Chrome browser process and returning `None`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Changes - Move the PDF printing logic and `driver.quit()` outside the `except` block so they execute on all code paths - Use `try/finally` to ensure `driver.quit()` is always called, even if the `Page.printToPDF` DevTools call fails Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 13:28:58 +08:00
Ricardo-M-L	4bfdb1e123	fix: correct nested path traversal in set_variable_param_value (#13986 ) ## Summary `Graph.set_variable_param_value()` in `agent/canvas.py` has a bug in its nested path traversal logic. The `for` loop iterates through all keys in the path (including the last one), descending into every level. After the loop, it then tries to set `cur[keys[-1]] = value`, but `cur` has already descended one level too deep. Example: For `path = "a.b"`, `value = "hello"`: - Before (bug): `obj["a"]["b"]` becomes `{"b": "hello"}` instead of `"hello"` - After (fix): `obj["a"]["b"]` becomes `"hello"` as expected The fix changes `for key in keys:` to `for key in keys[:-1]:`, so the loop only navigates to the parent dict, and the final key is set directly. This is consistent with how the read-side counterpart `get_variable_param_value()` works. This method is called by `set_variable_value()` when assigning to nested variable paths (e.g., `component@root.nested.key`), which is used by the `VariableAssigner` component. ## Test plan - [ ] Create a canvas with a VariableAssigner that writes to a nested path (e.g., `component@obj.nested.key`) - [ ] Verify the value is set correctly at the expected path, not wrapped in an extra dict layer - [ ] Verify single-key paths (e.g., `component@key`) still work correctly <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Fixed a bug in variable parameter assignment where nested structures were being incorrectly modified, ensuring values are now properly set at their intended locations without unintended overwrites. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-14 13:27:04 +08:00
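The off-by-one traversal described in this commit can be reproduced with plain dicts. The helpers below are a sketch under assumptions: the function names are invented, and the use of `setdefault` to descend is a guess at the shape of the loop, not a copy of `Graph.set_variable_param_value()`.

```python
def set_nested_buggy(obj, path, value):
    keys = path.split(".")
    cur = obj
    for key in keys:               # also descends into the last key
        cur = cur.setdefault(key, {})
    cur[keys[-1]] = value          # writes one level too deep


def set_nested_fixed(obj, path, value):
    keys = path.split(".")
    cur = obj
    for key in keys[:-1]:          # stop at the parent dict
        cur = cur.setdefault(key, {})
    cur[keys[-1]] = value


buggy, fixed = {}, {}
set_nested_buggy(buggy, "a.b", "hello")
set_nested_fixed(fixed, "a.b", "hello")
```

After these calls `buggy["a"]["b"]` is the extra wrapper dict described in the commit, while `fixed["a"]["b"]` is the string itself. Single-key paths also work with the fixed version because `keys[:-1]` is empty and the loop body never runs.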
Haruko386	ef46005ef1	Go: implement TTS for MiniMax provider and CLI testing for TTS (#14911 ) ### What problem does this PR solve? This PR implements TTS for the MiniMax provider and adds CLI testing for TTS. The following functionalities are now supported: MiniMax: - [x] Chat / Stream Chat - [x] Embedding - [x] Rerank - [x] Model listing - [x] Provider connection checking - [x] Text To Speech - [ ] OCRFile - [ ] ~~Audio To Text~~ - [ ] ~~Balance~~ Verified examples from the CLI: ```plaintext RAGFlow(user)> tts with 'speech-2.8-hd@test@minimax' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"voice_setting": {"voice_id": "English_radiant_girl", "speed": 1, "vol": 1, "pitch": 0}, "audio_setting": {"sample_rate": 32000, "bitrate": 128000, "format": "wav", "channel": 1}, "output_format": "hex"}' Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/speech-2.8-hd_output.wav SUCCESS RAGFlow(user)> stream tts with 'speech-2.8-hd@test@minimax' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"voice_setting": {"voice_id": "English_radiant_girl", "speed": 1, "vol": 1, "pitch": 0}, "audio_setting": {"sample_rate": 32000, "bitrate": 128000, "format": "wav", "channel": 1}, "output_format": "hex"}' Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/speech-2.8-hd_output.wav SUCCESS ``` CLI options: - `Play`: play the audio in the CLI - `Save` `PATH_TO_SAVE`: save the audio file to the given directory - `format`: save the file as wav or mp3 - `Param`: JSON parameters aligned with the provider's official request body ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-14 13:19:31 +08:00
Br1an	d46bbd30f7	Fix: send input and output token usage to Langfuse (#13294 ) ### What problem does this PR solve? Closes #9837 The Langfuse integration currently only sends the output text to `langfuse_generation.update()` without including token usage information. This means Langfuse cannot track input/output token consumption for cost analysis and monitoring. ### Solution Add the `usage` parameter to `langfuse_generation.update()` with: - `input`: approximate input token count from `message_fit_in()` - `output`: approximate output token count from `num_tokens_from_string(answer)` - `total`: sum of input and output ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 13:11:37 +08:00
Ricardo-M-L	cc21dc7f00	fix: replace broken assert with raise ValueError in variable_assigner and loop (#13906 ) `assert "string"` always passes in Python because non-empty strings are truthy. This silently skips input validation: - variable_assigner.py line 51: `assert "Variable is not complete."` → `raise ValueError("Variable is not complete.")` - loop.py line 59: `assert "Loop Variable is not complete."` → `raise ValueError("Loop Variable is not complete.")` Without this fix, incomplete variables pass validation silently and cause a confusing KeyError on the next line.	2026-05-14 12:33:17 +08:00
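A short sketch of why the original check never fired. The function names and the `variable["name"]` lookup are hypothetical; only the `assert` versus `raise ValueError` contrast is taken from the commit.

```python
def validate_buggy(variable):
    if not variable:
        # The asserted expression is a non-empty string, which is truthy,
        # so this statement can never fail.
        assert "Variable is not complete."
    return variable["name"]   # an incomplete variable fails here instead


def validate_fixed(variable):
    if not variable:
        raise ValueError("Variable is not complete.")
    return variable["name"]


def error_type(func, arg):
    """Return the name of the exception raised by func(arg), or None."""
    try:
        func(arg)
    except Exception as exc:
        return type(exc).__name__
    return None
```

Calling `error_type(validate_buggy, {})` reports the confusing `KeyError` from the following line, while `error_type(validate_fixed, {})` reports the intended `ValueError`.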
07heco	8dc5b1b42d	fix: optimize reranking module robustness and bug fixes (#14264 ) ## Description This PR fixes critical bugs and improves the robustness of the RAG reranking module while maintaining 100% backward compatibility with all existing functionality and providers. ## Key Changes 1. Network Stability: Added 30s timeout to all API requests to prevent service blocking 2. Boundary Protection: Added empty query/text validation for all rerank models 3. Response Fault Tolerance: Replaced hardcoded key access with `.get()` to avoid KeyError crashes 4. Bug Fixes: - Fixed `Ai302Rerank` (completely non-functional before) - Fixed `GPUStackRerank` incorrect exception catching - Fixed `_normalize_rank` empty array crash 5. Code Specification: Added type annotations, standardized unimplemented class prompts ## Compatibility - ✅ No changes to any class/method names - ✅ All rerank providers (Jina/Cohere/NVIDIA/HuggingFace etc.) work as before - ✅ No breaking changes, zero impact on existing workflows ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 11:56:09 +08:00
writinwaters	851b16b913	Docs: Added v0.25.4 release notes draft. (#14914 ) ### What problem does this PR solve? v0.25.4 release notes draft. ### Type of change - [x] Documentation Update v0.25.4	2026-05-14 11:24:20 +08:00
Liu An	f038a34154	Docs: Update version references to v0.25.4 in READMEs and docs (#14912 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.3 to v0.25.4 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-05-14 11:07:08 +08:00
buua436	b89878c593	Fix: dataset document download route (#14910 ) ### What problem does this PR solve? Fixes the dataset document download route. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 10:59:06 +08:00
writinwaters	1c0eaa504b	Docs: Finalized v0.25.3 release notes (#14913 ) ### What problem does this PR solve? 0.25.3 release notes, final. ### Type of change - [x] Documentation Update	2026-05-14 10:57:43 +08:00
sirj0k3r	b2b63600f1	Adds gpt-5.4-mini and gpt-5.4-nano (#14908 ) ### What problem does this PR solve? Adds gpt-5.4-mini and gpt-5.4-nano to the OpenAI model list. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-14 10:16:24 +08:00
Yingfeng	e577901388	Fix doc format (#14909 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 09:49:45 +08:00
tmimmanuel	cb01529d8b	Go: implement provider: Voyage AI (#14811 ) ### What problem does this PR solve? Add a Go driver for Voyage AI (https://voyageai.com), one of the unchecked providers on the umbrella tracking issue #14736. Voyage AI is embed + rerank only — no chat, no streaming, no `/v1/models` endpoint. It's the first provider in the Go layer of this shape. Until this PR, a tenant who configured `voyage` as a model provider in the Go layer fell through to the default branch of `internal/entity/models/factory.go` and got the dummy driver. ### What this PR includes - New `internal/entity/models/voyage.go` with a `VoyageModel` implementing the `ModelDriver` interface. - New `conf/models/voyage.json` with 6 embedding models (`voyage-3.5`, `voyage-3.5-lite`, `voyage-3-large`, `voyage-code-3`, `voyage-law-2`, `voyage-finance-2`) and 2 rerank models (`rerank-2`, `rerank-2-lite`). - `factory.go`: route `"voyage"` to `NewVoyageModel`. - `internal/entity/models/voyage_test.go`: 19 unit tests. ### How the driver works - Embed: `POST /v1/embeddings`. Response is OpenAI-shaped (`{data: [{embedding, index, object, text}], model, usage}`). Driver reorders by `index`, rejects duplicate / out-of-range / missing slots, and short-circuits empty input without an HTTP call. - Rerank: `POST /v1/rerank`. Voyage uses `top_k` as the request param name (not `top_n` like Aliyun/SiliconFlow); the driver translates `RerankConfig.TopN` → `top_k`. Response is Cohere-shaped (`{data: [{relevance_score, index}], model}`), so the existing `RerankResponse{Data: []RerankResult{Index, RelevanceScore}}` shape fits cleanly. - `ListModels`: returns a hardcoded list of `voyageKnownModels`. Voyage does not expose `/v1/models` (probed live, returns 404), so the driver synthesizes the list from the same set the config ships. New upstream models are added by extending one slice. - `CheckConnection`: pings a 1-input embed call against `voyage-3.5`. 
Without `/v1/models`, this is the cheapest way to verify the API key + network path before a tenant tries a real workload. - `ChatWithMessages` / `ChatStreamlyWithSender` / `Balance` / `TranscribeAudio` / `AudioSpeech` / `OCRFile`: all return `"no such method"`. Voyage does not host any of these surfaces. No interface change. No new dependencies. ### How was this tested? 19 unit tests in `internal/entity/models/voyage_test.go` — all pass on go 1.25: ``` $ go test -vet=off -run TestVoyage -count=1 ./internal/entity/models/... ok ragflow/internal/entity/models 0.036s ``` Coverage: Name; Embed (happy path, reorder, empty-input, missing key/model, duplicate index, out-of-range index, missing slot); Rerank (happy path with `top_k` assertion, default-to-len-documents, empty documents, out-of-range index); ListModels (static list, missing key); CheckConnection (happy, 401); chat methods sentinels; Balance sentinel; audio/OCR sentinels. `go build ./internal/entity/models/...` exits 0. Live integration test against `api.voyageai.com`: ``` === RUN TestVoyageLiveSmoke [OK] Name() = "voyage" [OK] ListModels (static): 8 models -> [voyage-3.5 voyage-3.5-lite voyage-3-large voyage-code-3 voyage-law-2 voyage-finance-2 rerank-2 rerank-2-lite] [OK] CheckConnection [OK] Embed vectors=3 dim=1024 indices=[0 1 2] [OK] Embed(empty) -> 0 vectors [OK] Rerank results=3 scores=[0.8125 0.59765625 0.39453125] [OK] ChatWithMessages returns voyage, no such method [OK] Balance returns voyage, no such method VOYAGE LIVE SMOKE PASSED --- PASS: TestVoyageLiveSmoke (0.81s) ``` What the live run proves on the wire: - Auth (`Bearer <key>`) accepted by `api.voyageai.com`. - Embed `voyage-3.5` on 3 inputs returns 3 vectors at dim 1024 with `index` field preserved as `[0, 1, 2]` — the reorder-by-index code is exercised on real data. - Empty input short-circuits without an HTTP call (mock server would have been hit if it did). 
- Rerank `rerank-2` on 3 docs returns 3 real `relevance_score` floats `[0.8125, 0.598, 0.395]`. The `top_k` translation works on the live wire. - All sentinel methods return the documented `"no such method"` strings. ### Note on PR history This branch was previously named for LocalAI Embed work which is now consolidated into PR #14813. The branch was reset to `upstream/main` and rebuilt for Voyage. Diff against `main` is a clean +838 lines across 4 files. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Tracking: #14736 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-14 09:46:54 +08:00
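The reorder-by-index step described for the Voyage embed path can be sketched as below. The driver itself is written in Go; this Python version is an illustration only, and the function name, the dict-shaped items, and the error messages are assumptions rather than the code in `voyage.go`.

```python
def reorder_embeddings(items, expected):
    """Place each embedding at the slot named by its `index` field.

    Rejects out-of-range indices, duplicates, and slots left unfilled.
    """
    if expected == 0:
        return []                      # empty input: nothing to reorder
    slots = [None] * expected
    for item in items:
        idx = item["index"]
        if not 0 <= idx < expected:
            raise ValueError(f"index {idx} out of range")
        if slots[idx] is not None:
            raise ValueError(f"duplicate index {idx}")
        slots[idx] = item["embedding"]
    missing = [i for i, slot in enumerate(slots) if slot is None]
    if missing:
        raise ValueError(f"missing slots {missing}")
    return slots


shuffled = [
    {"index": 2, "embedding": [0.3]},
    {"index": 0, "embedding": [0.1]},
    {"index": 1, "embedding": [0.2]},
]
ordered = reorder_embeddings(shuffled, expected=3)
```

Validating against the number of inputs sent, rather than trusting response order, is what lets the caller pair each vector with the text it was computed from.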
plind	dd76653dc1	feat: add tag management for Agents with filtering and sorting (#14774 ) (#14799 ) ## Summary Closes #14774. Adds free-form tags on agents (UserCanvas) with full UI + API: - Stored as comma-separated `tags` column on `UserCanvas` with online migration. - New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT /v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query. - "Edit tags" item in agent dropdown opens a chip-style editor dialog; tags render as badges on each agent card. - New "Tags" facet in the agents filter bar, with counts. ## Implementation notes - Tag matching is exact-token: the SQL filter wraps stored tags as `,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`. - Server-side normalization in `UserCanvasService.update_tags`: dedup (case-insensitive), per-tag cap of 64 chars, total length capped at 512 chars to fit the column, commas inside tag values are replaced with spaces. - Tenant authorization: `PUT /v1/agent/<id>/tags` gates on `UserCanvasService.accessible(canvas_id, tenant_id)`. - Tag listing scope: `UserCanvasService.list_tags` follows the same own + team-shared rule as `get_by_tenant_ids`. - i18n: keys added to `en.ts` and `zh.ts` only (per project convention; other locales fall back). - `HomeCard` gets a non-breaking `extra?: ReactNode` slot for the chip row; no `src/components/ui/` files modified. ## Test plan - [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column exists (`DESCRIBE user_canvas`). - [ ] Agents page renders cards normally (no console error from missing field). - [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog was unmounting with the dropdown). - [ ] Typing a tag without pressing Enter and clicking Save persists it (regression: last typed tag was being dropped). - [ ] Chip input supports Enter/comma to commit, Backspace on empty to remove, `×` to remove individual chip. 
- [ ] Tag containing a comma sent via API is stored with the comma replaced by a space. - [ ] 20 long tags sent via API does not error (length cap silently truncates). - [ ] "Tags" filter in the filter bar shows counts and narrows the list. - [ ] Filtering by `ml` does not return agents tagged `ml-ops`. - [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc. - [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or no permission.`	2026-05-13 21:41:32 +08:00
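The normalization and exact-token matching rules in the implementation notes can be sketched in a few lines. This is an assumption-laden illustration, not `UserCanvasService.update_tags`: the function names are invented, the matching is done in Python rather than SQL, and dropping tags that would exceed the total cap is one possible reading of "silently truncates".

```python
def normalize_tags(tags, per_tag_cap=64, total_cap=512):
    """Return a comma-separated string of cleaned, de-duplicated tags."""
    seen, kept = set(), []
    for raw in tags:
        # Commas would break the storage format, so they become spaces.
        tag = raw.replace(",", " ").strip()[:per_tag_cap]
        key = tag.lower()              # case-insensitive de-duplication
        if not tag or key in seen:
            continue
        if len(",".join(kept + [tag])) > total_cap:
            break                      # assumed policy: drop the overflow
        seen.add(key)
        kept.append(tag)
    return ",".join(kept)


def has_tag(stored, tag):
    """Exact-token match: wrap both sides in commas before comparing."""
    return f",{tag}," in f",{stored},"


stored = normalize_tags(["ml", "ML", "a,b", "ml-ops"])
```

Wrapping the stored value as `,…,` is what keeps a filter for `ml` from matching `ml-ops`: the search token `,ml,` requires a delimiter on both sides.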
writinwaters	cb49f47c38	Docs: Editorial updates to the v0.25.3 release notes draft. (#14903 ) ### What problem does this PR solve? v0.25.3 release notes. To be continued. ### Type of change - [x] Documentation Update	2026-05-13 21:36:34 +08:00


6255 Commits