ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-30 07:51:10 +08:00

Author	SHA1	Message	Date
qinling0210	563d855780	Implement OpenAI chat completions in GO (#16177 ) ### What problem does this PR solve? Implement OpenAI chat completions in GO POST /api/v1/openai/<chat_id>/chat/completions OpenAI chat cli: internal/development.md ### Type of change - [x] Refactoring	2026-06-18 18:07:27 +08:00
Lynn	47bd9dd049	Fix: replace tenant_llm apis (#16131 ) Replace tenant_llm apis with provider-instance apis.	2026-06-18 16:38:32 +08:00
buua436	78b4906f7a	fix: tighten embedding truncation threshold (#16123 ) ### What problem does this PR solve? Use a 95% max_length threshold before truncating embedding inputs, which reduces the chance of provider-side invalid-parameter errors on near-limit chunks. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-17 14:18:02 +08:00
euvre	fe46244d30	fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM (#16106 ) The parser pods suffer from OOM kills when processing large PDF documents. The root cause is in api/db/services/task_service.py: when layout_recognize is not DeepDOC (e.g. Plain Text), page_size was set to MAXIMUM_TASK_PAGE_NUMBER (100 million), causing the entire PDF to be processed as a single task with all pages loaded into memory simultaneously. This PR fixes the issue by paginating non-DeepDOC PDF parsing tasks the same way DeepDOC already does.	2026-06-17 09:33:53 +08:00
Wang Qi	17e3aad7ae	Revert "fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM" (#16104 ) Reverts infiniflow/ragflow#15951	2026-06-16 20:11:45 +08:00
euvre	d2a18d5c46	fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM (#15951 ) ### What problem does this PR solve? The parser pods suffer from OOM kills when processing large PDF documents. The root cause is in api/db/services/task_service.py: when layout_recognize is not DeepDOC (e.g. Plain Text), page_size was set to MAXIMUM_TASK_PAGE_NUMBER (100 million), causing the entire PDF to be processed as a single task with all pages loaded into memory simultaneously. This PR fixes the issue by paginating non-DeepDOC PDF parsing tasks the same way DeepDOC already does. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [x] Performance Improvement - [ ] Other (please describe):	2026-06-16 20:07:19 +08:00
Kevin Hu	15f50e5cb2	fix: rename dialog_id to chat_id in chat_channel (backend + frontend) (#16096 ) ## Summary - The `ChatChannel` DB column was renamed from `dialog_id` to `chat_id` via a migration (added in a prior commit). - Aligns the REST API layer (`chat_channel_api.py`, `chat_channel_service.py`) to use `chat_id` consistently. - Updates the frontend (`interface.ts`, `hooks.ts`, `connect-dialog-modal.tsx`, `added-channel-card.tsx`) to read/write `chat_id` instead of `dialog_id`. - The joined `dialog_name` alias in the list query is unchanged (backend still returns it under that name). Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-16 19:02:20 +08:00
Lynn	70792de899	Fix: v0.26.1 model provider (#16073 ) ### What problem does this PR solve? Fix: - Pass session_id to langfuse. - Get correct status for add model_type. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-16 16:21:43 +08:00
Kevin Hu	5a817762fa	Refactor: Change table chat_channel status data type. (#16061 ) ### What problem does this PR solve? As title. ### Type of change - [x] Refactoring	2026-06-16 12:02:12 +08:00
buua436	8e235b7b95	fix: add legacy chat/completions mode (#16014 ) ### What problem does this PR solve? Adds a legacy mode for /chat/completions that restores v0.23.0-style output by converting start_to_think/end_to_think back into raw <think></think> markers and streaming cumulative answer text. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-16 10:34:06 +08:00
Lynn	47495c1f6a	Feat: model provider (#16028 ) ### What problem does this PR solve? Feat: - Allow upsert model_type for instance model Fix: - Allow create instance with duplicate api_key ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-06-15 19:10:33 +08:00
Wang Qi	f6a2075ad0	Fix one data source can be synced to multiple dataset (#16023 ) Fix one data source can be synced to multiple dataset Test add/delete - worked.	2026-06-15 16:54:25 +08:00
Yingfeng	b5bea72e4b	Add git-like file commit API (#15978 ) ### What problem does this PR solve? \| # \| Method \| Endpoint \| Description \| Git Equivalent \| \|---\|--------\|----------\|-------------\|----------------\| \| 1 \| `POST` \| `/api/v1/{prefix}/{folder_id}/commits` \| Create a snapshot commit with file changes (add/modify/delete/rename) \| `git add` + `git commit` \| \| 2 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits` \| List commit history (paginated) \| `git log` \| \| 3 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}` \| Get commit detail with file changes \| `git show` \| \| 4 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files` \| List file changes in a commit \| `git show --name-status` \| \| 5 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/diff?from=...&to=...` \| Compare two commits and return differences \| `git diff` \| \| 6 \| `GET` \| `/api/v1/{prefix}/{folder_id}/changes` \| Get uncommitted changes (add/modify/delete) \| `git status` \| \| 7 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/tree` \| Get the folder tree snapshot at commit time \| `git ls-tree` \| \| 8 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files/{file_id}/content` \| Get a file's content as it existed in a specific commit \| `git show HEAD:file` \| \| 9 \| `GET` \| `/api/v1/{prefix}/{file_id}/versions` \| Get version history for a specific file across all commits \| `git log -- file` \| Where `{prefix}/{id}` can be: - `folders/{folder_id}` — direct folder access - `workspaces/{workspace_id}` — alias of `folders/{folder_id}` - `datasets/{dataset_id}` — resolves to the dataset's folder - `memories/{memory_id}` — resolves to the memory's folder - `skills/{skill_id}` — resolves to the skill's folder ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2026-06-15 11:19:56 +08:00
Kevin Hu	b5a426e6e0	Feat: chat channels — connect assistants to external messaging bots (#15850 ) ### What problem does this PR solve? #15844 Adds a Chat channels capability so a RAGFlow assistant (Dialog) can be exposed as a bot on external messaging platforms (Feishu/Lark, Discord, Telegram, Slack, WeCom, LINE, etc.). An admin configures a bot in the UI, connects it to an assistant, and inbound messages are answered from that assistant's knowledge base — replies are delivered back on the channel. Feishu/Lark is implemented and tested end-to-end. Discord, Telegram, LINE, and WeCom are scaffolded against the same interface; the remaining listed channels are tracked as follow-ups. ### Design Backend - New `chat_channel` table (`tenant_id`, `name`, `channel`, `config` JSON holding `{credential: {...}}`, `dialog_id`, `status`) + `ChatChannelService` and RESTful CRUD under `/api/v1/chat_channels`. - Channel framework under `api/channels/`: a `core` registry + per-channel packages that self-register a builder and implement a common `Channel` interface (`start`/`stop`/`send` + inbound normalization) over `IncomingMessage`/`OutgoingMessage`. - Embedded reconcile loop in `ragflow_server` (`api/channels/bootstrap.py`): loads enabled bots, and starts/stops/restarts them as rows change (no server restart needed). Inbound messages run the connected dialog via the non-streaming completion path, keeping per-end-user conversation history. - Missing optional channel SDKs degrade gracefully (channel skipped with a warning; others unaffected). Channel-level errors are logged, not crashed. - Feishu's WebSocket client runs in a dedicated thread with its own event loop to avoid cross-loop/contextvars conflicts with the channel runtime. Frontend - Settings → Chat channels panel: available-channels grid + configured-bots list with add/edit/delete and a Connect assistant popup that binds a bot to a dialog. - Brand icons via simple-icons / reused shared data-source assets, with colored fallbacks for brands not available. - Route, sidebar entry, i18n (en/zh), and a top-nav segment-boundary fix so the settings page no longer highlights the Chat tab. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Notes - DB: new `chat_channel` table is auto-created; `chat_channel.dialog_id` is also covered by a `migrate_db` `alter_db_add_column` for existing installs. - Channel SDKs (`lark-oapi`, `discord.py`, `python-telegram-bot`, `line-bot-sdk`, `wechatpy`, `aiohttp`) added to dependencies. - Screenshots / per-channel credential docs to follow. <img width="1338" height="1290" alt="Image" src="https://github.com/user-attachments/assets/042cb2f9-0dad-4e6a-bcf7-43ced4bbd704" /> <img width="1344" height="738" alt="Image" src="https://github.com/user-attachments/assets/373cd08e-ec40-4c67-9c51-4d948b1ba617" /> <img width="672" height="887" alt="Image" src="https://github.com/user-attachments/assets/5a34953f-a9a3-4c1e-869e-5eff0dc64c84" /> ---------	2026-06-12 18:21:30 +08:00
Jonathan Chang	de06c9a60b	feat: Langfuse session grouping for multi-turn chat traces (#15679 ) ## Summary This PR passes `session_id` into Langfuse trace observations so multi-turn chat messages can be grouped under the same session in Langfuse. Changes include: - Propagate `session_id` from chat/session APIs into `dialog_service.async_chat`. - Pass `session_id` into Langfuse `start_observation(...)`. - Share Langfuse `trace_context` with chat, embedding, rerank, and TTS model bundles where applicable. - Add unit coverage to verify Langfuse observations receive `session_id`. - Update affected test stubs for the new optional Langfuse context arguments. ## Related Issue Closes: #15636 ## Change Type - [x] Feature - [x] Bug fix - [x] Test - [ ] Refactor - [ ] Documentation - [ ] Breaking change ## Real Behavior Proof Before this change: - Langfuse observations were created without `session_id`. - Multi-turn chat traces could not be grouped by session in Langfuse. After this change: - Chat/session flows pass `session_id` into `async_chat`. - Langfuse observations include `session_id`. - Related model bundles receive shared trace context and session metadata. Validation result: ```bash uv run python -m py_compile \ api/db/services/tenant_llm_service.py \ api/db/services/llm_service.py \ api/db/services/dialog_service.py \ api/db/services/conversation_service.py \ api/apps/restful_apis/chat_api.py \ test/unit_test/api/db/services/test_dialog_service_final_answer.py \ test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py ``` Passed. ```bash uv run pytest \ test/unit_test/api/db/services/test_dialog_service_final_answer.py \ test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py -q ``` Result: ```text 11 passed in 16.89s ``` ```bash git diff --check ``` Passed. ## Checklist - [x] Analyzed the issue requirement. - [x] Checked existing Langfuse trace integration. - [x] Implemented only the requested session grouping behavior. - [x] Added/updated unit tests. - [x] Ran focused tests successfully. - [x] Ran Python compile validation. - [x] Ran whitespace diff validation.	2026-06-12 10:18:06 +08:00
bohdansolovie	381091df71	fix(dialog): guard async_ask() against empty or invalid kb_ids (#15530 ) Fixes #15529 . ### Problem `async_ask()` accessed `kbs[0]` without verifying that `KnowledgebaseService.get_by_ids()` returned any knowledge bases. Empty or stale `kb_ids` raised `IndexError`, which surfaced as HTTP 500 on search/bot SSE endpoints. ### Fix - Add an early guard when `kbs` is empty, yielding a final SSE error event (consistent with `gen_mindmap()` in the same module). - Add regression tests for empty `kb_ids` and deleted/invalid KB IDs. ### Test plan - [ ] `pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -k "async_ask_empty or async_ask_stale"` - [ ] Manual: `POST /api/v1/searchbots/ask` with invalid `kb_ids` returns SSE error, not HTTP 500 --------- Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:52:59 +08:00
少卿	8e17a12990	fix: remove think text buffering for real-time reasoning stream (#15891 ) Fix: remove think text buffering for real-time reasoning stream	2026-06-10 16:55:57 +08:00
Lynn	478c9846a1	Fix: model list (#15860 ) ### What problem does this PR solve? Remove tenant_llm call in rag. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-10 14:59:57 +08:00
buua436	c1496ffd43	fix: propagate memory tenant id in task collect (#15837 ) ### What problem does this PR solve? Propagate `tenant_id` from memory task messages into task collection so refactored task execution can build a valid context. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-09 17:47:48 +08:00
DearsisHS	cbb3896aaa	fix(api): guard missing row in SearchService.get_detail (#15622 ) ## Summary `SearchService.get_detail` crashed with `AttributeError` (HTTP 500) when no matching row existed, because it called `.first().to_dict()` before the `if not search` guard — making that guard dead code. ## Root cause `.first()` returns `None` when the query matches nothing (deleted search app, or joined `User` not `VALID`). `None.to_dict()` raises before the guard runs. ## Fix ```diff .first() - .to_dict() ) if not search: return {} - return search + return search.to_dict() ``` Guard the `None` first, then serialize — restoring the intended `{}` "not found" return that every caller (`search_api`, `bot_api`, `chat_api`, `dataset_api_service`) already handles. ## Files changed - `api/db/services/search_service.py` ## Verification - `ruff check` — clean - Logic: `.first()` → `None` now hits `return {}` instead of `None.to_dict()`. Local full pytest not run (heavy RAG deps); CI validates. ## Note Implemented with LLM assistance (model: claude-opus-4-8). Closes #15621 Co-authored-by: dearsishs <MCarter112116@outlook.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 23:01:28 +08:00
buua436	c8c890b06c	fix: refine think stream parsing (#15745 ) ### What problem does this PR solve? Refine the stream parsing for `<think>` / `</think>` so MiniMax and DeepSeek-style chunking both flush in the right order without mixing think and answer buffers. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-08 16:53:22 +08:00
qinling0210	c960dc2a4c	Refine handling of POST /api/v1/datasets/search in GO (#15583 ) ### What problem does this PR solve? Refine handling of POST /api/v1/datasets/search in GO ### Type of change - [x] Refactoring	2026-06-08 11:49:37 +08:00
Wang Qi	aa9545e4c9	Revert "fix: duplicate document ingest guard" (#15707 ) Reverts infiniflow/ragflow#15638	2026-06-05 17:45:29 +08:00
buua436	71649db3b0	fix: prevent duplicated post-think text (#15651 ) ### What problem does this PR solve? This fixes duplicated post-think text in streamed chat responses. When the model emits text immediately after `</think>`, the stream state now advances its cursor correctly so the same visible prefix is not emitted twice. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-05 13:21:26 +08:00
buua436	423fb6faae	fix: duplicate document ingest guard (#15638 ) ### What problem does this PR solve? When a document is rerun or updated concurrently, the previous unconditional update could overwrite a newer task state. This change adds an `update_time`-based optimistic lock so the update only succeeds if the record has not been modified by another flow in the meantime. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-04 17:57:51 +08:00
buua436	bbacb31226	Fix: think stream tail handling (#15582 ) ### What problem does this PR solve? think stream tail handling ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-04 10:04:35 +08:00
Lynn	dc4b82523b	Feat: tenant llm provider (#14595 ) ### What problem does this PR solve? Python implementation of the Go-based model_provider API suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: bill <yibie_jingnian@163.com>	2026-05-29 17:39:41 +08:00
Jack	f0cb7a544b	Refactor: Task Executor (#15154 ) ### What problem does this PR solve? 1. Break huge function into smaller pieces 2. Add unit test for the smaller pieces function 3. Layer-ed design a. infra layer - task_context.py, recording_context.py, write_operation_interceptor.py, ... b. service layer - *_service.py c. business layer - task_handler.py 4. Default behavior: use "refactor-ed version" - can switch to original version by change env variable ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring - [x] Performance Improvement --------- Co-authored-by: Liu An <asiro@qq.com> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-05-27 21:54:17 +08:00
Ahmad Intisar	e6068a7f7e	Fix: table parser metadata (#15127 ) ### What problem does this PR solve? This PR improves the table upload flow for CSV/Excel files by allowing table column role configuration at upload time. Previously, users had to: 1. Upload and parse a table file. 2. Open parser settings and manually set table column roles. 3. Re-parse the file for the roles to take effect. This was inefficient and required an unnecessary second parse. With this change: 1. When the knowledge base uses table parsing, the upload dialog extracts CSV/Excel headers client-side. 2. Users can choose Auto mode or Manual mode. 3. In Manual mode, users can assign per-column roles before upload. 4. The selected parser config is sent with the upload request and applied server-side during document creation. Result: configured table column roles are applied from the first parse. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>	2026-05-25 16:05:38 +08:00
Wang Qi	7e6844118b	Fix search vector_similarity_weight (#15108 ) ### What problem does this PR solve? Fix search vector_similarity_weight ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-22 16:05:13 +08:00
kpdev	faf77a5a8a	feat(evaluation): track token usage in evaluation results (#13487 ) ## Summary Implements the TODO in `evaluation_service.py`: Track token usage in evaluation results. ## Changes - Import `num_tokens_from_string` from `common.token_utils` - Prompt tokens: Use the full prompt returned by `async_chat` when available (includes system prompt + knowledge base + query), otherwise fall back to the question token count - Completion tokens: Count tokens in the generated answer - Storage: Store `token_usage` as `{prompt_tokens, completion_tokens, total_tokens}` in each `EvaluationResult` instead of `None` ## Why The evaluation pipeline previously saved `token_usage: None` for every result. This change allows downstream consumers (e.g. evaluation dashboards, cost tracking) to see approximate token usage per test case using the same tokenizer (tiktoken cl100k_base) used elsewhere in RAGFlow. ## Testing - No new tests added; existing evaluation flow unchanged - Token counting uses existing `num_tokens_from_string` utility --------- Co-authored-by: kiannidev <kiannidev@users.noreply.github.com>	2026-05-22 15:19:53 +08:00
Full Stack Developer	8f90740d2e	feat: pass chat_template_kwargs through agent chat completion (#14542 ) ### What problem does this PR solve? The agent API currently does not pass chat_template_kwargs to the underlying LLM call path, so clients cannot control template-level model behavior (such as thinking-mode toggles) when invoking /agents/chat/completion. This PR adds passthrough support for chat_template_kwargs across agent execution flows (session and non-session, streaming and non-streaming) by propagating it through canvas runtime state and into LLM invocation kwargs. This addresses the feature gap raised in [Issue #14182](https://github.com/infiniflow/ragflow/issues/14182). Closes #14182 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-22 15:15:49 +08:00
kingloon	da4eaf9fb0	Fix: remove duplicate function definitions (#15063 ) ### What problem does this PR solve? Remove duplicate function definitions in `api/db/services/dialog_service.py`. Problem: Two helper functions were defined twice in the same file, but with different parameter orders: - First definition (line 57): `_resolve_reference_metadata(request_payload=None, config=None)` - Second definition (line 136): `_resolve_reference_metadata(config, request_payload=None)` Solution: Keep the second definition (which is actually used by other modules) and remove the first one to avoid confusion. Additionally, remove duplicate `_enrich_chunks_with_document_metadata` definition (keep line 140 version). <img width="1584" height="313" alt="image" src="https://github.com/user-attachments/assets/7daee832-244f-4bb2-8488-e3b65012a3f9" /> <img width="1672" height="359" alt="image" src="https://github.com/user-attachments/assets/4fd2f523-273c-4b20-a7c9-ab35740b7834" /> ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-05-21 15:31:51 +08:00
bitloi	6499bce2a6	fix: Langfuse chat observation (#15026 ) ### What problem does this PR solve? Closes #15025 Langfuse-enabled `dialog_service.async_chat()` regressed to `langfuse_tracer.start_generation(...)` after the earlier Langfuse v4 migration. Langfuse v4 uses `start_observation(as_type="generation")`, so the remaining `start_generation` call can fail when chat tracing is enabled. This restores the migrated `start_observation(as_type="generation")` call for chat observations while preserving the existing trace context, model, input payload, and update/end flow. It also adds a regression test with a fake Langfuse v4-style client that exposes `start_observation()` but not `start_generation()`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Tests - `.venv/bin/pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -q` - `.venv/bin/ruff check api/db/services/dialog_service.py test/unit_test/api/db/services/test_dialog_service_final_answer.py`	2026-05-20 15:01:19 +08:00
plind	f169ab4b39	feat(tts): cache synthesized speech in Redis to avoid redundant calls (#14851 ) ## What problem does this PR solve? Closes #12017. TTS output is deterministic for a given `(model, text)` pair, so re-running the same text through the same TTS model produces the same bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on every request. That's slow and wastes provider quota whenever the same assistant response is replayed, shared across users, or repeated within a session. ### Change New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl, cleaned_text)`: - Key: `tts:cache:{model_id}:{sha256(text)}` — separate namespace per model, identical cleaned text reuses a single entry across both call sites. - Value: the hex-encoded audio blob both call sites already returned. No format change for downstream consumers. - TTL: 7 days by default, configurable via `RAGFLOW_TTS_CACHE_TTL_SECONDS`. - Failure modes: a Redis hiccup falls back to direct synthesis; a failed synthesis still returns `None` (existing contract preserved). [`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724) and [`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380) now route through the helper; the per-file bytes-accumulation/hex-encode loop has been removed in favor of one shared implementation. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Test plan - [ ] Cache hit, chat path: Configure a dialog with TTS enabled, ask the same question twice with `stream=false`. Verify the second response returns the same `audio_binary` and that the second invocation doesn't hit the TTS provider (e.g., observe provider-side logs / usage counters; check no `LLMBundle.tts can't update token usage` log line on the second run). - [ ] Cache hit, agent path: Same exercise via a Conversational Agent that includes a Message component playing back the answer. - [ ] Cache isolation per model: Switch tenant's `tts_id` between two models, run the same text against each — confirm the second model's first synthesis still happens (no cross-model hits). - [ ] TTL override: Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm the entry expires after 2 minutes. - [ ] Redis unavailable: Stop Redis (or break the connection). Verify the TTS endpoint still works — synthesis falls back to direct calls, with a `TTS cache lookup failed` / `TTS cache store failed` warning logged. - [ ] Failure path: Configure a TTS model with an invalid API key, ensure the response still returns successfully with `audio_binary=None` (no regression vs. current behavior).	2026-05-19 14:20:40 +08:00
kingloon	525a87be0f	Misc: fix some typos (#14987 ) ### What problem does this PR solve? Fix minor code quality issues: 1. Fix typo in assertion error message: "Can't fine" → "Can't find" 2. Remove duplicate line in common/connection_utils.py ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-19 10:47:06 +08:00
Magicbook1108	b69a6a5d80	Feat: full optimization on connector dashboard (#14979 ) ### What problem does this PR solve? This PR improves the connector dashboard task management experience and adds better visibility into connector execution logs. ### Overview: #### Before <img width="700" alt="image" src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052" /> #### Now: <img width="700" alt="Screenshot from 2026-05-18 16-31-30" src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627" /> ### 1. Add a new logging page to the connector dashboard A new logging page has been added so users can view connector task execution logs directly from the connector dashboard. ### 2. Merge the Resume button into Confirm The separate Resume button has been removed. The Confirm button now represents different actions depending on the current task state: - Save: Save form changes and reschedule tasks. - Stop: Cancel currently scheduled or running tasks. - Resume: Create new scheduled tasks after the previous tasks have been stopped. - Start: Start tasks when no task has been started yet. ### 3. Separate syncing and pruning tasks Connector tasks are now separated into syncing and pruning. Pruning is controlled by the Sync deleted files option: - When Sync deleted files is disabled, only syncing tasks are shown. - When Sync deleted files is enabled, both syncing and pruning tasks are shown. Now: Sync deleted files disabled <img width="700" alt="Sync deleted files disabled" src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d" /> Now: Sync deleted files enabled <img width="700" alt="Sync deleted files enabled" src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296" /> ### 4. Update logs in backend <img width="700" alt="image" src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2" /> ### 5. Remove connector resume API - Removed: `POST /v1/connectors/<connector_id>/resume` - Replaced by: `PATCH /v1/connectors/<connector_id>` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-19 10:07:11 +08:00
Jake Armstrong	93d3deb5e4	Fix admin CLI system variable commands (#14956 ) ## What Fixes #12409. Implements admin CLI support for: - `list vars;` - `show var <name-or-prefix>;` - `set var <name> <value>;` ## Changes - Wire Go CLI variable commands to the admin API. - Support integer and quoted string values in `SET VAR`. - Return variable rows as `data_type`, `name`, `setting_type`, and `value`. - Add exact-name lookup with prefix fallback for `SHOW VAR`. - Validate values by stored data type: `string`, `integer`, `bool`, and `json`. - Keep the legacy Python admin CLI/server behavior aligned. - Update admin CLI docs and add focused tests. ## Verification - `go test -count=1 ./internal/cli` - `python3.12 -m py_compile admin/server/services.py admin/server/routes.py api/db/services/system_settings_service.py admin/client/parser.py admin/client/ragflow_client.py` - Python admin CLI parser smoke test for `SET VAR`, quoted values, `SHOW VAR`, and `LIST VARS`. - Attempted `./run_go_tests.sh`; local environment is missing native tokenizer/linker artifacts: - `internal/cpp/cmake-build-release/librag_tokenizer_c_api.a` - `-lstdc++` Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 19:08:45 +08:00
Hamza Amin Khokhar	2dbe3b8a62	fix: metadata_condition returning all docs when filter matches nothing (#14967 ) ### What problem does this PR solve? When _parse_doc_id_filter_with_metadata returns [], the empty list is falsy so the WHERE id IN (...) clause was silently skipped, causing the full dataset to be returned instead of an empty result. Change `if doc_ids:` to `if doc_ids is not None:` in both get_list() and get_by_kb_id() to distinguish between no filter (None) and a filter that matched zero documents ([]). Fixes #14962 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 18:54:30 +08:00
dev	b12eaee38b	fix(api): enforce tenant access for connector routes (#14747 ) ### What problem does this PR solve? Fixes #14746. Adds tenant access checks for connector-by-id REST routes before reading connector details, mutating connector config/status, deleting connectors, rebuilding, or listing sync logs. Unauthorized callers now receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without reaching the connector/log mutation paths. Validation: - `python3 -m pytest --confcutdir=test/testcases/test_web_api/test_connector_app test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` - `uvx ruff check api/apps/restful_apis/connector_api.py api/db/services/connector_service.py test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>	2026-05-18 16:09:26 +08:00
Wang Qi	56d73d0c2c	Refactor: speed up ragflow server, save startup memory (#14973 ) ### What problem does this PR solve? Refactor: speed up ragflow server, save startup memory, saved 200MiB, and 5-9 seconds start time. ##### Before 1241292 \| \| \_ python3 api/ragflow_server.py RAGFlow server is ready after 25.61845850944519s initialization. ##### After 1019968 \| \| \_ python3 api/ragflow_server.py RAGFlow server is ready after 16.205134391784668s initialization. ### Type of change - [x] Refactoring	2026-05-18 15:55:59 +08:00
qinling0210	f1d2383572	Push metadata filters down to Infinity (#14974 ) ### What problem does this PR solve? Push metadata filters down to Infinity ### Type of change - [x] Refactoring	2026-05-18 14:22:04 +08:00
Kevin Hu	7cdc74bbe5	Refactor: Drop the vector fetch for ES (#14970 ) ## Summary - Stop pulling chunk vectors (`q__vec`) back from Elasticsearch in the main retrieval path. ES already knows them; shipping them was pure bandwidth/memory overhead. - Recover the per-chunk cosine similarity via a second KNN-only ES call filtered by the candidate chunk ids. The new `_score` is merged with locally computed term similarity using the user-configured `vector_similarity_weight`. - Lazily fetch the chunk embedding only for the chunks `insert_citations` actually needs. ## Details `rag/nlp/search.py`* - `Dealer.search`: no longer appends `q__vec` to the ES select list. OceanBase still gets it (its rerank path is unchanged). - New `Dealer._knn_scores(sres, idx_names, kb_ids)`: a `MatchDenseExpr` over the cached query vector filtered by `id IN sres.ids`, returning `{chunk_id: cosine_score}` via ES `_score`. - New `Dealer.rerank_with_knn(...)`: term similarity from `qryr.token_similarity` plus the ES-supplied KNN score, combined with `tkweight`/`vtweight` and the existing rank-feature bonus. - New `Dealer.fetch_chunk_vectors(chunk_ids, tenant_ids, kb_ids, dim)`: on-demand vector fetch for citation use. - `Dealer.retrieval` routes Infinity → unchanged, OceanBase → existing local `rerank`, ES → new KNN-score path. `common/doc_store/es_conn_base.py`* - New `get_scores(res)` helper returning `{_id: _score}` directly from hit headers (ES doesn't surface `_score` through `get_fields`). `api/db/services/dialog_service.py` - New top-level `_hydrate_chunk_vectors(...)` helper. On ES it back-fills `ck["vector"]` from `fetch_chunk_vectors` right before `insert_citations`. No-op on Infinity / OB (their chunks already carry vectors). - Both `decorate_answer` closures became `async` and are `await`-ed at all call sites in `async_chat` and `async_ask`. ## Backend behavior \| Backend \| Returns chunk vec in main search \| Sim source \| Vectors for citations \| \|---\|---\|---\|---\| \| ES \| No \| second KNN call (`_score`) merged with term sim \| fetched on demand \| \| Infinity \| No (unchanged) \| normalized `_score` \| already on chunks \| \| OceanBase \| Yes (kept) \| local hybrid rerank \| already on chunks \| ## Test plan	2026-05-18 14:21:56 +08:00
Rene Arredondo	9f2fb4611f	Fix: guard empty/whitespace embedding inputs in LLMBundle (#14428 ) (#14924 ) Closes #14428 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 14:11:54 +08:00
dale053	bd99a22661	fix: atomic chunk/token counter updates for documents and knowledge b… (#14867 ) ### What problem does this PR solve? Fixes #14866. Previously, `DocumentService.increment_chunk_num` and `decrement_chunk_num` updated the `Document` row and its parent `Knowledgebase` row in two separate, non-transactional statements. If the second update failed (DB error, connection drop, etc.) after the first one succeeded, the document and knowledge base chunk/token counters would drift apart and stay inconsistent. There was also a behavioral asymmetry between the two methods: - `increment_chunk_num` only logged a warning when the document row was missing and returned a value that callers usually treated as success. - `decrement_chunk_num` raised `LookupError` in the same situation. This PR makes the counter updates atomic and aligns the missing-document behavior between the two methods: - Wrap the `Document` and `Knowledgebase` updates in `increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()` block so both succeed or both roll back together. - Raise `LookupError` from `increment_chunk_num` when the target document no longer exists, matching `decrement_chunk_num`. - Update `reset_document_for_reparse` in `document_api_service.py` to catch the new `LookupError` and return a proper "Document not found!" API error instead of propagating the exception. No schema changes, no API contract changes for the success path; only the failure mode for a missing document during reparse is now a clean error response instead of an uncaught exception. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 14:48:52 +08:00
Ethan T.	ba8cb9dd4a	fix: replace mutable default arguments with None in LLM chat models (#13513 ) ## Summary - Replace `gen_conf={}` with `gen_conf=None` + guard in `rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat, LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat classes) - Replace `doc_ids=[]` with `doc_ids=None` + guard in `api/db/services/document_service.py` (1 instance) - Mutable default arguments are shared across all calls, causing potential cross-request state contamination - See Python docs: https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects ## Test plan - [x] Verify LLM calls work with and without explicit gen_conf - [x] No behavior change for existing callers — `None` is replaced with `{}` at function entry 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 14:46:47 +08:00
Br1an	d46bbd30f7	Fix: send input and output token usage to Langfuse (#13294 ) ### What problem does this PR solve? Closes #9837 The Langfuse integration currently only sends the output text to `langfuse_generation.update()` without including token usage information. This means Langfuse cannot track input/output token consumption for cost analysis and monitoring. ### Solution Add the `usage` parameter to `langfuse_generation.update()` with: - `input`: approximate input token count from `message_fit_in()` - `output`: approximate output token count from `num_tokens_from_string(answer)` - `total`: sum of input and output ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 13:11:37 +08:00
plind	dd76653dc1	feat: add tag management for Agents with filtering and sorting (#14774 ) (#14799 ) ## Summary Closes #14774. Adds free-form tags on agents (UserCanvas) with full UI + API: - Stored as comma-separated `tags` column on `UserCanvas` with online migration. - New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT /v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query. - "Edit tags" item in agent dropdown opens a chip-style editor dialog; tags render as badges on each agent card. - New "Tags" facet in the agents filter bar, with counts. ## Implementation notes - Tag matching is exact-token: the SQL filter wraps stored tags as `,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`. - Server-side normalization in `UserCanvasService.update_tags`: dedup (case-insensitive), per-tag cap of 64 chars, total length capped at 512 chars to fit the column, commas inside tag values are replaced with spaces. - Tenant authorization: `PUT /v1/agent/<id>/tags` gates on `UserCanvasService.accessible(canvas_id, tenant_id)`. - Tag listing scope: `UserCanvasService.list_tags` follows the same own + team-shared rule as `get_by_tenant_ids`. - i18n: keys added to `en.ts` and `zh.ts` only (per project convention; other locales fall back). - `HomeCard` gets a non-breaking `extra?: ReactNode` slot for the chip row; no `src/components/ui/` files modified. ## Test plan - [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column exists (`DESCRIBE user_canvas`). - [ ] Agents page renders cards normally (no console error from missing field). - [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog was unmounting with the dropdown). - [ ] Typing a tag without pressing Enter and clicking Save persists it (regression: last typed tag was being dropped). - [ ] Chip input supports Enter/comma to commit, Backspace on empty to remove, `×` to remove individual chip. - [ ] Tag containing a comma sent via API is stored with the comma replaced by a space. - [ ] 20 long tags sent via API does not error (length cap silently truncates). - [ ] "Tags" filter in the filter bar shows counts and narrows the list. - [ ] Filtering by `ml` does not return agents tagged `ml-ops`. - [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc. - [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or no permission.`	2026-05-13 21:41:32 +08:00
tmimmanuel	663fc1d42c	fix(opensearch): implement doc-meta dispatch surface on OSConnection (#14577 ) ### What problem does this PR solve? Fixes #14570. On OpenSearch backends (`DOC_ENGINE=opensearch`) every document-metadata write failed with `'OSConnection' object has no attribute 'create_doc_meta_idx'`, so both `PATCH /api/v1/datasets/{ds}/documents/{doc}` with `meta_fields` and `POST /api/v1/datasets/{ds}/metadata/update` were unusable while every other document operation (retrieval, parsing, name update, chunk management) worked correctly on the same OpenSearch cluster. The bug runs deeper than the missing method name in the error message suggests. `DocMetadataService` also reached into `settings.docStoreConn.es.*` directly for the index refresh, the scripted partial update, and the count call, which means that even after adding `create_doc_meta_idx` to `OSConnection` the very next call in the same metadata flow would still raise `AttributeError` because `OSConnection` exposes `self.os` rather than `self.es`. Fixing only the reported symptom would have moved the failure one line down without restoring the feature. This PR adds a uniform document-metadata dispatch surface to both connection classes so they present the same abstract API, and routes the service layer through that surface via `getattr` guards instead of poking at backend-specific attributes. The four new methods on `OSConnection` and `ESConnectionBase` are `create_doc_meta_idx`, `refresh_idx`, `count_idx`, and `replace_meta_fields`. `OSConnection.create_doc_meta_idx` reuses the existing `conf/doc_meta_es_mapping.json` schema in the OpenSearch `body=` form because OpenSearch and Elasticsearch share the same index-creation payload, and `replace_meta_fields` emits a full scripted assignment (`ctx._source.meta_fields = params.meta_fields`) on both backends so removed keys actually disappear instead of being preserved by deep-merge semantics. The `getattr`-guarded dispatch in `DocMetadataService` keeps the existing fall-through paths intact for Infinity and OceanBase, which continue to rely on their search-based count fallback and on the delete-then-insert metadata replacement they used before, so this change is strictly additive for those two backends. Verification: `pytest test/unit_test/rag/utils/test_opensearch_doc_meta.py` runs 16 new unit tests that pass locally and pin the `OSConnection` dispatch surface, the `create_doc_meta_idx` short-circuit when the index already exists, the mapping-file payload routing, the `IndicesClient.create` failure path, the `refresh_idx` and `count_idx` success and error sentinels, and the full-assignment script emitted by `replace_meta_fields`. The test module stubs `common.settings` and `rag.nlp` at import time so the suite runs without the heavy backend SDKs that the rest of the repository pulls in transitively. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>	2026-05-11 17:04:28 +08:00
box4wangjing	292b0b8bce	chore: fix some comments to improve readability (#14756 ) ### What problem does this PR solve? fix some comments to improve readability ### Type of change - [x] Documentation Update --------- Signed-off-by: box4wangjing <box4wangjing@outlook.com>	2026-05-11 16:48:48 +08:00

1 2 3 4 5 ...

639 Commits