ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 23:41:12 +08:00

Author	SHA1	Message	Date
Kevin Hu	b5a426e6e0	Feat: chat channels — connect assistants to external messaging bots (#15850 ) ### What problem does this PR solve? #15844 Adds a Chat channels capability so a RAGFlow assistant (Dialog) can be exposed as a bot on external messaging platforms (Feishu/Lark, Discord, Telegram, Slack, WeCom, LINE, etc.). An admin configures a bot in the UI, connects it to an assistant, and inbound messages are answered from that assistant's knowledge base — replies are delivered back on the channel. Feishu/Lark is implemented and tested end-to-end. Discord, Telegram, LINE, and WeCom are scaffolded against the same interface; the remaining listed channels are tracked as follow-ups. ### Design Backend - New `chat_channel` table (`tenant_id`, `name`, `channel`, `config` JSON holding `{credential: {...}}`, `dialog_id`, `status`) + `ChatChannelService` and RESTful CRUD under `/api/v1/chat_channels`. - Channel framework under `api/channels/`: a `core` registry + per-channel packages that self-register a builder and implement a common `Channel` interface (`start`/`stop`/`send` + inbound normalization) over `IncomingMessage`/`OutgoingMessage`. - Embedded reconcile loop in `ragflow_server` (`api/channels/bootstrap.py`): loads enabled bots, and starts/stops/restarts them as rows change (no server restart needed). Inbound messages run the connected dialog via the non-streaming completion path, keeping per-end-user conversation history. - Missing optional channel SDKs degrade gracefully (channel skipped with a warning; others unaffected). Channel-level errors are logged, not crashed. - Feishu's WebSocket client runs in a dedicated thread with its own event loop to avoid cross-loop/contextvars conflicts with the channel runtime. Frontend - Settings → Chat channels panel: available-channels grid + configured-bots list with add/edit/delete and a Connect assistant popup that binds a bot to a dialog. 
- Brand icons via simple-icons / reused shared data-source assets, with colored fallbacks for brands not available. - Route, sidebar entry, i18n (en/zh), and a top-nav segment-boundary fix so the settings page no longer highlights the Chat tab. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Notes - DB: new `chat_channel` table is auto-created; `chat_channel.dialog_id` is also covered by a `migrate_db` `alter_db_add_column` for existing installs. - Channel SDKs (`lark-oapi`, `discord.py`, `python-telegram-bot`, `line-bot-sdk`, `wechatpy`, `aiohttp`) added to dependencies. - Screenshots / per-channel credential docs to follow. <img width="1338" height="1290" alt="Image" src="https://github.com/user-attachments/assets/042cb2f9-0dad-4e6a-bcf7-43ced4bbd704" /> <img width="1344" height="738" alt="Image" src="https://github.com/user-attachments/assets/373cd08e-ec40-4c67-9c51-4d948b1ba617" /> <img width="672" height="887" alt="Image" src="https://github.com/user-attachments/assets/5a34953f-a9a3-4c1e-869e-5eff0dc64c84" /> ---------	2026-06-12 18:21:30 +08:00
Jonathan Chang	de06c9a60b	feat: Langfuse session grouping for multi-turn chat traces (#15679 ) ## Summary This PR passes `session_id` into Langfuse trace observations so multi-turn chat messages can be grouped under the same session in Langfuse. Changes include: - Propagate `session_id` from chat/session APIs into `dialog_service.async_chat`. - Pass `session_id` into Langfuse `start_observation(...)`. - Share Langfuse `trace_context` with chat, embedding, rerank, and TTS model bundles where applicable. - Add unit coverage to verify Langfuse observations receive `session_id`. - Update affected test stubs for the new optional Langfuse context arguments. ## Related Issue Closes: #15636 ## Change Type - [x] Feature - [x] Bug fix - [x] Test - [ ] Refactor - [ ] Documentation - [ ] Breaking change ## Real Behavior Proof Before this change: - Langfuse observations were created without `session_id`. - Multi-turn chat traces could not be grouped by session in Langfuse. After this change: - Chat/session flows pass `session_id` into `async_chat`. - Langfuse observations include `session_id`. - Related model bundles receive shared trace context and session metadata. Validation result: ```bash uv run python -m py_compile \ api/db/services/tenant_llm_service.py \ api/db/services/llm_service.py \ api/db/services/dialog_service.py \ api/db/services/conversation_service.py \ api/apps/restful_apis/chat_api.py \ test/unit_test/api/db/services/test_dialog_service_final_answer.py \ test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py ``` Passed. ```bash uv run pytest \ test/unit_test/api/db/services/test_dialog_service_final_answer.py \ test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py -q ``` Result: ```text 11 passed in 16.89s ``` ```bash git diff --check ``` Passed. ## Checklist - [x] Analyzed the issue requirement. - [x] Checked existing Langfuse trace integration. - [x] Implemented only the requested session grouping behavior. 
- [x] Added/updated unit tests. - [x] Ran focused tests successfully. - [x] Ran Python compile validation. - [x] Ran whitespace diff validation.	2026-06-12 10:18:06 +08:00
Lynn	9d5950963b	Fix: get is_tools from model record (#15946 ) ### What problem does this PR solve? As title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-11 17:29:28 +08:00
少卿	9614605bf9	fix: propagate max_tokens from model config to downstream consumers (#15945 ) ## Summary `get_model_config_from_provider_instance()` was not including `max_tokens` in its returned dict, causing all downstream consumers (dialog truncation, message fitting, knowledge base trimming, embedding, graphrag, RAPTOR) to fall back to the hardcoded default of 8192 tokens regardless of the actual model context window size (e.g., GPT-4o 128K, Claude 200K). Closes #15944 ## Root Cause The function builds `model_config` with only: `llm_factory`, `api_key`, `llm_name`, `api_base`, `model_type`, `is_tools`. `max_tokens` is never included. Yet the data exists in four independent sources: 1. `TenantModel.extra` JSON field — written by `provider_api_service.py:659` 2. `conf/llm_factories.json` — every model entry has `max_tokens` 3. `rag/llm/model_meta.py` — 9 provider classes fetch real context windows from APIs 4. `TenantLLM.max_tokens` database column None of them are read by this function. ## Fix Two lines added, one per return path: - Path B (model_obj exists → provider-instance model): reads `max_tokens` from `model_obj.extra` JSON - Path C (fallback → factory config): reads `max_tokens` from `llm_info` (sourced from `llm_factories.json`) Both fall back to 8192 when the value is absent, preserving backward compatibility. 
## Impact This single 5-line change fixes the context window budget for all 78+ call sites across 20 files that construct `LLMBundle` or read `max_tokens` from the config dict, including: \| Consumer \| File \| Effect \| \|---\|---\|---\| \| Dialog chat truncation \| `dialog_service.py:562` \| `message_fit_in(msg, max_tokens * 0.95)` now uses real context window \| \| Knowledge base trimming \| `dialog_service.py:752` \| `kb_prompt(kbinfos, max_tokens)` now fits more retrieved content \| \| Agent message fitting \| `agent/component/llm.py:322` \| Agent prompts no longer truncated at 7946 tokens \| \| Embedding truncation \| `task_executor.py:704` \| Embedding input uses actual model limit \| \| GraphRAG extraction \| `graphrag/*/extractor.py` \| Entity extraction gets full context budget \| \| LLM4Tenant.max_length \| `tenant_llm_service.py:513` \| Chat model wrapper exposes real context window \|	2026-06-11 17:24:58 +08:00
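The fallback behavior described in #15945 can be sketched as follows. This is a minimal illustration, not RAGFlow's code: it assumes the model's `extra` field is a JSON string, and the names `resolve_max_tokens` and `DEFAULT_MAX_TOKENS` are hypothetical.

```python
import json

DEFAULT_MAX_TOKENS = 8192  # fallback value named in the commit message


def resolve_max_tokens(extra_json, default=DEFAULT_MAX_TOKENS):
    """Return max_tokens from a JSON 'extra' blob, or the default if absent or invalid.

    Hypothetical helper: the real function reads model_obj.extra or llm_info.
    """
    try:
        extra = json.loads(extra_json) if extra_json else {}
    except (TypeError, ValueError):
        return default
    if not isinstance(extra, dict):
        return default
    value = extra.get("max_tokens")
    # Accept only positive integers; anything else falls back to the default.
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return default


configured = resolve_max_tokens('{"max_tokens": 128000}')
missing = resolve_max_tokens("{}")
```

With a fallback like this, a model whose record carries a real context window is no longer capped at the default, while records without the field behave as before.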
bohdansolovie	381091df71	fix(dialog): guard async_ask() against empty or invalid kb_ids (#15530 ) Fixes #15529 . ### Problem `async_ask()` accessed `kbs[0]` without verifying that `KnowledgebaseService.get_by_ids()` returned any knowledge bases. Empty or stale `kb_ids` raised `IndexError`, which surfaced as HTTP 500 on search/bot SSE endpoints. ### Fix - Add an early guard when `kbs` is empty, yielding a final SSE error event (consistent with `gen_mindmap()` in the same module). - Add regression tests for empty `kb_ids` and deleted/invalid KB IDs. ### Test plan - [ ] `pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -k "async_ask_empty or async_ask_stale"` - [ ] Manual: `POST /api/v1/searchbots/ask` with invalid `kb_ids` returns SSE error, not HTTP 500 --------- Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:52:59 +08:00
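The early guard described in #15530 can be sketched as a generator that emits one error event instead of indexing into an empty list. This is a simplified stand-in, not the actual `async_ask()`: the function name `ask_events`, the payload shape, and the error code are assumptions made for illustration.

```python
import json


def ask_events(kbs, question):
    """Yield SSE-style 'data:' lines; emit a single error event if no KB resolved."""
    if not kbs:
        # Hypothetical payload shape and error code, for illustration only.
        payload = {"code": 500, "message": "No valid knowledge base found.", "data": None}
        yield "data:" + json.dumps(payload) + "\n\n"
        return
    first_kb = kbs[0]  # safe: kbs is known to be non-empty here
    answer = {"code": 0, "data": {"answer": f"[{first_kb}] {question}"}}
    yield "data:" + json.dumps(answer) + "\n\n"


empty_events = list(ask_events([], "hello"))
ok_events = list(ask_events(["kb1"], "hello"))
```

The caller still receives a well-formed stream in the empty case, so the failure surfaces as an in-band error event rather than an unhandled `IndexError`.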
Rene Arredondo	b978e26208	fix(db): drop Peewee-auto-named unique index on tenant_model_instance (#15699 ) (#15879 ) ## Summary Fixes #15699. User upgrades to v0.25.6 against an existing MySQL database, tries to add an Ollama provider instance, and gets: ``` MySQL IntegrityError: Duplicate entry 'dbaafbfe608a11f1a5516d6066988224' for key 'tenant_model_instance.tenantmodelinstance_api_key_provider_id' ``` The route at [api/apps/restful_apis/provider_api.py:354](api/apps/restful_apis/provider_api.py#L354) catches it and returns `get_error_data_result(message="Internal server error")` — which by RAGFlow's convention is HTTP 200 with an error `code` on the body — hence the reporter's "200 status code but the database errored" complaint. ### Root cause The provider-instance refactor in [PR #15460](https://github.com/infiniflow/ragflow/pull/15460) dropped the unique-compound-index tuple from `TenantModelInstance`: ```python # Removed in #15460 class Meta: db_table = "tenant_model_instance" indexes = ( (("api_key", "provider_id"), True), # unique ) ``` and added a one-shot drop in `migrate_db()` for existing databases. But the drop targets the wrong index name: ```python # Before this PR — wrong name for table_name, index_name in [ ("tenant_model_instance", "idx_api_key_provider_id"), # ← doesn't exist ("tenant_model", "idx_provider_model_instance"), ]: ``` Peewee's auto-derived index name is `<lowercase classname>_<col1>_<col2>` → `tenantmodelinstance_api_key_provider_id`, which matches the user's error verbatim. The drop raises `OperationalError: 1091 (HY000): Can't DROP …`, the surrounding `except` clause at [db_models.py:1736](api/db/db_models.py#L1736) swallows it as expected-on-fresh-installs, and the legacy unique index lives on indefinitely. ### Why Ollama hits it specifically Ollama doesn't require an API key. The form posts `api_key: ""`. 
The app-layer dedupe at [provider_api_service.py:288-292](api/apps/services/provider_api_service.py#L288-L292): ```python api_key_str = "" if api_key: # ← skipped for "" ... same_key_instance = TenantModelInstanceService.get_by_provider_id_and_api_key(...) if same_key_instance: return False, f"Already exist instance: ... with api_key {api_key}" ``` falls through for empty keys. Control reaches `TenantModelInstanceService.create_instance(..., api_key="")` which inserts a row whose `(api_key, provider_id) = ("", <provider_uuid>)` collides with any prior Ollama row that already shipped that same pair → the still-present unique index throws. (`dbaafbfe608a11f1a5516d6066988224` in the user's error is the duplicated `provider_id` UUID, paired with the empty `api_key`.) ### Fix Add the Peewee auto-name alongside the existing `idx_` entry so the migration finally drops the obsolete index on next restart: ```python legacy_indexes = [ ("tenant_model_instance", "idx_api_key_provider_id"), ("tenant_model_instance", "tenantmodelinstance_api_key_provider_id"), # ← added ("tenant_model", "idx_provider_model_instance"), ] ``` The surrounding `try/except (OperationalError, ProgrammingError)` matches `1091` / `can't DROP` / `does not exist` and treats them as success, so every state is idempotent (see Test plan). 
### Idempotency matrix \| Database state \| First entry (`idx_api_key_provider_id`) \| New entry (`tenantmodelinstance_api_key_provider_id`) \| \| --- \| --- \| --- \| \| Fresh install (≥ #15460) — neither index exists \| `1091` → swallowed \| `1091` → swallowed \| \| Upgraded from before `dc4b82523` (the user's case) — auto-name present \| `1091` → swallowed \| drops the index* \| \| Upgraded after a manual rename to `idx_` \| drops the index \| `1091` → swallowed \| \| Re-run of `migrate_db()` after either of the above \| `1091` → swallowed \| `1091` → swallowed \| No rollback hazard: nothing depends on this unique constraint anymore (`create_instance` dedupes by `instance_name` via `duplicate_name`, see [tenant_model_instance_service.py:27](api/db/services/tenant_model_instance_service.py#L27)). ### What this PR does NOT change - `provider_api_service.create_provider_instance`* — its `if api_key:` gate is correct for the post-migration world: multiple Ollama instances with empty keys under one provider are legitimate, so we shouldn't tighten the app-layer check. - `TenantModelInstance` Peewee model — the `indexes` tuple was already removed in #15460. New databases never get the constraint in the first place. - The `except → get_error_data_result` → HTTP 200 pattern at `provider_api.py:354` — that's a project-wide convention; changing one route to HTTP 500 would be inconsistent and out of scope. ## Test plan - [ ] Reproducer (pre-fix): on a database originally created before #15460, configure an Ollama provider with an empty `api_key`, then try to create a second instance under the same provider — confirm the `Duplicate entry … 'tenantmodelinstance_api_key_provider_id'` error in the server log. - [ ] Verify the index is present pre-restart: `SHOW INDEX FROM tenant_model_instance WHERE Key_name = 'tenantmodelinstance_api_key_provider_id';` — non-empty result. 
- [ ] Restart with the fix applied: server starts cleanly, `migrate_db()` runs, no `Failed to drop index` in critical logs. - [ ] Verify the index is gone post-restart: same `SHOW INDEX` query — empty result. - [ ] Re-run the reproducer: two Ollama instances under the same provider, both `api_key=""`, both succeed. - [ ] Restart a second time — no new errors; the matching `1091` swallow keeps the migration idempotent. - [ ] Fresh install smoke test: drop the DB volume, start clean — no `1091` noise (the new index never existed), no functional regression. ## Files changed - [api/db/db_models.py](api/db/db_models.py) — extend the legacy-index drop list with `tenantmodelinstance_api_key_provider_id`; refactor the inline list to a named `legacy_indexes` local with a comment pointing at #15460 and #15699. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:47:12 +08:00
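The idempotent drop described in #15879 can be sketched with the standard-library `sqlite3` module standing in for MySQL. Everything here is an assumption for illustration: the helper names are hypothetical, SQLite's "no such index" error plays the role of MySQL error `1091`, and the naming rule is taken from the commit message rather than from Peewee's source.

```python
import sqlite3


def peewee_auto_index_name(class_name, columns):
    # Naming rule as stated in the commit message: <lowercase classname>_<col1>_<col2>
    return "_".join([class_name.lower(), *columns])


def drop_legacy_indexes(conn, index_names):
    """Drop each index if present; treat 'already gone' as success."""
    dropped = []
    for name in index_names:
        try:
            conn.execute(f'DROP INDEX "{name}"')
            dropped.append(name)
        except sqlite3.OperationalError as exc:
            # SQLite's analogue of MySQL 1091; any other error is re-raised.
            if "no such index" not in str(exc).lower():
                raise
    return dropped


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tenant_model_instance (api_key TEXT, provider_id TEXT)")
auto_name = peewee_auto_index_name("TenantModelInstance", ["api_key", "provider_id"])
conn.execute(
    f'CREATE UNIQUE INDEX "{auto_name}" ON tenant_model_instance (api_key, provider_id)'
)

legacy_indexes = ["idx_api_key_provider_id", auto_name]
first_run = drop_legacy_indexes(conn, legacy_indexes)   # drops only the auto-named index
second_run = drop_legacy_indexes(conn, legacy_indexes)  # nothing left to drop
```

Running the drop twice mirrors the "restart a second time" step of the test plan: the second pass finds nothing to drop and raises nothing.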
少卿	8e17a12990	fix: remove think text buffering for real-time reasoning stream (#15891 )	2026-06-10 16:55:57 +08:00
buua436	dcf623d60d	feat: support multi-type factory models (#15893 ) ### What problem does this PR solve? Support factory models with multiple model types, so visual chat models can be exposed as both image2text and chat while preserving the database model-type-per-record design. This also updates the SILICONFLOW model list and adds a helper script to refresh SiliconFlow models from the provider API. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-10 15:35:21 +08:00
Lynn	478c9846a1	Fix: model list (#15860 ) ### What problem does this PR solve? Remove tenant_llm call in rag. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-10 14:59:57 +08:00
Wang Qi	9aa81e7cad	Fix paddle ocr / minerU cannot add (#15858 )	2026-06-10 13:04:13 +08:00
buua436	c1496ffd43	fix: propagate memory tenant id in task collect (#15837 ) ### What problem does this PR solve? Propagate `tenant_id` from memory task messages into task collection so refactored task execution can build a valid context. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-09 17:47:48 +08:00
DearsisHS	cbb3896aaa	fix(api): guard missing row in SearchService.get_detail (#15622 ) ## Summary `SearchService.get_detail` crashed with `AttributeError` (HTTP 500) when no matching row existed, because it called `.first().to_dict()` before the `if not search` guard — making that guard dead code. ## Root cause `.first()` returns `None` when the query matches nothing (deleted search app, or joined `User` not `VALID`). `None.to_dict()` raises before the guard runs. ## Fix ```diff .first() - .to_dict() ) if not search: return {} - return search + return search.to_dict() ``` Guard the `None` first, then serialize — restoring the intended `{}` "not found" return that every caller (`search_api`, `bot_api`, `chat_api`, `dataset_api_service`) already handles. ## Files changed - `api/db/services/search_service.py` ## Verification - `ruff check` — clean - Logic: `.first()` → `None` now hits `return {}` instead of `None.to_dict()`. Local full pytest not run (heavy RAG deps); CI validates. ## Note Implemented with LLM assistance (model: claude-opus-4-8). Closes #15621 Co-authored-by: dearsishs <MCarter112116@outlook.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 23:01:28 +08:00
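The reordering in #15622 (check for `None` first, serialize second) can be shown without a database. In this sketch the `Row` class and the list-based lookup are hypothetical stand-ins for the Peewee query; only the guard ordering reflects the commit.

```python
class Row:
    """Hypothetical stand-in for a model instance that exposes to_dict()."""

    def __init__(self, **fields):
        self._fields = fields

    def to_dict(self):
        return dict(self._fields)


def get_detail(rows, search_id):
    # Equivalent of .first(): the matching row, or None when nothing matches.
    search = next((r for r in rows if r.to_dict().get("id") == search_id), None)
    if search is None:
        return {}  # "not found" contract that callers already handle
    return search.to_dict()  # serialize only after the guard has passed


rows = [Row(id="s1", name="demo")]
found = get_detail(rows, "s1")
not_found = get_detail(rows, "missing")
```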
buua436	c8c890b06c	fix: refine think stream parsing (#15745 ) ### What problem does this PR solve? Refine the stream parsing for `<think>` / `</think>` so MiniMax and DeepSeek-style chunking both flush in the right order without mixing think and answer buffers. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-08 16:53:22 +08:00
qinling0210	c960dc2a4c	Refine handling of POST /api/v1/datasets/search in GO (#15583 ) ### What problem does this PR solve? Refine handling of POST /api/v1/datasets/search in GO ### Type of change - [x] Refactoring	2026-06-08 11:49:37 +08:00
Wang Qi	aa9545e4c9	Revert "fix: duplicate document ingest guard" (#15707 ) Reverts infiniflow/ragflow#15638	2026-06-05 17:45:29 +08:00
buua436	71649db3b0	fix: prevent duplicated post-think text (#15651 ) ### What problem does this PR solve? This fixes duplicated post-think text in streamed chat responses. When the model emits text immediately after `</think>`, the stream state now advances its cursor correctly so the same visible prefix is not emitted twice. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-05 13:21:26 +08:00
buua436	423fb6faae	fix: duplicate document ingest guard (#15638 ) ### What problem does this PR solve? When a document is rerun or updated concurrently, the previous unconditional update could overwrite a newer task state. This change adds an `update_time`-based optimistic lock so the update only succeeds if the record has not been modified by another flow in the meantime. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-04 17:57:51 +08:00
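An `update_time`-based optimistic lock of the kind #15638 describes can be sketched with `sqlite3`. The table layout, column names other than `update_time`, and the helper `update_if_unchanged` are assumptions for illustration; note that the log above records this change as reverted by #15707.

```python
import sqlite3


def update_if_unchanged(conn, doc_id, expected_update_time, new_progress, new_update_time):
    """Apply the update only if update_time still matches what the caller last read."""
    cur = conn.execute(
        "UPDATE document SET progress = ?, update_time = ? "
        "WHERE id = ? AND update_time = ?",
        (new_progress, new_update_time, doc_id, expected_update_time),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows means another writer got there first


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, progress REAL, update_time INTEGER)")
conn.execute("INSERT INTO document VALUES ('doc1', 0.0, 100)")

fresh_write = update_if_unchanged(conn, "doc1", 100, 0.5, 101)  # version matches
stale_write = update_if_unchanged(conn, "doc1", 100, 0.9, 102)  # version is stale
```

The stale writer learns from the return value that its update was skipped and can decide whether to re-read and retry.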
buua436	bbacb31226	Fix: think stream tail handling (#15582 ) ### What problem does this PR solve? think stream tail handling ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-04 10:04:35 +08:00
euvre	9a9d3ddf5f	fix: show default embedding model when provider is not yet registered (#15511 ) ### What problem does this PR solve? ### Problem On the Model Providers page, the Embedding Model dropdown in System Model Settings shows empty (no default selected), even though a default embedding model is configured in `service_conf.yaml`. ### Root Cause Two issues were identified: 1. Backend: `_get_model_info` fails for unregistered providers The tenant's `embd_id` is set to `bge-m3@xxxx` during initialization (from the placeholder config `factory: 'xxxx'`). The `_get_model_info` function requires the provider to exist in `tenant_model_provider` table, but `xxxx` is never a real provider. Even after the user adds a real provider (e.g., ZHIPU-AI), the stale `embd_id` still references the non-existent one, causing the function to return `None`. 2. Frontend: default models cache not invalidated after adding provider `useAddProviderInstance` only invalidates `addedProviders` and `allModels` caches after adding a provider instance, but does not invalidate the `defaultModels` cache. This means the default model list is not re-fetched until the user manually refreshes the page. ### Fix `api/apps/services/models_api_service.py` - Added `_resolve_model_from_tenant_providers()` helper: when the default model's provider doesn't exist (e.g., placeholder `xxxx`), it searches through the tenant's actually registered providers for a model of the same type and returns the first match. - When an instance name doesn't match (e.g., `"default"` vs actual name `"1"`), the function now auto-resolves to the first real instance under that provider. - Falls back to `FACTORY_LLM_INFOS` validation when neither provider nor instance exists. 
`web/src/hooks/use-llm-request.tsx` - Added `queryClient.invalidateQueries({ queryKey: LlmKeys.defaultModels() })` to `useAddProviderInstance` so that the default model list is re-fetched immediately after a provider instance is added, eliminating the need for a manual page refresh. ### Testing - Verified with a tenant whose `embd_id=bge-m3@xxxx` and only provider is ZHIPU-AI (instance `1`): `_resolve_model_from_tenant_providers` correctly resolves to `embedding-2@1@ZHIPU-AI`. - After adding a provider via the UI, the embedding model dropdown now immediately shows the resolved default without requiring a page refresh. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Signed-off-by: noob <yixiao121314@outlook.com>	2026-06-04 09:55:49 +08:00
Lynn	36357a6afd	Fix: model provider (#15517 ) ### What problem does this PR solve? Fix: - Handle siliconflow and siliconflow_intl api_key ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-02 19:04:20 +08:00
Lynn	3bc5ed282e	Fix: model-provider bugs (#15460 ) ### What problem does this PR solve? Fix: - Use @ to avoid splitting on `_` in model_name. - Verify api_key when adding an instance. - Pop api_key from the list-instances response. - Remove useless index. - Sort providers, instances and models by name. - Get `is_tools` from llm_factories.json ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-02 13:24:53 +08:00
Lynn	dc4b82523b	Feat: tenant llm provider (#14595 ) ### What problem does this PR solve? Python implementation of the Go-based model_provider API suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: bill <yibie_jingnian@163.com>	2026-05-29 17:39:41 +08:00
Jack	f0cb7a544b	Refactor: Task Executor (#15154 ) ### What problem does this PR solve? 1. Break the huge function into smaller pieces 2. Add unit tests for the smaller functions 3. Layered design a. infra layer - task_context.py, recording_context.py, write_operation_interceptor.py, ... b. service layer - *_service.py c. business layer - task_handler.py 4. Default behavior: use the refactored version; the original version can be selected by changing an environment variable ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring - [x] Performance Improvement --------- Co-authored-by: Liu An <asiro@qq.com> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-05-27 21:54:17 +08:00
Ahmad Intisar	e6068a7f7e	Fix: table parser metadata (#15127 ) ### What problem does this PR solve? This PR improves the table upload flow for CSV/Excel files by allowing table column role configuration at upload time. Previously, users had to: 1. Upload and parse a table file. 2. Open parser settings and manually set table column roles. 3. Re-parse the file for the roles to take effect. This was inefficient and required an unnecessary second parse. With this change: 1. When the knowledge base uses table parsing, the upload dialog extracts CSV/Excel headers client-side. 2. Users can choose Auto mode or Manual mode. 3. In Manual mode, users can assign per-column roles before upload. 4. The selected parser config is sent with the upload request and applied server-side during document creation. Result: configured table column roles are applied from the first parse. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>	2026-05-25 16:05:38 +08:00
Wang Qi	7e6844118b	Fix search vector_similarity_weight (#15108 ) ### What problem does this PR solve? Fix search vector_similarity_weight ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-22 16:05:13 +08:00
kpdev	faf77a5a8a	feat(evaluation): track token usage in evaluation results (#13487 ) ## Summary Implements the TODO in `evaluation_service.py`: Track token usage in evaluation results. ## Changes - Import `num_tokens_from_string` from `common.token_utils` - Prompt tokens: Use the full prompt returned by `async_chat` when available (includes system prompt + knowledge base + query), otherwise fall back to the question token count - Completion tokens: Count tokens in the generated answer - Storage: Store `token_usage` as `{prompt_tokens, completion_tokens, total_tokens}` in each `EvaluationResult` instead of `None` ## Why The evaluation pipeline previously saved `token_usage: None` for every result. This change allows downstream consumers (e.g. evaluation dashboards, cost tracking) to see approximate token usage per test case using the same tokenizer (tiktoken cl100k_base) used elsewhere in RAGFlow. ## Testing - No new tests added; existing evaluation flow unchanged - Token counting uses existing `num_tokens_from_string` utility --------- Co-authored-by: kiannidev <kiannidev@users.noreply.github.com>	2026-05-22 15:19:53 +08:00
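The token accounting described in #13487 can be sketched as below. The function `build_token_usage` is hypothetical, and the tokenizer is passed in as a parameter so the example runs without `tiktoken`; the commit itself says the real code uses `num_tokens_from_string`.

```python
def build_token_usage(prompt, question, answer, count_tokens):
    """Assemble the token_usage dict described in the commit message.

    Uses the full prompt when available, otherwise falls back to the question.
    """
    prompt_tokens = count_tokens(prompt if prompt else question)
    completion_tokens = count_tokens(answer)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


def whitespace_tokens(text):
    # Toy tokenizer for the example only; not the tokenizer RAGFlow uses.
    return len(text.split())


with_prompt = build_token_usage("system prompt plus query", "query", "short answer", whitespace_tokens)
without_prompt = build_token_usage(None, "what is ragflow", "short answer", whitespace_tokens)
```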
Full Stack Developer	8f90740d2e	feat: pass chat_template_kwargs through agent chat completion (#14542 ) ### What problem does this PR solve? The agent API currently does not pass chat_template_kwargs to the underlying LLM call path, so clients cannot control template-level model behavior (such as thinking-mode toggles) when invoking /agents/chat/completion. This PR adds passthrough support for chat_template_kwargs across agent execution flows (session and non-session, streaming and non-streaming) by propagating it through canvas runtime state and into LLM invocation kwargs. This addresses the feature gap raised in [Issue #14182](https://github.com/infiniflow/ragflow/issues/14182). Closes #14182 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-22 15:15:49 +08:00
dale053	c33d0b8081	fix: prevent sensitive fields from leaking in user API responses (#14792 ) Closes #14789 ### What problem does this PR solve? User API endpoints (`login`, `user_profile`, `user_add`, `forget_reset_password`) were returning full user objects via `to_json()` / `to_dict()`, which included sensitive fields like `password` and `access_token` in the response body. This leaks credentials to the client. This PR adds a `to_safe_dict()` method on the `User` model that strips sensitive fields (`password`, `access_token`) and replaces all affected call sites to use it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-22 15:14:26 +08:00
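A `to_safe_dict()` method along the lines of #14792 might look like the sketch below. The `User` class here is a plain-Python stand-in rather than the Peewee model, and `SENSITIVE_FIELDS` is a hypothetical constant; only the two stripped field names come from the commit message.

```python
SENSITIVE_FIELDS = ("password", "access_token")  # fields named in the commit message


class User:
    """Hypothetical stand-in for the User model."""

    def __init__(self, **fields):
        self._fields = fields

    def to_dict(self):
        return dict(self._fields)

    def to_safe_dict(self):
        # Copy first, then strip, so the underlying record is left untouched.
        data = self.to_dict()
        for key in SENSITIVE_FIELDS:
            data.pop(key, None)
        return data


user = User(id="u1", email="a@example.com", password="hash", access_token="tok")
safe = user.to_safe_dict()
```

Putting the filter on the model keeps each endpoint from having to remember its own list of fields to remove.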
kingloon	da4eaf9fb0	Fix: remove duplicate function definitions (#15063 ) ### What problem does this PR solve? Remove duplicate function definitions in `api/db/services/dialog_service.py`. Problem: Two helper functions were defined twice in the same file, but with different parameter orders: - First definition (line 57): `_resolve_reference_metadata(request_payload=None, config=None)` - Second definition (line 136): `_resolve_reference_metadata(config, request_payload=None)` Solution: Keep the second definition (which is actually used by other modules) and remove the first one to avoid confusion. Additionally, remove duplicate `_enrich_chunks_with_document_metadata` definition (keep line 140 version). <img width="1584" height="313" alt="image" src="https://github.com/user-attachments/assets/7daee832-244f-4bb2-8488-e3b65012a3f9" /> <img width="1672" height="359" alt="image" src="https://github.com/user-attachments/assets/4fd2f523-273c-4b20-a7c9-ab35740b7834" /> ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-05-21 15:31:51 +08:00
bitloi	6499bce2a6	fix: Langfuse chat observation (#15026 ) ### What problem does this PR solve? Closes #15025 Langfuse-enabled `dialog_service.async_chat()` regressed to `langfuse_tracer.start_generation(...)` after the earlier Langfuse v4 migration. Langfuse v4 uses `start_observation(as_type="generation")`, so the remaining `start_generation` call can fail when chat tracing is enabled. This restores the migrated `start_observation(as_type="generation")` call for chat observations while preserving the existing trace context, model, input payload, and update/end flow. It also adds a regression test with a fake Langfuse v4-style client that exposes `start_observation()` but not `start_generation()`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Tests - `.venv/bin/pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -q` - `.venv/bin/ruff check api/db/services/dialog_service.py test/unit_test/api/db/services/test_dialog_service_final_answer.py`	2026-05-20 15:01:19 +08:00
plind	f169ab4b39	feat(tts): cache synthesized speech in Redis to avoid redundant calls (#14851 ) ## What problem does this PR solve? Closes #12017. TTS output is deterministic for a given `(model, text)` pair, so re-running the same text through the same TTS model produces the same bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on every request. That's slow and wastes provider quota whenever the same assistant response is replayed, shared across users, or repeated within a session. ### Change New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl, cleaned_text)`: - Key: `tts:cache:{model_id}:{sha256(text)}` — separate namespace per model, identical cleaned text reuses a single entry across both call sites. - Value: the hex-encoded audio blob both call sites already returned. No format change for downstream consumers. - TTL: 7 days by default, configurable via `RAGFLOW_TTS_CACHE_TTL_SECONDS`. - Failure modes: a Redis hiccup falls back to direct synthesis; a failed synthesis still returns `None` (existing contract preserved). [`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724) and [`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380) now route through the helper; the per-file bytes-accumulation/hex-encode loop has been removed in favor of one shared implementation. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Test plan - [ ] Cache hit, chat path: Configure a dialog with TTS enabled, ask the same question twice with `stream=false`. Verify the second response returns the same `audio_binary` and that the second invocation doesn't hit the TTS provider (e.g., observe provider-side logs / usage counters; check no `LLMBundle.tts can't update token usage` log line on the second run). 
- [ ] Cache hit, agent path: Same exercise via a Conversational Agent that includes a Message component playing back the answer. - [ ] Cache isolation per model: Switch tenant's `tts_id` between two models, run the same text against each — confirm the second model's first synthesis still happens (no cross-model hits). - [ ] TTL override: Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm the entry expires after 2 minutes. - [ ] Redis unavailable: Stop Redis (or break the connection). Verify the TTS endpoint still works — synthesis falls back to direct calls, with a `TTS cache lookup failed` / `TTS cache store failed` warning logged. - [ ] Failure path: Configure a TTS model with an invalid API key, ensure the response still returns successfully with `audio_binary=None` (no regression vs. current behavior).	2026-05-19 14:20:40 +08:00
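The caching scheme in #14851 can be sketched with a plain dict in place of Redis. This is an illustration under stated assumptions: `cached_synthesize` is a hypothetical name with a different signature from the real `synthesize_with_cache(tts_mdl, cleaned_text)`, the dict has no TTL, and only the key format follows the commit message.

```python
import hashlib


def tts_cache_key(model_id, cleaned_text):
    # Key format from the commit message: tts:cache:{model_id}:{sha256(text)}
    digest = hashlib.sha256(cleaned_text.encode("utf-8")).hexdigest()
    return f"tts:cache:{model_id}:{digest}"


def cached_synthesize(cache, model_id, cleaned_text, synthesize):
    """Return cached audio if present; otherwise synthesize, store, and return it."""
    key = tts_cache_key(model_id, cleaned_text)
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # a cache failure falls back to direct synthesis
    if cached is not None:
        return cached
    audio_hex = synthesize(cleaned_text)
    if audio_hex is None:
        return None  # failed synthesis is not cached
    try:
        cache[key] = audio_hex
    except Exception:
        pass  # a failed store must not break the response
    return audio_hex


calls = []


def fake_tts(text):
    calls.append(text)
    return "00ff"


cache = {}
first = cached_synthesize(cache, "model-a", "hello", fake_tts)
second = cached_synthesize(cache, "model-a", "hello", fake_tts)  # served from cache
other_model = cached_synthesize(cache, "model-b", "hello", fake_tts)  # separate namespace
```

Because the model identifier is part of the key, the same text under a different model is a miss, which matches the "cache isolation per model" item in the test plan.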
kingloon	525a87be0f	Misc: fix some typos (#14987) ### What problem does this PR solve? Fix minor code quality issues: 1. Fix a typo in an assertion error message: "Can't fine" → "Can't find" 2. Remove a duplicate line in `common/connection_utils.py` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-19 10:47:06 +08:00
jony376	198f3c4b9a	Fix: validate memory tenant model IDs on update and enforce tenant scope in memory pipeline (#14923 ) ### Related issues Closes #14922 ### What problem does this PR solve? `POST /memories` already resolves `tenant_llm_id` and `tenant_embd_id` through `ensure_tenant_model_id_for_params`, but `PUT /memories/<memory_id>` accepted client-supplied `tenant_llm_id` / `tenant_embd_id` without checking that those `tenant_llm` rows belong to the memory owner’s tenant. A caller could persist another tenant’s row IDs and later trigger extraction or embedding that loaded foreign model credentials via `get_model_config_by_id(tenant_model_id)` with no tenant allow-list. This change aligns the update path with create: updates that change models must go through `llm_id` / `embd_id` and `ensure_tenant_model_id_for_params` scoped to the memory’s `tenant_id` (not only the current user, so team-access cases stay correct). Direct `tenant_*` fields in the body without `llm_id` / `embd_id` are rejected. As defense in depth, `memory_message_service` passes `allowed_tenant_ids` / `requester_tenant_id` into `get_model_config_by_id` for LLM and embedding resolution so mismatched IDs cannot be used even if bad data existed. A regression test rejects payloads that set only `tenant_llm_id` / `tenant_embd_id`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: jony376 <jony376@gmail.com>	2026-05-19 10:11:46 +08:00
Magicbook1108	b69a6a5d80	Feat: full optimization on connector dashboard (#14979 ) ### What problem does this PR solve? This PR improves the connector dashboard task management experience and adds better visibility into connector execution logs. ### Overview: #### Before <img width="700" alt="image" src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052" /> #### Now: <img width="700" alt="Screenshot from 2026-05-18 16-31-30" src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627" /> ### 1. Add a new logging page to the connector dashboard A new logging page has been added so users can view connector task execution logs directly from the connector dashboard. ### 2. Merge the Resume button into Confirm The separate Resume button has been removed. The Confirm button now represents different actions depending on the current task state: - Save: Save form changes and reschedule tasks. - Stop: Cancel currently scheduled or running tasks. - Resume: Create new scheduled tasks after the previous tasks have been stopped. - Start: Start tasks when no task has been started yet. ### 3. Separate syncing and pruning tasks Connector tasks are now separated into syncing and pruning. Pruning is controlled by the Sync deleted files option: - When Sync deleted files is disabled, only syncing tasks are shown. - When Sync deleted files is enabled, both syncing and pruning tasks are shown. Now: Sync deleted files disabled <img width="700" alt="Sync deleted files disabled" src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d" /> Now: Sync deleted files enabled <img width="700" alt="Sync deleted files enabled" src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296" /> ### 4. Update logs in backend <img width="700" alt="image" src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2" /> ### 5. 
Remove connector resume API - Removed: `POST /v1/connectors/<connector_id>/resume` - Replaced by: `PATCH /v1/connectors/<connector_id>` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-19 10:07:11 +08:00
Jake Armstrong	93d3deb5e4	Fix admin CLI system variable commands (#14956 ) ## What Fixes #12409. Implements admin CLI support for: - `list vars;` - `show var <name-or-prefix>;` - `set var <name> <value>;` ## Changes - Wire Go CLI variable commands to the admin API. - Support integer and quoted string values in `SET VAR`. - Return variable rows as `data_type`, `name`, `setting_type`, and `value`. - Add exact-name lookup with prefix fallback for `SHOW VAR`. - Validate values by stored data type: `string`, `integer`, `bool`, and `json`. - Keep the legacy Python admin CLI/server behavior aligned. - Update admin CLI docs and add focused tests. ## Verification - `go test -count=1 ./internal/cli` - `python3.12 -m py_compile admin/server/services.py admin/server/routes.py api/db/services/system_settings_service.py admin/client/parser.py admin/client/ragflow_client.py` - Python admin CLI parser smoke test for `SET VAR`, quoted values, `SHOW VAR`, and `LIST VARS`. - Attempted `./run_go_tests.sh`; local environment is missing native tokenizer/linker artifacts: - `internal/cpp/cmake-build-release/librag_tokenizer_c_api.a` - `-lstdc++` Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-18 19:08:45 +08:00
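Validation by stored data type, as listed in #14956, might look roughly like the sketch below. The function name `coerce_value` and the accepted boolean spellings are assumptions for illustration; the commit message only names the four types `string`, `integer`, `bool`, and `json`.

```python
import json

# Assumed spellings; the real CLI may accept a different set.
_TRUE = {"true", "1", "yes", "on"}
_FALSE = {"false", "0", "no", "off"}


def coerce_value(data_type: str, raw: str):
    """Convert a raw string to the variable's stored data type or raise ValueError."""
    if data_type == "string":
        return raw
    if data_type == "integer":
        return int(raw)  # raises ValueError on non-numeric input
    if data_type == "bool":
        lowered = raw.strip().lower()
        if lowered in _TRUE:
            return True
        if lowered in _FALSE:
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    if data_type == "json":
        return json.loads(raw)  # json.JSONDecodeError subclasses ValueError
    raise ValueError(f"unknown data type: {data_type!r}")
```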
Hamza Amin Khokhar	2dbe3b8a62	fix: metadata_condition returning all docs when filter matches nothing (#14967) ### What problem does this PR solve? When `_parse_doc_id_filter_with_metadata` returns `[]`, the empty list is falsy, so the `WHERE id IN (...)` clause was silently skipped and the full dataset was returned instead of an empty result. This PR changes `if doc_ids:` to `if doc_ids is not None:` in both `get_list()` and `get_by_kb_id()` to distinguish between no filter (`None`) and a filter that matched zero documents (`[]`). Fixes #14962 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 18:54:30 +08:00
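The difference between the two checks in #14967 is easy to reproduce in isolation. The helpers below are hypothetical and filter an in-memory list rather than building a SQL query, but the truthiness pitfall is the same.

```python
def filter_docs_buggy(docs, doc_ids=None):
    # An empty list is falsy, so a filter that matched nothing is ignored.
    if doc_ids:
        docs = [d for d in docs if d["id"] in set(doc_ids)]
    return docs


def filter_docs_fixed(docs, doc_ids=None):
    # None means "no filter"; [] means "filter matched zero documents".
    if doc_ids is not None:
        allowed = set(doc_ids)
        docs = [d for d in docs if d["id"] in allowed]
    return docs
```

With `doc_ids=[]` the buggy variant returns every document, while the fixed variant returns an empty list.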
dev	b12eaee38b	fix(api): enforce tenant access for connector routes (#14747 ) ### What problem does this PR solve? Fixes #14746. Adds tenant access checks for connector-by-id REST routes before reading connector details, mutating connector config/status, deleting connectors, rebuilding, or listing sync logs. Unauthorized callers now receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without reaching the connector/log mutation paths. Validation: - `python3 -m pytest --confcutdir=test/testcases/test_web_api/test_connector_app test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` - `uvx ruff check api/apps/restful_apis/connector_api.py api/db/services/connector_service.py test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>	2026-05-18 16:09:26 +08:00
Wang Qi	56d73d0c2c	Refactor: speed up ragflow server, save startup memory (#14973) ### What problem does this PR solve? Speeds up RAGFlow server startup and reduces startup memory: about 200 MiB saved and startup 5-9 seconds faster. ##### Before 1241292 \| \| \_ python3 api/ragflow_server.py RAGFlow server is ready after 25.61845850944519s initialization. ##### After 1019968 \| \| \_ python3 api/ragflow_server.py RAGFlow server is ready after 16.205134391784668s initialization. ### Type of change - [x] Refactoring	2026-05-18 15:55:59 +08:00
qinling0210	f1d2383572	Push metadata filters down to Infinity (#14974 ) ### What problem does this PR solve? Push metadata filters down to Infinity ### Type of change - [x] Refactoring	2026-05-18 14:22:04 +08:00
Kevin Hu	7cdc74bbe5	Refactor: Drop the vector fetch for ES (#14970 ) ## Summary - Stop pulling chunk vectors (`q__vec`) back from Elasticsearch in the main retrieval path. ES already knows them; shipping them was pure bandwidth/memory overhead. - Recover the per-chunk cosine similarity via a second KNN-only ES call filtered by the candidate chunk ids. The new `_score` is merged with locally computed term similarity using the user-configured `vector_similarity_weight`. - Lazily fetch the chunk embedding only for the chunks `insert_citations` actually needs. ## Details `rag/nlp/search.py`* - `Dealer.search`: no longer appends `q__vec` to the ES select list. OceanBase still gets it (its rerank path is unchanged). - New `Dealer._knn_scores(sres, idx_names, kb_ids)`: a `MatchDenseExpr` over the cached query vector filtered by `id IN sres.ids`, returning `{chunk_id: cosine_score}` via ES `_score`. - New `Dealer.rerank_with_knn(...)`: term similarity from `qryr.token_similarity` plus the ES-supplied KNN score, combined with `tkweight`/`vtweight` and the existing rank-feature bonus. - New `Dealer.fetch_chunk_vectors(chunk_ids, tenant_ids, kb_ids, dim)`: on-demand vector fetch for citation use. - `Dealer.retrieval` routes Infinity → unchanged, OceanBase → existing local `rerank`, ES → new KNN-score path. `common/doc_store/es_conn_base.py`* - New `get_scores(res)` helper returning `{_id: _score}` directly from hit headers (ES doesn't surface `_score` through `get_fields`). `api/db/services/dialog_service.py` - New top-level `_hydrate_chunk_vectors(...)` helper. On ES it back-fills `ck["vector"]` from `fetch_chunk_vectors` right before `insert_citations`. No-op on Infinity / OB (their chunks already carry vectors). - Both `decorate_answer` closures became `async` and are `await`-ed at all call sites in `async_chat` and `async_ask`. 
## Backend behavior

| Backend | Returns chunk vec in main search | Sim source | Vectors for citations |
|---|---|---|---|
| ES | No | second KNN call (`_score`) merged with term sim | fetched on demand |
| Infinity | No (unchanged) | normalized `_score` | already on chunks |
| OceanBase | Yes (kept) | local hybrid rerank | already on chunks |

## Test plan	2026-05-18 14:21:56 +08:00
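The score merge described for the ES path in #14970 can be illustrated with a small sketch. The exact formula in `Dealer.rerank_with_knn` is not given in the commit message; the version below assumes a plain weighted sum in which `vtweight` equals `vector_similarity_weight`, `tkweight` is its complement, and the rank-feature bonus is added on top. The function name `combine_scores` is hypothetical.

```python
def combine_scores(term_sim, knn_scores, vector_similarity_weight=0.3, rank_bonus=None):
    """Merge per-chunk term similarity with KNN scores returned by the store.

    term_sim and knn_scores are dicts keyed by chunk id. Chunks missing from
    knn_scores are treated as having a vector score of 0.0 (an assumption).
    """
    vtweight = vector_similarity_weight
    tkweight = 1.0 - vtweight
    rank_bonus = rank_bonus or {}
    return {
        chunk_id: tkweight * tk
        + vtweight * knn_scores.get(chunk_id, 0.0)
        + rank_bonus.get(chunk_id, 0.0)
        for chunk_id, tk in term_sim.items()
    }
```

The point of the design is that only scalar scores cross the wire in the main retrieval path; the full vectors are fetched later and only for the chunks that citation insertion needs.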
Rene Arredondo	9f2fb4611f	Fix: guard empty/whitespace embedding inputs in LLMBundle (#14428 ) (#14924 ) Closes #14428 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-18 14:11:54 +08:00
wdeveloper16	14c0985182	feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767 ) Closes #14753 ## What changed \| File \| Change \| \|---\|---\| \| `pyproject.toml` \| `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` \| \| `Dockerfile` \| `uv python install 3.13`, `uv sync --python 3.13` \| \| `.github/workflows/tests.yml` \| `uv sync --python 3.13` on both matrix legs \| \| `CLAUDE.md` \| dev setup command + requirements note updated \| \| `deepdoc/parser/mineru_parser.py` \| `from strenum import StrEnum` → `from enum import StrEnum` \| \| `agent/tools/code_exec.py` \| same \| `StrEnum` has been in the stdlib since Python 3.11 — the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - "graspologic hard-blocks 3.13" — the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - "free-threading throughput gains for I/O-bound workload" — Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.	2026-05-15 14:40:53 +08:00
dale053	bd99a22661	fix: atomic chunk/token counter updates for documents and knowledge b… (#14867 ) ### What problem does this PR solve? Fixes #14866. Previously, `DocumentService.increment_chunk_num` and `decrement_chunk_num` updated the `Document` row and its parent `Knowledgebase` row in two separate, non-transactional statements. If the second update failed (DB error, connection drop, etc.) after the first one succeeded, the document and knowledge base chunk/token counters would drift apart and stay inconsistent. There was also a behavioral asymmetry between the two methods: - `increment_chunk_num` only logged a warning when the document row was missing and returned a value that callers usually treated as success. - `decrement_chunk_num` raised `LookupError` in the same situation. This PR makes the counter updates atomic and aligns the missing-document behavior between the two methods: - Wrap the `Document` and `Knowledgebase` updates in `increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()` block so both succeed or both roll back together. - Raise `LookupError` from `increment_chunk_num` when the target document no longer exists, matching `decrement_chunk_num`. - Update `reset_document_for_reparse` in `document_api_service.py` to catch the new `LookupError` and return a proper "Document not found!" API error instead of propagating the exception. No schema changes, no API contract changes for the success path; only the failure mode for a missing document during reparse is now a clean error response instead of an uncaught exception. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 14:48:52 +08:00
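The all-or-nothing behavior introduced in #14867 can be demonstrated without Peewee. The sketch below swaps in the standard-library `sqlite3` module, whose connection context manager commits on success and rolls back on an exception; the table layout and function signature are simplified assumptions, not the real `DocumentService` code.

```python
import sqlite3


def increment_chunk_num(conn, doc_id, kb_id, chunk_num):
    """Update document and knowledge base counters in one transaction."""
    with conn:  # commit if the block succeeds, roll back if it raises
        cur = conn.execute(
            "UPDATE document SET chunk_num = chunk_num + ? WHERE id = ?",
            (chunk_num, doc_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"document {doc_id} not found")
        cur = conn.execute(
            "UPDATE knowledgebase SET chunk_num = chunk_num + ? WHERE id = ?",
            (chunk_num, kb_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"knowledge base {kb_id} not found")
```

If the second update fails, the first one is rolled back, so the two counters cannot drift apart.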
Ethan T.	ba8cb9dd4a	fix: replace mutable default arguments with None in LLM chat models (#13513 ) ## Summary - Replace `gen_conf={}` with `gen_conf=None` + guard in `rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat, LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat classes) - Replace `doc_ids=[]` with `doc_ids=None` + guard in `api/db/services/document_service.py` (1 instance) - Mutable default arguments are shared across all calls, causing potential cross-request state contamination - See Python docs: https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects ## Test plan - [x] Verify LLM calls work with and without explicit gen_conf - [x] No behavior change for existing callers — `None` is replaced with `{}` at function entry 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 14:46:47 +08:00
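The pitfall fixed in #13513 is standard Python behavior: a default value is evaluated once, when the function is defined, and then shared by every call. The two functions below are hypothetical and only mimic the `gen_conf` pattern.

```python
def chat_buggy(prompt, gen_conf={}):
    # The same dict object is reused on every call that omits gen_conf.
    gen_conf.setdefault("seen_prompts", []).append(prompt)
    return gen_conf


def chat_fixed(prompt, gen_conf=None):
    # A fresh dict is created per call when the caller passes nothing.
    if gen_conf is None:
        gen_conf = {}
    gen_conf.setdefault("seen_prompts", []).append(prompt)
    return gen_conf
```

After two calls without an explicit `gen_conf`, the buggy variant has accumulated both prompts in its shared default, while the fixed variant returns a dict containing only the latest prompt.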
Br1an	d46bbd30f7	Fix: send input and output token usage to Langfuse (#13294 ) ### What problem does this PR solve? Closes #9837 The Langfuse integration currently only sends the output text to `langfuse_generation.update()` without including token usage information. This means Langfuse cannot track input/output token consumption for cost analysis and monitoring. ### Solution Add the `usage` parameter to `langfuse_generation.update()` with: - `input`: approximate input token count from `message_fit_in()` - `output`: approximate output token count from `num_tokens_from_string(answer)` - `total`: sum of input and output ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-05-14 13:11:37 +08:00
plind	dd76653dc1	feat: add tag management for Agents with filtering and sorting (#14774 ) (#14799 ) ## Summary Closes #14774. Adds free-form tags on agents (UserCanvas) with full UI + API: - Stored as comma-separated `tags` column on `UserCanvas` with online migration. - New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT /v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query. - "Edit tags" item in agent dropdown opens a chip-style editor dialog; tags render as badges on each agent card. - New "Tags" facet in the agents filter bar, with counts. ## Implementation notes - Tag matching is exact-token: the SQL filter wraps stored tags as `,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`. - Server-side normalization in `UserCanvasService.update_tags`: dedup (case-insensitive), per-tag cap of 64 chars, total length capped at 512 chars to fit the column, commas inside tag values are replaced with spaces. - Tenant authorization: `PUT /v1/agent/<id>/tags` gates on `UserCanvasService.accessible(canvas_id, tenant_id)`. - Tag listing scope: `UserCanvasService.list_tags` follows the same own + team-shared rule as `get_by_tenant_ids`. - i18n: keys added to `en.ts` and `zh.ts` only (per project convention; other locales fall back). - `HomeCard` gets a non-breaking `extra?: ReactNode` slot for the chip row; no `src/components/ui/` files modified. ## Test plan - [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column exists (`DESCRIBE user_canvas`). - [ ] Agents page renders cards normally (no console error from missing field). - [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog was unmounting with the dropdown). - [ ] Typing a tag without pressing Enter and clicking Save persists it (regression: last typed tag was being dropped). - [ ] Chip input supports Enter/comma to commit, Backspace on empty to remove, `×` to remove individual chip. 
- [ ] Tag containing a comma sent via API is stored with the comma replaced by a space. - [ ] 20 long tags sent via API does not error (length cap silently truncates). - [ ] "Tags" filter in the filter bar shows counts and narrows the list. - [ ] Filtering by `ml` does not return agents tagged `ml-ops`. - [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc. - [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or no permission.`	2026-05-13 21:41:32 +08:00
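The tag rules listed in #14799 can be sketched in a few lines. This is an illustration of the described behavior, not `UserCanvasService.update_tags` itself: the function names are hypothetical, and the truncation policy (stop adding tags once the joined string would exceed 512 characters) is an assumption, since the commit message only says the total length is capped.

```python
MAX_TAG_LEN = 64
MAX_TOTAL_LEN = 512


def normalize_tags(tags):
    """Return a comma-separated string of cleaned, de-duplicated tags."""
    seen = set()
    kept = []
    total = 0
    for raw in tags:
        # Commas would break the storage format, so they become spaces.
        tag = raw.replace(",", " ").strip()[:MAX_TAG_LEN].strip()
        if not tag or tag.lower() in seen:
            continue  # skip empties and case-insensitive duplicates
        added = len(tag) + (1 if kept else 0)  # +1 for the separator
        if total + added > MAX_TOTAL_LEN:
            break
        seen.add(tag.lower())
        kept.append(tag)
        total += added
    return ",".join(kept)


def has_tag(stored, tag):
    """Exact-token match: wrap both sides in commas so 'ml' != 'ml-ops'."""
    return f",{tag}," in f",{stored},"
```

Wrapping the stored value as `,…,` is what lets a simple substring test behave like a token match, which is the same idea the SQL filter in the commit message relies on.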
jony376	7f699d1202	Fix: enforce tenant authorization for `tenant_rerank_id` in retrieval flows (#14782 ) ### Related issues Closes #14781 ### What problem does this PR solve? Some retrieval endpoints accepted caller-supplied `tenant_rerank_id` and resolved it through `get_model_config_by_id(...)`. That helper loaded `TenantLLM` rows by global database id and returned decoded model configuration without checking whether the model belonged to the authenticated tenant or the dataset owner tenant. This meant dataset access was validated, but rerank-model selection was not. A caller who knew or could guess another tenant's `tenant_rerank_id` could attempt retrieval with a foreign rerank model config, creating a cross-tenant authorization gap for model usage. This PR closes that gap by making `tenant_rerank_id` resolution tenant-aware across the retrieval paths that accept it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Solution - Extend `get_model_config_by_id(...)` to accept an optional `allowed_tenant_ids` set and reject `TenantLLM` rows whose `tenant_id` is outside that set. - Pass the allowed tenant scope from retrieval endpoints that accept `tenant_rerank_id`: - `api/apps/sdk/doc.py` - `api/apps/sdk/session.py` - `api/apps/services/dataset_api_service.py` - Use the authenticated tenant plus dataset-owner tenant ids already derived by each retrieval flow as the authorization boundary for rerank model selection. - Add focused unit coverage to assert unauthorized `tenant_rerank_id` values are rejected and that the allowed tenant set is propagated correctly. 
### Testing - `python -m py_compile` on: - `api/db/joint_services/tenant_model_service.py` - `api/apps/services/dataset_api_service.py` - `api/apps/sdk/doc.py` - `api/apps/sdk/session.py` - Added unit tests in: - `test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py` - `test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py` ### Notes for reviewers - This change is intentionally narrow: it affects only the `tenant_rerank_id` path, not the normal `rerank_id` name-based resolution path. - Local lint/syntax checks passed. - Full pytest execution could not be completed in this environment because the local test runtime is missing `strenum`, so the route-test files fail during collection before exercising the updated cases. --------- Co-authored-by: jony376 <jony376@gmail.com>	2026-05-13 19:53:08 +08:00
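The tenant allow-list check described in #14782 amounts to one extra comparison during model lookup. The sketch below is a simplified stand-in for `get_model_config_by_id`: the in-memory `MODELS` table, the `TenantModel` dataclass, the function name `resolve_model_config`, and the choice of exception types are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TenantModel:
    id: int
    tenant_id: str
    config: dict = field(default_factory=dict)


# Hypothetical rows standing in for the TenantLLM table.
MODELS = {
    1: TenantModel(1, "tenant-a", {"model": "rerank-a"}),
    2: TenantModel(2, "tenant-b", {"model": "rerank-b"}),
}


def resolve_model_config(model_id, allowed_tenant_ids=None):
    """Load a model config by id, rejecting rows outside the allowed tenants."""
    row = MODELS.get(model_id)
    if row is None:
        raise LookupError(f"model {model_id} not found")
    if allowed_tenant_ids is not None and row.tenant_id not in allowed_tenant_ids:
        raise PermissionError(f"model {model_id} is not in the caller's tenant scope")
    return row.config
```

A retrieval endpoint would pass the authenticated tenant plus any dataset-owner tenants as `allowed_tenant_ids`, so knowing another tenant's row id is no longer enough to use its model configuration.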
tmimmanuel	663fc1d42c	fix(opensearch): implement doc-meta dispatch surface on OSConnection (#14577 ) ### What problem does this PR solve? Fixes #14570. On OpenSearch backends (`DOC_ENGINE=opensearch`) every document-metadata write failed with `'OSConnection' object has no attribute 'create_doc_meta_idx'`, so both `PATCH /api/v1/datasets/{ds}/documents/{doc}` with `meta_fields` and `POST /api/v1/datasets/{ds}/metadata/update` were unusable while every other document operation (retrieval, parsing, name update, chunk management) worked correctly on the same OpenSearch cluster. The bug runs deeper than the missing method name in the error message suggests. `DocMetadataService` also reached into `settings.docStoreConn.es.*` directly for the index refresh, the scripted partial update, and the count call, which means that even after adding `create_doc_meta_idx` to `OSConnection` the very next call in the same metadata flow would still raise `AttributeError` because `OSConnection` exposes `self.os` rather than `self.es`. Fixing only the reported symptom would have moved the failure one line down without restoring the feature. This PR adds a uniform document-metadata dispatch surface to both connection classes so they present the same abstract API, and routes the service layer through that surface via `getattr` guards instead of poking at backend-specific attributes. The four new methods on `OSConnection` and `ESConnectionBase` are `create_doc_meta_idx`, `refresh_idx`, `count_idx`, and `replace_meta_fields`. `OSConnection.create_doc_meta_idx` reuses the existing `conf/doc_meta_es_mapping.json` schema in the OpenSearch `body=` form because OpenSearch and Elasticsearch share the same index-creation payload, and `replace_meta_fields` emits a full scripted assignment (`ctx._source.meta_fields = params.meta_fields`) on both backends so removed keys actually disappear instead of being preserved by deep-merge semantics. 
The `getattr`-guarded dispatch in `DocMetadataService` keeps the existing fall-through paths intact for Infinity and OceanBase, which continue to rely on their search-based count fallback and on the delete-then-insert metadata replacement they used before, so this change is strictly additive for those two backends. Verification: `pytest test/unit_test/rag/utils/test_opensearch_doc_meta.py` runs 16 new unit tests that pass locally and pin the `OSConnection` dispatch surface, the `create_doc_meta_idx` short-circuit when the index already exists, the mapping-file payload routing, the `IndicesClient.create` failure path, the `refresh_idx` and `count_idx` success and error sentinels, and the full-assignment script emitted by `replace_meta_fields`. The test module stubs `common.settings` and `rag.nlp` at import time so the suite runs without the heavy backend SDKs that the rest of the repository pulls in transitively. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>	2026-05-11 17:04:28 +08:00
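The `getattr`-guarded dispatch described in #14577 can be shown with two fake connection classes. Everything below is hypothetical except the method name `count_idx`: one fake exposes the new dispatch method, the other only offers a search call, standing in for backends that keep their search-based count fallback.

```python
class FakeDispatchConn:
    """Stands in for a backend that implements the doc-meta dispatch surface."""

    def __init__(self, docs):
        self._docs = docs

    def count_idx(self, index_name):
        return len(self._docs.get(index_name, []))


class FakeSearchOnlyConn:
    """Stands in for a backend without count_idx (assumed search() signature)."""

    def __init__(self, docs):
        self._docs = docs

    def search(self, index_name):
        return list(self._docs.get(index_name, []))


def count_meta_docs(conn, index_name):
    # Prefer the uniform dispatch method when the backend provides it.
    count_idx = getattr(conn, "count_idx", None)
    if callable(count_idx):
        return count_idx(index_name)
    # Otherwise fall through to the search-based count.
    return len(conn.search(index_name))
```

The service layer never touches a backend-specific attribute such as `.es` or `.os`, which is what made the original failure reappear one line further down.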
box4wangjing	292b0b8bce	chore: fix some comments to improve readability (#14756 ) ### What problem does this PR solve? fix some comments to improve readability ### Type of change - [x] Documentation Update --------- Signed-off-by: box4wangjing <box4wangjing@outlook.com>	2026-05-11 16:48:48 +08:00
Sank	592dba1489	Refact: Added a private helper _visibility_and_status_filter (#13627) ### What problem does this PR solve? Adds a private helper `_visibility_and_status_filter(joined_tenant_ids, user_id)` that returns the Peewee condition "visible to the user (team or own) and status is VALID". ### Type of change - [x] Refactoring --------- Co-authored-by: Serobabov Aleksandr <40SerobabovAS@region.cbr.ru> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-05-11 15:21:41 +08:00


751 Commits