ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-03 01:01:56 +08:00

Author	SHA1	Message	Date
Kevin Hu	62f94cd59b	Feat: Add knowledge compilation workflows (#16515 ) ## Summary - Add knowledge compilation template APIs, services, and builtin template seed data - Add advanced knowledge compile structure/artifact/RAPTOR workflow support - Update parsing, dataset/document APIs, and supporting services for compilation workflows	2026-07-02 23:22:07 +08:00
Wang Qi	93f6d647d4	Fix the sandbox exec image cannot show and download (#16577 )	2026-07-02 20:49:51 +08:00
Lynn	bc54903bf6	Fix: display model_id in memory_list (#16567 )	2026-07-02 20:28:27 +08:00
Br1an	27c9a093bd	Fix: close MCP sessions after canvas execution to prevent connection leaks (#13295 ) ### What problem does this PR solve? Closes #12962 MCPToolCallSessions created during agent execution (in `Agent.__init__`) are never explicitly closed. Each session starts its own event loop thread and opens an SSE/HTTP connection to the MCP server. When the canvas goes out of scope, these threads and connections remain alive indefinitely, accumulating over time and causing resource exhaustion after prolonged use. ### Solution 1. Add a `Graph.close()` method that iterates all components, finds MCPToolCallSessions held by Agent tools, and calls `close_sync()` on each to properly shut down the event loop, thread, and connection. 2. Call `canvas.close()` in `finally` blocks after `canvas.run()` completes in `canvas_service.py` and `canvas_app.py`. 3. Move MCP session cleanup to `finally` blocks in `test_tool` endpoint (`mcp_server_app.py`) and `get_mcp_tools` (`api_utils.py`) to ensure sessions are closed even on exceptions. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: conflict-resolver <conflict-resolver@local> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2026-07-02 10:57:24 +08:00
Öndery	742188c3bb	feat(agent): report accurate aggregated token usage and propagate session/user + input/output to Langfuse for agent runs (#16420 ) ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Other (please describe): ## Summary Agent (Canvas) runs previously did not surface token usage in the SSE stream, and RAGFlow's own Langfuse generations for agent runs were missing the prompt/completion split and the session/user correlation. This made it impossible for an external caller (or Langfuse) to reconcile an agent turn's cost with the upstream provider (e.g. OpenRouter), because a single turn can issue several distinct LLM calls (query rewriting / cross-language translation, multi-round tool reasoning, nested sub-agents, and the final answer). This PR introduces a per-run token usage sink so that every LLM call in a run is aggregated and reported once, and enriches Langfuse generations with the prompt/completion split plus session/user attributes. ## What changes ### 1. Per-run token usage sink (`common/token_utils.py`) - Adds two `contextvars`: `token_usage_sink` (a mutable per-run accumulator) and `langfuse_run_attrs` (session_id/user_id for the run). - Adds `record_run_token_usage(...)` (thread-safe via a lock, because `thread_pool_exec` copies the context into worker threads that share the sink dict) and `usage_from_response(...)` which extracts a `{prompt_tokens, completion_tokens, total_tokens}` split from OpenAI/OpenRouter-style responses. ### 2. Provider layer captures the prompt/completion split (`rag/llm/chat_model.py`) - `LiteLLMBase` and `Base` now store `self.last_usage` (prompt/completion/total) for the most recent chat call, in both the plain and tool-calling paths. - Streaming requests set `stream_options.include_usage = True` (LiteLLM path) so the authoritative usage arrives on the final chunk; this is read even on the usage-only chunk that carries no `choices`. - Fixes a multi-round accounting bug in `_with_tools`: token totals were overwritten* by each round (`total_tokens = tol`) instead of accumulated, undercounting multi-round tool conversations. Each round is now committed to a running aggregate. ### 3. LLMBundle reports usage once, per call (`api/db/services/llm_service.py`) - New `_report_usage(total_tokens)` records the call's usage into the active run sink and returns the prompt/completion/total split for Langfuse. The split is only used when it is consistent with the authoritative total; otherwise only the total is reported. - All three chat entry points (`async_chat`, `async_chat_streamly`, `async_chat_streamly_delta`) now emit `usage_details` with `input`/`output`/`total` instead of total-only. - `_start_langfuse_observation` now applies `session_id`/`user_id` from the per-run context (`langfuse_run_attrs`) so agent-run generations are correctly grouped, even though agent LLMBundles are constructed without those attributes. ### 4. Canvas installs the sink and emits the aggregate (`agent/canvas.py`) - `Canvas.run()` installs a fresh `token_usage_sink` and `langfuse_run_attrs` (from `user_id`/`session_id`) at the start of every turn. - `message_end` now includes an aggregated `usage` object: `{prompt_tokens, completion_tokens, total_tokens, calls}` covering all LLM calls in the run. ### 5. Pass session id into the run (`api/db/services/canvas_service.py`) - `completion()` forwards `session_id` to `Canvas.run()` for Langfuse session correlation. ## Why a context variable LLM calls in an agent run originate from many places that each build their own `LLMBundle` (e.g. `cross_languages`/`keyword_extraction` helpers, the Agent component, and nested sub-agents invoked as tools). A run-scoped context variable is the only non-invasive chokepoint that captures all of them exactly once, including nested agents (which run in the same async context) and thread-pool tools (the executor copies the context). ## Behavior / compatibility - No public API or wire-format removal: `message_end` gains an additional optional `usage` field; existing consumers are unaffected. - When a provider does not return authoritative usage, behavior falls back to the previous token estimate (total only, no split). - Non-agent flows (Dataflow `Pipeline`, sync `Graph.run`) are untouched. ## Testing - [x] Simple agent answer: `message_end.usage.total_tokens` matches provider usage. - [x] Agent with cross-language retrieval: aggregate equals the sum of both provider calls. - [x] Tool-calling agent (multi-round): total accumulates across rounds. - [x] Nested agent (agent-as-tool): sub-agent tokens included in the parent run total. - [x] Langfuse: agent generations show input/output split and are grouped by session/user. --------- Co-authored-by: yzc <yuzhichang@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-07-02 09:35:28 +08:00
Harsh Kashyap	d770217b25	fix(api): fall back to factory max_tokens for tenant models (#16364 )	2026-07-01 16:00:13 +08:00
Lynn	400476f0b3	Feat: SoMark (#16482 ) Follow #15486 Co-authored-by: limuting <limuting233@gmail.com> Co-authored-by: lutianyi <lutianyi233@163.com> Co-authored-by: justinychuang <huangyicheng@soulcode.cn> Co-authored-by: maybehokori <138367708+maybehokori@users.noreply.github.com>	2026-07-01 13:29:28 +08:00
Taranum Wasu	e23f63bd93	fix(agent): prevent empty LLM user message after prompt fitting (#16413 ) ## Summary - Treat `max_tokens=0` as unset (`or 8192`) when building model context budgets, fixing agents that silently zeroed prompts when a vLLM model had `max_tokens: 0` in tenant config - Replace trailing same-role canvas history in `LLM._sys_prompt_and_msg` instead of skipping the current user prompt - Add `LLM.fit_messages()` validation after `message_fit_in` on agent paths so empty user content fails fast with a clear error instead of reaching vLLM Fixes #16411 ## Root cause Agent canvas workflow called `message_fit_in` with `int(max_length * 0.97)`. When `max_length` was `0`, both system and user content were trimmed to empty strings. The `[HISTORY STREAMLY]` log showing only `{"role":"user","content":""}` matches this. A secondary bug skipped appending the formatted user prompt when history ended with a `user` role message. ## Test plan - [x] Added `test/unit_test/agent/component/test_llm_prompt.py` for role-replace, validation, and zero-budget fitting - [x] Added `test_message_fit_in_zero_budget_preserves_non_empty_messages` in `test_generator_message_fit_in.py` - [ ] CI unit tests - [ ] Manual: agent canvas `begin → Retrieval → Agent → Message` with vLLM Qwen3; confirm user message reaches LLM Made with [Cursor](https://cursor.com) --------- Co-authored-by: Taranum Wasu <taranumwasu@Taranums-MacBook-Pro.local> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-07-01 09:30:54 +08:00
天海蒼灆	3c946a7e58	fix(agent): add canvas_type filter and field to list_agents API (#15754 ) ### What problem does this PR solve? GET /api/v1/agents (list_agents) already supports filtering by canvas_category, keywords, tags, and owner_ids, but it does not support canvas_type — even though canvas_type is a persisted field on UserCanvas and is already accepted on agent create/update APIs. This gap causes two issues: Filtering — clients cannot list agents by business category (e.g. Marketing, Agent, Ingestion Pipeline) without fetching all agents and filtering client-side. Response payload — list_agents did not return canvas_type in each canvas item, so consumers had to call GET /api/v1/agents/{id} per agent to read it. This PR adds optional canvas_type query parameter support and includes canvas_type in the list response. ### Type of change - [√] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-06-30 17:43:26 +08:00
Zhichang Yu	0c3952147c	fix(codeql): close remaining 44 CodeQL alerts post-merge (#16408 ) ## Summary After #16407 merged, 44 of the original 93 CodeQL alerts were still open on the default branch. This PR closes the remaining ones by: 1. Moving 32 existing `// codeql[...]` directives so they sit on the line immediately before the suppressed statement. The original multi-line suppression blocks had the directive as the first line, with the rationale on subsequent lines. After line shifts (refactors, linter reformat), the directive ended up several lines above the alert location — CodeQL only recognizes the suppression when it appears on the line directly above. (32 alerts across 27 files.) 2. Adding 9 new `// codeql[...]` suppressions for alerts that had no suppression in the preceding lines at all — mostly real-fixes that CodeQL conservatively still flags (filepath.Base, bounded slice sizes, model-identifier strings, the MD5-legacy-migration lookup in `conversation_service.py`). ## Files changed - `api/db/services/conversation_service.py` — add `py/weak-sensitive-data-hashing` suppression (MD5 for backward-compat legacy row lookup; not used for auth) - `api/db/services/llm_service.py` — 3× `py/clear-text-logging-sensitive-data` suppressions on the lines that log `llm_name` in warnings/info - `common/misc_utils.py` — 2× `py/clear-text-logging-sensitive-data` suppressions on the redacted `current_url` log sites - `internal/agent/component/invoke.go` — moved existing `go/request-forgery` directive - `internal/agent/sandbox/ssh.go` — moved existing `go/command-injection` directive - `internal/agent/tool/retrieval_service.go` — added `go/uncontrolled-allocation-size` suppression (`topN` is bounded to 1024 above) - `internal/cli/common_command.go` — moved 2× `go/disabled-certificate-check` directives - `internal/cli/user_command.go` — added `go/clear-text-logging` suppression (filepath.Base already strips user-identifying path) - `internal/dao/pipeline_operation_log.go` — moved 2× `go/sql-injection` directives - `internal/dao/user_canvas.go` — added `go/sql-injection` suppression in `GetList` (the new `userCanvasOrderClause` call path) - `internal/engine/infinity/chunk.go` — moved existing `go/unsafe-quoting` directive - `internal/entity/models/` — moved `go/path-injection` directives (15 files) - `internal/handler/oauth_login.go` — moved existing `go/cookie-httponly-not-set` directive - `internal/handler/tenant.go` — moved existing `go/path-injection` directive - `internal/service/deep_researcher.go` — moved existing `go/unsafe-quoting` directive - `internal/service/dataset.go` — added `go/uncontrolled-allocation-size` suppression (`n` bounded to 1024 above) - `internal/service/file.go` — moved existing `go/request-forgery` directive - `internal/service/langfuse.go` — moved 2× `go/request-forgery` directives - `internal/utility/mcp_client.go` — moved 3× `go/request-forgery` directives - `internal/utility/smtp.go` — moved existing `go/email-injection` directive - `rag/prompts/generator.py` — added `py/clear-text-logging-sensitive-data` suppression - `web/.../use-provider-fields.tsx` — added `js/prototype-pollution-utility` suppression (FORBIDDEN_KEYS guard is on the line above) ## Why the previous PR left alerts open `// codeql[query-id] explanation` must be on the line immediately before* the suppressed statement per the [GitHub CodeQL suppression spec](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/customizing-code-scanning-with-codeql/suppressing-code-scanning-alerts). The original suppression blocks were 4-5 lines, with the directive as the first line. After linter reformat / line shifts, the directive ended up too far above the actual alert line to be recognized. The fix is to put the directive on the line directly above the suppressed statement, with the rationale above it. ## Test plan - All 9 modified Python files `ast.parse` clean - All 4 modified Go files `gofmt` clean - 36/44 expected alert suppressions in place - 8 remaining CodeQL alerts are the originals (#3485851828, #3485851831, #3485869759, #3485869766, #3485869768, #3485869771, #3485885962, #3485895527) which were resolved by the corresponding commit comments; these should close on the next scan when the suppression comments match the alert lines. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-06-29 09:45:16 +08:00
Zhichang Yu	195bfffb5e	fix(security): address 93 CodeQL code-scanning alerts across 61 files (#16407 ) ## Summary Resolves all 93 open alerts at https://github.com/infiniflow/ragflow/security/code-scanning by rule: \| Rule \| Count \| Treatment \| \|------\|-------\|-----------\| \| py/clear-text-logging-sensitive-data \| 23 \| Real fix — log scrubbing \| \| go/path-injection \| 15 \| Real fix where possible, suppression with rationale \| \| go/request-forgery \| 8 \| Suppression with rationale (operator-controlled URLs) \| \| go/clear-text-logging \| 10 \| Real fix — log scrubbing \| \| go/unsafe-quoting \| 5 \| Real fix — escape or refactor \| \| go/sql-injection \| 3 \| Real fix — orderby whitelist + CodeQL comment \| \| go/uncontrolled-allocation-size \| 2 \| Real fix — cap to 1024 \| \| go/incorrect-integer-conversion \| 3 \| Real fix — ParseInt + range check \| \| go/insecure-hostkeycallback \| 1 \| Real fix — known_hosts file \| \| go/disabled-certificate-check \| 2 \| Suppression with rationale \| \| go/command-injection \| 1 \| Suppression (sanitized via shq()) \| \| go/email-injection \| 1 \| Suppression with rationale \| \| go/cookie-httponly-not-set \| 1 \| Suppression (SPA bootstrap) \| \| js/stack-trace-exposure \| 1 \| Real fix — generic client message \| \| js/prototype-pollution-utility \| 1 \| Real fix — reject __proto__/constructor/prototype \| \| py/weak-sensitive-data-hashing \| 1 \| Real fix — MD5 → SHA-256 \| \| py/incomplete-url-substring-sanitization \| 3 \| Real fix — urlparse(hostname) \| \| py/paramiko-missing-host-key-validation \| 1 \| Real fix — load_system_host_keys + RejectPolicy \| \| cpp/integer-multiplication-cast-to-long \| 2 \| Real fix — cast to size_t \| ## Real fixes (with measurable security improvement) SSH host key verification (Go + Python) Replace `InsecureIgnoreHostKey()` / `paramiko.AutoAddPolicy()` with proper host key verification against a known_hosts file (configurable via `SSH_KNOWN_HOSTS` env / `known_hosts` config field; fail-closed when unset). Loads `~/.ssh/known_hosts` first via `load_system_host_keys()` so existing setups keep working. SQL injection in `user_canvas` Add `userCanvasOrderableColumns` whitelist + `userCanvasOrderClause` helper. Both `GetList()` and `ListByTenantIDs()` now route the user-supplied `orderby` query param through the helper, defaulting to `create_time` on miss. SQL injection in `pipeline_operation_log` Existing whitelist documented via CodeQL comment. Real SQL injection in `infinity/chunk.go:931` Escape `'` → `''` on user-controlled `questionText` before splicing into `filter_fulltext(...)` SQL filter. Real SQL injection in `elasticsearch/sql.go:75` Defense-in-depth escape on tokenizer output before splicing into `MATCH(...)`. Python code injection in `result_protocol.go` Replace raw JSON literal embedding into Python/JS expressions with base64 + `json.loads` / `JSON.parse(Buffer.from(..., 'base64').toString('utf8'))`. Eliminates both the unsafe-quoting sink and the brittleness of mixing JSON true/false/null with Python syntax. URL substring check bypass in `embedding_model.py` Replace `if "dashscope-intl.aliyuncs.com" in u` with `urlparse(u).hostname == "dashscope-intl.aliyuncs.com"` so a base_url like `https://attacker.example/?u=dashscope-intl.aliyuncs.com` cannot bypass the routing. Prototype pollution in `setNestedValue` (TS) Reject `__proto__`/`constructor`/`prototype` keys before any assignment. Integer overflow - scrypt params via `ParseInt` + non-positive check (`internal/common/password.go`) - `topN` and `n` caps to 1024 (retrieval_service.go, dataset.go) - `nallocstatesize` cast to `size_t` (cpp/re2/onepass.cc) Cookie httponly* Set explicitly with rationale: this is the OAuth bootstrap cookie intentionally read by the SPA. Stack trace exposure Replace `error.message` in HTTP 500 response with generic `"internal error"`; full error still logged server-side via `console.error`. Weak hashing MD5 → SHA-256 for deterministic `conv_id` derivation (`conversation_service.py`). Log scrubbing Remove or redact user-controlled / sensitive content from clear-text logs across 8 ingestion parsers, `llm_service.py` ×11, `tenant_llm_service.py` ×7, `misc_utils.py` ×4, `redis_conn.py` ×10, `conftest.py` ×4, `init_data.py`, `dataset_api_service.py`, `generator.py`, `mysql_migration.py`, `cli.go`, `user_command.go`, `pdf_parser.go`. Most patterns converted to parameterized logging (`logging.info("...: %d", n)`) or static messages. ## CodeQL suppressions (each with rationale) For alerts where the data flow is genuinely safe but CodeQL can't see the context — operator-controlled URLs, sanitized inputs, etc. — I added `// codeql[go/<rule>] <rationale>` annotations rather than dismissing them, so future readers can audit the rationale inline: - `internal/agent/component/invoke.go:135` — Invoke is a generic canvas HTTP client - `internal/service/langfuse.go` ×2 — host is per-tenant operator config - `internal/service/file.go:1184` — already SSRF-guarded by `assertURLSafe` - `internal/utility/mcp_client.go` ×3 — already `AssertURLSafe` + IP-pinned - `internal/entity/models/bedrock.go` — sigv4-signed request, URL can't be tampered - `internal/service/deep_researcher.go:269` — `callback` is SSE display string, not SQL - `internal/engine/infinity/chunk.go:346` — UUIDs can't contain `'` (RFC 4122) - `internal/cli/common_command.go` ×2 — CLI trusts operator-configured URL - `internal/utility/smtp.go:194` — msg is server-built, not user form input - `internal/entity/models/*` ×14 (path-injection) — audio file paths are caller-supplied ## Test plan - ✅ All 13 modified Go packages build cleanly - ✅ 663 tests pass across `internal/agent/sandbox`, `internal/common`, `internal/agent/component`, `internal/engine/infinity`, `internal/dao` - ✅ All 11 modified Python files parse via `ast.parse` - ✅ TypeScript `tsc --noEmit` clean on the modified `use-provider-fields.tsx` - ✅ `node --check` clean on the modified JS file 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-06-29 09:45:16 +08:00
Zhichang Yu	f58fae5fb7	feat(go-agent): Ported retrieval node, added Keenable web search tool (#16396 ) Ported retrieval node, added Keenable web search tool - [x] New Feature (non-breaking change which adds functionality)	2026-06-29 09:45:16 +08:00
Wang Qi	985e3c1db5	Fix document progress not set to fail when embedding model error (#16381 )	2026-06-26 16:11:54 +08:00
Harsh Kashyap	49312cace3	fix(api): align use_sql Markdown separator with Source header (#16317 )	2026-06-25 19:00:01 +08:00
Idriss Sbaaoui	fb8e5ad4b2	Fix multimodal chat image routing for VLM channel requests (#16343 )	2026-06-25 14:38:29 +08:00
buua436	479a9a715e	feat: unify provider id or name routing (#16336 )	2026-06-25 13:04:21 +08:00
Wang Qi	d0fc75f1bb	Fix when empty response not set, it report: ERROR: 'knowledge' (#16338 )	2026-06-25 13:02:24 +08:00
Ambercssa	e9cdd09b67	fix(agent): handle different reference data formats (#16276 )	2026-06-24 13:33:59 +08:00
Wang Qi	6046bc6a8e	Fix: handle empty folder when link to datasets (#16296 )	2026-06-24 13:31:32 +08:00
Ju Boxiang	39b194453d	Fix: paginate get_flatted_meta_by_kbs to support datasets with >10k documents (#16034 ) (#16095 )	2026-06-24 13:20:07 +08:00
ちー	5928b8b9ae	fix(document_service): prevent NoneType error on progress_msg.strip() (#16289 ) ### What problem does this PR solve? When I run RAGFlow_server.py: ``` 2026-06-24 10:27:01,938 ERROR 3413485 fetch task exception Traceback (most recent call last): File "/home/infiniflow/Documents/development/ragflow/api/db/services/document_service.py", line 948, in _sync_progress if t.progress_msg.strip(): ^^^^^^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'strip' ``` fixed: ```python if t.progress_msg.strip(): # fix: if (t.progress_msg or "").strip(): ``` Fix crash in `_sync_progress` when `progress_msg` is `None`. #### Root Cause `progress_msg` from task records can be `None`, causing: ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-24 13:07:40 +08:00
buua436	d5d9d19fbe	fix: keep chat channel bindings consistent (#16274 )	2026-06-24 11:51:35 +08:00
qinling0210	563d855780	Implement OpenAI chat completions in GO (#16177 ) ### What problem does this PR solve? Implement OpenAI chat completions in GO POST /api/v1/openai/<chat_id>/chat/completions OpenAI chat cli: internal/development.md ### Type of change - [x] Refactoring	2026-06-18 18:07:27 +08:00
Lynn	47bd9dd049	Fix: replace tenant_llm apis (#16131 ) Replace tenant_llm apis with provider-instance apis.	2026-06-18 16:38:32 +08:00
buua436	78b4906f7a	fix: tighten embedding truncation threshold (#16123 ) ### What problem does this PR solve? Use a 95% max_length threshold before truncating embedding inputs, which reduces the chance of provider-side invalid-parameter errors on near-limit chunks. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-17 14:18:02 +08:00
euvre	fe46244d30	fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM (#16106 ) The parser pods suffer from OOM kills when processing large PDF documents. The root cause is in api/db/services/task_service.py: when layout_recognize is not DeepDOC (e.g. Plain Text), page_size was set to MAXIMUM_TASK_PAGE_NUMBER (100 million), causing the entire PDF to be processed as a single task with all pages loaded into memory simultaneously. This PR fixes the issue by paginating non-DeepDOC PDF parsing tasks the same way DeepDOC already does.	2026-06-17 09:33:53 +08:00
Wang Qi	17e3aad7ae	Revert "fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM" (#16104 ) Reverts infiniflow/ragflow#15951	2026-06-16 20:11:45 +08:00
euvre	d2a18d5c46	fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM (#15951 ) ### What problem does this PR solve? The parser pods suffer from OOM kills when processing large PDF documents. The root cause is in api/db/services/task_service.py: when layout_recognize is not DeepDOC (e.g. Plain Text), page_size was set to MAXIMUM_TASK_PAGE_NUMBER (100 million), causing the entire PDF to be processed as a single task with all pages loaded into memory simultaneously. This PR fixes the issue by paginating non-DeepDOC PDF parsing tasks the same way DeepDOC already does. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [x] Performance Improvement - [ ] Other (please describe):	2026-06-16 20:07:19 +08:00
Kevin Hu	15f50e5cb2	fix: rename dialog_id to chat_id in chat_channel (backend + frontend) (#16096 ) ## Summary - The `ChatChannel` DB column was renamed from `dialog_id` to `chat_id` via a migration (added in a prior commit). - Aligns the REST API layer (`chat_channel_api.py`, `chat_channel_service.py`) to use `chat_id` consistently. - Updates the frontend (`interface.ts`, `hooks.ts`, `connect-dialog-modal.tsx`, `added-channel-card.tsx`) to read/write `chat_id` instead of `dialog_id`. - The joined `dialog_name` alias in the list query is unchanged (backend still returns it under that name). Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-16 19:02:20 +08:00
Lynn	70792de899	Fix: v0.26.1 model provider (#16073 ) ### What problem does this PR solve? Fix: - Pass session_id to langfuse. - Get correct status for add model_type. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-16 16:21:43 +08:00
Kevin Hu	5a817762fa	Refactor: Change table chat_channel status data type. (#16061 ) ### What problem does this PR solve? As title. ### Type of change - [x] Refactoring	2026-06-16 12:02:12 +08:00
buua436	8e235b7b95	fix: add legacy chat/completions mode (#16014 ) ### What problem does this PR solve? Adds a legacy mode for /chat/completions that restores v0.23.0-style output by converting start_to_think/end_to_think back into raw <think></think> markers and streaming cumulative answer text. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-16 10:34:06 +08:00
Lynn	47495c1f6a	Feat: model provider (#16028 ) ### What problem does this PR solve? Feat: - Allow upsert model_type for instance model Fix: - Allow create instance with duplicate api_key ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-06-15 19:10:33 +08:00
Wang Qi	f6a2075ad0	Fix one data source can be synced to multiple dataset (#16023 ) Fix one data source can be synced to multiple dataset Test add/delete - worked.	2026-06-15 16:54:25 +08:00
Yingfeng	b5bea72e4b	Add git-like file commit API (#15978 ) ### What problem does this PR solve? \| # \| Method \| Endpoint \| Description \| Git Equivalent \| \|---\|--------\|----------\|-------------\|----------------\| \| 1 \| `POST` \| `/api/v1/{prefix}/{folder_id}/commits` \| Create a snapshot commit with file changes (add/modify/delete/rename) \| `git add` + `git commit` \| \| 2 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits` \| List commit history (paginated) \| `git log` \| \| 3 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}` \| Get commit detail with file changes \| `git show` \| \| 4 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files` \| List file changes in a commit \| `git show --name-status` \| \| 5 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/diff?from=...&to=...` \| Compare two commits and return differences \| `git diff` \| \| 6 \| `GET` \| `/api/v1/{prefix}/{folder_id}/changes` \| Get uncommitted changes (add/modify/delete) \| `git status` \| \| 7 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/tree` \| Get the folder tree snapshot at commit time \| `git ls-tree` \| \| 8 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files/{file_id}/content` \| Get a file's content as it existed in a specific commit \| `git show HEAD:file` \| \| 9 \| `GET` \| `/api/v1/{prefix}/{file_id}/versions` \| Get version history for a specific file across all commits \| `git log -- file` \| Where `{prefix}/{id}` can be: - `folders/{folder_id}` — direct folder access - `workspaces/{workspace_id}` — alias of `folders/{folder_id}` - `datasets/{dataset_id}` — resolves to the dataset's folder - `memories/{memory_id}` — resolves to the memory's folder - `skills/{skill_id}` — resolves to the skill's folder ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2026-06-15 11:19:56 +08:00
Kevin Hu	b5a426e6e0	Feat: chat channels — connect assistants to external messaging bots (#15850 ) ### What problem does this PR solve? #15844 Adds a Chat channels capability so a RAGFlow assistant (Dialog) can be exposed as a bot on external messaging platforms (Feishu/Lark, Discord, Telegram, Slack, WeCom, LINE, etc.). An admin configures a bot in the UI, connects it to an assistant, and inbound messages are answered from that assistant's knowledge base — replies are delivered back on the channel. Feishu/Lark is implemented and tested end-to-end. Discord, Telegram, LINE, and WeCom are scaffolded against the same interface; the remaining listed channels are tracked as follow-ups. ### Design Backend - New `chat_channel` table (`tenant_id`, `name`, `channel`, `config` JSON holding `{credential: {...}}`, `dialog_id`, `status`) + `ChatChannelService` and RESTful CRUD under `/api/v1/chat_channels`. - Channel framework under `api/channels/`: a `core` registry + per-channel packages that self-register a builder and implement a common `Channel` interface (`start`/`stop`/`send` + inbound normalization) over `IncomingMessage`/`OutgoingMessage`. - Embedded reconcile loop in `ragflow_server` (`api/channels/bootstrap.py`): loads enabled bots, and starts/stops/restarts them as rows change (no server restart needed). Inbound messages run the connected dialog via the non-streaming completion path, keeping per-end-user conversation history. - Missing optional channel SDKs degrade gracefully (channel skipped with a warning; others unaffected). Channel-level errors are logged, not crashed. - Feishu's WebSocket client runs in a dedicated thread with its own event loop to avoid cross-loop/contextvars conflicts with the channel runtime. Frontend - Settings → Chat channels panel: available-channels grid + configured-bots list with add/edit/delete and a Connect assistant popup that binds a bot to a dialog. - Brand icons via simple-icons / reused shared data-source assets, with colored fallbacks for brands not available. - Route, sidebar entry, i18n (en/zh), and a top-nav segment-boundary fix so the settings page no longer highlights the Chat tab. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Notes - DB: new `chat_channel` table is auto-created; `chat_channel.dialog_id` is also covered by a `migrate_db` `alter_db_add_column` for existing installs. - Channel SDKs (`lark-oapi`, `discord.py`, `python-telegram-bot`, `line-bot-sdk`, `wechatpy`, `aiohttp`) added to dependencies. - Screenshots / per-channel credential docs to follow. <img width="1338" height="1290" alt="Image" src="https://github.com/user-attachments/assets/042cb2f9-0dad-4e6a-bcf7-43ced4bbd704" /> <img width="1344" height="738" alt="Image" src="https://github.com/user-attachments/assets/373cd08e-ec40-4c67-9c51-4d948b1ba617" /> <img width="672" height="887" alt="Image" src="https://github.com/user-attachments/assets/5a34953f-a9a3-4c1e-869e-5eff0dc64c84" /> ---------	2026-06-12 18:21:30 +08:00
Jonathan Chang	de06c9a60b	feat: Langfuse session grouping for multi-turn chat traces (#15679 ) ## Summary This PR passes `session_id` into Langfuse trace observations so multi-turn chat messages can be grouped under the same session in Langfuse. Changes include: - Propagate `session_id` from chat/session APIs into `dialog_service.async_chat`. - Pass `session_id` into Langfuse `start_observation(...)`. - Share Langfuse `trace_context` with chat, embedding, rerank, and TTS model bundles where applicable. - Add unit coverage to verify Langfuse observations receive `session_id`. - Update affected test stubs for the new optional Langfuse context arguments. ## Related Issue Closes: #15636 ## Change Type - [x] Feature - [x] Bug fix - [x] Test - [ ] Refactor - [ ] Documentation - [ ] Breaking change ## Real Behavior Proof Before this change: - Langfuse observations were created without `session_id`. - Multi-turn chat traces could not be grouped by session in Langfuse. After this change: - Chat/session flows pass `session_id` into `async_chat`. - Langfuse observations include `session_id`. - Related model bundles receive shared trace context and session metadata. Validation result: ```bash uv run python -m py_compile \ api/db/services/tenant_llm_service.py \ api/db/services/llm_service.py \ api/db/services/dialog_service.py \ api/db/services/conversation_service.py \ api/apps/restful_apis/chat_api.py \ test/unit_test/api/db/services/test_dialog_service_final_answer.py \ test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py ``` Passed. ```bash uv run pytest \ test/unit_test/api/db/services/test_dialog_service_final_answer.py \ test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py -q ``` Result: ```text 11 passed in 16.89s ``` ```bash git diff --check ``` Passed. ## Checklist - [x] Analyzed the issue requirement. - [x] Checked existing Langfuse trace integration. - [x] Implemented only the requested session grouping behavior. - [x] Added/updated unit tests. - [x] Ran focused tests successfully. - [x] Ran Python compile validation. - [x] Ran whitespace diff validation.	2026-06-12 10:18:06 +08:00
Lynn	9d5950963b	Fix: get is_tools from model record (#15946 ) ### What problem does this PR solve? As title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-11 17:29:28 +08:00
少卿	9614605bf9	fix: propagate max_tokens from model config to downstream consumers (#15945 ) ## Summary `get_model_config_from_provider_instance()` was not including `max_tokens` in its returned dict, causing all downstream consumers (dialog truncation, message fitting, knowledge base trimming, embedding, graphrag, RAPTOR) to fall back to the hardcoded default of 8192 tokens regardless of the actual model context window size (e.g., GPT-4o 128K, Claude 200K). Closes #15944 ## Root Cause The function builds `model_config` with only: `llm_factory`, `api_key`, `llm_name`, `api_base`, `model_type`, `is_tools`. `max_tokens` is never included. Yet the data exists in four independent sources: 1. `TenantModel.extra` JSON field — written by `provider_api_service.py:659` 2. `conf/llm_factories.json` — every model entry has `max_tokens` 3. `rag/llm/model_meta.py` — 9 provider classes fetch real context windows from APIs 4. `TenantLLM.max_tokens` database column None of them are read by this function. ## Fix Two lines added, one per return path: - Path B (model_obj exists → provider-instance model): reads `max_tokens` from `model_obj.extra` JSON - Path C (fallback → factory config): reads `max_tokens` from `llm_info` (sourced from `llm_factories.json`) Both fall back to 8192 when the value is absent, preserving backward compatibility. ## Impact This single 5-line change fixes the context window budget for all 78+ call sites across 20 files that construct `LLMBundle` or read `max_tokens` from the config dict, including: \| Consumer \| File \| Effect \| \|---\|---\|---\| \| Dialog chat truncation \| `dialog_service.py:562` \| `message_fit_in(msg, max_tokens * 0.95)` now uses real context window \| \| Knowledge base trimming \| `dialog_service.py:752` \| `kb_prompt(kbinfos, max_tokens)` now fits more retrieved content \| \| Agent message fitting \| `agent/component/llm.py:322` \| Agent prompts no longer truncated at 7946 tokens \| \| Embedding truncation \| `task_executor.py:704` \| Embedding input uses actual model limit \| \| GraphRAG extraction \| `graphrag/*/extractor.py` \| Entity extraction gets full context budget \| \| LLM4Tenant.max_length \| `tenant_llm_service.py:513` \| Chat model wrapper exposes real context window \|	2026-06-11 17:24:58 +08:00
bohdansolovie	381091df71	fix(dialog): guard async_ask() against empty or invalid kb_ids (#15530 ) Fixes #15529 . ### Problem `async_ask()` accessed `kbs[0]` without verifying that `KnowledgebaseService.get_by_ids()` returned any knowledge bases. Empty or stale `kb_ids` raised `IndexError`, which surfaced as HTTP 500 on search/bot SSE endpoints. ### Fix - Add an early guard when `kbs` is empty, yielding a final SSE error event (consistent with `gen_mindmap()` in the same module). - Add regression tests for empty `kb_ids` and deleted/invalid KB IDs. ### Test plan - [ ] `pytest test/unit_test/api/db/services/test_dialog_service_final_answer.py -k "async_ask_empty or async_ask_stale"` - [ ] Manual: `POST /api/v1/searchbots/ask` with invalid `kb_ids` returns SSE error, not HTTP 500 --------- Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:52:59 +08:00
Rene Arredondo	b978e26208	fix(db): drop Peewee-auto-named unique index on tenant_model_instance (#15699 ) (#15879 ) ## Summary Fixes #15699. User upgrades to v0.25.6 against an existing MySQL database, tries to add an Ollama provider instance, and gets: ``` MySQL IntegrityError: Duplicate entry 'dbaafbfe608a11f1a5516d6066988224' for key 'tenant_model_instance.tenantmodelinstance_api_key_provider_id' ``` The route at [api/apps/restful_apis/provider_api.py:354](api/apps/restful_apis/provider_api.py#L354) catches it and returns `get_error_data_result(message="Internal server error")` — which by RAGFlow's convention is HTTP 200 with an error `code` on the body — hence the reporter's "200 status code but the database errored" complaint. ### Root cause The provider-instance refactor in [PR #15460](https://github.com/infiniflow/ragflow/pull/15460) dropped the unique-compound-index tuple from `TenantModelInstance`: ```python # Removed in #15460 class Meta: db_table = "tenant_model_instance" indexes = ( (("api_key", "provider_id"), True), # unique ) ``` and added a one-shot drop in `migrate_db()` for existing databases. But the drop targets the wrong index name: ```python # Before this PR — wrong name for table_name, index_name in [ ("tenant_model_instance", "idx_api_key_provider_id"), # ← doesn't exist ("tenant_model", "idx_provider_model_instance"), ]: ``` Peewee's auto-derived index name is `<lowercase classname>_<col1>_<col2>` → `tenantmodelinstance_api_key_provider_id`, which matches the user's error verbatim. The drop raises `OperationalError: 1091 (HY000): Can't DROP …`, the surrounding `except` clause at [db_models.py:1736](api/db/db_models.py#L1736) swallows it as expected-on-fresh-installs, and the legacy unique index lives on indefinitely. ### Why Ollama hits it specifically Ollama doesn't require an API key. The form posts `api_key: ""`. The app-layer dedupe at [provider_api_service.py:288-292](api/apps/services/provider_api_service.py#L288-L292): ```python api_key_str = "" if api_key: # ← skipped for "" ... same_key_instance = TenantModelInstanceService.get_by_provider_id_and_api_key(...) if same_key_instance: return False, f"Already exist instance: ... with api_key {api_key}" ``` falls through for empty keys. Control reaches `TenantModelInstanceService.create_instance(..., api_key="")` which inserts a row whose `(api_key, provider_id) = ("", <provider_uuid>)` collides with any prior Ollama row that already shipped that same pair → the still-present unique index throws. (`dbaafbfe608a11f1a5516d6066988224` in the user's error is the duplicated `provider_id` UUID, paired with the empty `api_key`.) ### Fix Add the Peewee auto-name alongside the existing `idx_` entry so the migration finally drops the obsolete index on next restart: ```python legacy_indexes = [ ("tenant_model_instance", "idx_api_key_provider_id"), ("tenant_model_instance", "tenantmodelinstance_api_key_provider_id"), # ← added ("tenant_model", "idx_provider_model_instance"), ] ``` The surrounding `try/except (OperationalError, ProgrammingError)` matches `1091` / `can't DROP` / `does not exist` and treats them as success, so every state is idempotent (see Test plan). ### Idempotency matrix \| Database state \| First entry (`idx_api_key_provider_id`) \| New entry (`tenantmodelinstance_api_key_provider_id`) \| \| --- \| --- \| --- \| \| Fresh install (≥ #15460) — neither index exists \| `1091` → swallowed \| `1091` → swallowed \| \| Upgraded from before `dc4b82523` (the user's case) — auto-name present \| `1091` → swallowed \| drops the index* \| \| Upgraded after a manual rename to `idx_` \| drops the index \| `1091` → swallowed \| \| Re-run of `migrate_db()` after either of the above \| `1091` → swallowed \| `1091` → swallowed \| No rollback hazard: nothing depends on this unique constraint anymore (`create_instance` dedupes by `instance_name` via `duplicate_name`, see [tenant_model_instance_service.py:27](api/db/services/tenant_model_instance_service.py#L27)). ### What this PR does NOT change - `provider_api_service.create_provider_instance`* — its `if api_key:` gate is correct for the post-migration world: multiple Ollama instances with empty keys under one provider are legitimate, so we shouldn't tighten the app-layer check. - `TenantModelInstance` Peewee model — the `indexes` tuple was already removed in #15460. New databases never get the constraint in the first place. - The `except → get_error_data_result` → HTTP 200 pattern at `provider_api.py:354` — that's a project-wide convention; changing one route to HTTP 500 would be inconsistent and out of scope. ## Test plan - [ ] Reproducer (pre-fix): on a database originally created before #15460, configure an Ollama provider with an empty `api_key`, then try to create a second instance under the same provider — confirm the `Duplicate entry … 'tenantmodelinstance_api_key_provider_id'` error in the server log. - [ ] Verify the index is present pre-restart: `SHOW INDEX FROM tenant_model_instance WHERE Key_name = 'tenantmodelinstance_api_key_provider_id';` — non-empty result. - [ ] Restart with the fix applied: server starts cleanly, `migrate_db()` runs, no `Failed to drop index` in critical logs. - [ ] Verify the index is gone post-restart: same `SHOW INDEX` query — empty result. - [ ] Re-run the reproducer: two Ollama instances under the same provider, both `api_key=""`, both succeed. - [ ] Restart a second time — no new errors; the matching `1091` swallow keeps the migration idempotent. - [ ] Fresh install smoke test: drop the DB volume, start clean — no `1091` noise (the new index never existed), no functional regression. ## Files changed - [api/db/db_models.py](api/db/db_models.py) — extend the legacy-index drop list with `tenantmodelinstance_api_key_provider_id`; refactor the inline list to a named `legacy_indexes` local with a comment pointing at #15460 and #15699. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:47:12 +08:00
少卿	8e17a12990	fix: remove think text buffering for real-time reasoning stream (#15891 ) Fix: remove think text buffering for real-time reasoning stream	2026-06-10 16:55:57 +08:00
buua436	dcf623d60d	feat: support multi-type factory models (#15893 ) ### What problem does this PR solve? Support factory models with multiple model types, so visual chat models can be exposed as both image2text and chat while preserving the database model-type-per-record design. This also updates the SILICONFLOW model list and adds a helper script to refresh SiliconFlow models from the provider API. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-10 15:35:21 +08:00
Lynn	478c9846a1	Fix: model list (#15860 ) ### What problem does this PR solve? Remove tenant_llm call in rag. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-10 14:59:57 +08:00
Wang Qi	9aa81e7cad	Fix paddle ocr / minerU cannot add (#15858 ) Fix paddle ocr / minerU cannot add	2026-06-10 13:04:13 +08:00
buua436	c1496ffd43	fix: propagate memory tenant id in task collect (#15837 ) ### What problem does this PR solve? Propagate `tenant_id` from memory task messages into task collection so refactored task execution can build a valid context. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-09 17:47:48 +08:00
DearsisHS	cbb3896aaa	fix(api): guard missing row in SearchService.get_detail (#15622 ) ## Summary `SearchService.get_detail` crashed with `AttributeError` (HTTP 500) when no matching row existed, because it called `.first().to_dict()` before the `if not search` guard — making that guard dead code. ## Root cause `.first()` returns `None` when the query matches nothing (deleted search app, or joined `User` not `VALID`). `None.to_dict()` raises before the guard runs. ## Fix ```diff .first() - .to_dict() ) if not search: return {} - return search + return search.to_dict() ``` Guard the `None` first, then serialize — restoring the intended `{}` "not found" return that every caller (`search_api`, `bot_api`, `chat_api`, `dataset_api_service`) already handles. ## Files changed - `api/db/services/search_service.py` ## Verification - `ruff check` — clean - Logic: `.first()` → `None` now hits `return {}` instead of `None.to_dict()`. Local full pytest not run (heavy RAG deps); CI validates. ## Note Implemented with LLM assistance (model: claude-opus-4-8). Closes #15621 Co-authored-by: dearsishs <MCarter112116@outlook.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 23:01:28 +08:00
buua436	c8c890b06c	fix: refine think stream parsing (#15745 ) ### What problem does this PR solve? Refine the stream parsing for `<think>` / `</think>` so MiniMax and DeepSeek-style chunking both flush in the right order without mixing think and answer buffers. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-08 16:53:22 +08:00
qinling0210	c960dc2a4c	Refine handling of POST /api/v1/datasets/search in GO (#15583 ) ### What problem does this PR solve? Refine handling of POST /api/v1/datasets/search in GO ### Type of change - [x] Refactoring	2026-06-08 11:49:37 +08:00
Wang Qi	aa9545e4c9	Revert "fix: duplicate document ingest guard" (#15707 ) Reverts infiniflow/ragflow#15638	2026-06-05 17:45:29 +08:00

1 2 3 4 5 ...

786 Commits