## Summary
- route hosted MinerU.Net and PaddleOCR.Net provider names to their
existing Go drivers
- add regression coverage for loading the hosted OCR provider configs
through ProviderManager
## What changed
- Added canonical provider-name aliases for the hosted OCR provider
display names.
- Covered both bundled configs with a focused provider-manager test.
## Why
The hosted provider configs use display names with `.Net`, while model
factory dispatch lowercases the provider name. Without aliases, those
configs fall through to `DummyModel` instead of using the existing
MinerU and PaddleOCR drivers.
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- Add LongCat model-list support through the documented
OpenAI-compatible models endpoint.
## What changed
- Add the LongCat `models` URL suffix for `/openai/v1/models`.
- Implement `ListModels` for the LongCat Go driver.
- Delegate `CheckConnection` to the lightweight model-list request.
- Add focused regression coverage for successful, malformed, oversized,
and missing-key responses.
## Why
LongCat documents a models endpoint under the OpenAI-compatible API
surface, but the Go driver still returned `no such method` for model
listing and connection checks.
## Validation
- `go test ./internal/entity/models -run TestLongCat -count=1`
- `go test -race ./internal/entity/models -run TestLongCat -count=1`
- `go test ./internal/entity -count=1`
- `git diff --check`
## Notes
- Related to the broader Go model provider tracking in #14736, but this
PR only handles LongCat model listing.
- `go test ./internal/entity/models -count=1` is currently blocked by an
unrelated Astraflow test panic outside this LongCat change.
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- add the xAI `models` URL suffix used by the existing Go `ListModels`
implementation
- return a clear error when the xAI models suffix is missing
- add focused xAI model-listing and connection-check regression tests
## What changed
- Added `url_suffix.models` to `conf/models/xai.json`.
- Normalized the configured models suffix before building the request
URL.
- Covered config loading, successful model listing, upstream errors,
API-key validation, missing suffix handling, and `CheckConnection`
delegation.
## Why
`XAIModel.ListModels` already builds requests from `URLSuffix.Models`,
and `CheckConnection` delegates to that method. The bundled xAI config
did not define that suffix, which left the model-listing path unable to
call the provider `/models` endpoint from the existing provider config.
## Validation
- `go test ./internal/entity/models -run TestXAI -count=1`
- `go test ./internal/entity -count=1`
- `git diff HEAD~1..HEAD --check`
## Notes
- `go test ./internal/entity/models -count=1` currently fails in
unchanged Astraflow coverage: `TestAstraflowEmbedReturnsNoSuchMethod`
panics before reaching any xAI assertions.
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.25.5 to v0.25.6
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Implement provider `OrcaRouter`.
**The following functionalities are now supported:**
**Cohere:**
- [x] Chat / Think Chat / Stream Chat / Stream Think Chat
- [x] Model listing
- [x] TTS
- [ ] Balance
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Closes #15040.
ModelScope was listed unchecked in the Go-rewrite tracker #14736 and
already had an llm_factories.json entry (tags: LLM) but no Go driver, so
the new Go API server could not route ModelScope instances. The Python
side has supported it through the OpenAI-compatible base at
rag/llm/chat_model.py:618 (ModelScopeChat), which requires a
user-supplied base URL and appends /v1.
This adds:
- internal/entity/models/modelscope.go: self-hosted OpenAI-compatible
driver with chat (sync + SSE stream with idle-timeout cancellation),
list_models, and check_connection. Auth header is optional, matching the
xinference pattern, so deployments without auth and auth-enabled
deployments both work. Base URL is normalized so users can configure
either the root endpoint or the /v1 endpoint.
- internal/entity/models/modelscope_test.go: 12 tests covering name, URL
normalization, factory routing, chat happy path / auth header /
reasoning_content extraction, stream happy path / stream=false rejection
/ idle cancellation, list_models + check_connection, missing-base-URL
clear error, and the no-such-method sentinels.
- conf/models/modelscope.json: shipped config (class: "local",
url_suffix v1/chat/completions and v1/models).
- internal/entity/models/factory.go: case "modelscope" →
ModelScopeModel.
- internal/service/llm.go: ModelScope added to the selfDeployed map
alongside Ollama, Xinference, LocalAI, LM-Studio, GPUStack — the Python
side requires user-supplied URL with no default, so the Go side
classifies it the same way.
Follow-on issues will add Embed and Rerank, in line with how Novita,
NVIDIA, TogetherAI, and other providers landed method-by-method.
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR adds HuaweiCloud provider integration in RAGFlow.
Supported capabilities:
- [x] Chat / Think Chat / Stream Chat / Stream Think Chat
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
Verified examples from the CLI:
```
check instance 'test' from 'HuaweiCloud';
chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';
think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';
stream chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';
stream think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';
embed text 'what is rag' 'who are you' with 'bge-m3@test@HuaweiCloud' dimension 1024;
rerank query 'what is rag' document 'rag is retrieval augmented generation' 'rag need llm' 'famous rag project includes ragflow' with 'bge-reranker-v2-m3@test@HuaweiCloud' top 3;
list supported models from 'HuaweiCloud' 'test';
LIST MODELS FROM 'HuaweiCloud' 'test';
```
### Type of change
- [x] New Feature
- [x] Provider integration
## Summary
- Wire the Go TokenHub provider through the model factory.
- Harden TokenHub request handling for chat, streaming, embeddings, and
model listing.
- Add focused TokenHub unit coverage for factory wiring and provider
behavior.
## Notes
- Refs #14736.
- Follows up #15159.
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Closes #15165.
Implements the AWS Bedrock model provider for the Go API server, tracked
under #14736. Adds Converse + Converse-Stream chat and foundation-model
listing, with SigV4 signing over a hand-rolled `net/http` path that
matches the established pattern in `internal/entity/models/` (no new
direct `go.mod` deps).
## Linked tracker
Tracked under #14736 (Implement model providers of RAGFlow API server in
Go). Closes #15165.
### What problem does this PR solve?
Extend the RESTful API test suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
### What problem does this PR solve?
Fix: metadata issue
- The dataset's built-in metadata is now active, but it appears to be
disabled in the individual file configuration.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Creating or updating an agent via `POST /api/v1/agents` and `PUT
/api/v1/agents/{agent_id}` did not persist `canvas_type` because the
handler `req` dict never assigned the field before
`UserCanvasService.save` / `update_by_id`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Cursor <cursoragent@cursor.com>
### What problem does this PR solve?
The Go DeepInfra driver returned a stub error for `Rerank()` even though
DeepInfra serves reranker models at `POST /v1/inference/{model}` with
`query`, `documents`, and a `scores[]` response.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Cursor <cursoragent@cursor.com>
### What problem does this PR solve?
Add a Go driver for **FuturMix** (https://futurmix.ai/docs), one of the
unchecked providers on the umbrella tracking issue #14736. FuturMix is
documented as an "OpenAI-compatible API" aggregator over Claude / GPT /
Gemini / DeepSeek (~22 models per their `/models` page).
Until this PR, a tenant who configured `futurmix` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.
---------
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Closes #15167.
The Baidu Go provider advertises OCR support through
`paddleocr-vl-0.9b`, but `BaiduModel.OCRFile` dereferenced required
inputs before validating them. Calling OCR with a missing API config,
API key, or model name could panic instead of returning a normal error.
This PR adds explicit input validation for those required values.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Closes #15142.
ZhipuAI lists `glm-ocr` as an OCR model, but the Go driver still
returned `no such method` from `OCRFile`. This wires the advertised
model to Z.AI's documented `layout_parsing` endpoint and returns the
`md_results` Markdown output through the existing `OCRFileResponse.Text`
field.
This PR also adds focused tests for URL input, raw file-content base64
input, and validation errors.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Test
- [x] `go test -vet=off ./internal/entity/models -run
'TestZhipuAIOCRFile'`
### What problem does this PR solve?
Extend the RESTful API test suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
### What problem does this PR solve?
Fixes #15175: saving parser configs on the dataset configuration page
does not work.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Fixes the confirmed asyncio anti-patterns from #14755. Only the three
verified bugs are addressed; patterns already correctly using
`asyncio.new_event_loop()` in a fresh thread are left untouched.
### Changes
**`api/apps/restful_apis/tenant_api.py` — fire-and-forget
`send_invite_email`**
`asyncio.create_task()` was called without storing the `Task` reference.
The event loop keeps only a weak reference to its tasks, so an unfinished
task can be garbage-collected mid-execution, silently dropping the work
and swallowing exceptions. Fixed by storing the task in a module-level
`_background_tasks: set[Task]` and registering a done callback
(`add_done_callback`) that discards it on completion, which is the
standard Python idiom for safe background tasks.
**`api/apps/restful_apis/agent_api.py` — fire-and-forget
`background_run`**
Same root cause in the webhook "Immediately" execution path. Same fix
applied.
**`rag/llm/chat_model.py` (`LocalLLM._stream_response`) —
`asyncio.get_event_loop()` on running loop**
`asyncio.get_event_loop()` returns Quart's running event loop when
called from an async context.
Calling `loop.run_until_complete()` on it raises `RuntimeError`.
Replaced with `asyncio.new_event_loop()` so the generator
uses a dedicated fresh loop, closed in a `finally` block.
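
A rough sketch of the dedicated-loop pattern, under the assumption that the generator is consumed from synchronous code in a thread that has no running event loop (a fresh loop cannot be driven from inside another running loop). `fake_token_stream`, `_next_item`, and `stream_sync` are illustrative names, not the ones used in `LocalLLM._stream_response`.

```python
import asyncio
from typing import AsyncIterator, Iterator


async def fake_token_stream() -> AsyncIterator[str]:
    # Hypothetical stand-in for an async streaming LLM response.
    for token in ("Hello", ", ", "world"):
        await asyncio.sleep(0)
        yield token


async def _next_item(agen: AsyncIterator[str]) -> str:
    # Thin coroutine wrapper so the loop always receives a real coroutine.
    return await agen.__anext__()


def stream_sync(agen: AsyncIterator[str]) -> Iterator[str]:
    """Drain an async generator from synchronous code on a private loop."""
    loop = asyncio.new_event_loop()
    try:
        while True:
            try:
                yield loop.run_until_complete(_next_item(agen))
            except StopAsyncIteration:
                break
    finally:
        # Always release the private loop, even if the consumer stops early.
        loop.close()
```

For example, `"".join(stream_sync(fake_token_stream()))` collects the three tokens without touching any loop owned by the web framework.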
## What was NOT changed
- `llm_service._sync_from_async_stream` and
`evaluation_service._sync_from_async_gen`: both already correctly use
`asyncio.new_event_loop()` inside a fresh thread.
- `llm_service._run_coroutine_sync`: only caller is `rag/app/resume.py`
(sync context), so `thread.join()` is correct there.
- `requests` in agent tools: sync methods dispatched through thread
pools; httpx migration is a separate, larger refactor.
## Test plan
- [ ] Invite a team member and confirm the email is sent with no task
warnings in logs.
- [ ] Trigger a webhook agent in "Immediately" mode; confirm canvas
state is persisted after background run.
- [ ] Verify `LocalLLM` (Jina backend) chat and streaming work
end-to-end.
Closes #14755
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
Fix: The prompt variable for the agent operator disappears after input.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Extend the RESTful API test suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
### What problem does this PR solve?
Feat: Enable agent messages to display base64 images
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The **`Ollama`** implementation in #14580 was incorrect.
This PR rewrites the **`Ollama`** driver.
**Verified from CLI**
```
# Embed
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'nomic-embed-text:latest@test12@ollama' dimension 1024;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 768 | 0 |
| 768 | 1 |
+-----------+-------+
# Chat
RAGFlow(user)> think chat with 'qwen3:0.6b@test12@ollama' message 'who r u'
Thinking: Okay, the user asked, "Who r u?" I need to respond appropriately. First, I should acknowledge their question. Since I'm an AI, I don't have a physical form, but I can confirm that I'm a large language model. I should keep the response friendly and offer help. Let me make sure I'm not making up any information and that the response is natural. Also, I should check for any typos and ensure clarity. Alright, that should cover it.
Answer: I'm an AI language model, and I don't have a physical form. However, I can tell you that I'm designed to assist with questions and tasks. How can I help you today?
Time: 2.914285
RAGFlow(user)> stream think chat with 'qwen3:0.6b@test12@ollama' message 'who r u'
Thinking: , the user asked, "Who are you?" I need to respond appropriately. Since I'm an AI assistant, I should mention that I don't have a physical form or a mind. I should also clarify that I can help with various tasks like answering questions or providing information. It's important to keep the response friendly and informative while maintaining the correct tone.
Answer: don't have a physical form or a mind, but I'm here to help with your questions or tasks! What can I do for you today?
Time: 1.740047
# ListModels
RAGFlow(user)> list supported models from 'ollama' 'test12'
+-------------------------+
| model_name |
+-------------------------+
| nomic-embed-text:latest |
| qwen3:0.6b |
+-------------------------+
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Fix: Replace the red highlight at the top of the PDF document with
yellow.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Follow-up to PR #15146 to resolve the backward-compatibility issue.
1. `/agents/<attachment_id>/download` ->
`/agents/attachments/<attachment_id>/download`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
This change fixes ingestion quality issues where MinerU parser output
may contain HTML fragments (for example, table-related tags like `<tr>`,
`<td>`, `<br>`), which were previously passed directly into
chunking/tokenization and degraded chunk quality.
The fix adds a sanitization step in the MinerU parser path so parsed
sections are normalized to clean text before chunking.
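
As an illustration of the kind of sanitization step described here, the sketch below turns row and line-break tags into newlines, cell boundaries into spaces, and drops any remaining tags. The function name `sanitize_section` and the exact rules are assumptions for this example; the actual MinerU parser path may normalize differently.

```python
import html
import re

# Tags that end a visual line: <br>, </tr>, </p>.
_BREAK_TAGS = re.compile(r"<\s*(?:br\s*/?|/tr|/p)\s*>", re.IGNORECASE)
# Closing table cells become a single space so cell text does not run together.
_CELL_END = re.compile(r"<\s*/t[dh]\s*>", re.IGNORECASE)
# Anything else that looks like a tag is removed. Note: this naive pattern
# would also eat literal text such as "a < b and c > d".
_ANY_TAG = re.compile(r"<[^>]+>")


def sanitize_section(text: str) -> str:
    """Reduce an HTML-flavoured parser section to plain text (hypothetical helper)."""
    text = _BREAK_TAGS.sub("\n", text)
    text = _CELL_END.sub(" ", text)
    text = _ANY_TAG.sub("", text)
    # Decode entities such as &amp; and treat non-breaking spaces as spaces.
    text = html.unescape(text).replace("\xa0", " ")
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Running the cleanup before chunking keeps tag names out of the token stream while preserving the row structure of tables as separate lines.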
## Change Type (select all)
- [x] Bug fix
- [x] Ingestion pipeline improvement
- [x] Parser/chunking quality fix
## Related Issue
- https://github.com/infiniflow/ragflow/issues/14831
### What problem does this PR solve?
This PR improves the table upload flow for CSV/Excel files by allowing
table column role configuration at upload time.
Previously, users had to:
1. Upload and parse a table file.
2. Open parser settings and manually set table column roles.
3. Re-parse the file for the roles to take effect.
This was inefficient and required an unnecessary second parse.
With this change:
1. When the knowledge base uses table parsing, the upload dialog
extracts CSV/Excel headers client-side.
2. Users can choose Auto mode or Manual mode.
3. In Manual mode, users can assign per-column roles before upload.
4. The selected parser config is sent with the upload request and
applied server-side during document creation.
Result: configured table column roles are applied from the first parse.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Declare doc_id, filename, mime_type, and size as separate outputs on the
Document Generation component so downstream nodes (e.g., the Code
component) can consume them via the variable picker. The existing
download JSON blob is preserved unchanged for the Message component's
download-chip rendering.
### What problem does this PR solve?
The Document Generation component previously exposed only a single
`download` output —
a JSON-encoded blob containing the file's `doc_id`, `filename`,
`mime_type`, `size`,
and base64 payload. On top of that, the variable picker actively hides
this `download`
entry from every consumer except the Message component (because the
embedded base64 is
too heavy to splat into arbitrary downstream nodes).
The combined effect: users wiring the Doc Generator's output into a Code
component had no way to retrieve basic file info such as `filename` or
`doc_id` from the picker, blocking workflows that need to post-process
the generated file (e.g., registering it elsewhere, custom delivery,
follow-up API calls).
This PR declares `doc_id`, `filename`, `mime_type`, and `size` as
**discrete outputs**
on the Document Generation component, alongside the existing `download`
blob. The new
fields:
- Appear in the variable picker for **all** downstream nodes, including
the Code
component, so users can bind them directly to script arguments.
- Are cheap scalars only — no base64 payload leaks into other
components.
- Leave the existing `download` JSON blob completely untouched, so the
Message
component's download-chip rendering (which parses that blob via
`_is_download_info`)
keeps working with no behavior change.
Changes:
- `agent/component/docs_generator.py` — declare the four new outputs in
`DocGeneratorParam` and emit them via `set_output(...)` in `_invoke`.
- `web/src/pages/agent/constant/index.tsx` — extend
`initialDocGeneratorValues.outputs`
with the new keys.
- `web/src/pages/agent/form/doc-generator-form/index.tsx` — mirror the
new outputs in
the zod schema so the form is valid.
No changes needed to the picker's existing `download`-hiding filter — it
matches only
on the literal output name `download`, so the new metadata entries fall
through
naturally.
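
The interaction between the new scalar outputs and the name-based `download` filter can be modelled roughly as below. Everything in this sketch is a hypothetical stand-in: `DocGeneratorSketch`, `picker_entries`, and the `base64` key inside the blob are assumptions, and the real filter lives in the web frontend rather than in Python.

```python
import base64
import json


class DocGeneratorSketch:
    """Hypothetical stand-in for the component; not the real class."""

    def __init__(self) -> None:
        self.outputs: dict[str, object] = {}

    def set_output(self, key: str, value: object) -> None:
        self.outputs[key] = value

    def invoke(self, doc_id: str, filename: str, mime_type: str, payload: bytes) -> None:
        size = len(payload)
        # Existing blob, kept unchanged for the Message component's download chip.
        # The "base64" key name is an assumption made for this sketch.
        self.set_output("download", json.dumps({
            "doc_id": doc_id,
            "filename": filename,
            "mime_type": mime_type,
            "size": size,
            "base64": base64.b64encode(payload).decode("ascii"),
        }))
        # New discrete outputs: cheap scalars only, no payload.
        self.set_output("doc_id", doc_id)
        self.set_output("filename", filename)
        self.set_output("mime_type", mime_type)
        self.set_output("size", size)


def picker_entries(outputs: dict[str, object], consumer: str) -> list[str]:
    """Hide the heavy `download` entry from every consumer except Message."""
    return [name for name in outputs if name != "download" or consumer == "Message"]
```

Because the filter matches only the literal name `download`, a Code consumer sees the four scalar entries and nothing else.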
Reported in: https://github.com/infiniflow/ragflow/issues/14461.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Implement embed for Tencent Hunyuan
**Verified from CLI**
```
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'hunyuan-embedding@test1@hunyuan' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 1024 | 0 |
| 1024 | 1 |
+-----------+-------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
1. Fix `/chat/completions` to send only the latest message
2. Allow chat with `stream=False`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Go: implement provider: PaddleOCR_Local
**Verified from CLI**
```
RAGFlow(user)> ocr with 'PaddleOCR-VL@test@paddleocr_local' file './internal/test1.jpg'
+----------------------+
| text |
+----------------------+
| ## Parallel to these |
+----------------------+
```
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [X] Refactoring
## Summary
- Adds a `Hunyuan` Go driver so the new API server can route Tencent
Hunyuan chat instances (registered in `conf/llm_factories.json:3830` as
`Tencent Hunyuan`). Follows the same SaaS-driver shape used for
Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and
LongCat.
Closes #15087
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: `/chat/completions` ignored `conversation_id`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Closes #15102.
OpenAI's Go provider config advertises `whisper-1` as ASR and `tts-1` as
TTS, but the Go driver returned `openai, no such method` for both audio
paths and did not define `url_suffix.asr` / `url_suffix.tts`.
This PR:
- adds OpenAI audio URL suffixes for `audio/transcriptions` and
`audio/speech`
- implements non-streaming `TranscribeAudio` using multipart form
uploads
- implements non-streaming `AudioSpeech` using the OpenAI speech JSON
request shape
- keeps streaming TTS explicitly unsupported instead of sending binary
audio through the text SSE sender
- adds focused tests for config coverage, ASR/TTS request shape,
required TTS voice validation, and unsupported streaming TTS
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: `/openai/<chat_id>/chat/completions` ignored `session_id`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Go: implement reasoning_chat, TTS, ASR for Groq
**Verify from CLI**
```
RAGFlow(user)> think chat with 'qwen/qwen3-32b@test@groq' message 'who r u'
Thinking: Okay, the user asked, who r u. I need to determine what the user is asking. They may be asking about my identity. I should introduce my name and basic functions. The user might want to know what I can do, so I should list some common use cases, such as answering questions, creating writing, coding, and expressing opinions. The user may be curious about how they can interact with me, so they can be advised to ask any questions or provide instructions. Keep your answers conversational, avoid overly technical terms, keep answers concise, and encourage further interaction. Check if there's any ambiguity in the answer and make sure it's accurate and meets the user's needs. Also consider if there are other aspects the user may be interested in, such as my training data or performance. But since the question is basic, I'll focus on the essentials first and invite the user to ask more. In summary, respond to the user's questions by introducing yourself, your functions, and encouraging further interaction.
Answer: Hello! I'm Qwen. I am a large-scale language model developed by Tongyi Lab, designed to assist you in various ways, such as answering questions, creating text, logical reasoning, programming, and more. I aim to provide clear, accurate, and helpful information and support. How can I assist you today? Feel free to ask any questions or give me tasks! 😊
Time: 2.199908
RAGFlow(user)> stream think chat with 'openai/gpt-oss-20b@test@groq' message 'who r u'
Thinking: to respond politely.
Answer: ’m ChatGPT—an AI language model created by OpenAI. I’m here to answer questions, offer explanations, and help with a wide range of topics. How can I assist you today?
RAGFlow(user)> tts with 'canopylabs/orpheus-arabic-saudi@test@groq' text 'hello? show yourself' play format 'wav' param '{"voice": "fahad"}'
SUCCESS
RAGFlow(user)> asr with 'whisper-large-v3-turbo@test@groq' audio './internal/test.wav' param '{"language": "en"}'
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired |
+----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #15088.
Adds Groq support to the Go model-provider layer so Groq instances can
be routed through the Go API server with the same OpenAI-compatible
chat, streaming, model listing, and connection-check flow used by other
SaaS providers.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
- Added a Groq Go model driver.
- Added the Groq provider catalog and default OpenAI-compatible API URL.
- Registered Groq in the model factory.
- Added focused provider tests.
## What changed
- Implemented chat completions, SSE streaming, ListModels, and
CheckConnection for Groq.
- Covered request shape, stream termination, reasoning fallback, model
listing, custom base URLs, safe transport setup, and unsupported
methods.
- Kept the provider catalog scoped to current Groq chat-capable model
IDs.
- Cleaned up pre-existing Go model package validation blockers so the
package can be tested normally with vet enabled.
## Why
The existing Python/provider catalog path includes Groq, but the Go
model-provider layer did not have a Groq driver, so the Go API server
could not instantiate or use Groq as requested in #15088.
## Notes
The model package now validates without disabling vet.
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Move the agent attachment download API to the correct route and update
frontend callers.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Notes
- Move the attachment download endpoint from document routes to agent
routes.
- Update frontend download callers to use the agent attachment endpoint.
- Reuse the shared file response header helper instead of duplicating it
in `agent_api.py`.
## Summary
- Adds a `TokenPony` Go driver so the new API server can route TokenPony
chat instances, matching the existing Python `TokenPonyChat`
(`rag/llm/chat_model.py:1210`). Follows the same SaaS-driver shape used
for Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage,
and LongCat.
Closes #15086
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Implements the TODO in `evaluation_service.py`: **Track token usage** in
evaluation results.
## Changes
- **Import** `num_tokens_from_string` from `common.token_utils`
- **Prompt tokens**: Use the full prompt returned by `async_chat` when
available (includes system prompt + knowledge base + query), otherwise
fall back to the question token count
- **Completion tokens**: Count tokens in the generated answer
- **Storage**: Store `token_usage` as `{prompt_tokens,
completion_tokens, total_tokens}` in each `EvaluationResult` instead of
`None`
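
A minimal sketch of the accounting described in the list above. The tokenizer is injected as a callable so the example stays self-contained; in RAGFlow it would be `num_tokens_from_string`, while `build_token_usage` and `whitespace_tokens` are hypothetical names used only here.

```python
from typing import Callable, Optional


def build_token_usage(
    question: str,
    answer: str,
    full_prompt: Optional[str],
    count_tokens: Callable[[str], int],
) -> dict[str, int]:
    """Approximate token usage for one evaluation case (hypothetical helper)."""
    # Prefer the full prompt when the chat call returned one; otherwise
    # fall back to counting only the question.
    prompt_text = full_prompt if full_prompt else question
    prompt_tokens = count_tokens(prompt_text)
    completion_tokens = count_tokens(answer)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


def whitespace_tokens(text: str) -> int:
    # Crude stand-in tokenizer for the demo; not tiktoken.
    return len(text.split())
```

The counts are approximations by design: they reflect what the tokenizer sees locally, not what a provider bills.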
## Why
The evaluation pipeline previously saved `token_usage: None` for every
result. This change allows downstream consumers (e.g. evaluation
dashboards, cost tracking) to see approximate token usage per test case
using the same tokenizer (tiktoken cl100k_base) used elsewhere in
RAGFlow.
## Testing
- No new tests added; existing evaluation flow unchanged
- Token counting uses existing `num_tokens_from_string` utility
---------
Co-authored-by: kiannidev <kiannidev@users.noreply.github.com>
### What problem does this PR solve?
Fixes #15066
OpenRouter now exposes an official speech-to-text endpoint at `POST
/api/v1/audio/transcriptions`, but the Go model driver still returned
`openrouter, no such method` from `TranscribeAudio`. This left
OpenRouter ASR models unavailable through the Go API server even though
the provider already has OpenRouter audio support for TTS.
Related provider-tracking context: #14736
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The agent API currently does not pass chat_template_kwargs to the
underlying LLM call path, so clients cannot control template-level model
behavior (such as thinking-mode toggles) when invoking
/agents/chat/completion. This PR adds passthrough support for
chat_template_kwargs across agent execution flows (session and
non-session, streaming and non-streaming) by propagating it through
canvas runtime state and into LLM invocation kwargs. This addresses the
feature gap raised in [Issue
#14182](https://github.com/infiniflow/ragflow/issues/14182).
Closes #14182
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Closes #14789
### What problem does this PR solve?
User API endpoints (`login`, `user_profile`, `user_add`,
`forget_reset_password`) were returning full user objects via
`to_json()` / `to_dict()`, which included sensitive fields like
`password` and `access_token` in the response body. This leaks
credentials to the client.
This PR adds a `to_safe_dict()` method on the `User` model that strips
sensitive fields (`password`, `access_token`) and replaces all affected
call sites to use it.
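
The shape of such a method might look like the sketch below. The real `User` is an ORM model, so the dataclass, the `SENSITIVE_FIELDS` constant, and the non-sensitive field names are assumptions for illustration; only `to_safe_dict`, `password`, and `access_token` come from the description above.

```python
from dataclasses import asdict, dataclass

# Fields that must never appear in an API response body.
SENSITIVE_FIELDS = frozenset({"password", "access_token"})


@dataclass
class User:
    """Simplified stand-in for the real ORM model."""

    id: str
    email: str
    nickname: str
    password: str
    access_token: str

    def to_dict(self) -> dict:
        return asdict(self)

    def to_safe_dict(self) -> dict:
        # Start from the full dict and drop the sensitive keys.
        return {k: v for k, v in self.to_dict().items() if k not in SENSITIVE_FIELDS}
```

Keeping the deny-list next to the model means every endpoint that serializes a user shares one definition of what counts as sensitive.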
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Enhance retry and timeout, and adjust the default timeout
2. NER: spacy do not batch chunks
3. Extract `_has_cancel_and_exit`
4. Enhance log messages
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Closes #4310
### What problem does this PR solve?
Issue #4310 requests practical examples for the RAGFlow SDK and HTTP API
to help developers get started faster. The existing `example/sdk/`
folder only contains `dataset_example.py`. This PR fills the remaining
gaps by adding examples for three key API areas not yet covered in
`main` or by other open PRs (#13904, #13284):
- **Chunk management** — add, list, update, delete, and retrieve chunks
within a dataset
- **Chat assistant** — create a chat assistant, open a session, send
messages (streaming and non-streaming), and clean up
- **Retrieval** — perform semantic retrieval across one or multiple
datasets
### Type of change
- [x] Documentation Update
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #14865
`download_img` in `common/misc_utils.py` is used for OAuth avatar URLs.
The previous implementation called `async_request` from
`common.http_client`, which followed redirects without re-validating
each hop and did not apply the same SSRF protections as this path needs.
That made it possible to reach non-public or disallowed targets (for
example via redirects or unsafe URLs) when fetching avatars.
This change replaces that flow with an explicit, bounded fetch: each URL
(including every redirect target) is checked with
`common.ssrf_guard.assert_url_is_safe`, DNS is pinned with
`pin_dns_global`, `httpx` streams the body with `follow_redirects=False`
and a manual redirect loop (capped by
`RAGFLOW_OAUTH_AVATAR_MAX_REDIRECTS`), and total response size is capped
(`RAGFLOW_OAUTH_AVATAR_MAX_BYTES`). Timeouts, proxy, and user agent
align with `HTTP_CLIENT_*` env vars without importing `http_client`, so
lightweight tests stay simple.
Unit tests cover empty/None URLs, loopback, cloud metadata-style
addresses, and disallowed schemes so SSRF regressions are caught early.
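The bounded fetch described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the code in `common/misc_utils.py`: `check_url` is a hypothetical stand-in for `assert_url_is_safe` that only inspects the scheme and literal IP hosts, `fetch_once` is an injected callable standing in for an `httpx` request made with `follow_redirects=False`, and the two constants are placeholder values for the `RAGFLOW_OAUTH_AVATAR_*` settings.

```python
import ipaddress
from urllib.parse import urljoin, urlsplit

MAX_REDIRECTS = 3        # placeholder for RAGFLOW_OAUTH_AVATAR_MAX_REDIRECTS
MAX_BYTES = 1024 * 1024  # placeholder for RAGFLOW_OAUTH_AVATAR_MAX_BYTES


def check_url(url):
    """Hypothetical stand-in for assert_url_is_safe (scheme + literal IPs only)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"disallowed URL: {url!r}")
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        return  # a hostname; a real guard would resolve it and pin DNS here
    if not ip.is_global:
        raise ValueError(f"non-public address: {ip}")


def bounded_fetch(url, fetch_once):
    """Follow redirects by hand, re-checking every hop before requesting it.

    fetch_once(url) -> (status, location, body) is injected so the loop can
    be exercised without a network.
    """
    for _ in range(MAX_REDIRECTS + 1):
        check_url(url)
        status, location, body = fetch_once(url)
        if status in (301, 302, 303, 307, 308) and location:
            url = urljoin(url, location)  # relative redirects are resolved
            continue
        # The PR caps size while streaming; this sketch checks after the fact.
        if len(body) > MAX_BYTES:
            raise ValueError("response too large")
        return body
    raise ValueError("too many redirects")
```

Because the check runs at the top of every iteration, a public URL that redirects to a loopback or metadata address is rejected before the second request is made.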
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR implements ASR and TTS support for the ZhipuAI Go driver.
The ZhipuAI model config already advertises `glm-asr-2512` as an ASR
model, but the Go driver returned `zhipu, no such method` from
`TranscribeAudio`. This adds the documented audio transcription endpoint
suffix and sends multipart transcription requests with `model`,
`stream=false`, and `file` fields.
Per maintainer review, this also adds the ZhipuAI TTS endpoint suffix
and implements `AudioSpeech` / `AudioSpeechWithSender` for `glm-tts`.
Closes #15133
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #15089.
Adds PPIO support to the Go model-provider layer so PPIO instances can
be routed through the Go API server with the same OpenAI-compatible
chat, streaming, model listing, and connection-check flow used by other
SaaS providers.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
- Added a PPIO Go model driver.
- Added the PPIO provider catalog and default OpenAI-compatible API URL.
- Registered PPIO in the model factory.
- Added focused provider and provider-manager tests.
## What changed
- Implemented chat completions, SSE streaming, ListModels, and
CheckConnection for PPIO.
- Covered request shape, stream termination, reasoning fallback, model
listing, custom base URLs, safe transport setup, unsupported methods,
and provider config loading.
- Kept the provider catalog aligned with the existing RAGFlow PPIO
factory model set.
- Cleaned up pre-existing Go model package validation blockers so the
scoped provider tests can run normally with vet enabled.
## Why
The existing Python/provider catalog path includes PPIO, but the Go
model-provider layer did not have a PPIO driver, so the Go API server
could not instantiate or use PPIO as requested in #15089.
### What problem does this PR solve?
Implement rerank, ASR, and TTS for TogetherAI.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Update the Python version to 3.13
2. Upgrade ormsgpack to 1.6.0
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Implement ASR and TTS for Xinference.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
## Summary
Fixes 10 unguarded `response.choices[0]` accesses that cause
`IndexError` or `AttributeError` when the LLM returns an empty `choices`
list — the scenario described in #14711.
- `rag/llm/cv_model.py`
- `rag/llm/chat_model.py`
Each access site is now guarded with:
```python
if not response.choices:
    raise ValueError("LLM returned empty response")
```
## Verification
Detected and verified by [pact](https://github.com/qizwiz/pact) — a
sheaf-cohomological LLM contract checker using Z3 as a local theory
solver.
**pact sheaf-cohomological proof status after fix:**
| File | Ȟ¹ (after) | Z3 |
|------|-----------|-----|
| `rag/llm/cv_model.py` | 0 | UNSAT ✓ |
| `rag/llm/chat_model.py` | 0 | UNSAT ✓ |
All access sites proven safe (Z3 UNSAT certificate).
The checker was also used to verify the autogen streaming-None fix in
[microsoft/autogen#7711](https://github.com/microsoft/autogen/pull/7711).
## Test plan
- [ ] Existing test suite passes
- [ ] Manually test with a provider that returns empty `choices` under
load (e.g. Vertex AI)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Signed-off-by: Jonathan Hill <jonathan.f.hill@gmail.com>
`GET /agents/<agent_id>/sessions/<session_id>` crashed with
`AttributeError: 'NoneType' object has no attribute 'to_dict'` when the
session lookup failed: `_, conv =
API4ConversationService.get_by_id(...)` returned `(False, None)`, then
`conv.to_dict()` was called unconditionally.
This is reachable in multi-instance deployments: the session row may not
yet be visible on the node servicing the immediate follow-up GET after a
session is created on a different node.
Add the same `if not exists` guard already used by every other call site
of `API4ConversationService.get_by_id` (see agent_api.py:1147,
sdk/session.py:179, conversation_service.py:248, canvas_service.py:323).
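A minimal sketch of the guard pattern, assuming a lookup that returns an `(exists, row)` pair as described. `FakeConversation`, `get_session`, and the error payload are hypothetical illustrations, not the handler's actual code or response format.

```python
class FakeConversation:
    """Hypothetical stand-in for the row returned by get_by_id."""

    def __init__(self, session_id):
        self.session_id = session_id

    def to_dict(self):
        return {"id": self.session_id}


def get_session(lookup, session_id):
    """lookup(session_id) -> (exists, conv); a miss yields (False, None)."""
    exists, conv = lookup(session_id)
    if not exists:
        # Return an error payload instead of calling to_dict() on None.
        return {"ok": False, "message": "Session not found!"}
    return {"ok": True, "data": conv.to_dict()}
```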
Closes #14989
Replace the RuntimeError with a warning + first-address fallback so a
single email whose From header contains multiple addresses no longer
crashes the entire IMAP sync task. Also add regression tests covering:
- #14963: RFC 5322 quoted display names with commas (e.g. "Schlüter,
Sabine" <s@x>) parsed as one address, not two.
- #14964: multi-address headers warn instead of raising.
Closes #14964
Refs #14963
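The warn-and-fall-back behavior can be sketched with the standard library's `email.utils.getaddresses`. This is an illustration under assumptions, not the connector's code: `first_from_address` is a hypothetical helper name, and the addresses are made-up examples.

```python
import logging
from email.utils import getaddresses


def first_from_address(header_value):
    """Return the first (display_name, address) pair in a From header.

    Extra addresses trigger a warning rather than an exception, so one odd
    message does not abort the whole sync.
    """
    pairs = [(name, addr) for name, addr in getaddresses([header_value]) if addr]
    if not pairs:
        return None
    if len(pairs) > 1:
        logging.warning("From header has %d addresses; using the first", len(pairs))
    return pairs[0]
```

A quoted display name containing a comma, such as `"Schlüter, Sabine" <s@x.example>`, is parsed as a single address, while `a@x.example, b@y.example` yields a warning and the first address.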
## Summary
- Bump pinned nginx in `Dockerfile` from `1.29.5-1~noble` (vulnerable)
to `1.31.0-1~noble` to remediate **CVE-2026-42945**.
## Root Cause
`Dockerfile:58` pinned `ARG NGINX_VERSION=1.29.5-1~noble`. Per the
official nginx security advisory, **CVE-2026-42945** is a buffer
overflow in `ngx_http_rewrite_module` triggered via the `rewrite` and
`set` directives, affecting nginx **0.6.27 through 1.30.0**. `1.29.5`
falls inside that range, so the shipped image is vulnerable.
References:
- nginx security advisories:
https://nginx.org/en/security_advisories.html
- Vendor advisory: https://my.f5.com/manage/s/article/K000161019
- Fixed versions: `1.31.0` (mainline) and `1.30.1` (stable)
## Fix
Single-line change in `Dockerfile:58`:
```diff
-ARG NGINX_VERSION=1.29.5-1~noble
+ARG NGINX_VERSION=1.31.0-1~noble
```
### What problem does this PR solve?
Fixes #14997.
RAPTOR builds on the Infinity backend have been broken since v0.25.2
introduced the `extra` field in code (`rag/svr/task_executor.py:1011`)
without declaring it in `conf/infinity_mapping.json`. Every RAPTOR job
fails with:
```
infinity.common.InfinityException: (3013, 'Fail to bind the expression: extra@src/planner/expression_binder_impl.cpp:99')
```
The auto-migration in
`common/doc_store/infinity_conn_base.py:_migrate_db()` adds any columns
it finds in the mapping JSON to existing tables — so the only thing
standing between users and a working RAPTOR build is that one missing
declaration. OceanBase, ES, and OpenSearch were unaffected because they
store `extra` as a native JSON type; only Infinity (which has a strict
`varchar`/`integer`/`float` schema) needed the addition.
### The fix
Two-part change:
1. **`conf/infinity_mapping.json`**: declare `"extra": {"type":
"varchar", "default": ""}`. On next startup, `_migrate_db()` adds the
column to all existing chunk tables — no manual DDL needed for upgrading
installations.
2. **`rag/utils/infinity_conn.py` `insert()`**: serialize the `extra`
dict to a JSON string at write time, since Infinity's `varchar` can't
store a Python dict directly. Modelled on the existing `chunk_data`
handling a few lines above.
The read path (`rag/utils/raptor_utils.py:_as_extra_dict`) already
normalises both dict and JSON-string inputs, so no read-side change is
needed. Other backends are untouched — `task_executor.py` still writes
the dict, and the OceanBase/ES/OpenSearch insert paths handle dicts
natively.
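The write-side serialization and the tolerant read side can be sketched as below. Both function names are hypothetical and the row layout is simplified; the sketch only shows why a dict must become a JSON string before it reaches a `varchar` column and how a reader can accept either form.

```python
import json


def prepare_row_for_varchar(row):
    """Return a copy of row whose 'extra' dict is serialized to a JSON string."""
    out = dict(row)
    extra = out.get("extra")
    if isinstance(extra, dict):
        out["extra"] = json.dumps(extra, ensure_ascii=False)
    return out


def as_extra_dict(value):
    """Accept either a dict or a JSON string and always return a dict."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str) and value:
        try:
            parsed = json.loads(value)
        except json.JSONDecodeError:
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}
```

With a reader of this shape, the empty-string column default and rows written by dict-native backends both normalize to a dict.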
### Verification
Tested on a v0.25.4 deployment with the Infinity backend by applying the
same two changes via mounted-volume override:
- Confirmed `_migrate_db()` adds the `extra` column to all pre-existing
chunk tables on startup (column visible via Infinity's
`show_columns()`).
- Triggered RAPTOR builds on four datasets (~21k chunks total) via `POST
/api/v1/datasets/<id>/index?type=raptor`.
- All four progressed past the previously-failing
`get_raptor_chunk_methods()` call into actual entity-extraction and
clustering work without the (3013) error.
- GraphRAG builds (which can trigger the same path indirectly via
`task_executor.py:857`) also progressed cleanly.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
`UpstageModel.ChatStreamlyWithSender` (in the driver merged via #14819)
only extracted `delta.content` from each SSE event. For the `solar-pro3`
reasoning family (and any future Upstage model that follows the same
wire shape), the chain-of-thought is streamed in a **separate
`delta.reasoning` field**, and the driver was silently dropping all of
it.
The non-streaming path already extracts `message.reasoning` into
`ChatResponse.ReasonContent` (added earlier in this PR's history), so
the same model produced **inconsistent behavior** between streaming and
non-streaming: a tenant calling `solar-pro3` with `reasoning_effort:
high` would see the reasoning trace if they used `ChatWithMessages` but
not if they used `ChatStreamlyWithSender`.
### Live evidence
Probed against `api.upstage.ai/v1/chat/completions` with `solar-pro3` +
`reasoning_effort: high` + `stream: true` (8000-token budget so the
reasoning has room to finish):
```
$ curl -sN -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \
-X POST https://api.upstage.ai/v1/chat/completions \
-d '{"model":"solar-pro3","messages":[{"role":"user","content":"Compute 15% of 80."}],
"max_tokens":8000,"stream":true,"reasoning_effort":"high"}'
# across 168 SSE events:
# delta keys seen: [content reasoning role]
# delta.content total len: 121 chars (the visible answer)
# delta.reasoning total len: 159 chars (the chain-of-thought) <- driver dropped this
```
A representative event showing both fields side by side:
```json
data: {"choices":[{"index":0,"delta":{"reasoning":"15% = 0.15."}}]}
data: {"choices":[{"index":0,"delta":{"content":"15% of 80 is "}}]}
```
The 159 chars of reasoning were arriving on the wire and being thrown
away. `solar-pro2` was also probed (625 events); it does **not** emit
`delta.reasoning` — its reasoning is inlined into `delta.content` — so
this change is a no-op for it and for `solar-mini`.
### What this PR includes
- `internal/entity/models/upstage.go`: in the SSE scanner loop, extract
`delta.reasoning` before `delta.content` and forward each non-empty
chunk via the sender's second arg (the existing `reasonContent` channel
the non-stream path already populates).
The ordering contract is documented inline: reasoning chunks within a
single SSE event are emitted before content chunks, so a UI that pipes
both sees the chain-of-thought start before the answer for that token,
matching the wire order Upstage emits.
- `internal/entity/models/upstage_test.go`: three new tests pinning the
new behavior:
- `TestUpstageStreamExtractsReasoningDelta` — reasoning + content
forwarded to the right sender args; one-of invariant per call
- `TestUpstageStreamReasoningChunksArriveBeforeContent` — ordering
pinned within a single SSE event that carries both fields
- `TestUpstageStreamWithoutReasoningStillWorks` — regression net:
non-reasoning models (`solar-mini`, `solar-pro2`) continue to work; the
reason callback never fires
No interface change. No factory change. No config change.
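The extraction order can be illustrated with a short Python sketch of the same logic. The driver itself is Go, so this is not its code: `dispatch_sse_lines` and the two-argument `sender(content, reasoning)` callback are hypothetical, and the `[DONE]` terminator is assumed from the usual OpenAI-compatible convention rather than taken from the PR.

```python
import json


def dispatch_sse_lines(lines, sender):
    """Forward reasoning and content deltas, reasoning first within an event."""
    for line in lines:
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed stream terminator
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            delta = choice.get("delta") or {}
            reasoning = delta.get("reasoning")
            content = delta.get("content")
            # Exactly one argument is set per call (the one-of invariant).
            if reasoning:
                sender(None, reasoning)
            if content:
                sender(content, None)
```

An event without a `reasoning` key never triggers the reasoning branch, which is the behavior the regression test for non-reasoning models pins.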
### How was this tested?
```
$ go test -vet=off -run TestUpstage -count=1 -v ./internal/entity/models/...
... (existing tests 1..9 still pass) ...
=== RUN TestUpstageStreamExtractsReasoningDelta
--- PASS: TestUpstageStreamExtractsReasoningDelta (0.01s)
=== RUN TestUpstageStreamReasoningChunksArriveBeforeContent
--- PASS: TestUpstageStreamReasoningChunksArriveBeforeContent (0.01s)
=== RUN TestUpstageStreamWithoutReasoningStillWorks
--- PASS: TestUpstageStreamWithoutReasoningStillWorks (0.00s)
PASS
ok ragflow/internal/entity/models 0.034s
```
12/12 Upstage tests pass on go 1.25. `go build
./internal/entity/models/...` exits 0.
**Live integration test** (smoke test not committed) — the patched
driver was run directly against `api.upstage.ai/v1` with the same prompt
that produced the curl evidence above:
```
=== RUN TestUpstageStreamReasoningLiveSmoke
[OK] visible content: 50 chunks, 84 chars
[OK] reasoning: 39 chunks, 90 chars
content head 200: "\\(15\\% = \\frac{15}{100}=0.15\\).\n\n\\[\n0.15 \\times 80 = 12.\n\\]\n\n**15 % of 80 is 12.**"
reasoning head 200: "We need to compute 15% of 80. That's 0.15 * 80 = 12. So answer is 12. Provide explanation."
UPSTAGE STREAM REASONING SMOKE PASSED
--- PASS: TestUpstageStreamReasoningLiveSmoke (1.97s)
```
Before this fix, the same call would have produced **0 reasoning
chunks**. The 90 chars of reasoning that the patched driver now surfaces
are the chain-of-thought solar-pro3 emits when reasoning_effort is high.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
`MistralModel.ChatWithMessages` (in the driver merged via #14807)
assumes that `choices[0].message.content` from `/v1/chat/completions` is
always a string and falls through to `return nil, fmt.Errorf("invalid
content format")` on anything else.
That assumption breaks for the **magistral reasoning family**
(`magistral-small-*`, `magistral-medium-*`). When the model needs a
chain-of-thought to answer, Mistral returns `content` as a **structured
array of typed parts**:
```json
"content": [
{"type": "thinking",
"thinking": [{"type": "text", "text": "Combined speed is 150 mph. 300 / 150 = 2 hours."}],
"closed": true},
{"type": "text", "text": "They will meet after **2 hours**."}
]
```
Concretely, this is what the live API returns today (probed against
`api.mistral.ai/v1`):
```
$ curl -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \
-X POST https://api.mistral.ai/v1/chat/completions \
-d '{"model":"magistral-medium-latest",
"messages":[{"role":"user","content":"two trains 60mph and 90mph, 300mi apart, when do they meet? step by step."}],
"max_tokens":1024}'
HTTP 200
{ "choices":[{"message":{
"role":"assistant",
"content":[
{"type":"thinking","thinking":[{"type":"text","text":"Okay, let's see..."}],"closed":true},
{"type":"text","text":"To determine when the two trains meet..."}
]}}] }
```
With the current driver, every call like that returns the generic
`"invalid content format"` error. Trivial prompts that happen to fit in
a string answer still succeed, so the breakage is **non-deterministic
from the tenant's POV**: same model, same provider, sometimes works,
sometimes 500s with no useful error.
A secondary issue: `conf/models/mistral.json` does not include any
magistral model. The picker hid the broken path, which is why this
wasn't caught during #14807's review.
### What this PR includes
- New helper `extractMistralContent(raw interface{}) (answer,
reasonContent string, err error)` in
`internal/entity/models/mistral.go`, which normalizes both shapes
Mistral can return:
- `string` → historical path. `Answer = content`, `ReasonContent = ""`.
Preserves behavior for every non-reasoning model (`mistral-large-*`,
`mistral-small-*`, `ministral-*`, `codestral-*`, `pixtral-*`,
`open-mistral-nemo`).
- `[]interface{}` → walk the parts. Concatenate every `{"type":"text",
"text":...}` part into `Answer`; concatenate the inner text inside every
`{"type":"thinking", "thinking":[...]}` part into `ReasonContent`.
- `ChatWithMessages` now calls the helper instead of doing the raw
`.(string)` cast.
- Unknown part types are **skipped, not failed**. Mistral has been
adding new content variants quickly (audio chunks, citations, etc.);
this driver should not 500 every call when a new part type appears.
- `conf/models/mistral.json`: add `magistral-medium-latest` and
`magistral-small-latest`. Both are visible in `/v1/models` today.
No interface change. No factory change. No new dependencies.
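The normalization rules listed above can be sketched in Python for illustration. The real helper is the Go function `extractMistralContent`; the name `extract_content` is hypothetical, and treating `None` as empty content is an assumption, since the PR text does not say how nil is handled.

```python
def extract_content(raw):
    """Return (answer, reason_content) for a string or a list of typed parts."""
    if raw is None:  # assumption: nil content maps to empty strings
        return "", ""
    if isinstance(raw, str):
        return raw, ""
    if not isinstance(raw, list):
        raise TypeError(f"unsupported content type: {type(raw).__name__}")

    answer, reasoning = [], []
    for part in raw:
        if not isinstance(part, dict):
            continue
        kind = part.get("type")
        if kind == "text":
            answer.append(part.get("text") or "")
        elif kind == "thinking":
            for inner in part.get("thinking") or []:
                if isinstance(inner, dict) and inner.get("type") == "text":
                    reasoning.append(inner.get("text") or "")
        # Any other part type is skipped rather than treated as an error.
    return "".join(answer), "".join(reasoning)
```

Skipping unknown part types keeps the text answer flowing even when the provider introduces a new content variant.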
### How was this tested?
**Unit tests** — 5 new tests in `internal/entity/models/mistral_test.go`
on top of the 27 already shipped via #14807:
- `TestMistralChatHandlesStringContent` — regression net for the
historical path
- `TestMistralChatExtractsReasoningFromStructuredContent` — the fixture
body is a trimmed copy of the actual `magistral-medium-latest` response
captured above; asserts both `Answer` and `ReasonContent` are populated
correctly
- `TestMistralChatHandlesStructuredContentWithoutThinking` —
`magistral-*` with a trivial answer returns a structured shape that has
only a `text` part; `ReasonContent` must stay empty
- `TestMistralChatIgnoresUnknownContentPartTypes` — `audio_url` and
`future_part_type` parts are skipped, `text` parts still flow through
- `TestExtractMistralContent` — table-driven unit coverage of the helper
for string, empty string, nil, empty array, text-only, thinking+text,
unsupported root type
```
$ go test -vet=off -run "TestMistral|TestExtractMistralContent" -count=1 -v ./internal/entity/models/...
=== RUN TestMistralChatHandlesStringContent
--- PASS: TestMistralChatHandlesStringContent (0.00s)
=== RUN TestMistralChatExtractsReasoningFromStructuredContent
--- PASS: TestMistralChatExtractsReasoningFromStructuredContent (0.00s)
=== RUN TestMistralChatHandlesStructuredContentWithoutThinking
--- PASS: TestMistralChatHandlesStructuredContentWithoutThinking (0.00s)
=== RUN TestMistralChatIgnoresUnknownContentPartTypes
--- PASS: TestMistralChatIgnoresUnknownContentPartTypes (0.00s)
=== RUN TestExtractMistralContent
=== RUN TestExtractMistralContent/plain_string
=== RUN TestExtractMistralContent/empty_string
=== RUN TestExtractMistralContent/nil
=== RUN TestExtractMistralContent/empty_array
=== RUN TestExtractMistralContent/text_only
=== RUN TestExtractMistralContent/thinking_then_text
=== RUN TestExtractMistralContent/unknown_root_type
--- PASS: TestExtractMistralContent (0.00s)
PASS
ok ragflow/internal/entity/models 0.046s
```
All 32 Mistral tests pass on go 1.25. `go build
./internal/entity/models/...` exits 0.
**Live integration test** — driver exercised against `api.mistral.ai/v1`
with the patched code:
```
=== RUN TestMistralMagistralSmoke
[OK] "magistral-small-latest" present upstream
[OK] "magistral-medium-latest" present upstream
[OK trivial] Answer="7" ReasonContent=""
[OK reasoning] Answer len=797 head="To determine when the two trains meet, we can follow these steps:\n\n1. **Identify..."
ReasonContent len=1069 head="Okay, let's see. There are two trains, one going 60 mph and the other going 90 mph. They're moving towards each other, s..."
MAGISTRAL SMOKE PASSED
--- PASS: TestMistralMagistralSmoke (18.09s)
PASS
ok ragflow/internal/entity/models 18.112s
```
What the live run proves on the wire:
- `magistral-small-latest` with a trivial prompt still uses the
string-content shape; the regression-net path is exercised against the
real server, not just the mock.
- `magistral-medium-latest` with a reasoning prompt uses the
structured-array shape; the new code path extracts a 1069-character
reasoning trace into `ChatResponse.ReasonContent` and a 797-character
visible answer into `ChatResponse.Answer`. Before this fix, the same
call returned `"invalid content format"` and the caller saw nothing.
The smoke-test file itself is not committed (live tests live outside the
PR diff, same convention used for prior provider PRs).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Problem
The Go server build pipeline (`build.sh` + CMake + CGO bindings) was
tested on Ubuntu only. On macOS arm64 with Homebrew it fails in five
orthogonal places. None of these require platform-specific code paths —
the same source builds on both Linux and Darwin after these fixes.
## Reproduction (before)
```
$ uname -a
Darwin … 25.4.0 arm64
$ brew install cmake pcre2 simde
$ bash build.sh
…
error: 'simde/x86/sse4.1.h' file not found
error: implicit instantiation of undefined template 'std::basic_istringstream<char>'
error: no matching function for call to 'Join'
…
clang: error: no such file or directory: '/usr/local/lib/libpcre2-8.a'
```
## Fix (5 small, orthogonal changes)
### 1. `internal/cpp/CMakeLists.txt`: find Homebrew + libpcre2-8 portably
- Detect Apple platforms via `if(APPLE)`, call `brew --prefix` once, add
`${HOMEBREW_PREFIX}/include` and `${HOMEBREW_PREFIX}/lib`. No effect on
Linux.
- Replace the literal `libpcre2-8.a` link token (which only the Linux
linker finds in `/usr/local/lib` by default) with
`find_library(PCRE2_LIB NAMES pcre2-8 REQUIRED)`. Works on
`/usr/lib/x86_64-linux-gnu` (Debian/Ubuntu), `/usr/local/lib` (Intel Mac
& legacy Linux), `/opt/homebrew/lib` (Apple Silicon).
### 2. `internal/cpp/wordnet_lemmatizer.cpp` + `internal/cpp/rag_analyzer.cpp`: explicit `#include <sstream>`
libstdc++ (Linux) pulls `<sstream>` in transitively via `<fstream>`;
libc++ (Apple Clang) doesn't, so the existing `std::istringstream` /
`std::ostringstream` uses fail to compile on macOS. One-line include in
each file.
### 3. `internal/cpp/rag_analyzer.cpp` — `Join` template overload fix
`Join(tokens, start, tokens.size(), delim)` at line 146 passes `size_t`
to an `int` parameter. C++23 strict mode in Apple Clang refuses the
implicit narrowing and reports the 4-arg overload as a substitution
failure, leaving the call ambiguous between the 3-arg and 4-arg
templates. Fix: explicit `static_cast<int>(tokens.size())`. Behaviour
identical on libstdc++ — the narrowing was always intentional.
### 4. `internal/binding/rag_analyzer.go` — split darwin CGO LDFLAGS
The existing `#cgo darwin LDFLAGS: ... /usr/local/lib/libpcre2-8.a` only
matches Intel Macs. Apple Silicon Homebrew installs to `/opt/homebrew`.
Split into `darwin,arm64` and `darwin,amd64` build constraints with the
right absolute path on each.
### 5. `build.sh` — accept Homebrew path in the pcre2 sanity check
The sanity check looked at two Linux paths only and then fell through to
`sudo apt -y install libpcre2-dev` on failure. Added
`/opt/homebrew/lib/libpcre2-8.a`, and on Darwin failure now exits
cleanly with the right `brew install pcre2` hint instead of trying
`apt`.
## Verified
- `bash build.sh` now completes on macOS arm64 (Apple Silicon, brew 4.x,
cmake 4.x, Apple Clang 17, Go 1.25, pcre2 10.x, simde 0.8.x).
- Produced binaries: `bin/server_main`, `bin/admin_server`,
`bin/ragflow_cli`.
- `bin/server_main` boots, connects MySQL, runs migrations, loads the 64
model provider configs cleanly.
- Still builds on Linux — the CMake additions are inside an `if(APPLE)`
guard, the `find_library` call matches Linux paths too, the build.sh
check still tries `apt` when not on Darwin.
## Out of scope
The Go server itself currently fails at runtime when not pointing at
Elasticsearch (`Failed to initialize doc engine: failed to ping
Elasticsearch`), but that's the placeholder Infinity engine documented
in `internal/engine/README.md` — unrelated to this build patchset.
---
Happy to split this into smaller PRs if you'd prefer (one per file). The
five changes are independent.
## What
- Add Perplexity as a chat and embedding provider backed by its
OpenAI-compatible `/chat/completions` and `/v1/embeddings` APIs
- Register Perplexity in the Go model factory and provider config
- Support non-streaming chat, SSE streaming chat, embeddings, model
listing, and connection checks
Refs #14736
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
- Adds an `Astraflow` Go driver so the new API server can route
Astraflow (UCloud ModelVerse) chat instances, matching the existing
Python `AstraflowChat` (`rag/llm/chat_model.py:1237`). Follows the same
SaaS-driver shape used for Avian, Novita, TogetherAI, Replicate,
DeepInfra, Upstage, and LongCat.
Closes #15062
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Closes #15044.
Avian was listed unchecked in the Go-rewrite tracker #14736 and already
had an llm_factories.json entry with 4 preconfigured chat models
(deepseek-v3.2, kimi-k2.5, glm-5, minimax-m2.5), but the Go API server
had no driver to route them. The Python side has supported Avian at
rag/llm/chat_model.py:1220 (AvianChat) via the LiteLLM openai/ provider
with default base https://api.avian.io/v1.
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
`ReplicateModel.Embed` in `internal/entity/models/replicate.go` was a
`"replicate, no such method"` stub. Tracking issue #14736 lists
Replicate's embedding surface as not implemented. This PR wires it up
against Replicate's documented embedding schema.
Until this PR, a tenant who selected a Replicate embedding model got the
sentinel error on every embed call.
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR adds a new `Browser` operator to Agent workflows, enabling
prompt-driven browser automation in RAGFlow. It is built on
Browser-Use.
It includes:
- Backend browser component execution with tenant LLM integration
- Upload source support (file IDs, URLs, variables, CSV/JSON array)
- Downloaded file persistence to RAGFlow storage
- Frontend node/operator integration, form config, icon, and i18n
updates
- Unit tests for upload/download and ID parsing logic
- Dependency and Docker updates for browser-use runtime support
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
- Adds a lightweight `@tool` decorator and `FunctionToolSession` adapter
in `rag/llm/tool_decorator.py` that let callers register plain Python
functions as LLM tools without hand-writing OpenAI function schemas or
building an MCP-style session.
- Refactors `Base.bind_tools` and `LiteLLMBase.bind_tools` in
`rag/llm/chat_model.py` to accept either the new decorator form
`bind_tools(tools=[fn1, fn2])` or the existing `(toolcall_session,
tools_schemas)` form, so existing agent/dialog call-sites in
`agent/component/agent_with_tools.py`, `api/db/services/llm_service.py`,
and `api/db/services/dialog_service.py` are unaffected.
- Adds 8 unit tests in `test/unit_test/rag/llm/test_tool_decorator.py`
covering schema shape, required/optional inference, sync + async
dispatch, and bad-input rejection.
## Usage
```python
from rag.llm.tool_decorator import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    :param city: City name to look up.
    """
    return f"{city}: 21 C, partly cloudy"

# Assumes chat_mdl, system, and history already exist and that this runs
# inside an async function.
chat_mdl.bind_tools(tools=[get_weather])
ans, tk = await chat_mdl.async_chat_with_tools(system, history)
```
The decorator introspects `inspect.signature` + type hints + the
docstring (`:param name:` style) and attaches an OpenAI-format
`openai_schema` to the callable. `FunctionToolSession` duck-types the
existing `ToolCallSession` protocol, dispatching async callables
directly and sync ones through `thread_pool_exec` so the event loop is
never blocked.
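How such a decorator might derive a schema from a signature and docstring is sketched below. This is an independent illustration, not the contents of `rag/llm/tool_decorator.py`: the type mapping, the `string` fallback for unknown annotations, and the exact schema layout are assumptions modeled on the OpenAI function-tool format.

```python
import inspect
import re
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool(fn):
    """Attach an OpenAI-style function schema to fn as fn.openai_schema."""
    hints = get_type_hints(fn)
    doc = inspect.getdoc(fn) or ""
    # Only ':param name: text' lines are recognized, as in the PR description.
    param_docs = {n: d.strip() for n, d in re.findall(r":param (\w+):\s*(.*)", doc)}

    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {
            "type": _JSON_TYPES.get(hints.get(name), "string"),
            "description": param_docs.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the argument is required

    fn.openai_schema = {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.split("\n", 1)[0],
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }
    return fn
```

Returning the original callable unchanged keeps the decorated function directly usable, with the schema carried as an attribute.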
## Design notes
- `tool_decorator.py` deliberately does **not** live inside
`rag/llm/__init__.py` to avoid forcing every consumer through the heavy
provider auto-discovery loop and to sidestep a circular import
(`__init__.py` imports `chat_model`, which would otherwise need symbols
from `__init__.py`).
- `FunctionToolSession` is duck-typed against
`common.mcp_tool_call_conn.ToolCallSession` rather than explicitly
inheriting from it, so importing the decorator doesn't pull the MCP
client SDK into the import graph.
- Docstring parsing is intentionally minimal (`:param name:` only) to
keep this dependency-free; Google/NumPy styles can be added later via
`docstring_parser` if needed.
## Test plan
- [x] `python -m pytest test/unit_test/rag/llm/test_tool_decorator.py
-v` — 8 passed
- [x] `python -m pytest test/unit_test/rag/llm/
--ignore=test/unit_test/rag/llm/test_perplexity_embed.py` — 11 passed
(the ignored test has a pre-existing `numpy` import that's unrelated)
- [ ] Reviewer: smoke-test the new path end-to-end with a live model via
`chat_mdl.bind_tools(tools=[my_fn])` to confirm the OpenAI-format
schemas pass through unchanged
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
### What problem does this PR solve?
Closes #15048.
Several SDK session routes in `api/apps/sdk/session.py` called
`.split()` directly on `request.headers.get("Authorization")`. When
clients omitted the header, the handlers raised `AttributeError` before
returning the existing `Authorization is not valid!` response.
This PR centralizes SDK Authorization parsing in a small helper and
keeps the existing error response for missing, empty, or malformed
headers.
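A centralized parser along these lines avoids the `AttributeError`. The sketch is an assumption about the helper's shape: `parse_authorization_token` is a hypothetical name, and it accepts any two-part header without checking the scheme, because the PR text only mentions the `.split()` call.

```python
def parse_authorization_token(header_value):
    """Return the token from '<scheme> <token>', or None if it is unusable."""
    if not header_value:  # covers a missing header (None) and an empty string
        return None
    parts = header_value.split()
    if len(parts) != 2:
        return None
    return parts[1]
```

A route would then return the existing `Authorization is not valid!` response whenever the helper yields `None`.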
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Tests
- `ZHIPU_AI_API_KEY=dummy uv run --python 3.13 --group test pytest
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py::test_sdk_session_routes_missing_authorization_unit
-q`
- `uv run --python 3.13 --group test ruff check api/apps/sdk/session.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `python3 -m py_compile api/apps/sdk/session.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `git diff --check`
### What problem does this PR solve?
Remove duplicate function definitions in
`api/db/services/dialog_service.py`.
**Problem:** Two helper functions were defined twice in the same file,
but with different parameter orders:
- First definition (line 57):
`_resolve_reference_metadata(request_payload=None, config=None)`
- Second definition (line 136): `_resolve_reference_metadata(config,
request_payload=None)`
**Solution:** Keep the second definition (which is actually used by
other modules) and remove the first one to avoid confusion.
Additionally, remove duplicate `_enrich_chunks_with_document_metadata`
definition (keep line 140 version).
<img width="1584" height="313" alt="image"
src="https://github.com/user-attachments/assets/7daee832-244f-4bb2-8488-e3b65012a3f9"
/>
<img width="1672" height="359" alt="image"
src="https://github.com/user-attachments/assets/4fd2f523-273c-4b20-a7c9-ab35740b7834"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
## Summary
- Align **GET `/api/v1/documents/<doc_id>/download`** with
**`/preview`**: resolve extension and MIME type from the stored document
name when the **`ext` query parameter is omitted**, instead of
defaulting to `markdown`.
- When **`?ext=`** is present, behavior stays the same as before
(explicit extension / `Content-Type` mapping).
- Enforce the same access + document lookup pattern as preview
(**`accessible`** + **`get_by_id`**).
- Extend unit tests for the no-`ext` PDF filename case.
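The resolution rule can be sketched with the standard `mimetypes` module. `resolve_download_type` is a hypothetical helper, and the `application/octet-stream` fallback is an assumption; the route's actual extension-to-`Content-Type` mapping may differ.

```python
import mimetypes
from pathlib import PurePosixPath


def resolve_download_type(doc_name, ext=None):
    """Return (extension, content_type), preferring an explicit ext."""
    if ext:
        ext = ext.lstrip(".").lower()
    else:
        # No ?ext= given: derive the extension from the stored document name.
        ext = PurePosixPath(doc_name).suffix.lstrip(".").lower()
    content_type = mimetypes.guess_type(f"file.{ext}")[0] if ext else None
    return ext, content_type or "application/octet-stream"
```

For a stored name such as `report.PDF` and no `ext`, this resolves to `pdf` and `application/pdf` instead of a markdown default.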
## Test plan
- [x] `uv run pytest
test/testcases/test_web_api/test_document_app/test_document_metadata.py::TestDocumentMetadataUnit::test_download_attachment_success_and_exception_unit`
- [x] Optional: `curl -sSI` against
`/api/v1/documents/<pdf_doc_id>/download` without `ext` and confirm
`Content-Type: application/pdf`
Fixes #15052.
POST /api/v1/dify/retrieval resolved the caller via @apikey_required
(injecting tenant_id) but then fetched the requested knowledge_id with
no tenant filter and ran the full retrieval pipeline against
kb.tenant_id (the owner). Any valid Dify-compatible API key could
retrieve chunks from any tenant whose KB UUID was known. Adds the
missing ownership check.
## Root Cause
api/apps/sdk/dify_retrieval.py line 253:
KnowledgebaseService.get_by_id(kb_id) fetched the KB by id alone, then
the handler used kb.tenant_id (the OWNER) to build the embedding model
and call the retriever. The caller tenant_id was only used downstream at
line 278 for retrieval_by_children, well after cross-tenant data was
already retrieved.
grep confirmed there was no KnowledgebaseService.accessible call
anywhere in the handler.
## Fix
Two-line guard immediately after the existing get_by_id lookup,
mirroring the pattern PR #14749 lands for the sibling sdk/doc.py routes
(download, parse, stop_parsing, retrieval_test):

      e, kb = KnowledgebaseService.get_by_id(kb_id)
      if not e:
          return build_error_result(message="Knowledgebase not found!", code=RetCode.NOT_FOUND)
    + if not KnowledgebaseService.accessible(kb_id, tenant_id):
    +     return build_error_result(message="No authorization.", code=RetCode.AUTHENTICATION_ERROR)
      if kb.tenant_embd_id:
          ...

KnowledgebaseService.accessible already handles solo-tenant ownership,
team membership via TenantService.get_joined_tenants_by_user_id, and the
permission=ME distinction. No behavior change for legitimate callers;
cross-tenant callers now receive RetCode.AUTHENTICATION_ERROR (109).
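The ordering of the two checks (lookup first, then ownership, then retrieval) can be illustrated with a small self-contained model. Everything here is a hypothetical stand-in: the in-memory `KBS` and `TEAM_MEMBERS` tables replace the real service layer, and the numeric codes are the ones quoted in this description (109 for the authorization error, 404 for a missing knowledge base).

```python
from dataclasses import dataclass

NOT_FOUND = 404             # value quoted in the test plan of this PR
AUTHENTICATION_ERROR = 109  # value quoted in the text above


@dataclass
class KB:
    id: str
    tenant_id: str


# Hypothetical in-memory stand-ins for the real services.
KBS = {"kb-1": KB(id="kb-1", tenant_id="owner")}
TEAM_MEMBERS = {"owner": {"owner", "teammate"}}


def accessible(kb_id, tenant_id):
    """True when the caller owns the KB or belongs to the owner's team."""
    kb = KBS.get(kb_id)
    return kb is not None and tenant_id in TEAM_MEMBERS.get(kb.tenant_id, set())


def retrieve(kb_id, tenant_id, run_retriever):
    kb = KBS.get(kb_id)
    if kb is None:
        return {"code": NOT_FOUND, "message": "Knowledgebase not found!"}
    # The guard runs before any retrieval work touches the owner's data.
    if not accessible(kb_id, tenant_id):
        return {"code": AUTHENTICATION_ERROR, "message": "No authorization."}
    return {"records": run_retriever(kb)}
```

The point of the sketch is that the retriever callback is never reached for a cross-tenant caller, which is what the regression test asserts.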
## Test Plan
- [x] Regression test added:
test/unit_test/api/apps/sdk/test_dify_retrieval.py
- test_cross_tenant_request_is_rejected -- attacker tenant calling owner
tenant KB gets 109; retriever is not invoked
- test_same_tenant_request_succeeds -- owner tenant gets the records
back
- test_missing_knowledge_base_returns_not_found -- missing KB returns
404 BEFORE the access check fires (legit callers see the clearer
message)
- [x] All 3 tests pass after the fix
- [x] Cross-tenant test FAILS on pre-fix main (KeyError on result[code]
because handler leaks records dict instead of returning auth error)
- [x] ruff check clean on both changed files
- [x] No drive-by reformatting in dify_retrieval.py -- only the 2 added
lines
### Post-fix output
    test_cross_tenant_request_is_rejected PASSED [ 33%]
    test_same_tenant_request_succeeds PASSED [ 66%]
    test_missing_knowledge_base_returns_not_found PASSED [100%]
    ============================== 3 passed in 0.04s ===============================
Closes #15027
### What problem does this PR solve?
Closes #15076
Two endpoints in `api/apps/restful_apis/chat_api.py` accepted a
`user_id` field from the request body and used it directly when creating
a session:
```python
# before (vulnerable)
"user_id": req.get("user_id", current_user.id) # create_session
conv = await _create_session_for_completion(chat_id, dia, req.get("user_id", current_user.id)) # session_completion
```
Any authenticated caller could supply an arbitrary `user_id` and have
the new session attributed to a different user — effectively spoofing
session ownership. Both call sites are now fixed to always use
`current_user.id`, which is set by the authentication middleware and
cannot be tampered with via the request payload.
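For contrast with the vulnerable snippet above, the fixed behavior amounts to never reading `user_id` from the request body. A minimal sketch, assuming a hypothetical `build_session_record` helper and illustrative field names (`dialog_id`, `name`) that are not taken from the actual handler:

```python
def build_session_record(chat_id, request_body, authenticated_user_id):
    """Build a session record whose owner comes only from authentication.

    Hypothetical helper: any "user_id" key in request_body is ignored.
    """
    return {
        "dialog_id": chat_id,
        "name": request_body.get("name", "New session"),
        # Deliberately not request_body.get("user_id", ...).
        "user_id": authenticated_user_id,
    }
```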
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes
| File | Change |
|------|--------|
| `api/apps/restful_apis/chat_api.py` | Remove `req.get("user_id", ...)` fallback in `create_session` and `session_completion`; always use `current_user.id` |
| `test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py` | Add `test_create_session_user_id_not_spoofable` and `test_session_completion_user_id_not_spoofable` (both `@pytest.mark.p2`) |
### Testing
Two new unit tests assert that a `user_id` value supplied in the request
body is silently ignored and the session is always owned by the
authenticated user:
```
test_create_session_user_id_not_spoofable
test_session_completion_user_id_not_spoofable
```
Run with:
```bash
uv run pytest test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py -k "spoofable" -v
```
## What problem does this PR solve?
Closes #15021.
The Go model-provider layer had no support for **Azure OpenAI**. Azure
OpenAI is *not* a drop-in base-URL swap of the OpenAI driver — it
differs in authentication, endpoint structure, and how models are listed
— so it needs its own `ModelDriver` implementation.
## Type of change
- [x] New feature (non-breaking change which adds functionality)
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes #15023
GPUStack is listed as unchecked in the Go-rewrite tracker #14736, and
`internal/service/llm.go:171` already classifies it as a self-deployed
provider alongside Ollama, Xinference, LocalAI, and LM Studio — but
`internal/entity/models/` had no `gpustack.go` driver, so the new Go API
server could not route GPUStack instances. This PR adds the chat surface
for GPUStack so it lines up with the existing self-hosted Go drivers.
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- Replaces the `"no such method"` stub on `XinferenceModel.Embed`
(`internal/entity/models/xinference.go`) with a real implementation
against Xinference's OpenAI-compatible `/v1/embeddings` endpoint.
- Adds the `"embedding": "v1/embeddings"` URL suffix to
`conf/models/xinference.json`.
- Mirrors the Python `XinferenceEmbed` class in
`rag/llm/embedding_model.py:407` for payload shape (OpenAI-compatible
`model + input` → `data[*].index + data[*].embedding`) and tolerates the
same no-auth default Xinference deployments use. Authorization is only
sent when a non-empty API key is configured, via the existing
`setXinferenceAuth` helper.
- Reuses the existing `normalizeXinferenceBaseURL` + `baseURLForRegion`
helpers so both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1`
resolve to the same `/v1/embeddings` target without doubled `/v1`.
- Validates response indices — duplicate, missing, or out-of-range
`data[*].index` values fail with a clear error rather than silently
producing misaligned vectors.
- Returns `[]EmbeddingData` in original input order (placed by `Index`)
so downstream callers can index positionally without re-sorting.
- Forwards `EmbeddingConfig.Dimension` as `dimensions` when `> 0`,
matching the OpenAI cluster pattern.
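The index validation and positional placement described above can be sketched as follows. The driver itself is written in Go; this Python version only illustrates the logic, and `order_embeddings` is a hypothetical name rather than a function from the codebase.

```python
def order_embeddings(data, expected):
    """Place each embedding at its reported index, validating as we go.

    `data` mirrors the OpenAI-compatible `data[*]` array: a list of
    {"index": int, "embedding": [...]} items. `expected` is the number
    of input texts that were sent.
    """
    ordered = [None] * expected
    for item in data:
        idx = item["index"]
        if idx < 0 or idx >= expected:
            raise ValueError(f"embedding index {idx} out of range for {expected} inputs")
        if ordered[idx] is not None:
            raise ValueError(f"duplicate embedding index {idx}")
        ordered[idx] = item["embedding"]

    missing = [i for i, vec in enumerate(ordered) if vec is None]
    if missing:
        raise ValueError(f"missing embeddings for input indices {missing}")
    return ordered
```

Failing loudly on a bad index is the design choice that matters here: a silently misaligned vector would attach the wrong embedding to a chunk without any visible error.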
Closes #14810
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes #15012
The Novita Go driver landed in #14850 and shipped a stub `Rerank` method
that returned `"novita, no such method"`, so Novita could not be used as
a rerank provider in RAGFlow. This PR fills that gap, in the same way
#14895 filled the Embed gap on the same driver.
Novita exposes a public rerank endpoint at `POST
https://api.novita.ai/openai/v1/rerank` that accepts the
Cohere-compatible request shape (`{model, query, documents, top_n}`)
with `Authorization: Bearer <api_key>`. `baai/bge-reranker-v2-m3` is
documented in Novita's model library with a 1024-token limit.
### What problem does this PR solve?
Fixes #14816
The Xinference Go driver landed chat in #14938 and Embed is in review in
#14932, but `Rerank` shipped as a stub that returns `"xinference, no
such method"`. Tenants who launch a rerank model with `--model-type
rerank` on their Xinference instance cannot route it through the Go API
server. This PR fills the gap.
Xinference exposes an OpenAI-compatible REST API. The rerank endpoint is
at `POST <base>/v1/rerank` and accepts the Cohere-shaped body `{model,
query, documents, top_n}`, returning `{results: [{index,
relevance_score}]}` — the same wire shape used by the merged NVIDIA
(#14778), Aliyun (#14676), Gitee (#14656), ZhipuAI (#14608), Novita
(#15014), and LocalAI (#14813) Rerank implementations. Documented in
[Xinference rerank
docs](https://inference.readthedocs.io/en/v1.6.1/models/model_abilities/rerank.html);
the [builtin rerank model
catalog](https://inference.readthedocs.io/en/stable/models/builtin/rerank/)
lists `bge-reranker-base`, `bge-reranker-large`, `bge-reranker-v2-m3`,
and others.
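The Cohere-shaped wire format mentioned above is small enough to show end to end. The sketch below builds the request body and parses a response without making any network call; the function names are hypothetical, and sorting by descending score is an assumption about what a caller would want rather than something the endpoint guarantees.

```python
import json


def build_rerank_body(model, query, documents, top_n):
    """Serialize the {model, query, documents, top_n} request body."""
    return json.dumps(
        {"model": model, "query": query, "documents": documents, "top_n": top_n}
    )


def parse_rerank_response(raw, num_documents):
    """Return [(index, relevance_score), ...] sorted by descending score."""
    payload = json.loads(raw)
    results = []
    for item in payload.get("results", []):
        idx = item["index"]
        # An index outside the submitted documents cannot be mapped back.
        if not 0 <= idx < num_documents:
            raise ValueError(f"rerank index {idx} out of range")
        results.append((idx, float(item["relevance_score"])))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results
```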
### What problem does this PR solve?
Add a Go driver for **n1n.ai** (https://docs.n1n.ai), one of the
unchecked providers on the umbrella tracking issue #14736. n1n.ai is an
OpenAI-compatible aggregator hosting a 450+ model catalog (GPT, Claude,
Gemini, DeepSeek, Kimi, Qwen, embedding + reranker families) under
`https://api.n1n.ai/v1`.
Until this PR, a tenant who configured `n1n` as a model provider in the
Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.
---------
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
### What problem does this PR solve?
Fixes #15015
The TogetherAI Go driver in `internal/entity/models/togetherai.go`
shipped a stub `Embed` method that returned `"TogetherAI, no such
method"`, so TogetherAI could not be used as an embedding provider in
RAGFlow. This PR fills that gap.
TogetherAI exposes a public OpenAI-compatible embeddings endpoint at
`POST https://api.together.ai/v1/embeddings` that accepts the standard
`{model, input}` shape with `Authorization: Bearer <api_key>` (confirmed
in TogetherAI's official docs:
https://docs.together.ai/docs/embeddings-overview). Documented embedding
models include `intfloat/multilingual-e5-large-instruct`,
`BAAI/bge-large-en-v1.5`, and `BAAI/bge-base-en-v1.5`.
### Changes
- `internal/entity/models/togetherai.go`: implement
`TogetherAIModel.Embed`.
- Validate inputs (api key, model name) and short-circuit on empty
texts.
- Resolve region with the existing `baseURLForRegion` helper.
- Build URL from `URLSuffix.Embedding`.
- Send `{model, input}` POST body, add `dimensions` when
`embeddingConfig.Dimension > 0` (matches the pattern in #14735).
- Bearer auth + JSON content type, mirroring the chat path.
- Parse `{data: [{embedding, index}]}` and reorder by `index`, rejecting
out-of-range indices, duplicates, and missing entries so the output
always lines up with the input. Same shape as the merged Mistral,
Upstage, and Novita Embed implementations.
- `conf/models/togetherai.json`:
- Add `"embedding": "embeddings"` to `url_suffix`.
- Add default embedding model entries for
`intfloat/multilingual-e5-large-instruct`, `BAAI/bge-large-en-v1.5`, and
`BAAI/bge-base-en-v1.5`.
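The request-building steps listed above (URL from the configured suffix, `{model, input}` body, optional `dimensions`) can be sketched in a few lines. The driver is Go; this Python sketch uses a hypothetical `build_embedding_request` helper and takes the base URL and suffix values from this description.

```python
def build_embedding_request(base_url, suffix, model, texts, dimension=0):
    """Return (url, body) for an OpenAI-compatible embeddings call."""
    url = base_url.rstrip("/") + "/" + suffix.lstrip("/")
    body = {"model": model, "input": list(texts)}
    # Only forward `dimensions` when the caller asked for a specific size.
    if dimension > 0:
        body["dimensions"] = dimension
    return url, body
```

Called with `https://api.together.ai/v1` and the `embeddings` suffix, this yields the `https://api.together.ai/v1/embeddings` target named above.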
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: The logs on the data source details page are not fully displayed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Add model types when adding a model
---
```
RAGFlow(user)> add model 'pipeline' to provider 'mineru_local' instance 'test' with tokens 131072 doc_parse;
SUCCESS
```
2. Implement provider: MinerU_Local
---
**Verified from CLI**
```
RAGFlow(user)> parse with 'pipeline@test@mineru_local' file './internal/test.pdf'
+--------------------------------------+
| task_id |
+--------------------------------------+
| c7260e31-b6e2-4b36-955d-e9c60510c669 |
+--------------------------------------+
RAGFlow(user)> show 'test@mineru_local' task 'c7260e31-b6e2-4b36-955d-e9c60510c669'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke Anton Obukhov Shengyu Huang Nando Metzger Rodrigo Caye Daudt Konrad Schindler Photogrammetry and Remote Sensing, ETH Zurich ¨
```
### What problem does this PR solve?
RuntimeError: Cannot run the event loop while another loop is running
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: add local & ssh provider in admin panel
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #15025
Langfuse-enabled `dialog_service.async_chat()` regressed to
`langfuse_tracer.start_generation(...)` after the earlier Langfuse v4
migration. Langfuse v4 uses `start_observation(as_type="generation")`,
so the remaining `start_generation` call can fail when chat tracing is
enabled.
This restores the migrated `start_observation(as_type="generation")`
call for chat observations while preserving the existing trace context,
model, input payload, and update/end flow. It also adds a regression
test with a fake Langfuse v4-style client that exposes
`start_observation()` but not `start_generation()`.
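The regression-test idea described above, a fake client that has `start_observation()` but no `start_generation()`, might look roughly like this. It is a sketch of the testing pattern only: the fake classes and `trace_chat` are hypothetical, and apart from `as_type="generation"` the keyword names (`name`, `model`, `input`, `output`) are assumptions, not a statement of the Langfuse SDK signature.

```python
class FakeObservation:
    """Records what the code under test does with an observation."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.updates = []
        self.ended = False

    def update(self, **kwargs):
        self.updates.append(kwargs)

    def end(self):
        self.ended = True


class FakeV4Client:
    """Exposes start_observation() and deliberately no start_generation()."""

    def __init__(self):
        self.observations = []

    def start_observation(self, **kwargs):
        obs = FakeObservation(**kwargs)
        self.observations.append(obs)
        return obs


def trace_chat(client, model, prompt, answer):
    # Code under test: must work against a client without start_generation().
    obs = client.start_observation(
        as_type="generation", name="chat", model=model, input=prompt
    )
    obs.update(output=answer)
    obs.end()
    return obs
```

Because the fake lacks `start_generation`, any remaining call to it would raise `AttributeError` and fail the test immediately.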
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Tests
- `.venv/bin/pytest
test/unit_test/api/db/services/test_dialog_service_final_answer.py -q`
- `.venv/bin/ruff check api/db/services/dialog_service.py
test/unit_test/api/db/services/test_dialog_service_final_answer.py`
### What problem does this PR solve?
Fix: The folder tree menu for moving folders cannot be scrolled.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Closes #15029.
Some custom `base_url` paths in `ModelProviderService` call
`NewInstance(newURL)` and then immediately invoke methods on the
returned driver. Several real Go model drivers still return `nil` from
`NewInstance`, so those paths can panic instead of returning a normal
error.
This PR:
- centralizes custom base URL driver creation in `model_service.go`
- skips request-local driver creation when `base_url` is blank or
whitespace
- preserves the existing region key behavior when building the
request-local base URL map
- returns a clear error when the provider driver is missing or
`NewInstance` returns `nil`
- routes list/check/task and active model paths through the guarded
helper
- adds focused unit coverage for empty-region preservation, regional
base URLs, blank base URLs, nil drivers, and nil `NewInstance` results
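The guarded helper described in the list above can be sketched as follows. The real code is Go in `model_service.go`; this Python version is a loose illustration in which `new_driver_for_base_url`, `DriverError`, and the `new_instance` method name are all hypothetical, and returning `None` models the "skip request-local driver creation" case.

```python
class DriverError(Exception):
    """Raised instead of letting a missing driver crash the request."""


def new_driver_for_base_url(driver, base_url, region=""):
    """Return a request-local driver, or None when no custom URL applies."""
    # Blank or whitespace-only base URLs mean "use the default driver".
    if base_url is None or not base_url.strip():
        return None
    if driver is None:
        raise DriverError("provider driver is missing")
    # The region key is passed through unchanged, including the empty string.
    instance = driver.new_instance({region: base_url.strip()})
    if instance is None:
        raise DriverError("provider driver does not support a custom base URL")
    return instance
```

Callers then handle one error path instead of each call site dereferencing a possibly absent driver.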
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Test plan
- [x] `git diff --check upstream/main...HEAD`
- [x] `/root/go/bin/gofmt -w internal/service/model_service.go
internal/service/model_service_test.go`
- [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go test
./internal/service -run TestNewModelDriverForBaseURL -count=1 -vet=off`
- [x] `GOPATH=/root/gopath GOTOOLCHAIN=local /root/go/bin/go build
./internal/service/... ./internal/entity/models/...`
Note: the same targeted `go test` command without `-vet=off` is
currently blocked by an existing unrelated vet finding in
`internal/service/llm.go:355` (`non-constant format string in call to
fmt.Errorf`).
### What problem does this PR solve?
Extend the RESTful API test suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
### What problem does this PR solve?
This PR implements the 302.AI and JieKouAI providers.
**The following functionalities are now supported:**
**302.ai**
- [x] chat / think chat / stream chat / stream think chat
- [x] Embedding
- [x] ASR
- [x] ListModels
- [x] Provider connection checking
- [x] Balance
- [x] Rerank
- [x] OCR
- [x] Doc Parse
- [x] Show task
- [ ] ~~List Tasks!~~
- [ ] TTS
**JieKouAI**
- [x] chat / think chat / stream chat / stream think chat
- [x] Embedding
- [x] Rerank
- [x] ListModels
**Verified examples from the CLI:**
```plaintext
# jiekouAI
RAGFlow(user)> stream think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi'
Thinking: Let me think about how to respond to this simple greeting. The user just said "Hi", which is a basic and friendly way to start a conversation. I should respond in a similarly warm and welcoming manner.First, I need to acknowledge their greeting and reciprocate with enthusiasm. Something like "Hello!" or "Hi there!" would work well to create a positive atmosphere right from the start.Next, I should make it clear that I'm ready to help. Since they haven't asked anything specific yet, I'll keep it open-ended and inviting. Perhaps offering assistance with a question or task would encourage them to engage further.I should also maintain a professional yet approachable tone. Being an AI assistant, I want to convey that I'm knowledgeable and capable, but also friendly and easy to talk to.Let me put this all together into a concise response. I'll start with a cheerful greeting, express my readiness to help, and finish with an open invitation for them to share what's on their mind. This should create a welcoming environment for whatever they want to discuss next.
Answer: ! I'm Claude, an AI assistant created by Anthropic. I'm here to help you with information, answer questions, or assist you with tasks. What can I help you with today?
RAGFlow(user)> think chat with 'zai-org/glm-4.5@test@jiekouai' message 'Hi'
Thinking: Let me consider how to respond to this greeting. The user initiated with a simple "Hi," so a friendly and open response would be most appropriate to encourage further conversation. I should maintain a welcoming tone while offering assistance.
The response should accomplish a few key things: return the greeting warmly, show openness to conversation, and offer specific ways I can help. This approach demonstrates both approachability and usefulness.
I'll start with a greeting in return, then express my availability to help, and finish by suggesting some areas where I can provide assistance. This creates a natural flow from acknowledgment to support.
It's important to keep the response concise but inviting. Since the user hasn't specified their needs yet, I'll present a few broad categories of assistance to spark their thinking about what they might want to discuss or ask about.
The response should end with an encouraging note that prompts them to share what's on their mind, keeping the conversational ball in their court while making it clear I'm ready to engage with whatever they need.
Answer: Hello! How can I help you today? Whether you have questions, need information, or just want to chat, I'm here to assist.
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test@jiekouai' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 3072 | 0 |
| 3072 | 1 |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'baai/bge-reranker-v2-m3@test@jiekouai' top 3
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0 | 0.9830034 |
| 2 | 0.06399203 |
| 1 | 0.04665664 |
+-------+-----------------+
# 302.ai
RAGFlow(user)> think chat with 'kimi-k2.6@test@302.ai' message 'who r u'
Thinking: The user is asking "who r u" which is a casual way of asking "who are you." I need to identify myself as an AI assistant created by Moonshot AI. I should be friendly, concise, and helpful.
Key points to include:
- I am Kimi, an AI assistant made by Moonshot AI
- I can help with various tasks like answering questions, writing, analysis, coding, etc.
- Keep it casual but informative since the user used "r u" (text speak)
I should not:
- Pretend to be human
- Claim to have personal experiences or emotions
- Be overly formal or robotic
Simple, friendly response is best.
Answer: I'm Kimi, an AI assistant made by Moonshot AI. I can help you with answering questions, writing, coding, analysis, or just chatting. What can I do for you?
Time: 17.687750
RAGFlow(user)> stream think chat with 'kimi-k2.6@test@302.ai' message 'who r u'
Thinking: user asked "who r u" which is a casual way of asking "who are you." I should introduce myself as Kimi, an AI assistant developed by Moonshot AI. I need to be friendly, concise, and accurate. I should mention my capabilities briefly and keep the tone helpful. Since the user used casual text speak ("r u"), I can match that energy with a friendly but still informative tone.Key points:- I'm Kimi, an AI assistant made by Moonshot AI- I can help with various tasks like answering questions, writing, coding, analysis, etc.- Keep it brief but warm- Don't claim to be human- Don't over-explainDraft:"I'm Kimi, an AI assistant created by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other tasks. What can I do for you?"This is good - direct, accurate, and inviting.
Answer: Kimi, an AI assistant made by Moonshot AI. I can help with answering questions, writing, coding, analysis, brainstorming, and lots of other stuff. What can I do for you?
Time: 14.912576
RAGFlow(user)> asr with 'whisper-v3-turbo@test@302.ai' audio './internal/test.wav' param ''
+---------------------------------------------------------------------------------------------------------------------+
| text |
+---------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired |
+---------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> ocr with 'mistral-ocr-latest@test@302.ai' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke
Nando Metzger
Anton Obukhov
Rodrigo Caye Daudt
Shengyu Huang
Konrad Schindler
Photogrammetry and Remote Sensing, ETH Zürich

Figur... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> parse with 'vlm@test@302.ai' file 'https://arxiv.org/pdf/2505.09358'
+--------------------------------------+
| task_id |
+--------------------------------------+
| 6de6eae6-c122-4b67-91e8-b061a0b8c087 |
+--------------------------------------+
RAGFlow(user)> show 'test@302.ai' task '6de6eae6-c122-4b67-91e8-b061a0b8c087'
+----------------------------------------------------------------------------+-------+
| content | index |
+----------------------------------------------------------------------------+-------+
| https://file.302.ai/gpt/imgs/20260519/b340fdff4774699c287fe4ee4658b317.zip | 0 |
+----------------------------------------------------------------------------+-------+
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'jina-embeddings-v3@test@302.ai' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 1024 | 0 |
| 1024 | 1 |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'jina-reranker-v2-base-multilingual@test@302.ai' top 3;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0 | 0.74167407 |
| 2 | 0.18832397 |
| 1 | 0.15713684 |
+-------+-----------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Extend the RESTful API test suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
### What problem does this PR solve?
Closes #14751.
The user reported that after adding a variable (e.g. `key1`) to an
agent's **Begin** component, the Python SDK gave them no way to pass it:
their call `session.ask(question=user_question, stream=False)` had no
parameter for `key1`, and the `ask()` signature was just `(question,
stream, **kwargs)` with a docstring that only described streaming
behavior.
The functionality already works — `_ask_agent` does
`json_data.update(kwargs)` and the server reads `inputs` from the
request body at `agent_api.py:902`. The canonical shape is also in the
public API docs (`docs/references/python_api_reference.md:1817-1840`):
```python
session.ask(
"",
stream=False,
inputs={"line_var": {"type": "line", "value": "I am line_var"}},
return_trace=True,
)
```
But because `inputs`, `release`, and `return_trace` were hidden behind
`**kwargs`, they did not appear in IDE signature help, and the docstring
did not mention them. Users had no path from "I added a key in the UI"
to "I need to pass `inputs=...` with this exact shape."
This PR promotes the three most relevant Begin-related arguments to
named parameters and rewrites the docstring with a worked example.
### What this PR changes
- `sdk/python/ragflow_sdk/modules/session.py`:
- `Session.ask()` signature becomes `ask(question="", stream=False,
inputs=None, release=None, return_trace=None, **kwargs)`.
- These three new named params are forwarded into the existing `kwargs`
dict before dispatch, so the wire format and downstream behavior are
unchanged.
- Docstring rewritten in numpy style, including the structured `{"type":
..., "value": ...}` shape that the Begin component requires (see
`agent/component/begin.py:45-60`).
No backend changes. `**kwargs` is preserved for forward compatibility
with other body fields (`session_id`, `files`, `user_id`,
`custom_header`, …).
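The forwarding step can be shown in isolation. The sketch below is a standalone function rather than the real `Session.ask()` method, and it returns the request body instead of sending it. Dropping parameters left at `None` is an assumption made here so that the wire format matches a call that never mentioned them.

```python
def ask(question="", stream=False, inputs=None, release=None,
        return_trace=None, **kwargs):
    """Build the request body the way the promoted signature would.

    Hypothetical standalone version: named parameters are folded into
    kwargs before the existing update step, so the body is unchanged.
    """
    for key, value in (
        ("inputs", inputs),
        ("release", release),
        ("return_trace", return_trace),
    ):
        if value is not None:  # assumption: unset parameters are omitted
            kwargs[key] = value

    json_data = {"question": question, "stream": stream}
    json_data.update(kwargs)
    return json_data
```

An existing call such as `ask("hi", stream=False)` produces the same body as before, while `inputs=...` now shows up in signature help instead of hiding in `**kwargs`.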
### Test plan
- [ ] `session.ask(question="hi", stream=False)` — existing call still
works
- [ ] `session.ask("", stream=False, inputs={"key1": {"type": "line",
"value": "v"}})` — Begin component receives `key1 = "v"`
- [ ] `session.ask("", stream=True, return_trace=True)` — streaming
response includes trace events
- [ ] IDE / `help(Session.ask)` now shows `inputs`, `release`,
`return_trace` with descriptions
### Type of change
- [x] Refactoring
- [x] Documentation Update
## Summary
Closes #14869.
Adds VLM-based semantic descriptions to **image chunks produced by the
MinerU parser**, closing a long-standing parity gap with the deepdoc
parser's `VisionFigureParser`. A maintainer flagged this in #13342
("We may add the VLM enhancement to MinerU parser as well") and an
earlier proposal exists in #13824; this PR lands the change end-to-end
inside the existing parser plumbing.
## Why
Today the MinerU parser returns image chunks containing only the
native `image_caption` and `image_footnote` strings from MinerU's
JSON. When neither is present (or when both are sparse), the chunk
carries effectively no searchable content for the figure and
retrieval misses it entirely. Users who configured a local VLM
(reporter's case: Gemma-4-31B) had to post-process MinerU's
`tmp/*.json` themselves.
The deepdoc parser already solves this via
[`VisionFigureParser`](deepdoc/parser/figure_parser.py): when the
tenant has an `IMAGE2TEXT` model configured, each figure gets a
semantic description merged into its chunk. This PR brings the same
behavior to MinerU.
## What changed
### `deepdoc/parser/mineru_parser.py`
- **New method `_enhance_images_with_vlm(outputs, vision_model,
callback=None)`** —
collects every `IMAGE` block with a readable `img_path`, runs
`rag.app.picture.vision_llm_chunk` in a 10-worker
`ThreadPoolExecutor` using the existing
`vision_llm_figure_describe_prompt`, and writes the result back as
`vlm_description`. Per-image failures are logged and skipped — they
never abort the run.
- **`_transfer_to_sections` (IMAGE branch)** — folds
`vlm_description` into the section text alongside caption +
footnote, so the description becomes part of the chunk and is
searchable / retrievable.
- **`parse_pdf`** — after `_read_output`, calls
`_enhance_images_with_vlm(outputs, vision_model, callback=callback)`
when a `vision_model` kwarg is supplied. Wrapped in `try / except`
so a VLM outage cannot break parsing.
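The fan-out pattern described for `_enhance_images_with_vlm` (a bounded thread pool, with per-image failures logged and skipped) can be sketched like this. It is not the parser's code: `enhance_images` and the `describe` callback are hypothetical, blocks are modeled as plain dicts with an assumed `"type": "image"` marker, and a non-empty `img_path` stands in for the "readable path" check.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed


def enhance_images(blocks, describe, max_workers=10):
    """Attach a `vlm_description` to each image block, best effort."""
    targets = [b for b in blocks if b.get("type") == "image" and b.get("img_path")]
    if not targets:
        return blocks

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(describe, b["img_path"]): b for b in targets}
        for future in as_completed(futures):
            block = futures[future]
            try:
                text = future.result()
            except Exception:
                # One failing image must not abort the whole run.
                logging.exception("VLM description failed for %s", block["img_path"])
                continue
            if text:
                block["vlm_description"] = text
    return blocks
```

Blocks whose call failed simply keep their caption and footnote, which matches the fallback behavior listed in the table under Behavior.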
### `rag/app/naive.py` (`by_mineru`)
After successfully resolving the MinerU OCR parser, also resolves the
tenant's default `LLMType.IMAGE2TEXT` model via
`get_tenant_default_model_by_type`, wraps it in an `LLMBundle`, and
injects it as `kwargs["vision_model"]` before delegating to
`parse_pdf`.
## Behavior
| Tenant config | Behavior |
|---|---|
| `IMAGE2TEXT` model configured | MinerU image chunks contain `caption + footnote + VLM description`. Retrieval against figures now actually works. |
| No `IMAGE2TEXT` model configured | Exact same output as today (caption + footnote only). Lookup fails silently with an info log; no error, no regression. |
| VLM call fails for a single image | That image silently falls back to caption + footnote; other images proceed. |
| Caller already passes `vision_model` in kwargs | We don't override it; `if "vision_model" not in kwargs` guards the lookup. |
## Files
- `deepdoc/parser/mineru_parser.py` (+56)
- `rag/app/naive.py` (+13)
### What problem does this PR solve?
Extend the RESTful API test suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
## Summary
Closes #14921.
Reconfiguring an existing LLM provider to enable **tool call** or
**vision** fails with `Your API key is invalid. Fail to access model.`
even when the saved API key is correct. The most visible report is
VLLM ("Cannot add vllm model" once `--enable-auto-tool-choice` /
vision is toggled on), but the bug applies to every provider whose
api_key field stays blank in edit mode.
## Root cause
PR #14885 ("Fix: llm add api key overridden") removed the existing-key
lookup in `api/apps/llm_app.py::add_llm`. The intent was correct —
stop the saved key from clobbering a user-provided new one — but the
removal was unconditional, so the edit path now has no fallback at all:
1. `web/src/pages/user-setting/setting-model/hooks.tsx:230` sets the
initial `api_key` form value to `''` in edit mode (the real key is
never returned to the browser).
2. The user toggles `is_tools` / `vision` without retyping the key.
3. `hooks.tsx:183-185` strips the empty `api_key` from the payload.
4. `add_llm` defaults to the placeholder `"x"`
(`api/apps/llm_app.py:182`).
5. The upstream provider rejects `"x"` with `Your API key is invalid`.
## Fix
Restore the fallback **narrowly**, before any factory-specific handler
runs:
- If `req.get("api_key") is None`, look up the tenant's existing record
(using the correctly suffixed `llm_name` for VLLM /
OpenAI-API-Compatible / LocalAI / HuggingFace).
- Decode the saved blob with `_decode_api_key_config` and write **only
the decoded `api_key` string** back into `req["api_key"]`. Never use
the raw JSON payload — that was the exact thing PR #14885 was trying
to avoid.
- When the user **does** type a new key, `req.get("api_key")` is not
`None` and the fallback is skipped, so PR #14885's fix is preserved.
| Scenario | Before this PR | After this PR |
|---|---|---|
| Plain factory (VLLM, Ollama, …), retype key | OK | OK |
| Plain factory, blank key in edit (the bug) | Fails with "API key is invalid" | Recovers saved key, validates against the real one |
| OpenRouter / Bedrock, change `provider_order` only | Fails | `apikey_json([...])` rebuilds the JSON with saved `api_key` + new field |
| User clears the form and types a brand-new key | OK (key replaced) | OK (key replaced; fallback skipped) |
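The narrow fallback from the Fix section can be sketched as below. `decode_api_key` is a hypothetical stand-in for `_decode_api_key_config`, and it assumes the saved value is either a bare key string or a JSON object with an `api_key` field. The real helper may handle more cases.

```python
import json


def decode_api_key(saved):
    """Extract only the key string from a saved value (assumed formats)."""
    try:
        parsed = json.loads(saved)
    except (TypeError, ValueError):
        return saved  # a bare key, not JSON
    if isinstance(parsed, dict):
        return parsed.get("api_key", "")
    return saved


def apply_saved_key(req, saved):
    """Fill req["api_key"] from the saved record only when it is absent."""
    if req.get("api_key") is None and saved is not None:
        # Write back the decoded key only, never the raw JSON payload.
        req["api_key"] = decode_api_key(saved)
    return req
```

A request that carries a newly typed key never enters the branch, so the saved value cannot overwrite it.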
## Files changed
- `api/apps/llm_app.py` — restored fallback in `add_llm` (no other call
sites touched).
## Test plan
- [ ] Add a VLLM chat model with a valid api_key, no toggles → save
succeeds.
- [ ] Edit the same model, toggle **tool call** on, leave api_key blank
→ save succeeds, validation runs against the saved key.
- [ ] Edit again, toggle **vision** on (model_type → `image2text`),
leave api_key blank → save succeeds.
- [ ] Edit again and **type a new api_key** → the new key replaces the
saved one (`is None` check skips the fallback). Verify via the DB
row or by deliberately typing a wrong key and observing the
validation failure.
- [ ] Repeat the blank-key edit with **OpenRouter**, changing only
`provider_order` → resulting api_key JSON contains the saved
`api_key` and the new `provider_order`.
- [ ] First-time add of a new model name → no existing record, fallback
no-ops, behaves as before.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
## What
- Add TogetherAI as a chat provider backed by its OpenAI-compatible
`/v1/chat/completions` API
- Register TogetherAI in the Go model factory and provider config
- Support non-streaming chat, SSE streaming chat, model listing, and
connection checks
## Notes
- Uses the current TogetherAI OpenAI-compatible base URL
`https://api.together.ai/v1`
- Forwards documented chat parameters from `ChatConfig`: `max_tokens`,
`temperature`, `top_p`, `stop`, and GPT-OSS `reasoning_effort`
- Routes Together reasoning traces from `reasoning` /
`reasoning_content` into `ReasonContent`
## Tests
- `go test -vet=off -run TestTogetherAI -count=1
./internal/entity/models`
- `go test -vet=off -count=1 ./internal/entity/models`
Refs #14736
### What problem does this PR solve?
Closes #14808.
Adds a Go model driver for Xinference so self-hosted Xinference chat
models can be used through the Go provider layer instead of falling
through to the dummy driver. Xinference exposes an OpenAI-compatible API
under `/v1`; the driver accepts either a root endpoint such as
`http://127.0.0.1:9997` or an OpenAI-compatible endpoint such as
`http://127.0.0.1:9997/v1` and normalizes it before calling chat or
model-listing routes.
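
The endpoint normalization described above could look roughly like this. It is a Python illustration of the idea only, not the Go driver's code; `normalize_base_url` and `chat_url` are hypothetical names.

```python
def normalize_base_url(endpoint: str) -> str:
    """Return an endpoint that always ends in /v1, without a trailing slash."""
    base = endpoint.strip().rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base


def chat_url(endpoint: str) -> str:
    """Build the chat completions URL from either accepted endpoint form."""
    return normalize_base_url(endpoint) + "/chat/completions"
```

Both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1` then map to the same chat and model-listing routes.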
### What is changed?
- Add `internal/entity/models/xinference.go` implementing `ModelDriver`
for Xinference chat.
- Route provider name `xinference` in
`internal/entity/models/factory.go`.
- Add `conf/models/xinference.json` as a local provider config.
- Add focused unit tests in `internal/entity/models/xinference_test.go`.
Initial method coverage:
- `ChatWithMessages`: POST `/v1/chat/completions`.
- `ChatStreamlyWithSender`: SSE streaming from `/v1/chat/completions`.
- `ListModels`: GET `/v1/models`.
- `CheckConnection`: lightweight `ListModels` probe.
- Optional auth: send `Authorization: Bearer <api_key>` only when a
non-empty key is configured, matching Xinference no-auth and
auth-enabled deployments.
- `Balance`, `Embed`, `Rerank`, ASR, TTS, and OCR return `no such
method` for this initial chat-provider PR.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)
### Tests
- `go test -vet=off -run TestXinference -count=1
./internal/entity/models/...`
- `go test -vet=off -count=1 ./internal/entity/models/...`
### References
- Xinference docs:
https://inference.readthedocs.io/zh-cn/latest/index.html
- OpenAI-compatible chat usage:
https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html
- API key auth:
https://inference.readthedocs.io/zh-cn/latest/user_guide/auth_system.html
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## What problem does this PR solve?
Closes #12582.
When a Retrieval component sits inside an Iteration with a **manual**
metadata filter that references the iteration variable (e.g.
`{IterationItem:abc@item}`), every iteration reuses the value resolved
on the **first** pass.
Root cause: [`_resolve_manual_filter` in
`agent/tools/retrieval.py`](https://github.com/infiniflow/ragflow/blob/main/agent/tools/retrieval.py#L144-L171)
mutated `flt["value"]` in place. The `filters` list passed in is the
live `self._param.meta_data_filter["manual"]` (see
[`apply_meta_data_filter` in
`common/metadata_utils.py:257-261`](https://github.com/infiniflow/ragflow/blob/main/common/metadata_utils.py#L257-L261)),
so after the first iteration the param dict permanently held the
resolved string instead of the original variable reference.
```text
iter #1: flt["value"] = "{IterationItem:abc@item}" → resolved to "AI"
after mutation: flt["value"] = "AI" ← written back into _param
iter #2: flt["value"] = "AI" ← no {…} matches
retrieval keeps filtering by "AI" forever
```
This PR returns a shallow copy with the resolved value instead, leaving
the original filter (and its variable reference) intact for the next
iteration.
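
A minimal sketch of the copy-instead-of-mutate approach, assuming filters are plain dicts and variables live in a flat mapping. `resolve_manual_filters` and `VAR_PATTERN` are hypothetical names, not the actual `_resolve_manual_filter` implementation.

```python
import re

# Matches a single {variable reference}; the exact syntax is an assumption.
VAR_PATTERN = re.compile(r"\{([^{}]+)\}")


def resolve_manual_filters(filters, variables):
    """Return new filter dicts with variable references resolved.

    The input dicts are never mutated, so the original reference survives
    for the next iteration.
    """
    resolved = []
    for flt in filters:
        value = flt.get("value")
        if isinstance(value, str):
            value = VAR_PATTERN.sub(
                lambda m: str(variables.get(m.group(1), m.group(0))), value
            )
        # Shallow copy with only the "value" key replaced.
        resolved.append({**flt, "value": value})
    return resolved
```

Calling it twice with different variable values yields two different resolved filters while the source list still holds the `{…}` reference.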
## Type of change
- [x] Bug fix (non-breaking change which fixes an issue)
## Test plan
- [ ] Build an agent: `Agent (structured output → list of areas) →
Iteration → Retrieval (manual filter: Area = {IterationItem/Item}) →
Message`. Run with a multi-area query and confirm each iteration's
Retrieval result matches its own item, not the first item.
- [ ] Regression: Retrieval with a manual metadata filter outside an
Iteration still resolves the variable correctly on each request.
- [ ] Regression: Retrieval with no metadata filter and with `auto` /
`semi_auto` filters behave unchanged.
## What problem does this PR solve?
Closes #12017.
TTS output is deterministic for a given `(model, text)` pair, so
re-running the same text through the same TTS model produces the same
bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on
every request. That's slow and wastes provider quota whenever the same
assistant response is replayed, shared across users, or repeated within
a session.
### Change
New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl,
cleaned_text)`:
- **Key:** `tts:cache:{model_id}:{sha256(text)}` — separate namespace
per model, identical cleaned text reuses a single entry across both call
sites.
- **Value:** the hex-encoded audio blob both call sites already
returned. No format change for downstream consumers.
- **TTL:** 7 days by default, configurable via
`RAGFLOW_TTS_CACHE_TTL_SECONDS`.
- **Failure modes:** a Redis hiccup falls back to direct synthesis; a
failed synthesis still returns `None` (existing contract preserved).
[`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724)
and
[`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380)
now route through the helper; the per-file bytes-accumulation/hex-encode
loop has been removed in favor of one shared implementation.
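
The key scheme and failure handling described above can be sketched as follows. This is an assumption-laden illustration: `cache` is any dict-like store standing in for Redis, `synthesize` is a hypothetical callable returning bytes or `None`, and TTL handling is omitted.

```python
import hashlib


def cache_key(model_id, text):
    """Build a per-model key from the SHA-256 of the cleaned text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"tts:cache:{model_id}:{digest}"


def synthesize_cached(cache, model_id, text, synthesize):
    """Return hex-encoded audio, reusing a cached entry when one exists."""
    key = cache_key(model_id, text)
    try:
        hit = cache.get(key)
    except Exception:
        hit = None  # cache outage: fall through to direct synthesis
    if hit is not None:
        return hit
    audio = synthesize(text)
    if audio is None:
        return None  # failed synthesis keeps the existing None contract
    encoded = audio.hex()
    try:
        cache[key] = encoded
    except Exception:
        pass  # a failed store must not fail the request
    return encoded
```

Because the model ID is part of the key, the same text synthesized by two models produces two independent entries.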
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Test plan
- [ ] **Cache hit, chat path:** Configure a dialog with TTS enabled, ask
the same question twice with `stream=false`. Verify the second response
returns the same `audio_binary` and that the second invocation doesn't
hit the TTS provider (e.g., observe provider-side logs / usage counters;
check no `LLMBundle.tts can't update token usage` log line on the second
run).
- [ ] **Cache hit, agent path:** Same exercise via a Conversational
Agent that includes a Message component playing back the answer.
- [ ] **Cache isolation per model:** Switch tenant's `tts_id` between
two models, run the same text against each — confirm the second model's
first synthesis still happens (no cross-model hits).
- [ ] **TTL override:** Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm
the entry expires after 2 minutes.
- [ ] **Redis unavailable:** Stop Redis (or break the connection).
Verify the TTS endpoint still works — synthesis falls back to direct
calls, with a `TTS cache lookup failed` / `TTS cache store failed`
warning logged.
- [ ] **Failure path:** Configure a TTS model with an invalid API key,
ensure the response still returns successfully with `audio_binary=None`
(no regression vs. current behavior).
## Summary
Fix critical severity security issue in
`internal/cpp/opencc/dictionary/text.c`.
## Vulnerability
| Field | Value |
|-------|-------|
| **ID** | V-001 |
| **Severity** | CRITICAL |
| **Scanner** | multi_agent_ai |
| **Rule** | `V-001` |
| **File** | `internal/cpp/opencc/dictionary/text.c:107` |
**Description**: The OpenCC C library uses fgets() to read dictionary
and configuration files without proper bounds validation on subsequent
buffer operations. While fgets() itself is bounds-checked, the sprintf()
call at config_reader.c:174 constructs file paths by concatenating
home_path and filename without verifying the result fits in pkg_filename
buffer. An attacker providing malformed OpenCC configuration files with
excessively long path components can overflow the fixed-size buffer,
overwriting adjacent memory including return addresses and function
pointers.
## Changes
- `internal/cpp/opencc/config_reader.c`
- `internal/cpp/opencc/dictionary/text.c`
- `internal/cpp/opencc/utils.c`
## Verification
- [x] Build passes
- [x] Scanner re-scan confirms fix
- [x] LLM code review passed
---
*Automated security fix by [OrbisAI Security](https://orbisappsec.com)*
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
  * Improved error detection and handling for malformed configuration and
    dictionary entries during file parsing.
  * Enhanced memory cleanup in error recovery paths to prevent potential
    issues.
  * Strengthened robustness of string operations and buffer handling
    throughout the library.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-15.us-west-2.compute.internal>
## Problem
When using MinerU with the `vlm-http-client` backend, the parser fails to
find the output files because they are located in a `vlm/` subdirectory,
but the `_read_output` method doesn't check this location.
## Error Message
[ERROR]MinerU not found.
[MinerU] Missing output file, tried: ...
## Root Cause
The MinerU API with `vlm-http-client` backend returns output files in
the following structure:
output_dir/
vlm/
filename_content_list.json
filename.md
images/
However, the `_read_output` method in `mineru_parser.py` only checks:
1. `output_dir/filename_content_list.json`
2. `output_dir/sanitized_filename_content_list.json`
3. `output_dir/sanitized_filename/sanitized_filename_content_list.json`
It doesn't check the `vlm/` subdirectory.
## Solution
Added two additional fallback paths to check the `vlm/` subdirectory:
- `output_dir/vlm/filename_content_list.json`
- `output_dir/vlm/sanitized_filename_content_list.json`
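
The lookup order, including the two new `vlm/` fallbacks, might be expressed like this. It is a sketch only; `candidate_paths` and `find_output` are hypothetical helpers, not the actual `_read_output` code.

```python
from pathlib import Path


def candidate_paths(output_dir, filename, sanitized):
    """List the content-list locations to try, in order."""
    root = Path(output_dir)
    return [
        root / f"{filename}_content_list.json",
        root / f"{sanitized}_content_list.json",
        root / sanitized / f"{sanitized}_content_list.json",
        # Fallbacks for the vlm-http-client backend layout.
        root / "vlm" / f"{filename}_content_list.json",
        root / "vlm" / f"{sanitized}_content_list.json",
    ]


def find_output(output_dir, filename, sanitized):
    """Return the first existing candidate, or None if all are missing."""
    for path in candidate_paths(output_dir, filename, sanitized):
        if path.is_file():
            return path
    return None
```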
## Testing
Tested with MinerU API using `vlm-http-client` backend. The parser now
successfully finds and processes the output files.
## Related
This issue occurs specifically when using:
- MinerU backend: `vlm-http-client`
- MinerU server URL configured for remote vLLM inference
## What
- Add Replicate as a chat provider backed by the documented predictions
API
- Register Replicate in the Go model factory and provider config
- Support non-streaming chat through sync predictions, polling fallback,
streaming through `urls.stream`, model listing, and connection checks
## Notes
- Uses `POST /v1/predictions` with Replicate model identifiers in
`version`, which supports official and community model identifiers
- Maps RAGFlow messages into Replicate prompt-shaped inputs (`prompt`,
optional `system_prompt`) and forwards common documented LLM inputs:
`max_new_tokens`, `temperature`, `top_p`
- Preserves whitespace in SSE output chunks and emits RAGFlow `[DONE]`
at stream completion
## Tests
- `go test -vet=off -run TestReplicate -count=1
./internal/entity/models`
- `go test -vet=off -count=1 ./internal/entity/models`
Refs #14736
### What problem does this PR solve?
Fix minor code quality issues:
1. Fix typo in assertion error message: "Can't fine" → "Can't find"
2. Remove duplicate line in common/connection_utils.py
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### Related issues
Closes #14922
### What problem does this PR solve?
`POST /memories` already resolves `tenant_llm_id` and `tenant_embd_id`
through `ensure_tenant_model_id_for_params`, but `PUT
/memories/<memory_id>` accepted client-supplied `tenant_llm_id` /
`tenant_embd_id` without checking that those `tenant_llm` rows belong to
the memory owner’s tenant. A caller could persist another tenant’s row
IDs and later trigger extraction or embedding that loaded foreign model
credentials via `get_model_config_by_id(tenant_model_id)` with no tenant
allow-list.
This change aligns the update path with create: updates that change
models must go through `llm_id` / `embd_id` and
`ensure_tenant_model_id_for_params` scoped to the **memory’s**
`tenant_id` (not only the current user, so team-access cases stay
correct). Direct `tenant_*` fields in the body without `llm_id` /
`embd_id` are rejected. As defense in depth, `memory_message_service`
passes `allowed_tenant_ids` / `requester_tenant_id` into
`get_model_config_by_id` for LLM and embedding resolution so mismatched
IDs cannot be used even if bad data existed. A regression test rejects
payloads that set only `tenant_llm_id` / `tenant_embd_id`.
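
The body-level rejection rule can be illustrated with a small check. This is a sketch under the assumption that the update body is a plain dict; `validate_memory_update` is a hypothetical name and the real route also performs the tenant-scoped resolution described above.

```python
def validate_memory_update(body):
    """Return errors for direct tenant_* model IDs not backed by llm_id / embd_id."""
    errors = []
    if "tenant_llm_id" in body and "llm_id" not in body:
        errors.append("tenant_llm_id requires llm_id")
    if "tenant_embd_id" in body and "embd_id" not in body:
        errors.append("tenant_embd_id requires embd_id")
    return errors
```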
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: jony376 <jony376@gmail.com>
### What problem does this PR solve?
This PR improves the connector dashboard task management experience and
adds better visibility into connector execution logs.
### Overview:
#### Before
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052"
/>
#### Now:
<img width="700" alt="Screenshot from 2026-05-18 16-31-30"
src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627"
/>
### 1. Add a new logging page to the connector dashboard
A new logging page has been added so users can view connector task
execution logs directly from the connector dashboard.
### 2. Merge the Resume button into Confirm
The separate **Resume** button has been removed. The **Confirm** button
now represents different actions depending on the current task state:
- **Save**: Save form changes and reschedule tasks.
- **Stop**: Cancel currently scheduled or running tasks.
- **Resume**: Create new scheduled tasks after the previous tasks have
been stopped.
- **Start**: Start tasks when no task has been started yet.
### 3. Separate syncing and pruning tasks
Connector tasks are now separated into **syncing** and **pruning**.
Pruning is controlled by the **Sync deleted files** option:
- When **Sync deleted files** is disabled, only syncing tasks are shown.
- When **Sync deleted files** is enabled, both syncing and pruning tasks
are shown.
**Now: Sync deleted files disabled**
<img width="700" alt="Sync deleted files disabled"
src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d"
/>
**Now: Sync deleted files enabled**
<img width="700" alt="Sync deleted files enabled"
src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296"
/>
### 4. Update logs in backend
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2"
/>
### 5. Remove connector resume API
- Removed: `POST /v1/connectors/<connector_id>/resume`
- Replaced by: `PATCH /v1/connectors/<connector_id>`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes forgetting-policy validation and memory update diff checks.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When `_parse_doc_id_filter_with_metadata` returns `[]`, the empty list is
falsy, so the `WHERE id IN (...)` clause was silently skipped and the
full dataset was returned instead of an empty result.
Change `if doc_ids:` to `if doc_ids is not None:` in both `get_list()` and
`get_by_kb_id()` to distinguish no filter (`None`) from a filter that
matched zero documents (`[]`).
Fixes #14962
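
The distinction between "no filter" and "filter matched nothing" is easy to see in a small in-memory sketch. `filter_documents` is a hypothetical helper working on a list of dicts, not the actual query code in `get_list()`.

```python
def filter_documents(documents, doc_ids=None):
    """None means no filter; an empty list means the filter matched nothing."""
    if doc_ids is not None:
        wanted = set(doc_ids)
        return [doc for doc in documents if doc["id"] in wanted]
    return list(documents)
```

With the old truthiness check, `doc_ids=[]` would have taken the unfiltered branch and returned every document.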
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Expose `batch_chunk_token_size` for configuration.
2. Retrieve a document's chunks when building its subgraph, instead of
retrieving all documents' chunks at the beginning.
3. Fetch all chunks for a document; the limit used to be hard-coded to 10000.
4. Delete the unused method `run_graphrag`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Follow on: #14617
### What problem does this PR solve?
Fixes #14746.
Adds tenant access checks for connector-by-id REST routes before reading
connector details, mutating connector config/status, deleting
connectors, rebuilding, or listing sync logs. Unauthorized callers now
receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without
reaching the connector/log mutation paths.
Validation:
- `python3 -m pytest
--confcutdir=test/testcases/test_web_api/test_connector_app
test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py`
- `uvx ruff check api/apps/restful_apis/connector_api.py
api/db/services/connector_service.py
test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>
### What problem does this PR solve?
Refactor: speeds up RAGFlow server startup and reduces startup memory,
saving about 200 MiB and 5-9 seconds of start time.
##### Before
1241292 | | \_ python3 api/ragflow_server.py
RAGFlow server is ready after 25.61845850944519s initialization.
##### After
1019968 | | \_ python3 api/ragflow_server.py
RAGFlow server is ready after 16.205134391784668s initialization.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Closes #14858
The `test_db_connection` endpoint in the agent API accepts a
user-supplied `host` and connects to it directly via database drivers
(MySQL/PostgreSQL) without any validation. This allows an attacker to
probe internal network addresses (e.g. `127.0.0.1`, `10.x.x.x`,
link-local, etc.) through the server — a classic Server-Side Request
Forgery (SSRF) vulnerability.
This PR adds an SSRF guard that resolves the host and rejects any
address that is not globally routable before the database connection is
attempted.
**Changes:**
- **`common/ssrf_guard.py`** — Added `assert_host_is_safe()`, a
host-level counterpart of the existing `assert_url_is_safe()`, designed
for non-HTTP protocols (database drivers) where there is no URL to
parse.
- **`api/apps/restful_apis/agent_api.py`** — Call
`assert_host_is_safe(req["host"])` at the top of `test_db_connection` so
that non-public hosts are rejected early with a clear error message.
Fixes #14858
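
A host-level guard of the kind described could be sketched with the standard library alone. This is an illustration, not the contents of `common/ssrf_guard.py`: `resolved_addresses` and `assert_host_is_public` are hypothetical names, and the real `assert_host_is_safe()` may apply different rules.

```python
import ipaddress
import socket


def resolved_addresses(host):
    """Resolve host to the set of IP address strings it maps to."""
    infos = socket.getaddrinfo(host, None)
    return {info[4][0] for info in infos}


def assert_host_is_public(host):
    """Raise ValueError unless every resolved address is globally routable."""
    for addr in resolved_addresses(host):
        ip = ipaddress.ip_address(addr.split("%", 1)[0])  # drop any IPv6 zone id
        if not ip.is_global:
            raise ValueError(f"host {host!r} resolves to non-public address {ip}")
```

Note that a check like this validates the addresses seen at check time; if the database driver resolves the name again, it could in principle receive a different answer.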
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Adds the Go model provider driver for CometAPI, which is listed as
unchecked in the Go provider tracking issue #14736 and requested in
#14804. Without this, the Go layer falls back to the dummy driver for
the `cometapi` provider.
Fixes #14804
### What this PR includes
- New `internal/entity/models/cometapi.go` implementing `ModelDriver`
for CometAPI.
- New `conf/models/cometapi.json` with CometAPI base URLs and
representative chat / embedding models from the public catalog.
- `factory.go`: route `"cometapi"` to `NewCometAPIModel`.
- Unit tests in `internal/entity/models/cometapi_test.go`.
### Method coverage
- `ChatWithMessages`: `POST /v1/chat/completions`.
- `ChatStreamlyWithSender`: SSE streaming on the same endpoint.
- `Embed`: `POST /v1/embeddings`, including optional `dimensions`.
- `ListModels`: `GET /api/models` public catalog.
- `Balance`: `GET https://query.cometapi.com/user/quota?key=...`.
- `CheckConnection`: delegates to the quota query to verify the key.
- `Rerank`, ASR, TTS, OCR: return `no such method` for now.
No ModelDriver interface change. No new dependencies.
### How was this tested?
```bash
go test -vet=off -run TestCometAPI -count=1 ./internal/entity/models/...
go test -vet=off -count=1 ./internal/entity/models/...
```
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: 加帆 <Jiafan@users.noreply.github.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: bulexu <baiheng527@gmail.com>
Co-authored-by: xubh <xubh@wikiflyer.cn>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Carve_ <75568342+Rynzie02@users.noreply.github.com>
Co-authored-by: Paul Y Hui <paulhui@seismic.com>
Co-authored-by: LIRUI YU <128563231+LiruiYu33@users.noreply.github.com>
Co-authored-by: yun.kou <koopking@gmail.com>
Co-authored-by: Yun.kou <yunkou@deepglint.com>
Co-authored-by: Ahmad Intisar <168020872+ahmadintisar@users.noreply.github.com>
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Co-authored-by: chanx <1243304602@qq.com>
Co-authored-by: Syed Shahmeer Ali <syedshahmeerali196@gmail.com>
Co-authored-by: Octopus <liyuan851277048@icloud.com>
Co-authored-by: lif <1835304752@qq.com>
## Summary
- Stop pulling chunk vectors (`q_*_vec`) back from Elasticsearch in the
main retrieval path. ES already knows them; shipping them was pure
bandwidth/memory overhead.
- Recover the per-chunk cosine similarity via a second KNN-only ES call
filtered by the candidate chunk ids. The new `_score` is merged with
locally computed term similarity using the user-configured
`vector_similarity_weight`.
- Lazily fetch the chunk embedding only for the chunks
`insert_citations` actually needs.
## Details
**`rag/nlp/search.py`**
- `Dealer.search`: no longer appends `q_*_vec` to the ES select list.
OceanBase still gets it (its rerank path is unchanged).
- New `Dealer._knn_scores(sres, idx_names, kb_ids)`: a `MatchDenseExpr`
over the cached query vector filtered by `id IN sres.ids`, returning
`{chunk_id: cosine_score}` via ES `_score`.
- New `Dealer.rerank_with_knn(...)`: term similarity from
`qryr.token_similarity` plus the ES-supplied KNN score, combined with
`tkweight`/`vtweight` and the existing rank-feature bonus.
- New `Dealer.fetch_chunk_vectors(chunk_ids, tenant_ids, kb_ids, dim)`:
on-demand vector fetch for citation use.
- `Dealer.retrieval` routes Infinity → unchanged, OceanBase → existing
local `rerank`, ES → new KNN-score path.
**`common/doc_store/es_conn_base.py`**
- New `get_scores(res)` helper returning `{_id: _score}` directly from
hit headers (ES doesn't surface `_score` through `get_fields`).
**`api/db/services/dialog_service.py`**
- New top-level `_hydrate_chunk_vectors(...)` helper. On ES it
back-fills `ck["vector"]` from `fetch_chunk_vectors` right before
`insert_citations`. No-op on Infinity / OB (their chunks already carry
vectors).
- Both `decorate_answer` closures became `async` and are `await`-ed at
all call sites in `async_chat` and `async_ask`.
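
The score merge can be pictured as a weighted sum. The sketch below assumes a linear combination with `tkweight = 1 - vtweight` and an additive rank bonus; those details, and the name `merge_scores`, are assumptions rather than the exact logic of `Dealer.rerank_with_knn`.

```python
def merge_scores(term_sims, knn_scores, vector_similarity_weight, rank_bonus=None):
    """Combine per-chunk term similarity with KNN scores.

    term_sims and knn_scores map chunk_id -> score; chunks missing from
    knn_scores contribute a vector score of 0.0.
    """
    vtweight = vector_similarity_weight
    tkweight = 1.0 - vtweight  # assumed complement of the vector weight
    rank_bonus = rank_bonus or {}
    return {
        cid: tkweight * tsim
        + vtweight * knn_scores.get(cid, 0.0)
        + rank_bonus.get(cid, 0.0)
        for cid, tsim in term_sims.items()
    }
```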
## Backend behavior
| Backend | Returns chunk vec in main search | Sim source | Vectors for citations |
|---|---|---|---|
| ES | No | second KNN call (`_score`) merged with term sim | fetched on demand |
| Infinity | No (unchanged) | normalized `_score` | already on chunks |
| OceanBase | Yes (kept) | local hybrid rerank | already on chunks |
## Test plan
### What problem does this PR solve?
Adds the missing Anthropic provider implementation for the Go model
provider layer.
Closes #14939
### What changed
- Add `conf/models/anthropic.json` with Anthropic Claude chat/vision
models and API endpoints.
- Add `internal/entity/models/anthropic.go` implementing non-streaming
Messages API chat, model listing, and connection checking.
- Register `anthropic` in the Go model factory.
- Add httptest coverage for headers, payload mapping, response parsing,
validation errors, provider errors, model listing, connection checking,
factory registration, and unsupported methods.
### Notes
Streaming chat is left as an explicit `no such method` follow-up because
this initial provider focuses on non-streaming chat and connection
checking.
### Tests
- `docker run --rm -v
/home/ubuntu/Documents/gitTensor_repos/carlos/ragflow:/work -v
/tmp/ragflow-go-cache:/go/pkg/mod -v
/tmp/ragflow-go-build:/root/.cache/go-build -w /work golang:1.25 go test
-vet=off ./internal/entity/models -run Anthropic -count=1 -v`
- `docker run --rm -v
/home/ubuntu/Documents/gitTensor_repos/carlos/ragflow:/work -v
/tmp/ragflow-go-cache:/go/pkg/mod -v
/tmp/ragflow-go-build:/root/.cache/go-build -w /work golang:1.25 go test
-vet=off ./internal/entity -count=1`
- `git diff --check`
- `jq . conf/models/anthropic.json >/dev/null`
Plain `go test ./internal/entity/models` currently hits pre-existing
unrelated vet findings in other provider files (`baidu.go`, `cohere.go`,
`fishaudio.go`, `openrouter.go`).
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR adds non-streaming chat support for the Jina Go model provider.
The Jina provider was added with embedding, rerank, model listing, and
connection checking, but `ChatWithMessages` still returned a
not-implemented error even though Jina exposes an OpenAI-compatible
`/v1/chat/completions` endpoint.
Closes #14933
**The following functionalities are now supported:**
### **Jina:**
- [x] Chat
- [ ] Stream Chat
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [ ] Balance
### **Implementation details:**
- Implements `JinaModel.ChatWithMessages`
- Sends `Authorization: Bearer <api-key>` and JSON chat completion
requests
- Validates API key, model name, messages, and configured region before
making requests
- Forwards supported chat config fields: `max_tokens`, `temperature`,
`top_p`, and `stop`
- Parses the first chat completion choice into `ChatResponse.Answer`
- Adds `jina-ai/jina-vlm` as a chat-capable model in
`conf/models/jina.json`
- Adds focused unit tests for request construction, auth, response
parsing, validation errors, provider errors, and region handling
**Verification:**
```plaintext
docker run --rm -v $PWD:/repo -w /repo golang:1.25 sh -c '/usr/local/go/bin/gofmt -w internal/entity/models/jina.go internal/entity/models/jina_test.go && /usr/local/go/bin/go test -vet=off ./internal/entity/models -run TestJina -count=1'
ok ragflow/internal/entity/models 0.037s
```
Note: `go test ./internal/entity/models -run TestJina -count=1`
currently hits unrelated existing vet findings in other provider files,
so the focused Jina tests were run with `-vet=off`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes #14893
The Novita Go driver landed in #14850 and shipped a stub `Embed` method
that returned `"novita, no such method"`, so Novita could not be used as
an embedding provider in RAGFlow. This PR fills that gap.
Novita exposes a public embeddings endpoint at `POST
https://api.novita.ai/v3/embeddings` that accepts the standard
OpenAI-compatible request shape (`{model, input}`) with `Authorization:
Bearer <api_key>`. Two embedding models are documented in Novita's model
library: `baai/bge-m3` (multilingual, 8192 tokens) and
`baai/bge-large-en-v1.5`.
### Changes
- `internal/entity/models/novita.go`: implement `NovitaModel.Embed`.
  - Validate inputs (api key, model name) and short-circuit on empty
    texts.
  - Resolve region with the existing `baseURLForRegion` helper.
  - Build URL from `URLSuffix.Embedding` (the embeddings path lives under
    `/v3/`, separate from the chat path under `/openai/v1/`).
  - Send `{model, input}` POST body, add `dimensions` when
    `embeddingConfig.Dimension > 0` (matches the pattern in #14735).
  - Bearer auth + JSON content type, mirroring the chat path.
  - Parse `{data: [{embedding, index}]}` and reorder by `index`, rejecting
    out-of-range indices, duplicates, and missing entries so the output
    always lines up with the input. Same shape as the merged Mistral and
    Upstage Embed implementations.
- `conf/models/novita.json`:
  - Add `"embedding": "v3/embeddings"` to `url_suffix`.
  - Add default embedding model entries for `baai/bge-m3` and
    `baai/bge-large-en-v1.5` so they appear in the model picker.
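
The reorder-by-`index` step with its three rejection cases can be sketched as below. This is a Python illustration of the idea, not the Go code in `novita.go`; `reorder_embeddings` is a hypothetical name.

```python
def reorder_embeddings(data, expected):
    """Place each {"embedding", "index"} item at its index, validating coverage."""
    out = [None] * expected
    for item in data:
        idx = item["index"]
        if not isinstance(idx, int) or idx < 0 or idx >= expected:
            raise ValueError(f"index {idx} out of range")
        if out[idx] is not None:
            raise ValueError(f"duplicate index {idx}")
        out[idx] = item["embedding"]
    missing = [i for i, vec in enumerate(out) if vec is None]
    if missing:
        raise ValueError(f"missing embeddings for indices {missing}")
    return out
```

Validating before returning means a caller never receives a vector list that is shorter than, or misaligned with, the input texts.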
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This fixes a bug where files uploaded in chat were left in storage after
the session was deleted. It now removes those chat-uploaded blobs during
session deletion. Fixes #14965.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## RAG Optimization Description
Optimize the core `BaseTitleChunker` in
`rag/flow/chunker/title_chunker/common.py` to improve RAG document
chunking quality and retrieval accuracy.
## Key Changes
1. **Format-branched text processing**: Preserve original whitespace &
indentation for Markdown/HTML payloads to maintain document semantics
and chunk fidelity; only perform full whitespace cleaning on plain text
content.
2. **Empty chunk filtering**: Thoroughly filter invalid pure-blank lines
to reduce noisy data in vector database.
3. **Code deduplication**: Unified markdown/text/html payload extraction
logic, removed redundant repeated code blocks.
4. **None serialization fix**: Avoid converting `None` value into
literal `"None"` string in chunk text fields.
5. **Production logging**: Added input/output line count logging for
filter logic, observable in online environment.
6. **100% backward compatible**: No changes to chunking hierarchy rules,
output format and all existing workflows.
## RAG Business Value
- Preserves document format fidelity for structured Markdown/HTML files
- Reduces invalid noisy chunks → improves RAG retrieval precision
- Cleans plain text data → optimizes vector embedding quality
- Improves code maintainability with no breaking changes
- Provides observable logging for chunk filtering behavior
## Compatibility
- ✅ No API changes
- ✅ No chunk logic modifications
- ✅ All document parsing/chunking workflows unaffected
- ✅ All pre-checks passed, no code conflicts
### Type of change
- [x] Refactoring
- [x] Performance Improvement
## Summary
This PR fixes 3 bugs in agent components:
### Bug 1: `DataOperations._invoke()` dispatches `"literal_eval"` to
wrong handler
**File:** `agent/component/data_operations.py`, line 76
The `_invoke()` method compares `self._param.operations` against
`"recursive_eval"` (line 76), but the valid value defined in
`DataOperationsParam.__init__()` (line 29) and validated in `check()`
(line 43) is `"literal_eval"`. This means selecting the `literal_eval`
operation from the frontend would never match, and the method
`_literal_eval()` would never be called.
**Fix:** Change `"recursive_eval"` to `"literal_eval"` in the dispatch
condition.
### Bug 2: `VariableAssigner._clear()` — `bool` branch unreachable
**File:** `agent/component/variable_assigner.py`, lines 95–100
In Python, `bool` is a subclass of `int` (`isinstance(True, int)` is
`True`). The `isinstance(variable, int)` check on line 95 catches
boolean values before the `isinstance(variable, bool)` check on line 99,
making the bool branch unreachable. A boolean variable would be cleared
to `0` instead of `False`.
**Fix:** Move the `isinstance(variable, bool)` check before
`isinstance(variable, int)`.
### Bug 3: `LoopItem.evaluate_condition()` — `bool` branch unreachable
**File:** `agent/component/loopitem.py`, lines 67–93
Same issue as Bug 2: `isinstance(var, (int, float))` on line 67 catches
boolean values before `isinstance(var, bool)` on line 85. Boolean
variables would be evaluated with numeric operators (`=`, `≠`, `>`,
etc.) instead of boolean operators (`is`, `is not`).
**Fix:** Move the `isinstance(var, bool)` check before `isinstance(var,
(int, float))`.
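Bugs 2 and 3 come from the same language rule, which a minimal sketch can make concrete. The `clear_value` helper below is hypothetical and is not the code in `variable_assigner.py` or `loopitem.py`; it only illustrates why the `bool` test has to come before the `int` test.

```python
def clear_value(variable):
    """Return the 'empty' value for a variable, keeping its type.

    Hypothetical helper for illustration; not the RAGFlow implementation.
    """
    # bool must be tested before int because isinstance(True, int) is True.
    if isinstance(variable, bool):
        return False
    if isinstance(variable, int):
        return 0
    if isinstance(variable, float):
        return 0.0
    if isinstance(variable, str):
        return ""
    return None


print(isinstance(True, int))    # True: bool is a subclass of int
print(repr(clear_value(True)))  # False
print(repr(clear_value(7)))     # 0
```

If the first two branches were swapped, `clear_value(True)` would return `0`, which is the behavior described in Bug 2.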
## Test plan
- [ ] Verify `DataOperations` with `literal_eval` operation correctly
invokes `_literal_eval()`
- [ ] Verify `VariableAssigner._clear()` returns `False` for boolean
variables (not `0`)
- [ ] Verify `LoopItem.evaluate_condition()` uses boolean operators for
`True`/`False` values
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Fixed operation routing logic to correctly dispatch the "literal_eval"
operation to its handler.
* **Refactor**
* Reorganized conditional branch ordering in agent components to improve
code structure and maintainability without affecting functional
behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
LLM/chat and search UIs render Markdown in several places (document
preview, floating chat widget, next-search, etc.). Plugin lists and
behavior were duplicated or inconsistent, and single newlines in model
output were not always rendered as visible line breaks, which hurts
readability for chat-style content.
This PR centralizes shared **remark/rehype** configuration (including
**`remark-breaks`** for newline handling) and wires the main Markdown
surfaces to use it, so behavior is consistent and easier to maintain.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
This PR implements OCR for Baidu and Mistral, adds the PaddleOCR
provider, and implements ASR for Cohere.
**Verified examples from the CLI:**
```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke
Nando Metzger
Photogra
Anton Obukhov
Rodrigo Caye Daudt
netry and Remote Sensing,
Shengyu Huang
Konrad Schindler
ETH Zürich
<div style="text-align: c... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# Cohere
RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text |
+-----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Closes #14753
## What changed
| File | Change |
|---|---|
| `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove
`strenum==0.4.15` |
| `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` |
| `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix
legs |
| `CLAUDE.md` | dev setup command + requirements note updated |
| `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` →
`from enum import StrEnum` |
| `agent/tools/code_exec.py` | same |
`StrEnum` has been in the stdlib since Python 3.11 — the `strenum`
backport package is no longer needed once the floor is 3.13.
## Why uv.lock is not regenerated
`uv lock --python 3.13` fails because:
1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0`
2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels)
depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0`
3. These two constraints are irreconcilable on Python 3.13
The lockfile regeneration requires loosening the `numpy` upper bound in
the `infiniflow/graspologic` fork. Once that fork commit is updated and
the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will
succeed.
## RFC corrections
Two claims in the original RFC (#14753) did not hold up under code
review:
- **"graspologic hard-blocks 3.13"** — the infiniflow fork at the pinned
commit has no `<3.13` Python constraint. The blocker is the transitive
`numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a
direct Python version cap.
- **"free-threading throughput gains for I/O-bound workload"** — Python
3.13 free-threading requires a special `--disable-gil` build and
provides no benefit for async I/O code (the GIL is already released
during I/O). The real motivation is forward compatibility and improved
error messages.
## Summary
- Rename misspelled attribute `model_speciess` to `model_species` across
4 files
- The extra `s` is a typo — `species` is already plural
## Test plan
- [ ] Verify PDF parsing with laws/manual/paper parser types still works
correctly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: yuj <yuj@ztjzsoft.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
This PR implements TTS and ASR for SiliconFlow and TTS for StepFun.
**The following functionalities are now supported:**
**SiliconFlow:**
- [x] Text To Speech
- [x] Audio To Text
- [x] Stream Audio To Text
**StepFun:**
- [x] Audio To Text
- [x] Stream Audio To Text
**Verified examples from the CLI:**
```plaintext
# SiliconFlow
RAGFlow(user)> tts with 'FunAudioLLM/CosyVoice2-0.5B@test@Siliconflow' text 'hello? show yourself' play format 'wav' param '{"voice": "fnlp/MOSS-TTSD-v0.5:alex"}'
SUCCESS
RAGFlow(user)> asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param ''
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> stream asr with 'FunAudioLLM/SenseVoiceSmall@test@siliconflow' audio './internal/test.wav' param ''
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
```
+---------------------+----------------------------------+-------------+-----------------+---------+--------+------+
| created_at | id | meta_fields | name | size | status | type |
+---------------------+----------------------------------+-------------+-----------------+---------+--------+------+
| 2026-05-08 19:35:08 | f6aa38bb4ad111f1ba6338a74640adcc | map[] | abc.pdf | 3387987 | 1 | pdf |
+---------------------+----------------------------------+-------------+-----------------+---------+--------+------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Users frequently ask how to use Ollama for local LLM inference with
RAGFlow. This FAQ entry provides step-by-step instructions for setting
up Ollama as a local model provider.
### Type of change
- [x] Documentation update
### Description
Adds a new FAQ entry: "How do I use Ollama with RAGFlow for local LLM
inference?"
Covers:
1. Starting Ollama and pulling a model
2. Configuring Ollama as a model provider in RAGFlow Settings
3. Using the Ollama model in an assistant
### What problem does this PR solve?
Closes #14878.
`VllmModel.Rerank()` in
[internal/entity/models/vllm.go:551](internal/entity/models/vllm.go#L551)
is currently a stub returning `nil, fmt.Errorf("%s, Rerank not
implemented", z.Name())`, and
[conf/models/vllm.json](conf/models/vllm.json) is missing a `rerank`
entry in `url_suffix`. Chat (long-standing) and embeddings (#14688)
already work, so rerank is the last missing leg of the retrieval
pipeline for operators running everything on a single self-hosted vLLM
server — today they have to point rerank at a different provider, which
defeats the point of a fully local deployment.
Upstream vLLM has supported a Jina/Cohere-compatible `POST /v1/rerank`
endpoint since v0.7
([vllm-project/vllm#12376](https://github.com/vllm-project/vllm/pull/12376)).
The request/response shape is essentially identical to the NVIDIA driver
landed in #14778, so this PR mirrors that structure with two
vLLM-specific adjustments.
This PR replaces the stub with a real implementation against vLLM's
`/v1/rerank`:
- `POST {baseURL}/rerank`
- Request body: `{"model": "<modelName>", "query": "<query>",
"documents": [...], "top_n": <int>}` — documents are a flat `[]string`,
**not** wrapped as `{text: "..."}` like NVIDIA's `/ranking`.
- Response body: `{"results": [{"index": int, "relevance_score": float},
...]}` (Jina-compatible; the optional `document` field is ignored since
callers reconstruct text via `Index`).
- `Authorization: Bearer <ApiKey>` is set **only when `APIConfig.ApiKey`
is non-empty**, matching the existing `Embed`/`ListModels` behaviour in
this file. vLLM is a local driver and can be deployed without an API
key.
The return shape matches the existing `*RerankResponse` contract used by
the NVIDIA ([nvidia.go:461](internal/entity/models/nvidia.go#L461)),
Aliyun ([aliyun.go:507](internal/entity/models/aliyun.go#L507)), and
ZhipuAI ([zhipu-ai.go:554](internal/entity/models/zhipu-ai.go#L554))
drivers, i.e. `Data []RerankResult` carrying `{Index, RelevanceScore}`
in the API's ranking order. Callers that need original-input order sort
by `Index`.
Behaviour requirements from the issue, all covered:
1. Empty `documents` → returns `&RerankResponse{}` without an HTTP call.
2. Missing `modelName` → `"model name is required"` validation error.
3. `rerankConfig.TopN` honored when `0 < TopN < len(documents)`;
otherwise `top_n` defaults to `len(documents)` so callers get a score
per input.
4. Non-200 responses return an error including upstream status and body
(`"vLLM rerank API error: <status>, body: <body>"`).
5. Response `index` values are bounds-checked against `len(documents)`.
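The request-building and validation rules above can be sketched in a few lines. The actual driver is written in Go; the Python below is only an illustration of the logic, and the function names, the order of the two input checks, and the tuple return shape are assumptions rather than the driver's real API.

```python
def build_rerank_payload(model_name, query, documents, top_n=0):
    """Build the JSON body for POST {baseURL}/rerank (hypothetical helper).

    Returns None when there are no documents, signalling that the caller
    should return an empty response without making an HTTP call.
    """
    # The relative order of these two checks is an assumption.
    if not documents:
        return None
    if not model_name:
        raise ValueError("model name is required")
    # Honor top_n only when 0 < top_n < len(documents); otherwise score all.
    effective_top_n = top_n if 0 < top_n < len(documents) else len(documents)
    return {
        "model": model_name,
        "query": query,
        "documents": list(documents),  # flat list of strings, not {"text": ...}
        "top_n": effective_top_n,
    }


def parse_rerank_results(body, num_documents):
    """Validate and extract (index, relevance_score) pairs in API order."""
    results = []
    for item in body.get("results", []):
        index = item["index"]
        if not 0 <= index < num_documents:
            raise ValueError(f"rerank index {index} out of range")
        results.append((index, float(item["relevance_score"])))
    return results
```

Keeping the results in the API's ranking order and carrying `index` alongside each score lets callers recover the original input order with a simple sort.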
**Scope:**
- [internal/entity/models/vllm.go](internal/entity/models/vllm.go) —
replaces the `Rerank` stub at line 551 with a real implementation; adds
`vllmRerankRequest`/`vllmRerankResponse` types for the slim subset of
the payload we need. Region/baseURL resolution, 30s context timeout,
conditional bearer header, and error wrapping all follow the existing
patterns in this file.
- [conf/models/vllm.json](conf/models/vllm.json) — adds `"rerank":
"rerank"` to `url_suffix`, joined to the operator-configured vLLM base
URL the same way the NVIDIA driver joins at
[nvidia.go:485](internal/entity/models/nvidia.go#L485).
-
[internal/entity/models/vllm_rerank_test.go](internal/entity/models/vllm_rerank_test.go)
— adds 7 `httptest`-backed tests mirroring `nvidia_rerank_test.go`:
happy path (out-of-order ranking → Index preservation), `top_n` clamp to
`RerankConfig.TopN`, empty-documents short-circuit, missing-model-name
validation, HTTP error propagation, out-of-range index rejection, and a
vLLM-specific `TestVllmRerankWithoutAPIKey` locking in the optional-auth
behaviour that distinguishes this driver from NVIDIA.
**Out of scope:** no interface change, no DDL, no frontend change. Chat,
embeddings, and balance paths are untouched. No new user-facing docs
required beyond the existing rerank model setup page — vLLM joins the
list of providers whose rerank model can be selected once `/v1/rerank`
is exposed by the server.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fixes #13975
## Problem
The GitHub data source connector had both `include_pull_requests` and
`include_issues` defaulting to `false` in both the frontend form and the
backend sync code. This meant that with the default configuration, **no
content was synced at all** from a GitHub repository — silently
producing zero results.
Additionally, the form field labels contained a typo: "Inlcude" instead
of "Include".
## Solution
- Changed `include_pull_requests` default from `false` to `true` in the
frontend form fields and default values
- Changed `include_issues` default from `false` to `true` in the
frontend form fields and default values
- Changed both backend defaults in `sync_data_source.py` from `False` to
`True`
- Fixed label typos: "Inlcude Pull Requests" → "Include Pull Requests"
and "Inlcude Issues" → "Include Issues"
This makes the GitHub connector consistent with the GitLab connector,
which already defaults `include_mrs`, `include_issues`, and
`include_code_files` all to `true`.
## Testing
- The connector now syncs both pull requests and issues by default when
a new GitHub data source is created
- Users who want to exclude PRs or issues can uncheck the corresponding
checkboxes in the form
Co-authored-by: octo-patch <octo-patch@github.com>
## What problem does this PR solve?
Closes #13384.
The `/api/v1/agentbots/<agent_id>/completions` non-streaming path
returned the first yielded SSE chunk and exited:
```python
async for answer in agent_completion(objs[0].tenant_id, agent_id, **req):
    return get_result(data=answer)
```
That meant structured output, the full assistant message, and reference
data were all dropped when an agent was called with `stream=false`.
Streaming worked because each event was forwarded individually;
non-streaming was returning a raw SSE-formatted string from a single
early event.
The v1 endpoint at
[`agent_api.py:1006-1050`](https://github.com/infiniflow/ragflow/blob/main/api/apps/restful_apis/agent_api.py#L1006-L1050)
already handles this correctly. This PR mirrors that aggregation in the
SDK beta endpoint: parse each SSE line, accumulate `content` from
`message` events, merge `reference`, collect `outputs.structured` from
each `node_finished` event keyed by `component_id`, and attach all of
them to the final response.
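The aggregation described above can be sketched roughly as follows. The event layout (`{"event": ..., "data": {...}}` carried on `data:` lines), the choice to merge `reference` from `message` events, and the `aggregate_sse` name are assumptions for illustration; the real endpoint may differ in its wire format and field handling.

```python
import json


def aggregate_sse(chunks):
    """Fold SSE chunks into one response dict (illustrative sketch only)."""
    content_parts = []
    reference = {}
    structured = {}
    for chunk in chunks:
        for line in chunk.splitlines():
            if not line.startswith("data:"):
                continue
            try:
                event = json.loads(line[len("data:"):].strip())
            except json.JSONDecodeError:
                continue  # skip lines that are not JSON payloads
            if not isinstance(event, dict):
                continue
            data = event.get("data")
            if not isinstance(data, dict):
                continue
            if event.get("event") == "message":
                content_parts.append(data.get("content", ""))
                reference.update(data.get("reference") or {})
            elif event.get("event") == "node_finished":
                outputs = data.get("outputs") or {}
                if outputs.get("structured"):
                    structured[data.get("component_id")] = outputs["structured"]
    result = {"content": "".join(content_parts), "reference": reference}
    if structured:  # omit the key when no node emitted structured output
        result["structured"] = structured
    return result
```

The key difference from the buggy loop is that nothing is returned until every chunk has been consumed.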
## Type of change
- [x] Bug fix (non-breaking change which fixes an issue)
## Test plan
- [ ] Build an agent with a node that emits structured output, call
`POST /api/v1/agentbots/<agent_id>/completions` with `stream=false` and
a beta API token, verify `data.structured.<component_id>` is present in
the response.
- [ ] Same agent with `stream=true` — verify behavior is unchanged.
- [ ] Agent without structured output — verify `data.structured` is
omitted, `content` and `reference` still aggregated correctly.
### What problem does this PR solve?
```
RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。
——佚名 |
+----------------------------------------------------------+
RAGFlow(user)> list 'test@gitee' tasks;
+---------+----------------------------------+
| status | task_id |
+---------+----------------------------------+
| success | C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5 |
+---------+----------------------------------+
RAGFlow(user)> show 'test@gitee' task 'C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # PDF 1: Purpose of RAGFlow
RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine designed to turn raw documents into reliable context for large language models.Its purpose is to make it practical to build an Al assistant that can ans... | 1 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Harden `api/utils/configs.deserialize_b64` so that it always routes
pickle data through the existing `RestrictedUnpickler`
(`restricted_loads`) rather than falling back to bare `pickle.loads()`.
- **CWE-502** — Deserialization of Untrusted Data
- **File / function**: `api/utils/configs.py` → `deserialize_b64`
- **Caller**: `SerializedField.python_value` in `api/db/db_models.py`
(invoked by Peewee whenever a pickled DB column is read)
## The issue
Before this change, `deserialize_b64` consulted a
`use_deserialize_safe_module` config flag that **defaults to `False`**
and is not set anywhere in the repository:
```python
use_deserialize_safe_module = get_base_config('use_deserialize_safe_module', False)
if use_deserialize_safe_module:
    return restricted_loads(src)
return pickle.loads(src)  # <-- default path
```
So the default code path was unrestricted `pickle.loads()` on bytes read
from a MySQL `SerializedField(serialized_type=PICKLE)` column. Any
attacker who can influence those bytes (SQL injection elsewhere,
compromised DB credentials, a backup restored from an untrusted source,
or a compromised replication peer) can craft a pickle payload that
achieves arbitrary code execution on the ragflow application server when
the field is next read.
Today no model in-tree instantiates a `SerializedField` with the default
PICKLE type — only `JsonSerializedField` is used in practice — so the
attack surface is currently **latent** rather than actively reachable
through an HTTP endpoint. But the insecure-by-default behaviour is a
sharp edge: any future field that uses the default PICKLE serialization
would silently inherit RCE-on-read semantics.
## The fix
```diff
-    use_deserialize_safe_module = get_base_config(
-        'use_deserialize_safe_module', False)
-    if use_deserialize_safe_module:
-        return restricted_loads(src)
-    return pickle.loads(src)
+    return restricted_loads(src)
```
`restricted_loads` is the existing `RestrictedUnpickler` already defined
in the same file, which limits permitted modules to `numpy` and
`rag_flow`. The config flag (and the now-dead `get_base_config` import)
are removed.
Diff is 1 insertion / 6 deletions, scoped to a single function.
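For readers unfamiliar with the technique, a restricted unpickler can be sketched by overriding `find_class`, following the pattern shown in the Python `pickle` documentation. This is a minimal illustration, not the in-tree class: the `SAFE_MODULES` name and the top-level-package matching rule are assumptions, and the real `RestrictedUnpickler` may check modules differently.

```python
import io
import pickle

# Assumed allow-list for illustration; the description names numpy and rag_flow.
SAFE_MODULES = {"numpy", "rag_flow"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Permit a global only if its top-level package is allow-listed.
        if module.split(".")[0] in SAFE_MODULES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(src: bytes):
    """Deserialize bytes while refusing globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(src)).load()


# Plain containers need no global lookups, so they round-trip unchanged.
print(restricted_loads(pickle.dumps({"a": [1, 2]})))
```

Because the allow-list works at module granularity, any callable reachable inside an allowed module remains loadable, which is the limitation discussed in the note on `restricted_loads` further down.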
## Testing
- Built a malicious pickle whose `__reduce__` resolves to
`posix.system('id')`. Pre-fix: executes. Post-fix: `restricted_loads`
raises `UnpicklingError: global 'posix.system' is forbidden`.
- Round-tripped a benign `numpy.ndarray` through `serialize_b64` →
`deserialize_b64`. Values preserved bit-for-bit.
- Confirmed `use_deserialize_safe_module` is not set in any config file
in the tree, so removing the flag does not change any operator-facing
knob that was actually in use.
## A note on `restricted_loads` itself
The existing `SECURITY.md` notes that `restricted_loads`'s `numpy`
allow-list can still be reached via `numpy.f2py.diagnose.run_command`.
This PR does **not** attempt to fix that — it is a separate hardening
question about tightening the allow-list to specific symbols rather than
whole modules. The change here strictly improves on the status quo (bare
`pickle.loads`) and brings the default path in line with what the
`restricted_loads` helper was clearly designed for. Happy to follow up
with a separate PR narrowing the allow-list if that direction is
welcome.
## Adversarial review
Before submitting, we tried to argue this finding away. The two
strongest objections are (1) "no field uses PICKLE today, so this is
unreachable" — true, but the default behaviour of a security-sensitive
helper still matters because new fields silently inherit it; and (2)
"the attacker already needs DB write access, which is game over" —
partially true, but pickle-RCE meaningfully escalates *data tampering*
into *code execution on the application host* (filesystem, internal
network, in-process secrets), which is not equivalent. The fix is one
line of real code, has no behavioural cost for legitimate callers, and
removes an insecure default. We decided it was worth filing.
---
<sub>_Submitted by Sebastion — autonomous open-source security research
from [Foundation Machines](https://foundationmachines.ai). Free for
public repos via the [Sebastion AI GitHub
App](https://github.com/marketplace/sebastion-ai)._</sub>
### What problem does this PR solve?
This PR fixes two issues in Agent Retrieval behavior and configuration
UX:
1. `top_k` configured in Agent Retrieval was not passed down to the
backend retriever call, so retrieval could ignore the configured vector
recall limit.
2. Similarity weight slider semantics were confusing in Agent forms
because the Agent field stores `keywords_similarity_weight` while UI
interactions were interpreted as vector weight. This could cause
displayed values and actual behavior to diverge.
This PR ensures Agent retrieval uses configured `top_k`, and makes the
slider behavior consistent and explicit for both vector and keyword
weight modes.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Fixes **Tongyi-Qianwen** (`QWenEmbed`) text embeddings when the
configured `base_url` points at DashScope **international**
(`dashscope-intl.aliyuncs.com`) or **China** (`dashscope.aliyuncs.com`)
hosts, including values copied from Model Studio that use the
**OpenAI-compatible** path (`.../compatible-mode/v1`).
- The `dashscope` Python SDK (`TextEmbedding.call`) expects the
**native** HTTP root (`https://<host>/api/v1`), not the
OpenAI-compatible base URL. Without mapping, international accounts
could hit the wrong host or path.
## Implementation
- Added `_dashscope_native_http_api_url()` to normalize known DashScope
hosts to `.../api/v1`, and wired `QWenEmbed` to set
`dashscope.base_http_api_url` before each embedding call (document and
query).
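A rough sketch of this kind of normalization is shown below. The `to_native_api_url` name is a hypothetical stand-in for `_dashscope_native_http_api_url()`, whose real signature and edge-case handling may differ; only the two hosts named in the summary are treated as known.

```python
from urllib.parse import urlsplit

# Hosts named in the description; anything else is passed through unchanged.
_DASHSCOPE_HOSTS = {"dashscope.aliyuncs.com", "dashscope-intl.aliyuncs.com"}


def to_native_api_url(base_url):
    """Map a DashScope base URL to its native root, https://<host>/api/v1.

    Illustrative only; not the function added by this PR.
    """
    if not base_url:
        return base_url
    host = urlsplit(base_url).hostname
    if host in _DASHSCOPE_HOSTS:
        return f"https://{host}/api/v1"
    return base_url


print(to_native_api_url("https://dashscope-intl.aliyuncs.com/compatible-mode/v1"))
```

Under these assumptions, an OpenAI-compatible URL copied from Model Studio resolves to the native root on the same host, so international accounts stay on the international endpoint.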
## Notes
- In-code comments document the Tongyi-Qianwen / DashScope intl vs CN
behavior for future maintainers.
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
## Problem
When a user uploads a file attachment in their first message (Q1) and
then sends a follow-up message (Q2) that triggers a backend response,
the uploaded file attachment disappears from Q1 in the chat UI.
Fixes #13959
## Root Cause
In `single-chat-box.tsx`, a `useEffect` hook syncs `derivedMessages`
from `conversation?.messages` whenever the conversation data changes
(e.g., after a new assistant reply arrives):
```typescript
useEffect(() => {
  const messages = conversation?.messages;
  if (Array.isArray(messages)) {
    setDerivedMessages(messages); // ← overwrites local state
  }
}, [conversation?.messages, setDerivedMessages]);
```
The problem is that `conversation.messages` comes from the server, which
stores messages as plain JSON. Browser `File` objects (uploaded by the
user) cannot be serialized to JSON, so they are never stored on the
server. Each time the server data is applied to local state, the `files`
array on the user's first message is lost.
## Fix
Instead of replacing the local messages wholesale, preserve any `files`
entries from the previous local state by ID before applying the server
data:
```typescript
useEffect(() => {
  const messages = conversation?.messages;
  if (Array.isArray(messages)) {
    setDerivedMessages((prevMessages) => {
      const filesMap = new Map(
        prevMessages
          .filter((m) => m.files?.length)
          .map((m) => [m.id, m.files]),
      );
      if (filesMap.size === 0) {
        return messages;
      }
      return messages.map((m) => ({
        ...m,
        files: filesMap.get(m.id) ?? m.files,
      }));
    });
  }
}, [conversation?.messages, setDerivedMessages]);
```
This is a minimal, targeted fix: when there are no local files to
preserve the behavior is identical to before (early return with plain
assignment). When local file objects exist they are re-attached to the
corresponding server messages by ID.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved search query processing to properly handle special characters
and apostrophes in search terms and synonyms.
* Fixed chat message file attachments to persist when syncing with
server.
* **Refactor**
* Simplified OCR detection return values by removing timing metadata.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: ximi <octo-patch@github.com>
### What problem does this PR solve?
Add a document download endpoint and refactor the existing download function.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: This enables SelectWithSearch to search by label.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
This PR implements TTS for the FishAudio and MiniMax providers and ASR
for FishAudio.
**The following functionalities are now supported:**
**FishAudio:**
- [x] Text To Speech
- [x] Stream Text To Speech
- [x] Audio To Text
**OpenRouter:**
- [x] Text To Speech
**Verified examples from the CLI:**
```plaintext
# FishAudio
RAGFlow(user)> tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav
SUCCESS
RAGFlow(user)> stream tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav
SUCCESS
RAGFlow(user)> asr with 'transcribe-1@test@fishaudio' audio './internal/test.wav' param '{"language": "en", "ignore_timestamps": true}'
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
## Summary
- Fixes resource-management bugs in the `POST
/agents/test_db_connection` endpoint where database connections could be
left open on error (part of #14750)
## Changes
- `api/apps/restful_apis/agent_api.py` — `test_db_connection`:
- mysql / mariadb / oceanbase / postgres: replaced bare `db.connect()` /
`db.close()` fallthrough with `with db.connection_context()` and a probe
`SELECT 1` — guaranteed close on both success and exception
- mssql: nested `try/finally` blocks so `cursor.close()` and
`db.close()` are always called even when `cursor.execute()` raises
- trino: wrapped cursor ops in `try/finally` for the same reason
- Removed the `if req["db_type"] != "mssql": db.connect(); db.close()`
shared fallthrough block — each branch now owns its teardown
- Consolidated to a single `return get_json_result(...)` after the
if/elif chain
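The nested `try/finally` shape used for the mssql and trino branches can be sketched with the standard-library `sqlite3` module. SQLite is only a stand-in here so the example runs anywhere; the endpoint itself uses other drivers, and `probe_connection` is a hypothetical name.

```python
import sqlite3


def probe_connection(database=":memory:"):
    """Run SELECT 1 and always release the cursor and connection."""
    conn = sqlite3.connect(database)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute("SELECT 1")
            return cursor.fetchone() == (1,)
        finally:
            cursor.close()  # runs even if execute() raises
    finally:
        conn.close()  # runs even if cursor creation or the query fails


print(probe_connection())
```

Each resource is closed by the `finally` block that immediately encloses its use, so an exception at any step cannot leave the connection open.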
### What problem does this PR solve?
When multiple MCP servers expose tools with the same name, the agent
currently registers those tools using their original MCP names. This can
lead to two issues:
- later MCP tools may overwrite earlier ones in the agent tool map
- duplicate function names may be exposed to the LLM
This PR fixes duplicate MCP tool-name handling by applying the same
indexed naming strategy already used for native agent tools. Native
tools are exposed with generated names such as `<tool_name>_<index>` to
avoid collisions, and MCP tools now follow the same convention for
consistency.
Specifically, this PR:
- assigns unique indexed function names to MCP tools exposed to the LLM
- preserves each MCP tool's original server-side name in an
`MCPToolBinding`
- dispatches MCP calls using the original MCP tool name while keeping
the indexed name in the agent tool map
- allows MCP metadata conversion to override only the OpenAI function
name without modifying the original MCP tool metadata
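A rough sketch of the indexed naming scheme, under stated assumptions: `ToolBinding` is a hypothetical stand-in for the PR's `MCPToolBinding` (its real fields are not shown in this description), and `register_tools` / `dispatch` are illustrative helpers rather than the agent's actual functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolBinding:
    """Hypothetical stand-in for MCPToolBinding."""
    server: str          # which MCP server owns the tool
    original_name: str   # the name the server knows the tool by


def register_tools(servers, start_index=0):
    """Map unique `<tool_name>_<index>` names to their original bindings.

    `servers` is a dict of server name -> list of tool names.
    """
    tool_map = {}
    index = start_index
    for server, tool_names in servers.items():
        for name in tool_names:
            tool_map[f"{name}_{index}"] = ToolBinding(server, name)
            index += 1
    return tool_map


def dispatch(tool_map, exposed_name):
    """Resolve the LLM-facing name back to (server, original tool name)."""
    binding = tool_map[exposed_name]
    return binding.server, binding.original_name
```

With two servers that both expose `mcp0`, the map holds `mcp0_0` and `mcp0_1`, and each resolves to its own server while the call still uses the original name `mcp0`.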
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Validation
The validation was performed using two MCP servers. Both servers exposed
a tool with the same name: `mcp0`. Both tools take no input parameters.
**MCP Server One:**
<img width="1780" height="625" alt="ONE"
src="https://github.com/user-attachments/assets/801a2654-fc10-4b71-b31c-81841fd40c55"
/>
**MCP Server Two:**
<img width="1777" height="624" alt="Second"
src="https://github.com/user-attachments/assets/c095151d-7bdf-47c8-9bfe-6aaf4a01b944"
/>
**Before the fix:**
When invoking `mcp0`, only the `mcp0` tool from the MCP server injected
later could be called successfully. As shown below, both `mcp0` tools
were present, but only the later-registered one was actually invokable.
<img width="694" height="935" alt="Three"
src="https://github.com/user-attachments/assets/3b9d7ab2-1765-492c-b8e0-bf05a69933ca"
/>
**After the fix:**
Both `mcp0` tools can now be invoked correctly.
<img width="737" height="1095" alt="F"
src="https://github.com/user-attachments/assets/6e896627-2b7f-41bb-becc-daa0c73ff58f"
/>
<img width="730" height="1090" alt="six"
src="https://github.com/user-attachments/assets/aba75593-26ae-4e3b-951d-b45ff177fd32"
/>
### What problem does this PR solve?
Fixes #14866.
Previously, `DocumentService.increment_chunk_num` and
`decrement_chunk_num` updated the `Document` row and its parent
`Knowledgebase` row in two separate, non-transactional statements. If
the second update failed (DB error, connection drop, etc.) after the
first one succeeded, the document and knowledge base chunk/token
counters would drift apart and stay inconsistent.
There was also a behavioral asymmetry between the two methods:
- `increment_chunk_num` only logged a warning when the document row was
missing and returned a value that callers usually treated as success.
- `decrement_chunk_num` raised `LookupError` in the same situation.
This PR makes the counter updates atomic and aligns the missing-document
behavior between the two methods:
- Wrap the `Document` and `Knowledgebase` updates in
`increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()`
block so both succeed or both roll back together.
- Raise `LookupError` from `increment_chunk_num` when the target
document no longer exists, matching `decrement_chunk_num`.
- Update `reset_document_for_reparse` in `document_api_service.py` to
catch the new `LookupError` and return a proper "Document not found!"
API error instead of propagating the exception.
No schema changes, no API contract changes for the success path; only
the failure mode for a missing document during reparse is now a clean
error response instead of an uncaught exception.
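The all-or-nothing behavior can be illustrated with the standard-library `sqlite3` module instead of the project's `DB.atomic()`. This is only a sketch: the table layout, the `make_db` helper, and the `LookupError` for a missing knowledge base row are assumptions made so the rollback is observable.

```python
import sqlite3


def increment_chunk_num(conn, doc_id, kb_id, chunk_num):
    """Update both counters in one transaction, or neither."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE document SET chunk_num = chunk_num + ? WHERE id = ?",
            (chunk_num, doc_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"Document {doc_id} not found.")
        cur = conn.execute(
            "UPDATE knowledgebase SET chunk_num = chunk_num + ? WHERE id = ?",
            (chunk_num, kb_id),
        )
        if cur.rowcount == 0:
            # Assumption for the demo: treat a missing parent row as a failure.
            raise LookupError(f"Knowledge base {kb_id} not found.")


def make_db():
    """Build a tiny in-memory schema (hypothetical, for the demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, chunk_num INTEGER)")
    conn.execute("CREATE TABLE knowledgebase (id TEXT PRIMARY KEY, chunk_num INTEGER)")
    conn.execute("INSERT INTO document VALUES ('d1', 0)")
    conn.execute("INSERT INTO knowledgebase VALUES ('kb1', 0)")
    conn.commit()
    return conn
```

If the second update fails, the first one is rolled back, so the document counter never moves ahead of the knowledge base counter.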
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Replace `gen_conf={}` with `gen_conf=None` + guard in
`rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat,
LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat
classes)
- Replace `doc_ids=[]` with `doc_ids=None` + guard in
`api/db/services/document_service.py` (1 instance)
- Mutable default arguments are shared across all calls, causing
potential cross-request state contamination
- See Python docs:
https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects
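For readers unfamiliar with the pitfall, here is a minimal sketch of the shared-default behavior and the `None` guard. The function names and the `history` key are made up for illustration; they are not the signatures in `chat_model.py`.

```python
def chat_shared(message, gen_conf={}):
    """Buggy form: the same dict object is reused by every call."""
    gen_conf.setdefault("history", []).append(message)
    return gen_conf


def chat_fixed(message, gen_conf=None):
    """Fixed form: a fresh dict is created when the caller passes nothing."""
    if gen_conf is None:
        gen_conf = {}
    gen_conf.setdefault("history", []).append(message)
    return gen_conf
```

Two successive calls to `chat_shared` accumulate state in the default dict, whereas each call to `chat_fixed` starts from an empty one.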
## Test plan
- [x] Verify LLM calls work with and without explicit gen_conf
- [x] No behavior change for existing callers — `None` is replaced with
`{}` at function entry
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Closes #14853
The `/agents/download` and `/agents/<agent_id>/upload` endpoints in the
agent API are missing `@login_required` and `@add_tenant_id_to_kwargs`
decorators, allowing unauthenticated access. This is a security issue —
any user can upload files to or download files from an agent without
being logged in. Additionally, the upload endpoint bypasses canvas
access control (`@_require_canvas_access_async`).
This PR adds the missing authentication and authorization decorators to
both endpoints and replaces the manual `user_id` / `created_by` lookups
with the `tenant_id` provided by the auth middleware, making these
endpoints consistent with the rest of the agent API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
In `api/utils/web_utils.py`, `__get_pdf_from_html()` creates a Chrome
WebDriver but only calls `driver.quit()` inside the `TimeoutException`
handler. If the page element becomes stale before the timeout (no
exception raised), the WebDriver is never quit, leaking the Chrome
browser process and returning `None`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes
- Move the PDF printing logic and `driver.quit()` outside the `except`
block so they execute on all code paths
- Use `try/finally` to ensure `driver.quit()` is always called, even if
the `Page.printToPDF` DevTools call fails
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary
`Graph.set_variable_param_value()` in `agent/canvas.py` has a bug in its
nested path traversal logic. The `for` loop iterates through **all**
keys in the path (including the last one), descending into every level.
After the loop, it then tries to set `cur[keys[-1]] = value`, but `cur`
has already descended one level too deep.
**Example:** For `path = "a.b"`, `value = "hello"`:
- **Before (bug):** `obj["a"]["b"]` becomes `{"b": "hello"}` instead of
`"hello"`
- **After (fix):** `obj["a"]["b"]` becomes `"hello"` as expected
The fix changes `for key in keys:` to `for key in keys[:-1]:`, so the
loop only navigates to the parent dict, and the final key is set
directly. This is consistent with how the read-side counterpart
`get_variable_param_value()` works.
This method is called by `set_variable_value()` when assigning to nested
variable paths (e.g., `component@root.nested.key`), which is used by the
`VariableAssigner` component.
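The corrected traversal can be sketched as a standalone helper. `set_nested` is a hypothetical name, and creating missing parent dicts is an assumption of this sketch rather than confirmed behavior of `set_variable_param_value()`.

```python
def set_nested(obj, path, value):
    """Set `value` at a dot-separated `path` inside nested dicts."""
    keys = path.split(".")
    cur = obj
    for key in keys[:-1]:  # stop at the parent of the final key
        if not isinstance(cur.get(key), dict):
            cur[key] = {}  # assumption: create missing parents
        cur = cur[key]
    cur[keys[-1]] = value
    return obj
```

Iterating over `keys[:-1]` leaves `cur` pointing at the parent dict, so the final assignment lands on the intended key instead of one level too deep.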
## Test plan
- [ ] Create a canvas with a VariableAssigner that writes to a nested
path (e.g., `component@obj.nested.key`)
- [ ] Verify the value is set correctly at the expected path, not
wrapped in an extra dict layer
- [ ] Verify single-key paths (e.g., `component@key`) still work
correctly
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
This PR implements TTS for the MiniMax provider and adds CLI testing for TTS.
**The following functionalities are now supported:**
**MiniMax:**
- [x] Chat / Stream Chat
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [x] Text To Speech
- [ ] OCRFile
- [ ] ~~Audio To Text~~
- [ ] ~~Balance~~
**Verified examples from the CLI:**
```plaintext
RAGFlow(user)> tts with 'speech-2.8-hd@test@minimax' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"voice_setting": {"voice_id": "English_radiant_girl", "speed": 1, "vol": 1, "pitch": 0}, "audio_setting": {"sample_rate": 32000, "bitrate": 128000, "format": "wav", "channel": 1}, "output_format": "hex"}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/speech-2.8-hd_output.wav
SUCCESS
RAGFlow(user)> stream tts with 'speech-2.8-hd@test@minimax' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"voice_setting": {"voice_id": "English_radiant_girl", "speed": 1, "vol": 1, "pitch": 0}, "audio_setting": {"sample_rate": 32000, "bitrate": 128000, "format": "wav", "channel": 1}, "output_format": "hex"}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/speech-2.8-hd_output.wav
SUCCESS
```
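- Add `play` to play the audio in the CLI.
- Add `save` followed by a directory path to save the audio file.
- Set `format` to `wav` or `mp3` to choose the saved file format.
- Set `param` to a JSON object that follows the provider's official request body.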
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #9837
The Langfuse integration currently only sends the output text to
`langfuse_generation.update()` without including token usage
information. This means Langfuse cannot track input/output token
consumption for cost analysis and monitoring.
### Solution
Add the `usage` parameter to `langfuse_generation.update()` with:
- `input`: approximate input token count from `message_fit_in()`
- `output`: approximate output token count from
`num_tokens_from_string(answer)`
- `total`: sum of input and output
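The shape of the `usage` payload can be sketched as below. The whitespace-based `count_tokens` is a crude stand-in for `num_tokens_from_string`, and `build_usage` is a hypothetical helper; only the `input` / `output` / `total` keys come from the description above.

```python
def count_tokens(text):
    """Crude whitespace token counter (stand-in for the real tokenizer)."""
    return len(text.split())


def build_usage(input_tokens, answer):
    """Assemble the approximate token usage for one generation."""
    output_tokens = count_tokens(answer)
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": input_tokens + output_tokens,
    }
```

The resulting dict is what would be passed as the `usage` argument alongside the output text.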
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
`assert "string"` always passes in Python because non-empty strings
are truthy. This silently skips input validation:
- **variable_assigner.py line 51**: `assert "Variable is not
complete."` → `raise ValueError("Variable is not complete.")`
- **loop.py line 59**: `assert "Loop Variable is not complete."` →
`raise ValueError("Loop Variable is not complete.")`
Without this fix, incomplete variables pass validation silently and
cause a confusing KeyError on the next line.
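A minimal reproduction of the difference, using hypothetical `validate_buggy` / `validate_fixed` helpers (the surrounding component code is not shown in this description, so the `name` key is an assumption):

```python
def validate_buggy(variable):
    if not variable:
        assert "Variable is not complete."  # no-op: a non-empty string is truthy
    return variable["name"]


def validate_fixed(variable):
    if not variable:
        raise ValueError("Variable is not complete.")
    return variable["name"]
```

With an empty dict, the buggy version falls through to the lookup and fails with `KeyError`, while the fixed version reports the actual validation problem.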
## Description
This PR fixes critical bugs and improves the robustness of the RAG
reranking module while maintaining **100% backward compatibility** with
all existing functionality and providers.
## Key Changes
1. **Network Stability**: Added 30s timeout to all API requests to
prevent service blocking
2. **Boundary Protection**: Added empty query/text validation for all
rerank models
3. **Response Fault Tolerance**: Replaced hardcoded key access with
`.get()` to avoid KeyError crashes
4. **Bug Fixes**:
- Fixed `Ai302Rerank` (completely non-functional before)
- Fixed `GPUStackRerank` incorrect exception catching
- Fixed `_normalize_rank` empty array crash
5. **Code Specification**: Added type annotations, standardized
unimplemented class prompts
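The fault-tolerance items can be illustrated with a small sketch. It assumes a min-max style normalization and a Cohere-like `results` / `index` / `relevance_score` response shape; `normalize_scores` and `parse_rerank` are hypothetical names, not the module's actual functions.

```python
def normalize_scores(scores):
    """Min-max normalize; return [] for empty input instead of crashing."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi - lo < 1e-9:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def parse_rerank(payload, n_docs):
    """Read scores defensively from a rerank response dict."""
    scores = [0.0] * n_docs
    for item in payload.get("results", []):  # tolerate a missing key
        idx = item.get("index")
        if isinstance(idx, int) and 0 <= idx < n_docs:
            scores[idx] = float(item.get("relevance_score", 0.0))
    return scores
```

A response without `results`, or with an unexpected index, yields zero scores rather than a `KeyError` or `IndexError`.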
## Compatibility
- ✅ No changes to any class/method names
- ✅ All rerank providers (Jina/Cohere/NVIDIA/HuggingFace etc.) work as
before
- ✅ No breaking changes, zero impact on existing workflows
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.25.3 to v0.25.4
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Adds gpt-5.4-mini and gpt-5.4-nano to the OpenAI model list.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a Go driver for Voyage AI (https://voyageai.com), one of the
unchecked providers on the umbrella tracking issue #14736. Voyage AI is
**embed + rerank only** — no chat, no streaming, no `/v1/models`
endpoint. It's the first provider in the Go layer of this shape.
Until this PR, a tenant who configured `voyage` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.
### What this PR includes
- New `internal/entity/models/voyage.go` with a `VoyageModel`
implementing the `ModelDriver` interface.
- New `conf/models/voyage.json` with 6 embedding models (`voyage-3.5`,
`voyage-3.5-lite`, `voyage-3-large`, `voyage-code-3`, `voyage-law-2`,
`voyage-finance-2`) and 2 rerank models (`rerank-2`, `rerank-2-lite`).
- `factory.go`: route `"voyage"` to `NewVoyageModel`.
- `internal/entity/models/voyage_test.go`: 19 unit tests.
### How the driver works
- **Embed**: `POST /v1/embeddings`. Response is OpenAI-shaped (`{data:
[{embedding, index, object, text}], model, usage}`). Driver reorders by
`index`, rejects duplicate / out-of-range / missing slots, and
short-circuits empty input without an HTTP call.
- **Rerank**: `POST /v1/rerank`. Voyage uses **`top_k`** as the request
param name (not `top_n` like Aliyun/SiliconFlow); the driver translates
`RerankConfig.TopN` → `top_k`. Response is Cohere-shaped (`{data:
[{relevance_score, index}], model}`), so the existing
`RerankResponse{Data: []RerankResult{Index, RelevanceScore}}` shape fits
cleanly.
- **`ListModels`**: returns a hardcoded list of `voyageKnownModels`.
Voyage does **not** expose `/v1/models` (probed live, returns 404), so
the driver synthesizes the list from the same set the config ships. New
upstream models are added by extending one slice.
- **`CheckConnection`**: pings a 1-input embed call against
`voyage-3.5`. Without `/v1/models`, this is the cheapest way to verify
the API key + network path before a tenant tries a real workload.
- **`ChatWithMessages` / `ChatStreamlyWithSender` / `Balance` /
`TranscribeAudio` / `AudioSpeech` / `OCRFile`**: all return `"no such
method"`. Voyage does not host any of these surfaces.
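The reorder-by-`index` checks in the Embed path can be sketched in a few lines. The driver itself is written in Go; this Python version is only a language-neutral illustration of the logic, and `reorder_embeddings` is a hypothetical name.

```python
def reorder_embeddings(data, expected):
    """Place each embedding at its `index`; reject malformed responses."""
    slots = [None] * expected
    for item in data:
        idx = item["index"]
        if not 0 <= idx < expected:
            raise ValueError(f"index {idx} out of range")
        if slots[idx] is not None:
            raise ValueError(f"duplicate index {idx}")
        slots[idx] = item["embedding"]
    missing = [i for i, slot in enumerate(slots) if slot is None]
    if missing:
        raise ValueError(f"missing embeddings for indices {missing}")
    return slots
```

Validating before returning means callers never receive a vector list that is silently misaligned with their inputs.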
No interface change. No new dependencies.
### How was this tested?
**19 unit tests** in `internal/entity/models/voyage_test.go` — all pass
on go 1.25:
```
$ go test -vet=off -run TestVoyage -count=1 ./internal/entity/models/...
ok ragflow/internal/entity/models 0.036s
```
Coverage: Name; Embed (happy path, reorder, empty-input, missing
key/model, duplicate index, out-of-range index, missing slot); Rerank
(happy path with `top_k` assertion, default-to-len-documents, empty
documents, out-of-range index); ListModels (static list, missing key);
CheckConnection (happy, 401); chat methods sentinels; Balance sentinel;
audio/OCR sentinels.
`go build ./internal/entity/models/...` exits 0.
**Live integration test** against `api.voyageai.com`:
```
=== RUN TestVoyageLiveSmoke
[OK] Name() = "voyage"
[OK] ListModels (static): 8 models -> [voyage-3.5 voyage-3.5-lite voyage-3-large voyage-code-3 voyage-law-2 voyage-finance-2 rerank-2 rerank-2-lite]
[OK] CheckConnection
[OK] Embed vectors=3 dim=1024 indices=[0 1 2]
[OK] Embed(empty) -> 0 vectors
[OK] Rerank results=3 scores=[0.8125 0.59765625 0.39453125]
[OK] ChatWithMessages returns voyage, no such method
[OK] Balance returns voyage, no such method
VOYAGE LIVE SMOKE PASSED
--- PASS: TestVoyageLiveSmoke (0.81s)
```
What the live run proves on the wire:
- Auth (`Bearer <key>`) accepted by `api.voyageai.com`.
- Embed `voyage-3.5` on 3 inputs returns 3 vectors at dim 1024 with
`index` field preserved as `[0, 1, 2]` — the reorder-by-index code is
exercised on real data.
- Empty input short-circuits without an HTTP call (mock server would
have been hit if it did).
- Rerank `rerank-2` on 3 docs returns 3 real `relevance_score` floats
`[0.8125, 0.598, 0.395]`. The `top_k` translation works on the live
wire.
- All sentinel methods return the documented `"no such method"` strings.
### Note on PR history
This branch was previously named for LocalAI Embed work which is now
consolidated into PR #14813. The branch was reset to `upstream/main` and
rebuilt for Voyage. Diff against `main` is a clean +838 lines across 4
files.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Tracking: #14736
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Closes #14774.
Adds free-form tags on agents (UserCanvas) with full UI + API:
- Stored as comma-separated `tags` column on `UserCanvas` with online
migration.
- New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT
/v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query.
- "Edit tags" item in agent dropdown opens a chip-style editor dialog;
tags render as badges on each agent card.
- New "Tags" facet in the agents filter bar, with counts.
## Implementation notes
- **Tag matching is exact-token**: the SQL filter wraps stored tags as
`,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`.
- **Server-side normalization** in `UserCanvasService.update_tags`:
dedup (case-insensitive), per-tag cap of 64 chars, total length capped
at 512 chars to fit the column, commas inside tag values are replaced
with spaces.
- **Tenant authorization**: `PUT /v1/agent/<id>/tags` gates on
`UserCanvasService.accessible(canvas_id, tenant_id)`.
- **Tag listing scope**: `UserCanvasService.list_tags` follows the same
own + team-shared rule as `get_by_tenant_ids`.
- **i18n**: keys added to `en.ts` and `zh.ts` only (per project
convention; other locales fall back).
- **`HomeCard`** gets a non-breaking `extra?: ReactNode` slot for the
chip row; no `src/components/ui/` files modified.
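The normalization and exact-token matching rules can be sketched as follows. This is an illustration under assumptions: `normalize_tags` and `has_tag` are hypothetical helpers, the real filter runs in SQL, and dropping tags that no longer fit is just one possible reading of the 512-character cap.

```python
def normalize_tags(raw_tags, max_tag_len=64, max_total_len=512):
    """Dedup case-insensitively, strip commas, and cap lengths."""
    seen, out, total = set(), [], 0
    for tag in raw_tags:
        tag = tag.replace(",", " ").strip()[:max_tag_len]
        key = tag.lower()
        if not tag or key in seen:
            continue
        added = len(tag) + (1 if out else 0)  # account for the joining comma
        if total + added > max_total_len:
            break  # assumption: remaining tags are dropped
        seen.add(key)
        out.append(tag)
        total += added
    return ",".join(out)


def has_tag(stored, wanted):
    """Exact-token match: wrap the stored string in commas before searching."""
    return f",{wanted}," in f",{stored},"
```

Wrapping both sides in commas is what prevents `ml` from matching `ml-ops`: the search looks for `,ml,`, which does not occur in `,ml-ops,nlp,`.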
## Test plan
- [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column
exists (`DESCRIBE user_canvas`).
- [ ] Agents page renders cards normally (no console error from missing
field).
- [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog
was unmounting with the dropdown).
- [ ] Typing a tag without pressing Enter and clicking Save persists it
(regression: last typed tag was being dropped).
- [ ] Chip input supports Enter/comma to commit, Backspace on empty to
remove, `×` to remove individual chip.
- [ ] Tag containing a comma sent via API is stored with the comma
replaced by a space.
- [ ] 20 long tags sent via API does not error (length cap silently
truncates).
- [ ] "Tags" filter in the filter bar shows counts and narrows the list.
- [ ] Filtering by `ml` does **not** return agents tagged `ml-ops`.
- [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc.
- [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or
no permission.`
Introduce comprehensive floating widget customization: add new widget
settings (title, subtitle, footer, colors, mute, streaming) with types
and defaults, and expose them via EmbedDialog UI (split into Embed Setup
and Widget Customization tabs). Persist and load settings through Agent
page by reading/writing globals and wiring an onSaveWidgetSettings
handler to setAgent; show a loading ButtonLoading for saving. Update
embed iframe query params and FloatingChatWidget to honor URL params
(colors, text, mute/streaming) with validation/normalization, color
darkening for gradients, footer link normalization, and improved
styling. Also add copy-to-clipboard in message toolbar, adjust syntax
highlighter layout and Copy button, and add i18n key for muteWidget.
### What problem does this PR solve?
Adds a few fields to the embed widget modal to customize the appearance
of the floating widget when embedded into a page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Noah <Noah.Thompson@ecn.forces.gc.ca>
## Summary
- Convert `pdfplumber.open()` to use `with` context manager in
`api/utils/file_utils.py` (`thumbnail_img` function)
- If any exception occurs between `open()` and `close()`, the PDF file
handle leaks
- The rest of the codebase (e.g. `read_potential_broken_pdf` in the same
file) already uses `with pdfplumber.open(...)` correctly
## Test plan
- [x] PDF thumbnail generation works correctly with context manager
- [x] Resources properly cleaned up on exceptions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
# feat: Add Generic REST API Connector
## What problem does this PR solve?
RAGFlow supports many specific data source connectors (MySQL, Slack,
Google Drive, etc.), but there was no way to connect an arbitrary REST
API as a data source. Users with custom or third-party APIs had to write
a new connector class for each one.
This PR adds a **generic, configuration-driven REST API connector** that
lets users connect any REST API as a data source entirely through the UI
— no code changes needed per API.
---
## Features
### Core Connector (`common/data_source/rest_api_connector.py`)
- Implements `LoadConnector` and `PollConnector` interfaces for full and
incremental sync
- **Configurable authentication:** None, API Key (custom header), Bearer
Token, Basic Auth
- **Pluggable pagination:** Page-based, Offset-based, Cursor-based, or
None
- Smart page-size inference from user's query parameters to avoid
duplicate/conflicting params
- Configurable request delay between pages to prevent API rate limiting
- Auto-detection of the items array in JSON responses (`items`,
`results`, `data`, `records`, or first list found)
- **Advanced field mapping** with dot-notation (`country.name`), array
wildcards (`newsType[*].name`), type hints, and default values
- Optional content template rendering (`"Title: {title}\nBody: {body}"`)
- HTML stripping for content fields
- Stable document IDs via `hash128` from a configurable ID field or
auto-generated from item content
- Pydantic configuration schema with automatic coercion of UI string
inputs to dicts/lists
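To make the field-mapping syntax concrete, here is a small sketch of resolving dot-notation and `[*]` wildcard paths against a JSON item. `extract` and `extract_field` are hypothetical helpers, not the connector's actual functions, and type hints from the real mapping are left out.

```python
def extract(item, path):
    """Return every value reachable via `a.b` / `list[*].name` style paths."""
    values = [item]
    for part in path.split("."):
        wildcard = part.endswith("[*]")
        key = part[:-3] if wildcard else part
        next_values = []
        for value in values:
            if not isinstance(value, dict) or key not in value:
                continue
            child = value[key]
            if wildcard:
                if isinstance(child, list):
                    next_values.extend(child)  # fan out over array elements
            else:
                next_values.append(child)
        values = next_values
    return values


def extract_field(item, path, default=None):
    """Wildcard paths yield a list; plain paths yield one value or `default`."""
    found = extract(item, path)
    if "[*]" in path:
        return found
    return found[0] if found else default
```

For an item such as `{"country": {"name": "Chile"}, "newsType": [{"name": "a"}, {"name": "b"}]}`, the path `country.name` resolves to a single string and `newsType[*].name` to a list of names.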
### Backend Registration (`rag/svr/sync_data_source.py`,
`common/constants.py`, `common/data_source/config.py`)
- `REST_API` sync class wired into RAGFlow's `func_factory`
- Full sync (`load_from_state`) and incremental polling (`poll_source`)
support
- Credentials and config passed from task to connector following
existing patterns (MySQL, SeaFile, etc.)
### Test Connection Endpoint (`api/apps/connector_app.py`)
- `POST /v1/connector/<id>/test` validates config schema,
authentication, and API connectivity without triggering a sync
- Clear error messages for auth failures vs. config issues
### Frontend UI (`web/src/pages/user-setting/data-source/constant/`)
- **Postman-style configuration:** Base URL, Query Parameters (key=value
per line), Auth, Content Fields, Metadata Fields, Pagination Type
- Auth-type-aware form: fields for API key header/value, Bearer token,
or Basic username/password appear only when relevant
- **Advanced Settings** toggle for: Custom Headers, Max Pages, Request
Delay, Poll Timestamp Field, Request Body (POST)
- Connector icon (SVG) and i18n strings (English)
- **"Test Connection"** button to validate before syncing
---
## Controls & Safety
- Configurable max pages safety cap (default: 1000, adjustable in UI)
- Configurable request delay between pages (default: 0.5s, adjustable in
UI)
- Auth errors (401/403) fail immediately without retries; transient
errors retry with exponential backoff
- Diagnostic logging: auth setup confirmation, request details on
failure, content field extraction status
---
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Visual Screenshots of Features
<img width="482" height="510" alt="Screenshot 2026-03-11 at 5 19 52 PM"
src="https://github.com/user-attachments/assets/dcb7ab4a-1622-44f3-bb02-d6f0527314c4"
/>
(Connector can be configured within the external data sources tab)
Configuration Parameters:
<img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 46 PM"
src="https://github.com/user-attachments/assets/5e154e71-4ab5-4872-bfb2-04f02b73c18a"
/>
<img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 54 PM"
src="https://github.com/user-attachments/assets/00cb14b7-0bcf-4b94-9d71-34e93369ecb2"
/>
Connection can be tested before attaching to dataset:
<img width="981" height="681" alt="Screenshot 2026-03-11 at 5 21 40 PM"
src="https://github.com/user-attachments/assets/aaa6eeeb-89a7-4349-bc34-2423bf8be9ee"
/>
Ingestion tested with API connector (works perfectly fine):
<img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 22 30 PM"
src="https://github.com/user-attachments/assets/afcd0d58-cadd-4152-badc-d2f14d96fbec"
/>
Search & Retrieval works as well with metadata flow:
<img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 23 05 PM"
src="https://github.com/user-attachments/assets/d41ee935-dcf7-4456-b317-22a76ca032c0"
/>
---------
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### Related issues
Closes #14781
### What problem does this PR solve?
Some retrieval endpoints accepted caller-supplied `tenant_rerank_id` and
resolved it through `get_model_config_by_id(...)`. That helper loaded
`TenantLLM` rows by global database id and returned decoded model
configuration without checking whether the model belonged to the
authenticated tenant or the dataset owner tenant.
This meant dataset access was validated, but rerank-model selection was
not. A caller who knew or could guess another tenant's
`tenant_rerank_id` could attempt retrieval with a foreign rerank model
config, creating a cross-tenant authorization gap for model usage.
This PR closes that gap by making `tenant_rerank_id` resolution
tenant-aware across the retrieval paths that accept it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Solution
- Extend `get_model_config_by_id(...)` to accept an optional
`allowed_tenant_ids` set and reject `TenantLLM` rows whose `tenant_id`
is outside that set.
- Pass the allowed tenant scope from retrieval endpoints that accept
`tenant_rerank_id`:
- `api/apps/sdk/doc.py`
- `api/apps/sdk/session.py`
- `api/apps/services/dataset_api_service.py`
- Use the authenticated tenant plus dataset-owner tenant ids already
derived by each retrieval flow as the authorization boundary for rerank
model selection.
- Add focused unit coverage to assert unauthorized `tenant_rerank_id`
values are rejected and that the allowed tenant set is propagated
correctly.
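The tenant-scope check can be sketched as below. This is a simplified model, not the real helper: `TenantLLMRow`, the in-memory `rows` dict, and `resolve_rerank_config` are hypothetical, and the actual exception types used by `get_model_config_by_id(...)` are not stated in this description.

```python
from dataclasses import dataclass


@dataclass
class TenantLLMRow:
    """Hypothetical stand-in for a TenantLLM record."""
    id: int
    tenant_id: str
    llm_name: str


def resolve_rerank_config(rows, model_id, allowed_tenant_ids=None):
    """Look up a model row and enforce the caller's tenant scope."""
    row = rows.get(model_id)
    if row is None:
        raise LookupError(f"Model {model_id} not found.")
    if allowed_tenant_ids is not None and row.tenant_id not in allowed_tenant_ids:
        # Reject rows owned by a tenant outside the authorized set.
        raise PermissionError(f"Model {model_id} is not accessible.")
    return {"llm_name": row.llm_name, "tenant_id": row.tenant_id}
```

Keeping `allowed_tenant_ids` optional leaves existing callers untouched, while retrieval endpoints pass the authenticated tenant plus the dataset-owner tenants.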
### Testing
- `python -m py_compile` on:
- `api/db/joint_services/tenant_model_service.py`
- `api/apps/services/dataset_api_service.py`
- `api/apps/sdk/doc.py`
- `api/apps/sdk/session.py`
- Added unit tests in:
-
`test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py`
-
`test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
### Notes for reviewers
- This change is intentionally narrow: it affects only the
`tenant_rerank_id` path, not the normal `rerank_id` name-based
resolution path.
- Local lint/syntax checks passed.
- Full pytest execution could not be completed in this environment
because the local test runtime is missing `strenum`, so the route-test
files fail during collection before exercising the updated cases.
---------
Co-authored-by: jony376 <jony376@gmail.com>
### What problem does this PR solve?
A draft 0.25.3 release note.
### Type of change
- [x] Documentation Update
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The LocalAI Go driver landed in #14809 and Embed landed in #14811.
`Rerank` was left as a stub that returns `"not implemented"`. This PR
fills the gap.
LocalAI exposes a public rerank endpoint at `<tenant-url>/v1/rerank`
with a Cohere-shaped request and response (`{model, query, documents,
top_n}` → `{results: [{index, relevance_score}]}`). The Python side has
had `LocalAIRerank` in `rag/llm/rerank_model.py` for a long time. Until
this PR, a tenant who wanted to use LocalAI for reranking in the Go
layer got `"not implemented"`.
### What this PR includes
- `conf/models/localai.json`: add `"rerank": "rerank"` under
`url_suffix` so the driver can build the URL from config. This matches
the `URLSuffix.Rerank` field already used by aliyun and siliconflow.
- `internal/entity/models/localai.go`: replace the `Rerank` stub with a
real implementation that POSTs to `/v1/rerank`. Adds local
request/response types `localAIRerankRequest` and
`localAIRerankResponse`.
No factory change. No interface change.
### How the implementation works
- Validate the model name and resolve the tenant-supplied base URL with
the existing `resolveBaseURL` helper.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. Same pattern `ChatWithMessages`,
`ListModels`, and `Embed` already use in this file.
- Only set the `Authorization` header when a non-empty API key was
supplied. LocalAI accepts an empty key by default, so this preserves the
optional-auth contract.
- Default `top_n` to `len(documents)` when `rerankConfig.TopN == 0`,
matching the existing Aliyun and SiliconFlow rerank implementations.
- Validate every `results[].index` against `len(documents)`. If the
upstream returns an out-of-range index, fail clearly instead of silently
writing past the slice.
- An empty `documents` slice returns `&RerankResponse{}` with no HTTP
call.
- Non-200 responses propagate the upstream status line and body.
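The request-building and validation rules above can be sketched without any network call. The driver itself is Go; this Python version only mirrors the logic, `build_rerank_request` and `validate_results` are hypothetical names, and the `Bearer` scheme for the header is an assumption.

```python
def build_rerank_request(model, query, documents, top_n=0, api_key=""):
    """Return (headers, body) for a Cohere-shaped rerank request."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # optional auth: omit the header for an empty key
        headers["Authorization"] = f"Bearer {api_key}"
    body = {
        "model": model,
        "query": query,
        "documents": documents,
        "top_n": top_n if top_n > 0 else len(documents),
    }
    return headers, body


def validate_results(results, n_docs):
    """Fail clearly if the upstream returns an out-of-range index."""
    for result in results:
        if not 0 <= result["index"] < n_docs:
            raise ValueError(f"rerank index {result['index']} out of range")
    return results
```

Checking every index before use turns a malformed upstream response into an explicit error rather than a silent out-of-bounds write.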
### Note on stacking
This PR builds on #14809 (LocalAI driver) and #14811 (LocalAI Embed).
Until both merge, this PR's diff on GitHub will include all three
commits. After #14809 and #14811 land on `main`, GitHub will auto-reduce
this PR to only the `Rerank` changes (one commit, ~99 line diff in
`localai.go` plus 1 line in `localai.json`).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the
`go.mod` minimum).
- The full method set on `LocalAIModel` still matches the `ModelDriver`
interface.
- Pattern parity with the existing Aliyun Rerank
(`internal/entity/models/aliyun.go`) and SiliconFlow Rerank
(`internal/entity/models/siliconflow.go`) implementations.
Closes #14812
Depends on #14809, #14811
Tracking: #14736
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.25.2 to v0.25.3
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。
——佚名 |
+----------------------------------------------------------+
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes #14884
The ZhipuAI Go driver in `internal/entity/models/zhipu-ai.go` had a stub
`ListModels` method that always returned `"zhipu-ai, no such method"`.
The DeepSeek, Gitee, NVIDIA, OpenAI, SiliconFlow, and OpenRouter drivers
in the same package already implement `ListModels` against the
OpenAI-compatible `/models` endpoint, and the model picker UI relies on
it. This PR brings ZhipuAI in line with that pattern.
### Changes
- `internal/entity/models/zhipu-ai.go`: implement
`ZhipuAIModel.ListModels`.
- Resolve region with default fallback.
- GET `${BaseURL[region]}/${URLSuffix.Models}` (resolves to
`https://open.bigmodel.cn/api/paas/v4/models` with the default region).
- Send `Authorization: Bearer <api_key>` when an API key is configured.
Omit the header when the key is empty, so an unauthenticated caller gets
a clear `401` from upstream.
- Surface non-200 responses with the upstream status line and body,
matching the other Go drivers.
- Parse the response via the package-level `DSModelList` / `DSModel`
types already used by DeepSeek, Gitee, and SiliconFlow.
- When the response includes `owned_by`, render the entry as
`id@owned_by`, matching the convention of Gitee and SiliconFlow.
- `conf/models/zhipu-ai.json`: add `"models": "models"` to `url_suffix`.
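The parsing and `id@owned_by` rendering described above can be sketched roughly as follows. This is an illustration only: `modelEntry` and `modelList` are hypothetical stand-ins for the package-level `DSModel` / `DSModelList` types, whose exact fields are assumed here, and `renderModelNames` is not a function from the driver.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelEntry and modelList mirror the OpenAI-compatible /models payload.
// They are hypothetical stand-ins for DSModel / DSModelList.
type modelEntry struct {
	ID      string `json:"id"`
	OwnedBy string `json:"owned_by"`
}

type modelList struct {
	Data []modelEntry `json:"data"`
}

// renderModelNames decodes a /models response body and returns one display
// name per entry: "id@owned_by" when owned_by is present, otherwise "id".
func renderModelNames(body []byte) ([]string, error) {
	var list modelList
	if err := json.Unmarshal(body, &list); err != nil {
		return nil, fmt.Errorf("decode models response: %w", err)
	}
	names := make([]string, 0, len(list.Data))
	for _, m := range list.Data {
		if m.OwnedBy != "" {
			names = append(names, m.ID+"@"+m.OwnedBy)
		} else {
			names = append(names, m.ID)
		}
	}
	return names, nil
}

func main() {
	// Sample payload invented for the example; it is not a captured response.
	body := []byte(`{"data":[{"id":"glm-4","owned_by":"zhipu"},{"id":"embedding-3"}]}`)
	names, err := renderModelNames(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Entries without `owned_by` fall back to the bare `id`, so the picker still gets a usable name when the upstream omits the field.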
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Set embedded models during form initialization.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add a Go driver for LongCat (Meituan, https://longcat.chat), one of the
unchecked providers on the umbrella tracking issue #14736. LongCat
exposes an OpenAI-compatible REST API at
`https://api.longcat.chat/openai/v1` with three public chat models
including `LongCat-Flash-Thinking`, a reasoning model that returns
chain-of-thought in `reasoning_content` (OpenAI o-series shape).
Until this PR, a tenant who configured `longcat` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.
### What this PR includes
- New `internal/entity/models/longcat.go` with a `LongCatModel`
implementing the `ModelDriver` interface.
- New `conf/models/longcat.json` with the 3 public chat models
(Flash-Chat, Flash-Lite, Flash-Thinking) and `url_suffix` for `chat` and
`models`.
- `factory.go`: route `"longcat"` to `NewLongCatModel`.
Method coverage:
- `ChatWithMessages`: `POST /openai/v1/chat/completions`, non-streaming
- `ChatStreamlyWithSender`: SSE stream against the same endpoint
- `ListModels` / `CheckConnection`: `GET /openai/v1/models`
- **Reasoning extraction**: `message.reasoning_content` (non-stream) and
`delta.reasoning_content` (stream) flow into
`ChatResponse.ReasonContent` / the sender's second arg. Matches the
OpenAI o-series convention also used by kimi-k2.6 and DeepSeek-R1.
- **`reasoning_effort` propagation**: `ChatConfig.Effort` → request body
`reasoning_effort` (LongCat-Flash-Thinking honors it; non-reasoning
models ignore it).
- `Embed` / `Rerank` / `Balance` / `TranscribeAudio` / `AudioSpeech` /
`OCRFile` return `"no such method"` (LongCat does not expose any of
these surfaces).
No interface change. No new dependencies.
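As a rough illustration of the reasoning extraction listed above, the sketch below decodes `reasoning_content` from both the non-streaming `message` shape and the streaming `delta` shape. The struct and function names (`chatMessage`, `chatPayload`, `extract`) are hypothetical and are not taken from `longcat.go`; only the JSON field names come from the description.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// chatMessage carries the visible answer and, for reasoning models, the
// chain-of-thought in a separate reasoning_content field.
type chatMessage struct {
	Content          string `json:"content"`
	ReasoningContent string `json:"reasoning_content"`
}

// chatChoice covers both shapes: "message" for non-streaming responses and
// "delta" for SSE chunks.
type chatChoice struct {
	Message chatMessage `json:"message"`
	Delta   chatMessage `json:"delta"`
}

type chatPayload struct {
	Choices []chatChoice `json:"choices"`
}

// extract returns (answer, reasoning) from one JSON payload. It reads
// choices[0].delta when stream is true and choices[0].message otherwise.
func extract(raw []byte, stream bool) (string, string, error) {
	var p chatPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return "", "", fmt.Errorf("decode chat payload: %w", err)
	}
	if len(p.Choices) == 0 {
		return "", "", errors.New("chat payload has no choices")
	}
	m := p.Choices[0].Message
	if stream {
		m = p.Choices[0].Delta
	}
	return m.Content, m.ReasoningContent, nil
}

func main() {
	// Both payloads are invented for the example.
	full := []byte(`{"choices":[{"message":{"content":"12","reasoning_content":"15% of 80 is 0.15*80"}}]}`)
	chunk := []byte(`{"choices":[{"delta":{"reasoning_content":"0.15*80"}}]}`)

	answer, reason, err := extract(full, false)
	fmt.Println(answer, reason, err)
	answer, reason, err = extract(chunk, true)
	fmt.Println(answer, reason, err)
}
```

A non-reasoning model simply leaves `reasoning_content` absent, which decodes to an empty string, matching the empty `ReasonContent` seen for `LongCat-Flash-Chat` in the live run.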
### How was this tested?
**21 unit tests** in `internal/entity/models/longcat_test.go` — all
pass:
```
$ go test -vet=off -run TestLongCat -count=1 -v ./internal/entity/models/...
=== RUN TestLongCatName
--- PASS: TestLongCatName (0.00s)
=== RUN TestLongCatChatHappyPath
--- PASS: TestLongCatChatHappyPath (0.00s)
=== RUN TestLongCatChatExtractsReasoningContent
--- PASS: TestLongCatChatExtractsReasoningContent (0.00s)
=== RUN TestLongCatChatPropagatesReasoningEffort
--- PASS: TestLongCatChatPropagatesReasoningEffort (0.00s)
=== RUN TestLongCatChatOmitsReasoningEffortWhenUnset
--- PASS: TestLongCatChatOmitsReasoningEffortWhenUnset (0.00s)
=== RUN TestLongCatChatRequiresAPIKey
--- PASS: TestLongCatChatRequiresAPIKey (0.00s)
=== RUN TestLongCatChatRequiresMessages
--- PASS: TestLongCatChatRequiresMessages (0.00s)
=== RUN TestLongCatChatRejectsHTTPError
--- PASS: TestLongCatChatRejectsHTTPError (0.00s)
=== RUN TestLongCatStreamHappyPath
--- PASS: TestLongCatStreamHappyPath (0.00s)
=== RUN TestLongCatStreamExtractsReasoningContent
--- PASS: TestLongCatStreamExtractsReasoningContent (0.00s)
=== RUN TestLongCatStreamRejectsExplicitFalse
--- PASS: TestLongCatStreamRejectsExplicitFalse (0.00s)
=== RUN TestLongCatStreamRequiresSender
--- PASS: TestLongCatStreamRequiresSender (0.00s)
=== RUN TestLongCatStreamFailsWithoutTerminal
--- PASS: TestLongCatStreamFailsWithoutTerminal (0.00s)
=== RUN TestLongCatListModelsHappyPath
--- PASS: TestLongCatListModelsHappyPath (0.00s)
=== RUN TestLongCatListModelsRequiresAPIKey
--- PASS: TestLongCatListModelsRequiresAPIKey (0.00s)
=== RUN TestLongCatCheckConnectionDelegatesToListModels
--- PASS: TestLongCatCheckConnectionDelegatesToListModels (0.00s)
=== RUN TestLongCatEmbedReturnsNoSuchMethod
--- PASS: TestLongCatEmbedReturnsNoSuchMethod (0.00s)
=== RUN TestLongCatRerankReturnsNoSuchMethod
--- PASS: TestLongCatRerankReturnsNoSuchMethod (0.00s)
=== RUN TestLongCatBalanceReturnsNoSuchMethod
--- PASS: TestLongCatBalanceReturnsNoSuchMethod (0.00s)
=== RUN TestLongCatAudioOCRReturnNoSuchMethod
--- PASS: TestLongCatAudioOCRReturnNoSuchMethod (0.00s)
PASS
ok ragflow/internal/entity/models 0.020s
```
`go build ./internal/entity/models/...` exits 0 on go 1.25.
**Live integration test** against `api.longcat.chat`:
```
=== RUN TestLongCatLiveSmoke
[OK] Name() = "longcat"
[OK] CheckConnection
[OK] ListModels: 5 models -> [LongCat-Flash-Lite LongCat-Flash-Chat LongCat-Flash-Thinking-2601 LongCat-Flash-Omni-2603 LongCat-2.0-Preview]
[OK] Chat (Flash-Chat) answer="Got it! Let me know if you" reason=""
[OK] Chat (Flash-Thinking) answer len=443 head="To find 15 % of 80, follow these steps:\n\n1. **Convert the percentage to a frac..."
ReasonContent len=557 head="The user asks: \"15% of 80?\" They want step by step reasoning and final answer in \\boxed{}. So we need to compute 15% of ..."
[OK] Stream content: 78 chunks, 351 chars
[OK] Stream reasoning: 107 chunks, 537 chars
[OK] Balance returns longcat, no such method
[OK] Embed returns longcat, no such method
[OK] Rerank returns longcat, no such method
LONGCAT LIVE SMOKE PASSED
--- PASS: TestLongCatLiveSmoke (31.01s)
```
What the live run proves on the wire:
- Auth header (`Bearer <key>`) is accepted by `api.longcat.chat`.
- `/openai/v1/models` parser handles the real 5-model response (note:
live API returns versioned aliases `LongCat-Flash-Thinking-2601`,
`LongCat-Flash-Omni-2603`, `LongCat-2.0-Preview` plus the un-versioned
`LongCat-Flash-Chat` and `LongCat-Flash-Lite`).
- Non-stream chat against `LongCat-Flash-Chat`: visible answer parses
correctly, `ReasonContent` correctly empty.
- Non-stream chat against `LongCat-Flash-Thinking`: 443-char answer
flows into `Answer`, 557-char chain-of-thought flows into
`ReasonContent` via the new `message.reasoning_content` extraction.
- Streaming chat against `LongCat-Flash-Thinking`: 107 reasoning chunks
(537 chars) reach the sender's second arg via `delta.reasoning_content`;
78 content chunks (351 chars) reach the first arg. Before this code, the
reasoning chunks would have been silently dropped.
- All sentinel methods (Balance, Embed, Rerank, audio/OCR) return the
documented `"no such method"` strings.
### Note on PR history
This branch was previously named for LocalAI work which is now
consolidated into PR #14813. The branch was reset to `upstream/main` and
rebuilt for LongCat. The diff against `main` is a clean +969 lines
across 4 files.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Tracking: #14736
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The delete `/graph` endpoint duplicates
`/datasets/<dataset_id>/<index_type>`, so this PR removes it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add a new test suite for the new RESTful API endpoints, intended to
replace the HTTP and web API tests.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
### What problem does this PR solve?
Add a Go driver for Novita.ai (https://novita.ai), one of the unchecked
providers on the umbrella tracking issue #14736. Novita exposes an
OpenAI-compatible REST API at `https://api.novita.ai/v3/openai` and
proxies a large catalog of third-party models (DeepSeek, Llama, Qwen3,
Kimi, Gemma, Mistral, MiniMax, GLM, etc.) behind a single OpenAI-shaped
surface — 102 models live at the time of writing.
Until this PR, a tenant who configured `novita` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.
### What this PR includes
- New `internal/entity/models/novita.go` with a `NovitaModel`
implementing the `ModelDriver` interface (~520 lines).
- New `conf/models/novita.json` with 7 representative chat models
(DeepSeek-V4, Llama-3.3-70B, Qwen3-30B/235B reasoning, Kimi-K2,
Gemma-3-27B, Mistral-Nemo).
- `factory.go`: route `"novita"` to `NewNovitaModel`.
- `internal/entity/models/novita_test.go`: 23 unit tests.
### Notable design point: `<think>...</think>` reasoning extraction
Novita-routed reasoning models like `qwen3-*` and `deepseek-r1-*` embed
their chain-of-thought **inline inside content as `<think>...</think>`
tags**, rather than in a separate `reasoning_content` field. Verified
live by probing `api.novita.ai`:
```
content head 200: <think>
Okay, let's see. I need to find 15% of 80. Hmm, percentages can sometimes be tricky, but I think
content tail 100: h, that works.
Alternatively, 0.15 × 80. If I move the decimal two places to the left for </think>
```
Without handling, a tenant picking qwen3 via Novita would see raw
`<think>` tags in their UI answer — different from every other reasoning
provider in the Go layer.
The driver detects those tags and routes the inner text to
`ChatResponse.ReasonContent` (non-stream) or the sender's second arg
(stream), keeping the visible answer clean of tag clutter:
- **`splitNovitaThink`** — scans a complete content string. Used by the
non-streaming path. Handles multiple `<think>` blocks, unclosed tags
(the model got cut off mid-reasoning), pure-text content with no tags.
- **`novitaThinkExtractor`** — stateful streaming version. Buffers
trailing bytes that might be the start of a tag (e.g. `<thi` held back
when the next chunk completes `nk>`), then emits segments in routing
order so callers can pipe them to a UI. Tested with byte-level chunk
boundaries and tag-spanning scenarios.
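The non-streaming split described above could look roughly like this. It is a sketch under assumptions, not the code from `novita.go`: `splitThink` is a hypothetical name, and trimming surrounding whitespace from both outputs is a choice made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	openTag  = "<think>"
	closeTag = "</think>"
)

// splitThink separates visible text from <think>...</think> reasoning in a
// complete content string. Multiple blocks are concatenated in order, and an
// unclosed <think> treats everything after it as reasoning.
func splitThink(content string) (answer, reasoning string) {
	var ans, rsn strings.Builder
	rest := content
	for {
		start := strings.Index(rest, openTag)
		if start < 0 {
			ans.WriteString(rest) // no more tags: the remainder is visible text
			break
		}
		ans.WriteString(rest[:start])
		rest = rest[start+len(openTag):]

		end := strings.Index(rest, closeTag)
		if end < 0 {
			rsn.WriteString(rest) // model was cut off mid-reasoning
			break
		}
		rsn.WriteString(rest[:end])
		rest = rest[end+len(closeTag):]
	}
	// Trimming is an assumption of this sketch.
	return strings.TrimSpace(ans.String()), strings.TrimSpace(rsn.String())
}

func main() {
	answer, reasoning := splitThink("<think>0.15 * 80 = 12</think>The answer is 12.")
	fmt.Printf("answer=%q reasoning=%q\n", answer, reasoning)
}
```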
### Method coverage
| Method | Behavior |
|---|---|
| `ChatWithMessages` | `POST /v3/openai/chat/completions`, `<think>`
extraction on response |
| `ChatStreamlyWithSender` | SSE stream, stateful `<think>` extraction
across deltas |
| `ListModels` / `CheckConnection` | `GET /v3/openai/models` (102 live)
|
| `Embed` / `Rerank` / `Balance` / `TranscribeAudio` / `AudioSpeech` /
`OCRFile` | `"no such method"` — Novita's OpenAI-compatible surface does
not expose any |
No interface change. No new dependencies.
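For the streaming row in the table above, a stateful extractor might be structured along these lines. This is a minimal sketch, assuming a string buffer and a hold-back of any trailing bytes that could begin a tag; `thinkStream`, `Feed`, `Flush`, and `partialSuffix` are hypothetical names and do not describe the actual `novitaThinkExtractor` API.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	openTag  = "<think>"
	closeTag = "</think>"
)

// thinkStream routes streamed text to content or reasoning, holding back
// trailing bytes that might be the start of the next tag.
type thinkStream struct {
	buf     string
	inThink bool
}

// partialSuffix returns the length of the longest proper prefix of tag that
// is also a suffix of s (0 if none).
func partialSuffix(s, tag string) int {
	limit := len(tag) - 1
	if limit > len(s) {
		limit = len(s)
	}
	for n := limit; n > 0; n-- {
		if strings.HasSuffix(s, tag[:n]) {
			return n
		}
	}
	return 0
}

// Feed consumes one delta and returns the content and reasoning text that
// can safely be emitted so far.
func (s *thinkStream) Feed(chunk string) (content, reasoning string) {
	s.buf += chunk
	var c, r strings.Builder
	for {
		tag, out := openTag, &c
		if s.inThink {
			tag, out = closeTag, &r
		}
		if i := strings.Index(s.buf, tag); i >= 0 {
			out.WriteString(s.buf[:i])
			s.buf = s.buf[i+len(tag):]
			s.inThink = !s.inThink
			continue
		}
		// No complete tag: emit all but a possible partial tag at the end.
		keep := partialSuffix(s.buf, tag)
		out.WriteString(s.buf[:len(s.buf)-keep])
		s.buf = s.buf[len(s.buf)-keep:]
		return c.String(), r.String()
	}
}

// Flush emits whatever is still buffered when the stream ends.
func (s *thinkStream) Flush() (content, reasoning string) {
	rest := s.buf
	s.buf = ""
	if s.inThink {
		return "", rest
	}
	return rest, ""
}

func main() {
	s := &thinkStream{}
	for _, chunk := range []string{"Hi <thi", "nk>deep</th", "ink> done"} {
		c, r := s.Feed(chunk)
		fmt.Printf("content=%q reasoning=%q\n", c, r)
	}
}
```

Holding back only a proper prefix of the expected tag means a lone `<` is delayed by at most one delta and is then released as ordinary content.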
### How was this tested?
**23 unit tests** in `internal/entity/models/novita_test.go` — all pass:
```
$ go test -vet=off -run "TestNovita|TestSplitNovita" -count=1 ./internal/entity/models/...
ok ragflow/internal/entity/models 0.020s
```
Coverage:
- `splitNovitaThink` (5 cases: pure text, single block, leading text,
multiple blocks, unclosed tag)
- `novitaThinkExtractor` (6 cases: single-chunk, opening tag span,
closing tag span, byte-level chunking, no tags, lone `<` not as tag
start)
- `ChatWithMessages`: pure text, with `<think>` tags, missing API key,
empty messages, HTTP error
- `ChatStreamlyWithSender`: tag-stripping with spanning deltas, pure
content, sender-required, stream-true-required
- `ListModels` / `CheckConnection` (happy paths)
- All sentinel methods
`go build ./internal/entity/models/...` exits 0 on go 1.25.
**Live integration test** against `api.novita.ai/v3/openai`:
```
=== RUN TestNovitaLiveSmoke
[OK] Name() = "novita"
[OK] CheckConnection
[OK] ListModels: 102 models (showing first 6) [deepseek/deepseek-v4-pro deepseek/deepseek-v4-flash deepseek/deepseek-v3.2 xiaomimimo/mimo-v2.5-pro moonshotai/kimi-k2.6 zai-org/glm-5.1]
[OK] Chat (llama-3.3) answer="ok" reason=""
[OK] Chat (qwen3) answer len=0 head=""
ReasonContent len=1657 head="Okay, so I need to figure out what 15% of 80 is. Hmm, percentages can sometimes trip me up, but let ..."
[OK] Stream content: 0 chunks, 0 chars; reasoning: 600 chunks, 1667 chars
[OK] Embed/Rerank/Balance/TranscribeAudio/AudioSpeech/OCRFile all return "novita, no such method"
NOVITA LIVE SMOKE PASSED
--- PASS: TestNovitaLiveSmoke (26.18s)
```
What the live run proves on the wire:
- Auth (`Bearer <key>`) accepted by `api.novita.ai`.
- `/v3/openai/models` parser handles the real 102-model response.
- Non-stream chat against `meta-llama/llama-3.3-70b-instruct`: clean
string answer, empty ReasonContent (non-reasoning model, pure-text
path).
- Non-stream chat against `qwen/qwen3-30b-a3b-fp8`: 1657-char reasoning
extracted from `<think>...</think>` and routed to
`ChatResponse.ReasonContent`. Visible answer is 0 chars in this run
because qwen3 spent its 600-token budget entirely on reasoning before
reaching the answer phase — that's the model's behavior, not a driver
bug. The important thing: **no `<think>` tags leaked into the visible
Answer field**.
- Streaming against qwen3: 600 reasoning chunks (1667 chars) emitted via
the sender's 2nd arg across SSE deltas; **no `<think>` tag fragments
leaked into the content channel** despite tag boundaries crossing chunk
boundaries on the wire.
- All 6 sentinel methods return the documented `"no such method"`
strings.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Tracking: #14736
… modal
- Add vertical resizing functionality for the text field
### What problem does this PR solve?
_Fix the issue where the text content of the knowledge base editing
parsing block is too long to scroll._
<img width="701" height="775" alt="image"
src="https://github.com/user-attachments/assets/b258422e-fbc1-466d-abab-062e642c21d5"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: chenyun <chenyun@chenyundemacbook-pro.local>
### What problem does this PR solve?
Fix deleting GraphRAG not taking effect in the UI.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The Baidu (Qianfan) `Encode` method silently swallowed malformed
responses. If a `data[]` item from the API was missing a field (`index`,
`embedding`, or unexpected shape), the loop did `continue` instead of
returning an error, leaving `nil` entries in the result slice. Callers
got back partial results with no indication anything went wrong, which
then crashed downstream consumers when they tried to use a `nil` vector.
Concrete gaps fixed:
- No count-mismatch check between `data` length and input texts (only
checked for empty)
- No duplicate-index detection (a duplicate would silently overwrite)
- No missing-index final scan
- No empty-embedding rejection
- No per-call context timeout
- `EmbeddingConfig.Dimension` (added in #14735) was not propagated
This PR replaces `map[string]interface{}` parsing with a typed
`baiduEmbeddingResponse` struct, applies the standard four-layer
validation (count → out-of-range → duplicate → empty) followed by a final
missing-index scan, adds `context.WithTimeout(nonStreamCallTimeout)`,
and forwards `embeddingConfig.Dimension` as the `dimensions` parameter
(Baidu Qianfan v2 uses an OpenAI-compatible API).
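The validation order described above can be sketched as follows. This is an illustration under assumptions rather than the driver code: `embeddingItem`, `embeddingResponse`, and `orderedEmbeddings` are hypothetical names, and using a pointer for `index` to detect a missing field is a choice made for the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// embeddingItem uses a pointer for Index so a missing field can be told
// apart from a legitimate index of 0.
type embeddingItem struct {
	Index     *int      `json:"index"`
	Embedding []float64 `json:"embedding"`
}

type embeddingResponse struct {
	Data []embeddingItem `json:"data"`
}

// orderedEmbeddings decodes a response body and returns vectors in input
// order, failing loudly instead of leaving nil slots behind.
func orderedEmbeddings(body []byte, want int) ([][]float64, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode embedding response: %w", err)
	}
	if len(resp.Data) != want { // count mismatch
		return nil, fmt.Errorf("expected %d embeddings, got %d", want, len(resp.Data))
	}
	out := make([][]float64, want)
	for _, item := range resp.Data {
		if item.Index == nil {
			return nil, errors.New("embedding item is missing index")
		}
		i := *item.Index
		if i < 0 || i >= want { // out of range
			return nil, fmt.Errorf("embedding index %d out of range", i)
		}
		if out[i] != nil { // duplicate
			return nil, fmt.Errorf("duplicate embedding index %d", i)
		}
		if len(item.Embedding) == 0 { // empty vector
			return nil, fmt.Errorf("empty embedding at index %d", i)
		}
		out[i] = item.Embedding
	}
	// Final scan: defensive check that every input slot was filled.
	for i, v := range out {
		if v == nil {
			return nil, fmt.Errorf("missing embedding for index %d", i)
		}
	}
	return out, nil
}

func main() {
	// Out-of-order sample payload invented for the example.
	body := []byte(`{"data":[{"index":1,"embedding":[0.2]},{"index":0,"embedding":[0.1]}]}`)
	vecs, err := orderedEmbeddings(body, 2)
	fmt.Println(vecs, err)
}
```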
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Problem
When parsing DOCX files with many tables, DeepDOC generates chunks
containing only empty HTML table tags, such as:
```html
<table><tr><td></td></tr><tr><td></td></tr><tr><td></td></tr><tr><td></td></tr></table>
```
After the regex cleanup at `task_executor.py:584`, this becomes `" "`
(whitespace only).
The guard at line 585 (`if not c`) only catches empty strings `""`, but
whitespace strings are truthy in Python and pass through. When such a
chunk is sent to the Zhipu `embedding-3` API, the API rejects it with
error 1213: `未正常接收到prompt参数` ("the prompt parameter was not
received correctly").
## Root Cause
```python
c = re.sub(r"</?(table|td|caption|tr|th)( [^<>]{0,12})?>", " ", c)
if not c: # ← only catches "", not " " / "\n" / "\t"
c = "None"
```
Verified with Zhipu `embedding-3`:
| Input | Result |
|---|---|
| `""` | error 1213 |
| `" "` | error 1213 |
| `"\n"` | error 1213 |
| `"None"` | OK |
## Fix
```diff
- if not c:
+ if not c.strip():
c = "None"
```
## Testing
Reproduced with a 678KB DOCX file (166 tables, 270 chunks). Chunk #89 is
the empty table above. After fix, `"None"` is sent instead and embedding
succeeds.
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Closes #14768
### What problem does this PR solve?
The `list_chats` and `list_searches` REST API endpoints did not enforce
authorization on the `owner_ids` query parameter. Any authenticated user
could pass arbitrary tenant IDs to `owner_ids` and retrieve chats or
search apps belonging to other tenants they are not a member of.
This PR resolves the issue by:
1. Looking up the current user's authorized tenants via
`TenantService.get_joined_tenants_by_user_id` and rejecting any
`owner_ids` that fall outside that set.
2. When no `owner_ids` are provided, scoping the query to only the
user's authorized tenants instead of returning an unfiltered result.
3. Adding unit tests that verify unauthorized `owner_ids` are rejected
with `OPERATING_ERROR`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
In `rag/app/audio.py`, the supported audio extensions list contains
duplicate entries: `.wav` appears twice (positions 3 and 5) and `.aac`
appears twice (positions 6 and 14). While this does not affect runtime
behavior, it is redundant and makes the code harder to maintain.
This PR removes the duplicate entries to keep the list clean and
consistent.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
```
RAGFlow(user)> chat with 'glm-ocr@test@zhipu-ai' message 'what is this'
CLI error: expect model glm-ocr@zhipu-ai is a chat or multimodal model
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: When a Wait Node precedes a Message Node within a Loop Node, the
outgoing message is split into two separate messages.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
OpenAI model catalogs used in provider selection flows were missing the
latest GPT models (`gpt-5.5` and `gpt-5.4`).
Because model availability is driven by seeded catalog data
(`conf/llm_factories.json` → DB seed → API response), these models were
not selectable in the UI or `/llm/list` responses.
This PR updates and synchronizes the OpenAI catalog definitions across
configuration sources and ensures the new models are correctly exposed
through the API layer and validated in tests.
---
### Type of change
* [x] New Feature (non-breaking change which adds functionality)
---
### Changes Made
* Added `gpt-5.5` and `gpt-5.4` to OpenAI catalog definitions in:
* `conf/llm_factories.json`
* `conf/models/openai.json` (chat + vision support)
* Ensured consistency between DB-seeded factory config and provider
model configuration
* Updated test coverage in:
* `test_llm_list_unit.py`
* seeded OpenAI catalog entries
* added response-level assertion validating `/llm/list` includes both
new model IDs under OpenAI grouping
---
### Root Cause
OpenAI model listings in selection flows are generated from catalog data
seeded via `conf/llm_factories.json`.
The catalog had not been updated to include the latest GPT models,
resulting in missing availability in UI and API responses.
---
### Testing
* Created isolated test environment:
* `python -m venv .venv-review`
* installed `pytest`
* Ran targeted and full test suite:
* `test_list_app_grouping_availability_and_merge`: ✅ passed
* Full `test_llm_list_unit.py`: ✅ 10 passed
---
### Risks / Limitations
* Adding models to the catalog does not guarantee upstream provider
availability or account entitlement.
* Environments with pre-seeded DB catalogs may require reseed or refresh
to reflect updated configuration.
---
### Notes
* Changes are minimal and scoped strictly to catalog configuration and
related test coverage.
* Ensures `/llm/list` API remains aligned with expected latest OpenAI
model availability.
* Closes #14827
### What problem does this PR solve?
This PR completes the Jina provider
**The following functionalities are now supported:**
**Jina:**
- [ ] Chat / Stream Chat (Not available for now: [(Jina chat API
docs)](https://api.jina.ai/docs#/Search%20Foundation%20Models/chat_completions_v1_chat_completions_post))
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [ ] ~~Balance~~
**Verified examples from the CLI:**
```plaintext
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'jina-embeddings-v2-base-en@test@jina' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 768 | 0 |
| 768 | 1 |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'jina-reranker-v2-base-multilingual@test@jina' top 3;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0 | 0.74316794 |
| 2 | 0.18713269 |
| 1 | 0.15817434 |
+-------+-----------------+
RAGFlow(user)> list supported models from 'jina' 'test'
+---------------------------------------------+
| model_name |
+---------------------------------------------+
| Jina AI: Jina VLM |
| Jina AI: Jina Reranker v3 |
| Jina AI: Jina Code Embeddings 0.5b |
| Jina AI: Jina Code Embeddings 1.5b |
| Jina AI: Jina Embeddings v4 |
| Jina AI: Jina Reranker M0 |
| Jina AI: ReaderLM v2 |
| Jina AI: Jina Clip v2 |
| Jina AI: Jina Embeddings v3 |
| Jina AI: Jina Colbert v2 |
| Jina AI: Reader LM 0.5b |
| Jina AI: Reader LM 1.5b |
| Jina AI: Jina Reranker v2 Base Multilingual |
| Jina AI: Jina Clip v1 |
| Jina AI: Jina Reranker v1 Tiny EN |
| Jina AI: Jina Reranker v1 Turbo EN |
| Jina AI: Jina Reranker v1 Base EN |
| Jina AI: Jina Colbert v1 EN |
| Jina AI: Jina Embeddings v2 Base ES |
| Jina AI: Jina Embeddings v2 Base Code |
| Jina AI: Jina Embeddings v2 Base DE |
| Jina AI: Jina Embeddings v2 Base ZH |
| Jina AI: Jina Embeddings v2 Base EN |
| Jina AI: Jina Embedding B EN v1 |
| Jina AI: Jina Embeddings v5 Text Small |
| Jina AI: Jina Embeddings v5 Omni Small |
| Jina AI: Jina Embeddings v5 Omni Nano |
| Jina AI: Jina Embeddings v5 Text Nano |
+---------------------------------------------+
RAGFlow(user)> check instance 'test' from 'jina'
SUCCESS
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The Mistral Go driver landed in #14805 with chat, list models, and check
connection. `Embed` was left as a stub that returns `"not implemented"`.
This PR fills the gap.
`conf/models/mistral.json` did not list any embedding model out of the
box, so a tenant who wanted to use Mistral end to end (chat +
embeddings) could not run an embedding call. This PR adds
`mistral-embed` to the config and a real `/v1/embeddings`
implementation.
### What this PR includes
- `conf/models/mistral.json`: add `"embedding": "embeddings"` under
`url_suffix` so the driver can build the URL from config (matches the
`URLSuffix.Embedding` field already used by openai, siliconflow,
zhipu-ai), and add a `mistral-embed` entry under `models`
(1024-dimensional vectors, 8192 max input tokens).
- `internal/entity/models/mistral.go`: replace the `Embed` stub with a
real implementation that POSTs to `/v1/embeddings`. Adds local response
types `mistralEmbeddingData` and `mistralEmbeddingResponse`.
No factory change. No interface change.
### How the implementation works
- Validate `apiConfig`, the API key, and the model name. Use the
existing `baseURLForRegion` helper so an unknown region fails fast with
a clear error.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. Same pattern as `ChatWithMessages` and
`ListModels` already use in this file.
- Send all input texts in one request. The Mistral API accepts the
`input` field as an array.
- Parse `data[*].embedding` and copy each slice into a `[]EmbeddingData`
indexed by `data[*].index` so the output order matches the input order
even if the API returns items in a different order.
- An empty input slice returns `[]EmbeddingData{}` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks that every input slot got a vector. If any slot is
still empty, return a clear error so the caller does not silently use a
zero vector.
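The flow above can be sketched end to end against a local stub server. Everything here is illustrative: `embed`, `embedRequest`, `embedResponse`, and `demo` are hypothetical names, `callTimeout` stands in for `nonStreamCallTimeout` with an assumed value, and the stub payload is invented rather than captured from the Mistral API.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// callTimeout stands in for nonStreamCallTimeout; the value is an assumption.
const callTimeout = 30 * time.Second

type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embedResponse struct {
	Data []struct {
		Index     int       `json:"index"`
		Embedding []float64 `json:"embedding"`
	} `json:"data"`
}

// embed sends all texts in one request and returns vectors in input order.
func embed(ctx context.Context, url, apiKey, model string, texts []string) ([][]float64, error) {
	if len(texts) == 0 {
		return [][]float64{}, nil // empty input: no HTTP call
	}
	ctx, cancel := context.WithTimeout(ctx, callTimeout)
	defer cancel()

	payload, err := json.Marshal(embedRequest{Model: model, Input: texts})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		// Surface the upstream status line and body.
		return nil, fmt.Errorf("embeddings request failed: %s: %s", resp.Status, body)
	}
	var parsed embedResponse
	if err := json.Unmarshal(body, &parsed); err != nil {
		return nil, err
	}
	out := make([][]float64, len(texts))
	for _, d := range parsed.Data {
		if d.Index < 0 || d.Index >= len(texts) {
			return nil, fmt.Errorf("embedding index %d out of range", d.Index)
		}
		out[d.Index] = d.Embedding // place by index, not arrival order
	}
	for i, v := range out {
		if len(v) == 0 {
			return nil, fmt.Errorf("no embedding returned for input %d", i)
		}
	}
	return out, nil
}

// demo runs embed against a local stub that answers out of order.
func demo() ([][]float64, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, `{"data":[{"index":1,"embedding":[2]},{"index":0,"embedding":[1]}]}`)
	}))
	defer srv.Close()
	return embed(context.Background(), srv.URL, "test-key", "mistral-embed", []string{"a", "b"})
}

func main() {
	vecs, err := demo()
	fmt.Println(vecs, err)
}
```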
### Note on stacking
This PR builds on #14805 (the Mistral driver). Until #14805 merges, this
PR's diff on GitHub will include both that PR's commits and this one.
After #14805 lands on `main`, GitHub will auto-reduce this PR to only
the `Embed` changes (one commit, ~111 line diff in `mistral.go` plus 8
lines in `mistral.json`).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the
`go.mod` minimum).
- The full method set on `MistralModel` still matches the `ModelDriver`
interface.
- Pattern parity with the existing OpenAI Embed implementation
(`internal/entity/models/openai.go`).
Closes #14806
Depends on #14805
Tracking: #14736
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
```
RAGFlow(user)> asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method
RAGFlow(user)> stream asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method
RAGFlow(user)> tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method
RAGFlow(user)> stream tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method
RAGFlow(user)> ocr with 'glm-ocr@test@zhipu-ai' file './test.log';
CLI error: zhipu, no such method
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The Upstage Go driver landed in #14817 with chat, list models, and check
connection. `Embed` was left as a stub that returns `"not implemented"`.
This PR fills the gap.
Upstage exposes an OpenAI-compatible embeddings endpoint at
`https://api.upstage.ai/v1/solar/embeddings` via the
`solar-embedding-1-large` family (`solar-embedding-1-large-query` for
queries, `solar-embedding-1-large-passage` for passages), and the Python
side has had `UpstageEmbed(OpenAIEmbed)` in `rag/llm/embedding_model.py`
for a long time targeting this same path. The existing
`conf/models/upstage.json` did not list any embedding model out of the
box, so a tenant who wanted to use Upstage end to end could not run an
embedding call. This PR fills the gap.
### What this PR includes
- `conf/models/upstage.json`: add `"embedding": "embeddings"` under
`url_suffix` so the driver can build the URL from config (matches the
`URLSuffix.Embedding` field already used by openai, mistral,
siliconflow, zhipu-ai), and add `solar-embedding-1-large-query` and
`solar-embedding-1-large-passage` entries under `models`.
- `internal/entity/models/upstage.go`: replace the `Embed` stub with a
real implementation that POSTs to `/v1/solar/embeddings`. Adds local
response types `upstageEmbeddingData` and `upstageEmbeddingResponse`.
No factory change. No interface change.
### How the implementation works
- Validate `apiConfig`, the API key, and the model name. Use the
existing `baseURLForRegion` helper so an unknown region fails fast with
a clear error.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. Same pattern as `ChatWithMessages` and
`ListModels` already use in this file.
- Send all input texts in one request. The Upstage API accepts the
`input` field as an array.
- Parse `data[*].embedding` and copy each slice into a `[]EmbeddingData`
indexed by `data[*].index` so the output order matches the input order
even if the API returns items in a different order.
- An empty input slice returns `[]EmbeddingData{}` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks that every input slot got a vector. If any slot is
still empty, return a clear error so the caller does not silently use a
zero vector.
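The URL construction step, where the base URL for a region is joined with the configured `url_suffix` entry, might look something like this. The body of `baseURLForRegion` below is a hypothetical re-creation (only the helper's name and its fail-fast behavior come from the description), and both the `"default"` region key and the `endpointURL` helper are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// baseURLForRegion is a hypothetical re-creation of the helper named above.
// It fails fast when the region has no configured base URL. Treating an
// empty region as "default" is an assumption of this sketch.
func baseURLForRegion(baseURLs map[string]string, region string) (string, error) {
	if region == "" {
		region = "default"
	}
	u, ok := baseURLs[region]
	if !ok || u == "" {
		return "", fmt.Errorf("no base URL configured for region %q", region)
	}
	return u, nil
}

// endpointURL joins the region's base URL with a url_suffix entry such as
// "embeddings", tolerating stray slashes on either side.
func endpointURL(baseURLs map[string]string, region, suffix string) (string, error) {
	if suffix == "" {
		return "", errors.New("url suffix is empty")
	}
	base, err := baseURLForRegion(baseURLs, region)
	if err != nil {
		return "", err
	}
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(suffix, "/"), nil
}

func main() {
	baseURLs := map[string]string{"default": "https://api.upstage.ai/v1/solar"}

	u, err := endpointURL(baseURLs, "", "embeddings")
	fmt.Println(u, err)

	_, err = endpointURL(baseURLs, "mars", "embeddings")
	fmt.Println(err)
}
```

Returning an error for an empty suffix means a config file that lacks the `embedding` entry produces a clear message instead of a request to the bare base URL.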
### Note on stacking
This PR builds on #14817 (the Upstage driver). Until #14817 merges, this
PR's diff on GitHub will include both that PR's commits and this one.
After #14817 lands on `main`, GitHub will auto-reduce this PR to only
the `Embed` changes (one commit, ~119 line diff in `upstage.go` plus ~15
lines in `upstage.json`).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` returns exit 0 on go 1.25 (the
`go.mod` minimum).
- The full method set on `UpstageModel` still matches the `ModelDriver`
interface.
- Pattern parity with the existing Mistral Embed
(`internal/entity/models/mistral.go`) and OpenAI Embed
(`internal/entity/models/openai.go`) implementations.
Closes #14818
Depends on #14817
Tracking: #14736
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR completes the Baichuan provider
**The following functionalities are now supported:**
**Baichuan:**
- [x] Chat / Stream Chat
- [x] Embedding
- [ ] ~~Rerank~~
- [ ] ~~Model listing~~
- [ ] ~~Provider connection checking~~
- [ ] ~~Balance~~
**Verified examples from the CLI:**
```plaintext
# Baichuan
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'Baichuan-Text-Embedding@test@baichuan' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 1024 | 0 |
| 1024 | 1 |
+-----------+-------+
RAGFlow(user)> chat with 'Baichuan-M2@test@baichuan' message 'who r u'
Answer: I'm BaiChuan, a helpful AI assistant created by Baichuan-AI. I'm designed to be a knowledgeable, friendly, and reliable assistant for various tasks like answering questions, explaining concepts, writing content, and more. Feel free to ask me anything! 😊
Time: 1.637975
RAGFlow(user)> stream chat with 'Baichuan-M2@test@baichuan' message 'who r u'
Answer: I'm BaiChuan-m2, an AI assistant developed by Baichuan-AI. My purpose is to help you with a wide range of tasks by providing information, answering questions, solving problems, and assisting with creative projects. Think of me as a helpful digital companion! If you have any questions or need assistance, just let me know.😊
Time: 1.692321
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
This PR adds focused unit tests for aggregate_by_field in OceanBase
memory utilities to improve behavior coverage for real-world input
shapes.
- Adds test coverage for list-valued aggregation fields, including
whitespace trimming and skipping invalid list entries.
- Adds test coverage for scalar field values to ensure blank/non-string
values are ignored.
- Confirms aggregation output remains correct and stable for
mixed-quality message payloads.
### Why this helps
It strengthens regression protection for aggregation logic used by
memory retrieval flows, with no production code changes and minimal
review risk.
### What problem does this PR solve?
Fix:
- Update null checks to use `is None` for better clarity.
- Replace `RAGFlowSelect` with `SelectWithSearch` in `DebugContent`.
- Add max height and overflow to `DialogContent` in `ParameterDialog`.
- Remove unused types from `DataOperationsForm`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fixes #13817
### What problem does this PR solve?
The "knowledge graph construction" link on line 21 of
`docs/guides/dataset/run_retrieval_test.md` points to
`./construct_knowledge_graph.md`, which doesn't exist. The actual file
is at `./advanced/construct_knowledge_graph.md`.
### Type of change
- [x] Documentation Update
Signed-off-by: majiayu000 <1835304752@qq.com>
### What problem does this PR solve?
Add a Go driver for StepFun (阶跃星辰), one of the unchecked providers on
the umbrella tracking issue #14736.
Until this PR, a tenant who configured `stepfun` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver. Chat, list
models, and check connection all returned `"not implemented"` instead of
reaching the StepFun API.
The Python side has had StepFun registered in `rag/llm/__init__.py` as a
`SupportedLiteLLMProvider` with base URL `https://api.stepfun.com/v1`,
plus `StepFunCV` for vision and `StepFunSeq2txt` for ASR, but no Go
path. StepFun's chat API is OpenAI-compatible, so the implementation
pattern is the same as the merged Moonshot driver (#14433) and OpenAI
driver (#14605).
### What this PR includes
- New file `internal/entity/models/stepfun.go` with a `StepFunModel`
that implements the `ModelDriver` interface.
- `factory.go`: route the `"stepfun"` provider name to
`NewStepFunModel`.
- New `conf/models/stepfun.json` with the public StepFun chat models
(step-2-16k, step-1 family in 8k/32k/128k/256k context lengths,
step-1-flash, and the step-1v / step-1o vision models) and `url_suffix`
entries for `chat` and `models`.
### How the driver works
- StepFun exposes the OpenAI-compatible API at
`https://api.stepfun.com/v1`.
- `ChatWithMessages` and `ChatStreamlyWithSender` post to
`/chat/completions` in the same shape as the merged moonshot,
openrouter, and openai drivers.
- `ListModels` and `CheckConnection` call `/models` to list available
ids and confirm the API key works.
- `Embed` is left as `"not implemented"`. StepFun has not advertised a
public embeddings endpoint in the API reference linked from the umbrella
issue
(`https://platform.stepfun.com/docs/en/api-reference/chat/chat-completion-create`
is the chat endpoint), so any real implementation belongs in a separate
follow-up only after the endpoint is verified.
- `Rerank` and `Balance` return `"no such method"` because StepFun does
not expose either.
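The request shape described above can be sketched in a few lines. This is a Python illustration of the OpenAI-compatible call only, not the Go driver itself; `build_chat_request` is a hypothetical helper and the suffix values in `URL_SUFFIX` are assumed to match the `url_suffix` entries in `conf/models/stepfun.json`.

```python
import json

STEPFUN_BASE_URL = "https://api.stepfun.com/v1"  # base URL stated in this PR
# Assumed suffix values; the real ones live in conf/models/stepfun.json.
URL_SUFFIX = {"chat": "chat/completions", "models": "models"}


def build_chat_request(api_key: str, model: str, messages: list, stream: bool = False):
    """Return (url, headers, body) for an OpenAI-compatible chat completion call."""
    url = f"{STEPFUN_BASE_URL.rstrip('/')}/{URL_SUFFIX['chat']}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return url, headers, body


url, headers, body = build_chat_request(
    api_key="sk-placeholder",  # placeholder, not a real key
    model="step-2-16k",
    messages=[{"role": "user", "content": "who r u"}],
)
```

The same base URL joined with the `models` suffix would serve both `ListModels` and `CheckConnection`, which is why the two can share one lightweight request.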
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` returns exit 0 with no errors
on go 1.25 (the `go.mod` minimum).
- Method set of `StepFunModel` matches the `ModelDriver` interface:
`NewInstance`, `Name`, `ChatWithMessages`, `ChatStreamlyWithSender`,
`Embed`, `Rerank`, `ListModels`, `Balance`, `CheckConnection`.
- Pattern parity with the merged moonshot (#14433), openai (#14605),
openrouter (#14652), and xai (#14550) drivers.
Closes #14814
Tracking: #14736
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.6.3 to 2.7.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/releases">urllib3's
releases</a>.</em></p>
<blockquote>
<h2>2.7.0</h2>
<h2>🚀 urllib3 is fundraising for HTTP/2 support</h2>
<p><a
href="https://sethmlarson.dev/urllib3-is-fundraising-for-http2-support">urllib3
is raising ~$40,000 USD</a> to release HTTP/2 support and ensure
long-term sustainable maintenance of the project after a sharp decline
in financial support. If your company or organization uses Python and
would benefit from HTTP/2 support in Requests, pip, cloud SDKs, and
thousands of other projects <a
href="https://opencollective.com/urllib3">please consider contributing
financially</a> to ensure HTTP/2 support is developed sustainably and
maintained for the long-haul.</p>
<p>Thank you for your support.</p>
<h2>Security</h2>
<p>Addressed high-severity security issues. Impact was limited to
specific use cases detailed in the accompanying advisories; overall user
exposure was estimated to be marginal.</p>
<ul>
<li>
<p>Decompression-bomb safeguards of the streaming API were bypassed:</p>
<ol>
<li>When <code>HTTPResponse.drain_conn()</code> was called after the
response had been read and decompressed partially. (Reported by <a
href="https://github.com/Cycloctane"><code>@Cycloctane</code></a>)</li>
<li>During the second <code>HTTPResponse.read(amt=N)</code> or
<code>HTTPResponse.stream(amt=N)</code> call when the response was
decompressed using the official <a
href="https://pypi.org/project/brotli/">Brotli</a> library. (Reported by
<a
href="https://github.com/kimkou2024"><code>@kimkou2024</code></a>)</li>
</ol>
<p>See GHSA-mf9v-mfxr-j63j for details.</p>
</li>
<li>
<p>HTTP pools created using
<code>ProxyManager.connection_from_url</code> did not strip sensitive
headers specified in <code>Retry.remove_headers_on_redirect</code> when
redirecting to a different host. (GHSA-qccp-gfcp-xxvc reported by <a
href="https://github.com/christos-spearbit"><code>@christos-spearbit</code></a>)</p>
</li>
</ul>
<h2>Deprecations and Removals</h2>
<ul>
<li>Used <code>FutureWarning</code> instead of
<code>DeprecationWarning</code> for better visibility of existing
deprecation notices. Rescheduled the removal of deprecated features to
version 3.0. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3763">urllib3/urllib3#3763</a>)</li>
<li>Removed support for end-of-life Python 3.9. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3720">urllib3/urllib3#3720</a>)</li>
<li>Removed support for end-of-life PyPy3.10. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4979">urllib3/urllib3#4979</a>)</li>
<li>Bumped the minimum supported pyOpenSSL version to 19.0.0. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3777">urllib3/urllib3#3777</a>)</li>
</ul>
<h2>Bugfixes</h2>
<ul>
<li>Fixed a bug where <code>HTTPResponse.read(amt=None)</code> was
ignoring decompressed data buffered from previous partial reads. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3636">urllib3/urllib3#3636</a>)</li>
<li>Fixed a bug where <code>HTTPResponse.read()</code> could cache only
part of the response after a partial read when
<code>cache_content=True</code>. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4967">urllib3/urllib3#4967</a>)</li>
<li>Fixed <code>HTTPResponse.stream()</code> and
<code>HTTPResponse.read_chunked()</code> to handle <code>amt=0</code>.
(<a
href="https://redirect.github.com/urllib3/urllib3/issues/3793">urllib3/urllib3#3793</a>)</li>
<li>Updated <code>_TYPE_BODY</code> type alias to include missing
<code>Iterable[str]</code>, matching the documented and runtime behavior
of chunked request bodies. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3798">urllib3/urllib3#3798</a>)</li>
<li>Fixed <code>LocationParseError</code> when paths resembling
schemeless URIs were passed to
<code>HTTPConnectionPool.urlopen()</code>. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3352">urllib3/urllib3#3352</a>)</li>
<li>Fixed <code>BaseHTTPResponse.readinto()</code> type annotation to
accept <code>memoryview</code> in addition to <code>bytearray</code>,
matching the <code>io.RawIOBase.readinto</code> contract and enabling
use with <code>io.BufferedReader</code> without type errors. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3764">urllib3/urllib3#3764</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/blob/main/CHANGES.rst">urllib3's
changelog</a>.</em></p>
<blockquote>
<h1>2.7.0 (2026-05-07)</h1>
<h2>Security</h2>
<p>Addressed high-severity security issues.
Impact was limited to specific use cases detailed in the accompanying
advisories; overall user exposure was estimated to be marginal.</p>
<ul>
<li>
<p>Decompression-bomb safeguards of the streaming API were bypassed:</p>
<ol>
<li>When <code>HTTPResponse.drain_conn()</code> was called after the
response had been
read and decompressed partially.</li>
<li>During the second <code>HTTPResponse.read(amt=N)</code> or
<code>HTTPResponse.stream(amt=N)</code> call when the response was
decompressed
using the official <code>Brotli
<https://pypi.org/project/brotli/></code>__ library.</li>
</ol>
<p>See <code>GHSA-mf9v-mfxr-j63j
<https://github.com/urllib3/urllib3/security/advisories/GHSA-mf9v-mfxr-j63j></code>__
for details.</p>
</li>
<li>
<p>HTTP pools created using
<code>ProxyManager.connection_from_url</code> did not strip
sensitive headers specified in
<code>Retry.remove_headers_on_redirect</code> when
redirecting to a different host.
(<code>GHSA-qccp-gfcp-xxvc
<https://github.com/urllib3/urllib3/security/advisories/GHSA-qccp-gfcp-xxvc></code>__)</p>
</li>
</ul>
<h2>Deprecations and Removals</h2>
<ul>
<li>Used <code>FutureWarning</code> instead of
<code>DeprecationWarning</code> for better
visibility of existing deprecation notices. Rescheduled the removal of
deprecated features to version 3.0.
(<code>[#3763](https://github.com/urllib3/urllib3/issues/3763)
<https://github.com/urllib3/urllib3/issues/3763></code>__)</li>
<li>Removed support for end-of-life Python 3.9.
(<code>[#3720](https://github.com/urllib3/urllib3/issues/3720)
<https://github.com/urllib3/urllib3/issues/3720></code>__)</li>
<li>Removed support for end-of-life PyPy3.10.
(<code>[#4979](https://github.com/urllib3/urllib3/issues/4979)
<https://github.com/urllib3/urllib3/issues/4979></code>__)</li>
<li>Bumped the minimum supported pyOpenSSL version to 19.0.0.
(<code>[#3777](https://github.com/urllib3/urllib3/issues/3777)
<https://github.com/urllib3/urllib3/issues/3777></code>__)</li>
</ul>
<h2>Bugfixes</h2>
<ul>
<li>Fixed a bug where <code>HTTPResponse.read(amt=None)</code> was
ignoring decompressed
data buffered from previous partial reads.
(<code>[#3636](https://github.com/urllib3/urllib3/issues/3636)
<https://github.com/urllib3/urllib3/issues/3636></code>__)</li>
<li>Fixed a bug where <code>HTTPResponse.read()</code> could cache only
part of the
response after a partial read when <code>cache_content=True</code>.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9a950b92d9"><code>9a950b9</code></a>
Release 2.7.0</li>
<li><a
href="5ec0de499b"><code>5ec0de4</code></a>
Merge commit from fork</li>
<li><a
href="2bdcc44d1e"><code>2bdcc44</code></a>
Merge commit from fork</li>
<li><a
href="f45b0df09d"><code>f45b0df</code></a>
Fix a misleading example for <code>ProxyManager</code> (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4970">#4970</a>)</li>
<li><a
href="577193ca02"><code>577193c</code></a>
Switch to nightly PyPy3.11 in CI for now (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4984">#4984</a>)</li>
<li><a
href="e90af45bb0"><code>e90af45</code></a>
Avoid infinite loop in <code>HTTPResponse.read_chunked</code> when
<code>amt=0</code> (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4974">#4974</a>)</li>
<li><a
href="67ed74fdae"><code>67ed74f</code></a>
Bump dev dependencies (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4972">#4972</a>)</li>
<li><a
href="3abd481097"><code>3abd481</code></a>
Upgrade mypy to version 1.20.2 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4978">#4978</a>)</li>
<li><a
href="2b8725dfca"><code>2b8725d</code></a>
Drop support for EOL PyPy3.10 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4979">#4979</a>)</li>
<li><a
href="2944b2a0a6"><code>2944b2a</code></a>
Upgrade <code>setup-chrome</code> and <code>setup-firefox</code> to fix
warnings (<a
href="https://redirect.github.com/urllib3/urllib3/issues/4973">#4973</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/urllib3/urllib3/compare/2.6.3...2.7.0">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### What problem does this PR solve?
Closes #14674.
This PR improves RAPTOR configuration and tree construction while
preserving the existing RAPTOR behavior as the default.
RAPTOR currently builds summary layers with the original UMAP + GMM
clustering path. This PR keeps that default path, and adds:
- A hidden backend tree-builder option:
- `tree_builder="raptor"`: default, existing RAPTOR behavior.
- `tree_builder="psi"`: rank-aware Psi-style tree builder using original
embedding-space cosine ranking.
- A user-facing clustering method option for the default RAPTOR builder:
- `clustering_method="gmm"`: existing default.
- `clustering_method="ahc"`: agglomerative hierarchical clustering path.
- A RAPTOR UI setting for `Clustering method` and `Max cluster`.
### What changed
#### Backend
- Added `tree_builder` support for RAPTOR/Psi.
- Added `clustering_method` support for GMM/AHC.
- Kept existing RAPTOR + GMM as the default.
- Added Psi tree building from original-space cosine similarity.
- Added bucketed Psi building controls for large inputs:
- `raptor.ext.psi_exact_max_leaves`
- `raptor.ext.psi_bucket_size`
- Added method-aware RAPTOR summary metadata using existing
`extra.raptor_method`.
- Avoided adding a dedicated DB schema field for experimental method
tracking.
- Added cleanup/migration logic to avoid mixing stale RAPTOR summary
trees.
- Added defensive checks for Psi tree construction and summary failures.
#### Frontend/UI
- Added `Clustering method` in RAPTOR settings with `GMM` and `AHC`.
- Added/kept `Max cluster` in RAPTOR settings.
- Enlarged max cluster UI limit to `1024`, matching backend validation.
- Kept AHC editable even when a RAPTOR task has already finished.
- Fixed the UI save payload so `clustering_method` and `tree_builder`
are serialized through `parser_config.raptor.ext`, avoiding backend
validation errors for extra top-level RAPTOR fields.
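One way to picture how a backend might read these options back out of `parser_config.raptor.ext` is sketched below. `resolve_raptor_options` is a hypothetical helper, and the lower bound of 1 for `max_cluster` is an assumption; only the defaults (`raptor`, `gmm`) and the upper limit of `1024` come from this description.

```python
ALLOWED_TREE_BUILDERS = {"raptor", "psi"}
ALLOWED_CLUSTERING_METHODS = {"gmm", "ahc"}
MAX_CLUSTER_LIMIT = 1024  # upper bound stated in this PR


def resolve_raptor_options(parser_config: dict) -> dict:
    """Hypothetical helper: read RAPTOR options, falling back to the defaults."""
    raptor = parser_config.get("raptor") or {}
    ext = raptor.get("ext") or {}

    tree_builder = ext.get("tree_builder", "raptor")
    clustering_method = ext.get("clustering_method", "gmm")
    if tree_builder not in ALLOWED_TREE_BUILDERS:
        raise ValueError(f"unsupported tree_builder: {tree_builder!r}")
    if clustering_method not in ALLOWED_CLUSTERING_METHODS:
        raise ValueError(f"unsupported clustering_method: {clustering_method!r}")

    max_cluster = raptor.get("max_cluster")
    # Lower bound of 1 is an assumption made for this sketch.
    if max_cluster is not None and not (1 <= max_cluster <= MAX_CLUSTER_LIMIT):
        raise ValueError(f"max_cluster out of range: {max_cluster}")

    return {
        "tree_builder": tree_builder,
        "clustering_method": clustering_method,
        "max_cluster": max_cluster,
    }


saved = {"raptor": {"max_cluster": 317, "ext": {"clustering_method": "ahc", "tree_builder": "raptor"}}}
options = resolve_raptor_options(saved)
defaults = resolve_raptor_options({})
```

Keeping both new keys under `ext` means a config saved before this change resolves to the existing RAPTOR + GMM path without any migration.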
Example saved RAPTOR config:
```json
{
  "raptor": {
    "max_cluster": 317,
    "ext": {
      "clustering_method": "ahc",
      "tree_builder": "raptor"
    }
  }
}
```
Co-authored-by: CaptainTimon <CaptainTimon@users.noreply.github.com>
## Summary
- Add GET method handler to `/api/v1/dify/retrieval` endpoint for Dify
external knowledge base connectivity verification
- GET requests return a simple success response; POST requests retain
existing retrieval logic unchanged
## Problem
When Dify integrates with RAGFlow as an external knowledge base, it
sends periodic GET requests to the retrieval endpoint for
health/connectivity checks. The endpoint only accepted POST, causing
werkzeug to return `405 Method Not Allowed`. After several successful
POST retrievals, the failing GET health checks trigger Dify's circuit
breaker, causing all subsequent requests to fail.
Traceback from the issue:
```
werkzeug.exceptions.MethodNotAllowed: 405 Method Not Allowed: The method is not allowed for the requested URL.
```
## Changes
- `api/apps/sdk/dify_retrieval.py`: Added a separate GET route handler
(`retrieval_health_check`) that returns `get_json_result(data=True)`
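The split between the two handlers can be sketched without any web framework. Everything below is a simplified stand-in: `get_json_result` here only mimics the response shape listed in the test plan, `retrieval` is a placeholder for the unchanged POST logic, and `dispatch` is a hypothetical router used purely for illustration.

```python
def get_json_result(data=None, code=0, message="success"):
    """Stand-in mimicking the response envelope shown in the test plan."""
    return {"code": code, "message": message, "data": data}


def retrieval_health_check(payload=None):
    # GET: no auth-dependent retrieval work, just confirm the endpoint is reachable.
    return get_json_result(data=True)


def retrieval(payload=None):
    # POST: placeholder for the existing retrieval logic, which this PR leaves unchanged.
    return get_json_result(data={"records": []})


ROUTES = {
    ("GET", "/api/v1/dify/retrieval"): retrieval_health_check,
    ("POST", "/api/v1/dify/retrieval"): retrieval,
}


def dispatch(method: str, path: str, payload=None):
    """Hypothetical router: unknown method/path pairs yield a 405-style result."""
    handler = ROUTES.get((method.upper(), path))
    if handler is None:
        return get_json_result(code=405, message="Method Not Allowed")
    return handler(payload)


health = dispatch("GET", "/api/v1/dify/retrieval")
```

Before this change the GET entry was effectively missing from the route table, which is what produced the 405 responses.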
## Test plan
- [ ] Verify `GET /api/v1/dify/retrieval` returns `{"code": 0,
"message": "success", "data": true}`
- [ ] Verify `POST /api/v1/dify/retrieval` with valid API key and body
still works as before
- [ ] Verify Dify external knowledge base integration no longer returns
405 errors
Closes #13788
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Asksksn <Asksksn@noreply.gitcode.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
1. Add region check in zhipu-ai embed method
2. Fix retrieval test
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR completes the Cohere provider integration (upgrading to the new
Cohere V2 API) and enhances the Fish Audio provider in RAGFlow.
**The following functionalities are now supported:**
**Cohere:**
- [x] Chat / Think Chat / Stream Chat / Stream Think Chat
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [ ] Balance
**Fish Audio:**
- [x] Model listing (`ListModels`)
- [x] Balance (`Balance`)
-----
**Verified examples from the CLI:**
```plaintext
# Cohere
RAGFlow(user)> think chat with 'command-a-reasoning-08-2025@test3@cohere' message 'jumperwho'
Thinking: Okay, the user wrote "jumperwho". Let me try to figure out what they might be asking. First, I'll check if it's a misspelling. "Jumper" ...... Hmm. Since the query is unclear, the best approach is to ask the user to provide more context or correct any possible typos.
Answer: It seems there might be a typo or missing context in your query "jumperwho." Could you clarify what you're referring to? For example:
- Are you asking about a **jumper** (a type of sweater, a person who jumps, or a component in electronics)?
- Is this related to a specific context, like a movie (e.g., the 2008 film *Jumper*) or a game?
- Did you mean to ask about a person ("who") associated with jumping (e.g., a parachutist)?
Let me know so I can provide a helpful response! 😊
Time: 6.710331
RAGFlow(user)> stream think chat with 'command-a-reasoning-08-2025@test3@cohere' message 'jumperwho'
Thinking: , the user mentioned "jumperwho". Let me try to figure out what they're referring to. First, I'll check if it's a misspelling. "Jumper" could be a typo for "jumper" or maybe a username. Alternatively, it might be a combination of words like "jumper who",....... the best approach is to inform the user that I don't recognize the term and ask if they can provide more context or clarify what they mean by "jumperwho". That way, I can assist them better once I have more information.
Answer: seems "jumperwho" isn't a widely recognized term, proper noun, or acronym in common usage. Could you provide more context or clarify what you mean by "jumperwho"? This will help me understand your question or request better!
Time: 4.513596
RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'embed-v4.0@test3@cohere' dimension 16;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| embedding | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| [-0.016643638 -0.001957038 0.0055713872 0.009027058 0.05275187 -0.024542313 -0.044006906 0.024119169 0.0014192933 0.006558722 0.0019129605 -0.021016119 -0.026516981 -0.017489925 0.021298215 0.017772019 0.04569948 0.008886009 0.012059584 -0.0014721862 0.... | 0 |
| [0.018778935 -0.0063459855 -0.0006839742 0.0046623563 0.0067668925 -0.018001877 -0.03963003 0.035744734 -0.014246088 -0.0020721585 -0.006313608 0.025124922 -0.010749322 0.01217393 -0.010231283 -0.025254432 0.021498645 -0.028880708 0.019167464 -0.0058279... | 1 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'rerank-v4.0-pro@test@cohere' top 3;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0 | 0.91744334 |
| 1 | 0.7458429 |
| 2 | 0.68729424 |
+-------+-----------------+
RAGFlow(user)> list supported models from 'cohere' 'test'
+-------------------------------------+
| model_name |
+-------------------------------------+
| c4ai-aya-expanse-32b |
| c4ai-aya-vision-32b |
| cohere-transcribe-03-2026 |
| command-a-03-2025 |
| command-a-reasoning-08-2025 |
| command-a-translate-08-2025 |
| command-a-vision-07-2025 |
| command-r-08-2024 |
| command-r-plus-08-2024 |
| command-r7b-12-2024 |
| command-r7b-arabic-02-2025 |
| embed-english-light-v3.0 |
| embed-english-light-v3.0-image |
| embed-english-v3.0 |
| embed-english-v3.0-image |
| embed-multilingual-light-v3.0 |
| embed-multilingual-light-v3.0-image |
| embed-multilingual-v3.0 |
| embed-multilingual-v3.0-image |
| embed-v4.0 |
+-------------------------------------+
RAGFlow(user)> check instance 'test' from 'cohere'
SUCCESS
# FishAudio
RAGFlow(user)> list supported models from 'fishaudio' 'test'
+----------------------------------------+
| model_name |
+----------------------------------------+
| Valentino Narración Biblica Fer |
| Super Smash Bros. 4/Ultimate Announcer |
| Farid Dieck |
| عصام الشوالي |
| ALEX_CHIKNA |
| Energetic Male |
| voz de locutor k |
| يي |
| ELITE |
| Mortal Kombat |
+----------------------------------------+
RAGFlow(user)> show balance from 'fishaudio' 'test'
+----------------------------------+-----------------------------+--------+-----------------+------------------+-----------------------------+----------------------------------+
| _id | created_at | credit | has_free_credit | has_phone_sha256 | updated_at | user_id |
+----------------------------------+-----------------------------+--------+-----------------+------------------+-----------------------------+----------------------------------+
| 82ffec12cf984d88a30ec504d7909812 | 2026-05-09T07:52:16.119000Z | 0 | | false | 2026-05-09T07:52:16.119000Z | 2578ab1126804d6eaa630552400d7ff3 |
+----------------------------------+-----------------------------+--------+-----------------+------------------+-----------------------------+----------------------------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
## Summary
- Replaces the `"no such method"` stub on `NvidiaModel.Rerank`
(`internal/entity/models/nvidia.go`) with a real implementation against
NVIDIA NIM's `/ranking` endpoint.
- Mirrors the existing Python `NvidiaRerank` class at
`rag/llm/rerank_model.py:149-190` for behavior parity: same
`passages`/`query.text`/`logit` payload shape; `top_n` set to
`len(documents)` so every input gets a score returned in original order
(the issue body's spec omitted `top_n`, which would cause silent data
loss).
- Adds the `"rerank": "ranking"` URL suffix and two NIM rerank model
entries (`nvidia/nv-rerankqa-mistral-4b-v3`,
`nvidia/llama-3.2-nv-rerankqa-1b-v2`) to `conf/models/nvidia.json` so
the picker exposes them.
- Follows the same shape as the recently merged Aliyun (#14676), Gitee
(#14656), and ZhipuAI (#14608) Rerank implementations: lowercase
per-driver request/response types, conversion to the project-wide
`RerankResponse{Data: []RerankResult}`, per-call `context.WithTimeout`
of 30s.
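The payload shape and the reason for setting `top_n` can be illustrated with a short Python sketch. The response field names `rankings` and `index` are assumptions about the NIM response format, and both helper functions are hypothetical; only `passages`, `query.text`, `logit`, and `top_n = len(documents)` come from this description.

```python
def build_rerank_payload(model: str, query: str, documents: list) -> dict:
    """Build a request body in the passages / query.text shape described above."""
    return {
        "model": model,
        "query": {"text": query},
        "passages": [{"text": doc} for doc in documents],
        # Ask for a score for every input; without this only a default top-K comes back.
        "top_n": len(documents),
    }


def scores_in_input_order(response: dict, n_documents: int) -> list:
    """Map ranked results back to the original document order.

    Assumes each ranking carries an `index` into the input and a `logit` score.
    """
    scores = [0.0] * n_documents
    for ranking in response.get("rankings", []):
        idx = ranking["index"]
        if 0 <= idx < n_documents:
            scores[idx] = ranking["logit"]
    return scores


docs = ["rag is retrieval augment generation", "rag need llm", "famous rag project includes ragflow"]
payload = build_rerank_payload("nvidia/llama-3.2-nv-rerankqa-1b-v2", "what is rag", docs)
# Made-up response values, used only to exercise the mapping.
fake_response = {"rankings": [{"index": 2, "logit": 1.5}, {"index": 0, "logit": 3.0}, {"index": 1, "logit": -0.5}]}
scores = scores_in_input_order(fake_response, len(docs))
```

If `top_n` were omitted and the service returned fewer rankings than inputs, the missing positions would silently keep their placeholder score, which is the data-loss case the summary refers to.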
Closes #14720
## Test plan
- [x] `gofmt -l internal/entity/models/nvidia.go` — clean
- [x] `go vet ./internal/entity/models/...` — no new errors introduced
(the two pre-existing vet errors in `baidu.go:642` and
`openrouter.go:566` are unrelated to this PR)
- [x] `go build ./internal/entity/models/...` — succeeds
- [x] `python3 -c "import json;
json.load(open('conf/models/nvidia.json'))"` — JSON valid
- [ ] Live smoke test against NVIDIA NIM with a real API key (requires
reviewer with NIM credentials)
## Notes for reviewers
- The issue body suggested omitting `top_n`. The Python reference
includes it (`top_n: len(texts)`), and without it NVIDIA returns only
the default top-K rankings rather than scores for every input. This PR
follows the Python.
- The URL host is `integrate.api.nvidia.com` (kept consistent with the
existing chat/embeddings BaseURL in `nvidia.go`), not the legacy
`ai.api.nvidia.com` host the Python uses. NIM's unified endpoint accepts
the model names as-is, so no per-model URL transform is needed.
### What problem does this PR solve?
As the title suggests.
### Type of change
- [x] Documentation Update
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes #14570. On OpenSearch backends (`DOC_ENGINE=opensearch`) every
document-metadata write failed with `'OSConnection' object has no
attribute 'create_doc_meta_idx'`, so both `PATCH
/api/v1/datasets/{ds}/documents/{doc}` with `meta_fields` and `POST
/api/v1/datasets/{ds}/metadata/update` were unusable while every other
document operation (retrieval, parsing, name update, chunk management)
worked correctly on the same OpenSearch cluster.
The bug runs deeper than the missing method name in the error message
suggests. `DocMetadataService` also reached into
`settings.docStoreConn.es.*` directly for the index refresh, the
scripted partial update, and the count call, which means that even after
adding `create_doc_meta_idx` to `OSConnection` the very next call in the
same metadata flow would still raise `AttributeError` because
`OSConnection` exposes `self.os` rather than `self.es`. Fixing only the
reported symptom would have moved the failure one line down without
restoring the feature.
This PR adds a uniform document-metadata dispatch surface to both
connection classes so they present the same abstract API, and routes the
service layer through that surface via `getattr` guards instead of
poking at backend-specific attributes. The four new methods on
`OSConnection` and `ESConnectionBase` are `create_doc_meta_idx`,
`refresh_idx`, `count_idx`, and `replace_meta_fields`.
`OSConnection.create_doc_meta_idx` reuses the existing
`conf/doc_meta_es_mapping.json` schema in the OpenSearch `body=` form
because OpenSearch and Elasticsearch share the same index-creation
payload, and `replace_meta_fields` emits a full scripted assignment
(`ctx._source.meta_fields = params.meta_fields`) on both backends so
removed keys actually disappear instead of being preserved by deep-merge
semantics.
The `getattr`-guarded dispatch in `DocMetadataService` keeps the
existing fall-through paths intact for Infinity and OceanBase, which
continue to rely on their search-based count fallback and on the
delete-then-insert metadata replacement they used before, so this change
is strictly additive for those two backends.
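A minimal sketch of the `getattr`-guarded dispatch follows. The two connection classes are fakes and the method signatures are assumptions made for illustration; only the method names and the script source `ctx._source.meta_fields = params.meta_fields` are taken from this description.

```python
class FakeSearchConn:
    """Stand-in for a backend exposing the new dispatch surface (ES/OpenSearch-like)."""

    def __init__(self):
        self.docs = {}
        self.refreshed = []
        self.last_script = None

    def replace_meta_fields(self, index_name, doc_id, meta_fields):
        # Full assignment, so keys absent from meta_fields disappear.
        self.last_script = {
            "source": "ctx._source.meta_fields = params.meta_fields",
            "params": {"meta_fields": meta_fields},
        }
        self.docs[doc_id] = dict(meta_fields)

    def refresh_idx(self, index_name):
        self.refreshed.append(index_name)


class FakeFallbackConn:
    """Stand-in for a backend without the new methods (Infinity/OceanBase-like)."""

    def __init__(self):
        self.docs = {}

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)

    def insert(self, doc_id, meta_fields):
        self.docs[doc_id] = dict(meta_fields)


def update_meta(conn, index_name, doc_id, meta_fields):
    """Use the uniform surface when present, otherwise fall through unchanged."""
    replace = getattr(conn, "replace_meta_fields", None)
    if callable(replace):
        replace(index_name, doc_id, meta_fields)
        refresh = getattr(conn, "refresh_idx", None)
        if callable(refresh):
            refresh(index_name)
        return "scripted"
    conn.delete(doc_id)
    conn.insert(doc_id, meta_fields)
    return "delete-insert"


search_conn = FakeSearchConn()
search_conn.docs["d1"] = {"author": "a", "year": 2020}
path_a = update_meta(search_conn, "doc_meta_idx", "d1", {"author": "b"})

fallback_conn = FakeFallbackConn()
path_b = update_meta(fallback_conn, "doc_meta_idx", "d1", {"author": "b"})
```

Because the service never touches `conn.es` or `conn.os` directly, the same call works against either search backend, and backends lacking the methods keep their previous behavior.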
Verification: `pytest
test/unit_test/rag/utils/test_opensearch_doc_meta.py` runs 16 new unit
tests that pass locally and pin the `OSConnection` dispatch surface, the
`create_doc_meta_idx` short-circuit when the index already exists, the
mapping-file payload routing, the `IndicesClient.create` failure path,
the `refresh_idx` and `count_idx` success and error sentinels, and the
full-assignment script emitted by `replace_meta_fields`. The test module
stubs `common.settings` and `rag.nlp` at import time so the suite runs
without the heavy backend SDKs that the rest of the repository pulls in
transitively.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>
### What problem does this PR solve?
Fix some comments to improve readability.
### Type of change
- [x] Documentation Update
---------
Signed-off-by: box4wangjing <box4wangjing@outlook.com>
Fixes #13851
## Problem
`OCR.detect()` in `deepdoc/vision/ocr.py` returns `None, None,
time_dict` (a truthy 3-tuple) when the text detector fails or receives a
`None` image. However, the caller in `pdf_parser.py:__ocr()` checks:
```python
bxs = self.ocr.detect(np.array(img), device_id)
if not bxs: # False! (None, None, time_dict) is a non-empty tuple → truthy
self.boxes.append([])
return
bxs = [(line[0], line[1][0]) for line in bxs] # iterates (None, None, time_dict)
# line = None → None[0] → TypeError: 'NoneType' object is not subscriptable
```
This causes the `NoneType object is not subscriptable` error that
appears after "OCR started" in the chunking pipeline when using PDF +
General parser.
## Solution
Simplified `OCR.detect()` to return `None` (falsy) instead of `None,
None, time_dict` on failure. The `time_dict` was unused by the only
caller of this method. The early-return guard `if not bxs:` in
`pdf_parser.py` then correctly catches it.
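The truthiness difference behind this fix can be reproduced in isolation. The functions below are simplified stand-ins, not the real `OCR.detect()` or `__ocr()`; the shape of the successful return value is an assumption chosen only so the caller's list comprehension has something to iterate.

```python
def detect_before(image):
    """Old failure behavior: a 3-tuple, which is truthy even though it holds Nones."""
    time_dict = {"all": 0.0}
    if image is None:
        return None, None, time_dict
    return [([[0, 0], [1, 0], [1, 1], [0, 1]], ("text", 0.9))]


def detect_after(image):
    """New failure behavior: plain None, which is falsy."""
    if image is None:
        return None
    return [([[0, 0], [1, 0], [1, 1], [0, 1]], ("text", 0.9))]


def caller(detect, image):
    """Mirrors the guard-then-iterate pattern used by the PDF parser."""
    bxs = detect(image)
    if not bxs:
        return []
    return [(line[0], line[1][0]) for line in bxs]


try:
    caller(detect_before, None)
    old_error = None
except TypeError as exc:
    old_error = str(exc)  # the guard was skipped and None was subscripted

new_result = caller(detect_after, None)  # the guard fires and returns []
```

Any non-empty tuple is truthy regardless of its contents, so `if not bxs:` can only protect the caller when the failure value is `None` or an empty container.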
## Testing
- The method's only caller (`pdf_parser.py:__ocr`) already has an `if not
bxs:` guard that handles the `None` return correctly.
- No other callers of `OCR.detect()` exist in the codebase.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Modified OCR detection function return behavior to streamline output.
The function now returns detection results only, without timing
metadata. Error cases now return `None` instead of empty tuple values.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This PR fixes a UI issue where the .docx document preview was displayed
incompletely when clicking on a citation/reference link during a
knowledge base conversation.
### What problem does this PR solve?
The Issue:
In the chat interface, when a user clicks the source citation at the end
of an answer, the DocPreviewer opens. However, for .docx files, if the
content exceeded the window height, it was truncated and unscrollable,
preventing users from reading the full referenced text.
Changes:
`web/src/components/document-preview/doc-preview.tsx`: Added the
`overflow-auto` Tailwind class to the `DocPreviewer` root container so
that scrollbars appear automatically when content overflows.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: nie.weiyang <nie.weiyang@embedway.com>
### What problem does this PR solve?
The document parse status was set to DONE before the document chunks
were actually retrievable from Elasticsearch/OpenSearch, because the
status update did not wait for the index refresh. As a result, the API
could report a parse status of DONE while a chunk retrieval request
still returned nothing. Since the index refreshes every 1 second, this
was quite likely to happen when waiting for document parsing by polling
at a short interval and then immediately retrieving chunks once the
status was DONE.
I fixed this bug and added a test case that would have caught it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Added a private helper _visibility_and_status_filter(joined_tenant_ids,
user_id) that returns the Peewee condition: visible to user (team or
own) and status is VALID.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Serobabov Aleksandr <40SerobabovAS@region.cbr.ru>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Addresses event-loop blocking under high concurrency reported in #13825.
When multiple requests hit the API simultaneously, synchronous DB/Redis
calls block the async event loop, preventing Quart from handling other
requests and causing cascading 502/504 timeouts.
This PR wraps all remaining blocking DB/Redis calls in `canvas_app.py`,
`chat_api.py`, `session.py`, and `canvas_service.py` with `await
thread_pool_exec()`:
- Offload all synchronous `Service.*`, `REDIS_CONN.*`, and
`APIToken.query` calls to the thread pool
- Convert sync endpoint handlers (`list_chats`, `get_chat`, `templates`,
`sessions`, etc.) to `async def`
- Convert sync helper functions (`_ensure_owned_chat`,
`_validate_llm_id`, `_validate_dataset_ids`, etc.) to async - no
duplicate sync/async pairs
- Wrap `CanvasReplicaService` Redis IO calls (`bootstrap`,
`replace_for_set`, `commit_after_run`)
- Use `asyncio.gather()` for concurrent file uploads and chat response
building
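The pattern in the list above can be sketched as follows. This is a minimal illustration, not RAGFlow's code: `blocking_query` is a hypothetical stand-in for a synchronous DB/Redis call, and the standard-library `asyncio.to_thread` stands in for the project's `thread_pool_exec` helper, whose exact signature is not shown in this description.

```python
import asyncio
import time


def blocking_query(key: str) -> str:
    """Hypothetical stand-in for a synchronous DB/Redis call."""
    time.sleep(0.05)
    return f"value-for-{key}"


async def handler_blocking(key: str) -> str:
    # Pre-fix pattern: the sync call runs on the event-loop thread,
    # so no other coroutine can make progress until it returns.
    return blocking_query(key)


async def handler_offloaded(key: str) -> str:
    # Post-fix pattern: the sync call runs in a worker thread and the
    # event loop stays free to serve other requests.
    return await asyncio.to_thread(blocking_query, key)


async def serve_many(handler, keys):
    # gather() schedules all handlers concurrently; only the offloaded
    # variant actually overlaps the blocking work.
    return await asyncio.gather(*(handler(k) for k in keys))
```

Both handlers return the same values; the difference is only in whether the event loop is free while the call is in progress.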
**Note:** This fixes the code-level event-loop blocking, which is a
prerequisite for handling concurrent requests. For the full "30
concurrent requests without 502/504" goal described in the issue, users
should also tune deployment config:
- `WS=4` or higher (HTTP worker processes, default 1)
- `MAX_CONCURRENT_CHATS=50` (default 10)
- `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` for workflow-heavy workloads
### Performance verification
A reviewer asked for a before-vs-after comparison
([comment](https://github.com/infiniflow/ragflow/pull/13941#issuecomment-4393667231)).
I built a self-contained microbenchmark that reproduces the exact
failure mode this PR targets: an async handler that performs blocking
DB/Redis-style calls (50 ms each, 3 per request, 30 concurrent requests)
is run twice — once with the pre-PR pattern (sync call directly inside
the async handler) and once with the post-PR pattern (`await
thread_pool_exec(...)`). The benchmark imports nothing from RAGFlow
except `thread_pool_exec` itself, so it is hermetic and reproducible
(`THREAD_POOL_MAX_WORKERS=128`, Python 3.13.12).
**Throughput — wall-clock for 30 concurrent requests (lower is better)**
| flavour | wall(s) | p50(s) | p95(s) | max(s) |
|---|---:|---:|---:|---:|
| before | 4.986 | 0.158 | 0.207 | 0.269 |
| after | 0.248 | 0.181 | 0.230 | 0.231 |
The pre-PR handler serializes the entire load on the event-loop thread,
so 30 × 3 × 50 ms ≈ 4.5 s shows up as the wall time. The post-PR handler
parallelizes the blocking work across the thread pool and finishes the
same load in 248 ms — a **~20× speedup** on this workload.
**Event-loop responsiveness — latency of an unrelated probe coroutine
while the 30 slow requests are running (lower is better)**
| flavour | samples | probe p50 (ms) | probe p95 (ms) | probe max (ms) |
|---|---:|---:|---:|---:|
| before | 1 | 5442.26 | 5442.26 | 5442.26 |
| after | 28 | 0.88 | 11.53 | 98.02 |
This is the metric that maps directly to "the API still answers other
requests while one is busy". A 5 ms-interval probe was scheduled while
the 30 slow handlers ran. With the pre-PR code the event loop was frozen
for the entire duration of the blocking work, so only one probe sample
was ever picked up and it waited **5,442 ms**. After the PR, 28 probe
samples landed with **p50 0.88 ms / p95 11.53 ms**, meaning unrelated
requests are no longer starved by the slow ones. That is the regression
mode behind the cascading 502/504s reported in #13825.
<details>
<summary>Raw benchmark output</summary>
```
config: 30 concurrent requests, 3 blocking calls of 50ms each per request, THREAD_POOL_MAX_WORKERS=128
=== Throughput (lower wall is better) ===
flavour wall(s) p50(s) p95(s) max(s)
before 4.986 0.158 0.207 0.269
after 0.248 0.181 0.230 0.231
=== Event-loop responsiveness (lower probe latency is better) ===
flavour samples probe p50(ms) probe p95(ms) probe max(ms)
before 1 5442.26 5442.26 5442.26
after 28 0.88 11.53 98.02
```
</details>
The benchmark script is included as a comment on the PR for
reproducibility.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
Closes [#13825](https://github.com/infiniflow/ragflow/issues/13825)
---------
Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Moved the `if not all([email, new_pwd, new_pwd2])` guard to the top,
before any decryption that could crash on a `None` value
- Removed the redundant `REDIS_CONN.get()` call; one call is sufficient
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Provide embedding index according to the input text
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- Wrap 2 `ThreadPoolExecutor` instances in `file_service.py` with `with`
statement
- Ensures threads are properly shut down after all futures complete
## Problem
`parse_docs()` (line 532) and the file processing method (line 694)
create `ThreadPoolExecutor` instances that are never shut down. In a
long-running server process, this leaks thread resources on every
invocation — threads remain alive consuming memory even after all
submitted work is complete.
## Fix
Replace bare `ThreadPoolExecutor()` with `with ThreadPoolExecutor() as
exe:` context manager, which calls `executor.shutdown(wait=True)` on
exit.
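A minimal sketch of the context-manager form, for illustration only. `parse_all` and the `str.upper` task are hypothetical placeholders, not the functions in `file_service.py`.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_all(paths):
    """Hypothetical stand-in for a function that fans work out to threads."""
    # Leaving the with-block calls exe.shutdown(wait=True), so the worker
    # threads are joined even if one of the tasks raises.
    with ThreadPoolExecutor(max_workers=4) as exe:
        futures = [exe.submit(str.upper, p) for p in paths]
        return [f.result() for f in futures]
```

After the block exits the executor is shut down, so a later `submit()` on it raises `RuntimeError` instead of quietly keeping threads alive.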
## Test plan
- [x] Verified both call sites use `with` statement after fix
- [x] No remaining bare `ThreadPoolExecutor` in `file_service.py`
- [x] `document_service.py:1066` is a module-level executor (different
pattern, not changed in this PR)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
issue: https://github.com/infiniflow/ragflow/issues/14748
change: dataset search rerank id type
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Related issues
Closes #14744
### What problem does this PR solve?
The Memory REST endpoint `POST /api/v1/messages` previously persisted
whatever `user_id` the client sent in the JSON body. Memory rows were
therefore attributed to an arbitrary string, even when the caller
authenticated as a normal workspace user via JWT (browser/session-style
bearer token decoded into an access token). That broke attribution and
audit semantics for shared memories (team visibility): any authorized
writer could spoof another subject id.
The Python SDK already sends an optional `user_id` for integrations
using **API keys** (`APIToken`) to tag an external subject distinct from
the tenant owner user.
### Solution
- Record **`g.auth_via_api_token`** in `_load_user`
(`api/apps/__init__.py`): set `True` only when authentication resolves
via `APIToken`, otherwise `False` after JWT-based login succeeds.
- In **`POST /messages`** (`memory_api.add_message`): if the request was
authenticated with an API key, keep accepting optional `user_id` from
the body (default empty string). For JWT-authenticated users, **always**
set stored `user_id` to **`current_user.id`** and ignore the client
field.
- Guard reads of `g` with **`RuntimeError`** handling so isolated
imports or tests without a Quart application context do not fail when
resolving `user_id`.
- Document on **`RAGFlow.add_message`** that `user_id` is only
meaningful for API-key authentication.
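The attribution rule above can be summarized in a few lines. This is a sketch under assumptions: `resolve_message_user_id` is a hypothetical helper, and the real handler reads the flag from `g.auth_via_api_token` and the id from `current_user.id` rather than taking them as arguments.

```python
def resolve_message_user_id(auth_via_api_token: bool, body: dict, current_user_id: str) -> str:
    """Hypothetical helper mirroring the rule described above."""
    if auth_via_api_token:
        # API-key integrations may tag an external subject; default is "".
        return body.get("user_id", "")
    # JWT-authenticated users are always attributed to themselves;
    # any client-supplied user_id is ignored.
    return current_user_id
```

For example, a JWT caller that sends `{"user_id": "someone-else"}` is still stored under its own id, while an API-key caller keeps the value it sent.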
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Testing
- `python -m py_compile` on modified modules (`api/apps/__init__.py`,
`api/apps/restful_apis/memory_api.py`).
- Recommended: run web/SDK memory message tests (`test_add_message`,
`test_message_routes_unit`) against a full environment with `quart` and
configured services.
### Notes for reviewers
- Behavior change **only** for callers using JWT-style authorization on
`POST /messages`; API-key callers keep prior optional `user_id`
semantics.
Co-authored-by: jony376 <jony376@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
## What problem does this PR solve?
The Dify-compatible `/dify/retrieval` endpoint recently gained stricter
parsing and validation for its request payload, including:
- Normalized `retrieval_setting.top_k` and
`retrieval_setting.score_threshold` types.
- Clear separation between malformed arguments vs missing required
fields.
Previously, there was no unit test explicitly guarding the exact error
code and message contract for these cases.
## What does this PR change?
- **Add guard-style unit test** in `test_dify_retrieval_routes_unit.py`:
  - `test_retrieval_argument_error_messages`:
    - Sends a request with malformed numeric options:
      - `retrieval_setting = {"top_k": "not-int", "score_threshold": "not-float"}`
      - Asserts `code == RetCode.ARGUMENT_ERROR` and message contains
        `"invalid or malformed arguments:"`.
    - Sends a request with required fields missing:
      - Empty payload (`{}`)
      - Asserts `code == RetCode.ARGUMENT_ERROR` and message contains
        `"required arguments are missing:"`.
This test encodes the intended behavior of the Dify retrieval API so
future refactors cannot silently regress error handling.
## Type of change
- [x] Tests (add coverage and guardrails for existing behavior)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
`pending_cell_images` should be scoped by sheet.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
GraphRAG feature - Part 1 - add spacy to extract entity and relation
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
- Wrap the `ThreadPoolExecutor` instances in `FileService.parse_docs`
and `FileService.get_files` with `with ... as exe:` blocks for
deterministic cleanup
- Replace the `concurrent.futures.ThreadPoolExecutor` in
`do_handle_task` with `asyncio.create_task(asyncio.to_thread(build_TOC,
...))`, preserving the existing parallelism with chunk insertion while
leveraging the surrounding async context
- Drop the now-unused `import concurrent` and the
`executor.shutdown(wait=False)` call in the `finally` block
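The `asyncio.create_task(asyncio.to_thread(...))` shape can be sketched like this. All three functions are hypothetical stand-ins written for illustration; only the `create_task` / `to_thread` combination is taken from the description above.

```python
import asyncio
import time


def build_toc(chunks):
    """Hypothetical blocking stand-in for the TOC builder."""
    time.sleep(0.01)
    return [c.upper() for c in chunks]


async def insert_chunks(chunks):
    """Hypothetical stand-in for the async chunk-insertion step."""
    await asyncio.sleep(0.01)
    return len(chunks)


async def handle_task(chunks, toc_extraction=True):
    # Start the TOC build in a worker thread without awaiting it yet,
    # so it overlaps with chunk insertion.
    toc_task = None
    if toc_extraction:
        toc_task = asyncio.create_task(asyncio.to_thread(build_toc, chunks))
    inserted = await insert_chunks(chunks)
    # Collect the TOC result only after insertion has finished.
    toc = await toc_task if toc_task is not None else None
    return inserted, toc
```

Because the task is owned by the event loop, no separate executor object has to be created or shut down in a `finally` block.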
Closes #14622.
No behavioral change, no public API change. Net diff: ~19 insertions /
25 deletions across two files.
## Test plan
- [ ] `uv run ruff check api/db/services/file_service.py
rag/svr/task_executor.py` passes
- [ ] Upload a multi-file batch through the chat/file endpoint and
confirm `FileService.parse_docs` still returns combined parsed text
- [ ] Trigger `FileService.get_files` via the chat reference flow with a
mix of image and non-image files; verify both `raw=True` and `raw=False`
paths return correctly
- [ ] Run a `naive`-parser document task with `toc_extraction: true` and
confirm the TOC chunk is generated and inserted exactly as before
- [ ] Run a `naive`-parser document task with `toc_extraction: false`
and confirm the path with `toc_thread = None` is unaffected
- [ ] Cancel a running task to exercise the `finally` block and confirm
cleanup still works without the executor shutdown call
---------
Co-authored-by: web-dev0521 <jasonpette1783@gmail.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
### What problem does this PR solve?
The OpenRouter `Encode` method silently swallowed malformed responses.
If a `data[]` item from the API was missing a field (`index`,
`embedding`, or unexpected shape), the loop did `continue` instead of
returning an error, leaving `nil` entries in the result slice. Callers
got back partial results with no indication anything went wrong, which
could then crash downstream consumers when they tried to use a `nil`
vector.
There were three concrete gaps:
- No count-mismatch check between `data` length and input texts (only
checked for empty)
- No duplicate-index detection (a duplicate would silently overwrite)
- Parse failures on individual items returned partial slices instead of
erroring
This PR replaces `map[string]interface{}` parsing with a typed
`openrouterEmbeddingResponse` struct and applies the same 3-layer
validation used in the other drivers (count mismatch → out-of-range
index → duplicate index), so any malformed response produces a clear
error instead of corrupted data.
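The three checks can be sketched in Python for readability (the driver itself is Go). `order_embeddings` is a hypothetical name, and `data` is assumed to be the already-decoded `data[]` array of an OpenAI-compatible embeddings response.

```python
def order_embeddings(data, expected_count):
    """Place each item's embedding at its index, or raise on malformed input."""
    # Layer 1: the response must contain exactly one item per input text.
    if len(data) != expected_count:
        raise ValueError(f"expected {expected_count} embeddings, got {len(data)}")
    result = [None] * expected_count
    for item in data:
        index = item.get("index")
        embedding = item.get("embedding")
        if not isinstance(index, int) or not isinstance(embedding, list):
            raise ValueError(f"malformed embedding item: {item!r}")
        # Layer 2: the index must address one of the input texts.
        if not 0 <= index < expected_count:
            raise ValueError(f"embedding index {index} out of range")
        # Layer 3: each slot may be filled only once.
        if result[index] is not None:
            raise ValueError(f"duplicate embedding index {index}")
        result[index] = embedding
    return result
```

Once the count matches and every index is in range and unique, every slot is necessarily filled, so no `None` entry can reach the caller.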
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The LM Studio Go driver shipped with a stub `Encode` method that
returned `no such method`, even though LM Studio is one of the most
common local LLM runners on macOS and Windows and exposes an
OpenAI-compatible embeddings endpoint at `/v1/embeddings`.
LM Studio users routinely load local embedding models such as
`nomic-ai/nomic-embed-text-v1.5`,
`mixedbread-ai/mxbai-embed-large-v1`, or `BAAI/bge-m3`. They run on
the same `/v1` namespace as chat. The existing `ListModels` already
discovers them, but because `Encode` was a stub, a tenant who picked
one of these models in the Go layer could not actually run an embedding
call.
This finishes the local-LLM trio: Ollama Encode (#14664) and vLLM Encode
(#14688) are already in flight, both using the
same OpenAI-compatible `/embeddings` shape.
### What this PR includes
- `conf/models/lmstudio.json`: add `"embedding": "embeddings"`
under `url_suffix` so the driver can build the URL from config.
- `internal/entity/models/lmstudio.go`: replace the `Encode` stub
with a real implementation. Adds a small local response type that
matches the OpenAI-compatible shape.
No factory change. No interface change.
### How the driver works
- Validate the model name. The API key is optional for local LM Studio,
so the Authorization header is only set when both `apiConfig` and
`ApiKey` are non-nil and non-empty, the same pattern the recently
merged CheckConnection PR (#14614) uses.
- Resolve the region with a default fallback. Return a clear "missing
base URL" error when the user has not configured
the local access address yet.
- Use a per-call `context.WithTimeout(30s)` and
`http.NewRequestWithContext`, the same pattern the merged
Aliyun Encode (#14647) and the in-flight Ollama Encode (#14664) and vLLM
Encode (#14688) use.
- Send `{model, input: [texts]}` in one request.
- Parse `data[*].embedding` and copy each slice into a `[][]float64`
indexed by `data[*].index`, so the output
order matches the input order.
- Handle both `float64` and `float32` element types.
- Empty input returns `[][]float64{}` with no HTTP call.
- Length mismatch between input and result, out-of-range index, and any
missing slot all return clear errors instead
of silent zero vectors.
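The request-building steps above can be sketched in Python for illustration (the driver is written in Go). `build_embedding_request` is a hypothetical helper; the `Bearer` scheme and the way the base URL and suffix are joined are assumptions based on the OpenAI-compatible convention, not details confirmed by this description.

```python
def build_embedding_request(base_url, model, texts, api_key=None):
    """Hypothetical helper: returns (url, headers, body), or None for empty input."""
    if not model:
        raise ValueError("model name is required")
    if not texts:
        # Empty input: the caller returns an empty result without any HTTP call.
        return None
    if not base_url:
        raise ValueError("missing base URL")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # The key is optional for local servers; "Bearer" is an assumption.
        headers["Authorization"] = f"Bearer {api_key}"
    url = base_url.rstrip("/") + "/embeddings"
    # All texts go out in a single request as the `input` array.
    return url, headers, {"model": model, "input": list(texts)}
```

A caller would send the returned body as JSON with a per-call timeout and then validate the response item by item.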
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
returns exit 0.
- The full method set on `LmStudioModel` still matches the
`ModelDriver` interface.
- Pattern parity with the merged Aliyun Encode (#14647), the in-flight
Ollama Encode (#14664) and vLLM Encode (#14688), and the existing
SiliconFlow Encode.
Closes #14693
### What problem does this PR solve?
The SiliconFlow `Encode` method sent one HTTP request per text, which is
wasteful and slow when indexing many documents (e.g., 100 docs = 100
round-trips).
SiliconFlow's `/v1/embeddings` is OpenAI-compatible and accepts an array
of strings in `input` (officially documented at
https://docs.siliconflow.cn/en/api-reference/embeddings/create-embeddings,
with a documented max array size of 32). This PR batches the requests up
to that limit, reducing 100 docs to ~4 round-trips, and replaces
`map[string]interface{}` parsing with a typed struct using the same
3-layer validation (count mismatch, out-of-range index, duplicate index)
used in the other drivers.
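The batching arithmetic is easy to check with a short sketch. `batches` is a hypothetical helper written for illustration; only the limit of 32 inputs per request comes from the description above.

```python
def batches(texts, size=32):
    """Yield consecutive slices of at most `size` texts."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]
```

With 100 documents this yields batches of 32, 32, 32, and 4 texts, which is where the figure of about 4 round-trips comes from.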
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
The NVIDIA Go driver in `internal/entity/models/nvidia.go` shipped with
a stub `Encode`
method that returned `no such method`. `conf/models/nvidia.json` already
lists
`nvidia/llama-3.2-nemoretriever-1b-vlm-embed-v1` as an embedding model,
but the conf had
no `embedding` URL suffix, so the picker had nothing wired even if
`Encode` worked.
A tenant who wanted to use NVIDIA NIM for chat (already working) and
embeddings from a
single provider could not, even though the upstream endpoint is public
at
`https://integrate.api.nvidia.com/v1/embeddings` and uses an
OpenAI-compatible request
body extended with the NVIDIA-specific `input_type` and `truncate`
fields. Several other
Go drivers already implement `Encode` (siliconflow, zhipu-ai, aliyun),
so the interface
and the pattern are well-established.
This PR fills the gap.
### What this PR includes
* `conf/models/nvidia.json`: declare the `embedding` URL suffix
alongside the existing
`chat` and `models` entries. The embedding model entry was already
present, so no
model addition is needed.
* `internal/entity/models/nvidia.go`: replace the `Encode` stub with a
real
implementation. Adds a small local response type that matches the
OpenAI-compatible
shape NVIDIA NIM returns.
No factory change. No interface change.
### How the driver works
* Validates `apiConfig` and the API key, validates the model name,
resolves the region
with a default fallback (matching the pattern the merged `ListModels`
and
`CheckConnection` paths in this driver already use), and builds the URL
from
`BaseURL[region] + URLSuffix.Embedding`.
* Sends all input texts in one request as the `input` array, with the
NVIDIA-specific `input_type: "query"`, `encoding_format: "float"`, and
`truncate: "END"`
fields, mirroring the Python `NvidiaEmbed` reference.
* Parses `data[*].embedding` and copies each slice into `[][]float64`
indexed by
`data[*].index` so the output order matches the input order even if the
API returns
items in a different order.
* Handles both `float64` and `float32` element types.
* Empty input returns `[][]float64{}` with no HTTP call.
* Non-200 responses propagate the upstream status line and body.
* A final pass checks every input slot got a vector and returns a clear
error if any
slot is still nil.
* Per-call 30s context deadline so a slow call cannot block forever.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
* `go build ./internal/entity/models/...` returns exit 0.
* `go vet ./internal/entity/models/...` is clean.
* `gofmt -l internal/entity/models/nvidia.go` is clean.
* The full method set on `NvidiaModel` still matches the `ModelDriver`
interface.
* Pattern parity with the just-merged Aliyun `Encode` (#14647).
Closes #14699
### What problem does this PR solve?
The Ollama Go driver shipped with a stub `Encode` method that returned
`no such method`, even though Ollama is one of the most common local
LLM runners and exposes an OpenAI-compatible embeddings endpoint at
`/v1/embeddings`.
Ollama users routinely run local embedding models such as
`nomic-embed-text`, `mxbai-embed-large`, or `bge-m3`.
These are pulled with `ollama pull <model>` and served on the same `/v1`
namespace as chat. The existing `ListModels` already
discovers them, but because `Encode` was a stub, a tenant who picked
one of these models in the Go layer could not
actually run an embedding call.
### What this PR includes
- `conf/models/ollama.json`: add `"embedding": "embeddings"`
under `url_suffix` so the
driver can build the URL from config.
- `internal/entity/models/ollama.go`: replace the `Encode` stub with
a real implementation. Adds a small local response
type that matches the OpenAI-compatible shape.
No factory change. No interface change.
### How the driver works
- Validate the model name. The API key is optional for local Ollama, so
the Authorization header is only set when both
`apiConfig` and `ApiKey` are non-nil and non-empty, the same pattern
the recently merged CheckConnection PR (#14614) uses.
- Resolve the region with a default fallback. Return a clear "missing
base URL" error when the user has not configured
the local access address yet.
- Use a per-call `context.WithTimeout(30s)` and
`http.NewRequestWithContext`, the same pattern the merged
Aliyun Encode (#14647) uses.
- Send `{model, input: [texts]}` in one request.
- Parse `data[*].embedding` and copy each slice into a `[][]float64`
indexed by `data[*].index`, so the output
order matches the input order.
- Handle both `float64` and `float32` element types.
- Empty input returns `[][]float64{}` with no HTTP call.
- Length mismatch between input and result, out-of-range index, and any
missing slot all return clear errors instead
of silent zero vectors.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
returns exit 0.
- The full method set on `OllamaModel` still matches the
`ModelDriver` interface.
- Pattern parity with the merged Aliyun Encode (#14647) and the existing
SiliconFlow Encode.
Closes #14662
## Summary
This PR fixes the `message_fit_in()` truncation bug reported in #13607.
Changes:
- fix the user-message truncation branch to reserve room for the system
prompt token budget
- guard the zero-token edge case to avoid dividing by zero in the
truncation ratio check
- add focused regression tests covering both the user-dominant
truncation path and the zero-token boundary case
## Validation
```bash
pytest -q --noconftest test/unit_test/rag/prompts/test_generator_message_fit_in.py
```
Result: `2 passed`
Closes #13607
## Summary
This PR fully addresses all CodeRabbit review feedback and enhances the
robustness of the reranking module with 100% backward compatibility.
## Key Fixes
1. Fixed JinaRerank hardcoded base_url to support subclass endpoint
overrides
2. Corrected GPUStackRerank exception handling to use proper requests
exceptions and preserve stack traces
3. Added 30s timeout to all API calls to prevent service hanging
4. Added empty input validation for all rerank providers
5. Replaced direct dict key access with .get() to eliminate KeyError
crashes
6. Fixed _normalize_rank edge case for empty arrays
7. Implemented missing functionality for Ai302Rerank
8. Standardized type hints and fixed typo issues
## Compatibility
- No breaking changes to any existing functionality
- All rerank providers work as originally intended
- Fully compatible with existing configurations and workflows
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
The vLLM Go driver shipped with a stub `Encode` method that returned
`not implemented`, even though vLLM is one of the most common
production-grade self-hosted inference servers and exposes an
OpenAI-compatible embeddings endpoint at `/v1/embeddings`.
Users who self-host `BAAI/bge-m3`, `Qwen3-Embedding-*`,
`NV-Embed-v2`, or similar models on vLLM could not run an embedding
call through the Go layer. The existing `ListModels` already discovers
the loaded models, but the embedding path failed because `Encode` was
a stub.
### What this PR includes
- `conf/models/vllm.json`: add `"embedding": "embeddings"` under
`url_suffix` so the driver can build the URL from config.
- `internal/entity/models/vllm.go`: replace the `Encode` stub with a
real implementation. Adds a small local response
type that matches the OpenAI-compatible shape.
No factory change. No interface change.
### How the driver works
- Validate the model name. The API key is optional for self-hosted vLLM,
so the Authorization header is only set when both `apiConfig` and
`ApiKey` are non-nil and non-empty, the same pattern the recently
merged CheckConnection PR (#14614) uses.
- Resolve the region with a default fallback. Return a clear "missing
base URL" error when the user has not configured
the local access address yet.
- Use a per-call `context.WithTimeout(30s)` and
`http.NewRequestWithContext`, the same pattern the merged
Aliyun Encode (#14647) and in-flight Ollama Encode (#14664) use.
- Send `{model, input: [texts]}` in one request.
- Parse `data[*].embedding` and copy each slice into a `[][]float64`
indexed by `data[*].index`, so the output
order matches the input order.
- Handle both `float64` and `float32` element types.
- Empty input returns `[][]float64{}` with no HTTP call.
- Length mismatch between input and result, out-of-range index, and any
missing slot all return clear errors instead
of silent zero vectors.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
returns exit 0.
- The full method set on `VllmModel` still matches the `ModelDriver`
interface.
- Pattern parity with the merged Aliyun Encode (#14647), the in-flight
Ollama Encode (#14664), and the existing
SiliconFlow Encode.
Closes #14687
## What
Widen the keyword delimiter in `rag/svr/task_executor.py`:
both `build_chunks` (LLM `keyword_extraction` cache parsing) and
`run_dataflow` (chunk-level `keywords` ingestion) now split on
`,` `，` `;` `；` `、` `\r` `\n` instead of only ASCII comma.
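A minimal sketch of such a splitter, assuming the widened delimiter set is the ASCII and full-width comma and semicolon, the ideographic comma `、`, and CR/LF. The regex and the `split_keywords` name are illustrative, not the code in `task_executor.py`.

```python
import re

# ASCII and full-width comma/semicolon, the ideographic comma, CR and LF.
KEYWORD_DELIMITERS = re.compile(r"[,，;；、\r\n]+")


def split_keywords(raw: str) -> list[str]:
    """Split an LLM keyword string and drop empty or whitespace-only parts."""
    return [kw.strip() for kw in KEYWORD_DELIMITERS.split(raw) if kw.strip()]
```

ASCII-comma-only output still splits exactly as before, while a string joined with full-width commas now yields one keyword per phrase.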
## Why
`rag/prompts/keyword_prompt.md` instructs the LLM:
> The keywords are delimited by ENGLISH COMMA.
In practice, Chinese-leaning models (Qwen / Tongyi-Qianwen, GLM,
etc.) frequently ignore this instruction when the source content is
Chinese and emit Chinese commas (`，`) instead. Result:
`cached.split(",")` sees the full LLM output as a *single* keyword.
Repro: `auto_keywords>=4` + Chinese docs + `qwen-plus@Tongyi-Qianwen`.
We observed entries in `important_kwd` like
`"功能介绍，配置说明，参数详解，问题排查"`: one bucket instead of four.
## Impact
- Silent data-quality bug; no exception thrown.
- BM25 `important_kwd^30` boost effectively stops firing — the
indexed term is the whole list, never matches user query tokens.
- Any downstream aggregating `important_kwd` (tagging, analytics,
candidate-keyword review UIs) sees garbage.
## Compatibility
- Pure widening of the splitter; ASCII-comma-only outputs continue
to work identically.
- No schema / API change.
## Test plan
Manually verified against `qwen-plus@Tongyi-Qianwen` with
`auto_keywords=10` on Chinese .txt files:
- Before: `important_kwd` contains one element per chunk that is the
full LLM string with `，`-separated phrases inside.
- After: `important_kwd` contains N elements, one per phrase, as the
LLM intended.
### What problem does this PR solve?
The Gitee AI Go driver in `internal/entity/models/gitee.go` shipped with
a stub `Encode` method that returned `gitee, no such method`, even
though `conf/models/gitee.json` already wires the `embedding` URL
suffix. The conf also listed no embedding models, so the picker had
nothing to select.
This blocked any tenant who wanted to use Gitee AI for chat, rerank
(already working, see #14656), and embeddings from a single provider.
This PR fills the gap, mirroring the just-merged Aliyun `Encode`
(#14647):
- `internal/entity/models/gitee.go`: replace the `Encode` stub with a
real implementation.
Validates inputs, resolves the region with a default fallback, POSTs the
standard OpenAI-compatible `{"model", "input": [...]}` body to
`BaseURL[region] + URLSuffix.Embedding`, parses `data[*].embedding`
indexed by `data[*].index` so output order matches input order, handles
both `float64` and `float32` element types, and uses a 30s per-call
context deadline matching the merged `Rerank`.
- `conf/models/gitee.json`: add `BAAI/bge-m3` so the embedding picker
has something to select.
No factory change. No interface change. No URL suffix change.
Verified with `go build`, `go vet`, and `gofmt -l`: all clean.
Closes #14697
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
`retrieval_by_children()` in `rag/nlp/search.py` crashes with a
`TypeError: 'NoneType' object is not subscriptable` when a parent
("mom") chunk referenced by child chunks is missing from the index.
This happens when the index is in an inconsistent state — for example
after a partial re-index, a document deletion that didn't clean up all
children, or a race condition during ingestion. `dataStore.get()`
returns `None` for the missing parent, and the subsequent access to
`chunk["content_with_weight"]` raises a `TypeError`.
**Stack trace:**
```
TypeError: 'NoneType' object is not subscriptable
File "rag/nlp/search.py", line 792, in retrieval_by_children
"content_with_weight": chunk["content_with_weight"],
```
### Type of change
- [x] Bug Fix
### Fix
When `dataStore.get()` returns `None` for a parent chunk, fall back to
using the child chunks directly and continue processing the remaining
parents. This preserves retrieval results for all other chunks rather
than aborting the entire query with an exception.
```python
chunk = self.dataStore.get(id, idx_nms[0], [ck["kb_id"] for ck in cks])
if chunk is None:
    # Parent chunk is missing from the index: keep the children as-is.
    chunks.extend(cks)
    continue
```
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
Two bypass vectors in the sandbox code security analyzer allowed
malicious code to pass the safety check undetected and reach the Docker
executor.
### 1. JavaScript: template-literal bypass of `require()` block
The `SecureJavaScriptAnalyzer` regex patterns used `['"]` to match
module names, covering only single and double quotes. An attacker could
use ES6 template literals to bypass all three `require` checks:
```javascript
const cp = require(`child_process`);
async function main() {
  return cp.execSync('cat /etc/passwd').toString();
}
```
The same bypass applied to `fs` and `worker_threads`.
**Fix:** Updated all three `require` patterns from `['"]` to ``['"`]`` to
also match backtick template literals.
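The effect of widening the quote class can be shown with a small Python `re` sketch. The pattern below is a hypothetical one in the spirit of the fix, not the analyzer's actual regex.

```python
import re

# Hypothetical pattern: the quote class accepts single quotes,
# double quotes, and backticks around the module name.
REQUIRE_CHILD_PROCESS = re.compile(
    r"""require\s*\(\s*['"`]child_process['"`]\s*\)"""
)


def uses_child_process(source: str) -> bool:
    """Return True if the source requires child_process with any quote style."""
    return REQUIRE_CHILD_PROCESS.search(source) is not None
```

With the narrower `['"]` class, the backtick form would not match and the `require` would pass the check.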
### 2. Python: `builtins` not blocked + attribute-call blind spot in
`visit_Call`
`visit_Call` only checked `ast.Name` nodes, so attribute-style calls
like `module.func()` were invisible to the analyzer. Additionally,
`builtins` was absent from `DANGEROUS_IMPORTS`. Combined, this allowed:
```python
import builtins
def main():
    builtins.exec('import os; os.system("id")')
```
Neither the import nor the exec call triggered any flag.
**Fix:** Added `builtins` to `DANGEROUS_IMPORTS` and added an
`ast.Attribute` branch to `visit_Call` so that `module.dangerous_func()`
style calls are caught alongside bare `dangerous_func()` calls.
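A minimal sketch of a visitor with both branches, for illustration. The class name, the two sets, and the violation strings are hypothetical and much smaller than the real analyzer's lists.

```python
import ast

DANGEROUS_IMPORTS = {"os", "subprocess", "builtins"}  # illustrative subset
DANGEROUS_CALLS = {"eval", "exec"}  # illustrative subset


class CallChecker(ast.NodeVisitor):
    def __init__(self):
        self.violations = []

    def visit_Import(self, node):
        for alias in node.names:
            if alias.name.split(".")[0] in DANGEROUS_IMPORTS:
                self.violations.append(f"import {alias.name}")
        self.generic_visit(node)

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Name) and func.id in DANGEROUS_CALLS:
            # Bare call such as exec(...)
            self.violations.append(f"call {func.id}")
        elif isinstance(func, ast.Attribute) and func.attr in DANGEROUS_CALLS:
            # Attribute-style call such as builtins.exec(...)
            self.violations.append(f"call .{func.attr}")
        self.generic_visit(node)


def find_violations(source: str):
    checker = CallChecker()
    checker.visit(ast.parse(source))
    return checker.violations
```

Run against the `builtins.exec` example above, this sketch flags both the import and the attribute call.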
## Tests
Added four regression tests covering each new bypass vector:
- `test_javascript_child_process_template_literal_is_rejected`
- `test_javascript_fs_template_literal_is_rejected`
- `test_python_builtins_import_is_rejected`
- `test_python_attribute_eval_call_is_rejected`
---------
Co-authored-by: bounty-hunter <bounty@hunter.local>
### What problem does this PR solve?
Two bugs in the Aliyun Go driver:
1. **`Name()` returns `"siliconflow"`** — a copy-paste bug from when the
driver was created. `Name()` is used in error messages and log output,
so every Aliyun error incorrectly attributed itself to SiliconFlow.
2. **Silent empty URL for unknown regions in `ChatWithMessages`,
`ChatStreamlyWithSender`, and `ListModels`** — all three methods
construct the request URL as `z.BaseURL[region]` without checking
whether the key exists. For an unrecognised region this returns `""`,
producing a malformed URL like `"/chat/completions"` that the HTTP
transport rejects with a confusing error. `Encode` and `Rerank` (already
merged) correctly fall back to `"default"` and return a clear error.
This PR applies the same pattern to the remaining three methods.
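The lookup-with-fallback pattern can be sketched as follows. This is a Python illustration of the idea only, not the Go driver's code; `resolve_base_url` and the error text are hypothetical.

```python
def resolve_base_url(base_urls, region):
    """Return the base URL for a region, falling back to the 'default' entry.

    Raises ValueError instead of silently returning an empty string, so the
    caller never builds a URL such as "/chat/completions".
    """
    url = base_urls.get(region) or base_urls.get("default")
    if not url:
        raise ValueError(f"aliyun: no base URL configured for region {region!r}")
    return url
```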
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Closes #14703
`GoogleModel.CheckConnection` currently returns a hardcoded `no such
method` error even though the Google Go driver already supports
`ListModels`. This makes provider connection checks fail regardless of
whether the configured API key can list Google models.
This PR makes `CheckConnection` call `ListModels`, adds a small API-key
guard for nil, empty, and whitespace-only keys, and keeps `ListModels`
useful by following paginated Google model responses.
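The delegation and pagination logic can be sketched in Python as below. This is an illustration of the control flow only: `fetch_page` is a hypothetical adapter returning `(names, next_page_token)`, and it does not reproduce the Google GenAI SDK's paging API.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical adapter: takes a page token (None for the first page) and
# returns (model_names, next_page_token).
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def collect_model_names(fetch_page: FetchPage) -> List[str]:
    """Follow next-page tokens until the listing is exhausted."""
    names: List[str] = []
    token: Optional[str] = None
    while True:
        page_names, token = fetch_page(token)  # errors propagate to the caller
        names.extend(page_names)
        if not token:  # no next page
            return names


def check_connection(api_key: Optional[str], list_models: Callable[[], List[str]]) -> None:
    """Reject nil, empty, and whitespace-only keys, then delegate to list_models."""
    if api_key is None or not api_key.strip():
        raise ValueError("google: API key is required")
    list_models()
```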
### What stays unchanged
* Google model listing still uses the Google GenAI SDK with
`genai.BackendGeminiAPI`.
* Model names still come from `models.Items[*].Name`.
* `Balance`, `Encode`, chat, streaming, provider config, and factory
wiring are unchanged.
### Tests and validation
Added focused unit coverage for:
* `CheckConnection` delegating to `ListModels` and returning its error
* nil, missing, empty, and whitespace-only API key validation
* model-name passthrough from the list-models adapter
* paginated model listing, empty-result preservation, and next-page
error propagation
Validated current PR head `17ceef43515ba8c46c254dd349b9085bf26dcbea`
locally with Go 1.25.0:
* `go test ./internal/entity/models -run
'TestGoogleModel|TestCollectGoogleModelNames' -count=1 -v` - PASS
* `go test ./internal/entity/models -count=1` - PASS
* `go test -race ./internal/entity/models -count=1` - PASS
* `gofmt -w internal/entity/models/google.go
internal/entity/models/google_test.go` - PASS, no diff
* `git diff --check` - PASS
### Type of change
* [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Implements the `Encode` method in the Google Gemini driver, which was
previously a stub returning `not implemented`
- Uses the `google.golang.org/genai` SDK's `EmbedContent` API, which
routes to the `batchEmbedContents` endpoint internally — all texts are
sent in a single request
- Adds `text-embedding-004` (max 2048 tokens) to
`conf/models/google.json`
- Response values are `[]float32` from the SDK and are cast to
`[]float64` to satisfy the `ModelDriver` interface
## Files changed
- `internal/entity/models/google.go` — full `Encode` implementation
- `conf/models/google.json` — adds `text-embedding-004` embedding model
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Multiple `requests.post()` calls across the LLM integration layer lack a
`timeout` parameter. Without a timeout, a single unresponsive upstream
service can block the calling thread **indefinitely**, eventually
exhausting the thread pool and degrading the entire system.
This is a well-known issue — Python's `requests` library defaults to
`timeout=None` (infinite wait), and [the library docs explicitly
recommend](https://requests.readthedocs.io/en/latest/user/advanced/#timeouts)
always setting a timeout.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Change
Added `timeout` to all `requests.post()` calls missing it:
| File | Calls fixed | Timeout |
|------|-------------|---------|
| `rag/llm/rerank_model.py` | 9 | 30s |
| `rag/llm/embedding_model.py` | 8 | 30s |
| `rag/llm/cv_model.py` | 3 | 60s |
| `rag/llm/tts_model.py` | 2 | 60s |
| `rag/llm/sequence2txt_model.py` | 2 | 60s |
Embedding/rerank calls use 30s (lightweight API calls). Vision, TTS, and
audio transcription use 60s (heavier workloads with file uploads).
Note: other files in the codebase (e.g. `check_minio_alive`,
`check_ragflow_server_alive`) already use `timeout=10`, so this PR
brings the LLM layer in line with existing practice.
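A minimal sketch of the pattern, assuming a thin wrapper that forces every call site to pass a timeout. The helper `post_json` and the constant names are hypothetical; the PR itself adds `timeout=` directly to each `requests.post()` call.

```python
from typing import Any, Callable

# Values from the table above; the constant names are illustrative.
LIGHT_TIMEOUT_S = 30   # embedding / rerank
HEAVY_TIMEOUT_S = 60   # vision, TTS, audio transcription


def post_json(post: Callable[..., Any], url: str, payload: dict,
              timeout: float = LIGHT_TIMEOUT_S) -> Any:
    """Call `post` (for example requests.post) with an explicit timeout."""
    return post(url, json=payload, timeout=timeout)
```

With the real library this would be called as `post_json(requests.post, url, payload, timeout=HEAVY_TIMEOUT_S)`; when the deadline passes, `requests` raises a `requests.exceptions.Timeout` subclass instead of blocking the thread.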
Signed-off-by: Ricardo-M-L <Sibyl_Hartmanbnb@webname.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Resolves #14447. *(Note: This supersedes stalled PR #14448 and
implements the requested CodeRabbitAI fixes).*
Currently, the Dockerfiles inside `agent/sandbox/sandbox_base_image`
(both Python and Node.js) have hardcoded Chinese package mirrors. This
forces the mirrors on all users globally, which causes build network
timeouts for contributors outside of China.
This PR introduces an enhancement to fix the issue by:
1. Implementing the `NEED_MIRROR` build argument in the sandbox
Dockerfiles.
2. Replacing static `ENV` instructions with conditional shell logic
inside `RUN` blocks to dynamically set the package registries.
3. Allowing the build to cleanly fall back to default global registries
(`pypi.org` and `npmjs.org`) when `--build-arg NEED_MIRROR=0` is passed.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The VolcEngine Go driver in `internal/entity/models/volcengine.go`
shipped with a
`ListModels` stub that returned `volcengine, no such method`.
`conf/models/volcengine.json`
also did not declare a `models` URL suffix, so the model picker had
nothing to call even
if the method body were filled in.
A tenant who configured Volcengine (Doubao / Ark) as a provider could
not see the list of
available endpoints from the RAGFlow UI. Several other Go drivers
already implement
`ListModels` against the OpenAI-compatible `/models` endpoint (deepseek,
gitee, nvidia,
openai, siliconflow), so the interface and pattern are well-established.
This PR fills the gap.
### What this PR includes
* `conf/models/volcengine.json`: declare the `models` URL suffix
alongside the existing
`chat`, `files`, and `embedding` entries. The Ark v3 API exposes
`https://ark.cn-beijing.volces.com/api/v3/models`, so the suffix is just
`models`.
* `internal/entity/models/volcengine.go`: replace the `ListModels` stub
with a real
implementation. Reuses the package-level `DSModelList` / `DSModel` types
that
DeepSeek, Gitee, and SiliconFlow already use to parse the
OpenAI-compatible models
response shape.
No factory change. No interface change.
### How the driver works
* Resolves the region with a default fallback, the same way the other
VolcEngine methods
in this driver already do.
* Builds the URL from `BaseURL[region] + URLSuffix.Models`, with
`strings.TrimSuffix` on
the base to keep the join robust.
* Issues a `GET` with optional `Authorization: Bearer <api_key>` (the
header is omitted
when no key is configured, mirroring the existing NVIDIA `ListModels`).
* Reads the response body once, surfaces a non-200 with the upstream
status line plus
body, and parses the JSON via the shared `DSModelList` type.
* Returns the model id list in input order. When the response includes
an `owned_by`
field, the entry is rendered as `id@owned_by`, matching the convention
used by the
other Go drivers.
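The URL join and the `id@owned_by` rendering described above can be sketched in Python. This illustrates the logic rather than the Go driver itself; `build_models_url` and `format_model_ids` are hypothetical names, and the response shape is assumed to be the OpenAI-compatible `{"data": [{"id": ..., "owned_by": ...}]}` list.

```python
import json


def build_models_url(base_url, suffix):
    """Join base URL and suffix with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + suffix.lstrip("/")


def format_model_ids(body):
    """Parse an OpenAI-compatible models response, preserving input order."""
    payload = json.loads(body)
    names = []
    for model in payload.get("data", []):
        owner = model.get("owned_by")
        # Render as id@owned_by only when the owner field is present.
        names.append(f"{model['id']}@{owner}" if owner else model["id"])
    return names
```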
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
* `go build ./internal/entity/models/...` returns exit 0.
* `go vet ./internal/entity/models/...` is clean.
* `gofmt -l internal/entity/models/volcengine.go` is clean.
* The full method set on `VolcEngine` still matches the `ModelDriver`
interface.
* Endpoint reachability check: `GET
https://ark.cn-beijing.volces.com/api/v3/models`
returns `401 Unauthorized` without an API key, confirming the path
exists and accepts
Bearer authentication.
* Pattern parity with DeepSeek, Gitee, NVIDIA, and SiliconFlow
`ListModels`.
Fixes #14701
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- `CvModel["Bedrock"]` was absent from `rag/llm/cv_model.py`, causing
`model_instance()` to return `None` when a Bedrock model was used as a
PDF parser — even after correct model resolution.
- This PR adds `BedrockCV`, enabling Bedrock vision models (e.g.
`amazon.nova-pro-v1:0`, `anthropic.claude-3-5-sonnet`) to be used as PDF
parsers.
## What problem does this PR solve?
When a Bedrock model is selected as the PDF parser in a knowledge base,
ingestion failed with:
```
'LiteLLMBase' object has no attribute 'describe_with_prompt'
```
The root cause: `LiteLLMBase` (the Bedrock chat implementation) was the
only registered handler for the Bedrock factory. It does not implement
`describe_with_prompt`. `CvModel` had no Bedrock entry, so
`model_instance()` returned `None` for `image2text` requests.
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Changes
**`rag/llm/cv_model.py`**
Adds `BedrockCV(Base)` with `_FACTORY_NAME = "Bedrock"`:
- Uses `litellm.completion` with the `bedrock/` prefix (consistent with
`LiteLLMBase`)
- Parses AWS credentials from the JSON key assembled by `add_llm`
(`auth_mode`, `bedrock_ak`, `bedrock_sk`, `bedrock_region`,
`aws_role_arn`)
- Supports three auth modes: `access_key_secret`, `iam_role` (via STS
`assume_role`), and default credential chain (IRSA, instance profile)
- Implements `describe_with_prompt` and `describe`
## Test plan
- [ ] Configure a Bedrock vision model (e.g. `amazon.nova-pro-v1:0`)
with valid AWS credentials
- [ ] Select it as PDF parser in a knowledge base
- [ ] Verify ingestion of a PDF document completes without errors
- [ ] Verify `CvModel["Bedrock"]` resolves to `BedrockCV`
🤖 Generated with [Claude Code](https://claude.ai/claude-code)
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
The table file parser (CSV/Excel) currently treats all columns
identically — every column is both vectorized (embedded in chunk text)
and stored as filterable metadata. There's no way for users to control
which columns should be searchable by semantic meaning versus which
should only be filterable attributes.
For example, when ingesting a news articles CSV with columns like `title`,
`content`, `country`, `category`, `source`, etc., the embedding includes
metadata fields like `country: Brazil` and `source: Reuters` in the chunk
text, which dilutes the semantic quality of the embedding without adding
retrieval value.
The RDBMS connector (MySQL/PostgreSQL) already supports `content_columns`
/ `metadata_columns`, but this capability was missing for file-based table
ingestion.
This PR adds column-level control (vectorize / metadata / both) for the
table file parser, following RAGFlow's existing patterns.
Backward compatible: Datasets without `table_column_roles` or with
`table_column_mode: auto` behave exactly as before (all columns = both).
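The three roles can be illustrated with a short Python sketch. The function `split_row`, the shape of the `roles` mapping (column name to role string), and the `"; "` separator are assumptions for illustration; the parser's actual configuration format is not shown in this description.

```python
def split_row(row, roles=None):
    """Split one table row into (chunk_text, metadata) by column role.

    Columns without an explicit role default to "both", which mirrors the
    backward-compatible behavior described above.
    """
    roles = roles or {}
    text_parts, metadata = [], {}
    for column, value in row.items():
        role = roles.get(column, "both")
        if role in ("vectorize", "both"):
            text_parts.append(f"{column}: {value}")
        if role in ("metadata", "both"):
            metadata[column] = value
    return "; ".join(text_parts), metadata
```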
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
`HuggingfaceRerank.post()` unconditionally prepends `http://` to `base_url`,
which already contains a protocol. This creates invalid URLs like
`http://http://127.0.0.1:8080/rerank`, breaking all requests. The fix
normalizes URL handling to match the rest of the codebase, removing the
redundant `http://`.
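One way to normalize the URL is sketched below. This is an assumption about the shape of such a fix, not the code in the PR; `normalize_base_url` and `rerank_endpoint` are hypothetical helpers.

```python
def normalize_base_url(base_url, default_scheme="http"):
    """Add a scheme only when the URL does not already carry one."""
    url = base_url.strip().rstrip("/")
    if "://" not in url:
        url = f"{default_scheme}://{url}"
    return url


def rerank_endpoint(base_url):
    """Build the rerank URL without doubling the scheme."""
    return normalize_base_url(base_url) + "/rerank"
```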
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Related Issues
- #7318
- #7796
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
## Summary
- Tool-type components (Email, Invoke, etc.) fail to resolve template
strings that mix variable references with literal text in their
parameters.
- This adds template string resolution to `get_input()` in
`ComponentBase`, reusing existing `get_input_elements_from_text()` and
`string_format()` methods.
## Problem
`get_input()` in `ComponentBase` handles two cases:
1. **Pure reference** (`{Component:ID@field}`) — resolved via
`is_reff()` + `get_variable_value()`
2. **Literal value** — passed through as-is
But template strings like `{UserFillUp:X@name}@duke.edu` or `Question
from {Agent:Y@topic}` fall through to the literal branch because
`is_reff()` returns `False` (it expects the entire string to be a single
reference). The unresolved template is passed directly to the tool.
This affects **all** tool components (Email, Invoke, etc.) that need
mixed reference + text parameters — for example, constructing email
addresses or subjects dynamically.
## Fix
```python
# In get_input(), between is_reff check and literal fallback:
elif isinstance(v, str) and re.search(self.variable_ref_patt, v):
    elements = self.get_input_elements_from_text(v)
    kv = {k: e.get('value', '') for k, e in elements.items()}
    self.set_input_value(var, self.string_format(v, kv))
```
This reuses `get_input_elements_from_text()` and `string_format()` which
are already used by `Message` components for the same purpose. The fix
only activates when the string contains at least one variable reference
pattern but is not a pure reference.
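The behavior can be reproduced outside `ComponentBase` with a standalone sketch. The regex `VARIABLE_REF` and the function `resolve_template` are hypothetical stand-ins for `variable_ref_patt` and the `get_input_elements_from_text()` / `string_format()` pair, whose exact implementations are not shown here.

```python
import re

# Hypothetical reference pattern: {Component:ID@field}
VARIABLE_REF = re.compile(r"\{([A-Za-z0-9_:@.\-]+)\}")


def resolve_template(text, lookup):
    """Replace each {reference} with its value; unknown references become ''."""
    def substitute(match):
        return str(lookup.get(match.group(1), ""))
    return VARIABLE_REF.sub(substitute, text)
```

Literal text around the references is left untouched, which is what allows an address such as `{UserFillUp:X@name}@duke.edu` to keep its suffix.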
## Test plan
- [x] Pure references (`{Component:ID@field}`) still resolve correctly
via `is_reff()` path
- [x] Literal values without references pass through unchanged
- [x] Template strings like `{ref}@duke.edu` resolve the reference and
keep the literal suffix
- [x] Template strings like `Question from {ref}` resolve correctly
- [x] Multiple references in one string (`{ref1} and {ref2}`) both
resolve
- [x] Message components unaffected (they use their own template
resolution in `_run`)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: wanghualoong <wanghualoong@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
Closes #13663.
OAuth / OIDC callbacks call `login_user(user)` which writes `_user_id`
into the session cookie, but `_load_user()` in `api/apps/__init__.py`
only ever looked at the `Authorization` header. The SPA's response
interceptor wipes the Authorization value from `localStorage` on the
first 401 it sees — meaning that during the post-redirect window after
an OAuth login, a single transient 401 sends every subsequent request
back to the login page even though `login_user()` had already
established a perfectly good server-side session.
The reporter's analysis traces this all the way through the redirect →
`navigate('/')` → first request → empty header → 401 → `removeAll()` →
infinite-redirect-to-login chain.
## What changed
- New `_load_user_from_session()` helper that reads
`session["_user_id"]`, looks up the user in `UserService` (with the same
`StatusEnum.VALID` and `access_token` checks already used elsewhere),
and assigns `g.user`.
- Every `return None` path in `_load_user()` now routes through that
helper before giving up:
  - missing `Authorization` header
  - malformed `bearer ` prefix
  - empty / too-short JWT payload
  - JWT signature failure
  - JWT-resolved user not found / has no `access_token`
  - `APIToken.query()` fallback exhausted
The JWT and API-token paths still take precedence — the session is only
consulted when those can't authenticate the request. So existing
local-login and SDK callers see no behaviour change; only OAuth / OIDC
users that hit the original race now stay logged in.
The Bearer-prefix issue called out in #13663 (lines 103-110) is already
handled in the current code, so this PR only addresses the second half
of the report.
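The precedence described above (token first, session only as a fallback) can be sketched with plain dictionaries. This is a simplified illustration: it skips JWT decoding and status checks entirely, and `load_user`, `users_by_token`, and `users_by_id` are hypothetical names standing in for the real lookups.

```python
def load_user(authorization, session, users_by_token, users_by_id):
    """Resolve a user from the Authorization header, else from the session."""
    user = None
    if authorization:
        # Accept both "Bearer <token>" and a bare token.
        token = authorization.split(" ", 1)[-1].strip()
        user = users_by_token.get(token)
    if user is None:
        # Header missing or unusable: fall back to the session cookie
        # written by login_user() during the OAuth / OIDC callback.
        user_id = session.get("_user_id")
        user = users_by_id.get(user_id) if user_id else None
    return user
```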
## Test plan
- [ ] Configure OIDC under `oauth` in `service_conf.yaml`
- [ ] Click the OIDC login button, complete auth at the IdP
- [ ] Confirm that navigating between pages no longer bounces back to
`/login`
- [ ] Confirm local email/password login still issues + accepts JWTs
- [ ] Confirm SDK/API key callers still authenticate via `Authorization:
Bearer <api-token>`
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
`Radio.Group` in `web/src/components/ui/radio.tsx` injects the parent's
`disabled` prop into each child via `React.cloneElement` with
`as React.ReactElement` and no validation.
This throws at runtime when a consumer passes strings, numbers, `null`,
`false`, or other non-element nodes, while the cast hides the unsafe
access from TypeScript.
Use `React.isValidElement<RadioProps>(child)` as a type guard before
calling `cloneElement`. Non-element children pass through unchanged,
and `child.props` access becomes type-checked without an `as` cast.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The OpenAI Go driver landed in #14605 with chat, list models, and check
connection. Encode was left as a stub that returns `not implemented`.
`conf/models/openai.json` already lists three embedding models out of
the box:
- text-embedding-ada-002
- text-embedding-3-small
- text-embedding-3-large
So a tenant who picked one of these in the Go layer could not actually
run an embedding call. This PR fills the gap.
### What this PR includes
- `conf/models/openai.json`: add `"embedding": "embeddings"`
under `url_suffix` so the driver can build the URL from config. This
matches the `URLSuffix.Embedding` field used by other drivers
(siliconflow, zhipu-ai).
- `internal/entity/models/openai.go`: replace the Encode stub with a
real implementation that POSTs to `/v1/embeddings`. Adds a small local
response type `openaiEmbeddingResponse`.
No factory change. No interface change.
### How the implementation works
- Validate `apiConfig` and the API key, validate the model name. Use
the existing `baseURLForRegion` helper so an unknown region fails fast
with a clear error.
- Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so
the call has a clear deadline. This is the same pattern that
`ChatWithMessages` and `ListModels` already use in this file.
- Send all input texts in one request. The OpenAI API accepts the
`input` field as an array.
- Parse `data[*].embedding` and copy each slice into a `[][]float64`
indexed by `data[*].index` so the output order matches the input order
even if the API returns items in a different order.
- Handle both `float64` and `float32` element types, the way the
SiliconFlow driver does.
- An empty input slice returns `[][]float64{}` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks that every input slot got a vector. If any slot is
still nil, return a clear error so the caller does not silently use a
zero vector.
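The index-based placement and the final completeness pass can be sketched in Python. This mirrors the described logic rather than the Go code; `order_embeddings` and its error messages are hypothetical, and `items` is assumed to be the parsed `data` array of an OpenAI-style embeddings response.

```python
def order_embeddings(items, input_count):
    """Place each embedding at its reported index and verify none is missing."""
    if input_count == 0:
        return []  # nothing to embed, so no request would be made
    vectors = [None] * input_count
    for item in items:
        index = item["index"]
        if not 0 <= index < input_count:
            raise ValueError(f"embedding index {index} out of range")
        vectors[index] = [float(x) for x in item["embedding"]]
    missing = [i for i, vec in enumerate(vectors) if vec is None]
    if missing:
        raise ValueError(f"no embedding returned for inputs {missing}")
    return vectors
```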
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
(the go.mod minimum) returns exit 0.
- The full method set on `OpenAIModel` still matches the
`ModelDriver` interface.
- Pattern parity with the existing SiliconFlow Encode implementation
(`internal/entity/models/siliconflow.go`).
Closes #14629
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
As title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
top_n is missing
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
S3-family connector syncs currently re-download every in-window object
just so we can compute `xxhash128(blob)` and compare against
`Document.content_hash`. Anything that bumps `LastModified` without
changing bytes (`aws s3 cp` touches, bucket re-encryption, etc.) pays
full bandwidth and re-parses files that didn't actually change. #14628
covers the broader incremental-ingestion redesign; this PR is the first
slice.
The fix is a pre-listing short-circuit. `BlobStorageConnector` (S3 / R2
/ GCS / OCI / S3-compat) now implements a new `FingerprintConnector`
interface: `list_keys()` paginates `list_objects_v2` and yields
`KeyRecord(key, fingerprint)` where `fingerprint = xxhash128(ETag)`. The
orchestrator joins those against the connector's existing `{doc_id:
content_hash}` map and only calls `get_value(key)` when the fingerprint
differs. Unchanged keys are skipped entirely — no `GetObject`, no
re-parse.
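The bypass decision can be sketched as a pure function. Two loud assumptions: `hashlib.blake2b` with a 16-byte digest is used here only as a standard-library stand-in for `xxhash128` (both yield 32 hex characters, but the values differ), and the map is keyed by object key rather than `doc_id` for brevity. `fingerprint` and `plan_sync` are hypothetical names.

```python
import hashlib


def fingerprint(etag):
    """Hash a normalized ETag (stand-in for xxhash128, see note above)."""
    normalized = etag.strip().strip('"')  # S3 returns ETags wrapped in quotes
    return hashlib.blake2b(normalized.encode("utf-8"), digest_size=16).hexdigest()


def plan_sync(listed, known_hashes):
    """Split listed (key, etag) pairs into keys to fetch and keys to bypass."""
    to_fetch, bypassed = [], []
    for key, etag in listed:
        if known_hashes.get(key) == fingerprint(etag):
            bypassed.append(key)   # unchanged: no GetObject, no re-parse
        else:
            to_fetch.append(key)   # new or changed: download and parse
    return to_fetch, bypassed
```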
No DDL. xxhash128(ETag) is 32 hex chars and reuses the existing
`Document.content_hash` column per @yingfeng's suggestion; the connector
decides at listing time whether to populate it. Local uploads and
connectors that don't opt in fall through to the existing post-download
`xxhash128(blob)` path with no behavior change.
This is PR-1 of a 4-PR series — full design lives on #14628. Subsequent
PRs extend tier 1 to local FS / WebDAV / Dropbox / Seafile / RDBMS
(PR-2), wire up tier 2 cursor connectors with `SyncLogs.next_checkpoint`
(PR-3), and unify deletion via `KeyRecord(deleted=True)` reconciliation
(PR-4). Holding those back keeps this PR additive and reviewable on its
own.
#### Files touched
- `common/data_source/models.py` — new `KeyRecord`; optional
`fingerprint` on `Document`
- `common/data_source/interfaces.py` — `IncrementalCapability` enum,
`FingerprintConnector` ABC
- `common/data_source/blob_connector.py` — `BlobStorageConnector`
implements `FingerprintConnector`; per-object download factored into
`_build_document_from_obj()` so `_yield_blob_objects`, `list_keys`,
`get_value` all share it
- `rag/svr/sync_data_source.py` —
`_BlobLikeBase._fingerprint_filtered_generator` does the bypass loop;
`_run_task_logic` plumbs `doc.fingerprint` into the upload dict
- `api/db/services/document_service.py` —
`list_id_content_hash_map_by_kb_and_source_type()` helper
- `api/db/services/connector_service.py` + `file_service.py` —
fingerprint flows through `duplicate_and_parse → upload_document` and
lands in `content_hash`
- `test/unit_test/common/test_blob_connector_fingerprint.py` — 14 tests
covering ETag normalization (single-part, multipart, quoted, empty),
`list_keys()` not calling `GetObject`, `get_value()` materializing with
fingerprint, deterministic/stable fingerprints, and the bypass loop
asserting `GetObject` is *not* called on a match
#### Worth flagging for review
Old `_BlobLikeBase._generate` called `poll_source(start, now)` with a
`LastModified` window when `poll_range_start` was set. New code uses
`_fingerprint_filtered_generator` (full bucket listing + fingerprint
compare) outside of explicit `reindex=1`. Strictly better for
unchanged-bucket cases since it skips `GetObject`, but it does mean
every sync now does a full `list_objects_v2` paginate. Should still be
cheap for most buckets — flagging in case anyone has a very large bucket
where the time-window filter was meaningful.
On migration: existing rows have `content_hash = xxhash128(blob)` from
the old code. The first sync after this lands sees ETag-derived
fingerprints that don't match, re-fetches every object once, and writes
the new fingerprint. From the second sync onward the bypass works as
expected. "Slow day one, fast every day after." A `fingerprint_backfill:
trust` opt-out is sketched in the design doc but not in this PR.
#### Test plan
- [x] `uv run ruff check` — clean on all 8 touched files
- [x] `uv run pytest
test/unit_test/common/test_blob_connector_fingerprint.py -v` — 14 passed
- [x] Broader unit-test suite — no regressions in anything I touched
- [ ] Manual smoke against a real S3 bucket — configure a connector, run
sync twice, expect the second sync to log `bypassed=N, fetched=0` and no
`GetObject` calls in CloudTrail / bucket access logs
- [ ] Manual smoke with `reindex=1` — confirm the full re-download path
still works
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
This PR completes the Baidu Qianfan provider integration in RAGFlow.
**The following functionalities are now supported:**
- [x] Chat / Think Chat / Stream Chat / Stream Think Chat
- [x] Embedding
- [x] Rerank
- [x] Model listing
- [x] Provider connection checking
- [ ] Balance
-----
**Verified examples from the CLI:**
```plaintext
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 16 | 0 |
| 16 | 1 |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'qwen3-reranker-4b@test@baidu' top 2;
+-------+---------------------+
| index | relevance_score |
+-------+---------------------+
| 0 | 0.974821150302887 |
| 1 | 0.14223189651966095 |
| 2 | 0.08632347732782364 |
+-------+---------------------+
RAGFlow(user)> think chat with 'deepseek-v3.2@test@baidu' message 'who r u'
Thinking: Hmm, the user is asking for a simple introduction. This is straightforward – no need for overcomplication.
I should give a clear, friendly response that covers my basic identity as an AI assistant, my purpose, and my capabilities. Keeping it concise but informative is key here.
Mentioning my creator Anthropic adds credibility, and ending with an offer to help invites further interaction. No need for technical details unless the user asks later.
Answer: Hello! I'm an AI assistant created by Anthropic, designed to help with a wide variety of tasks. You can think of me as a helpful digital companion—I can answer questions, assist with writing, help solve problems, provide explanations, and engage in conversation on many topics. I'm here to help with whatever you need! How can I assist you today?
Time: 8.103902
RAGFlow(user)> stream think chat with 'deepseek-v3.2@test@baidu' message 'who r u'
Thinking: mm, the user is asking "who r u" with casual spelling. This is a straightforward identity question. should give a clear, friendly introduction without overcomplicating it. Can start with my core function as an AI assistant, mention my creator, and briefly state my key capabilities. response should be welcoming and invite further interaction since this seems like an introductory question. Keeping it concise but covering the essentials: who I am, what I do, and how I can help.
Answer: ! I am DeepSeek, an AI assistant created by DeepSeek Company. I'm designed to help answer questions, provide information, assist with various tasks, and engage in conversations on a wide range of topics. I'm here to assist you with whatever you need - whether it's answering questions, helping with analysis, writing, coding, or just having a friendly chat!Is there anything specific I can help you with today? 😊
Time: 7.219703
RAGFlow(user)> list supported models from 'baidu' 'test'
+--------------------------------------+
| model_name |
+--------------------------------------+
| ernie-3.5-8k-preview |
| ernie-4.0-8k |
| ernie-4.0-turbo-8k-latest |
| ernie-4.0-turbo-8k-preview |
| ernie-4.0-8k-preview |
| ernie-speed-pro-128k |
| ernie-char-fiction-8k |
| ernie-3.5-8k |
| ernie-3.5-128k |
| ernie-lite-pro-128k |
| ernie-novel-8k |
| ernie-4.0-turbo-8k |
| ernie-4.0-turbo-128k |
| ernie-4.0-8k-latest |
| irag-1.0 |
| ........... |
| glm-5.1 |
| ernie-image-turbo |
| deepseek-v4-pro |
| deepseek-v4-flash |
| ernie-5.1 |
+--------------------------------------+
RAGFlow(user)> check instance 'test' from 'baidu'
SUCCESS
```
Additionally, this PR fixes a typo in an error message:
Before:
```go
fmt.Errorf("API requestssss failed with status %d: %s : %s", ...)
```
After:
```go
fmt.Errorf("API request failed with status %d: %s", ...)
```
This PR mainly improves provider compatibility, API completeness, and
runtime stability.
### Type of change
* [x] Bug Fix (non-breaking change which fixes an issue)
* [x] New Feature (non-breaking change which adds functionality)
* [x] Refactoring
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.25.1 to v0.25.2
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
## Problem
During the REST API refactoring (#13690), the
`/api/v2/kb/check_embedding` endpoint was removed and never migrated to
the new RESTful structure. The frontend was pointed to the
`/api/v1/datasets/{id}/embedding` endpoint (which is `run_embedding` — a
completely different function). Additionally, a hard guard was
introduced that rejects any `embd_id` change when `chunk_num > 0`,
making it impossible to switch embedding models on datasets with
existing chunks.
## Root Cause
1. **Missing endpoint**: The old `check_embedding` logic (sample random
chunks, re-embed with the new model, compare cosine similarity) was not
carried over to the new REST API service layer.
2. **Wrong frontend URL**: `checkEmbedding` in `api.ts` pointed to
`/datasets/{id}/embedding` (`run_embedding`) instead of a dedicated
check endpoint.
3. **Overly restrictive guard**: `dataset_api_service.py` line 310
blocked all `embd_id` updates when `chunk_num > 0`. This check did not
exist in the pre-refactor code — it was incorrectly introduced during
the refactor.
## Changes
### Backend
- **`api/apps/services/dataset_api_service.py`**
  - Remove the `chunk_num > 0` hard guard on `embd_id` updates
  - Add `check_embedding()` service function: samples random chunks,
    re-embeds them with the candidate model, computes cosine similarity,
    returns compatibility result (avg ≥ 0.9 = compatible)
  - Add `import re` for the `_clean()` helper
- **`api/apps/restful_apis/dataset_api.py`**
  - Add `POST /datasets/<dataset_id>/embedding/check` endpoint following
    the new REST API conventions
  - Clean up unused top-level imports (`random`, `re`, `numpy`)
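The compatibility rule (average cosine similarity of at least 0.9) can be sketched with the standard library. The function names are hypothetical, and the handling of empty samples and zero vectors is an assumption, since the description does not say how `check_embedding()` treats them.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_compatible(stored, candidate, threshold=0.9):
    """Compare stored chunk vectors with re-embedded ones; return (ok, avg)."""
    sims = [cosine_similarity(s, c) for s, c in zip(stored, candidate)]
    if not sims:
        return False, 0.0  # assumption: no samples means no evidence
    average = sum(sims) / len(sims)
    return average >= threshold, average
```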
### Frontend
- **`web/src/utils/api.ts`**
  - Fix `checkEmbedding` URL from `/datasets/${datasetId}/embedding` →
    `/datasets/${datasetId}/embedding/check`
### Tests
- **`test/testcases/test_http_api/test_dataset_management/test_update_dataset.py`**
  - Update `test_embedding_model_with_existing_chunks` to assert success
    (`code == 0`) instead of expecting the old `102` error
- **`test/testcases/test_web_api/test_dataset_management/test_dataset_sdk_routes_unit.py`**
  - Update `test_update_route_branch_matrix_unit` to assert
    `RetCode.SUCCESS` when updating `embd_id` on a chunked dataset,
    replacing the old `chunk_num` error assertion
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
```
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 16 | 0 |
| 16 | 1 |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'rerank@test@zhipu-ai' top 2;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0 | 1 |
| 2 | 0.99999976 |
+-------+-----------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Update `mapping.json` to treat `id` as a keyword.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Some bugs
- Error during batch modification of metadata in the Knowledge Base
- Manually configured metadata is not displayed in search settings
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Closes #14292
## Issue
File ancestry endpoints return folder metadata without validating tenant
permissions, allowing any authenticated user to query arbitrary
`file_id` values across tenant boundaries.
## Affected Endpoints
- `GET /v1/file/parent_folder?file_id={file_id}`
- `GET /v1/file/all_parent_folder?file_id={file_id}`
- `GET /api/v1/files/{id}/ancestors`
## Root Cause
These endpoints **skip the permission check** that other file operations
(Delete, Download, Move) perform.
## Expected Permission Check
All file operations should follow this 3-step validation:
- Check file.tenant_id
- Check if user_id belongs to this tenant (via user_tenant join table)
- Check KB permission type (team permission)
**Code reference:** This is implemented in `checkFileTeamPermission()`
and used by Delete/Download/Move, but **missing** from
GetParentFolder/GetAllParentFolders.
## Reproduction
```bash
# User B (tenant: BBB) accessing User A's file (tenant: AAA)
curl -H "Authorization: Bearer USER_B_TOKEN" \
"http://localhost:9384/v1/file/parent_folder?file_id=AAA_FILE_123"
# Result: Returns User A's folder metadata ❌
# Expected: "No authorization." ✅
```
## Fix
Pass `userID` from the handler to the service and call `checkFileTeamPermission()`, the same as the Download/Delete/Move handlers.
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
### What problem does this PR solve?
Implement `HuggingFace` provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related issues
Closes #14644
### What problem does this PR solve?
This PR fixes an authorization bug where datasets marked with
`permission = me` could still be accessed by other members of the same
tenant through APIs that relied on `KnowledgebaseService.accessible()`
or `DocumentService.accessible()`.
Before this change, those shared access helpers only checked tenant
membership and did not enforce the dataset's permission mode. As a
result, a non-owner who knew a private `dataset_id` could still reach
downstream document and chunk operations even though the dataset was
intended to be owner-only.
This change updates the central access checks so that:
- dataset owners always retain access
- joined tenant members only get access when the dataset permission is
`TEAM`
- private datasets with `permission = me` remain inaccessible to
non-owners
- document-level access follows the same dataset permission rules
The PR also adds regression coverage for private-vs-team dataset access
behavior.
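The access rules listed above can be sketched as a small pure function. This is an illustration only, not the actual `KnowledgebaseService.accessible()` code: the `Dataset` record, the `joined_tenants` argument, and the assumption that an owner's user ID equals the dataset's tenant ID are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    id: str
    tenant_id: str   # tenant that owns the dataset
    permission: str  # "me" (owner-only) or "team"


def accessible(dataset, user_id, joined_tenants):
    """Return True if `user_id` may access `dataset`.

    `joined_tenants` is the set of tenant IDs the user has joined
    (hypothetical stand-in for the user/tenant join table).
    """
    # Owners always retain access (assumes owner user ID == tenant ID).
    if dataset.tenant_id == user_id:
        return True
    # Private datasets stay inaccessible to non-owners.
    if dataset.permission != "team":
        return False
    # Team datasets are visible only to members of the owning tenant.
    return dataset.tenant_id in joined_tenants


private_ds = Dataset(id="kb1", tenant_id="alice", permission="me")
team_ds = Dataset(id="kb2", tenant_id="alice", permission="team")
```

Document-level checks would apply the same function to the dataset that owns the document.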
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Testing
- Added
`test/unit_test/api/db/services/test_dataset_access_permissions.py`
- Attempted to run: `python -m pytest
test\\unit_test\\api\\db\\services\\test_dataset_access_permissions.py
-q`
- Local execution in this workspace is currently blocked during test
collection because the environment is missing the `strenum` dependency
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: jony376 <jony376@gmail.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
Co-authored-by: d 🔹 <liusway405@gmail.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Magicbook1108 <newyorkupperbay@gmail.com>
Co-authored-by: chanx <1243304602@qq.com>
Co-authored-by: sxxtony <166789813+sxxtony@users.noreply.github.com>
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Baki Burak Öğün <63836730+bakiburakogun@users.noreply.github.com>
Co-authored-by: bakiburakogun <bakiburakogun@users.noreply.github.com>
Co-authored-by: Panda Dev <56657208+pandadev66@users.noreply.github.com>
Co-authored-by: Haruko386 <tryeverypossible@163.com>
Co-authored-by: D2758695161 <13510221939@163.com>
Co-authored-by: Hunter <hunter@yitong.ai>
Co-authored-by: Lynn <lynn_inf@hotmail.com>
Co-authored-by: buua436 <sz_buua@foxmail.com>
Co-authored-by: web-dev0521 <jasonpette1783@gmail.com>
Co-authored-by: Tim Wang <38489718+wanghualoong@users.noreply.github.com>
Co-authored-by: wanghualoong <wanghualoong@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: qinling0210 <88864212+qinling0210@users.noreply.github.com>
Co-authored-by: dale053 <star05223@outlook.com>
### What
19 methods across `rag/llm/chat_model.py` and `rag/llm/cv_model.py`
declare `gen_conf={}` (or `gen_conf: dict = {}`) as a parameter default
and then mutate `gen_conf` in place — typically `del
gen_conf["max_tokens"]`, `gen_conf["penalty_score"] = ...`, or
`gen_conf.pop(...)` as part of provider-specific normalization.
### The two bugs in this pattern
**1. Mutable default argument (Python footgun).** Python evaluates
default values **once** at function-definition time, so the single `{}`
dict is *shared* across every caller that doesn't pass `gen_conf`. The
first such call's mutations leak into the default seen by every
subsequent call.
```python
# Before
def chat_streamly(self, system, history, gen_conf={}, **kwargs):
if "max_tokens" in gen_conf:
del gen_conf["max_tokens"] # mutates the SHARED default dict
...
```
After call N with `max_tokens` set, call N+1 that omits `gen_conf` no
longer sees `max_tokens` — even though the caller never touched it.
**2. Caller-dict pollution.** When the caller *does* pass a `gen_conf`
dict, the same in-place mutations modify the caller's dict. A reused
`gen_conf` (very common for chat-loop callers that build the config once
and pass it on every turn) silently loses `max_tokens`,
`presence_penalty`, etc. after the first round.
### The fix
In every affected method:
- Change `gen_conf={}` (or `gen_conf: dict = {}`) → `gen_conf=None`.
- Add `gen_conf = dict(gen_conf or {})` as the first statement of the
body so all subsequent mutations operate on a fresh local copy.
```python
# After
def chat_streamly(self, system, history, gen_conf=None, **kwargs):
gen_conf = dict(gen_conf or {})
if "max_tokens" in gen_conf:
del gen_conf["max_tokens"] # local copy — safe
...
```
Provider-side behavior is unchanged for callers that already pass a
fresh `gen_conf` per call. The new `dict(...)` call makes a shallow copy
whose cost is proportional to the (small) number of keys in `gen_conf`.
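Both bugs and the fix can be reproduced in isolation. The sketch below uses two hypothetical functions, `normalize_shared` and `normalize_copy`, that stand in for the affected methods; the specific keys are only examples.

```python
def normalize_shared(gen_conf={}):
    """Buggy pattern: the default dict is created once and shared by all calls."""
    gen_conf["penalty_score"] = 1.0   # leaks into the shared default
    gen_conf.pop("max_tokens", None)  # silently edits the caller's dict
    return gen_conf


def normalize_copy(gen_conf=None):
    """Fixed pattern: always work on a fresh local copy."""
    gen_conf = dict(gen_conf or {})
    gen_conf["penalty_score"] = 1.0
    gen_conf.pop("max_tokens", None)
    return gen_conf


# Bug 1: two calls without arguments return the very same dict object.
first = normalize_shared()
second = normalize_shared()

# Bug 2: a dict passed by the caller loses keys.
caller_conf = {"max_tokens": 512}
normalize_shared(caller_conf)

# Fix: the caller's dict is untouched and each call gets its own dict.
safe_conf = {"max_tokens": 512}
result = normalize_copy(safe_conf)
```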
### Files changed
- `rag/llm/chat_model.py` — 17 methods
- `rag/llm/cv_model.py` — 2 methods
### Tests
Adds `test/unit_test/rag/llm/test_gen_conf_no_mutable_default.py` — an
`ast`-based regression guard that walks both modules and asserts no
parameter named `gen_conf` ever has a mutable literal (`{}` or `[]`) as
its default. The test caught **five additional `gen_conf: dict = {}`
sites** that an initial `gen_conf={}` text grep had missed (annotated
parameters with whitespace), and would fail again if the pattern is ever
reintroduced.
```
$ pytest test/unit_test/rag/llm/test_gen_conf_no_mutable_default.py -v
============================== 3 passed in 0.04s ===============================
```
`ruff check` passes on all touched files.
### Notes
- This PR is intentionally focused on **just** the `gen_conf` default +
copy fix. There's a related (but separate) `history.insert(0, ...)`
pattern in the same files that mutates the caller's history list in 12
places — left for a follow-up so this PR stays mechanical and easy to
review.
### Latest revision (`700bb54a7`) — addresses CodeRabbit review
- Type annotation: `gen_conf: dict = None` → `gen_conf: dict | None =
None` (5 occurrences in `chat_model.py`). The old annotation was a
static-checker mismatch since `None` isn't a `dict`.
- Regression test: the AST check accessed `default.keys` directly.
`ast.List` has no `.keys` attribute — a future `gen_conf=[]` would crash
with `AttributeError` instead of being caught. Use `getattr` for both
`.keys` (Dict) and `.elts` (List). Manually verified the updated check
correctly catches both `gen_conf={}` and `gen_conf=[]` while ignoring
`gen_conf=None` and non-empty literals.
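A guard of this kind can be sketched with the standard `ast` module. This is not the test file from the PR; the function name `find_mutable_gen_conf_defaults` and the `SAMPLE` source are made up for illustration, and the check only flags empty `{}` or `[]` defaults, as described above.

```python
import ast


def find_mutable_gen_conf_defaults(source):
    """Return sorted names of functions whose `gen_conf` default is `{}` or `[]`."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        positional = args.posonlyargs + args.args
        # Positional defaults align with the last len(defaults) parameters.
        pairs = list(zip(positional[len(positional) - len(args.defaults):], args.defaults))
        # Keyword-only parameters have None in kw_defaults when no default is given.
        pairs += [(a, d) for a, d in zip(args.kwonlyargs, args.kw_defaults) if d is not None]
        for arg, default in pairs:
            if arg.arg != "gen_conf" or not isinstance(default, (ast.Dict, ast.List)):
                continue
            # ast.Dict exposes .keys and ast.List exposes .elts; getattr covers both.
            items = getattr(default, "keys", None)
            if items is None:
                items = getattr(default, "elts", [])
            if not items:
                offenders.append(node.name)
    return sorted(offenders)


SAMPLE = '''
def a(self, gen_conf={}): pass
def b(self, gen_conf: dict = {}): pass
def c(self, gen_conf=[]): pass
def d(self, gen_conf=None): pass
def e(self, gen_conf={"temperature": 0.1}): pass
async def f(self, *, gen_conf={}): pass
'''

offenders = find_mutable_gen_conf_defaults(SAMPLE)
```

Because the check works on the parsed tree rather than on text, annotated parameters such as `gen_conf: dict = {}` are caught regardless of whitespace.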
---------
Co-authored-by: Ricardo <ricardo@example.com>
### What problem does this PR solve?
Bugfix: keep document api backward compatible
Fix 1: https://github.com/infiniflow/ragflow/issues/14634
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- When KB retrieval fails (e.g. ES `AssertionError` on empty
`index_names`), `kbinfos` falls back to a dict without a `total` key
- `_async_update_chunk_info` then iterates over `chunk_info.keys()`
(which includes `total`) and tries `kbinfos['total']`, raising a
`KeyError`
- This error surfaces when using Tavily web retrieval in a chat with no
knowledge base attached
## Changes
- Add `'total': 0` to all default `kbinfos` dicts in
`_retrieve_information`
- Add `setdefault('total', 0)` guard after successful KB retrieval to
handle cases where the retrieval result omits the key
- Accumulate `total` correctly in the merge branch of
`_async_update_chunk_info`
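The three changes can be illustrated with a simplified merge routine. This is a sketch under assumptions, not the code in `_async_update_chunk_info`: the helper names and the `chunks` / `doc_aggs` keys are hypothetical, and the real merge may also deduplicate entries.

```python
def default_kbinfos():
    """Fallback result used when KB retrieval fails; now includes `total`."""
    return {"total": 0, "chunks": [], "doc_aggs": []}


def merge_chunk_info(chunk_info, kbinfos):
    """Merge one retrieval result into the accumulated chunk info."""
    # Guard: a retrieval result may omit `total` entirely.
    kbinfos.setdefault("total", 0)
    for key in chunk_info:
        if key == "total":
            chunk_info["total"] += kbinfos["total"]  # accumulate counts
        else:
            chunk_info[key].extend(kbinfos.get(key, []))
    return chunk_info


accumulated = default_kbinfos()
# A web-retrieval style result without a `total` key no longer raises KeyError.
merge_chunk_info(accumulated, {"chunks": [{"id": "c1"}], "doc_aggs": []})
merge_chunk_info(accumulated, {"total": 3, "chunks": [{"id": "c2"}], "doc_aggs": []})
```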
## Test plan
- [ ] Start a chat with Tavily configured and no knowledge base
- [ ] Verify no `KeyError: 'total'` is raised
- [ ] Verify Tavily results are returned correctly
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
- Adding a Bedrock model from the frontend fails with `Fail to access
model(Bedrock/<model>).Expecting value: line 1 column 1 (char 0)`.
- The assembled Bedrock JSON credentials are silently replaced by `"x"`
before the connection test, causing `json.loads("x")` to raise a
`JSONDecodeError`.
## What problem does this PR solve?
Commit `050113482` introduced a fallback in `add_llm()` that reuses the
existing DB key when `req.get("api_key") is None`:
```python
if req.get("api_key") is None:
api_key = existing_api_key if existing_api_key is not None else "x"
```
For Bedrock, credentials are sent as separate fields (`auth_mode`,
`bedrock_ak`, `bedrock_sk`, `bedrock_region`, `aws_role_arn`) — the
frontend does not send an `api_key` field. The function correctly
assembles the JSON key:
```python
api_key = apikey_json(["auth_mode", "bedrock_ak", "bedrock_sk", "bedrock_region", "aws_role_arn"])
```
But since `req.get("api_key")` is `None`, the override immediately
replaces `api_key` with `"x"` (or a stale DB value). `LiteLLMBase` then
calls `json.loads("x")` for Bedrock auth → `JSONDecodeError`.
## Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Changes
**`api/apps/llm_app.py`**
Write the assembled key into `req["api_key"]` so the `None` check
evaluates to `False` and the override is skipped — consistent with how
`Tencent Cloud` is already handled.
```python
# Before
api_key = apikey_json(["auth_mode", "bedrock_ak", "bedrock_sk", "bedrock_region", "aws_role_arn"])
# After
req["api_key"] = apikey_json(["auth_mode", "bedrock_ak", "bedrock_sk", "bedrock_region", "aws_role_arn"])
api_key = req["api_key"]
```
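The failure mode and the effect of the fix can be reproduced without the rest of `add_llm()`. The two helper functions below are hypothetical reductions of the logic described above, and the credential values are placeholders.

```python
import json


def resolve_api_key_before(req, assembled_key, existing_api_key):
    """Mirrors the old flow: the assembled key is overridden when req has no api_key."""
    api_key = assembled_key
    if req.get("api_key") is None:
        api_key = existing_api_key if existing_api_key is not None else "x"
    return api_key


def resolve_api_key_after(req, assembled_key, existing_api_key):
    """Mirrors the fix: store the assembled key in req so the override is skipped."""
    req["api_key"] = assembled_key
    api_key = req["api_key"]
    if req.get("api_key") is None:
        api_key = existing_api_key if existing_api_key is not None else "x"
    return api_key


# Bedrock sends separate credential fields and no `api_key` field.
assembled = json.dumps({"auth_mode": "access_key", "bedrock_region": "us-east-1"})

broken = resolve_api_key_before({}, assembled, None)  # becomes "x"
fixed = resolve_api_key_after({}, assembled, None)    # keeps the JSON string

try:
    json.loads(broken)
    decode_error = None
except json.JSONDecodeError as exc:
    decode_error = str(exc)
```

Here `decode_error` carries the same `Expecting value` message that surfaced in the frontend.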
## Test plan
- [ ] Configure a Bedrock provider in Model Providers with valid AWS
credentials
- [ ] Add a Bedrock chat model — verify no `Expecting value` error
- [ ] Update the same model — verify the existing key is reused
correctly when credentials fields are left empty
🤖 Generated with [Claude Code](https://claude.ai/claude-code)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
The `use_sql()` function in `dialog_service.py` constructed SQL WHERE
clauses and Infinity table names by directly interpolating `kb_id` values
using Python f-strings, with no validation of the input values. A
malformed or maliciously crafted `kb_id` (introduced via a compromised
admin account or a separate injection vector) could alter the structure
of the generated SQL query, potentially leading to unauthorized data
access or data manipulation.
This PR adds strict UUID format validation for all `kb_id` values before
they are interpolated into any SQL string, causing requests with invalid
IDs to fail fast with a `ValueError` rather than executing a tampered
query.
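A validate-before-interpolate step of this kind might look like the sketch below. The exact pattern used in the PR is not shown here; this example assumes IDs are 32-character hexadecimal strings (a UUID without hyphens), and the helper names are hypothetical.

```python
import re

# Assumption: kb_id values are hyphen-less UUIDs, i.e. exactly 32 hex characters.
_KB_ID_RE = re.compile(r"[0-9a-fA-F]{32}")


def validate_kb_ids(kb_ids):
    """Return the IDs unchanged, or raise ValueError on the first invalid one."""
    validated = []
    for kb_id in kb_ids:
        # fullmatch rejects trailing newlines and any extra characters.
        if not isinstance(kb_id, str) or not _KB_ID_RE.fullmatch(kb_id):
            raise ValueError(f"Invalid kb_id: {kb_id!r}")
        validated.append(kb_id)
    return validated


def build_kb_filter(kb_ids):
    """Build a WHERE fragment only after every ID has passed validation."""
    quoted = ", ".join(f"'{kb_id}'" for kb_id in validate_kb_ids(kb_ids))
    return f"kb_id IN ({quoted})"


good_id = "0123456789abcdef0123456789abcdef"
clause = build_kb_filter([good_id])

try:
    build_kb_filter([good_id, "x' OR '1'='1"])
    rejected = False
except ValueError:
    rejected = True
```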
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
## Summary
- When a model is registered as `chat` in `tenant_llm` but has the
`IMAGE2TEXT` tag in `llm_factories.json`, requesting it as `image2text`
(e.g. PDF parser) fails with `Tenant Model with name <model> and type
image2text not found`.
- After resolution via the new fallback, the returned
`config_dict["model_type"]` was still `"chat"`, causing
`tenant_llm_service.model_instance()` to instantiate `ChatModel` instead
of `CvModel` — breaking `describe_with_prompt` at ingestion time.
## What problem does this PR solve?
RAGFlow already has a `CHAT→IMAGE2TEXT` fallback: when a chat model is
not found, it retries with `image2text`. The symmetric fallback
(`IMAGE2TEXT→CHAT`) was missing.
This matters for multimodal models declared as `model_type: "chat"` with
an `IMAGE2TEXT` tag in `llm_factories.json` (e.g. models added after
tenant creation, or providers where a single model serves both
purposes). The frontend PDF parser selector correctly surfaces these
models via the `IMAGE2TEXT` tag, but the backend fails to resolve them
at runtime.
## Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Changes
**`api/db/joint_services/tenant_model_service.py`**
1. Add `IMAGE2TEXT→CHAT` fallback in
`get_model_config_by_type_and_name`: when an `image2text` model is not
found in `tenant_llm`, retry with `chat` — but only if the `llm` table
confirms `IMAGE2TEXT` capability via the `tags` field. This mirrors the
philosophy of the existing `CHAT→IMAGE2TEXT` fallback: substitution is
only allowed when the model has declared the required capability.
2. Normalize `config_dict["model_type"]` to `image2text` after the
fallback, so the caller (`model_instance`) correctly routes to `CvModel`
instead of `ChatModel`.
3. Extend the type validation guard to allow `(requested=image2text,
found=chat)` alongside the existing `(requested=chat, found=image2text)`
exception.
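The three changes can be sketched against in-memory tables. This is an illustration only: the `TENANT_LLM` and `LLM_TAGS` dictionaries, the comma-separated tag format, and the function name are assumptions, not the real service code.

```python
# (model_name, model_type) -> tenant config; hypothetical stand-in for tenant_llm.
TENANT_LLM = {
    ("vision-chat", "chat"): {"llm_name": "vision-chat", "model_type": "chat"},
    ("plain-chat", "chat"): {"llm_name": "plain-chat", "model_type": "chat"},
}
# model_name -> tags; hypothetical stand-in for the llm table.
LLM_TAGS = {"vision-chat": "LLM,CHAT,IMAGE2TEXT", "plain-chat": "LLM,CHAT"}


def get_model_config(model_type, model_name):
    config = TENANT_LLM.get((model_name, model_type))
    if config is None and model_type == "image2text":
        # Fall back to the chat registration only if the capability is declared.
        tags = LLM_TAGS.get(model_name, "").split(",")
        if "IMAGE2TEXT" in tags:
            chat_config = TENANT_LLM.get((model_name, "chat"))
            if chat_config is not None:
                # Normalize the type so the caller routes to the vision model class.
                config = {**chat_config, "model_type": "image2text"}
    if config is None:
        raise LookupError(
            f"Tenant Model with name {model_name} and type {model_type} not found"
        )
    return config


resolved = get_model_config("image2text", "vision-chat")
```

Returning a copy with the normalized type leaves the stored chat registration unchanged, so the same model still resolves as `chat` in chat contexts.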
## Test plan
- [ ] Add a model with `model_type=chat` and `tags` containing
`IMAGE2TEXT` to a tenant
- [ ] Select it as PDF parser in a knowledge base
- [ ] Verify ingestion succeeds without `image2text not found` or
`describe_with_prompt` errors
- [ ] Verify the same model still works correctly in chat context
🤖 Generated with [Claude Code](https://claude.ai/claude-code)
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes #14360
## Problem
When the same blob storage bucket is connected to multiple knowledge
bases (each through a different data source connector), the sync
pipeline hashes only the blob path
(`bucket_type:bucket_name:object_key`) to derive the document ID. Every
connector pointing at the same bucket therefore produces **identical
IDs** for the same object. The collision guard in
`FileService.upload_document` then fires for the second knowledge base:
```
Existing document id collision with another knowledge base; skipping update.
```
This makes it impossible to index the same bucket into more than one KB
simultaneously.
## Solution
Include `connector_id` in the hash input so that each connector produces
a distinct document ID even when the underlying blob path is identical:
```python
# Before
"id": hash128(doc.id),
# After
"id": hash128(f"{task['connector_id']}:{doc.id}"),
```
Because each KB connection uses its own connector (with a unique
`connector_id`), documents are now namespaced per connector and no
collision occurs.
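The effect of the namespacing can be checked with a stand-in hash. The `hash128` below is a hypothetical substitute built on `hashlib.blake2b`; RAGFlow's own `hash128` may use a different algorithm, and the connector IDs and blob path are invented.

```python
import hashlib


def hash128(text):
    """Stand-in 128-bit hash (assumption: not RAGFlow's actual implementation)."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=16).hexdigest()


blob_path = "s3:shared-bucket:reports/2024/q1.pdf"

# Before: both connectors derive the same ID from the blob path alone.
old_id_kb1 = hash128(blob_path)
old_id_kb2 = hash128(blob_path)

# After: the connector ID is part of the hash input.
new_id_kb1 = hash128(f"connector-1:{blob_path}")
new_id_kb2 = hash128(f"connector-2:{blob_path}")
```

The same connector still produces a stable ID for the same object across syncs, since the hash input is deterministic.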
**Note:** This is a breaking change for existing synced data sources.
After upgrading, a re-sync will create new documents with the updated ID
format. Old documents (indexed under the previous format) will remain in
the database but can be manually deleted or cleaned up via a re-sync
with reindex enabled.
## Testing
- Verified that the one-line change produces unique IDs for two
connectors pointing at the same S3 path.
- Existing unit test
`test_upload_document_skips_cross_kb_document_id_collision` continues to
pass — the collision guard in `FileService` is still valid for genuinely
colliding IDs from other sources.
---------
Co-authored-by: octo-patch <octo-patch@github.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The Aliyun Go driver has a stub `Rerank` method that always returns
`"Aliyun, Rerank not implemented"`. DashScope exposes an
OpenAI-compatible rerank endpoint (`compatible-mode/v1/rerank`) and
hosts dedicated bilingual rerankers (`gte-rerank-v2`, `gte-rerank`) that
are a natural pairing with the embedding models already in
`aliyun.json`. Without this, Aliyun users cannot use reranking within
RAGFlow.
Closes #14675
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Route error in dataset files page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactor: allow searching across multiple datasets
1. Support `/datasets/search`
2. Remove `/graph/search`; use `/graph` instead
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
Closes #14590
## Self Checks
- [x] I have searched for existing issues [search for existing
issues](https://github.com/infiniflow/ragflow/issues), including closed
ones.
- [x] I confirm that I am using English to submit this report ([Language
Policy](https://github.com/infiniflow/ragflow/issues/5910)).
- [x] Non-English title submissions will be closed directly (
非英文标题的提交将会被直接关闭 ) ([Language
Policy](https://github.com/infiniflow/ragflow/issues/5910)).
- [x] Please do not modify this template :) and fill in all the required
fields.
## RAGFlow workspace code commit ID
`a1b2c3d4e5f67890123456789abcdef12345678`
## RAGFlow image version
`0.13.1`
## Other environment information
- Hardware parameters: N/A
- OS type: Linux 6.17.0-22-generic
- Others: API key authentication via `Authorization: Bearer <token>`
## Actual behavior
The chatbot API endpoints:
- `POST /chatbots/<dialog_id>/completions`
- `GET /chatbots/<dialog_id>/info`
validate only that the bearer token exists in `APIToken`, but do not
verify that `dialog_id` belongs to the same tenant as that token.
Current flow (simplified):
1. Route extracts bearer token and checks `APIToken.query(beta=token)`.
2. If token exists, request is accepted.
3. Downstream service resolves dialog globally by ID
(`DialogService.get_by_id(dialog_id)` in `conversation_service.py`).
4. No tenant ownership check is enforced for `dialog_id`.
Impact: Any user with a valid API key can attempt arbitrary `dialog_id`
values and access/invoke chatbots outside their own tenant boundary if
IDs are known/guessed/leaked.
Security classification:
- Vulnerability class: Broken Access Control (IDOR, OWASP Top 10 A01)
- Severity recommendation: Critical
- Exploit prerequisite: any valid API key + discoverable target
`dialog_id`
## Expected behavior
Requests to `/chatbots/<dialog_id>/completions` and
`/chatbots/<dialog_id>/info` must be authorized only when:
1. bearer token is valid, and
2. `dialog_id` belongs to the same `tenant_id` as the token.
Otherwise, reject with authorization failure (e.g., 403 or
404-equivalent policy).
## Steps to reproduce
1. Prepare two tenants:
- Tenant A with API key `TOKEN_A`
- Tenant B with chatbot `dialog_id = DIALOG_B`
2. Send request from Tenant A to Tenant B chatbot completion endpoint:
```bash
curl -X POST "https://<host>/chatbots/DIALOG_B/completions" \
-H "Authorization: Bearer TOKEN_A" \
-H "Content-Type: application/json" \
-d '{"question":"hello","stream":false}'
```
3. Observe request is processed (or reaches dialog resolution) without
tenant ownership rejection.
4. Repeat against info endpoint:
```bash
curl -X GET "https://<host>/chatbots/DIALOG_B/info" \
-H "Authorization: Bearer TOKEN_A"
```
5. Observe the same missing ownership enforcement.
## Additional information
Affected code paths:
- `api/apps/sdk/session.py`
- `chatbot_completions(dialog_id)`
- `chatbots_inputs(dialog_id)`
- `api/db/services/conversation_service.py`
- `async_iframe_completion(...)` uses global dialog lookup
Suggested fix:
1. In both chatbot endpoints:
- Resolve `tenant_id = objs[0].tenant_id` from validated token.
- Fetch dialog with tenant-scoped query
(`DialogService.query(id=dialog_id, tenant_id=tenant_id)`).
- Reject if dialog is not found/owned by tenant.
2. Defense in depth:
- Require and enforce `tenant_id` in service-layer dialog resolution for
external flows.
- Avoid global `get_by_id(dialog_id)` where user-controlled dialog IDs
are reachable.
3. Add regression tests:
- Positive: same-tenant token + dialog succeeds.
- Negative: cross-tenant token + dialog fails for both endpoints.
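The suggested tenant-scoped lookup and its regression cases can be sketched as follows. The in-memory `API_TOKENS` and `DIALOGS` tables and the function name are hypothetical stand-ins for `APIToken` and `DialogService`.

```python
API_TOKENS = {"TOKEN_A": "tenant_a", "TOKEN_B": "tenant_b"}  # token -> tenant_id
DIALOGS = {"DIALOG_A": "tenant_a", "DIALOG_B": "tenant_b"}   # dialog_id -> tenant_id


def authorize_chatbot_request(token, dialog_id):
    """Allow the request only if the token is valid and owns the dialog."""
    tenant_id = API_TOKENS.get(token)
    if tenant_id is None:
        return False  # unknown token
    # Tenant-scoped lookup: a dialog owned by another tenant is treated as missing.
    return DIALOGS.get(dialog_id) == tenant_id


same_tenant = authorize_chatbot_request("TOKEN_A", "DIALOG_A")
cross_tenant = authorize_chatbot_request("TOKEN_A", "DIALOG_B")
```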
### What problem does this PR solve?
Restrict file move operations: prevent moving a folder to itself or to
one of its own subfolders.
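One way to express this restriction is to walk the destination's ancestor chain and reject the move if the folder being moved appears in it. The sketch below is an assumption about the approach, not the PR's code; the `PARENTS` mapping and `can_move` are hypothetical.

```python
# folder_id -> parent folder_id (None for the root); hypothetical tree.
PARENTS = {"root": None, "a": "root", "b": "a", "c": "b", "d": "root"}


def can_move(folder_id, dest_id, parents):
    """Return False if dest_id is the folder itself or one of its descendants."""
    current = dest_id
    seen = set()  # guards against cycles in corrupted data
    while current is not None and current not in seen:
        if current == folder_id:
            return False
        seen.add(current)
        current = parents.get(current)
    return True


move_to_self = can_move("a", "a", PARENTS)
move_to_descendant = can_move("a", "c", PARENTS)
move_to_sibling = can_move("a", "d", PARENTS)
```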
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Do not bypass threshold for rerank when metadata filter is enabled
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes #14651.
`kb_prompt()` in `rag/prompts/generator.py` crashes with
`AttributeError: 'NoneType' object has no attribute 'items'` during
agent citation generation when a retrieved chunk carries
`document_metadata: null`.
**Root cause.** The crash happens at `rag/prompts/generator.py:132-133`:
```python
meta = ck.get("document_metadata", {})
for k, v in meta.items():
```
`dict.get(key, default)` only returns the default when the key is
*missing*. When the key is present with an explicit `None` value,
`.get()` returns `None`, and `.items()` crashes.
**How the chunk gets `None`.** It's a round-trip inside RAGFlow itself,
not bad input from retrieval:
1. The agent stores retrieved chunks via `agent/canvas.py:814`, which
routes them through `chunks_format()`.
2. `rag/prompts/generator.py:61` canonicalizes the field with
`chunk.get("document_metadata")` (no default), so chunks without
metadata become `{"document_metadata": None, ...}`.
3. `agent/component/agent_with_tools.py:314` feeds those canonicalized
chunks back into `kb_prompt()` for citation generation, and
`.get("document_metadata", {})` no longer protects us.
**Fix.** One-line change at `rag/prompts/generator.py:132`: use
`ck.get("document_metadata") or {}` so an explicit `None` is also
coerced to `{}`.
The line-61 `None` is intentionally part of the API/UI contract — the
frontend handles it via optional chaining
(`web/src/components/markdown-content/index.tsx:184`,
`web/src/pages/next-search/search-view.tsx:217`) — so the fix belongs at
the consumer, not the producer.
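The difference between a missing key and an explicit `None` is easy to verify in isolation. The chunk dictionaries and the `metadata_items` helper below are made up for illustration.

```python
chunk_missing = {"content": "no metadata key at all"}
chunk_null = {"content": "explicit null", "document_metadata": None}

# The default is used only when the key is absent.
from_missing = chunk_missing.get("document_metadata", {})
from_null = chunk_null.get("document_metadata", {})


def metadata_items(ck):
    """Consumer-side guard: coerce an explicit None to an empty dict."""
    meta = ck.get("document_metadata") or {}
    return list(meta.items())


items_null = metadata_items(chunk_null)
items_real = metadata_items({"document_metadata": {"author": "alice"}})
```

Here `from_null` is `None`, which is why calling `.items()` on it failed before the fix.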
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
```
RAGFlow(user)> logout;
SUCCESS
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. **Fix Global State Pollution in Local Providers (Critical Bug):**
   - Resolved a severe concurrency and architecture issue in
     `model_service.go`. Previously, `ListSupportedModels` would permanently
     overwrite the global provider singleton with a localized URL instance
     (`driver.NewInstance`). This caused cross-request contamination in
     multi-tenant environments.
   - Fixed `CheckProviderConnection` for local models (LM Studio, vLLM,
     Ollama). It now creates a localized driver copy and injects the
     `base_url` before testing the connection, which eliminates the
     false-positive `missing base URL` error without polluting the global
     state.
2. **Implement `VolcEngine` Embeddings:**
   - Fully implemented the `Encode` method for the `volcengine` provider,
     enabling text embedding capabilities for VolcEngine models.
3. **Enhance Region Validation in `SiliconFlow`:**
   - Added a strict empty-string check (`*apiConfig.Region != ""`) alongside
     the existing `nil` check when parsing regions. If an empty string is
     passed, the system now falls back to the `"default"` region, preventing
     malformed URL requests and `unsupported protocol scheme` errors.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Update API URL
2. Add password encryption
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Update the type of tenant_rerank_id in validation.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- When an agent workflow has multiple `UserFillUp` pause points,
`canvas.run()` calls `reset(True)` on **all** components at the start of
each run. This clears outputs from components that completed in prior
runs, so downstream references like `{Agent:XXX@content}` resolve to
`None`.
- This fix only resets components on the **current execution path**
(`self.path`), preserving outputs from previously completed components.
## Problem
In a multi-step agent (e.g. draft email → user confirms → send email):
1. First `run()`: Agent drafts content, UserFillUp pauses for user input
→ Agent output is saved
2. Second `run()`: User submits input, but `reset(True)` clears **all**
components including the Agent that already completed
3. Email component references `{Agent:XXX@content}` → gets `None`
instead of the draft
This affects **all** agents that reference upstream component outputs
after a UserFillUp pause point.
## Fix
```python
# Before: reset ALL components
for k, cpn in self.components.items():
self.components[k]["obj"].reset(True)
# After: only reset components on current execution path
path_set = set(self.path)
for k, cpn in self.components.items():
if k in path_set:
self.components[k]["obj"].reset(True)
```
`self.path` already tracks the current execution path. For agents
without UserFillUp (single run), `path` contains all components, so
behavior is unchanged.
## Test plan
- [x] Agent with single UserFillUp: outputs from prior components are
preserved after resume
- [x] Agent with multiple UserFillUp: each resume preserves all
previously completed outputs
- [x] Agent without UserFillUp: behavior unchanged (all components in
path, all reset)
- [x] Webhook-triggered agents: unaffected (path includes all components
on first run)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: wanghualoong <wanghualoong@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Add a compatibility route for document download under `/v1`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- **Collapsible thinking**: Replace `<section>` with `<details>` for
`<think>` content, so model thinking output is collapsed by default
(click to expand). Works for all models that output `<think>` tags
(Qwen3, DeepSeek, Gemini, Claude, etc.).
- **Fix double thinking tags**: When reasoning/deep research mode is
enabled in knowledge base chat, both the retrieval progress and model
thinking were wrapped in `<think>` tags, producing two "Thinking..."
blocks. Now retrieval progress uses a dedicated `<retrieving>` tag
rendered as a separate "Retrieving..." collapsible with a distinct green
accent.
### Before
- Thinking content displayed as flat gray-bordered `<section>`,
occupying significant screen space
- Deep research + model thinking both use `<think>` → two identical
"Thinking..." blocks
### After
- Thinking content collapsed by default in a `<details>` element, click
"Thinking..." to expand
- Deep research shows "Retrieving..." (green border), model thinking
shows "Thinking..." (gray border)
## Changes
**Backend (`api/db/services/dialog_service.py`)**
- Deep research callback: replace `start_to_think`/`end_to_think` marker
flags with direct `<retrieving>`/`</retrieving>` answer text
**Frontend**
- `web/src/utils/chat.ts`: `replaceThinkToSection()` now uses
`<details>` instead of `<section>`; add new
`replaceRetrievingToSection()`
- 4 tsx files: import and pipe `replaceRetrievingToSection`, whitelist
`details`, `summary`, `retrieving` in DOMPurify `ADD_TAGS`
- 4 less files: `section.think` → `details.think` with `<summary>`
styles; add `details.retrieving` with green accent; dark mode and RTL
variants
## Test plan
- [ ] Open a chat WITHOUT knowledge base, ask a question to a model with
thinking (e.g. Qwen3) → thinking content should be collapsed by default,
click "Thinking..." to expand
- [ ] Open a chat WITH knowledge base and reasoning enabled, ask a
question → "Retrieving..." (green) shows retrieval progress,
"Thinking..." (gray) shows model thinking, each independently
collapsible
- [ ] Verify dark mode renders correctly for both collapsible blocks
- [ ] Verify RTL layout renders correctly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: wanghualoong <wanghualoong@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Closes #14618.
The `GET /v1/document/get/<doc_id>` endpoint in
`api/apps/document_app.py` was protected only by `@login_required` and
called `DocumentService.get_by_id(doc_id)` without verifying that the
document's knowledge base belonged to the requesting user's tenant. Any
authenticated user who knew (or guessed) a document ID could download
files belonging to any other tenant — a cross-tenant IDOR.
This PR adds a `DocumentService.accessible(doc_id, current_user.id)`
check before serving the file. The helper already exists and joins
`Document` → `Knowledgebase` → `UserTenant` to verify the requesting
user belongs to the tenant that owns the document's KB. The same pattern
is already used by `api/apps/restful_apis/document_api.py` and mirrors
the tenant scoping in the SDK route at `api/apps/sdk/doc.py`.
The check returns the existing `"Document not found!"` error for both
non-existent and inaccessible documents, so attackers cannot use the
response to enumerate valid doc IDs across tenants.
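The enumeration-resistant behavior can be sketched as follows. The in-memory `DOCUMENTS` and `USER_TENANTS` tables and the response shape are hypothetical; only the `"Document not found!"` message comes from the description above.

```python
DOCUMENTS = {"doc_a": "tenant_a"}  # doc_id -> tenant owning the document's KB
USER_TENANTS = {"user_a": {"tenant_a"}, "user_b": {"tenant_b"}}

NOT_FOUND = "Document not found!"


def accessible(doc_id, user_id):
    """True only if the document exists and the user belongs to its tenant."""
    tenant = DOCUMENTS.get(doc_id)
    return tenant is not None and tenant in USER_TENANTS.get(user_id, set())


def get_document(doc_id, user_id):
    # Missing and inaccessible documents yield the same response.
    if not accessible(doc_id, user_id):
        return {"ok": False, "message": NOT_FOUND}
    return {"ok": True, "doc_id": doc_id}


owner_view = get_document("doc_a", "user_a")
cross_tenant_view = get_document("doc_a", "user_b")
missing_view = get_document("doc_zzz", "user_b")
```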
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Other (please describe): Security fix (cross-tenant IDOR /
authorization bypass)
### What problem does this PR solve?
The Aliyun Go driver shipped with a stub `Encode` method that returned
`no such method`, even though `conf/models/aliyun.json` already
wires the OpenAI-compatible embeddings URL suffix at
`compatible-mode/v1/embeddings`. The same config also did not list any
embedding models, so the picker had nothing to select.
So an Aliyun tenant who wanted to use Tongyi text-embedding-v3 or v4 in
the Go layer could not, even though the upstream endpoint is public and
uses the standard `POST /v1/embeddings` shape that the SiliconFlow and
ZhipuAI drivers already support.
This PR fills the gap.
### What this PR includes
- `conf/models/aliyun.json`: add `text-embedding-v4` and
`text-embedding-v3` to the `models` array.
- `internal/entity/models/aliyun.go`: replace the `Encode` stub with
a real implementation. Adds a small local response type that matches the
OpenAI-compatible shape.
No factory change. No interface change.
### How the driver works
- Validate \`apiConfig\` and the API key, validate the model name,
resolve the region with a default fallback, build the
URL from \`BaseURL[region] + URLSuffix.Embedding\`.
- Send all input texts in one request as the \`input\` array, the same
OpenAI-compatible shape the SiliconFlow \`Encode\`
uses.
- Parse \`data[*].embedding\` and copy each slice into a \`[][]float64\`
indexed by \`data[*].index\` so the output order matches the input order
even if the API returns items in a different order.
- Handle both \`float64\` and \`float32\` element types.
- Empty input returns \`[][]float64{}\` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks every input slot got a vector and returns a clear
error if any slot is still nil.
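The index-based placement and the final nil check can be sketched as follows. This is a minimal illustration, not the driver's actual code: the type `embeddingResponse` and the function `orderEmbeddings` are hypothetical names, and the sketch decodes straight into `float64` rather than handling both element types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingResponse mirrors the OpenAI-compatible embeddings shape.
// The type name is hypothetical; only the JSON field names come from the API shape.
type embeddingResponse struct {
	Data []struct {
		Index     int       `json:"index"`
		Embedding []float64 `json:"embedding"`
	} `json:"data"`
}

// orderEmbeddings places each vector in the slot named by its index and
// returns an error if any input slot is left without a vector.
func orderEmbeddings(body []byte, inputCount int) ([][]float64, error) {
	if inputCount == 0 {
		// Empty input: nothing to decode, no HTTP call would be made.
		return [][]float64{}, nil
	}
	var resp embeddingResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode embeddings response: %w", err)
	}
	out := make([][]float64, inputCount)
	for _, item := range resp.Data {
		if item.Index < 0 || item.Index >= inputCount {
			return nil, fmt.Errorf("embedding index %d out of range [0,%d)", item.Index, inputCount)
		}
		// Copy so the result does not alias the decoded response.
		vec := make([]float64, len(item.Embedding))
		copy(vec, item.Embedding)
		out[item.Index] = vec
	}
	// Final pass: every input must have received a vector.
	for i, vec := range out {
		if vec == nil {
			return nil, fmt.Errorf("no embedding returned for input %d", i)
		}
	}
	return out, nil
}
```

Writing by index rather than appending in arrival order is what keeps the output aligned with the input when the API reorders items.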
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
returns exit 0.
- The full method set on `AliyunModel` still matches the
`ModelDriver` interface.
- Pattern parity with the existing SiliconFlow Encode implementation.
Closes #14646
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Implement the `rerank`, `embedding`, `balance`, and `checkConnection` methods for
`OpenRouter`.
2. Delete the `chat` method in `internal/entity/models/volcengine.go`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
CLI login failed after the API was updated.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### Related issues
Closes #14648
### What problem does this PR solve?
This PR fixes an authorization flaw in `POST /files/link-to-datasets`.
Before this change, the endpoint only checked whether the supplied
`file_ids` and `kb_ids` existed. It did not verify whether the
authenticated user was actually allowed to access those files or target
datasets. As a result, an authenticated user who knew valid IDs could
relink another user's files to arbitrary datasets.
This was especially risky because the relinking flow is state-changing:
the background worker removes existing file-document mappings and then
recreates documents under the attacker-supplied dataset IDs.
This change makes the route enforce the same permission model already
used by nearby file and document operations:
- each resolved file must pass `check_file_team_permission(...)`
- each target dataset must pass `check_kb_team_permission(...)`
- authorization is enforced before scheduling background relinking work
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Testing
- Added regression coverage in
`test/testcases/test_web_api/test_file_app/test_file2document_routes_unit.py`
- Covered:
- unauthorized file access is rejected
- unauthorized dataset access is rejected
- existing success path still returns immediately after scheduling
background work
- Attempted to run:
- `python -m pytest
test\\testcases\\test_web_api\\test_file_app\\test_file2document_routes_unit.py
-q`
- Local execution in this workspace is currently blocked by missing test
dependencies during bootstrap, including `ragflow_sdk`
---------
Co-authored-by: jony376 <jony376@gmail.com>
### What problem does this PR solve?
The Gitee AI Go driver shipped with a stub `Rerank` method that
returned `Rerank not implemented`, even though
`conf/models/gitee.json` already wires the rerank URL suffix at
`"rerank": "rerank"`. The same config did not list any
rerank model, so the picker had nothing to select.
So a Gitee tenant could not use BAAI/bge-reranker-v2-m3 as a reranker
through the Go layer today, even though the
infrastructure was one config entry and one method body away.
### What this PR includes
- `conf/models/gitee.json`: add `BAAI/bge-reranker-v2-m3` to the
`models` array.
- `internal/entity/models/gitee.go`: replace the `Rerank` stub with
a real implementation. Adds two small local types
that match the OpenAI-compatible `/rerank` shape already used by the
SiliconFlow and ZhipuAI drivers.
No factory change. No interface change.
### How the driver works
- Validate `apiConfig` and the API key, validate the model name,
resolve the region with a default fallback, build the
URL from `BaseURL[region] + URLSuffix.Rerank`.
- Use a per-call `context.WithTimeout(30s)` and
`http.NewRequestWithContext`, matching the pattern the
recently merged Aliyun Encode and the OpenAI driver already use.
- Send `{model, query, documents, top_n, return_documents:false}` in
the body.
- Parse `results[*].relevance_score` and copy each score into the
output slice indexed by `results[*].index`, so the
output order matches the input order even if the API returns items in a
different order.
- Empty input returns `[]float64{}` with no HTTP call.
- An out-of-range result index returns a clear error rather than
silently skipping the entry.
- Non-200 responses propagate the upstream status line and body.
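The per-call timeout pattern mentioned above might look like the sketch below. The helper name `newRerankRequest` and the constant `rerankTimeout` are hypothetical; only `context.WithTimeout`, `http.NewRequestWithContext`, and the 30-second value come from the description.

```go
package main

import (
	"bytes"
	"context"
	"net/http"
	"time"
)

// rerankTimeout is the per-call limit named in the PR description.
const rerankTimeout = 30 * time.Second

// newRerankRequest builds a POST request bound to a context that expires
// after rerankTimeout. The caller must invoke the returned cancel function
// once the response has been fully read.
func newRerankRequest(url, apiKey string, body []byte) (*http.Request, context.CancelFunc, error) {
	ctx, cancel := context.WithTimeout(context.Background(), rerankTimeout)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		cancel() // release the timer if the request could not be built
		return nil, nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, cancel, nil
}
```

Binding the deadline to the request rather than to a shared client keeps one slow rerank call from affecting timeouts chosen elsewhere.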
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
returns exit 0.
- The full method set on `GiteeModel` still matches the
`ModelDriver` interface.
- Pattern parity with the existing SiliconFlow Rerank and the recently
merged ZhipuAI Rerank (#14608).
Closes #14655
## Summary
Fixes file collision between different datasets when using Azure Blob
storage (SPN or SAS authentication).
## Bug
`azure_spn_conn.py` and `azure_sas_conn.py` ignored the `bucket` parameter
entirely, storing all files flat with just the filename. This caused
files with the same name from different datasets (knowledge bases) to
overwrite each other.
## Fix
Prepend `bucket/` as a path prefix in all methods (`put`,
`rm`, `get`, `obj_exist`, `get_presigned_url`, `health`) to match the behavior of
the MinIO and S3 implementations.
## Changes
- **rag/utils/azure_spn_conn.py**: Added {bucket}/ prefix to file paths
in all operations
- **rag/utils/azure_sas_conn.py**: Same fix applied for consistency
(also noted in the original issue)
## Testing
Manual verification: files from different datasets now stored under
distinct bucket/ prefixes, preventing collisions.
Fixes #14159
Co-authored-by: Hunter <hunter@yitong.ai>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The NVIDIA Go driver added in #14623 has a real chat path, but
`ListModels` and `CheckConnection` are stubs that always return
`no such method`. So:
- The model picker cannot auto-populate available NVIDIA NIM model ids.
Users have to type the full id by hand (e.g.
`abacusai/dracarys-llama-3.1-70b-instruct`).
- The "Check connection" button always fails for NVIDIA, even when the
base URL is reachable and the API key is accepted.
NVIDIA NIM is OpenAI-compatible. `/v1/models` works with the same
Bearer token used for chat. The
`conf/models/nvidia.json` file already wires the `models`
url_suffix, so no config change is needed.
### What this PR includes
- `internal/entity/models/nvidia.go`:
  - `ListModels` now calls
`GET ${BaseURL}/${URLSuffix.Models}`, parses
`response.data[*].id`, and returns the list. Same shape
as the moonshot, xai, and openai drivers.
  - `CheckConnection` now calls `ListModels` and returns its
error. Same pattern xai, moonshot, deepseek, aliyun, and
gitee already use.
`Balance`, `Encode`, and `Rerank` are still stubs in this PR and
can be added in follow-ups.
No JSON change. No factory change. No interface change.
### How the implementation works
- Region resolution falls back to `default` when the supplied region
is unknown, so a stray region value does not break a valid request.
- The Authorization header is only set when `apiConfig` and `ApiKey`
are non-nil and non-empty. This avoids a nil-pointer dereference and
lets self-hosted NIM deployments without a key still work.
- Non-200 responses propagate the upstream status line and body so the
user sees a real error message.
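The conditional Authorization header can be sketched like this. The struct `apiConfig` is a hypothetical stand-in for the driver's real config type (only the `ApiKey` field name comes from the description), and `newModelsRequest` is an illustrative helper, not the driver's function.

```go
package main

import "net/http"

// apiConfig is a hypothetical stand-in for the driver's config type.
type apiConfig struct {
	ApiKey *string
}

// newModelsRequest builds the GET request for the models endpoint and sets
// the Authorization header only when a non-empty API key is available.
func newModelsRequest(url string, cfg *apiConfig) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	// Guard every pointer before dereferencing: cfg or cfg.ApiKey may be nil
	// for self-hosted deployments that run without a key.
	if cfg != nil && cfg.ApiKey != nil && *cfg.ApiKey != "" {
		req.Header.Set("Authorization", "Bearer "+*cfg.ApiKey)
	}
	return req, nil
}
```

Skipping the header entirely, rather than sending `Bearer ` with an empty token, avoids a malformed credential reaching a keyless endpoint.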
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
(the go.mod minimum) returns exit 0.
- The full method set on `NvidiaModel` still matches the
`ModelDriver` interface.
- Pattern parity with the existing xai, moonshot, deepseek, aliyun,
gitee, and openai drivers.
Closes #14635
Closes #14631
### What problem does this PR solve?
The DeepSeek Go driver shipped with a stub `Balance` method that
returned `no such method`, even though DeepSeek exposes a public
`GET /user/balance` endpoint that works with the same Bearer token used for
chat.
So the "Balance" panel in the model provider UI always shows an error
for DeepSeek tenants, while it already works for Moonshot and Gitee.
This PR fills the gap.
### What this PR includes
- `conf/models/deepseek.json`: add `"balance": "user/balance"`
under `url_suffix` so the driver can build the URL from config the
same way the other endpoints do.
- `internal/entity/models/deepseek.go`: replace the `Balance` stub
with a real implementation. Adds a small local response type
`deepseekBalanceResponse` that matches the upstream shape.
No factory change. No interface change.
### How the driver works
- Validate `apiConfig` and the API key, resolve the region (with a
`default` fallback), and build the URL from
`BaseURL[region] + URLSuffix.Balance`.
- GET the URL with `Authorization: Bearer <api_key>`.
- Parse the upstream response:
```json
{
  "is_available": true,
  "balance_infos": [
    {"currency": "USD", "total_balance": "10.00", ...},
    {"currency": "CNY", "total_balance": "70.00", ...}
  ]
}
```
`total_balance` is a string in the upstream API, so the driver parses
it with `strconv.ParseFloat`.
- Return the first balance entry as
`{"balance": <float>, "currency": <string>}`, the same shape the Moonshot driver returns.
The UI can render it with no provider-specific code.
### Edge cases
- Missing or empty API key returns a clear local error before any HTTP
call.
- Empty `balance_infos` returns a clear "no balance info in response"
error rather than a zero-value silent success.
- Non-numeric `total_balance` returns a clear parse error.
- Non-200 responses propagate the upstream status line and body so the
user can see why the call failed.
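A minimal sketch of the parsing step and its edge cases follows. The type name `deepseekBalanceResponse` appears in the PR, but its field layout here is inferred from the JSON sample above, and `parseBalance` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
)

// deepseekBalanceResponse models the fields shown in the sample response.
// The field layout is an assumption based on that sample.
type deepseekBalanceResponse struct {
	IsAvailable  bool `json:"is_available"`
	BalanceInfos []struct {
		Currency     string `json:"currency"`
		TotalBalance string `json:"total_balance"`
	} `json:"balance_infos"`
}

// parseBalance returns the first balance entry as {"balance", "currency"}.
func parseBalance(body []byte) (map[string]interface{}, error) {
	var resp deepseekBalanceResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode balance response: %w", err)
	}
	if len(resp.BalanceInfos) == 0 {
		// Fail loudly instead of reporting a zero balance.
		return nil, errors.New("no balance info in response")
	}
	first := resp.BalanceInfos[0]
	// total_balance arrives as a string, so it must be parsed explicitly.
	total, err := strconv.ParseFloat(first.TotalBalance, 64)
	if err != nil {
		return nil, fmt.Errorf("parse total_balance %q: %w", first.TotalBalance, err)
	}
	return map[string]interface{}{"balance": total, "currency": first.Currency}, nil
}
```

Returning an error for an empty `balance_infos` list is what distinguishes "the account has no money" from "the API told us nothing".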
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
(the go.mod minimum) returns exit 0.
- The full method set on `DeepSeekModel` still matches the
`ModelDriver` interface.
- Pattern parity with the existing Moonshot and Gitee Balance
implementations.
### What problem does this PR solve?
1. **Implement `OpenRouter` Provider:** Fully support OpenRouter AI
models (e.g., `gemma`, `minimax`). Includes robust handling of
Server-Sent Events (SSE) streams, error event interception, and proper
parsing of both `reasoning_content` and standard `content`.
2. **Fix BaseURL Resolution Bug:** Fixed a critical edge case in region
configuration parsing. Added a strict empty string check
(`*apiConfig.Region != ""`) alongside the `nil` check. This ensures that
if the UI passes an empty string, the system correctly falls back to the
`"default"` region, preventing `unsupported protocol scheme ""` errors
during HTTP requests.
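The nil-plus-empty check can be illustrated with a small helper. This is a sketch under assumptions: `resolveBaseURL` is a hypothetical name, the base URLs are assumed to live in a `map[string]string` keyed by region, and the fallback for an unknown region is an extra assumption of this sketch rather than something this PR states.

```go
package main

// resolveBaseURL picks the base URL for a region, falling back to the
// "default" entry when the region pointer is nil, points at an empty
// string, or (an assumption of this sketch) names an unknown region.
func resolveBaseURL(baseURLs map[string]string, region *string) string {
	key := "default"
	// Checking only for nil is not enough: a UI may send "" explicitly,
	// which would otherwise look up a missing key and yield an empty URL.
	if region != nil && *region != "" {
		if _, ok := baseURLs[*region]; ok {
			key = *region
		}
	}
	return baseURLs[key]
}
```

An empty URL is what produces the `unsupported protocol scheme ""` error, since the HTTP client receives a request target with no scheme at all.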
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The SiliconFlow Go driver shipped with a stub `Balance` method that
returned `no such method`, even though SiliconFlow exposes a public
`GET /v1/user/info` endpoint that returns the account balance per
currency.
So the "Balance" panel in the model provider UI always shows an error
for SiliconFlow tenants, while it already works for
Moonshot and Gitee. This PR fills the gap.
### What this PR includes
- `conf/models/siliconflow.json`: add `"balance": "user/info"`
under `url_suffix` so the driver builds the URL from config.
- `internal/entity/models/siliconflow.go`: replace the `Balance`
stub with a real implementation. Adds a small local response type that
matches the upstream shape.
No factory change. No interface change.
### How the driver works
- Validate `apiConfig` and the API key, resolve the region with a
default fallback, and build the URL from
`BaseURL[region] + URLSuffix.Balance`.
- GET the URL with `Authorization: Bearer <api_key>`.
- Parse the upstream response. SiliconFlow returns balance fields as
strings, so the driver parses them with `strconv.ParseFloat`. It
prefers `totalBalance` over `balance` when both are present.
- Return `{"balance": <float>, "currency": "CNY"}`, the same
shape the Moonshot driver returns. The UI can render it
with no provider-specific code.
### Edge cases
- Missing or empty API key returns a clear local error before any HTTP
call.
- An unknown region falls back to the default base URL.
- Empty `balance` and `totalBalance` returns a clear "no balance
info in response" error rather than a zero-value silent success.
- Non-numeric balance string returns a clear parse error.
- Non-200 responses propagate the upstream status line and body.
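The preference for `totalBalance` over `balance` could be expressed as below. The helper `pickBalance` is a hypothetical name, and the sketch assumes both fields have already been extracted from the response as strings.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// pickBalance prefers totalBalance, falls back to balance, and reports an
// error when neither string carries a value or the value is not numeric.
func pickBalance(totalBalance, balance string) (float64, error) {
	raw := totalBalance
	if raw == "" {
		raw = balance
	}
	if raw == "" {
		return 0, errors.New("no balance info in response")
	}
	value, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, fmt.Errorf("parse balance %q: %w", raw, err)
	}
	return value, nil
}
```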
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image
returns exit 0.
- The full method set on `SiliconflowModel` still matches the
`ModelDriver` interface.
- Pattern parity with the existing Moonshot and Gitee Balance
implementations.
Closes #14642
### What problem does this PR solve?
Three Go drivers had `CheckConnection` returning a hardcoded `no such
method` error, even though each one already has a working `ListModels`
that hits the configured base URL with the configured API key. So the
"Check connection" button in the model provider UI always failed for
these three providers, even when the underlying setup was fine.
Affected drivers:
- `internal/entity/models/ollama.go`
- `internal/entity/models/lmstudio.go`
- `internal/entity/models/vllm.go`
This is a real user-facing gap because Ollama and LM Studio are two of
the most popular local LLM runners, and vLLM is widely used for
self-hosted deployments.
### What this PR includes
For each of the three drivers, replace the stub with a small
implementation that calls `ListModels` and returns its error:
```go
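// CheckConnection treats a successful ListModels call as proof that the
// base URL is reachable and the API key is accepted.
func (o *OllamaModel) CheckConnection(apiConfig *APIConfig) error {
	_, err := o.ListModels(apiConfig)
	return err
}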
```
This is the exact pattern that xai, moonshot, deepseek, aliyun, and
gitee already use for the same method.
No JSON change. No factory change. No interface change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image (the
go.mod minimum) returns exit 0.
- The full ModelDriver interface still resolves on each driver
(NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode,
Rerank, ListModels, Balance, CheckConnection).
- Pattern parity with the existing xai, moonshot, deepseek, aliyun, and
gitee CheckConnection methods.
Closes #14609
## Summary
Update the Turkish locale file to match the latest English locale keys.
## Changes
- Add missing Turkish translations for the new Skills and Skill Search
sections
- Add newly introduced common, header, dataset, settings, and agent
workflow strings
- Align renamed flow keys such as file format options and list
operations with the English source
- Add empty-state strings for skill spaces
## Validation
- Compared web/src/locales/en.ts and web/src/locales/tr.ts: 0 missing
keys, 0 extra keys
- Checked jsonjoy-builder locale: Turkish is already complete
- Checked translated README variants: no new Turkish-specific
documentation gap found
- VS Code diagnostics: no errors in web/src/locales/tr.ts
Co-authored-by: bakiburakogun <bakiburakogun@users.noreply.github.com>
### What problem does this PR solve?
Fixes #14412.
`common.metadata_utils.meta_filter` evaluates user-defined metadata
conditions in Python after `DocMetadataService.get_flatted_meta_by_kbs`
loads the entire `meta_fields` table into memory. Past a few thousand
documents per knowledge base this becomes a memory bottleneck and a
wasted ES round-trip — every filter request currently fetches up to
10000 metadata rows even when the resulting `doc_ids` list is tiny.
This PR adds an ES push-down path that translates the same filter
language into a `bool` query and returns just the matching document IDs.
**Changes**
- `common/metadata_es_filter.py` *(new)*: pure-Python translator from
the RAGFlow filter list to ES DSL. Covers every operator the in-memory
path supports (`=`, `≠`, `>`, `<`, `≥`, `≤`, `in`, `not in`, `contains`,
`not contains`, `start with`, `end with`, `empty`, `not empty`) with
`case_insensitive: true` on `prefix` and `wildcard` for parity with the
existing lower-cased Python comparisons. User wildcard metacharacters
are escaped before being injected into `wildcard` patterns. Negative
operators (`≠`, `not in`, `not contains`, ranges) are wrapped with an
`exists` guard so they do not accidentally match documents missing the
key, matching the legacy `if k not in metas` behaviour.
- `api/db/services/doc_metadata_service.py`: new
`DocMetadataService.filter_doc_ids_by_meta_pushdown(kb_ids, filters,
logic)` that returns the doc IDs ES matched, or `None` to signal the
caller should fall back to the in-memory path. Returns `None` when the
active doc store is Infinity (`meta_fields` is a JSON column, not a
dotted-object mapping), when any filter cannot be expressed in DSL
(`UnsupportedMetaFilter`), or when the ES request or metadata index
lookup errors.
- `common/metadata_utils.py`: `apply_meta_data_filter` accepts an
optional `kb_ids` argument. When supplied, conditions go through
push-down first via a new `_try_meta_pushdown` helper; on `None` the
function falls back to the original `meta_filter` call. Default
behaviour is unchanged for callers that don't pass `kb_ids`.
- Updated all four callers (`agent/tools/retrieval.py`,
`api/db/services/dialog_service.py` ×2,
`api/apps/services/dataset_api_service.py`, `api/apps/sdk/session.py`)
to forward `kb_ids` so the push-down path is exercised in production.
- `test/unit_test/common/test_metadata_es_filter.py` *(new)*: 35 unit
tests covering every operator's DSL shape, value coercion
(`ast.literal_eval`, lowercasing, ISO-date pass-through), wildcard
escaping, OR-logic wrapping that protects negative clauses, and the
doc-ID extractor.
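The translator itself is written in Python; the Go sketch below only illustrates the query shape described for a negative operator (`not contains`) with wildcard escaping and the `exists` guard. The function names and the example field path `meta_fields.author` are assumptions for illustration, not taken from the PR.

```go
package main

import "strings"

// escapeWildcard escapes the characters that are special in an
// Elasticsearch wildcard pattern: backslash, '*' and '?'.
func escapeWildcard(s string) string {
	return strings.NewReplacer(`\`, `\\`, `*`, `\*`, `?`, `\?`).Replace(s)
}

// notContainsClause builds a bool query that matches documents where the
// field exists but does not contain the value (case-insensitively).
// The exists guard keeps documents that lack the key out of the result.
func notContainsClause(field, value string) map[string]interface{} {
	pattern := "*" + escapeWildcard(value) + "*"
	return map[string]interface{}{
		"bool": map[string]interface{}{
			"must": []interface{}{
				map[string]interface{}{
					"exists": map[string]interface{}{"field": field},
				},
			},
			"must_not": []interface{}{
				map[string]interface{}{
					"wildcard": map[string]interface{}{
						field: map[string]interface{}{
							"value":            pattern,
							"case_insensitive": true,
						},
					},
				},
			},
		},
	}
}
```

Without the `must` + `exists` pair, a bare `must_not` would also match every document that has no value for the key, which differs from the in-memory behaviour described above.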
**Behaviour preserved**
- The in-memory `meta_filter` is untouched and still services every
fallback case (Infinity backend, unknown operators, ES outages).
- The eligibility / credibility / issue-multiplier semantics described
in the LLM-driven `auto` and `semi_auto` modes still hand the LLM the
full in-memory `metas` dict to choose conditions from. Only the
*evaluation* of those generated conditions is pushed down.
- Existing tests in
`test/unit_test/common/test_metadata_filter_operators.py` continue to
pass (14/14).
**Test plan**
- `pytest test/unit_test/common/test_metadata_es_filter.py` — 35 passed.
- `pytest test/unit_test/common/test_metadata_filter_operators.py` — 14
passed.
- `ruff check` clean on every modified file.
- Reviewer please validate the ES query shapes against a live cluster —
particularly `case_insensitive` on `wildcard` and `prefix` (requires ES
7.10+) and the `exists` + `must_not` pairing for `≠`.
**Notes**
- The first cut caps each push-down request at 10000 results, matching
the existing `get_flatted_meta_by_kbs` limit, and logs a warning when
the cap is hit. A `search_after` follow-up would let us drop the cap
entirely once the push-down path is validated.
- Operator parity with the in-memory path is exact for the canonical
unicode operators (`≥`, `≤`, `≠`) used internally; the ASCII aliases
(`>=`, `<=`, `!=`) are normalised by `convert_conditions` before they
reach the translator.
### Type of change
- [x] Performance Improvement
---------
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
### What problem does this PR solve?
Feat: support local provider for code exec component & remove some
outdated models
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
Fixes #14159: files from different datasets can overwrite each other in
Azure Blob storage.
## Problem
Both `azure_spn_conn.py` and `azure_sas_conn.py` ignore the `bucket`
parameter in all storage operations (`put`, `get`, `rm`, `obj_exist`,
`get_presigned_url`). Files are stored flat using only the filename, so
two datasets containing a file with the same name will overwrite each
other.
The MinIO and S3 implementations correctly use the bucket (typically the
knowledge base ID) as a path prefix to create logical folder isolation:
- MinIO: uses `use_prefix_path` decorator → `{orig_bucket}/{fnm}`
- S3: uses `use_prefix_path` decorator → `{prefix_path}/{bucket}/{fnm}`
## Fix
Prepend `{bucket}/` to the file path in all 5 operations across both
Azure connector files:
| File | Methods fixed |
|------|---------------|
| `azure_spn_conn.py` | `put`, `get`, `rm`, `obj_exist`,
`get_presigned_url` |
| `azure_sas_conn.py` | `put`, `get`, `rm`, `obj_exist`,
`get_presigned_url` |
This matches the existing convention where `bucket` is the knowledge
base ID used as a directory prefix.
## ⚠️ Migration Note
Existing Azure SPN/SAS deployments have files stored without the bucket
prefix. After this fix, new files will be stored under
`{bucket}/{filename}` while existing files remain at `{filename}`. A
one-time migration script or manual file move may be needed for existing
deployments. New deployments are unaffected.
## Testing
- Verified the fix is consistent across all 5 methods in both files
- The `health()` method is intentionally left unchanged as it uses a
hardcoded test filename without bucket semantics
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The ZhipuAI Go driver had a stub Rerank method that returned "not
implemented", even though conf/models/zhipu-ai.json already ships
glm-rerank as a rerank model and the rerank URL suffix is already wired
in url_suffix:
```json
"url_suffix": {
...
"rerank": "rerank"
},
"models": [
{"name": "glm-rerank", "model_types": ["rerank"]},
...
]
```
So the config was ready but the driver was not. A tenant who picked
glm-rerank in the Go layer could not actually run a rerank call. This PR
fills the gap so the listed model works end to end.
### What this PR includes
- `internal/entity/models/zhipu-ai.go`: real implementation of
`ZhipuAIModel.Rerank`, plus two small local types (`zhipuRerankRequest`,
`zhipuRerankResponse`) that mirror the standard OpenAI-compatible rerank
shape used by SiliconFlow.
No factory change. No JSON change. No interface change.
### How the driver works
- POST to `${BaseURL}/${URLSuffix.Rerank}` (resolves to
`https://open.bigmodel.cn/api/paas/v4/rerank` with the default config),
reusing the existing httpClient on the driver.
- Validate apiConfig and the API key, validate the model name, and
resolve the region. Return a clear local error before any HTTP call when
something is missing.
- Send `{model, query, documents, top_n, return_documents: false}` in
the body, the same shape the SiliconFlow driver already uses.
- Walk `results[*].relevance_score` and copy each score into the output
slice indexed by `results[*].index`, so the output order matches the
input order even if the API returns results in a different order.
- Empty `texts` input returns an empty `[]float64` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
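The request body and the index-based score placement can be sketched as follows. The type names `zhipuRerankRequest` and `zhipuRerankResponse` appear in the PR, but their field layouts here are inferred from the body shape listed above; the helper names, the choice of `top_n = len(texts)`, and the error on an out-of-range index are assumptions of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// zhipuRerankRequest follows the {model, query, documents, top_n,
// return_documents} body; the field layout is an assumption.
type zhipuRerankRequest struct {
	Model           string   `json:"model"`
	Query           string   `json:"query"`
	Documents       []string `json:"documents"`
	TopN            int      `json:"top_n"`
	ReturnDocuments bool     `json:"return_documents"`
}

// zhipuRerankResponse holds only the fields the driver reads.
type zhipuRerankResponse struct {
	Results []struct {
		Index          int     `json:"index"`
		RelevanceScore float64 `json:"relevance_score"`
	} `json:"results"`
}

// buildRerankBody serializes the request, asking for a score per document.
func buildRerankBody(model, query string, texts []string) ([]byte, error) {
	return json.Marshal(zhipuRerankRequest{
		Model:     model,
		Query:     query,
		Documents: texts,
		TopN:      len(texts), // assumption: request scores for all inputs
	})
}

// scoresInInputOrder writes each score at the position named by its index.
func scoresInInputOrder(body []byte, n int) ([]float64, error) {
	var resp zhipuRerankResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode rerank response: %w", err)
	}
	scores := make([]float64, n)
	for _, r := range resp.Results {
		if r.Index < 0 || r.Index >= n {
			return nil, fmt.Errorf("rerank index %d out of range [0,%d)", r.Index, n)
		}
		scores[r.Index] = r.RelevanceScore
	}
	return scores, nil
}
```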
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- `go build ./internal/entity/models/...` in a clean go 1.25 image (the
go.mod minimum) returns exit 0.
- The full method set on `ZhipuAIModel` still matches the `ModelDriver`
interface (NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender,
Encode, ListModels, Balance, CheckConnection, Rerank).
- Pattern parity with the existing SiliconFlow Rerank implementation
(`internal/entity/models/siliconflow.go`).
Closes #14607
Fixes #14551
### What problem does this PR solve?
The Moodle connector did not let the sync runner clean up indexed
documents that were deleted from the source. Other connectors such as
dropbox, seafile, webdav, and rss already do this through a slim
snapshot pass. This PR adds the same support for Moodle.
When `sync_deleted_files` is on, the runner now asks the Moodle
connector for a lightweight list of every module id that could be
indexed. The runner then compares this list with the index and removes
any indexed document whose id is not in the list.
The slim pass does not download files. It only goes through courses and
modules and yields ids. The id format matches the ids that the loader
produces, so the match is exact.
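The cleanup comparison amounts to a set difference between indexed ids and the slim snapshot. The connector and runner are Python; the Go sketch below only illustrates the comparison, and `staleDocIDs` is a hypothetical name.

```go
package main

// staleDocIDs returns the indexed ids that no longer appear in the slim
// snapshot, preserving the order in which they were indexed.
func staleDocIDs(indexed, snapshot []string) []string {
	live := make(map[string]struct{}, len(snapshot))
	for _, id := range snapshot {
		live[id] = struct{}{}
	}
	var stale []string
	for _, id := range indexed {
		if _, ok := live[id]; !ok {
			stale = append(stale, id)
		}
	}
	return stale
}
```

Because the comparison is an exact string match, it only works if the slim pass and the loader format ids identically, which is why the id format is kept the same.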
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Notes
- `MoodleConnector` now also implements `SlimConnectorWithPermSync`.
- New `retrieve_all_slim_docs_perm_sync` yields slim docs with the same
ids the loader uses (`moodle_resource_<id>`, `moodle_forum_<id>`,
`moodle_page_<id>`, `moodle_book_<id>`, `moodle_assign_<id>`,
`moodle_quiz_<id>`).
- The `Moodle` sync class now returns `(document_generator, file_list)`
so the runner can do the cleanup. If the slim snapshot fails,
`file_list` is set back to `None` and the run continues without cleanup.
- The web data source map exposes `syncDeletedFiles` for Moodle so the
option shows up in the UI.
### How was this tested?
- `ruff check` passes on the changed Python files.
- Manual review of the produced slim ids against the ids the loader
builds in `_process_resource`, `_process_forum`, `_process_page`,
`_process_book`, and `_process_activity`.
- Behavior parity with the merged dropbox (#14476), seafile (#14499),
webdav (#14491), and rss (#14493) PRs.
### What problem does this PR solve?
The secret key get and set logic was updated, so the Go server needs
the same update.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Fixes #14159
## Problem
The `put()`, `get()`, `rm()`, and `obj_exist()` methods in both
`azure_spn_conn.py` and `azure_sas_conn.py` ignore the `bucket`
parameter entirely, storing all files flat using only the filename. This
causes files from different datasets to overwrite each other when they
share the same filename.
By contrast, the MinIO and S3 implementations correctly use the bucket
(typically the knowledge base ID) as a path prefix, creating logical
folder isolation like `{kb_id}/{filename}`.
## Solution
Prepend the `bucket` parameter as a path prefix to all file operations
in both Azure storage implementations:
- `azure_spn_conn.py`: `create_file`, `delete_file`, `get_file_client`
now use `f"{bucket}/{fnm}"`
- `azure_sas_conn.py`: `upload_blob`, `delete_blob`, `download_blob`,
`get_blob_client` now use `f"{bucket}/{fnm}"`
This matches the behavior of all other storage backends (MinIO, S3) and
prevents filename collisions across knowledge bases.
## Testing
- Verified the fix aligns with how MinIO/S3 connectors handle the bucket
parameter
- The `health()` method is left unchanged as it uses a fixed test path
for connectivity checks only
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
preserve doc generator download metadata
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Follow-up to https://github.com/infiniflow/ragflow/pull/14602,
fixing the issue that team members could not edit an agent.
New behavior: team members are allowed every operation except delete.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. **Implement `Nvidia` Provider:** Fully support NVIDIA NIM APIs with
robust parameter handling (including the `thinking` parameter) and safe
URL merging in `NewInstance`.
2. **Fix Misleading CLI Errors:** Corrected a bug in `common_command.go`
where failed chat requests inaccurately reported `failed to list
instance models`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: enable sync deleted files for RDBMS & fix remove last file issue
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a Go driver for OpenAI (GPT models).
The config file conf/models/openai.json has been in the repo for a while
with the full GPT-5 model list, but
internal/entity/models/factory.go had no case for "openai". So any
tenant that configured OpenAI as a model provider in the Go layer fell
through to the default branch and got the dummy driver. Chat, list
models, and check connection all returned dummy responses instead of
reaching the API.
OpenAI is the most commonly requested provider and the JSON config
already ships with the repo, so this gap is high impact even though the
JSON has been there for some time.
### What this PR includes
- New file internal/entity/models/openai.go with an OpenAIModel that
implements the ModelDriver interface.
- factory.go: route the "openai" provider name to NewOpenAIModel.
- conf/models/openai.json: add "models": "models" under url_suffix so
ListModels can hit /v1/models with no hardcoded fallback.
### How the driver works
- OpenAI exposes the canonical OpenAI-compatible API at
https://api.openai.com/v1.
- ChatWithMessages and ChatStreamlyWithSender post to /chat/completions
in the same shape the moonshot, vllm, and xai drivers use.
- ListModels and CheckConnection call /models to list available ids and
confirm the API key works.
- reasoning_content is passed through for the o-series and other
reasoning models, in both the non-stream and stream paths.
- Encode (embeddings) is left as "not implemented" for now, the same way
the other recent provider drivers do it. Rerank and Balance are not part
of OpenAI's public API surface in this layer and return a clear "not
implemented" or "no such method" error.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- go build ./internal/entity/models/... in a clean go 1.25 image (the
go.mod minimum) returns exit 0 with no errors.
- Method set of OpenAIModel matches the ModelDriver interface:
NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode,
Rerank, ListModels, Balance, CheckConnection.
- Pattern parity with the merged moonshot (#14433), volcengine (#14460),
minimax (#14478), vllm (#14532), xai (#14550), and lm-studio (#14586)
PRs.
Closes#14604
## Summary
- Add MiniMax provider GroupId query parameter support in `LiteLLMBase`
- Extract `group_id` from key configuration in `__init__`
- Append `GroupId` as query parameter to `api_base` in
`_construct_complete_args`
## Why this change is needed
MiniMax provides an OpenAI-compatible API endpoint
(`/v1/chat/completions`), but `GroupId` is a MiniMax-specific account
identifier required for billing and rate limiting - it is not part of
the OpenAI standard.
Looking at LiteLLM's `MinimaxChatConfig`:
- `get_complete_url()` only constructs the base URL (e.g.,
`https://api.minimaxi.com/v1/chat/completions`)
- LiteLLM does **not** automatically inject `GroupId` into requests
- This must be handled by the caller (ragflow's chat_model.py)
The implementation appends `GroupId` as a query parameter to `api_base`:
```python
api_base = completion_args.get("api_base", self.base_url)
separator = "&" if "?" in api_base else "?"
completion_args["api_base"] = f"{api_base}{separator}GroupId={self.group_id}"
```
This matches MiniMax's official API format (as documented by
LlamaFactory):
```bash
curl --location 'https://api.minimaxi.chat/v1/text/chatcompletion?GroupId=YOUR_GROUP_ID' \
--header 'Authorization: Bearer YOUR_API_KEY'
```
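As a standalone illustration of the separator logic above, here is a minimal sketch. The helper name `append_group_id` is hypothetical, and the URL-encoding of the value is an addition that the snippet in this PR does not perform.

```python
from urllib.parse import quote


def append_group_id(api_base: str, group_id: str) -> str:
    """Append GroupId as a query parameter (hypothetical helper, not the PR's code).

    Uses "&" when the URL already carries a query string, "?" otherwise.
    Percent-encoding the value is an assumption added for safety.
    """
    separator = "&" if "?" in api_base else "?"
    return f"{api_base}{separator}GroupId={quote(group_id, safe='')}"


if __name__ == "__main__":
    print(append_group_id("https://api.minimaxi.com/v1", "12345"))
    print(append_group_id("https://api.minimaxi.com/v1?foo=bar", "12345"))
```

Keeping the logic in one small function makes the "existing query string" case easy to cover in a unit test.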
## Test plan
- [ ] Verify MiniMax API calls work with GroupId query parameter
- [ ] Verify backward compatibility for other providers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
### What problem does this PR solve?
Bump to infinity v0.7.0-dev6
(uv lock --upgrade-package infinity-sdk)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Consider two API servers, A and B, that share one Redis server.
If A and Redis restart, B still holds the obsolete secret key, which
leads to errors.
TODO:
`app.config['SECRET_KEY']` and `app.secret_key` still hold the obsolete
secret key.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
support non-stream runtime agent completion
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
implement `lm-studio` provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Fixes #14562
## Problem
LLMs like DeepSeek V4 Flash and Qwen3-MAX return \\( and \\[
(double backslash) in LaTeX output. The preprocessLaTeX() function
only handled single backslash delimiters, so equations showed as raw
text.
HTML entities like `&lt;` and `&gt;` were also not decoded.
## Solution
Added normalization step before existing delimiter conversion:
- `\\(` → `\(` and `\\[` → `\[`
- `&lt;` → `<`, `&gt;` → `>`, and `&amp;` → `&`
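The normalization step can be illustrated outside the frontend with a short Python sketch. This is not the `preprocessLaTeX()` code itself: the function name `normalize_latex_delimiters` is hypothetical, the order of the two steps is an assumption, and collapsing the closing delimiters `\\)` and `\\]` as well is also an assumption.

```python
import html
import re


def normalize_latex_delimiters(text: str) -> str:
    """Hypothetical Python rendering of the normalization idea described above."""
    # Decode HTML entities such as &lt;, &gt; and &amp; (one pass only).
    text = html.unescape(text)
    # Collapse a doubled backslash in front of ( ) [ ] into a single one,
    # so the existing single-backslash delimiter handling can take over.
    return re.sub(r"\\\\([()\[\]])", r"\\\1", text)


if __name__ == "__main__":
    print(normalize_latex_delimiters("\\\\(a &lt; b\\\\)"))
```

Text that already uses single-backslash delimiters passes through unchanged, because the pattern only matches two consecutive backslashes.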
---------
Co-authored-by: Vivek <viveksantoshkumardubey@email.com>
### What problem does this PR solve?
add file convert backward compatibility
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
This PR addresses three related GraphRAG reliability issues that
together allow long-running GraphRAG tasks (10+ hours of LLM extraction)
to be resumed after a crash or pause without re-doing completed work. It
builds on #14096 (per-doc subgraph cache) and extends the same idea to
the resolution and community-detection phases.
Fixes #14236.
## 1. Fix concurrent merge crash
Long GraphRAG runs would crash near the end of entity resolution with:
```
RuntimeError: dictionary keys changed during iteration
```
in `Extractor._merge_graph_nodes`. Two changes:
- `rag/graphrag/general/extractor.py`: snapshot `graph.neighbors(node1)`
via `list(...)` before iterating, so concurrent `add_edge` /
`remove_node` mutations on the shared `nx.Graph` cannot invalidate the
iterator. Also tracks each redirected neighbour in `node0_neighbors` so
a later merged node sharing the same external neighbour takes the
edge-merge branch instead of overwriting via `add_edge`.
- `rag/graphrag/entity_resolution.py`: serialize the merge step with a
dedicated `asyncio.Semaphore(1)`. `nx.Graph` is not thread-safe and
concurrent merges on overlapping neighbourhoods can produce incorrect
results even with the snapshot fix.
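The snapshot idea can be sketched without `networkx`. The example below is an assumption-laden stand-in: it uses a plain dict-of-dicts adjacency map instead of `nx.Graph`, and the names `add_edge` and `merge_node` are hypothetical, not the code in `extractor.py`.

```python
Adjacency = dict[str, dict[str, dict]]


def add_edge(adj: Adjacency, a: str, b: str, weight: float) -> None:
    """Store one shared attribute dict for both directions of an undirected edge."""
    attrs = {"weight": weight}
    adj.setdefault(a, {})[b] = attrs
    adj.setdefault(b, {})[a] = attrs


def merge_node(adj: Adjacency, keep: str, drop: str) -> None:
    """Merge `drop` into `keep`, redirecting or merging its edges."""
    # list(...) snapshots the neighbours; iterating adj[drop] directly would
    # raise RuntimeError because the loop body removes entries from it.
    for neighbor in list(adj[drop]):
        attrs = adj[drop].pop(neighbor)
        del adj[neighbor][drop]
        if neighbor == keep:
            continue
        if neighbor in adj[keep]:
            # Edge-merge branch: combine weights instead of overwriting.
            adj[keep][neighbor]["weight"] += attrs["weight"]
        else:
            add_edge(adj, keep, neighbor, attrs["weight"])
    del adj[drop]


if __name__ == "__main__":
    graph: Adjacency = {}
    add_edge(graph, "a", "b", 1.0)
    add_edge(graph, "b", "c", 2.0)
    add_edge(graph, "a", "c", 3.0)
    merge_node(graph, "a", "b")
    print(graph["a"]["c"]["weight"])
```

The snapshot only protects the iterator. It does not make concurrent merges on overlapping neighbourhoods correct, which is why the PR also serializes the merge step.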
## 2. Don't wipe partial graph on pause
Previously the pause / cancel UI path called
`settings.docStoreConn.delete({"knowledge_graph_kwd": [...]}, ...)`,
destroying every subgraph, entity, relation, and graph row.
Re-triggering then started GraphRAG from scratch even though #14096 had
already added `load_subgraph_from_store`.
After main was merged in (which deleted `api/apps/kb_app.py` per
#14394), the pause path now lives on the new REST surface `DELETE
/v1/datasets/<id>/<index_type>`:
- `api/apps/services/dataset_api_service.py`: `delete_index` accepts a
`wipe: bool = True` parameter. When `False` the doc-store rows and
GraphRAG phase markers are left intact and only the running task is
cancelled. Default preserves historical behaviour.
- `api/apps/restful_apis/dataset_api.py`: parses `?wipe=false|0|no|off`
from the query string and forwards it.
- `web/src/utils/api.ts` + `web/src/services/knowledge-service.ts`:
`unbindPipelineTask` appends `?wipe=false` when explicitly false.
- The GraphRAG pause action in
`web/src/pages/dataset/dataset/generate-button/hook.ts` passes `wipe:
false` for `KnowledgeGraph`; raptor is unchanged.
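A minimal sketch of how the `wipe` query value could be interpreted, assuming the four falsy spellings listed above. The function name `parse_wipe` is hypothetical, and treating the comparison as case-insensitive is an assumption.

```python
from typing import Optional

# Spellings that disable wiping, taken from the description above.
_FALSE_VALUES = {"false", "0", "no", "off"}


def parse_wipe(raw: Optional[str]) -> bool:
    """Return False only for an explicit falsy value; default to True.

    Defaulting to True keeps the historical behaviour when the
    parameter is absent or unrecognized.
    """
    if raw is None:
        return True
    return raw.strip().lower() not in _FALSE_VALUES


if __name__ == "__main__":
    for value in (None, "false", "0", "no", "off", "true", ""):
        print(repr(value), parse_wipe(value))
```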
**UX impact:** the pause icon next to a running GraphRAG task no longer
wipes graph data. The only path that still wipes is the explicit Delete
action in `GenerateLogButton` (trash icon behind a confirmation modal).
## 3. Phase-completion markers (`rag/graphrag/phase_markers.py`)
A small Redis-backed marker layer at
`graphrag:phase:{kb_id}:{resolution_done|community_done}` (7-day TTL).
`run_graphrag_for_kb` consults the markers on entry and skips phases
that already completed in a prior run. Markers are cleared automatically
when:
- new docs are merged into the graph (which invalidates prior resolution
and community results),
- `delete_index` wipes the graph, or
- `delete_knowledge_graph` is called.
Redis failures never block a run -- markers are an optimization, not a
gate.
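The "optimization, not a gate" behaviour can be sketched as follows. This is not the contents of `phase_markers.py`: the class and function names are hypothetical, and the client is assumed to expose redis-py style `set(name, value, ex=...)`, `exists(*names)`, and `delete(*names)` calls. An in-memory stand-in is included so the example runs without Redis.

```python
import logging

PHASES = ("resolution_done", "community_done")
TTL_SECONDS = 7 * 24 * 3600  # 7-day TTL, as described above


def marker_key(kb_id: str, phase: str) -> str:
    return f"graphrag:phase:{kb_id}:{phase}"


class PhaseMarkers:
    """Best-effort markers: client errors are logged and read as 'not set'."""

    def __init__(self, client):
        self.client = client

    def mark(self, kb_id: str, phase: str) -> bool:
        try:
            self.client.set(marker_key(kb_id, phase), "1", ex=TTL_SECONDS)
            return True
        except Exception:
            logging.warning("could not set marker %s for %s", phase, kb_id)
            return False

    def has(self, kb_id: str, phase: str) -> bool:
        try:
            return bool(self.client.exists(marker_key(kb_id, phase)))
        except Exception:
            return False  # a failed lookup simply means the phase is re-run

    def clear(self, kb_id: str) -> None:
        try:
            self.client.delete(*(marker_key(kb_id, p) for p in PHASES))
        except Exception:
            logging.warning("could not clear markers for %s", kb_id)


class InMemoryClient:
    """Stand-in for a Redis client, for illustration only (ignores TTL)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, ex=None):
        self.data[key] = value

    def exists(self, *keys):
        return sum(key in self.data for key in keys)

    def delete(self, *keys):
        for key in keys:
            self.data.pop(key, None)
```

Returning `False` from `has()` on any error means the worst case is redoing a phase, never skipping one that did not finish.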
## 4. Idempotent community detection
`extract_community` previously did `delete-then-insert` on
`community_report` rows; a crash mid-insert left the dataset with no
reports. Now report IDs are derived deterministically from `(kb_id,
community.title)`, the existing report IDs are snapshotted before
insert, new rows are written, then only stale rows are pruned. A failure
at any step leaves either the prior or the new report set intact --
never a partial mix.
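The write-then-prune ordering can be sketched with a plain dict standing in for the doc store. Everything here is illustrative: `report_id` and `replace_reports` are hypothetical names, and the SHA-1 hash is an assumption, since the description only says the ID is derived deterministically from `(kb_id, community.title)`.

```python
import hashlib


def report_id(kb_id: str, title: str) -> str:
    """Deterministic ID: the same (kb_id, title) always maps to the same row."""
    return hashlib.sha1(f"{kb_id}:{title}".encode("utf-8")).hexdigest()


def replace_reports(store: dict, kb_id: str, reports: dict) -> set:
    """Upsert new reports first, then prune only rows that became stale."""
    # 1. Snapshot the IDs that exist before writing anything.
    existing_ids = {rid for rid, row in store.items() if row["kb_id"] == kb_id}

    # 2. Write new rows; a re-run overwrites the same IDs instead of duplicating.
    new_ids = set()
    for title, content in reports.items():
        rid = report_id(kb_id, title)
        store[rid] = {"kb_id": kb_id, "title": title, "content": content}
        new_ids.add(rid)

    # 3. Only now remove rows that are no longer produced.
    for stale in existing_ids - new_ids:
        del store[stale]
    return new_ids


if __name__ == "__main__":
    db = {}
    replace_reports(db, "kb1", {"Community A": "v1", "Community B": "v1"})
    replace_reports(db, "kb1", {"Community A": "v2", "Community C": "v1"})
    print(sorted(row["title"] for row in db.values()))
```

In this sketch a failure before step 3 leaves every prior row in place, so the dataset is never left with no reports at all.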
## 5. Tunable doc-store insert pipeline
The GraphRAG insert loop in `rag/graphrag/utils.py` and the
`community_report` insert in `rag/graphrag/general/index.py` were both
hardcoded to `es_bulk_size = 4` and ran strictly sequentially. On a real
KB this meant 1077 chunks took ~21 minutes for a 100-chunk slice -- pure
round-trip overhead.
- New `insert_chunks_bounded()` helper in `rag/graphrag/utils.py`
batches inserts via a bounded `asyncio.Semaphore`. Same retry / timeout
semantics as the prior loop.
- Defaults: 64 docs per batch, 4 batches in flight (matches the regular
ingest pipeline in `document_service.py`). Tunable per-deployment via
`GRAPHRAG_INSERT_BULK_SIZE` and `GRAPHRAG_INSERT_CONCURRENCY`.
- Both `set_graph` and `extract_community` now use the helper.
This dropped the same 1077-chunk insert from minutes to seconds in local
testing without measurable extra pressure on Infinity (total in-flight
docs ≤ `BULK_SIZE × CONCURRENCY` = 256 by default).
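A rough sketch of bounded concurrent batching, under stated assumptions: the function name `insert_bounded_sketch` and the `insert_batch` callback are hypothetical, and the retry and timeout handling that the real `insert_chunks_bounded()` keeps is omitted here.

```python
import asyncio
import os

# Defaults described above; the environment variable names come from this PR.
BULK_SIZE = int(os.environ.get("GRAPHRAG_INSERT_BULK_SIZE", "64"))
CONCURRENCY = int(os.environ.get("GRAPHRAG_INSERT_CONCURRENCY", "4"))


async def insert_bounded_sketch(chunks, insert_batch,
                                bulk_size=BULK_SIZE, concurrency=CONCURRENCY):
    """Insert `chunks` in batches with at most `concurrency` batches in flight.

    `insert_batch` is a caller-supplied coroutine function (hypothetical).
    Returns the number of batches submitted.
    """
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(batch):
        async with semaphore:  # caps the number of concurrent inserts
            await insert_batch(batch)

    batches = [chunks[i:i + bulk_size] for i in range(0, len(chunks), bulk_size)]
    await asyncio.gather(*(worker(batch) for batch in batches))
    return len(batches)


async def _demo():
    async def fake_insert(batch):
        await asyncio.sleep(0)  # stand-in for a doc-store round trip

    return await insert_bounded_sketch(list(range(1077)), fake_insert)


if __name__ == "__main__":
    print(asyncio.run(_demo()))
```

With the defaults, the semaphore keeps at most 4 × 64 = 256 documents in flight, which matches the bound quoted above.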
## Tests
- `test/unit_test/rag/graphrag/test_merge_graph_nodes.py` (3 tests):
dense neighbourhood merge, neighbour-snapshot regression, concurrent
serialized merges.
- `test/unit_test/rag/graphrag/test_phase_markers.py` (4 tests): set/has
round-trip, kb-scoped clear, no-op on empty input, graceful Redis
failure.
- `test/testcases/test_web_api/test_dataset_management/test_dataset_sdk_routes_unit.py`:
new `test_delete_index_wipe_flag_unit` covers `wipe=false` for both
GraphRAG and raptor on the new REST route, and confirms the default
still wipes and clears phase markers.
## Compatibility
- Backward compatible: tasks queued before this change behave
identically (default `wipe=true`, no markers expected).
- No schema/migration changes; all new state lives in Redis.
- New optional REST query param `wipe` on `DELETE
/v1/datasets/<id>/<index_type>`.
- New optional env vars `GRAPHRAG_INSERT_BULK_SIZE` and
`GRAPHRAG_INSERT_CONCURRENCY`; defaults preserve safe behaviour.
## Example of resume
Screenshot below shows a test resuming knowledge graph generation after
applying the concurrency fix and re-deploying.
<img width="521" height="677" alt="image"
src="https://github.com/user-attachments/assets/9ef0d405-cbb3-420d-a1a1-e51f3e7e9b7a"
/>
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR fixes a bug where `layout_recognize="<name>@OpenDataLoader"` was
misrouted and then failed during parsing in the naive parser path. It
now routes correctly to OpenDataLoader and avoids passing unsupported
arguments that caused runtime errors. Fixes #14572.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
This fixes a missing authorization check in the beta API document
download endpoint:
- **CWE:** CWE-862 (Missing Authorization)
- **Severity:** Medium
- **Affected route/file:** `GET /api/v1/documents/<document_id>` in
`api/apps/sdk/doc.py`
- **Data flow:** the route reads a bearer beta API token, resolves the
token with `APIToken.query(beta=token)`, accepts `document_id` directly
from the URL, loads the document with
`DocumentService.query(id=document_id)`, and then fetches the backing
object through `File2DocumentService.get_storage_address()` /
`settings.STORAGE_IMPL.get()`.
Before this change, that flow verified that the API token was valid, but
it did not verify that the token's tenant owned the document's knowledge
base. A caller with any valid beta API token and a known document ID
could therefore reach storage for a document belonging to another
tenant.
## Fix
The endpoint now takes the tenant ID from the resolved API token and
checks the document's knowledge base with:
```python
KnowledgebaseService.query(id=doc[0].kb_id, tenant_id=tenant_id)
```
If the knowledge base is not owned by the token tenant, the request
returns an access error before any storage lookup occurs. This mirrors
the tenant-scoped ownership checks used by the dataset-scoped document
download path and keeps the patch small.
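The ordering of checks can be sketched with in-memory stand-ins. This is not the code in `api/apps/sdk/doc.py`: the function `download_doc_sketch`, its parameters, and the dict-based lookups are all hypothetical, and only the access-denied message is taken from this PR.

```python
def download_doc_sketch(token, document_id, tokens, documents, knowledge_bases, storage):
    """Resolve token -> tenant, check KB ownership, and only then touch storage.

    `tokens` maps token -> tenant_id, `documents` maps id -> {"kb_id": ...},
    `knowledge_bases` maps id -> {"tenant_id": ...}; all are illustrative.
    """
    tenant_id = tokens.get(token)
    if tenant_id is None:
        return {"ok": False, "message": "Invalid API token."}

    doc = documents.get(document_id)
    if doc is None:
        return {"ok": False, "message": "Document not found."}

    # Tenant-scoped ownership check: the KB must belong to the token's tenant.
    kb = knowledge_bases.get(doc["kb_id"])
    if kb is None or kb["tenant_id"] != tenant_id:
        return {"ok": False, "message": "You do not have access to this document."}

    # Storage is reached only after authorization has succeeded.
    return {"ok": True, "data": storage.get(document_id)}
```

Placing the storage read last makes it straightforward to assert in tests that a cross-tenant request never reaches the storage layer.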
## Tests
Added unit coverage for `download_doc()` to assert that:
- the beta token tenant ID is used in the knowledge-base ownership
lookup;
- cross-tenant access returns `You do not have access to this
document.`;
- storage resolution is not called before tenant authorization succeeds;
- the existing same-tenant empty-file and successful-download paths
still run after the authorization gate passes.
I also verified the final patch is limited to `api/apps/sdk/doc.py` and
the related document SDK route unit test. A local `pytest` invocation
could not complete in this checkout because the shared test fixture
attempts to log in to a RAGFlow server at `127.0.0.1:9380`, which was
not running in the local environment.
## Security analysis
This is exploitable when an attacker has a valid beta API token for
their own tenant and obtains or guesses a document ID from another
tenant. The token alone should not grant access to other tenants' files,
but the direct document route previously authorized only the token
itself and not the requested resource. The new tenant-scoped
knowledge-base check binds the requested document back to the token
tenant before storage is accessed, preventing cross-tenant document
downloads through this endpoint.
Before submitting, we attempted to disprove this by checking whether
existing dataset-scoped routes, token validation, or framework
protections already enforced ownership. They do not apply to this direct
document-ID route: it bypassed the dataset path parameter and used only
`DocumentService.query(id=document_id)` before reading storage.
cc @lewiswigmore
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wiratama <dafa.wiratama@bankraya.co.id>
### What problem does this PR solve?
Fix #14340
## Problem Description
When using an **Agentic Agent** (not Workflow) with one or more
Retrieval tools (e.g., Dataset Retrieval + Memory Retrieval), the agent
silently returns an empty response (`agent_response: ""`) after hanging
for several minutes. The server logs show:
```
AttributeError: 'ChatCompletionMessageToolCall' object has no attribute 'index'
```
This error propagates as a `GENERIC_ERROR`, causing the canvas to return
an empty response. The subsequent Memory save task then receives the
empty `agent_response` and logs:
```
Document for referred_document_id XXXX not found
```
## Reproduction Steps
1. Set `DOC_ENGINE=infinity` (or `elasticsearch` — the engine itself is
not the root cause).
2. Create a blank **Agentic Agent** (not a Workflow).
3. Add **two Retrieval tools** to the Agent node:
- `Retrieval_DS` → Dataset (Knowledge Base)
- `Retrieval_Mem` → Memory component
4. Add a **Message** node with **Save to Memory** enabled.
5. Launch the agent and send any message (e.g., "hola").
6. The agent hangs and returns an empty response.
## Root Cause Analysis
The crash occurs in `_append_history` and `_append_history_batch` inside
`rag/llm/chat_model.py`. These methods directly access `.index` on tool
call objects:
```python
# _append_history_batch
{
"index": tc.index, # <-- crashes here
...
}
```
However, **non-streaming** LLM responses (`stream=False`) return
`ChatCompletionMessageToolCall` objects, which **do not have an `index`
field** according to the OpenAI API specification. The `index` field
only exists on `ChoiceDeltaToolCall` objects returned in **streaming**
responses (`stream=True`).
When the agentic agent triggers an internal `full_question` call (used
to compress multi-turn conversation history), the request is incorrectly
routed through `async_chat_with_tools` because `is_tools=True` is set at
the `LLMBundle` level. If the LLM decides to emit `tool_calls` during
this auxiliary request, the code enters the non-streaming tool loop and
crashes when trying to append history.
## Fix
Replaced all direct `.index` accesses with `getattr(..., "index", None)`
for safe, backward-compatible access:
| Method | File | Line | Change |
|--------|------|------|--------|
| `_append_history` | `rag/llm/chat_model.py` | ~L304 | `tool_call.index` → `getattr(tool_call, "index", None)` |
| `_append_history_batch` | `rag/llm/chat_model.py` | ~L332 | `tc.index` → `getattr(tc, "index", None)` |
| `_append_history` | `rag/llm/chat_model.py` | ~L1467 | `tool_call.index` → `getattr(tool_call, "index", None)` |
| `_append_history_batch` | `rag/llm/chat_model.py` | ~L1496 | `tc.index` → `getattr(tc, "index", None)` |
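The effect of the `getattr` fallback can be shown with two small stand-in classes. These dataclasses are hypothetical substitutes for the streaming and non-streaming tool-call objects, and `tool_call_to_history` is an illustrative helper, not the method in `chat_model.py`.

```python
from dataclasses import dataclass


@dataclass
class StreamingToolCall:
    """Stand-in for a streaming delta tool call, which carries an index."""
    id: str
    name: str
    index: int


@dataclass
class NonStreamingToolCall:
    """Stand-in for a non-streaming tool call, which has no index attribute."""
    id: str
    name: str


def tool_call_to_history(tool_call) -> dict:
    # getattr with a default never raises AttributeError for a missing field.
    return {
        "index": getattr(tool_call, "index", None),
        "id": tool_call.id,
        "name": tool_call.name,
    }


if __name__ == "__main__":
    print(tool_call_to_history(StreamingToolCall("call_1", "retrieval", 0)))
    print(tool_call_to_history(NonStreamingToolCall("call_2", "retrieval")))
```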
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: noob <yixiao121314@outlook.com>
## Summary
- normalize string items for list-valued metadata filters in
`meta_filter`
- fix `in` / `not in` case asymmetry when document metadata is
lowercased but filter list values are not
- add regression tests that cover the original issue scenario using
uppercase list values
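The asymmetry and its fix can be sketched as below. The helper names `_normalize` and `matches` are hypothetical and do not mirror the signature of `meta_filter`. The only behaviour assumed is that string values on both sides are lowercased before comparison.

```python
def _normalize(value):
    """Lowercase strings; leave non-string values (numbers, None) untouched."""
    return value.lower() if isinstance(value, str) else value


def matches(doc_value, operator: str, filter_values: list) -> bool:
    """Evaluate `in` / `not in` with the same normalization on both sides."""
    doc = _normalize(doc_value)
    normalized = [_normalize(item) for item in filter_values]
    if operator == "in":
        return doc in normalized
    if operator == "not in":
        return doc not in normalized
    raise ValueError(f"unsupported operator: {operator}")


if __name__ == "__main__":
    # Document metadata is stored lowercased; the filter list is uppercase.
    print(matches("f2", "in", ["F2", "F11"]))
    print(matches("f2", "not in", ["F2", "F11"]))
```

Without normalizing the list items, the first call would report no match and the second would wrongly report a match.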
## Validation
- `PYTHONPATH=external/ragflow pytest
external/ragflow/test/unit_test/common/test_metadata_filter_operators.py
-q`
## Notes
- I commented on #14389 before opening this PR to claim the issue.
- The new tests use `value=["F2", "F11"]` so they fail on the old
implementation and pass with this fix.
- This also benefits other non-comparison operators that flow through
the same normalization path.
Co-authored-by: copizza <copizza@users.noreply.github.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
### What problem does this PR solve?
add legacy agent completion API compatibility
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR fixes missing authorization checks in the Memory API.
Previously, several authenticated endpoints accepted caller-supplied
`tenant_id`, `owner_ids`, or `memory_id` values and used them directly
to list, read, update, delete, or search Memory data.
That could allow an authenticated user to access or mutate another
tenant's Memory records if they knew a tenant ID or memory ID. The fix
centralizes Memory access checks and applies them consistently across
Memory and Memory-message operations.
The change:
- Adds helper logic to parse list filters and compute tenant IDs
accessible to `current_user`.
- Requires direct `memory_id` operations to pass Memory access checks
before reading, updating, deleting, or changing message state.
- Filters list/search/recent-message requests to accessible memories
only.
- Applies Memory visibility filtering before count and pagination in
`MemoryService.get_by_filter`.
- Accepts `owner_ids` in the Memory list route, matching the frontend
owner filter while still intersecting values with the caller's
accessible tenants.
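The intersection step can be sketched as follows. Both helper names are hypothetical, and accepting a comma-separated string for the list filter is an assumption about the request format.

```python
def parse_list_filter(raw) -> list:
    """Accept None, a comma-separated string, or an iterable of values."""
    if raw is None:
        return []
    if isinstance(raw, str):
        return [part.strip() for part in raw.split(",") if part.strip()]
    return [str(item) for item in raw]


def effective_owner_ids(requested, accessible) -> list:
    """Never trust caller-supplied IDs: intersect them with accessible tenants.

    With no explicit filter, fall back to everything the caller may access.
    """
    accessible_set = set(accessible)
    wanted = parse_list_filter(requested)
    if not wanted:
        return sorted(accessible_set)
    return sorted(set(wanted) & accessible_set)


if __name__ == "__main__":
    print(effective_owner_ids("tenant-a,tenant-x", ["tenant-a", "tenant-b"]))
```

A request that names only tenants the caller cannot access yields an empty list, so the query returns nothing rather than another tenant's records.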
### Related issues
Closes #14534
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: jony376 <jony376@gmail.com>
### What problem does this PR solve?
add IMAP deleted document sync
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Incremental DingTalk AI Table (Notable) sync did not reconcile rows
removed on the remote side with documents already in the knowledge base.
This follows the coordinated datasource work in #14362 (“sync deleted
files”).
This PR adds a **full slim snapshot**
(`retrieve_all_slim_docs_perm_sync`) that lists **current record IDs for
all sheets** without building document blobs, using the same logical
document IDs as full ingest
(`dingtalk_ai_table:{table_id}:{sheet_id}:{record_id}`). When
**`sync_deleted_files`** is enabled on incremental runs,
`DingTalkAITable._generate` returns **`(document_generator,
file_list)`** so **`SyncBase`** can run
**`cleanup_stale_documents_for_task`** and remove KB rows that no longer
exist remotely.
Design notes:
- **`_document_id`** centralizes the ID string so slim snapshots and
**`_convert_record_to_document`** stay aligned with
**`hash128(doc.id)`** semantics used during ingestion/cleanup.
- **`end_ts`** is captured before building **`file_list`**, then
**`poll_source`** uses the same upper bound (consistent with other
Dropbox-style connectors).
- **`batch_size`** from connector config is coerced to a positive
**`int`** before constructing the connector.
- Slim snapshot failures are caught in **`_generate`**; **`file_list`**
is set to **`None`** so cleanup is skipped rather than running on
partial/error state.
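The relationship between the shared ID helper, the slim snapshot, and cleanup can be sketched as below. The function names are hypothetical, the ID format is the one quoted above, and the `hash128` step applied during real ingestion and cleanup is deliberately left out.

```python
from typing import Iterable, Optional, Set


def document_id(table_id: str, sheet_id: str, record_id: str) -> str:
    """Single place that builds the logical ID, so ingest and snapshot agree."""
    return f"dingtalk_ai_table:{table_id}:{sheet_id}:{record_id}"


def stale_document_ids(indexed_ids: Iterable[str],
                       file_list: Optional[Iterable[str]]) -> Set[str]:
    """IDs present in the knowledge base but absent from the remote snapshot.

    `file_list=None` means the snapshot failed, so nothing is reported as
    stale and cleanup is effectively skipped.
    """
    if file_list is None:
        return set()
    return set(indexed_ids) - set(file_list)


if __name__ == "__main__":
    indexed = [document_id("t1", "s1", r) for r in ("r1", "r2", "r3")]
    remote = [document_id("t1", "s1", r) for r in ("r1", "r3")]
    print(stale_document_ids(indexed, remote))
```

Distinguishing `None` (snapshot failed) from an empty list (remote is genuinely empty) is what prevents a transient API error from deleting every document.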
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Files changed (summary)
| Area | Change |
|------|--------|
| `common/data_source/dingtalk_ai_table_connector.py` | `SlimConnectorWithPermSync`, `retrieve_all_slim_docs_perm_sync`, `_document_id` shared with document conversion |
| `rag/svr/sync_data_source.py` | `DingTalkAITable._generate`: slim snapshot + tuple return; `batch_size` validation; shared `end_ts` with `poll_source` |
| `web/src/pages/user-setting/data-source/constant/index.tsx` | `syncDeletedFiles` for DingTalk AI Table in `DataSourceFeatureVisibilityMap` |
Closes / relates to: #14362
### What problem does this PR solve?
This fixes a MinerU parsing failure where output JSON was not found in
nested v0.24.0 layouts, and also fixes a `content_names` NameError in
`_read_output()`. As a result, successful MinerU API runs no longer end
with false “MinerU not found” parsing failures.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Concurrent CREATE TABLE / CREATE INDEX / DROP TABLE on the same Infinity
instance can race on the catalog counter (e.g. db|1|next_table_id) and
fail with error 9003 "Resource busy" instead of waiting on a lock. Two
users creating a knowledge base at the same instant, or any deployment
with multiple backend workers behind one Infinity, can hit it.
Wrap the metadata paths in create_idx, create_doc_meta_idx, and
delete_idx with exponential backoff + jitter (5 attempts, 50ms base).
The wrapped operations already use ConflictType.Ignore, so retrying is
idempotent — worst case the second attempt is a no-op against an
already-created table. Tunable via INFINITY_META_RETRY_MAX /
INFINITY_META_RETRY_BASE_DELAY_MS.
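A minimal sketch of exponential backoff with jitter, assuming the defaults and environment variable names given above. `ResourceBusyError` and `retry_meta` are hypothetical names, the exception is a stand-in for Infinity's error 9003, and the exact jitter formula is an assumption.

```python
import os
import random
import time


class ResourceBusyError(Exception):
    """Hypothetical stand-in for Infinity error 9003 ("Resource busy")."""


def retry_meta(operation, max_attempts=None, base_delay_ms=None, sleep=time.sleep):
    """Call `operation` until it succeeds or the retry budget is exhausted."""
    if max_attempts is None:
        max_attempts = int(os.environ.get("INFINITY_META_RETRY_MAX", "5"))
    if base_delay_ms is None:
        base_delay_ms = float(os.environ.get("INFINITY_META_RETRY_BASE_DELAY_MS", "50"))

    for attempt in range(max_attempts):
        try:
            return operation()
        except ResourceBusyError:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the original error
            # Exponential backoff: base, 2x base, 4x base, ... plus random jitter
            # so that colliding workers do not retry in lockstep.
            delay = base_delay_ms * (2 ** attempt) / 1000.0
            sleep(delay + random.uniform(0, delay))
```

Retrying is only safe because the wrapped operations are idempotent, as noted above for `ConflictType.Ignore`.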
Repro: stress 30 concurrent POST /api/v1/datasets against a 4-worker
backend → ~50% of requests fail without the patch (Resource busy from
the second worker that hits the counter), 100% succeed with it. At 100
concurrent requests, all 100 succeed in ~1.2s; the retry budget never
exhausted in our tests.
Scope is limited to metadata paths only — data-path operations (INSERT
chunks, SELECT for retrieval) go through per-table code paths and don't
share the contended counter.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: yoan sapienza <Yoan Sapienza yoan.sapienza@orange.fr Yoan Sapienza zappy@macbookpro.home>
Closes #14552
### What problem does this PR solve?
Add a Go driver for xAI (Grok models).
The config file conf/models/xai.json has been in the repo since the
early Go provider work, but internal/entity/models/factory.go had no
case for "xai". So any xAI request fell through to the dummy driver
and never reached the API. This PR adds the missing driver and wires it
up.
### What this PR includes
- New file internal/entity/models/xai.go with an XAIModel that
implements the ModelDriver interface.
- factory.go: route the "xai" provider name to NewXAIModel.
### How the driver works
- xAI exposes an OpenAI-compatible API at https://api.x.ai/v1.
- ChatWithMessages and ChatStreamlyWithSender post to /chat/completions
in the same shape the moonshot and deepseek drivers use.
- ListModels and CheckConnection call /models to confirm the API key
works and to list available model ids.
- reasoning_content is passed through for grok-3-mini and other xAI
reasoning models, both in the non-stream and stream paths.
- Encode, Rerank, and Balance are not part of the public xAI API at the
moment, so they return a clear "not implemented" or "no such method"
error.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### How was this tested?
- go build ./internal/entity/models/... in a clean go 1.25 image (the
go.mod minimum) returns exit 0 with no errors.
- Method set of XAIModel matches the ModelDriver interface: NewInstance,
Name, ChatWithMessages, ChatStreamlyWithSender, Encode, Rerank,
ListModels, Balance, CheckConnection.
- Pattern parity with the merged moonshot (#14433), volcengine (#14460),
minimax (#14478), and vllm (#14532) PRs.
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
implement `Ollama` provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Use GetChatModel, remove duplicate functions in model_service.go
### Type of change
- [x] Refactoring
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
```
RAGFlow(user)> chat with 'glm-4.6v-flash@test@zhipu-ai' message 'What are the pics talk about?' image 'https://cdn.bigmodel.cn/static/logo/register.png' 'https://cdn.bigmodel.cn/static/logo/api-key.png'
Answer: The first picture shows a login/register modal with options for phone number login, account login, and WeChat QR code login, along with a prompt for new users to get a 20 million tokens experience package. The second picture displays the API keys management page of a platform, including a warning about API key security and a table listing existing API keys with details like creation time and usage history.
Time: 31.600545
RAGFlow(user)> chat with 'glm-4.6v-flash@test@zhipu-ai' message 'What are the video talk about?' video 'https://cdn.bigmodel.cn/agent-demos/lark/113123.mov'
Answer: Based on the sequence of frames provided, the video is a demonstration of a web search and navigation process.
1. The video starts with a blank Google search page.
2. The user types "智谱" (which is the Chinese name for the company Zhipu AI) into the search box.
3. The search is initiated and the page shows "About 0 results".
4. The search results load, showing information about Zhipu AI, including its website.
5. The user clicks on the main website link (www.zhipuai.cn).
6. The video ends by showing the homepage of Zhipu AI's website, titled "Z.ai GLM Large Model Open Platform".
In summary, the video is about searching for the company "智谱" (Zhipu AI) on Google and then navigating to its official website.
Time: 76.582520
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Currently, RAGFlow's Search and Chat interfaces display only raw
vectorized text chunks during retrieval, without contextual information
about their source documents. Users cannot see document titles, page
numbers, upload dates, or custom metadata fields that would help them
understand and trust the retrieved results.
This PR introduces an **optional metadata display feature** that
enriches retrieved chunks with document-level metadata in both the
Search tab and Chatbot interface.
**Key improvements:**
- **Search results**: Display document metadata as styled badges beneath
chunk snippets
- **Chat citations**: Show metadata in citation popovers and reference
lists for better source context
- **LLM context**: Metadata is injected into the LLM prompt to enable
more accurate, citation-aware responses
- **External API support**: Applications using RAGFlow's SDK retrieval
endpoints (`/v1/retrieval`, `/v1/searchbots/retrieval_test`) can opt-in
via request parameters
- **User control**: Multi-select dropdown UI allows users to choose
which metadata fields to display
**Implementation approach:**
- ✅ Reuses existing `DocMetadataService` infrastructure (no new database
tables or indices)
- ✅ Settings stored in existing JSON configuration fields
(`search_config.reference_metadata`, `prompt_config.reference_metadata`)
- ✅ No database migrations required
- ✅ Disabled by default (fully opt-in and backward-compatible)
- ✅ Dynamic metadata field selection populated from actual document
metadata keys
- ✅ Fixed critical bug where Python's builtin `set()` was shadowed by a
route handler function
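The shadowing problem mentioned in the last item can be reproduced in a few lines. The handler signature below is hypothetical, and reaching the builtin through the `builtins` module is only one possible remedy. How this PR actually resolved it is not stated here.

```python
import builtins


def set(key, value):
    """Hypothetical route handler whose name shadows the builtin set()."""
    return {"key": key, "value": value}


def dedupe_broken(items):
    # Resolves to the handler above, not the builtin, so this raises
    # TypeError (the handler requires two positional arguments).
    return set(items)


def dedupe_fixed(items):
    # Explicitly reaching for the builtin sidesteps the shadowing.
    return builtins.set(items)


if __name__ == "__main__":
    print(dedupe_fixed(["title", "title", "page"]))
    try:
        dedupe_broken(["title", "title"])
    except TypeError as exc:
        print("shadowed builtin:", exc)
```

Renaming the handler function is the other common remedy and avoids the problem for every later use of `set()` in the module.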
**Modified endpoints (all backward-compatible):**
- `POST /v1/retrieval` (Public SDK)
- `POST /v1/searchbots/retrieval_test` (Searchbots)
- `POST /v1/chunk/retrieval_test` (UI/Internal)
- Chat completions endpoints (via `extra_body.reference_metadata` or
`prompt_config`)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Images
-
<img width="879" height="1275" alt="image"
src="https://github.com/user-attachments/assets/95b2d731-31ae-45a1-b081-bf5893f52aeb"
/>
<br><br>
<br><br>
<img width="1532" height="362" alt="image"
src="https://github.com/user-attachments/assets/9cebc65b-b7a7-459f-b25e-3b13fa9b638e"
/>
<br><br>
<br><br>
<img width="2586" height="1320" alt="image"
src="https://github.com/user-attachments/assets/2153d493-d899-461f-a7a9-041391e07776"
/>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Attili-sys <Attili-sys@users.noreply.github.com>
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
### What problem does this PR solve?
remove delete_documents uuid validation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Partially addresses #14362.
This PR enables syncing deleted files for RSS data sources.
Previously, RSS incremental sync only returned feed entries whose
timestamps were inside the poll window. If an entry was removed from the
RSS feed, RAGFlow had no full current RSS snapshot to pass into the
shared stale-document cleanup path, so the deleted remote entry could
remain in the knowledge base.
This PR:
- adds `retrieve_all_slim_docs_perm_sync()` to `RSSConnector`
- reuses the same `rss:<md5(stable_key)>` document ID derivation used by
normal RSS ingest
- returns `(document_generator, file_list)` for incremental RSS sync
when `sync_deleted_files` is enabled
- captures the poll end timestamp before snapshot/poll so cleanup does
not race against the same sync window
- adds start/end logs around RSS slim snapshot collection
- exposes the deleted-file sync toggle for RSS in the data source UI
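The ID derivation can be sketched as below, assuming the `rss:<md5(stable_key)>` format quoted above. The function name `rss_document_id` is hypothetical, and so are the UTF-8 encoding and the hexadecimal digest form. What goes into `stable_key` is not specified here.

```python
import hashlib


def rss_document_id(stable_key: str) -> str:
    """Derive the same ID for an entry during ingest and during the snapshot."""
    digest = hashlib.md5(stable_key.encode("utf-8")).hexdigest()
    return f"rss:{digest}"


if __name__ == "__main__":
    print(rss_document_id("https://example.test/feed/entry-1"))
```

Because both paths go through one derivation, an entry that is still in the feed always appears in the snapshot under the ID it was indexed with, and cleanup leaves it alone.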
Per maintainer request on related datasource PRs, this PR contains no
test-case changes. Local verification was run with an external script.
Validation:
- `uv run ruff check common/data_source/rss_connector.py
rag/svr/sync_data_source.py`
- `uv run pytest test/unit_test/rag/test_sync_data_source.py -q`
- `./node_modules/.bin/eslint
src/pages/user-setting/data-source/constant/index.tsx`
- `git diff --check`
- `uv run python /tmp/verify_rss_deleted_sync.py --repo
/root/74/ragflow`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## What problem does this PR solve?
Incremental WebDAV sync only ingested files whose modification time fell
inside the poll window; documents removed on the WebDAV server were
never removed from the knowledge base. This aligns with
[#14362](https://github.com/infiniflow/ragflow/issues/14362)
(coordinated datasource “sync deleted files” work).
This PR adds a **full-tree slim snapshot**
(`retrieve_all_slim_docs_perm_sync`) that enumerates current remote
paths **without downloading file contents**, using the same logical
document IDs as full ingest (`webdav:{base_url}:{file_path}`). When
**`sync_deleted_files`** is enabled on incremental runs, sync returns
**`(document_generator, file_list)`** so **`SyncBase`** runs
**`cleanup_stale_documents_for_task`** and removes KB rows no longer
present remotely.
Design notes:
- **`_list_files_recursive`** gains **`filter_by_mtime`**: snapshot
passes **`filter_by_mtime=False`** (full tree under **`remote_path`**);
**`poll_source`** keeps mtime-window filtering as before.
- Slim snapshot applies the same **extension** and **`size_threshold`**
rules as **`_yield_webdav_documents`** so retain IDs match what would be
indexed.
- **`end_ts`** is captured before building **`file_list`**, then
**`poll_source`** uses the same upper bound (consistent with
Dropbox-style connectors).
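The difference between the poll path and the snapshot path can be sketched as below. The `RemoteFile` dataclass and `select_files` function are hypothetical, and the half-open window `start < mtime <= end` is an assumption. Only the ID format and the `filter_by_mtime` flag name come from this PR.

```python
import os
from dataclasses import dataclass
from typing import Iterable, Iterator, Set


@dataclass
class RemoteFile:
    path: str
    size: int
    mtime: float


def webdav_document_id(base_url: str, file_path: str) -> str:
    return f"webdav:{base_url}:{file_path}"


def select_files(files: Iterable[RemoteFile], allowed_exts: Set[str],
                 size_threshold: int, start: float, end: float,
                 filter_by_mtime: bool = True) -> Iterator[RemoteFile]:
    """Apply the same extension and size rules on both paths.

    Polling keeps the mtime window; the slim snapshot passes
    filter_by_mtime=False to see the full tree.
    """
    for item in files:
        ext = os.path.splitext(item.path)[1].lower()
        if ext not in allowed_exts:
            continue
        if item.size > size_threshold:
            continue
        if filter_by_mtime and not (start < item.mtime <= end):
            continue
        yield item


if __name__ == "__main__":
    tree = [RemoteFile("/a.pdf", 10, 100.0), RemoteFile("/b.pdf", 10, 900.0)]
    snapshot = select_files(tree, {".pdf"}, 1000, 500.0, 1000.0, filter_by_mtime=False)
    print([webdav_document_id("https://dav.example.test", f.path) for f in snapshot])
```

Sharing the extension and size rules matters because a file skipped by ingest but listed in the snapshot, or the reverse, would make the retain set disagree with what is actually indexed.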
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Files changed
| Area | Change |
|------|--------|
| `common/data_source/webdav_connector.py` | `SlimConnectorWithPermSync`, `retrieve_all_slim_docs_perm_sync`, `filter_by_mtime` on `_list_files_recursive` |
| `rag/svr/sync_data_source.py` | WebDAV `_generate`: `file_list` + tuple return; pass **`batch_size`** from connector config |
| `web/src/pages/user-setting/data-source/constant/index.tsx` | `syncDeletedFiles` for WebDAV in `DataSourceFeatureVisibilityMap` |
### What problem does this PR solve?
Implement the vLLM model provider for RAGFlow to fully support local and
self-hosted open-source models (e.g., Qwen, GLM, Llama) via the vLLM
framework, and fix several critical bugs related to model instance
management and API requests.
**Key changes and fixes:**
1. **Added Standard vLLM Provider (`vllm.go`, `vllm.json`):**
- Implemented `VllmModel` driver strictly adhering to the OpenAI API
specification.
- Removed hardcoded and dangerous routing logic (e.g., forcing
`AsyncChat` for Qwen/GLM prefixes), ensuring standard
`/v1/chat/completions` compatibility.
- Refactored `ListModels` to use safe JSON parsing (resolving nil
pointer panics) and standard `GET` requests without bodies.
- Added `APIConfig.Region` fallback logic to prevent empty `base_url`
fetching when checking models.
2. **Fixed `ChatToModelStreamWithSender` Bug (`model_service.go`):**
- Resolved the `model is disabled` error when streaming chat with local
database-saved models.
- Added the missing `if modelInfo.Status == "active"` block to correctly
invoke `NewInstance` and inject the dynamic `base_url` into the provider
driver before starting the SSE stream.
3. **Fixed `ListSupportedModels` Bug (`model_service.go`):**
- Added dynamic `NewInstance` injection for `base_url`. Previously, the
list models function used the static JSON config without injecting
user-configured dynamic URLs from the database, resulting in an
`unsupported protocol scheme ""` error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: LaTeX formulas cannot be displayed on the chat page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The `id` field is typed as "text", not "keyword", so ordering by it causes an error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix docker image version info in comment
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Refs #14362.
This PR enables syncing deleted files for Zendesk data sources.
Previously, Zendesk incremental sync never returned a slim remote
snapshot to the shared stale-document cleanup path, so deleted remote
Zendesk records could remain in RAGFlow. The existing Zendesk slim
snapshot also included records that ingestion intentionally skips, such
as draft articles, articles without bodies, skipped-label articles,
empty-body articles, and tickets with `status == "deleted"`.
This PR:
- exposes the deleted-file sync option for Zendesk in the data source UI
- returns Zendesk slim snapshots during incremental sync when
`sync_deleted_files` is enabled
- reuses Zendesk indexability rules so cleanup compares against the same
records ingestion can materialize
- adds start/end logs around Zendesk slim snapshot collection for
operational visibility
Per maintainer request, this PR contains no test-case changes. Manual
verification recording will be provided separately.
Validation:
- `uv run ruff check common/data_source/zendesk_connector.py
rag/svr/sync_data_source.py`
- `uv run pytest test/unit_test/rag/test_sync_data_source.py -q`
- `./node_modules/.bin/eslint
src/pages/user-setting/data-source/constant/index.tsx`
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Partially addresses #14362.
Adds deleted-file sync support for the Asana data source. Asana already
indexes task attachments as documents, but it did not provide the slim
document snapshot required by stale-document reconciliation, and the
sync wrapper never returned a `file_list` for cleanup.
This PR:
- adds `retrieve_all_slim_docs_perm_sync()` to `AsanaConnector`
- builds slim IDs with the same `asana:{task_id}:{attachment_gid}`
format used by indexed documents
- avoids downloading attachment blobs during the snapshot
- aborts the snapshot if Asana API errors occur, preventing partial
snapshots from deleting valid local docs
- captures the incremental poll end time before snapshotting and makes
`poll_source()` respect that boundary
- exposes the deleted-file sync toggle for Asana in the data source UI
Per maintainer request, this PR contains no test-case changes. Manual
verification recording will be provided separately.
Validation:
- `uv run ruff check common/data_source/asana_connector.py
rag/svr/sync_data_source.py`
- `uv run pytest test/unit_test/rag/test_sync_data_source.py -q`
- `./node_modules/.bin/eslint
src/pages/user-setting/data-source/constant/index.tsx`
- `git diff --check`
### Type of change
- [x] New Feature
## Summary
Fix critical severity security issue in `rag/utils/ob_conn.py`.
## Vulnerability
| Field | Value |
|-------|-------|
| **ID** | V-003 |
| **Severity** | CRITICAL |
| **Scanner** | multi_agent_ai |
| **Rule** | `V-003` |
| **File** | `rag/utils/ob_conn.py:691` |
**Description**: The OceanBase database connector constructs SQL WHERE
clauses by directly embedding user-controlled filter expressions using
Python f-strings at lines 726, 777, 781, 787, 793, 821, and 827. No
parameterization or allowlist validation is applied before the
expressions are incorporated into live SQL queries. This is the most
critical vulnerability in the codebase because it directly exposes the
RAG knowledge base — the platform's core business asset — to complete
compromise.
## Changes
- `rag/utils/ob_conn.py`
## Verification
- [x] Build passes
- [x] Scanner re-scan confirms fix
- [x] LLM code review passed
---
*Automated security fix by [OrbisAI Security](https://orbisappsec.com)*
### What problem does this PR solve?
Feat: add button for remove header & footer in pipeline
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Incremental Seafile sync only ingests files whose modification time
falls in the poll window; documents removed in Seafile were never
removed from the knowledge base. This contributes to
[#14362](https://github.com/infiniflow/ragflow/issues/14362) (datasource
“sync deleted files” coordination).
This PR adds a **slim snapshot** (`retrieve_all_slim_docs_perm_sync`)
that enumerates current remote file IDs **without downloading content**,
using the same logical IDs as full ingest
(`seafile:{repo_id}:{file_id}`). When **`sync_deleted_files`** is
enabled on incremental runs, **`SeaFile._generate`** returns
**`(document_generator, file_list)`** so **`SyncBase`** can run
**`cleanup_stale_documents_for_task`** and remove stale KB documents.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What changed
- **`common/data_source/seafile_connector.py`**: `SeaFileConnector`
implements **`SlimConnectorWithPermSync`**;
**`_list_files_recursive(..., filter_by_mtime=...)`** supports full-tree
listing for snapshots; **`retrieve_all_slim_docs_perm_sync()`** reuses
the same library/root scan as ingest and applies the same **size**
ceiling; logging for snapshot start/end and counts.
- **`rag/svr/sync_data_source.py`**: **`SeaFile._generate`** validates
**`batch_size`**, captures **`end_ts`** before snapshot +
**`poll_source`**, wraps slim retrieval in **`try`/`except`** (
**`file_list = None`** on failure so ingest continues), returns
**`(generator, file_list)`**.
- **`web/src/pages/user-setting/data-source/constant/index.tsx`**:
**`syncDeletedFiles`** for Seafile in
**`DataSourceFeatureVisibilityMap`**.
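The `_generate` flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `SeaFile._generate` code: the `generate` function, its callable parameters, and the sample IDs are hypothetical; only the ordering (capture `end_ts`, then snapshot, then poll), the `file_list = None` fallback, and the `(generator, file_list)` return shape come from the description.

```python
import logging
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def generate(
    poll_source: Callable[[float, float], Iterator[list]],
    take_snapshot: Callable[[], Iterable[str]],
    start_ts: float,
    sync_deleted_files: bool,
) -> Tuple[Iterator[list], Optional[list]]:
    """Return (document_generator, file_list) for one incremental run."""
    # Fix the upper bound of the poll window before the snapshot starts.
    end_ts = time.time()
    file_list: Optional[list] = None
    if sync_deleted_files:
        try:
            file_list = list(take_snapshot())
        except Exception:
            # A failed snapshot must not block ingestion; None means
            # "no snapshot available", so stale cleanup is skipped.
            logging.exception("slim snapshot failed; skipping stale cleanup")
            file_list = None
    return poll_source(start_ts, end_ts), file_list


def fake_poll(start, end):
    """Hypothetical poll that yields one batch of document IDs."""
    yield ["seafile:repo1:f1"]


def good_snapshot():
    return ["seafile:repo1:f1", "seafile:repo1:f2"]


def broken_snapshot():
    raise RuntimeError("remote listing failed")


docs_ok, files_ok = generate(fake_poll, good_snapshot, 0.0, True)
docs_err, files_err = generate(fake_poll, broken_snapshot, 0.0, True)
_, files_off = generate(fake_poll, good_snapshot, 0.0, False)
```

Returning `None` rather than an empty list matters in a design like this: an empty list would read as "nothing exists remotely" and could cause every local document to be treated as stale.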
### What problem does this PR solve?
This fixes a crash in Manual and Naive parsing when PDF outlines include
page numbers as a third tuple value. It makes outline unpacking accept
extra values so parsing no longer fails. Fixes #14411.
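A tolerant unpacking of this kind could look like the sketch below. The helper name `iter_outline` and the sample entries are hypothetical; the sketch only assumes that outline entries are tuples of `(title, depth)` with an optional third value.

```python
def iter_outline(outlines):
    """Yield (title, depth) pairs, ignoring any extra trailing values."""
    for entry in outlines:
        # Star-unpacking accepts both (title, depth) and (title, depth, page).
        title, depth, *_ = entry
        yield title, depth


mixed = [("Introduction", 0), ("Background", 1, 12)]
pairs = list(iter_outline(mixed))
```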
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Both the tokenizer (`rag/flow/tokenizer/tokenizer.py`) and
`BuiltinEmbed.encode` (`rag/llm/embedding_model.py`) currently
accumulate embedding batches via `np.concatenate` inside the per-batch
loop. `np.concatenate` allocates a new array and copies all existing
data on every call, so accumulating N batches is O(N²) in both time and
peak memory.
Replacing the incremental concatenate with a list of batches plus a
single `np.vstack` at the end gives O(N) total work.
For the tokenizer, the title-vector broadcast
`np.concatenate([vts[0]] * N)` is also replaced by `np.tile`, which does
the same job with a single contiguous allocation instead of building a
Python list of references.
This is purely a CPU/memory optimisation: output shape and dtype are
unchanged. Measured impact grows with document size:
- 1k chunks (batch 512, 2 iters): ~negligible
- 10k chunks (20 iters): ~10× speedup on this stage
- 100k chunks (195 iters): ~100× speedup, and peak RAM drops from O(N)
extra to near-zero
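The two accumulation patterns can be compared with a small sketch. Everything here other than the NumPy calls is an assumption for illustration: `encode_batch` is a hypothetical stand-in for the embedding call, the dimension is tiny, and the title vector is assumed to be a `(1, dim)` array.

```python
import numpy as np

DIM = 4  # illustrative embedding width


def encode_batch(texts):
    """Hypothetical stand-in for an embedding call: one row per text."""
    return np.array([[len(t)] * DIM for t in texts], dtype=np.float32)


def embed_quadratic(texts, batch_size):
    """Pattern being replaced: re-copies the accumulated array every batch."""
    out = np.empty((0, DIM), dtype=np.float32)
    for i in range(0, len(texts), batch_size):
        out = np.concatenate((out, encode_batch(texts[i:i + batch_size])), axis=0)
    return out


def embed_linear(texts, batch_size):
    """Replacement pattern: collect batches, then stack once at the end.

    Assumes at least one batch; np.vstack raises on an empty list.
    """
    parts = [encode_batch(texts[i:i + batch_size])
             for i in range(0, len(texts), batch_size)]
    return np.vstack(parts)


texts = ["a", "bb", "ccc", "dddd", "eeeee"]
slow = embed_quadratic(texts, batch_size=2)
fast = embed_linear(texts, batch_size=2)

# Broadcast one title vector to every chunk with a single allocation.
title_vec = np.arange(DIM, dtype=np.float32).reshape(1, DIM)
tiled = np.tile(title_vec, (len(texts), 1))
```

Both functions return the same array; only the number of intermediate copies differs.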
### Type of change
- [x] Performance Improvement
Co-authored-by: yoan sapienza <Yoan Sapienza yoan.sapienza@orange.fr Yoan Sapienza zappy@macbookpro.home>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.25.0 to v0.25.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: The GraphRAG icon is not displaying.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
## Summary
Fixed a bug where the **File Logs** tab in the dataset ingestion page
always showed "No logs" even after files were parsed successfully.
## Root Cause
Both the **File Logs** and **Dataset Logs** tabs on the frontend called
the same backend endpoint `/datasets/{dataset_id}/ingestions`. However,
the backend only queried `get_dataset_logs_by_kb_id`, which
hard-filtered records by `document_id == GRAPH_RAPTOR_FAKE_DOC_ID`
(dataset-level logs). As a result, real file-level logs were never
returned, causing the table to appear empty.
## Changes
### Backend
- **`api/apps/restful_apis/dataset_api.py`**
- Added two new query parameters to `list_ingestion_logs`:
- `log_type` — `"file"` or `"dataset"` (default: `"dataset"`)
- `keywords` — search keyword for filtering by document / task name
- **`api/apps/services/dataset_api_service.py`**
- Updated `list_ingestion_logs` signature to accept `log_type` and
`keywords`.
- Added conditional routing:
- When `log_type == "file"`, call
`PipelineOperationLogService.get_file_logs_by_kb_id`
- Otherwise, call
`PipelineOperationLogService.get_dataset_logs_by_kb_id`
- **`api/db/services/pipeline_operation_log_service.py`**
- Extended `get_dataset_logs_by_kb_id` with an optional `keywords`
parameter so dataset logs can also be searched.
### Frontend
- **`web/src/pages/dataset/dataset-overview/hook.ts`**
- Removed the separate API function switching (`listPipelineDatasetLogs`
vs `listDataPipelineLogDocument`).
- Unified both tabs to call `listDataPipelineLogDocument` with the new
`log_type` query parameter (`"file"` or `"dataset"`).
- Ensured `keywords` and filter values are passed through correctly.
## Behavior After Fix
| Tab | `log_type` | Returned Records | Searchable Field |
|---|---|---|---|
| File Logs | `file` | Real document-level logs | `document_name` (file name) |
| Dataset Logs | `dataset` | GraphRAG / RAPTOR / MindMap logs | `document_name` (task type) |
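The conditional routing can be sketched as below. The two query functions are placeholders with hypothetical bodies and return shapes; only their names, the `log_type` values, and the `"dataset"` default come from the description above.

```python
def get_file_logs_by_kb_id(kb_id, keywords=""):
    """Placeholder for the file-level query (hypothetical body)."""
    return {"kb_id": kb_id, "kind": "file", "keywords": keywords}


def get_dataset_logs_by_kb_id(kb_id, keywords=""):
    """Placeholder for the dataset-level query (hypothetical body)."""
    return {"kb_id": kb_id, "kind": "dataset", "keywords": keywords}


def list_ingestion_logs(dataset_id, log_type="dataset", keywords=""):
    """Route to the file-level or dataset-level query based on log_type."""
    if log_type == "file":
        return get_file_logs_by_kb_id(dataset_id, keywords)
    # Any other value keeps the previous dataset-level behaviour.
    return get_dataset_logs_by_kb_id(dataset_id, keywords)


file_logs = list_ingestion_logs("kb1", log_type="file", keywords="report")
default_logs = list_ingestion_logs("kb1")
```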
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix: The pipeline column header in the FileLogsTable is displaying
incorrectly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Drop instance model.
2. Fix an issue where dropping an instance did not drop its models.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
implement MiniMax provider
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: enable sync deleted file for Discord
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Partially addresses #14362 by adding deleted-file sync support for the
Dropbox data source.
Dropbox previously did not provide the slim current-file snapshot
required by stale document reconciliation, and its sync runner returned
only document batches. As a result, enabling deleted-file sync could not
remove local documents that had been deleted from Dropbox.
This PR:
- Adds `retrieve_all_slim_docs_perm_sync()` to `DropboxConnector`.
- Reuses Dropbox metadata traversal to collect current remote file IDs
without downloading file contents.
- Wires incremental Dropbox sync to return `(document_generator,
file_list)` when `sync_deleted_files` is enabled.
- Enables the deleted-file sync toggle for Dropbox in the data source
settings UI.
- Adds regression coverage for slim snapshots, nested folders, paginated
listings, duplicate filenames, and full reindex behavior.
Tests:
- `uv run pytest test/unit_test/common/test_dropbox_connector.py -q`
- `uv run pytest test/unit_test/rag/test_sync_data_source.py -q`
- `uv run pytest test/unit_test/common/test_dropbox_connector.py
test/unit_test/rag/test_sync_data_source.py -q`
- `uv run ruff check common/data_source/dropbox_connector.py
rag/svr/sync_data_source.py
test/unit_test/common/test_dropbox_connector.py
test/unit_test/rag/test_sync_data_source.py`
- `./node_modules/.bin/eslint
src/pages/user-setting/data-source/constant/index.tsx`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: enable sync deleted files in gitlab
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Problem
In the Dataset Configuration page, changing the RAPTOR **Generation
scope** from "Single file" to "Dataset" and clicking **Save** did not
persist the change. After refreshing or re-entering the page, the scope
always reverted to "Single file".
## Root Cause
1. **Backend**: The `RaptorConfig` Pydantic model in
`api/utils/validation_utils.py` was configured with `extra="forbid"` but
did not declare a `scope` field. When the frontend sent `"scope":
"dataset"`, Pydantic rejected the request.
2. **Frontend**: The `extractRaptorConfigExt` utility in
`web/src/hooks/parser-config-utils.ts` treated `scope` as an unknown
field and moved it into the nested `ext` object. Consequently, the
backend could not read `raptor_config.get("scope", "file")` correctly,
so the default `"file"` was always used.
## Changes
- Added `scope: Literal["file", "dataset"]` to the backend
`RaptorConfig` model with a default of `"file"`.
- Added `scope` to the known-field whitelist in the frontend
`extractRaptorConfigExt` helper so it is transmitted as a top-level
raptor field instead of being buried in `ext`.
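The frontend half of the bug can be illustrated with a Python analogue of the known-field split. This is not the TypeScript `extractRaptorConfigExt` helper itself: `split_known_fields` and the `use_raptor` field name are hypothetical, and the sketch only assumes that unknown keys are moved into a nested `ext` object while the backend reads `scope` with a default of `"file"`.

```python
def split_known_fields(raw, known_fields):
    """Keep known keys at the top level; move everything else under 'ext'."""
    top = {k: v for k, v in raw.items() if k in known_fields}
    ext = {k: v for k, v in raw.items() if k not in known_fields}
    if ext:
        top["ext"] = ext
    return top


form = {"use_raptor": True, "scope": "dataset"}

# Before the fix: "scope" is unknown, so it is buried in "ext".
before = split_known_fields(form, {"use_raptor"})
# After the fix: "scope" is whitelisted and stays at the top level.
after = split_known_fields(form, {"use_raptor", "scope"})

# The backend-style lookup falls back to "file" when the key is missing.
scope_before = before.get("scope", "file")
scope_after = after.get("scope", "file")
```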
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
1. Support the following commands:
```
RAGFlow(user)> create provider 'vllm' instance 'test' key 'test-key' url 'base-url' region 'abc';
SUCCESS
RAGFlow(user)> list instances from 'vllm';
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| apiKey | extra | id | instanceName | providerID | status |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| test-key | {"base_url":"base-url","region":"abc"} | 40213c89430311f1a7cf38a74640adcc | test | b4d40e6142d311f1a4f938a74640adcc | enable |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
```
2. Support adding a vLLM model:
```
RAGFlow(user)> add model 'Qwen/Qwen2-0.5B' to provider 'vllm' instance 'test' with tokens 131072 chat;
SUCCESS
```
3. Add vLLM chat.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Port PR #14454 to Go (`PruneDeletedChunks`).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: enable syncing deleted files for Gmail and fix Google Drive issues
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: bill <yibie_jingnian@163.com>
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Fix: Clicking the button in the bottom-right corner of the
`/chats/widget` page fails to display the dialog box.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
implement `volcengine` provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Dataset: When configuring the "general chunk method," options such
as chunk size and parent-child slicing are unavailable.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
prune deleted doc chunks from retrieval
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Update the URL to: /api/v1/chat/completions
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: sync deleted files in Bitbucket
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
**Addresses the Google Drive integration for #14362**
This PR completely overhauls the Google Drive sync logic to accurately
detect remote deletions, while drastically reducing the memory footprint
during the snapshot phase.
### What changed under the hood:
* **Killed the memory bloat:** Swapped out the massive document
dictionary objects for a lightweight `collections.namedtuple` (`SlimDoc
= namedtuple('SlimDoc', ['id'])`). This prevents RAM spikes during
`retrieve_all_slim_docs_perm_sync` on massive enterprise drives.
* **Flawless downstream integration:** The `SlimDoc` object relies on
simple duck typing. It perfectly delivers the `.id` attribute required
by `ConnectorService.cleanup_stale_documents_for_task`, meaning your
core `hash128` vector cleanup logic runs natively without modification.
* **Fixed the Shared Drive blindspot:** The standard API query was
missing team folders. Injected the `corpora="allDrives"` and
`includeItemsFromAllDrives=True` override flags so the connector now
accurately maps state across both personal workspaces and organizational
Shared Drives.
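The duck-typing point can be shown with a short sketch. The `SlimDoc` definition is the one quoted above; `stale_ids` is a hypothetical stand-in for a cleanup step that reads only `.id`, and the sample record fields are invented for illustration.

```python
import sys
from collections import namedtuple

SlimDoc = namedtuple("SlimDoc", ["id"])


def stale_ids(indexed_ids, snapshot):
    """Hypothetical cleanup step: it only reads the ``.id`` attribute."""
    remote = {doc.id for doc in snapshot}
    return sorted(set(indexed_ids) - remote)


# A fuller metadata record of the kind a slim snapshot avoids keeping around.
full_record = {
    "id": "file-1",
    "name": "report.pdf",
    "mimeType": "application/pdf",
    "owners": ["a@example.com"],
}
slim = SlimDoc(id=full_record["id"])

removed = stale_ids(["file-1", "file-2"], [slim])

# Shallow container sizes only; nested values are not counted.
slim_is_smaller = sys.getsizeof(slim) < sys.getsizeof(full_record)
```

Any object exposing `.id` would work in place of `SlimDoc` here, which is why the consumer needs no changes.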
### Testing:
Isolated the Google API retrieval logic locally to prove the `SlimDoc`
mapping works and correctly registers state drops when a file is trashed
remotely.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Performance Improvement
### What problem does this PR solve?
Fix: enable sync deleted file in airtable
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
## Summary
Migrate two web API endpoints to REST-style HTTP API endpoints,
following the pattern established in #14222:
| Old Endpoint | New Endpoint |
|---|---|
| `POST /v1/chunk/retrieval_test` | `POST /api/v1/datasets/<dataset_id>/search` |
| `GET /v1/chunk/knowledge_graph` | `GET /api/v1/datasets/<dataset_id>/graph` |
### What problem does this PR solve?
Fix: google authentication - gmail && google-drive
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Steps to reproduce (existing bug before API migration):
1. Create a new dataset.
2. Upload a file.
3. Click "General" in the "Parse" column, then click "switch or configure ingestion pipeline".
4. Click "Settings" (to the right of "Auto metadata").
5. Click "Add" to add new metadata.
6. Click "Save".
7. Re-open "Settings"; the newly added metadata is not there.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix agent tool-call null responses, schema validation, and DeepSeek
think history.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: enable sync deleted files for connectors
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
Fixes case-asymmetric matching for manual `meta_data_filter` when using
**`in`** / **`not in`** with a **list** `value`. Document metadata
strings were lowercased, but list elements were not, so values like
`"F2"` failed to match `["F2", "F11"]` even though **`=`** behaved
correctly.
Closes #14389
## Changes
- **`common/metadata_utils.py`**: For **`in`** / **`not in`**, normalize
string elements when `value` and/or `input` is a list, consistent with
scalar string lowercasing.
- **`test/unit_test/common/test_metadata_filter_operators.py`**:
Regression tests for list `value` case-insensitivity and **`not in`**.
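A symmetric normalization for `in` / `not in` might look like the following sketch. It is not the code in `common/metadata_utils.py`: `_lower` and `match_membership` are hypothetical names, and the sketch assumes only that strings on both sides, including strings inside lists, should be lowercased before comparison.

```python
def _lower(value):
    """Lowercase strings, including strings inside lists; leave others alone."""
    if isinstance(value, str):
        return value.lower()
    if isinstance(value, list):
        return [_lower(v) for v in value]
    return value


def match_membership(op, doc_value, filter_value):
    """Case-insensitive 'in' / 'not in' between document and filter values."""
    if op not in ("in", "not in"):
        raise ValueError(f"unsupported operator: {op}")
    doc_items = _lower(doc_value if isinstance(doc_value, list) else [doc_value])
    allowed = _lower(filter_value if isinstance(filter_value, list) else [filter_value])
    hit = any(item in allowed for item in doc_items)
    return hit if op == "in" else not hit


in_result = match_membership("in", "F2", ["F2", "F11"])
not_in_result = match_membership("not in", "F2", ["F2", "F11"])
```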
## Type of change
- [x] Bug fix (non-breaking)
### What problem does this PR solve?
This PR fixes a regression where Manual pipeline + Naive (Plain Text)
PDF parsing crashed with `AttributeError: 'PlainParser' object has no
attribute 'extract_positions'` in `rag/app/manual.py`.
Fixes #14411
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add methods to volcengine
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Always return success when the task ID does not exist, to match the existing code logic.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
align chat recommendation and thumbup APIs
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
preserve infinity available_int zero filter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Refactor server_main
2. Add volcengine
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
In chat, the thumbnails didn't display correctly
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Steps to reproduce:
1. create dataset and upload a file (see attached)
2. parse the document
3. once parsing completed, create a chat and associate it with the
dataset
4. ask a question (DAP VS DAPE comparison)
5. check result
### What problem does this PR solve?
- Before migration, Web API: `POST /v1/document/change_parser`
- Before migration, HTTP API: `PATCH /api/v1/datasets/<dataset_id>/documents`
- After consolidation, RESTful API: `PATCH /api/v1/datasets/<dataset_id>/documents`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add executor shutdown in finally clause to free resources.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Before migration: `GET /v1/document/thumbnails`
- After migration: `GET /api/v1/thumbnails`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Before migration: `POST /v1/document/run`
- After migration: `POST /api/v1/documents/ingest/`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
As title.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
### Summary
PR #14222 consolidated KB (web) API endpoints into RESTful Dataset
(HTTP) API endpoints and deleted the web API test suite under
`test_web_api/test_kb_app/` and `test_web_api/test_document_app/`. While
most test coverage was migrated to the HTTP API test suite, some tests
were not ported over. This PR adds back the missing coverage.
### Route migration reference
| Old Web API | New HTTP API | Missing tests |
|---|---|---|
| `POST /v1/kb/update_metadata_setting` | `PUT /api/v1/datasets/<id>/metadata/config` | auth & error paths |
| `GET /api/v1/datasets/<id>/auto_metadata` | `GET /api/v1/datasets/<id>/metadata/config` | auth & CRUD |
| `PUT /api/v1/datasets/<id>/auto_metadata` | `PUT /api/v1/datasets/<id>/metadata/config` | auth & CRUD |
| `GET /v1/kb/<kb_id>/basic_info` | `GET /api/v1/datasets/<id>/ingestions/summary` | covered |
| `POST /v1/kb/list_pipeline_logs` | `GET /api/v1/datasets/<id>/ingestions` | edge cases missing |
### Changes
#### `test_file_management_within_dataset/test_metadata_config.py` (new, 10 tests)
Covers `GET/PUT /datasets/<id>/metadata/config` (migrated from
`test_kb_tags_meta.py`'s `test_update_metadata_setting` and
`test_document_metadata.py`'s negative tests):
- Authorization for dataset metadata config GET/PUT
- Authorization for document metadata config PUT
- Success, invalid dataset, missing payload, not found scenarios
#### `test_dataset_management/test_ingestion_logs.py` (extended, +2 tests)
Covers `GET /datasets/<id>/ingestions` edge cases (migrated from
`test_kb_pipeline_tasks.py`):
- Missing dataset ID
- Abnormal date filter
### Type of change
- [x] Other: Test coverage improvement
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
- Before migration, Web API: `POST /v1/document/change_status`
- After consolidation, RESTful API: `POST /api/v1/datasets/<dataset_id>/documents/batch-update-status`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
prioritize explore session ID and reset default conversation variables
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The v0.25.0 Dockerfile expects the nginx config at
`/etc/nginx/ragflow.conf.python`; see
[Dockerfile#L200](ca01c7a745/Dockerfile (L200)).
However, the current Helm template mounts the config at
`/etc/nginx/ragflow.conf`, causing a runtime error at startup.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Mauro Gattari <mauro.gattari@infn.it>
### What problem does this PR solve?
- Before migration: `POST /v1/document/upload_info/`
- After migration: `POST /api/v1/documents/upload/`
### Type of change
- [x] Refactoring
## Summary
- **Lazy img_np loading**: `np.array(img)` is now deferred until the
first OCR text extraction is actually needed, avoiding unnecessary
memory allocation for pages that already have text.
- **Chunked parse_into_bboxes**: Large PDFs (>50 pages, configurable via
`PDF_PARSER_PAGE_BATCH_SIZE`) are processed in batches. Each chunk's
boxes are normalized with `_to_global_boxes` to produce globally
consistent page numbers and position tags.
- **DLA early init**: Move remote-client initialization before model
loading in `LayoutRecognizer.__init__` so `DEEPDOC_URL` (or legacy
`TENSORRT_DLA_SVR`) short-circuits unnecessary model download for parser
containers relying on remote inference.
- **Fix outline regression**: Restore `self.outlines =
extract_pdf_outlines(fnm)` in `parse_into_bboxes`; this was dropped
during refactoring and is required by downstream `remove_toc` and
metadata handling in `rag/flow/parser/parser.py`.
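The batching idea from the "Chunked parse_into_bboxes" item can be sketched as follows. This is an illustration only, not the parser's code: `page_batches` and `to_global` are hypothetical helpers (the real normalization is `_to_global_boxes`), the box dictionaries are invented, and a batch size of 50 is assumed from the threshold mentioned above.

```python
def page_batches(total_pages, batch_size=50):
    """Yield (start, end) page ranges, end exclusive, covering all pages."""
    for start in range(0, total_pages, batch_size):
        yield start, min(start + batch_size, total_pages)


def to_global(boxes, page_offset):
    """Shift batch-local page numbers so they are consistent document-wide."""
    return [{**box, "page_number": box["page_number"] + page_offset}
            for box in boxes]


batches = list(page_batches(120))

# A box found on local page 1 of the second batch maps to global page 51.
shifted = to_global([{"page_number": 1, "text": "heading"}], page_offset=50)
```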
## Test plan
- [ ] Small PDF (<=50 pages): verify parse succeeds and `self.outlines`
is populated
- [ ] Large PDF (>50 pages): verify chunked processing produces globally
consistent page numbers
- [ ] With `DEEPDOC_URL` set: verify remote DLA client is used and local
model is not downloaded
- [ ] With legacy `TENSORRT_DLA_SVR` set: verify backward compatibility
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
### What problem does this PR solve?
This PR fixes issue #14371 where file parsing failed after upgrading
from v0.24.0 to v0.25.0, because metadata config could be a JSON Schema
object but was handled like a list and later caused `KeyError:
'properties'`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Before migration: `GET /v1/document/artifact/<filename>`
- After migration: `GET /api/v1/documents/artifact/<filename>`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fixes #14196
## Problem
When using DeepDOC to parse large PDFs (over 1000 pages), the parser
silently truncated processing at 300 pages due to a hardcoded default
`page_to=299` in `RAGFlowPdfParser.__images__()`. This caused:
- **Errors** on pages beyond the limit
- **Poor image quality** as the parser attempted to compensate with
missing page data
- **Inconsistent chunk splitting** between full PDF imports and partial
imports
Additionally, the codebase scattered magic numbers (`299`, `600`,
`10000`, `100000`, `100000000`, `10000000000`, `10**9`) across 22 files
as sentinel values for "parse all pages", making future maintenance
error-prone.
## Root Cause
```python
# deepdoc/parser/pdf_parser.py (before)
def __images__(self, fnm, zoomin=3, page_from=0, page_to=299, callback=None):
    # Only the first 300 pages were rendered; everything beyond was silently dropped
    ...
```
While most callers in `rag/app/*.py` correctly passed `to_page=100000`,
the base class `RAGFlowPdfParser.__call__()` and `parse_into_bboxes()`
invoked `__images__` **without** forwarding `page_from`/`page_to`,
falling back to the restrictive default of 299.
## Solution
### 1. Define constants in `common/constants.py`
```python
MAXIMUM_PAGE_NUMBER = 100000 # Used by the parsing layer
MAXIMUM_TASK_PAGE_NUMBER = MAXIMUM_PAGE_NUMBER * 1000 # Used by the task/DB layer
```
### 2. Replace all hardcoded sentinel values
| Layer | Files Changed | Old Values | New Value |
|---|---|---|---|
| **Deepdoc parsers** | `pdf_parser.py`, `mineru_parser.py`, `docling_parser.py`, `opendataloader_parser.py`, `paddleocr_parser.py`, `docx_parser.py` | `299`, `600`, `10**9`, `100000000` | `MAXIMUM_PAGE_NUMBER` |
| **Chunk parsers** | `naive.py`, `book.py`, `qa.py`, `one.py`, `manual.py`, `paper.py`, `presentation.py`, `laws.py`, `resume.py`, `email.py`, `table.py` | `100000`, `10000`, `10000000000` | `MAXIMUM_PAGE_NUMBER` |
| **Task/DB layer** | `db_models.py`, `task_service.py`, `document_service.py`, `file_service.py` | `100000000` | `MAXIMUM_TASK_PAGE_NUMBER` |
### 3. Fix `parse_into_bboxes()` missing parameters
Added `from_page`/`to_page` parameters to `parse_into_bboxes()` so that
the `rag/flow/parser/parser.py` DeepDOC path no longer falls back to the
restrictive default.
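The fallback problem and its fix can be reproduced in miniature. The functions below are hypothetical stand-ins, not the `RAGFlowPdfParser` methods; the sketch assumes pages are selected with a slice and that the old default of `299` applies whenever the caller omits the range.

```python
MAXIMUM_PAGE_NUMBER = 100000  # same value as the constant defined above


def images(total_pages, page_from=0, page_to=299):
    """Stand-in renderer that keeps the old restrictive default."""
    return list(range(total_pages))[page_from:page_to]


def parse_old(total_pages):
    """Before: the page range is not forwarded, so the default cap applies."""
    return images(total_pages)


def parse_new(total_pages, from_page=0, to_page=MAXIMUM_PAGE_NUMBER):
    """After: the caller's range is forwarded explicitly."""
    return images(total_pages, page_from=from_page, page_to=to_page)


old_pages = parse_old(1000)
new_pages = parse_new(1000)
```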
## Files Changed (22)
- `common/constants.py`
- `deepdoc/parser/pdf_parser.py`
- `deepdoc/parser/mineru_parser.py`
- `deepdoc/parser/docling_parser.py`
- `deepdoc/parser/opendataloader_parser.py`
- `deepdoc/parser/paddleocr_parser.py`
- `deepdoc/parser/docx_parser.py`
- `rag/app/naive.py`
- `rag/app/book.py`
- `rag/app/qa.py`
- `rag/app/one.py`
- `rag/app/manual.py`
- `rag/app/paper.py`
- `rag/app/presentation.py`
- `rag/app/laws.py`
- `rag/app/resume.py`
- `rag/app/email.py`
- `rag/app/table.py`
- `api/db/db_models.py`
- `api/db/services/task_service.py`
- `api/db/services/document_service.py`
- `api/db/services/file_service.py`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
As title.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
AI coding agents (Claude, Copilot, etc.) tend to directly edit files in
`src/components/ui/` when asked to tweak styles or add props, treating
them like ordinary feature code. This silently breaks the shared
component library that both shadcn primitives and project-authored
common components live in.
This PR adds a `Shared UI Component Lock` convention to `web/CLAUDE.md`
to instruct AI agents to treat the entire `src/components/ui/` directory
as read-only. Any customization must be done via wrappers or composition
outside the directory; exceptions require explicit user approval.
### Type of change
- [x] Other (please describe): Update `CLAUDE.md`
## Summary
PDF files often contain a bookmark/outline tree (table of contents built
into the file by the authoring tool). RAGFlow's `pdf_parser.outlines`
already extracts these `(title, depth)` tuples via pypdf, but they are
used ephemerally during chunking (`manual` parser uses them for
hierarchy detection) and then discarded.
This PR persists the outline as `doc.meta_fields["outline"]` — a JSON
array of `{"title": str, "depth": int}` objects — so downstream features
can use the structural information.
### Why this matters
- **Complementary to `toc_extraction`** — the existing `toc_extraction`
feature uses LLM calls to generate a TOC and only works for the `naive`
parser. The raw PDF outline is free (already extracted by pypdf), works
for all parsers, and captures the author's original document structure.
- **Document navigation** — frontends can render a clickable TOC from
the outline
- **Entity extraction** — the outline provides a structural map for
identifying document sections and key topics
- **Search result context** — knowing which section a chunk belongs to
helps users evaluate relevance
### Changes
| File | Change | LOC |
|------|--------|-----|
| `rag/app/naive.py` | Attach `pdf_parser.outlines` as `__outline__` on
first chunk dict | ~7 |
| `rag/app/manual.py` | Same for the manual parser | ~5 |
| `rag/svr/task_executor.py` | Extract `__outline__`, persist via
`DocMetadataService.update_document_metadata()` | ~12 |
### Design decisions
- **Transient key pattern**: The outline is passed from parser →
task_executor via `__outline__` on the first chunk dict, then removed
before indexing. This follows the same pattern as `metadata_obj` for
LLM-generated metadata.
- **No schema changes**: Uses the existing `meta_fields` JSON column on
the document table.
- **Graceful degradation**: If a PDF has no outline (common for scanned
docs), nothing is stored. If persistence fails, it logs a warning and
continues — parsing is not interrupted.
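The transient-key hand-off described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual `task_executor.py` code: `extract_outline` and `merge_outline` are hypothetical helper names, and the real persistence goes through `DocMetadataService.update_document_metadata()`, whose signature is not reproduced here.

```python
import json

OUTLINE_KEY = "__outline__"  # transient key the parser sets on the first chunk


def extract_outline(chunks):
    """Pop the transient outline from the first chunk, if present.

    Returns a list of {"title": str, "depth": int} dicts, or None when the
    PDF had no outline. The key is removed so that it is never indexed.
    """
    if not chunks:
        return None
    raw = chunks[0].pop(OUTLINE_KEY, None)
    if not raw:
        return None
    return [{"title": str(title), "depth": int(depth)} for title, depth in raw]


def merge_outline(meta_fields, outline):
    """Hypothetical helper: merge the outline into a copy of meta_fields."""
    merged = dict(meta_fields or {})
    if outline:
        merged["outline"] = outline
    return merged


chunks = [
    {"content": "Intro text", OUTLINE_KEY: [("Chapter 1", 0), ("Section 1.1", 1)]},
    {"content": "More text"},
]
outline = extract_outline(chunks)
meta = merge_outline({"author": "someone"}, outline)
print(json.dumps(meta, ensure_ascii=False))
```

Merging into a copy rather than overwriting is what keeps any existing `meta_fields` entries intact.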
### Backward compatibility
- **Fully backward compatible** — no existing fields, behavior, or
schemas changed
- PDFs without outlines are unaffected
- Existing `meta_fields` data is preserved (merged, not overwritten)
## Test plan
- [ ] Parse a PDF with bookmarks (e.g. any multi-chapter document),
verify `meta_fields["outline"]` is populated
- [ ] Parse a PDF without bookmarks, verify no errors and no outline key
in meta_fields
- [ ] Verify existing `meta_fields` data is preserved (not overwritten)
when outline is added
- [ ] Verify `manual` parser also persists outlines
- [ ] Verify outline JSON structure: `[{"title": "Chapter 1", "depth":
0}, ...]`
Related: #9921 (Deterministic Document Access Layer)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: yuch85 <yuch85.1@gmail.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
### What problem does this PR solve?
## Summary
Closes #6102
When using Infinity as the document store engine (GPU version), calling
`update()` on a non-existent table throws an unhandled
`InfinityException` with error code 3022 (`TABLE_NOT_EXIST`). This
causes users to see a raw "3022" error when clicking on a parsed
document.
## Root Cause
The `update()` methods in both `rag/utils/infinity_conn.py` and
`memory/utils/infinity_conn.py` call `db_instance.get_table(table_name)`
without catching `InfinityException`. In contrast, other CRUD methods
(`insert`, `delete`, `search`) all handle this exception gracefully:
| Method | Handles table-not-exist? | Behavior |
|----------|--------------------------|----------|
| `insert` | ✅ Yes | Auto-creates the table |
| `search` | ✅ Yes | Skips the table |
| `delete` | ✅ Yes | Returns 0 |
| `update` | ❌ **No** | Crashes with 3022 |
Additionally, `api/apps/document_app.py` worked around this with a
fragile string match (`"3022" in msg`) to detect the error.
## Changes
- **`rag/utils/infinity_conn.py`**: Catch `InfinityException` in
`update()`. When `TABLE_NOT_EXIST` is detected, log a warning and return
`False` — consistent with `delete()`.
- **`memory/utils/infinity_conn.py`**: Apply the same fix to its
`update()` method.
- **`api/apps/document_app.py`**: Remove the fragile `"3022"`
string-matching workaround. Table-not-exist is now handled by the `if
not ok` path with an improved error message.
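The error-handling pattern can be illustrated with a small self-contained sketch. `FakeInfinityException` and `FakeDB` are stand-ins invented for this example (the real SDK classes are not imported here), and the attribute name `error_code` is an assumption; only the 3022 code and the "log a warning, return `False`" behavior come from the description above.

```python
import logging

TABLE_NOT_EXIST = 3022  # error code quoted in the description above


class FakeInfinityException(Exception):
    """Stand-in for the SDK exception (hypothetical shape)."""

    def __init__(self, error_code, message=""):
        super().__init__(message)
        self.error_code = error_code


class FakeDB:
    """Stand-in database handle that raises when a table is missing."""

    def __init__(self, tables):
        self.tables = tables

    def get_table(self, name):
        if name not in self.tables:
            raise FakeInfinityException(TABLE_NOT_EXIST, f"table {name} not found")
        return self.tables[name]


def update(db, table_name, condition, new_value):
    """Return False instead of crashing when the table does not exist."""
    try:
        table = db.get_table(table_name)
    except FakeInfinityException as exc:
        if exc.error_code == TABLE_NOT_EXIST:
            logging.warning("update skipped: table %s does not exist", table_name)
            return False
        raise  # any other error is still surfaced to the caller
    table.append((condition, new_value))
    return True
```

Callers can then branch on the boolean result rather than matching `"3022"` in an error string.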
### Type of change
- [x] Refactoring
---------
Signed-off-by: noob <yixiao121314@outlook.com>
## What does this PR do?
Fixes the `hint : 103 Only owner of canvas authorized for this
operation` error that appears when opening a **Chat** shared link
(`/chats/share?shared_id=...&from=chat`).
## Root Cause
The Chat shared page (`web/src/pages/next-chats/share/index.tsx`)
unconditionally calls `useFetchFlowSSE()`, which requests
`/api/canvas/getsse/{sharedId}`. This is an Agent Canvas endpoint that
validates canvas ownership. When sharing a **Chat** dialog (not an
Agent):
1. `sharedId` is a `dialog_id`, not a `canvas_id`
2. The API token's `tenant_id` doesn't match any canvas owner
3. The backend returns `code: 103, message: "Only owner of canvas
authorized for this operation."`
4. The global error interceptor in `request.ts` displays it as a
notification: `hint : 103 Only owner of canvas authorized for this
operation.`
## Changes
- **`web/src/hooks/use-agent-request.ts`**: Added an `enabled` parameter
to `useFetchFlowSSE` so callers can conditionally skip the query.
- **`web/src/pages/next-chats/share/index.tsx`**: Only enable
`useFetchFlowSSE` when `from === SharedFrom.Agent`. For Chat shares, the
hook is disabled, avoiding the unnecessary canvas API call entirely.
## Related Issue
Closes #14115
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
## Summary
RAPTOR's recursive clustering builds a `layers` list tracking
`(start_idx, end_idx)` boundaries per level, but currently discards this
information — only the flat `chunks` list is returned. This makes it
impossible to distinguish leaf-level summaries from top-level ones.
This PR:
- Returns `(chunks, layers)` tuple from `raptor.py`'s `__call__`
- Annotates each RAPTOR summary chunk with `raptor_layer_int` (1 = first
summary level, 2 = summary-of-summaries, etc.)
- Adds `raptor_layer_int` to `infinity_mapping.json` (Elasticsearch
handles it via existing `*_int` dynamic template)
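As a rough sketch of how the layer boundaries could be turned into per-chunk layer numbers: the code below assumes that `layers` holds half-open `(start_idx, end_idx)` ranges into `chunks` and that the first range covers the original leaf chunks (layer 0). Both are assumptions for illustration, and `build_chunk_layer` is a hypothetical name.

```python
def build_chunk_layer(layers):
    """Map each chunk index to its layer number.

    layers: list of (start_idx, end_idx) half-open ranges, one per level.
    Layer 0 is assumed to be the original chunks; 1 is the first summary level.
    """
    chunk_layer = {}
    for layer_no, (start, end) in enumerate(layers):
        for idx in range(start, end):
            chunk_layer[idx] = layer_no
    return chunk_layer


# Six leaf chunks, three first-level summaries, one top-level summary.
layers = [(0, 6), (6, 9), (9, 10)]
chunk_layer = build_chunk_layer(layers)

# Only summary chunks (layer >= 1) would carry a non-zero raptor_layer_int.
summary_layers = {idx: layer for idx, layer in chunk_layer.items() if layer > 0}
print(summary_layers)
```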
### Why this matters
Downstream features need to know which RAPTOR layer a summary belongs
to:
- **Retrieving the top-level document summary** for entity extraction,
search snippets, or document comparison
- **Filtering by abstraction level** — users may want only high-level
summaries or only leaf-level cluster summaries
- **RAPTOR recall quality** — #10951 reports summaries not being
recalled for definition queries; layer metadata enables targeted
retrieval
### Changes
| File | Change | LOC |
|------|--------|-----|
| `rag/raptor.py` | Return `(chunks, layers)` tuple | ~3 |
| `rag/svr/task_executor.py` | Build `chunk_layer` mapping, set
`raptor_layer_int` | ~12 |
| `conf/infinity_mapping.json` | Add `raptor_layer_int` integer field |
~1 |
### Backward compatibility
- **Additive only** — no existing fields or behavior changed
- Existing RAPTOR chunks continue to work (they'll have
`raptor_layer_int = 0` by default)
- New RAPTOR chunks get layer metadata automatically
## Test plan
- [ ] Parse a document with RAPTOR enabled, verify `raptor_layer_int` is
set on indexed chunks
- [ ] Verify `raptor_layer_int` values increase with abstraction level
(layer 1 < layer 2 < ...)
- [ ] Verify existing RAPTOR deletion (`delete by raptor_kwd`) still
works
- [ ] Verify Infinity backend accepts the new field
Fixes #7488
Related: #4104, #11191, #10951
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: yuch85 <yuch85.1@gmail.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
### What problem does this PR solve?
The POST /upload_info?url=<url> endpoint accepted a user-supplied URL
and passed it directly to AsyncWebCrawler without any validation. There
were no restrictions on URL scheme, destination hostname, or resolved IP
address. This allowed any authenticated user to instruct the server to
make outbound HTTP requests to internal infrastructure — including RFC
1918 private networks, loopback addresses, and cloud metadata services
such as http://169.254.169.254 — effectively using the server as a proxy
for internal network reconnaissance or credential theft.
This PR adds an SSRF guard (_validate_url_for_crawl) that runs before
any crawl is initiated. It enforces an allowlist of safe schemes
(http/https), resolves the hostname at validation time, and rejects any
URL whose resolved IP falls within a private or reserved network range.
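A guard of this kind might look roughly like the sketch below. It is not the actual `_validate_url_for_crawl` implementation: the function names, the returned `(ok, reason)` tuple, and the injectable `resolver` parameter are assumptions made so the example is self-contained and testable without network access.

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def resolve_host(hostname):
    """Resolve a hostname to the set of IP address strings it maps to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_forbidden_ip(ip_text):
    """True for private, loopback, link-local, and other non-public ranges."""
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])  # drop IPv6 zone id
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def validate_url_for_crawl(url, resolver=resolve_host):
    """Return (ok, reason); reject bad schemes and non-public destinations."""
    parsed = urlparse(url)
    if parsed.scheme.lower() not in ALLOWED_SCHEMES:
        return False, "scheme not allowed"
    if not parsed.hostname:
        return False, "missing hostname"
    try:
        addresses = resolver(parsed.hostname)
    except (socket.gaierror, UnicodeError):
        return False, "hostname does not resolve"
    if not addresses:
        return False, "hostname does not resolve"
    for address in addresses:
        if is_forbidden_ip(address):
            return False, f"resolved address {address} is not public"
    return True, "ok"
```

Every resolved address is checked, so a hostname that maps to both a public and a private IP is rejected. Note that a check at validation time cannot by itself rule out the hostname resolving differently when the crawl actually runs.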
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Allow searching by `id` or `_id` when using Elasticsearch as the doc engine.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: introduce minimum type check for pipeline
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The button styles in the PaddleOCR dialog are not applying
correctly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Copilot <copilot@github.com>
### What problem does this PR solve?
Blob storage sync was downloading unsupported files first and rejecting
them later, which wasted bandwidth and made sync slower. This PR skips
files with unsupported extensions before download and applies the
`allow_images` setting in blob sync. Fixes #14338.
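A pre-download filter along these lines can be sketched as follows. The extension sets and the `should_download` name are hypothetical placeholders, not the lists the connector actually uses.

```python
from pathlib import PurePosixPath

# Hypothetical extension sets, for illustration only.
SUPPORTED_DOC_EXTS = {".pdf", ".docx", ".txt", ".md"}
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def should_download(blob_key, allow_images=False):
    """Decide from the object key alone whether the blob is worth fetching."""
    ext = PurePosixPath(blob_key).suffix.lower()
    if ext in SUPPORTED_DOC_EXTS:
        return True
    return allow_images and ext in IMAGE_EXTS


keys = ["reports/q1.PDF", "photos/team.png", "backups/dump.bin"]
print([k for k in keys if should_download(k)])
print([k for k in keys if should_download(k, allow_images=True)])
```

Because the decision uses only the key, no bytes are transferred for files that would be rejected later.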
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Resolves #14211
**Background:** Currently, RAGFlow routes all Docling parsing through
the standard `/convert/source` endpoint. For large documents, this
returns massive, unchunked text that exceeds RAGFlow's internal
embedding model context limits, causing pipeline failures.
**Solution:**
This PR updates the `_parse_pdf_remote` ingestion logic in
`docling_parser.py` to prioritize `docling-serve`'s native chunking
endpoints (`/v1/chunk/source` and `/v1alpha/chunk/source`).
- By receiving pre-sliced chunk objects directly from Docling, RAGFlow
avoids overflowing the embedding model's token limit.
- A graceful fallback to the standard `/convert/source` endpoints
maintains backwards compatibility for users running older versions of
the Docling server that return 404s on the new routes.
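The endpoint preference and 404 fallback can be sketched as below. The `post` callable is a hypothetical transport returning `(status_code, body)`, the convert path is written exactly as it appears in the text above (any version prefix is left out), and raising on other non-200 responses is a choice made for this sketch rather than a statement about the real parser.

```python
CHUNK_ENDPOINTS = ["/v1/chunk/source", "/v1alpha/chunk/source"]
CONVERT_ENDPOINTS = ["/convert/source"]  # path as written in the description


def parse_remote(post, payload):
    """Try the chunking routes first, then fall back to plain conversion.

    post(path, payload) -> (status_code, body) is a hypothetical transport.
    Returns ("chunks", body) or ("document", body).
    """
    for path in CHUNK_ENDPOINTS:
        status, body = post(path, payload)
        if status == 200:
            return "chunks", body
        if status != 404:  # only a missing route triggers the fallback
            raise RuntimeError(f"{path} failed with HTTP {status}")
    for path in CONVERT_ENDPOINTS:
        status, body = post(path, payload)
        if status == 200:
            return "document", body
    raise RuntimeError("no Docling endpoint succeeded")


def old_server(path, payload):
    """Simulates an older server that only knows the convert route."""
    if path in CONVERT_ENDPOINTS:
        return 200, {"text": "full document"}
    return 404, None


def new_server(path, payload):
    """Simulates a server that exposes the first chunking route."""
    if path == "/v1/chunk/source":
        return 200, {"chunks": ["a", "b"]}
    return 404, None


print(parse_remote(old_server, {"source": "file.pdf"}))
print(parse_remote(new_server, {"source": "file.pdf"}))
```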
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Allow image2text models (multimodal) to be used as chat models.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
The Langfuse Python SDK v3+ removed the `start_generation()` method.
RAGFlow's code still called this non-existent method, causing an
`AttributeError` when Langfuse tracing is enabled.
Replace all `start_generation()` calls with
`start_observation(as_type="generation")`, which is the correct v4 SDK
API.
Affected files:
- api/db/services/llm_service.py (12 occurrences)
- api/db/services/dialog_service.py (1 occurrence)
Fixes #14204
Related to #9243
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
When Azure Blob is used as the file container, clicking "parse file"
calls:
```python
partial(settings.STORAGE_IMPL.put, tenant_id=task["tenant_id"])
```
So any storage backend used there must accept `tenant_id` as a keyword
argument. `RAGFlowAzureSasBlob.put()` did not, causing:
```
TypeError: ... got an unexpected keyword argument 'tenant_id'
```
Now it does, so parsing should proceed past this point.
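The failure mode is easy to reproduce in isolation. In the sketch below, `OldBlobStore` and `NewBlobStore` are hypothetical stand-ins for a backend before and after the fix, and the positional parameters of `put` are assumed for illustration.

```python
from functools import partial


class OldBlobStore:
    """Stand-in backend whose put() does not accept tenant_id."""

    def put(self, bucket, name, data):
        return f"{bucket}/{name}"


class NewBlobStore:
    """Stand-in backend whose put() accepts (and here ignores) tenant_id."""

    def put(self, bucket, name, data, tenant_id=None):
        return f"{bucket}/{name}"


def call_put(store):
    # Mirrors the call site: tenant_id is bound as a keyword argument up front.
    put = partial(store.put, tenant_id="tenant-1")
    return put("bucket", "doc.pdf", b"data")


try:
    call_put(OldBlobStore())
except TypeError as exc:
    print("old backend fails:", exc)

print("new backend works:", call_put(NewBlobStore()))
```

Binding the keyword with `partial` does not fail by itself; the `TypeError` only appears when the wrapped method is finally called, which is why the problem surfaced at parse time.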
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
`check_ragflow_server_alive()` in `api/utils/health_utils.py` calls
`requests.get(url)` without a `timeout` parameter. Unlike
`check_minio_alive()` which correctly specifies `timeout=10`, this
health check can hang indefinitely if the server is unresponsive.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes
Added `timeout=10` to the `requests.get()` call, consistent with
`check_minio_alive()`.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Bumps [lxml](https://github.com/lxml/lxml) from 6.0.2 to 6.1.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/lxml/lxml/blob/master/CHANGES.txt">lxml's
changelog</a>.</em></p>
<blockquote>
<h1>6.1.0 (2026-04-17)</h1>
<p>This release fixes a possible external entity injection (XXE)
vulnerability in
<code>iterparse()</code> and the <code>ETCompatXMLParser</code>.</p>
<h2>Features added</h2>
<ul>
<li>
<p>GH#486: The HTML ARIA accessibility attributes were added to the set
of safe attributes
in <code>lxml.html.defs</code>. This allows <code>lxml_html_clean</code>
to pass them through.
Patch by oomsveta.</p>
</li>
<li>
<p>The default chunk size for reading from file-likes in
<code>iterparse()</code> is now configurable
with a new <code>chunk_size</code> argument.</p>
</li>
</ul>
<h2>Bugs fixed</h2>
<ul>
<li>LP#2146291: The <code>resolve_entities</code> option was still set
to <code>True</code> for
<code>iterparse</code> and <code>ETCompatXMLParser</code>, allowing for
external entity injection (XXE)
when using these parsers without setting this option explicitly.
The default was now changed to <code>'internal'</code> only (as for the
normal XML and HTML parsers
since lxml 5.0).
Issue found by Sihao Qiu as CVE-2026-41066.</li>
</ul>
<h1>6.0.4 (2026-04-12)</h1>
<h2>Bugs fixed</h2>
<ul>
<li>LP#2148019: Spurious MemoryError during namespace cleanup.</li>
</ul>
<h1>6.0.3 (2026-04-09)</h1>
<h2>Bugs fixed</h2>
<ul>
<li>
<p>Several out of memory error cases now raise <code>MemoryError</code>
that were not handled before.</p>
</li>
<li>
<p>Slicing with large step values (outside of <code>+/-
sys.maxsize</code>) could trigger undefined C behaviour.</p>
</li>
<li>
<p>LP#2125399: Some failing tests were fixed or disabled in PyPy.</p>
</li>
<li>
<p>LP#2138421: Memory leak in error cases when setting the
<code>public_id</code> or <code>system_url</code> of a document.</p>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="43722f4402"><code>43722f4</code></a>
Update changelog.</li>
<li><a
href="87470409b1"><code>8747040</code></a>
Name version of option change in docstring.</li>
<li><a
href="6c36e6cef7"><code>6c36e6c</code></a>
Fix pypistats URL in download statistics script.</li>
<li><a
href="c7d76d6cb8"><code>c7d76d6</code></a>
Change security policy to point to Github security advisories.</li>
<li><a
href="378ccf82db"><code>378ccf8</code></a>
Update project income report.</li>
<li><a
href="315270b810"><code>315270b</code></a>
Docs: Reduce TOC depth of package pages and move module contents
first.</li>
<li><a
href="6dbba7f3c7"><code>6dbba7f</code></a>
Docs: Show current year in copyright line.</li>
<li><a
href="e4385bfa5d"><code>e4385bf</code></a>
Update project income report.</li>
<li><a
href="5bed1e1a22"><code>5bed1e1</code></a>
Validate file hashes in release download script.</li>
<li><a
href="c13ee10a42"><code>c13ee10</code></a>
Prepare release of 6.1.0.</li>
<li>Additional commits viewable in <a
href="https://github.com/lxml/lxml/compare/lxml-6.0.2...lxml-6.1.0">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### What problem does this PR solve?
- Before migration (web API): `POST /v1/document/metadata/update`
- After migration (RESTful API): `PATCH /api/v2/datasets/<dataset_id>/documents/metadatas`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: Recall Test Page Metadata Not Displaying.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Add new provider minimax
2. Add new command: CHECK INSTANCE 'instance_name' FROM 'provider_name';
```
RAGFlow(user)> check instance 'test' from 'minimax';
SUCCESS
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Some bugs
- Pipeline runtime log files could not be viewed
- Corrected TOC terminology errors in the English translation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix: Remove duplicate text output from the thought model on the chat
page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Before migration (web API): `POST /v1/document/update_metadata_setting`
- After consolidation (RESTful API): `PUT /api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This PR fixes the merge-phase crash reported in #14236 during GraphRAG
entity resolution.
The issue happens after candidate pair resolution completes, when
multiple merge coroutines mutate the same shared `networkx` graph
concurrently. In `_merge_graph_nodes`, the code iterates over
`graph.neighbors(node1)` and also awaits during edge/description
merging. That allows another coroutine to modify the graph adjacency
structure in between, which can trigger `RuntimeError: dictionary keys
changed during iteration` and can also lead to unsafe shared-graph
mutation.
This change keeps the PR scoped to that single issue by:
- serializing merge-time graph mutations with a dedicated merge lock
- snapshotting `graph.neighbors(node1)` with `list(...)` before
iteration
Together, these changes prevent concurrent mutation of the shared graph
during the merge phase and make the merge loop safe against live-view
invalidation.
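The two measures can be demonstrated on a toy graph. The sketch below uses a plain dict of adjacency sets instead of `networkx`, and `merge_nodes` is a hypothetical, much simplified stand-in for `_merge_graph_nodes`; the `asyncio.sleep(0)` call only marks where the real code awaits.

```python
import asyncio


async def merge_nodes(graph, lock, keep, drop):
    """Merge node `drop` into `keep` while holding the shared merge lock."""
    async with lock:  # serialize all merge-time graph mutations
        if keep not in graph or drop not in graph:
            return
        # Snapshot the neighbors: the loop body mutates adjacency sets.
        for neighbor in list(graph[drop]):
            await asyncio.sleep(0)  # stands in for awaited description merging
            if neighbor != keep:
                graph[keep].add(neighbor)
                graph[neighbor].add(keep)
            graph[neighbor].discard(drop)
        del graph[drop]


async def main():
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c", "d"},
        "c": {"a", "b"},
        "d": {"b"},
    }
    lock = asyncio.Lock()
    await asyncio.gather(
        merge_nodes(graph, lock, "a", "b"),
        merge_nodes(graph, lock, "a", "c"),
    )
    return graph


result = asyncio.run(main())
print(result)
```

Without the lock, the second coroutine could run during the first one's `await` and change the sets being iterated. Without the snapshot, even a single coroutine would be iterating a set that it is itself mutating.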
Fixes #14236
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Replace single `Read()` call in Go upload service with `io.ReadAll()`.
- Prevent potential truncated/corrupted file content during multipart
upload.
- Keep existing API behavior unchanged while fixing data integrity risk.
## Root Cause
`io.Reader.Read()` may return fewer bytes than requested without an
error. The previous implementation read once into a full buffer and
assumed all bytes were populated.
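Short reads are not specific to Go. The following Python analogue (an illustration of the same pitfall, not the Go fix itself) uses a hypothetical `ShortReader` that hands out at most a few bytes per call, the way a network or multipart stream may.

```python
import io


class ShortReader(io.RawIOBase):
    """Raw stream that returns at most `max_per_read` bytes per read call."""

    def __init__(self, data, max_per_read):
        super().__init__()
        self._data = data
        self._pos = 0
        self._max = max_per_read

    def readable(self):
        return True

    def readinto(self, buffer):
        n = min(len(buffer), self._max, len(self._data) - self._pos)
        buffer[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n  # 0 signals end of stream


payload = b"0123456789"

# A single read() may legitimately return fewer bytes than requested.
single = ShortReader(payload, 4).read(len(payload))

# readall() keeps reading until end of stream, like io.ReadAll in Go.
complete = ShortReader(payload, 4).readall()

print(single, complete)
```

Treating the result of one `read` as the whole file silently truncates it, which is the data-integrity risk described above.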
## Test plan
- Upload files of multiple sizes and verify uploaded content integrity.
- Confirm upload endpoint still returns successful responses.
- Verify downstream document parsing works on uploaded files.
## Issues
Closes #14266
### What problem does this PR solve?
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Add Astraflow Provider Support
This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud
/ 优刻得) as a new AI model provider in RAGFlow, with support for both
global and China endpoints.
### About Astraflow
Astraflow is an OpenAI-compatible AI model aggregation platform
supporting 200+ models from major providers including DeepSeek, Qwen,
GPT, Claude, Gemini, Llama, Mistral, and more.
| Variant | Factory Name | Endpoint | Env Var |
|---------|-------------|----------|---------|
| Global | `Astraflow` | `https://api-us-ca.umodelverse.ai/v1` |
`ASTRAFLOW_API_KEY` |
| China | `Astraflow-CN` | `https://api.modelverse.cn/v1` |
`ASTRAFLOW_CN_API_KEY` |
- **API key signup**: https://astraflow.ucloud.cn/
---
### Files Changed
| File | Change |
|------|--------|
| `rag/llm/__init__.py` | Register `Astraflow` and `Astraflow-CN` in
`SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and
`LITELLM_PROVIDER_PREFIX` |
| `rag/llm/chat_model.py` | Add `AstraflowChat` and `AstraflowCNChat`
(OpenAI-compatible `Base` subclass) |
| `rag/llm/embedding_model.py` | Add `AstraflowEmbed` and
`AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) |
| `rag/llm/rerank_model.py` | Add `AstraflowRerank` and
`AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) |
| `rag/llm/cv_model.py` | Add `AstraflowCV` and `AstraflowCNCV`
(subclasses of `GptV4`) |
| `rag/llm/tts_model.py` | Add `AstraflowTTS` and `AstraflowCNTTS`
(subclasses of `OpenAITTS`) |
| `rag/llm/sequence2txt_model.py` | Add `AstraflowSeq2txt` and
`AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) |
| `conf/llm_factories.json` | Register `Astraflow` and `Astraflow-CN`
factories with a curated list of popular models |
---
### Supported Model Types
- ✅ **Chat / LLM** — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7,
Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more
- ✅ **Text Embedding** — text-embedding-3-small/large
- ✅ **Image / Vision (IMAGE2TEXT)** — GPT-4o, GPT-4.1, Claude, Gemini,
Llama-4, etc.
- ✅ **Text Re-Rank**
- ✅ **TTS** — tts-1
- ✅ **Speech-to-Text (SPEECH2TEXT)** — whisper-1
### Implementation Notes
- Uses the `openai/` LiteLLM prefix — consistent with other
OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI,
OpenRouter, n1n, Avian, etc.)
- `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249)
are separate factory entries, allowing users to choose the optimal
endpoint based on their region.
- All model classes cleanly subclass existing base classes (`Base`,
`OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`)
with no custom logic needed — the provider is fully OpenAI-compatible.
---------
Co-authored-by: user <user@xzaaaMacBook-Air.local>
### What problem does this PR solve?
Update the MinerU parser to the most recent MinerU v3 logic.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add documentation for searching messages with `user_id`, and add SDK support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
As title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Update the MinerU endpoint to `/pdf_parse`, which has been exposed since v3.x.
Fixes #14263
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Normalize think tags in the final chat answer.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Before consolidation (web API): `POST /v1/document/rm`
- Before consolidation (HTTP API): `DELETE /api/v1/datasets/<dataset_id>/documents`
- After consolidation (RESTful API): `DELETE /api/v1/datasets/<dataset_id>/documents`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Refactor /api/v1/chats to be more RESTful.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Before consolidation (web API): `POST /v1/document/infos`
- Before consolidation (HTTP API): `GET /api/v1/datasets/<dataset_id>/documents`
- After consolidation (RESTful API): `GET /api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2`
### Type of change
- [ ] Refactoring
Closes #14165
Add a short documentation page under Developer Guides introducing
DeepWiki as a resource for developers doing secondary development or
exploring RAGFlow's codebase internals.
---------
Co-authored-by: hyl64 <hyl64@users.noreply.github.com>
### What problem does this PR solve?
- Before consolidation (web API): `POST /v1/document/filter`
- Before consolidation (HTTP API): `GET /api/v1/datasets/<dataset_id>/documents`
- After consolidation (RESTful API): `GET /api/v1/datasets/<dataset_id>/documents?type=filter`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.24.0 to v0.25.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Get metadata configuration from union of custom metadata and
built_in_metadata.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Component definition is missing display name.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Supports stream and non-stream chat
2. Supports think and non-think chat
3. List supported models from DeepSeek service. (This command can be
used to verify the API validity)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Editing an empty response in the retrieval operator will cause the
focus to shift to the metadata input box.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The minimum value for the "Suggested text block size" input box is
set to 1.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
OpenSource Resume is supported only with Elasticsearch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The number of chunks in the file list is not displayed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The mind map on the search page does not display completely upon
initial loading.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
To attach the debugger to a running Docker container, the debugger has
to be installed inside the Docker image.
### What problem does this PR solve?
[#14224](https://github.com/infiniflow/ragflow/issues/14224)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes #14206.
This issue is a regression. PR #9520 previously changed Gemini models
from `image2text` to `chat` to fix chat-side resolution, but PR #13073
later restored those Gemini entries to `image2text` during model-list
updates, which reintroduced the bug.
The underlying problem is that Gemini models are multimodal and
advertise both `CHAT` and `IMAGE2TEXT`, while tenant model resolution
still depends on a single stored `model_type`. That makes chat-only
flows such as memory extraction fragile when a compatible model is
stored as `image2text`.
This PR fixes the issue at the model resolution layer instead of
changing `llm_factories.json` again:
- keep the stored tenant model type unchanged
- try exact `model_type` lookup first
- if no exact match is found, fall back only when the model metadata
shows the requested capability is supported
- coerce the runtime config to the requested type for chat callers
- fail fast in memory creation instead of silently persisting
`tenant_llm_id=0`
This preserves existing multimodal and `image2text` behavior while
restoring chat compatibility for memory-related flows.
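The resolution order described above can be sketched as follows. `TenantModel`, its fields, and `resolve_model` are hypothetical names for illustration; the real lookup functions (`get_model_config_by_type_and_name` and friends) work against the database and are not reproduced here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TenantModel:
    name: str
    model_type: str          # the single stored type, e.g. "image2text"
    capabilities: frozenset  # advertised tags, e.g. {"chat", "image2text"}


def resolve_model(models, name, requested_type):
    """Exact type match first, then a capability-based fallback."""
    candidates = [m for m in models if m.name == name]
    for model in candidates:
        if model.model_type == requested_type:
            return model
    for model in candidates:
        if requested_type in model.capabilities:
            # Coerce only the runtime config; the stored record is untouched.
            return replace(model, model_type=requested_type)
    raise LookupError(f"no {requested_type} model named {name!r}")


stored = [
    TenantModel("gemini-style", "image2text", frozenset({"chat", "image2text"})),
    TenantModel("ocr-only", "image2text", frozenset({"image2text"})),
]

runtime = resolve_model(stored, "gemini-style", "chat")
print(runtime.model_type, stored[0].model_type)
```

Raising when no capable model exists corresponds to the fail-fast behavior in memory creation, as opposed to persisting a placeholder id.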
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Testing
- Re-checked the current memory creation and memory message extraction
paths against the updated resolution logic
- Verified locally that a Gemini-style tenant model stored as
`image2text` but tagged with `CHAT` can still be resolved for `chat`
- Verified `get_model_config_by_type_and_name(..., CHAT, ...)` returns a
chat-compatible runtime config
- Verified `get_model_config_by_id(..., CHAT)` also returns a
chat-compatible runtime config
- Verified strict resolution still fails when the model metadata does
not advertise chat capability
### What problem does this PR solve?
Each model now supports regions with different URLs.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Before consolidation (web API): `POST /v1/document/list`
- Before consolidation (HTTP API): `GET /api/v1/datasets/<dataset_id>/documents`
- After consolidation (RESTful API): `GET /api/v1/datasets/<dataset_id>/documents`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add tips for installing Chinese fonts in the code sandbox. Otherwise,
`matplotlib` won't render Chinese correctly.
<img width="2082" height="1186" alt="sales_analysis"
src="https://github.com/user-attachments/assets/57e675ab-1e92-4662-9aeb-ad72a6121eb5"
/>
### Type of change
- [x] Documentation Update
https://bailian.console.aliyun.com/cn-beijing?tab=api#/api/?type=model&url=2780056
### Type of change
- [x] Other (please describe): add gte-rerank-v2 and qwen3-rerank
### What problem does this PR solve?
## Summary
Fixes #5939
Entity names containing single quotes (e.g., `投影直线L'`) caused SQL syntax
errors when building filter conditions for Infinity queries, due to
unescaped string interpolation in `equivalent_condition_to_str`.
## Changes
In `common/doc_store/infinity_conn_base.py`, added `.replace("'", "''")`
escaping for string values in two branches of
`equivalent_condition_to_str` where it was missing:
1. **`field_keyword` branch with non-list value** (line 190): The list
branch already escaped single quotes on line 183, but the single-string
branch did not.
2. **Plain string value branch** (line 209): Direct f-string
interpolation `{k}='{v}'` was vulnerable to unescaped quotes.
Both fixes use the same SQL-standard escape pattern (`'` → `''`) already
applied elsewhere in this method.
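The escape itself is a one-liner. The sketch below shows its effect on a value like the one from the issue; `escape_sql_string` and `equals_condition` are hypothetical helper names, and `entity_kwd` is used only as an example field.

```python
def escape_sql_string(value):
    """Double every single quote, the SQL-standard way to escape it."""
    return value.replace("'", "''")


def equals_condition(field, value):
    """Build a `field='value'` filter with the value safely quoted."""
    return f"{field}='{escape_sql_string(value)}'"


# Unescaped interpolation would end the string literal at the quote after L.
print(equals_condition("entity_kwd", "投影直线L'"))
```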
## How to Test
1. Upload a document containing entity names with single quotes.
2. Enable Knowledge Graph (GraphRAG) in the parsing configuration.
3. Initiate document parsing — it should complete without SQL syntax
errors.
## Note
The original issue also reported a typo (`dge_graph_kwd` instead of
`knowledge_graph_kwd`), which has already been fixed in the current
codebase.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
Fix: Clicking the empty dialog box on the agent exploration page
results in an error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Correctly set and display the parent-child config in `parser_config`, and
allow passing `tenant_id` in PATCH `/api/v1/chats`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Spaces cannot be entered in the code editor of the code operator.
[Monaco Editor with XYFlow fails to accept most space bar keypresses,
who is at fault?
#5204](https://github.com/microsoft/monaco-editor/discussions/5204)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The embedded page for search is inaccessible.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix: Add internationalization configurations related to text
segmentation identifiers.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The placeholder in PromptEditor is obscured.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Closes #9078
### What problem does this PR solve?
The `retrieval_test` endpoint in `chunk_app.py` never forwarded the
`highlight` request parameter to `retriever.retrieval()`, so the search
engine never produced highlight snippets. Additionally, the frontend
always rendered `content_with_weight` instead of preferring the
`highlight` field, and the CSS rule color `var(--accent-primary)` didn't
work because the variable stores an RGB triplet `(45,212,191)` requiring
the `rgb()` wrapper.
### Before
- Search page: displayed raw `content_with_weight` as a wall of plain
white text with no term highlighting, including markdown headings
rendered as literal text
- Retrieval testing page: showed `content_with_weight` in a plain `<p>`
tag, no `<em>` tags rendered, no highlight coloring
- Children chunks: when child chunks were consolidated into a parent via
`retrieval_by_children`, any highlight data from children was discarded
- TOC chunks: chunks fetched via `retrieval_by_toc` had no `highlight`
field, appearing as plain text while other chunks had highlights
**Retrieval testing**:
<img width="1449" height="1178"
alt="before-retrieval-no-highlight-cropped"
src="https://github.com/user-attachments/assets/5c6f5a5e-6c11-461a-bdb4-049d7dfb7a33"
/>
**Search**:
<img width="1378" height="711" alt="before-search-no-highlight-cropped"
src="https://github.com/user-attachments/assets/be7b5152-72ef-40da-a8fd-921e997ae7d3"
/>
### After
- Search page: displays the highlight field with search terms rendered
in teal/cyan color (`rgb(var(--accent-primary))`)
- Retrieval testing page: sends `highlight: true` in the request, uses
the `HighLightMarkdown` component to render `<em>` tags with proper coloring
- Children chunks: highlights from child chunks are joined and preserved
on the parent
- TOC chunks: when other chunks have highlights, TOC-fetched chunks use
`content_with_weight` as a highlight fallback
**Retrieval testing**:
<img width="1410" height="1015" alt="05-retrieval-testing-results"
src="https://github.com/user-attachments/assets/f0cff8cf-0962-4320-b559-cd5037f622d2"
/>
**Search**:
<img width="1294" height="455" alt="03-search-highlight-results"
src="https://github.com/user-attachments/assets/a90e0e3e-3837-46be-8ddd-2412ff7cbc19"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Trivial fix for log creation; follow-up to PR:
https://github.com/infiniflow/ragflow/pull/14136
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add a new agent template that demonstrates how to leverage the
`CodeExec` component to perform data analysis.
### Type of change
- [x] Other (please describe): Agent template
### What problem does this PR solve?
Updated ingestion pipeline template descriptions for better technical
accuracy and readability.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Correctly set the parent-child config in `parser_config`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The PromptEditor's placeholder is only half displayed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes #6034
Changes the `size` field in both `Document` and `File` models from
`IntegerField` (32-bit, max ~2GB) to `BigIntegerField` (64-bit, max
~9.2EB), and adds corresponding database migrations.
## Problem
When uploading a file larger than 2GB, the `size` value overflows a
32-bit signed integer (max 2,147,483,647). This causes:
- The stored `size` wraps around to an incorrect value (e.g., a 3GB file
shows as 2,097,152 KB in File Management).
- Subsequent file operations (e.g., download) fail because the corrupted
size leads to invalid storage lookups.
## Changes
- `Document.size`: `IntegerField` → `BigIntegerField`
- `File.size`: `IntegerField` → `BigIntegerField`
- Added `alter_db_column_type` migrations in `migrate_db()` for both
`document.size` and `file.size` columns to ensure existing deployments
are upgraded automatically.
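The size limits behind this change can be checked with a few lines of arithmetic. This is only an illustration of the integer ranges, not code from the migration; the helper names are assumptions. Note that the 2,097,152 KB figure quoted above is roughly the 32-bit ceiling (2,147,483,647 bytes) expressed in KB.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647: upper bound of a signed 32-bit column
INT64_MAX = 2**63 - 1  # about 9.2e18: upper bound of a signed 64-bit column


def fits_int32(n: int) -> bool:
    """True if n can be stored in a signed 32-bit integer."""
    return -(2**31) <= n <= INT32_MAX


def fits_int64(n: int) -> bool:
    """True if n can be stored in a signed 64-bit integer."""
    return -(2**63) <= n <= INT64_MAX


three_gib = 3 * 1024**3  # size in bytes of a 3 GiB upload

print(fits_int32(three_gib))  # the old column type cannot hold this value
print(fits_int64(three_gib))  # the new column type can
```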
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
Resolves #14137.
### Problem
Graph resolution succeeds (nodes/edges merged, pagerank updated), but
the subsequent burst of Infinity write operations in `set_graph`
exhausts the connection pool with `TOO_MANY_CONNECTIONS` errors. Root
causes:
1. **Hardcoded pool size** — `infinity_conn_pool.py` hardcoded
`ConnectionPool(max_size=4)` on initial creation and `max_size=32` on
refresh. Operators cannot tune this without patching code.
2. **No retry on transient failures** — a single `TOO_MANY_CONNECTIONS`
on edge deletes or chunk inserts kills the entire resolution+community
pipeline with no retry.
### Changes
#### `common/doc_store/infinity_conn_pool.py`
- Read `ConnectionPool` `max_size` from the `INFINITY_POOL_MAX_SIZE`
environment variable (default: `4`), applied consistently to both
initial creation and refresh paths.
- Log the actual pool size on startup for easier debugging.
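A minimal sketch of reading the pool size from the environment, assuming invalid or non-positive values fall back to the default. That fallback policy and the function name `read_pool_max_size` are assumptions for illustration; only the variable name `INFINITY_POOL_MAX_SIZE` and the default of `4` come from the description above.

```python
import os


def read_pool_max_size(env=None, default=4):
    """Return the pool size from INFINITY_POOL_MAX_SIZE, or the default."""
    env = os.environ if env is None else env
    raw = env.get("INFINITY_POOL_MAX_SIZE", "")
    try:
        size = int(raw)
    except ValueError:
        # Unset or non-numeric value: keep the default (assumed policy).
        return default
    return size if size > 0 else default


print(f"Infinity connection pool max_size={read_pool_max_size()}")
```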
#### `rag/graphrag/utils.py` — `set_graph()`
- **Edge deletes**: add exponential-backoff retry (3 attempts, 1s/2s/4s
delays) so transient `TOO_MANY_CONNECTIONS` errors are retried instead
of failing the entire job. Concurrency continues to be gated by the
existing `chat_limiter`.
- **Batch inserts**: add exponential-backoff retry (3 attempts, 1s/2s/4s
delays) for the same reason.
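The retry behaviour can be sketched roughly as below. The helper `retry_with_backoff` is hypothetical and is not the code in `set_graph()`; in particular, reading "1s/2s/4s" as three waits after failed tries, followed by one final try, is an assumption.

```python
import time


def retry_with_backoff(operation, delays=(1.0, 2.0, 4.0), sleep=time.sleep,
                       is_transient=lambda exc: True):
    """Run operation(); after a transient failure, wait and try again.

    Each entry in `delays` is the wait before the next try. Once the
    delays are used up, one last try runs and its error propagates.
    """
    for delay in delays:
        try:
            return operation()
        except Exception as exc:
            if not is_transient(exc):
                raise
            sleep(delay)
    return operation()


# Usage: an operation that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_insert():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("TOO_MANY_CONNECTIONS")
    return "ok"


recorded_waits = []
result = retry_with_backoff(flaky_insert, sleep=recorded_waits.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.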
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: noob <yixiao121314@outlook.com>
### What problem does this PR solve?
The sandbox does not attach attachment metadata.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add a title prefix to the testid on the login page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add a button to turn off VLM parsing.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: chanx <1243304602@qq.com>
### What problem does this PR solve?
Fix: Pipeline page style optimizations
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Closes #6541
### What problem does this PR solve?
Add content validation to `update_chunk` (SDK and non-SDK) to reject
empty or whitespace-only content before it reaches the embedding model.
**Before:** Calling `update_chunk` with empty or whitespace-only content
(like `" "`, `""`, `"\n"`) bypassed validation and was sent directly to the
embedding model, which returned an error. This was the same bug previously
fixed for `add_chunk` in #6390, but `update_chunk` was missed.
**After:** Empty/whitespace-only content is caught by validation and
returns an error: `` `content` is required ``
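A minimal sketch of the kind of check described, assuming the validation raises an exception that the route turns into an error response. The function name `validate_chunk_content` and the use of `ValueError` are assumptions; only the error message comes from the description above.

```python
def validate_chunk_content(content):
    """Reject missing, empty, or whitespace-only chunk content."""
    if not isinstance(content, str) or not content.strip():
        raise ValueError("`content` is required")
    return content


# Whitespace-only input is rejected before any embedding call would happen.
for candidate in ["hello", " ", "", "\n"]:
    try:
        validate_chunk_content(candidate)
        print(repr(candidate), "accepted")
    except ValueError as exc:
        print(repr(candidate), "rejected:", exc)
```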
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Update templates and add a resume template.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: The pop-up menu of the PromptEditor will be blocked. #14126
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Addresses review feedback on #14074 (Checkpoint mechanism for
long-running workflow jobs, issue #12494).
**Changes based on @yuzhichang's review:**
1. **Renamed `checkpoint_service.py` → `task_checkpoint.py`** as
suggested.
2. **Replaced Redis with direct docEngine queries** as suggested — the
subgraph already gets persisted to the doc store by
`generate_subgraph()`, so we just query for it instead of maintaining a
separate checkpoint in Redis. This is simpler, has no extra dependency,
and uses a single source of truth.
**Changes based on CodeRabbit review:**
3. **Fixed `source_id` query format mismatch** — subgraphs are stored
with `source_id: [doc_id]` (list), but the original query used
`source_id: doc_id` (string). Now follows the same pattern as
`does_graph_contains()` in `rag/graphrag/utils.py`: filter by
`knowledge_graph_kwd` only, then match `source_id` in Python. This
avoids ambiguity across Elasticsearch / Infinity / OceanBase backends.
### Changes
| File | Change |
|---|---|
| `api/db/services/task_checkpoint.py` (new) |
`load_subgraph_from_store()` and `has_raptor_chunks()` — docEngine-based
checkpoint queries |
| `rag/graphrag/general/index.py` | `build_one()` calls
`load_subgraph_from_store()` before running LLM extraction |
| `rag/svr/task_executor.py` | RAPTOR per-doc loop calls
`has_raptor_chunks()` before processing |
| `test/unit_test/rag/graphrag/test_checkpoint_resume.py` (new) | 10
unit tests covering subgraph loading, source_id filtering, edge cases |
### How it works
- **GraphRAG:** Before running expensive LLM entity/relation extraction
for a doc, checks the doc store for an existing subgraph (saved by a
previous interrupted run). If found, loads it directly and skips LLM
calls.
- **RAPTOR:** Before processing a doc, checks if RAPTOR chunks
(`raptor_kwd="raptor"`) already exist for it. If yes, skips.
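The "match `source_id` in Python" step can be sketched as below. The record shape (plain dicts with a `source_id` list) and the function name `find_subgraph_for_doc` are assumptions for illustration, not the actual `load_subgraph_from_store()` implementation.

```python
def find_subgraph_for_doc(records, doc_id):
    """Return the first stored subgraph record whose source_id lists doc_id."""
    for record in records:
        source_ids = record.get("source_id") or []
        if isinstance(source_ids, str):
            # Tolerate a bare string as well as the list form.
            source_ids = [source_ids]
        if doc_id in source_ids:
            return record
    return None


# Hypothetical records as they might come back from a filter on
# knowledge_graph_kwd only.
stored = [
    {"source_id": ["doc-a"], "content": "subgraph A"},
    {"source_id": ["doc-b"], "content": "subgraph B"},
]
checkpoint = find_subgraph_for_doc(stored, "doc-b")
```

If `checkpoint` is not `None`, the caller can skip LLM extraction for that document.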
### Testing
- 10 new unit tests — all passing
- Full existing suite: 617 passed
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Resolves #14115.
## Problem
On the shared chat link page (`/chats/share?shared_id=...`), querying
the knowledge base returns "no relevant information was found", while
the same query works correctly on the editor chat page.
## Root Cause
Knowledge base retrieval in `async_chat()` is gated by the check `if
"knowledge" in param_keys` (line 598), where `param_keys` is derived
from `prompt_config["parameters"]`. If `parameters` is empty or missing
the `{"key": "knowledge", "optional": false}` entry, retrieval is
entirely skipped.
This can happen because `_apply_prompt_defaults()` — which ensures
`parameters` contains the `knowledge` entry — is only called in the
`create` (POST) and `update_chat` (PUT) handlers, but **not** in
`patch_chat` (PATCH). If a chat's `prompt_config` was updated via PATCH
without including `parameters`, the `knowledge` entry would be absent.
Additionally, `prompt_config["parameters"]` would raise a `KeyError` if
the key was missing entirely.
## Fix
Added a defensive safety net in `async_chat()`
(`api/db/services/dialog_service.py`) that auto-injects the `knowledge`
parameter when:
- `dialog.kb_ids` is set (knowledge bases are configured)
- `"knowledge"` is not already in `param_keys`
- `{knowledge}` placeholder exists in the system prompt
Also changed `prompt_config["parameters"]` to
`prompt_config.get("parameters", [])` to prevent `KeyError` when the key
is absent.
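The three conditions above can be sketched as a small helper. This is an illustration only, not the code in `async_chat()`; the function name `ensure_knowledge_parameter` and the assumption that the system prompt is stored under the `system` key of `prompt_config` are hypothetical.

```python
def ensure_knowledge_parameter(prompt_config, kb_ids):
    """Return the parameter list, adding `knowledge` when it is needed."""
    # .get() avoids a KeyError when "parameters" is missing entirely.
    parameters = list(prompt_config.get("parameters", []))
    param_keys = [p.get("key") for p in parameters]
    system_prompt = prompt_config.get("system", "")  # assumed key name
    if kb_ids and "knowledge" not in param_keys and "{knowledge}" in system_prompt:
        parameters.append({"key": "knowledge", "optional": False})
    return parameters


config = {"system": "Answer using this context: {knowledge}"}
print(ensure_knowledge_parameter(config, ["kb-1"]))
```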
## Files Changed
- `api/db/services/dialog_service.py` — added auto-injection of
`knowledge` parameter and safe `.get()` access for `parameters`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: noob <yixiao121314@outlook.com>
## Summary
- remove eval-based parsing from retrieval rank feature scoring
- validate `tag_feas` at write time in chunk APIs and SDK routes
- add regression tests for safe parsing and malicious payload rejection
## Details
`tag_feas` is intended to be structured rank-feature data, but the
retrieval ranking path was evaluating stored values as Python
expressions. This change treats `tag_feas` strictly as data.
### What changed
- replace `eval()` in `rag/nlp/search.py` with safe parsing via
`json.loads()` and optional `ast.literal_eval()` compatibility for
legacy Python-dict strings
- strictly filter parsed values down to `dict[str, finite number]`
- reject invalid `tag_feas` payloads at write time in web chunk routes
and SDK document chunk routes
- add focused regression tests to prove executable strings are ignored
and invalid payloads are rejected
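A sketch of the safe-parsing approach described above, under the assumption that unparseable or non-dict input is treated as "no features". The function names are hypothetical and this is not the code in `rag/nlp/search.py`.

```python
import ast
import json
import math


def _is_finite_number(value):
    """True for int/float values that are finite (bools are excluded)."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    try:
        return math.isfinite(value)
    except OverflowError:
        # Integers too large to convert to float are rejected.
        return False


def parse_tag_feas(raw):
    """Parse tag features as data only; never evaluate them as code."""
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except ValueError:
            try:
                # Compatibility path for legacy Python-dict strings.
                raw = ast.literal_eval(raw)
            except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
                return {}
    if not isinstance(raw, dict):
        return {}
    return {k: v for k, v in raw.items()
            if isinstance(k, str) and _is_finite_number(v)}


print(parse_tag_feas('{"sports": 3, "note": "x"}'))
print(parse_tag_feas("__import__('os').system('echo hi')"))
```

`ast.literal_eval` only accepts literals, so a function-call string such as the second example is rejected rather than executed.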
## Validation
- `python -m pytest test/unit_test/common/test_tag_feature_utils.py
test/unit_test/rag/test_rank_feature_scores.py -q`
---------
Co-authored-by: unknown <zhenglinkai@CCN.Local>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
## What's the problem
Both `async_chat()` and `async_ask()` call `decorate_answer()` to build
the final SSE payload — it inserts citation markers (`##N$$`) into the
answer text and prunes `doc_aggs` to only the cited documents.
Immediately after, both functions overwrite `final["answer"]` with `""`:
```python
# async_chat(), line ~774 (issue #13828)
final = decorate_answer(thought + full_answer)
final["final"] = True
final["audio_binary"] = None
final["answer"] = "" # discards decorated text
yield final
# async_ask(), line ~1444 (same bug, different path)
final = decorate_answer(full_answer)
final["final"] = True
final["answer"] = "" # discards decorated text
yield final
```
The client receives filtered references (built for a citation-decorated
answer it never sees) while displaying the raw, undecorated streaming
text. Citations can never match.
## Root cause
`final["answer"] = ""` was left over from an earlier design where
clients were meant to reconstruct the full answer purely from delta
events. Once `decorate_answer()` started placing citation markers, this
blank-out broke the contract: the final event is where the decorated
answer should land.
## Fix
Remove the two blank-override lines — one in `async_chat()`, one in
`async_ask()`:
```diff
- final["answer"] = ""
```
`decorate_answer()` already sets `final["answer"]` to the correct
decorated string; there is nothing to override.
## Relation to #13828
Issue #13828 and PR #13835 identify the bug in `async_chat()`. This PR
absorbs that fix and also corrects the identical pattern in
`async_ask()` (used by the `/retrieval` route in `chat_api.py`), which
PR #13835 does not touch.
## Regression test
Added
`test/unit_test/api/db/services/test_dialog_service_final_answer.py`
with three tests:
| Test | Purpose |
|------|---------|
| `test_buggy_pattern_drops_answer` | Documents the old behaviour:
blank-override empties the final answer |
| `test_fixed_pattern_preserves_decorated_answer` | Core invariant:
final event carries the decorated text from `decorate_answer()` |
| `test_final_event_reference_matches_decorated_result` | Citation
markers in the answer must match the pruned `doc_aggs` in the same event
|
Local run result:
```
test_dialog_service_final_answer.py::test_buggy_pattern_drops_answer PASSED
test_dialog_service_final_answer.py::test_fixed_pattern_preserves_decorated_answer PASSED
test_dialog_service_final_answer.py::test_final_event_reference_matches_decorated_result PASSED
3 passed in 0.04s
```
`ruff check` passes with no issues on all changed files.
---------
Co-authored-by: edenfunf <edenfunf@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Edit the code of the code operator from a broad perspective.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix(flow): Fix text descriptions for multi-column layout options.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Consolidate the Web API and HTTP API for document upload.
Before consolidation:
- Web API: `POST /v1/document/upload`
- HTTP API: `POST /api/v1/datasets/<dataset_id>/documents`
After consolidation:
- RESTful API: `POST /api/v1/datasets/<dataset_id>/documents`
### Type of change
- [x] Refactoring
## What problem does this PR solve?
Add a warning log when `get_flatted_meta_by_kbs` returns 10,000 results,
which indicates the query limit has been reached and metadata may be
silently truncated.
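A minimal sketch of such a check, assuming the limit is a fixed constant of 10,000 and that hitting it exactly is treated as a sign of truncation. The names `META_QUERY_LIMIT` and `warn_if_truncated` are hypothetical, not taken from the codebase.

```python
import logging

META_QUERY_LIMIT = 10_000  # assumed to match the query limit mentioned above


def warn_if_truncated(results, logger=logging.getLogger(__name__),
                      limit=META_QUERY_LIMIT):
    """Log a warning when the result count reaches the query limit."""
    if len(results) >= limit:
        logger.warning(
            "Metadata query returned %d results (limit %d); "
            "results may be truncated.", len(results), limit)
        return True
    return False


print(warn_if_truncated(list(range(10_000))))
```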
## Type of change
- [x] Improvement (non-breaking change which improves observability)
### What problem does this PR solve?
Fixes #14051.
The chat UI already sends an `internet` flag with each request, but the
backend previously triggered Tavily web retrieval whenever
`prompt_config.tavily_api_key` was configured. As a result, web search
could still run even when the internet toggle was off.
This PR makes web search an explicit opt-in at request time:
- `tavily_api_key` only indicates that web search is available
- Tavily retrieval runs only when `internet` is explicitly enabled
- the same behavior now applies to both the normal retrieval path and
the deep-research / reasoning path
This also fixes the no-KB fallback case so chats without KBs fall back
to normal solo chat when `internet` is off.
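The opt-in rule can be summarised in a small predicate. This is an illustrative sketch, not the backend code; the function name `should_run_web_search` is an assumption, and treating only a literal `True` as "explicitly enabled" is one possible reading.

```python
def should_run_web_search(prompt_config, internet):
    """Web search needs both a configured key and an explicit request flag."""
    has_key = bool(prompt_config.get("tavily_api_key"))
    return has_key and internet is True


config = {"tavily_api_key": "example-key"}
print(should_run_web_search(config, internet=True))   # key set, toggle on
print(should_run_web_search(config, internet=False))  # key set, toggle off
```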
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Remove unused token-related API
2. Fix typo
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The file count in the file header did not change after uploading or
deleting files.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Before this change, `update_document` in
`api/apps/restful_apis/document_api.py` used `PUT`.
After this change, it uses `PATCH`, which is more suitable for partial
updates.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
feat(file): Add file ancestor directory lookup in Go
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
refactor: Remove knowledge base-related API handlers that are already
included in the dataset.
### Type of change
- [x] Refactoring
## Summary
- Replace `json.load(open(...))` with `with open(...) as f:
json.load(f)` in 2 resume parser files
- Fixes 4 leaked file descriptors in `corporations.py` (3) and
`schools.py` (1)
## Why
In a long-running server process like RAGFlow, leaked file handles can
accumulate and hit the OS file descriptor limit (`OSError: [Errno 24]
Too many open files`). The other instances mentioned in the issue
(`infinity_conn_base.py` and `init_data.py`) have already been fixed.
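The pattern change can be illustrated as below. The helper name `load_json` and the temporary file are assumptions for the demo; the actual fix edits the call sites in place.

```python
import json
import os
import tempfile


def load_json(path):
    """Load JSON and close the file deterministically, even on errors."""
    # Before: json.load(open(path)) leaves closing to the garbage collector.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo with a throwaway file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "schools.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"name": "example"}, f)
    loaded = load_json(demo_path)

print(loaded)
```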
## Test plan
- [x] Verified affected files use `with` statement after fix
- [x] Grep confirms no remaining `json.load(open(` patterns in codebase
Fixes #13996
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
This fixes rerank overflow where retrieval could send more documents
than allowed (for example 66 when `page_size=6`), causing provider 400
errors and bypassing the user’s `top_k` intent in rerank-enabled paths.
This PR fixes #14081.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix stale tests at the P3 level.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The indented tree text generated on the search page overlaps.
#14077
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Hide the download button embedded in the agent page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
When setting a default model for an OpenAI-API-Compatible provider,
`ensure_tenant_model_id_for_params` called `get_api_key` without a
`model_type` filter. If the same model name was registered under
multiple types (e.g., both chat and embedding), it could return the
wrong `tenant_llm_id`, leading to `Model(@None) not authorized` errors
during chat.
This applies the same type-scoped fix that PR #13569 introduced in
`get_model_config_by_type_and_name`, now consistently in
`tenant_utils.py` as well.
## Changes
- Added `_KEY_TO_MODEL_TYPE` mapping in `tenant_utils.py`
- Each model key (`llm_id`, `embd_id`, etc.) now passes its correct
`LLMType` to `get_api_key`
Fixes #13775
### What problem does this PR solve?
- Implemented a helper function to convert markdown cell text to native
numeric types for Excel output.
- Ensured that leading zeros are preserved and handled various numeric
formats, including those with thousand separators and scientific
notation.
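A rough sketch of such a conversion, assuming that text with leading zeros (such as `007`) should stay a string and that a comma is only ever a thousands separator. The function name `to_native_number` and these rules are assumptions, not the helper added in this PR.

```python
import re

_INT_RE = re.compile(r"[+-]?\d+")
_THOUSANDS_RE = re.compile(r"[+-]?\d{1,3}(,\d{3})+(\.\d+)?")
_FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def to_native_number(text):
    """Return an int or float for numeric cell text, else the text unchanged."""
    s = text.strip()
    if not s:
        return text
    digits = s.lstrip("+-")
    # Leading zeros usually mark an identifier, so keep the cell as text.
    if len(digits) > 1 and digits[0] == "0" and digits[1].isdigit():
        return text
    if _THOUSANDS_RE.fullmatch(s):
        s = s.replace(",", "")
    if _INT_RE.fullmatch(s):
        return int(s)
    if _FLOAT_RE.fullmatch(s):
        return float(s)
    return text


for cell in ["42", "007", "1,234.5", "1.5e3", "n/a"]:
    print(repr(cell), "->", repr(to_native_number(cell)))
```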
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Closes #13907
The template catalog had duplicate files (e.g. `*_r.json`) only to place
the same template into multiple sidebar groups.
This increases maintenance cost and makes template updates error-prone.
This PR adds first-class support for multiple template categories in a
single file via `canvas_types`, then removes duplicate template files.
What changed:
- Added `canvas_types` to `CanvasTemplate` model and DB migration.
- Added normalization logic when loading templates:
- accepts legacy `canvas_type`
- accepts new `canvas_types`
- merges/deduplicates values
- preserves backward compatibility by keeping `canvas_type` as first
normalized value.
- Updated template import flow to load only `.json` files and in stable
sorted order.
- Updated frontend template filtering to match on `canvas_types` first,
with fallback to legacy `canvas_type`.
- Consolidated duplicated template pairs into single files and removed:
- `deep_search_r.json`
- `reflective_academic_paper_generator_r.json`
- `seo_article_writer_r.json`
- Added regression/edge-case tests for category normalization and route
serialization expectations.
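The normalization rules listed above can be sketched as follows. This is an illustration under assumptions: the function name `normalize_canvas_types` is hypothetical, and placing the legacy `canvas_type` value ahead of the `canvas_types` entries is one possible ordering, not necessarily the one the loader uses.

```python
def normalize_canvas_types(template):
    """Merge legacy `canvas_type` and new `canvas_types` into one clean list."""
    values = []
    legacy = template.get("canvas_type")
    if isinstance(legacy, str):
        values.append(legacy)
    new = template.get("canvas_types") or []
    if isinstance(new, str):
        new = [new]
    values.extend(v for v in new if isinstance(v, str))

    # Deduplicate while keeping the first occurrence of each category.
    seen, merged = set(), []
    for value in values:
        value = value.strip()
        if value and value not in seen:
            seen.add(value)
            merged.append(value)

    normalized = dict(template)
    normalized["canvas_types"] = merged
    # Backward compatibility: the legacy field holds the first normalized value.
    normalized["canvas_type"] = merged[0] if merged else None
    return normalized


example = normalize_canvas_types(
    {"title": "Deep search", "canvas_type": "Agent",
     "canvas_types": ["Recommended", "Agent"]})
print(example["canvas_types"], example["canvas_type"])
```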
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: The chat page is not displaying the meta tags.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Upgrades Apache Tika from 3.2.3 to 3.3.0 to address the security
vulnerability GHSA-72hv-8253-57qq (TIKA-4687).
Closes #13601
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes
- `Dockerfile`: Updated tika JAR filename and `TIKA_SERVER_JAR` env var
from 3.2.3 to 3.3.0
- `Dockerfile.deps`: Updated tika JAR filename in COPY instruction from
3.2.3 to 3.3.0
- `download_deps.py`: Updated both Maven Central and Huawei Cloud mirror
download URLs from 3.2.3 to 3.3.0
### References
- Apache Tika 3.3.0 release:
https://www.apache.org/dyn/closer.lua/tika/3.3.0/tika-app-3.3.0.jar
- TIKA-4687: https://issues.apache.org/jira/browse/TIKA-4687
- GHSA-72hv-8253-57qq
### What problem does this PR solve?
Update search
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The sandbox cannot accept a large argument list.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Consolidate the `set_meta` API into `update_document`.
Before consolidation:
- Web API: `POST /api/v1/document/set_meta`
- HTTP API: `PUT /v1/datasets/<dataset_id>/document/<document_id>`
After consolidation:
- RESTful API: `PUT /v1/datasets/<dataset_id>/document/<document_id>`
### Type of change
- [x] Refactoring
Closes #14018
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Problem
In Agent applications, even with the cite option enabled, only inline
`[ID: x]` citation markers are visible (showing chunk content on hover).
The Agent does not display the referenced file cards below the response,
unlike Chat applications.
### Root Cause
The Agent's Retrieval tool (`agent/tools/retrieval.py`) calls
`retriever.retrieval()` with `aggs=False`, which means the retrieval
results do not include `doc_aggs` (document aggregation) data. Without
`doc_aggs`, the frontend `ReferenceDocumentList` component has no data to
render the file cards.
In contrast, the Chat application (`api/db/services/dialog_service.py`)
calls the same `retriever.retrieval()` method with `aggs=True`.
### Fix
Changed `aggs=False` to `aggs=True` in `agent/tools/retrieval.py` so that
document aggregation data is returned along with the retrieved chunks.
### What problem does this PR solve?
Fix: When creating a dataset, if no `chunk_method` is selected, there is
no indication that this is a required field.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Consolidate the Web API and HTTP API for the document metadata summary.
Before consolidation:
- Web API: `POST /api/v1/document/metadata/summary`
- HTTP API: `GET /v1/datasets/<dataset_id>/metadata/summary`
After consolidation:
- RESTful API: `GET /v1/datasets/<dataset_id>/metadata/summary`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: The dataset on the search page is not displaying the required field
error message.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Visiting
`http://127.0.0.1:9381/?__debugger__=yes&cmd=resource&f=debugger.js`
exposes the Flask debugger's JavaScript:
```javascript
docReady(() => {
  if (!EVALEX_TRUSTED) {
    initPinBox();
  }
  // if we are in console mode, show the console.
  if (CONSOLE_MODE && EVALEX) {
    createInteractiveConsole();
  }
  const frames = document.querySelectorAll("div.traceback div.frame");
  if (EVALEX) {
    addConsoleIconToFrames(frames);
  }
  addEventListenersToElements(document.querySelectorAll("div.detail"), "click", () =>
    document.querySelector("div.traceback").scrollIntoView(false)
  );
  addToggleFrameTraceback(frames);
  addToggleTraceTypesOnClick(document.querySelectorAll("h2.traceback"));
  addInfoPrompt(document.querySelectorAll("span.nojavascript"));
  wrapPlainTraceback();
});
function addToggleFrameTraceback(frames) {
  frames.forEach((frame) => {
    frame.addEventListener("click", () => {
      frame.getElementsByTagName("pre")[0].parentElement.classList.toggle("expanded");
    });
  })
}
```
### Type of change
- [x] Other (please describe): Fix security risk
### What problem does this PR solve?
As title.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: pipeline support ONE chunking method
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Fixes #13996
Replace `json.load(open(...))` with `with open(...) as f: json.load(f)`
in two files to ensure file descriptors are properly closed.
**Affected files:**
- `common/doc_store/infinity_conn_base.py` — schema loading for Infinity
doc store
- `api/db/init_data.py` — agent template loading at startup
## Why this matters
In a long-running server process like RAGFlow, leaked file descriptors
from `json.load(open(...))` can accumulate over time. While CPython's
refcounting usually cleans these up, it's not guaranteed (especially
under memory pressure or with alternative Python runtimes), and can lead
to `OSError: [Errno 24] Too many open files`.
## Test plan
- [ ] Verify Infinity doc store schema loading still works correctly
- [ ] Verify agent templates load correctly on startup
## Summary by CodeRabbit
* **Refactor**
* Improved file handling in internal data processing to ensure proper
resource cleanup.
Co-authored-by: easonysliu <easonysliu@tencent.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
feat: Implement file-related functionality
- Implement file deletion API and business logic
- Add context support for file deletion operations and prevent root
folder deletion
- Implement file move functionality
- Add File Download API Endpoints and Utility Functions
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Closes https://github.com/infiniflow/ragflow/issues/13939
## What problem does this PR solve?
The Google Drive connector fails to detect new files after the initial
sync (#13939). The root cause is that `generate_time_range_filter()`
applies a strict `modifiedTime > poll_range_start` cutoff when querying
the Google Drive API. Files uploaded to Google Drive that retain their
original `modifiedTime` (common behavior) get silently excluded if their
timestamp predates the last sync's cutoff.
Unlike the Confluence and Jira connectors which use a configurable time
buffer (`CONFLUENCE_SYNC_TIME_BUFFER_SECONDS`) to offset
`poll_range_start` backward, the Google Drive connector had no such
mechanism — resulting in a razor-sharp timestamp boundary with zero
tolerance for overlap.
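The effect of a time buffer can be sketched as below: the poll window start is moved backward so that files whose `modifiedTime` falls slightly before the last sync cutoff are still picked up. The function name `buffered_poll_start` and the parameter `buffer_seconds` are assumptions for illustration, not the connector's actual names.

```python
from datetime import datetime, timedelta, timezone


def buffered_poll_start(poll_range_start, buffer_seconds):
    """Shift the start of the poll window backward by the buffer."""
    # A negative buffer would shrink the overlap, so it is clamped to zero.
    return poll_range_start - timedelta(seconds=max(buffer_seconds, 0))


last_sync = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
window_start = buffered_poll_start(last_sync, buffer_seconds=600)
print(window_start.isoformat())
```

Overlapping windows can return the same file twice, so a sketch like this relies on the sync being idempotent for already-indexed documents.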
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
* **New Features**
* Added a configurable time buffer for Google Drive synchronization to
address timing delays and improve sync reliability.
* Improved file detection logic to include recently created files
alongside modified ones, reducing missed synchronizations.
### What problem does this PR solve?
This PR fixes a mismatch between the MCP retrieval contract and the
backend retrieval API.
`ragflow_retrieval` already describes `dataset_ids` as optional, but
`/api/v1/retrieval` still rejected omitted or empty `dataset_ids` with
`` `dataset_ids` is required. ``. That made MCP retrieval fail even
though the tool schema promised that the request could search across all
available datasets.
This change updates `/api/v1/retrieval` to accept missing or empty
`dataset_ids`, resolve all accessible datasets for the authenticated
user, and keep the route schema aligned with the new runtime behavior.
It also adds focused unit coverage for the fallback resolution path and
the no-accessible-datasets case.
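The fallback resolution can be sketched as follows. The function name `resolve_dataset_ids`, the callable `list_accessible_ids`, the exception type, and the error text are all hypothetical stand-ins for whatever the route actually uses.

```python
def resolve_dataset_ids(requested, list_accessible_ids):
    """Use the requested IDs, or fall back to every dataset the user can access."""
    if requested:
        return list(requested)
    accessible = list(list_accessible_ids())
    if not accessible:
        raise LookupError("No accessible datasets found for this user.")
    return accessible


# Omitted or empty `dataset_ids` searches across all accessible datasets.
print(resolve_dataset_ids(None, lambda: ["ds-1", "ds-2"]))
print(resolve_dataset_ids(["ds-2"], lambda: ["ds-1", "ds-2"]))
```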
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fixes: #13981
## Summary by CodeRabbit
* **Bug Fixes**
* Improved dataset resolution to reliably discover all accessible
datasets through proper pagination, replacing the previous parsing
method.
* Enhanced error handling with clearer messaging when no datasets are
available for retrieval.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
As title.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The knowledge base selected by the retrieval node is not displayed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: support VLM fallback in the pipeline for image/table parsing
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
GraphRAG _async_chat.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Unified chat calls to an async invocation across extractors, improving
timeout handling and ensuring task IDs propagate reliably.
* **Tests**
* Added and expanded unit tests and mocks to cover extractor behavior,
timeout scenarios, and safe test-package imports, reducing regression
risk.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Fixes #13823
## Problem
When querying with words like `cat`, RAGFlow's query expansion system
looks up synonyms via WordNet, which can return terms containing single
quotes (e.g., `cat-o'-nine-tails`). When using Infinity as the document
store, these unescaped single quotes in the query string cause a
`TokenError` because Infinity's lexer treats `'` as a string delimiter.
```
TokenError: Error tokenizing ' OR "big cat" OR "computerized tomography")^0.7)': Missing ' from 1:531
```
## Solution
Strip single quotes from synonym terms before they are inserted into
query expressions, consistent with how single quotes are already
stripped from the input query text (line 51 of `query.py`):
- **`common/query_base.py`**: In `sub_special_char()`, strip `'` before
escaping other special characters. This fixes the Chinese text
processing path and the `paragraph()` method.
- **`rag/nlp/query.py`**: In the English text path, strip `'` from
tokenized synonym terms.
- **`memory/services/query.py`**: Same fix for the memory query English
text path.
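As an illustration of the approach (not the exact code in `query.py`), the sketch below strips single quotes from synonym terms before joining them into a boosted `OR` clause. The helper names `strip_single_quotes()` and `build_synonym_clause()` are hypothetical, and the clause shape is inferred from the error message above.

```python
from typing import Iterable


def strip_single_quotes(term: str) -> str:
    """Remove single quotes so the lexer never sees an unbalanced delimiter."""
    return term.replace("'", "")


def build_synonym_clause(synonyms: Iterable[str], boost: float = 0.7) -> str:
    """Join sanitized synonyms into a boosted OR clause."""
    cleaned = [strip_single_quotes(s).strip() for s in synonyms]
    # Drop terms that become empty after sanitizing.
    quoted = [f'"{s}"' for s in cleaned if s]
    return "(" + " OR ".join(quoted) + f")^{boost}"


clause = build_synonym_clause(["cat-o'-nine-tails", "big cat"])
```

Here `clause` no longer contains a `'` character, so a lexer that treats `'` as a string delimiter has nothing to trip on.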
## Testing
The fix can be verified by:
1. Using Infinity as the document store (`DOC_ENGINE=infinity`)
2. Creating a dataset and running a retrieval test with the keyword
`cat`
3. Confirming no `TokenError` is raised and results are returned
normally
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Enhanced special character handling in query processing and synonym
expansion by properly sanitizing single quotes before text processing.
* Simplified OCR detection output by removing timing metadata while
preserving core detection accuracy.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: ximi <octo-patch@github.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Refactoring
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Improved authentication error logging to better distinguish between
JWT and API token failures.
* Enhanced code documentation with clarifying comments for better
maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Integrate the name, avatar, and description of chat and search
into a single component.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Inline-editable avatar, name, and description fields
* Expandable content blocks in search results
* New RAGFlow heading/logo component
* **Refactor**
* Replaced scattered form fields with a composed Avatar/Name/Description
component
* Mindmap drawer converted to a sheet-based drawer and layout cleanup
* Simplified search page controls and layout; improved scroll viewport
handling
* **Chores**
* Added/updated English and Chinese localization keys (placeholders,
view more/less)
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Resolves #12105
This PR fixes two MCP tool call issues in
`common/mcp_tool_call_conn.py`.
First, the timeout passed to `tool_call(..., timeout=...)` was only
applied to the outer `future.result(...)` wait, but was not forwarded to
the internal MCP request. As a result, callers could pass a longer
timeout while the actual MCP request still failed after the default
internal timeout.
Second, the MCP tool call result handling assumed `result.content[0]`
always existed. If an MCP server returned an empty content list, this
could raise an exception unexpectedly.
This PR fixes both issues by:
- forwarding the external `timeout` value to the internal MCP request
timeout
- returning a clear message when the MCP server returns empty content
instead of indexing into an empty list
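Both fixes can be sketched as below. This is a simplified stand-in and not the code in `common/mcp_tool_call_conn.py`: the `ToolResult` and `TextContent` classes, the `send_request` callable, and the message text are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical message; the real wording may differ.
EMPTY_CONTENT_MESSAGE = "MCP server returned empty content."


@dataclass
class TextContent:
    text: str


@dataclass
class ToolResult:
    content: List[TextContent] = field(default_factory=list)


def first_text_or_message(result: ToolResult) -> str:
    """Return the first content item, or a clear message if there is none."""
    if not result.content:
        # Avoid indexing into an empty list.
        return EMPTY_CONTENT_MESSAGE
    return result.content[0].text


def call_tool(
    send_request: Callable[..., ToolResult],
    name: str,
    arguments: Dict[str, Any],
    timeout: float = 10.0,
) -> str:
    """Forward the caller's timeout to the inner request, then unwrap the result."""
    result = send_request(name, arguments, timeout=timeout)
    return first_text_or_message(result)
```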
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe)
fix: support dense_vector from ES fields response (ES 9.x compatibility)
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Configuration Chore (non-breaking change which updates
configuration)
## Summary by CodeRabbit
* **Bug Fixes**
* More accurate handling and unwrapping of dense-vector fields so
returned values have correct shapes.
* Field selection reliably limits returned data and falls back to
alternate result locations when needed.
* Use of consistent result IDs and tolerant handling when score values
are missing.
* **Chores / Configuration**
* Increased build memory and adjusted build-time flags for the frontend
build.
* Simplified runtime model/GPU checks and removed an automated runtime
GPU-install attempt.
* **Build Fixes**
* `web/vite.config.ts`: make `build.minify` and `build.sourcemap`
respect `VITE_MINIFY` and `VITE_BUILD_SOURCEMAP` env vars from
Dockerfile instead of hardcoding `terser` and `true`.
* **Environment**
* Allow stack version override and default the runtime image tag to
"latest".
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Correct unwrapping of dense-vector fields and reliable field selection
with fallback locations.
* Consistent use of hit-level IDs and tolerant handling when score
values are missing.
* **Chores / Configuration**
* Increased frontend build memory and added build-time minify/sourcemap
flags; build minification and sourcemap now configurable.
* Removed runtime GPU detection for model initialization; force CPU
initialization.
* **Environment**
* Allow stack version override and default runtime image tag to
"latest".
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Feat: support `.doc` files in the pipeline's Word parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added support for processing legacy Word `.doc` file formats,
extending document compatibility.
* **Bug Fixes**
* Enhanced error handling during document parsing to improve reliability
and prevent processing failures.
### What problem does this PR solve?
Feat: enable sync deleted files for connector
1. GitHub is the first connector to support it
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added "sync deleted files" feature for data sources, enabling
automatic removal of files deleted from the source system.
* Added multilingual support for the new sync deleted files setting
across multiple languages.
* **UI Improvements**
* Improved checkbox form field rendering and layout.
* Enhanced full-width display for authentication token input fields.
### What problem does this PR solve?
Refactor: merge document.rename into document.update_document
### Type of change
- [x] Refactoring
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added a unified document update API (PUT) supporting name, metadata,
parser/chunk settings, and status changes.
* **Breaking Changes**
* Legacy single-parameter rename endpoint removed; renames now require
dataset + document identifiers.
* `/list` now reads dataset id from a different query parameter.
* **Validation / Bug Fixes**
* Stricter meta_fields and parser-config validation; unauthenticated
requests return 401.
* **Frontend**
* UI now sends dataset id when saving document names.
* **Tests**
* Numerous unit and HTTP tests adjusted or removed to match new API and
validations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: MkDev11 <94194147+MkDev11@users.noreply.github.com>
Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com>
Co-authored-by: mkdev11 <MkDev11@users.noreply.github.com>
Co-authored-by: Qi Wang <wangq8@outlook.com>
Co-authored-by: dataCenter430 <161712630+dataCenter430@users.noreply.github.com>
Co-authored-by: balibabu <cike8899@users.noreply.github.com>
### What problem does this PR solve?
Add a stage to migrate `tenant_llm` data into the `tenant_model_instance`
and `tenant_model` tables.
### Type of change
- [x] Other (please describe): tool script
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Added two new migration stages to move tenant model and instance
records into new target tables, with dry-run, full-execute, and "create
table only" modes; migration skips already-migrated rows to avoid
duplicates.
* **Bug Fixes**
* Cleaned up migration header logging for clearer output.
* **Documentation**
* Added usage guide describing stages, options, modes, config format,
examples, and expected logs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
Fix: dsl import/export
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Enhanced JSON import functionality for agents to automatically
populate components from imported graph structures.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Implement Delete in Go and refactor functions
### Type of change
- [x] Refactoring
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added a remove_chunks command to delete specific or all chunks from a
document.
* Added new endpoints for chunk removal and chunk update.
* **Refactor**
* Renamed index commands to dataset/metadata table terminology and
updated REST routes accordingly.
* Updated chunk update flow to a JSON POST style and improved metadata
error messages.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
### What problem does this PR solve?
Revert xgboost version to 1.6.0
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Updated xgboost dependency from version 3.2.0 to 1.6.0
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
1. list configs
2. set log level debug/info/warn/error/fatal/panic
```
RAGFlow(user)> list configs;
+--------------------+-----------------------+
| key | value |
+--------------------+-----------------------+
| redis_host | localhost:6379 |
| doc_engine | elasticsearch |
| elasticsearch_host | http://localhost:1200 |
| log_level | info |
| database | mysql |
| database_host | localhost:3306 |
| admin | 0.0.0.0:9383 |
| storage_engine | minio |
| minio_host | localhost:9000 |
+--------------------+-----------------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Added `LIST CONFIGS` command to view system configuration details
(Redis, database, log level, storage engine, and host settings).
* Added `SET LOG LEVEL` command to adjust logging verbosity at runtime.
* **Improvements**
* Enhanced log level configuration defaults and runtime state
management.
* Reorganized token management and system endpoints under `/system/`
routes for better API organization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Refactor: Remove unused API code
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Style**
* Updated table header styling in dataset settings by removing a
hard-coded background color class, allowing the header to use default or
inherited component styling instead.
* **Refactor**
* Removed token management endpoints from the API service. Token
creation, listing, and removal functions are no longer available.
* Removed the statistics data endpoint from available API routes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
Fix: Linter error message: Use 'const' instead.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Updated variable declarations across form components, agent utilities,
memory management hooks, and data handling functions to enhance code
consistency and maintainability throughout the application codebase.
* **Style**
* Added ESLint suppressions to document intentional constant-condition
patterns in asynchronous event streaming operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
Fix import error in sandbox provider.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Updated internal configuration import mechanism for sandbox provider
initialization. No end-user impact.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
- ping
- token
- log level
### Type of change
- [x] Refactoring
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* System endpoints consolidated under /api/v1/system: ping, health
check, and token management moved to the centralized API surface.
* Token management unified at /api/v1/system/tokens with
list/create/delete behavior.
* **Documentation**
* API reference updated to reflect the new /api/v1/system paths.
* **Tests**
* Client fixtures and test utilities updated to use
/api/v1/system/tokens; one unit test for health/oceanbase status
removed.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
As title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Standardized the query parameter used when listing documents so
listings behave consistently across the web and client interfaces.
* Clarified the error message shown when a required dataset ID is
missing to give clearer guidance to users.
* **Tests**
* Updated test coverage to reflect the standardized dataset identifier
usage.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The document management table cannot be displayed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved table layout and overflow behavior in the files view to
ensure proper scrolling and display.
* **Chores**
* Removed unused system status functionality and cleaned up service
methods.
* Updated TypeScript configuration for compatibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…
### What problem does this PR solve?
Closes #13857
Parent-child chunking was introduced in v0.23.0 but is only configurable
through the web UI. Users managing datasets programmatically cannot
enable it via the HTTP API or Python SDK because `ParserConfig` uses
`extra="forbid"`, rejecting the `children_delimiter` field at
validation.
### What does this PR change?
Adds a `parent_child` nested config to `ParserConfig`, following the
same pattern as `raptor` and `graphrag`:
```json
"parser_config": {
"parent_child": {
"use_parent_child": true,
"children_delimiter": "\n"
}
}
```
- `api/utils/validation_utils.py`: new `ParentChildConfig` model, added to
`ParserConfig`
- `api/utils/api_utils.py`: naive defaults + flatten to
`children_delimiter` for the execution layer
- `api/apps/services/dataset_api_service.py`: flatten on the update path
- `test/testcases/configs.py`: updated `DEFAULT_PARSER_CONFIG`
- `test/testcases/test_http_api/test_dataset_management/test_create_dataset.py`:
4 valid + 2 invalid test cases
No changes to the execution layer (`rag/app/naive.py`, `rag/nlp/search.py`).
Existing UI flow via ext is unaffected.
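The "flatten" step mentioned above could look roughly like this. It is a sketch under assumptions: the helper name `flatten_parent_child()` is hypothetical, and whether the real code drops the nested `parent_child` key or keeps it alongside the flat field is not stated in this PR.

```python
from typing import Any, Dict


def flatten_parent_child(parser_config: Dict[str, Any]) -> Dict[str, Any]:
    """Translate the nested API config into the flat key the execution layer reads."""
    flat = dict(parser_config)  # Copy so the caller's dict is left untouched.
    parent_child = flat.pop("parent_child", None) or {}
    if parent_child.get("use_parent_child"):
        # Assumed default delimiter, matching the JSON example above.
        flat["children_delimiter"] = parent_child.get("children_delimiter", "\n")
    return flat
```

With the JSON example above as input, the result carries a top-level `children_delimiter`, which is the field the existing execution layer already understands.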
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added parent-child chunking configuration for dataset creation and
updates with new `use_parent_child` toggle and customizable
`children_delimiter` setting to specify how parent chunks are split into
child chunks.
* **Documentation**
* Updated HTTP and Python API references with parent-child chunking
configuration details and examples.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### Use `uv run python3 x.py` instead of `uv run x.py`
Calling `uv run x.py` directly uses the Python interpreter named in the
script's shebang, which does not work if that default Python lacks some
packages. Change it to the best practice, `uv run python3 x.py`.
### Type of change
- [x] Documentation Update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **Documentation**
* Updated development setup instructions across all README files
(English and multiple language translations) to use explicit Python
interpreter invocation for the dependency download command.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### What problem does this PR solve?
Implements automatic adjustment of knowledge base chunk recall weights
based on user feedback (upvotes/downvotes). When users upvote or
downvote a response, the system locates the corresponding knowledge
snippets and adjusts their recall weight to improve future retrieval
quality.
**Closes #12670**
**How it works:**
1. User upvotes/downvotes a response via `POST /thumbup`
2. System extracts chunk IDs from the conversation reference
3. For each referenced chunk:
- Reads current `pagerank_fea` value from document store
- Increments (+1) for upvote or decrements (-1) for downvote
- Clamps weight to [0, 100] range
- Updates chunk in ES/Infinity/OceanBase
4. Future retrievals score these chunks higher/lower based on
accumulated feedback
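The weight update in step 3 amounts to a clamped increment. A minimal sketch, with the helper name `adjust_weight()` and its parameters chosen for illustration rather than taken from `chunk_feedback_service.py`:

```python
def adjust_weight(
    current: float,
    upvote: bool,
    step: float = 1,
    lower: float = 0,
    upper: float = 100,
) -> float:
    """Apply one feedback vote and clamp the result to [lower, upper]."""
    delta = step if upvote else -step
    # Clamp so repeated votes cannot push the weight out of range.
    return max(lower, min(upper, current + delta))
```

Clamping means a chunk already at 0 stays at 0 on a further downvote, and one at 100 stays at 100 on a further upvote.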
**Files changed:**
- `api/db/services/chunk_feedback_service.py` - New service for updating
chunk pagerank weights
- `api/apps/conversation_app.py` - Integrated feedback service into
thumbup endpoint
- `test/testcases/test_web_api/test_chunk_feedback/` - Unit tests
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Chat message feedback now updates per-chunk relevance weights
(feature-flag gated), with configurable weighting and atomic updates
across storage backends.
* **Bug Fixes**
* Stricter validation for message feedback inputs and more robust
handling of feedback transitions.
* **Tests**
* Expanded test coverage for chunk-feedback behavior, weighting
strategies, storage backends, and thumb-flip scenarios.
* **Chores**
* CI workflow extended to run the new chunk-feedback web API tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com>
Co-authored-by: mkdev11 <MkDev11@users.noreply.github.com>
### What problem does this PR solve?
as title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Internal code quality improvements with no user-facing changes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Refactor the version API to RESTful style. The Python and Go server APIs
are also updated.
### Type of change
- [x] Refactoring
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **Refactor**
* Migrated core API endpoints to the `/api/v1/` namespace for improved
consistency and organization.
* Standardized system version, search, and chat list endpoints under the
new API versioning structure.
* **New Features**
* Added MinIO region configuration support, allowing specification of
storage engine regional settings via environment variables or
configuration files.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- Add optional `region` parameter to `Minio()` client constructor in
`rag/utils/minio_conn.py`
- Reads from `MINIO.region` in settings, defaults to `None` when not
configured
- Required by some S3-compatible storage services (e.g., AWS S3, Tencent
COS) for proper bucket access
## Motivation
When using RAGFlow with S3-compatible storage that requires a region
(such as AWS S3 or Tencent Cloud COS), the MinIO client fails to access
buckets because the `region` parameter is not passed through.
The `Minio()` Python client already supports the `region` parameter
natively — this PR simply wires it up from the RAGFlow configuration.
## Changes
- `rag/utils/minio_conn.py`: Pass `region=settings.MINIO.get("region",
None) or None` to `Minio()` constructor
## Backward Compatibility
- No breaking changes. When `region` is not configured, it defaults to
`None`, preserving the existing behavior exactly.
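The trailing `or None` in the expression above matters when the setting is present but blank. The sketch below isolates that lookup using a plain dict in place of `settings.MINIO`; the helper name `resolve_region()` is hypothetical.

```python
from typing import Any, Dict, Optional


def resolve_region(minio_settings: Dict[str, Any]) -> Optional[str]:
    """Return the configured region, treating missing or empty values as None."""
    # `or None` turns an empty string (e.g. a blank config entry) into None.
    return minio_settings.get("region", None) or None
```

The returned value is what would be passed as `region=` to the `Minio()` constructor, so an unset or blank region behaves the same as before this change.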
## Test Plan
- [ ] Verified with MinIO (no region set) — works as before
- [x] Verified with S3-compatible storage requiring region — bucket
access succeeds
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Enhanced MinIO client initialization with regional configuration
support for improved compatibility with region-specific deployments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: Jarry Wang <code-better-life@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Add float parsing
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
api_host -> webAPI
ExternalApi -> restAPIv1
### Type of change
- [x] Refactoring
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Updated internal API endpoint configuration to use consolidated base
URL constants for improved maintainability and consistency across the
application.
* **Chores**
* Updated server-side protocol validation for admin connectivity checks.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The agent selected a knowledge base, but the API returned the
error: "No dataset is selected".
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
This fixes two broken internal documentation links in the guides:
- `docs/develop/mcp/launch_mcp_server.md` linked
`./acquire_ragflow_api_key.md`, but the target page lives one level up
as `../acquire_ragflow_api_key.md`.
- `docs/guides/dataset/run_retrieval_test.md` linked
`./construct_knowledge_graph.md`, but the actual page lives under
`./advanced/construct_knowledge_graph.md`.
These broken links make it harder to follow the MCP and retrieval-test
docs from the local docs tree.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Refactor context search command
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- Fix `a image` → `an image` in README and log message
- Fix `colomn` → `column` in table structure recognizer comment
- Fix `formated` → `formatted` in confluence connector docstring
- Fix `tabel of content` → `table of contents` in TOC prompt
## Test plan
- [ ] Documentation and comment changes, no functional impact
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: yuj <yuj@ztjzsoft.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Add validation logic for `parser_config`.
Refactor the processing flow. Before this change, validation and update
logic were interleaved: some validation ran, then some updates, then
another round of validation followed by updates. After this change, all
validation runs first, and update logic runs only after ALL validation
has finished.
Parameters that come from the front end are validated with Pydantic.
Validation that depends on data from the DB lives in separate methods.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Fixes #13944, where OpenAI-compatible custom endpoints failed verification
when model names contained `gpt-5` because of incorrect name-based
handling in the Base/backend=`base` path.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The MySQL and PostgreSQL sync classes in `sync_data_source.py` were not
passing `id_column`, `timestamp_column`, and `metadata_columns` to
`RDBMSConnector`, making incremental sync and document update impossible
even when configured.
- Without `id_column`: updated records generate new documents instead of
overwriting existing ones (doc ID is derived from content hash, so any
change produces a new ID).
- Without `timestamp_column`: `poll_source` always falls back to full
sync, ignoring the configured time range.
- The three fields existed in the frontend default values but had no
form inputs, so users had no way to fill them in.
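To see why a missing `id_column` creates duplicate documents, consider the sketch below. It is not the `RDBMSConnector` implementation: the helper `document_id()`, the choice of SHA-256, and the key format are assumptions for illustration.

```python
import hashlib
from typing import Any, Dict, Optional


def document_id(row: Dict[str, Any], content: str, id_column: Optional[str] = None) -> str:
    """Derive a document ID from a stable row key when available, else from content."""
    if id_column and row.get(id_column) is not None:
        # Stable key: edits to the row keep the same document ID.
        key = f"{id_column}:{row[id_column]}"
    else:
        # Content hash: any edit to the row yields a different ID.
        key = content
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


row = {"id": 7, "body": "v1"}
same_doc = document_id(row, "v1", "id") == document_id(row, "v2", "id")
new_doc = document_id(row, "v1") != document_id(row, "v2")
```

Both `same_doc` and `new_doc` are true here: with an ID column an edited row maps to the same document, while without one the edit looks like a brand-new document.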
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### Changes
- **Backend** (`rag/svr/sync_data_source.py`): pass `id_column`,
`timestamp_column`, and `metadata_columns` from `self.conf` to
`RDBMSConnector` for both `MySQL` and `PostgreSQL` sync classes.
- **Frontend**
(`web/src/pages/user-setting/data-source/constant/index.tsx`): add
`ID Column`, `Timestamp Column`, and `Metadata Columns` form fields to
the MySQL and PostgreSQL data source configuration UI, with tooltips.
Signed-off-by: lixintao <lixintao@uniontech.com>
Co-authored-by: lixintao <lixintao@uniontech.com>
### What problem does this PR solve?
Implement UpdateDataset and UpdateMetadata in Go.
Add CLI commands:
- `UPDATE CHUNK <chunk_id> OF DATASET <dataset_name> SET <update_fields>`
- `REMOVE TAGS 'tag1', 'tag2' from DATASET 'dataset_name';`
- `SET METADATA OF DOCUMENT <doc_id> TO <meta>`
### Type of change
- [ ] Refactoring
### What problem does this PR solve?
Add a script to migrate data in tenant_llm into tenant_model_provider.
### Type of change
- [x] Other (please describe): tool script.
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Users can now chat with the LLM in 'think mode'.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Problem Description
When a user creates Dataset A using the **Tag parser** (for CSV/Excel
files with tag definitions), and then creates Dataset B, the Tag Sets
dropdown in Dataset B's Configuration page cannot display Dataset A.
### Steps to Reproduce
1. Create Dataset A with **Tag** as the chunking method
2. Upload a CSV file to Dataset A to generate tags
3. Create Dataset B
4. Navigate to Dataset B → Configuration → Tag Sets
5. **Expected**: Dataset A should appear in the dropdown
6. **Actual**: The dropdown is empty, Dataset A is not visible
---
## Root Cause Analysis
After thorough code review, **the original code logic is correct**. The
`chunk_method` field flows properly through the system:
### Data Flow
```mermaid
sequenceDiagram
participant Frontend
participant Pydantic
participant API
participant Database
Note over Frontend,Database: Creating a Tag Dataset
Frontend->>Pydantic: POST {chunk_method: "tag"}
Pydantic->>API: serialization_alias converts<br/>chunk_method → parser_id
API->>Database: INSERT {parser_id: "tag"}
Note over Frontend,Database: Querying Datasets
Frontend->>API: GET /api/v1/datasets
API->>Database: SELECT parser_id, ...
Database-->>API: Returns {parser_id: "tag"}
API->>API: remap_dictionary_keys()<br/>parser_id → chunk_method
API-->>Frontend: {chunk_method: "tag"}
Note over Frontend: Filter: x.chunk_method === 'tag'
Note over Frontend: ✅ Match found!
```
### Field Mapping
**Location**: `api/utils/api_utils.py:657-662`
```python
DEFAULT_KEY_MAP = {
"chunk_num": "chunk_count",
"doc_num": "document_count",
"parser_id": "chunk_method", # Maps DB field to API response
"embd_id": "embedding_model",
}
```
### Frontend Filtering (Already Correct)
**Location**:
`web/src/pages/dataset/dataset-setting/components/tag-item.tsx:24`
```typescript
const knowledgeOptions = knowledgeList
.filter((x) => x.chunk_method === 'tag') // ✅ Correct field
.map((x) => ({...}));
```
---
## Actual Issue
The most likely causes for the "bug" are:
1. **Browser Cache**: Old data cached before proper deployment
2. **Stale Data**: Datasets created before the code was fully deployed
3. **Container Not Restarted**: Changes not applied to running container
---
## Resolution
**No code changes are needed.** The existing code correctly:
1. Accepts `chunk_method` from frontend
2. Converts to `parser_id` via Pydantic serialization_alias
3. Stores in database as `parser_id`
4. Maps back to `chunk_method` in API response
5. Frontend filters by `chunk_method === 'tag'`
### What problem does this PR solve?
Update the customer feedback dispatcher template and introduce a new
operator `Variable Aggregator`.
### Type of change
- [x] Other (please describe): Template change
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Place the language configuration in web/.env for easy user
configuration.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
`switch.py` line 137 concatenates the operator directly after the text
without a separator:
`'Not supported operator' + operator` produces
`"Not supported operatorXXX"`.
Changed to: `f'Not supported operator: {operator}'`
### What problem does this PR solve?
feat(File Management): Refactor File List API and Add Knowledge Base
Document Initialization
- Migrate the file list API endpoint from `/v1/file/list` to
`/api/v1/files` to align with the Python implementation.
- Add logic for initializing knowledge base documents; automatically
create the `.knowledgebase` folder and associated documents when
retrieving the root directory.
- Enhance parameter validation and error handling, including the
introduction of a new `CodeParamError` error code.
- Optimize the file list response structure to match the implementation
on the Python side.
- Update the Vite configuration to support proxying the new
`/api/v1/files` endpoint.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
- The Azure SPN storage handler hardcoded
`AzureAuthorityHosts.AZURE_CHINA`, preventing users in Azure Public
Cloud regions (UK-South, EU, US, etc.) from authenticating
- Add a `cloud` config option (env: `AZURE_CLOUD`) supporting all four
Azure sovereignties: `public`, `china`, `government`, `germany`
- Defaults to `public` (global Azure) — the most common international
use case
Closes #13259
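A minimal sketch of how the `cloud` option could be resolved, assuming the environment variable wins over the config value and that `public` is the fallback. The helper name `resolve_authority` and the plain-string host table are illustrative only; the actual handler may use the Azure SDK's own constants instead.

```python
import os

# Authority hosts keyed by the `cloud` values listed above.
# This table is an illustration, not the handler's real data structure.
AUTHORITY_HOSTS = {
    "public": "login.microsoftonline.com",
    "china": "login.chinacloudapi.cn",
    "government": "login.microsoftonline.us",
    "germany": "login.microsoftonline.de",
}


def resolve_authority(config: dict) -> str:
    """Hypothetical helper: AZURE_CLOUD env var first, then config, then 'public'."""
    cloud = os.environ.get("AZURE_CLOUD") or config.get("cloud") or "public"
    cloud = cloud.strip().lower()
    if cloud not in AUTHORITY_HOSTS:
        raise ValueError(f"Unsupported Azure cloud: {cloud!r}")
    return AUTHORITY_HOSTS[cloud]
```

Rejecting unknown values early keeps a typo in the config from silently falling back to the wrong sovereignty.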
## Test plan
- [ ] Verify default (`cloud: public`) connects to Azure Public Cloud
endpoints
- [ ] Verify `cloud: china` retains existing behavior for Azure China
users
- [ ] Verify `AZURE_CLOUD` env var overrides the config file value
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Replace `quay.io/minio/minio` with `pgsty/minio` community fork in
`docker/docker-compose-base.yml`
MinIO stopped distributing pre-built Docker images and changed its
license. The pgsty/minio fork provides drop-in compatible images under
AGPLv3.
Closes #13840
## Test plan
- [x] Verify `docker compose -f docker/docker-compose-base.yml up -d`
pulls the pgsty/minio image successfully
- [ ] Verify MinIO console accessible on port 9001
- [ ] Verify RAGFlow backend can connect to MinIO and perform file
operations normally
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
feat: Implement file upload and folder creation features
- Add file upload route in router.go
- Add file operation methods in dao/file.go
- Add util/file.go for file type detection and filename handling
- Implement file upload and folder creation endpoints in handler/file.go
- Implement file upload and folder creation logic in service/file.go
- Modify response message format in memory.go
- Add document count method in dao/document.go
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Introduce 5 new tables, including model groups and provider instance.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Search() in Infinity can now return row_id.
2. To get ROW_ID from search(), refer to the handling in retrieval_test.
Example:
```
$ curl -s -X POST "http://localhost:$PORT/v1/chunk/retrieval_test" -H "Authorization: $TOKEN" -H "Content-Type: application/json" -d '{"kb_id": "4fcd01582ca911f1954184ba59049aa3", "question": "曹操"}'
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR fixes a race in batch document parsing where overlapping parse
requests for the same document could clear/rewrite chunk state and make
previously parsed content appear lost. It adds an atomic per-document
parse guard so only one parse can run at a time for that document (Fixes
#13864).
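The idea of a per-document guard can be sketched as below. This is an in-process illustration only: the class name `ParseGuard` is hypothetical, and the actual change may rely on a database or cache-level atomic operation rather than a thread lock.

```python
import threading


class ParseGuard:
    """Hypothetical in-process guard: at most one parse per document id."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: set[str] = set()

    def try_acquire(self, doc_id: str) -> bool:
        # The membership check and the insert happen under one lock,
        # so two overlapping requests cannot both succeed.
        with self._lock:
            if doc_id in self._running:
                return False
            self._running.add(doc_id)
            return True

    def release(self, doc_id: str) -> None:
        with self._lock:
            self._running.discard(doc_id)
```

A caller would skip (or reject) the parse request when `try_acquire` returns `False`, and call `release` in a `finally` block once parsing ends.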
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR fixes WebDAV sync behavior for unsupported file types
([#13795](https://github.com/infiniflow/ragflow/issues/13795)).
Previously, the WebDAV connector selected files primarily by modified
time (and size threshold) and could still pass unsupported extensions
into the download/document-generation path. This caused unnecessary
processing and inconsistent behavior compared with connectors that
validate file type earlier.
This change adds extension validation in two places:
1. **Early filter during recursive listing** to skip unsupported files
before they enter the download flow.
2. **Defensive filter before download/document creation** to prevent
unsupported files from being processed if any listing edge case slips
through.
It also wires `allow_images` into the WebDAV sync path so image
extension handling follows connector policy.
Scope is intentionally limited to WebDAV for a focused bug-fix PR.
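The two filter points described above can be sketched roughly as follows. The extension sets and function names are placeholders chosen for illustration; the connector's real lists and helpers are not shown in this description.

```python
from pathlib import PurePosixPath

# Illustrative extension sets, not the connector's actual configuration.
DOCUMENT_EXTENSIONS = {".pdf", ".txt", ".md"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported(path: str, allow_images: bool) -> bool:
    ext = PurePosixPath(path).suffix.lower()
    if ext in DOCUMENT_EXTENSIONS:
        return True
    return allow_images and ext in IMAGE_EXTENSIONS


def list_candidates(paths, allow_images=False):
    # Stage 1: early filter while walking the remote listing.
    return [p for p in paths if is_supported(p, allow_images)]


def download_all(paths, fetch, allow_images=False):
    docs = []
    for p in paths:
        # Stage 2: defensive re-check right before download.
        if not is_supported(p, allow_images):
            continue
        docs.append(fetch(p))
    return docs
```

Both stages share one predicate, so the early filter and the defensive check cannot drift apart.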
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### How was this tested?
- Manual verification with mixed file types under the configured WebDAV
path:
- supported: `.pdf`, `.txt`, `.md`
- unsupported: `.exe`, `.bin`, `.dat`
- Triggered full sync and polling sync.
- Confirmed unsupported files are skipped before download.
- Confirmed supported files are still indexed normally.
- Confirmed image handling follows `allow_images` setting.
Fixes: #13795
### What problem does this PR solve?
Fixes markdown tables being parsed twice (once as markdown and again as
generated HTML), which caused duplicate table chunks in the chunk list
UI.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Two small fixes:
1. **iterationitem.py line 72**: Typo "interationitem" → "iterationitem"
(missing 't'). The component name check never matched IterationItem
components.
2. **raptor.py line 94**: Error message "Embedding error: " had a
trailing colon with no details. Changed to "Embedding error: empty
embeddings returned".
### What problem does this PR solve?
Fix: The dataset on the list page cannot be renamed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Implement InsertDataset and InsertMetadata in Go.
New internal CLI commands for Go:
INSERT DATASET FROM FILE "file_name"
INSERT METADATA FROM FILE "file_name"
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: If a model configured in the agent is deleted from the user
center, a notification will be displayed on the canvas with a red
border.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
As title.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The agent form sheet will be obscured by the message log sheet.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Previously, `apikey_required` called
`request.headers.get('Authorization').split()[1]` without checking for
None or insufficient parts, causing an unhandled AttributeError or
IndexError (500) instead of a proper 403 JSON response.
This applies the same guarding pattern already used by `token_required`
in the same file.
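A small sketch of the guarding pattern, assuming the header has the usual `<scheme> <token>` shape. The function name `extract_api_key` is hypothetical and stands in for the inline logic of the decorator.

```python
def extract_api_key(authorization):
    """Return the token from '<scheme> <token>', or None if the header is unusable."""
    if not authorization:
        # Header missing or empty: nothing to split.
        return None
    parts = authorization.split()
    if len(parts) < 2:
        # Scheme without a token, e.g. just "Bearer".
        return None
    return parts[1]
```

When this returns `None`, the decorator can answer with the 403 JSON response instead of letting an `AttributeError` or `IndexError` bubble up as a 500.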
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Fix: Unable to reconnect after deleting the connection between begin and
parser #13868
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The chat settings are not displayed correctly on the first page
load.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix the handling of special characters in the matching text of search().
Some special characters (such as `?`, `*`, and `:`) must be escaped before
the text is passed to matching_text of search().
Fixes https://github.com/infiniflow/ragflow/issues/13729
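As a rough illustration of the escaping step, the sketch below prefixes each special character with a backslash. Both the character set (beyond `?`, `*`, and `:`) and the backslash convention are assumptions; the query syntax actually expected by the engine should be checked before reusing this.

```python
import re

# Assumed set of characters to escape; the backslash itself is included
# so that already-present backslashes are not misread as escapes.
_SPECIAL = re.compile(r"([?*:\\])")


def escape_matching_text(text: str) -> str:
    """Hypothetical helper: backslash-escape special query characters."""
    return _SPECIAL.sub(r"\\\1", text)
```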
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Add REST APIs to dynamically query and modify log levels at runtime for
both Python (Flask) and Go servers.
Changes:
- common/log_utils.py: add set_log_level() and get_log_levels()
functions
- admin/server/routes.py: add GET/PUT /api/v1/admin/log_levels endpoints
- api/apps/system_app.py: add GET/PUT /api/{version}/system/log_levels
endpoints
- internal/logger/logger.go: add GetLevel() and SetLevel() with atomic
level support
- internal/handler/system.go: add GetLogLevel, SetLogLevel, Health
handlers
- internal/router/router.go: route /health to systemHandler
- internal/admin/handler.go: add GetLogLevel, SetLogLevel handlers
- internal/admin/router.go: add /api/v1/admin/log_level routes
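On the Python side, runtime level changes can be built on the standard `logging` module. The sketch below is an assumption about the shape of such helpers; the real `set_log_level()` and `get_log_levels()` signatures in `common/log_utils.py` are not shown here and may differ.

```python
import logging


def set_log_level(logger_name: str, level: str) -> None:
    """Hypothetical sketch: change a named logger's level at runtime."""
    numeric = logging.getLevelName(level.upper())
    if not isinstance(numeric, int):
        # Unknown names come back as a string such as "Level NOPE".
        raise ValueError(f"Unknown log level: {level!r}")
    logging.getLogger(logger_name).setLevel(numeric)


def get_log_levels(names):
    """Return the effective level name for each logger name."""
    return {
        name: logging.getLevelName(logging.getLogger(name).getEffectiveLevel())
        for name in names
    }
```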
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Fix incorrect Markdown heading mapping for `h4` in `TITLE_TAGS`
dictionary
- `h4` was mapped to `"#####"` (h5 level) instead of `"####"` (correct
h4 level)
Closes #13819
## Details
In `deepdoc/parser/html_parser.py`, the `TITLE_TAGS` dictionary had a
typo where `h4` was assigned 5 `#` characters instead of 4, causing h4
headings to be converted to h5-level Markdown headings during HTML
parsing.
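The intended invariant is that each heading tag maps to as many `#` characters as its level. A sketch that derives the mapping instead of typing it by hand (the real `TITLE_TAGS` dictionary may contain other entries and is not reproduced here):

```python
# Illustrative mapping: h1 -> "#", h2 -> "##", ... h6 -> "######".
TITLE_TAGS = {f"h{level}": "#" * level for level in range(1, 7)}


def heading_to_markdown(tag: str, text: str) -> str:
    """Hypothetical helper: render one HTML heading as a Markdown heading."""
    return f"{TITLE_TAGS[tag]} {text}"
```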
## Test plan
- [ ] Parse an HTML document containing `<h4>` tags and verify the
output uses `####` (4 hashes)
- [ ] Verify other heading levels remain correct
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Asksksn <Asksksn@noreply.gitcode.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Enable reading Tag Set tags via API (expose tag_kwd field). The result
of the queried list chunks is as shown below:
<img width="1422" height="818" alt="image"
src="https://github.com/user-attachments/assets/abd1960a-fe34-489e-9d72-525f8e574938"
/>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: heyang.why <heyang.why@alibaba-inc.com>
### What problem does this PR solve?
Support public RSS/Atom feed URLs as data sources for RAGFlow.
Link: https://github.com/infiniflow/ragflow/issues/12313
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Remove antd-related code and upgrade lucide-react to the latest
version.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Add Go tests
2. Update CI process
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
When using Infinity as DOC_ENGINE with parent-child chunker enabled,
vector insertion fails because the "mom" field is missing from the index
mapping. This fix adds the required field to resolve the issue.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Adds `pyasn1>=0.6.3` as a `[tool.uv.constraint-dependencies]` entry to
mitigate **CVE-2026-30922** (CVSS 7.5 HIGH)
- Regenerates `uv.lock` so the resolved pyasn1 version moves from
**0.6.2 to 0.6.3**
## Details
**CVE-2026-30922** is a Denial of Service vulnerability in pyasn1 caused
by unbounded recursion when decoding ASN.1 data with deeply nested
structures. An attacker can send crafted payloads with thousands of
nested SEQUENCE or SET tags to trigger a `RecursionError` crash or
memory exhaustion.
- **Severity:** HIGH (CVSS 7.5)
- **Affected versions:** pyasn1 < 0.6.3
- **Fixed in:** pyasn1 >= 0.6.3
- **NVD:** https://nvd.nist.gov/vuln/detail/CVE-2026-25769
`pyasn1` is not a direct dependency of RAGFlow but is pulled in
transitively via `google-auth` -> `rsa` -> `pyasn1-modules` -> `pyasn1`.
The `constraint-dependencies` mechanism in uv is the correct way to
enforce a minimum version for transitive dependencies without polluting
the direct dependency list.
## Test plan
- [x] `pyproject.toml` passes TOML validation
- [x] `uv lock` resolves successfully with the new constraint
- [x] pyasn1 version in `uv.lock` is now 0.6.3
- [ ] Existing CI/CD tests continue to pass
Closes #13686
### What problem does this PR solve?
Feat: Add Memory functionality in Go
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
## Summary
Closes #13803
The `__images__` method in `paddleocr_parser.py` defaulted to
`page_to=100`, only loading the first 100 pages for image cropping.
However, the PaddleOCR API processes **all** pages of the PDF. For PDFs
with more than 100 pages, page indices beyond 99 were rejected as out of
range during crop validation, causing content loss.
## Root Cause
```
__images__(page_to=100) → loads pages 0-99 → page_images has 100 entries
PaddleOCR API → processes all 226 pages → tags reference pages 1-226
extract_positions() → converts tag "101" to index 100
crop() validation → 0 <= 100 < 100 → False → "All page indices [100] out of range"
```
## Fix
Changed `page_to` default from `100` to `10**9`, so all PDF pages are
loaded for cropping. Python's list slicing safely handles oversized
indices.
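The slicing behavior the fix relies on can be checked in isolation. The list below is a stand-in for the rendered page images, and `can_crop` only mirrors the shape of the range check quoted under "Root Cause"; neither is the parser's actual code.

```python
pages = [f"page-{i}" for i in range(226)]  # stand-in for 226 rendered pages

old = pages[0:100]       # previous default: only the first 100 pages
new = pages[0:10**9]     # new default: the slice simply stops at the end


def can_crop(page_index: int, page_images: list) -> bool:
    # Same shape as the validation `0 <= index < len(page_images)`.
    return 0 <= page_index < len(page_images)
```

With the old slice, index 100 fails the check; with the new one, every index up to 225 passes.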
## Test plan
- [ ] Parse a PDF with >100 pages using PaddleOCR — no more "out of
range" warnings
- [ ] Parse a PDF with <100 pages — behavior unchanged
- [ ] Verify cropped images are generated correctly for all pages
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Asksksn <Asksksn@noreply.gitcode.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Added Tailwind truncation classes (`inline-block max-w-[120px]
truncate align-middle`) to the username `<span>` in `SharedBadge` to
prevent long usernames from wrapping onto multiple lines
- Added `title` attribute to show the full username on hover when
truncated

## Test plan
- [x] Verify long usernames display truncated with ellipsis (`...`)
- [x] Verify hovering over a truncated username shows the full name as a
tooltip
- [x] Verify short usernames display normally without truncation
Closes #13748
### What problem does this PR solve?
- Add multiple output formats to ragflow_cli
- Initialize contextengine to Go module
- ls datasets/ls files
- cat file
- search -d dir -q query
issue: #13714
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes issue #13799, where team members get a "model not authorized" error
when running RAG on an admin-shared knowledge base after the admin changes
the KB embedding model (for example, to bge-m3).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Allow creating datasets with parse_type == 1/None and chunk_method, or
with parse_type == 2 and pipeline_id.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Allow creating a dataset with the resume chunk_method.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The chunk method of the knowledge base cannot be saved.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Problem
The /file2document/convert endpoint ran all file lookups, document
deletions, and insertions synchronously inside the request cycle. Linking
a large folder (~1.7GB with many files) caused a 504 Gateway Timeout
because the blocking DB loop held the HTTP connection open for too long.
Fix
- Extracted the heavy DB work into a plain sync function _convert_files
- Inputs are validated and folder file IDs expanded upfront (fast path)
- The blocking work is dispatched to a thread pool via
get_running_loop().run_in_executor() and the endpoint returns 200
immediately
- Frontend only checks data.code === 0 so the response change
(file2documents list → True) has no impact
Fixes #13781
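The dispatch pattern can be sketched as below. The body of `_convert_files` is a stand-in for the blocking DB loop, and the `convert` coroutine and its return shape are simplified assumptions; the real endpoint validates inputs and expands folder IDs first.

```python
import asyncio
import time


def _convert_files(file_ids, results):
    # Stand-in for the blocking per-file DB work.
    for fid in file_ids:
        time.sleep(0.01)
        results.append(fid)


async def convert(file_ids, results):
    loop = asyncio.get_running_loop()
    # Hand the blocking function to the default thread pool and return
    # right away; the future is returned here only so a caller can wait on it.
    future = loop.run_in_executor(None, _convert_files, file_ids, results)
    return {"code": 0, "data": True}, future
```

The trade-off is that failures inside the background work no longer reach the HTTP response, so they have to be surfaced through logs or status fields instead.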
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Add command: logout
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Implement Create/Drop Index and Metadata Index in Go.
New API handling in Go:
POST /kb/index
DELETE /kb/index
POST /tenant/doc_meta_index
DELETE /tenant/doc_meta_index
CREATE INDEX FOR DATASET 'dataset_name' VECTOR_SIZE 1024;
DROP INDEX FOR DATASET 'dataset_name';
CREATE INDEX DOC_META;
DROP INDEX DOC_META;
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Refactor the Searches /search API to RESTful style.
### Type of change
- [x] Documentation Update
- [x] Refactoring
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
- GENERATE TOKENS OF USER 'xxx@xxx.com'
- DROP KEY 'ragflow-yyyyy' OF 'xxx@xxx.com'
- LIST KEYS OF 'xxx@xxx.com'
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Fix the issue of errors when creating datasets.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix: Using AvatarUpload in a dialog and pressing Enter will cause a file
selection pop-up to appear. #13779
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
CI isn't stable, try to fix it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The removal of cargo in commit f59d96f87 also removed build-essential
which was needed to compile C extension packages like datrie.
Use aliyun mirror for coverage pip install
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Metadata, chunk, and dataset related bugs
- metadata not show add button #13731
- chunk edit question style
- dataset modified chunk method bug
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix `migrate_add_unique_email` silently skipping the unique constraint
when a non-unique `user_email` index exists.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes a bug in the Asana connector where providing `Project IDs` caused
sync to fail with:
`project_membership: Not a recognized ID: <PROJECT_GID>`
Root cause: the connector called `get_project_membership(project_gid)`,
but that API expects a **project membership gid**, not a **project
gid**.
This PR switches to the correct project-scoped API and adds regression
tests.
Fixes: [#13669](https://github.com/infiniflow/ragflow/issues/13669)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes made
- Updated `common/data_source/asana_connector.py`:
- Replaced `get_project_membership(pid, ...)` with
`get_project_memberships_for_project(pid, ...)`
- Trimmed and filtered `asana_project_ids` parsing to avoid
empty/whitespace IDs
- Normalized `asana_team_id` by trimming whitespace
- Used safer access for membership email extraction (`m.get("user")`)
- Added `test/unit_test/common/test_asana_connector.py`:
- Verifies the correct project-membership API method is called
- Verifies empty `project_ids` path returns workspace emails
- Verifies project/team input normalization behavior
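The input normalization can be illustrated with a short helper. It assumes the project IDs arrive as one comma-separated string, which this description does not state; the helper name `parse_project_ids` is hypothetical.

```python
def parse_project_ids(raw):
    """Split an assumed comma-separated string, dropping blank entries."""
    if not raw:
        return []
    return [part.strip() for part in raw.split(",") if part.strip()]
```

An empty result corresponds to the "empty `project_ids`" path, where the connector falls back to workspace emails.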
### Compatibility / risk
- Non-breaking bug fix
- No API contract changes
- Existing behavior for empty `Project IDs` remains unchanged
### What problem does this PR solve?
Implement GetChunk() in Infinity in Go.
Add CLI commands:
GET CHUNK 'XXX';
LIST CHUNKS OF DOCUMENT 'XXX';
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Go cli
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: This resolves the issue where selecting a knowledge base in chat
could not differentiate between different users.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactor the Files /file API to RESTful style.
### Type of change
- [x] Documentation Update
- [x] Refactoring
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Minor fix.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Hu Di <812791840@qq.com>
## Summary
Add a complete Turkish translation of the README and include a Turkish
language badge across all existing README files.
## Changes
- **New file**: `README_tr.md` - Full Turkish translation of README.md,
covering all sections (What is RAGFlow, Demo, Latest Updates, Key
Features, System Architecture, Get Started, Configurations, Docker
Image, Development from Source, Documentation, Roadmap, Community,
Contributing)
- **Updated 9 existing README files** (README.md, README_zh.md,
README_tzh.md, README_ja.md, README_ko.md, README_id.md,
README_pt_br.md, README_fr.md, README_ar.md) to include the Turkish
language badge in the language selector
## Impact
- 10 files changed, 417 insertions
- Follows the same structure and conventions as other language-specific
README files (README_ja.md, README_ko.md, etc.)
- Turkish badge uses the same styling pattern (highlighted with DBEDFA
in README_tr.md, standard DFE0E5 in others)
---------
Co-authored-by: bakiburakogun <bakiburakogun@users.noreply.github.com>
## Summary
Complete and improve the existing Turkish (tr.ts) localization to fully
match the English (en.ts) reference file.
## Changes
- **Translate 6 English model tips** in the setting section
(chatModelTip, embeddingModelTip, img2txtModelTip, sequence2txtModelTip,
rerankModelTip, ttsModelTip) to Turkish
- **Expand all 13 truncated parser HTML descriptions** (book, laws,
manual, naive, paper, presentation, qa, resume, table, picture, one,
knowledgeGraph, tag) to match the full en.ts structure
- **Expand shortened tooltips** across knowledgeDetails,
knowledgeConfiguration, chat, and setting sections (~40+ tooltips
expanded)
- **Add missing translation details** for data source connectors
(SeaFile, Jira, Gmail, Moodle, Dropbox, Google Drive, etc.)
## Impact
- 182 insertions, 71 deletions in web/src/locales/tr.ts
- No structural changes, only translation content improvements
- All application terminology maintained consistently
Co-authored-by: bakiburakogun <bakiburakogun@users.noreply.github.com>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Fix: Fixed the issue where agent log time could not be selected.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
As title to be compatible with go server
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Let Excel parsing use the lazy image loader.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix: type check in resume parsing method
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Saving dataset settings failed with validation error 101 (Extra inputs
are not permitted)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The tokenizer in Infinity was modified in
https://github.com/infiniflow/infinity/pull/3330; sync the code change
to the cpp files in RAGFlow.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add CLI commands:
LIST DOCUMENTS OF DATASET quoted_string ";"
LIST METADATA OF DATASETS quoted_string ("," quoted_string)* ";"
LIST METADATA SUMMARY OF DATASET quoted_string (DOCUMENTS quoted_string
("," quoted_string)*)? ";"
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Get user_id from the canvas variable when the input is a {} pattern value.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The retrieval_test interface is continuously requested when the
user enters a question. #13719
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Init Minio / S3 / OSS
2. Fix minio / s3 / oss config
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Allow admin@ragflow.io to log in to the Go RAGFlow server
2. Fix go server start error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Adds Perplexity contextualized embeddings API as a new model provider,
as requested in #13610.
- `PerplexityEmbed` provider in `rag/llm/embedding_model.py` supporting
both standard (`/v1/embeddings`) and contextualized
(`/v1/contextualizedembeddings`) endpoints
- All 4 Perplexity embedding models registered in
`conf/llm_factories.json`: `pplx-embed-v1-0.6b`, `pplx-embed-v1-4b`,
`pplx-embed-context-v1-0.6b`, `pplx-embed-context-v1-4b`
- Frontend entries (enum, icon mapping, API key URL) in
`web/src/constants/llm.ts`
- Updated `docs/guides/models/supported_models.mdx`
- 22 unit tests in `test/unit_test/rag/llm/test_perplexity_embed.py`
Perplexity's API returns `base64_int8` encoded embeddings (not
OpenAI-compatible), so this uses a custom `requests`-based
implementation. Contextualized vs standard model is auto-detected from
the model name.
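Decoding a `base64_int8` payload could look like the sketch below. It assumes each embedding is a base64 string whose bytes are signed 8-bit integers, one per dimension; the exact response layout is not shown in this description, and the helper name is hypothetical.

```python
import array
import base64


def decode_base64_int8(payload: str) -> list:
    """Hypothetical helper: base64 string -> list of signed 8-bit ints."""
    raw = base64.b64decode(payload)
    # Type code "b" interprets every byte as a signed 8-bit integer.
    return array.array("b", raw).tolist()
```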
Closes #13610
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
When using pagination in the Dataset file list or File Manager,
selecting row N on page 1 would incorrectly cause row N on page 2 (and
subsequent pages) to also appear selected. This is a state pollution
bug.
### Root Cause
TanStack React Table defaults to using array indices (0, 1, 2...) as
`rowSelection` keys. With server-side (manual) pagination, each page's
rows start from index 0, so a selection like `{2: true}` on page 1 also
matches index 2 on every other page.
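The collision can be reproduced with a language-neutral model, written here in Python rather than TypeScript. The page data and the `selected_rows` helper are made up for illustration and do not mirror the TanStack Table API.

```python
page1 = [{"id": "doc-a"}, {"id": "doc-b"}, {"id": "doc-c"}]
page2 = [{"id": "doc-d"}, {"id": "doc-e"}, {"id": "doc-f"}]


def selected_rows(rows, selection, key_fn):
    # A row counts as selected when its key is present in the selection map.
    return [row["id"] for i, row in enumerate(rows) if selection.get(key_fn(i, row))]


def by_index(i, row):
    return str(i)      # positional key, restarts at 0 on every page


def by_id(i, row):
    return row["id"]   # stable key, unique across pages
```

With positional keys, `{"2": True}` selects `doc-c` on page 1 and also `doc-f` on page 2; with ID keys, `{"doc-c": True}` matches nothing on page 2.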
### Fix
- Added `getRowId: (row) => row.id` to `useReactTable` in both
`DatasetTable` and `FilesTable`, so selection state is keyed by unique
document/file IDs instead of positional indices.
- Updated the `useSelectedIds` helper to support ID-based selection keys
while maintaining backward compatibility with index-based keys.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Files Changed
| File | Change |
|------|--------|
| `web/src/pages/dataset/dataset/dataset-table.tsx` | Added `getRowId`
to table config |
| `web/src/pages/files/files-table.tsx` | Added `getRowId` to table
config |
| `web/src/hooks/logic-hooks/use-row-selection.ts` | Updated
`useSelectedIds` to handle ID-based selection |
### What problem does this PR solve?
Fix: Enhanced the user deletion function to return detailed deletion
information.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: CREATE / DELETE / LIST dataset api in Go
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Lynn <lynn_inf@hotmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Environment variables take precedence over the config file.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
The `odr` variable was configured with `desc("weight_flt")` but a new
empty `OrderByExpr()` was passed to `dataStore.search()` instead,
causing the descending sort to have no effect.
### What problem does this PR solve?
In `_community_retrieval_`, the configured `OrderByExpr` with
`desc("weight_flt")` was discarded — a new empty `OrderByExpr()` was
passed to `dataStore.search()` instead, so community reports were never
sorted by weight.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Define a crypt function in admin directory, remove import from
api.utils. And move requests-toolbelt to dependency.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Split the dataset API into gateway and service layers, and modify the
web UI to use the RESTful HTTP API.
2. Old KB-related APIs are commented out.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix graphrag extractor chat response parsing and skip truncated cache
values
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Refactor go server log
2. Update docker building, since nginx config should be set according to
the deployment.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes [#13505](https://github.com/infiniflow/ragflow/issues/13505): Jira
incremental sync could miss updated issues after initial sync,
especially near time boundaries.
Root cause:
- Jira JQL uses minute-level precision for `updated` filters.
- Incremental windows had no overlap buffer, so boundary updates could
be skipped.
- Sync log cursor tracking used a backward-facing update for
`poll_range_start`.
- Existing-doc updates in `upload_document` lacked a KB ownership guard
for doc-id collisions.
What changed:
- Added Jira incremental overlap buffer (`time_buffer_seconds`,
defaulting to `JIRA_SYNC_TIME_BUFFER_SECONDS`) when building JQL
lower-bound time.
- Preserved second-level post-filtering to avoid duplicate reprocessing
while still catching boundary updates.
- Improved Jira sync logging to include start/end window and overlap
configuration.
- Updated sync cursor tracking in `increase_docs` to keep
`poll_range_start` moving forward with max update time.
- Added KB ID safety check before updating existing document records in
`upload_document`.
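The overlap buffer and the second-level post-filter can be sketched roughly as follows. The function names, the issue dictionary shape, and the strict `>` comparison are assumptions made for illustration; only `time_buffer_seconds` and the minute-level JQL precision come from the description above.

```python
from datetime import datetime, timedelta, timezone


def jql_lower_bound(start, time_buffer_seconds):
    """Widen the window by the buffer, then format at minute precision."""
    buffered = start - timedelta(seconds=time_buffer_seconds)
    return buffered.strftime("%Y-%m-%d %H:%M")


def keep_updated(issues, start):
    """Second-level post-filter: keep only issues updated after the true start."""
    return [issue for issue in issues if issue["updated"] > start]


start = datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
lower_bound = jql_lower_bound(start, 120)

fetched = [
    {"key": "A", "updated": datetime(2024, 1, 1, 12, 0, 45, tzinfo=timezone.utc)},
    {"key": "B", "updated": datetime(2024, 1, 1, 11, 59, 10, tzinfo=timezone.utc)},
]
kept = keep_updated(fetched, start)
```

The widened query returns both issues, but only `A` was updated after the real cursor, so `B` is not reprocessed.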
Verification performed:
- Python syntax compile checks passed for modified files.
- Manual verification flow:
1. Run full Jira sync.
2. Edit an already-indexed Jira issue.
3. Run next incremental sync.
4. Confirm updated content is re-ingested into KB.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Closes #13277
### What problem does this PR solve?
Adds `{variable_name}` (and `{component@variable}`) interpolation
support to HTTP header values in the `Invoke` component, matching the
existing URL interpolation behavior.
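A rough sketch of this kind of interpolation, assuming a flat variable mapping and a regex-based substitution. The helper name, the placeholder pattern, and the choice to leave unknown names untouched are illustrative assumptions, not the `Invoke` component's actual implementation.

```python
import re

# Matches {name} and {component@name}; the allowed characters are an assumption.
_PLACEHOLDER = re.compile(r"\{([\w.]+(?:@[\w.]+)?)\}")


def interpolate_headers(headers, variables):
    """Replace placeholders in header values with values from `variables`."""

    def resolve(match):
        name = match.group(1)
        # Unknown names are left as-is rather than replaced with an empty string.
        return str(variables[name]) if name in variables else match.group(0)

    return {key: _PLACEHOLDER.sub(resolve, value) for key, value in headers.items()}


headers = {
    "Authorization": "Bearer {token}",
    "X-Trace": "{begin@trace_id}",
    "X-Raw": "{missing}",
}
resolved = interpolate_headers(headers, {"token": "abc", "begin@trace_id": "t-1"})
```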
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<img width="1280" height="867" alt="image"
src="https://github.com/user-attachments/assets/8ab7b4e9-7cc0-4a7f-8a5f-f838a15a5fda"
/>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
RAGFlow had no Turkish language support. This PR adds Turkish (tr)
locale translations to the UI.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Co-authored-by: Mustafa YILDIZ <mustafa.yildiz@cilek.com>
## Summary
Upgrade MiniMax model configuration to include the latest M2.7 model.
## Changes
- Add `MiniMax-M2.7` and `MiniMax-M2.7-highspeed` to the model selection
list in `conf/llm_factories.json`
- Place M2.7 models at the top of the list as the recommended default
- Retain all previous models (M2.5, M2.5-highspeed, M2.1, M2) as
available alternatives
## Why
MiniMax-M2.7 is the latest flagship model with enhanced reasoning and
coding capabilities. This update ensures RAGFlow users can access the
newest model while maintaining backward compatibility with existing
configurations.
## Testing
- JSON config validated (well-formed)
- No existing MiniMax-specific unit tests affected
- Model entries follow the same structure as existing entries
Co-authored-by: PR Bot <pr-bot@minimaxi.com>
### What problem does this PR solve?
Add a handler for GPT-5 models that do not accept certain parameters by
dropping them, and centralize the model-specific parameter handling
functions into a single helper.
solves issue #13639
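One way such a centralized helper could look is sketched below. The helper name, the prefix table, and the specific parameters listed are hypothetical; the description above does not say which parameters are dropped.

```python
# Hypothetical rule table: model-name prefix -> generation parameters to drop.
_UNSUPPORTED_PARAMS = {
    "gpt-5": {"temperature", "top_p", "presence_penalty", "frequency_penalty"},
}


def clean_gen_conf(model_name, gen_conf):
    """Return a copy of gen_conf without parameters the model rejects."""
    name = model_name.lower()
    dropped = set()
    for prefix, params in _UNSUPPORTED_PARAMS.items():
        if name.startswith(prefix):
            dropped |= params
    return {key: value for key, value in gen_conf.items() if key not in dropped}


conf = {"temperature": 0.1, "max_tokens": 64}
gpt5_conf = clean_gen_conf("gpt-5-mini", conf)
other_conf = clean_gen_conf("gpt-4o", conf)
```

Keeping the rules in one table means a new model family only needs a new entry rather than another special-case branch.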
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
1. Change go admin server port from 9385 to 9383 to avoid conflicts
2. Start go server after python servers are started completely, in
entrypoint.sh
3. Fix some database migration issues
4. Add more API routes in web to be compliant with EE.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Closes #1398
### What problem does this PR solve?
Adds native support for EPUB files. EPUB content is extracted in spine
(reading) order and parsed using the existing HTML parser. No new
dependencies required.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
To check this parser manually:
```bash
# Runs an inline Python script; the shell expands $HOME before Python sees it.
uv run --python 3.12 python -c "
from deepdoc.parser import EpubParser
with open('$HOME/some_epub_book.epub', 'rb') as f:
    data = f.read()
sections = EpubParser()(None, binary=data, chunk_token_num=512)
print(f'Got {len(sections)} sections')
for i, s in enumerate(sections[:5]):
    print(f'\n--- Section {i} ---')
    print(s[:200])
"
```
### What problem does this PR solve?
Using the builtin model when parsing raised an error because the code
expects fid==builtin, while split_model_name_and_factory returns id=None.
This PR allows the model to be accepted with or without @Builtin.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Export Agent Logs.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Fix: The dataset description should not be a required field.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix left preview containment regression for file previews
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Follow-up to "expose agent structured outputs in non-stream completions"
(#13389).
### Type of change
- [x] Documentation Update
- [x] Refactoring
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
Implement Search() for Infinity in Go.
The function can handle the following requests:
"search '曹操' on datasets 'infinity'"
"search '常胜将军' on datasets 'infinity'"
"search '卓越儒雅' on datasets 'infinity'"
"search '辅佐刘禅北伐中原' on datasets 'infinity'"
The output is exactly the same as that of the Python Search().
### Type of change
- [ ] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed an issue where agent template titles were not displayed in
Chinese mode.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue where the agent could not publish.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB-related APIs are commented out.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Add chunk also supports uploading image.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
add_chunk now supports adding an image.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix: paddle ocr coordinate lower > upper #13618
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
This pull request updates the GitHub Actions workflow for testing,
primarily to simplify Docker Compose usage and environment file
management. The main changes focus on removing unnecessary subdirectory
references, updating environment file handling, and streamlining the
workflow steps.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Shared chat link triggers infinite POST loop with empty question,
input disabled #13606
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Translate embedded dialog text.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
When the conversation gets long in a multi-model chat, the conversation
pushes the input bar offscreen.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add a timeout to fix build failures during the uv sync step.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Replace pypi.tuna.tsinghua.edu.cn with mirrors.aliyun.com to resolve
issues with missing packages on the Tsinghua mirror.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Split dataset api to gateway and service, and modify web UI to use
restful http api.
2. Old KB-related APIs are commented out.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Forces NLTK to load the corpus synchronously once, preventing concurrent
tasks from triggering the lazy-loading race condition that causes
`WordNetCorpusReader object has no attribute _LazyCorpusLoader_…` (#13590).
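The general idea of loading once before any concurrent use can be sketched with the standard library alone. `LazyCorpus` is a hypothetical stand-in for a lazily loaded resource; the sketch does not call NLTK and is not the change made in this PR.

```python
import threading


class LazyCorpus:
    """Hypothetical lazily loaded resource guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = None
        self.load_count = 0

    def ensure_loaded(self):
        # Only the first caller performs the load; later callers see it done.
        with self._lock:
            if self._data is None:
                self.load_count += 1
                self._data = {"dog": ["canine"]}

    def lookup(self, word):
        self.ensure_loaded()
        return self._data.get(word, [])


corpus = LazyCorpus()
corpus.ensure_loaded()  # eager load once, before any worker starts

workers = [threading.Thread(target=corpus.lookup, args=("dog",)) for _ in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Because the load has already finished by the time the workers start, none of them can observe a half-initialized resource.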
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: shakeel <shakeel@lollylaw.com>
### What problem does this PR solve?
Fix: model selection rule in get_model_config_by_type_and_name
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add the `user_id` field to the agent log table and the embedded
page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Fix go server date precision
2. Use API_SCHEME_PROXY to control the web API route
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Enhanced user management functionality and cascading data deletion.
Added tenant and related data initialization functionality during user
creation, including tenants, user-tenant relationships, LLM
configuration, and root folder.
Added cascading deletion logic for user deletion, ensuring that all
associated data is cleaned up simultaneously when a user is deleted.
Implemented a Werkzeug-compatible password hash algorithm (scrypt) and
verification functionality.
Added multiple DAO methods to support batch data operations and
cascading deletion.
Improved user login processing and added token signing functionality.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix CVE-2026-28804 and CVE-2026-31826.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the style of the release confirmation box.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: balibabu <assassin_cike@163.com>
Co-authored-by: 6ba3i <isbaaoui09@gmail.com>
### What problem does this PR solve?
Get user_id from canvas and record it.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Add more CLI commands
2. Add some license hooks
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The minimum-length rule and its message forced users to enter a
description for datasets, and the error message was also wrong.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Fixes #13544: PostgreSQL startup crash because
`update_tenant_llm_to_id_primary_key()` unconditionally uses
MySQL-specific SQL.
- Split `update_tenant_llm_to_id_primary_key()` into
`_update_tenant_llm_to_id_primary_key_mysql()` and
`_update_tenant_llm_to_id_primary_key_postgres()`, dispatching on
`settings.DATABASE_TYPE`
- MySQL path: unchanged (existing `DATABASE()`, `SET @row = 0`,
`AUTO_INCREMENT`, `DROP PRIMARY KEY` logic)
- PostgreSQL path: uses `current_database()`, `ROW_NUMBER() OVER (ORDER
BY ...)` for sequential IDs, `CREATE SEQUENCE` + `nextval()` for
auto-increment, and `information_schema.table_constraints` to find the
PK constraint name
- Also fix `migrate_add_unique_email()`: MySQL-only
`information_schema.statistics` is replaced with `pg_indexes` on
PostgreSQL
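The dispatch idea can be illustrated with a small sketch that picks a dialect-specific query by database type. The helper name and the accepted type strings are assumptions; the two SQL functions are the ones named in the bullets above.

```python
def current_database_sql(database_type):
    """Return the dialect-specific query for the current database name."""
    queries = {
        "mysql": "SELECT DATABASE()",
        "postgres": "SELECT current_database()",
    }
    try:
        return queries[database_type.lower()]
    except KeyError:
        # Fail loudly instead of silently running MySQL-only SQL elsewhere.
        raise ValueError(f"unsupported database type: {database_type}") from None


mysql_query = current_database_sql("mysql")
postgres_query = current_database_sql("postgres")
```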
## Test plan
- [ ] Start RAGFlow with `DB_TYPE=postgres` — startup should complete
without `function database() does not exist` error
- [ ] Start RAGFlow with `DB_TYPE=mysql` (default) — existing behaviour
unchanged, migration runs as before
- [ ] Fresh PostgreSQL install: verify `tenant_llm.id` column is created
as a serial primary key after migration
- [ ] Idempotency: running migration twice on PostgreSQL should be a
no-op (column already exists check passes)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: gambletan <gambletan@github>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Removed duplicate key that caused build warning during Vite build.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue #13465
POST /api/v1/retrieval failed with
{"code":100,...,"message":"Exception('Model Name is required')"} when
cross_languages was provided and no explicit llm_id was passed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Unify top level pages structure
- Standardize locale language codes (BCP 47) and time zones (IANA tz)
> **Note:**
> Newly created user info brings non-standard default values `timezone:
"UTC+8\tAsia/Shanghai"` and `language: "English"`.
### Type of change
- [x] Refactoring
## Summary
Add MiniMax's latest M2.5 model family to the model registry and update
the default API base URL to the international endpoint for broader
accessibility.
## Changes
- **Add MiniMax-M2.5 models** to `conf/llm_factories.json`:
- `MiniMax-M2.5` — Peak Performance. Ultimate Value. Master the Complex.
- `MiniMax-M2.5-highspeed` — Same performance, faster and more agile.
- Both support 204,800 token context window and tool calling (`is_tools:
true`).
- **Update default MiniMax API base URL** in `rag/llm/__init__.py`:
- From `https://api.minimaxi.com/v1` (domestic) to
`https://api.minimax.io/v1` (international).
- Chinese users can still override via the Base URL field in the UI
settings (as documented in existing i18n strings).
## Supported Models
| Model | Context Window | Tool Calling | Description |
|-------|---------------|-------------|-------------|
| `MiniMax-M2.5` | 204,800 tokens | Yes | Peak Performance. Ultimate
Value. |
| `MiniMax-M2.5-highspeed` | 204,800 tokens | Yes | Same performance,
faster and more agile. |
## API Documentation
- OpenAI Compatible API:
https://platform.minimax.io/docs/api-reference/text-openai-api
## Testing
- [x] JSON validation passes
- [x] Python syntax validation passes
- [x] Ruff lint passes
- [x] MiniMax-M2.5 API call verified (returns valid response)
- [x] MiniMax-M2.5-highspeed API call verified (returns valid response)
Co-authored-by: PR Bot <pr-bot@minimaxi.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
- Print the Go version in the log when the server starts
- Expose the server port in CI docker container
### Type of change
- [x] Other (please describe): For CI
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
RAGFlow server isn't available when admin server isn't connected.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feature (System Settings): Implemented system settings management
functionality
- Added a new SystemSettings model, including creation and update time
fields.
- Implemented SystemSettingsDAO, providing CRUD operations and
transaction support.
- Implemented management interfaces for variables, configurations, and
environment variables in the admin service.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
This PR fixes two security vulnerabilities in web dependencies
identified by Trivy:
1. CVE-2025-13465 (lodash): Prototype pollution vulnerability in _.unset
and _.omit functions
2. CVE-2026-0540 (dompurify): Cross-site scripting (XSS) vulnerability
**Changes:**
- Upgraded lodash from 4.17.21 to 4.17.23
- Upgraded dompurify from 3.3.1 to 3.3.2
- Added npm override to force monaco-editor's transitive dependency on
dompurify to use 3.3.2 (monaco-editor still depends on vulnerable 3.2.7)
Both upgrades are backward-compatible patch versions. Build verified
successfully with no breaking changes.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Add documentation for the `-p project_name` flag in the migration
script, covering all steps (stop, backup, restore, start)
- Add a note explaining how Docker volume name prefixes relate to the
Compose project name
- Update `docker-compose` to `docker compose` (Compose V2 syntax) for
consistency
- Fix `sh` to `bash` to match the script's shebang line
This is the documentation follow-up to #12187 which added `-p` project
name support to `docker/migration.sh`.
## Test plan
- [ ] Verify the documentation renders correctly on the docs site
- [ ] Confirm all example commands are accurate against the current
`migration.sh`
### What problem does this PR solve?
Implement: minio, s3, oss, azure_sas, azure_spn, gcs, opendal
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixes #13285
When an LLM returns a transient error (e.g. overloaded) during parsing,
the task progress is set to -1. Previously, the progress could never be
updated again, leaving the document permanently stuck in FAIL status
even after the task successfully recovered and completed.
Three coordinated changes address this:
1. task_service.update_progress: relax the progress update guard to
accept prog >= 1 even when current progress is -1, so a task that
recovers from a transient failure can report completion.
2. document_service.get_unfinished_docs: include documents that are
marked FAIL (progress == -1) but still have at least one non-failed task
(task.progress >= 0) in the polling set, so their status can be
re-synced once a task recovers. Documents where all tasks have
permanently failed are excluded to avoid unnecessary polling.
3. document_service.update_progress: explicitly set document status to
RUNNING when not all tasks have finished, instead of preserving whatever
stale status (potentially FAIL) the document previously had.
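The first two rules can be sketched as plain predicates. The function names are hypothetical, and the branches for tasks and documents that have not failed are assumptions about the pre-existing behaviour rather than something stated above.

```python
def should_update_progress(current, new):
    """Decide whether a task's stored progress may be overwritten."""
    if current == -1:
        # Relaxed guard: a failed task may still report completion.
        return new >= 1
    # Assumed pre-existing rule: failures always apply, otherwise move forward only.
    return new == -1 or new > current


def needs_polling(doc_progress, task_progresses):
    """Decide whether a document belongs in the unfinished polling set."""
    if doc_progress == -1:
        # Keep FAIL documents only while at least one task has not failed.
        return any(progress >= 0 for progress in task_progresses)
    # Assumed: any document that has not finished is polled.
    return doc_progress < 1


recovered = should_update_progress(-1, 1.0)
still_failed = should_update_progress(-1, 0.5)
poll_partial_failure = needs_polling(-1, [-1, 0.4])
poll_total_failure = needs_polling(-1, [-1, -1])
```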
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: image pdf in ingestion pipeline #13550
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR adds support for parsing PDFs through an external Docling
server, so RAGFlow can connect to remote `docling serve` deployments
instead of relying only on local in-process Docling.
It addresses the feature request in
[#13426](https://github.com/infiniflow/ragflow/issues/13426) and aligns
with the external-server usage pattern already used by MinerU.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What is changed?
- Add external Docling server support in `DoclingParser`:
- Use `DOCLING_SERVER_URL` to enable remote parsing mode.
- Try `POST /v1/convert/source` first, and fallback to
`/v1alpha/convert/source`.
- Keep existing local Docling behavior when `DOCLING_SERVER_URL` is not
set.
- Wire Docling env settings into parser invocation paths:
- `rag/app/naive.py`
- `rag/flow/parser/parser.py`
- Add Docling env hints in constants and update docs:
- `docs/guides/dataset/select_pdf_parser.md`
- `docs/guides/agent/agent_component_reference/parser.md`
- `docs/faq.mdx`
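The endpoint fallback described above might be sketched like this. The `post` callable is injected so the sketch stays free of network code, and treating any non-200 status as a reason to fall back is an assumption; the real parser may use a narrower condition.

```python
def convert_via_server(base_url, payload, post):
    """Try the v1 endpoint first, then fall back to v1alpha.

    `post` is a callable (url, payload) -> (status_code, body).
    """
    last_status = None
    for path in ("/v1/convert/source", "/v1alpha/convert/source"):
        status, body = post(base_url.rstrip("/") + path, payload)
        if status == 200:
            return body
        last_status = status
    raise RuntimeError(f"Docling server returned status {last_status}")


calls = []


def fake_post(url, payload):
    """Fake transport: pretends the server only exposes the v1alpha route."""
    calls.append(url)
    if url.endswith("/v1alpha/convert/source"):
        return 200, {"ok": True}
    return 404, None


result = convert_via_server("http://docling.local/", {"sources": []}, fake_post)
```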
### Why this approach?
This keeps the change focused on one issue and one capability (external
Docling connectivity), without introducing unrelated provider-model
plumbing.
### Validation
- Static checks:
- `python -m py_compile` on changed Python files
- `python -m ruff check` on changed Python files
- Functional checks:
- Remote v1 endpoint path works
- v1alpha fallback works
- Local Docling path remains available when server URL is unset
### Related links
- Feature request: [Support external Docling server (issue
#13426)](https://github.com/infiniflow/ragflow/issues/13426)
- Compare view for this branch:
[main...feat/docling-server](https://github.com/infiniflow/ragflow/compare/main...spider-yamet:ragflow:feat/docling-server?expand=1)
##### Fixes [#13426](https://github.com/infiniflow/ragflow/issues/13426)
## Summary
Fix knowledge-base chat retrieval when no individual document IDs are
selected.
## Root Cause
`async_chat()` initialized `doc_ids` as an empty list when the request
did not explicitly select documents. That empty list was then forwarded
into retrieval as an active `doc_id` filter, effectively becoming
`doc_id IN []` and suppressing all chunk matches.
## Changes
- treat missing selected document IDs as `None` instead of `[]`
- keep explicit document filtering when IDs are actually provided
- add regression coverage for the shared chat retrieval path
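The difference between "no filter" and "filter on an empty list" can be shown with a toy retrieval function. The names below are illustrative and are not taken from `dialog_service.py`.

```python
def retrieve(chunks, doc_ids=None):
    """Toy retrieval: None means no filter, a list means an active filter."""
    if doc_ids is None:
        return list(chunks)
    return [chunk for chunk in chunks if chunk["doc_id"] in doc_ids]


def selected_doc_ids(request):
    """Treat a missing or empty selection as 'no filter' rather than []."""
    ids = request.get("doc_ids")
    return ids if ids else None


chunks = [{"doc_id": "d1", "text": "a"}, {"doc_id": "d2", "text": "b"}]

suppressed = retrieve(chunks, [])                       # old behaviour: matches nothing
unfiltered = retrieve(chunks, selected_doc_ids({}))     # fixed: no filter applied
explicit = retrieve(chunks, selected_doc_ids({"doc_ids": ["d1"]}))
```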
## Validation
- `python3 -m py_compile api/db/services/dialog_service.py
test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py`
- `.venv/bin/python -m pytest
test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py`
- manually verified that chat completions again inject retrieved
knowledge into the prompt
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
The Chunk class had a typo in the attribute name 'documnet_keyword',
which caused the document_name field to remain empty when retrieving
chunks via the SDK. This fix corrects the spelling to
'document_keyword'.
Changes:
- Line 36: Changed self.documnet_keyword to self.document_keyword
- Line 52: Updated backward compatibility code to use
self.document_keyword
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat(cli): Enhance CLI functionality and add administrator mode support
- Modify `parseActivateUser` in `parser.go` to support 'on'/'off' states
- Add administrator mode switching and host port settings functionality
to `cli.go`
- Implement user management API calls in `client.go`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
For EE
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
`./server_main -p 9380`
`./server_main -h`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Add delete all support for delete operations.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
In the ragflow CLI, use the Up/Down arrows to navigate command history.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Mark test cases as lower priority (p3) for:
- Creating chat assistants
- Deleting chat assistants
- Listing chat assistants
- Listing chunks within datasets
### Type of change
- [x] Update testcases
### What problem does this PR solve?
Standardize term capitalization in `deploy_local_llm.mdx` and improve
code block formatting.
### Type of change
- [x] Documentation Update
## Summary
- Convert bare `open()` calls to `with` context managers or
`Path.read_text()`
- File handles leak if not properly closed, especially on exceptions
- Fixes in crypt.py, sequence2txt_model.py, term_weight.py,
deepdoc/vision/__init__.py
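For illustration, the two replacement patterns look roughly like this. The helper names and the sample file are invented for the sketch and do not come from the files listed above.

```python
import os
import tempfile
from pathlib import Path


def read_with_context_manager(path):
    # The handle is closed when the block exits, even if read() raises.
    with open(path, encoding="utf-8") as f:
        return f.read()


def read_with_pathlib(path):
    # Path.read_text opens, reads, and closes the file in one call.
    return Path(path).read_text(encoding="utf-8")


# Usage: both helpers return the same content.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    Path(sample).write_text("hello", encoding="utf-8")
    via_with = read_with_context_manager(sample)
    via_path = read_with_pathlib(sample)
```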
## Test plan
- [x] File operations work correctly with context managers
- [x] Resources properly cleaned up on exceptions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
This PR implements comprehensive Arabic language support for the RAGFlow
application. The changes include:
- Complete Arabic translation of all UI text elements in the web
interface
- RTL (right-to-left) layout support for Arabic content
- Localization updates for all supported languages (ar, bg, de, en, es,
fr, id, it, ja, pt-br, ru, vi, zh-traditional, zh)
- UI component adjustments to properly display Arabic text and support
RTL layout
The implementation ensures that Arabic-speaking users can fully interact
with the application in their native language with proper text rendering
and layout direction.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
<img width="2866" height="1617" alt="image"
src="https://github.com/user-attachments/assets/f2751b34-1b65-4867-b81d-a1068c17b9b7"
/>
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Implement user creation, deletion, and permission management
functionality.
- Added the `ListByEmail` method to `user.go` to query users by email
address.
- Updated the user activation status handling logic in `handler.go`,
adding input validation.
- Added RSA password decryption functionality to `password.go`.
- Implemented complete user management functionality in `service.go`,
including user creation, deletion, password modification, activation
status, and permission management.
- Added input validation and error handling logic.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Change go server default port to 9382
2. Compatible with EE data model.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix https://github.com/infiniflow/ragflow/issues/13388
Call get_flatted_meta_by_kbs in dify retrieval. Remove get_meta_by_kbs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- scope normal document-list metadata lookups to the current page's
document IDs
- keep the `return_empty_metadata=True` path dataset-wide because it
needs full knowledge of docs that already have metadata
- add unit tests for both paged listing paths and the unchanged
empty-metadata behavior
## Why
`DocumentService.get_list()` and the normal `get_by_kb_id()` path were
calling `DocMetadataService.get_metadata_for_documents(None, kb_id)`,
which loads metadata for the entire dataset on every page request.
That becomes especially problematic on large datasets. The metadata scan
path paginates through the full metadata index without an explicit sort,
while the ES helper only switches to `search_after` beyond `10000`
results when a sort is present. In practice this can lead to unnecessary
full-dataset metadata work, slower document-list loading, and unreliable
`meta_fields` in list responses for large KBs.
This change keeps the existing empty-metadata filter behavior intact,
but scopes normal list responses to metadata for the current page only.
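A toy version of the scoping, assuming an in-memory metadata index. Both functions are simplified stand-ins written for this sketch; only the idea of passing the current page's document IDs instead of `None` comes from the description above.

```python
def get_metadata_for_documents(doc_ids, kb_id, index):
    """Toy lookup: doc_ids=None scans the whole dataset."""
    rows = index[kb_id]
    if doc_ids is None:
        return dict(rows)
    return {doc_id: rows[doc_id] for doc_id in doc_ids if doc_id in rows}


def list_page(docs, page, page_size, kb_id, index):
    """Attach metadata for the documents on the requested page only."""
    start = (page - 1) * page_size
    page_docs = docs[start:start + page_size]
    page_ids = [doc["id"] for doc in page_docs]
    meta = get_metadata_for_documents(page_ids, kb_id, index)
    return [{**doc, "meta_fields": meta.get(doc["id"], {})} for doc in page_docs]


index = {"kb1": {"d1": {"lang": "en"}, "d3": {"lang": "tr"}}}
docs = [{"id": "d1"}, {"id": "d2"}, {"id": "d3"}]
second_page = list_page(docs, page=2, page_size=2, kb_id="kb1", index=index)
```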
### What problem does this PR solve?
Use auth middleware to check authorization.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
This PR is the direct successor to the previous `docx` lazy-loading
implementation. It addresses the technical debt intentionally left out
in the last PR by fully migrating the `qa` and `manual` parsing
strategies to the new lazy-loading model.
Additionally, this PR comprehensively refactors the underlying `docx`
parsing pipeline to eliminate significant code redundancy and introduces
robust fallback mechanisms to handle completely corrupted image streams
safely.
## What's Changed
* **Centralized Abstraction (`docx_parser.py`)**: Moved the
`get_picture` extraction logic up to the `RAGFlowDocxParser` base class.
Previously, `naive`, `qa`, and `manual` parsers maintained separate,
redundant copies of this method. All downstream strategies now natively
gather raw blobs and return `LazyDocxImage` objects automatically.
* **Robust Corrupted Image Fallback (`docx_parser.py`)**: Handled edge
cases where `python-docx` encounters critically malformed magic headers.
Implemented an explicit `try-except` structure that safely intercepts
`UnrecognizedImageError` (and similar exceptions) and seamlessly falls
back to retrieving the raw binary via `getattr(related_part, "blob",
None)`, preventing parser crashes on damaged documents.
* **Legacy Code & Redundancy Purge**:
* Removed the duplicate `get_picture` methods from `naive.py`, `qa.py`,
and `manual.py`.
* Removed the standalone, immediate-decoding `concat_img` method in
`manual.py`. It has been completely replaced by the globally unified,
lazy-loading-compatible `rag.nlp.concat_img`.
* Cleaned up unused legacy imports (e.g., `PIL.Image`, docx exception
packages) across all updated strategy files.
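The fallback for corrupted images can be sketched as below. `UnrecognizedImageError` here is a local stand-in for the `python-docx` exception, and the part classes are fakes that exist only to make the sketch runnable; the `getattr(related_part, "blob", None)` call is the one quoted above.

```python
class UnrecognizedImageError(Exception):
    """Local stand-in for the exception python-docx raises on bad image headers."""


def read_image_blob(related_part):
    """Return decoded image bytes, or the raw part bytes if decoding fails."""
    try:
        return related_part.image.blob
    except UnrecognizedImageError:
        # Fall back to the raw binary so one damaged image does not crash parsing.
        return getattr(related_part, "blob", None)


class _Image:
    def __init__(self, blob):
        self.blob = blob


class GoodPart:
    blob = b"raw-good"
    image = _Image(b"decoded")


class BrokenPart:
    blob = b"raw-broken"

    @property
    def image(self):
        raise UnrecognizedImageError("malformed header")


good = read_image_blob(GoodPart())
broken = read_image_blob(BrokenPart())
```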
## Scope
To keep this PR focused, I have restricted these changes strictly to the
unification of `docx` extraction logic and the lazy-load migration of
`qa` and `manual`.
## Validation & Testing
I've tested this to ensure no regressions and validated the fallback
logic:
* **Output Consistency**: Compared identical `.docx` inputs using `qa`
and `manual` strategies before and after this branch: chunk counts,
extracted text, table HTML, and attached images match perfectly.
* **Memory Footprint Drop**: Confirmed a noticeable drop in peak memory
usage when processing image-dense documents through the `qa` and
`manual` pipelines, bringing them up to parity with the `naive`
strategy's performance gains.
## Breaking Changes
* None.
### What problem does this PR solve?
Feat: Add a user_id field to the message and retrieval operators.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Previously, when an Agent component was configured with structured
output, the non-streaming /agents/{agent_id}/completions API never
returned the structured field in its response.
The root cause: the non-streaming code path only collected message
events to build full_content, then returned the workflow_finished
payload — which only contains the output of the last component in the
execution path (typically a Message component).
Any structured output set by upstream components (e.g., Agent or LLM)
was silently discarded.
This PR fixes the non-streaming handler to iterate node_finished events
and collect structured output from intermediate components.
If any component produced a non-empty structured value, it is included
in the final response under data.structured. The streaming path is
unaffected, as it already exposes node_finished events to the caller.
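A minimal sketch of the collection step described above. The event shapes (`event`, `data`, `component_id`, `outputs`) and the helper name `build_final_response` are assumptions made for illustration, not the handler's actual schema.

```python
def build_final_response(events):
    """Fold a finished run's events into one non-streaming response payload.

    Assumed (hypothetical) event shape:
    {"event": <name>, "data": {"component_id": ..., "outputs": {...}, "content": ...}}
    """
    full_content = ""
    structured = {}
    final = {}
    for ev in events:
        kind = ev.get("event")
        data = ev.get("data") or {}
        if kind == "message":
            # Message events carry the text that makes up the answer.
            full_content += data.get("content", "")
        elif kind == "node_finished":
            # Intermediate components (e.g. Agent or LLM) may carry structured output.
            value = (data.get("outputs") or {}).get("structured")
            if value:
                structured[data.get("component_id")] = value
        elif kind == "workflow_finished":
            final = dict(data)
    final["content"] = full_content
    if structured:
        # Only attach the field when some component produced a non-empty value.
        final["structured"] = structured
    return final
```

Keying the collected values by component ID is one possible design choice; it keeps outputs from several upstream components distinguishable.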
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Support retrieving a dataset's aggregated parsing status via the API
Issue: #12810
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: heyang.why <heyang.why@alibaba-inc.com>
### What problem does this PR solve?
bin directory cannot be copied to docker image introduced by
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat(admin): Implemented default administrator initialization and login
functionality.
Added support for default administrator configuration, including super
user nickname, email, and password.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The number of deleted session prompts is displayed incorrectly.
#13499
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes #6004 #7142 #11959
Unlike #9207 we actually normalize the coordinates here
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display release status in agent version history.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Avoid getting doc in function delete_document_metadata as the doc might
have been removed.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: chats_openai in non-streaming mode #13453
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix https://github.com/infiniflow/ragflow/issues/13388
The following command returns an empty result even when a document with matching metadata exists:
```
curl --request POST \
--url http://localhost:9222/api/v1/retrieval \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ragflow-fO3mPFePfLgUYg8-9gjBVVXbvHqrvMPLGaW0P86PvAk' \
--data '{
"question": "any question",
"dataset_ids": ["9bb4f0591b8811f18a4a84ba59049aa3"],
"metadata_condition": {
"logic": "and",
"conditions": [
{
"name": "character",
"comparison_operator": "is",
"value": "刘备"
}
]
}
}'
```
When `metadata_condition` is specified in the retrieval API, it is
converted to `doc_ids`, and `doc_ids` is passed to the retrieval function.
In the retrieval function, when `doc_ids` is explicitly provided, we should
bypass the threshold.
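The idea can be sketched as follows. The function name `filter_chunks` and the chunk fields `doc_id` and `similarity` are assumptions for illustration; the real retrieval function has a different signature.

```python
def filter_chunks(chunks, similarity_threshold, doc_ids=None):
    """Apply the similarity threshold unless the caller pinned explicit doc_ids."""
    if doc_ids:
        # The caller already narrowed the search (e.g. via a metadata condition),
        # so keep every chunk from those documents regardless of its score.
        allowed = set(doc_ids)
        return [c for c in chunks if c["doc_id"] in allowed]
    return [c for c in chunks if c["similarity"] >= similarity_threshold]
```

Without the bypass, a low-scoring chunk from a document that matched the metadata condition would be dropped, which is how the request above could come back empty.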
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Problem
When PDF fonts lack ToUnicode/CMap mappings, pdfplumber (pdfminer)
cannot map CIDs to correct Unicode characters, outputting PUA characters
(U+E000~U+F8FF) or `(cid:xxx)` placeholders. The original code fully
trusted pdfplumber text without any garbled detection, causing garbled
output in the final parsed result.
Relates to #13366
## Solution
### 1. Garbled text detection functions
- `_is_garbled_char(ch)`: Detects PUA characters (BMP/Plane 15/16),
replacement character U+FFFD, control characters, and
unassigned/surrogate codepoints
- `_is_garbled_text(text, threshold)`: Calculates garbled ratio and
detects `(cid:xxx)` patterns
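A rough sketch of what these two checks might look like, using only the standard library. The names are written without the leading underscore to mark them as illustrative. Exempting tab and newline from the control-character check and ignoring whitespace when computing the ratio are assumptions, not confirmed details of the PR.

```python
import re
import unicodedata

# pdfminer-style placeholders such as "(cid:123)".
CID_PATTERN = re.compile(r"\(cid:\d+\)")


def is_garbled_char(ch: str) -> bool:
    """Return True for characters that usually signal a failed CID-to-Unicode mapping."""
    cp = ord(ch)
    if 0xE000 <= cp <= 0xF8FF:  # BMP Private Use Area
        return True
    if 0xF0000 <= cp <= 0xFFFFD or 0x100000 <= cp <= 0x10FFFD:  # Plane 15/16 PUA
        return True
    if cp == 0xFFFD:  # replacement character
        return True
    category = unicodedata.category(ch)
    if category == "Cc" and ch not in "\t\n\r":  # control chars, except common whitespace
        return True
    return category in ("Cn", "Cs")  # unassigned or surrogate code points


def is_garbled_text(text: str, threshold: float = 0.5) -> bool:
    """Return True if text contains a CID placeholder or too many garbled characters."""
    if not text or not text.strip():
        return False
    if CID_PATTERN.search(text):
        return True
    chars = [c for c in text if not c.isspace()]
    garbled = sum(1 for c in chars if is_garbled_char(c))
    return garbled / len(chars) >= threshold
```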
### 2. Box-level fallback (in `__ocr()`)
When a text box has ≥50% garbled characters, discard the pdfplumber text and
fall back to OCR recognition.
### 3. Page-level detection (in `__images__()`)
Sample characters from each page; if garbled rate ≥30%, clear all
pdfplumber characters for that page, forcing full OCR.
### 4. Layout recognizer CID filtering
Filter out `(cid:xxx)` patterns in `layout_recognizer.py` text
processing to prevent them from polluting layout analysis.
## Testing
- 29 unit tests covering: normal CJK/English text, PUA characters, CID
patterns, mixed text, boundary thresholds, edge cases
- All 85 existing project unit tests pass without regression
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
refactor: Moves the LLM factory initialization logic to the `dao`
package.
Removes the `init_data` package and integrates the LLM factory
initialization functionality into the `dao` package.
Adds a `utility` package to provide general utility functions.
Updates `server_main.go` to use the new initialization path.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## Problem
The `ragflow-cli` PyPI package (v0.24.0) is missing `http_client.py`,
`ragflow_client.py`, and `user.py`, causing import errors when installed
from PyPI.
## Root Cause
`pyproject.toml` only lists `ragflow_cli` and `parser` in
`[tool.setuptools] py-modules`.
## Fix
Add the three missing modules to `py-modules`.
Fixes #13456
Co-authored-by: atian8179 <atian8179@users.noreply.github.com>
### What problem does this PR solve?
1. Prevent standard users from accessing the admin service
2. Get RAGFlow service status
3. Fix minio status fetching
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. The RAGFlow server sends a heartbeat periodically.
2. This PR includes:
- Scheduled task
- API server message sending
- Admin server API to receive the message.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
feat: Added LLM factory initialization functionality and knowledge base
related API interfaces
refactor(dao): Refactored the TenantLLMDAO query method
feat(handler): Implemented knowledge base related API endpoints
feat(service): Added LLM API key setting functionality
feat(model): Extended the knowledge base model definition
feat(config): Added default user LLM configuration
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this commit solve?
This commit introduces a new API endpoint
`/datasets/<dataset_id>/documents/<document_id>/chunks/switch` that
allows users to switch the availability status of specified chunks in a
document, in the same way as chunk_app.py
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses security vulnerabilities in PDF processing
dependencies identified by Trivy security scan:
1. CVE-2026-28804 (MEDIUM): pypdf 6.7.4 vulnerable to inefficient
decoding of ASCIIHexDecode streams
2. CVE-2023-36464 (MEDIUM): pypdf2 3.0.1 susceptible to infinite loop
when parsing malformed comments
Since pypdf2 is deprecated with no available fixes, this PR migrates all
pypdf2 usage to the actively maintained pypdf library (version 6.7.5),
which resolves both vulnerabilities.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
This PR fixes two runtime bugs in agent components:
**Bug 1: `agent/component/invoke.py` — `NameError` in POST +
`clean_html` path**
The POST method's `clean_html` branch uses the variable `sections`
without ever defining it. Both the GET and PUT branches correctly call
`sections = HtmlParser()(None, response.content)` before referencing
`sections`, but this line was missing from the POST branch (copy-paste
omission). This causes a `NameError` whenever a user configures an
Invoke component with `method="post"` and `clean_html=True`.
**Bug 2: `agent/component/data_operations.py` — `AttributeError` in
`_recursive_eval`**
The `_recursive_eval` method recursively calls `self.recursive_eval()`
(without the leading underscore) instead of `self._recursive_eval()`.
Since the method is defined as `_recursive_eval`, this causes an
`AttributeError` at runtime when the `literal_eval` operation processes
nested dicts or lists.
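To illustrate Bug 2, here is a simplified stand-in for the recursive helper. The class name `DataOps` and the fallback of returning unparseable strings unchanged are assumptions; only the self-call through the underscore-prefixed name mirrors the fix described above.

```python
import ast


class DataOps:
    """Hypothetical, stripped-down stand-in for the DataOperations component."""

    def _recursive_eval(self, value):
        if isinstance(value, dict):
            # The recursion must go through the same (underscore-prefixed) method name.
            return {k: self._recursive_eval(v) for k, v in value.items()}
        if isinstance(value, list):
            return [self._recursive_eval(v) for v in value]
        if isinstance(value, str):
            try:
                return ast.literal_eval(value)
            except (ValueError, SyntaxError):
                # Not a Python literal: keep the original string.
                return value
        return value
```

Calling `self.recursive_eval(...)` inside this class would raise `AttributeError` as soon as a nested dict or list is reached, because no attribute with that name exists.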
## Test plan
- [ ] Configure an Invoke node with `method=post` and `clean_html=True`,
verify HTML is parsed correctly without `NameError`
- [ ] Configure a DataOperations node with `operations=literal_eval` on
nested data, verify no `AttributeError`
---------
Signed-off-by: JiangNan <1394485448@qq.com>
### What problem does this PR solve?
Add APIs to admin server.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
Fix a database connection and cursor resource leak in the ExeSQL agent
tool.
When SQL execution raises an exception (for example, a syntax error or a
missing table), the existing code path skips `cursor.close()` and
`db.close()`, causing database connections to accumulate over time.
This can eventually lead to connection exhaustion in long-running agent
workflows.
## Root Cause
The cleanup logic for database cursors and connections is placed after
the SQL execution loop without `try/finally` protection. If an exception
occurs during `cursor.execute()`, `fetchmany()`, or result processing,
the cleanup code is not reached and the connection remains open.
The same issue also exists in the IBM DB2 execution path, where
`ibm_db.close(conn)` may be skipped when exceptions occur.
## Fix
- Wrap SQL execution logic in `try/finally` blocks to guarantee resource
cleanup.
- Ensure `cursor.close()` and `db.close()` are always executed.
- Add explicit `db.close()` when `db.cursor()` creation fails.
- Remove redundant close calls in early-return branches since `finally`
now handles cleanup.
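The pattern can be sketched with the standard-library `sqlite3` driver standing in for the databases ExeSQL actually targets. The function `run_sql`, its `connect` factory parameter, and the fetch size are illustrative assumptions.

```python
import sqlite3


def run_sql(connect, statements, max_rows=100):
    """Run statements on a fresh connection and always release cursor and connection.

    `connect` is any zero-argument callable returning a DB-API connection.
    """
    db = connect()
    try:
        cursor = db.cursor()
    except Exception:
        # Cursor creation failed: close the connection explicitly before re-raising.
        db.close()
        raise
    results = []
    try:
        for sql in statements:
            cursor.execute(sql)
            results.append(cursor.fetchmany(max_rows))
        return results
    finally:
        # Reached on success, on early return, and on any exception.
        cursor.close()
        db.close()
```

A call such as `run_sql(lambda: sqlite3.connect(":memory:"), ["SELECT * FROM missing_table"])` still raises `sqlite3.OperationalError`, but the connection is closed before the exception propagates.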
## Impact
- No change to normal execution behavior.
- Ensures database resources are always released when errors occur.
- Prevents connection leaks in long-running workflows.
- Only affects `agent/tools/exesql.py`.
## Testing
Manual test scenarios:
1. Valid SQL execution
2. SQL syntax error
3. Query against a non-existing table
4. Execution cancellation during query
In all scenarios the database cursor and connection are properly closed.
Code quality checks:
- `ruff check` passed
- No new warnings introduced
### What problem does this PR solve?
Add DingTalk AI Table connector and integration for data synchronization
Issue #13400
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: wangheyang <wangheyang@corp.netease.com>
### What problem does this PR solve?
- Adjust UI styles in **Dataset** pages.
- Adjust several shared components styles
- Modify files and directory structure in `src/layouts`
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Changed test priority markers from p1/p2 to p3 in three test files:
- test_table_parser_dataset_chat.py: Adjusted priority for table parser
dataset chat test
- test_delete_chunks.py: Updated priority for chunk deletion test with
invalid IDs
- test_retrieval_chunks.py: Modified priority for chunks retrieval
pagination test
These changes demote the priority of specific test cases to p3,
indicating they are lower priority tests that can run later in the test
suite execution.
### Type of change
- [x] Test update
### What problem does this PR solve?
Feat: Add PublishConfirmDialog
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Since the database model was updated in the Python version, the Go server
also needs to be updated.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Follow-up to #12488 #13386
### What problem does this PR solve?
Previously, token authentication failures returned HTTP 200 with an
error code in the response body.
This PR updates `token_required` to raise `Unauthorized` and relies on
the global error handler to return a structured JSON response with HTTP
401 status.
The response body structure (`code`, `message`, `data`) remains
unchanged to preserve compatibility with the official SDK.
Frontend logic has been updated to handle HTTP 401 responses in addition
to checking `data.code`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Empty ids means no-op operation.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
## Summary
- Revert aliyun registry from
`infiniflow-registry.cn-shanghai.cr.aliyuncs.com` back to
`registry.cn-hangzhou.aliyuncs.com`
## Test plan
- [ ] Verify the docker/.env file contains the correct registry URL
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
Fix: paddle ocr missing outlines #13422
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
To copy infinity/resource into docker images
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Implement user registration logic in Go
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add aggregation_utils.aggregate_by_field for pure aggregation logic
- Wire OBConnection.get_aggregation to use it (unwrap tuple, pass
messages)
- Add unit tests for aggregate_by_field (no DB/heavy deps)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Closes: #12889
### What problem does this PR solve?
When syncing external data sources (e.g., Jira, Confluence, Google
Drive), updated documents were not being re-chunked. The raw content was
correctly updated in blob storage, but the vector database retained
stale chunks, causing search results to return outdated information.
**Root cause:** The task digest used for chunk reuse optimization was
calculated only from parser configuration fields (`parser_id`,
`parser_config`, `kb_id`, etc.), without any content-dependent fields.
When a document's content changed but the parser configuration remained
the same, the system incorrectly reused old chunks instead of
regenerating new ones.
**Example scenario:**
1. User syncs a Jira issue: "Meeting scheduled for Monday"
2. User updates the Jira issue to: "Meeting rescheduled to Friday"
3. User triggers sync again
4. Raw content panel shows updated text ✓
5. Chunk panel still shows old text "Monday" ✗
**Solution:**
1. Include `update_time` and `size` in the chunking config, so the task
digest changes when document content is updated
2. Track updated documents separately in `upload_document()` and return
them for processing
3. Process updated documents through the re-parsing pipeline to
regenerate chunks
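Why adding `update_time` and `size` changes the digest can be shown with a small sketch. The helper names and the use of SHA-256 over a JSON dump are assumptions chosen for a self-contained example; the project's actual hashing scheme may differ.

```python
import hashlib
import json


def build_chunking_config(doc: dict) -> dict:
    """Collect the fields that feed the task digest (field set simplified)."""
    return {
        "parser_id": doc["parser_id"],
        "parser_config": doc["parser_config"],
        "kb_id": doc["kb_id"],
        # Content-dependent fields: these change when the source document is re-synced.
        "update_time": doc["update_time"],
        "size": doc["size"],
    }


def task_digest(chunking_config: dict) -> str:
    """Hash the config deterministically; equal digests mean old chunks can be reused."""
    payload = json.dumps(chunking_config, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With only the parser fields in the config, the "Monday" and "Friday" versions of the Jira issue would hash to the same digest and the stale chunks would be reused.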
[1.webm](https://github.com/user-attachments/assets/d21d4dcd-e189-4d39-8700-053bae0ca5a0)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix update_cnt add error in init_data.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Optimize the style of the chat page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR extends the list of supported providers. It adds a new provider,
"RAGcon", within the Ollama modal. It provides all model types except OCR
via OpenAI-compatible endpoints.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Jakob <16180662+hauberj@users.noreply.github.com>
### What problem does this PR solve?
This PR remediates CVE-2024-47081 (MEDIUM severity) in the agent/sandbox
component by upgrading the requests library from version 2.32.3 to
2.32.5. The vulnerability allows .netrc credentials to leak via
malicious URLs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR remediates three HIGH severity vulnerabilities in urllib3
affecting the admin client and Python SDK:
- **CVE-2025-66418**: Unbounded decompression chain leads to resource
exhaustion
- **CVE-2025-66471**: Streaming API improperly handles highly compressed
data
- **CVE-2026-21441**: Decompression-bomb safeguard bypass when following
HTTP redirects
Trivy security scan identified urllib3 v2.5.0 as vulnerable in both
`admin/client/uv.lock` and `sdk/python/uv.lock`. This PR updates urllib3
to v2.6.3 to eliminate these security risks.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Renovate global navigation bar, align styles to the design.
(May cause minor layout issues in sub-pages; these will be checked and fixed soon.)
### Type of change
- [x] Refactoring
Add checksum annotation for values in ragflow.yaml
### What problem does this PR solve?
This PR addresses ticket #13408.
The RAGFlow Helm chart does not include Values.yaml in the list of
watched changes.
If you update Values.yaml for an existing deployment, Helm will not
detect the change and will not update the deployment.
This PR fixes that.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add new Playwright tests for chat with multiple models and other chat
parameters.
### Type of change
- [x] Other (please describe): new test/ data-testid
### What problem does this PR solve?
Alibaba Cloud OSS config issue #13390.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: UI Placeholder and Hint Optimization
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat: Adds the tenant model ID field to the interface definition
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add id for table tenant_llm and apply in LLMBundle.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Feat: published agent version control
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR remediates CVE-2026-25639, a HIGH severity Denial of Service
vulnerability in axios caused by __proto__ pollution in the mergeConfig
function. The vulnerability affects both the web frontend and the
sandbox nodejs environment.
Trivy security scan identified axios versions below 1.13.5 as
vulnerable. This PR updates axios to secure versions (1.13.6 in web,
1.13.5 in sandbox) to eliminate the security risk.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Improve model verification UX. #13395
### Type of change
- [x] Refactoring
---------
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
1. init go admin server
2. refactor api server router
3. add benchmark CI to 450s time limit
4. remove docker builder container after building
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Enhance chunk management by adding support for the 'available', 'tag_kwd',
and 'tag_feas' fields in the list, add, and update chunk functions, just
like chunk_app.py. This improves data handling and flexibility in chunk
processing.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR aims to:
1. Enable file uploads for the public API, similarly to what
/document/upload_info accomplishes for the frontend;
2. Enable files sent to the /chat/:chat_id/completions endpoint to be
used within the conversation.
We classify the first item as a new feature, while classifying the second
one as a bug fix.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
*The work related to this PR was co-authored by*
[Bruno Ferreira](https://github.com/brunopferreira): Custom Solutions
Manager @ [Orbcom](https://orbcom.pt/)
[Pedro Ferreira](https://github.com/sirj0k3r): Lead Software Developer @
[Orbcom](https://orbcom.pt/)
[Pedro Cardoso](https://github.com/pedromiguel4560): Associate Software
Developer @ [Orbcom](https://orbcom.pt/)
*This PR replaces #13248*
---------
Co-authored-by: Pedro Cardoso <pedrocardoso@orbcom.pt>
Co-authored-by: Pedro Ferreira <pedroferreira@orbcom.pt>
### What problem does this PR solve?
When multiple columns are used as content columns in RDBMS connector,
the generated document text gets chunked by TxtParser which strips
newline delimiters during merge. This causes field names and values from
different columns to be concatenated without any separator, making the
content unreadable.
Changes:
- txt_parser.py: restore newline separator when merging adjacent text
segments within a chunk, so that split sections are not directly
concatenated
- rdbms_connector.py: use double newline between fields and place field
value on a new line after the field name bracket, giving TxtParser
clearer boundaries to work with
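A simplified sketch of both changes. The real TxtParser budgets chunks by tokens rather than characters, and the `[field]` bracket style is an assumption, so treat `format_row` and `merge_segments` as hypothetical helpers.

```python
def format_row(row: dict) -> str:
    """Render each column as "[name]" with its value on the next line, fields split by a blank line."""
    return "\n\n".join(f"[{name}]\n{value}" for name, value in row.items())


def merge_segments(segments, max_chars):
    """Merge adjacent segments into chunks, keeping a newline between merged pieces."""
    chunks = []
    current = ""
    for seg in segments:
        if not current:
            current = seg
        elif len(current) + 1 + len(seg) <= max_chars:
            # Restore the separator that splitting removed, so fields do not run together.
            current = current + "\n" + seg
        else:
            chunks.append(current)
            current = seg
    if current:
        chunks.append(current)
    return chunks
```

Joining with an empty string instead of `"\n"` reproduces the reported symptom: `[name]Alice[city]Paris`.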
Closes #13001
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: tunsuytang <tunsuytang@tencent.com>
### What problem does this PR solve?
Feat: Write the row and column numbers into the element's data attribute
for easy code location.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Problem: When searching for a specific company name such as "Daofeng
Technology", the search would incorrectly return unrelated resumes
containing generic terms such as "Technology" in their company names.
Root Cause: The `corporation_name_tks` field was included in the
identity fields that are redundantly written to every chunk. This caused
common words like "科技" to match across all chunks, leading to
over-retrieval of irrelevant resumes.
Solution: Remove `corporation_name_tks` from the `_IDENTITY_FIELDS`
list. Company information is still preserved in the "Work Overview"
chunk where it belongs, allowing proper company-based searches while
preventing false positives from generic terms.
---------
Co-authored-by: Aron.Yao <yaowei@192.168.1.68>
Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
Co-authored-by: Liu An <asiro@qq.com>
# RAGFlow Go Implementation Plan 🚀
This repository tracks the progress of porting RAGFlow to Go. We'll
implement core features and provide performance comparisons between
Python and Go versions.
## Implementation Checklist
- [x] User Management APIs
- [x] Dataset Management Operations
- [x] Retrieval Test
- [x] Chat Management Operations
- [x] Infinity Go SDK
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
This PR adds new tests for the full configuration tab in datasets.
### Type of change
- [x] Other (please describe): new tests
### What problem does this PR solve?
CI fails on Elasticsearch because of the benchmark.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Accelerate python module downloading
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Guard embedding_model change when dataset has existing chunks. API must
return code 102 with message 'When chunk_num (N) > 0, embedding_model
must remain <current_model>' to prevent silent embedding drift.
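The guarded behavior might look roughly like this. The function name, the dataset dict shape, and the success code `0` are assumptions; only the error code 102 and the message format come from the description above.

```python
def check_embedding_model_update(dataset: dict, new_model: str) -> dict:
    """Reject an embedding_model change once the dataset already holds chunks."""
    chunk_num = dataset["chunk_num"]
    current = dataset["embedding_model"]
    if chunk_num > 0 and new_model != current:
        return {
            "code": 102,
            "message": f"When chunk_num ({chunk_num}) > 0, embedding_model must remain {current}",
        }
    return {"code": 0, "message": ""}
```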
### Type of change
- [x] Add Testcases
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
The benchmark always fails on the new CI machine. Please re-enable it after
the issue is fixed.
### Type of change
- [x] Other (please describe): disable benchmark
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
## Summary
Dify’s external retrieval expects `records[].metadata.document_id` to
be a non-empty string.
RAGFlow currently only sets `metadata.doc_id`, which causes Dify
validation to fail.
This PR adds `metadata.document_id` (mapped from `doc_id`) in the
Dify-compatible retrieval response.
## Changes
- Add `meta["document_id"] = c["doc_id"]` in
`api/apps/sdk/dify_retrieval.py`
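For context, a sketch of how a chunk could be mapped to a Dify-style record with both keys present. The function `to_dify_record` and the input chunk fields are illustrative assumptions; only the `document_id` mapping reflects the change itself.

```python
def to_dify_record(chunk: dict) -> dict:
    """Map an internal chunk (hypothetical shape) to a Dify external-retrieval record."""
    meta = dict(chunk.get("metadata") or {})
    meta["doc_id"] = chunk["doc_id"]
    # Dify validates this key as a non-empty string, so mirror doc_id into it.
    meta["document_id"] = chunk["doc_id"]
    return {
        "content": chunk["content"],
        "score": chunk["score"],
        "title": chunk.get("title", ""),
        "metadata": meta,
    }
```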
## Testing
- Not run (logic-only change).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Use Redis to store the secret key.
2. During startup, the API server reads the secret key from Redis. If no such
key exists, it generates one and stores it in Redis atomically.
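One common way to get this atomicity is a set-if-absent write (Redis `SET ... NX`). The sketch below assumes that approach; the key name and the `InMemoryStore` stand-in are hypothetical, and the stand-in only mimics the `get` and `set(..., nx=True)` calls of a redis-py client so the example runs without a server.

```python
import secrets


class InMemoryStore:
    """Stand-in mimicking the subset of a redis-py client used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, nx=False):
        if nx and key in self._data:
            return None  # redis-py also returns None when NX prevents the write
        self._data[key] = value
        return True


def load_or_create_secret(store, key="app:secret_key"):
    """Return the shared secret, creating it at most once across all servers."""
    existing = store.get(key)
    if existing is not None:
        return existing
    candidate = secrets.token_hex(32)
    # Only the first writer succeeds; everyone else reads the winner's value.
    if store.set(key, candidate, nx=True):
        return candidate
    return store.get(key)
```

With a real redis-py client, `get` returns bytes unless the client is created with `decode_responses=True`, so the two return paths would need to be normalized to one type.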
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The dropdown menu for large models does not automatically focus on
the search box. #13313
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Correct PDF chunking parameter name in naive #13325
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Change the background color of the message notification button.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Issue: #12756
### What problem does this PR solve?
When users upload files through Agent's Begin or Await Response
components, the parsing is hardcoded to "Plain Text", ignoring all other
available parsers (DeepDOC, TCADP, Docling, MinerU, PaddleOCR). This PR
adds a PDF parser dropdown to these components so users can select the
appropriate parser for their file inputs.
### Changes
**Backend**
- `agent/component/fillup.py` - Added `layout_recognize` param to
`UserFillUpParam`, forwarded to `FileService.get_files()`
- `agent/component/begin.py` - Same forwarding in `Begin._invoke()`
- `agent/canvas.py` - Extract Begin's `layout_recognize` for `sys.files`
parsing, added param to `get_files_async()` / `get_files()`
- `api/db/services/file_service.py` - Added `layout_recognize` param to
`parse()` and `get_files()`, replacing hardcoded `"Plain Text"`
- `rag/app/naive.py` - Added `"plain text"` and `"tcadp parser"` aliases
to PARSERS dict to match dropdown values after `.lower()`
**Frontend**
- `web/src/pages/agent/form/begin-form/index.tsx` - Show
`LayoutRecognizeFormField` dropdown when file inputs exist
- `web/src/pages/agent/form/begin-form/schema.ts` - Added
`layout_recognize` to Zod schema
- `web/src/pages/agent/form/user-fill-up-form/index.tsx` - Same dropdown
for Await Response component
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Move test files from utils/ to their corresponding functional
directories:
- api/db/ for database related tests
- api/utils/ for API utility tests
- rag/utils/ for RAG utility tests
### Type of change
- [x] Refactoring
Chinese text remained in generated code comments, log messages, field
descriptions, and documentation files under `agent/sandbox/`.
### Changes
- **`tests/MIGRATION_GUIDE.md`** — Full EN translation (migration guide
from OpenSandbox → Code Interpreter)
- **`tests/QUICKSTART.md`** — Full EN translation (quick test guide for
Aliyun sandbox provider)
- **`providers/aliyun_codeinterpreter.py`** — Removed `(主账号ID)` from
docstring, error log, and config field description
- **`sandbox_spec.md`** — Removed `(主账号ID)` from `account_id` field
description
- **`tests/test_aliyun_codeinterpreter_integration.py`** — Removed
`(主账号ID)` from inline comment
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: yuzhichang <153784+yuzhichang@users.noreply.github.com>
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
For Helm deployments, there is also a requirement to enable the Admin
Service for administrative operations.
This PR exposes the ability to enable or disable it through Helm
configuration.
When it is enabled (the default),
<img width="486" height="190" alt="image"
src="https://github.com/user-attachments/assets/4db0dc3d-bd94-4ad9-bb5d-a240aac5e1c5"
/>
Admin access and operations are available, as shown below:
<img width="2530" height="876" alt="image"
src="https://github.com/user-attachments/assets/3e948e1b-7522-4f8d-8dc0-c80a22242022"
/>
Features such as user management are very important for RAGFlow users
and owners to control their clients.
### What problem does this PR solve?
Playwright tests previously depended on cross-file execution order
(`auth -> provider -> dataset -> chat`).
This change makes setup explicit and idempotent via fixtures so tests
can run independently.
- Added/standardized prerequisite fixtures in
`test/playwright/conftest.py`:
- `ensure_auth_context`, `ensure_model_provider_configured`,
`ensure_dataset_ready`, `ensure_chat_ready`
- Made provisioning reusable/idempotent with `RUN_ID`-based resource
naming.
- Synced auth envs (`E2E_ADMIN_EMAIL`, `E2E_ADMIN_PASSWORD`) into seeded
creds.
- Fixed provider cache freshness (`auth_header`/`page` refresh on cache
hit).
Also included minimal stability fixes:
- dataset create stale-element click handling,
- search wait logic for results/empty-state,
- agent create-menu handling,
- agent run-step retry when the run UI doesn’t open on the first click.
### Type of change
- [x] Test fix
- [x] Refactoring
---------
Co-authored-by: Liu An <asiro@qq.com>
Cross-verify project experience and work experience, and remove
duplicate text
---------
Co-authored-by: Aron.Yao <yaowei@192.168.1.68>
Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
### What problem does this PR solve?
Fix: The document generation node cannot generate the output content of
a large model to a file. #13321
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix AttributeError when calling llm.chat() in resume parser. LLMBundle
only has async_chat method, not chat method. Use `_run_coroutine_sync`
wrapper to call async_chat synchronously.
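A generic sketch of such a sync-over-async wrapper. This is not the project's `_run_coroutine_sync` implementation; the name `run_coroutine_sync` and the thread-based fallback are assumptions showing one way to do it.

```python
import asyncio
import threading


def run_coroutine_sync(coro):
    """Run a coroutine to completion from synchronous code and return its result."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running in this thread: run the coroutine directly.
        return asyncio.run(coro)

    # A loop is already running here, so run the coroutine on a helper thread
    # with its own loop and wait for it to finish.
    outcome = {}

    def runner():
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:
            outcome["error"] = exc

    thread = threading.Thread(target=runner)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

A caller would then write something like `run_coroutine_sync(llm.async_chat(...))` instead of the missing `llm.chat(...)`. Note that the thread fallback blocks the calling loop until the coroutine finishes.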
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Potential fix for
[https://github.com/infiniflow/ragflow/security/code-scanning/71](https://github.com/infiniflow/ragflow/security/code-scanning/71)
In general, instead of using `String.prototype.includes` on the entire
URL string, parse the URL and make decisions based on its `host` (or
`hostname`) field. This avoids cases where the trusted domain appears in
the path, query, or as part of a different hostname.
Here, `payload.source_fid` is set to `'siliconflow_intl'` if
`postBody.base_url` “contains” `api.siliconflow.com`. To keep behavior
for correct inputs but close the hole, we should:
1. Safely parse `postBody.base_url` using the standard `URL` class.
2. Extract the hostname (`url.hostname`).
3. Compare it appropriately:
- If we only want the exact host `api.siliconflow.com`, use strict
equality.
- If international endpoints may include subdomains like
`foo.api.siliconflow.com`, allow those via suffix check on the hostname.
4. Fall back to `LLMFactory.SILICONFLOW` if parsing fails or the host
does not match.
Concretely, in `web/src/pages/user-setting/setting-model/hooks.tsx`, in
the `onApiKeySavingOk` callback where `payload.source_fid` is set,
replace the `toLowerCase().includes('api.siliconflow.com')` logic with a
small block that:
- Initializes a local `let sourceFid = LLMFactory.SILICONFLOW;`
- If `postBody.base_url` is present, attempts `new
URL(postBody.base_url)` inside a `try/catch`, lowercases `url.hostname`,
and checks whether it equals `api.siliconflow.com` or ends with
`.api.siliconflow.com`.
- Assigns `payload.source_fid = sourceFid`.
No new external dependencies are required; `URL` is available in modern
browsers and Node, and TypeScript understands it.
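The hostname-based check described above can be sketched as follows. This is a Python rendering for illustration only; the actual change is in TypeScript and relies on the `URL` class. The function name `resolve_source_fid` and the string constants standing in for the `LLMFactory` values are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical stand-ins for the factory identifiers used in the front end.
SILICONFLOW = "siliconflow"
SILICONFLOW_INTL = "siliconflow_intl"
TRUSTED_HOST = "api.siliconflow.com"


def resolve_source_fid(base_url):
    """Pick the factory id from the URL's hostname, never from the raw string."""
    if not base_url:
        return SILICONFLOW
    try:
        hostname = (urlsplit(base_url).hostname or "").lower()
    except ValueError:
        # Malformed URLs fall back to the default factory.
        return SILICONFLOW
    # Accept the exact host or any subdomain of it.
    if hostname == TRUSTED_HOST or hostname.endswith("." + TRUSTED_HOST):
        return SILICONFLOW_INTL
    return SILICONFLOW
```

A URL that only mentions the trusted domain in its path, or that uses it as a prefix of a different hostname, resolves to the default factory.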
_Suggested fixes powered by Copilot Autofix. Review carefully before
merging._
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
### What problem does this PR solve?
This PR adds end-to-end Arabic support in production. It also adds a
full Arabic README.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Core optimizations (refer to arXiv:2510.09722):
1. PDF text fusion: Metadata + OCR dual-path extraction and fusion
2. Page-aware reconstruction: YOLOv10 page segmentation + hierarchical
sorting + line number indexing
3. Parallel task decomposition: Basic information/work
experience/educational background three-way parallel LLM extraction
4. Index pointer mechanism: The LLM returns a range of line numbers
instead of generating the full text, which reduces hallucination in the
extracted text.
---------
Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
Co-authored-by: Aron.Yao <yaowei@192.168.1.68>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Modify the style of the classification operator and fix some
console errors.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: add more models for siliconflow and tongyi-qwen
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
When using MinerU, docling, TCADP, or paddleocr as the PDF parser with
the General (naive) chunk method, the user-configured `chunk_token_num`
is **unconditionally overwritten to 0** at
[rag/app/naive.py#L858-L859](https://github.com/infiniflow/ragflow/blob/main/rag/app/naive.py#L858-L859),
effectively disabling chunk merging regardless of what the user sets in
the UI.
### Problem
A user sets `chunk_token_num = 2048` in the dataset configuration UI,
expecting small parser blocks to be merged into larger chunks. However,
this line:
```python
if name in ["tcadp", "docling", "mineru", "paddleocr"]:
    parser_config["chunk_token_num"] = 0
```
silently overrides the user's setting. As a result, every MinerU output
block becomes its own chunk. For short documents (e.g. a 3-page PDF fund
factsheet parsed by MinerU), this produces **47 tiny chunks** — some as
small as 11 characters (`"July 2025"`) or 15 characters (`"CIES
Eligible"`).
This severely degrades retrieval quality: vector embeddings of such
short fragments have minimal semantic value, and keyword search produces
excessive noise.
### Fix
Only apply the `chunk_token_num = 0` override when the user has **not**
explicitly configured a positive value:
```python
if name in ["tcadp", "docling", "mineru", "paddleocr"]:
    if int(parser_config.get("chunk_token_num", 0)) <= 0:
        parser_config["chunk_token_num"] = 0
```
This preserves the original default behavior (no merging) while
respecting the user's explicit configuration.
### Before / After (MinerU, 3-page PDF, chunk_token_num=2048)
| | Before | After |
|---|---|---|
| Chunks produced | 47 | ~8 (merged by token limit) |
| Smallest chunk | 11 chars | ~500 chars |
| User setting respected | No | Yes |
## Test plan
- [ ] Parse a PDF with MinerU and `chunk_token_num = 2048` → verify
chunks are merged up to token limit
- [ ] Parse a PDF with MinerU and `chunk_token_num = 0` (or default) →
verify original behavior (no merging)
- [ ] Parse a PDF with DeepDOC parser → verify no change in behavior
(not affected by this code path)
- [ ] Repeat with docling/paddleocr if available
### What problem does this PR solve?
Summary:
This PR addresses critical indexing issues in
`deepdoc/parser/pdf_parser.py` that occur when parsing long PDFs with
chunk-based pagination:
- Normalize rotated table page numbering: Rotated-table re-OCR now
writes `page_number` in chunk-local 1-based form, eliminating the double
addition of the `page_from` offset that caused misalignment between
table positions and document boxes.
- Convert absolute positions to chunk-local coordinates: When inserting
tables/figures extracted via `_extract_table_figure`, positions are now
converted from absolute (0-based) to chunk-local indices before distance
matching and box insertion. This prevents IndexError and out-of-range
accesses during paged parsing of long documents.
Root Cause:
The parser mixed absolute (0-based, document-global) and relative
(1-based, chunk-local) page numbering systems. Table/figure positions
from layout extraction carried absolute page numbers, but insertion
logic expected chunk-local coordinates aligned with self.boxes and
page_cum_height.
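The conversion between the two numbering systems can be sketched as below. This is a minimal illustration, not the parser's code: the helper name `to_chunk_local_page` is hypothetical, and it assumes `page_from` is the 0-based index of the first page in the current chunk.

```python
def to_chunk_local_page(absolute_page, page_from):
    """Convert a 0-based document-global page index to a 1-based chunk-local page number."""
    local_index = absolute_page - page_from
    if local_index < 0:
        # The position belongs to a page before this chunk; refusing it
        # avoids indexing chunk-local structures with a negative offset.
        raise ValueError(
            f"page {absolute_page} precedes chunk start {page_from}"
        )
    return local_index + 1
```

Applying the offset exactly once, at a single conversion point, is what avoids the double addition of `page_from` described above.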
Testing (done by me):
Manual verification: Parse a 200+ page PDF with `from_page` > 0 and
table rotation enabled. Confirm that:
- Tables and figures appear on the correct pages
- No IndexError or position mismatches occur
- Page numbers in the output match the expected chunk-local offsets
Automated testing: not done.
## Separate Discussion: Memory Optimization Strategy(from codex-5.2-max
and claude 4.5 opus and me)
### Context
The current implementation loads entire page ranges into memory
(`__images__`, `page_chars`, intermediates), which can cause RAM
exhaustion on large documents. While the page numbering fix resolves
correctness issues, scalability remains a concern.
### Proposed Architecture
**Pipeline-Driven Chunking with Explicit Resource Management:**
1. **Authoritative chunk planning**: Accept page-range specifications
from upstream pipeline as the single source of truth. The parser should
be a stateless worker that processes assigned chunks without making
independent pagination decisions.
2. **Granular memory lifecycle**:
```python
for chunk_spec in chunk_plan:
    # Load only chunk_spec.pages into __images__
    page_images = load_page_range(chunk_spec.start, chunk_spec.end)
    # Process with offset tracking
    results = process_chunk(page_images, offset=chunk_spec.start)
    # Explicit cleanup before next iteration
    del page_images, page_chars, layout_intermediates
    gc.collect()  # Force collection of large objects
```
3. **Persistent lightweight state**: Keep model instances (layout
detector, OCR engine), document metadata (outlines, PDF structure), and
configuration across chunks to avoid reinitialization overhead (~2-5s
per chunk for model loading).
4. **Adaptive fallback**: Provide `max_pages_per_chunk` (default: 50)
only when pipeline doesn't supply a plan. Never exceed
pipeline-specified ranges to maintain predictable memory bounds.
5. **Optional: Dynamic budgeting**: Expose a memory budget parameter
that adjusts chunk size based on observed image dimensions and format
(e.g., reduce chunk size for high-DPI scanned documents).
### Benefits
- **Predictable memory footprint**: RAM usage bounded by `chunk_size ×
avg_page_size` rather than total document size
- **Horizontal scalability**: Enables parallel chunk processing across
workers
- **Failure isolation**: Page extraction errors affect only current
chunk, not entire document
- **Cloud-friendly**: Works within container memory limits (e.g., 2-4GB
per worker)
### Trade-offs
- **Increased I/O**: Re-opening PDF for each chunk vs. keeping file
handle (mitigated by page-range seeks)
- **Complexity**: Requires careful offset tracking and stateful
coordination between pipeline and parser
- **Warmup cost**: Model initialization overhead amortized across chunks
(acceptable for documents >100 pages)
### Implementation Priority
This optimization should be **deferred to a separate PR** after the
current correctness fix is merged, as:
1. It requires broader architectural changes across the pipeline
2. Current fix is critical for correctness and can be backported
3. Memory optimization needs comprehensive benchmarking on
representative document corpus
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Enterprise deployments that use an external Identity Provider (e.g.,
Microsoft Entra ID, Okta, Keycloak) need the ability to enforce SSO-only
authentication by hiding the email/password login form. Currently, the
login page always shows the password form alongside OAuth buttons, with
no way to disable it.
This PR adds a `disable_password_login` configuration option under the
existing `authentication` section in `service_conf.yaml`. When set to
`true`, the login page only displays configured OAuth/SSO buttons and
hides the email/password form, "Remember me" checkbox, and "Sign up"
link.
The flag can be set via:
- `service_conf.yaml` (`authentication.disable_password_login: true`)
- Environment variable (`DISABLE_PASSWORD_LOGIN=true`)
Default behavior is unchanged (`false`).
### Behavior
| `disable_password_login` | OAuth configured | Result |
|---|---|---|
| `false` (default) | No | Standard email/password form |
| `false` | Yes | Email/password form + SSO buttons below |
| `true` | Yes | **SSO buttons only** (no form, no sign up link) |
| `true` | No | Empty card (admin should configure OAuth first) |
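The behavior table can be read as two independent conditions. The sketch below mirrors it in Python for illustration; the real rendering logic lives in `web/src/pages/login/index.tsx`, and the function name and element labels here are hypothetical.

```python
def login_elements(disable_password_login, oauth_configured):
    """Return the login page elements to render, following the behavior table."""
    elements = []
    if not disable_password_login:
        # Stands for the email/password form, "Remember me" and "Sign up".
        elements.append("password_form")
    if oauth_configured:
        # OAuth/SSO buttons render whenever channels exist.
        elements.append("sso_buttons")
    return elements
```

An empty result corresponds to the "empty card" row, which is why OAuth should be configured before the flag is enabled.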
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Files changed (5)
1. `docker/service_conf.yaml.template` — added `disable_password_login:
false` under authentication
2. `common/settings.py` — added `DISABLE_PASSWORD_LOGIN` global variable
and loader in `init_settings()`
3. `common/config_utils.py` — fixed `TypeError` in `show_configs()` when
authentication section contains non-dict values (e.g., booleans)
4. `api/apps/system_app.py` — exposed `disablePasswordLogin` flag in
`/config` endpoint
5. `web/src/pages/login/index.tsx` — conditionally render password form
based on config flag; OAuth buttons always render when channels exist
---------
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
### What problem does this PR solve?
Fix: add soft limit for graph rag size #13258 Q2
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
When using OceanBase as the document storage engine, parsing and
inserting chunks with chunk_data (e.g., table parser row data) fails
with the following error:
```
[ERROR][Exception]: Insert chunk error: ['Unconsumed column names: chunk_data']
```
This happens because the `chunk_data` column was recently introduced but
was omitted from the `EXTRA_COLUMNS` list in `rag/utils/ob_conn.py`.
As a result, the automatic schema migration for existing OceanBase
tables does not append the missing chunk_data column, causing the
underlying pyobvector or SQLAlchemy to raise an unconsumed column names
error during data insertion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What is the solution?
Added `column_chunk_data` to the `EXTRA_COLUMNS` list in
`rag/utils/ob_conn.py`.
This ensures that the OceanBase connection wrapper can correctly detect
the missing column and automatically alter existing chunk tables to
include the chunk_data field during initialization.
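The detection step of such a migration can be sketched as follows. This is an illustration only: the helper `missing_columns` is hypothetical, and it works on plain column names, whereas the real `EXTRA_COLUMNS` list holds column definitions.

```python
def missing_columns(existing_columns, expected_columns):
    """Return the expected column names that the table does not have yet, in declaration order."""
    present = set(existing_columns)
    return [name for name in expected_columns if name not in present]
```

A column that is left out of the expected list is never reported as missing, so the table is never altered and inserts that reference it fail, which matches the error above.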
### What problem does this PR solve?
Feat: add preprocess parameters for ingestion pipeline
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR helps automate testing of the UI using pytest and Playwright.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test automation infrastructure
---------
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
This PR adds comprehensive **Right-to-Left (RTL) language support**,
primarily targeting Arabic and other RTL scripts (Hebrew, Persian, Urdu,
etc.).
Previously, RTL content had multiple rendering issues:
- Incorrect sentence splitting for Arabic punctuation in citation logic
- Misaligned text in chat messages and markdown components
- Improper positioning of blockquotes and “think” sections
- Incorrect table alignment
- Citation placement ambiguity in RTL prompts
- UI layout inconsistencies when mixing LTR and RTL text
This PR introduces backend and frontend improvements to properly detect,
render, and style RTL content while preserving existing LTR behavior.
#### Backend
- Updated sentence boundary regex in `rag/nlp/search.py` to include
Arabic punctuation:
- `،` (comma)
- `؛` (semicolon)
- `؟` (question mark)
- `۔` (Arabic full stop)
- Ensures citation insertion works correctly in RTL sentences.
- Updated citation prompt instructions to clarify citation placement
rules for RTL languages.
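A sentence splitter that treats the Arabic marks listed above as boundaries might look like the sketch below. The pattern is an assumption written for illustration and is not the regex from `rag/nlp/search.py`; the names `SENTENCE_BOUNDARY` and `split_sentences` are hypothetical.

```python
import re

# Hypothetical boundary pattern: Latin and CJK terminators plus the Arabic
# comma, semicolon, question mark and full stop.
SENTENCE_BOUNDARY = re.compile(r"([.!?;。？！；،؛؟۔]+)")


def split_sentences(text):
    """Split text into sentences, keeping each terminator attached to its sentence."""
    # With a capturing group, split() alternates text and separator pieces.
    parts = SENTENCE_BOUNDARY.split(text)
    sentences = []
    for i in range(0, len(parts), 2):
        body = parts[i].strip()
        mark = parts[i + 1] if i + 1 < len(parts) else ""
        if body:
            sentences.append(body + mark)
    return sentences
```

Without the Arabic marks in the character class, an Arabic paragraph would come back as a single sentence, leaving only one place to attach a citation.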
#### Frontend
- Introduced a new utility: `text-direction.ts`
- Detects text direction based on Unicode ranges.
- Supports Arabic, Hebrew, Syriac, Thaana, and related scripts.
- Provides `getDirAttribute()` for automatic `dir` assignment.
- Applied dynamic `dir` attributes across:
- Markdown rendering
- Chat messages
- Search results
- Tables
- Hover cards and reference popovers
- Added proper RTL styling in LESS:
- Text alignment adjustments
- Blockquote border flipping
- Section indentation correction
- Table direction switching
- Use of `<bdi>` for figure labels to prevent bidirectional conflicts
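Direction detection from Unicode ranges can be sketched as below. This is a Python illustration, not the contents of `text-direction.ts`: the range table and the first-strong-character heuristic are assumptions, and `get_dir_attribute` only borrows the name of the real `getDirAttribute()` helper.

```python
# Assumed code point ranges for right-to-left scripts.
RTL_RANGES = (
    (0x0590, 0x05FF),  # Hebrew
    (0x0600, 0x06FF),  # Arabic
    (0x0700, 0x074F),  # Syriac
    (0x0750, 0x077F),  # Arabic Supplement
    (0x0780, 0x07BF),  # Thaana
    (0xFB1D, 0xFDFF),  # Hebrew and Arabic presentation forms
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
)


def is_rtl_char(ch):
    """Return True when the character falls in one of the RTL ranges."""
    code = ord(ch)
    return any(low <= code <= high for low, high in RTL_RANGES)


def get_dir_attribute(text):
    """Return "rtl" if the first letter is from an RTL script, else "ltr"."""
    for ch in text:
        if is_rtl_char(ch):
            return "rtl"
        if ch.isalpha():
            return "ltr"
    # Digits, punctuation, or empty input: keep the default direction.
    return "ltr"
```

Deciding on the first letter, rather than on any RTL character, keeps mostly-LTR text that merely quotes an Arabic word rendered left to right.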
#### DevOps / Environment
- Added Windows backend launch script with retry handling.
- Updated dependency metadata.
- Adjusted development-only React debugging behavior.
---
### Type of change
- [x] Bug Fix (non-breaking change which fixes RTL rendering and
citation issues)
- [x] New Feature (non-breaking change which adds RTL detection and
dynamic direction handling)
---------
Co-authored-by: 6ba3i <isbaaoui09@gmail.com>
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Co-authored-by: Ahmad Intisar <168020872+ahmadintisar@users.noreply.github.com>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Feat: Modify the form styles for retrieval and conditional operators.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat: pipeline add preprocess
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: When exporting the agent DSL, the tailkey, password, and history
fields need to be cleared. #13281
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
test_doc_sdk_routes_unit had two flaky/incorrect branch assumptions:
1. parse/stop_parsing production logic gates on doc.run, but tests used
progress, causing branch mismatch and unintended fallthrough into
mutation/DB paths.
2. stop_parsing invalid-state test asserted an outdated message
fragment, making the contract brittle.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Update for Admin UI:
- Update file picker input in **Registration whitelist** > **Import from
Excel** modal
- Modify DOM structure of **Sandbox Settings** and move several
hardcoded texts into translation files
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Adds the element IDs needed to implement the new Playwright test suite
for the UI.
### Type of change
- [x] Other (please describe): Testing IDs
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Properly close detached PIL image on JPEG save failure in encode_image.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When the original code terminates a parsing task midway, the progress
may be neither 0 nor 1, which makes it impossible to call the interface
to parse the document again.
- Change the document parsing progress check to a task status check,
using `TaskStatus.RUNNING.value`.
- Update the condition for stopping document parsing to check whether
the task is running instead.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This pull request refactors the chat session creation and deletion logic
in both the parser and client code to use unique session IDs instead of
session names. It also updates the corresponding command syntax and
payloads, ensuring more robust and unambiguous session management.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Create / Drop / List chat sessions
2. Chat with LLM and datasets
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This pull request makes a small but important fix to how streaming
requests are handled in the `completion` endpoint of
`conversation_app.py`. The main change ensures that the `stream`
argument is not passed twice, which could cause errors.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
**Summary**
This PR tackles a significant memory bottleneck when processing
image-heavy Word documents. Previously, our pipeline eagerly decoded
DOCX images into `PIL.Image` objects, which caused high peak memory
usage. To solve this, I've introduced a **lazy-loading approach**:
images are now stored as raw blobs and only decoded exactly when and
where they are consumed.
This successfully reduces the memory footprint while keeping the parsing
output completely identical to before.
**What's Changed**
Instead of a dry file-by-file list, here is the logical breakdown of the
updates:
* **The Core Abstraction (`lazy_image.py`)**: Introduced `LazyDocxImage`
along with helper APIs to handle lazy decoding, image-type checks, and
NumPy compatibility. It also supports `.close()` and detached PIL access
to ensure safe lifecycle management and prevent memory leaks.
* **Pipeline Integration (`naive.py`, `figure_parser.py`, etc.)**:
Updated the general DOCX picture extraction to return these new lazy
images. Downstream consumers (like the figure/VLM flow and base64
encoding paths) now decode images right at the use site using detached
PIL instances, avoiding shared-instance side effects.
* **Compatibility Hooks (`operators.py`, `book.py`, etc.)**: Added
necessary compatibility conversions so these lazy images flow smoothly
through existing merging, filtering, and presentation steps without
breaking.
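The lazy-loading idea can be sketched with a small wrapper like the one below. It is not the `LazyDocxImage` implementation: the class name `LazyBlobImage` is hypothetical, and the decoder is injected so that the sketch stays standard-library only. With Pillow, the decoder could be something like `lambda blob: Image.open(io.BytesIO(blob))`.

```python
class LazyBlobImage:
    """Hold raw image bytes and decode them only when first requested."""

    def __init__(self, blob, decoder):
        self._blob = blob
        self._decoder = decoder
        self._decoded = None

    def get(self):
        """Decode on first access and reuse the decoded object afterwards."""
        if self._decoded is None:
            if self._blob is None:
                raise ValueError("image is closed")
            self._decoded = self._decoder(self._blob)
        return self._decoded

    def close(self):
        """Release the decoded object and the raw bytes."""
        close_fn = getattr(self._decoded, "close", None)
        if callable(close_fn):
            close_fn()
        self._decoded = None
        self._blob = None
```

Until `get()` is called, only the compressed bytes are held, which is typically much smaller than a decoded bitmap.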
**Scope & What is Intentionally Left Out**
To keep this PR focused, I have restricted these changes strictly to the
**general Word pipeline** and its downstream consumers.
The `QA` and `manual` Word parsing pipelines are explicitly **not
modified** in this PR. They can be safely migrated to this new lazy-load
model in a subsequent, standalone PR.
**Design Considerations**
I briefly considered adding image compression during processing, but
decided against it to avoid any potential quality degradation in the
derived outputs. I also held off on a massive pipeline re-architecture
to avoid overly invasive changes right now.
**Validation & Testing**
I've tested this to ensure no regressions:
* Compared identical DOCX inputs before and after this branch: chunk
counts, extracted text, table HTML, and image descriptions match
perfectly.
* **Confirmed a noticeable drop in peak memory usage when processing
image-dense documents.** For a 30MB Word document containing 243 1080p
screenshots, memory consumption is reduced by approximately 1.5GB.
**Breaking Changes**
None.
### What problem does this PR solve?
Added the option to delete models individually from providers.
For additional context, see
[issue-13184](https://github.com/infiniflow/ragflow/issues/13184)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Note: when deleting a selected model, it leaves the full model name as
text as seen here:
<img width="676" height="90" alt="image"
src="https://github.com/user-attachments/assets/c11c7c1b-3f2a-4119-b20c-bb8148a8ad16"
/>
If attempting to use ragflow with that deleted model, ragflow will throw
an unauthorized model error as expected.
I left it like that on purpose, so it's easier for users to understand
what they deleted and that they need to replace it with another model.
Co-authored-by: Shahar Flumin <shahar@Shahars-MacBook-Air.local>
### What problem does this PR solve?
The SeaFile connector currently synchronises the entire account, that
is, every library visible to the authenticated user. This is impractical
for users who only need a subset of their data indexed, especially on
large SeaFile instances with many shared libraries.
This PR introduces granular sync scope support, allowing users to choose
between syncing their entire account, a single library, or a specific
directory within a library. It also adds support for SeaFile
library-scoped API tokens (`/api/v2.1/via-repo-token/` endpoints),
enabling tighter access control without exposing account-level
credentials.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Test
```
from seafile_connector import SeaFileConnector
import logging
import os

logging.basicConfig(level=logging.DEBUG)

URL = os.environ.get("SEAFILE_URL", "https://seafile.example.com")
TOKEN = os.environ.get("SEAFILE_TOKEN", "")
REPO_ID = os.environ.get("SEAFILE_REPO_ID", "")
SYNC_PATH = os.environ.get("SEAFILE_SYNC_PATH", "/Documents")
REPO_TOKEN = os.environ.get("SEAFILE_REPO_TOKEN", "")


def _test_scope(scope, repo_id=None, sync_path=None):
    print(f"\n{'='*50}")
    print(f"Testing scope: {scope}")
    print(f"{'='*50}")
    creds = {"seafile_token": TOKEN} if TOKEN else {}
    if REPO_TOKEN and scope in ("library", "directory"):
        creds["repo_token"] = REPO_TOKEN
    connector = SeaFileConnector(
        seafile_url=URL,
        batch_size=5,
        sync_scope=scope,
        include_shared=False,
        repo_id=repo_id,
        sync_path=sync_path,
    )
    connector.load_credentials(creds)
    connector.validate_connector_settings()
    count = 0
    for batch in connector.load_from_state():
        for doc in batch:
            count += 1
            print(f"  [{count}] {doc.semantic_identifier} "
                  f"({doc.size_bytes} bytes, {doc.extension})")
    print(f"\n-> {scope} scope: {count} document(s) found.\n")


# 1. Account scope
if TOKEN:
    _test_scope("account")
else:
    print("\nSkipping account scope (set SEAFILE_TOKEN)")

# 2. Library scope
if REPO_ID and (TOKEN or REPO_TOKEN):
    _test_scope("library", repo_id=REPO_ID)
else:
    print("\nSkipping library scope (set SEAFILE_REPO_ID + token)")

# 3. Directory scope
if REPO_ID and SYNC_PATH and (TOKEN or REPO_TOKEN):
    _test_scope("directory", repo_id=REPO_ID, sync_path=SYNC_PATH)
else:
    print("\nSkipping directory scope (set SEAFILE_REPO_ID + SEAFILE_SYNC_PATH + token)")
```
### What problem does this PR solve?
Update **Chat** UI:
- Align to the design.
- Update `<AudioButton>` visualizer logic.
- Fix keyboard navigation issue.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
The _transfer_to_sections method was throwing a type hint violation
because it occasionally returns 3-item tuples instead of 2. Adjusted to
list[tuple[str, ...]] to prevent runtime crashes.
Error:
20:53:21 Page(1~10): [ERROR]Internal server error while chunking:
Method deepdoc.parser.docling_parser.DoclingParser._transfer_to_sections()
return [(1. JIRA Nasıl Kullanılır?, text,
@@1\t70.8\t194.9\t70.9\t85.5##), (1.1. Proje O...##)] violates type
hint list[tuple[str, str]], as list index 15 item tuple (Gelen ekran
üzerinden alanları isterlerine göre doldurduğunuz taktirde Create
düğmesi i...##) length 3 != 2.
20:53:21 [ERROR][Exception]: Method
deepdoc.parser.docling_parser.DoclingParser._transfer_to_sections()
return [('1. JIRA Nasıl Kullanılır?', 'text',
'@@1\t70.8\t194.9\t70.9\t85.5##'), ('1.1. Proje O...##')] violates
type hint list[tuple[str, str]], as list index 15 item tuple ('Gelen ekran
üzerinden alanları isterlerine göre doldurduğunuz taktirde Create
düğmesi i...##') length 3 != 2.
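The difference between the two hints can be illustrated with a small runtime check. This is a sketch only: the alias `Sections` and the helper `matches_sections_hint` are hypothetical and are not part of `docling_parser.py`.

```python
# Variadic tuple hint: each item is a tuple of strings of any length
# (built-in generic syntax, Python 3.9+).
Sections = list[tuple[str, ...]]


def matches_sections_hint(value):
    """Runtime check equivalent to the list[tuple[str, ...]] hint."""
    return isinstance(value, list) and all(
        isinstance(item, tuple) and all(isinstance(part, str) for part in item)
        for item in value
    )
```

A hint of `list[tuple[str, str]]` admits only 2-item tuples, so a runtime type checker rejects the 3-item tuples seen in the log; the variadic form accepts both lengths.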
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Enes Delibalta <enes.delibalta@pentanom.com>
### What problem does this PR solve?
Refer to issue: #13236
The base URL for the GPUStack chat model requires a `/v1` suffix. For
other model types such as `Embedding` or `Rerank`, the `/v1` suffix is
not required and is appended in code.
This PR applies the same logic to the chat model.
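Appending the suffix in code can be sketched as below. The helper name `ensure_v1_suffix` is hypothetical and the actual change may normalise the URL differently.

```python
def ensure_v1_suffix(base_url):
    """Return the base URL with exactly one trailing /v1 segment."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed
    return trimmed + "/v1"
```

Checking for an existing suffix keeps the function idempotent, so users who already entered a URL ending in `/v1` are not broken by the change.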
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR fixes 2 bugs related to RAGFlow's init superuser functionality.
#### Bug 1
When the RAGFlow server was started with the `--init-superuser` option
it would always create a new admin user even if it already exists
resulting in duplicate users.
To fix this, I added an additional check before creating the superuser and
added the *unique* constraint to the email column of the database, to
mitigate potential TOCTOU race conditions. Since existing databases
could contain duplicate emails I added email de-duplication to the
database migration.
#### Bug 2
When the RAGFlow server was started with the `--init-superuser` option
but without configured default LLM and embedding models it would fail to
start because the `init_superuser` function would always send test
requests to the models even if they were not set.
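The combination of an existence check and a unique constraint from Bug 1 can be sketched with the standard-library `sqlite3` module. This swaps in SQLite purely for illustration; the table layout and the helpers `make_db` and `create_superuser_once` are hypothetical and do not reflect RAGFlow's schema or ORM.

```python
import sqlite3


def make_db():
    """Create an in-memory database whose email column is unique."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )
    return conn


def create_superuser_once(conn, email):
    """Insert the superuser only if absent; return True when a row was created."""
    row = conn.execute("SELECT 1 FROM users WHERE email = ?", (email,)).fetchone()
    if row is not None:
        return False
    try:
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError:
        # Another process inserted the same email between the check and
        # the insert; the unique constraint closes that race window.
        return False
    return True
```

The check alone handles the common restart case, while the constraint covers concurrent starts that the check cannot see.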
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The output content of the multi-model comparison will disappear.
#13227
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Make the embedded page of chat compatible with mobile devices.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds [Avian](https://avian.io) as a new LLM provider to RAGFlow.
Avian provides an OpenAI-compatible API with competitive pricing,
offering access to models like DeepSeek V3.2, Kimi K2.5, GLM-5, and
MiniMax M2.5.
**Provider details:**
- API Base URL: `https://api.avian.io/v1`
- Auth: Bearer token via API key
- OpenAI-compatible (chat completions, streaming, function calling)
- Models:
- `deepseek/deepseek-v3.2` — 164K context, $0.26/$0.38 per 1M tokens
- `moonshotai/kimi-k2.5` — 131K context, $0.45/$2.20 per 1M tokens
- `z-ai/glm-5` — 131K context, $0.30/$2.55 per 1M tokens
- `minimax/minimax-m2.5` — 1M context, $0.30/$1.10 per 1M tokens
**Changes:**
- `rag/llm/chat_model.py` — Add `AvianChat` class extending `Base`
- `rag/llm/__init__.py` — Register in `SupportedLiteLLMProvider`,
`FACTORY_DEFAULT_BASE_URL`, `LITELLM_PROVIDER_PREFIX`
- `conf/llm_factories.json` — Add Avian factory with model definitions
- `web/src/constants/llm.ts` — Add to `LLMFactory` enum, `IconMap`,
`APIMapUrl`
- `web/src/components/svg-icon.tsx` — Register SVG icon
- `web/src/assets/svg/llm/avian.svg` — Provider icon
- `docs/references/supported_models.mdx` — Add to supported models table
This follows the same pattern as other OpenAI-compatible providers
(e.g., n1n #12680, TokenPony).
cc @KevinHuSh @JinHai-CN
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Fix [#13210](https://github.com/infiniflow/ragflow/issues/13210)
Remove limit in _search_metadata, use pagination in _search_metadata.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This pull request makes a minor update to the English locale strings for
the Table of Contents toggle buttons, changing the labels from "Show
TOC"/"Hide TOC" to "Show content"/"Hide content" for improved clarity.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Codecov’s coverage report shows that several RAGFlow code paths are
currently untested or under-tested. This makes it easier for regressions
to slip in during refactors and feature work.
This PR adds targeted automated tests to cover the files and branches
highlighted by Codecov, improving confidence in core behavior while
keeping runtime functionality unchanged.
### Type of change
- [x] Other (please describe): Test coverage improvement (adds/extends
unit and integration tests to address Codecov-reported gaps)
…ff publishing this guide.
### What problem does this PR solve?
Removed the failure modes checklist per your request. @JinHai-CN
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: The agent is embedded in the webpage; interrupting its operation
will redirect to the login page. #12697
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Fixes the initial enabled/disabled state of chat variable checkboxes by
correcting a helper function that previously always returned .
## Problem
in had two statements:
Because of the early , the function always returned , so all chat
variable checkboxes were initially disabled regardless of the field.
This also made the helper inconsistent with , which enables all fields
by default except .
## Fix
Update to use the same condition as :
This ensures:
- All chat variable checkboxes are enabled by default
- remains the only field disabled by default
- Behavior is consistent between the helper and the checkbox map
initialization in .
No API or backend changes are involved; this is a small, isolated
frontend bugfix.
### What problem does this PR solve?
Feat: optimize ingestion pipeline with preprocess
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds a new guide: **"RAG failure modes checklist"**.
RAG systems often fail in ways that are not immediately visible from a
single metric like accuracy or latency. In practice, debugging
production RAG applications requires identifying recurring failure
patterns across retrieval, routing, evaluation, and deployment stages.
This guide introduces a structured, pattern-based checklist (P01–P12) to
help users interpret traces, evaluation results, and dataset behavior
within RAGFlow. The goal is to provide a practical way to classify
incidents (e.g., retrieval hallucination, chunking issues, index
staleness, routing misalignment) and reason about minimal structural
fixes rather than ad-hoc prompt changes.
The change is documentation-only and does not modify any code or
configuration.
Refs #13138
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Codecov’s coverage report shows that several RAGFlow code paths are
currently untested or under-tested. This makes it easier for regressions
to slip in during refactors and feature work.
This PR adds targeted automated tests to cover the files and branches
highlighted by Codecov, improving confidence in core behavior while
keeping runtime functionality unchanged.
### Type of change
- [x] Other (please describe): Test coverage improvement (adds/extends
unit and integration tests to address Codecov-reported gaps)
### What problem does this PR solve?
Fix: Note component text area does not resize with component #13065
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
User experience enhancements for the variable picker in the prompt editor:
- Add case-insensitive string search for variables.
- Add basic keyboard navigation in the variable picker:
  - Press <kbd>UpArrow</kbd> and <kbd>DownArrow</kbd> to navigate.
  - Press <kbd>Tab</kbd> or <kbd>Enter</kbd> to insert the focused item into
    the editor.
- Fix an invalid variable being unexpectedly inserted into the editor when
  pressing <kbd>Tab</kbd>.
_Note: variables inside secondary menus (agent structured output, etc.) must
still be picked with a pointing device. This may be finished later._
### Type of change
- [x] Refactoring
Actual behavior
When using OceanBase as storage, the list_chunk sorting is abnormal. The
following is the SQL statement.
SELECT id, content_with_weight, important_kwd, question_kwd, img_id,
available_int, position_int, doc_type_kwd, create_timestamp_flt,
create_time, array_to_string(page_num_int, ',') AS page_num_int_sort,
array_to_string(top_int, ',') AS top_int_sort FROM
rag_store_284250730805059584 WHERE doc_id = '' AND kb_id IN ('') ORDER
BY page_num_int_sort ASC, top_int_sort ASC, create_timestamp_flt DESC
LIMIT 0, 20
<img width="1610" height="740" alt="image"
src="https://github.com/user-attachments/assets/84e14c30-a97f-4e8f-8c8c-6ccac915d97d"
/>
Co-authored-by: Aron.Yao <yaowei@yaoweideMacBook-Pro.local>
### What problem does this PR solve?
When users start RAGFlow with `docker compose -p <alias>`, Docker
creates volumes prefixed with the alias (e.g., `myproject_mysql_data`).
The migration script (`docker/migration.sh`) previously hardcoded the
`docker_` prefix in volume names, causing backup/restore to silently
skip all volumes for any non-default project name.
This PR adds a `-p <project_name>` option so the script correctly
targets volumes regardless of the Docker Compose project name used.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Changes
- Add `-p <project_name>` flag (default: `docker`) for specifying Docker
Compose project name
- Build volume names dynamically: `${project_name}_${base_name}`
- Update help text with new option documentation and examples
- Show project-aware `docker compose` commands in error messages
- Fix deprecated `docker-compose` to `docker compose` in hints
- Use dynamic step count instead of hardcoded `4`
- Fully backward compatible — existing usage without `-p` works
unchanged
### Usage
```bash
# Existing usage (unchanged)
./migration.sh backup
./migration.sh restore my_backup
# New: custom project name
./migration.sh -p myproject backup
./migration.sh -p myproject restore my_backup
```
### What problem does this PR solve?
Fix an authorization bypass (IDOR) in `/v1/document/web_crawl` that allows
cross-tenant dataset modification.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
The RDBMS (MySQL/PostgreSQL) connector generates document filenames
using the first 100 characters of the content column
(semantic_identifier). When the content contains newline characters
(\n), the resulting filename includes those newlines — for example:
Category: غير صحيح كليًا\nTitle: تفنيد حقائق....txt
RAGFlow's filename_type() function uses re.match(r".*\.txt$", filename)
to detect file types, but .* does not match newline characters by
default in Python regex. This causes the regex to fail, returning
FileType.OTHER, which triggers:
raise RuntimeError("This type of file has not been supported yet!")
As a result, all documents synced via the MySQL/PostgreSQL connector are
silently discarded. The sync logs report success (e.g., "399 docs
synchronized"), but zero documents actually appear in the dataset. This
is the root cause of issue #13001.
Root cause trace:
rdbms_connector.py → _row_to_document() sets semantic_identifier from
raw content (may contain \n)
connector_service.py → duplicate_and_parse() uses semantic_identifier as
the filename
file_service.py → upload_document() calls filename_type(filename)
file_utils.py → filename_type() regex .*\.txt$ fails on newlines →
returns FileType.OTHER
upload_document() raises "This type of file has not been supported yet!"
Fix: Sanitize the semantic_identifier in _row_to_document() by replacing
newlines and carriage returns with spaces before truncating to 100
characters.
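The failure and the fix described above can be reproduced in a few lines. This is a minimal sketch, not the connector code: `looks_like_txt` only mirrors the regex quoted above, and `sanitize_identifier` is a hypothetical helper name chosen for illustration.

```python
import re


def looks_like_txt(filename: str) -> bool:
    # Same pattern as quoted above; "." does not match "\n" by default.
    return re.match(r".*\.txt$", filename) is not None


def sanitize_identifier(raw: str, max_len: int = 100) -> str:
    # Hypothetical helper: replace CR/LF with spaces, then truncate.
    return raw.replace("\r", " ").replace("\n", " ")[:max_len]


identifier = "Category: A\nTitle: B"

# Unsanitized: the newline stops ".*" before it can reach ".txt".
print(looks_like_txt(identifier + ".txt"))                       # False

# Sanitized: the filename is a single line, so the pattern matches.
print(looks_like_txt(sanitize_identifier(identifier) + ".txt"))  # True
```

Sanitizing at the source keeps `filename_type()` unchanged; passing `re.DOTALL` to the regex would be an alternative, but it would leave newlines in stored filenames.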
Relates to: #13001, #12817
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
### What problem does this PR solve?
Fix LFI vulnerability in document parsing API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Fixes MinIO SSL/TLS support in two places: the MinIO **client**
connection and the **health check** used by the Admin/Service Health
dashboard. Both now respect the `secure` and `verify` settings from the
MinIO configuration.
Closes #13158
Closes #13159
---
## Problem
**#13158 – MinIO client:** The client in `rag/utils/minio_conn.py` was
hardcoded with `secure=False`, so RAGFlow could not connect to MinIO
over HTTPS even when `secure: true` was set in config. There was also no
way to disable certificate verification for self-signed certs.
**#13159 – MinIO health check:** In `api/utils/health_utils.py`, the
MinIO liveness check always used `http://` for the health URL. When
MinIO was configured with SSL, the health check failed and the dashboard
showed "timeout" even though MinIO was reachable over HTTPS.
---
## Solution
### MinIO client (`rag/utils/minio_conn.py`)
- Read `MINIO.secure` (default `false`) and pass it into the `Minio()`
constructor so HTTPS is used when configured.
- Add `_build_minio_http_client()` that reads `MINIO.verify` (default
`true`). When `verify` is false, return an `urllib3.PoolManager` with
`cert_reqs=ssl.CERT_NONE` and pass it as `http_client` to `Minio()` so
self-signed certificates are accepted.
- Support string values for `secure` and `verify` (e.g. `"true"`,
`"false"`).
### MinIO health check (`api/utils/health_utils.py`)
- Add `_minio_scheme_and_verify()` to derive URL scheme (http/https) and
the `verify` flag from `MINIO.secure` and `MINIO.verify`.
- Update `check_minio_alive()` to use the correct scheme, pass `verify`
into `requests.get(..., verify=verify)`, and use `timeout=10`.
### Config template (`docker/service_conf.yaml.template`)
- Add commented optional MinIO keys `secure` and `verify` (and env vars
`MINIO_SECURE`, `MINIO_VERIFY`) so deployers know they can enable HTTPS
and optional cert verification.
### Tests
- **`test/unit_test/utils/test_health_utils_minio.py`** – Tests for
`_minio_scheme_and_verify()` and `check_minio_alive()` (scheme, verify,
status codes, timeout, errors).
- **`test/unit_test/utils/test_minio_conn_ssl.py`** – Tests for
`_build_minio_http_client()` (verify true/false/missing, string values,
`CERT_NONE` when verify is false).
---
## Testing
- Unit tests added/updated as above; run with the project's test runner.
- Manually: configure MinIO with HTTPS and `secure: true` (and
optionally `verify: false` for self-signed); confirm client operations
work and the Service Health dashboard shows MinIO as alive instead of
timeout.
### What problem does this PR solve?
Fix stored XSS via HTML file upload and inline rendering in
/v1/file/get/<id>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Type of Change
- [x] Bug fix
## Description
Closes #13119
The current IMAP connector uses `split(',')` to parse email headers,
which crashes when a sender's display name contains a comma inside
quotes (e.g., `"Doe, John" <john@example.com>`).
This PR replaces the manual string splitting with Python's standard
`email.utils.getaddresses`. This correctly handles RFC 5322 quoted
strings and prevents the `RuntimeError: Expected a singular address`.
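The difference between the two approaches can be seen on a single header value. This is an illustrative sketch using only the standard library; the header string is made up for the example.

```python
from email.utils import getaddresses

header = '"Doe, John" <john@example.com>, jane@example.com'

# Naive splitting cuts the quoted display name in half.
naive = [part.strip() for part in header.split(",")]
print(len(naive))  # 3

# getaddresses honors quoted strings and returns (name, address) pairs.
parsed = getaddresses([header])
print(parsed)  # [('Doe, John', 'john@example.com'), ('', 'jane@example.com')]
```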
## Checklist
- [x] I have checked the code and it works as expected.
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
When a chat assistant has a hardcoded `empty_response`, that response was
not returned correctly in streaming mode when no information was found in
the knowledge base. In that case only one response with `"content": null`
was yielded. With `"references": true`, the `empty_response` is still put
into `final_content`, so some content is technically returned, but with
`"references": false` no content is returned at all.
This PR updates the OpenAI chat completion endpoint to yield an additional
response with the `empty_response` in the content.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fixes AttributeError in _remove_reasoning_content() when LLM returns
None, and improves JSON parsing regex for markdown code fences in
agent_with_tools.py
### What problem does this PR solve?
Fix: Metadata multi-select display error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Use negative lookbehind (?<![a-zA-Z]) so \] and \) inside commands
(e.g. \right], \big)) are not treated as block/inline delimiters
- Use greedy matching to capture up to the last valid delimiter, fixing
truncated formulas (e.g. C_{seq}(y|x) = \frac{1}{|y|} ...)
- Add unit tests for preprocessLaTeX
Closes #13134
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Fix duplicate YAML mapping keys in `helm/templates/env.yaml` that
cause deployment failures with strict YAML parsers
## Problem
The `range` loop in `env.yaml` iterates over all `.Values.env` keys and
emits them into a Secret. The exclusion filter skips host/port/user
keys, but does **not** skip password keys (`MYSQL_PASSWORD`,
`REDIS_PASSWORD`, `MINIO_PASSWORD`, `ELASTIC_PASSWORD`,
`OPENSEARCH_PASSWORD`). These same keys are then explicitly defined
again later in the template, producing duplicate YAML mapping keys.
Go's `yaml.v3` (used by Flux's helm-controller for post-rendering)
rejects duplicate keys per the YAML spec:
```
Helm install failed: yaml: unmarshal errors:
mapping key "MINIO_PASSWORD" already defined
mapping key "MYSQL_PASSWORD" already defined
mapping key "REDIS_PASSWORD" already defined
```
Plain `helm install` does not surface this because Helm's internal
parser (`yaml.v2`) silently accepts duplicate keys (last value wins).
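The two parser behaviors can be illustrated with a toy example. This sketch handles only flat `KEY: value` lines and is not a YAML parser; `parse_flat_mapping` and the sample values are hypothetical.

```python
def parse_flat_mapping(text: str, strict: bool) -> dict:
    # Toy parser for flat "KEY: value" lines only.
    result = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if strict and key in result:
            # Strict mode mirrors the rejection shown in the error above.
            raise ValueError(f'mapping key "{key}" already defined')
        result[key] = value.strip()  # lenient mode: last value wins
    return result


rendered = "MYSQL_PASSWORD: from-range-loop\nMYSQL_PASSWORD: from-explicit-entry"

print(parse_flat_mapping(rendered, strict=False))
try:
    parse_flat_mapping(rendered, strict=True)
except ValueError as exc:
    print(exc)
```

In lenient mode the duplicate goes unnoticed, which is why the problem only surfaces with a strict parser.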
## Fix
Add password keys to the exclusion filter on line 12 so they are only
emitted by their explicit definitions later in the template.
Note: `MINIO_ROOT_USER` is intentionally **not** excluded — it is only
emitted by the range loop and has no explicit definition elsewhere.
Excluding it causes MinIO to crash with `Missing credential environment
variable, "MINIO_ROOT_USER"`.
## Test plan
- [ ] Deploy with Flux helm-controller (uses yaml.v3) — no duplicate key
errors
- [ ] Verify all passwords are present in the rendered Secret
- [ ] Verify `MINIO_ROOT_USER` is present in the rendered Secret
- [ ] Test with `DOC_ENGINE=elasticsearch` (ELASTIC_PASSWORD)
- [ ] Test with `DOC_ENGINE=opensearch` (OPENSEARCH_PASSWORD)
Fixes #13135
### What problem does this PR solve?
This fixes the bug described in #13130. When starting RAGFlow with
Postgres, admin tenant creation failed because the rerank model was not
set.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refact: switch from google-generativeai to google-genai #13132
Refact: comment out unused pywencai.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
The Docker Compose configuration was using hub.icert.top as the registry
for the OpenSearch image. That registry is not reachable in our
environment, which causes podman pull and docker compose pull to fail
with a connection refused error. As a result, the application cannot
start because the OpenSearch image cannot be downloaded.
This PR updates the image reference to use the official Docker Hub image
(opensearchproject/opensearch:2.19.1) instead of the hub.icert.top
mirror. After this change, the image pulls successfully and the services
start as expected.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Shynggys Samarkhanov <shynggys.samarkhanov@nixs.com>
### What problem does this PR solve?
RAGFlow supports 12 UI languages but does not include Bulgarian. This PR
adds Bulgarian (`bg` / `Български`) as the 13th supported language,
covering the full UI translation (2001 keys across all 26 sections) and
OCR/PDF parser language mapping.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Changes
- **`web/src/constants/common.ts`** — Registered Bulgarian in all 5
language data structures (`LanguageList`, `LanguageMap`,
`LanguageAbbreviation` enum, `LanguageAbbreviationMap`,
`LanguageTranslationMap`)
- **`web/src/locales/config.ts`** — Added lazy-loading dynamic import
for the `bg` locale
- **`web/src/locales/bg.ts`** *(new)* — Full Bulgarian translation file
with all 26 sections, matching the English source (`en.ts`). All
interpolation placeholders, HTML tags, and technical terms are preserved
as-is
- **`deepdoc/parser/mineru_parser.py`** — Mapped `'Bulgarian'` to
`'cyrillic'` in `LANGUAGE_TO_MINERU_MAP` for OCR/PDF parser support
### How it works
The language selector automatically picks up the new entry. When a user
selects "Български", the translation bundle is lazy-loaded on demand.
The preference is persisted to the database and localStorage across
sessions.
### What problem does this PR solve?
Refactor: i18n language pack for on-demand import
### Type of change
- [x] Refactoring
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This PR fixes missing metadata on documents synced from the Moodle
connector, especially for **Book** modules.
Background:
- Moodle Book metadata includes fields like `chapters`, which is a
`list[dict]`.
- During metadata normalization in
`DocMetadataService._split_combined_values`, list deduplication used
`dict.fromkeys(...)`.
- `dict.fromkeys(...)` fails for unhashable values (like `dict`),
causing metadata update to fail.
- Result: documents were imported, but metadata was not saved for
affected module types (notably Books).
What this PR changes:
- Replaces hash-based list deduplication with `dedupe_list(...)`, which
safely handles unhashable list items while preserving order.
- This allows Book metadata (and other complex list metadata) to be
persisted correctly.
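One way an order-preserving deduplication that tolerates unhashable items might look is sketched below. The name `dedupe_list` comes from the description above, but this body is an assumption and may differ from the actual helper.

```python
def dedupe_list(items):
    # Order-preserving dedup that also accepts unhashable items (dict, list).
    seen_hashable = set()
    seen_unhashable = []
    result = []
    for item in items:
        try:
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:
            # Unhashable item: fall back to equality comparison.
            if item in seen_unhashable:
                continue
            seen_unhashable.append(item)
        result.append(item)
    return result


chapters = [{"title": "Intro"}, {"title": "Intro"}, "notes", "notes", {"title": "Usage"}]

try:
    list(dict.fromkeys(chapters))
except TypeError as exc:
    print("dict.fromkeys failed:", exc)

print(dedupe_list(chapters))  # [{'title': 'Intro'}, 'notes', {'title': 'Usage'}]
```

The equality fallback is quadratic in the number of unhashable items, which is usually acceptable for short metadata lists.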
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Contribution during my time at RAGcon GmbH.
### What problem does this PR solve?
Fix: replace session page icons and fix nested list search functionality
in filters
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The sync_data_source.py module imports WebDAVConnector from
common.data_source, but WebDAVConnector was never registered in the
package's __init__.py. This causes an ImportError at startup, crashing
the data sync service:
ImportError: cannot import name 'WebDAVConnector' from
'common.data_source'
The webdav_connector.py file already exists in the common/data_source/
directory — it just wasn't exported. This PR adds the import and
registers it in __all__.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Fix the issue where the server-side parameter validation fails when the
id parameter is None in the asynchronous list_datasets method.
### What problem does this PR solve?
Fix the issue where the server-side parameter validation fails when the
id parameter is None in the asynchronous list_datasets method.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Bugs fixed (#13109)
- Chat PDF preview error
- Data source add box error
- Change routes: next-chat -> chat, next-search -> search ...
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Renamed test/unit/test_delete_query_construction.py to
test/unit_test/common/test_delete_query_construction.py to align with
the project's directory structure and improve test categorization.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Decouple the memory API into a gateway layer (for routing and parameter
parsing) and a service layer (for business logic).
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This PR fixes SSO/OIDC login persistence after the Vite migration
#12568. Because wrappers are ignored by React Router, the OAuth callback
never stored the auth token in localStorage, causing auth to only work
while ?auth= stayed in the URL. We move that logic into a route loader
and remove the Bearer prefix for the signed token so the backend accepts
it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Contribution during my time at RAGcon GmbH.
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
### What problem does this PR solve?
Fix error when extracting the graph.
A string is expected, but a tuple was provided.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
- Replace hardcoded CST (UTC+8) expected values in `test_time_utils.py`
with dynamically computed local-time expectations using
`time.localtime()` and `time.mktime()`
- Tests previously failed in any timezone other than UTC+8; they now
pass regardless of the system's local timezone
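The idea of computing expectations from the local timezone, rather than hardcoding a UTC+8 string, can be sketched as follows. `format_local` is a hypothetical stand-in for the function under test, not code from `test_time_utils.py`.

```python
import time

FMT = "%Y-%m-%d %H:%M:%S"


def format_local(ts: float) -> str:
    # Hypothetical stand-in for a utility that formats a timestamp in local time.
    return time.strftime(FMT, time.localtime(ts))


def test_format_local_is_timezone_independent():
    ts = 1_700_000_000
    # A hardcoded expectation would only hold in one timezone;
    # deriving it from time.localtime() holds wherever the test runs.
    expected = time.strftime(FMT, time.localtime(ts))
    assert format_local(ts) == expected
    # mktime() inverts localtime(), so the round trip recovers the timestamp.
    assert time.mktime(time.localtime(ts)) == ts


test_format_local_is_timezone_independent()
```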
## Test plan
- [x] `uv run pytest test/unit_test/ -v` — 317 passed, 25 skipped
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Jim Smith <jhsmith0@me.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
RAGFlow was using incorrect model names for Google Gemini embeddings:
- `embedding-001` (missing `gemini-` prefix)
- `text-embedding-004` (OpenAI model name, not Gemini)
This caused API errors when users tried to use Gemini embeddings.
## Solution
- Updated `conf/llm_factories.json` to use the correct model name:
`gemini-embedding-001`
- Removed the incorrect `text-embedding-004` entry
- Added volume mount in `docker-compose.yml` to ensure config changes
persist
## Testing
Tested with a valid Gemini API key and confirmed embeddings now work
correctly.
## Changes
- Modified `conf/llm_factories.json`
- Modified `docker/docker-compose.yml`
---------
Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
description: Go naming conventions and best practices. Use this skill when working with Go code and need to name packages, files, directories, structs, interfaces, functions, variables, or constants. Provides comprehensive naming guidelines following Go community standards.
---
Strictly follow the naming conventions in [rules/named.md](rules/named.md)
echo "No docker/ragflow-logs directory found; skipping log collection"
fi
sudo rm -rf docker/ragflow-logs || true
- name: Stop ragflow:nightly for Infinity
if: always() # always run this step even if previous steps failed
run: |
# Sometimes `docker compose down` fail due to hang container, heavy load etc. Need to remove such containers to release resources(for example, listen ports).
echo "No docker/ragflow-logs directory found; skipping log collection"
fi
sudo rm -rf docker/ragflow-logs || true
- name: Stop ragflow:nightly
- name: Stop ragflow:nightly for Elasticsearch
if: always() # always run this step even if previous steps failed
run: |
# Sometimes `docker compose down` fail due to hang container, heavy load etc. Need to remove such containers to release resources(for example, listen ports).
RUN mkdir -p /ragflow/rag/res/deepdoc /root/.ragflow
RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/huggingface.co,target=/huggingface.co \
tar --exclude='.*' -cf - \
@@ -19,49 +19,49 @@ RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/huggingface.co
# This is the only way to run python-tika without internet access. Without this set, the default is to check the tika version and pull latest every time from Apache.
RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
- 🔧 [Build a Docker image](#-build-a-docker-image)
- 🔨 [Launch service from source for development](#-launch-service-from-source-for-development)
@@ -74,9 +76,9 @@
[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation ([RAG](https://ragflow.io/basics/what-is-rag)) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Demo
## 🎮 Get Started
Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
Try our cloud service at [https://cloud.ragflow.io](https://cloud.ragflow.io).
@@ -85,6 +87,8 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
## 🔥 Latest Updates
- 2026-04-24 Supports DeepSeek v4.
- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — Provides an official skill for accessing RAGFlow datasets via OpenClaw.
- 2025-12-26 Supports 'Memory' for AI agent.
- 2025-11-19 Supports Gemini 3 Pro.
- 2025-11-12 Supports data synchronization from Confluence, S3, Notion, Discord, Google Drive.
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
@@ -188,12 +193,12 @@ releases! 🌟
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.24.0` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.24.0`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
> The command below downloads the `v0.25.6` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.25.6`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.24.0
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
- 🔨 [Launch service from source for development](#-launch-service-from-source-for-development)
- 📚 [Documentation](#-documentation)
- 📜 [Roadmap](#-roadmap)
- 🏄 [Community](#-community)
- 🙌 [Contributing](#-contributing)
</details>
## 💡 What is RAGFlow?
[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation (RAG) engine that combines the latest RAG techniques with agent capabilities to build an advanced context layer for LLMs. It offers a streamlined RAG workflow that adapts to organizations of any size. Built on a [unified context engine](https://ragflow.io/basics/what-is-agent-context-engine) and ready-made agent templates, RAGFlow lets developers turn complex data into high-accuracy, production-ready AI systems efficiently and reliably.
## 🎮 Get Started
Try the trial version at [https://cloud.ragflow.io](https://cloud.ragflow.io).
3. Start up the server using the pre-built Docker images:
> [!CAUTION]
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.25.6` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.25.6`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
# Use CPU for DeepDoc tasks:
$ docker compose -f docker-compose.yml up -d
# To use GPU to accelerate DeepDoc tasks:
# sed -i '1i DEVICE=gpu' .env
# docker compose -f docker-compose.yml up -d
```
> Note: Prior to `v0.22.0`, we provided both images with embedding models and slim images without embedding models. Details are as follows:
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
> Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tag.
4. Check the server status after the server is up and running:
```bash
$ docker logs -f docker-ragflow-cpu-1
```
_The following output confirms a successful launch of the system:_
```bash
____ ___ ______ ______ __
/ __ \ / | / ____// ____// /____ _ __
/ /_/ // /| | / / __ / /_ / // __ \| | /| / /
/ _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/
* Running on all addresses (0.0.0.0)
```
> If you skip this confirmation step and log in to RAGFlow directly, your browser may show a `network abnormal`
> error because, at that moment, RAGFlow may not be fully initialized.
>
5. In your web browser, enter the IP address of your server and log in to RAGFlow.
> With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** the port number), because the default
> HTTP serving port `80` can be omitted when using the default configurations.
>
6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM factory in `user_default_llm` and update
the `API_KEY` field with the corresponding API key.
> See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
>
_The show is on!_
## 🔧 Configurations
When it comes to system configurations, you will need to manage the following files:
- [.env](./docker/.env): Keeps the fundamental settings for the system, such as `SVR_HTTP_PORT`, `MYSQL_PASSWORD`, and
`MINIO_PASSWORD`.
- [service_conf.yaml.template](./docker/service_conf.yaml.template): Configures the back-end services. The environment variables in this file are populated automatically when the Docker container starts. Any environment variables set within the Docker container are available for use, allowing you to customize service behavior based on the deployment environment.
- [docker-compose.yml](./docker/docker-compose.yml): The system relies on [docker-compose.yml](./docker/docker-compose.yml) to start up.
> The [./docker/README](./docker/README.md) file provides a detailed description of the environment settings and service
> configurations that can be used as `${ENV_VARS}` in the [service_conf.yaml.template](./docker/service_conf.yaml.template) file.
To update the default HTTP serving port (80), go to [docker-compose.yml](./docker/docker-compose.yml) and change `80:80`
to `<YOUR_SERVING_PORT>:80`.
Updates to the above configurations require a restart of all containers to take effect:
> ```bash
> $ docker compose -f docker-compose.yml up -d
> ```
### Switch doc engine from Elasticsearch to Infinity
RAGFlow uses Elasticsearch by default for storing full text and vectors. To switch to [Infinity](https://github.com/infiniflow/infinity/), follow these steps:
1. Stop all running containers:
```bash
$ docker compose -f docker/docker-compose.yml down -v
```
> [!WARNING]
> `-v` will delete the docker container volumes, and the existing data will be cleared.
2. Set `DOC_ENGINE` in **docker/.env** to `infinity`.
3. Start the containers:
```bash
$ docker compose -f docker-compose.yml up -d
```
> [!WARNING]
> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
## 🔧 Build a Docker image
This image is approximately 2 GB in size and relies on external LLM and embedding services.
- 🔎 [System Architecture](#-architecture-du-système)
- 🎬 [Self-hosting](#-auto-hébergement)
- 🔧 [Configurations](#-configurations)
- 🔧 [Build a Docker image](#-construire-une-image-docker)
- 🔨 [Launch service from source for development](#-lancer-le-service-depuis-les-sources-pour-le-développement)
- 📚 [Documentation](#-documentation)
- 📜 [Roadmap](#-feuille-de-route)
- 🏄 [Community](#-communauté)
- 🙌 [Contributing](#-contribuer)
</details>
## 💡 What is RAGFlow?
[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation ([RAG](https://ragflow.io/basics/what-is-rag)) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Get Started
Try our cloud service at [https://cloud.ragflow.io](https://cloud.ragflow.io).
- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill): Provides an official skill for accessing RAGFlow datasets via OpenClaw.
- 2025-12-26 Supports 'Memory' for AI agents.
- 2025-11-19 Supports Gemini 3 Pro.
- 2025-11-12 Supports data synchronization from Confluence, S3, Notion, Discord, and Google Drive.
- 2025-10-23 Supports MinerU & Docling as document parsing methods.
- 2025-10-15 Supports an orchestrable ingestion pipeline.
- 2025-08-08 Supports OpenAI's latest GPT-5 series models.
- 2025-08-01 Supports agentic workflow and MCP.
- 2025-05-23 Adds a Python/JavaScript code executor component to Agent.
- 2025-05-05 Supports cross-language queries.
- 2025-03-19 Supports using a multi-modal model to interpret images within PDF or DOCX files.
## 🎉 Stay Tuned
⭐️ Star our repository to stay up to date with exciting new features and improvements! Get instant notifications for new releases! 🌟
- Knowledge extraction based on [deep document understanding](./deepdoc/README.md) from unstructured data with complicated formats.
- Finds the "needle in a data haystack" of literally unlimited tokens.
### 🍱 **Template-based chunking**
- Intelligent and explainable.
- Plenty of template options to choose from.
### 🌱 **Grounded citations with reduced hallucinations**
- Visualization of text chunking to allow human intervention.
- Quick view of the key references and traceable citations to support grounded answers.
### 🍔 **Compatibility with heterogeneous data sources**
- Supports Word, slides, Excel, txt, images, scanned copies, structured data, web pages, and more.
### 🛀 **Automated and effortless RAG workflow**
- Streamlined RAG orchestration suited to both individuals and large businesses.
- Configurable LLMs and embedding models.
- Multiple recall paired with fused re-ranking.
- Intuitive APIs for seamless integration with business systems.
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Start up the server
1. Ensure `vm.max_map_count` >= 262144:
> To check the value of `vm.max_map_count`:
>
> ```bash
> $ sysctl vm.max_map_count
> ```
>
> Reset `vm.max_map_count` to a value of at least 262144 if it is lower.
>
> ```bash
> # In this case, we set it to 262144:
> $ sudo sysctl -w vm.max_map_count=262144
> ```
>
> This change will be reset after a system reboot. To make your change permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf**:
3. Start up the server using the pre-built Docker images:
> [!CAUTION]
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.25.6` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than `v0.25.6`, update the `RAGFLOW_IMAGE` variable in **docker/.env** before using `docker compose` to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
# Use CPU for DeepDoc tasks:
$ docker compose -f docker-compose.yml up -d
# To use GPU to accelerate DeepDoc tasks:
# sed -i '1i DEVICE=gpu' .env
# docker compose -f docker-compose.yml up -d
```
> Note: Prior to `v0.22.0`, we provided both images with embedding models and slim images without embedding models. Details as follows:
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
> Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tag.
4. Check the server status once the server is up and running:
```bash
$ docker logs -f docker-ragflow-cpu-1
```
_The following output confirms a successful launch of the system:_
```bash
____ ___ ______ ______ __
/ __ \ / | / ____// ____// /____ _ __
/ /_/ // /| | / / __ / /_ / // __ \| | /| / /
/ _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/
* Running on all addresses (0.0.0.0)
```
> If you skip this confirmation step and log in to RAGFlow directly, your browser may show a `network abnormal` error because RAGFlow may not be fully initialized at that moment.
>
5. In your web browser, enter the IP address of your server and log in to RAGFlow.
> With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** a port number), as the default HTTP serving port `80` can be omitted when using the default configurations.
>
6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key.
> See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
>
_The show is on!_
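Steps 4 and 5 amount to waiting until the server answers before logging in. A minimal Python sketch of that wait, under stated assumptions: the helper names `http_probe` and `wait_until_ready`, the retry count, and the "any status below 500 counts as up" rule are illustrative choices, not part of RAGFlow.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP status below 500 (assumed readiness rule)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code < 500
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply: not ready yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, sleeping between tries; True once it succeeds."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(interval_s)
    return False
```

A call such as `wait_until_ready(lambda: http_probe("http://IP_OF_YOUR_MACHINE"))` would block until the web UI responds or the attempts run out. Watching `docker logs` for the startup banner remains the check described above.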
## 🔧 Configurations
When it comes to system configurations, you will need to manage the following files:
- [.env](./docker/.env): Keeps the fundamental setups for the system, such as `SVR_HTTP_PORT`, `MYSQL_PASSWORD`, and `MINIO_PASSWORD`.
- [service_conf.yaml.template](./docker/service_conf.yaml.template): Configures the back-end services. The environment variables in this file are populated automatically when the Docker container starts. Any environment variables set within the Docker container are available for use, allowing you to customize service behavior based on the deployment environment.
- [docker-compose.yml](./docker/docker-compose.yml): The system relies on [docker-compose.yml](./docker/docker-compose.yml) to start up.
> The [./docker/README](./docker/README.md) file provides a detailed description of the environment settings and service configurations that can be used as `${ENV_VARS}` in the [service_conf.yaml.template](./docker/service_conf.yaml.template) file.
To update the default HTTP serving port (80), go to [docker-compose.yml](./docker/docker-compose.yml) and change `80:80` to `<YOUR_SERVING_PORT>:80`.
Updates to the above configurations require a restart of all containers to take effect:
> ```bash
> $ docker compose -f docker-compose.yml up -d
> ```
### Switch doc engine from Elasticsearch to Infinity
RAGFlow uses Elasticsearch by default for storing full text and vectors. To switch to [Infinity](https://github.com/infiniflow/infinity/), follow these steps:
1. Stop all running containers:
```bash
$ docker compose -f docker/docker-compose.yml down -v
```
> [!WARNING]
> `-v` will delete the Docker container volumes, and the existing data will be cleared.
2. Set `DOC_ENGINE` in **docker/.env** to `infinity`.
3. Start the containers:
```bash
$ docker compose -f docker-compose.yml up -d
```
> [!WARNING]
> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
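
Step 2 of the switch is a one-line change in **docker/.env**. A minimal Python sketch of that edit, operating on the file's text; the function name `set_env_var` and the sample values are hypothetical, and editing the file by hand works equally well:

```python
def set_env_var(env_text: str, key: str, value: str) -> str:
    """Return env_text with KEY=value set.

    Replaces every uncommented assignment of `key`; appends one if none exists.
    Comment lines and all other variables are left untouched.
    """
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # keep commented-out assignments as they are
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Sample input only; the real docker/.env has many more entries and may use other values.
sample = "# doc engine\nDOC_ENGINE=elasticsearch\nSVR_HTTP_PORT=9380\n"
updated = set_env_var(sample, "DOC_ENGINE", "infinity")
```

To apply it to the real file, read **docker/.env**, pass its text through `set_env_var`, and write the result back before running `docker compose ... up -d`.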
## 🔧 Build a Docker image
This image is approximately 2 GB in size and relies on external LLM and embedding services.
- 🔨 [Launch the service from source for development](#-meluncurkan-aplikasi-dari-sumber-untuk-pengembangan)
@@ -74,9 +76,9 @@
[RAGFlow](https://ragflow.io/) is a leading open-source [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) engine that integrates cutting-edge RAG technology with Agent capabilities to create a superior context layer for LLMs. It offers an efficient RAG workflow adaptable to enterprises of any scale. Powered by a converged context engine and pre-built Agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Demo
## 🎮 Get Started
Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
Try our cloud service at [https://cloud.ragflow.io](https://cloud.ragflow.io).
@@ -85,6 +87,8 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
## 🔥 Latest Updates
- 2026-04-24 Supports DeepSeek v4.
- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill): Provides an official skill for accessing RAGFlow datasets via OpenClaw.
- 2025-12-26 Supports 'Memory' for AI agents.
- 2025-11-19 Supports Gemini 3 Pro.
- 2025-11-12 Supports data synchronization from Confluence, S3, Notion, Discord, and Google Drive.
@@ -140,7 +144,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
@@ -148,6 +152,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
- Python >= 3.13
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
@@ -188,12 +193,12 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, [please use this guide to build a Docker image compatible with your system](https://ragflow.io/docs/dev/build_docker_image).
> The command below downloads the v0.24.0 edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than v0.24.0, update the RAGFLOW_IMAGE variable in docker/.env before using docker compose to start the server.
> The command below downloads the v0.25.6 edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than v0.25.6, update the RAGFLOW_IMAGE variable in docker/.env before using docker compose to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.24.0
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
[RAGFlow](https://ragflow.io/) is a leading open-source RAG engine that fuses cutting-edge [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) with Agent capabilities to create a superior context layer for large language models (LLMs). It offers an efficient RAG workflow adaptable to enterprises of any scale. With a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built Agent templates, it enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
@@ -170,12 +175,12 @@
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, [please use this guide to build a Docker image compatible with your system](https://ragflow.io/docs/dev/build_docker_image).
> The command below downloads the v0.24.0 edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than v0.24.0, update the RAGFLOW_IMAGE variable in docker/.env accordingly before using docker compose to start the server.
> The command below downloads the v0.25.6 edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than v0.25.6, update the RAGFLOW_IMAGE variable in docker/.env accordingly before using docker compose to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.24.0
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the entrypoint.sh file in the code matches the Docker image version.
- 🔎 [System Architecture](#-arquitetura-do-sistema)
- 🎬 [Get Started](#-primeiros-passos)
- 🎬 [Self-hosting](#-auto-hospedagem)
- 🔧 [Configurations](#-configurações)
- 🔧 [Build a Docker image without embedding models](#-construir-uma-imagem-docker-sem-incorporar-modelos)
- 🔧 [Build a Docker image including embedding models](#-construir-uma-imagem-docker-incluindo-modelos)
@@ -75,9 +77,9 @@
[RAGFlow](https://ragflow.io/) is a leading open-source [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) engine that fuses cutting-edge RAG technology with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built Agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Demo
## 🎮 Get Started
Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
Try our cloud service at [https://cloud.ragflow.io](https://cloud.ragflow.io).
@@ -86,6 +88,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
## 🔥 Latest Updates
- 2026-04-24 Supports DeepSeek v4.
- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill): Provides an official skill for accessing RAGFlow datasets via OpenClaw.
- 2025-12-26 Supports 'Memory' for AI agents.
- 2025-11-19 Supports Gemini 3 Pro.
- 2025-11-12 Supports data synchronization from Confluence, S3, Notion, Discord, and Google Drive.
@@ -141,7 +145,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
@@ -149,6 +153,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
- Python >= 3.13
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
@@ -188,12 +193,12 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, please use [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.24.0` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than `v0.24.0`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
> The command below downloads the `v0.25.6` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than `v0.25.6`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.24.0
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the entrypoint.sh file in the code matches the Docker image version.
- 🔨 [Launch the service from source for development](#-geliştirme-i̇çin-kaynaktan-hizmet-başlatma)
- 📚 [Documentation](#-dokümantasyon)
- 📜 [Roadmap](#-yol-haritası)
- 🏄 [Community](#-topluluk)
- 🙌 [Contributing](#-katkıda-bulunma)
</details>
## 💡 What is RAGFlow?
[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation ([RAG](https://ragflow.io/basics/what-is-rag)) engine based on deep document understanding. It fuses cutting-edge RAG technology with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to organizations of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill): Provides an official skill for accessing RAGFlow datasets via OpenClaw.
- 2025-12-26 Supports 'Memory' for AI agents.
- 2025-11-19 Supports Gemini 3 Pro.
- 2025-11-12 Supports data synchronization from Confluence, S3, Notion, Discord, and Google Drive.
- 2025-10-23 Supports MinerU and Docling as document parsing methods.
- 2025-10-15 Supports an orchestrable ingestion pipeline.
- 2025-08-08 Supports OpenAI's latest GPT-5 series models.
- 2025-08-01 Supports agentic workflow and MCP.
- 2025-05-23 Adds a Python/JavaScript code executor component to Agent.
- 2025-05-05 Supports cross-language queries.
- 2025-03-19 Supports using a multi-modal model to interpret images within PDF or DOCX files.
## 🎉 Stay Tuned
⭐️ Star our repository to stay up to date with exciting new features and improvements! Get instant notifications for new releases! 🌟
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Start up the server
1. Ensure `vm.max_map_count` >= 262144:
> To check the value of `vm.max_map_count`:
>
> ```bash
> $ sysctl vm.max_map_count
> ```
>
> If the value is lower than 262144, set it to at least 262144.
>
> ```bash
> # In this example, we set it to 262144:
> $ sudo sysctl -w vm.max_map_count=262144
> ```
>
> This change will be reset after a system reboot. To make your change permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
3. Start up the server using the pre-built Docker images:
> [!CAUTION]
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.25.6` edition of the RAGFlow Docker image. See the following table for the different RAGFlow editions. To download an edition other than `v0.25.6`, update the `RAGFLOW_IMAGE` variable in **docker/.env** before using `docker compose` to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.25.6
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
# Use CPU for DeepDoc tasks:
$ docker compose -f docker-compose.yml up -d
# To use GPU to accelerate DeepDoc tasks:
# sed -i '1i DEVICE=gpu' .env
# docker compose -f docker-compose.yml up -d
```
> Note: Prior to `v0.22.0`, we provided both images with embedding models and slim images without embedding models. Details as follows:
> Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tag.
4. Check the server status once the server is up and running:
```bash
$ docker logs -f docker-ragflow-cpu-1
```
_The following output confirms a successful launch of the system:_
```bash
____ ___ ______ ______ __
/ __ \ / | / ____// ____// /____ _ __
/ /_/ // /| | / / __ / /_ / // __ \| | /| / /
/ _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/
* Running on all addresses (0.0.0.0)
```
> If you skip this confirmation step and log in to RAGFlow directly, your browser may show a `network abnormal` error because RAGFlow may not be fully initialized at that moment.
>
5. In your web browser, enter the IP address of your server and log in to RAGFlow.
> With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** a port number), as the default HTTP serving port `80` can be omitted when using the default configurations.
>
6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM provider in `user_default_llm` and update the `API_KEY` field with the corresponding API key.
> See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
>
_The show is on!_
## 🔧 Configurations
When it comes to system configurations, you will need to manage the following files:
- [.env](./docker/.env): Keeps the fundamental setups for the system, such as `SVR_HTTP_PORT`, `MYSQL_PASSWORD`, and `MINIO_PASSWORD`.
- [service_conf.yaml.template](./docker/service_conf.yaml.template): Configures the back-end services. The environment variables in this file are populated automatically when the Docker container starts. Any environment variables set within the Docker container are available for use, allowing you to customize service behavior based on the deployment environment.
- [docker-compose.yml](./docker/docker-compose.yml): The system relies on [docker-compose.yml](./docker/docker-compose.yml) to start up.
> The [./docker/README](./docker/README.md) file provides a detailed description of the environment settings and service configurations that can be used as `${ENV_VARS}` in the [service_conf.yaml.template](./docker/service_conf.yaml.template) file.
To change the default HTTP serving port (80), edit [docker-compose.yml](./docker/docker-compose.yml) and change `80:80` to `<YOUR_SERVING_PORT>:80`.
Updates to the above configurations require a restart of all containers to take effect:
RAGFlow uses Elasticsearch by default for storing full text and vectors. To switch to [Infinity](https://github.com/infiniflow/infinity/), follow these steps:
1. Stop all running containers:
```bash
$ docker compose -f docker/docker-compose.yml down -v
```
> [!WARNING]
> The `-v` option will delete the Docker container volumes, and the existing data will be cleared.
2. Set `DOC_ENGINE` in **docker/.env** to `infinity`.
3. Start the containers:
```bash
$ docker compose -f docker-compose.yml up -d
```
> [!WARNING]
> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
## 🔧 Build a Docker image
This image is approximately 2 GB in size and relies on external LLM and embedding services.
description="Admin Service's client of [RAGFlow](https://github.com/infiniflow/ragflow). The Admin Service provides user management and system monitoring. "
decrypt(crypt(input_string)) == base64(input_string), which frontend and ragflow_cli use.
"""
pub="-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArq9XTUSeYr2+N1h3Afl/z8Dse/2yD0ZGrKwx+EEEcdsBLca9Ynmx3nIB5obmLlSfmskLpBo0UACBmB5rEjBp2Q2f3AG3Hjd4B+gNCG6BDaawuDlgANIhGnaTLrIqWrrcm4EMzJOnAOI1fgzJRsOOUEfaS318Eq9OVO3apEyCCt0lOQK6PuksduOjVxtltDav+guVAA068NrPYmRNabVKRNLJpL8w4D44sfth5RvZ3q9t+6RTArpEtc5sh5ChzvqPOzKGMXW83C95TxmXqpbK6olN4RevSfVjEAgCydH6HN6OhtOQEcnrU97r9H0iZOWwbw3pVrZiUkuRD1R56Wzs2wIDAQAB\n-----END PUBLIC KEY-----"
@@ -41,35 +42,25 @@ class AgentParam(LLMParam, ToolParamBase):
"""
    def __init__(self):
        self.meta: ToolMeta = {
            "name": "agent",
            "description": "This is an agent for a specific task.",
            "parameters": {
                "user_prompt": {
                    "type": "string",
                    "description": "This is the order you need to send to the agent.",
                    "default": "",
                    "required": True
                },
                "reasoning": {
                    "type": "string",
                    "description": (
                        "Supervisor's reasoning for choosing the this agent. "
                        "Explain why this agent is being invoked and what is expected of it."
                    ),
                    "required": True
                },
                "context": {
                    "type": "string",
                    "description": (
                        "All relevant background information, prior facts, decisions, "
                        "and state needed by the agent to solve the current query. "
                        "Should be as detailed and self-contained as possible."
                    ),
                    "required": True
                },
            }
        }
        self.meta: ToolMeta = {
            "name": "agent",
            "description": "This is an agent for a specific task.",
            "parameters": {
                "user_prompt": {"type": "string", "description": "This is the order you need to send to the agent.", "default": "", "required": True},
                "reasoning": {
                    "type": "string",
                    "description": ("Supervisor's reasoning for choosing the this agent. Explain why this agent is being invoked and what is expected of it."),
                    "required": True,
                },
                "context": {
                    "type": "string",
                    "description": (
                        "All relevant background information, prior facts, decisions, and state needed by the agent to solve the current query. Should be as detailed and self-contained as possible."
                    ),
                    "required": True,
                },
            },
        }
        super().__init__()
        self.function_name = "agent"
        self.tools = []
@@ -79,7 +70,6 @@ class AgentParam(LLMParam, ToolParamBase):
raise TypeError(f"List should be returned, but `{functions}`")
for f in functions:
    if not isinstance(f, dict):
        raise TypeError(f"An object type should be returned, but `{f}`")
tool_tasks = []
for func in functions:
    name = func["name"]
    args = func["arguments"]
    if name == COMPLETE_TASK:
        append_user_content(hist, f"Respond with a formal answer. FORGET(DO NOT mention) about `{COMPLETE_TASK}`. The language for the response MUST be as the same as the first user request.\n")
logging.exception(msg=f"Wrong JSON argument format in LLM ReAct response: {e}")
e = f"\nTool call error, please correct the input parameter of response format and call it again.\n *** Exception ***\n{e}"
append_user_content(hist, str(e))
logging.warning(f"Exceed max rounds: {self._param.max_rounds}")
final_instruction = f"""
{user_request}
IMPORTANT: You have reached the conversation limit. Based on ALL the information and research you have gathered so far, please provide a DIRECT and COMPREHENSIVE final answer to the original request.
Instructions:
1. SYNTHESIZE all information collected during this conversation
2. Provide a COMPLETE response using existing data - do not suggest additional research
3. Structure your response as a FINAL DELIVERABLE, not a plan
4. If information is incomplete, state what you found and provide the best analysis possible with available data
5. DO NOT mention conversation limits or suggest further steps
6. Focus on delivering VALUE with the information already gathered
Respond immediately with your final comprehensive answer.
"""
if self.check_if_canceled("Agent final instruction"):
# raise TypeError(f"List should be returned, but `{functions}`")
# for f in functions:
# if not isinstance(f, dict):
# raise TypeError(f"An object type should be returned, but `{f}`")
# tool_tasks = []
# for func in functions:
# name = func["name"]
# args = func["arguments"]
# if name == COMPLETE_TASK:
# append_user_content(hist, f"Respond with a formal answer. FORGET(DO NOT mention) about `{COMPLETE_TASK}`. The language for the response MUST be as the same as the first user request.\n")
# logging.exception(msg=f"Wrong JSON argument format in LLM ReAct response: {e}")
# e = f"\nTool call error, please correct the input parameter of response format and call it again.\n *** Exception ***\n{e}"
# append_user_content(hist, str(e))
# logging.warning( f"Exceed max rounds: {self._param.max_rounds}")
# final_instruction = f"""
# {user_request}
# IMPORTANT: You have reached the conversation limit. Based on ALL the information and research you have gathered so far, please provide a DIRECT and COMPREHENSIVE final answer to the original request.
# Instructions:
# 1. SYNTHESIZE all information collected during this conversation
# 2. Provide a COMPLETE response using existing data - do not suggest additional research
# 3. Structure your response as a FINAL DELIVERABLE, not a plan
# 4. If information is incomplete, state what you found and provide the best analysis possible with available data
# 5. DO NOT mention conversation limits or suggest further steps
# 6. Focus on delivering VALUE with the information already gathered
# Respond immediately with your final comprehensive answer.
# """
# if self.check_if_canceled("Agent final instruction"):
[English](./README.md) | [简体中文](./README_zh.md) | Türkçe
# Plugins
This folder contains the plugin mechanism of RAGFlow.
RAGFlow recursively loads plugins from the `embedded_plugins` subfolder.
## Supported plugin types
Currently, the only supported plugin type is `llm_tools`.
- `llm_tools`: A tool for the LLM to call.
## How to add a plugin
Adding an LLM tool plugin is simple: create a plugin file, put a class derived from `LLMToolPlugin` in it, then implement the `get_metadata` and `invoke` methods.
- `get_metadata` method: This method returns an `LLMToolMetadata` object that contains the description of the tool.
The description is provided to the LLM for calling and to the RAGFlow web frontend for display.
- `invoke` method: This method accepts the parameters generated by the LLM and returns a `str` containing the tool's execution result.
All of the tool's execution logic should go into this method.
2025-05-15 19:29:08,959 INFO 34670 Recursively importing plugins from path `/some-path/ragflow/agent/plugin/embedded_plugins`
2025-05-15 19:29:08,960 INFO 34670 Loaded llm_tools plugin BadCalculatorPlugin version 1.0.0
```
Or it may contain errors that you need to fix in your plugin.
### Example
We will demonstrate the process of adding a plugin by adding a calculator tool that gives wrong answers.
First, create a plugin file named `bad_calculator.py` under the `embedded_plugins/llm_tools` folder.
Then, we create a `BadCalculatorPlugin` class derived from the `LLMToolPlugin` base class:
```python
class BadCalculatorPlugin(LLMToolPlugin):
    _version_ = "1.0.0"
```
The `_version_` field is required and specifies the version number of the plugin.
Our calculator takes two numbers, `a` and `b`, as inputs, so we add the following `invoke` method to our `BadCalculatorPlugin` class:
```python
def invoke(self, a: int, b: int) -> str:
    return str(a + b + 100)
```
The `invoke` method will be called by the LLM. It can have many parameters, but its return type must be `str`.
Finally, we need to add a `get_metadata` method to tell the LLM how to use our `bad_calculator` tool:
```python
@classmethod
def get_metadata(cls) -> LLMToolMetadata:
    return {
        # Name of this tool, provided to the LLM
        "name": "bad_calculator",
        # Display name of this tool, provided to the RAGFlow frontend
        "displayName": "$t:bad_calculator.name",
        # Description of how to use this tool, provided to the LLM
        "description": "A tool to calculate the sum of two numbers (will give wrong answer)",
        # Description of this tool, provided to the RAGFlow frontend
The `get_metadata` method is a `classmethod`. It provides the description of this tool to the LLM.
Fields starting with `display` can use a special notation, `$t:xxx`, which uses the internationalization (i18n) mechanism of the RAGFlow frontend to fetch text from the `llmTools` category. If you don't use this notation, the frontend displays the text you write here as-is.
Now our tool is ready. You can select it in the `Generate Answer` component and try it out.
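
The pieces of the walkthrough can be assembled into one file. The sketch below is self-contained so it runs outside RAGFlow: `LLMToolPlugin` and `LLMToolMetadata` are hypothetical stand-ins defined here (the real ones come from RAGFlow's plugin package, whose import path is not shown above), and the metadata is limited to the three fields the walkthrough shows.

```python
from typing import TypedDict


class LLMToolMetadata(TypedDict):
    """Stand-in type (assumption): only the fields shown in the walkthrough."""
    name: str
    displayName: str
    description: str


class LLMToolPlugin:
    """Stand-in base class (assumption) so the example runs on its own."""
    _version_: str

    @classmethod
    def get_metadata(cls) -> LLMToolMetadata:
        raise NotImplementedError

    def invoke(self, **kwargs) -> str:
        raise NotImplementedError


class BadCalculatorPlugin(LLMToolPlugin):
    _version_ = "1.0.0"

    @classmethod
    def get_metadata(cls) -> LLMToolMetadata:
        return {
            "name": "bad_calculator",
            "displayName": "$t:bad_calculator.name",
            "description": "A tool to calculate the sum of two numbers (will give wrong answer)",
        }

    def invoke(self, a: int, b: int) -> str:
        # Deliberately wrong: adds 100 to the real sum.
        return str(a + b + 100)


result = BadCalculatorPlugin().invoke(a=1, b=2)  # the string "103"
```

In a real plugin file, the two stand-in classes would be replaced by imports from RAGFlow, and the metadata would carry the remaining fields that the walkthrough's `get_metadata` goes on to define.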
> `matplotlib` uses the `Agg` (non-interactive) backend by default in the sandbox (`MPLBACKEND=Agg`). No display server is available, so always save figures to files (e.g. `fig.savefig("artifacts/chart.png")`) rather than calling `plt.show()`.
>
> Tip: if Chinese text renders as missing boxes/squares in `matplotlib`, install the Debian package `fonts-noto-cjk` in your custom image. We do not preinstall it by default to keep the base image smaller. The sandbox base image ships a `matplotlibrc` that already lists common CJK fonts in the `font.sans-serif` fallback chain, so no code-level font configuration is needed: just install the font package and rebuild the image.
## Only overrides are listed; all other settings use matplotlib built-in defaults.
# Prefer CJK-capable fonts so Chinese / Japanese / Korean text renders correctly.
# matplotlib silently skips fonts that are not installed, falling back to the
# next entry in the list, so this is safe even without any CJK font package.
font.family: sans-serif
font.sans-serif: Noto Sans CJK SC, Noto Sans CJK TC, Noto Sans CJK JP, Noto Sans CJK KR, Source Han Sans SC, Source Han Sans CN, WenQuanYi Zen Hei, Microsoft YaHei, SimHei, PingFang SC, Heiti SC, STHeiti, Arial Unicode MS, DejaVu Sans, Bitstream Vera Sans, Computer Modern Sans Serif, Lucida Grande, Verdana, Geneva, Lucid, Arial, Helvetica, Avant Garde, sans-serif
# Use ASCII hyphen-minus for the minus sign so it renders correctly with any font.
| `/sandboxes/{id}/processes/cmd` | POST | Execute shell command |
## FAQ
### Q: Why is account_id required?
**A**: The Code Interpreter API requires the `X-Acs-Parent-Id` (Aliyun primary account ID) header for authentication. This is a required parameter for the Aliyun Code Interpreter API.
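As a rough illustration of the answer above, a client could refuse to build a request when the account ID is missing. The `build_auth_headers` helper is hypothetical; only the `X-Acs-Parent-Id` header name comes from the text, and any other headers the API may need are not shown.

```python
def build_auth_headers(account_id: str) -> dict:
    """Return the header carrying the Aliyun primary account ID (hypothetical helper)."""
    account_id = account_id.strip()
    if not account_id:
        # Fail early rather than send a request that cannot be authenticated.
        raise ValueError("account_id is required for the X-Acs-Parent-Id header")
    return {"X-Acs-Parent-Id": account_id}


# "1234567890" is a placeholder, not a real account ID.
print(build_auth_headers("1234567890"))
```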
### Q: Can the 30-second timeout limit be bypassed?
**A**: No. This is a **hard limit** of Aliyun Code Interpreter and cannot be bypassed through configuration or request parameters. If your code execution exceeds 30 seconds, consider:
1. Optimizing the code logic
2. Processing data in batches
3. Using contexts to maintain state
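The batching suggestion can be sketched as follows. The `batches` helper is illustrative and not part of any Aliyun SDK; the right batch size depends on how long each item takes and has to be found by measurement.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence, size: int) -> Iterator[List]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Each batch would be submitted as its own execution so that no single
# call runs long enough to hit the 30-second limit.
for batch in batches(list(range(10)), 4):
    print(batch)
```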
### Q: Can the old OpenSandbox configuration still be used?
"results":[{"type":"stderr","text":"Traceback..."},{"type":"error","text":"NameError: name 'x' is not defined"},{"type":"endOfExecution","status":"error"}]
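A response fragment like the one above could be reduced to a short summary as sketched below. Wrapping the fragment in an outer JSON object and the `summarize_results` helper are assumptions; only the three entry types shown (`stderr`, `error`, `endOfExecution`) are handled.

```python
import json

# The fragment from the text, wrapped in braces (an assumption) so it parses as JSON.
SAMPLE = (
    '{"results":[{"type":"stderr","text":"Traceback..."},'
    '{"type":"error","text":"NameError: name \'x\' is not defined"},'
    '{"type":"endOfExecution","status":"error"}]}'
)


def summarize_results(results: list) -> dict:
    """Collect stderr text, error messages, and the final status from result entries."""
    summary = {"stderr": [], "errors": [], "status": None}
    for item in results:
        kind = item.get("type")
        if kind == "stderr":
            summary["stderr"].append(item.get("text", ""))
        elif kind == "error":
            summary["errors"].append(item.get("text", ""))
        elif kind == "endOfExecution":
            summary["status"] = item.get("status")
    return summary


summary = summarize_results(json.loads(SAMPLE)["results"])
print(summary["status"], summary["errors"])
```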