ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-05 10:58:34 +08:00

Author	SHA1	Message	Date
Panda Dev	6bfe0f9a10	Go: implement Encode (embeddings) in OpenAI driver (#14630) ### What problem does this PR solve? The OpenAI Go driver landed in #14605 with chat, list models, and check connection. Encode was left as a stub that returns `not implemented`. `conf/models/openai.json` already lists three embedding models out of the box: - text-embedding-ada-002 - text-embedding-3-small - text-embedding-3-large So a tenant who picked one of these in the Go layer could not actually run an embedding call. This PR fills the gap. ### What this PR includes - `conf/models/openai.json`: add `"embedding": "embeddings"` under `url_suffix` so the driver can build the URL from config. This matches the `URLSuffix.Embedding` field used by other drivers (siliconflow, zhipu-ai). - `internal/entity/models/openai.go`: replace the Encode stub with a real implementation that POSTs to `/v1/embeddings`. Adds a small local response type `openaiEmbeddingResponse`. No factory change. No interface change. ### How the implementation works - Validate `apiConfig` and the API key, validate the model name. Use the existing `baseURLForRegion` helper so an unknown region fails fast with a clear error. - Wrap the request with `context.WithTimeout(nonStreamCallTimeout)` so the call has a clear deadline. Same pattern as `ChatWithMessages` and `ListModels` already use in this file. - Send all input texts in one request. The OpenAI API accepts the `input` field as an array. - Parse `data[].embedding` and copy each slice into a `[][]float64` indexed by `data[].index` so the output order matches the input order even if the API returns items in a different order. - Handle both `float64` and `float32` element types, the way the SiliconFlow driver does. - An empty input slice returns `[][]float64{}` with no HTTP call. - Non-200 responses propagate the upstream status line and body. - A final pass checks that every input slot got a vector. If any slot is still nil, return a clear error so the caller does not silently use a zero vector. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image (the go.mod minimum) returns exit 0. - The full method set on `OpenAIModel` still matches the `ModelDriver` interface. - Pattern parity with the existing SiliconFlow Encode implementation (`internal/entity/models/siliconflow.go`). Closes #14629 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-10 10:31:37 +08:00
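The index-based reordering and the final nil-slot check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: `embeddingItem` and `orderEmbeddings` are hypothetical names, and the element type is assumed to be `float64` only.

```go
package main

import "fmt"

// embeddingItem mirrors one entry of the `data` array in an
// OpenAI-compatible embeddings response (hypothetical local type).
type embeddingItem struct {
	Index     int       `json:"index"`
	Embedding []float64 `json:"embedding"`
}

// orderEmbeddings places each vector at the slot named by its index, so the
// output order matches the input order even if items arrive shuffled.
// n is the number of input texts that were sent.
func orderEmbeddings(n int, items []embeddingItem) ([][]float64, error) {
	out := make([][]float64, n)
	for _, it := range items {
		if it.Index < 0 || it.Index >= n {
			return nil, fmt.Errorf("embedding index %d out of range [0,%d)", it.Index, n)
		}
		// Copy so the result does not alias the decoded response.
		vec := make([]float64, len(it.Embedding))
		copy(vec, it.Embedding)
		out[it.Index] = vec
	}
	// Final pass: every input must have received a vector.
	for i, v := range out {
		if v == nil {
			return nil, fmt.Errorf("missing embedding for input %d", i)
		}
	}
	return out, nil
}
```

Calling `orderEmbeddings(2, items)` with items carrying indices 1 and 0 returns the vectors in input order; dropping one item produces an error instead of a silent zero vector.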
Haruko386	7931b693dc	Go: implement provider: Baidu (#14741 ) ### What problem does this PR solve? This PR completes the Baidu Qianfan provider integration in RAGFlow. The following functionalities are now supported: - [x] Chat / Think Chat / Stream Chat / Stream Think Chat - [x] Embedding - [x] Rerank - [x] Model listing - [x] Provider connection checking - [ ] Balance ----- Verified examples from the CLI: ```plaintext RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16; +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 16 \| 0 \| \| 16 \| 1 \| +-----------+-------+ RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'qwen3-reranker-4b@test@baidu' top 2; +-------+---------------------+ \| index \| relevance_score \| +-------+---------------------+ \| 0 \| 0.974821150302887 \| \| 1 \| 0.14223189651966095 \| \| 2 \| 0.08632347732782364 \| +-------+---------------------+ RAGFlow(user)> think chat with 'deepseek-v3.2@test@baidu' message 'who r u' Thinking: Hmm, the user is asking for a simple introduction. This is straightforward – no need for overcomplication. I should give a clear, friendly response that covers my basic identity as an AI assistant, my purpose, and my capabilities. Keeping it concise but informative is key here. Mentioning my creator Anthropic adds credibility, and ending with an offer to help invites further interaction. No need for technical details unless the user asks later. Answer: Hello! I'm an AI assistant created by Anthropic, designed to help with a wide variety of tasks. You can think of me as a helpful digital companion—I can answer questions, assist with writing, help solve problems, provide explanations, and engage in conversation on many topics. I'm here to help with whatever you need! How can I assist you today? 
Time: 8.103902 RAGFlow(user)> stream think chat with 'deepseek-v3.2@test@baidu' message 'who r u' Thinking: mm, the user is asking "who r u" with casual spelling. This is a straightforward identity question. should give a clear, friendly introduction without overcomplicating it. Can start with my core function as an AI assistant, mention my creator, and briefly state my key capabilities. response should be welcoming and invite further interaction since this seems like an introductory question. Keeping it concise but covering the essentials: who I am, what I do, and how I can help. Answer: ! I am DeepSeek, an AI assistant created by DeepSeek Company. I'm designed to help answer questions, provide information, assist with various tasks, and engage in conversations on a wide range of topics. I'm here to assist you with whatever you need - whether it's answering questions, helping with analysis, writing, coding, or just having a friendly chat!Is there anything specific I can help you with today? 😊 Time: 7.219703 RAGFlow(user)> list supported models from 'baidu' 'test' +--------------------------------------+ \| model_name \| +--------------------------------------+ \| ernie-3.5-8k-preview \| \| ernie-4.0-8k \| \| ernie-4.0-turbo-8k-latest \| \| ernie-4.0-turbo-8k-preview \| \| ernie-4.0-8k-preview \| \| ernie-speed-pro-128k \| \| ernie-char-fiction-8k \| \| ernie-3.5-8k \| \| ernie-3.5-128k \| \| ernie-lite-pro-128k \| \| ernie-novel-8k \| \| ernie-4.0-turbo-8k \| \| ernie-4.0-turbo-128k \| \| ernie-4.0-8k-latest \| \| irag-1.0 \| \| ........... \| \| glm-5.1 \| \| ernie-image-turbo \| \| deepseek-v4-pro \| \| deepseek-v4-flash \| \| ernie-5.1 \| +--------------------------------------+ RAGFlow(user)> check instance 'test' from 'baidu' SUCCESS ``` Additionally, this PR fixes an incorrect error message typo: Before: ```go fmt.Errorf("API requestssss failed with status %d: %s : %s", ...) ``` After: ```go fmt.Errorf("API request failed with status %d: %s", ...) 
``` This PR mainly improves provider compatibility, API completeness, and runtime stability. ### Type of change * [x] Bug Fix (non-breaking change which fixes an issue) * [x] New Feature (non-breaking change which adds functionality) * [x] Refactoring	2026-05-09 19:21:13 +08:00
Jin Hai	17d71e5d79	Go CLI: embed and rerank (#14735) ### What problem does this PR solve? ``` RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16; +-----------+-------+ | dimension | index | +-----------+-------+ | 16 | 0 | | 16 | 1 | +-----------+-------+ RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'rerank@test@zhipu-ai' top 2; +-------+-----------------+ | index | relevance_score | +-------+-----------------+ | 0 | 1 | | 2 | 0.99999976 | +-------+-----------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-09 17:41:54 +08:00
Lynn	efe6d23d61	Fix: handle id as keyword (#14729 ) ### What problem does this PR solve? Update mapping.json to treat id as a keyword. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-09 17:41:08 +08:00
Haruko386	ee0de58204	Go: implement provider: HuggingFace (#14722 ) ### What problem does this PR solve? Implement `HuggingFace` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-09 13:36:03 +08:00
Joseff	2ad854c586	Go: implement Rerank in Aliyun driver (#14676 ) ### What problem does this PR solve? The Aliyun Go driver has a stub `Rerank` method that always returns `"Aliyun, Rerank not implemented"`. DashScope exposes an OpenAI-compatible rerank endpoint (`compatible-mode/v1/rerank`) and hosts dedicated bilingual rerankers (`gte-rerank-v2`, `gte-rerank`) that are a natural pairing with the embedding models already in `aliyun.json`. Without this, Aliyun users cannot use reranking within RAGFlow. Closes #14675 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-08 20:21:04 +08:00
Haruko386	94f82acd03	Fix(Go): prevent global state pollution in local model connection check (#14669 ) ### What problem does this PR solve? 1. Fix Global State Pollution in Local Providers (Critical Bug): - Resolved a severe concurrency and architecture issue in `model_service.go`. Previously, `ListSupportedModels` would permanently overwrite the global provider singleton with a localized URL instance (`driver.NewInstance`). This caused cross-request contamination in multi-tenant environments. - Fixed `CheckProviderConnection` for local models (LM Studio, vLLM, Ollama). It now properly creates a localized driver copy and injects the `base_url` before testing the connection, entirely eliminating the false-positive `missing base URL` error without polluting the global state. 2. Implement `VolcEngine` Embeddings: - Fully implemented the `Encode` method for the `volcengine` provider, enabling text embedding capabilities for VolcEngine models. 3. Enhance Region Validation in `SiliconFlow`: - Added a strict empty string check (`*apiConfig.Region != ""`) alongside the existing `nil` check when parsing regions. This ensures that if an empty string is passed, the system safely falls back to the `"default"` region, preventing malformed URL requests and `unsupported protocol scheme` errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-08 15:54:27 +08:00
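The global-state fix described in the commit above comes down to copying the shared driver before injecting a per-request URL. A minimal sketch of that idea, assuming a hypothetical `driver` struct and `withBaseURL` method (the real `model_service.go` and `NewInstance` are not shown in this log):

```go
package main

// driver stands in for a provider driver registered as a global singleton
// (hypothetical type; field names are assumptions).
type driver struct {
	name    string
	baseURL string
}

// withBaseURL returns a copy of the driver that points at url.
// The receiver is never modified, so a shared singleton stays clean
// across requests from different tenants.
func (d *driver) withBaseURL(url string) *driver {
	clone := *d // shallow copy is enough for these value fields
	clone.baseURL = url
	return &clone
}
```

A connection check for a local provider would then run against `shared.withBaseURL(tenantURL)` rather than assigning to `shared.baseURL`.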
Panda Dev	a82ae4a991	Go: implement Encode (embeddings) in Aliyun driver (#14647) ### What problem does this PR solve? The Aliyun Go driver shipped with a stub `Encode` method that returned `no such method`, even though `conf/models/aliyun.json` already wires the OpenAI-compatible embeddings URL suffix at `compatible-mode/v1/embeddings`. The same config also did not list any embedding models, so the picker had nothing to select. So an Aliyun tenant who wanted to use Tongyi text-embedding-v3 or v4 in the Go layer could not, even though the upstream endpoint is public and uses the standard `POST /v1/embeddings` shape that the SiliconFlow and ZhipuAI drivers already support. This PR fills the gap. ### What this PR includes - `conf/models/aliyun.json`: add `text-embedding-v4` and `text-embedding-v3` to the `models` array. - `internal/entity/models/aliyun.go`: replace the `Encode` stub with a real implementation. Adds a small local response type that matches the OpenAI-compatible shape. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, validate the model name, resolve the region with a default fallback, build the URL from `BaseURL[region] + URLSuffix.Embedding`. - Send all input texts in one request as the `input` array, the same OpenAI-compatible shape the SiliconFlow `Encode` uses. - Parse `data[].embedding` and copy each slice into a `[][]float64` indexed by `data[].index` so the output order matches the input order even if the API returns items in a different order. - Handle both `float64` and `float32` element types. - Empty input returns `[][]float64{}` with no HTTP call. - Non-200 responses propagate the upstream status line and body. - A final pass checks every input slot got a vector and returns a clear error if any slot is still nil. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image returns exit 0. - The full method set on `AliyunModel` still matches the `ModelDriver` interface. - Pattern parity with the existing SiliconFlow Encode implementation. Closes #14646 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-08 13:58:25 +08:00
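The single-request `input` array and the empty-input short circuit from the commit above might look like the sketch below. `embeddingRequest` and `buildEmbeddingBody` are hypothetical names; only the `model` and `input` fields are taken from the OpenAI-compatible shape the commit refers to.

```go
package main

import "encoding/json"

// embeddingRequest is the OpenAI-compatible request body: one model name
// and all input texts in a single array (hypothetical local type).
type embeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

// buildEmbeddingBody serializes the request. It returns a nil body for an
// empty input so the caller can skip the HTTP call entirely.
func buildEmbeddingBody(model string, texts []string) ([]byte, error) {
	if len(texts) == 0 {
		return nil, nil
	}
	return json.Marshal(embeddingRequest{Model: model, Input: texts})
}
```

A caller that gets a nil body back would return an empty `[][]float64{}` without contacting the endpoint.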
Haruko386	d13a240dc0	Go: implement remaining interface for OpenRouter (#14657) ### What problem does this PR solve? 1. Implement the `rerank`, `embedding`, `balance`, and `checkConnection` methods for `OpenRouter`. 2. Delete the `chat` method in `internal/entity/models/volcengine.go`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-08 13:56:45 +08:00
Panda Dev	d8d49df35e	Go: implement Rerank in Gitee AI driver (#14656) ### What problem does this PR solve? The Gitee AI Go driver shipped with a stub `Rerank` method that returned `Rerank not implemented`, even though `conf/models/gitee.json` already wires the rerank URL suffix at `"rerank": "rerank"`. The same config did not list any rerank model, so the picker had nothing to select. So a Gitee tenant could not use BAAI/bge-reranker-v2-m3 as a reranker through the Go layer today, even though the infrastructure was one config entry and one method body away. ### What this PR includes - `conf/models/gitee.json`: add `BAAI/bge-reranker-v2-m3` to the `models` array. - `internal/entity/models/gitee.go`: replace the `Rerank` stub with a real implementation. Adds two small local types that match the OpenAI-compatible `/rerank` shape already used by the SiliconFlow and ZhipuAI drivers. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, validate the model name, resolve the region with a default fallback, build the URL from `BaseURL[region] + URLSuffix.Rerank`. - Use a per-call `context.WithTimeout(30s)` and `http.NewRequestWithContext`, matching the pattern the recently merged Aliyun Encode and the OpenAI driver already use. - Send `{model, query, documents, top_n, return_documents:false}` in the body. - Parse `results[].relevance_score` and copy each score into the output slice indexed by `results[].index`, so the output order matches the input order even if the API returns items in a different order. - Empty input returns `[]float64{}` with no HTTP call. - An out-of-range result index returns a clear error rather than silently skipping the entry. - Non-200 responses propagate the upstream status line and body. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image returns exit 0. - The full method set on `GiteeModel` still matches the `ModelDriver` interface. - Pattern parity with the existing SiliconFlow Rerank and the recently merged ZhipuAI Rerank (#14608). Closes #14655	2026-05-08 13:08:22 +08:00
Panda Dev	e729eced45	Go: implement Balance in DeepSeek driver (#14632) Closes #14631 ### What problem does this PR solve? The DeepSeek Go driver shipped with a stub `Balance` method that returned `no such method`, even though DeepSeek exposes a public `GET /user/balance` endpoint that works with the same Bearer token used for chat. So the "Balance" panel in the model provider UI always shows an error for DeepSeek tenants, while it already works for Moonshot and Gitee. This PR fills the gap. ### What this PR includes - `conf/models/deepseek.json`: add `"balance": "user/balance"` under `url_suffix` so the driver can build the URL from config the same way the other endpoints do. - `internal/entity/models/deepseek.go`: replace the `Balance` stub with a real implementation. Adds a small local response type `deepseekBalanceResponse` that matches the upstream shape. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, resolve the region (with a `default` fallback), and build the URL from `BaseURL[region] + URLSuffix.Balance`. - GET the URL with `Authorization: Bearer <api_key>`. - Parse the upstream response: ```json { "is_available": true, "balance_infos": [ {"currency": "USD", "total_balance": "10.00", ...}, {"currency": "CNY", "total_balance": "70.00", ...} ] } ``` `total_balance` is a string in the upstream API, so the driver parses it with `strconv.ParseFloat`. - Return the first balance entry as `{"balance": <float>, "currency": <string>}`, the same shape the Moonshot driver returns. The UI can render it with no provider-specific code. ### Edge cases - Missing or empty API key returns a clear local error before any HTTP call. - Empty `balance_infos` returns a clear "no balance info in response" error rather than a zero-value silent success. - Non-numeric `total_balance` returns a clear parse error. - Non-200 responses propagate the upstream status line and body so the user can see why the call failed. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image (the go.mod minimum) returns exit 0. - The full method set on `DeepSeekModel` still matches the `ModelDriver` interface. - Pattern parity with the existing Moonshot and Gitee Balance implementations.	2026-05-08 12:03:39 +08:00
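The response parsing and edge cases listed in the commit above can be illustrated with a short sketch. `balanceInfo`, `balanceResponse`, and `parseBalance` are hypothetical names (the commit's own type is `deepseekBalanceResponse`, whose definition is not shown here); only the JSON fields quoted in the commit are modeled.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
)

// balanceInfo models one entry of `balance_infos`; total_balance is a
// string upstream, so it is decoded as a string and parsed afterwards.
type balanceInfo struct {
	Currency     string `json:"currency"`
	TotalBalance string `json:"total_balance"`
}

type balanceResponse struct {
	IsAvailable  bool          `json:"is_available"`
	BalanceInfos []balanceInfo `json:"balance_infos"`
}

// parseBalance returns the first balance entry as {"balance", "currency"}.
func parseBalance(body []byte) (map[string]interface{}, error) {
	var resp balanceResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode balance response: %w", err)
	}
	if len(resp.BalanceInfos) == 0 {
		return nil, errors.New("no balance info in response")
	}
	first := resp.BalanceInfos[0]
	amount, err := strconv.ParseFloat(first.TotalBalance, 64)
	if err != nil {
		return nil, fmt.Errorf("parse total_balance %q: %w", first.TotalBalance, err)
	}
	return map[string]interface{}{"balance": amount, "currency": first.Currency}, nil
}
```

An empty `balance_infos` array or a non-numeric `total_balance` yields an error rather than a zero balance.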
Haruko386	a377512110	Go: implement provider: OpenRouter (#14652 ) ### What problem does this PR solve? 1. Implement `OpenRouter` Provider: Fully support OpenRouter AI models (e.g., `gemma`, `minimax`). Includes robust handling of Server-Sent Events (SSE) streams, error event interception, and proper parsing of both `reasoning_content` and standard `content`. 2. Fix BaseURL Resolution Bug: Fixed a critical edge case in region configuration parsing. Added a strict empty string check (`*apiConfig.Region != ""`) alongside the `nil` check. This ensures that if the UI passes an empty string, the system correctly falls back to the `"default"` region, preventing `unsupported protocol scheme ""` errors during HTTP requests. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-08 12:02:37 +08:00
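The region fix described in the commit above (a `nil` check plus an empty-string check before falling back to `"default"`) can be sketched as follows. `resolveRegion` and `baseURLFor` are hypothetical helper names; the `*string` region and the `"default"` key come from the commit text.

```go
package main

// resolveRegion returns the configured region, or "default" when the
// pointer is nil or points at an empty string (as a UI may send).
func resolveRegion(region *string) string {
	if region != nil && *region != "" {
		return *region
	}
	return "default"
}

// baseURLFor looks up the base URL for the resolved region. The boolean
// reports whether the region is known, so callers can fail early instead
// of issuing a request against an empty URL.
func baseURLFor(baseURLs map[string]string, region *string) (string, bool) {
	url, ok := baseURLs[resolveRegion(region)]
	return url, ok
}
```

Without the empty-string check, an empty region would select no base URL and the HTTP client would report `unsupported protocol scheme ""`.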
Panda Dev	a86e0ca0ca	Go: implement Balance in SiliconFlow driver (#14643) ### What problem does this PR solve? The SiliconFlow Go driver shipped with a stub `Balance` method that returned `no such method`, even though SiliconFlow exposes a public `GET /v1/user/info` endpoint that returns the account balance per currency. So the "Balance" panel in the model provider UI always shows an error for SiliconFlow tenants, while it already works for Moonshot and Gitee. This PR fills the gap. ### What this PR includes - `conf/models/siliconflow.json`: add `"balance": "user/info"` under `url_suffix` so the driver builds the URL from config. - `internal/entity/models/siliconflow.go`: replace the `Balance` stub with a real implementation. Adds a small local response type that matches the upstream shape. No factory change. No interface change. ### How the driver works - Validate `apiConfig` and the API key, resolve the region with a default fallback, and build the URL from `BaseURL[region] + URLSuffix.Balance`. - GET the URL with `Authorization: Bearer <api_key>`. - Parse the upstream response. SiliconFlow returns balance fields as strings, so the driver parses them with `strconv.ParseFloat`. It prefers `totalBalance` over `balance` when both are present. - Return `{"balance": <float>, "currency": "CNY"}`, the same shape the Moonshot driver returns. The UI can render it with no provider-specific code. ### Edge cases - Missing or empty API key returns a clear local error before any HTTP call. - An unknown region falls back to the default base URL. - Empty `balance` and `totalBalance` returns a clear "no balance info in response" error rather than a zero-value silent success. - Non-numeric balance string returns a clear parse error. - Non-200 responses propagate the upstream status line and body. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - `go build ./internal/entity/models/...` in a clean go 1.25 image returns exit 0. - The full method set on `SiliconflowModel` still matches the `ModelDriver` interface. - Pattern parity with the existing Moonshot and Gitee Balance implementations. Closes #14642	2026-05-08 12:01:10 +08:00
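The field preference in the commit above (`totalBalance` first, then `balance`, with an error when both are empty) reduces to a small selection step. The sketch below assumes the two string fields have already been decoded from the response; `pickBalance` is a hypothetical name and the enclosing JSON layout is deliberately not modeled.

```go
package main

import (
	"errors"
	"strconv"
)

// pickBalance prefers totalBalance over balance and parses the chosen
// string as a float. Both empty means the response carried no balance.
func pickBalance(totalBalance, balance string) (float64, error) {
	raw := totalBalance
	if raw == "" {
		raw = balance
	}
	if raw == "" {
		return 0, errors.New("no balance info in response")
	}
	// A non-numeric string surfaces as a strconv parse error.
	return strconv.ParseFloat(raw, 64)
}
```

The parsed value would then be returned alongside the fixed `"CNY"` currency the commit mentions.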
Magicbook1108	c29335cbff	Feat: support local provider for code exec component & remove some outdated models (#14637 ) ### What problem does this PR solve? Feat: support local provider for code exec component & remove some outdated models ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-07 21:23:13 +08:00
Haruko386	078ea3bf4a	Go: implement provider: Nvidia (#14623 ) ### What problem does this PR solve? 1. Implement `Nvidia` Provider: Fully support NVIDIA NIM APIs with robust parameter handling (including the `thinking` parameter) and safe URL merging in `NewInstance`. 2. Fix Misleading CLI Errors: Corrected a bug in `common_command.go` where failed chat requests inaccurately reported `failed to list instance models`. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-07 14:17:57 +08:00
Panda Dev	b8b741555f	Go: implement provider: OpenAI (#14605 ) ### What problem does this PR solve? Add a Go driver for OpenAI (GPT models). The config file conf/models/openai.json has been in the repo for a while with the full GPT-5 model list, but internal/entity/models/factory.go had no case for "openai". So any tenant that configured OpenAI as a model provider in the Go layer fell through to the default branch and got the dummy driver. Chat, list models, and check connection all returned dummy responses instead of reaching the API. OpenAI is the most commonly requested provider and the JSON config already ships with the repo, so this gap is high impact even though the JSON has been there for some time. ### What this PR includes - New file internal/entity/models/openai.go with an OpenAIModel that implements the ModelDriver interface. - factory.go: route the "openai" provider name to NewOpenAIModel. - conf/models/openai.json: add "models": "models" under url_suffix so ListModels can hit /v1/models with no hardcoded fallback. ### How the driver works - OpenAI exposes the canonical OpenAI-compatible API at https://api.openai.com/v1. - ChatWithMessages and ChatStreamlyWithSender post to /chat/completions in the same shape the moonshot, vllm, and xai drivers use. - ListModels and CheckConnection call /models to list available ids and confirm the API key works. - reasoning_content is passed through for the o-series and other reasoning models, in both the non-stream and stream paths. - Encode (embeddings) is left as "not implemented" for now, the same way the other recent provider drivers do it. Rerank and Balance are not part of OpenAI's public API surface in this layer and return a clear "not implemented" or "no such method" error. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - go build ./internal/entity/models/... in a clean go 1.25 image (the go.mod minimum) returns exit 0 with no errors. 
- Method set of OpenAIModel matches the ModelDriver interface: NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode, Rerank, ListModels, Balance, CheckConnection. - Pattern parity with the merged moonshot (#14433), volcengine (#14460), minimax (#14478), vllm (#14532), xai (#14550), and lm-studio (#14586) PRs. Closes #14604	2026-05-07 13:09:51 +08:00
Haruko386	dd7a0ce1d3	Go: implement provider: lm-studio (#14586 ) ### What problem does this PR solve? implement `lm-studio` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-06 19:23:11 +08:00
Haruko386	cd54c08e84	Go: implement provider: Ollama (#14580 ) ### What problem does this PR solve? implement `Ollama` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-06 12:03:58 +08:00
Yingfeng	4ee0702aed	Feat: add skills space to context engine (#13908 ) ### What problem does this PR solve? issue #13714 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-30 12:36:03 +08:00
FuturMix	2548c28d65	feat: add FuturMix as model provider (#14419 ) ## Summary Add [FuturMix](https://futurmix.ai) as a new model provider. FuturMix is an OpenAI-compatible unified AI gateway that provides access to 22+ models (GPT, Claude, Gemini, DeepSeek, and more) through a single API endpoint and key. - API Base: `https://futurmix.ai/v1` (OpenAI-compatible) - Supported capabilities: Chat, Embedding, Image2Text, TTS, Speech2Text, Rerank ### Changes \| File \| Change \| \|------\|--------\| \| `rag/llm/__init__.py` \| Add `FuturMix` to `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` \| \| `rag/llm/chat_model.py` \| Add `FuturMixChat(Base)` — follows Astraflow/Avian pattern \| \| `rag/llm/embedding_model.py` \| Add `FuturMixEmbed(OpenAIEmbed)` — follows Astraflow pattern \| \| `rag/llm/cv_model.py` \| Add `FuturMixCV(GptV4)` — follows SILICONFLOW/OpenRouter pattern \| \| `rag/llm/tts_model.py` \| Add `FuturMixTTS(OpenAITTS)` — follows CometAPI/DeerAPI pattern \| \| `rag/llm/sequence2txt_model.py` \| Add `FuturMixSeq2txt(GPTSeq2txt)` — follows StepFun pattern \| \| `rag/llm/rerank_model.py` \| Add `FuturMixRerank(OpenAI_APIRerank)` \| \| `conf/llm_factories.json` \| Add factory config with 8 chat, 2 embedding, 1 image2text, 2 TTS, 1 speech2text models \| \| `docs/guides/models/supported_models.mdx` \| Add FuturMix to supported models table \| ### Models included - Chat: claude-sonnet-4-20250514, claude-3.5-haiku, gpt-4o, gpt-4o-mini, gemini-2.5-flash, gemini-2.0-flash, deepseek-chat, deepseek-reasoner - Embedding: text-embedding-3-small, text-embedding-3-large - Image2Text: gpt-4o - TTS: tts-1, tts-1-hd - Speech2Text: whisper-1 ## Test plan - [ ] Verify FuturMix appears in the model provider list in RAGFlow UI - [ ] Configure FuturMix with API key and test chat completion - [ ] Test embedding model with document indexing - [ ] Test image2text with a sample image 🤖 Generated with [Claude 
Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-30 10:59:37 +08:00
Haruko386	0e1477eb23	Go: implement provider: MiniMax (#14478 ) ### What problem does this PR solve? implement MiniMax provider ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-04-29 19:06:40 +08:00
Jin Hai	bb05a8bd7e	Update create model instance command (#14441 ) ### What problem does this PR solve? 1. support command: ``` RAGFlow(user)> create provider 'vllm' instance 'test' key 'test-key' url 'base-url' region 'abc'; SUCCESS RAGFlow(user)> list instances from 'vllm'; +----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+ \| apiKey \| extra \| id \| instanceName \| providerID \| status \| +----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+ \| test-key \| {"base_url":"base-url","region":"abc"} \| 40213c89430311f1a7cf38a74640adcc \| test \| b4d40e6142d311f1a4f938a74640adcc \| enable \| +----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+ ``` 2. support add vllm model ``` RAGFlow(user)> add model 'Qwen/Qwen2-0.5B' to provider 'vllm' instance 'test' with tokens 131072 chat; SUCCESS ``` 3. add vllm chat ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-29 17:05:08 +08:00
Haruko386	decf673049	Go: implement provider: volcengine (#14460 ) ### What problem does this PR solve? implement `volcengine` provider ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-29 15:45:08 +08:00
Jin Hai	f670913bb4	Refactor model type to model class (#14426 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-28 16:05:15 +08:00
qinling0210	effc84a042	Refactor model in Go (#14398) ### What problem does this PR solve? Refactor model in Go ### Type of change - [x] Refactoring	2026-04-28 12:59:01 +08:00
Jin Hai	819257f257	Go: add volcengine (#14409 ) ### What problem does this PR solve? 1. Refactor server_main 2. Add volcengine ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-28 12:12:58 +08:00
Jin Hai	965717c4fb	Go: add new provider: google (#14395 ) ### What problem does this PR solve? As title. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-27 20:35:47 +08:00
Jin Hai	c3eac4103a	Go: aliyun model provider (#14379 ) ### What problem does this PR solve? As title. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-27 14:53:33 +08:00
yuch85	3ad3241ae0	feat: persist RAPTOR layer metadata on summary chunks (#13286 ) ## Summary RAPTOR's recursive clustering builds a `layers` list tracking `(start_idx, end_idx)` boundaries per level, but currently discards this information — only the flat `chunks` list is returned. This makes it impossible to distinguish leaf-level summaries from top-level ones. This PR: - Returns `(chunks, layers)` tuple from `raptor.py`'s `__call__` - Annotates each RAPTOR summary chunk with `raptor_layer_int` (1 = first summary level, 2 = summary-of-summaries, etc.) - Adds `raptor_layer_int` to `infinity_mapping.json` (Elasticsearch handles it via existing `_int` dynamic template) ### Why this matters Downstream features need to know which RAPTOR layer a summary belongs to: - Retrieving the top-level document summary* for entity extraction, search snippets, or document comparison - Filtering by abstraction level — users may want only high-level summaries or only leaf-level cluster summaries - RAPTOR recall quality — #10951 reports summaries not being recalled for definition queries; layer metadata enables targeted retrieval ### Changes \| File \| Change \| LOC \| \|------\|--------\|-----\| \| `rag/raptor.py` \| Return `(chunks, layers)` tuple \| ~3 \| \| `rag/svr/task_executor.py` \| Build `chunk_layer` mapping, set `raptor_layer_int` \| ~12 \| \| `conf/infinity_mapping.json` \| Add `raptor_layer_int` integer field \| ~1 \| ### Backward compatibility - Additive only — no existing fields or behavior changed - Existing RAPTOR chunks continue to work (they'll have `raptor_layer_int = 0` by default) - New RAPTOR chunks get layer metadata automatically ## Test plan - [ ] Parse a document with RAPTOR enabled, verify `raptor_layer_int` is set on indexed chunks - [ ] Verify `raptor_layer_int` values increase with abstraction level (layer 1 < layer 2 < ...) 
- [ ] Verify existing RAPTOR deletion (`delete by raptor_kwd`) still works - [ ] Verify Infinity backend accepts the new field Fixes #7488 Related: #4104, #11191, #10951 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: yuch85 <yuch85.1@gmail.com> Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-04-27 10:20:46 +08:00
wdeveloper16	78188ce9e9	Feat: add OpenDataLoader PDF parser backend (#14058 ) (#14097 ) ### What problem does this PR solve? Closes #14058. RAGFlow supports multiple PDF parsing backends (DeepDOC, MinerU, Docling, TCADP, PaddleOCR). This PR adds OpenDataLoader ([opendataloader-project/opendataloader-pdf](https://github.com/opendataloader-project/opendataloader-pdf)) as a new optional backend, giving users a deterministic, local-first alternative with competitive table extraction accuracy. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update --- ### Changes #### Backend - `deepdoc/parser/opendataloader_parser.py` — new `OpenDataLoaderParser` class inheriting `RAGFlowPdfParser`. Implements `check_installation()` (guards Python package + Java 11+ runtime), `parse_pdf()` with JSON-first extraction (heading/paragraph/table/list/image/formula) and Markdown fallback, position-tag generation compatible with the shared `@@page\tx0\tx1\ty0\ty1##` format, and temp-dir lifecycle with cleanup. - `rag/app/naive.py` — new `by_opendataloader()` wrapper, registered in `PARSERS` dict, added to `chunk_token_num=0` override list. - `rag/flow/parser/parser.py` — `"opendataloader"` branch in the pipeline PDF handler + check validation list. #### Infrastructure - `docker/entrypoint.sh` — `ensure_opendataloader()` function: opt-in via `USE_OPENDATALOADER=true`, skips gracefully if Java is not on PATH. #### Frontend - `web/src/components/layout-recognize-form-field.tsx` — `OpenDataLoader` added to `ParseDocumentType` enum and parser dropdown. Cascades automatically to the pipeline editor's Parser component. #### Docs - `docs/guides/dataset/select_pdf_parser.md` — added OpenDataLoader entry and full env-var reference. 
--- ### Environment variables \| Variable \| Default \| Description \| \|---\|---\|---\| \| `USE_OPENDATALOADER` \| `false` \| Set `true` to install `opendataloader-pdf` on container startup \| \| `OPENDATALOADER_VERSION` \| latest \| Pin the PyPI release (e.g. `==2.2.1`) \| \| `OPENDATALOADER_HYBRID` \| _(unset)_ \| Enable hybrid AI mode (e.g. `docling-fast`) \| \| `OPENDATALOADER_IMAGE_OUTPUT` \| _(unset)_ \| `off` / `embedded` / `external` \| \| `OPENDATALOADER_OUTPUT_DIR` \| _(tmp)_ \| Persistent output dir; temp dir used + cleaned if unset \| \| `OPENDATALOADER_DELETE_OUTPUT` \| `1` \| `0` to retain intermediate files for debugging \| \| `OPENDATALOADER_SANITIZE` \| _(unset)_ \| `1` to filter prompt-injection patterns from output \| --- ### Dependencies - Runtime: `opendataloader-pdf` (PyPI, Apache 2.0) — opt-in, not added to `pyproject.toml` core deps. Installed by `ensure_opendataloader()` at container startup when `USE_OPENDATALOADER=true`. - System: Java 11+ on PATH (JVM is the underlying engine). The installer skips with a warning if `java` is not found. --- ### How to test Standalone parser: ```bash source .venv/bin/activate uv pip install opendataloader-pdf python3 -c " import sys; sys.path.insert(0, '.') from deepdoc.parser.opendataloader_parser import OpenDataLoaderParser p = OpenDataLoaderParser() print('available:', p.check_installation()) s, t = p.parse_pdf('path/to/test.pdf', parse_method='pipeline') print(f'sections={len(s)} tables={len(t)}') " ``` ### Benchmark vs Docling ``` file parser secs sections tables ---------------------------------------------------------------------- text-heavy.pdf docling 45.29 148 10 text-heavy.pdf opendataloader 3.14 559 0 table-heavy.pdf docling 7.05 76 3 table-heavy.pdf opendataloader 3.71 90 0 complex.pdf docling 42.67 114 8 complex.pdf opendataloader 3.51 180 0 ```	2026-04-25 00:33:02 +08:00
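The shared position-tag format `@@page\tx0\tx1\ty0\ty1##` mentioned in the commit above can be sketched as a small round trip. This is an illustration only: the helper names, the one-decimal number formatting, and the single-integer page field are assumptions, not the code in `deepdoc/parser/opendataloader_parser.py`.

```python
import re
from typing import Tuple

# Assumed layout: "@@<page>\t<x0>\t<x1>\t<y0>\t<y1>##" with tab separators.
_TAG_RE = re.compile(r"@@(\d+)\t([\d.]+)\t([\d.]+)\t([\d.]+)\t([\d.]+)##")


def make_position_tag(page: int, x0: float, x1: float, y0: float, y1: float) -> str:
    """Render a bounding box as a position tag (hypothetical helper)."""
    return f"@@{page}\t{x0:.1f}\t{x1:.1f}\t{y0:.1f}\t{y1:.1f}##"


def parse_position_tag(tag: str) -> Tuple[int, float, float, float, float]:
    """Parse a tag produced by make_position_tag back into its fields."""
    match = _TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a position tag: {tag!r}")
    page = int(match.group(1))
    x0, x1, y0, y1 = (float(g) for g in match.groups()[1:])
    return page, x0, x1, y0, y1


tag = make_position_tag(3, 10, 200.5, 40.5, 80)
fields = parse_position_tag(tag)
```

Keeping the coordinate order as `x0, x1, y0, y1` (both x values first) follows the format string quoted in the commit message.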
Jin Hai	1c244df90d	Go: add gitee and siliconflow as model provider (#14336 ) ### What problem does this PR solve? As title ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-24 20:59:30 +08:00
qinling0210	1473000135	Implement retrieval_test in GO (#14231 ) ### What problem does this PR solve? Implement retrieval_test in GO ### Type of change - [x] Refactoring	2026-04-24 15:30:14 +08:00
Magicbook1108	aadd9a333f	Feat: deepseek v4 (#14346 ) ### What problem does this PR solve? Feat: deepseek v4 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-24 13:07:59 +08:00
Jin Hai	2b029882d7	Go: add new provider minimax (#14296 ) ### What problem does this PR solve? 1. Add new provider minimax 2. Add new command: CHECK INSTANCE 'instance_name' FROM 'provider_name'; ``` RAGFlow(user)> check instance 'test' from 'minimax'; SUCCESS ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-23 10:16:20 +08:00
Jin Hai	b8660b9919	Add deepseek and moonshot model json (#14290 ) ### What problem does this PR solve? As title ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-22 15:59:41 +08:00
ucloudnb666	f853a39b40	feat: Add Astraflow provider support (global + China endpoints) (#14270 ) ## Add Astraflow Provider Support This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud / 优刻得) as a new AI model provider in RAGFlow, with support for both global and China endpoints. ### About Astraflow Astraflow is an OpenAI-compatible AI model aggregation platform supporting 200+ models from major providers including DeepSeek, Qwen, GPT, Claude, Gemini, Llama, Mistral, and more. \| Variant \| Factory Name \| Endpoint \| Env Var \| \|---------\|-------------\|----------\|---------\| \| Global \| `Astraflow` \| `https://api-us-ca.umodelverse.ai/v1` \| `ASTRAFLOW_API_KEY` \| \| China \| `Astraflow-CN` \| `https://api.modelverse.cn/v1` \| `ASTRAFLOW_CN_API_KEY` \| - API key signup: https://astraflow.ucloud.cn/ --- ### Files Changed \| File \| Change \| \|------\|--------\| \| `rag/llm/__init__.py` \| Register `Astraflow` and `Astraflow-CN` in `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` \| \| `rag/llm/chat_model.py` \| Add `AstraflowChat` and `AstraflowCNChat` (OpenAI-compatible `Base` subclass) \| \| `rag/llm/embedding_model.py` \| Add `AstraflowEmbed` and `AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) \| \| `rag/llm/rerank_model.py` \| Add `AstraflowRerank` and `AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) \| \| `rag/llm/cv_model.py` \| Add `AstraflowCV` and `AstraflowCNCV` (subclasses of `GptV4`) \| \| `rag/llm/tts_model.py` \| Add `AstraflowTTS` and `AstraflowCNTTS` (subclasses of `OpenAITTS`) \| \| `rag/llm/sequence2txt_model.py` \| Add `AstraflowSeq2txt` and `AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) \| \| `conf/llm_factories.json` \| Register `Astraflow` and `Astraflow-CN` factories with a curated list of popular models \| --- ### Supported Model Types - ✅ Chat / LLM — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7, Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more - ✅ 
Text Embedding — text-embedding-3-small/large - ✅ Image / Vision (IMAGE2TEXT) — GPT-4o, GPT-4.1, Claude, Gemini, Llama-4, etc. - ✅ Text Re-Rank - ✅ TTS — tts-1 - ✅ Speech-to-Text (SPEECH2TEXT) — whisper-1 ### Implementation Notes - Uses the `openai/` LiteLLM prefix — consistent with other OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI, OpenRouter, n1n, Avian, etc.) - `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249) are separate factory entries, allowing users to choose the optimal endpoint based on their region. - All model classes cleanly subclass existing base classes (`Base`, `OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`) with no custom logic needed — the provider is fully OpenAI-compatible. --------- Co-authored-by: user <user@xzaaaMacBook-Air.local>	2026-04-22 15:38:34 +08:00
Jin Hai	e48d75987c	Go: add stream / think chat (#14242 ) ### What problem does this PR solve? 1. Supports stream and non-stream chat 2. Supports think and non-think chat 3. List supported models from DeepSeek service. (This command can be used to verify the API validity) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-21 16:52:32 +08:00
Jin Hai	f269ee9739	Go: add thinking features to zhipu-ai (#14234 ) ### What problem does this PR solve? ``` RAGFlow(user)> list models from 'zhipu-ai'; +------------+------------+---------------+----------------+ \| features \| max_tokens \| model_types \| name \| +------------+------------+---------------+----------------+ \| [thinking] \| 128000 \| [chat] \| glm-4.7 \| \| [thinking] \| 128000 \| [chat] \| glm-4.5 \| \| [thinking] \| 128000 \| [chat vision] \| glm-4.6v-Flash \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-x \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-air \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-airx \| \| [thinking] \| 128000 \| [chat] \| glm-4.5-flash \| \| [thinking] \| 64000 \| [vision] \| glm-4.5v \| \| \| 128000 \| [chat] \| glm-4-plus \| \| \| 128000 \| [chat] \| glm-4-0520 \| \| \| 128000 \| [chat] \| glm-4 \| \| \| 8000 \| [chat] \| glm-4-airx \| \| \| 128000 \| [chat] \| glm-4-air \| \| \| 128000 \| [chat] \| glm-4-flash \| \| \| 128000 \| [chat] \| glm-4-flashx \| \| \| 1000000 \| [chat] \| glm-4-long \| \| \| 128000 \| [chat] \| glm-3-turbo \| \| \| 2000 \| [vision] \| glm-4v \| \| \| 8192 \| [chat] \| glm-4-9b \| \| \| 512 \| [embedding] \| embedding-2 \| \| \| 512 \| [embedding] \| embedding-3 \| \| \| 4096 \| [asr] \| glm-asr \| \| \| 0 \| [tts] \| glm-tts \| \| \| 0 \| [ocr] \| glm-ocr \| \| \| 0 \| [rerank] \| glm-rerank \| +------------+------------+---------------+----------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-20 21:53:27 +08:00
Jin Hai	af2ed416a7	Add extra field to model instance (#14203 ) ### What problem does this PR solve? Each model now supports regions with different URLs ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-20 15:31:12 +08:00
rhinoceros.xn	4e992de91f	Add tongyi gte-rerank-v2 (#14215 ) https://bailian.console.aliyun.com/cn-beijing?tab=api#/api/?type=model&url=2780056 ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] Other (please describe): add gte-rerank-v2 and qwen3-rerank	2026-04-20 11:39:17 +08:00
Jin Hai	94106646e7	Go: set and list default models (#14191 ) ### What problem does this PR solve? ``` RAGFlow(user)> set default vlm "zhipu-ai" "ccc" "glm-4.6v-flash"; SUCCESS RAGFlow(user)> list default models; +--------+----------------+----------------+----------------+------------+ \| enable \| model_instance \| model_name \| model_provider \| model_type \| +--------+----------------+----------------+----------------+------------+ \| true \| ccc \| glm-4.6v-flash \| zhipu-ai \| llm \| \| true \| ccc \| glm-4.6v-flash \| zhipu-ai \| image2text \| +--------+----------------+----------------+----------------+------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-17 18:05:33 +08:00
Magicbook1108	69264b3a70	Feat: Refact pipeline (#13826 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Co-authored-by: Zhichang Yu <yuzhichang@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-03 19:26:45 +08:00
Jin Hai	6c29128de1	Refactor model provider and command (#13887 ) ### What problem does this PR solve? Introduce 5 new tables, including model groups and provider instance. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-02 20:20:35 +08:00
Jin Hai	e20cf39735	Refactor Go server model provider reading and access (#13831 ) ### What problem does this PR solve? 1. Refactor model provider json file format 2. Use memory data structure to replace database 3. Add CLI command to access ``` RAGFlow(user)> list pool models from 'xai'; +-------------------------------------------------------------------------------------+------------+-------------+-----------------------+ \| features \| max_tokens \| model_types \| name \| +-------------------------------------------------------------------------------------+------------+-------------+-----------------------+ \| map[] \| 256000 \| [llm] \| grok-4 \| \| map[] \| 131072 \| [llm] \| grok-3 \| \| map[] \| 131072 \| [llm] \| grok-3-fast \| \| map[] \| 131072 \| [llm] \| grok-3-mini \| \| map[] \| 131072 \| [llm] \| grok-3-mini-mini-fast \| \| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] \| 32768 \| [vlm] \| grok-2-vision \| +-------------------------------------------------------------------------------------+------------+-------------+-----------------------+ RAGFlow(user)> show pool model 'grok-2-vision' from 'xai'; +-------------------------------------------------------------------------------------+------------+-------------+---------------+ \| features \| max_tokens \| model_types \| name \| +-------------------------------------------------------------------------------------+------------+-------------+---------------+ \| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] \| 32768 \| [vlm] \| grok-2-vision \| +-------------------------------------------------------------------------------------+------------+-------------+---------------+ RAGFlow(user)> list pool providers; +--------+------------------------------------------------------------+---------------------------+ \| name \| tags \| url \| 
+--------+------------------------------------------------------------+---------------------------+ \| OpenAI \| LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION \| https://api.openai.com/v1 \| \| xAI \| LLM \| https://api.x.ai/v1 \| +--------+------------------------------------------------------------+---------------------------+ RAGFlow(user)> show pool provider 'openai'; +---------------------------+--------+------------------------------------------------------------+--------------+ \| base_url \| name \| tags \| total_models \| +---------------------------+--------+------------------------------------------------------------+--------------+ \| https://api.openai.com/v1 \| OpenAI \| LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION \| 27 \| +---------------------------+--------+------------------------------------------------------------+--------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-03-30 12:00:49 +08:00
Liu An	2240fc778c	Fix: add missing "mom" field to infinity_mapping.json for parent-child chunker (#13821 ) ### What problem does this PR solve? When using Infinity as DOC_ENGINE with parent-child chunker enabled, vector insertion fails because the "mom" field is missing from the index mapping. This fix adds the required field to resolve the issue. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-27 13:06:18 +08:00
Jin Hai	b308cd3a02	Update go cli (#13717 ) ### What problem does this PR solve? Go cli ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-03-24 20:08:36 +08:00
Yongteng Lei	dd839f30e8	Fix: code supports matplotlib (#13724 ) ### What problem does this PR solve? Code as "final" node: ![img_v3_02vs_aece4caf-8403-4939-9e68-9845a22c2cfg](https://github.com/user-attachments/assets/9d87b8df-da6b-401c-bf6d-8b807fe92c22) Code as "mid" node: ![img_v3_02vv_f74f331f-d755-44ab-a18c-96fff8cbd34g](https://github.com/user-attachments/assets/c94ef3f9-2a6c-47cb-9d2b-19703d2752e4) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-20 20:32:00 +08:00
tmimmanuel	13d0df1562	feat: add Perplexity contextualized embeddings API as a new model provider (#13709 ) ### What problem does this PR solve? Adds Perplexity contextualized embeddings API as a new model provider, as requested in #13610. - `PerplexityEmbed` provider in `rag/llm/embedding_model.py` supporting both standard (`/v1/embeddings`) and contextualized (`/v1/contextualizedembeddings`) endpoints - All 4 Perplexity embedding models registered in `conf/llm_factories.json`: `pplx-embed-v1-0.6b`, `pplx-embed-v1-4b`, `pplx-embed-context-v1-0.6b`, `pplx-embed-context-v1-4b` - Frontend entries (enum, icon mapping, API key URL) in `web/src/constants/llm.ts` - Updated `docs/guides/models/supported_models.mdx` - 22 unit tests in `test/unit_test/rag/llm/test_perplexity_embed.py` Perplexity's API returns `base64_int8` encoded embeddings (not OpenAI-compatible), so this uses a custom `requests`-based implementation. Contextualized vs standard model is auto-detected from the model name. Closes #13610 ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2026-03-20 10:47:48 +08:00
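The `base64_int8` encoding mentioned in the commit above can be illustrated with a minimal decoding sketch. It assumes each embedding arrives as a base64 string whose decoded bytes are signed 8-bit integers, one per dimension; the function name is hypothetical, and any rescaling or JSON response layout used by the real `PerplexityEmbed` class is not shown here.

```python
import base64
import struct
from typing import List


def decode_base64_int8(payload: str) -> List[int]:
    """Decode a base64 string into a list of signed 8-bit integers.

    Assumption: one byte per embedding dimension, interpreted as int8.
    """
    raw = base64.b64decode(payload)
    # "b" is the struct format code for a signed char (int8).
    return list(struct.unpack(f"{len(raw)}b", raw))


# Round trip with a hand-made vector covering the int8 range.
original = [-128, -1, 0, 1, 127]
encoded = base64.b64encode(struct.pack(f"{len(original)}b", *original)).decode("ascii")
decoded = decode_base64_int8(encoded)
```

Because the values are integers rather than floats, a client that expects OpenAI-style float vectors would need its own conversion step, which is consistent with the commit's note that a custom `requests`-based implementation was used.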
Octopus	f171554c0a	feat: upgrade MiniMax default model to M2.7 (#13676 ) ## Summary Upgrade MiniMax model configuration to include the latest M2.7 model. ## Changes - Add `MiniMax-M2.7` and `MiniMax-M2.7-highspeed` to the model selection list in `conf/llm_factories.json` - Place M2.7 models at the top of the list as the recommended default - Retain all previous models (M2.5, M2.5-highspeed, M2.1, M2) as available alternatives ## Why MiniMax-M2.7 is the latest flagship model with enhanced reasoning and coding capabilities. This update ensures RAGFlow users can access the newest model while maintaining backward compatibility with existing configurations. ## Testing - JSON config validated (well-formed) - No existing MiniMax-specific unit tests affected - Model entries follow the same structure as existing entries Co-authored-by: PR Bot <pr-bot@minimaxi.com>	2026-03-18 19:20:10 +08:00
Jin Hai	74866371ef	Fix compatibility issue (#13667 ) ### What problem does this PR solve? 1. Change go admin server port from 9385 to 9383 to avoid conflicts 2. Start go server after python servers are started completely, in entrypoint.sh 3. Fix some database migration issues 4. Add more API routes in web to comply with EE. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-03-18 11:51:03 +08:00


291 Commits