ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 23:41:12 +08:00

Author	SHA1	Message	Date
Wang Qi	9b726a519e	Fix: failed to get embedding model by embd_id: model config not found BAAI/bge-m3@...@SILICONFLOW (#16445 )	2026-06-29 15:40:29 +08:00
Jin Hai	65afaa1292	Model config: add tools (#16371 ) ### What problem does this PR solve? ``` { "name": "glm-4-flash", "max_tokens": 128000, "model_types": [ "chat" ], "tools": { "support": true } } ``` ``` RAGFlow(admin)> list provider 'zhipu-ai' models; +------------+---------------+------------+---------------+----------------+-----------+-----------+ \| dimensions \| max_dimension \| max_tokens \| model_type \| name \| thinking \| tools \| +------------+---------------+------------+---------------+----------------+-----------+-----------+ \| \| \| 204800 \| [chat] \| glm-5 \| supported \| supported \| \| \| \| 204800 \| [chat] \| glm-5-turbo \| supported \| supported \| ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-26 11:37:51 +08:00
Rander	1235da7093	refactor(paddleocr): migrate from sync API to async Job API (#15967 ) ## Summary Migrate PaddleOCR integration from the deprecated synchronous HTTP API to the new asynchronous Job API (`submit → poll → fetch`), aligning with PaddleOCR 3.6.0+ architecture. ## Changes ### Python (`deepdoc/parser/paddleocr_parser.py`) - Replace synchronous `requests.post()` with async Job API flow (submit → poll → fetch) - Authentication: `token {token}` → `Bearer {token}` - File transfer: base64 JSON body → multipart file upload - Polling: exponential backoff (initial 3s, ×1.5, max 15s, timeout controlled by `request_timeout`) - Result: fetch full JSONL from result URL, preserving `prunedResult` with bbox info for crop functionality - Rename `api_url` → `base_url` (backward compatible: `api_url` still accepted as fallback) ### Python (`rag/llm/ocr_model.py`) - Prefer `paddleocr_base_url` / `PADDLEOCR_BASE_URL`, fallback to `paddleocr_api_url` / `PADDLEOCR_API_URL` ### Go (`internal/entity/models/paddleocr.go`) - Add `Client-Platform: ragflow` header to submit and poll requests - Change polling from fixed 3s to exponential backoff (initial 3s, ×1.5, max 15s) ### Python (`common/constants.py`) - Add `PADDLEOCR_BASE_URL` to env keys and default config ## Backward Compatibility - Old env var `PADDLEOCR_API_URL` still works (used as fallback) - Frontend field `paddleocr_api_url` still works (backend reads it as fallback) - No user-facing configuration changes required for existing setups ## Why not use the `paddleocr` SDK package directly? RAGFlow's `_transfer_to_sections()` relies on `prunedResult` (containing `block_bbox`, `block_label`, `parsing_res_list`) from the raw API response for PDF crop functionality. The SDK's public `parse_document()` API only returns `DocParsingResult` with `markdown_text`, discarding the bbox data. Therefore we implement the async Job API flow directly via HTTP, following the same logic as the SDK internally.	2026-06-16 19:34:21 +08:00
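The polling schedule described in the commit above (initial 3s, ×1.5, max 15s, bounded by a timeout) can be sketched as follows. This is a minimal illustration, not the code from `paddleocr_parser.py`: the function name `backoff_delays` and the rule of stopping before the summed delays would exceed the timeout are assumptions made for the example.

```python
from typing import Iterator


def backoff_delays(
    initial: float = 3.0,
    factor: float = 1.5,
    cap: float = 15.0,
    timeout: float = 60.0,
) -> Iterator[float]:
    """Yield sleep intervals for a poll loop (hypothetical helper).

    Each interval grows by `factor` and is clamped to `cap`. Iteration stops
    once the next sleep would push the total wait past `timeout`.
    """
    elapsed = 0.0
    delay = initial
    while elapsed + delay <= timeout:
        yield delay
        elapsed += delay
        delay = min(delay * factor, cap)


# Intervals a caller would sleep between poll requests with a 60 s budget.
delays = list(backoff_delays(timeout=60.0))
```

A caller would sleep for each yielded interval between job status requests and treat exhaustion of the generator as a timeout.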
zaviermeekz-cpu	83e2180e80	fix: use /api/tags endpoint for Ollama model listing (#16000 ) (#16003 ) After upgrading to v0.26.0, the Ollama provider returns an empty model list because the Go rewrite uses `/api/ps` (only running models) instead of `/api/tags` (all installed models). This PR changes the endpoint to `/api/tags`, restoring the ability to list and add Ollama models. Closes #16000	2026-06-15 10:20:15 +08:00
Haruko386	547139da29	fix(Go-models): preserve model name lookup when aliases exist (#15969 ) ### What problem does this PR solve? As title ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Documentation Update	2026-06-12 19:15:28 +08:00
Idriss Sbaaoui	9871a7e0b6	fix: replicate model provider (#15933 ) ### What problem does this PR solve? Fix the Replicate model provider failing with a valid API key ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:08:33 +08:00
Jin Hai	55abf4f565	Go: new CLI command, list all models and show model (#15786 ) ### What problem does this PR solve? ``` RAGFlow(user)> list models; +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| alias \| max_tokens \| model_types \| name \| thinking \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| \| 1048576 \| [chat] \| deepseek-v4-flash \| map[clear_thinking:true default_value:true] \| \| \| 1048576 \| [chat] \| deepseek-v4-pro \| map[clear_thinking:true default_value:true] \| \| \| 1024000 \| [chat] \| minimax-m3 \| map[clear_thinking:true default_value:true] \| \| \| 64000 \| [vision] \| glm-4.5v \| map[clear_thinking:true default_value:true] \| \| [baai/bge-m3] \| 8192 \| [embedding] \| bge-m3 \| \| \| [baai/bge-reranker-v2-m3] \| 1024 \| [rerank] \| bge-reranker-v2-m3 \| \| \| \| \| [tts] \| step-audio-tts-3b \| \| \| [qwen/qwen3-asr-1.7b] \| \| [asr] \| qwen3-asr-1.7b \| \| \| [paddleocr-vl-1.5] \| \| [ocr] \| paddleocr-vl-0.9b \| \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ RAGFlow(user)> show model 'minimax-m3'; +--------------+---------------------------------------------+ \| field \| value \| +--------------+---------------------------------------------+ \| name \| minimax-m3 \| \| max_tokens \| 1024000 \| \| model_types \| [chat] \| \| thinking \| map[clear_thinking:true default_value:true] \| \| class \| \| \| alias \| \| \| ModelTypeMap \| \| +--------------+---------------------------------------------+ RAGFlow(user)> show model 'baai/bge-m3'; +--------------+---------------+ \| field \| value \| +--------------+---------------+ \| model_types \| [embedding] \| \| thinking \| \| \| class \| \| \| alias \| [baai/bge-m3] \| \| ModelTypeMap \| \| \| name \| bge-m3 \| \| max_tokens \| 8192 \| 
+--------------+---------------+ ``` --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-08 21:38:15 +08:00
oktofeesh	25df0a6725	fix(go-models): validate URL suffix config keys (#15734 ) ## Summary Fixes typoed model-provider URL suffix keys and adds strict nested decoding so future URL suffix config mistakes fail during provider loading instead of being silently ignored.	2026-06-08 19:29:36 +08:00
Haruko386	8dc7f1d95e	Go: implement ASR and TTS for xiaomi (#15765 ) ### What problem does this PR solve? Verified from CLI ``` RAGFlow(user)> chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: Hello! I'm MiMo-v2.5, a large language model developed by Xiaomi's LLM Core Team. You can think of me as a friendly AI assistant ready to help you answer questions, have conversations, or work on creative tasks. My context window can handle up to 1 million tokens, so we can dive into pretty long discussions or documents if you'd like. What can I help you with today? Time: 3.831830 RAGFlow(user)> stream chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: there! I'm MiMo-v2.5, an AI assistant created by the Xiaomi LLM Core Team. I'm here to chat, help out, answer questions, or just have a friendly conversation. Think of me as a helpful buddy with a pretty big memory (1 million tokens worth!). What can I do for you today?😊 Time: 2.421630 RAGFlow(user)> think chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Thinking: The user is asking a simple question about who I am. According to my system prompt, I should: - Identify myself as MiMo-v2.5 - State that I was developed by the Xiaomi LLM Core Team - Answer in first person and be warm and conversational Answer: Hey there! 👋 I'm MiMo, an AI assistant created by the Xiaomi LLM Core Team. Think of me as a friendly chat buddy who's here to help you with all sorts of questions and tasks! I love having conversations, answering questions, brainstorming ideas, and helping people figure things out. Whether you want to chat, need help with something specific, or just want to explore ideas together — I'm here for it! 😊 What can I help you with today? Time: 6.651589 RAGFlow(user)> tts with 'mimo-v2.5-tts@test@xiaomi' text 'hello? 
show yourself' play format 'wav' param '{"voice": "Chloe"}' SUCCESS RAGFlow(user)> asr with 'mimo-v2.5-asr@test@xiaomi' audio './internal/test.wav' param '{"language": "zh"}' +------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------+ \| 1 The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +------------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-06-08 19:27:45 +08:00
tmimmanuel	5e25e2600b	Go: implement Xiaomi chat provider (#15626 ) ### What problem does this PR solve? Implements the Xiaomi MiMo chat provider for the Go model provider layer. Reference issue: #14736 Official docs used: - Xiaomi MiMo OpenAI-compatible chat API: https://platform.xiaomimimo.com/docs/en-US/api/chat/openai-api - Xiaomi MiMo model and rate limits: https://platform.xiaomimimo.com/docs/en-US/quick-start/model - Xiaomi MiMo model hyperparameters: https://platform.xiaomimimo.com/docs/en-US/quick-start/model-hyperparameters	2026-06-08 13:09:36 +08:00
tmimmanuel	f78ef328bb	Go: implement Bedrock embeddings (#15543 ) ### What problem does this PR solve? Fixes #15542. AWS Bedrock support for the Go model provider layer was added in #15166, but embedding support was intentionally left out of scope and `BedrockModel.Embed(...)` still returned the `no such method` sentinel. This PR implements Bedrock text embeddings under the umbrella provider tracker #14736. ### What this PR includes - `internal/entity/models/bedrock.go`: implement `BedrockModel.Embed(...)` through Bedrock Runtime `InvokeModel` with existing SigV4 auth, region resolution, and runtime URL helpers. - Titan embeddings: supports `amazon.titan-embed-text-v1` and `amazon.titan-embed-text-v2:0`; v2 forwards `EmbeddingConfig.Dimension` as `dimensions` when provided, while v1 keeps the payload minimal. - Cohere embeddings: supports `cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, and `cohere.embed-v4:0`; batches input texts and maps returned vectors to RAGFlow `EmbeddingData` in input order. - `conf/models/bedrock.json`: adds the `embedding` URL suffix (`invoke`) and Bedrock embedding model entries. - `internal/entity/models/bedrock_test.go`: adds unit tests for Titan, Cohere, typed Cohere responses, validation, empty input, unsupported models, and HTTP error propagation. Reference docs: - Bedrock InvokeModel API: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html - Titan Text Embeddings: https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html - Cohere Embed models on Bedrock: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - [x] `jq empty conf/models/bedrock.json` - [x] `git diff --check` - [x] `go test ./internal/entity/models/... -run Bedrock -count=1` - [x] `go test ./internal/entity/models/... 
-run '^$' -count=1` - [x] `go test ./internal/entity/models/... -run Bedrock -race -count=1` Note: `go test ./internal/entity/models/... -count=1` currently fails in unrelated existing Astraflow coverage (`TestAstraflowEmbedReturnsNoSuchMethod` panics in `internal/entity/models/astraflow.go`). The Bedrock-specific tests and compile-only package check pass.	2026-06-05 13:26:32 +08:00
Idriss Sbaaoui	1134769940	Chore: update cohere models (#15576 ) ### What problem does this PR solve? remove old and add latest cohere models ### Type of change - [x] Refactoring - [x] Other (please describe): update models	2026-06-03 15:55:45 +08:00
Wang Qi	d41373cfa9	Feature: Add the new anthropic and voyage models (#15516 ) Add the new Anthropic and Voyage models. Strip Opus 4.7 and 4.8 of certain unsupported keys. Co-authored-by: Idriss Sbaaoui <112825897+6ba3i@users.noreply.github.com>	2026-06-02 17:29:18 +08:00
Wang Qi	c990badda1	Feature: Add MiniMax M3 (#15513 ) Feature: Add MiniMax M3	2026-06-02 17:28:48 +08:00
glorydavid03023	3774916060	Go: implement Embed in GPUStack driver (#15182 ) ### What problem does this PR solve? The Go GPUStack driver returned a stub error for `Embed()` even though GPUStack exposes OpenAI-compatible embeddings on the v1-openai route (not `v1/embeddings`). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-01 11:22:43 +08:00
呆萌闷油瓶	658ff06ca4	feat: add 4 new models for siliconflow (#15383 ) ### What problem does this PR solve? Added 4 new models: deepseek-ai/DeepSeek-V4-Pro deepseek-ai/DeepSeek-V4-Flash Pro/moonshotai/Kimi-K2.6 Pro/zai-org/GLM-5.1 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-29 19:28:29 +08:00
Haruko386	ae88578451	Go: implement TTS and ASR for X.AI (#15247 ) ### What problem does this PR solve? As title ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-27 14:08:35 +08:00
sxxtony	17b5b33574	Go: implement Rerank in Replicate driver (#15278 ) ### What problem does this PR solve? `ReplicateModel.Rerank` in `internal/entity/models/replicate.go` was a `"replicate, no such method"` stub. The chat path landed in #14958 and the embed path in #15073; rerank is the last major retrieval surface still missing on this provider. Until this PR, a tenant who selected a Replicate reranker model got the sentinel error on every rerank call. Co-authored-by: sxxtony <sxxtony@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-27 14:07:00 +08:00
Alexander Laurent	ae5f48f233	feat: add GiteeAI provider support to Go API server (#15131 ) ### What problem does this PR solve? Closes #15090. Adds GiteeAI support to the Go model-provider layer so GiteeAI chat models can be routed through the Go API server using the same OpenAI-compatible chat, streaming, model listing, and connection-check flow used by other SaaS providers. GiteeAI is implemented as a separate provider from the existing `gitee` provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary - Added a GiteeAI Go model driver. - Added the GiteeAI provider catalog with default base URL `https://ai.gitee.com/v1`. - Registered `giteeai` in the model factory separately from `gitee`. - Added focused provider tests for sync chat, streaming chat, model listing, connection checks, base URL override, SSE parsing, `[DONE]` handling, and unsupported methods. ## What changed - Implemented `ChatWithMessages` for `POST /chat/completions`. - Implemented `ChatStreamlyWithSender` with SSE parsing, `delta` extraction, `finish_reason`, `[DONE]`, and `<think>` tag handling. - Implemented `ListModels` for `GET /models`. - Implemented `CheckConnection` by delegating to `ListModels`. - Returned standard `no such method` errors for unsupported embedding, rerank, image-to-text, ASR, and TTS paths. ## Tests ```bash go test -vet=off ./internal/entity/models -run 'TestGiteeAI' -count=1 go test -vet=off ./internal/entity -run 'Test.Provider\|Test.Model' -count=1 ``` --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-27 14:06:34 +08:00
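The streaming behaviour listed in the commit above (SSE parsing, `delta` extraction, `[DONE]` handling) follows the usual OpenAI-compatible chunk format. Below is a rough sketch of that parsing loop. The actual driver is written in Go; the function name `iter_sse_deltas` and the sample payloads are made up for illustration, and `<think>` tag handling and `finish_reason` are left out.

```python
import json
from typing import Iterable, Iterator


def iter_sse_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        # Skip blank keep-alive lines and any non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The stream ends with a literal [DONE] sentinel, which is not JSON.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        content = delta.get("content")
        if content:
            yield content


# Sample stream, invented for this example.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}, "finish_reason": "stop"}]}',
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
text = "".join(iter_sse_deltas(sample))
```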
Hz_	47626bbe63	go: add Qiniu model provider (#15280 ) ### What problem does this PR solve? This PR adds Qiniu provider integration for the Go model driver layer in RAGFlow. Supported capabilities: - [X] Chat - [X] Think Chat - [X] Stream Chat - [X] Stream Think Chat - [X] Model listing - [X] Provider configuration and factory registration Verified examples from the CLI: ``` login user '*' password ''; ADD PROVIDER 'qiniu'; CREATE PROVIDER 'qiniu' INSTANCE 'test' KEY '**'; chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello'; think chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello'; stream chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello, what are you'; stream think chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello, what are you'; stream think chat with 'qwen3-max-2026-01-23@test@qiniu' message 'hello, what are you'; LIST MODELS FROM 'qiniu' 'test'; ``` ### Type of change - [X] New Feature - [X] Provider integration	2026-05-27 13:19:39 +08:00
oktofeesh	a3c6e075f6	fix(go-models): add VolcEngine model listing suffix (#15234 ) ## Summary - add the VolcEngine `models` URL suffix used by the existing Go `ListModels` implementation - return a clear error when the VolcEngine models suffix is missing - add focused VolcEngine model-listing regression tests ## What changed - Added `url_suffix.models` to `conf/models/volcengine.json`. - Normalized the configured models suffix before building the request URL. - Covered config loading, successful model listing, upstream errors, and missing suffix handling. ## Why `VolcEngine.ListModels` already builds requests from `URLSuffix.Models`, but the bundled VolcEngine config did not define that suffix. That left the model-listing path unable to call the documented `/models` endpoint from the existing provider config. Fixes #14701 Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-27 13:14:56 +08:00
oktofeesh	5ae41dc1eb	fix(go-models): route hosted OCR providers through drivers (#15233 ) ## Summary - route hosted MinerU.Net and PaddleOCR.Net provider names to their existing Go drivers - add regression coverage for loading the hosted OCR provider configs through ProviderManager ## What changed - Added canonical provider-name aliases for the hosted OCR provider display names. - Covered both bundled configs with a focused provider-manager test. ## Why The hosted provider configs use display names with `.Net`, while model factory dispatch lowercases the provider name. Without aliases, those configs fall through to `DummyModel` instead of using the existing MinerU and PaddleOCR drivers. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 20:40:40 +08:00
oktofeesh	22a3b8cdf9	feat(go-models): list LongCat models (#15241 ) ## Summary - Add LongCat model-list support through the documented OpenAI-compatible models endpoint. ## What changed - Add the LongCat `models` URL suffix for `/openai/v1/models`. - Implement `ListModels` for the LongCat Go driver. - Delegate `CheckConnection` to the lightweight model-list request. - Add focused regression coverage for successful, malformed, oversized, and missing-key responses. ## Why LongCat documents a models endpoint under the OpenAI-compatible API surface, but the Go driver still returned `no such method` for model listing and connection checks. ## Validation - `go test ./internal/entity/models -run TestLongCat -count=1` - `go test -race ./internal/entity/models -run TestLongCat -count=1` - `go test ./internal/entity -count=1` - `git diff --check` ## Notes - Related to the broader Go model provider tracking in #14736, but this PR only handles LongCat model listing. - `go test ./internal/entity/models -count=1` is currently blocked by an unrelated Astraflow test panic outside this LongCat change. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 19:58:53 +08:00
oktofeesh	557024e7d4	fix(go-models): add xAI model listing suffix (#15236 ) ## Summary - add the xAI `models` URL suffix used by the existing Go `ListModels` implementation - return a clear error when the xAI models suffix is missing - add focused xAI model-listing and connection-check regression tests ## What changed - Added `url_suffix.models` to `conf/models/xai.json`. - Normalized the configured models suffix before building the request URL. - Covered config loading, successful model listing, upstream errors, API-key validation, missing suffix handling, and `CheckConnection` delegation. ## Why `XAIModel.ListModels` already builds requests from `URLSuffix.Models`, and `CheckConnection` delegates to that method. The bundled xAI config did not define that suffix, which left the model-listing path unable to call the provider `/models` endpoint from the existing provider config. ## Validation - `go test ./internal/entity/models -run TestXAI -count=1` - `go test ./internal/entity -count=1` - `git diff HEAD~1..HEAD --check` ## Notes - `go test ./internal/entity/models -count=1` currently fails in unchanged Astraflow coverage: `TestAstraflowEmbedReturnsNoSuchMethod` panics before reaching any xAI assertions. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 19:58:20 +08:00
Haruko386	3619ceca01	Go: implement provider: OrcaRouter (#15235 ) ### What problem does this PR solve? implement provider `OrcaRouter` The following functionalities are now supported: Cohere: - [x] Chat / Think Chat / Stream Chat / Stream Think Chat - [x] Model listing - [x] TTS - [ ] Balance ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 18:20:33 +08:00
dripsmvcp	a48bcf814d	Go: implement provider: ModelScope (#15041 ) Closes #15040. ModelScope was listed unchecked in the Go-rewrite tracker #14736 and already had an llm_factories.json entry (tags: LLM) but no Go driver, so the new Go API server could not route ModelScope instances. The Python side has supported it through the OpenAI-compatible base at rag/llm/chat_model.py:618 (ModelScopeChat), which requires a user-supplied base URL and appends /v1. This adds: - internal/entity/models/modelscope.go: self-hosted OpenAI-compatible driver with chat (sync + SSE stream with idle-timeout cancellation), list_models, and check_connection. Auth header is optional, matching the xinference pattern, so deployments without auth and auth-enabled deployments both work. Base URL is normalized so users can configure either the root endpoint or the /v1 endpoint. - internal/entity/models/modelscope_test.go: 12 tests covering name, URL normalization, factory routing, chat happy path / auth header / reasoning_content extraction, stream happy path / stream=false rejection / idle cancellation, list_models + check_connection, missing-base-URL clear error, and the no-such-method sentinels. - conf/models/modelscope.json: shipped config (class: "local", url_suffix v1/chat/completions and v1/models). - internal/entity/models/factory.go: case "modelscope" → ModelScopeModel. - internal/service/llm.go: ModelScope added to the selfDeployed map alongside Ollama, Xinference, LocalAI, LM-Studio, GPUStack — the Python side requires user-supplied URL with no default, so the Go side classifies it the same way. Follow-on issues will add Embed and Rerank, in line with how Novita, NVIDIA, TogetherAI, and other providers landed method-by-method. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 18:18:46 +08:00
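The base URL normalization mentioned in the commit above lets users configure either the root endpoint or the `/v1` endpoint. One way to get that behaviour, given a `v1/chat/completions` suffix, is sketched below. The helper name `join_endpoint` and the sample host are assumptions for this example and do not come from `modelscope.go`.

```python
def join_endpoint(base_url: str, suffix: str = "v1/chat/completions") -> str:
    """Join a user-supplied base URL with a URL suffix (hypothetical helper).

    A trailing slash and a trailing "/v1" are dropped from the base, so the
    root endpoint and the /v1 endpoint both resolve to the same request URL.
    """
    base = base_url.strip().rstrip("/")
    if base.endswith("/v1"):
        base = base[: -len("/v1")]
    return f"{base}/{suffix.lstrip('/')}"


# Both spellings of the base URL produce the same chat endpoint.
from_root = join_endpoint("http://localhost:8000")
from_v1 = join_endpoint("http://localhost:8000/v1/")
```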
Hz_	84add43208	Add HuaweiCloud model provider (#15237 ) ### What problem does this PR solve? This PR adds HuaweiCloud provider integration in RAGFlow. Supported capabilities: - [x] Chat / Think Chat / Stream Chat / Stream Think Chat - [x] Embedding - [x] Rerank - [x] Model listing - [x] Provider connection checking Verified examples from the CLI: ``` check instance 'test' from 'HuaweiCloud'; chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; stream chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; stream think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; embed text 'what is rag' 'who are you' with 'bge-m3@test@HuaweiCloud' dimension 1024; rerank query 'what is rag' document 'rag is retrieval augmented generation' 'rag need llm' 'famous rag project includes ragflow' with 'bge-reranker-v2-m3@test@HuaweiCloud' top 3; list supported models from 'HuaweiCloud' 'test'; LIST MODELS FROM 'HuaweiCloud' 'test'; ``` ### Type of change - [x] New Feature - [x] Provider integration	2026-05-26 17:13:15 +08:00
Jake Armstrong	0fb85a66bc	feat(go-models): add AWS Bedrock provider driver (#15166 ) ## Summary Closes #15165. Implements the AWS Bedrock model provider for the Go API server, tracked under #14736. Adds Converse + Converse-Stream chat and foundation-model listing, with SigV4 signing over a hand-rolled `net/http` path that matches the established pattern in `internal/entity/models/` (no new direct `go.mod` deps). ## Linked tracker Tracked under #14736 (Implement model providers of RAGFlow API server in Go). Closes #15165.	2026-05-26 17:10:06 +08:00
glorydavid03023	3dbd874a79	Go: implement Rerank in DeepInfra driver (#15185 ) ### What problem does this PR solve? The Go DeepInfra driver returned a stub error for `Rerank()` even though DeepInfra serves reranker models at `POST /v1/inference/{model}` with `query`, `documents`, and a `scores[]` response. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-26 10:52:09 +08:00
sxxtony	67f7d87dff	Go: implement provider: FuturMix (#15013 ) ### What problem does this PR solve? Add a Go driver for FuturMix (https://futurmix.ai/docs), one of the unchecked providers on the umbrella tracking issue #14736. FuturMix is documented as an "OpenAI-compatible API" aggregator over Claude / GPT / Gemini / DeepSeek (~22 models per their `/models` page). Until this PR, a tenant who configured `futurmix` as a model provider in the Go layer fell through to the default branch of `internal/entity/models/factory.go` and got the dummy driver. --------- Co-authored-by: sxxtony <sxxtony@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 10:51:29 +08:00
Jake Armstrong	b961810e79	Go: implement OCR in ZhipuAI driver (#15143 ) ### What problem does this PR solve? Closes #15142. ZhipuAI lists `glm-ocr` as an OCR model, but the Go driver still returned `no such method` from `OCRFile`. This wires the advertised model to Z.AI's documented `layout_parsing` endpoint and returns the `md_results` Markdown output through the existing `OCRFileResponse.Text` field. This PR also adds focused tests for URL input, raw file-content base64 input, and validation errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Test - [x] `go test -vet=off ./internal/entity/models -run 'TestZhipuAIOCRFile'`	2026-05-26 10:50:06 +08:00
Haruko386	4783ce9951	fix(Go): rewrite chat, listmodels, embed for Ollama (#15213 ) ### What problem does this PR solve? IDK how to implement `Ollama` on #14580 but it's totally wrong. This is the rewrite version for `Ollama` Verified from CLI ``` # Embed RAGFlow(user)> embed text 'what is rag' 'who are you' with 'nomic-embed-text:latest@test12@ollama' dimension 1024; +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 768 \| 0 \| \| 768 \| 1 \| +-----------+-------+ # Chat RAGFlow(user)> think chat with 'qwen3:0.6b@test12@ollama' message 'who r u' Thinking: Okay, the user asked, "Who r u?" I need to respond appropriately. First, I should acknowledge their question. Since I'm an AI, I don't have a physical form, but I can confirm that I'm a large language model. I should keep the response friendly and offer help. Let me make sure I'm not making up any information and that the response is natural. Also, I should check for any typos and ensure clarity. Alright, that should cover it. Answer: I'm an AI language model, and I don't have a physical form. However, I can tell you that I'm designed to assist with questions and tasks. How can I help you today? Time: 2.914285 RAGFlow(user)> stream think chat with 'qwen3:0.6b@test12@ollama' message 'who r u' Thinking: , the user asked, "Who are you?" I need to respond appropriately. Since I'm an AI assistant, I should mention that I don't have a physical form or a mind. I should also clarify that I can help with various tasks like answering questions or providing information. It's important to keep the response friendly and informative while maintaining the correct tone. Answer: don't have a physical form or a mind, but I'm here to help with your questions or tasks! What can I do for you today? 
Time: 1.740047 # ListModels RAGFlow(user)> list supported models from 'ollama' 'test12' +-------------------------+ \| model_name \| +-------------------------+ \| nomic-embed-text:latest \| \| qwen3:0.6b \| +-------------------------+ ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-05-25 18:55:03 +08:00
Haruko386	69f301b84a	Go: implement embed for Tencent Hunyuan (#15207 ) ### What problem does this PR solve? Implement embed for Tencent Hunyuan Verified from CLI ``` RAGFlow(user)> embed text 'what is rag' 'who are you' with 'hunyuan-embedding@test1@hunyuan' dimension 16; +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 1024 \| 0 \| \| 1024 \| 1 \| +-----------+-------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-25 16:04:17 +08:00
ちー	bb6cfc14e6	feat[go]: implement provider: TokenHub (#15159 ) ### What problem does this PR solve? implement provider TokenHub ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-25 16:02:50 +08:00
Haruko386	5d022d83e8	Go: implement provider: PaddleOCR_Local (#15158 ) ### What problem does this PR solve? Go: implement provider: PaddleOCR_Local Verified from CLI ``` RAGFlow(user)> ocr with 'PaddleOCR-VL@test@paddleocr_local' file './internal/test1.jpg' +----------------------+ \| text \| +----------------------+ \| ## Parallel to these \| +----------------------+ ``` ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) - [X] New Feature (non-breaking change which adds functionality) - [X] Refactoring	2026-05-25 12:12:57 +08:00
dripsmvcp	8d8ea71877	Go: implement provider: Tencent Hunyuan (#15092 ) ## Summary - Adds a `Hunyuan` Go driver so the new API server can route Tencent Hunyuan chat instances (registered in `conf/llm_factories.json:3830` as `Tencent Hunyuan`). Follows the same SaaS-driver shape used for Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and LongCat. Closes #15087 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-25 11:04:39 +08:00
bitloi	432e966414	fix(go): support OpenAI audio endpoints (#15104 ) ### What problem does this PR solve? Closes #15102. OpenAI's Go provider config advertises `whisper-1` as ASR and `tts-1` as TTS, but the Go driver returned `openai, no such method` for both audio paths and did not define `url_suffix.asr` / `url_suffix.tts`. This PR: - adds OpenAI audio URL suffixes for `audio/transcriptions` and `audio/speech` - implements non-streaming `TranscribeAudio` using multipart form uploads - implements non-streaming `AudioSpeech` using the OpenAI speech JSON request shape - keeps streaming TTS explicitly unsupported instead of sending binary audio through the text SSE sender - adds focused tests for config coverage, ASR/TTS request shape, required TTS voice validation, and unsupported streaming TTS ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-25 10:25:53 +08:00
Tohka	302f97de50	Go: implement reasoning_chat, TTS, ASR for Groq (#15153 ) ### What problem does this PR solve? Go: implement reasoning_chat, TTS, ASR for Groq Verify from CLI ``` RAGFlow(user)> think chat with 'qwen/qwen3-32b@test@groq' message 'who r u' Thinking: Okay, the user asked, who r u. I need to determine what the user is asking. They may be asking about my identity. I should introduce my name and basic functions. The user might want to know what I can do, so I should list some common use cases, such as answering questions, creating writing, coding, and expressing opinions. The user may be curious about how they can interact with me, so they can be advised to ask any questions or provide instructions. Keep your answers conversational, avoid overly technical terms, keep answers concise, and encourage further interaction. Check if there's any ambiguity in the answer and make sure it's accurate and meets the user's needs. Also consider if there are other aspects the user may be interested in, such as my training data or performance. But since the question is basic, I'll focus on the essentials first and invite the user to ask more. In summary, respond to the user's questions by introducing yourself, your functions, and encouraging further interaction. Answer: Hello! I'm Qwen. I am a large-scale language model developed by Tongyi Lab, designed to assist you in various ways, such as answering questions, creating text, logical reasoning, programming, and more. I aim to provide clear, accurate, and helpful information and support. How can I assist you today? Feel free to ask any questions or give me tasks! 😊 Time: 2.199908 RAGFlow(user)> stream think chat with 'openai/gpt-oss-20b@test@groq' message 'who r u' Thinking: to respond politely. Answer: ’m ChatGPT—an AI language model created by OpenAI. I’m here to answer questions, offer explanations, and help with a wide range of topics. How can I assist you today? 
RAGFlow(user)> tts with 'canopylabs/orpheus-arabic-saudi@test@groq' text 'hello? show yourself' play format 'wav' param '{"voice": "fahad"}' SUCCESS RAGFlow(user)> asr with 'whisper-large-v3-turbo@test@groq' audio './internal/test.wav' param '{"language": "en"}' +----------------------------------------------------------------------------------------------------------------------+ \| text \| +----------------------------------------------------------------------------------------------------------------------+ \| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired \| +----------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-22 18:02:30 +08:00
Haruko386	3f02ca7ba1	Go: implement embed, rerank, tts for AstraFlow (#15135 ) ### What problem does this PR solve? implement embed, rerank, tts for AstraFlow Verify from CLI ``` # Astraflow RAGFlow(user)> tts with 'IndexTeam/IndexTTS-2@test3@astraflow' text 'hello? show yourself' play format 'wav' param '{"voice": "jack_cheng"}' SUCCESS RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'bge-reranker-v2-m3@test3@astraflow' top 3; +-------+---------------------+ \| index \| relevance_score \| +-------+---------------------+ \| 0 \| 0.9837390184402466 \| \| 2 \| 0.06322699040174484 \| \| 1 \| 0.04663187265396118 \| +-------+---------------------+ RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test3@astraflow' dimension 16 +-----------+-------+ \| dimension \| index \| +-----------+-------+ \| 3072 \| 0 \| \| 3072 \| 1 \| +-----------+-------+ # Xinference ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-22 18:02:01 +08:00
ghost	f9ce07ced1	feat(go-models): add Groq provider driver (#15097 ) ### What problem does this PR solve? Closes #15088. Adds Groq support to the Go model-provider layer so Groq instances can be routed through the Go API server with the same OpenAI-compatible chat, streaming, model listing, and connection-check flow used by other SaaS providers. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary - Added a Groq Go model driver. - Added the Groq provider catalog and default OpenAI-compatible API URL. - Registered Groq in the model factory. - Added focused provider tests. ## What changed - Implemented chat completions, SSE streaming, ListModels, and CheckConnection for Groq. - Covered request shape, stream termination, reasoning fallback, model listing, custom base URLs, safe transport setup, and unsupported methods. - Kept the provider catalog scoped to current Groq chat-capable model IDs. - Cleaned up pre-existing Go model package validation blockers so the package can be tested normally with vet enabled. ## Why The existing Python/provider catalog path includes Groq, but the Go model-provider layer did not have a Groq driver, so the Go API server could not instantiate or use Groq as requested in #15088. ## Notes The model package now validates without disabling vet. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-22 15:24:52 +08:00
dripsmvcp	ed04893415	Go: implement provider: TokenPony (#15091 ) ## Summary - Adds a `TokenPony` Go driver so the new API server can route TokenPony chat instances, matching the existing Python `TokenPonyChat` (`rag/llm/chat_model.py:1210`). Follows the same SaaS-driver shape used for Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and LongCat. Closes #15086 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-22 15:21:45 +08:00
Jake Armstrong	b1ef5d365f	Go: implement ASR in OpenRouter driver (#15067 ) ### What problem does this PR solve? Fixes #15066 OpenRouter now exposes an official speech-to-text endpoint at `POST /api/v1/audio/transcriptions`, but the Go model driver still returned `openrouter, no such method` from `TranscribeAudio`. This left OpenRouter ASR models unavailable through the Go API server even though the provider already has OpenRouter audio support for TTS. Related provider-tracking context: #14736 ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-22 15:19:38 +08:00
Jake Armstrong	b2bf9155ed	Go: implement ASR in ZhipuAI driver (#15134 ) ### What problem does this PR solve? This PR implements ASR and TTS support for the ZhipuAI Go driver. The ZhipuAI model config already advertises `glm-asr-2512` as an ASR model, but the Go driver returned `zhipu, no such method` from `TranscribeAudio`. This adds the documented audio transcription endpoint suffix and sends multipart transcription requests with `model`, `stream=false`, and `file` fields. Per maintainer review, this also adds the ZhipuAI TTS endpoint suffix and implements `AudioSpeech` / `AudioSpeechWithSender` for `glm-tts`. Closes #15133 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-05-22 11:53:18 +08:00
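The multipart transcription request described in the ZhipuAI commit above (#15134) could be assembled roughly as follows. This is a minimal sketch using Go's standard `mime/multipart` package, not the driver's code: the helper name `buildTranscriptionBody` and its parameters are hypothetical, and only the three form fields (`model`, `stream=false`, `file`) come from the commit message.

```go
package main

import (
	"bytes"
	"mime/multipart"
)

// buildTranscriptionBody is a hypothetical helper, not the driver's code.
// It writes the three form fields named in the commit message and returns
// the encoded body plus the Content-Type value that carries the boundary.
func buildTranscriptionBody(model, fileName string, audio []byte) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)

	if err := w.WriteField("model", model); err != nil {
		return nil, "", err
	}
	if err := w.WriteField("stream", "false"); err != nil {
		return nil, "", err
	}
	part, err := w.CreateFormFile("file", fileName)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(audio); err != nil {
		return nil, "", err
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.FormDataContentType(), nil
}
```

The returned content type has to be sent as the request's `Content-Type` header, since it carries the boundary the server needs to split the parts.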
ghost	b2053cc3c7	feat(go-models): add PPIO provider driver (#15099 ) ### What problem does this PR solve? Closes #15089. Adds PPIO support to the Go model-provider layer so PPIO instances can be routed through the Go API server with the same OpenAI-compatible chat, streaming, model listing, and connection-check flow used by other SaaS providers. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary - Added a PPIO Go model driver. - Added the PPIO provider catalog and default OpenAI-compatible API URL. - Registered PPIO in the model factory. - Added focused provider and provider-manager tests. ## What changed - Implemented chat completions, SSE streaming, ListModels, and CheckConnection for PPIO. - Covered request shape, stream termination, reasoning fallback, model listing, custom base URLs, safe transport setup, unsupported methods, and provider config loading. - Kept the provider catalog aligned with the existing RAGFlow PPIO factory model set. - Cleaned up pre-existing Go model package validation blockers so the scoped provider tests can run normally with vet enabled. ## Why The existing Python/provider catalog path includes PPIO, but the Go model-provider layer did not have a PPIO driver, so the Go API server could not instantiate or use PPIO as requested in #15089.	2026-05-22 11:52:18 +08:00
Haruko386	1ece1c81da	Go: implement rerank, asr, tts for TogetherAI (#15107 ) ### What problem does this PR solve? implement rerank, asr, tts for TogetherAI ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-21 20:57:04 +08:00
Haruko386	a725e114f9	Go: implement ASR and TTS for Xinference (#15096 ) ### What problem does this PR solve? implement ASR and TTS for Xinference ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-21 18:28:06 +08:00
tmimmanuel	85d0b46d8e	fix(mistral): handle structured content from magistral reasoning models (#14805 ) ### What problem does this PR solve? `MistralModel.ChatWithMessages` (in the driver merged via #14807) assumes that `choices[0].message.content` from `/v1/chat/completions` is always a string and falls through to `return nil, fmt.Errorf("invalid content format")` on anything else. That assumption breaks for the magistral reasoning family (`magistral-small-`, `magistral-medium-`). When the model needs a chain-of-thought to answer, Mistral returns `content` as a structured array of typed parts: ```json "content": [ {"type": "thinking", "thinking": [{"type": "text", "text": "Combined speed is 150 mph. 300 / 150 = 2 hours."}], "closed": true}, {"type": "text", "text": "They will meet after 2 hours."} ] ``` Concretely, this is what the live API returns today (probed against `api.mistral.ai/v1`): ``` $ curl -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \ -X POST https://api.mistral.ai/v1/chat/completions \ -d '{"model":"magistral-medium-latest", "messages":[{"role":"user","content":"two trains 60mph and 90mph, 300mi apart, when do they meet? step by step."}], "max_tokens":1024}' HTTP 200 { "choices":[{"message":{ "role":"assistant", "content":[ {"type":"thinking","thinking":[{"type":"text","text":"Okay, let's see..."}],"closed":true}, {"type":"text","text":"To determine when the two trains meet..."} ]}}] } ``` With the current driver, every call like that returns the generic `"invalid content format"` error. Trivial prompts that happen to fit in a string answer still succeed, so the breakage is non-deterministic from the tenant's POV: same model, same provider, sometimes works, sometimes 500s with no useful error. A secondary issue: `conf/models/mistral.json` does not include any magistral model. The picker hid the broken path, which is why this wasn't caught during #14807's review. 
### What this PR includes - New helper `extractMistralContent(raw interface{}) (answer, reasonContent string, err error)` in `internal/entity/models/mistral.go`, which normalizes both shapes Mistral can return: - `string` → historical path. `Answer = content`, `ReasonContent = ""`. Preserves behavior for every non-reasoning model (`mistral-large-`, `mistral-small-`, `ministral-`, `codestral-`, `pixtral-`, `open-mistral-nemo`). - `[]interface{}` → walk the parts. Concatenate every `{"type":"text", "text":...}` part into `Answer`; concatenate the inner text inside every `{"type":"thinking", "thinking":[...]}` part into `ReasonContent`. - `ChatWithMessages` now calls the helper instead of doing the raw `.(string)` cast. - Unknown part types are skipped, not failed. Mistral has been adding new content variants quickly (audio chunks, citations, etc.); this driver should not 500 every call when a new part type appears. - `conf/models/mistral.json`: add `magistral-medium-latest` and `magistral-small-latest`. Both are visible in `/v1/models` today. No interface change. No factory change. No new dependencies. ### How was this tested? 
Unit tests: 5 new tests in `internal/entity/models/mistral_test.go` on top of the 27 already shipped via #14807: - `TestMistralChatHandlesStringContent`: regression net for the historical path - `TestMistralChatExtractsReasoningFromStructuredContent`: the fixture body is a trimmed copy of the actual `magistral-medium-latest` response captured above; asserts both `Answer` and `ReasonContent` are populated correctly - `TestMistralChatHandlesStructuredContentWithoutThinking`: `magistral-` with a trivial answer returns a structured shape that has only a `text` part; `ReasonContent` must stay empty - `TestMistralChatIgnoresUnknownContentPartTypes`: `audio_url` and `future_part_type` parts are skipped, `text` parts still flow through - `TestExtractMistralContent`: table-driven unit coverage of the helper for string, empty string, nil, empty array, text-only, thinking+text, unsupported root type ``` $ go test -vet=off -run "TestMistral|TestExtractMistralContent" -count=1 -v ./internal/entity/models/... === RUN TestMistralChatHandlesStringContent --- PASS: TestMistralChatHandlesStringContent (0.00s) === RUN TestMistralChatExtractsReasoningFromStructuredContent --- PASS: TestMistralChatExtractsReasoningFromStructuredContent (0.00s) === RUN TestMistralChatHandlesStructuredContentWithoutThinking --- PASS: TestMistralChatHandlesStructuredContentWithoutThinking (0.00s) === RUN TestMistralChatIgnoresUnknownContentPartTypes --- PASS: TestMistralChatIgnoresUnknownContentPartTypes (0.00s) === RUN TestExtractMistralContent === RUN TestExtractMistralContent/plain_string === RUN TestExtractMistralContent/empty_string === RUN TestExtractMistralContent/nil === RUN TestExtractMistralContent/empty_array === RUN TestExtractMistralContent/text_only === RUN TestExtractMistralContent/thinking_then_text === RUN TestExtractMistralContent/unknown_root_type --- PASS: TestExtractMistralContent (0.00s) PASS ok ragflow/internal/entity/models 0.046s ``` All 32 Mistral tests pass on Go 1.25. 
`go build ./internal/entity/models/...` exits 0. Live integration test: driver exercised against `api.mistral.ai/v1` with the patched code: ``` === RUN TestMistralMagistralSmoke [OK] "magistral-small-latest" present upstream [OK] "magistral-medium-latest" present upstream [OK trivial] Answer="7" ReasonContent="" [OK reasoning] Answer len=797 head="To determine when the two trains meet, we can follow these steps:\n\n1. **Identify..." ReasonContent len=1069 head="Okay, let's see. There are two trains, one going 60 mph and the other going 90 mph. They're moving towards each other, s..." MAGISTRAL SMOKE PASSED --- PASS: TestMistralMagistralSmoke (18.09s) PASS ok ragflow/internal/entity/models 18.112s ``` What the live run proves on the wire: - `magistral-small-latest` with a trivial prompt still uses the string-content shape; the regression-net path is exercised against the real server, not just the mock. - `magistral-medium-latest` with a reasoning prompt uses the structured-array shape; the new code path extracts a 1069-character reasoning trace into `ChatResponse.ReasonContent` and a 797-character visible answer into `ChatResponse.Answer`. Before this fix, the same call returned `"invalid content format"` and the caller saw nothing. The smoke-test file itself is not committed (live tests live outside the PR diff, same convention used for prior provider PRs). ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-21 15:33:14 +08:00
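The content normalization described in the Mistral fix above (#14805) can be sketched in Go as follows. This is a minimal sketch under stated assumptions, not the code from the PR: the function name `extractContent` is hypothetical, and returning empty strings for a `nil` payload is an assumption, since the commit message lists a `nil` test case without stating the expected result.

```go
package main

import (
	"fmt"
	"strings"
)

// extractContent is a hypothetical re-creation of the helper described in
// the commit message. It accepts the decoded JSON value of
// choices[0].message.content and splits it into the visible answer and the
// reasoning trace.
func extractContent(raw interface{}) (answer, reasonContent string, err error) {
	switch content := raw.(type) {
	case nil:
		// Assumption: a missing content field yields empty strings, not an error.
		return "", "", nil
	case string:
		// Historical shape: the whole content is the answer.
		return content, "", nil
	case []interface{}:
		var ans, reason strings.Builder
		for _, p := range content {
			part, ok := p.(map[string]interface{})
			if !ok {
				continue // malformed part: skip instead of failing the call
			}
			switch part["type"] {
			case "text":
				if s, ok := part["text"].(string); ok {
					ans.WriteString(s)
				}
			case "thinking":
				// A thinking part wraps its own list of inner text parts.
				inner, _ := part["thinking"].([]interface{})
				for _, ip := range inner {
					m, ok := ip.(map[string]interface{})
					if !ok {
						continue
					}
					if s, ok := m["text"].(string); ok {
						reason.WriteString(s)
					}
				}
			}
			// Any other part type is skipped on purpose.
		}
		return ans.String(), reason.String(), nil
	default:
		return "", "", fmt.Errorf("invalid content format: %T", raw)
	}
}
```

Skipping unknown part types rather than returning an error matches the rationale given in the commit message: a new content variant from the provider should not break every call.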
BitToby	bd4ce39038	Go: implement provider: Perplexity (#15008 ) ## What - Add Perplexity as a chat and embedding provider backed by its OpenAI-compatible `/chat/completions` and `/v1/embeddings` APIs - Register Perplexity in the Go model factory and provider config - Support non-streaming chat, SSE streaming chat, embeddings, model listing, and connection checks Refs #14736 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:33:02 +08:00
dripsmvcp	d5ba14a128	feat(go): implement provider Astraflow (#15062 ) (#15064 ) - Adds an `Astraflow` Go driver so the new API server can route Astraflow (UCloud ModelVerse) chat instances, matching the existing Python `AstraflowChat` (`rag/llm/chat_model.py:1237`). Follows the same SaaS-driver shape used for Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and LongCat. Closes #15062 --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:32:56 +08:00
dripsmvcp	5a18df0fd0	Go: implement provider: Avian (#15045 ) Closes #15044. Avian was listed unchecked in the Go-rewrite tracker #14736 and already had an llm_factories.json entry with 4 preconfigured chat models (deepseek-v3.2, kimi-k2.5, glm-5, minimax-m2.5), but the Go API server had no driver to route them. The Python side has supported Avian at rag/llm/chat_model.py:1220 (AvianChat) via the LiteLLM openai/ provider with default base https://api.avian.io/v1. Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-21 15:32:49 +08:00


132 Commits