ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 15:31:05 +08:00

Author	SHA1	Message	Date
Tim Wang	ca96d61e73	Feat: Add New API model provider for OpenAI-compatible gateways (#15991 ) ## Summary Add support for "New API" as a model provider, enabling connection to [New API](https://github.com/QuantumNous/new-api) / [one-api](https://github.com/songquanpeng/one-api) compatible gateways that aggregate multiple LLM backends behind a unified OpenAI-compatible `/v1` endpoint. ### Features - All model types: Chat, Embedding, Rerank, Image2Text, TTS, Speech2Text - List Models discovery: `NewAPI(OpenAIAPICompatible)` class in `model_meta.py` queries the gateway's `/v1/models` to auto-discover available models via the native `GET /api/v1/providers/<name>/models` endpoint - Model parameter editing: Pencil icon on each discovered model row to edit `model_type`, `max_tokens`, and `features` (e.g. tool call support) before submitting - Custom model addition: "Add Custom Model" button at the bottom of the List Models dropdown for models not returned by the API - Gear icon settings: Enabled the Settings gear button on provider instances to manage models on existing instances (viewMode) - viewMode credential passthrough: Fixed List Models in viewMode — merges `initialValues` credentials when `api_key`/`base_url` fields are hidden by `hideWhenInstanceExists` ### Changes Backend (8 files): - `rag/llm/chat_model.py` — `NewAPIChat(Base)` class - `rag/llm/embedding_model.py` — `NewAPIEmbed(OpenAIEmbed)` class (no auto `/v1` append) - `rag/llm/rerank_model.py` — `NewAPIRerank(Base)` class (uses `/rerank` endpoint) - `rag/llm/cv_model.py` — `NewAPICv(GptV4)` class - `rag/llm/tts_model.py` — `NewAPITTS(OpenAITTS)` class - `rag/llm/sequence2txt_model.py` — `NewAPISeq2txt(GPTSeq2txt)` class - `rag/llm/model_meta.py` — `NewAPI(OpenAIAPICompatible)` class for List Models discovery - `conf/llm_factories.json` — New API factory entry with all model type tags Frontend (8 files + 1 new SVG): - `web/src/assets/svg/llm/new-api.svg` — New API logo icon - `web/src/constants/llm.ts` — 
`LLMFactory.NewAPI` enum + `IconMap` entry - `web/src/components/svg-icon.tsx` — `NewAPI` added to `svgIcons` - `web/src/pages/user-setting/setting-model/modal/provider-modal/field-config/local-llm-configs.ts` — New API `buildLocalConfig` - `web/src/pages/user-setting/setting-model/modal/provider-modal/constants.ts` — `LIST_MODEL_PROVIDERS` includes NewAPI - `web/src/pages/user-setting/setting-model/components/used-model.tsx` — Enable Settings gear button - `web/src/pages/user-setting/setting-model/modal/provider-modal/hooks/use-list-models-picker.ts` — viewMode credential merge + model editing state/handlers - `web/src/pages/user-setting/setting-model/modal/provider-modal/hooks/use-list-models-options.tsx` — Pencil edit icon per model row - `web/src/pages/user-setting/setting-model/modal/provider-modal/index.tsx` — `AddCustomModelDialog` import + edit dialog rendering Note on Go implementation: A Go model driver (`NewAPIModel` delegating to `OpenAIModel`) has been prepared but is deferred until the Go runtime is enabled in a future release (current v0.26.0 images use `API_PROXY_SCHEME=python` and do not compile Go binaries). Will submit as a follow-up PR. 
## Related - Depends on: #15996 (provider instance API improvements — server-side credential lookup, idempotent `add_model`, security fixes — required for viewMode gear icon and batch model submission) ## Test plan - [ ] Add New API provider with api_key and base_url pointing to an OpenAI-compatible gateway - [ ] Click "List Models" — should discover and display available models from `/v1/models` - [ ] Click pencil icon on a model — should open edit dialog to change model_type, max_tokens, features - [ ] Select multiple models and click OK — should add all selected models - [ ] Click gear icon on the added instance — should open viewMode with List Models working - [ ] In viewMode, select new models including pre-existing ones, click OK — should succeed (requires #15996) - [ ] Verify all model types work: create a Chat assistant, Embedding KB, Rerank setting 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Tim Wang <wanghualoong@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-26 18:47:20 +08:00
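The List Models discovery described in #15991 relies on the gateway's OpenAI-compatible `/v1/models` response. A minimal Python sketch of that step, assuming the standard OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`). The helper names `models_url` and `model_ids`, the localhost URL, and the sample payload are illustrative and are not taken from `model_meta.py`:

```python
def models_url(base_url: str) -> str:
    """Join a base URL that already ends in /v1 with the models path."""
    # Assumption: the user-supplied base_url already includes the /v1 prefix.
    return base_url.rstrip("/") + "/models"


def model_ids(body: dict) -> list:
    """Extract model IDs from an OpenAI-style list-models response."""
    return sorted(item["id"] for item in body.get("data", []) if "id" in item)


# Made-up response, for illustration only.
sample = {"object": "list", "data": [{"id": "gpt-4o"}, {"id": "bge-m3"}]}
url = models_url("http://localhost:3000/v1/")
ids = model_ids(sample)
```

Entries without an `id` are skipped rather than raising, which keeps discovery tolerant of gateways that return partial rows.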
Lynn	bf1eabea72	Feat: support new qwen model (#16385 )	2026-06-26 17:30:16 +08:00
Hz_	0de8f3e127	feat: add missing qwen models to all_models.json (#16379 ) Add 19 missing qwen models and 3 aliases to all_models.json. Models added: qwen-image-2.0-pro (2026-06-22, 2026-04-22), qwen3.5-ocr, qwen3.7-max-2026-05-17, qwen3.5-livetranslate-flash-realtime, qwen3.5-omni-plus/flash-realtime, qwen-deep-research-2025-12-15, qwen-flash-character-2026-02-26, qwen-plus-2025-11-05, qwen-deep-search-planning, qwen3-s2s-flash-realtime-2025-09-22, qwen-max-1201/longcontext/0107, qwen-1.8b-longcontext-chat Aliases: qwen3.5-plus-2026-04-20, qwen-turbo-0919, qwen-1.8b-chat	2026-06-26 15:35:30 +08:00
Jin Hai	65afaa1292	Model config: add tools (#16371 ) ### What problem does this PR solve? ``` { "name": "glm-4-flash", "max_tokens": 128000, "model_types": [ "chat" ], "tools": { "support": true } } ``` ``` RAGFlow(admin)> list provider 'zhipu-ai' models; +------------+---------------+------------+---------------+----------------+-----------+-----------+ \| dimensions \| max_dimension \| max_tokens \| model_type \| name \| thinking \| tools \| +------------+---------------+------------+---------------+----------------+-----------+-----------+ \| \| \| 204800 \| [chat] \| glm-5 \| supported \| supported \| \| \| \| 204800 \| [chat] \| glm-5-turbo \| supported \| supported \| ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-26 11:37:51 +08:00
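The `tools` block in #16371 can be read with a few lines of Python. This is a sketch only: the entry is copied from the commit message, while the helper name `tools_label` and the empty-string rendering for unsupported models are assumptions rather than the CLI's actual behavior:

```python
import json

# Model entry copied from the commit message above.
ENTRY = json.loads("""
{"name": "glm-4-flash", "max_tokens": 128000,
 "model_types": ["chat"], "tools": {"support": true}}
""")


def tools_label(entry: dict) -> str:
    """Return a CLI-style label for a model's tool-call support."""
    # Assumption: a missing or false `tools.support` renders as an empty cell.
    return "supported" if entry.get("tools", {}).get("support") else ""


label = tools_label(ENTRY)
```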
Jin Hai	7214a23614	Go: fix duplicate models (#16197 ) ### What problem does this PR solve? 1. Remove unused file 2. Remove duplicate models 3. Reorder the functions ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-19 09:57:58 +08:00
Haruko386	b53b5bf12c	Json add paddleOCR models (#16156 ) close #15853 ### What problem does this PR solve? As title ### Type of change - [x] Other (add models):	2026-06-18 17:57:41 +08:00
Wang Qi	b3ac03b96c	Set default Paddle OCR URL (#16128 )	2026-06-17 14:29:20 +08:00
Rander	1235da7093	refactor(paddleocr): migrate from sync API to async Job API (#15967 ) ## Summary Migrate PaddleOCR integration from the deprecated synchronous HTTP API to the new asynchronous Job API (`submit → poll → fetch`), aligning with PaddleOCR 3.6.0+ architecture. ## Changes ### Python (`deepdoc/parser/paddleocr_parser.py`) - Replace synchronous `requests.post()` with async Job API flow (submit → poll → fetch) - Authentication: `token {token}` → `Bearer {token}` - File transfer: base64 JSON body → multipart file upload - Polling: exponential backoff (initial 3s, ×1.5, max 15s, timeout controlled by `request_timeout`) - Result: fetch full JSONL from result URL, preserving `prunedResult` with bbox info for crop functionality - Rename `api_url` → `base_url` (backward compatible: `api_url` still accepted as fallback) ### Python (`rag/llm/ocr_model.py`) - Prefer `paddleocr_base_url` / `PADDLEOCR_BASE_URL`, fallback to `paddleocr_api_url` / `PADDLEOCR_API_URL` ### Go (`internal/entity/models/paddleocr.go`) - Add `Client-Platform: ragflow` header to submit and poll requests - Change polling from fixed 3s to exponential backoff (initial 3s, ×1.5, max 15s) ### Python (`common/constants.py`) - Add `PADDLEOCR_BASE_URL` to env keys and default config ## Backward Compatibility - Old env var `PADDLEOCR_API_URL` still works (used as fallback) - Frontend field `paddleocr_api_url` still works (backend reads it as fallback) - No user-facing configuration changes required for existing setups ## Why not use the `paddleocr` SDK package directly? RAGFlow's `_transfer_to_sections()` relies on `prunedResult` (containing `block_bbox`, `block_label`, `parsing_res_list`) from the raw API response for PDF crop functionality. The SDK's public `parse_document()` API only returns `DocParsingResult` with `markdown_text`, discarding the bbox data. Therefore we implement the async Job API flow directly via HTTP, following the same logic as the SDK internally.	2026-06-16 19:34:21 +08:00
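The polling schedule in #15967 (initial 3s, ×1.5, max 15s) can be sketched as a small generator. This is an illustration of the schedule only, not the code in `paddleocr_parser.py`; the function name `backoff_delays` and the 60-second timeout are assumptions (the commit says the timeout comes from `request_timeout`):

```python
from typing import Iterator


def backoff_delays(initial: float = 3.0, factor: float = 1.5,
                   cap: float = 15.0, timeout: float = 60.0) -> Iterator[float]:
    """Yield poll delays while their running total stays within `timeout`."""
    delay, elapsed = initial, 0.0
    while elapsed + delay <= timeout:
        yield delay
        elapsed += delay
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, cap)


# Hypothetical 60-second budget, for illustration only.
delays = list(backoff_delays(timeout=60.0))
```

With these assumed values the delays grow from 3s to the 15s cap after four polls and then stay there until the budget is spent.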
Hz_	8047857de0	fix(go): all_models.json (#16075 ) ### What problem does this PR solve? This PR fixes Go admin server startup failure caused by duplicate model aliases in conf/all_models.json. The model provider loader builds a global lookup table from both model name and alias values. Some aliases duplicated another model's name or another alias, for example amazon.titan-embed-text-v1, which caused startup to fail with a duplicate alias error. This PR removes conflicting duplicate aliases while keeping all model definitions intact.	2026-06-16 15:31:17 +08:00
Hz_	4a33455a20	feat(go-models): add more providers (#16017 ) ### What problem does this PR solve? add more providers.	2026-06-16 12:54:19 +08:00
Haruko386	cafd8a1125	Json: add many models to all_models.json (#16013 ) ### What problem does this PR solve? As title ### Type of change - [x] Other (please describe): add some models	2026-06-15 15:25:49 +08:00
Hz_	eb6ea284a8	feat(go-models): Add google models to all_models.json (#16007 ) ### What problem does this PR solve? Add google models to all_models.json	2026-06-15 11:37:56 +08:00
zaviermeekz-cpu	83e2180e80	fix: use /api/tags endpoint for Ollama model listing (#16000 ) (#16003 ) After upgrading to v0.26.0, the Ollama provider returns an empty model list because the Go rewrite uses `/api/ps` (only running models) instead of `/api/tags` (all installed models). This PR changes the endpoint to `/api/tags`, restoring the ability to list and add Ollama models. Closes #16000	2026-06-15 10:20:15 +08:00
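The difference behind #16003 is which Ollama endpoint is read: `/api/tags` lists installed models, while `/api/ps` lists only running ones. A minimal Python sketch of parsing a `/api/tags` body, assuming the usual `{"models": [{"name": ...}]}` shape. The actual fix lives in the Go driver; the function name and sample payload here are illustrative:

```python
import json


def installed_models(tags_body: str) -> list:
    """Return model names from an Ollama /api/tags JSON response body."""
    payload = json.loads(tags_body)
    return [m["name"] for m in payload.get("models", [])]


# Made-up response body, for illustration only.
sample_tags = '{"models": [{"name": "llama3:latest"}, {"name": "bge-m3:latest"}]}'
names = installed_models(sample_tags)
```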
Haruko386	4115282c5f	Json[model-provider] add nvidia, moonshot, minimax, claude, GPT models (#15970 ) ### What problem does this PR solve? As title ### Type of change - [x] Other (please describe): add models	2026-06-12 19:16:10 +08:00
Haruko386	547139da29	fix(Go-models): preserve model name lookup when aliases exist (#15969 ) ### What problem does this PR solve? As title ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Documentation Update	2026-06-12 19:15:28 +08:00
Jin Hai	e96bc37d06	Go: use NATS as the message queue (#15327 ) ### What problem does this PR solve? ``` RAGFlow(admin)> mq publish 'msg2'; SUCCESS RAGFlow(admin)> mq publish 'msg3'; SUCCESS RAGFlow(admin)> mq list; +---------+---------------+ \| message \| subject \| +---------+---------------+ \| msg1 \| tasks.RAGFLOW \| \| msg2 \| tasks.RAGFLOW \| \| msg3 \| tasks.RAGFLOW \| +---------+---------------+ RAGFlow(admin)> mq pull 2; +---------+---------------+ \| message \| subject \| +---------+---------------+ \| msg1 \| tasks.RAGFLOW \| \| msg2 \| tasks.RAGFLOW \| +---------+---------------+ RAGFlow(admin)> mq pull noack; +---------+---------------+ \| message \| subject \| +---------+---------------+ \| abc \| tasks.RAGFLOW \| +---------+---------------+ RAGFlow(admin)> mq show +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ \| ack_pending_count \| consumer_count \| memory \| message_count \| pending_count \| redelivered_count \| waiting_count \| +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ \| 2 \| 1 \| 0 \| 2 \| 0 \| 1 \| 0 \| +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ RAGFlow(admin)> list ingestors; +--------------+-------------------------------------------+--------+ \| host \| name \| status \| +--------------+-------------------------------------------+--------+ \| 192.168.1.38 \| ingestor-8f0e4bd5650a4ac58b0151969fbf6935 \| alive \| +--------------+-------------------------------------------+--------+ RAGFlow(admin)> list ingestion tasks; +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ \| document_id \| id \| status \| step \| user \| user_id \| 
+----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ \| ffe64fae423411f1a2d938a74640adcc \| 90d3d0f6528941c1ac8eb0360effccc4 \| COMPLETED \| 5 \| aaa@aaa.com \| 2ba4881420fa11f19e9c38a74640adcc \| +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ RAGFlow(admin)> remove ingestion tasks '90d3d0f6528941c1ac8eb0360effccc4'; +---------+----------------------------------+ \| delete \| task_id \| +---------+----------------------------------+ \| success \| 90d3d0f6528941c1ac8eb0360effccc4 \| +---------+----------------------------------+ RAGFlow(admin)> stop ingestion tasks 'e89e20d9a25848a1b79bd9345ddbfe1d'; +----------+----------------------------------+ \| status \| task_id \| +----------+----------------------------------+ \| STOPPING \| e89e20d9a25848a1b79bd9345ddbfe1d \| +----------+----------------------------------+ # Publish a message RAGFlow(admin)> mq publish 'cdd'; SUCCESS # List current tasks in the message queue RAGFlow(admin)> mq list +----------------------------------+---------------+ \| message \| subject \| +----------------------------------+---------------+ \| 7ce392a3c1624cd2be4b5276e8825059 \| tasks.RAGFLOW \| +----------------------------------+---------------+ # Consume a task from the message queue RAGFlow(admin)> mq pull +------+-----+----------------+ \| ack \| id \| type \| +------+-----+----------------+ \| true \| cdd \| ingestion_test \| +------+-----+----------------+ # User mode # List ingestion tasks, followed by dataset id RAGFlow(user)> list ingestion tasks from '0abe79f9423311f1ad8d38a74640adcc'; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| create_date \| 
create_time \| dataset_id \| document_id \| id \| schema \| status \| update_date \| update_time \| user_id \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| 2026-05-30T20:21:06+08:00 \| 1780143666289 \| 0abe79f9423311f1ad8d38a74640adcc \| ffe64fae423411f1a2d938a74640adcc \| 8d758cd14a8b4ba8ab505003fb52017d \| \| COMPLETED \| 2026-05-30T20:21:26+08:00 \| 1780143686431 \| 2ba4881420fa11f19e9c38a74640adcc \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ RAGFlow(user)> list ingestion tasks; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| create_date \| create_time \| dataset_id \| document_id \| id \| schema \| status \| update_date \| update_time \| user_id \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| 2026-06-02T19:02:31+08:00 \| 1780398151417 \| 0abe79f9423311f1ad8d38a74640adcc \| ffe64fae423411f1a2d938a74640adcc \| e89e20d9a25848a1b79bd9345ddbfe1d \| \| COMPLETED \| 2026-06-02T19:02:52+08:00 \| 1780398172208 \| 2ba4881420fa11f19e9c38a74640adcc \| 
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ # Create an ingestion task # First argument is document id, second argument is dataset id RAGFlow(user)> start ingestion 'ffe64fae423411f1a2d938a74640adcc' from '0abe79f9423311f1ad8d38a74640adcc'; +----------------------------------+-------------------------------------------+ \| document_id \| result \| +----------------------------------+-------------------------------------------+ \| ffe64fae423411f1a2d938a74640adcc \| task_id: 8d758cd14a8b4ba8ab505003fb52017d \| +----------------------------------+-------------------------------------------+ # Pause an ingestion task, first argument is ingestion id RAGFlow(user)> stop ingestion '8d758cd14a8b4ba8ab505003fb52017d'; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| create_date \| create_time \| dataset_id \| document_id \| id \| schema \| status \| update_date \| update_time \| user_id \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| 2026-05-30T20:21:06+08:00 \| 1780143666289 \| 0abe79f9423311f1ad8d38a74640adcc \| ffe64fae423411f1a2d938a74640adcc \| 8d758cd14a8b4ba8ab505003fb52017d \| \| COMPLETED \| 2026-05-30T20:21:26+08:00 \| 1780143686431 \| 2ba4881420fa11f19e9c38a74640adcc \| 
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ # Delete an ingestion task RAGFlow(api/default)> remove ingestion tasks 'f366450a27d54677aec1c7090add30f0'; +---------+----------------------------------+ \| remove \| task_id \| +---------+----------------------------------+ \| success \| f366450a27d54677aec1c7090add30f0 \| +---------+----------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-12 14:56:44 +08:00
Hz_	30724140d2	feat(go): Add Z.ai model entries to all_models.json Add missing Qwen commercial models and provider aliases (#15929 ) ### What problem does this PR solve? - Add Z.ai model definitions to `conf/all_models.json`. - Add missing Qwen / DashScope commercial API-only models, including: - Qwen3.7 / Qwen3.6 / Qwen3.5 Max, Plus, Flash families - Qwen Coder and Math commercial models - Qwen VL, OCR, Omni, ASR, TTS, translation, image generation, and image editing models - Add verified provider-specific aliases for supported Qwen models: - DashScope / Alibaba Cloud Model Studio model IDs - OpenRouter `qwen/...` aliases - Amazon Bedrock `qwen.qwen3-*` model IDs - Add `thinking` metadata for Qwen models that officially support thinking mode. - Remove aliases that exactly duplicate their own canonical `name`.	2026-06-12 14:33:01 +08:00
Haruko386	e3be39d0de	Json: add some models (#15947 ) ### What problem does this PR solve? As title ### Type of change - [x] Other (please describe): add models	2026-06-12 14:32:21 +08:00
Idriss Sbaaoui	9871a7e0b6	fix: replicate model provider (#15933 ) ### What problem does this PR solve? Fix the Replicate model provider failing with a valid API key ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Wang Qi <wangq8@outlook.com>	2026-06-11 15:08:33 +08:00
Hz_	515acf4f60	fix(go): Fix case-insensitive model alias lookup (#15911 ) ## Summary - Normalize model alias index keys to lowercase - Detect lowercase alias collisions during provider manager initialization - Fix ListModels metadata mapping for mixed-case provider aliases	2026-06-10 20:36:43 +08:00
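The lowercase alias index and collision check from #15911 (and the duplicate-alias startup failure fixed in #16075) can be sketched in Python. This is not the Go loader: `build_alias_index` is a hypothetical helper, and only the `name` and `alias` field names come from the CLI output elsewhere in this log:

```python
def build_alias_index(models: list) -> dict:
    """Map every lowercased name and alias to its model entry."""
    index = {}
    for model in models:
        for key in [model["name"], *model.get("alias", [])]:
            norm = key.lower()
            owner = index.get(norm)
            # A key may repeat within one model, but not across two models.
            if owner is not None and owner is not model:
                raise ValueError(
                    f"duplicate alias {key!r}: used by "
                    f"{owner['name']!r} and {model['name']!r}"
                )
            index[norm] = model
    return index


# Made-up entries, for illustration only.
models = [{"name": "bge-m3", "alias": ["BAAI/bge-m3"]}]
index = build_alias_index(models)
```

Failing at load time, rather than letting the later entry win, matches the behavior the commits describe: a collision is a configuration error, not something to resolve silently.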
buua436	dcf623d60d	feat: support multi-type factory models (#15893 ) ### What problem does this PR solve? Support factory models with multiple model types, so visual chat models can be exposed as both image2text and chat while preserving the database model-type-per-record design. This also updates the SILICONFLOW model list and adds a helper script to refresh SiliconFlow models from the provider API. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-10 15:35:21 +08:00
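One way to picture the design in #15893 is that a factory model with several types expands into one record per type, so the database keeps its one-type-per-record shape. A hedged Python sketch; the record field `llm_name`, the helper name, and the sample model are hypothetical and not taken from the RAGFlow schema:

```python
def expand_by_type(factory_model: dict) -> list:
    """Produce one record per model type for a multi-type factory model."""
    return [
        {"llm_name": factory_model["name"], "model_type": model_type}
        for model_type in factory_model["model_types"]
    ]


# Made-up visual chat model, for illustration only.
records = expand_by_type(
    {"name": "example-vl-model", "model_types": ["chat", "image2text"]}
)
```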
Wang Qi	899f76af6b	Fix: add OpenRouter base_url; the UI needs at least one model selected to verify (#15894 )	2026-06-10 14:59:27 +08:00
Hz_	38755c705a	feat(go): Add DeepSeek models and Gitee alias metadata tests (#15885 ) This PR expands conf/all_models.json with DeepSeek model entries and provider aliases. Changes: - Added DeepSeek model entries across `V4`, `V3.2`, `V3.1`, `V3`, `R1`, `Coder`, `Math`, `VL`, `OCR`, `Prover`, `MoE`, and `LLM` series. - Normalized model name values to lowercase canonical IDs. - Added alias values for official DeepSeek/Hugging Face names and provider-specific names from OpenRouter, VolcEngine, SiliconFlow, HuaweiCloud, and QiniuCloud. - Preserved model metadata such as max_tokens, model_types, and thinking where applicable. - Added Gitee ListModels tests to verify DeepSeek aliases map back to model metadata from all_models.json. - Added an optional Gitee integration test gated by GITEE_LIST_MODELS_INTEGRATION=1. Test: `/usr/local/go/bin/go clean -cache`, then `/usr/local/go/bin/go test ./internal/entity/models -run 'TestGiteeListModels(MapsAllDeepSeekAliasesToModelMetadata|KeepsOwnedBySuffixAfterAliasMetadataLookup|Integration)'`	2026-06-10 13:59:23 +08:00
Jin Hai	55abf4f565	Go: new CLI command, list all models and show model (#15786 ) ### What problem does this PR solve? ``` RAGFlow(user)> list models; +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| alias \| max_tokens \| model_types \| name \| thinking \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| \| 1048576 \| [chat] \| deepseek-v4-flash \| map[clear_thinking:true default_value:true] \| \| \| 1048576 \| [chat] \| deepseek-v4-pro \| map[clear_thinking:true default_value:true] \| \| \| 1024000 \| [chat] \| minimax-m3 \| map[clear_thinking:true default_value:true] \| \| \| 64000 \| [vision] \| glm-4.5v \| map[clear_thinking:true default_value:true] \| \| [baai/bge-m3] \| 8192 \| [embedding] \| bge-m3 \| \| \| [baai/bge-reranker-v2-m3] \| 1024 \| [rerank] \| bge-reranker-v2-m3 \| \| \| \| \| [tts] \| step-audio-tts-3b \| \| \| [qwen/qwen3-asr-1.7b] \| \| [asr] \| qwen3-asr-1.7b \| \| \| [paddleocr-vl-1.5] \| \| [ocr] \| paddleocr-vl-0.9b \| \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ RAGFlow(user)> show model 'minimax-m3'; +--------------+---------------------------------------------+ \| field \| value \| +--------------+---------------------------------------------+ \| name \| minimax-m3 \| \| max_tokens \| 1024000 \| \| model_types \| [chat] \| \| thinking \| map[clear_thinking:true default_value:true] \| \| class \| \| \| alias \| \| \| ModelTypeMap \| \| +--------------+---------------------------------------------+ RAGFlow(user)> show model 'baai/bge-m3'; +--------------+---------------+ \| field \| value \| +--------------+---------------+ \| model_types \| [embedding] \| \| thinking \| \| \| class \| \| \| alias \| [baai/bge-m3] \| \| ModelTypeMap \| \| \| name \| bge-m3 \| \| max_tokens \| 8192 \| 
+--------------+---------------+ ``` --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-08 21:38:15 +08:00
Lynn	b9f06e6095	Feat: model list (#15774 ) ### What problem does this PR solve? Support model list for VolcEngine. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-08 20:18:00 +08:00
oktofeesh	25df0a6725	fix(go-models): validate URL suffix config keys (#15734 ) ## Summary Fixes typoed model-provider URL suffix keys and adds strict nested decoding so future URL suffix config mistakes fail during provider loading instead of being silently ignored.	2026-06-08 19:29:36 +08:00
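The idea in #15734, failing on unknown URL suffix keys instead of ignoring them, can be illustrated in Python. The Go change uses strict nested decoding; this sketch only mirrors the behavior. The allowed key set below is hypothetical (only `embedding` appears elsewhere in this log), as is the helper name:

```python
# Hypothetical set of recognized suffix keys; not the real provider schema.
ALLOWED_SUFFIX_KEYS = frozenset({"chat", "embedding", "rerank", "models"})


def validate_url_suffix(config: dict) -> dict:
    """Reject URL suffix configs that contain unrecognized (e.g. typoed) keys."""
    unknown = sorted(set(config) - ALLOWED_SUFFIX_KEYS)
    if unknown:
        raise ValueError(f"unknown URL suffix keys: {unknown}")
    return config


checked = validate_url_suffix({"embedding": "invoke"})
```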
Haruko386	8dc7f1d95e	Go: implement ASR and TTS for xiaomi (#15765 ) ### What problem does this PR solve? Verified from CLI ``` RAGFlow(user)> chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: Hello! I'm MiMo-v2.5, a large language model developed by Xiaomi's LLM Core Team. You can think of me as a friendly AI assistant ready to help you answer questions, have conversations, or work on creative tasks. My context window can handle up to 1 million tokens, so we can dive into pretty long discussions or documents if you'd like. What can I help you with today? Time: 3.831830 RAGFlow(user)> stream chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: there! I'm MiMo-v2.5, an AI assistant created by the Xiaomi LLM Core Team. I'm here to chat, help out, answer questions, or just have a friendly conversation. Think of me as a helpful buddy with a pretty big memory (1 million tokens worth!). What can I do for you today?😊 Time: 2.421630 RAGFlow(user)> think chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Thinking: The user is asking a simple question about who I am. According to my system prompt, I should: - Identify myself as MiMo-v2.5 - State that I was developed by the Xiaomi LLM Core Team - Answer in first person and be warm and conversational Answer: Hey there! 👋 I'm MiMo, an AI assistant created by the Xiaomi LLM Core Team. Think of me as a friendly chat buddy who's here to help you with all sorts of questions and tasks! I love having conversations, answering questions, brainstorming ideas, and helping people figure things out. Whether you want to chat, need help with something specific, or just want to explore ideas together — I'm here for it! 😊 What can I help you with today? Time: 6.651589 RAGFlow(user)> tts with 'mimo-v2.5-tts@test@xiaomi' text 'hello? 
show yourself' play format 'wav' param '{"voice": "Chloe"}' SUCCESS RAGFlow(user)> asr with 'mimo-v2.5-asr@test@xiaomi' audio './internal/test.wav' param '{"language": "zh"}' +------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------+ \| 1 The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +------------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-06-08 19:27:45 +08:00
Danut Matei	e2b0da9eea	fix(opensearch): keep the BM25 leg in hybrid search (#15760 ) ### What problem does this PR solve? Fixes the OpenSearch side of #10747: hybrid search drops the keyword (BM25) leg and ends up doing plain vector search. When a search has both a text and a vector leg, `OSConnection.search()` throws the text query away: `del q["query"]; q["query"] = {"knn": knn_query}`. The text clause only stays on as a filter inside the knn query, so it narrows the candidate set but doesn't count towards scoring. So hybrid search on OpenSearch behaves like plain vector search, unlike the Elasticsearch backend. What I changed: - When both legs are present, send a real hybrid query `{"hybrid": {"queries": [bm25, {"knn": ...}]}}` and let a normalization-processor search pipeline score and combine the two legs - Only the actual filters (kb_id, available_int, ...) go in the knn filter, not the text must clause - Create the pipeline on startup if it's missing, so there's no separate provisioning step. Name and weights can be set under `os:` in service_conf.yaml, or via `OS_HYBRID_PIPELINE`; defaults are `ragflow_hybrid_pipeline` and `[0.5, 0.5]` - normalization-processor needs OpenSearch 2.10+. On older clusters, or when the pipeline can't be created, log a warning and fall back to vector-only instead of pointing at a pipeline that doesn't exist This is only the hybrid-search fix; `create_doc_meta_idx` is already on main. Testing (there's no OpenSearch path in CI): added a unit test (`test/unit_test/rag/utils/test_opensearch_hybrid_search.py`, no services needed) that checks the query built in each case: hybrid + pipeline param for text+vector, plain knn for vector-only, plain bool for text-only, the knn filter never carrying the text query_string, and the vector-only fallback when the pipeline isn't available.
Also ran it against a real OpenSearch 2.19.1 container with a doc that matches the keyword but sits outside the knn top-k: pure knn returns `['D1','D2','D5']` (keyword doc missing), the hybrid query returns `['A','D1','D2','D5']` (keyword doc present). ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Danut Matei <matei.danut.dm@gmail.com>	2026-06-08 16:17:47 +08:00
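The query selection described in #15760 can be sketched without a cluster. This is a hedged illustration, not the `OSConnection.search()` code: the function names, the `q_vec` field, and the `min_max` / `arithmetic_mean` techniques are assumptions; only the pipeline name, the `[0.5, 0.5]` weights, and the hybrid query shape come from the commit message:

```python
from typing import Optional

PIPELINE_NAME = "ragflow_hybrid_pipeline"  # default name given in the commit


def pipeline_body(weights: tuple = (0.5, 0.5)) -> dict:
    """Search-pipeline definition using a normalization-processor."""
    # Assumption: min_max and arithmetic_mean are illustrative choices.
    return {
        "phase_results_processors": [{
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": list(weights)},
                },
            }
        }]
    }


def build_search(bm25: Optional[dict], knn: Optional[dict],
                 pipeline_available: bool = True) -> tuple:
    """Return (request body, query params) for the legs that are present."""
    if bm25 and knn and pipeline_available:
        body = {"query": {"hybrid": {"queries": [bm25, {"knn": knn}]}}}
        return body, {"search_pipeline": PIPELINE_NAME}
    if knn:  # vector-only, or the fallback when the pipeline is unavailable
        return {"query": {"knn": knn}}, {}
    return {"query": bm25}, {}  # text-only


# Made-up legs, for illustration only; `q_vec` is a hypothetical field name.
bm25 = {"bool": {"must": [{"query_string": {"query": "nats queue"}}]}}
knn = {"q_vec": {"vector": [0.1, 0.2], "k": 3}}
body, params = build_search(bm25, knn)
```

Returning the query parameters alongside the body keeps the pipeline reference out of the vector-only fallback, so a missing pipeline is never named in a request.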
buua436	6bf7056422	feat: add placeholder model metas (#15753 ) ### What problem does this PR solve? add placeholder model metas ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-08 14:54:59 +08:00
tmimmanuel	5e25e2600b	Go: implement Xiaomi chat provider (#15626 ) ### What problem does this PR solve? Implements the Xiaomi MiMo chat provider for the Go model provider layer. Reference issue: #14736 Official docs used: - Xiaomi MiMo OpenAI-compatible chat API: https://platform.xiaomimimo.com/docs/en-US/api/chat/openai-api - Xiaomi MiMo model and rate limits: https://platform.xiaomimimo.com/docs/en-US/quick-start/model - Xiaomi MiMo model hyperparameters: https://platform.xiaomimimo.com/docs/en-US/quick-start/model-hyperparameters	2026-06-08 13:09:36 +08:00
tmimmanuel	f78ef328bb	Go: implement Bedrock embeddings (#15543 ) ### What problem does this PR solve? Fixes #15542. AWS Bedrock support for the Go model provider layer was added in #15166, but embedding support was intentionally left out of scope and `BedrockModel.Embed(...)` still returned the `no such method` sentinel. This PR implements Bedrock text embeddings under the umbrella provider tracker #14736. ### What this PR includes - `internal/entity/models/bedrock.go`: implement `BedrockModel.Embed(...)` through Bedrock Runtime `InvokeModel` with existing SigV4 auth, region resolution, and runtime URL helpers. - Titan embeddings: supports `amazon.titan-embed-text-v1` and `amazon.titan-embed-text-v2:0`; v2 forwards `EmbeddingConfig.Dimension` as `dimensions` when provided, while v1 keeps the payload minimal. - Cohere embeddings: supports `cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, and `cohere.embed-v4:0`; batches input texts and maps returned vectors to RAGFlow `EmbeddingData` in input order. - `conf/models/bedrock.json`: adds the `embedding` URL suffix (`invoke`) and Bedrock embedding model entries. - `internal/entity/models/bedrock_test.go`: adds unit tests for Titan, Cohere, typed Cohere responses, validation, empty input, unsupported models, and HTTP error propagation. Reference docs: - Bedrock InvokeModel API: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html - Titan Text Embeddings: https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html - Cohere Embed models on Bedrock: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - [x] `jq empty conf/models/bedrock.json` - [x] `git diff --check` - [x] `go test ./internal/entity/models/... -run Bedrock -count=1` - [x] `go test ./internal/entity/models/... -run '^$' -count=1` - [x] `go test ./internal/entity/models/... -run Bedrock -race -count=1` Note: `go test ./internal/entity/models/... -count=1` currently fails in unrelated existing Astraflow coverage (`TestAstraflowEmbedReturnsNoSuchMethod` panics in `internal/entity/models/astraflow.go`). The Bedrock-specific tests and compile-only package check pass.	2026-06-05 13:26:32 +08:00
Idriss Sbaaoui	1134769940	Chore: update cohere models (#15576 ) ### What problem does this PR solve? remove old and add latest cohere models ### Type of change - [x] Refactoring - [x] Other (please describe): update models	2026-06-03 15:55:45 +08:00
Wang Qi	583daf47d5	Fix: model provider orders (#15524 ) Fix: model provider orders	2026-06-03 10:17:12 +08:00
Wang Qi	d41373cfa9	Feature: Add the new anthropic and voyage models (#15516 ) Add the new Anthropic and Voyage models. Strip Opus 4.7 and 4.8 of certain unsupported keys. Co-authored-by: Idriss Sbaaoui <112825897+6ba3i@users.noreply.github.com>	2026-06-02 17:29:18 +08:00
Wang Qi	c990badda1	Feature: Add MiniMax M3 (#15513 ) Feature: Add MiniMax M3	2026-06-02 17:28:48 +08:00
glorydavid03023	3774916060	Go: implement Embed in GPUStack driver (#15182 ) ### What problem does this PR solve? The Go GPUStack driver returned a stub error for `Embed()` even though GPUStack exposes OpenAI-compatible embeddings on the v1-openai route (not `v1/embeddings`). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-01 11:22:43 +08:00
呆萌闷油瓶	658ff06ca4	feat: add 4 new models for siliconflow (#15383 ) ### What problem does this PR solve? Added 4 new models: deepseek-ai/DeepSeek-V4-Pro deepseek-ai/DeepSeek-V4-Flash Pro/moonshotai/Kimi-K2.6 Pro/zai-org/GLM-5.1 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-29 19:28:29 +08:00
Lynn	dc4b82523b	Feat: tenant llm provider (#14595 ) ### What problem does this PR solve? Python implementation of the Go-based model_provider API suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: bill <yibie_jingnian@163.com>	2026-05-29 17:39:41 +08:00
Haruko386	ae88578451	Go: implement TTS and ASR for X.AI (#15247 ) ### What problem does this PR solve? As title ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-05-27 14:08:35 +08:00
sxxtony	17b5b33574	Go: implement Rerank in Replicate driver (#15278 ) ### What problem does this PR solve? `ReplicateModel.Rerank` in `internal/entity/models/replicate.go` was a `"replicate, no such method"` stub. The chat path landed in #14958 and the embed path in #15073; rerank is the last major retrieval surface still missing on this provider. Until this PR, a tenant who selected a Replicate reranker model got the sentinel error on every rerank call. Co-authored-by: sxxtony <sxxtony@users.noreply.github.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-27 14:07:00 +08:00
Alexander Laurent	ae5f48f233	feat: add GiteeAI provider support to Go API server (#15131 ) ### What problem does this PR solve? Closes #15090. Adds GiteeAI support to the Go model-provider layer so GiteeAI chat models can be routed through the Go API server using the same OpenAI-compatible chat, streaming, model listing, and connection-check flow used by other SaaS providers. GiteeAI is implemented as a separate provider from the existing `gitee` provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary - Added a GiteeAI Go model driver. - Added the GiteeAI provider catalog with default base URL `https://ai.gitee.com/v1`. - Registered `giteeai` in the model factory separately from `gitee`. - Added focused provider tests for sync chat, streaming chat, model listing, connection checks, base URL override, SSE parsing, `[DONE]` handling, and unsupported methods. ## What changed - Implemented `ChatWithMessages` for `POST /chat/completions`. - Implemented `ChatStreamlyWithSender` with SSE parsing, `delta` extraction, `finish_reason`, `[DONE]`, and `<think>` tag handling. - Implemented `ListModels` for `GET /models`. - Implemented `CheckConnection` by delegating to `ListModels`. - Returned standard `no such method` errors for unsupported embedding, rerank, image-to-text, ASR, and TTS paths. ## Tests ```bash go test -vet=off ./internal/entity/models -run 'TestGiteeAI' -count=1 go test -vet=off ./internal/entity -run 'Test.Provider\|Test.Model' -count=1 ``` --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-27 14:06:34 +08:00
Hz_	47626bbe63	go: add Qiniu model provider (#15280 ) ### What problem does this PR solve? This PR adds Qiniu provider integration for the Go model driver layer in RAGFlow. Supported capabilities: - [X] Chat - [X] Think Chat - [X] Stream Chat - [X] Stream Think Chat - [X] Model listing - [X] Provider configuration and factory registration Verified examples from the CLI: ``` login user '*' password ''; ADD PROVIDER 'qiniu'; CREATE PROVIDER 'qiniu' INSTANCE 'test' KEY '**'; chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello'; think chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello'; stream chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello, what are you'; stream think chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello, what are you'; stream think chat with 'qwen3-max-2026-01-23@test@qiniu' message 'hello, what are you'; LIST MODELS FROM 'qiniu' 'test'; ``` ### Type of change - [X] New Feature - [X] Provider integration	2026-05-27 13:19:39 +08:00
oktofeesh	a3c6e075f6	fix(go-models): add VolcEngine model listing suffix (#15234 ) ## Summary - add the VolcEngine `models` URL suffix used by the existing Go `ListModels` implementation - return a clear error when the VolcEngine models suffix is missing - add focused VolcEngine model-listing regression tests ## What changed - Added `url_suffix.models` to `conf/models/volcengine.json`. - Normalized the configured models suffix before building the request URL. - Covered config loading, successful model listing, upstream errors, and missing suffix handling. ## Why `VolcEngine.ListModels` already builds requests from `URLSuffix.Models`, but the bundled VolcEngine config did not define that suffix. That left the model-listing path unable to call the documented `/models` endpoint from the existing provider config. Fixes #14701 Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-27 13:14:56 +08:00
oktofeesh	5ae41dc1eb	fix(go-models): route hosted OCR providers through drivers (#15233 ) ## Summary - route hosted MinerU.Net and PaddleOCR.Net provider names to their existing Go drivers - add regression coverage for loading the hosted OCR provider configs through ProviderManager ## What changed - Added canonical provider-name aliases for the hosted OCR provider display names. - Covered both bundled configs with a focused provider-manager test. ## Why The hosted provider configs use display names with `.Net`, while model factory dispatch lowercases the provider name. Without aliases, those configs fall through to `DummyModel` instead of using the existing MinerU and PaddleOCR drivers. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 20:40:40 +08:00
oktofeesh	22a3b8cdf9	feat(go-models): list LongCat models (#15241 ) ## Summary - Add LongCat model-list support through the documented OpenAI-compatible models endpoint. ## What changed - Add the LongCat `models` URL suffix for `/openai/v1/models`. - Implement `ListModels` for the LongCat Go driver. - Delegate `CheckConnection` to the lightweight model-list request. - Add focused regression coverage for successful, malformed, oversized, and missing-key responses. ## Why LongCat documents a models endpoint under the OpenAI-compatible API surface, but the Go driver still returned `no such method` for model listing and connection checks. ## Validation - `go test ./internal/entity/models -run TestLongCat -count=1` - `go test -race ./internal/entity/models -run TestLongCat -count=1` - `go test ./internal/entity -count=1` - `git diff --check` ## Notes - Related to the broader Go model provider tracking in #14736, but this PR only handles LongCat model listing. - `go test ./internal/entity/models -count=1` is currently blocked by an unrelated Astraflow test panic outside this LongCat change. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 19:58:53 +08:00
oktofeesh	557024e7d4	fix(go-models): add xAI model listing suffix (#15236 ) ## Summary - add the xAI `models` URL suffix used by the existing Go `ListModels` implementation - return a clear error when the xAI models suffix is missing - add focused xAI model-listing and connection-check regression tests ## What changed - Added `url_suffix.models` to `conf/models/xai.json`. - Normalized the configured models suffix before building the request URL. - Covered config loading, successful model listing, upstream errors, API-key validation, missing suffix handling, and `CheckConnection` delegation. ## Why `XAIModel.ListModels` already builds requests from `URLSuffix.Models`, and `CheckConnection` delegates to that method. The bundled xAI config did not define that suffix, which left the model-listing path unable to call the provider `/models` endpoint from the existing provider config. ## Validation - `go test ./internal/entity/models -run TestXAI -count=1` - `go test ./internal/entity -count=1` - `git diff HEAD~1..HEAD --check` ## Notes - `go test ./internal/entity/models -count=1` currently fails in unchanged Astraflow coverage: `TestAstraflowEmbedReturnsNoSuchMethod` panics before reaching any xAI assertions. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 19:58:20 +08:00
Haruko386	3619ceca01	Go: implement provider: OrcaRouter (#15235 ) ### What problem does this PR solve? implement provider `OrcaRouter` The following functionalities are now supported: Cohere: - [x] Chat / Think Chat / Stream Chat / Stream Think Chat - [x] Model listing - [x] TTS - [ ] Balance ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 18:20:33 +08:00
dripsmvcp	a48bcf814d	Go: implement provider: ModelScope (#15041 ) Closes #15040. ModelScope was listed unchecked in the Go-rewrite tracker #14736 and already had an llm_factories.json entry (tags: LLM) but no Go driver, so the new Go API server could not route ModelScope instances. The Python side has supported it through the OpenAI-compatible base at rag/llm/chat_model.py:618 (ModelScopeChat), which requires a user-supplied base URL and appends /v1. This adds: - internal/entity/models/modelscope.go: self-hosted OpenAI-compatible driver with chat (sync + SSE stream with idle-timeout cancellation), list_models, and check_connection. Auth header is optional, matching the xinference pattern, so deployments without auth and auth-enabled deployments both work. Base URL is normalized so users can configure either the root endpoint or the /v1 endpoint. - internal/entity/models/modelscope_test.go: 12 tests covering name, URL normalization, factory routing, chat happy path / auth header / reasoning_content extraction, stream happy path / stream=false rejection / idle cancellation, list_models + check_connection, missing-base-URL clear error, and the no-such-method sentinels. - conf/models/modelscope.json: shipped config (class: "local", url_suffix v1/chat/completions and v1/models). - internal/entity/models/factory.go: case "modelscope" → ModelScopeModel. - internal/service/llm.go: ModelScope added to the selfDeployed map alongside Ollama, Xinference, LocalAI, LM-Studio, GPUStack — the Python side requires user-supplied URL with no default, so the Go side classifies it the same way. Follow-on issues will add Embed and Rerank, in line with how Novita, NVIDIA, TogetherAI, and other providers landed method-by-method. --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-05-26 18:18:46 +08:00
Hz_	84add43208	Add HuaweiCloud model provider (#15237 ) ### What problem does this PR solve? This PR adds HuaweiCloud provider integration in RAGFlow. Supported capabilities: - [x] Chat / Think Chat / Stream Chat / Stream Think Chat - [x] Embedding - [x] Rerank - [x] Model listing - [x] Provider connection checking Verified examples from the CLI: ``` check instance 'test' from 'HuaweiCloud'; chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; stream chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; stream think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello'; embed text 'what is rag' 'who are you' with 'bge-m3@test@HuaweiCloud' dimension 1024; rerank query 'what is rag' document 'rag is retrieval augmented generation' 'rag need llm' 'famous rag project includes ragflow' with 'bge-reranker-v2-m3@test@HuaweiCloud' top 3; list supported models from 'HuaweiCloud' 'test'; LIST MODELS FROM 'HuaweiCloud' 'test'; ``` ### Type of change - [x] New Feature - [x] Provider integration	2026-05-26 17:13:15 +08:00
Jake Armstrong	0fb85a66bc	feat(go-models): add AWS Bedrock provider driver (#15166 ) ## Summary Closes #15165. Implements the AWS Bedrock model provider for the Go API server, tracked under #14736. Adds Converse + Converse-Stream chat and foundation-model listing, with SigV4 signing over a hand-rolled `net/http` path that matches the established pattern in `internal/entity/models/` (no new direct `go.mod` deps). ## Linked tracker Tracked under #14736 (Implement model providers of RAGFlow API server in Go). Closes #15165.	2026-05-26 17:10:06 +08:00

413 Commits