ragflow/conf/models/xinference.json

Go: implement provider: Xinference (#14938)

### What problem does this PR solve?

Closes #14808.

Adds a Go model driver for Xinference so self-hosted Xinference chat models can be used through the Go provider layer instead of falling through to the dummy driver.

Xinference exposes an OpenAI-compatible API under `/v1`; the driver accepts either a root endpoint such as `http://127.0.0.1:9997` or an OpenAI-compatible endpoint such as `http://127.0.0.1:9997/v1` and normalizes it before calling chat or model-listing routes.

### What is changed?

- Add `internal/entity/models/xinference.go` implementing `ModelDriver` for Xinference chat.
- Route provider name `xinference` in `internal/entity/models/factory.go`.
- Add `conf/models/xinference.json` as a local provider config.
- Add focused unit tests in `internal/entity/models/xinference_test.go`.

Initial method coverage:

- `ChatWithMessages`: POST `/v1/chat/completions`.
- `ChatStreamlyWithSender`: SSE streaming from `/v1/chat/completions`.
- `ListModels`: GET `/v1/models`.
- `CheckConnection`: lightweight `ListModels` probe.
- Optional auth: send `Authorization: Bearer <api_key>` only when a non-empty key is configured, matching Xinference no-auth and auth-enabled deployments.
- `Balance`, `Embed`, `Rerank`, ASR, TTS, and OCR return `no such method` for this initial chat-provider PR.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)

### Tests

- `go test -vet=off -run TestXinference -count=1 ./internal/entity/models/...`
- `go test -vet=off -count=1 ./internal/entity/models/...`

### References

- Xinference docs: https://inference.readthedocs.io/zh-cn/latest/index.html
- OpenAI-compatible chat usage: https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html
- API key auth: https://inference.readthedocs.io/zh-cn/latest/user_guide/auth_system.html

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>

2026-05-18 21:10:13 -10:00
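
The endpoint normalization and optional auth described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/entity/models/xinference.go`: the helper names `normalizeBaseURL`, `joinRoute`, and `newModelsRequest` are hypothetical, and the sketch assumes that normalization simply means stripping a trailing `/v1` before a route such as `v1/models` is appended.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// normalizeBaseURL is a hypothetical helper. It trims whitespace, trailing
// slashes, and a trailing "/v1", so a root endpoint and an OpenAI-compatible
// endpoint both reduce to the same base URL.
func normalizeBaseURL(endpoint string) string {
	base := strings.TrimRight(strings.TrimSpace(endpoint), "/")
	base = strings.TrimSuffix(base, "/v1")
	return strings.TrimRight(base, "/")
}

// joinRoute appends a route suffix such as "v1/chat/completions" to the
// normalized base URL.
func joinRoute(endpoint, suffix string) string {
	return normalizeBaseURL(endpoint) + "/" + strings.TrimLeft(suffix, "/")
}

// newModelsRequest builds a GET request for the model-listing route and sets
// the Authorization header only when apiKey is non-empty.
func newModelsRequest(endpoint, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, joinRoute(endpoint, "v1/models"), nil)
	if err != nil {
		return nil, err
	}
	if strings.TrimSpace(apiKey) != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Both endpoint forms resolve to the same chat route.
	for _, ep := range []string{"http://127.0.0.1:9997", "http://127.0.0.1:9997/v1/"} {
		fmt.Println(joinRoute(ep, "v1/chat/completions"))
	}

	// With an empty key, no Authorization header is attached.
	req, err := newModelsRequest("http://127.0.0.1:9997/v1", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String(), req.Header.Get("Authorization") == "")
}
```

Normalizing once and then always appending the full `v1/...` suffix keeps the route table independent of which endpoint form the user configured.
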
{
  "name": "xinference",
  "url_suffix": {
    "chat": "v1/chat/completions",
    "embedding": "v1/embeddings",
    "models": "v1/models",
    "rerank": "v1/rerank",
    "asr": "v1/audio/transcriptions",
    "tts": "v1/audio/speech"
  },
  "class": "local"
}
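
As a rough illustration of how a config of this shape could be consumed, the sketch below decodes it with `encoding/json` and looks up a route suffix. The `providerConfig` struct and `parseProviderConfig` function are hypothetical names chosen for this example; RAGFlow's actual loader may use different types and may read the file from disk rather than from an embedded string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerConfig is a hypothetical struct mirroring the fields of
// conf/models/xinference.json.
type providerConfig struct {
	Name      string            `json:"name"`
	URLSuffix map[string]string `json:"url_suffix"`
	Class     string            `json:"class"`
}

// xinferenceJSON holds the config content inline so the example is
// self-contained.
const xinferenceJSON = `{
  "name": "xinference",
  "url_suffix": {
    "chat": "v1/chat/completions",
    "embedding": "v1/embeddings",
    "models": "v1/models",
    "rerank": "v1/rerank",
    "asr": "v1/audio/transcriptions",
    "tts": "v1/audio/speech"
  },
  "class": "local"
}`

// parseProviderConfig decodes a provider config document.
func parseProviderConfig(data []byte) (providerConfig, error) {
	var cfg providerConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseProviderConfig([]byte(xinferenceJSON))
	if err != nil {
		panic(err)
	}
	// Look up the suffix for one capability; ok is false for unknown keys.
	suffix, ok := cfg.URLSuffix["chat"]
	fmt.Println(cfg.Name, cfg.Class, suffix, ok)
}
```

Using a map for `url_suffix` means a new capability key can be added to the JSON without changing the struct, at the cost of callers having to check whether a key is present.
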