29 Commits

Author SHA1 Message Date
JPette1783
daa3811165 feat(models): add shared HTTP client, SSE parser, and stub helpers for Go model drivers (#15821)
### What problem does this PR solve?

The Go model-driver layer () has ~38,700 lines across 109 files. Roughly
74% of that is boilerplate duplicated into every driver: identical HTTP
client setup, the same 65-line SSE scanner loop, and 10-11 one-line "not
supported" stub methods per driver. Any fix must be manually propagated
to every file. Closes #15820.

This PR establishes the three shared utility files that form the
foundation for incremental driver migration:

---

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Co-authored-by: Haruko386 <tryeverypossible@163.com>
2026-06-11 19:20:12 +08:00
Haruko386
9c30557ef7 Go: add dimensions for list models and fix some embed-bug in providers (#15940)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-06-11 19:18:49 +08:00
Haruko386
84edf539e7 Go: Refactor list-models func (#15900)
### What problem does this PR solve?

As title
Issue: #15853 

### Type of change

- [x] Refactoring
2026-06-11 13:32:50 +08:00
JPette1783
4b10c0b885 fix(go-models): guard nil pointers in DeepSeek and VolcEngine streaming (#15817)
### What problem does this PR solve?

`ChatStreamlyWithSender` in two Go model drivers could panic on nil
pointer dereferences when a caller passes a nil model config or omits
the reasoning `Effort`:

- **deepseek.go** - `switch *chatModelConfig.Effort` dereferenced
`Effort` without a nil check. It now defaults to `"high"` when nil.
- **volcengine.go** - the `modelConfig` pointer itself was dereferenced
(`Stream`, `MaxTokens`, `Temperature`, .) with no guard, and `Effort`
was dereferenced unchecked. `modelConfig` now defaults to an empty
`&ChatConfig{}` when nil so the optional-field accesses are safe, and
`Effort` defaults to `"medium"` when nil.

Addresses the CodeRabbit review on `volcengine.go`
`ChatStreamlyWithSender`. Per maintainer feedback ("one PR do one
thing"), the unrelated `handler/auth.go` and
`service/heartbeat_sender.go` changes were removed so this PR is scoped
to the model-provider fixes.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-11 13:32:24 +08:00
Jin Hai
719ce15c95 Go CLI: update list supported models (#15845)
### What problem does this PR solve?

Now list supported models will show more info.

```
RAGFlow(api/default)> list supported models from 'gitee' 'test';
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
| dimension | max_tokens | model_types | name                                                     | thinking                                    |
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
|           |            |             | Wan2.7                                                   |                                             |
|           |            |             | HappyHorse-1.0                                           |                                             |
|           |            |             | Qwen3.6-27B@Qwen                                         |                                             |
|           |            |             | Qwen3.6-35B-A3B@Qwen                                     |                                             |
|           | 1048576    | [chat]      | DeepSeek-V4-Flash@deepseek-ai                            | map[clear_thinking:true default_value:true] |
|           | 1048576    | [chat]      | DeepSeek-V4-Pro@deepseek-ai                              | map[clear_thinking:true default_value:true] |
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-09 19:01:00 +08:00
Haruko386
baeb0c0431 Refactor[Go Model Provider]: refactor baseURL and modelConfig (#15627)
### What problem does this PR solve?

As Title

### Type of change

- [x] Refactoring
2026-06-04 17:50:22 +08:00
Jin Hai
dbebc66ba8 Go: refactor provider code (#15564)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 14:09:07 +08:00
Dexterity
2819d0ea24 fix(go-models): use per call context timeouts so long streaming responses are not truncated (#15380)
### What problem does this PR solve?

Closes #15379 

Around 29 Go model providers in `internal/entity/models/` share an
`http.Client` configured with `Timeout: 120 * time.Second`, and reuse
that same client for `ChatStreamlyWithSender`. Go's
`http.Client.Timeout` is a hard ceiling on the whole request that also
covers reading the response body, so it behaves as a wall clock on
streaming. Any streamed chat response that lasts longer than 120 seconds
gets cut off in the middle with a timeout error. Long generations,
reasoning model outputs, and slow or overloaded upstreams are the common
victims.

The providers that already behave correctly (`groq`, `mistral`,
`voyage`, `anthropic`) set no client `Timeout` and instead wrap each
request in a `context.WithTimeout`. This change converges the affected
providers onto that same pattern.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-06-02 15:27:26 +08:00
Dexterity
04aa8d04e8 fix(go-models): raise SSE scanner buffer so large stream chunks are not dropped (#15382)
### Summary

Closes #15381 

Every provider in `internal/entity/models/` reads its streaming response
with `bufio.NewScanner(resp.Body)` and iterates over `scanner.Scan()`.
The default `bufio.Scanner` maximum token size is 64KB, so when an
upstream sends a single SSE `data:` line larger than 64KB (long content
deltas, large tool or function call argument blobs, bundled
`reasoning_content`, or providers that emit a whole message in one
event) `scanner.Scan()` returns `false` and `scanner.Err()` returns
`bufio.ErrTooLong`. Streaming chat then ends with an error partway
through the response.

This change adds `scanner.Buffer(make([]byte, 64*1024), 1024*1024)`
immediately after every SSE scanner that was still bare, raising the cap
to 1MB. 1MB is the value already used for streaming chat in `openai.go`,
`modelscope.go`, `groq.go`, `mistral.go`, `xai.go` and the other already
patched providers (the 8MB cap in the repo is reserved for TTS and
embedding paths), so this simply converges the remaining providers onto
the established pattern. Nothing else changes: line parsing, `data:`
prefix handling, `[DONE]` detection, JSON unmarshalling, error handling,
and the existing `scanner.Err()` checks all stay the same.

Providers covered (23 scanners across 22 files): 302ai, aliyun,
baichuan, baidu, cohere, deepinfra, deepseek, gitee, huggingface,
lmstudio, minimax (the chat scanner, whose TTS scanner was already
bumped), moonshot, nvidia, ollama, openrouter, orcarouter, paddleocr,
siliconflow, tokenhub, vllm, volcengine, xunfei, zhipu-ai. `jiekouai.go`
is excluded because it is covered by the in flight #15337.

A table driven regression test (`sse_scanner_buffer_test.go`) streams a
single 128KB `data:` content delta followed by `data: [DONE]` through an
`httptest` server and asserts that `ChatStreamlyWithSender` delivers the
full content with no error across a representative subset of providers.
Without the buffer fix the test fails with `bufio.Scanner: token too
long`.

This PR also removes three duplicate declarations of the package level
`roundTripperFunc` test helper that several recently merged provider PRs
each added independently, which had left the `internal/entity/models`
test package unable to compile. The helper now lives in a single place
and is shared.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-29 19:34:00 +08:00
oktofeesh
a3c6e075f6 fix(go-models): add VolcEngine model listing suffix (#15234)
## Summary
- add the VolcEngine `models` URL suffix used by the existing Go
`ListModels` implementation
- return a clear error when the VolcEngine models suffix is missing
- add focused VolcEngine model-listing regression tests

## What changed
- Added `url_suffix.models` to `conf/models/volcengine.json`.
- Normalized the configured models suffix before building the request
URL.
- Covered config loading, successful model listing, upstream errors, and
missing suffix handling.

## Why
`VolcEngine.ListModels` already builds requests from `URLSuffix.Models`,
but the bundled VolcEngine config did not define that suffix. That left
the model-listing path unable to call the documented `/models` endpoint
from the existing provider config.

Fixes #14701

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-27 13:14:56 +08:00
Haruko386
bf41d35729 Go: implement PaddleOCR provider and implement ASR for CoHere (#14954)
### What problem does this PR solve?

This PR implement implement OCR for Baidu and Mistral, implement
PaddleOCR provider and implement ASR for CoHere

**Verified examples from the CLI:**

```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Bingxin Ke

Nando Metzger

Photogra

Anton Obukhov

Rodrigo Caye Daudt

netry and Remote Sensing,

Shengyu Huang

Konrad Schindler

ETH Zürich





<div style="text-align: c...  |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

# Cohere

RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------+
|  The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-15 18:41:43 +08:00
Jin Hai
3a5df08c76 Go: add file parse command (#14892)
### What problem does this PR solve?

```
RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。
——佚名                                                       |
+----------------------------------------------------------+

RAGFlow(user)> list 'test@gitee' tasks;
+---------+----------------------------------+
| status  | task_id                          |
+---------+----------------------------------+
| success | C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5 |
+---------+----------------------------------+
RAGFlow(user)> show 'test@gitee' task 'C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content                                                                                                                                                                                                                                                          | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # PDF 1: Purpose of RAGFlow  

RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine designed to turn raw documents into reliable context for large language models.Its purpose is to make it practical to build an Al assistant that can ans... | 1     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+

```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-15 12:29:52 +08:00
Jin Hai
b18640d228 Go: fix OCR command (#14891)
### What problem does this PR solve?

RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。

——佚名                                                       |
+----------------------------------------------------------+

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 17:29:53 +08:00
Jin Hai
d08bf02d9b Go: add ASR, TTS, OCR command (#14836)
### What problem does this PR solve?

```
RAGFlow(user)> asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method
RAGFlow(user)> stream asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method

RAGFlow(user)> tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method

RAGFlow(user)> stream tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method

RAGFlow(user)> ocr with 'glm-ocr@test@zhipu-ai' file './test.log';
CLI error: zhipu, no such method
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-12 17:17:44 +08:00
Jin Hai
c55e23e7e2 Go: refactor embedding interface (#14757)
### What problem does this PR solve?

Provide embedding index according to the input text

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-11 14:45:30 +08:00
BitToby
39a1773f7f Go: implement ListModels in Volcengine driver (#14702)
### What problem does this PR solve?

The VolcEngine Go driver in `internal/entity/models/volcengine.go`
shipped with a
`ListModels` stub that returned `volcengine, no such method`.
`conf/models/volcengine.json`
also did not declare a `models` URL suffix, so the model picker had
nothing to call even
if the method body were filled in.

A tenant who configured Volcengine (Doubao / Ark) as a provider could
not see the list of
available endpoints from the RAGFlow UI. Several other Go drivers
already implement
`ListModels` against the OpenAI-compatible `/models` endpoint (deepseek,
gitee, nvidia,
openai, siliconflow), so the interface and pattern are well-established.

This PR fills the gap.

### What this PR includes

* `conf/models/volcengine.json`: declare the `models` URL suffix
alongside the existing
  `chat`, `files`, and `embedding` entries. The Ark v3 API exposes
`https://ark.cn-beijing.volces.com/api/v3/models`, so the suffix is just
`models`.
* `internal/entity/models/volcengine.go`: replace the `ListModels` stub
with a real
implementation. Reuses the package-level `DSModelList` / `DSModel` types
that
DeepSeek, Gitee, and SiliconFlow already use to parse the
OpenAI-compatible models
  response shape.

No factory change. No interface change.

### How the driver works

* Resolves the region with a default fallback, the same way the other
VolcEngine methods
  in this driver already do.
* Builds the URL from `BaseURL[region] + URLSuffix.Models`, with
`strings.TrimSuffix` on
  the base to keep the join robust.
* Issues a `GET` with optional `Authorization: Bearer <api_key>` (the
header is omitted
when no key is configured, mirroring the existing NVIDIA `ListModels`).
* Reads the response body once, surfaces a non-200 with the upstream
status line plus
  body, and parses the JSON via the shared `DSModelList` type.
* Returns the model id list in input order. When the response includes
an `owned_by`
field, the entry is rendered as `id@owned_by`, matching the convention
used by the
  other Go drivers.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)


### How was this tested?

* `go build ./internal/entity/models/...` returns exit 0.
* `go vet ./internal/entity/models/...` is clean.
* `gofmt -l internal/entity/models/volcengine.go` is clean.
* The full method set on `VolcEngine` still matches the `ModelDriver`
interface.
* Endpoint reachability check: `GET
https://ark.cn-beijing.volces.com/api/v3/models`
returns `401 Unauthorized` without an API key, confirming the path
exists and accepts
  Bearer authentication.
* Pattern parity with DeepSeek, Gitee, NVIDIA, and SiliconFlow
`ListModels`.

Fixes #14701

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-11 10:59:18 +08:00
Jin Hai
17d71e5d79 Go CLI: embed and rerank (#14735)
### What problem does this PR solve?

```
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 16        | 0     |
| 16        | 1     |
+-----------+-------+

RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'rerank@test@zhipu-ai' top 2;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0     | 1               |
| 2     | 0.99999976      |
+-------+-----------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-09 17:41:54 +08:00
Haruko386
94f82acd03 Fix(Go): prevent global state pollution in local model connection check (#14669)
### What problem does this PR solve?

1. **Fix Global State Pollution in Local Providers (Critical Bug):** -
Resolved a severe concurrency and architecture issue in
`model_service.go`. Previously, `ListSupportedModels` would permanently
overwrite the global provider singleton with a localized URL instance
(`driver.NewInstance`). This caused cross-request contamination in
multi-tenant environments.
- Fixed `CheckProviderConnection` for local models (LM Studio, vLLM,
Ollama). It now properly creates a localized driver copy and injects the
`base_url` before testing the connection, entirely eliminating the
false-positive `missing base URL` error without polluting the global
state.
2. **Implement `VolcEngine` Embeddings:** - Fully implemented the
`Encode` method for the `volcengine` provider, enabling text embedding
capabilities for VolcEngine models.
3. **Enhance Region Validation in `SiliconFlow`:** - Added a strict
empty string check (`*apiConfig.Region != ""`) alongside the existing
`nil` check when parsing regions. This ensures that if an empty string
is passed, the system safely falls back to the `"default"` region,
preventing malformed URL requests and `unsupported protocol scheme`
errors.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-05-08 15:54:27 +08:00
Haruko386
d13a240dc0 Go: implement remaining interface for OpenRouter (#14657)
### What problem does this PR solve?

1. implement `rerank`, `embedding`, `balance`, `checkConnet` method for
`OpenRouter`
2. delete `chat` method in `internal/entity/models/volcengine.go`

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-08 13:56:45 +08:00
Haruko386
a377512110 Go: implement provider: OpenRouter (#14652)
### What problem does this PR solve?

1. **Implement `OpenRouter` Provider:** Fully support OpenRouter AI
models (e.g., `gemma`, `minimax`). Includes robust handling of
Server-Sent Events (SSE) streams, error event interception, and proper
parsing of both `reasoning_content` and standard `content`.
2. **Fix BaseURL Resolution Bug:** Fixed a critical edge case in region
configuration parsing. Added a strict empty string check
(`*apiConfig.Region != ""`) alongside the `nil` check. This ensures that
if the UI passes an empty string, the system correctly falls back to the
`"default"` region, preventing `unsupported protocol scheme ""` errors
during HTTP requests.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-05-08 12:02:37 +08:00
Jin Hai
aa57b5bd8b Go: move logger to common module (#14545)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-06 10:41:58 +08:00
qinling0210
12af73f2ca Support stream for multimodal chat (#14537)
### What problem does this PR solve?

Support stream for multimodal chat

### Type of change

- [x] Refactoring
2026-04-30 19:33:57 +08:00
qinling0210
265f92c83e Simplify chat and support multimodal chat (#14523)
### What problem does this PR solve?

Simplify chat and support multimodal chat

### Type of change

- [x] Refactoring
2026-04-30 15:25:01 +08:00
Haruko386
0e1477eb23 Go: implement provider: MiniMax (#14478)
### What problem does this PR solve?

implement MiniMax provider

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-04-29 19:06:40 +08:00
Jin Hai
bb05a8bd7e Update create model instance command (#14441)
### What problem does this PR solve?

1. support command:

```
RAGFlow(user)> create provider 'vllm' instance 'test' key 'test-key' url 'base-url' region 'abc';
SUCCESS
RAGFlow(user)> list instances from 'vllm';
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| apiKey   | extra                                  | id                               | instanceName | providerID                       | status |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| test-key | {"base_url":"base-url","region":"abc"} | 40213c89430311f1a7cf38a74640adcc | test         | b4d40e6142d311f1a4f938a74640adcc | enable |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
```
2. support add vllm model
```
RAGFlow(user)> add model 'Qwen/Qwen2-0.5B' to provider 'vllm' instance 'test' with tokens 131072 chat;
SUCCESS
```
3. add vllm chat

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-29 17:05:08 +08:00
Haruko386
decf673049 Go: implement provider: volcengine (#14460)
### What problem does this PR solve?

implement `volcengine` provider 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-29 15:45:08 +08:00
qinling0210
dcce864d4c Simplify Encode (#14437)
### What problem does this PR solve?

Simplify Encode

### Type of change

- [x] Refactoring
2026-04-28 18:07:42 +08:00
Jin Hai
ae420f6358 Go: fix compilation (#14418)
### What problem does this PR solve?

Add methods to volcengine

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-28 13:21:05 +08:00
Jin Hai
819257f257 Go: add volcengine (#14409)
### What problem does this PR solve?

1. Refactor server_main
2. Add volcengine

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-28 12:12:58 +08:00