Commit Graph

15 Commits

Author SHA1 Message Date
JPette1783
daa3811165 feat(models): add shared HTTP client, SSE parser, and stub helpers for Go model drivers (#15821)
### What problem does this PR solve?

The Go model-driver layer () has ~38,700 lines across 109 files. Roughly
74% of that is boilerplate duplicated into every driver: identical HTTP
client setup, the same 65-line SSE scanner loop, and 10-11 one-line "not
supported" stub methods per driver. Any fix must be manually propagated
to every file. Closes #15820.

This PR establishes the three shared utility files that form the
foundation for incremental driver migration:

---

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Co-authored-by: Haruko386 <tryeverypossible@163.com>
2026-06-11 19:20:12 +08:00
Haruko386
84edf539e7 Go: Refactor list-models func (#15900)
### What problem does this PR solve?

As title
Issue: #15853 

### Type of change

- [x] Refactoring
2026-06-11 13:32:50 +08:00
Jin Hai
719ce15c95 Go CLI: update list supported models (#15845)
### What problem does this PR solve?

Now list supported models will show more info.

```
RAGFlow(api/default)> list supported models from 'gitee' 'test';
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
| dimension | max_tokens | model_types | name                                                     | thinking                                    |
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
|           |            |             | Wan2.7                                                   |                                             |
|           |            |             | HappyHorse-1.0                                           |                                             |
|           |            |             | Qwen3.6-27B@Qwen                                         |                                             |
|           |            |             | Qwen3.6-35B-A3B@Qwen                                     |                                             |
|           | 1048576    | [chat]      | DeepSeek-V4-Flash@deepseek-ai                            | map[clear_thinking:true default_value:true] |
|           | 1048576    | [chat]      | DeepSeek-V4-Pro@deepseek-ai                              | map[clear_thinking:true default_value:true] |
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-09 19:01:00 +08:00
Haruko386
baeb0c0431 Refactor[Go Model Provider]: refactor baseURL and modelConfig (#15627)
### What problem does this PR solve?

As Title

### Type of change

- [x] Refactoring
2026-06-04 17:50:22 +08:00
Jin Hai
dbebc66ba8 Go: refactor provider code (#15564)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 14:09:07 +08:00
bitloi
432e966414 fix(go): support OpenAI audio endpoints (#15104)
### What problem does this PR solve?

Closes #15102.

OpenAI's Go provider config advertises `whisper-1` as ASR and `tts-1` as
TTS, but the Go driver returned `openai, no such method` for both audio
paths and did not define `url_suffix.asr` / `url_suffix.tts`.

This PR:

- adds OpenAI audio URL suffixes for `audio/transcriptions` and
`audio/speech`
- implements non-streaming `TranscribeAudio` using multipart form
uploads
- implements non-streaming `AudioSpeech` using the OpenAI speech JSON
request shape
- keeps streaming TTS explicitly unsupported instead of sending binary
audio through the text SSE sender
- adds focused tests for config coverage, ASR/TTS request shape,
required TTS voice validation, and unsupported streaming TTS


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-25 10:25:53 +08:00
Haruko386
bf41d35729 Go: implement PaddleOCR provider and implement ASR for CoHere (#14954)
### What problem does this PR solve?

This PR implement implement OCR for Baidu and Mistral, implement
PaddleOCR provider and implement ASR for CoHere

**Verified examples from the CLI:**

```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                                                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Bingxin Ke

Nando Metzger

Photogra

Anton Obukhov

Rodrigo Caye Daudt

netry and Remote Sensing,

Shengyu Huang

Konrad Schindler

ETH Zürich





<div style="text-align: c...  |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

# Cohere

RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------+
|  The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-15 18:41:43 +08:00
Jin Hai
3a5df08c76 Go: add file parse command (#14892)
### What problem does this PR solve?

```
RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。
——佚名                                                       |
+----------------------------------------------------------+

RAGFlow(user)> list 'test@gitee' tasks;
+---------+----------------------------------+
| status  | task_id                          |
+---------+----------------------------------+
| success | C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5 |
+---------+----------------------------------+
RAGFlow(user)> show 'test@gitee' task 'C3FX4MQNKY5MGC6ZFMIXIAMJKHCEBQB5';
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| content                                                                                                                                                                                                                                                          | index |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
| # PDF 1: Purpose of RAGFlow  

RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine designed to turn raw documents into reliable context for large language models.Its purpose is to make it practical to build an Al assistant that can ans... | 1     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+

```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-15 12:29:52 +08:00
Jin Hai
b18640d228 Go: fix OCR command (#14891)
### What problem does this PR solve?

RAGFlow(user)> ocr with 'hunyuanocr@test@gitee' file './picture.png'
+----------------------------------------------------------+
| text                                                     |
+----------------------------------------------------------+
| 生活不是等待风暴过去,而是学会在雨中翩翩起舞。

——佚名                                                       |
+----------------------------------------------------------+

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-13 17:29:53 +08:00
Jin Hai
d08bf02d9b Go: add ASR, TTS, OCR command (#14836)
### What problem does this PR solve?

```
RAGFlow(user)> asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method
RAGFlow(user)> stream asr with 'glm-asr-2512@test@zhipu-ai' audio './speech.wav';
CLI error: zhipu, no such method

RAGFlow(user)> tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method

RAGFlow(user)> stream tts with 'glm-tts@test@zhipu-ai' text 'how are you';
CLI error: zhipu, no such method

RAGFlow(user)> ocr with 'glm-ocr@test@zhipu-ai' file './test.log';
CLI error: zhipu, no such method
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-12 17:17:44 +08:00
Jin Hai
c55e23e7e2 Go: refactor embedding interface (#14757)
### What problem does this PR solve?

Provide embedding index according to the input text

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-11 14:45:30 +08:00
Panda Dev
6bfe0f9a10 Go: implement Encode (embeddings) in OpenAI driver (#14630)
### What problem does this PR solve?

The OpenAI Go driver landed in #14605 with chat, list models, and check
connection. Encode was left as a stub that returns \`not implemented\`.

\`conf/models/openai.json\` already lists three embedding models out of
the box:

- text-embedding-ada-002
- text-embedding-3-small
- text-embedding-3-large

So a tenant who picked one of these in the Go layer could not actually
run an embedding call. This PR fills the gap.

### What this PR includes

- \`conf/models/openai.json\`: add \`\"embedding\": \"embeddings\"\`
under \`url_suffix\` so the driver can build the URL from config. This
matches the \`URLSuffix.Embedding\` field used by other drivers
(siliconflow, zhipu-ai).
- \`internal/entity/models/openai.go\`: replace the Encode stub with a
real implementation that POSTs to \`/v1/embeddings\`. Adds a small local
response type \`openaiEmbeddingResponse\`.

No factory change. No interface change.

### How the implementation works

- Validate \`apiConfig\` and the API key, validate the model name. Use
the existing \`baseURLForRegion\` helper so an unknown region fails fast
with a clear error.
- Wrap the request with \`context.WithTimeout(nonStreamCallTimeout)\` so
the call has a clear deadline. Same pattern as \`ChatWithMessages\` and
\`ListModels\` already use in this file.
- Send all input texts in one request. The OpenAI API accepts the
\`input\` field as an array.
- Parse \`data[*].embedding\` and copy each slice into a \`[][]float64\`
indexed by \`data[*].index\` so the output order matches the input order
even if the API returns items in a different order.
- Handle both \`float64\` and \`float32\` element types, the way the
SiliconFlow driver does.
- An empty input slice returns \`[][]float64{}\` with no HTTP call.
- Non-200 responses propagate the upstream status line and body.
- A final pass checks that every input slot got a vector. If any slot is
still nil, return a clear error so the caller does not silently use a
zero vector.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- \`go build ./internal/entity/models/...\` in a clean go 1.25 image
(the go.mod minimum) returns exit 0.
- The full method set on \`OpenAIModel\` still matches the
\`ModelDriver\` interface.
- Pattern parity with the existing SiliconFlow Encode implementation
(\`internal/entity/models/siliconflow.go\`).

Closes #14629

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-10 10:31:37 +08:00
Jin Hai
17d71e5d79 Go CLI: embed and rerank (#14735)
### What problem does this PR solve?

```
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 16        | 0     |
| 16        | 1     |
+-----------+-------+

RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'rerank@test@zhipu-ai' top 2;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0     | 1               |
| 2     | 0.99999976      |
+-------+-----------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-09 17:41:54 +08:00
Haruko386
a377512110 Go: implement provider: OpenRouter (#14652)
### What problem does this PR solve?

1. **Implement `OpenRouter` Provider:** Fully support OpenRouter AI
models (e.g., `gemma`, `minimax`). Includes robust handling of
Server-Sent Events (SSE) streams, error event interception, and proper
parsing of both `reasoning_content` and standard `content`.
2. **Fix BaseURL Resolution Bug:** Fixed a critical edge case in region
configuration parsing. Added a strict empty string check
(`*apiConfig.Region != ""`) alongside the `nil` check. This ensures that
if the UI passes an empty string, the system correctly falls back to the
`"default"` region, preventing `unsupported protocol scheme ""` errors
during HTTP requests.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-05-08 12:02:37 +08:00
Panda Dev
b8b741555f Go: implement provider: OpenAI (#14605)
### What problem does this PR solve?

Add a Go driver for OpenAI (GPT models).

The config file conf/models/openai.json has been in the repo for a while
with the full GPT-5 model list, but
internal/entity/models/factory.go had no case for "openai". So any
tenant that configured OpenAI as a model provider in the Go layer fell
through to the default branch and got the dummy driver. Chat, list
models, and check connection all returned dummy responses instead of
reaching the API.

OpenAI is the most commonly requested provider and the JSON config
already ships with the repo, so this gap is high impact even though the
JSON has been there for some time.

### What this PR includes

- New file internal/entity/models/openai.go with an OpenAIModel that
implements the ModelDriver interface.
- factory.go: route the "openai" provider name to NewOpenAIModel.
- conf/models/openai.json: add "models": "models" under url_suffix so
ListModels can hit /v1/models with no hardcoded fallback.

### How the driver works

- OpenAI exposes the canonical OpenAI-compatible API at
https://api.openai.com/v1.
- ChatWithMessages and ChatStreamlyWithSender post to /chat/completions
in the same shape the moonshot, vllm, and xai drivers use.
- ListModels and CheckConnection call /models to list available ids and
confirm the API key works.
- reasoning_content is passed through for the o-series and other
reasoning models, in both the non-stream and stream paths.
- Encode (embeddings) is left as "not implemented" for now, the same way
the other recent provider drivers do it. Rerank and Balance are not part
of OpenAI's public API surface in this layer and return a clear "not
implemented" or "no such method" error.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- go build ./internal/entity/models/... in a clean go 1.25 image (the
go.mod minimum) returns exit 0 with no errors.
- Method set of OpenAIModel matches the ModelDriver interface:
NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode,
Rerank, ListModels, Balance, CheckConnection.
- Pattern parity with the merged moonshot (#14433), volcengine (#14460),
minimax (#14478), vllm (#14532), xai (#14550), and lm-studio (#14586)
PRs.

Closes #14604
2026-05-07 13:09:51 +08:00