Commit Graph

22 Commits

Author SHA1 Message Date
Haruko386
078ea3bf4a Go: implement provider: Nvidia (#14623)
### What problem does this PR solve?

1. **Implement `Nvidia` Provider:** Fully support NVIDIA NIM APIs with
robust parameter handling (including the `thinking` parameter) and safe
URL merging in `NewInstance`.
2. **Fix Misleading CLI Errors:** Corrected a bug in `common_command.go`
where failed chat requests inaccurately reported `failed to list
instance models`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-05-07 14:17:57 +08:00
Panda Dev
b8b741555f Go: implement provider: OpenAI (#14605)
### What problem does this PR solve?

Add a Go driver for OpenAI (GPT models).

The config file conf/models/openai.json has been in the repo for a while
with the full GPT-5 model list, but
internal/entity/models/factory.go had no case for "openai". So any
tenant that configured OpenAI as a model provider in the Go layer fell
through to the default branch and got the dummy driver. Chat, list
models, and check connection all returned dummy responses instead of
reaching the API.

OpenAI is the most commonly requested provider and the JSON config
already ships with the repo, so this gap is high impact even though the
JSON has been there for some time.

### What this PR includes

- New file internal/entity/models/openai.go with an OpenAIModel that
implements the ModelDriver interface.
- factory.go: route the "openai" provider name to NewOpenAIModel.
- conf/models/openai.json: add "models": "models" under url_suffix so
ListModels can hit /v1/models with no hardcoded fallback.

### How the driver works

- OpenAI exposes the canonical OpenAI-compatible API at
https://api.openai.com/v1.
- ChatWithMessages and ChatStreamlyWithSender post to /chat/completions
in the same shape the moonshot, vllm, and xai drivers use.
- ListModels and CheckConnection call /models to list available ids and
confirm the API key works.
- reasoning_content is passed through for the o-series and other
reasoning models, in both the non-stream and stream paths.
- Encode (embeddings) is left as "not implemented" for now, the same way
the other recent provider drivers do it. Rerank and Balance are not part
of OpenAI's public API surface in this layer and return a clear "not
implemented" or "no such method" error.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### How was this tested?

- go build ./internal/entity/models/... in a clean go 1.25 image (the
go.mod minimum) returns exit 0 with no errors.
- Method set of OpenAIModel matches the ModelDriver interface:
NewInstance, Name, ChatWithMessages, ChatStreamlyWithSender, Encode,
Rerank, ListModels, Balance, CheckConnection.
- Pattern parity with the merged moonshot (#14433), volcengine (#14460),
minimax (#14478), vllm (#14532), xai (#14550), and lm-studio (#14586)
PRs.

Closes #14604
2026-05-07 13:09:51 +08:00
Haruko386
dd7a0ce1d3 Go: implement provider: lm-studio (#14586)
### What problem does this PR solve?

implement `lm-studio` provider

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-06 19:23:11 +08:00
Haruko386
cd54c08e84 Go: implement provider: Ollama (#14580)
### What problem does this PR solve?

implement `Ollama` provider

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-06 12:03:58 +08:00
Haruko386
0e1477eb23 Go: implement provider: MiniMax (#14478)
### What problem does this PR solve?

implement MiniMax provider

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-04-29 19:06:40 +08:00
Jin Hai
bb05a8bd7e Update create model instance command (#14441)
### What problem does this PR solve?

1. support command:

```
RAGFlow(user)> create provider 'vllm' instance 'test' key 'test-key' url 'base-url' region 'abc';
SUCCESS
RAGFlow(user)> list instances from 'vllm';
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| apiKey   | extra                                  | id                               | instanceName | providerID                       | status |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
| test-key | {"base_url":"base-url","region":"abc"} | 40213c89430311f1a7cf38a74640adcc | test         | b4d40e6142d311f1a4f938a74640adcc | enable |
+----------+----------------------------------------+----------------------------------+--------------+----------------------------------+--------+
```
2. support add vllm model
```
RAGFlow(user)> add model 'Qwen/Qwen2-0.5B' to provider 'vllm' instance 'test' with tokens 131072 chat;
SUCCESS
```
3. add vllm chat

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-29 17:05:08 +08:00
Haruko386
decf673049 Go: implement provider: volcengine (#14460)
### What problem does this PR solve?

implement `volcengine` provider 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-29 15:45:08 +08:00
Jin Hai
f670913bb4 Refactor model type to model class (#14426)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-28 16:05:15 +08:00
qinling0210
effc84a042 Refactor model in GO (#14398)
### What problem does this PR solve?

Refactor model in GO

### Type of change

- [x] Refactoring
2026-04-28 12:59:01 +08:00
Jin Hai
819257f257 Go: add volcengine (#14409)
### What problem does this PR solve?

1. Refactor server_main
2. Add volcengine

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-28 12:12:58 +08:00
Jin Hai
965717c4fb Go: add new provider: google (#14395)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-27 20:35:47 +08:00
Jin Hai
c3eac4103a Go: aliyun model provider (#14379)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-27 14:53:33 +08:00
Jin Hai
1c244df90d Go: add gitee and siliconflow as model provider (#14336)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-24 20:59:30 +08:00
qinling0210
1473000135 Implement retrieval_test in GO (#14231)
### What problem does this PR solve?

Implement retrieval_test in GO

### Type of change

- [x] Refactoring
2026-04-24 15:30:14 +08:00
Jin Hai
2b029882d7 Go: add new provider minimax (#14296)
### What problem does this PR solve?

1. Add new provider minimax
2. Add new command: CHECK INSTANCE 'instance_name' FROM 'provider_name';
```
RAGFlow(user)> check instance 'test' from 'minimax';
SUCCESS
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-23 10:16:20 +08:00
Jin Hai
b8660b9919 Add deepseek and moonshot model json (#14290)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-22 15:59:41 +08:00
Jin Hai
e48d75987c Go: add stream / think chat (#14242)
### What problem does this PR solve?

1. Supports stream and non-stream chat
2. Supports think and non-think chat
3. List supported models from DeepSeek service. (This command can be
used to verify the API validity)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-21 16:52:32 +08:00
Jin Hai
f269ee9739 Go: add thinking features to zhipu-ai (#14234)
### What problem does this PR solve?

```
RAGFlow(user)> list models from 'zhipu-ai';
+------------+------------+---------------+----------------+
| features   | max_tokens | model_types   | name           |
+------------+------------+---------------+----------------+
| [thinking] | 128000     | [chat]        | glm-4.7        |
| [thinking] | 128000     | [chat]        | glm-4.5        |
| [thinking] | 128000     | [chat vision] | glm-4.6v-Flash |
| [thinking] | 128000     | [chat]        | glm-4.5-x      |
| [thinking] | 128000     | [chat]        | glm-4.5-air    |
| [thinking] | 128000     | [chat]        | glm-4.5-airx   |
| [thinking] | 128000     | [chat]        | glm-4.5-flash  |
| [thinking] | 64000      | [vision]      | glm-4.5v       |
|            | 128000     | [chat]        | glm-4-plus     |
|            | 128000     | [chat]        | glm-4-0520     |
|            | 128000     | [chat]        | glm-4          |
|            | 8000       | [chat]        | glm-4-airx     |
|            | 128000     | [chat]        | glm-4-air      |
|            | 128000     | [chat]        | glm-4-flash    |
|            | 128000     | [chat]        | glm-4-flashx   |
|            | 1000000    | [chat]        | glm-4-long     |
|            | 128000     | [chat]        | glm-3-turbo    |
|            | 2000       | [vision]      | glm-4v         |
|            | 8192       | [chat]        | glm-4-9b       |
|            | 512        | [embedding]   | embedding-2    |
|            | 512        | [embedding]   | embedding-3    |
|            | 4096       | [asr]         | glm-asr        |
|            | 0          | [tts]         | glm-tts        |
|            | 0          | [ocr]         | glm-ocr        |
|            | 0          | [rerank]      | glm-rerank     |
+------------+------------+---------------+----------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-20 21:53:27 +08:00
Jin Hai
af2ed416a7 Add extra field to model instance (#14203)
### What problem does this PR solve?

Now each model support region with different URL

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-20 15:31:12 +08:00
Jin Hai
94106646e7 Go: set and list default models (#14191)
### What problem does this PR solve?

```
RAGFlow(user)> set default vlm "zhipu-ai" "ccc" "glm-4.6v-flash";
SUCCESS
RAGFlow(user)> list default models;
+--------+----------------+----------------+----------------+------------+
| enable | model_instance | model_name     | model_provider | model_type |
+--------+----------------+----------------+----------------+------------+
| true   | ccc            | glm-4.6v-flash | zhipu-ai       | llm        |
| true   | ccc            | glm-4.6v-flash | zhipu-ai       | image2text |
+--------+----------------+----------------+----------------+------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-17 18:05:33 +08:00
Jin Hai
6c29128de1 Refactor model provider and command (#13887)
### What problem does this PR solve?

Introduce 5 new tables, including model groups and provider instance.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-02 20:20:35 +08:00
Jin Hai
e20cf39735 Refactor Go server model provider reading and access (#13831)
### What problem does this PR solve?

1. Refactor model provider json file format
2. Use memory data structure to replace database
3. Add CLI command to access

```
RAGFlow(user)> list pool models from 'xai';
+-------------------------------------------------------------------------------------+------------+-------------+-----------------------+
| features                                                                            | max_tokens | model_types | name                  |
+-------------------------------------------------------------------------------------+------------+-------------+-----------------------+
| map[]                                                                               | 256000     | [llm]       | grok-4                |
| map[]                                                                               | 131072     | [llm]       | grok-3                |
| map[]                                                                               | 131072     | [llm]       | grok-3-fast           |
| map[]                                                                               | 131072     | [llm]       | grok-3-mini           |
| map[]                                                                               | 131072     | [llm]       | grok-3-mini-mini-fast |
| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] | 32768      | [vlm]       | grok-2-vision         |
+-------------------------------------------------------------------------------------+------------+-------------+-----------------------+
RAGFlow(user)> show pool model 'grok-2-vision' from 'xai';
+-------------------------------------------------------------------------------------+------------+-------------+---------------+
| features                                                                            | max_tokens | model_types | name          |
+-------------------------------------------------------------------------------------+------------+-------------+---------------+
| map[multimodal:map[enabled:true input_modalities:[image] output_modalities:[text]]] | 32768      | [vlm]       | grok-2-vision |
+-------------------------------------------------------------------------------------+------------+-------------+---------------+
RAGFlow(user)> list pool providers;
+--------+------------------------------------------------------------+---------------------------+
| name   | tags                                                       | url                       |
+--------+------------------------------------------------------------+---------------------------+
| OpenAI | LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION | https://api.openai.com/v1 |
| xAI    | LLM                                                        | https://api.x.ai/v1       |
+--------+------------------------------------------------------------+---------------------------+
RAGFlow(user)> show pool provider 'openai';
+---------------------------+--------+------------------------------------------------------------+--------------+
| base_url                  | name   | tags                                                       | total_models |
+---------------------------+--------+------------------------------------------------------------+--------------+
| https://api.openai.com/v1 | OpenAI | LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION | 27           |
+---------------------------+--------+------------------------------------------------------------+--------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-30 12:00:49 +08:00