Commit Graph

255 Commits

Author SHA1 Message Date
oktofeesh
8468227a1a fix(go-models): harden 302.AI driver requests (#15289)
## Summary
- Harden the 302.AI model driver request validation and response parsing
paths.
- Add focused tests for chat request mode, model listing, malformed
provider responses, and input validation.

## What changed
- Validate API keys, model names, rerank queries, ASR file paths, OCR
inputs, parse URLs, task IDs, and model-list IDs before use.
- Keep chat and streaming methods from accepting conflicting `stream`
values in request payloads.
- Send `ListModels` as a bodyless GET and parse the response with typed
JSON structs instead of unchecked assertions.
- Remove raw SSE event logging from stream handling.

## Why
The driver could panic or send inconsistent requests when optional
config fields were nil, empty, malformed, or contradicted the method
path. This keeps provider-driver behavior explicit while preserving the
existing supported 302.AI flows.

Closes #14736
2026-05-28 13:33:01 +08:00
Hz_
0694b4af57 fix: include user model settings in /user/me response (#15320)
### What problem does this PR solve?

Fixes the `/user/me` response so it returns the current user's model
settings correctly.

### Type of change

- Added model settings data to the `/user/me` response.
- Kept the response structure compatible with existing user profile
fields.
- Avoided changing unrelated user/session behavior.
2026-05-28 13:31:16 +08:00
tmimmanuel
085241b039 Go: implement system healthz API (#15307)
## Summary
- Add Go REST support for `GET /api/v1/system/healthz`.
- Return Python-compatible `ok`/`nok` dependency fields for DB, Redis,
document engine, and storage.
- Return HTTP 200 only when all checks pass; otherwise return HTTP 500
with `_meta` failure details.
- Add focused service coverage for the unhealthy dependency response
when Go dependencies are not initialized.

## Scope
This is a small, isolated slice of #15240. It avoids current open
connector PRs (#15274, #15300, #15265, #15264), tenant/member PRs
(#15295, #15301, #15276), MCP PRs (#15281, #15253, #15254, #15260,
#15261, #15262), and the memory-message PR (#15256).

Refs #15240
2026-05-28 13:30:22 +08:00
Hz_
b472ceeb68 go: add PATCH /api/v1/users/me user settings update (#15297)
### What problem does this PR solve?

- Add Go implementation parity for `PATCH /api/v1/users/me`.

- This updates the Go user settings endpoint to match the Python
behavior for updating the current user's profile settings.

### Changes

- Route `PATCH /api/v1/users/me` through the authenticated current user
from middleware.
- Add `password` and `new_password` support to `UpdateSettingsRequest`.
- Prevent `email` from being updated through this endpoint, matching the
Python blacklist behavior.
  - Support updating:
    - `nickname`
    - `avatar`
    - `language`
    - `color_schema`
    - `timezone`
    - `password`
  - Align password handling with Python:
    - invalid plaintext password payload returns `CodeExceptionError`
    - wrong old password returns `Password error!`
- successful update returns `{ code: 0, data: true, message: "success"
}`

### Test

Tested manually with Python and Go backends using the same request
bodies:

  - `PATCH /api/v1/users/me` with nickname/timezone update
- plaintext password payload returns Python-compatible `Incorrect
padding`
  - wrong old password returns `Password error!`
2026-05-28 07:08:50 +08:00
Haruko386
82318dee5d feat[Go]: implement create_connector API (#15285)
### What problem does this PR solve?

implement create_connector API

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-27 15:54:11 +08:00
oktofeesh
7fb9a26623 fix(go-models): validate TokenHub chat requests (#15283)
## Summary
- centralize TokenHub chat request validation for chat and streaming
calls
- reject blank TokenHub model names before sending provider requests
- send TokenHub model listing requests as bodyless GET requests

## What changed
- Added shared TokenHub chat request validation for API key, model name,
and messages.
- Updated `ListModels` to call `GET /models` without a request body.
- Added focused tests for blank model names and accidental GET request
bodies.
- Replaced an httptest handler callback `t.Fatalf` with `t.Errorf` plus
an HTTP error and return.

## Why
TokenHub chat requests should fail locally for invalid model names
instead of sending avoidable malformed requests upstream. Model listing
should also match normal GET semantics and avoid sending an empty JSON
body.

Closes #14736

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-27 14:39:41 +08:00
Haruko386
ae88578451 Go: implement TTS and ASR for X.AI (#15247)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-27 14:08:35 +08:00
tmimmanuel
0b000b833e Go: implement connector get API (#15259)
## Summary
- Add Go REST support for `GET /api/v1/connectors/:connector_id`.
- Reuse the Python API behavior by returning the connector only when the
current user can access its tenant.
- Add focused handler coverage for success and unauthorized responses.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-27 14:07:55 +08:00
sxxtony
17b5b33574 Go: implement Rerank in Replicate driver (#15278)
### What problem does this PR solve?

`ReplicateModel.Rerank` in `internal/entity/models/replicate.go` was a
`"replicate, no such method"` stub. The chat path landed in #14958 and
the embed path in #15073; rerank is the last major retrieval surface
still missing on this provider.

Until this PR, a tenant who selected a Replicate reranker model got the
sentinel error on every rerank call.

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-27 14:07:00 +08:00
Hz_
47626bbe63 go: add Qiniu model provider (#15280)
### What problem does this PR solve?

This PR adds Qiniu provider integration for the Go model driver layer in
RAGFlow.

  Supported capabilities:

  - [X] Chat
  - [X] Think Chat
  - [X] Stream Chat
  - [X] Stream Think Chat
  - [X] Model listing
  - [X] Provider configuration and factory registration

  Verified examples from the CLI:

  ```
  login user '***' password '***';

  ADD PROVIDER 'qiniu';

  CREATE PROVIDER 'qiniu' INSTANCE 'test' KEY '***';

chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message
'hello';

think chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu'
message 'hello';

stream chat with 'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu'
message 'hello, what are you';

stream think chat with
'deepseek/deepseek-v3.1-terminus-thinking@test@qiniu' message 'hello,
what are you';

stream think chat with 'qwen3-max-2026-01-23@test@qiniu' message 'hello,
what are you';

  LIST MODELS FROM 'qiniu' 'test';

```

  ### Type of change

  - [X] New Feature
  - [X] Provider integration
2026-05-27 13:19:39 +08:00
oktofeesh
a3c6e075f6 fix(go-models): add VolcEngine model listing suffix (#15234)
## Summary
- add the VolcEngine `models` URL suffix used by the existing Go
`ListModels` implementation
- return a clear error when the VolcEngine models suffix is missing
- add focused VolcEngine model-listing regression tests

## What changed
- Added `url_suffix.models` to `conf/models/volcengine.json`.
- Normalized the configured models suffix before building the request
URL.
- Covered config loading, successful model listing, upstream errors, and
missing suffix handling.

## Why
`VolcEngine.ListModels` already builds requests from `URLSuffix.Models`,
but the bundled VolcEngine config did not define that suffix. That left
the model-listing path unable to call the documented `/models` endpoint
from the existing provider config.

Fixes #14701

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-27 13:14:56 +08:00
oktofeesh
5ae41dc1eb fix(go-models): route hosted OCR providers through drivers (#15233)
## Summary
- route hosted MinerU.Net and PaddleOCR.Net provider names to their
existing Go drivers
- add regression coverage for loading the hosted OCR provider configs
through ProviderManager

## What changed
- Added canonical provider-name aliases for the hosted OCR provider
display names.
- Covered both bundled configs with a focused provider-manager test.

## Why
The hosted provider configs use display names with `.Net`, while model
factory dispatch lowercases the provider name. Without aliases, those
configs fall through to `DummyModel` instead of using the existing
MinerU and PaddleOCR drivers.

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 20:40:40 +08:00
oktofeesh
22a3b8cdf9 feat(go-models): list LongCat models (#15241)
## Summary
- Add LongCat model-list support through the documented
OpenAI-compatible models endpoint.

## What changed
- Add the LongCat `models` URL suffix for `/openai/v1/models`.
- Implement `ListModels` for the LongCat Go driver.
- Delegate `CheckConnection` to the lightweight model-list request.
- Add focused regression coverage for successful, malformed, oversized,
and missing-key responses.

## Why
LongCat documents a models endpoint under the OpenAI-compatible API
surface, but the Go driver still returned `no such method` for model
listing and connection checks.

## Validation
- `go test ./internal/entity/models -run TestLongCat -count=1`
- `go test -race ./internal/entity/models -run TestLongCat -count=1`
- `go test ./internal/entity -count=1`
- `git diff --check`

## Notes
- Related to the broader Go model provider tracking in #14736, but this
PR only handles LongCat model listing.
- `go test ./internal/entity/models -count=1` is currently blocked by an
unrelated Astraflow test panic outside this LongCat change.

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 19:58:53 +08:00
oktofeesh
557024e7d4 fix(go-models): add xAI model listing suffix (#15236)
## Summary
- add the xAI `models` URL suffix used by the existing Go `ListModels`
implementation
- return a clear error when the xAI models suffix is missing
- add focused xAI model-listing and connection-check regression tests

## What changed
- Added `url_suffix.models` to `conf/models/xai.json`.
- Normalized the configured models suffix before building the request
URL.
- Covered config loading, successful model listing, upstream errors,
API-key validation, missing suffix handling, and `CheckConnection`
delegation.

## Why
`XAIModel.ListModels` already builds requests from `URLSuffix.Models`,
and `CheckConnection` delegates to that method. The bundled xAI config
did not define that suffix, which left the model-listing path unable to
call the provider `/models` endpoint from the existing provider config.

## Validation
- `go test ./internal/entity/models -run TestXAI -count=1`
- `go test ./internal/entity -count=1`
- `git diff HEAD~1..HEAD --check`

## Notes
- `go test ./internal/entity/models -count=1` currently fails in
unchanged Astraflow coverage: `TestAstraflowEmbedReturnsNoSuchMethod`
panics before reaching any xAI assertions.

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 19:58:20 +08:00
Haruko386
3619ceca01 Go: implement provider: OrcaRouter (#15235)
### What problem does this PR solve?

implement provider `OrcaRouter`
**The following functionalities are now supported:**

**Cohere:**
- [x] Chat / Think Chat / Stream Chat / Stream Think Chat
- [x] Model listing
- [x] TTS
- [ ] Balance


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 18:20:33 +08:00
dripsmvcp
a48bcf814d Go: implement provider: ModelScope (#15041)
Closes #15040.

ModelScope was listed unchecked in the Go-rewrite tracker #14736 and
already had an llm_factories.json entry (tags: LLM) but no Go driver, so
the new Go API server could not route ModelScope instances. The Python
side has supported it through the OpenAI-compatible base at
rag/llm/chat_model.py:618 (ModelScopeChat), which requires a
user-supplied base URL and appends /v1.

This adds:
- internal/entity/models/modelscope.go: self-hosted OpenAI-compatible
driver with chat (sync + SSE stream with idle-timeout cancellation),
list_models, and check_connection. Auth header is optional, matching the
xinference pattern, so deployments without auth and auth-enabled
deployments both work. Base URL is normalized so users can configure
either the root endpoint or the /v1 endpoint.
- internal/entity/models/modelscope_test.go: 12 tests covering name, URL
normalization, factory routing, chat happy path / auth header /
reasoning_content extraction, stream happy path / stream=false rejection
/ idle cancellation, list_models + check_connection, missing-base-URL
clear error, and the no-such-method sentinels.
- conf/models/modelscope.json: shipped config (class: "local",
url_suffix v1/chat/completions and v1/models).
- internal/entity/models/factory.go: case "modelscope" →
ModelScopeModel.
- internal/service/llm.go: ModelScope added to the selfDeployed map
alongside Ollama, Xinference, LocalAI, LM-Studio, GPUStack — the Python
side requires user-supplied URL with no default, so the Go side
classifies it the same way.

Follow-on issues will add Embed and Rerank, in line with how Novita,
NVIDIA, TogetherAI, and other providers landed method-by-method.

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 18:18:46 +08:00
Hz_
84add43208 Add HuaweiCloud model provider (#15237)
### What problem does this PR solve?

  This PR adds HuaweiCloud provider integration in RAGFlow.

  Supported capabilities:

  - [x] Chat / Think Chat / Stream Chat / Stream Think Chat
  - [x] Embedding
  - [x] Rerank
  - [x] Model listing
  - [x] Provider connection checking

  Verified examples from the CLI:

  ```
  check instance 'test' from 'HuaweiCloud';

  chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';

  think chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';

  stream chat with 'deepseek-v4-flash@test@HuaweiCloud' message 'hello';

stream think chat with 'deepseek-v4-flash@test@HuaweiCloud' message
'hello';

embed text 'what is rag' 'who are you' with 'bge-m3@test@HuaweiCloud'
dimension 1024;

rerank query 'what is rag' document 'rag is retrieval augmented
generation' 'rag need llm' 'famous rag
project includes ragflow' with 'bge-reranker-v2-m3@test@HuaweiCloud' top
3;

  list supported models from 'HuaweiCloud' 'test';

  LIST MODELS FROM 'HuaweiCloud' 'test';
```
  ### Type of change

  - [x] New Feature
  - [x] Provider integration
2026-05-26 17:13:15 +08:00
ghost
a7d25391dc fix(tokenhub): wire Go driver and harden requests (#15224)
## Summary
- Wire the Go TokenHub provider through the model factory.
- Harden TokenHub request handling for chat, streaming, embeddings, and
model listing.
- Add focused TokenHub unit coverage for factory wiring and provider
behavior.

## Notes
- Refs #14736.
- Follows up #15159.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 17:12:37 +08:00
Jake Armstrong
0fb85a66bc feat(go-models): add AWS Bedrock provider driver (#15166)
## Summary

Closes #15165.

Implements the AWS Bedrock model provider for the Go API server, tracked
under #14736. Adds Converse + Converse-Stream chat and foundation-model
listing, with SigV4 signing over a hand-rolled `net/http` path that
matches the established pattern in `internal/entity/models/` (no new
direct `go.mod` deps).

## Linked tracker

Tracked under #14736 (Implement model providers of RAGFlow API server in
Go). Closes #15165.
2026-05-26 17:10:06 +08:00
glorydavid03023
3dbd874a79 Go: implement Rerank in DeepInfra driver (#15185)
### What problem does this PR solve?

The Go DeepInfra driver returned a stub error for `Rerank()` even though
DeepInfra serves reranker models at `POST /v1/inference/{model}` with
`query`, `documents`, and a `scores[]` response.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 10:52:09 +08:00
sxxtony
67f7d87dff Go: implement provider: FuturMix (#15013)
### What problem does this PR solve?

Add a Go driver for **FuturMix** (https://futurmix.ai/docs), one of the
unchecked providers on the umbrella tracking issue #14736. FuturMix is
documented as an "OpenAI-compatible API" aggregator over Claude / GPT /
Gemini / DeepSeek (~22 models per their `/models` page).

Until this PR, a tenant who configured `futurmix` as a model provider in
the Go layer fell through to the default branch of
`internal/entity/models/factory.go` and got the dummy driver.

---------

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 10:51:29 +08:00
Renzo
806414df43 Go: validate Baidu OCR inputs (#15168)
### What problem does this PR solve?

Closes #15167.

The Baidu Go provider advertises OCR support through
`paddleocr-vl-0.9b`, but `BaiduModel.OCRFile` dereferenced required
inputs before validating them. Calling OCR with a missing API config,
API key, or model name could panic instead of returning a normal error.

This PR adds explicit input validation for those required values.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-26 10:51:05 +08:00
Jake Armstrong
b961810e79 Go: implement OCR in ZhipuAI driver (#15143)
### What problem does this PR solve?

Closes #15142.

ZhipuAI lists `glm-ocr` as an OCR model, but the Go driver still
returned `no such method` from `OCRFile`. This wires the advertised
model to Z.AI's documented `layout_parsing` endpoint and returns the
`md_results` Markdown output through the existing `OCRFileResponse.Text`
field.

This PR also adds focused tests for URL input, raw file-content base64
input, and validation errors.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Test

- [x] `go test -vet=off ./internal/entity/models -run
'TestZhipuAIOCRFile'`
2026-05-26 10:50:06 +08:00
qinling0210
af85aa9c7b Implement Elasticsearch functions in GO (#15160)
### What problem does this PR solve?

Implement Elasticsearch functions in GO (except for Search)

### Type of change

- [x] Refactoring
2026-05-25 19:15:07 +08:00
Haruko386
4783ce9951 fix(Go): rewrite chat, listmodels, embed for Ollama (#15213)
### What problem does this PR solve?

IDK how to implement **`Ollama`** on #14580 but it's totally wrong.
This is the rewrite version for **`Ollama`**

**Verified from CLI**
```
# Embed
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'nomic-embed-text:latest@test12@ollama' dimension 1024;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 768       | 0     |
| 768       | 1     |
+-----------+-------+

# Chat
RAGFlow(user)> think chat with 'qwen3:0.6b@test12@ollama' message 'who r u'
Thinking: Okay, the user asked, "Who r u?" I need to respond appropriately. First, I should acknowledge their question. Since I'm an AI, I don't have a physical form, but I can confirm that I'm a large language model. I should keep the response friendly and offer help. Let me make sure I'm not making up any information and that the response is natural. Also, I should check for any typos and ensure clarity. Alright, that should cover it.

Answer: I'm an AI language model, and I don't have a physical form. However, I can tell you that I'm designed to assist with questions and tasks. How can I help you today?
Time: 2.914285


RAGFlow(user)> stream think chat with 'qwen3:0.6b@test12@ollama' message 'who r u'
Thinking: , the user asked, "Who are you?" I need to respond appropriately. Since I'm an AI assistant, I should mention that I don't have a physical form or a mind. I should also clarify that I can help with various tasks like answering questions or providing information. It's important to keep the response friendly and informative while maintaining the correct tone.
Answer:  don't have a physical form or a mind, but I'm here to help with your questions or tasks! What can I do for you today?
Time: 1.740047

# LisyModels
RAGFlow(user)> list supported models from 'ollama' 'test12'
+-------------------------+
| model_name              |
+-------------------------+
| nomic-embed-text:latest |
| qwen3:0.6b              |
+-------------------------+
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2026-05-25 18:55:03 +08:00
Haruko386
69f301b84a Go: implement embed for Tencent Hunyuan (#15207)
### What problem does this PR solve?

Implement embed for Tencent Hunyuan

**Verified from CLI**
```
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'hunyuan-embedding@test1@hunyuan' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 1024      | 0     |
| 1024      | 1     |
+-----------+-------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-25 16:04:17 +08:00
ちー
bb6cfc14e6 feat[go]: implement provider: TokenHub (#15159)
### What problem does this PR solve?

implement provider TokenHub

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-25 16:02:50 +08:00
Jin Hai
f8c626bbc8 Go: add ingestion server (#15094)
### What problem does this PR solve?

1. Go ingestion server will connected with admin server with gRPC stream
2. Go ingestion server will be responsible for ingestion tasks
```

RAGFlow(admin)> list ingestors;
+-----------------+-----------+----------------------------------+---------------------------+----------+------------+--------------+--------+------------+---------------+
| address         | cpu_usage | id                               | last_heartbeat            | name     | process_id | rss_usage    | status | task_count | vms_usage     |
+-----------------+-----------+----------------------------------+---------------------------+----------+------------+--------------+--------+------------+---------------+
| 127.0.0.1:58564 | 0         | bdd1870eea2646e0aacb8a2cd3307aa2 | 2026-05-24T18:16:17+08:00 | ingestor | 680152     | 212.72265625 | active | 0          | 2589.12109375 |
+-----------------+-----------+----------------------------------+---------------------------+----------+------------+--------------+--------+------------+---------------+

RAGFlow(admin)> start ingestion 'abc';
+----------------------------------+
| task_id                          |
+----------------------------------+
| e714777639ca4760ab427b5f211e81ad |
+----------------------------------+

RAGFlow(admin)> stop ingestion 'f7bd39d0a724457eb5fdce6d81699776';
+----------------------------------+
| task_id                          |
+----------------------------------+
| f7bd39d0a724457eb5fdce6d81699776 |
+----------------------------------+

RAGFlow(admin)> list tasks;
+-----+----------------------------------+-------+------+----------------------------------+---------------------------+------------+------------+
| ETA | assign_to                        | error | from | id                               | last_update               | start_time | status     |
+-----+----------------------------------+-------+------+----------------------------------+---------------------------+------------+------------+
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | eae6431da72a40e796cff3a03008091b | 2026-05-24T19:46:03+08:00 |            | COMPLETED  |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | 6cccdd174bd049ecb05a774bbb47593f | 2026-05-24T19:46:03+08:00 |            | COMPLETED  |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | ef360d777e57485799adb96b30f2b4b8 | 2026-05-24T19:46:03+08:00 |            | CANCELED   |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | bcc5c5448cb64de48b6b6171c36fb790 | 2026-05-24T19:46:03+08:00 |            | CANCELED   |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | bfc25384c43a443294fe2da979a38ac2 | 2026-05-24T19:46:03+08:00 |            | DISPATCHED |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | 84960537b85d413b8990a9efd5952d67 | 2026-05-24T19:46:04+08:00 |            | DISPATCHED |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | 3d223c1b51e24b36861a3bfb2f1d58d4 | 2026-05-24T19:46:03+08:00 |            | CANCELED   |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | e433b0e356b846c89c301621a3c54494 | 2026-05-24T19:46:03+08:00 |            | COMPLETED  |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | 7c93a3880f074ebd8eca14e6b51bb7ef | 2026-05-24T19:46:03+08:00 |            | COMPLETED  |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | df2e4ef51aaf4390bff9a23f2692486e | 2026-05-24T19:46:04+08:00 |            | DISPATCHED |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | 7377c53010194ef7a83aa206698d66ff | 2026-05-24T19:46:05+08:00 |            | DISPATCHED |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | df64d1a1f9d348e3a2f174c4d7d69e73 | 2026-05-24T19:46:05+08:00 |            | DISPATCHED |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | b59834512e2847e1bdf13ace04b8a456 | 2026-05-24T19:46:06+08:00 |            | DISPATCHED |
| 0   | 17937da188b84f23a5c10bb87588944b |       | CLI  | 0064bb0ab69344028d1ecfda053826f4 | 2026-05-24T19:46:03+08:00 |            | QUEUED     |
+-----+----------------------------------+-------+------+----------------------------------+---------------------------+------------+------------+


```


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-25 14:00:08 +08:00
Haruko386
5d022d83e8 Go: implement provider: PaddleOCR_Local (#15158)
### What problem does this PR solve?

Go: implement provider: PaddleOCR_Local

**Verified from CLI**

```
RAGFlow(user)> ocr with 'PaddleOCR-VL@test@paddleocr_local' file './internal/test1.jpg'
+----------------------+
| text                 |
+----------------------+
| ## Parallel to these |
+----------------------+
```

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [X] Refactoring
2026-05-25 12:12:57 +08:00
dripsmvcp
8d8ea71877 Go: implement provider: Tencent Hunyuan (#15092)
## Summary
- Adds a `Hunyuan` Go driver so the new API server can route Tencent
Hunyuan chat instances (registered in `conf/llm_factories.json:3830` as
`Tencent Hunyuan`). Follows the same SaaS-driver shape used for
Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage, and
LongCat.

Closes #15087

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-25 11:04:39 +08:00
bitloi
432e966414 fix(go): support OpenAI audio endpoints (#15104)
### What problem does this PR solve?

Closes #15102.

OpenAI's Go provider config advertises `whisper-1` as ASR and `tts-1` as
TTS, but the Go driver returned `openai, no such method` for both audio
paths and did not define `url_suffix.asr` / `url_suffix.tts`.

This PR:

- adds OpenAI audio URL suffixes for `audio/transcriptions` and
`audio/speech`
- implements non-streaming `TranscribeAudio` using multipart form
uploads
- implements non-streaming `AudioSpeech` using the OpenAI speech JSON
request shape
- keeps streaming TTS explicitly unsupported instead of sending binary
audio through the text SSE sender
- adds focused tests for config coverage, ASR/TTS request shape,
required TTS voice validation, and unsupported streaming TTS


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-25 10:25:53 +08:00
Tohka
302f97de50 Go: implement reasoning_chat, TTS, ASR for Groq (#15153)
### What problem does this PR solve?

Go: implement reasoning_chat, TTS, ASR for Groq

**Verify from CLI**
```
RAGFlow(user)> think chat with 'qwen/qwen3-32b@test@groq' message 'who r u'
Thinking: Okay, the user asked, who r u. I need to determine what the user is asking. They may be asking about my identity. I should introduce my name and basic functions. The user might want to know what I can do, so I should list some common use cases, such as answering questions, creating writing, coding, and expressing opinions. The user may be curious about how they can interact with me, so they can be advised to ask any questions or provide instructions. Keep your answers conversational, avoid overly technical terms, keep answers concise, and encourage further interaction. Check if there's any ambiguity in the answer and make sure it's accurate and meets the user's needs. Also consider if there are other aspects the user may be interested in, such as my training data or performance. But since the question is basic, I'll focus on the essentials first and invite the user to ask more. In summary, respond to the user's questions by introducing yourself, your functions, and encouraging further interaction.

Answer: Hello! I'm Qwen. I am a large-scale language model developed by Tongyi Lab, designed to assist you in various ways, such as answering questions, creating text, logical reasoning, programming, and more. I aim to provide clear, accurate, and helpful information and support. How can I assist you today? Feel free to ask any questions or give me tasks! 😊
Time: 2.199908


RAGFlow(user)> stream think chat with 'openai/gpt-oss-20b@test@groq' message 'who r u'
Thinking:  to respond politely.
Answer: ’m ChatGPT—an AI language model created by OpenAI. I’m here to answer questions, offer explanations, and help with a wide range of topics. How can I assist you today?


RAGFlow(user)> tts with 'canopylabs/orpheus-arabic-saudi@test@groq' text 'hello? show yourself' play format 'wav' param '{"voice": "fahad"}'
SUCCESS


RAGFlow(user)> asr with 'whisper-large-v3-turbo@test@groq' audio './internal/test.wav' param '{"language": "en"}'
+----------------------------------------------------------------------------------------------------------------------+
| text                                                                                                                 |
+----------------------------------------------------------------------------------------------------------------------+
|  The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired |
+----------------------------------------------------------------------------------------------------------------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-22 18:02:30 +08:00
Haruko386
3f02ca7ba1 Go: implement embed, rerank, tts for AstraFlow (#15135)
### What problem does this PR solve?

implement embed, rerank, tts for AstraFlow

**Verify from CLI**

```
# Astraflow
RAGFlow(user)> tts with 'IndexTeam/IndexTTS-2@test3@astraflow' text 'hello? show yourself' play format 'wav' param '{"voice": "jack_cheng"}'
SUCCESS

RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'bge-reranker-v2-m3@test3@astraflow' top 3;
+-------+---------------------+
| index | relevance_score     |
+-------+---------------------+
| 0     | 0.9837390184402466  |
| 2     | 0.06322699040174484 |
| 1     | 0.04663187265396118 |
+-------+---------------------+

RAGFlow(user)> embed text 'walkerwhat' 'jumperwho' with 'text-embedding-3-large@test3@astraflow' dimension 16
+-----------+-------+
| dimension | index |
+-----------+-------+
| 3072      | 0     |
| 3072      | 1     |
+-----------+-------+

# Xinference


```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-22 18:02:01 +08:00
ghost
f9ce07ced1 feat(go-models): add Groq provider driver (#15097)
### What problem does this PR solve?

Closes #15088.

Adds Groq support to the Go model-provider layer so Groq instances can
be routed through the Go API server with the same OpenAI-compatible
chat, streaming, model listing, and connection-check flow used by other
SaaS providers.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

## Summary

- Added a Groq Go model driver.
- Added the Groq provider catalog and default OpenAI-compatible API URL.
- Registered Groq in the model factory.
- Added focused provider tests.

## What changed

- Implemented chat completions, SSE streaming, ListModels, and
CheckConnection for Groq.
- Covered request shape, stream termination, reasoning fallback, model
listing, custom base URLs, safe transport setup, and unsupported
methods.
- Kept the provider catalog scoped to current Groq chat-capable model
IDs.
- Cleaned up pre-existing Go model package validation blockers so the
package can be tested normally with vet enabled.

## Why

The existing Python/provider catalog path includes Groq, but the Go
model-provider layer did not have a Groq driver, so the Go API server
could not instantiate or use Groq as requested in #15088.

## Notes

The model package now validates without disabling vet.

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-22 15:24:52 +08:00
dripsmvcp
ed04893415 Go: implement provider: TokenPony (#15091)
## Summary
- Adds a `TokenPony` Go driver so the new API server can route TokenPony
chat instances, matching the existing Python `TokenPonyChat`
(`rag/llm/chat_model.py:1210`). Follows the same SaaS-driver shape used
for Astraflow, Avian, Novita, TogetherAI, Replicate, DeepInfra, Upstage,
and LongCat.

Closes #15086

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-22 15:21:45 +08:00
Jake Armstrong
b1ef5d365f Go: implement ASR in OpenRouter driver (#15067)
### What problem does this PR solve?

Fixes #15066

OpenRouter now exposes an official speech-to-text endpoint at `POST
/api/v1/audio/transcriptions`, but the Go model driver still returned
`openrouter, no such method` from `TranscribeAudio`. This left
OpenRouter ASR models unavailable through the Go API server even though
the provider already has OpenRouter audio support for TTS.

Related provider-tracking context: #14736

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-22 15:19:38 +08:00
Jake Armstrong
b2bf9155ed Go: implement ASR in ZhipuAI driver (#15134)
### What problem does this PR solve?

This PR implements ASR and TTS support for the ZhipuAI Go driver.

The ZhipuAI model config already advertises `glm-asr-2512` as an ASR
model, but the Go driver returned `zhipu, no such method` from
`TranscribeAudio`. This adds the documented audio transcription endpoint
suffix and sends multipart transcription requests with `model`,
`stream=false`, and `file` fields.

Per maintainer review, this also adds the ZhipuAI TTS endpoint suffix
and implements `AudioSpeech` / `AudioSpeechWithSender` for `glm-tts`.

Closes #15133

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-05-22 11:53:18 +08:00
ghost
b2053cc3c7 feat(go-models): add PPIO provider driver (#15099)
### What problem does this PR solve?

Closes #15089.

Adds PPIO support to the Go model-provider layer so PPIO instances can
be routed through the Go API server with the same OpenAI-compatible
chat, streaming, model listing, and connection-check flow used by other
SaaS providers.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

## Summary

- Added a PPIO Go model driver.
- Added the PPIO provider catalog and default OpenAI-compatible API URL.
- Registered PPIO in the model factory.
- Added focused provider and provider-manager tests.

## What changed

- Implemented chat completions, SSE streaming, ListModels, and
CheckConnection for PPIO.
- Covered request shape, stream termination, reasoning fallback, model
listing, custom base URLs, safe transport setup, unsupported methods,
and provider config loading.
- Kept the provider catalog aligned with the existing RAGFlow PPIO
factory model set.
- Cleaned up pre-existing Go model package validation blockers so the
scoped provider tests can run normally with vet enabled.

## Why

The existing Python/provider catalog path includes PPIO, but the Go
model-provider layer did not have a PPIO driver, so the Go API server
could not instantiate or use PPIO as requested in #15089.
2026-05-22 11:52:18 +08:00
Haruko386
1ece1c81da Go: implement rerank, asr, tts for TogetherAI (#15107)
### What problem does this PR solve?

implement rerank, asr, tts for TogetherAI

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-21 20:57:04 +08:00
Haruko386
a725e114f9 Go: implement ASR and TTS for Xinference (#15096)
### What problem does this PR solve?

implement ASR and TTS for Xinference

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-21 18:28:06 +08:00
tmimmanuel
38a8bc3dab fix(upstage): extract reasoning delta from streaming responses (#14817)
### What problem does this PR solve?

`UpstageModel.ChatStreamlyWithSender` (in the driver merged via #14819)
only extracted `delta.content` from each SSE event. For the `solar-pro3`
reasoning family (and any future Upstage model that follows the same
wire shape), the chain-of-thought is streamed in a **separate
`delta.reasoning` field**, and the driver was silently dropping all of
it.

The non-streaming path already extracts `message.reasoning` into
`ChatResponse.ReasonContent` (added earlier in this PR's history), so
the same model produced **inconsistent behavior** between streaming and
non-streaming: a tenant calling `solar-pro3` with `reasoning_effort:
high` would see the reasoning trace if they used `ChatWithMessages` but
not if they used `ChatStreamlyWithSender`.

### Live evidence

Probed against `api.upstage.ai/v1/chat/completions` with `solar-pro3` +
`reasoning_effort: high` + `stream: true` (8000-token budget so the
reasoning has room to finish):

```
$ curl -sN -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \
       -X POST https://api.upstage.ai/v1/chat/completions \
       -d '{"model":"solar-pro3","messages":[{"role":"user","content":"Compute 15% of 80."}],
            "max_tokens":8000,"stream":true,"reasoning_effort":"high"}'

# across 168 SSE events:
#   delta keys seen: [content reasoning role]
#   delta.content total len:   121 chars   (the visible answer)
#   delta.reasoning total len: 159 chars   (the chain-of-thought) <- driver dropped this
```

A representative event showing both fields side by side:

```json
data: {"choices":[{"index":0,"delta":{"reasoning":"15% = 0.15."}}]}
data: {"choices":[{"index":0,"delta":{"content":"15% of 80 is "}}]}
```

The 159 chars of reasoning were arriving on the wire and being thrown
away. `solar-pro2` was also probed (625 events); it does **not** emit
`delta.reasoning` — its reasoning is inlined into `delta.content` — so
this change is a no-op for it and for `solar-mini`.

### What this PR includes

- `internal/entity/models/upstage.go`: in the SSE scanner loop, extract
`delta.reasoning` before `delta.content` and forward each non-empty
chunk via the sender's second arg (the existing `reasonContent` channel
the non-stream path already populates).

The ordering contract is documented inline: reasoning chunks within a
single SSE event are emitted before content chunks, so a UI that pipes
both sees the chain-of-thought start before the answer for that token,
matching the wire order Upstage emits.

- `internal/entity/models/upstage_test.go`: three new tests pinning the
new behavior:
- `TestUpstageStreamExtractsReasoningDelta` — reasoning + content
forwarded to the right sender args; one-of invariant per call
- `TestUpstageStreamReasoningChunksArriveBeforeContent` — ordering
pinned within a single SSE event that carries both fields
- `TestUpstageStreamWithoutReasoningStillWorks` — regression net:
non-reasoning models (`solar-mini`, `solar-pro2`) continue to work; the
reason callback never fires

No interface change. No factory change. No config change.

### How was this tested?

```
$ go test -vet=off -run TestUpstage -count=1 -v ./internal/entity/models/...
... (existing tests 1..9 still pass) ...
=== RUN   TestUpstageStreamExtractsReasoningDelta
--- PASS: TestUpstageStreamExtractsReasoningDelta (0.01s)
=== RUN   TestUpstageStreamReasoningChunksArriveBeforeContent
--- PASS: TestUpstageStreamReasoningChunksArriveBeforeContent (0.01s)
=== RUN   TestUpstageStreamWithoutReasoningStillWorks
--- PASS: TestUpstageStreamWithoutReasoningStillWorks (0.00s)
PASS
ok      ragflow/internal/entity/models  0.034s
```

12/12 Upstage tests pass on go 1.25. `go build
./internal/entity/models/...` exits 0.

**Live integration test** (smoke test not committed) — the patched
driver was run directly against `api.upstage.ai/v1` with the same prompt
that produced the curl evidence above:

```
=== RUN   TestUpstageStreamReasoningLiveSmoke
    [OK] visible content: 50 chunks, 84 chars
    [OK] reasoning:       39 chunks, 90 chars
    content head 200:   "\\(15\\% = \\frac{15}{100}=0.15\\).\n\n\\[\n0.15 \\times 80 = 12.\n\\]\n\n**15 % of 80 is 12.**"
    reasoning head 200: "We need to compute 15% of 80. That's 0.15 * 80 = 12. So answer is 12. Provide explanation."

UPSTAGE STREAM REASONING SMOKE PASSED
--- PASS: TestUpstageStreamReasoningLiveSmoke (1.97s)
```

Before this fix, the same call would have produced **0 reasoning
chunks**. The 90 chars of reasoning that the patched driver now surfaces
are the chain-of-thought solar-pro3 emits when reasoning_effort is high.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-21 15:33:21 +08:00
tmimmanuel
85d0b46d8e fix(mistral): handle structured content from magistral reasoning models (#14805)
### What problem does this PR solve?

`MistralModel.ChatWithMessages` (in the driver merged via #14807)
assumes that `choices[0].message.content` from `/v1/chat/completions` is
always a string and falls through to `return nil, fmt.Errorf("invalid
content format")` on anything else.

That assumption breaks for the **magistral reasoning family**
(`magistral-small-*`, `magistral-medium-*`). When the model needs a
chain-of-thought to answer, Mistral returns `content` as a **structured
array of typed parts**:

```json
"content": [
  {"type": "thinking",
   "thinking": [{"type": "text", "text": "Combined speed is 150 mph. 300 / 150 = 2 hours."}],
   "closed": true},
  {"type": "text", "text": "They will meet after **2 hours**."}
]
```

Concretely, this is what the live API returns today (probed against
`api.mistral.ai/v1`):

```
$ curl -H "Authorization: Bearer <key>" -H "Content-Type: application/json" \
       -X POST https://api.mistral.ai/v1/chat/completions \
       -d '{"model":"magistral-medium-latest",
            "messages":[{"role":"user","content":"two trains 60mph and 90mph, 300mi apart, when do they meet? step by step."}],
            "max_tokens":1024}'
HTTP 200
{ "choices":[{"message":{
    "role":"assistant",
    "content":[
      {"type":"thinking","thinking":[{"type":"text","text":"Okay, let's see..."}],"closed":true},
      {"type":"text","text":"To determine when the two trains meet..."}
    ]}}] }
```

With the current driver, every call like that returns the generic
`"invalid content format"` error. Trivial prompts that happen to fit in
a string answer still succeed, so the breakage is **non-deterministic
from the tenant's POV**: same model, same provider, sometimes works,
sometimes 500s with no useful error.

A secondary issue: `conf/models/mistral.json` does not include any
magistral model. The picker hid the broken path, which is why this
wasn't caught during #14807's review.

### What this PR includes

- New helper `extractMistralContent(raw interface{}) (answer,
reasonContent string, err error)` in
`internal/entity/models/mistral.go`, which normalizes both shapes
Mistral can return:
- `string` → historical path. `Answer = content`, `ReasonContent = ""`.
Preserves behavior for every non-reasoning model (`mistral-large-*`,
`mistral-small-*`, `ministral-*`, `codestral-*`, `pixtral-*`,
`open-mistral-nemo`).
- `[]interface{}` → walk the parts. Concatenate every `{"type":"text",
"text":...}` part into `Answer`; concatenate the inner text inside every
`{"type":"thinking", "thinking":[...]}` part into `ReasonContent`.
- `ChatWithMessages` now calls the helper instead of doing the raw
`.(string)` cast.
- Unknown part types are **skipped, not failed**. Mistral has been
adding new content variants quickly (audio chunks, citations, etc.);
this driver should not 500 every call when a new part type appears.
- `conf/models/mistral.json`: add `magistral-medium-latest` and
`magistral-small-latest`. Both are visible in `/v1/models` today.

No interface change. No factory change. No new dependencies.

### How was this tested?

**Unit tests** — 5 new tests in `internal/entity/models/mistral_test.go`
on top of the 27 already shipped via #14807:

- `TestMistralChatHandlesStringContent` — regression net for the
historical path
- `TestMistralChatExtractsReasoningFromStructuredContent` — the fixture
body is a trimmed copy of the actual `magistral-medium-latest` response
captured above; asserts both `Answer` and `ReasonContent` are populated
correctly
- `TestMistralChatHandlesStructuredContentWithoutThinking` —
`magistral-*` with a trivial answer returns a structured shape that has
only a `text` part; `ReasonContent` must stay empty
- `TestMistralChatIgnoresUnknownContentPartTypes` — `audio_url` and
`future_part_type` parts are skipped, `text` parts still flow through
- `TestExtractMistralContent` — table-driven unit coverage of the helper
for string, empty string, nil, empty array, text-only, thinking+text,
unsupported root type

```
$ go test -vet=off -run "TestMistral|TestExtractMistralContent" -count=1 -v ./internal/entity/models/...
=== RUN   TestMistralChatHandlesStringContent
--- PASS: TestMistralChatHandlesStringContent (0.00s)
=== RUN   TestMistralChatExtractsReasoningFromStructuredContent
--- PASS: TestMistralChatExtractsReasoningFromStructuredContent (0.00s)
=== RUN   TestMistralChatHandlesStructuredContentWithoutThinking
--- PASS: TestMistralChatHandlesStructuredContentWithoutThinking (0.00s)
=== RUN   TestMistralChatIgnoresUnknownContentPartTypes
--- PASS: TestMistralChatIgnoresUnknownContentPartTypes (0.00s)
=== RUN   TestExtractMistralContent
=== RUN   TestExtractMistralContent/plain_string
=== RUN   TestExtractMistralContent/empty_string
=== RUN   TestExtractMistralContent/nil
=== RUN   TestExtractMistralContent/empty_array
=== RUN   TestExtractMistralContent/text_only
=== RUN   TestExtractMistralContent/thinking_then_text
=== RUN   TestExtractMistralContent/unknown_root_type
--- PASS: TestExtractMistralContent (0.00s)
PASS
ok      ragflow/internal/entity/models  0.046s
```

All 32 Mistral tests pass on go 1.25. `go build
./internal/entity/models/...` exits 0.

**Live integration test** — driver exercised against `api.mistral.ai/v1`
with the patched code:

```
=== RUN   TestMistralMagistralSmoke
    [OK] "magistral-small-latest" present upstream
    [OK] "magistral-medium-latest" present upstream
    [OK trivial]    Answer="7"  ReasonContent=""
    [OK reasoning]  Answer len=797   head="To determine when the two trains meet, we can follow these steps:\n\n1. **Identify..."
                    ReasonContent len=1069 head="Okay, let's see. There are two trains, one going 60 mph and the other going 90 mph. They're moving towards each other, s..."

MAGISTRAL SMOKE PASSED
--- PASS: TestMistralMagistralSmoke (18.09s)
PASS
ok      ragflow/internal/entity/models  18.112s
```

What the live run proves on the wire:

- `magistral-small-latest` with a trivial prompt still uses the
string-content shape; the regression-net path is exercised against the
real server, not just the mock.
- `magistral-medium-latest` with a reasoning prompt uses the
structured-array shape; the new code path extracts a 1069-character
reasoning trace into `ChatResponse.ReasonContent` and a 797-character
visible answer into `ChatResponse.Answer`. Before this fix, the same
call returned `"invalid content format"` and the caller saw nothing.

The smoke-test file itself is not committed (live tests live outside the
PR diff, same convention used for prior provider PRs).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-21 15:33:14 +08:00
sapienza yoan
9d37234953 build(go): make bash build.sh work on macOS arm64 (Homebrew) (#15009)
## Problem

The Go server build pipeline (`build.sh` + CMake + CGO bindings) was
tested on Ubuntu only. On macOS arm64 with Homebrew it fails in five
orthogonal places. None of these require platform-specific code paths —
the same source builds on both Linux and Darwin after these fixes.

## Reproduction (before)

```
$ uname -a
Darwin … 25.4.0 arm64
$ brew install cmake pcre2 simde
$ bash build.sh
…
error: 'simde/x86/sse4.1.h' file not found
error: implicit instantiation of undefined template 'std::basic_istringstream<char>'
error: no matching function for call to 'Join'
…
clang: error: no such file or directory: '/usr/local/lib/libpcre2-8.a'
```

## Fix (5 small, orthogonal changes)

### 1. `internal/cpp/CMakeLists.txt` — find Homebrew + libpcre2-8
portably

- Detect Apple platforms via `if(APPLE)`, call `brew --prefix` once, add
`${HOMEBREW_PREFIX}/include` and `${HOMEBREW_PREFIX}/lib`. No effect on
Linux.
- Replace the literal `libpcre2-8.a` link token (which only the Linux
linker finds in `/usr/local/lib` by default) with
`find_library(PCRE2_LIB NAMES pcre2-8 REQUIRED)`. Works on
`/usr/lib/x86_64-linux-gnu` (Debian/Ubuntu), `/usr/local/lib` (Intel Mac
& legacy Linux), `/opt/homebrew/lib` (Apple Silicon).

### 2. `internal/cpp/wordnet_lemmatizer.cpp` +
`internal/cpp/rag_analyzer.cpp` — explicit `#include <sstream>`

libstdc++ (Linux) pulls `<sstream>` in transitively via `<fstream>`;
libc++ (Apple Clang) doesn't, so the existing `std::istringstream` /
`std::ostringstream` uses fail to compile on macOS. One-line include in
each file.

### 3. `internal/cpp/rag_analyzer.cpp` — `Join` template overload fix

`Join(tokens, start, tokens.size(), delim)` at line 146 passes `size_t`
to an `int` parameter. C++23 strict mode in Apple Clang refuses the
implicit narrowing and reports the 4-arg overload as a substitution
failure, leaving the call ambiguous between the 3-arg and 4-arg
templates. Fix: explicit `static_cast<int>(tokens.size())`. Behaviour
identical on libstdc++ — the narrowing was always intentional.

### 4. `internal/binding/rag_analyzer.go` — split darwin CGO LDFLAGS

The existing `#cgo darwin LDFLAGS: ... /usr/local/lib/libpcre2-8.a` only
matches Intel Macs. Apple Silicon Homebrew installs to `/opt/homebrew`.
Split into `darwin,arm64` and `darwin,amd64` build constraints with the
right absolute path on each.

### 5. `build.sh` — accept Homebrew path in the pcre2 sanity check

The sanity check looked at two Linux paths only and then fell through to
`sudo apt -y install libpcre2-dev` on failure. Added
`/opt/homebrew/lib/libpcre2-8.a`, and on Darwin failure now exits
cleanly with the right `brew install pcre2` hint instead of trying
`apt`.

## Verified

- `bash build.sh` now completes on macOS arm64 (Apple Silicon, brew 4.x,
cmake 4.x, Apple Clang 17, Go 1.25, pcre2 10.x, simde 0.8.x).
- Produced binaries: `bin/server_main`, `bin/admin_server`,
`bin/ragflow_cli`.
- `bin/server_main` boots, connects MySQL, runs migrations, loads the 64
model provider configs cleanly.
- Still builds on Linux — the CMake additions are inside an `if(APPLE)`
guard, the `find_library` call matches Linux paths too, the build.sh
check still tries `apt` when not on Darwin.

## Out of scope

The Go server itself currently fails at runtime when not pointing at
Elasticsearch (`Failed to initialize doc engine: failed to ping
Elasticsearch`), but that's the placeholder Infinity engine documented
in `internal/engine/README.md` — unrelated to this build patchset.

---

Happy to split this into smaller PRs if you'd prefer (one per file). The
five changes are independent.
2026-05-21 15:33:09 +08:00
BitToby
bd4ce39038 Go: implement provider: Perplexity (#15008)
## What
- Add Perplexity as a chat and embedding provider backed by its
OpenAI-compatible `/chat/completions` and `/v1/embeddings` APIs
- Register Perplexity in the Go model factory and provider config
- Support non-streaming chat, SSE streaming chat, embeddings, model
listing, and connection checks

Refs #14736

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:33:02 +08:00
dripsmvcp
d5ba14a128 feat(go): implement provider Astraflow (#15062) (#15064)
- Adds an `Astraflow` Go driver so the new API server can route
Astraflow (UCloud ModelVerse) chat instances, matching the existing
Python `AstraflowChat` (`rag/llm/chat_model.py:1237`). Follows the same
SaaS-driver shape used for Avian, Novita, TogetherAI, Replicate,
DeepInfra, Upstage, and LongCat.

Closes #15062

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:32:56 +08:00
dripsmvcp
5a18df0fd0 Go: implement provider: Avian (#15045)
Closes #15044.

Avian was listed unchecked in the Go-rewrite tracker #14736 and already
had an llm_factories.json entry with 4 preconfigured chat models
(deepseek-v3.2, kimi-k2.5, glm-5, minimax-m2.5), but the Go API server
had no driver to route them. The Python side has supported Avian at
rag/llm/chat_model.py:1220 (AvianChat) via the LiteLLM openai/ provider
with default base https://api.avian.io/v1.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:32:49 +08:00
sxxtony
7740ec6c95 Go: implement Embed (embeddings) in Replicate driver (#15073)
### What problem does this PR solve?

`ReplicateModel.Embed` in `internal/entity/models/replicate.go` was a
`"replicate, no such method"` stub. Tracking issue #14736 lists
Replicate's embedding surface as not implemented. This PR wires it up
against Replicate's documented embedding schema.

Until this PR, a tenant who selected a Replicate embedding model got the
sentinel error on every embed call.

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 15:32:41 +08:00
web-dev0521
2d3a1a4483 feat(go-models): add Azure OpenAI model driver (#15022)
## What problem does this PR solve?

Closes #15021.

The Go model-provider layer had no support for **Azure OpenAI**. Azure
OpenAI is *not* a drop-in base-URL swap of the OpenAI driver — it
differs in authentication, endpoint structure, and how models are listed
— so it needs its own `ModelDriver` implementation.

## Type of change

- [x] New feature (non-breaking change which adds functionality)

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 11:52:56 +08:00
Renzo
c7ac9b7171 Go: implement provider: GPUStack (chat) (#15024)
### What problem does this PR solve?

Fixes #15023

GPUStack is listed as unchecked in the Go-rewrite tracker #14736, and
`internal/service/llm.go:171` already classifies it as a self-deployed
provider alongside Ollama, Xinference, LocalAI, and LM Studio — but
`internal/entity/models/` had no `gpustack.go` driver, so the new Go API
server could not route GPUStack instances. This PR adds the chat surface
for GPUStack so it lines up with the existing self-hosted Go drivers.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 11:49:18 +08:00
Renzo
394cd5d116 Go: implement Embed in Xinference driver (#14932)
## Summary

- Replaces the `"no such method"` stub on `XinferenceModel.Embed`
(`internal/entity/models/xinference.go`) with a real implementation
against Xinference's OpenAI-compatible `/v1/embeddings` endpoint.
- Adds the `"embedding": "v1/embeddings"` URL suffix to
`conf/models/xinference.json`.
- Mirrors the Python `XinferenceEmbed` class in
`rag/llm/embedding_model.py:407` for payload shape (OpenAI-compatible
`model + input` → `data[*].index + data[*].embedding`) and tolerates the
same no-auth default Xinference deployments use. Authorization is only
sent when a non-empty API key is configured, via the existing
`setXinferenceAuth` helper.
- Reuses the existing `normalizeXinferenceBaseURL` + `baseURLForRegion`
helpers so both `http://127.0.0.1:9997` and `http://127.0.0.1:9997/v1`
resolve to the same `/v1/embeddings` target without doubled `/v1`.
- Validates response indices — duplicate, missing, or out-of-range
`data[*].index` values fail with a clear error rather than silently
producing misaligned vectors.
- Returns `[]EmbeddingData` in original input order (placed by `Index`)
so downstream callers can index positionally without re-sorting.
- Forwards `EmbeddingConfig.Dimension` as `dimensions` when `> 0`,
matching the OpenAI cluster pattern.

Closes #14810

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-21 11:47:30 +08:00