### What problem does this PR solve?
The Go model-driver layer () has ~38,700 lines across 109 files. Roughly
74% of that is boilerplate duplicated into every driver: identical HTTP
client setup, the same 65-line SSE scanner loop, and 10-11 one-line "not
supported" stub methods per driver. Any fix must be manually propagated
to every file. Closes#15820.
This PR establishes the three shared utility files that form the
foundation for incremental driver migration:
---
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Co-authored-by: Haruko386 <tryeverypossible@163.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
- Replace embedding model `dimension` metadata with `max_dimension`.
- Add optional `dimensions` metadata for models with fixed selectable
output dimensions.
- Include `max_dimension` and `dimensions` in model list responses.
- Validate requested embedding dimensions before calling provider
embedding APIs.
- Forward SiliconFlow embedding dimensions with the correct `dimensions`
request field.
- Add unit coverage for embedding dimension validation rules.
### What problem does this PR solve?
`ReplicateModel.Rerank` in `internal/entity/models/replicate.go` was a
`"replicate, no such method"` stub. The chat path landed in #14958 and
the embed path in #15073; rerank is the last major retrieval surface
still missing on this provider.
Until this PR, a tenant who selected a Replicate reranker model got the
sentinel error on every rerank call.
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
`ReplicateModel.Embed` in `internal/entity/models/replicate.go` was a
`"replicate, no such method"` stub. Tracking issue #14736 lists
Replicate's embedding surface as not implemented. This PR wires it up
against Replicate's documented embedding schema.
Until this PR, a tenant who selected a Replicate embedding model got the
sentinel error on every embed call.
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
## What
- Add Replicate as a chat provider backed by the documented predictions
API
- Register Replicate in the Go model factory and provider config
- Support non-streaming chat through sync predictions, polling fallback,
streaming through `urls.stream`, model listing, and connection checks
## Notes
- Uses `POST /v1/predictions` with Replicate model identifiers in
`version`, which supports official and community model identifiers
- Maps RAGFlow messages into Replicate prompt-shaped inputs (`prompt`,
optional `system_prompt`) and forwards common documented LLM inputs:
`max_new_tokens`, `temperature`, `top_p`
- Preserves whitespace in SSE output chunks and emits RAGFlow `[DONE]`
at stream completion
## Tests
- `go test -vet=off -run TestReplicate -count=1
./internal/entity/models`
- `go test -vet=off -count=1 ./internal/entity/models`
Refs #14736