### What problem does this PR solve?

Fixes #14816

The Xinference Go driver landed chat in #14938, and Embed is in review in #14932, but `Rerank` shipped as a stub that returns `"xinference, no such method"`. Tenants who launch a rerank model with `--model-type rerank` on their Xinference instance cannot route it through the Go API server. This PR fills the gap.

Xinference exposes an OpenAI-compatible REST API. The rerank endpoint is `POST <base>/v1/rerank`; it accepts the Cohere-shaped body `{model, query, documents, top_n}` and returns `{results: [{index, relevance_score}]}`. This is the same wire shape used by the merged NVIDIA (#14778), Aliyun (#14676), Gitee (#14656), ZhipuAI (#14608), Novita (#15014), and LocalAI (#14813) Rerank implementations.

The endpoint is documented in the [Xinference rerank docs](https://inference.readthedocs.io/en/v1.6.1/models/model_abilities/rerank.html); the [builtin rerank model catalog](https://inference.readthedocs.io/en/stable/models/builtin/rerank/) lists `bge-reranker-base`, `bge-reranker-large`, `bge-reranker-v2-m3`, and others.
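
For reviewers who want the wire shape in front of them, the request and response described above can be sketched in Go roughly as follows. This is a minimal illustration, not the code in this PR: the type names, the `rerank` and `parseRerankResponse` helpers, and the choice to omit `top_n` when it is zero are all assumptions made for the example. Only the endpoint path and the JSON field names come from the description above.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// rerankRequest mirrors the Cohere-shaped body {model, query, documents, top_n}.
// Omitting top_n when it is zero is an assumption of this sketch.
type rerankRequest struct {
	Model     string   `json:"model"`
	Query     string   `json:"query"`
	Documents []string `json:"documents"`
	TopN      int      `json:"top_n,omitempty"`
}

// rerankResult is one entry of {results: [{index, relevance_score}]}.
type rerankResult struct {
	Index          int     `json:"index"`
	RelevanceScore float64 `json:"relevance_score"`
}

type rerankResponse struct {
	Results []rerankResult `json:"results"`
}

// parseRerankResponse (hypothetical helper) decodes a response body and returns
// scores aligned with the input document order. Documents that the server did
// not return (for example because of top_n) keep a score of 0.
func parseRerankResponse(body []byte, numDocs int) ([]float64, error) {
	var resp rerankResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode rerank response: %w", err)
	}
	scores := make([]float64, numDocs)
	for _, r := range resp.Results {
		if r.Index < 0 || r.Index >= numDocs {
			return nil, fmt.Errorf("rerank index %d out of range [0,%d)", r.Index, numDocs)
		}
		scores[r.Index] = r.RelevanceScore
	}
	return scores, nil
}

// rerank (hypothetical helper) posts the request to <base>/v1/rerank.
func rerank(ctx context.Context, client *http.Client, baseURL, model, query string, docs []string, topN int) ([]float64, error) {
	payload, err := json.Marshal(rerankRequest{Model: model, Query: query, Documents: docs, TopN: topN})
	if err != nil {
		return nil, fmt.Errorf("encode rerank request: %w", err)
	}
	url := strings.TrimRight(baseURL, "/") + "/v1/rerank"
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	res, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return nil, err
	}
	if res.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("rerank: unexpected status %d: %s", res.StatusCode, body)
	}
	return parseRerankResponse(body, len(docs))
}

func main() {
	// Parse a canned response so the example runs without a live Xinference server.
	sample := []byte(`{"results":[{"index":1,"relevance_score":0.9},{"index":0,"relevance_score":0.1}]}`)
	scores, err := parseRerankResponse(sample, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(scores) // prints [0.1 0.9]
}
```

Mapping results back through `index` rather than relying on response order keeps the scores aligned with the caller's documents even when the server returns results sorted by relevance.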