ragflow/conf/llm_factories.json


{
"factory_llm_infos": [
{
"name": "OpenAI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION",
"status": "1",
"rank": "999",
"url": "https://api.openai.com/v1",
"llm": [
{
"llm_name": "gpt-5.5",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4-mini",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4-nano",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.2-pro",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.2",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.1",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.1-chat-latest",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-mini",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-nano",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-chat-latest",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gpt-4.1",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-mini",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-nano",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.5-preview",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o3",
"tags": "LLM,CHAT,200K,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o4-mini",
"tags": "LLM,CHAT,200K,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o4-mini-high",
"tags": "LLM,CHAT,200K,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128K,IMAGE2TEXT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o",
"tags": "LLM,CHAT,128K,IMAGE2TEXT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-3.5-turbo",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gpt-3.5-turbo-16k-0613",
"tags": "LLM,CHAT,16k",
"max_tokens": 16385,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "text-embedding-ada-002",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-small",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "whisper-1",
"tags": "SPEECH2TEXT",
"max_tokens": 26214400,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "gpt-4",
"tags": "LLM,CHAT,8K",
"max_tokens": 8191,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gpt-4-turbo",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4-32k",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "tts-1",
"tags": "TTS",
"max_tokens": 2048,
"model_type": "tts",
"is_tools": false
}
]
},
{
"name": "xAI",
"logo": "",
"tags": "LLM",
"status": "1",
"rank": "992",
"llm": [
{
"llm_name": "grok-4",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3",
"tags": "LLM,CHAT,130k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-fast",
"tags": "LLM,CHAT,130k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-mini",
"tags": "LLM,CHAT,130k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-mini-fast",
"tags": "LLM,CHAT,130k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-2-vision",
"tags": "LLM,IMAGE2TEXT,32k",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
}
]
},
{
"name": "TokenPony",
"logo": "",
"tags": "LLM",
"status": "1",
"url": "https://ragflow.vip-api.tokenpony.cn/v1",
"llm": [
{
"llm_name": "qwen3-8b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v3-0324",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-32b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2-instruct-0905",
"tags": "LLM,CHAT,256K",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-r1-0528",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-coder-480b",
"tags": "LLM,CHAT,1024k",
"max_tokens": 1024000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "hunyuan-a13b-instruct",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-next-80b-a3b-instruct",
"tags": "LLM,CHAT,1024k",
"max_tokens": 1024000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v3.2-exp",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v3.1-terminus",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-vl-235b-a22b-instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-vl-30b-a3b-instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ocr",
"tags": "LLM,8k",
"max_tokens": 8000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-235b-a22b-instruct-2507",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.6",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "minimax-m2",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "Tongyi-Qianwen",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,TTS,SPEECH2TEXT,MODERATION",
"status": "1",
"rank": "994",
"url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"llm": [
{
"llm_name": "qwen3-max-2026-01-23",
"max_tokens": 262144,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-instruct-flash",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "kimi-k2.7-code",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-asr-flash-realtime-2025-10-27",
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "qwen-mt-flash",
"max_tokens": 8192,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-vl-flash-2025-10-15",
"max_tokens": 32768,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3.5-35b-a3b",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-mt-plus",
"max_tokens": 16384,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen-mt-turbo",
"max_tokens": 8192,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "glm-5.2",
"max_tokens": 1048576,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3.7-max-preview",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-vl-flash-2026-01-22",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-coder-next",
"max_tokens": 262144,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "deepseek-v3.1",
"max_tokens": 163840,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-instruct-flash-realtime",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3-omni-flash-2025-09-15",
"max_tokens": 49152,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qvq-max",
"max_tokens": 128000,
"model_type": [
"chat",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-vc-2026-01-22",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "deepseek-r1-distill-qwen-7b",
"max_tokens": 131072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-livetranslate-flash-2025-12-01",
"max_tokens": 49152,
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "kimi-k2-thinking",
"max_tokens": 262144,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen-math-plus",
"max_tokens": 3072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen-plus-2025-12-01",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-vd-realtime-2026-01-15",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3.6-flash-2026-04-16",
"max_tokens": 1000000,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-max-preview",
"max_tokens": 262144,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "deepseek-v4-flash",
"max_tokens": 1048576,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen-math-plus-latest",
"max_tokens": 3072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qvq-plus",
"max_tokens": 128000,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-flash-2025-11-27",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3-vl-flash",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-instruct-flash-realtime-2026-01-22",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3.7-max",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen-math-turbo",
"max_tokens": 3072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "ZHIPU/GLM-5.1",
"max_tokens": 202752,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-flash-2025-09-18",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen-vl-ocr-2025-11-20",
"max_tokens": 30720,
"model_type": [
"ocr",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.5-397b-a17b",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.6-plus-2026-04-02",
"max_tokens": 1000000,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3.6-35b-a3b",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "kimi/kimi-k2.5",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-flash-realtime-2025-11-27",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3-coder-plus-2025-07-22",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-vl-plus-2025-12-19",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-omni-flash-realtime-2025-09-15",
"max_tokens": 65536,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-omni-flash",
"max_tokens": 65536,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-livetranslate-flash-realtime",
"max_tokens": 49152,
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "qwen-vl-ocr-latest",
"max_tokens": 30720,
"model_type": [
"ocr",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.5-omni-flash-2026-03-15",
"max_tokens": 262144,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-asr-flash-realtime",
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.7-plus-2026-05-26",
"max_tokens": 1000000,
"model_type": [
"chat",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-vc-realtime-2025-11-27",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen-plus-2025-09-11",
"max_tokens": 1048576,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-coder-plus",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3.7-max-2026-05-20",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-max-2025-09-23",
"max_tokens": 262144,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-instruct-flash-2026-01-26",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-vd-realtime-2025-12-16",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen-mt-lite",
"max_tokens": 8192,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-omni-flash-realtime",
"max_tokens": 65536,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-vd-2026-01-26",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "kimi/kimi-k2.7-code",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-plus-2025-01-25",
"max_tokens": 131072,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3.5-27b",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-coder-turbo",
"max_tokens": 131072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-omni-flash-2025-12-01",
"max_tokens": 65536,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.6-max-preview",
"max_tokens": 262144,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-omni-flash-realtime-2025-12-01",
"max_tokens": 65536,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "kimi-k2.5",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-coder-plus",
"max_tokens": 131072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-vl-plus-2025-09-23",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3.7-plus",
"max_tokens": 1000000,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-flash-character",
"max_tokens": 8192,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3.5-omni-plus",
"max_tokens": 262144,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-coder-flash",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "kimi/kimi-k2.6",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-math-plus-0919",
"max_tokens": 3072,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3.6-flash",
"max_tokens": 1000000,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3.6-plus",
"max_tokens": 1000000,
"model_type": [
"chat",
"ocr",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-flash-realtime-2025-09-18",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3.5-omni-plus-2026-03-15",
"max_tokens": 262144,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-flash-realtime",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen-omni-turbo",
"max_tokens": 32768,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-livetranslate-flash-realtime-2025-09-22",
"max_tokens": 49152,
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-tts-flash",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "deepseek-v4-pro",
"max_tokens": 1048576,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3.6-27b",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-asr-flash-2026-02-10",
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "qwen3-asr-flash-realtime-2026-02-10",
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "ZHIPU/GLM-5",
"max_tokens": 202752,
"model_type": [
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen-tts-2025-05-22",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3-coder-plus-2025-09-23",
"max_tokens": 1000000,
"model_type": [
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen3-livetranslate-flash",
"max_tokens": 49152,
"model_type": [
"speech2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.7-max-2026-06-08",
"max_tokens": 1000000,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "kimi-k2.6",
"max_tokens": 262144,
"model_type": [
"chat",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen3-tts-vc-realtime-2026-01-15",
"model_type": [
"tts"
],
"is_tools": false
},
{
"llm_name": "qwen3.5-omni-flash",
"max_tokens": 262144,
"model_type": [
"chat",
"speech2text",
"image2text"
],
"is_tools": true
},
{
"llm_name": "qwen-vl-ocr",
"max_tokens": 30720,
"model_type": [
"ocr",
"image2text"
],
"is_tools": false
},
{
"llm_name": "qwen3.5-122b-a10b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v3.2",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-r1",
"tags": "LLM,CHAT,64K",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-v3",
"tags": "LLM,CHAT,64K",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-r1-distill-qwen-1.5b",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-r1-distill-qwen-14b",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-r1-distill-qwen-32b",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-r1-distill-llama-8b",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-r1-distill-llama-70b",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwq-plus",
"tags": "LLM,CHAT,132k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-plus-2025-07-28",
"tags": "LLM,CHAT,132k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-plus-2025-07-14",
"tags": "LLM,CHAT,132k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-flash-2025-07-28",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3.5-plus",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3.5-plus-2026-02-15",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3.5-flash",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3.5-flash-2026-02-23",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-max",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-coder-480b-a35b-instruct",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-30b-a3b-instruct-2507",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-30b-a3b-thinking-2507",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-30b-a3b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-vl-plus",
"tags": "LLM,IMAGE2TEXT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-vl-235b-a22b-instruct",
"tags": "LLM,IMAGE2TEXT,128k",
"max_tokens": 128000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-vl-235b-a22b-thinking",
"tags": "LLM,IMAGE2TEXT,128k",
"max_tokens": 128000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "qwen3-235b-a22b-instruct-2507",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-235b-a22b-thinking-2507",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-235b-a22b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-next-80b-a3b-instruct",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-next-80b-a3b-thinking",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-8b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-14b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-32b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-long",
"tags": "LLM,CHAT,10000K",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-turbo",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-max",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-plus",
"tags": "LLM,CHAT,132k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-plus-2025-04-28",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen-plus-latest",
"tags": "LLM,CHAT,132k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "text-embedding-v2",
"tags": "TEXT EMBEDDING,2K",
"max_tokens": 2048,
"model_type": "embedding"
},
{
"llm_name": "sambert-zhide-v1",
"tags": "TTS",
"max_tokens": 2048,
"model_type": "tts"
},
{
"llm_name": "sambert-zhiru-v1",
"tags": "TTS",
"max_tokens": 2048,
"model_type": "tts"
},
{
"llm_name": "text-embedding-v3",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "text-embedding-v4",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "qwen-vl-max",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 765,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "qwen-vl-plus",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 765,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "gte-rerank",
"tags": "RE-RANK,4k",
"max_tokens": 4000,
"model_type": "rerank"
},
{
"llm_name": "qwen3-asr-flash",
"tags": "SPEECH2TEXT,8k",
"max_tokens": 8000,
"model_type": "speech2text"
},
{
"llm_name": "qwen3-asr-flash-2025-09-08",
"tags": "SPEECH2TEXT,8k",
"max_tokens": 8000,
"model_type": "speech2text"
},
{
"llm_name": "gte-rerank-v2",
"tags": "RE-RANK,4k",
"max_tokens": 4000,
"model_type": "rerank"
},
{
"llm_name": "qwen3-rerank",
"tags": "RE-RANK,4k",
"max_tokens": 4000,
"model_type": "rerank"
}
]
},
{
"name": "ZHIPU-AI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"rank": "993",
"url": "https://open.bigmodel.cn/api/paas/v4",
"llm": [
{
"llm_name": "glm-4.7",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-x",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-air",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-airx",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-flash",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5v",
"tags": "LLM,IMAGE2TEXT,64k",
"max_tokens": 64000,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "glm-4-plus",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-0520",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-airx",
"tags": "LLM,CHAT",
"max_tokens": 8000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-air",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-flash",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-flashx",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-long",
"tags": "LLM,CHAT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-3-turbo",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4v",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 2000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "glm-4-9b",
"tags": "LLM,CHAT",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "embedding-2",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "embedding-3",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "glm-asr",
"tags": "SPEECH2TEXT",
"max_tokens": 4096,
"model_type": "speech2text"
}
]
},
{
"name": "Ollama",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"rank": "988",
"llm": []
},
{
"name": "ModelScope",
"logo": "",
"tags": "LLM",
"status": "1",
"llm": []
},
{
"name": "LocalAI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": []
},
{
"name": "OpenAI-API-Compatible",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": [],
"rank": "985"
},
{
"name": "VLLM",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": []
},
{
"name": "Moonshot",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"rank": "995",
"url": "https://api.moonshot.cn/v1",
"llm": [
{
"llm_name": "kimi-thinking-preview",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2-0711-preview",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2-0905-preview",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2-thinking",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2-thinking-turbo",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2-turbo-preview",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2.5",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-latest",
"tags": "LLM,CHAT,8k,32k,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshot-v1-8k",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshot-v1-32k",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshot-v1-128k",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshot-v1-8k-vision-preview",
"tags": "LLM,IMAGE2TEXT,8k",
"max_tokens": 8192,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "moonshot-v1-32k-vision-preview",
"tags": "LLM,IMAGE2TEXT,32k",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "moonshot-v1-128k-vision-preview",
"tags": "LLM,IMAGE2TEXT,128k",
"max_tokens": 131072,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "moonshot-v1-auto",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "FastEmbed",
"logo": "",
"tags": "TEXT EMBEDDING",
"status": "1",
"llm": []
},
{
"name": "Xinference",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TTS,SPEECH2TEXT,MODERATION,TEXT RE-RANK",
"status": "1",
"llm": []
},
{
"name": "DeepSeek",
"logo": "",
"tags": "LLM",
"status": "1",
"rank": "996",
"url": "https://api.deepseek.com/v1",
"llm": [
{
"llm_name": "deepseek-v4-flash",
"tags": "LLM,CHAT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v4-pro",
"tags": "LLM,CHAT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "VolcEngine",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"url": "https://ark.cn-beijing.volces.com/api/v3",
"llm": []
},
{
"name": "BaiChuan",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"llm": [
{
"llm_name": "Baichuan2-Turbo",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Baichuan2-Turbo-192k",
"tags": "LLM,CHAT,192K",
"max_tokens": 196608,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Baichuan3-Turbo",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Baichuan3-Turbo-128k",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Baichuan4",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Baichuan-Text-Embedding",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
}
]
},
{
"name": "Jina",
"logo": "",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"status": "1",
"llm": [
{
"llm_name": "jina-reranker-v1-base-en",
"tags": "RE-RANK,8k",
"max_tokens": 8196,
"model_type": "rerank"
},
{
"llm_name": "jina-reranker-v1-turbo-en",
"tags": "RE-RANK,8k",
"max_tokens": 8196,
"model_type": "rerank"
},
{
"llm_name": "jina-reranker-v1-tiny-en",
"tags": "RE-RANK,8k",
"max_tokens": 8196,
"model_type": "rerank"
},
{
"llm_name": "jina-colbert-v1-en",
"tags": "RE-RANK,8k",
"max_tokens": 8196,
"model_type": "rerank"
},
{
"llm_name": "jina-embeddings-v2-base-en",
"tags": "TEXT EMBEDDING",
"max_tokens": 8196,
"model_type": "embedding"
},
{
"llm_name": "jina-embeddings-v2-base-de",
"tags": "TEXT EMBEDDING",
"max_tokens": 8196,
"model_type": "embedding"
},
{
"llm_name": "jina-embeddings-v2-base-es",
"tags": "TEXT EMBEDDING",
"max_tokens": 8196,
"model_type": "embedding"
},
{
"llm_name": "jina-embeddings-v2-base-code",
"tags": "TEXT EMBEDDING",
"max_tokens": 8196,
"model_type": "embedding"
},
{
"llm_name": "jina-embeddings-v2-base-zh",
"tags": "TEXT EMBEDDING",
"max_tokens": 8196,
"model_type": "embedding"
},
{
"llm_name": "jina-reranker-v2-base-multilingual",
"tags": "RE-RANK,8k",
"max_tokens": 8196,
"model_type": "rerank"
},
{
"llm_name": "jina-embeddings-v3",
"tags": "TEXT EMBEDDING",
"max_tokens": 8196,
"model_type": "embedding"
},
{
"llm_name": "jina-embeddings-v4",
"tags": "TEXT EMBEDDING",
"max_tokens": 32768,
"model_type": "embedding"
}
]
},
{
"name": "Builtin",
"logo": "",
"tags": "TEXT EMBEDDING",
"status": "1",
"llm": [
{
"llm_name": "BAAI/bge-small-en-v1.5",
"tags": "TEXT EMBEDDING,512",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-m3",
"tags": "TEXT EMBEDDING,8k",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "Qwen/Qwen3-Embedding-0.6B",
"tags": "TEXT EMBEDDING,32k",
"max_tokens": 32768,
"model_type": "embedding"
}
]
},
{
"name": "MiniMax",
"logo": "",
"tags": "LLM",
"status": "1",
"rank": "987",
"url": "https://api.minimaxi.com/v1",
"llm": [
{
"llm_name": "MiniMax-M3",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.7",
"tags": "LLM,CHAT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.7-highspeed",
"tags": "LLM,CHAT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.5",
"tags": "LLM,CHAT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.5-highspeed",
"tags": "LLM,CHAT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.1",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat"
},
{
"llm_name": "MiniMax-M2",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat"
}
]
},
{
"name": "Mistral",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,MODERATION",
"status": "1",
"url": "https://api.mistral.ai/v1",
"llm": [
{
"llm_name": "codestral-latest",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistral-large-latest",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistral-saba-latest",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "pixtral-large-latest",
"tags": "LLM,IMAGE2TEXT,131k",
"max_tokens": 131000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "ministral-3b-latest",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "ministral-8b-latest",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistral-embed",
"tags": "TEXT EMBEDDING,8k",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "mistral-moderation-latest",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "mistral-small-latest",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "pixtral-12b-2409",
"tags": "LLM,IMAGE2TEXT,131k",
"max_tokens": 131000,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "mistral-ocr-latest",
"tags": "LLM,IMAGE2TEXT,131k",
"max_tokens": 131000,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "open-mistral-nemo",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "open-codestral-mamba",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat"
}
]
},
{
"name": "Azure-OpenAI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": [
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,128K",
"max_tokens": 128000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gpt-4o",
"tags": "LLM,128K",
"max_tokens": 128000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gpt-3.5-turbo",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gpt-3.5-turbo-16k",
"tags": "LLM,CHAT,16k",
"max_tokens": 16385,
"model_type": "chat"
},
{
"llm_name": "text-embedding-ada-002",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding"
},
{
"llm_name": "text-embedding-3-small",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding"
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding"
},
{
"llm_name": "whisper-1",
"tags": "SPEECH2TEXT",
"max_tokens": 26214400,
"model_type": "speech2text"
},
{
"llm_name": "gpt-4",
"tags": "LLM,CHAT,8K",
"max_tokens": 8191,
"model_type": "chat"
},
{
"llm_name": "gpt-4-turbo",
"tags": "LLM,CHAT,8K",
"max_tokens": 8191,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4-32k",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "gpt-4-vision-preview",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 765,
"model_type": [
"image2text",
"chat"
]
}
]
},
{
"name": "Bedrock",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"llm": []
},
{
"name": "Gemini",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"rank": "997",
"llm": [
{
"llm_name": "gemini-3-pro-preview",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,1024K,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,IMAGE2TEXT,1024K",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-lite",
"tags": "LLM,1024K,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash",
"tags": "LLM,1024K,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash-lite",
"tags": "LLM,1024K",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-embedding-001",
"tags": "TEXT EMBEDDING",
"max_tokens": 2048,
"model_type": "embedding"
}
]
},
{
"name": "Groq",
"logo": "",
"tags": "LLM",
"status": "1",
"llm": [
{
"llm_name": "gemma2-9b-it",
"tags": "LLM,CHAT,15k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "llama3-70b-8192",
"tags": "LLM,CHAT,6k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "llama3-8b-8192",
"tags": "LLM,CHAT,30k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "llama-3.1-70b-versatile",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "llama-3.1-8b-instant",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "llama-3.3-70b-versatile",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "llama-3.3-70b-specdec",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "mixtral-8x7b-32768",
"tags": "LLM,CHAT,5k",
"max_tokens": 32768,
"model_type": "chat"
}
]
},
{
"name": "OpenRouter",
"logo": "",
"tags": "LLM,IMAGE2TEXT",
"status": "1",
"llm": [],
"rank": "989",
"url": "https://openrouter.ai/api/v1"
},
{
"name": "StepFun",
"logo": "",
"tags": "LLM,IMAGE2TEXT,SPEECH2TEXT,TTS",
"status": "1",
"llm": [
{
"llm_name": "step-3",
"tags": "LLM,IMAGE2TEXT,64k",
"max_tokens": 65536,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "step-2-mini",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "step-2-16k",
"tags": "LLM,CHAT,16k",
"max_tokens": 16384,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "step-1-8k",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "step-1-32k",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "step-1-256k",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "step-r1-v-mini",
"tags": "LLM,IMAGE2TEXT,100k",
"max_tokens": 102400,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "step-1v-8k",
"tags": "LLM,IMAGE2TEXT,8k",
"max_tokens": 8192,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "step-1v-32k",
"tags": "LLM,IMAGE2TEXT,32k",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "step-1o-vision-32k",
"tags": "LLM,IMAGE2TEXT,32k",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "step-1o-turbo-vision",
"tags": "LLM,IMAGE2TEXT,32k",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "step-tts-mini",
"tags": "TTS,1000c",
"max_tokens": 1000,
"model_type": "tts",
"is_tools": false
},
{
"llm_name": "step-tts-vivid",
"tags": "TTS,1000c",
"max_tokens": 1000,
"model_type": "tts",
"is_tools": false
},
{
"llm_name": "step-asr",
"tags": "SPEECH2TEXT,100MB",
"max_tokens": 32768,
"model_type": "speech2text",
"is_tools": false
}
]
},
{
"name": "NVIDIA",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK",
"status": "1",
"llm": [
{
"llm_name": "01-ai/yi-large",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "abacusai/dracarys-llama-3.1-70b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "ai21labs/jamba-1.5-large-instruct",
"tags": "LLM,CHAT,256K",
"max_tokens": 256000,
"model_type": "chat"
},
{
"llm_name": "ai21labs/jamba-1.5-mini-instruct",
"tags": "LLM,CHAT,256K",
"max_tokens": 256000,
"model_type": "chat"
},
{
"llm_name": "aisingapore/sea-lion-7b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "baichuan-inc/baichuan2-13b-chat",
"tags": "LLM,CHAT,192K",
"max_tokens": 196608,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "bigcode/starcoder2-7b",
"tags": "LLM,CHAT,16K",
"max_tokens": 16384,
"model_type": "chat"
},
{
"llm_name": "bigcode/starcoder2-15b",
"tags": "LLM,CHAT,16K",
"max_tokens": 16384,
"model_type": "chat"
},
{
"llm_name": "databricks/dbrx-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/deepseek-r1",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "google/gemma-2b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "google/gemma-7b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "google/gemma-2-2b-it",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "google/gemma-2-9b-it",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "google/gemma-2-27b-it",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "google/codegemma-1.1-7b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "google/codegemma-7b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "google/recurrentgemma-2b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "google/shieldgemma-9b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "ibm/granite-3.0-3b-a800m-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "ibm/granite-3.0-8b-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "ibm/granite-34b-code-instruct",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "ibm/granite-8b-code-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "ibm/granite-guardian-3.0-8b",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "igenius/colosseum-355b_instruct_16k",
"tags": "LLM,CHAT,16K",
"max_tokens": 16384,
"model_type": "chat"
},
{
"llm_name": "igenius/italia_10b_instruct_16k",
"tags": "LLM,CHAT,16K",
"max_tokens": 16384,
"model_type": "chat"
},
{
"llm_name": "institute-of-science-tokyo/llama-3.1-swallow-70b-instruct-v01",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "institute-of-science-tokyo/llama-3.1-swallow-8b-instruct-v0.1",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "mediatek/breeze-7b-instruct",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta/codellama-70b",
"tags": "LLM,CHAT,100K",
"max_tokens": 100000,
"model_type": "chat"
},
{
"llm_name": "meta/llama2-70b",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "meta/llama3-8b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta/llama3-70b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta/llama-3.1-8b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "meta/llama-3.1-70b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "meta/llama-3.1-405b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "meta/llama-3.2-1b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "meta/llama-3.2-3b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "meta/llama-3.3-70b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3-medium-128k-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3-medium-4k-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3-mini-128k-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3-mini-4k-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3-small-128k-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3-small-8k-instruct",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3.5-mini",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-3.5-moe-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "mistralai/codestral-22b-instruct-v0.1",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistralai/mamba-codestral-7b-v0.1",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "mistralai/mistral-2-large-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "mistralai/mathstral-7b-v01",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "mistralai/mistral-7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "mistralai/mistral-7b-instruct-v0.3",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "mistralai/mixtral-8x7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "mistralai/mixtral-8x22b-instruct",
"tags": "LLM,CHAT,64K",
"max_tokens": 65536,
"model_type": "chat"
},
{
"llm_name": "mistralai/mistral-large",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistralai/mistral-small-24b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama3-chatqa-1.5-8b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama-3.1-nemoguard-8b-content-safety",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama-3.1-nemoguard-8b-topic-control",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama-3.1-nemotron-51b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama-3.1-nemotron-70b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama-3.1-nemotron-70b-reward",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat"
},
{
"llm_name": "nvidia/llama3-chatqa-1.5-70b",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "nvidia/mistral-nemo-minitron-8b-base",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "nvidia/mistral-nemo-minitron-8b-8k-instruct",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "nvidia/nemotron-4-340b-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "nvidia/nemotron-4-340b-reward",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "nvidia/nemotron-4-mini-hindi-4b-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "nvidia/nemotron-mini-4b-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "nv-mistralai/mistral-nemo-12b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "qwen/qwen2-7b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen2.5-7b-instruct",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen2.5-coder-7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "rakuten/rakutenai-7b-chat",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "rakuten/rakutenai-7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "seallms/seallm-7b-v2.5",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "snowflake/arctic",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "tokyotech-llm/llama-3-swallow-70b-instruct-v01",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "thudm/chatglm3-6b",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "tiiuae/falcon3-7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "upstage/solar-10.7b-instruct",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "writer/palmyra-creative-122b",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "writer/palmyra-fin-70b-32k",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "writer/palmyra-med-70b-32k",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "writer/palmyra-med-70b",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "yentinglin/llama-3-taiwan-70b-instruct",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "zyphra/zamba2-7b-instruct",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat"
},
{
"llm_name": "BAAI/bge-m3",
"tags": "TEXT EMBEDDING",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-m3-unsupervised",
"tags": "TEXT EMBEDDING",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-m3-retromae",
"tags": "TEXT EMBEDDING",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-large-en-v1.5",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-base-en-v1.5",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-small-en-v1.5",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "nvidia/embed-qa-4",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "nvidia/llama-3.2-nv-embedqa-1b-v1",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "nvidia/llama-3.2-nv-embedqa-1b-v2",
"tags": "TEXT EMBEDDING",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "nvidia/llama-3.2-nv-rerankqa-1b-v1",
"tags": "RE-RANK,512",
"max_tokens": 512,
"model_type": "rerank"
},
{
"llm_name": "nvidia/llama-3.2-nv-rerankqa-1b-v2",
"tags": "RE-RANK,8K",
"max_tokens": 8192,
"model_type": "rerank"
},
{
"llm_name": "nvidia/nvclip",
"tags": "TEXT EMBEDDING",
"max_tokens": 1024,
"model_type": "embedding"
},
{
"llm_name": "nvidia/nv-embed-v1",
"tags": "TEXT EMBEDDING",
"max_tokens": 4096,
"model_type": "embedding"
},
{
"llm_name": "nvidia/nv-embedqa-e5-v5",
"tags": "TEXT EMBEDDING",
"max_tokens": 1024,
"model_type": "embedding"
},
{
"llm_name": "nvidia/nv-embedqa-mistral-7b-v2",
"tags": "TEXT EMBEDDING",
"max_tokens": 4096,
"model_type": "embedding"
},
{
"llm_name": "nvidia/nv-rerankqa-mistral-4b-v3",
"tags": "RE-RANK,512",
"max_tokens": 512,
"model_type": "rerank"
},
{
"llm_name": "nvidia/rerank-qa-mistral-4b",
"tags": "RE-RANK,512",
"max_tokens": 512,
"model_type": "rerank"
},
{
"llm_name": "snowflake-arctic-embed-xs",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "snowflake-arctic-embed-s",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "snowflake-arctic-embed-m",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "snowflake-arctic-embed-m-long",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "snowflake-arctic-embed-l",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "adept/fuyu-8b",
"tags": "IMAGE2TEXT,1K",
"max_tokens": 1024,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "google/deplot",
"tags": "IMAGE2TEXT,8K",
"max_tokens": 8192,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "google/paligemma",
"tags": "IMAGE2TEXT,256K",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "meta/llama-3.2-11b-vision-instruct",
"tags": "IMAGE2TEXT,128K",
"max_tokens": 131072,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "meta/llama-3.2-90b-vision-instruct",
"tags": "IMAGE2TEXT,128K",
"max_tokens": 131072,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "microsoft/florence-2",
"tags": "IMAGE2TEXT,1K",
"max_tokens": 1024,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "microsoft/kosmos-2",
"tags": "IMAGE2TEXT,4K",
"max_tokens": 4096,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "microsoft/phi-3-vision-128k-instruct",
"tags": "IMAGE2TEXT,128K",
"max_tokens": 131072,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "microsoft/phi-3.5-vision-instruct",
"tags": "IMAGE2TEXT,128K",
"max_tokens": 131072,
"model_type": [
"image2text",
"chat"
]
},
{
"llm_name": "nvidia/neva-22b",
"tags": "IMAGE2TEXT,1K",
"max_tokens": 1024,
"model_type": [
"image2text",
"chat"
]
}
]
},
{
"name": "LM-Studio",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"llm": []
},
{
"name": "Cohere",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,SPEECH2TEXT",
"status": "1",
"rank": "990",
"llm": [
{
"llm_name": "command-a-plus-05-2026",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "command-a-03-2025",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "command-r7b-12-2024",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "command-a-translate-08-2025",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "command-a-reasoning-08-2025",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "command-a-vision-07-2025",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "command-r-plus-08-2024",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "command-r-08-2024",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "embed-v4.0",
"tags": "TEXT EMBEDDING,128k",
"max_tokens": 131072,
"model_type": "embedding"
},
{
"llm_name": "embed-english-v3.0",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "embed-english-light-v3.0",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "embed-multilingual-v3.0",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "embed-multilingual-light-v3.0",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
},
{
"llm_name": "rerank-v4.0-pro",
"tags": "RE-RANK,32k",
"max_tokens": 32768,
"model_type": "rerank"
},
{
"llm_name": "rerank-v4.0-fast",
"tags": "RE-RANK,32k",
"max_tokens": 32768,
"model_type": "rerank"
},
{
"llm_name": "rerank-v3.5",
"tags": "RE-RANK,4k",
"max_tokens": 4096,
"model_type": "rerank"
},
{
"llm_name": "rerank-english-v3.0",
"tags": "RE-RANK,4k",
"max_tokens": 4096,
"model_type": "rerank"
},
{
"llm_name": "rerank-multilingual-v3.0",
"tags": "RE-RANK,4k",
"max_tokens": 4096,
"model_type": "rerank"
},
{
"llm_name": "cohere-transcribe-03-2026",
"tags": "SPEECH2TEXT",
"max_tokens": 8192,
"model_type": "speech2text"
}
]
},
{
"name": "TogetherAI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"llm": []
},
{
"name": "Upstage",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"llm": [
{
"llm_name": "solar-1-mini-chat",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "solar-1-mini-chat-ja",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "solar-embedding-1-large-query",
"tags": "TEXT EMBEDDING",
"max_tokens": 4000,
"model_type": "embedding"
},
{
"llm_name": "solar-embedding-1-large-passage",
"tags": "TEXT EMBEDDING",
"max_tokens": 4000,
"model_type": "embedding"
}
]
},
{
"name": "NovitaAI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"llm": [
{
"llm_name": "qwen/qwen2.5-7b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3.2-1b-instruct",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3.2-3b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "thudm/glm-4-9b-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "thudm/glm-z1-9b-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3.1-8b-instruct-bf16",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "meta-llama/llama-3.1-8b-instruct",
"tags": "LLM,CHAT,16k",
"max_tokens": 16384,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek/deepseek-v3-0324",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1-turbo",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "Sao10K/L3-8B-Stheno-v3.2",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3.3-70b-instruct",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek/deepseek-r1-distill-llama-8b",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "mistralai/mistral-nemo",
"tags": "LLM,CHAT,128k",
"max_tokens": 131072,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3-8b-instruct",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek/deepseek-v3-turbo",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistralai/mistral-7b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1-distill-qwen-14b",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "baai/bge-m3",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8192,
"model_type": "embedding"
}
]
},
{
"name": "SILICONFLOW",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,IMAGE2TEXT,TTS,SPEECH2TEXT",
"status": "1",
"rank": "986",
"url": "https://api.siliconflow.cn/v1",
"llm": [
{
"llm_name": "deepseek-ai/DeepSeek-V4-Pro",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V4-Flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/moonshotai/Kimi-K2.6",
"tags": "LLM,IMAGE2TEXT,CHAT,262k",
"max_tokens": 262000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Pro/zai-org/GLM-5.1",
"tags": "LLM,CHAT,205k",
"max_tokens": 205000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "nex-agi/Nex-N2-Pro",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "MiniMaxAI/MiniMax-M2.5",
"tags": "LLM,CHAT,197k",
"max_tokens": 197000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/MiniMaxAI/MiniMax-M2.5",
"tags": "LLM,CHAT,197k",
"max_tokens": 197000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3.2",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-V3.2",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3.1-Terminus",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-V3.1-Terminus",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.6-35B-A3B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.6-27B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.5-397B-A17B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.5-122B-A10B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.5-35B-A3B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.5-27B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.5-9B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3.5-4B",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1",
"tags": "LLM,CHAT,160k",
"max_tokens": 160000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-R1",
"tags": "LLM,CHAT,160k",
"max_tokens": 160000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3",
"tags": "LLM,CHAT,160k",
"max_tokens": 160000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-V3",
"tags": "LLM,CHAT,160k",
"max_tokens": 160000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "stepfun-ai/Step-3.5-Flash",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-32B-Instruct",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-32B-Thinking",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-8B-Instruct",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-8B-Thinking",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-30B-A3B-Instruct",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-30B-A3B-Thinking",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Omni-30B-A3B-Instruct",
"tags": "LLM,IMAGE2TEXT,CHAT,256k",
"max_tokens": 256000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Omni-30B-A3B-Thinking",
"tags": "LLM,IMAGE2TEXT,CHAT,64k",
"max_tokens": 64000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Omni-30B-A3B-Captioner",
"tags": "LLM,IMAGE2TEXT,CHAT,64k",
"max_tokens": 64000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "inclusionAI/Ling-flash-2.0",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "inclusionAI/Ling-mini-2.0",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "tencent/Hunyuan-MT-7B",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "ByteDance-Seed/Seed-OSS-36B-Instruct",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "zai-org/GLM-4.5V",
"tags": "LLM,IMAGE2TEXT,CHAT,64k",
"max_tokens": 64000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "zai-org/GLM-4.5-Air",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-30B-A3B-Instruct-2507",
"tags": "LLM,CHAT,256k",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "tencent/Hunyuan-A13B-Instruct",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-32B",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-14B",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-8B",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-4-32B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-Z1-9B-0414",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-4-9B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-72B-Instruct-128K",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-72B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-32B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-14B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/Qwen/Qwen2.5-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-Embedding-8B",
"tags": "TEXT EMBEDDING,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Embedding-8B",
"tags": "TEXT EMBEDDING,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Embedding-4B",
"tags": "TEXT EMBEDDING,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Embedding-0.6B",
"tags": "TEXT EMBEDDING,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "BAAI/bge-m3",
"tags": "TEXT EMBEDDING,8k",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "BAAI/bge-large-en-v1.5",
"tags": "TEXT EMBEDDING,512",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "BAAI/bge-large-zh-v1.5",
"tags": "TEXT EMBEDDING,512",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Pro/BAAI/bge-m3",
"tags": "TEXT EMBEDDING,8k",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-VL-Reranker-8B",
"tags": "TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Reranker-8B",
"tags": "TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Reranker-4B",
"tags": "TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Reranker-0.6B",
"tags": "TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "BAAI/bge-reranker-v2-m3",
"tags": "TEXT RE-RANK,8k",
"max_tokens": 8192,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Pro/BAAI/bge-reranker-v2-m3",
"tags": "TEXT RE-RANK,8k",
"max_tokens": 8192,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "fnlp/MOSS-TTSD-v0.5",
"tags": "TTS",
"max_tokens": 26214400,
"model_type": "tts",
"is_tools": false
},
{
"llm_name": "FunAudioLLM/CosyVoice2-0.5B",
"tags": "TTS",
"max_tokens": 26214400,
"model_type": "tts",
"is_tools": false
}
]
},
{
"name": "siliconflow_intl",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,IMAGE2TEXT,TTS",
"status": "1",
"llm": [
{
"llm_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "MiniMaxAI/MiniMax-M2.5",
"tags": "LLM,CHAT,197k",
"max_tokens": 197000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "zai-org/GLM-5",
"tags": "LLM,CHAT,205k",
"max_tokens": 205000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "stepfun-ai/Step-3.5-Flash",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshotai/Kimi-K2.5",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMaxAI/MiniMax-M2.1",
"tags": "LLM,CHAT,197k",
"max_tokens": 197000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "zai-org/GLM-4.7",
"tags": "LLM,CHAT,205k",
"max_tokens": 205000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3.2",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3.2-Exp",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "zai-org/GLM-4.6V",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3.1-Terminus",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3.1",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1",
"tags": "LLM,CHAT,154k",
"max_tokens": 154000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "nex-agi/DeepSeek-V3.1-Nex-N1",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-32B-Instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-32B-Thinking",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "zai-org/GLM-4.5V",
"tags": "LLM,CHAT,66k",
"max_tokens": 66000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "inclusionAI/Ling-mini-2.0",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "inclusionAI/Ring-flash-2.0",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "inclusionAI/Ling-flash-2.0",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "tencent/Hunyuan-MT-7B",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Omni-30B-A3B-Captioner",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Omni-30B-A3B-Thinking",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Omni-30B-A3B-Instruct",
"tags": "LLM,CHAT,65k",
"max_tokens": 65000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Next-80B-A3B-Thinking",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Next-80B-A3B-Instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-30B-A3B-Thinking-2507",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-30B-A3B-Instruct-2507",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-235B-A22B-Instruct-2507",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-235B-A22B-Thinking-2507",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "ByteDance-Seed/Seed-OSS-36B-Instruct",
"tags": "LLM,CHAT,262k",
"max_tokens": 262000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "baidu/ERNIE-4.5-300B-A47B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "tencent/Hunyuan-A13B-Instruct",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "moonshotai/Kimi-K2-Instruct",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-32B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-14B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-8B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Reranker-8B",
"tags": "LLM,RE-RANK,33k",
"max_tokens": 33000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Embedding-8B",
"tags": "LLM,EMBEDDING,33k",
"max_tokens": 33000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Reranker-4B",
"tags": "LLM,RE-RANK,33k",
"max_tokens": 33000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Embedding-4B",
"tags": "LLM,EMBEDDING,33k",
"max_tokens": 33000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Reranker-0.6B",
"tags": "LLM,RE-RANK,33k",
"max_tokens": 33000,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-Embedding-0.6B",
"tags": "LLM,EMBEDDING,33k",
"max_tokens": 33000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "THUDM/GLM-Z1-32B-0414",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-4-32B-0414",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-Z1-9B-0414",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-4-9B-0414",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/QwQ-32B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-Coder-32B-Instruct",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen2.5-72B-Instruct-128K",
"tags": "LLM,CHAT,131k",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/deepseek-vl2",
"tags": "LLM,CHAT,4k",
"max_tokens": 4000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen2.5-72B-Instruct",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-32B-Instruct",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-14B-Instruct",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-7B-Instruct",
"tags": "LLM,CHAT,33k",
"max_tokens": 33000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "IndexTeam/IndexTTS-2",
"tags": "TTS",
"max_tokens": 1000,
"model_type": "tts",
"is_tools": false
}
]
},
{
"name": "PPIO",
"logo": "",
"tags": "LLM",
"status": "1",
"llm": [
{
"llm_name": "deepseek/deepseek-r1/community",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-v3/community",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-v3",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1-distill-llama-70b",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1-distill-qwen-32b",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1-distill-qwen-14b",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat"
},
{
"llm_name": "deepseek/deepseek-r1-distill-llama-8b",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "qwen/qwen-2.5-72b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen-2-vl-72b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3.2-3b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "qwen/qwen2.5-32b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat"
},
{
"llm_name": "baichuan/baichuan2-13b-chat",
"tags": "LLM,CHAT,14k",
"max_tokens": 14336,
"model_type": "chat"
},
{
"llm_name": "meta-llama/llama-3.1-70b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "meta-llama/llama-3.1-8b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "01-ai/yi-1.5-34b-chat",
"tags": "LLM,CHAT,16k",
"max_tokens": 16384,
"model_type": "chat"
},
{
"llm_name": "01-ai/yi-1.5-9b-chat",
"tags": "LLM,CHAT,16k",
"max_tokens": 16384,
"model_type": "chat"
},
{
"llm_name": "thudm/glm-4-9b-chat",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "qwen/qwen-2-7b-instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
}
]
},
{
"name": "Replicate",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"llm": [
{
"llm_name": "meta/llama-4-maverick-instruct",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta/llama-4-scout-instruct",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta/meta-llama-3-70b-instruct",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "meta/meta-llama-3-8b-instruct",
"tags": "LLM,CHAT,8k",
"max_tokens": 8192,
"model_type": "chat"
},
{
"llm_name": "replicate/all-mpnet-base-v2:b6b7585c9640cd7a9572c6e129c9549d79c9c31f0d3fdce7baac7c67ca38f305",
"tags": "TEXT EMBEDDING",
"max_tokens": 384,
"model_type": "embedding"
},
{
"llm_name": "ibm-granite/granite-embedding-278m-multilingual:1f76d42a05f120e12272746d5a2d86b525c13420773f795a4cbef9117d8685f1",
"tags": "TEXT EMBEDDING",
"max_tokens": 512,
"model_type": "embedding"
}
],
"rank": "987",
"url": "https://api.replicate.com"
},
{
"name": "Tencent Hunyuan",
"logo": "",
"tags": "LLM,IMAGE2TEXT",
"status": "1",
"url": "https://api.hunyuan.cloud.tencent.com/v1",
"llm": [
{
"llm_name": "hunyuan-pro",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "hunyuan-standard",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"model_type": "chat"
},
{
"llm_name": "hunyuan-standard-256K",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat"
},
{
"llm_name": "hunyuan-lite",
"tags": "LLM,CHAT,256k",
"max_tokens": 262144,
"model_type": "chat"
},
{
"llm_name": "hunyuan-vision",
"tags": "LLM,IMAGE2TEXT,8k",
"max_tokens": 8192,
"model_type": [
"image2text",
"chat"
]
}
]
},
{
"name": "XunFei Spark",
"logo": "",
"tags": "LLM,TTS",
"status": "1",
"llm": [
{
"llm_name": "Spark-Max",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true,
"features": [
"is_tools",
"thinking"
]
},
{
"llm_name": "Spark-Max-32K",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true,
"features": [
"is_tools",
"thinking"
]
},
{
"llm_name": "Spark-Lite",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true,
"features": [
"is_tools",
"thinking"
]
},
{
"llm_name": "Spark-Pro",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true,
"features": [
"is_tools",
"thinking"
]
},
{
"llm_name": "Spark-Pro-128K",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true,
"features": [
"is_tools",
"thinking"
]
},
{
"llm_name": "Spark-4.0-Ultra",
"tags": "LLM,CHAT,128K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true,
"features": [
"is_tools",
"thinking"
]
}
]
},
{
"name": "BaiduYiyan",
"logo": "",
"tags": "LLM",
"status": "1",
"llm": []
},
{
"name": "Fish Audio",
"logo": "",
"tags": "TTS",
"status": "1",
"llm": []
},
{
"name": "Tencent Cloud",
"logo": "",
"tags": "SPEECH2TEXT",
"status": "1",
"llm": []
},
{
"name": "Anthropic",
"logo": "",
"tags": "LLM",
"status": "1",
"rank": "998",
"url": "https://api.anthropic.com/",
"llm": [
{
"llm_name": "claude-opus-4-8",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-7",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-6",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-5-20251101",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-1-20250805",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-20250514",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-6",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-5-20250929",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-20250219",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-sonnet-20241022",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-haiku-20241022",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-haiku-20240307",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "Voyage AI",
"logo": "",
"tags": "TEXT EMBEDDING, TEXT RE-RANK",
"status": "1",
"llm": [
{
"llm_name": "voyage-4-large",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-4",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-4-lite",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-3-large",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-3.5",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-3.5-lite",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-code-3",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-multimodal-3",
"tags": "TEXT EMBEDDING,Chat,IMAGE2TEXT,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-large-2-instruct",
"tags": "TEXT EMBEDDING,16000",
"max_tokens": 16000,
"model_type": "embedding"
},
{
"llm_name": "voyage-finance-2",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-multilingual-2",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-law-2",
"tags": "TEXT EMBEDDING,16000",
"max_tokens": 16000,
"model_type": "embedding"
},
{
"llm_name": "voyage-code-2",
"tags": "TEXT EMBEDDING,16000",
"max_tokens": 16000,
"model_type": "embedding"
},
{
"llm_name": "voyage-large-2",
"tags": "TEXT EMBEDDING,16000",
"max_tokens": 16000,
"model_type": "embedding"
},
{
"llm_name": "voyage-2",
"tags": "TEXT EMBEDDING,4000",
"max_tokens": 4000,
"model_type": "embedding"
},
{
"llm_name": "voyage-3",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "voyage-3-lite",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "rerank-1",
"tags": "RE-RANK, 8000",
"max_tokens": 8000,
"model_type": "rerank"
},
{
"llm_name": "rerank-lite-1",
"tags": "RE-RANK, 4000",
"max_tokens": 4000,
"model_type": "rerank"
},
{
"llm_name": "rerank-2.5",
"tags": "RE-RANK, 32000",
"max_tokens": 32000,
"model_type": "rerank"
},
{
"llm_name": "rerank-2.5-lite",
"tags": "RE-RANK, 32000",
"max_tokens": 32000,
"model_type": "rerank"
},
{
"llm_name": "rerank-2",
"tags": "RE-RANK, 16000",
"max_tokens": 16000,
"model_type": "rerank"
},
{
"llm_name": "rerank-2-lite",
"tags": "RE-RANK, 8000",
"max_tokens": 8000,
"model_type": "rerank"
}
]
},
{
"name": "GiteeAI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT,SPEECH2TEXT,TEXT RE-RANK",
"status": "1",
"url": "https://ai.gitee.com/v1/",
"llm": [
{
"llm_name": "ERNIE-4.5-Turbo",
"tags": "LLM,CHAT",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "ERNIE-X1-Turbo",
"tags": "LLM,CHAT",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "DeepSeek-R1",
"tags": "LLM,CHAT",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "DeepSeek-V3",
"tags": "LLM,CHAT",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen3-235B-A22B",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen3-30B-A3B",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen3-32B",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen3-8B",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen3-4B",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen3-0.6B",
"tags": "LLM,CHAT",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "QwQ-32B",
"tags": "LLM,CHAT",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "DeepSeek-R1-Distill-Qwen-32B",
"tags": "LLM,CHAT",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "DeepSeek-R1-Distill-Qwen-14B",
"tags": "LLM,CHAT",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "DeepSeek-R1-Distill-Qwen-1.5B",
"tags": "LLM,CHAT",
"max_tokens": 65792,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen2.5-72B-Instruct",
"tags": "LLM,CHAT",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen2.5-32B-Instruct",
"tags": "LLM,CHAT",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen2.5-14B-Instruct",
"tags": "LLM,CHAT",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen2.5-7B-Instruct",
"tags": "LLM,CHAT",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen2-72B-Instruct",
"tags": "LLM,CHAT",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen2-7B-Instruct",
"tags": "LLM,CHAT",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "GLM-4-32B",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "GLM-4-9B-0414",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "glm-4-9b-chat",
"tags": "LLM,CHAT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "internlm3-8b-instruct",
"tags": "LLM,CHAT",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Yi-34B-Chat",
"tags": "LLM,CHAT",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "ERNIE-4.5-Turbo-VL",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 4096,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "Qwen2.5-VL-32B-Instruct",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "Qwen2-VL-72B",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 4096,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "Align-DS-V",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 4096,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "InternVL3-78B",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "InternVL3-38B",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "InternVL2.5-78B",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "InternVL2.5-26B",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 16384,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "InternVL2-8B",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 8192,
"model_type": [
"image2text",
"chat"
],
"is_tools": false
},
{
"llm_name": "Qwen2-Audio-7B-Instruct",
"tags": "LLM,SPEECH2TEXT",
"max_tokens": 8192,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "whisper-base",
"tags": "SPEECH2TEXT",
"max_tokens": 512,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "whisper-large",
"tags": "SPEECH2TEXT",
"max_tokens": 512,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "whisper-large-v3-turbo",
"tags": "SPEECH2TEXT",
"max_tokens": 512,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "whisper-large-v3",
"tags": "SPEECH2TEXT",
"max_tokens": 512,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "SenseVoiceSmall",
"tags": "SPEECH2TEXT",
"max_tokens": 512,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "Qwen3-Reranker-8B",
"tags": "TEXT RE-RANK,32K",
"max_tokens": 32768,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen3-Reranker-4B",
"tags": "TEXT RE-RANK,32K",
"max_tokens": 32768,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen3-Reranker-0.6B",
"tags": "TEXT RE-RANK,32K",
"max_tokens": 32768,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "Qwen3-Embedding-8B",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen3-Embedding-4B",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 4096,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen3-Embedding-0.6B",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 4096,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "jina-clip-v1",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "jina-clip-v2",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "jina-reranker-m0",
"tags": "TEXT RE-RANK,10K",
"max_tokens": 10240,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "bce-embedding-base_v1",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "bce-reranker-base_v1",
"tags": "TEXT RE-RANK",
"max_tokens": 512,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "bge-m3",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "bge-reranker-v2-m3",
"tags": "TEXT RE-RANK",
"max_tokens": 8192,
"model_type": "rerank",
"is_tools": false
},
{
"llm_name": "bge-large-zh-v1.5",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 1024,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "bge-small-zh-v1.5",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "nomic-embed-code",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "all-mpnet-base-v2",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 512,
"model_type": "embedding",
"is_tools": false
}
]
},
{
"name": "Google Cloud",
"logo": "",
"tags": "LLM",
"status": "1",
"llm": []
},
{
"name": "HuggingFace",
"logo": "",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"status": "1",
"rank": "991",
"llm": []
},
{
"name": "GPUStack",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TTS,SPEECH2TEXT,TEXT RE-RANK",
"status": "1",
"llm": []
},
{
"name": "DeepInfra",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TTS,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": [
{
"llm_name": "moonshotai/Kimi-K2-Instruct",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "mistralai/Voxtral-Small-24B-2507",
"tags": "SPEECH2TEXT",
"model_type": "speech2text"
},
{
"llm_name": "mistralai/Voxtral-Mini-3B-2507",
"tags": "SPEECH2TEXT",
"model_type": "speech2text"
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-0528-Turbo",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "Qwen/Qwen3-235B-A22B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "Qwen/Qwen3-30B-A3B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "Qwen/Qwen3-32B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "Qwen/Qwen3-14B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "deepseek-ai/DeepSeek-V3-0324-Turbo",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-0528",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "deepseek-ai/DeepSeek-V3-0324",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "mistralai/Devstral-Small-2507",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "mistralai/Mistral-Small-3.2-24B-Instruct-2506",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "meta-llama/Llama-Guard-4-12B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "Qwen/QwQ-32B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "anthropic/claude-4-opus",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "anthropic/claude-4-sonnet",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "google/gemini-2.5-flash",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "google/gemini-2.5-pro",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "google/gemma-3-27b-it",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "google/gemma-3-12b-it",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "google/gemma-3-4b-it",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "hexgrad/Kokoro-82M",
"tags": "TTS",
"model_type": "tts"
},
{
"llm_name": "canopylabs/orpheus-3b-0.1-ft",
"tags": "TTS",
"model_type": "tts"
},
{
"llm_name": "sesame/csm-1b",
"tags": "TTS",
"model_type": "tts"
},
{
"llm_name": "microsoft/Phi-4-multimodal-instruct",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "deepseek-ai/DeepSeek-V3",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "meta-llama/Llama-3.3-70B-Instruct",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "microsoft/phi-4",
"tags": "LLM,CHAT",
"model_type": "chat"
},
{
"llm_name": "openai/whisper-large-v3-turbo",
"tags": "SPEECH2TEXT",
"model_type": "speech2text"
},
{
"llm_name": "BAAI/bge-base-en-v1.5",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-en-icl",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-large-en-v1.5",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-m3",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "BAAI/bge-m3-multi",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "Qwen/Qwen3-Embedding-0.6B",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "Qwen/Qwen3-Embedding-4B",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "Qwen/Qwen3-Embedding-8B",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "intfloat/e5-base-v2",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "intfloat/e5-large-v2",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "intfloat/multilingual-e5-large",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "intfloat/multilingual-e5-large-instruct",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/all-MiniLM-L12-v2",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/all-MiniLM-L6-v2",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/all-mpnet-base-v2",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/clip-ViT-B-32",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/clip-ViT-B-32-multilingual-v1",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "sentence-transformers/paraphrase-MiniLM-L6-v2",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "shibing624/text2vec-base-chinese",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "thenlper/gte-base",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
},
{
"llm_name": "thenlper/gte-large",
"tags": "TEXT EMBEDDING",
"model_type": "embedding"
}
]
},
{
"name": "302.AI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,IMAGE2TEXT",
"status": "1",
"llm": [
{
"llm_name": "deepseek-chat",
"tags": "LLM,CHAT",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "chatgpt-4o-latest",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "llama3.3-70b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-reasoner",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-20250219",
"tags": "LLM,CHAT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-latest",
"tags": "LLM,CHAT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-beta",
"tags": "LLM,CHAT",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-mini-beta",
"tags": "LLM,CHAT",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1",
"tags": "LLM,CHAT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o3",
"tags": "LLM,CHAT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o4-mini",
"tags": "LLM,CHAT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-235b-a22b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen3-32b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gemini-2.5-pro-preview-05-06",
"tags": "LLM,CHAT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "llama-4-maverick",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,CHAT",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,CHAT",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-20250514",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "jina-clip-v2",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "jina-reranker-m0",
"tags": "TEXT EMBEDDING,TEXT RE-RANK",
"max_tokens": 10240,
"model_type": "rerank",
"is_tools": false
}
]
},
{
"name": "CometAPI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"llm": [
{
"llm_name": "gpt-5-chat-latest",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "chatgpt-4o-latest",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-mini",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-nano",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-mini",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-nano",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o4-mini-2025-04-16",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o3-pro-2025-06-10",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-1-20250805",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-opus-4-1-20250805-thinking",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514-thinking",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-latest",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-haiku-latest",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-lite",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "grok-4-0709",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-mini",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-2-image-1212",
"tags": "LLM,32k,IMAGE2TEXT",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "deepseek-v3.1",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v3",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-r1-0528",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-chat",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-reasoner",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-30b-a3b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-coder-plus-2025-07-22",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "text-embedding-ada-002",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-small",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "whisper-1",
"tags": "SPEECH2TEXT",
"max_tokens": 26214400,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "tts-1",
"tags": "TTS",
"max_tokens": 2048,
"model_type": "tts",
"is_tools": false
}
]
},
{
"name": "LongCat",
"logo": "",
"tags": "LLM",
"status": "1",
"url": "https://api.longcat.chat/openai",
"llm": [
{
"llm_name": "LongCat-Flash-Chat",
"tags": "LLM,CHAT,8000",
"max_tokens": 8000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "LongCat-Flash-Thinking",
"tags": "LLM,CHAT,8000",
"max_tokens": 8000,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "DeerAPI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT",
"status": "1",
"llm": [
{
"llm_name": "gpt-5-chat-latest",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "chatgpt-4o-latest",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-mini",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-nano",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-mini",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-nano",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o4-mini-2025-04-16",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o3-pro-2025-06-10",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-1-20250805",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-opus-4-1-20250805-thinking",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514-thinking",
"tags": "LLM,200k,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-latest",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-haiku-latest",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-lite",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "grok-4-0709",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-mini",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-2-image-1212",
"tags": "LLM,32k,IMAGE2TEXT",
"max_tokens": 32768,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "deepseek-v3.1",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-v3",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-r1-0528",
"tags": "LLM,CHAT,164k",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-chat",
"tags": "LLM,CHAT,32k",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-reasoner",
"tags": "LLM,CHAT,64k",
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-30b-a3b",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-coder-plus-2025-07-22",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "text-embedding-ada-002",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-small",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "whisper-1",
"tags": "SPEECH2TEXT",
"max_tokens": 26214400,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "tts-1",
"tags": "TTS",
"max_tokens": 2048,
"model_type": "tts",
"is_tools": false
}
]
},
{
"name": "Jiekou.AI",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK",
"status": "1",
"llm": [
{
"llm_name": "Sao10K/L3-8B-Stheno-v3.2",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "baichuan/baichuan-m2-32b",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "baidu/ernie-4.5-300b-a47b-paddle",
"tags": "LLM,CHAT,123K",
"max_tokens": 123000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "baidu/ernie-4.5-vl-424b-a47b",
"tags": "LLM,CHAT,123K",
"max_tokens": 123000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-haiku-20241022",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-sonnet-20241022",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-20250219",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-haiku-20240307",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-haiku-4-5-20251001",
"tags": "LLM,20K,IMAGE2TEXT",
"max_tokens": 20000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "claude-opus-4-1-20250805",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-20250514",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-5-20250929",
"tags": "LLM,200K,IMAGE2TEXT",
"max_tokens": 200000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "deepseek/deepseek-r1-0528",
"tags": "LLM,CHAT,163K",
"max_tokens": 163840,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek/deepseek-v3-0324",
"tags": "LLM,CHAT,163K",
"max_tokens": 163840,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek/deepseek-v3.1",
"tags": "LLM,CHAT,163K",
"max_tokens": 163840,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "doubao-1-5-pro-32k-250115",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "doubao-1.5-pro-32k-character-250715",
"tags": "LLM,CHAT,200K",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash-20250609",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash-lite",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-lite",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-lite-preview-06-17",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-lite-preview-09-2025",
"tags": "LLM,1M,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-preview-05-20",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro-preview-06-05",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "google/gemma-3-12b-it",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "google/gemma-3-27b-it",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-mini",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4.1-nano",
"tags": "LLM,CHAT,1M",
"max_tokens": 1047576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5",
"tags": "LLM,CHAT,400K",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-chat-latest",
"tags": "LLM,CHAT,400K",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-codex",
"tags": "LLM,400K,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gpt-5-mini",
"tags": "LLM,CHAT,400K",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-nano",
"tags": "LLM,CHAT,400K",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-pro",
"tags": "LLM,400K,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "gpt-5.1",
"tags": "LLM,CHAT,400K",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.1-chat-latest",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.1-codex",
"tags": "LLM,CHAT,400K",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-3-mini",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-4-0709",
"tags": "LLM,CHAT,256K",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "grok-4-fast-non-reasoning",
"tags": "LLM,2M,IMAGE2TEXT",
"max_tokens": 2000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "grok-4-fast-reasoning",
"tags": "LLM,2M,IMAGE2TEXT",
"max_tokens": 2000000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "grok-code-fast-1",
"tags": "LLM,CHAT,256K",
"max_tokens": 256000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gryphe/mythomax-l2-13b",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "meta-llama/llama-3.1-8b-instruct",
"tags": "LLM,CHAT,16K",
"max_tokens": 16384,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "meta-llama/llama-3.2-3b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "meta-llama/llama-3.3-70b-instruct",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "meta-llama/llama-4-maverick-17b-128e-instruct-fp8",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "meta-llama/llama-4-scout-17b-16e-instruct",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "minimaxai/minimax-m1-80k",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mistralai/mistral-7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "mistralai/mistral-nemo",
"tags": "LLM,CHAT,60K",
"max_tokens": 60288,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshotai/kimi-k2-0905",
"tags": "LLM,CHAT,262K",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshotai/kimi-k2-instruct",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o1",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o1-mini",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o3",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "o3-mini",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "openai/gpt-oss-120b",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "openai/gpt-oss-20b",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen-2.5-72b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen-mt-plus",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen/qwen2.5-7b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen2.5-vl-72b-instruct",
"tags": "LLM,CHAT,32K",
"max_tokens": 32768,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen/qwen3-235b-a22b-fp8",
"tags": "LLM,CHAT,40K",
"max_tokens": 40960,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen/qwen3-235b-a22b-instruct-2507",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen3-235b-a22b-thinking-2507",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen3-30b-a3b-fp8",
"tags": "LLM,CHAT,40K",
"max_tokens": 40960,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen/qwen3-32b-fp8",
"tags": "LLM,CHAT,40K",
"max_tokens": 40960,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen/qwen3-8b-fp8",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "qwen/qwen3-coder-480b-a35b-instruct",
"tags": "LLM,CHAT,262K",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen3-next-80b-a3b-instruct",
"tags": "LLM,CHAT,65K",
"max_tokens": 65536,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen/qwen3-next-80b-a3b-thinking",
"tags": "LLM,CHAT,65K",
"max_tokens": 65536,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "sao10k/l3-70b-euryale-v2.1",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "sao10k/l3-8b-lunaris",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "sao10k/l31-70b-euryale-v2.2",
"tags": "LLM,CHAT,8K",
"max_tokens": 8192,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "thudm/glm-4.1v-9b-thinking",
"tags": "LLM,CHAT,65K",
"max_tokens": 65536,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "zai-org/glm-4.5",
"tags": "LLM,CHAT,131K",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "zai-org/glm-4.5v",
"tags": "LLM,CHAT,65K",
"max_tokens": 65536,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "baai/bge-m3",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8192,
"model_type": "embedding"
},
{
"llm_name": "qwen/qwen3-embedding-0.6b",
"tags": "TEXT EMBEDDING,32K",
"max_tokens": 32768,
"model_type": "embedding"
},
{
"llm_name": "qwen/qwen3-embedding-8b",
"tags": "TEXT EMBEDDING,32K",
"max_tokens": 32768,
"model_type": "embedding"
},
{
"llm_name": "baai/bge-reranker-v2-m3",
"tags": "RE-RANK,8K",
"max_tokens": 8000,
"model_type": "rerank"
},
{
"llm_name": "qwen/qwen3-reranker-8b",
"tags": "RE-RANK,32K",
"max_tokens": 32768,
"model_type": "rerank"
}
]
},
{
"name": "MinerU",
"logo": "",
"tags": "OCR",
"status": "1",
"llm": []
},
{
"name": "PaddleOCR",
"logo": "",
"tags": "OCR",
"status": "1",
"url": "https://paddleocr.aistudio-app.com",
"llm": []
},
{
"name": "OpenDataLoader",
"logo": "",
"tags": "OCR",
"status": "1",
"llm": []
},
{
"name": "n1n",
"logo": "",
"tags": "LLM",
"status": "1",
"url": "https://api.n1n.ai/v1",
"llm": [
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128K,IMAGE2TEXT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o",
"tags": "LLM,CHAT,128K,IMAGE2TEXT",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-3.5-turbo",
"tags": "LLM,CHAT,4K",
"max_tokens": 4096,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "deepseek-chat",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "Astraflow",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"url": "https://api-us-ca.umodelverse.ai/v1",
"llm": [
{
"llm_name": "claude-opus-4-7",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-6",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-5-20250929",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-haiku-4-5-20251001",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4-mini",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4-nano",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Max",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Coder",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-32B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-235B-A22B-Instruct",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2.6",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-5.1",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.7",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-embedding-8b",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-ada-002",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
}
]
},
{
"name": "FuturMix",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,IMAGE2TEXT,SPEECH2TEXT,TTS,TEXT RE-RANK",
"status": "1",
"url": "https://futurmix.ai/v1",
"llm": [
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3.5-haiku",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-chat",
"tags": "LLM,CHAT,64k",
"max_tokens": 65536,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-reasoner",
"tags": "LLM,CHAT,64k",
"max_tokens": 65536,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gpt-4o",
"tags": "IMAGE2TEXT,128k",
"max_tokens": 128000,
"model_type": [
"image2text",
"chat"
],
"is_tools": true
},
{
"llm_name": "text-embedding-3-small",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "tts-1",
"tags": "TTS",
"max_tokens": 4096,
"model_type": "tts",
"is_tools": false
},
{
"llm_name": "tts-1-hd",
"tags": "TTS",
"max_tokens": 4096,
"model_type": "tts",
"is_tools": false
},
{
"llm_name": "whisper-1",
"tags": "SPEECH2TEXT",
"max_tokens": 25000000,
"model_type": "speech2text",
"is_tools": false
},
{
"llm_name": "jina-reranker-v2-base-multilingual",
"tags": "RE-RANK,8k",
"max_tokens": 8192,
"model_type": "rerank",
"is_tools": false
}
]
},
{
"name": "Astraflow-CN",
"logo": "",
"tags": "LLM,TEXT EMBEDDING",
"status": "1",
"url": "https://api.modelverse.cn/v1",
"llm": [
{
"llm_name": "claude-opus-4-7",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-opus-4-6",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-5-20250929",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-haiku-4-5-20251001",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4-mini",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5.4-nano",
"tags": "LLM,CHAT,400k",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-4o-mini",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Max",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-Coder",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-32B",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-VL-235B-A22B-Instruct",
"tags": "LLM,CHAT,131k",
"max_tokens": 131072,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "kimi-k2.6",
"tags": "LLM,CHAT,200k",
"max_tokens": 200000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-5.1",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2.7",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "MiniMax-M2",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "qwen3-embedding-8b",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8192,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-3-large",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "text-embedding-ada-002",
"tags": "TEXT EMBEDDING,8K",
"max_tokens": 8191,
"model_type": "embedding",
"is_tools": false
}
]
},
{
"name": "Avian",
"logo": "",
"tags": "LLM",
"status": "1",
"llm": [
{
"llm_name": "deepseek/deepseek-v3.2",
"tags": "LLM,CHAT,164K",
"max_tokens": 164000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "moonshotai/kimi-k2.5",
"tags": "LLM,CHAT,131K",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "z-ai/glm-5",
"tags": "LLM,CHAT,131K",
"max_tokens": 131000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "minimax/minimax-m2.5",
"tags": "LLM,CHAT,1M",
"max_tokens": 1000000,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "RAGcon",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,IMAGE2TEXT",
"status": "1",
"llm": []
},
{
"name": "Xiaomi",
"logo": "",
"tags": "LLM,IMAGE2TEXT",
"status": "1",
"url": "https://api.xiaomimimo.com/v1",
"llm": [
{
"llm_name": "mimo-v2.5-pro",
"tags": "LLM,CHAT,1M",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mimo-v2.5",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "mimo-v2-flash",
"tags": "LLM,CHAT,256K",
"max_tokens": 262144,
"model_type": "chat",
"is_tools": true
}
]
},
{
"name": "Perplexity",
"logo": "",
"tags": "TEXT EMBEDDING",
"status": "1",
"llm": [
{
"llm_name": "pplx-embed-v1-0.6b",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "pplx-embed-v1-4b",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "pplx-embed-context-v1-0.6b",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
},
{
"llm_name": "pplx-embed-context-v1-4b",
"tags": "TEXT EMBEDDING,32000",
"max_tokens": 32000,
"model_type": "embedding"
}
]
Feat: Add New API model provider for OpenAI-compatible gateways (#15991)

## Summary

Add support for **"New API"** as a model provider, enabling connection to [New API](https://github.com/QuantumNous/new-api) / [one-api](https://github.com/songquanpeng/one-api) compatible gateways that aggregate multiple LLM backends behind a unified OpenAI-compatible `/v1` endpoint.

### Features

- **All model types**: Chat, Embedding, Rerank, Image2Text, TTS, Speech2Text
- **List Models discovery**: `NewAPI(OpenAIAPICompatible)` class in `model_meta.py` queries the gateway's `/v1/models` to auto-discover available models via the native `GET /api/v1/providers/<name>/models` endpoint
- **Model parameter editing**: Pencil icon on each discovered model row to edit `model_type`, `max_tokens`, and `features` (e.g. tool call support) before submitting
- **Custom model addition**: "Add Custom Model" button at the bottom of the List Models dropdown for models not returned by the API
- **Gear icon settings**: Enabled the Settings gear button on provider instances to manage models on existing instances (viewMode)
- **viewMode credential passthrough**: Fixed List Models in viewMode: merges `initialValues` credentials when `api_key`/`base_url` fields are hidden by `hideWhenInstanceExists`

### Changes

**Backend** (8 files):

- `rag/llm/chat_model.py`: `NewAPIChat(Base)` class
- `rag/llm/embedding_model.py`: `NewAPIEmbed(OpenAIEmbed)` class (no auto `/v1` append)
- `rag/llm/rerank_model.py`: `NewAPIRerank(Base)` class (uses `/rerank` endpoint)
- `rag/llm/cv_model.py`: `NewAPICv(GptV4)` class
- `rag/llm/tts_model.py`: `NewAPITTS(OpenAITTS)` class
- `rag/llm/sequence2txt_model.py`: `NewAPISeq2txt(GPTSeq2txt)` class
- `rag/llm/model_meta.py`: `NewAPI(OpenAIAPICompatible)` class for List Models discovery
- `conf/llm_factories.json`: New API factory entry with all model type tags

**Frontend** (8 files + 1 new SVG):

- `web/src/assets/svg/llm/new-api.svg`: New API logo icon
- `web/src/constants/llm.ts`: `LLMFactory.NewAPI` enum + `IconMap` entry
- `web/src/components/svg-icon.tsx`: `NewAPI` added to `svgIcons`
- `web/src/pages/user-setting/setting-model/modal/provider-modal/field-config/local-llm-configs.ts`: New API `buildLocalConfig`
- `web/src/pages/user-setting/setting-model/modal/provider-modal/constants.ts`: `LIST_MODEL_PROVIDERS` includes NewAPI
- `web/src/pages/user-setting/setting-model/components/used-model.tsx`: Enable Settings gear button
- `web/src/pages/user-setting/setting-model/modal/provider-modal/hooks/use-list-models-picker.ts`: viewMode credential merge + model editing state/handlers
- `web/src/pages/user-setting/setting-model/modal/provider-modal/hooks/use-list-models-options.tsx`: Pencil edit icon per model row
- `web/src/pages/user-setting/setting-model/modal/provider-modal/index.tsx`: `AddCustomModelDialog` import + edit dialog rendering

**Note on Go implementation**: A Go model driver (`NewAPIModel` delegating to `OpenAIModel`) has been prepared but is deferred until the Go runtime is enabled in a future release (current v0.26.0 images use `API_PROXY_SCHEME=python` and do not compile Go binaries). Will submit as a follow-up PR.

## Related

- Depends on: #15996 (provider instance API improvements: server-side credential lookup, idempotent `add_model`, security fixes; required for viewMode gear icon and batch model submission)

## Test plan

- [ ] Add New API provider with api_key and base_url pointing to an OpenAI-compatible gateway
- [ ] Click "List Models": should discover and display available models from `/v1/models`
- [ ] Click pencil icon on a model: should open edit dialog to change model_type, max_tokens, features
- [ ] Select multiple models and click OK: should add all selected models
- [ ] Click gear icon on the added instance: should open viewMode with List Models working
- [ ] In viewMode, select new models including pre-existing ones, click OK: should succeed (requires #15996)
- [ ] Verify all model types work: create a Chat assistant, Embedding KB, Rerank setting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Tim Wang <wanghualoong@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-26 18:47:20 +08:00
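The List Models discovery described in the commit message above relies on the gateway exposing an OpenAI-compatible `/v1/models` endpoint. The following is a minimal sketch of that idea, not the code from `rag/llm/model_meta.py`: the function names `parse_model_ids` and `list_gateway_models` are hypothetical, and it assumes the caller passes a `base_url` that already ends in `/v1` and that the gateway returns the usual OpenAI-style body with a `data` array of objects carrying an `id` field.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style `/v1/models` response body.

    Entries without an `id` field are skipped rather than raising.
    """
    return sorted(
        item["id"]
        for item in payload.get("data", [])
        if isinstance(item, dict) and "id" in item
    )


def list_gateway_models(base_url: str, api_key: str, timeout: float = 10.0) -> list[str]:
    """Query `<base_url>/models` on a gateway and return the sorted model IDs.

    Assumption: `base_url` already includes the `/v1` suffix, so nothing
    is appended here except the `/models` path.
    """
    url = base_url.rstrip("/") + "/models"
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(json.load(response))
```

Keeping the parsing step separate from the HTTP call makes it possible to exercise the response handling with a canned payload, without a running gateway.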
},
{
"name": "New API",
"logo": "",
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,IMAGE2TEXT,TTS,SPEECH2TEXT",
"status": "1",
"llm": [],
"rank": "885"
}
]
feat: add FuturMix as model provider (#14419)

## Summary

Add [FuturMix](https://futurmix.ai) as a new model provider. FuturMix is an OpenAI-compatible unified AI gateway that provides access to 22+ models (GPT, Claude, Gemini, DeepSeek, and more) through a single API endpoint and key.

- **API Base**: `https://futurmix.ai/v1` (OpenAI-compatible)
- **Supported capabilities**: Chat, Embedding, Image2Text, TTS, Speech2Text, Rerank

### Changes

| File | Change |
|------|--------|
| `rag/llm/__init__.py` | Add `FuturMix` to `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` |
| `rag/llm/chat_model.py` | Add `FuturMixChat(Base)`; follows Astraflow/Avian pattern |
| `rag/llm/embedding_model.py` | Add `FuturMixEmbed(OpenAIEmbed)`; follows Astraflow pattern |
| `rag/llm/cv_model.py` | Add `FuturMixCV(GptV4)`; follows SILICONFLOW/OpenRouter pattern |
| `rag/llm/tts_model.py` | Add `FuturMixTTS(OpenAITTS)`; follows CometAPI/DeerAPI pattern |
| `rag/llm/sequence2txt_model.py` | Add `FuturMixSeq2txt(GPTSeq2txt)`; follows StepFun pattern |
| `rag/llm/rerank_model.py` | Add `FuturMixRerank(OpenAI_APIRerank)` |
| `conf/llm_factories.json` | Add factory config with 8 chat, 2 embedding, 1 image2text, 2 TTS, 1 speech2text models |
| `docs/guides/models/supported_models.mdx` | Add FuturMix to supported models table |

### Models included

- **Chat**: claude-sonnet-4-20250514, claude-3.5-haiku, gpt-4o, gpt-4o-mini, gemini-2.5-flash, gemini-2.0-flash, deepseek-chat, deepseek-reasoner
- **Embedding**: text-embedding-3-small, text-embedding-3-large
- **Image2Text**: gpt-4o
- **TTS**: tts-1, tts-1-hd
- **Speech2Text**: whisper-1

## Test plan

- [ ] Verify FuturMix appears in the model provider list in RAGFlow UI
- [ ] Configure FuturMix with API key and test chat completion
- [ ] Test embedding model with document indexing
- [ ] Test image2text with a sample image

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-30 10:59:37 +08:00
}