2026-05-08 12:02:37 +08:00
{
  "name": "OpenRouter",
  "url": {
    "default": "https://openrouter.ai/api/v1"
  },
  "url_suffix": {
    "chat": "chat/completions",
    "models": "models",
    "embedding": "embeddings",
    "rerank": "rerank",
Go: implement TTS for fishaudio, openrouter and asr for fishaudio (#14926)
### What problem does this PR solve?
This PR implements TTS for the FishAudio and OpenRouter providers and ASR for
FishAudio.
**The following functionalities are now supported:**
**FishAudio:**
- [x] Text To Speech
- [x] Stream Text To Speech
- [x] Audio To Text
**OpenRouter:**
- [x] Text To Speech
**Verified examples from the CLI:**
```plaintext
**FishAudio**
RAGFlow(user)> tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav
SUCCESS
RAGFlow(user)> stream tts with 's1@test@fishaudio' text 'He who desires but acts not, breeds pestilence.' play format 'wav' save './internal' param '{"reference_id": "90e65eaaf50e4470b8e6d43ee6afd7d5", "temperature": 0.7, "top_p": 0.7, "prosody": {"speed": 1, "volume": 0, "normalize_loudness": true}, "chunk_length": 300, "normalize": true, "sample_rate": 44100, "mp3_bitrate": 128, "latency": "normal", "max_new_tokens": 1024, "repetition_penalty": 1.2, "min_chunk_length": 50, "condition_on_previous_chunks": true, "early_stop_threshold": 1}'
Saved to directory: /home/infiniflow/Documents/development/ragflow/internal/s1_output.wav
SUCCESS
RAGFlow(user)> asr with 'transcribe-1@test@fishaudio' audio './internal/test.wav' param '{"language": "en", "ignore_timestamps": true}'
+----------------------------------------------------------------------------------------------------------------------+
| text |
+----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. |
+----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-14 18:58:00 +08:00
    "balance": "credits",
    "tts": "audio/speech",
    "asr": "audio/transcriptions"
  },
  "class": "openrouter",
  "models": [
    {
      "name": "google/gemma-4-31b-it",
      "max_tokens": 262144,
      "model_types": [
        "chat"
      ],
      "thinking": {
        "default_value": true,
        "clear_thinking": true
      }
    },
    {
      "name": "minimax/minimax-m2.5",
      "max_tokens": 196608,
      "model_types": [
        "chat"
      ],
      "thinking": {
        "default_value": true,
        "clear_thinking": true
      }
    },
    {
      "name": "tencent/hy3-preview",
      "max_tokens": 262144,
      "model_types": [
        "chat"
      ],
      "thinking": {
        "default_value": true,
        "clear_thinking": true
      }
    },
    {
      "name": "openai/gpt-audio-mini",
      "max_tokens": 131072,
      "model_types": [
        "tts"
      ]
    },
    {
      "name": "openai/whisper-large-v3",
      "max_tokens": 131072,
      "model_types": [
        "asr"
      ]
    }
  ]
}
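
The provider entry above pairs a base URL (`url.default`) with per-capability path suffixes (`url_suffix`) and tags each model with `model_types`. The Go sketch below shows one plausible way to read such an entry, join the base URL with a suffix, and list the models for a capability. It is an illustration only: the `Provider` and `Model` structs and the `loadProvider`, `endpointURL`, and `modelsOfType` helpers are hypothetical names, not RAGFlow's actual types or functions, and the embedded config is a trimmed copy of the entry above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Provider mirrors a subset of the provider entry shown above.
// Hypothetical type for illustration; not RAGFlow's real struct.
type Provider struct {
	Name      string            `json:"name"`
	URL       map[string]string `json:"url"`
	URLSuffix map[string]string `json:"url_suffix"`
	Models    []Model           `json:"models"`
}

// Model mirrors one element of the "models" array (hypothetical type).
type Model struct {
	Name       string   `json:"name"`
	MaxTokens  int      `json:"max_tokens"`
	ModelTypes []string `json:"model_types"`
}

// sampleConfig is a trimmed copy of the OpenRouter entry.
const sampleConfig = `{
  "name": "OpenRouter",
  "url": {"default": "https://openrouter.ai/api/v1"},
  "url_suffix": {
    "chat": "chat/completions",
    "tts": "audio/speech",
    "asr": "audio/transcriptions"
  },
  "models": [
    {"name": "google/gemma-4-31b-it", "max_tokens": 262144, "model_types": ["chat"]},
    {"name": "openai/gpt-audio-mini", "max_tokens": 131072, "model_types": ["tts"]},
    {"name": "openai/whisper-large-v3", "max_tokens": 131072, "model_types": ["asr"]}
  ]
}`

// loadProvider decodes a JSON provider entry.
func loadProvider(data string) (Provider, error) {
	var p Provider
	err := json.Unmarshal([]byte(data), &p)
	return p, err
}

// endpointURL joins the default base URL with the suffix for one capability,
// assuming the two are meant to be separated by a single slash.
func endpointURL(p Provider, kind string) (string, error) {
	base, ok := p.URL["default"]
	if !ok {
		return "", fmt.Errorf("provider %q has no default url", p.Name)
	}
	suffix, ok := p.URLSuffix[kind]
	if !ok {
		return "", fmt.Errorf("provider %q has no url_suffix for %q", p.Name, kind)
	}
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(suffix, "/"), nil
}

// modelsOfType returns the names of models whose model_types include kind.
func modelsOfType(p Provider, kind string) []string {
	var names []string
	for _, m := range p.Models {
		for _, t := range m.ModelTypes {
			if t == kind {
				names = append(names, m.Name)
				break
			}
		}
	}
	return names
}

func main() {
	p, err := loadProvider(sampleConfig)
	if err != nil {
		panic(err)
	}
	for _, kind := range []string{"chat", "tts", "asr"} {
		u, err := endpointURL(p, kind)
		if err != nil {
			panic(err)
		}
		fmt.Println(kind, u, modelsOfType(p, kind))
	}
}
```

Keeping the suffixes in a map keyed by capability means a new capability such as `tts` or `asr` only needs a new config key, and a lookup for a capability the provider does not declare fails with an explicit error instead of producing a malformed URL.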