mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-07-04 01:29:35 +08:00
### What problem does this PR solve?
This PR implement implement OCR for Baidu and Mistral, implement
PaddleOCR provider and implement ASR for CoHere
**Verified examples from the CLI:**
```
RAGFlow(user)> ocr with 'mistral-ocr-2512@test@mistral' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
RAGFlow(user)> ocr with 'paddleocr-vl-0.9b@test@baidu' file './internal/text.jpg'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parallel to these organizational innovations there were significant complementary technical innovations (e.g., improved methods of manufacturing cast-iron pipe and of coating interiors for pressure maintenance, and newer paving and construction material... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# PaddleOCR
RAGFlow(user)> ocr with 'PaddleOCR-VL-1.5@test@paddleocr' file './internal/test.pdf'
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| text |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| # Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke
Nando Metzger
Photogra
Anton Obukhov
Rodrigo Caye Daudt
netry and Remote Sensing,
Shengyu Huang
Konrad Schindler
ETH Zürich
<div style="text-align: c... |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
# Cohere
RAGFlow(user)> asr with 'cohere-transcribe-03-2026@test@cohere' audio './internal/test.wav' param '{"language": "en"}'
+-----------------------------------------------------------------------------------------------------------------------+
| text |
+-----------------------------------------------------------------------------------------------------------------------+
| The examination and testimony of the experts enabled the Commission to conclude that five shots may have been fired. |
+-----------------------------------------------------------------------------------------------------------------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
108 lines
1.8 KiB
JSON
108 lines
1.8 KiB
JSON
{
|
|
"name": "Mistral",
|
|
"url": {
|
|
"default": "https://api.mistral.ai"
|
|
},
|
|
"url_suffix": {
|
|
"chat": "v1/chat/completions",
|
|
"models": "v1/models",
|
|
"embedding": "v1/embeddings",
|
|
"ocr": "v1/ocr"
|
|
},
|
|
"class": "mistral",
|
|
"models": [
|
|
{
|
|
"name": "mistral-large-latest",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "mistral-medium-latest",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "mistral-small-latest",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "ministral-8b-latest",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "ministral-3b-latest",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "pixtral-large-latest",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat",
|
|
"vision"
|
|
]
|
|
},
|
|
{
|
|
"name": "codestral-latest",
|
|
"max_tokens": 256000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "open-mistral-nemo",
|
|
"max_tokens": 128000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "open-mistral-7b",
|
|
"max_tokens": 32000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "open-mixtral-8x7b",
|
|
"max_tokens": 32000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "open-mixtral-8x22b",
|
|
"max_tokens": 64000,
|
|
"model_types": [
|
|
"chat"
|
|
]
|
|
},
|
|
{
|
|
"name": "mistral-embed",
|
|
"max_tokens": 8192,
|
|
"model_types": [
|
|
"embedding"
|
|
]
|
|
},
|
|
{
|
|
"name": "mistral-ocr-2512",
|
|
"max_tokens": 8192,
|
|
"model_types": [
|
|
"ocr"
|
|
]
|
|
}
|
|
]
|
|
}
|