buua436
0501134820
Fix: support tool call config ( #14616 )
...
### What problem does this PR solve?
support tool call config
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-07 15:54:57 +08:00
wdeveloper16
78188ce9e9
Feat: add OpenDataLoader PDF parser backend ( #14058 ) ( #14097 )
...
### What problem does this PR solve?
Closes #14058.
RAGFlow supports multiple PDF parsing backends (DeepDOC, MinerU,
Docling, TCADP, PaddleOCR). This PR adds **OpenDataLoader**
([opendataloader-project/opendataloader-pdf](https://github.com/opendataloader-project/opendataloader-pdf))
as a new optional backend, giving users a deterministic, local-first
alternative with competitive table extraction accuracy.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---
### Changes
#### Backend
- `deepdoc/parser/opendataloader_parser.py` — new `OpenDataLoaderParser`
class inheriting `RAGFlowPdfParser`. Implements `check_installation()`
(guards Python package + Java 11+ runtime), `parse_pdf()` with
JSON-first extraction (heading/paragraph/table/list/image/formula) and
Markdown fallback, position-tag generation compatible with the shared
`@@page\tx0\tx1\ty0\ty1##` format, and temp-dir lifecycle with cleanup.
- `rag/app/naive.py` — new `by_opendataloader()` wrapper, registered in
`PARSERS` dict, added to `chunk_token_num=0` override list.
- `rag/flow/parser/parser.py` — `"opendataloader"` branch in the
pipeline PDF handler + check validation list.
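The shared position-tag layout mentioned in the backend notes can be sketched as follows. Only the `@@page\tx0\tx1\ty0\ty1##` layout comes from the description; the helper names `make_position_tag` and `parse_position_tag` are hypothetical, and the one-decimal coordinate formatting is an assumption.

```python
from __future__ import annotations

import re

# Layout taken from the PR description: @@page\tx0\tx1\ty0\ty1##
_TAG_RE = re.compile(r"@@(\d+)\t([\d.]+)\t([\d.]+)\t([\d.]+)\t([\d.]+)##")


def make_position_tag(page: int, x0: float, x1: float, y0: float, y1: float) -> str:
    """Build a tag (hypothetical helper; one-decimal formatting is assumed)."""
    return f"@@{page}\t{x0:.1f}\t{x1:.1f}\t{y0:.1f}\t{y1:.1f}##"


def parse_position_tag(tag: str) -> tuple[int, float, float, float, float]:
    """Parse a tag into (page, x0, x1, y0, y1); raise ValueError on mismatch."""
    m = _TAG_RE.fullmatch(tag)
    if m is None:
        raise ValueError(f"not a position tag: {tag!r}")
    page, *coords = m.groups()
    return (int(page), *map(float, coords))
```

Round-tripping a tag through both helpers is a quick way to check that a new parser emits tags the rest of the pipeline could read, under the assumptions above.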
#### Infrastructure
- `docker/entrypoint.sh` — `ensure_opendataloader()` function: opt-in
via `USE_OPENDATALOADER=true`, skips gracefully if Java is not on PATH.
#### Frontend
- `web/src/components/layout-recognize-form-field.tsx` —
`OpenDataLoader` added to `ParseDocumentType` enum and parser dropdown.
Cascades automatically to the pipeline editor's Parser component.
#### Docs
- `docs/guides/dataset/select_pdf_parser.md` — added OpenDataLoader
entry and full env-var reference.
---
### Environment variables
| Variable | Default | Description |
|---|---|---|
| `USE_OPENDATALOADER` | `false` | Set `true` to install `opendataloader-pdf` on container startup |
| `OPENDATALOADER_VERSION` | latest | Pin the PyPI release (e.g. `==2.2.1`) |
| `OPENDATALOADER_HYBRID` | _(unset)_ | Enable hybrid AI mode (e.g. `docling-fast`) |
| `OPENDATALOADER_IMAGE_OUTPUT` | _(unset)_ | `off` / `embedded` / `external` |
| `OPENDATALOADER_OUTPUT_DIR` | _(tmp)_ | Persistent output dir; temp dir used + cleaned if unset |
| `OPENDATALOADER_DELETE_OUTPUT` | `1` | `0` to retain intermediate files for debugging |
| `OPENDATALOADER_SANITIZE` | _(unset)_ | `1` to filter prompt-injection patterns from output |
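As an illustration of how the variables above might be read with their documented defaults, here is a minimal sketch. The `OpenDataLoaderSettings` class and `load_settings` function are hypothetical and not part of this PR; treating an empty string as unset is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OpenDataLoaderSettings:
    """Hypothetical container for the documented environment variables."""
    enabled: bool
    version: Optional[str]
    hybrid: Optional[str]
    image_output: Optional[str]
    output_dir: Optional[str]
    delete_output: bool
    sanitize: bool


def load_settings(env: Mapping[str, str] = os.environ) -> OpenDataLoaderSettings:
    """Read the variables from `env`, applying the defaults from the table."""

    def optional(name: str) -> Optional[str]:
        # Assumption: an empty string means "unset".
        return env.get(name) or None

    return OpenDataLoaderSettings(
        enabled=env.get("USE_OPENDATALOADER", "false").lower() == "true",
        version=optional("OPENDATALOADER_VERSION"),
        hybrid=optional("OPENDATALOADER_HYBRID"),
        image_output=optional("OPENDATALOADER_IMAGE_OUTPUT"),
        output_dir=optional("OPENDATALOADER_OUTPUT_DIR"),
        delete_output=env.get("OPENDATALOADER_DELETE_OUTPUT", "1") != "0",
        sanitize=env.get("OPENDATALOADER_SANITIZE") == "1",
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests.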
---
### Dependencies
- **Runtime**: `opendataloader-pdf` (PyPI, Apache 2.0) — opt-in, not
added to `pyproject.toml` core deps. Installed by
`ensure_opendataloader()` at container startup when
`USE_OPENDATALOADER=true`.
- **System**: Java 11+ on PATH (JVM is the underlying engine). The
installer skips with a warning if `java` is not found.
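The Java 11+ requirement could be checked along these lines. This is a sketch under stated assumptions, not the PR's `check_installation()` implementation: the function names are hypothetical, and it relies only on `java -version` printing a quoted version string.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major version from `java -version` output.

    Handles the legacy "1.8.0_281" scheme (major 8) as well as the
    newer "11.0.2" or "17" scheme. Returns None if nothing matches.
    """
    m = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if m is None:
        return None
    major = int(m.group(1))
    if major == 1 and m.group(2) is not None:
        return int(m.group(2))
    return major


def java_11_available() -> bool:
    """Return True if a `java` binary on PATH reports version 11 or newer."""
    java = shutil.which("java")
    if java is None:
        return False
    try:
        proc = subprocess.run(
            [java, "-version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return False
    # `java -version` conventionally writes to stderr; check both streams.
    major = parse_java_major(proc.stderr + proc.stdout)
    return major is not None and major >= 11
```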
---
### How to test
**Standalone parser:**
```bash
source .venv/bin/activate
uv pip install opendataloader-pdf
python3 -c "
import sys; sys.path.insert(0, '.')
from deepdoc.parser.opendataloader_parser import OpenDataLoaderParser
p = OpenDataLoaderParser()
print('available:', p.check_installation())
s, t = p.parse_pdf('path/to/test.pdf', parse_method='pipeline')
print(f'sections={len(s)} tables={len(t)}')
"
```
### Benchmark vs Docling
```
file             parser          secs   sections  tables
----------------------------------------------------------------------
text-heavy.pdf   docling         45.29  148       10
text-heavy.pdf   opendataloader   3.14  559        0
table-heavy.pdf  docling          7.05   76        3
table-heavy.pdf  opendataloader   3.71   90        0
complex.pdf      docling         42.67  114        8
complex.pdf      opendataloader   3.51  180        0
```
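To read the timing column as relative speed, the ratios can be computed directly from the figures above. This sketch only covers wall-clock seconds; it does not compare the section or table counts.

```python
# (file, docling_secs, opendataloader_secs) copied from the benchmark table
timings = [
    ("text-heavy.pdf", 45.29, 3.14),
    ("table-heavy.pdf", 7.05, 3.71),
    ("complex.pdf", 42.67, 3.51),
]


def speedup(docling_secs: float, opendataloader_secs: float) -> float:
    """Return how many times faster OpenDataLoader ran than Docling."""
    return docling_secs / opendataloader_secs


for name, docling_secs, odl_secs in timings:
    print(f"{name}: {speedup(docling_secs, odl_secs):.1f}x")
```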
2026-04-25 00:33:02 +08:00
Lynn
62cb292635
Feat/tenant model ( #13072 )
...
### What problem does this PR solve?
Add id for table tenant_llm and apply in LLMBundle.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Liu An <asiro@qq.com>
2026-03-05 17:27:17 +08:00
Yongteng Lei
f13a1fb007
Refa: improve model verification ux ( #13392 )
...
### What problem does this PR solve?
Improve model verification UX. #13395
### Type of change
- [x] Refactoring
---------
Co-authored-by: Liu An <asiro@qq.com>
2026-03-05 17:23:47 +08:00
Magicbook1108
5fc3bd38b0
Feat: Support siliconflow.com ( #13308 )
...
### What problem does this PR solve?
Feat: Support siliconflow.com
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-03-02 15:37:42 +08:00
Kevin Hu
1262533b74
Feat: support verify to set llm key and boost bigrams. ( #12980 )
...
#12863
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-02-05 19:19:09 +08:00
Yongteng Lei
2a758402ad
Fix: Hunyuan cannot work properly ( #12843 )
...
### What problem does this PR solve?
Hunyuan cannot work properly
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-27 17:04:53 +08:00
Kevin Hu
cec06bfb5d
Fix: empty chunk issue. ( #12638 )
...
#12570
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-15 17:46:21 +08:00
Lin Manhui
2e09db02f3
feat: add paddleocr parser ( #12513 )
...
### What problem does this PR solve?
Add PaddleOCR as a new PDF parser.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:48:45 +08:00
Jin Hai
e5f3d5ae26
Refactor add_llm and add speech to text ( #12089 )
...
### What problem does this PR solve?
1. Refactor implementation of add_llm
2. Add speech to text model.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-22 19:27:26 +08:00
Magicbook1108
f8fd1ea7e1
Feat: Further update Bedrock model configs ( #12029 )
...
### What problem does this PR solve?
Feat: Further update Bedrock model configs #12020 #12008
<img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69"
src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63"
/>
<img width="700" alt="afe88ec3c58f745f85c5c507b040c250"
src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef"
/>
<img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69"
src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0"
/>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:32:20 +08:00
Yongteng Lei
e9710b7aa9
Refa: treat MinerU as an OCR model 2 ( #11905 )
...
### What problem does this PR solve?
Treat MinerU as an OCR model 2. #11903
### Type of change
- [x] Refactoring
2025-12-11 17:33:12 +08:00
Yongteng Lei
a94b3b9df2
Refa: treat MinerU as an OCR model ( #11849 )
...
### What problem does this PR solve?
Treat MinerU as an OCR model.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2025-12-09 18:54:14 +08:00
Yongteng Lei
51ec708c58
Refa: cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats ( #11779 )
...
### What problem does this PR solve?
Cleanup synchronous functions in chat_model and implement
synchronization for conversation and dialog chats.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
2025-12-08 09:43:03 +08:00
Yongteng Lei
b6c4722687
Refa: make RAGFlow more asynchronous ( #11601 )
...
### What problem does this PR solve?
Try to make this more asynchronous. Verified in chat and agent
scenarios, reducing blocking behavior. #11551, #11579.
However, the impact of these changes still requires further
investigation to ensure everything works as expected.
### Type of change
- [x] Refactoring
2025-12-01 14:24:06 +08:00
Kevin Hu
d1716d865a
Feat: Alter flask to Quart for async API serving. ( #11275 )
...
### What problem does this PR solve?
#11277
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-11-18 17:05:16 +08:00
Kevin Hu
c30ffb5716
Fix: ollama model list issue. ( #11175 )
...
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-11 19:46:41 +08:00
Kevin Hu
26cf5131c9
Fix: filter builtin llm factories. ( #11163 )
...
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-11 14:52:59 +08:00
Zhichang Yu
68b952abb1
Don't select vector on infinity ( #11151 )
...
### What problem does this PR solve?
Don't select vector on infinity
### Type of change
- [x] Performance Improvement
2025-11-10 18:01:40 +08:00
Billy Bao
5a8fbc5a81
Fix: Can't add more models ( #11076 )
...
### What problem does this PR solve?
Currently we cannot add any models, since factory is a string, and the
return type of get_allowed_llm_factories() is List[object]
https://github.com/infiniflow/ragflow/pull/11003
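The type mismatch described above can be reproduced in isolation. In this sketch, `Factory` and the sample data are hypothetical stand-ins, the body of `get_allowed_llm_factories()` is invented for illustration, and the "fixed" variant shows one possible repair rather than the change actually made in this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Factory:
    """Hypothetical stand-in for a factory record; only `name` matters here."""
    name: str


def get_allowed_llm_factories() -> List[Factory]:
    # Stand-in that returns objects, as the PR describes; the data is made up.
    return [Factory("OpenAI"), Factory("Ollama")]


def is_allowed_buggy(factory: str) -> bool:
    # A str never compares equal to a Factory object, so this is always False.
    return factory in get_allowed_llm_factories()


def is_allowed_fixed(factory: str) -> bool:
    # Compare the string against the factory names instead of the objects.
    return factory in {f.name for f in get_allowed_llm_factories()}
```

With the buggy check every factory name is rejected, which matches the symptom that no model can be added.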
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-06 18:54:13 +08:00
Jin Hai
1a9215bc6f
Move some vars to globals ( #11017 )
...
### What problem does this PR solve?
As title.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 14:14:38 +08:00
Wanderson Pinto dos Santos
3654ae61c1
feat: add allowed factories variable to allow admins to restrict llms users can add ( #11003 )
...
### What problem does this PR solve?
Currently, to restrict which LLM factories users can use, we have to
delete rows from the database table manually. This PR adds a variable
that, if set, restricts the LLM factories users can see and add. This
allows us to leave llm_factories.json and the database untouched if the
LLM factory is already inserted.
Note: All the lint changes came from the pre-commit hook, which I did
not change.
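A restriction like the one described could be applied as a simple allow-list filter. This is a sketch only: the function names are hypothetical, and both the comma-separated format and the rule that an unset or empty value means "no restriction" are assumptions, not details confirmed by this PR.

```python
from typing import Iterable, List, Optional


def parse_allowed(raw: Optional[str]) -> Optional[List[str]]:
    """Return the allow-list, or None when the variable is unset or empty."""
    if raw is None or not raw.strip():
        return None
    # Assumption: the variable holds a comma-separated list of factory names.
    return [name.strip() for name in raw.split(",") if name.strip()]


def visible_factories(all_factories: Iterable[str], raw: Optional[str]) -> List[str]:
    """Filter factory names by the allow-list, preserving their original order."""
    allowed = parse_allowed(raw)
    if allowed is None:
        return list(all_factories)
    allowed_set = set(allowed)
    return [name for name in all_factories if name in allowed_set]
```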
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
2025-11-05 10:47:50 +08:00
Jin Hai
bab3fce136
Move some constants to common ( #11004 )
...
### What problem does this PR solve?
As title.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 08:01:39 +08:00
Kevin Hu
3e5a39482e
Feat: Support multiple data sources synchronizations ( #10954 )
...
### What problem does this PR solve?
#10953
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-11-03 19:59:18 +08:00
Jin Hai
d008a4df9f
Move base64_image related functions to common directory ( #10957 )
...
### What problem does this PR solve?
As title
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 15:20:46 +08:00
Billy Bao
fa38aed01b
Fix: the input length exceeds the context length ( #10895 )
...
### What problem does this PR solve?
Fix: the input length exceeds the context length #10750
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-30 19:00:53 +08:00
Liu An
40b2c48957
Chore(config): remove Youdao and BAAI embedding model providers ( #10873 )
...
### What problem does this PR solve?
This commit removes the Youdao and BAAI entries from the LLM factories
configuration as they are no longer needed or supported.
### Type of change
- [x] Config update
2025-10-29 19:38:57 +08:00
Yongteng Lei
c0c2a10680
Feat: allow initialize Redis without password ( #10856 )
...
### What problem does this PR solve?
Allow initializing Redis without a password.
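One way to make the password optional is to leave it out of the connection arguments when it is empty. The helper below is a hypothetical sketch, not the code from this PR, and the default port and database number are assumptions.

```python
from typing import Any, Dict, Optional


def build_redis_kwargs(
    host: str,
    port: int = 6379,
    db: int = 1,
    password: Optional[str] = None,
) -> Dict[str, Any]:
    """Build connection keyword arguments, omitting an empty password."""
    kwargs: Dict[str, Any] = {
        "host": host,
        "port": port,
        "db": db,
        "decode_responses": True,
    }
    # Only send a password when one is configured, so that a Redis server
    # started without authentication can still be reached.
    if password:
        kwargs["password"] = password
    return kwargs
```

With the redis-py client, the result could be passed as `redis.Redis(**build_redis_kwargs("localhost"))`.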
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-10-29 09:45:28 +08:00
Zhichang Yu
73144e278b
Don't release full image ( #10654 )
...
### What problem does this PR solve?
Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag
### Type of change
- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
buua436
4e86ee4ff9
Feat: Support Specifying OpenRouter Model Provider ( #10550 )
...
### What problem does this PR solve?
issue:
[#5787](https://github.com/infiniflow/ragflow/issues/5787)
change:
Support Specifying OpenRouter Model Provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-10-16 09:39:59 +08:00
Jin Hai
5abd0bbac1
Fix typo ( #9766 )
...
### What problem does this PR solve?
As title
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-08-27 18:56:40 +08:00
Kevin Hu
5e8cd693a5
Refa: split services about llm. ( #9450 )
...
### What problem does this PR solve?
### Type of change
- [x] Refactoring
2025-08-13 16:41:01 +08:00
Yongteng Lei
83771e500c
Refa: migrate chat models to LiteLLM ( #9394 )
...
### What problem does this PR solve?
All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment
with a real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in a
practical environment.
### Type of change
- [x] Refactoring
2025-08-12 10:59:20 +08:00
Kevin Hu
9ca86d801e
Refa: add provider info while adding model. ( #9273 )
...
### What problem does this PR solve?
#9248
### Type of change
- [x] Refactoring
2025-08-07 09:40:42 +08:00
Stephen Hu
1409bb30df
Refactor:Improve the logic so that it does not decode base 64 for the test image each time ( #9264 )
...
### What problem does this PR solve?
Improve the logic so that it does not decode the base64 test image
each time.
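The idea of decoding once and reusing the result can be sketched with a cached accessor. The names `_TEST_IMAGE_B64` and `get_test_image` are hypothetical, and the payload is a placeholder (the ASCII bytes `test`), not the project's real test image. The PR may well use a different mechanism, such as a module-level constant.

```python
import base64
from functools import lru_cache

# Placeholder payload: base64 of the ASCII bytes b"test", not a real image.
_TEST_IMAGE_B64 = "dGVzdA=="


@lru_cache(maxsize=1)
def get_test_image() -> bytes:
    """Decode the base64 payload on first use and reuse the bytes afterwards."""
    return base64.b64decode(_TEST_IMAGE_B64)
```

Every call after the first returns the cached bytes object, so repeated model verifications would not repeat the decoding work.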
### Type of change
- [x] Refactoring
- [x] Performance Improvement
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-06 11:42:25 +08:00
kuschzzp
b638d3f773
Image validation of the image2text model without using local paths ( #9052 )
...
### What problem does this PR solve?
#9050
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-30 12:57:24 +08:00
Adrian Altermatt
6691532079
Feat: Add model editing functionality with improved UI labels ( #8855 )
...
### What problem does this PR solve?
Add edit button for local LLM models
<img width="1531" height="1428" alt="image"
src="https://github.com/user-attachments/assets/19d62255-59a6-4a7e-9772-8b8743101f78"
/>
<img width="1531" height="1428" alt="image"
src="https://github.com/user-attachments/assets/c3a0f77e-cc6b-4190-95a6-13835463428b"
/>
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Liu An <asiro@qq.com>
2025-07-21 19:16:53 +08:00
Kevin Hu
163e71d06f
Fix: Hunyuan model adding error. ( #6531 )
...
### What problem does this PR solve?
#6523
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 10:33:33 +08:00
Kevin Hu
5748d58c74
Refa: refine the error message. ( #6151 )
...
### What problem does this PR solve?
#6138
### Type of change
- [x] Refactoring
2025-03-17 13:07:22 +08:00
Kevin Hu
471bd92b4c
Fix: empty api-key causes problems. ( #6022 )
...
### What problem does this PR solve?
#5926
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-13 14:57:47 +08:00
Kevin Hu
45123dcc0a
Fix: ollama model add error. ( #5947 )
...
### What problem does this PR solve?
#5944
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-12 10:56:05 +08:00
Kevin Hu
82f5d901c8
Refa: add model. ( #5820 )
...
### What problem does this PR solve?
#5783
### Type of change
- [x] Refactoring
2025-03-10 11:22:06 +08:00
Kevin Hu
4c9a3e918f
Fix: add image2text issue. ( #5431 )
...
### What problem does this PR solve?
#5356
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-27 14:06:49 +08:00
Yongteng Lei
0e920a91dd
FIX: correct typo ( #5387 )
...
### What problem does this PR solve?
Correct typo in supported_models file
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-26 17:21:09 +08:00
Yongteng Lei
cdcaae17c6
Feat: add VLLM ( #5380 )
...
### What problem does this PR solve?
Ready to add VLLM.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-02-26 16:04:53 +08:00
Kevin Hu
4f40f685d9
Code refactor ( #5371 )
...
### What problem does this PR solve?
#5173
### Type of change
- [x] Refactoring
2025-02-26 15:40:52 +08:00
Kevin Hu
605cfdb8dc
Refine error message for re-rank model. ( #5278 )
...
### What problem does this PR solve?
#5261
### Type of change
- [x] Refactoring
2025-02-24 13:01:34 +08:00
yrk111222
7ce675030b
Support downloading models from ModelScope Community. ( #5073 )
...
This PR supports downloading models from ModelScope. The main
modifications are as follows:
- New Feature (non-breaking change which adds functionality)
- Documentation Update
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-02-24 10:12:20 +08:00
Kevin Hu
ef8847eda7
Double check error of adding llm. ( #5237 )
...
### What problem does this PR solve?
#5227
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-21 19:09:49 +08:00
Kevin Hu
78982d88e0
Reformat error message. ( #4829 )
...
### What problem does this PR solve?
#4828
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-10 16:47:53 +08:00