
1139 Commits

Author SHA1 Message Date
Yongteng Lei
df16a80f25 Feat: add initial Google Drive connector support (#11147)
### What problem does this PR solve?

This feature is primarily ported from the
[Onyx](https://github.com/onyx-dot-app/onyx) project with necessary
modifications. Thanks for such a brilliant project.

Minor: consistently use `google_drive` rather than `google_driver`.

<img width="566" height="731" alt="image"
src="https://github.com/user-attachments/assets/6f64e70e-881e-42c7-b45f-809d3e0024a4"
/>

<img width="904" height="830" alt="image"
src="https://github.com/user-attachments/assets/dfa7d1ef-819a-4a82-8c52-0999f48ed4a6"
/>

<img width="911" height="869" alt="image"
src="https://github.com/user-attachments/assets/39e792fb-9fbe-4f3d-9b3c-b2265186bc22"
/>

<img width="947" height="323" alt="image"
src="https://github.com/user-attachments/assets/27d70e96-d9c0-42d9-8c89-276919b6d61d"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-10 19:15:02 +08:00
Billy Bao
6e1ebb2855 Fix: Optimize Prompts and Regex for use_sql() (#11148)
### What problem does this PR solve?

Fix: Optimize Prompts and Regex for use_sql() #11127 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-10 19:02:07 +08:00
Zhichang Yu
68b952abb1 Don't select vector on infinity (#11151)
### What problem does this PR solve?

Don't select vector on infinity

### Type of change

- [x] Performance Improvement
2025-11-10 18:01:40 +08:00
Kevin Hu
d207291217 Fix: add download stats to kb logs. (#11112)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-10 13:28:07 +08:00
Lynn
d016a06fd5 Feat/monitor task (#11116)
### What problem does this PR solve?

Show task executor.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-10 12:51:39 +08:00
Kevin Hu
dd1c8c5779 Feat: add auto parse to connector. (#11099)
### What problem does this PR solve?

#10953

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-07 16:49:29 +08:00
Jin Hai
307f53dae8 Minor tweaks (#11106)
### What problem does this PR solve?

Refactor

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-07 15:44:57 +08:00
Kevin Hu
34283d4db4 Feat: add data source to pipeline logs. (#11075)
### What problem does this PR solve?

#10953

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-07 11:43:59 +08:00
Lynn
b7aa6d6c4f Fix: add avatar for UI (#11080)
### What problem does this PR solve?

Add avatar for admin UI.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-07 09:27:31 +08:00
Jin Hai
af98763e27 Admin: add 'show version' (#11079)
### What problem does this PR solve?

```
admin> show version;
show_version
+-----------------------+
| version               |
+-----------------------+
| v0.21.0-241-gc6cf58d5 |
+-----------------------+
admin> \q
Goodbye!

```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-06 19:24:46 +08:00
Billy Bao
5a8fbc5a81 Fix: Can't add more models (#11076)
### What problem does this PR solve?

Currently we cannot add any models, because `factory` is a string while
`get_allowed_llm_factories()` returns `List[object]`, so the two types do not match.
https://github.com/infiniflow/ragflow/pull/11003
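
The mismatch described above can be illustrated with a minimal sketch. The `Factory` class and both helper functions are hypothetical and are not taken from the RAGFlow code base; they only show why a membership test between a string and a list of objects never succeeds, and one possible way to compare by name instead.

```python
from dataclasses import dataclass


@dataclass
class Factory:
    """Hypothetical stand-in for an LLM factory record (not RAGFlow's class)."""
    name: str


def is_allowed_buggy(factory: str, allowed: list) -> bool:
    # A str never compares equal to a Factory instance,
    # so this membership test is always False.
    return factory in allowed


def is_allowed_fixed(factory: str, allowed: list) -> bool:
    # Compare the string against each object's name instead.
    return any(item.name == factory for item in allowed)


allowed = [Factory("OpenAI"), Factory("Ollama")]
print(is_allowed_buggy("OpenAI", allowed))
print(is_allowed_fixed("OpenAI", allowed))
```

The actual fix in the PR may differ, for example by changing the return type of `get_allowed_llm_factories()` rather than the comparison.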

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-06 18:54:13 +08:00
Kevin Hu
3bd1fefe1f Feat: debug sync data. (#11073)
### What problem does this PR solve?

#10953 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-06 16:48:04 +08:00
Yongteng Lei
23b81eae77 Feat: GraphRAG handle cancel gracefully (#11061)
### What problem does this PR solve?

GraphRAG handles cancellation gracefully. #10997.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-06 16:12:20 +08:00
Kevin Hu
adbb8319e0 Fix: add fields for logs. (#11039)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-06 09:49:57 +08:00
Jin Hai
f98b24c9bf Move api.settings to common.settings (#11036)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-06 09:36:38 +08:00
Kevin Hu
cd6ed4b380 Feat: add webhook component. (#11033)
### What problem does this PR solve?

#10427

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-05 19:59:23 +08:00
Billy Bao
24335485bf Fix: get_allowed_llm_factories() return type (#11031)
### What problem does this PR solve?

Fix: get_allowed_llm_factories() return type #11003

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

<img width="2880" height="215" alt="Screenshot 2025-11-05 17-02-01"
src="https://github.com/user-attachments/assets/ee892077-21f9-4b1e-a1d2-b921fa7f6121"
/>
2025-11-05 17:32:12 +08:00
Jin Hai
02d10f8eda Move var from rag.settings to common.globals (#11022)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 15:48:50 +08:00
Kevin Hu
8584d4b642 Fix: numeric string miss transformation. (#11025)
### What problem does this PR solve?

#11024

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-05 15:14:30 +08:00
Kevin Hu
b86e07088b Fix: escape multi-steps issues. (#11016)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-05 14:51:00 +08:00
Jin Hai
1a9215bc6f Move some vars to globals (#11017)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 14:14:38 +08:00
buua436
89410d2381 fix:api /factories wrong return (#11015)
### What problem does this PR solve?

Change: fix the wrong return value of the `/factories` API.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-05 12:50:11 +08:00
Jin Hai
96c015fb85 Fix and refactor imports (#11010)
### What problem does this PR solve?

1. Move EMBEDDING_CFG to common.globals
2. Fix incorrect imports
3. Move signal handlers to common/signal_utils.py

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 11:07:54 +08:00
Wanderson Pinto dos Santos
3654ae61c1 feat: add allowed factories variable to allow admins to restrict llms users can add (#11003)
### What problem does this PR solve?

Currently, if we want to restrict the factories users can use, we
need to delete them from the database table manually. This PR proposes
a variable that, if set, restricts the LLM factories users can see and
add. This allows us to leave llm_factories.json and the database
untouched if the LLM factory is already inserted.

Note: all the lint changes came from the pre-commit hook, which I did not
change.
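
A rough sketch of the idea, assuming the allow-list is a comma-separated environment variable. The variable name `ALLOWED_LLM_FACTORIES`, the function name, and the dictionary shape of a factory are all hypothetical; the PR text does not say how the variable is named or stored.

```python
import os


def filter_allowed_factories(all_factories, env_var="ALLOWED_LLM_FACTORIES"):
    """Return only the factories named in the allow-list variable.

    Both the variable name and the comma-separated format are assumptions
    made for this sketch. If the variable is unset or empty, nothing is
    restricted and every factory is returned.
    """
    raw = os.environ.get(env_var, "")
    allowed = {name.strip() for name in raw.split(",") if name.strip()}
    if not allowed:
        return list(all_factories)
    return [f for f in all_factories if f["name"] in allowed]


factories = [{"name": "OpenAI"}, {"name": "Ollama"}, {"name": "Tongyi-Qianwen"}]
os.environ["ALLOWED_LLM_FACTORIES"] = "OpenAI, Ollama"
print([f["name"] for f in filter_allowed_factories(factories)])
```

Filtering at read time like this leaves the stored factory list unchanged, which matches the stated goal of not editing llm_factories.json or the database.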

### Type of change

- [X] New Feature (non-breaking change which adds functionality)
2025-11-05 10:47:50 +08:00
Jin Hai
bab3fce136 Move some constants to common (#11004)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 08:01:39 +08:00
Kevin Hu
4bbbf92331 Refa: link connector to KB. (#10991)
### What problem does this PR solve?

#10953

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-04 20:13:52 +08:00
Jin Hai
880a6a0428 Move some enumerate type to constants.py (#10998)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-04 19:25:25 +08:00
Jin Hai
03038c7d3d Update RetCode to common.constants (#10984)
### What problem does this PR solve?

1. Update RetCode to common.constants
2. Decouple the admin and API modules

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-04 15:12:53 +08:00
Jin Hai
16d2be623c Minor tweaks (#10987)
### What problem does this PR solve?

1. Rename identifiers
2. Fix some return statements
3. Fix some typos

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-04 14:15:31 +08:00
Billy Bao
19f71a961a Fix: Create dataset performance unmatched between HTTP api and web ui (#10960)
### What problem does this PR solve?

Fix: Create dataset performance unmatched between HTTP api and web ui
#10925

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-04 13:45:14 +08:00
Jin Hai
1e45137284 Move 'timeout' to common folder (#10983)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-04 11:51:12 +08:00
Jin Hai
d55344bc11 Remove unused code (#10981)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-04 11:10:29 +08:00
Jin Hai
378bdfccfc Refactor log utils (#10973)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 20:25:02 +08:00
Kevin Hu
3e5a39482e Feat: Support multiple data sources synchronizations (#10954)
### What problem does this PR solve?
#10953

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-03 19:59:18 +08:00
Jin Hai
9a486e0f51 Move some funcs from api to rag module (#10972)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 19:26:09 +08:00
Jin Hai
1284647694 Refactor file utils (#10970)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 18:54:55 +08:00
Jin Hai
076d811086 Introduce common/config_utils.py (#10968)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 17:25:06 +08:00
Jin Hai
121d3fd815 Introduce common/constants.py (#10965)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 16:32:37 +08:00
Jin Hai
d008a4df9f Move base64_image related functions to common directory (#10957)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 15:20:46 +08:00
Jin Hai
78631a3fd3 Move some functions out of 'api/utils/common.py' (#10948)
### What problem does this PR solve?

as title.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 12:34:47 +08:00
Stephen Hu
b47361432a Fix: API: chunk.update does not update positions (#10945)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/10944

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-03 11:01:44 +08:00
Billy Bao
fa210e7c58 Feat: parsing hyperlinks in docx and pdf & Fix: default parser config of toc extraction (#10877)
### What problem does this PR solve?

Feat: parsing hyperlinks in docx and pdf #10848
Fix: default parser config of toc extraction

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-03 09:34:12 +08:00
Jin Hai
360f5c1179 Move token related functions to common (#10942)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 08:50:05 +08:00
Jin Hai
44f2d6f5da Move 'get_project_base_directory' to common directory (#10940)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-02 21:05:28 +08:00
Jin Hai
57a83eca8a Remove unused code (#10938)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-02 16:25:16 +08:00
Jin Hai
6447b737ab Move singleton to common directory (#10935)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-02 12:24:08 +08:00
Jin Hai
f52e56c2d6 Remove 'get_lan_ip' and add common misc_utils.py (#10880)
### What problem does this PR solve?

Add get_uuid, download_img and hash_str2int into misc_utils.py

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-10-31 16:42:01 +08:00
buua436
5674d762f7 Feat:check embedding model api (#10854)
### What problem does this PR solve?
change:
Randomly sample `check_num` chunks from knowledge base `kb_id`, re-embed
them using `embd_id`, and compare with stored vectors via cosine
similarity. If `avg_cos_sim > 0.99`, return success (`code=0`);
otherwise return business failure (`code=10`).

url:
`/v1/kb/check_embedding`

Request Body:
```
{
  "kb_id": "<dataset_id>",
  "embd_id": "BAAI/bge-m3@SILICONFLOW",
  "check_num": 5
}

```
Success Response:
```
{
  "code": 0,
  "message": "success",
  "data": {
    "summary": { "avg_cos_sim": 0.999999, "sampled": 5, "valid": 5, "max_cos_sim": 0.999999, "min_cos_sim": 0.999999, "model": "BAAI/bge-m3@SILICONFLOW" },
    "results": [ ... ]
  }
}
```
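
The check described above could be sketched roughly as follows. This is not the endpoint's implementation: the function names, the `(text, vector)` shape of the stored chunks, the `embed` callable, the failure message, and the handling of an empty knowledge base are assumptions made for illustration. Only the sampling, the cosine comparison, the 0.99 threshold, and the `0`/`10` codes come from the description.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def check_embedding(stored, embed, check_num, threshold=0.99, rng=None):
    """Re-embed a random sample of chunks and compare with the stored vectors.

    stored: list of (text, vector) pairs (hypothetical shape).
    embed:  callable mapping text to a vector (stands in for the model `embd_id`).
    """
    rng = rng or random.Random()
    sample = rng.sample(stored, min(check_num, len(stored)))
    sims = [cosine_similarity(vec, embed(text)) for text, vec in sample]
    if not sims:
        # Assumption: an empty sample is reported as a business failure.
        return {"code": 10, "message": "no chunks to check", "data": {}}
    avg = sum(sims) / len(sims)
    summary = {
        "avg_cos_sim": avg,
        "sampled": len(sample),
        "valid": len(sims),
        "max_cos_sim": max(sims),
        "min_cos_sim": min(sims),
    }
    ok = avg > threshold
    # The failure message wording is an assumption.
    message = "success" if ok else "embedding mismatch"
    return {"code": 0 if ok else 10, "message": message, "data": {"summary": summary}}
```

A caller would pass the chunks loaded for `kb_id` as `stored` and a wrapper around the embedding model as `embed`.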

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-30 19:06:16 +08:00
Billy Bao
fa38aed01b Fix: the input length exceeds the context length (#10895)
### What problem does this PR solve?

Fix: the input length exceeds the context length #10750

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-30 19:00:53 +08:00
Yongteng Lei
a3bb4aadcc Fix: predictable token generation (#10868)
### What problem does this PR solve?

Fix predictable token generation.
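
The PR text does not show how tokens were generated before or after the fix. As a general illustration only, a common way to avoid predictable tokens in Python is to draw them from the operating system's cryptographically secure random source through the standard `secrets` module; the function name and the default length below are assumptions.

```python
import secrets


def generate_token(nbytes: int = 32) -> str:
    """Return a URL-safe token built from `nbytes` of cryptographically secure randomness.

    Hypothetical helper for illustration; not taken from the RAGFlow code base.
    """
    return secrets.token_urlsafe(nbytes)


print(generate_token())
```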

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-30 09:31:36 +08:00