Commit Graph

15 Commits

Author SHA1 Message Date
Idriss Sbaaoui
c3b38d397f Feat: add new tests and tescases for restful api suite (#15223)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-26 10:08:45 +08:00
Idriss Sbaaoui
7d200d5bd7 Feat: add new tests and tescases for restful api suite (#15208)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-25 19:03:56 +08:00
Jonathan Chang
9d1006e4ec fix: The output of the parser in the ingestion pipeline contains HTML tags (#14920)
## Summary
This change fixes ingestion quality issues where MinerU parser output
may contain HTML fragments (for example, table-related tags like `<tr>`,
`<td>`, `<br>`), which were previously passed directly into
chunking/tokenization and degraded chunk quality.

The fix adds a sanitization step in the MinerU parser path so parsed
sections are normalized to clean text before chunking.

## Change Type (select all)
- [x] Bug fix
- [x] Ingestion pipeline improvement
- [x] Parser/chunking quality fix

## Related Issue
- https://github.com/infiniflow/ragflow/issues/14831
2026-05-25 16:06:36 +08:00
Wang Qi
bb148edf4c Revert "Fix: /openai/<chat_id>/chat/completions not aware of session_id" (#15205)
Reverts infiniflow/ragflow#15155 because this is never supported, keep
it as it is.
2026-05-25 14:23:10 +08:00
Wang Qi
e6dd397531 Fix: /openai/<chat_id>/chat/completions not aware of session_id (#15155)
### What problem does this PR solve?

Fix: /openai/<chat_id>/chat/completions not aware of session_id

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-22 20:38:56 +08:00
Wang Qi
87918650ff Refactor: Move API files (#15151)
Refactor: Move API files
2026-05-22 17:44:05 +08:00
dale053
c33d0b8081 fix: prevent sensitive fields from leaking in user API responses (#14792)
Closes #14789

### What problem does this PR solve?

User API endpoints (`login`, `user_profile`, `user_add`,
`forget_reset_password`) were returning full user objects via
`to_json()` / `to_dict()`, which included sensitive fields like
`password` and `access_token` in the response body. This leaks
credentials to the client.

This PR adds a `to_safe_dict()` method on the `User` model that strips
sensitive fields (`password`, `access_token`) and replaces all affected
call sites to use it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-22 15:14:26 +08:00
Haruko386
a725e114f9 Go: implement ASR and TTS for Xinference (#15096)
### What problem does this PR solve?

implement ASR and TTS for Xinference

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-05-21 18:28:06 +08:00
dripsmvcp
440153c378 fix(api): check kb ownership in /dify/retrieval (#15028)
POST /api/v1/dify/retrieval resolved the caller via @apikey_required
(injecting tenant_id) but then fetched the requested knowledge_id with
no tenant filter and ran the full retrieval pipeline against
kb.tenant_id (the owner). Any valid Dify-compatible API key could
retrieve chunks from any tenant whose KB UUID was known. Adds the
missing ownership check.

## Root Cause
api/apps/sdk/dify_retrieval.py line 253:
KnowledgebaseService.get_by_id(kb_id) fetched the KB by id alone, then
the handler used kb.tenant_id (the OWNER) to build the embedding model
and call the retriever. The caller tenant_id was only used downstream at
line 278 for retrieval_by_children, well after cross-tenant data was
already retrieved.

grep confirmed there was no KnowledgebaseService.accessible call
anywhere in the handler.

## Fix
Two-line guard immediately after the existing get_by_id lookup,
mirroring the pattern PR #14749 lands for the sibling sdk/doc.py routes
(download, parse, stop_parsing, retrieval_test):

    e, kb = KnowledgebaseService.get_by_id(kb_id)
    if not e:
return build_error_result(message="Knowledgebase not found!",
code=RetCode.NOT_FOUND)
+   if not KnowledgebaseService.accessible(kb_id, tenant_id):
+ return build_error_result(message="No authorization.",
code=RetCode.AUTHENTICATION_ERROR)
    if kb.tenant_embd_id:
        ...

KnowledgebaseService.accessible already handles solo-tenant ownership,
team membership via TenantService.get_joined_tenants_by_user_id, and the
permission=ME distinction. No behavior change for legitimate callers;
cross-tenant callers now receive RetCode.AUTHENTICATION_ERROR (109).

## Test Plan
- [x] Regression test added:
test/unit_test/api/apps/sdk/test_dify_retrieval.py
- test_cross_tenant_request_is_rejected -- attacker tenant calling owner
tenant KB gets 109; retriever is not invoked
- test_same_tenant_request_succeeds -- owner tenant gets the records
back
- test_missing_knowledge_base_returns_not_found -- missing KB returns
404 BEFORE the access check fires (legit callers see the clearer
message)
- [x] All 3 tests pass after the fix
- [x] Cross-tenant test FAILS on pre-fix main (KeyError on result[code]
because handler leaks records dict instead of returning auth error)
- [x] ruff check clean on both changed files
- [x] No drive-by reformatting in dify_retrieval.py -- only the 2 added
lines

### Post-fix output

    test_cross_tenant_request_is_rejected           PASSED [ 33%]
    test_same_tenant_request_succeeds               PASSED [ 66%]
    test_missing_knowledge_base_returns_not_found   PASSED [100%]

============================== 3 passed in 0.04s
===============================

Closes #15027
2026-05-21 13:29:00 +08:00
Idriss Sbaaoui
aea90f4e39 Feat: add new tests and tescases for restful api suite (#15038)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-20 14:56:55 +08:00
Idriss Sbaaoui
6b2fcb4116 Feat: add new tests and tescases for restful api suite (#14996)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-19 17:17:31 +08:00
Idriss Sbaaoui
95b56e73f2 Feat: add new tests and tescases for restful api suite (#14993)
### What problem does this PR solve?

extend restful api suite

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-19 15:43:15 +08:00
Magicbook1108
b69a6a5d80 Feat: full optimization on connector dashboard (#14979)
### What problem does this PR solve?

This PR improves the connector dashboard task management experience and
adds better visibility into connector execution logs.

### Overview:

#### Before
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052"
/>

#### Now:
<img width="700" alt="Screenshot from 2026-05-18 16-31-30"
src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627"
/>

### 1. Add a new logging page to the connector dashboard

A new logging page has been added so users can view connector task
execution logs directly from the connector dashboard.

### 2. Merge the Resume button into Confirm

The separate **Resume** button has been removed. The **Confirm** button
now represents different actions depending on the current task state:

- **Save**: Save form changes and reschedule tasks.
- **Stop**: Cancel currently scheduled or running tasks.
- **Resume**: Create new scheduled tasks after the previous tasks have
been stopped.
- **Start**: Start tasks when no task has been started yet.

### 3. Separate syncing and pruning tasks

Connector tasks are now separated into **syncing** and **pruning**.

Pruning is controlled by the **Sync deleted files** option:

- When **Sync deleted files** is disabled, only syncing tasks are shown.
- When **Sync deleted files** is enabled, both syncing and pruning tasks
are shown.

**Now: Sync deleted files disabled**

<img width="700" alt="Sync deleted files disabled"
src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d"
/>

**Now: Sync deleted files enabled**

<img width="700" alt="Sync deleted files enabled"
src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296"
/>

### 4. Update logs in backend

<img width="700" alt="image"
src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2"
/>

### 5. Remove connector resume API

- Removed: `POST /v1/connectors/<connector_id>/resume`
- Replaced by: `PATCH /v1/connectors/<connector_id>`


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-19 10:07:11 +08:00
dev
b12eaee38b fix(api): enforce tenant access for connector routes (#14747)
### What problem does this PR solve?

Fixes #14746.

Adds tenant access checks for connector-by-id REST routes before reading
connector details, mutating connector config/status, deleting
connectors, rebuilding, or listing sync logs. Unauthorized callers now
receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without
reaching the connector/log mutation paths.

Validation:
- `python3 -m pytest
--confcutdir=test/testcases/test_web_api/test_connector_app
test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py`
- `uvx ruff check api/apps/restful_apis/connector_api.py
api/db/services/connector_service.py
test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>
2026-05-18 16:09:26 +08:00
Idriss Sbaaoui
09e1fd290a Chore: migrate tests to restful api (#14871)
### What problem does this PR solve?

add new testing suite for the new restful api endpoints meant to replace
http and web api tests

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): test
2026-05-13 15:07:23 +08:00