ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-06 03:18:36 +08:00

Author	SHA1	Message	Date
chanx	c41855da81	Fix: Model provider add verify and fixed form in modal not resetting issue (#15520 ) ### What problem does this PR solve? Fix: Model provider add verify and fixed form in modal not resetting issue ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-03 11:59:57 +08:00
Lynn	dc4b82523b	Feat: tenant llm provider (#14595 ) ### What problem does this PR solve? Python implementation of the Go-based model_provider API suite. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: bill <yibie_jingnian@163.com>	2026-05-29 17:39:41 +08:00
balibabu	2c099bbb95	Fix: Uploading TSV format documents to the knowledge base did not generate any error messages. (#15284 ) ### What problem does this PR solve? Fix: Uploading TSV format documents to the knowledge base did not generate any error messages. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-27 14:42:53 +08:00
buua436	71a52d579c	fix: move agent attachment download api (#15146 ) ### What problem does this PR solve? move agent attachment download api to the correct route and update frontend callers ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Notes - Move the attachment download endpoint from document routes to agent routes. - Update frontend download callers to use the agent attachment endpoint. - Reuse the shared file response header helper instead of duplicating it in `agent_api.py`.	2026-05-22 15:22:05 +08:00
Magicbook1108	b28e134944	Feat: add local & ssh provider in admin panel (#15039 ) ### What problem does this PR solve? Feat: add local & ssh provider in admin panel ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-20 16:56:20 +08:00
Magicbook1108	b69a6a5d80	Feat: full optimization on connector dashboard (#14979 ) ### What problem does this PR solve? This PR improves the connector dashboard task management experience and adds better visibility into connector execution logs. ### Overview: #### Before <img width="700" alt="image" src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052" /> #### Now: <img width="700" alt="Screenshot from 2026-05-18 16-31-30" src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627" /> ### 1. Add a new logging page to the connector dashboard A new logging page has been added so users can view connector task execution logs directly from the connector dashboard. ### 2. Merge the Resume button into Confirm The separate Resume button has been removed. The Confirm button now represents different actions depending on the current task state: - Save: Save form changes and reschedule tasks. - Stop: Cancel currently scheduled or running tasks. - Resume: Create new scheduled tasks after the previous tasks have been stopped. - Start: Start tasks when no task has been started yet. ### 3. Separate syncing and pruning tasks Connector tasks are now separated into syncing and pruning. Pruning is controlled by the Sync deleted files option: - When Sync deleted files is disabled, only syncing tasks are shown. - When Sync deleted files is enabled, both syncing and pruning tasks are shown. Now: Sync deleted files disabled <img width="700" alt="Sync deleted files disabled" src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d" /> Now: Sync deleted files enabled <img width="700" alt="Sync deleted files enabled" src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296" /> ### 4. Update logs in backend <img width="700" alt="image" src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2" /> ### 5. 
Remove connector resume API - Removed: `POST /v1/connectors/<connector_id>/resume` - Replaced by: `PATCH /v1/connectors/<connector_id>` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-19 10:07:11 +08:00
buua436	b89878c593	Fix: dataset document download route (#14910 ) ### What problem does this PR solve? dataset document download route ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-14 10:59:06 +08:00
plind	dd76653dc1	feat: add tag management for Agents with filtering and sorting (#14774 ) (#14799 ) ## Summary Closes #14774. Adds free-form tags on agents (UserCanvas) with full UI + API: - Stored as comma-separated `tags` column on `UserCanvas` with online migration. - New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT /v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query. - "Edit tags" item in agent dropdown opens a chip-style editor dialog; tags render as badges on each agent card. - New "Tags" facet in the agents filter bar, with counts. ## Implementation notes - Tag matching is exact-token: the SQL filter wraps stored tags as `,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`. - Server-side normalization in `UserCanvasService.update_tags`: dedup (case-insensitive), per-tag cap of 64 chars, total length capped at 512 chars to fit the column, commas inside tag values are replaced with spaces. - Tenant authorization: `PUT /v1/agent/<id>/tags` gates on `UserCanvasService.accessible(canvas_id, tenant_id)`. - Tag listing scope: `UserCanvasService.list_tags` follows the same own + team-shared rule as `get_by_tenant_ids`. - i18n: keys added to `en.ts` and `zh.ts` only (per project convention; other locales fall back). - `HomeCard` gets a non-breaking `extra?: ReactNode` slot for the chip row; no `src/components/ui/` files modified. ## Test plan - [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column exists (`DESCRIBE user_canvas`). - [ ] Agents page renders cards normally (no console error from missing field). - [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog was unmounting with the dropdown). - [ ] Typing a tag without pressing Enter and clicking Save persists it (regression: last typed tag was being dropped). - [ ] Chip input supports Enter/comma to commit, Backspace on empty to remove, `×` to remove individual chip. 
- [ ] Tag containing a comma sent via API is stored with the comma replaced by a space. - [ ] 20 long tags sent via API does not error (length cap silently truncates). - [ ] "Tags" filter in the filter bar shows counts and narrows the list. - [ ] Filtering by `ml` does not return agents tagged `ml-ops`. - [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc. - [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or no permission.`	2026-05-13 21:41:32 +08:00
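The tag normalization and exact-token matching described in this commit can be sketched as follows. This is a minimal illustration, not the code in `UserCanvasService.update_tags`: the function names `normalize_tags` and `has_tag` are hypothetical, matching is shown in plain Python rather than SQL, and dropping whole tags once the 512-character budget is reached is an assumption about how the cap is applied.

```python
MAX_TAG_LEN = 64      # per-tag cap named in the commit message
MAX_TOTAL_LEN = 512   # total column budget named in the commit message


def normalize_tags(raw_tags):
    """Return a comma-separated string of cleaned, de-duplicated tags (hypothetical helper)."""
    seen = set()
    kept = []
    total = 0
    for raw in raw_tags:
        # Commas delimit tags in the column, so replace them inside a tag value.
        tag = " ".join(raw.replace(",", " ").split())[:MAX_TAG_LEN].strip()
        if not tag or tag.lower() in seen:
            continue  # skip empties and case-insensitive duplicates
        # Count the joining comma when checking the total budget.
        extra = len(tag) + (1 if kept else 0)
        if total + extra > MAX_TOTAL_LEN:
            break  # assumption: tags past the budget are dropped whole
        seen.add(tag.lower())
        kept.append(tag)
        total += extra
    return ",".join(kept)


def has_tag(stored, wanted):
    """Exact-token match: wrap the stored value in commas and look for ',wanted,'."""
    return f",{wanted}," in f",{stored},"
```

Wrapping both sides in commas is what keeps `ml` from matching inside `ml-ops`; a plain substring test would not make that distinction.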
Ahmad Intisar	e994051eb9	Feature/generic api connector (#13545 ) # feat: Add Generic REST API Connector ## What problem does this PR solve? RAGFlow supports many specific data source connectors (MySQL, Slack, Google Drive, etc.), but there was no way to connect an arbitrary REST API as a data source. Users with custom or third-party APIs had to write a new connector class for each one. This PR adds a generic, configuration-driven REST API connector that lets users connect any REST API as a data source entirely through the UI — no code changes needed per API. --- ## Features ### Core Connector (`common/data_source/rest_api_connector.py`) - Implements `LoadConnector` and `PollConnector` interfaces for full and incremental sync - Configurable authentication: None, API Key (custom header), Bearer Token, Basic Auth - Pluggable pagination: Page-based, Offset-based, Cursor-based, or None - Smart page-size inference from user's query parameters to avoid duplicate/conflicting params - Configurable request delay between pages to prevent API rate limiting - Auto-detection of the items array in JSON responses (`items`, `results`, `data`, `records`, or first list found) - Advanced field mapping with dot-notation (`country.name`), array wildcards (`newsType[].name`), type hints, and default values - Optional content template rendering (`"Title: {title}\nBody: {body}"`) - HTML stripping for content fields - Stable document IDs via `hash128` from a configurable ID field or auto-generated from item content - Pydantic configuration schema with automatic coercion of UI string inputs to dicts/lists ### Backend Registration (`rag/svr/sync_data_source.py`, `common/constants.py`, `common/data_source/config.py`) - `REST_API` sync class wired into RAGFlow's `func_factory` - Full sync (`load_from_state`) and incremental polling (`poll_source`) support - Credentials and config passed from task to connector following existing patterns (MySQL, SeaFile, etc.) 
### Test Connection Endpoint (`api/apps/connector_app.py`) - `POST /v1/connector/<id>/test` validates config schema, authentication, and API connectivity without triggering a sync - Clear error messages for auth failures vs. config issues ### Frontend UI (`web/src/pages/user-setting/data-source/constant/`) - Postman-style configuration:* Base URL, Query Parameters (key=value per line), Auth, Content Fields, Metadata Fields, Pagination Type - Auth-type-aware form: fields for API key header/value, Bearer token, or Basic username/password appear only when relevant - Advanced Settings toggle for: Custom Headers, Max Pages, Request Delay, Poll Timestamp Field, Request Body (POST) - Connector icon (SVG) and i18n strings (English) - "Test Connection" button to validate before syncing --- ## Controls & Safety - Configurable max pages safety cap (default: 1000, adjustable in UI) - Configurable request delay between pages (default: 0.5s, adjustable in UI) - Auth errors (401/403) fail immediately without retries; transient errors retry with exponential backoff - Diagnostic logging: auth setup confirmation, request details on failure, content field extraction status --- ## Type of change - [x] New Feature (non-breaking change which adds functionality) ##Visual Screenshots of Features <img width="482" height="510" alt="Screenshot 2026-03-11 at 5 19 52 PM" src="https://github.com/user-attachments/assets/dcb7ab4a-1622-44f3-bb02-d6f0527314c4" /> (Connector can be configured within the external data sources tab) Configuration Parameters: <img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 46 PM" src="https://github.com/user-attachments/assets/5e154e71-4ab5-4872-bfb2-04f02b73c18a" /> <img width="661" height="682" alt="Screenshot 2026-03-11 at 5 20 54 PM" src="https://github.com/user-attachments/assets/00cb14b7-0bcf-4b94-9d71-34e93369ecb2" /> Connection can be tested before attaching to dataset: <img width="981" height="681" alt="Screenshot 2026-03-11 at 5 21 40 PM" 
src="https://github.com/user-attachments/assets/aaa6eeeb-89a7-4349-bc34-2423bf8be9ee" /> Ingestion tested with API connector (works perfectly fine): <img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 22 30 PM" src="https://github.com/user-attachments/assets/afcd0d58-cadd-4152-badc-d2f14d96fbec" /> Search & Retrieval works as well with metadata flow: <img width="1062" height="705" alt="Screenshot 2026-03-11 at 5 23 05 PM" src="https://github.com/user-attachments/assets/d41ee935-dcf7-4456-b317-22a76ca032c0" /> --------- Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-13 20:35:01 +08:00
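The field-mapping syntax (`country.name`, `newsType[].name`) and the items-array auto-detection listed above could work roughly as in the sketch below. The helper names `extract_path` and `find_items` are hypothetical and are not taken from `rest_api_connector.py`; type hints, default values, and content templates from the commit are left out.

```python
def extract_path(item, path):
    """Resolve a dotted path such as 'country.name' or 'newsType[].name'.

    A segment ending in '[]' fans out over a list, so the result is a list.
    Missing keys yield None (or an empty list after a fan-out) instead of raising.
    """
    current = [item]
    fanned_out = False
    for segment in path.split("."):
        wildcard = segment.endswith("[]")
        key = segment[:-2] if wildcard else segment
        next_values = []
        for value in current:
            if not isinstance(value, dict) or key not in value:
                continue
            child = value[key]
            if wildcard:
                if isinstance(child, list):
                    next_values.extend(child)
            else:
                next_values.append(child)
        current = next_values
        fanned_out = fanned_out or wildcard
    if fanned_out:
        return current
    return current[0] if current else None


def find_items(payload):
    """Pick the list of records out of a decoded JSON response body."""
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        # Try the well-known keys first, then fall back to the first list found.
        for key in ("items", "results", "data", "records"):
            if isinstance(payload.get(key), list):
                return payload[key]
        for value in payload.values():
            if isinstance(value, list):
                return value
    return []
```

For example, with `item = {"country": {"name": "Chile"}, "newsType": [{"name": "a"}, {"name": "b"}]}`, the path `country.name` resolves to a single string and `newsType[].name` resolves to a list of two names.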
buua436	de2abe9ed8	Fix: tag parser id (#14724 ) ### What problem does this PR solve? tag parser id ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-09 14:29:09 +08:00
Wang Qi	0552b1695a	Fix UI search multiple datasets (#14689 ) ### What problem does this PR solve? Fix UI search multiple datasets ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-05-08 20:20:09 +08:00
Preston Percival	e8f19aa338	feat(graphrag): fix merge concurrency and add resume-from-checkpoint (#14238 ) This PR addresses three related GraphRAG reliability issues that together allow long-running GraphRAG tasks (10+ hours of LLM extraction) to be resumed after a crash or pause without re-doing completed work. It builds on #14096 (per-doc subgraph cache) and extends the same idea to the resolution and community-detection phases. Fixes #14236. ## 1. Fix concurrent merge crash Long GraphRAG runs would crash near the end of entity resolution with: ``` RuntimeError: dictionary keys changed during iteration ``` in `Extractor._merge_graph_nodes`. Two changes: - `rag/graphrag/general/extractor.py`: snapshot `graph.neighbors(node1)` via `list(...)` before iterating, so concurrent `add_edge` / `remove_node` mutations on the shared `nx.Graph` cannot invalidate the iterator. Also tracks each redirected neighbour in `node0_neighbors` so a later merged node sharing the same external neighbour takes the edge-merge branch instead of overwriting via `add_edge`. - `rag/graphrag/entity_resolution.py`: serialize the merge step with a dedicated `asyncio.Semaphore(1)`. `nx.Graph` is not thread-safe and concurrent merges on overlapping neighbourhoods can produce incorrect results even with the snapshot fix. ## 2. Don't wipe partial graph on pause Previously the pause / cancel UI path called `settings.docStoreConn.delete({"knowledge_graph_kwd": [...]}, ...)`, destroying every subgraph, entity, relation, and graph row. Re-triggering then started GraphRAG from scratch even though #14096 had already added `load_subgraph_from_store`. After main was merged in (which deleted `api/apps/kb_app.py` per #14394), the pause path now lives on the new REST surface `DELETE /v1/datasets/<id>/<index_type>`: - `api/apps/services/dataset_api_service.py`: `delete_index` accepts a `wipe: bool = True` parameter. 
When `False` the doc-store rows and GraphRAG phase markers are left intact and only the running task is cancelled. Default preserves historical behaviour. - `api/apps/restful_apis/dataset_api.py`: parses `?wipe=false\|0\|no\|off` from the query string and forwards it. - `web/src/utils/api.ts` + `web/src/services/knowledge-service.ts`: `unbindPipelineTask` appends `?wipe=false` when explicitly false. - The GraphRAG pause action in `web/src/pages/dataset/dataset/generate-button/hook.ts` passes `wipe: false` for `KnowledgeGraph`; raptor is unchanged. UX impact: the pause icon next to a running GraphRAG task no longer wipes graph data. The only path that still wipes is the explicit Delete action in `GenerateLogButton` (trash icon behind a confirmation modal). ## 3. Phase-completion markers (`rag/graphrag/phase_markers.py`) A small Redis-backed marker layer at `graphrag:phase:{kb_id}:{resolution_done\|community_done}` (7-day TTL). `run_graphrag_for_kb` consults the markers on entry and skips phases that already completed in a prior run. Markers are cleared automatically when: - new docs are merged into the graph (which invalidates prior resolution and community results), - `delete_index` wipes the graph, or - `delete_knowledge_graph` is called. Redis failures never block a run -- markers are an optimization, not a gate. ## 4. Idempotent community detection `extract_community` previously did `delete-then-insert` on `community_report` rows; a crash mid-insert left the dataset with no reports. Now report IDs are derived deterministically from `(kb_id, community.title)`, the existing report IDs are snapshotted before insert, new rows are written, then only stale rows are pruned. A failure at any step leaves either the prior or the new report set intact -- never a partial mix. ## 5. 
Tunable doc-store insert pipeline The GraphRAG insert loop in `rag/graphrag/utils.py` and the `community_report` insert in `rag/graphrag/general/index.py` were both hardcoded to `es_bulk_size = 4` and ran strictly sequentially. On a real KB this meant 1077 chunks took ~21 minutes for a 100-chunk slice -- pure round-trip overhead. - New `insert_chunks_bounded()` helper in `rag/graphrag/utils.py` batches inserts via a bounded `asyncio.Semaphore`. Same retry / timeout semantics as the prior loop. - Defaults: 64 docs per batch, 4 batches in flight (matches the regular ingest pipeline in `document_service.py`). Tunable per-deployment via `GRAPHRAG_INSERT_BULK_SIZE` and `GRAPHRAG_INSERT_CONCURRENCY`. - Both `set_graph` and `extract_community` now use the helper. This dropped the same 1077-chunk insert from minutes to seconds in local testing without measurable extra pressure on Infinity (total in-flight docs ≤ `BULK_SIZE × CONCURRENCY` = 256 by default). ## Tests - `test/unit_test/rag/graphrag/test_merge_graph_nodes.py` (3 tests): dense neighbourhood merge, neighbour-snapshot regression, concurrent serialized merges. - `test/unit_test/rag/graphrag/test_phase_markers.py` (4 tests): set/has round-trip, kb-scoped clear, no-op on empty input, graceful Redis failure. - `test/testcases/test_web_api/test_dataset_management/test_dataset_sdk_routes_unit.py`: new `test_delete_index_wipe_flag_unit` covers `wipe=false` for both GraphRAG and raptor on the new REST route, and confirms the default still wipes and clears phase markers. ## Compatibility - Backward compatible: tasks queued before this change behave identically (default `wipe=true`, no markers expected). - No schema/migration changes; all new state lives in Redis. - New optional REST query param `wipe` on `DELETE /v1/datasets/<id>/<index_type>`. - New optional env vars `GRAPHRAG_INSERT_BULK_SIZE` and `GRAPHRAG_INSERT_CONCURRENCY`; defaults preserve safe behaviour. 
## Example of resume Screenshot below shows a test resuming knowledge graph generation after applying the concurrency fix and re-deploying. <img width="521" height="677" alt="image" src="https://github.com/user-attachments/assets/9ef0d405-cbb3-420d-a1a1-e51f3e7e9b7a" /> ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-05-06 15:01:01 +08:00
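The bounded insert pipeline from section 5 of this commit can be illustrated with a short sketch. It assumes a hypothetical async callable `insert_batch` that writes one list of chunks to the doc store; the function name `insert_bounded` is likewise illustrative, and the retry and timeout handling that the real `insert_chunks_bounded()` is said to keep is omitted here.

```python
import asyncio
import os


async def insert_bounded(chunks, insert_batch, bulk_size=None, concurrency=None):
    """Insert chunks in batches with at most `concurrency` batches in flight.

    `insert_batch` is a hypothetical async callable taking one list of chunks.
    Returns the number of batches that were submitted.
    """
    # Defaults mirror the values named in the commit message (64 and 4).
    bulk_size = bulk_size or int(os.environ.get("GRAPHRAG_INSERT_BULK_SIZE", "64"))
    concurrency = concurrency or int(os.environ.get("GRAPHRAG_INSERT_CONCURRENCY", "4"))
    limiter = asyncio.Semaphore(concurrency)

    async def guarded(batch):
        # The semaphore caps how many batches run at the same time.
        async with limiter:
            await insert_batch(batch)

    batches = [chunks[i:i + bulk_size] for i in range(0, len(chunks), bulk_size)]
    await asyncio.gather(*(guarded(batch) for batch in batches))
    return len(batches)
```

With the defaults, the number of documents in flight never exceeds `bulk_size × concurrency`, which is the 256-document bound quoted in the commit message.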
Attili-sys	24af0875e5	Feat/configurable metadata display (#13464 ) ### What problem does this PR solve? Currently, RAGFlow's Search and Chat interfaces display only raw vectorized text chunks during retrieval, without contextual information about their source documents. Users cannot see document titles, page numbers, upload dates, or custom metadata fields that would help them understand and trust the retrieved results. This PR introduces an optional metadata display feature that enriches retrieved chunks with document-level metadata in both the Search tab and Chatbot interface. Key improvements: - Search results: Display document metadata as styled badges beneath chunk snippets - Chat citations: Show metadata in citation popovers and reference lists for better source context - LLM context: Metadata is injected into the LLM prompt to enable more accurate, citation-aware responses - External API support: Applications using RAGFlow's SDK retrieval endpoints (`/v1/retrieval`, `/v1/searchbots/retrieval_test`) can opt-in via request parameters - User control: Multi-select dropdown UI allows users to choose which metadata fields to display Implementation approach: - ✅ Reuses existing `DocMetadataService` infrastructure (no new database tables or indices) - ✅ Settings stored in existing JSON configuration fields (`search_config.reference_metadata`, `prompt_config.reference_metadata`) - ✅ No database migrations required - ✅ Disabled by default (fully opt-in and backward-compatible) - ✅ Dynamic metadata field selection populated from actual document metadata keys - ✅ Fixed critical bug where Python's builtin `set()` was shadowed by a route handler function Modified endpoints (all backward-compatible): - `POST /v1/retrieval` (Public SDK) - `POST /v1/searchbots/retrieval_test` (Searchbots) - `POST /v1/chunk/retrieval_test` (UI/Internal) - Chat completions endpoints (via `extra_body.reference_metadata` or `prompt_config`) ### Type of change - [x] New Feature (non-breaking 
change which adds functionality) ###Images - <img width="879" height="1275" alt="image" src="https://github.com/user-attachments/assets/95b2d731-31ae-45a1-b081-bf5893f52aeb" /> <br><br> <br><br> <img width="1532" height="362" alt="image" src="https://github.com/user-attachments/assets/9cebc65b-b7a7-459f-b25e-3b13fa9b638e" /> <br><br> <br><br> <img width="2586" height="1320" alt="image" src="https://github.com/user-attachments/assets/2153d493-d899-461f-a7a9-041391e07776" /> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Attili-sys <Attili-sys@users.noreply.github.com> Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>	2026-04-30 23:13:27 +08:00
Yingfeng	4ee0702aed	Feat: add skills space to context engine (#13908 ) ### What problem does this PR solve? issue #13714 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-04-30 12:36:03 +08:00
Wang Qi	1b84892e3a	Fix delete graph (#14484 ) ### What problem does this PR solve? Fix delete graph ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue)	2026-04-29 18:09:10 +08:00
balibabu	ce933357c6	Fix: Dataset: When configuring the "general chunk method," options such as chunk size and parent-child slicing are unavailable. (#14459 ) ### What problem does this PR solve? Fix: Dataset: When configuring the "general chunk method," options such as chunk size and parent-child slicing are unavailable. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: balibabu <assassin_cike@163.com>	2026-04-29 14:37:48 +08:00
euvre	35f6d81b73	Refactor: migrate chunk retrieval_test and knowledge_graph to REST API endpoints (#14402 ) ### What problem does this PR solve? ## Summary Migrate two web API endpoints to REST-style HTTP API endpoints, following the pattern established in #14222: \| Old Endpoint \| New Endpoint \| \|---\|---\| \| `POST /v1/chunk/retrieval_test` \| `POST /api/v1/datasets/<dataset_id>/search` \| \| `GET /v1/chunk/knowledge_graph` \| `GET /api/v1/datasets/<dataset_id>/graph` \|	2026-04-28 20:00:26 +08:00
buua436	444e564329	Fix: align chat recommendation and thumbup APIs (#14413 ) ### What problem does this PR solve? align chat recommendation and thumbup APIs ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-28 12:55:16 +08:00
Jack	c81081f8ef	Refactor: Doc change parser (#14327 ) ### What problem does this PR solve? Before migration Web API: POST /v1/document/change_parser HTTP API: PATCH /api/v1/datasets/<dataset_id>/documents After consolidation, Restful API PATCH /api/v1/datasets/<dataset_id>/documents ### Type of change - [x] Refactoring	2026-04-27 23:42:57 +08:00
Jack	49912a156e	Refactor: migrate document run api (#14351 ) ### What problem does this PR solve? Before migration: POST /v1/document/run After migration: POST /api/v1/documents/ingest/ ### Type of change - [x] Refactoring	2026-04-27 21:25:58 +08:00
Jack	a536980e22	Refactor: Doc batch change status (#14337 ) ### What problem does this PR solve? Before migration Web API: POST /v1/document/change_status After consolidation, Restful API POST /api/v1/datasets/<dataset_id>/documents/batch-update-status ### Type of change - [x] Refactoring	2026-04-27 20:00:23 +08:00
Wang Qi	488c3ef6a3	Add task API (#14393 ) ### What problem does this PR solve? Add task API ### Type of change - [x] Refactor	2026-04-27 19:16:37 +08:00
Jack	61a24a2c14	Refactor: migrate doc upload info used in chat (#14359 ) ### What problem does this PR solve? Before migration: POST /v1/document/upload_info/ After migration: POST /api/v1/documents/upload/ ### Type of change - [x] Refactoring	2026-04-27 16:58:42 +08:00
buua436	a9e5724b46	Refa: unify document create flows under REST documents API (#14345 ) ### What problem does this PR solve? unify document create flows under REST documents API ### Type of change - [x] Refactoring	2026-04-27 10:18:16 +08:00
euvre	4dcc42e0e1	feat(api): add unified index API and dataset management endpoints (#14222 ) ### What problem does this PR solve? ## Summary Refactor the dataset API layer into a clean service/REST separation pattern, add a unified `/index` API for graph/raptor/mindmap operations, and introduce several new dataset management endpoints with full test coverage. ## Changes ### Service Layer (`dataset_api_service.py`) - Added `trace_index(dataset_id, tenant_id, index_type)` — unified trace function for all index types - Added `run_index`, `delete_index` service functions - Added `get_dataset`, `get_ingestion_summary`, `list_ingestion_logs`, `get_ingestion_log` - Added `run_embedding`, `list_tags`, `aggregate_tags`, `delete_tags`, `rename_tag` - Added `get_flattened_metadata`, `get_auto_metadata`, `update_auto_metadata` ### REST API Layer (`dataset_api.py`) New unified routes: \| Method \| Route \| Description \| \|--------\|-------\|-------------\| \| POST \| `/datasets/<id>/index?type=graph\\|raptor\\|mindmap` \| Run index task \| \| GET \| `/datasets/<id>/index?type=graph\\|raptor\\|mindmap` \| Trace index task \| \| DELETE \| `/datasets/<id>/<index_type>` \| Delete index \| \| GET \| `/datasets/<id>` \| Get dataset details \| \| GET \| `/datasets/<id>/ingestions/summary` \| Ingestion summary \| \| GET \| `/datasets/<id>/ingestions` \| List ingestion logs \| \| GET \| `/datasets/<id>/ingestions/<log_id>` \| Get single ingestion log \| \| POST \| `/datasets/<id>/embedding` \| Run embedding \| \| GET \| `/datasets/<id>/tags` \| List tags \| \| GET \| `/datasets/tags/aggregation` \| Aggregate tags across datasets \| \| DELETE \| `/datasets/<id>/tags` \| Delete tags \| \| PUT \| `/datasets/<id>/tags` \| Rename tag \| \| GET \| `/datasets/metadata/flattened` \| Get flattened metadata \| \| GET/PUT \| `/datasets/<id>/metadata/config` \| New metadata config path \| Removed routes (replaced by unified `/index`): - `POST /datasets/<id>/mindmap` - `GET /datasets/<id>/mindmap` 
Preserved legacy routes (backward compatibility): - `/run_graphrag`, `/trace_graphrag`, `/run_raptor`, `/trace_raptor` - `/auto_metadata` GET/PUT ### Test Suite - Updated `common.py` helpers: added `trace_index`, removed `run_mindmap`/`trace_mindmap` - Added 7 new test files with 39 test cases total: \| Test File \| Cases \| \|-----------\|-------\| \| `test_get_dataset.py` \| 4 \| \| `test_ingestion_summary.py` \| 2 \| \| `test_ingestion_logs.py` \| 5 \| \| `test_index_api.py` \| 14 \| \| `test_embedding.py` \| 2 \| \| `test_tags.py` \| 8 \| \| `test_flattened_metadata.py` \| 4 \| - Deleted `test_mindmap_tasks.py` (covered by unified index tests) ## Design Decisions 1. Unified `/index?type=...` — single endpoint replaces 3 separate route pairs for graph/raptor/mindmap 2. Backward compatibility — old routes (`/run_graphrag`, `/run_raptor`, `/auto_metadata`) preserved alongside new paths 3. `_VALID_INDEX_TYPES = {"graph", "raptor", "mindmap"}` — input validation via constant set 4. `_INDEX_TYPE_TO_TASK_ID_FIELD` — maps index type to KB model task ID field for clean dispatch ## Files Changed - `api/apps/restful_apis/dataset_api.py` - `api/apps/services/dataset_api_service.py` - `sdk/python/ragflow_sdk/modules/dataset.py` - `test/testcases/test_http_api/common.py` - `test/testcases/test_http_api/test_dataset_management/` (7 new files) ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Signed-off-by: noob <yixiao121314@outlook.com>	2026-04-27 09:38:01 +08:00
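Design decisions 3 and 4 describe validating the `type` query value against a constant set and then dispatching through a lookup table. A minimal sketch of that pattern follows; the field names in the mapping and the function `resolve_task_id_field` are hypothetical, since the commit message names the constants but does not list their contents beyond the three index types.

```python
_VALID_INDEX_TYPES = {"graph", "raptor", "mindmap"}

# Hypothetical field names; the commit message does not list the real ones.
_INDEX_TYPE_TO_TASK_ID_FIELD = {
    "graph": "graph_task_id",
    "raptor": "raptor_task_id",
    "mindmap": "mindmap_task_id",
}


def resolve_task_id_field(index_type):
    """Validate the index type, then return the model field that stores its task ID."""
    if index_type not in _VALID_INDEX_TYPES:
        allowed = ", ".join(sorted(_VALID_INDEX_TYPES))
        raise ValueError(f"Invalid index type {index_type!r}; expected one of: {allowed}")
    return _INDEX_TYPE_TO_TASK_ID_FIELD[index_type]
```

Keeping validation and dispatch in two small constants means a new index type only needs one entry in each, rather than a new route pair.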
Wang Qi	199fbceb72	Refactor user REST API (#14334 ) ### What problem does this PR solve? Refactor user REST API ### Type of change - [x] Refactoring	2026-04-24 10:25:15 +08:00
Magicbook1108	c74aece63c	Feat: Agent api (#14157 ) ### What problem does this PR solve? 1. List agents Prev API: - `/v1/canvas/list GET` - `/api/v1/agents GET` Current API: `/api/v2/agents GET` 2. Get canvas template Prev API: `/v1/canvas/templates GET` Current API: `/api/v2/agents/templates GET` 3. Delete an agent Prev API: - `/v1/canvas/rm POST` - `/api/v1/agents/<agent_id> DELETE` Current API: `/api/v2/agents/<agent_id> DELETE` 4. Update an agent Prev API: - `/api/v1/agents/<agent_id> PUT` - `/v1/canvas/setting POST ` Current API: `/api/v2/agents/<agent_id> PATCH` 5. Create an agent Prev API: - `/v1/canvas/set POST` - `/api/v1/agents POST` Current API: `/api/v2/agents POST` 6. Get an agent Prev API: - `/v1/canvas/get/<canvas_id> GET ` Current API: `/api/v2/agents/<agent_id> GET` 7. Reset an agent Prev API: - `/v1/canvas/reset POST` Current API: `/api/v2/agents/<agent_id>/reset POST` 8. Upload a file to an agent Prev API: - `/v1/canvas/upload/<canvas_id> POST` Current API: `/api/v2/agents/<agent_id>/upload POST` 9. Input form Prev API: - `/v1/canvas/input_form GET` Current API: `/api/v2/agents/<agent_id>/components/<component_id>/input-form GET` 10. Debug an agent Prev API: - `/v1/canvas/debug POST` Current API: `/api/v2/agents/<agent_id>/components/<component_id>/debug POST` 11. Trace an agent Prev API: - `/v1/canvas/trace GET` Current API: `/api/v2/agents/<agent_id>/logs/<message_id> GET` 12. Get an agent version list Prev API: - `/v1/canvas/getlistversion/<canvas_id>` Current API: `/api/v2/agents/<agent_id>/versions GET` 13. Get a version of agent Prev API: - `/v1/canvas/getversion/<version_id>` Current API: `/api/v2/agents/<agent_id>/versions/<version_id> GET` 14. Test db connection Prev API: - `/v1/canvas/test_db_connect POST` Current API: `/api/v2/agents/test_db_connection` 15. Rerun the agent Prev API: - `/v1/canvas/rerun POST` Current API: `/api/v2/agents/rerun POST` 16. 
Get prompts Prev API: - `/v1/canvas/prompts GET` Current API: `/api/v2/agents/prompts GET` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: chanx <1243304602@qq.com>	2026-04-24 10:02:22 +08:00
buua436	7817b0d779	Refa: migrate chunk APIs to RESTful routes (#14291 ) ### What problem does this PR solve? migrate chunk APIs to RESTful routes ### Type of change - [x] Refactoring	2026-04-23 14:17:23 +08:00
buua436	aa4526266f	Refa: migrate MCP APIs to RESTful api (#14317 ) ### What problem does this PR solve? migrate MCP APIs to RESTful api ### Type of change - [x] Refactoring	2026-04-23 12:51:27 +08:00
Jack	dbf8c6ed90	Refactor: Doc metadata update (#14289 ) ### What problem does this PR solve? Before migration Web API: POST /v1/document/metadata/update After migration, Restful API PATCH /api/v2/datasets/<dataset_id>/documents/metadatas ### Type of change - [x] Refactoring	2026-04-23 12:04:34 +08:00
Wang Qi	01753b8f31	Refactor: API connectors (#14228 ) ### What problem does this PR solve? Refactor /api/v1/connectors to be more RESTful. ### Type of change - [x] Refactoring	2026-04-22 20:42:41 +08:00
Jack	c08cd8e090	Refactor: Migrate document metadata config update API (#14286 ) ### What problem does this PR solve? Before migration Web API: POST /v1/document/update_metadata_setting After consolidation, Restful API PUT /api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config ### Type of change - [x] Refactoring	2026-04-22 20:01:31 +08:00
Magicbook1108	d1c62fc19d	Refact: Tenant api (#14288 ) ### What problem does this PR solve? Refact: Tenant api ### Type of change - [x] Refactoring	2026-04-22 20:00:32 +08:00
Jack	3d8a82c0aa	Refactor: Consolidation WEB API & HTTP API for document delete api (#14254 ) ### What problem does this PR solve? Before consolidation Web API: POST /v1/document/rm Http API - DELETE /api/v1/datasets/<dataset_id>/documents After consolidation, Restful API -- DELETE /api/v1/datasets/<dataset_id>/documents ### Type of change - [x] Refactoring	2026-04-22 10:49:52 +08:00
buua436	6baf74afc1	Refa: align chat and search restful APIs (#14229 ) ### What problem does this PR solve? Refactor /api/v1/chats to be more RESTful. ### Type of change - [x] Refactoring --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2026-04-22 10:49:11 +08:00
Jack	2d05475693	Refactor: Consolidation WEB API & HTTP API for document infos (#14239 ) ### What problem does this PR solve? Before consolidation Web API: POST /v1/document/infos Http API - GET /api/v1/datasets/<dataset_id>/documents After consolidation, Restful API -- GET /api/v1/datasets/<dataset_id>/documents?ids=id1&ids=id2 ### Type of change - [ ] Refactoring	2026-04-21 19:35:11 +08:00
Jack	009e538a4e	Refactor: Consolidation WEB API & HTTP API for document get_filter (#14248 ) ### What problem does this PR solve? Before consolidation Web API: POST /v1/document/filter Http API - GET /api/v1/datasets/<dataset_id>/documents After consolidation, Restful API -- GET /api/v1/datasets/<dataset_id>/documents?type=filter ### Type of change - [x] Refactoring	2026-04-21 18:55:30 +08:00
Jack	939933649a	Refactor: Consolidation WEB API & HTTP API for document list_docs (#14176 ) ### What problem does this PR solve? Before consolidation Web API: POST /v1/document/list Http API - GET /api/v1/datasets/<dataset_id>/documents After consolidation, Restful API -- GET /api/v1/datasets/<dataset_id>/documents ### Type of change - [x] Refactoring	2026-04-20 14:54:40 +08:00
Magicbook1108	f906a203bb	Fix doc generator (#14160 ) ### What problem does this PR solve? Fix doc generator ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-16 20:37:38 +08:00
balibabu	4cf4d444d2	Fix: Login page type error. (#14156 ) ### What problem does this PR solve? Fix: Login page type error. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-04-16 18:46:52 +08:00
Jack	bc5f78996b	Consolidation of document upload API (#14106 ) ### What problem does this PR solve? Consolidation WEB API & HTTP API for document upload Before consolidation Web API: POST /v1/document/upload Http API - POST /api/v1/datasets/<dataset_id>/documents After consolidation, Restful API -- POST /api/v1/datasets/<dataset_id>/documents ### Type of change - [x] Refactoring	2026-04-15 11:27:43 +08:00
Jack	576431de99	Refactor: Change update doc from PUT to patch (#14067 ) ### What problem does this PR solve? Before change, update_document in api/apps/restful_apis/document_api.py is using "PUT". After change, it will use "PATCH" which is more suitable. ### Type of change - [x] Refactoring	2026-04-14 17:12:23 +08:00
chanx	6ffa566ec3	Refactor: Standardize naming convention to camelCase (#14079 ) ### What problem does this PR solve? Refactor: Standardize naming convention to camelCase ### Type of change - [x] Refactoring	2026-04-13 21:07:07 +08:00
Jack	4046a4cfb6	Consolidation metadata summary API (#14031 ) ### What problem does this PR solve? Consolidation WEB API & HTTP API for document metadata summary Before consolidation Web API: POST /api/v1/document/metadata/summary Http API - GET /v1/datasets/<dataset_id>/metadata/summary After consolidation, Restful API -- GET /v1/datasets/<dataset_id>/metadata/summary ### Type of change - [x] Refactoring	2026-04-10 18:41:30 +08:00
Jack	577c96bf2a	Refactor: Merge document update API (#13962 ) ### What problem does this PR solve? Refactor: merge document.rename into document.update_document ### Type of change - [x] Refactoring ## Summary by CodeRabbit * New Features * Added a unified document update API (PUT) supporting name, metadata, parser/chunk settings, and status changes. * Breaking Changes * Legacy single-parameter rename endpoint removed; renames now require dataset + document identifiers. * `/list` now reads dataset id from a different query parameter. * Validation / Bug Fixes * Stricter meta_fields and parser-config validation; unauthenticated requests return 401. * Frontend * UI now sends dataset id when saving document names. * Tests * Numerous unit and HTTP tests adjusted or removed to match new API and validations. --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: MkDev11 <94194147+MkDev11@users.noreply.github.com> Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com> Co-authored-by: mkdev11 <MkDev11@users.noreply.github.com> Co-authored-by: Qi Wang <wangq8@outlook.com> Co-authored-by: dataCenter430 <161712630+dataCenter430@users.noreply.github.com> Co-authored-by: balibabu <cike8899@users.noreply.github.com>	2026-04-09 11:17:38 +08:00
balibabu	86900dca99	Refactor: Remove unused API code (#13978 ) ### What problem does this PR solve? Refactor: Remove unused API code ### Type of change - [x] New Feature (non-breaking change which adds functionality) ## Summary by CodeRabbit * Style * Updated table header styling in dataset settings by removing a hard-coded background color class, allowing the header to use default or inherited component styling instead. * Refactor * Removed token management endpoints from the API service. Token creation, listing, and removal functions are no longer available. * Removed the statistics data endpoint from available API routes.	2026-04-08 18:46:08 +08:00
balibabu	b8764cfa11	Fix: The document management table cannot be displayed. (#13967 ) ### What problem does this PR solve? Fix: The document management table cannot be displayed. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ## Summary by CodeRabbit * Bug Fixes * Improved table layout and overflow behavior in the files view to ensure proper scrolling and display. * Chores * Removed unused system status functionality and cleaned up service methods. * Updated TypeScript configuration for compatibility.	2026-04-08 11:37:27 +08:00
Yongteng Lei	b7daf6285b	Refa: Chat conversations /convsersation API to RESTFul (#13893 ) ### What problem does this PR solve? Chat conversations /convsersation API to RESTFul. ### Type of change - [x] Refactoring	2026-04-02 20:49:23 +08:00
Yongteng Lei	b622c47ed6	Refa: Chats /chat API to RESTFul (#13881 ) ### What problem does this PR solve? Refactor Chats /chat API to RESTFul. ### Type of change - [x] Refactoring	2026-04-01 20:10:37 +08:00
Liu An	b1d28b5898	Revert "Refa: Chats /chat API to RESTFul (#13871 )" (#13877 ) ### What problem does this PR solve? This reverts commit `1a608ac411`. ### Type of change - [x] Other (please describe):	2026-04-01 11:05:29 +08:00

176 Commits