### What problem does this PR solve?
`_get_optimal_clusters` in `rag/raptor.py` had two edge-case issues in
GMM cluster-count selection:
1. It used `np.arange(1, max_clusters)`, which never evaluates the
upper-bound candidate (`max_clusters`).
2. When effective `max_clusters` becomes `1`, the candidate list was
empty and `argmin` crashed.
This PR makes candidate evaluation inclusive (`1..max_clusters`) and
guards the single-cluster case by returning `1` directly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Validation
- `pytest test/unit_test/rag/test_raptor_psi_tree_builder.py
--config-file pyproject.toml -q`
- `ruff check rag/raptor.py
test/unit_test/rag/test_raptor_psi_tree_builder.py`
### Tests added
- Regression test for `max_cluster == 1` path (no crash, returns 1)
- Regression test verifying upper-bound candidate is evaluated and can
be selected
_AI-assistance disclosure: parts of this change (bug triage and test
scaffolding) were drafted with AI assistance and fully reviewed and
verified by me._
---------
Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
### What problem does this PR solve?
- Tools management
- Pregel engine wrapper for better usage
- UT race
- Coding style
### Type of change
- [x] Refactoring
### What problem does this PR solve?
```
RAGFlow(admin)> show model 'abc';
+------------+----------------------------------------------------------------+
| field | value |
+------------+----------------------------------------------------------------+
| command | get_model_by_model_name |
| error | 'get model by model name' is implemented in enterprise edition |
| model_name | abc |
+------------+----------------------------------------------------------------+
RAGFlow(admin)> list models;
+-----------------+--------------------------------------------------------+
| command | error |
+-----------------+--------------------------------------------------------+
| list_all_models | 'list all models' is implemented in enterprise edition |
+-----------------+--------------------------------------------------------+
```
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Implement:
1. `/api/v1/datasets/<dataset_id>/documents/<document_id>/chunks GET`
2.
`/api/v1/datasets/<dataset_id>/documents/<document_id>/chunks/<chunk_id>
PATCH`
3. `/api/v1/datasets/<dataset_id>/documents/<document_id>/chunks PATCH`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Provider default public key for CLI
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Refactor the Go agent port's logging so every log line — gin access,
agent canvas events, harness warnings, fatal boot errors — flows through
a single common.Logger (zap) backed by a rotated file, with structured
fields, level filtering, and configurable rotation.
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Problem
`thread_pool_exec()` dispatches work via `loop.run_in_executor()`, which
submits the callable with a plain `executor.submit(func, *args)` and
does **not** copy the caller's `contextvars.Context`. So a `ContextVar`
set in the async caller is not visible inside the function running in
the worker thread.
This differs from `asyncio.to_thread()`, which runs the callable inside
a copied context. `run_in_executor()` has never propagated context
(verified on Python 3.12 and 3.13) — so this is a pre-existing gap in
the helper, **not** a regression or a Python-version compatibility
issue.
Concretely, any code that sets a `ContextVar` in async code and reads it
inside a function dispatched via `thread_pool_exec` (request tracing,
per-task state, Langfuse trace propagation, etc.) silently loses that
context.
## Fix
Copy the current context before submitting and run the callable inside
it with `ctx.run()`, matching what `asyncio.to_thread()` does:
```python
async def thread_pool_exec(func, *args, **kwargs):
loop = asyncio.get_running_loop()
ctx = contextvars.copy_context()
if kwargs:
inner = functools.partial(func, *args, **kwargs)
return await loop.run_in_executor(_thread_pool_executor(), ctx.run, inner)
return await loop.run_in_executor(_thread_pool_executor(), ctx.run, func, *args)
```
This explicitly **adds** ContextVar propagation to the helper (it does
not restore any prior behavior). Backward-compatible.
## Tests
`TestThreadPoolExec` covers propagation, the kwargs path, per-call
isolation and the unset-default case.
> Note: the branch name still contains `python313` for historical
reasons; the change is unrelated to any Python version.
### What problem does this PR solve?
Updated the release workflow to install SIMDe headers into the MSYS2
toolchain include directory. Adjusted CMake flags to remove references
to the previous SIMDE_INCLUDE_DIR.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Stabilizes the Go unit-test surface so the test suite can run reliably
in CI and locally via \`bash build.sh --test\`.
## Verification
\`\`\`bash
bash build.sh --test -- -count=10 -run TestWithCancel_SequentialAgent
./internal/harness/core/
bash build.sh --test -- -count=5 -run TestSiliconflowChatExtracts
./internal/entity/models/
bash build.sh --test # full suite
\`\`\`
All previously failing packages (\`admin\`, \`cli\`, \`handler\`,
\`parser\`,
\`router\`, \`service\`, \`service/chunk\`) now build and test
successfully.
\`TestWithCancel_SequentialAgent\` passes 10/10 (was flaky). SiliconFlow
reasoning test passes after switching the assertion to the SiliconFlow
wire
format.
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
Several keys added in recent releases were missing from the French
(`fr.ts`) locale file.
- **`top`** — missing in both the common section and the dataset section
- **Chat channels** — all UI strings for the new chat channels feature
(`chatChannels`, `chatChannelDesc.*`, `connectDialog`, `notConnected`,
etc.)
- **Username validation** — `usernameMaxLength`,
`usernameInvalidCharacters`
- **Model editing** — `editCustomModelTitle`
## Changes
- `web/src/locales/fr.ts` — 47 lines added, no other files touched
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
### What problem does this PR solve?
```
RAGFlow(admin)> show role 'user' default models;
+--------------------------+-----------------------------------------------------------------+-----------+
| command | error | role_name |
+--------------------------+-----------------------------------------------------------------+-----------+
| show_role_default_models | 'show role default models' is implemented in enterprise edition | user |
+--------------------------+-----------------------------------------------------------------+-----------+
RAGFlow(admin)> set role 'user' default chat 'glm4.5@test@zhipu-ai';
+------------+---------------------------------------------------------------+
| field | value |
+------------+---------------------------------------------------------------+
| model_id | |
| model_type | chat |
| role_name | user |
| command | set_role_default_model |
| error | 'set role default model' is implemented in enterprise edition |
+------------+---------------------------------------------------------------+
RAGFlow(admin)> reset role 'user' default chat;
+------------+-----------------------------------------------------------------+
| field | value |
+------------+-----------------------------------------------------------------+
| command | reset_role_default_model |
| error | 'reset role default model' is implemented in enterprise edition |
| model_type | chat |
| role_name | user |
+------------+-----------------------------------------------------------------+
```
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
## Summary
- add public Go route for `/api/v1/searchbots/detail`
- implement beta-token auth flow for shared search access
- add tenant-based access check for shared search apps
- add joined search detail query for the share response
- align Go response shape with the current Python runtime behavior
- add DAO / service / handler tests for the new endpoint
close#16132
## Summary
This PR completes the Go-side merge and cleanup for chat channel APIs,
including handler/service wiring, route registration, and test coverage.
Implemented and aligned 5 chat channel APIs:
```
- POST `/api/v1/chat-channels`
- GET `/api/v1/chat-channels`
- GET `/api/v1/chat-channels/:channel_id`
- PATCH `/api/v1/chat-channels/:channel_id`
- DELETE `/api/v1/chat-channels/:channel_id`
```
Co-authored-by: Haruko386 <tryeverypossible@163.com>
### What problem does this PR solve?
```
fixed:
RAGFlow(api/default)> use model 'minimax-m2.5@test@minimax'
SUCCESS
RAGFlow(api/default)> chat message 'who r u'
Answer: Hey! I'm MiniMax-M2.5, an AI assistant here to help you with questions, tasks, or whatever you need. What can I do for you?
Time: 1.727263
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Removed references to 'simde' from the package lists and updated paths
for compiler detection and CMake configuration to ensure proper handling
of Windows executables.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
As title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Closes#15139.
The "+ Add" flow in the Set/Edit Metadata modal posted updates with an
empty value, so backend saves were silent no-ops and the document's "X
fields" count stayed at 0 despite a "Success" toast.
The value `<Input>` updates `tempValues` synchronously per keystroke but
only writes through to `metaData.values` on blur (via
`handleValueBlur`). When the user clicks the nested modal's Confirm
button without first blurring, the click handler races the blur and
`handleSave` closes over the pre-blur `metaData.values` — still the
initial `['']`. `addUpdateValue` then queues an empty-string update; the
auto-fire save sends it, and after `resetOperations()` the outer Save
button posts `updates: []`.
Read from `tempValues` instead so the queued update carries the typed
value.
Regression test in `tests/use-manage-values-modal.test.ts` asserts that
`handleSave` passes the typed value (not the pre-blur empty string) to
`addUpdateValue` in the add-new code path.
### What problem does this PR solve?
Updated rust_target and added simde support for Windows builds. Modified
CMake commands to include simde and adjusted paths for compilers.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Updated MSYS2 package list for Windows builds and added Rust target
specifications. Modified build steps for office_oxide and rag tokenizer
libraries to improve compatibility and streamline the build process.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## What
Fixes#16008 — tables contained in a DOCX are silently dropped when the
document is parsed with the **laws** chunking method.
## Root cause
`Docx.__call__` in `rag/app/laws.py` iterated `self.doc.paragraphs`,
which only yields paragraph elements. Tables are separate `tbl` blocks
in the document body, so they were never visited and were lost from the
output. (The `naive` parser already handles tables by iterating the
document body.)
## Changes
- Iterate `self.doc._element.body` so tables are visited in document
order alongside paragraphs.
- Add a `__table_to_html` helper that renders each table to HTML,
including merged-cell `colspan` detection (mirrors the `naive` parser's
logic).
- Inject each table into the section tree with a sentinel level deeper
than any heading, so `Node.build_tree` merges it into its **enclosing
section** — keeping the chapter/article title path as retrieval context
rather than producing an orphaned chunk.
- Guard the `h2_level` computation against an empty heading set, so a
tables-only or empty DOCX no longer raises `IndexError`.
This keeps the laws parser's hierarchical chunking **and** adds table
extraction, so users no longer have to choose between losing structure
(naive) or losing tables (laws).
## Tests
Adds `test/unit_test/rag/test_laws_docx_tables.py` covering:
- table content is preserved and carries its section title path,
- merged adjacent cells collapse to `colspan`,
- tables-only document does not crash,
- empty document returns `[]`.
All four pass; `ruff check` / `ruff format` are clean.
### What problem does this PR solve?
- list configs;
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- list resources
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Remove unused file
2. Remove duplicate models
3. Resort the function order
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What does this PR do?
This PR adds a new DingTalk chat channel integration and hardens the
inbound callback path.
### Summary
- Adds DingTalk as a selectable chat channel in the UI and backend
channel registry.
- Adds the DingTalk chat channel icon asset.
- Acknowledges DingTalk Stream callbacks and deduplicates repeated
inbound messages to avoid duplicate replies.
### What problem does this PR solve?
Implement OpenAI chat completions in GO
POST /api/v1/openai/<chat_id>/chat/completions
OpenAI chat cli: internal/development.md
### Type of change
- [x] Refactoring
### What problem does this PR solve?
1. add modelID for delete_model and update_status
2. fix the bug when update-status delete model
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
RAGFlow(admin)> show tasks summary;
+---------+-----------------------------------------------------------------+
| field | value |
+---------+-----------------------------------------------------------------+
| command | show_users_quota_summary |
| error | 'Show users quota summary' is implemented in enterprise
edition |
+---------+-----------------------------------------------------------------+
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>