Commit Graph

639 Commits

Author SHA1 Message Date
qinling0210
563d855780 Implement OpenAI chat completions in GO (#16177)
### What problem does this PR solve?

Implement OpenAI chat completions in GO

POST /api/v1/openai/<chat_id>/chat/completions

OpenAI chat cli: internal/development.md

### Type of change

- [x] Refactoring
2026-06-18 18:07:27 +08:00
Lynn
47bd9dd049 Fix: replace tenant_llm apis (#16131)
Replace tenant_llm apis with provider-instance apis.
2026-06-18 16:38:32 +08:00
buua436
78b4906f7a fix: tighten embedding truncation threshold (#16123)
### What problem does this PR solve?
Use a 95% max_length threshold before truncating embedding inputs, which
reduces the chance of provider-side invalid-parameter errors on
near-limit chunks.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-17 14:18:02 +08:00
euvre
fe46244d30 fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM (#16106)
The parser pods suffer from OOM kills when processing large PDF
documents. The root cause is in api/db/services/task_service.py: when
layout_recognize is not DeepDOC (e.g. Plain Text), page_size was set to
MAXIMUM_TASK_PAGE_NUMBER (100 million), causing the entire PDF to be
processed as a single task with all pages loaded into memory
simultaneously.

This PR fixes the issue by paginating non-DeepDOC PDF parsing tasks the
same way DeepDOC already does.
2026-06-17 09:33:53 +08:00
Wang Qi
17e3aad7ae Revert "fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM" (#16104)
Reverts infiniflow/ragflow#15951
2026-06-16 20:11:45 +08:00
euvre
d2a18d5c46 fix: paginate non-DeepDOC PDF parsing tasks to prevent OOM (#15951)
### What problem does this PR solve?

The parser pods suffer from OOM kills when processing large PDF
documents. The root cause is in api/db/services/task_service.py: when
layout_recognize is not DeepDOC (e.g. Plain Text), page_size was set to
MAXIMUM_TASK_PAGE_NUMBER (100 million), causing the entire PDF to be
processed as a single task with all pages loaded into memory
simultaneously.

This PR fixes the issue by paginating non-DeepDOC PDF parsing tasks the
same way DeepDOC already does.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
2026-06-16 20:07:19 +08:00
Kevin Hu
15f50e5cb2 fix: rename dialog_id to chat_id in chat_channel (backend + frontend) (#16096)
## Summary

- The `ChatChannel` DB column was renamed from `dialog_id` to `chat_id`
via a migration (added in a prior commit).
- Aligns the REST API layer (`chat_channel_api.py`,
`chat_channel_service.py`) to use `chat_id` consistently.
- Updates the frontend (`interface.ts`, `hooks.ts`,
`connect-dialog-modal.tsx`, `added-channel-card.tsx`) to read/write
`chat_id` instead of `dialog_id`.
- The joined `dialog_name` alias in the list query is unchanged (backend
still returns it under that name).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-16 19:02:20 +08:00
Lynn
70792de899 Fix: v0.26.1 model provider (#16073)
### What problem does this PR solve?

Fix:
- Pass session_id to langfuse.
- Get correct status for add model_type.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-16 16:21:43 +08:00
Kevin Hu
5a817762fa Refactor: Change table chat_channel status data type. (#16061)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring
2026-06-16 12:02:12 +08:00
buua436
8e235b7b95 fix: add legacy chat/completions mode (#16014)
### What problem does this PR solve?
Adds a legacy mode for /chat/completions that restores v0.23.0-style
output by converting start_to_think/end_to_think back into raw
<think></think> markers and streaming cumulative answer text.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-16 10:34:06 +08:00
Lynn
47495c1f6a Feat: model provider (#16028)
### What problem does this PR solve?

Feat:
- Allow upsert model_type for instance model

Fix:
- Allow create instance with duplicate api_key

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-06-15 19:10:33 +08:00
Wang Qi
f6a2075ad0 Fix one data source can be synced to multiple dataset (#16023)
Fix one data source can be synced to multiple dataset
Test add/delete - worked.
2026-06-15 16:54:25 +08:00
Yingfeng
b5bea72e4b Add git-like file commit API (#15978)
### What problem does this PR solve?

| # | Method | Endpoint | Description | Git Equivalent |
|---|--------|----------|-------------|----------------|
| 1 | `POST` | `/api/v1/{prefix}/{folder_id}/commits` | Create a
snapshot commit with file changes (add/modify/delete/rename) | `git add`
+ `git commit` |
| 2 | `GET` | `/api/v1/{prefix}/{folder_id}/commits` | List commit
history (paginated) | `git log` |
| 3 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}` | Get
commit detail with file changes | `git show` |
| 4 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files` |
List file changes in a commit | `git show --name-status` |
| 5 | `GET` |
`/api/v1/{prefix}/{folder_id}/commits/diff?from=...&to=...` | Compare
two commits and return differences | `git diff` |
| 6 | `GET` | `/api/v1/{prefix}/{folder_id}/changes` | Get uncommitted
changes (add/modify/delete) | `git status` |
| 7 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/tree` |
Get the folder tree snapshot at commit time | `git ls-tree` |
| 8 | `GET` |
`/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files/{file_id}/content`
| Get a file's content as it existed in a specific commit | `git show
HEAD:file` |
| 9 | `GET` | `/api/v1/{prefix}/{file_id}/versions` | Get version
history for a specific file across all commits | `git log -- file` |

Where `{prefix}/{id}` can be:
- `folders/{folder_id}` — direct folder access
- `workspaces/{workspace_id}` — alias of `folders/{folder_id}`
- `datasets/{dataset_id}` — resolves to the dataset's folder
- `memories/{memory_id}` — resolves to the memory's folder
- `skills/{skill_id}` — resolves to the skill's folder

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-06-15 11:19:56 +08:00
Kevin Hu
b5a426e6e0 Feat: chat channels — connect assistants to external messaging bots (#15850)
### What problem does this PR solve?

#15844

Adds a **Chat channels** capability so a RAGFlow assistant (Dialog) can
be exposed as a bot on external messaging platforms (Feishu/Lark,
Discord, Telegram, Slack, WeCom, LINE, etc.). An admin configures a bot
in the UI, connects it to an assistant, and inbound messages are
answered from that assistant's knowledge base — replies are delivered
back on the channel.

**Feishu/Lark is implemented and tested end-to-end.** Discord, Telegram,
LINE, and WeCom are scaffolded against the same interface; the remaining
listed channels are tracked as follow-ups.

### Design

**Backend**
- New `chat_channel` table (`tenant_id`, `name`, `channel`, `config`
JSON holding `{credential: {...}}`, `dialog_id`, `status`) +
`ChatChannelService` and RESTful CRUD under `/api/v1/chat_channels`.
- Channel framework under `api/channels/`: a `core` registry +
per-channel packages that self-register a builder and implement a common
`Channel` interface (`start`/`stop`/`send` + inbound normalization) over
`IncomingMessage`/`OutgoingMessage`.
- Embedded **reconcile loop** in `ragflow_server`
(`api/channels/bootstrap.py`): loads enabled bots, and
starts/stops/restarts them as rows change (no server restart needed).
Inbound messages run the connected dialog via the non-streaming
completion path, keeping per-end-user conversation history.
- Missing optional channel SDKs degrade gracefully (channel skipped with
a warning; others unaffected). Channel-level errors are logged, not
crashed.
- Feishu's WebSocket client runs in a dedicated thread with its own
event loop to avoid cross-loop/contextvars conflicts with the channel
runtime.

**Frontend**
- **Settings → Chat channels** panel: available-channels grid +
configured-bots list with add/edit/delete and a **Connect assistant**
popup that binds a bot to a dialog.
- Brand icons via simple-icons / reused shared data-source assets, with
colored fallbacks for brands not available.
- Route, sidebar entry, i18n (en/zh), and a top-nav segment-boundary fix
so the settings page no longer highlights the Chat tab.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### Notes
- DB: new `chat_channel` table is auto-created; `chat_channel.dialog_id`
is also covered by a `migrate_db` `alter_db_add_column` for existing
installs.
- Channel SDKs (`lark-oapi`, `discord.py`, `python-telegram-bot`,
`line-bot-sdk`, `wechatpy`, `aiohttp`) added to dependencies.
- Screenshots / per-channel credential docs to follow.

<img width="1338" height="1290" alt="Image"
src="https://github.com/user-attachments/assets/042cb2f9-0dad-4e6a-bcf7-43ced4bbd704"
/>

<img width="1344" height="738" alt="Image"
src="https://github.com/user-attachments/assets/373cd08e-ec40-4c67-9c51-4d948b1ba617"
/>

<img width="672" height="887" alt="Image"
src="https://github.com/user-attachments/assets/5a34953f-a9a3-4c1e-869e-5eff0dc64c84"
/>

---------
2026-06-12 18:21:30 +08:00
Jonathan Chang
de06c9a60b feat: Langfuse session grouping for multi-turn chat traces (#15679)
## Summary

This PR passes `session_id` into Langfuse trace observations so
multi-turn chat messages can be grouped under the same session in
Langfuse.

Changes include:
- Propagate `session_id` from chat/session APIs into
`dialog_service.async_chat`.
- Pass `session_id` into Langfuse `start_observation(...)`.
- Share Langfuse `trace_context` with chat, embedding, rerank, and TTS
model bundles where applicable.
- Add unit coverage to verify Langfuse observations receive
`session_id`.
- Update affected test stubs for the new optional Langfuse context
arguments.

## Related Issue
Closes: #15636 

## Change Type
- [x] Feature
- [x] Bug fix
- [x] Test
- [ ] Refactor
- [ ] Documentation
- [ ] Breaking change

## Real Behavior Proof

Before this change:

- Langfuse observations were created without `session_id`.
- Multi-turn chat traces could not be grouped by session in Langfuse.

After this change:

- Chat/session flows pass `session_id` into `async_chat`.
- Langfuse observations include `session_id`.
- Related model bundles receive shared trace context and session
metadata.

Validation result:

```bash
uv run python -m py_compile \
  api/db/services/tenant_llm_service.py \
  api/db/services/llm_service.py \
  api/db/services/dialog_service.py \
  api/db/services/conversation_service.py \
  api/apps/restful_apis/chat_api.py \
  test/unit_test/api/db/services/test_dialog_service_final_answer.py \
  test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py
```
Passed.

```bash
uv run pytest \
  test/unit_test/api/db/services/test_dialog_service_final_answer.py \
  test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py -q
```
Result:

```text
11 passed in 16.89s
```

```bash
git diff --check
```
Passed.
## Checklist

- [x] Analyzed the issue requirement.
- [x] Checked existing Langfuse trace integration.
- [x] Implemented only the requested session grouping behavior.
- [x] Added/updated unit tests.
- [x] Ran focused tests successfully.
- [x] Ran Python compile validation.
- [x] Ran whitespace diff validation.
2026-06-12 10:18:06 +08:00
bohdansolovie
381091df71 fix(dialog): guard async_ask() against empty or invalid kb_ids (#15530)
Fixes #15529 .

### Problem

`async_ask()` accessed `kbs[0]` without verifying that
`KnowledgebaseService.get_by_ids()` returned any knowledge bases. Empty
or stale `kb_ids` raised `IndexError`, which surfaced as HTTP 500 on
search/bot SSE endpoints.

### Fix

- Add an early guard when `kbs` is empty, yielding a final SSE error
event (consistent with `gen_mindmap()` in the same module).
- Add regression tests for empty `kb_ids` and deleted/invalid KB IDs.

### Test plan

- [ ] `pytest
test/unit_test/api/db/services/test_dialog_service_final_answer.py -k
"async_ask_empty or async_ask_stale"`
- [ ] Manual: `POST /api/v1/searchbots/ask` with invalid `kb_ids`
returns SSE error, not HTTP 500

---------

Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-11 15:52:59 +08:00
少卿
8e17a12990 fix: remove think text buffering for real-time reasoning stream (#15891)
Fix: remove think text buffering for real-time reasoning stream
2026-06-10 16:55:57 +08:00
Lynn
478c9846a1 Fix: model list (#15860)
### What problem does this PR solve?

Remove tenant_llm call in rag.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-10 14:59:57 +08:00
buua436
c1496ffd43 fix: propagate memory tenant id in task collect (#15837)
### What problem does this PR solve?
Propagate `tenant_id` from memory task messages into task collection so
refactored task execution can build a valid context.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-09 17:47:48 +08:00
DearsisHS
cbb3896aaa fix(api): guard missing row in SearchService.get_detail (#15622)
## Summary
`SearchService.get_detail` crashed with `AttributeError` (HTTP 500) when
no matching row existed, because it called `.first().to_dict()` before
the `if not search` guard — making that guard dead code.

## Root cause
`.first()` returns `None` when the query matches nothing (deleted search
app, or joined `User` not `VALID`). `None.to_dict()` raises before the
guard runs.

## Fix
```diff
             .first()
-            .to_dict()
         )
         if not search:
             return {}
-        return search
+        return search.to_dict()
```
Guard the `None` first, then serialize — restoring the intended `{}`
"not found" return that every caller (`search_api`, `bot_api`,
`chat_api`, `dataset_api_service`) already handles.

## Files changed
- `api/db/services/search_service.py`

## Verification
- `ruff check` — clean
- Logic: `.first()` → `None` now hits `return {}` instead of
`None.to_dict()`. Local full pytest not run (heavy RAG deps); CI
validates.

## Note
Implemented with LLM assistance (model: claude-opus-4-8).

Closes #15621

Co-authored-by: dearsishs <MCarter112116@outlook.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 23:01:28 +08:00
buua436
c8c890b06c fix: refine think stream parsing (#15745)
### What problem does this PR solve?
Refine the stream parsing for `<think>` / `</think>` so MiniMax and
DeepSeek-style chunking both flush in the right order without mixing
think and answer buffers.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-08 16:53:22 +08:00
qinling0210
c960dc2a4c Refine handling of POST /api/v1/datasets/search in GO (#15583)
### What problem does this PR solve?

Refine handling of POST /api/v1/datasets/search in GO

### Type of change

- [x] Refactoring
2026-06-08 11:49:37 +08:00
Wang Qi
aa9545e4c9 Revert "fix: duplicate document ingest guard" (#15707)
Reverts infiniflow/ragflow#15638
2026-06-05 17:45:29 +08:00
buua436
71649db3b0 fix: prevent duplicated post-think text (#15651)
### What problem does this PR solve?
This fixes duplicated post-think text in streamed chat responses. When
the model emits text immediately after `</think>`, the stream state now
advances its cursor correctly so the same visible prefix is not emitted
twice.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-05 13:21:26 +08:00
buua436
423fb6faae fix: duplicate document ingest guard (#15638)
### What problem does this PR solve?
When a document is rerun or updated concurrently, the previous
unconditional update could overwrite a newer task state.
This change adds an `update_time`-based optimistic lock so the update
only succeeds if the record has not been modified by another flow in the
meantime.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-04 17:57:51 +08:00
buua436
bbacb31226 Fix: think stream tail handling (#15582)
### What problem does this PR solve?

think stream tail handling
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-04 10:04:35 +08:00
Lynn
dc4b82523b Feat: tenant llm provider (#14595)
### What problem does this PR solve?

Python implementation of the Go-based model_provider API suite.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: bill <yibie_jingnian@163.com>
2026-05-29 17:39:41 +08:00
Jack
f0cb7a544b Refactor: Task Executor (#15154)
### What problem does this PR solve?

1. Break huge function into smaller pieces
2. Add unit test for the smaller pieces function
3. Layer-ed design
a. infra layer - task_context.py, recording_context.py,
write_operation_interceptor.py, ...
    b. service layer - *_service.py
    c. business layer - task_handler.py
4. Default behavior: use "refactor-ed version" - can switch to original
version by change env variable

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
- [x] Performance Improvement

---------

Co-authored-by: Liu An <asiro@qq.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-05-27 21:54:17 +08:00
Ahmad Intisar
e6068a7f7e Fix: table parser metadata (#15127)
### What problem does this PR solve?

This PR improves the table upload flow for CSV/Excel files by allowing
table column role configuration at upload time.

Previously, users had to:
1. Upload and parse a table file.
2. Open parser settings and manually set table column roles.
3. Re-parse the file for the roles to take effect.

This was inefficient and required an unnecessary second parse.

With this change:
1. When the knowledge base uses table parsing, the upload dialog
extracts CSV/Excel headers client-side.
2. Users can choose Auto mode or Manual mode.
3. In Manual mode, users can assign per-column roles before upload.
4. The selected parser config is sent with the upload request and
applied server-side during document creation.

Result: configured table column roles are applied from the first parse.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
2026-05-25 16:05:38 +08:00
Wang Qi
7e6844118b Fix search vector_similarity_weight (#15108)
### What problem does this PR solve?

Fix search vector_similarity_weight

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-22 16:05:13 +08:00
kpdev
faf77a5a8a feat(evaluation): track token usage in evaluation results (#13487)
## Summary

Implements the TODO in `evaluation_service.py`: **Track token usage** in
evaluation results.

## Changes

- **Import** `num_tokens_from_string` from `common.token_utils`
- **Prompt tokens**: Use the full prompt returned by `async_chat` when
available (includes system prompt + knowledge base + query), otherwise
fall back to the question token count
- **Completion tokens**: Count tokens in the generated answer
- **Storage**: Store `token_usage` as `{prompt_tokens,
completion_tokens, total_tokens}` in each `EvaluationResult` instead of
`None`

## Why

The evaluation pipeline previously saved `token_usage: None` for every
result. This change allows downstream consumers (e.g. evaluation
dashboards, cost tracking) to see approximate token usage per test case
using the same tokenizer (tiktoken cl100k_base) used elsewhere in
RAGFlow.

## Testing

- No new tests added; existing evaluation flow unchanged
- Token counting uses existing `num_tokens_from_string` utility

---------

Co-authored-by: kiannidev <kiannidev@users.noreply.github.com>
2026-05-22 15:19:53 +08:00
Full Stack Developer
8f90740d2e feat: pass chat_template_kwargs through agent chat completion (#14542)
### What problem does this PR solve?

The agent API currently does not pass chat_template_kwargs to the
underlying LLM call path, so clients cannot control template-level model
behavior (such as thinking-mode toggles) when invoking
/agents/chat/completion. This PR adds passthrough support for
chat_template_kwargs across agent execution flows (session and
non-session, streaming and non-streaming) by propagating it through
canvas runtime state and into LLM invocation kwargs. This addresses the
feature gap raised in [Issue
#14182](https://github.com/infiniflow/ragflow/issues/14182).

Closes #14182 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-22 15:15:49 +08:00
kingloon
da4eaf9fb0 Fix: remove duplicate function definitions (#15063)
### What problem does this PR solve?

Remove duplicate function definitions in
`api/db/services/dialog_service.py`.

**Problem:** Two helper functions were defined twice in the same file,
but with different parameter orders:

- First definition (line 57):
`_resolve_reference_metadata(request_payload=None, config=None)`
- Second definition (line 136): `_resolve_reference_metadata(config,
request_payload=None)`

**Solution:** Keep the second definition (which is actually used by
other modules) and remove the first one to avoid confusion.

Additionally, remove duplicate `_enrich_chunks_with_document_metadata`
definition (keep line 140 version).
<img width="1584" height="313" alt="image"
src="https://github.com/user-attachments/assets/7daee832-244f-4bb2-8488-e3b65012a3f9"
/>
<img width="1672" height="359" alt="image"
src="https://github.com/user-attachments/assets/4fd2f523-273c-4b20-a7c9-ab35740b7834"
/>


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2026-05-21 15:31:51 +08:00
bitloi
6499bce2a6 fix: Langfuse chat observation (#15026)
### What problem does this PR solve?

Closes #15025

Langfuse-enabled `dialog_service.async_chat()` regressed to
`langfuse_tracer.start_generation(...)` after the earlier Langfuse v4
migration. Langfuse v4 uses `start_observation(as_type="generation")`,
so the remaining `start_generation` call can fail when chat tracing is
enabled.

This restores the migrated `start_observation(as_type="generation")`
call for chat observations while preserving the existing trace context,
model, input payload, and update/end flow. It also adds a regression
test with a fake Langfuse v4-style client that exposes
`start_observation()` but not `start_generation()`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Tests

- `.venv/bin/pytest
test/unit_test/api/db/services/test_dialog_service_final_answer.py -q`
- `.venv/bin/ruff check api/db/services/dialog_service.py
test/unit_test/api/db/services/test_dialog_service_final_answer.py`
2026-05-20 15:01:19 +08:00
plind
f169ab4b39 feat(tts): cache synthesized speech in Redis to avoid redundant calls (#14851)
## What problem does this PR solve?

Closes #12017.

TTS output is deterministic for a given `(model, text)` pair, so
re-running the same text through the same TTS model produces the same
bytes — yet `Canvas.tts` and `dialog_service.tts` re-synthesized on
every request. That's slow and wastes provider quota whenever the same
assistant response is replayed, shared across users, or repeated within
a session.

### Change

New helper `rag/utils/tts_cache.py` with `synthesize_with_cache(tts_mdl,
cleaned_text)`:

- **Key:** `tts:cache:{model_id}:{sha256(text)}` — separate namespace
per model, identical cleaned text reuses a single entry across both call
sites.
- **Value:** the hex-encoded audio blob both call sites already
returned. No format change for downstream consumers.
- **TTL:** 7 days by default, configurable via
`RAGFLOW_TTS_CACHE_TTL_SECONDS`.
- **Failure modes:** a Redis hiccup falls back to direct synthesis; a
failed synthesis still returns `None` (existing contract preserved).


[`Canvas.tts`](https://github.com/infiniflow/ragflow/blob/main/agent/canvas.py#L683-L724)
and
[`dialog_service.tts`](https://github.com/infiniflow/ragflow/blob/main/api/db/services/dialog_service.py#L1367-L1380)
now route through the helper; the per-file bytes-accumulation/hex-encode
loop has been removed in favor of one shared implementation.

## Type of change

- [x] New Feature (non-breaking change which adds functionality)

## Test plan

- [ ] **Cache hit, chat path:** Configure a dialog with TTS enabled, ask
the same question twice with `stream=false`. Verify the second response
returns the same `audio_binary` and that the second invocation doesn't
hit the TTS provider (e.g., observe provider-side logs / usage counters;
check no `LLMBundle.tts can't update token usage` log line on the second
run).
- [ ] **Cache hit, agent path:** Same exercise via a Conversational
Agent that includes a Message component playing back the answer.
- [ ] **Cache isolation per model:** Switch tenant's `tts_id` between
two models, run the same text against each — confirm the second model's
first synthesis still happens (no cross-model hits).
- [ ] **TTL override:** Set `RAGFLOW_TTS_CACHE_TTL_SECONDS=120`, confirm
the entry expires after 2 minutes.
- [ ] **Redis unavailable:** Stop Redis (or break the connection).
Verify the TTS endpoint still works — synthesis falls back to direct
calls, with a `TTS cache lookup failed` / `TTS cache store failed`
warning logged.
- [ ] **Failure path:** Configure a TTS model with an invalid API key,
ensure the response still returns successfully with `audio_binary=None`
(no regression vs. current behavior).
2026-05-19 14:20:40 +08:00
kingloon
525a87be0f Misc: fix some typos (#14987)
### What problem does this PR solve?

Fix minor code quality issues:

1. Fix typo in assertion error message: "Can't fine" → "Can't find"
2. Remove duplicate line in common/connection_utils.py

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2026-05-19 10:47:06 +08:00
Magicbook1108
b69a6a5d80 Feat: full optimization on connector dashboard (#14979)
### What problem does this PR solve?

This PR improves the connector dashboard task management experience and
adds better visibility into connector execution logs.

### Overview:

#### Before
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/e4a8ed6f-2e18-4f0f-8528-41a514550052"
/>

#### Now:
<img width="700" alt="Screenshot from 2026-05-18 16-31-30"
src="https://github.com/user-attachments/assets/d4ca193b-847a-49ae-9e4f-5fbca60ea627"
/>

### 1. Add a new logging page to the connector dashboard

A new logging page has been added so users can view connector task
execution logs directly from the connector dashboard.

### 2. Merge the Resume button into Confirm

The separate **Resume** button has been removed. The **Confirm** button
now represents different actions depending on the current task state:

- **Save**: Save form changes and reschedule tasks.
- **Stop**: Cancel currently scheduled or running tasks.
- **Resume**: Create new scheduled tasks after the previous tasks have
been stopped.
- **Start**: Start tasks when no task has been started yet.

### 3. Separate syncing and pruning tasks

Connector tasks are now separated into **syncing** and **pruning**.

Pruning is controlled by the **Sync deleted files** option:

- When **Sync deleted files** is disabled, only syncing tasks are shown.
- When **Sync deleted files** is enabled, both syncing and pruning tasks
are shown.

**Now: Sync deleted files disabled**

<img width="700" alt="Sync deleted files disabled"
src="https://github.com/user-attachments/assets/dbd9232e-614a-407f-a0b1-c109e5fa567d"
/>

**Now: Sync deleted files enabled**

<img width="700" alt="Sync deleted files enabled"
src="https://github.com/user-attachments/assets/1f527f48-ccb3-4ee8-97ca-086891489296"
/>

### 4. Update logs in backend

<img width="700" alt="image"
src="https://github.com/user-attachments/assets/10a95a3f-98c1-4e67-8afa-ddf6cda5b0b2"
/>

### 5. Remove connector resume API

- Removed: `POST /v1/connectors/<connector_id>/resume`
- Replaced by: `PATCH /v1/connectors/<connector_id>`


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-19 10:07:11 +08:00
Jake Armstrong
93d3deb5e4 Fix admin CLI system variable commands (#14956)
## What

Fixes #12409.

Implements admin CLI support for:

- `list vars;`
- `show var <name-or-prefix>;`
- `set var <name> <value>;`

## Changes

- Wire Go CLI variable commands to the admin API.
- Support integer and quoted string values in `SET VAR`.
- Return variable rows as `data_type`, `name`, `setting_type`, and
`value`.
- Add exact-name lookup with prefix fallback for `SHOW VAR`.
- Validate values by stored data type: `string`, `integer`, `bool`, and
`json`.
- Keep the legacy Python admin CLI/server behavior aligned.
- Update admin CLI docs and add focused tests.

## Verification

- `go test -count=1 ./internal/cli`
- `python3.12 -m py_compile admin/server/services.py
admin/server/routes.py api/db/services/system_settings_service.py
admin/client/parser.py admin/client/ragflow_client.py`
- Python admin CLI parser smoke test for `SET VAR`, quoted values, `SHOW
VAR`, and `LIST VARS`.
- Attempted `./run_go_tests.sh`; local environment is missing native
tokenizer/linker artifacts:
  - `internal/cpp/cmake-build-release/librag_tokenizer_c_api.a`
  - `-lstdc++`

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-05-18 19:08:45 +08:00
Hamza Amin Khokhar
2dbe3b8a62 fix: metadata_condition returning all docs when filter matches nothing (#14967)
### What problem does this PR solve?

When _parse_doc_id_filter_with_metadata returns [], the empty list is
falsy so the WHERE id IN (...) clause was silently skipped, causing the
full dataset to be returned instead of an empty result.

Change `if doc_ids:` to `if doc_ids is not None:` in both get_list() and
get_by_kb_id() to distinguish between no filter (None) and a filter that
matched zero documents ([]).

Fixes #14962

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-18 18:54:30 +08:00
dev
b12eaee38b fix(api): enforce tenant access for connector routes (#14747)
### What problem does this PR solve?

Fixes #14746.

Adds tenant access checks for connector-by-id REST routes before reading
connector details, mutating connector config/status, deleting
connectors, rebuilding, or listing sync logs. Unauthorized callers now
receive `RetCode.AUTHENTICATION_ERROR` with `No authorization.` without
reaching the connector/log mutation paths.

Validation:
- `python3 -m pytest
--confcutdir=test/testcases/test_web_api/test_connector_app
test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py`
- `uvx ruff check api/apps/restful_apis/connector_api.py
api/db/services/connector_service.py
test/testcases/test_web_api/test_connector_app/test_connector_routes_unit.py`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: dev111-actor <dev111-actor@users.noreply.github.com>
2026-05-18 16:09:26 +08:00
Wang Qi
56d73d0c2c Refactor: speed up ragflow server, save startup memory (#14973)
### What problem does this PR solve?

Refactor: speed up ragflow server, save startup memory, saved 200MiB,
and 5-9 seconds start time.

##### Before
1241292  |   |           \_ python3 api/ragflow_server.py
RAGFlow server is ready after 25.61845850944519s initialization.

##### After
1019968  |   |           \_ python3 api/ragflow_server.py
RAGFlow server is ready after 16.205134391784668s initialization.

### Type of change

- [x] Refactoring
2026-05-18 15:55:59 +08:00
qinling0210
f1d2383572 Push metadata filters down to Infinity (#14974)
### What problem does this PR solve?

Push metadata filters down to Infinity

### Type of change

- [x] Refactoring
2026-05-18 14:22:04 +08:00
Kevin Hu
7cdc74bbe5 Refactor: Drop the vector fetch for ES (#14970)
## Summary
- Stop pulling chunk vectors (`q_*_vec`) back from Elasticsearch in the
main retrieval path. ES already knows them; shipping them was pure
bandwidth/memory overhead.
- Recover the per-chunk cosine similarity via a second KNN-only ES call
filtered by the candidate chunk ids. The new `_score` is merged with
locally computed term similarity using the user-configured
`vector_similarity_weight`.
- Lazily fetch the chunk embedding only for the chunks
`insert_citations` actually needs.

## Details
**`rag/nlp/search.py`**
- `Dealer.search`: no longer appends `q_*_vec` to the ES select list.
OceanBase still gets it (its rerank path is unchanged).
- New `Dealer._knn_scores(sres, idx_names, kb_ids)`: a `MatchDenseExpr`
over the cached query vector filtered by `id IN sres.ids`, returning
`{chunk_id: cosine_score}` via ES `_score`.
- New `Dealer.rerank_with_knn(...)`: term similarity from
`qryr.token_similarity` plus the ES-supplied KNN score, combined with
`tkweight`/`vtweight` and the existing rank-feature bonus.
- New `Dealer.fetch_chunk_vectors(chunk_ids, tenant_ids, kb_ids, dim)`:
on-demand vector fetch for citation use.
- `Dealer.retrieval` routes Infinity → unchanged, OceanBase → existing
local `rerank`, ES → new KNN-score path.

**`common/doc_store/es_conn_base.py`**
- New `get_scores(res)` helper returning `{_id: _score}` directly from
hit headers (ES doesn't surface `_score` through `get_fields`).

**`api/db/services/dialog_service.py`**
- New top-level `_hydrate_chunk_vectors(...)` helper. On ES it
back-fills `ck["vector"]` from `fetch_chunk_vectors` right before
`insert_citations`. No-op on Infinity / OB (their chunks already carry
vectors).
- Both `decorate_answer` closures became `async` and are `await`-ed at
all call sites in `async_chat` and `async_ask`.

## Backend behavior
| Backend | Returns chunk vec in main search | Sim source | Vectors for
citations |
|---|---|---|---|
| ES | No | second KNN call (`_score`) merged with term sim | fetched on
demand |
| Infinity | No (unchanged) | normalized `_score` | already on chunks |
| OceanBase | Yes (kept) | local hybrid rerank | already on chunks |

## Test plan
2026-05-18 14:21:56 +08:00
Rene Arredondo
9f2fb4611f Fix: guard empty/whitespace embedding inputs in LLMBundle (#14428) (#14924)
Closes #14428 


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-18 14:11:54 +08:00
dale053
bd99a22661 fix: atomic chunk/token counter updates for documents and knowledge b… (#14867)
### What problem does this PR solve?

Fixes #14866.

Previously, `DocumentService.increment_chunk_num` and
`decrement_chunk_num` updated the `Document` row and its parent
`Knowledgebase` row in two separate, non-transactional statements. If
the second update failed (DB error, connection drop, etc.) after the
first one succeeded, the document and knowledge base chunk/token
counters would drift apart and stay inconsistent.

There was also a behavioral asymmetry between the two methods:

- `increment_chunk_num` only logged a warning when the document row was
missing and returned a value that callers usually treated as success.
- `decrement_chunk_num` raised `LookupError` in the same situation.

This PR makes the counter updates atomic and aligns the missing-document
behavior between the two methods:

- Wrap the `Document` and `Knowledgebase` updates in
`increment_chunk_num` / `decrement_chunk_num` inside a `DB.atomic()`
block so both succeed or both roll back together.
- Raise `LookupError` from `increment_chunk_num` when the target
document no longer exists, matching `decrement_chunk_num`.
- Update `reset_document_for_reparse` in `document_api_service.py` to
catch the new `LookupError` and return a proper "Document not found!"
API error instead of propagating the exception.

No schema changes, no API contract changes for the success path; only
the failure mode for a missing document during reparse is now a clean
error response instead of an uncaught exception.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-14 14:48:52 +08:00
Ethan T.
ba8cb9dd4a fix: replace mutable default arguments with None in LLM chat models (#13513)
## Summary
- Replace `gen_conf={}` with `gen_conf=None` + guard in
`rag/llm/chat_model.py` (12 instances across Base, BaiChuanChat,
LocalLLM, MistralChat, ReplicateChat, BaiduYiyanChat, GoogleChat
classes)
- Replace `doc_ids=[]` with `doc_ids=None` + guard in
`api/db/services/document_service.py` (1 instance)
- Mutable default arguments are shared across all calls, causing
potential cross-request state contamination
- See Python docs:
https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects

## Test plan
- [x] Verify LLM calls work with and without explicit gen_conf
- [x] No behavior change for existing callers — `None` is replaced with
`{}` at function entry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-05-14 14:46:47 +08:00
Br1an
d46bbd30f7 Fix: send input and output token usage to Langfuse (#13294)
### What problem does this PR solve?

Closes #9837

The Langfuse integration currently only sends the output text to
`langfuse_generation.update()` without including token usage
information. This means Langfuse cannot track input/output token
consumption for cost analysis and monitoring.

### Solution

Add the `usage` parameter to `langfuse_generation.update()` with:
- `input`: approximate input token count from `message_fit_in()`
- `output`: approximate output token count from
`num_tokens_from_string(answer)`
- `total`: sum of input and output

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-05-14 13:11:37 +08:00
plind
dd76653dc1 feat: add tag management for Agents with filtering and sorting (#14774) (#14799)
## Summary
Closes #14774.

Adds free-form tags on agents (UserCanvas) with full UI + API:
- Stored as comma-separated `tags` column on `UserCanvas` with online
migration.
- New endpoints: `GET /v1/agents/tags` (aggregate counts) and `PUT
/v1/agent/<id>/tags` (write). `GET /v1/agents` accepts a `tags=` query.
- "Edit tags" item in agent dropdown opens a chip-style editor dialog;
tags render as badges on each agent card.
- New "Tags" facet in the agents filter bar, with counts.

## Implementation notes
- **Tag matching is exact-token**: the SQL filter wraps stored tags as
`,…,` and matches `,ml,` so `ml` doesn't match `ml-ops`.
- **Server-side normalization** in `UserCanvasService.update_tags`:
dedup (case-insensitive), per-tag cap of 64 chars, total length capped
at 512 chars to fit the column, commas inside tag values are replaced
with spaces.
- **Tenant authorization**: `PUT /v1/agent/<id>/tags` gates on
`UserCanvasService.accessible(canvas_id, tenant_id)`.
- **Tag listing scope**: `UserCanvasService.list_tags` follows the same
own + team-shared rule as `get_by_tenant_ids`.
- **i18n**: keys added to `en.ts` and `zh.ts` only (per project
convention; other locales fall back).
- **`HomeCard`** gets a non-breaking `extra?: ReactNode` slot for the
chip row; no `src/components/ui/` files modified.

## Test plan
- [ ] Backend boot runs `migrate_db` → confirm `user_canvas.tags` column
exists (`DESCRIBE user_canvas`).
- [ ] Agents page renders cards normally (no console error from missing
field).
- [ ] `⋯ → Edit tags` opens a dialog that stays open (regression: dialog
was unmounting with the dropdown).
- [ ] Typing a tag without pressing Enter and clicking Save persists it
(regression: last typed tag was being dropped).
- [ ] Chip input supports Enter/comma to commit, Backspace on empty to
remove, `×` to remove individual chip.
- [ ] Tag containing a comma sent via API is stored with the comma
replaced by a space.
- [ ] 20 long tags sent via API does not error (length cap silently
truncates).
- [ ] "Tags" filter in the filter bar shows counts and narrows the list.
- [ ] Filtering by `ml` does **not** return agents tagged `ml-ops`.
- [ ] UI in Chinese shows 编辑标签 / 添加标签以整理和筛选你的智能体 etc.
- [ ] `PUT /v1/agent/<other-tenant-id>/tags` returns `Agent not found or
no permission.`
2026-05-13 21:41:32 +08:00
tmimmanuel
663fc1d42c fix(opensearch): implement doc-meta dispatch surface on OSConnection (#14577)
### What problem does this PR solve?

Fixes #14570. On OpenSearch backends (`DOC_ENGINE=opensearch`) every
document-metadata write failed with `'OSConnection' object has no
attribute 'create_doc_meta_idx'`, so both `PATCH
/api/v1/datasets/{ds}/documents/{doc}` with `meta_fields` and `POST
/api/v1/datasets/{ds}/metadata/update` were unusable while every other
document operation (retrieval, parsing, name update, chunk management)
worked correctly on the same OpenSearch cluster.

The bug runs deeper than the missing method name in the error message
suggests. `DocMetadataService` also reached into
`settings.docStoreConn.es.*` directly for the index refresh, the
scripted partial update, and the count call, which means that even after
adding `create_doc_meta_idx` to `OSConnection` the very next call in the
same metadata flow would still raise `AttributeError` because
`OSConnection` exposes `self.os` rather than `self.es`. Fixing only the
reported symptom would have moved the failure one line down without
restoring the feature.

This PR adds a uniform document-metadata dispatch surface to both
connection classes so they present the same abstract API, and routes the
service layer through that surface via `getattr` guards instead of
poking at backend-specific attributes. The four new methods on
`OSConnection` and `ESConnectionBase` are `create_doc_meta_idx`,
`refresh_idx`, `count_idx`, and `replace_meta_fields`.
`OSConnection.create_doc_meta_idx` reuses the existing
`conf/doc_meta_es_mapping.json` schema in the OpenSearch `body=` form
because OpenSearch and Elasticsearch share the same index-creation
payload, and `replace_meta_fields` emits a full scripted assignment
(`ctx._source.meta_fields = params.meta_fields`) on both backends so
removed keys actually disappear instead of being preserved by deep-merge
semantics.

The `getattr`-guarded dispatch in `DocMetadataService` keeps the
existing fall-through paths intact for Infinity and OceanBase, which
continue to rely on their search-based count fallback and on the
delete-then-insert metadata replacement they used before, so this change
is strictly additive for those two backends.

Verification: `pytest
test/unit_test/rag/utils/test_opensearch_doc_meta.py` runs 16 new unit
tests that pass locally and pin the `OSConnection` dispatch surface, the
`create_doc_meta_idx` short-circuit when the index already exists, the
mapping-file payload routing, the `IndicesClient.create` failure path,
the `refresh_idx` and `count_idx` success and error sentinels, and the
full-assignment script emitted by `replace_meta_fields`. The test module
stubs `common.settings` and `rag.nlp` at import time so the suite runs
without the heavy backend SDKs that the rest of the repository pulls in
transitively.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: tmimmanuel <tmimmanuel@users.noreply.github.com>
2026-05-11 17:04:28 +08:00
box4wangjing
292b0b8bce chore: fix some comments to improve readability (#14756)
### What problem does this PR solve?

fix some comments to improve readability

### Type of change

- [x] Documentation Update

---------

Signed-off-by: box4wangjing <box4wangjing@outlook.com>
2026-05-11 16:48:48 +08:00