186 Commits

Author SHA1 Message Date
Wang Qi
6e82e2726d Guard /datasets/{dataset_id}/chunks cannot parse ingestion pipeline, use /documents/ingest instead (#16395) 2026-06-29 13:45:29 +08:00
euvre
a339e8a579 feat: handle partial upload success in document batch upload (#16438) 2026-06-29 13:06:14 +08:00
jony376
8fb692f10a fix(agent): enforce document access on POST /api/v1/agents/rerun (#15145)
## Related issues

Closes #15144

### What problem does this PR solve?

`POST /api/v1/agents/rerun` loaded a pipeline operation log by UUID via
`PipelineOperationLogService.get_documents_info` with no authorization,
then wiped chunks, reset document counters, deleted tasks, and re-queued
dataflow for the victim document.

Any authenticated user who knew a victim's pipeline log id could disrupt
parsing on documents they did not own.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Changes

| File | Change |
|------|--------|
| `api/apps/restful_apis/agent_api.py` | Call
`DocumentService.accessible(doc["id"], tenant_id)` before destructive
rerun operations; deny with generic `"Document not found."` |
|
`test/unit_test/api/apps/restful_apis/test_rerun_agent_authorization.py`
| Unit tests: cross-tenant log rejected, missing/unauthorized same
message, authorized rerun proceeds |

### Security notes

- **CWE-639:** Closes cross-tenant pipeline rerun / chunk wipe via
leaked log UUID.
- `tenant_id` from `@add_tenant_id_to_kwargs` is `current_user.id`;
`DocumentService.accessible` covers team-shared KBs.

### Test plan

- [ ] `pytest
test/unit_test/api/apps/restful_apis/test_rerun_agent_authorization.py`
- [ ] Manual: attacker cannot rerun victim pipeline log id

```bash
cd ragflow
uv run pytest test/unit_test/api/apps/restful_apis/test_rerun_agent_authorization.py -q
```

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:17 +08:00
Tim Wang
f0f10b6092 Fix: UserFillUp interactive forms not working in agent explore mode (#14589)
## Summary

- **Backend**: `_iter_session_completion_events` in `agent_api.py` was
filtering out `user_inputs` and `workflow_finished` SSE events, causing
agents with UserFillUp components to silently fail in explore mode — the
interactive form never appeared, while the same agent worked correctly
in run (editor) mode.
- **Frontend**: `SessionChat` component in explore mode was missing
`DebugContent` children rendering inside `MessageItem`, so even if the
backend forwarded the events, the form UI would not render. Added
`DebugContent`, `MarkdownContent`, `useAwaitCompentData` hook, and
input-disabling logic to match the run mode's `chat/box.tsx` behavior.

## What was changed

### Backend (`api/apps/restful_apis/agent_api.py`)
- Line 266: Added `"user_inputs"` and `"workflow_finished"` to the
allowed event filter in `_iter_session_completion_events`

### Frontend (`web/src/pages/agent/explore/components/session-chat.tsx`)
- Added imports: `DebugContent`, `MarkdownContent`,
`useAwaitCompentData`, `useParams`
- Added `sendFormMessage` from `useSendSessionMessage()` hook
- Added `useAwaitCompentData` hook for form state management
- Added `DebugContent` as `MessageItem` children for the latest
assistant message (renders UserFillUp form)
- Added `MarkdownContent` + submitted values display for previous
assistant messages
- Updated `NextMessageInput` disabled states to respect `isWaitting`
(form submission in progress)

## Test plan

- [x] Agent with UserFillUp component (e.g., email draft with
send/edit/cancel options) shows interactive form in **explore mode**
- [x] Same agent continues to work correctly in **run (editor) mode**
- [x] Form submission sends data back to the agent and workflow
continues
- [x] Input field is disabled while waiting for form submission
- [ ] Agents without UserFillUp components are unaffected in explore
mode

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:17 +08:00
kpdev
212429bf9d fix(api): gate sandbox artifact download on agent session ownership (#16169)
Fixes #16168

## Summary
- Add session-scoped authorization for `GET
/api/v1/documents/artifact/<filename>`
- Allow download only when the artifact filename appears in the caller's
`api_4_conversation` message and
`UserCanvasService.accessible(dialog_id, user_id)` passes
- Deny with generic `"Artifact not found."` before storage access (no
cross-user enumeration)
- Return 4xx when the blob is missing (existing behavior preserved)

## Approach
Sandbox artifacts are runtime CodeExec outputs, not KB documents — this
uses the same session gate pattern as `agent_chat_completion`, not
`DocumentService.accessible`.

## Test plan
- [x] Unit: denied when filename not referenced in user sessions
- [x] Unit: denied when agent canvas is not accessible
- [x] Unit: authorized user receives bytes; missing blob returns
`"Artifact not found."`
- [ ] `pytest
test/testcases/test_web_api/test_document_app/test_document_metadata.py
-k get_artifact`

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:16 +08:00
Renzo
6079ded70b fix: require explicit anonymous webhook access (#14890)
### What problem does this PR solve?

Fixes #14882

Agent webhook execution currently fails open when the saved webhook
`security` block is missing/empty, or when `auth_type` is set to `none`.
This allows unauthenticated webhook invocation without an explicit
operator opt-in.

This PR makes anonymous webhook access explicit:
- Rejects missing or empty webhook security config.
- Requires `allow_anonymous: true` when `auth_type` is `none`.
- Preserves explicit anonymous webhooks by having the frontend serialize
`allow_anonymous: true` when the user selects `None` auth.
- Updates webhook unit tests to cover both denied implicit-anonymous
configs and allowed explicit-anonymous configs.

### Type of change

- [x] Bug Fix
- [x] Security hardening
- [x] Test

### Tests

- [x] `ZHIPU_AI_API_KEY=dummy uv run python -m pytest
--confcutdir=test/testcases/test_web_api/test_agent_app
test/testcases/test_web_api/test_agent_app/test_agents_webhook_unit.py`
- [x] `uv run ruff check api/apps/restful_apis/agent_api.py
test/testcases/test_web_api/test_agent_app/test_agents_webhook_unit.py`
- [x] `npm exec eslint src/pages/agent/utils.ts
src/pages/agent/form/begin-form/schema.ts`

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:16 +08:00
philluiz2323
43a9d53c72 fix(agent): enforce tenant ownership on agentbots completions/inputs (#15457)
### What problem does this PR solve?

Fixes #15456.

The SDK agent-bot routes `POST /api/v1/agentbots/<agent_id>/completions`
and `GET /api/v1/agentbots/<agent_id>/inputs`
(`api/apps/restful_apis/bot_api.py`) authenticate the caller with a beta
API token — which only yields the caller's `tenant_id` — but then load
and run the agent named in the URL **without verifying the agent belongs
to the caller's tenant**. `UserCanvasService.get_agent_dsl_with_release`
even accepts a `tenant_id` it never uses, and `begin_inputs` calls
`get_by_id` directly. Any holder of a single valid beta token could
therefore run another tenant's agent (leaking its DSL/prompts/tool
config) or read another tenant's agent metadata and begin input form,
just by substituting a victim `agent_id`.

This PR adds the project's existing ownership gate,
`UserCanvasService.accessible(agent_id, tenant_id)`, to both endpoints
right after token authentication — mirroring the checks already enforced
on the equivalent first-party routes in
`api/apps/restful_apis/agent_api.py` (lines 75/578/775) and on the
sibling `chatbot_completions` / `create_agent_session` /
`delete_agent_session` handlers in the same file. On failure it returns
the same `Can't find agent by ID: <id>` message already used by
`begin_inputs`, so it does not reveal whether an `agent_id` exists in
another tenant.

Added a regression test
(`test/unit_test/api/apps/restful_apis/test_agentbots_access_control.py`,
following the existing stubbed-loader pattern from
`test_get_agent_session.py`) asserting that an inaccessible `agent_id`
is rejected before the agent is loaded (`begin_inputs`) or executed
(`completions`), and that an accessible agent still proceeds.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:16 +08:00
Rene Arredondo
7ecc0908ef fix(agent): authenticate "Thinking" button in shared/embedded chat via beta token (#14985) (#15238)
## Summary

Fixes #14985 — clicking the **Thinking** button in a shared/embedded
chat returns 401 and bounces the user to the login page, even though
the same share page can chat with the agent just fine.

## Root cause

In shared chat, `useGetSharedChatSearchParams` binds `conversationId`
to the URL's `shared_id` query param — which is the **beta APIToken**,
not the real agent id. That `conversationId` propagates through the
component tree:

```tsx
<WorkFlowTimeline canvasId={conversationId}>
  → useFetchMessageTrace(canvasId)
  → GET /api/v1/agents/<sharedId>/logs/<messageId>
```

But `/agents/<agent_id>/logs/<message_id>` is decorated with
`@login_required` (`api/apps/restful_apis/agent_api.py:842-846`).
The share page only holds the beta token — there is no session JWT
— so the request 401s and quart-auth redirects to the login page.
The reporter's server log matches exactly:

```
load_user from jwt got exception No b'.' found in value
load_user: No APIToken found for token=ULG10SWG3E...
Unauthorized request (quart_auth)
GET /api/v1/agents/394013f8d42211f0bad6123fa55e8ed9/logs/96fd72e2-... 1.1 401
```

The `394013f8...` segment in the URL is the `shared_id` (beta
token), not an actual agent id. `_load_user` already accepts the
regular `APIToken.token` field, but not `APIToken.beta`, by design
— beta is a much weaker share-link credential than a personal API
key.

The sibling endpoints `/agentbots/<id>/completions` and
`/agentbots/<id>/inputs` already use the right auth pattern for
this scope (beta-token via `_get_sdk_authorization_token` →
`APIToken.query(beta=token)`). Trace just didn't have a parallel.

## Fix

### Backend (`api/apps/restful_apis/bot_api.py`)

Added a beta-token sibling endpoint:

```
GET /api/v1/agentbots/<shared_id>/logs/<message_id>
```

- Same auth shape as the existing `agentbots` endpoints.
- The `<shared_id>` path segment is a client-supplied label only.
  The real `agent_id` used to build the Redis key
  (`<agent_id>-<message_id>-logs`) is taken from
  `APIToken.dialog_id` on the looked-up token, so the endpoint
  never trusts client-supplied identifiers for the data lookup.
- Returns the same `{data: ...}` shape as the existing
  `/agents/<id>/logs/<message_id>` endpoint, so the frontend
  doesn't need to reshape the response.

### Frontend

- `web/src/utils/api.ts`: added `sharedTrace(sharedId, messageId)`
  URL builder.
- `web/src/services/agent-service.ts`: added
  `fetchSharedTrace({ shared_id, message_id })`.
- `web/src/hooks/use-agent-request.ts`: `useFetchMessageTrace`
  takes an optional `isShare` argument. When set, it calls
  `fetchSharedTrace`; `isShare` is also folded into the
  `queryKey` so the two modes never share cached results.
- `web/src/pages/agent/log-sheet/workflow-timeline.tsx`:
  forwards the already-existing `isShare` prop into the hook.

All other existing call sites of `useFetchMessageTrace` (webhook
timeline, pipeline log, dataflow result) pass no `isShare`
argument → undefined → falsy → unchanged behavior.

## Test plan

- [ ] In the regular Agent UI (logged-in user): open the trace /
      log sheet for any message and click into "Thinking" — the
      timeline should still load via `/agents/<id>/logs/<msg>`,
      same as before.
- [ ] From the Agent page, click **Chat in new tab** to open
      `/chat/share?shared_id=<token>&from=agent`. Send a message,
      wait for a response, then click **Thinking** on the
      assistant turn. The trace panel should load instead of
      redirecting to the login page.
- [ ] Same flow but with the agent embedded in an iframe ("Embed
      into webpage") — confirm there is no login redirect.
- [ ] In DevTools → Network, confirm the share-chat trace request
      goes to `/api/v1/agentbots/<sharedId>/logs/<msgId>` and
      returns 200 with the same JSON shape as the logged-in path.
- [ ] Confirm the chat completions, inputs, and upload flows in
      the share page still work — they were not touched.
- [ ] Send a bogus / expired beta token to the new endpoint and
      confirm it returns the standard "Authentication error: API
      key is invalid!" response (no traceback, no 500).
- [ ] Run `uv run pytest` to make sure no existing tests regress.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:16 +08:00
jony376
7b81f63653 fix(agent): bind session_id to path agent_id on GET/DELETE agent sessions (#15374)
## Related issues

Closes #15128

### What problem does this PR solve?

`GET` and `DELETE` `/api/v1/agents/<agent_id>/sessions/<session_id>`
verified canvas access for `agent_id` in the URL but loaded/deleted
sessions only by `session_id`, without checking `conv.dialog_id ==
agent_id`.

Any user with access to **any** agent could read or delete another
agent's `API4Conversation` session (messages, references, DSL, etc.)
when they knew the session UUID.

Agent completions in the same file already enforce this binding; chat
sessions do too — these two routes were inconsistent.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Changes

| File | Change |
|------|--------|
| `api/apps/restful_apis/agent_api.py` | Require `conv.dialog_id ==
agent_id` in `get_agent_session` and `delete_agent_session_item`; return
generic `"Session not found!"` on mismatch |
| `test/unit_test/api/apps/restful_apis/test_get_agent_session.py` | Add
IDOR regression tests for GET/DELETE; fix success fixture to include
`dialog_id`; track `delete_by_id` calls |

### Test plan

- [x] Unit tests added for GET/DELETE IDOR and success paths
- [ ] `pytest
test/unit_test/api/apps/restful_apis/test_get_agent_session.py`

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:16 +08:00
Zhichang Yu
faef22c18a Harden closed-advisory fixes (#16409)
## Summary
- harden reopened advisory fixes across REST connector, invoke, document
downloads, and markdown rendering
- add targeted regression coverage for redirect-safe SSRF handling,
invoke SSRF checks, document access control, and markdown sanitization
- verify each referenced GHSA against the original GitHub advisory text
and align the closed-advisory plan with the implemented remediation

## What changed
- add tenant access checks to document download endpoints to avoid
cross-tenant document disclosure
- add per-hop SSRF validation, DNS pinning, redirect handling, and
redirect limits to the REST API connector
- ensure invoke requests validate and pin the resolved host and never
follow redirects implicitly
- keep the generic rate-limited request path wrapped, not just GET and
POST helpers
- sanitize markdown HTML before rendering in the highlight markdown
component

## Validation
- `cd web && npm test -- --runInBand
src/components/highlight-markdown/__tests__/index.test.tsx`
- `.venv/bin/python -m pytest -q
test/unit_test/data_source/test_rest_api_connector.py`
- targeted `test/testcases/test_web_api/...` unit additions were
reviewed, but the suite cannot be executed end-to-end in this
environment because parent `test/testcases/conftest.py` requires a local
service on `127.0.0.1:9380`

## Notes
- all GHSA entries referenced by the plan were checked against the
original GitHub advisory text, not sampled
- the closed-advisory plan document was updated locally during review,
but is intentionally not included in this PR
2026-06-29 09:45:16 +08:00
Zhichang Yu
477f2fcebd feat[Go]: port agent webhook trigger, agent file upload/download, component input-form + debug endpoints from Python (#16403)
port agent webhook trigger, agent file upload/download, component
input-form + debug endpoints from Python
- [x] New Feature (non-breaking change which adds functionality)
2026-06-29 09:45:16 +08:00
Wang Qi
ac9469e5f5 Fix add VLLM without apikey will fail (#16352) 2026-06-25 17:17:29 +08:00
buua436
479a9a715e feat: unify provider id or name routing (#16336) 2026-06-25 13:04:21 +08:00
kpdev
68d2ca0ff1 fix(api): use dataset-owner tenant for legacy /chunks docstore cleanup (#15961) 2026-06-24 14:24:40 +08:00
Ambercssa
e9cdd09b67 fix(agent): handle different reference data formats (#16276) 2026-06-24 13:33:59 +08:00
buua436
aba5d172bd feat: add whatsapp web qr chat channel (#16238)
Adds a WhatsApp chat channel backed by a QR-based web login flow so users can connect without manual token setup.
2026-06-23 17:45:31 +08:00
Wang Qi
5ca1686ac7 Fix that agent cannot be the same name (#16192)
Fix that agent cannot be the same name
2026-06-18 19:10:21 +08:00
Wang Qi
8067e97f0d Refactor: rename /chat_channels to /chat-channels (#16099) 2026-06-16 19:15:43 +08:00
Kevin Hu
15f50e5cb2 fix: rename dialog_id to chat_id in chat_channel (backend + frontend) (#16096)
## Summary

- The `ChatChannel` DB column was renamed from `dialog_id` to `chat_id`
via a migration (added in a prior commit).
- Aligns the REST API layer (`chat_channel_api.py`,
`chat_channel_service.py`) to use `chat_id` consistently.
- Updates the frontend (`interface.ts`, `hooks.ts`,
`connect-dialog-modal.tsx`, `added-channel-card.tsx`) to read/write
`chat_id` instead of `dialog_id`.
- The joined `dialog_name` alias in the list query is unchanged (backend
still returns it under that name).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-16 19:02:20 +08:00
Kevin Hu
5a817762fa Refactor: Change table chat_channel status data type. (#16061)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring
2026-06-16 12:02:12 +08:00
buua436
8e235b7b95 fix: add legacy chat/completions mode (#16014)
### What problem does this PR solve?
Adds a legacy mode for /chat/completions that restores v0.23.0-style
output by converting start_to_think/end_to_think back into raw
<think></think> markers and streaming cumulative answer text.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-16 10:34:06 +08:00
Lynn
47495c1f6a Feat: model provider (#16028)
### What problem does this PR solve?

Feat:
- Allow upsert model_type for instance model

Fix:
- Allow create instance with duplicate api_key

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-06-15 19:10:33 +08:00
dripsmvcp
53d4d9b3bd fix(api): return 4xx not 500 when attachment blob is missing (#15509)
Guard the agent-attachment download against a missing or empty storage blob so the caller gets a structured 4xx (`Document not found!`) instead of an HTTP 500. Same bug class as #15365 on document preview.
Resolve #15502
2026-06-15 15:41:49 +08:00
Yingfeng
b5bea72e4b Add git-like file commit API (#15978)
### What problem does this PR solve?

| # | Method | Endpoint | Description | Git Equivalent |
|---|--------|----------|-------------|----------------|
| 1 | `POST` | `/api/v1/{prefix}/{folder_id}/commits` | Create a
snapshot commit with file changes (add/modify/delete/rename) | `git add`
+ `git commit` |
| 2 | `GET` | `/api/v1/{prefix}/{folder_id}/commits` | List commit
history (paginated) | `git log` |
| 3 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}` | Get
commit detail with file changes | `git show` |
| 4 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files` |
List file changes in a commit | `git show --name-status` |
| 5 | `GET` |
`/api/v1/{prefix}/{folder_id}/commits/diff?from=...&to=...` | Compare
two commits and return differences | `git diff` |
| 6 | `GET` | `/api/v1/{prefix}/{folder_id}/changes` | Get uncommitted
changes (add/modify/delete) | `git status` |
| 7 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/tree` |
Get the folder tree snapshot at commit time | `git ls-tree` |
| 8 | `GET` |
`/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files/{file_id}/content`
| Get a file's content as it existed in a specific commit | `git show
HEAD:file` |
| 9 | `GET` | `/api/v1/{prefix}/{file_id}/versions` | Get version
history for a specific file across all commits | `git log -- file` |

Where `{prefix}/{id}` can be:
- `folders/{folder_id}` — direct folder access
- `workspaces/{workspace_id}` — alias of `folders/{folder_id}`
- `datasets/{dataset_id}` — resolves to the dataset's folder
- `memories/{memory_id}` — resolves to the memory's folder
- `skills/{skill_id}` — resolves to the skill's folder

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-06-15 11:19:56 +08:00
Wang Qi
59d4203947 Fix last login time (#16004)
Fix last login time
2026-06-15 10:06:24 +08:00
Kevin Hu
b5a426e6e0 Feat: chat channels — connect assistants to external messaging bots (#15850)
### What problem does this PR solve?

#15844

Adds a **Chat channels** capability so a RAGFlow assistant (Dialog) can
be exposed as a bot on external messaging platforms (Feishu/Lark,
Discord, Telegram, Slack, WeCom, LINE, etc.). An admin configures a bot
in the UI, connects it to an assistant, and inbound messages are
answered from that assistant's knowledge base — replies are delivered
back on the channel.

**Feishu/Lark is implemented and tested end-to-end.** Discord, Telegram,
LINE, and WeCom are scaffolded against the same interface; the remaining
listed channels are tracked as follow-ups.

### Design

**Backend**
- New `chat_channel` table (`tenant_id`, `name`, `channel`, `config`
JSON holding `{credential: {...}}`, `dialog_id`, `status`) +
`ChatChannelService` and RESTful CRUD under `/api/v1/chat_channels`.
- Channel framework under `api/channels/`: a `core` registry +
per-channel packages that self-register a builder and implement a common
`Channel` interface (`start`/`stop`/`send` + inbound normalization) over
`IncomingMessage`/`OutgoingMessage`.
- Embedded **reconcile loop** in `ragflow_server`
(`api/channels/bootstrap.py`): loads enabled bots, and
starts/stops/restarts them as rows change (no server restart needed).
Inbound messages run the connected dialog via the non-streaming
completion path, keeping per-end-user conversation history.
- Missing optional channel SDKs degrade gracefully (channel skipped with
a warning; others unaffected). Channel-level errors are logged, not
crashed.
- Feishu's WebSocket client runs in a dedicated thread with its own
event loop to avoid cross-loop/contextvars conflicts with the channel
runtime.

**Frontend**
- **Settings → Chat channels** panel: available-channels grid +
configured-bots list with add/edit/delete and a **Connect assistant**
popup that binds a bot to a dialog.
- Brand icons via simple-icons / reused shared data-source assets, with
colored fallbacks for brands not available.
- Route, sidebar entry, i18n (en/zh), and a top-nav segment-boundary fix
so the settings page no longer highlights the Chat tab.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### Notes
- DB: new `chat_channel` table is auto-created; `chat_channel.dialog_id`
is also covered by a `migrate_db` `alter_db_add_column` for existing
installs.
- Channel SDKs (`lark-oapi`, `discord.py`, `python-telegram-bot`,
`line-bot-sdk`, `wechatpy`, `aiohttp`) added to dependencies.
- Screenshots / per-channel credential docs to follow.

<img width="1338" height="1290" alt="Image"
src="https://github.com/user-attachments/assets/042cb2f9-0dad-4e6a-bcf7-43ced4bbd704"
/>

<img width="1344" height="738" alt="Image"
src="https://github.com/user-attachments/assets/373cd08e-ec40-4c67-9c51-4d948b1ba617"
/>

<img width="672" height="887" alt="Image"
src="https://github.com/user-attachments/assets/5a34953f-a9a3-4c1e-869e-5eff0dc64c84"
/>

---------
2026-06-12 18:21:30 +08:00
Carl Harris
a2de880b6d fix(profile): enforce profile name validation and input constraints (#15694)
### What problem does this PR solve?

The Profile **Name** field currently lacks application-level validation
and allows users to save excessively long names and unsupported special
characters.

While the database enforces a maximum length of 100 characters, neither
the frontend nor backend validates nickname format before persistence.
This can result in inconsistent user data, poor user experience, and UI
layout issues when long names wrap across multiple lines.

This PR introduces consistent frontend and backend validation for
profile names, enforces length and character constraints, provides clear
validation feedback, and prevents invalid values from being saved.

Fixes #15693

### Type of change

* [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-12 11:13:18 +08:00
Jonathan Chang
de06c9a60b feat: Langfuse session grouping for multi-turn chat traces (#15679)
## Summary

This PR passes `session_id` into Langfuse trace observations so
multi-turn chat messages can be grouped under the same session in
Langfuse.

Changes include:
- Propagate `session_id` from chat/session APIs into
`dialog_service.async_chat`.
- Pass `session_id` into Langfuse `start_observation(...)`.
- Share Langfuse `trace_context` with chat, embedding, rerank, and TTS
model bundles where applicable.
- Add unit coverage to verify Langfuse observations receive
`session_id`.
- Update affected test stubs for the new optional Langfuse context
arguments.

## Related Issue
Closes: #15636 

## Change Type
- [x] Feature
- [x] Bug fix
- [x] Test
- [ ] Refactor
- [ ] Documentation
- [ ] Breaking change

## Real Behavior Proof

Before this change:

- Langfuse observations were created without `session_id`.
- Multi-turn chat traces could not be grouped by session in Langfuse.

After this change:

- Chat/session flows pass `session_id` into `async_chat`.
- Langfuse observations include `session_id`.
- Related model bundles receive shared trace context and session
metadata.

Validation result:

```bash
uv run python -m py_compile \
  api/db/services/tenant_llm_service.py \
  api/db/services/llm_service.py \
  api/db/services/dialog_service.py \
  api/db/services/conversation_service.py \
  api/apps/restful_apis/chat_api.py \
  test/unit_test/api/db/services/test_dialog_service_final_answer.py \
  test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py
```
Passed.

```bash
uv run pytest \
  test/unit_test/api/db/services/test_dialog_service_final_answer.py \
  test/unit_test/api/db/services/test_dialog_service_use_sql_source_columns.py -q
```
Result:

```text
11 passed in 16.89s
```

```bash
git diff --check
```
Passed.
## Checklist

- [x] Analyzed the issue requirement.
- [x] Checked existing Langfuse trace integration.
- [x] Implemented only the requested session grouping behavior.
- [x] Added/updated unit tests.
- [x] Ran focused tests successfully.
- [x] Ran Python compile validation.
- [x] Ran whitespace diff validation.
2026-06-12 10:18:06 +08:00
jaso0n0818
2971849783 fix: guard docStoreConn.delete with index_exist in parse and stop_parsing (#15876)
## What problem does this PR solve?

Closes #15874

Both the `POST /api/v1/datasets/<dataset_id>/chunks` (re-parse) and
`DELETE /api/v1/datasets/<dataset_id>/chunks` (stop-parsing) handlers
called `settings.docStoreConn.delete` unconditionally. When the
tenant/dataset index has not been created yet — fresh dataset, first
parse interrupted before any chunks were indexed, or index manually
removed — the delete call throws and the handler returns HTTP 500
**after** the document state was already mutated (RUNNING with zeroed
counters for the parse path; CANCEL with zeroed counters for the stop
path), leaving the document in an inconsistent state.

The newer `parse_documents` path in `document_api.py` already uses
`index_exist` before deleting:



## How to fix?

Apply the same `index_exist` guard to both call sites in `chunk_api.py`:

- **`parse`** (POST path, line ~192): guard the delete before
`TaskService.filter_delete`.
- **`stop_parsing`** (DELETE path, line ~242): guard the delete after
`DocumentService.update_by_id`.

Both sites already have the correct `search.index_name(tenant_id)` and
`dataset_id` parameters; the guard is a one-line addition at each site.

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-11 16:30:03 +08:00
kpdev
de18313f97 fix(api): POST /documents/stop removes partial chunks and resets counters (#15789)
### What problem does this PR solve?

`POST /api/v1/datasets/{dataset_id}/documents/stop`
(`stop_parse_documents`) cancels parsing tasks and sets `run` to
`CANCEL`, but it does **not** remove chunks already indexed in the doc
store or reset `progress` / `chunk_num`. REST callers can end up with a
“cancelled” document that still returns partial chunks in `GET
.../chunks` and in retrieval.

Legacy `DELETE /api/v1/datasets/{dataset_id}/chunks` (`stop_parsing`)
already performs full cleanup: it resets counters and calls
`docStoreConn.delete`. This PR aligns the newer stop endpoint with that
behavior so both paths leave the dataset consistent.

Fixes [#15788](https://github.com/infiniflow/ragflow/issues/15788).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Changes

- Update `stop_parse_documents` in `document_api.py` to reset `progress`
and `chunk_num` to `0` and delete partial chunks via
`docStoreConn.delete` after `cancel_all_task_of`.
- Add unit test `test_stop_parse_documents_cleans_partial_chunks` to
assert counters reset and doc store delete is invoked.

### Test plan

- [x] Unit test: `pytest
test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py::TestDocRoutesUnit::test_stop_parse_documents_cleans_partial_chunks
-v`
- [ ] Manual: upload a slow document, start parse, call `POST
.../documents/stop` while `RUNNING`, verify `GET .../chunks` returns
zero chunks and UI `chunk_count` is 0
- [ ] Control: legacy `DELETE .../chunks` behavior unchanged

---------

Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-11 15:51:32 +08:00
zaviermeekz-cpu
a1dc2da7b4 fix: add model_name to embed completion request (#15883) (#15888)
### What problem does this PR solve?

When embedding a chatbot, the API returned `"Model Name is required"`.
The embed widget now includes the assistant's `llm_id` as `model_name`
in the completion request.

### Type of change

- [x] Bug Fix

### How has this been tested?

- Created a chatbot with a default model.
- Embedded it and sent a message – the error is gone and the assistant
replies correctly.

### Related Issue

Closes #15883

Co-authored-by: RAGFlow Dev <dev@ragflow.local>
Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-11 14:38:37 +08:00
zaviermeekz-cpu
c50f9c59aa fix: allow zero message history window and clear history for new sessions (#15897) (#15902)
### What problem does this PR solve?

Two bugs in the Agent Categorize component:
1. The backend rejected `message_history_window_size = 0` while frontend
allowed it, causing API errors.
2. When calling the agent API without a `session_id`, a new session was
created but retained history from previous conversations.

### Type of change

- [x] Bug Fix

### How has this been tested?

- Issue 1: `CategorizeParam().check()` now accepts `0` and rejects
negative values.
- Issue 2: `canvas.clear_history()` is called for new sessions (no
`session_id`), ensuring fresh conversation state. Verified via UI and
API that a second call without `session_id` does not remember the first
conversation.

### Related Issue

Closes #15897

Co-authored-by: RAGFlow Dev <dev@ragflow.local>
Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-11 13:24:48 +08:00
Lynn
478c9846a1 Fix: model list (#15860)
### What problem does this PR solve?

Remove tenant_llm call in rag.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-10 14:59:57 +08:00
balibabu
287a4cfd2b Fix: An error message appears when accessing the agent's launch page: "pagesize exceeds maximum value". (#15835)
### What problem does this PR solve?
Fix: An error message appears when accessing the agent's launch page:
"pagesize exceeds maximum value".

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: balibabu <assassin_cike@163.com>
2026-06-09 16:56:47 +08:00
buua436
e81bca73d5 fix: normalize agent session chunks (#15756)
### What problem does this PR solve?

Normalize agent session chunk references so they are mapped through a
dedicated helper instead of duplicating the field extraction inline.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-08 15:29:55 +08:00
Lynn
b05d5a5228 Feat: get model list from remote (#15711)
### What problem does this PR solve?

Feat:
- Get model list from remote provider. 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-08 11:02:40 +08:00
Wang Qi
aa9545e4c9 Revert "fix: duplicate document ingest guard" (#15707)
Reverts infiniflow/ragflow#15638
2026-06-05 17:45:29 +08:00
Wang Qi
214ee319f8 Revert "fix(api): authorize owner_ids for list chats and search apps (#14775) (#15698)
This reverts PR #14775  commit 5a5e766386.
2026-06-05 17:26:02 +08:00
Wang Qi
4cbe597d7e Refactor: consolidate to use @login_required (#15652)
Refactor: consolidate to use @login_required
2026-06-05 11:35:00 +08:00
kpdev
bd49fd70aa fix(api): set SDK document download Content-Type from filename (#15112) (#15113)
## Summary

- Infer `Content-Type` from the stored document filename on SDK download
routes.
- Covers `GET /api/v1/datasets/<dataset_id>/documents/<document_id>` and
`GET /api/v1/documents/<document_id>`.
- Aligns with REST preview/download via `CONTENT_TYPE_MAP`.

## Test plan

- [x] `pytest
test/testcases/test_http_api/test_file_management_within_dataset/test_doc_sdk_routes_unit.py::TestDocRoutesUnit::test_download_mimetype_from_filename`
- [x] Manual: `curl -sSI` on SDK dataset document download for a PDF;
expect `Content-Type: application/pdf`

Fixes #15112.
2026-06-05 10:08:53 +08:00
buua436
423fb6faae fix: duplicate document ingest guard (#15638)
### What problem does this PR solve?
When a document is rerun or updated concurrently, the previous
unconditional update could overwrite a newer task state.
This change adds an `update_time`-based optimistic lock so the update
only succeeds if the record has not been modified by another flow in the
meantime.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-04 17:57:51 +08:00
buua436
c70f19e138 Fix: remove duplicate document preview access check (#15625)
### What problem does this PR solve?

remove duplicate document preview access check

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-04 13:05:15 +08:00
Lynn
597ac1e900 Fix: search bot and verify model instance (#15588)
### What problem does this PR solve?

Fix:
- Verify provider with empty llm list in llm_factories.json
- Set search bot's chat_llm_name, use tenant default chat model as
default

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-04 11:59:55 +08:00
kpdev
d26d799467 fix(api): restore accessible check on document preview (#15505)
Restore `DocumentService.accessible` on `GET
/api/v1/documents/{doc_id}/preview` so cross-tenant users cannot stream
documents by UUID.

Fixes #15501

### What problem does this PR solve?

PR #15146 (`71a52d579`) moved the agent attachment download route and
accidentally removed the `DocumentService.accessible(doc_id,
current_user.id)` guard from the REST preview handler. The endpoint
still requires login, but any authenticated user who knows another
tenant's `doc_id` can download the raw file bytes.

This restores the same authorization check that existed before #15146,
returning a generic `"Document not found!"` when access is denied (no
cross-tenant ID enumeration). SDK download routes tracked in #15125 are
unchanged.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-04 09:59:07 +08:00
dripsmvcp
2196f2260a fix(api): restore DocumentService.accessible check on /preview (#15508)
## Summary
Restore the `DocumentService.accessible(doc_id, current_user.id)` check
that PR #15146 dropped from the REST document preview handler. Any
authenticated caller could download any tenant's document bytes by
guessing/knowing the `doc_id`.

## Root cause
`api/apps/restful_apis/document_api.py` — the `GET
/documents/<doc_id>/preview` handler called `DocumentService.get_by_id`
and went straight to `File2DocumentService.get_storage_address` +
`STORAGE_IMPL.get`, with no tenant check between the lookup and the
read. The handler's docstring even promises "user must belong to the
tenant that owns the document's knowledge base" — the code didn't
enforce it.

## Fix
- Add `current_user` to the existing `api.apps` import.
- Immediately after `get_by_id`, call
`DocumentService.accessible(doc_id, current_user.id)`; on denial, return
the **same** `get_data_error_result(message="Document not found!")`
shape used for the missing-doc branch. That makes a cross-tenant probe
indistinguishable from a missing-doc probe, preventing ID enumeration
(the issue body calls this out explicitly).
- Emit `logging.warning` with caller user + doc_id for audit.
- Restores symmetry with peer routes that already call
`accessible(doc_id, user_id)` (e.g. `_run_sync` at
`document_api.py:1380`).

## Test plan
Adds
`test/unit_test/api/apps/restful_apis/test_document_preview_accessible.py`:

- **`test_cross_tenant_preview_is_denied`** — owner tenant ≠ caller
tenant; asserts the response shape is `Document not found!` and the
storage backend (`thread_pool_exec(STORAGE_IMPL.get, ...)`) is **never**
invoked.
- **`test_missing_doc_returns_not_found`** — missing-doc behaviour
unchanged.

Stub-loader pattern mirrors
`test/unit_test/api/apps/sdk/test_dify_retrieval.py` (added in #15028,
passing in CI).

## Provenance — how this fix was produced

This PR was authored against a small cited knowledge base committed in
the working tree as a `.vouch/` (see
[vouchdev/vouch](https://github.com/vouchdev/vouch)). The loop used
here:

1. **Grounding first.** Before reading the handler, queried the KB for
prior context: `vouch context "tenant scoped accessible authorization"`
→ retrieved a cited claim distilled from PR #15028 (which restored the
same `accessible()` check on `/dify/retrieval`). The retrieved rule:

> *ragflow REST endpoints that load by tenant-scoped id must call
`<Service>.accessible(id, tenant_id)` after `get_by_id` and before
storage/DB read; deny with code 109 'No authorization.' and log a
warning. Established by PR #15028.*

2. **Applied the pattern with a domain refinement.** For an API/JSON
endpoint, `No authorization.` is the right denial shape. For a
**byte-streaming, browser-facing** endpoint like `/preview`, leaking
*existence* itself enables enumeration — so per the issue's expected
behaviour, this PR denies with `Document not found!` (indistinguishable
from missing) instead. Same auth check, narrower response.

3. **Recorded the refinement back into the KB** as a new cited claim, so
the next IDOR-class issue starts already grounded in both the general
pattern and the byte-route nuance.

Net effect of the workflow: the fix replicates a known-good pattern
instead of reinventing it, *and* the place where the pattern was nuanced
is now retrievable for the next pass. Mechanism is fully independent of
this PR — it's not a runtime dependency, just process discipline.

Closes #15501
2026-06-04 09:58:26 +08:00
Wang Qi
b946df8ba2 Fix: consolidate beta auth (#15581)
Fix: consolidate beta auth
2026-06-03 19:58:06 +08:00
Wang Qi
d6fc50a469 Fix: no more @token_required (#15562)
Fix: no more @token_required
2026-06-03 16:24:08 +08:00
bitloi
a75ea7ba7c Fix: Chat completion generation parameter overrides (#15389)
### What problem does this PR solve?

Closes #15388.

Chat completion routes did not reliably honor per-request generation
settings:

- `/api/v1/chat/completions` copied generation settings with a
truthiness check, so valid zero values such as `temperature: 0`, `top_p:
0`, `frequency_penalty: 0`, `presence_penalty: 0`, and `max_tokens: 0`
were dropped.
- `/api/v1/openai/{chat_id}/chat/completions` did not forward standard
generation settings into the request-specific dialog LLM settings before
calling `async_chat`.

This PR preserves explicitly supplied generation parameters, including
zero values, and merges request-level overrides into existing dialog
settings where appropriate.

The supported generation parameter keys and merge behavior live in a
shared REST API helper to keep both completion routes aligned.

Validation:

- `git diff --check`
- `python3 -m py_compile api/apps/restful_apis/_generation_params.py
api/apps/restful_apis/chat_api.py api/apps/restful_apis/openai_api.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `uv run ruff check api/apps/restful_apis/_generation_params.py
api/apps/restful_apis/chat_api.py api/apps/restful_apis/openai_api.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `ZHIPU_AI_API_KEY=dummy uv run pytest
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py
-q -k generation_params`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-03 11:46:10 +08:00
kpdev
76968af0ba Guard missing storage blobs on preview and image endpoints (#15366)
Fixes [#15365](https://github.com/infiniflow/ragflow/issues/15365) —
`get_document_image()` and document preview call `make_response(None)`
when storage returns no bytes, causing HTTP 500.
2026-06-03 11:33:03 +08:00
Lynn
36357a6afd Fix: model provider (#15517)
### What problem does this PR solve?

Fix:
- Handle siliconflow and siliconflow_intl api_key

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 19:04:20 +08:00