768 Commits

Author SHA1 Message Date
euvre
81cfcdf2d3 feat(frontend): add AuthenticatedImg component for authorized image requests (#16525) 2026-07-01 17:02:44 +08:00
Lynn
400476f0b3 Feat: SoMark (#16482)
Follow #15486
Co-authored-by: limuting <limuting233@gmail.com>
Co-authored-by: lutianyi <lutianyi233@163.com>
Co-authored-by: justinychuang <huangyicheng@soulcode.cn>
Co-authored-by: maybehokori <138367708+maybehokori@users.noreply.github.com>
2026-07-01 13:29:28 +08:00
Lynn
b53b693f22 Fix: CI (#16504)
### Summary

Fix race condition in parallel lefthook hooks causing ETXTBSY error
2026-06-30 22:14:11 +08:00
Wang Qi
d2ecd57c59 Fix: UI cannot start up (#16497) 2026-06-30 17:09:09 +08:00
Rene Arredondo
dc8b6d767c fix(agent): inject uploaded attachments into LLM context (#15215) (#15220)
## Summary

Fixes #15215 — attachments uploaded to an agent were not reaching the
LLM.

When a user uploads a file in an agent chat, `canvas.run` parses it into
the `sys.files` global (text content for documents, `data:image/...` URIs
for images; see `agent/canvas.py:752-768`). But the LLM/Agent component's
`_prepare_prompt_variables` only substitutes variables the user's prompt
template explicitly references via `{var}` placeholders. The default
prompt is `[{"role": "user", "content": "{sys.query}"}]` with no
`{sys.files}`, so the parsed attachment content never reaches the model.

In the reporter's logs, this is why the agent saw only the bare query
`附件 摘要 attachment summary` and went searching the dataset instead of
reading the uploaded PDF.

## Fix

`agent/component/llm.py` — added `_collect_sys_files()` and an
auto-injection step in `_prepare_prompt_variables`:

- If `sys.files` is non-empty **and** neither `sys_prompt` nor any entry
  in `prompts` already contains `{sys.files}` (no double-injection),
  split the entries into text vs. `data:image/...` URIs.
- Image URIs are merged into `self.imgs`, which the existing logic uses
  to switch the chat model to `IMAGE2TEXT` and pass `images=...` to
  `async_chat`.
- Text content is appended to the last `user` role message in `msg`,
  mirroring how `dialog_service.async_chat_solo` handles attachments for
  the non-agent chat path (`api/db/services/dialog_service.py:318-321`).

Both `LLM._invoke_async` and `Agent._invoke_async` (tool-using) go
through `_prepare_prompt_variables`, so plain LLM nodes and Agent nodes
are fixed in both streaming and non-streaming paths.
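
The injection step described above can be sketched roughly as follows. This is an illustration only, not the code in `agent/component/llm.py`: the function name `inject_sys_files` and its signature are hypothetical, and `sys.files` is assumed to be a flat list of strings.

```python
def inject_sys_files(msg, sys_files, sys_prompt="", prompts=()):
    """Hypothetical sketch: route sys.files entries to images or the last user message.

    Returns the list of image data URIs; text entries are appended to `msg` in place.
    """
    placeholder = "{sys.files}"
    # Skip auto-injection when a template already references the placeholder.
    already_referenced = placeholder in sys_prompt or any(
        placeholder in str(p.get("content", "")) for p in prompts
    )
    if not sys_files or already_referenced:
        return []

    # Split entries into image data URIs and plain text content.
    images = [f for f in sys_files if f.startswith("data:image/")]
    texts = [f for f in sys_files if not f.startswith("data:image/")]

    if texts:
        # Append text attachments to the last user-role message.
        for m in reversed(msg):
            if m.get("role") == "user":
                m["content"] = m["content"] + "\n\n" + "\n\n".join(texts)
                break
    return images
```

A caller would merge the returned URIs into its image list and send the mutated `msg` to the model; how the real component does that is outside this sketch.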

## Test plan

- [ ] Upload a PDF attachment to an agent with the default `{sys.query}`
      prompt and ask "summarize the attachment"; the model should answer
      from the file content rather than searching the knowledge base.
- [ ] Upload an image attachment to an agent and ask about its contents;
      the model should switch to the vision-capable LLM and answer from
      the image.
- [ ] Verify that an agent whose prompt **does** include `{sys.files}`
      still works and does **not** include the file content twice.
- [ ] Verify that an agent run with no attachments behaves unchanged.
- [ ] Run `uv run pytest` to make sure no existing tests regress.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: yzc <yuzhichang@gmail.com>
2026-06-30 15:48:59 +08:00
chanx
9542e6d530 fix: adjust width of messageItemSectionLeft to fit-content (#16488) 2026-06-30 15:37:22 +08:00
Attili-sys
5fc254eb2e Feature big query connector (#15871)
### What problem does this PR solve?

This PR adds Google BigQuery as a first-class data source connector in
RAGFlow.

It enables users to ingest and sync BigQuery data using the same
row-to-document model used by relational database connectors: selected
content columns become document text, metadata columns become document
metadata, an optional ID column provides stable document IDs, and an
optional timestamp column enables cursor-based incremental sync.

The connector supports service-account JSON credentials, table mode,
custom query mode, GoogleSQL queries, cursor-based incremental sync,
deleted-row pruning support, configurable query limits such as
`maximum_bytes_billed`, dry-run validation, batch loading, stable
document IDs, and BigQuery-aware value serialization.
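
As a rough illustration of the row-to-document model described above, the sketch below maps one result row to a document and advances an incremental-sync cursor. All names here (`Document`, `row_to_document`, `next_cursor`) are hypothetical, rows are assumed to be plain dicts, and the hash-based ID fallback is an assumption rather than the connector's documented behavior.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class Document:
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


def row_to_document(row, content_columns, metadata_columns=(), id_column=None):
    """Hypothetical mapping of one result row (a dict) to a document."""
    # Selected content columns become the document text.
    text = "\n".join(f"{col}: {row[col]}" for col in content_columns)
    # Metadata columns become document metadata.
    metadata = {col: row[col] for col in metadata_columns}
    if id_column is not None:
        doc_id = str(row[id_column])
    else:
        # Assumed fallback: derive a stable ID from the content itself.
        doc_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return Document(doc_id=doc_id, text=text, metadata=metadata)


def next_cursor(rows, timestamp_column, previous_cursor=None):
    """Return the largest timestamp seen, to use as the next sync cursor."""
    values = [row[timestamp_column] for row in rows]
    if previous_cursor is not None:
        values.append(previous_cursor)
    return max(values) if values else None
```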
2026-06-29 22:08:40 +08:00
chanx
ca17808f12 fix: user-setting modal fixes and DOMPurify cleanup (#16449)
### Summary
  fix: user-setting modal fixes and DOMPurify cleanup
- HighlightMarkdown: drop post-process DOMPurify pass (ineffective after
preprocessLaTeX; Coderabbit CRITICAL #3486038798)
- SettingTeam: add invite-only-registered-users hint to add-user modal
- SettingModel: reset provider loading state when add-provider modal
closes
- MCP edit dialog: set maskClosable=false to prevent accidental
dismissal
- Form: switch FormDescription color from text-muted-foreground to
text-text-disabled
2026-06-29 16:38:23 +08:00
Carl Harris
61ac1c1dff refactor: enhance UI components and improve layout (#15984) 2026-06-29 10:40:28 +08:00
Tim Wang
f0f10b6092 Fix: UserFillUp interactive forms not working in agent explore mode (#14589)
## Summary

- **Backend**: `_iter_session_completion_events` in `agent_api.py` was
filtering out `user_inputs` and `workflow_finished` SSE events, causing
agents with UserFillUp components to silently fail in explore mode — the
interactive form never appeared, while the same agent worked correctly
in run (editor) mode.
- **Frontend**: `SessionChat` component in explore mode was missing
`DebugContent` children rendering inside `MessageItem`, so even if the
backend forwarded the events, the form UI would not render. Added
`DebugContent`, `MarkdownContent`, `useAwaitCompentData` hook, and
input-disabling logic to match the run mode's `chat/box.tsx` behavior.

## What was changed

### Backend (`api/apps/restful_apis/agent_api.py`)
- Line 266: Added `"user_inputs"` and `"workflow_finished"` to the
allowed event filter in `_iter_session_completion_events`
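
The effect of an allow-list filter like this can be sketched as below. This is not the code from `agent_api.py`: the function name is hypothetical, and apart from `user_inputs` and `workflow_finished` the event names in the set are placeholders chosen for illustration.

```python
# Only "user_inputs" and "workflow_finished" come from the PR text;
# "message" and "message_end" are placeholder assumptions.
ALLOWED_EVENTS = frozenset(
    {"message", "message_end", "user_inputs", "workflow_finished"}
)


def iter_allowed_events(events, allowed=ALLOWED_EVENTS):
    """Yield only events whose "event" field is on the allow-list."""
    for event in events:
        if event.get("event") in allowed:
            yield event
```

With an allow-list, any event type that is not listed is dropped silently, which is why a missing entry shows up as a form that never appears rather than as an error.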

### Frontend (`web/src/pages/agent/explore/components/session-chat.tsx`)
- Added imports: `DebugContent`, `MarkdownContent`,
`useAwaitCompentData`, `useParams`
- Added `sendFormMessage` from `useSendSessionMessage()` hook
- Added `useAwaitCompentData` hook for form state management
- Added `DebugContent` as `MessageItem` children for the latest
assistant message (renders UserFillUp form)
- Added `MarkdownContent` + submitted values display for previous
assistant messages
- Updated `NextMessageInput` disabled states to respect `isWaitting`
(form submission in progress)

## Test plan

- [x] Agent with UserFillUp component (e.g., email draft with
send/edit/cancel options) shows interactive form in **explore mode**
- [x] Same agent continues to work correctly in **run (editor) mode**
- [x] Form submission sends data back to the agent and workflow
continues
- [x] Input field is disabled while waiting for form submission
- [ ] Agents without UserFillUp components are unaffected in explore
mode

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:17 +08:00
jiashi19
0d7ad0ed0c Feat/agent thinking switch (#15446)
### What problem does this PR solve?

This PR adds an Agent LLM setting to control thinking mode for official
providers that expose a thinking switch.

Related to #12842.  
Closes #15445.

Some providers expose thinking controls through provider-specific
request fields, but Agent LLM settings did not have a unified option for
users to enable or disable thinking mode.

This PR adds a `Thinking` selector with:

- System default
- Enabled
- Disabled

<img width="452" height="278" alt="8566b0b4-0546-4c8a-913d-f9bbd38319f6" src="https://github.com/user-attachments/assets/25b497f7-1ba0-4bfe-940d-6fe79287d6ab" />
<img width="471" height="971" alt="8a0a6bee-f45f-48d5-bd83-17af260de3db" src="https://github.com/user-attachments/assets/41ad43c1-5087-48f1-bf37-f2ca14c2be2f" />

Initial support is limited to the verified official providers:

- Qwen / DashScope: `enable_thinking`
- Kimi / Moonshot: `thinking.type`
- GLM / ZHIPU-AI: `thinking.type`

For LiteLLM-based providers, provider-specific fields are forwarded
through `extra_body` before `drop_params` filtering so the request
parameters are preserved.
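
One way to picture the mapping from the three-state selector to provider-specific fields is the sketch below. The field names `enable_thinking` and `thinking.type` come from the list above; the provider keys (`"qwen"`, `"kimi"`, `"glm"`), the `"enabled"`/`"disabled"` values, and the function itself are assumptions made for illustration.

```python
def thinking_extra_body(provider, thinking):
    """Build a hypothetical `extra_body` fragment for a thinking switch.

    `thinking` is None (system default), True (enabled) or False (disabled).
    """
    if thinking is None:
        # System default: send nothing and let the provider decide.
        return {}
    if provider == "qwen":
        return {"enable_thinking": thinking}
    if provider in ("kimi", "glm"):
        # Assumed value spelling for the `thinking.type` field.
        return {"thinking": {"type": "enabled" if thinking else "disabled"}}
    # Unverified providers get no thinking field at all.
    return {}
```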



### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: jiashi <jiashi19@outlook.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-06-29 09:45:16 +08:00
Zhichang Yu
faef22c18a Harden closed-advisory fixes (#16409)
## Summary
- harden reopened advisory fixes across REST connector, invoke, document
downloads, and markdown rendering
- add targeted regression coverage for redirect-safe SSRF handling,
invoke SSRF checks, document access control, and markdown sanitization
- verify each referenced GHSA against the original GitHub advisory text
and align the closed-advisory plan with the implemented remediation

## What changed
- add tenant access checks to document download endpoints to avoid
cross-tenant document disclosure
- add per-hop SSRF validation, DNS pinning, redirect handling, and
redirect limits to the REST API connector
- ensure invoke requests validate and pin the resolved host and never
follow redirects implicitly
- keep the generic rate-limited request path wrapped, not just GET and
POST helpers
- sanitize markdown HTML before rendering in the highlight markdown
component
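
The per-hop SSRF validation idea can be sketched as follows. This is a simplified illustration, not the connector's code: `is_public_ip` and `check_hops` are hypothetical names, the resolver is injected so the example runs without network access, and a real client would also have to connect to the pinned IP rather than resolve the hostname again.

```python
import ipaddress
from urllib.parse import urlsplit


def is_public_ip(address):
    """Return True only for addresses that are safe to reach from a server."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def check_hops(urls, resolve, max_redirects=5):
    """Validate every hop of a redirect chain; return pinned (host, ip) pairs.

    `urls` is the initial URL followed by each redirect target, and `resolve`
    is an injected callable mapping a hostname to one IP string.
    """
    if len(urls) - 1 > max_redirects:
        raise ValueError("too many redirects")
    pinned = []
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            raise ValueError(f"unsupported URL: {url}")
        ip = resolve(parts.hostname)
        if not is_public_ip(ip):
            raise ValueError(f"blocked address for {parts.hostname}: {ip}")
        # The caller should connect to this exact IP (DNS pinning).
        pinned.append((parts.hostname, ip))
    return pinned
```

Checking each redirect target separately matters because a public first hop can redirect to an internal address.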

## Validation
- `cd web && npm test -- --runInBand src/components/highlight-markdown/__tests__/index.test.tsx`
- `.venv/bin/python -m pytest -q test/unit_test/data_source/test_rest_api_connector.py`
- targeted `test/testcases/test_web_api/...` unit additions were
reviewed, but the suite cannot be executed end-to-end in this
environment because parent `test/testcases/conftest.py` requires a local
service on `127.0.0.1:9380`

## Notes
- all GHSA entries referenced by the plan were checked against the
original GitHub advisory text, not sampled
- the closed-advisory plan document was updated locally during review,
but is intentionally not included in this PR
2026-06-29 09:45:16 +08:00
Wang Qi
3a829fb6dd Fix VLM PDF parser only parse first 12 pages, and default page range for PDF files align with backend (#16394)
1. Fix the VLM parser parsing only the first 12 pages.
2. Fix the frontend default page range (1 - 100000) so it stays aligned with the backend.
2026-06-26 20:15:25 +08:00
Tim Wang
ca96d61e73 Feat: Add New API model provider for OpenAI-compatible gateways (#15991)
## Summary

Add support for **"New API"** as a model provider, enabling connection
to [New API](https://github.com/QuantumNous/new-api) /
[one-api](https://github.com/songquanpeng/one-api) compatible gateways
that aggregate multiple LLM backends behind a unified OpenAI-compatible
`/v1` endpoint.

### Features

- **All model types**: Chat, Embedding, Rerank, Image2Text, TTS,
Speech2Text
- **List Models discovery**: `NewAPI(OpenAIAPICompatible)` class in
`model_meta.py` queries the gateway's `/v1/models` to auto-discover
available models via the native `GET /api/v1/providers/<name>/models`
endpoint
- **Model parameter editing**: Pencil icon on each discovered model row
to edit `model_type`, `max_tokens`, and `features` (e.g. tool call
support) before submitting
- **Custom model addition**: "Add Custom Model" button at the bottom of
the List Models dropdown for models not returned by the API
- **Gear icon settings**: Enabled the Settings gear button on provider
instances to manage models on existing instances (viewMode)
- **viewMode credential passthrough**: Fixed List Models in viewMode —
merges `initialValues` credentials when `api_key`/`base_url` fields are
hidden by `hideWhenInstanceExists`
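
For the List Models discovery item above, a minimal sketch of reading model IDs out of an OpenAI-style `/v1/models` response body might look like this. It assumes the common `{"object": "list", "data": [{"id": ...}]}` shape; the helper name `model_ids` is hypothetical and is not the `NewAPI` class from `model_meta.py`.

```python
def model_ids(payload):
    """Extract model IDs from an OpenAI-style `/v1/models` response body."""
    # Entries without an "id" are skipped rather than raising.
    return [item["id"] for item in payload.get("data", []) if "id" in item]
```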

### Changes

**Backend** (8 files):
- `rag/llm/chat_model.py` — `NewAPIChat(Base)` class
- `rag/llm/embedding_model.py` — `NewAPIEmbed(OpenAIEmbed)` class (no
auto `/v1` append)
- `rag/llm/rerank_model.py` — `NewAPIRerank(Base)` class (uses `/rerank`
endpoint)
- `rag/llm/cv_model.py` — `NewAPICv(GptV4)` class
- `rag/llm/tts_model.py` — `NewAPITTS(OpenAITTS)` class
- `rag/llm/sequence2txt_model.py` — `NewAPISeq2txt(GPTSeq2txt)` class
- `rag/llm/model_meta.py` — `NewAPI(OpenAIAPICompatible)` class for List
Models discovery
- `conf/llm_factories.json` — New API factory entry with all model type
tags

**Frontend** (8 files + 1 new SVG):
- `web/src/assets/svg/llm/new-api.svg` — New API logo icon
- `web/src/constants/llm.ts` — `LLMFactory.NewAPI` enum + `IconMap`
entry
- `web/src/components/svg-icon.tsx` — `NewAPI` added to `svgIcons`
- `web/src/pages/user-setting/setting-model/modal/provider-modal/field-config/local-llm-configs.ts` — New API `buildLocalConfig`
- `web/src/pages/user-setting/setting-model/modal/provider-modal/constants.ts` — `LIST_MODEL_PROVIDERS` includes NewAPI
- `web/src/pages/user-setting/setting-model/components/used-model.tsx` — Enable Settings gear button
- `web/src/pages/user-setting/setting-model/modal/provider-modal/hooks/use-list-models-picker.ts` — viewMode credential merge + model editing state/handlers
- `web/src/pages/user-setting/setting-model/modal/provider-modal/hooks/use-list-models-options.tsx` — Pencil edit icon per model row
- `web/src/pages/user-setting/setting-model/modal/provider-modal/index.tsx` — `AddCustomModelDialog` import + edit dialog rendering

**Note on Go implementation**: A Go model driver (`NewAPIModel`
delegating to `OpenAIModel`) has been prepared but is deferred until the
Go runtime is enabled in a future release (current v0.26.0 images use
`API_PROXY_SCHEME=python` and do not compile Go binaries). Will submit
as a follow-up PR.

## Related

- Depends on: #15996 (provider instance API improvements — server-side
credential lookup, idempotent `add_model`, security fixes — required for
viewMode gear icon and batch model submission)

## Test plan

- [ ] Add New API provider with api_key and base_url pointing to an
OpenAI-compatible gateway
- [ ] Click "List Models" — should discover and display available models
from `/v1/models`
- [ ] Click pencil icon on a model — should open edit dialog to change
model_type, max_tokens, features
- [ ] Select multiple models and click OK — should add all selected
models
- [ ] Click gear icon on the added instance — should open viewMode with
List Models working
- [ ] In viewMode, select new models including pre-existing ones, click
OK — should succeed (requires #15996)
- [ ] Verify all model types work: create a Chat assistant, Embedding
KB, Rerank setting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Tim Wang <wanghualoong@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-26 18:47:20 +08:00
Yoorim Choi
46b97bd1a1 fix(web): fix layout issues with text, overflow, and spacing consistency (#16324) 2026-06-25 19:25:32 +08:00
Wang Qi
31e50b164f Fix [ID:0] not converted to Fig. 1 (#16357) 2026-06-25 17:17:46 +08:00
chanx
d44359826d fix(web): agent log refetch and slider percentage rounding (#16344) 2026-06-25 13:49:25 +08:00
Yoorim Choi
6a8281721f fix(i18n): fix missing i18n coverage and refine Korean translations (#16203)
### What problem does this PR solve?

This PR follows up on
[#15863](https://github.com/infiniflow/ragflow/pull/15863) (Korean i18n)
with translation refinements and i18n coverage for hardcoded strings
found in the UI.

- Refine awkward Korean phrasing (e.g. 'Chunk 만들기' → 'Chunk 생성', '유형' →
'타입', etc.)
- Apply i18n to hardcoded strings in `message-item`,
`next-message-item`, `multi-select`, `chat-prompt-engine`, and various
filter hooks
- Rename `use-selelct-filters.ts` → `use-select-filters.ts` (typo fix)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-24 10:14:19 +08:00
balibabu
c849c76f8a Feat: Add a prefix to the name of the FormField associated with the chat. (#16178)
Fix: Add a prefix to the `name` of the `FormField` associated with the chat.
2026-06-22 19:18:11 +08:00
Zhichang Yu
3f805a64f1 feat(agent): align Go agent behavior with Python (except retrieval component) (#16225)
## Summary

Aligns the **Go agent runtime/canvas/components/tools** behavior with
the **Python `agent/` implementation** so the same stored canvas DSL
produces the same execution result on either side. Every component,
tool, and runtime primitive in `internal/agent/` is now driven by the
same semantics as its Python counterpart — variable resolution, template
substitution, control flow, error reporting, retry/cancel, and stream
event shapes.

The **retrieval component is the one explicit exception** in this PR. It
is being reworked in a separate change and is excluded from this
alignment pass; the wrapper slot (`universe_a_wrappers.go →
newRetrievalComponent`) is preserved.

## Scope of alignment

### Components (all aligned with `agent/component/`)
`Begin` · `Message` · `LLM` (incl. ChatTemplateKwargs,
MessageHistoryWindowSize, VisualFiles, Cite, OutputStructure,
JSONOutput, TopP, MaxRetries, DelayAfterError, credentials) · `Agent`
(react + tool artifact capture + `Reset()` interface-assert) · `Switch`
(12/12 operators, Python-equivalent semantics) · `Categorize` · `Invoke`
· `Iteration` · `Loop` (macro-expansion through `workflowx.AddLoopNode`)
· `UserFillUp` (Python-equivalent interrupt/resume via eino
`compose.Interrupt`/`ResumeWithData`) · `FillUp` · `DataOperations` ·
`ListOperations` · `StringTransform` · `VariableAggregator` ·
`VariableAssigner` · `Browser` (full stagehand runtime parity) ·
`DocsGenerator` · `ExcelProcessor`.

### Tools (all aligned with `agent/tools/`)
`Retrieval` (wrapper slot only — logic out of scope) · `MCPToolAdapter`
(streamable-HTTP) · `CodeExec` (sandbox bridge with
`code_exec_contract.go` matching Python contract) · `AkShare` · `ArXiv`
· `Crawler` · `DeepL` · `DuckDuckGo` · `Email` · `ExeSQL` · `GitHub` ·
`Google` · `GoogleScholar` · `Jin10` · `PubMed` · `QWeather` · `SearXNG`
· `Tavily` · `Tushare` · `Wencai` · `Wikipedia` · `YahooFinance` —
uniform `eino tool.InvokableTool` interface, SSRF protection, shared
HTTP client.

### Canvas execution engine (`internal/agent/canvas/`)
Aligned with Python's `agent/canvas.py`:
- **Scheduler** (`scheduler.go`): state pre/post handlers, node lambdas,
per-component timeout resolver (4-level: per-class env → per-class table
→ uniform env → 600s fallback), `legacyNoOpNames`.
- **Loop subgraph** (`loop_subgraph.go`): Python-equivalent
`AddLoopNode` macro expansion + condition translation.
- **Multibranch** (`multibranch.go`): `Switch` / `Categorize` routing
via `compose.NewGraphMultiBranch` — same branch selection semantics as
Python.
- **Parallel subgraph** (`parallel_subgraph.go`): matches Python's
parallel fan-out contract.
- **Interrupt/Resume** (`interrupt_resume.go`): `UserFillUpNodeBody` /
`IsInterruptError` / `ExtractInterruptContexts` — replaces the
deprecated Python sentinel chain with eino's native interrupt API,
preserving the same external behavior.
- **Checkpoint** (`checkpoint_store.go`): `RedisCheckPointStore`
Get/Set/Delete, with business metadata (status / canvas_id /
parent_run_id) on a parallel Redis Hash.
- **RunTracker** (`run_tracker.go`): Start / MarkSucceeded / MarkFailed
/ MarkCancelled / AttachCheckpoint — same lifecycle as the Python run
record.
- **Cancel** (`cancel.go`): Redis pub/sub watch.
- **Stream** (`stream.go`): SSE channel with `messages` / `waiting` /
`errors` / `done` events, same shape as Python's `agent.canvas.RunEvent`
payload.
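
The 4-level timeout lookup mentioned for the scheduler (per-class env, per-class table, uniform env, 600s fallback) can be sketched as below. The Go implementation is not shown in this log, so this Python version is only an illustration of the lookup order; the environment variable names and the function name are hypothetical.

```python
import os

DEFAULT_TIMEOUT_S = 600.0


def resolve_timeout(component, class_table, env=os.environ):
    """Resolve a component timeout in seconds using a 4-level lookup.

    The environment variable names used here are hypothetical.
    """
    candidates = (
        env.get(f"COMPONENT_TIMEOUT_{component.upper()}"),  # 1. per-class env
        class_table.get(component),                         # 2. per-class table
        env.get("COMPONENT_TIMEOUT"),                       # 3. uniform env
    )
    for value in candidates:
        if value is not None:
            return float(value)
    return DEFAULT_TIMEOUT_S  # 4. fallback
```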

### DSL bridge (`internal/agent/dsl/`)
- `normalize.go`: v1↔v2 collapsed into a single wire format — Python and
Go consume the same stored JSON.
- `reset.go`: per-run state reset matches Python's `Canvas.reset()`
semantics.
- Testdata mirrors Python's `agent_msg.json` / `all.json` / etc.

### Runtime (`internal/agent/runtime/`)
- `CanvasState` / `NewCanvasState` / `GetVar` / `SetVar` / `ReadVars`:
same `{{cpn_id@param}}` resolution model.
- `ResolveTemplate` (regex fast path + gonja fallback) — Python
Jinja-style semantics.
- `selector.go`, `metrics.go`, `component.go`: shared runtime contracts.
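
The regex fast path for `{{cpn_id@param}}` references can be illustrated with the sketch below. It is written in Python for consistency with the rest of the repository and is not the Go `ResolveTemplate`; the exact character set allowed in identifiers and the error raised for unknown references are assumptions.

```python
import re

# Matches references such as {{sys.query}} or {{cpn_id@param}}.
# The allowed identifier characters are an assumption for this sketch.
_REF = re.compile(r"\{\{\s*([A-Za-z0-9_.:\-]+(?:@[A-Za-z0-9_.\-]+)?)\s*\}\}")


def resolve_template(template, variables):
    """Replace each {{reference}} with its value from `variables`."""

    def replace(match):
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"unresolvable template reference: {key}")
        return str(variables[key])

    return _REF.sub(replace, template)
```

Anything the pattern does not match is left untouched, which is where a fuller template engine would take over in a fallback design.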

## Out of scope (intentionally)

- **`Retrieval` component logic** — wrapped only; full parity lands in a
follow-up PR.
- **Frontend** — only minor dsl-bridge / canvas UX fixes ride along.
- **CLI / admin / model registry** — orthogonal to agent behavior.

## How alignment is verified

`internal/service/agent_run_e2e_test.go` exercises the **full production
chain** against real Python-shaped DSL fixtures:
```
loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL →
canvas.Compile → cc.Workflow.Invoke → answer extraction
```
using in-memory SQLite + miniredis (no Docker). Covers:
- `TestRunAgent_RealCanvas_BeginMessage` — happy path, `{{sys.query}}`
resolution
- `TestRunAgent_RealCanvas_WaitForUserResume` — two-run resume cycle
(Python-equivalent)
- `TestRunAgent_RealCanvas_CompileFails` — unknown component name →
sanitized error (Python-equivalent)
- `TestRunAgent_RealCanvas_InvokeFails` — unresolvable template ref
(Python-equivalent)
- `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` —
Start→AttachCheckpoint→MarkSucceeded lifecycle

`internal/handler/agent_test.go` — SSE streaming parity (`Content-Type:
text/event-stream`, `data: {…}\n\n`, trailing `data: [DONE]\n\n`,
OpenAI-compatible non-stream `choices`).
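
The SSE framing checked by these handler tests (`data: {…}\n\n` frames followed by a trailing `data: [DONE]\n\n`) can be sketched in a few lines. The generator name `sse_frames` is hypothetical and the payload contents are placeholders.

```python
import json


def sse_frames(payloads):
    """Yield one SSE `data:` frame per payload, then the [DONE] terminator."""
    for payload in payloads:
        yield f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
    yield "data: [DONE]\n\n"
```

A response built this way would be served with `Content-Type: text/event-stream`, as noted above.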

`internal/agent/canvas/fixture_compile_test.go` + per-component tests
pin the Python-equivalent outputs.

```
go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/
```

## Design reference

`docs/develop/agent-go-port-design.md` (1329 lines, last cross-checked
2026-06-17) — module layout, per-component / per-tool inventory,
corner-case catalogue, and the actionable backlog (Section 14, including
the retrieval alignment follow-up).

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-22 11:58:29 +08:00
balibabu
a9021528c3 Fix: Lint error. (#16172)
### What problem does this PR solve?

Fix: Lint error.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-18 13:14:18 +08:00
Wang Qi
99a25dca34 Fix Chat/Search/Agent bot show image (#16152)
Fix Chat/Search/Agent bot show image
2026-06-18 09:38:31 +08:00
balibabu
3247e353c7 Fix: The .docx file is not displaying fully; the hierarchy of the pipeline created from the template is missing. (#16134)
### What problem does this PR solve?

Fix: The .docx file is not displaying fully; the hierarchy of the
pipeline created from the template is missing.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-17 16:18:47 +08:00
Rander
1235da7093 refactor(paddleocr): migrate from sync API to async Job API (#15967)
## Summary

Migrate PaddleOCR integration from the deprecated synchronous HTTP API
to the new asynchronous Job API (`submit → poll → fetch`), aligning with
PaddleOCR 3.6.0+ architecture.

## Changes

### Python (`deepdoc/parser/paddleocr_parser.py`)
- Replace synchronous `requests.post()` with async Job API flow (submit
→ poll → fetch)
- Authentication: `token {token}` → `Bearer {token}`
- File transfer: base64 JSON body → multipart file upload
- Polling: exponential backoff (initial 3s, ×1.5, max 15s, timeout
controlled by `request_timeout`)
- Result: fetch full JSONL from result URL, preserving `prunedResult`
with bbox info for crop functionality
- Rename `api_url` → `base_url` (backward compatible: `api_url` still
accepted as fallback)
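
The polling schedule described above (initial 3s, ×1.5, max 15s, bounded by a timeout) can be sketched as follows. This is an illustration of the backoff arithmetic only: `backoff_delays` and `poll_until_done` are hypothetical names, and `check` stands in for whatever call reports job status.

```python
import time


def backoff_delays(initial=3.0, factor=1.5, maximum=15.0):
    """Yield an endless sequence of delays that grows by `factor` up to `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def poll_until_done(check, timeout, sleep=time.sleep, clock=time.monotonic):
    """Call `check()` until it returns a non-None result or the timeout passes."""
    deadline = clock() + timeout
    for delay in backoff_delays():
        result = check()
        if result is not None:
            return result
        if clock() + delay > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(delay)
```

With the defaults, the first delays are 3.0, 4.5, 6.75 and 10.125 seconds, after which every delay is capped at 15 seconds.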

### Python (`rag/llm/ocr_model.py`)
- Prefer `paddleocr_base_url` / `PADDLEOCR_BASE_URL`, fallback to
`paddleocr_api_url` / `PADDLEOCR_API_URL`

### Go (`internal/entity/models/paddleocr.go`)
- Add `Client-Platform: ragflow` header to submit and poll requests
- Change polling from fixed 3s to exponential backoff (initial 3s, ×1.5,
max 15s)

### Python (`common/constants.py`)
- Add `PADDLEOCR_BASE_URL` to env keys and default config

## Backward Compatibility

- Old env var `PADDLEOCR_API_URL` still works (used as fallback)
- Frontend field `paddleocr_api_url` still works (backend reads it as
fallback)
- No user-facing configuration changes required for existing setups

## Why not use the `paddleocr` SDK package directly?

RAGFlow's `_transfer_to_sections()` relies on `prunedResult` (containing
`block_bbox`, `block_label`, `parsing_res_list`) from the raw API
response for PDF crop functionality. The SDK's public `parse_document()`
API only returns `DocParsingResult` with `markdown_text`, discarding the
bbox data. Therefore we implement the async Job API flow directly via
HTTP, following the same logic as the SDK internally.
2026-06-16 19:34:21 +08:00
chanx
cac87d7f77 fix: remove unnecessary 'asChild' prop from FilterButton component (#16094)
### What problem does this PR solve?

fix: remove unnecessary 'asChild' prop from FilterButton component

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-16 17:55:04 +08:00
balibabu
ba93ac3bd7 Feat: Move less important chat settings into a collapsible panel. (#16024)
### What problem does this PR solve?

Feat: Move less important chat settings into a collapsible panel.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-06-15 19:09:19 +08:00
balibabu
fa6d29603a Fix: Adjust chat line height. (#16021)
### What problem does this PR solve?

Fix: Adjust chat line height.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-15 16:53:45 +08:00
buua436
400dfd50d8 feat: add custom value support for s3 region (#15968)
### What problem does this PR solve?
Allow S3-compatible data source region fields to accept custom values
while preserving search-and-select behavior.

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2026-06-15 11:40:28 +08:00
Zhichang Yu
3fa15c0e2f feat(agent): Go port — canvas engine, 22 components, DSL v2, 13 endpoints (#15952)
Ports the agent canvas subsystem from Python to Go.

## What's included

### Canvas Engine (Phase 0/1)
- State engine, scheduler, variable resolver, Redis checkpoint store,
cancel protocol
- **209 tests** across canvas / component / io packages

### 22 Components (P0–P4)
| Tier | Components |
|---|---|
| P0 T1+T2+T3 | LLM, Agent, ExitLoop, Switch, Categorize, Begin, Message, Invoke |
| P1 T3 | VariableAggregator, VariableAssigner, StringTransform, ListOperations, DataOperations |
| P2 T3 | Iteration, IterationItem, Loop, LoopItem |
| P3 T3 | UserFillUp, Fillup |
| P4 T5 | Browser, ExcelProcessor, DocsGenerator |

### DSL v2 Schema (Phase 2.5)
- Typed v2 in-memory model with v1-to-v2 auto-detect converter
- v1 legacy field stripping per plan §2.11.7

### HTTP Endpoints & Bug Fixes (Plans PR1–PR3)
- **DELETE SQL bug fix**: gorm v2 `Where("id = ?", id).Delete(...)`
pattern
- **CreateAgent validation**: title/DSL required, duplicate check, 103
envelope
- **13 new endpoints**: templates, prompts, tags, sessions CRUD,
chat/completions (SSE + non-stream stubs), rerun, test_db_connection,
logs, webhook/logs
- **756 Go unit tests** (745 → 756, +18)
- **17 → 0 Python integration test failures** (test_agents.py +
test_session_management/)

### Tools
21 eino tools: HTTPHelper, search tools, financial/data tools, mandatory
stubs

### Infrastructure
OTel observability, NATS message queue, DeepDoc gRPC client, SSRF
guards, IDOR mitigation
2026-06-12 22:58:28 +08:00
balibabu
89aac82663 Fix: chat/agent -- Default avatar is not displaying correctly. (#15948)
### What problem does this PR solve?

Fix: chat/agent -- Default avatar is not displaying correctly.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-12 17:54:36 +08:00
Yingfeng
bae8c6f109 Improve docx preview (#15907) 2026-06-11 20:43:58 +08:00
balibabu
70ae25fc7b Fix: Remove the pagination from the search and retrieval pages. (#15942)
### What problem does this PR solve?

Fix: Remove the pagination from the search and retrieval pages.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-11 16:36:05 +08:00
monsterDavid
a851228ded fix(preview): authenticate markdown document preview requests (#15589)
## Summary

Fixes [#15585](https://github.com/infiniflow/ragflow/issues/15585).

- Route markdown preview through the shared `request` client (same as
txt/image previewers) so `Authorization` headers and interceptors are
applied consistently.
- Add a unit test covering `AUTH_BETA` token loading for embedded search
auth.

## Root cause

Search result preview for `.md`/`.mdx` used raw `fetch`, which did not
apply the same auth path as other preview types. That led to `401` on
`GET /api/v1/documents/{id}/preview` even when the user was logged in or
using an embedded search `auth` query param.

## Test plan

- [ ] Log in, run a search, open a markdown citation link — preview
loads (no 401).
- [ ] Open an embedded shared search URL with `auth` query param,
preview a markdown file — preview loads.
- [ ] Confirm PDF/txt preview still works in the same search UI.

---------

Co-authored-by: MkDev11 <89318445+bitloi@users.noreply.github.com>
Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-11 15:46:20 +08:00
chanx
84482762d5 feat: support custom editing for model list (#15855)
### What problem does this PR solve?

feat: support custom editing for model list

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-06-09 19:24:43 +08:00
balibabu
d025e18176 Fix: Add a waiting status to the messages on the chat page. (#15773)
### What problem does this PR solve?

Fix: Add a waiting status to the messages on the chat page.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-08 19:17:00 +08:00
chanx
7dd4030986 fix: Resolve error when checking pipeline parsing result (#15778)
### What problem does this PR solve?

fix: Resolve error when checking pipeline parsing result

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-08 19:16:21 +08:00
chanx
2bd8900638 Fix: Model provider bugs (#15770)
### What problem does this PR solve?

Fix: Model provider bugs

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-08 17:04:05 +08:00
chanx
144abbe2eb feat: Unify the 'Add Model Provider' modal (#15768)
### What problem does this PR solve?

feat: Unify the 'Add Model Provider' modal

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-06-08 16:46:52 +08:00
balibabu
9c32b73cf7 Fix: The embedded website floating component on the agent page does not display citations. (#15767)
### What problem does this PR solve?

Fix: The embedded website floating component on the agent page does not
display citations.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-08 16:09:36 +08:00
balibabu
9c14e3f377 Fix: When adding a chat in the main interface, a warning will automatically pop up (#15685)
### What problem does this PR solve?

Fix: When adding a chat in the main interface, a warning will
automatically pop up (even if embedding and LLM model have already been
configured).
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-05 19:09:22 +08:00
chanx
a678ed7b1f Fix: Switching pagesize on a chunk page did not reset the current page. (#15401)
### What problem does this PR solve?

Fix: Switching pagesize on a chunk page did not reset the current page.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-03 15:57:57 +08:00
Julian
33ef724b5f Add Bulk action for linking Multiple Files to Datasets (#14960)
### What problem does this PR solve?

Feature: #14961 


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-06-02 12:23:33 +08:00
balibabu
f194e8b4c4 Fix: The newly added model did not appear in the drop-down menu. (#15476)
### What problem does this PR solve?

Fix: The newly added model did not appear in the drop-down menu.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-01 17:56:41 +08:00
Lynn
dc4b82523b Feat: tenant llm provider (#14595)
### What problem does this PR solve?

Python implementation of the Go-based model_provider API suite.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: bill <yibie_jingnian@163.com>
2026-05-29 17:39:41 +08:00
balibabu
187dc8a1e6 Fix: The Creativity parameter of chat was not saved. (#15243)
### What problem does this PR solve?

Fix: The Creativity parameter of chat was not saved.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-27 11:02:30 +08:00
chanx
bce11527c3 Fix: Fixed metadata issue (#15226)
### What problem does this PR solve?

Fix: Fixed metadata issue

- The dataset's built-in metadata is now active, but it appears to be
disabled in the individual file configuration.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-26 13:16:15 +08:00
balibabu
c7c75c0a87 Feat: Enable agent messages to display base64 images (#15212)
### What problem does this PR solve?

Feat: Enable agent messages to display base64 images

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-05-25 19:02:03 +08:00
balibabu
0f92353bd9 Fix: Replace the red highlight at the top of the PDF document with yellow. (#15203)
### What problem does this PR solve?

Fix: Replace the red highlight at the top of the PDF document with
yellow.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-05-25 17:21:36 +08:00
Ahmad Intisar
e6068a7f7e Fix: table parser metadata (#15127)
### What problem does this PR solve?

This PR improves the table upload flow for CSV/Excel files by allowing
table column role configuration at upload time.

Previously, users had to:
1. Upload and parse a table file.
2. Open parser settings and manually set table column roles.
3. Re-parse the file for the roles to take effect.

This was inefficient and required an unnecessary second parse.

With this change:
1. When the knowledge base uses table parsing, the upload dialog
extracts CSV/Excel headers client-side.
2. Users can choose Auto mode or Manual mode.
3. In Manual mode, users can assign per-column roles before upload.
4. The selected parser config is sent with the upload request and
applied server-side during document creation.

Result: configured table column roles are applied from the first parse.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Ahmad Intisar <ahmadintisar@Ahmads-MacBook-M4-Pro.local>
2026-05-25 16:05:38 +08:00
buua436
71a52d579c fix: move agent attachment download api (#15146)
### What problem does this PR solve?

Move the agent attachment download API to the correct route and update
frontend callers.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Notes

- Move the attachment download endpoint from document routes to agent
routes.
- Update frontend download callers to use the agent attachment endpoint.
- Reuse the shared file response header helper instead of duplicating it
in `agent_api.py`.
2026-05-22 15:22:05 +08:00