## Problem
When building or updating a knowledge graph with a large number of
entities and edges, `set_graph()` in `rag/graphrag/utils.py` creates one
`asyncio` task per entity and one per edge, each calling
`embd_mdl.encode([single_name])` — a single-item HTTP request to the
embedding server.
For a graph with 17,000+ nodes and edges (real case reported in #16205),
this generates **34,000+ individual embedding API round-trips** instead
of ~266 batched calls at the default `_INSERT_BULK_SIZE=128`. The
asyncio gather over thousands of tasks makes the embedding server the
bottleneck; under load, a single slow/failed call aborts all remaining
tasks, causing the pipeline to stall and never complete.
Closes#16205. Related: #15921.
## Root Cause
```python
# Before (in set_graph, node loop):
tasks = [asyncio.create_task(graph_node_to_chunk(n, ...)) for n in nodes]
# Each task calls embd_mdl.encode([single_name]) — 1 HTTP call per node
```
`graph_node_to_chunk` checks the embed cache first, but the cache is
cold on first build, so every task makes a live API call.
## Fix
Pre-warm the embedding cache with batched calls before spawning tasks.
Each batch pre-warm calls `embd_mdl.encode(batch_of_128)` once,
populating the cache. Then every individual task hits the cache and
makes zero embedding API calls.
- Only encodes names not already in cache (no-op on warm cache / small
incremental updates)
- Uses existing project idioms: `thread_pool_exec`, `chat_limiter`,
`_INSERT_BULK_SIZE`, `get_embed_cache`, `set_embed_cache`
- Mirrors the `ENABLE_TIMEOUT_ASSERTION` timeout pattern from
`graph_node_to_chunk`
- Zero behavior change: per-task encode logic remains as a correct
fallback
## Result
| Graph size | Before | After |
|---|---|---|
| 17,576 edges | ~17,576 embedding calls → stall | ~138 batched calls |
| 17,509 nodes | ~17,509 embedding calls → stall | ~137 batched calls |
| **Total** | **~35,000 calls** | **~275 calls** |
---------
Co-authored-by: Oti_B <oti@mac.speedport.ip>
### Summary
The addition of the Context method to Go's testing.T provides
significant improvements for writing concurrent tests. It allows better
management of goroutines, ensuring they properly exit and preventing
issues like deadlocks and unfinished processes.
By using Context, errors and cancellations can be handled more
effectively, making tests more robust and easier to reason about. This
change also enables tighter integration between tests and the
application code, especially for systems that span multiple concurrent
components. Overall, it simplifies test code and enhances test stability
and maintainability.
More info: [golang/go#18368](https://github.com/golang/go/issues/18368)
Signed-off-by: blackflytech <blackflytech@outlook.com>
### Summary
`truncateText` in the `reduction` and `summarization` middlewares
truncates with `s[:maxLen]`, which slices by byte. When `maxLen` lands
inside a multi-byte character (common with CJK or other non-ASCII
content flowing through the agent), the string is cut mid-rune and the
tail byte(s) become invalid UTF-8. That broken text then goes into the
reduced context / summary prompt.
`TruncateToolResult` in the same `reduction` package already avoids this
by slicing on a rune boundary and even notes it in a comment. This PR
makes the two `truncateText` helpers do the same, so they stay
consistent with the existing helper.
Both functions keep their existing output shape (summarization still
appends `...`). Added a small unit test in each package covering ASCII
truncation and a CJK string, asserting the result stays valid UTF-8.
## Summary
TuShare required non-empty upstream input but filtered fetched news with
the static `keyword` param (default empty string), so agent-provided
keywords were ignored.
Use `self._param.keyword or ans` when filtering, matching how AkShare
uses upstream input for its query.
Fixes#16360
## Test plan
- [x] `test_tushare_filters_with_upstream_keyword_when_param_empty`
mocks the API and asserts only rows matching the upstream keyword are
returned
---------
Co-authored-by: yzc <yuzhichang@gmail.com>
Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
### What problem does this PR solve?
Closes#16418.
`scholarly.search_pubs(...)` returns a **lazy generator**, but
`agent/tools/googlescholar.py` treated it as a re-iterable, bounded
list:
```python
scholar_client = scholarly.search_pubs(kwargs["query"], ...) # lazy generator
self._retrieve_chunks(scholar_client, ...) # (1) iterates -> exhausts it
self.set_output("json", list(scholar_client)) # (2) already empty -> []
```
1. **`json` output was always empty.** `_retrieve_chunks` iterates
`scholar_client`, exhausting the generator; `list(scholar_client)` then
returns `[]`.
2. **`top_n` was never applied.** Unlike `ArXiv`
(`max_results=self._param.top_n`), the unbounded generator was passed
straight to `_retrieve_chunks`, which has no internal limit — so the
tool kept paginating well past Top N (until an error, rate-limit/block,
or `COMPONENT_EXEC_TIMEOUT`).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes
- Materialize at most `top_n` results once with `itertools.islice`, and
reuse that list for both `_retrieve_chunks` and the `json` output.
- Add regression tests
(`test/unit_test/agent/component/test_googlescholar.py`, stubbing
`scholarly.search_pubs`) covering the `top_n` bound, the non-empty
`json` output, and the empty-query short-circuit.
Verified: against `main` the new tests fail with `assert 30 == 5` (top_n
ignored) and `assert 0 == 5` (empty json); with this fix all pass.
Backend-only.
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
## Summary
Add two new MCP tools to the RAGFlow MCP server:
1. **ragflow_list_datasets** - List all accessible datasets with IDs,
names, descriptions
2. **ragflow_list_chats** - List all accessible chat assistants with
IDs, names, descriptions
### Implementation
- Added `list_chats()` method to `RAGFlowConnector`
- Registered both tools in `list_tools()` and `call_tool()`
- Follows existing `ragflow_retrieval` pattern for error handling
### Usage via langchain-mcp-adapters
---------
Co-authored-by: saltsalt123 <saltsalt123@users.noreply.github.com>
Co-authored-by: yzc <yuzhichang@gmail.com>
### What problem does this PR solve?
Closes https://github.com/infiniflow/ragflow/issues/14571.
Adds CAJAL as a first-class local scientific-writing option in RAGFlow:
- registers `agnuxo/cajal-4b-p2pclaw` as a known Ollama chat model with
a 32K context setting
- adds a built-in “CAJAL scientific paper agent” template under the
existing agent template catalog
- preconfigures the agent for grounded scientific writing: retrieval
first, citation traceability, LaTeX-ready output, and explicit
limitations when evidence is missing
- adds unit coverage to ensure the template normalizes through RAGFlow’s
production template loader, keeps graph form data in sync, and exposes
the Ollama model option
Behavior/evidence gathered for the requested model:
- Hugging Face model metadata for `Agnuxo/CAJAL-4B-P2PCLAW` reports
`pipeline_tag=text-generation` and tags including `gguf`, `llama.cpp`,
`vllm`, `scientific-research`, `papers`, `academic-writing`, `latex`,
and `license:apache-2.0`.
- The model card documents CAJAL as a 4B scientific paper generation
model with 32K context, local inference, LaTeX/citation specialization,
and CPU-only support around 5 tok/s on Ryzen 7 5800X.
- Local CPU generation could not be completed on this machine because
the advertised Ollama model name is not currently resolvable from
Ollama’s registry: both
`https://registry.ollama.ai/v2/agnuxo/cajal-4b-p2pclaw/manifests/latest`
and
`https://registry.ollama.ai/v2/library/agnuxo/cajal-4b-p2pclaw/manifests/latest`
returned `404 Not Found`; the Hugging Face repo tree currently exposes
an 8.4 GB `model.safetensors` but no GGUF artifact in `main`. The
template therefore targets the documented Ollama model name for users
who have the local CAJAL deployment/model file available.
Verification run locally:
```bash
python3 -m pytest test/test_cajal_template_unit.py -q
# 3 passed in 0.34s
python3 - <<'PY'
import json, glob
for f in sorted(glob.glob('agent/templates/*.json') + ['conf/llm_factories.json']):
with open(f, encoding='utf-8') as fp: json.load(fp)
print('json_ok')
PY
# json_ok
python3 -m ruff check test/test_cajal_template_unit.py
# All checks passed!
git diff --check
```
`uv run pytest
test/testcases/test_web_api/test_agent_app/test_cajal_template_unit.py
-q` was also attempted first, but dependency setup failed before test
collection while building `ormsgpack==1.5.0` from uv with a package
metadata parse error. Clearing uv’s `ormsgpack` cache and retrying
reproduced the same build failure, so the focused unit test was run with
the system Python environment instead.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: yzc <yzc@users.noreply.github.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
…rialization
Agent components (llm.py, agent_with_tools.py, message.py) store
functools.partial objects as deferred streaming handles in their output
slots. When the canvas state gets serialized for SSE events, Redis
commits, or logging, these partials — plus non-copyable objects like
Langfuse clients — crash json.dumps and deepcopy.
Changes:
- canvas_app.py: add default=str to json.dumps for SSE event
serialization (lines 238, 296)
- canvas.py: wrap deepcopy calls in try/except to handle non-copyable
objects (Langfuse clients, etc.), add default=str to final json.dumps
- base.py: add default=str to ComponentParamBase.__str__ to handle
non-serializable objects in component parameters
Closes#14229
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yzc <yuzhichang@gmail.com>
## Summary
- Treat `max_tokens=0` as unset (`or 8192`) when building model context
budgets, fixing agents that silently zeroed prompts when a vLLM model
had `max_tokens: 0` in tenant config
- Replace trailing same-role canvas history in `LLM._sys_prompt_and_msg`
instead of skipping the current user prompt
- Add `LLM.fit_messages()` validation after `message_fit_in` on agent
paths so empty user content fails fast with a clear error instead of
reaching vLLM
Fixes#16411
## Root cause
Agent canvas workflow called `message_fit_in` with `int(max_length *
0.97)`. When `max_length` was `0`, both system and user content were
trimmed to empty strings. The `[HISTORY STREAMLY]` log showing only
`{"role":"user","content":""}` matches this. A secondary bug skipped
appending the formatted user prompt when history ended with a `user`
role message.
## Test plan
- [x] Added `test/unit_test/agent/component/test_llm_prompt.py` for
role-replace, validation, and zero-budget fitting
- [x] Added
`test_message_fit_in_zero_budget_preserves_non_empty_messages` in
`test_generator_message_fit_in.py`
- [ ] CI unit tests
- [ ] Manual: agent canvas `begin → Retrieval → Agent → Message` with
vLLM Qwen3; confirm user message reaches LLM
Made with [Cursor](https://cursor.com)
---------
Co-authored-by: Taranum Wasu <taranumwasu@Taranums-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
Recovery PR for #16173 after the fork branch was accidentally reset
during rewrite-cleanup.
Cherry-picked onto current `main`:
- fix(common/time_utils): correct fallback timestamp and ISO-8601
normalization
- fix(common/time_utils): preserve zero timestamps and mark regression
tests
- test(common/time_utils): make fallback assertions deterministic
Supersedes closed#16173 — same branch
`Harsh23Kashyap/fix/time-utils-edgecases`, rebuilt per @yuzhichang
recovery steps in
https://github.com/infiniflow/ragflow/pull/16173#issuecomment-4829663835
---------
Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
## Summary
Two changes to make Go build \& run independent of native libraries
(office_oxide, pdfium, pdf_oxide).
## 1. Make native libraries optional (build.sh + Go source)
## 2. Roll back tests.yml CI changes from PR #16391
This PR migrates the Box web OAuth flow from Python to Go for:
- POST /api/v1/connectors/box/oauth/web/start
- GET /api/v1/connectors/box/oauth/web/callback
- POST /api/v1/connectors/box/oauth/web/result
### What problem does this PR solve?
GET /api/v1/agents (list_agents) already supports filtering by
canvas_category, keywords, tags, and owner_ids, but it does not support
canvas_type — even though canvas_type is a persisted field on UserCanvas
and is already accepted on agent create/update APIs.
This gap causes two issues:
Filtering — clients cannot list agents by business category (e.g.
Marketing, Agent, Ingestion Pipeline) without fetching all agents and
filtering client-side.
Response payload — list_agents did not return canvas_type in each canvas
item, so consumers had to call GET /api/v1/agents/{id} per agent to read
it.
This PR adds optional canvas_type query parameter support and includes
canvas_type in the list response.
### Type of change
- [√] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
## Summary
Fixes#15169 — `POST /api/v1/agents/chat/completions` returned
`data: {}` with no `session_id` when the agent produced no events
(e.g. the reporter's payload sent `"query": ""`).
## Root cause
For `{"agent_id": "...", "query": "", "stream": false}`:
1. No `session_id` in the request → new-session branch at
`agent_api.py:1278`.
2. `session_id = get_uuid()` at `agent_api.py:1294`.
3. Falls into `_run_workflow_session`.
4. `canvas.run(query="")` produces no events, so `final_ans`
stays `{}`.
5. Non-streaming path then hit:
```python
if not final_ans:
await commit_runtime_replica()
return get_result(data={})
```
`session_id` was allocated but silently dropped on the way out.
The streaming path had the same shape (only a bare `[DONE]` was
yielded — no SSE event carrying `session_id`). The
session-continuation path at `agent_api.py:1463` had the same bug
for callers that passed `session_id` and got `{}` back.
The successful (non-empty) paths were fine because every canvas
event has `ans["session_id"] = session_id` attached before being
yielded / captured into `final_ans` (see
`agent_api.py:255` and `:303`).
## Fix
Three minimal changes, all in
`api/apps/restful_apis/agent_api.py`:
1. **`_run_workflow_session` (non-streaming)**:
`return get_result(data={"session_id": session_id})` instead of
`data={}`.
2. **`_run_workflow_session` (SSE)**: if the canvas loop emits no
events, yield one
`data:{"session_id": "...", "data": {}}` event before
`[DONE]`, so the client receives the id over the wire.
3. **`agent_chat_completion` session-continuation**: echo the
caller-supplied `session_id` back in the empty-events case
instead of `{}`.
No change needed on the happy paths — they already attach
`session_id` to every event.
## Test plan
- [ ] Repro from the issue: `POST /api/v1/agents/chat/completions`
with `{"agent_id": "<id>", "query": "", "stream": false}`.
Response `data` should now contain `session_id`.
- [ ] Same payload with `"stream": true`. SSE stream should
contain one event with `session_id` before `data:[DONE]`.
- [ ] Same shape but with a real, non-empty `"query"` (new
session). Response should be unchanged from before — every
event still carries `session_id`, final response still
includes it on `final_ans`.
- [ ] Pass an existing `session_id` plus `"query": ""`. Response
should echo that `session_id` back instead of `{}`.
- [ ] Pass an existing `session_id` plus a normal query. Response
should be unchanged from before.
- [ ] `openai-compatible: true` path is untouched — sanity-check
it still works.
- [ ] Run `uv run pytest` to make sure no existing tests regress.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
## Summary
Fixes#15215 — attachments uploaded to an agent were not reaching the
LLM.
When a user uploads a file in an agent chat, `canvas.run` parses it into
the `sys.files` global (text content for documents, `data:image/...`
URIs
for images — see `agent/canvas.py:752-768`). But the LLM/Agent
component's
`_prepare_prompt_variables` only substitutes variables the user's prompt
template explicitly references via `{var}` placeholders. The default
prompt is `[{"role": "user", "content": "{sys.query}"}]` with no
`{sys.files}`, so the parsed attachment content never reaches the model.
In the reporter's logs, this is why the agent saw only the bare query
`附件 摘要 attachment summary` and went searching the dataset instead of
reading the uploaded PDF.
## Fix
`agent/component/llm.py` — added `_collect_sys_files()` and an
auto-injection step in `_prepare_prompt_variables`:
- If `sys.files` is non-empty **and** neither `sys_prompt` nor any entry
in `prompts` already contains `{sys.files}` (no double-injection),
split the entries into text vs. `data:image/...` URIs.
- Image URIs are merged into `self.imgs`, which the existing logic uses
to switch the chat model to `IMAGE2TEXT` and pass `images=...` to
`async_chat`.
- Text content is appended to the last `user` role message in `msg`,
mirroring how `dialog_service.async_chat_solo` handles attachments for
the non-agent chat path (`api/db/services/dialog_service.py:318-321`).
Both `LLM._invoke_async` and `Agent._invoke_async` (tool-using) go
through `_prepare_prompt_variables`, so plain LLM nodes and Agent nodes
are fixed in both streaming and non-streaming paths.
## Test plan
- [ ] Upload a PDF attachment to an agent with the default `{sys.query}`
prompt and ask "summarize the attachment" — the model should answer
from the file content rather than searching the knowledge base.
- [ ] Upload an image attachment to an agent and ask about its contents
—
the model should switch to the vision-capable LLM and answer from
the image.
- [ ] Verify that an agent whose prompt **does** include `{sys.files}`
still works and does **not** include the file content twice.
- [ ] Verify that an agent run with no attachments behaves unchanged.
- [ ] Run `uv run pytest` to make sure no existing tests regress.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: yzc <yuzhichang@gmail.com>
### Summary
```
RAGFlow(admin)> show users plan summary;
+---------+----------------------------------------------------------------+
| field | value |
+---------+----------------------------------------------------------------+
| command | show_users_plan_summary |
| error | 'Show users plan summary' is implemented in enterprise edition |
+---------+----------------------------------------------------------------+
```
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR adds Google BigQuery as a first-class data source connector in
RAGFlow.
It enables users to ingest and sync BigQuery data using the same
row-to-document model used by relational database connectors: selected
content columns become document text, metadata columns become document
metadata, an optional ID column provides stable document IDs, and an
optional timestamp column enables cursor-based incremental sync.
The connector supports service-account JSON credentials, table mode,
custom query mode, GoogleSQL queries, cursor-based incremental sync,
deleted-row pruning support, configurable query limits such as
`maximum_bytes_billed`, dry-run validation, batch loading, stable
document IDs, and BigQuery-aware value serialization.
### Summary
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)