7100 Commits

Author SHA1 Message Date
Hz_
fa1b52ca74 fix(go): prevent moving folders into themselves (#16522) 2026-07-02 15:45:30 +08:00
maoyifeng
404ef4ce87 workflow steps separated to go or python (#16561)
Add a new workflow YAML file with steps separated into Go or Python.
2026-07-02 15:02:11 +08:00
Jin Hai
0b9ab12c58 Go: fix lint (#16533)
### Summary

as title.

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-02 13:44:05 +08:00
grandpig
17e3e34e78 refactor: use WaitGroup.Go to simplify code (#16539)
### Summary

Adopt sync.WaitGroup.Go (Go 1.25) to simplify tracked goroutine
spawning. This replaces the error-prone trio of wg.Add(1), go func(),
and defer wg.Done() with a single, self-contained call.

More info: https://github.com/golang/go/issues/63796

Signed-off-by: grandpig <grandpig@outlook.com>
2026-07-02 13:41:53 +08:00
Hz_
d0d0339428 fix(go): agent settings update clearing DSL (#16495)
### Summary

This PR fixes a Go backend bug where updating agent settings, such as
description, could clear the agent DSL.

Root cause:
PUT /api/v1/agents/:canvas_id only bound the dsl field in Go. When the
frontend submitted settings without dsl, the service still updated the
canvas with an empty DSL value.

Changes:

- Treat agent updates as partial patches.
- Preserve existing DSL when dsl is not present in the request.
- Update only specified user_canvas fields instead of saving the full
row.
- Add a regression test for settings updates preserving DSL.
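
The partial-patch behavior listed above can be illustrated with a small language-neutral sketch. It is written in Python rather than Go and is not the PR's code: `PATCHABLE_FIELDS`, `build_update`, and `apply_update` are hypothetical names, and `title` is an assumed field (only `dsl` and `description` appear in the PR text).

```python
# Hypothetical field list; only "dsl" and "description" come from the PR text.
PATCHABLE_FIELDS = ("title", "description", "dsl")


def build_update(request_body):
    """Return only the fields the client actually sent."""
    return {k: request_body[k] for k in PATCHABLE_FIELDS if k in request_body}


def apply_update(row, request_body):
    """Merge the partial update into a copy of the stored row."""
    updated = dict(row)
    updated.update(build_update(request_body))
    return updated
```

The key point is the `if k in request_body` check: a field that is absent from the request is never written, so a settings-only update leaves the stored DSL alone.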

Test:

`go test ./internal/service ./internal/handler`

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-02 13:41:24 +08:00
Hz_
a67026f714 fix(go): agent explore thumbnail loading for multiple doc_ids (#16514)
## Summary
- align the Go `/api/v1/thumbnails` endpoint with the frontend request
format for repeated `doc_ids`
- return thumbnail mappings for multiple documents instead of failing on
a single missing document
- preserve Python-compatible thumbnail formatting, including base64
thumbnail passthrough
2026-07-02 12:35:10 +08:00
Hz_
cb8012e30b fix(go): accept disabled chunk filter in list chunks handler (#16532)
### Summary

Fixes a bug in the Go chunk list handler where the `available` query
parser rejected `false` and `0` even though they were documented as
supported values.

This caused requests from the "Disabled" chunk filter to return HTTP 400
and broke the chunk list page when filtering disabled chunks.
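
A tolerant parser for this kind of flag might look like the sketch below. It is a Python illustration, not the Go handler: `parse_available` is a hypothetical name, and accepting `true`/`1` alongside `false`/`0` is an assumption about the documented value set.

```python
def parse_available(raw):
    """Return True/False for a recognized value, or None when the filter is absent."""
    if raw is None or raw == "":
        return None  # no filter requested
    value = raw.strip().lower()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    # Anything else is a client error (the handler would answer HTTP 400).
    raise ValueError(f"invalid value for 'available': {raw!r}")
```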
2026-07-02 12:07:19 +08:00
Haruko386
b4825166a7 fix: JSONMap scan in dataset index chunking config (#16489)
### Summary

As title

This PR fixes dataset index task creation failing with unsupported data
type: entity.JSONMap when loading document chunking config.

#### issues:
```
2026/06/30 15:19:40 /home/infiniflow/Documents/development/ragflow/internal/dao/document.go:162 
[error] unsupported data type: ragflow/internal/entity.JSONMap
```

#### Changes:
+ Adds the missing GORM type:longtext tag to ParserConfig in
DocumentDAO.GetChunkingConfig.
+ Adds a DAO regression test covering GetChunkingConfig joins across
document, knowledgebase, and tenant while scanning parser_config.
2026-07-02 12:06:53 +08:00
Haruko386
d6b1c5937b fix: get duplicate datasetID when get-Chat (#16498)
### Summary

As title

```go
	// Resolve kb_ids to kb_names
	kbNames, datasetIDs := s.getDatasetNamesAndIDs(chat.KBIDs)

	// Duplicate append of dataset IDs (removed by this PR)
	for _, kbID := range chat.KBIDs {
		datasetID, ok := kbID.(string)
		if !ok {
			continue
		}
		datasetIDs = append(datasetIDs, datasetID)
	}
```
2026-07-02 12:06:29 +08:00
Haruko386
ee45c97b0b fix: unable to add metadata for file in kb (#16523)
### Summary

As title

Previously, it returned `update success` but never inserted or updated
any metadata.

fixed:

```go
	_, err = s.docEngine.InsertMetadata(nil, []map[string]interface{}{
		{
			"id":          docID,
			"kb_id":       doc.KbID,
			"meta_fields": meta,
		},
	}, tenantID)
```
2026-07-02 12:06:05 +08:00
Br1an
27c9a093bd Fix: close MCP sessions after canvas execution to prevent connection leaks (#13295)
### What problem does this PR solve?

Closes #12962

MCPToolCallSessions created during agent execution (in `Agent.__init__`)
are never explicitly closed. Each session starts its own event loop
thread and opens an SSE/HTTP connection to the MCP server. When the
canvas goes out of scope, these threads and connections remain alive
indefinitely, accumulating over time and causing resource exhaustion
after prolonged use.

### Solution

1. Add a `Graph.close()` method that iterates all components, finds
MCPToolCallSessions held by Agent tools, and calls `close_sync()` on
each to properly shut down the event loop, thread, and connection.
2. Call `canvas.close()` in `finally` blocks after `canvas.run()`
completes in `canvas_service.py` and `canvas_app.py`.
3. Move MCP session cleanup to `finally` blocks in `test_tool` endpoint
(`mcp_server_app.py`) and `get_mcp_tools` (`api_utils.py`) to ensure
sessions are closed even on exceptions.
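
The cleanup pattern in the steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `FakeSession`, `FakeCanvas`, and `run_canvas` are hypothetical stand-ins, and only the method names `run()`, `close()`, and `close_sync()` are taken from the description.

```python
class FakeSession:
    """Hypothetical stand-in for an MCP tool-call session."""

    def __init__(self):
        self.closed = False

    def close_sync(self):
        # A real session would stop its event loop thread and connection here.
        self.closed = True


class FakeCanvas:
    """Hypothetical stand-in for a canvas that owns sessions."""

    def __init__(self, sessions, fail=False):
        self.sessions = sessions
        self.fail = fail

    def run(self):
        if self.fail:
            raise RuntimeError("component failed")
        return "done"

    def close(self):
        # Close every session; one failing close must not block the others.
        for session in self.sessions:
            try:
                session.close_sync()
            except Exception:
                pass


def run_canvas(canvas):
    try:
        return canvas.run()
    finally:
        # Runs on success and on exception alike.
        canvas.close()
```

Putting `close()` in `finally` is what guarantees cleanup on the error path, which is where leaked threads and connections tend to accumulate.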

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: conflict-resolver <conflict-resolver@local>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-02 10:57:24 +08:00
Zhichang Yu
ba552f64b9 Stabilize timeout tests with semantic assertions (#16537)
Replace fragile wall-clock timeout assertions with semantic checks for
deadline errors, retry suppression, and event ordering. Keep only
lower-bound timing checks where they prove backoff behavior. This
reduces CPU-load flakes without weakening regression coverage.
2026-07-02 10:56:38 +08:00
euvre
3195d6fa89 fix: improve Normal role badge visibility with proper styling (#16528) 2026-07-02 10:47:01 +08:00
Wang Qi
7abc69434f [Go] Fix to allow duplicate key for provider (#16543) 2026-07-02 10:34:36 +08:00
Hz_
9b83d0f154 fix(go): document count in kb (#16490)
### Summary
This PR fixes incorrect dataset document counters in the Go service.

Several document creation paths inserted document records directly
through documentDAO.Create, bypassing the shared InsertDocument logic
that increments knowledgebase.doc_num. As a result, datasets could
contain documents while doc_num remained 0.
2026-07-02 10:34:14 +08:00
Hz_
0de69e5bba feat(go-api) sessions message update (#16517)
### Summary
```
/api/v1/chats/<chat_id>/sessions/<session_id>/messages/<msg_id> DELETE
/api/v1/chats/<chat_id>/sessions/<session_id>/messages/<msg_id>/feedback PUT
```
Migrates the chat session message delete and feedback APIs to the Go
server, matching the Python behavior for authorization, session
ownership checks, message/reference updates, and feedback validation.
2026-07-02 10:33:27 +08:00
Jack
5bc4753d1e Feat/oss parser no post (#16464)
### Summary

Remove dead code
2026-07-02 09:46:33 +08:00
qinling0210
133b1e15fd Port agent PRs to GO (#16529)
### Summary

Port the following PRs to GO in this PR

https://github.com/infiniflow/ragflow/pull/14210
https://github.com/infiniflow/ragflow/pull/14641
https://github.com/infiniflow/ragflow/pull/15220
https://github.com/infiniflow/ragflow/pull/15228
https://github.com/infiniflow/ragflow/pull/15384
https://github.com/infiniflow/ragflow/pull/15754
https://github.com/infiniflow/ragflow/pull/16413
https://github.com/infiniflow/ragflow/pull/16483
https://github.com/infiniflow/ragflow/pull/16419
https://github.com/infiniflow/ragflow/pull/16361
https://github.com/infiniflow/ragflow/pull/16050
2026-07-02 09:45:01 +08:00
Öndery
742188c3bb feat(agent): report accurate aggregated token usage and propagate session/user + input/output to Langfuse for agent runs (#16420)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe):

## Summary

Agent (Canvas) runs previously did not surface token usage in the SSE
stream, and RAGFlow's own Langfuse generations for agent runs were
missing the prompt/completion split and the session/user correlation.
This made it impossible for an external caller (or Langfuse) to
reconcile an agent turn's cost with the upstream provider (e.g.
OpenRouter), because a single turn can issue several distinct LLM calls
(query rewriting / cross-language translation, multi-round tool
reasoning, nested sub-agents, and the final answer).

This PR introduces a per-run token usage sink so that **every** LLM call
in a run is aggregated and reported once, and enriches Langfuse
generations with the prompt/completion split plus session/user
attributes.

## What changes

### 1. Per-run token usage sink (`common/token_utils.py`)

- Adds two `contextvars`: `token_usage_sink` (a mutable per-run
accumulator) and `langfuse_run_attrs` (session_id/user_id for the run).
- Adds `record_run_token_usage(...)` (thread-safe via a lock, because
`thread_pool_exec` copies the context into worker threads that share the
sink dict) and `usage_from_response(...)` which extracts a
`{prompt_tokens, completion_tokens, total_tokens}` split from
OpenAI/OpenRouter-style responses.

### 2. Provider layer captures the prompt/completion split
(`rag/llm/chat_model.py`)

- `LiteLLMBase` and `Base` now store `self.last_usage`
(prompt/completion/total) for the most recent chat call, in both the
plain and tool-calling paths.
- Streaming requests set `stream_options.include_usage = True` (LiteLLM
path) so the authoritative usage arrives on the final chunk; this is
read even on the usage-only chunk that carries no `choices`.
- Fixes a multi-round accounting bug in `*_with_tools`: token totals
were **overwritten** by each round (`total_tokens = tol`) instead of
accumulated, undercounting multi-round tool conversations. Each round is
now committed to a running aggregate.

### 3. LLMBundle reports usage once, per call
(`api/db/services/llm_service.py`)

- New `_report_usage(total_tokens)` records the call's usage into the
active run sink and returns the prompt/completion/total split for
Langfuse. The split is only used when it is consistent with the
authoritative total; otherwise only the total is reported.
- All three chat entry points (`async_chat`, `async_chat_streamly`,
`async_chat_streamly_delta`) now emit `usage_details` with
`input`/`output`/`total` instead of total-only.
- `_start_langfuse_observation` now applies `session_id`/`user_id` from
the per-run context (`langfuse_run_attrs`) so agent-run generations are
correctly grouped, even though agent LLMBundles are constructed without
those attributes.

### 4. Canvas installs the sink and emits the aggregate
(`agent/canvas.py`)

- `Canvas.run()` installs a fresh `token_usage_sink` and
`langfuse_run_attrs` (from `user_id`/`session_id`) at the start of every
turn.
- `message_end` now includes an aggregated `usage` object:
`{prompt_tokens, completion_tokens, total_tokens, calls}` covering all
LLM calls in the run.

### 5. Pass session id into the run
(`api/db/services/canvas_service.py`)

- `completion()` forwards `session_id` to `Canvas.run()` for Langfuse
session correlation.

## Why a context variable

LLM calls in an agent run originate from many places that each build
their own `LLMBundle` (e.g. `cross_languages`/`keyword_extraction`
helpers, the Agent component, and nested sub-agents invoked as tools). A
run-scoped context variable is the only non-invasive chokepoint that
captures all of them exactly once, including nested agents (which run in
the same async context) and thread-pool tools (the executor copies the
context).
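
A run-scoped sink of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the contents of `common/token_utils.py`: `record_usage` and `run_turn` are hypothetical names, the token counts are made up, and the context is copied into worker threads explicitly with `contextvars.copy_context()` instead of through the project's `thread_pool_exec`.

```python
import contextvars
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names modeled on the description above.
token_usage_sink = contextvars.ContextVar("token_usage_sink", default=None)
_sink_lock = threading.Lock()


def record_usage(prompt_tokens, completion_tokens):
    """Add one LLM call's usage to the active run sink, if there is one."""
    sink = token_usage_sink.get()
    if sink is None:
        return  # not inside a run; nothing to aggregate
    with _sink_lock:  # copied contexts still share the same dict
        sink["prompt_tokens"] += prompt_tokens
        sink["completion_tokens"] += completion_tokens
        sink["total_tokens"] += prompt_tokens + completion_tokens
        sink["calls"] += 1


def run_turn():
    """Install a fresh sink, simulate three LLM calls, return the aggregate."""
    sink = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "calls": 0}
    token = token_usage_sink.set(sink)
    try:
        record_usage(10, 5)  # e.g. a query-rewriting call on the main path
        with ThreadPoolExecutor(max_workers=2) as pool:
            # Each task runs in its own copy of the current context.
            futures = [
                pool.submit(contextvars.copy_context().run, record_usage, 20, 7)
                for _ in range(2)
            ]
            for future in futures:
                future.result()
    finally:
        token_usage_sink.reset(token)
    return sink
```

Callers never pass the sink around; any code path that ends up in `record_usage` contributes to the same per-run total, and calls made outside a run are ignored.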

## Behavior / compatibility

- No public API or wire-format removal: `message_end` gains an
additional optional `usage` field; existing consumers are unaffected.
- When a provider does not return authoritative usage, behavior falls
back to the previous token estimate (total only, no split).
- Non-agent flows (Dataflow `Pipeline`, sync `Graph.run`) are untouched.

## Testing
- [x] Simple agent answer: `message_end.usage.total_tokens` matches
provider usage.
- [x] Agent with cross-language retrieval: aggregate equals the sum of
both provider calls.
- [x] Tool-calling agent (multi-round): total accumulates across rounds.
- [x] Nested agent (agent-as-tool): sub-agent tokens included in the
parent run total.
- [x] Langfuse: agent generations show input/output split and are
grouped by session/user.

---------

Co-authored-by: yzc <yuzhichang@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-07-02 09:35:28 +08:00
Jack
42a0faad18 Fix: use .a to replace .so for pdfium/pdf_oxide/office_oxide (#16496)
### Summary

Use .a to replace .so for pdfium/pdf_oxide/office_oxide

---------

Co-authored-by: Claude <noreply@anthropic.com>
dev-20260702
2026-07-01 21:21:31 +08:00
OSHA-B
b0e6007131 perf: batch-embed entity/edge names in set_graph() to fix stall on large graphs (#16205) (#16472)
## Problem

When building or updating a knowledge graph with a large number of
entities and edges, `set_graph()` in `rag/graphrag/utils.py` creates one
`asyncio` task per entity and one per edge, each calling
`embd_mdl.encode([single_name])` — a single-item HTTP request to the
embedding server.

For a graph with 17,000+ nodes and edges (real case reported in #16205),
this generates **34,000+ individual embedding API round-trips** instead
of ~266 batched calls at the default `_INSERT_BULK_SIZE=128`. The
asyncio gather over thousands of tasks makes the embedding server the
bottleneck; under load, a single slow/failed call aborts all remaining
tasks, causing the pipeline to stall and never complete.

Closes #16205. Related: #15921.

## Root Cause

```python
# Before (in set_graph, node loop):
tasks = [asyncio.create_task(graph_node_to_chunk(n, ...)) for n in nodes]
# Each task calls embd_mdl.encode([single_name]) — 1 HTTP call per node
```

`graph_node_to_chunk` checks the embed cache first, but the cache is
cold on first build, so every task makes a live API call.

## Fix

Pre-warm the embedding cache with batched calls before spawning tasks.
Each batch pre-warm calls `embd_mdl.encode(batch_of_128)` once,
populating the cache. Then every individual task hits the cache and
makes zero embedding API calls.

- Only encodes names not already in cache (no-op on warm cache / small
incremental updates)
- Uses existing project idioms: `thread_pool_exec`, `chat_limiter`,
`_INSERT_BULK_SIZE`, `get_embed_cache`, `set_embed_cache`
- Mirrors the `ENABLE_TIMEOUT_ASSERTION` timeout pattern from
`graph_node_to_chunk`
- Zero behavior change: per-task encode logic remains as a correct
fallback
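
The pre-warm step can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: `prewarm_embed_cache` is a hypothetical name, the cache is a plain dict standing in for `get_embed_cache`/`set_embed_cache`, and `encode` is any callable that maps a list of names to a list of vectors.

```python
def prewarm_embed_cache(names, cache, encode, batch_size=128):
    """Embed uncached names in batches; return how many encode calls were made."""
    # dict.fromkeys de-duplicates while preserving order.
    missing = [n for n in dict.fromkeys(names) if n not in cache]
    calls = 0
    for start in range(0, len(missing), batch_size):
        batch = missing[start:start + batch_size]
        vectors = encode(batch)  # one request for the whole batch
        calls += 1
        for name, vector in zip(batch, vectors):
            cache[name] = vector
    return calls
```

On a warm cache `missing` is empty, so the function makes no calls at all, which matches the no-op behavior described for incremental updates.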

## Result

| Graph size | Before | After |
|---|---|---|
| 17,576 edges | ~17,576 embedding calls → stall | ~138 batched calls |
| 17,509 nodes | ~17,509 embedding calls → stall | ~137 batched calls |
| **Total** | **~35,000 calls** | **~275 calls** |

---------

Co-authored-by: Oti_B <oti@mac.speedport.ip>
2026-07-01 20:45:20 +08:00
Haruko386
4a72c973e8 fix: return call failed when LLM not available (#16518)
### Summary

As title
2026-07-01 20:10:42 +08:00
euvre
fb0376561f fix: normalize Q&A parser ID key to lowercase 'qa' (#16530) 2026-07-01 19:33:18 +08:00
Jin Hai
f4f9e4466b Go CLI: fix list provider models (#16493)
### Summary

As title.

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-07-01 19:08:17 +08:00
Haruko386
9eacbb418f fix: unable to open filter in agent page(no agent tags...) (#16531)
### Summary

As title
2026-07-01 19:07:15 +08:00
blackflytech
fcf2ca869b refactor: replace context.WithCancel with t.Context (#16509)
### Summary

The addition of the Context method to Go's testing.T provides
significant improvements for writing concurrent tests. It allows better
management of goroutines, ensuring they properly exit and preventing
issues like deadlocks and unfinished processes.

By using Context, errors and cancellations can be handled more
effectively, making tests more robust and easier to reason about. This
change also enables tighter integration between tests and the
application code, especially for systems that span multiple concurrent
components. Overall, it simplifies test code and enhances test stability
and maintainability.

More info: [golang/go#18368](https://github.com/golang/go/issues/18368)

Signed-off-by: blackflytech <blackflytech@outlook.com>
2026-07-01 18:37:11 +08:00
qinling0210
5ba25a5267 Fix GetProjectRoot in GO (#16520)
### Summary

Fix GetProjectRoot in GO
2026-07-01 18:17:53 +08:00
euvre
81cfcdf2d3 feat(frontend): add AuthenticatedImg component for authorized image requests (#16525) 2026-07-01 17:02:44 +08:00
ZF
97a4c64cc8 fix(harness): truncate text on rune boundary to keep UTF-8 valid (#16511)
### Summary

`truncateText` in the `reduction` and `summarization` middlewares
truncates with `s[:maxLen]`, which slices by byte. When `maxLen` lands
inside a multi-byte character (common with CJK or other non-ASCII
content flowing through the agent), the string is cut mid-rune and the
tail byte(s) become invalid UTF-8. That broken text then goes into the
reduced context / summary prompt.

`TruncateToolResult` in the same `reduction` package already avoids this
by slicing on a rune boundary and even notes it in a comment. This PR
makes the two `truncateText` helpers do the same, so they stay
consistent with the existing helper.

Both functions keep their existing output shape (summarization still
appends `...`). Added a small unit test in each package covering ASCII
truncation and a CJK string, asserting the result stays valid UTF-8.
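
The same idea can be shown on raw UTF-8 bytes in Python. This is an analog of the Go fix, not a translation of it: `truncate_utf8` is a hypothetical helper, and it works on `bytes` because Python `str` slicing already operates on code points.

```python
def truncate_utf8(data: bytes, max_len: int) -> bytes:
    """Cut UTF-8 bytes to at most max_len without splitting a character."""
    if len(data) <= max_len:
        return data
    end = max_len
    # Continuation bytes look like 0b10xxxxxx; back up to a character start.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]
```

A naive `data[:max_len]` can end in the middle of a multi-byte character, and decoding that slice then fails; backing up to the previous character start keeps the result decodable at the cost of returning slightly fewer bytes.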
2026-07-01 16:45:26 +08:00
Harsh Kashyap
d770217b25 fix(api): fall back to factory max_tokens for tenant models (#16364) 2026-07-01 16:00:13 +08:00
qinling0210
7862f69f39 Implement chat completions in go (#16491)
### Summary

POST /api/v1/chat/completions
2026-07-01 15:52:52 +08:00
Harsh Kashyap
b8e960e6c8 fix(qa): preserve final CSV pair row number (#16433) 2026-07-01 14:52:08 +08:00
Harsh Kashyap
b42414b64a fix(deepdoc): parse bodyless HTML fragments (#16423) 2026-07-01 14:45:22 +08:00
connerlambden
9bf57600cf feat(agent): add BGPT structured literature evidence search tool (#16050)
## Summary

Adds a first-class **BGPT** Agent tool (backend + UI) in response to
[#15997](https://github.com/infiniflow/ragflow/issues/15997#issuecomment-4703864227).

BGPT calls `POST https://bgpt.pro/api/mcp-search` and returns structured
study evidence from full-text papers — not just titles/abstracts. Each
result is formatted for RAGFlow citations with:

- methods
- sample size / population
- results
- limitations
- conflicts of interest
- data availability
- study blind spots
- `how_to_falsify`

## Why this shape

- Mirrors existing literature tools (`PubMed`, `ArXiv`) and HTTP tools
(`SearXNG`).
- Works on the free tier (no API key required for first 50 results).
- Optional `api_key` and `days_back` in the node/tool config.
- Surfaces both `formalized_content` and raw `json` outputs (like
SearXNG).

## Files

- `agent/tools/bgpt.py` — REST client + evidence formatter
- Frontend: Operator enum, forms, tool picker, canvas accordion, en/zh
locales, icon

## Demo / docs

Runnable claim-interrogation demo:
https://github.com/connerlambden/bgpt-mcp/blob/main/EVIDENCE_DEMO.md

## Test plan

- [ ] Add BGPT node on Agent canvas, run query `GLP-1 alcohol craving`,
verify `formalized_content` includes limitations/COI fields
- [ ] Add BGPT as Agent sub-tool under Search, verify tool-calling works
- [ ] Confirm empty query / try-run returns gracefully
- [ ] Optional: paid-tier `api_key` path

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-01 13:52:24 +08:00
Harsh Kashyap
508f6226f8 fix(agent): filter TuShare news with upstream keyword input (#16361)
## Summary

TuShare required non-empty upstream input but filtered fetched news with
the static `keyword` param (default empty string), so agent-provided
keywords were ignored.

Use `self._param.keyword or ans` when filtering, matching how AkShare
uses upstream input for its query.
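
The fallback can be illustrated with a short sketch. `filter_news` and the `content` field are hypothetical; only the `keyword or ans` expression mirrors the description.

```python
def filter_news(rows, param_keyword, upstream_input):
    """Keep rows whose text contains the configured or upstream keyword."""
    # An empty static keyword is falsy, so the upstream input takes over.
    keyword = param_keyword or upstream_input
    return [row for row in rows if keyword in row["content"]]
```

Because `or` returns the first truthy operand, an explicitly configured keyword still wins over the upstream input when both are present.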

Fixes #16360

## Test plan

- [x] `test_tushare_filters_with_upstream_keyword_when_param_empty`
mocks the API and asserts only rows matching the upstream keyword are
returned

---------

Co-authored-by: yzc <yuzhichang@gmail.com>
Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
2026-07-01 13:51:39 +08:00
Harsh Kashyap
572f1ea9f4 fix(web): sanitize agent rerun modal HTML against stored XSS (#16516) 2026-07-01 13:38:31 +08:00
Lynn
400476f0b3 Feat: SoMark (#16482)
Follow #15486
Co-authored-by: limuting <limuting233@gmail.com>
Co-authored-by: lutianyi <lutianyi233@163.com>
Co-authored-by: justinychuang <huangyicheng@soulcode.cn>
Co-authored-by: maybehokori <138367708+maybehokori@users.noreply.github.com>
2026-07-01 13:29:28 +08:00
Lynn
b6fa5ce4ea Fix: ollama provider (#16519) 2026-07-01 13:24:31 +08:00
Wang Qi
8f24b30652 [Go] Add API /api/v1/chat/recommendation and consolidate with /api/v1/searchbots/related_questions (#16500) 2026-07-01 13:17:16 +08:00
Muhammad Furqan
828c5789f6 fix(agent/tools): GoogleScholar empty json output and ignored top_n (#16419)
### What problem does this PR solve?

Closes #16418.

`scholarly.search_pubs(...)` returns a **lazy generator**, but
`agent/tools/googlescholar.py` treated it as a re-iterable, bounded
list:

```python
scholar_client = scholarly.search_pubs(kwargs["query"], ...)   # lazy generator
self._retrieve_chunks(scholar_client, ...)                     # (1) iterates -> exhausts it
self.set_output("json", list(scholar_client))                  # (2) already empty -> []
```

1. **`json` output was always empty.** `_retrieve_chunks` iterates
`scholar_client`, exhausting the generator; `list(scholar_client)` then
returns `[]`.
2. **`top_n` was never applied.** Unlike `ArXiv`
(`max_results=self._param.top_n`), the unbounded generator was passed
straight to `_retrieve_chunks`, which has no internal limit — so the
tool kept paginating well past Top N (until an error, rate-limit/block,
or `COMPONENT_EXEC_TIMEOUT`).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Changes

- Materialize at most `top_n` results once with `itertools.islice`, and
reuse that list for both `_retrieve_chunks` and the `json` output.
- Add regression tests
(`test/unit_test/agent/component/test_googlescholar.py`, stubbing
`scholarly.search_pubs`) covering the `top_n` bound, the non-empty
`json` output, and the empty-query short-circuit.
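
The generator pitfall and the `islice` fix can be reproduced in isolation. In this sketch `fake_search` is a hypothetical stand-in for the lazy `scholarly.search_pubs(...)` generator, and `collect_results` is an illustrative helper, not the tool's actual code.

```python
from itertools import islice


def fake_search():
    """Hypothetical stand-in for a lazy, effectively unbounded result generator."""
    i = 0
    while True:
        yield {"title": f"paper {i}"}
        i += 1


def collect_results(results, top_n):
    """Materialize at most top_n items exactly once."""
    return list(islice(results, top_n))
```

The returned list can be iterated any number of times, so it can feed both the chunk retrieval step and the `json` output, whereas a generator is empty after its first full pass.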

Verified: against `main` the new tests fail with `assert 30 == 5` (top_n
ignored) and `assert 0 == 5` (empty json); with this fix all pass.
Backend-only.

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-01 10:47:39 +08:00
Yingfeng
6648fe4151 Fix g++ 11 incompatibility issue (#16512) 2026-07-01 10:16:47 +08:00
saltsalt123
f60245f199 feat(mcp): add ragflow_list_datasets and ragflow_list_chats tools (#15384)
## Summary

Add two new MCP tools to the RAGFlow MCP server:

1. **ragflow_list_datasets** - List all accessible datasets with IDs,
names, descriptions
2. **ragflow_list_chats** - List all accessible chat assistants with
IDs, names, descriptions

### Implementation
- Added `list_chats()` method to `RAGFlowConnector`
- Registered both tools in `list_tools()` and `call_tool()`
- Follows existing `ragflow_retrieval` pattern for error handling

### Usage via langchain-mcp-adapters

---------

Co-authored-by: saltsalt123 <saltsalt123@users.noreply.github.com>
Co-authored-by: yzc <yuzhichang@gmail.com>
dev-20260701
2026-07-01 09:36:52 +08:00
sxxtony
06b07bbfd6 Add CAJAL scientific paper agent template (#14641)
### What problem does this PR solve?

Closes https://github.com/infiniflow/ragflow/issues/14571.

Adds CAJAL as a first-class local scientific-writing option in RAGFlow:

- registers `agnuxo/cajal-4b-p2pclaw` as a known Ollama chat model with
a 32K context setting
- adds a built-in “CAJAL scientific paper agent” template under the
existing agent template catalog
- preconfigures the agent for grounded scientific writing: retrieval
first, citation traceability, LaTeX-ready output, and explicit
limitations when evidence is missing
- adds unit coverage to ensure the template normalizes through RAGFlow’s
production template loader, keeps graph form data in sync, and exposes
the Ollama model option

Behavior/evidence gathered for the requested model:

- Hugging Face model metadata for `Agnuxo/CAJAL-4B-P2PCLAW` reports
`pipeline_tag=text-generation` and tags including `gguf`, `llama.cpp`,
`vllm`, `scientific-research`, `papers`, `academic-writing`, `latex`,
and `license:apache-2.0`.
- The model card documents CAJAL as a 4B scientific paper generation
model with 32K context, local inference, LaTeX/citation specialization,
and CPU-only support around 5 tok/s on Ryzen 7 5800X.
- Local CPU generation could not be completed on this machine because
the advertised Ollama model name is not currently resolvable from
Ollama’s registry: both
`https://registry.ollama.ai/v2/agnuxo/cajal-4b-p2pclaw/manifests/latest`
and
`https://registry.ollama.ai/v2/library/agnuxo/cajal-4b-p2pclaw/manifests/latest`
returned `404 Not Found`; the Hugging Face repo tree currently exposes
an 8.4 GB `model.safetensors` but no GGUF artifact in `main`. The
template therefore targets the documented Ollama model name for users
who have the local CAJAL deployment/model file available.

Verification run locally:

```bash
python3 -m pytest test/test_cajal_template_unit.py -q
# 3 passed in 0.34s

python3 - <<'PY'
import json, glob
for f in sorted(glob.glob('agent/templates/*.json') + ['conf/llm_factories.json']):
    with open(f, encoding='utf-8') as fp: json.load(fp)
print('json_ok')
PY
# json_ok

python3 -m ruff check test/test_cajal_template_unit.py
# All checks passed!

git diff --check
```

`uv run pytest
test/testcases/test_web_api/test_agent_app/test_cajal_template_unit.py
-q` was also attempted first, but dependency setup failed before test
collection while building `ormsgpack==1.5.0` from uv with a package
metadata parse error. Clearing uv’s `ormsgpack` cache and retrying
reproduced the same build failure, so the focused unit test was run with
the system Python environment instead.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: yzc <yzc@users.noreply.github.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-01 09:35:37 +08:00
RazmikGevorgyan
38f8f8a656 fix: handle non-serializable objects in agent canvas SSE and state serialization (#14210)

Agent components (llm.py, agent_with_tools.py, message.py) store
functools.partial objects as deferred streaming handles in their output
slots. When the canvas state gets serialized for SSE events, Redis
commits, or logging, these partials — plus non-copyable objects like
Langfuse clients — crash json.dumps and deepcopy.

Changes:
- canvas_app.py: add default=str to json.dumps for SSE event
serialization (lines 238, 296)
- canvas.py: wrap deepcopy calls in try/except to handle non-copyable
objects (Langfuse clients, etc.), add default=str to final json.dumps
- base.py: add default=str to ComponentParamBase.__str__ to handle
non-serializable objects in component parameters
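
The effect of `default=str` can be seen in a minimal sketch. `to_sse_payload` is a hypothetical helper and the event shape is made up; `json.dumps(..., default=str)` is the standard-library behavior the changes rely on.

```python
import functools
import json


def to_sse_payload(event):
    """Serialize an event, stringifying anything json cannot encode natively."""
    # default is called only for objects the encoder does not know how to handle.
    return json.dumps(event, ensure_ascii=False, default=str)


# A deferred streaming handle, as an output slot might hold it.
example_event = {"content": functools.partial(print, "chunk"), "seq": 1}
```

The trade-off is that such values arrive as opaque strings and cannot be reconstructed by the consumer, which is acceptable for logging and SSE events but would not suit state that has to be restored later.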

Closes #14229

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yzc <yuzhichang@gmail.com>
2026-07-01 09:33:41 +08:00
Taranum Wasu
e23f63bd93 fix(agent): prevent empty LLM user message after prompt fitting (#16413)
## Summary
- Treat `max_tokens=0` as unset (`or 8192`) when building model context
budgets, fixing agents that silently zeroed prompts when a vLLM model
had `max_tokens: 0` in tenant config
- Replace trailing same-role canvas history in `LLM._sys_prompt_and_msg`
instead of skipping the current user prompt
- Add `LLM.fit_messages()` validation after `message_fit_in` on agent
paths so empty user content fails fast with a clear error instead of
reaching vLLM

Fixes #16411

## Root cause
Agent canvas workflow called `message_fit_in` with `int(max_length *
0.97)`. When `max_length` was `0`, both system and user content were
trimmed to empty strings. The `[HISTORY STREAMLY]` log showing only
`{"role":"user","content":""}` matches this. A secondary bug skipped
appending the formatted user prompt when history ended with a `user`
role message.
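
The zero-budget case can be sketched as follows. `context_budget` and `DEFAULT_MAX_TOKENS` are hypothetical names; the `or 8192` fallback and the `0.97` factor are taken from the description above.

```python
DEFAULT_MAX_TOKENS = 8192


def context_budget(max_tokens, ratio=0.97):
    """Return the prompt budget, treating 0 or None as 'unset'."""
    effective = max_tokens or DEFAULT_MAX_TOKENS
    return int(effective * ratio)
```

Without the fallback, `int(0 * 0.97)` is `0`, so every message is trimmed to an empty string before it reaches the model.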

## Test plan
- [x] Added `test/unit_test/agent/component/test_llm_prompt.py` for
role-replace, validation, and zero-budget fitting
- [x] Added
`test_message_fit_in_zero_budget_preserves_non_empty_messages` in
`test_generator_message_fit_in.py`
- [ ] CI unit tests
- [ ] Manual: agent canvas `begin → Retrieval → Agent → Message` with
vLLM Qwen3; confirm user message reaches LLM

Made with [Cursor](https://cursor.com)

---------

Co-authored-by: Taranum Wasu <taranumwasu@Taranums-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-07-01 09:30:54 +08:00
Harsh Kashyap
45fc7feab4 fix(common/time_utils): correct None/empty timestamp fallback and ISO 8601 parsing (#16483)
Recovery PR for #16173 after the fork branch was accidentally reset
during rewrite-cleanup.

Cherry-picked onto current `main`:
- fix(common/time_utils): correct fallback timestamp and ISO-8601
normalization
- fix(common/time_utils): preserve zero timestamps and mark regression
tests
- test(common/time_utils): make fallback assertions deterministic

Supersedes closed #16173 — same branch
`Harsh23Kashyap/fix/time-utils-edgecases`, rebuilt per @yuzhichang
recovery steps in
https://github.com/infiniflow/ragflow/pull/16173#issuecomment-4829663835

---------

Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-30 22:30:44 +08:00
Lynn
b53b693f22 Fix: CI (#16504)
### Summary

Fix race condition in parallel lefthook hooks causing ETXTBSY error
2026-06-30 22:14:11 +08:00
Jack
8e1dc4f308 revert: roll back tests.yml CI changes from PR #16391 (#16505)
## Summary

Two changes to make Go build & run independent of native libraries
(office_oxide, pdfium, pdf_oxide).

## 1. Make native libraries optional (build.sh + Go source)

## 2. Roll back tests.yml CI changes from PR #16391
2026-06-30 21:50:37 +08:00
Yingfeng
5af361ed68 Add spacy based ner and relationship extractor for both python and Go version with equivalent outputs (#16456)
As title
2026-06-30 21:40:24 +08:00
Hz_
3633d08495 feat(go-api): Migrate Box web OAuth connector APIs to Go (#16480)
This PR migrates the Box web OAuth flow from Python to Go for:

  - POST /api/v1/connectors/box/oauth/web/start
  - GET /api/v1/connectors/box/oauth/web/callback
  - POST /api/v1/connectors/box/oauth/web/result
2026-06-30 18:10:36 +08:00