## Summary
- Treat `max_tokens=0` as unset (`or 8192`) when building model context
budgets, fixing agents that silently zeroed prompts when a vLLM model
had `max_tokens: 0` in tenant config
- Replace trailing same-role canvas history in `LLM._sys_prompt_and_msg`
instead of skipping the current user prompt
- Add `LLM.fit_messages()` validation after `message_fit_in` on agent
paths so empty user content fails fast with a clear error instead of
reaching vLLM
Fixes#16411
## Root cause
Agent canvas workflow called `message_fit_in` with `int(max_length *
0.97)`. When `max_length` was `0`, both system and user content were
trimmed to empty strings. The `[HISTORY STREAMLY]` log showing only
`{"role":"user","content":""}` matches this. A secondary bug skipped
appending the formatted user prompt when history ended with a `user`
role message.
## Test plan
- [x] Added `test/unit_test/agent/component/test_llm_prompt.py` for
role-replace, validation, and zero-budget fitting
- [x] Added
`test_message_fit_in_zero_budget_preserves_non_empty_messages` in
`test_generator_message_fit_in.py`
- [ ] CI unit tests
- [ ] Manual: agent canvas `begin → Retrieval → Agent → Message` with
vLLM Qwen3; confirm user message reaches LLM
Made with [Cursor](https://cursor.com)
---------
Co-authored-by: Taranum Wasu <taranumwasu@Taranums-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
## Summary
Fixes#15215 — attachments uploaded to an agent were not reaching the
LLM.
When a user uploads a file in an agent chat, `canvas.run` parses it into
the `sys.files` global (text content for documents, `data:image/...`
URIs
for images — see `agent/canvas.py:752-768`). But the LLM/Agent
component's
`_prepare_prompt_variables` only substitutes variables the user's prompt
template explicitly references via `{var}` placeholders. The default
prompt is `[{"role": "user", "content": "{sys.query}"}]` with no
`{sys.files}`, so the parsed attachment content never reaches the model.
In the reporter's logs, this is why the agent saw only the bare query
`附件 摘要 attachment summary` and went searching the dataset instead of
reading the uploaded PDF.
## Fix
`agent/component/llm.py` — added `_collect_sys_files()` and an
auto-injection step in `_prepare_prompt_variables`:
- If `sys.files` is non-empty **and** neither `sys_prompt` nor any entry
in `prompts` already contains `{sys.files}` (no double-injection),
split the entries into text vs. `data:image/...` URIs.
- Image URIs are merged into `self.imgs`, which the existing logic uses
to switch the chat model to `IMAGE2TEXT` and pass `images=...` to
`async_chat`.
- Text content is appended to the last `user` role message in `msg`,
mirroring how `dialog_service.async_chat_solo` handles attachments for
the non-agent chat path (`api/db/services/dialog_service.py:318-321`).
Both `LLM._invoke_async` and `Agent._invoke_async` (tool-using) go
through `_prepare_prompt_variables`, so plain LLM nodes and Agent nodes
are fixed in both streaming and non-streaming paths.
## Test plan
- [ ] Upload a PDF attachment to an agent with the default `{sys.query}`
prompt and ask "summarize the attachment" — the model should answer
from the file content rather than searching the knowledge base.
- [ ] Upload an image attachment to an agent and ask about its contents
—
the model should switch to the vision-capable LLM and answer from
the image.
- [ ] Verify that an agent whose prompt **does** include `{sys.files}`
still works and does **not** include the file content twice.
- [ ] Verify that an agent run with no attachments behaves unchanged.
- [ ] Run `uv run pytest` to make sure no existing tests regress.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: yzc <yuzhichang@gmail.com>
### What problem does this PR solve?
This PR adds an Agent LLM setting to control thinking mode for official
providers that expose a thinking switch.
Related to #12842.
Closes#15445.
Some providers expose thinking controls through provider-specific
request fields, but Agent LLM settings did not have a unified option for
users to enable or disable thinking mode.
This PR adds a `Thinking` selector with:
- System default
- Enabled
- Disabled
<img width="452" height="278" alt="8566b0b4-0546-4c8a-913d-f9bbd38319f6"
src="https://github.com/user-attachments/assets/25b497f7-1ba0-4bfe-940d-6fe79287d6ab"
/>
<img width="471" height="971" alt="8a0a6bee-f45f-48d5-bd83-17af260de3db"
src="https://github.com/user-attachments/assets/41ad43c1-5087-48f1-bf37-f2ca14c2be2f"
/>
Initial support is limited to the verified official providers:
- Qwen / DashScope: `enable_thinking`
- Kimi / Moonshot: `thinking.type`
- GLM / ZHIPU-AI: `thinking.type`
For LiteLLM-based providers, provider-specific fields are forwarded
through `extra_body` before `drop_params` filtering so the request
parameters are preserved.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: jiashi <jiashi19@outlook.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
Ensure agent components with image inputs route to `image2text` models
instead of staying on the chat path, so visual requests use the CV
wrapper when supported.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Python implementation of the Go-based model_provider API suite.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: bill <yibie_jingnian@163.com>
### What problem does this PR solve?
The agent API currently does not pass chat_template_kwargs to the
underlying LLM call path, so clients cannot control template-level model
behavior (such as thinking-mode toggles) when invoking
/agents/chat/completion. This PR adds passthrough support for
chat_template_kwargs across agent execution flows (session and
non-session, streaming and non-streaming) by propagating it through
canvas runtime state and into LLM invocation kwargs. This addresses the
feature gap raised in [Issue
#14182](https://github.com/infiniflow/ragflow/issues/14182).
Closes#14182
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix update_cnt add error in init_data.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add id for table tenant_llm and apply in LLMBundle.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Make RAGFlow more asynchronous 2. #11551, #11579, #11619.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Make RAGFlow more asynchronous 2. #11551, #11579, #11619.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Try to make this more asynchronous. Verified in chat and agent
scenarios, reducing blocking behavior. #11551, #11579.
However, the impact of these changes still requires further
investigation to ensure everything works as expected.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
issue:
[#11541](https://github.com/infiniflow/ragflow/issues/11541)
change:
enable structured output for agent with tool
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
change:
“After each dialogue turn, the agent component’s output is not reset.”
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add mechanism to check cancellation in Agent.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:#5617
change:add IBM DB2 support in ExeSQL
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Fix typos
2. Rename function
3. Use English to write comment
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix broken imports
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Change return type of _generate_streamly from str to Generator[str,
None, None] to properly type hint streaming responses.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
#9082#6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)