ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-01 00:05:43 +08:00

Files

Rene Arredondo dc8b6d767c fix(agent): inject uploaded attachments into LLM context (#15215 ) (#15220 )

## Summary

Fixes #15215 — attachments uploaded to an agent were not reaching the
LLM.

When a user uploads a file in an agent chat, `canvas.run` parses it into
the `sys.files` global (text content for documents, `data:image/...`
URIs
for images — see `agent/canvas.py:752-768`). But the LLM/Agent
component's
`_prepare_prompt_variables` only substitutes variables the user's prompt
template explicitly references via `{var}` placeholders. The default
prompt is `[{"role": "user", "content": "{sys.query}"}]` with no
`{sys.files}`, so the parsed attachment content never reaches the model.

In the reporter's logs, this is why the agent saw only the bare query
`附件 摘要 attachment summary` and went searching the dataset instead of
reading the uploaded PDF.

## Fix

`agent/component/llm.py` — added `_collect_sys_files()` and an
auto-injection step in `_prepare_prompt_variables`:

- If `sys.files` is non-empty **and** neither `sys_prompt` nor any entry
  in `prompts` already contains `{sys.files}` (no double-injection),
  split the entries into text vs. `data:image/...` URIs.
- Image URIs are merged into `self.imgs`, which the existing logic uses
  to switch the chat model to `IMAGE2TEXT` and pass `images=...` to
  `async_chat`.
- Text content is appended to the last `user` role message in `msg`,
  mirroring how `dialog_service.async_chat_solo` handles attachments for
  the non-agent chat path (`api/db/services/dialog_service.py:318-321`).

Both `LLM._invoke_async` and `Agent._invoke_async` (tool-using) go
through `_prepare_prompt_variables`, so plain LLM nodes and Agent nodes
are fixed in both streaming and non-streaming paths.

## Test plan

- [ ] Upload a PDF attachment to an agent with the default `{sys.query}`
prompt and ask "summarize the attachment" — the model should answer
      from the file content rather than searching the knowledge base.
- [ ] Upload an image attachment to an agent and ask about its contents
—
      the model should switch to the vision-capable LLM and answer from
      the image.
- [ ] Verify that an agent whose prompt **does** include `{sys.files}`
      still works and does **not** include the file content twice.
- [ ] Verify that an agent run with no attachments behaves unchanged.
- [ ] Run `uv run pytest` to make sure no existing tests regress.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: yzc <yuzhichang@gmail.com>

2026-06-30 15:48:59 +08:00

models

Fix: failed to get embedding model by embd_id: model config not found BAAI/bge-m3@...@SILICONFLOW (#16445 )

2026-06-29 15:40:29 +08:00

all_models.json

fix(agent): inject uploaded attachments into LLM context (#15215 ) (#15220 )

2026-06-30 15:48:59 +08:00

doc_meta_es_mapping.json

Put document metadata in ES/Infinity (#12826 )

2026-01-28 13:29:34 +08:00

doc_meta_infinity_mapping.json

Put document metadata in ES/Infinity (#12826 )