Commit Graph

7071 Commits

Author SHA1 Message Date
Harsh Kashyap
d770217b25 fix(api): fall back to factory max_tokens for tenant models (#16364) 2026-07-01 16:00:13 +08:00
qinling0210
7862f69f39 Implement chat completions in go (#16491)
### Summary

POST /api/v1/chat/completions
2026-07-01 15:52:52 +08:00
Harsh Kashyap
b8e960e6c8 fix(qa): preserve final CSV pair row number (#16433) 2026-07-01 14:52:08 +08:00
Harsh Kashyap
b42414b64a fix(deepdoc): parse bodyless HTML fragments (#16423) 2026-07-01 14:45:22 +08:00
connerlambden
9bf57600cf feat(agent): add BGPT structured literature evidence search tool (#16050)
## Summary

Adds a first-class **BGPT** Agent tool (backend + UI) in response to
[#15997](https://github.com/infiniflow/ragflow/issues/15997#issuecomment-4703864227).

BGPT calls `POST https://bgpt.pro/api/mcp-search` and returns structured
study evidence from full-text papers — not just titles/abstracts. Each
result is formatted for RAGFlow citations with:

- methods
- sample size / population
- results
- limitations
- conflicts of interest
- data availability
- study blind spots
- `how_to_falsify`

## Why this shape

- Mirrors existing literature tools (`PubMed`, `ArXiv`) and HTTP tools
(`SearXNG`).
- Works on the free tier (no API key required for first 50 results).
- Optional `api_key` and `days_back` in the node/tool config.
- Surfaces both `formalized_content` and raw `json` outputs (like
SearXNG).

## Files

- `agent/tools/bgpt.py` — REST client + evidence formatter
- Frontend: Operator enum, forms, tool picker, canvas accordion, en/zh
locales, icon

## Demo / docs

Runnable claim-interrogation demo:
https://github.com/connerlambden/bgpt-mcp/blob/main/EVIDENCE_DEMO.md

## Test plan

- [ ] Add BGPT node on Agent canvas, run query `GLP-1 alcohol craving`,
verify `formalized_content` includes limitations/COI fields
- [ ] Add BGPT as Agent sub-tool under Search, verify tool-calling works
- [ ] Confirm empty query / try-run returns gracefully
- [ ] Optional: paid-tier `api_key` path

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-01 13:52:24 +08:00
Harsh Kashyap
508f6226f8 fix(agent): filter TuShare news with upstream keyword input (#16361)
## Summary

TuShare required non-empty upstream input but filtered fetched news with
the static `keyword` param (default empty string), so agent-provided
keywords were ignored.

Use `self._param.keyword or ans` when filtering, matching how AkShare
uses upstream input for its query.
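
The fallback can be sketched as follows. This is a minimal illustration, not the tool's code: `filter_news`, its arguments, and the row layout are hypothetical, and only the `param_keyword or upstream_input` expression mirrors the fix described above.

```python
def filter_news(rows, param_keyword, upstream_input):
    """Keep rows whose content mentions the effective keyword.

    Hypothetical helper: rows are dicts with a "content" field.
    """
    # An empty static keyword is falsy, so the upstream input takes over.
    keyword = param_keyword or upstream_input
    return [row for row in rows if keyword in row["content"]]


rows = [
    {"content": "Copper futures rally"},
    {"content": "Bank earnings beat estimates"},
]
matched = filter_news(rows, param_keyword="", upstream_input="Copper")
```

A non-empty static keyword still wins, so existing configurations behave as before.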

Fixes #16360

## Test plan

- [x] `test_tushare_filters_with_upstream_keyword_when_param_empty`
mocks the API and asserts only rows matching the upstream keyword are
returned

---------

Co-authored-by: yzc <yuzhichang@gmail.com>
Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
2026-07-01 13:51:39 +08:00
Harsh Kashyap
572f1ea9f4 fix(web): sanitize agent rerun modal HTML against stored XSS (#16516) 2026-07-01 13:38:31 +08:00
Lynn
400476f0b3 Feat: SoMark (#16482)
Follow #15486
Co-authored-by: limuting <limuting233@gmail.com>
Co-authored-by: lutianyi <lutianyi233@163.com>
Co-authored-by: justinychuang <huangyicheng@soulcode.cn>
Co-authored-by: maybehokori <138367708+maybehokori@users.noreply.github.com>
2026-07-01 13:29:28 +08:00
Lynn
b6fa5ce4ea Fix: ollama provider (#16519) 2026-07-01 13:24:31 +08:00
Wang Qi
8f24b30652 [Go] Add API /api/v1/chat/recommendation and consolidate with /api/v1/searchbots/related_questions (#16500) 2026-07-01 13:17:16 +08:00
Muhammad Furqan
828c5789f6 fix(agent/tools): GoogleScholar empty json output and ignored top_n (#16419)
### What problem does this PR solve?

Closes #16418.

`scholarly.search_pubs(...)` returns a **lazy generator**, but
`agent/tools/googlescholar.py` treated it as a re-iterable, bounded
list:

```python
scholar_client = scholarly.search_pubs(kwargs["query"], ...)   # lazy generator
self._retrieve_chunks(scholar_client, ...)                     # (1) iterates -> exhausts it
self.set_output("json", list(scholar_client))                  # (2) already empty -> []
```

1. **`json` output was always empty.** `_retrieve_chunks` iterates
`scholar_client`, exhausting the generator; `list(scholar_client)` then
returns `[]`.
2. **`top_n` was never applied.** Unlike `ArXiv`
(`max_results=self._param.top_n`), the unbounded generator was passed
straight to `_retrieve_chunks`, which has no internal limit — so the
tool kept paginating well past Top N (until an error, rate-limit/block,
or `COMPONENT_EXEC_TIMEOUT`).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Changes

- Materialize at most `top_n` results once with `itertools.islice`, and
reuse that list for both `_retrieve_chunks` and the `json` output.
- Add regression tests
(`test/unit_test/agent/component/test_googlescholar.py`, stubbing
`scholarly.search_pubs`) covering the `top_n` bound, the non-empty
`json` output, and the empty-query short-circuit.

Verified: against `main` the new tests fail with `assert 30 == 5` (top_n
ignored) and `assert 0 == 5` (empty json); with this fix all pass.
Backend-only.
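
The materialize-once pattern can be sketched in isolation. In this sketch `search_pubs_stub` and `collect_results` are hypothetical stand-ins, not the tool's real functions; only the use of `itertools.islice` to bound a lazy generator reflects the change described above.

```python
from itertools import islice


def search_pubs_stub(query):
    """Stand-in for a lazy, effectively unbounded result generator."""
    n = 0
    while True:
        n += 1
        yield {"title": f"{query} paper {n}"}


def collect_results(results, top_n):
    # Pull at most top_n items exactly once; the list can be reused freely.
    return list(islice(results, top_n))


pubs = collect_results(search_pubs_stub("rag"), top_n=5)
chunks = [p["title"] for p in pubs]   # first consumer
json_output = list(pubs)              # second consumer sees the same items
```

Because `pubs` is a list rather than a generator, both consumers see the same five items and the generator is never advanced past the bound.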

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-01 10:47:39 +08:00
Yingfeng
6648fe4151 Fix g++ 11 incompatibility issue (#16512) 2026-07-01 10:16:47 +08:00
saltsalt123
f60245f199 feat(mcp): add ragflow_list_datasets and ragflow_list_chats tools (#15384)
## Summary

Add two new MCP tools to the RAGFlow MCP server:

1. **ragflow_list_datasets** - List all accessible datasets with IDs,
names, descriptions
2. **ragflow_list_chats** - List all accessible chat assistants with
IDs, names, descriptions

### Implementation
- Added `list_chats()` method to `RAGFlowConnector`
- Registered both tools in `list_tools()` and `call_tool()`
- Follows existing `ragflow_retrieval` pattern for error handling

### Usage via langchain-mcp-adapters

---------

Co-authored-by: saltsalt123 <saltsalt123@users.noreply.github.com>
Co-authored-by: yzc <yuzhichang@gmail.com>
dev-20260701
2026-07-01 09:36:52 +08:00
sxxtony
06b07bbfd6 Add CAJAL scientific paper agent template (#14641)
### What problem does this PR solve?

Closes https://github.com/infiniflow/ragflow/issues/14571.

Adds CAJAL as a first-class local scientific-writing option in RAGFlow:

- registers `agnuxo/cajal-4b-p2pclaw` as a known Ollama chat model with
a 32K context setting
- adds a built-in “CAJAL scientific paper agent” template under the
existing agent template catalog
- preconfigures the agent for grounded scientific writing: retrieval
first, citation traceability, LaTeX-ready output, and explicit
limitations when evidence is missing
- adds unit coverage to ensure the template normalizes through RAGFlow’s
production template loader, keeps graph form data in sync, and exposes
the Ollama model option

Behavior/evidence gathered for the requested model:

- Hugging Face model metadata for `Agnuxo/CAJAL-4B-P2PCLAW` reports
`pipeline_tag=text-generation` and tags including `gguf`, `llama.cpp`,
`vllm`, `scientific-research`, `papers`, `academic-writing`, `latex`,
and `license:apache-2.0`.
- The model card documents CAJAL as a 4B scientific paper generation
model with 32K context, local inference, LaTeX/citation specialization,
and CPU-only support around 5 tok/s on Ryzen 7 5800X.
- Local CPU generation could not be completed on this machine because
the advertised Ollama model name is not currently resolvable from
Ollama’s registry: both
`https://registry.ollama.ai/v2/agnuxo/cajal-4b-p2pclaw/manifests/latest`
and
`https://registry.ollama.ai/v2/library/agnuxo/cajal-4b-p2pclaw/manifests/latest`
returned `404 Not Found`; the Hugging Face repo tree currently exposes
an 8.4 GB `model.safetensors` but no GGUF artifact in `main`. The
template therefore targets the documented Ollama model name for users
who have the local CAJAL deployment/model file available.

Verification run locally:

```bash
python3 -m pytest test/test_cajal_template_unit.py -q
# 3 passed in 0.34s

python3 - <<'PY'
import json, glob
for f in sorted(glob.glob('agent/templates/*.json') + ['conf/llm_factories.json']):
    with open(f, encoding='utf-8') as fp: json.load(fp)
print('json_ok')
PY
# json_ok

python3 -m ruff check test/test_cajal_template_unit.py
# All checks passed!

git diff --check
```

`uv run pytest
test/testcases/test_web_api/test_agent_app/test_cajal_template_unit.py
-q` was also attempted first, but dependency setup failed before test
collection while building `ormsgpack==1.5.0` from uv with a package
metadata parse error. Clearing uv’s `ormsgpack` cache and retrying
reproduced the same build failure, so the focused unit test was run with
the system Python environment instead.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: sxxtony <sxxtony@users.noreply.github.com>
Co-authored-by: yzc <yzc@users.noreply.github.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2026-07-01 09:35:37 +08:00
RazmikGevorgyan
38f8f8a656 fix: handle non-serializable objects in agent canvas SSE and state serialization (#14210)

Agent components (llm.py, agent_with_tools.py, message.py) store
functools.partial objects as deferred streaming handles in their output
slots. When the canvas state gets serialized for SSE events, Redis
commits, or logging, these partials — plus non-copyable objects like
Langfuse clients — crash json.dumps and deepcopy.

Changes:
- canvas_app.py: add default=str to json.dumps for SSE event
serialization (lines 238, 296)
- canvas.py: wrap deepcopy calls in try/except to handle non-copyable
objects (Langfuse clients, etc.), add default=str to final json.dumps
- base.py: add default=str to ComponentParamBase.__str__ to handle
non-serializable objects in component parameters
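
Both techniques can be shown with the standard library alone. This is a sketch under assumptions: `safe_deepcopy` and the `state` dict are hypothetical, and a `threading.Lock` stands in for a non-copyable client object.

```python
import copy
import functools
import json
import threading


def safe_deepcopy(value):
    """Deep-copy when possible, otherwise fall back to the original object."""
    try:
        return copy.deepcopy(value)
    except Exception:
        # Some objects (locks, client handles) cannot be copied.
        return value


state = {
    "answer": functools.partial(print, "deferred stream"),
    "count": 3,
}
# default=str stringifies anything json cannot encode natively.
payload = json.dumps(state, default=str)

lock = threading.Lock()
same_lock = safe_deepcopy(lock)
```

Note the trade-off: `default=str` keeps serialization from crashing, but the stringified value is only useful for display or logging, not for restoring the original object.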

Closes #14229

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yzc <yuzhichang@gmail.com>
2026-07-01 09:33:41 +08:00
Taranum Wasu
e23f63bd93 fix(agent): prevent empty LLM user message after prompt fitting (#16413)
## Summary
- Treat `max_tokens=0` as unset (`or 8192`) when building model context
budgets, fixing agents that silently zeroed prompts when a vLLM model
had `max_tokens: 0` in tenant config
- Replace trailing same-role canvas history in `LLM._sys_prompt_and_msg`
instead of skipping the current user prompt
- Add `LLM.fit_messages()` validation after `message_fit_in` on agent
paths so empty user content fails fast with a clear error instead of
reaching vLLM

Fixes #16411

## Root cause
Agent canvas workflow called `message_fit_in` with `int(max_length *
0.97)`. When `max_length` was `0`, both system and user content were
trimmed to empty strings. The `[HISTORY STREAMLY]` log showing only
`{"role":"user","content":""}` matches this. A secondary bug skipped
appending the formatted user prompt when history ended with a `user`
role message.
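
The arithmetic behind the root cause is small enough to sketch. The function names below are hypothetical; the `or 8192` fallback and the `0.97` factor come from the description above, and the validation step is only an approximation of the fail-fast check it mentions.

```python
DEFAULT_MAX_TOKENS = 8192  # fallback named in the summary above


def context_budget(max_tokens):
    """Return the token budget handed to prompt fitting."""
    # 0 and None are both falsy, so either one selects the default.
    max_length = max_tokens or DEFAULT_MAX_TOKENS
    return int(max_length * 0.97)


def validate_user_content(messages):
    """Fail fast if fitting left no user text (hypothetical check)."""
    for msg in messages:
        if msg["role"] == "user" and msg["content"].strip():
            return messages
    raise ValueError("user message is empty after prompt fitting")
```

Without the fallback, `int(0 * 0.97)` is `0`, which is the zero budget that trimmed every message to an empty string.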

## Test plan
- [x] Added `test/unit_test/agent/component/test_llm_prompt.py` for
role-replace, validation, and zero-budget fitting
- [x] Added
`test_message_fit_in_zero_budget_preserves_non_empty_messages` in
`test_generator_message_fit_in.py`
- [ ] CI unit tests
- [ ] Manual: agent canvas `begin → Retrieval → Agent → Message` with
vLLM Qwen3; confirm user message reaches LLM

Made with [Cursor](https://cursor.com)

---------

Co-authored-by: Taranum Wasu <taranumwasu@Taranums-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-07-01 09:30:54 +08:00
Harsh Kashyap
45fc7feab4 fix(common/time_utils): correct None/empty timestamp fallback and ISO 8601 parsing (#16483)
Recovery PR for #16173 after the fork branch was accidentally reset
during rewrite-cleanup.

Cherry-picked onto current `main`:
- fix(common/time_utils): correct fallback timestamp and ISO-8601
normalization
- fix(common/time_utils): preserve zero timestamps and mark regression
tests
- test(common/time_utils): make fallback assertions deterministic

Supersedes closed #16173 — same branch
`Harsh23Kashyap/fix/time-utils-edgecases`, rebuilt per @yuzhichang
recovery steps in
https://github.com/infiniflow/ragflow/pull/16173#issuecomment-4829663835

---------

Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-30 22:30:44 +08:00
Lynn
b53b693f22 Fix: CI (#16504)
### Summary

Fix race condition in parallel lefthook hooks causing ETXTBSY error
2026-06-30 22:14:11 +08:00
Jack
8e1dc4f308 revert: roll back tests.yml CI changes from PR #16391 (#16505)
## Summary

Two changes to make Go build & run independent of native libraries
(office_oxide, pdfium, pdf_oxide).

## 1. Make native libraries optional (build.sh + Go source)

## 2. Roll back tests.yml CI changes from PR #16391
2026-06-30 21:50:37 +08:00
Yingfeng
5af361ed68 Add spacy based ner and relationship extractor for both python and Go version with equivalent outputs (#16456)
As title
2026-06-30 21:40:24 +08:00
Hz_
3633d08495 feat(go-api): Migrate Box web OAuth connector APIs to Go (#16480)
This PR migrates the Box web OAuth flow from Python to Go for:

  - POST /api/v1/connectors/box/oauth/web/start
  - GET /api/v1/connectors/box/oauth/web/callback
  - POST /api/v1/connectors/box/oauth/web/result
nightly
2026-06-30 18:10:36 +08:00
Yingfeng
63bdf5c5b1 Fix harness streaming emit (#16486) 2026-06-30 18:06:03 +08:00
天海蒼灆
3c946a7e58 fix(agent): add canvas_type filter and field to list_agents API (#15754)
### What problem does this PR solve?

GET /api/v1/agents (list_agents) already supports filtering by
canvas_category, keywords, tags, and owner_ids, but it does not support
canvas_type — even though canvas_type is a persisted field on UserCanvas
and is already accepted on agent create/update APIs.

This gap causes two issues:

- Filtering: clients cannot list agents by business category (e.g.
  Marketing, Agent, Ingestion Pipeline) without fetching all agents and
  filtering client-side.
- Response payload: list_agents did not return canvas_type in each canvas
  item, so consumers had to call GET /api/v1/agents/{id} per agent to read
  it.

This PR adds optional canvas_type query parameter support and includes
canvas_type in the list response.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2026-06-30 17:43:26 +08:00
Wang Qi
d2ecd57c59 Fix: UI cannot start up (#16497) 2026-06-30 17:09:09 +08:00
Haruko386
b3af9fc068 fix: remove dup-prefix in bot_routes (#16492) 2026-06-30 17:02:58 +08:00
Rene Arredondo
09dc4c8841 fix(agent): return session_id when chat completion produces no events (#15169) (#15228)
## Summary

Fixes #15169 — `POST /api/v1/agents/chat/completions` returned
`data: {}` with no `session_id` when the agent produced no events
(e.g. the reporter's payload sent `"query": ""`).

## Root cause

For `{"agent_id": "...", "query": "", "stream": false}`:

1. No `session_id` in the request → new-session branch at
   `agent_api.py:1278`.
2. `session_id = get_uuid()` at `agent_api.py:1294`.
3. Falls into `_run_workflow_session`.
4. `canvas.run(query="")` produces no events, so `final_ans`
   stays `{}`.
5. Non-streaming path then hit:

   ```python
   if not final_ans:
       await commit_runtime_replica()
       return get_result(data={})
   ```

   `session_id` was allocated but silently dropped on the way out.

The streaming path had the same shape (only a bare `[DONE]` was
yielded — no SSE event carrying `session_id`). The
session-continuation path at `agent_api.py:1463` had the same bug
for callers that passed `session_id` and got `{}` back.

The successful (non-empty) paths were fine because every canvas
event has `ans["session_id"] = session_id` attached before being
yielded / captured into `final_ans` (see
`agent_api.py:255` and `:303`).

## Fix

Three minimal changes, all in
`api/apps/restful_apis/agent_api.py`:

1. **`_run_workflow_session` (non-streaming)**:
   `return get_result(data={"session_id": session_id})` instead of
   `data={}`.
2. **`_run_workflow_session` (SSE)**: if the canvas loop emits no
   events, yield one
   `data:{"session_id": "...", "data": {}}` event before
   `[DONE]`, so the client receives the id over the wire.
3. **`agent_chat_completion` session-continuation**: echo the
   caller-supplied `session_id` back in the empty-events case
   instead of `{}`.
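
The streaming half of the fix can be sketched as a generator. `sse_events` is a hypothetical name and the event framing is simplified; the point is only that an empty run still emits one event carrying `session_id` before `[DONE]`.

```python
import json


def sse_events(canvas_events, session_id):
    """Yield SSE lines, guaranteeing session_id appears at least once."""
    emitted = False
    for event in canvas_events:
        event["session_id"] = session_id
        emitted = True
        yield "data:" + json.dumps(event) + "\n\n"
    if not emitted:
        # Empty run: still tell the client which session was allocated.
        yield "data:" + json.dumps({"session_id": session_id, "data": {}}) + "\n\n"
    yield "data:[DONE]\n\n"


empty_run = list(sse_events([], "abc123"))
```

For a non-empty run the extra event is skipped, so the happy path stays unchanged.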

No change needed on the happy paths — they already attach
`session_id` to every event.

## Test plan

- [ ] Repro from the issue: `POST /api/v1/agents/chat/completions`
      with `{"agent_id": "<id>", "query": "", "stream": false}`.
      Response `data` should now contain `session_id`.
- [ ] Same payload with `"stream": true`. SSE stream should
      contain one event with `session_id` before `data:[DONE]`.
- [ ] Same shape but with a real, non-empty `"query"` (new
      session). Response should be unchanged from before — every
      event still carries `session_id`, final response still
      includes it on `final_ans`.
- [ ] Pass an existing `session_id` plus `"query": ""`. Response
      should echo that `session_id` back instead of `{}`.
- [ ] Pass an existing `session_id` plus a normal query. Response
      should be unchanged from before.
- [ ] `openai-compatible: true` path is untouched — sanity-check
      it still works.
- [ ] Run `uv run pytest` to make sure no existing tests regress.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2026-06-30 16:41:44 +08:00
Wang Qi
3bb976b383 [Go] Add /api/v1/searchbots/mindmap and /api/v1/chat/mindmap (#16443) 2026-06-30 16:35:33 +08:00
Zhichang Yu
4c54cefd29 Port 14 upstream agent security / correctness fixes to Go canvas (#16455)
Mirrors 14 merged upstream PRs into the Go agent port.

PRs ported:
  - #15609 ExeSQL SSRF guard + DNS pin
  - #15436 HTTP timeout on external API tools
  - #16363 be_output restore + DeepL error path
  - #15644 switch no longer matches empty condition
  - #15374 session_id bind to path agent_id (DAO idor guard)
  - #16169 sandbox artifact ownership gate
  - #15457 tenant ownership on agentbots
  - #15145 rerun agent document access check
  - #15446 thinking switch (component portion; provider policy lives in
    internal/llm)
  - #15426 Invoke URL/proxy SSRF + DNS pin + no-redirects
  - #15238 agentbot thinking-logs beta endpoint
  - #14589 UserFillUp SSE event propagation
  - #14890 anonymous webhook opt-in
  - #15068 PipelineChunker new component (text/file_ref/parser_id
    dispatch; file-format extraction is a follow-up)

40 files, +2355 / -58 lines. 33 new tests, all targeted package suites
pass (1721 + 4 skipped); 1 pre-existing flaky test unrelated.
2026-06-30 16:28:48 +08:00
Rene Arredondo
dc8b6d767c fix(agent): inject uploaded attachments into LLM context (#15215) (#15220)
## Summary

Fixes #15215 — attachments uploaded to an agent were not reaching the
LLM.

When a user uploads a file in an agent chat, `canvas.run` parses it into
the `sys.files` global (text content for documents, `data:image/...`
URIs for images; see `agent/canvas.py:752-768`). But the LLM/Agent
component's `_prepare_prompt_variables` only substitutes variables the
user's prompt template explicitly references via `{var}` placeholders.
The default prompt is `[{"role": "user", "content": "{sys.query}"}]` with
no `{sys.files}`, so the parsed attachment content never reaches the model.

In the reporter's logs, this is why the agent saw only the bare query
`附件 摘要 attachment summary` and went searching the dataset instead of
reading the uploaded PDF.

## Fix

`agent/component/llm.py` — added `_collect_sys_files()` and an
auto-injection step in `_prepare_prompt_variables`:

- If `sys.files` is non-empty **and** neither `sys_prompt` nor any entry
  in `prompts` already contains `{sys.files}` (no double-injection),
  split the entries into text vs. `data:image/...` URIs.
- Image URIs are merged into `self.imgs`, which the existing logic uses
  to switch the chat model to `IMAGE2TEXT` and pass `images=...` to
  `async_chat`.
- Text content is appended to the last `user` role message in `msg`,
  mirroring how `dialog_service.async_chat_solo` handles attachments for
  the non-agent chat path (`api/db/services/dialog_service.py:318-321`).

Both `LLM._invoke_async` and `Agent._invoke_async` (tool-using) go
through `_prepare_prompt_variables`, so plain LLM nodes and Agent nodes
are fixed in both streaming and non-streaming paths.
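
The injection rule can be sketched as a standalone function. `inject_sys_files` and its signature are hypothetical and do not reproduce `_collect_sys_files`; the sketch assumes attachments arrive as plain strings, with images encoded as `data:image/...` URIs.

```python
def inject_sys_files(messages, sys_files, templates):
    """Split attachments and append text to the last user message.

    Hypothetical helper: returns the image URIs for the caller to route
    to a vision-capable model.
    """
    # Skip when the prompt already references the variable explicitly.
    if not sys_files or any("{sys.files}" in t for t in templates):
        return []
    images = [f for f in sys_files if f.startswith("data:image/")]
    texts = [f for f in sys_files if not f.startswith("data:image/")]
    if texts:
        for msg in reversed(messages):
            if msg["role"] == "user":
                msg["content"] += "\n\n" + "\n\n".join(texts)
                break
    return images


msgs = [{"role": "user", "content": "summarize the attachment"}]
imgs = inject_sys_files(
    msgs, ["report text", "data:image/png;base64,AAAA"], ["{sys.query}"]
)
```

Checking the templates first is what prevents double-injection when a prompt already contains `{sys.files}`.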

## Test plan

- [ ] Upload a PDF attachment to an agent with the default `{sys.query}`
      prompt and ask "summarize the attachment". The model should answer
      from the file content rather than searching the knowledge base.
- [ ] Upload an image attachment to an agent and ask about its contents.
      The model should switch to the vision-capable LLM and answer from
      the image.
- [ ] Verify that an agent whose prompt **does** include `{sys.files}`
      still works and does **not** include the file content twice.
- [ ] Verify that an agent run with no attachments behaves unchanged.
- [ ] Run `uv run pytest` to make sure no existing tests regress.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: yzc <yuzhichang@gmail.com>
2026-06-30 15:48:59 +08:00
Jin Hai
bd56a1473f Go CLI: merge function (#16458)
### Summary

1. remove unused code.

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-30 15:47:26 +08:00
chanx
9542e6d530 fix: adjust width of messageItemSectionLeft to fit-content (#16488) 2026-06-30 15:37:22 +08:00
Wang Qi
2018eec0dc Fix: allow any host for url for development (#16459) 2026-06-30 10:19:04 +08:00
dependabot[bot]
540acb4892 build(deps): bump crawl4ai from 0.8.9 to 0.9.0 (#16470) dev-20260630 2026-06-30 09:34:48 +08:00
maoyifeng
5276baf1f9 Go CLI: add admin_command response table function (#16454)
### Summary

Go CLI: add admin_command response table function

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-06-30 00:25:39 +08:00
Jin Hai
6370fce3f0 Go CLI: add show users plan summary (#16463)
### Summary

```
RAGFlow(admin)> show users plan summary;
+---------+----------------------------------------------------------------+
| field   | value                                                          |
+---------+----------------------------------------------------------------+
| command | show_users_plan_summary                                        |
| error   | 'Show users plan summary' is implemented in enterprise edition |
+---------+----------------------------------------------------------------+
```

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-29 22:28:45 +08:00
Attili-sys
5fc254eb2e Feature big query connector (#15871)
### What problem does this PR solve?

This PR adds Google BigQuery as a first-class data source connector in
RAGFlow.

It enables users to ingest and sync BigQuery data using the same
row-to-document model used by relational database connectors: selected
content columns become document text, metadata columns become document
metadata, an optional ID column provides stable document IDs, and an
optional timestamp column enables cursor-based incremental sync.

The connector supports service-account JSON credentials, table mode,
custom query mode, GoogleSQL queries, cursor-based incremental sync,
deleted-row pruning support, configurable query limits such as
`maximum_bytes_billed`, dry-run validation, batch loading, stable
document IDs, and BigQuery-aware value serialization.
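
The row-to-document model can be sketched without any BigQuery client. Everything named here (`row_to_document`, `next_cursor`, the document layout) is hypothetical and only illustrates the mapping described above; it is not the connector's code.

```python
def row_to_document(row, content_cols, metadata_cols, id_col=None):
    """Map one result row (a dict) to a document dict."""
    text = "\n".join(f"{c}: {row[c]}" for c in content_cols)
    doc = {"text": text, "metadata": {c: row[c] for c in metadata_cols}}
    if id_col is not None:
        # A stable ID lets re-syncs update documents instead of duplicating.
        doc["id"] = str(row[id_col])
    return doc


def next_cursor(rows, ts_col, cursor):
    """Advance the incremental-sync cursor to the newest timestamp seen."""
    return max([cursor] + [r[ts_col] for r in rows])


row = {"id": 7, "title": "Q2 report", "team": "ops", "updated": "2026-06-01"}
doc = row_to_document(row, ["title"], ["team"], id_col="id")
cursor = next_cursor([row], "updated", "2026-01-01")
```

The cursor comparison works on these ISO 8601 date strings because they sort lexicographically in chronological order.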
2026-06-29 22:08:40 +08:00
Jin Hai
1087a25f22 Revert "feat(go-api): Add Go chat session message delete and feedback APIs" (#16465)
Reverts infiniflow/ragflow#16442
2026-06-29 21:37:11 +08:00
writinwaters
c1175137e4 Docs: Added an FAQ (#16466)
### Summary

Added an FAQ.
2026-06-29 21:20:48 +08:00
Haruko386
1c0cdd84ce feat[Go]: implement searches/<search_id>/completions POST (#16440)
### Summary

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-29 20:07:12 +08:00
Jin Hai
7c1edca15e Go CLI: fix api commands (#16457)
### Summary

As title.

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-29 19:09:32 +08:00
Wang Qi
48b77022f4 [Go] Fix beta auth for /documents/images/:image_id and /documents/:id/preview and /thumbnails (#16453) 2026-06-29 19:08:49 +08:00
Hz_
a553886989 feat(go-api): Add Go chat session message delete and feedback APIs (#16442)
### Summary

```
/api/v1/chats/<chat_id>/sessions/<session_id>/messages/<msg_id> DELETE
/api/v1/chats/<chat_id>/sessions/<session_id>/messages/<msg_id>/feedback PUT
```

Migrates the chat session message delete and feedback APIs to the Go
server, matching the Python behavior for authorization, session
ownership checks, message/reference updates, and feedback validation.

### Testing

- `/usr/local/go/bin/go test ./internal/service ./internal/handler`
- Verified through the frontend page for deleting chat messages and
  updating message feedback
2026-06-29 19:05:50 +08:00
Hz_
a10a2d8769 fix(py): chat message reference deletion index (#16436)
Fix the reference index used when deleting a chat message pair.

Each user/assistant message pair shares one reference entry, while the
first assistant prologue has no reference. Using `i // 2` correctly
removes the reference for the deleted pair and avoids deleting the
previous turn's reference.
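
The index arithmetic can be checked with a small sketch. It assumes `messages[0]` is the assistant prologue and that `i` is the index of the user message in the pair being deleted; `delete_pair` is a hypothetical helper, not the service's function.

```python
def delete_pair(messages, references, i):
    """Delete the user/assistant pair starting at index i.

    Assumes messages[0] is the assistant prologue and i points at the
    user message of the pair, so pairs start at odd indices.
    """
    del messages[i:i + 2]
    # Pair at indices (1, 2) maps to reference 0, (3, 4) to reference 1, ...
    del references[i // 2]


messages = ["prologue", "q1", "a1", "q2", "a2"]
references = ["ref1", "ref2"]
delete_pair(messages, references, 3)
```

Deleting the second pair (`i = 3`) removes `ref2` and leaves the first turn's reference in place.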
2026-06-29 19:05:25 +08:00
Haruko386
445a13ee9a fix: new chat cannot be edited (#16434)
### What problem does this PR solve?

As title.

Main fix:

```go
if _, ok := req["meta_data_filter"]; !ok || req["meta_data_filter"] == nil {
	req["meta_data_filter"] = map[string]interface{}{}
}
```


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2026-06-29 19:04:59 +08:00
Haruko386
43f75fdfc7 fix: unable to upload avatar for search (#16437)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-29 19:04:30 +08:00
Haruko386
c5e10a1578 fix: auth middleware double responses on early rejection (#16444)
### Summary

As title.
2026-06-29 19:02:37 +08:00
Jack
98323e7910 Refactor: oss parser go refactor (#16391)
### What problem does this PR solve?

Package refactor and PDF post process.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-29 18:46:41 +08:00
Wang Qi
c0f64295c2 [Go] Fix searchbot retrieval_test accept kb_id as array, fix model recognize (#16452) 2026-06-29 17:17:20 +08:00
Jin Hai
3202ec6abf Go CLI: refactor commands (#16447)
### Summary

1. Move debug commands to dev file.
2. Refactor the syntax of some commands.

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-29 17:03:26 +08:00
Wang Qi
ec5cd6b1c0 [Go] Fix searchbot BETA auth (#16450) 2026-06-29 16:44:21 +08:00