Commit Graph

6526 Commits

Author SHA1 Message Date
Jin Hai
2061edd308 Remove unused codes (#15579)
### What problem does this PR solve?

Remove unused code.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 17:35:36 +08:00
Jack
b363146997 refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471)
## Summary

Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer
architecture with clear separation of concerns. The refactored code is
functionally equivalent to the original, verified through 400 passing
tests and a production-vs-dry-run comparison framework.

## Architecture

```
entry (task_manager)
  └─ orchestration (task_handler)
       ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor)
       │    └─ utilities (chunk_builder, chunk_post_processor, embedding_utils)
       └─ infrastructure (task_context, recording_context, interceptor)
```

Key design decisions:
- **TaskContext** — typed facade over raw task dict, injects rate
limiters + callbacks via composition
- **RecordingContext + Comparator** — enables side-by-side production vs
dry-run execution for safe migration
- **NullRecordingContext** — zero-allocation no-op for production, uses
`__slots__`
- **WriteOperationInterceptor** — FIFO replay of the previous run's function
return values for comparison mode
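
The no-op recorder and the FIFO replay idea can be sketched roughly as follows. This is an illustration only, not the PR's code: the `record` and `call` methods, the `ReplayInterceptor` name, and the fall-through to the real function once the queue is empty are all assumptions.

```python
from collections import deque


class NullRecordingContext:
    """No-op recorder. Empty __slots__ means instances carry no __dict__."""

    __slots__ = ()

    def record(self, key, value):
        # Hypothetical method: pass the value through without storing it.
        return value


class ReplayInterceptor:
    """Hypothetical stand-in for a write-operation interceptor."""

    def __init__(self, recorded_returns):
        # Values recorded during an earlier run, oldest first.
        self._queue = deque(recorded_returns)

    def call(self, func, *args, **kwargs):
        # Replay recorded return values in FIFO order instead of
        # executing the side-effecting function again.
        if self._queue:
            return self._queue.popleft()
        # Assumption: once the recording is exhausted, run the function.
        return func(*args, **kwargs)
```

Because the null context stores nothing, production code can call `record` unconditionally without branching on the run mode.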

## Migration Strategy

The original `handle_task()` in `task_executor.py` uses a 3-way switch
via `TE_RUN_MODE`:
- `TE_RUN_MODE=0` (default) → runs refactored code
- `TE_RUN_MODE=1` → runs both original + refactored, compares all
intermediate results
- `TE_RUN_MODE=2` → runs original code (fallback)

The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values
(chunks, vectors, token counts, func return values) from the production
run and replays them during dry-run, then uses `ContextComparator` to
report mismatches.
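
A minimal sketch of the three-way switch, assuming the mode is read from the environment. Only `TE_RUN_MODE` and its three values come from the description above; `select_mode`, the callables passed to `handle_task`, and the choice to return the original result in comparison mode are hypothetical.

```python
import os


def select_mode(env=None):
    """Read TE_RUN_MODE, defaulting to 0 (refactored code)."""
    env = os.environ if env is None else env
    mode = int(env.get("TE_RUN_MODE", "0"))
    if mode not in (0, 1, 2):
        raise ValueError(f"unsupported TE_RUN_MODE: {mode}")
    return mode


def handle_task(task, run_original, run_refactored, compare, env=None):
    mode = select_mode(env)
    if mode == 0:
        return run_refactored(task)
    if mode == 2:
        return run_original(task)
    # mode == 1: run both paths and hand the results to the comparator.
    original = run_original(task)
    refactored = run_refactored(task)
    compare(original, refactored)
    # Assumption: the original result stays authoritative while comparing.
    return original
```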

## Functional Equivalence Fixes

All divergences between original and refactored code were identified and
fixed:
- Timeout decorators (handle/build_chunks/raptor/embedding)
- NullRecordingContext leak in finally block causing RuntimeError
- MinIO None-binary check with proper FileNotFoundError
- Dataflow dispatch after embedding binding + init_kb
- Memory task missing return after processing
- RAPTOR checkpoint progress reporting
- Tag cache (get_tags_from_cache/set_tags_to_cache) restoration
- dataflow_id correction in _load_dsl
- Language default Chinese, dead code guard removal
- embed_chunks made async with proper thread_pool_exec
- Full GraphRAG default configuration (10 parameters)
- Hardcoded q_768_vec fallback removal in RAPTOR

## Test Changes

- 20 new tests covering table parser manual mode, tag cache, embedding
edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary
None, cancel cleanup, metadata=None boundary
- Unified `make_task_context`/`make_task_dict` factories eliminated 10+
duplicated helpers
- DataflowService tests migrated from internal method mocks to IO
boundary mocks (real orchestration code executes)
- Parametrized duplicate build_chunks post-processor tests
- 7 raptor tests modernized to @pytest.mark.asyncio
- Mock count per test reduced through boundary-level mocking strategy

**Test count: 400 passing, 0 warnings, 0 skips**

## Files Changed

| File | Change |
|------|--------|
| `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) |
| `rag/svr/task_executor_refactor/task_handler.py` | Orchestration
layer, 8 logic fixes |
| `rag/svr/task_executor_refactor/chunk_service.py` | +timeout +
None-check |
| `rag/svr/task_executor_refactor/embedding_service.py` | sync→async
rewrite |
| `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix
+ timeout |
| `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix +
assert |
| `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache
restore |
| `rag/svr/task_executor_refactor/task_context.py` | language default
fix |
| `test/.../conftest.py` | +294 lines shared helpers |
| `test/.../*.py` | 15 test files refactored, 20 new tests |

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
Jin Hai
d736f358ba Go: refactor model provider (#15568)
### What problem does this PR solve?

1. Add license announcement
2. Add sanity check on API config
3. Add base class: BaseModel
4. Add GetBaseURL

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 16:33:58 +08:00
Wang Qi
d6fc50a469 Fix: no more @token_required (#15562)
Fix: no more @token_required
2026-06-03 16:24:08 +08:00
chanx
a678ed7b1f Fix: Switching pagesize on a chunk page did not reset the current page. (#15401)
### What problem does this PR solve?

Fix: Switching pagesize on a chunk page did not reset the current page.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-03 15:57:57 +08:00
Idriss Sbaaoui
1134769940 Chore: update cohere models (#15576)
### What problem does this PR solve?

Remove old Cohere models and add the latest ones.

### Type of change

- [x] Refactoring
- [x] Other (please describe): update models
2026-06-03 15:55:45 +08:00
Haruko386
473d06d1ad feat[Go]: implement add multi_models (#15563)
2026-06-03 15:26:46 +08:00
buua436
c0e00a7f6e Fix: agent template smart_customer_service_specialist.json (#15565)
### What problem does this PR solve?

agent template smart_customer_service_specialist.json

### Type of change

- [x] Refactoring
2026-06-03 15:05:39 +08:00
Lynn
ac3964b6bc Feat: display intl url for siliconflow and verify model provider without llms in json (#15550)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-03 14:43:08 +08:00
Jin Hai
dbebc66ba8 Go: refactor provider code (#15564)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 14:09:07 +08:00
Jin Hai
e1f19f6679 Go: fix gitee balance api (#15554)
```
RAGFlow(user)> create provider 'gitee' instance 'intl' key 'api-token' url 'https://ai.gitee.com/v1' region 'intl';
SUCCESS
```

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 13:23:20 +08:00
chanx
c41855da81 Fix: Model provider add verify and fixed form in modal not resetting issue (#15520)
### What problem does this PR solve?

Fix: Model provider add verify and fixed form in modal not resetting
issue

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-03 11:59:57 +08:00
buua436
76fc1d547f Refa: refine mysql migration version workflow (#15549)
### What problem does this PR solve?

refine mysql migration version workflow

### Type of change

- [x] Refactoring
2026-06-03 11:51:42 +08:00
bitloi
a75ea7ba7c Fix: Chat completion generation parameter overrides (#15389)
### What problem does this PR solve?

Closes #15388.

Chat completion routes did not reliably honor per-request generation
settings:

- `/api/v1/chat/completions` copied generation settings with a
truthiness check, so valid zero values such as `temperature: 0`, `top_p:
0`, `frequency_penalty: 0`, `presence_penalty: 0`, and `max_tokens: 0`
were dropped.
- `/api/v1/openai/{chat_id}/chat/completions` did not forward standard
generation settings into the request-specific dialog LLM settings before
calling `async_chat`.

This PR preserves explicitly supplied generation parameters, including
zero values, and merges request-level overrides into existing dialog
settings where appropriate.

The supported generation parameter keys and merge behavior live in a
shared REST API helper to keep both completion routes aligned.
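
The zero-value problem can be illustrated with a small merge helper. This is a sketch, not the helper added in the PR: `GENERATION_KEYS` and `merge_generation_params` are hypothetical names, and treating an explicit `None` as "not supplied" is an assumption.

```python
GENERATION_KEYS = (
    "temperature",
    "top_p",
    "frequency_penalty",
    "presence_penalty",
    "max_tokens",
)


def merge_generation_params(dialog_settings, request_body):
    """Overlay request-level generation params on the dialog settings."""
    merged = dict(dialog_settings)
    for key in GENERATION_KEYS:
        # Test for presence, not truthiness, so that 0 and 0.0 survive.
        if key in request_body and request_body[key] is not None:
            merged[key] = request_body[key]
    return merged
```

With a truthiness check such as `if request_body.get(key):`, a request carrying `temperature: 0` would silently keep the dialog's stored temperature instead.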

Validation:

- `git diff --check`
- `python3 -m py_compile api/apps/restful_apis/_generation_params.py
api/apps/restful_apis/chat_api.py api/apps/restful_apis/openai_api.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `uv run ruff check api/apps/restful_apis/_generation_params.py
api/apps/restful_apis/chat_api.py api/apps/restful_apis/openai_api.py
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py`
- `ZHIPU_AI_API_KEY=dummy uv run pytest
test/testcases/test_http_api/test_session_management/test_session_sdk_routes_unit.py
-q -k generation_params`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-03 11:46:10 +08:00
kpdev
76968af0ba Guard missing storage blobs on preview and image endpoints (#15366)
Fixes [#15365](https://github.com/infiniflow/ragflow/issues/15365) —
`get_document_image()` and document preview call `make_response(None)`
when storage returns no bytes, causing HTTP 500.
2026-06-03 11:33:03 +08:00
VictorECDSA
ff5971448b [Fix] naive: force-merge short markdown headers to prevent separate chunks (#15488)
## Problem

When uploading `.md` files with `parser=naive` and `delimiter="\n"`,
markdown headers (e.g., `## Quick Travel`) become separate chunks with
very short content (16-18 characters). This causes retrieval issues:
when the header is matched, the corresponding body text is not included
in the chunk.
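
One way to picture the force-merge is the sketch below. It is not the PR's implementation: the header regex, the helper names, and the rule "a chunk made only of header lines is attached to the next chunk" are assumptions chosen for illustration.

```python
import re

# Assumption: an ATX header is 1 to 6 '#' characters, whitespace, then text.
HEADER_RE = re.compile(r"^#{1,6}\s+\S")


def is_header_only(chunk):
    lines = [ln for ln in chunk.splitlines() if ln.strip()]
    return bool(lines) and all(HEADER_RE.match(ln) for ln in lines)


def merge_header_chunks(chunks):
    """Attach header-only chunks to the chunk that follows them."""
    merged = []
    pending = None
    for chunk in chunks:
        if pending is not None:
            chunk = pending + "\n" + chunk
            pending = None
        if is_header_only(chunk):
            pending = chunk  # hold it until body text arrives
        else:
            merged.append(chunk)
    if pending is not None:
        merged.append(pending)  # trailing header with no body
    return merged
```

For example, `merge_header_chunks(["## Quick Travel", "Use the map."])` yields a single chunk containing both the header and its body.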

## Related Issues

Closes #15487

## Checklist

- [x] Code changes are minimal and focused
- [x] Unit tests added (12/12 passed)
- [x] No breaking changes
2026-06-03 10:49:28 +08:00
Wang Qi
583daf47d5 Fix: model provider orders (#15524)
Fix: model provider orders
2026-06-03 10:17:12 +08:00
Hz_
9799f33549 GOCli check provider region (#15474)
## Summary
- add CLI command `CHECK PROVIDER 'provider_name' REGION 'region_name'
KEY 'api_key';`
- route the command through CLI parser and command dispatcher
- call `GET /api/v1/providers/:provider_name/connection` with `region`
and `api_key`

## Testing
- `go test ./internal/cli/...`
- manually verified CLI command parsing and request flow
2026-06-02 19:34:25 +08:00
ちー
5f8926410d feat[Go]: implement /api/v1/connectors/<connector_id> PATCH (#15512)
### What problem does this PR solve?

As title; all tests pass.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-02 19:34:07 +08:00
Haruko386
9f969feb89 feat[Go] implement check connection by using apikey and region (#15475)
### What problem does this PR solve?

**Verified with Postman**


GET http://127.0.0.1:9384/api/v1/providers/gitee/connection
```json
body: 

{
    "api_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "default"

}

resp: 
{
    "code": 0,
    "message": "success"
}
```

GET http://127.0.0.1:9384/api/v1/providers/gitee/connection
```json
body: 

{
    "api_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "deprecated"

}

resp: 
{
    "code": 0,
    "message": "success"
}
```

GET http://127.0.0.1:9384/api/v1/providers/gitee/connection
```json
body: 

{
    "api_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "china"

}

resp: 
{
    "code": 0,
    "message": "success"
}

```

GET http://127.0.0.1:9384/api/v1/providers/lmstudio/connection
```json
body: 

{
    "api_key": "",
    "region": "test"

}

resp: 
{
    "code": 0,
    "message": "success"
}
```


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-02 19:32:41 +08:00
Lynn
36357a6afd Fix: model provider (#15517)
### What problem does this PR solve?

Fix:
- Handle siliconflow and siliconflow_intl api_key

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 19:04:20 +08:00
Wang Qi
d41373cfa9 Feature: Add the new anthropic and voyage models (#15516)
Add the new Anthropic and Voyage models. Strip Opus 4.7 and 4.8 of
certain unsupported keys.

Co-authored-by: Idriss Sbaaoui <112825897+6ba3i@users.noreply.github.com>
2026-06-02 17:29:18 +08:00
Wang Qi
c990badda1 Feature: Add MiniMax M3 (#15513)
Feature: Add MiniMax M3
2026-06-02 17:28:48 +08:00
Alexander Laurent
a98889cd76 feat: add Go MCP server update API (#15261)
## What

#15240
implementation for PUT /api/v1/mcp/servers/:mcp_id

## Changes

- Adds the Go implementation for `PUT /api/v1/mcp/servers/:mcp_id`.
- Wires MCP service and handler into the Go server/router for the update
route.
- Preserves Python-style behavior for ownership checks, partial update
fields, MCP type/name/URL validation, `headers`/`variables`
normalization, and tool metadata scrubbing.
2026-06-02 15:58:44 +08:00
Dexterity
2819d0ea24 fix(go-models): use per call context timeouts so long streaming responses are not truncated (#15380)
### What problem does this PR solve?

Closes #15379 

Around 29 Go model providers in `internal/entity/models/` share an
`http.Client` configured with `Timeout: 120 * time.Second`, and reuse
that same client for `ChatStreamlyWithSender`. Go's
`http.Client.Timeout` is a hard ceiling on the whole request that also
covers reading the response body, so it behaves as a wall clock on
streaming. Any streamed chat response that lasts longer than 120 seconds
gets cut off in the middle with a timeout error. Long generations,
reasoning model outputs, and slow or overloaded upstreams are the common
victims.

The providers that already behave correctly (`groq`, `mistral`,
`voyage`, `anthropic`) set no client `Timeout` and instead wrap each
request in a `context.WithTimeout`. This change converges the affected
providers onto that same pattern.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-06-02 15:27:26 +08:00
buua436
4018f02d96 Feat: mark mysql migrations as applied (#15504)
### What problem does this PR solve?

mark mysql migrations as applied

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-02 15:04:33 +08:00
glorydavid03023
5733e0624c fix(go-models): harden N1N default transport handling (#15351)
## Summary
- Harden `NewN1NModel` to avoid panics when `http.DefaultTransport` is a
custom non-`*http.Transport` RoundTripper.
- Fallback to a safe transport (`ProxyFromEnvironment`) while preserving
existing pooling/timeout settings.
- Add `n1n_test.go` with coverage for name/factory plus
`TestN1NNewModelWithCustomDefaultTransport`.


Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-02 13:40:10 +08:00
Hz_
1092f624fb fix: post /api/v1/system/tokens (#15410)
### What problem does this PR solve?

This PR aligns `POST /api/v1/system/tokens` in Go with the Python
implementation.

### Type of change

- Keep the token creation flow under the system API route.
- Preserve the owner-tenant authorization check.
- Generate and persist API tokens consistently with the current Go
service flow.
- Return the created token payload in the standard API response format.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-06-02 13:39:07 +08:00
Lynn
3bc5ed282e Fix: model-provider bugs (#15460)
### What problem does this PR solve?

Fix:
- Use `@` to avoid splitting on `_` in model_name.
- Verify api_key when adding an instance.
- Pop api_key from the list-instances response.
- Remove useless index.
- Sort providers, instances and models by name.
- Get `is_tools` from llm_factories.json

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 13:24:53 +08:00
Haruko386
0e9eeb7b88 feat[Go] implement /api/v1/datasets/<dataset_id>/metadata/config (#15493)
### What problem does this PR solve?

implement /api/v1/datasets/<dataset_id>/metadata/config

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-02 13:24:28 +08:00
dripsmvcp
d4f1c2c95c fix(go-models): remove duplicate roundTripperFunc from novita_test.go (#15492)
Remove duplicated function
2026-06-02 13:23:39 +08:00
ちー
e4ef9834da fix: rewrite enable thinking mode for minimax (#15496)
### What problem does this PR solve?

fix the bad thinking mode for minimax

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 13:22:11 +08:00
Aeovy
600590cd18 Fix: disable thinking to avoid potential infinite loops in Qwen3.5/Qwen3.6 models (#15101)
### What problem does this PR solve?

This PR fixes the issue where Qwen3.5/Qwen3.6 series models may spend
excessive time on simple document-parsing tasks, such as Auto Metadata
extraction, keyword extraction, question generation, and image
description when using the MinerU parser.

For these tasks, Qwen3.5/Qwen3.6 models may perform unnecessary
reasoning by default, which can lead to very long response times, high
token consumption, and, in some cases, potential infinite output loops.

Since Qwen3.5/Qwen3.6 multimodal models are instantiated as `CvModel`
when configured as `image2text`, the existing `enable_thinking=False`
logic in `chat_model.py` does not apply to them. This PR adds the
corresponding handling for the CV/image-to-text model path as well.

This helps reduce unnecessary thinking time, avoid potential infinite
loops, and improve parsing efficiency without noticeably affecting
output quality for these simple extraction and image-description tasks.

Fixes #15083.
2026-06-02 13:21:35 +08:00
nickmopen
5b02fe4841 fix(api): stop duplicating answer in openai-compatible chat completions stream (#15286) (#15443)
### What problem does this PR solve?

Fixes #15286.

When calling `/api/v1/openai/<chat_id>/chat/completions` with `"stream":
true`, the response contains the answer **twice** — the final message
repeats everything that was already streamed.

#### Root cause

RAGFlow's `async_chat` streams the body as incremental `delta.content`
chunks, then emits a terminating `final` event whose `answer` is the
**complete** (decorated) message. The handler re-emitted that full
answer as one more `delta.content` chunk:

```python
if ans.get("final"):
    if ans.get("answer"):
        full_content = ans["answer"]
        response["choices"][0]["delta"]["content"] = full_content   # <-- whole answer again
        yield ...
```

So a client accumulating `delta.content` ends up with the message
duplicated.

#### Fix

Drop the re-emission. The complete answer from the `final` event is now
surfaced **only** through the trailing chunk's `final_content` and
`reference` fields, which matches OpenAI streaming semantics: deltas are
incremental, and the final chunk carries only `finish_reason` / `usage`
(plus RAGFlow's `reference` / `final_content` extensions).

This matches the expected behavior described in the issue: "The stream
should only yield content chunks once, and the final message should only
contain reference, usage, and finish_reason."
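
The corrected behavior can be sketched with a plain generator. This is a simplified illustration, not the `_stream_chat_completion_sse` generator from the PR: the event shape (incremental text in `answer`, a terminating event flagged `final`) is inferred from the snippet above, and the output dict layout is hypothetical.

```python
def to_openai_chunks(events):
    """Turn chat events into OpenAI-style stream chunks."""
    for ev in events:
        if ev.get("final"):
            # The complete answer is NOT re-emitted as delta content;
            # it travels only in the extension fields.
            yield {
                "delta": {"content": None},
                "finish_reason": "stop",
                "final_content": ev.get("answer"),
                "reference": ev.get("reference"),
            }
        else:
            yield {
                "delta": {"content": ev.get("answer", "")},
                "finish_reason": None,
            }
```

A client that concatenates every `delta.content` now ends up with the answer exactly once.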

#### Testability refactor

The streaming SSE assembly was a closure inside the request handler, so
it could only be exercised against a live server + real LLM. I extracted
it into a module-level `_stream_chat_completion_sse` async generator
(behavior-preserving) so it can be unit-tested with a fake event stream.

#### Tests

Adds
`test/unit_test/api/apps/restful_apis/test_openai_stream_no_duplicate.py`
(same import-stub pattern as the existing `test_get_agent_session.py`):

- body is streamed exactly once (the regression);
- the complete answer is never re-emitted as a content chunk;
- the terminating chunk has `finish_reason="stop"`, `content=None`, and
correct `usage`;
- `final_content` / `reference` are present on the trailing chunk;
- reasoning (`think`) deltas stream separately and are not duplicated.

> Note: this is unrelated to #15442, which only changes the `stream`
default — it does not touch the duplication logic.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Added test cases

---------

Co-authored-by: Wang Qi <wangq8@outlook.com>
2026-06-02 13:20:40 +08:00
buua436
2e02bf7ba4 Fix: migrate legacy model id configs (#15495)
### What problem does this PR solve?

migrate legacy model id configs

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 13:08:58 +08:00
Julian
33ef724b5f Add Bulk action for linking Multiple Files to Datasets (#14960)
### What problem does this PR solve?

Feature: #14961 


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-06-02 12:23:33 +08:00
kpdev
0f6f7b3c3c fix(api): document image_id parsing for hyphenated thumbnail keys (#15115) (#15116)
### What problem does this PR solve?

Fixes #15115.

`GET /api/v1/documents/images/<image_id>` returned **Image not found**
when the thumbnail storage object key contained hyphens (e.g.
`page-1.png`). Document APIs build URLs as `{dataset_id}-{thumbnail}`,
but `get_document_image()` used `image_id.split("-")` and required
exactly two segments, so keys like `<kb_id>-page-1.png` were rejected
even though the blob existed.

This PR splits only on the first hyphen (`split("-", 1)`) and sets
`Content-Type` from the object key extension via `CONTENT_TYPE_MAP`
instead of hardcoding `image/JPEG`.
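
The parsing change can be sketched as below. `parse_image_id` is a hypothetical helper, and the sketch assumes the dataset ID itself contains no hyphen.

```python
def parse_image_id(image_id):
    """Split '<dataset_id>-<object_key>' on the first hyphen only."""
    parts = image_id.split("-", 1)
    if len(parts) != 2 or not all(parts):
        return None  # malformed: no hyphen, or an empty half
    dataset_id, object_key = parts
    return dataset_id, object_key
```

With the old `image_id.split("-")`, an ID such as `kb123-page-1.png` produces three segments and fails the "exactly two" check; splitting once keeps `page-1.png` intact.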
2026-06-02 10:54:14 +08:00
kpdev
a4bc066f74 fix(rag): id2image parsing for hyphenated storage object keys (#15117) (#15118)
### What problem does this PR solve?

Fixes #15117.

Chunk images are stored with `img_id = f"{bucket}-{objname}"` in
`image2id()` (`rag/utils/base64_image.py`). When loading via
`id2image()`, the code used `image_id.split("-")` and required exactly
two segments. Object keys that contain hyphens (e.g. `page-1.jpg`)
produce more than two segments, so `id2image` returns `None` and chunk
image previews fail even though the blob exists.

This is the same parsing issue as #15115 (HTTP thumbnail route); this PR
fixes the indexing/retrieval path.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Test plan

- [x] `pytest test/unit_test/rag/utils/test_base64_image.py`
- [ ] Manual: index a chunk with an `objname` containing hyphens and
confirm `img_id` resolves to an image in retrieval

2026-06-02 10:52:51 +08:00
jony376
088d8448ae fix(migration): parameterize tenant_model_provider inserts in mysql_migration (#15313)
### Related issues
Closes #15312

### What problem does this PR solve?

`tools/scripts/mysql_migration.py` built batch INSERT SQL for the
`tenant_model_provider` stage using f-strings with raw `llm_factory` and
`tenant_id` values. If either value contained a single quote, migration
SQL could fail; this also created unnecessary SQL-injection risk in the
migration path.

This PR replaces string interpolation with parameterized SQL
placeholders in `TenantModelProviderStage.execute()`. The migration now
safely handles quoted values and executes deterministically across
existing tenant data.
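
The difference between interpolation and placeholders can be shown with a self-contained sketch. It uses the standard-library `sqlite3` module purely so the example runs anywhere; the real script targets MySQL, whose drivers typically use `%s` placeholders rather than `?`. The two-column table layout and the `insert_providers` helper are assumptions.

```python
import sqlite3


def insert_providers(conn, rows):
    """Batch insert using placeholders; the driver handles quoting."""
    conn.executemany(
        "INSERT INTO tenant_model_provider (tenant_id, llm_factory) "
        "VALUES (?, ?)",
        rows,
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenant_model_provider (tenant_id TEXT, llm_factory TEXT)"
)
# A value containing a single quote would break f-string-built SQL.
insert_providers(conn, [("t1", "O'Reilly AI"), ("t2", "plain")])
```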
2026-06-02 10:29:41 +08:00
Hernandez Avelino
09d0a17453 fix(api): handle array message content on OpenAI chat completions (#15359)
### Related issues

Closes #15358

---

### What problem does this PR solve?

`POST /api/v1/openai/<chat_id>/chat/completions` forwards `messages` to
`async_chat` without normalizing `content`. Downstream, `dialog_service`
assumes string content:

```python
re.sub(r"##\d+\$\$", "", m["content"])
```

OpenAI-compatible clients may send `content` as an **array** of parts
(text, `image_url`, etc.), including text-only arrays. That causes
`TypeError` and HTTP **500** instead of a valid response or a clear
**400**.

`openai_api.py` also reads `messages[-1]["content"]` directly for
`prompt` without handling list-shaped content.

This PR normalizes array `content` to a string (concatenating `type:
text` parts) before calling `async_chat`, matching a minimal
OpenAI-compat path. Image parts can be documented as unsupported or
handled in a follow-up if vision integration is required.
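
A minimal sketch of the normalization, assuming text parts are joined with no separator and non-text parts are dropped. `normalize_content` is a hypothetical name, and the handling of `None` is an assumption.

```python
def normalize_content(content):
    """Collapse OpenAI-style content into a plain string."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        texts = []
        for part in content:
            # Keep only {"type": "text", "text": "..."} parts.
            if isinstance(part, dict) and part.get("type") == "text":
                texts.append(str(part.get("text", "")))
        return "".join(texts)
    raise TypeError(f"unsupported content type: {type(content).__name__}")
```

After this step, downstream code that calls `re.sub` on `m["content"]` always receives a string.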
2026-06-02 10:27:03 +08:00
Jack
67a3ed7558 Fix auto metadata type issue (#15338)
### What problem does this PR solve?

Fix auto metadata type issue
https://github.com/infiniflow/ragflow/issues/15323

Type information is missing on the frontend; the backend correctly stores
the type information for the auto metadata type.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 10:23:04 +08:00
Rene Arredondo
e1403171f1 fix(chat): sanitize NaN/Inf scores before serializing chat completions (#15245) (#15266)
## Summary

Fixes #15245 — `POST /api/v1/chat/completions` with `stream=true`
intermittently returns 500:

```
data:{"code": 500, "message": "failed to encode response: json:
unsupported value: NaN (status code: 500)", "data": {...}}
```

…even though "the same question" works on retry.

## Root cause

The streaming path serialized the answer with bare `json.dumps(...)`
(`api/apps/restful_apis/chat_api.py:1221`). `json.dumps` defaults to
`allow_nan=True` and emits the literal token `NaN` for NaN /
Infinity float values. That is valid Python-flavored JSON but
**invalid per RFC 8259**, so downstream consumers reject it. The
reporter's gateway is Go-based and the error wording
(`failed to encode response: json: unsupported value: NaN`) is
straight from Go's `encoding/json`.

How NaN gets into the payload: retrieval scoring in
`rag/nlp/search.py` runs `np.mean(...)` over aggregations that can
be empty, and similarity denominators can be zero. Reference chunk
fields like `similarity`, `vector_similarity`, `term_similarity`
can therefore be NaN depending on which chunks a given query
retrieves — which is exactly why the failure is intermittent for
the same question.

The non-streaming branch (`get_json_result(data=answer)`,
`chat_api.py:1243`) has the same vulnerability — Quart's `jsonify`
also defaults to `allow_nan=True` and the same retrieval pipeline
feeds both branches.

`agent/tools/exesql.py:88-102` already has the same NaN/Inf guard
for SQL results. This PR brings the chat completions path up to
parity.

## Fix

Add a small `_sanitize_json_floats(obj)` helper near the top of
`api/apps/restful_apis/chat_api.py`. It walks `dict` / `list` /
`tuple` and replaces any `float` that is `NaN` or `±Infinity` with
`None`. Apply it at the two serialization boundaries:

- **Streaming branch** (`stream()`): sanitize the SSE payload before
  `json.dumps`.
- **Non-streaming branch**: sanitize the `answer` dict before
  `get_json_result(data=...)`.

The terminal `data:True` frame and the `code:500` error frame carry
no scores and are left untouched.

Added `import math` to the existing alphabetical import block.

No change to retrieval logic — replacing NaN with `null` at the
serialization boundary is conservative: clients still parse the
JSON, a missing-score chunk is a strictly better failure mode than
a 500 that kills the whole reply.
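
The sanitizer described above might look roughly like this. It is a sketch, not the PR's `_sanitize_json_floats`; in particular, returning tuples as lists is a simplification made here.

```python
import json
import math


def sanitize_json_floats(obj):
    """Replace NaN and +/-Infinity floats with None, recursively."""
    if isinstance(obj, float):
        return obj if math.isfinite(obj) else None
    if isinstance(obj, dict):
        return {k: sanitize_json_floats(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Simplification: tuples come back as lists (JSON has no tuples).
        return [sanitize_json_floats(v) for v in obj]
    return obj


payload = {"similarity": float("nan"), "scores": [1.0, float("inf")]}
clean = sanitize_json_floats(payload)
# allow_nan=False makes json.dumps raise if any NaN/Inf slipped through.
encoded = json.dumps(clean, allow_nan=False)
```

Passing `allow_nan=False` at the serialization boundary is a cheap way to confirm that nothing non-finite remains in the payload.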

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 10:08:34 +08:00
nickmopen
bebf6ed244 fix(llm): strip non-generation keys from gen_conf for LiteLLM providers (#15427) (#15432)
### What problem does this PR solve?

Fixes #15427.

All LiteLLM-routed chats fail with:

- Anthropic: `litellm.BadRequestError: AnthropicException -
{"type":"invalid_request_error","message":"model_type: Extra inputs are
not permitted"}`
- OpenAI: `litellm.BadRequestError: OpenAIException - Unknown parameter:
'model_type'`

This is a regression from v0.25.4.

#### Root cause

A chat assistant's `llm_setting` is forwarded to the model as
`gen_conf`. `llm_setting` can legitimately carry RAGFlow-internal
metadata such as `model_type` (the chat REST APIs in
`api/apps/restful_apis/` read it back out of `llm_setting`), so that key
ends up inside `gen_conf`.

`Base._clean_conf` (OpenAI-compatible providers) already **whitelists**
the keys it forwards, so direct-OpenAI providers were unaffected.
`LiteLLMBase._clean_conf` only dropped `max_tokens` and passed
everything else straight through to `litellm.acompletion`, which
forwarded `model_type` to the upstream provider — and Anthropic / OpenAI
reject it. Because both Claude and GPT route through LiteLLM, every chat
broke.

#### Fix

- Extract the allowed-key set into a shared `ALLOWED_GEN_CONF_KEYS`
constant and reuse it in `Base._clean_conf`.
- Apply the same whitelist in `LiteLLMBase._clean_conf`, plus the
LiteLLM-specific reasoning params (`thinking`, `reasoning_effort`,
`extra_body`) that the model-family policies inject for reasoning
models.

This covers all four LiteLLM completion paths (`async_chat`,
`async_chat_streamly`, `async_chat_with_tools`,
`async_chat_streamly_with_tools`), since they all route through
`_clean_conf`.
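
The whitelist approach can be sketched as follows. The contents of `ALLOWED_GEN_CONF_KEYS` here are illustrative guesses, not the project's actual set; `LITELLM_EXTRA_KEYS` and the standalone `clean_conf` function are hypothetical names. Only the three reasoning keys are taken from the description above.

```python
# Illustrative only: the real constant may list different keys.
ALLOWED_GEN_CONF_KEYS = frozenset(
    {"temperature", "top_p", "presence_penalty", "frequency_penalty", "stop"}
)

# Reasoning params named in the PR description.
LITELLM_EXTRA_KEYS = frozenset({"thinking", "reasoning_effort", "extra_body"})


def clean_conf(gen_conf, extra_allowed=frozenset()):
    """Forward only whitelisted generation keys to the provider."""
    allowed = ALLOWED_GEN_CONF_KEYS | extra_allowed
    return {k: v for k, v in gen_conf.items() if k in allowed}
```

Because unknown keys are dropped by default, internal metadata such as `model_type` can no longer reach the upstream API, and `max_tokens` is removed simply by not being in the allowed set.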

#### Tests

Adds `test/unit_test/rag/llm/test_clean_conf_whitelist.py` covering both
backends: `model_type` (and other stray keys) are dropped, genuine
generation params and `thinking` survive, `max_tokens` is removed, and
the whitelist invariants hold.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Added test cases
2026-06-02 10:04:11 +08:00
buua436
eaa19bdb02 Fix:empty chat model fallback (#15477)
### What problem does this PR solve?

empty chat model fallback

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 10:00:57 +08:00
web-dev0521
1696d4ead6 feat(go-api): implement password-reset flow (issue #15282) (#15293)
## Summary

Ports the Python password-reset flow to Go, adding 4 unauthenticated
endpoints under `/api/v1/auth/password/`:

- `POST /auth/password/forgot/captcha` — generates and returns a PNG
captcha image; stores the plaintext code in Redis (60 s TTL)
- `POST /auth/password/forgot/otp` — verifies captcha, enforces resend
cooldown (60 s), generates HMAC-SHA256-hashed OTP (300 s TTL), sends
plain-text email via SMTP
- `POST /auth/password/forgot/otp/verify` — verifies OTP with attempt
counting (lock after 5 failures for 30 min), sets a
`otp:verified:{email}` flag (300 s TTL) on success
- `POST /auth/password/reset` — checks verified flag, decrypts +
validates passwords, updates user record, auto-logs in (issues JWT,
returns user profile)
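
The OTP hashing step can be illustrated in Python, the language of the flow being ported. This is a sketch under assumptions: the six-digit length, the key handling, and all function names are hypothetical, and whether the real flow mixes in a salt is not stated above.

```python
import hashlib
import hmac
import secrets


def generate_otp(digits=6):
    """Random numeric code from a cryptographically secure source."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def hash_otp(otp, key):
    """HMAC-SHA256 of the OTP; only this hash would be stored."""
    return hmac.new(key, otp.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_otp(candidate, stored_hash, key):
    # Constant-time comparison avoids leaking how many characters match.
    return hmac.compare_digest(hash_otp(candidate, key), stored_hash)
```

Storing only the hash means a leaked cache entry does not reveal the code that was emailed to the user.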

Closes #15282
2026-06-02 09:38:02 +08:00
Alexander Laurent
1748723971 feat: add Go MCP server list API (#15253)
## What
#15240 
Implements `GET /api/v1/mcp/servers` in the Go API server.

## Changes

- Added MCP server DAO list query with tenant scoping.
- Added MCP service response wrapper.
- Added MCP handler for list request parsing and response formatting.
- Wired `GET /api/v1/mcp/servers` under authenticated `/api/v1` routes.
- Initialized MCP service and handler in the Go server startup.
- update_time and update_date now both map to update_date
- create_time and create_date now both map to create_date
- default ordering now returns create_date

## API Behavior

Matches the Python endpoint behavior:

- Requires authenticated user.
- Lists MCP servers for the current user tenant.
- Supports `keywords`.
- Supports `mcp_id` and repeated/comma-separated `mcp_ids`.
- Supports `page`, `page_size`, `orderby`, and `desc`.
- Returns:

```json
{
  "code": 0,
  "message": "success",
  "data": {
    "mcp_servers": [],
    "total": 0
  }
}
```
2026-06-02 09:37:05 +08:00
David Myriel
3aea80f5f5 docs: add Tigris as S3-compatible storage backend, fix s3 region field name (#15361)
## Summary

Add Tigris configuration to the Configuration and Backup & migration
pages, using the existing AWS_S3 backend — no code changes required.
Fix `region` → `region_name` in the existing S3 config example in
`backup_and_migration.md`. The code in `s3_conn.py` reads `region_name`,
so the previous field name was silently ignored.

## Context

With MinIO's open-source repository archived (#13840 on
infiniflow/ragflow), users need documented alternatives for object
storage. Tigris is S3-compatible and works with RAGFlow's existing
AWS_S3 backend out of the box.

## Changes

- `configurations.md`: Added `### s3 (Tigris)` section after `### minio`,
matching the existing reference style. Includes config block, field
descriptions, and a pointer to `service_conf.yaml.template` for other
S3-compatible backends.
- `backup_and_migration.md`: Added Tigris config block under single-bucket
mode. Fixed `region` → `region_name` in the existing S3 example. Added
Tigris to the supported backends list.

## Notes

- No new files; edits to existing docs only.
- Config field names (`access_key`, `secret_key`, `region_name`,
`endpoint_url`, `bucket`, `prefix_path`, `signature_version`,
`addressing_style`) verified against `rag/utils/s3_conn.py`.
2026-06-01 20:47:33 +08:00
writinwaters
c2597f132e Docs: Added a guide on how to ingest an RSS feed. (#15467)
### What problem does this PR solve?

Added a guide on how to ingest an RSS feed.

### Type of change

- [x] Documentation Update
2026-06-01 20:23:36 +08:00
monsterDavid
d398d617ca fix(mineru): skip page chrome blocks to prevent duplicate chunks (#15387)
## Summary
- Skip MinerU `header`, `footer`, and `page_number` blocks when
converting `content_list.json` into sections.
- Ignore unsupported block types explicitly so future MinerU output
types cannot re-emit the previous text block.

Fixes duplicate text in General/naive chunks when parsing PDFs via
MinerU (reported with repeated page headers and body text in slices).
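
The filtering can be sketched as below. It is not the parser's actual code: the block shape (`type` and `text` keys), the single supported `text` type, and the function name are assumptions made for illustration.

```python
# Page chrome types named in the summary above.
SKIPPED_TYPES = {"header", "footer", "page_number"}


def blocks_to_sections(content_list):
    """Convert parsed blocks to text sections, skipping page chrome."""
    sections = []
    for block in content_list:
        block_type = block.get("type")
        if block_type in SKIPPED_TYPES:
            continue
        if block_type == "text":
            sections.append(block.get("text", ""))
        # Any other type is ignored explicitly, so an unknown block can
        # never cause the previous block's text to be emitted again.
    return sections
```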

Closes #15335

## Test plan
- [x] `pytest test/unit_test/deepdoc/parser/test_mineru_parser.py -v`
(4/4 passed)
2026-06-01 20:15:04 +08:00
oktofeesh
f0e4f2d5d8 fix(go-models): apply custom Google base URLs (#15385)
## Summary
- Add custom `base_url` support to the Google Go model driver.
- Preserve Google URL suffix configuration when creating custom base URL
driver instances.
- Validate Google chat/stream request inputs before constructing the SDK
client.
- Cover Google model listing, connection checks, base URL resolution,
and request validation with focused tests.

## What changed
- `GoogleModel.NewInstance` now returns a Google driver configured with
the supplied base URL map.
- Google SDK client creation now resolves configured base URLs through
`genai.HTTPOptions.BaseURL`.
- Base URL lookup supports configured regions, empty-region keys, and
`default` fallback.
- Google chat, streaming chat, embeddings, and model listing now reject
blank API keys before creating SDK clients.
- Google chat and streaming chat now reject blank model names locally,
and streaming chat rejects a nil sender.
- Existing message handling, embeddings, pagination, and provider errors
are preserved.

## Why
Google custom model instances could not use configured base URLs because
`NewInstance` returned `nil` and the SDK client path ignored the driver
base URL map. The request validation keeps invalid Google calls from
reaching SDK client construction with blank credentials or incomplete
chat inputs.
2026-06-01 19:24:29 +08:00