mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-29 23:41:12 +08:00
## Summary

Fixes #15245: `POST /api/v1/chat/completions` with `stream=true` intermittently returns 500:

```
data:{"code": 500, "message": "failed to encode response: json: unsupported value: NaN (status code: 500)", "data": {...}}
```

…even though "the same question" works on retry.

## Root cause

The streaming path serialized the answer with bare `json.dumps(...)` (`api/apps/restful_apis/chat_api.py:1221`). `json.dumps` defaults to `allow_nan=True` and emits the literal tokens `NaN`, `Infinity`, and `-Infinity` for non-finite float values. That is valid Python-flavored JSON but **invalid per RFC 8259**, so downstream consumers reject it. The reporter's gateway is Go-based and the error wording (`failed to encode response: json: unsupported value: NaN`) is straight from Go's `encoding/json`.

How NaN gets into the payload: retrieval scoring in `rag/nlp/search.py` runs `np.mean(...)` over aggregations that can be empty, and similarity denominators can be zero. Reference chunk fields like `similarity`, `vector_similarity`, and `term_similarity` can therefore be NaN depending on which chunks a given query retrieves, which is exactly why the failure is intermittent for the same question.

The non-streaming branch (`get_json_result(data=answer)`, `chat_api.py:1243`) has the same vulnerability: Quart's `jsonify` also defaults to `allow_nan=True`, and the same retrieval pipeline feeds both branches.

`agent/tools/exesql.py:88-102` already has the same NaN/Inf guard for SQL results. This PR brings the chat completions path up to parity.

## Fix

Add a small `_sanitize_json_floats(obj)` helper near the top of `api/apps/restful_apis/chat_api.py`. It walks `dict` / `list` / `tuple` and replaces any `float` that is `NaN` or `±Infinity` with `None`. Apply it at the two serialization boundaries:

- **Streaming branch** (`stream()`): sanitize the SSE payload before `json.dumps`.
- **Non-streaming branch**: sanitize the `answer` dict before `get_json_result(data=...)`.

The terminal `data:True` frame and the `code:500` error frame carry no scores and are left untouched. Added `import math` to the existing alphabetical import block.

No change to retrieval logic. Replacing NaN with `null` at the serialization boundary is conservative: clients still parse the JSON, and a missing-score chunk is a strictly better failure mode than a 500 that kills the whole reply.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

(1). Deploy RAGFlow services and images
https://ragflow.io/docs/build_docker_image
(2). Configure the required environment for testing
Install Python dependencies (including test dependencies):
uv sync --python 3.13 --only-group test --no-default-groups --frozen
Activate the environment:
source .venv/bin/activate
Install SDK:
uv pip install sdk/python
Modify the .env file by adding the following lines:
COMPOSE_PROFILES=${COMPOSE_PROFILES},tei-cpu
TEI_MODEL=BAAI/bge-small-en-v1.5
RAGFLOW_IMAGE=infiniflow/ragflow:v0.25.6 # Replace with the image you are using
Start the containers (wait two minutes):
docker compose -f docker/docker-compose.yml up -d
(3). Test Elasticsearch
a) Run SDK tests against Elasticsearch:
export HTTP_API_TEST_LEVEL=p2
export HOST_ADDRESS=http://127.0.0.1:9380 # Ensure that this port is the API port mapped to your localhost
pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_sdk_api
b) Run HTTP API tests against Elasticsearch:
pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_http_api
(4). Test Infinity
Modify the .env file:
DOC_ENGINE=${DOC_ENGINE:-infinity}
Restart the containers (this removes the existing volumes):
docker compose -f docker/docker-compose.yml down -v
docker compose -f docker/docker-compose.yml up -d
a) Run SDK tests against Infinity:
DOC_ENGINE=infinity pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_sdk_api
b) Run HTTP API tests against Infinity:
DOC_ENGINE=infinity pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_http_api