Files
ragflow/test/unit_test
Rene Arredondo e1403171f1 fix(chat): sanitize NaN/Inf scores before serializing chat completions (#15245) (#15266)
## Summary

Fixes #15245 — `POST /api/v1/chat/completions` with `stream=true`
intermittently returns 500:

```
data:{"code": 500, "message": "failed to encode response: json:
unsupported value: NaN (status code: 500)", "data": {...}}
```

…even though "the same question" works on retry.

## Root cause

The streaming path serialized the answer with bare `json.dumps(...)`
(`api/apps/restful_apis/chat_api.py:1221`). `json.dumps` defaults to
`allow_nan=True` and emits the literal token `NaN` for NaN /
Infinity float values. That is valid Python-flavored JSON but
**invalid per RFC 8259**, so downstream consumers reject it. The
reporter's gateway is Go-based and the error wording
(`failed to encode response: json: unsupported value: NaN`) is
straight from Go's `encoding/json`.

How NaN gets into the payload: retrieval scoring in
`rag/nlp/search.py` runs `np.mean(...)` over aggregations that can
be empty, and similarity denominators can be zero. Reference chunk
fields like `similarity`, `vector_similarity`, `term_similarity`
can therefore be NaN depending on which chunks a given query
retrieves — which is exactly why the failure is intermittent for
the same question.

The non-streaming branch (`get_json_result(data=answer)`,
`chat_api.py:1243`) has the same vulnerability — Quart's `jsonify`
also defaults to `allow_nan=True` and the same retrieval pipeline
feeds both branches.

`agent/tools/exesql.py:88-102` already has the same NaN/Inf guard
for SQL results. This PR brings the chat completions path up to
parity.

## Fix

Add a small `_sanitize_json_floats(obj)` helper near the top of
`api/apps/restful_apis/chat_api.py`. It walks `dict` / `list` /
`tuple` and replaces any `float` that is `NaN` or `±Infinity` with
`None`. Apply it at the two serialization boundaries:

- **Streaming branch** (`stream()`): sanitize the SSE payload before
  `json.dumps`.
- **Non-streaming branch**: sanitize the `answer` dict before
  `get_json_result(data=...)`.

The terminal `data:True` frame and the `code:500` error frame carry
no scores and are left untouched.

Added `import math` to the existing alphabetical import block.

No change to retrieval logic — replacing NaN with `null` at the
serialization boundary is conservative: clients still parse the
JSON, a missing-score chunk is a strictly better failure mode than
a 500 that kills the whole reply.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-02 10:08:34 +08:00
..