ragflow/internal
ZF 97a4c64cc8 fix(harness): truncate text on rune boundary to keep UTF-8 valid (#16511)
### Summary

`truncateText` in the `reduction` and `summarization` middlewares
truncates with `s[:maxLen]`, which slices by byte. When `maxLen` lands
inside a multi-byte character (common with CJK or other non-ASCII
content flowing through the agent), the string is cut mid-rune and the
trailing byte(s) form an incomplete, invalid UTF-8 sequence. That broken
text then goes into the reduced context / summary prompt.
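
A minimal sketch of the failure mode, for illustration only. `byteTruncate` is a hypothetical stand-in for the `s[:maxLen]` pattern described above, not the actual middleware code, and the sample string is an assumption:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteTruncate is a hypothetical stand-in that mirrors the s[:maxLen]
// pattern: it slices by byte offset without checking rune boundaries.
func byteTruncate(s string, maxLen int) string {
	if len(s) <= maxLen {
		return s
	}
	return s[:maxLen]
}

func main() {
	s := "你好世界" // each of these characters is 3 bytes in UTF-8
	cut := byteTruncate(s, 4)

	// Byte 4 falls inside the second character, so the result ends
	// with a lone lead byte and is no longer valid UTF-8.
	fmt.Println(len(cut), utf8.ValidString(cut)) // prints "4 false"
}
```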

`TruncateToolResult` in the same `reduction` package already avoids this
by slicing on a rune boundary and even notes it in a comment. This PR
makes the two `truncateText` helpers do the same, so they stay
consistent with the existing helper.
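
One way to slice on a rune boundary is sketched below. This is an illustration under assumptions, not the code from this PR or from `TruncateToolResult`: the name `truncateOnRuneBoundary` is hypothetical, `maxLen` is assumed to be a byte budget, and the input is assumed to be valid UTF-8 to begin with.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateOnRuneBoundary (hypothetical name) returns at most maxLen bytes
// of s, backing the cut point up so it never lands inside a multi-byte rune.
func truncateOnRuneBoundary(s string, maxLen int) string {
	if maxLen <= 0 {
		return ""
	}
	if len(s) <= maxLen {
		return s
	}
	cut := maxLen
	// s[cut] is the first byte that would be dropped. If it is a
	// continuation byte, the cut is mid-rune, so step back to the
	// byte that starts that rune.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	fmt.Printf("%q\n", truncateOnRuneBoundary("你好世界", 4))    // "你"
	fmt.Printf("%q\n", truncateOnRuneBoundary("hello world", 5)) // "hello"
}
```

The result can be shorter than `maxLen` by up to three bytes, which is the cost of never splitting a character.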

Both functions keep their existing output shape (summarization still
appends `...`). Added a small unit test in each package covering ASCII
truncation and a CJK string, asserting the result stays valid UTF-8.
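
The checks described above could look roughly like the following table-driven sketch. It is not the test added in this PR: `truncateWithEllipsis` and `checkCases` are hypothetical names, and the sketch assumes the `...` suffix is appended after the cut, so it does not count toward `maxLen`.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateWithEllipsis (hypothetical) cuts s on a rune boundary at or
// before maxLen bytes and appends "..." only when something was removed.
func truncateWithEllipsis(s string, maxLen int) string {
	if maxLen < 0 {
		maxLen = 0
	}
	if len(s) <= maxLen {
		return s
	}
	cut := maxLen
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut] + "..."
}

// checkCases covers an ASCII input, a CJK input, and an input that
// needs no truncation, and verifies every result is valid UTF-8.
func checkCases() error {
	cases := []struct {
		name   string
		in     string
		maxLen int
		want   string
	}{
		{"ascii", "hello world", 5, "hello..."},
		{"cjk", "你好世界", 4, "你..."},
		{"short", "ok", 10, "ok"},
	}
	for _, c := range cases {
		got := truncateWithEllipsis(c.in, c.maxLen)
		if got != c.want {
			return fmt.Errorf("%s: got %q, want %q", c.name, got, c.want)
		}
		if !utf8.ValidString(got) {
			return fmt.Errorf("%s: result %q is not valid UTF-8", c.name, got)
		}
	}
	return nil
}

func main() {
	if err := checkCases(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

In a real package the same table would normally live in a `_test.go` file and report failures through `testing.T`.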
2026-07-01 16:45:26 +08:00