Files
ragflow/test/unit_test/rag
Harsh Kashyap ebd4f4e633 fix(rag/nlp): handle non-numbered DOCX heading styles (#16219)
## What problem does this PR solve?

DOCX parsing could crash when a paragraph used a `Heading`-prefixed
style without a trailing numeric level, such as `Heading`, `Heading1`,
or `Heading Title`.

`docx_question_level()` assumed every heading style looked like `Heading
N` and called `int(p.style.name.split(" ")[-1])`. For non-numbered
heading styles, that raises `ValueError` and breaks Manual, Q&A, and
Laws chunking.

This PR parses heading levels safely and falls back to level 1 for
Heading-prefixed styles without an explicit numeric suffix.

Closes #16163.

## Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Test Case (non-breaking change which adds test coverage)
2026-06-29 15:21:17 +08:00
..
2026-06-29 09:45:16 +08:00