mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-29 23:41:12 +08:00
### What problem does this PR solve?

Markdown extraction currently applies custom delimiters before respecting fenced code blocks. When a delimiter such as a newline is configured, fenced code can be split into separate chunks, and longer outer fences can be closed incorrectly by shorter nested fences.

This PR keeps the fix intentionally narrow for the Markdown chunking discussion in #15482:

- preserve fenced code blocks when delimiter-based extraction is used
- support both backtick and tilde fences
- respect fence length so longer outer fences can contain shorter inner fences
- keep delimiter splitting unchanged outside fenced blocks

Refs #15482

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Testing

- `ruff check deepdoc/parser/markdown_parser.py test/unit_test/deepdoc/parser/test_markdown_parser.py`
- `python3 run_tests.py -t test/unit_test/deepdoc/parser/test_markdown_parser.py`
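
For reviewers who want a concrete picture of the fence rules described in this PR, here is a minimal sketch of fence-aware delimiter splitting. It is an illustration only, not the code in `deepdoc/parser/markdown_parser.py`: the names `FENCE_RE`, `split_segments`, and `split_markdown` are hypothetical, and the fence rules are assumed to follow CommonMark (a closing fence uses the same character as the opener, is at least as long, and carries no info string).

```python
import re

# Hypothetical helper names; this is not RAGFlow's actual implementation.
# A fence line: up to 3 spaces of indent, then 3+ backticks or 3+ tildes.
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def split_segments(text: str) -> list[tuple[bool, str]]:
    """Return (is_code, text) segments, keeping each fenced block whole."""
    segments: list[tuple[bool, str]] = []
    prose: list[str] = []
    code: list[str] = []
    fence_char, fence_len = "", 0  # fence_len == 0 means "outside a fence"

    for line in text.split("\n"):
        m = FENCE_RE.match(line)
        if fence_len == 0:
            # Assumed CommonMark rule: a backtick fence's info string
            # may not itself contain backticks.
            opens = bool(m) and not (m.group(1)[0] == "`" and "`" in m.group(2))
            if opens:
                if prose:
                    segments.append((False, "\n".join(prose)))
                    prose = []
                fence_char, fence_len = m.group(1)[0], len(m.group(1))
                code.append(line)
            else:
                prose.append(line)
        else:
            code.append(line)
            closes = (
                bool(m)
                and m.group(1)[0] == fence_char      # same fence character
                and len(m.group(1)) >= fence_len     # at least as long as opener
                and not m.group(2).strip()           # no info string on a closer
            )
            if closes:
                segments.append((True, "\n".join(code)))
                code, fence_char, fence_len = [], "", 0

    if code:  # an unclosed fence runs to the end of the text
        segments.append((True, "\n".join(code)))
    if prose:
        segments.append((False, "\n".join(prose)))
    return segments


def split_markdown(text: str, delimiter: str = "\n") -> list[str]:
    """Split on a non-empty delimiter outside fences; fenced blocks stay intact."""
    chunks: list[str] = []
    for is_code, segment in split_segments(text):
        if is_code:
            chunks.append(segment)
        else:
            chunks.extend(p for p in segment.split(delimiter) if p.strip())
    return chunks
```

With a newline delimiter, a four-backtick outer fence that wraps a three-backtick inner fence comes back as a single chunk, while the prose lines before and after it are still split one per chunk. Tracking the opener's character and length is what prevents the shorter inner fence from closing the outer one.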