ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 23:41:12 +08:00

euvre 1e80419c21 fix: restore TitleChunker output for json/chunks upstream formats (#15396)

## Summary

The refactor commit e194027b (#14247) introduced two regressions that
caused `TitleChunker` to produce zero chunks when the upstream Parser
node outputs `json` or `chunks` format (e.g. PDF parsing).

## Root Cause

### 1. Dead code in `extract_line_records` (critical)

After the refactor, when `payload` is `None` (as it is for the `json`
and `chunks` output formats), the method returns an empty list
immediately via `return []`, so no records are ever extracted from
structured upstream output. The original `json`/`chunks` handling code
that follows the early return became unreachable.
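
The failure mode described above can be illustrated with a minimal sketch. This is not the actual RAGFlow code: the function bodies, the `items` parameter, and the record shape (`{"text": ...}`) are assumptions made for illustration only.

```python
from typing import Any, Optional


def extract_line_records_buggy(payload: Optional[str], items: list) -> list:
    """Hypothetical pre-fix shape: the early return hides the structured branch."""
    if payload is None:
        return []  # premature return: json/chunks input always ends here
        # Unreachable: structured upstream items are never read.
        return [{"text": it["text"]} for it in items if it.get("text")]
    # Plain-text path (markdown/text/html): one record per non-empty line.
    return [{"text": line} for line in payload.splitlines() if line.strip()]


def extract_line_records_fixed(payload: Optional[str], items: list) -> list:
    """Hypothetical post-fix shape: the structured branch is reachable again."""
    if payload is None:
        return [{"text": it["text"]} for it in items if it.get("text")]
    return [{"text": line} for line in payload.splitlines() if line.strip()]


# Structured upstream output, e.g. from a PDF parser (illustrative data).
upstream_items: list[dict[str, Any]] = [
    {"text": "1 Intro"},
    {"text": ""},
    {"text": "Body"},
]
```

With `payload=None`, the buggy variant returns an empty list regardless of what `upstream_items` contains, which is consistent with the "zero chunks" symptom; the fixed variant returns the two non-empty records.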

### 2. Unconditional overwrite in `build_chunks_from_record_groups`

The `chunks` variable assigned in the `if` branch for markdown/text/html
formats was unconditionally overwritten by the statement below it, due
to a missing `else` keyword.
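
A minimal sketch of the missing-`else` pattern follows. The format names come from the description above, but the function signatures, the `TEXT_FORMATS` constant, and the two chunk shapes (plain strings versus dicts) are hypothetical and chosen only to make the overwrite visible.

```python
# Assumed grouping of the text-like formats named in this description.
TEXT_FORMATS = {"markdown", "text", "html"}


def build_chunks_buggy(fmt: str, groups: list) -> list:
    """Hypothetical pre-fix shape: the second assignment always runs."""
    if fmt in TEXT_FORMATS:
        chunks = ["\n".join(g) for g in groups]
    # Missing `else`: this overwrites the result of the branch above.
    chunks = [{"text": "\n".join(g)} for g in groups]
    return chunks


def build_chunks_fixed(fmt: str, groups: list) -> list:
    """Hypothetical post-fix shape: each format family has its own branch."""
    if fmt in TEXT_FORMATS:
        chunks = ["\n".join(g) for g in groups]
    else:
        chunks = [{"text": "\n".join(g)} for g in groups]
    return chunks


record_groups = [["a", "b"], ["c"]]
```

In the buggy variant, a `markdown` input yields the dict-shaped chunks meant for the structured formats; in the fixed variant the two families produce their own outputs independently.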

## Fix

- Remove the premature `return []` so the `json`/`chunks` branch is
reachable again.
- Add an `else` branch in `build_chunks_from_record_groups` so the two
format families are handled independently.

## Test Plan

- [x] Verified no lint errors on the changed file
- [ ] Tested with a PDF document parsed via DeepDOC → TitleChunker
pipeline
- [ ] Tested with markdown input through TitleChunker
- [ ] Tested hierarchy and group chunking modes

## Impact

- Fixes the regression where documents parsed with `json`/`chunks`
output format produced no chunks from `TitleChunker`.
- No API or configuration changes. Fully backward compatible.

Signed-off-by: noob <yixiao121314@outlook.com>

2026-06-01 17:14:22 +08:00

chunker

fix: restore TitleChunker output for json/chunks upstream formats (#15396)

2026-06-01 17:14:22 +08:00

extractor

Fix: tokenizer issue. (#11902)

2025-12-11 17:38:17 +08:00

parser

Feat: tenant llm provider (#14595)

2026-05-29 17:39:41 +08:00

tests

Feat: Refact pipeline (#13826)