Mirror of https://github.com/infiniflow/ragflow.git, synced 2026-06-29 15:31:05 +08:00
Doc: two PDF parser optimizers are supported as of v0.25.0. (#14261)
### What problem does this PR solve?

Multi-column layout detection is supported in v0.25.0

### Type of change

- [x] Documentation Update
@@ -31,6 +31,14 @@ The **Parser** component supports parsing the following file types:
 | Audio | MP3, WAV |
 | Video | MP4, AVI, MKV |
 
+### Detect multi-column layout
+
+Optimizes the parser to detect and reorder multi-column pages into a logical sequence. Ideal for PDF documents with two-column or newspaper-style layouts.
+
+### Remove original table of contents
+
+Strips the original table of contents from PDF files. Once enabled, the table of contents is not chunked or parsed for retrieval.
+
 ### PDF parser
 
 The output of a PDF parser is `json`. In the PDF parser, you select the parsing method that works best with your PDFs.
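
To make the "Detect multi-column layout" option above more concrete, here is a minimal sketch of one way multi-column reordering can work. It is not RAGFlow's implementation, which this commit does not show. The `Block` type, the `reorder_two_column` function, and the `full_width_ratio` threshold are hypothetical names chosen for illustration, and the sketch assumes at most two columns and that each text block already carries a bounding box.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text block with a bounding box (hypothetical structure for this sketch)."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def reorder_two_column(blocks, page_width, full_width_ratio=0.6):
    """Return blocks in an assumed reading order for a two-column page.

    Blocks at least `full_width_ratio` of the page wide are treated as
    full-width (titles, footers). They split the page into horizontal bands.
    Inside each band, the left column is read top to bottom, then the right.
    """
    mid = page_width / 2
    ordered, left, right = [], [], []

    def flush():
        # Emit the pending band: whole left column first, then the right one.
        ordered.extend(sorted(left, key=lambda b: b.y0))
        ordered.extend(sorted(right, key=lambda b: b.y0))
        left.clear()
        right.clear()

    for block in sorted(blocks, key=lambda b: b.y0):
        if (block.x1 - block.x0) >= full_width_ratio * page_width:
            flush()                # close the band above the full-width block
            ordered.append(block)
        elif (block.x0 + block.x1) / 2 < mid:
            left.append(block)     # block centre lies in the left half
        else:
            right.append(block)
    flush()
    return ordered
```

A plain top-to-bottom sort would interleave the two columns line by line. Grouping by column inside each band keeps every column's text contiguous, which is the "logical sequence" the option description refers to. Real pages with three or more columns, sidebars, or skewed scans would need a more robust approach than this midline heuristic.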