---
name: pdf-extract
description: "Extract text from PDF files for LLM processing."
metadata:
  {
    "openclaw":
      {
        "emoji": "📄",
        "requires": { "bins": ["pdftotext"] },
        "install":
          [
            {
              "id": "dnf",
              "kind": "dnf",
              "package": "poppler-utils",
              "bins": ["pdftotext"],
              "label": "Install via dnf",
            },
          ],
      },
  }
---

# PDF Extract

Extract text from PDF files for LLM processing. Uses `pdftotext` from the poppler-utils package to convert PDF documents into plain text.

## Commands

```bash
# Extract all text from a PDF
pdf-extract "document.pdf"

# Extract text from specific pages
pdf-extract "document.pdf" --pages 1-5
```
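
The text above says the skill relies on `pdftotext`, but it does not show how `pdf-extract` maps its arguments onto it. The following is a minimal sketch of one way a wrapper could do so, assuming `--pages A-B` translates to the `-f` (first page) and `-l` (last page) options of `pdftotext` and that the text is written to stdout. The function names `build_pdftotext_args` and `pdf_extract_sketch` are hypothetical and are not part of the skill.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; not the actual pdf-extract implementation.

# Print the pdftotext arguments, one per line, for a PDF path and options.
build_pdftotext_args() {
  if [ "$#" -lt 1 ]; then
    echo "usage: pdf-extract FILE [--pages FIRST-LAST]" >&2
    return 2
  fi
  local file="$1"
  shift
  local args=()
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --pages)
        if [ "$#" -lt 2 ]; then
          echo "--pages needs a value such as 1-5" >&2
          return 2
        fi
        # "1-5" gives first page 1 and last page 5; a bare "3" gives 3 and 3.
        args+=(-f "${2%-*}" -l "${2#*-}")
        shift 2
        ;;
      *)
        echo "unknown option: $1" >&2
        return 2
        ;;
    esac
  done
  # "-" as the output file tells pdftotext to write to stdout.
  args+=("$file" -)
  printf '%s\n' "${args[@]}"
}

# Run pdftotext with the arguments built above.
# Assumption: file names do not contain newlines.
pdf_extract_sketch() {
  local out line
  local argv=()
  out="$(build_pdftotext_args "$@")" || return
  while IFS= read -r line; do
    argv+=("$line")
  done <<< "$out"
  pdftotext "${argv[@]}"
}
```

Under these assumptions, the two commands above would correspond roughly to `pdftotext document.pdf -` and `pdftotext -f 1 -l 5 document.pdf -`.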

## Install

```bash
sudo dnf install poppler-utils
```
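
The frontmatter lists `pdftotext` as a required binary, so it can be useful to confirm that it is on `PATH` after installing. A small check along these lines would do; the helper name `have_bin` is hypothetical.

```bash
# Return success if the named binary can be found on PATH.
have_bin() {
  command -v "$1" >/dev/null 2>&1
}

if have_bin pdftotext; then
  echo "pdftotext found: $(command -v pdftotext)"
else
  echo "pdftotext not found; install poppler-utils first" >&2
fi
```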