Refactor: migrate pdf_parser.py to golang (#16323)

### What problem does this PR solve?

Adds an HTTP API based on ONNX models.
Migrates pdf_parser.py to Go.

### Type of change

- [x] Refactoring

Author: Jack
Date: 2026-06-25 20:16:16 +08:00
Committed by: GitHub
Parent: c7052f4dd1
Commit: 304d9e02bb
98 changed files with 24591 additions and 8 deletions


@@ -25,7 +25,7 @@ DOC_ENGINE=${DOC_ENGINE:-elasticsearch}
# - `gpu`
DEVICE=${DEVICE:-cpu}
-COMPOSE_PROFILES=${DOC_ENGINE},${DEVICE}
+COMPOSE_PROFILES=${DOC_ENGINE},${DEVICE},deepdoc
# The version of Elasticsearch.
STACK_VERSION=${STACK_VERSION:-8.11.3}
@@ -308,3 +308,13 @@ THREAD_POOL_MAX_WORKERS=128
#Option to disable login form for SSO
DISABLE_PASSWORD_LOGIN=false
+# -----------------------------------------------------------------------------
+# DeepDoc OSS Vision Service
+# -----------------------------------------------------------------------------
+# URL for the deepdoc vision API (DLA, OCR, TSR) served by OSS ONNX models.
+# The `deepdoc` service defined in docker-compose.yml provides this endpoint.
+# When unset, the parser falls back to inline ONNX Runtime inference.
+DEEPDOC_URL=http://deepdoc:9390
+# Docker image for the OSS deepdoc service. CPU-only; uses ONNX Runtime.
+DEEPDOC_IMAGE=deepdoc_oss:latest
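
The `DEEPDOC_URL` comments above describe a fallback: use the remote deepdoc service when the variable is set, otherwise run inline ONNX Runtime inference. The following is a minimal Go sketch of that decision, not the code from this PR. The `visionBackend` type and `resolveVisionBackend` function are hypothetical names introduced for illustration, and the URL validation rules are an assumption about what a reasonable check might look like.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"strings"
)

// visionBackend is a hypothetical type (not from the PR) describing where
// DLA/OCR/TSR inference would run.
type visionBackend struct {
	Remote  bool   // true when a deepdoc HTTP endpoint is configured
	BaseURL string // endpoint without trailing slash; empty when Remote is false
}

// resolveVisionBackend reads DEEPDOC_URL through the supplied lookup function.
// An empty or unset value selects the inline fallback. A non-empty value must
// be an absolute http(s) URL with a host (assumed validation rule).
func resolveVisionBackend(getenv func(string) string) (visionBackend, error) {
	raw := strings.TrimSpace(getenv("DEEPDOC_URL"))
	if raw == "" {
		// Unset: fall back to inline ONNX Runtime inference.
		return visionBackend{}, nil
	}
	u, err := url.Parse(raw)
	if err != nil {
		return visionBackend{}, fmt.Errorf("invalid DEEPDOC_URL %q: %w", raw, err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return visionBackend{}, fmt.Errorf("invalid DEEPDOC_URL %q: want http(s)://host[:port]", raw)
	}
	return visionBackend{Remote: true, BaseURL: strings.TrimRight(raw, "/")}, nil
}

func main() {
	backend, err := resolveVisionBackend(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if backend.Remote {
		fmt.Println("using remote deepdoc service at", backend.BaseURL)
	} else {
		fmt.Println("DEEPDOC_URL unset, using inline ONNX Runtime inference")
	}
}
```

Passing the lookup function as a parameter, rather than calling `os.Getenv` directly, keeps the decision testable without mutating the process environment.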