### What problem does this PR solve?

Document metadata is completely broken on the OpenSearch backend (`DOC_ENGINE=opensearch`). Both failures were introduced by #14577, which added a doc-metadata dispatch surface but only validated it against Elasticsearch.

**1. Index creation rejected (`mapper_parsing_exception`).** `OSConnection.create_doc_meta_idx` feeds `conf/doc_meta_es_mapping.json` verbatim to OpenSearch. That file declares a top-level `"dynamic": "runtime"`. Runtime fields are Elasticsearch-only; OpenSearch cannot parse the value:

    mapper_parsing_exception: Could not convert [dynamic.dynamic] to boolean (400)

**2. `search()` signature mismatch (`TypeError`).** `DocMetadataService` (added by #14577) calls `docStoreConn.search(...)` with snake_case kwargs (`select_fields=`, `index_names=`, `knowledgebase_ids=`, …), matching `ESConnection.search`. But `OSConnection.search` still uses camelCase parameters (`selectFields`, `indexNames`, `knowledgebaseIds`, …):

    TypeError: OSConnection.search() got an unexpected keyword argument 'select_fields'

The UI then shows "0 fields" for every document on OpenSearch.

### Fix

1. In `OSConnection.create_doc_meta_idx`, normalize a top-level `"dynamic": "runtime"` to `True` **for the OpenSearch request only**. The shared mapping file is left untouched, so the Elasticsearch backend keeps its runtime-field behavior. Dynamic field discovery is preserved on OpenSearch.
2. Rename the `OSConnection.search()` parameters (and their in-method local uses) from camelCase to snake_case so they match `ESConnection.search()` and the `DocMetadataService` call sites. The change is confined to `search()`; `get/insert/update/delete` keep their existing positional signatures (they are called positionally from `rag/nlp/search.py`).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Affected backends

OpenSearch only. Elasticsearch, Infinity and OceanBase are untouched.

### How to reproduce

1. `DOC_ENGINE=opensearch`, restart the stack.
2. Upload/parse a document, then open the dataset's document list / set metadata.

- Before: index creation 400s (`Could not convert [dynamic.dynamic]`), and/or `TypeError ... 'select_fields'`; document metadata shows 0 fields.

### Risk & backward compatibility

- ES default deployment: no change. `doc_meta_es_mapping.json` is not modified, so ES still receives `"dynamic": "runtime"`.
- `search()` rename is internal; the only kwarg caller (`DocMetadataService`) already uses the snake_case names this PR aligns to.

### Test plan

- [ ] `DOC_ENGINE=opensearch`: per-tenant `ragflow_doc_meta_*` index is created (no `mapper_parsing_exception`); document metadata reads/writes work.
- [ ] `DOC_ENGINE=elasticsearch` regression: doc-meta index still created with runtime mapping; metadata unchanged.
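
### Illustrative sketch: normalizing `dynamic` for OpenSearch only

For reviewers, the first fix can be pictured roughly as below. This is a minimal sketch, not the code in this PR: the helper name `normalize_dynamic_for_opensearch` is hypothetical, and it assumes the mapping is already loaded as a Python dict with `"dynamic"` at its top level. The point it illustrates is that the shared mapping object is copied, never edited in place, so whatever Elasticsearch reads still says `"runtime"`.

```python
import copy


def normalize_dynamic_for_opensearch(mapping: dict) -> dict:
    """Return a copy of `mapping` with a top-level "dynamic": "runtime" set to True.

    Hypothetical helper for illustration only; the real change lives in
    OSConnection.create_doc_meta_idx and may be shaped differently.
    """
    normalized = copy.deepcopy(mapping)  # leave the shared mapping untouched
    if normalized.get("dynamic") == "runtime":
        # "runtime" is Elasticsearch-only; True keeps dynamic field discovery.
        normalized["dynamic"] = True
    return normalized


# Assumed shape of the shared mapping, reduced to the relevant key.
shared_mapping = {"dynamic": "runtime", "properties": {"id": {"type": "keyword"}}}
opensearch_mapping = normalize_dynamic_for_opensearch(shared_mapping)
```

After the call, `opensearch_mapping["dynamic"]` is `True` while `shared_mapping["dynamic"]` is still `"runtime"`, which is the property the ES regression item in the test plan relies on.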
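
### Illustrative sketch: why the kwarg names matter

The second failure is an ordinary Python keyword mismatch, which the following sketch reproduces without any RAGFlow code. `search_camel` and `search_snake` are hypothetical stand-ins reduced to the three parameters named above; the real `search()` methods take more arguments. `unexpected_kwargs` is likewise an assumed helper, built on the standard `inspect.signature`, that lists the keywords a callable would reject.

```python
import inspect


def unexpected_kwargs(func, kwargs):
    """Return the keyword names in `kwargs` that `func` does not accept."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any keyword, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return set()
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {name for name, p in params.items() if p.kind in keyword_kinds}
    return set(kwargs) - accepted


# Hypothetical stand-in for the pre-fix camelCase signature.
def search_camel(selectFields, indexNames, knowledgebaseIds):
    return selectFields, indexNames, knowledgebaseIds


# Hypothetical stand-in for the post-fix snake_case signature.
def search_snake(select_fields, index_names, knowledgebase_ids):
    return select_fields, index_names, knowledgebase_ids


# The caller passes snake_case keywords, as DocMetadataService is described to do.
call = {"select_fields": ["id"], "index_names": ["idx"], "knowledgebase_ids": ["kb"]}
rejected_before = unexpected_kwargs(search_camel, call)
rejected_after = unexpected_kwargs(search_snake, call)
```

Here `rejected_before` contains all three snake_case names and `rejected_after` is empty. A check of this kind could in principle be run across backends to catch signature drift, though this PR does not add one.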