ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 23:41:12 +08:00

Author	SHA1	Message	Date
Jin Hai	139f4515e8	Go: refactor CLI (#15898 ) ### What problem does this PR solve? 1. remove unused code 2. fix login issue ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-10 16:06:30 +08:00
Hz_	38755c705a	feat(go): Add DeepSeek models and Gitee alias metadata tests (#15885 ) This PR expands conf/all_models.json with DeepSeek model entries and provider aliases. Changes: - Added DeepSeek model entries across `V4`, `V3.2`, `V3.1`, `V3`, `R1`, `Coder`, `Math`, `VL`, `OCR`, `Prover`, `MoE`, and `LLM` series. - Normalized model name values to lowercase canonical IDs. - Added alias values for official DeepSeek/Hugging Face names and provider-specific names from OpenRouter, VolcEngine, SiliconFlow, HuaweiCloud, and QiniuCloud. - Preserved model metadata such as max_tokens, model_types, and thinking where applicable. - Added Gitee ListModels tests to verify DeepSeek aliases map back to model metadata from all_models.json. - Added an optional Gitee integration test gated by GITEE_LIST_MODELS_INTEGRATION=1. Test: /usr/local/go/bin/go clean -cache /usr/local/go/bin/go test ./internal/entity/models -run 'TestGiteeListModels(MapsAllDeepSeekAliasesToModelMetadata\|KeepsOwnedBySuffixAfterAliasMetadataLookup\| Integration)'	2026-06-10 13:59:23 +08:00
oktofeesh	bbc1f2ecec	feat(go-api): add RAG retrieval to chat completions (#15739 ) ## Summary - Add knowledge-base retrieval support to Go chat completions. ## What changed - Routes KB-backed chat sessions through the Go retrieval service instead of falling back to solo chat. - Resolves embedding and rerank models, validates accessible knowledge bases, and preserves tenant-aware retrieval. - Rejects mixed embedding models across selected knowledge bases before retrieval to avoid incompatible vector dimensions. - Threads the HTTP request context into streaming retrieval so cancelled requests can stop downstream retrieval work. - Applies metadata filters and message-level `doc_ids` before retrieval. - Expands parent/child chunks before building references and prompt context. - Injects retrieved knowledge through a copied dialog prompt config so the caller's original dialog is not mutated. - Honors configured empty responses when no chunks are found. - Names the metadata no-match sentinel and reuses it across retrieval/handler paths. - Adds a defensive content cast while appending streamed answers. - Adds focused unit coverage for retrieval, metadata filtering, authorization, multimodal messages, references, empty-response behavior, prompt immutability, and mixed embedding models. --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 11:07:45 +08:00
Jin Hai	7c1bd9a5a5	Go CLI: switch to admin/api server (#15861 ) ### What problem does this PR solve? ``` RAGFlow(api/default)> use admin SUCCESS RAGFlow(api/default)> use api 'abc'; SUCCESS ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-10 10:57:00 +08:00
Haruko386	d56aeb2f5d	feat[Go]: api datasets/<dataset_id>/documents/<document_id>/metadata/… (#15846 ) ### What problem does this PR solve? As title ``` /api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config PUT ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-10 09:57:11 +08:00
Haruko386	a396b1ace2	feat[Go]: implement /api/v1/agents/<agent_id> and test_db_connection (#15771 ) ### What problem does this PR solve? Add two APIs in Go: ``` /api/v1/agents/test_db_connection POST /api/v1/agents/<agent_id>/sessions DELETE ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-10 09:54:07 +08:00
Jack	87b8062df4	feat: implement POST /api/v1/searchbots/ask — streaming RAG with citations and think-tag processing (#15825 ) Implements POST /api/v1/searchbots/ask in Go with streaming SSE, citations, and think-tag processing. 23 files, 90+ unit tests. --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 22:48:50 +08:00
Jack	2f99d52fb5	fix(ci): re-enable Go tests and fix compilation errors after ListModels signature change (#15862 ) ## Summary This PR re-enables the Go test steps in CI that were previously commented out, and fixes all compilation errors that have accumulated in `internal/entity/models/` since the `ListModels` return type was changed from `[]string` to `[]ListModelResponse`. ## Changes ### CI (`.github/workflows/tests.yml`) - Re-enable Prepare test resources step (clones resource repo with WordNet data) - Re-enable Test Go packages step (runs `go test ./internal/...`) - Fix resource path race condition by using `/tmp/resource-${GITHUB_RUN_ID}` instead of `/tmp/resource` - Exclude `/cli` package from Go tests (contains `main` redeclarations) ### Test fixes (16 model provider test files) All errors were caused by the upstream change from `[]string` to `[]ListModelResponse` in the `ListModels` interface: - Add `joinModelNames` test helper to extract `.Name` from `[]ListModelResponse` slices - `strings.Join(models, ",")` → `joinModelNames(models, ",")` (11 files) - `ids[i] != "..."` → `ids[i].Name != "..."` (cometapi, mistral) - `got[i] != want[i]` → `got[i].Name != want[i]` (bedrock) - `[]string` return types → `[]ListModelResponse` (google) ### Pre-existing bugs in model_test.go Bugs introduced by the upstream `entity/` → `entity/models/` directory rename: - Add missing `pm := GetProviderManager()` calls in 3 test functions - Fix `InitProviderManager` signature (`_, err :=` → `err :=`) - Fix `MaxTokens` `*int` dereference (6 comparisons) - Fix `readProviderConfig` relative path (3 levels up instead of 2) ### model.go - Add `findRepoRoot()` to make `conf/all_models.json` resolution work from any CWD, fixing `TestSiliconFlowProviderConfigLoadsLatestProModels` ### Test validation ```bash go build ./internal/... # ✅ go test ./internal/entity/models/... 
-count=1 # ✅ all pass ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 21:12:15 +08:00
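A minimal sketch of the `joinModelNames` test helper described in the commit above, assuming `ListModelResponse` carries a `Name` string field. Only `.Name` is confirmed by the commit message; the real struct likely has more fields, and the real helper may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// ListModelResponse mirrors only the Name field named in the commit message.
// Any other fields of the real type are unknown and omitted here.
type ListModelResponse struct {
	Name string
}

// joinModelNames extracts each Name and joins them with sep, so assertions
// written against the old []string return type keep their shape.
func joinModelNames(models []ListModelResponse, sep string) string {
	names := make([]string, 0, len(models))
	for _, m := range models {
		names = append(names, m.Name)
	}
	return strings.Join(names, sep)
}

func main() {
	models := []ListModelResponse{{Name: "deepseek-v4-flash"}, {Name: "deepseek-v4-pro"}}
	fmt.Println(joinModelNames(models, ","))
}
```

With a helper of this shape, a test that used to call `strings.Join(models, ",")` only needs the function name swapped.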
ghost	64b860f771	fix(elasticsearch): complete Go result functions (#15148 ) ## Summary - Complete the Go Elasticsearch result functions that remained stubbed after #15160. - Add focused unit coverage for field mapping, aggregation, IDs, and highlighting behavior. - Update a stale query-builder test type import discovered during validation. ## What changed - Keep the Elasticsearch Go implementation merged in #15160 and fill in `GetFields`, `GetAggregation`, `GetHighlight`, and `GetDocIDs` in `internal/engine/elasticsearch/chunk.go`. - Add regression and invariant coverage in `internal/engine/elasticsearch/chunk_helpers_test.go`. - Update `internal/service/nlp/query_builder_test.go` to use the current `types.MatchTextExpr` type. ## Why - #15160 implemented the main Go Elasticsearch surface, but retrieval/tag flows still call result functions that returned stubs. - Completing these functions keeps Elasticsearch result processing aligned with the expected document-engine behavior for field extraction, tag aggregation, doc ID extraction, and snippet highlighting. ## Validation - `go test ./internal/engine/elasticsearch` - `GOARCH=arm64 CGO_ENABLED=1 go test ./internal/service/nlp -run TestQueryBuilder` - `git diff --check` - CodeRabbit review reported 0 issues after follow-up fixes. - Codex Security diff scan found no reportable issues. ## Notes - This PR is now a follow-up to #15160 rather than a competing implementation. - A full local `go test ./internal/service/nlp` run is blocked by local WordNet resource prerequisites; the query-builder tests touched by this PR pass with the arm64 CGO path.	2026-06-09 20:10:11 +08:00
JPette1783	acae932938	fix(go): guard four nil-pointer dereferences causing runtime panics (#15815 ) ### What problem does this PR solve? Fixes four Go paths that dereference a pointer with no prior nil check, each causing a runtime panic. Closes #15814. \| # \| File \| Bug \| Fix \| \|---\|------\|-----\|-----\| \| 1 \| `internal/entity/models/deepseek.go` \| streaming path runs `switch chatModelConfig.Effort` inside `if Thinking`; panics when `Thinking=true` and `Effort==nil` \| nil-check with default `"high"`, matching the non-streaming path in the same file \| \| 2 \| `internal/entity/models/volcengine.go` \| identical oversight: `switch modelConfig.Effort` with no guard \| nil-check with default `"medium"`, matching its non-streaming path \| \| 3 \| `internal/handler/auth.go` \| `AuthMiddleware` does `if user.IsSuperuser`; panics on every authenticated request when the DB column is `NULL` \| guard with `user.IsSuperuser != nil &&`, matching every other call site (`admin/handler.go`, `admin/service.go`, `user.go`) \| \| 4 \| `internal/service/heartbeat_sender.go` \| `responseBody["code"].(float64)` panics on any non-200 response lacking a numeric `code`; the upstream `recover()` calls `Fatal()` → `os.Exit(1)`, taking down the whole server \| comma-ok assertion (`code, ok := ...`); return an error instead of panicking \| - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-09 19:29:25 +08:00
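The three guard patterns named in the commit above (nil-check before dereferencing a pointer field, a default for a nil `Effort`, and a comma-ok type assertion) can be sketched as follows. The `User` type and all function names here are hypothetical stand-ins, not the project's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical stand-in for a row whose nullable column is a *bool.
type User struct {
	IsSuperuser *bool
}

// isSuperuser checks the pointer before dereferencing, so a NULL column
// yields false instead of a panic.
func isSuperuser(u *User) bool {
	return u != nil && u.IsSuperuser != nil && *u.IsSuperuser
}

// effortOrDefault returns def when no effort was configured.
func effortOrDefault(effort *string, def string) string {
	if effort == nil {
		return def
	}
	return *effort
}

// responseCode uses a comma-ok assertion: a missing or non-numeric "code"
// becomes an error rather than a runtime panic.
func responseCode(body map[string]interface{}) (int, error) {
	code, ok := body["code"].(float64)
	if !ok {
		return 0, errors.New(`response has no numeric "code" field`)
	}
	return int(code), nil
}

func main() {
	fmt.Println(isSuperuser(&User{}))
	fmt.Println(effortOrDefault(nil, "high"))
	_, err := responseCode(map[string]interface{}{"message": "bad gateway"})
	fmt.Println(err)
}
```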
Hz_	d4fe3bb148	feat(go-api): Add GET dataset metadata summary API (#15843 ) ## What Adds the RESTful dataset metadata summary endpoint: `GET /api/v1/datasets/{dataset_id}/metadata/summary` The endpoint supports optional document filtering through: `?doc_ids=doc_id_1,doc_id_2`	2026-06-09 19:27:47 +08:00
JPette1783	e050f1816e	fix(models): guard unsafe index access in Google and Ollama drivers (#15819 ) ### What problem does this PR solve? Fixes four panic / spurious-error paths in the Go model layer. Closes #15818. \| # \| File \| Bug \| Fix \| \|---\|------\|-----\|-----\| \| 1 \| \| Thinking-mode streaming path: accessed unconditionally; Gemini emits usage-only chunks with an empty slice, causing a runtime panic \| Guard each step: , , before indexing \| \| 2 \| \| is a plain for ordinary requests; the cast to silently returns , then panics immediately \| Switch on concrete type; handle both and \| \| 3 \| \| Identical panic on the streaming path \| Same switch-on-type fix \| \| 4 \| \| The field is optional (absent for non-thinking models) but the code returned an error when it was missing, breaking every ordinary Ollama completion \| Change to a silent comma-ok assertion; is empty string when the field is absent \| ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 19:26:52 +08:00
Jin Hai	719ce15c95	Go CLI: update list supported models (#15845 ) ### What problem does this PR solve? The `list supported models` command now shows more info. ``` RAGFlow(api/default)> list supported models from 'gitee' 'test'; +-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+ \| dimension \| max_tokens \| model_types \| name \| thinking \| +-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+ \| \| \| \| Wan2.7 \| \| \| \| \| \| HappyHorse-1.0 \| \| \| \| \| \| Qwen3.6-27B@Qwen \| \| \| \| \| \| Qwen3.6-35B-A3B@Qwen \| \| \| \| 1048576 \| [chat] \| DeepSeek-V4-Flash@deepseek-ai \| map[clear_thinking:true default_value:true] \| \| \| 1048576 \| [chat] \| DeepSeek-V4-Pro@deepseek-ai \| map[clear_thinking:true default_value:true] \| +-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-09 19:01:00 +08:00
Hz_	d1c436b804	feat(api): implement `GET /api/v1/agents/prompts` endpoint in Go (#15748 ) ### Description This PR ports the `GET /api/v1/agents/prompts` endpoint from the Python backend to the Go backend. ### Changes Made - Handler: Added `GetPrompts` method to `internal/handler/agent.go`. - Router: Registered the `agents.GET("/prompts")` route in `internal/router/router.go`. - Logic: Leveraged the existing `service.LoadPrompt` utility to read `analyze_task_system`, `analyze_task_user`, `next_step`, `reflect`, and `citation_prompt` templates directly from the `rag/prompts` directory. - Unit Test: Added `TestGetPrompts_Success` to `internal/handler/agent_test.go` to mock the HTTP context and validate the JSON response structure. ### Motivation This is part of the ongoing effort to port the Agent API surface to Go. Since this specific endpoint only serves static prompt templates and does not require the complex DAG/Canvas execution engine, it can be seamlessly and safely handled by the Go backend. ### Testing - [x] Added automated unit test `TestGetPrompts_Success` (verified passing). - [x] Tested locally via `curl` against the Go server (port 9380) and Python server (port 9384). - [x] Verified that the Go JSON response structure and loaded prompt text are logically 100% identical to the Python implementation.	2026-06-09 17:03:42 +08:00
Yingfeng	01a2a44766	Clean CLI for filesystem (#15838 ) ### Type of change - [x] Refactoring	2026-06-09 17:00:10 +08:00
Jin Hai	d02eb6b596	Go: refactor CLI (#15728 ) ### What problem does this PR solve? ``` RAGFlow(user)> add api server 'ccc' host '127.0.0.1:9980'; SUCCESS RAGFlow(user)> list api server; +------------+---------------+-----------------+---------+-------------+---------------+ \| api_server \| api_server_ip \| api_server_port \| auth \| user_name \| user_password \| +------------+---------------+-----------------+---------+-------------+---------------+ \| ccc \| 127.0.0.1 \| 9980 \| no auth \| \| \| \| default \| 127.0.0.1 \| 9384 \| login \| aaa@aaa.com \| * \| +------------+---------------+-----------------+---------+-------------+---------------+ RAGFlow(user)> delete api server 'ccc'; SUCCESS RAGFlow(user)> list api server; +------------+---------------+-----------------+---------+ \| api_server \| api_server_ip \| api_server_port \| auth \| +------------+---------------+-----------------+---------+ \| default \| 127.0.0.1 \| 9384 \| no auth \| +------------+---------------+-----------------+---------+ RAGFlow(user)> show admin server; +--------------+-------+ \| field \| value \| +--------------+-------+ \| admin_server \| N/A \| +--------------+-------+ RAGFlow(user)> add admin server host '127.0.0.1:9880'; SUCCESS RAGFlow(user)> show admin server; +-------------------+-----------+ \| field \| value \| +-------------------+-----------+ \| admin_server_ip \| 127.0.0.1 \| \| admin_server_port \| 9880 \| \| auth \| no auth \| +-------------------+-----------+ RAGFlow(user)> delete admin server; SUCCESS RAGFlow(user)> show admin server; +--------------+-------+ \| field \| value \| +--------------+-------+ \| admin_server \| N/A \| +--------------+-------+ RAGFlow(user)> show current +-----------------+-------------+ \| field \| value \| +-----------------+-------------+ \| api_server_port \| 9384 \| \| user_name \| aaa@aaa.com \| \| user_password \| * \| \| mode \| api \| \| verbose \| false \| \| api_server \| default \| \| api_server_ip \| 127.0.0.1 \| \| auth 
\| login \| \| output \| table \| \| interactive \| true \| +-----------------+-------------+ ``` ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-09 15:22:50 +08:00
Jin Hai	55abf4f565	Go: new CLI command, list all models and show model (#15786 ) ### What problem does this PR solve? ``` RAGFlow(user)> list models; +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| alias \| max_tokens \| model_types \| name \| thinking \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| \| 1048576 \| [chat] \| deepseek-v4-flash \| map[clear_thinking:true default_value:true] \| \| \| 1048576 \| [chat] \| deepseek-v4-pro \| map[clear_thinking:true default_value:true] \| \| \| 1024000 \| [chat] \| minimax-m3 \| map[clear_thinking:true default_value:true] \| \| \| 64000 \| [vision] \| glm-4.5v \| map[clear_thinking:true default_value:true] \| \| [baai/bge-m3] \| 8192 \| [embedding] \| bge-m3 \| \| \| [baai/bge-reranker-v2-m3] \| 1024 \| [rerank] \| bge-reranker-v2-m3 \| \| \| \| \| [tts] \| step-audio-tts-3b \| \| \| [qwen/qwen3-asr-1.7b] \| \| [asr] \| qwen3-asr-1.7b \| \| \| [paddleocr-vl-1.5] \| \| [ocr] \| paddleocr-vl-0.9b \| \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ RAGFlow(user)> show model 'minimax-m3'; +--------------+---------------------------------------------+ \| field \| value \| +--------------+---------------------------------------------+ \| name \| minimax-m3 \| \| max_tokens \| 1024000 \| \| model_types \| [chat] \| \| thinking \| map[clear_thinking:true default_value:true] \| \| class \| \| \| alias \| \| \| ModelTypeMap \| \| +--------------+---------------------------------------------+ RAGFlow(user)> show model 'baai/bge-m3'; +--------------+---------------+ \| field \| value \| +--------------+---------------+ \| model_types \| [embedding] \| \| thinking \| \| \| class \| \| \| alias \| [baai/bge-m3] \| \| ModelTypeMap \| \| \| name \| bge-m3 \| \| max_tokens \| 8192 \| 
+--------------+---------------+ ``` --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-08 21:38:15 +08:00
Jack	35527f6755	fix: guard http.DefaultTransport type assertion in xiaomi for Go 1.25 (#15787 ) ## Problem `TestXiaomiNewModelWithCustomDefaultTransport` panics on Go 1.25: ``` panic: interface conversion: http.RoundTripper is models.roundTripperFunc, not http.Transport ``` In Go 1.25, `http.DefaultTransport` is no longer `http.Transport`, so the unchecked type assertion in `NewXiaomiModel` panics when the test replaces it with a `roundTripperFunc`. ## Fix Use a safe type assertion with fallback to a new `http.Transport`, matching the pattern already used in `modelscope.go`. ## Verification ```bash go test -run TestXiaomiNewModelWithCustomDefaultTransport ./internal/entity/models/... # PASS ``` Internal contributors only. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:11:21 +08:00
Jack	338fdb65fb	feat(ci): enable go test in CI pipeline (#15750 ) ## What problem does this PR solve? Go test files are never compiled in CI — only production binaries via `go build`. This allowed a missing `"sort"` import in `metadata_filter_test.go` to be merged without detection. ## Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) ## Changes - Add `go test -count=1 ./internal/...` step after Go build in CI workflow - Fix missing `"sort"` import in `metadata_filter_test.go` (pre-existing compile error) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 20:06:57 +08:00
oktofeesh	6fc3955cab	fix(go-models): normalize Qwen reasoning families (#15735 ) ## Summary Normalizes Qwen model-family names before reasoning extraction so provider-prefixed Qwen models use the existing `<think>...</think>` fallback.	2026-06-08 19:32:19 +08:00
oktofeesh	e0dc7af5dd	fix(go-models): fix MiniMax driver requests (#15527 ) ## Summary - keep MiniMax chat calls in non-streaming mode and streaming calls in SSE mode - make MiniMax model listing and connection checks use a bodyless GET /v1/models - add focused MiniMax request/response regression tests	2026-06-08 19:32:01 +08:00
oktofeesh	25df0a6725	fix(go-models): validate URL suffix config keys (#15734 ) ## Summary Fixes typoed model-provider URL suffix keys and adds strict nested decoding so future URL suffix config mistakes fail during provider loading instead of being silently ignored.	2026-06-08 19:29:36 +08:00
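One way to make a typoed nested config key fail at load time is `json.Decoder.DisallowUnknownFields`, sketched below. The struct layout and key names (`url_suffix`, `chat`, `models`) are illustrative assumptions, not the project's actual schema, and the PR may implement strict decoding differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// URLSuffix and ProviderConfig are hypothetical shapes for illustration only.
type URLSuffix struct {
	Chat   string `json:"chat"`
	Models string `json:"models"`
}

type ProviderConfig struct {
	Name      string    `json:"name"`
	URLSuffix URLSuffix `json:"url_suffix"`
}

// decodeStrict rejects any key, at any nesting level, that has no matching
// struct field, instead of silently dropping it.
func decodeStrict(data []byte) (ProviderConfig, error) {
	var cfg ProviderConfig
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	err := dec.Decode(&cfg)
	return cfg, err
}

func main() {
	// "modles" is a deliberate typo of "models".
	_, err := decodeStrict([]byte(`{"name":"demo","url_suffix":{"chat":"/v1/chat","modles":"/v1/models"}}`))
	fmt.Println(err)
}
```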
Haruko386	8dc7f1d95e	Go: implement ASR and TTS for xiaomi (#15765 ) ### What problem does this PR solve? Verified from CLI ``` RAGFlow(user)> chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: Hello! I'm MiMo-v2.5, a large language model developed by Xiaomi's LLM Core Team. You can think of me as a friendly AI assistant ready to help you answer questions, have conversations, or work on creative tasks. My context window can handle up to 1 million tokens, so we can dive into pretty long discussions or documents if you'd like. What can I help you with today? Time: 3.831830 RAGFlow(user)> stream chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: there! I'm MiMo-v2.5, an AI assistant created by the Xiaomi LLM Core Team. I'm here to chat, help out, answer questions, or just have a friendly conversation. Think of me as a helpful buddy with a pretty big memory (1 million tokens worth!). What can I do for you today?😊 Time: 2.421630 RAGFlow(user)> think chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Thinking: The user is asking a simple question about who I am. According to my system prompt, I should: - Identify myself as MiMo-v2.5 - State that I was developed by the Xiaomi LLM Core Team - Answer in first person and be warm and conversational Answer: Hey there! 👋 I'm MiMo, an AI assistant created by the Xiaomi LLM Core Team. Think of me as a friendly chat buddy who's here to help you with all sorts of questions and tasks! I love having conversations, answering questions, brainstorming ideas, and helping people figure things out. Whether you want to chat, need help with something specific, or just want to explore ideas together — I'm here for it! 😊 What can I help you with today? Time: 6.651589 RAGFlow(user)> tts with 'mimo-v2.5-tts@test@xiaomi' text 'hello? 
show yourself' play format 'wav' param '{"voice": "Chloe"}' SUCCESS RAGFlow(user)> asr with 'mimo-v2.5-asr@test@xiaomi' audio './internal/test.wav' param '{"language": "zh"}' +------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------+ \| 1 The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +------------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-06-08 19:27:45 +08:00
oktofeesh	d63bd81d0d	fix(go-models): fix Moonshot model and balance requests (#15528 ) ## Summary - keep Moonshot chat calls in non-streaming mode and streaming calls in SSE mode - make Moonshot model listing and balance checks use bodyless GET requests - add focused Moonshot request/response regression tests	2026-06-08 19:27:19 +08:00
Jack	04209ffccf	feat: implement FetchChunkVectors for citation vector hydration (#15749 ) ## What problem does this PR solve? Implements `FetchChunkVectors` — the infrastructure needed to hydrate chunk embedding vectors on demand. This is a prerequisite for `insert_citations` (citation insertion in the `searchbots/ask` endpoint), matching the Python `Dealer.fetch_chunk_vectors` pattern. Without this, citation insertion cannot compute answer-vs-chunk vector similarity. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Changes ### New Function - `FetchChunkVectors(engine, chunkIDs, tenantIDs, kbIDs, dim)` — fetches embedding vectors for a set of chunk IDs - Consumer-side `vectorFetcher` interface with only `Search` + `GetType` methods - Both `elasticsearchEngine` and `infinityEngine` implicitly satisfy the interface ### Engine Behavior - ES: queries by chunk ID list via `Search` with filter `{"id": chunkIDs}`, parses tab-separated `q_N_vec` string format - Infinity / OceanBase: skips the round-trip (vectors already shipped with chunks) - Degrades gracefully: engine errors return zero vectors — citation insertion will use placeholders instead of failing ### Vector Parsing - Handles ES tab-separated string format (`"0.1\t0.2\t0.3"`) - Handles `[]float64` and `[]interface{}` formats - Returns zero vector for wrong-dimension or unparseable input ### Bug Fix - `metadata_filter_test.go`: add missing `"sort"` import (pre-existing build break) ### Tests - 12 unit tests: empty input, Infinity/OceanBase skip, ES string vector, ES float slice, ES interface slice, search error degradation, missing chunk → zero, wrong dimension → zero, parse edge cases ## Files Changed \| File \| Change \| \|------\|--------\| \| `internal/service/chunk_vector.go` \| New — FetchChunkVectors + parse helpers \| \| `internal/service/chunk_vector_test.go` \| New — 12 tests \| \| `internal/service/metadata_filter_test.go` \| Fix missing `"sort"` import \| 🤖 
Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 16:54:00 +08:00
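The vector-parsing rules listed in the commit above (tab-separated strings, `[]float64`, `[]interface{}`, and a zero vector for wrong-dimension or unparseable input) can be sketched like this. The function name `parseVector` is hypothetical; the actual parse helpers in `chunk_vector.go` may be organized differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVector (hypothetical name) converts a raw stored value into a
// []float64 of length dim, returning a zero vector on any mismatch.
func parseVector(raw interface{}, dim int) []float64 {
	zero := make([]float64, dim)
	var out []float64
	switch v := raw.(type) {
	case string: // tab-separated format, e.g. "0.1\t0.2\t0.3"
		for _, part := range strings.Split(v, "\t") {
			f, err := strconv.ParseFloat(strings.TrimSpace(part), 64)
			if err != nil {
				return zero
			}
			out = append(out, f)
		}
	case []float64:
		out = append(out, v...)
	case []interface{}: // typical shape after generic JSON decoding
		for _, e := range v {
			f, ok := e.(float64)
			if !ok {
				return zero
			}
			out = append(out, f)
		}
	default:
		return zero
	}
	if len(out) != dim {
		return zero
	}
	return out
}

func main() {
	fmt.Println(parseVector("0.1\t0.2\t0.3", 3))
	fmt.Println(parseVector("0.1\t0.2", 3))
}
```

Returning a zero vector instead of an error matches the degradation described in the commit: callers can keep going and fall back to placeholders.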
Haruko386	67ce0c896d	feat[Go]: implement /api/v1/agents/<agent_id>/sessions (#15705 ) ### What problem does this PR solve? As title. The code was tested with Postman. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-08 16:26:27 +08:00
Jack	8f4809d1b5	feat: implement POST /api/v1/searchbots/retrieval_test (#15710 ) ## What problem does this PR solve? Implements `POST /api/v1/searchbots/retrieval_test` in the Go API server, aligning with the Python `bot_api.py` counterpart. Also applies security hardening and consistency fixes discovered during CTO-level code review: - Missing endpoint: `retrieval_test` was not available in Go, requiring Python fallback - Security: Both `chunkHandler` and `searchBotHandler` leaked `err.Error()` to API consumers - Python alignment: Default values, empty question handling, and `top_k <= 0` validation differed from Python behavior - Test gaps: `chunkHandler.RetrievalTest` had zero unit tests; several edge cases uncovered ## Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring ## Summary ### New Endpoint - `POST /api/v1/searchbots/retrieval_test` — retrieval test with full field support (page, size, top_k, use_kg, cross_languages, keyword, similarity_threshold, vector_similarity_weight) ### New Type - `common.StringSlice` — JSON type that accepts both `"kb1"` and `["kb1", "kb2"]`, matching Python API flexibility ### Security - Both `searchBotHandler` and `chunkHandler` now use `common.Warn()` + generic error messages instead of leaking `err.Error()` to API consumers - All error responses include consistent `"data": nil` shape - `chunkHandler.RetrievalTest` uses interface-based DI (`chunkService`) to enable testability ### Python Alignment - Handler-level defaults align with Python `bot_api.py` (page=1, size=30, top_k=1024, similarity_threshold=0.0, vector_similarity_weight=0.3) - `top_k <= 0` validation matching Python behavior - Empty/whitespace question returns 200 + empty result (matches `chunk_api.py`) - `chunkHandler` `Datasets` field uses `common.StringSlice` for string-or-array flexibility ### Refactoring - `ChunkServiceIface` → `ChunkRetriever`, 
`chunkSvcIface` → `chunkService` (Go-conventional naming) - Extracted `applyRetrievalDefaults`, `toRetrievalServiceRequest` from handler body - Regex moved to package-level var in `parseRelatedQuestions` - `service.RetrievalTestRequest.Datasets` type changed to `common.StringSlice` - `chunkHandler` now uses consumer-side interface for DI ### Tests - 37 unit tests across both handlers: auth, validation, defaults, StringSlice edge cases, empty/whitespace KbID, service errors, JSON format, `top_k <= 0`, field mapping verification ## Files Changed \| File \| Change \| \|------\|--------\| \| `cmd/server_main.go` \| Wire new handler + chunkService + difyRetrievalHandler \| \| `internal/common/json_types.go` \| New StringSlice type \| \| `internal/common/json_types_test.go` \| StringSlice tests \| \| `internal/handler/chunk.go` \| Interface-based DI, security, Python alignment, defaults \| \| `internal/handler/chunk_test.go` \| New — 9 comprehensive tests \| \| `internal/handler/searchbot.go` \| New endpoint + refactoring + `top_k <= 0` validation \| \| `internal/handler/searchbot_test.go` \| 18 tests covering all edge cases \| \| `internal/router/router.go` \| Register new route + difyRetrievalHandler \| \| `internal/service/chunk.go` \| Datasets type → StringSlice, Question binding relaxed \| 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 16:16:56 +08:00
qinling0210	5e0a7ce408	Update Rerank logic in GO (#15755 ) ### What problem does this PR solve? Sync the rerank logic from the following PRs to Go: https://github.com/infiniflow/ragflow/pull/15429 https://github.com/infiniflow/ragflow/pull/15434 ### Type of change - [x] Refactoring	2026-06-08 15:28:10 +08:00
bitloi	220ee9dbfb	fix: normalize reasoning model families (#15612 ) ### What problem does this PR solve? Closes #15611. RAGFlow's fallback reasoning parser only recognized the exact model family `qwen3`. For provider-prefixed Qwen model names such as SiliconFlow's `qwen/qwen3-8b`, the derived model class can be `qwen/qwen3`, so inline `<think>...</think>` content was not split from the visible answer when `reasoning_content` was absent. This PR normalizes model-family detection before fallback reasoning extraction, keeps the parser nil-safe, and adds focused tests for Qwen3 variants plus Gitee and SiliconFlow chat responses. It also makes SiliconFlow propagate `ChatConfig.Thinking` into the chat request body, matching the existing Gitee behavior, so Qwen thinking mode is actually enabled when requested. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring ### Validation - `/root/go/bin/gofmt -l internal/entity/models/common.go internal/entity/models/common_test.go internal/entity/models/reasoning_family_provider_test.go internal/entity/models/siliconflow.go` - `git diff --check` - `/root/go/bin/go test ./internal/entity/models -run 'Test(NormalizeModelFamily\|GetThinkingAndAnswer\|GiteeChatExtractsQwenThinkingFromInlineContent\|SiliconflowChatExtractsProviderPrefixedQwenThinkingFromInlineContent)' -vet=off -count=1` Note: the full package command `/root/go/bin/go test ./internal/entity/models -vet=off -count=1` now runs locally, but it currently fails on an unrelated existing `TestAstraflowEmbedReturnsNoSuchMethod` panic in `internal/entity/models/astraflow.go:482`.	2026-06-08 13:32:52 +08:00
oktofeesh	b1a2210d06	fix(go-models): increase JieKouAI SSE scanner buffer (#15737 ) ## Summary - Raise the JieKouAI streaming SSE scanner buffer to handle larger data chunks without truncation.	2026-06-08 13:10:10 +08:00
tmimmanuel	5e25e2600b	Go: implement Xiaomi chat provider (#15626 ) ### What problem does this PR solve? Implements the Xiaomi MiMo chat provider for the Go model provider layer. Reference issue: #14736 Official docs used: - Xiaomi MiMo OpenAI-compatible chat API: https://platform.xiaomimimo.com/docs/en-US/api/chat/openai-api - Xiaomi MiMo model and rate limits: https://platform.xiaomimimo.com/docs/en-US/quick-start/model - Xiaomi MiMo model hyperparameters: https://platform.xiaomimimo.com/docs/en-US/quick-start/model-hyperparameters	2026-06-08 13:09:36 +08:00
dripsmvcp	3d7adf2193	feat[Go]: implement GET /plugin/tools (issue #15240 ) (#15570 ) ## Summary Port the Python `GET /v1/plugin/tools` endpoint to the Go API server. Listed in the Go-API port checklist of #15240. Returns the metadata of every embedded LLM tool plugin in the same JSON shape the Python endpoint emits (camelCase keys preserved), so existing frontends bind to the Go server without changes.	2026-06-08 11:53:19 +08:00
qinling0210	c960dc2a4c	Refine handling of POST /api/v1/datasets/search in GO (#15583 ) ### What problem does this PR solve? Refine handling of POST /api/v1/datasets/search in GO ### Type of change - [x] Refactoring	2026-06-08 11:49:37 +08:00
Hz_	074c331cdf	fix(go-api): sync document handler interface and enforce preview acce… (#15688 ) ### Description This PR syncs the `documentServiceIface` interface and introduces handler methods for document preview, artifact fetching, and downloading in the Go API. It also ensures that strict dataset alignment and access checks are enforced when retrieving or downloading documents. Furthermore, this PR introduces comprehensive unit tests for both the newly added Handler and Service methods to ensure robustness and prevent future regressions. ### Key Changes * Router & Handler Integration: * Added and wired new API endpoints in `internal/router/router.go`. * Synchronized the `documentServiceIface` with `GetDocumentArtifact`, `GetDocumentPreview`, and `DownloadDocument`. * Implemented handlers for these endpoints in `internal/handler/document.go`. * Access & Validation Enforcement: * Refactored `internal/service/document.go` to strictly check if a document belongs to the requested dataset before allowing downloads or previews. * Added robust artifact file sanitization (`sanitizeArtifactFilename`) and attachment handling (`shouldForceArtifactAttachment`). * Comprehensive Unit Testing: * Handler Layer (`internal/handler/document_test.go`): Added mock service implementations and Gin router tests covering success, not-found, and internal error states for all 3 new endpoints. * Service Layer (`internal/service/document_test.go`): Added extensive business logic tests including dataset mismatch checks, non-existent document checks, and artifact file validation.	2026-06-08 11:37:06 +08:00
Jack	5a04ac0864	feat: Dify-compatible retrieval API endpoint (#15704 ) ## Summary Dify-compatible retrieval API for external knowledge base integration. ## Changes - New handler: DifyRetrievalHandler with POST/GET /api/v1/dify/retrieval - Health check: GET /api/v1/dify/retrieval/health - Full pipeline: KB validation -> permission check -> embedding -> metadata filter -> chunk retrieval -> child chunk aggregation -> optional KG search -> response assembly - 12 tests covering all paths (success, errors, metadata filter, KG mode) - Testability: Handler dependencies defined as interfaces (KBServiceIface, ModelServiceIface, etc.) ## Files \| File \| Type \| \|------\|------\| \| internal/handler/dify_retrieval_handler.go \| New — handler + interfaces \| \| internal/handler/dify_retrieval_handler_test.go \| New — 12 tests \| \| internal/router/router.go \| Modified — route registration \| \| cmd/server_main.go \| Modified — handler wiring \| \| internal/service/kg/pipeline.go \| Modified — SetChatModel/SetEmbModel \| \| internal/service/kg/retrieval.go \| New — helper functions \| \| internal/service/kg/scoring.go \| Moved from service package \| \| internal/service/kg/search.go \| New — KG search functions \| \| internal/service/kg/types.go \| New — type definitions \| --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 21:16:25 +08:00
Hz_	1deb1313d2	feat(go-cli): support batch model add/remove and optional embedding dimension (#15631 ) ## Summary This PR improves the Go CLI in two areas: 1. It adds batch model management support, allowing multiple models to be added or removed in a single command. 2. It makes the `dimension` argument optional for the `embed text` command. These changes keep the existing single-model and explicit-dimension behaviors compatible while making the CLI more convenient for common workflows. ## What Changed ### 1. Batch model add/remove support The CLI now supports operating on multiple model names provided in a single quoted string. Supported commands include: ``` add model 'x1 x2 x3' to provider 'vllm' instance 'test' with tokens 1024 chat think vision, token 2048 chat, token 1024 think vision; drop model 'x1 x2 x3' from 'vllm' 'test'; remove model 'x1 x2 x3' from 'vllm' 'test'; ``` For add model, each config segment after with is matched to the corresponding model name by position. Example mapping: - x1 -> tokens 1024, chat + vision, thinking=true - x2 -> tokens 2048, chat - x3 -> tokens 1024, vision, thinking=true The existing single-model syntax remains supported. ### 2. Optional embedding dimension Previously, the Go CLI required dimension to be explicitly provided for embed text. Before: embed text 'what is rag' 'who are you' with 'model@test@provider' dimension 8192; Now both forms are supported: embed text 'what is rag' 'who are you' with 'model@test@provider' dimension 8192; embed text 'what is rag' 'who are you' with 'model@test@provider'; When omitted, the CLI leaves dimension unset and relies on provider/backend behavior. ## Tests Added parser tests covering: - Multiple models with multiple config segments - Model type deduplication - Model/config count mismatch - Drop/remove multiple models - Optional embedding dimension parsing	2026-06-05 19:31:06 +08:00
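The positional matching described in the commit above could look roughly like this. It is a sketch under assumptions: `pairModelsWithConfigs` and `modelConfig` are hypothetical names, and the real parser works on CLI tokens rather than on two pre-split strings.

```go
package main

import (
	"fmt"
	"strings"
)

// modelConfig pairs one model name with the raw config segment that applies to it.
type modelConfig struct {
	Name   string
	Config string
}

// pairModelsWithConfigs splits the quoted model list on whitespace and the
// config list on commas, then matches them by position. A count mismatch is
// reported as an error instead of being silently padded or truncated.
func pairModelsWithConfigs(names, configs string) ([]modelConfig, error) {
	modelNames := strings.Fields(names)
	segments := strings.Split(configs, ",")
	if len(modelNames) != len(segments) {
		return nil, fmt.Errorf("got %d models but %d config segments", len(modelNames), len(segments))
	}
	pairs := make([]modelConfig, 0, len(modelNames))
	for i, name := range modelNames {
		pairs = append(pairs, modelConfig{Name: name, Config: strings.TrimSpace(segments[i])})
	}
	return pairs, nil
}

func main() {
	pairs, err := pairModelsWithConfigs(
		"x1 x2 x3",
		"tokens 1024 chat think vision, token 2048 chat, token 1024 think vision",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range pairs {
		fmt.Printf("%s -> %s\n", p.Name, p.Config)
	}
}
```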
Jack	ea79d65d08	feat: add KGSearchRetrieval for full KG pipeline (N-hop, scoring, query_rewrite, community) (#15690 ) ## Summary `KGSearchRetrieval` composes entity search, type search, relation search, N-hop analysis, score fusion, LLM-based query\_rewrite, and community reports into a single synthetic chunk for KG-enhanced retrieval. ### Components \| Component \| Source \| Status \| \|-----------\|--------\|--------\| \| Entity/relation/community search \| Direct `DocEngine.Search` calls \| ✅ \| \| N-hop analysis + score fusion \| `common.AnalyzeNHopPaths` / `DoubleHitBoost` / `FuseRelationScores` \| ✅ #15666 \| \| Query rewrite prompt + parser \| `common.BuildQueryRewritePrompt` / `ParseQueryRewriteResponse` \| ✅ #15669 \| \| Token budget \| `common.BuildKGContent` + `NumTokensFromString` \| ✅ #15666 \| \| LLM query rewrite integration \| `queryRewrite` function with fallback \| ✅ \| ### Testing 11 tests (pure function + mock engine): ``` === RUN TestKgEntityFromChunk_Basic --- PASS === RUN TestKgEntityFromChunk_ScoreFallback --- PASS === RUN TestKgEntityFromChunk_MissingFields --- PASS === RUN TestKgRelationFromChunk_Basic --- PASS === RUN TestKgRelationFromChunk_MissingFrom --- PASS === RUN TestSearchKGTypeSamples_Success --- PASS === RUN TestSearchKGTypeSamples_Empty --- PASS === RUN TestKGSearchRetrieval_Basic --- PASS === RUN TestKGSearchRetrieval_NoEntities --- PASS === RUN TestQueryRewrite_Fallback --- PASS === RUN TestQueryRewrite_EmptyQuestion --- PASS ``` --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 18:00:27 +08:00
tmimmanuel	f78ef328bb	Go: implement Bedrock embeddings (#15543 ) ### What problem does this PR solve? Fixes #15542. AWS Bedrock support for the Go model provider layer was added in #15166, but embedding support was intentionally left out of scope and `BedrockModel.Embed(...)` still returned the `no such method` sentinel. This PR implements Bedrock text embeddings under the umbrella provider tracker #14736. ### What this PR includes - `internal/entity/models/bedrock.go`: implement `BedrockModel.Embed(...)` through Bedrock Runtime `InvokeModel` with existing SigV4 auth, region resolution, and runtime URL helpers. - Titan embeddings: supports `amazon.titan-embed-text-v1` and `amazon.titan-embed-text-v2:0`; v2 forwards `EmbeddingConfig.Dimension` as `dimensions` when provided, while v1 keeps the payload minimal. - Cohere embeddings: supports `cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, and `cohere.embed-v4:0`; batches input texts and maps returned vectors to RAGFlow `EmbeddingData` in input order. - `conf/models/bedrock.json`: adds the `embedding` URL suffix (`invoke`) and Bedrock embedding model entries. - `internal/entity/models/bedrock_test.go`: adds unit tests for Titan, Cohere, typed Cohere responses, validation, empty input, unsupported models, and HTTP error propagation. Reference docs: - Bedrock InvokeModel API: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html - Titan Text Embeddings: https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html - Cohere Embed models on Bedrock: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### How was this tested? - [x] `jq empty conf/models/bedrock.json` - [x] `git diff --check` - [x] `go test ./internal/entity/models/... -run Bedrock -count=1` - [x] `go test ./internal/entity/models/... -run '^$' -count=1` - [x] `go test ./internal/entity/models/... -run Bedrock -race -count=1` Note: `go test ./internal/entity/models/... -count=1` currently fails in unrelated existing Astraflow coverage (`TestAstraflowEmbedReturnsNoSuchMethod` panics in `internal/entity/models/astraflow.go`). The Bedrock-specific tests and compile-only package check pass.	2026-06-05 13:26:32 +08:00
web-dev0521	b8db200757	feat(go-api): implement MCP server management endpoints (#15281 ) ## Summary Ports the MCP (Model Context Protocol) server management endpoints that power `web/src/pages/user-setting/mcp/` from Python (`api/apps/restful_apis/mcp_api.py`) to Go. There were no MCP routes in the Go server before this change. Closes #15275 (subtask of #15240). ## Endpoints implemented (base path `/api/v1`) \| Method \| Path \| Description \| \|--------\|------\|-------------\| \| GET \| `/mcp/servers` \| List tenant servers (keyword / order / pagination) \| \| POST \| `/mcp/servers` \| Create a server \| \| GET \| `/mcp/servers/{mcp_id}` \| Get one (`?mode=download` exports config) \| \| PUT \| `/mcp/servers/{mcp_id}` \| Update a server \| \| DELETE \| `/mcp/servers/{mcp_id}` \| Delete a server \| \| POST \| `/mcp/import` \| Bulk import from JSON config \| \| POST \| `/mcp/servers/{mcp_id}/test` \| Connect + list tools (see notes) \| ## Implementation Follows the existing `handler → service → dao` layering (per PR #14790): - entity (`internal/entity/mcp.go`): added `MCPServerType` constants and `IsValidMCPServerType` over the existing `MCPServer` model. - dao (`internal/dao/mcp.go`): new `MCPServerDAO` with tenant-scoped CRUD, a keyword filter, and a whitelisted order-column map (guards against SQL injection via the caller-supplied `orderby`). - service (`internal/service/mcp.go`): new `MCPService` — list/get/export/create/update/delete/import/test — mirroring `MCPServerService` and the `mcp_api` request validation, with sentinel errors for clean code mapping. - handler (`internal/handler/mcp.go`): new `MCPHandler` with the seven handlers and Python-compatible response codes. - router / server_main: registered the `/mcp` group and wired the handler. ## Deviations from Python (documented in code) 1. Bulk import is at `POST /mcp/import`, not `/mcp/servers/import`. 
gin (v1.9.1) cannot register a static segment and a path param at the same tree node, so `/mcp/servers/import` would collide with `/mcp/servers/:mcp_id` and panic at startup. The frontend should call `/mcp/import`. 2. No live tool discovery on create/update/import. The Python path runs `get_mcp_tools` over SSE / streamable-HTTP and stores `variables.tools`. The Go server has no MCP client yet, so these persist `variables`/`headers` but leave `variables.tools` unpopulated. 3. `/test` returns a data error (`ErrMCPTestUnsupported`) until a Go MCP client lands. Per the issue, the live-connection path is scoped as a follow-up; the handler still validates `url` + `server_type`. ## Testing - Added `internal/service/mcp_test.go` covering `IsValidMCPServerType` and the `TestServer` validation/short-circuit paths (no DB required). - No Go toolchain was available in the dev environment, so `go build ./...` / `go vet ./...` verification is left to CI. ## Follow-ups - Go MCP client (SSE / streamable-HTTP) to enable live tool discovery and the real `/test` behavior. - Reconcile the `/mcp/import` vs `/mcp/servers/import` path with the frontend. ---------	2026-06-05 13:25:09 +08:00
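The whitelisted order-column map mentioned for the DAO in the commit above is a common guard against SQL injection through a caller-supplied sort key. A minimal sketch follows; the column names, the `create_time` fallback, and the `orderClause` helper are assumptions for illustration, not the contents of `internal/dao/mcp.go`.

```go
package main

import "fmt"

// allowedOrderColumns maps caller-supplied sort keys to real column names.
// The entries here are assumed for illustration.
var allowedOrderColumns = map[string]string{
	"create_time": "create_time",
	"update_time": "update_time",
	"name":        "name",
}

// orderClause returns an ORDER BY fragment built only from whitelisted
// columns. Unknown keys fall back to a default instead of reaching the query.
func orderClause(orderBy string, desc bool) string {
	column, ok := allowedOrderColumns[orderBy]
	if !ok {
		column = "create_time"
	}
	if desc {
		return column + " DESC"
	}
	return column + " ASC"
}

func main() {
	fmt.Println(orderClause("name", false))
	fmt.Println(orderClause("name; DROP TABLE mcp_server", true))
}
```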
Jack	e629c0203b	feat: add KG entity/relation/community search functions (#15689 ) ## Summary Knowledge Graph search functions for entity, relation, community report, and type-samples retrieval. Uses DocEngine.SelectFields (PR #15684) for KG-specific fields. ### Functions \| Function \| Description \| \|----------\|-------------\| \| `SearchKGEntities` \| Hybrid search over KG entities (dense + text + fusion) \| \| `SearchKGEntitiesByTypes` \| Entity search filtered by `entity_type_kwd` \| \| `SearchKGRelations` \| Hybrid search over KG relations \| \| `SearchKGCommunityReports` \| Community report search by entity names \| \| `SearchKGTypeSamples` \| Type→entities mapping for query_rewrite \| ### Internal helpers \| Helper \| Description \| \|--------\|-------------\| \| `buildHybridExpr` \| Shared dense+text+fusion expression construction \| \| `buildKGDenseExpr` \| Wraps `Embed()` call for vector search \| \| `Parse*` \| Convert raw chunks to typed structs \| ### Testing 35 tests (pure function + mock integration) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 13:23:04 +08:00
Haruko386	4b2af1347c	feat[Go]: implement Agent/Workflow PUT /api/v1/agents/<canvas_id>/tags (#15641 )	2026-06-05 13:22:23 +08:00
Jack	f6ff862a24	fix: restore case-insensitive contains/not contains/not in and consolidate metadata filter pipeline (#15686 ) ## Summary This PR fixes case-sensitivity regressions introduced in #15656 and consolidates the metadata filtering pipeline by removing the duplicate `applySingleCondition` adapter layer. ### Bug fixes 1. contains / not contains: restored case-insensitive matching (was lost when `applySingleCondition` was replaced by `common.MetaFilter.matchValue` which lacked `strings.ToLower`) 2. not in: restored case-insensitive matching (was lost for same reason; uses `strings.EqualFold`) 3. != with date filter values: non-date metadata values now correctly match the `≠` operator (a non-date value IS not equal to any date, but was returning false) ### Architecture 4. Removed `applySingleCondition` (65 lines) — the inline switch was a duplicate of `common.MetaFilter` logic. `ApplyMetaFilter` now converts conditions and delegates to `common.MetaFilter` once per filter set, eliminating ~25 lines of duplicate AND/OR merge logic. 5. Added `filterSet` — O(n+m) hash-map fast path for `in`/`not in` operators, replacing the O(nm) linear scan in `matchValue`. 6. Exported `NormalizeOperator` from `common` for consistent operator alias handling. ### Cleanup 7. Removed 18 lines of dead code (`matchValue`'s `in`/`not in` branches already bypassed by `filterOut` delegation) 8. Fixed orphaned godoc comment for `convertOperator` 9. Fixed incorrect `filterSet` doc comment (claimed "matching EqualFold" but used `strings.ToLower`) 10. Completed `convertToMetaCondition` operator normalization documentation ### Testing - 60 tests (24 service + 36 common), all passing - New tests: `==`, `≠`, `>`, `<`, `≥`, `≤`, `empty`, `not empty` through `ApplyMetaFilter` - New tests: `<`, `≤`, `≠` through `MetaFilter`; `not-in-empty-list` through `filterSet` - All 18 `MetaFilter` tests pass; all 10 `filterSet` unit tests pass --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 12:47:55 +08:00
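The O(n+m) fast path for `in`/`not in` described in the commit above amounts to building a set from the filter values once and probing it per candidate. The sketch below assumes lowercasing with `strings.ToLower` for case-insensitivity, as the commit message says the real `filterSet` does; the type layout and the `filterIn` helper are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// filterSet is a lowercase-keyed set used for case-insensitive membership tests.
type filterSet map[string]struct{}

// newFilterSet builds the set in O(m) for m filter values.
func newFilterSet(values []string) filterSet {
	set := make(filterSet, len(values))
	for _, v := range values {
		set[strings.ToLower(v)] = struct{}{}
	}
	return set
}

// contains reports whether v is in the set, ignoring case.
func (s filterSet) contains(v string) bool {
	_, ok := s[strings.ToLower(v)]
	return ok
}

// filterIn keeps candidates that are in the filter values, or, when negate is
// true, candidates that are not. Total cost is O(n+m) rather than O(n*m).
func filterIn(candidates, filterValues []string, negate bool) []string {
	set := newFilterSet(filterValues)
	var kept []string
	for _, c := range candidates {
		if set.contains(c) != negate {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	values := []string{"Alpha", "beta", "Gamma"}
	fmt.Println(filterIn(values, []string{"ALPHA", "gamma"}, false))
	fmt.Println(filterIn(values, []string{"ALPHA", "gamma"}, true))
}
```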
Jack	ee32d91aab	feat: add EnrichChunksWithDocMetadata function to attach document metadata to chunks (#15659 ) ## Summary Add `EnrichChunksWithDocMetadata` as a method on `MetadataService` that attaches document metadata to retrieval chunks in-place. Equivalent to Python's `enrich_chunks_with_document_metadata()` from `api/utils/reference_metadata_utils.py`. ### Usage ```go metadataSvc.EnrichChunksWithDocMetadata(chunks, tenantID, metadataFields) ``` ### Changes - `service/metadata.go`: Added `EnrichChunksWithDocMetadata` method - `service/enrich_metadata_test.go` (new): 7 test cases ### Algorithm 1. Collect unique `(kb_id, doc_id)` pairs from chunks 2. Fetch metadata from ES via `SearchMetadata(kbID, tenantID, docIDs)` 3. Attach `document_metadata` field to each matching chunk 4. Optionally filter to specified `metadataFields` ### Testing All 7 tests pass: ``` === RUN TestEnrichChunksWithDocMetadata_NoChunks --- PASS === RUN TestEnrichChunksWithDocMetadata_EmptyChunks --- PASS === RUN TestEnrichChunksWithDocMetadata_EmptyDocID --- PASS === RUN TestEnrichChunksWithDocMetadata_DuplicateDocIDs --- PASS === RUN TestEnrichChunksWithDocMetadata_MultipleKBs --- PASS === RUN TestEnrichChunksWithDocMetadata_WithMetadataFields --- PASS === RUN TestEnrichChunksWithDocMetadata_MixedFields --- PASS ``` Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 11:42:23 +08:00
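The four-step algorithm listed in the commit above can be sketched as follows. This is an illustration under assumptions: chunks are modeled as plain maps, `fetchFunc` stands in for the `SearchMetadata` call (its real signature also takes a tenant ID), and `enrichChunks` and `pickFields` are hypothetical names.

```go
package main

import "fmt"

// Chunk is a simplified stand-in for a retrieval chunk.
type Chunk map[string]any

// fetchFunc stands in for the metadata lookup; it returns metadata keyed by doc ID.
type fetchFunc func(kbID string, docIDs []string) map[string]map[string]any

// pickFields copies meta, keeping only the requested fields when any are given.
func pickFields(meta map[string]any, fields []string) map[string]any {
	out := make(map[string]any)
	if len(fields) == 0 {
		for k, v := range meta {
			out[k] = v
		}
		return out
	}
	for _, f := range fields {
		if v, ok := meta[f]; ok {
			out[f] = v
		}
	}
	return out
}

// enrichChunks attaches document metadata to chunks in place.
func enrichChunks(chunks []Chunk, fetch fetchFunc, fields []string) {
	// Step 1: collect unique (kb_id, doc_id) pairs.
	docsByKB := map[string][]string{}
	seen := map[string]struct{}{}
	for _, c := range chunks {
		kbID, _ := c["kb_id"].(string)
		docID, _ := c["doc_id"].(string)
		if kbID == "" || docID == "" {
			continue
		}
		key := kbID + "\x00" + docID
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		docsByKB[kbID] = append(docsByKB[kbID], docID)
	}
	// Step 2: fetch metadata once per knowledge base.
	metaByDoc := map[string]map[string]any{}
	for kbID, docIDs := range docsByKB {
		for docID, meta := range fetch(kbID, docIDs) {
			metaByDoc[kbID+"\x00"+docID] = meta
		}
	}
	// Steps 3 and 4: attach (optionally filtered) metadata to matching chunks.
	for _, c := range chunks {
		kbID, _ := c["kb_id"].(string)
		docID, _ := c["doc_id"].(string)
		if meta, ok := metaByDoc[kbID+"\x00"+docID]; ok {
			c["document_metadata"] = pickFields(meta, fields)
		}
	}
}

func main() {
	chunks := []Chunk{
		{"kb_id": "kb1", "doc_id": "d1"},
		{"kb_id": "kb1", "doc_id": "d1"},
		{"kb_id": "kb1", "doc_id": ""},
	}
	fetch := func(kbID string, docIDs []string) map[string]map[string]any {
		return map[string]map[string]any{"d1": {"author": "alice", "year": 2024}}
	}
	enrichChunks(chunks, fetch, []string{"author"})
	fmt.Println(chunks[0]["document_metadata"], chunks[2]["document_metadata"])
}
```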
Jack	3b1ae3f829	feat: support SelectFields override in DocEngine for KG-specific queries (#15684 ) ## Summary Both ES and Infinity engines now respect `SearchRequest.SelectFields`, allowing callers to specify output columns for KG entity/relation/community queries instead of the default chunk columns. ### Changes - `internal/engine/elasticsearch/chunk.go`: Added `SelectFields` override after default `outputColumns` - `internal/engine/infinity/chunk.go`: Added `SelectFields` override after default `outputColumns` - `internal/engine/elasticsearch/kg_test.go` (new): Integration test (skipped unless `ES_TEST=1`) ### Usage ```go result, err := docEngine.Search(ctx, &types.SearchRequest{ KbIDs: kbIDs, SelectFields: []string{"entity_kwd", "entity_type_kwd", "rank_flt", "n_hop_with_weight"}, Filter: map[string]interface{}{"knowledge_graph_kwd": "entity"}, }) ``` Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 11:41:39 +08:00
dripsmvcp	431f52a5d4	feat[Go]: implement GET /agents/templates (issue #15240 ) (#15573 ) ## Summary Port the canvas-template catalogue endpoint to the Go API server. Listed in the Go-API port checklist of #15240. Mirrors `list_agent_template` in `api/apps/restful_apis/agent_api.py`: returns every row from the `canvas_template` table so that the UI can render the template gallery on the New-Agent screen. ## What - `internal/dao/canvas_template.go` — new `CanvasTemplateDAO.GetAll()` ordered by `create_time desc` (newest templates first). - `internal/service/agent.go` — wire the new DAO into `AgentService` and expose `ListTemplates() ([]entity.CanvasTemplate, error)`. - `internal/handler/agent.go` — new `AgentHandler.ListTemplates` HTTP handler (auth-gated, mirrors Python `@login_required`). - `internal/router/router.go` — `agents.GET("/templates", r.agentHandler.ListTemplates)` registered alongside the existing `GET /agents`. - `internal/handler/agent_test.go` — three new tests covering: success path, empty-list → JSON array (not `null`), and the auth gate. ## Notes - `CanvasTemplate` entity, GORM tags, and DB migration already exist in `internal/entity/canvas.go` and `internal/dao/database.go` — no schema change required. - The handler coerces a `nil` slice to `[]entity.CanvasTemplate{}` so the JSON payload is always an array (the frontend does `data.map(...)` on it). ## Test plan - [x] `go vet ./internal/handler ./internal/service ./internal/dao ./internal/router` clean - [x] Three unit tests added; existing `TestListAgents_Success` untouched - [ ] CI runs `go test ./internal/handler` with cgo binding linked ## Related - Tracker: #15240	2026-06-05 10:13:30 +08:00
Jack	a237a89b90	feat: add QueryRewrite prompt builder and response parser (#15669 ) QueryRewrite prompt builder and response parser. Zero external dependencies. ### Functions - `BuildQueryRewritePrompt`: Renders `minirag_query2kwd` prompt with query and type pool - `ParseQueryRewriteResponse`: Parses LLM JSON response with fallback for markdown and extra text ### Testing ``` === RUN TestBuildQueryRewritePrompt --- PASS === RUN TestParseQueryRewriteResponse_ValidJSON --- PASS === RUN TestParseQueryRewriteResponse_MarkdownBlock --- PASS === RUN TestParseQueryRewriteResponse_ExtraText --- PASS === RUN TestParseQueryRewriteResponse_Invalid --- PASS === RUN TestParseQueryRewriteResponse_EmptyEntities --- PASS ``` Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 10:11:14 +08:00
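One way to get the "fallback for markdown and extra text" behavior described in the commit above is to extract the outermost `{...}` span before unmarshalling. The sketch below shows that approach only; it is not the PR's parser, and the `rewriteResult` type, its JSON keys, and `parseRewriteResponse` are assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// rewriteResult holds the parsed LLM output. The JSON keys are assumed.
type rewriteResult struct {
	AnswerTypes []string `json:"answer_type_keywords"`
	Entities    []string `json:"entities_from_query"`
}

// parseRewriteResponse tolerates markdown fences and surrounding prose by
// decoding only the text between the first '{' and the last '}'.
func parseRewriteResponse(raw string) (rewriteResult, error) {
	var result rewriteResult
	start := strings.Index(raw, "{")
	end := strings.LastIndex(raw, "}")
	if start < 0 || end < start {
		return result, errors.New("no JSON object found in response")
	}
	if err := json.Unmarshal([]byte(raw[start:end+1]), &result); err != nil {
		return result, err
	}
	return result, nil
}

func main() {
	raw := "Sure, here you go:\n```json\n" +
		`{"answer_type_keywords": ["person"], "entities_from_query": ["RAG"]}` +
		"\n```\nHope that helps."
	result, err := parseRewriteResponse(raw)
	fmt.Println(result.AnswerTypes, result.Entities, err)
}
```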
Jack	bf6c091c9f	feat: add KG scoring utilities (#15666 ) KG scoring utilities as pure functions. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 10:10:59 +08:00
Jack	eee6ad546f	feat: add ResolveReferenceMetadata utility function (#15663 ) Add `ResolveReferenceMetadata` to parse `include_metadata` / `metadata_fields` from request and config payloads. ### Changes - New: `internal/common/reference_metadata.go` — pure function, zero dependencies - New: `internal/common/reference_metadata_test.go` — 8 test cases Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 22:34:18 +08:00
Jack	96a416629d	refactor: change GetFlattedMetaByKBs return type to common.MetaData (#15656 ) ## Summary Change `GetFlattedMetaByKBs` return type from `map[string]interface{}` to strongly-typed `common.MetaData`. Depends on: #15648 (provides `MetaData`, `MetaValueDocs` types) ### Changes - `service/metadata.go`: Changed return type, removed type assertions - `service/metadata_filter.go`: Updated all metadata function signatures - `service/metadata_filter_test.go` (new): 12 test cases ### Bug fix `applySingleCondition` used `.([]interface{})` assertions on `[]string` data, silently breaking operators like `!=`, `contains`, `start with`, etc. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 22:16:04 +08:00
Jack	a78a3fdd47	fix: add nil guard to DocumentDAO.GetByIDs and add tests (#15649 ) ## Summary `DocumentDAO.GetByIDs()` generated `WHERE id IN ()` for empty/nil ID slices, which is invalid SQL and would fail on most databases. This PR adds a nil guard and comprehensive tests. ### Changes - Modified: `internal/dao/document.go` — Added `len(ids) == 0` guard to `GetByIDs` - New: `internal/dao/document_test.go` — 4 test cases covering success, empty IDs, nil IDs, and no-match ### Testing ``` === RUN TestDocumentGetByIDs_Success --- PASS === RUN TestDocumentGetByIDs_EmptyIDs --- PASS === RUN TestDocumentGetByIDs_NilIDs --- PASS === RUN TestDocumentGetByIDs_NoMatch --- PASS ``` Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 21:00:02 +08:00
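The `len(ids) == 0` guard from the commit above covers both nil and empty slices, since `len` of a nil slice is 0. The sketch below shows the idea without a database; `buildGetByIDsQuery` and the `document` table name are hypothetical, and the real DAO uses GORM rather than a hand-built query string.

```go
package main

import (
	"fmt"
	"strings"
)

// buildGetByIDsQuery returns a parameterized IN query, or ok=false when there
// is nothing to look up, so the caller can return an empty result without
// sending "WHERE id IN ()" to the database.
func buildGetByIDsQuery(ids []string) (query string, args []any, ok bool) {
	if len(ids) == 0 { // true for both nil and empty slices
		return "", nil, false
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(ids)), ",")
	args = make([]any, len(ids))
	for i, id := range ids {
		args[i] = id
	}
	return "SELECT * FROM document WHERE id IN (" + placeholders + ")", args, true
}

func main() {
	if _, _, ok := buildGetByIDsQuery(nil); !ok {
		fmt.Println("no IDs: skip the query and return an empty result")
	}
	query, args, _ := buildGetByIDsQuery([]string{"a", "b"})
	fmt.Println(query, args)
}
```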


359 Commits