## What problem does this PR solve?
Implements `POST /api/v1/searchbots/retrieval_test` in the Go API
server, aligning with the Python `bot_api.py` counterpart. Also applies
security hardening and consistency fixes discovered during CTO-level
code review:
- **Missing endpoint**: `retrieval_test` was not available in Go,
requiring Python fallback
- **Security**: Both `chunkHandler` and `searchBotHandler` leaked
`err.Error()` to API consumers
- **Python alignment**: Default values, empty question handling, and
`top_k <= 0` validation differed from Python behavior
- **Test gaps**: `chunkHandler.RetrievalTest` had zero unit tests;
several edge cases uncovered
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
## Summary
### New Endpoint
- `POST /api/v1/searchbots/retrieval_test` — retrieval test with full
field support (page, size, top_k, use_kg, cross_languages, keyword,
similarity_threshold, vector_similarity_weight)
### New Type
- `common.StringSlice` — JSON type that accepts both `"kb1"` and
`["kb1", "kb2"]`, matching Python API flexibility
### Security
- Both `searchBotHandler` and `chunkHandler` now use `common.Warn()` +
generic error messages instead of leaking `err.Error()` to API consumers
- All error responses include consistent `"data": nil` shape
- `chunkHandler.RetrievalTest` uses interface-based DI (`chunkService`)
to enable testability
### Python Alignment
- Handler-level defaults align with Python `bot_api.py` (page=1,
size=30, top_k=1024, similarity_threshold=0.0,
vector_similarity_weight=0.3)
- `top_k <= 0` validation matching Python behavior
- Empty/whitespace question returns 200 + empty result (matches
`chunk_api.py`)
- `chunkHandler` `Datasets` field uses `common.StringSlice` for
string-or-array flexibility
### Refactoring
- `ChunkServiceIface` → `ChunkRetriever`, `chunkSvcIface` →
`chunkService` (Go-conventional naming)
- Extracted `applyRetrievalDefaults`, `toRetrievalServiceRequest` from
handler body
- Regex moved to package-level var in `parseRelatedQuestions`
- `service.RetrievalTestRequest.Datasets` type changed to
`common.StringSlice`
- `chunkHandler` now uses consumer-side interface for DI
### Tests
- 37 unit tests across both handlers: auth, validation, defaults,
StringSlice edge cases, empty/whitespace KbID, service errors, JSON
format, `top_k <= 0`, field mapping verification
## Files Changed
| File | Change |
|------|--------|
| `cmd/server_main.go` | Wire new handler + chunkService +
difyRetrievalHandler |
| `internal/common/json_types.go` | New StringSlice type |
| `internal/common/json_types_test.go` | StringSlice tests |
| `internal/handler/chunk.go` | Interface-based DI, security, Python
alignment, defaults |
| `internal/handler/chunk_test.go` | New — 9 comprehensive tests |
| `internal/handler/searchbot.go` | New endpoint + refactoring + `top_k
<= 0` validation |
| `internal/handler/searchbot_test.go` | 18 tests covering all edge
cases |
| `internal/router/router.go` | Register new route +
difyRetrievalHandler |
| `internal/service/chunk.go` | Datasets type → StringSlice, Question
binding relaxed |
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
## Summary
This PR fixes case-sensitivity regressions introduced in #15656 and
consolidates the metadata filtering pipeline by removing the duplicate
`applySingleCondition` adapter layer.
### Bug fixes
1. **contains / not contains**: restored case-insensitive matching (was
lost when `applySingleCondition` was replaced by
`common.MetaFilter.matchValue` which lacked `strings.ToLower`)
2. **not in**: restored case-insensitive matching (was lost for same
reason; uses `strings.EqualFold`)
3. **!= with date filter values**: non-date metadata values now
correctly match the `≠` operator (a non-date value IS not equal to any
date, but was returning false)
### Architecture
4. **Removed `applySingleCondition`** (65 lines) — the inline switch was
a duplicate of `common.MetaFilter` logic. `ApplyMetaFilter` now converts
conditions and delegates to `common.MetaFilter` once per filter set,
eliminating ~25 lines of duplicate AND/OR merge logic.
5. **Added `filterSet`** — O(n+m) hash-map fast path for `in`/`not in`
operators, replacing the O(n*m) linear scan in `matchValue`.
6. **Exported `NormalizeOperator`** from `common` for consistent
operator alias handling.
### Cleanup
7. Removed 18 lines of dead code (`matchValue`'s `in`/`not in` branches
already bypassed by `filterOut` delegation)
8. Fixed orphaned godoc comment for `convertOperator`
9. Fixed incorrect `filterSet` doc comment (claimed "matching EqualFold"
but used `strings.ToLower`)
10. Completed `convertToMetaCondition` operator normalization
documentation
### Testing
- 60 tests (24 service + 36 common), all passing
- New tests: `==`, `≠`, `>`, `<`, `≥`, `≤`, `empty`, `not empty` through
`ApplyMetaFilter`
- New tests: `<`, `≤`, `≠` through `MetaFilter`; `not-in-empty-list`
through `filterSet`
- All 18 `MetaFilter` tests pass; all 10 `filterSet` unit tests pass
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
QueryRewrite prompt builder and response parser. Zero external
dependencies.
### Functions
- `BuildQueryRewritePrompt`: Renders `minirag_query2kwd` prompt with
query and type pool
- `ParseQueryRewriteResponse`: Parses LLM JSON response with fallback
for markdown and extra text
### Testing
```
=== RUN TestBuildQueryRewritePrompt --- PASS
=== RUN TestParseQueryRewriteResponse_ValidJSON --- PASS
=== RUN TestParseQueryRewriteResponse_MarkdownBlock --- PASS
=== RUN TestParseQueryRewriteResponse_ExtraText --- PASS
=== RUN TestParseQueryRewriteResponse_Invalid --- PASS
=== RUN TestParseQueryRewriteResponse_EmptyEntities --- PASS
```
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Add `ResolveReferenceMetadata` to parse `include_metadata` /
`metadata_fields` from request and config payloads.
### Changes
- **New**: `internal/common/reference_metadata.go` — pure function, zero
dependencies
- **New**: `internal/common/reference_metadata_test.go` — 8 test cases
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
## Summary
Migrate the metadata filtering utilities `meta_filter` and
`convert_conditions` from `common/metadata_utils.py` to Go as pure
functions with zero external dependencies.
These functions are used by `dify/retrieval`, `openai/chat/completions`,
`document_api`, and `chunk_api` for filtering documents by metadata
conditions.
### Changes
- **New**: `internal/common/metadata_utils.go` — `ConvertConditions()`
and `MetaFilter()` with full operator support
- **New**: `internal/common/metadata_utils_test.go` — 18 test cases
covering all operators and edge cases
### Supported Operators
`=`, `≠`, `>`, `<`, `≥`, `≤`, `contains`, `not contains`, `in`, `not
in`, `start with`, `end with`, `empty`, `not empty`
### Design
- Numeric comparison via `strconv.ParseFloat`
- Date comparison via YYYY-MM-DD format detection
- Case-insensitive string comparison fallback
- `and` / `or` logic support for multiple conditions
- Zero external dependencies — pure functions only
### What problem does this PR solve?
```
RAGFlow(user)> embed text 'what is rag' 'who are you' with 'embedding-3@test@zhipu-ai' dimension 16;
+-----------+-------+
| dimension | index |
+-----------+-------+
| 16 | 0 |
| 16 | 1 |
+-----------+-------+
RAGFlow(user)> rerank query 'what is rag' document 'rag is retrieval augment generation' 'rag need llm' 'famous rag project includes ragflow' with 'rerank@test@zhipu-ai' top 2;
+-------+-----------------+
| index | relevance_score |
+-------+-----------------+
| 0 | 1 |
| 2 | 0.99999976 |
+-------+-----------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
As title
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
feat(File Management): Refactor File List API and Add Knowledge Base
Document Initialization
- Migrate the file list API endpoint from `/v1/file/list` to
`/api/v1/files` to align with the Python implementation.
- Add logic for initializing knowledge base documents; automatically
create the `.knowledgebase` folder and associated documents when
retrieving the root directory.
- Enhance parameter validation and error handling, including the
introduction of a new `CodeParamError` error code.
- Optimize the file list response structure to match the implementation
on the Python side.
- Update the Vite configuration to support proxying the new
`/api/v1/files` endpoint.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: CREATE / DELETE / LIST dataset api in Go
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Lynn <lynn_inf@hotmail.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
1. Add more CLI command
2. Add some license hooks
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. RAGFlow server will send heartbeat periodically.
2. This PR will including:
- Scheduled task
- API server message sending
- Admin server API to receive the message.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat:Using Go to implement user registration logic
### Type of change
- [x] New Feature (non-breaking change which adds functionality)