Commit Graph

16 Commits

Author SHA1 Message Date
Hz_
54fb5b0fa7 feat(go-api): add Go support for POST /api/v1/datasets/{dataset_id}/documents/{document_id}/chunks (#16256)
## Summary
Add the Go implementation of `POST
/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks`.

This wires the full create-chunk path in Go:
- router and handler registration
- request/response structs
- chunk creation service logic
- embedding generation
- chunk insert into doc engine
- chunk/token counter increment
- `tag_feas` validation
- `image_base64` decoding and chunk image storage/merge
- unit tests for handler and service

## Testing
Unit tests:
- `/usr/local/go/bin/go test ./internal/handler`
- `/usr/local/go/bin/go test ./internal/service/chunk`
- `/usr/local/go/bin/go test ./internal/service`
- `/usr/local/go/bin/go test ./...`

All passed locally.

Manual curl checks:
- basic text chunk: Go passed
- chunk with `important_keywords` / `questions` / `tag_kwd` /
`tag_feas`: Go passed
- blank content validation: Go matched expected `code=102`
- invalid `image_base64` validation: Go matched expected `code=102`
- image upload and repeated image upload / merge path: Go passed twice
2026-06-25 14:15:29 +08:00
Haruko386
dd46ece3bc feat[go]: datasets/<dataset_id>/chunks DELETE (#16185)
### What problem does this PR solve?

As title:

`documents.POST("/ingest", r.documentHandler.Ingest)`:

---

<img width="3750" height="2039" alt="image"
src="https://github.com/user-attachments/assets/533c1c3d-af3e-47e6-9f51-a278539b7066"
/>

`datasets.DELETE("/:dataset_id/chunks", r.chunkHandler.StopParsing)`

---

<img width="3621" height="2040" alt="image"
src="https://github.com/user-attachments/assets/022adcdb-1e47-4883-9611-1a695c34007d"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-24 19:43:18 +08:00
Haruko386
6cbd069ea3 feat[Go]: implement <document_id>/chunks/<chunk_id> PATCH (#16232)
### What problem does this PR solve?

Implement: 
1. `/api/v1/datasets/<dataset_id>/documents/<document_id>/chunks GET`
2.
`/api/v1/datasets/<dataset_id>/documents/<document_id>/chunks/<chunk_id>
PATCH`
3. `/api/v1/datasets/<dataset_id>/documents/<document_id>/chunks PATCH`

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-23 18:50:36 +08:00
Haruko386
b777e50291 feat[Go]: implement api /api/v1/datasets/<dataset_id>/chunks POST (#16067)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-22 18:14:01 +08:00
Jack
87b8062df4 feat: implement POST /api/v1/searchbots/ask — streaming RAG with citations and think-tag processing (#15825)
Implements POST /api/v1/searchbots/ask in Go with streaming SSE,
citations, and think-tag processing. 23 files, 90+ unit tests.

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 22:48:50 +08:00
Jack
8f4809d1b5 feat: implement POST /api/v1/searchbots/retrieval_test (#15710)
## What problem does this PR solve?

Implements `POST /api/v1/searchbots/retrieval_test` in the Go API
server, aligning with the Python `bot_api.py` counterpart. Also applies
security hardening and consistency fixes discovered during CTO-level
code review:

- **Missing endpoint**: `retrieval_test` was not available in Go,
requiring Python fallback
- **Security**: Both `chunkHandler` and `searchBotHandler` leaked
`err.Error()` to API consumers
- **Python alignment**: Default values, empty question handling, and
`top_k <= 0` validation differed from Python behavior
- **Test gaps**: `chunkHandler.RetrievalTest` had zero unit tests;
several edge cases uncovered

## Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

## Summary

### New Endpoint
- `POST /api/v1/searchbots/retrieval_test` — retrieval test with full
field support (page, size, top_k, use_kg, cross_languages, keyword,
similarity_threshold, vector_similarity_weight)

### New Type
- `common.StringSlice` — JSON type that accepts both `"kb1"` and
`["kb1", "kb2"]`, matching Python API flexibility

### Security
- Both `searchBotHandler` and `chunkHandler` now use `common.Warn()` +
generic error messages instead of leaking `err.Error()` to API consumers
- All error responses include consistent `"data": nil` shape
- `chunkHandler.RetrievalTest` uses interface-based DI (`chunkService`)
to enable testability

### Python Alignment
- Handler-level defaults align with Python `bot_api.py` (page=1,
size=30, top_k=1024, similarity_threshold=0.0,
vector_similarity_weight=0.3)
- `top_k <= 0` validation matching Python behavior
- Empty/whitespace question returns 200 + empty result (matches
`chunk_api.py`)
- `chunkHandler` `Datasets` field uses `common.StringSlice` for
string-or-array flexibility

### Refactoring
- `ChunkServiceIface` → `ChunkRetriever`, `chunkSvcIface` →
`chunkService` (Go-conventional naming)
- Extracted `applyRetrievalDefaults`, `toRetrievalServiceRequest` from
handler body
- Regex moved to package-level var in `parseRelatedQuestions`
- `service.RetrievalTestRequest.Datasets` type changed to
`common.StringSlice`
- `chunkHandler` now uses consumer-side interface for DI

### Tests
- 37 unit tests across both handlers: auth, validation, defaults,
StringSlice edge cases, empty/whitespace KbID, service errors, JSON
format, `top_k <= 0`, field mapping verification

## Files Changed

| File | Change |
|------|--------|
| `cmd/server_main.go` | Wire new handler + chunkService +
difyRetrievalHandler |
| `internal/common/json_types.go` | New StringSlice type |
| `internal/common/json_types_test.go` | StringSlice tests |
| `internal/handler/chunk.go` | Interface-based DI, security, Python
alignment, defaults |
| `internal/handler/chunk_test.go` | New — 9 comprehensive tests |
| `internal/handler/searchbot.go` | New endpoint + refactoring + `top_k
<= 0` validation |
| `internal/handler/searchbot_test.go` | 18 tests covering all edge
cases |
| `internal/router/router.go` | Register new route +
difyRetrievalHandler |
| `internal/service/chunk.go` | Datasets type → StringSlice, Question
binding relaxed |

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 16:16:56 +08:00
qinling0210
c960dc2a4c Refine handling of POST /api/v1/datasets/search in GO (#15583)
### What problem does this PR solve?

Refine handling of POST /api/v1/datasets/search in GO

### Type of change

- [x] Refactoring
2026-06-08 11:49:37 +08:00
qinling0210
af85aa9c7b Implement Elasticsearch functions in GO (#15160)
### What problem does this PR solve?

Implement Elasticsearch functions in GO (except for Search)

### Type of change

- [x] Refactoring
2026-05-25 19:15:07 +08:00
Jin Hai
2f2d1569e6 Go: fix retrieval test error (#14794)
### What problem does this PR solve?

1. Add region check in zhipu-ai embed method
2. Fix retrieval test

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-11 20:19:08 +08:00
Jin Hai
5e96c5cae6 Fix go cli: search on datasets (#14692)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-05-08 20:25:14 +08:00
qinling0210
82fa85c837 Implement Delete in GO and refactor functions (#13974)
### What problem does this PR solve?

Implement Delete in GO and refactor functions

### Type of change

- [x] Refactoring

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a remove_chunks command to delete specific or all chunks from a
document.
  * Added new endpoints for chunk removal and chunk update.

* **Refactor**
* Renamed index commands to dataset/metadata table terminology and
updated REST routes accordingly.
* Updated chunk update flow to a JSON POST style and improved metadata
error messages.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-04-09 09:52:31 +08:00
qinling0210
49386bc1b5 Implement UpdateDataset and UpdateMetadata in GO (#13928)
### What problem does this PR solve?

Implement UpdateDataset and UpdateMetadata in GO

Add cli:
UPDATE CHUNK <chunk_id> OF DATASET <dataset_name> SET <update_fields>
REMOVE TAGS 'tag1', 'tag2' from DATASET 'dataset_name';
SET METADATA OF DOCUMENT <doc_id> TO <meta>


### Type of change

- [ ] Refactoring
2026-04-07 09:44:51 +08:00
qinling0210
7c8927c4fb Implement GetChunk() in Infinity in GO (#13758)
### What problem does this PR solve?

Implement GetChunk() in Infinity in GO

Add cli:
GET CHUNK 'XXX';
LIST CHUNKS OF DOCUMENT 'XXX';

### Type of change

- [x] Refactoring
2026-03-24 20:10:21 +08:00
Jin Hai
2133fd76a8 Add auth middleware (#13506)
### What problem does this PR solve?

Use auth middle-ware to check authorization.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-11 11:23:13 +08:00
chanx
81fd1811b8 Feat:Using Go to implement user registration logic (#13431)
### What problem does this PR solve?

Feat:Using Go to implement user registration logic

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-06 16:42:49 +08:00
Jin Hai
70e9743ef1 RAGFlow go API server (#13240)
# RAGFlow Go Implementation Plan 🚀

This repository tracks the progress of porting RAGFlow to Go. We'll
implement core features and provide performance comparisons between
Python and Go versions.

## Implementation Checklist

- [x] User Management APIs
- [x] Dataset Management Operations
- [x] Retrieval Test
- [x] Chat Management Operations
- [x] Infinity Go SDK

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
2026-03-04 19:17:16 +08:00