Commit Graph

5 Commits

Author SHA1 Message Date
Hz_
d4fe3bb148 feat(go-api): Add GET dataset metadata summary API (#15843)
## What

Adds the RESTful dataset metadata summary endpoint:

`GET /api/v1/datasets/{dataset_id}/metadata/summary`

The endpoint supports optional document filtering through:

`?doc_ids=doc_id_1,doc_id_2`
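
A minimal sketch of how the comma-separated `doc_ids` value could be parsed, using only the standard library. The helper name `parseDocIDs`, the whitespace trimming, and the de-duplication are assumptions for illustration; the actual handler may treat the parameter differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseDocIDs is a hypothetical helper (not taken from the PR). It splits the
// raw `doc_ids` query value on commas, trims whitespace, and drops empty and
// duplicate entries while preserving the original order.
func parseDocIDs(raw string) []string {
	seen := make(map[string]struct{})
	var ids []string
	for _, part := range strings.Split(raw, ",") {
		id := strings.TrimSpace(part)
		if id == "" {
			continue
		}
		if _, dup := seen[id]; dup {
			continue
		}
		seen[id] = struct{}{}
		ids = append(ids, id)
	}
	return ids
}

func main() {
	// "ds1" is a placeholder dataset ID.
	u, err := url.Parse("/api/v1/datasets/ds1/metadata/summary?doc_ids=doc_id_1,doc_id_2")
	if err != nil {
		panic(err)
	}
	fmt.Println(parseDocIDs(u.Query().Get("doc_ids"))) // prints [doc_id_1 doc_id_2]
}
```

An absent or empty `doc_ids` yields a nil slice here, which a caller could interpret as "no document filter".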
2026-06-09 19:27:47 +08:00
Jack
338fdb65fb feat(ci): enable go test in CI pipeline (#15750)
## What problem does this PR solve?

Go test files are never compiled in CI; the pipeline only builds
production binaries via `go build`. This allowed a missing `"sort"`
import in `metadata_filter_test.go` to be merged without detection.

## Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

## Changes

- Add `go test -count=1 ./internal/...` step after Go build in CI
workflow
- Fix missing `"sort"` import in `metadata_filter_test.go` (pre-existing
compile error)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 20:06:57 +08:00
Hz_
074c331cdf fix(go-api): sync document handler interface and enforce preview acce… (#15688)
### Description

This PR syncs the `documentServiceIface` interface and introduces
handler methods for document preview, artifact fetching, and downloading
in the Go API. It also ensures that strict dataset alignment and access
checks are enforced when retrieving or downloading documents.

Furthermore, this PR introduces comprehensive unit tests for both the
newly added Handler and Service methods to ensure robustness and prevent
future regressions.

### Key Changes
* **Router & Handler Integration**:
  * Added and wired new API endpoints in `internal/router/router.go`.
  * Synchronized the `documentServiceIface` with `GetDocumentArtifact`,
    `GetDocumentPreview`, and `DownloadDocument`.
  * Implemented handlers for these endpoints in
    `internal/handler/document.go`.
* **Access & Validation Enforcement**:
  * Refactored `internal/service/document.go` to strictly check if a
    document belongs to the requested dataset before allowing downloads or
    previews.
  * Added robust artifact file sanitization (`sanitizeArtifactFilename`)
    and attachment handling (`shouldForceArtifactAttachment`).
* **Comprehensive Unit Testing**:
  * **Handler Layer (`internal/handler/document_test.go`)**: Added mock
    service implementations and Gin router tests covering success,
    not-found, and internal error states for all 3 new endpoints.
  * **Service Layer (`internal/service/document_test.go`)**: Added
    extensive business logic tests including dataset mismatch checks,
    non-existent document checks, and artifact file validation.
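
The dataset-membership check and filename sanitization described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `checkDocumentInDataset`, `sanitizeFilename`, the `document` struct, and the `"artifact"` fallback name are all hypothetical, and the real `sanitizeArtifactFilename` may apply different rules.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// errDatasetMismatch is a hypothetical sentinel error for this sketch.
var errDatasetMismatch = errors.New("document does not belong to dataset")

// document holds only the fields this sketch needs.
type document struct {
	ID        string
	DatasetID string
}

// checkDocumentInDataset rejects access when the document is not part of the
// dataset named in the request path.
func checkDocumentInDataset(doc document, datasetID string) error {
	if doc.DatasetID != datasetID {
		return errDatasetMismatch
	}
	return nil
}

// sanitizeFilename is a hypothetical stand-in for sanitizeArtifactFilename.
// It keeps only the final path element, strips control characters and double
// quotes, and falls back to a fixed name when nothing usable remains.
func sanitizeFilename(name string) string {
	name = strings.ReplaceAll(name, "\\", "/") // treat Windows separators as path separators
	name = path.Base(name)
	name = strings.Map(func(r rune) rune {
		if r < 0x20 || r == 0x7f || r == '"' {
			return -1 // drop the rune
		}
		return r
	}, name)
	if name == "" || name == "." || name == ".." || name == "/" {
		return "artifact"
	}
	return name
}

func main() {
	doc := document{ID: "doc1", DatasetID: "ds1"}
	fmt.Println(checkDocumentInDataset(doc, "ds2")) // prints the mismatch error
	fmt.Println(sanitizeFilename("../../etc/passwd")) // prints passwd
}
```

Reducing the name to its base element before any other filtering means a traversal attempt such as `../../etc/passwd` can never reach the `Content-Disposition` header or the storage layer with directory components intact.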
2026-06-08 11:37:06 +08:00
Jack
c6eee09ed3 feat: migrate POST /api/v1/datasets/<dataset_id>/documents/stop to Go (#15597)
## Summary

Migrate the stop parse documents endpoint from Python to Go.

### Python endpoint
`POST /api/v1/datasets/<dataset_id>/documents/stop` —
`api/apps/restful_apis/document_api.py:1542-1641`

### Changes
| File | Change |
|------|--------|
| `internal/dao/task.go` | Add `GetByDocID` method |
| `internal/dao/task_test.go` | 3 DAO tests (new file) |
| `internal/service/document.go` | Add `StopParseDocuments` + refactor shared helpers |
| `internal/service/document_test.go` | 8 service tests |
| `internal/handler/document.go` | Add handler + request struct + interface |
| `internal/handler/document_test.go` | 5 handler tests |
| `internal/router/router.go` | Add `POST /:dataset_id/documents/stop` route |

### How it works
1. Validates all document IDs belong to the dataset
2. For each document in RUNNING/CANCEL state (or with unfinished tasks):
   - Sets Redis cancel signal `{task_id}-cancel` for each associated task
   - Updates `document.run` to CANCEL ("2")
3. Returns `{"success_count": N, "errors": [...]}`
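
The three steps above can be sketched with in-memory maps standing in for the database and Redis. Everything here is illustrative: `stopParse`, the `doc` and `task` types, the `"1"` value for RUNNING, the choice to abort when any ID fails validation, and the error text for documents with nothing to stop are assumptions, not the behavior of `StopParseDocuments` itself.

```go
package main

import "fmt"

// CANCEL = "2" comes from the description above; RUNNING = "1" is an
// assumption made for this sketch.
const (
	runRunning = "1"
	runCancel  = "2"
)

type task struct {
	ID   string
	Done bool
}

type doc struct {
	DatasetID string
	Run       string
	Tasks     []task
}

type stopResult struct {
	SuccessCount int      `json:"success_count"`
	Errors       []string `json:"errors"`
}

func hasUnfinished(tasks []task) bool {
	for _, t := range tasks {
		if !t.Done {
			return true
		}
	}
	return false
}

// stopParse is a hypothetical helper. `docs` stands in for the document
// table and `signals` for Redis keys.
func stopParse(datasetID string, ids []string, docs map[string]*doc, signals map[string]bool) stopResult {
	res := stopResult{Errors: []string{}}
	// Step 1: every ID must belong to the dataset (assumed to abort otherwise).
	for _, id := range ids {
		d, ok := docs[id]
		if !ok || d.DatasetID != datasetID {
			res.Errors = append(res.Errors, fmt.Sprintf("document %s does not belong to dataset %s", id, datasetID))
		}
	}
	if len(res.Errors) > 0 {
		return res
	}
	// Step 2: signal cancellation per task, then mark the document CANCEL.
	for _, id := range ids {
		d := docs[id]
		if d.Run != runRunning && d.Run != runCancel && !hasUnfinished(d.Tasks) {
			res.Errors = append(res.Errors, fmt.Sprintf("document %s has nothing to stop", id))
			continue
		}
		for _, t := range d.Tasks {
			signals[t.ID+"-cancel"] = true
		}
		d.Run = runCancel
		res.SuccessCount++
	}
	// Step 3: the caller serializes this as {"success_count": N, "errors": [...]}.
	return res
}

func main() {
	docs := map[string]*doc{
		"a": {DatasetID: "ds1", Run: runRunning, Tasks: []task{{ID: "t1"}, {ID: "t2"}}},
		"b": {DatasetID: "ds1", Run: "3", Tasks: []task{{ID: "t3", Done: true}}},
	}
	signals := map[string]bool{}
	res := stopParse("ds1", []string{"a", "b"}, docs, signals)
	fmt.Println(res.SuccessCount, len(res.Errors), signals["t1-cancel"], docs["a"].Run) // prints 1 1 true 2
}
```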

### Test strategy
- **DAO/Service**: SQLite in-memory DB, zero mocks. Redis is nil-safe by
design.
- **Handler**: `fakeDocumentService` implementing `documentServiceIface`
interface.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-06-04 14:16:13 +08:00
Jack
67c3e73d70 feat: migrate DELETE /api/v1/datasets/:dataset_id/documents to Go (#15577)
## Summary

Migrate the batch document deletion endpoint from Python to Go. Two
modes supported: explicit `ids` list and `delete_all`.
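
A sketch of how the two modes might be resolved into a target list. The `ids` and `delete_all` field names come from the summary above; the `deleteRequest` struct, the `resolveTargets` helper, the precedence of `delete_all` over `ids`, and the error for an empty request are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// deleteRequest is a hypothetical request body for this sketch.
type deleteRequest struct {
	IDs       []string `json:"ids"`
	DeleteAll bool     `json:"delete_all"`
}

// resolveTargets returns the document IDs to delete. It assumes delete_all
// wins when both fields are set and that an empty request is an error.
func resolveTargets(req deleteRequest, datasetDocIDs []string) ([]string, error) {
	if req.DeleteAll {
		return datasetDocIDs, nil
	}
	if len(req.IDs) == 0 {
		return nil, errors.New("either ids or delete_all must be provided")
	}
	return req.IDs, nil
}

func main() {
	var req deleteRequest
	if err := json.Unmarshal([]byte(`{"ids": ["doc1", "doc2"]}`), &req); err != nil {
		panic(err)
	}
	targets, err := resolveTargets(req, []string{"doc1", "doc2", "doc3"})
	fmt.Println(targets, err) // prints [doc1 doc2] <nil>
}
```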

## Changes

| File | Change |
|------|--------|
| `internal/dao/file2document.go` | Add `GetByDocumentID`, `DeleteByDocumentID` |
| `internal/dao/file2document_test.go` | 5 new tests |
| `internal/dao/kb_test.go` | 2 new tests (`DecreaseDocumentNum`) |
| `internal/service/document.go` | Add `deleteDocumentFull` + `DeleteDocuments`, refactor `DeleteDocument` |
| `internal/service/document_test.go` | 10 new tests |
| `internal/handler/document.go` | Add `documentServiceIface` + `DeleteDocuments` handler |
| `internal/handler/document_test.go` | 7 new tests |
| `internal/router/router.go` | Register `DELETE /:dataset_id/documents` |
| `cmd/server_main.go` | Support `RAGFLOW_DICT_PATH` env var |
| `internal/binding/rag_analyzer.go` | Use `-lpcre2-8` dynamic linking |
| `internal/dao/database.go` | Skip Error 1091/1138 during migration |
| `internal/service/llm.go` | Fix vet warning |

## Per-document cleanup

- Delete tasks from DB
- Hard-delete document + decrement KB counters
- Delete chunks from document engine (nil-guarded)
- Delete metadata from document engine (nil-guarded)
- Remove file2document mapping + file record + storage blob
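
The cleanup order and the nil guard around the document engine can be sketched as below. The `docEngine` interface, `recordingEngine`, `kb`, and `cleanupDocument` are hypothetical names for illustration; the real `deleteDocumentFull` works against the DAO layer and storage, and its error handling is not shown in this description.

```go
package main

import "fmt"

// docEngine is a hypothetical interface for the document engine.
type docEngine interface {
	DeleteChunks(docID string) error
	DeleteMetadata(docID string) error
}

// recordingEngine records calls so the sketch can be inspected.
type recordingEngine struct{ calls []string }

func (e *recordingEngine) DeleteChunks(docID string) error {
	e.calls = append(e.calls, "chunks:"+docID)
	return nil
}

func (e *recordingEngine) DeleteMetadata(docID string) error {
	e.calls = append(e.calls, "metadata:"+docID)
	return nil
}

// kb stands in for the knowledge-base row holding the document counter.
type kb struct{ DocNum int }

// cleanupDocument walks the per-document steps in the order listed above and
// returns the steps it performed. Database and storage work is only named,
// not executed, in this sketch.
func cleanupDocument(k *kb, engine docEngine, docID string) ([]string, error) {
	steps := []string{"delete tasks", "hard-delete document"}
	k.DocNum-- // decrement KB counters
	if engine != nil { // nil guard: skip engine cleanup when none is configured
		if err := engine.DeleteChunks(docID); err != nil {
			return steps, err
		}
		steps = append(steps, "delete chunks")
		if err := engine.DeleteMetadata(docID); err != nil {
			return steps, err
		}
		steps = append(steps, "delete metadata")
	}
	steps = append(steps, "remove file2document mapping, file record, storage blob")
	return steps, nil
}

func main() {
	k := &kb{DocNum: 2}
	steps, err := cleanupDocument(k, nil, "doc1")
	fmt.Println(len(steps), k.DocNum, err) // prints 3 1 <nil>
}
```

Note that the guard compares the interface value itself against nil, so callers in this sketch should pass a literal `nil` rather than a nil pointer of a concrete engine type.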

## Test Results

**24 unit tests all passing** (7 DAO + 10 service + 7 handler) using
SQLite :memory: + gin.TestMode.

See [test report](docs/test_report_delete_documents.md) for manual
integration test results.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 20:55:53 +08:00