163 Commits

Author SHA1 Message Date
Yingfeng
b5bea72e4b Add git-like file commit API (#15978)
### What problem does this PR solve?

| # | Method | Endpoint | Description | Git Equivalent |
|---|--------|----------|-------------|----------------|
| 1 | `POST` | `/api/v1/{prefix}/{folder_id}/commits` | Create a snapshot commit with file changes (add/modify/delete/rename) | `git add` + `git commit` |
| 2 | `GET` | `/api/v1/{prefix}/{folder_id}/commits` | List commit history (paginated) | `git log` |
| 3 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}` | Get commit detail with file changes | `git show` |
| 4 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files` | List file changes in a commit | `git show --name-status` |
| 5 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/diff?from=...&to=...` | Compare two commits and return differences | `git diff` |
| 6 | `GET` | `/api/v1/{prefix}/{folder_id}/changes` | Get uncommitted changes (add/modify/delete) | `git status` |
| 7 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/tree` | Get the folder tree snapshot at commit time | `git ls-tree` |
| 8 | `GET` | `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files/{file_id}/content` | Get a file's content as it existed in a specific commit | `git show HEAD:file` |
| 9 | `GET` | `/api/v1/{prefix}/{file_id}/versions` | Get version history for a specific file across all commits | `git log -- file` |

Where `{prefix}/{id}` can be:
- `folders/{folder_id}` — direct folder access
- `workspaces/{workspace_id}` — alias of `folders/{folder_id}`
- `datasets/{dataset_id}` — resolves to the dataset's folder
- `memories/{memory_id}` — resolves to the memory's folder
- `skills/{skill_id}` — resolves to the skill's folder
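
As a rough illustration of how a client might address endpoint 5 (the commit diff), the sketch below assembles the URL from a prefix, an ID, and two commit IDs. The base URL, the helper name `commitDiffURL`, and the sample IDs are assumptions made for the example; only the path shape comes from the table above.

```go
package main

import (
	"fmt"
	"net/url"
)

// commitDiffURL builds the URL for the "compare two commits" endpoint.
// base, prefix and id are illustrative inputs, e.g. "folders" plus a folder ID.
func commitDiffURL(base, prefix, id, from, to string) string {
	q := url.Values{}
	q.Set("from", from)
	q.Set("to", to)
	// Encode() sorts keys, so "from" is emitted before "to".
	return fmt.Sprintf("%s/api/v1/%s/%s/commits/diff?%s",
		base, url.PathEscape(prefix), url.PathEscape(id), q.Encode())
}

func main() {
	// Hypothetical host and IDs, for demonstration only.
	fmt.Println(commitDiffURL("http://localhost", "folders", "f1", "c1", "c2"))
}
```

Swapping `"folders"` for `"datasets"`, `"memories"`, or `"skills"` would address the same endpoint through one of the other prefixes listed above.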

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2026-06-15 11:19:56 +08:00
Jin Hai
32d5c0039b Go: refactor model API to accept model id (#15999)
### What problem does this PR solve?

The model API now accepts `model_id` in addition to
`model_name@instance_name@provider_name`.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-15 10:10:14 +08:00
Zhichang Yu
3fa15c0e2f feat(agent): Go port — canvas engine, 22 components, DSL v2, 13 endpoints (#15952)
Ports the agent canvas subsystem from Python to Go.

## What's included

### Canvas Engine (Phase 0/1)
- State engine, scheduler, variable resolver, Redis checkpoint store,
cancel protocol
- **209 tests** across canvas / component / io packages

### 22 Components (P0–P4)
| Tier | Components |
|---|---|
| P0 T1+T2+T3 | LLM, Agent, ExitLoop, Switch, Categorize, Begin, Message, Invoke |
| P1 T3 | VariableAggregator, VariableAssigner, StringTransform, ListOperations, DataOperations |
| P2 T3 | Iteration, IterationItem, Loop, LoopItem |
| P3 T3 | UserFillUp, Fillup |
| P4 T5 | Browser, ExcelProcessor, DocsGenerator |

### DSL v2 Schema (Phase 2.5)
- Typed v2 in-memory model with v1-to-v2 auto-detect converter
- v1 legacy field stripping per plan §2.11.7

### HTTP Endpoints & Bug Fixes (Plans PR1–PR3)
- **DELETE SQL bug fix**: gorm v2 `Where("id = ?", id).Delete(...)`
pattern
- **CreateAgent validation**: title/DSL required, duplicate check, 103
envelope
- **13 new endpoints**: templates, prompts, tags, sessions CRUD,
chat/completions (SSE + non-stream stubs), rerun, test_db_connection,
logs, webhook/logs
- **756 Go unit tests** (745 → 756, +18)
- **17 → 0 Python integration test failures** (test_agents.py +
test_session_management/)

### Tools
21 eino tools: HTTPHelper, search tools, financial/data tools, mandatory
stubs

### Infrastructure
OTel observability, NATS message queue, DeepDoc gRPC client, SSRF
guards, IDOR mitigation
2026-06-12 22:58:28 +08:00
Jin Hai
234f1b7cff Go: add office_oxide and parse docx file. (#15976)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-12 20:28:15 +08:00
Yingfeng
5a7d7771a3 Decouple skill space from Python API (#15971)
### What problem does this PR solve?

Make skill space independent of Python filesystem API

### Type of change

- [x] Refactoring
2026-06-12 18:18:55 +08:00
Jin Hai
e96bc37d06 Go: use NATS as the message queue (#15327)
### What problem does this PR solve?

```
RAGFlow(admin)> mq publish 'msg2';
SUCCESS
RAGFlow(admin)> mq publish 'msg3';
SUCCESS
RAGFlow(admin)> mq list;
+---------+---------------+
| message | subject       |
+---------+---------------+
| msg1    | tasks.RAGFLOW |
| msg2    | tasks.RAGFLOW |
| msg3    | tasks.RAGFLOW |
+---------+---------------+
RAGFlow(admin)> mq pull 2;
+---------+---------------+
| message | subject       |
+---------+---------------+
| msg1    | tasks.RAGFLOW |
| msg2    | tasks.RAGFLOW |
+---------+---------------+
RAGFlow(admin)> mq pull noack;
+---------+---------------+
| message | subject       |
+---------+---------------+
| abc     | tasks.RAGFLOW |
+---------+---------------+
RAGFlow(admin)> mq show
+-------------------+----------------+--------+---------------+---------------+-------------------+---------------+
| ack_pending_count | consumer_count | memory | message_count | pending_count | redelivered_count | waiting_count |
+-------------------+----------------+--------+---------------+---------------+-------------------+---------------+
| 2                 | 1              | 0      | 2             | 0             | 1                 | 0             |
+-------------------+----------------+--------+---------------+---------------+-------------------+---------------+

RAGFlow(admin)> list ingestors;
+--------------+-------------------------------------------+--------+
| host         | name                                      | status |
+--------------+-------------------------------------------+--------+
| 192.168.1.38 | ingestor-8f0e4bd5650a4ac58b0151969fbf6935 | alive  |
+--------------+-------------------------------------------+--------+

RAGFlow(admin)> list ingestion tasks;
+----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+
| document_id                      | id                               | status    | step | user        | user_id                          |
+----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+
| ffe64fae423411f1a2d938a74640adcc | 90d3d0f6528941c1ac8eb0360effccc4 | COMPLETED | 5    | aaa@aaa.com | 2ba4881420fa11f19e9c38a74640adcc |
+----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+

RAGFlow(admin)> remove ingestion tasks '90d3d0f6528941c1ac8eb0360effccc4';
+---------+----------------------------------+
| delete  | task_id                          |
+---------+----------------------------------+
| success | 90d3d0f6528941c1ac8eb0360effccc4 |
+---------+----------------------------------+

RAGFlow(admin)> stop ingestion tasks 'e89e20d9a25848a1b79bd9345ddbfe1d';
+----------+----------------------------------+
| status   | task_id                          |
+----------+----------------------------------+
| STOPPING | e89e20d9a25848a1b79bd9345ddbfe1d |
+----------+----------------------------------+

# Publish a message
RAGFlow(admin)> mq publish 'cdd';
SUCCESS

# List current tasks in the message queue
RAGFlow(admin)> mq list
+----------------------------------+---------------+
| message                          | subject       |
+----------------------------------+---------------+
| 7ce392a3c1624cd2be4b5276e8825059 | tasks.RAGFLOW |
+----------------------------------+---------------+

# Consume a task from the message queue
RAGFlow(admin)> mq pull
+------+-----+----------------+
| ack  | id  | type           |
+------+-----+----------------+
| true | cdd | ingestion_test |
+------+-----+----------------+

# User mode
# List ingestion tasks, followed by dataset id
RAGFlow(user)> list ingestion tasks from '0abe79f9423311f1ad8d38a74640adcc';
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+
| create_date               | create_time   | dataset_id                       | document_id                      | id                               | schema | status    | update_date               | update_time   | user_id                          |
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+
| 2026-05-30T20:21:06+08:00 | 1780143666289 | 0abe79f9423311f1ad8d38a74640adcc | ffe64fae423411f1a2d938a74640adcc | 8d758cd14a8b4ba8ab505003fb52017d |        | COMPLETED | 2026-05-30T20:21:26+08:00 | 1780143686431 | 2ba4881420fa11f19e9c38a74640adcc |
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+

RAGFlow(user)> list ingestion tasks;
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+
| create_date               | create_time   | dataset_id                       | document_id                      | id                               | schema | status    | update_date               | update_time   | user_id                          |
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+
| 2026-06-02T19:02:31+08:00 | 1780398151417 | 0abe79f9423311f1ad8d38a74640adcc | ffe64fae423411f1a2d938a74640adcc | e89e20d9a25848a1b79bd9345ddbfe1d |        | COMPLETED | 2026-06-02T19:02:52+08:00 | 1780398172208 | 2ba4881420fa11f19e9c38a74640adcc |
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+

# Create an ingestion task
# First argument is document id, second argument is dataset id
RAGFlow(user)> start ingestion 'ffe64fae423411f1a2d938a74640adcc' from '0abe79f9423311f1ad8d38a74640adcc';
+----------------------------------+-------------------------------------------+
| document_id                      | result                                    |
+----------------------------------+-------------------------------------------+
| ffe64fae423411f1a2d938a74640adcc | task_id: 8d758cd14a8b4ba8ab505003fb52017d |
+----------------------------------+-------------------------------------------+

# Pause an ingestion task, first argument is ingestion id
RAGFlow(user)> stop ingestion '8d758cd14a8b4ba8ab505003fb52017d';
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+
| create_date               | create_time   | dataset_id                       | document_id                      | id                               | schema | status    | update_date               | update_time   | user_id                          |
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+
| 2026-05-30T20:21:06+08:00 | 1780143666289 | 0abe79f9423311f1ad8d38a74640adcc | ffe64fae423411f1a2d938a74640adcc | 8d758cd14a8b4ba8ab505003fb52017d |        | COMPLETED | 2026-05-30T20:21:26+08:00 | 1780143686431 | 2ba4881420fa11f19e9c38a74640adcc |
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+

# Delete an ingestion task
RAGFlow(api/default)> remove ingestion tasks 'f366450a27d54677aec1c7090add30f0';
+---------+----------------------------------+
| remove  | task_id                          |
+---------+----------------------------------+
| success | f366450a27d54677aec1c7090add30f0 |
+---------+----------------------------------+

```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-12 14:56:44 +08:00
Haruko386
9c30557ef7 Go: add dimensions for list models and fix some embed-bug in providers (#15940)
### What problem does this PR solve?

As title

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2026-06-11 19:18:49 +08:00
Hz_
312514c032 feat(go): Add embedding dimension metadata and validation (#15939)
### What problem does this PR solve?

- Replace embedding model `dimension` metadata with `max_dimension`.
- Add optional `dimensions` metadata for models with fixed selectable
output dimensions.
- Include `max_dimension` and `dimensions` in model list responses.
- Validate requested embedding dimensions before calling provider
embedding APIs.
- Forward SiliconFlow embedding dimensions with the correct `dimensions`
request field.
- Add unit coverage for embedding dimension validation rules.
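
The validation step described above could look roughly like the following. This is a minimal sketch under assumed rules, not the PR's code: a request of `0` means "use the model default", a non-empty `dimensions` list acts as an allow-list, and otherwise `max_dimension` is an upper bound. The type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// EmbeddingMeta mirrors the two metadata fields named in the PR description.
// The struct itself is a hypothetical stand-in for the real model metadata.
type EmbeddingMeta struct {
	MaxDimension int   // max_dimension; 0 means unknown
	Dimensions   []int // dimensions; fixed selectable sizes, may be empty
}

// validateDimension checks a requested output size before any provider call.
func validateDimension(meta EmbeddingMeta, requested int) error {
	if requested == 0 {
		return nil // assumption: 0 means the caller accepts the default size
	}
	if requested < 0 {
		return errors.New("dimension must be positive")
	}
	if len(meta.Dimensions) > 0 {
		for _, d := range meta.Dimensions {
			if d == requested {
				return nil
			}
		}
		return fmt.Errorf("dimension %d not in supported set %v", requested, meta.Dimensions)
	}
	if meta.MaxDimension > 0 && requested > meta.MaxDimension {
		return fmt.Errorf("dimension %d exceeds max_dimension %d", requested, meta.MaxDimension)
	}
	return nil
}

func main() {
	meta := EmbeddingMeta{MaxDimension: 1024, Dimensions: []int{256, 512, 1024}}
	fmt.Println(validateDimension(meta, 512)) // accepted
	fmt.Println(validateDimension(meta, 768)) // rejected: not in the fixed set
}
```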
2026-06-11 17:55:13 +08:00
Rene Arredondo
bf59eb77cc feat(go-api): port forgot-password flow to Go (#15282) (#15290)
## Summary

Implements **chunk 1** of #15282 — the four `/api/v1/auth/password/...`
endpoints from the login-page Go port. **Chunk 2 (OAuth/OIDC) is
deferred** to its own subtask, matching the issue author's own
confidence-low recommendation ("multi-provider, stateful redirect flow
with external dependencies; recommend its own subtask").

New endpoints, all registered under `apiNoAuth` (forgot-password users
are unauthenticated by definition):

| Method | Path | Status |
|--------|------|--------|
| `POST` | `/api/v1/auth/password/forgot/captcha` | new |
| `POST` | `/api/v1/auth/password/forgot/otp` | new |
| `POST` | `/api/v1/auth/password/forgot/otp/verify` | new |
| `POST` | `/api/v1/auth/password/reset` | new |

## Wire compatibility with the Python backend

The two backends share state through Redis, so the Go port had to use
identical keys, encodings, and constants. Either backend can now
validate a code the other minted.

- **Redis keys**: `captcha:<email>`, `otp:<email>`,
`otp_attempts:<email>`, `otp_last_sent:<email>`, `otp_lock:<email>`,
`otp:verified:<email>` — same as `api/utils/web_utils.py`.
- **Stored OTP value**: `"<hex_hash>:<hex_salt>"` — same as Python.
- **Hash**: HMAC-SHA256 with a `crypto/rand` 16-byte salt — same as
`hash_code()`.
- **Constants**: `OTP_LENGTH=4`, `OTP_TTL=5min`, `ATTEMPT_LIMIT=5`,
`ATTEMPT_LOCK_SECONDS=30min`, `RESEND_COOLDOWN_SECONDS=60s` — all match
`api/utils/web_utils.py`.
- **Email body**: matches `RESET_CODE_EMAIL_TMPL` byte-for-byte.
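
To make the stored-value shape concrete, here is a small sketch of hashing an OTP and round-tripping the `"<hex_hash>:<hex_salt>"` string. It assumes the salt is used as the HMAC key and the code as the message; the description does not spell that out, so treat the keying as a guess. The function names are illustrative and are not the helpers in `internal/utility/otp.go`.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// hashOTP returns hex(HMAC-SHA256(key=salt, msg=code)).
// Assumption: the real hash_code() may key the HMAC differently.
func hashOTP(code string, salt []byte) string {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(code))
	return hex.EncodeToString(mac.Sum(nil))
}

// encodeStored produces the "<hex_hash>:<hex_salt>" value kept in Redis.
func encodeStored(hash string, salt []byte) string {
	return hash + ":" + hex.EncodeToString(salt)
}

// decodeStored splits a stored value back into its hash and raw salt.
func decodeStored(v string) (string, []byte, error) {
	parts := strings.SplitN(v, ":", 2)
	if len(parts) != 2 {
		return "", nil, errors.New("malformed stored OTP value")
	}
	salt, err := hex.DecodeString(parts[1])
	if err != nil {
		return "", nil, err
	}
	return parts[0], salt, nil
}

// verifyOTP recomputes the hash with the stored salt and compares in constant time.
func verifyOTP(stored, candidate string) bool {
	hash, salt, err := decodeStored(stored)
	if err != nil {
		return false
	}
	return hmac.Equal([]byte(hash), []byte(hashOTP(candidate, salt)))
}

func main() {
	salt := make([]byte, 16) // 16-byte salt from crypto/rand, as in the description
	if _, err := rand.Read(salt); err != nil {
		panic(err)
	}
	stored := encodeStored(hashOTP("1234", salt), salt)
	fmt.Println(verifyOTP(stored, "1234"), verifyOTP(stored, "0000"))
}
```

Because the salt travels with the hash, either backend can verify a code the other minted as long as both agree on the hash construction and the key names.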

## Files

### New

| File | Purpose |
|---|---|
| `internal/utility/otp.go` | OTP/captcha constants, Redis key builders (`CaptchaRedisKey`, `OTPRedisKeys`, `OTPVerifiedRedisKey`), `HashOTPCode`, `GenerateOTPCode` / `GenerateCaptchaCode` / `GenerateOTPSalt` via `crypto/rand`, and `EncodeOTPStorageValue` / `DecodeOTPStorageValue` matching Python's storage shape. |
| `internal/utility/smtp.go` | Minimal stdlib `net/smtp` sender. `SendResetCodeEmail(to, otp, ttlMin)` builds an RFC 5322 plain-text message and dispatches via implicit TLS / STARTTLS / plain — same selectors as Python `aiosmtplib`. Returns `SMTPNotConfiguredError` if the config block is empty. |

### Modified

| File | Change |
|---|---|
| `internal/server/config.go` | New `SMTPConfig` struct + `Config.SMTP` field. Field names mirror the `smtp:` keys in `common/settings.py` (`mail_server`, `mail_port`, `mail_use_ssl`, `mail_use_tls`, `mail_username`, `mail_password`, `mail_from_name`, `mail_from_address`, `mail_frontend_url`) so a single `conf/service_conf.yaml` powers both backends. |
| `internal/service/user.go` | Four methods — `ForgotIssueCaptcha`, `ForgotSendOTP`, `ForgotVerifyOTP`, `ForgotResetPassword`. Reuses the existing `decryptPassword`, `HashPassword`, `userDAO.Update`, and `utility.GenerateToken` so the reset+auto-login path is identical to `LoginByEmail`. |
| `internal/handler/user.go` | Four handlers in the same `c.JSON` shape as `LoginByEmail`. The reset handler rotates the access token and emits an `Authorization` header for auto-login (matches Python `construct_response(auth=user.get_id())`). |
| `internal/router/router.go` | Routes registered under `apiNoAuth`, with an explanatory comment on why they sit outside the auth middleware. |

## Known divergence — captcha rendering

The Python endpoint returns a rendered `image/JPEG` from the
`python-captcha` library. The Go side has **no image-captcha dependency
vendored** in `go.mod`, and hand-rolling a raster generator was out of
scope for this PR.

This commit returns JSON `{captcha: "<text>"}` instead. Implications:

- **Backend gate is identical** — the OTP step still verifies the
user-submitted captcha string against the Redis value, so the security
model is unchanged.
- **Frontend impact**: the password-reset page rendering needs a small
tweak (text display instead of `<img>`) until a Go captcha library is
wired in.
- The handler comments call this out explicitly so the next PR knows
what to swap.

Possible follow-ups (any one closes the gap):
1. Add `github.com/mojocn/base64Captcha` or `github.com/dchest/captcha`
to `go.mod` and replace the JSON response with an `image/JPEG`.
2. Hand-roll a 5x7 bitmap font + `image/png` writer using only the
stdlib.
3. Render a server-side SVG (cheap, but trivially OCR-able — only useful
as a UI shim).

## Test plan

- [ ] **Captcha**: `POST
/api/v1/auth/password/forgot/captcha?email=<existing>` returns `{code:
0, data: {captcha: "ABCD"}}`. Redis shows `captcha:<email>` with that
value and ~60s TTL. Unknown email returns `code: CodeDataError`.
- [ ] **OTP send**: `POST /api/v1/auth/password/forgot/otp` with the
right captcha mints an OTP, stores `<hash>:<salt>` under `otp:<email>`
for 5 min, sends an email, returns success. With a wrong captcha returns
`CodeAuthenticationError`. Hitting it again within 60s returns "you
still have to wait …" with `CodeNotEffective`.
- [ ] **OTP verify**: correct OTP → `code: 0`, OTP keys cleared,
`otp:verified:<email>` = `"1"`. Wrong OTP → `code:
CodeAuthenticationError`, attempt counter bumped; after 5 wrong tries
`otp_lock:<email>` is set and further attempts hit `CodeNotEffective`.
- [ ] **Reset**: with the verified flag set, supply a new password
(RSA-encrypted+base64, same as `LoginByEmail`). Returns `code: 0`,
`Authorization` header set, verified flag deleted. Without the verified
flag returns `CodeAuthenticationError`.
- [ ] **Wire-compat smoke**: mint an OTP from the Python backend, verify
it via the Go endpoint, and vice versa. Should both succeed.
- [ ] **SMTP misconfigured**: drop `smtp.mail_server` from
`conf/service_conf.yaml`. The OTP-send endpoint should now return
"failed to send email" without panicking; check the log for the
`SMTPNotConfiguredError` warning.
- [ ] **End-to-end FE**: hit the password-reset flow from
`web/src/pages/login-next/`. Confirm the text-captcha shim works after
the FE tweak.
- [ ] `go build ./...` and `go vet ./...` — I could not run these in the
sandbox; please confirm a clean build before merging.
- [ ] `uv run pytest` to confirm no Python regressions (shared Redis
schema).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-10 21:27:56 +08:00
Jonathan Chang
dfcf226ba3 feat: Implement API of ragflow server in Go (#15256)
## Summary
- Implemented the Go API endpoint for Memory message forgetting:
  - `DELETE /api/v1/messages/{memory_id}:{message_id}`
- Added route registration for the Memory message DELETE endpoint only.
- Added request path validation for `memory_id:message_id`.
- Added service logic to mark a message as forgotten by setting
`forget_at`.
- Preserved Python-compatible response behavior:
  - Success returns `code: 0`, `message: true`, `data: null`.
- Added focused unit tests for message path parsing and invalid message
ID handling.
- Fixed Linux cgo linker config to use the installed shared PCRE2
library so Go tests/builds can run in this environment.
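
A path-validation step for the combined `memory_id:message_id` segment might be sketched as follows. This is only an illustration: it assumes both parts must be non-empty and that exactly one colon is allowed, and the function name `parseMessagePath` is hypothetical. Any further checks the PR applies to the message ID are not shown.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseMessagePath splits "memory_id:message_id" into its two parts.
// Assumption: exactly one colon, and neither side may be empty.
func parseMessagePath(segment string) (memoryID, messageID string, err error) {
	parts := strings.Split(segment, ":")
	if len(parts) != 2 {
		return "", "", errors.New("expected memory_id:message_id")
	}
	if parts[0] == "" || parts[1] == "" {
		return "", "", errors.New("memory_id and message_id must be non-empty")
	}
	return parts[0], parts[1], nil
}

func main() {
	mem, msg, err := parseMessagePath("mem1:42")
	fmt.Println(mem, msg, err)

	_, _, err = parseMessagePath("mem1")
	fmt.Println(err)
}
```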
## Related Issue
Closes: #15240 
## Change Type
- [x] Feature
- [x] Test
- [x] Build / CI compatibility

## Implemented API
- `DELETE /api/v1/messages/{memory_id}:{message_id}`
## Real Behavior Proof
Validated with targeted Go tests:
```bash
/tmp/go1.25.0/bin/go test ./internal/handler ./internal/router
```
Result:
```text
ok  	ragflow/internal/handler
?   	ragflow/internal/router	[no test files]
```
Validated server entrypoint build:
```bash
/tmp/go1.25.0/bin/go build -o /tmp/ragflow-server-main ./cmd/server_main.go
```

Result:
```text
build succeeded
```
Validated patch formatting:
```bash
git diff --check
```

Result:

```text
no whitespace errors
```
## Checklist
- [x] Implemented only `DELETE
/api/v1/messages/{memory_id}:{message_id}`.
- [x] Did not implement unrelated Memory message APIs.
- [x] Added route registration.
- [x] Added handler validation.
- [x] Added service-level memory access check.
- [x] Added tests.
- [x] Ran targeted Go tests.
- [x] Ran server build validation.
- [x] Ran `git diff --check`.
2026-06-10 21:27:35 +08:00
Jin Hai
3e4fb8cf1c Go: fix test and remove unused code (#15909)
### What problem does this PR solve?

1. Fix Go tests; some cases were still failing.
2. Remove unused code.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-10 20:38:43 +08:00
Hunnyboy1217
16d5b4fa02 feat[Go]: implement POST /api/v1/files/link-to-datasets (#15674)
### What problem does this PR solve?

Closes #15673 — ports the Python `file2document_api.py` `convert()`
endpoint to Go.

| Method | Path | Handler |
|--------|------|---------|
| POST | `/api/v1/files/link-to-datasets` | `FileHandler.LinkToDatasets` |

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---

#### Implementation notes

**Files changed:**

```
internal/service/file2document.go  – new service (File2DocumentService)
internal/dao/file2document.go      – added Create method
internal/handler/file.go           – FileHandler gains file2DocumentService;
                                     LinkToDatasets HTTP handler
internal/router/router.go          – route registered
```

**Functional parity table:**

| Concern | Go behaviour |
|---------|-------------|
| Required fields | `file_ids` and `kb_ids` both required; missing either → `CodeDataError` mirroring Python `@validate_request` |
| File existence | `fileDAO.GetByIDs(fileIDs)` builds a set; any missing ID → `"File not found!"` |
| KB existence | `kbDAO.GetByID(kbID)` per KB; missing → `"Can't find this dataset!"` |
| Folder expansion | `getAllInnermostFileIDs` recursively calls `fileDAO.ListByParentID` — mirrors `FileService.get_all_innermost_file_ids` |
| File permissions | `checkFileTeamPermission`: `file.TenantID == userID` OR user in tenant's team — mirrors `check_file_team_permission` |
| KB permissions | `checkKBTeamPermission`: `kb.TenantID == userID` OR user in tenant's team — mirrors `check_kb_team_permission` |
| Fire-and-forget | `go convertFiles(...)` goroutine after all validation passes — mirrors `loop.run_in_executor(None, _convert_files, …)` |
| Conversion | `convertFiles`: for each file → delete existing mappings + hard-delete old documents → create new `Document` in each target KB → create `File2Document` mapping — mirrors Python `_convert_files` |
| `getParser` | Extension-based lookup with fallback to `kb.ParserID` — mirrors `FileService.get_parser` |
| Immediate return | `true` returned to caller as soon as goroutine is scheduled |
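
The folder-expansion row can be pictured with a short recursive sketch. The `File` type, the `listFn` callback (standing in for a call such as `fileDAO.ListByParentID`), and the function name are all assumptions for illustration; the real DAO signature is not shown in this description.

```go
package main

import "fmt"

// File is a hypothetical, trimmed-down file record.
type File struct {
	ID       string
	IsFolder bool
}

// listFn stands in for a DAO lookup that returns the children of a folder.
type listFn func(parentID string) []File

// innermostFileIDs returns the IDs of all non-folder files at or below f.
func innermostFileIDs(f File, list listFn) []string {
	if !f.IsFolder {
		return []string{f.ID}
	}
	var ids []string
	for _, child := range list(f.ID) {
		ids = append(ids, innermostFileIDs(child, list)...)
	}
	return ids
}

func main() {
	// In-memory tree used in place of a database.
	tree := map[string][]File{
		"root": {{ID: "a"}, {ID: "sub", IsFolder: true}},
		"sub":  {{ID: "b"}},
	}
	root := File{ID: "root", IsFolder: true}
	fmt.Println(innermostFileIDs(root, func(p string) []File { return tree[p] }))
}
```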

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-06-10 16:46:55 +08:00
Hz_
3796835c4d feat(go-api): migrate agent file download handler to Go with strict P… (#15769)
## What does this PR do?

This PR migrates the Agent Temporary File Download endpoint (`GET
/api/v1/agents/download`) from the Python backend to the Go backend,
optimizing the data retrieval flow and maintaining strict functional
parity. It also fixes a persistent parsing error in the Sandbox code
execution node.

## Checklist
- [x] Code logic matches Python implementation
- [x] All local unit tests passed
- [x] No breaking changes to existing router interfaces

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-06-10 16:09:36 +08:00
oktofeesh
bbc1f2ecec feat(go-api): add RAG retrieval to chat completions (#15739)
## Summary
- Add knowledge-base retrieval support to Go chat completions.

## What changed
- Routes KB-backed chat sessions through the Go retrieval service
instead of falling back to solo chat.
- Resolves embedding and rerank models, validates accessible knowledge
bases, and preserves tenant-aware retrieval.
- Rejects mixed embedding models across selected knowledge bases before
retrieval to avoid incompatible vector dimensions.
- Threads the HTTP request context into streaming retrieval so cancelled
requests can stop downstream retrieval work.
- Applies metadata filters and message-level `doc_ids` before retrieval.
- Expands parent/child chunks before building references and prompt
context.
- Injects retrieved knowledge through a copied dialog prompt config so
the caller's original dialog is not mutated.
- Honors configured empty responses when no chunks are found.
- Names the metadata no-match sentinel and reuses it across
retrieval/handler paths.
- Adds a defensive content cast while appending streamed answers.
- Adds focused unit coverage for retrieval, metadata filtering,
authorization, multimodal messages, references, empty-response behavior,
prompt immutability, and mixed embedding models.
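
The mixed-embedding guard mentioned above might be sketched like this. The `KB` struct and `singleEmbeddingModel` are hypothetical names; the sketch only shows the idea of refusing retrieval when the selected knowledge bases do not share one embedding model.

```go
package main

import (
	"errors"
	"fmt"
)

// KB is a hypothetical, minimal view of a knowledge base.
type KB struct {
	ID             string
	EmbeddingModel string
}

// singleEmbeddingModel returns the shared embedding model, or an error if the
// knowledge bases disagree (their vectors would not be comparable).
func singleEmbeddingModel(kbs []KB) (string, error) {
	if len(kbs) == 0 {
		return "", errors.New("no knowledge bases selected")
	}
	model := kbs[0].EmbeddingModel
	for _, kb := range kbs[1:] {
		if kb.EmbeddingModel != model {
			return "", fmt.Errorf("knowledge base %s uses %q, expected %q",
				kb.ID, kb.EmbeddingModel, model)
		}
	}
	return model, nil
}

func main() {
	model, err := singleEmbeddingModel([]KB{
		{ID: "kb1", EmbeddingModel: "bge-m3"},
		{ID: "kb2", EmbeddingModel: "bge-m3"},
	})
	fmt.Println(model, err)
}
```

Running the check before retrieval keeps the failure cheap: no embedding or search call is made for a request that cannot succeed.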

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 11:07:45 +08:00
Haruko386
d56aeb2f5d feat[Go]: api datasets/<dataset_id>/documents/<document_id>/metadata/… (#15846)
### What problem does this PR solve?

As title

```
/api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config PUT
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-06-10 09:57:11 +08:00
Haruko386
a396b1ace2 feat[Go]: implement /api/v1/agents/<agent_id> and test_db_connection (#15771)
### What problem does this PR solve?

Add two APIs in Go:
```
/api/v1/agents/test_db_connection POST

/api/v1/agents/<agent_id>/sessions DELETE
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-06-10 09:54:07 +08:00
Jack
87b8062df4 feat: implement POST /api/v1/searchbots/ask — streaming RAG with citations and think-tag processing (#15825)
Implements POST /api/v1/searchbots/ask in Go with streaming SSE,
citations, and think-tag processing. 23 files, 90+ unit tests.

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 22:48:50 +08:00
JPette1783
acae932938 fix(go): guard four nil-pointer dereferences causing runtime panics (#15815)
### What problem does this PR solve?

Fixes four Go paths that dereference a pointer with no prior nil check,
each causing a **runtime panic**. Closes #15814.

| # | File | Bug | Fix |
|---|------|-----|-----|
| 1 | `internal/entity/models/deepseek.go` | streaming path runs `switch *chatModelConfig.Effort` inside `if *Thinking`; panics when `Thinking=true` and `Effort==nil` | nil-check with default `"high"`, matching the non-streaming path in the same file |
| 2 | `internal/entity/models/volcengine.go` | identical oversight: `switch *modelConfig.Effort` with no guard | nil-check with default `"medium"`, matching its non-streaming path |
| 3 | `internal/handler/auth.go` | `AuthMiddleware` does `if *user.IsSuperuser`; panics on every authenticated request when the DB column is `NULL` | guard with `user.IsSuperuser != nil &&`, matching every other call site (`admin/handler.go`, `admin/service.go`, `user.go`) |
| 4 | `internal/service/heartbeat_sender.go` | `responseBody["code"].(float64)` panics on any non-200 response lacking a numeric `code`; the upstream `recover()` calls `Fatal()` → `os.Exit(1)`, taking down the whole server | comma-ok assertion (`code, ok := ...`); return an error instead of panicking |
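
The three guard patterns in the table can be shown in isolation. The helper names below are made up for the example and do not correspond to functions in the repository; the defaults and the `code` key are taken from the table.

```go
package main

import (
	"errors"
	"fmt"
)

// effortOrDefault dereferences an optional string, falling back to def when nil.
func effortOrDefault(effort *string, def string) string {
	if effort == nil {
		return def
	}
	return *effort
}

// isSuperuser treats a nil flag (e.g. a NULL column) as false instead of panicking.
func isSuperuser(flag *bool) bool {
	return flag != nil && *flag
}

// responseCode uses the comma-ok form so a missing or non-numeric "code"
// yields an error rather than a panic.
func responseCode(body map[string]interface{}) (int, error) {
	code, ok := body["code"].(float64)
	if !ok {
		return 0, errors.New("response has no numeric code field")
	}
	return int(code), nil
}

func main() {
	fmt.Println(effortOrDefault(nil, "high"))
	fmt.Println(isSuperuser(nil))
	_, err := responseCode(map[string]interface{}{"message": "bad gateway"})
	fmt.Println(err)
}
```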

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-06-09 19:29:25 +08:00
Jin Hai
719ce15c95 Go CLI: update list supported models (#15845)
### What problem does this PR solve?

The `list supported models` command now shows more information.

```
RAGFlow(api/default)> list supported models from 'gitee' 'test';
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
| dimension | max_tokens | model_types | name                                                     | thinking                                    |
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
|           |            |             | Wan2.7                                                   |                                             |
|           |            |             | HappyHorse-1.0                                           |                                             |
|           |            |             | Qwen3.6-27B@Qwen                                         |                                             |
|           |            |             | Qwen3.6-35B-A3B@Qwen                                     |                                             |
|           | 1048576    | [chat]      | DeepSeek-V4-Flash@deepseek-ai                            | map[clear_thinking:true default_value:true] |
|           | 1048576    | [chat]      | DeepSeek-V4-Pro@deepseek-ai                              | map[clear_thinking:true default_value:true] |
+-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-09 19:01:00 +08:00
Jin Hai
55abf4f565 Go: new CLI command, list all models and show model (#15786)
### What problem does this PR solve?

```
RAGFlow(user)> list models;
+---------------------------+------------+-------------+--------------------+---------------------------------------------+
| alias                     | max_tokens | model_types | name               | thinking                                    |
+---------------------------+------------+-------------+--------------------+---------------------------------------------+
|                           | 1048576    | [chat]      | deepseek-v4-flash  | map[clear_thinking:true default_value:true] |
|                           | 1048576    | [chat]      | deepseek-v4-pro    | map[clear_thinking:true default_value:true] |
|                           | 1024000    | [chat]      | minimax-m3         | map[clear_thinking:true default_value:true] |
|                           | 64000      | [vision]    | glm-4.5v           | map[clear_thinking:true default_value:true] |
| [baai/bge-m3]             | 8192       | [embedding] | bge-m3             |                                             |
| [baai/bge-reranker-v2-m3] | 1024       | [rerank]    | bge-reranker-v2-m3 |                                             |
|                           |            | [tts]       | step-audio-tts-3b  |                                             |
| [qwen/qwen3-asr-1.7b]     |            | [asr]       | qwen3-asr-1.7b     |                                             |
| [paddleocr-vl-1.5]        |            | [ocr]       | paddleocr-vl-0.9b  |                                             |
+---------------------------+------------+-------------+--------------------+---------------------------------------------+
RAGFlow(user)> show model 'minimax-m3';
+--------------+---------------------------------------------+
| field        | value                                       |
+--------------+---------------------------------------------+
| name         | minimax-m3                                  |
| max_tokens   | 1024000                                     |
| model_types  | [chat]                                      |
| thinking     | map[clear_thinking:true default_value:true] |
| class        |                                             |
| alias        |                                             |
| ModelTypeMap |                                             |
+--------------+---------------------------------------------+
RAGFlow(user)> show model 'baai/bge-m3';
+--------------+---------------+
| field        | value         |
+--------------+---------------+
| model_types  | [embedding]   |
| thinking     |               |
| class        |               |
| alias        | [baai/bge-m3] |
| ModelTypeMap |               |
| name         | bge-m3        |
| max_tokens   | 8192          |
+--------------+---------------+
```

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-08 21:38:15 +08:00
Jack
338fdb65fb feat(ci): enable go test in CI pipeline (#15750)
## What problem does this PR solve?

Go test files are never compiled in CI — only production binaries via
`go build`. This allowed a missing `"sort"` import in
`metadata_filter_test.go` to be merged without detection.

## Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

## Changes

- Add `go test -count=1 ./internal/...` step after Go build in CI
workflow
- Fix missing `"sort"` import in `metadata_filter_test.go` (pre-existing
compile error)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 20:06:57 +08:00
Jack
04209ffccf feat: implement FetchChunkVectors for citation vector hydration (#15749)
## What problem does this PR solve?

Implements `FetchChunkVectors` — the infrastructure needed to hydrate
chunk embedding vectors on demand. This is a prerequisite for
`insert_citations` (citation insertion in the `searchbots/ask`
endpoint), matching the Python `Dealer.fetch_chunk_vectors` pattern.

Without this, citation insertion cannot compute answer-vs-chunk vector
similarity.

## Type of change

- [x] New Feature (non-breaking change which adds functionality)

## Changes

### New Function
- `FetchChunkVectors(engine, chunkIDs, tenantIDs, kbIDs, dim)` — fetches
embedding vectors for a set of chunk IDs
- Consumer-side `vectorFetcher` interface with only `Search` + `GetType`
methods
- Both `*elasticsearchEngine` and `*infinityEngine` implicitly satisfy
the interface

### Engine Behavior
- **ES**: queries by chunk ID list via `Search` with filter `{"id":
chunkIDs}`, parses tab-separated `q_N_vec` string format
- **Infinity / OceanBase**: skips the round-trip (vectors already
shipped with chunks)
- **Degrades gracefully**: engine errors return zero vectors — citation
insertion will use placeholders instead of failing

### Vector Parsing
- Handles ES tab-separated string format (`"0.1\t0.2\t0.3"`)
- Handles `[]float64` and `[]interface{}` formats
- Returns zero vector for wrong-dimension or unparseable input
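
The parsing rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `parseVector` and its signature are hypothetical, and the real helpers in `internal/service/chunk_vector.go` may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVector is a hypothetical helper (an assumption, not the PR's code).
// It converts a raw field value into a []float64 of length dim and returns
// an all-zero vector when the input is unparseable or has the wrong dimension.
func parseVector(raw interface{}, dim int) []float64 {
	zero := make([]float64, dim)
	var out []float64
	switch v := raw.(type) {
	case string:
		// Tab-separated string format, e.g. "0.1\t0.2\t0.3".
		parts := strings.Split(v, "\t")
		out = make([]float64, 0, len(parts))
		for _, p := range parts {
			f, err := strconv.ParseFloat(strings.TrimSpace(p), 64)
			if err != nil {
				return zero
			}
			out = append(out, f)
		}
	case []float64:
		out = v
	case []interface{}:
		// Typical shape after decoding JSON into interface{} values.
		out = make([]float64, 0, len(v))
		for _, e := range v {
			f, ok := e.(float64)
			if !ok {
				return zero
			}
			out = append(out, f)
		}
	default:
		return zero
	}
	if len(out) != dim {
		return zero
	}
	return out
}

func main() {
	fmt.Println(parseVector("0.1\t0.2\t0.3", 3))
	fmt.Println(parseVector([]interface{}{1.0, 2.0}, 3)) // wrong dimension
}
```

Returning a zero vector instead of an error keeps the caller's loop simple, at the cost of hiding which chunks failed to parse.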

### Bug Fix
- `metadata_filter_test.go`: add missing `"sort"` import (pre-existing
build break)

### Tests
- 12 unit tests: empty input, Infinity/OceanBase skip, ES string vector,
ES float slice, ES interface slice, search error degradation, missing
chunk → zero, wrong dimension → zero, parse edge cases

## Files Changed

| File | Change |
|------|--------|
| `internal/service/chunk_vector.go` | New — FetchChunkVectors + parse
helpers |
| `internal/service/chunk_vector_test.go` | New — 12 tests |
| `internal/service/metadata_filter_test.go` | Fix missing `"sort"`
import |

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 16:54:00 +08:00
Haruko386
67ce0c896d feat[Go]: implement /api/v1/agents/<agent_id>/sessions (#15705)
### What problem does this PR solve?

As the title says.
The code was tested with Postman.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-08 16:26:27 +08:00
Jack
8f4809d1b5 feat: implement POST /api/v1/searchbots/retrieval_test (#15710)
## What problem does this PR solve?

Implements `POST /api/v1/searchbots/retrieval_test` in the Go API
server, aligning with the Python `bot_api.py` counterpart. Also applies
security hardening and consistency fixes discovered during CTO-level
code review:

- **Missing endpoint**: `retrieval_test` was not available in Go,
requiring Python fallback
- **Security**: Both `chunkHandler` and `searchBotHandler` leaked
`err.Error()` to API consumers
- **Python alignment**: Default values, empty question handling, and
`top_k <= 0` validation differed from Python behavior
- **Test gaps**: `chunkHandler.RetrievalTest` had zero unit tests;
several edge cases uncovered

## Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

## Summary

### New Endpoint
- `POST /api/v1/searchbots/retrieval_test` — retrieval test with full
field support (page, size, top_k, use_kg, cross_languages, keyword,
similarity_threshold, vector_similarity_weight)

### New Type
- `common.StringSlice` — JSON type that accepts both `"kb1"` and
`["kb1", "kb2"]`, matching Python API flexibility
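
A string-or-array JSON type of this kind is usually built with a custom `UnmarshalJSON`. The sketch below shows one way to do it, assuming a plain `[]string` underlying type. It is not the implementation in `internal/common/json_types.go`, and the `datasets` field name and the `decodeDatasets` helper are made up for the demo.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StringSlice is a sketch of a type that accepts "kb1" or ["kb1", "kb2"].
type StringSlice []string

// UnmarshalJSON tries a single string first, then an array of strings.
func (s *StringSlice) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // leave the slice untouched on JSON null
	}
	var single string
	if err := json.Unmarshal(data, &single); err == nil {
		*s = StringSlice{single}
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return fmt.Errorf("expected string or array of strings: %w", err)
	}
	*s = many
	return nil
}

// decodeDatasets is a hypothetical helper used only for this demo.
func decodeDatasets(body string) ([]string, error) {
	var req struct {
		Datasets StringSlice `json:"datasets"`
	}
	if err := json.Unmarshal([]byte(body), &req); err != nil {
		return nil, err
	}
	return req.Datasets, nil
}

func main() {
	one, _ := decodeDatasets(`{"datasets": "kb1"}`)
	many, _ := decodeDatasets(`{"datasets": ["kb1", "kb2"]}`)
	fmt.Println(one, many)
}
```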

### Security
- Both `searchBotHandler` and `chunkHandler` now use `common.Warn()` +
generic error messages instead of leaking `err.Error()` to API consumers
- All error responses include consistent `"data": nil` shape
- `chunkHandler.RetrievalTest` uses interface-based DI (`chunkService`)
to enable testability

### Python Alignment
- Handler-level defaults align with Python `bot_api.py` (page=1,
size=30, top_k=1024, similarity_threshold=0.0,
vector_similarity_weight=0.3)
- `top_k <= 0` validation matching Python behavior
- Empty/whitespace question returns 200 + empty result (matches
`chunk_api.py`)
- `chunkHandler` `Datasets` field uses `common.StringSlice` for
string-or-array flexibility

### Refactoring
- `ChunkServiceIface` → `ChunkRetriever`, `chunkSvcIface` →
`chunkService` (Go-conventional naming)
- Extracted `applyRetrievalDefaults`, `toRetrievalServiceRequest` from
handler body
- Regex moved to package-level var in `parseRelatedQuestions`
- `service.RetrievalTestRequest.Datasets` type changed to
`common.StringSlice`
- `chunkHandler` now uses consumer-side interface for DI

### Tests
- 37 unit tests across both handlers: auth, validation, defaults,
StringSlice edge cases, empty/whitespace KbID, service errors, JSON
format, `top_k <= 0`, field mapping verification

## Files Changed

| File | Change |
|------|--------|
| `cmd/server_main.go` | Wire new handler + chunkService +
difyRetrievalHandler |
| `internal/common/json_types.go` | New StringSlice type |
| `internal/common/json_types_test.go` | StringSlice tests |
| `internal/handler/chunk.go` | Interface-based DI, security, Python
alignment, defaults |
| `internal/handler/chunk_test.go` | New — 9 comprehensive tests |
| `internal/handler/searchbot.go` | New endpoint + refactoring + `top_k
<= 0` validation |
| `internal/handler/searchbot_test.go` | 18 tests covering all edge
cases |
| `internal/router/router.go` | Register new route +
difyRetrievalHandler |
| `internal/service/chunk.go` | Datasets type → StringSlice, Question
binding relaxed |

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 16:16:56 +08:00
qinling0210
5e0a7ce408 Update Rerank logic in GO (#15755)
### What problem does this PR solve?

Sync the rerank logic from the following PRs to Go:
https://github.com/infiniflow/ragflow/pull/15429
https://github.com/infiniflow/ragflow/pull/15434

### Type of change

- [x] Refactoring
2026-06-08 15:28:10 +08:00
dripsmvcp
3d7adf2193 feat[Go]: implement GET /plugin/tools (issue #15240) (#15570)
## Summary

Port the Python `GET /v1/plugin/tools` endpoint to the Go API server.
Listed in the Go-API port checklist of #15240.

Returns the metadata of every embedded LLM tool plugin in the same JSON
shape the Python endpoint emits (camelCase keys preserved), so existing
frontends bind to the Go server without changes.
2026-06-08 11:53:19 +08:00
qinling0210
c960dc2a4c Refine handling of POST /api/v1/datasets/search in GO (#15583)
### What problem does this PR solve?

Refine the handling of `POST /api/v1/datasets/search` in Go.

### Type of change

- [x] Refactoring
2026-06-08 11:49:37 +08:00
Hz_
074c331cdf fix(go-api): sync document handler interface and enforce preview acce… (#15688)
### Description

This PR syncs the `documentServiceIface` interface and introduces
handler methods for document preview, artifact fetching, and downloading
in the Go API. It also ensures that strict dataset alignment and access
checks are enforced when retrieving or downloading documents.

Furthermore, this PR introduces comprehensive unit tests for both the
newly added Handler and Service methods to ensure robustness and prevent
future regressions.

### Key Changes
* **Router & Handler Integration**:
  * Added and wired new API endpoints in `internal/router/router.go`.
  * Synchronized the `documentServiceIface` with `GetDocumentArtifact`,
    `GetDocumentPreview`, and `DownloadDocument`.
  * Implemented handlers for these endpoints in
    `internal/handler/document.go`.
* **Access & Validation Enforcement**:
  * Refactored `internal/service/document.go` to strictly check if a
    document belongs to the requested dataset before allowing downloads or
    previews.
  * Added robust artifact file sanitization (`sanitizeArtifactFilename`)
    and attachment handling (`shouldForceArtifactAttachment`).
* **Comprehensive Unit Testing**:
  * **Handler Layer (`internal/handler/document_test.go`)**: Added mock
    service implementations and Gin router tests covering success,
    not-found, and internal error states for all 3 new endpoints.
  * **Service Layer (`internal/service/document_test.go`)**: Added
    extensive business logic tests including dataset mismatch checks,
    non-existent document checks, and artifact file validation.
2026-06-08 11:37:06 +08:00
Jack
5a04ac0864 feat: Dify-compatible retrieval API endpoint (#15704)
## Summary

Dify-compatible retrieval API for external knowledge base integration.

## Changes

- **New handler**: DifyRetrievalHandler with POST/GET
/api/v1/dify/retrieval
- **Health check**: GET /api/v1/dify/retrieval/health
- **Full pipeline**: KB validation -> permission check -> embedding ->
metadata filter -> chunk retrieval -> child chunk aggregation ->
optional KG search -> response assembly
- **12 tests** covering all paths (success, errors, metadata filter, KG
mode)
- **Testability**: Handler dependencies defined as interfaces
(KBServiceIface, ModelServiceIface, etc.)

## Files

| File | Type |
|------|------|
| internal/handler/dify_retrieval_handler.go | New — handler +
interfaces |
| internal/handler/dify_retrieval_handler_test.go | New — 12 tests |
| internal/router/router.go | Modified — route registration |
| cmd/server_main.go | Modified — handler wiring |
| internal/service/kg/pipeline.go | Modified — SetChatModel/SetEmbModel
|
| internal/service/kg/retrieval.go | New — helper functions |
| internal/service/kg/scoring.go | Moved from service package |
| internal/service/kg/search.go | New — KG search functions |
| internal/service/kg/types.go | New — type definitions |

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 21:16:25 +08:00
Jack
ea79d65d08 feat: add KGSearchRetrieval for full KG pipeline (N-hop, scoring, query_rewrite, community) (#15690)
## Summary

`KGSearchRetrieval` composes entity search, type search, relation
search, N-hop analysis, score fusion, LLM-based query\_rewrite, and
community reports into a single synthetic chunk for KG-enhanced
retrieval.

### Components

| Component | Source | Status |
|-----------|--------|--------|
| Entity/relation/community search | Direct `DocEngine.Search` calls | 
|
| N-hop analysis + score fusion | `common.AnalyzeNHopPaths` /
`DoubleHitBoost` / `FuseRelationScores` |  #15666 |
| Query rewrite prompt + parser | `common.BuildQueryRewritePrompt` /
`ParseQueryRewriteResponse` |  #15669 |
| Token budget | `common.BuildKGContent` + `NumTokensFromString` | 
#15666 |
| LLM query rewrite integration | `queryRewrite` function with fallback
|  |

### Testing

11 tests (pure function + mock engine):

```
=== RUN   TestKgEntityFromChunk_Basic          --- PASS
=== RUN   TestKgEntityFromChunk_ScoreFallback  --- PASS
=== RUN   TestKgEntityFromChunk_MissingFields  --- PASS
=== RUN   TestKgRelationFromChunk_Basic        --- PASS
=== RUN   TestKgRelationFromChunk_MissingFrom  --- PASS
=== RUN   TestSearchKGTypeSamples_Success      --- PASS
=== RUN   TestSearchKGTypeSamples_Empty        --- PASS
=== RUN   TestKGSearchRetrieval_Basic          --- PASS
=== RUN   TestKGSearchRetrieval_NoEntities     --- PASS
=== RUN   TestQueryRewrite_Fallback            --- PASS
=== RUN   TestQueryRewrite_EmptyQuestion       --- PASS
```

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 18:00:27 +08:00
web-dev0521
b8db200757 feat(go-api): implement MCP server management endpoints (#15281)
## Summary

Ports the MCP (Model Context Protocol) server management endpoints that
power `web/src/pages/user-setting/mcp/` from Python
(`api/apps/restful_apis/mcp_api.py`) to Go. There were no MCP routes in
the Go server before this change.

Closes #15275 (subtask of #15240).

## Endpoints implemented (base path `/api/v1`)

| Method | Path | Description |
|--------|------|-------------|
| GET | `/mcp/servers` | List tenant servers (keyword / order /
pagination) |
| POST | `/mcp/servers` | Create a server |
| GET | `/mcp/servers/{mcp_id}` | Get one (`?mode=download` exports
config) |
| PUT | `/mcp/servers/{mcp_id}` | Update a server |
| DELETE | `/mcp/servers/{mcp_id}` | Delete a server |
| POST | `/mcp/import` | Bulk import from JSON config |
| POST | `/mcp/servers/{mcp_id}/test` | Connect + list tools (see notes)
|

## Implementation

Follows the existing `handler → service → dao` layering (per PR #14790):

- **entity** (`internal/entity/mcp.go`): added `MCPServerType` constants
and `IsValidMCPServerType` over the existing `MCPServer` model.
- **dao** (`internal/dao/mcp.go`): new `MCPServerDAO` with tenant-scoped
CRUD, a keyword filter, and a **whitelisted order-column map** (guards
against SQL injection via the caller-supplied `orderby`).
- **service** (`internal/service/mcp.go`): new `MCPService` —
list/get/export/create/update/delete/import/test — mirroring
`MCPServerService` and the `mcp_api` request validation, with sentinel
errors for clean code mapping.
- **handler** (`internal/handler/mcp.go`): new `MCPHandler` with the
seven handlers and Python-compatible response codes.
- **router / server_main**: registered the `/mcp` group and wired the
handler.
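
The whitelisted order-column map mentioned for the DAO can be illustrated with a small sketch. The column names, the fallback column, and the `orderClause` helper are assumptions made for this example and are not taken from `internal/dao/mcp.go`. The point is that the caller-supplied `orderby` value is only ever used as a map key and is never concatenated into SQL.

```go
package main

import "fmt"

// allowedOrderColumns maps caller-facing names to real column names.
// The entries here are hypothetical.
var allowedOrderColumns = map[string]string{
	"create_time": "create_time",
	"update_time": "update_time",
	"name":        "name",
}

// orderClause builds an ORDER BY fragment from a whitelisted column.
// Unknown values fall back to a safe default instead of reaching the query.
func orderClause(orderby string, desc bool) string {
	col, ok := allowedOrderColumns[orderby]
	if !ok {
		col = "create_time"
	}
	if desc {
		return col + " DESC"
	}
	return col + " ASC"
}

func main() {
	fmt.Println(orderClause("name", false))
	fmt.Println(orderClause("name; DROP TABLE mcp_server", true))
}
```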

## Deviations from Python (documented in code)

1. **Bulk import is at `POST /mcp/import`, not `/mcp/servers/import`.**
gin (v1.9.1) cannot register a static segment and a path param at the
same tree node, so `/mcp/servers/import` would collide with
`/mcp/servers/:mcp_id` and panic at startup. The frontend should call
`/mcp/import`.
2. **No live tool discovery on create/update/import.** The Python path
runs `get_mcp_tools` over SSE / streamable-HTTP and stores
`variables.tools`. The Go server has no MCP client yet, so these persist
`variables`/`headers` but leave `variables.tools` unpopulated.
3. **`/test` returns a data error (`ErrMCPTestUnsupported`)** until a Go
MCP client lands. Per the issue, the live-connection path is scoped as a
follow-up; the handler still validates `url` + `server_type`.

## Testing

- Added `internal/service/mcp_test.go` covering `IsValidMCPServerType`
and the `TestServer` validation/short-circuit paths (no DB required).
- No Go toolchain was available in the dev environment, so `go build
./...` / `go vet ./...` verification is left to CI.

## Follow-ups

- Go MCP client (SSE / streamable-HTTP) to enable live tool discovery
and the real `/test` behavior.
- Reconcile the `/mcp/import` vs `/mcp/servers/import` path with the
frontend.

---------
2026-06-05 13:25:09 +08:00
Jack
e629c0203b feat: add KG entity/relation/community search functions (#15689)
## Summary

Knowledge Graph search functions for entity, relation, community report,
and type-samples retrieval. Uses DocEngine.SelectFields (PR #15684) for
KG-specific fields.

### Functions

| Function | Description |
|----------|-------------|
| `SearchKGEntities` | Hybrid search over KG entities (dense + text +
fusion) |
| `SearchKGEntitiesByTypes` | Entity search filtered by
`entity_type_kwd` |
| `SearchKGRelations` | Hybrid search over KG relations |
| `SearchKGCommunityReports` | Community report search by entity names |
| `SearchKGTypeSamples` | Type→entities mapping for query_rewrite |

### Internal helpers

| Helper | Description |
|--------|-------------|
| `buildHybridExpr` | Shared dense+text+fusion expression construction |
| `buildKGDenseExpr` | Wraps `Embed()` call for vector search |
| `Parse*` | Convert raw chunks to typed structs |

### Testing

35 tests (pure function + mock integration)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 13:23:04 +08:00
Haruko386
4b2af1347c feat[Go]: implement Agent/Workflow PUT /api/v1/agents/<canvas_id>/tags (#15641)
2026-06-05 13:22:23 +08:00
Jack
f6ff862a24 fix: restore case-insensitive contains/not contains/not in and consolidate metadata filter pipeline (#15686)
## Summary

This PR fixes case-sensitivity regressions introduced in #15656 and
consolidates the metadata filtering pipeline by removing the duplicate
`applySingleCondition` adapter layer.

### Bug fixes
1. **contains / not contains**: restored case-insensitive matching (was
lost when `applySingleCondition` was replaced by
`common.MetaFilter.matchValue` which lacked `strings.ToLower`)
2. **not in**: restored case-insensitive matching (was lost for same
reason; uses `strings.EqualFold`)
3. **!= with date filter values**: non-date metadata values now
correctly match the `≠` operator (a non-date value IS not equal to any
date, but was returning false)

### Architecture
4. **Removed `applySingleCondition`** (65 lines) — the inline switch was
a duplicate of `common.MetaFilter` logic. `ApplyMetaFilter` now converts
conditions and delegates to `common.MetaFilter` once per filter set,
eliminating ~25 lines of duplicate AND/OR merge logic.
5. **Added `filterSet`** — O(n+m) hash-map fast path for `in`/`not in`
operators, replacing the O(n*m) linear scan in `matchValue`.
6. **Exported `NormalizeOperator`** from `common` for consistent
operator alias handling.
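
To make the `filterSet` idea concrete, here is a rough sketch of a hash-set fast path for `in` / `not in`. It assumes case-insensitive matching via `strings.ToLower`; the type and function names mirror the description above but the code is illustrative, not the version in `common`.

```go
package main

import (
	"fmt"
	"strings"
)

// filterSet holds lower-cased filter values for O(1) membership checks.
type filterSet map[string]struct{}

// newFilterSet builds the set once, in O(m) for m filter values.
func newFilterSet(values []string) filterSet {
	s := make(filterSet, len(values))
	for _, v := range values {
		s[strings.ToLower(v)] = struct{}{}
	}
	return s
}

// contains reports whether v is in the set, ignoring case.
func (s filterSet) contains(v string) bool {
	_, ok := s[strings.ToLower(v)]
	return ok
}

// matchIn returns the metadata values that satisfy "in" (negate=false)
// or "not in" (negate=true). Total cost is O(n+m) rather than O(n*m).
func matchIn(metaValues, filterValues []string, negate bool) []string {
	set := newFilterSet(filterValues)
	var out []string
	for _, v := range metaValues {
		if set.contains(v) != negate {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	meta := []string{"Alpha", "beta", "Gamma"}
	fmt.Println(matchIn(meta, []string{"ALPHA", "gamma"}, false))
	fmt.Println(matchIn(meta, []string{"ALPHA", "gamma"}, true))
}
```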

### Cleanup
7. Removed 18 lines of dead code (`matchValue`'s `in`/`not in` branches
already bypassed by `filterOut` delegation)
8. Fixed orphaned godoc comment for `convertOperator`
9. Fixed incorrect `filterSet` doc comment (claimed "matching EqualFold"
but used `strings.ToLower`)
10. Completed `convertToMetaCondition` operator normalization
documentation

### Testing
- 60 tests (24 service + 36 common), all passing
- New tests: `==`, `≠`, `>`, `<`, `≥`, `≤`, `empty`, `not empty` through
`ApplyMetaFilter`
- New tests: `<`, `≤`, `≠` through `MetaFilter`; `not-in-empty-list`
through `filterSet`
- All 18 `MetaFilter` tests pass; all 10 `filterSet` unit tests pass

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 12:47:55 +08:00
Jack
ee32d91aab feat: add EnrichChunksWithDocMetadata function to attach document metadata to chunks (#15659)
## Summary

Add `EnrichChunksWithDocMetadata` as a method on `MetadataService` that
attaches document metadata to retrieval chunks in-place. Equivalent to
Python's `enrich_chunks_with_document_metadata()` from
`api/utils/reference_metadata_utils.py`.

### Usage

```go
metadataSvc.EnrichChunksWithDocMetadata(chunks, tenantID, metadataFields)
```

### Changes

- **`service/metadata.go`**: Added `EnrichChunksWithDocMetadata` method
- **`service/enrich_metadata_test.go`** (new): 7 test cases

### Algorithm

1. Collect unique `(kb_id, doc_id)` pairs from chunks
2. Fetch metadata from ES via `SearchMetadata(kbID, tenantID, docIDs)`
3. Attach `document_metadata` field to each matching chunk
4. Optionally filter to specified `metadataFields`
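
The four steps above can be sketched as follows. This is a simplified illustration under several assumptions: chunks are modeled as `map[string]interface{}`, the metadata lookup is passed in as a function whose return shape is guessed, and `enrichChunks`, `selectFields`, and `fakeFetch` are hypothetical names rather than the method added in this PR.

```go
package main

import "fmt"

// metadataFetcher stands in for SearchMetadata(kbID, tenantID, docIDs).
// The return type (doc ID -> metadata) is an assumption.
type metadataFetcher func(kbID, tenantID string, docIDs []string) (map[string]map[string]interface{}, error)

// enrichChunks attaches document metadata to chunks in place.
func enrichChunks(chunks []map[string]interface{}, tenantID string, fields []string, fetch metadataFetcher) error {
	// Step 1: collect unique (kb_id, doc_id) pairs, grouped by KB.
	docsByKB := map[string][]string{}
	seen := map[[2]string]bool{}
	for _, c := range chunks {
		kbID, _ := c["kb_id"].(string)
		docID, _ := c["doc_id"].(string)
		key := [2]string{kbID, docID}
		if kbID == "" || docID == "" || seen[key] {
			continue
		}
		seen[key] = true
		docsByKB[kbID] = append(docsByKB[kbID], docID)
	}
	// Step 2: fetch metadata once per KB.
	metaByDoc := map[[2]string]map[string]interface{}{}
	for kbID, docIDs := range docsByKB {
		found, err := fetch(kbID, tenantID, docIDs)
		if err != nil {
			return err
		}
		for docID, meta := range found {
			metaByDoc[[2]string{kbID, docID}] = meta
		}
	}
	// Steps 3 and 4: attach the (optionally filtered) metadata.
	for _, c := range chunks {
		kbID, _ := c["kb_id"].(string)
		docID, _ := c["doc_id"].(string)
		if meta, ok := metaByDoc[[2]string{kbID, docID}]; ok {
			c["document_metadata"] = selectFields(meta, fields)
		}
	}
	return nil
}

// selectFields keeps only the requested keys; an empty list keeps everything.
func selectFields(meta map[string]interface{}, fields []string) map[string]interface{} {
	if len(fields) == 0 {
		return meta
	}
	out := map[string]interface{}{}
	for _, f := range fields {
		if v, ok := meta[f]; ok {
			out[f] = v
		}
	}
	return out
}

// fakeFetch returns fixed metadata for every requested document (demo only).
func fakeFetch(kbID, tenantID string, docIDs []string) (map[string]map[string]interface{}, error) {
	out := map[string]map[string]interface{}{}
	for _, id := range docIDs {
		out[id] = map[string]interface{}{"author": "alice", "year": 2024}
	}
	return out, nil
}

func main() {
	chunks := []map[string]interface{}{{"kb_id": "kb1", "doc_id": "d1"}}
	if err := enrichChunks(chunks, "t1", []string{"author"}, fakeFetch); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(chunks[0]["document_metadata"])
}
```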

### Testing

All 7 tests pass:

```
=== RUN   TestEnrichChunksWithDocMetadata_NoChunks       --- PASS
=== RUN   TestEnrichChunksWithDocMetadata_EmptyChunks     --- PASS
=== RUN   TestEnrichChunksWithDocMetadata_EmptyDocID      --- PASS
=== RUN   TestEnrichChunksWithDocMetadata_DuplicateDocIDs --- PASS
=== RUN   TestEnrichChunksWithDocMetadata_MultipleKBs     --- PASS
=== RUN   TestEnrichChunksWithDocMetadata_WithMetadataFields --- PASS
=== RUN   TestEnrichChunksWithDocMetadata_MixedFields     --- PASS
```

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 11:42:23 +08:00
dripsmvcp
431f52a5d4 feat[Go]: implement GET /agents/templates (issue #15240) (#15573)
## Summary

Port the canvas-template catalogue endpoint to the Go API server. Listed
in the Go-API port checklist of #15240.

Mirrors `list_agent_template` in `api/apps/restful_apis/agent_api.py`:
returns every row from the `canvas_template` table so that the UI can
render the template gallery on the New-Agent screen.

## What

- `internal/dao/canvas_template.go` — new `CanvasTemplateDAO.GetAll()`
ordered by `create_time desc` (newest templates first).
- `internal/service/agent.go` — wire the new DAO into `AgentService` and
expose `ListTemplates() ([]*entity.CanvasTemplate, error)`.
- `internal/handler/agent.go` — new `AgentHandler.ListTemplates` HTTP
handler (auth-gated, mirrors Python `@login_required`).
- `internal/router/router.go` — `agents.GET("/templates",
r.agentHandler.ListTemplates)` registered alongside the existing `GET
/agents`.
- `internal/handler/agent_test.go` — three new tests covering: success
path, empty-list → JSON array (not `null`), and the auth gate.

## Notes

- `CanvasTemplate` entity, GORM tags, and DB migration already exist in
`internal/entity/canvas.go` and `internal/dao/database.go` — no schema
change required.
- The handler coerces a `nil` slice to `[]*entity.CanvasTemplate{}` so
the JSON payload is always an array (the frontend does `data.map(...)`
on it).
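
The nil-slice coercion matters because `encoding/json` encodes a nil slice as `null` and an empty slice as `[]`. A minimal sketch of the difference, using a hypothetical `template` struct in place of `entity.CanvasTemplate`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// template is a stand-in for entity.CanvasTemplate (fields are assumptions).
type template struct {
	ID    string `json:"id"`
	Title string `json:"title"`
}

// rawPayload marshals the slice as-is; a nil slice becomes "null".
func rawPayload(rows []*template) string {
	b, _ := json.Marshal(rows)
	return string(b)
}

// arrayPayload coerces nil to an empty slice so the result is always an array.
func arrayPayload(rows []*template) string {
	if rows == nil {
		rows = []*template{}
	}
	b, _ := json.Marshal(rows)
	return string(b)
}

func main() {
	fmt.Println(rawPayload(nil))   // null
	fmt.Println(arrayPayload(nil)) // []
}
```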

## Test plan

- [x] `go vet ./internal/handler ./internal/service ./internal/dao
./internal/router` clean
- [x] Three unit tests added; existing `TestListAgents_Success`
untouched
- [ ] CI runs `go test ./internal/handler` with cgo binding linked

## Related

- Tracker: #15240
2026-06-05 10:13:30 +08:00
Jack
96a416629d refactor: change GetFlattedMetaByKBs return type to common.MetaData (#15656)
## Summary

Change `GetFlattedMetaByKBs` return type from `map[string]interface{}`
to strongly-typed `common.MetaData`.

**Depends on**: #15648 (provides `MetaData`, `MetaValueDocs` types)

### Changes
- `service/metadata.go`: Changed return type, removed type assertions
- `service/metadata_filter.go`: Updated all metadata function signatures
- `service/metadata_filter_test.go` (new): 12 test cases

### Bug fix
`applySingleCondition` used `.([]interface{})` assertions on `[]string`
data, silently breaking operators like `!=`, `contains`, `start with`,
etc.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 22:16:04 +08:00
Jack
6143205b37 feat: implement GET /api/v1/agents/<agent_id>/versions/<version_id> API (#15640)
## Summary

Implement the `GET /api/v1/agents/<agent_id>/versions/<version_id>`
endpoint in Go, returning full version details including DSL.

Depends on #15629 which introduced the version list endpoint and
`UserCanvasVersionDAO` infrastructure.

### Changes

- **Modified**: `internal/handler/agent.go` — Added `GetAgentVersion`
handler with auth check and ownership verification
- **Modified**: `internal/router/router.go` — Registered `GET
/:agent_id/versions/:version_id` route
- **New/Modified tests**: Service and handler tests for the version
detail endpoint

### Testing

```
=== RUN   TestGetVersion_Success       --- PASS
=== RUN   TestGetVersion_WrongCanvas   --- PASS
=== RUN   TestGetVersion_NotFound      --- PASS
=== RUN   TestGetAgentVersionHandler_Success      --- PASS
=== RUN   TestGetAgentVersionHandler_VersionNotFound --- PASS
```

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 19:13:26 +08:00
Jack
23aae19898 feat: implement POST /api/v1/agents/<agent_id>/upload API (#15633)
## Summary

Implement the `POST /api/v1/agents/<agent_id>/upload` endpoint in Go,
allowing file uploads associated with agent canvases.

### Changes

- **Modified**: `internal/service/agent.go` — Added `CheckCanvasAccess`
method (owner + team-level permission semantics)
- **Modified**: `internal/handler/agent.go` — Added `UploadAgentFile`
handler with auth check, multipart file parsing, and delegation to
`FileService`. Added `fileUploader` interface for testability.
- **Modified**: `internal/router/router.go` — Registered `POST
/:agent_id/upload` route
- **Modified**: `cmd/server_main.go` — Wired `fileService` into
`AgentHandler`
- **New**: `internal/service/agent_test.go` — 4 service-level tests for
`CheckCanvasAccess` (owner, team member, private denial, not found)
- **New**: `internal/handler/agent_upload_test.go` — 3 handler-level
tests (success with fake file service, cross-user denial, empty file
rejection)

### Testing

All 7 tests pass with zero mocking of the DB layer (in-memory SQLite):

```
=== RUN   TestCheckCanvasAccess_Owner               --- PASS
=== RUN   TestCheckCanvasAccess_NotOwner            --- PASS
=== RUN   TestCheckCanvasAccess_PrivateCanvas_Denied --- PASS
=== RUN   TestCheckCanvasAccess_NotFound            --- PASS
=== RUN   TestUploadAgentFileHandler_Success        --- PASS
=== RUN   TestUploadAgentFileHandler_NoPermission   --- PASS
=== RUN   TestUploadAgentFileHandler_NoFiles        --- PASS
```

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 17:21:47 +08:00
Jack
02d163a177 feat: implement GET /api/v1/agents/<agent_id>/versions API (#15629)
## Summary

Implement the `GET /api/v1/agents/<agent_id>/versions` endpoint in Go,
listing all version snapshots for an agent canvas in descending update
time order.

### Changes

- **New**: `internal/dao/user_canvas_version.go` —
`UserCanvasVersionDAO` with `ListByCanvasID` (ordered by update_time
DESC) and `GetByID`
- **Modified**: `internal/service/agent.go` — Added `CheckCanvasAccess`,
`ListVersions`, `GetVersion` methods
- **Modified**: `internal/handler/agent.go` — Added `ListAgentVersions`
handler with auth check
- **Modified**: `internal/router/router.go` — Registered `GET
/:agent_id/versions` route
- **New**: `internal/service/agent_test.go` — 5 service-level tests
(SQLite in-memory DB, zero mock)
- **Modified**: `internal/handler/agent_test.go` — 3 handler-level tests
(real DB, pre-authenticated context)

### Testing

All 8 tests pass with zero mocking (in-memory SQLite replaces MySQL):

```
=== RUN   TestListVersions_Success         --- PASS
=== RUN   TestListVersions_Empty           --- PASS
=== RUN   TestCheckCanvasAccess_Owner      --- PASS
=== RUN   TestCheckCanvasAccess_NotOwner   --- PASS
=== RUN   TestCheckCanvasAccess_NotFound   --- PASS
=== RUN   TestListAgentVersionsHandler_Success      --- PASS
=== RUN   TestListAgentVersionsHandler_NoPermission --- PASS
=== RUN   TestListAgentVersionsHandler_CanvasNotFound --- PASS
```

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 15:36:26 +08:00
Jack
c6eee09ed3 feat: migrate POST /api/v1/datasets/<dataset_id>/documents/stop to Go (#15597)
## Summary

Migrate the stop parse documents endpoint from Python to Go.

### Python endpoint
`POST /api/v1/datasets/<dataset_id>/documents/stop` —
`api/apps/restful_apis/document_api.py:1542-1641`

### Changes
| File | Change |
|------|--------|
| `internal/dao/task.go` | Add `GetByDocID` method |
| `internal/dao/task_test.go` | 3 DAO tests (new file) |
| `internal/service/document.go` | Add `StopParseDocuments` + refactor
shared helpers |
| `internal/service/document_test.go` | 8 service tests |
| `internal/handler/document.go` | Add handler + request struct +
interface |
| `internal/handler/document_test.go` | 5 handler tests |
| `internal/router/router.go` | Add `POST /:dataset_id/documents/stop`
route |

### How it works
1. Validates all document IDs belong to the dataset
2. For each document in RUNNING/CANCEL state (or with unfinished tasks):
   - Sets Redis cancel signal `{task_id}-cancel` for each associated task
   - Updates `document.run` to CANCEL ("2")
3. Returns `{"success_count": N, "errors": [...]}`

### Test strategy
- **DAO/Service**: SQLite in-memory DB, zero mocks. Redis is nil-safe by
design.
- **Handler**: `fakeDocumentService` implementing `documentServiceIface`
interface.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-06-04 14:16:13 +08:00
Jack
67c3e73d70 feat: migrate DELETE /api/v1/datasets/:dataset_id/documents to Go (#15577)
## Summary

Migrate the batch document deletion endpoint from Python to Go. Two
modes supported: explicit `ids` list and `delete_all`.

## Changes

| File | Change |
|------|--------|
| `internal/dao/file2document.go` | Add `GetByDocumentID`, `DeleteByDocumentID` |
| `internal/dao/file2document_test.go` | 5 new tests |
| `internal/dao/kb_test.go` | 2 new tests (`DecreaseDocumentNum`) |
| `internal/service/document.go` | Add `deleteDocumentFull` + `DeleteDocuments`, refactor `DeleteDocument` |
| `internal/service/document_test.go` | 10 new tests |
| `internal/handler/document.go` | Add `documentServiceIface` + `DeleteDocuments` handler |
| `internal/handler/document_test.go` | 7 new tests |
| `internal/router/router.go` | Register `DELETE /:dataset_id/documents` |
| `cmd/server_main.go` | Support `RAGFLOW_DICT_PATH` env var |
| `internal/binding/rag_analyzer.go` | Use `-lpcre2-8` dynamic linking |
| `internal/dao/database.go` | Skip Error 1091/1138 during migration |
| `internal/service/llm.go` | Fix vet warning |

## Per-document cleanup

- Delete tasks from DB
- Hard-delete document + decrement KB counters
- Delete chunks from document engine (nil-guarded)
- Delete metadata from document engine (nil-guarded)
- Remove file2document mapping + file record + storage blob
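
The cleanup order and the nil guard around the document engine can be sketched as below. This is an illustration under assumptions, not the PR's `deleteDocumentFull`: the `DocEngine` interface, the `memEngine` stand-in, and the `cleanupDocument` helper are hypothetical, and the database and storage steps are only recorded as labels rather than executed.

```go
package main

import "fmt"

// DocEngine is a hypothetical interface for the document engine.
type DocEngine interface {
	DeleteChunks(docID string) error
	DeleteMetadata(docID string) error
}

// memEngine is an in-memory stand-in used only in this sketch.
type memEngine struct {
	chunks   map[string]bool
	metadata map[string]bool
}

func (e *memEngine) DeleteChunks(docID string) error   { delete(e.chunks, docID); return nil }
func (e *memEngine) DeleteMetadata(docID string) error { delete(e.metadata, docID); return nil }

// cleanupDocument walks the per-document steps in the listed order and
// returns the steps it performed. Engine calls are skipped when engine is nil.
func cleanupDocument(docID string, engine DocEngine) ([]string, error) {
	steps := []string{"delete tasks", "hard-delete document", "decrement KB counters"}
	if engine != nil { // nil guard: the engine may be absent, e.g. in unit tests
		if err := engine.DeleteChunks(docID); err != nil {
			return steps, fmt.Errorf("delete chunks: %w", err)
		}
		steps = append(steps, "delete chunks")
		if err := engine.DeleteMetadata(docID); err != nil {
			return steps, fmt.Errorf("delete metadata: %w", err)
		}
		steps = append(steps, "delete metadata")
	}
	steps = append(steps, "remove file2document mapping", "remove file record", "remove storage blob")
	return steps, nil
}

func main() {
	steps, _ := cleanupDocument("doc1", nil)
	fmt.Println(len(steps), "steps without an engine")

	eng := &memEngine{
		chunks:   map[string]bool{"doc1": true},
		metadata: map[string]bool{"doc1": true},
	}
	steps, _ = cleanupDocument("doc1", eng)
	fmt.Println(len(steps), "steps with an engine")
}
```

Note that the guard compares the interface value itself to `nil`; passing a typed nil pointer such as `(*memEngine)(nil)` would not be caught by it.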

## Test Results

**All 24 unit tests pass** (7 DAO + 10 service + 7 handler), using
SQLite `:memory:` and `gin.TestMode`.

See [test report](docs/test_report_delete_documents.md) for manual
integration test results.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 20:55:53 +08:00
Haruko386
df55880b44 feat[Go] implement /connectors/google/oauth (#15584)
### What problem does this PR solve?

The following APIs are now available in Go:

> /api/v1/connectors/google/oauth/web/start POST
> /api/v1/connectors/gmail/oauth/web/callback GET
> /api/v1/connectors/google-drive/oauth/web/callback GET
> /api/v1/connectors/google/oauth/web/result POST


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-03 20:08:55 +08:00
Jin Hai
2061edd308 Remove unused codes (#15579)
### What problem does this PR solve?

Remove unused code.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 17:35:36 +08:00
Haruko386
473d06d1ad feat[Go]: implement add multi_models (#15563) 2026-06-03 15:26:46 +08:00
Jin Hai
e1f19f6679 Go: fix gitee balance api (#15554)
```
RAGFlow(user)> create provider 'gitee' instance 'intl' key 'api-token' url 'https://ai.gitee.com/v1' region 'intl';
SUCCESS
```

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-03 13:23:20 +08:00
ちー
5f8926410d feat[Go]: implement /api/v1/connectors/<connector_id> PATCH (#15512)
### What problem does this PR solve?

As the title says. All tests pass.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-02 19:34:07 +08:00
Haruko386
9f969feb89 feat[Go] implement check connection by using apikey and region (#15475)
### What problem does this PR solve?

**Verified with Postman**


GET http://127.0.0.1:9384/api/v1/providers/gitee/connection
Request body:
```json
{
    "api_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "default"
}
```

Response:
```json
{
    "code": 0,
    "message": "success"
}
```

GET http://127.0.0.1:9384/api/v1/providers/gitee/connection
Request body:
```json
{
    "api_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "deprecated"
}
```

Response:
```json
{
    "code": 0,
    "message": "success"
}
```

GET http://127.0.0.1:9384/api/v1/providers/gitee/connection
Request body:
```json
{
    "api_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "china"
}
```

Response:
```json
{
    "code": 0,
    "message": "success"
}
```

GET http://127.0.0.1:9384/api/v1/providers/lmstudio/connection
Request body:
```json
{
    "api_key": "",
    "region": "test"
}
```

Response:
```json
{
    "code": 0,
    "message": "success"
}
```
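
The same check could be issued from Go roughly as follows. This is a sketch under assumptions: the helper and struct names are hypothetical, any authentication headers the server requires are omitted, and the example only builds the request and decodes a sample response rather than calling a live server. The path, the `api_key`/`region` fields, and the `code`/`message` response fields are taken from the examples above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// connectionCheck mirrors the request body shown in the Postman examples.
type connectionCheck struct {
	APIKey string `json:"api_key"`
	Region string `json:"region"`
}

// apiResult mirrors the response body shown in the Postman examples.
type apiResult struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// buildCheckRequest prepares a GET request carrying a JSON body,
// matching the calls verified above.
func buildCheckRequest(baseURL, provider, apiKey, region string) (*http.Request, error) {
	body, err := json.Marshal(connectionCheck{APIKey: apiKey, Region: region})
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/api/v1/providers/" + url.PathEscape(provider) + "/connection"
	req, err := http.NewRequest(http.MethodGet, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// decodeResult parses a response payload into apiResult.
func decodeResult(raw []byte) (apiResult, error) {
	var r apiResult
	err := json.Unmarshal(raw, &r)
	return r, err
}

func main() {
	req, err := buildCheckRequest("http://127.0.0.1:9384", "gitee", "XXXX", "default")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())

	res, err := decodeResult([]byte(`{"code": 0, "message": "success"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(res.Code, res.Message)
}
```

To actually send the request, pass `req` to `http.DefaultClient.Do` and read the response body before decoding it.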


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-06-02 19:32:41 +08:00
Alexander Laurent
a98889cd76 feat: add Go MCP server update API (#15261)
## What

Implementation of `PUT /api/v1/mcp/servers/:mcp_id` for #15240.

## Changes

- Adds the Go implementation for `PUT /api/v1/mcp/servers/:mcp_id`.
- Wires MCP service and handler into the Go server/router for the update
route.
- Preserves Python-style behavior for ownership checks, partial update
fields, MCP type/name/URL validation, `headers`/`variables`
normalization, and tool metadata scrubbing.
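
Partial updates of this kind are often modeled with optional fields that are applied only when present. The sketch below illustrates that idea; every name in it (`MCPServer`, `updateMCPRequest`, `applyUpdate`) and each validation rule is an assumption for illustration, and it does not reproduce the actual Python-compatible rules from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// MCPServer is a hypothetical, trimmed-down server record.
type MCPServer struct {
	Name    string
	URL     string
	Headers map[string]string
}

// updateMCPRequest uses nil to mean "field not provided".
type updateMCPRequest struct {
	Name    *string
	URL     *string
	Headers map[string]string
}

// applyUpdate validates the provided fields and applies them to s.
// Nothing is written to s unless every provided field is valid.
func applyUpdate(s *MCPServer, req updateMCPRequest) error {
	next := *s // work on a copy so a failed validation leaves s untouched
	if req.Name != nil {
		name := strings.TrimSpace(*req.Name)
		if name == "" {
			return errors.New("name must not be empty")
		}
		next.Name = name
	}
	if req.URL != nil {
		u, err := url.Parse(*req.URL)
		if err != nil || (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
			return errors.New("url must be an absolute http(s) URL")
		}
		next.URL = *req.URL
	}
	if req.Headers != nil {
		next.Headers = req.Headers
	}
	*s = next
	return nil
}

func main() {
	srv := &MCPServer{Name: "search", URL: "https://example.com/mcp"}
	newName := "  search-v2  "
	err := applyUpdate(srv, updateMCPRequest{Name: &newName})
	fmt.Println(srv.Name, srv.URL, err)
}
```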
2026-06-02 15:58:44 +08:00
Hz_
1092f624fb fix: post /api/v1/system/tokens (#15410)
### What problem does this PR solve?

This PR aligns `POST /api/v1/system/tokens` in Go with the Python
implementation.

### Type of change

- Keep the token creation flow under the system API route.
- Preserve the owner-tenant authorization check.
- Generate and persist API tokens consistently with the current Go
service flow.
- Return the created token payload in the standard API response format.

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2026-06-02 13:39:07 +08:00