ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 15:31:05 +08:00

Author	SHA1	Message	Date
Hz_	bc963f8cf2	refactor(go): replace GenerateUUID1 with GenerateToken for entity IDs (#16010 ) ### Description - Refactor: Replaced `utility.GenerateUUID1` (UUID v1) with `utility.GenerateToken` (UUID v4) for generating entity IDs (`userID`, `kbID`, `modelID`, etc.). - Cleanup: Removed the unused `GenerateUUID1` function from `utility` package. - Improvement: Simplified ID generation logic and eliminated unnecessary error handling boilerplate since `GenerateToken` cannot fail.	2026-06-15 14:06:07 +08:00
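A minimal sketch of an error-free UUID v4 generator, to illustrate why the call sites no longer need error handling. The name `generateToken` and the 32-character undashed hex output are assumptions for illustration; the actual `utility.GenerateToken` may be implemented differently.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateToken is a hypothetical stand-in for utility.GenerateToken.
// It returns a random (version 4) UUID as 32 lowercase hex characters
// and has no error return, so callers need no error-handling boilerplate.
func generateToken() string {
	var b [16]byte
	// A failing system random source is treated as fatal instead of
	// being surfaced to every caller.
	if _, err := rand.Read(b[:]); err != nil {
		panic(err)
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return hex.EncodeToString(b[:])
}

func main() {
	userID := generateToken()
	kbID := generateToken()
	fmt.Println("userID:", userID)
	fmt.Println("kbID:  ", kbID)
}
```

Unlike UUID v1, a v4 value carries no timestamp or node information, so nothing in the ID depends on the host that generated it.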
Yingfeng	b5bea72e4b	Add git-like file commit API (#15978 ) ### What problem does this PR solve? \| # \| Method \| Endpoint \| Description \| Git Equivalent \| \|---\|--------\|----------\|-------------\|----------------\| \| 1 \| `POST` \| `/api/v1/{prefix}/{folder_id}/commits` \| Create a snapshot commit with file changes (add/modify/delete/rename) \| `git add` + `git commit` \| \| 2 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits` \| List commit history (paginated) \| `git log` \| \| 3 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}` \| Get commit detail with file changes \| `git show` \| \| 4 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files` \| List file changes in a commit \| `git show --name-status` \| \| 5 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/diff?from=...&to=...` \| Compare two commits and return differences \| `git diff` \| \| 6 \| `GET` \| `/api/v1/{prefix}/{folder_id}/changes` \| Get uncommitted changes (add/modify/delete) \| `git status` \| \| 7 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/tree` \| Get the folder tree snapshot at commit time \| `git ls-tree` \| \| 8 \| `GET` \| `/api/v1/{prefix}/{folder_id}/commits/{commit_id}/files/{file_id}/content` \| Get a file's content as it existed in a specific commit \| `git show HEAD:file` \| \| 9 \| `GET` \| `/api/v1/{prefix}/{file_id}/versions` \| Get version history for a specific file across all commits \| `git log -- file` \| Where `{prefix}/{id}` can be: - `folders/{folder_id}` — direct folder access - `workspaces/{workspace_id}` — alias of `folders/{folder_id}` - `datasets/{dataset_id}` — resolves to the dataset's folder - `memories/{memory_id}` — resolves to the memory's folder - `skills/{skill_id}` — resolves to the skill's folder ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2026-06-15 11:19:56 +08:00
Jin Hai	32d5c0039b	Go: refactor model API to accept model id (#15999 ) ### What problem does this PR solve? Not only is model_name@instance_name@provider_name accepted; model_id is now accepted as well. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-15 10:10:14 +08:00
Jin Hai	e89afbae21	Go: file parser config (#15989 ) ### What problem does this PR solve? Add parser config ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-13 19:40:43 +08:00
Jin Hai	d32e05d560	Go: add more file parser (#15979 ) ### What problem does this PR solve? Now we can parse 'pptx', 'ppt', 'doc', 'xls', 'xlsx' ``` RAGFlow(api/default)> parse file 'test.pptx'; Parsing PPTX file: test.pptx Document format: pptx ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-12 23:28:14 +08:00
Zhichang Yu	3fa15c0e2f	feat(agent): Go port — canvas engine, 22 components, DSL v2, 13 endpoints (#15952 ) Ports the agent canvas subsystem from Python to Go. ## What's included ### Canvas Engine (Phase 0/1) - State engine, scheduler, variable resolver, Redis checkpoint store, cancel protocol - 209 tests across canvas / component / io packages ### 22 Components (P0–P4) \| Tier \| Components \| \|---\|---\| \| P0 T1+T2+T3 \| LLM, Agent, ExitLoop, Switch, Categorize, Begin, Message, Invoke \| \| P1 T3 \| VariableAggregator, VariableAssigner, StringTransform, ListOperations, DataOperations \| \| P2 T3 \| Iteration, IterationItem, Loop, LoopItem \| \| P3 T3 \| UserFillUp, Fillup \| \| P4 T5 \| Browser, ExcelProcessor, DocsGenerator \| ### DSL v2 Schema (Phase 2.5) - Typed v2 in-memory model with v1-to-v2 auto-detect converter - v1 legacy field stripping per plan §2.11.7 ### HTTP Endpoints & Bug Fixes (Plans PR1–PR3) - DELETE SQL bug fix: gorm v2 `Where("id = ?", id).Delete(...)` pattern - CreateAgent validation: title/DSL required, duplicate check, 103 envelope - 13 new endpoints: templates, prompts, tags, sessions CRUD, chat/completions (SSE + non-stream stubs), rerun, test_db_connection, logs, webhook/logs - 756 Go unit tests (745 → 756, +18) - 17 → 0 Python integration test failures (test_agents.py + test_session_management/) ### Tools 21 eino tools: HTTPHelper, search tools, financial/data tools, mandatory stubs ### Infrastructure OTel observability, NATS message queue, DeepDoc gRPC client, SSRF guards, IDOR mitigation	2026-06-12 22:58:28 +08:00
bitloi	cafa0f2e4f	fix: SSE write timeout (#15852 ) ### What problem does this PR solve? Fixes #15840. The Go HTTP server sets `WriteTimeout: 120s`, which also applies to long-lived SSE responses. Existing Go streaming handlers did not clear the per-response write deadline, so streams that run longer than the server timeout can be terminated mid-response. This PR adds a small handler helper that clears the response write deadline for SSE requests and calls it only in existing Go streaming branches: - conversation completion streaming - provider chat streaming - provider transcription streaming - provider speech streaming The global server `WriteTimeout` remains unchanged for non-streaming requests. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Test plan - `/root/go/bin/go test ./internal/handler -run TestDisableWriteDeadlineForSSEAllowsLongLivedStream -count=1` - `/root/go/bin/go test ./internal/handler -count=1`	2026-06-12 20:49:34 +08:00
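The fix described above can be sketched with the standard library's `http.ResponseController`, which lets a handler clear the write deadline for a single response. The helper name `disableWriteDeadlineForSSE`, the handler, and the shortened 100 ms timeout are assumptions for illustration, not the project's code; the project's handlers may use a different HTTP framework.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// disableWriteDeadlineForSSE clears the per-response write deadline so the
// server-wide WriteTimeout no longer applies to this one response.
// Name and signature are illustrative.
func disableWriteDeadlineForSSE(w http.ResponseWriter) error {
	// The zero time.Time means "no deadline".
	return http.NewResponseController(w).SetWriteDeadline(time.Time{})
}

// streamHandler emits three SSE events over a period longer than the
// server's WriteTimeout.
func streamHandler(w http.ResponseWriter, r *http.Request) {
	if err := disableWriteDeadlineForSSE(w); err != nil {
		http.Error(w, "streaming not supported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	rc := http.NewResponseController(w)
	for i := 0; i < 3; i++ {
		fmt.Fprintf(w, "data: tick %d\n\n", i)
		_ = rc.Flush()
		select {
		case <-r.Context().Done():
			return // client went away
		case <-time.After(60 * time.Millisecond):
		}
	}
}

// demo starts a test server with a short WriteTimeout and reads the stream.
func demo() (string, error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(streamHandler))
	srv.Config.WriteTimeout = 100 * time.Millisecond // stands in for 120s
	srv.Start()
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("stream failed:", err)
		return
	}
	fmt.Print(body)
}
```

Because the deadline is cleared per response, non-streaming handlers on the same server keep the global `WriteTimeout`.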
Jin Hai	234f1b7cff	Go: add office_oxide and parse docx file. (#15976 ) ### What problem does this PR solve? As title. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-12 20:28:15 +08:00
Haruko386	547139da29	fix(Go-models): preserve model name lookup when aliases exist (#15969 ) ### What problem does this PR solve? As title ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Documentation Update	2026-06-12 19:15:28 +08:00
Yingfeng	5a7d7771a3	Decouple skill space from Python API (#15971 ) ### What problem does this PR solve? Make skill space independent of Python filesystem API ### Type of change - [x] Refactoring	2026-06-12 18:18:55 +08:00
Jin Hai	115b730d07	Go: parse ingestion DSL (#15938 ) PR #15938 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-12 17:58:36 +08:00
bitloi	22a058f56c	fix(go): redact internal handler errors (#15746 ) ### What problem does this PR solve? Refs #15743 Some Go API handlers return raw `err.Error()` strings in `CodeServerError` responses. Those errors can include internal backend details such as database, storage, search engine, or host information. This PR adds a small shared `jsonInternalError` helper for handler-level internal failures. The helper logs the raw error server-side with request method/path context, then returns the existing generic `common.CodeServerError.Message()` to API clients. This first slice migrates the existing `jsonError(c, common.CodeServerError, err.Error())` production call sites in agent, dataset graph, file, and system handlers. It intentionally does not close the full issue because direct `c.JSON` error responses in other handlers remain for follow-up PRs. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Tests - `/root/go/bin/go test ./internal/handler -count=1` --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-12 16:09:10 +08:00
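A rough sketch of the redaction pattern described above: log the raw error with request context on the server, and send only a generic message to the client. It uses plain `net/http` rather than the project's handler framework, and the message text, numeric code, and HTTP status are assumptions, not the values of `common.CodeServerError`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// genericServerErrorMessage is an assumed placeholder for the generic
// message the real helper obtains from common.CodeServerError.Message().
const genericServerErrorMessage = "internal server error"

type errorBody struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// jsonInternalError logs the raw error with method/path context and writes
// a response that contains no internal detail.
func jsonInternalError(w http.ResponseWriter, r *http.Request, err error) {
	log.Printf("internal error: method=%s path=%s err=%v", r.Method, r.URL.Path, err)
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusInternalServerError) // status choice is an assumption
	_ = json.NewEncoder(w).Encode(errorBody{Code: 500, Message: genericServerErrorMessage})
}

// demoResponse shows what a client would receive for a backend failure.
func demoResponse() string {
	req := httptest.NewRequest(http.MethodGet, "/api/v1/files", nil)
	rec := httptest.NewRecorder()
	jsonInternalError(rec, req, errors.New("dial tcp 10.0.0.5:3306: connection refused"))
	return rec.Body.String()
}

func main() {
	fmt.Print(demoResponse())
}
```

The database host in the example error appears only in the server log; the response body carries the generic message.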
Jin Hai	e96bc37d06	Go: use NATS as the message queue (#15327 ) ### What problem does this PR solve? ``` RAGFlow(admin)> mq publish 'msg2'; SUCCESS RAGFlow(admin)> mq publish 'msg3'; SUCCESS RAGFlow(admin)> mq list; +---------+---------------+ \| message \| subject \| +---------+---------------+ \| msg1 \| tasks.RAGFLOW \| \| msg2 \| tasks.RAGFLOW \| \| msg3 \| tasks.RAGFLOW \| +---------+---------------+ RAGFlow(admin)> mq pull 2; +---------+---------------+ \| message \| subject \| +---------+---------------+ \| msg1 \| tasks.RAGFLOW \| \| msg2 \| tasks.RAGFLOW \| +---------+---------------+ RAGFlow(admin)> mq pull noack; +---------+---------------+ \| message \| subject \| +---------+---------------+ \| abc \| tasks.RAGFLOW \| +---------+---------------+ RAGFlow(admin)> mq show +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ \| ack_pending_count \| consumer_count \| memory \| message_count \| pending_count \| redelivered_count \| waiting_count \| +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ \| 2 \| 1 \| 0 \| 2 \| 0 \| 1 \| 0 \| +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ RAGFlow(admin)> list ingestors; +--------------+-------------------------------------------+--------+ \| host \| name \| status \| +--------------+-------------------------------------------+--------+ \| 192.168.1.38 \| ingestor-8f0e4bd5650a4ac58b0151969fbf6935 \| alive \| +--------------+-------------------------------------------+--------+ RAGFlow(admin)> list ingestion tasks; +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ \| document_id \| id \| status \| step \| user \| user_id \| 
+----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ \| ffe64fae423411f1a2d938a74640adcc \| 90d3d0f6528941c1ac8eb0360effccc4 \| COMPLETED \| 5 \| aaa@aaa.com \| 2ba4881420fa11f19e9c38a74640adcc \| +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ RAGFlow(admin)> remove ingestion tasks '90d3d0f6528941c1ac8eb0360effccc4'; +---------+----------------------------------+ \| delete \| task_id \| +---------+----------------------------------+ \| success \| 90d3d0f6528941c1ac8eb0360effccc4 \| +---------+----------------------------------+ RAGFlow(admin)> stop ingestion tasks 'e89e20d9a25848a1b79bd9345ddbfe1d'; +----------+----------------------------------+ \| status \| task_id \| +----------+----------------------------------+ \| STOPPING \| e89e20d9a25848a1b79bd9345ddbfe1d \| +----------+----------------------------------+ # Publish a message RAGFlow(admin)> mq publish 'cdd'; SUCCESS # List current tasks in the message queue RAGFlow(admin)> mq list +----------------------------------+---------------+ \| message \| subject \| +----------------------------------+---------------+ \| 7ce392a3c1624cd2be4b5276e8825059 \| tasks.RAGFLOW \| +----------------------------------+---------------+ # Consume a task from the message queue RAGFlow(admin)> mq pull +------+-----+----------------+ \| ack \| id \| type \| +------+-----+----------------+ \| true \| cdd \| ingestion_test \| +------+-----+----------------+ # User mode # List ingestion tasks, followed by dataset id RAGFlow(user)> list ingestion tasks from '0abe79f9423311f1ad8d38a74640adcc'; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| create_date \| 
create_time \| dataset_id \| document_id \| id \| schema \| status \| update_date \| update_time \| user_id \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| 2026-05-30T20:21:06+08:00 \| 1780143666289 \| 0abe79f9423311f1ad8d38a74640adcc \| ffe64fae423411f1a2d938a74640adcc \| 8d758cd14a8b4ba8ab505003fb52017d \| \| COMPLETED \| 2026-05-30T20:21:26+08:00 \| 1780143686431 \| 2ba4881420fa11f19e9c38a74640adcc \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ RAGFlow(user)> list ingestion tasks; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| create_date \| create_time \| dataset_id \| document_id \| id \| schema \| status \| update_date \| update_time \| user_id \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| 2026-06-02T19:02:31+08:00 \| 1780398151417 \| 0abe79f9423311f1ad8d38a74640adcc \| ffe64fae423411f1a2d938a74640adcc \| e89e20d9a25848a1b79bd9345ddbfe1d \| \| COMPLETED \| 2026-06-02T19:02:52+08:00 \| 1780398172208 \| 2ba4881420fa11f19e9c38a74640adcc \| 
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ # Create an ingestion task # First argument is document id, second argument is dataset id RAGFlow(user)> start ingestion 'ffe64fae423411f1a2d938a74640adcc' from '0abe79f9423311f1ad8d38a74640adcc'; +----------------------------------+-------------------------------------------+ \| document_id \| result \| +----------------------------------+-------------------------------------------+ \| ffe64fae423411f1a2d938a74640adcc \| task_id: 8d758cd14a8b4ba8ab505003fb52017d \| +----------------------------------+-------------------------------------------+ # Pause an ingestion task, first argument is ingestion id RAGFlow(user)> stop ingestion '8d758cd14a8b4ba8ab505003fb52017d'; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| create_date \| create_time \| dataset_id \| document_id \| id \| schema \| status \| update_date \| update_time \| user_id \| +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ \| 2026-05-30T20:21:06+08:00 \| 1780143666289 \| 0abe79f9423311f1ad8d38a74640adcc \| ffe64fae423411f1a2d938a74640adcc \| 8d758cd14a8b4ba8ab505003fb52017d \| \| COMPLETED \| 2026-05-30T20:21:26+08:00 \| 1780143686431 \| 2ba4881420fa11f19e9c38a74640adcc \| 
+---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ # Delete an ingestion task RAGFlow(api/default)> remove ingestion tasks 'f366450a27d54677aec1c7090add30f0'; +---------+----------------------------------+ \| remove \| task_id \| +---------+----------------------------------+ \| success \| f366450a27d54677aec1c7090add30f0 \| +---------+----------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-12 14:56:44 +08:00
JPette1783	daa3811165	feat(models): add shared HTTP client, SSE parser, and stub helpers for Go model drivers (#15821 ) ### What problem does this PR solve? The Go model-driver layer () has ~38,700 lines across 109 files. Roughly 74% of that is boilerplate duplicated into every driver: identical HTTP client setup, the same 65-line SSE scanner loop, and 10-11 one-line "not supported" stub methods per driver. Any fix must be manually propagated to every file. Closes #15820. This PR establishes the three shared utility files that form the foundation for incremental driver migration: --- ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring --------- Co-authored-by: Haruko386 <tryeverypossible@163.com>	2026-06-11 19:20:12 +08:00
Haruko386	9c30557ef7	Go: add dimensions for list models and fix some embed-bug in providers (#15940 ) ### What problem does this PR solve? As title ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-06-11 19:18:49 +08:00
Hz_	312514c032	feat(go): Add embedding dimension metadata and validation (#15939 ) ### What problem does this PR solve? - Replace embedding model `dimension` metadata with `max_dimension`. - Add optional `dimensions` metadata for models with fixed selectable output dimensions. - Include `max_dimension` and `dimensions` in model list responses. - Validate requested embedding dimensions before calling provider embedding APIs. - Forward SiliconFlow embedding dimensions with the correct `dimensions` request field. - Add unit coverage for embedding dimension validation rules.	2026-06-11 17:55:13 +08:00
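One way the dimension check could look, as a sketch only. The struct, the function name, and the exact rules (a request of 0 means "use the model default"; a non-empty `dimensions` list takes precedence over `max_dimension`) are assumptions and may not match the PR's validation rules.

```go
package main

import "fmt"

// EmbeddingMeta mirrors the two metadata fields named in the PR.
// The Go field names are illustrative.
type EmbeddingMeta struct {
	MaxDimension int   // max_dimension; 0 means unknown
	Dimensions   []int // optional fixed set of selectable dimensions
}

// validateDimension checks a requested output dimension before any provider
// call is made. A request of 0 is treated as "not specified".
func validateDimension(meta EmbeddingMeta, requested int) error {
	if requested == 0 {
		return nil
	}
	if requested < 0 {
		return fmt.Errorf("dimension must be positive, got %d", requested)
	}
	if len(meta.Dimensions) > 0 {
		for _, d := range meta.Dimensions {
			if d == requested {
				return nil
			}
		}
		return fmt.Errorf("dimension %d is not one of %v", requested, meta.Dimensions)
	}
	if meta.MaxDimension > 0 && requested > meta.MaxDimension {
		return fmt.Errorf("dimension %d exceeds max_dimension %d", requested, meta.MaxDimension)
	}
	return nil
}

func main() {
	fixed := EmbeddingMeta{MaxDimension: 1024, Dimensions: []int{256, 512, 1024}}
	fmt.Println(validateDimension(fixed, 512))
	fmt.Println(validateDimension(fixed, 300))
}
```

Rejecting an unsupported dimension locally avoids a round trip to the provider and gives the caller a clearer error.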
Jin Hai	ca00d23aac	Go: add parse and chunk command (#15936 ) ### What problem does this PR solve? Two commands are used for ingestion file testing ``` RAGFlow(api/default)> chunk 'file' with 'dsl'; Chunk file: file, DSL: dsl SUCCESS RAGFlow(api/default)> parse file 'filename' chat 'xxx'; Success to parse local file "filename", vision: , chat: xxx, asr: , ocr: , embedding: , doc_parse: SUCCESS ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-11 13:33:26 +08:00
Haruko386	84edf539e7	Go: Refactor list-models func (#15900 ) ### What problem does this PR solve? As title Issue: #15853 ### Type of change - [x] Refactoring	2026-06-11 13:32:50 +08:00
JPette1783	4b10c0b885	fix(go-models): guard nil pointers in DeepSeek and VolcEngine streaming (#15817 ) ### What problem does this PR solve? `ChatStreamlyWithSender` in two Go model drivers could panic on nil pointer dereferences when a caller passes a nil model config or omits the reasoning `Effort`: - deepseek.go - `switch chatModelConfig.Effort` dereferenced `Effort` without a nil check. It now defaults to `"high"` when nil. - volcengine.go - the `modelConfig` pointer itself was dereferenced (`Stream`, `MaxTokens`, `Temperature`, ...) with no guard, and `Effort` was dereferenced unchecked. `modelConfig` now defaults to an empty `&ChatConfig{}` when nil so the optional-field accesses are safe, and `Effort` defaults to `"medium"` when nil. Addresses the CodeRabbit review on `volcengine.go` `ChatStreamlyWithSender`. Per maintainer feedback ("one PR do one thing"), the unrelated `handler/auth.go` and `service/heartbeat_sender.go` changes were removed so this PR is scoped to the model-provider fixes. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-11 13:32:24 +08:00
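The guard pattern described in this commit can be sketched as follows. The `ChatConfig` fields shown are the ones the message names; their pointer types and the helper `normalizeConfig` are assumptions for illustration.

```go
package main

import "fmt"

// ChatConfig holds optional settings as pointers so "unset" is
// distinguishable from a zero value. Field types are assumed.
type ChatConfig struct {
	Stream      *bool
	MaxTokens   *int
	Temperature *float64
	Effort      *string
}

// normalizeConfig replaces a nil config with an empty one and resolves the
// reasoning effort, falling back to a driver-specific default when unset.
func normalizeConfig(cfg *ChatConfig, defaultEffort string) (*ChatConfig, string) {
	if cfg == nil {
		cfg = &ChatConfig{} // later optional-field checks are now safe
	}
	effort := defaultEffort
	if cfg.Effort != nil {
		effort = *cfg.Effort
	}
	return cfg, effort
}

func main() {
	// A nil config no longer panics; the VolcEngine-style default applies.
	cfg, effort := normalizeConfig(nil, "medium")
	fmt.Println(cfg.Stream == nil, effort)

	// An explicit effort overrides the DeepSeek-style default.
	low := "low"
	_, effort = normalizeConfig(&ChatConfig{Effort: &low}, "high")
	fmt.Println(effort)
}
```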
Rene Arredondo	bf59eb77cc	feat(go-api): port forgot-password flow to Go (#15282 ) (#15290 ) ## Summary Implements chunk 1 of #15282 — the four `/api/v1/auth/password/...` endpoints from the login-page Go port. Chunk 2 (OAuth/OIDC) is deferred to its own subtask, matching the issue author's own confidence-low recommendation ("multi-provider, stateful redirect flow with external dependencies; recommend its own subtask"). New endpoints, all registered under `apiNoAuth` (forgot-password users are unauthenticated by definition): \| Method \| Path \| Status \| \|--------\|------\|--------\| \| `POST` \| `/api/v1/auth/password/forgot/captcha` \| new \| \| `POST` \| `/api/v1/auth/password/forgot/otp` \| new \| \| `POST` \| `/api/v1/auth/password/forgot/otp/verify` \| new \| \| `POST` \| `/api/v1/auth/password/reset` \| new \| ## Wire compatibility with the Python backend The two backends share state through Redis, so the Go port had to use identical keys, encodings, and constants. Either backend can now validate a code the other minted. - Redis keys: `captcha:<email>`, `otp:<email>`, `otp_attempts:<email>`, `otp_last_sent:<email>`, `otp_lock:<email>`, `otp:verified:<email>` — same as `api/utils/web_utils.py`. - Stored OTP value: `"<hex_hash>:<hex_salt>"` — same as Python. - Hash: HMAC-SHA256 with a `crypto/rand` 16-byte salt — same as `hash_code()`. - Constants: `OTP_LENGTH=4`, `OTP_TTL=5min`, `ATTEMPT_LIMIT=5`, `ATTEMPT_LOCK_SECONDS=30min`, `RESEND_COOLDOWN_SECONDS=60s` — all match `api/utils/web_utils.py`. - Email body: matches `RESET_CODE_EMAIL_TMPL` byte-for-byte. ## Files ### New \| File \| Purpose \| \|---\|---\| \| `internal/utility/otp.go` \| OTP/captcha constants, Redis key builders (`CaptchaRedisKey`, `OTPRedisKeys`, `OTPVerifiedRedisKey`), `HashOTPCode`, `GenerateOTPCode` / `GenerateCaptchaCode` / `GenerateOTPSalt` via `crypto/rand`, and `EncodeOTPStorageValue` / `DecodeOTPStorageValue` matching Python's storage shape. 
\| \| `internal/utility/smtp.go` \| Minimal stdlib `net/smtp` sender. `SendResetCodeEmail(to, otp, ttlMin)` builds an RFC 5322 plain-text message and dispatches via implicit TLS / STARTTLS / plain — same selectors as Python `aiosmtplib`. Returns `SMTPNotConfiguredError` if the config block is empty. \| ### Modified \| File \| Change \| \|---\|---\| \| `internal/server/config.go` \| New `SMTPConfig` struct + `Config.SMTP` field. Field names mirror the `smtp:` keys in `common/settings.py` (`mail_server`, `mail_port`, `mail_use_ssl`, `mail_use_tls`, `mail_username`, `mail_password`, `mail_from_name`, `mail_from_address`, `mail_frontend_url`) so a single `conf/service_conf.yaml` powers both backends. \| \| `internal/service/user.go` \| Four methods — `ForgotIssueCaptcha`, `ForgotSendOTP`, `ForgotVerifyOTP`, `ForgotResetPassword`. Reuses the existing `decryptPassword`, `HashPassword`, `userDAO.Update`, and `utility.GenerateToken` so the reset+auto-login path is identical to `LoginByEmail`. \| \| `internal/handler/user.go` \| Four handlers in the same `c.JSON` shape as `LoginByEmail`. The reset handler rotates the access token and emits an `Authorization` header for auto-login (matches Python `construct_response(auth=user.get_id())`). \| \| `internal/router/router.go` \| Routes registered under `apiNoAuth`, with an explanatory comment on why they sit outside the auth middleware. \| ## Known divergence — captcha rendering The Python endpoint returns a rendered `image/JPEG` from the `python-captcha` library. The Go side has no image-captcha dependency vendored in `go.mod`, and hand-rolling a raster generator was out of scope for this PR. This commit returns JSON `{captcha: "<text>"}` instead. Implications: - Backend gate is identical — the OTP step still verifies the user-submitted captcha string against the Redis value, so the security model is unchanged. 
- Frontend impact: the password-reset page rendering needs a small tweak (text display instead of `<img>`) until a Go captcha library is wired in. - The handler comments call this out explicitly so the next PR knows what to swap. Possible follow-ups (any one closes the gap): 1. Add `github.com/mojocn/base64Captcha` or `github.com/dchest/captcha` to `go.mod` and replace the JSON response with an `image/JPEG`. 2. Hand-roll a 5x7 bitmap font + `image/png` writer using only the stdlib. 3. Render a server-side SVG (cheap, but trivially OCR-able — only useful as a UI shim). ## Test plan - [ ] Captcha: `POST /api/v1/auth/password/forgot/captcha?email=<existing>` returns `{code: 0, data: {captcha: "ABCD"}}`. Redis shows `captcha:<email>` with that value and ~60s TTL. Unknown email returns `code: CodeDataError`. - [ ] OTP send: `POST /api/v1/auth/password/forgot/otp` with the right captcha mints an OTP, stores `<hash>:<salt>` under `otp:<email>` for 5 min, sends an email, returns success. With a wrong captcha returns `CodeAuthenticationError`. Hitting it again within 60s returns "you still have to wait …" with `CodeNotEffective`. - [ ] OTP verify: correct OTP → `code: 0`, OTP keys cleared, `otp:verified:<email>` = `"1"`. Wrong OTP → `code: CodeAuthenticationError`, attempt counter bumped; after 5 wrong tries `otp_lock:<email>` is set and further attempts hit `CodeNotEffective`. - [ ] Reset: with the verified flag set, supply a new password (RSA-encrypted+base64, same as `LoginByEmail`). Returns `code: 0`, `Authorization` header set, verified flag deleted. Without the verified flag returns `CodeAuthenticationError`. - [ ] Wire-compat smoke: mint an OTP from the Python backend, verify it via the Go endpoint, and vice versa. Should both succeed. - [ ] SMTP misconfigured: drop `smtp.mail_server` from `conf/service_conf.yaml`. The OTP-send endpoint should now return "failed to send email" without panicking; check the log for the `SMTPNotConfiguredError` warning. 
- [ ] End-to-end FE: hit the password-reset flow from `web/src/pages/login-next/`. Confirm the text-captcha shim works after the FE tweak. - [ ] `go build ./...` and `go vet ./...` — I could not run these in the sandbox; please confirm a clean build before merging. - [ ] `uv run pytest` to confirm no Python regressions (shared Redis schema). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-06-10 21:27:56 +08:00
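A sketch of the OTP storage shape described above (`"<hex_hash>:<hex_salt>"`, HMAC-SHA256, 16-byte random salt). Using the salt as the HMAC key and the code as the message is an assumption about how Python's `hash_code()` works, and the function names here are hypothetical rather than the ones in `internal/utility/otp.go`.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// macOTP computes HMAC-SHA256 over the code.
// Assumption: the salt is the HMAC key and the code is the message.
func macOTP(code string, salt []byte) []byte {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(code))
	return mac.Sum(nil)
}

// encodeStorage builds the value kept in Redis: "<hex_hash>:<hex_salt>".
func encodeStorage(code string, salt []byte) string {
	return hex.EncodeToString(macOTP(code, salt)) + ":" + hex.EncodeToString(salt)
}

// verifyOTP recomputes the MAC from the stored salt and compares it with
// the stored hash in constant time.
func verifyOTP(stored, submitted string) bool {
	hashHex, saltHex, ok := strings.Cut(stored, ":")
	if !ok {
		return false
	}
	want, err := hex.DecodeString(hashHex)
	if err != nil {
		return false
	}
	salt, err := hex.DecodeString(saltHex)
	if err != nil {
		return false
	}
	return hmac.Equal(want, macOTP(submitted, salt))
}

func main() {
	salt := make([]byte, 16)
	if _, err := rand.Read(salt); err != nil {
		panic(err)
	}
	stored := encodeStorage("4821", salt)
	fmt.Println("stored:", stored)
	fmt.Println("correct code:", verifyOTP(stored, "4821"))
	fmt.Println("wrong code:  ", verifyOTP(stored, "0000"))
}
```

Keeping the salt next to the hash is what lets either backend validate a code the other one minted, provided both use the same key, message, and encoding conventions.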
Jonathan Chang	dfcf226ba3	feat: Implement API of ragflow server in Go (#15256 ) ## Summary - Implemented the Go API endpoint for Memory message forgetting: - `DELETE /api/v1/messages/{memory_id}:{message_id}` - Added route registration for the Memory message DELETE endpoint only. - Added request path validation for `memory_id:message_id`. - Added service logic to mark a message as forgotten by setting `forget_at`. - Preserved Python-compatible response behavior: - Success returns `code: 0`, `message: true`, `data: null`. - Added focused unit tests for message path parsing and invalid message ID handling. - Fixed Linux cgo linker config to use the installed shared PCRE2 library so Go tests/builds can run in this environment. ## Related Issue Closes: #15240 ## Change Type - [x] Feature - [x] Test - [x] Build / CI compatibility ## Implemented API - `DELETE /api/v1/messages/{memory_id}:{message_id}` ## Real Behavior Proof Validated with targeted Go tests: ```bash /tmp/go1.25.0/bin/go test ./internal/handler ./internal/router ``` Result: ```text ok ragflow/internal/handler ? ragflow/internal/router [no test files] ``` Validated server entrypoint build: ```bash /tmp/go1.25.0/bin/go build -o /tmp/ragflow-server-main ./cmd/server_main.go ``` Result: ```text build succeeded ``` Validated patch formatting: ```bash git diff --check ``` Result: ```text no whitespace errors ``` ## Checklist - [x] Implemented only `DELETE /api/v1/messages/{memory_id}:{message_id}`. - [x] Did not implement unrelated Memory message APIs. - [x] Added route registration. - [x] Added handler validation. - [x] Added service-level memory access check. - [x] Added tests. - [x] Ran targeted Go tests. - [x] Ran server build validation. - [x] Ran `git diff --check`.	2026-06-10 21:27:35 +08:00
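A sketch of validating the combined `{memory_id}:{message_id}` path segment. Treating the message ID as a non-negative integer is an assumption, as are the function and error names; the PR's handler may apply different rules.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// errInvalidMessagePath is a hypothetical sentinel for malformed segments.
var errInvalidMessagePath = errors.New("expected {memory_id}:{message_id}")

// parseMessagePath splits "memory_id:message_id" and validates both parts.
// Assumption: message_id is a non-negative base-10 integer.
func parseMessagePath(segment string) (string, int64, error) {
	memoryID, rawMessageID, ok := strings.Cut(segment, ":")
	if !ok || memoryID == "" || rawMessageID == "" {
		return "", 0, errInvalidMessagePath
	}
	messageID, err := strconv.ParseInt(rawMessageID, 10, 64)
	if err != nil || messageID < 0 {
		return "", 0, fmt.Errorf("invalid message id %q: %w", rawMessageID, errInvalidMessagePath)
	}
	return memoryID, messageID, nil
}

func main() {
	for _, segment := range []string{"mem123:42", "mem123", "mem123:abc", ":7"} {
		memoryID, messageID, err := parseMessagePath(segment)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", segment, err)
			continue
		}
		fmt.Printf("%q -> memory=%s message=%d\n", segment, memoryID, messageID)
	}
}
```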
Jin Hai	3e4fb8cf1c	Go: fix test and remove unused code (#15909 ) ### What problem does this PR solve? 1. Fix go test, some cases still failed. 2. Remove unused code. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-10 20:38:43 +08:00
Hz_	515acf4f60	fix(go): Fix case-insensitive model alias lookup (#15911 ) ## Summary - Normalize model alias index keys to lowercase - Detect lowercase alias collisions during provider manager initialization - Fix ListModels metadata mapping for mixed-case provider aliases	2026-06-10 20:36:43 +08:00
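The lowercase alias index with collision detection could be sketched like this. The `Model` type and function names are hypothetical; the provider manager's real data structures are not shown in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// Model is an illustrative stand-in for a model metadata entry.
type Model struct {
	Name    string
	Aliases []string
}

// buildAliasIndex maps every lowercased name and alias to its canonical
// model name. It fails if two different models claim the same lowercase key.
func buildAliasIndex(models []Model) (map[string]string, error) {
	index := make(map[string]string)
	for _, m := range models {
		keys := append([]string{m.Name}, m.Aliases...)
		for _, k := range keys {
			key := strings.ToLower(k)
			if owner, exists := index[key]; exists && owner != m.Name {
				return nil, fmt.Errorf("alias %q of %q collides with %q", k, m.Name, owner)
			}
			index[key] = m.Name
		}
	}
	return index, nil
}

// lookup resolves a name or alias regardless of case.
func lookup(index map[string]string, name string) (string, bool) {
	canonical, ok := index[strings.ToLower(name)]
	return canonical, ok
}

func main() {
	index, err := buildAliasIndex([]Model{
		{Name: "deepseek-v3", Aliases: []string{"DeepSeek-V3", "deepseek-ai/DeepSeek-V3"}},
		{Name: "deepseek-r1", Aliases: []string{"DeepSeek-R1"}},
	})
	if err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println(lookup(index, "DEEPSEEK-ai/deepseek-v3"))
}
```

Detecting collisions at initialization turns an ambiguous lookup at request time into a startup error.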
Hunnyboy1217	16d5b4fa02	feat[Go]: implement POST /api/v1/files/link-to-datasets (#15674 ) ### What problem does this PR solve? Closes #15673 — ports the Python `file2document_api.py` `convert()` endpoint to Go. \| Method \| Path \| Handler \| \|--------\|------\|---------\| \| POST \| `/api/v1/files/link-to-datasets` \| `FileHandler.LinkToDatasets` \| ### Type of change - [x] New Feature (non-breaking change which adds functionality) --- #### Implementation notes Files changed: ``` internal/service/file2document.go – new service (File2DocumentService) internal/dao/file2document.go – added Create method internal/handler/file.go – FileHandler gains file2DocumentService; LinkToDatasets HTTP handler internal/router/router.go – route registered ``` Functional parity table: \| Concern \| Go behaviour \| \|---------\|-------------\| \| Required fields \| `file_ids` and `kb_ids` both required; missing either → `CodeDataError` mirroring Python `@validate_request` \| \| File existence \| `fileDAO.GetByIDs(fileIDs)` builds a set; any missing ID → `"File not found!"` \| \| KB existence \| `kbDAO.GetByID(kbID)` per KB; missing → `"Can't find this dataset!"` \| \| Folder expansion \| `getAllInnermostFileIDs` recursively calls `fileDAO.ListByParentID` — mirrors `FileService.get_all_innermost_file_ids` \| \| File permissions \| `checkFileTeamPermission`: `file.TenantID == userID` OR user in tenant's team — mirrors `check_file_team_permission` \| \| KB permissions \| `checkKBTeamPermission`: `kb.TenantID == userID` OR user in tenant's team — mirrors `check_kb_team_permission` \| \| Fire-and-forget \| `go convertFiles(...)` goroutine after all validation passes — mirrors `loop.run_in_executor(None, _convert_files, …)` \| \| Conversion \| `convertFiles`: for each file → delete existing mappings + hard-delete old documents → create new `Document` in each target KB → create `File2Document` mapping — mirrors Python `_convert_files` \| \| `getParser` \| Extension-based lookup with 
fallback to `kb.ParserID` — mirrors `FileService.get_parser` \| \| Immediate return \| `true` returned to caller as soon as goroutine is scheduled \| --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-10 16:46:55 +08:00
Hz_	3796835c4d	feat(go-api): migrate agent file download handler to Go with strict P… (#15769 ) ## What does this PR do? This PR migrates the Agent Temporary File Download endpoint (`GET /api/v1/agents/download`) from the Python backend to the Go backend, optimizing the data retrieval flow and maintaining strict functional parity. It also fixes a persistent parsing error in the Sandbox code execution node. ## Checklist - [x] Code logic matches Python implementation - [x] All local unit tests passed - [x] No breaking changes to existing router interfaces Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-10 16:09:36 +08:00
Jin Hai	139f4515e8	Go: refactor CLI (#15898 ) ### What problem does this PR solve? 1. remove unused code 2. fix login issue ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-10 16:06:30 +08:00
Hz_	38755c705a	feat(go): Add DeepSeek models and Gitee alias metadata tests (#15885 ) This PR expands conf/all_models.json with DeepSeek model entries and provider aliases. Changes: - Added DeepSeek model entries across `V4`, `V3.2`, `V3.1`, `V3`, `R1`, `Coder`, `Math`, `VL`, `OCR`, `Prover`, `MoE`, and `LLM` series. - Normalized model name values to lowercase canonical IDs. - Added alias values for official DeepSeek/Hugging Face names and provider-specific names from OpenRouter, VolcEngine, SiliconFlow, HuaweiCloud, and QiniuCloud. - Preserved model metadata such as max_tokens, model_types, and thinking where applicable. - Added Gitee ListModels tests to verify DeepSeek aliases map back to model metadata from all_models.json. - Added an optional Gitee integration test gated by GITEE_LIST_MODELS_INTEGRATION=1. Test: /usr/local/go/bin/go clean -cache /usr/local/go/bin/go test ./internal/entity/models -run 'TestGiteeListModels(MapsAllDeepSeekAliasesToModelMetadata\|KeepsOwnedBySuffixAfterAliasMetadataLookup\| Integration)'	2026-06-10 13:59:23 +08:00
oktofeesh	bbc1f2ecec	feat(go-api): add RAG retrieval to chat completions (#15739 ) ## Summary - Add knowledge-base retrieval support to Go chat completions. ## What changed - Routes KB-backed chat sessions through the Go retrieval service instead of falling back to solo chat. - Resolves embedding and rerank models, validates accessible knowledge bases, and preserves tenant-aware retrieval. - Rejects mixed embedding models across selected knowledge bases before retrieval to avoid incompatible vector dimensions. - Threads the HTTP request context into streaming retrieval so cancelled requests can stop downstream retrieval work. - Applies metadata filters and message-level `doc_ids` before retrieval. - Expands parent/child chunks before building references and prompt context. - Injects retrieved knowledge through a copied dialog prompt config so the caller's original dialog is not mutated. - Honors configured empty responses when no chunks are found. - Names the metadata no-match sentinel and reuses it across retrieval/handler paths. - Adds a defensive content cast while appending streamed answers. - Adds focused unit coverage for retrieval, metadata filtering, authorization, multimodal messages, references, empty-response behavior, prompt immutability, and mixed embedding models. --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 11:07:45 +08:00
Jin Hai	7c1bd9a5a5	Go CLI: switch to admin/api server (#15861 ) ### What problem does this PR solve? ``` RAGFlow(api/default)> use admin SUCCESS RAGFlow(api/default)> use api 'abc'; SUCCESS ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-10 10:57:00 +08:00
Haruko386	d56aeb2f5d	feat[Go]: api datasets/<dataset_id>/documents/<document_id>/metadata/… (#15846 ) ### What problem does this PR solve? As title ``` /api/v1/datasets/<dataset_id>/documents/<document_id>/metadata/config PUT ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-10 09:57:11 +08:00
Haruko386	a396b1ace2	feat[Go]: implement /api/v1/agents/<agent_id> and test_db_connection (#15771 ) ### What problem does this PR solve? Adds two APIs in Go: ``` /api/v1/agents/test_db_connection POST /api/v1/agents/<agent_id>/sessions DELETE ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-06-10 09:54:07 +08:00
Jack	87b8062df4	feat: implement POST /api/v1/searchbots/ask — streaming RAG with citations and think-tag processing (#15825 ) Implements POST /api/v1/searchbots/ask in Go with streaming SSE, citations, and think-tag processing. 23 files, 90+ unit tests. --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 22:48:50 +08:00
Jack	2f99d52fb5	fix(ci): re-enable Go tests and fix compilation errors after ListModels signature change (#15862 ) ## Summary This PR re-enables the Go test steps in CI that were previously commented out, and fixes all compilation errors that have accumulated in `internal/entity/models/` since the `ListModels` return type was changed from `[]string` to `[]ListModelResponse`. ## Changes ### CI (`.github/workflows/tests.yml`) - Re-enable Prepare test resources step (clones resource repo with WordNet data) - Re-enable Test Go packages step (runs `go test ./internal/...`) - Fix resource path race condition by using `/tmp/resource-${GITHUB_RUN_ID}` instead of `/tmp/resource` - Exclude `/cli` package from Go tests (contains `main` redeclarations) ### Test fixes (16 model provider test files) All errors were caused by the upstream change from `[]string` to `[]ListModelResponse` in the `ListModels` interface: - Add `joinModelNames` test helper to extract `.Name` from `[]ListModelResponse` slices - `strings.Join(models, ",")` → `joinModelNames(models, ",")` (11 files) - `ids[i] != "..."` → `ids[i].Name != "..."` (cometapi, mistral) - `got[i] != want[i]` → `got[i].Name != want[i]` (bedrock) - `[]string` return types → `[]ListModelResponse` (google) ### Pre-existing bugs in model_test.go Bugs introduced by the upstream `entity/` → `entity/models/` directory rename: - Add missing `pm := GetProviderManager()` calls in 3 test functions - Fix `InitProviderManager` signature (`_, err :=` → `err :=`) - Fix `MaxTokens` `*int` dereference (6 comparisons) - Fix `readProviderConfig` relative path (3 levels up instead of 2) ### model.go - Add `findRepoRoot()` to make `conf/all_models.json` resolution work from any CWD, fixing `TestSiliconFlowProviderConfigLoadsLatestProModels` ### Test validation ```bash go build ./internal/... # ✅ go test ./internal/entity/models/... 
-count=1 # ✅ all pass ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 21:12:15 +08:00
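The `joinModelNames` test helper named in commit 2f99d52fb5 could look roughly like the sketch below. The `ListModelResponse` struct here is a stand-in that carries only the `Name` field the commit message mentions; the real type and the helper's actual body are not shown in the log, so treat both as assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// ListModelResponse is a hypothetical stand-in for the real response type;
// only the Name field is taken from the commit message.
type ListModelResponse struct {
	Name string
}

// joinModelNames collects the Name of every entry and joins them with sep,
// replacing the old strings.Join call on a []string.
func joinModelNames(models []ListModelResponse, sep string) string {
	names := make([]string, 0, len(models))
	for _, m := range models {
		names = append(names, m.Name)
	}
	return strings.Join(names, sep)
}

func main() {
	models := []ListModelResponse{{Name: "model-a"}, {Name: "model-b"}}
	fmt.Println(joinModelNames(models, ","))
}
```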
ghost	64b860f771	fix(elasticsearch): complete Go result functions (#15148 ) ## Summary - Complete the Go Elasticsearch result functions that remained stubbed after #15160. - Add focused unit coverage for field mapping, aggregation, IDs, and highlighting behavior. - Update a stale query-builder test type import discovered during validation. ## What changed - Keep the Elasticsearch Go implementation merged in #15160 and fill in `GetFields`, `GetAggregation`, `GetHighlight`, and `GetDocIDs` in `internal/engine/elasticsearch/chunk.go`. - Add regression and invariant coverage in `internal/engine/elasticsearch/chunk_helpers_test.go`. - Update `internal/service/nlp/query_builder_test.go` to use the current `types.MatchTextExpr` type. ## Why - #15160 implemented the main Go Elasticsearch surface, but retrieval/tag flows still call result functions that returned stubs. - Completing these functions keeps Elasticsearch result processing aligned with the expected document-engine behavior for field extraction, tag aggregation, doc ID extraction, and snippet highlighting. ## Validation - `go test ./internal/engine/elasticsearch` - `GOARCH=arm64 CGO_ENABLED=1 go test ./internal/service/nlp -run TestQueryBuilder` - `git diff --check` - CodeRabbit review reported 0 issues after follow-up fixes. - Codex Security diff scan found no reportable issues. ## Notes - This PR is now a follow-up to #15160 rather than a competing implementation. - A full local `go test ./internal/service/nlp` run is blocked by local WordNet resource prerequisites; the query-builder tests touched by this PR pass with the arm64 CGO path.	2026-06-09 20:10:11 +08:00
JPette1783	acae932938	fix(go): guard four nil-pointer dereferences causing runtime panics (#15815 ) ### What problem does this PR solve? Fixes four Go paths that dereference a pointer with no prior nil check, each causing a runtime panic. Closes #15814. \| # \| File \| Bug \| Fix \| \|---\|------\|-----\|-----\| \| 1 \| `internal/entity/models/deepseek.go` \| streaming path runs `switch chatModelConfig.Effort` inside `if Thinking`; panics when `Thinking=true` and `Effort==nil` \| nil-check with default `"high"`, matching the non-streaming path in the same file \| \| 2 \| `internal/entity/models/volcengine.go` \| identical oversight: `switch modelConfig.Effort` with no guard \| nil-check with default `"medium"`, matching its non-streaming path \| \| 3 \| `internal/handler/auth.go` \| `AuthMiddleware` does `if user.IsSuperuser`; panics on every authenticated request when the DB column is `NULL` \| guard with `user.IsSuperuser != nil &&`, matching every other call site (`admin/handler.go`, `admin/service.go`, `user.go`) \| \| 4 \| `internal/service/heartbeat_sender.go` \| `responseBody["code"].(float64)` panics on any non-200 response lacking a numeric `code`; the upstream `recover()` calls `Fatal()` → `os.Exit(1)`, taking down the whole server \| comma-ok assertion (`code, ok := ...`); return an error instead of panicking \| - [x] Bug Fix (non-breaking change which fixes an issue)	2026-06-09 19:29:25 +08:00
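Two of the guard patterns described in commit acae932938 (a comma-ok type assertion and a nil check with a default) can be sketched as follows. The function names `responseCode` and `effortOrDefault` are illustrative only and do not come from the RAGFlow source.

```go
package main

import (
	"errors"
	"fmt"
)

// responseCode reads a numeric "code" field from a decoded JSON body.
// The comma-ok form returns an error instead of panicking when the field
// is absent or is not a number.
func responseCode(body map[string]interface{}) (int, error) {
	code, ok := body["code"].(float64)
	if !ok {
		return 0, errors.New("response has no numeric code field")
	}
	return int(code), nil
}

// effortOrDefault dereferences an optional string pointer and falls back
// to def when the pointer is nil.
func effortOrDefault(effort *string, def string) string {
	if effort == nil {
		return def
	}
	return *effort
}

func main() {
	code, err := responseCode(map[string]interface{}{"message": "bad gateway"})
	fmt.Println(code, err)
	fmt.Println(effortOrDefault(nil, "high"))
}
```

An unchecked `body["code"].(float64)` panics on the first body above, whereas the comma-ok form lets the caller decide how to handle the missing field.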
Hz_	d4fe3bb148	feat(go-api): Add GET dataset metadata summary API (#15843 ) ## What Adds the RESTful dataset metadata summary endpoint: `GET /api/v1/datasets/{dataset_id}/metadata/summary` The endpoint supports optional document filtering through: `?doc_ids=doc_id_1,doc_id_2`	2026-06-09 19:27:47 +08:00
JPette1783	e050f1816e	fix(models): guard unsafe index access in Google and Ollama drivers (#15819 ) ### What problem does this PR solve? Fixes four panic / spurious-error paths in the Go model layer. Closes #15818. \| # \| File \| Bug \| Fix \| \|---\|------\|-----\|-----\| \| 1 \| \| Thinking-mode streaming path: accessed unconditionally; Gemini emits usage-only chunks with an empty slice, causing a runtime panic \| Guard each step: , , before indexing \| \| 2 \| \| is a plain for ordinary requests; the cast to silently returns , then panics immediately \| Switch on concrete type; handle both and \| \| 3 \| \| Identical panic on the streaming path \| Same switch-on-type fix \| \| 4 \| \| The field is optional (absent for non-thinking models) but the code returned an error when it was missing, breaking every ordinary Ollama completion \| Change to a silent comma-ok assertion; is empty string when the field is absent \| ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 19:26:52 +08:00
Jin Hai	719ce15c95	Go CLI: update list supported models (#15845 ) ### What problem does this PR solve? The `list supported models` command now shows more information for each model. ``` RAGFlow(api/default)> list supported models from 'gitee' 'test'; +-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+ \| dimension \| max_tokens \| model_types \| name \| thinking \| +-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+ \| \| \| \| Wan2.7 \| \| \| \| \| \| HappyHorse-1.0 \| \| \| \| \| \| Qwen3.6-27B@Qwen \| \| \| \| \| \| Qwen3.6-35B-A3B@Qwen \| \| \| \| 1048576 \| [chat] \| DeepSeek-V4-Flash@deepseek-ai \| map[clear_thinking:true default_value:true] \| \| \| 1048576 \| [chat] \| DeepSeek-V4-Pro@deepseek-ai \| map[clear_thinking:true default_value:true] \| +-----------+------------+-------------+----------------------------------------------------------+---------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-09 19:01:00 +08:00
Hz_	d1c436b804	feat(api): implement `GET /api/v1/agents/prompts` endpoint in Go (#15748 ) ### Description This PR ports the `GET /api/v1/agents/prompts` endpoint from the Python backend to the Go backend. ### Changes Made - Handler: Added `GetPrompts` method to `internal/handler/agent.go`. - Router: Registered the `agents.GET("/prompts")` route in `internal/router/router.go`. - Logic: Leveraged the existing `service.LoadPrompt` utility to read `analyze_task_system`, `analyze_task_user`, `next_step`, `reflect`, and `citation_prompt` templates directly from the `rag/prompts` directory. - Unit Test: Added `TestGetPrompts_Success` to `internal/handler/agent_test.go` to mock the HTTP context and validate the JSON response structure. ### Motivation This is part of the ongoing effort to port the Agent API surface to Go. Since this specific endpoint only serves static prompt templates and does not require the complex DAG/Canvas execution engine, it can be seamlessly and safely handled by the Go backend. ### Testing - [x] Added automated unit test `TestGetPrompts_Success` (verified passing). - [x] Tested locally via `curl` against the Go server (port 9380) and Python server (port 9384). - [x] Verified that the Go JSON response structure and loaded prompt text are logically 100% identical to the Python implementation.	2026-06-09 17:03:42 +08:00
Yingfeng	01a2a44766	Clean CLI for filesystem (#15838 ) ### Type of change - [x] Refactoring	2026-06-09 17:00:10 +08:00
Jin Hai	d02eb6b596	Go: refactor CLI (#15728 ) ### What problem does this PR solve? ``` RAGFlow(user)> add api server 'ccc' host '127.0.0.1:9980'; SUCCESS RAGFlow(user)> list api server; +------------+---------------+-----------------+---------+-------------+---------------+ \| api_server \| api_server_ip \| api_server_port \| auth \| user_name \| user_password \| +------------+---------------+-----------------+---------+-------------+---------------+ \| ccc \| 127.0.0.1 \| 9980 \| no auth \| \| \| \| default \| 127.0.0.1 \| 9384 \| login \| aaa@aaa.com \| * \| +------------+---------------+-----------------+---------+-------------+---------------+ RAGFlow(user)> delete api server 'ccc'; SUCCESS RAGFlow(user)> list api server; +------------+---------------+-----------------+---------+ \| api_server \| api_server_ip \| api_server_port \| auth \| +------------+---------------+-----------------+---------+ \| default \| 127.0.0.1 \| 9384 \| no auth \| +------------+---------------+-----------------+---------+ RAGFlow(user)> show admin server; +--------------+-------+ \| field \| value \| +--------------+-------+ \| admin_server \| N/A \| +--------------+-------+ RAGFlow(user)> add admin server host '127.0.0.1:9880'; SUCCESS RAGFlow(user)> show admin server; +-------------------+-----------+ \| field \| value \| +-------------------+-----------+ \| admin_server_ip \| 127.0.0.1 \| \| admin_server_port \| 9880 \| \| auth \| no auth \| +-------------------+-----------+ RAGFlow(user)> delete admin server; SUCCESS RAGFlow(user)> show admin server; +--------------+-------+ \| field \| value \| +--------------+-------+ \| admin_server \| N/A \| +--------------+-------+ RAGFlow(user)> show current +-----------------+-------------+ \| field \| value \| +-----------------+-------------+ \| api_server_port \| 9384 \| \| user_name \| aaa@aaa.com \| \| user_password \| * \| \| mode \| api \| \| verbose \| false \| \| api_server \| default \| \| api_server_ip \| 127.0.0.1 \| \| auth 
\| login \| \| output \| table \| \| interactive \| true \| +-----------------+-------------+ ``` ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-09 15:22:50 +08:00
Jin Hai	55abf4f565	Go: new CLI command, list all models and show model (#15786 ) ### What problem does this PR solve? ``` RAGFlow(user)> list models; +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| alias \| max_tokens \| model_types \| name \| thinking \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ \| \| 1048576 \| [chat] \| deepseek-v4-flash \| map[clear_thinking:true default_value:true] \| \| \| 1048576 \| [chat] \| deepseek-v4-pro \| map[clear_thinking:true default_value:true] \| \| \| 1024000 \| [chat] \| minimax-m3 \| map[clear_thinking:true default_value:true] \| \| \| 64000 \| [vision] \| glm-4.5v \| map[clear_thinking:true default_value:true] \| \| [baai/bge-m3] \| 8192 \| [embedding] \| bge-m3 \| \| \| [baai/bge-reranker-v2-m3] \| 1024 \| [rerank] \| bge-reranker-v2-m3 \| \| \| \| \| [tts] \| step-audio-tts-3b \| \| \| [qwen/qwen3-asr-1.7b] \| \| [asr] \| qwen3-asr-1.7b \| \| \| [paddleocr-vl-1.5] \| \| [ocr] \| paddleocr-vl-0.9b \| \| +---------------------------+------------+-------------+--------------------+---------------------------------------------+ RAGFlow(user)> show model 'minimax-m3'; +--------------+---------------------------------------------+ \| field \| value \| +--------------+---------------------------------------------+ \| name \| minimax-m3 \| \| max_tokens \| 1024000 \| \| model_types \| [chat] \| \| thinking \| map[clear_thinking:true default_value:true] \| \| class \| \| \| alias \| \| \| ModelTypeMap \| \| +--------------+---------------------------------------------+ RAGFlow(user)> show model 'baai/bge-m3'; +--------------+---------------+ \| field \| value \| +--------------+---------------+ \| model_types \| [embedding] \| \| thinking \| \| \| class \| \| \| alias \| [baai/bge-m3] \| \| ModelTypeMap \| \| \| name \| bge-m3 \| \| max_tokens \| 8192 \| 
+--------------+---------------+ ``` --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-06-08 21:38:15 +08:00
Jack	35527f6755	fix: guard http.DefaultTransport type assertion in xiaomi for Go 1.25 (#15787 ) ## Problem `TestXiaomiNewModelWithCustomDefaultTransport` panics on Go 1.25: ``` panic: interface conversion: http.RoundTripper is models.roundTripperFunc, not http.Transport ``` In Go 1.25, `http.DefaultTransport` is no longer `http.Transport`, so the unchecked type assertion in `NewXiaomiModel` panics when the test replaces it with a `roundTripperFunc`. ## Fix Use a safe type assertion with fallback to a new `http.Transport`, matching the pattern already used in `modelscope.go`. ## Verification ```bash go test -run TestXiaomiNewModelWithCustomDefaultTransport ./internal/entity/models/... # PASS ``` Internal contributors only. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:11:21 +08:00
Jack	338fdb65fb	feat(ci): enable go test in CI pipeline (#15750 ) ## What problem does this PR solve? Go test files are never compiled in CI — only production binaries via `go build`. This allowed a missing `"sort"` import in `metadata_filter_test.go` to be merged without detection. ## Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) ## Changes - Add `go test -count=1 ./internal/...` step after Go build in CI workflow - Fix missing `"sort"` import in `metadata_filter_test.go` (pre-existing compile error) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 20:06:57 +08:00
oktofeesh	6fc3955cab	fix(go-models): normalize Qwen reasoning families (#15735 ) ## Summary Normalizes Qwen model-family names before reasoning extraction so provider-prefixed Qwen models use the existing `<think>...</think>` fallback.	2026-06-08 19:32:19 +08:00
oktofeesh	e0dc7af5dd	fix(go-models): fix MiniMax driver requests (#15527 ) ## Summary - keep MiniMax chat calls in non-streaming mode and streaming calls in SSE mode - make MiniMax model listing and connection checks use a bodyless GET /v1/models - add focused MiniMax request/response regression tests	2026-06-08 19:32:01 +08:00
oktofeesh	25df0a6725	fix(go-models): validate URL suffix config keys (#15734 ) ## Summary Fixes typoed model-provider URL suffix keys and adds strict nested decoding so future URL suffix config mistakes fail during provider loading instead of being silently ignored.	2026-06-08 19:29:36 +08:00
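One way to get the strict nested decoding that commit 25df0a6725 mentions is `json.Decoder.DisallowUnknownFields` from the standard library, which also rejects unknown keys inside nested structs. Whether the commit uses this mechanism is an assumption, and the struct and key names below (`providerConfig`, `url_suffix`, `chat`, `embedding`) are invented for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// urlSuffix and providerConfig are hypothetical config shapes.
type urlSuffix struct {
	Chat      string `json:"chat"`
	Embedding string `json:"embedding"`
}

type providerConfig struct {
	Name      string    `json:"name"`
	URLSuffix urlSuffix `json:"url_suffix"`
}

// decodeStrict fails on any key that has no matching struct field,
// including keys nested under url_suffix.
func decodeStrict(data []byte) (providerConfig, error) {
	var cfg providerConfig
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	err := dec.Decode(&cfg)
	return cfg, err
}

func main() {
	// "chatt" is a deliberate typo; a lenient decoder would ignore it.
	_, err := decodeStrict([]byte(`{"name":"demo","url_suffix":{"chatt":"/v1/chat"}}`))
	fmt.Println(err != nil)
}
```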
Haruko386	8dc7f1d95e	Go: implement ASR and TTS for xiaomi (#15765 ) ### What problem does this PR solve? Verified from CLI ``` RAGFlow(user)> chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: Hello! I'm MiMo-v2.5, a large language model developed by Xiaomi's LLM Core Team. You can think of me as a friendly AI assistant ready to help you answer questions, have conversations, or work on creative tasks. My context window can handle up to 1 million tokens, so we can dive into pretty long discussions or documents if you'd like. What can I help you with today? Time: 3.831830 RAGFlow(user)> stream chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Answer: there! I'm MiMo-v2.5, an AI assistant created by the Xiaomi LLM Core Team. I'm here to chat, help out, answer questions, or just have a friendly conversation. Think of me as a helpful buddy with a pretty big memory (1 million tokens worth!). What can I do for you today?😊 Time: 2.421630 RAGFlow(user)> think chat with 'mimo-v2.5@test@xiaomi' message 'who r u' Thinking: The user is asking a simple question about who I am. According to my system prompt, I should: - Identify myself as MiMo-v2.5 - State that I was developed by the Xiaomi LLM Core Team - Answer in first person and be warm and conversational Answer: Hey there! 👋 I'm MiMo, an AI assistant created by the Xiaomi LLM Core Team. Think of me as a friendly chat buddy who's here to help you with all sorts of questions and tasks! I love having conversations, answering questions, brainstorming ideas, and helping people figure things out. Whether you want to chat, need help with something specific, or just want to explore ideas together — I'm here for it! 😊 What can I help you with today? Time: 6.651589 RAGFlow(user)> tts with 'mimo-v2.5-tts@test@xiaomi' text 'hello? 
show yourself' play format 'wav' param '{"voice": "Chloe"}' SUCCESS RAGFlow(user)> asr with 'mimo-v2.5-asr@test@xiaomi' audio './internal/test.wav' param '{"language": "zh"}' +------------------------------------------------------------------------------------------------------------------------+ \| text \| +------------------------------------------------------------------------------------------------------------------------+ \| 1 The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired. \| +------------------------------------------------------------------------------------------------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2026-06-08 19:27:45 +08:00
oktofeesh	d63bd81d0d	fix(go-models): fix Moonshot model and balance requests (#15528 ) ## Summary - keep Moonshot chat calls in non-streaming mode and streaming calls in SSE mode - make Moonshot model listing and balance checks use bodyless GET requests - add focused Moonshot request/response regression tests	2026-06-08 19:27:19 +08:00
Jack	04209ffccf	feat: implement FetchChunkVectors for citation vector hydration (#15749 ) ## What problem does this PR solve? Implements `FetchChunkVectors` — the infrastructure needed to hydrate chunk embedding vectors on demand. This is a prerequisite for `insert_citations` (citation insertion in the `searchbots/ask` endpoint), matching the Python `Dealer.fetch_chunk_vectors` pattern. Without this, citation insertion cannot compute answer-vs-chunk vector similarity. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Changes ### New Function - `FetchChunkVectors(engine, chunkIDs, tenantIDs, kbIDs, dim)` — fetches embedding vectors for a set of chunk IDs - Consumer-side `vectorFetcher` interface with only `Search` + `GetType` methods - Both `elasticsearchEngine` and `infinityEngine` implicitly satisfy the interface ### Engine Behavior - ES: queries by chunk ID list via `Search` with filter `{"id": chunkIDs}`, parses tab-separated `q_N_vec` string format - Infinity / OceanBase: skips the round-trip (vectors already shipped with chunks) - Degrades gracefully: engine errors return zero vectors — citation insertion will use placeholders instead of failing ### Vector Parsing - Handles ES tab-separated string format (`"0.1\t0.2\t0.3"`) - Handles `[]float64` and `[]interface{}` formats - Returns zero vector for wrong-dimension or unparseable input ### Bug Fix - `metadata_filter_test.go`: add missing `"sort"` import (pre-existing build break) ### Tests - 12 unit tests: empty input, Infinity/OceanBase skip, ES string vector, ES float slice, ES interface slice, search error degradation, missing chunk → zero, wrong dimension → zero, parse edge cases ## Files Changed \| File \| Change \| \|------\|--------\| \| `internal/service/chunk_vector.go` \| New — FetchChunkVectors + parse helpers \| \| `internal/service/chunk_vector_test.go` \| New — 12 tests \| \| `internal/service/metadata_filter_test.go` \| Fix missing `"sort"` import \| 🤖 
Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 16:54:00 +08:00
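The vector-parsing rules listed in commit 04209ffccf (tab-separated strings, `[]float64`, `[]interface{}`, and a zero vector for wrong-dimension or unparseable input) can be sketched as below. The function name `parseVector` and its signature are assumptions; the actual helpers in `internal/service/chunk_vector.go` are not shown in the log.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVector converts a raw stored value into a float64 slice of length dim.
// Any unparseable or wrong-dimension input yields a zero vector.
func parseVector(raw interface{}, dim int) []float64 {
	zero := make([]float64, dim)
	var out []float64
	switch v := raw.(type) {
	case string:
		// Elasticsearch-style format: "0.1\t0.2\t0.3".
		for _, part := range strings.Split(v, "\t") {
			f, err := strconv.ParseFloat(strings.TrimSpace(part), 64)
			if err != nil {
				return zero
			}
			out = append(out, f)
		}
	case []float64:
		out = v
	case []interface{}:
		for _, item := range v {
			f, ok := item.(float64)
			if !ok {
				return zero
			}
			out = append(out, f)
		}
	default:
		return zero
	}
	if len(out) != dim {
		return zero
	}
	return out
}

func main() {
	fmt.Println(parseVector("0.1\t0.2\t0.3", 3))
	fmt.Println(parseVector("0.1\t0.2", 3))
}
```

Returning a zero vector rather than an error mirrors the graceful degradation the commit describes, where citation insertion falls back to placeholders instead of failing.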

484 Commits