feat(agent): Go port — canvas engine, 22 components, DSL v2, 13 endpoints (#15952)
Ports the agent canvas subsystem from Python to Go.
## What's included
### Canvas Engine (Phase 0/1)
- State engine, scheduler, variable resolver, Redis checkpoint store,
cancel protocol
- **209 tests** across canvas / component / io packages
### 22 Components (P0–P4)
| Tier | Components |
|---|---|
| P0 T1+T2+T3 | LLM, Agent, ExitLoop, Switch, Categorize, Begin, Message, Invoke |
| P1 T3 | VariableAggregator, VariableAssigner, StringTransform, ListOperations, DataOperations |
| P2 T3 | Iteration, IterationItem, Loop, LoopItem |
| P3 T3 | UserFillUp, Fillup |
| P4 T5 | Browser, ExcelProcessor, DocsGenerator |
### DSL v2 Schema (Phase 2.5)
- Typed v2 in-memory model with v1-to-v2 auto-detect converter
- v1 legacy field stripping per plan §2.11.7
### HTTP Endpoints & Bug Fixes (Plans PR1–PR3)
- **DELETE SQL bug fix**: gorm v2 `Where("id = ?", id).Delete(...)`
pattern
- **CreateAgent validation**: title/DSL required, duplicate check, 103
envelope
- **13 new endpoints**: templates, prompts, tags, sessions CRUD,
chat/completions (SSE + non-stream stubs), rerun, test_db_connection,
logs, webhook/logs
- **756 Go unit tests** (745 → 756, +18)
- **17 → 0 Python integration test failures** (test_agents.py +
test_session_management/)
### Tools
21 eino tools: HTTPHelper, search tools, financial/data tools, mandatory
stubs
### Infrastructure
OTel observability, NATS message queue, DeepDoc gRPC client, SSRF
guards, IDOR mitigation
2026-06-12 22:58:28 +08:00
//go:build ignore

//
// Copyright 2026 The InfiniFlow Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//

package main

import (
	"context"
	"errors"
	"flag"
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"ragflow/internal/common"
	"ragflow/internal/engine/redis"
	"ragflow/internal/server"
	"ragflow/internal/server/local"
	"ragflow/internal/storage"
	"ragflow/internal/utility"
	"strings"
	"syscall"
	"time"

	"github.com/gin-gonic/gin"
	"go.uber.org/zap"

	"ragflow/internal/agent/audio"
	"ragflow/internal/agent/canvas"
feat(agent): align Go agent behavior with Python (except retrieval component) (#16225)
## Summary
Aligns the **Go agent runtime/canvas/components/tools** behavior with
the **Python `agent/` implementation** so the same stored canvas DSL
produces the same execution result on either side. Every component,
tool, and runtime primitive in `internal/agent/` is now driven by the
same semantics as its Python counterpart — variable resolution, template
substitution, control flow, error reporting, retry/cancel, and stream
event shapes.
The **retrieval component is the one explicit exception** in this PR. It
is being reworked in a separate change and is excluded from this
alignment pass; the wrapper slot (`universe_a_wrappers.go →
newRetrievalComponent`) is preserved.
## Scope of alignment
### Components (all aligned with `agent/component/`)
`Begin` · `Message` · `LLM` (incl. ChatTemplateKwargs,
MessageHistoryWindowSize, VisualFiles, Cite, OutputStructure,
JSONOutput, TopP, MaxRetries, DelayAfterError, credentials) · `Agent`
(react + tool artifact capture + `Reset()` interface-assert) · `Switch`
(12/12 operators, Python-equivalent semantics) · `Categorize` · `Invoke`
· `Iteration` · `Loop` (macro-expansion through `workflowx.AddLoopNode`)
· `UserFillUp` (Python-equivalent interrupt/resume via eino
`compose.Interrupt`/`ResumeWithData`) · `FillUp` · `DataOperations` ·
`ListOperations` · `StringTransform` · `VariableAggregator` ·
`VariableAssigner` · `Browser` (full stagehand runtime parity) ·
`DocsGenerator` · `ExcelProcessor`.
### Tools (all aligned with `agent/tools/`)
`Retrieval` (wrapper slot only — logic out of scope) · `MCPToolAdapter`
(streamable-HTTP) · `CodeExec` (sandbox bridge with
`code_exec_contract.go` matching Python contract) · `AkShare` · `ArXiv`
· `Crawler` · `DeepL` · `DuckDuckGo` · `Email` · `ExeSQL` · `GitHub` ·
`Google` · `GoogleScholar` · `Jin10` · `PubMed` · `QWeather` · `SearXNG`
· `Tavily` · `Tushare` · `Wencai` · `Wikipedia` · `YahooFinance` —
uniform `eino tool.InvokableTool` interface, SSRF protection, shared
HTTP client.
### Canvas execution engine (`internal/agent/canvas/`)
Aligned with Python's `agent/canvas.py`:
- **Scheduler** (`scheduler.go`): state pre/post handlers, node lambdas,
per-component timeout resolver (4-level: per-class env → per-class table
→ uniform env → 600s fallback), `legacyNoOpNames`.
- **Loop subgraph** (`loop_subgraph.go`): Python-equivalent
`AddLoopNode` macro expansion + condition translation.
- **Multibranch** (`multibranch.go`): `Switch` / `Categorize` routing
via `compose.NewGraphMultiBranch` — same branch selection semantics as
Python.
- **Parallel subgraph** (`parallel_subgraph.go`): matches Python's
parallel fan-out contract.
- **Interrupt/Resume** (`interrupt_resume.go`): `UserFillUpNodeBody` /
`IsInterruptError` / `ExtractInterruptContexts` — replaces the
deprecated Python sentinel chain with eino's native interrupt API,
preserving the same external behavior.
- **Checkpoint** (`checkpoint_store.go`): `RedisCheckPointStore`
Get/Set/Delete, with business metadata (status / canvas_id /
parent_run_id) on a parallel Redis Hash.
- **RunTracker** (`run_tracker.go`): Start / MarkSucceeded / MarkFailed
/ MarkCancelled / AttachCheckpoint — same lifecycle as the Python run
record.
- **Cancel** (`cancel.go`): Redis pub/sub watch.
- **Stream** (`stream.go`): SSE channel with `messages` / `waiting` /
`errors` / `done` events, same shape as Python's `agent.canvas.RunEvent`
payload.
### DSL bridge (`internal/agent/dsl/`)
- `normalize.go`: v1↔v2 collapsed into a single wire format — Python and
Go consume the same stored JSON.
- `reset.go`: per-run state reset matches Python's `Canvas.reset()`
semantics.
- Testdata mirrors Python's `agent_msg.json` / `all.json` / etc.
### Runtime (`internal/agent/runtime/`)
- `CanvasState` / `NewCanvasState` / `GetVar` / `SetVar` / `ReadVars`:
same `{{cpn_id@param}}` resolution model.
- `ResolveTemplate` (regex fast path + gonja fallback) — Python
Jinja-style semantics.
- `selector.go`, `metrics.go`, `component.go`: shared runtime contracts.
## Out of scope (intentionally)
- **`Retrieval` component logic** — wrapped only; full parity lands in a
follow-up PR.
- **Frontend** — only minor dsl-bridge / canvas UX fixes ride along.
- **CLI / admin / model registry** — orthogonal to agent behavior.
## How alignment is verified
`internal/service/agent_run_e2e_test.go` exercises the **full production
chain** against real Python-shaped DSL fixtures:
```
loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL →
canvas.Compile → cc.Workflow.Invoke → answer extraction
```
using in-memory SQLite + miniredis (no Docker). Covers:
- `TestRunAgent_RealCanvas_BeginMessage` — happy path, `{{sys.query}}`
resolution
- `TestRunAgent_RealCanvas_WaitForUserResume` — two-run resume cycle
(Python-equivalent)
- `TestRunAgent_RealCanvas_CompileFails` — unknown component name →
sanitized error (Python-equivalent)
- `TestRunAgent_RealCanvas_InvokeFails` — unresolvable template ref
(Python-equivalent)
- `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` —
Start→AttachCheckpoint→MarkSucceeded lifecycle
`internal/handler/agent_test.go` — SSE streaming parity (`Content-Type:
text/event-stream`, `data: {…}\n\n`, trailing `data: [DONE]\n\n`,
OpenAI-compatible non-stream `choices`).
`internal/agent/canvas/fixture_compile_test.go` + per-component tests
pin the Python-equivalent outputs.
```
go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/
```
## Design reference
`docs/develop/agent-go-port-design.md` (1329 lines, last cross-checked
2026-06-17) — module layout, per-component / per-tool inventory,
corner-case catalogue, and the actionable backlog (Section 14, including
the retrieval alignment follow-up).
---------
Co-authored-by: Claude <noreply@anthropic.com>
2026-06-22 11:58:29 +08:00
	_ "ragflow/internal/agent/component" // blank import: registers every Component factory (Begin / Agent / LLM / Message / Retrieval / ...) into the shared runtime at package init
	"ragflow/internal/agent/runtime"
	"ragflow/internal/dao"
	"ragflow/internal/engine"
	"ragflow/internal/handler"
	"ragflow/internal/router"
	"ragflow/internal/service"
	"ragflow/internal/service/chunk"
	"ragflow/internal/service/nlp"
	"ragflow/internal/tokenizer"
)

func printHelp() {
	fmt.Fprintf(os.Stderr, "Usage: %s [OPTIONS]\n\n", os.Args[0])
	fmt.Fprintf(os.Stderr, "RAGFlow Server - Open-source RAG engine based on deep document understanding\n\n")
	fmt.Fprintf(os.Stderr, "Options:\n")
	fmt.Fprintf(os.Stderr, " -p, --port int\t\tServer port (overrides config file)\n")
	fmt.Fprintf(os.Stderr, " -v, --version \tPrint version information and exit\n")
	fmt.Fprintf(os.Stderr, " --debug \tEnable debug-level logging\n")
	fmt.Fprintf(os.Stderr, " -h, --help \tShow this help message and exit\n")
	fmt.Fprintf(os.Stderr, "\nExamples:\n")
	fmt.Fprintf(os.Stderr, " %s \t\t# Start server with config file port\n", os.Args[0])
	fmt.Fprintf(os.Stderr, " %s -p 8080 \t\t# Start server on port 8080\n", os.Args[0])
	fmt.Fprintf(os.Stderr, " %s --port 8080 \t# Start server on port 8080\n", os.Args[0])
	fmt.Fprintf(os.Stderr, " %s --version \t# Show version and exit\n", os.Args[0])
	fmt.Fprintf(os.Stderr, " %s --debug \t# Start server with debug logging\n", os.Args[0])
}

func main() {
	// Parse command line flags
	var portFlag int
	flag.IntVar(&portFlag, "port", 0, "Server port (overrides config file)")
	flag.IntVar(&portFlag, "p", 0, "Server port (shorthand, overrides config file)")
	var debugFlag bool
	flag.BoolVar(&debugFlag, "debug", false, "Enable debug-level logging")
	var versionFlag bool
	flag.BoolVar(&versionFlag, "version", false, "Print version information and exit")

	// Custom help message
	flag.Usage = printHelp

	flag.Parse()

	// Handle --version flag: print version and exit immediately
	if versionFlag {
		fmt.Printf("RAGFlow version: %s\n", utility.GetRAGFlowVersion())
		return
	}

	// Initialize logger with default level
	// logger.Init("info"); // set debug log level
	if err := common.Init("info", "server_main.log"); err != nil {
		panic(fmt.Sprintf("Failed to initialize logger: %v", err))
	}

	// Initialize configuration
	if err := server.Init(""); err != nil {
		common.Fatal("Failed to initialize config", zap.Error(err))
	}

	// Override port with command line argument if provided
	config := server.GetConfig()
	if portFlag > 0 {
		config.Server.Port = portFlag
		common.Info("Port overridden by command line argument", zap.Int("port", portFlag))
	}

	if config.Server.Port == 0 {
		common.Fatal("Server port is not configured. Please specify via --port flag or config file.")
	}

	// Reinitialize logger with configured level if different
	level := config.Log.Level
	if level == "" {
		level = "info"
	}

	if debugFlag {
		level = "debug"
	}

	if err := common.Init(level, "server_main.log"); err != nil {
		common.Error("Failed to reinitialize logger", err)
	}
	server.SetLogger(common.Logger)
	if config.Log.Level == "" {
		config.Log.Level = common.GetLevel()
	}

	common.Info("Server mode", zap.String("mode", config.Server.Mode))

	// Print all configuration settings
	server.PrintAll()

	// Initialize database
	if err := dao.InitDB(); err != nil {
		common.Fatal("Failed to initialize database", zap.Error(err))
	}

	// Initialize doc engine
	if err := engine.Init(&config.DocEngine); err != nil {
		common.Fatal("Failed to initialize doc engine", zap.Error(err))
	}
	defer engine.Close()

	// Initialize Redis cache
	if err := redis.Init(&config.Redis); err != nil {
		common.Fatal("Failed to initialize Redis", zap.Error(err))
	}
	defer redis.Close()

	if err := storage.InitStorageFactory(); err != nil {
		common.Fatal("Failed to initialize storage factory", zap.Error(err))
	}

	if err := engine.InitMessageQueueEngine(config.TaskExecutor.MessageQueueType); err != nil {
		common.Error("Failed to initialize message queue engine", err)
	}

	// Initialize server variables (runtime variables that can change during operation)
	// This must be done after Cache is initialized
	if err := server.InitVariables(redis.Get()); err != nil {
		common.Warn("Failed to initialize server variables from Redis, using defaults", zap.String("error", err.Error()))
	}

	// Initialize admin status (default: unavailable=1)
	local.InitAdminStatus(1, "admin server not connected")

	// Initialize tokenizer (rag_analyzer)
	dictPath := os.Getenv("RAGFLOW_DICT_PATH")
	if dictPath == "" {
		dictPath = "/usr/share/infinity/resource"
	}
	tokenizerCfg := &tokenizer.PoolConfig{
		DictPath: dictPath,
	}
	if err := tokenizer.Init(tokenizerCfg); err != nil {
		common.Fatal("Failed to initialize tokenizer", zap.Error(err))
	}
	defer tokenizer.Close()

	// Initialize global QueryBuilder using tokenizer's DictPath
	// This ensures the Synonym uses the same wordnet directory as tokenizer
	if err := nlp.InitQueryBuilderFromTokenizer(tokenizerCfg.DictPath); err != nil {
		common.Fatal("Failed to initialize query builder", zap.Error(err))
	}

	startServer(config)

	common.Info("Server exited")
}

func startServer(config *server.Config) {
	// Set Gin mode
	if config.Server.Mode == "release" {
		gin.SetMode(gin.ReleaseMode)
	} else {
		gin.SetMode(gin.DebugMode)
	}

	// Initialize service layer
	userService := service.NewUserService()
	documentService := service.NewDocumentService()
	datasetsService := service.NewDatasetService()
	knowledgebaseService := service.NewKnowledgebaseService()
	metadataService := service.NewMetadataService()
	chunkService := chunk.NewChunkService()
	llmService := service.NewLLMService()
	tenantService := service.NewTenantService()
	chatService := service.NewChatService()
	chatChannelService := service.NewChatChannelService()
	chatSessionService := service.NewChatSessionService()
	openaiChatService := service.NewOpenAIChatService()
	systemService := service.NewSystemService()
	connectorService := service.NewConnectorService()
	searchService := service.NewSearchService()
	fileService := service.NewFileService()
	memoryService := service.NewMemoryService()
	mcpService := service.NewMCPService()
	modelProviderService := service.NewModelProviderService()

	// Initialize doc engine for skill search
	docEngine := engine.Get()

	// Initialize handler layer
	authHandler := handler.NewAuthHandler()
	userHandler := handler.NewUserHandler(userService)
	tenantHandler := handler.NewTenantHandler(tenantService, userService, knowledgebaseService)
	documentHandler := handler.NewDocumentHandler(documentService, datasetsService)
	datasetsHandler := handler.NewDatasetsHandler(datasetsService, metadataService)
	systemHandler := handler.NewSystemHandler(systemService)
	knowledgebaseHandler := handler.NewKnowledgebaseHandler(knowledgebaseService, userService, documentService)
	chunkHandler := handler.NewChunkHandler(chunkService, userService)
	llmHandler := handler.NewLLMHandler(llmService, userService)
	chatHandler := handler.NewChatHandler(chatService, userService)
	chatChannelHandler := handler.NewChatChannelHandler(chatChannelService)
	chatSessionHandler := handler.NewChatSessionHandler(chatSessionService, userService)
	openaiChatHandler := handler.NewOpenAIChatHandler(openaiChatService)
	connectorHandler := handler.NewConnectorHandler(connectorService, userService)
	searchHandler := handler.NewSearchHandler(searchService, userService)
	fileHandler := handler.NewFileHandler(fileService, userService)
	memoryHandler := handler.NewMemoryHandler(memoryService)
	mcpHandler := handler.NewMCPHandler(mcpService)
	skillSearchHandler := handler.NewSkillSearchHandler(docEngine)
	providerHandler := handler.NewProviderHandler(userService, modelProviderService)
	// Install the agent service's Redis-backed run infrastructure
	// (CheckPointStore / StateSerializer / RunTracker). When Redis
	// is unreachable (degraded boot, stand-alone mode, no-redis CI)
	// the constructors return errors and we fall through to the
	// in-memory / no-tracking path: the agent service treats nil
	// options as the in-memory test path, so graceful degradation
	// is a 1-line if-not-nil pass-through; no separate "boot" mode
	// required.
	agentOpts := buildAgentRunOptions()
	agentHandler := handler.NewAgentHandler(service.NewAgentServiceWithOptions(
		agentOpts.checkpointStore,
		agentOpts.stateSerializer,
		agentOpts.runTracker,
	), fileService)

	// Wire the TTS synthesizer to the per-tenant model-provider
	// dispatch. SynthesizeRequest is routed through
	// ModelProviderService.AudioSpeech, which fans out to the
	// tenant's configured TTS model driver. When the model
	// provider is unconfigured, the synthesizer falls back to a
	// no-op echo (the audio package contract), so this is always
	// safe to call.
	configureTTSSynthesizer(modelProviderService)
	searchBotLLM := &handler.SearchBotRealLLM{Svc: modelProviderService}
feat: implement POST /api/v1/searchbots/retrieval_test (#15710)
## What problem does this PR solve?
Implements `POST /api/v1/searchbots/retrieval_test` in the Go API
server, aligning with the Python `bot_api.py` counterpart. Also applies
security hardening and consistency fixes discovered during CTO-level
code review:
- **Missing endpoint**: `retrieval_test` was not available in Go,
requiring Python fallback
- **Security**: Both `chunkHandler` and `searchBotHandler` leaked
`err.Error()` to API consumers
- **Python alignment**: Default values, empty question handling, and
`top_k <= 0` validation differed from Python behavior
- **Test gaps**: `chunkHandler.RetrievalTest` had zero unit tests;
several edge cases uncovered
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
## Summary
### New Endpoint
- `POST /api/v1/searchbots/retrieval_test` — retrieval test with full
field support (page, size, top_k, use_kg, cross_languages, keyword,
similarity_threshold, vector_similarity_weight)
### New Type
- `common.StringSlice` — JSON type that accepts both `"kb1"` and
`["kb1", "kb2"]`, matching Python API flexibility
### Security
- Both `searchBotHandler` and `chunkHandler` now use `common.Warn()` +
generic error messages instead of leaking `err.Error()` to API consumers
- All error responses include consistent `"data": nil` shape
- `chunkHandler.RetrievalTest` uses interface-based DI (`chunkService`)
to enable testability
### Python Alignment
- Handler-level defaults align with Python `bot_api.py` (page=1,
size=30, top_k=1024, similarity_threshold=0.0,
vector_similarity_weight=0.3)
- `top_k <= 0` validation matching Python behavior
- Empty/whitespace question returns 200 + empty result (matches
`chunk_api.py`)
- `chunkHandler` `Datasets` field uses `common.StringSlice` for
string-or-array flexibility
### Refactoring
- `ChunkServiceIface` → `ChunkRetriever`, `chunkSvcIface` →
`chunkService` (Go-conventional naming)
- Extracted `applyRetrievalDefaults`, `toRetrievalServiceRequest` from
handler body
- Regex moved to package-level var in `parseRelatedQuestions`
- `service.RetrievalTestRequest.Datasets` type changed to
`common.StringSlice`
- `chunkHandler` now uses consumer-side interface for DI
### Tests
- 37 unit tests across both handlers: auth, validation, defaults,
StringSlice edge cases, empty/whitespace KbID, service errors, JSON
format, `top_k <= 0`, field mapping verification
## Files Changed
| File | Change |
|------|--------|
| `cmd/server_main.go` | Wire new handler + chunkService + difyRetrievalHandler |
| `internal/common/json_types.go` | New StringSlice type |
| `internal/common/json_types_test.go` | StringSlice tests |
| `internal/handler/chunk.go` | Interface-based DI, security, Python alignment, defaults |
| `internal/handler/chunk_test.go` | New — 9 comprehensive tests |
| `internal/handler/searchbot.go` | New endpoint + refactoring + `top_k <= 0` validation |
| `internal/handler/searchbot_test.go` | 18 tests covering all edge cases |
| `internal/router/router.go` | Register new route + difyRetrievalHandler |
| `internal/service/chunk.go` | Datasets type → StringSlice, Question binding relaxed |
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 16:16:56 +08:00
	searchBotHandler := handler.NewSearchBotHandler(
		searchService,
		tenantService,
		searchBotLLM,
		chunkService,
	)
	searchBotHandler.SetStreamLLM(searchBotLLM)
	searchBotHandler.SetAskService(service.NewAskService(chunkService, nil, 0, 0))
	pluginHandler := handler.NewPluginHandler(service.NewPluginService())
	modelHandler := handler.NewModelHandler(service.NewModelProviderService())
	fileCommitHandler := handler.NewFileCommitHandler(service.NewFileCommitService())

	// Dify retrieval handler
	docDAO := dao.NewDocumentDAO()
	retrievalService := nlp.NewRetrievalService(docEngine, docDAO)
	difyRetrievalHandler := handler.NewDifyRetrievalHandler(
		knowledgebaseService,
		modelProviderService,
		metadataService,
		retrievalService,
		docDAO,
		docEngine,
	)
	// Per-tenant canvas-runtime override selector, backed by the
	// existing Redis client and the global logger. The handler is
	// ALWAYS constructed, even when Redis is briefly unavailable at
	// startup, so the POST /api/v1/admin/canvas-runtime/:tenant_id
	// endpoint stays registered and returns the explicit
	// ErrSelectorNotConfigured (HTTP 500) path until Redis recovers.
	// Skipping handler construction when rdb == nil silently removed
	// the route until the next process restart, so a transient
	// Redis blip at boot stranded canary operators with a 404 they
	// could not diagnose from the client side. Keep the route hot.
2026-06-12 22:58:28 +08:00
|
|
|
var adminRuntimeSelector *runtime.Selector
|
2026-06-15 14:44:16 +08:00
|
|
|
if rdb := redis.Get().GetClient(); rdb != nil {
|
2026-06-12 22:58:28 +08:00
|
|
|
adminRuntimeSelector = runtime.NewSelector(rdb, common.Logger)
|
|
|
|
|
}
|
|
|
|
|
adminRuntimeHandler := handler.NewAdminRuntimeHandler(adminRuntimeSelector)
|
|
|
|
|
|
2026-03-04 19:17:16 +08:00
|
|
|
// Initialize router
|
2026-06-22 18:16:15 +08:00
|
|
|
r := router.NewRouter(authHandler, userHandler, tenantHandler, documentHandler, datasetsHandler, systemHandler, knowledgebaseHandler, chunkHandler, llmHandler, chatHandler, chatChannelHandler, chatSessionHandler, connectorHandler, searchHandler, fileHandler, memoryHandler, mcpHandler, skillSearchHandler, providerHandler, agentHandler, searchBotHandler, difyRetrievalHandler, pluginHandler, modelHandler, fileCommitHandler, adminRuntimeHandler, openaiChatHandler)
|
2026-03-04 19:17:16 +08:00
|
|
|
|
|
|
|
|
// Create Gin engine
|
|
|
|
|
ginEngine := gin.New()
|
|
|
|
|
|
|
|
|
|
// Middleware
|
2026-03-09 17:48:29 +08:00
|
|
|
if config.Server.Mode == "debug" {
|
2026-03-04 19:17:16 +08:00
|
|
|
ginEngine.Use(gin.Logger())
|
|
|
|
|
}
|
|
|
|
|
ginEngine.Use(gin.Recovery())
|
|
|
|
|
|
|
|
|
|
// Setup routes
|
|
|
|
|
r.Setup(ginEngine)
|
|
|
|
|
|
2026-04-30 12:36:03 +08:00
|
|
|
// Create HTTP server with timeouts to prevent slow clients from blocking shutdown
|
2026-03-09 17:48:29 +08:00
|
|
|
addr := fmt.Sprintf(":%d", config.Server.Port)
|
2026-03-04 19:17:16 +08:00
|
|
|
srv := &http.Server{
|
2026-04-30 12:36:03 +08:00
|
|
|
Addr: addr,
|
|
|
|
|
Handler: ginEngine,
|
|
|
|
|
ReadHeaderTimeout: 10 * time.Second,
|
|
|
|
|
ReadTimeout: 60 * time.Second,
|
|
|
|
|
WriteTimeout: 120 * time.Second,
|
|
|
|
|
IdleTimeout: 120 * time.Second,
|
2026-03-04 19:17:16 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Start server in a goroutine
|
|
|
|
|
go func() {
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Info(
|
2026-03-06 20:05:10 +08:00
|
|
|
"\n ____ ___ ______ ______ __\n" +
|
|
|
|
|
" / __ \\ / | / ____// ____// /____ _ __\n" +
|
|
|
|
|
" / /_/ // /| | / / __ / /_ / // __ \\| | /| / /\n" +
|
|
|
|
|
" / _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /\n" +
|
|
|
|
|
" /_/ |_|/_/ |_|\\____//_/ /_/ \\____/ |__/|__/\n",
|
|
|
|
|
)
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Info(fmt.Sprintf("RAGFlow Go Version: %s", utility.GetRAGFlowVersion()))
|
|
|
|
|
common.Info(fmt.Sprintf("Server starting on port: %d", config.Server.Port))
|
2026-04-28 12:12:58 +08:00
|
|
|
if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Fatal("Failed to start server", zap.Error(err))
|
2026-03-04 19:17:16 +08:00
|
|
|
}
|
|
|
|
|
}()
|
|
|
|
|
|
2026-03-09 17:48:29 +08:00
|
|
|
// Get local IP address for heartbeat reporting
|
2026-04-28 12:12:58 +08:00
|
|
|
localIP, err := utility.GetLocalIP()
|
|
|
|
|
if err != nil {
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Fatal("Failed to get local IP address", zap.Error(err))
|
2026-03-09 17:48:29 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Initialize and start heartbeat reporter to admin server
|
2026-06-12 14:56:44 +08:00
|
|
|
service.AdminServiceClient = service.NewAdminClient(
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Logger,
|
2026-03-09 17:48:29 +08:00
|
|
|
common.ServerTypeAPI,
|
|
|
|
|
fmt.Sprintf("ragflow-server-%d", config.Server.Port),
|
|
|
|
|
localIP,
|
|
|
|
|
config.Server.Port,
|
|
|
|
|
)
|
2026-06-12 14:56:44 +08:00
|
|
|
if err = service.AdminServiceClient.InitHTTPClient(); err != nil {
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Warn("Failed to initialize heartbeat service", zap.Error(err))
|
2026-03-09 17:48:29 +08:00
|
|
|
} else {
|
|
|
|
|
// Start heartbeat reporter with a 3-second interval
|
|
|
|
|
heartbeatReporter := utility.NewScheduledTask("Heartbeat reporter", 3*time.Second, func() {
|
2026-06-12 14:56:44 +08:00
|
|
|
if err = service.AdminServiceClient.SendHeartbeat(); err == nil {
|
2026-03-12 20:02:50 +08:00
|
|
|
local.SetAdminStatus(0, "")
|
|
|
|
|
} else {
|
2026-03-13 14:41:02 +08:00
|
|
|
local.SetAdminStatus(1, err.Error())
|
2026-03-19 10:25:35 +08:00
|
|
|
//logger.Warn(fmt.Sprintf(err.Error()))
|
2026-03-09 17:48:29 +08:00
|
|
|
}
|
|
|
|
|
})
|
|
|
|
|
heartbeatReporter.Start()
|
|
|
|
|
defer heartbeatReporter.Stop()
|
|
|
|
|
}
|
|
|
|
|
|
2026-03-04 19:17:16 +08:00
|
|
|
// Wait for interrupt signal to gracefully shutdown
|
|
|
|
|
quit := make(chan os.Signal, 1)
|
|
|
|
|
signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT, syscall.SIGUSR2)
|
|
|
|
|
sig := <-quit
|
|
|
|
|
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Info(fmt.Sprintf("Received %s signal, shutting down server", strings.ToUpper(sig.String())))
|
|
|
|
|
common.Info("Shutting down server...")
|
2026-03-04 19:17:16 +08:00
|
|
|
|
|
|
|
|
// Create context with timeout for graceful shutdown
|
|
|
|
|
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
|
|
|
|
|
defer cancel()
|
|
|
|
|
|
|
|
|
|
// Shutdown server
|
2026-04-28 12:12:58 +08:00
|
|
|
if err = srv.Shutdown(ctx); err != nil {
|
2026-05-06 10:41:58 +08:00
|
|
|
common.Fatal("Server forced to shutdown", zap.Error(err))
|
2026-03-04 19:17:16 +08:00
|
|
|
}
|
|
|
|
|
}
|
2026-06-17 13:24:03 +08:00
|
|
|
|
|
|
|
|
// agentRunOptions bundles the three optional injection slots the
|
|
|
|
|
// agent service accepts via NewAgentServiceWithOptions: the Redis-
|
|
|
|
|
// backed CheckPointStore, StateSerializer, and RunTracker. The
|
|
|
|
|
// fields stay nil when the underlying constructors fail (Redis
|
|
|
|
|
// unreachable, etc.); the agent service treats nil as "in-memory
|
|
|
|
|
// / no-tracking" so the server continues to serve traffic without
|
|
|
|
|
// requiring Redis to be up.
|
|
|
|
|
type agentRunOptions struct {
|
|
|
|
|
checkpointStore canvas.CheckPointStore
|
|
|
|
|
stateSerializer canvas.StateSerializer
|
|
|
|
|
runTracker *canvas.RunTracker
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// buildAgentRunOptions installs the Redis-backed run infrastructure
|
|
|
|
|
// when Redis is available. The Redis client is the one already
|
|
|
|
|
// initialised at the top of main; the TTL is a conservative 24h for
|
|
|
|
|
// both the checkpoint store and the run tracker. On any error
|
|
|
|
|
// (Redis disabled, or the client was never initialised) we
|
|
|
|
|
// log and return a zero-value struct — the agent service falls back
|
|
|
|
|
// to the in-memory path transparently.
|
|
|
|
|
func buildAgentRunOptions() agentRunOptions {
|
|
|
|
|
var out agentRunOptions
|
|
|
|
|
if !redis.IsEnabled() || redis.Get() == nil {
|
|
|
|
|
common.Info("agent: redis client not initialised; agent run infra in in-memory mode (no checkpoints, no run tracker)")
|
|
|
|
|
return out
|
|
|
|
|
}
|
|
|
|
|
cp := canvas.NewRedisCheckPointStore(24 * time.Hour)
|
|
|
|
|
out.checkpointStore = cp
|
|
|
|
|
// stateSerializer is intentionally left nil. eino's default
|
|
|
|
|
// InternalSerializer (used when no compose.WithSerializer is
|
|
|
|
|
// passed at compile time) already knows how to round-trip
|
|
|
|
|
// runtime.CanvasState because the runtime package registers
|
|
|
|
|
// it via compose.RegisterSerializableType[CanvasState] in
|
|
|
|
|
// init(). Overriding with RAGFlow's plain-JSON
|
|
|
|
|
// CanvasStateSerializer (json.Marshal/Unmarshal) produces
|
|
|
|
|
// bytes the InternalSerializer cannot decode on the resume
|
|
|
|
|
// pass — the UserFillUp two-node pattern surfaces this as
|
|
|
|
|
// "load checkpoint from store fail: cannot unmarshal object
|
|
|
|
|
// into Go struct field checkpoint.Channels of type
|
|
|
|
|
// compose.channel". Rely on eino's default instead.
|
|
|
|
|
rt := canvas.NewRunTracker(24 * time.Hour)
|
|
|
|
|
out.runTracker = rt
|
|
|
|
|
common.Info("agent: redis-backed run infra installed (24h TTL on checkpoint store + run tracker; eino default serializer)")
|
|
|
|
|
return out
|
|
|
|
|
}
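
The "nil means in-memory" contract described for `agentRunOptions` can be illustrated in isolation. This is a sketch under assumptions: `CheckpointStore`, `memoryStore`, `Options`, and `NewService` are hypothetical stand-ins, not the real `canvas` or agent service types, whose definitions are not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// CheckpointStore is a hypothetical stand-in for canvas.CheckPointStore.
type CheckpointStore interface {
	Set(id string, data []byte) error
	Get(id string) ([]byte, bool, error)
}

// memoryStore is the in-process fallback used when no store is injected.
type memoryStore struct {
	mu   sync.Mutex
	data map[string][]byte
}

func newMemoryStore() *memoryStore {
	return &memoryStore{data: map[string][]byte{}}
}

func (m *memoryStore) Set(id string, b []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[id] = append([]byte(nil), b...) // copy so callers cannot mutate stored bytes
	return nil
}

func (m *memoryStore) Get(id string) ([]byte, bool, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	b, ok := m.data[id]
	return b, ok, nil
}

// Options mirrors the idea of agentRunOptions: every field is optional.
type Options struct {
	Store CheckpointStore
}

// Service is a hypothetical consumer of Options.
type Service struct {
	store    CheckpointStore
	inMemory bool
}

// NewService treats a nil Store as "use the in-memory path".
func NewService(opts Options) *Service {
	if opts.Store == nil {
		return &Service{store: newMemoryStore(), inMemory: true}
	}
	return &Service{store: opts.Store}
}

func main() {
	svc := NewService(Options{}) // zero value: nothing injected
	_ = svc.store.Set("run-1", []byte("state"))
	b, ok, _ := svc.store.Get("run-1")
	fmt.Println(svc.inMemory, ok, string(b))
}
```

One caveat with this pattern in Go: a nil concrete pointer assigned to an interface field is not equal to `nil`, so the builder should leave the field unset rather than assign a nil pointer.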
|
|
|
|
|
|
|
|
|
|
// configureTTSSynthesizer installs the audio.ModelProviderFunc
|
|
|
|
|
// that dispatches Synthesize requests through the project's
|
|
|
|
|
// ModelProviderService. The model provider's AudioSpeech method
|
|
|
|
|
// (internal/service/model_service.go) resolves the per-tenant TTS
|
|
|
|
|
// model driver, sends the request upstream, and returns
|
|
|
|
|
// synthesized audio bytes.
|
|
|
|
|
//
|
|
|
|
|
// The audio package's NewTTSDispatchFunc helper converts the
|
|
|
|
|
// audio.SynthesizeRequest shape into the model's dispatch shape
|
|
|
|
|
// (audioContent = req.Text, voice/lang → TTSConfig.Params,
|
|
|
|
|
// ModelName from req.Engine). When the model provider is
|
|
|
|
|
// unconfigured (nil dispatcher) the helper returns nil, which
|
|
|
|
|
// reverts the audio package to its default stub.
|
|
|
|
|
func configureTTSSynthesizer(modelProviderService *service.ModelProviderService) {
|
|
|
|
|
if modelProviderService == nil {
|
|
|
|
|
common.Info("agent: model provider service not initialised; TTS in no-op echo mode")
|
|
|
|
|
audio.SetModelProviderSynthesizer(nil)
|
|
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
audio.SetModelProviderSynthesizer(audio.NewTTSDispatchFunc(modelProviderService))
|
|
|
|
|
common.Info("agent: TTS model-provider dispatch installed (audio.Synthesize → ModelProviderService.AudioSpeech)")
|
|
|
|
|
}
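
The install-or-revert behaviour described for `audio.SetModelProviderSynthesizer` follows a common pattern: a package-level function slot guarded by a lock, where nil selects a default stub. The sketch below assumes that pattern. `SynthesizeFunc`, `SetSynthesizer`, `Synthesize`, and the echoing `defaultStub` are hypothetical names, not the real `audio` package API.

```go
package main

import (
	"fmt"
	"sync"
)

// SynthesizeFunc is a hypothetical stand-in for audio.ModelProviderFunc.
type SynthesizeFunc func(text string) ([]byte, error)

var (
	mu      sync.RWMutex
	current SynthesizeFunc // nil means "use the default stub"
)

// defaultStub echoes the input text back as bytes (assumed stub behaviour).
func defaultStub(text string) ([]byte, error) {
	return []byte(text), nil
}

// SetSynthesizer installs fn; passing nil reverts to the default stub.
func SetSynthesizer(fn SynthesizeFunc) {
	mu.Lock()
	defer mu.Unlock()
	current = fn
}

// Synthesize dispatches to the installed function, or to the stub if none is set.
func Synthesize(text string) ([]byte, error) {
	mu.RLock()
	fn := current
	mu.RUnlock()
	if fn == nil {
		fn = defaultStub
	}
	return fn(text)
}

func main() {
	out, _ := Synthesize("hi")
	fmt.Println(string(out))

	SetSynthesizer(func(text string) ([]byte, error) {
		return []byte("audio:" + text), nil
	})
	out, _ = Synthesize("hi")
	fmt.Println(string(out))

	SetSynthesizer(nil) // revert to the stub
	out, _ = Synthesize("hi")
	fmt.Println(string(out))
}
```

Reading the slot under a read lock and calling the function after releasing it keeps a slow upstream call from blocking a concurrent `SetSynthesizer`.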
|