ragflow/internal/handler/agent_attachment.go


feat[Go]: port agent attachment download, chatbot + agentbot completion/info endpoints from Python (#16405)

## Summary

Ports five Python agent APIs to Go under the v1 Gin router:

- `GET /api/v1/agents/attachments/<attachment_id>/download`
- `POST /api/v1/chatbots/<dialog_id>/completions` (SSE)
- `GET /api/v1/chatbots/<dialog_id>/info`
- `POST /api/v1/agentbots/<agent_id>/completions` (SSE)
- `GET /api/v1/agentbots/<agent_id>/inputs`

Mirrors the existing Python wire shape (`{code, message, data:{answer,reference,...}}` per Python `canvas_service.completion`) so the iframe SDK and existing JS widgets keep working.

## Behavioural parity with Python

| # | Concern | How it's met |
|---|---------|--------------|
| R0 | Bot routes must not require regular user session | Routes mount on `apiNoAuth` (router.go:198-202), with `BetaAuthMiddleware` only |
| R3 | Two SSE formats in Go drift | F2: `AgentChatCompletions` and `AgentbotCompletion` share `service.WriteChatbotRunEvent` |
| R7 | `GetBySessionID` returns `(nil, nil)` on miss | Defensive nil-check before `session.UserID != tenantID` |
| R8 | Begin component name vs ID | `FindBeginComponentID` resolves name → ID first, then `ExtractComponentInputForm(dsl, beginID)` |
| R9 | Defensive PromptConfig parsing | `stringFromMap` helper used for `prologue` and `tavily_api_key` |
| R10 | `BetaAuthMiddleware` Bearer-prefix pre-filter | Removed; `GetUserByToken` is called unconditionally, falls back to `GetUserByBetaAPIToken` |
| F8 | Multi-turn chatbot history | `ChatbotCompletion` reads prior turns from `session.Message`, appends user turn, calls LLM, persists new pair via new `API4ConversationDAO.Update` |
| F9 | UUID gate stricter than plan | Removed; only `filepath.Base` + CR/LF/quote header sanitization remains |
| H2 | Defence-in-depth IDOR | `AgentbotCompletion` calls `loadCanvas` before delegating to `RunAgent` |
| M2 | SSE error leakage | `WriteChatbotFrame` emits generic `"an internal error occurred"`; real error logged via `common.Error` |

## Verification

```bash
$ go vet ./...    # clean (only pre-existing issues)
$ go build ./...  # success
$ go test ./internal/handler/ ./internal/service/ ./internal/agent/dsl/ ./internal/common/ ./internal/dao/
ok ragflow/internal/handler 0.617s
ok ragflow/internal/service 1.729s
ok ragflow/internal/agent/dsl 0.008s
ok ragflow/internal/common 0.087s
ok ragflow/internal/dao 0.083s
```

1199 tests pass across 5 packages.

## Known follow-ups (out of scope for this PR)

- **F1**: token-level streaming in `ChatbotCompletion` (currently emits one frame per turn)
- **F3**: per-route `auth_types` attribute in Go (currently applied via route group middleware)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-27 16:52:21 +08:00
//
// Copyright 2026 The InfiniFlow Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// Gap D: `GET /api/v1/agents/attachments/<attachment_id>/download`
// (Python api/apps/restful_apis/agent_api.py:2368).
//
// Mirrors the Python download_agent_attachment handler:
//   - auth via @login_required → GetUser
//   - reads `attachment_id` from the URL path (NOT a query string)
//   - default `ext` query parameter is "markdown"
//   - uses utility.CONTENT_TYPE_MAP to pick the content type, falling
//     back to "application/<ext>" for unknown extensions
//   - streams raw bytes back with a sanitized Content-Disposition
package handler

import (
	"fmt"
	"net/http"
	"net/url"
	"path/filepath"
	"strings"

	"github.com/gin-gonic/gin"

	"ragflow/internal/common"
	"ragflow/internal/utility"
)

// agentAttachmentFileService is the subset of FileService used by
// the attachment-download handler.
type agentAttachmentFileService interface {
	DownloadAgentFile(tenantID, location string) ([]byte, error)
}

// DownloadAttachment handles GET /api/v1/agents/attachments/<attachment_id>/download.
func (h *AgentHandler) DownloadAttachment(c *gin.Context) {
	user, code, msg := GetUser(c)
	if code != common.CodeSuccess {
		jsonError(c, code, msg)
		return
	}

	attachmentID := c.Param("attachment_id")
	if attachmentID == "" {
		jsonError(c, common.CodeArgumentError, "`attachment_id` is required.")
		return
	}

	// Note (review F9): the plan explicitly defers attachment-id
	// shape validation to the storage layer. The Python download
	// endpoint at api/apps/restful_apis/agent_api.py:2368 and the
	// existing Go DownloadAgentFile path rely on storage lookup +
	// header sanitization; we DO NOT gate on UUID here because
	// attachment IDs in storage are not guaranteed UUIDs and the
	// review found no evidence of a UUID invariant. The
	// filepath.Base + CR/LF/quote check below is the only defensive
	// layer and runs BEFORE the file-service call so an unsafe id
	// never crosses the service boundary.
	safe := filepath.Base(attachmentID)
	if safe == "" || safe == "." || safe == "/" || strings.ContainsAny(safe, "\r\n\"") {
		jsonError(c, common.CodeArgumentError, "invalid attachment id.")
		return
	}

	// Normalize the ext query once. A blank or dotted input like
	// `?ext=` or `?ext=.pdf` would otherwise produce a malformed
	// MIME type like `application/` or `application/.pdf`. Trim
	// whitespace, lowercase, strip any leading dot, then fall back
	// to markdown when the value is empty.
	ext := strings.ToLower(strings.TrimSpace(c.DefaultQuery("ext", "markdown")))
	ext = strings.TrimPrefix(ext, ".")
	if ext == "" {
		ext = "markdown"
	}

	// IDOR note: the Go User struct collapses user/tenant into one
	// identifier (same model as the Python download_agent_file
	// endpoint at agent_api.py:523-530). The Python attachment
	// endpoint relies on the storage bucket's tenant scoping for
	// authorisation. The Go port preserves that shape.
	if h.fileService == nil {
		jsonError(c, common.CodeServerError, "file service not configured")
		return
	}
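	// Pass the sanitized id, not the raw path parameter, so a value
	// such as "../x" never reaches the file service.
	blob, err := h.fileService.DownloadAgentFile(user.ID, safe)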
	if err != nil {
		// Mirror the agent_download.go error mapping: DAO/transport
		// errors collapse to a generic 102 so we don't leak storage
		// internals in the response body.
		jsonError(c, common.CodeDataError, "Attachment not found!")
		return
	}

	contentType := utility.CONTENT_TYPE_MAP[ext]
	if contentType == "" {
		// Fallback for unknown extensions: keep the wire shape
		// consistent with the Python handler.
		contentType = "application/" + ext
	}

	c.Header("Content-Disposition", fmt.Sprintf(
		`attachment; filename="%s"; filename*=UTF-8''%s`,
		safe, url.PathEscape(safe),
	))
	c.Data(http.StatusOK, contentType, blob)
}
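
The request-normalization steps in `DownloadAttachment` (id sanitization, `ext` normalization, content-type fallback, and the `Content-Disposition` value) can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the project's code: `sanitizeAttachmentID`, `normalizeExt`, `contentTypeFor`, `contentDisposition`, and the `contentTypes` map are hypothetical names, and the map entries are assumed stand-ins for `utility.CONTENT_TYPE_MAP`, whose contents are not shown in this file. The sketch uses `path.Base` rather than `filepath.Base` so that its behaviour does not depend on the host OS.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// contentTypes is a hypothetical stand-in for utility.CONTENT_TYPE_MAP;
// the entries below are assumptions for illustration only.
var contentTypes = map[string]string{
	"markdown": "text/markdown",
	"pdf":      "application/pdf",
}

// sanitizeAttachmentID keeps only the last path element and rejects values
// that are empty after cleaning or that could break a quoted header value.
func sanitizeAttachmentID(id string) (string, bool) {
	safe := path.Base(id) // path.Base("") returns "."
	if safe == "." || safe == "/" || strings.ContainsAny(safe, "\r\n\"") {
		return "", false
	}
	return safe, true
}

// normalizeExt trims whitespace, lowercases, strips one leading dot, and
// falls back to "markdown" when nothing is left.
func normalizeExt(raw string) string {
	ext := strings.TrimPrefix(strings.ToLower(strings.TrimSpace(raw)), ".")
	if ext == "" {
		return "markdown"
	}
	return ext
}

// contentTypeFor looks the extension up and falls back to "application/<ext>".
func contentTypeFor(ext string) string {
	if ct := contentTypes[ext]; ct != "" {
		return ct
	}
	return "application/" + ext
}

// contentDisposition builds the header value with both a quoted filename and
// a percent-encoded filename* parameter.
func contentDisposition(safe string) string {
	return fmt.Sprintf(`attachment; filename="%s"; filename*=UTF-8''%s`,
		safe, url.PathEscape(safe))
}

func main() {
	for _, id := range []string{"report.md", "../etc/passwd", "bad\"name", ""} {
		safe, ok := sanitizeAttachmentID(id)
		fmt.Printf("%q -> %q ok=%v\n", id, safe, ok)
	}
	ext := normalizeExt(" .PDF ")
	fmt.Println(ext, contentTypeFor(ext), contentTypeFor("xyz"))
	fmt.Println(contentDisposition("my file.md"))
}
```

Note that an input such as `../etc/passwd` passes the check with the sanitized value `passwd`, so only the returned value, not the raw input, should be treated as safe downstream.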