Fix: use .a to replace .so for pdfium/pdf_oxide/office_oxide (#16496)

### Summary

Use static archives (.a) instead of shared libraries (.so) for pdfium/pdf_oxide/office_oxide.

---------

Co-authored-by: Claude <noreply@anthropic.com>
perf: batch-embed entity/edge names in set_graph() to fix stall on large graphs (#16205) (#16472)
2026-07-06 03:18:36 +08:00 · 2026-07-01 21:21:31 +08:00 · 2026-07-01 20:45:20 +08:00 · 2026-07-01 20:10:42 +08:00 · 2026-07-01 19:33:18 +08:00 · 2026-07-01 19:08:17 +08:00
4140 changed files with 922042 additions and 692857 deletions
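
For reviewers, the switch from `.so` to `.a` can be sketched in Go as follows. This is a minimal illustration, assuming the prefix layout the release workflow uses (`${OFFICE_OXIDE_PREFIX}/lib/liboffice_oxide.a`) and the per-OS linker flags from its Unix build step. The helpers `staticLDFlags` and `linksDynamically` are hypothetical names introduced here and are not part of the repository.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

// staticLDFlags returns a CGO_LDFLAGS value that passes the static archive
// by path, so the linker embeds it instead of recording a runtime
// dependency on a shared library. Hypothetical helper, mirroring the
// linux/darwin branches of the release workflow.
func staticLDFlags(goos, prefix string) (string, error) {
	archive := prefix + "/lib/liboffice_oxide.a"
	switch goos {
	case "linux":
		// System libraries the archive is linked against on Linux.
		return archive + " -lm -ldl -lpthread", nil
	case "darwin":
		return archive, nil
	default:
		return "", fmt.Errorf("unsupported Unix target: %s", goos)
	}
}

// linksDynamically reports whether ldd-style output mentions lib.
// Hypothetical helper, equivalent to the workflow's `ldd | grep -q` check.
func linksDynamically(lddOutput, lib string) bool {
	return strings.Contains(lddOutput, lib)
}

func main() {
	prefix := os.Getenv("OFFICE_OXIDE_PREFIX")
	if prefix == "" {
		prefix = "/tmp/office_oxide" // assumed default, for illustration only
	}
	flags, err := staticLDFlags(runtime.GOOS, prefix)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("CGO_LDFLAGS=" + flags)
}
```

Passing the archive by full path rather than `-loffice_oxide` avoids the linker preferring a `.so` with the same name if one happens to be on the search path.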
--- a/.agents/rules/named.md
+++ b/.agents/rules/named.md
@@ -0,0 +1,192 @@
+# Go Naming Best Practices
+
+## 1. Package Naming
+
+- **All lowercase, no underscores**: `package user`, not `package userService` or `package user_service`
+- **Short and meaningful**: `package http`, `package json`, `package dao`
+- **Avoid plurals**: `package user` not `package users`
+- **Avoid generic names**: Avoid `package util`, `package common`, `package base`
+
+```go
+// Recommended
+package user
+package handler
+package service
+
+// Not recommended
+package UserService
+package user_service
+package utils
+```
+
+## 2. File Naming
+
+- **All lowercase, underscore separated**: `user_handler.go`, `user_service.go`
+- **Test files**: `user_handler_test.go`
+- **Platform-specific**: `user_linux.go`, `user_windows.go`
+
+```
+user/
+├── user_handler.go
+├── user_service.go
+├── user_dao.go
+└── user_test.go
+```
+
+## 3. Directory Naming
+
+- **All lowercase, no underscores or hyphens**: `internal/`, `pkg/`, `cmd/`
+- **Short and descriptive**: `handler/`, `service/`, `dao/`
+
+```
+project/
+├── cmd/                    # Main entry point
+│   └── server_main.go
+├── internal/               # Private code
+│   ├── handler/
+│   ├── service/
+│   ├── dao/
+│   ├── model/
+│   └── middleware/
+├── pkg/                    # Public code
+└── api/                    # API definitions
+```
+
+## 4. Interface Naming
+
+- **Single-method interfaces end with "-er"**: `Reader`, `Writer`, `Handler`
+- **Agent noun derived from the method's verb**: `Reader`, `Executor`, `Validator`
+
+```go
+// Recommended
+type Reader interface {
+    Read(p []byte) (n int, err error)
+}
+
+type UserService interface {
+    Register(req *RegisterRequest) (*User, error)
+    Login(req *LoginRequest) (*User, error)
+}
+
+// Not recommended
+type UserInterface interface {}
+type IUserService interface {}
+```
+
+## 5. Struct Naming
+
+- **CamelCase**: `UserService`, `UserHandler`
+- **Avoid redundant prefixes**: `User` not `UserModel`
+
+```go
+// Recommended
+type UserService struct {}
+type UserHandler struct {}
+type RegisterRequest struct {}
+
+// Not recommended
+type user_service struct {}
+type SUserService struct {}
+type UserModel struct {}
+```
+
+## 6. Method/Function Naming
+
+- **CamelCase**
+- **Start with verb**: `GetUser`, `CreateUser`, `DeleteUser`
+- **Boolean returns use Is/Has/Can prefix**: `IsValid`, `HasPermission`
+
+```go
+// Recommended
+func (s *UserService) Register(req *RegisterRequest) (*User, error)
+func (s *UserService) GetUserByID(id uint) (*User, error)
+func (s *UserService) IsEmailExists(email string) bool
+
+// Not recommended
+func (s *UserService) register_user()
+func (s *UserService) get_user_by_id()
+func (s *UserService) CheckEmailExists() // Should use Is/Has
+```
+
+## 7. Constant Naming
+
+- **CamelCase**: `const MaxRetryCount = 3`
+- **Enum constants**: `const StatusActive = "active"`
+
+```go
+// Recommended
+const (
+    StatusActive   = "1"
+    StatusInactive = "0"
+    MaxRetryCount  = 3
+)
+
+// Not recommended
+const (
+    STATUS_ACTIVE = "1"  // Avoid all-uppercase names with underscores
+    status_active = "1"  // Avoid lowercase names with underscores
+)
+```
+
+## 8. Error Variable Naming
+
+- **Start with "Err"**: `ErrNotFound`, `ErrInvalidInput`
+
+```go
+// Recommended
+var (
+    ErrNotFound      = errors.New("not found")
+    ErrInvalidInput  = errors.New("invalid input")
+    ErrUnauthorized  = errors.New("unauthorized")
+)
+```
+
+## 9. Acronyms: Keep Consistent Case
+
+```go
+// Recommended
+type HTTPHandler struct {}
+var URL string
+func GetHTTPClient() {}
+func ParseJSON() {}
+
+// Not recommended
+type HttpHandler struct {}
+var Url string
+func GetHttpClient() {}
+```
+
+## 10. Project Structure Naming
+
+```
+project-name/
+├── cmd/                    # Main programs
+│   └── appname/
+│       └── main.go
+├── internal/               # Private code
+│   ├── handler/           # HTTP handlers
+│   ├── service/           # Business logic
+│   ├── repository/        # Data access
+│   ├── model/             # Data models
+│   └── config/            # Configuration
+├── pkg/                    # Public code
+├── api/                    # API definitions
+├── configs/               # Config files
+├── scripts/               # Scripts
+├── docs/                  # Documentation
+├── go.mod
+└── go.sum
+```
+
+## Summary Table
+
+| Type           | Rule                                | Example             |
+| -------------- | ----------------------------------- | ------------------- |
+| Package        | All lowercase, no underscores       | `package user`      |
+| File           | All lowercase, underscore separated | `user_service.go`   |
+| Directory      | All lowercase, no separators        | `internal/handler/` |
+| Struct         | CamelCase, capitalized first letter | `UserService`       |
+| Interface      | CamelCase, -er suffix               | `Reader`, `Writer`  |
+| Method         | CamelCase, verb prefix              | `GetUserByID`       |
+| Constant       | CamelCase                           | `MaxRetryCount`     |
+| Error Variable | Err prefix                          | `ErrNotFound`       |
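
Review note: the rules added in `named.md` can be seen working together in one small program. The following is a minimal sketch, not code from this repository. All identifiers (`User`, `UserGetter`, `UserService`, `NewUserService`, `HasEmail`) are hypothetical and chosen only to exercise the conventions in the summary table.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound follows the "Err" prefix rule for error variables.
var ErrNotFound = errors.New("not found")

// MaxRetryCount is a CamelCase constant, not MAX_RETRY_COUNT.
const MaxRetryCount = 3

// User avoids a redundant suffix such as UserModel.
type User struct {
	ID    uint
	Email string
}

// UserGetter is a single-method interface named with an "-er" suffix.
type UserGetter interface {
	GetUserByID(id uint) (*User, error)
}

// UserService is a CamelCase struct name.
type UserService struct {
	users map[uint]*User
}

// Compile-time check that *UserService satisfies UserGetter.
var _ UserGetter = (*UserService)(nil)

// NewUserService builds a service from the given users.
func NewUserService(users ...*User) *UserService {
	s := &UserService{users: make(map[uint]*User)}
	for _, u := range users {
		s.users[u.ID] = u
	}
	return s
}

// GetUserByID starts with a verb and keeps the acronym "ID" fully uppercase.
func (s *UserService) GetUserByID(id uint) (*User, error) {
	u, ok := s.users[id]
	if !ok {
		return nil, ErrNotFound
	}
	return u, nil
}

// HasEmail returns a bool, so it uses the Has prefix.
func (s *UserService) HasEmail(email string) bool {
	for _, u := range s.users {
		if u.Email == email {
			return true
		}
	}
	return false
}

func main() {
	svc := NewUserService(&User{ID: 1, Email: "a@example.com"})
	if u, err := svc.GetUserByID(1); err == nil {
		fmt.Println("found:", u.Email)
	}
	if _, err := svc.GetUserByID(2); errors.Is(err, ErrNotFound) {
		fmt.Println("lookup failed:", err)
	}
	fmt.Println("has email:", svc.HasEmail("a@example.com"), "max retries:", MaxRetryCount)
}
```

Returning the exported `ErrNotFound` value lets callers match it with `errors.Is` instead of comparing error strings.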
--- a/.agents/skills/go-naming/SKILL.md
+++ b/.agents/skills/go-naming/SKILL.md
@@ -0,0 +1,6 @@
+---
+name: go-naming
+description: Go naming conventions and best practices. Use this skill when working with Go code and you need to name packages, files, directories, structs, interfaces, functions, variables, or constants. Provides comprehensive naming guidelines following Go community standards.
+---
+
+Strictly follow the naming conventions in [rules/named.md](../../rules/named.md)
--- a/.dockerignore
+++ b/.dockerignore
@@ -0,0 +1,58 @@
+# RAGFlow .dockerignore
+# Reduces Docker build context sent to the daemon.
+# All excluded items are rebuilt inside Docker, mounted from
+# infiniflow/ragflow_deps, or local-only artifacts.
+
+# ── Python virtual environments ─────────────────────────────────────────────
+.venv/
+venv/
+__pycache__/
+*.pyc
+*.pyo
+*.egg-info/
+.pytest_cache/
+
+# ── Frontend dependencies and build outputs ─────────────────────────────────
+web/node_modules/
+web/dist/
+
+# ── Runtime logs ────────────────────────────────────────────────────────────
+logs/
+*.log
+docker/ragflow-logs/
+
+# ── Docker runtime data ─────────────────────────────────────────────────────
+docker/data/
+docker/oceanbase/
+docker/seekdb/
+
+# ── Go and C++ build outputs ────────────────────────────────────────────────
+internal/cpp/build/
+internal/cpp/cmake-build-release/
+internal/cpp/cmake-build-debug/
+target/
+
+# ── ragflow_deps build context (built as a separate image, mounted ──
+# ── from infiniflow/ragflow_deps:latest by the main Dockerfile)     ──
+# Excluding the entire directory keeps the main build context small
+# regardless of which deps files download_deps.py currently fetches.
+# The deps image is built from this directory with:
+#   cd ragflow_deps && docker build -f Dockerfile -t infiniflow/ragflow_deps .
+ragflow_deps/
+
+# ── IDE and editor config ──────────────────────────────────────────────────
+.idea/
+.vscode/
+.cursor/
+.trae/
+.DS_Store
+
+# ── Test and coverage artifacts ─────────────────────────────────────────────
+coverage/
+htmlcov/
+.coverage
+.hypothesis/
+.nox/
+
+# ── Docker env (contains secrets) ───────────────────────────────────────────
+docker/.env
--- a/.github/codeql/codeql-config.yml
+++ b/.github/codeql/codeql-config.yml
@@ -0,0 +1,25 @@
+# CodeQL configuration. The default CodeQL Analysis workflow (managed by
+# GitHub) reads this file when scanning the repository. We use it to
+# exclude files that the Go analysis cannot compile — the rest of the
+# repo compiles fine, but the CGO-based office_oxide bindings require
+# a native header (`office_oxide.h`) that isn't present in the CodeQL
+# runner image. Without this exclusion the entire Go analysis aborts
+# with `fatal error: office_oxide.h: No such file or directory`, which
+# means no Go alerts can be re-evaluated and alerts on these files
+# stay open indefinitely even after their root cause is fixed.
+#
+# The excluded files are MS Office document parsers. They are also
+# excluded from `go test` and `go build` in local development when
+# the office_oxide C library is not installed, so this exclusion
+# brings CodeQL in line with the rest of the toolchain.
+paths-ignore:
+  - internal/ingestion/parser/doc_parser.go
+  - internal/ingestion/parser/docx_parser.go
+  - internal/ingestion/parser/ppt_parser.go
+  - internal/ingestion/parser/pptx_parser.go
+  - internal/ingestion/parser/xls_parser.go
+  - internal/ingestion/parser/xlsx_parser.go
+  # Generated / vendored — also break analysis without adding signal.
+  - "**/testdata/**"
+  - "**/node_modules/**"
+  - "**/*.pb.go"
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,22 @@
+# Project instructions for Copilot
+
+## How to run (minimum)
+- Install:
+  - python -m venv .venv && source .venv/bin/activate
+  - pip install -r requirements.txt
+- Run:
+  - (fill) e.g. uvicorn app.main:app --reload
+- Verify:
+  - (fill) curl http://127.0.0.1:8000/health
+
+## Project layout (what matters)
+- app/: API entrypoints + routers
+- services/: business logic
+- configs/: config loading (.env)
+- docs/: documents
+- tests/: pytest
+
+## Conventions
+- Prefer small, incremental changes.
+- Add logging for new flows.
+- Add/adjust tests for behavior changes.
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -1,12 +1,3 @@
-### What problem does this PR solve?
+### Summary

 _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._
-
-### Type of change
-
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -3,11 +3,19 @@ name: release
 on:
  schedule:
    - cron: '0 13 * * *'  # This schedule runs every 13:00:00Z(21:00:00+08:00)
+  # https://github.com/orgs/community/discussions/26286?utm_source=chatgpt.com#discussioncomment-3251208
+  # "The create event does not support branch filter and tag filter."
  # The "create tags" trigger is specifically focused on the creation of new tags, while the "push tags" trigger is activated when tags are pushed, including both new tag creations and updates to existing tags.
-  create:
+  push:
    tags:
      - "v*.*.*"                  # normal release
-      - "nightly"                 # the only one mutable tag
+      - 'nightly'                 # mutable tag
+
+permissions:
+  contents: write
+  actions: read
+  checks: read
+  statuses: read

 # https://docs.github.com/en/actions/using-jobs/using-concurrency
 concurrency:
@@ -15,28 +23,40 @@ concurrency:
  cancel-in-progress: true

 jobs:
-  release:
-    runs-on: [ "self-hosted", "ragflow-test" ]
+  prepare:
+    runs-on: [ "self-hosted", "ragflow-release" ]
+    outputs:
+      release_tag: ${{ steps.release.outputs.release_tag }}
+      prerelease: ${{ steps.release.outputs.prerelease }}
+
    steps:
      - name: Ensure workspace ownership
        run: echo "chown -R ${USER} ${GITHUB_WORKSPACE}" && sudo chown -R ${USER} ${GITHUB_WORKSPACE}

-      # https://github.com/actions/checkout/blob/v3/README.md
+      # https://github.com/actions/checkout/blob/v6/README.md
      - name: Check out code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          token: ${{ secrets.GITHUB_TOKEN }}  # Use the secret as an environment variable
          fetch-depth: 0
          fetch-tags: true

-      - name: Prepare release body
+      # https://github.com/actions/setup-go
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version-file: go.mod
+          cache: true
+
+      - name: Prepare release metadata
+        id: release
        run: |
-          if [[ ${GITHUB_EVENT_NAME} == "create" ]]; then
+          if [[ ${GITHUB_EVENT_NAME} != "schedule" ]]; then
            RELEASE_TAG=${GITHUB_REF#refs/tags/}
-            if [[ ${RELEASE_TAG} == "nightly" ]]; then
-              PRERELEASE=true
-            else
+            if [[ ${RELEASE_TAG} == v* ]]; then
              PRERELEASE=false
+            else
+              PRERELEASE=true
            fi
            echo "Workflow triggered by create tag: ${RELEASE_TAG}"
          else
@@ -46,8 +66,8 @@ jobs:
          fi
          echo "RELEASE_TAG=${RELEASE_TAG}" >> ${GITHUB_ENV}
          echo "PRERELEASE=${PRERELEASE}" >> ${GITHUB_ENV}
-          RELEASE_DATETIME=$(date --rfc-3339=seconds)
-          echo Release ${RELEASE_TAG} created from ${GITHUB_SHA} at ${RELEASE_DATETIME} > release_body.md
+          echo "release_tag=${RELEASE_TAG}" >> ${GITHUB_OUTPUT}
+          echo "prerelease=${PRERELEASE}" >> ${GITHUB_OUTPUT}

      - name: Move the existing mutable tag
        # https://github.com/softprops/action-gh-release/issues/171
@@ -55,7 +75,7 @@ jobs:
          git fetch --tags
          if [[ ${GITHUB_EVENT_NAME} == "schedule" ]]; then
            # Determine if a given tag exists and matches a specific Git commit.
-            # actions/checkout@v4 fetch-tags doesn't work when triggered by schedule
+            # actions/checkout@v6 fetch-tags doesn't work when triggered by schedule
            if [ "$(git rev-parse -q --verify "refs/tags/${RELEASE_TAG}")" = "${GITHUB_SHA}" ]; then
              echo "mutable tag ${RELEASE_TAG} exists and matches ${GITHUB_SHA}"
            else
@@ -65,15 +85,532 @@ jobs:
            fi
          fi

-      - name: Create or overwrite a release
-        # https://github.com/actions/upload-release-asset has been replaced by https://github.com/softprops/action-gh-release
+  build_cli:
+    needs: prepare
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - goos: linux
+            goarch: amd64
+            runner: ubuntu-24.04
+            native_asset: native-linux-x86_64.tar.gz
+          - goos: linux
+            goarch: arm64
+            runner: ubuntu-24.04-arm
+            native_asset: native-linux-aarch64.tar.gz
+          - goos: darwin
+            goarch: amd64
+            runner: macos-15-intel
+            native_asset: native-macos-x86_64.tar.gz
+          - goos: darwin
+            goarch: arm64
+            runner: macos-14
+            native_asset: native-macos-aarch64.tar.gz
+          - goos: windows
+            goarch: amd64
+            runner: windows-latest
+            native_asset: native-windows-x86_64.zip
+            msystem: CLANG64
+            rust_target: x86_64-pc-windows-gnullvm
+            msys2_packages: >-
+              mingw-w64-clang-x86_64-clang
+              mingw-w64-clang-x86_64-lld
+              mingw-w64-clang-x86_64-cmake
+              mingw-w64-clang-x86_64-ninja
+              mingw-w64-clang-x86_64-pcre2
+              mingw-w64-clang-x86_64-pkgconf
+            output_ext: .exe
+          - goos: windows
+            goarch: arm64
+            runner: windows-11-arm
+            native_asset: native-windows-aarch64.zip
+            msystem: CLANGARM64
+            rust_target: aarch64-pc-windows-gnullvm
+            msys2_packages: >-
+              mingw-w64-clang-aarch64-clang
+              mingw-w64-clang-aarch64-lld
+              mingw-w64-clang-aarch64-cmake
+              mingw-w64-clang-aarch64-ninja
+              mingw-w64-clang-aarch64-pcre2
+              mingw-w64-clang-aarch64-pkgconf
+            output_ext: .exe
+    runs-on: ${{ matrix.runner }}
+    env:
+      CLI_NAME: ragflow-cli
+      CLI_MAIN: ./cmd/ragflow-cli.go
+      DIST_DIR: dist/cli
+      OFFICE_OXIDE_VERSION: "0.1.2"
+      RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
+
+    steps:
+      # https://github.com/actions/checkout/blob/v6/README.md
+      - name: Check out code
+        uses: actions/checkout@v6
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          fetch-depth: 0
+          fetch-tags: true
+
+      # https://github.com/actions/setup-go
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version-file: go.mod
+          cache: true
+
+      - name: Install Unix native build dependencies
+        if: runner.os != 'Windows'
+        shell: bash
+        run: |
+          set -euo pipefail
+
+          if [[ "${{ matrix.goos }}" == "linux" ]]; then
+            sudo apt-get update
+            sudo apt-get install -y build-essential libpcre2-dev libsimde-dev pkg-config python3-pip
+          elif [[ "${{ matrix.goos }}" == "darwin" ]]; then
+            brew list pcre2 >/dev/null 2>&1 || brew install pcre2
+            brew list simde >/dev/null 2>&1 || brew install simde
+          fi
+
+          python3 -m pip install --user --upgrade 'cmake>=4.0,<5' || \
+            python3 -m pip install --user --break-system-packages --upgrade 'cmake>=4.0,<5'
+          python_user_base="$(python3 -m site --user-base)"
+          echo "${python_user_base}/bin" >> "${GITHUB_PATH}"
+          export PATH="${python_user_base}/bin:${PATH}"
+          cmake --version
+
+      - name: Download office_oxide native library
+        if: runner.os != 'Windows'
+        shell: bash
+        run: |
+          set -euo pipefail
+
+          OFFICE_OXIDE_PREFIX="${RUNNER_TEMP}/office_oxide"
+          OFFICE_OXIDE_URL="https://github.com/yfedoseev/office_oxide/releases/download/v${OFFICE_OXIDE_VERSION}/${{ matrix.native_asset }}"
+
+          mkdir -p "${OFFICE_OXIDE_PREFIX}"
+          curl -fsSL "${OFFICE_OXIDE_URL}" -o "${RUNNER_TEMP}/${{ matrix.native_asset }}"
+          tar xzf "${RUNNER_TEMP}/${{ matrix.native_asset }}" -C "${OFFICE_OXIDE_PREFIX}"
+
+          test -f "${OFFICE_OXIDE_PREFIX}/include/office_oxide_c/office_oxide.h"
+          test -f "${OFFICE_OXIDE_PREFIX}/lib/liboffice_oxide.a"
+          ln -sf "office_oxide_c/office_oxide.h" "${OFFICE_OXIDE_PREFIX}/include/office_oxide.h"
+          test -f "${OFFICE_OXIDE_PREFIX}/include/office_oxide.h"
+
+          echo "OFFICE_OXIDE_PREFIX=${OFFICE_OXIDE_PREFIX}" >> "${GITHUB_ENV}"
+
+      - name: Set up MSYS2
+        if: runner.os == 'Windows'
+        uses: msys2/setup-msys2@v2
+        with:
+          msystem: ${{ matrix.msystem }}
+          update: true
+          install: ${{ matrix.msys2_packages }}
+          path-type: inherit
+
+      - name: Install Windows SIMDe headers
+        if: runner.os == 'Windows'
+        shell: msys2 {0}
+        run: |
+          set -euo pipefail
+
+          simde_dir="$(cygpath -u "${RUNNER_TEMP}")/simde"
+          simde_archive="$(cygpath -u "${RUNNER_TEMP}")/simde-v0.8.2.tar.gz"
+          github_env="$(cygpath -u "${GITHUB_ENV}")"
+
+          rm -rf "${simde_dir}"
+          mkdir -p "${simde_dir}"
+          curl -fsSL "https://github.com/simd-everywhere/simde/archive/refs/tags/v0.8.2.tar.gz" -o "${simde_archive}"
+          tar xzf "${simde_archive}" -C "${simde_dir}" --strip-components=1
+
+          test -f "${simde_dir}/simde/x86/sse4.1.h"
+
+          # Install SIMDe headers into the MSYS2 toolchain include directory.
+          # CMake/Ninja invokes clang-scan-deps as a Windows-native executable;
+          # a POSIX-style include path like /c/a/_temp/simde may not be resolved
+          # there, so keep the headers under ${MINGW_PREFIX}/include instead.
+          rm -rf "${MINGW_PREFIX}/include/simde"
+          cp -R "${simde_dir}/simde" "${MINGW_PREFIX}/include/simde"
+          test -f "${MINGW_PREFIX}/include/simde/x86/sse4.1.h"
+
+      - name: Configure Windows C compiler
+        if: runner.os == 'Windows'
+        shell: msys2 {0}
+        run: |
+          set -euo pipefail
+
+          cc_path="$(command -v clang.exe 2>/dev/null || command -v clang)"
+          cxx_path="$(command -v clang++.exe 2>/dev/null || command -v clang++)"
+
+          if [[ ! -f "${cc_path}" && -f "${cc_path}.exe" ]]; then
+            cc_path="${cc_path}.exe"
+          fi
+          if [[ ! -f "${cxx_path}" && -f "${cxx_path}.exe" ]]; then
+            cxx_path="${cxx_path}.exe"
+          fi
+          test -f "${cc_path}"
+          test -f "${cxx_path}"
+
+          cc="$(cygpath -m "${cc_path}")"
+          cxx="$(cygpath -m "${cxx_path}")"
+          github_env="$(cygpath -u "${GITHUB_ENV}")"
+
+          pcre2_libdir="$(pkg-config --variable=libdir libpcre2-8)"
+          pcre2_includedir="$(pkg-config --variable=includedir libpcre2-8)"
+
+          echo "CC=${cc}" >> "${github_env}"
+          echo "CXX=${cxx}" >> "${github_env}"
+          echo "PCRE2_LIBDIR=$(cygpath -m "${pcre2_libdir}")" >> "${github_env}"
+          echo "PCRE2_INCLUDEDIR=$(cygpath -m "${pcre2_includedir}")" >> "${github_env}"
+
+          echo "Resolved MSYS2 clang: ${cc}"
+          "${cc_path}" --version
+          echo "Resolved MSYS2 clang++: ${cxx}"
+          "${cxx_path}" --version
+
+      - name: Set up Rust for Windows office_oxide staticlib
+        if: runner.os == 'Windows'
+        uses: dtolnay/rust-toolchain@stable
+        with:
+          targets: ${{ matrix.rust_target }}
+
+      - name: Build office_oxide native library for Windows GNU ABI
+        if: runner.os == 'Windows'
+        shell: msys2 {0}
+        run: |
+          set -euo pipefail
+
+          office_oxide_prefix="$(cygpath -u "${RUNNER_TEMP}")/office_oxide"
+          office_oxide_src="$(cygpath -u "${RUNNER_TEMP}")/office_oxide-src"
+          archive_path="$(cygpath -u "${RUNNER_TEMP}")/office_oxide-v${OFFICE_OXIDE_VERSION}.tar.gz"
+          github_env="$(cygpath -u "${GITHUB_ENV}")"
+
+          rm -rf "${office_oxide_prefix}" "${office_oxide_src}"
+          mkdir -p "${office_oxide_prefix}/include/office_oxide_c" "${office_oxide_prefix}/lib" "${office_oxide_src}"
+
+          curl -fsSL "https://github.com/yfedoseev/office_oxide/archive/refs/tags/v${OFFICE_OXIDE_VERSION}.tar.gz" -o "${archive_path}"
+          tar xzf "${archive_path}" -C "${office_oxide_src}" --strip-components=1
+
+          cd "${office_oxide_src}"
+
+          cc_path="$(command -v clang.exe 2>/dev/null || command -v clang)"
+          cxx_path="$(command -v clang++.exe 2>/dev/null || command -v clang++)"
+          if [[ ! -f "${cc_path}" && -f "${cc_path}.exe" ]]; then
+            cc_path="${cc_path}.exe"
+          fi
+          if [[ ! -f "${cxx_path}" && -f "${cxx_path}.exe" ]]; then
+            cxx_path="${cxx_path}.exe"
+          fi
+          test -f "${cc_path}"
+          test -f "${cxx_path}"
+
+          export CC="${cc_path}"
+          export CXX="${cxx_path}"
+          export AR="$(command -v llvm-ar || command -v ar)"
+          export CARGO_BUILD_TARGET="${{ matrix.rust_target }}"
+          export RUSTFLAGS="-C target-feature=+crt-static -C link-arg=-fuse-ld=lld"
+
+          case "${{ matrix.rust_target }}" in
+            x86_64-pc-windows-gnullvm)
+              export CARGO_TARGET_X86_64_PC_WINDOWS_GNULLVM_LINKER="${CC}"
+              export CARGO_TARGET_X86_64_PC_WINDOWS_GNULLVM_AR="${AR}"
+              ;;
+            aarch64-pc-windows-gnullvm)
+              export CARGO_TARGET_AARCH64_PC_WINDOWS_GNULLVM_LINKER="${CC}"
+              export CARGO_TARGET_AARCH64_PC_WINDOWS_GNULLVM_AR="${AR}"
+              ;;
+            *)
+              echo "Unsupported Rust target: ${{ matrix.rust_target }}"
+              exit 1
+              ;;
+          esac
+
+          # The release workflow only needs the static archive for cgo.
+          # Building cdylib on Windows pulls in extra MinGW runtime libraries,
+          # which can fail under the CLANG64/CLANGARM64 environments.
+          perl -0pi -e 's/crate-type\s*=\s*\[[^\]]+\]/crate-type = ["staticlib"]/s' Cargo.toml
+
+          cargo build --release --lib --target "${CARGO_BUILD_TARGET}" --no-default-features
+
+          cp "include/office_oxide_c/office_oxide.h" "${office_oxide_prefix}/include/office_oxide_c/office_oxide.h"
+          cp "include/office_oxide_c/office_oxide.h" "${office_oxide_prefix}/include/office_oxide.h"
+          cp "target/${CARGO_BUILD_TARGET}/release/liboffice_oxide.a" "${office_oxide_prefix}/lib/liboffice_oxide.a"
+
+          test -f "${office_oxide_prefix}/include/office_oxide_c/office_oxide.h"
+          test -f "${office_oxide_prefix}/include/office_oxide.h"
+          test -f "${office_oxide_prefix}/lib/liboffice_oxide.a"
+
+          echo "OFFICE_OXIDE_PREFIX=$(cygpath -m "${office_oxide_prefix}")" >> "${github_env}"
+
+      - name: Build rag tokenizer native library
+        if: runner.os != 'Windows'
+        shell: bash
+        run: |
+          set -euo pipefail
+
+          cmake_args=(
+            -S internal/cpp
+            -B internal/cpp/cmake-build-release
+            -DCMAKE_BUILD_TYPE=Release
+          )
+
+          if [[ "${{ matrix.goos }}" == "darwin" ]]; then
+            simde_prefix="$(brew --prefix simde)"
+            pcre2_prefix="$(brew --prefix pcre2)"
+            cmake_args+=(
+              -DCMAKE_PREFIX_PATH="${pcre2_prefix};${simde_prefix}"
+              -DCMAKE_C_FLAGS="-I${simde_prefix}/include"
+              -DCMAKE_CXX_FLAGS="-I${simde_prefix}/include"
+            )
+          fi
+
+          cmake "${cmake_args[@]}"
+          cmake --build internal/cpp/cmake-build-release --target rag_tokenizer_c_api --parallel
+
+          test -f internal/cpp/cmake-build-release/librag_tokenizer_c_api.a
+
+      - name: Build rag tokenizer native library
+        if: runner.os == 'Windows'
+        shell: msys2 {0}
+        run: |
+          set -euo pipefail
+
+          cc_path="$(command -v clang.exe 2>/dev/null || command -v clang)"
+          cxx_path="$(command -v clang++.exe 2>/dev/null || command -v clang++)"
+
+          if [[ ! -f "${cc_path}" && -f "${cc_path}.exe" ]]; then
+            cc_path="${cc_path}.exe"
+          fi
+          if [[ ! -f "${cxx_path}" && -f "${cxx_path}.exe" ]]; then
+            cxx_path="${cxx_path}.exe"
+          fi
+          test -f "${cc_path}"
+          test -f "${cxx_path}"
+
+          test -f "${MINGW_PREFIX}/include/simde/x86/sse4.1.h"
+
+          cmake -S internal/cpp -B internal/cpp/cmake-build-release -G Ninja \
+            -DCMAKE_BUILD_TYPE=Release \
+            -DCMAKE_C_COMPILER="$(cygpath -m "${cc_path}")" \
+            -DCMAKE_CXX_COMPILER="$(cygpath -m "${cxx_path}")" \
+            -DCMAKE_C_FLAGS="-I${MINGW_PREFIX}/include" \
+            -DCMAKE_CXX_FLAGS="-I${MINGW_PREFIX}/include"
+
+          cmake --build internal/cpp/cmake-build-release --target rag_tokenizer_c_api --parallel
+
+          test -f internal/cpp/cmake-build-release/librag_tokenizer_c_api.a
+
+      - name: Build Go CLI release binaries
+        if: runner.os != 'Windows'
+        shell: bash
+        run: |
+          set -euo pipefail
+
+          mkdir -p "${DIST_DIR}"
+
+          if [[ ! -e "${CLI_MAIN}" ]]; then
+            echo "::error::Go CLI entry does not exist: ${CLI_MAIN}"
+            echo "::error::Please update CLI_MAIN in .github/workflows/release.yml"
+            exit 1
+          fi
+
+          echo "Building Go CLI release binaries"
+          echo "CLI name: ${CLI_NAME}"
+          echo "CLI main: ${CLI_MAIN}"
+          echo "Release tag: ${RELEASE_TAG}"
+          echo "Commit: ${GITHUB_SHA}"
+
+          output="${DIST_DIR}/${CLI_NAME}-${RELEASE_TAG}-${{ matrix.goos }}-${{ matrix.goarch }}"
+          echo "Building ${{ matrix.goos }}/${{ matrix.goarch }} -> ${output}"
+
+          case "${{ matrix.goos }}" in
+            linux)
+              cgo_ldflags="${OFFICE_OXIDE_PREFIX}/lib/liboffice_oxide.a -lm -ldl -lpthread"
+              ;;
+            darwin)
+              cgo_ldflags="${OFFICE_OXIDE_PREFIX}/lib/liboffice_oxide.a"
+              ;;
+            *)
+              echo "::error::Unsupported Unix target: ${{ matrix.goos }}"
+              exit 1
+              ;;
+          esac
+          cgo_cflags="-I${OFFICE_OXIDE_PREFIX}/include -I${OFFICE_OXIDE_PREFIX}/include/office_oxide_c"
+
+          echo "office_oxide prefix: ${OFFICE_OXIDE_PREFIX}"
+          echo "CGO_CFLAGS: ${cgo_cflags}"
+          echo "CGO_LDFLAGS: ${cgo_ldflags}"
+
+          CGO_ENABLED=1 \
+          CGO_CFLAGS="${cgo_cflags}" \
+          CGO_LDFLAGS="${cgo_ldflags}" \
+          GOOS="${{ matrix.goos }}" \
+          GOARCH="${{ matrix.goarch }}" \
+          go build \
+            -trimpath \
+            -ldflags="-s -w -X main.version=${RELEASE_TAG} -X main.commit=${GITHUB_SHA}" \
+            -o "${output}" \
+            "${CLI_MAIN}"
+
+          chmod +x "${output}"
+
+          if [[ "${{ matrix.goos }}" == "linux" ]] && command -v ldd >/dev/null 2>&1; then
+            if ldd "${output}" 2>&1 | grep -q "liboffice_oxide"; then
+              echo "::error::linux CLI unexpectedly links liboffice_oxide dynamically"
+              ldd "${output}" || true
+              exit 1
+            fi
+            echo "Verified linux CLI does not require liboffice_oxide.so at runtime"
+          fi
+
+      - name: Build Go CLI release binaries
+        if: runner.os == 'Windows'
+        shell: pwsh
+        run: |
+          New-Item -ItemType Directory -Force -Path $env:DIST_DIR | Out-Null
+
+          if (-not (Test-Path $env:CLI_MAIN)) {
+            Write-Error "Go CLI entry does not exist: $env:CLI_MAIN"
+            exit 1
+          }
+
+          $output = Join-Path $env:DIST_DIR "${env:CLI_NAME}-${env:RELEASE_TAG}-${{ matrix.goos }}-${{ matrix.goarch }}${{ matrix.output_ext }}"
+          Write-Host "Building ${{ matrix.goos }}/${{ matrix.goarch }} -> $output"
+
+          $officeOxidePrefix = $env:OFFICE_OXIDE_PREFIX -replace "\\", "/"
+          $cc = $env:CC
+          if ([string]::IsNullOrWhiteSpace($cc)) {
+            Write-Error "CC is not set"
+            exit 1
+          }
+          if (-not (Test-Path $cc)) {
+            Write-Error "C compiler does not exist: $cc"
+            exit 1
+          }
+          if (-not (Test-Path "${officeOxidePrefix}/lib/liboffice_oxide.a")) {
+            Write-Error "liboffice_oxide.a does not exist: ${officeOxidePrefix}/lib/liboffice_oxide.a"
+            exit 1
+          }
+          if (-not (Test-Path "internal/cpp/cmake-build-release/librag_tokenizer_c_api.a")) {
+            Write-Error "librag_tokenizer_c_api.a does not exist"
+            exit 1
+          }
+          if ([string]::IsNullOrWhiteSpace($env:PCRE2_LIBDIR) -or -not (Test-Path $env:PCRE2_LIBDIR)) {
+            Write-Error "PCRE2_LIBDIR is not set or does not exist: $env:PCRE2_LIBDIR"
+            exit 1
+          }
+
+          $ragTokenizerLib = (Resolve-Path "internal/cpp/cmake-build-release/librag_tokenizer_c_api.a").Path -replace '\\', '/'
+          $pcre2LibDir = $env:PCRE2_LIBDIR -replace '\\', '/'
+          $pcre2IncludeDir = $env:PCRE2_INCLUDEDIR -replace '\\', '/'
+
+          $env:CGO_ENABLED = "1"
+          $env:CC = $cc
+          $env:CGO_CFLAGS = "-I${officeOxidePrefix}/include -I${officeOxidePrefix}/include/office_oxide_c -I${pcre2IncludeDir}"
+          $env:CGO_LDFLAGS = "${officeOxidePrefix}/lib/liboffice_oxide.a ${ragTokenizerLib} -L${pcre2LibDir} -lpcre2-8 -lc++ -static -static-libgcc -static-libstdc++ -lws2_32 -lbcrypt -lntdll -luserenv -ladvapi32"
+          $env:GOOS = "${{ matrix.goos }}"
+          $env:GOARCH = "${{ matrix.goarch }}"
+
+          Write-Host "CC: $env:CC"
+          & $env:CC --version
+          Write-Host "office_oxide prefix: $officeOxidePrefix"
+          Write-Host "CGO_CFLAGS: $env:CGO_CFLAGS"
+          Write-Host "CGO_LDFLAGS: $env:CGO_LDFLAGS"
+
+          go build `
+            -trimpath `
+            -ldflags="-s -w -X main.version=$env:RELEASE_TAG -X main.commit=$env:GITHUB_SHA" `
+            -o "$output" `
+            "$env:CLI_MAIN"
+
+      - name: Upload CLI artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: cli-${{ matrix.goos }}-${{ matrix.goarch }}
+          path: dist/cli/*
+          if-no-files-found: error
+
+  publish_cli_assets:
+    needs:
+      - prepare
+      - build_cli
+    runs-on: [ "self-hosted", "ragflow-release" ]
+
+    steps:
+      - name: Ensure workspace ownership
+        run: echo "chown -R ${USER} ${GITHUB_WORKSPACE}" && sudo chown -R ${USER} ${GITHUB_WORKSPACE}
+
+      # https://github.com/actions/checkout/blob/v6/README.md
+      - name: Check out code
+        uses: actions/checkout@v6
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          fetch-depth: 0
+          fetch-tags: true
+
+      - name: Download CLI artifacts
+        uses: actions/download-artifact@v5
+        with:
+          pattern: cli-*
+          path: dist/cli
+          merge-multiple: true
+
+      - name: Prepare CLI release assets
+        env:
+          RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
+        run: |
+          set -euo pipefail
+
+          RELEASE_DATETIME=$(date --rfc-3339=seconds)
+          echo "Release ${RELEASE_TAG} created from ${GITHUB_SHA} at ${RELEASE_DATETIME}" > release_body.md
+
+          cd dist/cli
+          sha256sum * > SHA256SUMS
+          cd -
+
+          echo "Generated CLI release assets:"
+          ls -lh dist/cli
+
+      - name: Upload Go CLI release assets
        uses: softprops/action-gh-release@v2
        with:
-          token: ${{ secrets.GITHUB_TOKEN }}  # Use the secret as an environment variable
-          prerelease: ${{ env.PRERELEASE }}
-          tag_name: ${{ env.RELEASE_TAG }}
-          # The body field does not support environment variable substitution directly.
+          token: ${{ secrets.GITHUB_TOKEN }}
+          prerelease: ${{ needs.prepare.outputs.prerelease }}
+          tag_name: ${{ needs.prepare.outputs.release_tag }}
          body_path: release_body.md
+          files: |
+            dist/cli/*
+            tools/scripts/install.sh
+            tools/scripts/install.ps1
+
+  release:
+    needs:
+      - prepare
+      - publish_cli_assets
+    runs-on: [ "self-hosted", "ragflow-release" ]
+    env:
+      RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
+
+    steps:
+      - name: Ensure workspace ownership
+        run: echo "chown -R ${USER} ${GITHUB_WORKSPACE}" && sudo chown -R ${USER} ${GITHUB_WORKSPACE}
+
+      # https://github.com/actions/checkout/blob/v6/README.md
+      - name: Check out code
+        uses: actions/checkout@v6
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          fetch-depth: 0
+          fetch-tags: true
+
+      - name: Build and push image
+        run: |
+          sudo docker login --username infiniflow --password-stdin <<< ${{ secrets.DOCKERHUB_TOKEN }}
+          sudo docker build -t infiniflow/ragflow:${RELEASE_TAG} -f Dockerfile .
+          sudo docker tag infiniflow/ragflow:${RELEASE_TAG} infiniflow/ragflow:latest
+          sudo docker push infiniflow/ragflow:${RELEASE_TAG}
+          sudo docker push infiniflow/ragflow:latest

      - name: Build and push ragflow-sdk
        if: startsWith(github.ref, 'refs/tags/v')
@@ -84,11 +621,3 @@ jobs:
        if: startsWith(github.ref, 'refs/tags/v')
        run: |
          cd admin/client && uv build && uv publish --token ${{ secrets.PYPI_API_TOKEN }}
-
-      - name: Build and push image
-        run: |
-          sudo docker login --username infiniflow --password-stdin <<< ${{ secrets.DOCKERHUB_TOKEN }}
-          sudo docker build --build-arg NEED_MIRROR=1 -t infiniflow/ragflow:${RELEASE_TAG} -f Dockerfile .
-          sudo docker tag infiniflow/ragflow:${RELEASE_TAG} infiniflow/ragflow:latest
-          sudo docker push infiniflow/ragflow:${RELEASE_TAG}
-          sudo docker push infiniflow/ragflow:latest
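
Note on the `go build` step in the release workflow above: it passes `-ldflags "-X main.version=... -X main.commit=..."`. Per the Go linker documentation, `-X` only takes effect on package-level string variables that are declared uninitialized or initialized to a constant string expression. The CLI entry point is not part of this diff, so the following is only a minimal sketch of what the receiving side might look like. The variable names `version` and `commit` are taken from the flags; `versionString` and the default values are hypothetical.

```go
package main

import "fmt"

// version and commit are assumed package-level string variables in package
// main. A build that passes
//   -ldflags "-X main.version=<tag> -X main.commit=<sha>"
// replaces these defaults at link time.
var (
	version = "dev"
	commit  = "unknown"
)

// versionString formats the build metadata, for example for a --version flag.
// It is a hypothetical helper, not a function taken from the repository.
func versionString() string {
	return fmt.Sprintf("%s (commit %s)", version, commit)
}

func main() {
	fmt.Println(versionString())
}
```

Built without the `-X` flags, the program reports the defaults, which makes local builds easy to tell apart from release builds.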
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
--- a/.gitignore
+++ b/.gitignore
@@ -7,7 +7,7 @@ hudet/
 cv/
 layout_app.py
 api/flask_session
-
+venv/
 # Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
 # More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
 Cargo.lock
@@ -21,6 +21,7 @@ Cargo.lock

 .idea/
 .vscode/
+.cursor/settings.json

 # Exclude Mac generated files
 .DS_Store
@@ -44,6 +45,7 @@ cl100k_base.tiktoken
 chrome*
 huggingface.co/
 nltk_data/
+uv-x86_64*.tar.gz

 # Exclude hash-like temporary files like 9b5ad71b2ce5302211f9c61530b329a4922fc6a4
 *[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]*
@@ -51,6 +53,13 @@ nltk_data/
 .venv
 docker/data

+# OceanBase data and conf
+docker/oceanbase/conf
+docker/oceanbase/data
+
+# SeekDB data and conf
+docker/seekdb
+

 #--------------------------------------------------#
 # The following was generated with gitignore.nvim: #
@@ -128,6 +137,9 @@ web_modules/
 # Output of 'npm pack'
 *.tgz

+# Claude Code plans / state — local-only artifacts
+.claude/
+
 # Yarn Integrity file
 .yarn-integrity

@@ -195,3 +207,41 @@ ragflow_cli.egg-info

 # Default backup dir
 backup
+
+
+*huqie.txt
+
+.hypothesis
+
+
+# Added by cargo
+
+/target
+
+# Do not include in PR (local dev / build artifacts)
+ragflow.egg-info/
+uv-aarch64*.tar.gz
+uv-aarch64-unknown-linux-gnu.tar.gz
+docker/launch_backend_service_windows.sh
+
+# C++ build directories
+internal/cpp/build/
+internal/cpp/cmake-build-release/
+internal/cpp/cmake-build-debug/
+
+# Trae IDE config
+.trae/
+
+# Go server build output
+bin/*
+!bin/.gitkeep
+.claude/settings.local.json
+
+.run/
+# Local agent tooling state (per-developer; not for commit)
+.omc/
+.marscode/
+
+# Parser test fixtures and python tools
+internal/deepdoc/parser/pdf/testdata/
+internal/deepdoc/parser/pdf/tools-py/
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,19 +0,0 @@
-repos:
-  - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.6.0
-    hooks:
-      - id: check-yaml
-      - id: check-json
-      - id: end-of-file-fixer
-      - id: trailing-whitespace
-      - id: check-case-conflict
-      - id: check-merge-conflict
-      - id: mixed-line-ending
-      - id: check-symlinks
-
-  - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.11.6
-    hooks:
-      - id: ruff
-        args: [ --fix ]
-      - id: ruff-format
--- a/.rooignore
+++ b/.rooignore
@@ -0,0 +1,85 @@
+# .rooignore for RAGFlow
+# Purpose: reduce indexing noise, token waste, and accidental reads of generated files
+
+# Git / platform
+.git/
+.github/
+
+# IDE / local editor
+.idea/
+.vscode/
+.trae/
+
+# Python caches / build artifacts
+__pycache__/
+*.pyc
+*.pyo
+*.pyd
+.pytest_cache/
+.mypy_cache/
+.ruff_cache/
+.hypothesis/
+.coverage
+*.egg-info/
+ragflow.egg-info/
+sdk/python/ragflow_sdk.egg-info/
+sdk/python/build/
+sdk/python/dist/
+build/
+dist/
+
+# Virtual environments
+.venv/
+venv/
+env/
+
+# Node / frontend dependencies and build output
+node_modules/
+web/node_modules/
+web/dist/
+web/build/
+web/.cache/
+*.tsbuildinfo
+
+# Logs / runtime artifacts
+logs/
+docker/ragflow-logs/
+*.log
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+.pnpm-debug.log*
+
+# Large local dependency artifacts
+libssl*.deb
+tika-server*.jar*
+cl100k_base.tiktoken
+chrome*
+huggingface.co/
+nltk_data/
+uv-x86_64*.tar.gz
+uv-aarch64*.tar.gz
+
+# Temp / data / local storage
+tmp/
+cache/
+backup/
+docker/data/
+docker/oceanbase/conf
+docker/oceanbase/data
+docker/seekdb
+
+# Native / compiled build dirs
+target/
+bin/
+internal/cpp/build/
+internal/cpp/cmake-build-release/
+internal/cpp/cmake-build-debug/
+
+# Optional: skip tests and docs from indexing
+# test/
+# tests/
+# docs/
+
+# Ignore Roo's own config file
+.rooignore
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,109 @@
+# RAGFlow Project Instructions for GitHub Copilot
+
+This file provides context, build instructions, and coding standards for the RAGFlow project.
+It is structured to follow GitHub Copilot's [customization guidelines](https://docs.github.com/en/copilot/concepts/prompting/response-customization).
+
+## 1. Project Overview
+RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It is a full-stack application with a Python backend and a React/TypeScript frontend.
+
+- **Backend**: Python 3.10+ (Flask/Quart)
+- **Frontend**: TypeScript, React, UmiJS
+- **Architecture**: Microservices based on Docker.
+  - `api/`: Backend API server.
+  - `rag/`: Core RAG logic (indexing, retrieval).
+  - `deepdoc/`: Document parsing and OCR.
+  - `web/`: Frontend application.
+
+## 2. Directory Structure
+- `api/`: Backend API server (Flask/Quart).
+  - `apps/`: API Blueprints (Knowledge Base, Chat, etc.).
+  - `db/`: Database models and services.
+- `rag/`: Core RAG logic.
+  - `llm/`: LLM, Embedding, and Rerank model abstractions.
+- `deepdoc/`: Document parsing and OCR modules.
+- `agent/`: Agentic reasoning components.
+- `web/`: Frontend application (React + UmiJS).
+- `docker/`: Docker deployment configurations.
+- `sdk/`: Python SDK.
+- `test/`: Backend tests.
+
+## 3. Build Instructions
+
+### Backend (Python)
+The project uses **uv** for dependency management.
+
+1. **Setup Environment**:
+   ```bash
+   uv sync --python 3.13 --all-extras
+   uv run python3 ragflow_deps/download_deps.py
+   ```
+
+2. **Run Server**:
+   - **Prerequisite**: Start dependent services (MySQL, ES/Infinity, Redis, MinIO).
+     ```bash
+     docker compose -f docker/docker-compose-base.yml up -d
+     ```
+   - **Launch**:
+     ```bash
+     source .venv/bin/activate
+     export PYTHONPATH=$(pwd)
+     bash docker/launch_backend_service.sh
+     ```
+
+### Frontend (TypeScript/React)
+Located in `web/`.
+
+1. **Install Dependencies**:
+   ```bash
+   cd web
+   npm install
+   ```
+
+2. **Run Dev Server**:
+   ```bash
+   npm run dev
+   ```
+   Runs on port 8000 by default.
+
+### Docker Deployment
+To run the full stack using Docker:
+```bash
+cd docker
+docker compose -f docker-compose.yml up -d
+```
+
+## 4. Testing Instructions
+
+### Backend Tests
+- **Run All Tests**:
+  ```bash
+  uv run pytest
+  ```
+- **Run Specific Test**:
+  ```bash
+  uv run pytest test/test_api.py
+  ```
+
+### Frontend Tests
+- **Run Tests**:
+  ```bash
+  cd web
+  npm run test
+  ```
+
+## 5. Coding Standards & Guidelines
+- **Python Formatting**: Use `ruff` for linting and formatting.
+  ```bash
+  ruff check
+  ruff format
+  ```
+- **Frontend Linting**:
+  ```bash
+  cd web
+  npm run lint
+  ```
+- **Git Hooks**: Run this once after the first clone to enable local Git hooks.
+  ```bash
+  lefthook install
+  lefthook run pre-commit --all-files
+  ```
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -5,49 +5,76 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ## Project Overview

 RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It's a full-stack application with:
- Python backend (Flask-based API server)
- React/TypeScript frontend (built with UmiJS)
- Microservices architecture with Docker deployment
- Multiple data stores (MySQL, Elasticsearch/Infinity, Redis, MinIO)
+
+- Python backend (Quart-based async API server — Quart is the async reimplementation of Flask)
+- React/TypeScript frontend (built with Vite)
+- Background task executor workers (separate Python processes, Redis-queue-driven)
+- Peewee ORM for database models (not SQLAlchemy)
+- Multiple data stores (MySQL/PostgreSQL, Elasticsearch/Infinity/OpenSearch/OceanBase, Redis, MinIO)

 ## Architecture

-### Backend (`/api/`)
- **Main Server**: `api/ragflow_server.py` - Flask application entry point
- **Apps**: Modular Flask blueprints in `api/apps/` for different functionalities:
-  - `kb_app.py` - Knowledge base management
-  - `dialog_app.py` - Chat/conversation handling
-  - `document_app.py` - Document processing
-  - `canvas_app.py` - Agent workflow canvas
-  - `file_app.py` - File upload/management
- **Services**: Business logic in `api/db/services/`
- **Models**: Database models in `api/db/db_models.py`
+### Runtime Architecture
+
+RAGFlow runs as **two separate Python process types**, orchestrated by `docker/launch_backend_service.sh`:
+
+- **API Server** (`api/ragflow_server.py`): Quart-based async HTTP server
+- **Task Executors** (`rag/svr/task_executor.py`): Background workers processing documents from Redis streams. Multiple instances run in parallel (controlled by `WS` env var). Each consumes from priority-ordered Redis streams (`te.1.common`, `te.0.common`), using consumer groups for load distribution.
+
+Key consequence: task executors import a different code surface than the API server, so always check which process a module is meant for.
+
+### Backend API (`/api/`)
+
+- **App factory**: `api/apps/__init__.py` — creates the Quart app, configures auth (`login_required` decorator, JWT + API token + session fallback), and dynamically discovers/registers blueprints
+- **Three coexisting API patterns**:
+  - **RESTful APIs** in `api/apps/restful_apis/` — newer pattern with Pydantic request validation, service layer in `api/apps/services/`, routes registered under `/api/v1`
+  - **Legacy APIs** in `api/apps/*_app.py` — older pattern using `@validate_request()`, routes registered under `/v1/<page_name>`
+  - **SDK APIs** in `api/apps/sdk/` — registered under `/v1/`
+- **Services**: `api/db/services/` — business logic wrapping Peewee model operations. `api/apps/services/` — service layer for the RESTful APIs
+- **Models**: `api/db/db_models.py` — Peewee ORM models with pooled MySQL/PostgreSQL connections, custom `JSONField`/`ListField` types, retry logic on connection loss

 ### Core Processing (`/rag/`)
- **Document Processing**: `deepdoc/` - PDF parsing, OCR, layout analysis
- **LLM Integration**: `rag/llm/` - Model abstractions for chat, embedding, reranking
- **RAG Pipeline**: `rag/flow/` - Chunking, parsing, tokenization
- **Graph RAG**: `graphrag/` - Knowledge graph construction and querying
+
+- **Document ingestion pipeline**: `rag/flow/pipeline.py` — `Pipeline` (extends `agent.canvas.Graph`) orchestrates the ingestion DAG. Components: File (fetches binary from storage), Parser (dispatches to `deepdoc.parser` based on file type), TokenChunker/TitleChunker (splits into chunks), Tokenizer (computes full-text tokens + embedding vectors), Extractor (LLM-based extraction). Data flows via Pydantic `*FromUpstream` schemas.
+- **Document parsing**: `deepdoc/` — PDF parsing (vision-based OCR, layout analysis, table structure recognition) and format-specific parsers (DOCX, XLSX, PPT, Markdown, HTML, images). All parsers normalize to a common structure (list of bbox dicts for PDFs, `{text, doc_type_kwd}` for others).
+- **DeepDoc HTTP API service** (`deepdoc/server/`): OSS ONNX models (DLA, OCR, TSR) wrapped with LitServe as a standalone HTTP API on port 8124. The Go parser (`internal/parser/`) calls this service via `DeepDocClient`. Endpoints: `GET /health`, `GET /model`, `POST /predict/dla`, `POST /predict/tsr`, `POST /predict/ocr` (with `operator=det` or `operator=rec` form field). Docker image: `deepdoc_oss:latest`. See `deepdoc/server/README.md` for the full API reference.
+- **LLM Integration**: `rag/llm/` — factory pattern with runtime class discovery. `chat_model.py` (30+ providers via OpenAI SDK and LiteLLM wrappers), `embedding_model.py`, `rerank_model.py`, `cv_model.py` (image-to-text), `sequence2txt_model.py` (ASR), `tts_model.py`. Use `LLMBundle` (from `api.db.services.llm_service`) as the unified interface.
+- **Graph RAG**: `rag/graphrag/` — multi-phase pipeline: per-document subgraph extraction (LLM or spaCy NER), Leiden community detection, entity resolution, community summarization. Entities/relations/reports are indexed as chunks alongside regular text chunks, differentiated by `knowledge_graph_kwd`.
+- **Search**: `rag/nlp/search.py` — `Dealer` class combines vector similarity + BM25 + re-ranking. `KGSearch` extends it for graph-aware retrieval (entity resolution, n-hop enrichment).

 ### Agent System (`/agent/`)
- **Components**: Modular workflow components (LLM, retrieval, categorize, etc.)
- **Templates**: Pre-built agent workflows in `agent/templates/`
- **Tools**: External API integrations (Tavily, Wikipedia, SQL execution, etc.)
+
+- **Execution engine**: `agent/canvas.py` — `Canvas` (extends `Graph`) executes the DAG. Components are run in topological order via `_run_batch`, each receiving upstream outputs as kwargs. Control-flow components (`Categorize`, `Switch`, `Iteration`, `Loop`) dynamically modify the execution path.
+- **Component base**: `agent/component/base.py` — `ComponentBase` with `invoke(**kwargs)` / `invoke_async(**kwargs)` lifecycle. Variable references (`{component_id@output_var}` or `{sys.query}`) are resolved from the canvas graph at runtime.
+- **Components**: Modular workflow components in `agent/component/` — Begin, LLM, Agent (tool-calling LLM), Categorize, Switch, Iteration, Loop, Message, Invoke (HTTP), and data manipulation nodes. Auto-discovered by `__init__.py`.
+- **Templates**: Pre-built agent workflows as JSON DSL files in `agent/templates/`. Each contains a complete `components` DAG, `path`, and `globals`.
+- **Tools**: `agent/tools/` — Retrieval, web search (DuckDuckGo, Google, Tavily, SearXNG), academic search (ArXiv, PubMed, Google Scholar, Wikipedia), code execution, SQL execution, email, GitHub, finance data, translation, weather. Tools implement `ToolBase` (extends `ComponentBase`) and produce OpenAI-compatible function descriptors.
+- **Plugins**: `agent/plugin/` — plugin system using `pluginlib` for loading external LLM tool plugins from `embedded_plugins/`.

 ### Frontend (`/web/`)
- React/TypeScript with UmiJS framework
- Ant Design + shadcn/ui components
- State management with Zustand
- Tailwind CSS for styling
+
+- React/TypeScript, built with Vite
+- shadcn/ui components (Radix UI primitives + Tailwind CSS)
+- `@tanstack/react-query` for server state (cache keys, mutations, invalidation)
+- Zustand for local state (primarily agent canvas graph store)
+- `react-router` v7 with lazy-loaded pages
+- `react-i18next` for i18n (17 languages)
+- Axios for HTTP with a layered pattern: endpoint definitions (`utils/api.ts`) → HTTP client (`utils/next-request.ts`) → service layer (`services/`) → query hooks (`hooks/use-*-request.ts`) → components
+- `@xyflow/react` for the agent workflow canvas
+- `react-hook-form` + `zod` for form validation
+- Two API proxy prefixes: `webAPI = '/v1'` (legacy) and `restAPIv1 = '/api/v1'` (RESTful)

 ## Common Development Commands

 ### Backend Development
+
 ```bash
 # Install Python dependencies
-uv sync --python 3.10 --all-extras
-uv run download_deps.py
-pre-commit install
+uv sync --python 3.13 --all-extras
+uv run python3 ragflow_deps/download_deps.py
+
+# Run once after the first clone to enable local Git hooks
+lefthook install

 # Start dependent services
 docker compose -f docker/docker-compose-base.yml up -d
@@ -66,6 +93,7 @@ ruff format
 ```

 ### Frontend Development
+
 ```bash
 cd web
 npm install
@@ -76,14 +104,19 @@ npm run test       # Jest tests
 ```

 ### Docker Development
+
 ```bash
-# Full stack with Docker
+# Full stack with Docker (includes deepdoc vision service)
 cd docker
 docker compose -f docker-compose.yml up -d

 # Check server status
 docker logs -f ragflow-server

+# Build the OSS deepdoc vision service standalone
+docker build -f docker/Dockerfile_deepdoc_oss -t deepdoc_oss:latest .
+docker run -p 8124:8124 deepdoc_oss:latest
+
 # Rebuild images
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```
@@ -104,13 +137,172 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 ## Database Engines

 RAGFlow supports switching between Elasticsearch (default) and Infinity:
+
 - Set `DOC_ENGINE=infinity` in `docker/.env` to use Infinity
 - Requires container restart: `docker compose down -v && docker compose up -d`

+## Account Password Handling (Critical for Login Flow)
+
+### Password Encryption Pipeline (Browser → Backend → DB Hash)
+
+The login password verification chain is counterintuitive. Understanding this is essential when generating or verifying password hashes.
+
+**Complete flow:**
+
+```
+Browser input: "demo"
+  → Base64("demo") = "ZGVtbw=="
+  → RSA encrypt with conf/public.pem
+  → POST to /api/v1/auth/login
+
+Backend DecryptPassword():
+  → RSA decrypt with conf/private.pem (passphrase: "Welcome")
+  → Returns "ZGVtbw=="  (NOT "demo"!)
+
+VerifyPassword("ZGVtbw==", storedHash)  ← hash is of Base64(password), not raw password
+```
+
+**Consequences:**
+- The string verified against the hash is **Base64(original password)**, never the raw password
+- `DecryptPassword()` handles both RSA-encrypted (browser) and plaintext (curl/API key) inputs: if base64 decode fails, the input is returned as-is for backward compatibility
+- Python backend has the same design: `api/utils/crypt.py:decrypt()` RSA-decrypts and returns the Base64-encoded string directly, no further decode
+
+### How to Generate a Valid Password Hash
+
+```bash
+# For password "demo" (user input in browser):
+# The actual verified string = Base64("demo") = "ZGVtbw=="
+# Generate hash with: common.GenerateWerkzeugPasswordHash("ZGVtbw==")
+# or use the scrypt template:
+# scrypt:32768:8:1$<random-b64-salt>$<hex-hash-of-ZGVtbw==>
+```
+
+**To update a user's password in the running database:**
+```bash
+docker exec docker-mysql-1 mysql -u root -pinfini_rag_flow rag_flow \
+  -e "UPDATE user SET password='<hash>' WHERE email='<email>';"
+```
+
+### RSA Keys
+- `conf/public.pem` — frontend uses this to encrypt Base64(password) before sending
+- `conf/private.pem` — backend uses this to decrypt, passphrase `"Welcome"`
+- Both referenced in `internal/common/password.go:DecryptPassword()`
+
+### Obtaining an API Token for a Tenant
+
+When testing APIs manually (curl, Go scripts, etc.), you need a valid auth token. The login endpoint returns **two different tokens**:
+
+| Field | Format | Purpose |
+|-------|--------|---------|
+| `response.body.data.access_token` | Raw UUID | Stored in DB, NOT used for API auth |
+| `response.Header["Authorization"]` | itsdangerous-signed token | Used as `Bearer <token>` for all subsequent API requests |
+
+**How to obtain the correct token:**
+
+```bash
+# Step 1: Construct the encrypted password
+# Raw password → Base64 → RSA encrypt with conf/public.pem
+PASSWORD="demo"
+PASSWORD_B64=$(echo -n "$PASSWORD" | base64)
+
+# Step 2: POST to login (use RSA encryption — easiest via a Go/Python script)
+# Response header contains: Authorization: <itsdangerous-signed-token>
+
+# Step 3: Use the Authorization header value for all API requests
+curl -H "Authorization: <itsdangerous-signed-token>" \
+  http://127.0.0.1:9222/api/v1/agents
+```
+
+**Go snippet (login + token extraction; imports and error handling omitted for brevity):**
+
+```go
+// Login
+passwordB64 := base64.StdEncoding.EncodeToString([]byte(password))
+pubData, _ := os.ReadFile("conf/public.pem")
+block, _ := pem.Decode(pubData)
+pubKey, _ := x509.ParsePKIXPublicKey(block.Bytes)
+ciphertext, _ := rsa.EncryptPKCS1v15(rand.Reader, pubKey.(*rsa.PublicKey), []byte(passwordB64))
+encryptedB64 := base64.StdEncoding.EncodeToString(ciphertext)
+
+body, _ := json.Marshal(map[string]string{"email": email, "password": encryptedB64})
+resp, _ := http.Post(baseURL+"/api/v1/auth/login", "application/json", bytes.NewReader(body))
+
+// KEY: use the Authorization header, NOT body.access_token
+authToken := resp.Header.Get("Authorization")
+
+// Use for API calls
+req, _ := http.NewRequest("GET", baseURL+"/api/v1/agents", nil)
+req.Header.Set("Authorization", authToken)
+```
+
+**The raw `access_token` (UUID) in the response body** is the internal DB token used only by the `itsdangerous` middleware to verify the signed token — it is never passed directly in API Authorization headers.
+
+---
+
+## Agent Run E2E Tests
+
+### Running the Tests
+
+```bash
+# Run all agent run e2e tests (in-memory SQLite + miniredis, no Docker needed)
+cd "$(git rev-parse --show-toplevel)"  # run from the repository root
+go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/
+```
+
+### Test Architecture
+
+All e2e tests live in `internal/service/agent_run_e2e_test.go`. They exercise the full production chain:
+
+```
+loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL →
+canvas.Compile → cc.Workflow.Invoke → answer extraction
+```
+
+**Test isolation**: Each test stands up its own in-memory SQLite DB (pushed as `dao.DB`), seeds User/Tenant/UserCanvas/UserCanvasVersion rows, and tears down in `t.Cleanup`. Tests use **miniredis** for Redis-backed CheckPointStore + RunTracker — no external services needed.
+
+**Key test helpers:**
+- `makeCanvasWithDSL(t, canvasID, userID, tenantID, versionID, dsl)` — seeds all required DB rows
+- `drainAgentEvents(t, events)` — drains the `<-chan canvas.RunEvent` channel, buckets results into `messages`, `waiting`, `errors_`, `done`
+- `newRunTrackerForTest(t, ttl)` — wires a `canvas.RunTracker` against in-memory miniredis
+
+**Existing e2e tests:**
+
+| Test | What it covers |
+|------|---------------|
+| `TestRunAgent_RealCanvas_BeginMessage` | Happy path: Begin→Message, verifies `"{{sys.query}}"` resolution |
+| `TestRunAgent_RealCanvas_WaitForUserResume` | Resume path: Begin→Message→UserFillUp, two-run cycle |
+| `TestRunAgent_RealCanvas_CompileFails` | Error path: unknown component name → sanitized error |
+| `TestRunAgent_RealCanvas_InvokeFails` | Error path: unresolvable template ref |
+| `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` | Production boot: Start→AttachCheckpoint→MarkSucceeded with Redis/miniredis |
+
+**Test DSL data files** are in `internal/agent/dsl/testdata/`:
+- `agent_msg.json` — Agent+Message with Begin, LLM-powered agent component
+- `all.json` — Complex: Begin→UserFillUp→Switch→Loop→Message
+- `switch.json`, `resume.json`, `browser.json`, `subagent.json`, etc.
+
+**Handler-level SSE streaming tests** in `internal/handler/agent_test.go` use a `stubChatRunner` that emits pre-configured `canvas.RunEvent` values without a real DB or eino runner, verifying:
+- SSE `Content-Type: text/event-stream`
+- `data: {...}\n\n` framing
+- Trailing `data: [DONE]\n\n` terminator
+- OpenAI-compatible non-stream `choices` response shape
+
+**Important**: `_ "ragflow/internal/agent/component"` (blank import in test) is required — it triggers `init()` to register all component factories. Without it, `canvas.Compile` fails to resolve any component type.
+
+---
+
 ## Development Environment Requirements

- Python 3.10-3.12
+- Python 3.10-3.13
 - Node.js >=18.20.4
 - Docker & Docker Compose
 - uv package manager
- 16GB+ RAM, 50GB+ disk space
+- 16GB+ RAM, 50GB+ disk space
+
+1. Think before acting. Read existing files before writing code.
+2. Be concise in output but thorough in reasoning.
+3. Prefer editing over rewriting whole files.
+4. Do not re-read files you have already read.
+5. Test your code before declaring done.
+6. No sycophantic openers or closing fluff.
+7. Keep solutions simple and direct.
+8. User instructions always override this file.
--- a/202
+++ b/202
@@ -1,5 +1,5 @@
 # base stage
-FROM ubuntu:22.04 AS base
+FROM ubuntu:24.04 AS base
 USER root
 SHELL ["/bin/bash", "-c"]

@@ -7,88 +7,140 @@ ARG NEED_MIRROR=0

 WORKDIR /ragflow

-# Copy models downloaded via download_deps.py
+# copy models downloaded via download_deps.py
 RUN mkdir -p /ragflow/rag/res/deepdoc /root/.ragflow
 RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/huggingface.co,target=/huggingface.co \
-    cp /huggingface.co/InfiniFlow/huqie/huqie.txt.trie /ragflow/rag/res/ && \
    tar --exclude='.*' -cf - \
        /huggingface.co/InfiniFlow/text_concat_xgb_v1.0 \
        /huggingface.co/InfiniFlow/deepdoc \
-        | tar -xf - --strip-components=3 -C /ragflow/rag/res/deepdoc 
+        | tar -xf - --strip-components=3 -C /ragflow/rag/res/deepdoc

 # https://github.com/chrismattmann/tika-python
 # This is the only way to run python-tika without internet access. Without this set, the default is to check the tika version and pull latest every time from Apache.
 RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
    cp -r /deps/nltk_data /root/ && \
-    cp /deps/tika-server-standard-3.0.0.jar /deps/tika-server-standard-3.0.0.jar.md5 /ragflow/ && \
+    cp /deps/tika-server-standard-3.3.0.jar /deps/tika-server-standard-3.3.0.jar.md5 /ragflow/ && \
    cp /deps/cl100k_base.tiktoken /ragflow/9b5ad71b2ce5302211f9c61530b329a4922fc6a4

-ENV TIKA_SERVER_JAR="file:///ragflow/tika-server-standard-3.0.0.jar"
+ENV TIKA_SERVER_JAR="file:///ragflow/tika-server-standard-3.3.0.jar"
 ENV DEBIAN_FRONTEND=noninteractive

 # Setup apt
 # Python package and implicit dependencies:
 # opencv-python: libglib2.0-0 libglx-mesa0 libgl1
-# aspose-slides: pkg-config libicu-dev libgdiplus         libssl1.1_1.1.1f-1ubuntu2_amd64.deb
-# python-pptx:   default-jdk                              tika-server-standard-3.0.0.jar
+# python-pptx:   default-jdk                              tika-server-standard-3.3.0.jar
 # selenium:      libatk-bridge2.0-0                       chrome-linux64-121-0-6167-85
 # Building C extensions: libpython3-dev libgtk-4-1 libnss3 xdg-utils libgbm-dev
 RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+    apt update && \
+    apt --no-install-recommends install -y ca-certificates; \
    if [ "$NEED_MIRROR" == "1" ]; then \
-        sed -i 's|http://ports.ubuntu.com|http://mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list; \
-        sed -i 's|http://archive.ubuntu.com|http://mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list; \
+        # CI runners may inject a proxy whose TLS certificate is not trusted inside
+        # the fresh Ubuntu base image yet. Keep the Ubuntu mirror on HTTP here so
+        # the mirror switch remains usable before the full CA store is available.
+        sed -i 's|http://archive.ubuntu.com/ubuntu|http://mirrors.aliyun.com/ubuntu|g' /etc/apt/sources.list.d/ubuntu.sources; \
+        sed -i 's|http://security.ubuntu.com/ubuntu|http://mirrors.aliyun.com/ubuntu|g' /etc/apt/sources.list.d/ubuntu.sources; \
    fi; \
    rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache && \
    chmod 1777 /tmp && \
    apt update && \
-    apt --no-install-recommends install -y ca-certificates && \
-    apt update && \
-    apt install -y libglib2.0-0 libglx-mesa0 libgl1 && \
-    apt install -y pkg-config libicu-dev libgdiplus && \
-    apt install -y default-jdk && \
-    apt install -y libatk-bridge2.0-0 && \
-    apt install -y libpython3-dev libgtk-4-1 libnss3 xdg-utils libgbm-dev && \
-    apt install -y libjemalloc-dev && \
-    apt install -y python3-pip pipx nginx unzip curl wget git vim less && \
-    apt install -y ghostscript
+    apt install -y \
+    libglib2.0-0 libglx-mesa0 libgl1 pkg-config libgdiplus default-jdk libatk-bridge2.0-0 libgtk-4-1 libnss3 xdg-utils libjemalloc-dev gnupg unzip curl wget git vim less ghostscript pandoc texlive texlive-latex-extra texlive-xetex texlive-lang-chinese fonts-freefont-ttf fonts-noto-cjk postgresql-client && \
+    rm -rf /var/lib/apt/lists/*

-RUN if [ "$NEED_MIRROR" == "1" ]; then \
-        pip3 config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple && \
-        pip3 config set global.trusted-host pypi.tuna.tsinghua.edu.cn; \
+# Download resources from GitHub (or Gitee when NEED_MIRROR=1) to /usr/share/infinity/resource
+RUN mkdir -p /usr/share/infinity/resource && \
+    if [ "$NEED_MIRROR" == "1" ]; then \
+        git clone --depth 1 --single-branch https://gitee.com/infiniflow/resource /tmp/resource; \
+    else \
+        git clone --depth 1 --single-branch https://github.com/infiniflow/resource.git /tmp/resource; \
+    fi && \
+    cp -r /tmp/resource/* /usr/share/infinity/resource && \
+    rm -rf /tmp/resource
+
+ARG NGINX_VERSION=1.31.0-1~noble
+RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+    mkdir -p /etc/apt/keyrings && \
+    curl --retry 5 --retry-delay 2 --retry-all-errors -fsSL https://nginx.org/keys/nginx_signing.key | gpg --dearmor -o /etc/apt/keyrings/nginx-archive-keyring.gpg && \
+    echo "deb [signed-by=/etc/apt/keyrings/nginx-archive-keyring.gpg] https://nginx.org/packages/mainline/ubuntu/ noble nginx" > /etc/apt/sources.list.d/nginx.list && \
+    apt -o Acquire::Retries=5 update && \
+    apt -o Acquire::Retries=5 install -y nginx=${NGINX_VERSION} && \
+    apt-mark hold nginx && \
+    rm -rf /var/lib/apt/lists/*
+
+# Install uv
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
+    if [ "$NEED_MIRROR" == "1" ]; then \
        mkdir -p /etc/uv && \
-        echo "[[index]]" > /etc/uv/uv.toml && \
-        echo 'url = "https://pypi.tuna.tsinghua.edu.cn/simple"' >> /etc/uv/uv.toml && \
-        echo "default = true" >> /etc/uv/uv.toml; \
+        echo 'python-install-mirror = "https://registry.npmmirror.com/-/binary/python-build-standalone/"' > /etc/uv/uv.toml && \
+        echo '[[index]]' >> /etc/uv/uv.toml && \
+        echo 'url = "https://mirrors.aliyun.com/pypi/simple"' >> /etc/uv/uv.toml && \
+        echo 'default = true' >> /etc/uv/uv.toml; \
    fi; \
-    pipx install uv
+    arch="$(uname -m)"; \
+    if [ "$arch" = "x86_64" ]; then uv_arch="x86_64"; else uv_arch="aarch64"; fi; \
+    tar xzf "/deps/uv-${uv_arch}-unknown-linux-gnu.tar.gz" \
+    && cp "uv-${uv_arch}-unknown-linux-gnu/"* /usr/local/bin/ \
+    && rm -rf "uv-${uv_arch}-unknown-linux-gnu" \
+    && uv python install 3.13

-ENV PYTHONDONTWRITEBYTECODE=1 DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1
+ENV PYTHONDONTWRITEBYTECODE=1 DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1 \
+    UV_HTTP_TIMEOUT=200 \
+    UV_HTTP_RETRIES=3
 ENV PATH=/root/.local/bin:$PATH

 # nodejs 12.22 on Ubuntu 22.04 is too old
 RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
    curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && \
-    apt purge -y nodejs npm cargo && \
+    apt purge -y nodejs npm && \
    apt autoremove -y && \
    apt update && \
-    apt install -y nodejs
+    apt install -y nodejs && \
+    rm -rf /var/lib/apt/lists/*

-# A modern version of cargo is needed for the latest version of the Rust compiler.
-RUN apt update && apt install -y curl build-essential \
-    && if [ "$NEED_MIRROR" == "1" ]; then \
-         # Use TUNA mirrors for rustup/rust dist files
-         export RUSTUP_DIST_SERVER="https://mirrors.tuna.tsinghua.edu.cn/rustup"; \
-         export RUSTUP_UPDATE_ROOT="https://mirrors.tuna.tsinghua.edu.cn/rustup/rustup"; \
-         echo "Using TUNA mirrors for Rustup."; \
-       fi; \
-    # Force curl to use HTTP/1.1
-    curl --proto '=https' --tlsv1.2 --http1.1 -sSf https://sh.rustup.rs | bash -s -- -y --profile minimal \
-    && echo 'export PATH="/root/.cargo/bin:${PATH}"' >> /root/.bashrc
-
-ENV PATH="/root/.cargo/bin:${PATH}"
-
-RUN cargo --version && rustc --version
+# stagehand-server-v3 (Node.js SEA binary used by Browser component
+# in local mode).
+#
+# The `v3.21.0` value below is the `stagehand-go/v3` Go module
+# version pinned in `go.mod`. It is used here only to compute the
+# `go_<ver>/` subdirectory that `local.go:cacheDir()` will look in
+# for the binary at runtime — that subdirectory name is keyed by
+# the Go module's own `internal.PackageVersion`, NOT by the server
+# binary's release tag.
+#
+# The server binary itself is fetched separately by `download_deps.py`
+# from the browserbase/stagehand GitHub releases. The two are
+# LOOSELY MATCHED — both stay on the v3.x line and remain protocol-
+# compatible, but the version numbers do NOT track each other (Go
+# SDK is at v3.21.0, server binary is at v3.7.2 today). On every
+# go.mod bump, refresh the server binary pin in `download_deps.py`
+# to the latest server release; the two version numbers do not
+# need to correspond.
+#
+# Drift on the Go SDK pin (this ARG vs go.mod) forces a fresh
+# GitHub download at process boot — a hard failure in air-gapped
+# deployments. CI cross-checks the two values.
+#
+# The binary is pre-fetched by `download_deps.py` and shipped via
+# the ragflow_deps image, then written directly to the stagehand-go
+# cache path that `local.go:cacheDir()` constructs at runtime —
+# `/root/.cache/stagehand/lib/go_<ver>/stagehand-server-v3-linux-<arch>`.
+ARG STAGEHAND_GO_VERSION=v3.21.0
+RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
+    set -eux; \
+    arch="$(uname -m)"; \
+    case "$arch" in \
+        x86_64) stagehand_arch=x64 ;; \
+        aarch64|arm64) stagehand_arch=arm64 ;; \
+        *) echo "Unsupported architecture: $arch" >&2; exit 1 ;; \
+    esac; \
+    stagehand_version="${STAGEHAND_GO_VERSION#v}"; \
+    stagehand_cache_dir="/root/.cache/stagehand/lib/go_${stagehand_version}"; \
+    mkdir -p "${stagehand_cache_dir}"; \
+    cp "/deps/stagehand-server-v3-linux-${stagehand_arch}" \
+       "${stagehand_cache_dir}/stagehand-server-v3-linux-${stagehand_arch}"; \
+    chmod +x "${stagehand_cache_dir}/stagehand-server-v3-linux-${stagehand_arch}"

 # Add mssql ODBC driver
 # macOS ARM64 environment, install msodbcsql18.
@@ -99,12 +151,13 @@ RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
    apt update && \
    arch="$(uname -m)"; \
    if [ "$arch" = "arm64" ] || [ "$arch" = "aarch64" ]; then \
-        # ARM64 (macOS/Apple Silicon or Linux aarch64)
+        # ARM64 (macOS/Apple Silicon or Linux aarch64) \
        ACCEPT_EULA=Y apt install -y unixodbc-dev msodbcsql18; \
    else \
-        # x86_64 or others
+        # x86_64 or others \
        ACCEPT_EULA=Y apt install -y unixodbc-dev msodbcsql17; \
-    fi || \
+    fi && \
+    rm -rf /var/lib/apt/lists/* || \
    { echo "Failed to install ODBC driver"; exit 1; }


@@ -119,8 +172,6 @@ RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/chromedriver-l
    mv chromedriver /usr/local/bin/ && \
    rm -f /usr/bin/google-chrome

-# https://forum.aspose.com/t/aspose-slides-for-net-no-usable-version-of-libssl-found-with-linux-server/271344/13
-# aspose-slides on linux/arm64 is unavailable
 RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
    if [ "$(uname -m)" = "x86_64" ]; then \
        dpkg -i /deps/libssl1.1_1.1.1f-1ubuntu2_amd64.deb; \
@@ -135,23 +186,54 @@ USER root

 WORKDIR /ragflow

+# Install build-only dependencies for compiling Python C extensions.
+# These are not inherited from base to keep the production image smaller.
+RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+    apt update && \
+    apt install -y build-essential libpython3-dev libicu-dev libgbm-dev && \
+    rm -rf /var/lib/apt/lists/*
+
 # install dependencies from uv.lock file
 COPY pyproject.toml uv.lock ./

 # https://github.com/astral-sh/uv/issues/10462
 # uv records index url into uv.lock but doesn't failover among multiple indexes
+# Also rewrite pypi.tuna.tsinghua.edu.cn to mirrors.aliyun.com/pypi so locks
+# that were resolved against the Tsinghua mirror (e.g. when UV_INDEX pointed
+# there) get normalized to the Aliyun mirror in NEED_MIRROR=1 builds. Without
+# this, stale Tsinghua URLs slip through and `uv sync --frozen` 404s on
+# packages that the Tsinghua mirror no longer carries.
 RUN --mount=type=cache,id=ragflow_uv,target=/root/.cache/uv,sharing=locked \
    if [ "$NEED_MIRROR" == "1" ]; then \
-        sed -i 's|pypi.org|pypi.tuna.tsinghua.edu.cn|g' uv.lock; \
+        sed -i 's|pypi.org|mirrors.aliyun.com/pypi|g' uv.lock; \
+        sed -i 's|pypi.tuna.tsinghua.edu.cn|mirrors.aliyun.com/pypi|g' uv.lock; \
    else \
+        sed -i 's|mirrors.aliyun.com/pypi|pypi.org|g' uv.lock; \
        sed -i 's|pypi.tuna.tsinghua.edu.cn|pypi.org|g' uv.lock; \
+        sed -i 's|gitee.com|github.com|g' uv.lock; \
    fi; \
-    uv sync --python 3.10 --frozen
+    # --refresh-package litellm forces a re-download of litellm from the
+    # (post-sed) URLs in uv.lock even if BuildKit's persistent uv cache mount
+    # holds a stale wheel from a previous build. litellm 1.88.x has had
+    # multiple internal ImportError issues (1.88.1 missing
+    # DEFAULT_HEALTH_CHECK_STALENESS_MULTIPLIER, 1.88.0 wheel pulled via
+    # some proxies missing RedisPipelineLpopOperation) — always re-fetching
+    # the locked version avoids serving a half-broken cached copy.
+    uv sync --python 3.13 --frozen --refresh-package litellm && \
+    # Ensure pip is available in the venv for runtime package installation (fixes #12651)
+    .venv/bin/python3 -m ensurepip --upgrade

+# Install frontend dependencies — depends only on package manifests so
+# web source / docs changes don't invalidate this layer.
+COPY web/package.json web/package-lock.json web/.npmrc ./web/
+RUN --mount=type=cache,id=ragflow_npm,target=/root/.npm,sharing=locked \
+    cd web && NODE_OPTIONS="--max-old-space-size=8192" npm install
+
+# Copy full web source and docs for the frontend build.
 COPY web web
 COPY docs docs
 RUN --mount=type=cache,id=ragflow_npm,target=/root/.npm,sharing=locked \
-    cd web && npm install && npm run build
+    cd web && NODE_OPTIONS="--max-old-space-size=8192" VITE_BUILD_SOURCEMAP=false VITE_MINIFY=esbuild npm run build

 COPY .git /ragflow/.git

@@ -173,24 +255,30 @@ ENV PATH="${VIRTUAL_ENV}/bin:${PATH}"

 ENV PYTHONPATH=/ragflow/

-COPY web web
 COPY admin admin
 COPY api api
 COPY conf conf
 COPY deepdoc deepdoc
 COPY rag rag
 COPY agent agent
-COPY graphrag graphrag
-COPY agentic_reasoning agentic_reasoning
 COPY pyproject.toml uv.lock ./
 COPY mcp mcp
-COPY plugin plugin
 COPY common common
+COPY memory memory
+COPY bin bin
+COPY tools/scripts tools/scripts

 COPY docker/service_conf.yaml.template ./conf/service_conf.yaml.template
 COPY docker/entrypoint.sh ./
 RUN chmod +x ./entrypoint*.sh

+# Copy nginx configuration for frontend serving
+COPY docker/nginx/ragflow.conf.golang docker/nginx/ragflow.conf.python docker/nginx/ragflow.conf.hybrid docker/nginx/nginx.conf docker/nginx/proxy.conf /etc/nginx/
+RUN mv /etc/nginx/ragflow.conf.golang /etc/nginx/conf.d/ragflow.conf.golang && \
+    mv /etc/nginx/ragflow.conf.python /etc/nginx/conf.d/ragflow.conf.python && \
+    mv /etc/nginx/ragflow.conf.hybrid /etc/nginx/conf.d/ragflow.conf.hybrid && \
+    rm -f /etc/nginx/sites-enabled/default
+
 # Copy compiled web pages
 COPY --from=builder /ragflow/web/dist /ragflow/web/dist
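The stagehand step in the Dockerfile diff above derives its cache path from two inputs: the Go module version with its leading `v` stripped, and the machine architecture mapped to `x64` or `arm64`. A minimal sketch of that derivation follows, assuming the paths shown in the diff. The function name `stagehand_cache_path` is hypothetical and exists only for illustration; it is not part of the repository.

```bash
# Sketch only: mirrors the path logic of the stagehand RUN step.
# stagehand_cache_path is a hypothetical helper, not repository code.
stagehand_cache_path() {
    go_version="$1"   # stagehand-go module version, e.g. v3.21.0
    machine="$2"      # machine name as printed by `uname -m`
    case "$machine" in
        x86_64) stagehand_arch=x64 ;;
        aarch64|arm64) stagehand_arch=arm64 ;;
        *) echo "Unsupported architecture: $machine" >&2; return 1 ;;
    esac
    # ${go_version#v} strips one leading "v": v3.21.0 becomes 3.21.0
    printf '%s\n' "/root/.cache/stagehand/lib/go_${go_version#v}/stagehand-server-v3-linux-${stagehand_arch}"
}

stagehand_cache_path v3.21.0 x86_64
stagehand_cache_path v3.21.0 aarch64
```

Because the directory name depends only on the Go module version, a mismatch between `STAGEHAND_GO_VERSION` and `go.mod` would place the binary where the runtime does not look, which is the drift the Dockerfile comment warns about.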

--- a/Dockerfile.deps
+++ b/Dockerfile.deps
@@ -1,10 +0,0 @@
-# This builds an image that contains the resources needed by Dockerfile
-#
-FROM scratch
-
-# Copy resources downloaded via download_deps.py
-COPY chromedriver-linux64-121-0-6167-85 chrome-linux64-121-0-6167-85 cl100k_base.tiktoken libssl1.1_1.1.1f-1ubuntu2_amd64.deb libssl1.1_1.1.1f-1ubuntu2_arm64.deb tika-server-standard-3.0.0.jar tika-server-standard-3.0.0.jar.md5 libssl*.deb /
-
-COPY nltk_data /nltk_data
-
-COPY huggingface.co /huggingface.co
--- a/66
+++ b/66
@@ -0,0 +1,66 @@
+# OSS DeepDoc server — minimal image with ONNX-only inference.
+# Build: docker build -f docker/Dockerfile_deepdoc_oss -t deepdoc_oss:latest .
+# With mirror (China): docker build --build-arg NEED_MIRROR=1 -f docker/Dockerfile_deepdoc_oss -t deepdoc_oss:latest .
+
+FROM ubuntu:24.04
+
+ARG NEED_MIRROR=1
+
+ENV PYTHONPATH=/app
+ENV DEBIAN_FRONTEND=noninteractive
+
+# ── System dependencies (onnxruntime + opencv runtime libs) ──
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    -o Acquire::Retries=5 \
+    python3.12 python3.12-venv \
+    libglib2.0-0 libglx-mesa0 libgl1 libgomp1 \
+    libgdiplus curl ca-certificates \
+    && rm -rf /var/lib/apt/lists/*
+
+# ── Python venv with ONNX inference stack ──
+RUN python3.12 -m venv /app/.venv
+COPY deepdoc/server/pyproject.toml /tmp/pyproject.toml
+RUN PIP_INDEX="https://pypi.org/simple" && \
+    PIP_TRUSTED="" && \
+    if [ "$NEED_MIRROR" = "1" ]; then \
+        PIP_INDEX="https://mirrors.aliyun.com/pypi/simple"; \
+        PIP_TRUSTED="mirrors.aliyun.com"; \
+    fi && \
+    if [ -n "$PIP_TRUSTED" ]; then \
+        /app/.venv/bin/pip install --no-cache-dir -i "$PIP_INDEX" --trusted-host "$PIP_TRUSTED" \
+        litserve onnxruntime opencv-python-headless numpy pillow pyclipper \
+        python-multipart shapely six huggingface_hub; \
+    else \
+        /app/.venv/bin/pip install --no-cache-dir -i "$PIP_INDEX" \
+        litserve onnxruntime opencv-python-headless numpy pillow pyclipper \
+        python-multipart shapely six huggingface_hub; \
+    fi
+
+# ── ONNX models (downloaded from HuggingFace) ──
+COPY deepdoc/server/download_deps.py /tmp/download_deps.py
+RUN if [ "$NEED_MIRROR" = "1" ]; then \
+        export HF_ENDPOINT=https://hf-mirror.com; \
+    fi && \
+    mkdir -p /app/rag/res/deepdoc && \
+    /app/.venv/bin/python3 /tmp/download_deps.py /app/rag/res/deepdoc
+
+# ── Vision module (ONNX inference logic) ──
+RUN mkdir -p /app/deepdoc/vision
+COPY deepdoc/vision/ /app/deepdoc/vision/
+
+# ── Docker stubs (lightweight replacements for heavy common/rag/deepdoc imports) ──
+COPY deepdoc/server/docker_stubs.py /tmp/docker_stubs.py
+RUN /app/.venv/bin/python3 /tmp/docker_stubs.py
+
+# ── Server code ──
+RUN mkdir -p /app/deepdoc/server/endpoints /app/deepdoc/server/adapters
+COPY deepdoc/server/deepdoc_server.py       /app/deepdoc/server/
+COPY deepdoc/server/endpoints/           /app/deepdoc/server/endpoints/
+COPY deepdoc/server/adapters/            /app/deepdoc/server/adapters/
+
+EXPOSE 9390
+
+HEALTHCHECK --interval=10s --timeout=10s --retries=5 \
+    CMD curl -f http://localhost:9390/health || exit 1
+
+ENTRYPOINT ["/app/.venv/bin/python3", "/app/deepdoc/server/deepdoc_server.py", "--model-dir", "/app/rag/res/deepdoc"]
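The pip step in the DeepDoc Dockerfile above repeats the full package list in two branches that differ only in whether `--trusted-host` is passed. One possible way to express the same choice once is the `${var:+word}` parameter expansion, sketched below under the assumption that the index URLs are the ones in the diff. The helper name `pip_index_args` is hypothetical and not part of the repository.

```bash
# Sketch only: pip_index_args is a hypothetical helper, not repository code.
pip_index_args() {
    need_mirror="$1"
    pip_index="https://pypi.org/simple"
    pip_trusted=""
    if [ "$need_mirror" = "1" ]; then
        pip_index="https://mirrors.aliyun.com/pypi/simple"
        pip_trusted="mirrors.aliyun.com"
    fi
    # ${var:+word} expands to word only when var is set and non-empty,
    # so --trusted-host is omitted entirely for the default index.
    printf '%s\n' "-i $pip_index${pip_trusted:+ --trusted-host $pip_trusted}"
}

pip_index_args 1
pip_index_args 0
```

With such a helper, a single `pip install --no-cache-dir $(pip_index_args "$NEED_MIRROR") ...` call (left unquoted so the arguments split into words) could replace both branches. Whether that reads better than the explicit `if`/`else` is a style choice for the maintainers.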
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Document</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Roadmap</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -55,11 +57,11 @@
 <summary><b>📕 Table of Contents</b></summary>

 - 💡 [What is RAGFlow?](#-what-is-ragflow)
- 🎮 [Demo](#-demo)
+- 🎮 [Get Started](#-get-started)
 - 📌 [Latest Updates](#-latest-updates)
 - 🌟 [Key Features](#-key-features)
 - 🔎 [System Architecture](#-system-architecture)
- 🎬 [Get Started](#-get-started)
+- 🎬 [Self-Hosting](#-self-hosting)
 - 🔧 [Configurations](#-configurations)
 - 🔧 [Build a Docker image](#-build-a-docker-image)
 - 🔨 [Launch service from source for development](#-launch-service-from-source-for-development)
@@ -72,11 +74,11 @@

 ## 💡 What is RAGFlow?

-[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged context engine and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
+[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation ([RAG](https://ragflow.io/basics/what-is-rag)) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.

-## 🎮 Demo
+## 🎮 Get Started

-Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
+Try our cloud service at [https://cloud.ragflow.io](https://cloud.ragflow.io).

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -85,16 +87,18 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).

 ## 🔥 Latest Updates

- 2025-11-12 Supports data synchronization from Confluence, AWS S3, Discord, Google Drive.
+- 2026-06-15 Supports multiple chat channels, such as Feishu, Discord, Telegram, and Line.
+- 2026-04-24 Supports DeepSeek v4.
+- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — Provides an official skill for accessing RAGFlow datasets via OpenClaw.
+- 2025-12-26 Supports 'Memory' for AI agent.
+- 2025-11-19 Supports Gemini 3 Pro.
+- 2025-11-12 Supports data synchronization from Confluence, S3, Notion, Discord, Google Drive.
 - 2025-10-23 Supports MinerU & Docling as document parsing methods.
 - 2025-10-15 Supports orchestrable ingestion pipeline.
 - 2025-08-08 Supports OpenAI's latest GPT-5 series models.
 - 2025-08-01 Supports agentic workflow and MCP.
 - 2025-05-23 Adds a Python/JavaScript code executor component to Agent.
- 2025-05-05 Supports cross-language query.
 - 2025-03-19 Supports using a multi-modal model to make sense of images within PDF or DOCX files.
- 2024-12-18 Upgrades Document Layout Analysis model in DeepDoc.
- 2024-08-22 Support text to SQL statements through RAG.

 ## 🎉 Stay Tuned

@@ -140,7 +144,7 @@ releases! 🌟
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 Get Started
+## 🎬 Self-Hosting

 ### 📝 Prerequisites

@@ -148,6 +152,7 @@ releases! 🌟
 - RAM >= 16 GB
 - Disk >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.

 > [!TIP]
@@ -188,12 +193,14 @@ releases! 🌟
 > All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
 > If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.

-> The command below downloads the `v0.22.0` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.22.0`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
+> The command below downloads the `v0.26.2` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.26.2`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.

 ```bash
   $ cd ragflow/docker
-   
-   # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases), e.g.: git checkout v0.22.0
+
+   # git checkout v0.26.2
+   # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
+   # This step ensures the **entrypoint.sh** file in the code matches the Docker image version.

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -205,10 +212,10 @@ releases! 🌟

 > Note: Prior to `v0.22.0`, we provided both images with embedding models and slim images without embedding models. Details as follows:

-| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-| ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-| v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

 > Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tag.

@@ -231,7 +238,7 @@ releases! 🌟
    * Running on all addresses (0.0.0.0)
   ```

-   > If you skip this confirmation step and directly log in to RAGFlow, your browser may prompt a `network anormal`
+   > If you skip this confirmation step and directly log in to RAGFlow, your browser may prompt a `network abnormal`
   > error because, at that moment, your RAGFlow may not be fully initialized.
   >
 5. In your web browser, enter the IP address of your server and log in to RAGFlow.
@@ -301,21 +308,33 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+If you are behind a proxy, pass the proxy settings as build arguments:
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 Launch service from source for development

-1. Install `uv` and `pre-commit`, or skip this step if they are already installed:
+> [!IMPORTANT]
+> After cloning the repository for the first time, run `lefthook install` once from the repo root to enable local Git hooks.
+
+1. Install `uv`, or skip this step if it is already installed:

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   ```
 2. Clone the source code and install Python dependencies:

   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # install RAGFlow dependent python modules
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```
 3. Launch the dependent services (MinIO, Elasticsearch, Redis, and MySQL) using Docker Compose:

@@ -378,19 +397,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 Roadmap

-See the [RAGFlow Roadmap 2025](https://github.com/infiniflow/ragflow/issues/4214)
+See the [RAGFlow Roadmap 2026](https://github.com/infiniflow/ragflow/issues/12241)

 ## 🏄 Community

 - [Discord](https://discord.gg/NjYzJD3GM3)
- [Twitter](https://twitter.com/infiniflowai)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 Contributing
--- a/README_ar.md
+++ b/README_ar.md
@@ -0,0 +1,415 @@
+<div align="center">
+<a href="https://cloud.ragflow.io/">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
+</a>
+</div>
+
+<p align="center">
+  <a href="./README.md"><img alt="README in English" src="https://img.shields.io/badge/English-DFE0E5"></a>
+  <a href="./README_zh.md"><img alt="简体中文版自述文件" src="https://img.shields.io/badge/简体中文-DFE0E5"></a>
+  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
+  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
+  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
+  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
+  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DBEDFA"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
+</p>
+
+<p align="center">
+    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
+        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
+    </a>
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
+    </a>
+    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
+    </a>
+    <a href="https://github.com/infiniflow/ragflow/releases/latest">
+        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
+    </a>
+    <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
+        <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
+    </a>
+    <a href="https://deepwiki.com/infiniflow/ragflow">
+        <img alt="Ask DeepWiki" src="https://deepwiki.com/badge.svg">
+    </a>
+</p>
+
+<h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
+  <a href="https://ragflow.io/docs/dev/">Document</a> |
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
+</h4>
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/ragflow-octoverse.png" width="1200"/>
+</div>
+
+<div align="center">
+<a href="https://trendshift.io/repositories/9064" target="_blank"><img src="https://trendshift.io/api/badge/repositories/9064" alt="infiniflow%2Fragflow | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+</div>
+
+<details open>
+<summary><b>📕 جدول المحتويات</b></summary>
+
+- 💡 [ما هو RAGFlow؟](#-what-is-ragflow)
+- 🎮 [ابدأ](#-get-started)
+- 📌 [آخر التحديثات](#-latest-updates)
+- 🌟 [الميزات الرئيسية](#-key-features)
+- 🔎 [بنية النظام](#-system-architecture)
+- 🎬 [الاستضافة الذاتية](#-self-hosting)
+- 🔧 [التكوينات](#-configurations)
+- 🔧 [إنشاء صورة Docker](#-build-a-docker-image)
+- 🔨 [إطلاق الخدمة من المصدر للتطوير](#-launch-service-from-source-for-development)
+- 📚 [التوثيق](#-documentation)
+- 📜 [Roadmap](#-roadmap)
+- 🏄 [المجتمع](#-community)
+- 🙌 [مساهمة](#-contributing)
+
+</details>
+
+## 💡 ما هو RAGFlow؟
+
+يُعد مشروع [RAGFlow](https://ragflow.io/) محركًا رائدًا ومفتوح المصدر للاسترجاع المعزز بالتوليد (<bdi dir="ltr">RAG</bdi>)، ويجمع أحدث تقنيات <bdi dir="ltr">RAG</bdi> مع قدرات الوكلاء لبناء طبقة سياق متقدمة لنماذج <bdi dir="ltr">LLMs</bdi>. يوفّر سير عمل <bdi dir="ltr">RAG</bdi> مبسّطًا وقابلًا للتكيّف مع المؤسسات بمختلف أحجامها. وبالاعتماد على [محرك سياق موحّد](https://ragflow.io/basics/what-is-agent-context-engine) وقوالب وكلاء جاهزة، يتيح <bdi dir="ltr">RAGFlow</bdi> للمطورين تحويل البيانات المعقّدة إلى أنظمة <bdi dir="ltr">AI</bdi> عالية الدقة وجاهزة للإنتاج بكفاءة وموثوقية.
+
+## 🎮 ابدأ
+
+جرّب خدمتنا السحابية على [https://cloud.ragflow.io](https://cloud.ragflow.io).
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/agentic-dark.gif" width="1200"/>
+</div>
+
+## 🔥 آخر التحديثات
+
+- 15-06-2026 يدعم قنوات دردشة متعددة مثل Feishu و Discord و Telegram و Line وما إلى ذلك.
+- 24-04-2026 يدعم DeepSeek v4.
+- 24-03-2026 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — توفر مهارة رسمية للوصول إلى مجموعات بيانات RAGFlow عبر OpenClaw.
+- 26-12-2025 يدعم ميزة "Memory" لوكلاء الذكاء الاصطناعي.
+- 19-11-2025 يدعم Gemini 3 Pro.
+- 12-11-2025 يدعم مزامنة البيانات من Confluence، S3، Notion، Discord، Google Drive.
+- 23-10-2025 يدعم MinerU وDocling كطرق لتحليل المستندات.
+- 15-10-2025 يدعم العرض الأوركسترالي pipeline.
+- 08-08-2025 يدعم أحدث موديلات سلسلة OpenAI.
+- 01-08-2025 يدعم سير العمل الوكيل وMCP.
+- 23-05-2025 تمت إضافة مكون منفذ كود Python/JavaScript إلى Agent.
+- 19-03-2025 يدعم استخدام نموذج متعدد الوسائط لفهم الصور داخل ملفات PDF أو DOCX.
+
+## 🎉 تابعونا
+
+⭐️ قم بتمييز مستودعنا بنجمة لتبقى على اطلاع بالميزات والتحسينات الجديدة والمثيرة! احصل على إشعارات فورية بالجديد
+الإصدارات! 🌟
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/user-attachments/assets/18c9707e-b8aa-4caf-a154-037089c105ba" width="1200"/>
+</div>
+
+## 🌟 Key Features
+
+### 🍭 **"Quality in, quality out"**
+
+- [Deep document understanding](./deepdoc/README.md)-based knowledge extraction from unstructured data
+  with complicated formats.
+- Finds the "needle in a data haystack" of literally unlimited tokens.
+
+### 🍱 **Template-based chunking**
+
+- Intelligent and explainable.
+- Plenty of template options to choose from.
+
+### 🌱 **Grounded citations with reduced hallucinations**
+
+- Visualization of text chunking to allow human intervention.
+- Quick view of the key references and traceable citations to support grounded answers.
+
+### 🍔 **Compatibility with heterogeneous data sources**
+
+- Supports Word, slides, Excel, txt, images, scanned copies, structured data, web pages, and more.
+
+### 🛀 **Automated and effortless RAG workflow**
+
+- Streamlined RAG orchestration catering to both personal use and large businesses.
+- Configurable LLMs as well as embedding models.
+- Multiple recall paired with fused re-ranking.
+- Intuitive APIs for seamless integration with business.
+
+## 🔎 System Architecture
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
+</div>
+
+## 🎬 Self-Hosting
+
+### 📝 Prerequisites
+
+- CPU >= 4 cores
+- RAM >= 16 GB
+- Disk >= 50 GB
+- Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
+- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
+
+> [!TIP]
+> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
+
+### 🚀 Start up the server
+
+1. Ensure `vm.max_map_count` >= 262144:
+
+   > To check the value of `vm.max_map_count`:
+   >
+   > ```bash
+   > $ sysctl vm.max_map_count
+   > ```
+   >
+   > Reset `vm.max_map_count` to a value of at least 262144 if it is not.
+   >
+   > ```bash
+   > # In this case, we set it to 262144:
+   > $ sudo sysctl -w vm.max_map_count=262144
+   > ```
+   >
+   > This change will be reset after a system reboot. To make the change permanent, add or update the
+   > `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
+   >
+   > ```bash
+   > vm.max_map_count=262144
+   > ```
+   >
+2. Clone the repo:
+
+   ```bash
+   $ git clone https://github.com/infiniflow/ragflow.git
+   ```
+3. Start up the server using the pre-built Docker images:
+
+> [!CAUTION]
+> All Docker images are built for x86 platforms. We do not currently offer Docker images for ARM64.
+> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
+
+> The command below downloads the `v0.26.2` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than `v0.26.2`, update the `RAGFLOW_IMAGE` variable in **docker/.env** accordingly before using `docker compose` to start the server.
+
+```bash
+   $ cd ragflow/docker
+
+   # git checkout v0.26.2
+   # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
+   # This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
+
+   # Use CPU for DeepDoc tasks:
+   $ docker compose -f docker-compose.yml up -d
+
+   # To use GPU to accelerate DeepDoc tasks:
+   # sed -i '1i DEVICE=gpu' .env
+   # docker compose -f docker-compose.yml up -d
+```
+
+> Note: Prior to `v0.22.0`, we provided both images with embedding models and slim images without embedding models. Details as follows:
+
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |
+
+> Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tag.
+
+4. Check the server status once the server is up and running:
+
+   ```bash
+   $ docker logs -f docker-ragflow-cpu-1
+   ```
+
+   _The following output confirms a successful launch of the system:_
+
+   ```bash
+
+         ____   ___    ______ ______ __
+        / __ \ /   |  / ____// ____// /____  _      __
+       / /_/ // /| | / / __ / /_   / // __ \| | /| / /
+      / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
+     /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/
+
+    * Running on all addresses (0.0.0.0)
+   ```
+
+   > If you skip this confirmation step and log in to RAGFlow directly, your browser may show a `network abnormal`
+   > error because, at that moment, RAGFlow may not be fully initialized.
+   >
+5. In your web browser, enter the IP address of your server and log in to RAGFlow.
+
+   > With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** a port number), because the default
+   > HTTP serving port `80` can be omitted when using the default configurations.
+   >
+6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM factory in `user_default_llm` and update
+   the `API_KEY` field with the corresponding API key.
+
+   > See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
+   >
+
+   _The show is on!_
+
+## 🔧 Configurations
+
+When it comes to system configurations, you will need to manage the following files:
+
+- [.env](./docker/.env): Keeps the fundamental setups for the system, such as `SVR_HTTP_PORT`, `MYSQL_PASSWORD`, and
+  `MINIO_PASSWORD`.
+- [service_conf.yaml.template](./docker/service_conf.yaml.template): Configures the back-end services. The environment variables in this file are populated automatically when the Docker container starts. Any environment variables set within the Docker container are available for use, so you can customize service behavior based on the deployment environment.
+- [docker-compose.yml](./docker/docker-compose.yml): The system relies on [docker-compose.yml](./docker/docker-compose.yml) to start up.
+
+> The [./docker/README](./docker/README.md) file provides a detailed description of the environment settings and service
+> configurations that can be used as `${ENV_VARS}` in the [service_conf.yaml.template](./docker/service_conf.yaml.template) file.
+
+To update the default HTTP serving port (80), go to [docker-compose.yml](./docker/docker-compose.yml) and change `80:80`
+to `<YOUR_SERVING_PORT>:80`.
+
+Updates to the above configurations require a restart of all containers to take effect:
+
+> ```bash
+> $ docker compose -f docker-compose.yml up -d
+> ```
+
+### Switch doc engine from Elasticsearch to Infinity
+
+RAGFlow uses Elasticsearch by default for storing full text and vectors. To switch to [Infinity](https://github.com/infiniflow/infinity/), follow these steps:
+
+1. Stop all running containers:
+
+   ```bash
+   $ docker compose -f docker/docker-compose.yml down -v
+   ```
+
+> [!WARNING]
+> `-v` will delete the Docker container volumes, and the existing data will be cleared.
+
+2. Set `DOC_ENGINE` in **docker/.env** to `infinity`.
+3. Start the containers:
+
+   ```bash
+   $ docker compose -f docker-compose.yml up -d
+   ```
+
+> [!WARNING]
+> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
+
+## 🔧 Build a Docker image
+
+This image is approximately 2 GB in size and relies on external LLM and embedding services.
+
+```bash
+git clone https://github.com/infiniflow/ragflow.git
+cd ragflow/
+docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
+Or, if you are behind a proxy, you can pass proxy arguments:
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
+## 🔨 Launch service from source for development
+
+1. Install `uv`, or skip this step if it is already installed:
+
+   ```bash
+   pipx install uv
+   ```
+2. Clone the source code and install Python dependencies:
+
+   ```bash
+   git clone https://github.com/infiniflow/ragflow.git
+   cd ragflow/
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
+   ```
+3. Launch the dependent services (MinIO, Elasticsearch, Redis, and MySQL) using Docker Compose:
+
+   ```bash
+   docker compose -f docker/docker-compose-base.yml up -d
+   ```
+
+   Add the following line to `/etc/hosts` to resolve all hosts specified in **docker/.env** to `127.0.0.1`:
+
+   ```
+   127.0.0.1       es01 infinity mysql minio redis sandbox-executor-manager
+   ```
+4. If you cannot access HuggingFace, set the `HF_ENDPOINT` environment variable to use a mirror site:
+
+   ```bash
+   export HF_ENDPOINT=https://hf-mirror.com
+   ```
+5. If your operating system does not have jemalloc, install it as follows:
+
+   ```bash
+   # Ubuntu
+   sudo apt-get install libjemalloc-dev
+   # CentOS
+   sudo yum install jemalloc
+   # OpenSUSE
+   sudo zypper install jemalloc
+   # macOS (Homebrew refuses to run under sudo)
+   brew install jemalloc
+   ```
+6. Launch the backend service:
+
+   ```bash
+   source .venv/bin/activate
+   export PYTHONPATH=$(pwd)
+   bash docker/launch_backend_service.sh
+   ```
+7. Install frontend dependencies:
+
+   ```bash
+   cd web
+   npm install
+   ```
+8. Launch the frontend service:
+
+   ```bash
+   npm run dev
+   ```
+
+   _The following output confirms a successful launch of the system:_
+
+   ![](https://github.com/user-attachments/assets/0daf462c-a24d-4496-a66f-92533534e187)
+9. Stop the RAGFlow frontend and backend services once development is complete:
+
+   ```bash
+   pkill -f "ragflow_server.py|task_executor.py"
+   ```
+
+## 📚 Documentation
+
+- [Quickstart](https://ragflow.io/docs/dev/)
+- [Configuration](https://ragflow.io/docs/dev/configurations)
+- [Release notes](https://ragflow.io/docs/dev/release_notes)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
+- [References](https://ragflow.io/docs/dev/category/references)
+- [FAQs](https://ragflow.io/docs/dev/faq)
+
+## 📜 Roadmap
+
+See the [RAGFlow Roadmap 2026](https://github.com/infiniflow/ragflow/issues/12241)
+
+## 🏄 Community
+
+- [Discord](https://discord.gg/NjYzJD3GM3)
+- [X](https://x.com/infiniflowai)
+- [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)
+
+## 🙌 Contributing
+
+RAGFlow flourishes through open-source collaboration. In this spirit, we embrace diverse contributions from the community.
+If you would like to take part, review our [Contribution Guidelines](https://ragflow.io/docs/dev/contributing) first.
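Review note on README_ar.md: the `vm.max_map_count` check in the server start-up step can be scripted. The block below is a minimal sketch, assuming a POSIX-style shell; `map_count_ok` and `check_host` are hypothetical helper names that are not part of RAGFlow, and only `sysctl` and the 262144 threshold come from the README.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of RAGFlow.

# map_count_ok CURRENT [MINIMUM]
# Succeeds when CURRENT is at least MINIMUM (default 262144, the README's threshold).
map_count_ok() {
  local current=$1 minimum=${2:-262144}
  [ "$current" -ge "$minimum" ]
}

# check_host
# Reads the live kernel value and prints either "ok" or the command the README suggests.
check_host() {
  local current
  current=$(sysctl -n vm.max_map_count 2>/dev/null) || {
    echo "cannot read vm.max_map_count on this system" >&2
    return 2
  }
  if map_count_ok "$current"; then
    echo "ok: vm.max_map_count=$current"
  else
    echo "too low ($current); run: sudo sysctl -w vm.max_map_count=262144"
    return 1
  fi
}
```

Sourcing the file defines both functions without touching the system; calling `check_host` performs the actual read. Keeping the comparison in its own function makes the threshold logic easy to exercise without a Linux host.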
--- a/README_fr.md
+++ b/README_fr.md
@@ -0,0 +1,406 @@
+<div align="center">
+<a href="https://cloud.ragflow.io/">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
+</a>
+</div>
+
+<p align="center">
+  <a href="./README.md"><img alt="README in English" src="https://img.shields.io/badge/English-DFE0E5"></a>
+  <a href="./README_zh.md"><img alt="简体中文版自述文件" src="https://img.shields.io/badge/简体中文-DFE0E5"></a>
+  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
+  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
+  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DBEDFA"></a>
+  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
+  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
+</p>
+
+<p align="center">
+    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
+        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
+    </a>
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
+    </a>
+    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
+    </a>
+    <a href="https://github.com/infiniflow/ragflow/releases/latest">
+        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Dernière%20version" alt="Latest Release">
+    </a>
+    <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
+        <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="license">
+    </a>
+    <a href="https://deepwiki.com/infiniflow/ragflow">
+        <img alt="Ask DeepWiki" src="https://deepwiki.com/badge.svg">
+    </a>
+</p>
+
+<h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
+  <a href="https://ragflow.io/docs/dev/">Documentation</a> |
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
+</h4>
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/ragflow-octoverse.png" width="1200"/>
+</div>
+
+<div align="center">
+<a href="https://trendshift.io/repositories/9064" target="_blank"><img src="https://trendshift.io/api/badge/repositories/9064" alt="infiniflow%2Fragflow | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+</div>
+
+<details open>
+<summary><b>📕 Table of Contents</b></summary>
+
+- 💡 [What is RAGFlow?](#-what-is-ragflow)
+- 🎮 [Get Started](#-get-started)
+- 📌 [Latest Updates](#-latest-updates)
+- 🌟 [Key Features](#-key-features)
+- 🔎 [System Architecture](#-system-architecture)
+- 🎬 [Self-Hosting](#-self-hosting)
+- 🔧 [Configurations](#-configurations)
+- 🔧 [Build a Docker image](#-build-a-docker-image)
+- 🔨 [Launch service from source for development](#-launch-service-from-source-for-development)
+- 📚 [Documentation](#-documentation)
+- 📜 [Roadmap](#-roadmap)
+- 🏄 [Community](#-community)
+- 🙌 [Contributing](#-contributing)
+
+</details>
+
+## 💡 What is RAGFlow?
+
+[RAGFlow](https://ragflow.io/) is a leading open-source [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
+
+## 🎮 Get Started
+
+Try our cloud service at [https://cloud.ragflow.io](https://cloud.ragflow.io).
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/agentic-dark.gif" width="1200"/>
+</div>
+
+## 🔥 Latest Updates
+
+- 15-06-2026 Supports multiple chat channels such as Feishu, Discord, Telegram, Line, and more.
+- 24-04-2026 Supports DeepSeek v4.
+- 24-03-2026 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill): provides an official skill for accessing RAGFlow datasets via OpenClaw.
+- 26-12-2025 Supports "Memory" for AI agents.
+- 19-11-2025 Supports Gemini 3 Pro.
+- 12-11-2025 Supports data synchronization from Confluence, S3, Notion, Discord, and Google Drive.
+- 23-10-2025 Supports MinerU & Docling as document parsing methods.
+- 15-10-2025 Supports an orchestrable ingestion pipeline.
+- 08-08-2025 Supports OpenAI's latest GPT-5 series models.
+- 01-08-2025 Supports agentic workflow and MCP.
+- 23-05-2025 Adds a Python/JavaScript code executor component to Agent.
+- 19-03-2025 Supports using a multi-modal model to make sense of images within PDF or DOCX files.
+
+## 🎉 Stay Tuned
+
+⭐️ Star our repository to stay up to date with exciting new features and improvements! Get instant notifications for new releases! 🌟
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/user-attachments/assets/18c9707e-b8aa-4caf-a154-037089c105ba" width="1200"/>
+</div>
+
+## 🌟 Key Features
+
+### 🍭 **"Quality in, quality out"**
+
+- [Deep document understanding](./deepdoc/README.md)-based knowledge extraction from unstructured data with complicated formats.
+- Finds the "needle in a data haystack" of literally unlimited tokens.
+
+### 🍱 **Template-based chunking**
+
+- Intelligent and explainable.
+- Plenty of template options to choose from.
+
+### 🌱 **Grounded citations with reduced hallucinations**
+
+- Visualization of text chunking to allow human intervention.
+- Quick view of the key references and traceable citations to support grounded answers.
+
+### 🍔 **Compatibility with heterogeneous data sources**
+
+- Supports Word, slides, Excel, txt, images, scanned copies, structured data, web pages, and more.
+
+### 🛀 **Automated and effortless RAG workflow**
+
+- Streamlined RAG orchestration catering to both personal use and large businesses.
+- Configurable LLMs as well as embedding models.
+- Multiple recall paired with fused re-ranking.
+- Intuitive APIs for seamless integration with business.
+
+## 🔎 System Architecture
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
+</div>
+
+## 🎬 Self-Hosting
+
+### 📝 Prerequisites
+
+- CPU >= 4 cores
+- RAM >= 16 GB
+- Disk >= 50 GB
+- Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
+- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
+
+> [!TIP]
+> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
+
+### 🚀 Start up the server
+
+1. Ensure `vm.max_map_count` >= 262144:
+
+   > To check the value of `vm.max_map_count`:
+   >
+   > ```bash
+   > $ sysctl vm.max_map_count
+   > ```
+   >
+   > Reset `vm.max_map_count` to a value of at least 262144 if it is not.
+   >
+   > ```bash
+   > # In this case, we set it to 262144:
+   > $ sudo sysctl -w vm.max_map_count=262144
+   > ```
+   >
+   > This change will be reset after a system reboot. To make the change permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf**:
+   >
+   > ```bash
+   > vm.max_map_count=262144
+   > ```
+   >
+2. Clone the repo:
+
+   ```bash
+   $ git clone https://github.com/infiniflow/ragflow.git
+   ```
+3. Start up the server using the pre-built Docker images:
+
+> [!CAUTION]
+> All Docker images are built for x86 platforms. We do not currently offer Docker images for ARM64.
+> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
+
+> The command below downloads the `v0.26.2` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition other than `v0.26.2`, update the `RAGFLOW_IMAGE` variable in **docker/.env** before using `docker compose` to start the server.
+
+```bash
+   $ cd ragflow/docker
+
+   # git checkout v0.26.2
+   # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
+   # This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
+
+   # Use CPU for DeepDoc tasks:
+   $ docker compose -f docker-compose.yml up -d
+
+   # To use GPU to accelerate DeepDoc tasks:
+   # sed -i '1i DEVICE=gpu' .env
+   # docker compose -f docker-compose.yml up -d
+```
+
+> Note: Prior to `v0.22.0`, we provided both images with embedding models and slim images without embedding models. Details as follows:
+
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |
+
+> Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tag.
+
+4. Check the server status once the server is up and running:
+
+   ```bash
+   $ docker logs -f docker-ragflow-cpu-1
+   ```
+
+   _The following output confirms a successful launch of the system:_
+
+   ```bash
+
+         ____   ___    ______ ______ __
+        / __ \ /   |  / ____// ____// /____  _      __
+       / /_/ // /| | / / __ / /_   / // __ \| | /| / /
+      / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
+     /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/
+
+    * Running on all addresses (0.0.0.0)
+   ```
+
+   > If you skip this confirmation step and log in to RAGFlow directly, your browser may show a `network abnormal` error because, at that moment, RAGFlow may not be fully initialized.
+   >
+5. In your web browser, enter the IP address of your server and log in to RAGFlow.
+
+   > With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** a port number), because the default HTTP serving port `80` can be omitted when using the default configurations.
+   >
+6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key.
+
+   > See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
+   >
+
+   _The show is on!_
+
+## 🔧 Configurations
+
+When it comes to system configurations, you will need to manage the following files:
+
+- [.env](./docker/.env): Keeps the fundamental setups for the system, such as `SVR_HTTP_PORT`, `MYSQL_PASSWORD`, and `MINIO_PASSWORD`.
+- [service_conf.yaml.template](./docker/service_conf.yaml.template): Configures the back-end services. The environment variables in this file are populated automatically when the Docker container starts. Any environment variables set within the Docker container are available for use, so you can customize service behavior based on the deployment environment.
+- [docker-compose.yml](./docker/docker-compose.yml): The system relies on [docker-compose.yml](./docker/docker-compose.yml) to start up.
+
+> The [./docker/README](./docker/README.md) file provides a detailed description of the environment settings and service configurations that can be used as `${ENV_VARS}` in the [service_conf.yaml.template](./docker/service_conf.yaml.template) file.
+
+To update the default HTTP serving port (80), go to [docker-compose.yml](./docker/docker-compose.yml) and change `80:80` to `<YOUR_SERVING_PORT>:80`.
+
+Updates to the above configurations require a restart of all containers to take effect:
+
+> ```bash
+> $ docker compose -f docker-compose.yml up -d
+> ```
+
+### Switch doc engine from Elasticsearch to Infinity
+
+RAGFlow uses Elasticsearch by default for storing full text and vectors. To switch to [Infinity](https://github.com/infiniflow/infinity/), follow these steps:
+
+1. Stop all running containers:
+
+   ```bash
+   $ docker compose -f docker/docker-compose.yml down -v
+   ```
+
+> [!WARNING]
+> `-v` will delete the Docker container volumes, and the existing data will be cleared.
+
+2. Set `DOC_ENGINE` in **docker/.env** to `infinity`.
+3. Start the containers:
+
+   ```bash
+   $ docker compose -f docker-compose.yml up -d
+   ```
+
+> [!WARNING]
+> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
+
+## 🔧 Build a Docker image
+
+This image is approximately 2 GB in size and relies on external LLM and embedding services.
+
+```bash
+git clone https://github.com/infiniflow/ragflow.git
+cd ragflow/
+docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
+Or, if you are behind a proxy, you can pass proxy arguments:
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
+## 🔨 Launch service from source for development
+
+1. Install `uv`, or skip this step if it is already installed:
+
+   ```bash
+   pipx install uv
+   ```
+2. Clone the source code and install Python dependencies:
+
+   ```bash
+   git clone https://github.com/infiniflow/ragflow.git
+   cd ragflow/
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
+   ```
+3. Launch the dependent services (MinIO, Elasticsearch, Redis, and MySQL) using Docker Compose:
+
+   ```bash
+   docker compose -f docker/docker-compose-base.yml up -d
+   ```
+
+   Add the following line to `/etc/hosts` to resolve all hosts specified in **docker/.env** to `127.0.0.1`:
+
+   ```
+   127.0.0.1       es01 infinity mysql minio redis sandbox-executor-manager
+   ```
+4. If you cannot access HuggingFace, set the `HF_ENDPOINT` environment variable to use a mirror site:
+
+   ```bash
+   export HF_ENDPOINT=https://hf-mirror.com
+   ```
+5. If your operating system does not have jemalloc, install it as follows:
+
+   ```bash
+   # Ubuntu
+   sudo apt-get install libjemalloc-dev
+   # CentOS
+   sudo yum install jemalloc
+   # OpenSUSE
+   sudo zypper install jemalloc
+   # macOS (Homebrew refuses to run under sudo)
+   brew install jemalloc
+   ```
+6. Launch the backend service:
+
+   ```bash
+   source .venv/bin/activate
+   export PYTHONPATH=$(pwd)
+   bash docker/launch_backend_service.sh
+   ```
+7. Install frontend dependencies:
+
+   ```bash
+   cd web
+   npm install
+   ```
+8. Launch the frontend service:
+
+   ```bash
+   npm run dev
+   ```
+
+   _The following output confirms a successful launch of the system:_
+
+   ![](https://github.com/user-attachments/assets/0daf462c-a24d-4496-a66f-92533534e187)
+9. Stop the RAGFlow frontend and backend services once development is complete:
+
+   ```bash
+   pkill -f "ragflow_server.py|task_executor.py"
+   ```
+
+## 📚 Documentation
+
+- [Quickstart](https://ragflow.io/docs/dev/)
+- [Configuration](https://ragflow.io/docs/dev/configurations)
+- [Release notes](https://ragflow.io/docs/dev/release_notes)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
+- [References](https://ragflow.io/docs/dev/category/references)
+- [FAQs](https://ragflow.io/docs/dev/faq)
+
+## 📜 Roadmap
+
+See the [RAGFlow Roadmap 2026](https://github.com/infiniflow/ragflow/issues/12241)
+
+## 🏄 Community
+
+- [Discord](https://discord.gg/NjYzJD3GM3)
+- [X](https://x.com/infiniflowai)
+- [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)
+
+## 🙌 Contributing
+
+RAGFlow flourishes through open-source collaboration. In this spirit, we embrace diverse contributions from the community.
+If you would like to take part, review our [Contribution Guidelines](https://ragflow.io/docs/dev/contributing) first.
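Review note on README_fr.md: the doc-engine switch asks the reader to set `DOC_ENGINE` in **docker/.env** by hand. The block below is a minimal sketch of how that edit could be scripted, assuming a plain `KEY=VALUE` file; `set_env_var` is a hypothetical helper that is not part of RAGFlow, and it moves the key to the end of the file rather than editing it in position.

```shell
#!/usr/bin/env bash
# set_env_var FILE KEY VALUE  (hypothetical helper, not part of RAGFlow)
# Replaces any active KEY=... line in FILE with KEY=VALUE, or appends it if absent.
set_env_var() {
  local file=$1 key=$2 value=$3 tmp
  [ -f "$file" ] || : > "$file"
  tmp=$(mktemp) || return 1
  # Keep every line except active assignments to KEY; commented-out lines survive.
  grep -v "^${key}=" "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through the original path so the file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

With the function sourced, the README's step 2 would correspond to `set_env_var docker/.env DOC_ENGINE infinity`, run from the repository root. The rewrite goes through a temporary file instead of `sed -i` because the in-place flag behaves differently between GNU and BSD sed.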
--- a/README_id.md
+++ b/README_id.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="520" alt="Logo ragflow">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體中文版自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DBEDFA"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="Follow on X (Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Online badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Online badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Rilis%20Terbaru" alt="Latest Release">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Dokumentasi</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Peta Jalan</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Peta Jalan</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -55,11 +57,11 @@
 <summary><b>📕 Daftar Isi </b> </summary>

 - 💡 [Apa Itu RAGFlow?](#-apa-itu-ragflow)
- 🎮 [Demo](#-demo)
+- 🎮 [Mulai](#-mulai)
 - 📌 [Pembaruan Terbaru](#-pembaruan-terbaru)
 - 🌟 [Fitur Utama](#-fitur-utama)
 - 🔎 [Arsitektur Sistem](#-arsitektur-sistem)
- 🎬 [Mulai](#-mulai)
+- 🎬 [Pengelolaan Mandiri](#-pengelolaan-mandiri)
 - 🔧 [Konfigurasi](#-konfigurasi)
 - 🔧 [Membangun Image Docker](#-membangun-docker-image)
 - 🔨 [Meluncurkan aplikasi dari Sumber untuk Pengembangan](#-meluncurkan-aplikasi-dari-sumber-untuk-pengembangan)
@@ -72,11 +74,11 @@

 ## 💡 Apa Itu RAGFlow?

-[RAGFlow](https://ragflow.io/) adalah mesin RAG (Retrieval-Augmented Generation) open-source terkemuka yang mengintegrasikan teknologi RAG mutakhir dengan kemampuan Agent untuk menciptakan lapisan kontekstual superior bagi LLM. Menyediakan alur kerja RAG yang efisien dan dapat diadaptasi untuk perusahaan segala skala. Didukung oleh mesin konteks terkonvergensi dan template Agent yang telah dipra-bangun, RAGFlow memungkinkan pengembang mengubah data kompleks menjadi sistem AI kesetiaan-tinggi dan siap-produksi dengan efisiensi dan presisi yang luar biasa.
+[RAGFlow](https://ragflow.io/) adalah mesin [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) open-source terkemuka yang mengintegrasikan teknologi RAG mutakhir dengan kemampuan Agent untuk menciptakan lapisan kontekstual superior bagi LLM. Menyediakan alur kerja RAG yang efisien dan dapat diadaptasi untuk perusahaan segala skala. Didukung oleh mesin konteks terkonvergensi dan template Agent yang telah dipra-bangun, RAGFlow memungkinkan pengembang mengubah data kompleks menjadi sistem AI kesetiaan-tinggi dan siap-produksi dengan efisiensi dan presisi yang luar biasa.

-## 🎮 Demo
+## 🎮 Mulai

-Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
+Coba layanan cloud kami di [https://cloud.ragflow.io](https://cloud.ragflow.io).

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -85,16 +87,18 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).

 ## 🔥 Pembaruan Terbaru

- 2025-11-12 Mendukung sinkronisasi data dari Confluence, AWS S3, Discord, Google Drive.
+- 2026-06-15 Mendukung berbagai saluran obrolan seperti Feishu, Discord, Telegram, Line, dll.
+- 2026-04-24 Mendukung DeepSeek v4.
+- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — Menyediakan skill resmi untuk mengakses dataset RAGFlow melalui OpenClaw.
+- 2025-12-26 Mendukung 'Memori' untuk agen AI.
+- 2025-11-19 Mendukung Gemini 3 Pro.
+- 2025-11-12 Mendukung sinkronisasi data dari Confluence, S3, Notion, Discord, Google Drive.
 - 2025-10-23 Mendukung MinerU & Docling sebagai metode penguraian dokumen.
 - 2025-10-15 Dukungan untuk jalur data yang terorkestrasi.
 - 2025-08-08 Mendukung model seri GPT-5 terbaru dari OpenAI.
 - 2025-08-01 Mendukung alur kerja agen dan MCP.
 - 2025-05-23 Menambahkan komponen pelaksana kode Python/JS ke Agen.
- 2025-05-05 Mendukung kueri lintas bahasa.
 - 2025-03-19 Mendukung penggunaan model multi-modal untuk memahami gambar di dalam file PDF atau DOCX.
- 2024-12-18 Meningkatkan model Analisis Tata Letak Dokumen di DeepDoc.
- 2024-08-22 Dukungan untuk teks ke pernyataan SQL melalui RAG.

 ## 🎉 Tetap Terkini

@@ -138,7 +142,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 Mulai
+## 🎬 Pengelolaan Mandiri

 ### 📝 Prasyarat

@@ -146,6 +150,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 - RAM >= 16 GB
 - Disk >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): Hanya diperlukan jika Anda ingin menggunakan fitur eksekutor kode (sandbox) dari RAGFlow.

 > [!TIP]
@@ -186,12 +191,14 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 > Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
 > Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).

-> Perintah di bawah ini mengunduh edisi v0.22.0 dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.22.0, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server.
+> Perintah di bawah ini mengunduh edisi v0.26.2 dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.26.2, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server.

 ```bash
   $ cd ragflow/docker
-   
-   # Opsional: gunakan tag stabil (lihat releases: https://github.com/infiniflow/ragflow/releases), contoh: git checkout v0.22.0
+
+   # git checkout v0.26.2
+   # Opsional: gunakan tag stabil (lihat releases: https://github.com/infiniflow/ragflow/releases)
+   # This step ensures the **entrypoint.sh** file in the code matches the Docker image version.

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -203,10 +210,10 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).

 > Catatan: Sebelum `v0.22.0`, kami menyediakan image dengan model embedding dan image slim tanpa model embedding. Detailnya sebagai berikut:

-| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-| ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-| v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

 > Mulai dari `v0.22.0`, kami hanya menyediakan edisi slim dan tidak lagi menambahkan akhiran **-slim** pada tag image.

@@ -229,7 +236,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
    * Running on all addresses (0.0.0.0)
   ```

-   > Jika Anda melewatkan langkah ini dan langsung login ke RAGFlow, browser Anda mungkin menampilkan error `network anormal`
+   > Jika Anda melewatkan langkah ini dan langsung login ke RAGFlow, browser Anda mungkin menampilkan error `network abnormal`
   > karena RAGFlow mungkin belum sepenuhnya siap.
   >
 2. Buka browser web Anda, masukkan alamat IP server Anda, dan login ke RAGFlow.
@@ -273,21 +280,30 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+Jika berada di belakang proxy, Anda dapat melewatkan argumen proxy:
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 Menjalankan Aplikasi dari untuk Pengembangan

-1. Instal `uv` dan `pre-commit`, atau lewati langkah ini jika sudah terinstal:
+1. Instal `uv`, atau lewati langkah ini jika sudah terinstal:

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   ```
 2. Clone kode sumber dan instal dependensi Python:

   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # install RAGFlow dependent python modules
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```
 3. Jalankan aplikasi yang diperlukan (MinIO, Elasticsearch, Redis, dan MySQL) menggunakan Docker Compose:

@@ -348,19 +364,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 Roadmap

-Lihat [Roadmap RAGFlow 2025](https://github.com/infiniflow/ragflow/issues/4214)
+Lihat [Roadmap RAGFlow 2026](https://github.com/infiniflow/ragflow/issues/12241)

 ## 🏄 Komunitas

 - [Discord](https://discord.gg/NjYzJD3GM3)
- [Twitter](https://twitter.com/infiniflowai)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 Kontribusi
--- a/README_ja.md
+++ b/README_ja.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="350" alt="ragflow logo">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體中文版自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DBEDFA"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Document</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Roadmap</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -53,11 +55,11 @@

 ## 💡 RAGFlow とは？

-[RAGFlow](https://ragflow.io/) は、先進的なRAG（Retrieval-Augmented Generation）技術と Agent 機能を融合し、大規模言語モデル（LLM）に優れたコンテキスト層を構築する最先端のオープンソース RAG エンジンです。あらゆる規模の企業に対応可能な合理化された RAG ワークフローを提供し、統合型コンテキストエンジンと事前構築されたAgentテンプレートにより、開発者が複雑なデータを驚異的な効率性と精度で高精細なプロダクションレディAIシステムへ変換することを可能にします。
+[RAGFlow](https://ragflow.io/) は、先進的な[RAG](https://ragflow.io/basics/what-is-rag)（Retrieval-Augmented Generation）技術と Agent 機能を融合し、大規模言語モデル（LLM）に優れたコンテキスト層を構築する最先端のオープンソース RAG エンジンです。あらゆる規模の企業に対応可能な合理化された RAG ワークフローを提供し、統合型[コンテキストエンジン](https://ragflow.io/basics/what-is-agent-context-engine)と事前構築されたAgentテンプレートにより、開発者が複雑なデータを驚異的な効率性と精度で高精細なプロダクションレディAIシステムへ変換することを可能にします。

-## 🎮 Demo
+## 🎮 はじめに

-デモをお試しください：[https://demo.ragflow.io](https://demo.ragflow.io)。
+当社のクラウドサービスをぜひお試しください：[https://cloud.ragflow.io](https://cloud.ragflow.io)。

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -66,16 +68,19 @@

 ## 🔥 最新情報

- 2025-11-12 Confluence、AWS S3、Discord、Google Drive からのデータ同期をサポートします。
+- 2026-06-15 Feishu、Discord、Telegram、Lineなどの複数のチャットチャンネルをサポートします。
+- 2026-04-24 DeepSeek v4 をサポート。
+- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — OpenClaw経由でRAGFlowデータセットにアクセスする公式スキルを提供。
+- 2025-12-26 AIエージェントの「メモリ」機能をサポート。
+- 2025-11-19 Gemini 3 Proをサポートしています。
+- 2025-11-12 Confluence、S3、Notion、Discord、Google Drive からのデータ同期をサポートします。
 - 2025-10-23 ドキュメント解析方法として MinerU と Docling をサポートします。
 - 2025-10-15 オーケストレーションされたデータパイプラインのサポート。
 - 2025-08-08 OpenAI の最新 GPT-5 シリーズモデルをサポートします。
 - 2025-08-01 エージェントワークフローとMCPをサポート。
 - 2025-05-23 エージェントに Python/JS コードエグゼキュータコンポーネントを追加しました。
- 2025-05-05 言語間クエリをサポートしました。
 - 2025-03-19 PDFまたはDOCXファイル内の画像を理解するために、多モーダルモデルを使用することをサポートします。
- 2024-12-18 DeepDoc のドキュメント レイアウト分析モデルをアップグレードします。
- 2024-08-22 RAG を介して SQL ステートメントへのテキストをサポートします。
+

 ## 🎉 続きを楽しみに

@@ -119,7 +124,7 @@
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 初期設定
+## 🎬 セルフホスティング

 ### 📝 必要条件

@@ -127,6 +132,7 @@
 - RAM >= 16 GB
 - Disk >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): RAGFlowのコード実行（サンドボックス）機能を利用する場合のみ必要です。

 > [!TIP]
@@ -166,12 +172,14 @@
 > 現在、公式に提供されているすべての Docker イメージは x86 アーキテクチャ向けにビルドされており、ARM64 用の Docker イメージは提供されていません。
 > ARM64 アーキテクチャのオペレーティングシステムを使用している場合は、[このドキュメント](https://ragflow.io/docs/dev/build_docker_image)を参照して Docker イメージを自分でビルドしてください。

-> 以下のコマンドは、RAGFlow Docker イメージの v0.22.0 エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.22.0 とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。
+> 以下のコマンドは、RAGFlow Docker イメージの v0.26.2 エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.26.2 とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。

 ```bash
   $ cd ragflow/docker

-   # 任意: 安定版タグを利用 (一覧: https://github.com/infiniflow/ragflow/releases) 例: git checkout v0.22.0
+   # git checkout v0.26.2
+   # 任意: 安定版タグを利用 (一覧: https://github.com/infiniflow/ragflow/releases)
+   # この手順は、コード内の entrypoint.sh ファイルが Docker イメージのバージョンと一致していることを確認します。

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -183,15 +191,15 @@

 > 注意：`v0.22.0` より前のバージョンでは、embedding モデルを含むイメージと、embedding モデルを含まない slim イメージの両方を提供していました。詳細は以下の通りです：

-| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-| ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-| v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

 > `v0.22.0` 以降、当プロジェクトでは slim エディションのみを提供し、イメージタグに **-slim** サフィックスを付けなくなりました。

-   1. サーバーを立ち上げた後、サーバーの状態を確認する:   
-   
+   1. サーバーを立ち上げた後、サーバーの状態を確認する:
+
   ```bash
   $ docker logs -f docker-ragflow-cpu-1
   ```
@@ -273,21 +281,30 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+プロキシ環境下にいる場合は、プロキシ引数を指定できます：
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 ソースコードからサービスを起動する方法

-1. `uv` と `pre-commit` をインストールする。すでにインストールされている場合は、このステップをスキップしてください:
+1. `uv` をインストールする。すでにインストールされている場合は、このステップをスキップしてください:

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   ```
 2. ソースコードをクローンし、Python の依存関係をインストールする:

   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # install RAGFlow dependent python modules
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```
 3. Docker Compose を使用して依存サービス（MinIO、Elasticsearch、Redis、MySQL）を起動する:

@@ -348,19 +365,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 ロードマップ

-[RAGFlow ロードマップ 2025](https://github.com/infiniflow/ragflow/issues/4214) を参照
+[RAGFlow ロードマップ 2026](https://github.com/infiniflow/ragflow/issues/12241) を参照

 ## 🏄 コミュニティ

 - [Discord](https://discord.gg/NjYzJD3GM3)
- [Twitter](https://twitter.com/infiniflowai)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 コントリビュート
--- a/README_ko.md
+++ b/README_ko.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DBEDFA"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Document</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Roadmap</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -54,11 +56,11 @@

 ## 💡 RAGFlow란?

-[RAGFlow](https://ragflow.io/) 는 최첨단 RAG(Retrieval-Augmented Generation)와 Agent 기능을 융합하여 대규모 언어 모델(LLM)을 위한 우수한 컨텍스트 계층을 생성하는 선도적인 오픈소스 RAG 엔진입니다. 모든 규모의 기업에 적용 가능한 효율적인 RAG 워크플로를 제공하며, 통합 컨텍스트 엔진과 사전 구축된 Agent 템플릿을 통해 개발자들이 복잡한 데이터를 예외적인 효율성과 정밀도로 고급 구현도의 프로덕션 준비 완료 AI 시스템으로 변환할 수 있도록 지원합니다.
+[RAGFlow](https://ragflow.io/) 는 최첨단 [RAG](https://ragflow.io/basics/what-is-rag)(Retrieval-Augmented Generation)와 Agent 기능을 융합하여 대규모 언어 모델(LLM)을 위한 우수한 컨텍스트 계층을 생성하는 선도적인 오픈소스 RAG 엔진입니다. 모든 규모의 기업에 적용 가능한 효율적인 RAG 워크플로를 제공하며, 통합 [컨텍스트 엔진](https://ragflow.io/basics/what-is-agent-context-engine)과 사전 구축된 Agent 템플릿을 통해 개발자들이 복잡한 데이터를 예외적인 효율성과 정밀도로 고급 구현도의 프로덕션 준비 완료 AI 시스템으로 변환할 수 있도록 지원합니다.

-## 🎮 데모
+## 🎮 시작하기

-데모를 [https://demo.ragflow.io](https://demo.ragflow.io)에서 실행해 보세요.
+[https://cloud.ragflow.io](https://cloud.ragflow.io)에서 저희 클라우드 서비스를 이용해 보세요.

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -67,16 +69,19 @@

 ## 🔥 업데이트

- 2025-11-12 Confluence, AWS S3, Discord, Google Drive에서 데이터 동기화를 지원합니다.
+- 2026-06-15 Feishu, Discord, Telegram, Line 등 다양한 채팅 채널을 지원합니다.
+- 2026-04-24 DeepSeek v4를 지원합니다.
+- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — OpenClaw를 통해 RAGFlow 데이터셋에 접근하는 공식 스킬 제공.
+- 2025-12-26 AI 에이전트의 '메모리' 기능 지원.
+- 2025-11-19 Gemini 3 Pro를 지원합니다.
+- 2025-11-12 Confluence, S3, Notion, Discord, Google Drive에서 데이터 동기화를 지원합니다.
 - 2025-10-23 문서 파싱 방법으로 MinerU 및 Docling을 지원합니다.
 - 2025-10-15 조정된 데이터 파이프라인 지원.
 - 2025-08-08 OpenAI의 최신 GPT-5 시리즈 모델을 지원합니다.
 - 2025-08-01 에이전트 워크플로우와 MCP를 지원합니다.
 - 2025-05-23 Agent에 Python/JS 코드 실행기 구성 요소를 추가합니다.
- 2025-05-05 언어 간 쿼리를 지원합니다.
 - 2025-03-19 PDF 또는 DOCX 파일 내의 이미지를 이해하기 위해 다중 모드 모델을 사용하는 것을 지원합니다.
- 2024-12-18 DeepDoc의 문서 레이아웃 분석 모델 업그레이드.
- 2024-08-22 RAG를 통해 SQL 문에 텍스트를 지원합니다.
+

 ## 🎉 계속 지켜봐 주세요

@@ -120,7 +125,7 @@
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 시작하기
+## 🎬 자체 호스팅

 ### 📝 사전 준비 사항

@@ -128,6 +133,7 @@
 - RAM >= 16 GB
 - Disk >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): RAGFlow의 코드 실행기(샌드박스) 기능을 사용하려는 경우에만 필요합니다.

 > [!TIP]
@@ -168,12 +174,14 @@
 > 모든 Docker 이미지는 x86 플랫폼을 위해 빌드되었습니다. 우리는 현재 ARM64 플랫폼을 위한 Docker 이미지를 제공하지 않습니다.
 > ARM64 플랫폼을 사용 중이라면, [시스템과 호환되는 Docker 이미지를 빌드하려면 이 가이드를 사용해 주세요](https://ragflow.io/docs/dev/build_docker_image).

-   > 아래 명령어는 RAGFlow Docker 이미지의 v0.22.0 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.22.0과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오.
+   > 아래 명령어는 RAGFlow Docker 이미지의 v0.26.2 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.26.2와 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오.

   ```bash
   $ cd ragflow/docker
-   
-   # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases), e.g.: git checkout v0.22.0
+
+   # git checkout v0.26.2
+   # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
+   # 이 단계는 코드의 entrypoint.sh 파일이 Docker 이미지 버전과 일치하도록 보장합니다.

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -185,10 +193,10 @@

 > 참고: `v0.22.0` 이전 버전에서는 embedding 모델이 포함된 이미지와 embedding 모델이 포함되지 않은 slim 이미지를 모두 제공했습니다. 자세한 내용은 다음과 같습니다:

-| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-| ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-| v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

 > `v0.22.0`부터는 slim 에디션만 배포하며 이미지 태그에 **-slim** 접미사를 더 이상 붙이지 않습니다.

@@ -210,7 +218,7 @@
    * Running on all addresses (0.0.0.0)
   ```

-   > 만약 확인 단계를 건너뛰고 바로 RAGFlow에 로그인하면, RAGFlow가 완전히 초기화되지 않았기 때문에 브라우저에서 `network anormal` 오류가 발생할 수 있습니다.
+   > 만약 확인 단계를 건너뛰고 바로 RAGFlow에 로그인하면, RAGFlow가 완전히 초기화되지 않았기 때문에 브라우저에서 `network abnormal` 오류가 발생할 수 있습니다.

 2. 웹 브라우저에 서버의 IP 주소를 입력하고 RAGFlow에 로그인하세요.
   > 기본 설정을 사용할 경우, `http://IP_OF_YOUR_MACHINE`만 입력하면 됩니다 (포트 번호는 제외). 기본 HTTP 서비스 포트 `80`은 기본 구성으로 사용할 때 생략할 수 있습니다.
@@ -267,12 +275,21 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+프록시 환경인 경우, 프록시 인수를 전달할 수 있습니다：
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 소스 코드로 서비스를 시작합니다.

 1. `uv` 와 `pre-commit` 을 설치하거나, 이미 설치된 경우 이 단계를 건너뜁니다:

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   ```

 2. 소스 코드를 클론하고 Python 의존성을 설치합니다:
@@ -280,9 +297,9 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # install RAGFlow dependent python modules
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```

 3. Docker Compose를 사용하여 의존 서비스(MinIO, Elasticsearch, Redis 및 MySQL)를 시작합니다:
@@ -352,19 +369,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 로드맵

-[RAGFlow 로드맵 2025](https://github.com/infiniflow/ragflow/issues/4214)을 확인하세요.
+[RAGFlow 로드맵 2026](https://github.com/infiniflow/ragflow/issues/12241)을 확인하세요.

 ## 🏄 커뮤니티

 - [Discord](https://discord.gg/NjYzJD3GM3)
- [Twitter](https://twitter.com/infiniflowai)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 컨트리뷰션
--- a/README_pt_br.md
+++ b/README_pt_br.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DBEDFA"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="seguir no X(Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Badge Estático" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Badge Estático" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Última%20Relese" alt="Última Versão">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Documentação</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Roadmap</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -55,11 +57,11 @@
 <summary><b>📕 Índice</b></summary>

 - 💡 [O que é o RAGFlow?](#-o-que-é-o-ragflow)
- 🎮 [Demo](#-demo)
+- 🎮 [Primeiros Passos](#-primeiros-passos)
 - 📌 [Últimas Atualizações](#-últimas-atualizações)
 - 🌟 [Principais Funcionalidades](#-principais-funcionalidades)
 - 🔎 [Arquitetura do Sistema](#-arquitetura-do-sistema)
- 🎬 [Primeiros Passos](#-primeiros-passos)
+- 🎬 [Auto-hospedagem](#-auto-hospedagem)
 - 🔧 [Configurações](#-configurações)
 - 🔧 [Construir uma imagem docker sem incorporar modelos](#-construir-uma-imagem-docker-sem-incorporar-modelos)
 - 🔧 [Construir uma imagem docker incluindo modelos](#-construir-uma-imagem-docker-incluindo-modelos)
@@ -73,11 +75,11 @@

 ## 💡 O que é o RAGFlow?

-[RAGFlow](https://ragflow.io/) é um mecanismo de RAG (Retrieval-Augmented Generation) open-source líder que fusiona tecnologias RAG de ponta com funcionalidades Agent para criar uma camada contextual superior para LLMs. Oferece um fluxo de trabalho RAG otimizado adaptável a empresas de qualquer escala. Alimentado por um motor de contexto convergente e modelos Agent pré-construídos, o RAGFlow permite que desenvolvedores transformem dados complexos em sistemas de IA de alta fidelidade e pronto para produção com excepcional eficiência e precisão.
+[RAGFlow](https://ragflow.io/) é um mecanismo de [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) open-source líder que combina tecnologias RAG de ponta com funcionalidades Agent para criar uma camada contextual superior para LLMs. Oferece um fluxo de trabalho RAG otimizado adaptável a empresas de qualquer escala. Alimentado por [um motor de contexto](https://ragflow.io/basics/what-is-agent-context-engine) convergente e modelos Agent pré-construídos, o RAGFlow permite que desenvolvedores transformem dados complexos em sistemas de IA de alta fidelidade e prontos para produção com excepcional eficiência e precisão.

-## 🎮 Demo
+## 🎮 Primeiros Passos

-Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
+Experimente o nosso serviço na nuvem em [https://cloud.ragflow.io](https://cloud.ragflow.io).

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -86,16 +88,18 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).

 ## 🔥 Últimas Atualizações

- 12-11-2025 Suporta a sincronização de dados do Confluence, AWS S3, Discord e Google Drive.
+- 15-06-2026 Suporte a múltiplos canais de chat, como Feishu, Discord, Telegram, Line, etc.
+- 24-04-2026 Suporta DeepSeek v4.
+- 24-03-2026 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — Fornece um skill oficial para acessar datasets do RAGFlow via OpenClaw.
+- 26-12-2025 Suporte à função 'Memória' para agentes de IA.
+- 19-11-2025 Suporta Gemini 3 Pro.
+- 12-11-2025 Suporta a sincronização de dados do Confluence, S3, Notion, Discord e Google Drive.
 - 23-10-2025 Suporta MinerU e Docling como métodos de análise de documentos.
 - 15-10-2025 Suporte para pipelines de dados orquestrados.
 - 08-08-2025 Suporta a mais recente série GPT-5 da OpenAI.
 - 01-08-2025 Suporta fluxo de trabalho agente e MCP.
 - 23-05-2025 Adicione o componente executor de código Python/JS ao Agente.
- 05-05-2025 Suporte a consultas entre idiomas.
 - 19-03-2025 Suporta o uso de um modelo multi-modal para entender imagens dentro de arquivos PDF ou DOCX.
- 18-12-2024 Atualiza o modelo de Análise de Layout de Documentos no DeepDoc.
- 22-08-2024 Suporta conversão de texto para comandos SQL via RAG.

 ## 🎉 Fique Ligado

@@ -139,7 +143,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 Primeiros Passos
+## 🎬 Auto-hospedagem

 ### 📝 Pré-requisitos

@@ -147,6 +151,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 - RAM >= 16 GB
 - Disco >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): Necessário apenas se você pretende usar o recurso de executor de código (sandbox) do RAGFlow.

 > [!TIP]
@@ -186,12 +191,14 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 > Todas as imagens Docker são construídas para plataformas x86. Atualmente, não oferecemos imagens Docker para ARM64.
 > Se você estiver usando uma plataforma ARM64, por favor, utilize [este guia](https://ragflow.io/docs/dev/build_docker_image) para construir uma imagem Docker compatível com o seu sistema.

-    > O comando abaixo baixa a edição`v0.22.0` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.22.0`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor.
+    > O comando abaixo baixa a edição `v0.26.2` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.26.2`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor.

 ```bash
   $ cd ragflow/docker
-   
-   # Opcional: use uma tag estável (veja releases: https://github.com/infiniflow/ragflow/releases), ex.: git checkout v0.22.0
+
+   # git checkout v0.26.2
+   # Opcional: use uma tag estável (veja releases: https://github.com/infiniflow/ragflow/releases)
+   # Esta etapa garante que o arquivo entrypoint.sh no código corresponda à versão da imagem do Docker.

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -203,10 +210,10 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).

 > Nota: Antes da `v0.22.0`, fornecíamos imagens com modelos de embedding e imagens slim sem modelos de embedding. Detalhes a seguir:

-| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-| ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-| v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

 > A partir da `v0.22.0`, distribuímos apenas a edição slim e não adicionamos mais o sufixo **-slim** às tags das imagens.

@@ -228,7 +235,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
    * Rodando em todos os endereços (0.0.0.0)
   ```

-   > Se você pular essa etapa de confirmação e acessar diretamente o RAGFlow, seu navegador pode exibir um erro `network anormal`, pois, nesse momento, seu RAGFlow pode não estar totalmente inicializado.
+   > Se você pular essa etapa de confirmação e acessar diretamente o RAGFlow, seu navegador pode exibir um erro `network abnormal`, pois, nesse momento, seu RAGFlow pode não estar totalmente inicializado.
   >
 5. No seu navegador, insira o endereço IP do seu servidor e faça login no RAGFlow.

@@ -290,21 +297,30 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+Se você estiver atrás de um proxy, pode passar argumentos de proxy:
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 Lançar o serviço a partir do código-fonte para desenvolvimento

 1. Instale o `uv` e o `pre-commit`, ou pule esta etapa se eles já estiverem instalados:

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   ```
 2. Clone o código-fonte e instale as dependências Python:

   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # instala os módulos Python dependentes do RAGFlow
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # instala os módulos Python dependentes do RAGFlow
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```
 3. Inicie os serviços dependentes (MinIO, Elasticsearch, Redis e MySQL) usando Docker Compose:

@@ -365,19 +381,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 Roadmap

-Veja o [RAGFlow Roadmap 2025](https://github.com/infiniflow/ragflow/issues/4214)
+Veja o [RAGFlow Roadmap 2026](https://github.com/infiniflow/ragflow/issues/12241)

 ## 🏄 Comunidade

 - [Discord](https://discord.gg/NjYzJD3GM3)
- [Twitter](https://twitter.com/infiniflowai)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 Contribuindo
--- a/README_tr.md
+++ b/README_tr.md
@@ -0,0 +1,410 @@
+<div align="center">
+<a href="https://cloud.ragflow.io/">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
+</a>
+</div>
+
+<p align="center">
+  <a href="./README.md"><img alt="README in English" src="https://img.shields.io/badge/English-DFE0E5"></a>
+  <a href="./README_zh.md"><img alt="简体中文版自述文件" src="https://img.shields.io/badge/简体中文-DFE0E5"></a>
+  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
+  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
+  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
+  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
+  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DBEDFA"></a>
+</p>
+
+<p align="center">
+    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
+        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="X(Twitter)'da takip et">
+    </a>
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Başlarken" src="https://img.shields.io/badge/Get-Started-4e6b99">
+    </a>
+    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
+    </a>
+    <a href="https://github.com/infiniflow/ragflow/releases/latest">
+        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Son%20Sürüm" alt="Son Sürüm">
+    </a>
+    <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
+        <img height="21" src="https://img.shields.io/badge/Lisans-Apache--2.0-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="lisans">
+    </a>
+    <a href="https://deepwiki.com/infiniflow/ragflow">
+        <img alt="Ask DeepWiki" src="https://deepwiki.com/badge.svg">
+    </a>
+</p>
+
+<h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
+  <a href="https://ragflow.io/docs/dev/">Dokümantasyon</a> |
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Yol Haritası</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
+</h4>
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/ragflow-octoverse.png" width="1200"/>
+</div>
+
+<div align="center">
+<a href="https://trendshift.io/repositories/9064" target="_blank"><img src="https://trendshift.io/api/badge/repositories/9064" alt="infiniflow%2Fragflow | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+</div>
+
+<details open>
+<summary><b>📕 İçindekiler</b></summary>
+
+- 💡 [RAGFlow Nedir?](#-ragflow-nedir)
+- 🎮 [Başlarken](#-başlarken)
+- 📌 [Son Güncellemeler](#-son-güncellemeler)
+- 🌟 [Temel Özellikler](#-temel-özellikler)
+- 🔎 [Sistem Mimarisi](#-sistem-mimarisi)
+- 🎬 [Kendi Sunucusunda Barındırma](#-kendi-sunucusunda-barındırma)
+- 🔧 [Yapılandırmalar](#-yapılandırmalar)
+- 🔧 [Docker İmajı Oluşturma](#-docker-i̇majı-oluşturma)
+- 🔨 [Geliştirme İçin Kaynaktan Hizmet Başlatma](#-geliştirme-i̇çin-kaynaktan-hizmet-başlatma)
+- 📚 [Dokümantasyon](#-dokümantasyon)
+- 📜 [Yol Haritası](#-yol-haritası)
+- 🏄 [Topluluk](#-topluluk)
+- 🙌 [Katkıda Bulunma](#-katkıda-bulunma)
+
+</details>
+
+## 💡 RAGFlow Nedir?
+
+[RAGFlow](https://ragflow.io/), derin doküman anlayışına dayalı, açık kaynaklı ve öncü bir Artırılmış Üretim ile Bilgi Erişimi ([RAG](https://ragflow.io/basics/what-is-rag)) motorudur. En son RAG teknolojisini Ajan yetenekleriyle birleştirerek LLM'ler için üstün bir bağlam katmanı oluşturur. Her ölçekteki kuruluşa uyarlanabilir, kolaylaştırılmış bir RAG iş akışı sunar. Yakınsanmış bir [bağlam motoru](https://ragflow.io/basics/what-is-agent-context-engine) ve hazır ajan şablonlarıyla donatılmış RAGFlow, geliştiricilerin karmaşık verileri yüksek doğrulukta, üretime hazır yapay zeka sistemlerine olağanüstü verimlilik ve hassasiyetle dönüştürmesini sağlar.
+
+## 🎮 Başlarken
+
+Bulut hizmetimizi [https://cloud.ragflow.io](https://cloud.ragflow.io) adresinden deneyin.
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
+<img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/agentic-dark.gif" width="1200"/>
+</div>
+
+## 🔥 Son Güncellemeler
+
+- 2026-06-15 Feishu, Discord, Telegram, Line gibi birden fazla sohbet kanalı desteği eklendi.
+- 2026-04-24 DeepSeek v4 desteği eklendi.
+- 2026-03-24 [RAGFlow Skill on OpenClaw](https://clawhub.ai/yingfeng/ragflow-skill) — OpenClaw üzerinden RAGFlow veri setlerine erişmek için resmi bir skill sağlar.
+- 2025-12-26 Yapay zeka ajanı için 'Bellek' desteği eklendi.
+- 2025-11-19 Gemini 3 Pro desteği eklendi.
+- 2025-11-12 Confluence, S3, Notion, Discord, Google Drive'dan veri senkronizasyonu desteği eklendi.
+- 2025-10-23 Doküman ayrıştırma yöntemi olarak MinerU ve Docling desteği eklendi.
+- 2025-10-15 Düzenlenebilir veri alım hattı desteği eklendi.
+- 2025-08-08 OpenAI'ın en yeni GPT-5 serisi modelleri için destek eklendi.
+- 2025-08-01 Ajanlı iş akışı ve MCP desteği eklendi.
+- 2025-05-23 Ajana Python/JavaScript kod çalıştırıcı bileşeni eklendi.
+- 2025-03-19 PDF veya DOCX dosyalarındaki görselleri yorumlamak için çok modlu model desteği eklendi.
+
+## 🎉 Bizi Takip Edin
+
+⭐️ Heyecan verici yeni özellikler ve iyileştirmelerden haberdar olmak için depomuzu yıldızlayın! Yeni sürümler için anında bildirim alın! 🌟
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/user-attachments/assets/18c9707e-b8aa-4caf-a154-037089c105ba" width="1200"/>
+</div>
+
+## 🌟 Temel Özellikler
+
+### 🍭 **"Kaliteli girdi, kaliteli çıktı"**
+
+- Karmaşık formatlara sahip yapılandırılmamış verilerden [derin doküman anlayışı](./deepdoc/README.md) tabanlı bilgi çıkarımı.
+- Kelimenin tam anlamıyla sınırsız token içinde "samanlıkta iğne bulma" yeteneği.
+
+### 🍱 **Şablon tabanlı parçalama**
+
+- Akıllı ve açıklanabilir.
+- Aralarından seçim yapabileceğiniz çok sayıda şablon seçeneği.
+
+### 🌱 **Azaltılmış halüsinasyonlarla temellendirilmiş alıntılar**
+
+- İnsan müdahalesine olanak tanıyan metin parçalama görselleştirmesi.
+- Temellendirilmiş yanıtları desteklemek için anahtar referansların hızlı görüntülenmesi ve izlenebilir alıntılar.
+
+### 🍔 **Heterojen veri kaynaklarıyla uyumluluk**
+
+- Word, slaytlar, Excel, txt, görseller, taranmış kopyalar, yapılandırılmış veriler, web sayfaları ve daha fazlasını destekler.
+
+### 🛀 **Otomatik ve zahmetsiz RAG iş akışı**
+
+- Hem bireysel kullanıcılar hem de büyük işletmeler için uyarlanmış, kolaylaştırılmış RAG orkestrasyonu.
+- Yapılandırılabilir LLM'ler ve gömme (embedding) modelleri.
+- Birleştirilmiş yeniden sıralama ile çoklu geri çağırma.
+- İş süreçlerine sorunsuz entegrasyon için sezgisel API'ler.
+
+## 🔎 Sistem Mimarisi
+
+<div align="center" style="margin-top:20px;margin-bottom:20px;">
+<img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
+</div>
+
+## 🎬 Kendi Sunucusunda Barındırma
+
+### 📝 Ön Koşullar
+
+- CPU >= 4 çekirdek
+- RAM >= 16 GB
+- Disk >= 50 GB
+- Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
+- [gVisor](https://gvisor.dev/docs/user_guide/install/): Yalnızca RAGFlow'un kod çalıştırıcı (sandbox) özelliğini kullanmayı planlıyorsanız gereklidir.
+
+> [!TIP]
+> Yerel makinenize (Windows, Mac veya Linux) Docker yüklemediyseniz, [Docker Engine Kurulumu](https://docs.docker.com/engine/install/) sayfasına bakın.
+
+### 🚀 Sunucuyu Başlatma
+
+1. `vm.max_map_count` değerinin >= 262144 olduğundan emin olun:
+
+   > `vm.max_map_count` değerini kontrol etmek için:
+   >
+   > ```bash
+   > $ sysctl vm.max_map_count
+   > ```
+   >
+   > Değer 262144'ten düşükse, en az 262144 olarak ayarlayın.
+   >
+   > ```bash
+   > # Bu örnekte 262144 olarak ayarlıyoruz:
+   > $ sudo sysctl -w vm.max_map_count=262144
+   > ```
+   >
+   > Bu değişiklik sistem yeniden başlatıldığında sıfırlanacaktır. Değişikliğin kalıcı olmasını sağlamak için
+   > **/etc/sysctl.conf** dosyasındaki `vm.max_map_count` değerini buna göre ekleyin veya güncelleyin:
+   >
+   > ```bash
+   > vm.max_map_count=262144
+   > ```
+   >
+2. Depoyu klonlayın:
+
+   ```bash
+   $ git clone https://github.com/infiniflow/ragflow.git
+   ```
+3. Önceden oluşturulmuş Docker imajlarını kullanarak sunucuyu başlatın:
+
+> [!CAUTION]
+> Tüm Docker imajları x86 platformları için oluşturulmuştur. Şu anda ARM64 için Docker imajı sunmuyoruz.
+> ARM64 platformundaysanız, sisteminizle uyumlu bir Docker imajı oluşturmak için [bu kılavuzu](https://ragflow.io/docs/dev/build_docker_image) takip edin.
+
+> Aşağıdaki komut RAGFlow Docker imajının `v0.26.2` sürümünü indirir. Farklı RAGFlow sürümleri için aşağıdaki tabloya bakın. `v0.26.2` dışında bir sürüm indirmek için, `docker compose` ile sunucuyu başlatmadan önce **docker/.env** dosyasındaki `RAGFLOW_IMAGE` değişkenini güncelleyin.
+
+```bash
+   $ cd ragflow/docker
+
+   # git checkout v0.26.2
+   # İsteğe bağlı: Kararlı bir etiket kullanın (sürümler: https://github.com/infiniflow/ragflow/releases)
+   # Bu adım, koddaki entrypoint.sh dosyasının Docker imaj sürümüyle eşleşmesini sağlar.
+
+   # DeepDoc görevleri için CPU kullanımı:
+   $ docker compose -f docker-compose.yml up -d
+
+   # DeepDoc görevlerini hızlandırmak için GPU kullanımı:
+   # sed -i '1i DEVICE=gpu' .env
+   # docker compose -f docker-compose.yml up -d
+```
+
+> Not: `v0.22.0` öncesinde hem gömme modelleri içeren imajlar hem de gömme modelleri içermeyen ince (slim) imajlar sunuyorduk. Detaylar aşağıdadır:
+
+| RAGFlow imaj etiketi | İmaj boyutu (GB) | Gömme modelleri var mı? | Kararlı mı?    |
+|-----------------------|-------------------|-------------------------|-----------------|
+| v0.21.1               | &approx;9        | ✔️                      | Kararlı sürüm   |
+| v0.21.1-slim          | &approx;2        | ❌                       | Kararlı sürüm   |
+
+> `v0.22.0`'dan itibaren yalnızca ince (slim) sürümü sunuyoruz ve imaj etiketine artık **-slim** son eki eklemiyoruz.
+
+4. Sunucu çalışır duruma geldikten sonra sunucu durumunu kontrol edin:
+
+   ```bash
+   $ docker logs -f docker-ragflow-cpu-1
+   ```
+
+   _Aşağıdaki çıktı, sistemin başarıyla başlatıldığını onaylar:_
+
+   ```bash
+
+         ____   ___    ______ ______ __
+        / __ \ /   |  / ____// ____// /____  _      __
+       / /_/ // /| | / / __ / /_   / // __ \| | /| / /
+      / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
+     /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/
+
+    * Running on all addresses (0.0.0.0)
+   ```
+
+   > Bu onay adımını atlayıp doğrudan RAGFlow'a giriş yaparsanız, o anda RAGFlow tam olarak başlatılmamış olabileceğinden
+   > tarayıcınız `network abnormal` hatası verebilir.
+   >
+5. Web tarayıcınıza sunucunuzun IP adresini girin ve RAGFlow'a giriş yapın.
+
+   > Varsayılan ayarlarla, yalnızca `http://MAKİNENİZİN_IP_ADRESİ` girmeniz yeterlidir (port numarası **gerekmez**),
+   > çünkü varsayılan yapılandırmada HTTP sunucu portu `80` olduğundan port numarası yazılmayabilir.
+   >
+6. [service_conf.yaml.template](./docker/service_conf.yaml.template) dosyasında, `user_default_llm` içinde istediğiniz LLM sağlayıcısını seçin ve
+   `API_KEY` alanını ilgili API anahtarıyla güncelleyin.
+
+   > Daha fazla bilgi için [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) sayfasına bakın.
+   >
+
+   _Gösteri başlasın!_
+
+## 🔧 Yapılandırmalar
+
+Sistem yapılandırmaları söz konusu olduğunda, aşağıdaki dosyaları yönetmeniz gerekecektir:
+
+- [.env](./docker/.env): `SVR_HTTP_PORT`, `MYSQL_PASSWORD` ve `MINIO_PASSWORD` gibi temel sistem ayarlarını içerir.
+- [service_conf.yaml.template](./docker/service_conf.yaml.template): Arka uç hizmetlerini yapılandırır. Bu dosyadaki ortam değişkenleri, Docker konteyneri başladığında otomatik olarak doldurulacaktır. Docker konteyneri içinde ayarlanan tüm ortam değişkenleri kullanıma hazır olacak ve hizmet davranışını dağıtım ortamına göre özelleştirmenize olanak tanıyacaktır.
+- [docker-compose.yml](./docker/docker-compose.yml): Sistem, başlatılmak için [docker-compose.yml](./docker/docker-compose.yml) dosyasına dayanır.
+
+> [./docker/README](./docker/README.md) dosyası, [service_conf.yaml.template](./docker/service_conf.yaml.template) dosyasında `${ENV_VARS}` olarak kullanılabilen ortam ayarları ve hizmet yapılandırmalarının ayrıntılı bir açıklamasını sağlar.
+
+Varsayılan HTTP sunucu portunu (80) değiştirmek için [docker-compose.yml](./docker/docker-compose.yml) dosyasında `80:80` ifadesini `<SUNUCU_PORTUNUZ>:80` olarak değiştirin.
+
+Yukarıdaki yapılandırma değişikliklerinin etkili olması için tüm konteynerlerin yeniden başlatılması gerekir:
+
+> ```bash
+> $ docker compose -f docker-compose.yml up -d
+> ```
+
+### Doküman Motorunu Elasticsearch'ten Infinity'ye Geçirme
+
+RAGFlow varsayılan olarak tam metin ve vektörlerin depolanması için Elasticsearch kullanır. [Infinity](https://github.com/infiniflow/infinity/)'ye geçmek için şu adımları izleyin:
+
+1. Çalışan tüm konteynerleri durdurun:
+
+   ```bash
+   $ docker compose -f docker/docker-compose.yml down -v
+   ```
+
+> [!WARNING]
+> `-v` seçeneği Docker konteyner birimlerini silecek ve mevcut veriler temizlenecektir.
+
+2. **docker/.env** dosyasında `DOC_ENGINE` değerini `infinity` olarak ayarlayın.
+3. Konteynerleri başlatın:
+
+   ```bash
+   $ docker compose -f docker-compose.yml up -d
+   ```
+
+> [!WARNING]
+> Linux/arm64 makinesinde Infinity'ye geçiş henüz resmi olarak desteklenmemektedir.
+
+## 🔧 Docker İmajı Oluşturma
+
+Bu imaj yaklaşık 2 GB boyutundadır ve harici LLM ile gömme hizmetlerine bağlıdır.
+
+```bash
+git clone https://github.com/infiniflow/ragflow.git
+cd ragflow/
+docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
+Veya bir proxy arkasındaysanız, proxy parametrelerini iletebilirsiniz:
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://PROXY_ADRESINIZ:PORT \
+  --build-arg https_proxy=http://PROXY_ADRESINIZ:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
+## 🔨 Geliştirme İçin Kaynaktan Hizmet Başlatma
+
+1. `uv` yükleyin veya zaten yüklüyse bu adımı atlayın:
+
+   ```bash
+   pipx install uv
+   ```
+2. Kaynak kodunu klonlayın ve Python bağımlılıklarını yükleyin:
+
+   ```bash
+   git clone https://github.com/infiniflow/ragflow.git
+   cd ragflow/
+   uv sync --python 3.13 # RAGFlow'un bağımlı Python modüllerini yükler
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
+   ```
+3. Bağımlı hizmetleri (MinIO, Elasticsearch, Redis ve MySQL) Docker Compose kullanarak başlatın:
+
+   ```bash
+   docker compose -f docker/docker-compose-base.yml up -d
+   ```
+
+   **docker/.env** dosyasında belirtilen tüm ana bilgisayar adlarını `127.0.0.1`'e çözümlemek için `/etc/hosts` dosyasına aşağıdaki satırı ekleyin:
+
+   ```
+   127.0.0.1       es01 infinity mysql minio redis sandbox-executor-manager
+   ```
+4. HuggingFace'e erişemiyorsanız, bir ayna site kullanmak için `HF_ENDPOINT` ortam değişkenini ayarlayın:
+
+   ```bash
+   export HF_ENDPOINT=https://hf-mirror.com
+   ```
+5. İşletim sisteminizde jemalloc yoksa, aşağıdaki şekilde yükleyin:
+
+   ```bash
+   # Ubuntu
+   sudo apt-get install libjemalloc-dev
+   # CentOS
+   sudo yum install jemalloc
+   # OpenSUSE
+   sudo zypper install jemalloc
+   # macOS
+   brew install jemalloc
+   ```
+6. Arka uç hizmetini başlatın:
+
+   ```bash
+   source .venv/bin/activate
+   export PYTHONPATH=$(pwd)
+   bash docker/launch_backend_service.sh
+   ```
+7. Ön yüz bağımlılıklarını yükleyin:
+
+   ```bash
+   cd web
+   npm install
+   ```
+8. Ön yüz hizmetini başlatın:
+
+   ```bash
+   npm run dev
+   ```
+
+   _Aşağıdaki çıktı, sistemin başarıyla başlatıldığını onaylar:_
+
+   ![](https://github.com/user-attachments/assets/0daf462c-a24d-4496-a66f-92533534e187)
+9. Geliştirme tamamlandıktan sonra RAGFlow ön yüz ve arka uç hizmetini durdurun:
+
+   ```bash
+   pkill -f "ragflow_server.py|task_executor.py"
+   ```
+
+## 📚 Dokümantasyon
+
+- [Hızlı Başlangıç](https://ragflow.io/docs/dev/)
+- [Yapılandırma](https://ragflow.io/docs/dev/configurations)
+- [Sürüm Notları](https://ragflow.io/docs/dev/release_notes)
+- [Kullanıcı Kılavuzları](https://ragflow.io/docs/category/user-guides)
+- [Geliştirici Kılavuzları](https://ragflow.io/docs/category/developer-guides)
+- [Referanslar](https://ragflow.io/docs/dev/category/references)
+- [SSS](https://ragflow.io/docs/dev/faq)
+
+## 📜 Yol Haritası
+
+[RAGFlow Yol Haritası 2026](https://github.com/infiniflow/ragflow/issues/12241) sayfasına bakın.
+
+## 🏄 Topluluk
+
+- [Discord](https://discord.gg/NjYzJD3GM3)
+- [X](https://x.com/infiniflowai)
+- [GitHub Tartışmalar](https://github.com/orgs/infiniflow/discussions)
+
+## 🙌 Katkıda Bulunma
+
+RAGFlow, açık kaynak iş birliği sayesinde gelişmektedir. Bu anlayışla, topluluktan gelen çeşitli katkıları benimsiyoruz.
+Bir parçası olmak istiyorsanız, önce [Katkıda Bulunma Kılavuzumuzu](https://ragflow.io/docs/dev/contributing) inceleyin.
--- a/README_tzh.md
+++ b/README_tzh.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="350" alt="ragflow logo">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DBEDFA"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Document</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Roadmap</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -55,11 +57,11 @@
 <summary><b>📕 目錄</b></summary>

 - 💡 [RAGFlow 是什麼？](#-RAGFlow-是什麼)
- 🎮 [Demo-試用](#-demo-試用)
+- 🎮 [快速開始](#-快速開始)
 - 📌 [近期更新](#-近期更新)
 - 🌟 [主要功能](#-主要功能)
 - 🔎 [系統架構](#-系統架構)
- 🎬 [快速開始](#-快速開始)
+- 🎬 [自行架設](#-自行架設)
 - 🔧 [系統配置](#-系統配置)
 - 🔨 [以原始碼啟動服務](#-以原始碼啟動服務)
 - 📚 [技術文檔](#-技術文檔)
@@ -72,11 +74,11 @@

 ## 💡 RAGFlow 是什麼？

-[RAGFlow](https://ragflow.io/) 是一款領先的開源 RAG（Retrieval-Augmented Generation）引擎，通過融合前沿的 RAG 技術與 Agent 能力，為大型語言模型提供卓越的上下文層。它提供可適配任意規模企業的端到端 RAG 工作流，憑藉融合式上下文引擎與預置的 Agent 模板，助力開發者以極致效率與精度將複雜數據轉化為高可信、生產級的人工智能系統。
+[RAGFlow](https://ragflow.io/) 是一款領先的開源 [RAG](https://ragflow.io/basics/what-is-rag)（Retrieval-Augmented Generation）引擎，通過融合前沿的 RAG 技術與 Agent 能力，為大型語言模型提供卓越的上下文層。它提供可適配任意規模企業的端到端 RAG 工作流，憑藉融合式[上下文引擎](https://ragflow.io/basics/what-is-agent-context-engine)與預置的 Agent 模板，助力開發者以極致效率與精度將複雜數據轉化為高可信、生產級的人工智能系統。

-## 🎮 Demo 試用
+## 🎮 快速開始

-請登入網址 [https://demo.ragflow.io](https://demo.ragflow.io) 試用 demo。
+請登入網址 [https://cloud.ragflow.io](https://cloud.ragflow.io) 試用雲服務。

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -85,16 +87,19 @@

 ## 🔥 近期更新

- 2025-11-12 支援從 Confluence、AWS S3、Discord、Google Drive 進行資料同步。
+- 2026-06-15 支援飛書、Discord、Telegram、Line 等多種聊天管道。
+- 2026-04-24 支援 DeepSeek v4 版本。
+- 2026-03-24 發布 [RAGFlow 官方 Skill](https://clawhub.ai/yingfeng/ragflow-skill) — 提供官方 Skill 以透過 OpenClaw 訪問 RAGFlow 數據集。
+- 2025-12-26 支援 AI 代理的「記憶」功能。
+- 2025-11-19 支援 Gemini 3 Pro。
+- 2025-11-12 支援從 Confluence、S3、Notion、Discord、Google Drive 進行資料同步。
 - 2025-10-23 支援 MinerU 和 Docling 作為文件解析方法。
 - 2025-10-15 支援可編排的資料管道。
 - 2025-08-08 支援 OpenAI 最新的 GPT-5 系列模型。
- 2025-08-01 支援 agentic workflow 和 MCP
+- 2025-08-01 支援 agentic workflow 和 MCP。
 - 2025-05-23 為 Agent 新增 Python/JS 程式碼執行器元件。
- 2025-05-05 支援跨語言查詢。
- 2025-03-19 PDF和DOCX中的圖支持用多模態大模型去解析得到描述.
- 2024-12-18 升級了 DeepDoc 的文檔佈局分析模型。
- 2024-08-22 支援用 RAG 技術實現從自然語言到 SQL 語句的轉換。
+- 2025-03-19 PDF和DOCX中的圖支持用多模態大模型去解析得到描述。
+

 ## 🎉 關注項目

@@ -123,7 +128,7 @@

 ### 🍔 **相容各類異質資料來源**

- 支援豐富的文件類型，包括 Word 文件、PPT、excel 表格、txt 檔案、圖片、PDF、影印件、影印件、結構化資料、網頁等。
+- 支援豐富的文件類型，包括 Word 文件、PPT、excel 表格、txt 檔案、圖片、PDF、影印件、複印件、結構化資料、網頁等。

 ### 🛀 **全程無憂、自動化的 RAG 工作流程**

@@ -138,7 +143,7 @@
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 快速開始
+## 🎬 自行架設

 ### 📝 前提條件

@@ -146,6 +151,7 @@
 - RAM >= 16 GB
 - Disk >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): 僅在您打算使用 RAGFlow 的代碼執行器（沙箱）功能時才需要安裝。

 > [!TIP]
@@ -185,12 +191,14 @@
 > 所有 Docker 映像檔都是為 x86 平台建置的。目前，我們不提供 ARM64 平台的 Docker 映像檔。
 > 如果您使用的是 ARM64 平台，請使用 [這份指南](https://ragflow.io/docs/dev/build_docker_image) 來建置適合您系統的 Docker 映像檔。

-> 執行以下指令會自動下載 RAGFlow Docker 映像 `v0.22.0`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.22.0` 的 Docker 映像，請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。
+> 執行以下指令會自動下載 RAGFlow Docker 映像 `v0.26.2`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.26.2` 的 Docker 映像，請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。

 ```bash
   $ cd ragflow/docker
-   
-   # 可選：使用穩定版標籤（查看發佈：https://github.com/infiniflow/ragflow/releases），例：git checkout v0.22.0
+
+   # git checkout v0.26.2
+   # 可選：使用穩定版標籤（查看發佈：https://github.com/infiniflow/ragflow/releases）
+   # 此步驟確保程式碼中的 entrypoint.sh 檔案與 Docker 映像版本一致。

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -202,10 +210,10 @@

 > 注意：在 `v0.22.0` 之前的版本，我們會同時提供包含 embedding 模型的映像和不含 embedding 模型的 slim 映像。具體如下：

-| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-| ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-| v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+|-------------------|-----------------|-----------------------|----------------|
+| v0.21.1           | &approx;9       | ✔️                    | Stable release |
+| v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

 > 從 `v0.22.0` 開始，我們只發佈 slim 版本，並且不再在映像標籤後附加 **-slim** 後綴。

@@ -233,7 +241,7 @@
    * Running on all addresses (0.0.0.0)
   ```

-   > 如果您跳過這一步驟系統確認步驟就登入 RAGFlow，你的瀏覽器有可能會提示 `network anormal` 或 `網路異常`，因為 RAGFlow 可能並未完全啟動成功。
+   > 如果您跳過這一系統確認步驟就登入 RAGFlow，您的瀏覽器有可能會提示 `network abnormal` 或 `網路異常`，因為 RAGFlow 可能並未完全啟動成功。
   >
 5. 在你的瀏覽器中輸入你的伺服器對應的 IP 位址並登入 RAGFlow。

@@ -299,12 +307,21 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+若您位於代理環境，可傳遞代理參數：
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 以原始碼啟動服務

-1. 安裝 `uv` 和 `pre-commit`。如已安裝，可跳過此步驟：
+1. 安裝 `uv`。如已安裝，可跳過此步驟：

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   export UV_INDEX=https://mirrors.aliyun.com/pypi/simple
   ```
 2. 下載原始碼並安裝 Python 依賴：
@@ -312,9 +329,9 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # install RAGFlow dependent python modules
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```
 3. 透過 Docker Compose 啟動依賴的服務（MinIO, Elasticsearch, Redis, and MySQL）：

@@ -379,19 +396,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 路線圖

-詳見 [RAGFlow Roadmap 2025](https://github.com/infiniflow/ragflow/issues/4214) 。
+詳見 [RAGFlow Roadmap 2026](https://github.com/infiniflow/ragflow/issues/12241) 。

 ## 🏄 開源社群

- [Discord](https://discord.gg/zd4qPW6t)
- [Twitter](https://twitter.com/infiniflowai)
+- [Discord](https://discord.gg/NjYzJD3GM3)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 貢獻指南
--- a/README_zh.md
+++ b/README_zh.md
@@ -1,5 +1,5 @@
 <div align="center">
-<a href="https://demo.ragflow.io/">
+<a href="https://cloud.ragflow.io/">
 <img src="web/src/assets/logo-with-text.svg" width="350" alt="ragflow logo">
 </a>
 </div>
@@ -10,19 +10,22 @@
  <a href="./README_tzh.md"><img alt="繁體版中文自述文件" src="https://img.shields.io/badge/繁體中文-DFE0E5"></a>
  <a href="./README_ja.md"><img alt="日本語のREADME" src="https://img.shields.io/badge/日本語-DFE0E5"></a>
  <a href="./README_ko.md"><img alt="한국어" src="https://img.shields.io/badge/한국어-DFE0E5"></a>
+  <a href="./README_fr.md"><img alt="README en Français" src="https://img.shields.io/badge/Français-DFE0E5"></a>
  <a href="./README_id.md"><img alt="Bahasa Indonesia" src="https://img.shields.io/badge/Bahasa Indonesia-DFE0E5"></a>
  <a href="./README_pt_br.md"><img alt="Português(Brasil)" src="https://img.shields.io/badge/Português(Brasil)-DFE0E5"></a>
+  <a href="./README_ar.md"><img alt="README in Arabic" src="https://img.shields.io/badge/Arabic-DFE0E5"></a>
+  <a href="./README_tr.md"><img alt="Türkçe README" src="https://img.shields.io/badge/Türkçe-DFE0E5"></a>
 </p>

 <p align="center">
    <a href="https://x.com/intent/follow?screen_name=infiniflowai" target="_blank">
        <img src="https://img.shields.io/twitter/follow/infiniflow?logo=X&color=%20%23f5f5f5" alt="follow on X(Twitter)">
    </a>
-    <a href="https://demo.ragflow.io" target="_blank">
-        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
+    <a href="https://cloud.ragflow.io" target="_blank">
+        <img alt="Static Badge" src="https://img.shields.io/badge/Get-Started-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.22.0">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.26.2">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -36,11 +39,10 @@
 </p>

 <h4 align="center">
+  <a href="https://cloud.ragflow.io">Cloud</a> |
  <a href="https://ragflow.io/docs/dev/">Document</a> |
-  <a href="https://github.com/infiniflow/ragflow/issues/4214">Roadmap</a> |
-  <a href="https://twitter.com/infiniflowai">Twitter</a> |
-  <a href="https://discord.gg/NjYzJD3GM3">Discord</a> |
-  <a href="https://demo.ragflow.io">Demo</a>
+  <a href="https://github.com/infiniflow/ragflow/issues/12241">Roadmap</a> |
+  <a href="https://discord.gg/NjYzJD3GM3">Discord</a>
 </h4>

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
@@ -55,11 +57,11 @@
 <summary><b>📕 目录</b></summary>

 - 💡 [RAGFlow 是什么？](#-RAGFlow-是什么)
- 🎮 [Demo](#-demo)
+- 🎮 [快速开始](#-快速开始)
 - 📌 [近期更新](#-近期更新)
 - 🌟 [主要功能](#-主要功能)
 - 🔎 [系统架构](#-系统架构)
- 🎬 [快速开始](#-快速开始)
+- 🎬 [自主托管](#-自主托管)
 - 🔧 [系统配置](#-系统配置)
 - 🔨 [以源代码启动服务](#-以源代码启动服务)
 - 📚 [技术文档](#-技术文档)
@@ -72,11 +74,11 @@

 ## 💡 RAGFlow 是什么？

-[RAGFlow](https://ragflow.io/) 是一款领先的开源检索增强生成（RAG）引擎，通过融合前沿的 RAG 技术与 Agent 能力，为大型语言模型提供卓越的上下文层。它提供可适配任意规模企业的端到端 RAG 工作流，凭借融合式上下文引擎与预置的 Agent 模板，助力开发者以极致效率与精度将复杂数据转化为高可信、生产级的人工智能系统。
+[RAGFlow](https://ragflow.io/) 是一款领先的开源检索增强生成（[RAG](https://ragflow.io/basics/what-is-rag)）引擎，通过融合前沿的 RAG 技术与 Agent 能力，为大型语言模型提供卓越的上下文层。它提供可适配任意规模企业的端到端 RAG 工作流，凭借融合式[上下文引擎](https://ragflow.io/basics/what-is-agent-context-engine)与预置的 Agent 模板，助力开发者以极致效率与精度将复杂数据转化为高可信、生产级的人工智能系统。

-## 🎮 Demo 试用
+## 🎮 快速开始

-请登录网址 [https://demo.ragflow.io](https://demo.ragflow.io) 试用 demo。
+请登录网址 [https://cloud.ragflow.io](https://cloud.ragflow.io) 体验云服务。

 <div align="center" style="margin-top:20px;margin-bottom:20px;">
 <img src="https://raw.githubusercontent.com/infiniflow/ragflow-docs/refs/heads/image/image/chunking.gif" width="1200"/>
@@ -85,16 +87,19 @@

 ## 🔥 近期更新

- 2025-11-12 支持从 Confluence、AWS S3、Discord、Google Drive 进行数据同步。
+- 2026-06-15 支持飞书、Discord、Telegram、Line 等多种聊天渠道。
+- 2026-04-24 支持 DeepSeek v4。
+- 2026-03-24 发布 [RAGFlow 官方 Skill](https://clawhub.ai/yingfeng/ragflow-skill) — 提供官方 Skill 以通过 OpenClaw 访问 RAGFlow 数据集。
+- 2025-12-26 支持 AI 代理的"记忆"功能。
+- 2025-11-19 支持 Gemini 3 Pro。
+- 2025-11-12 支持从 Confluence、S3、Notion、Discord、Google Drive 进行数据同步。
 - 2025-10-23 支持 MinerU 和 Docling 作为文档解析方法。
 - 2025-10-15 支持可编排的数据管道。
 - 2025-08-08 支持 OpenAI 最新的 GPT-5 系列模型。
 - 2025-08-01 支持 agentic workflow 和 MCP。
 - 2025-05-23 Agent 新增 Python/JS 代码执行器组件。
- 2025-05-05 支持跨语言查询。
- 2025-03-19 PDF 和 DOCX 中的图支持用多模态大模型去解析得到描述.
- 2024-12-18 升级了 DeepDoc 的文档布局分析模型。
- 2024-08-22 支持用 RAG 技术实现从自然语言到 SQL 语句的转换。
+- 2025-03-19 PDF 和 DOCX 中的图支持用多模态大模型去解析得到描述。
+

 ## 🎉 关注项目

@@ -138,7 +143,7 @@
 <img src="https://github.com/user-attachments/assets/31b0dd6f-ca4f-445a-9457-70cb44a381b2" width="1000"/>
 </div>

-## 🎬 快速开始
+## 🎬 自主托管

 ### 📝 前提条件

@@ -146,6 +151,7 @@
 - RAM >= 16 GB
 - Disk >= 50 GB
 - Docker >= 24.0.0 & Docker Compose >= v2.26.1
+- Python >= 3.13
 - [gVisor](https://gvisor.dev/docs/user_guide/install/): 仅在你打算使用 RAGFlow 的代码执行器（沙箱）功能时才需要安装。

 > [!TIP]
@@ -186,12 +192,14 @@
 > 请注意，目前官方提供的所有 Docker 镜像均基于 x86 架构构建，并不提供基于 ARM64 的 Docker 镜像。
 > 如果你的操作系统是 ARM64 架构，请参考[这篇文档](https://ragflow.io/docs/dev/build_docker_image)自行构建 Docker 镜像。

-   > 运行以下命令会自动下载 RAGFlow Docker 镜像 `v0.22.0`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.22.0` 的 Docker 镜像，请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。
+   > 运行以下命令会自动下载 RAGFlow Docker 镜像 `v0.26.2`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.26.2` 的 Docker 镜像，请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。

   ```bash
   $ cd ragflow/docker
-   
-   # 可选：使用稳定版本标签（查看发布：https://github.com/infiniflow/ragflow/releases），例如：git checkout v0.22.0
+
+   # git checkout v0.26.2
+   # 可选：使用稳定版本标签（查看发布：https://github.com/infiniflow/ragflow/releases）
+   # 这一步确保代码中的 entrypoint.sh 文件与 Docker 镜像的版本保持一致。

   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
@@ -200,13 +208,13 @@
   # sed -i '1i DEVICE=gpu' .env
   # docker compose -f docker-compose.yml up -d
   ```
-   
+
   > 注意：在 `v0.22.0` 之前的版本，我们会同时提供包含 embedding 模型的镜像和不含 embedding 模型的 slim 镜像。具体如下：

-   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
-   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.21.1           | &approx;9       | ✔️                    | Stable release           |
-   | v0.21.1-slim      | &approx;2       | ❌                    | Stable release           |
+   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?        |
+   |-------------------|-----------------|-----------------------|----------------|
+   | v0.21.1           | &approx;9       | ✔️                    | Stable release |
+   | v0.21.1-slim      | &approx;2       | ❌                     | Stable release |

   > 从 `v0.22.0` 开始，我们只发布 slim 版本，并且不再在镜像标签后附加 **-slim** 后缀。

@@ -234,7 +242,7 @@
    * Running on all addresses (0.0.0.0)
   ```

-   > 如果您在没有看到上面的提示信息出来之前，就尝试登录 RAGFlow，你的浏览器有可能会提示 `network anormal` 或 `网络异常`。
+   > 如果您在没有看到上面的提示信息出来之前，就尝试登录 RAGFlow，你的浏览器有可能会提示 `network abnormal` 或 `网络异常`。

 5. 在你的浏览器中输入你的服务器对应的 IP 地址并登录 RAGFlow。
   > 上面这个例子中，您只需输入 http://IP_OF_YOUR_MACHINE 即可：未改动过配置则无需输入端口（默认的 HTTP 服务端口 80）。
@@ -298,12 +306,21 @@ cd ragflow/
 docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
 ```

+如果您处在代理环境下，可以传递代理参数：
+
+```bash
+docker build --platform linux/amd64 \
+  --build-arg http_proxy=http://YOUR_PROXY:PORT \
+  --build-arg https_proxy=http://YOUR_PROXY:PORT \
+  -f Dockerfile -t infiniflow/ragflow:nightly .
+```
+
 ## 🔨 以源代码启动服务

-1. 安装 `uv` 和 `pre-commit`。如已经安装，可跳过本步骤：
+1. 安装 `uv`。如已经安装，可跳过本步骤：

   ```bash
-   pipx install uv pre-commit
+   pipx install uv
   export UV_INDEX=https://mirrors.aliyun.com/pypi/simple
   ```

@@ -312,9 +329,9 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
   ```bash
   git clone https://github.com/infiniflow/ragflow.git
   cd ragflow/
-   uv sync --python 3.10 # install RAGFlow dependent python modules
-   uv run download_deps.py
-   pre-commit install
+   uv sync --python 3.13 # install RAGFlow dependent python modules
+   uv run python3 ragflow_deps/download_deps.py
+   lefthook install
   ```

 3. 通过 Docker Compose 启动依赖的服务（MinIO, Elasticsearch, Redis, and MySQL）：
@@ -382,19 +399,19 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly
 - [Quickstart](https://ragflow.io/docs/dev/)
 - [Configuration](https://ragflow.io/docs/dev/configurations)
 - [Release notes](https://ragflow.io/docs/dev/release_notes)
- [User guides](https://ragflow.io/docs/dev/category/guides)
- [Developer guides](https://ragflow.io/docs/dev/category/developers)
+- [User guides](https://ragflow.io/docs/category/user-guides)
+- [Developer guides](https://ragflow.io/docs/category/developer-guides)
 - [References](https://ragflow.io/docs/dev/category/references)
 - [FAQs](https://ragflow.io/docs/dev/faq)

 ## 📜 路线图

-详见 [RAGFlow Roadmap 2025](https://github.com/infiniflow/ragflow/issues/4214) 。
+详见 [RAGFlow Roadmap 2026](https://github.com/infiniflow/ragflow/issues/12241) 。

 ## 🏄 开源社区

- [Discord](https://discord.gg/zd4qPW6t)
- [Twitter](https://twitter.com/infiniflowai)
+- [Discord](https://discord.gg/NjYzJD3GM3)
+- [X](https://x.com/infiniflowai)
 - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)

 ## 🙌 贡献指南
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -6,8 +6,8 @@ Use this section to tell people about which versions of your project are
 currently being supported with security updates.

 | Version | Supported          |
-| ------- | ------------------ |
-| <=0.7.0   | :white_check_mark: |
+|---------|--------------------|
+| <=0.7.0 | :white_check_mark: |

 ## Reporting a Vulnerability

--- a/admin/build_cli_release.sh
+++ b/admin/build_cli_release.sh
@@ -21,7 +21,7 @@ cp pyproject.toml release/$PROJECT_NAME/pyproject.toml
 cp README.md release/$PROJECT_NAME/README.md

 mkdir release/$PROJECT_NAME/$SOURCE_DIR/$PACKAGE_DIR -p
-cp admin_client.py release/$PROJECT_NAME/$SOURCE_DIR/$PACKAGE_DIR/admin_client.py
+cp ragflow_cli.py release/$PROJECT_NAME/$SOURCE_DIR/$PACKAGE_DIR/ragflow_cli.py

 if [ -d "release/$PROJECT_NAME/$SOURCE_DIR" ]; then
    echo "✅ source dir: release/$PROJECT_NAME/$SOURCE_DIR"
--- a/admin/client/COMMAND.md
+++ b/admin/client/COMMAND.md
@@ -0,0 +1,779 @@
+# RAGFlow CLI User Command Reference
+
+This document describes the user commands available in RAGFlow CLI. All commands must end with a semicolon (`;`).
+
+## Command List
+
+### ping_server
+
+**Description**  
+Tests the connection status to the server.
+
+**Usage**  
+```
+PING;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> PING;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### show_current_user
+
+**Description**  
+Displays information about the currently logged-in user.
+
+**Usage**  
+```
+SHOW CURRENT USER;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> SHOW CURRENT USER;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### create_model_provider
+
+**Description**  
+Creates a new model provider.
+
+**Usage**  
+```
+CREATE MODEL PROVIDER <provider_name> <provider_key>;
+```
+
+**Parameters**  
+- `provider_name`: Provider name, quoted string.
+- `provider_key`: Provider key, quoted string.
+
+**Example**  
+```
+ragflow> CREATE MODEL PROVIDER 'openai' 'sk-...';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### drop_model_provider
+
+**Description**  
+Deletes a model provider.
+
+**Usage**  
+```
+DROP MODEL PROVIDER <provider_name>;
+```
+
+**Parameters**  
+- `provider_name`: Name of the provider to delete, quoted string.
+
+**Example**  
+```
+ragflow> DROP MODEL PROVIDER 'openai';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### set_default_llm
+
+**Description**  
+Sets the default LLM (Large Language Model).
+
+**Usage**  
+```
+SET DEFAULT LLM <llm_id>;
+```
+
+**Parameters**  
+- `llm_id`: LLM identifier, quoted string.
+
+**Example**  
+```
+ragflow> SET DEFAULT LLM 'gpt-4';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### set_default_vlm
+
+**Description**  
+Sets the default VLM (Vision Language Model).
+
+**Usage**  
+```
+SET DEFAULT VLM <vlm_id>;
+```
+
+**Parameters**  
+- `vlm_id`: VLM identifier, quoted string.
+
+**Example**  
+```
+ragflow> SET DEFAULT VLM 'clip-vit-large';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### set_default_embedding
+
+**Description**  
+Sets the default embedding model.
+
+**Usage**  
+```
+SET DEFAULT EMBEDDING <embedding_id>;
+```
+
+**Parameters**  
+- `embedding_id`: Embedding model identifier, quoted string.
+
+**Example**  
+```
+ragflow> SET DEFAULT EMBEDDING 'text-embedding-ada-002';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### set_default_reranker
+
+**Description**  
+Sets the default reranker model.
+
+**Usage**  
+```
+SET DEFAULT RERANKER <reranker_id>;
+```
+
+**Parameters**  
+- `reranker_id`: Reranker model identifier, quoted string.
+
+**Example**  
+```
+ragflow> SET DEFAULT RERANKER 'bge-reranker-large';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### set_default_asr
+
+**Description**  
+Sets the default ASR (Automatic Speech Recognition) model.
+
+**Usage**  
+```
+SET DEFAULT ASR <asr_id>;
+```
+
+**Parameters**  
+- `asr_id`: ASR model identifier, quoted string.
+
+**Example**  
+```
+ragflow> SET DEFAULT ASR 'whisper-large';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### set_default_tts
+
+**Description**  
+Sets the default TTS (Text-to-Speech) model.
+
+**Usage**  
+```
+SET DEFAULT TTS <tts_id>;
+```
+
+**Parameters**  
+- `tts_id`: TTS model identifier, quoted string.
+
+**Example**  
+```
+ragflow> SET DEFAULT TTS 'tts-1';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### reset_default_llm
+
+**Description**  
+Resets the default LLM to the system default.
+
+**Usage**  
+```
+RESET DEFAULT LLM;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> RESET DEFAULT LLM;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### reset_default_vlm
+
+**Description**  
+Resets the default VLM to the system default.
+
+**Usage**  
+```
+RESET DEFAULT VLM;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> RESET DEFAULT VLM;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### reset_default_embedding
+
+**Description**  
+Resets the default embedding model to the system default.
+
+**Usage**  
+```
+RESET DEFAULT EMBEDDING;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> RESET DEFAULT EMBEDDING;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### reset_default_reranker
+
+**Description**  
+Resets the default reranker model to the system default.
+
+**Usage**  
+```
+RESET DEFAULT RERANKER;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> RESET DEFAULT RERANKER;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### reset_default_asr
+
+**Description**  
+Resets the default ASR model to the system default.
+
+**Usage**  
+```
+RESET DEFAULT ASR;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> RESET DEFAULT ASR;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### reset_default_tts
+
+**Description**  
+Resets the default TTS model to the system default.
+
+**Usage**  
+```
+RESET DEFAULT TTS;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> RESET DEFAULT TTS;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### create_user_dataset_with_parser
+
+**Description**  
+Creates a user dataset with the specified parser.
+
+**Usage**  
+```
+CREATE DATASET <dataset_name> WITH EMBEDDING <embedding> PARSER <parser_type>;
+```
+
+**Parameters**  
+- `dataset_name`: Dataset name, quoted string.
+- `embedding`: Embedding model name, quoted string.
+- `parser_type`: Parser type, quoted string.
+
+**Example**  
+```
+ragflow> CREATE DATASET 'my_dataset' WITH EMBEDDING 'text-embedding-ada-002' PARSER 'pdf';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### create_user_dataset_with_pipeline
+
+**Description**  
+Creates a user dataset with the specified pipeline.
+
+**Usage**  
+```
+CREATE DATASET <dataset_name> WITH EMBEDDING <embedding> PIPELINE <pipeline>;
+```
+
+**Parameters**  
+- `dataset_name`: Dataset name, quoted string.
+- `embedding`: Embedding model name, quoted string.
+- `pipeline`: Pipeline name, quoted string.
+
+**Example**  
+```
+ragflow> CREATE DATASET 'my_dataset' WITH EMBEDDING 'text-embedding-ada-002' PIPELINE 'standard';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### drop_user_dataset
+
+**Description**  
+Deletes a user dataset.
+
+**Usage**  
+```
+DROP DATASET <dataset_name>;
+```
+
+**Parameters**  
+- `dataset_name`: Name of the dataset to delete, quoted string.
+
+**Example**  
+```
+ragflow> DROP DATASET 'my_dataset';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### list_user_datasets
+
+**Description**  
+Lists all datasets for the current user.
+
+**Usage**  
+```
+LIST DATASETS;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> LIST DATASETS;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### list_user_dataset_files
+
+**Description**  
+Lists all files in the specified dataset.
+
+**Usage**  
+```
+LIST FILES OF DATASET <dataset_name>;
+```
+
+**Parameters**  
+- `dataset_name`: Dataset name, quoted string.
+
+**Example**  
+```
+ragflow> LIST FILES OF DATASET 'my_dataset';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### list_user_agents
+
+**Description**  
+Lists all agents for the current user.
+
+**Usage**  
+```
+LIST AGENTS;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> LIST AGENTS;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### list_user_chats
+
+**Description**  
+Lists all chat sessions for the current user.
+
+**Usage**  
+```
+LIST CHATS;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> LIST CHATS;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### create_user_chat
+
+**Description**  
+Creates a new chat session.
+
+**Usage**  
+```
+CREATE CHAT <chat_name>;
+```
+
+**Parameters**  
+- `chat_name`: Chat session name, quoted string.
+
+**Example**  
+```
+ragflow> CREATE CHAT 'my_chat';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### drop_user_chat
+
+**Description**  
+Deletes a chat session.
+
+**Usage**  
+```
+DROP CHAT <chat_name>;
+```
+
+**Parameters**  
+- `chat_name`: Name of the chat session to delete, quoted string.
+
+**Example**  
+```
+ragflow> DROP CHAT 'my_chat';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### list_user_model_providers
+
+**Description**  
+Lists all model providers for the current user.
+
+**Usage**  
+```
+LIST MODEL PROVIDERS;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> LIST MODEL PROVIDERS;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### list_user_default_models
+
+**Description**  
+Lists all default model settings for the current user.
+
+**Usage**  
+```
+LIST DEFAULT MODELS;
+```
+
+**Parameters**  
+No parameters.
+
+**Example**  
+```
+ragflow> LIST DEFAULT MODELS;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### import_docs_into_dataset
+
+**Description**  
+Imports documents into the specified dataset.
+
+**Usage**  
+```
+IMPORT <document_list> INTO DATASET <dataset_name>;
+```
+
+**Parameters**  
+- `document_list`: List of document paths. Separate multiple paths with commas, or pass them as a space-separated quoted string.
+- `dataset_name`: Target dataset name, quoted string.
+
+**Example**  
+```
+ragflow> IMPORT '/path/to/doc1.pdf,/path/to/doc2.pdf' INTO DATASET 'my_dataset';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### search_on_datasets
+
+**Description**  
+Searches in one or more specified datasets.
+
+**Usage**  
+```
+SEARCH <question> ON DATASETS <dataset_list>;
+```
+
+**Parameters**  
+- `question`: Search question, quoted string.
+- `dataset_list`: List of dataset names. Separate multiple names with commas, or pass them as a space-separated quoted string.
+
+**Example**  
+```
+ragflow> SEARCH 'What is RAG?' ON DATASETS 'dataset1,dataset2';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### parse_dataset_docs
+
+**Description**  
+Parses specified documents in a dataset.
+
+**Usage**  
+```
+PARSE <document_names> OF DATASET <dataset_name>;
+```
+
+**Parameters**  
+- `document_names`: List of document names. Separate multiple names with commas, or pass them as a space-separated quoted string.
+- `dataset_name`: Dataset name, quoted string.
+
+**Example**  
+```
+ragflow> PARSE 'doc1.pdf,doc2.pdf' OF DATASET 'my_dataset';
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### parse_dataset_sync
+
+**Description**  
+Synchronously parses the entire dataset.
+
+**Usage**  
+```
+PARSE DATASET <dataset_name> SYNC;
+```
+
+**Parameters**  
+- `dataset_name`: Dataset name, quoted string.
+
+**Example**  
+```
+ragflow> PARSE DATASET 'my_dataset' SYNC;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### parse_dataset_async
+
+**Description**  
+Asynchronously parses the entire dataset.
+
+**Usage**  
+```
+PARSE DATASET <dataset_name> ASYNC;
+```
+
+**Parameters**  
+- `dataset_name`: Dataset name, quoted string.
+
+**Example**  
+```
+ragflow> PARSE DATASET 'my_dataset' ASYNC;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+### benchmark
+
+**Description**  
+Runs a performance benchmark on the specified user command.
+
+**Usage**  
+```
+BENCHMARK <concurrency> <iterations> <user_command>;
+```
+
+**Parameters**  
+- `concurrency`: Concurrency level, positive integer.
+- `iterations`: Number of iterations, positive integer.
+- `user_command`: User command to test (must be a valid user command, such as `PING;`).
+
+**Example**  
+```
+ragflow> BENCHMARK 5 10 PING;
+```
+
+**Display Effect**  
+(Sample output will be provided by the user)
+
+---
+
+**Notes**  
+- All string parameters (such as names, IDs, paths) must be enclosed in single quotes (`'`) or double quotes (`"`).
+- Commands must end with a semicolon (`;`).
+- The prompt is `ragflow>`.
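
The quoting and terminator rules in the notes above can be checked mechanically. The following is a minimal sketch in Python that assumes only what the notes state. `TOKEN` and `split_command` are hypothetical names, not part of ragflow-cli, and the real client may tokenize differently (for example, it may accept escape sequences inside quoted strings, which this sketch does not).

```python
import re

# Hypothetical helper, not part of ragflow-cli.
# A token is a single-quoted string, a double-quoted string, or a bare word.
TOKEN = re.compile(r"""'[^']*'|"[^"]*"|[^\s'";]+""")


def split_command(line: str) -> list[str]:
    """Split one command into tokens, enforcing the documented conventions."""
    text = line.strip()
    if not text.endswith(";"):
        raise ValueError("command must end with a semicolon")
    body = text[:-1]  # drop the terminating semicolon

    tokens = []
    pos = 0
    while pos < len(body):
        if body[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(body, pos)
        if match is None:
            # Typically an unterminated quote or a stray semicolon.
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens
```

Quoted strings are returned with their quotes intact, so a caller can still tell a string parameter from a keyword. A semicolon inside quotes is treated as ordinary text, while one outside quotes anywhere but the end is rejected.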
--- a/admin/client/README.md
+++ b/admin/client/README.md
@@ -4,7 +4,7 @@

 Admin Service is a dedicated management component designed to monitor, maintain, and administrate the RAGFlow system. It provides comprehensive tools for ensuring system stability, performing operational tasks, and managing users and permissions efficiently.

-The service offers real-time monitoring of critical components, including the RAGFlow server, Task Executor processes, and dependent services such as MySQL, Elasticsearch, Redis, and MinIO. It automatically checks their health status, resource usage, and uptime, and performs restarts in case of failures to minimize downtime.
+The service offers real-time monitoring of critical components, including the RAGFlow server, Task Executor processes, and dependent services such as MySQL, Infinity, Elasticsearch, Redis, and MinIO. It automatically checks their health status, resource usage, and uptime, and performs restarts in case of failures to minimize downtime.

 For user and system management, it supports listing, creating, modifying, and deleting users and their associated resources like knowledge bases and Agents.

@@ -48,7 +48,7 @@ It consists of a server-side Service and a command-line client (CLI), both imple
 1.  Ensure the Admin Service is running.
 2.  Install ragflow-cli.
    ```bash
-    pip install ragflow-cli==0.22.0
+    pip install ragflow-cli==0.26.2
    ```
 3.  Launch the CLI client:
    ```bash
--- a/admin/client/admin_client.py
+++ b/admin/client/admin_client.py
@@ -1,975 +0,0 @@
-#
-#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
-#
-#  Licensed under the Apache License, Version 2.0 (the "License");
-#  you may not use this file except in compliance with the License.
-#  You may obtain a copy of the License at
-#
-#      http://www.apache.org/licenses/LICENSE-2.0
-#
-#  Unless required by applicable law or agreed to in writing, software
-#  distributed under the License is distributed on an "AS IS" BASIS,
-#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#  See the License for the specific language governing permissions and
-#  limitations under the License.
-#
-
-import argparse
-import base64
-from cmd import Cmd
-
-from Cryptodome.PublicKey import RSA
-from Cryptodome.Cipher import PKCS1_v1_5 as Cipher_pkcs1_v1_5
-from typing import Dict, List, Any
-from lark import Lark, Transformer, Tree
-import requests
-import getpass
-
-GRAMMAR = r"""
-start: command
-
-command: sql_command | meta_command
-
-sql_command: list_services
-           | show_service
-           | startup_service
-           | shutdown_service
-           | restart_service
-           | list_users
-           | show_user
-           | drop_user
-           | alter_user
-           | create_user
-           | activate_user
-           | list_datasets
-           | list_agents
-           | create_role
-           | drop_role
-           | alter_role
-           | list_roles
-           | show_role
-           | grant_permission
-           | revoke_permission
-           | alter_user_role
-           | show_user_permission
-           | show_version
-
-// meta command definition
-meta_command: "\\" meta_command_name [meta_args]
-
-meta_command_name: /[a-zA-Z?]+/
-meta_args: (meta_arg)+
-
-meta_arg: /[^\\s"']+/ | quoted_string
-
-// command definition
-
-LIST: "LIST"i
-SERVICES: "SERVICES"i
-SHOW: "SHOW"i
-CREATE: "CREATE"i
-SERVICE: "SERVICE"i
-SHUTDOWN: "SHUTDOWN"i
-STARTUP: "STARTUP"i
-RESTART: "RESTART"i
-USERS: "USERS"i
-DROP: "DROP"i
-USER: "USER"i
-ALTER: "ALTER"i
-ACTIVE: "ACTIVE"i
-PASSWORD: "PASSWORD"i
-DATASETS: "DATASETS"i
-OF: "OF"i
-AGENTS: "AGENTS"i
-ROLE: "ROLE"i
-ROLES: "ROLES"i
-DESCRIPTION: "DESCRIPTION"i
-GRANT: "GRANT"i
-REVOKE: "REVOKE"i
-ALL: "ALL"i
-PERMISSION: "PERMISSION"i
-TO: "TO"i
-FROM: "FROM"i
-FOR: "FOR"i
-RESOURCES: "RESOURCES"i
-ON: "ON"i
-SET: "SET"i
-VERSION: "VERSION"i
-
-list_services: LIST SERVICES ";"
-show_service: SHOW SERVICE NUMBER ";"
-startup_service: STARTUP SERVICE NUMBER ";"
-shutdown_service: SHUTDOWN SERVICE NUMBER ";"
-restart_service: RESTART SERVICE NUMBER ";"
-
-list_users: LIST USERS ";"
-drop_user: DROP USER quoted_string ";"
-alter_user: ALTER USER PASSWORD quoted_string quoted_string ";"
-show_user: SHOW USER quoted_string ";"
-create_user: CREATE USER quoted_string quoted_string ";"
-activate_user: ALTER USER ACTIVE quoted_string status ";"
-
-list_datasets: LIST DATASETS OF quoted_string ";"
-list_agents: LIST AGENTS OF quoted_string ";"
-
-create_role: CREATE ROLE identifier [DESCRIPTION quoted_string] ";"
-drop_role: DROP ROLE identifier ";"
-alter_role: ALTER ROLE identifier SET DESCRIPTION quoted_string ";"
-list_roles: LIST ROLES ";"
-show_role: SHOW ROLE identifier ";"
-
-grant_permission: GRANT action_list ON identifier TO ROLE identifier ";"
-revoke_permission: REVOKE action_list ON identifier FROM ROLE identifier ";"
-alter_user_role: ALTER USER quoted_string SET ROLE identifier ";"
-show_user_permission: SHOW USER PERMISSION quoted_string ";"
-
-show_version: SHOW VERSION ";"
-
-action_list: identifier ("," identifier)*
-
-identifier: WORD
-quoted_string: QUOTED_STRING
-status: WORD
-
-QUOTED_STRING: /'[^']+'/ | /"[^"]+"/
-WORD: /[a-zA-Z0-9_\-\.]+/
-NUMBER: /[0-9]+/
-
-%import common.WS
-%ignore WS
-"""
-
-
-class AdminTransformer(Transformer):
-
-    def start(self, items):
-        return items[0]
-
-    def command(self, items):
-        return items[0]
-
-    def list_services(self, items):
-        result = {'type': 'list_services'}
-        return result
-
-    def show_service(self, items):
-        service_id = int(items[2])
-        return {"type": "show_service", "number": service_id}
-
-    def startup_service(self, items):
-        service_id = int(items[2])
-        return {"type": "startup_service", "number": service_id}
-
-    def shutdown_service(self, items):
-        service_id = int(items[2])
-        return {"type": "shutdown_service", "number": service_id}
-
-    def restart_service(self, items):
-        service_id = int(items[2])
-        return {"type": "restart_service", "number": service_id}
-
-    def list_users(self, items):
-        return {"type": "list_users"}
-
-    def show_user(self, items):
-        user_name = items[2]
-        return {"type": "show_user", "user_name": user_name}
-
-    def drop_user(self, items):
-        user_name = items[2]
-        return {"type": "drop_user", "user_name": user_name}
-
-    def alter_user(self, items):
-        user_name = items[3]
-        new_password = items[4]
-        return {"type": "alter_user", "user_name": user_name, "password": new_password}
-
-    def create_user(self, items):
-        user_name = items[2]
-        password = items[3]
-        return {"type": "create_user", "user_name": user_name, "password": password, "role": "user"}
-
-    def activate_user(self, items):
-        user_name = items[3]
-        activate_status = items[4]
-        return {"type": "activate_user", "activate_status": activate_status, "user_name": user_name}
-
-    def list_datasets(self, items):
-        user_name = items[3]
-        return {"type": "list_datasets", "user_name": user_name}
-
-    def list_agents(self, items):
-        user_name = items[3]
-        return {"type": "list_agents", "user_name": user_name}
-
-    def create_role(self, items):
-        role_name = items[2]
-        if len(items) > 4:
-            description = items[4]
-            return {"type": "create_role", "role_name": role_name, "description": description}
-        else:
-            return {"type": "create_role", "role_name": role_name}
-
-    def drop_role(self, items):
-        role_name = items[2]
-        return {"type": "drop_role", "role_name": role_name}
-
-    def alter_role(self, items):
-        role_name = items[2]
-        description = items[5]
-        return {"type": "alter_role", "role_name": role_name, "description": description}
-
-    def list_roles(self, items):
-        return {"type": "list_roles"}
-
-    def show_role(self, items):
-        role_name = items[2]
-        return {"type": "show_role", "role_name": role_name}
-
-    def grant_permission(self, items):
-        action_list = items[1]
-        resource = items[3]
-        role_name = items[6]
-        return {"type": "grant_permission", "role_name": role_name, "resource": resource, "actions": action_list}
-
-    def revoke_permission(self, items):
-        action_list = items[1]
-        resource = items[3]
-        role_name = items[6]
-        return {
-            "type": "revoke_permission",
-            "role_name": role_name,
-            "resource": resource, "actions": action_list
-        }
-
-    def alter_user_role(self, items):
-        user_name = items[2]
-        role_name = items[5]
-        return {"type": "alter_user_role", "user_name": user_name, "role_name": role_name}
-
-    def show_user_permission(self, items):
-        user_name = items[3]
-        return {"type": "show_user_permission", "user_name": user_name}
-
-    def show_version(self, items):
-        return {"type": "show_version"}
-
-    def action_list(self, items):
-        return items
-
-    def meta_command(self, items):
-        command_name = str(items[0]).lower()
-        args = items[1:] if len(items) > 1 else []
-
-        # handle quoted parameter
-        parsed_args = []
-        for arg in args:
-            if hasattr(arg, 'value'):
-                parsed_args.append(arg.value)
-            else:
-                parsed_args.append(str(arg))
-
-        return {'type': 'meta', 'command': command_name, 'args': parsed_args}
-
-    def meta_command_name(self, items):
-        return items[0]
-
-    def meta_args(self, items):
-        return items
-
-
-def encrypt(input_string):
-    pub = '-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArq9XTUSeYr2+N1h3Afl/z8Dse/2yD0ZGrKwx+EEEcdsBLca9Ynmx3nIB5obmLlSfmskLpBo0UACBmB5rEjBp2Q2f3AG3Hjd4B+gNCG6BDaawuDlgANIhGnaTLrIqWrrcm4EMzJOnAOI1fgzJRsOOUEfaS318Eq9OVO3apEyCCt0lOQK6PuksduOjVxtltDav+guVAA068NrPYmRNabVKRNLJpL8w4D44sfth5RvZ3q9t+6RTArpEtc5sh5ChzvqPOzKGMXW83C95TxmXqpbK6olN4RevSfVjEAgCydH6HN6OhtOQEcnrU97r9H0iZOWwbw3pVrZiUkuRD1R56Wzs2wIDAQAB\n-----END PUBLIC KEY-----'
-    pub_key = RSA.importKey(pub)
-    cipher = Cipher_pkcs1_v1_5.new(pub_key)
-    cipher_text = cipher.encrypt(base64.b64encode(input_string.encode('utf-8')))
-    return base64.b64encode(cipher_text).decode("utf-8")
-
-
-def encode_to_base64(input_string):
-    base64_encoded = base64.b64encode(input_string.encode('utf-8'))
-    return base64_encoded.decode('utf-8')
-
-
-class AdminCLI(Cmd):
-    def __init__(self):
-        super().__init__()
-        self.parser = Lark(GRAMMAR, start='start', parser='lalr', transformer=AdminTransformer())
-        self.command_history = []
-        self.is_interactive = False
-        self.admin_account = "admin@ragflow.io"
-        self.admin_password: str = "admin"
-        self.session = requests.Session()
-        self.access_token: str = ""
-        self.host: str = ""
-        self.port: int = 0
-
-    intro = r"""Type "\h" for help."""
-    prompt = "admin> "
-
-    def onecmd(self, command: str) -> bool:
-        try:
-            result = self.parse_command(command)
-
-            if isinstance(result, dict):
-                if 'type' in result and result.get('type') == 'empty':
-                    return False
-
-            self.execute_command(result)
-
-            if isinstance(result, Tree):
-                return False
-
-            if result.get('type') == 'meta' and result.get('command') in ['q', 'quit', 'exit']:
-                return True
-
-        except KeyboardInterrupt:
-            print("\nUse '\\q' to quit")
-        except EOFError:
-            print("\nGoodbye!")
-            return True
-        return False
-
-    def emptyline(self) -> bool:
-        return False
-
-    def default(self, line: str) -> bool:
-        return self.onecmd(line)
-
-    def parse_command(self, command_str: str) -> dict[str, str]:
-        if not command_str.strip():
-            return {'type': 'empty'}
-
-        self.command_history.append(command_str)
-
-        try:
-            result = self.parser.parse(command_str)
-            return result
-        except Exception as e:
-            return {'type': 'error', 'message': f'Parse error: {str(e)}'}
-
-    def verify_admin(self, arguments: dict, single_command: bool):
-        self.host = arguments['host']
-        self.port = arguments['port']
-        print(f"Attempt to access ip: {self.host}, port: {self.port}")
-        url = f"http://{self.host}:{self.port}/api/v1/admin/login"
-
-        attempt_count = 3
-        if single_command:
-            attempt_count = 1
-
-        try_count = 0
-        while True:
-            try_count += 1
-            if try_count > attempt_count:
-                return False
-
-            if single_command:
-                admin_passwd = arguments['password']
-            else:
-                admin_passwd = getpass.getpass(f"password for {self.admin_account}: ").strip()
-            try:
-                self.admin_password = encrypt(admin_passwd)
-                response = self.session.post(url, json={'email': self.admin_account, 'password': self.admin_password})
-                if response.status_code == 200:
-                    res_json = response.json()
-                    error_code = res_json.get('code', -1)
-                    if error_code == 0:
-                        self.session.headers.update({
-                            'Content-Type': 'application/json',
-                            'Authorization': response.headers['Authorization'],
-                            'User-Agent': 'RAGFlow-CLI/0.22.0'
-                        })
-                        print("Authentication successful.")
-                        return True
-                    else:
-                        error_message = res_json.get('message', 'Unknown error')
-                        print(f"Authentication failed: {error_message}, try again")
-                        continue
-                else:
-                    print(f"Bad response，status: {response.status_code}, password is wrong")
-            except Exception as e:
-                print(str(e))
-                print(f"Can't access {self.host}, port: {self.port}")
-
-    def _format_service_detail_table(self, data):
-        if not any([isinstance(v, list) for v in data.values()]):
-            # normal table
-            return data
-        # handle task_executor heartbeats map, for example {'name': [{'done': 2, 'now': timestamp1}, {'done': 3, 'now': timestamp2}]
-        task_executor_list = []
-        for k, v in data.items():
-            # display latest status
-            heartbeats = sorted(v, key=lambda x: x["now"], reverse=True)
-            task_executor_list.append({
-                "task_executor_name": k,
-                **heartbeats[0],
-            })
-        return task_executor_list
-
-    def _print_table_simple(self, data):
-        if not data:
-            print("No data to print")
-            return
-        if isinstance(data, dict):
-            # handle single row data
-            data = [data]
-
-        columns = list(data[0].keys())
-        col_widths = {}
-
-        def get_string_width(text):
-            half_width_chars = (
-                " !\"#$%&'()*+,-./0123456789:;<=>?@"
-                "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`"
-                "abcdefghijklmnopqrstuvwxyz{|}~"
-                "\t\n\r"
-            )
-            width = 0
-            for char in text:
-                if char in half_width_chars:
-                    width += 1
-                else:
-                    width += 2
-            return width
-
-        for col in columns:
-            max_width = get_string_width(str(col))
-            for item in data:
-                value_len = get_string_width(str(item.get(col, '')))
-                if value_len > max_width:
-                    max_width = value_len
-            col_widths[col] = max(2, max_width)
-
-        # Generate delimiter
-        separator = "+" + "+".join(["-" * (col_widths[col] + 2) for col in columns]) + "+"
-
-        # Print header
-        print(separator)
-        header = "|" + "|".join([f" {col:<{col_widths[col]}} " for col in columns]) + "|"
-        print(header)
-        print(separator)
-
-        # Print data
-        for item in data:
-            row = "|"
-            for col in columns:
-                value = str(item.get(col, ''))
-                if get_string_width(value) > col_widths[col]:
-                    value = value[:col_widths[col] - 3] + "..."
-                row += f" {value:<{col_widths[col] - (get_string_width(value) - len(value))}} |"
-            print(row)
-
-        print(separator)
-
-    def run_interactive(self):
-
-        self.is_interactive = True
-        print("RAGFlow Admin command line interface - Type '\\?' for help, '\\q' to quit")
-
-        while True:
-            try:
-                command = input("admin> ").strip()
-                if not command:
-                    continue
-
-                print(f"command: {command}")
-                result = self.parse_command(command)
-                self.execute_command(result)
-
-                if isinstance(result, Tree):
-                    continue
-
-                if result.get('type') == 'meta' and result.get('command') in ['q', 'quit', 'exit']:
-                    break
-
-            except KeyboardInterrupt:
-                print("\nUse '\\q' to quit")
-            except EOFError:
-                print("\nGoodbye!")
-                break
-
-    def run_single_command(self, command: str):
-        result = self.parse_command(command)
-        self.execute_command(result)
-
-    def parse_connection_args(self, args: List[str]) -> Dict[str, Any]:
-        parser = argparse.ArgumentParser(description='Admin CLI Client', add_help=False)
-        parser.add_argument('-h', '--host', default='localhost', help='Admin service host')
-        parser.add_argument('-p', '--port', type=int, default=9381, help='Admin service port')
-        parser.add_argument('-w', '--password', default='admin', type=str, help='Superuser password')
-        parser.add_argument('command', nargs='?', help='Single command')
-        try:
-            parsed_args, remaining_args = parser.parse_known_args(args)
-            if remaining_args:
-                command = remaining_args[0]
-                return {
-                    'host': parsed_args.host,
-                    'port': parsed_args.port,
-                    'password': parsed_args.password,
-                    'command': command
-                }
-            else:
-                return {
-                    'host': parsed_args.host,
-                    'port': parsed_args.port,
-                }
-        except SystemExit:
-            return {'error': 'Invalid connection arguments'}
-
-    def execute_command(self, parsed_command: Dict[str, Any]):
-
-        command_dict: dict
-        if isinstance(parsed_command, Tree):
-            command_dict = parsed_command.children[0]
-        else:
-            if parsed_command['type'] == 'error':
-                print(f"Error: {parsed_command['message']}")
-                return
-            else:
-                command_dict = parsed_command
-
-        # print(f"Parsed command: {command_dict}")
-
-        command_type = command_dict['type']
-
-        match command_type:
-            case 'list_services':
-                self._handle_list_services(command_dict)
-            case 'show_service':
-                self._handle_show_service(command_dict)
-            case 'restart_service':
-                self._handle_restart_service(command_dict)
-            case 'shutdown_service':
-                self._handle_shutdown_service(command_dict)
-            case 'startup_service':
-                self._handle_startup_service(command_dict)
-            case 'list_users':
-                self._handle_list_users(command_dict)
-            case 'show_user':
-                self._handle_show_user(command_dict)
-            case 'drop_user':
-                self._handle_drop_user(command_dict)
-            case 'alter_user':
-                self._handle_alter_user(command_dict)
-            case 'create_user':
-                self._handle_create_user(command_dict)
-            case 'activate_user':
-                self._handle_activate_user(command_dict)
-            case 'list_datasets':
-                self._handle_list_datasets(command_dict)
-            case 'list_agents':
-                self._handle_list_agents(command_dict)
-            case 'create_role':
-                self._create_role(command_dict)
-            case 'drop_role':
-                self._drop_role(command_dict)
-            case 'alter_role':
-                self._alter_role(command_dict)
-            case 'list_roles':
-                self._list_roles(command_dict)
-            case 'show_role':
-                self._show_role(command_dict)
-            case 'grant_permission':
-                self._grant_permission(command_dict)
-            case 'revoke_permission':
-                self._revoke_permission(command_dict)
-            case 'alter_user_role':
-                self._alter_user_role(command_dict)
-            case 'show_user_permission':
-                self._show_user_permission(command_dict)
-            case 'show_version':
-                self._show_version(command_dict)
-            case 'meta':
-                self._handle_meta_command(command_dict)
-            case _:
-                print(f"Command '{command_type}' would be executed with API")
-
-    def _handle_list_services(self, command):
-        print("Listing all services")
-
-        url = f'http://{self.host}:{self.port}/api/v1/admin/services'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to get all services, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_show_service(self, command):
-        service_id: int = command['number']
-        print(f"Showing service: {service_id}")
-
-        url = f'http://{self.host}:{self.port}/api/v1/admin/services/{service_id}'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            res_data = res_json['data']
-            if 'status' in res_data and res_data['status'] == 'alive':
-                print(f"Service {res_data['service_name']} is alive, ")
-                if isinstance(res_data['message'], str):
-                    print(res_data['message'])
-                else:
-                    data = self._format_service_detail_table(res_data['message'])
-                    self._print_table_simple(data)
-            else:
-                print(f"Service {res_data['service_name']} is down, {res_data['message']}")
-        else:
-            print(f"Fail to show service, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_restart_service(self, command):
-        service_id: int = command['number']
-        print(f"Restart service {service_id}")
-
-    def _handle_shutdown_service(self, command):
-        service_id: int = command['number']
-        print(f"Shutdown service {service_id}")
-
-    def _handle_startup_service(self, command):
-        service_id: int = command['number']
-        print(f"Startup service {service_id}")
-
-    def _handle_list_users(self, command):
-        print("Listing all users")
-
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to get all users, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_show_user(self, command):
-        username_tree: Tree = command['user_name']
-        user_name: str = username_tree.children[0].strip("'\"")
-        print(f"Showing user: {user_name}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name}'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            table_data = res_json['data']
-            table_data.pop('avatar')
-            self._print_table_simple(table_data)
-        else:
-            print(f"Fail to get user {user_name}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_drop_user(self, command):
-        username_tree: Tree = command['user_name']
-        user_name: str = username_tree.children[0].strip("'\"")
-        print(f"Drop user: {user_name}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name}'
-        response = self.session.delete(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            print(res_json["message"])
-        else:
-            print(f"Fail to drop user, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_alter_user(self, command):
-        user_name_tree: Tree = command['user_name']
-        user_name: str = user_name_tree.children[0].strip("'\"")
-        password_tree: Tree = command['password']
-        password: str = password_tree.children[0].strip("'\"")
-        print(f"Alter user: {user_name}, password: {password}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name}/password'
-        response = self.session.put(url, json={'new_password': encrypt(password)})
-        res_json = response.json()
-        if response.status_code == 200:
-            print(res_json["message"])
-        else:
-            print(f"Fail to alter password, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_create_user(self, command):
-        user_name_tree: Tree = command['user_name']
-        user_name: str = user_name_tree.children[0].strip("'\"")
-        password_tree: Tree = command['password']
-        password: str = password_tree.children[0].strip("'\"")
-        role: str = command['role']
-        print(f"Create user: {user_name}, password: {password}, role: {role}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users'
-        response = self.session.post(
-            url,
-            json={'user_name': user_name, 'password': encrypt(password), 'role': role}
-        )
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to create user {user_name}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_activate_user(self, command):
-        user_name_tree: Tree = command['user_name']
-        user_name: str = user_name_tree.children[0].strip("'\"")
-        activate_tree: Tree = command['activate_status']
-        activate_status: str = activate_tree.children[0].strip("'\"")
-        if activate_status.lower() in ['on', 'off']:
-            print(f"Alter user {user_name} activate status, turn {activate_status.lower()}.")
-            url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name}/activate'
-            response = self.session.put(url, json={'activate_status': activate_status})
-            res_json = response.json()
-            if response.status_code == 200:
-                print(res_json["message"])
-            else:
-                print(f"Fail to alter activate status, code: {res_json['code']}, message: {res_json['message']}")
-        else:
-            print(f"Unknown activate status: {activate_status}.")
-
-    def _handle_list_datasets(self, command):
-        username_tree: Tree = command['user_name']
-        user_name: str = username_tree.children[0].strip("'\"")
-        print(f"Listing all datasets of user: {user_name}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name}/datasets'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            table_data = res_json['data']
-            for t in table_data:
-                t.pop('avatar')
-            self._print_table_simple(table_data)
-        else:
-            print(f"Fail to get all datasets of {user_name}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_list_agents(self, command):
-        username_tree: Tree = command['user_name']
-        user_name: str = username_tree.children[0].strip("'\"")
-        print(f"Listing all agents of user: {user_name}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name}/agents'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            table_data = res_json['data']
-            for t in table_data:
-                t.pop('avatar')
-            self._print_table_simple(table_data)
-        else:
-            print(f"Fail to get all agents of {user_name}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _create_role(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name: str = role_name_tree.children[0].strip("'\"")
-        desc_str: str = ''
-        if 'description' in command:
-            desc_tree: Tree = command['description']
-            desc_str = desc_tree.children[0].strip("'\"")
-
-        print(f"create role name: {role_name}, description: {desc_str}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles'
-        response = self.session.post(
-            url,
-            json={'role_name': role_name, 'description': desc_str}
-        )
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to create role {role_name}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _drop_role(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name: str = role_name_tree.children[0].strip("'\"")
-        print(f"drop role name: {role_name}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles/{role_name}'
-        response = self.session.delete(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to drop role {role_name}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _alter_role(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name: str = role_name_tree.children[0].strip("'\"")
-        desc_tree: Tree = command['description']
-        desc_str: str = desc_tree.children[0].strip("'\"")
-
-        print(f"alter role name: {role_name}, description: {desc_str}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles/{role_name}'
-        response = self.session.put(
-            url,
-            json={'description': desc_str}
-        )
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(
-                f"Fail to update role {role_name} with description: {desc_str}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _list_roles(self, command):
-        print("Listing all roles")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to list roles, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _show_role(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name: str = role_name_tree.children[0].strip("'\"")
-        print(f"show role: {role_name}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles/{role_name}/permission'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to list roles, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _grant_permission(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name_str: str = role_name_tree.children[0].strip("'\"")
-        resource_tree: Tree = command['resource']
-        resource_str: str = resource_tree.children[0].strip("'\"")
-        action_tree_list: list = command['actions']
-        actions: list = []
-        for action_tree in action_tree_list:
-            action_str: str = action_tree.children[0].strip("'\"")
-            actions.append(action_str)
-        print(f"grant role_name: {role_name_str}, resource: {resource_str}, actions: {actions}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles/{role_name_str}/permission'
-        response = self.session.post(
-            url,
-            json={'actions': actions, 'resource': resource_str}
-        )
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(
-                f"Fail to grant role {role_name_str} with {actions} on {resource_str}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _revoke_permission(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name_str: str = role_name_tree.children[0].strip("'\"")
-        resource_tree: Tree = command['resource']
-        resource_str: str = resource_tree.children[0].strip("'\"")
-        action_tree_list: list = command['actions']
-        actions: list = []
-        for action_tree in action_tree_list:
-            action_str: str = action_tree.children[0].strip("'\"")
-            actions.append(action_str)
-        print(f"revoke role_name: {role_name_str}, resource: {resource_str}, actions: {actions}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/roles/{role_name_str}/permission'
-        response = self.session.delete(
-            url,
-            json={'actions': actions, 'resource': resource_str}
-        )
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(
-                f"Fail to revoke role {role_name_str} with {actions} on {resource_str}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _alter_user_role(self, command):
-        role_name_tree: Tree = command['role_name']
-        role_name_str: str = role_name_tree.children[0].strip("'\"")
-        user_name_tree: Tree = command['user_name']
-        user_name_str: str = user_name_tree.children[0].strip("'\"")
-        print(f"alter_user_role user_name: {user_name_str}, role_name: {role_name_str}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name_str}/role'
-        response = self.session.put(
-            url,
-            json={'role_name': role_name_str}
-        )
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(
-                f"Fail to alter user: {user_name_str} to role {role_name_str}, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _show_user_permission(self, command):
-        user_name_tree: Tree = command['user_name']
-        user_name_str: str = user_name_tree.children[0].strip("'\"")
-        print(f"show_user_permission user_name: {user_name_str}")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/users/{user_name_str}/permission'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(
-                f"Fail to show user: {user_name_str} permission, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _show_version(self, command):
-        print("show_version")
-        url = f'http://{self.host}:{self.port}/api/v1/admin/version'
-        response = self.session.get(url)
-        res_json = response.json()
-        if response.status_code == 200:
-            self._print_table_simple(res_json['data'])
-        else:
-            print(f"Fail to show version, code: {res_json['code']}, message: {res_json['message']}")
-
-    def _handle_meta_command(self, command):
-        meta_command = command['command']
-        args = command.get('args', [])
-
-        if meta_command in ['?', 'h', 'help']:
-            self.show_help()
-        elif meta_command in ['q', 'quit', 'exit']:
-            print("Goodbye!")
-        else:
-            print(f"Meta command '{meta_command}' with args {args}")
-
-    def show_help(self):
-        """Help info"""
-        help_text = """
-Commands:
-  LIST SERVICES
-  SHOW SERVICE <service>
-  STARTUP SERVICE <service>
-  SHUTDOWN SERVICE <service>
-  RESTART SERVICE <service>
-  LIST USERS
-  SHOW USER <user>
-  DROP USER <user>
-  CREATE USER <user> <password>
-  ALTER USER PASSWORD <user> <new_password>
-  ALTER USER ACTIVE <user> <on/off>
-  LIST DATASETS OF <user>
-  LIST AGENTS OF <user>
-
-Meta Commands:
-  \\?, \\h, \\help     Show this help
-  \\q, \\quit, \\exit   Quit the CLI
-        """
-        print(help_text)
-
-
-def main():
-    import sys
-
-    cli = AdminCLI()
-
-    args = cli.parse_connection_args(sys.argv)
-    if 'error' in args:
-        print(f"Error: {args['error']}")
-        return
-
-    if 'command' in args:
-        if 'password' not in args:
-            print("Error: password is missing")
-            return
-        if cli.verify_admin(args, single_command=True):
-            command: str = args['command']
-            print(f"Run single command: {command}")
-            cli.run_single_command(command)
-    else:
-        if cli.verify_admin(args, single_command=False):
-            print(r"""
-                ____  ___   ______________                 ___       __          _     
-               / __ \/   | / ____/ ____/ /___ _      __   /   | ____/ /___ ___  (_)___ 
-              / /_/ / /| |/ / __/ /_  / / __ \ | /| / /  / /| |/ __  / __ `__ \/ / __ \
-             / _, _/ ___ / /_/ / __/ / / /_/ / |/ |/ /  / ___ / /_/ / / / / / / / / / /
-            /_/ |_/_/  |_\____/_/   /_/\____/|__/|__/  /_/  |_\__,_/_/ /_/ /_/_/_/ /_/ 
-            """)
-            cli.cmdloop()
-
-
-if __name__ == '__main__':
-    main()
--- a/admin/client/http_client.py
+++ b/admin/client/http_client.py
@@ -0,0 +1,182 @@
+#
+#  Copyright 2026 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import time
+import json
+import typing
+from typing import Any, Dict, Optional
+
+import requests
+# from requests.sessions import HTTPAdapter
+
+
+class HttpClient:
+    def __init__(
+            self,
+            host: str = "127.0.0.1",
+            port: int = 9381,
+            api_version: str = "v1",
+            api_key: Optional[str] = None,
+            connect_timeout: float = 5.0,
+            read_timeout: float = 60.0,
+            verify_ssl: bool = False,
+    ) -> None:
+        self.host = host
+        self.port = port
+        self.api_version = api_version
+        self.api_key = api_key
+        self.login_token: str | None = None
+        self.connect_timeout = connect_timeout
+        self.read_timeout = read_timeout
+        self.verify_ssl = verify_ssl
+
+    def api_base(self) -> str:
+        return f"{self.host}:{self.port}/api/{self.api_version}"
+
+    def non_api_base(self) -> str:
+        return f"{self.host}:{self.port}/{self.api_version}"
+
+    def build_url(self, path: str, use_api_base: bool = True) -> str:
+        base = self.api_base() if use_api_base else self.non_api_base()
+        if self.verify_ssl:
+            return f"https://{base}/{path.lstrip('/')}"
+        else:
+            return f"http://{base}/{path.lstrip('/')}"
+
+    def _headers(self, auth_kind: Optional[str], extra: Optional[Dict[str, str]]) -> Dict[str, str]:
+        headers = {}
+        if auth_kind == "api" and self.api_key:
+            headers["Authorization"] = f"Bearer {self.api_key}"
+        elif auth_kind == "web" and self.login_token:
+            headers["Authorization"] = self.login_token
+        elif auth_kind == "admin" and self.login_token:
+            headers["Authorization"] = self.login_token
+        else:
+            pass
+        if extra:
+            headers.update(extra)
+        return headers
+
+    def request(
+            self,
+            method: str,
+            path: str,
+            *,
+            use_api_base: bool = True,
+            auth_kind: Optional[str] = "api",
+            headers: Optional[Dict[str, str]] = None,
+            json_body: Optional[Dict[str, Any]] = None,
+            data: Any = None,
+            files: Any = None,
+            params: Optional[Dict[str, Any]] = None,
+            stream: bool = False,
+            iterations: int = 1,
+    ) -> requests.Response | dict:
+        url = self.build_url(path, use_api_base=use_api_base)
+        merged_headers = self._headers(auth_kind, headers)
+        # timeout: Tuple[float, float] = (self.connect_timeout, self.read_timeout)
+        session = requests.Session()
+        # adapter = HTTPAdapter(pool_connections=100, pool_maxsize=100)
+        # session.mount("http://", adapter)
+        http_function = typing.Any
+        match method:
+            case "GET":
+                http_function = session.get
+            case "POST":
+                http_function = session.post
+            case "PUT":
+                http_function = session.put
+            case "DELETE":
+                http_function = session.delete
+            case "PATCH":
+                http_function = session.patch
+            case _:
+                raise ValueError(f"Invalid HTTP method: {method}")
+
+        if iterations > 1:
+            response_list = []
+            total_duration = 0.0
+            for _ in range(iterations):
+                start_time = time.perf_counter()
+                response = http_function(url, headers=merged_headers, json=json_body, data=data, files=files, params=params, stream=stream, verify=self.verify_ssl, timeout=(self.connect_timeout, self.read_timeout))
+                # response = session.get(url, headers=merged_headers, json=json_body, data=data, stream=stream)
+                # response = requests.request(
+                #     method=method,
+                #     url=url,
+                #     headers=merged_headers,
+                #     json=json_body,
+                #     data=data,
+                #     files=files,
+                #     params=params,
+                #     stream=stream,
+                #     verify=self.verify_ssl,
+                # )
+                end_time = time.perf_counter()
+                total_duration += end_time - start_time
+                response_list.append(response)
+            return {"duration": total_duration, "response_list": response_list}
+        else:
+            return http_function(url, headers=merged_headers, json=json_body, data=data, files=files, params=params, stream=stream, verify=self.verify_ssl, timeout=(self.connect_timeout, self.read_timeout))
+            # return session.get(url, headers=merged_headers, json=json_body, data=data, stream=stream)
+            # return requests.request(
+            #     method=method,
+            #     url=url,
+            #     headers=merged_headers,
+            #     json=json_body,
+            #     data=data,
+            #     files=files,
+            #     params=params,
+            #     stream=stream,
+            #     verify=self.verify_ssl,
+            # )
+
+    def request_json(
+            self,
+            method: str,
+            path: str,
+            *,
+            use_api_base: bool = True,
+            auth_kind: Optional[str] = "api",
+            headers: Optional[Dict[str, str]] = None,
+            json_body: Optional[Dict[str, Any]] = None,
+            data: Any = None,
+            files: Any = None,
+            params: Optional[Dict[str, Any]] = None,
+            stream: bool = False,
+    ) -> Dict[str, Any]:
+        response = self.request(
+            method,
+            path,
+            use_api_base=use_api_base,
+            auth_kind=auth_kind,
+            headers=headers,
+            json_body=json_body,
+            data=data,
+            files=files,
+            params=params,
+            stream=stream,
+        )
+        try:
+            return response.json()
+        except Exception as exc:
+            raise ValueError(f"Non-JSON response from {path}: {exc}") from exc
+
+    @staticmethod
+    def parse_json_bytes(raw: bytes) -> Dict[str, Any]:
+        try:
+            return json.loads(raw.decode("utf-8"))
+        except Exception as exc:
+            raise ValueError(f"Invalid JSON payload: {exc}") from exc
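Reviewer note: the URL and header logic in `HttpClient.build_url` and `HttpClient._headers` can be checked without a running server. The sketch below restates both as free functions; the names `build_url`, `auth_headers`, and `use_https` are hypothetical stand-ins for illustration (in the class, the scheme is chosen by `verify_ssl`), not part of this change.

```python
from typing import Dict, Optional


def build_url(host: str, port: int, path: str, *, api_version: str = "v1",
              use_api_base: bool = True, use_https: bool = False) -> str:
    """Mirror HttpClient.build_url: pick the base, then the scheme."""
    # api_base() includes the "api/" segment, non_api_base() omits it.
    prefix = "api/" if use_api_base else ""
    # Assumption: use_https plays the role of self.verify_ssl in the class.
    scheme = "https" if use_https else "http"
    return f"{scheme}://{host}:{port}/{prefix}{api_version}/{path.lstrip('/')}"


def auth_headers(auth_kind: Optional[str], api_key: Optional[str] = None,
                 login_token: Optional[str] = None,
                 extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Mirror HttpClient._headers: API keys get a Bearer prefix, login tokens are sent as-is."""
    headers: Dict[str, str] = {}
    if auth_kind == "api" and api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    elif auth_kind in ("web", "admin") and login_token:
        headers["Authorization"] = login_token
    # Caller-supplied headers are merged last, so they can override Authorization.
    if extra:
        headers.update(extra)
    return headers


if __name__ == "__main__":
    print(build_url("127.0.0.1", 9381, "/admin/version"))
    print(auth_headers("api", api_key="example-key"))
```

One design point this makes visible: because the scheme is tied to `verify_ssl`, the class offers no way to talk HTTPS with certificate verification turned off; whether that is intended is worth confirming with the author.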
--- a/admin/client/parser.py
+++ b/admin/client/parser.py
@@ -0,0 +1,905 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from lark import Transformer
+
+GRAMMAR = r"""
+start: command
+
+command: sql_command | meta_command
+
+sql_command: login_user
+           | ping_server
+           | list_services
+           | show_service
+           | startup_service
+           | shutdown_service
+           | restart_service
+           | register_user
+           | list_users
+           | show_user
+           | drop_user
+           | alter_user
+           | create_user
+           | activate_user
+           | list_datasets
+           | list_agents
+           | create_role
+           | drop_role
+           | alter_role
+           | list_roles
+           | show_role
+           | grant_permission
+           | revoke_permission
+           | alter_user_role
+           | show_user_permission
+           | show_version
+           | grant_admin
+           | revoke_admin
+           | set_variable
+           | show_variable
+           | list_variables
+           | list_configs
+           | list_environments
+           | generate_key
+           | list_keys
+           | drop_key
+           | show_current_user
+           | set_default_llm
+           | set_default_vlm
+           | set_default_embedding
+           | set_default_reranker
+           | set_default_asr
+           | set_default_tts
+           | reset_default_llm
+           | reset_default_vlm
+           | reset_default_embedding
+           | reset_default_reranker
+           | reset_default_asr
+           | reset_default_tts
+           | create_model_provider
+           | drop_model_provider
+           | create_user_dataset_with_parser
+           | create_user_dataset_with_pipeline
+           | drop_user_dataset
+           | list_user_datasets
+           | list_user_dataset_files
+           | list_user_dataset_documents
+           | list_user_datasets_metadata
+           | list_user_documents_metadata_summary
+           | list_user_agents
+           | list_user_chats
+           | create_user_chat
+           | drop_user_chat
+           | create_dataset_table
+           | drop_dataset_table
+           | create_metadata_table
+           | drop_metadata_table
+           | list_user_model_providers
+           | list_user_default_models
+           | parse_dataset_docs
+           | parse_dataset_sync
+           | parse_dataset_async
+           | import_docs_into_dataset
+           | search_on_datasets
+           | get_chunk
+           | list_chunks
+           | insert_dataset_from_file
+           | insert_metadata_from_file
+           | update_chunk
+           | set_metadata
+           | remove_tags
+           | remove_chunks
+           | create_chat_session
+           | drop_chat_session
+           | list_chat_sessions
+           | chat_on_session
+           | list_server_configs
+           | show_fingerprint
+           | set_license
+           | set_license_config
+           | show_license
+           | check_license
+           | benchmark
+
+// meta command definition
+meta_command: "\\" meta_command_name [meta_args]
+
+COMMA: ","
+
+meta_command_name: /[a-zA-Z?]+/
+meta_args: (meta_arg)+
+
+meta_arg: /[^\s"',]+/ | quoted_string
+
+// command definition
+
+LOGIN: "LOGIN"i
+REGISTER: "REGISTER"i
+LIST: "LIST"i
+SERVICES: "SERVICES"i
+SHOW: "SHOW"i
+CREATE: "CREATE"i
+SERVICE: "SERVICE"i
+SHUTDOWN: "SHUTDOWN"i
+STARTUP: "STARTUP"i
+RESTART: "RESTART"i
+USERS: "USERS"i
+DROP: "DROP"i
+USER: "USER"i
+ALTER: "ALTER"i
+ACTIVE: "ACTIVE"i
+ADMIN: "ADMIN"i
+PASSWORD: "PASSWORD"i
+DATASET_TABLE: "DATASET TABLE"i
+DATASET: "DATASET"i
+DATASETS: "DATASETS"i
+OF: "OF"i
+AGENTS: "AGENTS"i
+ROLE: "ROLE"i
+ROLES: "ROLES"i
+DESCRIPTION: "DESCRIPTION"i
+GRANT: "GRANT"i
+REVOKE: "REVOKE"i
+ALL: "ALL"i
+PERMISSION: "PERMISSION"i
+TO: "TO"i
+FROM: "FROM"i
+FOR: "FOR"i
+RESOURCES: "RESOURCES"i
+ON: "ON"i
+SET: "SET"i
+RESET: "RESET"i
+VERSION: "VERSION"i
+VAR: "VAR"i
+VARS: "VARS"i
+CONFIGS: "CONFIGS"i
+ENVS: "ENVS"i
+KEY: "KEY"i
+KEYS: "KEYS"i
+GENERATE: "GENERATE"i
+MODEL: "MODEL"i
+MODELS: "MODELS"i
+PROVIDER: "PROVIDER"i
+PROVIDERS: "PROVIDERS"i
+DEFAULT: "DEFAULT"i
+CHATS: "CHATS"i
+CHAT: "CHAT"i
+FILES: "FILES"i
+DOCUMENT: "DOCUMENT"i
+DOCUMENTS: "DOCUMENTS"i
+METADATA: "METADATA"i
+SUMMARY: "SUMMARY"i
+AS: "AS"i
+PARSE: "PARSE"i
+IMPORT: "IMPORT"i
+INTO: "INTO"i
+IN: "IN"i
+WITH: "WITH"i
+VECTOR: "VECTOR"i
+SIZE: "SIZE"i
+PARSER: "PARSER"i
+PIPELINE: "PIPELINE"i
+SEARCH: "SEARCH"i
+CURRENT: "CURRENT"i
+LLM: "LLM"i
+VLM: "VLM"i
+EMBEDDING: "EMBEDDING"i
+RERANKER: "RERANKER"i
+ASR: "ASR"i
+TTS: "TTS"i
+ASYNC: "ASYNC"i
+SYNC: "SYNC"i
+BENCHMARK: "BENCHMARK"i
+PING: "PING"i
+SESSION: "SESSION"i
+SESSIONS: "SESSIONS"i
+SERVER: "SERVER"i
+FINGERPRINT: "FINGERPRINT"i
+LICENSE: "LICENSE"i
+CHECK: "CHECK"i
+CONFIG: "CONFIG"i
+INDEX: "INDEX"i
+TABLE: "TABLE"i
+CHUNK: "CHUNK"i
+CHUNKS: "CHUNKS"i
+GET: "GET"i
+INSERT: "INSERT"i
+PAGE: "PAGE"i
+KEYWORDS: "KEYWORDS"i
+AVAILABLE: "AVAILABLE"i
+FILE: "FILE"i
+UPDATE: "UPDATE"i
+REMOVE: "REMOVE"i
+TAGS: "TAGS"i
+
+login_user: LOGIN USER quoted_string (PASSWORD quoted_string)? ";"
+list_services: LIST SERVICES ";"
+show_service: SHOW SERVICE NUMBER ";"
+startup_service: STARTUP SERVICE NUMBER ";"
+shutdown_service: SHUTDOWN SERVICE NUMBER ";"
+restart_service: RESTART SERVICE NUMBER ";"
+
+register_user: REGISTER USER quoted_string AS quoted_string PASSWORD quoted_string ";"
+list_users: LIST USERS ";"
+drop_user: DROP USER quoted_string ";"
+alter_user: ALTER USER PASSWORD quoted_string quoted_string ";"
+show_user: SHOW USER quoted_string ";"
+create_user: CREATE USER quoted_string quoted_string ";"
+activate_user: ALTER USER ACTIVE quoted_string status ";"
+
+list_datasets: LIST DATASETS OF quoted_string ";"
+list_agents: LIST AGENTS OF quoted_string ";"
+
+create_role: CREATE ROLE identifier [DESCRIPTION quoted_string] ";"
+drop_role: DROP ROLE identifier ";"
+alter_role: ALTER ROLE identifier SET DESCRIPTION quoted_string ";"
+list_roles: LIST ROLES ";"
+show_role: SHOW ROLE identifier ";"
+
+grant_permission: GRANT identifier_list ON identifier TO ROLE identifier ";"
+revoke_permission: REVOKE identifier_list ON identifier FROM ROLE identifier ";"
+alter_user_role: ALTER USER quoted_string SET ROLE identifier ";"
+show_user_permission: SHOW USER PERMISSION quoted_string ";"
+
+show_version: SHOW VERSION ";"
+
+grant_admin: GRANT ADMIN quoted_string ";"
+revoke_admin: REVOKE ADMIN quoted_string ";"
+
+generate_key: GENERATE KEY FOR USER quoted_string ";"
+list_keys: LIST KEYS OF quoted_string ";"
+drop_key: DROP KEY quoted_string OF quoted_string ";"
+
+set_variable: SET VAR identifier variable_value ";"
+show_variable: SHOW VAR identifier ";"
+list_variables: LIST VARS ";"
+list_configs: LIST CONFIGS ";"
+list_environments: LIST ENVS ";"
+
+show_fingerprint: SHOW FINGERPRINT ";"
+set_license: SET LICENSE quoted_string ";"
+set_license_config: SET LICENSE CONFIG NUMBER NUMBER ";"
+show_license: SHOW LICENSE ";"
+check_license: CHECK LICENSE ";"
+
+list_server_configs: LIST SERVER CONFIGS ";"
+
+benchmark: BENCHMARK NUMBER NUMBER user_statement
+
+user_statement: ping_server
+                | show_current_user
+                | create_model_provider
+                | drop_model_provider
+                | set_default_llm
+                | set_default_vlm
+                | set_default_embedding
+                | set_default_reranker
+                | set_default_asr
+                | set_default_tts
+                | reset_default_llm
+                | reset_default_vlm
+                | reset_default_embedding
+                | reset_default_reranker
+                | reset_default_asr
+                | reset_default_tts
+                | create_user_dataset_with_parser
+                | create_user_dataset_with_pipeline
+                | drop_user_dataset
+                | list_user_datasets
+                | list_user_dataset_files
+                | list_user_agents
+                | list_user_chats
+                | create_user_chat
+                | drop_user_chat
+                | list_user_model_providers
+                | list_user_default_models
+                | import_docs_into_dataset
+                | search_on_datasets
+                | update_chunk
+                | set_metadata
+                | remove_tags
+                | create_chat_session
+                | drop_chat_session
+                | list_chat_sessions
+                | chat_on_session
+
+ping_server: PING ";"
+show_current_user: SHOW CURRENT USER ";"
+create_model_provider: CREATE MODEL PROVIDER quoted_string quoted_string ";"
+drop_model_provider: DROP MODEL PROVIDER quoted_string ";"
+set_default_llm: SET DEFAULT LLM quoted_string ";"
+set_default_vlm: SET DEFAULT VLM quoted_string ";"
+set_default_embedding: SET DEFAULT EMBEDDING quoted_string ";"
+set_default_reranker: SET DEFAULT RERANKER quoted_string ";"
+set_default_asr: SET DEFAULT ASR quoted_string ";"
+set_default_tts: SET DEFAULT TTS quoted_string ";"
+
+reset_default_llm: RESET DEFAULT LLM ";"
+reset_default_vlm: RESET DEFAULT VLM ";"
+reset_default_embedding: RESET DEFAULT EMBEDDING ";"
+reset_default_reranker: RESET DEFAULT RERANKER ";"
+reset_default_asr: RESET DEFAULT ASR ";"
+reset_default_tts: RESET DEFAULT TTS ";"
+
+list_user_datasets: LIST DATASETS ";"
+create_user_dataset_with_parser: CREATE DATASET quoted_string WITH EMBEDDING quoted_string PARSER quoted_string ";"
+create_user_dataset_with_pipeline: CREATE DATASET quoted_string WITH EMBEDDING quoted_string PIPELINE quoted_string ";"
+drop_user_dataset: DROP DATASET quoted_string ";"
+list_user_dataset_files: LIST FILES OF DATASET quoted_string ";"
+list_user_dataset_documents: LIST DOCUMENTS OF DATASET quoted_string ";"
+list_user_datasets_metadata: LIST METADATA OF DATASETS quoted_string (COMMA quoted_string)* ";"
+list_user_documents_metadata_summary: LIST METADATA SUMMARY OF DATASET quoted_string (DOCUMENTS quoted_string (COMMA quoted_string)*)? ";"
+list_user_agents: LIST AGENTS ";"
+list_user_chats: LIST CHATS ";"
+create_user_chat: CREATE CHAT quoted_string ";"
+drop_user_chat: DROP CHAT quoted_string ";"
+create_chat_session: CREATE CHAT quoted_string SESSION ";"
+drop_chat_session: DROP CHAT quoted_string SESSION quoted_string ";"
+list_chat_sessions: LIST CHAT quoted_string SESSIONS ";"
+chat_on_session: CHAT quoted_string ON quoted_string SESSION quoted_string ";"
+list_user_model_providers: LIST MODEL PROVIDERS ";"
+list_user_default_models: LIST DEFAULT MODELS ";"
+import_docs_into_dataset: IMPORT quoted_string INTO DATASET quoted_string ";"
+search_on_datasets: SEARCH quoted_string ON DATASETS quoted_string ";"
+get_chunk: GET CHUNK quoted_string ";"
+list_chunks: LIST CHUNKS OF DOCUMENT quoted_string ("PAGE" NUMBER)? ("SIZE" NUMBER)? ("KEYWORDS" quoted_string)? ("AVAILABLE" NUMBER)? ";"
+set_metadata: SET METADATA OF DOCUMENT quoted_string TO quoted_string ";"
+remove_tags: REMOVE TAGS quoted_string (COMMA quoted_string)* FROM DATASET quoted_string ";"
+remove_chunks: REMOVE CHUNKS quoted_string (COMMA quoted_string)* FROM DOCUMENT quoted_string ";"
+           | REMOVE ALL CHUNKS FROM DOCUMENT quoted_string ";"
+
+parse_dataset_docs: PARSE quoted_string OF DATASET quoted_string ";"
+parse_dataset_sync: PARSE DATASET quoted_string SYNC ";"
+parse_dataset_async: PARSE DATASET quoted_string ASYNC ";"
+
+// Internal CLI commands, only for Go
+create_dataset_table: CREATE DATASET TABLE quoted_string VECTOR SIZE NUMBER ";"
+drop_dataset_table: DROP DATASET TABLE quoted_string ";"
+create_metadata_table: CREATE METADATA TABLE ";"
+drop_metadata_table: DROP METADATA TABLE ";"
+insert_dataset_from_file: INSERT DATASET FROM FILE quoted_string ";"
+insert_metadata_from_file: INSERT METADATA FROM FILE quoted_string ";"
+update_chunk: UPDATE CHUNK quoted_string OF DATASET quoted_string SET quoted_string ";"
+
+identifier_list: identifier (COMMA identifier)*
+
+identifier: WORD
+variable_value: WORD | NUMBER | QUOTED_STRING
+quoted_string: QUOTED_STRING
+status: ON | WORD
+
+QUOTED_STRING: /'[^']+'/ | /"[^"]+"/
+WORD: /[a-zA-Z0-9_\-\.]+/
+NUMBER: /[0-9]+/
+
+%import common.WS
+%ignore WS
+"""
+
+
+class RAGFlowCLITransformer(Transformer):
+    def start(self, items):
+        return items[0]
+
+    def command(self, items):
+        return items[0]
+
+    def login_user(self, items):
+        email = items[2].children[0].strip("'\"")
+        if len(items) == 5:
+            # With password: LOGIN USER email PASSWORD password
+            password = items[4].children[0].strip("'\"")
+            return {"type": "login_user", "email": email, "password": password}
+        else:
+            # Without password: LOGIN USER email
+            return {"type": "login_user", "email": email}
+
+    def ping_server(self, items):
+        return {"type": "ping_server"}
+
+    def list_services(self, items):
+        result = {"type": "list_services"}
+        return result
+
+    def show_service(self, items):
+        service_id = int(items[2])
+        return {"type": "show_service", "number": service_id}
+
+    def startup_service(self, items):
+        service_id = int(items[2])
+        return {"type": "startup_service", "number": service_id}
+
+    def shutdown_service(self, items):
+        service_id = int(items[2])
+        return {"type": "shutdown_service", "number": service_id}
+
+    def restart_service(self, items):
+        service_id = int(items[2])
+        return {"type": "restart_service", "number": service_id}
+
+    def register_user(self, items):
+        user_name: str = items[2].children[0].strip("'\"")
+        nickname: str = items[4].children[0].strip("'\"")
+        password: str = items[6].children[0].strip("'\"")
+        return {"type": "register_user", "user_name": user_name, "nickname": nickname, "password": password}
+
+    def list_users(self, items):
+        return {"type": "list_users"}
+
+    def show_user(self, items):
+        user_name = items[2]
+        return {"type": "show_user", "user_name": user_name}
+
+    def drop_user(self, items):
+        user_name = items[2]
+        return {"type": "drop_user", "user_name": user_name}
+
+    def alter_user(self, items):
+        user_name = items[3]
+        new_password = items[4]
+        return {"type": "alter_user", "user_name": user_name, "password": new_password}
+
+    def create_user(self, items):
+        user_name = items[2]
+        password = items[3]
+        return {"type": "create_user", "user_name": user_name, "password": password, "role": "user"}
+
+    def activate_user(self, items):
+        user_name = items[3]
+        activate_status = items[4]
+        return {"type": "activate_user", "activate_status": activate_status, "user_name": user_name}
+
+    def list_datasets(self, items):
+        user_name = items[3]
+        return {"type": "list_datasets", "user_name": user_name}
+
+    def list_agents(self, items):
+        user_name = items[3]
+        return {"type": "list_agents", "user_name": user_name}
+
+    def create_role(self, items):
+        role_name = items[2]
+        if len(items) > 4 and items[4] is not None:
+            description = items[4]
+            return {"type": "create_role", "role_name": role_name, "description": description}
+        else:
+            return {"type": "create_role", "role_name": role_name}
+
+    def drop_role(self, items):
+        role_name = items[2]
+        return {"type": "drop_role", "role_name": role_name}
+
+    def alter_role(self, items):
+        role_name = items[2]
+        description = items[5]
+        return {"type": "alter_role", "role_name": role_name, "description": description}
+
+    def list_roles(self, items):
+        return {"type": "list_roles"}
+
+    def show_role(self, items):
+        role_name = items[2]
+        return {"type": "show_role", "role_name": role_name}
+
+    def grant_permission(self, items):
+        action_list = items[1]
+        resource = items[3]
+        role_name = items[6]
+        return {"type": "grant_permission", "role_name": role_name, "resource": resource, "actions": action_list}
+
+    def revoke_permission(self, items):
+        action_list = items[1]
+        resource = items[3]
+        role_name = items[6]
+        return {"type": "revoke_permission", "role_name": role_name, "resource": resource, "actions": action_list}
+
+    def alter_user_role(self, items):
+        user_name = items[2]
+        role_name = items[5]
+        return {"type": "alter_user_role", "user_name": user_name, "role_name": role_name}
+
+    def show_user_permission(self, items):
+        user_name = items[3]
+        return {"type": "show_user_permission", "user_name": user_name}
+
+    def show_version(self, items):
+        return {"type": "show_version"}
+
+    def grant_admin(self, items):
+        user_name = items[2]
+        return {"type": "grant_admin", "user_name": user_name}
+
+    def revoke_admin(self, items):
+        user_name = items[2]
+        return {"type": "revoke_admin", "user_name": user_name}
+
+    def generate_key(self, items):
+        user_name = items[4]
+        return {"type": "generate_key", "user_name": user_name}
+
+    def list_keys(self, items):
+        user_name = items[3]
+        return {"type": "list_keys", "user_name": user_name}
+
+    def drop_key(self, items):
+        key = items[2]
+        user_name = items[4]
+        return {"type": "drop_key", "key": key, "user_name": user_name}
+
+    def set_variable(self, items):
+        var_name = items[2]
+        var_value = items[3]
+        return {"type": "set_variable", "var_name": var_name, "var_value": var_value}
+
+    def show_variable(self, items):
+        var_name = items[2]
+        return {"type": "show_variable", "var_name": var_name}
+
+    def list_variables(self, items):
+        return {"type": "list_variables"}
+
+    def list_configs(self, items):
+        return {"type": "list_configs"}
+
+    def list_environments(self, items):
+        return {"type": "list_environments"}
+
+    def show_fingerprint(self, items):
+        return {"type": "show_fingerprint"}
+
+    def set_license(self, items):
+        license = items[2].children[0].strip("'\"")
+        return {"type": "set_license", "license": license}
+
+    def set_license_config(self, items):
+        value1: int = int(items[3])
+        value2: int = int(items[4])
+        return {"type": "set_license_config", "value1": value1, "value2": value2}
+
+    def show_license(self, items):
+        return {"type": "show_license"}
+
+    def check_license(self, items):
+        return {"type": "check_license"}
+
+    def list_server_configs(self, items):
+        return {"type": "list_server_configs"}
+
+    def create_model_provider(self, items):
+        provider_name = items[3].children[0].strip("'\"")
+        provider_key = items[4].children[0].strip("'\"")
+        return {"type": "create_model_provider", "provider_name": provider_name, "provider_key": provider_key}
+
+    def drop_model_provider(self, items):
+        provider_name = items[3].children[0].strip("'\"")
+        return {"type": "drop_model_provider", "provider_name": provider_name}
+
+    def show_current_user(self, items):
+        return {"type": "show_current_user"}
+
+    def set_default_llm(self, items):
+        llm_id = items[3].children[0].strip("'\"")
+        return {"type": "set_default_model", "model_type": "llm_id", "model_id": llm_id}
+
+    def set_default_vlm(self, items):
+        vlm_id = items[3].children[0].strip("'\"")
+        return {"type": "set_default_model", "model_type": "img2txt_id", "model_id": vlm_id}
+
+    def set_default_embedding(self, items):
+        embedding_id = items[3].children[0].strip("'\"")
+        return {"type": "set_default_model", "model_type": "embd_id", "model_id": embedding_id}
+
+    def set_default_reranker(self, items):
+        reranker_id = items[3].children[0].strip("'\"")
+        return {"type": "set_default_model", "model_type": "reranker_id", "model_id": reranker_id}
+
+    def set_default_asr(self, items):
+        asr_id = items[3].children[0].strip("'\"")
+        return {"type": "set_default_model", "model_type": "asr_id", "model_id": asr_id}
+
+    def set_default_tts(self, items):
+        tts_id = items[3].children[0].strip("'\"")
+        return {"type": "set_default_model", "model_type": "tts_id", "model_id": tts_id}
+
+    def reset_default_llm(self, items):
+        return {"type": "reset_default_model", "model_type": "llm_id"}
+
+    def reset_default_vlm(self, items):
+        return {"type": "reset_default_model", "model_type": "img2txt_id"}
+
+    def reset_default_embedding(self, items):
+        return {"type": "reset_default_model", "model_type": "embd_id"}
+
+    def reset_default_reranker(self, items):
+        return {"type": "reset_default_model", "model_type": "reranker_id"}
+
+    def reset_default_asr(self, items):
+        return {"type": "reset_default_model", "model_type": "asr_id"}
+
+    def reset_default_tts(self, items):
+        return {"type": "reset_default_model", "model_type": "tts_id"}
+
+    def list_user_datasets(self, items):
+        return {"type": "list_user_datasets"}
+
+    def create_user_dataset_with_parser(self, items):
+        dataset_name = items[2].children[0].strip("'\"")
+        embedding = items[5].children[0].strip("'\"")
+        parser_type = items[7].children[0].strip("'\"")
+        return {"type": "create_user_dataset", "dataset_name": dataset_name, "embedding": embedding,
+                "parser_type": parser_type}
+
+    def create_user_dataset_with_pipeline(self, items):
+        dataset_name = items[2].children[0].strip("'\"")
+        embedding = items[5].children[0].strip("'\"")
+        pipeline = items[7].children[0].strip("'\"")
+        return {"type": "create_user_dataset", "dataset_name": dataset_name, "embedding": embedding,
+                "pipeline": pipeline}
+
+    def drop_user_dataset(self, items):
+        dataset_name = items[2].children[0].strip("'\"")
+        return {"type": "drop_user_dataset", "dataset_name": dataset_name}
+
+    def list_user_dataset_files(self, items):
+        dataset_name = items[4].children[0].strip("'\"")
+        return {"type": "list_user_dataset_files", "dataset_name": dataset_name}
+
+    def list_user_dataset_documents(self, items):
+        dataset_name = items[4].children[0].strip("'\"")
+        return {"type": "list_user_dataset_documents", "dataset_name": dataset_name}
+
+    def list_user_datasets_metadata(self, items):
+        dataset_names = []
+        dataset_names.append(items[4].children[0].strip("'\""))
+        for i in range(5, len(items)):
+            if items[i] and hasattr(items[i], 'children') and items[i].children:
+                dataset_names.append(items[i].children[0].strip("'\""))
+        return {"type": "list_user_datasets_metadata", "dataset_names": dataset_names}
+
+    def list_user_documents_metadata_summary(self, items):
+        dataset_name = items[5].children[0].strip("'\"")
+        doc_ids = []
+        if len(items) > 6 and items[6] == "DOCUMENTS":
+            for i in range(7, len(items)):
+                if items[i] and hasattr(items[i], 'children') and items[i].children:
+                    doc_id = items[i].children[0].strip("'\"")
+                    doc_ids.append(doc_id)
+        return {"type": "list_user_documents_metadata_summary", "dataset_name": dataset_name, "document_ids": doc_ids}
+
+    def list_user_agents(self, items):
+        return {"type": "list_user_agents"}
+
+    def list_user_chats(self, items):
+        return {"type": "list_user_chats"}
+
+    def create_user_chat(self, items):
+        chat_name = items[2].children[0].strip("'\"")
+        return {"type": "create_user_chat", "chat_name": chat_name}
+
+    def drop_user_chat(self, items):
+        chat_name = items[2].children[0].strip("'\"")
+        return {"type": "drop_user_chat", "chat_name": chat_name}
+
+    def create_dataset_table(self, items):
+        dataset_name = None
+        vector_size = None
+        for i, item in enumerate(items):
+            if hasattr(item, 'data') and item.data == 'quoted_string':
+                dataset_name = item.children[0].strip("'\"")
+            if hasattr(item, 'type') and item.type == 'NUMBER':
+                if i > 1 and getattr(items[i-1], 'type', None) == 'SIZE' and getattr(items[i-2], 'type', None) == 'VECTOR':
+                    vector_size = int(item)
+        return {"type": "create_dataset_table", "dataset_name": dataset_name, "vector_size": vector_size}
+
+    def drop_dataset_table(self, items):
+        dataset_name = None
+        for item in items:
+            if hasattr(item, 'data') and item.data == 'quoted_string':
+                dataset_name = item.children[0].strip("'\"")
+        return {"type": "drop_dataset_table", "dataset_name": dataset_name}
+
+    def create_metadata_table(self, items):
+        return {"type": "create_metadata_table"}
+
+    def drop_metadata_table(self, items):
+        return {"type": "drop_metadata_table"}
+
+    def list_user_model_providers(self, items):
+        return {"type": "list_user_model_providers"}
+
+    def list_user_default_models(self, items):
+        return {"type": "list_user_default_models"}
+
+    def parse_dataset_docs(self, items):
+        document_list_str = items[1].children[0].strip("'\"")
+        document_names = document_list_str.split(",")
+        if len(document_names) == 1:
+            document_names = document_names[0]
+            document_names = document_names.split(" ")
+        dataset_name = items[4].children[0].strip("'\"")
+        return {"type": "parse_dataset_docs", "dataset_name": dataset_name, "document_names": document_names}
+
+    def parse_dataset_sync(self, items):
+        dataset_name = items[2].children[0].strip("'\"")
+        return {"type": "parse_dataset", "dataset_name": dataset_name, "method": "sync"}
+
+    def parse_dataset_async(self, items):
+        dataset_name = items[2].children[0].strip("'\"")
+        return {"type": "parse_dataset", "dataset_name": dataset_name, "method": "async"}
+
+    def create_chat_session(self, items):
+        chat_name = items[2].children[0].strip("'\"")
+        return {"type": "create_chat_session", "chat_name": chat_name}
+
+    def drop_chat_session(self, items):
+        chat_name = items[2].children[0].strip("'\"")
+        session_id = items[4].children[0].strip("'\"")
+        return {"type": "drop_chat_session", "chat_name": chat_name, "session_id": session_id}
+
+    def list_chat_sessions(self, items):
+        chat_name = items[2].children[0].strip("'\"")
+        return {"type": "list_chat_sessions", "chat_name": chat_name}
+
+    def chat_on_session(self, items):
+        message = items[1].children[0].strip("'\"")
+        chat_name = items[3].children[0].strip("'\"")
+        session_id = items[5].children[0].strip("'\"")
+        return {"type": "chat_on_session", "message": message, "chat_name": chat_name, "session_id": session_id}
+
+    def import_docs_into_dataset(self, items):
+        document_list_str = items[1].children[0].strip("'\"")
+        document_paths = document_list_str.split(",")
+        if len(document_paths) == 1:
+            document_paths = document_paths[0]
+            document_paths = document_paths.split(" ")
+        dataset_name = items[4].children[0].strip("'\"")
+        return {"type": "import_docs_into_dataset", "dataset_name": dataset_name, "document_paths": document_paths}
+
+    def search_on_datasets(self, items):
+        question = items[1].children[0].strip("'\"")
+        datasets_str = items[4].children[0].strip("'\"")
+        datasets = datasets_str.split(",")
+        if len(datasets) == 1:
+            datasets = datasets[0]
+            datasets = datasets.split(" ")
+        return {"type": "search_on_datasets", "datasets": datasets, "question": question}
+
+    def get_chunk(self, items):
+        chunk_id = items[2].children[0].strip("'\"")
+        return {"type": "get_chunk", "chunk_id": chunk_id}
+
+    def insert_dataset_from_file(self, items):
+        file_path = items[4].children[0].strip("'\"")
+        return {"type": "insert_dataset_from_file", "file_path": file_path}
+
+    def insert_metadata_from_file(self, items):
+        file_path = items[4].children[0].strip("'\"")
+        return {"type": "insert_metadata_from_file", "file_path": file_path}
+
+    def update_chunk(self, items):
+        def get_quoted_value(item):
+            if hasattr(item, 'children') and item.children:
+                return item.children[0].strip("'\"")
+            return str(item).strip("'\"")
+
+        chunk_id = get_quoted_value(items[2])
+        dataset_name = get_quoted_value(items[5])
+        json_body = get_quoted_value(items[7])
+        return {"type": "update_chunk", "chunk_id": chunk_id, "dataset_name": dataset_name, "json_body": json_body}
+
+    def set_metadata(self, items):
+        doc_id = items[4].children[0].strip("'\"")
+        meta_json = items[6].children[0].strip("'\"")
+        return {"type": "set_metadata", "doc_id": doc_id, "meta": meta_json}
+
+    def remove_tags(self, items):
+        # items: REMOVE, TAGS, quoted_string(tag1), quoted_string(tag2), ..., FROM, DATASET, quoted_string(dataset_name), ";"
+        tags = []
+        # Start from index 2 (after TAGS keyword) and parse quoted strings until FROM
+        for i in range(2, len(items)):
+            item = items[i]
+            # Check for FROM token to stop
+            if hasattr(item, 'type') and item.type == 'FROM':
+                break
+            if hasattr(item, 'children') and item.children:
+                tag = item.children[0].strip("'\"")
+                tags.append(tag)
+        # Find dataset_name: quoted_string after DATASET
+        dataset_name = None
+        for i, item in enumerate(items):
+            # Check if item is a DATASET token
+            if hasattr(item, 'type') and item.type == 'DATASET':
+                # Next item should be quoted_string
+                dataset_name = items[i + 1].children[0].strip("'\"")
+                break
+        return {"type": "remove_tags", "dataset_name": dataset_name, "tags": tags}
+
+    def remove_chunks(self, items):
+        # Handle two cases:
+        # 1. REMOVE CHUNKS quoted_string (COMMA quoted_string)* FROM DOCUMENT quoted_string ";"
+        # 2. REMOVE ALL CHUNKS FROM DOCUMENT quoted_string ";"
+
+        # Check if it's "REMOVE ALL CHUNKS"
+        for item in items:
+            if hasattr(item, 'type') and item.type == 'ALL':
+                # Find doc_id
+                for j, inner_item in enumerate(items):
+                    if hasattr(inner_item, 'type') and inner_item.type == 'DOCUMENT':
+                        doc_id = items[j + 1].children[0].strip("'\"")
+                        return {"type": "remove_chunks", "doc_id": doc_id, "delete_all": True}
+
+        # Otherwise, we have chunk_ids
+        chunk_ids = []
+        doc_id = None
+        for i, item in enumerate(items):
+            if hasattr(item, 'type') and item.type == 'DOCUMENT':
+                doc_id = items[i + 1].children[0].strip("'\"")
+            elif hasattr(item, 'children') and item.children:
+                val = item.children[0].strip("'\"")
+                # Skip if it's "FROM" or "DOCUMENT"
+                if val.upper() in ['FROM', 'DOCUMENT']:
+                    continue
+                chunk_ids.append(val)
+
+        return {"type": "remove_chunks", "doc_id": doc_id, "chunk_ids": chunk_ids}
+
+    def list_chunks(self, items):
+        doc_id = items[4].children[0].strip("'\"")
+        result = {"type": "list_chunks", "doc_id": doc_id}
+
+        # Parse optional parameters: PAGE, SIZE, KEYWORDS, AVAILABLE
+        # items structure varies based on which params are present
+        for i, item in enumerate(items):
+            if str(item) == "PAGE":
+                result["page"] = int(items[i + 1])
+            elif str(item) == "SIZE":
+                result["size"] = int(items[i + 1])
+            elif str(item) == "KEYWORDS":
+                result["keywords"] = items[i + 1].children[0].strip("'\"")
+            elif str(item) == "AVAILABLE":
+                result["available_int"] = int(items[i + 1])
+
+        return result
+
+    def benchmark(self, items):
+        concurrency: int = int(items[1])
+        iterations: int = int(items[2])
+        command = items[3].children[0]
+        return {"type": "benchmark", "concurrency": concurrency, "iterations": iterations, "command": command}
+
+    def action_list(self, items):
+        return items
+
+    def meta_command(self, items):
+        command_name = str(items[0]).lower()
+        args = items[1:] if len(items) > 1 else []
+
+        # handle quoted parameter
+        parsed_args = []
+        for arg in args:
+            if hasattr(arg, "value"):
+                parsed_args.append(arg.value)
+            else:
+                parsed_args.append(str(arg))
+
+        return {"type": "meta", "command": command_name, "args": parsed_args}
+
+    def meta_command_name(self, items):
+        return items[0]
+
+    def meta_args(self, items):
+        return items
--- a/admin/client/pyproject.toml
+++ b/admin/client/pyproject.toml
@@ -1,24 +1,27 @@
 [project]
 name = "ragflow-cli"
-version = "0.22.0"
+version = "0.26.2"
 description = "Admin Service's client of [RAGFlow](https://github.com/infiniflow/ragflow). The Admin Service provides user management and system monitoring. "
 authors = [{ name = "Lynn", email = "lynn_inf@hotmail.com" }]
 license = { text = "Apache License, Version 2.0" }
 readme = "README.md"
-requires-python = ">=3.10,<3.13"
+requires-python = ">=3.13,<3.14"
 dependencies = [
    "requests>=2.30.0,<3.0.0",
-    "beartype>=0.18.5,<0.19.0",
+    "beartype>=0.20.0,<1.0.0",
    "pycryptodomex>=3.10.0",
    "lark>=1.1.0",
+    "requests-toolbelt>=1.0.0",
 ]

 [dependency-groups]
 test = [
    "pytest>=8.3.5",
    "requests>=2.32.3",
-    "requests-toolbelt>=1.0.0",
 ]

+[tool.setuptools]
+py-modules = ["ragflow_cli", "parser", "http_client", "ragflow_client", "user"]
+
 [project.scripts]
-ragflow-cli = "admin_client:main"
+ragflow-cli = "ragflow_cli:main"
--- a/admin/client/ragflow_cli.py
+++ b/admin/client/ragflow_cli.py
@@ -0,0 +1,347 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import sys
+import argparse
+import base64
+import getpass
+import os
+import atexit
+import readline
+from cmd import Cmd
+from typing import Any, Dict, List
+
+import requests
+import warnings
+from Cryptodome.Cipher import PKCS1_v1_5 as Cipher_pkcs1_v1_5
+from Cryptodome.PublicKey import RSA
+from lark import Lark, Tree
+from parser import GRAMMAR, RAGFlowCLITransformer
+from http_client import HttpClient
+from ragflow_client import RAGFlowClient, run_command
+from user import login_user
+
+warnings.filterwarnings("ignore", category=getpass.GetPassWarning)
+
+def encrypt(input_string):
+    pub = "-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArq9XTUSeYr2+N1h3Afl/z8Dse/2yD0ZGrKwx+EEEcdsBLca9Ynmx3nIB5obmLlSfmskLpBo0UACBmB5rEjBp2Q2f3AG3Hjd4B+gNCG6BDaawuDlgANIhGnaTLrIqWrrcm4EMzJOnAOI1fgzJRsOOUEfaS318Eq9OVO3apEyCCt0lOQK6PuksduOjVxtltDav+guVAA068NrPYmRNabVKRNLJpL8w4D44sfth5RvZ3q9t+6RTArpEtc5sh5ChzvqPOzKGMXW83C95TxmXqpbK6olN4RevSfVjEAgCydH6HN6OhtOQEcnrU97r9H0iZOWwbw3pVrZiUkuRD1R56Wzs2wIDAQAB\n-----END PUBLIC KEY-----"
+    pub_key = RSA.importKey(pub)
+    cipher = Cipher_pkcs1_v1_5.new(pub_key)
+    cipher_text = cipher.encrypt(base64.b64encode(input_string.encode("utf-8")))
+    return base64.b64encode(cipher_text).decode("utf-8")
+
+
+def encode_to_base64(input_string):
+    base64_encoded = base64.b64encode(input_string.encode("utf-8"))
+    return base64_encoded.decode("utf-8")
+
+
+
+
+
+class RAGFlowCLI(Cmd):
+    def __init__(self):
+        super().__init__()
+        self.parser = Lark(GRAMMAR, start="start", parser="lalr", transformer=RAGFlowCLITransformer())
+        self.command_history = []
+        self.account = "admin@ragflow.io"
+        self.account_password: str = "admin"
+        self.session = requests.Session()
+        self.host: str = ""
+        self.port: int = 0
+        self.mode: str = "admin"
+        self.ragflow_client = None
+        # History file for readline persistence
+        self.history_file = os.path.expanduser("~/.ragflow_cli_history")
+        # Load existing history
+        self._load_history()
+        # Register cleanup to save history on exit
+        atexit.register(self._save_history)
+
+    intro = r"""Type "\h" for help."""
+    prompt = "ragflow> "
+
+    def onecmd(self, command: str) -> bool:
+        try:
+            result = self.parse_command(command)
+
+            if isinstance(result, dict):
+                if "type" in result and result.get("type") == "empty":
+                    return False
+
+            self.execute_command(result)
+
+            if isinstance(result, Tree):
+                return False
+
+            if result.get("type") == "meta" and result.get("command") in ["q", "quit", "exit"]:
+                return True
+
+        except KeyboardInterrupt:
+            print("\nUse '\\q' to quit")
+        except EOFError:
+            print("\nGoodbye!")
+            return True
+        return False
+
+    def emptyline(self) -> bool:
+        return False
+
+    def default(self, line: str) -> bool:
+        return self.onecmd(line)
+
+    def parse_command(self, command_str: str) -> Tree | dict[str, Any]:
+        if not command_str.strip():
+            return {"type": "empty"}
+
+        self.command_history.append(command_str)
+        readline.add_history(command_str)
+
+        try:
+            result = self.parser.parse(command_str)
+            return result
+        except Exception as e:
+            return {"type": "error", "message": f"Parse error: {str(e)}"}
+
+    def verify_auth(self, arguments: dict, single_command: bool, auth: bool):
+        server_type = arguments.get("type", "admin")
+        http_client = HttpClient(arguments["host"], arguments["port"])
+        if not auth:
+            self.ragflow_client = RAGFlowClient(http_client, server_type)
+            return True
+
+        user_name = arguments["username"]
+        attempt_count = 3
+        if single_command:
+            attempt_count = 1
+
+        try_count = 0
+        while True:
+            try_count += 1
+            if try_count > attempt_count:
+                return False
+
+            if single_command:
+                user_password = arguments["password"]
+            else:
+                user_password = getpass.getpass(f"password for {user_name}: ").strip()
+
+            try:
+                token = login_user(http_client, server_type, user_name, user_password)
+                http_client.login_token = token
+                self.ragflow_client = RAGFlowClient(http_client, server_type)
+                return True
+            except Exception as e:
+                print(str(e))
+                print("Login failed; check the server address, username, and password")
+
+    def _format_service_detail_table(self, data):
+        if isinstance(data, list):
+            return data
+        if not all([isinstance(v, list) for v in data.values()]):
+            # normal table
+            return data
+        # Handle the task_executor heartbeat map, e.g. {'name': [{'done': 2, 'now': timestamp1}, {'done': 3, 'now': timestamp2}]}
+        task_executor_list = []
+        for k, v in data.items():
+            # display latest status
+            heartbeats = sorted(v, key=lambda x: x["now"], reverse=True)
+            task_executor_list.append(
+                {
+                    "task_executor_name": k,
+                    **heartbeats[0],
+                }
+                if heartbeats
+                else {"task_executor_name": k}
+            )
+        return task_executor_list
+
+    def _print_table_simple(self, data):
+        if not data:
+            print("No data to print")
+            return
+        if isinstance(data, dict):
+            # handle single row data
+            data = [data]
+
+        columns = list(set().union(*(d.keys() for d in data)))
+        columns.sort()
+        col_widths = {}
+
+        def get_string_width(text):
+            half_width_chars = " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\t\n\r"
+            width = 0
+            for char in text:
+                if char in half_width_chars:
+                    width += 1
+                else:
+                    width += 2
+            return width
+
+        for col in columns:
+            max_width = get_string_width(str(col))
+            for item in data:
+                value_len = get_string_width(str(item.get(col, "")))
+                if value_len > max_width:
+                    max_width = value_len
+            col_widths[col] = max(2, max_width)
+
+        # Generate delimiter
+        separator = "+" + "+".join(["-" * (col_widths[col] + 2) for col in columns]) + "+"
+
+        # Print header
+        print(separator)
+        header = "|" + "|".join([f" {col:<{col_widths[col]}} " for col in columns]) + "|"
+        print(header)
+        print(separator)
+
+        # Print data
+        for item in data:
+            row = "|"
+            for col in columns:
+                value = str(item.get(col, ""))
+                if get_string_width(value) > col_widths[col]:
+                    value = value[: col_widths[col] - 3] + "..."
+                row += f" {value:<{col_widths[col] - (get_string_width(value) - len(value))}} |"
+            print(row)
+
+        print(separator)
+
+    def _load_history(self):
+        """Load command history from file."""
+        try:
+            if os.path.exists(self.history_file):
+                readline.read_history_file(self.history_file)
+        except Exception:
+            pass  # Ignore errors loading history
+
+    def _save_history(self):
+        """Save command history to file."""
+        try:
+            readline.write_history_file(self.history_file)
+        except Exception:
+            pass  # Ignore errors saving history
+
+    def run_interactive(self, args):
+        if self.verify_auth(args, single_command=False, auth=args["auth"]):
+            print(r"""
+                ____  ___   ______________                 ________    ____
+               / __ \/   | / ____/ ____/ /___ _      __   / ____/ /   /  _/
+              / /_/ / /| |/ / __/ /_  / / __ \ | /| / /  / /   / /    / /  
+             / _, _/ ___ / /_/ / __/ / / /_/ / |/ |/ /  / /___/ /____/ /   
+            /_/ |_/_/  |_\____/_/   /_/\____/|__/|__/   \____/_____/___/   
+            """)
+            self.cmdloop()
+
+        print("RAGFlow command line interface - Type '\\?' for help, '\\q' to quit")
+
+    def run_single_command(self, args):
+        if self.verify_auth(args, single_command=True, auth=args["auth"]):
+            command = args["command"]
+            result = self.parse_command(command)
+            self.execute_command(result)
+
+
+    def parse_connection_args(self, args: List[str]) -> Dict[str, Any]:
+        parser = argparse.ArgumentParser(description="RAGFlow CLI Client", add_help=False)
+        parser.add_argument("-h", "--host", default="127.0.0.1", help="Admin or RAGFlow service host")
+        parser.add_argument("-p", "--port", type=int, default=9381, help="Admin or RAGFlow service port")
+        parser.add_argument("-w", "--password", default="admin", type=str, help="Superuser password")
+        parser.add_argument("-t", "--type", default="admin", type=str, help="CLI mode, admin or user")
+        parser.add_argument("-u", "--username", default=None,
+                            help="Username (email). In admin mode defaults to admin@ragflow.io, in user mode required.")
+        parser.add_argument("command", nargs="?", help="Single command")
+        try:
+            parsed_args, remaining_args = parser.parse_known_args(args)
+            # Determine username based on mode
+            username = parsed_args.username
+            if parsed_args.type == "admin":
+                if username is None:
+                    username = "admin@ragflow.io"
+
+            if remaining_args:
+                if remaining_args[0] == "command":
+                    command_str = ' '.join(remaining_args[1:]) + ';'
+                    auth = True
+                    if len(remaining_args) > 1 and remaining_args[1] == "register":
+                        auth = False
+                    else:
+                        if username is None:
+                            print("Error: username (-u) is required in user mode")
+                            return {"error": "Username required"}
+                    return {
+                        "host": parsed_args.host,
+                        "port": parsed_args.port,
+                        "password": parsed_args.password,
+                        "type": parsed_args.type,
+                        "username": username,
+                        "command": command_str,
+                        "auth": auth
+                    }
+                else:
+                    return {"error": "Invalid command"}
+            else:
+                auth = True
+                if username is None:
+                    auth = False
+                return {
+                    "host": parsed_args.host,
+                    "port": parsed_args.port,
+                    "type": parsed_args.type,
+                    "username": username,
+                    "auth": auth
+                }
+        except SystemExit:
+            return {"error": "Invalid connection arguments"}
+
+    def execute_command(self, parsed_command: Dict[str, Any]):
+        command_dict: dict
+        if isinstance(parsed_command, Tree):
+            command_dict = parsed_command.children[0]
+        else:
+            if parsed_command["type"] == "error":
+                print(f"Error: {parsed_command['message']}")
+                return
+            else:
+                command_dict = parsed_command
+
+        # print(f"Parsed command: {command_dict}")
+        run_command(self.ragflow_client, command_dict)
+
+def main():
+
+    cli = RAGFlowCLI()
+
+    args = cli.parse_connection_args(sys.argv)
+    if "error" in args:
+        print(f"Error: {args['error']}")
+        return
+
+    if "command" in args:
+        # single command mode
+        # for user mode, api key or password is ok
+        # for admin mode, only password
+        if "password" not in args:
+            print("Error: password is missing")
+            return
+
+        cli.run_single_command(args)
+    else:
+        cli.run_interactive(args)
+
+
+if __name__ == "__main__":
+    main()
--- a/admin/client/ragflow_client.py
+++ b/admin/client/ragflow_client.py
--- a/admin/client/user.py
+++ b/admin/client/user.py
@@ -0,0 +1,77 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+from http_client import HttpClient
+
+
+class AuthException(Exception):
+    def __init__(self, message, code=401):
+        super().__init__(message)
+        self.code = code
+        self.message = message
+
+
+def encrypt_password(password_plain: str) -> str:
+    try:
+        import base64
+        from Cryptodome.PublicKey import RSA
+        from Cryptodome.Cipher import PKCS1_v1_5 as Cipher_pkcs1_v1_5
+        def crypt(line):
+            """
+            decrypt(crypt(input_string)) == base64(input_string), which frontend and ragflow_cli use.
+            """
+            pub = "-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArq9XTUSeYr2+N1h3Afl/z8Dse/2yD0ZGrKwx+EEEcdsBLca9Ynmx3nIB5obmLlSfmskLpBo0UACBmB5rEjBp2Q2f3AG3Hjd4B+gNCG6BDaawuDlgANIhGnaTLrIqWrrcm4EMzJOnAOI1fgzJRsOOUEfaS318Eq9OVO3apEyCCt0lOQK6PuksduOjVxtltDav+guVAA068NrPYmRNabVKRNLJpL8w4D44sfth5RvZ3q9t+6RTArpEtc5sh5ChzvqPOzKGMXW83C95TxmXqpbK6olN4RevSfVjEAgCydH6HN6OhtOQEcnrU97r9H0iZOWwbw3pVrZiUkuRD1R56Wzs2wIDAQAB\n-----END PUBLIC KEY-----"
+            rsa_key = RSA.importKey(pub)
+            cipher = Cipher_pkcs1_v1_5.new(rsa_key)
+            password_base64 = base64.b64encode(line.encode('utf-8')).decode("utf-8")
+            encrypted_password = cipher.encrypt(password_base64.encode())
+            return base64.b64encode(encrypted_password).decode('utf-8')
+    except Exception as exc:
+        raise AuthException(
+            "Password encryption unavailable; install pycryptodomex (uv sync --python 3.13 --group test)."
+        ) from exc
+    return crypt(password_plain)
+
+
+def register_user(client: HttpClient, email: str, nickname: str, password: str) -> None:
+    password_enc = encrypt_password(password)
+    payload = {"email": email, "nickname": nickname, "password": password_enc}
+    res = client.request_json("POST", "/users", use_api_base=True, auth_kind=None, json_body=payload)
+    if res.get("code") == 0:
+        return
+    msg = res.get("message", "")
+    if "has already registered" in msg:
+        return
+    raise AuthException(f"Register failed: {msg}")
+
+
+def login_user(client: HttpClient, server_type: str, email: str, password: str) -> str:
+    password_enc = encrypt_password(password)
+    payload = {"email": email, "password": password_enc}
+    if server_type == "admin":
+        response = client.request("POST", "/admin/login", use_api_base=True, auth_kind=None, json_body=payload)
+    else:
+        response = client.request("POST", "/auth/login", use_api_base=True, auth_kind=None, json_body=payload)
+    try:
+        res = response.json()
+    except Exception as exc:
+        raise AuthException(f"Login failed: invalid JSON response ({exc})") from exc
+    if res.get("code") != 0:
+        raise AuthException(f"Login failed: {res.get('message')}")
+    token = response.headers.get("Authorization")
+    if not token:
+        raise AuthException("Login failed: missing Authorization header")
+    return token
--- a/admin/client/uv.lock
+++ b/admin/client/uv.lock
@@ -0,0 +1,224 @@
+version = 1
+revision = 3
+requires-python = "==3.13.*"
+
+[[package]]
+name = "beartype"
+version = "0.22.6"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/88/e2/105ceb1704cb80fe4ab3872529ab7b6f365cf7c74f725e6132d0efcf1560/beartype-0.22.6.tar.gz", hash = "sha256:97fbda69c20b48c5780ac2ca60ce3c1bb9af29b3a1a0216898ffabdd523e48f4", size = 1588975, upload-time = "2025-11-20T04:47:14.736Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/98/c9/ceecc71fe2c9495a1d8e08d44f5f31f5bca1350d5b2e27a4b6265424f59e/beartype-0.22.6-py3-none-any.whl", hash = "sha256:0584bc46a2ea2a871509679278cda992eadde676c01356ab0ac77421f3c9a093", size = 1324807, upload-time = "2025-11-20T04:47:11.837Z" },
+]
+
+[[package]]
+name = "certifi"
+version = "2025.11.12"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/a2/8c/58f469717fa48465e4a50c014a0400602d3c437d7c0c468e17ada824da3a/certifi-2025.11.12.tar.gz", hash = "sha256:d8ab5478f2ecd78af242878415affce761ca6bc54a22a27e026d7c25357c3316", size = 160538, upload-time = "2025-11-12T02:54:51.517Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/70/7d/9bc192684cea499815ff478dfcdc13835ddf401365057044fb721ec6bddb/certifi-2025.11.12-py3-none-any.whl", hash = "sha256:97de8790030bbd5c2d96b7ec782fc2f7820ef8dba6db909ccf95449f2d062d4b", size = 159438, upload-time = "2025-11-12T02:54:49.735Z" },
+]
+
+[[package]]
+name = "charset-normalizer"
+version = "3.4.4"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/13/69/33ddede1939fdd074bce5434295f38fae7136463422fe4fd3e0e89b98062/charset_normalizer-3.4.4.tar.gz", hash = "sha256:94537985111c35f28720e43603b8e7b43a6ecfb2ce1d3058bbe955b73404e21a", size = 129418, upload-time = "2025-10-14T04:42:32.879Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/97/45/4b3a1239bbacd321068ea6e7ac28875b03ab8bc0aa0966452db17cd36714/charset_normalizer-3.4.4-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:e1f185f86a6f3403aa2420e815904c67b2f9ebc443f045edd0de921108345794", size = 208091, upload-time = "2025-10-14T04:41:13.346Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/7d/62/73a6d7450829655a35bb88a88fca7d736f9882a27eacdca2c6d505b57e2e/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6b39f987ae8ccdf0d2642338faf2abb1862340facc796048b604ef14919e55ed", size = 147936, upload-time = "2025-10-14T04:41:14.461Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/89/c5/adb8c8b3d6625bef6d88b251bbb0d95f8205831b987631ab0c8bb5d937c2/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:3162d5d8ce1bb98dd51af660f2121c55d0fa541b46dff7bb9b9f86ea1d87de72", size = 144180, upload-time = "2025-10-14T04:41:15.588Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/91/ed/9706e4070682d1cc219050b6048bfd293ccf67b3d4f5a4f39207453d4b99/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:81d5eb2a312700f4ecaa977a8235b634ce853200e828fbadf3a9c50bab278328", size = 161346, upload-time = "2025-10-14T04:41:16.738Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d5/0d/031f0d95e4972901a2f6f09ef055751805ff541511dc1252ba3ca1f80cf5/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:5bd2293095d766545ec1a8f612559f6b40abc0eb18bb2f5d1171872d34036ede", size = 158874, upload-time = "2025-10-14T04:41:17.923Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/f5/83/6ab5883f57c9c801ce5e5677242328aa45592be8a00644310a008d04f922/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a8a8b89589086a25749f471e6a900d3f662d1d3b6e2e59dcecf787b1cc3a1894", size = 153076, upload-time = "2025-10-14T04:41:19.106Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/75/1e/5ff781ddf5260e387d6419959ee89ef13878229732732ee73cdae01800f2/charset_normalizer-3.4.4-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:bc7637e2f80d8530ee4a78e878bce464f70087ce73cf7c1caf142416923b98f1", size = 150601, upload-time = "2025-10-14T04:41:20.245Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d7/57/71be810965493d3510a6ca79b90c19e48696fb1ff964da319334b12677f0/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:f8bf04158c6b607d747e93949aa60618b61312fe647a6369f88ce2ff16043490", size = 150376, upload-time = "2025-10-14T04:41:21.398Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/e5/d5/c3d057a78c181d007014feb7e9f2e65905a6c4ef182c0ddf0de2924edd65/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:554af85e960429cf30784dd47447d5125aaa3b99a6f0683589dbd27e2f45da44", size = 144825, upload-time = "2025-10-14T04:41:22.583Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/e6/8c/d0406294828d4976f275ffbe66f00266c4b3136b7506941d87c00cab5272/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:74018750915ee7ad843a774364e13a3db91682f26142baddf775342c3f5b1133", size = 162583, upload-time = "2025-10-14T04:41:23.754Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d7/24/e2aa1f18c8f15c4c0e932d9287b8609dd30ad56dbe41d926bd846e22fb8d/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:c0463276121fdee9c49b98908b3a89c39be45d86d1dbaa22957e38f6321d4ce3", size = 150366, upload-time = "2025-10-14T04:41:25.27Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/e4/5b/1e6160c7739aad1e2df054300cc618b06bf784a7a164b0f238360721ab86/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:362d61fd13843997c1c446760ef36f240cf81d3ebf74ac62652aebaf7838561e", size = 160300, upload-time = "2025-10-14T04:41:26.725Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/7a/10/f882167cd207fbdd743e55534d5d9620e095089d176d55cb22d5322f2afd/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:9a26f18905b8dd5d685d6d07b0cdf98a79f3c7a918906af7cc143ea2e164c8bc", size = 154465, upload-time = "2025-10-14T04:41:28.322Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/89/66/c7a9e1b7429be72123441bfdbaf2bc13faab3f90b933f664db506dea5915/charset_normalizer-3.4.4-cp313-cp313-win32.whl", hash = "sha256:9b35f4c90079ff2e2edc5b26c0c77925e5d2d255c42c74fdb70fb49b172726ac", size = 99404, upload-time = "2025-10-14T04:41:29.95Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c4/26/b9924fa27db384bdcd97ab83b4f0a8058d96ad9626ead570674d5e737d90/charset_normalizer-3.4.4-cp313-cp313-win_amd64.whl", hash = "sha256:b435cba5f4f750aa6c0a0d92c541fb79f69a387c91e61f1795227e4ed9cece14", size = 107092, upload-time = "2025-10-14T04:41:31.188Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/af/8f/3ed4bfa0c0c72a7ca17f0380cd9e4dd842b09f664e780c13cff1dcf2ef1b/charset_normalizer-3.4.4-cp313-cp313-win_arm64.whl", hash = "sha256:542d2cee80be6f80247095cc36c418f7bddd14f4a6de45af91dfad36d817bba2", size = 100408, upload-time = "2025-10-14T04:41:32.624Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/0a/4c/925909008ed5a988ccbb72dcc897407e5d6d3bd72410d69e051fc0c14647/charset_normalizer-3.4.4-py3-none-any.whl", hash = "sha256:7a32c560861a02ff789ad905a2fe94e3f840803362c84fecf1851cb4cf3dc37f", size = 53402, upload-time = "2025-10-14T04:42:31.76Z" },
+]
+
+[[package]]
+name = "colorama"
+version = "0.4.6"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d8/53/6f443c9a4a8358a93a6792e2acffb9d9d5cb0a5cfd8802644b7b1c9a02e4/colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44", size = 27697, upload-time = "2022-10-25T02:36:22.414Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335, upload-time = "2022-10-25T02:36:20.889Z" },
+]
+
+[[package]]
+name = "idna"
+version = "3.11"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/6f/6d/0703ccc57f3a7233505399edb88de3cbd678da106337b9fcde432b65ed60/idna-3.11.tar.gz", hash = "sha256:795dafcc9c04ed0c1fb032c2aa73654d8e8c5023a7df64a53f39190ada629902", size = 194582, upload-time = "2025-10-12T14:55:20.501Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008, upload-time = "2025-10-12T14:55:18.883Z" },
+]
+
+[[package]]
+name = "iniconfig"
+version = "2.3.0"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/72/34/14ca021ce8e5dfedc35312d08ba8bf51fdd999c576889fc2c24cb97f4f10/iniconfig-2.3.0.tar.gz", hash = "sha256:c76315c77db068650d49c5b56314774a7804df16fee4402c1f19d6d15d8c4730", size = 20503, upload-time = "2025-10-18T21:55:43.219Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/cb/b1/3846dd7f199d53cb17f49cba7e651e9ce294d8497c8c150530ed11865bb8/iniconfig-2.3.0-py3-none-any.whl", hash = "sha256:f631c04d2c48c52b84d0d0549c99ff3859c98df65b3101406327ecc7d53fbf12", size = 7484, upload-time = "2025-10-18T21:55:41.639Z" },
+]
+
+[[package]]
+name = "lark"
+version = "1.3.1"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/da/34/28fff3ab31ccff1fd4f6c7c7b0ceb2b6968d8ea4950663eadcb5720591a0/lark-1.3.1.tar.gz", hash = "sha256:b426a7a6d6d53189d318f2b6236ab5d6429eaf09259f1ca33eb716eed10d2905", size = 382732, upload-time = "2025-10-27T18:25:56.653Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/82/3d/14ce75ef66813643812f3093ab17e46d3a206942ce7376d31ec2d36229e7/lark-1.3.1-py3-none-any.whl", hash = "sha256:c629b661023a014c37da873b4ff58a817398d12635d3bbb2c5a03be7fe5d1e12", size = 113151, upload-time = "2025-10-27T18:25:54.882Z" },
+]
+
+[[package]]
+name = "packaging"
+version = "25.0"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/a1/d4/1fc4078c65507b51b96ca8f8c3ba19e6a61c8253c72794544580a7b6c24d/packaging-25.0.tar.gz", hash = "sha256:d443872c98d677bf60f6a1f2f8c1cb748e8fe762d2bf9d3148b5599295b0fc4f", size = 165727, upload-time = "2025-04-19T11:48:59.673Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/20/12/38679034af332785aac8774540895e234f4d07f7545804097de4b666afd8/packaging-25.0-py3-none-any.whl", hash = "sha256:29572ef2b1f17581046b3a2227d5c611fb25ec70ca1ba8554b24b0e69331a484", size = 66469, upload-time = "2025-04-19T11:48:57.875Z" },
+]
+
+[[package]]
+name = "pluggy"
+version = "1.6.0"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/f9/e2/3e91f31a7d2b083fe6ef3fa267035b518369d9511ffab804f839851d2779/pluggy-1.6.0.tar.gz", hash = "sha256:7dcc130b76258d33b90f61b658791dede3486c3e6bfb003ee5c9bfb396dd22f3", size = 69412, upload-time = "2025-05-15T12:30:07.975Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/54/20/4d324d65cc6d9205fabedc306948156824eb9f0ee1633355a8f7ec5c66bf/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538, upload-time = "2025-05-15T12:30:06.134Z" },
+]
+
+[[package]]
+name = "pycryptodomex"
+version = "3.23.0"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c9/85/e24bf90972a30b0fcd16c73009add1d7d7cd9140c2498a68252028899e41/pycryptodomex-3.23.0.tar.gz", hash = "sha256:71909758f010c82bc99b0abf4ea12012c98962fbf0583c2164f8b84533c2e4da", size = 4922157, upload-time = "2025-05-17T17:23:41.434Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/2e/00/10edb04777069a42490a38c137099d4b17ba6e36a4e6e28bdc7470e9e853/pycryptodomex-3.23.0-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:7b37e08e3871efe2187bc1fd9320cc81d87caf19816c648f24443483005ff886", size = 2498764, upload-time = "2025-05-17T17:22:21.453Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/6b/3f/2872a9c2d3a27eac094f9ceaa5a8a483b774ae69018040ea3240d5b11154/pycryptodomex-3.23.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:91979028227543010d7b2ba2471cf1d1e398b3f183cb105ac584df0c36dac28d", size = 1643012, upload-time = "2025-05-17T17:22:23.702Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/70/af/774c2e2b4f6570fbf6a4972161adbb183aeeaa1863bde31e8706f123bf92/pycryptodomex-3.23.0-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6b8962204c47464d5c1c4038abeadd4514a133b28748bcd9fa5b6d62e3cec6fa", size = 2187643, upload-time = "2025-05-17T17:22:26.37Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/de/a3/71065b24cb889d537954cedc3ae5466af00a2cabcff8e29b73be047e9a19/pycryptodomex-3.23.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a33986a0066860f7fcf7c7bd2bc804fa90e434183645595ae7b33d01f3c91ed8", size = 2273762, upload-time = "2025-05-17T17:22:28.313Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c9/0b/ff6f43b7fbef4d302c8b981fe58467b8871902cdc3eb28896b52421422cc/pycryptodomex-3.23.0-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:c7947ab8d589e3178da3d7cdeabe14f841b391e17046954f2fbcd941705762b5", size = 2313012, upload-time = "2025-05-17T17:22:30.57Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/02/de/9d4772c0506ab6da10b41159493657105d3f8bb5c53615d19452afc6b315/pycryptodomex-3.23.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c25e30a20e1b426e1f0fa00131c516f16e474204eee1139d1603e132acffc314", size = 2186856, upload-time = "2025-05-17T17:22:32.819Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/28/ad/8b30efcd6341707a234e5eba5493700a17852ca1ac7a75daa7945fcf6427/pycryptodomex-3.23.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:da4fa650cef02db88c2b98acc5434461e027dce0ae8c22dd5a69013eaf510006", size = 2347523, upload-time = "2025-05-17T17:22:35.386Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/0f/02/16868e9f655b7670dbb0ac4f2844145cbc42251f916fc35c414ad2359849/pycryptodomex-3.23.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:58b851b9effd0d072d4ca2e4542bf2a4abcf13c82a29fd2c93ce27ee2a2e9462", size = 2272825, upload-time = "2025-05-17T17:22:37.632Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ca/18/4ca89ac737230b52ac8ffaca42f9c6f1fd07c81a6cd821e91af79db60632/pycryptodomex-3.23.0-cp313-cp313t-win32.whl", hash = "sha256:a9d446e844f08299236780f2efa9898c818fe7e02f17263866b8550c7d5fb328", size = 1772078, upload-time = "2025-05-17T17:22:40Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/73/34/13e01c322db027682e00986873eca803f11c56ade9ba5bbf3225841ea2d4/pycryptodomex-3.23.0-cp313-cp313t-win_amd64.whl", hash = "sha256:bc65bdd9fc8de7a35a74cab1c898cab391a4add33a8fe740bda00f5976ca4708", size = 1803656, upload-time = "2025-05-17T17:22:42.139Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/54/68/9504c8796b1805d58f4425002bcca20f12880e6fa4dc2fc9a668705c7a08/pycryptodomex-3.23.0-cp313-cp313t-win_arm64.whl", hash = "sha256:c885da45e70139464f082018ac527fdaad26f1657a99ee13eecdce0f0ca24ab4", size = 1707172, upload-time = "2025-05-17T17:22:44.704Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/dd/9c/1a8f35daa39784ed8adf93a694e7e5dc15c23c741bbda06e1d45f8979e9e/pycryptodomex-3.23.0-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:06698f957fe1ab229a99ba2defeeae1c09af185baa909a31a5d1f9d42b1aaed6", size = 2499240, upload-time = "2025-05-17T17:22:46.953Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/7a/62/f5221a191a97157d240cf6643747558759126c76ee92f29a3f4aee3197a5/pycryptodomex-3.23.0-cp37-abi3-macosx_10_9_x86_64.whl", hash = "sha256:b2c2537863eccef2d41061e82a881dcabb04944c5c06c5aa7110b577cc487545", size = 1644042, upload-time = "2025-05-17T17:22:49.098Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/8c/fd/5a054543c8988d4ed7b612721d7e78a4b9bf36bc3c5ad45ef45c22d0060e/pycryptodomex-3.23.0-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:43c446e2ba8df8889e0e16f02211c25b4934898384c1ec1ec04d7889c0333587", size = 2186227, upload-time = "2025-05-17T17:22:51.139Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c8/a9/8862616a85cf450d2822dbd4fff1fcaba90877907a6ff5bc2672cafe42f8/pycryptodomex-3.23.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f489c4765093fb60e2edafdf223397bc716491b2b69fe74367b70d6999257a5c", size = 2272578, upload-time = "2025-05-17T17:22:53.676Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/46/9f/bda9c49a7c1842820de674ab36c79f4fbeeee03f8ff0e4f3546c3889076b/pycryptodomex-3.23.0-cp37-abi3-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:bdc69d0d3d989a1029df0eed67cc5e8e5d968f3724f4519bd03e0ec68df7543c", size = 2312166, upload-time = "2025-05-17T17:22:56.585Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/03/cc/870b9bf8ca92866ca0186534801cf8d20554ad2a76ca959538041b7a7cf4/pycryptodomex-3.23.0-cp37-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:6bbcb1dd0f646484939e142462d9e532482bc74475cecf9c4903d4e1cd21f003", size = 2185467, upload-time = "2025-05-17T17:22:59.237Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/96/e3/ce9348236d8e669fea5dd82a90e86be48b9c341210f44e25443162aba187/pycryptodomex-3.23.0-cp37-abi3-musllinux_1_2_i686.whl", hash = "sha256:8a4fcd42ccb04c31268d1efeecfccfd1249612b4de6374205376b8f280321744", size = 2346104, upload-time = "2025-05-17T17:23:02.112Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/a5/e9/e869bcee87beb89040263c416a8a50204f7f7a83ac11897646c9e71e0daf/pycryptodomex-3.23.0-cp37-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:55ccbe27f049743a4caf4f4221b166560d3438d0b1e5ab929e07ae1702a4d6fd", size = 2271038, upload-time = "2025-05-17T17:23:04.872Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/8d/67/09ee8500dd22614af5fbaa51a4aee6e342b5fa8aecf0a6cb9cbf52fa6d45/pycryptodomex-3.23.0-cp37-abi3-win32.whl", hash = "sha256:189afbc87f0b9f158386bf051f720e20fa6145975f1e76369303d0f31d1a8d7c", size = 1771969, upload-time = "2025-05-17T17:23:07.115Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/69/96/11f36f71a865dd6df03716d33bd07a67e9d20f6b8d39820470b766af323c/pycryptodomex-3.23.0-cp37-abi3-win_amd64.whl", hash = "sha256:52e5ca58c3a0b0bd5e100a9fbc8015059b05cffc6c66ce9d98b4b45e023443b9", size = 1803124, upload-time = "2025-05-17T17:23:09.267Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/f9/93/45c1cdcbeb182ccd2e144c693eaa097763b08b38cded279f0053ed53c553/pycryptodomex-3.23.0-cp37-abi3-win_arm64.whl", hash = "sha256:02d87b80778c171445d67e23d1caef279bf4b25c3597050ccd2e13970b57fd51", size = 1707161, upload-time = "2025-05-17T17:23:11.414Z" },
+]
+
+[[package]]
+name = "pygments"
+version = "2.19.2"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/b0/77/a5b8c569bf593b0140bde72ea885a803b82086995367bf2037de0159d924/pygments-2.19.2.tar.gz", hash = "sha256:636cb2477cec7f8952536970bc533bc43743542f70392ae026374600add5b887", size = 4968631, upload-time = "2025-06-21T13:39:12.283Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, upload-time = "2025-06-21T13:39:07.939Z" },
+]
+
+[[package]]
+name = "pytest"
+version = "9.0.1"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+dependencies = [
+    { name = "colorama", marker = "sys_platform == 'win32'" },
+    { name = "iniconfig" },
+    { name = "packaging" },
+    { name = "pluggy" },
+    { name = "pygments" },
+]
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/07/56/f013048ac4bc4c1d9be45afd4ab209ea62822fb1598f40687e6bf45dcea4/pytest-9.0.1.tar.gz", hash = "sha256:3e9c069ea73583e255c3b21cf46b8d3c56f6e3a1a8f6da94ccb0fcf57b9d73c8", size = 1564125, upload-time = "2025-11-12T13:05:09.333Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/0b/8b/6300fb80f858cda1c51ffa17075df5d846757081d11ab4aa35cef9e6258b/pytest-9.0.1-py3-none-any.whl", hash = "sha256:67be0030d194df2dfa7b556f2e56fb3c3315bd5c8822c6951162b92b32ce7dad", size = 373668, upload-time = "2025-11-12T13:05:07.379Z" },
+]
+
+[[package]]
+name = "ragflow-cli"
+version = "0.26.2"
+source = { virtual = "." }
+dependencies = [
+    { name = "beartype" },
+    { name = "lark" },
+    { name = "pycryptodomex" },
+    { name = "requests" },
+    { name = "requests-toolbelt" },
+]
+
+[package.dev-dependencies]
+test = [
+    { name = "pytest" },
+    { name = "requests" },
+]
+
+[package.metadata]
+requires-dist = [
+    { name = "beartype", specifier = ">=0.20.0,<1.0.0" },
+    { name = "lark", specifier = ">=1.1.0" },
+    { name = "pycryptodomex", specifier = ">=3.10.0" },
+    { name = "requests", specifier = ">=2.30.0,<3.0.0" },
+    { name = "requests-toolbelt", specifier = ">=1.0.0" },
+]
+
+[package.metadata.requires-dev]
+test = [
+    { name = "pytest", specifier = ">=8.3.5" },
+    { name = "requests", specifier = ">=2.32.3" },
+]
+
+[[package]]
+name = "requests"
+version = "2.32.5"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+dependencies = [
+    { name = "certifi" },
+    { name = "charset-normalizer" },
+    { name = "idna" },
+    { name = "urllib3" },
+]
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c9/74/b3ff8e6c8446842c3f5c837e9c3dfcfe2018ea6ecef224c710c85ef728f4/requests-2.32.5.tar.gz", hash = "sha256:dbba0bac56e100853db0ea71b82b4dfd5fe2bf6d3754a8893c3af500cec7d7cf", size = 134517, upload-time = "2025-08-18T20:46:02.573Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/1e/db/4254e3eabe8020b458f1a747140d32277ec7a271daf1d235b70dc0b4e6e3/requests-2.32.5-py3-none-any.whl", hash = "sha256:2462f94637a34fd532264295e186976db0f5d453d1cdd31473c85a6a161affb6", size = 64738, upload-time = "2025-08-18T20:46:00.542Z" },
+]
+
+[[package]]
+name = "requests-toolbelt"
+version = "1.0.0"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+dependencies = [
+    { name = "requests" },
+]
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/f3/61/d7545dafb7ac2230c70d38d31cbfe4cc64f7144dc41f6e4e4b78ecd9f5bb/requests-toolbelt-1.0.0.tar.gz", hash = "sha256:7681a0a3d047012b5bdc0ee37d7f8f07ebe76ab08caeccfc3921ce23c88d5bc6", size = 206888, upload-time = "2023-05-01T04:11:33.229Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/3f/51/d4db610ef29373b879047326cbf6fa98b6c1969d6f6dc423279de2b1be2c/requests_toolbelt-1.0.0-py2.py3-none-any.whl", hash = "sha256:cccfdd665f0a24fcf4726e690f65639d272bb0637b9b92dfd91a5568ccf6bd06", size = 54481, upload-time = "2023-05-01T04:11:28.427Z" },
+]
+
+[[package]]
+name = "urllib3"
+version = "2.6.3"
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c7/24/5f1b3bdffd70275f6661c76461e25f024d5a38a46f04aaca912426a2b1d3/urllib3-2.6.3.tar.gz", hash = "sha256:1b62b6884944a57dbe321509ab94fd4d3b307075e0c2eae991ac71ee15ad38ed", size = 435556, upload-time = "2026-01-07T16:24:43.925Z" }
+wheels = [
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/39/08/aaaad47bc4e9dc8c725e68f9d04865dbcb2052843ff09c97b08904852d84/urllib3-2.6.3-py3-none-any.whl", hash = "sha256:bf272323e553dfb2e87d9bfd225ca7b0f467b919d7bbd355436d3fd37cb0acd4", size = 131584, upload-time = "2026-01-07T16:24:42.685Z" },
+]
--- a/admin/server/admin_server.py
+++ b/admin/server/admin_server.py
@@ -14,14 +14,18 @@
 #  limitations under the License.
 #

+import time
+start_ts = time.time()
+
 import os
 import signal
 import logging
-import time
 import threading
-import traceback
-from werkzeug.serving import run_simple
+import faulthandler
+
 from flask import Flask
+from flask_login import LoginManager
+from werkzeug.serving import run_simple
 from routes import admin_bp
 from common.log_utils import init_root_logger
 from common.constants import SERVICE_CONF
@@ -30,12 +34,12 @@ from common import settings
 from config import load_configurations, SERVICE_CONFIGS
 from auth import init_default_admin, setup_auth
 from flask_session import Session
-from flask_login import LoginManager
 from common.versions import get_ragflow_version

 stop_event = threading.Event()

 if __name__ == '__main__':
+    faulthandler.enable()
    init_root_logger("admin_service")
    logging.info(r"""
        ____  ___   ______________                 ___       __          _     
@@ -53,7 +57,7 @@ if __name__ == '__main__':
        os.environ.get("MAX_CONTENT_LENGTH", 1024 * 1024 * 1024)
    )
    Session(app)
-    logging.info(f'RAGFlow version: {get_ragflow_version()}')
+    logging.info(f'RAGFlow admin version: {get_ragflow_version()}')
    show_configs()
    login_manager = LoginManager()
    login_manager.init_app(app)
@@ -63,17 +67,17 @@ if __name__ == '__main__':
    SERVICE_CONFIGS.configs = load_configurations(SERVICE_CONF)

    try:
-        logging.info("RAGFlow Admin service start...")
+        logging.info(f"RAGFlow admin is ready after {time.time() - start_ts:.2f}s of initialization.")
        run_simple(
            hostname="0.0.0.0",
            port=9381,
            application=app,
            threaded=True,
            use_reloader=False,
-            use_debugger=True,
+            use_debugger=False,
        )
-    except Exception:
-        traceback.print_exc()
+    except Exception as e:
+        logging.exception(f"Unhandled exception: {e}")
        stop_event.set()
        time.sleep(1)
        os.kill(os.getpid(), signal.SIGKILL)
--- a/admin/server/auth.py
+++ b/admin/server/auth.py
@@ -19,36 +19,54 @@ import logging
 import uuid
 from functools import wraps
 from datetime import datetime
-from flask import request, jsonify
+
+from flask import jsonify, request
 from flask_login import current_user, login_user
-from itsdangerous.url_safe import URLSafeTimedSerializer as Serializer

 from api.common.exceptions import AdminException, UserNotFoundError
 from api.common.base64 import encode_to_base64
 from api.db.services import UserService
+from api.db import UserTenantRole
+from api.db.services.user_service import TenantService, UserTenantService
 from common.constants import ActiveEnum, StatusEnum
 from api.utils.crypt import decrypt
 from common.misc_utils import get_uuid
 from common.time_utils import current_timestamp, datetime_format, get_format_time
-from common.connection_utils import construct_response
+from common.connection_utils import sync_construct_response
 from common import settings


 def setup_auth(login_manager):
    @login_manager.request_loader
    def load_user(web_request):
-        jwt = Serializer(secret_key=settings.SECRET_KEY)
+        # Authorization header contains JWT-encoded access token
+        # First decode JWT to get the UUID, then query database
+        from itsdangerous.url_safe import URLSafeTimedSerializer as Serializer
+        from common import settings
+
        authorization = web_request.headers.get("Authorization")
        if authorization:
            try:
-                access_token = str(jwt.loads(authorization))
+                # Strip "Bearer " prefix if present
+                jwt_token = authorization
+                if jwt_token.startswith("Bearer "):
+                    jwt_token = jwt_token[7:]

-                if not access_token or not access_token.strip():
-                    logging.warning("Authentication attempt with empty access token")
+                jwt_token = jwt_token.strip()
+                if not jwt_token:
+                    logging.warning("Authentication attempt with empty JWT token")
                    return None

-                # Access tokens should be UUIDs (32 hex characters)
-                if len(access_token.strip()) < 32:
+                # Decode JWT to get the UUID access_token
+                jwt = Serializer(secret_key=settings.get_secret_key())
+                access_token = str(jwt.loads(jwt_token))
+
+                if not access_token or not access_token.strip():
+                    logging.warning("Authentication attempt with empty access token after JWT decode")
+                    return None
+
+                # Access tokens stored in database are UUIDs (32 hex characters)
+                if len(access_token) < 32:
                    logging.warning(f"Authentication attempt with invalid token format: {len(access_token)} chars")
                    return None

@@ -84,8 +102,43 @@ def init_default_admin():
        }
        if not UserService.save(**default_admin):
            raise AdminException("Can't init admin.", 500)
+        add_tenant_for_admin(default_admin, UserTenantRole.OWNER)
    elif not any([u.is_active == ActiveEnum.ACTIVE.value for u in users]):
        raise AdminException("No active admin. Please update 'is_active' in db manually.", 500)
+    else:
+        default_admin_rows = [u for u in users if u.email == "admin@ragflow.io"]
+        if default_admin_rows:
+            default_admin = default_admin_rows[0].to_dict()
+            exist, default_admin_tenant = TenantService.get_by_id(default_admin["id"])
+            if not exist:
+                add_tenant_for_admin(default_admin, UserTenantRole.OWNER)
+
+
+def add_tenant_for_admin(user_info: dict, role: str):
+
+    tenant = {
+        "id": user_info["id"],
+        "name": user_info["nickname"] + "'s Kingdom",
+        "llm_id": settings.CHAT_MDL,
+        "embd_id": settings.EMBEDDING_MDL,
+        "asr_id": settings.ASR_MDL,
+        "parser_ids": settings.PARSERS,
+        "img2txt_id": settings.IMAGE2TEXT_MDL,
+        "rerank_id": settings.RERANK_MDL,
+    }
+    usr_tenant = {
+        "tenant_id": user_info["id"],
+        "user_id": user_info["id"],
+        "invited_by": user_info["id"],
+        "role": role
+    }
+
+    # tenant_llm = get_init_tenant_llm(user_info["id"])
+    TenantService.insert(**tenant)
+    UserTenantService.insert(**usr_tenant)
+    # TenantLLMService.insert_many(tenant_llm)
+    logging.info(
+        f"Added tenant for email: {user_info['email']}. A default tenant has been set; changing the default models after login is strongly recommended.")


 def check_admin_auth(func):
@@ -129,7 +182,7 @@ def login_admin(email: str, password: str):
    user.last_login_time = get_format_time()
    user.save()
    msg = "Welcome back!"
-    return construct_response(data=resp, auth=user.get_id(), message=msg)
+    return sync_construct_response(data=resp, auth=user.get_id(), message=msg)


 def check_admin(username: str, password: str):
@@ -169,17 +222,17 @@ def login_verify(f):
        username = auth.parameters['username']
        password = auth.parameters['password']
        try:
-            if check_admin(username, password) is False:
+            if not check_admin(username, password):
                return jsonify({
                    "code": 500,
                    "message": "Access denied",
                    "data": None
                }), 200
-        except Exception as e:
-            error_msg = str(e)
+        except Exception:
+            logging.exception("An error occurred during admin login verification.")
            return jsonify({
                "code": 500,
-                "message": error_msg
+                "message": "An internal server error occurred."
            }), 200

        return f(*args, **kwargs)
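
The `login_verify` change above stops echoing `str(e)` to the client and logs the traceback instead. A minimal, framework-free sketch of that pattern follows; the decorator name `hide_internal_errors` and the `verify` function are hypothetical and only illustrate the idea, they are not part of this patch.

```python
import functools
import logging


def hide_internal_errors(func):
    """Log unexpected exceptions and return a generic error payload instead."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # The full traceback is written to the server log only.
            logging.exception("Unhandled error in %s", func.__name__)
            # The caller sees a fixed message with no internal detail.
            return {"code": 500, "message": "An internal server error occurred."}

    return wrapper


@hide_internal_errors
def verify(username: str, password: str) -> dict:
    # Hypothetical check whose failure message contains sensitive detail.
    if not username:
        raise ValueError("db lookup failed: host=10.0.0.5 user=root")
    return {"code": 0, "message": "ok"}
```

Calling `verify("", "pw")` returns the generic payload, while the detailed message appears only in the log output.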
--- a/admin/server/config.py
+++ b/admin/server/config.py
@@ -25,8 +25,21 @@ from common.config_utils import read_config
 from urllib.parse import urlparse


+class BaseConfig(BaseModel):
+    id: int
+    name: str
+    host: str
+    port: int
+    service_type: str
+    detail_func_name: str
+
+    def to_dict(self) -> dict[str, Any]:
+        return {'id': self.id, 'name': self.name, 'host': self.host, 'port': self.port,
+                'service_type': self.service_type}
+
+
 class ServiceConfigs:
-    configs = dict
+    configs: list[BaseConfig]

    def __init__(self):
        self.configs = []
@@ -45,19 +58,6 @@ class ServiceType(Enum):
    FILE_STORE = "file_store"


-class BaseConfig(BaseModel):
-    id: int
-    name: str
-    host: str
-    port: int
-    service_type: str
-    detail_func_name: str
-
-    def to_dict(self) -> dict[str, Any]:
-        return {'id': self.id, 'name': self.name, 'host': self.host, 'port': self.port,
-                'service_type': self.service_type}
-
-
 class MetaConfig(BaseConfig):
    meta_type: str

@@ -227,7 +227,7 @@ def load_configurations(config_path: str) -> list[BaseConfig]:
    ragflow_count = 0
    id_count = 0
    for k, v in raw_configs.items():
-        match (k):
+        match k:
            case "ragflow":
                name: str = f'ragflow_{ragflow_count}'
                host: str = v['host']
@@ -264,6 +264,19 @@ def load_configurations(config_path: str) -> list[BaseConfig]:
                                        db_name=database, detail_func_name="get_infinity_status")
                configurations.append(config)
                id_count += 1
+            case "minio_0":
+                name: str = 'minio_0'
+                url = v['host']
+                parts = url.split(':', 1)
+                host = parts[0]
+                port = int(parts[1])
+                user = v.get('user')
+                password = v.get('password')
+                config = MinioConfig(id=id_count, name=name, host=host, port=port, user=user, password=password,
+                                     service_type="file_store",
+                                     store_type="minio", detail_func_name="check_minio_alive")
+                configurations.append(config)
+                id_count += 1
            case "minio":
                name: str = 'minio'
                url = v['host']
@@ -310,6 +323,14 @@ def load_configurations(config_path: str) -> list[BaseConfig]:
                                            service_type="task_executor", detail_func_name="check_task_executor_alive")
                configurations.append(config)
                id_count += 1
+            case "rabbitmq":
+                name: str = 'rabbitmq'
+                host: str = v.get('host')
+                port: int = v.get('port')
+                config = RabbitMQConfig(id=id_count, name=name, host=host, port=port,
+                                        service_type="message_queue", mq_type="rabbitmq", detail_func_name="check_rabbitmq_alive")
+                configurations.append(config)
+                id_count += 1
            case _:
                logging.warning(f"Unknown configuration key: {k}")
                continue
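
The `minio` and `minio_0` cases above split `v['host']` on the first `:` and call `int(parts[1])`, which raises `IndexError` when the port is missing. A hedged sketch of a more defensive parser is shown below; `split_host_port` and its `default_port` parameter are hypothetical names, not something this patch defines, and IPv6 literals are deliberately out of scope.

```python
from typing import Optional, Tuple


def split_host_port(address: str, default_port: Optional[int] = None) -> Tuple[str, int]:
    """Split 'host:port' into (host, port), optionally falling back to a default port."""
    # partition() never raises; sep is '' when no ':' is present.
    host, sep, port_text = address.partition(":")
    if not host:
        raise ValueError(f"missing host in address: {address!r}")
    if not sep or not port_text:
        if default_port is None:
            raise ValueError(f"missing port in address: {address!r}")
        return host, default_port
    # int() raises ValueError for non-numeric ports, which is the desired failure mode.
    return host, int(port_text)
```

With this helper, `split_host_port("minio:9000")` yields `("minio", 9000)`, and an address without a port either uses the supplied default or fails with a descriptive `ValueError`.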
--- a/admin/server/responses.py
+++ b/admin/server/responses.py
@@ -13,8 +13,6 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
-
-
 from flask import jsonify


--- a/admin/server/routes.py
+++ b/admin/server/routes.py
@@ -15,24 +15,35 @@
 #

 import secrets
+import logging
+from typing import Any

-from flask import Blueprint, request
-from flask_login import current_user, logout_user, login_required
+from common.time_utils import current_timestamp, datetime_format
+from datetime import datetime
+from flask import Blueprint, Response, request
+from flask_login import current_user, login_required, logout_user

 from auth import login_verify, login_admin, check_admin_auth
 from responses import success_response, error_response
-from services import UserMgr, ServiceMgr, UserServiceMgr
+from services import UserMgr, ServiceMgr, UserServiceMgr, SettingsMgr, ConfigMgr, EnvironmentsMgr, SandboxMgr
 from roles import RoleMgr
 from api.common.exceptions import AdminException
 from common.versions import get_ragflow_version
+from api.utils.api_utils import generate_confirmation_token
+from common.log_utils import get_log_levels, set_log_level

-admin_bp = Blueprint('admin', __name__, url_prefix='/api/v1/admin')
+admin_bp = Blueprint("admin", __name__, url_prefix="/api/v1/admin")


-@admin_bp.route('/login', methods=['POST'])
+@admin_bp.route("/ping", methods=["GET"])
+def ping():
+    return success_response(message="pong")
+
+
+@admin_bp.route("/login", methods=["POST"])
 def login():
    if not request.json:
-        return error_response('Authorize admin failed.' ,400)
+        return error_response("Authorize admin failed.", 400)
    try:
        email = request.json.get("email", "")
        password = request.json.get("password", "")
@@ -41,7 +52,7 @@ def login():
        return error_response(str(e), 500)


-@admin_bp.route('/logout', methods=['GET'])
+@admin_bp.route("/logout", methods=["GET"])
@login_required
 def logout():
    try:
@@ -53,7 +64,7 @@ def logout():
        return error_response(str(e), 500)


-@admin_bp.route('/auth', methods=['GET'])
+@admin_bp.route("/auth", methods=["GET"])
@login_verify
 def auth_admin():
    try:
@@ -62,7 +73,7 @@ def auth_admin():
        return error_response(str(e), 500)


-@admin_bp.route('/users', methods=['GET'])
+@admin_bp.route("/users", methods=["GET"])
@login_required
@check_admin_auth
 def list_users():
@@ -73,18 +84,18 @@ def list_users():
        return error_response(str(e), 500)


-@admin_bp.route('/users', methods=['POST'])
+@admin_bp.route("/users", methods=["POST"])
@login_required
@check_admin_auth
 def create_user():
    try:
        data = request.get_json()
-        if not data or 'username' not in data or 'password' not in data:
+        if not data or "username" not in data or "password" not in data:
            return error_response("Username and password are required", 400)

-        username = data['username']
-        password = data['password']
-        role = data.get('role', 'user')
+        username = data["username"]
+        password = data["password"]
+        role = data.get("role", "user")

        res = UserMgr.create_user(username, password, role)
        if res["success"]:
@@ -100,7 +111,7 @@ def create_user():
        return error_response(str(e))


-@admin_bp.route('/users/<username>', methods=['DELETE'])
+@admin_bp.route("/users/<username>", methods=["DELETE"])
@login_required
@check_admin_auth
 def delete_user(username):
@@ -117,16 +128,16 @@ def delete_user(username):
        return error_response(str(e), 500)


-@admin_bp.route('/users/<username>/password', methods=['PUT'])
+@admin_bp.route("/users/<username>/password", methods=["PUT"])
@login_required
@check_admin_auth
 def change_password(username):
    try:
        data = request.get_json()
-        if not data or 'new_password' not in data:
+        if not data or "new_password" not in data:
            return error_response("New password is required", 400)

-        new_password = data['new_password']
+        new_password = data["new_password"]
        msg = UserMgr.update_user_password(username, new_password)
        return success_response(None, msg)

@@ -136,15 +147,15 @@ def change_password(username):
        return error_response(str(e), 500)


-@admin_bp.route('/users/<username>/activate', methods=['PUT'])
+@admin_bp.route("/users/<username>/activate", methods=["PUT"])
@login_required
@check_admin_auth
 def alter_user_activate_status(username):
    try:
        data = request.get_json()
-        if not data or 'activate_status' not in data:
+        if not data or "activate_status" not in data:
            return error_response("Activation status is required", 400)
-        activate_status = data['activate_status']
+        activate_status = data["activate_status"]
        msg = UserMgr.update_user_activate_status(username, activate_status)
        return success_response(None, msg)
    except AdminException as e:
@@ -153,7 +164,39 @@ def alter_user_activate_status(username):
        return error_response(str(e), 500)


-@admin_bp.route('/users/<username>', methods=['GET'])
+@admin_bp.route("/users/<username>/admin", methods=["PUT"])
+@login_required
+@check_admin_auth
+def grant_admin(username):
+    try:
+        if current_user.email == username:
+            return error_response(f"can't grant current user: {username}", 409)
+        msg = UserMgr.grant_admin(username)
+        return success_response(None, msg)
+
+    except AdminException as e:
+        return error_response(e.message, e.code)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/users/<username>/admin", methods=["DELETE"])
+@login_required
+@check_admin_auth
+def revoke_admin(username):
+    try:
+        if current_user.email == username:
+            return error_response(f"can't revoke admin from current user: {username}", 409)
+        msg = UserMgr.revoke_admin(username)
+        return success_response(None, msg)
+
+    except AdminException as e:
+        return error_response(e.message, e.code)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/users/<username>", methods=["GET"])
@login_required
@check_admin_auth
 def get_user_details(username):
@@ -167,7 +210,7 @@ def get_user_details(username):
        return error_response(str(e), 500)


-@admin_bp.route('/users/<username>/datasets', methods=['GET'])
+@admin_bp.route("/users/<username>/datasets", methods=["GET"])
@login_required
@check_admin_auth
 def get_user_datasets(username):
@@ -181,7 +224,7 @@ def get_user_datasets(username):
        return error_response(str(e), 500)


-@admin_bp.route('/users/<username>/agents', methods=['GET'])
+@admin_bp.route("/users/<username>/agents", methods=["GET"])
@login_required
@check_admin_auth
 def get_user_agents(username):
@@ -195,7 +238,7 @@ def get_user_agents(username):
        return error_response(str(e), 500)


-@admin_bp.route('/services', methods=['GET'])
+@admin_bp.route("/services", methods=["GET"])
@login_required
@check_admin_auth
 def get_services():
@@ -206,7 +249,7 @@ def get_services():
        return error_response(str(e), 500)


-@admin_bp.route('/service_types/<service_type>', methods=['GET'])
+@admin_bp.route("/service_types/<service_type>", methods=["GET"])
@login_required
@check_admin_auth
 def get_services_by_type(service_type_str):
@@ -217,7 +260,7 @@ def get_services_by_type(service_type_str):
        return error_response(str(e), 500)


-@admin_bp.route('/services/<service_id>', methods=['GET'])
+@admin_bp.route("/services/<service_id>", methods=["GET"])
@login_required
@check_admin_auth
 def get_service(service_id):
@@ -228,7 +271,7 @@ def get_service(service_id):
        return error_response(str(e), 500)


-@admin_bp.route('/services/<service_id>', methods=['DELETE'])
+@admin_bp.route("/services/<service_id>", methods=["DELETE"])
@login_required
@check_admin_auth
 def shutdown_service(service_id):
@@ -239,7 +282,7 @@ def shutdown_service(service_id):
        return error_response(str(e), 500)


-@admin_bp.route('/services/<service_id>', methods=['PUT'])
+@admin_bp.route("/services/<service_id>", methods=["PUT"])
@login_required
@check_admin_auth
 def restart_service(service_id):
@@ -250,38 +293,38 @@ def restart_service(service_id):
        return error_response(str(e), 500)


-@admin_bp.route('/roles', methods=['POST'])
+@admin_bp.route("/roles", methods=["POST"])
@login_required
@check_admin_auth
 def create_role():
    try:
        data = request.get_json()
-        if not data or 'role_name' not in data:
+        if not data or "role_name" not in data:
            return error_response("Role name is required", 400)
-        role_name: str = data['role_name']
-        description: str = data['description']
+        role_name: str = data["role_name"]
+        description: str = data["description"]
        res = RoleMgr.create_role(role_name, description)
        return success_response(res)
    except Exception as e:
        return error_response(str(e), 500)


-@admin_bp.route('/roles/<role_name>', methods=['PUT'])
+@admin_bp.route("/roles/<role_name>", methods=["PUT"])
@login_required
@check_admin_auth
 def update_role(role_name: str):
    try:
        data = request.get_json()
-        if not data or 'description' not in data:
+        if not data or "description" not in data:
            return error_response("Role description is required", 400)
-        description: str = data['description']
+        description: str = data["description"]
        res = RoleMgr.update_role_description(role_name, description)
        return success_response(res)
    except Exception as e:
        return error_response(str(e), 500)


-@admin_bp.route('/roles/<role_name>', methods=['DELETE'])
+@admin_bp.route("/roles/<role_name>", methods=["DELETE"])
@login_required
@check_admin_auth
 def delete_role(role_name: str):
@@ -292,7 +335,7 @@ def delete_role(role_name: str):
        return error_response(str(e), 500)


-@admin_bp.route('/roles', methods=['GET'])
+@admin_bp.route("/roles", methods=["GET"])
@login_required
@check_admin_auth
 def list_roles():
@@ -303,7 +346,7 @@ def list_roles():
        return error_response(str(e), 500)


-@admin_bp.route('/roles/<role_name>/permission', methods=['GET'])
+@admin_bp.route("/roles/<role_name>/permission", methods=["GET"])
@login_required
@check_admin_auth
 def get_role_permission(role_name: str):
@@ -314,54 +357,54 @@ def get_role_permission(role_name: str):
        return error_response(str(e), 500)


-@admin_bp.route('/roles/<role_name>/permission', methods=['POST'])
+@admin_bp.route("/roles/<role_name>/permission", methods=["POST"])
@login_required
@check_admin_auth
 def grant_role_permission(role_name: str):
    try:
        data = request.get_json()
-        if not data or 'actions' not in data or 'resource' not in data:
+        if not data or "actions" not in data or "resource" not in data:
            return error_response("Permission is required", 400)
-        actions: list = data['actions']
-        resource: str = data['resource']
+        actions: list = data["actions"]
+        resource: str = data["resource"]
        res = RoleMgr.grant_role_permission(role_name, actions, resource)
        return success_response(res)
    except Exception as e:
        return error_response(str(e), 500)


-@admin_bp.route('/roles/<role_name>/permission', methods=['DELETE'])
+@admin_bp.route("/roles/<role_name>/permission", methods=["DELETE"])
@login_required
@check_admin_auth
 def revoke_role_permission(role_name: str):
    try:
        data = request.get_json()
-        if not data or 'actions' not in data or 'resource' not in data:
+        if not data or "actions" not in data or "resource" not in data:
            return error_response("Permission is required", 400)
-        actions: list = data['actions']
-        resource: str = data['resource']
+        actions: list = data["actions"]
+        resource: str = data["resource"]
        res = RoleMgr.revoke_role_permission(role_name, actions, resource)
        return success_response(res)
    except Exception as e:
        return error_response(str(e), 500)


-@admin_bp.route('/users/<user_name>/role', methods=['PUT'])
+@admin_bp.route("/users/<user_name>/role", methods=["PUT"])
@login_required
@check_admin_auth
 def update_user_role(user_name: str):
    try:
        data = request.get_json()
-        if not data or 'role_name' not in data:
+        if not data or "role_name" not in data:
            return error_response("Role name is required", 400)
-        role_name: str = data['role_name']
+        role_name: str = data["role_name"]
        res = RoleMgr.update_user_role(user_name, role_name)
        return success_response(res)
    except Exception as e:
        return error_response(str(e), 500)


-@admin_bp.route('/users/<user_name>/permission', methods=['GET'])
+@admin_bp.route("/users/<user_name>/permission", methods=["GET"])
@login_required
@check_admin_auth
 def get_user_permission(user_name: str):
@@ -371,7 +414,140 @@ def get_user_permission(user_name: str):
    except Exception as e:
        return error_response(str(e), 500)

-@admin_bp.route('/version', methods=['GET'])
+
+@admin_bp.route("/variables", methods=["PUT"])
+@login_required
+@check_admin_auth
+def set_variable():
+    try:
+        data = request.get_json()
+        if not data or "var_name" not in data:
+            return error_response("Var name is required", 400)
+
+        if "var_value" not in data:
+            return error_response("Var value is required", 400)
+        var_name: str = data["var_name"]
+        var_value: str = data["var_value"]
+
+        SettingsMgr.update_by_name(var_name, var_value)
+        return success_response(None, "Set variable successfully")
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/variables", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_variable():
+    try:
+        if request.content_length is None or request.content_length == 0:
+            # list variables
+            res = list(SettingsMgr.get_all())
+            return success_response(res)
+
+        # get var
+        data = request.get_json()
+        if not data or "var_name" not in data:
+            return error_response("Var name is required", 400)
+        var_name: str = data["var_name"]
+        res = SettingsMgr.get_by_name(var_name)
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/configs", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_config():
+    try:
+        res = list(ConfigMgr.get_all())
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/environments", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_environments():
+    try:
+        res = list(EnvironmentsMgr.get_all())
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/users/<username>/keys", methods=["POST"])
+@login_required
+@check_admin_auth
+def generate_user_api_key(username: str) -> tuple[Response, int]:
+    try:
+        user_details: list[dict[str, Any]] = UserMgr.get_user_details(username)
+        if not user_details:
+            return error_response("User not found!", 404)
+        tenants: list[dict[str, Any]] = UserServiceMgr.get_user_tenants(username)
+        if not tenants:
+            return error_response("Tenant not found!", 404)
+        tenant_id: str = tenants[0]["tenant_id"]
+        key: str = generate_confirmation_token()
+        obj: dict[str, Any] = {
+            "tenant_id": tenant_id,
+            "token": key,
+            "beta": generate_confirmation_token().replace("ragflow-", "")[:32],
+            "create_time": current_timestamp(),
+            "create_date": datetime_format(datetime.now()),
+            "update_time": None,
+            "update_date": None,
+        }
+
+        if not UserMgr.save_api_key(obj):
+            return error_response("Failed to generate API key!", 500)
+        return success_response(obj, "API key generated successfully")
+    except AdminException as e:
+        return error_response(e.message, e.code)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/users/<username>/keys", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_user_api_keys(username: str) -> tuple[Response, int]:
+    try:
+        api_keys: list[dict[str, Any]] = UserMgr.get_user_api_key(username)
+        return success_response(api_keys, "Get user API keys")
+    except AdminException as e:
+        return error_response(e.message, e.code)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/users/<username>/keys/<key>", methods=["DELETE"])
+@login_required
+@check_admin_auth
+def delete_user_api_key(username: str, key: str) -> tuple[Response, int]:
+    try:
+        deleted = UserMgr.delete_api_key(username, key)
+        if deleted:
+            return success_response(None, "API key deleted successfully")
+        else:
+            return error_response("API key not found or could not be deleted", 404)
+    except AdminException as e:
+        return error_response(e.message, e.code)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/version", methods=["GET"])
@login_required
@check_admin_auth
 def show_version():
@@ -380,3 +556,136 @@ def show_version():
        return success_response(res)
    except Exception as e:
        return error_response(str(e), 500)
+
+
+@admin_bp.route("/sandbox/providers", methods=["GET"])
+@login_required
+@check_admin_auth
+def list_sandbox_providers():
+    """List all available sandbox providers."""
+    try:
+        res = SandboxMgr.list_providers()
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/sandbox/providers/<provider_id>/schema", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_sandbox_provider_schema(provider_id: str):
+    """Get configuration schema for a specific provider."""
+    try:
+        res = SandboxMgr.get_provider_config_schema(provider_id)
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/sandbox/config", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_sandbox_config():
+    """Get current sandbox configuration."""
+    try:
+        res = SandboxMgr.get_config()
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/sandbox/config", methods=["POST"])
+@login_required
+@check_admin_auth
+def set_sandbox_config():
+    """Set sandbox provider configuration."""
+    try:
+        data = request.get_json()
+        if not data:
+            logging.error("set_sandbox_config: Request body is required")
+            return error_response("Request body is required", 400)
+
+        provider_type = data.get("provider_type")
+        if not provider_type:
+            logging.error("set_sandbox_config: provider_type is required")
+            return error_response("provider_type is required", 400)
+
+        config = data.get("config", {})
+        set_active = data.get("set_active", True)  # Default to True for backward compatibility
+
+        logging.info(f"set_sandbox_config: provider_type={provider_type}, set_active={set_active}")
+        logging.info(f"set_sandbox_config: config keys={list(config.keys())}")
+
+        res = SandboxMgr.set_config(provider_type, config, set_active)
+        return success_response(res, "Sandbox configuration updated successfully")
+    except AdminException as e:
+        logging.exception("set_sandbox_config AdminException")
+        return error_response(str(e), 400)
+    except Exception as e:
+        logging.exception("set_sandbox_config unexpected error")
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/sandbox/test", methods=["POST"])
+@login_required
+@check_admin_auth
+def test_sandbox_connection():
+    """Test connection to sandbox provider."""
+    try:
+        data = request.get_json()
+        if not data:
+            return error_response("Request body is required", 400)
+
+        provider_type = data.get("provider_type")
+        if not provider_type:
+            return error_response("provider_type is required", 400)
+
+        config = data.get("config", {})
+        res = SandboxMgr.test_connection(provider_type, config)
+        return success_response(res)
+    except AdminException as e:
+        return error_response(str(e), 400)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/log_levels", methods=["GET"])
+@login_required
+@check_admin_auth
+def get_logger_levels():
+    """Get current log levels for all packages."""
+    try:
+        res = get_log_levels()
+        return success_response(res, "Get log levels", 0)
+    except Exception as e:
+        return error_response(str(e), 500)
+
+
+@admin_bp.route("/log_levels", methods=["PUT"])
+@login_required
+@check_admin_auth
+def set_logger_level():
+    """Set log level for a package."""
+    try:
+        data = request.get_json()
+        if not data or "pkg_name" not in data or "level" not in data:
+            return error_response("pkg_name and level are required", 400)
+
+        pkg_name = data["pkg_name"]
+        level = data["level"]
+        if not isinstance(pkg_name, str) or not isinstance(level, str):
+            return error_response("pkg_name and level must be strings", 400)
+
+        success = set_log_level(pkg_name, level)
+        if success:
+            return success_response({"pkg_name": pkg_name, "level": level}, "Log level updated successfully")
+        else:
+            return error_response(f"Invalid log level: {level}", 400)
+    except Exception as e:
+        return error_response(str(e), 500)
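
The `set_logger_level` route above relies on `set_log_level` returning a falsy value for an unknown level. The real `common.log_utils.set_log_level` is not shown in this diff, so the following is only a hypothetical stand-in built on the standard `logging` module to illustrate that contract.

```python
import logging

# Accepted level names; anything else is rejected rather than guessed.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def set_log_level(pkg_name: str, level: str) -> bool:
    """Set the level of the logger named pkg_name; return False for unknown levels."""
    level_value = _LEVELS.get(level.strip().upper())
    if level_value is None:
        return False
    logging.getLogger(pkg_name).setLevel(level_value)
    return True
```

Returning a boolean instead of raising lets the route map an invalid level to a 400 response without a dedicated exception type.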
--- a/admin/server/services.py
+++ b/admin/server/services.py
@@ -14,15 +14,22 @@
 #  limitations under the License.
 #

-
+import json
+import os
+import logging
 import re
+from typing import Any
+
 from werkzeug.security import check_password_hash
 from common.constants import ActiveEnum
 from api.db.services import UserService
 from api.db.joint_services.user_account_service import create_new_user, delete_user_data
 from api.db.services.canvas_service import UserCanvasService
-from api.db.services.user_service import TenantService
+from api.db.services.user_service import TenantService, UserTenantService
 from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.system_settings_service import SystemSettingsService
+from api.db.services.api_service import APITokenService
+from api.db.db_models import APIToken
 from api.utils.crypt import decrypt
 from api.utils import health_utils

@@ -36,13 +43,15 @@ class UserMgr:
        users = UserService.get_all_users()
        result = []
        for user in users:
-            result.append({
-                'email': user.email,
-                'nickname': user.nickname,
-                'create_date': user.create_date,
-                'is_active': user.is_active,
-                'is_superuser': user.is_superuser,
-            })
+            result.append(
+                {
+                    "email": user.email,
+                    "nickname": user.nickname,
+                    "create_date": user.create_date,
+                    "is_active": user.is_active,
+                    "is_superuser": user.is_superuser,
+                }
+            )
        return result

    @staticmethod
@@ -51,19 +60,21 @@ class UserMgr:
        users = UserService.query_user_by_email(username)
        result = []
        for user in users:
-            result.append({
-                'avatar': user.avatar,
-                'email': user.email,
-                'language': user.language,
-                'last_login_time': user.last_login_time,
-                'is_active': user.is_active,
-                'is_anonymous': user.is_anonymous,
-                'login_channel': user.login_channel,
-                'status': user.status,
-                'is_superuser': user.is_superuser,
-                'create_date': user.create_date,
-                'update_date': user.update_date
-            })
+            result.append(
+                {
+                    "avatar": user.avatar,
+                    "email": user.email,
+                    "language": user.language,
+                    "last_login_time": user.last_login_time,
+                    "is_active": user.is_active,
+                    "is_anonymous": user.is_anonymous,
+                    "login_channel": user.login_channel,
+                    "status": user.status,
+                    "is_superuser": user.is_superuser,
+                    "create_date": user.create_date,
+                    "update_date": user.update_date,
+                }
+            )
        return result

    @staticmethod
@@ -125,8 +136,8 @@ class UserMgr:
        # format activate_status before handle
        _activate_status = activate_status.lower()
        target_status = {
-            'on': ActiveEnum.ACTIVE.value,
-            'off': ActiveEnum.INACTIVE.value,
+            "on": ActiveEnum.ACTIVE.value,
+            "off": ActiveEnum.INACTIVE.value,
        }.get(_activate_status)
        if not target_status:
            raise AdminException(f"Invalid activate_status: {activate_status}")
@@ -136,9 +147,84 @@ class UserMgr:
        UserService.update_user(usr.id, {"is_active": target_status})
        return f"Turn {_activate_status} user activate status successfully!"

+    @staticmethod
+    def get_user_api_key(username: str) -> list[dict[str, Any]]:
+        # use email to find user. check exist and unique.
+        user_list: list[Any] = UserService.query_user_by_email(username)
+        if not user_list:
+            raise UserNotFoundError(username)
+        elif len(user_list) > 1:
+            raise AdminException(f"More than one user with username '{username}' found!")
+
+        usr: Any = user_list[0]
+        # tenant_id is typically the same as user_id for the owner tenant
+        tenant_id: str = usr.id
+
+        # Query all API keys for this tenant
+        api_keys: Any = APITokenService.query(tenant_id=tenant_id)
+
+        result: list[dict[str, Any]] = []
+        for key in api_keys:
+            result.append(key.to_dict())
+
+        return result
+
+    @staticmethod
+    def save_api_key(api_key: dict[str, Any]) -> bool:
+        return APITokenService.save(**api_key)
+
+    @staticmethod
+    def delete_api_key(username: str, key: str) -> bool:
+        # use email to find user. check exist and unique.
+        user_list: list[Any] = UserService.query_user_by_email(username)
+        if not user_list:
+            raise UserNotFoundError(username)
+        elif len(user_list) > 1:
+            raise AdminException(f"Exist more than 1 user: {username}!")
+
+        usr: Any = user_list[0]
+        # tenant_id is typically the same as user_id for the owner tenant
+        tenant_id: str = usr.id
+
+        # Delete the API key
+        deleted_count: int = APITokenService.filter_delete([APIToken.tenant_id == tenant_id, APIToken.token == key])
+        return deleted_count > 0
+
+    @staticmethod
+    def grant_admin(username: str):
+        # use email to find user. check exist and unique.
+        user_list = UserService.query_user_by_email(username)
+        if not user_list:
+            raise UserNotFoundError(username)
+        elif len(user_list) > 1:
+            raise AdminException(f"Exist more than 1 user: {username}!")
+
+        # Nothing to do if the user is already a superuser.
+        usr = user_list[0]
+        if usr.is_superuser:
+            return f"{usr} is already superuser!"
+        # Set the is_superuser flag.
+        UserService.update_user(usr.id, {"is_superuser": True})
+        return "Grant successfully!"
+
+    @staticmethod
+    def revoke_admin(username: str):
+        # Look up the user by email; exactly one match must exist.
+        user_list = UserService.query_user_by_email(username)
+        if not user_list:
+            raise UserNotFoundError(username)
+        elif len(user_list) > 1:
+            raise AdminException(f"Exist more than 1 user: {username}!")
+        # Nothing to do if the user is not a superuser.
+        usr = user_list[0]
+        if not usr.is_superuser:
+            return f"{usr} isn't superuser, yet!"
+        # Clear the is_superuser flag.
+        UserService.update_user(usr.id, {"is_superuser": False})
+        return "Revoke successfully!"
+

 class UserServiceMgr:
-
    @staticmethod
    def get_user_datasets(username):
        # use email to find user.
@@ -168,34 +254,43 @@ class UserServiceMgr:
        tenant_ids = [m["tenant_id"] for m in tenants]
        # filter permitted agents and owned agents
        res = UserCanvasService.get_all_agents_by_tenant_ids(tenant_ids, usr.id)
-        return [{
-            'title': r['title'],
-            'permission': r['permission'],
-            'canvas_category': r['canvas_category'].split('_')[0],
-            'avatar': r['avatar']
-        } for r in res]
+        return [{"title": r["title"], "permission": r["permission"], "canvas_category": r["canvas_category"].split("_")[0], "avatar": r["avatar"]} for r in res]
+
+    @staticmethod
+    def get_user_tenants(email: str) -> list[dict[str, Any]]:
+        users: list[Any] = UserService.query_user_by_email(email)
+        if not users:
+            raise UserNotFoundError(email)
+        user: Any = users[0]
+
+        tenants: list[dict[str, Any]] = UserTenantService.get_tenants_by_user_id(user.id)
+        return tenants


 class ServiceMgr:
-
    @staticmethod
    def get_all_services():
+        doc_engine = os.getenv("DOC_ENGINE", "elasticsearch")
        result = []
        configs = SERVICE_CONFIGS.configs
        for service_id, config in enumerate(configs):
            config_dict = config.to_dict()
+            if config_dict["service_type"] == "retrieval":
+                if config_dict["extra"]["retrieval_type"] != doc_engine:
+                    continue
            try:
                service_detail = ServiceMgr.get_service_details(service_id)
                if "status" in service_detail:
-                    config_dict['status'] = service_detail['status']
+                    config_dict["status"] = service_detail["status"]
                else:
-                    config_dict['status'] = 'timeout'
-            except Exception:
-                config_dict['status'] = 'timeout'
-            if not config_dict['host']:
-                config_dict['host'] = '-'
-            if not config_dict['port']:
-                config_dict['port'] = '-'
+                    config_dict["status"] = "timeout"
+            except Exception as e:
+                logging.warning(f"Can't get service details, error: {e}")
+                config_dict["status"] = "timeout"
+            if not config_dict["host"]:
+                config_dict["host"] = "-"
+            if not config_dict["port"]:
+                config_dict["port"] = "-"
            result.append(config_dict)
        return result

@@ -205,21 +300,24 @@ class ServiceMgr:

    @staticmethod
    def get_service_details(service_id: int):
-        service_id = int(service_id)
+        service_idx = int(service_id)
        configs = SERVICE_CONFIGS.configs
-        service_config_mapping = {
-            c.id: {
-                'name': c.name,
-                'detail_func_name': c.detail_func_name
-            } for c in configs
-        }
-        service_info = service_config_mapping.get(service_id, {})
-        if not service_info:
-            raise AdminException(f"invalid service_id: {service_id}")
+        if service_idx < 0 or service_idx >= len(configs):
+            raise AdminException(f"invalid service_index: {service_idx}")

-        detail_func = getattr(health_utils, service_info.get('detail_func_name'))
+        service_config = configs[service_idx]
+
+        # Reject a retrieval service whose retrieval_type does not match DOC_ENGINE.
+        doc_engine = os.getenv("DOC_ENGINE", "elasticsearch")
+        if service_config.service_type == "retrieval":
+            if service_config.retrieval_type != doc_engine:
+                raise AdminException(f"invalid service_index: {service_idx}")
+
+        service_info = {"name": service_config.name, "detail_func_name": service_config.detail_func_name}
+
+        detail_func = getattr(health_utils, service_info.get("detail_func_name"))
        res = detail_func()
-        res.update({'service_name': service_info.get('name')})
+        res.update({"service_name": service_info.get("name")})
        return res

    @staticmethod
@@ -229,3 +327,436 @@ class ServiceMgr:
    @staticmethod
    def restart_service(service_id: int):
        raise AdminException("restart_service: not implemented")
+
+
+class SettingsMgr:
+    @staticmethod
+    def _format_setting(setting):
+        return {
+            "data_type": setting.data_type,
+            "name": setting.name,
+            "setting_type": "config",
+            "value": setting.value,
+        }
+
+    @staticmethod
+    def _validate_value(name: str, data_type: str, value: str):
+        data_type = data_type.lower()
+        value = str(value)
+        if data_type == "string":
+            return
+        if data_type == "integer":
+            try:
+                int(value)
+            except ValueError:
+                raise AdminException(f"Invalid integer value for {name}: {value}")
+            return
+        if data_type in {"bool", "boolean"}:
+            if value not in {"true", "false"}:
+                raise AdminException(f"Invalid bool value for {name}: expected true or false")
+            return
+        if data_type == "json":
+            try:
+                json.loads(value)
+            except json.JSONDecodeError:
+                raise AdminException(f"Invalid JSON value for {name}")
+            return
+        raise AdminException(f"Unsupported data type for {name}: {data_type}")
+
+    @staticmethod
+    def _infer_data_type(name: str):
+        if name.startswith("sandbox."):
+            return "json"
+        if name.endswith(".enabled"):
+            return "bool"
+        return "string"
+
+    @staticmethod
+    def get_all():
+        settings = SystemSettingsService.get_all(reverse=False, order_by="name")
+        result = []
+        for setting in settings:
+            result.append(SettingsMgr._format_setting(setting))
+        return result
+
+    @staticmethod
+    def get_by_name(name: str):
+        settings = SystemSettingsService.get_by_name(name)
+        if len(settings) == 0:
+            settings = SystemSettingsService.get_by_name_prefix(name)
+            if len(settings) == 0:
+                raise AdminException(f"Can't get setting: {name}")
+        result = []
+        for setting in settings:
+            result.append(SettingsMgr._format_setting(setting))
+        return result
+
+    @staticmethod
+    def update_by_name(name: str, value: str):
+        settings = SystemSettingsService.get_by_name(name)
+        if len(settings) == 1:
+            setting = settings[0]
+            SettingsMgr._validate_value(name, setting.data_type, value)
+            setting.value = value
+            setting_dict = setting.to_dict()
+            SystemSettingsService.update_by_name(name, setting_dict)
+        elif len(settings) > 1:
+            raise AdminException(f"Can't update more than 1 setting: {name}")
+        else:
+            # Create new setting if it doesn't exist
+
+            # Infer data_type from the setting name, then validate the value against it.
+            data_type = SettingsMgr._infer_data_type(name)
+            SettingsMgr._validate_value(name, data_type, value)
+
+            new_setting = {
+                "name": name,
+                "value": str(value),
+                "source": "admin",
+                "data_type": data_type,
+            }
+            SystemSettingsService.save(**new_setting)
+
+
+class ConfigMgr:
+    @staticmethod
+    def get_all():
+        result = []
+        configs = SERVICE_CONFIGS.configs
+        for config in configs:
+            config_dict = config.to_dict()
+            result.append(config_dict)
+        return result
+
+
+class EnvironmentsMgr:
+    @staticmethod
+    def get_all():
+        result = []
+
+        env_kv = {"env": "DOC_ENGINE", "value": os.getenv("DOC_ENGINE")}
+        result.append(env_kv)
+
+        env_kv = {"env": "DEFAULT_SUPERUSER_EMAIL", "value": os.getenv("DEFAULT_SUPERUSER_EMAIL", "admin@ragflow.io")}
+        result.append(env_kv)
+
+        env_kv = {"env": "DB_TYPE", "value": os.getenv("DB_TYPE", "mysql")}
+        result.append(env_kv)
+
+        env_kv = {"env": "DEVICE", "value": os.getenv("DEVICE", "cpu")}
+        result.append(env_kv)
+
+        env_kv = {"env": "STORAGE_IMPL", "value": os.getenv("STORAGE_IMPL", "MINIO")}
+        result.append(env_kv)
+
+        return result
+
+
+class SandboxMgr:
+    """Manager for sandbox provider configuration and operations."""
+
+    # Provider registry with metadata
+    PROVIDER_REGISTRY = {
+        "local": {
+            "name": "Local",
+            "description": "Execute code directly on the current host process.",
+            "tags": ["local", "host", "minimal"],
+        },
+        "self_managed": {
+            "name": "Self-Managed",
+            "description": "On-premise deployment using Daytona/Docker",
+            "tags": ["self-hosted", "low-latency", "secure"],
+        },
+        "ssh": {
+            "name": "SSH",
+            "description": "Execute code on a remote machine over SSH.",
+            "tags": ["remote", "ssh", "custom-runtime"],
+        },
+        "aliyun_codeinterpreter": {
+            "name": "Aliyun Code Interpreter",
+            "description": "Aliyun Function Compute Code Interpreter - Code execution in serverless microVMs",
+            "tags": ["saas", "cloud", "scalable", "aliyun"],
+        },
+        "e2b": {
+            "name": "E2B",
+            "description": "E2B Cloud - Code Execution Sandboxes",
+            "tags": ["saas", "fast", "global"],
+        },
+    }
+
+    @staticmethod
+    def list_providers():
+        """List all available sandbox providers."""
+        result = []
+        for provider_id, metadata in SandboxMgr.PROVIDER_REGISTRY.items():
+            result.append({
+                "id": provider_id,
+                **metadata
+            })
+        return result
+
+    @staticmethod
+    def get_provider_config_schema(provider_id: str):
+        """Get configuration schema for a specific provider."""
+        from agent.sandbox.providers import (
+            LocalProvider,
+            SelfManagedProvider,
+            SSHProvider,
+            AliyunCodeInterpreterProvider,
+            E2BProvider,
+        )
+
+        schemas = {
+            "local": LocalProvider.get_config_schema(),
+            "self_managed": SelfManagedProvider.get_config_schema(),
+            "ssh": SSHProvider.get_config_schema(),
+            "aliyun_codeinterpreter": AliyunCodeInterpreterProvider.get_config_schema(),
+            "e2b": E2BProvider.get_config_schema(),
+        }
+
+        if provider_id not in schemas:
+            raise AdminException(f"Unknown provider: {provider_id}")
+
+        return schemas[provider_id]
+
+    @staticmethod
+    def get_config():
+        """Get current sandbox configuration."""
+        try:
+            # Get active provider type
+            provider_type_settings = SystemSettingsService.get_by_name("sandbox.provider_type")
+            if not provider_type_settings:
+                provider_type = "self_managed"
+            else:
+                provider_type = provider_type_settings[0].value
+
+            # Get provider-specific config
+            provider_config_settings = SystemSettingsService.get_by_name(f"sandbox.{provider_type}")
+            if not provider_config_settings:
+                provider_config = {}
+            else:
+                try:
+                    provider_config = json.loads(provider_config_settings[0].value)
+                except json.JSONDecodeError:
+                    provider_config = {}
+
+            if not provider_config:
+                schema = SandboxMgr.get_provider_config_schema(provider_type)
+                provider_config = {}
+                for field_name, field_schema in schema.items():
+                    if field_schema.get("readonly"):
+                        continue
+                    if field_schema.get("default") is not None:
+                        provider_config[field_name] = field_schema["default"]
+
+            return {
+                "provider_type": provider_type,
+                "config": provider_config,
+            }
+        except Exception as e:
+            raise AdminException(f"Failed to get sandbox config: {str(e)}")
+
+    @staticmethod
+    def set_config(provider_type: str, config: dict, set_active: bool = True):
+        """
+        Set sandbox provider configuration.
+
+        Args:
+            provider_type: Provider identifier (e.g., "self_managed", "e2b")
+            config: Provider configuration dictionary
+            set_active: If True, also update the active provider. If False,
+                       only update the configuration without switching providers.
+                       Default: True
+
+        Returns:
+            Dictionary with updated provider_type and config
+        """
+        from agent.sandbox.providers import (
+            LocalProvider,
+            SelfManagedProvider,
+            SSHProvider,
+            AliyunCodeInterpreterProvider,
+            E2BProvider,
+        )
+
+        try:
+            # Validate provider type
+            if provider_type not in SandboxMgr.PROVIDER_REGISTRY:
+                raise AdminException(f"Unknown provider type: {provider_type}")
+
+            # Get provider schema for validation
+            schema = SandboxMgr.get_provider_config_schema(provider_type)
+
+            # Validate config against schema
+            for field_name, field_schema in schema.items():
+                if field_schema.get("required", False) and field_name not in config:
+                    raise AdminException(f"Required field '{field_name}' is missing")
+
+                # Type validation
+                if field_name in config:
+                    field_type = field_schema.get("type")
+                    if field_type == "integer":
+                        if isinstance(config[field_name], bool) or not isinstance(config[field_name], int):
+                            raise AdminException(f"Field '{field_name}' must be an integer")
+                    elif field_type == "string":
+                        if not isinstance(config[field_name], str):
+                            raise AdminException(f"Field '{field_name}' must be a string")
+                    elif field_type == "boolean":
+                        if not isinstance(config[field_name], bool):
+                            raise AdminException(f"Field '{field_name}' must be a boolean")
+
+                    # Range validation for integers
+                    if field_type == "integer":
+                        min_val = field_schema.get("min")
+                        max_val = field_schema.get("max")
+                        if min_val is not None and config[field_name] < min_val:
+                            raise AdminException(f"Field '{field_name}' must be >= {min_val}")
+                        if max_val is not None and config[field_name] > max_val:
+                            raise AdminException(f"Field '{field_name}' must be <= {max_val}")
+
+            # Provider-specific custom validation
+            provider_classes = {
+                "local": LocalProvider,
+                "self_managed": SelfManagedProvider,
+                "ssh": SSHProvider,
+                "aliyun_codeinterpreter": AliyunCodeInterpreterProvider,
+                "e2b": E2BProvider,
+            }
+            provider = provider_classes[provider_type]()
+            is_valid, error_msg = provider.validate_config(config)
+            if not is_valid:
+                raise AdminException(f"Provider validation failed: {error_msg}")
+
+            # Update provider_type only if set_active is True
+            if set_active:
+                SettingsMgr.update_by_name("sandbox.provider_type", provider_type)
+
+            # Always update the provider config
+            config_json = json.dumps(config)
+            SettingsMgr.update_by_name(f"sandbox.{provider_type}", config_json)
+            from agent.sandbox.client import reload_provider
+            reload_provider()
+
+            return {"provider_type": provider_type, "config": config}
+        except AdminException:
+            raise
+        except Exception as e:
+            raise AdminException(f"Failed to set sandbox config: {str(e)}")
+
+    @staticmethod
+    def test_connection(provider_type: str, config: dict):
+        """
+        Test connection to sandbox provider by executing a simple Python script.
+
+        This creates a temporary sandbox instance and runs a short test script to verify:
+        - Connection credentials are valid
+        - Sandbox can be created
+        - Code execution works correctly
+
+        Args:
+            provider_type: Provider identifier
+            config: Provider configuration dictionary
+
+        Returns:
+            dict with test results including stdout, stderr, exit_code, execution_time
+        """
+        try:
+            from agent.sandbox.providers import (
+                LocalProvider,
+                SelfManagedProvider,
+                SSHProvider,
+                AliyunCodeInterpreterProvider,
+                E2BProvider,
+            )
+
+            # Instantiate provider based on type
+            provider_classes = {
+                "local": LocalProvider,
+                "self_managed": SelfManagedProvider,
+                "ssh": SSHProvider,
+                "aliyun_codeinterpreter": AliyunCodeInterpreterProvider,
+                "e2b": E2BProvider,
+            }
+
+            if provider_type not in provider_classes:
+                raise AdminException(f"Unknown provider type: {provider_type}")
+
+            provider = provider_classes[provider_type]()
+
+            # Initialize with config
+            if not provider.initialize(config):
+                raise AdminException(f"Failed to initialize provider '{provider_type}'")
+
+            # Create a temporary sandbox instance for testing
+            instance = provider.create_instance(template="python")
+            if not instance:
+                raise AdminException("Failed to create sandbox instance.")
+
+            try:
+                # Keep the probe close to the original coverage, but avoid
+                # `sys` because the sandbox security analyzer blocks it.
+                test_code = """
+import json
+import math
+
+
+def main() -> dict:
+    left = 2
+    right = 2
+    print(f"2 + 2 = {left + right}")
+    print(f"JSON dump: {json.dumps({'test': 'data', 'value': 123})}")
+    print(f"Math.sqrt(16) = {math.sqrt(16)}")
+    print("TEST_PASSED")
+    return {"ok": True, "provider_test": "TEST_PASSED"}
+"""
+
+                # Execute test code with timeout
+                execution_result = provider.execute_code(
+                    instance_id=instance.instance_id,
+                    code=test_code,
+                    language="python",
+                    timeout=10,
+                )
+            finally:
+                try:
+                    provider.destroy_instance(instance.instance_id)
+                    logging.info(f"Cleaned up test instance {instance.instance_id}")
+                except Exception as cleanup_error:
+                    logging.warning(f"Failed to cleanup test instance {instance.instance_id}: {cleanup_error}")
+
+            # Build detailed result message
+            success = execution_result.exit_code == 0 and "TEST_PASSED" in execution_result.stdout
+
+            message_parts = [
+                f"Test {'PASSED' if success else 'FAILED'}",
+                f"Exit code: {execution_result.exit_code}",
+                f"Execution time: {execution_result.execution_time:.2f}s"
+            ]
+
+            if execution_result.stdout.strip():
+                stdout_preview = execution_result.stdout.strip()[:200]
+                message_parts.append(f"Output: {stdout_preview}...")
+
+            if execution_result.stderr.strip():
+                stderr_preview = execution_result.stderr.strip()[:200]
+                message_parts.append(f"Errors: {stderr_preview}...")
+
+            message = " | ".join(message_parts)
+
+            return {
+                "success": success,
+                "message": message,
+                "details": {
+                    "exit_code": execution_result.exit_code,
+                    "execution_time": execution_result.execution_time,
+                    "stdout": execution_result.stdout,
+                    "stderr": execution_result.stderr,
+                }
+            }
+
+        except AdminException:
+            raise
+        except Exception as e:
+            import traceback
+            error_details = traceback.format_exc()
+            raise AdminException(f"Connection test failed: {str(e)}\n\nStack trace:\n{error_details}")
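
The validation rules that `SettingsMgr._validate_value` and `SettingsMgr._infer_data_type` apply in the diff above can be exercised in isolation. The following is a minimal sketch, not the project's code: `SettingError` is a hypothetical stand-in for `AdminException`, and `is_valid` is a helper added only for illustration. It suggests that when a setting row does not yet exist, a name such as `sandbox.provider_type` is inferred as `json`, so a bare value like `self_managed` would be rejected, and that a Python `True` is stringified to `"True"` and fails the lowercase `true`/`false` check.

```python
import json


class SettingError(Exception):
    """Hypothetical stand-in for AdminException in this sketch."""


def infer_data_type(name: str) -> str:
    # Mirrors the name-based inference shown in the diff.
    if name.startswith("sandbox."):
        return "json"
    if name.endswith(".enabled"):
        return "bool"
    return "string"


def validate_value(name: str, data_type: str, value) -> None:
    # Mirrors the per-type checks shown in the diff; raises on invalid input.
    data_type = data_type.lower()
    value = str(value)
    if data_type == "string":
        return
    if data_type == "integer":
        try:
            int(value)
        except ValueError as exc:
            raise SettingError(f"Invalid integer value for {name}: {value}") from exc
        return
    if data_type in {"bool", "boolean"}:
        if value not in {"true", "false"}:
            raise SettingError(f"Invalid bool value for {name}: expected true or false")
        return
    if data_type == "json":
        try:
            json.loads(value)
        except json.JSONDecodeError as exc:
            raise SettingError(f"Invalid JSON value for {name}") from exc
        return
    raise SettingError(f"Unsupported data type for {name}: {data_type}")


def is_valid(name: str, data_type: str, value) -> bool:
    # Illustration-only helper: converts the exception into a boolean.
    try:
        validate_value(name, data_type, value)
    except SettingError:
        return False
    return True


if __name__ == "__main__":
    name = "sandbox.provider_type"
    print(infer_data_type(name), is_valid(name, infer_data_type(name), "self_managed"))
```

If this reading is right, callers that create `sandbox.provider_type` for the first time would need to pass a JSON-encoded string (for example `json.dumps("self_managed")`), or the inference rule would need an exception for that key.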
--- a/agent/__init__.py
+++ b/agent/__init__.py
@@ -13,6 +13,3 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
-
-from beartype.claw import beartype_this_package
-beartype_this_package()
--- a/agent/canvas.py
+++ b/agent/canvas.py
@@ -13,7 +13,10 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+import asyncio
 import base64
+import datetime
+import inspect
 import json
 import logging
 import re
@@ -25,12 +28,19 @@ from typing import Any, Union, Tuple

 from agent.component import component_class
 from agent.component.base import ComponentBase
+from agent.dsl_migration import normalize_chunker_dsl
 from api.db.services.file_service import FileService
+from api.db.services.llm_service import LLMBundle
 from api.db.services.task_service import has_canceled
+from api.db.joint_services.tenant_model_service import get_tenant_default_model_by_type
+from common.constants import LLMType
 from common.misc_utils import get_uuid, hash_str2int
 from common.exceptions import TaskCanceledException
 from rag.prompts.generator import chunks_format
 from rag.utils.redis_conn import REDIS_CONN
+from rag.utils.tts_cache import synthesize_with_cache
+
+_logger = logging.getLogger(__name__)

 class Graph:
    """
@@ -73,24 +83,25 @@ class Graph:
        }
        """

-    def __init__(self, dsl: str, tenant_id=None, task_id=None):
+    def __init__(self, dsl: str, tenant_id=None, task_id=None, custom_header=None):
        self.path = []
        self.components = {}
        self.error = ""
-        self.dsl = json.loads(dsl)
+        # Accept legacy DSL on read, but keep the in-memory canvas in the latest schema.
+        self.dsl = normalize_chunker_dsl(json.loads(dsl))
        self._tenant_id = tenant_id
        self.task_id = task_id if task_id else get_uuid()
+        self.custom_header = custom_header
+        self._thread_pool = ThreadPoolExecutor(max_workers=5)
        self.load()

    def load(self):
        self.components = self.dsl["components"]
        cpn_nms = set([])
-        for k, cpn in self.components.items():
-            cpn_nms.add(cpn["obj"]["component_name"])
-
        for k, cpn in self.components.items():
            cpn_nms.add(cpn["obj"]["component_name"])
            param = component_class(cpn["obj"]["component_name"] + "Param")()
+            cpn["obj"]["params"]["custom_header"] = self.custom_header
            param.update(cpn["obj"]["params"])
            try:
                param.check()
@@ -110,7 +121,11 @@ class Graph:
        for k in self.dsl.keys():
            if k in ["components"]:
                continue
-            dsl[k] = deepcopy(self.dsl[k])
+            try:
+                dsl[k] = deepcopy(self.dsl[k])
+            except Exception as e:
+                logging.warning("Graph.__str__: deepcopy failed for dsl key '%s' (type=%s): %s. Using shallow reference.", k, type(self.dsl[k]).__name__, e)
+                dsl[k] = self.dsl[k]

        for k, cpn in self.components.items():
            if k not in dsl["components"]:
@@ -119,8 +134,17 @@ class Graph:
                if c == "obj":
                    dsl["components"][k][c] = json.loads(str(cpn["obj"]))
                    continue
-                dsl["components"][k][c] = deepcopy(cpn[c])
-        return json.dumps(dsl, ensure_ascii=False)
+                try:
+                    dsl["components"][k][c] = deepcopy(cpn[c])
+                except Exception as e:
+                    logging.warning("Graph.__str__: deepcopy failed for component '%s' key '%s' (type=%s): %s. Using shallow reference.", k, c, type(cpn[c]).__name__, e)
+                    dsl["components"][k][c] = cpn[c]
+        def _serialize_default(obj):
+            if callable(obj):
+                return None
+            logging.warning("Graph.__str__: JSON fallback via str() for type=%s", type(obj).__name__)
+            return str(obj)
+        return json.dumps(dsl, ensure_ascii=False, default=_serialize_default)

    def reset(self):
        self.path = []
@@ -157,7 +181,7 @@ class Graph:
        return self._tenant_id

    def get_value_with_variable(self,value: str) -> Any:
-        pat = re.compile(r"\{* *\{([a-zA-Z:0-9]+@[A-Za-z0-9_.]+|sys\.[A-Za-z0-9_.]+|env\.[A-Za-z0-9_.]+)\} *\}*")
+        pat = re.compile(r"\{* *\{([a-zA-Z:0-9]+@[A-Za-z0-9_.-]+|sys\.[A-Za-z0-9_.]+|env\.[A-Za-z0-9_.]+)\} *\}*")
        out_parts = []
        last = 0

@@ -207,17 +231,60 @@ class Graph:
        for key in path.split('.'):
            if cur is None:
                return None
+
            if isinstance(cur, str):
                try:
                    cur = json.loads(cur)
                except Exception:
                    return None
+
            if isinstance(cur, dict):
                cur = cur.get(key)
-            else:
-                cur = getattr(cur, key, None)
+                continue
+
+            if isinstance(cur, (list, tuple)):
+                try:
+                    idx = int(key)
+                    cur = cur[idx]
+                except Exception:
+                    return None
+                continue
+
+            cur = getattr(cur, key, None)
        return cur

+    def set_variable_value(self, exp: str, value):
+        exp = exp.strip("{").strip("}").strip(" ").strip("{").strip("}")
+        if exp.find("@") < 0:
+            self.globals[exp] = value
+            return
+        cpn_id, var_nm = exp.split("@")
+        cpn = self.get_component(cpn_id)
+        if not cpn:
+            raise Exception(f"Can't find variable: '{cpn_id}@{var_nm}'")
+        parts = var_nm.split(".", 1)
+        root_key = parts[0]
+        rest = parts[1] if len(parts) > 1 else ""
+        if not rest:
+            cpn["obj"].set_output(root_key, value)
+            return
+        root_val = cpn["obj"].output(root_key)
+        if not root_val:
+            root_val = {}
+        cpn["obj"].set_output(root_key, self.set_variable_param_value(root_val, rest, value))
+
+    def set_variable_param_value(self, obj: Any, path: str, value) -> Any:
+        cur = obj
+        keys = path.split('.')
+        if not path:
+            return value
+        for key in keys[:-1]:
+            if key not in cur or not isinstance(cur[key], dict):
+                cur[key] = {}
+            cur = cur[key]
+        cur[keys[-1]] = value
+        return obj
+
    def is_canceled(self) -> bool:
        return has_canceled(self.task_id)

@@ -232,27 +299,41 @@ class Graph:

 class Canvas(Graph):

-    def __init__(self, dsl: str, tenant_id=None, task_id=None):
+    def __init__(self, dsl: str, tenant_id=None, task_id=None, canvas_id=None, custom_header=None):
        self.globals = {
            "sys.query": "",
            "sys.user_id": tenant_id,
            "sys.conversation_turns": 0,
-            "sys.files": []
+            "sys.files": [],
+            "sys.history": [],
+            "sys.date": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        }
-        super().__init__(dsl, tenant_id, task_id)
+        self.variables = {}
+        super().__init__(dsl, tenant_id, task_id, custom_header=custom_header)
+        self._id = canvas_id

    def load(self):
        super().load()
        self.history = self.dsl["history"]
        if "globals" in self.dsl:
            self.globals = self.dsl["globals"]
+            if "sys.history" not in self.globals:
+                self.globals["sys.history"] = []
+            if "sys.date" not in self.globals:
+                self.globals["sys.date"] = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        else:
            self.globals = {
            "sys.query": "",
            "sys.user_id": "",
            "sys.conversation_turns": 0,
-            "sys.files": []
+            "sys.files": [],
+            "sys.history": [],
+            "sys.date": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        }
+        if "variables" in self.dsl:
+            self.variables = self.dsl["variables"]
+        else:
+            self.variables = {}

        self.retrieval = self.dsl["retrieval"]
        self.memory = self.dsl.get("memory", [])
@@ -263,12 +344,18 @@ class Canvas(Graph):
        self.dsl["memory"] = self.memory
        return super().__str__()

+    def clear_history(self):
+        self.history = []
+        if isinstance(self.globals.get("sys.history"), list):
+            self.globals["sys.history"] = []
+
    def reset(self, mem=False):
        super().reset()
        if not mem:
            self.history = []
            self.retrieval = []
            self.memory = []
+        logging.debug("Canvas.reset: variables=%s", self.variables)
        for k in self.globals.keys():
            if k.startswith("sys."):
                if isinstance(self.globals[k], str):
@@ -283,32 +370,67 @@ class Canvas(Graph):
                    self.globals[k] = {}
                else:
                    self.globals[k] = None
+            if k.startswith("env."):
+                key = k[4:]
+                if key in self.variables:
+                    variable = self.variables[key]
+                    value = variable.get("value")
+                    if value is not None:
+                        self.globals[k] = value
+                    else:
+                        var_type = variable.get("type", "")
+                        if var_type == "number":
+                            self.globals[k] = 0
+                        elif var_type == "boolean":
+                            self.globals[k] = False
+                        elif var_type == "object":
+                            self.globals[k] = {}
+                        elif var_type.startswith("array"):
+                            self.globals[k] = []
+                        else:  # "string" or unknown
+                            self.globals[k] = ""
+                else:
+                    self.globals[k] = ""

-    def run(self, **kwargs):
+    async def run(self, **kwargs):
+        self.globals["sys.date"] = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        st = time.perf_counter()
+        self._loop = asyncio.get_running_loop()
        self.message_id = get_uuid()
        created_at = int(time.time())
        self.add_user_input(kwargs.get("query"))
+        path_set = set(self.path)
        for k, cpn in self.components.items():
-            self.components[k]["obj"].reset(True)
+            if k in path_set:
+                self.components[k]["obj"].reset(True)

        if kwargs.get("webhook_payload"):
            for k, cpn in self.components.items():
-                if self.components[k]["obj"].component_name.lower() == "webhook":
-                    for kk, vv in kwargs["webhook_payload"].items():
+                if self.components[k]["obj"].component_name.lower() == "begin" and self.components[k]["obj"]._param.mode == "Webhook":
+                    payload = kwargs.get("webhook_payload", {})
+                    if "input" in payload:
+                        self.components[k]["obj"].set_input_value("request", payload["input"])
+                    for kk, vv in payload.items():
+                        if kk == "input":
+                            continue
                        self.components[k]["obj"].set_output(kk, vv)

-            self.components[k]["obj"].reset(True)
+        layout_recognize = None
+        for cpn in self.components.values():
+            if cpn["obj"].component_name.lower() == "begin":
+                layout_recognize = getattr(cpn["obj"]._param, "layout_recognize", None)
+                break

        for k in kwargs.keys():
-            if k in ["query", "user_id", "files"] and kwargs[k]:
+            if k in ["query", "user_id", "files", "chat_template_kwargs"] and kwargs[k]:
                if k == "files":
-                    self.globals[f"sys.{k}"] = self.get_files(kwargs[k])
+                    self.globals[f"sys.{k}"] = await self.get_files_async(kwargs[k], layout_recognize)
                else:
                    self.globals[f"sys.{k}"] = kwargs[k]
        if not self.globals["sys.conversation_turns"] :
            self.globals["sys.conversation_turns"] = 0
        self.globals["sys.conversation_turns"] += 1
+        is_resume = bool(self.path) and self.path[0].lower().find("userfillup") >= 0

        def decorate(event, dt):
            nonlocal created_at
@@ -321,48 +443,91 @@ class Canvas(Graph):
                "data": dt
            }

-        if not self.path or self.path[-1].lower().find("userfillup") < 0:
+        if not is_resume:
            self.path.append("begin")
            self.retrieval.append({"chunks": [], "doc_aggs": []})
-
        if self.is_canceled():
            msg = f"Task {self.task_id} has been canceled before starting."
            logging.info(msg)
            raise TaskCanceledException(msg)

-        yield decorate("workflow_started", {"inputs": kwargs.get("inputs")})
+        if not is_resume:
+            yield decorate("workflow_started", {"inputs": kwargs.get("inputs")})
+            _logger.debug(
+                "[Canvas] Workflow started. Path: %s, Inputs: %s",
+                [self.get_component_name(c) for c in self.path],
+                json.dumps(kwargs.get("inputs", {}), ensure_ascii=False, default=str)[:500],
+            )
        self.retrieval.append({"chunks": {}, "doc_aggs": {}})

-        def _run_batch(f, t):
+        async def _run_batch(f, t):
            if self.is_canceled():
                msg = f"Task {self.task_id} has been canceled during batch execution."
                logging.info(msg)
                raise TaskCanceledException(msg)

-            with ThreadPoolExecutor(max_workers=5) as executor:
-                thr = []
-                i = f
-                while i < t:
-                    cpn = self.get_component_obj(self.path[i])
-                    if cpn.component_name.lower() in ["begin", "userfillup"]:
-                        thr.append(executor.submit(cpn.invoke, inputs=kwargs.get("inputs", {})))
-                        i += 1
+            loop = asyncio.get_running_loop()
+            tasks = []
+            max_concurrency = getattr(self._thread_pool, "_max_workers", 5)
+            sem = asyncio.Semaphore(max_concurrency)
+
+            async def _invoke_one(cpn_obj, sync_fn, call_kwargs, use_async: bool):
+                async with sem:
+                    if use_async:
+                        await cpn_obj.invoke_async(**(call_kwargs or {}))
+                        return
+                    await loop.run_in_executor(self._thread_pool, partial(sync_fn, **(call_kwargs or {})))
+
+            i = f
+            while i < t:
+                cpn = self.get_component_obj(self.path[i])
+                task_fn = None
+                call_kwargs = None
+
+                if cpn.component_name.lower() in ["begin", "userfillup"]:
+                    call_kwargs = {"inputs": kwargs.get("inputs", {})}
+                    task_fn = cpn.invoke
+                    i += 1
+                else:
+                    for _, ele in cpn.get_input_elements().items():
+                        if isinstance(ele, dict) and ele.get("_cpn_id") and ele.get("_cpn_id") not in self.path[:i] and self.path[0].lower().find("userfillup") < 0:
+                            self.path.pop(i)
+                            t -= 1
+                            break
                    else:
-                        for _, ele in cpn.get_input_elements().items():
-                            if isinstance(ele, dict) and ele.get("_cpn_id") and ele.get("_cpn_id") not in self.path[:i] and self.path[0].lower().find("userfillup") < 0:
-                                self.path.pop(i)
-                                t -= 1
-                                break
-                        else:
-                            thr.append(executor.submit(cpn.invoke, **cpn.get_input()))
-                            i += 1
-                for t in thr:
-                    t.result()
+                        call_kwargs = cpn.get_input()
+                        task_fn = cpn.invoke
+                        i += 1
+
+                if task_fn is None:
+                    continue
+
+                _logger.debug(
+                    "[Canvas] Invoking component '%s' (%s) with inputs: %s",
+                    self.get_component_name(self.path[i - 1]),
+                    cpn.component_name,
+                    json.dumps(call_kwargs, ensure_ascii=False, default=str)[:500],
+                )
+
+                fn_invoke_async = getattr(cpn, "_invoke_async", None)
+                use_async = (fn_invoke_async and asyncio.iscoroutinefunction(fn_invoke_async)) or asyncio.iscoroutinefunction(getattr(cpn, "_invoke", None))
+                tasks.append(asyncio.create_task(_invoke_one(cpn, task_fn, call_kwargs, use_async)))
+
+            if tasks:
+                await asyncio.gather(*tasks)

        def _node_finished(cpn_obj):
+            outputs = cpn_obj.output()
+            _logger.debug(
+                "[Canvas] Component '%s' (%s) finished. Outputs: %s, Error: %s",
+                self.get_component_name(cpn_obj._id),
+                self.get_component_type(cpn_obj._id),
+                json.dumps(outputs, ensure_ascii=False, default=str)[:500],
+                cpn_obj.error(),
+            )
            return decorate("node_finished",{
                           "inputs": cpn_obj.get_input_values(),
-                           "outputs": cpn_obj.output(),
+                           "outputs": outputs,
                           "component_id": cpn_obj._id,
                           "component_name": self.get_component_name(cpn_obj._id),
                           "component_type": self.get_component_type(cpn_obj._id),
@@ -372,8 +537,9 @@ class Canvas(Graph):
                       })

        self.error = ""
-        idx = len(self.path) - 1
+        idx = 0 if is_resume else len(self.path) - 1
        partials = []
+        tts_mdl = None
        while idx < len(self.path):
            to = len(self.path)
            for i in range(idx, to):
@@ -384,31 +550,65 @@ class Canvas(Graph):
                    "component_type": self.get_component_type(self.path[i]),
                    "thoughts": self.get_component_thoughts(self.path[i])
                })
-            _run_batch(idx, to)
+            await _run_batch(idx, to)
            to = len(self.path)
-            # post processing of components invocation
+            # post-processing of components invocation
            for i in range(idx, to):
                cpn = self.get_component(self.path[i])
                cpn_obj = self.get_component_obj(self.path[i])
                if cpn_obj.component_name.lower() == "message":
+                    if cpn_obj.get_param("auto_play"):
+                        tts_model_config = get_tenant_default_model_by_type(self._tenant_id, LLMType.TTS)
+                        tts_mdl = LLMBundle(self._tenant_id, tts_model_config)
                    if isinstance(cpn_obj.output("content"), partial):
                        _m = ""
-                        for m in cpn_obj.output("content")():
+                        buff_m = ""
+                        stream = cpn_obj.output("content")()
+                        async def _process_stream(m):
+                            nonlocal buff_m, _m, tts_mdl
                            if not m:
-                                continue
+                                return
                            if m == "<think>":
-                                yield decorate("message", {"content": "", "start_to_think": True})
+                                return decorate("message", {"content": "", "start_to_think": True})
+
                            elif m == "</think>":
-                                yield decorate("message", {"content": "", "end_to_think": True})
-                            else:
-                                yield decorate("message", {"content": m})
-                                _m += m
+                                return decorate("message", {"content": "", "end_to_think": True})
+
+                            buff_m += m
+                            _m += m
+
+                            if len(buff_m) > 16:
+                                ev = decorate(
+                                    "message",
+                                    {
+                                        "content": m,
+                                        "audio_binary": self.tts(tts_mdl, buff_m)
+                                    }
+                                )
+                                buff_m = ""
+                                return ev
+
+                            return decorate("message", {"content": m})
+
+                        if inspect.isasyncgen(stream):
+                            async for m in stream:
+                                ev = await _process_stream(m)
+                                if ev:
+                                    yield ev
+                        else:
+                            for m in stream:
+                                ev = await _process_stream(m)
+                                if ev:
+                                    yield ev
+                        if buff_m:
+                            yield decorate("message", {"content": "", "audio_binary": self.tts(tts_mdl, buff_m)})
+                            buff_m = ""
                        cpn_obj.set_output("content", _m)
-                        cite = re.search(r"\[ID:[ 0-9]+\]", _m)
                    else:
                        yield decorate("message", {"content": cpn_obj.output("content")})
-                        cite = re.search(r"\[ID:[ 0-9]+\]",  cpn_obj.output("content"))
-                    yield decorate("message_end", {"reference": self.get_reference() if cite else None})
+
+                    message_end = self._build_message_end(cpn_obj)
+                    yield decorate("message_end", message_end)

                    while partials:
                        _cpn_obj = self.get_component_obj(partials[0])
@@ -429,7 +629,7 @@ class Canvas(Graph):
                    else:
                        self.error = cpn_obj.error()

-                if cpn_obj.component_name.lower() != "iteration":
+                if cpn_obj.component_name.lower() not in ("iteration", "loop"):
                    if isinstance(cpn_obj.output("content"), partial):
                        if self.error:
                            cpn_obj.set_output("content", None)
@@ -454,14 +654,16 @@ class Canvas(Graph):
                    for cpn_id in cpn_ids:
                        _append_path(cpn_id)

-                if cpn_obj.component_name.lower() == "iterationitem" and cpn_obj.end():
+                if cpn_obj.component_name.lower() in ("iterationitem", "loopitem") and cpn_obj.end():
                    iter = cpn_obj.get_parent()
                    yield _node_finished(iter)
                    _extend_path(self.get_component(cpn["parent_id"])["downstream"])
                elif cpn_obj.component_name.lower() in ["categorize", "switch"]:
                    _extend_path(cpn_obj.output("_next"))
-                elif cpn_obj.component_name.lower() == "iteration":
+                elif cpn_obj.component_name.lower() in ("iteration", "loop"):
                    _append_path(cpn_obj.get_start())
+                elif cpn_obj.component_name.lower() == "exitloop" and cpn_obj.get_parent().component_name.lower() == "loop":
+                    _extend_path(self.get_component(cpn["parent_id"])["downstream"])
                elif not cpn["downstream"] and cpn_obj.get_parent():
                    _append_path(cpn_obj.get_parent().get_start())
                else:
@@ -472,18 +674,23 @@ class Canvas(Graph):
                break
            idx = to

-            if any([self.get_component_obj(c).component_name.lower() == "userfillup" for c in self.path[idx:]]):
-                path = [c for c in self.path[idx:] if self.get_component(c)["obj"].component_name.lower() == "userfillup"]
-                path.extend([c for c in self.path[idx:] if self.get_component(c)["obj"].component_name.lower() != "userfillup"])
+            if any([self.components.get(c) is not None and self.get_component_obj(c).component_name.lower() == "userfillup" for c in self.path[idx:]]):
+                path = [c for c in self.path[idx:] if self.components.get(c) is not None and self.get_component(c)["obj"].component_name.lower() == "userfillup"]
+                path.extend([c for c in self.path[idx:] if self.components.get(c) is not None and self.get_component(c)["obj"].component_name.lower() != "userfillup"])
                another_inputs = {}
                tips = ""
                for c in path:
                    o = self.get_component_obj(c)
                    if o.component_name.lower() == "userfillup":
                        o.invoke()
-                        another_inputs.update(o.get_input_elements())
+                        another_inputs.update({
+                            k: v for k, v in o.get_input_elements().items()
+                            if not self._is_input_field_satisfied(v)
+                        })
                        if o.get_param("enable_tips"):
                            tips = o.output("tips")
+                if not another_inputs:
+                    continue
                self.path = path
                yield decorate("user_inputs", {"inputs": another_inputs, "tips": tips})
                return
@@ -497,6 +704,7 @@ class Canvas(Graph):
                           "created_at": st,
                       })
            self.history.append(("assistant", self.get_component_obj(self.path[-1]).output()))
+            self.globals["sys.history"].append(f"{self.history[-1][0]}: {self.history[-1][1]}")
        elif "Task has been canceled" in self.error:
            yield decorate("workflow_finished",
                       {
@@ -517,6 +725,43 @@ class Canvas(Graph):
            return False
        return True

+
+    def tts(self, tts_mdl, text):
+        def clean_tts_text(text: str) -> str:
+            if not text:
+                return ""
+
+            text = text.encode("utf-8", "ignore").decode("utf-8", "ignore")
+
+            text = re.sub(r"[\x00-\x08\x0B-\x0C\x0E-\x1F\x7F]", "", text)
+
+            emoji_pattern = re.compile(
+                "[\U0001F600-\U0001F64F"
+                "\U0001F300-\U0001F5FF"
+                "\U0001F680-\U0001F6FF"
+                "\U0001F1E0-\U0001F1FF"
+                "\U00002700-\U000027BF"
+                "\U0001F900-\U0001F9FF"
+                "\U0001FA70-\U0001FAFF"
+                "\U0001FAD0-\U0001FAFF]+",
+                flags=re.UNICODE
+            )
+            text = emoji_pattern.sub("", text)
+
+            text = re.sub(r"\s+", " ", text).strip()
+
+            MAX_LEN = 500
+            if len(text) > MAX_LEN:
+                text = text[:MAX_LEN]
+
+            return text
+        if not tts_mdl or not text:
+            return None
+        text = clean_tts_text(text)
+        if not text:
+            return None
+        return synthesize_with_cache(tts_mdl, text)
+
    def get_history(self, window_size):
        convs = []
        if window_size <= 0:
@@ -530,6 +775,25 @@ class Canvas(Graph):

    def add_user_input(self, question):
        self.history.append(("user", question))
+        rendered = json.dumps(question, ensure_ascii=False) if isinstance(question, dict) else question
+        self.globals["sys.history"].append(f"{self.history[-1][0]}: {rendered}")
+
+    @staticmethod
+    def _is_input_field_satisfied(field: Any) -> bool:
+        if not isinstance(field, dict):
+            return field is not None
+
+        value = field.get("value")
+        field_type = str(field.get("type", "")).lower()
+        if field_type.find("file") >= 0:
+            if field.get("optional") and value is None:
+                return True
+            return value not in (None, [], "")
+
+        if value is None:
+            return False
+
+        return True

    def get_prologue(self):
        return self.components["begin"]["obj"]._param.prologue
@@ -537,6 +801,9 @@ class Canvas(Graph):
    def get_mode(self):
        return self.components["begin"]["obj"]._param.mode

+    def get_sys_query(self):
+        return self.globals.get("sys.query", "")
+
    def set_global_param(self, **kwargs):
        self.globals.update(kwargs)

@@ -546,20 +813,33 @@ class Canvas(Graph):
    def get_component_input_elements(self, cpnnm):
        return self.components[cpnnm]["obj"].get_input_elements()

-    def get_files(self, files: Union[None, list[dict]]) -> list[str]:
+    async def get_files_async(self, files: Union[None, list[dict]], layout_recognize: str = None) -> list[str]:
        if not files:
            return  []
        def image_to_base64(file):
            return "data:{};base64,{}".format(file["mime_type"],
                                        base64.b64encode(FileService.get_blob(file["created_by"], file["id"])).decode("utf-8"))
-        exe = ThreadPoolExecutor(max_workers=5)
-        threads = []
+        def parse_file(file):
+            blob = FileService.get_blob(file["created_by"], file["id"])
+            return FileService.parse(file["name"], blob, True, file["created_by"], layout_recognize)
+        loop = asyncio.get_running_loop()
+        tasks = []
        for file in files:
            if file["mime_type"].find("image") >=0:
-                threads.append(exe.submit(image_to_base64, file))
+                tasks.append(loop.run_in_executor(self._thread_pool, image_to_base64, file))
                continue
-            threads.append(exe.submit(FileService.parse, file["name"], FileService.get_blob(file["created_by"], file["id"]), True, file["created_by"]))
-        return [th.result() for th in threads]
+            tasks.append(loop.run_in_executor(self._thread_pool, parse_file, file))
+        return await asyncio.gather(*tasks)
+
+    def get_files(self, files: Union[None, list[dict]], layout_recognize: str = None) -> list[str]:
+        """
+        Synchronous wrapper for get_files_async, used by sync component invoke paths.
+        """
+        loop = getattr(self, "_loop", None)
+        if loop and loop.is_running():
+            return asyncio.run_coroutine_threadsafe(self.get_files_async(files, layout_recognize), loop).result()
+
+        return asyncio.run(self.get_files_async(files, layout_recognize))

    def tool_use_callback(self, agent_id: str, func_name: str, params: dict, result: Any, elapsed_time=None):
        agent_ids = agent_id.split("-->")
@@ -605,6 +885,22 @@ class Canvas(Graph):
            return {"chunks": {}, "doc_aggs": {}}
        return self.retrieval[-1]

+    def _has_reference(self) -> bool:
+        ref = self.get_reference()
+        if not isinstance(ref, dict):
+            return False
+        return bool(ref.get("chunks") or ref.get("doc_aggs"))
+
+    def _build_message_end(self, cpn_obj) -> dict:
+        message_end = {}
+        if cpn_obj.get_param("status"):
+            message_end["status"] = cpn_obj.get_param("status")
+        if isinstance(cpn_obj.output("attachment"), dict):
+            message_end["attachment"] = cpn_obj.output("attachment")
+        if self._has_reference():
+            message_end["reference"] = self.get_reference()
+        return message_end
+
    def add_memory(self, user:str, assist:str, summ: str):
        self.memory.append((user, assist, summ))

@@ -613,4 +909,3 @@ class Canvas(Graph):

    def get_component_thoughts(self, cpn_id) -> str:
        return self.components.get(cpn_id)["obj"].thoughts()
-
--- a/agent/component/agent_with_tools.py
+++ b/agent/component/agent_with_tools.py
@@ -13,25 +13,28 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+import asyncio
+import json
 import logging
 import os
 import re
-from concurrent.futures import ThreadPoolExecutor
 from copy import deepcopy
 from functools import partial
+from timeit import default_timer as timer
 from typing import Any

 import json_repair
-from timeit import default_timer as timer
-from agent.tools.base import LLMToolPluginCallSession, ToolParamBase, ToolBase, ToolMeta
+
+from agent.component.llm import LLM, LLMParam
+from agent.tools.base import LLMToolPluginCallSession, ToolBase, ToolMeta, ToolParamBase
+from api.db.joint_services.tenant_model_service import get_model_config_from_provider_instance, get_model_type_by_name
 from api.db.services.llm_service import LLMBundle
-from api.db.services.tenant_llm_service import TenantLLMService
 from api.db.services.mcp_server_service import MCPServerService
 from common.connection_utils import timeout
-from rag.prompts.generator import next_step, COMPLETE_TASK, analyze_task, \
-    citation_prompt, reflect, rank_memories, kb_prompt, citation_plus, full_question, message_fit_in
-from rag.utils.mcp_tool_call_conn import MCPToolCallSession, mcp_tool_metadata_to_openai_tool
-from agent.component.llm import LLMParam, LLM
+from common.mcp_tool_call_conn import MCPToolBinding, MCPToolCallSession, mcp_tool_metadata_to_openai_tool
+from rag.prompts.generator import citation_plus, citation_prompt, full_question, kb_prompt, message_fit_in, structured_output_prompt
+
+_logger = logging.getLogger(__name__)


 class AgentParam(LLMParam, ToolParamBase):
@@ -40,41 +43,32 @@ class AgentParam(LLMParam, ToolParamBase):
    """

    def __init__(self):
-        self.meta:ToolMeta = {
-                "name": "agent",
-                "description": "This is an agent for a specific task.",
-                "parameters": {
-                    "user_prompt": {
-                        "type": "string",
-                        "description": "This is the order you need to send to the agent.",
-                        "default": "",
-                        "required": True
-                    },
-                    "reasoning": {
-                        "type": "string",
-                        "description": (
-                            "Supervisor's reasoning for choosing the this agent. "
-                            "Explain why this agent is being invoked and what is expected of it."
-                        ),
-                        "required": True
-                    },
-                    "context": {
-                        "type": "string",
-                        "description": (
-                                "All relevant background information, prior facts, decisions, "
-                                "and state needed by the agent to solve the current query. "
-                                "Should be as detailed and self-contained as possible."
-                            ),
-                        "required": True
-                    },
-                }
-            }
+        self.meta: ToolMeta = {
+            "name": "agent",
+            "description": "This is an agent for a specific task.",
+            "parameters": {
+                "user_prompt": {"type": "string", "description": "This is the order you need to send to the agent.", "default": "", "required": True},
+                "reasoning": {
+                    "type": "string",
+                    "description": ("Supervisor's reasoning for choosing this agent. Explain why this agent is being invoked and what is expected of it."),
+                    "required": True,
+                },
+                "context": {
+                    "type": "string",
+                    "description": (
+                        "All relevant background information, prior facts, decisions, and state needed by the agent to solve the current query. Should be as detailed and self-contained as possible."
+                    ),
+                    "required": True,
+                },
+            },
+        }
        super().__init__()
        self.function_name = "agent"
        self.tools = []
        self.mcp = []
        self.max_rounds = 5
        self.description = ""
+        self.custom_header = {}


 class Agent(LLM, ToolBase):
@@ -83,31 +77,67 @@ class Agent(LLM, ToolBase):
    def __init__(self, canvas, id, param: LLMParam):
        LLM.__init__(self, canvas, id, param)
        self.tools = {}
-        for cpn in self._param.tools:
+        for idx, cpn in enumerate(self._param.tools):
            cpn = self._load_tool_obj(cpn)
-            self.tools[cpn.get_meta()["function"]["name"]] = cpn
-
-        self.chat_mdl = LLMBundle(self._canvas.get_tenant_id(), TenantLLMService.llm_id2llm_type(self._param.llm_id), self._param.llm_id,
-                                  max_retries=self._param.max_retries,
-                                  retry_interval=self._param.delay_after_error,
-                                  max_rounds=self._param.max_rounds,
-                                  verbose_tool_use=True
-                                  )
-        self.tool_meta = [v.get_meta() for _,v in self.tools.items()]
+            original_name = cpn.get_meta()["function"]["name"]
+            indexed_name = f"{original_name}_{idx}"
+            self.tools[indexed_name] = cpn
+        model_types = get_model_type_by_name(self._canvas.get_tenant_id(), self._param.llm_id)
+        model_type = "chat" if "chat" in model_types else model_types[0]
+        chat_model_config = get_model_config_from_provider_instance(self._canvas.get_tenant_id(), model_type, self._param.llm_id)
+        self.chat_mdl = LLMBundle(
+            self._canvas.get_tenant_id(),
+            chat_model_config,
+            max_retries=self._param.max_retries,
+            retry_interval=self._param.delay_after_error,
+            max_rounds=self._param.max_rounds,
+            verbose_tool_use=False,
+        )
+        self.tool_meta = []
+        for indexed_name, tool_obj in self.tools.items():
+            original_meta = tool_obj.get_meta()
+            indexed_meta = deepcopy(original_meta)
+            indexed_meta["function"]["name"] = indexed_name
+            self.tool_meta.append(indexed_meta)

+        tool_idx = len(self.tools)
        for mcp in self._param.mcp:
            _, mcp_server = MCPServerService.get_by_id(mcp["mcp_id"])
-            tool_call_session = MCPToolCallSession(mcp_server, mcp_server.variables)
+            custom_header = self._param.custom_header
+            tool_call_session = MCPToolCallSession(mcp_server, mcp_server.variables, custom_header)
            for tnm, meta in mcp["tools"].items():
-                self.tool_meta.append(mcp_tool_metadata_to_openai_tool(meta))
-                self.tools[tnm] = tool_call_session
+                indexed_name = f"{tnm}_{tool_idx}"
+                tool_idx += 1
+                self.tool_meta.append(mcp_tool_metadata_to_openai_tool(meta, function_name=indexed_name))
+                self.tools[indexed_name] = MCPToolBinding(tool_call_session, tnm)
        self.callback = partial(self._canvas.tool_use_callback, id)
        self.toolcall_session = LLMToolPluginCallSession(self.tools, self.callback)
-        #self.chat_mdl.bind_tools(self.toolcall_session, self.tool_metas)
+        if self.tool_meta:
+            self.chat_mdl.bind_tools(self.toolcall_session, self.tool_meta)
+
+    def _fit_messages(self, prompt: str, msg: list[dict]) -> tuple[list[dict] | None, str | None]:
+        msg_fit, fit_error = LLM.fit_messages(prompt, msg, self.chat_mdl.max_length)
+        if fit_error:
+            _logger.error("Agent prompt fit error: %s", fit_error)
+            return None, fit_error
+        return msg_fit, None
+
+    @staticmethod
+    def _append_system_prompt(msg: list[dict], extra_prompt: str) -> None:
+        if extra_prompt and msg and msg[0]["role"] == "system":
+            msg[0]["content"] += "\n" + extra_prompt
+
+    @staticmethod
+    def _clean_formatted_answer(ans: str) -> str:
+        ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
+        ans = re.sub(r"^.*```json", "", ans, flags=re.DOTALL)
+        return re.sub(r"```\n*$", "", ans, flags=re.DOTALL)

    def _load_tool_obj(self, cpn: dict) -> object:
        from agent.component import component_class
-        param = component_class(cpn["component_name"] + "Param")()
+
+        tool_name = cpn["component_name"]
+        param = component_class(tool_name + "Param")()
        param.update(cpn["params"])
        try:
            param.check()
@@ -118,30 +148,65 @@ class Agent(LLM, ToolBase):
        return component_class(cpn["component_name"])(self._canvas, cpn_id, param)

    def get_meta(self) -> dict[str, Any]:
-        self._param.function_name= self._id.split("-->")[-1]
+        self._param.function_name = self._id.split("-->")[-1]
        m = super().get_meta()
        if hasattr(self._param, "user_prompt") and self._param.user_prompt:
-            m["function"]["parameters"]["properties"]["user_prompt"] = self._param.user_prompt
+            # Keep the JSON schema valid; user_prompt is a string field, not a schema node.
+            m["function"]["parameters"]["properties"]["user_prompt"]["default"] = self._param.user_prompt
        return m

    def get_input_form(self) -> dict[str, dict]:
        res = {}
        for k, v in self.get_input_elements().items():
-            res[k] = {
-                "type": "line",
-                "name": v["name"]
-            }
+            res[k] = {"type": "line", "name": v["name"]}
        for cpn in self._param.tools:
            if not isinstance(cpn, LLM):
                continue
            res.update(cpn.get_input_form())
        return res

-    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 20*60)))
+    def _get_output_schema(self):
+        try:
+            cand = self._param.outputs.get("structured")
+        except Exception:
+            return None
+
+        if isinstance(cand, dict):
+            if isinstance(cand.get("properties"), dict) and len(cand["properties"]) > 0:
+                return cand
+            for k in ("schema", "structured"):
+                if isinstance(cand.get(k), dict) and isinstance(cand[k].get("properties"), dict) and len(cand[k]["properties"]) > 0:
+                    return cand[k]
+
+        return None
+
+    async def _force_format_to_schema_async(self, text: str, schema_prompt: str) -> str:
+        fmt_msgs = [
+            {"role": "system", "content": schema_prompt + "\nIMPORTANT: Output ONLY valid JSON. No markdown, no extra text."},
+            {"role": "user", "content": text},
+        ]
+        _, fmt_msgs = message_fit_in(fmt_msgs, LLM.context_fit_budget(self.chat_mdl.max_length))
+        return await self._generate_async(fmt_msgs)
+
    def _invoke(self, **kwargs):
+        return asyncio.run(self._invoke_async(**kwargs))
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 20 * 60)))
+    async def _invoke_async(self, **kwargs):
        if self.check_if_canceled("Agent processing"):
            return

+        user_prompt = kwargs.get("user_prompt")
+        user_prompt_text = "" if user_prompt is None else str(user_prompt)
+        _logger.debug(
+            "[Agent] _invoke_async called. Component: %s, Keys in kwargs: %s, user_prompt_present: %s, user_prompt_length: %d, tools count: %d",
+            self._id,
+            list(kwargs.keys()),
+            bool(user_prompt_text.strip()),
+            len(user_prompt_text),
+            len(self.tools) if self.tools else 0,
+        )
+
        if kwargs.get("user_prompt"):
            usr_pmt = ""
            if kwargs.get("reasoning"):
@@ -153,32 +218,41 @@ class Agent(LLM, ToolBase):
            else:
                usr_pmt = str(kwargs["user_prompt"])
            self._param.prompts = [{"role": "user", "content": usr_pmt}]
+            _logger.debug("[Agent] Built user prompt with length=%d, reasoning=%s, context=%s", len(usr_pmt), bool(kwargs.get("reasoning")), bool(kwargs.get("context")))

        if not self.tools:
            if self.check_if_canceled("Agent processing"):
                return
-            return LLM._invoke(self, **kwargs)
+            _logger.debug("[Agent] No tools configured. Delegating to LLM._invoke_async. prompt_count=%d", len(self._param.prompts) if self._param.prompts else 0)
+            return await LLM._invoke_async(self, **kwargs)

        prompt, msg, user_defined_prompt = self._prepare_prompt_variables()
+        output_schema = self._get_output_schema()
+        schema_prompt = ""
+        if output_schema:
+            schema = json.dumps(output_schema, ensure_ascii=False, indent=2)
+            schema_prompt = structured_output_prompt(schema)

-        downstreams = self._canvas.get_component(self._id)["downstream"] if self._canvas.get_component(self._id) else []
+        component = self._canvas.get_component(self._id)
+        downstreams = component["downstream"] if component else []
        ex = self.exception_handler()
-        output_structure=None
-        try:
-            output_structure=self._param.outputs['structured']
-        except Exception:
-            pass
-        if any([self._canvas.get_component_obj(cid).component_name.lower()=="message" for cid in downstreams]) and not output_structure and not (ex and ex["goto"]):
-            self.set_output("content", partial(self.stream_output_with_tools, prompt, msg, user_defined_prompt))
+        has_message_downstream = any(self._canvas.get_component_obj(cid).component_name.lower() == "message" for cid in downstreams)
+        if has_message_downstream and not (ex and ex["goto"]) and not output_schema:
+            _logger.debug("[Agent] Entering streaming mode (has message downstream)")
+            self.set_output("content", partial(self.stream_output_with_tools_async, prompt, deepcopy(msg), user_defined_prompt))
            return

-        _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
-        use_tools = []
-        ans = ""
-        for delta_ans, tk in self._react_with_tools_streamly(prompt, msg, use_tools, user_defined_prompt):
-            if self.check_if_canceled("Agent processing"):
-                return
-            ans += delta_ans
+        msg, fit_error = self._fit_messages(prompt, msg)
+        if fit_error:
+            if self.get_exception_default_value():
+                self.set_output("content", self.get_exception_default_value())
+            else:
+                self.set_output("_ERROR", fit_error)
+            return
+
+        self._append_system_prompt(msg, schema_prompt)
+        _logger.debug("[Agent] Calling LLM with %d messages, has_schema=%s", len(msg), bool(schema_prompt))
+        ans = await self._generate_async(msg)

        if ans.find("**ERROR**") >= 0:
            logging.error(f"Agent._chat got error. response: {ans}")
@@ -188,191 +262,134 @@ class Agent(LLM, ToolBase):
                self.set_output("_ERROR", ans)
            return

+        if output_schema:
+            error = ""
+            for _ in range(self._param.max_retries + 1):
+                try:
+                    obj = json_repair.loads(self._clean_formatted_answer(ans))
+                    self.set_output("structured", obj)
+                    return obj
+                except Exception:
+                    error = "The answer cannot be parsed as JSON"
+                    ans = await self._force_format_to_schema_async(ans, schema_prompt)
+                    if ans.find("**ERROR**") >= 0:
+                        continue
+
+            self.set_output("_ERROR", error)
+            return
+
+        artifact_md = self._collect_tool_artifact_markdown(existing_text=ans)
+        if artifact_md:
+            ans += "\n\n" + artifact_md
+        _logger.debug("[Agent] Final output. content_length=%d, has_artifact=%s", len(ans), bool(artifact_md))
        self.set_output("content", ans)
-        if use_tools:
-            self.set_output("use_tools", use_tools)
        return ans

-    def stream_output_with_tools(self, prompt, msg, user_defined_prompt={}):
-        _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
-        answer_without_toolcall = ""
-        use_tools = []
-        for delta_ans,_ in self._react_with_tools_streamly(prompt, msg, use_tools, user_defined_prompt):
+    async def stream_output_with_tools_async(self, prompt, msg, user_defined_prompt={}):
+        if len(msg) > 3:
+            st = timer()
+            user_request = await full_question(messages=msg, chat_mdl=self.chat_mdl)
+            self.callback("Multi-turn conversation optimization", {}, user_request, elapsed_time=timer() - st)
+            msg = [*msg[:-1], {"role": "user", "content": user_request}]
+
+        msg, fit_error = self._fit_messages(prompt, msg)
+        if fit_error:
+            if self.get_exception_default_value():
+                fallback = self.get_exception_default_value()
+                self.set_output("content", fallback)
+                yield fallback
+            else:
+                self.set_output("_ERROR", fit_error)
+                self.set_output("content", fit_error)
+                yield fit_error
+            return
+
+        need2cite = self._param.cite and self._canvas.get_reference()["chunks"] and self._id.find("-->") < 0
+        cited = False
+        if need2cite and len(msg) < 7:
+            self._append_system_prompt(msg, citation_prompt())
+            cited = True
+
+        answer = ""
+        async for delta in self._generate_streamly(msg):
            if self.check_if_canceled("Agent streaming"):
                return
-
-            if delta_ans.find("**ERROR**") >= 0:
+            if delta.find("**ERROR**") >= 0:
                if self.get_exception_default_value():
-                    self.set_output("content", self.get_exception_default_value())
-                    yield self.get_exception_default_value()
+                    fallback = self.get_exception_default_value()
+                    self.set_output("content", fallback)
+                    yield fallback
                else:
-                    self.set_output("_ERROR", delta_ans)
-                    return
-            answer_without_toolcall += delta_ans
-            yield delta_ans
+                    self.set_output("_ERROR", delta)
+                    self.set_output("content", delta)
+                    yield delta
+                return
+            if not need2cite or cited:
+                yield delta
+            answer += delta

-        self.set_output("content", answer_without_toolcall)
-        if use_tools:
-            self.set_output("use_tools", use_tools)
+        if not need2cite or cited:
+            artifact_md = self._collect_tool_artifact_markdown(existing_text=answer)
+            if artifact_md:
+                yield "\n\n" + artifact_md
+                answer += "\n\n" + artifact_md
+            self.set_output("content", answer)
+            return

-    def _gen_citations(self, text):
+        st = timer()
+        cited_answer = ""
+        async for delta in self._gen_citations_async(answer):
+            if self.check_if_canceled("Agent streaming"):
+                return
+            yield delta
+            cited_answer += delta
+        artifact_md = self._collect_tool_artifact_markdown(existing_text=cited_answer)
+        if artifact_md:
+            yield "\n\n" + artifact_md
+            cited_answer += "\n\n" + artifact_md
+        self.callback("gen_citations", {}, cited_answer, elapsed_time=timer() - st)
+        self.set_output("content", cited_answer)
+
+    async def _gen_citations_async(self, text):
        retrievals = self._canvas.get_reference()
        retrievals = {"chunks": list(retrievals["chunks"].values()), "doc_aggs": list(retrievals["doc_aggs"].values())}
        formated_refer = kb_prompt(retrievals, self.chat_mdl.max_length, True)
-        for delta_ans in self._generate_streamly([{"role": "system", "content": citation_plus("\n\n".join(formated_refer))},
-                                                  {"role": "user", "content": text}
-                                                  ]):
+        async for delta_ans in self._generate_streamly([{"role": "system", "content": citation_plus("\n\n".join(formated_refer))}, {"role": "user", "content": text}]):
            yield delta_ans

-    def _react_with_tools_streamly(self, prompt, history: list[dict], use_tools, user_defined_prompt={}):
-        token_count = 0
-        tool_metas = self.tool_meta
-        hist = deepcopy(history)
-        last_calling = ""
-        if len(hist) > 3:
-            st = timer()
-            user_request = full_question(messages=history, chat_mdl=self.chat_mdl)
-            self.callback("Multi-turn conversation optimization", {}, user_request, elapsed_time=timer()-st)
-        else:
-            user_request = history[-1]["content"]
+    def _collect_tool_artifact_markdown(self, existing_text: str = "") -> str:
+        md_parts = []
+        for tool_obj in self.tools.values():
+            if not hasattr(tool_obj, "_param") or not hasattr(tool_obj._param, "outputs"):
+                continue
+            artifacts_meta = tool_obj._param.outputs.get("_ARTIFACTS", {})
+            artifacts = artifacts_meta.get("value") if isinstance(artifacts_meta, dict) else None
+            if not artifacts:
+                continue
+            for art in artifacts:
+                if not isinstance(art, dict):
+                    continue
+                url = art.get("url", "")
+                if url and (f"![]({url})" in existing_text or f"![{art.get('name', '')}]({url})" in existing_text):
+                    continue
+                if art.get("mime_type", "").startswith("image/"):
+                    md_parts.append(f"![{art['name']}]({url})")
+                else:
+                    md_parts.append(f"[Download {art['name']}]({url})")
+        return "\n\n".join(md_parts)

-        def use_tool(name, args):
-            nonlocal hist, use_tools, token_count,last_calling,user_request
-            logging.info(f"{last_calling=} == {name=}")
-            # Summarize of function calling
-            #if all([
-            #    isinstance(self.toolcall_session.get_tool_obj(name), Agent),
-            #    last_calling,
-            #    last_calling != name
-            #]):
-            #    self.toolcall_session.get_tool_obj(name).add2system_prompt(f"The chat history with other agents are as following: \n" + self.get_useful_memory(user_request, str(args["user_prompt"]),user_defined_prompt))
-            last_calling = name
-            tool_response = self.toolcall_session.tool_call(name, args)
-            use_tools.append({
-                "name": name,
-                "arguments": args,
-                "results": tool_response
-            })
-            # self.callback("add_memory", {}, "...")
-            #self.add_memory(hist[-2]["content"], hist[-1]["content"], name, args, str(tool_response), user_defined_prompt)
-
-            return name, tool_response
-
-        def complete():
-            nonlocal hist
-            need2cite = self._param.cite and self._canvas.get_reference()["chunks"] and self._id.find("-->") < 0
-            cited = False
-            if hist[0]["role"] == "system" and need2cite:
-                if len(hist) < 7:
-                    hist[0]["content"] += citation_prompt()
-                    cited = True
-            yield "", token_count
-
-            _hist = hist
-            if len(hist) > 12:
-                _hist = [hist[0], hist[1], *hist[-10:]]
-            entire_txt = ""
-            for delta_ans in self._generate_streamly(_hist):
-                if not need2cite or cited:
-                    yield delta_ans, 0
-                entire_txt += delta_ans
-            if not need2cite or cited:
-                return
-
-            st = timer()
-            txt = ""
-            for delta_ans in self._gen_citations(entire_txt):
-                if self.check_if_canceled("Agent streaming"):
-                    return
-                yield delta_ans, 0
-                txt += delta_ans
-
-            self.callback("gen_citations", {}, txt, elapsed_time=timer()-st)
-
-        def append_user_content(hist, content):
-            if hist[-1]["role"] == "user":
-                hist[-1]["content"] += content
-            else:
-                hist.append({"role": "user", "content": content})
-
-        st = timer()
-        task_desc = analyze_task(self.chat_mdl, prompt, user_request, tool_metas, user_defined_prompt)
-        self.callback("analyze_task", {}, task_desc, elapsed_time=timer()-st)
-        for _ in range(self._param.max_rounds + 1):
-            if self.check_if_canceled("Agent streaming"):
-                return
-            response, tk = next_step(self.chat_mdl, hist, tool_metas, task_desc, user_defined_prompt)
-            # self.callback("next_step", {}, str(response)[:256]+"...")
-            token_count += tk
-            hist.append({"role": "assistant", "content": response})
-            try:
-                functions = json_repair.loads(re.sub(r"```.*", "", response))
-                if not isinstance(functions, list):
-                    raise TypeError(f"List should be returned, but `{functions}`")
-                for f in functions:
-                    if not isinstance(f, dict):
-                        raise TypeError(f"An object type should be returned, but `{f}`")
-                with ThreadPoolExecutor(max_workers=5) as executor:
-                    thr = []
-                    for func in functions:
-                        name = func["name"]
-                        args = func["arguments"]
-                        if name == COMPLETE_TASK:
-                            append_user_content(hist, f"Respond with a formal answer. FORGET(DO NOT mention) about `{COMPLETE_TASK}`. The language for the response MUST be as the same as the first user request.\n")
-                            for txt, tkcnt in complete():
-                                yield txt, tkcnt
-                            return
-
-                        thr.append(executor.submit(use_tool, name, args))
-
-                    st = timer()
-                    reflection = reflect(self.chat_mdl, hist, [th.result() for th in thr], user_defined_prompt)
-                    append_user_content(hist, reflection)
-                    self.callback("reflection", {}, str(reflection), elapsed_time=timer()-st)
-
-            except Exception as e:
-                logging.exception(msg=f"Wrong JSON argument format in LLM ReAct response: {e}")
-                e = f"\nTool call error, please correct the input parameter of response format and call it again.\n *** Exception ***\n{e}"
-                append_user_content(hist, str(e))
-
-        logging.warning( f"Exceed max rounds: {self._param.max_rounds}")
-        final_instruction = f"""
-{user_request}
-IMPORTANT: You have reached the conversation limit. Based on ALL the information and research you have gathered so far, please provide a DIRECT and COMPREHENSIVE final answer to the original request.
-Instructions:
-1. SYNTHESIZE all information collected during this conversation
-2. Provide a COMPLETE response using existing data - do not suggest additional research
-3. Structure your response as a FINAL DELIVERABLE, not a plan
-4. If information is incomplete, state what you found and provide the best analysis possible with available data
-5. DO NOT mention conversation limits or suggest further steps
-6. Focus on delivering VALUE with the information already gathered
-Respond immediately with your final comprehensive answer.
-        """
-        if self.check_if_canceled("Agent final instruction"):
-            return
-        append_user_content(hist, final_instruction)
-
-        for txt, tkcnt in complete():
-            yield txt, tkcnt
-
-    def get_useful_memory(self, goal: str, sub_goal:str, topn=3, user_defined_prompt:dict={}) -> str:
-        # self.callback("get_useful_memory", {"topn": 3}, "...")
-        mems = self._canvas.get_memory()
-        rank = rank_memories(self.chat_mdl, goal, sub_goal, [summ for (user, assist, summ) in mems], user_defined_prompt)
-        try:
-            rank = json_repair.loads(re.sub(r"```.*", "", rank))[:topn]
-            mems = [mems[r] for r in rank]
-            return "\n\n".join([f"User: {u}\nAgent: {a}" for u, a,_ in mems])
-        except Exception as e:
-            logging.exception(e)
-
-        return "Error occurred."
-
-    def reset(self, temp=False):
+    def reset(self, only_output=False):
        """
        Reset all tools if they have a reset method. This avoids errors for tools like MCPToolCallSession.
        """
+        for k in self._param.outputs.keys():
+            self._param.outputs[k]["value"] = None
+
        for k, cpn in self.tools.items():
            if hasattr(cpn, "reset") and callable(cpn.reset):
                cpn.reset()
-
+        if only_output:
+            return
+        for k in self._param.inputs.keys():
+            self._param.inputs[k]["value"] = None
+        self._param.debug_inputs = {}
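
Reviewer note on the structured-output path in the agent diff above: the sketch below restates `_clean_formatted_answer` as a free function and pairs it with a simplified, synchronous retry loop, to show what the three regexes strip and how a failed parse falls back to a reformat call. It is an illustration under assumptions, not the code in this PR: `parse_with_retries` and its `reformat` callable are hypothetical stand-ins for the `_force_format_to_schema_async` round trip, the standard-library `json` replaces `json_repair`, and `FENCE` is built at runtime only so the example contains no literal code fence.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built at runtime (assumption for readability)


def clean_formatted_answer(ans: str) -> str:
    """Strip a reasoning prefix and a surrounding json fence from model output."""
    # Drop everything up to and including the last closing think tag.
    ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
    # Drop everything up to and including the last opening json fence.
    ans = re.sub(r"^.*" + FENCE + "json", "", ans, flags=re.DOTALL)
    # Drop a trailing closing fence and any newlines after it.
    return re.sub(FENCE + r"\n*$", "", ans, flags=re.DOTALL)


def parse_with_retries(ans: str, reformat, max_retries: int = 2):
    """Try to parse `ans` as JSON; on failure ask `reformat` for a new answer.

    `reformat` is a hypothetical callable (str -> str) standing in for a
    second LLM call that is told to emit JSON only.
    Returns the parsed object, or None once all attempts are used up.
    """
    for _ in range(max_retries + 1):
        try:
            return json.loads(clean_formatted_answer(ans))
        except json.JSONDecodeError:
            ans = reformat(ans)
    return None


raw = "<think>plan</think>Here you go:\n" + FENCE + 'json\n{"a": 1}\n' + FENCE + "\n"
parsed = parse_with_retries(raw, reformat=lambda s: s)
```

Because the first two patterns are greedy and anchored at the start with `re.DOTALL`, any prose before the last fence is discarded along with it, which is usually what is wanted for a model that explains itself before emitting JSON.
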
--- a/agent/component/base.py
+++ b/agent/component/base.py
@@ -14,6 +14,7 @@
 #  limitations under the License.
 #

+import asyncio
 import re
 import time
 from abc import ABC
@@ -23,11 +24,15 @@ import os
 import logging
 from typing import Any, List, Union
 import pandas as pd
-import trio
 from agent import settings
 from common.connection_utils import timeout


+
+from common.misc_utils import thread_pool_exec
+
+_logger = logging.getLogger(__name__)
+
 _FEEDED_DEPRECATED_PARAMS = "_feeded_deprecated_params"
 _DEPRECATED_PARAMS = "_deprecated_params"
 _USER_FEEDED_PARAMS = "_user_feeded_params"
@@ -91,13 +96,18 @@ class ComponentParamBase(ABC):
        return {name: True for name in self.get_feeded_deprecated_params()}

    def __str__(self):
-        return json.dumps(self.as_dict(), ensure_ascii=False)
+        def _serialize_default(obj):
+            if callable(obj):
+                return None
+            logging.warning("ComponentParamBase.__str__: JSON fallback via str() for type=%s", type(obj).__name__)
+            return str(obj)
+        return json.dumps(self.as_dict(), ensure_ascii=False, default=_serialize_default)

    def as_dict(self):
        def _recursive_convert_obj_to_dict(obj):
            ret_dict = {}
            if isinstance(obj, dict):
-                for k,v in obj.items():
+                for k, v in obj.items():
                    if isinstance(v, dict) or (v and type(v).__name__ not in dir(builtins)):
                        ret_dict[k] = _recursive_convert_obj_to_dict(v)
                    else:
@@ -253,96 +263,65 @@ class ComponentParamBase(ABC):
                self._validate_param(attr, validation_json)

    @staticmethod
-    def check_string(param, descr):
+    def check_string(param, description):
        if type(param).__name__ not in ["str"]:
-            raise ValueError(
-                descr + " {} not supported, should be string type".format(param)
-            )
+            raise ValueError(description + " {} not supported, should be string type".format(param))

    @staticmethod
-    def check_empty(param, descr):
+    def check_empty(param, description):
        if not param:
-            raise ValueError(
-                descr + " does not support empty value."
-            )
+            raise ValueError(description + " does not support empty value.")

    @staticmethod
-    def check_positive_integer(param, descr):
+    def check_positive_integer(param, description):
        if type(param).__name__ not in ["int", "long"] or param <= 0:
-            raise ValueError(
-                descr + " {} not supported, should be positive integer".format(param)
-            )
+            raise ValueError(description + " {} not supported, should be positive integer".format(param))

    @staticmethod
-    def check_positive_number(param, descr):
+    def check_positive_number(param, description):
        if type(param).__name__ not in ["float", "int", "long"] or param <= 0:
-            raise ValueError(
-                descr + " {} not supported, should be positive numeric".format(param)
-            )
+            raise ValueError(description + " {} not supported, should be positive numeric".format(param))

    @staticmethod
-    def check_nonnegative_number(param, descr):
+    def check_nonnegative_number(param, description):
        if type(param).__name__ not in ["float", "int", "long"] or param < 0:
-            raise ValueError(
-                descr
-                + " {} not supported, should be non-negative numeric".format(param)
-            )
+            raise ValueError(description + " {} not supported, should be non-negative numeric".format(param))

    @staticmethod
-    def check_decimal_float(param, descr):
+    def check_decimal_float(param, description):
        if type(param).__name__ not in ["float", "int"] or param < 0 or param > 1:
-            raise ValueError(
-                descr
-                + " {} not supported, should be a float number in range [0, 1]".format(
-                    param
-                )
-            )
+            raise ValueError(description + " {} not supported, should be a float number in range [0, 1]".format(param))

    @staticmethod
-    def check_boolean(param, descr):
+    def check_boolean(param, description):
        if type(param).__name__ != "bool":
-            raise ValueError(
-                descr + " {} not supported, should be bool type".format(param)
-            )
+            raise ValueError(description + " {} not supported, should be bool type".format(param))

    @staticmethod
-    def check_open_unit_interval(param, descr):
+    def check_open_unit_interval(param, description):
        if type(param).__name__ not in ["float"] or param <= 0 or param >= 1:
-            raise ValueError(
-                descr + " should be a numeric number between 0 and 1 exclusively"
-            )
+            raise ValueError(description + " should be a numeric number between 0 and 1 exclusively")

    @staticmethod
-    def check_valid_value(param, descr, valid_values):
+    def check_valid_value(param, description, valid_values):
        if param not in valid_values:
-            raise ValueError(
-                descr
-                + " {} is not supported, it should be in {}".format(param, valid_values)
-            )
+            raise ValueError(description + " {} is not supported, it should be in {}".format(param, valid_values))

    @staticmethod
-    def check_defined_type(param, descr, types):
+    def check_defined_type(param, description, types):
        if type(param).__name__ not in types:
-            raise ValueError(
-                descr + " {} not supported, should be one of {}".format(param, types)
-            )
+            raise ValueError(description + " {} not supported, should be one of {}".format(param, types))

    @staticmethod
-    def check_and_change_lower(param, valid_list, descr=""):
+    def check_and_change_lower(param, valid_list, description=""):
        if type(param).__name__ != "str":
-            raise ValueError(
-                descr
-                + " {} not supported, should be one of {}".format(param, valid_list)
-            )
+            raise ValueError(description + " {} not supported, should be one of {}".format(param, valid_list))

        lower_param = param.lower()
        if lower_param in valid_list:
            return lower_param
        else:
-            raise ValueError(
-                descr
-                + " {} not supported, should be one of {}".format(param, valid_list)
-            )
+            raise ValueError(description + " {} not supported, should be one of {}".format(param, valid_list))

    @staticmethod
    def _greater_equal_than(value, limit):
@@ -374,16 +353,16 @@ class ComponentParamBase(ABC):
    def _not_in(value, wrong_value_list):
        return value not in wrong_value_list

-    def _warn_deprecated_param(self, param_name, descr):
+    def _warn_deprecated_param(self, param_name, description):
        if self._deprecated_params_set.get(param_name):
            logging.warning(
-                f"{descr} {param_name} is deprecated and ignored in this version."
+                f"{description} {param_name} is deprecated and ignored in this version."
            )

-    def _warn_to_deprecate_param(self, param_name, descr, new_param):
+    def _warn_to_deprecate_param(self, param_name, description, new_param):
        if self._deprecated_params_set.get(param_name):
            logging.warning(
-                f"{descr} {param_name} will be deprecated in future release; "
+                f"{description} {param_name} will be deprecated in future release; "
                f"please use {new_param} instead."
            )
            return True
@@ -392,8 +371,9 @@ class ComponentParamBase(ABC):

 class ComponentBase(ABC):
    component_name: str
-    thread_limiter = trio.CapacityLimiter(int(os.environ.get('MAX_CONCURRENT_CHATS', 10)))
-    variable_ref_patt = r"\{* *\{([a-zA-Z:0-9]+@[A-Za-z0-9_.]+|sys\.[A-Za-z0-9_.]+|env\.[A-Za-z0-9_.]+)\} *\}*"
+    thread_limiter = asyncio.Semaphore(int(os.environ.get("MAX_CONCURRENT_CHATS", 10)))
+    variable_ref_patt = r"\{* *\{([a-zA-Z:0-9]+@[A-Za-z0-9_.-]+|sys\.[A-Za-z0-9_.]+|env\.[A-Za-z0-9_.]+)\} *\}*"
+    iteration_alias_patt = r"\{* *\{(item|index|result)\} *\}*"

    def __str__(self):
        """
@@ -407,10 +387,11 @@ class ComponentBase(ABC):
            "params": {}
        }}""".format(self.component_name,
                     self._param
-        )
+                     )

    def __init__(self, canvas, id, param: ComponentParamBase):
        from agent.canvas import Graph  # Local import to avoid cyclic dependency
+
        assert isinstance(canvas, Graph), "canvas must be an instance of Canvas"
        self._canvas = canvas
        self._id = id
@@ -445,14 +426,42 @@ class ComponentBase(ABC):
        self.set_output("_elapsed_time", time.perf_counter() - self.output("_created_time"))
        return self.output()

-    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
+    async def invoke_async(self, **kwargs) -> dict[str, Any]:
+        """
+        Async wrapper for component invocation.
+        Prefers coroutine `_invoke_async` if present; otherwise falls back to `_invoke`.
+        Handles timing and error recording consistently with `invoke`.
+        """
+        self.set_output("_created_time", time.perf_counter())
+        try:
+            if self.check_if_canceled("Component processing"):
+                return
+
+            fn_async = getattr(self, "_invoke_async", None)
+            if fn_async and asyncio.iscoroutinefunction(fn_async):
+                await fn_async(**kwargs)
+            elif asyncio.iscoroutinefunction(self._invoke):
+                await self._invoke(**kwargs)
+            else:
+                await thread_pool_exec(self._invoke, **kwargs)
+        except Exception as e:
+            if self.get_exception_default_value():
+                self.set_exception_default_value()
+            else:
+                self.set_output("_ERROR", str(e))
+            logging.exception(e)
+        self._param.debug_inputs = {}
+        self.set_output("_elapsed_time", time.perf_counter() - self.output("_created_time"))
+        return self.output()
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60)))
    def _invoke(self, **kwargs):
        raise NotImplementedError()

-    def output(self, var_nm: str=None) -> Union[dict[str, Any], Any]:
+    def output(self, var_nm: str = None) -> Union[dict[str, Any], Any]:
        if var_nm:
            return self._param.outputs.get(var_nm, {}).get("value", "")
-        return {k: o.get("value") for k,o in self._param.outputs.items()}
+        return {k: o.get("value") for k, o in self._param.outputs.items()}

    def set_output(self, key: str, value: Any):
        if key not in self._param.outputs:
@@ -463,27 +472,44 @@ class ComponentBase(ABC):
        return self._param.outputs.get("_ERROR", {}).get("value")

    def reset(self, only_output=False):
-        for k in self._param.outputs.keys():
-            self._param.outputs[k]["value"] = None
+        outputs: dict = self._param.outputs  # local alias; avoids repeated attribute lookups in the loop
+        for k in outputs.keys():
+            outputs[k]["value"] = None
        if only_output:
            return
-        for k in self._param.inputs.keys():
-            self._param.inputs[k]["value"] = None
+
+        inputs: dict = self._param.inputs  # local alias; avoids repeated attribute lookups in the loop
+        for k in inputs.keys():
+            inputs[k]["value"] = None
        self._param.debug_inputs = {}

-    def get_input(self, key: str=None) -> Union[Any, dict[str, Any]]:
+    def get_input(self, key: str = None) -> Union[Any, dict[str, Any]]:
        if key:
            return self._param.inputs.get(key, {}).get("value")

        res = {}
-        for var, o in self.get_input_elements().items():
+        input_elements = self.get_input_elements()
+        _logger.debug(
+            "[Base] Component '%s' (%s) resolving inputs. Input element keys: %s",
+            self._id, self.component_name, list(input_elements.keys()),
+        )
+        for var, o in input_elements.items():
            v = self.get_param(var)
            if v is None:
+                _logger.debug("[Base]   var '%s': param is None, skipping", var)
                continue
            if isinstance(v, str) and self._canvas.is_reff(v):
-                self.set_input_value(var, self._canvas.get_variable_value(v))
+                resolved = self._canvas.get_variable_value(v)
+                self.set_input_value(var, resolved)
+                _logger.debug("[Base]   var '%s': resolved ref '%s' -> %s", var, v, json.dumps(resolved, ensure_ascii=False, default=str)[:200])
+            elif isinstance(v, str) and re.search(self.variable_ref_patt, v):
+                elements = self.get_input_elements_from_text(v)
+                kv = {k: e.get('value', '') for k, e in elements.items()}
+                self.set_input_value(var, self.string_format(v, kv))
+                _logger.debug("[Base]   var '%s': resolved text refs '%s' -> %s", var, v, json.dumps(kv, ensure_ascii=False, default=str)[:200])
            else:
                self.set_input_value(var, v)
+                _logger.debug("[Base]   var '%s': literal value -> %s", var, json.dumps(v, ensure_ascii=False, default=str)[:200])
            res[var] = self.get_input_value(var)
        return res

@@ -493,15 +519,46 @@ class ComponentBase(ABC):

        return {var: self.get_input_value(var) for var, o in self.get_input_elements().items()}

+    def _resolve_iteration_alias_ref(self, exp: str) -> str | None:
+        if exp not in {"item", "index", "result"}:
+            return None
+
+        parent = self.get_parent()
+        if not parent or parent.component_name.lower() != "iteration":
+            return None
+
+        for cid, cpn in self._canvas.components.items():
+            if cpn.get("parent_id") != parent._id:
+                continue
+            if cpn["obj"].component_name.lower() != "iterationitem":
+                continue
+            return f"{cid}@{exp}"
+
+        return None
+
    def get_input_elements_from_text(self, txt: str) -> dict[str, dict[str, str]]:
        res = {}
-        for r in re.finditer(self.variable_ref_patt, txt, flags=re.IGNORECASE|re.DOTALL):
+        for r in re.finditer(self.variable_ref_patt, txt, flags=re.IGNORECASE | re.DOTALL):
            exp = r.group(1)
-            cpn_id, var_nm = exp.split("@") if exp.find("@")>0 else ("", exp)
+            cpn_id, var_nm = exp.split("@") if exp.find("@") > 0 else ("", exp)
            res[exp] = {
-                "name": (self._canvas.get_component_name(cpn_id) +f"@{var_nm}") if cpn_id else exp,
+                "name": (self._canvas.get_component_name(cpn_id) + f"@{var_nm}") if cpn_id else exp,
                "value": self._canvas.get_variable_value(exp),
-                "_retrival": self._canvas.get_variable_value(f"{cpn_id}@_references") if cpn_id else None,
+                "_retrieval": self._canvas.get_variable_value(f"{cpn_id}@_references") if cpn_id else None,
+                "_cpn_id": cpn_id
+            }
+        for r in re.finditer(self.iteration_alias_patt, txt, flags=re.IGNORECASE | re.DOTALL):
+            exp = r.group(1)
+            if exp in res:
+                continue
+            ref = self._resolve_iteration_alias_ref(exp)
+            if not ref:
+                continue
+            cpn_id, var_nm = ref.split("@", 1)
+            res[exp] = {
+                "name": (self._canvas.get_component_name(cpn_id) + f"@{var_nm}"),
+                "value": self._canvas.get_variable_value(ref),
+                "_retrieval": self._canvas.get_variable_value(f"{cpn_id}@_references"),
                "_cpn_id": cpn_id
            }
        return res
@@ -522,6 +579,10 @@ class ComponentBase(ABC):
            return None
        return self._param.inputs[key].get("value")

+    @staticmethod
+    def be_output(v):
+        return pd.DataFrame([{"content": v}])
+
    def get_component_name(self, cpn_id) -> str:
        return self._canvas.get_component(cpn_id)["obj"].component_name.lower()

@@ -552,6 +613,7 @@ class ComponentBase(ABC):
        for n, v in kv.items():
            def repl(_match, val=v):
                return str(val) if val is not None else ""
+
            content = re.sub(
                r"\{%s\}" % re.escape(n),
                repl,
--- a/agent/component/begin.py
+++ b/agent/component/begin.py
@@ -27,7 +27,7 @@ class BeginParam(UserFillUpParam):
        self.prologue = "Hi! I'm your smart assistant. What can I do for you?"

    def check(self):
-        self.check_valid_value(self.mode, "The 'mode' should be either `conversational` or `task`", ["conversational", "task"])
+        self.check_valid_value(self.mode, "The 'mode' should be one of `conversational`, `task`, or `Webhook`", ["conversational", "task", "Webhook"])

    def get_input_form(self) -> dict[str, dict]:
        return getattr(self, "inputs")
@@ -40,17 +40,12 @@ class Begin(UserFillUp):
        if self.check_if_canceled("Begin processing"):
            return

-        for k, v in kwargs.get("inputs", {}).items():
+        layout_recognize = self._param.layout_recognize or None
+        merged_inputs = self._merge_runtime_inputs(kwargs.get("inputs", {}))
+        for k, v in merged_inputs.items():
            if self.check_if_canceled("Begin processing"):
                return
-
-            if isinstance(v, dict) and v.get("type", "").lower().find("file") >=0:
-                if v.get("optional") and v.get("value", None) is None:
-                    v = None
-                else:
-                    v = self._canvas.get_files([v["value"]])
-            else:
-                v = v.get("value")
+            v = self._resolve_input_value(v, layout_recognize)
            self.set_output(k, v)
            self.set_input_value(k, v)

--- a/agent/component/browser.py
+++ b/agent/component/browser.py
@@ -0,0 +1,730 @@
+#
+#  Copyright 2026 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import asyncio
+import hashlib
+import inspect
+import json
+import logging
+import os
+import re
+import shutil
+import tempfile
+from abc import ABC
+from pathlib import Path
+from typing import Any
+from urllib.error import HTTPError, URLError
+from urllib.parse import unquote, urlparse
+from urllib.request import Request, urlopen
+
+from agent.component.base import ComponentBase
+from agent.component.llm import LLMParam
+from api.db import FileType
+from api.db.joint_services.tenant_model_service import get_model_config_from_provider_instance, get_model_type_by_name
+from api.db.services import duplicate_name
+from api.db.services.file_service import FileService
+from api.utils.file_utils import filename_type
+from common import settings
+from common.connection_utils import timeout
+from common.misc_utils import get_uuid
+from rag.llm import FACTORY_DEFAULT_BASE_URL
+
+
+class BrowserParam(LLMParam):
+    """
+    Parameters for Browser node.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.prompts = "{sys.query}"
+        self.max_steps = 30
+        self.headless = True
+        self.enable_default_extensions = False
+        self.chromium_sandbox = False
+        # Reuse browser profile across runs of the same agent node by default.
+        self.persist_session = True
+        self.upload_sources = []
+        self.outputs = {
+            "content": {"type": "string", "value": ""},
+            "downloaded_files": {"type": "Array<Object>", "value": []},
+        }
+
+    def check(self):
+        self.check_empty(self.llm_id, "[Browser] LLM")
+        self.check_positive_integer(self.max_steps, "[Browser] Max steps")
+        self.check_boolean(self.headless, "[Browser] Headless")
+        self.check_boolean(self.enable_default_extensions, "[Browser] Enable default extensions")
+        self.check_boolean(self.chromium_sandbox, "[Browser] Chromium sandbox")
+        self.check_boolean(self.persist_session, "[Browser] Persist session")
+        self.check_empty(self.prompts, "[Browser] Prompts")
+        return True
+
+    def get_input_form(self) -> dict[str, dict]:
+        return {
+            "prompts": {"type": "text", "name": "Prompts"},
+            "upload_sources": {"type": "line", "name": "Upload sources"},
+        }
+
+
+class Browser(ComponentBase, ABC):
+    component_name = "Browser"
+
+    def _prepare_input_values(self):
+        for key, meta in self.get_input_elements().items():
+            val = meta.get("value")
+            if val is None:
+                val = ""
+            elif not isinstance(val, str):
+                val = json.dumps(val, ensure_ascii=False)
+            self.set_input_value(key, val)
+
+    def get_input_elements(self) -> dict[str, dict]:
+        text_parts = [
+            str(self._param.prompts or ""),
+            json.dumps(self._param.upload_sources, ensure_ascii=False),
+        ]
+        return self.get_input_elements_from_text("\n".join(text_parts))
+
+    def _resolve_param_value(self, value: Any) -> Any:
+        if isinstance(value, str):
+            direct_ref = value.strip()
+            if direct_ref.startswith("{") and direct_ref.endswith("}") and self._canvas.is_reff(direct_ref):
+                return self._canvas.get_variable_value(direct_ref)
+            return value
+        return value
+
+    def _extract_ids(self, value: Any) -> list[str]:
+        ids: list[str] = []
+        value = self._resolve_param_value(value)
+
+        def collect(item: Any):
+            if item is None:
+                return
+            if isinstance(item, str):
+                token = item.strip()
+                if not token:
+                    return
+                if token.startswith("{") and token.endswith("}") and self._canvas.is_reff(token):
+                    collect(self._canvas.get_variable_value(token))
+                    return
+                if token.startswith("[") and token.endswith("]"):
+                    try:
+                        parsed = json.loads(token)
+                        collect(parsed)
+                        return
+                    except Exception:
+                        pass
+                if self._is_http_url(token):
+                    ids.append(token)
+                    return
+                if "," in token:
+                    for part in token.split(","):
+                        collect(part)
+                    return
+                ids.append(token)
+                return
+            if isinstance(item, dict):
+                for k in ("file_id", "id", "url", "value"):
+                    if k in item:
+                        collect(item[k])
+                        return
+                for v in item.values():
+                    collect(v)
+                return
+            if isinstance(item, (list, tuple, set)):
+                for v in item:
+                    collect(v)
+                return
+            token = str(item).strip()
+            if token:
+                ids.append(token)
+
+        collect(value)
+        deduped: list[str] = []
+        visited = set()
+        for item in ids:
+            if item in visited:
+                continue
+            visited.add(item)
+            deduped.append(item)
+        return deduped
+
+    @staticmethod
+    def _is_http_url(value: str) -> bool:
+        token = str(value or "").strip()
+        if not token:
+            return False
+        parsed = urlparse(token)
+        return parsed.scheme in {"http", "https"} and bool(parsed.netloc)
+
+    @staticmethod
+    def _extract_url_filename(url: str, headers: Any) -> str:
+        content_disposition = str(getattr(headers, "get", lambda *_args, **_kwargs: "")("Content-Disposition", "") or "")
+        if content_disposition:
+            # Prefer RFC 5987 encoded filename*=UTF-8''... when present.
+            m = re.search(r"filename\*\s*=\s*(?:UTF-8''|utf-8'')?([^;]+)", content_disposition)
+            if m:
+                name = unquote(m.group(1).strip().strip('"'))
+                if name:
+                    return os.path.basename(name)
+            m = re.search(r'filename\s*=\s*"([^"]+)"', content_disposition)
+            if m:
+                name = m.group(1).strip()
+                if name:
+                    return os.path.basename(name)
+            m = re.search(r"filename\s*=\s*([^;]+)", content_disposition)
+            if m:
+                name = m.group(1).strip().strip('"')
+                if name:
+                    return os.path.basename(name)
+
+        parsed = urlparse(url)
+        # Decode before taking the basename so an encoded "%2F" cannot smuggle path separators into the name.
+        name = os.path.basename(unquote(parsed.path or "")).strip()
+        if name:
+            return name
+        return f"url_file_{get_uuid()[:8]}.bin"
+
+    @staticmethod
+    def _resolve_upload_url_max_bytes() -> int:
+        raw = str(os.getenv("RAGFLOW_BROWSER_UPLOAD_URL_MAX_BYTES", "") or "").strip()
+        default_max_bytes = 100 * 1024 * 1024
+        if not raw:
+            return default_max_bytes
+        try:
+            parsed = int(raw)
+            return parsed if parsed > 0 else default_max_bytes
+        except (TypeError, ValueError):
+            return default_max_bytes
+
+    @staticmethod
+    def _restore_env_var(key: str, value: str | None):
+        if value is None:
+            os.environ.pop(key, None)
+            return
+        os.environ[key] = value
+
+    def _prepare_upload_url_file(self, url: str, upload_dir: str) -> dict[str, Any] | None:
+        max_bytes = self._resolve_upload_url_max_bytes()
+        local_path = ""
+        local_name = ""
+        total_size = 0
+        try:
+            req = Request(url, headers={"User-Agent": "RAGFlow-Browser-Node/1.0"})
+            with urlopen(req, timeout=30) as response:
+                local_name = self._extract_url_filename(url, response.headers)
+
+                local_path = os.path.join(upload_dir, local_name)
+                index = 1
+                while os.path.exists(local_path):
+                    stem, ext = os.path.splitext(local_name)
+                    local_path = os.path.join(upload_dir, f"{stem}_{index}{ext}")
+                    index += 1
+
+                with open(local_path, "wb") as f:
+                    while True:
+                        chunk = response.read(1024 * 1024)
+                        if not chunk:
+                            break
+                        total_size += len(chunk)
+                        if total_size > max_bytes:
+                            raise ValueError(f"upload url file exceeds max size limit: {max_bytes}")
+                        f.write(chunk)
+        except (HTTPError, URLError, OSError, TimeoutError, ValueError) as e:
+            if local_path and os.path.exists(local_path):
+                try:
+                    os.remove(local_path)
+                except OSError:
+                    pass
+            logging.warning("Browser failed to fetch upload url. url=%s, error=%s", url, e)
+            return None
+
+        if total_size <= 0:
+            if local_path and os.path.exists(local_path):
+                try:
+                    os.remove(local_path)
+                except OSError:
+                    pass
+            logging.warning("Browser upload url returned empty content: %s", url)
+            return None
+
+        return {
+            "file_id": "",
+            "name": local_name,
+            "size": total_size,
+            "local_path": local_path,
+            "source_url": url,
+        }
+
+    def _resolve_text(self, raw_text: Any) -> str:
+        text = str(self._resolve_param_value(raw_text) or "")
+        vars_map = self.get_input_elements_from_text(text)
+        kv = {}
+        for key, meta in vars_map.items():
+            val = meta.get("value", "")
+            if isinstance(val, str):
+                kv[key] = val
+            else:
+                kv[key] = json.dumps(val, ensure_ascii=False)
+        return self.string_format(text, kv)
+
+    @staticmethod
+    def _as_model_config_dict(cfg_obj: Any) -> dict[str, Any]:
+        if cfg_obj is None:
+            return {}
+        if isinstance(cfg_obj, dict):
+            return cfg_obj
+        if hasattr(cfg_obj, "to_dict") and callable(cfg_obj.to_dict):
+            try:
+                result = cfg_obj.to_dict()
+                return result if isinstance(result, dict) else {}
+            except (AttributeError, TypeError, ValueError):
+                return {}
+        result = {}
+        for key in ("model", "model_name", "llm_name", "llm_factory", "api_key", "base_url", "api_base", "temperature"):
+            val = getattr(cfg_obj, key, None)
+            if val not in (None, ""):
+                result[key] = val
+        return result
+
+    @staticmethod
+    def _error_chain(exc: Exception) -> str:
+        parts = []
+        cur = exc
+        depth = 0
+        while cur is not None and depth < 6:
+            parts.append(f"{type(cur).__name__}: {cur}")
+            cur = cur.__cause__ or cur.__context__
+            depth += 1
+        return " <- ".join(parts)
+
+    @staticmethod
+    def _resolve_browser_executable() -> str:
+        explicit_candidates = [
+            os.getenv("BROWSER_USE_EXECUTABLE_PATH", "").strip(),
+            os.getenv("BROWSER_USE_BROWSER_BINARY_PATH", "").strip(),
+            os.getenv("BROWSER_USE_CHROME_BINARY_PATH", "").strip(),
+        ]
+        for explicit in explicit_candidates:
+            if explicit and os.path.isfile(explicit) and os.access(explicit, os.X_OK):
+                return explicit
+        candidates = [
+            "/opt/chrome/chrome",
+            "/usr/local/bin/chrome",
+            "/usr/local/bin/google-chrome",
+            "/usr/bin/google-chrome",
+            "/usr/bin/google-chrome-stable",
+            "/usr/bin/chromium",
+            "/usr/bin/chromium-browser",
+        ]
+        for path in candidates:
+            if os.path.isfile(path) and os.access(path, os.X_OK):
+                return path
+        for cmd in ("chrome", "google-chrome", "google-chrome-stable", "chromium", "chromium-browser"):
+            path = shutil.which(cmd)
+            if path and os.path.isfile(path) and os.access(path, os.X_OK):
+                return path
+        return ""
+
+    @staticmethod
+    def _normalize_model_name(model: Any) -> str:
+        name = str(model or "").strip()
+        if not name:
+            return ""
+        if name.startswith("bu-") or name.startswith("browser-use/"):
+            return name
+        if "@" in name:
+            # RAGFlow model aliases may include provider suffix, e.g. qwen3.5-flash@Tongyi-Qianwen.
+            # browser-use OpenAI-compatible adapters need the pure model name.
+            name = name.split("@", 1)[0].strip()
+        return name
+
+    @staticmethod
+    def _safe_path_segment(value: Any) -> str:
+        token = str(value or "").strip()
+        if not token:
+            return "unknown"
+        token = re.sub(r"[^A-Za-z0-9._-]+", "_", token)
+        return token.strip("._-") or "unknown"
+
+    def _resolve_persistent_profile_dir(self) -> str:
+        root = os.path.join(tempfile.gettempdir(), "ragflow_browser_use_profiles")
+        tenant = self._safe_path_segment(self._canvas.get_tenant_id())
+        raw_canvas_id = getattr(self._canvas, "_id", "")
+        if not raw_canvas_id:
+            graph_text = json.dumps(
+                self._canvas.dsl.get("graph", {}),
+                sort_keys=True,
+                ensure_ascii=False,
+            )
+            raw_canvas_id = (
+                f"dsl_{hashlib.sha1(graph_text.encode('utf-8')).hexdigest()[:12]}"
+            )
+        canvas_id = self._safe_path_segment(raw_canvas_id)
+        node_id = self._safe_path_segment(self._id)
+        return os.path.join(root, tenant, canvas_id, node_id)
+
+    def _should_persist_session(self) -> bool:
+        return bool(self._param.persist_session)
+
+    def _infer_provider_name(self, cfg: dict[str, Any]) -> str:
+        provider = str(cfg.get("llm_factory") or "").strip()
+        if provider:
+            return provider
+        llm_id = str(self._param.llm_id or "")
+        if "@" in llm_id:
+            return llm_id.split("@", 1)[1].strip()
+        return ""
+
+    def _resolve_openai_compatible_base_url(self, cfg: dict[str, Any]) -> str:
+        explicit = str(cfg.get("base_url") or cfg.get("api_base") or "").strip()
+        if explicit:
+            return explicit
+
+        provider = self._infer_provider_name(cfg)
+        fallback = str(FACTORY_DEFAULT_BASE_URL.get(provider, "")).strip()
+        return fallback if fallback else ""
+
+    def _build_browser_llm(self):
+        from browser_use.llm import ChatBrowserUse, ChatOpenAI
+
+        chat_model_config = get_model_config_from_provider_instance(
+            self._canvas.get_tenant_id(),
+            get_model_type_by_name(self._canvas.get_tenant_id(), self._param.llm_id),
+            self._param.llm_id,
+        )
+        cfg = self._as_model_config_dict(chat_model_config)
+        model_name = self._normalize_model_name(cfg.get("model_name") or cfg.get("model") or self._param.llm_id)
+        if not model_name:
+            raise ValueError(f"Invalid model config for Browser llm_id={self._param.llm_id}")
+        base_url = self._resolve_openai_compatible_base_url(cfg)
+
+        # ChatBrowserUse only supports bu-* models. For tenant models, use OpenAI-compatible adapter.
+        if model_name.startswith("bu-") or model_name.startswith("browser-use/"):
+            llm_kwargs = {
+                "model": model_name,
+                "api_key": cfg.get("api_key"),
+                "base_url": base_url,
+                "temperature": self._param.temperature,
+                "max_retries": self._param.max_retries,
+            }
+            llm_kwargs = {k: v for k, v in llm_kwargs.items() if v not in (None, "")}
+            return ChatBrowserUse(**llm_kwargs)
+
+        # browser-use Agent defaults to json_schema response_format and may use tool_choice via
+        # ChatDeepSeek. Many providers (e.g. DeepSeek thinking models) reject both. Use ChatOpenAI
+        # with schema-in-prompt and without forced structured output on the first run.
+        llm_kwargs = {
+            "model": model_name,
+            "api_key": cfg.get("api_key"),
+            "base_url": base_url,
+            "temperature": self._param.temperature,
+            "max_retries": self._param.max_retries,
+            "add_schema_to_system_prompt": True,
+            "dont_force_structured_output": True,
+        }
+        llm_kwargs = {k: v for k, v in llm_kwargs.items() if v not in (None, "")}
+        return ChatOpenAI(**llm_kwargs)
+
+    async def _run_browser_use_async(
+        self,
+        task_text: str,
+        download_dir: str,
+        available_file_paths: list[str] | None = None,
+        profile_dir: str | None = None,
+    ):
+        from browser_use import Agent as BrowserUseAgent, Browser as BrowserUseBrowser
+
+        llm = self._build_browser_llm()
+        # NOTE:
+        # _invoke() uses asyncio.run(), which creates a fresh event loop per task run.
+        # Reusing a Browser object created by a previous loop can deadlock or time out
+        # in browser-use watchdog handlers on subsequent runs.
+        # We keep persistent user_data_dir for session continuity, but we do not keep
+        # browser instances alive across runs.
+        available_file_paths = available_file_paths or []
+        agent_kwargs: dict[str, Any] = {
+            "task": task_text,
+            "llm": llm,
+            "available_file_paths": available_file_paths,
+        }
+        browser_obj = None
+        previous_disable_extensions = os.environ.get("BROWSER_USE_DISABLE_EXTENSIONS")
+        previous_browser_binary_path = os.environ.get("BROWSER_USE_BROWSER_BINARY_PATH")
+
+        try:
+            enable_default_extensions = bool(self._param.enable_default_extensions)
+            if not enable_default_extensions:
+                os.environ["BROWSER_USE_DISABLE_EXTENSIONS"] = "1"
+            else:
+                os.environ.pop("BROWSER_USE_DISABLE_EXTENSIONS", None)
+
+            executable_path = self._resolve_browser_executable()
+            browser_kwargs = {
+                "headless": self._param.headless,
+                "downloads_path": download_dir,
+                # Docker often runs as root without user namespaces; disable sandbox by default.
+                "chromium_sandbox": bool(self._param.chromium_sandbox),
+                # Disable runtime extension download by default for intranet/offline environments.
+                # Enable only when explicitly required and extensions are pre-cached.
+                "enable_default_extensions": enable_default_extensions,
+            }
+            if executable_path:
+                browser_kwargs["executable_path"] = executable_path
+                # Keep browser-use watchdog fallback in sync with our resolved path.
+                os.environ["BROWSER_USE_BROWSER_BINARY_PATH"] = executable_path
+            else:
+                logging.warning(
+                    "Browser found no local browser executable. "
+                    "Set BROWSER_USE_EXECUTABLE_PATH or preinstall Chromium in the image to avoid a runtime Playwright install."
+                )
+            if profile_dir:
+                browser_kwargs["user_data_dir"] = profile_dir
+                # browser-use expects profile_directory to be a profile name
+                # such as "Default" / "Profile 1", not an absolute path.
+                browser_kwargs["profile_directory"] = "Default"
+
+            browser_obj = BrowserUseBrowser(**browser_kwargs)
+            agent_kwargs["browser"] = browser_obj
+        except (OSError, RuntimeError, TypeError, ValueError) as e:
+            logging.warning("Browser browser context customization skipped: %s", e)
+
+        agent = BrowserUseAgent(**agent_kwargs)
+
+        history = None
+        run_fn = getattr(agent, "run", None)
+        if run_fn is None:
+            raise RuntimeError("browser-use Agent does not provide run().")
+
+        run_kwargs = {"max_steps": self._param.max_steps}
+        try:
+            if inspect.iscoroutinefunction(run_fn):
+                history = await run_fn(**run_kwargs)
+            else:
+                history = await asyncio.to_thread(run_fn, **run_kwargs)
+        except Exception as e:
+            logging.error("Browser agent.run failed. error_chain=%s", self._error_chain(e))
+            logging.exception("Browser agent.run traceback")
+            raise
+        finally:
+            if browser_obj:
+                close_fn = getattr(browser_obj, "close", None)
+                if close_fn:
+                    try:
+                        if inspect.iscoroutinefunction(close_fn):
+                            await close_fn()
+                        else:
+                            await asyncio.to_thread(close_fn)
+                    except Exception as close_err:
+                        logging.warning("Browser failed to close browser object cleanly: %s", close_err)
+            self._restore_env_var("BROWSER_USE_DISABLE_EXTENSIONS", previous_disable_extensions)
+            self._restore_env_var("BROWSER_USE_BROWSER_BINARY_PATH", previous_browser_binary_path)
+
+        return history
+
+    def _prepare_upload_files(self, upload_dir: str) -> list[dict[str, Any]]:
+        upload_refs = self._extract_ids(self._param.upload_sources)
+        prepared = []
+        for file_ref in upload_refs:
+            if self._is_http_url(file_ref):
+                prepared_url_file = self._prepare_upload_url_file(file_ref, upload_dir)
+                if prepared_url_file:
+                    prepared.append(prepared_url_file)
+                continue
+
+            file_id = file_ref
+            exists, file = FileService.get_by_id(file_id)
+            if not exists:
+                logging.warning("Browser upload file_id not found: %s", file_id)
+                continue
+            try:
+                blob = settings.STORAGE_IMPL.get(file.parent_id, file.location)
+                if not blob:
+                    logging.warning("Browser upload blob not found: %s", file_id)
+                    continue
+                local_name = os.path.basename(file.location) if file.location else (file.name or f"{file_id}.bin")
+                local_path = os.path.join(upload_dir, local_name)
+                index = 1
+                while os.path.exists(local_path):
+                    stem, ext = os.path.splitext(local_name)
+                    local_path = os.path.join(upload_dir, f"{stem}_{index}{ext}")
+                    index += 1
+                with open(local_path, "wb") as f:
+                    f.write(blob)
+            except OSError as e:
+                logging.warning("Browser failed to prepare upload file. file_id=%s, error=%s", file_id, e)
+                continue
+            except Exception as e:
+                logging.warning("Browser failed to fetch upload blob. file_id=%s, error=%s", file_id, e)
+                continue
+            prepared.append(
+                {
+                    "file_id": file.id,
+                    "name": file.name,
+                    "size": file.size,
+                    "local_path": local_path,
+                }
+            )
+        return prepared
+
+    def _save_downloads(self, download_dir: str, parent_id: str) -> list[dict[str, Any]]:
+        downloaded_files: list[dict[str, Any]] = []
+        exists, folder = FileService.get_by_id(parent_id)
+        if not exists or folder.type != FileType.FOLDER.value:
+            raise ValueError(f"RAGFlow target folder does not exist or is not a folder: {parent_id}")
+        tenant_id = self._canvas.get_tenant_id()
+        storage_put = settings.STORAGE_IMPL.put
+        storage_rm = getattr(settings.STORAGE_IMPL, "rm", None)
+        insert_file = FileService.insert
+
+        for path in Path(download_dir).rglob("*"):
+            if not path.is_file():
+                continue
+            try:
+                if path.stat().st_size <= 0:
+                    continue
+                blob = path.read_bytes()
+            except OSError as e:
+                logging.warning("Browser failed to read downloaded file. path=%s, error=%s", path, e)
+                continue
+            if not blob:
+                continue
+            display_name = ""
+            blob_stored = False
+            try:
+                display_name = duplicate_name(FileService.query, name=path.name, parent_id=parent_id)
+                storage_put(parent_id, display_name, blob)
+                blob_stored = True
+                file_data = {
+                    "id": get_uuid(),
+                    "parent_id": parent_id,
+                    "tenant_id": tenant_id,
+                    "created_by": tenant_id,
+                    "type": filename_type(display_name),
+                    "name": display_name,
+                    "location": display_name,
+                    "size": len(blob),
+                }
+                inserted = insert_file(file_data)
+                downloaded_files.append(
+                    {
+                        "file_id": inserted.id,
+                        "name": inserted.name,
+                        "size": inserted.size,
+                        "parent_id": inserted.parent_id,
+                    }
+                )
+            except Exception as e:
+                if blob_stored and callable(storage_rm):
+                    try:
+                        storage_rm(parent_id, display_name)
+                    except Exception as rollback_err:
+                        logging.warning(
+                            "Browser failed to roll back stored download. path=%s, parent_id=%s, display_name=%s, error=%s",
+                            path,
+                            parent_id,
+                            display_name,
+                            rollback_err,
+                        )
+                logging.error(
+                    "Browser failed to save download. path=%s, tenant_id=%s, parent_id=%s, display_name=%s, error=%s",
+                    path,
+                    tenant_id,
+                    parent_id,
+                    display_name,
+                    e,
+                )
+                continue
+        return downloaded_files
+
+    @staticmethod
+    def _extract_history_text(history: Any) -> str:
+        if history is None:
+            return ""
+
+        def pick_final_result(value: Any) -> str:
+            if value is None:
+                return ""
+            if isinstance(value, str):
+                return value.strip()
+            if isinstance(value, (int, float, bool)):
+                return str(value)
+            return ""
+
+        # Only trust browser-use's explicit final_result API/property.
+        final_result_fn = getattr(history, "final_result", None)
+        if callable(final_result_fn):
+            try:
+                final_result_value = final_result_fn()
+                return pick_final_result(final_result_value)
+            except Exception:
+                return ""
+        return pick_final_result(final_result_fn)
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 20 * 60)))
+    def _invoke(self, **kwargs):
+        profile_dir = None
+        persist_session = self._should_persist_session()
+        try:
+            self._prepare_input_values()
+            user_prompt = self._resolve_text(kwargs.get("prompts", self._param.prompts))
+            with tempfile.TemporaryDirectory(prefix="browser_use_upload_") as upload_dir, tempfile.TemporaryDirectory(
+                prefix="browser_use_download_"
+            ) as download_dir:
+                uploaded_files = self._prepare_upload_files(upload_dir)
+
+                upload_lines = [
+                    f"- file_id={item['file_id']}, name={item['name']}, local_path={item['local_path']}"
+                    for item in uploaded_files
+                ]
+                task_text = user_prompt
+                if upload_lines:
+                    task_text += (
+                        "\n\nYou can upload files from these local paths when operating web pages:\n"
+                        + "\n".join(upload_lines)
+                    )
+
+                upload_local_paths = [item.get("local_path", "") for item in uploaded_files if item.get("local_path")]
+                if persist_session:
+                    profile_dir = self._resolve_persistent_profile_dir()
+                    os.makedirs(profile_dir, exist_ok=True)
+                else:
+                    try:
+                        profile_dir = tempfile.mkdtemp(prefix="browser_use_profile_")
+                    except OSError:
+                        profile_dir = None
+                history = asyncio.run(
+                    self._run_browser_use_async(
+                        task_text, download_dir, upload_local_paths, profile_dir
+                    )
+                )
+                target_dir_id = FileService.get_root_folder(self._canvas.get_tenant_id())["id"]
+                downloaded_files = self._save_downloads(download_dir, target_dir_id)
+
+                self.set_output("content", self._extract_history_text(history))
+                self.set_output("downloaded_files", downloaded_files)
+                return self.output()
+        except Exception as e:
+            logging.exception("Browser invoke failed")
+            self.set_output("_ERROR", str(e))
+            return self.output()
+        finally:
+            if profile_dir and not persist_session:
+                shutil.rmtree(profile_dir, ignore_errors=True)
+
+    def thoughts(self) -> str:
+        return "Planning and executing browser actions..."
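The `_save_downloads` method above writes the blob to storage first, then inserts the file record, and removes the blob again if the insert fails. A minimal standalone sketch of that store-then-insert-with-rollback pattern follows. `InMemoryStorage` and `save_with_rollback` are hypothetical names used only for illustration; they are not part of RAGFlow, and the real code goes through `settings.STORAGE_IMPL` and `FileService`.

```python
from typing import Any, Callable, Optional


class InMemoryStorage:
    """Hypothetical stand-in for an object store with put/rm methods."""

    def __init__(self) -> None:
        self.blobs: dict[tuple[str, str], bytes] = {}

    def put(self, bucket: str, name: str, blob: bytes) -> None:
        self.blobs[(bucket, name)] = blob

    def rm(self, bucket: str, name: str) -> None:
        self.blobs.pop((bucket, name), None)


def save_with_rollback(
    storage: InMemoryStorage,
    insert: Callable[[dict[str, Any]], dict[str, Any]],
    bucket: str,
    name: str,
    blob: bytes,
) -> Optional[dict[str, Any]]:
    """Store the blob, then insert its record; undo the store if the insert fails."""
    stored = False
    try:
        storage.put(bucket, name, blob)
        stored = True
        return insert({"name": name, "size": len(blob)})
    except Exception:
        # Only roll back if the blob actually reached storage.
        if stored:
            try:
                storage.rm(bucket, name)
            except Exception:
                pass  # Best effort: a failed rollback must not mask the original error path.
        return None
```

Tracking `stored` separately, as the diff does with `blob_stored`, avoids calling the remove operation for a blob that was never written.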
--- a/agent/component/categorize.py
+++ b/agent/component/categorize.py
@@ -13,6 +13,7 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+import asyncio
 import logging
 import os
 import re
@@ -20,6 +21,7 @@ from abc import ABC

 from common.constants import LLMType
 from api.db.services.llm_service import LLMBundle
+from api.db.joint_services.tenant_model_service import get_model_config_from_provider_instance
 from agent.component.llm import LLMParam, LLM
 from common.connection_utils import timeout
 from rag.llm.chat_model import ERROR_PREFIX
@@ -38,7 +40,8 @@ class CategorizeParam(LLMParam):
        self.update_prompt()

    def check(self):
-        self.check_positive_integer(self.message_history_window_size, "[Categorize] Message window size > 0")
+        if not isinstance(self.message_history_window_size, int) or self.message_history_window_size < 0:
+            raise ValueError("[Categorize] Message window size must be a non-negative integer")
        self.check_empty(self.category_description, "[Categorize] Category examples")
        for k, v in self.category_description.items():
            if not k:
@@ -96,22 +99,33 @@ Here's description of each category:
 class Categorize(LLM, ABC):
    component_name = "Categorize"

+    def get_input_elements(self) -> dict[str, dict]:
+        query_key = self._param.query or "sys.query"
+        elements = self.get_input_elements_from_text(f"{{{query_key}}}")
+        if not elements:
+            logging.warning("[Categorize] input element not detected for query key: %s", query_key)
+        return elements
+
    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
-    def _invoke(self, **kwargs):
+    async def _invoke_async(self, **kwargs):
        if self.check_if_canceled("Categorize processing"):
            return

        msg = self._canvas.get_history(self._param.message_history_window_size)
        if not msg:
            msg = [{"role": "user", "content": ""}]
-        if kwargs.get("sys.query"):
-            msg[-1]["content"] = kwargs["sys.query"]
-            self.set_input_value("sys.query", kwargs["sys.query"])
+        query_key = self._param.query or "sys.query"
+        if query_key in kwargs:
+            query_value = kwargs[query_key]
        else:
-            msg[-1]["content"] = self._canvas.get_variable_value(self._param.query)
-            self.set_input_value(self._param.query, msg[-1]["content"])
+            query_value = self._canvas.get_variable_value(query_key)
+        if query_value is None:
+            query_value = ""
+        msg[-1]["content"] = query_value
+        self.set_input_value(query_key, msg[-1]["content"])
        self._param.update_prompt()
-        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        chat_model_config = get_model_config_from_provider_instance(self._canvas.get_tenant_id(), LLMType.CHAT, self._param.llm_id)
+        chat_mdl = LLMBundle(self._canvas.get_tenant_id(), chat_model_config)

        user_prompt = """
 ---- Real Data ----
@@ -121,7 +135,7 @@ class Categorize(LLM, ABC):
        if self.check_if_canceled("Categorize processing"):
            return

-        ans = chat_mdl.chat(self._param.sys_prompt, [{"role": "user", "content": user_prompt}], self._param.gen_conf())
+        ans = await chat_mdl.async_chat(self._param.sys_prompt, [{"role": "user", "content": user_prompt}], self._param.gen_conf())
        logging.info(f"input: {user_prompt}, answer: {str(ans)}")
        if ERROR_PREFIX in ans:
            raise Exception(ans)
@@ -136,7 +150,7 @@ class Categorize(LLM, ABC):
            category_counts[c] = count

        cpn_ids = list(self._param.category_description.items())[-1][1]["to"]
-        max_category = list(self._param.category_description.keys())[0]
+        max_category = list(self._param.category_description.keys())[-1]
        if any(category_counts.values()):
            max_category = max(category_counts.items(), key=lambda x: x[1])[0]
            cpn_ids = self._param.category_description[max_category]["to"]
@@ -144,5 +158,9 @@ class Categorize(LLM, ABC):
        self.set_output("category_name", max_category)
        self.set_output("_next", cpn_ids)

+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
+    def _invoke(self, **kwargs):
+        return asyncio.run(self._invoke_async(**kwargs))
+
    def thoughts(self) -> str:
        return "Which should it falls into {}? ...".format(",".join([f"`{c}`" for c, _ in self._param.category_description.items()]))
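The Categorize change above moves the work into `_invoke_async` and keeps `_invoke` as a thin synchronous wrapper built on `asyncio.run`. A minimal sketch of that wrapper pattern, combined with the query-key fallback from the same hunk, might look like the following. `DemoComponent` and its `variables` dict are hypothetical stand-ins for the component and the canvas variable lookup, not RAGFlow classes.

```python
import asyncio
from typing import Any, Optional


class DemoComponent:
    """Hypothetical component showing a sync wrapper over an async implementation."""

    def __init__(self, query: Optional[str] = None, variables: Optional[dict[str, Any]] = None) -> None:
        self.query = query
        self.variables = variables or {}

    async def _invoke_async(self, **kwargs: Any) -> str:
        # Prefer an explicit keyword argument, then fall back to a stored variable.
        query_key = self.query or "sys.query"
        if query_key in kwargs:
            value = kwargs[query_key]
        else:
            value = self.variables.get(query_key)
        if value is None:
            value = ""
        await asyncio.sleep(0)  # Placeholder for an awaited model call.
        return value

    def _invoke(self, **kwargs: Any) -> str:
        # asyncio.run creates a fresh event loop, runs the coroutine, and closes the loop.
        return asyncio.run(self._invoke_async(**kwargs))
```

One caveat of this design: `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop, so the synchronous wrapper is only suitable for callers that are not themselves running inside a loop.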
--- a/agent/component/data_operations.py
+++ b/agent/component/data_operations.py
@@ -1,3 +1,18 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
 from abc import ABC
 import ast
 import os
@@ -58,7 +73,7 @@ class DataOperations(ComponentBase,ABC):
                continue
        if self._param.operations == "select_keys":
            self._select_keys()
-        elif self._param.operations == "recursive_eval":
+        elif self._param.operations == "literal_eval":
            self._literal_eval()
        elif self._param.operations == "combine":
            self._combine()
@@ -79,9 +94,9 @@ class DataOperations(ComponentBase,ABC):

    def _recursive_eval(self, data):
        if isinstance(data, dict):
-            return {k: self.recursive_eval(v) for k, v in data.items()}
+            return {k: self._recursive_eval(v) for k, v in data.items()}
        if isinstance(data, list):
-            return [self.recursive_eval(item) for item in data]
+            return [self._recursive_eval(item) for item in data]
        if isinstance(data, str):
            try:
                if (
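The `data_operations.py` hunk above fixes the recursive calls so that nested dicts and lists are walked through `_recursive_eval` itself. A simplified sketch of the idea is shown below. It assumes every string is offered to `ast.literal_eval` and kept unchanged when parsing fails; the actual method applies extra conditions to strings that this excerpt truncates, so treat `recursive_eval` here as an illustration rather than the component's exact behavior.

```python
import ast
from typing import Any


def recursive_eval(data: Any) -> Any:
    """Walk nested dicts and lists, parsing string leaves as Python literals when possible."""
    if isinstance(data, dict):
        return {k: recursive_eval(v) for k, v in data.items()}
    if isinstance(data, list):
        return [recursive_eval(item) for item in data]
    if isinstance(data, str):
        try:
            # literal_eval only accepts literals (numbers, strings, lists, dicts, ...),
            # so it does not execute arbitrary code.
            return ast.literal_eval(data)
        except (ValueError, SyntaxError, TypeError):
            return data  # Not a literal: keep the original string.
    return data
```

For example, `recursive_eval({"a": "[1, 2]", "b": ["3", "x y"]})` parses the first two strings and leaves `"x y"` untouched because it is not a valid literal.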
--- a/agent/component/docs_generator.py
+++ b/agent/component/docs_generator.py
@@ -0,0 +1,642 @@
+import base64
+import logging
+import json
+import os
+import re
+import shutil
+import tempfile
+from abc import ABC
+from datetime import datetime
+from functools import partial
+from io import BytesIO
+from xml.sax.saxutils import escape
+
+from agent.component.base import ComponentParamBase
+from api.utils.api_utils import timeout
+from common import settings
+from common.misc_utils import get_uuid
+from .message import Message
+
+
+def sanitize_filename(name: str, extension: str) -> str:
+    if not name:
+        return f"file.{extension}"
+
+    name = str(name).strip()
+    name = re.sub(r'[\\/\x00-\x1f\?\#\%\*\:\|\<\>"]', " ", name)
+    name = re.sub(r"\s+", " ", name).strip(" .")
+
+    if not name:
+        return f"file.{extension}"
+
+    base, _ = os.path.splitext(name)
+    base = base[:180].rstrip() or "file"
+    return f"{base}.{extension}"
+
+
+class DocGeneratorParam(ComponentParamBase):
+    """
+    Define the Docs Generator component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.output_format = "pdf"  # pdf, docx, txt, markdown, html
+        self.content = ""
+        self.filename = ""
+        self.header_text = ""
+        self.footer_text = ""
+        self.watermark_text = ""
+        self.add_page_numbers = True
+        self.add_timestamp = True
+        self.include_download_info_in_content = False
+        self.font_size = 12
+        self.outputs = {
+            "doc_id": {"value": "", "type": "string"},
+            "filename": {"value": "", "type": "string"},
+            "mime_type": {"value": "", "type": "string"},
+            "size": {"value": 0, "type": "number"},
+            "download": {"value": "", "type": "string"},
+        }
+
+    def check(self):
+        self.check_empty(self.content, "[DocGenerator] Content")
+        self.check_valid_value(
+            self.output_format,
+            "[DocGenerator] Output format",
+            ["pdf", "docx", "txt", "markdown", "html"],
+        )
+        self.check_positive_number(self.font_size, "[DocGenerator] Font size")
+        if self.font_size < 12:
+            raise ValueError("[DocGenerator] Font size must be greater than or equal to 12")
+
+
+class DocGenerator(Message, ABC):
+    component_name = "DocGenerator"
+    _default_output_directory = os.path.join(tempfile.gettempdir(), "doc_outputs")
+    _overlay_margin = 36
+    _overlay_font_size = 9
+    _pdf_main_font = "Noto Sans CJK SC"
+    _pdf_cjk_font = "Noto Sans CJK SC"
+    _pdf_overlay_font = "STSong-Light"
+
+    def get_input_form(self) -> dict[str, dict]:
+        return {
+            "content": {
+                "name": "Content",
+                "type": "text",
+            }
+        }
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60)))
+    def _invoke(self, **kwargs):
+        file_path = None
+        try:
+            content = self._resolve_content(kwargs)
+            output_format = self._param.output_format or "pdf"
+
+            try:
+                if output_format == "pdf":
+                    file_path, file_bytes = self._generate_pdf(content)
+                    mime_type = "application/pdf"
+                elif output_format == "docx":
+                    file_path, file_bytes = self._generate_docx(content)
+                    mime_type = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
+                elif output_format == "txt":
+                    file_path, file_bytes = self._generate_txt(content)
+                    mime_type = "text/plain"
+                elif output_format == "markdown":
+                    file_path, file_bytes = self._generate_markdown(content)
+                    mime_type = "text/markdown"
+                elif output_format == "html":
+                    file_path, file_bytes = self._generate_html(content)
+                    mime_type = "text/html"
+                else:
+                    raise Exception(f"Unsupported output format: {output_format}")
+
+                filename = os.path.basename(file_path)
+                if not file_bytes:
+                    raise Exception("Document file is empty")
+
+                file_size = len(file_bytes)
+                file_base64 = base64.b64encode(file_bytes).decode("utf-8")
+                doc_id = get_uuid()
+                settings.STORAGE_IMPL.put(self._canvas.get_tenant_id(), doc_id, file_bytes)
+
+                logging.info(
+                    "Successfully generated %s: %s (Size: %s bytes)",
+                    output_format.upper(),
+                    filename,
+                    file_size,
+                )
+
+                download_info = {
+                    "doc_id": doc_id,
+                    "filename": filename,
+                    "mime_type": mime_type,
+                    "size": file_size,
+                    "base64": file_base64,
+                    "include_download_info_in_content": self._param.include_download_info_in_content,
+                }
+                self.set_output("doc_id", doc_id)
+                self.set_output("filename", filename)
+                self.set_output("mime_type", mime_type)
+                self.set_output("size", file_size)
+                self.set_output("download", json.dumps(download_info))
+                return download_info
+
+            except Exception as e:
+                logging.exception("Error generating %s document", output_format)
+                self.set_output("_ERROR", f"Document generation failed: {str(e)}")
+                raise
+
+        except Exception as e:
+            logging.exception("Error in DocGenerator._invoke")
+            self.set_output("_ERROR", f"Document generation failed: {str(e)}")
+            raise
+        finally:
+            if file_path and os.path.exists(file_path):
+                os.remove(file_path)
+
+    def _resolve_content(self, kwargs: dict) -> str:
+        content = self._param.content or kwargs.get("content", "") or ""
+        logging.info("Starting document generation, content length: %s chars", len(content))
+
+        if content:
+            def _replace_variable(match_obj: re.Match[str]) -> str:
+                match = match_obj.group(1)
+                try:
+                    var_value = self._canvas.get_variable_value(match)
+                    if var_value is None:
+                        return ""
+                    if isinstance(var_value, partial):
+                        resolved_content = ""
+                        for chunk in var_value():
+                            resolved_content += chunk
+                        return resolved_content
+                    return self._stringify_message_value(var_value, fallback_to_str=True)
+                except Exception as e:
+                    logging.warning("Error resolving variable %s: %s", match, str(e))
+                    return f"[ERROR: {str(e)}]"
+
+            content = re.sub(
+                self.variable_ref_patt,
+                _replace_variable,
+                content,
+                flags=re.DOTALL,
+            )
+
+        return content
+
+    def _get_output_directory(self) -> str:
+        os.makedirs(self._default_output_directory, exist_ok=True)
+        return self._default_output_directory
+
+    def _build_output_filename(self, output_format: str) -> str:
+        import uuid
+
+        if self._param.filename:
+            return sanitize_filename(self._param.filename, output_format.lower())
+
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        return f"document_{timestamp}_{uuid.uuid4().hex[:8]}.{output_format}"
+
+    def _get_timestamp_text(self) -> str:
+        return f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
+
+    def _write_bytes_output(self, content: bytes, extension: str) -> tuple[str, bytes]:
+        output_directory = self._get_output_directory()
+        filename = self._build_output_filename(extension)
+        file_path = os.path.join(output_directory, filename)
+        with open(file_path, "wb") as f:
+            f.write(content)
+        return file_path, content
+
+    def _build_markdown_source(self, content: str, include_timestamp_in_body: bool = False) -> str:
+        if not (include_timestamp_in_body and self._param.add_timestamp):
+            return content
+        return f"{self._get_timestamp_text()}\n\n{content}"
+
+    def _get_heading_sizes(self) -> tuple[int, int, int]:
+        base = int(self._param.font_size)
+        return base + 6, base + 4, base + 2
+
+    def _generate_pandoc_binary_output(
+        self,
+        content: str,
+        target_format: str,
+        extension: str,
+        include_timestamp_in_body: bool = False,
+        extra_args: list[str] | None = None,
+    ) -> tuple[str, bytes]:
+        import pypandoc
+
+        output_directory = self._get_output_directory()
+        filename = self._build_output_filename(extension)
+        file_path = os.path.join(output_directory, filename)
+        markdown_content = self._build_markdown_source(
+            content,
+            include_timestamp_in_body=include_timestamp_in_body,
+        )
+
+        pypandoc.convert_text(
+            markdown_content,
+            to=target_format,
+            format="markdown",
+            outputfile=file_path,
+            extra_args=extra_args or [],
+        )
+
+        with open(file_path, "rb") as f:
+            file_bytes = f.read()
+
+        return file_path, file_bytes
+
+    def _generate_pandoc_text_output(
+        self,
+        content: str,
+        target_format: str,
+        extension: str,
+        include_timestamp_in_body: bool = True,
+    ) -> tuple[str, bytes]:
+        import pypandoc
+
+        markdown_content = self._build_markdown_source(
+            content,
+            include_timestamp_in_body=include_timestamp_in_body,
+        )
+        converted_content = pypandoc.convert_text(
+            markdown_content,
+            to=target_format,
+            format="markdown",
+        )
+        return self._write_bytes_output(converted_content.encode("utf-8"), extension)
+
+    def _select_pdf_engine(self) -> str:
+        if shutil.which("xelatex"):
+            return "xelatex"
+        raise Exception("No PDF engine found. Install xelatex.")
+
+    def _get_pdf_font_args(self) -> list[str]:
+        return [
+            "-V",
+            f"mainfont={self._pdf_main_font}",
+            "-V",
+            f"CJKmainfont={self._pdf_cjk_font}",
+        ]
+
+    def _get_pdf_overlay_font_name(self) -> str:
+        from reportlab.pdfbase import pdfmetrics
+        from reportlab.pdfbase.cidfonts import UnicodeCIDFont
+
+        try:
+            pdfmetrics.getFont(self._pdf_overlay_font)
+        except KeyError:
+            pdfmetrics.registerFont(UnicodeCIDFont(self._pdf_overlay_font))
+
+        return self._pdf_overlay_font
+
+    def _build_pdf_heading_overrides(self) -> str:
+        font_size = int(self._param.font_size)
+        leading = round(font_size * 1.2, 1)
+        h1_size, h2_size, h3_size = self._get_heading_sizes()
+        h1_leading = round(h1_size * 1.2, 1)
+        h2_leading = round(h2_size * 1.2, 1)
+        h3_leading = round(h3_size * 1.2, 1)
+
+        return rf"""
+\makeatletter
+\renewcommand\normalsize{{
+  \@setfontsize\normalsize{{{font_size}pt}}{{{leading}pt}}
+  \abovedisplayskip 12pt plus 3pt minus 7pt
+  \abovedisplayshortskip \z@ plus 3pt
+  \belowdisplayshortskip 6.5pt plus 3.5pt minus 3pt
+  \belowdisplayskip \abovedisplayskip
+  \let\@listi\@listI
+}}
+\normalsize
+\renewcommand\section{{\@startsection{{section}}{{1}}{{\z@}}{{-3.5ex \@plus -1ex \@minus -.2ex}}{{2.3ex \@plus .2ex}}{{\normalfont\fontsize{{{h1_size}pt}}{{{h1_leading}pt}}\selectfont\bfseries}}}}
+\renewcommand\subsection{{\@startsection{{subsection}}{{2}}{{\z@}}{{-3.25ex\@plus -1ex \@minus -.2ex}}{{1.5ex \@plus .2ex}}{{\normalfont\fontsize{{{h2_size}pt}}{{{h2_leading}pt}}\selectfont\bfseries}}}}
+\renewcommand\subsubsection{{\@startsection{{subsubsection}}{{3}}{{\z@}}{{-3.25ex\@plus -1ex \@minus -.2ex}}{{1.5ex \@plus .2ex}}{{\normalfont\fontsize{{{h3_size}pt}}{{{h3_leading}pt}}\selectfont\bfseries}}}}
+\makeatother
+""".strip()
+
+    def _write_temp_tex(self, content: str) -> str:
+        output_directory = self._get_output_directory()
+        with tempfile.NamedTemporaryFile(
+            mode="w",
+            encoding="utf-8",
+            suffix=".tex",
+            dir=output_directory,
+            delete=False,
+        ) as f:
+            f.write(content)
+            return f.name
+
+    def _should_apply_pdf_overlay(self) -> bool:
+        return any(
+            [
+                self._param.header_text,
+                self._param.footer_text,
+                self._param.watermark_text,
+                self._param.add_page_numbers,
+                self._param.add_timestamp,
+            ]
+        )
+
+    def _build_pdf_overlay_page(self, width: float, height: float, page_number: int):
+        if not self._should_apply_pdf_overlay():
+            return None
+
+        from pypdf import PdfReader
+        from reportlab.lib.colors import Color
+        from reportlab.pdfgen import canvas as pdf_canvas
+
+        buffer = BytesIO()
+        overlay = pdf_canvas.Canvas(buffer, pagesize=(width, height))
+        overlay_font = self._get_pdf_overlay_font_name()
+
+        if self._param.watermark_text:
+            overlay.saveState()
+            if hasattr(overlay, "setFillAlpha"):
+                overlay.setFillAlpha(0.15)
+            overlay.setFillColor(Color(0.6, 0.6, 0.6))
+            overlay.setFont(overlay_font, 48)
+            overlay.translate(width / 2, height / 2)
+            overlay.rotate(45)
+            overlay.drawCentredString(0, 0, self._param.watermark_text)
+            overlay.restoreState()
+
+        overlay.setFont(overlay_font, self._overlay_font_size)
+        overlay.setFillColor(Color(0.35, 0.35, 0.35))
+
+        if self._param.header_text:
+            overlay.drawString(
+                self._overlay_margin,
+                height - self._overlay_margin + 8,
+                self._param.header_text,
+            )
+
+        if self._param.footer_text:
+            overlay.drawString(
+                self._overlay_margin,
+                self._overlay_margin - 8,
+                self._param.footer_text,
+            )
+
+        if self._param.add_timestamp:
+            overlay.drawCentredString(
+                width / 2,
+                self._overlay_margin - 8,
+                self._get_timestamp_text(),
+            )
+
+        if self._param.add_page_numbers:
+            overlay.drawRightString(
+                width - self._overlay_margin,
+                self._overlay_margin - 8,
+                f"Page {page_number}",
+            )
+
+        overlay.save()
+        buffer.seek(0)
+        return PdfReader(buffer).pages[0]
+
+    def _apply_pdf_overlay(self, file_path: str) -> tuple[str, bytes]:
+        from pypdf import PdfReader, PdfWriter
+
+        if not self._should_apply_pdf_overlay():
+            with open(file_path, "rb") as f:
+                file_bytes = f.read()
+            return file_path, file_bytes
+
+        reader = PdfReader(file_path)
+        writer = PdfWriter()
+
+        for page_number, page in enumerate(reader.pages, start=1):
+            overlay_page = self._build_pdf_overlay_page(
+                float(page.mediabox.width),
+                float(page.mediabox.height),
+                page_number,
+            )
+            if overlay_page is not None:
+                page.merge_page(overlay_page)
+            writer.add_page(page)
+
+        temp_file = f"{file_path}.overlay"
+        with open(temp_file, "wb") as f:
+            writer.write(f)
+
+        os.replace(temp_file, file_path)
+        with open(file_path, "rb") as f:
+            file_bytes = f.read()
+        return file_path, file_bytes
+
+    def _clear_docx_container(self, container):
+        element = container._element
+        for child in list(element):
+            element.remove(child)
+
+    def _append_docx_field(self, run, instruction: str):
+        from docx.oxml import OxmlElement
+
+        begin = OxmlElement("w:fldChar")
+        begin.set("{http://schemas.openxmlformats.org/wordprocessingml/2006/main}fldCharType", "begin")
+
+        instr = OxmlElement("w:instrText")
+        instr.set("{http://www.w3.org/XML/1998/namespace}space", "preserve")
+        instr.text = instruction
+
+        end = OxmlElement("w:fldChar")
+        end.set("{http://schemas.openxmlformats.org/wordprocessingml/2006/main}fldCharType", "end")
+
+        run._r.append(begin)
+        run._r.append(instr)
+        run._r.append(end)
+
+    def _add_docx_watermark(self, section):
+        if not self._param.watermark_text:
+            return
+
+        from docx.enum.text import WD_ALIGN_PARAGRAPH
+        from docx.oxml import parse_xml
+
+        header = section.header
+        paragraph = header.add_paragraph()
+        paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER
+        run = paragraph.add_run()
+        watermark_xml = parse_xml(
+            rf"""
+            <w:pict
+              xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
+              xmlns:v="urn:schemas-microsoft-com:vml"
+              xmlns:o="urn:schemas-microsoft-com:office:office">
+              <v:shape id="PowerPlusWaterMarkObject"
+                o:spid="_x0000_s2049"
+                type="#_x0000_t136"
+                style="position:absolute;
+                  margin-left:0;
+                  margin-top:0;
+                  width:468pt;
+                  height:117pt;
+                  rotation:315;
+                  z-index:-251654144;
+                  mso-wrap-edited:f;
+                  mso-position-horizontal:center;
+                  mso-position-horizontal-relative:margin;
+                  mso-position-vertical:center;
+                  mso-position-vertical-relative:margin"
+                fillcolor="#d9d9d9"
+                stroked="f">
+                <v:fill opacity="0.18"/>
+                <v:textpath on="t" style="font-family:&quot;Calibri&quot;;font-size:1pt" string="{escape(self._param.watermark_text, {'"': '&quot;'})}"/>
+              </v:shape>
+            </w:pict>
+            """
+        )
+        run._r.append(watermark_xml)
+
+    def _normalize_docx_section_geometry(self, section, default_section):
+        for attr in ("page_width", "left_margin", "right_margin"):
+            if getattr(section, attr) is None:
+                setattr(section, attr, getattr(default_section, attr))
+
+    def _get_docx_available_width(self, section):
+        page_width = section.page_width
+        left_margin = section.left_margin
+        right_margin = section.right_margin
+
+        if page_width is None or left_margin is None or right_margin is None:
+            raise ValueError("DOCX section geometry is incomplete after normalization.")
+
+        return page_width - left_margin - right_margin
+
+    def _decorate_docx(self, file_path: str) -> tuple[str, bytes]:
+        from docx import Document
+        from docx.enum.text import WD_TAB_ALIGNMENT
+        from docx.shared import Pt
+
+        document = Document(file_path)
+        default_section = Document().sections[0]
+        h1_size, h2_size, h3_size = self._get_heading_sizes()
+
+        style_map = {
+            "Normal": int(self._param.font_size),
+            "Heading 1": h1_size,
+            "Heading 2": h2_size,
+            "Heading 3": h3_size,
+        }
+        for style_name, size in style_map.items():
+            try:
+                document.styles[style_name].font.size = Pt(size)
+            except Exception:
+                continue
+
+        for section in document.sections:
+            self._normalize_docx_section_geometry(section, default_section)
+            available_width = self._get_docx_available_width(section)
+
+            header = section.header
+            header.is_linked_to_previous = False
+            self._clear_docx_container(header)
+            if self._param.header_text:
+                paragraph = header.add_paragraph()
+                paragraph.add_run(self._param.header_text)
+
+            self._add_docx_watermark(section)
+
+            footer = section.footer
+            footer.is_linked_to_previous = False
+            self._clear_docx_container(footer)
+            if any(
+                [
+                    self._param.footer_text,
+                    self._param.add_timestamp,
+                    self._param.add_page_numbers,
+                ]
+            ):
+                paragraph = footer.add_paragraph()
+                paragraph.paragraph_format.tab_stops.add_tab_stop(
+                    int(available_width // 2),
+                    WD_TAB_ALIGNMENT.CENTER,
+                )
+                paragraph.paragraph_format.tab_stops.add_tab_stop(
+                    int(available_width),
+                    WD_TAB_ALIGNMENT.RIGHT,
+                )
+
+                if self._param.footer_text:
+                    paragraph.add_run(self._param.footer_text)
+
+                if self._param.add_timestamp or self._param.add_page_numbers:
+                    paragraph.add_run("\t")
+
+                if self._param.add_timestamp:
+                    paragraph.add_run(self._get_timestamp_text())
+
+                if self._param.add_page_numbers:
+                    paragraph.add_run("\t")
+                    self._append_docx_field(paragraph.add_run(), " PAGE ")
+
+        document.save(file_path)
+        with open(file_path, "rb") as f:
+            file_bytes = f.read()
+        return file_path, file_bytes
+
+    def thoughts(self) -> str:
+        return f"Generating {self._param.output_format.upper()} document with markdown conversion..."
+
+    def _generate_pdf(self, content: str) -> tuple[str, bytes]:
+        try:
+            engine = self._select_pdf_engine()
+            header_path = self._write_temp_tex(self._build_pdf_heading_overrides())
+            try:
+                file_path, _ = self._generate_pandoc_binary_output(
+                    content,
+                    "pdf",
+                    "pdf",
+                    include_timestamp_in_body=False,
+                    extra_args=[
+                        "--standalone",
+                        f"--pdf-engine={engine}",
+                        f"--include-in-header={header_path}",
+                        *self._get_pdf_font_args(),
+                    ],
+                )
+            finally:
+                if os.path.exists(header_path):
+                    os.remove(header_path)
+            return self._apply_pdf_overlay(file_path)
+        except Exception as e:
+            raise Exception(f"PDF generation failed: {str(e)}")
+
+    def _generate_docx(self, content: str) -> tuple[str, bytes]:
+        try:
+            file_path, _ = self._generate_pandoc_binary_output(
+                content,
+                "docx",
+                "docx",
+                include_timestamp_in_body=False,
+                extra_args=["--standalone"],
+            )
+            return self._decorate_docx(file_path)
+        except Exception as e:
+            raise Exception(f"DOCX generation failed: {str(e)}")
+
+    def _generate_txt(self, content: str) -> tuple[str, bytes]:
+        try:
+            return self._generate_pandoc_text_output(content, "plain", "txt")
+        except Exception as e:
+            raise Exception(f"TXT generation failed: {str(e)}")
+
+    def _generate_markdown(self, content: str) -> tuple[str, bytes]:
+        try:
+            return self._generate_pandoc_text_output(content, "markdown", "md")
+        except Exception as e:
+            raise Exception(f"Markdown generation failed: {str(e)}")
+
+    def _generate_html(self, content: str) -> tuple[str, bytes]:
+        try:
+            return self._generate_pandoc_text_output(content, "html", "html")
+        except Exception as e:
+            raise Exception(f"HTML generation failed: {str(e)}")
--- a/agent/component/excel_processor.py
+++ b/agent/component/excel_processor.py
@@ -0,0 +1,401 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+"""
+ExcelProcessor Component
+
+A component for reading, processing, and generating Excel files in RAGFlow agents.
+Supports multiple Excel file inputs, data transformation, and Excel output generation.
+"""
+
+import logging
+import os
+from abc import ABC
+from io import BytesIO
+
+import pandas as pd
+
+from agent.component.base import ComponentBase, ComponentParamBase
+from api.db.services.file_service import FileService
+from api.utils.api_utils import timeout
+from common import settings
+from common.misc_utils import get_uuid
+
+
+class ExcelProcessorParam(ComponentParamBase):
+    """
+    Define the ExcelProcessor component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        # Input configuration
+        self.input_files = []  # Variable references to uploaded files
+        self.operation = "read"  # read, merge, transform, output
+        
+        # Processing options
+        self.sheet_selection = "all"  # all, first, or comma-separated sheet names
+        self.merge_strategy = "concat"  # concat, join
+        self.join_on = ""  # Column name for join operations
+        
+        # Transform options (for LLM-guided transformations)
+        self.transform_instructions = ""
+        self.transform_data = ""  # Variable reference to transformation data
+        
+        # Output options
+        self.output_format = "xlsx"  # xlsx, csv
+        self.output_filename = "output"
+        
+        # Component outputs
+        self.outputs = {
+            "data": {
+                "type": "object",
+                "value": {}
+            },
+            "summary": {
+                "type": "str",
+                "value": ""
+            },
+            "markdown": {
+                "type": "str",
+                "value": ""
+            }
+        }
+    
+    def check(self):
+        self.check_valid_value(
+            self.operation, 
+            "[ExcelProcessor] Operation", 
+            ["read", "merge", "transform", "output"]
+        )
+        self.check_valid_value(
+            self.output_format,
+            "[ExcelProcessor] Output format",
+            ["xlsx", "csv"]
+        )
+        return True
+
+
+class ExcelProcessor(ComponentBase, ABC):
+    """
+    Excel processing component for RAGFlow agents.
+    
+    Operations:
+    - read: Parse Excel files into structured data
+    - merge: Combine multiple Excel files
+    - transform: Apply data transformations based on instructions
+    - output: Generate Excel file output
+    """
+    component_name = "ExcelProcessor"
+
+    def get_input_form(self) -> dict[str, dict]:
+        """Define input form for the component."""
+        res = {}
+        for ref in (self._param.input_files or []):
+            for k, o in self.get_input_elements_from_text(ref).items():
+                res[k] = {"name": o.get("name", ""), "type": "file"}
+        if self._param.transform_data:
+            for k, o in self.get_input_elements_from_text(self._param.transform_data).items():
+                res[k] = {"name": o.get("name", ""), "type": "object"}
+        return res
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
+    def _invoke(self, **kwargs):
+        if self.check_if_canceled("ExcelProcessor processing"):
+            return
+
+        operation = self._param.operation.lower()
+        
+        if operation == "read":
+            self._read_excels()
+        elif operation == "merge":
+            self._merge_excels()
+        elif operation == "transform":
+            self._transform_data()
+        elif operation == "output":
+            self._output_excel()
+        else:
+            self.set_output("summary", f"Unknown operation: {operation}")
+
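+    def _get_file_content(self, file_ref: str | dict) -> tuple[bytes | None, str | None]:
+        """
+        Get file content from a variable reference or an already-resolved file dict.
+        Returns (content_bytes, filename), or (None, None) if nothing can be resolved.
+        """
+        value = file_ref if isinstance(file_ref, dict) else self._canvas.get_variable_value(file_ref)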
+        if value is None:
+            return None, None
+            
+        # Handle different value formats
+        if isinstance(value, dict):
+            # File reference from Begin/UserFillUp component
+            file_id = value.get("id") or value.get("file_id")
+            created_by = value.get("created_by") or self._canvas.get_tenant_id()
+            filename = value.get("name") or value.get("filename", "unknown.xlsx")
+            if file_id:
+                content = FileService.get_blob(created_by, file_id)
+                return content, filename
+        elif isinstance(value, list) and len(value) > 0:
+            # List of file references - return first
+            return self._get_file_content_from_list(value[0])
+        elif isinstance(value, str):
+            # Could be base64 encoded or a path
+            if value.startswith("data:"):
+                import base64
+                # Extract base64 content
+                _, encoded = value.split(",", 1)
+                return base64.b64decode(encoded), "uploaded.xlsx"
+                
+        return None, None
+    
+    def _get_file_content_from_list(self, item) -> tuple[bytes | None, str | None]:
+        """Extract file content from a list item."""
+        if isinstance(item, dict):
+            return self._get_file_content(item)
+        return None, None
+
+    def _parse_excel_to_dataframes(self, content: bytes, filename: str) -> dict[str, pd.DataFrame]:
+        """Parse Excel content into a dictionary of DataFrames (one per sheet)."""
+        try:
+            excel_file = BytesIO(content)
+            
+            if filename.lower().endswith(".csv"):
+                df = pd.read_csv(excel_file)
+                return {"Sheet1": df}
+            else:
+                # Read all sheets
+                xlsx = pd.ExcelFile(excel_file, engine='openpyxl')
+                sheet_selection = self._param.sheet_selection
+                
+                if sheet_selection == "all":
+                    sheets_to_read = xlsx.sheet_names
+                elif sheet_selection == "first":
+                    sheets_to_read = [xlsx.sheet_names[0]] if xlsx.sheet_names else []
+                else:
+                    # Comma-separated sheet names
+                    requested = [s.strip() for s in sheet_selection.split(",")]
+                    sheets_to_read = [s for s in requested if s in xlsx.sheet_names]
+                
+                dfs = {}
+                for sheet in sheets_to_read:
+                    dfs[sheet] = pd.read_excel(xlsx, sheet_name=sheet)
+                return dfs
+                
+        except Exception as e:
+            logging.error(f"Error parsing Excel file {filename}: {e}")
+            return {}
+
+    def _read_excels(self):
+        """Read and parse Excel files into structured data."""
+        all_data = {}
+        summaries = []
+        markdown_parts = []
+        
+        for file_ref in (self._param.input_files or []):
+            if self.check_if_canceled("ExcelProcessor reading"):
+                return
+                
+            # Get variable value
+            value = self._canvas.get_variable_value(file_ref)
+            self.set_input_value(file_ref, str(value)[:200] if value else "")
+            
+            if value is None:
+                continue
+            
+            # Handle file content
+            content, filename = self._get_file_content(file_ref)
+            if content is None:
+                continue
+                
+            # Parse Excel
+            dfs = self._parse_excel_to_dataframes(content, filename)
+            
+            for sheet_name, df in dfs.items():
+                key = f"{filename}_{sheet_name}" if len(dfs) > 1 else filename
+                all_data[key] = df.to_dict(orient="records")
+                
+                # Build summary
+                summaries.append(f"**{key}**: {len(df)} rows, {len(df.columns)} columns ({', '.join(map(str, df.columns[:5]))}{'...' if len(df.columns) > 5 else ''})")
+                
+                # Build markdown table
+                markdown_parts.append(f"### {key}\n\n{df.head(10).to_markdown(index=False)}\n")
+        
+        # Set outputs
+        self.set_output("data", all_data)
+        self.set_output("summary", "\n".join(summaries) if summaries else "No Excel files found")
+        self.set_output("markdown", "\n\n".join(markdown_parts) if markdown_parts else "No data")
+
+    def _merge_excels(self):
+        """Merge multiple Excel files/sheets into one."""
+        all_dfs = []
+        
+        for file_ref in (self._param.input_files or []):
+            if self.check_if_canceled("ExcelProcessor merging"):
+                return
+                
+            value = self._canvas.get_variable_value(file_ref)
+            self.set_input_value(file_ref, str(value)[:200] if value else "")
+            
+            if value is None:
+                continue
+                
+            content, filename = self._get_file_content(file_ref)
+            if content is None:
+                continue
+                
+            dfs = self._parse_excel_to_dataframes(content, filename)
+            all_dfs.extend(dfs.values())
+        
+        if not all_dfs:
+            self.set_output("data", {})
+            self.set_output("summary", "No data to merge")
+            return
+        
+        # Merge strategy
+        if self._param.merge_strategy == "concat":
+            merged_df = pd.concat(all_dfs, ignore_index=True)
+        elif self._param.merge_strategy == "join" and self._param.join_on:
+            # Join on specified column
+            merged_df = all_dfs[0]
+            for df in all_dfs[1:]:
+                merged_df = merged_df.merge(df, on=self._param.join_on, how="outer")
+        else:
+            merged_df = pd.concat(all_dfs, ignore_index=True)
+        
+        self.set_output("data", {"merged": merged_df.to_dict(orient="records")})
+        self.set_output("summary", f"Merged {len(all_dfs)} sources into {len(merged_df)} rows, {len(merged_df.columns)} columns")
+        self.set_output("markdown", merged_df.head(20).to_markdown(index=False))
+
+    def _transform_data(self):
+        """Apply transformations to data based on instructions or input data."""
+        # Get the data to transform
+        transform_ref = self._param.transform_data
+        if not transform_ref:
+            self.set_output("summary", "No transform data reference provided")
+            return
+            
+        data = self._canvas.get_variable_value(transform_ref)
+        self.set_input_value(transform_ref, str(data)[:300] if data else "")
+        
+        if data is None:
+            self.set_output("summary", "Transform data is empty")
+            return
+        
+        # Convert to DataFrame
+        if isinstance(data, dict):
+            # Could be {"sheet": [rows]} format
+            if all(isinstance(v, list) for v in data.values()):
+                # Multiple sheets
+                all_markdown = []
+                for sheet_name, rows in data.items():
+                    df = pd.DataFrame(rows)
+                    all_markdown.append(f"### {sheet_name}\n\n{df.to_markdown(index=False)}")
+                self.set_output("data", data)
+                self.set_output("markdown", "\n\n".join(all_markdown))
+            else:
+                df = pd.DataFrame([data])
+                self.set_output("data", df.to_dict(orient="records"))
+                self.set_output("markdown", df.to_markdown(index=False))
+        elif isinstance(data, list):
+            df = pd.DataFrame(data)
+            self.set_output("data", df.to_dict(orient="records"))
+            self.set_output("markdown", df.to_markdown(index=False))
+        else:
+            self.set_output("data", {"raw": str(data)})
+            self.set_output("markdown", str(data))
+        
+        self.set_output("summary", "Transformed data ready for processing")
+
+    def _output_excel(self):
+        """Generate Excel file output from data."""
+        # Get data from transform_data reference
+        transform_ref = self._param.transform_data
+        if not transform_ref:
+            self.set_output("summary", "No data reference for output")
+            return
+            
+        data = self._canvas.get_variable_value(transform_ref)
+        self.set_input_value(transform_ref, str(data)[:300] if data else "")
+        
+        if data is None:
+            self.set_output("summary", "No data to output")
+            return
+        
+        try:
+            # Prepare DataFrames
+            if isinstance(data, dict):
+                if all(isinstance(v, list) for v in data.values()):
+                    # Multi-sheet format
+                    dfs = {k: pd.DataFrame(v) for k, v in data.items()}
+                else:
+                    dfs = {"Sheet1": pd.DataFrame([data])}
+            elif isinstance(data, list):
+                dfs = {"Sheet1": pd.DataFrame(data)}
+            else:
+                self.set_output("summary", "Invalid data format for Excel output")
+                return
+            
+            # Generate output
+            doc_id = get_uuid()
+            
+            if self._param.output_format == "csv":
+                # For CSV, only output first sheet
+                first_df = list(dfs.values())[0]
+                binary_content = first_df.to_csv(index=False).encode("utf-8")
+                filename = f"{self._param.output_filename}.csv"
+            else:
+                # Excel output
+                excel_io = BytesIO()
+                with pd.ExcelWriter(excel_io, engine='openpyxl') as writer:
+                    for sheet_name, df in dfs.items():
+                        # Sanitize sheet name: Excel forbids \ / ? * [ ] : and names over 31 chars
+                        safe_name = str(sheet_name).translate(str.maketrans("/\\*?:[]", "_" * 7))[:31]
+                        df.to_excel(writer, sheet_name=safe_name, index=False)
+                excel_io.seek(0)
+                binary_content = excel_io.read()
+                filename = f"{self._param.output_filename}.xlsx"
+            
+            # Store file
+            settings.STORAGE_IMPL.put(self._canvas.get_tenant_id(), doc_id, binary_content)
+            
+            # Set attachment output
+            self.set_output("attachment", {
+                "doc_id": doc_id,
+                "format": self._param.output_format,
+                "file_name": filename
+            })
+            
+            total_rows = sum(len(df) for df in dfs.values())
+            self.set_output("summary", f"Generated {filename} with {len(dfs)} sheet(s), {total_rows} total rows")
+            self.set_output("data", {k: v.to_dict(orient="records") for k, v in dfs.items()})
+            
+            logging.info(f"ExcelProcessor: Generated {filename} as {doc_id}")
+            
+        except Exception as e:
+            logging.error(f"ExcelProcessor output error: {e}")
+            self.set_output("summary", f"Error generating output: {str(e)}")
+
+    def thoughts(self) -> str:
+        """Return component thoughts for UI display."""
+        op = self._param.operation
+        if op == "read":
+            return "Reading Excel files..."
+        elif op == "merge":
+            return "Merging Excel data..."
+        elif op == "transform":
+            return "Transforming data..."
+        elif op == "output":
+            return "Generating Excel output..."
+        return "Processing Excel..."
--- a/agent/component/exit_loop.py
+++ b/agent/component/exit_loop.py
@@ -0,0 +1,32 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class ExitLoopParam(ComponentParamBase, ABC):
+    def check(self):
+        return True
+
+
+class ExitLoop(ComponentBase, ABC):
+    component_name = "ExitLoop"
+
+    def _invoke(self, **kwargs):
+        pass
+
+    def thoughts(self) -> str:
+        return ""
--- a/agent/component/fillup.py
+++ b/agent/component/fillup.py
@@ -18,6 +18,10 @@ import re
 from functools import partial

 from agent.component.base import ComponentParamBase, ComponentBase
+from api.db.services.file_service import FileService
+
+
+_INITIAL_USER_INPUT_CONSUMED_KEY = "sys.__initial_user_input_consumed__"


 class UserFillUpParam(ComponentParamBase):
@@ -26,6 +30,7 @@ class UserFillUpParam(ComponentParamBase):
        super().__init__()
        self.enable_tips = True
        self.tips = "Please fill up the form"
+        self.layout_recognize = ""

    def check(self) -> bool:
        return True
@@ -34,6 +39,52 @@ class UserFillUpParam(ComponentParamBase):
 class UserFillUp(ComponentBase):
    component_name = "UserFillUp"

+    def _merge_runtime_inputs(self, runtime_inputs):
+        if runtime_inputs:
+            return runtime_inputs
+
+        fields = self.get_input_elements()
+        if not fields:
+            return {}
+
+        if self._canvas.globals.get(_INITIAL_USER_INPUT_CONSUMED_KEY):
+            return {}
+
+        query = self._canvas.globals.get("sys.query")
+        if query is None or query == "":
+            return {}
+
+        if isinstance(query, dict):
+            matched = {
+                key: value if isinstance(value, dict) else {"value": value}
+                for key, value in query.items()
+                if key in fields
+            }
+            if matched:
+                self._canvas.globals[_INITIAL_USER_INPUT_CONSUMED_KEY] = True
+            return matched
+
+        if len(fields) == 1:
+            field_name = next(iter(fields))
+            self._canvas.globals[_INITIAL_USER_INPUT_CONSUMED_KEY] = True
+            return {field_name: {"value": query}}
+
+        return {}
+
+    def _resolve_input_value(self, value, layout_recognize):
+        if isinstance(value, dict) and "file" in value.get("type", "").lower():
+            if value.get("optional") and value.get("value", None) is None:
+                return None
+
+            file_value = value["value"]
+            files = file_value if isinstance(file_value, list) else [file_value]
+            return FileService.get_files(files, layout_recognize=layout_recognize)
+
+        if isinstance(value, dict):
+            return value.get("value")
+
+        return value
+
    def _invoke(self, **kwargs):
        if self.check_if_canceled("UserFillUp processing"):
            return
@@ -60,10 +111,14 @@ class UserFillUp(ComponentBase):
                content = re.sub(r"\{%s\}"%k, ans, content)

            self.set_output("tips", content)
-        for k, v in kwargs.get("inputs", {}).items():
+        layout_recognize = self._param.layout_recognize or None
+        merged_inputs = self._merge_runtime_inputs(kwargs.get("inputs", {}))
+        for k, v in merged_inputs.items():
            if self.check_if_canceled("UserFillUp processing"):
                return
-            self.set_output(k, v)
+            resolved = self._resolve_input_value(v, layout_recognize)
+            self.set_output(k, resolved)
+            self.set_input_value(k, resolved)

    def thoughts(self) -> str:
        return "Waiting for your input..."
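
Review note on the `fillup.py` change above: the fallback in `_merge_runtime_inputs` can be read in isolation as a small pure function. The sketch below is a standalone approximation, not the component's code. The function name, the `state` dict, and its `"consumed"` key are hypothetical stand-ins for `self._canvas.globals` and `_INITIAL_USER_INPUT_CONSUMED_KEY`, and `fields` stands in for the result of `get_input_elements()`.

```python
def merge_runtime_inputs(runtime_inputs, fields, query, state):
    """Sketch of the fallback: reuse the initial query once when no inputs arrive.

    `state` is a hypothetical stand-in for the canvas globals.
    """
    # Explicit runtime inputs always win.
    if runtime_inputs:
        return runtime_inputs

    # Nothing to fill, or the initial query was already used by an earlier form.
    if not fields or state.get("consumed"):
        return {}

    if query is None or query == "":
        return {}

    # A dict query is matched against the form's field names.
    if isinstance(query, dict):
        matched = {
            key: value if isinstance(value, dict) else {"value": value}
            for key, value in query.items()
            if key in fields
        }
        if matched:
            state["consumed"] = True
        return matched

    # A scalar query can only be mapped unambiguously onto a single field.
    if len(fields) == 1:
        state["consumed"] = True
        return {next(iter(fields)): {"value": query}}

    return {}


state = {}
first = merge_runtime_inputs({}, {"name": {}}, "Alice", state)
second = merge_runtime_inputs({}, {"name": {}}, "Bob", state)
```

Under these assumptions, `first` maps the single field to the query and `second` is empty, because the consumed flag stops a later form from silently reusing the same initial input.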
--- a/agent/component/invoke.py
+++ b/agent/component/invoke.py
@@ -19,17 +19,20 @@ import os
 import re
 import time
 from abc import ABC
+from functools import partial
+from urllib.parse import urlparse

 import requests

 from agent.component.base import ComponentBase, ComponentParamBase
 from common.connection_utils import timeout
+from common.ssrf_guard import assert_url_is_safe, pin_dns
 from deepdoc.parser import HtmlParser


 class InvokeParam(ComponentParamBase):
    """
-    Define the Crawler component parameters.
+    Define the Invoke component parameters.
    """

    def __init__(self):
@@ -41,7 +44,7 @@ class InvokeParam(ComponentParamBase):
        self.url = ""
        self.timeout = 60
        self.clean_html = False
-        self.datatype = "json"  # New parameter to determine data posting type
+        self.datatype = "json"

    def check(self):
        self.check_valid_value(self.method.lower(), "Type of content from the crawler", ["get", "post", "put"])
@@ -53,92 +56,250 @@ class InvokeParam(ComponentParamBase):

 class Invoke(ComponentBase, ABC):
    component_name = "Invoke"
+    header_variable_ref_patt = r"\{([a-zA-Z_][a-zA-Z0-9_.@-]*)\}"
+
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self._pinned_hostname: str | None = None
+        self._pinned_ip: str | None = None
+
+    @staticmethod
+    def _coerce_json_arg_if_possible(key, value):
+        raw_value = value
+        if isinstance(value, str):
+            try:
+                value = json.loads(value)
+                logging.debug(
+                    "Invoke JSON arg coercion succeeded. key=%s parsed_type=%s",
+                    key,
+                    type(value).__name__,
+                )
+            except json.JSONDecodeError as exc:
+                logging.info(
+                    "Invoke JSON arg coercion skipped; value is not valid JSON. key=%s raw=%r error=%s",
+                    key,
+                    raw_value,
+                    exc,
+                )
+                return raw_value
+
+        try:
+            json.dumps(value, allow_nan=False)
+        except (TypeError, ValueError) as exc:
+            logging.warning(
+                "Invoke JSON arg is not JSON-serializable. key=%s value_type=%s value=%r error=%s",
+                key,
+                type(value).__name__,
+                value,
+                exc,
+            )
+            raise ValueError(f"Invoke JSON argument '{key}' is not JSON-serializable.") from exc
+
+        return value
+
+    def get_input_form(self) -> dict[str, dict]:
+        res = {}
+        for item in self._param.variables or []:
+            if not isinstance(item, dict):
+                continue
+            ref = (item.get("ref") or "").strip()
+            if not ref or ref in res:
+                continue
+
+            elements = self.get_input_elements_from_text("{" + ref + "}")
+            element = elements.get(ref, {})
+            res[ref] = {
+                "type": "line",
+                "name": element.get("name") or item.get("key") or ref,
+            }
+        return res
+
+    def _resolve_variable_value(self, variable_name: str, kwargs: dict | None = None):
+        kwargs = kwargs or {}
+        value = kwargs.get(variable_name, self._canvas.get_variable_value(variable_name))
+        if isinstance(value, partial):
+            value = "".join(value())
+            self.set_input_value(variable_name, value)
+        return "" if value is None else value
+
+    def _render_template(self, content: str, pattern: str, kwargs: dict | None = None, *, flags: int = 0) -> str:
+        content = content or ""
+        if not content:
+            return content
+
+        def replace_variable(match_obj):
+            return str(self._resolve_variable_value(match_obj.group(1), kwargs))
+
+        return re.sub(pattern, replace_variable, content, flags=flags)
+
+    def _resolve_template_text(self, content: str, kwargs: dict | None = None) -> str:
+        return self._render_template(content, self.variable_ref_patt, kwargs, flags=re.DOTALL)
+
+    def _resolve_header_text(self, content: str, kwargs: dict | None = None) -> str:
+        # Headers support plain {token} placeholders, so they cannot reuse the canvas variable regex.
+        return self._render_template(content, self.header_variable_ref_patt, kwargs)
+
+    def _resolve_arg_value(self, para: dict, kwargs: dict) -> object:
+        ref = (para.get("ref") or "").strip()
+        if ref and (ref in kwargs or self._canvas.get_variable_value(ref) is not None):
+            return self._resolve_variable_value(ref, kwargs)
+
+        if para.get("value") is not None:
+            value = para["value"]
+            if isinstance(value, str):
+                return self._resolve_template_text(value, kwargs)
+            return value
+
+        if ref:
+            return self._resolve_variable_value(ref, kwargs)
+
+        return ""
+
+    def _is_json_mode(self) -> bool:
+        return self._param.datatype.lower() == "json"
+
+    def _build_request_args(self, kwargs: dict) -> dict:
+        args = {}
+        for para in self._param.variables:
+            key = para["key"]
+            value = self._resolve_arg_value(para, kwargs)
+            if self._is_json_mode():
+                # JSON mode accepts stringified JSON so complex payloads can be passed through variables.
+                value = self._coerce_json_arg_if_possible(key, value)
+            args[key] = value
+
+            if para.get("ref"):
+                self.set_input_value(para["ref"], value)
+        return args
+
+    def _build_url(self, kwargs: dict) -> str:
+        url = self._resolve_template_text(self._param.url.strip(), kwargs)
+        if not url.startswith(("http://", "https://")):
+            url = "http://" + url
+        hostname, ip = assert_url_is_safe(url)
+        self._pinned_hostname = hostname
+        self._pinned_ip = ip
+        return url
+
+    def _build_headers(self, kwargs: dict) -> dict:
+        if not self._param.headers:
+            return {}
+
+        headers = json.loads(self._param.headers)
+        if not isinstance(headers, dict):
+            raise ValueError("Invoke headers must be a JSON object.")
+
+        return {key: self._resolve_header_text(value, kwargs) if isinstance(value, str) else value for key, value in headers.items()}
+
+    @staticmethod
+    def _ssrf_log_target(url: str) -> str:
+        parsed = urlparse(url)
+        if not parsed.scheme or not parsed.hostname:
+            return "invalid-url"
+        return f"{parsed.scheme}://{parsed.hostname}"
+
+    def _normalize_proxy_url(self) -> str | None:
+        proxy = (self._param.proxy or "").strip()
+        if not re.sub(r"https?:?/?/?", "", proxy):
+            return None
+        if not proxy.startswith(("http://", "https://")):
+            proxy = "http://" + proxy
+        return proxy
+
+    def _build_proxies(self) -> dict | None:
+        proxy_url = self._normalize_proxy_url()
+        if not proxy_url:
+            return None
+        return {"http": proxy_url, "https": proxy_url}
+
+    def _send_request(self, url: str, args: dict, headers: dict, proxies: dict | None):
+        method = self._param.method.lower()
+        request = getattr(requests, method)
+        request_kwargs = {
+            "url": url,
+            "headers": headers,
+            "proxies": proxies,
+            "timeout": self._param.timeout,
+            "allow_redirects": False,
+        }
+
+        # GET sends query params; POST/PUT send either JSON or form data based on datatype.
+        if method == "get":
+            request_kwargs["params"] = args
+            return request(**request_kwargs)
+
+        body_key = "json" if self._is_json_mode() else "data"
+        request_kwargs[body_key] = args
+        return request(**request_kwargs)
+
+    def _format_response(self, response) -> str:
+        if not self._param.clean_html:
+            return response.text
+
+        # HtmlParser keeps the Invoke output text-focused when the endpoint returns HTML.
+        sections = HtmlParser()(None, response.content)
+        return "\n".join(sections)

    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 3)))
    def _invoke(self, **kwargs):
        if self.check_if_canceled("Invoke processing"):
            return

-        args = {}
-        for para in self._param.variables:
-            if para.get("value"):
-                args[para["key"]] = para["value"]
-            else:
-                args[para["key"]] = self._canvas.get_variable_value(para["ref"])
+        args = self._build_request_args(kwargs)
+        headers = self._build_headers(kwargs)
+        proxies = self._build_proxies()
+        proxy_hostname = proxy_ip = None

-        url = self._param.url.strip()
-
-        def replace_variable(match):
-            var_name = match.group(1)
+        if proxies:
+            proxy_url = self._normalize_proxy_url()
            try:
-                value = self._canvas.get_variable_value(var_name)
-                return str(value or "")
-            except Exception:
-                return ""
+                proxy_hostname, proxy_ip = assert_url_is_safe(proxy_url)
+            except ValueError as exc:
+                logging.warning(
+                    "Invoke SSRF guard blocked proxy=%s: %s",
+                    self._ssrf_log_target(proxy_url),
+                    exc,
+                )
+                self.set_output("_ERROR", "URL not valid")
+                return "Http request error: URL not valid"

-        # {base_url} or {component_id@variable_name}
-        url = re.sub(r"\{([a-zA-Z_][a-zA-Z0-9_.@-]*)\}", replace_variable, url)
-
-        if url.find("http") != 0:
-            url = "http://" + url
-
-        method = self._param.method.lower()
-        headers = {}
-        if self._param.headers:
-            headers = json.loads(self._param.headers)
-        proxies = None
-        if re.sub(r"https?:?/?/?", "", self._param.proxy):
-            proxies = {"http": self._param.proxy, "https": self._param.proxy}
-
-        last_e = ""
+        last_error = None
        for _ in range(self._param.max_retries + 1):
            if self.check_if_canceled("Invoke processing"):
                return

            try:
-                if method == "get":
-                    response = requests.get(url=url, params=args, headers=headers, proxies=proxies, timeout=self._param.timeout)
-                    if self._param.clean_html:
-                        sections = HtmlParser()(None, response.content)
-                        self.set_output("result", "\n".join(sections))
+                url = self._build_url(kwargs)
+                if not self._pinned_hostname or not self._pinned_ip:
+                    raise ValueError("Invoke URL was not validated before request.")
+                with pin_dns(self._pinned_hostname, self._pinned_ip):
+                    if proxy_hostname and proxy_ip:
+                        with pin_dns(proxy_hostname, proxy_ip):
+                            response = self._send_request(url, args, headers, proxies)
                    else:
-                        self.set_output("result", response.text)
-
-                if method == "put":
-                    if self._param.datatype.lower() == "json":
-                        response = requests.put(url=url, json=args, headers=headers, proxies=proxies, timeout=self._param.timeout)
-                    else:
-                        response = requests.put(url=url, data=args, headers=headers, proxies=proxies, timeout=self._param.timeout)
-                    if self._param.clean_html:
-                        sections = HtmlParser()(None, response.content)
-                        self.set_output("result", "\n".join(sections))
-                    else:
-                        self.set_output("result", response.text)
-
-                if method == "post":
-                    if self._param.datatype.lower() == "json":
-                        response = requests.post(url=url, json=args, headers=headers, proxies=proxies, timeout=self._param.timeout)
-                    else:
-                        response = requests.post(url=url, data=args, headers=headers, proxies=proxies, timeout=self._param.timeout)
-                    if self._param.clean_html:
-                        self.set_output("result", "\n".join(sections))
-                    else:
-                        self.set_output("result", response.text)
-
-                return self.output("result")
+                        response = self._send_request(url, args, headers, proxies)
+                result = self._format_response(response)
+                self.set_output("result", result)
+                return result
+            except ValueError as e:
+                logging.warning(
+                    "Invoke SSRF guard blocked url=%s: %s",
+                    self._ssrf_log_target(locals().get("url", self._param.url)),
+                    e,
+                )
+                self.set_output("_ERROR", "URL not valid")
+                return "Http request error: URL not valid"
            except Exception as e:
                if self.check_if_canceled("Invoke processing"):
                    return

-                last_e = e
+                last_error = e
                logging.exception(f"Http request error: {e}")
                time.sleep(self._param.delay_after_error)

-        if last_e:
-            self.set_output("_ERROR", str(last_e))
-            return f"Http request error: {last_e}"
-
-        assert False, self.output()
+        if last_error:
+            self.set_output("_ERROR", str(last_error))
+            return f"Http request error: {last_error}"

    def thoughts(self) -> str:
        return "Waiting for the server respond..."
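
The proxy handling and log redaction added to `invoke.py` above can be sketched in isolation. This is a minimal, standalone approximation and not the component itself: the free functions `normalize_proxy_url` and `log_target` are hypothetical names that mirror `_normalize_proxy_url` and `_ssrf_log_target`, and the SSRF check (`assert_url_is_safe`, `pin_dns`) is deliberately left out because its implementation is not shown in this diff.

```python
import re
from urllib.parse import urlparse


def normalize_proxy_url(raw):
    """Return a proxy URL with an explicit scheme, or None when it is effectively empty.

    Hypothetical standalone mirror of Invoke._normalize_proxy_url.
    """
    proxy = (raw or "").strip()
    # Values such as "", "http://" or "https:" carry no host, so treat them as unset.
    if not re.sub(r"https?:?/?/?", "", proxy):
        return None
    # A bare "host:port" is assumed to be a plain HTTP proxy.
    if not proxy.startswith(("http://", "https://")):
        proxy = "http://" + proxy
    return proxy


def log_target(url):
    """Reduce a URL to scheme://host so credentials, paths and queries stay out of logs.

    Hypothetical standalone mirror of Invoke._ssrf_log_target.
    """
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.hostname:
        return "invalid-url"
    return f"{parsed.scheme}://{parsed.hostname}"


if __name__ == "__main__":
    print(normalize_proxy_url("  proxy.local:8080 "))
    print(log_target("https://user:pw@example.com:8443/a?b=1"))
```

Because the normalized value is what gets validated, it seems reasonable to pass that same value to `requests` as well, so the checked URL and the used URL cannot drift apart.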
--- a/agent/component/iteration.py
+++ b/agent/component/iteration.py
@@ -32,6 +32,7 @@ class IterationParam(ComponentParamBase):
    def __init__(self):
        super().__init__()
        self.items_ref = ""
+        self.variable = {}

    def get_input_form(self) -> dict[str, dict]:
        return {
--- a/agent/component/iterationitem.py
+++ b/agent/component/iterationitem.py
@@ -54,7 +54,11 @@ class IterationItem(ComponentBase, ABC):
        if self.check_if_canceled("IterationItem processing"):
            return

-        self.set_output("item", arr[self._idx])
+        current_item = arr[self._idx]
+        self.set_output("item", current_item)
+        # Keep `result` as a compatibility alias because existing DSL examples
+        # and downstream references may still consume IterationItem via `@result`.
+        self.set_output("result", current_item)
        self.set_output("index", self._idx)

        self._idx += 1
@@ -69,7 +73,7 @@ class IterationItem(ComponentBase, ABC):
            if p._id != pid:
                continue

-            if p.component_name.lower() in ["categorize", "message", "switch", "userfillup", "interationitem"]:
+            if p.component_name.lower() in ["categorize", "message", "switch", "userfillup", "iterationitem"]:
                continue

            for k, o in p._param.outputs.items():
--- a/agent/component/list_operations.py
+++ b/agent/component/list_operations.py
@@ -0,0 +1,235 @@
+from abc import ABC
+import os
+from agent.component.base import ComponentBase, ComponentParamBase
+from api.utils.api_utils import timeout
+
+class ListOperationsParam(ComponentParamBase):
+    """
+    Define the List Operations component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.query = ""
+        self.operations = "nth"
+        self.n = 0
+        self.strict = False
+        self.sort_method = "asc"
+        # Comma-separated list of map keys to sort by (primary,
+        # tiebreak, ...). Empty / unset falls back to the legacy
+        # full-hashable-key behaviour (sort by the lexicographically
+        # first field). Mirrors internal/agent/component/list_operations.go
+        # parseSortByFieldList + opSort's SortBy path.
+        self.sort_by = ""
+        self.filter = {
+            "operator": "=",
+            "value": ""
+        }
+        self.outputs = {
+            "result": {
+                "value": [],
+                "type": "Array of ?"
+            },
+            "first": {
+                "value": "",
+                "type": "?"
+            },
+            "last": {
+                "value": "",
+                "type": "?"
+            }
+        }
+
+    @staticmethod
+    def _normalize_operation_name(operation):
+        op = "" if operation is None else str(operation).strip()
+        if op.lower() == "topn":
+            return "head"
+        return op or "nth"
+
+    def check(self):
+        self.check_empty(self.query, "query")
+        self.operations = self._normalize_operation_name(self.operations)
+        self.check_valid_value(
+            self.operations,
+            "Support operations",
+            ["nth", "head", "tail", "filter", "sort", "drop_duplicates"],
+        )
+
+    def get_input_form(self) -> dict[str, dict]:
+        return {}
+
+
+class ListOperations(ComponentBase, ABC):
+    component_name = "ListOperations"
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
+    def _invoke(self, **kwargs):
+        self.input_objects = []
+        inputs = getattr(self._param, "query", None)
+        self.inputs = self._canvas.get_variable_value(inputs)
+        if not isinstance(self.inputs, list):
+            raise TypeError("The input of List Operations should be an array.")
+        self.set_input_value(inputs, self.inputs)
+        if self._param.operations == "nth":
+            self._nth()
+        elif self._param.operations == "head":
+            self._head()
+        elif self._param.operations == "tail":
+            self._tail()
+        elif self._param.operations == "filter":
+            self._filter()
+        elif self._param.operations == "sort":
+            self._sort()
+        elif self._param.operations == "drop_duplicates":
+            self._drop_duplicates()
+
+
+    def _coerce_n(self):
+        try:
+            return int(getattr(self._param, "n", 0))
+        except Exception:
+            return 0
+
+    def _is_strict(self):
+        strict = getattr(self._param, "strict", False)
+        if isinstance(strict, str):
+            return strict.strip().lower() in {"1", "true", "yes", "on"}
+        return bool(strict)
+
+    def _set_outputs(self, outputs):
+        self._param.outputs["result"]["value"] = outputs
+        self._param.outputs["first"]["value"] = outputs[0] if outputs else None
+        self._param.outputs["last"]["value"] = outputs[-1] if outputs else None
+
+    def _raise_strict_range_error(self, operation, n):
+        raise ValueError(
+            f"{operation} requires n to be within the valid range in strict mode, got {n}."
+        )
+
+    def _nth(self):
+        n = self._coerce_n()
+        strict = self._is_strict()
+        if n == 0:
+            if strict:
+                self._raise_strict_range_error("nth", n)
+            outputs = []
+        elif n > 0:
+            if n <= len(self.inputs):
+                outputs = [self.inputs[n - 1]]
+            elif strict:
+                self._raise_strict_range_error("nth", n)
+            else:
+                outputs = []
+        else:
+            if abs(n) <= len(self.inputs):
+                outputs = [self.inputs[n]]
+            elif strict:
+                self._raise_strict_range_error("nth", n)
+            else:
+                outputs = []
+        self._set_outputs(outputs)
+
+    def _head(self):
+        n = self._coerce_n()
+        strict = self._is_strict()
+        if strict:
+            if 1 <= n <= len(self.inputs):
+                outputs = self.inputs[:n]
+            else:
+                self._raise_strict_range_error("head", n)
+        else:
+            if n < 1:
+                outputs = []
+            else:
+                outputs = self.inputs[:n]
+        self._set_outputs(outputs)
+
+    def _tail(self):
+        n = self._coerce_n()
+        strict = self._is_strict()
+        if strict:
+            if 1 <= n <= len(self.inputs):
+                outputs = self.inputs[-n:]
+            else:
+                self._raise_strict_range_error("tail", n)
+        else:
+            if n < 1:
+                outputs = []
+            else:
+                outputs = self.inputs[-n:]
+        self._set_outputs(outputs)
+
+    def _filter(self):
+        self._set_outputs([i for i in self.inputs if self._eval(self._norm(i), self._param.filter["operator"], self._param.filter["value"])])
+
+    def _norm(self, v):
+        s = "" if v is None else str(v)
+        return s
+
+    def _eval(self, v, operator, value):
+        if operator == "=":
+            return v == value
+        elif operator == "≠":
+            return v != value
+        elif operator == "contains":
+            return value in v
+        elif operator == "start with":
+            return v.startswith(value)
+        elif operator == "end with":
+            return v.endswith(value)
+        else:
+            return False
+
+    def _sort(self):
+        items = self.inputs or []
+        method = getattr(self._param, "sort_method", "asc") or "asc"
+        reverse = method == "desc"
+
+        if not items:
+            self._set_outputs([])
+            return
+
+        first = items[0]
+
+        if isinstance(first, dict):
+            sort_by_raw = getattr(self._param, "sort_by", "") or ""
+            sort_by = [k.strip() for k in sort_by_raw.split(",") if k.strip()]
+            if sort_by:
+                outputs = sorted(
+                    items,
+                    key=lambda x: tuple(x.get(k) for k in sort_by),
+                    reverse=reverse,
+                )
+            else:
+                outputs = sorted(
+                    items,
+                    key=lambda x: self._hashable(x),
+                    reverse=reverse,
+                )
+        else:
+            outputs = sorted(items, reverse=reverse)
+
+        self._set_outputs(outputs)
+
+    def _drop_duplicates(self):
+        seen = set()
+        outs = []
+        for item in self.inputs:
+            k = self._hashable(item)
+            if k in seen:
+                continue
+            seen.add(k)
+            outs.append(item)
+        self._set_outputs(outs)
+
+    def _hashable(self, x):
+        if isinstance(x, dict):
+            return tuple(sorted((k, self._hashable(v)) for k, v in x.items()))
+        if isinstance(x, (list, tuple)):
+            return tuple(self._hashable(v) for v in x)
+        if isinstance(x, set):
+            return tuple(sorted(self._hashable(v) for v in x))
+        return x
+
+    def thoughts(self) -> str:
+        return "ListOperations in progress"
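
The indexing and de-duplication rules in `list_operations.py` above are easier to check outside the component. The sketch below is an approximation under stated assumptions: `hashable`, `drop_duplicates` and `nth` are hypothetical free functions that follow `_hashable`, `_drop_duplicates` and `_nth`, with the canvas and parameter plumbing removed.

```python
def hashable(x):
    """Convert nested dicts, lists, tuples and sets into a hashable, order-stable key."""
    if isinstance(x, dict):
        # Sorting the items makes the key independent of insertion order.
        return tuple(sorted((k, hashable(v)) for k, v in x.items()))
    if isinstance(x, (list, tuple)):
        return tuple(hashable(v) for v in x)
    if isinstance(x, set):
        return tuple(sorted(hashable(v) for v in x))
    return x


def drop_duplicates(items):
    """Keep the first occurrence of each structurally equal item, preserving order."""
    seen = set()
    out = []
    for item in items:
        key = hashable(item)
        if key in seen:
            continue
        seen.add(key)
        out.append(item)
    return out


def nth(items, n, strict=False):
    """1-based selection; negative n counts from the end; n == 0 selects nothing."""
    in_range = n != 0 and abs(n) <= len(items)
    if not in_range:
        if strict:
            raise ValueError(f"nth requires n to be within the valid range in strict mode, got {n}.")
        return []
    # Positive n is shifted to a 0-based index; negative n already works as a Python index.
    return [items[n - 1] if n > 0 else items[n]]


if __name__ == "__main__":
    rows = [{"a": 1, "b": [1, 2]}, {"b": [1, 2], "a": 1}, {"a": 2}]
    print(drop_duplicates(rows))
    print(nth([10, 20, 30], -1))
```

One consequence of keying on plain Python values is that `1`, `1.0` and `True` collapse into a single entry, since they hash and compare equal; dict keys of mixed types may also fail to sort.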
--- a/agent/component/llm.py
+++ b/agent/component/llm.py
@@ -13,17 +13,19 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+import asyncio
 import json
 import logging
 import os
 import re
 from copy import deepcopy
-from typing import Any, Generator
+from typing import Any, AsyncGenerator
 import json_repair
 from functools import partial
 from common.constants import LLMType
+from api.db.services.dialog_service import _stream_with_think_delta
 from api.db.services.llm_service import LLMBundle
-from api.db.services.tenant_llm_service import TenantLLMService
+from api.db.joint_services.tenant_model_service import get_model_config_from_provider_instance, get_model_type_by_name
 from agent.component.base import ComponentBase, ComponentParamBase
 from common.connection_utils import timeout
 from rag.prompts.generator import tool_call_summary, message_fit_in, citation_prompt, structured_output_prompt
@@ -55,11 +57,11 @@ class LLMParam(ComponentParamBase):
        self.check_nonnegative_number(int(self.max_tokens), "[Agent] Max tokens")
        self.check_decimal_float(float(self.top_p), "[Agent] Top P")
        self.check_empty(self.llm_id, "[Agent] LLM")
-        self.check_empty(self.sys_prompt, "[Agent] System prompt")
        self.check_empty(self.prompts, "[Agent] User prompt")

    def gen_conf(self):
        conf = {}
+
        def get_attr(nm):
            try:
                return getattr(self, nm)
@@ -84,19 +86,16 @@ class LLM(ComponentBase):

    def __init__(self, canvas, component_id, param: ComponentParamBase):
        super().__init__(canvas, component_id, param)
-        self.chat_mdl = LLMBundle(self._canvas.get_tenant_id(), TenantLLMService.llm_id2llm_type(self._param.llm_id),
-                                  self._param.llm_id, max_retries=self._param.max_retries,
-                                  retry_interval=self._param.delay_after_error
-                                  )
+        model_types = get_model_type_by_name(self._canvas.get_tenant_id(), self._param.llm_id)
+        model_type = "chat" if "chat" in model_types else model_types[0]
+        chat_model_config = get_model_config_from_provider_instance(self._canvas.get_tenant_id(), model_type, self._param.llm_id)
+        self.chat_mdl = LLMBundle(self._canvas.get_tenant_id(), chat_model_config, max_retries=self._param.max_retries, retry_interval=self._param.delay_after_error)
        self.imgs = []

    def get_input_form(self) -> dict[str, dict]:
        res = {}
        for k, v in self.get_input_elements().items():
-            res[k] = {
-                "type": "line",
-                "name": v["name"]
-            }
+            res[k] = {"type": "line", "name": v["name"]}
        return res

    def get_input_elements(self) -> dict[str, Any]:
@@ -117,31 +116,190 @@ class LLM(ComponentBase):
    def _sys_prompt_and_msg(self, msg, args):
        if isinstance(self._param.prompts, str):
            self._param.prompts = [{"role": "user", "content": self._param.prompts}]
+        history_size = len(msg)
        for p in self._param.prompts:
-            if msg and msg[-1]["role"] == p["role"]:
-                continue
-            p = deepcopy(p)
-            p["content"] = self.string_format(p["content"], args)
-            msg.append(p)
+            formatted = deepcopy(p)
+            formatted["content"] = self.string_format(formatted["content"], args)
+            if len(msg) == history_size and msg and msg[-1]["role"] == formatted["role"]:
+                msg[-1] = formatted
+            else:
+                msg.append(formatted)
        return msg, self.string_format(self._param.sys_prompt, args)

-    def _prepare_prompt_variables(self):
-        if self._param.visual_files_var:
-            self.imgs = self._canvas.get_variable_value(self._param.visual_files_var)
-            if not self.imgs:
-                self.imgs = []
-            self.imgs = [img for img in self.imgs if img[:len("data:image/")] == "data:image/"]
-            if self.imgs and TenantLLMService.llm_id2llm_type(self._param.llm_id) == LLMType.CHAT.value:
-                self.chat_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.IMAGE2TEXT.value,
-                                          self._param.llm_id, max_retries=self._param.max_retries,
-                                          retry_interval=self._param.delay_after_error
-                                          )
+    @staticmethod
+    def effective_context_length(max_length) -> int:
+        return max_length or 8192

+    @classmethod
+    def context_fit_budget(cls, max_length) -> int:
+        return int(cls.effective_context_length(max_length) * 0.97)
+
+    @staticmethod
+    def validate_fitted_messages(msg_fit: list[dict]) -> str | None:
+        if len(msg_fit) < 2:
+            return "**ERROR**: message_fit_in produced insufficient messages for LLM"
+        last = msg_fit[-1]
+        if last.get("role") != "user" or not str(last.get("content") or "").strip():
+            return "**ERROR**: LLM user message is empty after prompt fitting; check model max_tokens context setting"
+        return None
+
+    @classmethod
+    def fit_messages(cls, system_prompt: str, msg: list[dict], max_length) -> tuple[list[dict], str | None]:
+        _, msg_fit = message_fit_in(
+            [{"role": "system", "content": system_prompt}, *deepcopy(msg)],
+            cls.context_fit_budget(max_length),
+        )
+        return msg_fit, cls.validate_fitted_messages(msg_fit)
+
+    @staticmethod
+    def _extract_data_images(value) -> list[str]:
+        imgs = []
+
+        def walk(v):
+            if v is None:
+                return
+            if isinstance(v, str):
+                v = v.strip()
+                if v.startswith("data:image/"):
+                    imgs.append(v)
+                return
+            if isinstance(v, (list, tuple, set)):
+                for item in v:
+                    walk(item)
+                return
+            if isinstance(v, dict):
+                if "content" in v:
+                    walk(v.get("content"))
+                else:
+                    for item in v.values():
+                        walk(item)
+
+        walk(value)
+        return imgs
+
+    @staticmethod
+    def _uniq_images(images: list[str]) -> list[str]:
+        seen = set()
+        uniq = []
+        for img in images:
+            if not isinstance(img, str):
+                continue
+            if not img.startswith("data:image/"):
+                continue
+            if img in seen:
+                continue
+            seen.add(img)
+            uniq.append(img)
+        return uniq
+
+    @classmethod
+    def _remove_data_images(cls, value):
+        if value is None:
+            return None
+
+        if isinstance(value, str):
+            return None if value.strip().startswith("data:image/") else value
+
+        if isinstance(value, list):
+            cleaned = []
+            for item in value:
+                v = cls._remove_data_images(item)
+                if v is None:
+                    continue
+                if isinstance(v, (list, tuple, set, dict)) and not v:
+                    continue
+                cleaned.append(v)
+            return cleaned
+
+        if isinstance(value, tuple):
+            cleaned = []
+            for item in value:
+                v = cls._remove_data_images(item)
+                if v is None:
+                    continue
+                if isinstance(v, (list, tuple, set, dict)) and not v:
+                    continue
+                cleaned.append(v)
+            return tuple(cleaned)
+
+        if isinstance(value, set):
+            cleaned = []
+            for item in value:
+                v = cls._remove_data_images(item)
+                if v is None:
+                    continue
+                if isinstance(v, (list, tuple, set, dict)) and not v:
+                    continue
+                cleaned.append(v)
+            return cleaned
+
+        if isinstance(value, dict):
+            if value.get("type") in {"image_url", "input_image", "image"} and cls._extract_data_images(value):
+                return None
+
+            cleaned = {}
+            for k, item in value.items():
+                v = cls._remove_data_images(item)
+                if v is None:
+                    continue
+                if isinstance(v, (list, tuple, set, dict)) and not v:
+                    continue
+                cleaned[k] = v
+            return cleaned
+
+        return value
+
+    def _collect_sys_files(self) -> tuple[list[str], list[str]]:
+        files = self._canvas.globals.get("sys.files") or []
+        if not files:
+            logging.debug("[LLM] sys.files empty; skipping attachment injection")
+            return [], []
+
+        logging.info("[LLM] sys.files present: count=%d", len(files))
+
+        explicit = "{sys.files}" in (self._param.sys_prompt or "")
+        if not explicit and isinstance(self._param.prompts, list):
+            for p in self._param.prompts:
+                if isinstance(p, dict) and "{sys.files}" in (p.get("content") or ""):
+                    explicit = True
+                    break
+        if explicit:
+            logging.info("[LLM] prompt template references {sys.files}; skipping auto-injection (explicit=%s)", explicit)
+            return [], []
+
+        text_parts: list[str] = []
+        image_data_uris: list[str] = []
+        for f in files:
+            if not isinstance(f, str):
+                logging.debug("[LLM] skipping non-str sys.files entry: type=%s", type(f).__name__)
+                continue
+            if f.startswith("data:image/"):
+                image_data_uris.append(f)
+            else:
+                text_parts.append(f)
+        logging.info(
+            "[LLM] sys.files split: text_parts=%d image_data_uris=%d (explicit=%s)",
+            len(text_parts),
+            len(image_data_uris),
+            explicit,
+        )
+        return text_parts, image_data_uris
+
+    def _prepare_prompt_variables(self):
+        self.imgs = []
+        if self._param.visual_files_var:
+            visual_val = self._canvas.get_variable_value(self._param.visual_files_var)
+            self.imgs.extend(self._extract_data_images(visual_val))

        args = {}
        vars = self.get_input_elements() if not self._param.debug_inputs else self._param.debug_inputs
+        extracted_imgs = []
        for k, o in vars.items():
-            args[k] = o["value"]
+            raw_value = o["value"]
+            extracted_imgs.extend(self._extract_data_images(raw_value))
+            args[k] = self._remove_data_images(raw_value)
+            if args[k] is None:
+                args[k] = ""
            if not isinstance(args[k], str):
                try:
                    args[k] = json.dumps(args[k], ensure_ascii=False)
@@ -149,7 +307,47 @@ class LLM(ComponentBase):
                    args[k] = str(args[k])
            self.set_input_value(k, args[k])

+        sys_file_texts, sys_file_imgs = self._collect_sys_files()
+        prev_img_count = len(self.imgs) + len(extracted_imgs)
+        self.imgs = self._uniq_images(self.imgs + extracted_imgs + sys_file_imgs)
+        logging.debug(
+            "[LLM] imgs rebuilt: total=%d sys_files_added=%d unique_dropped=%d",
+            len(self.imgs),
+            len(sys_file_imgs),
+            max(0, prev_img_count + len(sys_file_imgs) - len(self.imgs)),
+        )
+        model_types = get_model_type_by_name(self._canvas.get_tenant_id(), self._param.llm_id)
+        if self.imgs and LLMType.IMAGE2TEXT.value in model_types:
+            model_type = LLMType.IMAGE2TEXT.value
+        elif LLMType.CHAT.value in model_types:
+            model_type = LLMType.CHAT.value
+        else:
+            model_type = model_types[0]
+        model_config = get_model_config_from_provider_instance(self._canvas.get_tenant_id(), model_type, self._param.llm_id)
+        if self.imgs:
+            self.chat_mdl = LLMBundle(self._canvas.get_tenant_id(), model_config, max_retries=self._param.max_retries, retry_interval=self._param.delay_after_error)
+
        msg, sys_prompt = self._sys_prompt_and_msg(self._canvas.get_history(self._param.message_history_window_size)[:-1], args)
+
+        if sys_file_texts:
+            joined = "\n\n".join(sys_file_texts)
+            merged_idx = -1
+            for i in range(len(msg) - 1, -1, -1):
+                if msg[i].get("role") == "user":
+                    msg[i]["content"] = (msg[i].get("content") or "") + "\n\n" + joined
+                    merged_idx = i
+                    break
+            else:
+                msg.append({"role": "user", "content": joined})
+                merged_idx = len(msg) - 1
+            logging.info(
+                "[LLM] sys.files text merged into msg: parts=%d total_chars=%d msg_index=%d action=%s",
+                len(sys_file_texts),
+                len(joined),
+                merged_idx,
+                "merged_into_existing_user" if merged_idx < len(msg) - 1 or msg[merged_idx].get("content", "") != joined else "appended_new_user",
+            )
+
        user_defined_prompt, sys_prompt = self._extract_prompts(sys_prompt)
        if self._param.cite and self._canvas.get_reference()["chunks"]:
            sys_prompt += citation_prompt(user_defined_prompt)
@@ -159,54 +357,61 @@ class LLM(ComponentBase):
    def _extract_prompts(self, sys_prompt):
        pts = {}
        for tag in ["TASK_ANALYSIS", "PLAN_GENERATION", "REFLECTION", "CONTEXT_SUMMARY", "CONTEXT_RANKING", "CITATION_GUIDELINES"]:
-            r = re.search(rf"<{tag}>(.*?)</{tag}>", sys_prompt, flags=re.DOTALL|re.IGNORECASE)
+            r = re.search(rf"<{tag}>(.*?)</{tag}>", sys_prompt, flags=re.DOTALL | re.IGNORECASE)
            if not r:
                continue
            pts[tag.lower()] = r.group(1)
-            sys_prompt = re.sub(rf"<{tag}>(.*?)</{tag}>", "", sys_prompt, flags=re.DOTALL|re.IGNORECASE)
+            sys_prompt = re.sub(rf"<{tag}>(.*?)</{tag}>", "", sys_prompt, flags=re.DOTALL | re.IGNORECASE)
        return pts, sys_prompt

-    def _generate(self, msg:list[dict], **kwargs) -> str:
+    async def _generate_async(self, msg: list[dict], **kwargs) -> str:
        if not self.imgs:
-            return self.chat_mdl.chat(msg[0]["content"], msg[1:], self._param.gen_conf(), **kwargs)
-        return self.chat_mdl.chat(msg[0]["content"], msg[1:], self._param.gen_conf(), images=self.imgs, **kwargs)
+            return await self.chat_mdl.async_chat(msg[0]["content"], msg[1:], self._param.gen_conf(), **kwargs)
+        return await self.chat_mdl.async_chat(msg[0]["content"], msg[1:], self._param.gen_conf(), images=self.imgs, **kwargs)

-    def _generate_streamly(self, msg:list[dict], **kwargs) -> Generator[str, None, None]:
-        ans = ""
-        last_idx = 0
-        endswith_think = False
-        def delta(txt):
-            nonlocal ans, last_idx, endswith_think
-            delta_ans = txt[last_idx:]
-            ans = txt
+    async def _generate_streamly(self, msg: list[dict], **kwargs) -> AsyncGenerator[str, None]:
+        stream_kwargs = {"images": self.imgs} if self.imgs else {}
+        stream_kwargs.update(kwargs)
+        stream = self.chat_mdl.async_chat_streamly_delta(msg[0]["content"], msg[1:], self._param.gen_conf(), **stream_kwargs)
+        async for _, value, _ in _stream_with_think_delta(stream, min_tokens=0):
+            yield value

-            if delta_ans.find("<think>") == 0:
-                last_idx += len("<think>")
-                return "<think>"
-            elif delta_ans.find("<think>") > 0:
-                delta_ans = txt[last_idx:last_idx+delta_ans.find("<think>")]
-                last_idx += delta_ans.find("<think>")
-                return delta_ans
-            elif delta_ans.endswith("</think>"):
-                endswith_think = True
-            elif endswith_think:
-                endswith_think = False
-                return "</think>"
+    async def _stream_output_async(self, prompt, msg):
+        msg_fit, fit_error = self.fit_messages(prompt, msg, self.chat_mdl.max_length)
+        if fit_error:
+            logging.error("LLM streaming prompt fit error: %s", fit_error)
+            if self.get_exception_default_value():
+                fallback = self.get_exception_default_value()
+                self.set_output("content", fallback)
+                yield fallback
+            else:
+                self.set_output("_ERROR", fit_error)
+            return

-            last_idx = len(ans)
-            if ans.endswith("</think>"):
-                last_idx -= len("</think>")
-            return re.sub(r"(<think>|</think>)", "", delta_ans)
+        answer = ""
+        stream_kwargs = {"images": self.imgs} if self.imgs else {}
+        extra_chat_kwargs = self._get_chat_template_kwargs()
+        stream_kwargs.update(extra_chat_kwargs)
+        stream = self.chat_mdl.async_chat_streamly_delta(msg_fit[0]["content"], msg_fit[1:], self._param.gen_conf(), **stream_kwargs)
+        async for _, ans, _ in _stream_with_think_delta(stream, min_tokens=0):
+            if self.check_if_canceled("LLM streaming"):
+                return

-        if not self.imgs:
-            for txt in self.chat_mdl.chat_streamly(msg[0]["content"], msg[1:], self._param.gen_conf(), **kwargs):
-                yield delta(txt)
-        else:
-            for txt in self.chat_mdl.chat_streamly(msg[0]["content"], msg[1:], self._param.gen_conf(), images=self.imgs, **kwargs):
-                yield delta(txt)
+            if ans.find("**ERROR**") >= 0:
+                if self.get_exception_default_value():
+                    self.set_output("content", self.get_exception_default_value())
+                    yield self.get_exception_default_value()
+                else:
+                    self.set_output("_ERROR", ans)
+                return

-    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
-    def _invoke(self, **kwargs):
+            answer += ans
+            yield ans
+
+        self.set_output("content", answer)
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60)))
+    async def _invoke_async(self, **kwargs):
        if self.check_if_canceled("LLM processing"):
            return

@@ -216,23 +421,28 @@ class LLM(ComponentBase):
            return re.sub(r"```\n*$", "", ans, flags=re.DOTALL)

        prompt, msg, _ = self._prepare_prompt_variables()
+        extra_chat_kwargs = self._get_chat_template_kwargs()
        error: str = ""
-        output_structure=None
+        output_structure = None
        try:
-            output_structure = self._param.outputs['structured']
+            output_structure = self._param.outputs["structured"]
        except Exception:
            pass
-        if output_structure:
-            schema=json.dumps(output_structure, ensure_ascii=False, indent=2)
-            prompt += structured_output_prompt(schema)
-            for _ in range(self._param.max_retries+1):
+        if output_structure and isinstance(output_structure, dict) and output_structure.get("properties") and len(output_structure["properties"]) > 0:
+            schema = json.dumps(output_structure, ensure_ascii=False, indent=2)
+            prompt_with_schema = prompt + structured_output_prompt(schema)
+            for _ in range(self._param.max_retries + 1):
                if self.check_if_canceled("LLM processing"):
                    return

-                _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
+                msg_fit, fit_error = self.fit_messages(prompt_with_schema, msg, self.chat_mdl.max_length)
+                if fit_error:
+                    logging.error("LLM structured prompt fit error: %s", fit_error)
+                    self.set_output("_ERROR", fit_error)
+                    return
                error = ""
-                ans = self._generate(msg)
-                msg.pop(0)
+                ans = await self._generate_async(msg_fit, **extra_chat_kwargs)
+                msg_fit.pop(0)
                if ans.find("**ERROR**") >= 0:
                    logging.error(f"LLM response error: {ans}")
                    error = ans
@@ -241,7 +451,7 @@ class LLM(ComponentBase):
                    self.set_output("structured", json_repair.loads(clean_formated_answer(ans)))
                    return
                except Exception:
-                    msg.append({"role": "user", "content": "The answer can't not be parsed as JSON"})
+                    msg_fit.append({"role": "user", "content": "The answer can't not be parsed as JSON"})
                    error = "The answer can't not be parsed as JSON"
            if error:
                self.set_output("_ERROR", error)
@@ -249,18 +459,23 @@ class LLM(ComponentBase):

        downstreams = self._canvas.get_component(self._id)["downstream"] if self._canvas.get_component(self._id) else []
        ex = self.exception_handler()
-        if any([self._canvas.get_component_obj(cid).component_name.lower()=="message" for cid in downstreams]) and not output_structure and not (ex and ex["goto"]):
-            self.set_output("content", partial(self._stream_output, prompt, msg))
+        if any([self._canvas.get_component_obj(cid).component_name.lower() == "message" for cid in downstreams]) and not (ex and ex["goto"]):
+            self.set_output("content", partial(self._stream_output_async, prompt, deepcopy(msg)))
            return

-        for _ in range(self._param.max_retries+1):
+        error = ""
+        for _ in range(self._param.max_retries + 1):
            if self.check_if_canceled("LLM processing"):
                return

-            _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
+            msg_fit, fit_error = self.fit_messages(prompt, msg, self.chat_mdl.max_length)
+            if fit_error:
+                logging.error("LLM prompt fit error: %s", fit_error)
+                error = fit_error
+                break
            error = ""
-            ans = self._generate(msg)
-            msg.pop(0)
+            ans = await self._generate_async(msg_fit, **extra_chat_kwargs)
+            msg_fit.pop(0)
            if ans.find("**ERROR**") >= 0:
                logging.error(f"LLM response error: {ans}")
                error = ans
@@ -274,29 +489,33 @@ class LLM(ComponentBase):
            else:
                self.set_output("_ERROR", error)

-    def _stream_output(self, prompt, msg):
-        _, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
-        answer = ""
-        for ans in self._generate_streamly(msg):
-            if self.check_if_canceled("LLM streaming"):
-                return
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60)))
+    def _invoke(self, **kwargs):
+        return asyncio.run(self._invoke_async(**kwargs))

-            if ans.find("**ERROR**") >= 0:
-                if self.get_exception_default_value():
-                    self.set_output("content", self.get_exception_default_value())
-                    yield self.get_exception_default_value()
-                else:
-                    self.set_output("_ERROR", ans)
-                return
-            yield ans
-            answer += ans
-        self.set_output("content", answer)
+    def _get_chat_template_kwargs(self) -> dict[str, Any]:
+        chat_template_kwargs = self._canvas.globals.get("sys.chat_template_kwargs")
+        if chat_template_kwargs is None:
+            return {}

-    def add_memory(self, user:str, assist:str, func_name: str, params: dict, results: str, user_defined_prompt:dict={}):
-        summ = tool_call_summary(self.chat_mdl, func_name, params, results, user_defined_prompt)
+        # The API should pass this as a JSON object, but accept a JSON string for compatibility.
+        if isinstance(chat_template_kwargs, str):
+            try:
+                chat_template_kwargs = json_repair.loads(chat_template_kwargs)
+            except Exception:
+                logging.warning("Ignore invalid sys.chat_template_kwargs: expected JSON object or JSON string object.")
+                return {}
+
+        if not isinstance(chat_template_kwargs, dict):
+            logging.warning("Ignore invalid sys.chat_template_kwargs type: %s", type(chat_template_kwargs).__name__)
+            return {}
+        return {"chat_template_kwargs": chat_template_kwargs}
+
+    async def add_memory(self, user: str, assist: str, func_name: str, params: dict, results: str, user_defined_prompt: dict = {}):
+        summ = await tool_call_summary(self.chat_mdl, func_name, params, results, user_defined_prompt)
        logging.info(f"[MEMORY]: {summ}")
        self._canvas.add_memory(user, assist, summ)

    def thoughts(self) -> str:
-        _, msg,_ = self._prepare_prompt_variables()
-        return "⌛Give me a moment—starting from: \n\n" + re.sub(r"(User's query:|[\\]+)", '', msg[-1]['content'], flags=re.DOTALL) + "\n\nI’ll figure out our best next move."
+        _, msg, _ = self._prepare_prompt_variables()
+        return "⌛Give me a moment—starting from: \n\n" + re.sub(r"(User's query:|[\\]+)", "", msg[-1]["content"], flags=re.DOTALL) + "\n\nI’ll figure out our best next move."
--- a/agent/component/loop.py
+++ b/agent/component/loop.py
@@ -0,0 +1,106 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class LoopParam(ComponentParamBase):
+    """
+    Define the Loop component parameters.
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.loop_variables = []
+        self.loop_termination_condition = []
+        self.maximum_loop_count = 0
+
+    def get_input_form(self) -> dict[str, dict]:
+        return {
+            "items": {
+                "type": "json",
+                "name": "Items"
+            }
+        }
+
+    def check(self):
+        return True
+
+
+class Loop(ComponentBase, ABC):
+    component_name = "Loop"
+
+    @staticmethod
+    def _is_missing_required_field(value):
+        if value is None:
+            return True
+        if isinstance(value, str):
+            return value == ""
+        return False
+
+    @classmethod
+    def _is_incomplete_loop_variable(cls, item):
+        if any(
+            [
+                cls._is_missing_required_field(item.get("variable")),
+                cls._is_missing_required_field(item.get("input_mode")),
+                cls._is_missing_required_field(item.get("type")),
+            ]
+        ):
+            return True
+
+        input_mode = item.get("input_mode")
+        if input_mode == "variable":
+            return cls._is_missing_required_field(item.get("value"))
+        if input_mode == "constant":
+            return item.get("value") is None
+        return True
+
+    def get_start(self):
+        for cid in self._canvas.components.keys():
+            if self._canvas.get_component(cid)["obj"].component_name.lower() != "loopitem":
+                continue
+            if self._canvas.get_component(cid)["parent_id"] == self._id:
+                return cid
+
+    def _invoke(self, **kwargs):
+        if self.check_if_canceled("Loop processing"):
+            return
+
+        for item in self._param.loop_variables:
+            if self._is_incomplete_loop_variable(item):
+                raise ValueError("Loop Variable is not complete.")
+            if item["input_mode"] == "variable":
+                self.set_output(item["variable"], self._canvas.get_variable_value(item["value"]))
+            elif item["input_mode"] == "constant":
+                self.set_output(item["variable"], item["value"])
+            else:
+                if item["type"] == "number":
+                    self.set_output(item["variable"], 0)
+                elif item["type"] == "string":
+                    self.set_output(item["variable"], "")
+                elif item["type"] == "boolean":
+                    self.set_output(item["variable"], False)
+                elif item["type"].startswith("object"):
+                    self.set_output(item["variable"], {})
+                elif item["type"].startswith("array"):
+                    self.set_output(item["variable"], [])
+                else:
+                    self.set_output(item["variable"], "")
+
+
+    def thoughts(self) -> str:
+        return "Loop from canvas."
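The completeness rule that `Loop._invoke` applies to each loop variable can be exercised on its own. The following is a minimal sketch, assuming plain dicts stand in for the configured loop variables; the function names are illustrative stand-ins rather than part of the codebase.

```python
def is_missing(value):
    """Treat None and the empty string as missing; 0, False, [] and {} count as present."""
    if value is None:
        return True
    if isinstance(value, str):
        return value == ""
    return False


def is_incomplete_loop_variable(item):
    """Mirror the rule in the diff: three fields are always required, then the value is checked per input mode."""
    if any(is_missing(item.get(key)) for key in ("variable", "input_mode", "type")):
        return True

    mode = item.get("input_mode")
    if mode == "variable":
        # A variable reference must be a non-empty name.
        return is_missing(item.get("value"))
    if mode == "constant":
        # A constant may be "", 0 or False, but not None.
        return item.get("value") is None
    # Any other input mode is reported as incomplete.
    return True


ok = {"variable": "n", "input_mode": "constant", "type": "number", "value": 0}
bad = {"variable": "n", "input_mode": "variable", "type": "number", "value": ""}
print(is_incomplete_loop_variable(ok), is_incomplete_loop_variable(bad))
```

Read this way, any input mode other than `variable` or `constant` is rejected before the assignment step, so the type-based defaults in the final `else` branch of `_invoke` appear to be unreachable as the code stands.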
--- a/agent/component/loopitem.py
+++ b/agent/component/loopitem.py
@@ -0,0 +1,167 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+from agent.component.base import ComponentBase, ComponentParamBase
+
+
+class LoopItemParam(ComponentParamBase):
+    """
+    Define the LoopItem component parameters.
+    """
+    def check(self):
+        return True
+
+class LoopItem(ComponentBase, ABC):
+    component_name = "LoopItem"
+
+    def __init__(self, canvas, id, param: ComponentParamBase):
+        super().__init__(canvas, id, param)
+        self._idx = 0
+
+
+    def _invoke(self, **kwargs):
+        if self.check_if_canceled("LoopItem processing"):
+            return
+        parent = self.get_parent()
+        maximum_loop_count = parent._param.maximum_loop_count
+        if self._idx >= maximum_loop_count:
+            self._idx = -1
+            return
+        if self._idx > 0:
+            if self.check_if_canceled("LoopItem processing"):
+                return
+        self._idx += 1
+
+    def evaluate_condition(self, var, operator, value):
+        if isinstance(var, str):
+            if operator == "contains":
+                return value in var
+            elif operator == "not contains":
+                return value not in var
+            elif operator == "start with":
+                return var.startswith(value)
+            elif operator == "end with":
+                return var.endswith(value)
+            elif operator == "is":
+                return var == value
+            elif operator == "is not":
+                return var != value
+            elif operator == "empty":
+                return var == ""
+            elif operator == "not empty":
+                return var != ""
+
+        elif isinstance(var, bool):
+            if operator == "is":
+                return var is value
+            elif operator == "is not":
+                return var is not value
+            elif operator == "empty":
+                return var is None
+            elif operator == "not empty":
+                return var is not None
+
+        elif isinstance(var, (int, float)):
+            if operator == "=":
+                return var == value
+            elif operator == "≠":
+                return var != value
+            elif operator == ">":
+                return var > value
+            elif operator == "<":
+                return var < value
+            elif operator == "≥":
+                return var >= value
+            elif operator == "≤":
+                return var <= value
+            elif operator == "empty":
+                return var is None
+            elif operator == "not empty":
+                return var is not None
+
+        elif isinstance(var, dict):
+            if operator == "empty":
+                return len(var) == 0
+            elif operator == "not empty":
+                return len(var) > 0
+
+        elif isinstance(var, list):
+            if operator == "contains":
+                return value in var
+            elif operator == "not contains":
+                return value not in var
+
+            elif operator == "is":
+                return var == value
+            elif operator == "is not":
+                return var != value
+
+            elif operator == "empty":
+                return len(var) == 0
+            elif operator == "not empty":
+                return len(var) > 0
+        elif var is None:
+            if operator == "empty":
+                return True
+            return False
+
+        raise Exception(f"Invalid operator: {operator}")
+
+    def end(self):
+        if self._idx == -1:
+            return True
+        parent = self.get_parent()
+        logical_operator = parent._param.logical_operator if hasattr(parent._param, "logical_operator") else "and"
+        conditions = []
+        for item in parent._param.loop_termination_condition:
+            if not item.get("variable") or not item.get("operator"):
+                raise ValueError("Loop condition is incomplete.")
+            var = self._canvas.get_variable_value(item["variable"])
+            operator = item["operator"]
+            input_mode = item.get("input_mode", "constant")
+
+            if input_mode == "variable":
+                value = self._canvas.get_variable_value(item.get("value", ""))
+            elif input_mode == "constant":
+                value = item.get("value", "")
+            else:
+                raise ValueError("Invalid input mode.")
+            conditions.append(self.evaluate_condition(var, operator, value))
+        should_end = (
+            all(conditions) if logical_operator == "and"
+            else any(conditions) if logical_operator == "or"
+            else None
+        )
+        if should_end is None:
+            raise ValueError("Invalid logical operator, should be 'and' or 'or'.")
+
+        if should_end:
+            self._idx = -1
+            return True
+
+        return False
+
+    def next(self):
+        if self._idx == -1:
+            self._idx = 0
+        else:
+            self._idx += 1
+            if self._idx >= len(self._items):
+                self._idx = -1
+        return False
+
+    def thoughts(self) -> str:
+        return "Next turn..."
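The way `LoopItem.end` combines the individual condition results can be shown in isolation. This is a minimal sketch, assuming the conditions have already been evaluated to booleans; `should_end` is a hypothetical helper name, not a function in the codebase.

```python
def should_end(conditions, logical_operator="and"):
    """Combine per-condition results the way the diff does: all() for 'and', any() for 'or'."""
    if logical_operator == "and":
        return all(conditions)
    if logical_operator == "or":
        return any(conditions)
    raise ValueError("Invalid logical operator, should be 'and' or 'or'.")


print(should_end([True, False], "and"))  # both conditions must hold
print(should_end([True, False], "or"))   # one condition is enough
print(should_end([], "and"))             # all() of an empty list is True
```

One consequence worth checking against the intended behavior: with the default `and` operator and an empty `loop_termination_condition` list, `all([])` evaluates to `True`, so the loop would be marked as finished the first time `end()` is called.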
--- a/agent/component/message.py
+++ b/agent/component/message.py
@@ -13,17 +13,32 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+import asyncio
+try:
+    import nest_asyncio
+    nest_asyncio.apply()
+except Exception:
+    pass
+import inspect
 import json
 import os
 import random
 import re
+import logging
+import tempfile
 from functools import partial
 from typing import Any

 from agent.component.base import ComponentBase, ComponentParamBase
-from jinja2 import Template as Jinja2Template
+from jinja2.sandbox import SandboxedEnvironment
+
+_jinja2_sandbox = SandboxedEnvironment()

 from common.connection_utils import timeout
+from common.misc_utils import get_uuid
+from common import settings
+
+from api.db.joint_services.memory_message_service import queue_save_to_memory_task


 class MessageParam(ComponentParamBase):
@@ -34,9 +49,14 @@ class MessageParam(ComponentParamBase):
        super().__init__()
        self.content = []
        self.stream = True
+        self.output_format = None  # default output format
+        self.auto_play = False
        self.outputs = {
            "content": {
                "type": "str"
+            },
+            "downloads": {
+                "type": "list"
            }
        }

@@ -49,29 +69,110 @@ class MessageParam(ComponentParamBase):
 class Message(ComponentBase):
    component_name = "Message"

+    @staticmethod
+    def _is_download_info(value: Any) -> bool:
+        return isinstance(value, dict) and all(
+            key in value for key in ("doc_id", "filename", "mime_type")
+        )
+
+    @staticmethod
+    def _download_info_includes_content(value: Any) -> bool:
+        return isinstance(value, dict) and bool(value.get("include_download_info_in_content"))
+
+    @staticmethod
+    def _normalize_download_info(value: Any) -> Any:
+        if isinstance(value, list):
+            return [Message._normalize_download_info(item) for item in value]
+
+        if not isinstance(value, dict):
+            return value
+
+        normalized = value.copy()
+        normalized.pop("include_download_info_in_content", None)
+        return normalized
+
+    def _extract_downloads(self, value: Any) -> list[dict[str, Any]]:
+        if isinstance(value, str):
+            try:
+                value = json.loads(value)
+            except Exception:
+                return []
+
+        if self._is_download_info(value):
+            return [value]
+
+        if isinstance(value, list) and all(self._is_download_info(item) for item in value):
+            return value
+
+        return []
+
+    def _stringify_message_value(
+        self,
+        value: Any,
+        delimiter: str = None,
+        downloads: list[dict[str, Any]] | None = None,
+        fallback_to_str: bool = False,
+    ) -> str:
+        extracted_downloads = self._extract_downloads(value)
+        if extracted_downloads:
+            if downloads is not None:
+                downloads.extend(self._normalize_download_info(item) for item in extracted_downloads)
+            if any(self._download_info_includes_content(item) for item in extracted_downloads):
+                if isinstance(value, str):
+                    try:
+                        value = json.loads(value)
+                    except Exception:
+                        return value
+                try:
+                    return json.dumps(self._normalize_download_info(value), ensure_ascii=False)
+                except Exception:
+                    if fallback_to_str:
+                        return str(value)
+                    return ""
+            return ""
+
+        if value is None:
+            return ""
+
+        if isinstance(value, list) and delimiter:
+            return delimiter.join([str(vv) for vv in value])
+
+        if isinstance(value, str):
+            return value
+
+        try:
+            return json.dumps(value, ensure_ascii=False)
+        except Exception:
+            if fallback_to_str:
+                return str(value)
+            return ""
+
    def get_input_elements(self) -> dict[str, Any]:
        return self.get_input_elements_from_text("".join(self._param.content))

-    def get_kwargs(self, script:str, kwargs:dict = {}, delimiter:str=None) -> tuple[str, dict[str, str | list | Any]]:
+    def get_kwargs(
+        self,
+        script: str,
+        kwargs: dict = {},
+        delimiter: str = None,
+        downloads: list[dict[str, Any]] | None = None,
+    ) -> tuple[str, dict[str, str | list | Any]]:
        for k,v in self.get_input_elements_from_text(script).items():
            if k in kwargs:
                continue
            v = v["value"]
-            if not v:
+            if v is None:
                v = ""
            ans = ""
            if isinstance(v, partial):
-                for t in v():
-                    ans += t
-            elif isinstance(v, list) and delimiter:
-                ans = delimiter.join([str(vv) for vv in v])
-            elif not isinstance(v, str):
-                try:
-                    ans = json.dumps(v, ensure_ascii=False)
-                except Exception:
-                    pass
+                iter_obj = v()
+                if inspect.isasyncgen(iter_obj):
+                    ans = asyncio.run(self._consume_async_gen(iter_obj))
+                else:
+                    for t in iter_obj:
+                        ans += t
            else:
-                ans = v
+                ans = self._stringify_message_value(v, delimiter, downloads)
            if not ans:
                ans = ""
            kwargs[k] = ans
@@ -84,10 +185,17 @@ class Message(ComponentBase):
            _kwargs[_n] = v
        return script, _kwargs

-    def _stream(self, rand_cnt:str):
+    async def _consume_async_gen(self, agen):
+        buf = ""
+        async for t in agen:
+            buf += t
+        return buf
+
+    async def _stream(self, rand_cnt:str):
        s = 0
        all_content = ""
        cache = {}
+        downloads = []
        for r in re.finditer(self.variable_ref_patt, rand_cnt, flags=re.DOTALL):
            if self.check_if_canceled("Message streaming"):
                return
@@ -106,20 +214,30 @@ class Message(ComponentBase):
                v = ""
            if isinstance(v, partial):
                cnt = ""
-                for t in v():
-                    if self.check_if_canceled("Message streaming"):
-                        return
+                iter_obj = v()
+                if inspect.isasyncgen(iter_obj):
+                    async for t in iter_obj:
+                        if self.check_if_canceled("Message streaming"):
+                            return

-                    all_content += t
-                    cnt += t
-                    yield t
+                        all_content += t
+                        cnt += t
+                        yield t
+                else:
+                    for t in iter_obj:
+                        if self.check_if_canceled("Message streaming"):
+                            return
+
+                        all_content += t
+                        cnt += t
+                        yield t
                self.set_input_value(exp, cnt)
                continue
-            elif not isinstance(v, str):
-                try:
-                    v = json.dumps(v, ensure_ascii=False)
-                except Exception:
-                    v = str(v)
+            elif inspect.isawaitable(v):
+                v = await v
+            v = self._stringify_message_value(
+                v, downloads=downloads, fallback_to_str=True
+            )
            yield v
            self.set_input_value(exp, v)
            all_content += v
@@ -132,7 +250,10 @@ class Message(ComponentBase):
            all_content += rand_cnt[s: ]
            yield rand_cnt[s: ]

+        self.set_output("downloads", downloads)
        self.set_output("content", all_content)
+        self._convert_content(all_content)
+        await self._save_to_memory(all_content)

    def _is_jinjia2(self, content:str) -> bool:
        patt = [
@@ -150,20 +271,304 @@ class Message(ComponentBase):
            self.set_output("content", partial(self._stream, rand_cnt))
            return

-        rand_cnt, kwargs = self.get_kwargs(rand_cnt, kwargs)
-        template = Jinja2Template(rand_cnt)
+        downloads = []
+        rand_cnt, kwargs = self.get_kwargs(rand_cnt, kwargs, downloads=downloads)
+        template = _jinja2_sandbox.from_string(rand_cnt)
        try:
            content = template.render(kwargs)
-        except Exception:
-            pass
+        except Exception as e:
+            logging.warning(f"Jinja2 template rendering failed: {e}")
+            content = rand_cnt  # fallback to unrendered content

        if self.check_if_canceled("Message processing"):
            return

        for n, v in kwargs.items():
-            content = re.sub(n, v, content)
+            if v is not None:
+                content = re.sub(n, str(v), content)

+        self.set_output("downloads", downloads)
        self.set_output("content", content)
+        self._convert_content(content)
+        try:
+            loop = asyncio.get_running_loop()
+        except RuntimeError:
+            asyncio.run(self._save_to_memory(content))
+        else:
+            asyncio.run_coroutine_threadsafe(self._save_to_memory(content), loop)

    def thoughts(self) -> str:
        return ""
+
+    def _parse_markdown_table_lines(self, table_lines: list):
+        """
+        Parse a list of Markdown table lines into a pandas DataFrame.
+        
+        Args:
+            table_lines: List of strings, each representing a row in the Markdown table
+                        (excluding separator lines like |---|---|)
+        
+        Returns:
+            pandas DataFrame with the table data, or None if parsing fails
+        """
+        import pandas as pd
+        
+        if not table_lines:
+            return None
+        
+        rows = []
+        headers = None
+
+        def _coerce_excel_cell_type(cell: str):
+            # Convert markdown cell text to native numeric types when safe, so Excel writes numeric cells instead of text.
+            if not isinstance(cell, str):
+                return cell
+
+            value = cell.strip()
+            if value == "":
+                return ""
+
+            # Keep values like "00123" as text to avoid losing leading zeros.
+            if re.match(r"^[+-]?0\d+$", value):
+                return cell
+
+            # Support thousand separators like 1,234 or 1,234.56
+            numeric_candidate = value
+            if re.match(r"^[+-]?\d{1,3}(,\d{3})+(\.\d+)?$", value):
+                numeric_candidate = value.replace(",", "")
+
+            if re.match(r"^[+-]?\d+$", numeric_candidate):
+                try:
+                    return int(numeric_candidate)
+                except ValueError:
+                    return cell
+
+            if re.match(r"^[+-]?(\d+\.\d+|\d+\.|\.\d+)([eE][+-]?\d+)?$", numeric_candidate) or re.match(r"^[+-]?\d+[eE][+-]?\d+$", numeric_candidate):
+                try:
+                    return float(numeric_candidate)
+                except ValueError:
+                    return cell
+
+            return cell
+        
+        for line in table_lines:
+            # Split by | and clean up
+            cells = [cell.strip() for cell in line.split('|')]
+            # Remove empty first and last elements from split (caused by leading/trailing |)
+            cells = [c for c in cells if c]
+            
+            if headers is None:
+                headers = cells
+            else:
+                cells = [_coerce_excel_cell_type(c) for c in cells]
+                rows.append(cells)
+        
+        if headers and rows:
+            # Ensure all rows have same number of columns as headers
+            normalized_rows = []
+            for row in rows:
+                while len(row) < len(headers):
+                    row.append('')
+                normalized_rows.append(row[:len(headers)])
+            
+            return pd.DataFrame(normalized_rows, columns=headers)
+        
+        return None
+
+    def _convert_content(self, content):
+        if not self._param.output_format:
+            return
+
+        import pypandoc
+        doc_id = get_uuid()
+
+        if self._param.output_format.lower() not in {"markdown", "html", "pdf", "docx", "xlsx"}:
+            self._param.output_format = "markdown"
+
+        try:
+            if self._param.output_format in {"markdown", "html"}:
+                if isinstance(content, str):
+                    converted = pypandoc.convert_text(
+                        content,
+                        to=self._param.output_format,
+                        format="markdown",
+                    )
+                else:
+                    converted = pypandoc.convert_file(
+                        content,
+                        to=self._param.output_format,
+                        format="markdown",
+                    )
+
+                binary_content = converted.encode("utf-8")
+
+            elif self._param.output_format == "xlsx":
+                import pandas as pd
+                from io import BytesIO
+
+                # Debug: log the content being parsed
+                logging.info(f"XLSX Parser: Content length={len(content) if content else 0}, first 500 chars: {content[:500] if content else 'None'}")
+                
+                # Try to parse ALL Markdown tables from the content
+                # Each table will be written to a separate sheet
+                tables = []  # List of (sheet_name, dataframe)
+                
+                if isinstance(content, str):
+                    lines = content.strip().split('\n')
+                    logging.info(f"XLSX Parser: Total lines={len(lines)}, lines starting with '|': {sum(1 for line in lines if line.strip().startswith('|'))}")
+                    current_table_lines = []
+                    current_table_title = None
+                    pending_title = None
+                    in_table = False
+                    table_count = 0
+                    
+                    for i, line in enumerate(lines):
+                        stripped = line.strip()
+                        
+                        # Check for potential table title (lines before a table)
+                        # Look for patterns like "Table 1:", "## Table", or markdown headers
+                        if not in_table and stripped and not stripped.startswith('|'):
+                            # Check if this could be a table title
+                            lower_stripped = stripped.lower()
+                            if (lower_stripped.startswith('table') or 
+                                stripped.startswith('#') or
+                                ':' in stripped):
+                                pending_title = stripped.lstrip('#').strip()
+                        
+                        if stripped.startswith('|') and '|' in stripped[1:]:
+                            # Check if this is a separator line (|---|---|)
+                            cleaned = stripped.replace(' ', '').replace('|', '').replace('-', '').replace(':', '')
+                            if cleaned == '':
+                                continue  # Skip separator line
+                            
+                            if not in_table:
+                                # Starting a new table
+                                in_table = True
+                                current_table_lines = []
+                                current_table_title = pending_title
+                                pending_title = None
+                            
+                            current_table_lines.append(stripped)
+                        
+                        elif in_table and not stripped.startswith('|'):
+                            # End of current table - save it
+                            if current_table_lines:
+                                df = self._parse_markdown_table_lines(current_table_lines)
+                                if df is not None and not df.empty:
+                                    table_count += 1
+                                    # Generate sheet name
+                                    if current_table_title:
+                                        # Clean and truncate title for sheet name
+                                        sheet_name = current_table_title[:31]
+                                        sheet_name = sheet_name.replace('/', '_').replace('\\', '_').replace('*', '').replace('?', '').replace('[', '').replace(']', '').replace(':', '').strip() or f"Table_{table_count}"
+                                    else:
+                                        sheet_name = f"Table_{table_count}"
+                                    tables.append((sheet_name, df))
+                            
+                            # Reset for next table
+                            in_table = False
+                            current_table_lines = []
+                            current_table_title = None
+                            
+                            # Check if this line could be a title for the next table
+                            if stripped:
+                                lower_stripped = stripped.lower()
+                                if (lower_stripped.startswith('table') or 
+                                    stripped.startswith('#') or
+                                    ':' in stripped):
+                                    pending_title = stripped.lstrip('#').strip()
+                    
+                    # Don't forget the last table if content ends with a table
+                    if in_table and current_table_lines:
+                        df = self._parse_markdown_table_lines(current_table_lines)
+                        if df is not None and not df.empty:
+                            table_count += 1
+                            if current_table_title:
+                                sheet_name = current_table_title[:31]
+                                sheet_name = sheet_name.replace('/', '_').replace('\\', '_').replace('*', '').replace('?', '').replace('[', '').replace(']', '').replace(':', '').strip() or f"Table_{table_count}"
+                            else:
+                                sheet_name = f"Table_{table_count}"
+                            tables.append((sheet_name, df))
+                
+                # Fallback: if no tables found, create single sheet with content
+                if not tables:
+                    df = pd.DataFrame({"Content": [content if content else ""]})
+                    tables = [("Data", df)]
+
+                # Write all tables to Excel, each in a separate sheet
+                excel_io = BytesIO()
+                with pd.ExcelWriter(excel_io, engine='openpyxl') as writer:
+                    used_names = set()
+                    for sheet_name, df in tables:
+                        # Ensure unique sheet names
+                        original_name = sheet_name
+                        counter = 1
+                        while sheet_name in used_names:
+                            suffix = f"_{counter}"
+                            sheet_name = original_name[:31-len(suffix)] + suffix
+                            counter += 1
+                        used_names.add(sheet_name)
+                        df.to_excel(writer, sheet_name=sheet_name, index=False)
+                
+                excel_io.seek(0)
+                binary_content = excel_io.read()
+                
+                logging.info(f"Generated Excel with {len(tables)} sheet(s): {[t[0] for t in tables]}")
+
+            else:  # pdf, docx
+                with tempfile.NamedTemporaryFile(suffix=f".{self._param.output_format}", delete=False) as tmp:
+                    tmp_name = tmp.name
+
+                try:
+                    if isinstance(content, str):
+                        pypandoc.convert_text(
+                            content,
+                            to=self._param.output_format,
+                            format="markdown",
+                            outputfile=tmp_name,
+                        )
+                    else:
+                        pypandoc.convert_file(
+                            content,
+                            to=self._param.output_format,
+                            format="markdown",
+                            outputfile=tmp_name,
+                        )
+
+                    with open(tmp_name, "rb") as f:
+                        binary_content = f.read()
+
+                finally:
+                    if os.path.exists(tmp_name):
+                        os.remove(tmp_name)
+
+            settings.STORAGE_IMPL.put(self._canvas._tenant_id, doc_id, binary_content)
+            self.set_output("attachment", {
+                "doc_id":doc_id,
+                "format":self._param.output_format,
+                "file_name":f"{doc_id[:8]}.{self._param.output_format}"})
+
+            logging.info(f"Converted content uploaded as {doc_id} (format={self._param.output_format})")
+
+        except Exception as e:
+            logging.error(f"Error converting content to {self._param.output_format}: {e}")
+
+    async def _save_to_memory(self, content):
+        if not hasattr(self._param, "memory_ids") or not self._param.memory_ids:
+            return True, "No memory selected."
+
+        user_id = self._param.user_id if hasattr(self._param, "user_id") else ""
+        if user_id:
+            import re
+            # A value wrapped in braces is a variable reference; resolve it through the canvas.
+            if re.match(r"^{.*}$", user_id):
+                user_id = self._canvas.get_variable_value(user_id)
+
+        message_dict = {
+            "user_id": user_id,
+            "agent_id": self._canvas._id,
+            "session_id": self._canvas.task_id,
+            "user_input": self._canvas.get_sys_query(),
+            "agent_response": content
+        }
+        return await queue_save_to_memory_task(self._param.memory_ids, message_dict)
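Review note: the sheet-naming rules used by the xlsx branch of `_convert_content` (a 31-character cap, removal of characters Excel rejects, and numeric suffixes for duplicates) can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the code from this change: `sanitize_sheet_name` and `unique_sheet_name` are hypothetical helpers, the sketch removes `/` and `\` instead of replacing them with `_`, and the `Table_N` fallback for an empty result is an assumption.

```python
# Hypothetical helpers illustrating the sheet-name handling; not part of the diff.
FORBIDDEN = set('/\\*?[]:')  # characters removed from titles in this sketch
MAX_LEN = 31                 # worksheet name length cap used by the diff


def sanitize_sheet_name(title: str, index: int) -> str:
    """Strip forbidden characters, truncate, and fall back to Table_<index>."""
    cleaned = "".join(ch for ch in title if ch not in FORBIDDEN).strip()
    return cleaned[:MAX_LEN] or f"Table_{index}"


def unique_sheet_name(name: str, used: set[str]) -> str:
    """Append _1, _2, ... until the name is unused, staying within MAX_LEN."""
    candidate = name
    counter = 1
    while candidate in used:
        suffix = f"_{counter}"
        candidate = name[:MAX_LEN - len(suffix)] + suffix
        counter += 1
    used.add(candidate)
    return candidate


used: set[str] = set()
first = unique_sheet_name(sanitize_sheet_name("Table 1: Sales/Q1", 1), used)
second = unique_sheet_name(sanitize_sheet_name("Table 1: Sales/Q1", 2), used)
fallback = sanitize_sheet_name("::", 3)  # nothing survives cleaning
```

Cleaning before truncating keeps as much of the title as the cap allows, and the fallback avoids handing the writer an empty sheet name when a title consists only of removed characters.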
--- a/agent/component/pipeline_chunker.py
+++ b/agent/component/pipeline_chunker.py
@@ -0,0 +1,194 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+"""
+PipelineChunker Component
+
+Run RAGFlow Pipeline-style chunkers (rag.app.*) against uploaded files inside an
+Agent workflow. Emits plain text chunks for downstream Agent nodes — no
+embedding, no persistence. Wraps existing chunker functions; does not
+re-implement chunking logic.
+"""
+
+import importlib
+import logging
+import os
+from abc import ABC
+
+from agent.component.base import ComponentBase, ComponentParamBase
+from api.db.services.file_service import FileService
+from common.connection_utils import timeout
+
+
+# Parser id -> dotted module path under rag.app. Imported lazily so we don't
+# pull deepdoc/OCR/VLM machinery at component-discovery time.
+_PARSER_MODULES: dict[str, str] = {
+    "general": "rag.app.naive",
+    "naive": "rag.app.naive",
+    "paper": "rag.app.paper",
+    "book": "rag.app.book",
+    "presentation": "rag.app.presentation",
+    "manual": "rag.app.manual",
+    "laws": "rag.app.laws",
+    "qa": "rag.app.qa",
+    "table": "rag.app.table",
+    "resume": "rag.app.resume",
+    "picture": "rag.app.picture",
+    "one": "rag.app.one",
+    "audio": "rag.app.audio",
+    "email": "rag.app.email",
+    "tag": "rag.app.tag",
+}
+
+
+def _load_chunker(parser_id: str):
+    """Resolve a parser id to the underlying ``rag.app.<module>.chunk`` callable."""
+    module_path = _PARSER_MODULES[parser_id.lower()]
+    return importlib.import_module(module_path).chunk
+
+
+class PipelineChunkerParam(ComponentParamBase):
+    """
+    Define the PipelineChunker component parameters.
+    """
+
+    def __init__(self):
+        """Initialise PipelineChunker defaults and declare component outputs."""
+        super().__init__()
+        self.inputs = []  # variable references to uploaded files
+        self.parser_id = "naive"
+        self.lang = "English"
+        self.from_page = 0
+        self.to_page = 100000000
+        self.parser_config = {}
+
+        self.outputs = {
+            "chunks": {"type": "list", "value": []},
+            "chunks_full": {"type": "list", "value": []},
+            "summary": {"type": "str", "value": ""},
+        }
+
+    def check(self):
+        """Validate parser id, page range, and parser_config shape."""
+        self.check_valid_value(
+            self.parser_id.lower(),
+            "[PipelineChunker] parser_id",
+            list(_PARSER_MODULES.keys()),
+        )
+        self.check_nonnegative_number(self.from_page, "[PipelineChunker] from_page")
+        self.check_nonnegative_number(self.to_page, "[PipelineChunker] to_page")
+        if isinstance(self.from_page, (int, float)) and isinstance(self.to_page, (int, float)) and self.from_page > self.to_page:
+            raise ValueError("[PipelineChunker] from_page must be <= to_page")
+        if not isinstance(self.parser_config, dict):
+            raise ValueError("[PipelineChunker] parser_config must be a dict.")
+        return True
+
+
+class PipelineChunker(ComponentBase, ABC):
+    """
+    Run a Pipeline-style chunker (naive, paper, qa, manual, book, ...) against
+    one or more uploaded files and surface the resulting chunks to downstream
+    Agent nodes.
+    """
+
+    component_name = "PipelineChunker"
+
+    def get_input_form(self) -> dict[str, dict]:
+        """Expose each referenced file input as a file-typed form element."""
+        res = {}
+        for ref in self._param.inputs or []:
+            for k, o in self.get_input_elements_from_text(ref).items():
+                res[k] = {"name": o.get("name", ""), "type": "file"}
+        return res
+
+    def _get_file_content(self, file_ref: str) -> tuple[bytes | None, str | None]:
+        """Resolve a canvas variable reference to ``(content_bytes, filename)``."""
+        value = self._canvas.get_variable_value(file_ref)
+        if value is None:
+            return None, None
+
+        if isinstance(value, list) and value:
+            value = value[0]
+
+        if isinstance(value, dict):
+            file_id = value.get("id") or value.get("file_id")
+            created_by = value.get("created_by") or self._canvas.get_tenant_id()
+            filename = value.get("name") or value.get("filename") or "uploaded"
+            if file_id:
+                try:
+                    return FileService.get_blob(created_by, file_id), filename
+                except Exception as e:
+                    logging.exception(
+                        f"[PipelineChunker] FileService.get_blob failed for "
+                        f"file_id={file_id} created_by={created_by} filename={filename}: {e}"
+                    )
+                    return None, None
+        return None, None
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60)))
+    def _invoke(self, **kwargs):
+        """Run the configured chunker over every referenced file and publish outputs."""
+        if self.check_if_canceled("PipelineChunker processing"):
+            return
+
+        chunker = _load_chunker(self._param.parser_id)
+        tenant_id = self._canvas.get_tenant_id()
+        chunk_kwargs = dict(
+            lang=self._param.lang,
+            tenant_id=tenant_id,
+            from_page=self._param.from_page,
+            to_page=self._param.to_page,
+            parser_config=self._param.parser_config or {},
+            callback=lambda prog=0, msg="": logging.info(f"[PipelineChunker] {prog}: {msg}"),
+        )
+
+        all_chunks: list[dict] = []
+        per_file_counts: list[str] = []
+
+        for file_ref in self._param.inputs or []:
+            if self.check_if_canceled("PipelineChunker processing"):
+                return
+
+            content, filename = self._get_file_content(file_ref)
+            self.set_input_value(file_ref, filename or "")
+            if content is None:
+                logging.warning(f"[PipelineChunker] could not resolve file ref: {file_ref}")
+                per_file_counts.append(f"{filename or file_ref}: error (could not resolve file)")
+                continue
+
+            try:
+                file_chunks = chunker(filename, binary=content, **chunk_kwargs) or []
+            except Exception as e:
+                logging.exception(f"[PipelineChunker] chunking failed for {filename}: {e}")
+                per_file_counts.append(f"{filename}: error (chunking failed)")
+                continue
+
+            all_chunks.extend(file_chunks)
+            per_file_counts.append(f"{filename}: {len(file_chunks)} chunks")
+
+        text_only = [(c.get("content_with_weight") or c.get("text") or "") for c in all_chunks if isinstance(c, dict)]
+        text_only = [t for t in text_only if t]
+
+        self.set_output("chunks", text_only)
+        self.set_output("chunks_full", all_chunks)
+        self.set_output(
+            "summary",
+            f"Parser: {self._param.parser_id} | Files: {len(self._param.inputs or [])} | Chunks: {len(text_only)}" + (" | " + "; ".join(per_file_counts) if per_file_counts else ""),
+        )
+
+    def thoughts(self) -> str:
+        """Return a short status line for UI display."""
+        return f"Chunking with `{self._param.parser_id}` strategy..."
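Review note: `_load_chunker` defers the import of each `rag.app.*` module until a parser id is actually used. The pattern can be sketched with standard-library modules standing in for the real chunkers. Everything named here (`_REGISTRY`, `load_callable`) is hypothetical, and converting an unknown key into a `ValueError` is an assumption of the sketch; the function in the diff indexes the dict directly.

```python
import importlib

# Hypothetical registry: stdlib modules stand in for rag.app.* so the sketch runs anywhere.
_REGISTRY: dict[str, tuple[str, str]] = {
    "json": ("json", "dumps"),
    "b64": ("base64", "b64encode"),
}


def load_callable(key: str):
    """Resolve a key to a callable, importing its module only on first use."""
    try:
        module_path, attr = _REGISTRY[key.lower()]
    except KeyError:
        raise ValueError(f"unknown key: {key!r}") from None
    return getattr(importlib.import_module(module_path), attr)


encode = load_callable("JSON")  # lookup is case-insensitive, as in the diff
result = encode({"a": 1})
```

Because `importlib.import_module` caches modules in `sys.modules`, repeated lookups of the same key do not re-import the module.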
--- a/agent/component/string_transform.py
+++ b/agent/component/string_transform.py
@@ -18,7 +18,9 @@ import re
 from abc import ABC
 from typing import Any

-from jinja2 import Template as Jinja2Template
+from jinja2.sandbox import SandboxedEnvironment
+
+_jinja2_sandbox = SandboxedEnvironment()
 from agent.component.base import ComponentParamBase
 from common.connection_utils import timeout
 from .message import Message
@@ -96,14 +98,14 @@ class StringTransform(Message, ABC):
        script, kwargs = self.get_kwargs(script, kwargs, self._param.delimiters[0])

        if self._is_jinjia2(script):
-            template = Jinja2Template(script)
+            template = _jinja2_sandbox.from_string(script)
            try:
                script = template.render(kwargs)
            except Exception:
                pass

        for k,v in kwargs.items():
-            if not v:
+            if v is None:
                v = ""
            script = re.sub(k, lambda match: v, script)

--- a/agent/component/switch.py
+++ b/agent/component/switch.py
@@ -88,7 +88,7 @@ class Switch(ComponentBase, ABC):
                    self.set_output("_next", cond["to"])
                    return

-            if all(res):
+            if res and all(res):
                self.set_output("next", [self._canvas.get_component_name(cpn_id) for cpn_id in cond["to"]])
                self.set_output("_next", cond["to"])
                return
@@ -97,6 +97,9 @@ class Switch(ComponentBase, ABC):
        self.set_output("_next", self._param.end_cpn_ids)

    def process_operator(self, input: Any, operator: str, value: Any) -> bool:
+        if operator in ("contains", "not contains", "start with", "end with"):
+            input = "" if input is None else str(input)
+            value = "" if value is None else str(value)
        if operator == "contains":
            return True if value.lower() in input.lower() else False
        elif operator == "not contains":
@@ -134,7 +137,7 @@ class Switch(ComponentBase, ABC):
            except Exception:
                return True if input <= value else False

-        raise ValueError('Not supported operator' + operator)
+        raise ValueError(f'Not supported operator: {operator}')

    def thoughts(self) -> str:
        return "I’m weighing a few options and will pick the next step shortly."
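Review note: the `res and all(res)` guard matters because `all()` over an empty list is vacuously true, so a condition with no evaluated items would otherwise match. The short sketch below shows that behavior together with the string coercion added to `process_operator`; `contains` is a hypothetical stand-alone function written for illustration, not the method from `switch.py`.

```python
results: list[bool] = []  # no condition items were evaluated

unguarded = all(results)                  # vacuously true for an empty list
guarded = bool(results) and all(results)  # the guard rejects the empty case


def contains(input_value, value) -> bool:
    """Case-insensitive substring test that tolerates None and non-string inputs."""
    input_value = "" if input_value is None else str(input_value)
    value = "" if value is None else str(value)
    return value.lower() in input_value.lower()


number_match = contains(12345, 23)  # both sides are coerced to strings first
none_input = contains(None, "x")    # None input becomes the empty string
none_value = contains("abc", None)  # the empty string is a substring of any string
```

One consequence of the coercion is that a `None` comparison value always satisfies `contains`, which may or may not be the intended behavior for a given workflow.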
--- a/agent/component/varaiable_aggregator.py
+++ b/agent/component/varaiable_aggregator.py
--- a/agent/component/variable_assigner.py
+++ b/agent/component/variable_assigner.py
@@ -0,0 +1,194 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from abc import ABC
+import os
+import numbers
+from agent.component.base import ComponentBase, ComponentParamBase
+from api.utils.api_utils import timeout
+
+class VariableAssignerParam(ComponentParamBase):
+    """
+    Define the Variable Assigner component parameters.
+    """
+    def __init__(self):
+        super().__init__()
+        self.variables=[]
+
+    def check(self):
+        return True
+    
+    def get_input_form(self) -> dict[str, dict]:
+        return {
+            "items": {
+                "type": "json",
+                "name": "Items"
+            }
+        }
+
+class VariableAssigner(ComponentBase,ABC):
+    component_name = "VariableAssigner"
+    _NO_PARAMETER_OPERATORS = {"clear", "remove_first", "remove_last"}
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
+    def _invoke(self, **kwargs):
+        if not isinstance(self._param.variables,list):
+            return
+        else:
+            for item in self._param.variables:
+                variable = item.get("variable")
+                operator = item.get("operator")
+                parameter = item.get("parameter")
+
+                if not variable or not operator:
+                    raise ValueError("Each item requires both 'variable' and 'operator'.")
+                if operator not in self._NO_PARAMETER_OPERATORS and parameter is None:
+                    raise ValueError(f"Operator '{operator}' requires a 'parameter'.")
+                variable_value=self._canvas.get_variable_value(variable)
+                new_variable=self._operate(variable_value,operator,parameter)
+                self._canvas.set_variable_value(variable, new_variable)
+
+    def _operate(self,variable,operator,parameter):
+        if operator == "overwrite":
+            return self._overwrite(parameter)
+        elif operator == "clear":
+            return self._clear(variable)
+        elif operator == "set":
+            return self._set(variable,parameter)
+        elif operator == "append":
+            return self._append(variable,parameter)
+        elif operator == "extend":
+            return self._extend(variable,parameter)
+        elif operator == "remove_first":
+            return self._remove_first(variable)
+        elif operator == "remove_last":
+            return self._remove_last(variable)
+        elif operator == "+=":
+            return self._add(variable,parameter)
+        elif operator == "-=":
+            return self._subtract(variable,parameter)
+        elif operator == "*=":
+            return self._multiply(variable,parameter)
+        elif operator == "/=":
+            return self._divide(variable,parameter)
+        else:
+            raise ValueError(f"Unsupported operator: {operator}")
+    
+    def _overwrite(self,parameter):
+        return self._canvas.get_variable_value(parameter)
+
+    def _clear(self,variable):
+        if isinstance(variable,list):
+            return []
+        elif isinstance(variable,str):
+            return ""
+        elif isinstance(variable,dict):
+            return {}
+        elif isinstance(variable,bool):
+            return False
+        elif isinstance(variable,int):
+            return 0
+        elif isinstance(variable,float):
+            return 0.0
+        else:
+            return None
+
+    def _set(self,variable,parameter):
+        if variable is None:
+            return self._canvas.get_value_with_variable(parameter)
+        elif isinstance(variable,str):
+            return self._canvas.get_value_with_variable(parameter)
+        elif isinstance(variable,bool):
+            return parameter
+        elif isinstance(variable,int):
+            return parameter
+        elif isinstance(variable,float):
+            return parameter
+        else:
+            return parameter
+
+    def _append(self,variable,parameter):
+        parameter=self._canvas.get_variable_value(parameter)
+        if variable is None:
+            variable=[]
+        if not isinstance(variable,list):
+            return "ERROR:VARIABLE_NOT_LIST"
+        elif len(variable)!=0 and not isinstance(parameter,type(variable[0])):
+            return "ERROR:PARAMETER_NOT_LIST_ELEMENT_TYPE"
+        else:
+            variable.append(parameter)
+            return variable
+
+    def _extend(self,variable,parameter):
+        parameter=self._canvas.get_variable_value(parameter)
+        if variable is None:
+            variable=[]
+        if not isinstance(variable,list):
+            return "ERROR:VARIABLE_NOT_LIST"
+        elif not isinstance(parameter,list):
+            return "ERROR:PARAMETER_NOT_LIST"
+        elif len(variable)!=0 and len(parameter)!=0 and not isinstance(parameter[0],type(variable[0])):
+            return "ERROR:PARAMETER_NOT_LIST_ELEMENT_TYPE"
+        else:
+            return variable + parameter
+
+    def _remove_first(self,variable):
+        if not isinstance(variable,list):
+            return "ERROR:VARIABLE_NOT_LIST"
+        if len(variable)==0:
+            return variable
+        return variable[1:]
+
+    def _remove_last(self,variable):
+        if not isinstance(variable,list):
+            return "ERROR:VARIABLE_NOT_LIST"
+        if len(variable)==0:
+            return variable
+        return variable[:-1]
+
+    def is_number(self, value):
+        if isinstance(value, bool):
+            return False
+        return isinstance(value, numbers.Number)
+
+    def _add(self,variable,parameter):
+        if self.is_number(variable) and self.is_number(parameter):
+            return variable + parameter
+        else:
+            return "ERROR:VARIABLE_NOT_NUMBER or PARAMETER_NOT_NUMBER"
+
+    def _subtract(self,variable,parameter):
+        if self.is_number(variable) and self.is_number(parameter):
+            return variable - parameter
+        else:
+            return "ERROR:VARIABLE_NOT_NUMBER or PARAMETER_NOT_NUMBER"
+
+    def _multiply(self,variable,parameter):
+        if self.is_number(variable) and self.is_number(parameter):
+            return variable * parameter
+        else:
+            return "ERROR:VARIABLE_NOT_NUMBER or PARAMETER_NOT_NUMBER"
+
+    def _divide(self,variable,parameter):
+        if self.is_number(variable) and self.is_number(parameter):
+            if parameter == 0:
+                return "ERROR:DIVIDE_BY_ZERO"
+            else:
+                return variable / parameter
+        else:
+            return "ERROR:VARIABLE_NOT_NUMBER or PARAMETER_NOT_NUMBER"
+
+    def thoughts(self) -> str:
+        return "Assign variables from canvas."
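Review note: both `is_number` and `_clear` check `bool` before `int`, which is necessary because `bool` is a subclass of `int` in Python. A minimal stand-alone sketch of the two checks follows; the module-level functions here are hypothetical copies made for illustration, and `clear` covers only the scalar cases.

```python
import numbers


def is_number(value) -> bool:
    """Treat ints, floats, and other numbers.Number values as numeric, but not bools."""
    if isinstance(value, bool):
        return False
    return isinstance(value, numbers.Number)


def clear(value):
    """Return the empty value for a scalar; bool must be tested before int."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return 0
    if isinstance(value, float):
        return 0.0
    return None


bool_is_int = isinstance(True, int)  # why the ordering matters
cleared_flag = clear(True)
cleared_count = clear(7)
```

If the `int` branch came first, clearing a boolean flag would yield `0` and arithmetic operators such as `+=` would accept `True` as the number 1.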
--- a/agent/component/webhook.py
+++ b/agent/component/webhook.py
@@ -1,38 +0,0 @@
-#
-#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
-#
-#  Licensed under the Apache License, Version 2.0 (the "License");
-#  you may not use this file except in compliance with the License.
-#  You may obtain a copy of the License at
-#
-#      http://www.apache.org/licenses/LICENSE-2.0
-#
-#  Unless required by applicable law or agreed to in writing, software
-#  distributed under the License is distributed on an "AS IS" BASIS,
-#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#  See the License for the specific language governing permissions and
-#  limitations under the License.
-#
-from agent.component.base import ComponentParamBase, ComponentBase
-
-
-class WebhookParam(ComponentParamBase):
-
-    """
-    Define the Begin component parameters.
-    """
-    def __init__(self):
-        super().__init__()
-
-    def get_input_form(self) -> dict[str, dict]:
-        return getattr(self, "inputs")
-
-
-class Webhook(ComponentBase):
-    component_name = "Webhook"
-
-    def _invoke(self, **kwargs):
-        pass
-
-    def thoughts(self) -> str:
-        return ""
--- a/agent/dsl_migration.py
+++ b/agent/dsl_migration.py
@@ -0,0 +1,178 @@
+#
+#  Copyright 2026 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import copy
+import re
+
+
+# Keep all legacy component renames in one place so the migration rule stays readable.
+COMPONENT_RENAMES = {
+    "Splitter": "TokenChunker",
+    "HierarchicalMerger": "TitleChunker",
+    "PDFGenerator": "DocGenerator",
+}
+
+NODE_TYPE_RENAMES = {
+    "splitterNode": "chunkerNode",
+}
+
+VARIABLE_REF_PATTERN = re.compile(r"(\{+\s*)([A-Za-z0-9:_-]+)(@[A-Za-z0-9_.-]+)(\s*\}+)")
+
+
+def normalize_chunker_dsl(dsl: dict) -> dict:
+    """
+    Rewrite legacy chunker component names and ids into the current DSL schema.
+
+    This is intentionally a pure migration step:
+    - it does not change business params
+    - it only rewrites structural identifiers used by the canvas/runtime
+    - custom human-authored names are preserved unless they are still the exact
+      built-in legacy operator name
+    """
+    if not isinstance(dsl, dict):
+        return dsl
+
+    normalized = copy.deepcopy(dsl)
+    components = normalized.get("components")
+    if not isinstance(components, dict):
+        return normalized
+
+    component_id_map: dict[str, str] = {}
+    for component_id in components.keys():
+        new_component_id = component_id
+        for old_name, new_name in COMPONENT_RENAMES.items():
+            prefix = f"{old_name}:"
+            if component_id.startswith(prefix):
+                new_component_id = f"{new_name}:{component_id[len(prefix):]}"
+                break
+        component_id_map[component_id] = new_component_id
+
+    def rewrite_variable_refs(text: str) -> str:
+        if text in component_id_map:
+            return component_id_map[text]
+
+        def repl(match: re.Match[str]) -> str:
+            component_id = match.group(2)
+            return (
+                match.group(1)
+                + component_id_map.get(component_id, component_id)
+                + match.group(3)
+                + match.group(4)
+            )
+
+        return VARIABLE_REF_PATTERN.sub(repl, text)
+
+    def rewrite_value(value):
+        if isinstance(value, str):
+            return rewrite_variable_refs(value)
+        if isinstance(value, list):
+            return [rewrite_value(item) for item in value]
+        if isinstance(value, dict):
+            return {key: rewrite_value(item) for key, item in value.items()}
+        return value
+
+    rewritten_components = {}
+    for old_component_id, component in components.items():
+        new_component_id = component_id_map[old_component_id]
+        new_component = rewrite_value(component)
+
+        if isinstance(new_component, dict):
+            obj = new_component.get("obj")
+            if isinstance(obj, dict):
+                component_name = obj.get("component_name")
+                obj["component_name"] = COMPONENT_RENAMES.get(component_name, component_name)
+
+            if isinstance(new_component.get("downstream"), list):
+                new_component["downstream"] = [
+                    component_id_map.get(component_id, component_id)
+                    for component_id in new_component["downstream"]
+                ]
+            if isinstance(new_component.get("upstream"), list):
+                new_component["upstream"] = [
+                    component_id_map.get(component_id, component_id)
+                    for component_id in new_component["upstream"]
+                ]
+
+            parent_id = new_component.get("parent_id")
+            if isinstance(parent_id, str):
+                new_component["parent_id"] = component_id_map.get(parent_id, parent_id)
+
+        rewritten_components[new_component_id] = new_component
+
+    normalized["components"] = rewritten_components
+
+    if isinstance(normalized.get("path"), list):
+        normalized["path"] = [
+            component_id_map.get(component_id, component_id)
+            for component_id in normalized["path"]
+        ]
+
+    graph = normalized.get("graph")
+    if isinstance(graph, dict):
+        nodes = graph.get("nodes")
+        if isinstance(nodes, list):
+            for node in nodes:
+                if not isinstance(node, dict):
+                    continue
+                node_id = node.get("id")
+                if isinstance(node_id, str):
+                    node["id"] = component_id_map.get(node_id, node_id)
+
+                parent_id = node.get("parentId")
+                if isinstance(parent_id, str):
+                    node["parentId"] = component_id_map.get(parent_id, parent_id)
+
+                node_type = node.get("type")
+                if isinstance(node_type, str):
+                    node["type"] = NODE_TYPE_RENAMES.get(node_type, node_type)
+
+                data = node.get("data")
+                if not isinstance(data, dict):
+                    continue
+
+                label = data.get("label")
+                if isinstance(label, str):
+                    data["label"] = COMPONENT_RENAMES.get(label, label)
+
+                name = data.get("name")
+                if isinstance(name, str) and name in COMPONENT_RENAMES:
+                    data["name"] = COMPONENT_RENAMES[name]
+
+                if "form" in data:
+                    data["form"] = rewrite_value(data["form"])
+
+        edges = graph.get("edges")
+        if isinstance(edges, list):
+            replacements = sorted(component_id_map.items(), key=lambda item: len(item[0]), reverse=True)
+            for edge in edges:
+                if not isinstance(edge, dict):
+                    continue
+                for key in ("source", "target"):
+                    value = edge.get(key)
+                    if isinstance(value, str):
+                        edge[key] = component_id_map.get(value, value)
+
+                edge_id = edge.get("id")
+                if isinstance(edge_id, str):
+                    for old_component_id, new_component_id in replacements:
+                        edge_id = edge_id.replace(old_component_id, new_component_id)
+                    edge["id"] = edge_id
+
+    for key in ("history", "messages", "reference"):
+        if key in normalized:
+            normalized[key] = rewrite_value(normalized[key])
+
+    return normalized
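
Note on the edge-id rewrite above: `replacements` is sorted longest-first, presumably so that an old id which is a prefix of another old id cannot clobber the longer one during plain `str.replace`. A minimal sketch of that effect follows. The component ids (`begin`, `begin_1`), the edge id and the helper name `rewrite_edge_id` are hypothetical and not taken from the source.

```python
def rewrite_edge_id(edge_id: str, id_map: dict[str, str], longest_first: bool = True) -> str:
    """Replace every old component id found inside edge_id with its new id."""
    replacements = list(id_map.items())
    if longest_first:
        # Same ordering as the diff: longer old ids are substituted before shorter ones.
        replacements.sort(key=lambda item: len(item[0]), reverse=True)
    for old_id, new_id in replacements:
        edge_id = edge_id.replace(old_id, new_id)
    return edge_id


# Hypothetical ids, for illustration only.
ID_MAP = {"begin": "Begin:0", "begin_1": "Begin:1"}
EDGE_ID = "edge__begin_1__retrieval"

if __name__ == "__main__":
    # Longest-first keeps "begin_1" intact until it is replaced as a whole.
    print(rewrite_edge_id(EDGE_ID, ID_MAP))
    # Insertion order replaces the "begin" prefix first and corrupts the id.
    print(rewrite_edge_id(EDGE_ID, ID_MAP, longest_first=False))
```

Longest-first ordering only guards against prefix collisions between old ids. It does not help if a new id itself contains another old id, so the sketch should not be read as a general-purpose renaming scheme.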
--- a/agent/plugin/README.md
+++ b/agent/plugin/README.md
@@ -0,0 +1,97 @@
+# Plugins
+
+This directory contains the plugin mechanism for RAGFlow.
+
+RAGFlow loads plugins recursively from the `embedded_plugins` subdirectory.
+
+## Supported plugin types
+
+Currently, the only supported plugin type is `llm_tools`.
+
+- `llm_tools`: A tool for the LLM to call.
+
+## How to add a plugin
+
+Adding an LLM tool plugin is simple: create a plugin file, define a class in it that inherits from `LLMToolPlugin`, then implement the `get_metadata` and `invoke` methods.
+
+- `get_metadata` method: This method returns an `LLMToolMetadata` object, which contains the description of this tool.
+The description is provided to the LLM for tool calling and to the RAGFlow web frontend for display.
+
+- `invoke` method: This method accepts parameters generated by the LLM and returns a `str` containing the tool execution result.
+All the execution logic of this tool should go into this method.
+
+When you start RAGFlow, the log shows that your plugin was loaded:
+
+```
+2025-05-15 19:29:08,959 INFO     34670 Recursively importing plugins from path `/some-path/ragflow/agent/plugin/embedded_plugins`
+2025-05-15 19:29:08,960 INFO     34670 Loaded llm_tools plugin BadCalculatorPlugin version 1.0.0
+```
+
+If loading fails, the log contains error messages instead, which you can use to fix your plugin.
+
+### Demo
+
+We will demonstrate how to add a plugin using a calculator tool that deliberately gives wrong answers.
+
+First, create a plugin file `bad_calculator.py` under the `embedded_plugins/llm_tools` directory.
+
+Then, we create a `BadCalculatorPlugin` class, extending the `LLMToolPlugin` base class:
+
+```python
+class BadCalculatorPlugin(LLMToolPlugin):
+    _version_ = "1.0.0"
+```
+
+The `_version_` field is required and specifies the version of the plugin.
+
+Our calculator takes two numbers, `a` and `b`, as inputs, so we add an `invoke` method to our `BadCalculatorPlugin` class:
+
+```python
+def invoke(self, a: int, b: int) -> str:
+    return str(a + b + 100)
+```
+
+The `invoke` method will be called by the LLM. It can have many parameters, but the return type must be `str`.
+
+Finally, we have to add a `get_metadata` method to tell the LLM how to use our `bad_calculator`:
+
+```python
+@classmethod
+def get_metadata(cls) -> LLMToolMetadata:
+    return {
+        # Name of this tool, provided to the LLM
+        "name": "bad_calculator",
+        # Display name of this tool, provided to the RAGFlow frontend
+        "displayName": "$t:bad_calculator.name",
+        # Description of how to use this tool, provided to the LLM
+        "description": "A tool to calculate the sum of two numbers (will give wrong answer)",
+        # Description of this tool, provided to the RAGFlow frontend
+        "displayDescription": "$t:bad_calculator.description",
+        # Parameters of this tool
+        "parameters": {
+            # The first parameter - a
+            "a": {
+                # Parameter type, options are: number, string, or whatever the LLM can recognise
+                "type": "number",
+                # Description of this parameter, provided to the LLM
+                "description": "The first number",
+                # Description of this parameter, provided to the RAGFlow frontend
+                "displayDescription": "$t:bad_calculator.params.a",
+                # Whether this parameter is required
+                "required": True
+            },
+            # The second parameter - b
+            "b": {
+                "type": "number",
+                "description": "The second number",
+                "displayDescription": "$t:bad_calculator.params.b",
+                "required": True
+            }
+        }
+    }
+```
+
+The `get_metadata` method is a `classmethod`. It provides the description of this tool to the LLM.
+
+Fields that start with `display` can use a special notation, `$t:xxx`, which uses the i18n mechanism in the RAGFlow frontend to fetch text from the `llmTools` category. If you don't use this notation, the frontend displays the text exactly as you wrote it.
+
+Now our tool is ready. You can select it in the `Generate` component and try it out.
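
Note on the `$t:xxx` notation described in this README: the lookup happens in the RAGFlow web frontend, whose code is not part of this file. The following is only a minimal Python sketch of the idea, assuming translations are stored as a nested dictionary under an `llmTools` category. The function name `resolve_display_text`, the sample strings and the fallback for unknown keys are all hypothetical.

```python
from typing import Any

# Hypothetical translation table; the real one lives in the frontend's i18n files.
TRANSLATIONS: dict[str, Any] = {
    "llmTools": {
        "bad_calculator": {
            "name": "Bad Calculator",
            "description": "Adds two numbers, badly",
            "params": {"a": "First number", "b": "Second number"},
        }
    }
}


def resolve_display_text(value: str, translations: dict[str, Any]) -> str:
    """Return the translated text for a `$t:` key, or the value itself otherwise."""
    prefix = "$t:"
    if not value.startswith(prefix):
        return value  # plain text is displayed as written
    node: Any = translations.get("llmTools", {})
    for part in value[len(prefix):].split("."):
        if not isinstance(node, dict) or part not in node:
            return value  # unknown key: fall back to the raw string (an assumption)
        node = node[part]
    return node if isinstance(node, str) else value


if __name__ == "__main__":
    print(resolve_display_text("$t:bad_calculator.name", TRANSLATIONS))
    print(resolve_display_text("Shown as-is", TRANSLATIONS))
```

Splitting the key on dots mirrors the nesting of `bad_calculator.params.a` in the metadata above.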
--- a/agent/plugin/README_tr.md
+++ b/agent/plugin/README_tr.md
@@ -0,0 +1,99 @@
+[English](./README.md) | [简体中文](./README_zh.md) | Türkçe
+
+# Eklentiler
+
+Bu klasör, RAGFlow'un eklenti mekanizmasını içerir.
+
+RAGFlow, `embedded_plugins` alt klasöründen eklentileri özyinelemeli olarak yükleyecektir.
+
+## Desteklenen eklenti türleri
+
+Şu anda desteklenen tek eklenti türü `llm_tools`'dur.
+
+- `llm_tools`: LLM'nin çağırması için bir araç.
+
+## Eklenti nasıl eklenir
+
+Bir LLM araç eklentisi eklemek basittir: bir eklenti dosyası oluşturun, içine `LLMToolPlugin` sınıfından türetilmiş bir sınıf koyun, ardından `get_metadata` ve `invoke` metodlarını uygulayın.
+
+- `get_metadata` metodu: Bu metod, aracın açıklamasını içeren bir `LLMToolMetadata` nesnesi döndürür.
+Açıklama, LLM'ye çağrı için ve RAGFlow web ön yüzüne görüntüleme amacıyla sağlanacaktır.
+
+- `invoke` metodu: Bu metod, LLM tarafından üretilen parametreleri kabul eder ve aracın yürütme sonucunu içeren bir `str` döndürür.
+Bu aracın tüm yürütme mantığı bu metoda konulmalıdır.
+
+RAGFlow'u başlattığınızda, günlükte eklentinizin yüklendiğini göreceksiniz:
+
+```
+2025-05-15 19:29:08,959 INFO     34670 Recursively importing plugins from path `/some-path/ragflow/agent/plugin/embedded_plugins`
+2025-05-15 19:29:08,960 INFO     34670 Loaded llm_tools plugin BadCalculatorPlugin version 1.0.0
+```
+
+Veya eklentinizi düzeltmeniz gereken hatalar da içerebilir.
+
+### Örnek
+
+Yanlış cevaplar veren bir hesap makinesi aracı ekleyerek eklenti ekleme sürecini göstereceğiz.
+
+Önce, `embedded_plugins/llm_tools` klasörü altında `bad_calculator.py` adında bir eklenti dosyası oluşturun.
+
+Ardından, `LLMToolPlugin` temel sınıfından türetilmiş bir `BadCalculatorPlugin` sınıfı oluşturuyoruz:
+
+```python
+class BadCalculatorPlugin(LLMToolPlugin):
+    _version_ = "1.0.0"
+```
+
+`_version_` alanı zorunludur ve eklentinin sürüm numarasını belirtir.
+
+Hesap makinemizin girdileri olarak `a` ve `b` olmak üzere iki sayısı vardır, bu yüzden `BadCalculatorPlugin` sınıfımıza aşağıdaki `invoke` metodunu ekliyoruz:
+
+```python
+def invoke(self, a: int, b: int) -> str:
+    return str(a + b + 100)
+```
+
+`invoke` metodu LLM tarafından çağrılacaktır. Birçok parametreye sahip olabilir, ancak dönüş tipi `str` olmalıdır.
+
+Son olarak, LLM'ye `bad_calculator` aracımızı nasıl kullanacağını anlatmak için bir `get_metadata` metodu eklememiz gerekiyor:
+
+```python
+@classmethod
+def get_metadata(cls) -> LLMToolMetadata:
+    return {
+        # Bu aracın adı, LLM'ye sağlanır
+        "name": "bad_calculator",
+        # Bu aracın görüntüleme adı, RAGFlow ön yüzüne sağlanır
+        "displayName": "$t:bad_calculator.name",
+        # Bu aracın kullanım açıklaması, LLM'ye sağlanır
+        "description": "A tool to calculate the sum of two numbers (will give wrong answer)",
+        # Bu aracın açıklaması, RAGFlow ön yüzüne sağlanır
+        "displayDescription": "$t:bad_calculator.description",
+        # Bu aracın parametreleri
+        "parameters": {
+            # Birinci parametre - a
+            "a": {
+                # Parametre tipi, seçenekler: number, string veya LLM'nin tanıyabileceği herhangi bir tip
+                "type": "number",
+                # Bu parametrenin açıklaması, LLM'ye sağlanır
+                "description": "The first number",
+                # Bu parametrenin açıklaması, RAGFlow ön yüzüne sağlanır
+                "displayDescription": "$t:bad_calculator.params.a",
+                # Bu parametrenin zorunlu olup olmadığı
+                "required": True
+            },
+            # İkinci parametre - b
+            "b": {
+                "type": "number",
+                "description": "The second number",
+                "displayDescription": "$t:bad_calculator.params.b",
+                "required": True
+            }
+        }
+    }
+```
+
+`get_metadata` metodu bir `classmethod`'dur. Bu aracın açıklamasını LLM'ye sağlayacaktır.
+
+`display` ile başlayan alanlar özel bir gösterim kullanabilir: `$t:xxx`, bu gösterim RAGFlow ön yüzündeki uluslararasılaştırma (i18n) mekanizmasını kullanarak `llmTools` kategorisinden metin alır. Bu gösterimi kullanmazsanız, ön yüz buraya yazdığınız metni doğrudan gösterecektir.
+
+Artık aracımız hazırdır. `Yanıt Üret` bileşeninde seçip deneyebilirsiniz.
--- a/agent/plugin/README_zh.md
+++ b/agent/plugin/README_zh.md
@@ -0,0 +1,98 @@
+# 插件
+
+这个文件夹包含了RAGFlow的插件机制。
+
+RAGFlow将会从`embedded_plugins`子文件夹中递归加载所有的插件。
+
+## 支持的插件类型
+
+目前，唯一支持的插件类型是`llm_tools`。
+
+- `llm_tools`：用于供LLM进行调用的工具。
+
+## 如何添加一个插件
+
+添加一个LLM工具插件是很简单的：创建一个插件文件，向其中放一个继承自`LLMToolPlugin`的类，再实现它的`get_metadata`和`invoke`方法即可。
+
+- `get_metadata`方法：这个方法返回一个`LLMToolMetadata`对象，其中包含了对这个工具的描述。
+这些描述信息将被提供给LLM进行调用，和RAGFlow的Web前端用作展示。
+
+- `invoke`方法：这个方法接受LLM生成的参数，并且返回一个`str`对象，其中包含了这个工具的执行结果。
+这个工具的所有执行逻辑都应当放到这个方法里。
+
+当你启动RAGFlow时，你会在日志中看见你的插件被加载了：
+
+```
+2025-05-15 19:29:08,959 INFO     34670 Recursively importing plugins from path `/some-path/ragflow/agent/plugin/embedded_plugins`
+2025-05-15 19:29:08,960 INFO     34670 Loaded llm_tools plugin BadCalculatorPlugin version 1.0.0
+```
+
+也可能会报错，这时就需要根据报错对你的插件进行修复。
+
+### 示例
+
+我们将会添加一个会给出错误答案的计算器工具，来演示添加插件的过程。
+
+首先，在`embedded_plugins/llm_tools`文件夹下创建一个插件文件`bad_calculator.py`。
+
+接下来，我们创建一个`BadCalculatorPlugin`类，继承基类`LLMToolPlugin`：
+
+```python
+class BadCalculatorPlugin(LLMToolPlugin):
+    _version_ = "1.0.0"
+```
+
+`_version_`字段是必填的，用于指定这个插件的版本号。
+
+我们的计算器拥有两个输入字段`a`和`b`，所以我们添加如下的`invoke`方法到`BadCalculatorPlugin`类中：
+
+```python
+def invoke(self, a: int, b: int) -> str:
+    return str(a + b + 100)
+```
+
+`invoke`方法将会被LLM所调用。这个方法可以有许多参数，但它必须返回一个`str`。
+
+最后，我们需要添加一个`get_metadata`方法，来告诉LLM怎样使用我们的`bad_calculator`工具：
+
+```python
+@classmethod
+def get_metadata(cls) -> LLMToolMetadata:
+    return {
+        # 这个工具的名称，会提供给LLM
+        "name": "bad_calculator",
+        # 这个工具的展示名称，会提供给RAGFlow的Web前端
+        "displayName": "$t:bad_calculator.name",
+        # 这个工具的用法描述，会提供给LLM
+        "description": "A tool to calculate the sum of two numbers (will give wrong answer)",
+        # 这个工具的描述，会提供给RAGFlow的Web前端
+        "displayDescription": "$t:bad_calculator.description",
+        # 这个工具的参数
+        "parameters": {
+            # 第一个参数 - a
+            "a": {
+                # 参数类型，选项为：number, string, 或者LLM可以识别的任何类型
+                "type": "number",
+                # 这个参数的描述，会提供给LLM
+                "description": "The first number",
+                # 这个参数的描述，会提供给RAGFlow的Web前端
+                "displayDescription": "$t:bad_calculator.params.a",
+                # 这个参数是否是必填的
+                "required": True
+            },
+            # 第二个参数 - b
+            "b": {
+                "type": "number",
+                "description": "The second number",
+                "displayDescription": "$t:bad_calculator.params.b",
+                "required": True
+            }
+        }
+    }
+```
+
+`get_metadata`方法是一个`classmethod`。它会把这个工具的描述提供给LLM。
+
+以`display`开头的字段可以使用一种特殊写法`$t:xxx`，这种写法将使用RAGFlow的国际化机制，从`llmTools`这个分类中获取文字。如果你不使用这种写法，那么前端将会显示此处的原始内容。
+
+现在，我们的工具已经做好了，你可以在`生成回答`组件中选择这个工具来尝试一下。
+
--- a/agent/plugin/init.py
+++ b/agent/plugin/init.py
--- a/agent/plugin/common.py
+++ b/agent/plugin/common.py
--- a/agent/plugin/embedded_plugins/llm_tools/bad_calculator.py
+++ b/agent/plugin/embedded_plugins/llm_tools/bad_calculator.py
@@ -0,0 +1,37 @@
+import logging
+from agent.plugin.llm_tool_plugin import LLMToolMetadata, LLMToolPlugin
+
+
+class BadCalculatorPlugin(LLMToolPlugin):
+    """
+    A sample LLM tool plugin that adds two numbers and then adds 100 to the result.
+    It exists for demo purposes only. Do not use it in production.
+    """
+    _version_ = "1.0.0"
+
+    @classmethod
+    def get_metadata(cls) -> LLMToolMetadata:
+        return {
+            "name": "bad_calculator",
+            "displayName": "$t:bad_calculator.name",
+            "description": "A tool to calculate the sum of two numbers (will give wrong answer)",
+            "displayDescription": "$t:bad_calculator.description",
+            "parameters": {
+                "a": {
+                    "type": "number",
+                    "description": "The first number",
+                    "displayDescription": "$t:bad_calculator.params.a",
+                    "required": True
+                },
+                "b": {
+                    "type": "number",
+                    "description": "The second number",
+                    "displayDescription": "$t:bad_calculator.params.b",
+                    "required": True
+                }
+            }
+        }
+
+    def invoke(self, a: int, b: int) -> str:
+        logging.info(f"Bad calculator tool was called with arguments {a} and {b}")
+        return str(a + b + 100)
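
Note on how the plugin above might be exercised: the sketch below checks LLM-generated arguments against the `required` flags in the metadata before calling `invoke`. It is an illustration under stated assumptions, not RAGFlow's dispatcher. The stand-in `LLMToolPlugin` base class and the `call_tool` helper are hypothetical; the real base class lives in `agent/plugin/llm_tool_plugin.py`, whose contents are not shown in this diff.

```python
from typing import Any


class LLMToolPlugin:
    """Stand-in base class so the sketch runs on its own (assumption)."""


class BadCalculatorPlugin(LLMToolPlugin):
    _version_ = "1.0.0"

    @classmethod
    def get_metadata(cls) -> dict[str, Any]:
        # Trimmed copy of the metadata from the diff; display fields omitted.
        return {
            "name": "bad_calculator",
            "parameters": {
                "a": {"type": "number", "required": True},
                "b": {"type": "number", "required": True},
            },
        }

    def invoke(self, a: int, b: int) -> str:
        return str(a + b + 100)


def call_tool(plugin: Any, arguments: dict[str, Any]) -> str:
    """Hypothetical dispatcher: check required parameters, then call invoke."""
    parameters = plugin.get_metadata()["parameters"]
    missing = [
        name
        for name, spec in parameters.items()
        if spec.get("required") and name not in arguments
    ]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return plugin.invoke(**arguments)


if __name__ == "__main__":
    print(call_tool(BadCalculatorPlugin(), {"a": 1, "b": 2}))
```

Validating before the call keeps a missing argument from surfacing as an opaque `TypeError` inside `invoke`.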
--- a/agent/plugin/llm_tool_plugin.py
+++ b/agent/plugin/llm_tool_plugin.py
--- a/agent/plugin/plugin_manager.py
+++ b/agent/plugin/plugin_manager.py
--- a/agent/sandbox/.env.example
+++ b/agent/sandbox/.env.example
--- a/agent/sandbox/Makefile
+++ b/agent/sandbox/Makefile
@@ -0,0 +1,115 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+# Force using Bash to ensure the source command is available
+SHELL := /bin/bash
+
+# Environment variable definitions
+VENV := .venv
+PYTHON := $(VENV)/bin/python
+UV := uv
+ACTIVATE_SCRIPT := $(VENV)/bin/activate
+SYS_PYTHON := python3
+PYTHONPATH := $(shell pwd)
+
+.PHONY: all setup ensure_env ensure_uv start stop restart build clean test logs
+
+all: setup start
+
+# 🌱 Initialize environment + install dependencies
+setup: ensure_env ensure_uv
+	@echo "📦 Installing dependencies with uv..."
+	@$(UV) sync --python 3.12
+# Each recipe line runs in its own shell, so chain the activation with the install
+	source $(ACTIVATE_SCRIPT) && \
+	export PYTHONPATH=$(PYTHONPATH) && \
+	$(UV) pip install -r executor_manager/requirements.txt
+	@echo "✅ Setup complete."
+
+# 🔑 Ensure .env exists (copy from .env.example on first run)
+ensure_env:
+	@if [ ! -f ".env" ]; then \
+		if [ -f ".env.example" ]; then \
+			echo "📝 Creating .env from .env.example..."; \
+			cp .env.example .env; \
+		else \
+			echo "⚠️ Warning: .env.example not found, creating empty .env"; \
+			touch .env; \
+		fi; \
+	else \
+		echo "✅ .env already exists."; \
+	fi
+
+# 🔧 Ensure uv is executable (install using system Python)
+ensure_uv:
+	@if ! command -v $(UV) >/dev/null 2>&1; then \
+		echo "🛠️ Installing uv using system Python..."; \
+		$(SYS_PYTHON) -m pip install -q --upgrade pip; \
+		$(SYS_PYTHON) -m pip install -q uv || (echo "⚠️ uv install failed, check manually" && exit 1); \
+	fi
+
+# 🐳 Service control (using safer variable loading)
+start:
+	@echo "🚀 Starting services..."
+	source $(ACTIVATE_SCRIPT) && \
+	export PYTHONPATH=$(PYTHONPATH) && \
+	[ -f .env ] && source .env || true && \
+	bash scripts/start.sh
+
+stop:
+	@echo "🛑 Stopping services..."
+	source $(ACTIVATE_SCRIPT) && \
+	bash scripts/stop.sh
+
+restart: stop start
+	@echo "🔁 Restarting services..."
+
+build:
+	@echo "🔧 Building base sandbox images..."
+	@if [ -f .env ]; then \
+		source .env && \
+		echo "🐍 Building base sandbox image for Python ($$SANDBOX_BASE_PYTHON_IMAGE)..." && \
+		docker build -t "$$SANDBOX_BASE_PYTHON_IMAGE" ./sandbox_base_image/python && \
+		echo "⬢ Building base sandbox image for Nodejs ($$SANDBOX_BASE_NODEJS_IMAGE)..." && \
+		docker build -t "$$SANDBOX_BASE_NODEJS_IMAGE" ./sandbox_base_image/nodejs; \
+	else \
+		echo "⚠️ .env file not found, skipping build."; \
+	fi
+
+test:
+	@echo "🧪 Running sandbox security tests..."
+	source $(ACTIVATE_SCRIPT) && \
+	export PYTHONPATH=$(PYTHONPATH) && \
+	$(PYTHON) tests/sandbox_security_tests_full.py
+
+logs:
+	@echo "📋 Showing logs from api-server and executor-manager..."
+	docker compose logs -f
+
+# 🧹 Clean all containers and volumes
+clean:
+	@echo "🧹 Cleaning all containers and volumes..."
+	@docker compose down -v || true
+	@if [ -f .env ]; then \
+		source .env && \
+		for i in $$(seq 0 $$((SANDBOX_EXECUTOR_MANAGER_POOL_SIZE - 1))); do \
+			echo "🧹 Deleting sandbox_python_$$i..." && \
+			docker rm -f sandbox_python_$$i 2>/dev/null || true && \
+			echo "🧹 Deleting sandbox_nodejs_$$i..." && \
+			docker rm -f sandbox_nodejs_$$i 2>/dev/null || true; \
+		done; \
+	else \
+		echo "⚠️ .env not found, skipping container cleanup"; \
+	fi
--- a/agent/sandbox/README.md
+++ b/agent/sandbox/README.md
@@ -0,0 +1,361 @@
+# RAGFlow Sandbox
+
+A secure, pluggable code execution backend for RAGFlow and beyond.
+
+## 🔧 Features
+
+- ✅ **Seamless RAGFlow Integration** — Out-of-the-box compatibility with the `code` component.
+- 🔐 **High Security** — Leverages [gVisor](https://gvisor.dev/) for syscall-level sandboxing.
+- 🔧 **Customizable Sandboxing** — Easily modify `seccomp` settings as needed.
+- 🧩 **Pluggable Runtime Support** — Easily extend to support any programming language.
+- ⚙️ **Developer Friendly** — Get started with a single command using `Makefile`.
+
+## 🏗 Architecture
+
+<p align="center">
+  <img src="asserts/code_executor_manager.svg" width="520" alt="Architecture Diagram">
+</p>
+
+## 🚀 Quick Start
+
+### 📋 Prerequisites
+
+#### Required
+
+- Linux distro compatible with gVisor
+- [gVisor](https://gvisor.dev/docs/user_guide/install/)
+- Docker >= `25.0` (API 1.44+) — executor manager now bundles Docker CLI `29.1.0` to match newer daemons.
+- Docker Compose >= `v2.26.1`, the same minimum that [RAGFlow](https://github.com/infiniflow/ragflow) requires
+- [uv](https://docs.astral.sh/uv/) as package and project manager
+
+#### Optional (Recommended)
+
+- [GNU Make](https://www.gnu.org/software/make/) for simplified CLI management
+
+---
+
+> ⚠️ **New Docker CLI requirement**
+>
+> If you see `client version 1.43 is too old. Minimum supported API version is 1.44`, pull the latest `infiniflow/sandbox-executor-manager:latest` (rebuilt with Docker CLI `29.1.0`) or rebuild it in `./sandbox/executor_manager`. Older images shipped Docker 24.x, which cannot talk to newer Docker daemons.
+
+### 🐳 Build Docker Base Images
+
+We use isolated base images for secure containerized execution:
+
+```bash
+# Build base images manually
+docker build -t sandbox-base-python:latest ./sandbox_base_image/python
+docker build -t sandbox-base-nodejs:latest ./sandbox_base_image/nodejs
+
+# OR use Makefile
+make build
+```
+
+Then, build the executor manager image:
+
+```bash
+docker build -t sandbox-executor-manager:latest ./executor_manager
+```
+
+---
+
+### 📦 Running with RAGFlow
+
+1. Ensure gVisor is correctly installed.
+2. Configure your `.env` in `docker/.env`:
+
+   - Uncomment sandbox-related variables.
+   - Enable sandbox profile at the bottom.
+3. Add the following line to `/etc/hosts` as recommended:
+
+   ```text
+   127.0.0.1 sandbox-executor-manager
+   ```
+
+4. Start RAGFlow service.
+
+---
+
+### 🧭 Running Standalone
+
+#### Manual Setup
+
+1. Initialize environment:
+
+   ```bash
+   cp .env.example .env
+   ```
+
+2. Launch:
+
+   ```bash
+   docker compose -f docker-compose.yml up
+   ```
+
+3. Test:
+
+   ```bash
+   source .venv/bin/activate
+   export PYTHONPATH=$(pwd)
+   uv pip install -r executor_manager/requirements.txt
+   uv run tests/sandbox_security_tests_full.py
+   ```
+
+#### With Make
+
+```bash
+make          # setup + build + launch + test
+```
+
+---
+
+### 📈 Monitoring
+
+```bash
+docker logs -f sandbox-executor-manager  # Manual
+make logs                                 # With Make
+```
+
+---
+
+### 🧰 Makefile Toolbox
+
+| Command           | Description                                      |
+|-------------------|--------------------------------------------------|
+| `make`            | Setup, build, launch and test all at once        |
+| `make setup`      | Initialize environment and install dependencies  |
+| `make ensure_env` | Auto-create `.env` if missing                    |
+| `make ensure_uv`  | Install `uv` package manager if missing          |
+| `make build`      | Build all Docker base images                     |
+| `make start`      | Start services with safe env loading and testing |
+| `make stop`       | Gracefully stop all services                     |
+| `make restart`    | Shortcut for `stop` + `start`                    |
+| `make test`       | Run full test suite                              |
+| `make logs`       | Stream container logs                            |
+| `make clean`      | Stop and remove orphan containers and volumes    |
+
+---
+
+## 🔐 Security
+
+The RAGFlow sandbox is designed to balance security and usability, offering solid protection without compromising developer experience.
+
+### ✅ gVisor Isolation
+
+At its core, we use [gVisor](https://gvisor.dev/docs/architecture_guide/security/), a user-space kernel, to isolate code execution from the host system. gVisor intercepts and restricts syscalls, offering robust protection against container escapes and privilege escalations.
+
+### 🔒 Optional seccomp Support (Advanced)
+
+For users who need **zero-trust-level syscall control**, we support an additional `seccomp` profile. This feature restricts containers to only a predefined set of system calls, as specified in `executor_manager/seccomp-profile-default.json`.
+
+> ⚠️ This feature is **disabled by default** to maintain compatibility and usability. Enabling it may cause compatibility issues with some dependencies.
+
+#### To enable seccomp
+
+1. Edit your `.env` file:
+
+   ```dotenv
+   SANDBOX_ENABLE_SECCOMP=true
+   ```
+
+2. Customize allowed syscalls in:
+
+   ```
+   executor_manager/seccomp-profile-default.json
+   ```
+
+   This profile is passed to the container with:
+
+   ```bash
+   --security-opt seccomp=/app/seccomp-profile-default.json
+   ```
+
+### 🧠 Python Code AST Inspection
+
+In addition to sandboxing, Python code is **statically analyzed via AST (Abstract Syntax Tree)** before execution. Potentially malicious code (e.g. file operations, subprocess calls, etc.) is rejected early, providing an extra layer of protection.
+
+---
+
+This security model strikes a balance between **robust isolation** and **developer usability**. While `seccomp` can be highly restrictive, our default setup aims to keep things usable for most developers — no obscure crashes or cryptic setup required.
+
+## 📦 Add Extra Dependencies for Supported Languages
+
+Currently, the following languages are officially supported:
+
+| Language | Priority |
+|----------|----------|
+| Python   | High     |
+| Node.js  | Medium   |
+
+### 🐍 Python
+
+Pre-installed packages: `requests`, `numpy`, `pandas`, `matplotlib`.
+
+> `matplotlib` uses the `Agg` (non-interactive) backend by default in the sandbox (`MPLBACKEND=Agg`). No display server is available, so always save figures to files (e.g. `fig.savefig("artifacts/chart.png")`) rather than calling `plt.show()`.
+>
+> Tip: if Chinese text renders as missing boxes/squares in `matplotlib`, install Debian package `fonts-noto-cjk` in your custom image. We do not preinstall it by default to keep the base image smaller. The sandbox base image ships a `matplotlibrc` that already lists common CJK fonts in the `font.sans-serif` fallback chain, so no code-level font configuration is needed — just install the font package and rebuild the image.
+>
+> Example:
+>
+> ```dockerfile
+> RUN apt-get update && apt-get install -y --no-install-recommends fonts-noto-cjk && rm -rf /var/lib/apt/lists/*
+> ```
+
+To add more dependencies, edit:
+
+```bash
+sandbox_base_image/python/requirements.txt
+```
+
+Add any additional packages you need, one per line (just like a normal pip requirements file).
+
+### 🟨 Node.js
+
+Pre-installed packages: `axios`.
+
+To add Node.js dependencies:
+
+1. Navigate to the Node.js base image directory:
+
+   ```bash
+   cd sandbox_base_image/nodejs
+   ```
+
+2. Use `npm` to install the desired packages. For example:
+
+   ```bash
+   npm install lodash
+   ```
+
+3. The dependencies will be saved to `package.json` and `package-lock.json`, and included in the Docker image when rebuilt.
+
+---
+
+
+## Usage
+
+### 🐍 A Python example
+
+```python
+def main(arg1: str, arg2: str) -> str:
+    return f"result: {arg1 + arg2}"
+```
+
+### 🟨 JavaScript examples
+
+A simple sync function
+
+```javascript
+function main({arg1, arg2}) {
+  return arg1+arg2
+}
+```
+
+Async function with axios
+
+```javascript
+const axios = require('axios');
+async function main() {
+  try {
+    const response = await axios.get('https://github.com/infiniflow/ragflow');
+    return 'Body:' + response.data;
+  } catch (error) {
+    return 'Error:' + error.message;
+  }
+}
+```
+
+---
+
+## 📋 FAQ
+
+### ❓Sandbox Not Working?
+
+Follow this checklist to troubleshoot:
+
+- [ ] **Is your machine compatible with gVisor?**
+
+  Ensure that your system supports gVisor. Refer to the [gVisor installation guide](https://gvisor.dev/docs/user_guide/install/).
+
+- [ ] **Is gVisor properly installed?**
+
+  **Common error:**
+
+  `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`
+
+  Cause: `runsc` is an unknown or invalid Docker runtime.
+
+  **Fix:**
+
+  - Install gVisor
+
+  - Restart Docker
+
+  - Test with:
+
+    ```bash
+    docker run --rm --runtime=runsc hello-world
+    ```
+
+- [ ] **Is `sandbox-executor-manager` mapped in `/etc/hosts`?**
+
+  **Common error:**
+
+  `HTTPConnectionPool(host='none', port=9385): Max retries exceeded.`
+
+  **Fix:**
+
+  Add the following entry to `/etc/hosts`:
+
+  ```text
+  127.0.0.1 es01 infinity mysql minio redis sandbox-executor-manager
+  ```
+
+- [ ] **Are you running the latest executor manager image?**
+
+  **Common error:**
+
+  `docker: Error response from daemon: client version 1.43 is too old. Minimum supported API version is 1.44`
+
+  **Fix:**
+
+  Pull the refreshed image that bundles Docker CLI `29.1.0`, or rebuild it in `./sandbox/executor_manager`:
+
+  ```bash
+  docker pull infiniflow/sandbox-executor-manager:latest
+  # or
+  docker build -t sandbox-executor-manager:latest ./sandbox/executor_manager
+  ```
+
+- [ ] **Have you enabled sandbox-related configurations in RAGFlow?**
+
+  Double-check that all sandbox settings are correctly enabled in your RAGFlow configuration.
+
+- [ ] **Have you pulled the required base images for the runners?**
+
+  **Common error:**
+
+  `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`
+
+  Cause: no runner was started.
+
+  **Fix:**
+
+  Pull the necessary base images:
+
+  ```bash
+  docker pull infiniflow/sandbox-base-nodejs:latest
+  docker pull infiniflow/sandbox-base-python:latest
+  ```
+
+- [ ] **Did you restart the service after making changes?**
+
+  Any changes to configuration or environment require a full service restart to take effect.
+
+
+### ❓Container pool is busy?
+
+All available runners are currently in use, executing tasks/running code. Please try again shortly, or consider increasing the pool size in the configuration to improve availability and reduce wait times.
+
+## 🤝 Contribution
+
+Contributions are welcome!
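
Note on the "Python Code AST Inspection" section of this README: the sandbox's actual analyzer is not shown in this diff. The following is a minimal sketch of the general technique using the standard `ast` module. The deny-lists and the function name `find_violations` are hypothetical and do not reflect the sandbox's real policy.

```python
import ast

# Hypothetical deny-lists for illustration; not the sandbox's real policy.
BLOCKED_MODULES = {"os", "subprocess", "shutil"}
BLOCKED_CALLS = {"open", "eval", "exec", "__import__"}


def find_violations(source: str) -> list[str]:
    """Return a description of each blocked import or call found in source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Collect the module names referenced by import statements.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import of {name}")
        # Flag direct calls to blocked built-ins such as open().
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to {node.func.id}()")
    return violations


if __name__ == "__main__":
    print(find_violations("import subprocess\nsubprocess.run(['ls'])"))
    print(find_violations("def main(arg1, arg2):\n    return arg1 + arg2"))
```

A name-based check like this is easy to bypass (for example through attribute lookups or aliasing), which is consistent with the README presenting AST inspection as an extra layer on top of gVisor rather than a replacement for it.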
--- a/agent/sandbox/asserts/code_executor_manager.svg
+++ b/agent/sandbox/asserts/code_executor_manager.svg
--- a/agent/sandbox/client.py
+++ b/agent/sandbox/client.py
@@ -0,0 +1,259 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+"""
+Sandbox client for agent components.
+
+This module provides a unified interface for agent components to interact
+with the configured sandbox provider.
+"""
+
+import json
+import logging
+from typing import Dict, Any, Optional
+
+from api.db.services.system_settings_service import SystemSettingsService
+from agent.sandbox.providers import ProviderManager
+from agent.sandbox.providers.base import ExecutionResult, SandboxProviderConfigError
+
+logger = logging.getLogger(__name__)
+
+
+# Global provider manager instance
+_provider_manager: Optional[ProviderManager] = None
+
+
+def get_provider_manager() -> ProviderManager:
+    """
+    Get the global provider manager instance.
+
+    Returns:
+        ProviderManager instance with active provider loaded
+    """
+    global _provider_manager
+
+    if _provider_manager is not None:
+        return _provider_manager
+
+    _provider_manager = ProviderManager()
+    _load_provider_from_settings()
+
+    return _provider_manager
+
+
+def _load_provider_from_settings() -> None:
+    """
+    Load sandbox provider from system settings and configure the provider manager.
+
+    This function resolves the active provider type, then loads configuration
+    from system settings.
+    """
+    global _provider_manager
+
+    if _provider_manager is None:
+        return
+
+    try:
+        provider_type = _resolve_provider_type()
+        config = _load_provider_config(provider_type)
+
+        # Import and instantiate the provider
+        from agent.sandbox.providers import (
+            SelfManagedProvider,
+            AliyunCodeInterpreterProvider,
+            E2BProvider,
+            LocalProvider,
+            SSHProvider,
+        )
+
+        provider_classes = {
+            "self_managed": SelfManagedProvider,
+            "aliyun_codeinterpreter": AliyunCodeInterpreterProvider,
+            "e2b": E2BProvider,
+            "local": LocalProvider,
+            "ssh": SSHProvider,
+        }
+
+        if provider_type not in provider_classes:
+            logger.error(f"Unknown provider type: {provider_type}")
+            return
+
+        provider_class = provider_classes[provider_type]
+        provider = provider_class()
+
+        # Initialize the provider
+        if not provider.initialize(config):
+            message = f"Failed to initialize sandbox provider: {provider_type}. Config keys: {list(config.keys())}"
+            if provider_type in {"local", "ssh"}:
+                raise SandboxProviderConfigError(message)
+            logger.error(message)
+            return
+
+        # Set the active provider
+        _provider_manager.set_provider(provider_type, provider)
+        logger.info(f"Sandbox provider '{provider_type}' initialized successfully")
+
+    except SandboxProviderConfigError:
+        raise
+    except Exception as e:
+        # logger.exception records the traceback through the logging system
+        # instead of printing it straight to stderr.
+        logger.exception(f"Failed to load sandbox provider from settings: {e}")
+
+
+def _load_provider_config_from_settings(provider_type: str) -> Dict[str, Any]:
+    provider_config_settings = SystemSettingsService.get_by_name(f"sandbox.{provider_type}")
+    if not provider_config_settings:
+        logger.warning(f"No configuration found for provider: {provider_type}")
+        return {}
+
+    try:
+        return json.loads(provider_config_settings[0].value)
+    except json.JSONDecodeError as e:
+        logger.error(f"Failed to parse sandbox config for {provider_type}: {e}")
+        return {}
+
+
+def _resolve_provider_type() -> str:
+    provider_type_settings = SystemSettingsService.get_by_name("sandbox.provider_type")
+    if not provider_type_settings:
+        return "self_managed"
+    return provider_type_settings[0].value
+
+
+def _load_provider_config(provider_type: str) -> Dict[str, Any]:
+    return _load_provider_config_from_settings(provider_type)
+
+
+def reload_provider() -> None:
+    """
+    Reload the sandbox provider from system settings.
+
+    Use this function when sandbox settings have been updated.
+    """
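+    global _provider_manager
+    # Start from a fresh manager: _load_provider_from_settings() returns
+    # immediately when _provider_manager is None, so resetting to None here
+    # would turn the call below into a no-op.
+    _provider_manager = ProviderManager()
+    _load_provider_from_settings()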
+
+
+def execute_code(
+    code: str,
+    language: str = "python",
+    timeout: int = 30,
+    arguments: Optional[Dict[str, Any]] = None
+) -> ExecutionResult:
+    """
+    Execute code in the configured sandbox.
+
+    This is the main entry point for agent components to execute code.
+
+    Args:
+        code: Source code to execute
+        language: Programming language (python, nodejs, javascript)
+        timeout: Maximum execution time in seconds
+        arguments: Optional arguments dict to pass to main() function
+
+    Returns:
+        ExecutionResult containing stdout, stderr, exit_code, and metadata
+
+    Raises:
+        RuntimeError: If no provider is configured or execution fails
+    """
+    provider_manager = get_provider_manager()
+
+    if not provider_manager.is_configured():
+        raise RuntimeError(
+            "No sandbox provider configured. Please configure sandbox settings in the admin panel."
+        )
+
+    provider = provider_manager.get_provider()
+    provider_name = provider_manager.get_provider_name() or type(provider).__name__
+
+    logger.info(
+        "CodeExec using sandbox provider '%s' (language=%s, timeout=%ss)",
+        provider_name,
+        language,
+        timeout,
+    )
+
+    # Create a sandbox instance
+    instance = provider.create_instance(template=language)
+
+    try:
+        # Execute the code
+        result = provider.execute_code(
+            instance_id=instance.instance_id,
+            code=code,
+            language=language,
+            timeout=timeout,
+            arguments=arguments
+        )
+
+        return result
+
+    finally:
+        # Clean up the instance
+        try:
+            provider.destroy_instance(instance.instance_id)
+        except Exception as e:
+            logger.warning(f"Failed to destroy sandbox instance {instance.instance_id}: {e}")
+
+
+def health_check() -> bool:
+    """
+    Check if the sandbox provider is healthy.
+
+    Returns:
+        True if provider is configured and healthy, False otherwise
+    """
+    try:
+        provider_manager = get_provider_manager()
+
+        if not provider_manager.is_configured():
+            return False
+
+        provider = provider_manager.get_provider()
+        return provider.health_check()
+
+    except Exception as e:
+        logger.error(f"Sandbox health check failed: {e}")
+        return False
+
+
+def get_provider_info() -> Dict[str, Any]:
+    """
+    Get information about the current sandbox provider.
+
+    Returns:
+        Dictionary with provider information:
+        - provider_type: Type of the active provider
+        - configured: Whether provider is configured
+        - healthy: Whether provider is healthy
+    """
+    try:
+        provider_manager = get_provider_manager()
+
+        return {
+            "provider_type": provider_manager.get_provider_name(),
+            "configured": provider_manager.is_configured(),
+            "healthy": health_check(),
+        }
+
+    except Exception as e:
+        logger.error(f"Failed to get provider info: {e}")
+        return {
+            "provider_type": None,
+            "configured": False,
+            "healthy": False,
+        }
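
Review note on `agent/sandbox/client.py`: the module keeps one lazily built manager, resolves the provider type with a fallback to `"self_managed"`, and treats missing or malformed JSON config as an empty dict. The sketch below illustrates that lifecycle in isolation. `SETTINGS`, `FakeProvider`, `FakeManager`, `get_manager`, and `reload_manager` are hypothetical stand-ins for `SystemSettingsService`, the real providers, and `ProviderManager`; they are not part of the PR.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical in-memory stand-in for SystemSettingsService rows.
SETTINGS: Dict[str, str] = {
    "sandbox.provider_type": "local",
    "sandbox.local": json.dumps({"workdir": "/tmp/sandbox"}),
}


class FakeProvider:
    """Hypothetical provider: accepts any non-empty config."""

    def initialize(self, config: Dict[str, Any]) -> bool:
        self.config = config
        return bool(config)


class FakeManager:
    """Hypothetical stand-in for ProviderManager."""

    def __init__(self) -> None:
        self.name: Optional[str] = None
        self.provider: Optional[FakeProvider] = None

    def set_provider(self, name: str, provider: FakeProvider) -> None:
        self.name, self.provider = name, provider

    def is_configured(self) -> bool:
        return self.provider is not None


_manager: Optional[FakeManager] = None


def resolve_provider_type() -> str:
    # Mirrors the fallback to "self_managed" when no type is stored.
    return SETTINGS.get("sandbox.provider_type", "self_managed")


def load_config(provider_type: str) -> Dict[str, Any]:
    # Missing or malformed config degrades to an empty dict.
    raw = SETTINGS.get(f"sandbox.{provider_type}")
    if raw is None:
        return {}
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {}


def get_manager() -> FakeManager:
    """Build the manager on first use, then reuse it."""
    global _manager
    if _manager is None:
        _manager = FakeManager()
        provider_type = resolve_provider_type()
        provider = FakeProvider()
        if provider.initialize(load_config(provider_type)):
            _manager.set_provider(provider_type, provider)
    return _manager


def reload_manager() -> FakeManager:
    """Drop the cached manager so the next lookup re-reads settings."""
    global _manager
    _manager = None
    return get_manager()
```

In this sketch a provider that fails to initialize leaves the manager unconfigured, which is the state `execute_code()` reports as "No sandbox provider configured".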
--- a/agent/sandbox/docker-compose.yml
+++ b/agent/sandbox/docker-compose.yml
@@ -0,0 +1,32 @@
+services:
+  sandbox-executor-manager:
+    build:
+      context: ./executor_manager
+      dockerfile: Dockerfile
+    image: sandbox-executor-manager:latest
+    runtime: runc
+    privileged: true
+    ports:
+      - "${SANDBOX_EXECUTOR_MANAGER_PORT:-9385}:9385"
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+    networks:
+      - sandbox-network
+    restart: always
+    security_opt:
+      - no-new-privileges:true
+    environment:
+      - SANDBOX_EXECUTOR_MANAGER_POOL_SIZE=${SANDBOX_EXECUTOR_MANAGER_POOL_SIZE:-5}
+      - SANDBOX_BASE_PYTHON_IMAGE=${SANDBOX_BASE_PYTHON_IMAGE-sandbox-base-python:latest}
+      - SANDBOX_BASE_NODEJS_IMAGE=${SANDBOX_BASE_NODEJS_IMAGE-sandbox-base-nodejs:latest}
+      - SANDBOX_ENABLE_SECCOMP=${SANDBOX_ENABLE_SECCOMP:-false}
+      - SANDBOX_MAX_MEMORY=${SANDBOX_MAX_MEMORY:-256m} # b, k, m, g
+      - SANDBOX_TIMEOUT=${SANDBOX_TIMEOUT:-10s} # s, m, 1m30s
+    healthcheck:
+      test: ["CMD-SHELL", "curl --fail http://localhost:9385/healthz || exit 1"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+networks:
+  sandbox-network:
+    driver: bridge
--- a/agent/sandbox/executor_manager/Dockerfile
+++ b/agent/sandbox/executor_manager/Dockerfile
@@ -0,0 +1,41 @@
+FROM python:3.11-slim-bookworm
+
+ARG NEED_MIRROR=1
+
+RUN if [ "$NEED_MIRROR" = 1 ]; then \
+        grep -rl 'deb.debian.org' /etc/apt/ | xargs sed -i 's|http[s]*://deb.debian.org|https://mirrors.tuna.tsinghua.edu.cn|g'; \
+    fi; \
+    apt-get update && \
+    apt-get install -y curl gcc && \
+    rm -rf /var/lib/apt/lists/*
+
+ARG TARGETARCH
+ARG TARGETVARIANT
+
+RUN set -eux; \
+    case "${TARGETARCH}${TARGETVARIANT}" in \
+        amd64) DOCKER_ARCH=x86_64 ;; \
+        arm64) DOCKER_ARCH=aarch64 ;; \
+        armv7) DOCKER_ARCH=armhf ;; \
+        armv6) DOCKER_ARCH=armel ;; \
+        arm64v8) DOCKER_ARCH=aarch64 ;; \
+        arm64v7) DOCKER_ARCH=armhf ;; \
+        arm*) DOCKER_ARCH=armhf ;; \
+        ppc64le) DOCKER_ARCH=ppc64le ;; \
+        s390x) DOCKER_ARCH=s390x ;; \
+        *) echo "Unsupported architecture: ${TARGETARCH}${TARGETVARIANT}" && exit 1 ;; \
+    esac; \
+    echo "Downloading Docker for architecture: ${DOCKER_ARCH}"; \
+    curl -fsSL "https://download.docker.com/linux/static/stable/${DOCKER_ARCH}/docker-29.1.0.tgz" | \
+    tar xz -C /usr/local/bin --strip-components=1 docker/docker; \
+    ln -sf /usr/local/bin/docker /usr/bin/docker
+
+COPY --from=ghcr.io/astral-sh/uv:0.7.5 /uv /uvx /bin/
+
+WORKDIR /app
+COPY . .
+
+RUN if [ "$NEED_MIRROR" = 1 ]; then export UV_INDEX_URL="https://pypi.tuna.tsinghua.edu.cn/simple"; else export UV_INDEX_URL="https://pypi.org/simple"; fi && \
+    uv pip install --system -r requirements.txt
+
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "9385"]
--- a/agent/sandbox/executor_manager/api/__init__.py
+++ b/agent/sandbox/executor_manager/api/__init__.py
--- a/agent/sandbox/executor_manager/api/handlers.py
+++ b/agent/sandbox/executor_manager/api/handlers.py
--- a/agent/sandbox/executor_manager/api/routes.py
+++ b/agent/sandbox/executor_manager/api/routes.py
@@ -0,0 +1,25 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from fastapi import APIRouter
+
+from api.handlers import healthz_handler, run_code_handler
+
+router = APIRouter()
+
+router.get("/")(healthz_handler)
+router.get("/healthz")(healthz_handler)
+router.post("/run")(run_code_handler)
+
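
Review note on `routes.py`: the routes are registered by calling the object returned from `router.get(...)` directly, rather than with `@` decorator syntax. That lets one handler be bound to several paths. The toy router below is a hypothetical illustration of why the two forms are equivalent; `MiniRouter` and `ping_handler` are invented for this note and do not reflect FastAPI's internals.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[], dict]


class MiniRouter:
    """Hypothetical toy router; not FastAPI's implementation."""

    def __init__(self) -> None:
        self.routes: Dict[Tuple[str, str], Handler] = {}

    def get(self, path: str) -> Callable[[Handler], Handler]:
        def register(handler: Handler) -> Handler:
            self.routes[("GET", path)] = handler
            return handler  # returned unchanged so it can be registered again

        return register


router = MiniRouter()


def healthz_handler() -> dict:
    return {"status": "ok"}


# Call form, as in routes.py: one handler bound to two paths.
router.get("/")(healthz_handler)
router.get("/healthz")(healthz_handler)


# Decorator form: equivalent registration for a single path.
@router.get("/ping")
def ping_handler() -> dict:
    return {"pong": True}
```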
--- a/agent/sandbox/executor_manager/core/__init__.py
+++ b/agent/sandbox/executor_manager/core/__init__.py
--- a/agent/sandbox/executor_manager/core/config.py
+++ b/agent/sandbox/executor_manager/core/config.py
--- a/agent/sandbox/executor_manager/core/container.py
+++ b/agent/sandbox/executor_manager/core/container.py
@@ -0,0 +1,191 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import asyncio
+import contextlib
+import os
+from queue import Empty, Queue
+
+from models.enums import SupportLanguage
+from util import env_setting_enabled, is_valid_memory_limit
+from utils.common import async_run_command
+
+from core.logger import logger
+
+_CONTAINER_QUEUES: dict[SupportLanguage, Queue] = {}
+_CONTAINER_LOCK: asyncio.Lock = asyncio.Lock()
+_CONTAINER_EXECUTION_SEMAPHORES: dict[SupportLanguage, asyncio.Semaphore] = {}
+
+
+async def init_containers(size: int) -> tuple[int, int]:
+    global _CONTAINER_QUEUES
+    _CONTAINER_QUEUES = {SupportLanguage.PYTHON: Queue(), SupportLanguage.NODEJS: Queue()}
+
+    async with _CONTAINER_LOCK:
+        while not _CONTAINER_QUEUES[SupportLanguage.PYTHON].empty():
+            _CONTAINER_QUEUES[SupportLanguage.PYTHON].get_nowait()
+        while not _CONTAINER_QUEUES[SupportLanguage.NODEJS].empty():
+            _CONTAINER_QUEUES[SupportLanguage.NODEJS].get_nowait()
+
+    for language in SupportLanguage:
+        _CONTAINER_EXECUTION_SEMAPHORES[language] = asyncio.Semaphore(size)
+
+    create_tasks = []
+    for i in range(size):
+        name = f"sandbox_python_{i}"
+        logger.info(f"🛠️ Creating Python container {i + 1}/{size}")
+        create_tasks.append(_prepare_container(name, SupportLanguage.PYTHON))
+
+        name = f"sandbox_nodejs_{i}"
+        logger.info(f"🛠️ Creating Node.js container {i + 1}/{size}")
+        create_tasks.append(_prepare_container(name, SupportLanguage.NODEJS))
+
+    results = await asyncio.gather(*create_tasks, return_exceptions=True)
+    success_count = sum(1 for r in results if r is True)
+    total_task_count = len(create_tasks)
+    return success_count, total_task_count
+
+
+async def teardown_containers():
+    async with _CONTAINER_LOCK:
+        while not _CONTAINER_QUEUES[SupportLanguage.PYTHON].empty():
+            name = _CONTAINER_QUEUES[SupportLanguage.PYTHON].get_nowait()
+            await async_run_command("docker", "rm", "-f", name, timeout=5)
+        while not _CONTAINER_QUEUES[SupportLanguage.NODEJS].empty():
+            name = _CONTAINER_QUEUES[SupportLanguage.NODEJS].get_nowait()
+            await async_run_command("docker", "rm", "-f", name, timeout=5)
+
+
+async def _prepare_container(name: str, language: SupportLanguage) -> bool:
+    """Prepare a single container"""
+    with contextlib.suppress(Exception):
+        await async_run_command("docker", "rm", "-f", name, timeout=5)
+
+    if await create_container(name, language):
+        _CONTAINER_QUEUES[language].put(name)
+        return True
+    return False
+
+
+async def create_container(name: str, language: SupportLanguage) -> bool:
+    """Asynchronously create a container"""
+    create_args = [
+        "docker",
+        "run",
+        "-d",
+        "--runtime=runsc",
+        "--name",
+        name,
+        "--read-only",
+        "--tmpfs",
+        "/workspace:rw,exec,size=100M,uid=65534,gid=65534",
+        "--tmpfs",
+        "/tmp:rw,exec,size=50M",
+        "--user",
+        "nobody",
+        "--workdir",
+        "/workspace",
+    ]
+    # Resolve the memory limit once, falling back to 256m when unset or invalid.
+    memory_limit = os.getenv("SANDBOX_MAX_MEMORY")
+    if not memory_limit:
+        logger.info("Set default SANDBOX_MAX_MEMORY: 256m")
+        memory_limit = "256m"
+    elif is_valid_memory_limit(memory_limit):
+        logger.info(f"SANDBOX_MAX_MEMORY: {memory_limit}")
+    else:
+        logger.info("Invalid SANDBOX_MAX_MEMORY, using default value: 256m")
+        memory_limit = "256m"
+    create_args.extend(["--memory", memory_limit])
+
+    if env_setting_enabled("SANDBOX_ENABLE_SECCOMP", "false"):
+        logger.info(f"SANDBOX_ENABLE_SECCOMP: {os.getenv('SANDBOX_ENABLE_SECCOMP')}")
+        create_args.extend(["--security-opt", "seccomp=/app/seccomp-profile-default.json"])
+
+    if language == SupportLanguage.PYTHON:
+        create_args.append(os.getenv("SANDBOX_BASE_PYTHON_IMAGE", "sandbox-base-python:latest"))
+    elif language == SupportLanguage.NODEJS:
+        create_args.append(os.getenv("SANDBOX_BASE_NODEJS_IMAGE", "sandbox-base-nodejs:latest"))
+
+    logger.info(f"Sandbox config:\n\t {create_args}")
+
+    try:
+        return_code, _, stderr = await async_run_command(*create_args, timeout=10)
+        if return_code != 0:
+            logger.error(f"❌ Container creation failed {name}: {stderr}")
+            return False
+
+        if language == SupportLanguage.NODEJS:
+            copy_cmd = ["docker", "exec", name, "bash", "-c", "cp -a /app/node_modules /workspace/"]
+            return_code, _, stderr = await async_run_command(*copy_cmd, timeout=10)
+            if return_code != 0:
+                logger.error(f"❌ Failed to prepare dependencies for {name}: {stderr}")
+                return False
+
+        return await container_is_running(name)
+    except Exception as e:
+        logger.error(f"❌ Container creation exception {name}: {str(e)}")
+        return False
+
+
+async def recreate_container(name: str, language: SupportLanguage) -> bool:
+    """Asynchronously recreate a container"""
+    logger.info(f"🛠️ Recreating container: {name}")
+    try:
+        await async_run_command("docker", "rm", "-f", name, timeout=5)
+
+        return await create_container(name, language)
+    except Exception as e:
+        logger.error(f"❌ Container {name} recreation failed: {str(e)}")
+        return False
+
+
+async def release_container(name: str, language: SupportLanguage):
+    """Asynchronously release a container"""
+    async with _CONTAINER_LOCK:
+        if await container_is_running(name):
+            _CONTAINER_QUEUES[language].put(name)
+            logger.info(f"🟢 Released container: {name} (remaining available: {_CONTAINER_QUEUES[language].qsize()})")
+        else:
+            logger.warning(f"⚠️ Container {name} has crashed, attempting to recreate...")
+            if await recreate_container(name, language):
+                _CONTAINER_QUEUES[language].put(name)
+                logger.info(f"✅ Container {name} successfully recreated and returned to queue")
+
+
+async def allocate_container_blocking(language: SupportLanguage, timeout=10) -> str:
+    """Asynchronously allocate an available container"""
+    start_time = asyncio.get_running_loop().time()
+    while asyncio.get_running_loop().time() - start_time < timeout:
+        try:
+            name = _CONTAINER_QUEUES[language].get_nowait()
+            async with _CONTAINER_LOCK:
+                if not await container_is_running(name) and not await recreate_container(name, language):
+                    continue
+
+                return name
+        except Empty:
+            await asyncio.sleep(0.1)
+
+    return ""
+
+
+async def container_is_running(name: str) -> bool:
+    """Asynchronously check the container status"""
+    try:
+        return_code, stdout, _ = await async_run_command("docker", "inspect", "-f", "{{.State.Running}}", name, timeout=2)
+        return return_code == 0 and stdout.strip() == "true"
+    except Exception:
+        return False
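
Review note on `core/container.py`: `allocate_container_blocking()` polls a thread-safe `Queue` with `get_nowait()` and sleeps briefly between attempts, so the event loop stays free for `release_container()` to return a name. The sketch below shows only that polling pattern. The health check and recreate step are deliberately left out, and `allocate`, `release`, `demo`, and the two-entry `pool` are hypothetical names for illustration.

```python
import asyncio
from queue import Empty, Queue

pool: Queue = Queue()
for i in range(2):
    pool.put(f"sandbox_python_{i}")  # names follow the pattern used in the diff


async def allocate(timeout: float = 1.0) -> str:
    """Poll the pool until a name is available or the timeout elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        try:
            return pool.get_nowait()
        except Empty:
            await asyncio.sleep(0.05)  # yield so a release can run
    return ""  # empty string signals "nothing available", as in the diff


async def release(name: str) -> None:
    pool.put(name)


async def demo() -> list:
    first = await allocate()
    second = await allocate()
    # Pool is now empty: this call times out and returns "".
    third = await allocate(timeout=0.2)
    # Return one container shortly after a waiter starts polling.
    waiter = asyncio.create_task(allocate(timeout=1.0))
    await asyncio.sleep(0.1)
    await release(first)
    fourth = await waiter
    return [first, second, third, fourth]
```

Callers need to treat the empty-string return as a failure, since no exception is raised when the pool stays exhausted for the whole timeout.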
--- a/agent/sandbox/executor_manager/core/logger.py
+++ b/agent/sandbox/executor_manager/core/logger.py
--- a/agent/sandbox/executor_manager/main.py
+++ b/agent/sandbox/executor_manager/main.py
--- a/agent/sandbox/executor_manager/models/__init__.py
+++ b/agent/sandbox/executor_manager/models/__init__.py
--- a/agent/sandbox/executor_manager/models/enums.py
+++ b/agent/sandbox/executor_manager/models/enums.py
--- a/agent/sandbox/executor_manager/models/schemas.py
+++ b/agent/sandbox/executor_manager/models/schemas.py
@@ -0,0 +1,72 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import base64
+from typing import Any, Optional
+
+from pydantic import BaseModel, Field, field_validator
+
+from models.enums import ResourceLimitType, ResultStatus, RuntimeErrorType, SupportLanguage, UnauthorizedAccessType
+
+
+class ArtifactItem(BaseModel):
+    name: str
+    mime_type: str
+    size: int
+    content_b64: str
+
+
+class ExecutionStructuredResult(BaseModel):
+    present: bool
+    value: Any = None
+    type: str = "json"
+
+
+class CodeExecutionResult(BaseModel):
+    status: ResultStatus
+    stdout: str
+    stderr: str
+    exit_code: int
+    detail: Optional[str] = None
+
+    # Resource usage
+    time_used_ms: Optional[float] = None
+    memory_used_kb: Optional[float] = None
+
+    # Error details
+    resource_limit_type: Optional[ResourceLimitType] = None
+    unauthorized_access_type: Optional[UnauthorizedAccessType] = None
+    runtime_error_type: Optional[RuntimeErrorType] = None
+
+    # File artifacts produced by code execution (images, PDFs, CSVs, etc.)
+    artifacts: list[ArtifactItem] = []
+
+    # Structured return value produced by main()
+    result: Optional[ExecutionStructuredResult] = None
+
+
+class CodeExecutionRequest(BaseModel):
+    code_b64: str = Field(..., description="Base64 encoded code string")
+    language: SupportLanguage = Field(default=SupportLanguage.PYTHON, description="Programming language")
+    arguments: Optional[dict] = Field(default={}, description="Arguments")
+
+    @field_validator("code_b64")
+    @classmethod
+    def validate_base64(cls, v: str) -> str:
+        try:
+            base64.b64decode(v, validate=True)
+            return v
+        except Exception as e:
+            # Chain the original error so the decoding failure stays in the traceback.
+            raise ValueError(f"Invalid base64 encoding: {str(e)}") from e
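
Review note on `models/schemas.py`: `CodeExecutionRequest` expects the source in `code_b64` and rejects anything that fails a strict `base64.b64decode(..., validate=True)`. The standard-library sketch below shows how a caller might build such a body and what the strict check accepts. `build_run_payload` and `is_valid_base64` are hypothetical helpers, and the `"python"` string for `language` is an assumption taken from the `execute_code()` docstring, since `models/enums.py` is not shown in this diff.

```python
import base64
import binascii
import json


def build_run_payload(code: str, language: str = "python", arguments=None) -> str:
    """Serialize a body using the field names of CodeExecutionRequest."""
    return json.dumps({
        "code_b64": base64.b64encode(code.encode("utf-8")).decode("ascii"),
        "language": language,  # assumed enum value, see the note above
        "arguments": arguments or {},
    })


def is_valid_base64(value: str) -> bool:
    """Strict check: characters outside the base64 alphabet are rejected."""
    try:
        base64.b64decode(value, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False
```

Without `validate=True`, `b64decode` silently discards characters outside the alphabet, so the strict flag is what makes the validator reject malformed input.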
--- a/agent/sandbox/executor_manager/requirements.txt
+++ b/agent/sandbox/executor_manager/requirements.txt
--- a/agent/sandbox/executor_manager/seccomp-profile-default.json
+++ b/agent/sandbox/executor_manager/seccomp-profile-default.json
--- a/agent/sandbox/executor_manager/services/__init__.py
+++ b/agent/sandbox/executor_manager/services/__init__.py