Files
ragflow/CLAUDE.md

307 lines
15 KiB
Markdown
Raw Permalink Normal View History

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It's a full-stack application with:
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
- Python backend (Quart-based async API server — Quart is the async reimplementation of Flask)
- React/TypeScript frontend (built with vitejs)
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
- Background task executor workers (separate Python processes, Redis-queue-driven)
- Peewee ORM for database models (not SQLAlchemy)
- Multiple data stores (MySQL/PostgreSQL, Elasticsearch/Infinity/OpenSearch/OceanBase, Redis, MinIO)
## Architecture
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
### Runtime Architecture
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
RAGFlow runs as **two separate Python process types**, orchestrated by `docker/launch_backend_service.sh`:
- **API Server** (`api/ragflow_server.py`): Quart-based async HTTP server
- **Task Executors** (`rag/svr/task_executor.py`): Background workers processing documents from Redis streams. Multiple instances run in parallel (controlled by `WS` env var). Each consumes from priority-ordered Redis streams (`te.1.common`, `te.0.common`), using consumer groups for load distribution.
Key consequence: task executors import a different code surface than the API server, so always check which process a module is meant for.
### Backend API (`/api/`)
- **App factory**: `api/apps/__init__.py` — creates the Quart app, configures auth (`login_required` decorator, JWT + API token + session fallback), and dynamically discovers/registers blueprints
- **Two API coexisting patterns**:
- **RESTful APIs** in `api/apps/restful_apis/` — newer pattern with Pydantic request validation, service layer in `api/apps/services/`, routes registered under `/api/v1`
- **Legacy APIs** in `api/apps/*_app.py` — older pattern using `@validate_request()`, routes registered under `/v1/<page_name>`
- **SDK APIs** in `api/apps/sdk/` — registered under `/v1/`
- **Services**: `api/db/services/` — business logic wrapping Peewee model operations. `api/apps/services/` — service layer for the RESTful APIs
- **Models**: `api/db/db_models.py` — Peewee ORM models with pooled MySQL/PostgreSQL connections, custom `JSONField`/`ListField` types, retry logic on connection loss
### Core Processing (`/rag/`)
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
- **Document ingestion pipeline**: `rag/flow/pipeline.py``Pipeline` (extends `agent.canvas.Graph`) orchestrates the ingestion DAG. Components: File (fetches binary from storage), Parser (dispatches to `deepdoc.parser` based on file type), TokenChunker/TitleChunker (splits into chunks), Tokenizer (computes full-text tokens + embedding vectors), Extractor (LLM-based extraction). Data flows via Pydantic `*FromUpstream` schemas.
- **Document parsing**: `deepdoc/` — PDF parsing (vision-based OCR, layout analysis, table structure recognition) and format-specific parsers (DOCX, XLSX, PPT, Markdown, HTML, images). All parsers normalize to a common structure (list of bbox dicts for PDFs, `{text, doc_type_kwd}` for others).
- **DeepDoc HTTP API service** (`deepdoc/server/`): OSS ONNX models (DLA, OCR, TSR) wrapped with LitServe as a standalone HTTP API on port 8124. The Go parser (`internal/parser/`) calls this service via `DeepDocClient`. Endpoints: `GET /health`, `GET /model`, `POST /predict/dla`, `POST /predict/tsr`, `POST /predict/ocr` (with `operator=det` or `operator=rec` form field). Docker image: `deepdoc_oss:latest`. See `deepdoc/server/README.md` for the full API reference.
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
- **LLM Integration**: `rag/llm/` — factory pattern with runtime class discovery. `chat_model.py` (30+ providers via OpenAI SDK and LiteLLM wrappers), `embedding_model.py`, `rerank_model.py`, `cv_model.py` (image-to-text), `sequence2txt_model.py` (ASR), `tts_model.py`. Use `LLMBundle` (from `api.db.services.llm_service`) as the unified interface.
- **Graph RAG**: `rag/graphrag/` — multi-phase pipeline: per-document subgraph extraction (LLM or spaCy NER), Leiden community detection, entity resolution, community summarization. Entities/relations/reports are indexed as chunks alongside regular text chunks, differentiated by `knowledge_graph_kwd`.
- **Search**: `rag/nlp/search.py``Dealer` class combines vector similarity + BM25 + re-ranking. `KGSearch` extends it for graph-aware retrieval (entity resolution, n-hop enrichment).
### Agent System (`/agent/`)
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
- **Execution engine**: `agent/canvas.py``Canvas` (extends `Graph`) executes the DAG. Components are run in topological order via `_run_batch`, each receiving upstream outputs as kwargs. Control-flow components (`Categorize`, `Switch`, `Iteration`, `Loop`) dynamically modify the execution path.
- **Component base**: `agent/component/base.py``ComponentBase` with `invoke(**kwargs)` / `invoke_async(**kwargs)` lifecycle. Variable references (`{component_id@output_var}` or `{sys.query}`) are resolved from the canvas graph at runtime.
- **Components**: Modular workflow components in `agent/component/` — Begin, LLM, Agent (tool-calling LLM), Categorize, Switch, Iteration, Loop, Message, Invoke (HTTP), and data manipulation nodes. Auto-discovered by `__init__.py`.
- **Templates**: Pre-built agent workflows as JSON DSL files in `agent/templates/`. Each contains a complete `components` DAG, `path`, and `globals`.
- **Tools**: `agent/tools/` — Retrieval, web search (DuckDuckGo, Google, Tavily, SearXNG), academic search (ArXiv, PubMed, Google Scholar, Wikipedia), code execution, SQL execution, email, GitHub, finance data, translation, weather. Tools implement `ToolBase` (extends `ComponentBase`) and produce OpenAI-compatible function descriptors.
- **Plugins**: `agent/plugin/` — plugin system using `pluginlib` for loading external LLM tool plugins from `embedded_plugins/`.
### Frontend (`/web/`)
- React/TypeScript with vitejs framework
refactor: overhaul task executor with layered architecture and comprehensive test suite (#15471) ## Summary Decomposes the monolithic `task_executor.py` (1945 lines) into a 6-layer architecture with clear separation of concerns. The refactored code is functionally equivalent to the original, verified through 400 passing tests and a production-vs-dry-run comparison framework. ## Architecture ``` entry (task_manager) └─ orchestration (task_handler) ├─ services (chunk_service, embedding_service, dataflow_service, raptor_service, post_processor) │ └─ utilities (chunk_builder, chunk_post_processor, embedding_utils) └─ infrastructure (task_context, recording_context, interceptor) ``` Key design decisions: - **TaskContext** — typed facade over raw task dict, injects rate limiters + callbacks via composition - **RecordingContext + Comparator** — enables side-by-side production vs dry-run execution for safe migration - **NullRecordingContext** — zero-allocation no-op for production, uses `__slots__` - **WriteOperationInterceptor** — FIFO replay of previous runs function returns for comparison mode ## Migration Strategy The original `handle_task()` in `task_executor.py` uses a 3-way switch via `TE_RUN_MODE`: - `TE_RUN_MODE=0` (default) → runs refactored code - `TE_RUN_MODE=1` → runs both original + refactored, compares all intermediate results - `TE_RUN_MODE=2` → runs original code (fallback) The comparison mode (`TE_RUN_MODE=1`) records ~40 intermediate values (chunks, vectors, token counts, func return values) from the production run and replays them during dry-run, then uses `ContextComparator` to report mismatches. ## Functional Equivalence Fixes All divergences between original and refactored code were identified and fixed: - Timeout decorators (handle/build_chunks/raptor/embedding) - NullRecordingContext leak in finally block causing RuntimeError - MinIO None-binary check with proper FileNotFoundError - Dataflow dispatch after embedding binding + init_kb - Memory task missing return after processing - RAPTOR checkpoint progress reporting - Tag cache (get_tags_from_cache/set_tags_to_cache) restoration - dataflow_id correction in _load_dsl - Language default Chinese, dead code guard removal - embed_chunks made async with proper thread_pool_exec - Full GraphRAG default configuration (10 parameters) - Hardcoded q_768_vec fallback removal in RAPTOR ## Test Changes - 20 new tests covering table parser manual mode, tag cache, embedding edge cases, RAPTOR checkpoint, dataflow_id correction, storage binary None, cancel cleanup, metadata=None boundary - Unified `make_task_context`/`make_task_dict` factories eliminated 10+ duplicated helpers - DataflowService tests migrated from internal method mocks to IO boundary mocks (real orchestration code executes) - Parametrized duplicate build_chunks post-processor tests - 7 raptor tests modernized to @pytest.mark.asyncio - Mock count per test reduced through boundary-level mocking strategy **Test count: 400 passing, 0 warnings, 0 skips** ## Files Changed | File | Change | |------|--------| | `rag/svr/task_executor.py` | +1 line (NullRecordingContext fix) | | `rag/svr/task_executor_refactor/task_handler.py` | Orchestration layer, 8 logic fixes | | `rag/svr/task_executor_refactor/chunk_service.py` | +timeout + None-check | | `rag/svr/task_executor_refactor/embedding_service.py` | sync→async rewrite | | `rag/svr/task_executor_refactor/dataflow_service.py` | dataflow_id fix + timeout | | `rag/svr/task_executor_refactor/raptor_service.py` | checkpoint fix + assert | | `rag/svr/task_executor_refactor/chunk_post_processor.py` | tag cache restore | | `rag/svr/task_executor_refactor/task_context.py` | language default fix | | `test/.../conftest.py` | +294 lines shared helpers | | `test/.../*.py` | 15 test files refactored, 20 new tests | --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 17:18:31 +08:00
- shadcn/ui components (Radix UI primitives + Tailwind CSS)
- `@tanstack/react-query` for server state (cache keys, mutations, invalidation)
- Zustand for local state (primarily agent canvas graph store)
- `react-router` v7 with lazy-loaded pages
- `react-i18next` for i18n (17 languages)
- Axios for HTTP with a layered pattern: endpoint definitions (`utils/api.ts`) → HTTP client (`utils/next-request.ts`) → service layer (`services/`) → query hooks (`hooks/use-*-request.ts`) → components
- `@xyflow/react` for the agent workflow canvas
- `react-hook-form` + `zod` for form validation
- Two API proxy prefixes: `webAPI = '/v1'` (legacy) and `restAPIv1 = '/api/v1'` (RESTful)
## Common Development Commands
### Backend Development
```bash
# Install Python dependencies
feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767) Closes #14753 ## What changed | File | Change | |---|---| | `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` | | `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` | | `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix legs | | `CLAUDE.md` | dev setup command + requirements note updated | | `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` → `from enum import StrEnum` | | `agent/tools/code_exec.py` | same | `StrEnum` has been in the stdlib since Python 3.11 — the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - **"graspologic hard-blocks 3.13"** — the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - **"free-threading throughput gains for I/O-bound workload"** — Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.
2026-05-15 08:40:53 +02:00
uv sync --python 3.13 --all-extras
feat(agent): align Go agent behavior with Python (except retrieval component) (#16225) ## Summary Aligns the **Go agent runtime/canvas/components/tools** behavior with the **Python `agent/` implementation** so the same stored canvas DSL produces the same execution result on either side. Every component, tool, and runtime primitive in `internal/agent/` is now driven by the same semantics as its Python counterpart — variable resolution, template substitution, control flow, error reporting, retry/cancel, and stream event shapes. The **retrieval component is the one explicit exception** in this PR. It is being reworked in a separate change and is excluded from this alignment pass; the wrapper slot (`universe_a_wrappers.go → newRetrievalComponent`) is preserved. ## Scope of alignment ### Components (all aligned with `agent/component/`) `Begin` · `Message` · `LLM` (incl. ChatTemplateKwargs, MessageHistoryWindowSize, VisualFiles, Cite, OutputStructure, JSONOutput, TopP, MaxRetries, DelayAfterError, credentials) · `Agent` (react + tool artifact capture + `Reset()` interface-assert) · `Switch` (12/12 operators, Python-equivalent semantics) · `Categorize` · `Invoke` · `Iteration` · `Loop` (macro-expansion through `workflowx.AddLoopNode`) · `UserFillUp` (Python-equivalent interrupt/resume via eino `compose.Interrupt`/`ResumeWithData`) · `FillUp` · `DataOperations` · `ListOperations` · `StringTransform` · `VariableAggregator` · `VariableAssigner` · `Browser` (full stagehand runtime parity) · `DocsGenerator` · `ExcelProcessor`. ### Tools (all aligned with `agent/tools/`) `Retrieval` (wrapper slot only — logic out of scope) · `MCPToolAdapter` (streamable-HTTP) · `CodeExec` (sandbox bridge with `code_exec_contract.go` matching Python contract) · `AkShare` · `ArXiv` · `Crawler` · `DeepL` · `DuckDuckGo` · `Email` · `ExeSQL` · `GitHub` · `Google` · `GoogleScholar` · `Jin10` · `PubMed` · `QWeather` · `SearXNG` · `Tavily` · `Tushare` · `Wencai` · `Wikipedia` · `YahooFinance` — uniform `eino tool.InvokableTool` interface, SSRF protection, shared HTTP client. ### Canvas execution engine (`internal/agent/canvas/`) Aligned with Python's `agent/canvas.py`: - **Scheduler** (`scheduler.go`): state pre/post handlers, node lambdas, per-component timeout resolver (4-level: per-class env → per-class table → uniform env → 600s fallback), `legacyNoOpNames`. - **Loop subgraph** (`loop_subgraph.go`): Python-equivalent `AddLoopNode` macro expansion + condition translation. - **Multibranch** (`multibranch.go`): `Switch` / `Categorize` routing via `compose.NewGraphMultiBranch` — same branch selection semantics as Python. - **Parallel subgraph** (`parallel_subgraph.go`): matches Python's parallel fan-out contract. - **Interrupt/Resume** (`interrupt_resume.go`): `UserFillUpNodeBody` / `IsInterruptError` / `ExtractInterruptContexts` — replaces the deprecated Python sentinel chain with eino's native interrupt API, preserving the same external behavior. - **Checkpoint** (`checkpoint_store.go`): `RedisCheckPointStore` Get/Set/Delete, with business metadata (status / canvas_id / parent_run_id) on a parallel Redis Hash. - **RunTracker** (`run_tracker.go`): Start / MarkSucceeded / MarkFailed / MarkCancelled / AttachCheckpoint — same lifecycle as the Python run record. - **Cancel** (`cancel.go`): Redis pub/sub watch. - **Stream** (`stream.go`): SSE channel with `messages` / `waiting` / `errors` / `done` events, same shape as Python's `agent.canvas.RunEvent` payload. ### DSL bridge (`internal/agent/dsl/`) - `normalize.go`: v1↔v2 collapsed into a single wire format — Python and Go consume the same stored JSON. - `reset.go`: per-run state reset matches Python's `Canvas.reset()` semantics. - Testdata mirrors Python's `agent_msg.json` / `all.json` / etc. ### Runtime (`internal/agent/runtime/`) - `CanvasState` / `NewCanvasState` / `GetVar` / `SetVar` / `ReadVars`: same `{{cpn_id@param}}` resolution model. - `ResolveTemplate` (regex fast path + gonja fallback) — Python Jinja-style semantics. - `selector.go`, `metrics.go`, `component.go`: shared runtime contracts. ## Out of scope (intentionally) - **`Retrieval` component logic** — wrapped only; full parity lands in a follow-up PR. - **Frontend** — only minor dsl-bridge / canvas UX fixes ride along. - **CLI / admin / model registry** — orthogonal to agent behavior. ## How alignment is verified `internal/service/agent_run_e2e_test.go` exercises the **full production chain** against real Python-shaped DSL fixtures: ``` loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL → canvas.Compile → cc.Workflow.Invoke → answer extraction ``` using in-memory SQLite + miniredis (no Docker). Covers: - `TestRunAgent_RealCanvas_BeginMessage` — happy path, `{{sys.query}}` resolution - `TestRunAgent_RealCanvas_WaitForUserResume` — two-run resume cycle (Python-equivalent) - `TestRunAgent_RealCanvas_CompileFails` — unknown component name → sanitized error (Python-equivalent) - `TestRunAgent_RealCanvas_InvokeFails` — unresolvable template ref (Python-equivalent) - `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` — Start→AttachCheckpoint→MarkSucceeded lifecycle `internal/handler/agent_test.go` — SSE streaming parity (`Content-Type: text/event-stream`, `data: {…}\n\n`, trailing `data: [DONE]\n\n`, OpenAI-compatible non-stream `choices`). `internal/agent/canvas/fixture_compile_test.go` + per-component tests pin the Python-equivalent outputs. ``` go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/ ``` ## Design reference `docs/develop/agent-go-port-design.md` (1329 lines, last cross-checked 2026-06-17) — module layout, per-component / per-tool inventory, corner-case catalogue, and the actionable backlog (Section 14, including the retrieval alignment follow-up). --------- Co-authored-by: Claude <noreply@anthropic.com>
2026-06-22 11:58:29 +08:00
uv run python3 ragflow_deps/download_deps.py
pre-commit install
# Start dependent services
docker compose -f docker/docker-compose-base.yml up -d
# Run backend (requires services to be running)
source .venv/bin/activate
export PYTHONPATH=$(pwd)
bash docker/launch_backend_service.sh
# Run tests
uv run pytest
# Linting
ruff check
ruff format
```
### Frontend Development
```bash
cd web
npm install
npm run dev # Development server
npm run build # Production build
npm run lint # ESLint
npm run test # Jest tests
```
### Docker Development
```bash
# Full stack with Docker (includes deepdoc vision service)
cd docker
docker compose -f docker-compose.yml up -d
# Check server status
docker logs -f ragflow-server
# Build the OSS deepdoc vision service standalone
docker build -f docker/Dockerfile_deepdoc_oss -t deepdoc_oss:latest .
docker run -p 8124:8124 deepdoc_oss:latest
# Rebuild images
docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
```
## Key Configuration Files
- `docker/.env` - Environment variables for Docker deployment
- `docker/service_conf.yaml.template` - Backend service configuration
- `pyproject.toml` - Python dependencies and project configuration
- `web/package.json` - Frontend dependencies and scripts
## Testing
- **Python**: pytest with markers (p1/p2/p3 priority levels)
- **Frontend**: Jest with React Testing Library
- **API Tests**: HTTP API and SDK tests in `test/` and `sdk/python/test/`
## Database Engines
RAGFlow supports switching between Elasticsearch (default) and Infinity:
- Set `DOC_ENGINE=infinity` in `docker/.env` to use Infinity
- Requires container restart: `docker compose down -v && docker compose up -d`
feat(agent): align Go agent behavior with Python (except retrieval component) (#16225) ## Summary Aligns the **Go agent runtime/canvas/components/tools** behavior with the **Python `agent/` implementation** so the same stored canvas DSL produces the same execution result on either side. Every component, tool, and runtime primitive in `internal/agent/` is now driven by the same semantics as its Python counterpart — variable resolution, template substitution, control flow, error reporting, retry/cancel, and stream event shapes. The **retrieval component is the one explicit exception** in this PR. It is being reworked in a separate change and is excluded from this alignment pass; the wrapper slot (`universe_a_wrappers.go → newRetrievalComponent`) is preserved. ## Scope of alignment ### Components (all aligned with `agent/component/`) `Begin` · `Message` · `LLM` (incl. ChatTemplateKwargs, MessageHistoryWindowSize, VisualFiles, Cite, OutputStructure, JSONOutput, TopP, MaxRetries, DelayAfterError, credentials) · `Agent` (react + tool artifact capture + `Reset()` interface-assert) · `Switch` (12/12 operators, Python-equivalent semantics) · `Categorize` · `Invoke` · `Iteration` · `Loop` (macro-expansion through `workflowx.AddLoopNode`) · `UserFillUp` (Python-equivalent interrupt/resume via eino `compose.Interrupt`/`ResumeWithData`) · `FillUp` · `DataOperations` · `ListOperations` · `StringTransform` · `VariableAggregator` · `VariableAssigner` · `Browser` (full stagehand runtime parity) · `DocsGenerator` · `ExcelProcessor`. ### Tools (all aligned with `agent/tools/`) `Retrieval` (wrapper slot only — logic out of scope) · `MCPToolAdapter` (streamable-HTTP) · `CodeExec` (sandbox bridge with `code_exec_contract.go` matching Python contract) · `AkShare` · `ArXiv` · `Crawler` · `DeepL` · `DuckDuckGo` · `Email` · `ExeSQL` · `GitHub` · `Google` · `GoogleScholar` · `Jin10` · `PubMed` · `QWeather` · `SearXNG` · `Tavily` · `Tushare` · `Wencai` · `Wikipedia` · `YahooFinance` — uniform `eino tool.InvokableTool` interface, SSRF protection, shared HTTP client. ### Canvas execution engine (`internal/agent/canvas/`) Aligned with Python's `agent/canvas.py`: - **Scheduler** (`scheduler.go`): state pre/post handlers, node lambdas, per-component timeout resolver (4-level: per-class env → per-class table → uniform env → 600s fallback), `legacyNoOpNames`. - **Loop subgraph** (`loop_subgraph.go`): Python-equivalent `AddLoopNode` macro expansion + condition translation. - **Multibranch** (`multibranch.go`): `Switch` / `Categorize` routing via `compose.NewGraphMultiBranch` — same branch selection semantics as Python. - **Parallel subgraph** (`parallel_subgraph.go`): matches Python's parallel fan-out contract. - **Interrupt/Resume** (`interrupt_resume.go`): `UserFillUpNodeBody` / `IsInterruptError` / `ExtractInterruptContexts` — replaces the deprecated Python sentinel chain with eino's native interrupt API, preserving the same external behavior. - **Checkpoint** (`checkpoint_store.go`): `RedisCheckPointStore` Get/Set/Delete, with business metadata (status / canvas_id / parent_run_id) on a parallel Redis Hash. - **RunTracker** (`run_tracker.go`): Start / MarkSucceeded / MarkFailed / MarkCancelled / AttachCheckpoint — same lifecycle as the Python run record. - **Cancel** (`cancel.go`): Redis pub/sub watch. - **Stream** (`stream.go`): SSE channel with `messages` / `waiting` / `errors` / `done` events, same shape as Python's `agent.canvas.RunEvent` payload. ### DSL bridge (`internal/agent/dsl/`) - `normalize.go`: v1↔v2 collapsed into a single wire format — Python and Go consume the same stored JSON. - `reset.go`: per-run state reset matches Python's `Canvas.reset()` semantics. - Testdata mirrors Python's `agent_msg.json` / `all.json` / etc. ### Runtime (`internal/agent/runtime/`) - `CanvasState` / `NewCanvasState` / `GetVar` / `SetVar` / `ReadVars`: same `{{cpn_id@param}}` resolution model. - `ResolveTemplate` (regex fast path + gonja fallback) — Python Jinja-style semantics. - `selector.go`, `metrics.go`, `component.go`: shared runtime contracts. ## Out of scope (intentionally) - **`Retrieval` component logic** — wrapped only; full parity lands in a follow-up PR. - **Frontend** — only minor dsl-bridge / canvas UX fixes ride along. - **CLI / admin / model registry** — orthogonal to agent behavior. ## How alignment is verified `internal/service/agent_run_e2e_test.go` exercises the **full production chain** against real Python-shaped DSL fixtures: ``` loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL → canvas.Compile → cc.Workflow.Invoke → answer extraction ``` using in-memory SQLite + miniredis (no Docker). Covers: - `TestRunAgent_RealCanvas_BeginMessage` — happy path, `{{sys.query}}` resolution - `TestRunAgent_RealCanvas_WaitForUserResume` — two-run resume cycle (Python-equivalent) - `TestRunAgent_RealCanvas_CompileFails` — unknown component name → sanitized error (Python-equivalent) - `TestRunAgent_RealCanvas_InvokeFails` — unresolvable template ref (Python-equivalent) - `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` — Start→AttachCheckpoint→MarkSucceeded lifecycle `internal/handler/agent_test.go` — SSE streaming parity (`Content-Type: text/event-stream`, `data: {…}\n\n`, trailing `data: [DONE]\n\n`, OpenAI-compatible non-stream `choices`). `internal/agent/canvas/fixture_compile_test.go` + per-component tests pin the Python-equivalent outputs. ``` go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/ ``` ## Design reference `docs/develop/agent-go-port-design.md` (1329 lines, last cross-checked 2026-06-17) — module layout, per-component / per-tool inventory, corner-case catalogue, and the actionable backlog (Section 14, including the retrieval alignment follow-up). --------- Co-authored-by: Claude <noreply@anthropic.com>
2026-06-22 11:58:29 +08:00
## Account Password Handling (Critical for Login Flow)
### Password Encryption Pipeline (Browser → Backend → DB Hash)
The login password verification chain is counterintuitive. Understanding this is essential when generating or verifying password hashes.
**Complete flow:**
```
Browser input: "demo"
→ Base64("demo") = "ZGVtbw=="
→ RSA encrypt with conf/public.pem
→ POST to /api/v1/auth/login
Backend DecryptPassword():
→ RSA decrypt with conf/private.pem (passphrase: "Welcome")
→ Returns "ZGVtbw==" (NOT "demo"!)
VerifyPassword("ZGVtbw==", storedHash) ← hash is of Base64(password), not raw password
```
**Consequences:**
- The string verified against the hash is **Base64(original password)**, never the raw password
- `DecryptPassword()` handles both RSA-encrypted (browser) and plaintext (curl/API key) inputs: if base64 decode fails, the input is returned as-is for backward compatibility
- Python backend has the same design: `api/utils/crypt.py:decrypt()` RSA-decrypts and returns the Base64-encoded string directly, no further decode
### How to Generate a Valid Password Hash
```bash
# For password "demo" (user input in browser):
# The actual verified string = Base64("demo") = "ZGVtbw=="
# Generate hash with: common.GenerateWerkzeugPasswordHash("ZGVtbw==")
# or use the scrypt template:
# scrypt:32768:8:1$<random-b64-salt>$<hex-hash-of-ZGVtbw==>
```
**To update a user's password in the running database:**
```bash
docker exec docker-mysql-1 mysql -u root -pinfini_rag_flow rag_flow \
-e "UPDATE user SET password='<hash>' WHERE email='<email>';"
```
### RSA Keys
- `conf/public.pem` — frontend uses this to encrypt Base64(password) before sending
- `conf/private.pem` — backend uses this to decrypt, passphrase `"Welcome"`
- Both referenced in `internal/common/password.go:DecryptPassword()`
### Obtaining an API Token for a Tenant
When testing APIs manually (curl, Go scripts, etc.), you need a valid auth token. The login endpoint returns **two different tokens**:
| Field | Format | Purpose |
|-------|--------|---------|
| `response.body.data.access_token` | Raw UUID | Stored in DB, NOT used for API auth |
| `response.Header["Authorization"]` | itsdangerous-signed token | Used as `Bearer <token>` for all subsequent API requests |
**How to obtain the correct token:**
```bash
# Step 1: Construct the encrypted password
# Raw password → Base64 → RSA encrypt with conf/public.pem
PASSWORD="demo"
PASSWORD_B64=$(echo -n "$PASSWORD" | base64)
# Step 2: POST to login (use RSA encryption — easiest via a Go/Python script)
# Response header contains: Authorization: <itsdangerous-signed-token>
# Step 3: Use the Authorization header value for all API requests
curl -H "Authorization: <itsdangerous-signed-token>" \
http://127.0.0.1:9222/api/v1/agents
```
**Go snippet (complete login + token extraction):**
```go
// Login
passwordB64 := base64.StdEncoding.EncodeToString([]byte(password))
pubData, _ := os.ReadFile("conf/public.pem")
block, _ := pem.Decode(pubData)
pubKey, _ := x509.ParsePKIXPublicKey(block.Bytes)
ciphertext, _ := rsa.EncryptPKCS1v15(rand.Reader, pubKey.(*rsa.PublicKey), []byte(passwordB64))
encryptedB64 := base64.StdEncoding.EncodeToString(ciphertext)
body, _ := json.Marshal(map[string]string{"email": email, "password": encryptedB64})
resp, _ := http.Post(baseURL+"/api/v1/auth/login", "application/json", bytes.NewReader(body))
// KEY: use the Authorization header, NOT body.access_token
authToken := resp.Header.Get("Authorization")
// Use for API calls
req, _ := http.NewRequest("GET", baseURL+"/api/v1/agents", nil)
req.Header.Set("Authorization", authToken)
```
**The raw `access_token` (UUID) in the response body** is the internal DB token used only by the `itsdangerous` middleware to verify the signed token — it is never passed directly in API Authorization headers.
---
## Agent Run E2E Tests
### Running the Tests
```bash
# Run all agent run e2e tests (in-memory SQLite + miniredis, no Docker needed)
cd /home/zhichyu/github.com/infiniflow/ragflow
go test -count=1 -v -run 'TestRunAgent_RealCanvas|TestRunAgent_RunTracker' ./internal/service/
```
### Test Architecture
All e2e tests live in `internal/service/agent_run_e2e_test.go`. They exercise the full production chain:
```
loadCanvasForUser → versionDAO.GetLatest → decodeCanvasFromDSL →
canvas.Compile → cc.Workflow.Invoke → answer extraction
```
**Test isolation**: Each test stands up its own in-memory SQLite DB (pushed as `dao.DB`), seeds User/Tenant/UserCanvas/UserCanvasVersion rows, and tears down in `t.Cleanup`. Tests use **miniredis** for Redis-backed CheckPointStore + RunTracker — no external services needed.
**Key test helpers:**
- `makeCanvasWithDSL(t, canvasID, userID, tenantID, versionID, dsl)` — seeds all required DB rows
- `drainAgentEvents(t, events)` — drains the `<-chan canvas.RunEvent` channel, buckets results into `messages`, `waiting`, `errors_`, `done`
- `newRunTrackerForTest(t, ttl)` — wires a `canvas.RunTracker` against in-memory miniredis
**Existing e2e tests:**
| Test | What it covers |
|------|---------------|
| `TestRunAgent_RealCanvas_BeginMessage` | Happy path: Begin→Message, verifies `"{{sys.query}}"` resolution |
| `TestRunAgent_RealCanvas_WaitForUserResume` | Resume path: Begin→Message→UserFillUp, two-run cycle |
| `TestRunAgent_RealCanvas_CompileFails` | Error path: unknown component name → sanitized error |
| `TestRunAgent_RealCanvas_InvokeFails` | Error path: unresolvable template ref |
| `TestRunAgent_RunTracker_AttachCheckpoint_CallSequence` | Production boot: Start→AttachCheckpoint→MarkSucceeded with Redis/miniredis |
**Test DSL data files** are in `internal/agent/dsl/testdata/`:
- `agent_msg.json` — Agent+Message with Begin, LLM-powered agent component
- `all.json` — Complex: Begin→UserFillUp→Switch→Loop→Message
- `switch.json`, `resume.json`, `browser.json`, `subagent.json`, etc.
**Handler-level SSE streaming tests** in `internal/handler/agent_test.go` use a `stubChatRunner` that emits pre-configured `canvas.RunEvent` values without a real DB or eino runner, verifying:
- SSE `Content-Type: text/event-stream`
- `data: {...}\n\n` framing
- Trailing `data: [DONE]\n\n` terminator
- OpenAI-compatible non-stream `choices` response shape
**Important**: `_ "ragflow/internal/agent/component"` (blank import in test) is required — it triggers `init()` to register all component factories. Without it, `canvas.Compile` fails to resolve any component type.
---
## Development Environment Requirements
feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767) Closes #14753 ## What changed | File | Change | |---|---| | `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` | | `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` | | `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix legs | | `CLAUDE.md` | dev setup command + requirements note updated | | `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` → `from enum import StrEnum` | | `agent/tools/code_exec.py` | same | `StrEnum` has been in the stdlib since Python 3.11 — the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - **"graspologic hard-blocks 3.13"** — the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - **"free-threading throughput gains for I/O-bound workload"** — Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.
2026-05-15 08:40:53 +02:00
- Python 3.10-3.13
- Node.js >=18.20.4
- Docker & Docker Compose
- uv package manager
- 16GB+ RAM, 50GB+ disk space
1. Think before acting. Read existing files before writing code.
2. Be concise in output but thorough in reasoning.
3. Prefer editing over rewriting whole files.
4. Do not re-read files you have already read.
5. Test your code before declaring done.
6. No sycophantic openers or closing fluff.
7. Keep solutions simple and direct.
8. User instructions always override this file.