ragflow/CLAUDE.md
wdeveloper16 14c0985182 feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767)
Closes #14753

## What changed

| File | Change |
|---|---|
| `pyproject.toml` | `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` |
| `Dockerfile` | `uv python install 3.13`, `uv sync --python 3.13` |
| `.github/workflows/tests.yml` | `uv sync --python 3.13` on both matrix legs |
| `CLAUDE.md` | dev setup command + requirements note updated |
| `deepdoc/parser/mineru_parser.py` | `from strenum import StrEnum` → `from enum import StrEnum` |
| `agent/tools/code_exec.py` | same |

`StrEnum` has been in the stdlib since Python 3.11, so the `strenum`
backport package is no longer needed once the floor is 3.13.

## Why uv.lock is not regenerated

`uv lock --python 3.13` fails because:

1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0`
2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels)
depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0`
3. These two constraints are irreconcilable on Python 3.13

The lockfile regeneration requires loosening the `numpy` upper bound in
the `infiniflow/graspologic` fork. Once that fork commit is updated and
the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will
succeed.
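
The conflict above can be illustrated with a small range check. This is a simplified sketch: it compares plain dotted versions as integer tuples and ignores pre-releases and other PEP 440 details, and the helper names `parse` and `ranges_overlap` are hypothetical. It is not how `uv` resolves dependencies.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.26.4' into (1, 26, 4) for simple tuple comparison."""
    return tuple(int(part) for part in version.split("."))


def ranges_overlap(lower_a: str, upper_a: str, lower_b: str) -> bool:
    """Check whether [lower_a, upper_a) and [lower_b, +inf) share any version."""
    lowest_allowed = max(parse(lower_a), parse(lower_b))
    return lowest_allowed < parse(upper_a)


# graspologic fork: numpy>=1.26.4,<2.0.0
# ml-dtypes (as described above): numpy>=2.1.0
conflict = not ranges_overlap("1.26.4", "2.0.0", "2.1.0")
print("irreconcilable" if conflict else "resolvable")
```

Raising the fork's upper bound above 2.1.0 is what makes the two ranges intersect again.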

## RFC corrections

Two claims in the original RFC (#14753) did not hold up under code
review:

- **"graspologic hard-blocks 3.13"** — the infiniflow fork at the pinned
commit has no `<3.13` Python constraint. The blocker is the transitive
`numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a
direct Python version cap.
- **"free-threading throughput gains for I/O-bound workload"** — Python
3.13 free-threading requires a special `--disable-gil` build and
provides no benefit for async I/O code (the GIL is already released
during I/O). The real motivation is forward compatibility and improved
error messages.
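
To check which kind of interpreter is actually running, a sketch along these lines may help. It assumes CPython; `gil_status` is a hypothetical helper name, and the `getattr` fallback covers versions before 3.13, where `sys._is_gil_enabled()` does not exist.

```python
import sys
import sysconfig


def gil_status() -> dict[str, bool]:
    """Report whether this is a free-threaded build and whether the GIL is active."""
    # Py_GIL_DISABLED is 1 only for interpreters configured with --disable-gil.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on 3.13+; older versions always run with the GIL.
    is_enabled = getattr(sys, "_is_gil_enabled", lambda: True)
    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": bool(is_enabled()),
    }


status = gil_status()
print(status)
```

On a standard 3.13 build this should report that the build is not free-threaded and that the GIL is enabled, which is the configuration the change above targets.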
2026-05-15 14:40:53 +08:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It's a full-stack application with:

  • Python backend (Flask-based API server)
  • React/TypeScript frontend (built with Vite)
  • Microservices architecture with Docker deployment
  • Multiple data stores (MySQL, Elasticsearch/Infinity, Redis, MinIO)

Architecture

Backend (/api/)

  • Main Server: api/ragflow_server.py - Flask application entry point
  • Apps: Modular Flask blueprints in api/apps/ for different functionalities:
    • kb_app.py - Knowledge base management
    • dialog_app.py - Chat/conversation handling
    • document_app.py - Document processing
    • canvas_app.py - Agent workflow canvas
    • file_app.py - File upload/management
  • Services: Business logic in api/db/services/
  • Models: Database models in api/db/db_models.py

Core Processing (/rag/)

  • Document Processing: deepdoc/ - PDF parsing, OCR, layout analysis
  • LLM Integration: rag/llm/ - Model abstractions for chat, embedding, reranking
  • RAG Pipeline: rag/flow/ - Chunking, parsing, tokenization
  • Graph RAG: rag/graphrag/ - Knowledge graph construction and querying

Agent System (/agent/)

  • Components: Modular workflow components (LLM, retrieval, categorize, etc.)
  • Templates: Pre-built agent workflows in agent/templates/
  • Tools: External API integrations (Tavily, Wikipedia, SQL execution, etc.)

Frontend (/web/)

  • React/TypeScript, built with Vite
  • shadcn/ui components
  • State management with Zustand
  • Tailwind CSS for styling

Common Development Commands

Backend Development

# Install Python dependencies
uv sync --python 3.13 --all-extras
uv run python3 download_deps.py
pre-commit install

# Start dependent services
docker compose -f docker/docker-compose-base.yml up -d

# Run backend (requires services to be running)
source .venv/bin/activate
export PYTHONPATH=$(pwd)
bash docker/launch_backend_service.sh

# Run tests
uv run pytest

# Linting
ruff check
ruff format

Frontend Development

cd web
npm install
npm run dev        # Development server
npm run build      # Production build
npm run lint       # ESLint
npm run test       # Jest tests

Docker Development

# Full stack with Docker
cd docker
docker compose -f docker-compose.yml up -d

# Check server status
docker logs -f ragflow-server

# Rebuild images
docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .

Key Configuration Files

  • docker/.env - Environment variables for Docker deployment
  • docker/service_conf.yaml.template - Backend service configuration
  • pyproject.toml - Python dependencies and project configuration
  • web/package.json - Frontend dependencies and scripts

Testing

  • Python: pytest with markers (p1/p2/p3 priority levels)
  • Frontend: Jest with React Testing Library
  • API Tests: HTTP API and SDK tests in test/ and sdk/python/test/

Database Engines

RAGFlow supports switching between Elasticsearch (default) and Infinity:

  • Set DOC_ENGINE=infinity in docker/.env to use Infinity
  • Requires container restart: docker compose down -v && docker compose up -d (note that -v also removes the Compose volumes, so previously stored data is cleared)
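
Before restarting, it can be useful to confirm which engine the env file selects. The following is a rough sketch, not part of RAGFlow: `doc_engine_from_env` is a hypothetical helper, the default string "elasticsearch" is an assumption based on Elasticsearch being the default engine, and shell-style substitutions such as `${VAR:-value}` are not expanded.

```python
def doc_engine_from_env(env_text: str, default: str = "elasticsearch") -> str:
    """Return the DOC_ENGINE value from .env-style text, or the default if unset."""
    engine = default
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, _, value = line.partition("=")
        if key.strip() == "DOC_ENGINE":
            # The last assignment wins; an empty value falls back to the default.
            engine = value.strip().strip("'\"") or default
    return engine


# Hypothetical file contents, for illustration only.
sample = "# storage backend\nDOC_ENGINE=infinity\nOTHER_SETTING=1\n"
print(doc_engine_from_env(sample))
```

In real use the text would come from reading docker/.env, for example with `pathlib.Path("docker/.env").read_text()`.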

Development Environment Requirements

  • Python >=3.13,<3.15 (3.13 or 3.14)
  • Node.js >=18.20.4
  • Docker & Docker Compose
  • uv package manager
  • 16GB+ RAM, 50GB+ disk space
  1. Think before acting. Read existing files before writing code.
  2. Be concise in output but thorough in reasoning.
  3. Prefer editing over rewriting whole files.
  4. Do not re-read files you have already read.
  5. Test your code before declaring done.
  6. No sycophantic openers or closing fluff.
  7. Keep solutions simple and direct.
  8. User instructions always override this file.