mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-29 23:41:12 +08:00
### What problem does this PR solve? `_get_optimal_clusters` in `rag/raptor.py` had two edge-case issues in GMM cluster-count selection: 1. It used `np.arange(1, max_clusters)`, which never evaluates the upper-bound candidate (`max_clusters`). 2. When effective `max_clusters` becomes `1`, the candidate list was empty and `argmin` crashed. This PR makes candidate evaluation inclusive (`1..max_clusters`) and guards the single-cluster case by returning `1` directly. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Validation - `pytest test/unit_test/rag/test_raptor_psi_tree_builder.py --config-file pyproject.toml -q` - `ruff check rag/raptor.py test/unit_test/rag/test_raptor_psi_tree_builder.py` ### Tests added - Regression test for `max_cluster == 1` path (no crash, returns 1) - Regression test verifying upper-bound candidate is evaluated and can be selected _AI-assistance disclosure: parts of this change (bug triage and test scaffolding) were drafted with AI assistance and fully reviewed and verified by me._ --------- Co-authored-by: Harsh Kashyap <harshkashyap@Harshs-MacBook-Pro.local> Co-authored-by: Cursor <cursoragent@cursor.com>
(1). Deploy RAGFlow services and images
https://ragflow.io/docs/build_docker_image
(2). Configure the required environment for testing
Install Python dependencies (including test dependencies):
uv sync --python 3.13 --only-group test --no-default-groups --frozen
Activate the environment:
source .venv/bin/activate
Install SDK:
uv pip install sdk/python
Modify the .env file: Add the following code:
COMPOSE_PROFILES=${COMPOSE_PROFILES},tei-cpu
TEI_MODEL=BAAI/bge-small-en-v1.5
RAGFLOW_IMAGE=infiniflow/ragflow:v0.26.1 #Replace with the image you are using
Start the container(wait two minutes):
docker compose -f docker/docker-compose.yml up -d
(3). Test Elasticsearch
a) Run sdk tests against Elasticsearch:
export HTTP_API_TEST_LEVEL=p2
export HOST_ADDRESS=http://127.0.0.1:9380 # Ensure that this port is the API port mapped to your localhost
pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_sdk_api
b) Run http api tests against Elasticsearch:
pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_http_api
(4). Test Infinity
Modify the .env file:
DOC_ENGINE=${DOC_ENGINE:-infinity}
Start the container:
docker compose -f docker/docker-compose.yml down -v
docker compose -f docker/docker-compose.yml up -d
a) Run sdk tests against Infinity:
DOC_ENGINE=infinity pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_sdk_api
b) Run http api tests against Infinity:
DOC_ENGINE=infinity pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_http_api