ragflow/docker/.env.single-bucket-example

feat: Add Single Bucket Mode for MinIO/S3 (#11416)

## Overview

This PR adds support for **Single Bucket Mode** in RAGFlow, allowing users to configure MinIO/S3 to use a single bucket with a directory structure instead of creating multiple buckets per Knowledge Base and user folder.

## Problem Statement

The current implementation creates one bucket per Knowledge Base and one bucket per user folder, which can be problematic when:

- Cloud providers charge per bucket
- IAM policies restrict bucket creation
- Organizations want centralized data management in a single bucket

## Solution

Added a `prefix_path` configuration option to the MinIO connector that enables:

- Using a single bucket with directory-based organization
- Backward compatibility with existing multi-bucket deployments
- Support for MinIO, AWS S3, and other S3-compatible storage backends

## Changes

- **`rag/utils/minio_conn.py`**: Enhanced MinIO connector to support single bucket mode with prefix paths
- **`conf/service_conf.yaml`**: Added new configuration options (`bucket` and `prefix_path`)
- **`docker/service_conf.yaml.template`**: Updated template with single bucket configuration examples
- **`docker/.env.single-bucket-example`**: Added example environment variables for single bucket setup
- **`docs/single-bucket-mode.md`**: Comprehensive documentation covering usage, migration, and troubleshooting

## Configuration Example

```yaml
minio:
  user: "access-key"
  password: "secret-key"
  host: "minio.example.com:443"
  bucket: "ragflow-bucket"  # Single bucket name
  prefix_path: "ragflow"    # Optional prefix path
```

## Backward Compatibility

✅ Fully backward compatible - existing deployments continue to work without any changes

- If `bucket` is not configured, uses default multi-bucket behavior
- If `bucket` is configured without `prefix_path`, uses bucket root
- If both are configured, uses `bucket/prefix_path/` structure

## Testing

- Tested with MinIO (local and cloud)
- Verified backward compatibility with existing multi-bucket mode
- Validated IAM policy restrictions work correctly

## Documentation

Included comprehensive documentation in `docs/single-bucket-mode.md` covering:

- Configuration examples
- Migration guide from multi-bucket to single-bucket mode
- IAM policy examples
- Troubleshooting guide

---

**Related Issue**: Addresses use cases where bucket creation is restricted or costly
2025-12-11 12:22:47 +01:00
# Example: Single Bucket Mode Configuration
#
# This file shows how to configure RAGFlow to use a single MinIO/S3 bucket
# with directory structure instead of creating multiple buckets.
# ============================================================================
# MinIO/S3 Configuration for Single Bucket Mode
# ============================================================================
# MinIO/S3 Endpoint (with port if not default)
# For HTTPS (port 443), the connection will automatically use secure=True
export MINIO_HOST=minio.example.com:443
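# A minimal sketch of the rule above, assuming the decision depends only on
# an explicit :443 suffix in MINIO_HOST. minio_secure is a hypothetical helper
# written for illustration; the actual connector logic may differ.

```shell
# Hypothetical helper, not part of RAGFlow: print "true" when the host string
# ends in :443, otherwise "false". Assumes no other port implies HTTPS.
minio_secure() (
  case "$1" in
    *:443) echo true ;;
    *)     echo false ;;
  esac
)

# Falls back to the example host when MINIO_HOST is not set.
minio_secure "${MINIO_HOST:-minio.example.com:443}"
```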
# Access credentials
export MINIO_USER=your-access-key
export MINIO_PASSWORD=your-secret-password-here
# Single Bucket Configuration (NEW!)
# If set, all data will be stored in this bucket instead of creating
# a separate bucket for each knowledge base and user folder
export MINIO_BUCKET=ragflow-bucket
# Optional: Prefix path within the bucket (NEW!)
# If set, all files will be stored under this prefix
# Example: bucket/prefix_path/kb_id/file.pdf
export MINIO_PREFIX_PATH=ragflow
# ============================================================================
# Alternative: Multi-Bucket Mode (Default)
# ============================================================================
#
# To use the original multi-bucket mode, simply don't set MINIO_BUCKET
# and MINIO_PREFIX_PATH:
#
# export MINIO_HOST=minio.local
# export MINIO_USER=admin
# export MINIO_PASSWORD=password
# # MINIO_BUCKET not set
# # MINIO_PREFIX_PATH not set
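# The choice between the two modes can be sketched as a small check. This is
# an illustration only: storage_mode is a hypothetical helper, and it assumes
# that MINIO_PREFIX_PATH has no effect unless MINIO_BUCKET is also set.

```shell
# Hypothetical helper, not part of RAGFlow: report which layout the current
# environment would select.
storage_mode() (
  if [ -z "${MINIO_BUCKET:-}" ]; then
    echo "multi-bucket"                # no bucket configured: default mode
  elif [ -z "${MINIO_PREFIX_PATH:-}" ]; then
    echo "single-bucket"               # bucket only: data lives at bucket root
  else
    echo "single-bucket-with-prefix"   # bucket and prefix both configured
  fi
)

echo "Storage mode: $(storage_mode)"
```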
# ============================================================================
# Storage Mode Selection (Environment Variable)
# ============================================================================
#
# Set STORAGE_IMPL to MINIO (the default) so that RAGFlow uses MinIO/S3 storage
export STORAGE_IMPL=MINIO
# ============================================================================
# Example Path Structures
# ============================================================================
#
# Multi-Bucket Mode (default):
# bucket: kb_12345/file.pdf
# bucket: kb_67890/file.pdf
# bucket: folder_abc/file.txt
#
# Single Bucket Mode (MINIO_BUCKET set):
# bucket: ragflow-bucket/kb_12345/file.pdf
# bucket: ragflow-bucket/kb_67890/file.pdf
# bucket: ragflow-bucket/folder_abc/file.txt
#
# Single Bucket with Prefix (both set):
# bucket: ragflow-bucket/ragflow/kb_12345/file.pdf
# bucket: ragflow-bucket/ragflow/kb_67890/file.pdf
# bucket: ragflow-bucket/ragflow/folder_abc/file.txt
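# The three layouts above can be reproduced with a short function. This is a
# sketch under assumptions: object_location is a hypothetical helper, and
# MINIO_PREFIX_PATH is assumed to have no leading or trailing slash.

```shell
# Hypothetical helper, not part of RAGFlow: print "<bucket>/<object key>" for
# a file that belongs to the given knowledge base or folder ID.
object_location() (
  kb_id=$1
  file=$2
  bucket=${MINIO_BUCKET:-}
  prefix=${MINIO_PREFIX_PATH:-}
  if [ -z "$bucket" ]; then
    echo "$kb_id/$file"                   # multi-bucket: the ID is the bucket
  elif [ -z "$prefix" ]; then
    echo "$bucket/$kb_id/$file"           # single bucket, no prefix
  else
    echo "$bucket/$prefix/$kb_id/$file"   # single bucket with prefix
  fi
)

object_location kb_12345 file.pdf
```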
# ============================================================================
# IAM Policy for Single Bucket Mode
# ============================================================================
#
# When using single bucket mode, you only need permissions for one bucket:
#
#   {
#     "Version": "2012-10-17",
#     "Statement": [
#       {
#         "Effect": "Allow",
#         "Action": ["s3:*"],
#         "Resource": [
#           "arn:aws:s3:::ragflow-bucket",
#           "arn:aws:s3:::ragflow-bucket/*"
#         ]
#       }
#     ]
#   }
# ============================================================================
# Testing the Configuration
# ============================================================================
#
# After setting these variables, you can test with MinIO Client (mc):
#
# # Configure mc alias
# mc alias set ragflow https://minio.example.com:443 \
#   your-access-key \
#   your-secret-password-here
#
# # List bucket contents
# mc ls ragflow/ragflow-bucket/
#
# # If prefix is set, check the prefix
# mc ls ragflow/ragflow-bucket/ragflow/
#
# # Test write permission
# echo "test" | mc pipe ragflow/ragflow-bucket/ragflow/_test.txt
#
# # Clean up test file
# mc rm ragflow/ragflow-bucket/ragflow/_test.txt