mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-07-01 08:15:44 +08:00
### What problem does this PR solve? ``` RAGFlow(admin)> mq publish 'msg2'; SUCCESS RAGFlow(admin)> mq publish 'msg3'; SUCCESS RAGFlow(admin)> mq list; +---------+---------------+ | message | subject | +---------+---------------+ | msg1 | tasks.RAGFLOW | | msg2 | tasks.RAGFLOW | | msg3 | tasks.RAGFLOW | +---------+---------------+ RAGFlow(admin)> mq pull 2; +---------+---------------+ | message | subject | +---------+---------------+ | msg1 | tasks.RAGFLOW | | msg2 | tasks.RAGFLOW | +---------+---------------+ RAGFlow(admin)> mq pull noack; +---------+---------------+ | message | subject | +---------+---------------+ | abc | tasks.RAGFLOW | +---------+---------------+ RAGFlow(admin)> mq show +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ | ack_pending_count | consumer_count | memory | message_count | pending_count | redelivered_count | waiting_count | +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ | 2 | 1 | 0 | 2 | 0 | 1 | 0 | +-------------------+----------------+--------+---------------+---------------+-------------------+---------------+ RAGFlow(admin)> list ingestors; +--------------+-------------------------------------------+--------+ | host | name | status | +--------------+-------------------------------------------+--------+ | 192.168.1.38 | ingestor-8f0e4bd5650a4ac58b0151969fbf6935 | alive | +--------------+-------------------------------------------+--------+ RAGFlow(admin)> list ingestion tasks; +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ | document_id | id | status | step | user | user_id | +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ | ffe64fae423411f1a2d938a74640adcc | 90d3d0f6528941c1ac8eb0360effccc4 | COMPLETED | 5 | aaa@aaa.com | 2ba4881420fa11f19e9c38a74640adcc | +----------------------------------+----------------------------------+-----------+------+-------------+----------------------------------+ RAGFlow(admin)> remove ingestion tasks '90d3d0f6528941c1ac8eb0360effccc4'; +---------+----------------------------------+ | delete | task_id | +---------+----------------------------------+ | success | 90d3d0f6528941c1ac8eb0360effccc4 | +---------+----------------------------------+ RAGFlow(admin)> stop ingestion tasks 'e89e20d9a25848a1b79bd9345ddbfe1d'; +----------+----------------------------------+ | status | task_id | +----------+----------------------------------+ | STOPPING | e89e20d9a25848a1b79bd9345ddbfe1d | +----------+----------------------------------+ # Publish a message RAGFlow(admin)> mq publish 'cdd'; SUCCESS # List current tasks in the message queue RAGFlow(admin)> mq list +----------------------------------+---------------+ | message | subject | +----------------------------------+---------------+ | 7ce392a3c1624cd2be4b5276e8825059 | tasks.RAGFLOW | +----------------------------------+---------------+ # Consume a task from the message queue RAGFlow(admin)> mq pull +------+-----+----------------+ | ack | id | type | +------+-----+----------------+ | true | cdd | ingestion_test | +------+-----+----------------+ # User mode # List ingestion tasks, followed by dataset id RAGFlow(user)> list ingestion tasks from '0abe79f9423311f1ad8d38a74640adcc'; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ | create_date | create_time | dataset_id | document_id | id | schema | status | update_date | update_time | user_id | +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ | 2026-05-30T20:21:06+08:00 | 1780143666289 | 0abe79f9423311f1ad8d38a74640adcc | ffe64fae423411f1a2d938a74640adcc | 8d758cd14a8b4ba8ab505003fb52017d | | COMPLETED | 2026-05-30T20:21:26+08:00 | 1780143686431 | 2ba4881420fa11f19e9c38a74640adcc | +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ RAGFlow(user)> list ingestion tasks; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ | create_date | create_time | dataset_id | document_id | id | schema | status | update_date | update_time | user_id | +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ | 2026-06-02T19:02:31+08:00 | 1780398151417 | 0abe79f9423311f1ad8d38a74640adcc | ffe64fae423411f1a2d938a74640adcc | e89e20d9a25848a1b79bd9345ddbfe1d | | COMPLETED | 2026-06-02T19:02:52+08:00 | 1780398172208 | 2ba4881420fa11f19e9c38a74640adcc | +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ # Create an ingestion task # First argument is document id, second argument is dataset id RAGFlow(user)> start ingestion 'ffe64fae423411f1a2d938a74640adcc' from '0abe79f9423311f1ad8d38a74640adcc'; +----------------------------------+-------------------------------------------+ | document_id | result | +----------------------------------+-------------------------------------------+ | ffe64fae423411f1a2d938a74640adcc | task_id: 8d758cd14a8b4ba8ab505003fb52017d | +----------------------------------+-------------------------------------------+ # Pause an ingestion task, first argument is ingestion id RAGFlow(user)> stop ingestion '8d758cd14a8b4ba8ab505003fb52017d'; +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ | create_date | create_time | dataset_id | document_id | id | schema | status | update_date | update_time | user_id | +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ | 2026-05-30T20:21:06+08:00 | 1780143666289 | 0abe79f9423311f1ad8d38a74640adcc | ffe64fae423411f1a2d938a74640adcc | 8d758cd14a8b4ba8ab505003fb52017d | | COMPLETED | 2026-05-30T20:21:26+08:00 | 1780143686431 | 2ba4881420fa11f19e9c38a74640adcc | +---------------------------+---------------+----------------------------------+----------------------------------+----------------------------------+--------+-----------+---------------------------+---------------+----------------------------------+ # Delete an ingestion task RAGFlow(api/default)> remove ingestion tasks 'f366450a27d54677aec1c7090add30f0'; +---------+----------------------------------+ | remove | task_id | +---------+----------------------------------+ | success | f366450a27d54677aec1c7090add30f0 | +---------+----------------------------------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>
RAGFlow CLI (Go Version)
This is the Go implementation of the RAGFlow command-line interface, compatible with the Python version's syntax.
Features
- Interactive mode and single command execution
- Full compatibility with Python CLI syntax
- Recursive descent parser for SQL-like commands
- Virtual Filesystem for intuitive resource management
- Support for all major commands:
- User management: LOGIN, REGISTER, CREATE USER, DROP USER, LIST USERS, etc.
- Service management: LIST SERVICES, SHOW SERVICE, STARTUP/SHUTDOWN/RESTART SERVICE
- Role management: CREATE ROLE, DROP ROLE, LIST ROLES, GRANT/REVOKE PERMISSION
- Dataset management via Virtual Filesystem:
ls,search,mkdir,cat,rm - Model management: SET/RESET DEFAULT LLM/VLM/EMBEDDING/etc.
- And more...
Usage
Build and run
go build -o ragflow_cli ./cmd/ragflow_cli.go
./ragflow_cli
Architecture
internal/cli/
├── cli.go # Main CLI loop and interaction
├── client.go # RAGFlowClient with Filesystem integration
├── http_client.go # HTTP client for API communication
├── parser/ # Command parser package
│ ├── types.go # Token and Command types
│ ├── lexer.go # Lexical analyzer
│ └── parser.go # Recursive descent parser
└── filesystem/ # Virtual Filesystem
├── engine.go # Core engine: path resolution, command routing
├── types.go # Node, Command, Result types
├── base.go # Provider interface definition
├── dataset.go # Dataset provider implementation
├── file.go # File manager provider implementation
└── utils.go # Helper functions
Virtual Filesystem
The Virtual Filesystem provides a unified filesystem interface over RAGFlow's RESTful APIs.
Design Principles
- No Server-Side Changes: All logic implemented client-side using existing APIs
- Provider Pattern: Modular providers for different resource types (datasets, files, etc.)
- Unified Interface: Common
ls,search,mkdircommands across all providers - Path-Based Navigation: Virtual paths like
/datasets,/datasets/{name}/files
Supported Paths
| Path | Description |
|---|---|
/datasets |
List all datasets |
/datasets/{name} |
List documents in dataset (default behavior) |
/datasets/{name}/{doc} |
Get document info |
Commands
ls [path] [options] - List nodes at path
List contents of a path in the context filesystem.
Arguments:
[path]- Path to list (default: "datasets")
Options:
-n, --limit <number>- Maximum number of items to display (default: 10)-h, --help- Show ls help message
Examples:
ls # List all datasets (default 10)
ls -n 20 # List 20 datasets
ls datasets/kb1 # List files in kb1 dataset
ls datasets/kb1 -n 50 # List 50 files in kb1 dataset
search [options] - Search for content
Semantic search in datasets.
Options:
-n, --number- Number of top results to return (default: 10)
Output Formats:
- Default: JSON format
--output plain- Plain text format--output table- Table format with borders
Examples:
search "machine learning" # Search all datasets (JSON output)
search "neural networks" datasets/kb1 # Search in kb1
search "AI" datasets/kb1 --output plain # Plain text output
search "RAG" -n 20 # Return 20 results
SEARCH 'machine learning' ON DATASETS 'kb1' 'kb2'
SEARCH 'AI' ON DATASETS 'kb1' WITH top_k 1024 similarity_threshold 0.0 vector_similarity_weight 0.3 keyword true
SEARCH 'AI' ON DATASETS 'kb1' WITH cross_languages ['Chinese']
cat <path> - Display content
Display document content (if available).
Examples:
cat myskills/doc.md # Show content of doc.md file
cat datasets/kb1/document.pdf # Error: cannot display binary file content
Command Examples
-- Authentication
LOGIN USER 'admin@example.com';
-- User management
REGISTER USER 'john' AS 'John Doe' PASSWORD 'secret';
CREATE USER 'jane' 'password123';
DROP USER 'jane';
LIST USERS;
SHOW USER 'john';
-- Service management
LIST SERVICES;
SHOW SERVICE 1;
STARTUP SERVICE 1;
SHUTDOWN SERVICE 1;
RESTART SERVICE 1;
PING;
-- Role management
CREATE ROLE admin DESCRIPTION 'Administrator role';
LIST ROLES;
GRANT read,write ON datasets TO ROLE admin;
-- Dataset management
CREATE DATASET 'my_dataset' WITH EMBEDDING 'text-embedding-ada-002' PARSER 'naive';
LIST DATASETS;
DROP DATASET 'my_dataset';
-- Model configuration
SET DEFAULT LLM 'gpt-4';
SET DEFAULT EMBEDDING 'text-embedding-ada-002';
RESET DEFAULT LLM;
## Parser Implementation
The parser uses a hand-written recursive descent approach instead of go-yacc for:
- Better control over error messages
- Easier to extend and maintain
- No code generation step required
The parser structure follows the grammar defined in the Python version, ensuring full syntax compatibility.