mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-29 23:41:12 +08:00
### What problem does this PR solve?

This PR enhances the CLI parser to support dimension configurations for custom embedding models. Users can now specify the maximum dimension and other supported dimensions directly after the `embedding` keyword.

```
add model 'x1 x2 x3 x4 x5' to provider 'vllm' instance 'test' with tokens 1024 chat think vision, token 2048 chat, token 1024 think vision, token 0 embedding 2048 64 1024 2048, token 0 embedding 2048;
```

- The first integer following `embedding` represents the `max_dimension`.
- Any subsequent integers represent specific alternative dimensions.
- If no subsequent integers are provided, `dimensions` defaults to empty, indicating that all sizes under `max_dimension` are supported.
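The dimension rule described above can be sketched in Go as follows. This is only an illustration, not the code from this PR: `EmbeddingSpec` and `parseEmbeddingDims` are hypothetical names, and the input is assumed to be the integer tokens that follow `embedding`, already split out by the lexer.

```go
package main

import (
	"fmt"
	"strconv"
)

// EmbeddingSpec is a hypothetical container for the parsed values.
type EmbeddingSpec struct {
	MaxDimension int   // first integer after "embedding"
	Dimensions   []int // remaining integers; empty means no explicit alternatives
}

// parseEmbeddingDims converts the integer tokens that follow the
// "embedding" keyword into an EmbeddingSpec.
func parseEmbeddingDims(tokens []string) (EmbeddingSpec, error) {
	if len(tokens) == 0 {
		return EmbeddingSpec{}, fmt.Errorf("expected max dimension after embedding")
	}
	spec := EmbeddingSpec{Dimensions: []int{}}
	for i, tok := range tokens {
		n, err := strconv.Atoi(tok)
		if err != nil {
			return EmbeddingSpec{}, fmt.Errorf("invalid dimension %q", tok)
		}
		if i == 0 {
			spec.MaxDimension = n // the first integer is the maximum
			continue
		}
		spec.Dimensions = append(spec.Dimensions, n)
	}
	return spec, nil
}

func main() {
	// Token lists taken from the two embedding clauses in the example command.
	for _, tokens := range [][]string{{"2048", "64", "1024", "2048"}, {"2048"}} {
		spec, err := parseEmbeddingDims(tokens)
		fmt.Printf("%+v %v\n", spec, err)
	}
}
```

Keeping `Dimensions` as an empty (non-nil) slice makes the "no alternatives given" case explicit for callers that serialize the result.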
RAGFlow CLI (Go Version)
This is the Go implementation of the RAGFlow command-line interface, compatible with the Python version's syntax.
Features
- Interactive mode and single command execution
- Full compatibility with Python CLI syntax
- Recursive descent parser for SQL-like commands
- Virtual Filesystem for intuitive resource management
- Support for all major commands:
- User management: LOGIN, REGISTER, CREATE USER, DROP USER, LIST USERS, etc.
- Service management: LIST SERVICES, SHOW SERVICE, STARTUP/SHUTDOWN/RESTART SERVICE
- Role management: CREATE ROLE, DROP ROLE, LIST ROLES, GRANT/REVOKE PERMISSION
- Dataset management via Virtual Filesystem: `ls`, `search`, `mkdir`, `cat`, `rm`
- Model management: SET/RESET DEFAULT LLM/VLM/EMBEDDING/etc.
- And more...
Usage
Build and run
go build -o ragflow_cli ./cmd/ragflow_cli.go
./ragflow_cli
Architecture
internal/cli/
├── cli.go # Main CLI loop and interaction
├── client.go # RAGFlowClient with Filesystem integration
├── http_client.go # HTTP client for API communication
├── parser/ # Command parser package
│ ├── types.go # Token and Command types
│ ├── lexer.go # Lexical analyzer
│ └── parser.go # Recursive descent parser
└── filesystem/ # Virtual Filesystem
├── engine.go # Core engine: path resolution, command routing
├── types.go # Node, Command, Result types
├── base.go # Provider interface definition
├── dataset.go # Dataset provider implementation
├── file.go # File manager provider implementation
└── utils.go # Helper functions
Virtual Filesystem
The Virtual Filesystem provides a unified filesystem interface over RAGFlow's RESTful APIs.
Design Principles
- No Server-Side Changes: All logic implemented client-side using existing APIs
- Provider Pattern: Modular providers for different resource types (datasets, files, etc.)
- Unified Interface: Common `ls`, `search`, `mkdir` commands across all providers
- Path-Based Navigation: Virtual paths like `/datasets`, `/datasets/{name}/files`
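The provider pattern named in these principles can be sketched roughly as below. The `Provider` interface, the `Engine` type, and the in-memory `memoryDatasets` provider are all hypothetical; the real interface is defined in `filesystem/base.go` and may look different, and a real provider would call the RAGFlow REST API rather than read from a map.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Provider is a hypothetical interface for one resource type.
type Provider interface {
	Name() string                          // first path segment this provider owns
	List(subPath string) ([]string, error) // list nodes below that segment
}

// Engine routes a virtual path to the provider that owns its first segment.
type Engine struct{ providers map[string]Provider }

func NewEngine() *Engine { return &Engine{providers: map[string]Provider{}} }

func (e *Engine) Register(p Provider) { e.providers[p.Name()] = p }

func (e *Engine) List(path string) ([]string, error) {
	root, rest, _ := strings.Cut(strings.Trim(path, "/"), "/")
	p, ok := e.providers[root]
	if !ok {
		return nil, fmt.Errorf("no provider for path %q", path)
	}
	return p.List(rest)
}

// memoryDatasets is a stand-in provider backed by a map instead of HTTP calls.
type memoryDatasets struct{ docs map[string][]string }

func (m memoryDatasets) Name() string { return "datasets" }

func (m memoryDatasets) List(subPath string) ([]string, error) {
	if subPath == "" { // "/datasets": list dataset names
		names := make([]string, 0, len(m.docs))
		for name := range m.docs {
			names = append(names, name)
		}
		sort.Strings(names)
		return names, nil
	}
	docs, ok := m.docs[subPath] // "/datasets/{name}": list its documents
	if !ok {
		return nil, fmt.Errorf("dataset %q not found", subPath)
	}
	return docs, nil
}

func main() {
	engine := NewEngine()
	engine.Register(memoryDatasets{docs: map[string][]string{"kb1": {"a.pdf", "b.md"}}})
	fmt.Println(engine.List("/datasets"))
	fmt.Println(engine.List("/datasets/kb1"))
}
```

Because routing depends only on the first path segment, supporting another resource type in this sketch means registering one more provider, with no change to the engine.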
Supported Paths
| Path | Description |
|---|---|
| `/datasets` | List all datasets |
| `/datasets/{name}` | List documents in dataset (default behavior) |
| `/datasets/{name}/{doc}` | Get document info |
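As a rough illustration of how the three path shapes in the table could be told apart, the sketch below classifies a path by its number of segments. `PathKind`, `ResolvedPath`, and `resolvePath` are hypothetical names; the actual resolution logic lives in `filesystem/engine.go` and is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// PathKind says which row of the table a path corresponds to.
type PathKind int

const (
	KindDatasetList PathKind = iota // /datasets
	KindDataset                     // /datasets/{name}
	KindDocument                    // /datasets/{name}/{doc}
)

// ResolvedPath is a hypothetical result of path resolution.
type ResolvedPath struct {
	Kind     PathKind
	Dataset  string
	Document string
}

// resolvePath accepts paths with or without a leading slash.
func resolvePath(path string) (ResolvedPath, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	for _, part := range parts {
		if part == "" {
			return ResolvedPath{}, fmt.Errorf("empty segment in path %q", path)
		}
	}
	if parts[0] != "datasets" {
		return ResolvedPath{}, fmt.Errorf("unsupported path %q", path)
	}
	switch len(parts) {
	case 1:
		return ResolvedPath{Kind: KindDatasetList}, nil
	case 2:
		return ResolvedPath{Kind: KindDataset, Dataset: parts[1]}, nil
	case 3:
		return ResolvedPath{Kind: KindDocument, Dataset: parts[1], Document: parts[2]}, nil
	default:
		return ResolvedPath{}, fmt.Errorf("path %q is too deep", path)
	}
}

func main() {
	for _, p := range []string{"/datasets", "/datasets/kb1", "datasets/kb1/report.pdf"} {
		resolved, err := resolvePath(p)
		fmt.Printf("%-26s %+v %v\n", p, resolved, err)
	}
}
```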
Commands
ls [path] [options] - List nodes at path
List the contents of a path in the Virtual Filesystem.
Arguments:
- `[path]` - Path to list (default: "datasets")
Options:
- `-n, --limit <number>` - Maximum number of items to display (default: 10)
- `-h, --help` - Show ls help message
Examples:
ls # List all datasets (default 10)
ls -n 20 # List 20 datasets
ls datasets/kb1 # List files in kb1 dataset
ls datasets/kb1 -n 50 # List 50 files in kb1 dataset
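The argument handling for `ls` shown above could be sketched like this. `LsArgs` and `parseLsArgs` are hypothetical names, and rejecting limits below 1 is an assumption of this sketch, not documented behavior. The loop is written by hand because Go's standard `flag` package stops at the first positional argument, which would miss the `-n 50` in `ls datasets/kb1 -n 50`.

```go
package main

import (
	"fmt"
	"strconv"
)

// LsArgs is a hypothetical holder for parsed ls arguments.
type LsArgs struct {
	Path  string
	Limit int
	Help  bool
}

// parseLsArgs accepts options before or after the path.
func parseLsArgs(args []string) (LsArgs, error) {
	out := LsArgs{Path: "datasets", Limit: 10} // documented defaults
	for i := 0; i < len(args); i++ {
		switch arg := args[i]; arg {
		case "-h", "--help":
			out.Help = true
		case "-n", "--limit":
			if i+1 >= len(args) {
				return out, fmt.Errorf("%s requires a number", arg)
			}
			n, err := strconv.Atoi(args[i+1])
			if err != nil || n < 1 { // assumption: limit must be positive
				return out, fmt.Errorf("invalid limit %q", args[i+1])
			}
			out.Limit = n
			i++ // skip the value that was just consumed
		default:
			out.Path = arg
		}
	}
	return out, nil
}

func main() {
	for _, args := range [][]string{{}, {"-n", "20"}, {"datasets/kb1", "-n", "50"}} {
		parsed, err := parseLsArgs(args)
		fmt.Printf("%v => %+v %v\n", args, parsed, err)
	}
}
```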
search [options] - Search for content
Semantic search in datasets.
Options:
- `-n, --number` - Number of top results to return (default: 10)
Output Formats:
- Default: JSON format
- `--output plain` - Plain text format
- `--output table` - Table format with borders
Examples:
search "machine learning" # Search all datasets (JSON output)
search "neural networks" datasets/kb1 # Search in kb1
search "AI" datasets/kb1 --output plain # Plain text output
search "RAG" -n 20 # Return 20 results
SEARCH 'machine learning' ON DATASETS 'kb1' 'kb2'
SEARCH 'AI' ON DATASETS 'kb1' WITH top_k 1024 similarity_threshold 0.0 vector_similarity_weight 0.3 keyword true
SEARCH 'AI' ON DATASETS 'kb1' WITH cross_languages ['Chinese']
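The three output formats listed for `search` could be selected with a small dispatch function like the one below. This is a sketch under assumptions: the `Hit` type and its fields are hypothetical (the real search response has its own schema), and the exact layout of the plain and table formats in the CLI may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Hit is a hypothetical search result.
type Hit struct {
	Document string  `json:"document"`
	Score    float64 `json:"score"`
}

// render formats hits according to the --output value; "" means the default (JSON).
func render(hits []Hit, format string) (string, error) {
	switch format {
	case "", "json":
		data, err := json.MarshalIndent(hits, "", "  ")
		if err != nil {
			return "", err
		}
		return string(data) + "\n", nil
	case "plain":
		var b strings.Builder
		for _, h := range hits {
			fmt.Fprintf(&b, "%s (score %.2f)\n", h.Document, h.Score)
		}
		return b.String(), nil
	case "table":
		return renderTable(hits), nil
	default:
		return "", fmt.Errorf("unknown output format %q", format)
	}
}

// renderTable draws an ASCII table with borders. Column width is measured
// in bytes, so alignment is only exact for ASCII document names.
func renderTable(hits []Hit) string {
	width := len("Document")
	for _, h := range hits {
		if len(h.Document) > width {
			width = len(h.Document)
		}
	}
	border := "+" + strings.Repeat("-", width+2) + "+" + strings.Repeat("-", 7) + "+\n"
	var b strings.Builder
	b.WriteString(border)
	fmt.Fprintf(&b, "| %-*s | %5s |\n", width, "Document", "Score")
	b.WriteString(border)
	for _, h := range hits {
		fmt.Fprintf(&b, "| %-*s | %5.2f |\n", width, h.Document, h.Score)
	}
	b.WriteString(border)
	return b.String()
}

func main() {
	hits := []Hit{{Document: "intro.md", Score: 0.91}, {Document: "rag-overview.pdf", Score: 0.78}}
	for _, format := range []string{"json", "plain", "table"} {
		out, err := render(hits, format)
		fmt.Print(out)
		if err != nil {
			fmt.Println("error:", err)
		}
	}
}
```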
cat <path> - Display content
Display document content (if available).
Examples:
cat myskills/doc.md # Show content of doc.md file
cat datasets/kb1/document.pdf # Error: cannot display binary file content
Command Examples
-- Authentication
LOGIN USER 'admin@example.com';
-- User management
REGISTER USER 'john' AS 'John Doe' PASSWORD 'secret';
CREATE USER 'jane' 'password123';
DROP USER 'jane';
LIST USERS;
SHOW USER 'john';
-- Service management
LIST SERVICES;
SHOW SERVICE 1;
STARTUP SERVICE 1;
SHUTDOWN SERVICE 1;
RESTART SERVICE 1;
PING;
-- Role management
CREATE ROLE admin DESCRIPTION 'Administrator role';
LIST ROLES;
GRANT read,write ON datasets TO ROLE admin;
-- Dataset management
CREATE DATASET 'my_dataset' WITH EMBEDDING 'text-embedding-ada-002' PARSER 'naive';
LIST DATASETS;
DROP DATASET 'my_dataset';
-- Model configuration
SET DEFAULT LLM 'gpt-4';
SET DEFAULT EMBEDDING 'text-embedding-ada-002';
RESET DEFAULT LLM;
## Parser Implementation
The parser uses a hand-written recursive descent approach instead of go-yacc for the following reasons:
- Better control over error messages
- Easier extension and maintenance
- No code generation step
The parser structure follows the grammar defined in the Python version, ensuring full syntax compatibility.
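To make the recursive descent idea concrete, here is a small sketch covering only `LIST USERS;` and `DROP USER '<name>';`. It is not the code in `parser/`: every type and function name is hypothetical, the lexer is reduced to a regular expression for brevity (the real `lexer.go` is hand-written), and requiring the trailing semicolon is an assumption of this sketch. Each grammar rule maps to one method, which is where the control over error messages comes from.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

type TokenKind int
const (
	TokWord TokenKind = iota
	TokString
	TokSemicolon
	TokEOF
)

type Token struct {
	Kind TokenKind
	Text string
}

// Command is a simplified, hypothetical parse result.
type Command struct {
	Name string
	Args []string
}

var tokenRe = regexp.MustCompile(`'[^']*'|;|[^\s;']+`)

// lex uses a regular expression for brevity; an unmatched quote is skipped.
func lex(input string) []Token {
	var toks []Token
	for _, m := range tokenRe.FindAllString(input, -1) {
		switch {
		case m == ";":
			toks = append(toks, Token{TokSemicolon, m})
		case m[0] == '\'':
			toks = append(toks, Token{TokString, m[1 : len(m)-1]})
		default:
			toks = append(toks, Token{TokWord, strings.ToUpper(m)})
		}
	}
	return append(toks, Token{TokEOF, ""})
}

// parser holds the token stream and remembers the first error it hits.
type parser struct {
	toks []Token
	pos  int
	err  error
}

// expect consumes one token of the given kind (and text, when non-empty).
func (p *parser) expect(kind TokenKind, text string) Token {
	t := p.toks[p.pos]
	if p.err != nil {
		return t // already failed: consume nothing more
	}
	if t.Kind != kind || (text != "" && t.Text != text) {
		p.err = fmt.Errorf("unexpected token %q (token #%d)", t.Text, p.pos)
		return t
	}
	p.pos++
	return t
}

// parseCommand: command := ( "LIST" "USERS" | "DROP" "USER" string ) ";"
func (p *parser) parseCommand() (cmd Command) {
	switch first := p.expect(TokWord, ""); first.Text {
	case "LIST":
		p.expect(TokWord, "USERS")
		cmd = Command{Name: "list_users"}
	case "DROP":
		cmd = p.parseDropUser()
	default:
		if p.err == nil {
			p.err = fmt.Errorf("unknown command %q", first.Text)
		}
	}
	p.expect(TokSemicolon, "")
	return cmd
}

// parseDropUser: "USER" string (the leading "DROP" is already consumed).
func (p *parser) parseDropUser() Command {
	p.expect(TokWord, "USER")
	return Command{Name: "drop_user", Args: []string{p.expect(TokString, "").Text}}
}

// Parse lexes and parses one command; the Command is valid only if err is nil.
func Parse(input string) (Command, error) {
	p := &parser{toks: lex(input)}
	cmd := p.parseCommand()
	return cmd, p.err
}

func main() {
	fmt.Println(Parse("drop user 'jane';"))
}
```

Storing the first error on the parser keeps the rule methods free of repeated `if err != nil` checks; a production parser would more likely return errors from each rule and attach source positions.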