# RAGFlow Sandbox Multi-Provider Architecture - Design Specification

## 1. Overview

### 1.1 Goals

Enable RAGFlow to support multiple sandbox deployment modes:

- **Self-Managed**: On-premise deployment using Daytona/Docker (current implementation)
- **SaaS Providers**: Cloud-based sandbox services (Aliyun Code Interpreter, E2B)

### 1.2 Key Requirements

- Provider-agnostic interface for sandbox operations
- Admin-configurable provider settings with dynamic schema
- Multi-tenant isolation (1:1 session-to-sandbox mapping)
- Graceful fallback and error handling
- Unified monitoring and observability

## 2. Architecture Design

### 2.1 Provider Abstraction Layer

**Location**: `agent/sandbox/providers/`

Define a unified `SandboxProvider` interface:

```python
# agent/sandbox/providers/base.py
from abc import ABC, abstractmethod
from typing import Dict, Any, Optional
from dataclasses import dataclass

@dataclass
class SandboxInstance:
    instance_id: str
    provider: str
    status: str  # running, stopped, error
    metadata: Dict[str, Any]

@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    exit_code: int
    execution_time: float
    metadata: Dict[str, Any]

class SandboxProvider(ABC):
    """Base interface for all sandbox providers"""

    @abstractmethod
    def initialize(self, config: Dict[str, Any]) -> bool:
        """Initialize provider with configuration"""
        pass

    @abstractmethod
    def create_instance(self, template: str = "python") -> SandboxInstance:
        """Create a new sandbox instance"""
        pass

    @abstractmethod
    def execute_code(
        self,
        instance_id: str,
        code: str,
        language: str,
        timeout: int = 10
    ) -> ExecutionResult:
        """Execute code in the sandbox"""
        pass

    @abstractmethod
    def destroy_instance(self, instance_id: str) -> bool:
        """Destroy a sandbox instance"""
        pass

    @abstractmethod
    def health_check(self) -> bool:
        """Check if provider is healthy"""
        pass

    @abstractmethod
    def get_supported_languages(self) -> list[str]:
        """Get list of supported programming languages"""
        pass

    @staticmethod
    def get_config_schema() -> Dict[str, Dict]:
        """
        Return configuration schema for this provider.

        Returns a dictionary mapping field names to their schema definitions,
        including type, required status, validation rules, labels, and descriptions.
        """
        pass

    def validate_config(self, config: Dict[str, Any]) -> tuple[bool, Optional[str]]:
        """
        Validate provider-specific configuration.

        This method allows providers to implement custom validation logic beyond
        the basic schema validation. Override this method to add provider-specific
        checks like URL format validation, API key format validation, etc.

        Args:
            config: Configuration dictionary to validate

        Returns:
            Tuple of (is_valid, error_message):
            - is_valid: True if configuration is valid, False otherwise
            - error_message: Error message if invalid, None if valid
        """
        # Default implementation: no custom validation
        return True, None
```

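To make the contract concrete, the following is a minimal sketch of a provider that satisfies part of the interface. `InProcessProvider` is hypothetical and not part of RAGFlow; it runs Python in the current process with no isolation, so it is only useful as a test double. The base types are abbreviated copies of the definitions above (with defaulted `metadata`, an assumption made for brevity) so that the snippet runs on its own.

```python
import contextlib
import io
import time
import traceback
import uuid
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SandboxInstance:
    instance_id: str
    provider: str
    status: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    exit_code: int
    execution_time: float
    metadata: Dict[str, Any] = field(default_factory=dict)


class SandboxProvider(ABC):
    """Abbreviated copy of the interface (two methods only)."""

    @abstractmethod
    def create_instance(self, template: str = "python") -> SandboxInstance: ...

    @abstractmethod
    def execute_code(self, instance_id: str, code: str, language: str,
                     timeout: int = 10) -> ExecutionResult: ...


class InProcessProvider(SandboxProvider):
    """Hypothetical test double: runs Python in-process with NO isolation."""

    def __init__(self) -> None:
        self._instances: Dict[str, SandboxInstance] = {}

    def create_instance(self, template: str = "python") -> SandboxInstance:
        instance = SandboxInstance(uuid.uuid4().hex, "in_process", "running")
        self._instances[instance.instance_id] = instance
        return instance

    def execute_code(self, instance_id, code, language, timeout=10):
        if instance_id not in self._instances:
            raise KeyError(f"unknown instance: {instance_id}")
        if language != "python":
            raise ValueError(f"unsupported language: {language}")
        out, err = io.StringIO(), io.StringIO()
        exit_code = 0
        start = time.monotonic()
        try:
            # `timeout` is ignored here; a real provider must enforce it.
            with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
                exec(code, {"__name__": "__main__"})
        except Exception:
            err.write(traceback.format_exc())
            exit_code = 1
        elapsed = time.monotonic() - start
        return ExecutionResult(out.getvalue(), err.getvalue(), exit_code, elapsed)
```

Callers only touch `create_instance` and `execute_code`, so swapping this double for a real provider should not require changes on the calling side.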
### 2.2 Provider Implementations

#### 2.2.1 Self-Managed Provider
**File**: `agent/sandbox/providers/self_managed.py`

Wraps the existing executor_manager implementation.

**Prerequisites**:
- **gVisor (runsc)**: Required for secure container isolation. Install with:
  ```bash
  go install gvisor.dev/gvisor/runsc@latest
  sudo cp ~/go/bin/runsc /usr/local/bin/
  runsc --version
  ```
  Or download from: https://github.com/google/gvisor/releases
- **Docker**: Docker runtime with gVisor support
- **Base Images**: Pull sandbox base images:
  ```bash
  docker pull infiniflow/sandbox-base-python:latest
  docker pull infiniflow/sandbox-base-nodejs:latest
  ```

**Configuration**: Docker API endpoint, pool size, resource limits
- `endpoint`: HTTP endpoint (default: "http://localhost:9385")
- `timeout`: Request timeout in seconds (default: 30)
- `max_retries`: Maximum retry attempts (default: 3)
- `pool_size`: Container pool size (default: 10)

**Languages**: Python, Node.js, JavaScript

**Security**: gVisor (runsc runtime), seccomp, read-only filesystem, memory limits

**Advantages**:
- Low latency (<90ms), data privacy, full control
- No per-execution costs
- Supports `arguments` parameter for passing data to `main()` function

**Limitations**:
- Operational overhead, finite resources
- Requires gVisor installation for security
- Pool exhaustion causes "Container pool is busy" errors

**Common Issues**:
- **"Container pool is busy"**: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` (default: 1 in .env, should be 5+)
- **Container creation fails**: Ensure gVisor is installed and accessible at `/usr/local/bin/runsc`

#### 2.2.2 Aliyun Code Interpreter Provider
**File**: `agent/sandbox/providers/aliyun_codeinterpreter.py`

SaaS integration with Aliyun Function Compute Code Interpreter service using the official agentrun-sdk.

**Official Resources**:
- API Documentation: https://help.aliyun.com/zh/functioncompute/fc/sandbox-sandbox-code-interepreter
- Official SDK: https://github.com/Serverless-Devs/agentrun-sdk-python
- SDK Docs: https://docs.agent.run

**Implementation**:
- Uses official `agentrun-sdk` package
- SDK handles authentication (AccessKey signature) automatically
- Supports environment variable configuration
- Structured error handling with `ServerError` exceptions

**Configuration**:
- `access_key_id`: Aliyun AccessKey ID
- `access_key_secret`: Aliyun AccessKey Secret
- `account_id`: Aliyun primary account ID (主账号ID) - Required for API calls
- `region`: Region (cn-hangzhou, cn-beijing, cn-shanghai, cn-shenzhen, cn-guangzhou)
- `template_name`: Optional sandbox template name for pre-configured environments
- `timeout`: Execution timeout (max 30 seconds - hard limit)

**Languages**: Python, JavaScript

**Security**: Serverless microVM isolation, 30-second hard timeout limit

**Advantages**:
- Official SDK with automatic signature handling
- Unlimited scalability, no maintenance
- China region support with low latency
- Built-in file system management
- Support for execution contexts (Jupyter kernel)
- Context-based execution for state persistence

**Limitations**:
- Network dependency
- 30-second execution time limit (hard limit)
- Pay-as-you-go costs
- Requires Aliyun primary account ID for API calls

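Because the 30-second limit is enforced by the service, a provider wrapper may want to clamp caller-supplied timeouts before making a request. The helper below is a small sketch of that idea; `clamp_timeout` and the constant name are illustrative and do not come from the agentrun SDK or RAGFlow.

```python
# Hard limit stated in the provider description above.
ALIYUN_MAX_TIMEOUT_SECONDS = 30


def clamp_timeout(requested: int, hard_limit: int = ALIYUN_MAX_TIMEOUT_SECONDS) -> int:
    """Return a timeout that never exceeds the provider's hard limit."""
    if requested < 1:
        raise ValueError("timeout must be at least 1 second")
    return min(requested, hard_limit)
```

Clamping silently is one option; rejecting out-of-range values with an error is an equally reasonable design if callers should be told that their timeout was not honoured.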
**Setup Instructions - Creating a RAM User with Minimal Privileges**:

⚠️ **Security Warning**: Never use your Aliyun primary account (root account) AccessKey for SDK operations. Primary accounts have full resource permissions, and leaked credentials pose significant security risks.

**Step 1: Create a RAM User**

1. Log in to [RAM Console](https://ram.console.aliyun.com/)
2. Navigate to **People** → **Users**
3. Click **Create User**
4. Configure the user:
   - **Username**: e.g., `ragflow-sandbox-user`
   - **Display Name**: e.g., `RAGFlow Sandbox Service Account`
   - **Access Mode**: Check ✅ **OpenAPI/Programmatic Access** (this creates an AccessKey)
   - **Console Login**: Optional (not needed for SDK-only access)
5. Click **OK** and save the AccessKey ID and Secret immediately (displayed only once!)

**Step 2: Create a Custom Authorization Policy**

Navigate to **Permissions** → **Policies** → **Create Policy** → **Custom Policy** → **Configuration Script (JSON)**

Choose one of the following policy options based on your security requirements:

**Option A: Minimal Privilege Policy (Recommended)**

Grants only the permissions required by the AgentRun SDK:

```json
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "agentrun:CreateTemplate",
        "agentrun:GetTemplate",
        "agentrun:UpdateTemplate",
        "agentrun:DeleteTemplate",
        "agentrun:ListTemplates",
        "agentrun:CreateSandbox",
        "agentrun:GetSandbox",
        "agentrun:DeleteSandbox",
        "agentrun:StopSandbox",
        "agentrun:ListSandboxes",
        "agentrun:CreateContext",
        "agentrun:ExecuteCode",
        "agentrun:DeleteContext",
        "agentrun:ListContexts",
        "agentrun:CreateFile",
        "agentrun:GetFile",
        "agentrun:DeleteFile",
        "agentrun:ListFiles",
        "agentrun:CreateProcess",
        "agentrun:GetProcess",
        "agentrun:KillProcess",
        "agentrun:ListProcesses",
        "agentrun:CreateRecording",
        "agentrun:GetRecording",
        "agentrun:DeleteRecording",
        "agentrun:ListRecordings",
        "agentrun:CheckHealth"
      ],
      "Resource": [
        "acs:agentrun:*:{account_id}:template/*",
        "acs:agentrun:*:{account_id}:sandbox/*"
      ]
    }
  ]
}
```

> Replace `{account_id}` with your Aliyun primary account ID

**Option B: Resource-Level Privilege Control (Most Secure)**

Limits access to specific resource prefixes:

```json
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "agentrun:CreateTemplate",
        "agentrun:GetTemplate",
        "agentrun:UpdateTemplate",
        "agentrun:DeleteTemplate",
        "agentrun:ListTemplates"
      ],
      "Resource": "acs:agentrun:*:{account_id}:template/ragflow-*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "agentrun:CreateSandbox",
        "agentrun:GetSandbox",
        "agentrun:DeleteSandbox",
        "agentrun:StopSandbox",
        "agentrun:ListSandboxes",
        "agentrun:CheckHealth"
      ],
      "Resource": "acs:agentrun:*:{account_id}:sandbox/*"
    },
    {
      "Effect": "Allow",
      "Action": ["agentrun:*"],
      "Resource": "acs:agentrun:*:{account_id}:sandbox/*/context/*"
    },
    {
      "Effect": "Allow",
      "Action": ["agentrun:*"],
      "Resource": "acs:agentrun:*:{account_id}:sandbox/*/file/*"
    },
    {
      "Effect": "Allow",
      "Action": ["agentrun:*"],
      "Resource": "acs:agentrun:*:{account_id}:sandbox/*/process/*"
    },
    {
      "Effect": "Allow",
      "Action": ["agentrun:*"],
      "Resource": "acs:agentrun:*:{account_id}:sandbox/*/recording/*"
    }
  ]
}
```

> This restricts template operations to templates whose names start with `ragflow-`. As in Option A, replace `{account_id}` with your Aliyun primary account ID.

**Option C: Full Access (Not Recommended for Production)**

```json
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "agentrun:*",
      "Resource": "*"
    }
  ]
}
```

**Step 3: Authorize the RAM User**

1. Return to **Users** list
2. Find the user you just created (e.g., `ragflow-sandbox-user`)
3. Click **Add Permissions** in the Actions column
4. In the **Custom Policy** tab, select the policy you created in Step 2
5. Click **OK**

**Step 4: Configure RAGFlow with the RAM User Credentials**

After creating the RAM user and obtaining the AccessKey, configure it in RAGFlow's admin settings or environment variables:

```bash
# Method 1: Environment variables (for development/testing)
export AGENTRUN_ACCESS_KEY_ID="LTAI5t..."      # RAM user's AccessKey ID
export AGENTRUN_ACCESS_KEY_SECRET="xxx..."     # RAM user's AccessKey Secret
export AGENTRUN_ACCOUNT_ID="123456789..."      # Your primary account ID
export AGENTRUN_REGION="cn-hangzhou"
```

Or via Admin UI (recommended for production):

1. Navigate to **Admin Settings** → **Sandbox Providers**
2. Select **Aliyun Code Interpreter** provider
3. Fill in the configuration:
   - `access_key_id`: RAM user's AccessKey ID
   - `access_key_secret`: RAM user's AccessKey Secret
   - `account_id`: Your primary account ID
   - `region`: e.g., `cn-hangzhou`

**Step 5: Verify Permissions**

Test if the RAM user permissions are correctly configured:

```python
from agentrun.sandbox import Sandbox, TemplateInput, TemplateType

try:
    # Test template creation
    template = Sandbox.create_template(
        input=TemplateInput(
            template_name="ragflow-permission-test",
            template_type=TemplateType.CODE_INTERPRETER
        )
    )
    print("✅ RAM user permissions are correctly configured")
except Exception as e:
    print(f"❌ Permission test failed: {e}")
finally:
    # Cleanup test resources (best effort: the template may not exist)
    try:
        Sandbox.delete_template("ragflow-permission-test")
    except Exception:
        pass
```

**Security Best Practices**:

1. ✅ **Always use RAM user AccessKeys**, never primary account AccessKeys
2. ✅ **Follow the principle of least privilege** - grant only necessary permissions
3. ✅ **Rotate AccessKeys regularly** - recommend every 3-6 months
4. ✅ **Enable MFA** - enable multi-factor authentication for RAM users
5. ✅ **Use secure storage** - store credentials in environment variables or secret management services, never hardcode them in code
6. ✅ **Restrict IP access** - add IP whitelist policies for RAM users if needed
7. ✅ **Monitor access logs** - regularly check RAM user access logs in ActionTrail

**Reference Links**:
- [Aliyun RAM Documentation](https://help.aliyun.com/product/28625.html)
- [RAM Policy Language](https://help.aliyun.com/document_detail/100676.html)
- [AgentRun Official Documentation](https://docs.agent.run)
- [AgentRun SDK GitHub](https://github.com/Serverless-Devs/agentrun-sdk-python)

#### 2.2.3 E2B Provider
**File**: `agent/sandbox/providers/e2b.py`

SaaS integration with E2B Cloud.
- **Configuration**: api_key, region (us/eu)
- **Languages**: Python, JavaScript, Go, Bash, etc.
- **Security**: Firecracker microVMs
- **Advantages**: Global CDN, fast startup, multiple language support
- **Limitations**: International network latency for China users

### 2.3 Provider Management

**File**: `agent/sandbox/providers/manager.py`

Since we only use one active provider at a time (configured globally), the provider management is simplified:

```python
class ProviderManager:
    """Manages the currently active sandbox provider"""

    def __init__(self):
        self.current_provider: Optional[SandboxProvider] = None
        self.current_provider_name: Optional[str] = None

    def set_provider(self, name: str, provider: SandboxProvider):
        """Set the active provider"""
        self.current_provider = provider
        self.current_provider_name = name

    def get_provider(self) -> Optional[SandboxProvider]:
        """Get the active provider"""
        return self.current_provider

    def get_provider_name(self) -> Optional[str]:
        """Get the active provider name"""
        return self.current_provider_name
```

**Rationale**: With global configuration, there's only one active provider at a time. The provider manager simply holds a reference to the currently active provider, making it a thin wrapper rather than a complex multi-provider manager.

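One way the manager could be populated at startup is sketched below. `activate_from_settings`, the plain-dict `settings` argument, and the `factories` mapping are all hypothetical; the sketch only assumes that the active provider name lives under `sandbox.provider_type` and each provider's JSON config under `sandbox.<name>`, as described in Section 3.1. The `ProviderManager` here is an abbreviated copy so that the snippet runs on its own.

```python
import json
from typing import Any, Callable, Dict, Optional


class ProviderManager:
    """Abbreviated copy of the manager above."""

    def __init__(self) -> None:
        self.current_provider: Optional[Any] = None
        self.current_provider_name: Optional[str] = None

    def set_provider(self, name: str, provider: Any) -> None:
        self.current_provider = provider
        self.current_provider_name = name


def activate_from_settings(
    manager: ProviderManager,
    settings: Dict[str, str],
    factories: Dict[str, Callable[[Dict[str, Any]], Any]],
) -> str:
    """Build the provider named in the settings and make it the active one."""
    name = settings.get("sandbox.provider_type")
    if name not in factories:
        raise ValueError(f"unknown or missing sandbox provider: {name!r}")
    # A provider with no stored config falls back to an empty dict.
    config = json.loads(settings.get(f"sandbox.{name}") or "{}")
    manager.set_provider(name, factories[name](config))
    return name
```

Passing factories in, rather than importing provider classes inside the function, keeps the sketch independent of any particular SDK being installed.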
## 3. Admin Configuration

### 3.1 Database Schema

Use the existing **SystemSettings** table for global sandbox configuration:

```python
# In api/db/db_models.py

class SystemSettings(DataBaseModel):
    name = CharField(max_length=128, primary_key=True)
    source = CharField(max_length=32, null=False, index=False)
    data_type = CharField(max_length=32, null=False, index=False)
    value = CharField(max_length=1024, null=False, index=False)
```

**Rationale**: Sandbox manager is a **system-level service** shared by all tenants:
- No per-tenant configuration needed (unlike LLM providers where each tenant has their own API keys)
- Comparable to other global settings such as system email and DOC_ENGINE
- Managed by administrators only
- Leverages existing `SettingsMgr` in admin interface

**Storage Strategy**: Each provider's configuration is stored as a **single JSON object**:
- `sandbox.provider_type` - Active provider selection ("self_managed", "aliyun_codeinterpreter", "e2b")
- `sandbox.self_managed` - JSON config for self-managed provider
- `sandbox.aliyun_codeinterpreter` - JSON config for Aliyun Code Interpreter provider
- `sandbox.e2b` - JSON config for E2B provider

**Note**: The `value` field has a 1024 character limit, which should be sufficient for typical sandbox configurations. If larger configs are needed, consider using a TextField or a separate configuration table.

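Since an oversized value would either be rejected or truncated depending on the database, it may be worth checking the serialized length before saving. The helper below is a sketch under that assumption; `serialize_config` and `MAX_VALUE_CHARS` are illustrative names, not existing RAGFlow code.

```python
import json
from typing import Any, Dict

# Mirrors CharField(max_length=1024) on SystemSettings.value.
MAX_VALUE_CHARS = 1024


def serialize_config(config: Dict[str, Any], limit: int = MAX_VALUE_CHARS) -> str:
    """Serialize a provider config compactly and refuse values that will not fit."""
    payload = json.dumps(config, separators=(",", ":"))
    if len(payload) > limit:
        raise ValueError(
            f"serialized config is {len(payload)} characters, limit is {limit}"
        )
    return payload
```

Compact separators save a few characters per field, which matters little for the example configs in this document but costs nothing.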
### 3.2 Configuration Schema

Each provider's configuration is stored as a **single JSON object** in the `value` field:

#### Self-Managed Provider
```json
{
  "name": "sandbox.self_managed",
  "source": "variable",
  "data_type": "json",
  "value": "{\"endpoint\": \"http://localhost:9385\", \"pool_size\": 10, \"max_memory\": \"256m\", \"timeout\": 30}"
}
```

#### Aliyun Code Interpreter
```json
{
  "name": "sandbox.aliyun_codeinterpreter",
  "source": "variable",
  "data_type": "json",
  "value": "{\"access_key_id\": \"LTAI5t...\", \"access_key_secret\": \"xxxxx\", \"account_id\": \"1234567890...\", \"region\": \"cn-hangzhou\", \"timeout\": 30}"
}
```

#### E2B
```json
{
  "name": "sandbox.e2b",
  "source": "variable",
  "data_type": "json",
  "value": "{\"api_key\": \"e2b_sk_...\", \"region\": \"us\", \"timeout\": 30}"
}
```

#### Active Provider Selection
```json
{
  "name": "sandbox.provider_type",
  "source": "variable",
  "data_type": "string",
  "value": "self_managed"
}
```

### 3.3 Provider Self-Describing Schema

Each provider class implements a static method to describe its configuration schema:

```python
# agent/sandbox/providers/base.py

class SandboxProvider(ABC):
    """Base interface for all sandbox providers"""

    @abstractmethod
    def initialize(self, config: Dict[str, Any]) -> bool:
        """Initialize provider with configuration"""
        pass

    @abstractmethod
    def create_instance(self, template: str = "python") -> SandboxInstance:
        """Create a new sandbox instance"""
        pass

    @abstractmethod
    def execute_code(
        self,
        instance_id: str,
        code: str,
        language: str,
        timeout: int = 10
    ) -> ExecutionResult:
        """Execute code in the sandbox"""
        pass

    @abstractmethod
    def destroy_instance(self, instance_id: str) -> bool:
        """Destroy a sandbox instance"""
        pass

    @abstractmethod
    def health_check(self) -> bool:
        """Check if provider is healthy"""
        pass

    @abstractmethod
    def get_supported_languages(self) -> list[str]:
        """Get list of supported programming languages"""
        pass

    @staticmethod
    def get_config_schema() -> Dict[str, Dict]:
        """Return configuration schema for this provider"""
        return {}
```

**Example Implementation**:

```python
# agent/sandbox/providers/self_managed.py

class SelfManagedProvider(SandboxProvider):
    @staticmethod
    def get_config_schema() -> Dict[str, Dict]:
        return {
            "endpoint": {
                "type": "string",
                "required": True,
                "label": "API Endpoint",
                "placeholder": "http://localhost:9385"
            },
            "pool_size": {
                "type": "integer",
                "default": 10,
                "label": "Container Pool Size",
                "min": 1,
                "max": 100
            },
            "max_memory": {
                "type": "string",
                "default": "256m",
                "label": "Max Memory per Container",
                "options": ["128m", "256m", "512m", "1g"]
            },
            "timeout": {
                "type": "integer",
                "default": 30,
                "label": "Execution Timeout (seconds)",
                "min": 5,
                "max": 300
            }
        }

# agent/sandbox/providers/aliyun_codeinterpreter.py

class AliyunCodeInterpreterProvider(SandboxProvider):
    @staticmethod
    def get_config_schema() -> Dict[str, Dict]:
        return {
            "access_key_id": {
                "type": "string",
                "required": True,
                "secret": True,
                "label": "Access Key ID",
                "description": "Aliyun AccessKey ID for authentication"
            },
            "access_key_secret": {
                "type": "string",
                "required": True,
                "secret": True,
                "label": "Access Key Secret",
                "description": "Aliyun AccessKey Secret for authentication"
            },
            "account_id": {
                "type": "string",
                "required": True,
                "label": "Account ID",
                "description": "Aliyun primary account ID (主账号ID), required for API calls"
            },
            "region": {
                "type": "string",
                "default": "cn-hangzhou",
                "label": "Region",
                "options": ["cn-hangzhou", "cn-beijing", "cn-shanghai", "cn-shenzhen", "cn-guangzhou"],
                "description": "Aliyun region for Code Interpreter service"
            },
            "template_name": {
                "type": "string",
                "required": False,
                "label": "Template Name",
                "description": "Optional sandbox template name for pre-configured environments"
            },
            "timeout": {
                "type": "integer",
                "default": 30,
                "label": "Execution Timeout (seconds)",
                "min": 1,
                "max": 30,
                "description": "Code execution timeout (max 30 seconds - hard limit)"
            }
        }

# agent/sandbox/providers/e2b.py

class E2BProvider(SandboxProvider):
    @staticmethod
    def get_config_schema() -> Dict[str, Dict]:
        return {
            "api_key": {
                "type": "string",
                "required": True,
                "secret": True,
                "label": "API Key"
            },
            "region": {
                "type": "string",
                "default": "us",
                "label": "Region",
                "options": ["us", "eu"]
            },
            "timeout": {
                "type": "integer",
                "default": 30,
                "label": "Execution Timeout (seconds)",
                "min": 5,
                "max": 300
            }
        }
```

**Benefits of Self-Describing Providers**:
- Single source of truth - schema defined alongside implementation
- Easy to add new providers - no central registry to update
- Type safety - schema stays in sync with provider code
- Flexible - frontend can use schema for validation or hardcode if preferred

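The `secret` flag in these schemas also makes it possible to redact credentials whenever a stored configuration is sent back to the admin UI. The function below is a minimal sketch of that idea; `mask_secrets` and the mask string are illustrative and not part of the design above.

```python
from typing import Any, Dict


def mask_secrets(
    config: Dict[str, Any],
    schema: Dict[str, Dict[str, Any]],
    mask: str = "********",
) -> Dict[str, Any]:
    """Return a copy of `config` with non-empty secret fields replaced by `mask`."""
    masked = {}
    for key, value in config.items():
        is_secret = schema.get(key, {}).get("secret", False)
        # Empty secrets are left as-is so the UI can show "not set".
        masked[key] = mask if (is_secret and value) else value
    return masked
```

If masking is adopted, the save path also has to recognise the mask string and keep the stored secret instead of overwriting it with asterisks.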
### 3.4 Admin API Endpoints
|
|||
|
|
|
|||
|
|
Follow existing pattern in `admin/server/routes.py` and use `SettingsMgr`:
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# admin/server/routes.py (add new endpoints)
|
|||
|
|
|
|||
|
|
from flask import request, jsonify
|
|||
|
|
import json
|
|||
|
|
from api.db.services.system_settings_service import SystemSettingsService
|
|||
|
|
from agent.agent.sandbox.providers.self_managed import SelfManagedProvider
|
|||
|
|
from agent.agent.sandbox.providers.aliyun_codeinterpreter import AliyunCodeInterpreterProvider
|
|||
|
|
from agent.agent.sandbox.providers.e2b import E2BProvider
|
|||
|
|
from admin.server.services import SettingsMgr
|
|||
|
|
|
|||
|
|
# Map provider IDs to their classes
|
|||
|
|
PROVIDER_CLASSES = {
|
|||
|
|
"self_managed": SelfManagedProvider,
|
|||
|
|
"aliyun_codeinterpreter": AliyunCodeInterpreterProvider,
|
|||
|
|
"e2b": E2BProvider,
|
|||
|
|
}
|
|||
|
|
|
|||
|
|
@admin_bp.route('/api/admin/sandbox/providers', methods=['GET'])
|
|||
|
|
def list_sandbox_providers():
|
|||
|
|
"""List available sandbox providers with their schemas"""
|
|||
|
|
providers = []
|
|||
|
|
for provider_id, provider_class in PROVIDER_CLASSES.items():
|
|||
|
|
schema = provider_class.get_config_schema()
|
|||
|
|
providers.append({
|
|||
|
|
"id": provider_id,
|
|||
|
|
"name": provider_id.replace("_", " ").title(),
|
|||
|
|
"config_schema": schema
|
|||
|
|
})
|
|||
|
|
return jsonify({"data": providers})
|
|||
|
|
|
|||
|
|
@admin_bp.route('/api/admin/sandbox/config', methods=['GET'])
|
|||
|
|
def get_sandbox_config():
|
|||
|
|
"""Get current sandbox configuration"""
|
|||
|
|
# Get active provider
|
|||
|
|
active_provider_setting = SystemSettingsService.get_by_name("sandbox.provider_type")
|
|||
|
|
active_provider = active_provider_setting[0].value if active_provider_setting else None
|
|||
|
|
|
|||
|
|
config = {"active": active_provider}
|
|||
|
|
|
|||
|
|
# Load all provider configs
|
|||
|
|
for provider_id in PROVIDER_CLASSES.keys():
|
|||
|
|
setting = SystemSettingsService.get_by_name(f"sandbox.{provider_id}")
|
|||
|
|
if setting:
|
|||
|
|
try:
|
|||
|
|
config[provider_id] = json.loads(setting[0].value)
|
|||
|
|
except json.JSONDecodeError:
|
|||
|
|
config[provider_id] = {}
|
|||
|
|
else:
|
|||
|
|
# Return default values from schema
|
|||
|
|
provider_class = PROVIDER_CLASSES[provider_id]
|
|||
|
|
schema = provider_class.get_config_schema()
|
|||
|
|
config[provider_id] = {
|
|||
|
|
key: field_def.get("default", "")
|
|||
|
|
for key, field_def in schema.items()
|
|||
|
|
}
|
|||
|
|
|
|||
|
|
return jsonify({"data": config})
|
|||
|
|
|
|||
|
|
@admin_bp.route('/api/admin/sandbox/config', methods=['POST'])
|
|||
|
|
def set_sandbox_config():
|
|||
|
|
"""
|
|||
|
|
Update sandbox provider configuration.
|
|||
|
|
|
|||
|
|
Request Parameters:
|
|||
|
|
- provider_type: Provider identifier (e.g., "self_managed", "e2b")
|
|||
|
|
- config: Provider configuration dictionary
|
|||
|
|
- set_active: (optional) If True, also set this provider as active.
|
|||
|
|
Default: True for backward compatibility.
|
|||
|
|
Set to False to update config without switching providers.
|
|||
|
|
- test_connection: (optional) If True, test connection before saving
|
|||
|
|
|
|||
|
|
Response: Success message
|
|||
|
|
"""
|
|||
|
|
req = request.json
|
|||
|
|
provider_type = req.get('provider_type')
|
|||
|
|
config = req.get('config')
|
|||
|
|
set_active = req.get('set_active', True) # Default to True
|
|||
|
|
|
|||
|
|
# Validate provider exists
|
|||
|
|
if provider_type not in PROVIDER_CLASSES:
|
|||
|
|
return jsonify({"error": "Unknown provider"}), 400
|
|||
|
|
|
|||
|
|
# Validate configuration against schema
|
|||
|
|
provider_class = PROVIDER_CLASSES[provider_type]
|
|||
|
|
schema = provider_class.get_config_schema()
|
|||
|
|
validation_result = validate_config(config, schema)
|
|||
|
|
if not validation_result.valid:
|
|||
|
|
return jsonify({"error": "Invalid config", "details": validation_result.errors}), 400
|
|||
|
|
|
|||
|
|
# Test connection if requested
|
|||
|
|
if req.get('test_connection'):
|
|||
|
|
test_result = test_provider_connection(provider_type, config)
|
|||
|
|
if not test_result.success:
|
|||
|
|
return jsonify({"error": "Connection failed", "details": test_result.error}), 400
|
|||
|
|
|
|||
|
|
# Store entire config as a single JSON record
|
|||
|
|
config_json = json.dumps(config)
|
|||
|
|
setting_name = f"sandbox.{provider_type}"
|
|||
|
|
|
|||
|
|
existing = SystemSettingsService.get_by_name(setting_name)
|
|||
|
|
if existing:
|
|||
|
|
SettingsMgr.update_by_name(setting_name, config_json)
|
|||
|
|
else:
|
|||
|
|
SystemSettingsService.save(
|
|||
|
|
name=setting_name,
|
|||
|
|
source="variable",
|
|||
|
|
data_type="json",
|
|||
|
|
value=config_json
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
# Set as active provider if requested (default: True)
|
|||
|
|
if set_active:
|
|||
|
|
SettingsMgr.update_by_name("sandbox.provider_type", provider_type)
|
|||
|
|
|
|||
|
|
return jsonify({"message": "Configuration saved"})
|
|||
|
|
|
|||
|
|
@admin_bp.route('/api/admin/sandbox/test', methods=['POST'])
def test_sandbox_connection():
    """Test connection to sandbox provider"""
    provider_type = request.json.get('provider_type')
    config = request.json.get('config')

    # Reject unknown providers before attempting a connection
    if provider_type not in PROVIDER_CLASSES:
        return jsonify({"error": "Unknown provider"}), 400

    test_result = test_provider_connection(provider_type, config)
    return jsonify({
        "success": test_result.success,
        "message": test_result.message,
        "latency_ms": test_result.latency_ms
    })

@admin_bp.route('/api/admin/sandbox/active', methods=['PUT'])
def set_active_sandbox_provider():
    """Set active sandbox provider"""
    provider_name = request.json.get('provider')

    if provider_name not in PROVIDER_CLASSES:
        return jsonify({"error": "Unknown provider"}), 400

    # Check if provider is configured
    provider_setting = SystemSettingsService.get_by_name(f"sandbox.{provider_name}")
    if not provider_setting:
        return jsonify({"error": "Provider not configured"}), 400

    SettingsMgr.update_by_name("sandbox.provider_type", provider_name)
    return jsonify({"message": "Active provider updated"})
```

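
The admin endpoints call `validate_config(config, schema)` and read `.valid` and `.errors` from its result, but the helper itself is not shown. The following is a minimal sketch of what such a validator might look like, assuming the schema fields used elsewhere in this document (`type`, `required`, `min`, `max`, `options`). The `ValidationResult` class and the error message wording are assumptions for illustration, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Maps schema type names to Python types (assumed to mirror ConfigField.type)
_TYPES = {"string": str, "integer": int, "boolean": bool}


@dataclass
class ValidationResult:
    """Hypothetical result object exposing the .valid / .errors fields used above."""
    valid: bool
    errors: List[str] = field(default_factory=list)


def validate_config(config: Optional[Dict[str, Any]],
                    schema: Dict[str, Dict[str, Any]]) -> ValidationResult:
    errors: List[str] = []
    config = config or {}
    for key, spec in schema.items():
        value = config.get(key)
        if value is None or value == "":
            if spec.get("required"):
                errors.append(f"{key}: required field is missing")
            continue
        expected = _TYPES.get(spec.get("type", "string"), str)
        # bool is a subclass of int in Python, so reject it explicitly for integers
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool))
        if wrong_type:
            errors.append(f"{key}: expected {spec.get('type', 'string')}")
            continue
        if expected is int:
            if "min" in spec and value < spec["min"]:
                errors.append(f"{key}: must be >= {spec['min']}")
            if "max" in spec and value > spec["max"]:
                errors.append(f"{key}: must be <= {spec['max']}")
        if "options" in spec and value not in spec["options"]:
            errors.append(f"{key}: must be one of {spec['options']}")
    return ValidationResult(valid=not errors, errors=errors)
```

Collecting every error instead of stopping at the first one lets the admin UI highlight all invalid fields in a single round trip.
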
## 4. Frontend Integration

### 4.1 Admin Settings UI

**Location**: `web/src/pages/SandboxSettings/index.tsx`

```typescript
import React, { useState } from 'react';
import { Form, Select, Input, InputNumber, Button, Card, Space, Tag, message } from 'antd';
import { listSandboxProviders, getSandboxConfig, setSandboxConfig, testSandboxConnection } from '@/utils/api';

// Note: `Icon` is assumed to be a project-level icon component; its import is
// omitted here, as is the initial loading of providers and saved configs via
// listSandboxProviders() and getSandboxConfig().
const { Option } = Select;

const SandboxSettings: React.FC = () => {
  const [form] = Form.useForm();
  const [providers, setProviders] = useState<Provider[]>([]);
  const [configs, setConfigs] = useState<Config[]>([]);
  const [selectedProvider, setSelectedProvider] = useState<string>('');
  const [testing, setTesting] = useState(false);

  const providerSchema = providers.find(p => p.id === selectedProvider);

  // Minimal handler sketch: save the form values for the selected provider
  const handleSave = async () => {
    const config = await form.validateFields();
    await setSandboxConfig({ provider_type: selectedProvider, config });
    message.success('Configuration saved');
  };

  // Minimal handler sketch: test the connection with the current form values
  const handleTest = async () => {
    const config = await form.validateFields();
    setTesting(true);
    try {
      await testSandboxConnection(selectedProvider, config);
    } finally {
      setTesting(false);
    }
  };

  const renderConfigForm = () => {
    if (!providerSchema) return null;

    return (
      <Form form={form} layout="vertical">
        {Object.entries(providerSchema.config_schema).map(([key, schema]) => (
          <Form.Item
            key={key}
            name={key}
            label={schema.label}
            rules={[{ required: schema.required }]}
          >
            {schema.secret ? (
              <Input.Password placeholder={schema.placeholder} />
            ) : schema.type === 'integer' ? (
              <InputNumber min={schema.min} max={schema.max} />
            ) : schema.options ? (
              <Select>
                {schema.options.map((opt: string) => (
                  <Option key={opt} value={opt}>{opt}</Option>
                ))}
              </Select>
            ) : (
              <Input placeholder={schema.placeholder} />
            )}
          </Form.Item>
        ))}
      </Form>
    );
  };

  return (
    <Card title="Sandbox Provider Configuration">
      <Space direction="vertical" style={{ width: '100%' }}>
        {/* Provider Selection */}
        <Form.Item label="Select Provider">
          <Select
            style={{ width: '100%' }}
            onChange={setSelectedProvider}
            value={selectedProvider}
          >
            {providers.map(provider => (
              <Option key={provider.id} value={provider.id}>
                <Space>
                  <Icon type={provider.icon} />
                  {provider.name}
                  {provider.tags.map(tag => (
                    <Tag key={tag}>{tag}</Tag>
                  ))}
                </Space>
              </Option>
            ))}
          </Select>
        </Form.Item>

        {/* Dynamic Configuration Form */}
        {renderConfigForm()}

        {/* Actions */}
        <Space>
          <Button type="primary" onClick={handleSave}>
            Save Configuration
          </Button>
          <Button onClick={handleTest} loading={testing}>
            Test Connection
          </Button>
        </Space>
      </Space>
    </Card>
  );
};
```

### 4.2 API Client

**File**: `web/src/utils/api.ts`

```typescript
export async function listSandboxProviders() {
  return request<{ data: Provider[] }>('/api/admin/sandbox/providers');
}

export async function getSandboxConfig() {
  return request<{ data: SandboxConfig }>('/api/admin/sandbox/config');
}

export async function setSandboxConfig(config: SandboxConfigRequest) {
  return request('/api/admin/sandbox/config', {
    method: 'POST',
    data: config,
  });
}

export async function testSandboxConnection(provider: string, config: any) {
  return request('/api/admin/sandbox/test', {
    method: 'POST',
    // The test endpoint reads `provider_type` from the request body
    data: { provider_type: provider, config },
  });
}

export async function setActiveSandboxProvider(provider: string) {
  return request('/api/admin/sandbox/active', {
    method: 'PUT',
    data: { provider },
  });
}
```

### 4.3 Type Definitions

**File**: `web/src/types/sandbox.ts`

```typescript
interface Provider {
  id: string;
  name: string;
  description: string;
  icon: string;
  tags: string[];
  config_schema: Record<string, ConfigField>;
  supported_languages: string[];
}

interface ConfigField {
  type: 'string' | 'integer' | 'boolean';
  required: boolean;
  secret?: boolean;
  label: string;
  placeholder?: string;
  default?: any;
  options?: string[];
  min?: number;
  max?: number;
}

// Configuration response grouped by provider.
// Values may be strings, numbers, or booleans (e.g. pool_size, enable_seccomp).
interface SandboxConfig {
  active: string;  // Currently active provider
  self_managed?: Record<string, string | number | boolean>;
  aliyun_codeinterpreter?: Record<string, string | number | boolean>;
  e2b?: Record<string, string | number | boolean>;
  // Add more providers as needed
}

// Request to update provider configuration
interface SandboxConfigRequest {
  provider_type: string;
  config: Record<string, string | number | boolean>;
  test_connection?: boolean;
  set_active?: boolean;
}
```

## 5. Integration with Agent System

### 5.1 Agent Component Usage

The agent system uses the sandbox through the simplified provider manager, which loads the global configuration from SystemSettings:

```python
# In agent/components/code_executor.py

import json
from agent.sandbox.providers.base import ExecutionResult
from agent.sandbox.providers.manager import ProviderManager
from agent.sandbox.providers.self_managed import SelfManagedProvider
from agent.sandbox.providers.aliyun_codeinterpreter import AliyunCodeInterpreterProvider
from agent.sandbox.providers.e2b import E2BProvider
from api.db.services.system_settings_service import SystemSettingsService

# Map provider IDs to their classes
PROVIDER_CLASSES = {
    "self_managed": SelfManagedProvider,
    "aliyun_codeinterpreter": AliyunCodeInterpreterProvider,
    "e2b": E2BProvider,
}

class CodeExecutorComponent:
    def __init__(self):
        self.provider_manager = ProviderManager()
        self._load_active_provider()

    def _load_active_provider(self):
        """Load the active provider from system settings"""
        # Get active provider
        active_setting = SystemSettingsService.get_by_name("sandbox.provider_type")
        if not active_setting:
            raise RuntimeError("No sandbox provider configured")

        active_provider = active_setting[0].value

        # Load configuration for active provider (single JSON record)
        provider_setting = SystemSettingsService.get_by_name(f"sandbox.{active_provider}")
        if not provider_setting:
            raise RuntimeError(f"Sandbox provider {active_provider} not configured")

        # Parse JSON configuration
        try:
            config = json.loads(provider_setting[0].value)
        except json.JSONDecodeError as e:
            raise RuntimeError(f"Invalid sandbox configuration for {active_provider}: {e}")

        # Get provider class
        provider_class = PROVIDER_CLASSES.get(active_provider)
        if not provider_class:
            raise RuntimeError(f"Unknown provider: {active_provider}")

        # Initialize provider
        provider = provider_class()
        provider.initialize(config)

        # Set as active provider in manager
        self.provider_manager.set_provider(active_provider, provider)

    def execute(self, code: str, language: str) -> ExecutionResult:
        """Execute code using the active provider"""
        provider = self.provider_manager.get_provider()

        if not provider:
            raise RuntimeError("No sandbox provider configured")

        # Create instance
        instance = provider.create_instance(template=language)

        try:
            # Execute code
            result = provider.execute_code(
                instance_id=instance.instance_id,
                code=code,
                language=language
            )
            return result
        finally:
            # Always cleanup
            provider.destroy_instance(instance.instance_id)
```

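
The component above relies on a `ProviderManager` exposing `set_provider(name, provider)` and `get_provider()`. Its implementation is not shown in this section, so the following is only a sketch of the smallest class that would satisfy those two calls. The `get_provider_name()` helper and the internal attribute names are assumptions added for illustration.

```python
from typing import Any, Optional


class ProviderManager:
    """Holds the single globally active sandbox provider (illustrative sketch)."""

    def __init__(self) -> None:
        self._name: Optional[str] = None
        self._provider: Optional[Any] = None

    def set_provider(self, name: str, provider: Any) -> None:
        # Replaces any previously active provider
        self._name = name
        self._provider = provider

    def get_provider(self) -> Optional[Any]:
        # Returns None until a provider has been set
        return self._provider

    def get_provider_name(self) -> Optional[str]:
        # Hypothetical helper, useful for logging and metrics labels
        return self._name
```

Keeping a single active provider matches the global `sandbox.provider_type` setting; per-tenant selection would need a keyed mapping instead.
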
## 6. Security Considerations

### 6.1 Credential Storage
- Sensitive credentials (API keys, secrets) encrypted at rest in database
- Use RAGFlow's existing encryption mechanisms (AES-256)
- Never log or expose credentials in error messages or API responses
- Credentials redacted in UI (show only last 4 characters)

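
The redaction rule above (show only the last 4 characters) could be applied on the server before a configuration is returned to the UI. This is a sketch under the assumption that secret fields are identified by the `secret` flag in the provider's config schema; the function name `redact_secrets` and the `*` mask character are hypothetical.

```python
from typing import Any, Dict


def redact_secrets(config: Dict[str, Any],
                   schema: Dict[str, Dict[str, Any]],
                   visible: int = 4,
                   mask: str = "*") -> Dict[str, Any]:
    """Return a copy of config with secret string fields masked."""
    redacted = dict(config)
    for key, spec in schema.items():
        value = redacted.get(key)
        if not spec.get("secret") or not isinstance(value, str) or not value:
            continue
        if len(value) <= visible:
            # Too short to reveal anything safely: mask the whole value
            redacted[key] = mask * len(value)
        else:
            redacted[key] = mask * (len(value) - visible) + value[-visible:]
    return redacted
```

Working on a copy keeps the original configuration intact for the provider itself, so only the API response carries masked values.
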
### 6.2 Tenant Isolation
- **Configuration**: Global sandbox settings shared by all tenants (admin-only access)
- **Execution**: Sandboxes never shared across tenants/sessions during runtime
- **Instance IDs**: Scoped to tenant: `{tenant_id}:{session_id}:{instance_id}`
- **Network Isolation**: Between tenant sandboxes (VPC per tenant for SaaS providers)
- **Resource Quotas**: Per-tenant limits on concurrent executions, total execution time
- **Audit Logging**: All sandbox executions logged with tenant_id for traceability

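
The scoped instance ID format `{tenant_id}:{session_id}:{instance_id}` can be built and parsed with a few lines. The sketch below assumes that none of the three parts may contain a colon; the helper names are hypothetical.

```python
from typing import NamedTuple


class ScopedInstanceId(NamedTuple):
    tenant_id: str
    session_id: str
    instance_id: str


def build_scoped_id(tenant_id: str, session_id: str, instance_id: str) -> str:
    parts = (tenant_id, session_id, instance_id)
    for part in parts:
        # A colon inside a part would make the ID ambiguous to parse
        if not part or ":" in part:
            raise ValueError(f"invalid ID component: {part!r}")
    return ":".join(parts)


def parse_scoped_id(scoped: str) -> ScopedInstanceId:
    parts = scoped.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"malformed scoped instance ID: {scoped!r}")
    return ScopedInstanceId(*parts)


def belongs_to_tenant(scoped: str, tenant_id: str) -> bool:
    # Check ownership before any operation on an existing instance
    return parse_scoped_id(scoped).tenant_id == tenant_id
```

Checking `belongs_to_tenant()` before executing or destroying an instance is one way to enforce that sandboxes are never shared across tenants.
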
### 6.3 Resource Limits
- Timeout limits per execution (configurable per provider, default 30s)
- Memory/CPU limits enforced at provider level
- Automatic cleanup of stale instances (max lifetime: 5 minutes)
- Rate limiting per tenant (max concurrent executions: 10)

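
The per-tenant concurrency limit (10 concurrent executions) could be enforced in process with a small counter guarded by a lock. This is a single-process sketch only; a multi-worker deployment would need a shared store. The `TenantLimiter` class is hypothetical.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator


class TenantLimiter:
    """Caps the number of concurrent executions per tenant (in-process sketch)."""

    def __init__(self, max_concurrent: int = 10) -> None:
        self._max = max_concurrent
        self._lock = threading.Lock()
        self._active: Dict[str, int] = defaultdict(int)

    @contextmanager
    def slot(self, tenant_id: str) -> Iterator[None]:
        with self._lock:
            if self._active[tenant_id] >= self._max:
                raise RuntimeError(f"Rate limit exceeded for tenant {tenant_id}")
            self._active[tenant_id] += 1
        try:
            yield
        finally:
            # Always release the slot, even if the execution raised
            with self._lock:
                self._active[tenant_id] -= 1
```

A call site would wrap the execution as `with limiter.slot(tenant_id): ...`, so the slot is released on both success and failure.
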
### 6.4 Code Security
- For self-managed: AST-based security analysis before execution
- Blocked operations: file system writes, network calls, system commands
- Allowlist approach: only specific imports allowed
- Runtime monitoring for malicious patterns

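
To illustrate the AST-based allowlist idea, the sketch below uses Python's standard `ast` module to flag imports outside an allowlist and a few blocked built-in calls. The concrete allowlist and blocklist are assumptions for illustration, not the lists used by the self-managed sandbox, and a static check like this is easy to bypass, so it complements rather than replaces runtime isolation.

```python
import ast
from typing import List, Set

# Hypothetical lists, for illustration only
ALLOWED_IMPORTS: Set[str] = {"math", "json", "re", "datetime"}
BLOCKED_CALLS: Set[str] = {"eval", "exec", "open", "__import__"}


def find_violations(source: str) -> List[str]:
    """Return a list of policy violations found in the given Python source."""
    violations: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    violations.append(f"import of '{alias.name}' is not allowed")
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            if module.split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"import from '{module}' is not allowed")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to '{node.func.id}' is blocked")
    return violations
```

An empty result means the code passed the static check; any entries could be surfaced to the user with error code SB007.
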
### 6.5 Network Security
- Self-managed: Network isolation by default, no external access
- SaaS: HTTPS only, certificate pinning
- IP whitelisting for self-managed endpoint access

## 7. Monitoring and Observability

### 7.1 Metrics to Track

**Common Metrics (All Providers)**:
- Execution success rate (target: >95%)
- Average execution time (p50, p95, p99)
- Error rate by error type
- Active instance count
- Queue depth (for self-managed pool)

**Self-Managed Specific**:
- Container pool utilization (target: 60-80%)
- Host resource usage (CPU, memory, disk)
- Container creation latency
- Container restart rate
- gVisor runtime health

**SaaS Specific**:
- API call latency by region
- Rate limit usage and throttling events
- Cost estimation (execution count × unit cost)
- Provider availability (uptime %)
- API error rate by error code

### 7.2 Logging

Structured logging for all provider operations:
```json
{
  "timestamp": "2025-01-26T10:00:00Z",
  "tenant_id": "tenant_123",
  "provider": "aliyun_codeinterpreter",
  "operation": "execute_code",
  "instance_id": "inst_xyz",
  "language": "python",
  "code_hash": "sha256:...",
  "duration_ms": 1234,
  "status": "success",
  "exit_code": 0,
  "memory_used_mb": 64,
  "region": "cn-hangzhou"
}
```

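
A record in the shape of the structured log entry in section 7.2 can be produced with the standard library alone. The sketch below hashes the submitted code instead of logging it, which keeps user code out of the logs. The function name `build_log_record` and the rule that a zero exit code means `success` are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any


def build_log_record(*, tenant_id: str, provider: str, operation: str,
                     instance_id: str, language: str, code: str,
                     duration_ms: int, exit_code: int, **extra: Any) -> str:
    """Serialize one provider operation as a single JSON log line."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "tenant_id": tenant_id,
        "provider": provider,
        "operation": operation,
        "instance_id": instance_id,
        "language": language,
        # Log a digest of the code, never the code itself
        "code_hash": "sha256:" + hashlib.sha256(code.encode("utf-8")).hexdigest(),
        "duration_ms": duration_ms,
        "status": "success" if exit_code == 0 else "error",
        "exit_code": exit_code,
    }
    # Provider-specific fields such as memory_used_mb or region
    record.update(extra)
    return json.dumps(record)
```

Provider-specific fields are passed as keyword arguments, for example `region="cn-hangzhou"`, so the common fields stay identical across providers.
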
### 7.3 Alerts

**Critical Alerts**:
- Provider availability < 99%
- Error rate > 5%
- Average execution time > 10s
- Container pool exhaustion (0 available)

**Warning Alerts**:
- Cost spike (2x daily average)
- Rate limit approaching (>80%)
- High memory usage (>90%)
- Slow execution times (p95 > 5s)

## 8. Migration Path

### 8.1 Phase 1: Refactor Existing Code (Week 1-2)
**Goals**: Extract current implementation into provider pattern

**Tasks**:
- [ ] Create `agent/sandbox/providers/base.py` with `SandboxProvider` interface
- [ ] Implement `agent/sandbox/providers/self_managed.py` wrapping executor_manager
- [ ] Create `agent/sandbox/providers/manager.py` for provider management
- [ ] Write unit tests for self-managed provider
- [ ] Document existing behavior and configuration

**Deliverables**:
- Provider abstraction layer
- Self-managed provider implementation
- Unit test suite

### 8.2 Phase 2: Database Integration (Week 3)
**Goals**: Add sandbox configuration to admin system

**Tasks**:
- [ ] Add sandbox entries to `conf/system_settings.json` initialization file
- [ ] Extend `SettingsMgr` in `admin/server/services.py` with sandbox-specific methods
- [ ] Add admin endpoints to `admin/server/routes.py`
- [ ] Implement configuration validation logic
- [ ] Add provider connection testing
- [ ] Write API tests

**Deliverables**:
- SystemSettings integration
- Admin API endpoints (`/api/admin/sandbox/*`)
- Configuration validation
- API test suite

### 8.3 Phase 3: Frontend UI (Week 4)
**Goals**: Build admin settings interface

**Tasks**:
- [ ] Create `web/src/pages/SandboxSettings/index.tsx`
- [ ] Implement dynamic form generation from provider schema
- [ ] Add connection testing UI
- [ ] Create TypeScript types
- [ ] Write frontend tests

**Deliverables**:
- Admin settings UI
- Type definitions
- Frontend test suite

### 8.4 Phase 4: SaaS Provider Implementation (Week 5-6)
**Goals**: Implement Aliyun Code Interpreter and E2B providers

**Tasks**:
- [ ] Implement `agent/sandbox/providers/aliyun_codeinterpreter.py`
- [ ] Implement `agent/sandbox/providers/e2b.py`
- [ ] Add provider-specific tests with mocking
- [ ] Document provider-specific behaviors
- [ ] Create provider setup guides

**Deliverables**:
- Aliyun Code Interpreter provider
- E2B provider
- Provider documentation

### 8.5 Phase 5: Agent Integration (Week 7)
**Goals**: Update agent components to use new provider system

**Tasks**:
- [ ] Update `agent/components/code_executor.py` to use ProviderManager
- [ ] Implement fallback logic
- [ ] Add tenant-specific provider loading
- [ ] Update agent tests
- [ ] Performance testing

**Deliverables**:
- Agent integration
- Fallback mechanism
- Updated test suite

### 8.6 Phase 6: Monitoring & Documentation (Week 8)
**Goals**: Add observability and complete documentation

**Tasks**:
- [ ] Implement metrics collection
- [ ] Add structured logging
- [ ] Configure alerts
- [ ] Write deployment guide
- [ ] Write user documentation
- [ ] Create troubleshooting guide

**Deliverables**:
- Monitoring dashboards
- Complete documentation
- Deployment guides

## 9. Testing Strategy

### 9.1 Unit Tests

**Provider Tests** (`test/agent/sandbox/providers/test_*.py`):
```python
from agent.sandbox.providers.self_managed import SelfManagedProvider

# Shared configuration for tests that need an initialized provider
test_config = {"endpoint": "http://localhost:9385"}

class TestSelfManagedProvider:
    def test_initialize_with_config(self):
        provider = SelfManagedProvider()
        assert provider.initialize({"endpoint": "http://localhost:9385"})

    def test_create_python_instance(self):
        provider = SelfManagedProvider()
        provider.initialize(test_config)
        instance = provider.create_instance("python")
        assert instance.status == "running"

    def test_execute_code(self):
        provider = SelfManagedProvider()
        provider.initialize(test_config)
        instance = provider.create_instance("python")
        result = provider.execute_code(instance.instance_id, "print('hello')", "python")
        assert result.exit_code == 0
        assert "hello" in result.stdout
```

**Configuration Tests**:
- Test configuration validation for each provider schema
- Test error handling for invalid configurations
- Test secret field redaction

### 9.2 Integration Tests

**Provider Switching**:
- Test switching between providers
- Test fallback mechanism
- Test concurrent provider usage

**Multi-Tenant Isolation**:
- Test tenant configuration isolation
- Test instance ID scoping
- Test resource separation

**Admin API Tests**:
- Test CRUD operations for configurations
- Test connection testing endpoint
- Test validation error responses

### 9.3 E2E Tests

**Complete Flow Tests**:
```python
def test_sandbox_execution_flow():
    # 1. Configure provider via admin API
    setSandboxConfig(provider="self_managed", config={...})

    # 2. Create agent task with code execution
    task = create_agent_task(code="print('test')")

    # 3. Execute task
    result = execute_agent_task(task.id)

    # 4. Verify result
    assert result.status == "success"
    assert "test" in result.output

    # 5. Verify sandbox cleanup
    assert get_active_instances() == 0
```

**Admin UI Tests**:
- Test provider configuration flow
- Test connection testing
- Test error handling in UI

### 9.4 Performance Tests

**Load Testing**:
- Test 100 concurrent executions
- Test pool exhaustion behavior
- Test queue performance (self-managed)

**Latency Testing**:
- Measure cold start time per provider
- Measure execution latency percentiles
- Compare provider performance

## 10. Cost Considerations

### 10.1 Self-Managed Costs

**Infrastructure**:
- Server hosting: $X/month (depends on specs)
- Maintenance: engineering time
- Scaling: manual, requires additional servers

**Pros**:
- Predictable costs
- No per-execution fees
- Full control over resources

**Cons**:
- High initial setup cost
- Operational overhead
- Finite capacity

### 10.2 SaaS Costs

**Aliyun Code Interpreter** (estimated):
- Pricing: execution time × memory configuration
- Example: 1000 executions/day × 30s × $0.01/1000s = ~$0.30/day

**E2B** (estimated):
- Pricing: $0.02/execution-second
- Example: 1000 executions/day × 30s × $0.02/s = ~$600/day

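
The two estimates above follow the same formula: executions per day × seconds per execution × price per second. The short sketch below reproduces that arithmetic using the estimated unit prices quoted in this section; these are the document's estimates, not published price lists.

```python
def daily_cost(executions_per_day: int,
               seconds_per_execution: float,
               price_per_second: float) -> float:
    """Estimated daily cost in dollars for a pay-per-second provider."""
    return executions_per_day * seconds_per_execution * price_per_second


# Estimated unit prices taken from the examples above
aliyun_cost = daily_cost(1000, 30, 0.01 / 1000)  # $0.01 per 1000 seconds
e2b_cost = daily_cost(1000, 30, 0.02)            # $0.02 per second

print(f"Aliyun: ${aliyun_cost:.2f}/day")
print(f"E2B:    ${e2b_cost:.2f}/day")
```

With these inputs the E2B estimate is roughly 2000 times the Aliyun estimate, which is why the unit prices should be verified against current provider pricing before they drive a provider decision.
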
**Pros**:
- No upfront costs
- Automatic scaling
- No maintenance

**Cons**:
- Variable costs (can spike with usage)
- Network dependency
- Potential for runaway costs

### 10.3 Cost Optimization

**Recommendations**:
1. **Hybrid Approach**: Use self-managed for base load, SaaS for spikes
2. **Cost Monitoring**: Set budget alerts per tenant
3. **Resource Limits**: Enforce max executions per tenant/day
4. **Caching**: Reuse instances when possible (self-managed pool)
5. **Smart Routing**: Route to cheapest provider based on availability

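
The hybrid recommendation (self-managed for base load, SaaS for spikes) reduces to a small routing decision. The sketch below assumes the self-managed pool reports its active and total instance counts, and spills over to a SaaS provider once utilization reaches a threshold. The 0.8 default echoes the upper end of the 60-80% pool utilization target in section 7.1; all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoolLoad:
    active: int    # instances currently executing
    capacity: int  # total instances in the self-managed pool


def choose_provider(pool: PoolLoad,
                    saas_provider: str = "e2b",
                    spill_threshold: float = 0.8) -> str:
    """Prefer the self-managed pool until it is nearly full, then use SaaS."""
    if pool.capacity > 0 and pool.active / pool.capacity < spill_threshold:
        return "self_managed"
    return saas_provider
```

This keeps per-execution SaaS fees limited to traffic the local pool cannot absorb, at the cost of needing both providers configured at once.
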
## 11. Future Extensibility

The architecture supports easy addition of new providers:

### 11.1 Adding a New Provider

**Step 1**: Implement provider class with schema

```python
# agent/sandbox/providers/new_provider.py
from typing import Any, Dict

from .base import SandboxProvider

class NewProvider(SandboxProvider):
    @staticmethod
    def get_config_schema() -> Dict[str, Dict]:
        return {
            "api_key": {
                "type": "string",
                "required": True,
                "secret": True,
                "label": "API Key"
            },
            "region": {
                "type": "string",
                "default": "us-east-1",
                "label": "Region"
            }
        }

    def initialize(self, config: Dict[str, Any]) -> bool:
        self.api_key = config.get("api_key")
        self.region = config.get("region", "us-east-1")
        # Initialize client
        return True

    # Implement other abstract methods...
```

**Step 2**: Register in provider mapping

```python
# In api/apps/sandbox_app.py or wherever providers are listed
from agent.sandbox.providers.new_provider import NewProvider

PROVIDER_CLASSES = {
    "self_managed": SelfManagedProvider,
    "aliyun_codeinterpreter": AliyunCodeInterpreterProvider,
    "e2b": E2BProvider,
    "new_provider": NewProvider,  # Add here
}
```

**No central registry to update** - just import and add to the mapping!

### 11.2 Potential Future Providers

- **GitHub Codespaces**: For GitHub-integrated workflows
- **Gitpod**: Cloud development environments
- **CodeSandbox**: Frontend code execution
- **AWS Firecracker**: Raw microVM management
- **Custom Provider**: User-defined provider implementations

### 11.3 Advanced Features

**Feature Pooling**:
- Share instances across executions (same language, same user)
- Warm pool for reduced latency
- Instance hibernation for cost savings

**Feature Multi-Region**:
- Route to nearest region
- Failover across regions
- Regional cost optimization

**Feature Hybrid Execution**:
- Split workloads between providers
- Dynamic provider selection based on cost/performance
- A/B testing for provider performance

## 12. Appendix

### 12.1 Configuration Examples

**SystemSettings Initialization File** (`conf/system_settings.json` - add these entries):

```json
{
  "system_settings": [
    {
      "name": "sandbox.provider_type",
      "source": "variable",
      "data_type": "string",
      "value": "self_managed"
    },
    {
      "name": "sandbox.self_managed",
      "source": "variable",
      "data_type": "json",
      "value": "{\"endpoint\": \"http://sandbox-internal:9385\", \"pool_size\": 20, \"max_memory\": \"512m\", \"timeout\": 60, \"enable_seccomp\": true, \"enable_ast_analysis\": true}"
    },
    {
      "name": "sandbox.aliyun_codeinterpreter",
      "source": "variable",
      "data_type": "json",
      "value": "{\"access_key_id\": \"\", \"access_key_secret\": \"\", \"account_id\": \"\", \"region\": \"cn-hangzhou\", \"template_name\": \"\", \"timeout\": 30}"
    },
    {
      "name": "sandbox.e2b",
      "source": "variable",
      "data_type": "json",
      "value": "{\"api_key\": \"\", \"region\": \"us\", \"timeout\": 30}"
    }
  ]
}
```

**Admin API Request Example** (POST to `/api/admin/sandbox/config`):

```json
{
  "provider_type": "self_managed",
  "config": {
    "endpoint": "http://sandbox-internal:9385",
    "pool_size": 20,
    "max_memory": "512m",
    "timeout": 60,
    "enable_seccomp": true,
    "enable_ast_analysis": true
  },
  "test_connection": true,
  "set_active": true
}
```

**Note**: The `config` object in the request is a plain JSON object. The API will serialize it to a JSON string before storing in SystemSettings.

**Admin API Response Example** (GET from `/api/admin/sandbox/config`):

```json
{
  "data": {
    "active": "self_managed",
    "self_managed": {
      "endpoint": "http://sandbox-internal:9385",
      "pool_size": 20,
      "max_memory": "512m",
      "timeout": 60,
      "enable_seccomp": true,
      "enable_ast_analysis": true
    },
    "aliyun_codeinterpreter": {
      "access_key_id": "",
      "access_key_secret": "",
      "account_id": "",
      "region": "cn-hangzhou",
      "template_name": "",
      "timeout": 30
    },
    "e2b": {
      "api_key": "",
      "region": "us",
      "timeout": 30
    }
  }
}
```

**Note**: The response deserializes the stored JSON strings back to objects for easier frontend consumption, so each provider's fields match its record in `conf/system_settings.json`.

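
The round trip described in the two notes (serialize on write, deserialize on read) could look like the sketch below. It assumes the settings have already been read into a plain name-to-value mapping; the function `build_config_response` and that mapping are hypothetical stand-ins for the real SystemSettings access code.

```python
import json
from typing import Any, Dict, Mapping

PREFIX = "sandbox."
ACTIVE_KEY = "sandbox.provider_type"


def build_config_response(rows: Mapping[str, str]) -> Dict[str, Any]:
    """Group stored sandbox settings by provider, parsing each JSON value."""
    data: Dict[str, Any] = {"active": rows.get(ACTIVE_KEY, "")}
    for name, value in rows.items():
        if name == ACTIVE_KEY or not name.startswith(PREFIX):
            continue
        # "sandbox.e2b" -> key "e2b", JSON string -> object
        data[name[len(PREFIX):]] = json.loads(value)
    return {"data": data}
```

In a real endpoint, secret fields would still need to be redacted before this response is returned, in line with section 6.1.
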
### 12.2 Error Codes

| Code | Description | Resolution |
|------|-------------|------------|
| SB001 | Provider not initialized | Configure provider in admin |
| SB002 | Invalid configuration | Check configuration values |
| SB003 | Connection failed | Check network and credentials |
| SB004 | Instance creation failed | Check provider capacity |
| SB005 | Execution timeout | Increase timeout or optimize code |
| SB006 | Out of memory | Reduce memory usage or increase limits |
| SB007 | Code blocked by security policy | Remove blocked imports/operations |
| SB008 | Rate limit exceeded | Reduce concurrency or upgrade plan |
| SB009 | Provider unavailable | Check provider status or use fallback |

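
The error codes in the table above can be kept in one place as an enumeration, so providers raise a typed error instead of formatting codes by hand. This is a sketch: the enum member names, the `SandboxError` class, and the `to_dict()` payload shape are assumptions, while the codes and descriptions are copied from the table.

```python
from enum import Enum
from typing import Dict


class SandboxErrorCode(Enum):
    PROVIDER_NOT_INITIALIZED = ("SB001", "Provider not initialized")
    INVALID_CONFIGURATION = ("SB002", "Invalid configuration")
    CONNECTION_FAILED = ("SB003", "Connection failed")
    INSTANCE_CREATION_FAILED = ("SB004", "Instance creation failed")
    EXECUTION_TIMEOUT = ("SB005", "Execution timeout")
    OUT_OF_MEMORY = ("SB006", "Out of memory")
    BLOCKED_BY_POLICY = ("SB007", "Code blocked by security policy")
    RATE_LIMIT_EXCEEDED = ("SB008", "Rate limit exceeded")
    PROVIDER_UNAVAILABLE = ("SB009", "Provider unavailable")

    def __init__(self, code: str, description: str) -> None:
        self.code = code
        self.description = description


class SandboxError(Exception):
    """Exception carrying one of the documented error codes."""

    def __init__(self, error: SandboxErrorCode, detail: str = "") -> None:
        suffix = f" ({detail})" if detail else ""
        super().__init__(f"{error.code}: {error.description}{suffix}")
        self.error = error
        self.detail = detail

    def to_dict(self) -> Dict[str, str]:
        return {"code": self.error.code, "error": self.error.description,
                "details": self.detail}
```

A provider would then signal a failed connection with `raise SandboxError(SandboxErrorCode.CONNECTION_FAILED, "timeout")`.
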
### 12.3 References

- [Current Sandbox Implementation](../sandbox/README.md)
- [RAGFlow Admin System](../CONTRIBUTING.md)
- [Daytona Documentation](https://daytona.dev/docs)
- [Aliyun Code Interpreter](https://help.aliyun.com/...)
- [E2B Documentation](https://e2b.dev/docs)

---

**Document Version**: 1.0
**Last Updated**: 2025-01-26
**Author**: RAGFlow Team
**Status**: Design Specification - Ready for Review

## Appendix C: Configuration Storage Considerations

### Current Implementation
- **Storage**: SystemSettings table with `value` field as `TextField` (unlimited length)
- **Migration**: Database migration added to convert from `CharField(1024)` to `TextField`
- **Benefit**: Supports arbitrarily long API keys, workspace IDs, and other SaaS provider credentials

### Validation
- **Schema validation**: Type checking, range validation, required field validation
- **Provider-specific validation**: Custom validation via `validate_config()` method
- **Example**: SelfManagedProvider validates URL format, timeout ranges, pool size constraints

### Configuration Storage Format
Each provider's configuration is stored as JSON in `SystemSettings.value`:
- `sandbox.provider_type`: Active provider selection
- `sandbox.self_managed`: Self-managed provider JSON config
- `sandbox.aliyun_codeinterpreter`: Aliyun provider JSON config
- `sandbox.e2b`: E2B provider JSON config

## Appendix D: Configuration Hot Reload Limitations

### Current Behavior
**Provider Configuration Requires Restart**: When switching sandbox providers in the admin panel, the ragflow service must be restarted for changes to take effect.

**Reason**:
- Admin and ragflow are separate processes
- ragflow loads sandbox provider configuration only at startup
- The `get_provider_manager()` function caches the provider globally
- Configuration changes in MySQL are not automatically detected

**Impact**:
- Switching from `self_managed` → `aliyun_codeinterpreter` requires ragflow restart
- Updating credentials/config requires ragflow restart
- Not a dynamic configuration system

**Workarounds**:
1. **Production**: Restart ragflow service after configuration changes:
   ```bash
   cd docker
   docker compose restart ragflow-server
   ```

2. **Development**: Use the `reload_provider()` function in code:
   ```python
   from agent.sandbox.client import reload_provider
   reload_provider()  # Reloads from MySQL settings
   ```

**Future Enhancement**:
To support hot reload without restart, implement configuration change detection:
```python
|
|||
|
|
# In agent/sandbox/client.py
|
|||
|
|
_config_timestamp: Optional[int] = None
|
|||
|
|
|
|||
|
|
def get_provider_manager() -> ProviderManager:
|
|||
|
|
global _provider_manager, _config_timestamp
|
|||
|
|
|
|||
|
|
# Check if configuration has changed
|
|||
|
|
setting = SystemSettingsService.get_by_name("sandbox.provider_type")
|
|||
|
|
current_timestamp = setting[0].update_time if setting else 0
|
|||
|
|
|
|||
|
|
if _config_timestamp is None or current_timestamp > _config_timestamp:
|
|||
|
|
# Configuration changed, reload provider
|
|||
|
|
_provider_manager = None
|
|||
|
|
_load_provider_from_settings()
|
|||
|
|
_config_timestamp = current_timestamp
|
|||
|
|
|
|||
|
|
return _provider_manager
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
However, this adds overhead on every `execute_code()` call. For production use, explicit restart is preferred for simplicity and reliability.
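
If polling is still wanted, the cost of checking on every call can be bounded by looking at the settings row at most once per interval. The following is a minimal sketch under that assumption, not part of the RAGFlow codebase: `ThrottledReloader`, its parameters, and the 30-second default are hypothetical, and `read_timestamp` / `reload` stand in for the settings lookup and `_load_provider_from_settings()` shown above.

```python
import time
from typing import Callable, Optional


class ThrottledReloader:
    """Re-check a config timestamp at most once per `interval` seconds (hypothetical helper)."""

    def __init__(
        self,
        read_timestamp: Callable[[], int],   # stand-in for the settings lookup
        reload: Callable[[], None],          # stand-in for reloading the provider
        interval: float = 30.0,              # assumed polling interval, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._read_timestamp = read_timestamp
        self._reload = reload
        self._interval = interval
        self._clock = clock
        self._last_check: Optional[float] = None
        self._seen_timestamp: Optional[int] = None

    def maybe_reload(self) -> bool:
        """Return True if a reload happened on this call."""
        now = self._clock()
        if self._last_check is not None and now - self._last_check < self._interval:
            return False  # checked recently: skip the lookup entirely
        self._last_check = now
        current = self._read_timestamp()
        if self._seen_timestamp is None or current > self._seen_timestamp:
            self._reload()
            self._seen_timestamp = current
            return True
        return False
```

With this shape, a configuration change becomes visible after at most `interval` seconds, and calls made in between cost only a clock read.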

## Appendix E: Arguments Parameter Support

### Overview
All sandbox providers support passing arguments to the `main()` function in user code. This enables dynamic parameter injection for code execution.

### Implementation Details

**Base Interface**:
```python
# agent/sandbox/providers/base.py
@abstractmethod
def execute_code(
    self,
    instance_id: str,
    code: str,
    language: str,
    timeout: int = 10,
    arguments: Optional[Dict[str, Any]] = None
) -> ExecutionResult:
    """
    Execute code in the sandbox.

    The code should contain a main() function that will be called with:
    - Python: main(**arguments) if arguments provided, else main()
    - JavaScript: main(arguments) if arguments provided, else main()
    """
    pass
```

**Provider Implementations**:

1. **Self-Managed Provider** ([self_managed.py:164](agent/sandbox/providers/self_managed.py:164)):
   - Passes arguments via HTTP API: `"arguments": arguments or {}`
   - `executor_manager` receives the arguments and passes them to the code via the command line
   - Runner script: `args = json.loads(sys.argv[1])` then `result = main(**args)`

2. **Aliyun Code Interpreter** ([aliyun_codeinterpreter.py:260-275](agent/sandbox/providers/aliyun_codeinterpreter.py:260-275)):
   - Wraps user code to call `main(**arguments)` or `main()` if no arguments
   - Python example:
     ```python
     if arguments:
         wrapped_code = f'''{code}

     if __name__ == "__main__":
         import json
         result = main(**{json.dumps(arguments)})
         print(json.dumps(result) if isinstance(result, dict) else result)
     '''
     ```
   - JavaScript example:
     ```python
     # Python wrapper that generates the JavaScript source
     if arguments:
         wrapped_code = f'''{code}

     const result = main({json.dumps(arguments)});
     console.log(typeof result === 'object' ? JSON.stringify(result) : String(result));
     '''
     ```
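The wrapping step described above can be sketched as two small helpers. This is an illustrative sketch, not the provider's actual code: `wrap_python` and `wrap_javascript` are hypothetical names. One detail worth noting is that `json.dumps` emits `true`, `false`, and `null`, which are not Python names, so pasting its output directly into Python source can fail for boolean or null arguments. The sketch sidesteps this by embedding the JSON as a string literal and decoding it at run time.

```python
import json
from typing import Any, Dict, Optional


def wrap_python(code: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Append a __main__ block that calls main() with JSON-decoded keyword arguments."""
    # repr() of the JSON text is a valid Python string literal, so true/false/null
    # are decoded by json.loads inside the sandbox rather than parsed as Python names.
    payload = repr(json.dumps(arguments or {}))
    return (
        f"{code}\n\n"
        'if __name__ == "__main__":\n'
        "    import json\n"
        f"    result = main(**json.loads({payload}))\n"
        "    print(json.dumps(result) if isinstance(result, dict) else result)\n"
    )


def wrap_javascript(code: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Append a call to main(); JSON text can be used directly as a JavaScript expression."""
    call = f"main({json.dumps(arguments)})" if arguments else "main()"
    return (
        f"{code}\n\n"
        f"const result = {call};\n"
        "console.log(typeof result === 'object' ? JSON.stringify(result) : String(result));\n"
    )
```

An empty or missing `arguments` value yields `main(**{})` in Python, which is equivalent to `main()`, and a plain `main()` call in JavaScript.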

**Client Layer** ([client.py:138-190](agent/sandbox/client.py:138-190)):
```python
def execute_code(
    code: str,
    language: str = "python",
    timeout: int = 30,
    arguments: Optional[Dict[str, Any]] = None
) -> ExecutionResult:
    provider_manager = get_provider_manager()
    provider = provider_manager.get_provider()

    instance = provider.create_instance(template=language)
    try:
        result = provider.execute_code(
            instance_id=instance.instance_id,
            code=code,
            language=language,
            timeout=timeout,
            arguments=arguments  # Passed through to provider
        )
        return result
    finally:
        provider.destroy_instance(instance.instance_id)
```
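The `try`/`finally` in the client layer means the instance is destroyed even when execution raises. A minimal sketch of that lifecycle follows, using a hypothetical in-memory `FakeProvider` and `run_once` helper that exist only for illustration and are not part of RAGFlow:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FakeProvider:
    """Hypothetical in-memory provider used only to illustrate the lifecycle."""
    live: List[str] = field(default_factory=list)  # instances not yet destroyed
    counter: int = 0

    def create_instance(self, template: str) -> str:
        self.counter += 1
        instance_id = f"{template}-{self.counter}"
        self.live.append(instance_id)
        return instance_id

    def execute_code(
        self, instance_id: str, code: str, arguments: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        if "boom" in code:  # simulate a failing execution
            raise RuntimeError("execution failed")
        return {"stdout": f"ran on {instance_id}", "arguments": arguments or {}}

    def destroy_instance(self, instance_id: str) -> None:
        self.live.remove(instance_id)


def run_once(
    provider: FakeProvider,
    code: str,
    language: str = "python",
    arguments: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Create an instance, run the code, and always destroy the instance."""
    instance_id = provider.create_instance(template=language)
    try:
        return provider.execute_code(instance_id, code, arguments)
    finally:
        provider.destroy_instance(instance_id)
```

After either a successful or a failing call to `run_once`, `provider.live` is empty, which is the property the real client relies on to avoid leaking sandboxes.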

**CodeExec Tool Integration** ([code_exec.py:136-165](agent/tools/code_exec.py:136-165)):
```python
def _execute_code(self, language: str, code: str, arguments: dict):
    # ... collect arguments from component configuration

    result = sandbox_execute_code(
        code=code,
        language=language,
        timeout=int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60)),
        arguments=arguments  # Passed through to sandbox client
    )
```

### Usage Examples

**Python Code with Arguments**:
```python
# User code
def main(name: str, count: int) -> dict:
    """Generate greeting"""
    return {"message": f"Hello {name}!" * count}

# Called with: arguments={"name": "World", "count": 3}
# Result: {"message": "Hello World!Hello World!Hello World!"}
```

**JavaScript Code with Arguments**:
```javascript
// User code
function main(args) {
  const { name, count } = args;
  return `Hello ${name}!`.repeat(count);
}

// Called with: arguments={"name": "World", "count": 3}
// Result: "Hello World!Hello World!Hello World!"
```

### Important Notes

1. **Function Signature**: Code MUST define a `main()` function
   - Python: `def main(**kwargs)` (or named parameters matching the argument keys), or `def main()` if no arguments
   - JavaScript: `function main(args)` or `function main()` if no arguments

2. **Type Consistency**: Arguments are passed as JSON, so JSON types map to native types:
   - Numbers → int/float (Python) / number (JavaScript)
   - Strings → str (Python) / string (JavaScript)
   - Booleans → bool (Python) / boolean (JavaScript)
   - Objects → dict (Python) / object (JavaScript)
   - Arrays → list (Python) / array (JavaScript)

3. **Return Value**: The return value is serialized as JSON for parsing
   - Python: `print(json.dumps(result))` if dict
   - JavaScript: `console.log(JSON.stringify(result))` if object

4. **Provider Alignment**: All providers (self_managed, aliyun_codeinterpreter, e2b) implement argument passing consistently
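
The type mapping in note 2 can be checked on the Python side with a short round trip through the standard `json` module. This is a sketch for illustration only; `describe` is a hypothetical helper. It also shows two cases where a JSON round trip does not return the original Python value: tuples come back as lists, and non-string dict keys come back as strings.

```python
import json
from typing import Dict


def describe(arguments_json: str) -> Dict[str, str]:
    """Map each top-level argument name to the Python type it decodes to."""
    return {key: type(value).__name__ for key, value in json.loads(arguments_json).items()}


payload = '{"count": 3, "ratio": 0.5, "name": "World", "flag": true, "opts": {"k": null}, "items": [1, "a"]}'
decoded_types = describe(payload)

# Values that change shape when sent as JSON and decoded again:
tuple_round_trip = json.loads(json.dumps((1, 2)))    # tuple is encoded as a JSON array
int_key_round_trip = json.loads(json.dumps({1: "a"}))  # JSON object keys are always strings
```

Callers that depend on tuples or integer keys would need to convert them back inside `main()`.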