Initial commit with translated description

2026-03-29 13:02:57 +08:00
commit d97055d190
7 changed files with 2719 additions and 0 deletions

AI_AGENT_GUIDE.md Normal file

@@ -0,0 +1,448 @@
# AI Desktop Agent - Cognitive Automation Guide
## 🤖 What Is This?
The **AI Desktop Agent** is an intelligent layer on top of the basic desktop control that **understands** what you want and figures out how to do it autonomously.
Unlike basic automation that requires exact instructions, the AI Agent:
- **Understands natural language** ("Draw a cat in Paint")
- **Plans the steps** automatically
- **Executes autonomously**
- **Adapts** based on what it sees
---
## 🎯 What Can It Do?
### ✅ Autonomous Drawing
```python
from skills.desktop_control.ai_agent import AIDesktopAgent
agent = AIDesktopAgent()
# Just describe what you want!
agent.execute_task("Draw a circle in Paint")
agent.execute_task("Draw a star in MS Paint")
agent.execute_task("Draw a house with a sun")
```
**What it does:**
1. Opens MS Paint
2. Selects pencil tool
3. Figures out how to draw the requested shape
4. Draws it autonomously
5. Takes a screenshot of the result
### ✅ Autonomous Text Entry
```python
# It figures out where to type
agent.execute_task("Type 'Hello World' in Notepad")
agent.execute_task("Write an email saying thank you")
```
**What it does:**
1. Opens Notepad (or finds active text editor)
2. Types the text naturally
3. Formats if needed
### ✅ Autonomous Application Control
```python
# It knows how to open apps
agent.execute_task("Open Calculator")
agent.execute_task("Launch Microsoft Paint")
agent.execute_task("Open File Explorer")
```
### ✅ Autonomous Game Playing (Advanced)
```python
# It will try to play the game!
agent.execute_task("Play Solitaire for me")
agent.execute_task("Play Minesweeper")
```
**What it does:**
1. Analyzes the game screen
2. Detects game state (cards, mines, etc.)
3. Decides best move
4. Executes the move
5. Repeats until win/lose
---
## 🏗️ How It Works
### Architecture
```
User Request ("Draw a cat")
        ↓
Natural Language Understanding
        ↓
Task Planning (step-by-step plan)
        ↓
Step Execution Loop:
    - Observe Screen (Computer Vision)
    - Decide Action (AI Reasoning)
    - Execute Action (Desktop Control)
    - Verify Result
        ↓
Task Complete!
```
### Key Components
1. **Task Planner** - Breaks down high-level tasks into steps
2. **Vision System** - Understands what's on screen (screenshots, OCR, object detection)
3. **Reasoning Engine** - Decides what to do next
4. **Action Executor** - Performs the actual mouse/keyboard actions
5. **Feedback Loop** - Verifies actions succeeded
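Components 2–5 form a feedback loop. A minimal sketch of that observe–decide–act cycle (the function names and step format here are illustrative placeholders, not the agent's real API):

```python
# Minimal observe-decide-act loop sketch. The step dicts and the None
# "done" sentinel are illustrative, not the agent's actual internals.
def run_loop(observe, decide, act, max_steps=50):
    """Run observe -> decide -> act until decide() signals completion."""
    history = []
    for _ in range(max_steps):
        state = observe()        # e.g. screenshot + OCR summary
        action = decide(state)   # e.g. next planned step
        if action is None:       # planner says the task is done
            break
        result = act(action)     # mouse/keyboard execution
        history.append((action, result))
    return history

# Toy usage: "press" three keys, then stop.
queue = [{"type": "press", "key": k} for k in "abc"]
trace = run_loop(
    observe=lambda: "screen",
    decide=lambda state: queue.pop(0) if queue else None,
    act=lambda action: True,
)
```

The `max_steps` cap mirrors the `max_steps` parameter of `execute_task`, preventing runaway loops when a task never reaches a "done" state.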
---
## 📋 Supported Tasks (Current)
### Tier 1: Fully Automated ✅
| Task Pattern | Example | Status |
|-------------|---------|---------|
| Draw shapes in Paint | "Draw a circle" | ✅ Working |
| Basic text entry | "Type Hello" | ✅ Working |
| Launch applications | "Open Paint" | ✅ Working |
### Tier 2: Partially Automated 🔨
| Task Pattern | Example | Status |
|-------------|---------|---------|
| Form filling | "Fill out this form" | 🔨 In Progress |
| File operations | "Copy these files" | 🔨 In Progress |
| Web navigation | "Find on Google" | 🔨 Planned |
### Tier 3: Experimental 🧪
| Task Pattern | Example | Status |
|-------------|---------|---------|
| Game playing | "Play Solitaire" | 🧪 Experimental |
| Image editing | "Resize this photo" | 🧪 Planned |
| Code editing | "Fix this bug" | 🧪 Research |
---
## 🎨 Example: Drawing in Paint
### Simple Request
```python
agent = AIDesktopAgent()
result = agent.execute_task("Draw a circle in Paint")
# Check result
print(f"Status: {result['status']}")
print(f"Steps taken: {len(result['steps'])}")
```
### What Happens Behind the Scenes
**1. Planning Phase:**
```
Plan generated:
Step 1: Launch MS Paint
Step 2: Wait 2s for Paint to load
Step 3: Activate Paint window
Step 4: Select pencil tool (press 'P')
Step 5: Draw circle at canvas center
Step 6: Screenshot the result
```
**2. Execution Phase:**
```
[✓] Launched Paint via Win+R → mspaint
[✓] Waited 2.0s
[✓] Activated window "Paint"
[✓] Pressed 'P' to select pencil
[✓] Drew circle with 72 points
[✓] Screenshot saved: drawing_result.png
```
**3. Result:**
```python
{
"task": "Draw a circle in Paint",
"status": "completed",
"success": True,
"steps": [... 6 steps ...],
"screenshots": [... 6 screenshots ...],
}
```
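The "72 points" in the execution log suggests one point every 5° around a parametric circle. A sketch of how such points could be generated (the canvas center and radius below are assumed values, not taken from the agent):

```python
import math

def circle_points(cx, cy, radius, n=72):
    """Return n evenly spaced (x, y) points around center (cx, cy)."""
    return [
        (round(cx + radius * math.cos(2 * math.pi * i / n)),
         round(cy + radius * math.sin(2 * math.pi * i / n)))
        for i in range(n)
    ]

# Assumed canvas center and radius; the agent could then drag the
# pencil tool through these points in order to trace the circle.
points = circle_points(640, 400, 150)
```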
---
## 🎮 Example: Game Playing
```python
agent = AIDesktopAgent()
# Play a simple game
result = agent.execute_task("Play Solitaire for me")
```
### Game Playing Loop
```
1. Analyze screen → Detect cards, positions
2. Identify valid moves → Find legal plays
3. Evaluate moves → Which is best?
4. Execute move → Click and drag card
5. Repeat until game ends
```
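Steps 2–3 of the loop amount to scoring candidate moves and picking the best one. A minimal sketch (the move format and scoring heuristic are illustrative, not the agent's real move evaluator):

```python
def best_move(moves, score):
    """Pick the highest-scoring legal move, or None if there are none."""
    return max(moves, key=score, default=None)

# Toy Solitaire-style example: prefer moves that uncover a face-down card.
candidate_moves = [
    {"kind": "to_foundation", "uncovers": 0},
    {"kind": "stack_move", "uncovers": 1},
]
choice = best_move(candidate_moves, score=lambda m: m["uncovers"])
```

Returning `None` when no legal move exists gives the outer loop a natural termination signal ("repeat until game ends").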
### Game-Specific Intelligence
The agent can learn patterns for:
- **Solitaire**: Card stacking rules, suit matching
- **Minesweeper**: Probability calculations, safe clicks
- **2048**: Tile merging strategy
- **Chess** (if integrated with engine): Move evaluation
---
## 🧠 Enhancing the AI
### Adding Application Knowledge
```python
# In ai_agent.py, add to app_knowledge:
self.app_knowledge = {
"photoshop": {
"name": "Adobe Photoshop",
"launch_command": "photoshop",
"common_actions": {
"new_layer": {"hotkey": ["ctrl", "shift", "n"]},
"brush_tool": {"hotkey": ["b"]},
"eraser": {"hotkey": ["e"]},
}
}
}
```
### Adding Custom Task Patterns
```python
# Add a custom planning method
def _plan_photo_edit(self, task: str) -> List[Dict]:
"""Plan for photo editing tasks."""
return [
{"type": "launch_app", "app": "photoshop"},
{"type": "wait", "duration": 3.0},
{"type": "open_file", "path": extracted_path},
{"type": "apply_filter", "filter": extracted_filter},
{"type": "save_file"},
]
```
---
## 🔥 Advanced: Vision + Reasoning
### Screen Analysis
The agent can analyze screenshots to:
- **Detect UI elements** (buttons, text fields, menus)
- **Read text** (OCR for labels, instructions)
- **Identify objects** (icons, images, game pieces)
- **Understand layout** (where things are)
```python
# Analyze what's on screen
analysis = agent._analyze_screen()
print(analysis)
# Output:
# {
# "active_window": "Untitled - Paint",
# "mouse_position": (640, 480),
# "detected_elements": [...],
# "text_found": [...],
# }
```
### Integration with OpenClaw LLM
```python
# Future: Use OpenClaw's LLM for reasoning
agent = AIDesktopAgent(llm_client=openclaw_llm)
# The agent can now:
# - Reason about complex tasks
# - Understand context better
# - Plan more sophisticated workflows
# - Learn from feedback
```
---
## 🛠️ Extending for Your Needs
### Add Support for New Apps
1. **Identify the app**
2. **Document common actions**
3. **Add to knowledge base**
4. **Create planning method**
Example: Adding Excel support
```python
# Step 1: Add to app_knowledge
"excel": {
"name": "Microsoft Excel",
"launch_command": "excel",
"common_actions": {
"new_sheet": {"hotkey": ["shift", "f11"]},
"sum_formula": {"action": "type", "text": "=SUM()"},
}
}
# Step 2: Create planner
def _plan_excel_task(self, task: str) -> List[Dict]:
return [
{"type": "launch_app", "app": "excel"},
{"type": "wait", "duration": 2.0},
# ... specific Excel steps
]
# Step 3: Hook into main planner
if "excel" in task_lower or "spreadsheet" in task_lower:
return self._plan_excel_task(task)
```
---
## 🎯 Real-World Use Cases
### 1. Automated Form Filling
```python
agent.execute_task("Fill out the job application with my resume data")
```
### 2. Batch Image Processing
```python
agent.execute_task("Resize all images in this folder to 800x600")
```
### 3. Social Media Posting
```python
agent.execute_task("Post this image to Instagram with caption 'Beautiful sunset'")
```
### 4. Data Entry
```python
agent.execute_task("Copy data from this PDF to Excel spreadsheet")
```
### 5. Testing
```python
agent.execute_task("Test the login form with invalid credentials")
```
---
## ⚙️ Configuration
### Enable/Disable Failsafe
```python
# Safe mode (default)
agent = AIDesktopAgent(failsafe=True)
# Fast mode (no failsafe)
agent = AIDesktopAgent(failsafe=False)
```
### Set Max Steps
```python
# Prevent infinite loops
result = agent.execute_task("Play game", max_steps=100)
```
### Access Action History
```python
# Review what the agent did
print(agent.action_history)
```
---
## 🐛 Debugging
### View Step-by-Step Execution
```python
result = agent.execute_task("Draw a star in Paint")
for i, step in enumerate(result['steps'], 1):
print(f"Step {i}: {step['step']['description']}")
print(f" Success: {step['success']}")
if 'error' in step:
print(f" Error: {step['error']}")
```
### View Screenshots
```python
# Each step captures before/after screenshots
for screenshot_pair in result['screenshots']:
before = screenshot_pair['before']
after = screenshot_pair['after']
# Display or save for analysis
before.save(f"step_{screenshot_pair['step']}_before.png")
after.save(f"step_{screenshot_pair['step']}_after.png")
```
---
## 🚀 Future Enhancements
Planned features:
- [ ] **Computer Vision**: OCR, object detection, UI element recognition
- [ ] **LLM Integration**: Natural language understanding with OpenClaw LLM
- [ ] **Learning**: Remember successful patterns, improve over time
- [ ] **Multi-App Workflows**: "Get data from Chrome and put in Excel"
- [ ] **Voice Control**: "Alexa, draw a cat in Paint"
- [ ] **Autonomous Debugging**: Fix errors automatically
- [ ] **Game AI**: Reinforcement learning for game playing
- [ ] **Web Automation**: Full browser control with understanding
---
## 📚 Full API
### Main Methods
```python
# Execute a task
result = agent.execute_task(task: str, max_steps: int = 50)
# Analyze screen
analysis = agent._analyze_screen()
# Manual mode: Execute individual steps
step = {"type": "launch_app", "app": "paint"}
result = agent._execute_step(step)
```
### Result Structure
```python
{
"task": str, # Original task
"status": str, # "completed", "failed", "error"
"success": bool, # Overall success
"steps": List[Dict], # All steps executed
"screenshots": List[Dict], # Before/after screenshots
"failed_at_step": int, # If failed, which step
"error": str, # Error message if failed
}
```
---
**🦞 Built for OpenClaw - The future of desktop automation!**

QUICK_REFERENCE.md Normal file

@@ -0,0 +1,269 @@
# Desktop Control - Quick Reference Card
## 🚀 Instant Start
```python
from skills.desktop_control import DesktopController
dc = DesktopController()
```
## 🖱️ Mouse Control (Top 10)
```python
# 1. Move mouse
dc.move_mouse(500, 300, duration=0.5)
# 2. Click
dc.click(500, 300) # Left click at position
dc.click() # Click at current position
# 3. Right click
dc.right_click(500, 300)
# 4. Double click
dc.double_click(500, 300)
# 5. Drag & drop
dc.drag(100, 100, 500, 500, duration=1.0)
# 6. Scroll
dc.scroll(-5) # Scroll down 5 clicks
# 7. Get position
x, y = dc.get_mouse_position()
# 8. Move relative
dc.move_relative(100, 50) # Move 100px right, 50px down
# 9. Smooth movement
dc.move_mouse(1000, 500, duration=1.0, smooth=True)
# 10. Middle click
dc.middle_click()
```
## ⌨️ Keyboard Control (Top 10)
```python
# 1. Type text (instant)
dc.type_text("Hello World")
# 2. Type text (human-like, 60 WPM)
dc.type_text("Hello World", wpm=60)
# 3. Press key
dc.press('enter')
dc.press('tab')
dc.press('escape')
# 4. Hotkeys (shortcuts)
dc.hotkey('ctrl', 'c') # Copy
dc.hotkey('ctrl', 'v') # Paste
dc.hotkey('ctrl', 's') # Save
dc.hotkey('win', 'r') # Run dialog
dc.hotkey('alt', 'tab') # Switch window
# 5. Hold & release
dc.key_down('shift')
dc.type_text("hello") # Types "HELLO"
dc.key_up('shift')
# 6. Arrow keys
dc.press('up')
dc.press('down')
dc.press('left')
dc.press('right')
# 7. Function keys
dc.press('f5') # Refresh
# 8. Multiple presses
dc.press('backspace', presses=5)
# 9. Special keys
dc.press('home')
dc.press('end')
dc.press('pagedown')
dc.press('delete')
# 10. Fast combo
dc.hotkey('ctrl', 'alt', 'delete')
```
## 📸 Screen Operations (Top 5)
```python
# 1. Screenshot (full screen)
img = dc.screenshot()
dc.screenshot(filename="screen.png")
# 2. Screenshot (region)
img = dc.screenshot(region=(100, 100, 800, 600))
# 3. Get pixel color
r, g, b = dc.get_pixel_color(500, 300)
# 4. Find image on screen
location = dc.find_on_screen("button.png")
# 5. Get screen size
width, height = dc.get_screen_size()
```
## 🪟 Window Management (Top 5)
```python
# 1. Get all windows
windows = dc.get_all_windows()
# 2. Activate window
dc.activate_window("Chrome")
# 3. Get active window
active = dc.get_active_window()
# 4. List windows
for title in dc.get_all_windows():
print(title)
# 5. Switch to app
dc.activate_window("Visual Studio Code")
```
## 📋 Clipboard (Top 2)
```python
# 1. Copy to clipboard
dc.copy_to_clipboard("Hello!")
# 2. Get from clipboard
text = dc.get_from_clipboard()
```
## 🔥 Real-World Examples
### Example 1: Auto-fill Form
```python
dc.click(300, 200) # Name field
dc.type_text("John Doe", wpm=80)
dc.press('tab')
dc.type_text("john@email.com", wpm=80)
dc.press('tab')
dc.type_text("Password123", wpm=60)
dc.press('enter')
```
### Example 2: Copy-Paste Automation
```python
# Select all
dc.hotkey('ctrl', 'a')
# Copy
dc.hotkey('ctrl', 'c')
# Wait
dc.pause(0.5)
# Switch window
dc.hotkey('alt', 'tab')
# Paste
dc.hotkey('ctrl', 'v')
```
### Example 3: File Operations
```python
# Select multiple files
dc.key_down('ctrl')
dc.click(100, 200)
dc.click(100, 250)
dc.click(100, 300)
dc.key_up('ctrl')
# Copy
dc.hotkey('ctrl', 'c')
```
### Example 4: Screenshot Workflow
```python
# Take screenshot with a timestamped filename
import time
dc.screenshot(filename=f"capture_{time.time()}.png")
# Open in Paint
dc.hotkey('win', 'r')
dc.pause(0.5)
dc.type_text('mspaint')
dc.press('enter')
```
### Example 5: Search & Replace
```python
# Open Find & Replace
dc.hotkey('ctrl', 'h')
dc.pause(0.3)
# Type find text
dc.type_text("old_text")
dc.press('tab')
# Type replace text
dc.type_text("new_text")
# Replace all
dc.hotkey('alt', 'a')
```
## ⚙️ Configuration
```python
# With failsafe (move to corner to abort)
dc = DesktopController(failsafe=True)
# With approval mode (ask before each action)
dc = DesktopController(require_approval=True)
# Maximum speed (no safety checks)
dc = DesktopController(failsafe=False)
```
## 🛡️ Safety
```python
# Check if safe to continue
if dc.is_safe():
dc.click(500, 500)
# Pause execution
dc.pause(2.0) # Wait 2 seconds
# Emergency abort: Move mouse to any screen corner
```
## 🎯 Pro Tips
1. **Instant typing**: `interval=0` or `wpm=None`
2. **Human typing**: `wpm=60` (60 words/min)
3. **Smooth mouse**: `duration=0.5, smooth=True`
4. **Instant mouse**: `duration=0`
5. **Wait for UI**: `dc.pause(0.5)` between actions
6. **Failsafe**: Always enable for safety
7. **Test first**: Use `demo.py` to test features
8. **Coordinates**: Use `get_mouse_position()` to find them
9. **Screenshots**: Capture before/after for verification
10. **Hotkeys > Menus**: Faster and more reliable
## 📦 Dependencies
```bash
pip install pyautogui pillow opencv-python pygetwindow pyperclip
```
## 🚨 Common Issues
**Mouse not moving correctly?**
- Check DPI scaling in Windows settings
- Verify coordinates with `get_mouse_position()`
**Keyboard not working?**
- Ensure target app has focus
- Some apps block automation (games, secure apps)
**Failsafe triggering?**
- Keep mouse away from screen corners
- Disable if needed: `failsafe=False`
---
**Built for OpenClaw** 🦞 - Desktop automation made easy!

SKILL.md Normal file

@@ -0,0 +1,623 @@
---
description: "Advanced desktop automation with mouse, keyboard, and screen control."
---
# Desktop Control Skill
**The most advanced desktop automation skill for OpenClaw.** Provides pixel-perfect mouse control, lightning-fast keyboard input, screen capture, window management, and clipboard operations.
## 🎯 Features
### Mouse Control
- **Absolute positioning** - Move to exact coordinates
- **Relative movement** - Move from current position
- **Smooth movement** - Natural, human-like mouse paths
- **Click types** - Left, right, middle, double, triple clicks
- **Drag & drop** - Drag from point A to point B
- **Scroll** - Vertical and horizontal scrolling
- **Position tracking** - Get current mouse coordinates
### Keyboard Control
- **Text typing** - Fast, accurate text input
- **Hotkeys** - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
- **Special keys** - Enter, Tab, Escape, Arrow keys, F-keys
- **Key combinations** - Multi-key press combinations
- **Hold & release** - Manual key state control
- **Typing speed** - Configurable WPM (instant to human-like)
### Screen Operations
- **Screenshot** - Capture entire screen or regions
- **Image recognition** - Find elements on screen (via OpenCV)
- **Color detection** - Get pixel colors at coordinates
- **Multi-monitor** - Support for multiple displays
### Window Management
- **Window list** - Get all open windows
- **Activate window** - Bring window to front
- **Window info** - Get position, size, title
- **Minimize/Maximize** - Control window states
### Safety Features
- **Failsafe** - Move mouse to corner to abort
- **Pause control** - Emergency stop mechanism
- **Approval mode** - Require confirmation for actions
- **Bounds checking** - Prevent out-of-screen operations
- **Logging** - Track all automation actions
---
## 🚀 Quick Start
### Installation
First, install required dependencies:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```
### Basic Usage
```python
from skills.desktop_control import DesktopController
# Initialize controller
dc = DesktopController(failsafe=True)
# Mouse operations
dc.move_mouse(500, 300) # Move to coordinates
dc.click() # Left click at current position
dc.click(100, 200, button="right") # Right click at position
# Keyboard operations
dc.type_text("Hello from OpenClaw!")
dc.hotkey("ctrl", "c") # Copy
dc.press("enter")
# Screen operations
screenshot = dc.screenshot()
position = dc.get_mouse_position()
```
---
## 📋 Complete API Reference
### Mouse Functions
#### `move_mouse(x, y, duration=0, smooth=True)`
Move mouse to absolute screen coordinates.
**Parameters:**
- `x` (int): X coordinate (pixels from left)
- `y` (int): Y coordinate (pixels from top)
- `duration` (float): Movement time in seconds (0 = instant, 0.5 = smooth)
- `smooth` (bool): Use bezier curve for natural movement
**Example:**
```python
# Instant movement
dc.move_mouse(1000, 500)
# Smooth 1-second movement
dc.move_mouse(1000, 500, duration=1.0)
```
#### `move_relative(x_offset, y_offset, duration=0)`
Move mouse relative to current position.
**Parameters:**
- `x_offset` (int): Pixels to move horizontally (positive = right)
- `y_offset` (int): Pixels to move vertically (positive = down)
- `duration` (float): Movement time in seconds
**Example:**
```python
# Move 100px right, 50px down
dc.move_relative(100, 50, duration=0.3)
```
#### `click(x=None, y=None, button='left', clicks=1, interval=0.1)`
Perform mouse click.
**Parameters:**
- `x, y` (int, optional): Coordinates to click (None = current position)
- `button` (str): 'left', 'right', 'middle'
- `clicks` (int): Number of clicks (1 = single, 2 = double)
- `interval` (float): Delay between multiple clicks
**Example:**
```python
# Simple left click
dc.click()
# Double-click at specific position
dc.click(500, 300, clicks=2)
# Right-click
dc.click(button='right')
```
#### `drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')`
Drag and drop operation.
**Parameters:**
- `start_x, start_y` (int): Starting coordinates
- `end_x, end_y` (int): Ending coordinates
- `duration` (float): Drag duration
- `button` (str): Mouse button to use
**Example:**
```python
# Drag file from desktop to folder
dc.drag(100, 100, 500, 500, duration=1.0)
```
#### `scroll(clicks, direction='vertical', x=None, y=None)`
Scroll mouse wheel.
**Parameters:**
- `clicks` (int): Scroll amount (positive = up/left, negative = down/right)
- `direction` (str): 'vertical' or 'horizontal'
- `x, y` (int, optional): Position to scroll at
**Example:**
```python
# Scroll down 5 clicks
dc.scroll(-5)
# Scroll up 10 clicks
dc.scroll(10)
# Horizontal scroll
dc.scroll(5, direction='horizontal')
```
#### `get_mouse_position()`
Get current mouse coordinates.
**Returns:** `(x, y)` tuple
**Example:**
```python
x, y = dc.get_mouse_position()
print(f"Mouse is at: {x}, {y}")
```
---
### Keyboard Functions
#### `type_text(text, interval=0, wpm=None)`
Type text with configurable speed.
**Parameters:**
- `text` (str): Text to type
- `interval` (float): Delay between keystrokes (0 = instant)
- `wpm` (int, optional): Words per minute (overrides interval)
**Example:**
```python
# Instant typing
dc.type_text("Hello World")
# Human-like typing at 60 WPM
dc.type_text("Hello World", wpm=60)
# Slow typing with 0.1s between keys
dc.type_text("Hello World", interval=0.1)
```
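The `wpm` value is converted to a per-keystroke interval assuming an average of 5 characters per word, so 60 WPM works out to 5 characters per second, i.e. a 0.2 s interval. The arithmetic, as a standalone helper:

```python
def wpm_to_interval(wpm):
    """Convert words per minute to seconds per keystroke (5 chars/word)."""
    chars_per_second = (wpm * 5) / 60
    return 1.0 / chars_per_second

wpm_to_interval(60)   # 0.2 s between keystrokes
wpm_to_interval(120)  # 0.1 s between keystrokes
```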
#### `press(key, presses=1, interval=0.1)`
Press and release a key.
**Parameters:**
- `key` (str): Key name (see Key Names section)
- `presses` (int): Number of times to press
- `interval` (float): Delay between presses
**Example:**
```python
# Press Enter
dc.press('enter')
# Press Space 3 times
dc.press('space', presses=3)
# Press Down arrow
dc.press('down')
```
#### `hotkey(*keys, interval=0.05)`
Execute keyboard shortcut.
**Parameters:**
- `*keys` (str): Keys to press together
- `interval` (float): Delay between key presses
**Example:**
```python
# Copy (Ctrl+C)
dc.hotkey('ctrl', 'c')
# Paste (Ctrl+V)
dc.hotkey('ctrl', 'v')
# Open Run dialog (Win+R)
dc.hotkey('win', 'r')
# Save (Ctrl+S)
dc.hotkey('ctrl', 's')
# Select All (Ctrl+A)
dc.hotkey('ctrl', 'a')
```
#### `key_down(key)` / `key_up(key)`
Manually control key state.
**Example:**
```python
# Hold Shift
dc.key_down('shift')
dc.type_text("hello") # Types "HELLO"
dc.key_up('shift')
# Hold Ctrl and click (for multi-select)
dc.key_down('ctrl')
dc.click(100, 100)
dc.click(200, 100)
dc.key_up('ctrl')
```
---
### Screen Functions
#### `screenshot(region=None, filename=None)`
Capture screen or region.
**Parameters:**
- `region` (tuple, optional): (left, top, width, height) for partial capture
- `filename` (str, optional): Path to save image
**Returns:** PIL Image object
**Example:**
```python
# Full screen
img = dc.screenshot()
# Save to file
dc.screenshot(filename="screenshot.png")
# Capture specific region
img = dc.screenshot(region=(100, 100, 500, 300))
```
#### `get_pixel_color(x, y)`
Get color of pixel at coordinates.
**Returns:** RGB tuple `(r, g, b)`
**Example:**
```python
r, g, b = dc.get_pixel_color(500, 300)
print(f"Color at (500, 300): RGB({r}, {g}, {b})")
```
#### `find_on_screen(image_path, confidence=0.8)`
Find image on screen (requires OpenCV).
**Parameters:**
- `image_path` (str): Path to template image
- `confidence` (float): Match threshold (0-1)
**Returns:** `(x, y, width, height)` or None
**Example:**
```python
# Find button on screen
location = dc.find_on_screen("button.png")
if location:
x, y, w, h = location
# Click center of found image
dc.click(x + w//2, y + h//2)
```
#### `get_screen_size()`
Get screen resolution.
**Returns:** `(width, height)` tuple
**Example:**
```python
width, height = dc.get_screen_size()
print(f"Screen: {width}x{height}")
```
---
### Window Functions
#### `get_all_windows()`
List all open windows.
**Returns:** List of window titles
**Example:**
```python
windows = dc.get_all_windows()
for title in windows:
print(f"Window: {title}")
```
#### `activate_window(title_substring)`
Bring window to front by title.
**Parameters:**
- `title_substring` (str): Part of window title to match
**Example:**
```python
# Activate Chrome
dc.activate_window("Chrome")
# Activate VS Code
dc.activate_window("Visual Studio Code")
```
#### `get_active_window()`
Get currently focused window.
**Returns:** Window title (str)
**Example:**
```python
active = dc.get_active_window()
print(f"Active window: {active}")
```
---
### Clipboard Functions
#### `copy_to_clipboard(text)`
Copy text to clipboard.
**Example:**
```python
dc.copy_to_clipboard("Hello from OpenClaw!")
```
#### `get_from_clipboard()`
Get text from clipboard.
**Returns:** str
**Example:**
```python
text = dc.get_from_clipboard()
print(f"Clipboard: {text}")
```
---
## ⌨️ Key Names Reference
### Alphabet Keys
`'a'` through `'z'`
### Number Keys
`'0'` through `'9'`
### Function Keys
`'f1'` through `'f24'`
### Special Keys
- `'enter'` / `'return'`
- `'esc'` / `'escape'`
- `'space'` / `'spacebar'`
- `'tab'`
- `'backspace'`
- `'delete'` / `'del'`
- `'insert'`
- `'home'`
- `'end'`
- `'pageup'` / `'pgup'`
- `'pagedown'` / `'pgdn'`
### Arrow Keys
- `'up'` / `'down'` / `'left'` / `'right'`
### Modifier Keys
- `'ctrl'` / `'control'`
- `'shift'`
- `'alt'`
- `'win'` / `'winleft'` / `'winright'`
- `'cmd'` / `'command'` (Mac)
### Lock Keys
- `'capslock'`
- `'numlock'`
- `'scrolllock'`
### Punctuation
- `'.'` / `','` / `'?'` / `'!'` / `';'` / `':'`
- `'['` / `']'` / `'{'` / `'}'`
- `'('` / `')'`
- `'+'` / `'-'` / `'*'` / `'/'` / `'='`
---
## 🛡️ Safety Features
### Failsafe Mode
Move mouse to **any corner** of the screen to abort all automation.
```python
# Enable failsafe (enabled by default)
dc = DesktopController(failsafe=True)
```
### Pause Control
```python
# Pause all automation for 2 seconds
dc.pause(2.0)
# Check if automation is safe to proceed
if dc.is_safe():
dc.click(500, 500)
```
### Approval Mode
Require user confirmation before actions:
```python
dc = DesktopController(require_approval=True)
# This will ask for confirmation
dc.click(500, 500) # Prompt: "Allow click at (500, 500)? [y/n]"
```
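A minimal sketch of what such an approval gate might look like internally (a hypothetical standalone helper, not the controller's actual implementation; `prompt` is injected here so the logic is testable):

```python
def check_approval(description, require_approval, prompt=input):
    """Return True if the action may proceed, asking the user if required."""
    if not require_approval:
        return True
    answer = prompt(f"Allow {description}? [y/n] ")
    return answer.strip().lower() in ("y", "yes")

# With approval disabled, everything passes through.
check_approval("click at (500, 500)", require_approval=False)
# With approval enabled, a "y" answer is required.
check_approval("click at (500, 500)", True, prompt=lambda _: "y")
```

Every mouse and keyboard method would call a gate like this before acting, which is why approval mode slows automation to interactive speed.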
---
## 🎨 Advanced Examples
### Example 1: Automated Form Filling
```python
dc = DesktopController()
# Click name field
dc.click(300, 200)
dc.type_text("John Doe", wpm=80)
# Tab to next field
dc.press('tab')
dc.type_text("john@example.com", wpm=80)
# Tab to password
dc.press('tab')
dc.type_text("SecurePassword123", wpm=60)
# Submit form
dc.press('enter')
```
### Example 2: Screenshot Region and Save
```python
# Capture specific area
region = (100, 100, 800, 600) # left, top, width, height
img = dc.screenshot(region=region)
# Save with timestamp
import datetime
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
img.save(f"capture_{timestamp}.png")
```
### Example 3: Multi-File Selection
```python
# Hold Ctrl and click multiple files
dc.key_down('ctrl')
dc.click(100, 200) # First file
dc.click(100, 250) # Second file
dc.click(100, 300) # Third file
dc.key_up('ctrl')
# Copy selected files
dc.hotkey('ctrl', 'c')
```
### Example 4: Window Automation
```python
import time
# Activate Calculator
dc.activate_window("Calculator")
time.sleep(0.5)
# Type calculation
dc.type_text("5+3=", interval=0.2)
time.sleep(0.5)
# Take screenshot of result
dc.screenshot(filename="calculation_result.png")
```
### Example 5: Drag & Drop File
```python
# Drag file from source to destination
dc.drag(
start_x=200, start_y=300, # File location
end_x=800, end_y=500, # Folder location
duration=1.0 # Smooth 1-second drag
)
```
---
## ⚡ Performance Tips
1. **Use instant movements** for speed: `duration=0`
2. **Batch operations** instead of individual calls
3. **Cache screen positions** instead of recalculating
4. **Disable failsafe** for maximum performance (use with caution)
5. **Use hotkeys** instead of menu navigation
---
## ⚠️ Important Notes
- **Screen coordinates** start at (0, 0) in top-left corner
- **Multi-monitor setups** may have negative coordinates for secondary displays
- **Windows DPI scaling** may affect coordinate accuracy
- **Failsafe corners** are: (0,0), (width-1, 0), (0, height-1), (width-1, height-1)
- **Some applications** may block simulated input (games, secure apps)
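The four failsafe corners listed above can be derived from the screen size. A tiny helper for reference (`failsafe_corners` is not part of the controller's API, just an illustration of the coordinates):

```python
def failsafe_corners(width, height):
    """Return the four corner pixels this document lists as abort zones."""
    return [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]

failsafe_corners(1920, 1080)
# [(0, 0), (1919, 0), (0, 1079), (1919, 1079)]
```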
---
## 🔧 Troubleshooting
### Mouse not moving to correct position
- Check DPI scaling settings
- Verify screen resolution matches expectations
- Use `get_screen_size()` to confirm dimensions
### Keyboard input not working
- Ensure target application has focus
- Some apps require admin privileges
- Try increasing `interval` for reliability
### Failsafe triggering accidentally
- Increase screen border tolerance
- Move mouse away from corners during normal use
- Disable if needed: `DesktopController(failsafe=False)`
### Permission errors
- Run Python with administrator privileges for some operations
- Some secure applications block automation
---
## 📦 Dependencies
- **PyAutoGUI** - Core automation engine
- **Pillow** - Image processing
- **OpenCV** (optional) - Image recognition
- **PyGetWindow** - Window management
Install all:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```
---
**Built for OpenClaw** - The ultimate desktop automation companion 🦞

__init__.py Normal file

@@ -0,0 +1,522 @@
"""
Desktop Control - Advanced Mouse, Keyboard, and Screen Automation
Fast, responsive desktop control for OpenClaw
"""
import pyautogui
import time
import sys
from typing import Tuple, Optional, List, Union
from pathlib import Path
import logging
# Configure PyAutoGUI
pyautogui.MINIMUM_DURATION = 0 # Allow instant movements
pyautogui.MINIMUM_SLEEP = 0 # No forced delays
pyautogui.PAUSE = 0 # No pause between function calls
# Setup logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
class DesktopController:
"""
Advanced desktop automation controller with mouse, keyboard, and screen operations.
Designed for maximum responsiveness and reliability.
"""
def __init__(self, failsafe: bool = True, require_approval: bool = False):
"""
Initialize desktop controller.
Args:
failsafe: Enable failsafe (move mouse to corner to abort)
require_approval: Require user confirmation for actions
"""
self.failsafe = failsafe
self.require_approval = require_approval
pyautogui.FAILSAFE = failsafe
# Get screen info
self.screen_width, self.screen_height = pyautogui.size()
logger.info(f"Desktop Controller initialized. Screen: {self.screen_width}x{self.screen_height}")
logger.info(f"Failsafe: {failsafe}, Require Approval: {require_approval}")
# ========== MOUSE OPERATIONS ==========
def move_mouse(self, x: int, y: int, duration: float = 0, smooth: bool = True) -> None:
"""
Move mouse to absolute screen coordinates.
Args:
x: X coordinate (pixels from left)
y: Y coordinate (pixels from top)
duration: Movement time in seconds (0 = instant)
smooth: Use smooth movement (cubic bezier)
"""
if self._check_approval(f"move mouse to ({x}, {y})"):
if smooth and duration > 0:
pyautogui.moveTo(x, y, duration=duration, tween=pyautogui.easeInOutQuad)
else:
pyautogui.moveTo(x, y, duration=duration)
logger.debug(f"Moved mouse to ({x}, {y}) in {duration}s")
def move_relative(self, x_offset: int, y_offset: int, duration: float = 0) -> None:
"""
Move mouse relative to current position.
Args:
x_offset: Pixels to move horizontally (+ = right, - = left)
y_offset: Pixels to move vertically (+ = down, - = up)
duration: Movement time in seconds
"""
if self._check_approval(f"move mouse relative ({x_offset}, {y_offset})"):
pyautogui.move(x_offset, y_offset, duration=duration)
logger.debug(f"Moved mouse relative ({x_offset}, {y_offset})")
def click(self, x: Optional[int] = None, y: Optional[int] = None,
button: str = 'left', clicks: int = 1, interval: float = 0.1) -> None:
"""
Perform mouse click.
Args:
x, y: Coordinates to click (None = current position)
button: 'left', 'right', 'middle'
clicks: Number of clicks (1 = single, 2 = double, etc.)
interval: Delay between multiple clicks
"""
position_str = f"at ({x}, {y})" if x is not None else "at current position"
if self._check_approval(f"{button} click {position_str}"):
pyautogui.click(x=x, y=y, clicks=clicks, interval=interval, button=button)
logger.info(f"{button.capitalize()} click {position_str} (x{clicks})")
def double_click(self, x: Optional[int] = None, y: Optional[int] = None) -> None:
"""Convenience method for double-click."""
self.click(x, y, clicks=2)
def right_click(self, x: Optional[int] = None, y: Optional[int] = None) -> None:
"""Convenience method for right-click."""
self.click(x, y, button='right')
def middle_click(self, x: Optional[int] = None, y: Optional[int] = None) -> None:
"""Convenience method for middle-click."""
self.click(x, y, button='middle')
def drag(self, start_x: int, start_y: int, end_x: int, end_y: int,
duration: float = 0.5, button: str = 'left') -> None:
"""
Drag and drop operation.
Args:
start_x, start_y: Starting coordinates
end_x, end_y: Ending coordinates
duration: Drag duration in seconds
button: Mouse button to use ('left', 'right', 'middle')
"""
if self._check_approval(f"drag from ({start_x}, {start_y}) to ({end_x}, {end_y})"):
pyautogui.moveTo(start_x, start_y)
time.sleep(0.05) # Small delay to ensure position
pyautogui.drag(end_x - start_x, end_y - start_y, duration=duration, button=button)
logger.info(f"Dragged from ({start_x}, {start_y}) to ({end_x}, {end_y})")
def scroll(self, clicks: int, direction: str = 'vertical',
x: Optional[int] = None, y: Optional[int] = None) -> None:
"""
Scroll mouse wheel.
Args:
clicks: Scroll amount (+ = up/left, - = down/right)
direction: 'vertical' or 'horizontal'
x, y: Position to scroll at (None = current position)
"""
if x is not None and y is not None:
pyautogui.moveTo(x, y)
if direction == 'vertical':
pyautogui.scroll(clicks)
else:
pyautogui.hscroll(clicks)
logger.debug(f"Scrolled {direction} {clicks} clicks")
def get_mouse_position(self) -> Tuple[int, int]:
"""
Get current mouse coordinates.
Returns:
(x, y) tuple
"""
pos = pyautogui.position()
return (pos.x, pos.y)
# ========== KEYBOARD OPERATIONS ==========
def type_text(self, text: str, interval: float = 0, wpm: Optional[int] = None) -> None:
"""
Type text with configurable speed.
Args:
text: Text to type
interval: Delay between keystrokes (0 = instant)
wpm: Words per minute (overrides interval, typical human: 40-80 WPM)
"""
if wpm is not None:
# Convert WPM to interval (assuming avg 5 chars per word)
chars_per_second = (wpm * 5) / 60
interval = 1.0 / chars_per_second
        if self._check_approval(f"type text: '{text[:50]}{'...' if len(text) > 50 else ''}'"):
pyautogui.write(text, interval=interval)
logger.info(f"Typed text: '{text[:50]}{'...' if len(text) > 50 else ''}' (interval={interval:.3f}s)")
def press(self, key: str, presses: int = 1, interval: float = 0.1) -> None:
"""
Press and release a key.
Args:
key: Key name (e.g., 'enter', 'space', 'a', 'f1')
presses: Number of times to press
interval: Delay between presses
"""
if self._check_approval(f"press '{key}' {presses}x"):
pyautogui.press(key, presses=presses, interval=interval)
logger.info(f"Pressed '{key}' {presses}x")
def hotkey(self, *keys, interval: float = 0.05) -> None:
"""
Execute keyboard shortcut (e.g., Ctrl+C, Alt+Tab).
Args:
*keys: Keys to press together (e.g., 'ctrl', 'c')
interval: Delay between key presses
"""
keys_str = '+'.join(keys)
if self._check_approval(f"hotkey: {keys_str}"):
pyautogui.hotkey(*keys, interval=interval)
logger.info(f"Executed hotkey: {keys_str}")
def key_down(self, key: str) -> None:
"""Press and hold a key without releasing."""
pyautogui.keyDown(key)
logger.debug(f"Key down: '{key}'")
def key_up(self, key: str) -> None:
"""Release a held key."""
pyautogui.keyUp(key)
logger.debug(f"Key up: '{key}'")
# ========== SCREEN OPERATIONS ==========
def screenshot(self, region: Optional[Tuple[int, int, int, int]] = None,
filename: Optional[str] = None):
"""
Capture screen or region.
Args:
region: (left, top, width, height) for partial capture
filename: Path to save image (None = return PIL Image)
Returns:
PIL Image object (if filename is None)
"""
img = pyautogui.screenshot(region=region)
if filename:
img.save(filename)
            logger.info(f"Screenshot saved to: {filename}")
else:
logger.debug(f"Screenshot captured (region={region})")
return img
def get_pixel_color(self, x: int, y: int) -> Tuple[int, int, int]:
"""
Get RGB color of pixel at coordinates.
Args:
x, y: Screen coordinates
Returns:
(r, g, b) tuple
"""
color = pyautogui.pixel(x, y)
return color
def find_on_screen(self, image_path: str, confidence: float = 0.8,
region: Optional[Tuple[int, int, int, int]] = None):
"""
Find image on screen using template matching.
Requires OpenCV (opencv-python).
Args:
image_path: Path to template image
confidence: Match threshold 0-1 (0.8 = 80% match)
region: Search region (left, top, width, height)
Returns:
(x, y, width, height) of match, or None if not found
"""
try:
location = pyautogui.locateOnScreen(image_path, confidence=confidence, region=region)
if location:
logger.info(f"Found '{image_path}' at {location}")
return location
else:
logger.debug(f"'{image_path}' not found on screen")
return None
except Exception as e:
logger.error(f"Error finding image: {e}")
return None
def get_screen_size(self) -> Tuple[int, int]:
"""
Get screen resolution.
Returns:
(width, height) tuple
"""
return (self.screen_width, self.screen_height)
# ========== WINDOW OPERATIONS ==========
def get_all_windows(self) -> List[str]:
"""
Get list of all open window titles.
Returns:
List of window title strings
"""
try:
import pygetwindow as gw
windows = gw.getAllTitles()
# Filter out empty titles
windows = [w for w in windows if w.strip()]
return windows
except ImportError:
logger.error("pygetwindow not installed. Run: pip install pygetwindow")
return []
except Exception as e:
logger.error(f"Error getting windows: {e}")
return []
def activate_window(self, title_substring: str) -> bool:
"""
Bring window to front by title (partial match).
Args:
title_substring: Part of window title to match
Returns:
True if window was activated, False otherwise
"""
try:
import pygetwindow as gw
windows = gw.getWindowsWithTitle(title_substring)
if windows:
windows[0].activate()
logger.info(f"Activated window: '{windows[0].title}'")
return True
else:
logger.warning(f"No window found with title containing: '{title_substring}'")
return False
except ImportError:
logger.error("pygetwindow not installed")
return False
except Exception as e:
logger.error(f"Error activating window: {e}")
return False
def get_active_window(self) -> Optional[str]:
"""
Get title of currently focused window.
Returns:
Window title string, or None if error
"""
try:
import pygetwindow as gw
active = gw.getActiveWindow()
return active.title if active else None
except ImportError:
logger.error("pygetwindow not installed")
return None
except Exception as e:
logger.error(f"Error getting active window: {e}")
return None
# ========== CLIPBOARD OPERATIONS ==========
def copy_to_clipboard(self, text: str) -> None:
"""
Copy text to clipboard.
Args:
text: Text to copy
"""
try:
import pyperclip
pyperclip.copy(text)
logger.info(f"Copied to clipboard: '{text[:50]}...'")
except ImportError:
logger.error("pyperclip not installed. Run: pip install pyperclip")
except Exception as e:
logger.error(f"Error copying to clipboard: {e}")
def get_from_clipboard(self) -> Optional[str]:
"""
Get text from clipboard.
Returns:
Clipboard text, or None if error
"""
try:
import pyperclip
text = pyperclip.paste()
logger.debug(f"Got from clipboard: '{text[:50]}...'")
return text
except ImportError:
logger.error("pyperclip not installed. Run: pip install pyperclip")
return None
except Exception as e:
logger.error(f"Error getting clipboard: {e}")
return None
# ========== UTILITY METHODS ==========
def pause(self, seconds: float) -> None:
"""
Pause automation for specified duration.
Args:
seconds: Time to pause
"""
logger.info(f"Pausing for {seconds}s...")
time.sleep(seconds)
def is_safe(self) -> bool:
"""
Check if it's safe to continue automation.
Returns False if mouse is in a corner (failsafe position).
Returns:
True if safe to continue
"""
if not self.failsafe:
return True
x, y = self.get_mouse_position()
corner_tolerance = 5
# Check corners
corners = [
(0, 0), # Top-left
(self.screen_width - 1, 0), # Top-right
(0, self.screen_height - 1), # Bottom-left
(self.screen_width - 1, self.screen_height - 1) # Bottom-right
]
for cx, cy in corners:
if abs(x - cx) <= corner_tolerance and abs(y - cy) <= corner_tolerance:
logger.warning(f"Mouse in corner ({x}, {y}) - FAILSAFE TRIGGERED")
return False
return True
def _check_approval(self, action: str) -> bool:
"""
Check if user approves action (if approval mode is enabled).
Args:
action: Description of action
Returns:
True if approved (or approval not required)
"""
if not self.require_approval:
return True
response = input(f"Allow: {action}? [y/n]: ").strip().lower()
approved = response in ['y', 'yes']
if not approved:
logger.warning(f"Action declined: {action}")
return approved
# ========== CONVENIENCE METHODS ==========
def alert(self, text: str = '', title: str = 'Alert', button: str = 'OK') -> None:
"""Show alert dialog box."""
pyautogui.alert(text=text, title=title, button=button)
    def confirm(self, text: str = '', title: str = 'Confirm', buttons: Optional[List[str]] = None) -> str:
"""Show confirmation dialog with buttons."""
if buttons is None:
buttons = ['OK', 'Cancel']
return pyautogui.confirm(text=text, title=title, buttons=buttons)
def prompt(self, text: str = '', title: str = 'Input', default: str = '') -> Optional[str]:
"""Show input prompt dialog."""
return pyautogui.prompt(text=text, title=title, default=default)
# ========== QUICK ACCESS FUNCTIONS ==========
# Global controller instance for quick access
_controller = None
def get_controller(**kwargs) -> DesktopController:
"""Get or create global controller instance."""
global _controller
if _controller is None:
_controller = DesktopController(**kwargs)
return _controller
# Convenience function exports
def move_mouse(x: int, y: int, duration: float = 0) -> None:
"""Quick mouse move."""
get_controller().move_mouse(x, y, duration)
def click(x: Optional[int] = None, y: Optional[int] = None, button: str = 'left') -> None:
"""Quick click."""
get_controller().click(x, y, button=button)
def type_text(text: str, wpm: Optional[int] = None) -> None:
"""Quick text typing."""
get_controller().type_text(text, wpm=wpm)
def hotkey(*keys) -> None:
"""Quick hotkey."""
get_controller().hotkey(*keys)
def screenshot(filename: Optional[str] = None):
"""Quick screenshot."""
return get_controller().screenshot(filename=filename)
# ========== DEMONSTRATION ==========
if __name__ == "__main__":
print("🖱️ Desktop Control Skill - Test Mode")
print("=" * 50)
# Initialize controller
dc = DesktopController(failsafe=True)
# Display info
print(f"\n📺 Screen Size: {dc.get_screen_size()}")
print(f"🖱️ Current Mouse Position: {dc.get_mouse_position()}")
# Test window operations
print(f"\n🪟 Active Window: {dc.get_active_window()}")
windows = dc.get_all_windows()
print(f"\n📋 Open Windows ({len(windows)}):")
for i, title in enumerate(windows[:10], 1): # Show first 10
print(f" {i}. {title}")
print("\n✅ Desktop Control ready!")
print("⚠️ Move mouse to any corner to trigger failsafe")
# Keep running to allow testing
print("\nController is ready. Import this module to use it in your OpenClaw skills!")
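For reference, the WPM handling in `type_text` reduces to the arithmetic below (a minimal standalone sketch; `wpm_to_interval` is an illustrative helper, not part of the module):

```python
def wpm_to_interval(wpm: int) -> float:
    # Assume ~5 characters per average word: chars/sec = wpm * 5 / 60
    chars_per_second = (wpm * 5) / 60
    return 1.0 / chars_per_second

# 60 WPM -> 5 chars/sec -> 0.2 s between keystrokes
print(round(wpm_to_interval(60), 3))
```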

_meta.json
{
"ownerId": "kn7ag28ra4hhta8bx2k2j1kpv180kqbk",
"slug": "desktop-control",
"version": "1.0.0",
"publishedAt": 1770255200863
}

ai_agent.py
"""
AI Desktop Agent - Cognitive Desktop Automation
Combines vision, reasoning, and control for autonomous task execution
"""
import time
from typing import Dict, List, Optional, Any
import logging
from desktop_control import DesktopController
logger = logging.getLogger(__name__)
class AIDesktopAgent:
"""
Intelligent desktop agent that combines computer vision, LLM reasoning,
and desktop control for autonomous task execution.
Can understand screen content, plan actions, and execute complex workflows.
"""
def __init__(self, llm_client=None, failsafe: bool = True):
"""
Initialize AI Desktop Agent.
Args:
            llm_client: OpenClaw LLM client for reasoning (optional; falls back to rule-based planning when absent)
failsafe: Enable failsafe mode
"""
self.dc = DesktopController(failsafe=failsafe)
self.llm_client = llm_client
self.screen_width, self.screen_height = self.dc.get_screen_size()
# Action history for learning
self.action_history = []
# Application knowledge base
self.app_knowledge = self._load_app_knowledge()
logger.info("AI Desktop Agent initialized")
def _load_app_knowledge(self) -> Dict[str, Dict]:
"""
Load application-specific knowledge.
This can be extended with learned patterns.
"""
return {
"mspaint": {
"name": "Microsoft Paint",
"launch_command": "mspaint",
"common_actions": {
"select_pencil": {"menu": "Tools", "position": "toolbar_left"},
"select_brush": {"menu": "Tools", "position": "toolbar"},
"select_color": {"menu": "Colors", "action": "click_palette"},
"draw_line": {"action": "drag", "tool_required": "line"},
}
},
"notepad": {
"name": "Notepad",
"launch_command": "notepad",
"common_actions": {
"type_text": {"action": "type"},
"save": {"hotkey": ["ctrl", "s"]},
"new_file": {"hotkey": ["ctrl", "n"]},
}
},
"calculator": {
"name": "Calculator",
"launch_command": "calc",
"common_actions": {
"calculate": {"action": "type_numbers"},
}
}
}
def execute_task(self, task: str, max_steps: int = 50) -> Dict[str, Any]:
"""
Execute a high-level task autonomously.
Args:
task: Natural language task description
max_steps: Maximum number of steps to attempt
Returns:
Execution result with status and details
"""
logger.info(f"Executing task: {task}")
# Initialize result
result = {
"task": task,
"status": "in_progress",
"steps": [],
"screenshots": [],
"success": False
}
try:
# Step 1: Analyze task and plan
plan = self._plan_task(task)
logger.info(f"Generated plan with {len(plan)} steps")
# Step 2: Execute plan step by step
for step_num, step in enumerate(plan, 1):
if step_num > max_steps:
logger.warning(f"Reached max steps ({max_steps})")
break
logger.info(f"Step {step_num}/{len(plan)}: {step['description']}")
# Capture screen before action
screenshot_before = self.dc.screenshot()
# Execute step
step_result = self._execute_step(step)
result["steps"].append(step_result)
# Capture screen after action
screenshot_after = self.dc.screenshot()
result["screenshots"].append({
"step": step_num,
"before": screenshot_before,
"after": screenshot_after
})
# Verify step success
if not step_result.get("success", False):
logger.error(f"Step {step_num} failed: {step_result.get('error')}")
result["status"] = "failed"
result["failed_at_step"] = step_num
return result
# Small delay between steps
time.sleep(0.5)
result["status"] = "completed"
result["success"] = True
logger.info(f"Task completed successfully in {len(result['steps'])} steps")
except Exception as e:
logger.error(f"Task execution error: {e}")
result["status"] = "error"
result["error"] = str(e)
return result
def _plan_task(self, task: str) -> List[Dict[str, Any]]:
"""
Plan task execution using LLM reasoning.
Args:
task: Task description
Returns:
List of execution steps
"""
# For now, use rule-based planning
# TODO: Integrate with OpenClaw LLM for intelligent planning
# Parse task intent
task_lower = task.lower()
# Pattern matching for common tasks
if "draw" in task_lower and "paint" in task_lower:
return self._plan_paint_drawing(task)
elif "type" in task_lower or "write" in task_lower:
return self._plan_text_entry(task)
elif "play" in task_lower and "game" in task_lower:
return self._plan_game_play(task)
elif "open" in task_lower or "launch" in task_lower:
return self._plan_app_launch(task)
else:
# Generic plan - analyze and improvise
return self._plan_generic(task)
def _plan_paint_drawing(self, task: str) -> List[Dict]:
"""Plan for drawing in MS Paint."""
# Extract what to draw
drawing_subject = self._extract_subject(task)
return [
{
"type": "launch_app",
"app": "mspaint",
"description": "Launch Microsoft Paint"
},
{
"type": "wait",
"duration": 2.0,
"description": "Wait for Paint to load"
},
{
"type": "activate_window",
"title": "Paint",
"description": "Ensure Paint window is active"
},
{
"type": "select_tool",
"tool": "pencil",
"description": "Select pencil tool"
},
{
"type": "draw",
"subject": drawing_subject,
"description": f"Draw {drawing_subject}"
},
{
"type": "screenshot",
"save_as": "drawing_result.png",
"description": "Capture the drawing"
}
]
def _plan_text_entry(self, task: str) -> List[Dict]:
"""Plan for text entry task."""
# Extract text to type
text_content = self._extract_text_content(task)
return [
{
"type": "launch_app",
"app": "notepad",
"description": "Launch Notepad"
},
{
"type": "wait",
"duration": 1.0,
"description": "Wait for Notepad to load"
},
{
"type": "type_text",
"text": text_content,
"wpm": 80,
"description": f"Type: {text_content[:50]}..."
}
]
def _plan_game_play(self, task: str) -> List[Dict]:
"""Plan for playing a game."""
game_name = self._extract_game_name(task)
return [
{
"type": "analyze_screen",
"description": "Analyze game screen"
},
{
"type": "detect_game_state",
"game": game_name,
"description": f"Detect {game_name} state"
},
{
"type": "execute_game_loop",
"game": game_name,
"max_iterations": 100,
"description": f"Play {game_name}"
}
]
def _plan_app_launch(self, task: str) -> List[Dict]:
"""Plan for launching an application."""
app_name = self._extract_app_name(task)
return [
{
"type": "launch_app",
"app": app_name,
"description": f"Launch {app_name}"
},
{
"type": "wait",
"duration": 2.0,
"description": f"Wait for {app_name} to load"
}
]
def _plan_generic(self, task: str) -> List[Dict]:
"""Generic planning fallback."""
return [
{
"type": "analyze_screen",
"description": "Analyze current screen state"
},
{
"type": "infer_action",
"task": task,
"description": f"Infer action for: {task}"
}
]
def _execute_step(self, step: Dict[str, Any]) -> Dict[str, Any]:
"""
Execute a single step.
Args:
step: Step definition
Returns:
Execution result
"""
step_type = step.get("type")
result = {"step": step, "success": False}
try:
if step_type == "launch_app":
self._do_launch_app(step["app"])
result["success"] = True
elif step_type == "wait":
time.sleep(step["duration"])
result["success"] = True
elif step_type == "activate_window":
success = self.dc.activate_window(step["title"])
result["success"] = success
elif step_type == "select_tool":
self._do_select_tool(step["tool"])
result["success"] = True
elif step_type == "draw":
self._do_draw(step["subject"])
result["success"] = True
elif step_type == "type_text":
self.dc.type_text(step["text"], wpm=step.get("wpm", 80))
result["success"] = True
elif step_type == "screenshot":
filename = step.get("save_as", "screenshot.png")
self.dc.screenshot(filename=filename)
result["success"] = True
result["saved_to"] = filename
elif step_type == "analyze_screen":
analysis = self._analyze_screen()
result["analysis"] = analysis
result["success"] = True
elif step_type == "execute_game_loop":
game_result = self._execute_game_loop(step)
result["game_result"] = game_result
result["success"] = True
else:
result["error"] = f"Unknown step type: {step_type}"
except Exception as e:
logger.error(f"Step execution error: {e}")
result["error"] = str(e)
return result
def _do_launch_app(self, app: str) -> None:
"""Launch an application."""
# Get launch command from knowledge base
app_info = self.app_knowledge.get(app, {})
launch_cmd = app_info.get("launch_command", app)
# Open Run dialog
self.dc.hotkey('win', 'r')
time.sleep(0.5)
# Type and execute command
self.dc.type_text(launch_cmd, wpm=100)
self.dc.press('enter')
logger.info(f"Launched: {app}")
def _do_select_tool(self, tool: str) -> None:
"""Select a tool (e.g., in Paint)."""
# This is simplified - in reality would use computer vision
# to find and click the tool button
# For Paint, tools are typically in the ribbon
# We'll use hotkeys where possible
if tool == "pencil":
# In Paint, press 'P' for pencil
self.dc.press('p')
elif tool == "brush":
self.dc.press('b')
elif tool == "eraser":
self.dc.press('e')
logger.info(f"Selected tool: {tool}")
def _do_draw(self, subject: str) -> None:
"""
Draw something on screen.
This is a simplified implementation - would be enhanced with:
- Image generation (use wan2gp to generate reference)
- Trace generation (convert image to draw commands)
        - Drawing execution (execute the commands)
"""
logger.info(f"Drawing: {subject}")
# Get canvas center (simplified - would detect canvas)
canvas_x = self.screen_width // 2
canvas_y = self.screen_height // 2
# Simple drawing pattern (example: draw a simple shape)
if "circle" in subject.lower():
self._draw_circle(canvas_x, canvas_y, radius=100)
elif "square" in subject.lower():
self._draw_square(canvas_x, canvas_y, size=200)
elif "star" in subject.lower():
self._draw_star(canvas_x, canvas_y, size=100)
else:
# Generic: draw a simple pattern
self._draw_simple_pattern(canvas_x, canvas_y)
logger.info(f"Completed drawing: {subject}")
def _draw_circle(self, cx: int, cy: int, radius: int) -> None:
"""Draw a circle."""
import math
points = []
for angle in range(0, 360, 5):
rad = math.radians(angle)
x = int(cx + radius * math.cos(rad))
y = int(cy + radius * math.sin(rad))
points.append((x, y))
# Draw by connecting points
for i in range(len(points) - 1):
self.dc.drag(points[i][0], points[i][1],
points[i+1][0], points[i+1][1],
duration=0.01)
# Close the circle
self.dc.drag(points[-1][0], points[-1][1],
points[0][0], points[0][1],
duration=0.01)
def _draw_square(self, cx: int, cy: int, size: int) -> None:
"""Draw a square."""
half = size // 2
corners = [
(cx - half, cy - half), # Top-left
(cx + half, cy - half), # Top-right
(cx + half, cy + half), # Bottom-right
(cx - half, cy + half), # Bottom-left
]
# Draw sides
for i in range(4):
start = corners[i]
end = corners[(i + 1) % 4]
self.dc.drag(start[0], start[1], end[0], end[1], duration=0.2)
def _draw_star(self, cx: int, cy: int, size: int) -> None:
"""Draw a 5-pointed star."""
import math
points = []
for i in range(10):
angle = math.radians(i * 36 - 90)
radius = size if i % 2 == 0 else size // 2
x = int(cx + radius * math.cos(angle))
y = int(cy + radius * math.sin(angle))
points.append((x, y))
# Draw by connecting points
for i in range(len(points)):
start = points[i]
end = points[(i + 1) % len(points)]
self.dc.drag(start[0], start[1], end[0], end[1], duration=0.1)
def _draw_simple_pattern(self, cx: int, cy: int) -> None:
"""Draw a simple decorative pattern."""
# Draw a few curved lines
for offset in [-50, 0, 50]:
self.dc.drag(cx - 100, cy + offset,
cx + 100, cy + offset,
duration=0.3)
def _analyze_screen(self) -> Dict[str, Any]:
"""
Analyze current screen state.
Would use OCR, object detection in full implementation.
"""
screenshot = self.dc.screenshot()
active_window = self.dc.get_active_window()
mouse_pos = self.dc.get_mouse_position()
analysis = {
"active_window": active_window,
"mouse_position": mouse_pos,
"screen_size": (self.screen_width, self.screen_height),
"timestamp": time.time()
}
# TODO: Add OCR, object detection, UI element detection
return analysis
def _execute_game_loop(self, step: Dict) -> Dict:
"""
Execute game playing loop.
Would use reinforcement learning in full implementation.
"""
game = step.get("game", "unknown")
max_iter = step.get("max_iterations", 100)
logger.info(f"Starting game loop for: {game}")
result = {
"game": game,
"iterations": 0,
"actions_taken": []
}
# Simple game loop - would be much more sophisticated
for i in range(max_iter):
# Analyze game state
state = self._analyze_screen()
# Decide action (simplified - would use ML model)
action = self._decide_game_action(state, game)
# Execute action
self._execute_game_action(action)
result["iterations"] += 1
result["actions_taken"].append(action)
# Check win/lose condition
# (would detect from screen)
time.sleep(0.1)
return result
def _decide_game_action(self, state: Dict, game: str) -> str:
"""Decide next game action based on state."""
# Simplified - would use game-specific AI
return "continue"
def _execute_game_action(self, action: str) -> None:
"""Execute a game action."""
# Simplified - would translate to specific inputs
pass
# Helper methods for parsing
def _extract_subject(self, text: str) -> str:
"""Extract subject from drawing request."""
# Simple extraction - would use NLP
if "draw" in text.lower():
parts = text.lower().split("draw")
if len(parts) > 1:
return parts[1].strip()
return "unknown"
    def _extract_text_content(self, text: str) -> str:
        """Extract text content from typing request."""
        # Simple case-insensitive extraction of everything after "type"
        lowered = text.lower()
        if "type" in lowered:
            idx = lowered.index("type") + len("type")
            return text[idx:].strip().strip('"').strip("'")
        return text
def _extract_game_name(self, text: str) -> str:
"""Extract game name from request."""
# Would use NER for better extraction
return "unknown_game"
def _extract_app_name(self, text: str) -> str:
"""Extract application name from request."""
# Simple extraction - would use NER
for app in self.app_knowledge.keys():
if app in text.lower():
return app
return "notepad" # Default fallback
# Quick access function
def create_agent(**kwargs) -> AIDesktopAgent:
"""Create an AI Desktop Agent instance."""
return AIDesktopAgent(**kwargs)
if __name__ == "__main__":
print("🤖 AI Desktop Agent - Cognitive Automation")
print("=" * 60)
# Create agent
agent = AIDesktopAgent(failsafe=True)
print("\n✨ Examples of what you can ask:")
print(" - 'Draw a circle in Paint'")
print(" - 'Type Hello World in Notepad'")
print(" - 'Open Calculator'")
print(" - 'Play Solitaire for me'")
print("\n🎯 Try it:")
task = input("\nWhat would you like me to do? ")
if task.strip():
result = agent.execute_task(task)
print(f"\n{'='* 60}")
print(f"Task Status: {result['status']}")
print(f"Steps Executed: {len(result['steps'])}")
print(f"Success: {result['success']}")
if result.get('screenshots'):
print(f"Screenshots captured: {len(result['screenshots'])}")
else:
print("\nNo task entered. Exiting.")
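The circumference sampling used by `_draw_circle` can be checked in isolation (a sketch; `circle_points` is a hypothetical standalone version of that loop, without the drag calls):

```python
import math

def circle_points(cx: int, cy: int, radius: int, step_deg: int = 5):
    # Sample the circumference every step_deg degrees, as _draw_circle does
    points = []
    for angle in range(0, 360, step_deg):
        rad = math.radians(angle)
        points.append((int(cx + radius * math.cos(rad)),
                       int(cy + radius * math.sin(rad))))
    return points

pts = circle_points(500, 400, 100)
print(len(pts), pts[0])  # 72 samples at 5-degree spacing; first point (600, 400)
```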

demo.py
"""
Desktop Control Demo - Quick examples and tests
"""
import sys
import time
from pathlib import Path
# Add skills to path
sys.path.insert(0, str(Path(__file__).parent))
from desktop_control import DesktopController
def demo_mouse_control():
"""Demo: Mouse movement and clicking"""
print("\n🖱️ === MOUSE CONTROL DEMO ===")
dc = DesktopController(failsafe=True)
print(f"Current position: {dc.get_mouse_position()}")
# Smooth movement
print("\n1. Moving mouse smoothly to center of screen...")
screen_w, screen_h = dc.get_screen_size()
center_x, center_y = screen_w // 2, screen_h // 2
dc.move_mouse(center_x, center_y, duration=1.0)
# Relative movement
print("2. Moving 100px right...")
dc.move_relative(100, 0, duration=0.5)
print(f"Final position: {dc.get_mouse_position()}")
print("✅ Mouse demo complete!")
def demo_keyboard_control():
"""Demo: Keyboard typing"""
print("\n⌨️ === KEYBOARD CONTROL DEMO ===")
dc = DesktopController()
print("\n⚠️ In 3 seconds, I'll type 'Hello from OpenClaw!' in the active window")
print("Switch to Notepad or any text editor NOW!")
time.sleep(3)
# Type with human-like speed
dc.type_text("Hello from OpenClaw! ", wpm=60)
dc.type_text("This is desktop automation in action. ", wpm=80)
# Press Enter
dc.press('enter')
dc.press('enter')
# Type instant
dc.type_text("This was typed instantly!", interval=0)
print("\n✅ Keyboard demo complete!")
def demo_screen_capture():
"""Demo: Screenshot functionality"""
print("\n📸 === SCREEN CAPTURE DEMO ===")
dc = DesktopController()
# Full screenshot
print("\n1. Capturing full screen...")
dc.screenshot(filename="demo_fullscreen.png")
print(" Saved: demo_fullscreen.png")
# Region screenshot (center 800x600)
print("\n2. Capturing center region (800x600)...")
screen_w, screen_h = dc.get_screen_size()
region = (
(screen_w - 800) // 2, # left
(screen_h - 600) // 2, # top
800, # width
600 # height
)
dc.screenshot(region=region, filename="demo_region.png")
print(" Saved: demo_region.png")
# Get pixel color
print("\n3. Getting pixel color at center...")
center_x, center_y = screen_w // 2, screen_h // 2
r, g, b = dc.get_pixel_color(center_x, center_y)
print(f" Color at ({center_x}, {center_y}): RGB({r}, {g}, {b})")
print("\n✅ Screen capture demo complete!")
def demo_window_management():
"""Demo: Window operations"""
print("\n🪟 === WINDOW MANAGEMENT DEMO ===")
dc = DesktopController()
# Get current window
print(f"\n1. Active window: {dc.get_active_window()}")
# List all windows
windows = dc.get_all_windows()
print(f"\n2. Found {len(windows)} open windows:")
for i, title in enumerate(windows[:15], 1): # Show first 15
print(f" {i}. {title}")
print("\n✅ Window management demo complete!")
def demo_hotkeys():
"""Demo: Keyboard shortcuts"""
print("\n🔥 === HOTKEY DEMO ===")
dc = DesktopController()
print("\n⚠️ This demo will:")
print(" 1. Open Windows Run dialog (Win+R)")
print(" 2. Type 'notepad'")
print(" 3. Press Enter to open Notepad")
print(" 4. Type a message")
print("\nPress Enter to continue...")
input()
# Open Run dialog
print("\n1. Opening Run dialog...")
dc.hotkey('win', 'r')
time.sleep(0.5)
# Type notepad command
print("2. Typing 'notepad'...")
dc.type_text('notepad', wpm=80)
time.sleep(0.3)
# Press Enter
print("3. Launching Notepad...")
dc.press('enter')
time.sleep(1)
# Type message
print("4. Typing message in Notepad...")
dc.type_text("Desktop Control Skill Test\n\n", wpm=60)
dc.type_text("This was automated by OpenClaw!\n", wpm=60)
dc.type_text("- Mouse control ✓\n", wpm=60)
dc.type_text("- Keyboard control ✓\n", wpm=60)
dc.type_text("- Hotkeys ✓\n", wpm=60)
print("\n✅ Hotkey demo complete!")
def demo_advanced_automation():
"""Demo: Complete automation workflow"""
print("\n🚀 === ADVANCED AUTOMATION DEMO ===")
dc = DesktopController()
print("\nThis demo will:")
print("1. Get your clipboard content")
print("2. Copy a new string to clipboard")
print("3. Show the changes")
print("\nPress Enter to continue...")
input()
# Get current clipboard
original = dc.get_from_clipboard()
print(f"\n1. Original clipboard: '{original}'")
# Copy new content
test_text = "Hello from OpenClaw Desktop Control!"
dc.copy_to_clipboard(test_text)
print(f"2. Copied to clipboard: '{test_text}'")
# Verify
new_clipboard = dc.get_from_clipboard()
print(f"3. Verified clipboard: '{new_clipboard}'")
# Restore original
if original:
dc.copy_to_clipboard(original)
print("4. Restored original clipboard")
print("\n✅ Advanced automation demo complete!")
def main():
"""Run all demos"""
print("=" * 60)
print("🎮 DESKTOP CONTROL SKILL - DEMO SUITE")
print("=" * 60)
print("\n⚠️ IMPORTANT:")
print("- Failsafe is ENABLED (move mouse to corner to abort)")
print("- Some demos will control your mouse and keyboard")
print("- Close important applications before continuing")
print("\n" + "=" * 60)
demos = [
("Mouse Control", demo_mouse_control),
("Window Management", demo_window_management),
("Screen Capture", demo_screen_capture),
("Hotkeys", demo_hotkeys),
("Keyboard Control", demo_keyboard_control),
("Advanced Automation", demo_advanced_automation),
]
while True:
print("\n📋 SELECT DEMO:")
for i, (name, _) in enumerate(demos, 1):
print(f" {i}. {name}")
print(f" {len(demos) + 1}. Run All")
print(" 0. Exit")
choice = input("\nEnter choice: ").strip()
if choice == '0':
print("\n👋 Goodbye!")
break
elif choice == str(len(demos) + 1):
for name, func in demos:
print(f"\n{'=' * 60}")
func()
time.sleep(1)
print(f"\n{'=' * 60}")
print("🎉 All demos complete!")
elif choice.isdigit() and 1 <= int(choice) <= len(demos):
demos[int(choice) - 1][1]()
else:
print("❌ Invalid choice!")
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n⚠️ Demo interrupted by user")
except Exception as e:
print(f"\n\n❌ Error: {e}")
import traceback
traceback.print_exc()
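The centered-region arithmetic in `demo_screen_capture` generalizes to a small helper (a sketch; `centered_region` is illustrative, not part of the skill):

```python
def centered_region(screen_w: int, screen_h: int, width: int, height: int):
    # (left, top, width, height) tuple for a region centered on the screen,
    # in the format pyautogui.screenshot(region=...) expects
    return ((screen_w - width) // 2, (screen_h - height) // 2, width, height)

print(centered_region(1920, 1080, 800, 600))  # (560, 240, 800, 600)
```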