Skip to content

AI Desktop Assistant

Mixstart 3.0 features a powerful built-in AI Desktop Assistant. It's not just a chatbot, but a smart steward deeply integrated into your system. You can control your computer, manage applications, process files, generate documents, and execute complex automation tasks using natural language commands.

Evoke Method

You can evoke the AI Assistant in the following way:

  • Shortcut Key: Press Alt + V (recommended).

Versatile Assistant Capabilities

The AI Assistant integrates 9 categories and over 30 system-level tools to provide comprehensive desktop assistance.

1. App & Window Management

AI can help you quickly launch, close, or switch between applications without searching for icons manually.

  • Launch/Close: "Open VS Code", "Close Notepad", "Force close the stuck program"
  • Window Control: "Switch to WeChat", "Back to browser"
  • Running Status: "What programs are running now?", "List all open windows"

2. System Control

Directly control system power states through voice or text commands, supporting scheduled tasks.

  • Power Management: "Lock screen", "Put computer to sleep", "Restart computer", "Shut down now"
  • Scheduled Tasks: "Shut down in 30 minutes", "Cancel shutdown plan"
  • System Status: "Check CPU and memory usage", "How long has my computer been running?"

3. File Management

Completely manage your file system through natural language.

  • File Browsing: "List all files in the Downloads folder", "See what's on the desktop"
  • Find Files: "Search for PDF files in the Downloads folder", "Find a document with 'report' in its name"
  • Organize Files: "Move all images from Downloads to the Pictures directory", "Copy all Word documents on the desktop to the backup folder"
  • Cleanup & Rename: "Delete this temporary file", "Rename this folder to Project_V2"

3.5 File Content Modification New

AI can directly help you modify file contents without opening an editor manually. Supports text files, Excel, and Word documents.

  • Text File Editing: "Replace line 3 of config.txt with new content", "Insert a piece of code after line 10", "Delete lines 5-8"
  • Excel Cell Operations: "Change cell A1 of the Excel to 100", "Add a row of data to the end of the sheet", "Replace all 'Pending' with 'Completed'"
  • Word Document Editing: "Replace 'John' with 'Doe' in the Word document", "Add a signature to the end of the document"
  • Batch Modification: "Replace 'Old Version' with 'New Version' in all txt files in the Downloads folder"
  • Auto-Compare: Automatically display a modification comparison (Diff) in the browser after changes for easy confirmation.

Security Mechanism

  • Modified files are saved as new files with a _modified suffix by default; the original file remains unaffected.
  • Automatic backups are created, supporting undo/restore.
  • If the document is opened by another program, a new filename will be generated automatically to avoid conflicts.

4. Office Document Generation

AI has a built-in Office document generation engine to create professional documents with one sentence.

  • Excel Spreadsheet: "Help me generate a sales report with date, product, and amount columns"
  • PPT Presentation: "Make a year-end summary PPT with a black and gold theme, including 5 pages"
  • Word Document: "Write a project meeting minutes template"

5. Desktop Organization & Pushing

Manage your Mixstart desktop components (fences, to-do items).

  • Fence Management: "Create a fence named 'Work'", "Show 'Game' fence"
  • To-do Items: "Create a to-do list window", "Show my to-do items"
  • Layout Control: "Close all desktop pushes", "See what fences are on the desktop"

6. Scheduled Reminders

Let AI be your personal secretary.

  • Set Reminders: "Remind me to have a meeting tomorrow at 9 AM", "Remind me to drink water every afternoon at 3 PM"
  • Manage Reminders: "View all my reminders", "Cancel tomorrow's meeting reminder"

7. Automation Scripts

AI can help you create, run, and manage Mix automation scripts.

  • Create Scripts: "Create a script: first open VS Code, then open Chrome, then play music"
  • Execute Scripts: "Run 'Dev Environment' script"
  • Manage Scripts: "List all scripts", "Check the content of 'Cleanup' script"

8. Information Query & Calculation

Handle daily searches and information queries.

  • App Search: "Search for apps starting with 'Photo'", "Find the Calculator in the system"
  • Weather Query: "How is the weather in Beijing today?", "Will it rain in London tomorrow?"
  • Math Calculation: "Calculate 128 * 45 + 300"
  • Time & Date: "What time is it now?", "What day is today?"

9. Web Browsing

  • Open Webpages: "Open YouTube", "Access GitHub"
  • Search Content: "Search for React tutorials using Google"

10. Clipboard Tools

AI can read, analyze, and modify your system clipboard.

  • Read Content: "Help me translate this passage in the clipboard", "Analyze what this code in the clipboard does"
  • Write Content: "Generate a leave request note and copy it to the clipboard"
  • Clear: "Clear clipboard"

12. Vision Agent Screen Proxy New

This is the most significant evolution of the AI Assistant. With Vision Agent mode, the AI no longer just gives you suggestions; it can actually observe your screen and operate your computer like a real person.

  • Autonomous Operation: One sentence allows the AI to automatically complete complex GUI operations, such as "Help me open Notepad and write a passage" or "Open WeChat and send a message to John."
  • Visual Feedback: The model automatically takes screenshots and analyzes every button and input box on the screen, making precise decisions with the help of vision models (e.g., GPT-4o, GeminiPro Vision).
  • Asynchronous Execution: Tasks run silently in the background without locking the AI chat window. You can see the steps the AI is currently executing in real-time (e.g., "Finding WeChat icon," "Typing text").
  • Multi-step Decomposition: For complex tasks, the Agent automatically decomposes them into multiple sub-steps and executes them in a loop until the task is completed.

Usage Note

  • Screen proxy involves high-risk simulation operations and will force a secondary confirmation before execution.
  • It is recommended to use it only when a high-performance Vision model (e.g., qwen-vl-max or gpt-4o) is configured to ensure accuracy.
  • If the task involves launching an app, it is recommended to use the launch_app tool first to ensure the app is in the foreground.

13. Code Execution & Data Acquisition Pro

AI can execute code directly to fetch external data and then generate documents based on that data. This is one of the most powerful features!

Typical Use Cases

CaseDescriptionExample Command
Web Data ScrapingGet data from website APIs"Get the trending videos from YouTube"
Data → DocumentGenerate tables/docs after getting data"Get the top 10 gainers in the stock market today, generate an Excel"
Information QueryReal-time weather, exchange rates, etc."What's the weather in Beijing today"
File ProcessingBatch statistics, format conversion"Calculate the size of the Downloads folder"
Data AnalysisWord frequency statistics, formatting, etc."Statistically count word frequency in clipboard text"

Supported Languages

LanguageDescriptionRecommended
JavaScriptBuilt-in modules like axios, cheerio, no installation required⭐⭐⭐ Recommended
PythonRequires Python to be installed on your system⭐⭐
PowerShellWindows built-in

Usage Examples

Example 1: Fetch trending videos and generate a table

User: Help me get trending videos from YouTube and generate an Excel table

AI Execution Flow:
1. Call public API to get trending video data
2. Extract title, uploader, views, etc.
3. Call create_excel to generate the table file
4. Prompt user to save the file

Example 2: Get Stock Market Gainers

User: Get the top 10 gainers in the stock market today, make it a table

AI Execution Flow:
1. Call stock API to get the gainer list data
2. Parse stock code, name, price change, etc.
3. Generate Excel table

Example 3: Weather Query

User: How's the weather in London today?

AI Execution Flow:
1. Call weather API to get real-time weather in London
2. Parse temperature, weather condition, humidity, etc.
3. Reply to the user in natural language

Built-in API Reference

The following APIs require no authentication and can be directly called by AI:

CategoryPurpose
YouTubeTrending videos, video search, user info
WeatherGlobal city weather queries
StockReal-time quotes for stocks
IP GeoGet public IP and location
GitHubRepo info, star count, etc.

Tip

For dynamically loaded websites, AI will automatically choose to use APIs instead of direct scraping to ensure the success rate of data acquisition.

File Upload & Analysis

Mixstart AI Assistant supports multimodal file analysis; you can directly drag and drop or upload files for AI to process.

Supported File Types

TypeExtensionFunction Example
Imagepng, jpg, webp, bmp"What is in this picture?", "Help me convert the table in the image to Excel"
Office Docdocx, xlsx"Summarize this Word document", "Analyze the data trend of this Excel table"
Code/Texttxt, md, json, js, py..."Explain this code", "Help me refactor this Python script"

Usage Limits

  • Image: Max 5MB (automatic Base64 conversion)
  • Other Files: Max 10MB
  • Single Batch: Up to 4 files at a time

Security Mechanism

To ensure system security, multiple protection mechanisms are built into the AI Assistant:

  1. Sensitive Operation Confirmation: When performing high-risk operations like deleting files, closing processes, or shutting down/restarting, AI will force a confirmation card to pop up, which will only take effect after you click "Confirm Execution".
  2. Local Sandbox: All system operations are executed in a safe local environment to protect your privacy.

⚠️ Important Reminder

Although AI Assistant has integrated sensitive functions such as file deletion and system shutdown, we strongly recommend not using AI for such operations.

AI may misinterpret commands, leading to accidental deletion of important files or unexpected operations. For sensitive tasks like deleting files, moving important data, or shutting down/restarting, please operate manually to ensure safety.

AI Assistant is more suitable for: Launching apps, generating documents, querying information, creating automation scripts, and other safe and controllable tasks.

API Configuration

Mixstart AI Assistant supports multiple configuration methods to meet different user needs:

Mode 1: Server Proxy (Ready to use)

If you have purchased the Pro version or are within the trial period, you can use the AI Assistant directly without any configuration. The system will automatically call AI services through the official server proxy.

  • ✅ Zero configuration, ready to use out of the box
  • ✅ No need to provide your own API Key
  • ✅ Free version has 10 trials per day; Pro version has unlimited dialogue

Mode 2: Custom API (Complete Control)

If you want to use your own AI service or private deployment, you can enable "Use Custom API" in Settings → AI Assistant:

  1. Select API provider (supports OpenAI, Alibaba Qwen, DeepSeek, Kimi, Zhipu AI, etc.)
  2. Fill in your API Key
  3. Optional: Customize API address and model

Supported Provider Presets:

ProviderAPI AddressRecommended Model
OpenAIapi.openai.comgpt-4o / gpt-4o-mini
Alibaba Qwendashscope.aliyuncs.comqwen-turbo / qwen-max
DeepSeekapi.deepseek.comdeepseek-chat
Moonshot (Kimi)api.moonshot.cnmoonshot-v1-8k
Zhipu AIopen.bigmodel.cnglm-4 / glm-4-flash

Mode 3: Hybrid Mode (Server + Image Recognition)

📢 Current Status

The AI model configured by default on the Mixstart server does not support image recognition. To use image analysis features, please configure the Vision API yourself according to the tutorial below.

Since the server's default model doesn't support image recognition, you can configure a Vision model separately to work with the server. We recommend using Qwen VL, which is cost-effective and has a free tier.

📝 Qwen API Key Tutorial

  1. Visit Alibaba Cloud Bailian Platform, log in with your Alibaba Cloud account (real-name authentication required).
  2. Select "API-KEY Management" in the left menu.
  3. Click "Create New API-KEY", and copy the generated key starting with sk-.

💰 Cost Description

Activating the service is free, and the Qwen VL model has a free tier. Beyond that, it's pay-as-you-go at a very affordable price.

⚙️ Configure in Mixstart

  1. Open Mixstart, go to Settings → AI Assistant.
  2. Scroll down to the "Image Recognition Config" section.
  3. Turn on the "Enable Image Recognition" switch.
  4. Fill in according to the following configuration:
Config ItemContent
Vision API KeyPaste your sk-xxxxxxxx
Vision API Base URLFill in https://dashscope.aliyuncs.com/compatible-mode/v1
Vision Model(Recommended) qwen-vl-plus
  1. Click "Save Settings".

Done! Now you can send images in the AI Assistant, and the system will automatically use the Qwen VL model for recognition.

Other Recommended Vision Models:

  • GPT-4o / GPT-4o Mini (OpenAI) - requires a VPN if in restricted regions
  • GLM-4V (Zhipu AI)

Note

The Vision configuration is independent; you can configure image recognition separately without enabling "Use Custom API". After configuration, text dialogue continues to use the server proxy for free, and only image recognition uses your configured Vision API.

Usage Tip

Try saying to AI: "Move all images on the desktop to the Pictures folder, then help me create a PPT report, and finally set a reminder to submit the report tomorrow morning." AI can understand and process these complex combined instructions step by step!

UI Optimization & Stability Update

To provide a flawless visual experience, Mixstart has undergone deep optimization for the AI Assistant's window rendering:

  • Flicker-Free Experience: Specifically adapted for the DWM (Desktop Window Manager) on Windows 11, eliminating ghosting and flickering issues during resizing and fade-in/out animations.
  • Refined Rounded Corners: A new window composition strategy ensures smooth rounded corners even on opaque backgrounds, completely removing unwanted black border artifacts.
  • Fluid Animations: Optimized frame rates ensure that launching or hiding the assistant feels natural, lightweight, and responsive.