Imagine this: It’s Monday morning, and you need to send the same message to 50 customers, enable dark mode on your phone, check your battery level, and download a new app—all while sipping your coffee. Sounds tedious, right? What if you could accomplish all of this by simply telling your phone what to do in plain English, and an AI agent handles the rest?

Welcome to Droidrun, an open-source LLM-agnostic mobile automation framework that’s changing how developers and automation enthusiasts interact with Android devices. Whether you’re testing mobile apps, automating repetitive tasks, or building intelligent workflows, Droidrun empowers you to control your Android device through natural language commands—no complex scripting required.

What Makes Droidrun Special?

At its core, Droidrun is a language model-agnostic framework that bridges the gap between artificial intelligence and mobile device control. Unlike tools locked into a single LLM provider, Droidrun works seamlessly with OpenAI GPT-4, Anthropic Claude, Google Gemini, DeepSeek, Ollama, and more. This flexibility means you’re never vendor-locked and can choose the AI model that works best for your use case and budget.

Key Features at a Glance

  • Natural Language Control: Describe what you want to do, and Droidrun makes it happen
  • LLM Agnostic: Works with any major LLM provider—switch providers without rewriting code
  • Smart Planning: Handles complex, multi-step tasks through intelligent reasoning
  • Visual Awareness: Analyzes screenshots to understand device state and UI elements
  • Dual Access: Easy CLI for quick tasks, powerful Python API for advanced automation
  • Extendable: Build custom workflows and integrate with your existing tools
  • Open Source: MIT-licensed and community-driven

Who Should Use Droidrun?

  • Android Developers: Automate UI testing without writing boilerplate test code
  • QA Engineers: Execute complex testing scenarios with natural language prompts
  • Automation Enthusiasts: Build scripts to handle repetitive daily tasks
  • Data Analysts: Extract information from mobile apps programmatically
  • Productivity Hackers: Automate workflows and save hours each week

Now, let’s get you up and running with Droidrun in minutes.

Installation Guide

Getting Droidrun installed is straightforward. Follow these steps to have a fully functional Android automation setup.

Prerequisites

Before diving into installation, ensure your system meets these requirements:

System Requirements:

  • Python 3.11+ (Python 3.10+ is enough for older Droidrun versions)
  • Android Debug Bridge (ADB) installed and accessible
  • A physical Android device or emulator with:
      • Developer Options enabled
      • USB Debugging activated
      • A connection via USB or on the same network (for wireless debugging)
  • Internet connection for API calls to your chosen LLM provider
  • API key or credentials for at least one supported LLM (or a local Ollama instance)
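If you want to confirm the first two requirements before going further, a short script can do it. This is a minimal sketch, not part of Droidrun: the helper names (python_ok, find_adb, preflight) are made up for illustration, and it only checks the Python version and whether adb is on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version listed in the requirements above


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter is at least the required version."""
    return tuple(version_info[:2]) >= minimum


def find_adb():
    """Return the full path to the adb executable, or None if it is not on PATH."""
    return shutil.which("adb")


def preflight():
    """Collect the result of each check in a dictionary."""
    return {
        "python": python_ok(),
        "adb": find_adb() is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If either line prints MISSING, Step 1 below covers installing both.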

Step 1: Install Python and ADB

On macOS:

# Using Homebrew
brew install python@3.11
brew install android-platform-tools

On Ubuntu/Debian:

sudo apt update
sudo apt install python3.11 python3-pip adb

On Fedora:

sudo dnf install python3.11 python3-pip android-tools

On Windows:

Download and install Python from python.org, then install the Android SDK Platform-Tools (which include ADB) from the Android developer site.

Step 2: Install the uv Package Manager (Recommended)

Droidrun documentation recommends using uv for faster, more reliable installation:

On macOS/Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows (PowerShell):

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

If you prefer using pip (the traditional Python package manager), you can skip this step and use pip install in the following steps.

Step 3: Install Droidrun with Your Preferred LLM Providers

Choose your installation method based on your use case:

For CLI usage only:

uv tool install 'droidrun[google,anthropic,openai,deepseek,ollama,openrouter]'

For CLI + Python API integration:

uv pip install 'droidrun[google,anthropic,openai,deepseek,ollama,openrouter]'

If using pip instead of uv:

pip install 'droidrun[google,anthropic,openai,deepseek,ollama,openrouter]'

Install only specific providers (e.g., OpenAI and Gemini):

uv tool install 'droidrun[google,openai]'

Verify the installation:

droidrun --version

You should see the version number displayed. If not, restart your terminal and try again.

Step 4: Enable Developer Options on Your Android Device

On your Android device:

  1. Open Settings
  2. Scroll to About Phone (or similar, depending on your device)
  3. Tap Build Number seven times rapidly
  4. You should see a toast notification saying “You are now a developer”
  5. Go back to Settings > Developer Options (or System > Developer Options)
  6. Enable USB Debugging
  7. (Optional) Enable Wireless Debugging for network connections

Step 5: Connect Your Android Device via USB

Connect via USB cable:

# Plug in your device via USB cable
# Authorize the connection when prompted on your device

# Verify connection
adb devices

# You should see:
# List of devices attached
# your-device-id    device

Or connect wirelessly:

# Enable "Wireless Debugging" on your device first
# Then in Developer Options, note your device's IP address and port
# (on Android 11+, pair first: adb pair your-device-ip:pairing-port)

adb connect your-device-ip:port

# Verify wireless connection
adb devices
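When you script around ADB, it helps to check the adb devices output programmatically instead of reading it by eye. The sketch below parses that text into (serial, state) pairs and keeps only devices in the "device" state, which means connected and authorized. The function names and the sample serials are made up for illustration.

```python
def parse_adb_devices(output):
    """Parse the text printed by `adb devices` into (serial, state) pairs."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header line, and adb daemon status messages
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def ready_devices(output):
    """Return serials whose state is `device`, meaning authorized and online."""
    return [serial for serial, state in parse_adb_devices(output) if state == "device"]


# Sample output with one usable emulator and one phone awaiting authorization
sample = """List of devices attached
emulator-5554\tdevice
R58M123ABC\tunauthorized
"""
print(ready_devices(sample))  # ['emulator-5554']
```

To use it on a live system, you could pass in the stdout of subprocess.run(["adb", "devices"], capture_output=True, text=True). A device listed as unauthorized usually means the permission prompt on the phone has not been accepted yet.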

Step 6: Set Up the Droidrun Portal App

Droidrun requires the Portal app on your Android device to access accessibility services. This one command handles everything:

droidrun setup

This command automatically:

  • Downloads the latest Portal APK
  • Installs it on your connected device
  • Enables the accessibility service

Manual setup (if automatic setup fails):

If the automatic setup doesn’t work, manually grant accessibility permissions:

  1. Open Settings on your device
  2. Go to Accessibility > Accessibility Services
  3. Find “Droidrun Portal” and enable it
  4. Grant all requested permissions

Step 7: Configure Your LLM Provider

Droidrun uses a configuration file (config.yaml) for LLM settings. On first run, it creates a default configuration.

Set your API key (choose one provider):

For Google Gemini:

export GOOGLE_API_KEY="your-gemini-api-key-here"

For OpenAI:

export OPENAI_API_KEY="your-openai-api-key-here"

For Anthropic Claude:

export ANTHROPIC_API_KEY="your-anthropic-api-key-here"

For DeepSeek:

export DEEPSEEK_API_KEY="your-deepseek-api-key-here"

For local Ollama (no API key needed):

# Make sure Ollama is running locally
ollama serve

Make environment variables permanent (add to ~/.bashrc, ~/.zshrc, or ~/.profile):

# Open your shell config file
nano ~/.bashrc

# Add the line at the end:
export OPENAI_API_KEY="your-openai-api-key-here"

# Save (Ctrl+O, Enter, Ctrl+X)
# Reload
source ~/.bashrc
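To double-check that a key actually reached your environment, you can look it up from Python. This is a minimal sketch: the variable names come from the export commands above, the provider labels mirror the --provider values used in this article's CLI examples (the DeepSeek label is assumed to follow the same pattern), and configured_providers is a hypothetical helper, not a Droidrun function.

```python
import os

# Environment variable names from the export commands above.
# Provider labels are illustrative; "DeepSeek" is an assumed spelling.
PROVIDER_KEYS = {
    "GoogleGenAI": "GOOGLE_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    found = configured_providers()
    print("Configured providers:", ", ".join(found) if found else "none")
```

If it prints "none" right after you edited your shell config, the file probably has not been reloaded in the current terminal.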

Troubleshooting Installation Issues

Problem: “droidrun: command not found”

Solution:

  • If using uv tool install, the executable lives in uv's tool bin directory, which may not be on your PATH. Run uv tool update-shell, then restart your terminal.
  • If using pip install, ensure your Python bin directory is in PATH: export PATH="$HOME/.local/bin:$PATH"

Problem: “ADB device not found”

Solution:

  • Ensure USB Debugging is enabled on your device
  • Reconnect the USB cable
  • Restart ADB: adb kill-server && adb start-server
  • Check device authorization: Look for a permission prompt on your device and tap “Allow”

Problem: “Portal app installation failed”

Solution:

  • Manually install via ADB: Download the Portal APK and run adb install droidrun-portal.apk
  • Ensure developer options and USB debugging are enabled
  • Free up device storage space if installation fails due to insufficient space

Problem: “API key error” or “Failed to authenticate”

Solution:

  • Double-check your API key is correct and properly exported
  • Ensure the API key has the necessary permissions in your provider’s console
  • Verify your internet connection
  • Check that your API account has active credits/quota

Usage Instructions

With Droidrun installed, let’s explore how to automate your Android device using natural language.

Understanding How Droidrun Works

When you issue a command to Droidrun, here’s what happens behind the scenes:

  1. You provide a goal in natural language (e.g., “Send a WhatsApp message to John”)
  2. Droidrun captures a screenshot of your device
  3. The LLM agent analyzes the screenshot and understands the current UI state
  4. The agent plans the steps needed to accomplish your goal
  5. Droidrun executes the steps using Android accessibility services
  6. The process repeats until the goal is achieved or the step limit is reached

This architecture means Droidrun can handle complex, unpredictable UIs and adapt to different devices and app versions.
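The six steps above boil down to an observe, plan, act loop with a step limit. Here is a toy version you can run without a phone. It is a conceptual sketch only and not Droidrun's actual code: FakeDevice stands in for the phone, plan_next_action is a hard-coded stand-in for the LLM, and the screen names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDevice:
    """Stand-in for a phone: it only tracks which screen is showing."""
    screen: str = "home"
    history: list = field(default_factory=list)

    def capture(self):
        # Step 2: a real agent would take a screenshot here
        return self.screen

    def perform(self, action):
        # Step 5: a real agent would tap or type through accessibility services
        self.history.append(action)
        self.screen = action.split(":", 1)[1]


def plan_next_action(goal_screen, current_screen):
    """Steps 3 and 4: a hard-coded planner standing in for the LLM."""
    route = {"home": "settings", "settings": "display", "display": "dark_mode"}
    if current_screen == goal_screen:
        return None  # goal reached, nothing left to do
    # Unknown screens fall back to the home screen
    return "open:" + route.get(current_screen, "home")


def run_agent(goal_screen, device, max_steps=15):
    """Step 6: repeat observe, plan, act until done or out of steps."""
    for step in range(1, max_steps + 1):
        action = plan_next_action(goal_screen, device.capture())
        if action is None:
            return {"success": True, "steps": step - 1}
        device.perform(action)
    return {"success": device.capture() == goal_screen, "steps": max_steps}


print(run_agent("dark_mode", FakeDevice()))  # {'success': True, 'steps': 3}
```

Calling run_agent("dark_mode", FakeDevice(), max_steps=2) returns a failure instead, which is the same situation you hit when a real task needs more steps than the limit allows.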

Basic CLI Usage

The simplest way to use Droidrun is through the command line:

droidrun "Your command here"

Or with the explicit run subcommand:

droidrun run "Your command here"

Example 1: Simple App Control

Open an app and check information:

droidrun "Open Settings and tell me the Android version"

Response: Droidrun will open Settings, navigate to About Phone, and report the Android version back to you.

Example 2: Sending Messages

Automate messaging across apps:

droidrun "Send WhatsApp to John: I'll be there in 10 minutes"

What happens:

  1. Opens WhatsApp
  2. Finds contact “John”
  3. Types your message
  4. Sends it

No manual tapping required—Droidrun handles it all.

Example 3: Data Collection

Extract information from an app:

droidrun "Go to Amazon, search for headphones, and tell me the top 3 results with prices"

Output: Droidrun returns:

  • Product names
  • Prices
  • Link information
  • Ratings

Perfect for price comparisons, market research, or competitive analysis.

Example 4: Complex Multi-Step Workflows

For intricate tasks, enable reasoning mode:

droidrun "Find a contact named Sarah, call her, and if she doesn't answer in 30 seconds, send her a text saying I'll call back later" --reasoning

The --reasoning flag enables planning mode, where Droidrun breaks down complex goals into manageable steps before execution.

Example 5: Visual Analysis

Analyze what’s currently on screen:

droidrun "What's currently displayed on my phone screen? Describe all the buttons and text." --vision

Enable --vision to have the LLM analyze and describe the visual content of your device screen.

Advanced CLI Options

Here are powerful flags to customize Droidrun’s behavior:

Flag              | Purpose                           | Example
--provider        | Choose LLM provider               | --provider OpenAI
--model           | Specify model                     | --model gpt-4o-mini
--device          | Target specific device            | --device emulator-5554
--vision          | Enable screenshot analysis        | --vision
--reasoning       | Enable planning mode              | --reasoning
--steps           | Max execution steps (default: 15) | --steps 20
--debug           | Enable detailed logging           | --debug
--save-trajectory | Save execution history            | --save-trajectory step
--config          | Use custom config file            | --config custom.yaml

Example with multiple options:

droidrun "Install and open Instagram" --provider OpenAI --model gpt-4o --vision --steps 25 --device emulator-5554
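If you launch Droidrun from your own scripts, building the argument list in one place keeps long commands like the one above readable. The sketch below is one way to do that using only the flags listed in the table. build_droidrun_command is a hypothetical helper, not part of Droidrun, and it assumes your installed version accepts these flags as shown in this article.

```python
import shlex


def build_droidrun_command(goal, provider=None, model=None, device=None,
                           steps=None, vision=False, reasoning=False, debug=False):
    """Assemble an argument list for the droidrun CLI from keyword options."""
    cmd = ["droidrun", "run", goal]
    # Options that take a value
    for flag, value in (("--provider", provider), ("--model", model),
                        ("--device", device), ("--steps", steps)):
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean switches
    for flag, enabled in (("--vision", vision), ("--reasoning", reasoning),
                          ("--debug", debug)):
        if enabled:
            cmd.append(flag)
    return cmd


cmd = build_droidrun_command("Install and open Instagram", provider="OpenAI",
                             model="gpt-4o", device="emulator-5554",
                             steps=25, vision=True)
# Print a shell-safe version of the command for inspection
print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd). Keeping it as a list rather than a single string avoids quoting problems when the goal contains apostrophes or quotation marks.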

Using Different LLM Providers

Droidrun’s true power lies in its flexibility to work with any LLM provider.

Using OpenAI GPT-4o:

droidrun "Check battery level" --provider OpenAI --model gpt-4o

Using Anthropic Claude:

droidrun "Find and open Chrome" --provider Anthropic --model claude-3-sonnet-20240229

Using Google Gemini:

droidrun "Open the calculator app" --provider GoogleGenAI --model gemini-2.5-pro

Using Local Ollama (Free!):

droidrun "Enable dark mode" --provider Ollama --model llama2

Python API for Advanced Automation

For developers needing programmatic control, Droidrun provides a Python API:

import asyncio
from droidrun import DroidAgent, DroidrunConfig

async def main():
    # Create configuration
    config = DroidrunConfig()

    # Create an agent
    agent = DroidAgent(
        goal="Open Settings and enable dark mode",
        llm_provider="openai",
        llm_model="gpt-4o-mini",
        vision=True
    )

    # Execute the task
    result = await agent.run()

    # Check results
    if result['success']:
        print("✅ Task completed successfully!")
        if result.get('output'):
            print(f"Output: {result['output']}")
    else:
        print(f"❌ Task failed: {result.get('reason', 'Unknown error')}")

if __name__ == "__main__":
    asyncio.run(main())

Save this as automation_script.py and run:

python automation_script.py

Batch Processing Multiple Commands

Create a workflow file with multiple tasks:

# Create a file named tasks.txt
cat > tasks.txt << EOF
Open Settings and check Android version
Enable dark mode in Settings
Check battery level and storage
EOF

# Execute all tasks sequentially
droidrun --batch tasks.txt
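If you would rather drive the same task list from Python, for example to record which tasks failed, a loop around the CLI is one option. This is a minimal sketch: load_tasks, run_droidrun, and run_batch are hypothetical helper names, and it assumes the droidrun command exits with a non-zero status when a task fails, which is worth confirming on your version.

```python
import subprocess


def load_tasks(text):
    """Return non-empty, non-comment lines from the contents of a task file."""
    tasks = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tasks.append(line)
    return tasks


def run_droidrun(task):
    """Run one task through the CLI and return True if it exited cleanly."""
    completed = subprocess.run(["droidrun", "run", task])
    return completed.returncode == 0


def run_batch(tasks, runner=run_droidrun):
    """Run tasks in order and return a list of (task, succeeded) pairs."""
    return [(task, runner(task)) for task in tasks]


def main(path="tasks.txt"):
    """Read the task file and print one status line per task."""
    with open(path, encoding="utf-8") as handle:
        tasks = load_tasks(handle.read())
    for task, ok in run_batch(tasks):
        print(f"{'OK  ' if ok else 'FAIL'} {task}")
```

Call main() at the bottom of the script to run every line in tasks.txt. The runner parameter exists so you can swap in a fake function and test the loop without a device attached.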

Best Practices for Droidrun Automation

1. Be Specific with Commands

  • Bad: “Do something with settings”
  • Good: “Open Settings, go to Display, and enable dark mode”

2. Use Reasoning Mode for Complex Tasks

droidrun "Search for flights from New York to London in the next week" --reasoning

3. Enable Vision for UI Analysis

droidrun "What apps are visible on the home screen?" --vision

4. Handle Errors Gracefully

droidrun "Try to open Instagram, if it fails, tell me why" --debug

5. Choose the Right LLM Provider

  • Speed + Cost: Use DeepSeek or Claude 3.5 Sonnet
  • Advanced Reasoning: Use GPT-4o or Gemini 2.5 Pro
  • Free Local Option: Use Ollama with Llama 2 or Mistral

6. Monitor API Usage

Track your API consumption to avoid unexpected bills. Use --debug to see token usage.

Troubleshooting Common Issues

Issue: “Device timeout” during execution

Solution:

  • Increase max steps: --steps 25 (default is 15)
  • Ensure device isn’t locked
  • Check network connectivity for API calls

Issue: “Portal app is not responding”

Solution:

  • Force-stop the Portal app: adb shell am force-stop ai.droidrun.portal
  • Reinstall: droidrun setup --force
  • Reboot device: adb reboot

Issue: “LLM rate limited”

Solution:

  • Switch to a different provider or model
  • Add delays between commands using a custom Python script
  • Use local Ollama for unlimited free requests
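For the "add delays" option, exponential backoff is a common pattern: wait a little after the first failure and double the wait each time. The sketch below is generic and not tied to any Droidrun API. run_with_backoff is a hypothetical helper, and the action you pass in is whatever runs your command and returns True on success.

```python
import time


def run_with_backoff(action, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call action() until it returns True, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        if action():
            return True
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s with the defaults
    return False


# Demo with a fake action that succeeds on its third call.
# The waits are recorded in a list instead of actually sleeping.
attempts = []


def flaky():
    attempts.append(1)
    return len(attempts) >= 3


waits = []
print(run_with_backoff(flaky, sleep=waits.append), waits)  # True [2.0, 4.0]
```

In a real script, action could be a small function that launches droidrun through subprocess and checks the result. Leave the sleep argument at its default so the delays actually happen.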

Issue: “Incorrect action taken”

Solution:

  • Enable --debug to see step-by-step execution
  • Use --reasoning for better planning on complex tasks
  • Try a different LLM provider (Gemini 2.5 Pro performs best currently)

Conclusion

Droidrun represents a paradigm shift in mobile automation. By combining the power of large language models with Android accessibility services, it eliminates the need for complex scripting and repetitive manual tasks. Whether you’re a developer automating UI tests, a power user saving hours on daily tasks, or a data analyst extracting mobile app data, Droidrun brings the future of AI-driven automation to your pocket.

Key Takeaways

  • Install in minutes – Simple setup with just a few commands
  • LLM agnostic – Choose your favorite AI provider; switch anytime
  • Natural language control – Describe what you want; let AI handle the how
  • Powerful yet accessible – CLI for simple tasks, Python API for advanced workflows
  • Open source – Community-driven, MIT-licensed, and constantly improving

Your Next Steps

  1. Install Droidrun using the guide above
  2. Try a simple command like droidrun "Open Settings"
  3. Explore examples from the official documentation
  4. Build your first automation script using the Python API
  5. Share your creations with the Droidrun community

Join the Community

Droidrun thrives on contributions from developers like you. Here’s how you can get involved:

  • Star the repo on GitHub
  • Report bugs and suggest features via Issues
  • Contribute code with Pull Requests
  • Share your projects and automation ideas
  • Participate in discussions and help other users

The automation revolution is here. Will you lead it?
