A complete autonomous desktop agent: real desktop control, multi-provider AI, containerized safety, and a modern web interface, all open source.
ByteBot controls a real Ubuntu desktop running inside Docker. It moves the mouse, clicks buttons, types text, scrolls pages, and interacts with any application, just as a human user would. No application APIs, no browser automation scripts, no application-specific integrations.
ByteBot supports multiple AI providers through a unified abstraction layer. Switch between Anthropic Claude, OpenAI GPT-4, Google Gemini, or local Ollama models without changing your workflow. Configure your preferred provider in the .env file and select it through the UI.
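As a rough illustration of what a unified provider layer can look like, the TypeScript sketch below routes every request through one interface and picks the implementation from .env-style text. The `AIProvider` interface, the `AI_PROVIDER` key, and the stub providers are hypothetical names chosen for this example; they are not taken from ByteBot's code or configuration.

```typescript
// Hypothetical sketch: none of these names come from ByteBot's codebase.
type ProviderName = "anthropic" | "openai" | "gemini" | "ollama";

interface AIProvider {
  name: ProviderName;
  // Takes a prompt and resolves to the model's reply (stubbed, no network calls).
  complete(prompt: string): Promise<string>;
}

const PROVIDER_NAMES: readonly ProviderName[] = ["anthropic", "openai", "gemini", "ollama"];

function isProviderName(value: string): value is ProviderName {
  return (PROVIDER_NAMES as readonly string[]).indexOf(value) !== -1;
}

// Stub factory standing in for a real SDK-backed provider.
function makeStubProvider(name: ProviderName): AIProvider {
  return {
    name,
    complete: async (prompt: string) => "[" + name + "] " + prompt,
  };
}

// Parse KEY=VALUE lines, skipping blank lines and # comments.
function parseEnv(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// AI_PROVIDER is an assumed key name used only for this illustration.
function selectProvider(envText: string): AIProvider {
  const value = parseEnv(envText)["AI_PROVIDER"] ?? "";
  const normalized = value.toLowerCase();
  if (!isProviderName(normalized)) {
    throw new Error('Unknown AI provider: "' + value + '"');
  }
  return makeStubProvider(normalized);
}
```

Because callers only see `AIProvider`, swapping providers in a design like this means changing one configuration value rather than any calling code.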
ByteBot runs inside a Docker container with a full Ubuntu desktop. The AI agent operates within this sandbox and cannot access your host machine or its files, network, and other applications unless you explicitly configure volume mounts or network bridges.
Access the containerized desktop through a browser-based VNC viewer powered by noVNC. Watch every click, keystroke, and screen transition as the AI executes tasks. You can also take manual control at any time.
Every task is recorded with complete conversation history, screenshots at each step, action logs, and an execution timeline. Review past tasks, debug failures, and improve your prompts over time.
Built with modern technologies: NestJS backend for AI orchestration, Next.js frontend with real-time WebSocket updates, PostgreSQL for persistence, and a clean TypeScript monorepo structure. Everything is containerized with Docker Compose for one-command deployment.
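To make the shape of such a record concrete, here is a minimal TypeScript sketch of one task with its conversation, per-step screenshots, and timing. All field names are assumptions made for illustration, and ByteBot's actual schema may differ. The two helpers show the kind of questions a record like this can answer when you debug a failed run.

```typescript
// Hypothetical shapes for illustration; ByteBot's real schema may differ.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface StepRecord {
  index: number;
  action: string;         // e.g. "click_mouse"
  screenshotPath: string; // where this step's screenshot is stored
  startedAtMs: number;
  finishedAtMs: number;
  error?: string;         // set only when the step failed
}

interface TaskRecord {
  id: string;
  prompt: string;
  conversation: Message[];
  steps: StepRecord[];
}

// Returns the first step that recorded an error, if any.
function firstFailure(task: TaskRecord): StepRecord | undefined {
  for (const step of task.steps) {
    if (step.error !== undefined) return step;
  }
  return undefined;
}

// Sums the time spent inside steps (gaps between steps are not counted).
function totalStepDurationMs(task: TaskRecord): number {
  let total = 0;
  for (const step of task.steps) {
    total += step.finishedAtMs - step.startedAtMs;
  }
  return total;
}
```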
ACTIONS
screenshot    Capture current desktop state
click_mouse   Click at coordinates (x, y)
type_text     Type a string of characters
key           Press a key combination
scroll        Scroll up or down
open_url      Navigate browser to URL
USAGE
$ bytebot action screenshot
$ bytebot action click_mouse --x 450 --y 320
$ bytebot action type_text --text "hello world"
$ bytebot action key --combo "ctrl+s"
$ bytebot action scroll --direction down --amount 3
$ bytebot action open_url --url "https://example.com"
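The six actions above map naturally onto a typed union. The sketch below is a hypothetical TypeScript helper, not part of ByteBot, that models each action and renders it as a command line. It uses only the flags that appear in the usage lines above; whether the CLI accepts any others is not assumed. The quoting is deliberately minimal and is meant for display, not for passing untrusted input to a shell.

```typescript
// Hypothetical helper: the Action type and toCommand are illustrative only.
type Action =
  | { type: "screenshot" }
  | { type: "click_mouse"; x: number; y: number }
  | { type: "type_text"; text: string }
  | { type: "key"; combo: string }
  | { type: "scroll"; direction: "up" | "down"; amount: number }
  | { type: "open_url"; url: string };

// Wrap a value in double quotes, escaping characters that stay special inside them.
function quote(value: string): string {
  return '"' + value.replace(/(["\\$`])/g, "\\$1") + '"';
}

// Render an action using only the flags shown in the usage examples.
function toCommand(action: Action): string {
  const base = "bytebot action " + action.type;
  switch (action.type) {
    case "screenshot":
      return base;
    case "click_mouse":
      return base + " --x " + action.x + " --y " + action.y;
    case "type_text":
      return base + " --text " + quote(action.text);
    case "key":
      return base + " --combo " + quote(action.combo);
    case "scroll":
      return base + " --direction " + action.direction + " --amount " + action.amount;
    case "open_url":
      return base + " --url " + quote(action.url);
  }
}
```

With a union like this, the compiler rejects a `click_mouse` without coordinates or a `scroll` with an unknown direction before anything reaches the desktop.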
TASK MODE
$ bytebot task "Find all PDF invoices and organize them by date"
$ bytebot task "Open Firefox and search for AI news"
$ bytebot task "Fill out the expense report form"
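A task like the ones above is typically carried out as a loop: capture the screen, ask the model for the next action, perform it, and repeat until the model reports that it is finished. The TypeScript sketch below shows that loop under stated assumptions. The `Desktop` and `Planner` interfaces, the synchronous calls, and the step limit are all hypothetical simplifications, not ByteBot's actual orchestration code.

```typescript
// Hypothetical observe-decide-act loop; interfaces are simplified for illustration.
interface Step {
  action: string; // name of the next action, e.g. "click_mouse"
  done: boolean;  // true when the planner considers the task finished
}

interface Desktop {
  screenshot(): string;          // returns an opaque handle to the captured screen
  perform(action: string): void; // executes one action on the desktop
}

interface Planner {
  // Decides the next step from the task text, the latest screenshot, and past actions.
  next(task: string, screenshot: string, history: string[]): Step;
}

// Runs the loop and returns the list of actions that were performed.
function runTask(task: string, desktop: Desktop, planner: Planner, maxSteps: number = 20): string[] {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const shot = desktop.screenshot();
    const step = planner.next(task, shot, history);
    if (step.done) return history;
    desktop.perform(step.action);
    history.push(step.action);
  }
  // Guard against a planner that never finishes.
  throw new Error("Step limit reached before the task finished");
}
```

The step limit is a design choice for this sketch: it keeps a confused planner from clicking indefinitely, and the returned history is the raw material for an action log.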
Watch ByteBot execute a complete task in our interactive demo, or deploy it yourself in minutes.