Introduction
AgentDesk provides full-featured desktop environments that can be programmatically controlled by AI agents.
Features
- Built on AgentD – a runtime daemon which exposes a REST API for interacting with the desktop.
- Implements the DeviceBay Protocol.
- Provides a CLI and a Python library.
- The Desktops can be run locally or in the cloud.
Motivation
Why do we want this? Simple. APIs are not always available, and they can be very expensive to use. Agents that can use GUIs with ease have a large advantage when operating mobile phones, desktops and SaaS applications: they can work with these interfaces just as a human would.
GUI navigation makes any program accessible and programmable to an agent, which opens up a lot of potential to gather information, automate complex, open-ended tasks, and control your desktop. Almost all the work in this area is currently focused on helping agents work in browsers, but many apps aren’t available on the web.
That’s why we created AgentDesk. It allows you to run VMs locally and in the cloud, and to control them using a Python SDK and CLI. This gives you a solid foundation for building advanced GUI-controlling agents.
Check out an example of a complex GUI-based agent here. Read on to learn how to use AgentDesk.
Installation
pip install agentdesk
If you run VMs locally, you need Docker to run the containers that host the desktop GUI.
You also need QEMU if you create QEMU desktops instead of Docker desktops.
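Before creating a local desktop, it can help to confirm that these prerequisites are on your PATH. The following is a minimal sketch using only the Python standard library, not part of AgentDesk itself. The helper names `missing_tools` and `check_prerequisites` are hypothetical, and the executable names `docker` and `qemu-system-x86_64` are assumptions about a typical installation; your system may use a different QEMU binary.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def check_prerequisites(use_qemu=False):
    """Return the missing prerequisites for running desktops locally.

    Docker is always checked. QEMU is checked only when `use_qemu` is True.
    The binary names below are assumptions, not taken from AgentDesk.
    """
    required = ["docker"]
    if use_qemu:
        required.append("qemu-system-x86_64")  # assumed binary name
    return missing_tools(required)


if __name__ == "__main__":
    missing = check_prerequisites(use_qemu=True)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

This only checks that the executables can be found. It does not verify that the Docker daemon is running or that QEMU is configured correctly.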