Package: AgentD
Introduction
AgentD is a daemon that makes a desktop OS accessible to AI agents. By exposing an HTTP API, AgentD lets AI-driven applications and scripts interact with a desktop environment.
Features
- Mouse and Keyboard Control: Simulate mouse movements, clicks, and keyboard inputs.
- Web Browser Control: Open URLs and interact with web content through a Chromium-based browser.
- Screen Capture: Take screenshots of your desktop for analysis or record-keeping.
- Session Recording: Record and replay desktop sessions to capture workflows or to aid debugging.
Getting Started
To get started with AgentD, follow these steps:
- Installation:
  - For a quick start, we recommend using one of our pre-configured VMs, which come with AgentD pre-installed. This is the easiest way to get up and running without worrying about dependencies or configuration.
  - Alternatively, if you prefer to install AgentD on your own Ubuntu VM, you can use our remote installation script. This is suitable for users who want more control over the installation process or need to integrate AgentD into an existing setup.
  - See details in the Installation section.
- Usage:
  - Once AgentD is installed and the VM is launched, you can start interacting with its desktop through the HTTP API. The API allows you to control the mouse and keyboard, manage web browser sessions, capture screenshots, and record sessions.
  - To check if AgentD is running correctly, you can send a request to the /health endpoint. A successful response indicates that AgentD is ready to accept commands.
- API Endpoints: AgentD provides a set of API endpoints to interact with the desktop. Here are some of the key functionalities:
  - Mouse and Keyboard Control: /move_mouse, /click, /type_text, etc.
  - Web Browser Control: /open_url
  - Screen Capture: /screenshot
  - Session Recording: /recordings, /recordings/{session_id}/stop, etc.
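
The health check described under Usage might look roughly like the sketch below. It uses only the Python standard library. The function name `agentd_is_healthy`, the use of a plain GET request, and the idea that any 2xx status counts as success are assumptions made for illustration; the text above only says that a successful response from /health means AgentD is ready.

```python
import urllib.error
import urllib.request


def agentd_is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to {base_url}/health answers with a 2xx status.

    Assumption: /health accepts GET and signals readiness with a 2xx code.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and
        # non-2xx responses (urllib raises HTTPError for 4xx/5xx).
        return False
```

For example, `agentd_is_healthy("http://localhost:8000")` would return `True` once the daemon answers, where the host and port are placeholders for wherever your VM exposes AgentD.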
For more detailed information on how to use AgentD and its API, please refer to the full API documentation and examples provided in our GitHub repository.
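
As a rough illustration of how the endpoints listed above could be called from a script, here is a minimal sketch using only the Python standard library. Only the endpoint paths come from this document. The HTTP method (POST), the JSON body format, the payload field names (`x`, `y`, `text`, `url`), the base URL, and the helper names `build_request`, `call`, and `demo` are all assumptions; check the API documentation for the actual request shapes.

```python
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, endpoint: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an AgentD endpoint.

    Assumption: endpoints accept POST with a JSON body.
    """
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(base_url: str, endpoint: str, payload: Optional[dict] = None,
         timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body."""
    request = build_request(base_url, endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def demo(base_url: str = "http://localhost:8000") -> None:
    """Hypothetical session; payload field names are illustrative guesses."""
    call(base_url, "/move_mouse", {"x": 100, "y": 200})
    call(base_url, "/click")
    call(base_url, "/type_text", {"text": "hello"})
    call(base_url, "/open_url", {"url": "https://example.com"})
    screenshot = call(base_url, "/screenshot")
    print(f"screenshot response: {len(screenshot)} bytes")
```

Separating `build_request` from `call` keeps the request construction inspectable without a running daemon, which is convenient when adapting the payloads to the documented schema.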