
Introduction

The AgentSea platform is a collection of libraries and tools for building AI agent apps. We follow the UNIX philosophy of doing one thing and doing it well, so our tools are easy to use, easy to extend, and easy to mix and match. Use them one at a time or stack them together into a single agent app.

You can also use our tools with other popular frameworks like LlamaIndex and LangChain.

Our tools

  • SurfKit is an orchestrator for building and launching agents locally, in a Docker container, or in the cloud. Think of it as Kubernetes for agents.
  • DeviceBay offers pluggable devices ready to be used by AI agents, complete with a UI experience.
  • ToolFuse is a library that wraps scripts, third-party apps, and APIs as Tool implementations for agents.
  • AgentD is a powerful daemon that makes a Linux desktop OS accessible to your bot. It works like a remote desktop app, except the agent takes all the actions.
  • AgentDesk is a library for running AgentD-powered VMs as Tool instances on any cloud.
  • Taskara provides task management for your agentic systems.
  • ThreadMem is a library for building multi-role persistent threads that keep track of all messages and dialogues with your agents.

Build your own agent or use our alpha agents. Our initial batch of agents focuses on multimodal navigation of GUIs. Our prototypes combine old-school computer vision techniques with some new applied AI methods of our own.

Our agents

  • SurfPizza is an agent that explores by slicing up the screen and returning a composite to the multimodal model so it can pick where to go next.
  • SurfSlicer divides the screen into dots that signify regions. The multimodal model picks the dot closest to what it’s looking for, then zooms in and repeats, zeroing in on its target.
  • SurfNinja (coming soon) - A precision-based second-generation AI agent.
  • SurfMonsta (coming soon) - Our best-performing agent. It combines a number of our techniques: SurfSlicer regions, the SAM model for bounding and segmenting a GUI, OCR for text positioning, and a GAN for upscaling smaller image slices to give the multimodal model the best resolution.
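
To make the SurfSlicer idea above more concrete, here is a minimal sketch of a pick-a-dot-and-zoom loop in plain Python. It is not SurfSlicer's actual code: the names `grid_dots`, `zoom_to_target`, and `nearest_dot_picker` are hypothetical, the 3x3 grid and four rounds are arbitrary choices, and the multimodal model is replaced by a stand-in function that already knows where the target is.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def grid_dots(box: Box, n: int) -> List[Tuple[Point, Box]]:
    """Split box into an n x n grid and return each cell's center dot and bounds."""
    left, top, right, bottom = box
    cell_w = (right - left) / n
    cell_h = (bottom - top) / n
    cells = []
    for row in range(n):
        for col in range(n):
            cell = (
                left + col * cell_w,
                top + row * cell_h,
                left + (col + 1) * cell_w,
                top + (row + 1) * cell_h,
            )
            center = ((cell[0] + cell[2]) / 2, (cell[1] + cell[3]) / 2)
            cells.append((center, cell))
    return cells


def zoom_to_target(
    screen: Box,
    pick: Callable[[List[Point]], int],
    n: int = 3,
    rounds: int = 4,
) -> Point:
    """Repeatedly ask `pick` for a dot index, then zoom into that dot's region."""
    box = screen
    for _ in range(rounds):
        cells = grid_dots(box, n)
        choice = pick([center for center, _ in cells])
        box = cells[choice][1]  # the chosen region becomes the next "screen"
    # The center of the final region is the estimated target position.
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def nearest_dot_picker(target: Point) -> Callable[[List[Point]], int]:
    """Hypothetical stand-in for the multimodal model: picks the dot nearest a known target."""
    def pick(dots: List[Point]) -> int:
        return min(
            range(len(dots)),
            key=lambda i: (dots[i][0] - target[0]) ** 2 + (dots[i][1] - target[1]) ** 2,
        )
    return pick


if __name__ == "__main__":
    target = (1500.0, 200.0)
    guess = zoom_to_target((0, 0, 1920, 1080), nearest_dot_picker(target))
    print(guess)
```

Each round shrinks the search region by a factor of `n` in each dimension, so a few rounds are enough to narrow a full screen down to a region about the size of a small button. In a real agent, `pick` would render the dots onto a screenshot and ask the multimodal model which one to choose.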

Demo

The AgentSea platform is currently in open beta release.

The overall framework is solid and getting stronger every day.

All of our agents are alpha releases.

They sometimes work like magic, and other times they struggle badly and make frustrating mistakes. Treat them as proofs of concept to expand and iterate on, and you’ll have a strong base of ideas for building your own agents. We’re improving them every day, and we’ll continue to release new and improved agents as we develop them.

All the tools are free to use and licensed under the MIT License.

Interested in helping? Great! We love contributions. If you’ve got ideas, bug reports, or code contributions, please open an issue or a pull request in the right repo.

Let’s work together and build better agents now.
