automatika_embodied_agents


README

EmbodiedAgents Logo.

🇨🇳 简体中文 | 🇯🇵 日本語

EmbodiedAgents is a production-grade framework, built on top of ROS2, designed to deploy Physical AI on real-world robots. It enables you to create interactive, physical agents that do not just chat, but understand, move, manipulate, and adapt to their environment.

  • Production-Ready Physical Agents: Designed for autonomous robot systems that operate in real-world, dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI, providing an orchestration layer for Adaptive Intelligence.

  • Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop, or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.

  • Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.

  • Pure Python, Native ROS2: Define complex asynchronous graphs in standard Python without touching XML launch files. Yet, underneath, it is pure ROS2, compatible with the entire ecosystem of hardware drivers, simulation tools, and visualization suites.

Join our Discord 👾

Check out the Installation Instructions 🛠️

Get started with the Quickstart Guide 🚀

Get familiar with Basic Concepts 📚

Dive right in with Example Recipes ✨

Installation 🛠️

Install a model serving platform

The core of EmbodiedAgents is agnostic to model serving platforms. It supports Ollama, RoboML, and any platform or cloud provider with an OpenAI-compatible API (e.g. vLLM, lmdeploy etc.). For VLA models, EmbodiedAgents supports policies served on the Async Inference server from LeRobot. Please install any of these by following the instructions provided by the respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
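Once your chosen platform is running, it can help to verify that it is reachable before launching any agent components. Below is a minimal sketch using httpx (already among the Python dependencies), assuming Ollama on its default local port; adjust the URL for other platforms or remote hosts.

# Optional sanity check: confirm the model serving platform is reachable.
# Assumes Ollama running locally on its default port (11434).
import httpx

response = httpx.get("http://localhost:11434")
response.raise_for_status()
print(response.text)  # Ollama responds with "Ollama is running"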

Install EmbodiedAgents (Ubuntu)

For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:

sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents

Alternatively, grab your favorite deb package from the release page and install it as follows:

sudo dpkg -i ros-$ROS_DISTRO-automatika-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb

If the attrs version from your package manager is < 23.2, install it using pip as follows:

pip install 'attrs>=23.2.0'

Install EmbodiedAgents from source

Get Dependencies

Install Python dependencies

pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets

Download Sugarcoat 🍬

git clone https://github.com/automatika-robotics/sugarcoat

Install EmbodiedAgents

git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
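After sourcing the workspace, a quick way to confirm the build is to import the package from Python before writing a recipe. This is only a sketch; the imports are the same ones used in the quickstart below.

# Sanity check: if these imports succeed, EmbodiedAgents is on your Python path
from agents.components import VLM
from agents.clients.ollama import OllamaClient
from agents.ros import Topic, Launcher
print("EmbodiedAgents imported successfully")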

Quick Start 🚀

Unlike other ROS packages, EmbodiedAgents provides a purely pythonic way of describing the node graph using Sugarcoat 🍬. Copy the following recipe into a Python script and run it.

from agents.clients.ollama import OllamaClient
from agents.components import VLM
from agents.models import OllamaModel
from agents.ros import Topic, Launcher

# Define input and output topics (pay attention to msg_type)
text0 = Topic(name="text0", msg_type="String")
image0 = Topic(name="image_raw", msg_type="Image")
text1 = Topic(name="text1", msg_type="String")

# Define a model client (working with Ollama in this case)
# OllamaModel is a generic wrapper for all Ollama models
llava = OllamaModel(name="llava", checkpoint="llava:latest")
llava_client = OllamaClient(llava)

# Define a VLM component (A component represents a node with a particular functionality)
mllm = VLM(
    inputs=[text0, image0],
    outputs=[text1],
    model_client=llava_client,
    trigger=text0,
    component_name="vqa"
)
# Additional prompt settings
mllm.set_topic_prompt(text0, template="""You are an amazing and funny robot.
    Answer the following about this image: {{ text0 }}"""
)
# Launch the component
launcher = Launcher()
launcher.add_pkg(components=[mllm])
launcher.bringup()

And just like that we have an agent that can answer questions like 'What do you see?'. Check out the Quick Start Guide to learn more about how components and models work together.
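To interact with the running agent from the rest of your ROS2 system, you publish and subscribe to its topics like any other node. The snippet below is a minimal sketch using rclpy, assuming the recipe above is running and a camera driver is publishing on image_raw; the node name ask_once is purely illustrative.

# Minimal sketch: ask the agent a question and print its answer.
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class AskOnce(Node):
    def __init__(self):
        super().__init__("ask_once")
        # Answers from the VLM component arrive on the text1 topic
        self.create_subscription(String, "text1", self.on_answer, 10)
        self.question_pub = self.create_publisher(String, "text0", 10)
        # Publish the question after a short delay so connections are established
        self.timer = self.create_timer(1.0, self.ask)
        self.asked = False

    def ask(self):
        if not self.asked:
            self.question_pub.publish(String(data="What do you see?"))
            self.asked = True

    def on_answer(self, msg):
        self.get_logger().info(f"Agent: {msg.data}")

rclpy.init()
rclpy.spin(AskOnce())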

Complex Physical Agents

The quickstart example above is just an amuse-bouche of what is possible with EmbodiedAgents. EmbodiedAgents lets you create arbitrarily sophisticated component graphs, and an agent can be configured to change or reconfigure itself based on events internal or external to the system. Check out the code for the following agent here.

Elaborate Agent
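As a smaller-scale illustration of such a graph, the sketch below chains two components so that the output of one triggers the other: a VLM describes the camera image and an LLM reasons over that description. It reuses the API from the quickstart; the LLM component and the llama3.2 checkpoint are assumptions here, so check the documentation and your model platform for what is actually available.

from agents.clients.ollama import OllamaClient
from agents.components import LLM, VLM
from agents.models import OllamaModel
from agents.ros import Topic, Launcher

# Topics connecting the two components
question = Topic(name="question", msg_type="String")
image0 = Topic(name="image_raw", msg_type="Image")
scene = Topic(name="scene", msg_type="String")
answer = Topic(name="answer", msg_type="String")

# Separate model clients for the vision-language and language models
vlm_client = OllamaClient(OllamaModel(name="llava", checkpoint="llava:latest"))
llm_client = OllamaClient(OllamaModel(name="llama", checkpoint="llama3.2:3b"))

# First component: describe what the camera sees whenever a question arrives
describer = VLM(
    inputs=[question, image0],
    outputs=[scene],
    model_client=vlm_client,
    trigger=question,
    component_name="describer",
)
describer.set_topic_prompt(question, template="Describe what is visible that relates to: {{ question }}")

# Second component: triggered by the describer's output, it reasons over the description
reasoner = LLM(
    inputs=[scene],
    outputs=[answer],
    model_client=llm_client,
    trigger=scene,
    component_name="reasoner",
)
reasoner.set_topic_prompt(scene, template="Given the scene: {{ scene }}, suggest what the robot should do next.")

# Launch both components together
launcher = Launcher()
launcher.add_pkg(components=[describer, reasoner])
launcher.bringup()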

Dynamic Web UI for EmbodiedAgent Recipes

Leveraging the power of the underlying Sugarcoat framework, EmbodiedAgents offers a fully dynamic, auto-generated Web UI for every recipe. This feature is built with FastHTML and eliminates manual GUI development, instantly providing a responsive interface for control and visualization.

The UI automatically creates:

  • Settings interfaces for all the components used in the recipe.

  • Real-time data visualizations and controls for component inputs/outputs.

  • WebSocket-based data streaming for all supported message types.

Example: VLM Agent UI

A full interface is automatically generated for a VLM Q&A agent (similar to the Quick Start example), providing simple controls for settings and displaying real-time text input/output.

EmbodiedAgents UI Example GIF

Contributions

EmbodiedAgents has been developed in collaboration between Automatika Robotics and Inria. Contributions from the community are most welcome.