Embodied AI
We lead cutting-edge research to develop the next generation of intelligent robots, safely trained in advanced simulation environments to automate routine tasks and improve daily life. All of our discoveries are open-sourced, enabling the community to build on our progress and collaboratively shape the future of embodied AI.
Advancing vision-language models for robotics
Our vision-language models (VLMs) are built to go beyond understanding text – they are designed to perceive and interact with the physical world. Trained on tasks grounded in robotics, these models emphasize spatial reasoning and the ability to interpret and point to objects in their environment. Our data collection is purposeful and focused on real-world utility. The result is Molmo, a family of open, state-of-the-art multimodal models that match or exceed the performance of proprietary systems. On both academic benchmarks and human evaluations, Molmo consistently outperforms models up to 10× its size. By releasing these models openly, we enable the research community to innovate, iterate, and build on transparent, adaptable foundations.
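To illustrate the pointing capability, here is a minimal sketch of querying one of the openly released Molmo checkpoints through Hugging Face Transformers; the image file and prompt are illustrative assumptions, and the loading pattern follows the public model cards.

```python
# Minimal sketch: asking a Molmo checkpoint to point at an object in an image.
# The image path and prompt are illustrative; the loading pattern follows
# the public Molmo model cards.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

MODEL_ID = "allenai/Molmo-7B-D-0924"  # one of the openly released checkpoints

processor = AutoProcessor.from_pretrained(
    MODEL_ID, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Build a multimodal prompt: one image plus a pointing request.
inputs = processor.process(
    images=[Image.open("kitchen.jpg")],
    text="Point to the mug on the counter.",
)
# Move tensors to the model's device and add a batch dimension.
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
# Decode only the newly generated tokens; pointing answers include
# image coordinates for the requested object.
generated = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(generated, skip_special_tokens=True))
```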
Training at massive scale in simulations
We train policies for embodied AI at massive scale in simulation, leveraging procedural generation and a library of more than 10 million 3D assets to create a stunning diversity of virtual environments. Award-winning tools like ProcTHOR, Objaverse, Objaverse-XL, and Holodeck enable far greater diversity of training environments. This scale and visual diversity enable us to train generalizable policies for zero-shot real transfer.
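For a concrete sense of this workflow, the sketch below loads procedurally generated ProcTHOR houses and steps an agent through one of them in AI2-THOR; the dataset name and actions follow the public releases, while the specific house index and printed fields are illustrative.

```python
# Minimal sketch: load procedurally generated ProcTHOR houses and step an
# agent through one of them in AI2-THOR. House index and printed fields
# are illustrative.
import prior                                # pip install prior
from ai2thor.controller import Controller   # pip install ai2thor

# ProcTHOR-10K: 10,000 procedurally generated interactive houses.
dataset = prior.load_dataset("procthor-10k")
house = dataset["train"][0]                 # a full house specification

# Launch an AI2-THOR controller directly inside the generated house.
controller = Controller(scene=house)
event = controller.step(action="RotateRight")
print(event.metadata["agent"]["position"])

controller.stop()
```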
Zero-shot real transfer
We’re developing new ways for robots to learn to navigate real-world environments with training done entirely in simulation. By removing the need for costly real-world data collection, our approach makes it much easier to scale up embodied AI. Projects like SPOC, PoliFormer, and FLaRe are leading the way, showing that simulation-trained robots can operate successfully in unfamiliar real-world spaces.
Our teams continue to push the boundaries of embodied AI, focusing on open data and platforms that support training and experimentation across the broader community.
AI2-THOR
AI2-THOR is an open-source simulation platform designed for embodied AI and robotics research. It provides near photo-realistic 3D indoor environments in which virtual agents can navigate, interact with objects, and learn from their actions.
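A minimal interaction loop looks roughly like the sketch below; the scene name and actions are standard AI2-THOR built-ins, while the specific fields printed are chosen for illustration.

```python
# Minimal sketch of the AI2-THOR interaction loop: start a scene, take
# actions, and read the resulting state back from event metadata.
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan10")   # a built-in kitchen scene

# Each step returns an Event carrying an image frame plus full metadata
# about the agent and every object in the scene.
event = controller.step(action="MoveAhead")
event = controller.step(action="RotateRight", degrees=90)

print(event.metadata["agent"]["position"])
for obj in event.metadata["objects"][:5]:
    print(obj["objectType"], obj["visible"])

controller.stop()
```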
GraspMolmo
GraspMolmo is an open-source AI model that helps robots understand everyday language to pick things up in smart, task-aware ways—like grabbing a teapot by the handle when asked to pour tea. Trained on a massive synthetic dataset, it outperforms previous systems and even handles complex, cluttered scenes.
The One RING
Modern robots vary significantly in style and ability, but most navigation policies are trained on only one robot and fail to generalize to another. RING (Robotic Indoor Navigation Generalist) is a novel embodiment-agnostic policy that turns any mobile robot into an effective indoor semantic navigator.