Papers

  • Towards Disturbance-Free Visual Mobile Manipulation

    Tianwei Ni, Kiana Ehsani, Luca Weihs, Jordi Salvador. arXiv, 2022. Deep reinforcement learning has shown promising results on an abundance of robotic tasks in simulation, including visual navigation and manipulation. Prior work generally aims to build embodied agents that solve their assigned tasks as quickly as possible…
  • Benchmarking Progress to Infant-Level Physical Reasoning in AI

    Luca Weihs, Amanda Rose Yuile, Renée Baillargeon, Cynthia Fisher, Gary Marcus, Roozbeh Mottaghi, Aniruddha Kembhavi. TMLR, 2022. To what extent do modern AI systems comprehend the physical world? We introduce the open-access Infant-Level Physical Reasoning Benchmark (InfLevel) to gain insight into this question. We evaluate ten neural-network architectures developed for video…
  • Simple but Effective: CLIP Embeddings for Embodied AI

    Apoorv Khandelwal, Luca Weihs, Roozbeh Mottaghi, Aniruddha Kembhavi. CVPR, 2022. Contrastive language image pretraining (CLIP) encoders have been shown to be beneficial for a range of visual tasks, from classification and detection to captioning and image manipulation. We investigate the effectiveness of CLIP visual backbones for Embodied…
  • MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound

    Rowan Zellers, Jiasen Lu, Ximing Lu, Youngjae Yu, Yanpeng Zhao, Mohammadreza Salehi, Aditya Kusupati, Jack Hessel, Ali Farhadi, Yejin Choi. CVPR, 2022. As humans, we navigate a multimodal world, building a holistic understanding from all our senses. We introduce MERLOT Reserve, a model that represents videos jointly over time – through a new training…
  • Towards General Purpose Vision Systems

    Tanmay Gupta, A. Kamath, Aniruddha Kembhavi, Derek Hoiem. CVPR, 2022. A special-purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation, such as adding an output head for each new task or dataset. In this work, we propose a task…
  • Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

    Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi. arXiv, 2022. We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical computer vision tasks, including pose estimation, object detection, depth estimation and image generation; vision-and-language tasks such as region captioning and…
  • What do navigation agents learn about their environment?

    Kshitij Dwivedi, G. Roig, Aniruddha Kembhavi, Roozbeh Mottaghi. arXiv, 2022. Today’s state-of-the-art visual navigation agents typically consist of large deep learning models trained end to end. Such models offer little to no interpretability about the learned skills or the actions the agent takes in response to its environment…
  • A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge

    Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Kenneth Marino, Roozbeh Mottaghi. arXiv, 2022. The Visual Question Answering (VQA) task aspires to provide a meaningful testbed for the development of AI models that can jointly reason over visual and natural language inputs. Despite a proliferation of VQA datasets, this goal is hindered by a set of…
  • Continuous Scene Representations for Embodied AI

    S. Gadre, Kiana Ehsani, S. Song, Roozbeh Mottaghi. arXiv, 2022. We propose Continuous Scene Representations (CSR), a scene representation constructed by an embodied agent navigating within a space, where objects and their relationships are modeled by continuous-valued embeddings. Our method captures feature relationships…
  • Object Manipulation via Visual Target Localization

    Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi. arXiv, 2022. Object manipulation is a critical skill required for Embodied AI agents interacting with the world around them. Training agents to manipulate objects poses many challenges. These include occlusion of the target object by the agent’s arm, noisy object…