
Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning

Zichen Zhang, Luca Weihs
2023
CVPR • Embodied AI Workshop

Episodic training, where an agent's environment is reset to some initial condition after every success or failure, is the de facto standard when training embodied reinforcement learning (RL) agents.… 

Objaverse: A Universe of Annotated 3D Objects

Matt Deitke, Dustin Schwenk, Jordi Salvador, Ali Farhadi
2022
CVPR

Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and LAION have propelled recent dramatic progress in AI. Large neural models trained on such datasets produce… 

Ask4Help: Learning to Leverage an Expert for Embodied Tasks

Kunal Pratap Singh, Luca Weihs, Alvaro Herrasti, Roozbeh Mottaghi
2022
arXiv

Embodied AI agents continue to become more capable every year with the advent of new models, environments, and benchmarks, but are still far away from being performant and reliable enough to be… 

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Roozbeh Mottaghi
2022
NeurIPS

Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories… 

Webly Supervised Concept Expansion for General Purpose Vision Models

Amita Kamath, Christopher Clark, Tanmay Gupta, Aniruddha Kembhavi
2022
ECCV

General purpose vision (GPV) systems [25] are models that are designed to solve a wide array of visual tasks without requiring architectural changes. Today, GPVs primarily learn both skills and… 

Towards Disturbance-Free Visual Mobile Manipulation

Tianwei Ni, Kiana Ehsani, Luca Weihs, Jordi Salvador
2022
arXiv

Deep reinforcement learning has shown promising results on an abundance of robotic tasks in simulation, including visual navigation and manipulation. Prior work generally aims to build embodied… 

Benchmarking Progress to Infant-Level Physical Reasoning in AI

Luca Weihs, Amanda Rose Yuile, Renée Baillargeon, Aniruddha Kembhavi
2022
TMLR

To what extent do modern AI systems comprehend the physical world? We introduce the open-access Infant-Level Physical Reasoning Benchmark (InfLevel) to gain insight into this question. We evaluate… 

I Can’t Believe There’s No Images! Learning Visual Tasks Using Only Language Supervision

Sophia Gu, Christopher Clark, Aniruddha Kembhavi
2022
ICCV

Many high-level skills that are required for computer vision tasks, such as parsing questions, comparing and contrasting semantics, and writing descriptions, are also required in other domains such… 

Simple but Effective: CLIP Embeddings for Embodied AI

Apoorv Khandelwal, Luca Weihs, Roozbeh Mottaghi, Aniruddha Kembhavi
2022
CVPR

Contrastive language image pretraining (CLIP) encoders have been shown to be beneficial for a range of visual tasks from classification and detection to captioning and image manipulation. We… 

MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound

Rowan Zellers, Jiasen Lu, Ximing Lu, Yejin Choi
2022
CVPR

As humans, we navigate a multimodal world, building a holistic understanding from all our senses. We introduce MERLOT Reserve, a model that…