Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Learning Curves for Analysis of Deep Networks

Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal Shlapentokh-Rothman

2021

arXiv

A learning curve models a classifier's test error as a function of the number of training samples. Prior works show that learning curves can be used to select model parameters and extrapolate…

Visual Semantic Role Labeling for Video Understanding

Arka Sadhu, Tanmay Gupta, Mark Yatskar, Aniruddha Kembhavi

2021

CVPR

We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling. We represent videos as a set of related events, wherein each…

Visual Room Rearrangement

Luca Weihs, Matt Deitke, Aniruddha Kembhavi, R. Mottaghi

2021

arXiv

There has been significant recent progress in the field of Embodied AI, with researchers developing models and algorithms enabling embodied agents to navigate and interact within completely unseen…

What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions

Kiana Ehsani, Daniel Gordon, T. Nguyen, A. Farhadi

2021

ICLR

Learning effective representations of visual data that generalize to a variety of downstream tasks has been a long quest for computer vision. Most representation learning approaches rely solely on…

Learning Generalizable Visual Representations via Interactive Gameplay

Luca Weihs, Aniruddha Kembhavi, Kiana Ehsani, A. Farhadi

2021

ICLR

A growing body of research suggests that embodied gameplay, prevalent not just in human cultures but across a variety of animal species including turtles and ravens, is critical in developing the…

Learning About Objects by Learning to Interact with Them

Martin Lohmann, Jordi Salvador, Aniruddha Kembhavi, Roozbeh Mottaghi

2020

NeurIPS

Much of the remarkable progress in computer vision has been focused around fully supervised learning mechanisms relying on highly curated datasets for a variety of tasks. In contrast, humans often…

X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers

Jaemin Cho, Jiasen Lu, Dustin Schwenk, Aniruddha Kembhavi

2020

EMNLP

Mirroring the success of masked language models, vision-and-language counterparts like ViLBERT, LXMERT and UNITER have achieved state-of-the-art performance on a variety of multimodal discriminative…

Rearrangement: A Challenge for Embodied AI

Dhruv Batra, A. X. Chang, S. Chernova, Hao Su

2020

arXiv

We describe a framework for research and evaluation in Embodied AI. Our proposal is based on a canonical task: Rearrangement. A standard task can focus the development of new techniques and serve as…

AllenAct: A Framework for Embodied AI Research

Luca Weihs, J. Salvador, Klemen Kotar, Aniruddha Kembhavi

2020

arXiv

The domain of Embodied AI, in which agents learn to complete tasks through interaction with their environment from egocentric observations, has experienced substantial growth with the advent of deep…

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

Unnat Jain, Luca Weihs, Eric Kolve, Alexander Schwing

2020

ECCV

Autonomous agents must learn to collaborate. It is not scalable to develop a new centralized agent every time a task’s difficulty outpaces a single agent’s abilities. While multi-agent collaboration…
