Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Towards General Purpose Vision Systems

Tanmay Gupta, A. Kamath, Aniruddha Kembhavi, Derek Hoiem

2022

CVPR

A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head…

Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

Jiasen Lu, Christopher Clark, Rowan Zellers, Aniruddha Kembhavi

2022

arXiv

We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical computer vision tasks, including pose estimation, object detection, depth estimation and image generation,…

What do navigation agents learn about their environment?

Kshitij Dwivedi, G. Roig, Aniruddha Kembhavi, Roozbeh Mottaghi

2022

arXiv

Today’s state-of-the-art visual navigation agents typically consist of large deep learning models trained end-to-end. Such models offer little to no interpretability about the learned skills or the…

A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge

Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Roozbeh Mottaghi

2022

arXiv

The Visual Question Answering (VQA) task aspires to provide a meaningful testbed for the development of AI models that can jointly reason over visual and natural language inputs. Despite a…

Continuous Scene Representations for Embodied AI

S. Gadre, Kiana Ehsani, S. Song, Roozbeh Mottaghi

2022

arXiv

We propose Continuous Scene Representations (CSR), a scene representation constructed by an embodied agent navigating within a space, where objects and their relationships are modeled by continuous…

Object Manipulation via Visual Target Localization

Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi

2022

arXiv

Object manipulation is a critical skill required for Embodied AI agents interacting with the world around them. Training agents to manipulate objects poses many challenges. These include occlusion…

Interactron: Embodied Adaptive Object Detection

Klemen Kotar, Roozbeh Mottaghi

2022

CVPR

Over the years, various methods have been proposed for the problem of object detection. Recently, we have witnessed great strides in this domain owing to the emergence of powerful deep neural…

Multi-Modal Answer Validation for Knowledge-Based VQA

Jialin Wu, Jiasen Lu, Ashish Sabharwal, R. Mottaghi

2022

AAAI

The problem of knowledge-based visual question answering involves answering questions that require external knowledge in addition to the content of the image. Such knowledge typically comes in a…

Vessel Detection in Sentinel-1 Imagery

Favyen Bastani, Piper Wolters, Rose Hendrix, Ani Kembhavi

2022

AI2 whitepaper

In this document, we detail the approach in our xView3 submission. The xView3 dataset presents the challenge of detecting vessels and other maritime objects in synthetic aperture radar (SAR) images…

Bridging the Imitation Gap by Adaptive Insubordination

Luca Weihs, Unnat Jain, Jordi Salvador, A. Schwing

2021

arXiv

Why do agents often obtain better reinforcement learning policies when imitating a worse expert? We show that privileged information used by the expert is marginalized in the learned agent policy,…
