Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Towards General Purpose Vision Systems

Tanmay Gupta, A. Kamath, Aniruddha Kembhavi, Derek Hoiem
2022
CVPR

A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head… 

Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

Jiasen Lu, Christopher Clark, Rowan Zellers, Aniruddha Kembhavi
2022
arXiv

We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical computer vision tasks, including pose estimation, object detection, depth estimation and image generation,… 

What do navigation agents learn about their environment?

Kshitij Dwivedi, G. Roig, Aniruddha Kembhavi, Roozbeh Mottaghi
2022
arXiv

Today’s state-of-the-art visual navigation agents typically consist of large deep learning models trained end to end. Such models offer little to no interpretability about the learned skills or the… 

A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge

Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Roozbeh Mottaghi
2022
arXiv

The Visual Question Answering (VQA) task aspires to provide a meaningful testbed for the development of AI models that can jointly reason over visual and natural language inputs. Despite a… 

Continuous Scene Representations for Embodied AI

S. Gadre, Kiana Ehsani, S. Song, Roozbeh Mottaghi
2022
arXiv

We propose Continuous Scene Representations (CSR), a scene representation constructed by an embodied agent navigating within a space, where objects and their relationships are modeled by continuous… 

Object Manipulation via Visual Target Localization

Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi
2022
arXiv

Object manipulation is a critical skill required for Embodied AI agents interacting with the world around them. Training agents to manipulate objects poses many challenges. These include occlusion… 

Interactron: Embodied Adaptive Object Detection

Klemen Kotar, Roozbeh Mottaghi
2022
CVPR

Over the years, various methods have been proposed for the problem of object detection. Recently, we have witnessed great strides in this domain owing to the emergence of powerful deep neural… 

Multi-Modal Answer Validation for Knowledge-Based VQA

Jialin Wu, Jiasen Lu, Ashish Sabharwal, R. Mottaghi
2022
AAAI

The problem of knowledge-based visual question answering involves answering questions that require external knowledge in addition to the content of the image. Such knowledge typically comes in a… 

Vessel Detection in Sentinel-1 Imagery

Favyen Bastani, Piper Wolters, Rose Hendrix, Ani Kembhavi
2022
AI2 whitepaper

In this document, we detail the approach in our xView3 submission. The xView3 dataset presents the challenge of detecting vessels and other maritime objects in synthetic aperture radar (SAR) images… 

Bridging the Imitation Gap by Adaptive Insubordination

Luca Weihs, Unnat Jain, Jordi Salvador, A. Schwing
2021
arXiv

Why do agents often obtain better reinforcement learning policies when imitating a worse expert? We show that privileged information used by the expert is marginalized in the learned agent policy,…