Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Visual Semantic Planning using Deep Successor Representations

Yuke Zhu, Daniel Gordon, Eric Kolve, Ali Farhadi
2017
ICCV

A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual… 

YOLO9000: Better, Faster, Stronger

Joseph Redmon, Ali Farhadi
2017
CVPR

We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both… 

Actions ~ Transformations

Xiaolong Wang, Ali Farhadi, and Abhinav Gupta
2016
CVPR

What defines an action like “kicking ball”? We argue that the true meaning of an action lies in the change or transformation an action brings to the environment. In this paper, we propose a novel… 

A Diagram Is Worth A Dozen Images

Aniruddha Kembhavi, Mike Salvato, Eric Kolve, and Ali Farhadi
2016
ECCV

Diagrams are common tools for representing complex concepts, relationships and events, often when it would be difficult to portray the same information with natural images. Understanding natural… 

Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects

Hessam Bagherinezhad, Hannaneh Hajishirzi, Yejin Choi, and Ali Farhadi
2016
AAAI

Human vision greatly benefits from the information about sizes of objects. The role of size in several visual reasoning tasks has been thoroughly explored in human perception and cognition. However,… 

A Task-Oriented Approach for Cost-sensitive Recognition

Roozbeh Mottaghi, Hannaneh Hajishirzi, and Ali Farhadi
2016
CVPR

With the recent progress in visual recognition, we have already started to see a surge of vision-related real-world applications. These applications, unlike general scene understanding, are task…

Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks

Junyuan Xie, Ross Girshick, and Ali Farhadi
2016
ECCV

We propose Deep3D, a fully automatic 2D-to-3D conversion algorithm that takes 2D images or video frames as input and outputs stereo 3D image pairs. The stereo images can be viewed with 3D glasses or… 

FigureSeer: Parsing Result-Figures in Research Papers

Noah Siegel, Zachary Horvitz, Roie Levin, and Ali Farhadi
2016
ECCV

‘Which are the pedestrian detectors that yield a precision above 95% at 25% recall?’ Answering such a complex query involves identifying and analyzing the results reported in figures within several… 

G-CNN: an Iterative Grid Based Object Detector

Mahyar Najibi, Mohammad Rastegari, and Larry Davis
2016
CVPR

We introduce G-CNN, an object detection technique based on CNNs which works without proposal algorithms. G-CNN starts with a multi-scale grid of fixed bounding boxes. We train a regressor to move… 

Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, and Abhinav Gupta
2016
ECCV

Computer vision has a great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to…