Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Visual Semantic Planning using Deep Successor Representations

Yuke Zhu, Daniel Gordon, Eric Kolve, Ali Farhadi

2017

ICCV

A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual…

YOLO9000: Better, Faster, Stronger

Joseph Redmon, Ali Farhadi

2017

CVPR

We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both…

Actions ~ Transformations

Xiaolong Wang, Ali Farhadi, and Abhinav Gupta

2016

CVPR

What defines an action like “kicking ball”? We argue that the true meaning of an action lies in the change or transformation an action brings to the environment. In this paper, we propose a novel…

A Diagram Is Worth A Dozen Images

Aniruddha Kembhavi, Mike Salvato, Eric Kolve, and Ali Farhadi

2016

ECCV

Diagrams are common tools for representing complex concepts, relationships and events, often when it would be difficult to portray the same information with natural images. Understanding natural…

Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects

Hessam Bagherinezhad, Hannaneh Hajishirzi, Yejin Choi, and Ali Farhadi

2016

AAAI

Human vision greatly benefits from the information about sizes of objects. The role of size in several visual reasoning tasks has been thoroughly explored in human perception and cognition. However,…

A Task-Oriented Approach for Cost-sensitive Recognition

Roozbeh Mottaghi, Hannaneh Hajishirzi, and Ali Farhadi

2016

CVPR

With the recent progress in visual recognition, we have already started to see a surge of vision related real-world applications. These applications, unlike general scene understanding, are task…

Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks

Junyuan Xie, Ross Girshick, and Ali Farhadi

2016

ECCV

We propose Deep3D, a fully automatic 2D-to-3D conversion algorithm that takes 2D images or video frames as input and outputs stereo 3D image pairs. The stereo images can be viewed with 3D glasses or…

FigureSeer: Parsing Result-Figures in Research Papers

Noah Siegel, Zachary Horvitz, Roie Levin, and Ali Farhadi

2016

ECCV

‘Which are the pedestrian detectors that yield a precision above 95% at 25% recall?’ Answering such a complex query involves identifying and analyzing the results reported in figures within several…

G-CNN: an Iterative Grid Based Object Detector

Mahyar Najibi, Mohammad Rastegari, and Larry Davis

2016

CVPR

We introduce G-CNN, an object detection technique based on CNNs which works without proposal algorithms. G-CNN starts with a multi-scale grid of fixed bounding boxes. We train a regressor to move…

Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, and Abhinav Gupta

2016

ECCV

Computer vision has a great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to…
