Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Discovering Neural Wirings
The success of neural networks has driven a shift in focus from feature engineering to architecture engineering. However, successful networks today are constructed using a small and manually defined…
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations
In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables --such as gender-- in visual recognition tasks. We show that trained models…
Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors
We present a simple and effective learning technique that significantly improves mAP of YOLO object detectors without compromising their speed. During network training, we carefully feed in…
ELASTIC: Improving CNNs by Instance Specific Scaling Policies
Scale variation has been a challenge from traditional to modern approaches in computer vision. Most solutions to scale issues have similar theme: a set of intuitive and manually designed policies…
ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network
We introduce a light-weight, power efficient, and general purpose convolutional neural network, ESPNetv2 , for modeling visual and sequential data. Our network uses group point-wise and depth-wise…
From Recognition to Cognition: Visual Commonsense Reasoning
Visual understanding goes well beyond object recognition. With one glance at an image, we can effortlessly imagine the world beyond the pixels: for instance, we can infer people’s actions, goals,…
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
Learning is an inherently continuous phenomenon. When humans learn a new task there is no explicit distinction between training and inference. After we learn a task, we keep learning about it while…
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Visual Question Answering (VQA) in its ideal form lets us study reasoning in the joint space of vision and language and serves as a proxy for the AI task of scene understanding. However, most VQA…
Two Body Problem: Collaborative Visual Task Completion
Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities. Addressed extensively in both conventional and modern AI, multi-agent collaboration has often been…
Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph
Visual relationship reasoning is a crucial yet challenging task for understanding rich interactions across visual concepts. For example, a relationship \{man, open, door\} involves a complex…