Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
GridToPix: Training Embodied Agents with Minimal Supervision
While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped…
Contrasting Contrastive Self-Supervised Representation Learning Pipelines
In the past few years, we have witnessed remarkable breakthroughs in self-supervised representation learning. Despite the success and adoption of representations learned through this paradigm, much…
“I’m Not Mad”: Commonsense Implications of Negation and Contradiction
Natural language inference requires reasoning about contradictions, negations, and their commonsense implications. Given a simple premise (e.g., “I’m mad at you”), humans can reason about the…
Learning Curves for Analysis of Deep Networks
A learning curve models a classifier's test error as a function of the number of training samples. Prior works show that learning curves can be used to select model parameters and extrapolate…
Visual Semantic Role Labeling for Video Understanding
We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling. We represent videos as a set of related events, wherein each…
Visual Room Rearrangement
There has been significant recent progress in the field of Embodied AI, with researchers developing models and algorithms enabling embodied agents to navigate and interact within completely unseen…
LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis
Recent advances in document image analysis (DIA) have been primarily driven by the application of neural networks. Ideally, research outcomes could be easily deployed in production and extended for…
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Thinking aloud is an effective meta-cognitive strategy human reasoners apply to solve difficult problems. We suggest improving the reasoning ability of pre-trained neural language models in a…
Information to Wisdom: Commonsense Knowledge Extraction and Compilation
Commonsense knowledge is a foundational cornerstone of artificial intelligence applications. Whereas information extraction and knowledge base construction for instance-oriented assertions, such as…
What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions
Learning effective representations of visual data that generalize to a variety of downstream tasks has been a long quest for computer vision. Most representation learning approaches rely solely on…