Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
One-Shot Labeling for Automatic Relevance Estimation
Dealing with unjudged documents ("holes") in relevance assessments is a perennial problem when evaluating search systems with offline experiments. Holes can reduce the apparent effectiveness of…
A Search Engine for Discovery of Scientific Challenges and Directions
Keeping track of scientific challenges, advances and emerging directions is a fundamental part of research. However, researchers face a flood of papers that hinders discovery of important knowledge.…
Knowledge is Power: Symbolic Knowledge Distillation, Commonsense Morality, & Multimodal Script Knowledge
Scale appears to be the winning recipe in today's AI leaderboards. And yet, extreme-scale neural models are still brittle to make errors that are often nonsensical and even counterintuitive. In this…
A Controllable Model of Grounded Response Generation
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process. This control is essential to ensure that users' semantic…
Multi-Modal Answer Validation for Knowledge-Based VQA
The problem of knowledge-based visual question answering involves answering questions that require external knowledge in addition to the content of the image. Such knowledge typically comes in a…
Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability
Investigating the reasoning abilities of transformer models, and discovering new challenging tasks for them, has been a topic of much interest. Recent studies have found these models to be…
Interactron: Embodied Adaptive Object Detection
Over the years various methods have been proposed for the problem of object detection. Recently, we have wit-nessed great strides in this domain owing to the emergence of powerful deep neural…
MuSiQue: Multihop Questions via Single-hop Question Composition
Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by construction,…
Correcting Coarse-Grid Weather and Climate Models by Machine Learning From Global Storm-Resolving Simulations
Global atmospheric `storm-resolving' models with horizontal grid spacing of less than 5~km resolve deep cumulus convection and flow in complex terrain. They promise to be reference models that could…
SCROLLS: Standardized CompaRison Over Long Language Sequences
NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild. We introduce SCROLLS, a…