Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Pre-trained language models, despite their rapid advancements powered by scale, still fall short of robust commonsense capabilities. And yet, scale appears to be the win-ning recipe; after all, the…
Reproducible scaling laws for contrastive language-image learning
Scaling up neural networks has led to remarkable performance across a wide range of tasks. Moreover, performance often follows reliable scaling laws as a function of training set size, model size,…
Continued Pretraining for Better Zero- and Few-Shot Promptability
Recently introduced language model prompting methods can achieve high accuracy in zero-and few-shot settings while requiring few to no learned task-specific parameters. Never-theless, these methods…
Exploring The Landscape of Distributional Robustness for Question Answering Models
We conduct a large empirical evaluation to investigate the landscape of distributional robustness in question answering. Our investigation spans over 350 models and 16 question answering datasets,…
Hyperdecoders: Instance-specific decoders for multi-task NLP
We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder. This…
Lila: A Unified Benchmark for Mathematical Reasoning
Mathematical reasoning skills are essential for general-purpose intelligent systems to perform tasks from grocery shopping to climate modeling. Towards evaluating and improving AI systems in this…
Statistical and Computational Guarantees for Influence Diagnostics
Influence diagnostics such as influence functions and approximate maximum influence perturbations are popular in machine learning and in AI domain applications. Influence diagnostics are powerful…
Abstract Visual Reasoning with Tangram Shapes
We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. Drawing on the history of tangram puzzles as stimuli in cognitive science, we build a richly…
CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for…
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Large transformer models can highly improve Answer Sentence Selection (AS2) tasks, but their high computational costs prevent their use in many real-world applications. In this pa-per, we explore…