Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Transformer Feed-Forward Layers Are Key-Value Memories
Feed-forward layers constitute two-thirds of a transformer model’s parameters, yet their role in the network remains underexplored. We show that feed-forward layers in transformer-based language…
Value-aware Approximate Attention
Following the success of dot-product attention in Transformers, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. However, all…
GooAQ: Open Question Answering with Diverse Answer Types
While day-to-day questions come with a variety of answer types, the current questionanswering (QA) literature has failed to adequately address the answer diversity of questions. To this end, we…
BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief
Although pretrained language models (PTLMs) have been shown to contain significant amounts of world knowledge, they can still produce inconsistent answers to questions when probed, even after using…
MS2: Multi-Document Summarization of Medical Studies
To assess the effectiveness of any medical intervention, researchers must conduct a timeintensive and highly manual literature review. NLP systems can help to automate or assist in parts of this…
Achieving Model Robustness through Discrete Adversarial Training
Discrete adversarial attacks are symbolic perturbations to a language input that preserve the output label but lead to a prediction error. While such attacks have been extensively explored for the…
What's in your Head? Emergent Behaviour in Multi-Task Transformer Models
The primary paradigm for multi-task training in natural language processing is to represent the input with a shared pre-trained language model, and add a small, thin network (head) per task. Given…
Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences
In social settings, much of human behavior is governed by unspoken rules of conduct. For artificial systems to be fully integrated into social environments, adherence to such norms is a central…
proScript: Partially Ordered Scripts Generation
Scripts standardized event sequences describing typical everyday activities have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated…
Contrastive Explanations for Model Interpretability
Contrastive explanations clarify why an event occurred in contrast to another. They are more inherently intuitive to humans to both produce and comprehend. We propose a methodology to produce…