Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
The attention mechanism is considered the backbone of the widely-used Transformer architecture. It contextualizes the input by computing input-specific attention matrices. We find that this…
In-Context Learning for Few-Shot Dialogue State Tracking
Collecting and annotating task-oriented dialogues is time-consuming and costly. Thus, zero- and few-shot learning for dialogue tasks presents an exciting opportunity. In this work, we propose an…
Lexical Generalization Improves with Larger Models and Longer Training
While fine-tuned language models perform well on many tasks, they have also been shown to rely on superficial surface features such as lexical overlap. Excessive utilization of such heuristics can lead to…
Modeling Context With Linear Attention for Scalable Document-Level Translation
Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are…
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
Integrating vision and language has gained notable attention following the success of pretrained language models. Despite that, a fraction of emerging multimodal models is suitable for text…
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce SUPER-NATURALINSTRUCTIONS, a benchmark of 1,616…
Twist Decoding: Diverse Generators Guide Each Other
Natural language generation technology has recently seen remarkable progress with large-scale training, and many natural language applications are now built upon a wide range of generation models…
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases. Since the inputs…
Unsupervised Learning of Hierarchical Conversation Structure
Human conversations can evolve in many different ways, creating challenges for automatic understanding and summarization. Goal-oriented conversations often have meaningful sub-dialogue structure,…
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
A recurring challenge of crowdsourcing NLP datasets at scale is that human writers often rely on repetitive patterns when crafting examples, leading to a lack of linguistic diversity. We introduce a…