Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Parameter Norm Growth During Training of Transformers

William Merrill, Vivek Ramanujan, Yoav Goldberg, Noah A. Smith
2021
EMNLP

The capacity of neural networks like the widely adopted transformer is known to be very high. Evidence is emerging that they learn successfully due to inductive bias in the training routine,… 

Probing Across Time: What Does RoBERTa Know and When?

Leo Z. Liu, Yizhong Wang, Jungo Kasai, Noah A. Smith
2021
Findings of EMNLP

Models of language trained on very large corpora have been demonstrated useful for NLP. As fixed artifacts, they have become the object of intense study, with many researchers “probing” the extent… 

proScript: Partially Ordered Scripts Generation

Keisuke Sakaguchi, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi
2021
Findings of EMNLP

Scripts, standardized event sequences describing typical everyday activities, have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated… 

Sentence Bottleneck Autoencoders from Transformer Language Models

Ivan Montero, Nikolaos Pappas, Noah A. Smith
2021
EMNLP

Representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building NLP systems. This approach stands in contrast to autoencoders,… 

Sister Help: Data Augmentation for Frame-Semantic Role Labeling

Ayush Pancholy, Miriam R. L. Petruck, Swabha Swayamdipta
2021
EMNLP • LAW-DMR Workshop

While FrameNet is widely regarded as a rich resource of semantics in natural language processing, a major criticism concerns its lack of coverage and the relative paucity of its labeled data… 

Surface Form Competition: Why the Highest Probability Answer Isn't Always Right

Ari Holtzman, Peter West, Vered Shwartz, Luke Zettlemoyer
2021
EMNLP

Large language models have shown promising results in zero-shot settings (Brown et al., 2020; Radford et al., 2019). For example, they can perform multiple choice tasks simply by conditioning on a… 
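
The snippet above describes the zero-shot multiple-choice setup the paper examines: each candidate answer is scored by its probability under the language model given a shared prompt, and the highest-probability surface form is chosen. The sketch below illustrates that setup under stated assumptions; the model, prompt, and answer choices are illustrative and not taken from the paper.

```python
# Minimal sketch (not the paper's code): zero-shot multiple choice by scoring
# each candidate answer's conditional log-probability under a causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def answer_log_prob(prompt: str, answer: str) -> float:
    """Sum of log-probabilities of the answer tokens, conditioned on the prompt."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    # The token at position i is predicted from the logits at position i - 1.
    for i in range(prompt_len, full_ids.shape[1]):
        total += log_probs[0, i - 1, full_ids[0, i]].item()
    return total

prompt = "Q: What is the capital of France?\nA:"   # hypothetical example
choices = [" Paris", " Lyon", " Marseille"]
best = max(choices, key=lambda c: answer_log_prob(prompt, c))
print(best)  # the highest-probability surface form, which the paper argues can mislead
```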

Think about it! Improving defeasible reasoning by first modeling the question scenario

Aman Madaan, Niket Tandon, Dheeraj Rajagopal, E. Hovy
2021
EMNLP

Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence. Existing cognitive science literature on defeasible reasoning suggests that a… 

Transformer Feed-Forward Layers Are Key-Value Memories

Mor Geva, R. Schuster, Jonathan Berant, Omer Levy
2021
EMNLP

Feed-forward layers constitute two-thirds of a transformer model’s parameters, yet their role in the network remains underexplored. We show that feed-forward layers in transformer-based language… 
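
The key-value reading suggested by the title treats the first projection of the feed-forward block as a set of "keys" (its rows) that the input is matched against, and the second projection as the corresponding "values" (its columns) mixed according to those match scores. The sketch below is an illustration of that view under typical GPT-style dimensions, not the authors' code; it also shows where the roughly two-thirds parameter share comes from with a standard 4x expansion.

```python
# Minimal sketch of a transformer feed-forward block, read as a key-value memory.
import torch
import torch.nn as nn

d_model, d_ff = 768, 3072  # illustrative dimensions with a 4x expansion

class FeedForward(nn.Module):
    def __init__(self):
        super().__init__()
        self.keys = nn.Linear(d_model, d_ff)     # W1: each row acts as a "key" pattern
        self.values = nn.Linear(d_ff, d_model)   # W2: each column acts as a "value" vector
        self.act = nn.GELU()

    def forward(self, x):
        coeffs = self.act(self.keys(x))          # memory coefficients per key
        return self.values(coeffs)               # weighted sum of value vectors

ffn_params = 2 * d_model * d_ff                  # the two feed-forward projections
attn_params = 4 * d_model * d_model              # Q, K, V, and output projections
print(ffn_params / (ffn_params + attn_params))   # 2/3 with a 4x expansion
```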

Understanding Mention Detector-Linker Interaction in Neural Coreference Resolution

Zhaofeng Wu, Matt Gardner
2021
EMNLP • CRAC

Despite significant recent progress in coreference resolution, the quality of current state-of-the-art systems still considerably trails behind human-level performance. Using the CoNLL-2012 and… 

Value-aware Approximate Attention

Ankit Gupta, Jonathan Berant
2021
EMNLP

Following the success of dot-product attention in Transformers, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. However, all…
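
For reference, the quadratic complexity the abstract refers to comes from the n x n score matrix of exact dot-product attention, which approximate-attention methods try to avoid materializing. The single-head sketch below uses illustrative dimensions and is not code from the paper.

```python
# Minimal sketch of exact single-head dot-product attention, showing the
# n x n score matrix that makes time and memory grow quadratically in length.
import torch
import torch.nn.functional as F

n, d = 1024, 64                       # sequence length, head dimension (illustrative)
q = torch.randn(n, d)
k = torch.randn(n, d)
v = torch.randn(n, d)

scores = q @ k.T / d ** 0.5           # (n, n): the quadratic-cost term
weights = F.softmax(scores, dim=-1)
out = weights @ v                     # (n, d) attention output
print(out.shape)
```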