
Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Mitigating False-Negative Contexts in Multi-document Question Answering with Retrieval Marginalization

Ansong Ni, Matt Gardner, Pradeep Dasigi
2021
EMNLP

Question Answering (QA) tasks requiring information from multiple documents often rely on a retrieval model to identify relevant information from which the reasoning model can derive an answer. The… 

Paired Examples as Indirect Supervision in Latent Decision Models

Nitish Gupta, Sameer Singh, Matt Gardner, and Dan Roth
2021
EMNLP

Compositional, structured models are appealing because they explicitly decompose problems and provide interpretable intermediate outputs that give confidence that the model is not simply latching… 

Parameter Norm Growth During Training of Transformers

William Merrill, Vivek Ramanujan, Yoav Goldberg, Noah A. Smith
2021
EMNLP

The capacity of neural networks like the widely adopted transformer is known to be very high. Evidence is emerging that they learn successfully due to inductive bias in the training routine,… 

Probing Across Time: What Does RoBERTa Know and When?

Leo Z. Liu, Yizhong Wang, Jungo Kasai, Noah A. Smith
2021
Findings of EMNLP

Models of language trained on very large corpora have been demonstrated to be useful for NLP. As fixed artifacts, they have become the object of intense study, with many researchers “probing” the extent… 

Sentence Bottleneck Autoencoders from Transformer Language Models

Ivan Montero, Nikolaos Pappas, Noah A. Smith
2021
EMNLP

Representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building NLP systems. This approach stands in contrast to autoencoders,… 

Understanding Mention Detector-Linker Interaction in Neural Coreference Resolution

Zhaofeng Wu, Matt Gardner
2021
EMNLP • CRAC

Despite significant recent progress in coreference resolution, the quality of current state-of-the-art systems still considerably trails behind human-level performance. Using the CoNLL-2012 and… 

DIALKI: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization

Zeqiu Wu, Bo-Ru Lu, Hannaneh Hajishirzi, Mari Ostendorf
2021
EMNLP

Identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation. We introduce a knowledge identification model… 

Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study

Rahul Nadkarni, David Wadden, Iz Beltagy, Tom Hope
2021
AKBC

Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes. Predicting missing links in these graphs can boost many important applications, such as drug… 

Competency Problems: On Finding and Removing Artifacts in Language Data

Matt Gardner, William Cooper Merrill, Jesse Dodge, Noah A. Smith
2021
EMNLP

Much recent work in NLP has documented dataset artifacts, bias, and spurious correlations between input features and output labels. However, how to tell which features have “spurious” instead of… 

Expected Validation Performance and Estimation of a Random Variable's Maximum

Jesse Dodge, Suchin Gururangan, D. Card, Noah A. Smith
2021
Findings of EMNLP

Research in NLP is often supported by experimental results, and improved reporting of such results can lead to better understanding and more reproducible science. In this paper we analyze three…