Papers
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, A. Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi
arXiv • 2021
Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multimodal gestures (e.g., pointing with a finger, or an arrow in a diagram). We…

Specializing Multilingual Language Models: An Empirical Study
Ethan C. Chau, Noah A. Smith
EMNLP • Workshop on Multilingual Representation Learning • 2021
Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance, extensibility, and interaction of two such adaptations: vocabulary…
Best Paper Honorable Mention

Towards Personalized Descriptions of Scientific Concepts
Sonia K. Murthy, Daniel King, Tom Hope, Daniel S. Weld, Doug Downey
EMNLP 2021 • WiNLP • 2021
A single scientific concept can be described in many different ways, and the most informative description depends on the audience. In this paper, we propose generating personalized scientific concept descriptions that are tailored to the user’s expertise and…

Achieving Model Robustness through Discrete Adversarial Training
Maor Ivgi, Jonathan Berant
EMNLP • 2021
Discrete adversarial attacks are symbolic perturbations to a language input that preserve the output label but lead to a prediction error. While such attacks have been extensively explored for the purpose of evaluating model robustness, their utility for…

Back to Square One: Bias Detection, Training and Commonsense Disentanglement in the Winograd Schema
Yanai Elazar, Hongming Zhang, Yoav Goldberg, Dan Roth
EMNLP • 2021
The Winograd Schema (WS) has been proposed as a test for measuring commonsense capabilities of models. Recently, pre-trained language model-based approaches have boosted performance on some WS benchmarks but the source of improvement is still not clear. We…

BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief
Nora Kassner, Oyvind Tafjord, H. Schutze, P. Clark
EMNLP • 2021
Although pretrained language models (PTLMs) have been shown to contain significant amounts of world knowledge, they can still produce inconsistent answers to questions when probed, even after using specialized training techniques to reduce inconsistency. As a…

CDLM: Cross-Document Language Modeling
Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan
Findings of EMNLP • 2021
We introduce a new pretraining approach for language models that are geared to support multi-document NLP tasks. Our cross-document language model (CD-LM) improves masked language modeling for these tasks with two key ideas. First, we pretrain with multiple…

Contrastive Explanations for Model Interpretability
Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg
EMNLP • 2021
Contrastive explanations clarify why an event occurred in contrast to another. They are more inherently intuitive to humans to both produce and comprehend. We propose a methodology to produce contrastive explanations for classification models by modifying the…

Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
Jesse Dodge, Maarten Sap, Ana Marasović, William Agnew, Gabriel Ilharco, Dirk Groeneveld, Matt Gardner
EMNLP • 2021
As language models are trained on ever more text, researchers are turning to some of the largest corpora available. Unlike most other types of datasets in NLP, large unlabeled text corpora are often presented with minimal documentation, and best practices for…

Explaining Answers with Entailment Trees
Bhavana Dalvi, Peter A. Jansen, Oyvind Tafjord, Zhengnan Xie, Hannah Smith, Leighanna Pipatanangkura, Peter Clark
EMNLP • 2021
Our goal, in the context of open-domain textual question-answering (QA), is to explain answers by not just listing supporting textual evidence (“rationales”), but also showing how such evidence leads to the answer in a systematic way. If this could be done…