Papers
Viewing 401-410 of 1016 papers
Natural Adversarial Objects
Felix Lau, Nishant Subramani, Sasha Harrison, Aerin Kim, E. Branson, Rosanne Liu
NeurIPS 2021 Data Centric AI Workshop • 2021
Although state-of-the-art object detection methods have shown compelling performance, models often are not robust to adversarial attacks and out-of-distribution data. We introduce a new dataset, Natural Adversarial Objects (NAO), to evaluate the robustness of…

NaturalProofs: Mathematical Theorem Proving in Natural Language
S. Welleck, Jiachen Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho
NeurIPS • 2021
Understanding and creating mathematics using natural mathematical language – the mixture of symbolic and natural language used by humans – is a challenging and important problem for driving progress in machine learning. As a step in this direction, we develop…

One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval
Akari Asai, Xinyan Yu, Jungo Kasai, Hanna Hajishirzi
NeurIPS • 2021
We present CORA, a Cross-lingual Open-Retrieval Answer Generation model that can answer questions across many languages even when language-specific annotated data or knowledge sources are unavailable. We introduce a new dense passage retrieval algorithm that…

Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing
Sarah Wiegreffe and Ana Marasović
NeurIPS • 2021
Explainable NLP (ExNLP) has increasingly focused on collecting human-annotated explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as a loss signal to train models to produce…

Bridging the Imitation Gap by Adaptive Insubordination
Luca Weihs, Unnat Jain, Jordi Salvador, S. Lazebnik, Aniruddha Kembhavi, A. Schwing
arXiv • 2021
Why do agents often obtain better reinforcement learning policies when imitating a worse expert? We show that privileged information used by the expert is marginalized in the learned agent policy, resulting in an "imitation gap." Prior work bridges this gap…

Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, A. Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi
arXiv • 2021
Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multimodal gestures (e.g., pointing with a finger, or an arrow in a diagram). We…

Specializing Multilingual Language Models: An Empirical Study
Ethan C. Chau, Noah A. Smith
EMNLP • Workshop on Multilingual Representation Learning • 2021
Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance, extensibility, and interaction of two such adaptations: vocabulary…
Best Paper Honorable Mention

Towards Personalized Descriptions of Scientific Concepts
Sonia K. Murthy, Daniel King, Tom Hope, Daniel S. Weld, Doug Downey
EMNLP 2021 • WiNLP • 2021
A single scientific concept can be described in many different ways, and the most informative description depends on the audience. In this paper, we propose generating personalized scientific concept descriptions that are tailored to the user’s expertise and…

Achieving Model Robustness through Discrete Adversarial Training
Maor Ivgi, Jonathan Berant
EMNLP • 2021
Discrete adversarial attacks are symbolic perturbations to a language input that preserve the output label but lead to a prediction error. While such attacks have been extensively explored for the purpose of evaluating model robustness, their utility for…

Back to Square One: Bias Detection, Training and Commonsense Disentanglement in the Winograd Schema
Yanai Elazar, Hongming Zhang, Yoav Goldberg, Dan Roth
EMNLP • 2021
The Winograd Schema (WS) has been proposed as a test for measuring commonsense capabilities of models. Recently, pre-trained language model-based approaches have boosted performance on some WS benchmarks but the source of improvement is still not clear. We…