Papers

  • Natural Adversarial Objects

    Felix Lau, Nishant Subramani, Sasha Harrison, Aerin Kim, E. Branson, Rosanne Liu. NeurIPS 2021
    Although state-of-the-art object detection methods have shown compelling performance, models often are not robust to adversarial attacks and out-of-distribution data. We introduce a new dataset, Natural Adversarial Objects (NAO), to evaluate the robustness of…
  • NaturalProofs: Mathematical Theorem Proving in Natural Language

    S. Welleck, Jiachen Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho. NeurIPS 2021
    Understanding and creating mathematics using natural mathematical language – the mixture of symbolic and natural language used by humans – is a challenging and important problem for driving progress in machine learning. As a step in this direction, we develop…
  • One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval

    Akari Asai, Xinyan Yu, Jungo Kasai, Hanna Hajishirzi. NeurIPS 2021
    We present CORA, a Cross-lingual Open-Retrieval Answer Generation model that can answer questions across many languages even when language-specific annotated data or knowledge sources are unavailable. We introduce a new dense passage retrieval algorithm that…
  • Teach Me to Explain: A Review of Datasets for Explainable NLP

    Sarah Wiegreffe and Ana Marasović. NeurIPS 2021
    Explainable NLP (ExNLP) has increasingly focused on collecting human-annotated explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as a loss signal to train models to produce…
  • Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text

    Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, A. Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi. arXiv 2021
    Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multimodal gestures (e.g., pointing with a finger, or an arrow in a diagram). We…
  • Dyna-bAbI: unlocking bAbI’s potential with dynamic synthetic benchmarking

    Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson H S Liu, Reut Tsarfaty, Dafna Shahaf. arXiv 2021
    While neural language models often perform surprisingly well on natural language understanding (NLU) tasks, their strengths and limitations remain poorly understood. Controlled synthetic tasks are thus an increasingly important resource for diagnosing model…
  • Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity

    Sheshera Mysore, Arman Cohan, Tom Hope. arXiv 2021
    We present ASPIRE, a new scientific document similarity model based on matching fine-grained aspects. Our model is trained using co-citation contexts that describe related paper aspects as a novel form of textual supervision. We use multi-vector document…
  • Specializing Multilingual Language Models: An Empirical Study

    Ethan C. Chau, Noah A. Smith. EMNLP • Workshop on Multilingual Representation Learning, 2021
    Best Paper Honorable Mention
    Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance, extensibility, and interaction of two such adaptations: vocabulary…
  • Achieving Model Robustness through Discrete Adversarial Training

    Maor Ivgi, Jonathan Berant. EMNLP 2021
    Discrete adversarial attacks are symbolic perturbations to a language input that preserve the output label but lead to a prediction error. While such attacks have been extensively explored for the purpose of evaluating model robustness, their utility for…
  • Back to Square One: Bias Detection, Training and Commonsense Disentanglement in the Winograd Schema

    Yanai Elazar, Hongming Zhang, Yoav Goldberg, Dan Roth. EMNLP 2021
    The Winograd Schema (WS) has been proposed as a test for measuring commonsense capabilities of models. Recently, pre-trained language model-based approaches have boosted performance on some WS benchmarks but the source of improvement is still not clear. We…