Papers

Viewing 591-600 of 991 papers
  • X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers

    Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi. EMNLP 2020. Mirroring the success of masked language models, vision-and-language counterparts like ViLBERT, LXMERT, and UNITER have achieved state-of-the-art performance on a variety of multimodal discriminative tasks like visual question answering and visual grounding…
  • "You are grounded!": Latent Name Artifacts in Pre-trained Language Models

    Vered Shwartz, Rachel Rudinger, Oyvind Tafjord. EMNLP 2020. Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models. We focus on artifacts associated with the representation of given names (e.g., Donald), which, depending on the corpus, may be associated with…
  • ZEST: Zero-shot Learning from Text Descriptions using Textual Similarity and Visual Summarization

    Tzuf Paz-Argaman, Y. Atzmon, Gal Chechik, Reut Tsarfaty. Findings of EMNLP 2020. We study the problem of recognizing visual entities from the textual descriptions of their classes. Specifically, given birds' images with free-text descriptions of their species, we learn to classify images of previously-unseen species based on specie…
  • Rearrangement: A Challenge for Embodied AI

    Dhruv Batra, A. X. Chang, S. Chernova, A. Davison, Jun Deng, V. Koltun, S. Levine, J. Malik, Igor Mordatch, R. Mottaghi, M. Savva, Hao Su. arXiv 2020. We describe a framework for research and evaluation in Embodied AI. Our proposal is based on a canonical task: Rearrangement. A standard task can focus the development of new techniques and serve as a source of trained models that can be transferred to other…
  • ABNIRML: Analyzing the Behavior of Neural IR Models

    Sean MacAvaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan. TACL 2020. Numerous studies have demonstrated the effectiveness of pretrained contextualized language models such as BERT and T5 for ad-hoc search. However, it is not well understood why these methods are so effective, what makes some variants more effective than others…
  • GO FIGURE: A Meta Evaluation of Factuality in Summarization

    Saadia Gabriel, Asli Celikyilmaz, Rahul Jha, Yejin Choi, Jianfeng Gao. ACL 2020. Text generation models can generate factually inconsistent text containing distorted or fabricated facts about the source text. Recent work has focused on building evaluation models to verify the factual correctness of semantically constrained text generation…
  • NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints

    Ximing Lu, Peter West, Rowan Zellers, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi. NAACL 2020. Conditional text generation often requires lexical constraints, i.e., which words should or shouldn’t be included in the output text. While the dominant recipe for conditional text generation has been large-scale pretrained language models that are finetuned…
  • Paraphrasing vs Coreferring: Two Sides of the Same Coin

    Y. Meged, Avi Caciularu, Vered Shwartz, I. Dagan. arXiv 2020. We study the potential synergy between two different NLP tasks, both confronting lexical variability: identifying predicate paraphrases and event coreference resolution. First, we used annotations from an event coreference dataset as distant supervision to re…
  • Generative Data Augmentation for Commonsense Reasoning

    Yiben Yang, Chaitanya Malaviya, Jared Fernandez, Swabha Swayamdipta, Ronan Le Bras, J. Wang, Chandra Bhagavatula, Yejin Choi, Doug Downey. Findings of EMNLP 2020. Recent advances in commonsense reasoning depend on large-scale human-annotated training data to achieve peak performance. However, manual curation of training examples is expensive and has been shown to introduce annotation artifacts that neural models can…
  • Evaluating Models' Local Decision Boundaries via Contrast Sets

    M. Gardner, Y. Artzi, V. Basmova, J. Berant, B. Bogin, S. Chen, P. Dasigi, D. Dua, Y. Elazar, A. Gottumukkala, N. Gupta, H. Hajishirzi, G. Ilharco, D. Khashabi, K. Lin, J. Liu, N. F. Liu, P. Mulcaire, Q. Ning, S. Singh, N. A. Smith, S. Subramanian, et al. Findings of EMNLP 2020. Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on…