Papers

  • Competency Problems: On Finding and Removing Artifacts in Language Data

    Matt Gardner, William Cooper Merrill, Jesse Dodge, Matthew E. Peters, Alexis Ross, Sameer Singh, Noah A. Smith. EMNLP 2021. Much recent work in NLP has documented dataset artifacts, bias, and spurious correlations between input features and output labels. However, how to tell which features have “spurious” instead of legitimate correlations is typically left unspecified. In this…
  • Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions?

    Jieyu Zhao, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Kai-Wei Chang. ACL-IJCNLP 2021. Is it possible to use natural language to intervene in a model’s behavior and alter its prediction in a desired way? We investigate the effectiveness of natural language interventions for reading-comprehension systems, studying this in the context of social…
  • Expected Validation Performance and Estimation of a Random Variable's Maximum

    Jesse Dodge, Suchin Gururangan, D. Card, Roy Schwartz, Noah A. Smith. Findings of EMNLP 2021. Research in NLP is often supported by experimental results, and improved reporting of such results can lead to better understanding and more reproducible science. In this paper we analyze three statistical estimators for expected validation performance, a…
  • Investigating Transfer Learning in Multilingual Pre-trained Language Models through Chinese Natural Language Inference

    Hai Hu, He Zhou, Zuoyu Tian, Yiwen Zhang, Yina Ma, Yanting Li, Yixin Nie, Kyle Richardson. Findings of ACL 2021. Multilingual transformers (XLM, mT5) have been shown to have remarkable transfer skills in zero-shot settings. Most transfer studies, however, rely on automatically translated resources (XNLI, XQuAD), making it hard to discern the particular linguistic…
  • ReadOnce Transformers: Reusable Representations of Text for Transformers

    Shih-Ting Lin, Ashish Sabharwal, Tushar Khot. ACL 2021. While large-scale language models are extremely effective when directly fine-tuned on many end-tasks, such models learn to extract information and solve the task simultaneously from end-task supervision. This is wasteful, as the general problem of gathering…
  • Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

    Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith. ICLR 2021. State-of-the-art neural machine translation models generate outputs autoregressively, where every step conditions on the previously generated tokens. This sequential nature causes inherent decoding latency. Non-autoregressive translation techniques, on the…
  • Random Feature Attention

    Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong. ICLR 2021. Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their core is an attention function which models pairwise interactions between the inputs at every timestep. While attention is powerful, it does not scale efficiently to…
  • Symbolic Brittleness in Sequence Models: on Systematic Generalization in Symbolic Mathematics

    S. Welleck, Peter West, Jize Cao, Yejin Choi. AAAI 2021. Neural sequence models trained with maximum likelihood estimation have led to breakthroughs in many tasks, where success is defined by the gap between training and test performance. However, their ability to achieve stronger forms of generalization remains…
  • S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

    Shivashankar Subramanian, Daniel King, Doug Downey, Sergey Feldman. JCDL 2021. Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library applications such as search and citation analysis. While many AND…
  • COVR: A Test-Bed for Visually Grounded Compositional Generalization with Real Images

    Ben Bogin, Shivanshu Gupta, Matt Gardner, Jonathan Berant. EMNLP 2021. While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In this work, we propose COVR, a new test-bed for visually…