Papers

Viewing 211-220 of 293 papers
  • A Formal Hierarchy of RNN Architectures

    William Merrill, Gail Garfinkel Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav. ACL 2020. We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be…
  • A Mixture of h-1 Heads is Better than h Heads

    Hao Peng, Roy Schwartz, Dianqi Li, Noah A. Smith. ACL 2020. Multi-head attentive neural architectures have achieved state-of-the-art results on a variety of natural language processing tasks. Evidence has shown that they are overparameterized; attention heads can be pruned without significant performance loss. In this…
  • Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks

    Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith. ACL 2020. Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target…
  • Improving Transformer Models by Reordering their Sublayers

    Ofir Press, Noah A. Smith, Omer Levy. ACL 2020. Multilayer transformer networks consist of interleaved self-attention and feedforward sublayers. Could ordering the sublayers in a different pattern lead to better performance? We generate randomly ordered transformers and train them with the language…
  • Obtaining Faithful Interpretations from Compositional Neural Networks

    Sanjay Subramanian, Ben Bogin, Nitish Gupta, Tomer Wolfson, Sameer Singh, Jonathan Berant, Matt Gardner. ACL 2020. Neural module networks (NMNs) are a popular approach for modeling compositionality: they achieve high accuracy when applied to problems in language and vision, while reflecting the compositional structure of the problem in the network architecture. However…
  • QuASE: Question-Answer Driven Sentence Encoding

    Hangfeng He, Qiang Ning, Dan Roth. ACL 2020. Question-answering (QA) data often encodes essential information in many facets. This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? For example, can we use QAMR (Michael et al., 2017) to…
  • Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models

    Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, James W. Pennebaker. ACL 2020. We investigate the use of NLP as a measure of the cognitive processes involved in storytelling, contrasting imagination and recollection of events. To facilitate this, we collect and release HIPPOCORPUS, a dataset of 7,000 stories about imagined and recalled…
  • Social Bias Frames: Reasoning about Social and Power Implications of Language

    Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi. ACL 2020.
    WeCNLP Best Paper
    Language has the power to reinforce stereotypes and project social biases onto others. At the core of the challenge is that it is rarely what is stated explicitly, but all the implied meanings that frame people's judgements about others. For example, given a…
  • The Right Tool for the Job: Matching Model and Instance Complexities

    Roy Schwartz, Gabi Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith. ACL 2020. As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs. To better respect a given inference budget, we propose a modification to contextual representation fine-tuning…
  • Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering

    Ben Bogin, Sanjay Subramanian, Matt Gardner, Jonathan Berant. TACL 2020. Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition…