  • QuASE: Question-Answer Driven Sentence Encoding.

    Hangfeng He, Qiang Ning, Dan Roth. ACL 2020. Question-answering (QA) data often encodes essential information in many facets. This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? For example, can we use QAMR (Michael et al., 2017) to improve named entity recognition? We…
  • Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models

    Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, James W. Pennebaker. ACL 2020. We investigate the use of NLP as a measure of the cognitive processes involved in storytelling, contrasting imagination and recollection of events. To facilitate this, we collect and release HIPPOCORPUS, a dataset of 7,000 stories about imagined and recalled events. We introduce a measure of…
  • S2ORC: The Semantic Scholar Open Research Corpus

    Kyle Lo, Lucy Lu Wang, Mark E Neumann, Rodney Michael Kinney, Daniel S. Weld. ACL 2020. We introduce S2ORC, a large contextual citation graph of English-language academic papers from multiple scientific domains; the corpus consists of 81.1M papers, 380.5M citation edges, and associated paper metadata. We provide structured full text for 8.1M open access papers. All inline citation…
  • SciREX: A Challenge Dataset for Document-Level Information Extraction

    Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy. ACL 2020. Extracting information from full documents is an important problem in many domains, but most previous work focuses on identifying relationships within a sentence or a paragraph. It is challenging to create a large-scale information extraction (IE) dataset at the document level since it requires an…
  • Social Bias Frames: Reasoning about Social and Power Implications of Language

    Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi. ACL 2020. Language has the power to reinforce stereotypes and project social biases onto others. At the core of the challenge is that it is rarely what is stated explicitly, but all the implied meanings that frame people's judgements about others. For example, given a seemingly innocuous statement "we…
  • SPECTER: Document-level Representation Learning using Citation-informed Transformers

    Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld. ACL 2020. Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are targeted towards token- and sentence-level training objectives and do not leverage information on inter…
  • Stolen Probability: A Structural Weakness of Neural Language Models

    David Demeter, Gregory Kimmel, Doug Downey. ACL 2020. Neural Network Language Models (NNLMs) generate probability distributions by applying a softmax function to a distance metric formed by taking the dot product of a prediction vector with all word vectors in a high-dimensional embedding space. The dot-product distance metric forms part of the…
  • Syntactic Search by Example

    Micah Shlain, Hillel Taub-Tabib, Shoval Sadde, Yoav Goldberg. ACL 2020. We present a system that allows a user to search a large linguistically annotated corpus using syntactic patterns over dependency graphs. In contrast to previous attempts to this effect, we introduce a light-weight query language that does not require the user to know the details of the underlying…
  • Temporal Common Sense Acquisition with Minimal Supervision

    Ben Zhou, Qiang Ning, Daniel Khashabi, Dan Roth. ACL 2020. Temporal common sense (e.g., duration and frequency of events) is crucial for understanding natural language. However, its acquisition is challenging, partly because such information is often not expressed explicitly in text, and human annotation on such concepts is costly. This work proposes a…
  • The Right Tool for the Job: Matching Model and Instance Complexities

    Roy Schwartz, Gabi Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith. ACL 2020. As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs. To better respect a given inference budget, we propose a modification to contextual representation fine-tuning which, during inference, allows for an early…