Papers

Viewing 1-10 of 62 papers
  • Adversarial Filters of Dataset Biases

    Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi arXiv2020Large neural models have demonstrated humanlevel performance on language and vision benchmarks such as ImageNet and Stanford Natural Language Inference (SNLI). Yet, their performance degrades considerably when tested on adversarial or out-of-distribution samples. This raises the question of whether… more
  • Evaluating Question Answering Evaluation

    Anthony Chen, Gabriel Stanovsky, Sameer Singh, Matt GardnerEMNLP • MRQA Workshop2019As the complexity of question answering (QA) datasets evolve, moving away from restricted formats like span extraction and multiple-choice (MC) to free-form answer generation, it is imperative to understand how well current metrics perform in evaluating QA. This is especially important as existing… more
  • On Making Reading Comprehension More Comprehensive

    Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Alon Talmor, Sewon MinEMNLP • MRQA Workshop2019Machine reading comprehension, the task of evaluating a machine’s ability to comprehend a passage of text, has seen a surge in popularity in recent years. There are many datasets that are targeted at reading comprehension, and many systems that perform as well as humans on some of these datasets… more
  • ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension

    Dheeru Dua, Ananth Gottumukkala, Alon Talmor, Sameer Singh, Matt GardnerEMNLP • MRQA Workshop2019Reading comprehension is one of the crucial tasks for furthering research in natural language understanding. A lot of diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple paraphrase matching and entity typing to… more
  • Reasoning Over Paragraph Effects in Situations

    Kevin Lin, Oyvind Tafjord, Peter Clark, Matt GardnerEMNLP • MRQA Workshop2019A key component of successfully reading a passage of text is the ability to apply knowledge gained from the passage to a new situation. In order to facilitate progress on this kind of reading, we present ROPES, a challenging benchmark for reading comprehension targeting Reasoning Over Paragraph… more
  • AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models

    Eric Wallace, Jens Tuyls, Junlin Wang, Sanjay Subramanian, Matthew Gardner, Sameer SinghEMNLP2019Neural NLP models are increasingly accurate but are imperfect and opaque---they break in counterintuitive ways and leave end users puzzled at their behavior. Model interpretation methods ameliorate this opacity by providing explanations for specific model predictions. Unfortunately, existing… more
  • Do NLP Models Know Numbers? Probing Numeracy in Embeddings

    Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, Matt GardnerEMNLP2019The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks. Currently, most NLP models treat numbers in text in the same way as other tokens---they embed them as distributed vectors. Is this enough to capture numeracy? We begin by investigating the… more
  • Low-Resource Parsing with Crosslingual Contextualized Representations

    Phoebe Mulcaire, Jungo Kasai, Noah A. SmithCoNLL2019Despite advances in dependency parsing, languages with small treebanks still present challenges. We assess recent approaches to multilingual contextual word representations (CWRs), and compare them for crosslingual transfer from a language with a large treebank to a language with a small or… more
  • On the Limits of Learning to Actively Learn Semantic Representations

    Omri Koshorek, Gabriel Stanovsky, Yichu Zhou, Vivek Srikumar and Jonathan BerantCoNLL2019One of the goals of natural language understanding is to develop models that map sentences into meaning representations. However, training such models requires expensive annotation of complex structures, which hinders their adoption. Learning to actively-learn (LTAL) is a recent paradigm for… more
  • Universal Adversarial Triggers for Attacking and Analyzing NLP

    Eric Wallace, Shi Feng, Nikhil Kandpal, Matthew Gardner, Sameer Singh EMNLP2019dversarial examples highlight model vulnerabilities and are useful for evaluation and interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens that trigger a model to produce a specific prediction when concatenated to any input from a dataset. We propose a… more