• Crowdsourcing Multiple Choice Science Questions
    Johannes Welbl, Nelson F. Liu, and Matt Gardner Workshop on Noisy User-generated Text 2017
  • Ontology Aware Token Embeddings for Prepositional Phrase Attachment
    Pradeep Dasigi, Waleed Ammar, Chris Dyer, and Eduard Hovy ACL 2017
  • QSAnglyzer: Visual Analytics for Prismatic Analysis of Question Answering System Evaluations
    Nan-Chen Chen and Been Kim VAST 2017 Demo Video

    Developing sophisticated artificial intelligence (AI) systems requires AI researchers to experiment with different designs and analyze results from evaluations (we refer this task as evaluation analysis). In this paper, we tackle the challenges of evaluation analysis in the domain of question-answering (QA) systems. Through in-depth studies with QA researchers, we identify tasks and goals of evaluation analysis and derive a set of design rationales, based on which we propose a novel approach termed prismatic analysis. Prismatic analysis examines data through multiple ways of categorization (referred as angles). Categories in each angle are measured by aggregate metrics to enable diverse comparison scenarios. Less

  • Interactive Visualization for Linguistic Structure
    Aaron Sarnat, Vidur Joshi, Cristian Petrescu-Prahova, Alvaro Herrasti, Brandon Stilson, and Mark Hopkins EMNLP 2017

    We provide a visualization library and web interface for interactively exploring a parse tree or a forest of parses. The library is not tied to any particular linguistic representation, but provides a generalpurpose API for the interactive exploration of hierarchical linguistic structure. To facilitate rapid understanding of a complex structure, the API offers several important features, including expand/collapse functionality, positional and color cues, explicit visual support for sequential structure, and dynamic highlighting to convey node-to-text correspondence. Less

  • Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers
    Mark Hopkins, Cristian Petrescu-Prahova, Roie Levin, Ronan Le Bras, Alvaro Herrasti, and Vidur Joshi EMNLP 2017

    We present an approach for answering questions that span multiple sentences and exhibit sophisticated cross-sentence anaphoric phenomena, evaluating on a rich source of such questions--the math portion of the Scholastic Aptitude Test (SAT). By using a tree transducer cascade as its basic architecture, our system (called EUCLID) propagates uncertainty from multiple sources (e.g. coreference resolution or verb interpretation) until it can be confidently resolved. Experiments show the first-ever results (43% recall and 91% precision) on SAT algebra word problems. We also apply EUCLID to the public Dolphin algebra question set, and improve the state-of-the-art F1-score from 73.9% to 77.0%. Less

  • Neural Semantic Parsing with Type Constraints for Semi-Structured Tables
    Jayant Krishnamurthy, Pradeep Dasigi, and Matt Gardner EMNLP 2017

    We present a new semantic parsing model for answering compositional questions on semi-structured Wikipedia tables. Our parser is an encoder-decoder neural network with two key technical innovations: (1) a grammar for the decoder that only generates well-typed logical forms; and (2) an entity embedding and linking module that identifies entity mentions while generalizing across tables. We also introduce a novel method for training our neural model with question-answer supervision. On the WIKITABLEQUESTIONS data set, our parser achieves a state-of-theart accuracy of 43.3% for a single model and 45.9% for a 5-model ensemble, improving on the best prior score of 38.7% set by a 15-model ensemble. These results suggest that type constraints and entity linking are valuable components to incorporate in neural semantic parsers. Less

  • End-to-end Neural Coreference Resolution
    Kenton Lee, Luheng He, Mike Lewis, and Luke Zettlemoyer EMNLP 2017

    We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or handengineered mention detector. The key idea is to directly consider all spans in a document as potential mentions and learn distributions over possible antecedents for each. The model computes span embeddings that combine context-dependent boundary representations with a headfinding attention mechanism. It is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions. Experiments demonstrate state-of-the-art performance, with a gain of 1.5 F1 on the OntoNotes benchmark and by 3.1 F1 using a 5-model ensemble, despite the fact that this is the first approach to be successfully trained with no external resources. Less

  • AI zooms in on highly influential citations
    Oren Etzioni Nature 2017

    The number of times a paper is cited is a poor proxy for its impact (see P. Stephan et al. Nature 544, 411–412; 2017). I suggest relying instead on a new metric that uses artificial intelligence (AI) to capture the subset of an author's or a paper's essential and therefore most highly influential citations. Academics may cite papers for non-essential reasons — out of courtesy, for completeness or to promote their own publications. These superfluous citations can impede literature searches and exaggerate a paper's importance. Less

  • End-to-End Neural Ad-hoc Ranking with Kernel Pooling
    Chenyan Xiong, Zhuyun Dai, Jamie Callan, Zhiyuan Liu, and Russell Power SIGIR 2017

    This paper proposes K-NRM, a kernel based neural model for document ranking. Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique that uses kernels to extract multi-level soft match features, and a learning-to-rank layer that combines those features into the final ranking score. The whole model is trained end-to-end. The ranking layer learns desired feature patterns from the pairwise ranking loss. The kernels transfer the feature patterns into soft-match targets at each similarity level and enforce them on the translation matrix. The word embeddings are tuned accordingly so that they can produce the desired soft matches. Experiments on a commercial search engine's query log demonstrate the improvements of K-NRM over prior feature-based and neural-based states-of-the-art, and explain the source of K-NRM's advantage: Its kernel-guided embedding encodes a similarity metric tailored for matching query words to document words, and provides effective multi-level soft matches. Less

  • Tell Me Why: Using Question Answering as Distant Supervision for Answer Justification
    Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Marco A. Valenzuela-Escárcega, Peter Clark, and Michael Hammond CoNNL 2017

    For many applications of question answering (QA), being able to explain why a given model chose an answer is critical. However, the lack of labeled data for answer justifications makes learning this difficult and expensive. Here we propose an approach that uses answer ranking as distant supervision for learning how to select informative justifications, where justifications serve as inferential connections between the question and the correct answer while often containing little lexical overlap with either. We propose a neural network architecture for QA that reranks answer justifications as an intermediate (and human-interpretable) step in answer selection. Our approach is informed by a set of features designed to combine both learned representations and explicit features to capture the connection between questions, answers, and answer justifications. We show that with this end-to-end approach we are able to significantly improve upon a strong IR baseline in both justification ranking (+9% rated highly relevant) and answer selection (+6% P@1). Less