Viewing 95 papers in Aristo
    • EMNLP 2019
      Tushar Khot, Ashish Sabharwal, Peter Clark
      Multi-hop textual question answering requires combining information from multiple sentences. We focus on a natural setting where, unlike typical reading comprehension, only partial information is provided with each question. The model must retrieve and use additional knowledge to correctly answer…  (More)
    • EMNLP 2019
      Niket Tandon, Bhavana Dalvi Mishra, Keisuke Sakaguchi, Antoine Bosselut, Peter Clark
      We introduce WIQA, the first large-scale dataset of "What if..." questions over procedural text. WIQA contains three parts: a collection of paragraphs each describing a process, e.g., beach erosion; a set of crowdsourced influence graphs for each paragraph, describing how one change affects another…  (More)
    • EMNLP 2019
      Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
      As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in context of the document. Recent successful models for this task have used hierarchical models to…  (More)
    • EMNLP 2019
      Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark
      Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others. Our approach builds on a prior process comprehension framework for predicting actions' effects, to also identify…  (More)
    • EMNLP 2019
      Ben Zhou, Daniel Khashabi, Qiang Ning, Dan Roth
      Understanding time is crucial for understanding events expressed in natural language. Because people rarely say the obvious, it is often necessary to have commonsense knowledge about various temporal aspects of events, such as duration, frequency, and temporal order. However, this important problem…  (More)
    • EMNLP 2019
      Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark
      We introduce the first open-domain dataset, called QuaRTz, for reasoning about textual qualitative relationships. QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF protects the skin longer.", twinned with 3864 crowdsourced situated questions, e.g., "Billy is…  (More)
    • arXiv 2019
      Kyle Richardson, Hai Na Hu, Lawrence S. Moss, Ashish Sabharwal
      Do state-of-the-art models for language understanding already have, or can they easily learn, abilities such as boolean coordination, quantification, conditionals, comparatives, and monotonicity reasoning (i.e., reasoning about word substitutions in sentential contexts)? While such phenomena are…  (More)
    • arXiv 2019
      Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz
      AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy!, but the rich variety of standardized exams has remained a landmark challenge. Even in 2016, the best AI system achieved merely 59.3% on an 8th Grade science exam challenge (Schoenick et al., 2016). This…  (More)
    • arXiv 2019
      Dongfang Xu, Peter Jansen, Jaycie Martin, Zhengnan Xie, Vikas Yadav, Harish Tayyar Madabushi, Oyvind Tafjord, Peter Clark
      Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately. However, developing strong QC algorithms has been hindered by the limited size and complexity of annotated data available. To address this, we present the…  (More)
    • ACL 2019
      Souvik Kundu, Tushar Khot, Ashish Sabharwal, Peter Clark
      We propose a novel, path-based reasoning approach for the multi-hop reading comprehension task where a system needs to combine facts from multiple passages to answer a question. Although inspired by multi-hop reasoning over knowledge graphs, our proposed approach operates directly over unstructured…  (More)
    • NAACL-HLT 2019
      Xinya Du, Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark, Claire Cardie
      Our goal is procedural text comprehension, namely tracking how the properties of entities (e.g., their location) change with time given a procedural text (e.g., a paragraph about photosynthesis, a recipe). This task is challenging as the world is changing throughout the text, and despite recent…  (More)
    • NAACL 2019
      Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
      Question Answering (QA) naturally reduces to an entailment problem, namely, verifying whether some text entails the answer to a question. However, for multi-hop QA tasks, which require reasoning with multiple sentences, it remains unclear how best to utilize entailment models pre-trained on large…  (More)
    • AAAI 2019
      Arindam Mitra, Peter Clark, Oyvind Tafjord, Chitta Baral
      While in recent years machine learning (ML) based approaches have been the popular approach in developing end-to-end question answering systems, such systems often struggle when additional knowledge is needed to correctly answer the questions. Proposed alternatives involve translating the question…  (More)
    • AAAI 2019
      Oyvind Tafjord, Peter Clark, Matt Gardner, Wen-tau Yih, Ashish Sabharwal
      Many natural language questions require recognizing and reasoning with qualitative relationships (e.g., in science, economics, and medicine), but are challenging to answer with corpus-based methods. Qualitative modeling provides tools that support such reasoning, but the semantic parsing task of…  (More)
    • arXiv 2019
      Daniel Khashabi, Erfan Sadeqi Azer, Tushar Khot, Ashish Sabharwal, Dan Roth
      Recent systems for natural language understanding are strong at overcoming linguistic variability for lookup style reasoning. Yet, their accuracy drops dramatically as the number of reasoning steps increases. We present the first formal framework to study such empirical observations, addressing the…  (More)
    • NeurIPS 2018
      Yexiang Xue, Yang Yuan, Zhitian Xu, Ashish Sabharwal
      Neural models operating over structured spaces such as knowledge graphs require a continuous embedding of the discrete elements of this space (such as entities) as well as the relationships between them. Relational embeddings with high expressivity, however, have high model complexity, making them…  (More)
    • EMNLP 2018
      Todor Mihaylov, Peter Clark, Tushar Khot, Ashish Sabharwal
      We present a new kind of question answering dataset, OpenBookQA, modeled after open book exams for assessing human understanding of a subject. The open book that comes with our questions is a set of 1329 elementary level science facts. Roughly 6000 questions probe an understanding of these facts…  (More)
    • EMNLP 2018
      Niket Tandon, Bhavana Dalvi Mishra, Joel Grus, Wen-tau Yih, Antoine Bosselut, Peter Clark
      Comprehending procedural text, e.g., a paragraph describing photosynthesis, requires modeling actions and the state changes they produce, so that questions about entities at different timepoints can be answered. Although several recent systems have shown impressive progress in this task, their…  (More)
    • EMNLP 2018
      Dongyeop Kang, Tushar Khot, Ashish Sabharwal and Peter Clark
      Most textual entailment models focus on lexical gaps between the premise text and the hypothesis, but rarely on knowledge gaps. We focus on filling these knowledge gaps in the Science Entailment task, by leveraging an external structured knowledge base (KB) of science facts. Our new architecture…  (More)
    • UAI 2018
      Ashish Sabharwal, Yexiang Xue
      We propose a new algorithm for computing a constant-factor approximation of precision-recall (PR) curves for massive noisy datasets produced by generative models. Assessing validity of items in such datasets requires human annotation, which is costly and must be minimized. Our algorithm, AdaStrat…  (More)
    • ACL 2018
      Tushar Khot, Ashish Sabharwal and Dongyeop Kang
      We consider the problem of learning textual entailment models with limited supervision (5K-10K training examples), and present two complementary approaches for it. First, we propose knowledge-guided adversarial example generators for incorporating large lexical resources in entailment models via…  (More)
    • NAACL 2018
      Bhavana Dalvi, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark
      We present a new dataset and models for comprehending paragraphs about processes (e.g., photosynthesis), an important genre of text describing a dynamic world. The new dataset, ProPara, is the first to contain natural (rather than machine-generated) text about a changing world along with a full…  (More)
    • arXiv 2018
      Peter Clark, Bhavana Dalvi, Niket Tandon
      Our goal is to answer questions about paragraphs describing processes (e.g., photosynthesis). Texts of this genre are challenging because the effects of actions are often implicit (unstated), requiring background knowledge and inference to reason about the changing world states. To supply this…  (More)
    • NAACL 2018
      Po-Sen Huang, Chenglong Wang, Rishabh Singh, Wen-tau Yih, Xiaodong He
      In conventional supervised training, a model is trained to fit all the training examples. However, having a monolithic model may not always be the best strategy, as examples could vary widely. In this work, we explore a different learning protocol that treats each example as a unique pseudo-task…  (More)
    • NAACL 2018
      Asli Celikyilmaz, Antoine Bosselut, Xiaodong He and Yejin Choi
      We present deep communicating agents in an encoder-decoder architecture to address the challenges of representing a long document for abstractive summarization. With deep communicating agents, the task of encoding a long text is divided across multiple collaborating agents, each in charge of a…  (More)
    • NAACL 2018
      Antoine Bosselut, Asli Celikyilmaz, Xiaodong He, Jianfeng Gao, Po-Sen Huang and Yejin Choi
      In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text. In particular, we propose to learn neural rewards to model cross-sentence ordering as a means to approximate desired discourse structure. Empirical results…  (More)
    • WSDM 2018
      Sreyasi Nag Chowdhury, Niket Tandon, Hakan Ferhatosmanoglu, Gerhard Weikum
      The social media explosion has populated the Internet with a wealth of images. There are two existing paradigms for image retrieval: 1) content-based image retrieval (CBIR), which has traditionally used visual features for similarity search (e.g., SIFT features), and 2) tag-based image retrieval…  (More)
    • TACL 2018
      Hanie Sedghi and Ashish Sabharwal
      Given a knowledge base or KB containing (noisy) facts about common nouns or generics, such as "all trees produce oxygen" or "some animals live in forests", we consider the problem of inferring additional such facts at a precision similar to that of the starting KB. Such KBs capture general…  (More)
    • arXiv 2018
      Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord
      We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering. Together, these constitute the AI2 Reasoning Challenge (ARC), which requires far more powerful knowledge and reasoning than previous challenges such as SQuAD or SNLI. The ARC…  (More)
    • AAAI 2018
      Tushar Khot, Ashish Sabharwal, and Peter Clark
      We present a new dataset and model for textual entailment, derived from treating multiple-choice question-answering as an entailment problem. SCITAIL is the first entailment set that is created solely from natural sentences that already exist independently "in the wild" rather than sentences…  (More)
    • AAAI 2018
      Daniel Khashabi, Tushar Khot, Ashish Sabharwal, and Dan Roth
      We propose a novel method for exploiting the semantic structure of text to answer multiple-choice questions. The approach is especially suitable for domains that require reasoning over a diverse set of linguistic constructs but have limited training data. To address these challenges, we present the…  (More)
    • AAAI 2018
      Jonathan Kuck, Ashish Sabharwal, and Stefano Ermon
      Rademacher complexity is often used to characterize the learnability of a hypothesis class and is known to be related to the class size. We leverage this observation and introduce a new technique for estimating the size of an arbitrary weighted set, defined as the sum of weights of all elements in…  (More)
    • SIGMOD Record 2017
      Niket Tandon, Aparna S. Varde, Gerard de Melo
      There is growing conviction that the future of computing depends on our ability to exploit big data on the Web to enhance intelligent systems. This includes encyclopedic knowledge for factual details, common sense for human-like reasoning and natural language generation for smarter communication…  (More)
    • Best Paper Award
      EMNLP 2017
      Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordóñez, Kai-Wei Chang
      Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. Structured prediction models are used in these tasks to take advantage of correlations between co-occurring labels and visual input but risk inadvertently encoding…  (More)
    • ACL 2017
      Tushar Khot, Ashish Sabharwal, and Peter Clark
      While there has been substantial progress in factoid question-answering (QA), answering complex questions remains challenging, typically requiring both a large body of knowledge and inference techniques. Open Information Extraction (Open IE) provides a way to generate semi-structured knowledge for…  (More)
    • ACL 2017
      Niket Tandon, Gerard de Melo, and Gerhard Weikum
      Despite important progress in the area of intelligent systems, most such systems still lack commonsense knowledge that appears crucial for enabling smarter, more human-like decisions. In this paper, we present a system based on a series of algorithms to distill fine-grained disambiguated…  (More)
    • ACL 2017
      Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, and Luke Zettlemoyer
      We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. To achieve this, we adapt neural sequence models to map utterances directly to SQL with its…  (More)
    • WWW 2017
      Cuong Xuan Chu, Niket Tandon, and Gerhard Weikum
      Knowledge graphs have become a fundamental asset for search engines. A fair amount of user queries seek information on problem-solving tasks such as building a fence or repairing a bicycle. However, knowledge graphs completely lack this kind of how-to knowledge. This paper presents a method for…  (More)
    • TACL 2017
      Bhavana Dalvi, Niket Tandon, and Peter Clark
      Our goal is to construct a domain-targeted, high precision knowledge base (KB), containing general (subject,predicate,object) statements about the world, in support of a downstream question-answering (QA) application. Despite recent advances in information extraction (IE) techniques, no suitable…  (More)
    • arXiv 2017 Slides
      Peter D. Turney
      While open-domain question answering (QA) systems have proven effective for answering simple questions, they struggle with more complex questions. Our goal is to answer more complex questions reliably, without incurring a significant cost in knowledge resource construction to support the QA. One…  (More)
    • UAI 2017
      Ashish Sabharwal and Hanie Sedghi
      Large scale machine learning produces massive datasets whose items are often associated with a confidence level and can thus be ranked. However, computing the precision of these resources requires human annotation, which is often prohibitively expensive and is therefore skipped. We consider the…  (More)
    • VAST 2017 Demo Video
      Nan-Chen Chen and Been Kim
      Developing sophisticated artificial intelligence (AI) systems requires AI researchers to experiment with different designs and analyze results from evaluations (we refer to this task as evaluation analysis). In this paper, we tackle the challenges of evaluation analysis in the domain of question…  (More)
    • EMNLP • Workshop on Noisy User-generated Text 2017
      Johannes Welbl, Nelson F. Liu, and Matt Gardner
      We present a novel method for obtaining high-quality, domain-targeted multiple choice questions from crowd workers. Generating these questions can be difficult without trading away originality, relevance or diversity in the answer options. Our method addresses these problems by leveraging a large…  (More)
    • CoNLL 2017
      Daniel Khashabi, Tushar Khot, Ashish Sabharwal, and Dan Roth
      Question answering (QA) systems are easily distracted by irrelevant or redundant words in questions, especially when faced with long or multi-sentence questions in difficult domains. This paper introduces and studies the notion of essential question terms with the goal of improving such QA solvers…  (More)
    • CoNLL 2017
      Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Marco A. Valenzuela-Escárcega, Peter Clark, and Michael Hammond
      For many applications of question answering (QA), being able to explain why a given model chose an answer is critical. However, the lack of labeled data for answer justifications makes learning this difficult and expensive. Here we propose an approach that uses answer ranking as distant supervision…  (More)
    • CoNLL 2017
      Ivan Vulic, Roy Schwartz, Ari Rappoport, Roi Reichart, and Anna Korhonen
      This paper is concerned with identifying contexts useful for training word representation models for different word classes such as adjectives (A), verbs (V), and nouns (N). We introduce a simple yet effective framework for an automatic selection of class-specific context configurations. We…  (More)
    • CoNLL 2017
      Roy Schwartz, Maarten Sap, Ioannis Konstas, Leila Zilles, Yejin Choi, Noah A. Smith
      A writer’s style depends not just on personal traits but also on her intent and mental state. In this paper, we show how variants of the same writing task can lead to measurable differences in writing style. We present a case study based on the story cloze task (Mostafazadeh et al., 2016a), where…  (More)
    • EMNLP 2017
      Kenton Lee, Luheng He, Mike Lewis, and Luke Zettlemoyer
      We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector. The key idea is to directly consider all spans in a document as potential mentions and learn distributions…  (More)
    • EMNLP 2017
      Jayant Krishnamurthy, Pradeep Dasigi, and Matt Gardner
      We present a new semantic parsing model for answering compositional questions on semi-structured Wikipedia tables. Our parser is an encoder-decoder neural network with two key technical innovations: (1) a grammar for the decoder that only generates well-typed logical forms; and (2) an entity…  (More)
    • AAAI 2017
      Matt Gardner and Jayant Krishnamurthy
      Traditional semantic parsers map language onto compositional, executable queries in a fixed schema. This mapping allows them to effectively leverage the information contained in large, formal knowledge bases (KBs, e.g., Freebase) to answer questions, but it is also fundamentally limiting…  (More)
    • NIPS • NAMPI Workshop 2016
      Kenton W. Murray and Jayant Krishnamurthy
      We present probabilistic neural programs, a framework for program induction that permits flexible specification of both a computational model and inference algorithm while simultaneously enabling the use of deep neural networks. Probabilistic neural programs combine a computation graph for…  (More)
    • ACL 2016
      Sujay Kumar Jauhar, Peter D. Turney, Eduard Hovy
      Question answering requires access to a knowledge base to check facts and reason about information. Knowledge in the form of natural language text is easy to acquire, but difficult for automated reasoning. Highly-structured knowledge bases can facilitate reasoning, but are difficult to acquire. In…  (More)
    • AAAI 2016
      Amos Azaria, Jayant Krishnamurthy, and Tom M. Mitchell
      Unlike traditional machine learning methods, humans often learn from natural language instruction. As users become increasingly accustomed to interacting with mobile devices using speech, their interest in instructing these devices in natural language is likely to grow. We introduce our Learning by…  (More)
    • NAACL 2016
      Jayant Krishnamurthy
      We introduce several probabilistic models for learning the lexicon of a semantic parser. Lexicon learning is the first step of training a semantic parser for a new application domain and the quality of the learned lexicon significantly affects both the accuracy and efficiency of the final semantic…  (More)
    • IJCAI 2016 Code Demo
      Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Peter Clark, Oren Etzioni, and Dan Roth
      Answering science questions posed in natural language is an important AI challenge. Answering such questions often requires non-trivial inference and knowledge that goes beyond factoid retrieval. Yet, most systems for this task are based on relatively shallow Information Retrieval (IR) and…  (More)
    • AKBC 2016
      Bhavana Dalvi, Sumithra Bhakthavatsalam, Chris Clark, Peter Clark, Oren Etzioni, Anthony Fader, and Dirk Groeneveld
      Recent work on information extraction has suggested that fast, interactive tools can be highly effective; however, creating a usable system is challenging, and few publicly available tools exist. In this paper we present IKE, a new extraction tool that performs fast, interactive bootstrapping to…  (More)
    • CACM 2016 Video
      Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, and Oren Etzioni
      The field of Artificial Intelligence has made great strides forward recently, for example AlphaGo's recent victory against the world champion Lee Sedol in the game of Go, leading to great optimism about the field. But are we really moving towards smarter machines, or are these successes restricted…  (More)
    • *SEM 2016
      Saif M. Mohammad, Ekaterina Shutova, and Peter D. Turney
      It is generally believed that a metaphor tends to have a stronger emotional impact than a literal statement; however, there is no quantitative study establishing the extent to which this is true. Further, the mechanisms through which metaphors convey emotions are not well understood. We present the…  (More)
    • ICML 2016
      Tudor Achim, Ashish Sabharwal, and Stefano Ermon
      Random projections have played an important role in scaling up machine learning and data mining algorithms. Recently they have also been applied to probabilistic inference to estimate properties of high-dimensional distributions; however, they all rely on the same class of projections based on…  (More)
    • EMNLP 2016
      Samuel Louvan, Chetan Naik, Sadhana Kumaravel, Heeyoung Kwon, Niranjan Balasubramanian, and Peter Clark
      For AI systems to reason about real world situations, they need to recognize which processes are at play and which entities play key roles in them. Our goal is to extract this kind of role-based knowledge about processes, from multiple sentence-level descriptions. This knowledge is hard to acquire…  (More)
    • EMNLP 2016
      Rebecca Sharp, Mihai Surdeanu, Peter Jansen, and Peter Clark
      A common model for question answering (QA) is that a good answer is one that is closely related to the question, where relatedness is often determined using general-purpose lexical models such as word embeddings. We argue that a better approach is to look for answers that are related to the question…  (More)
    • EMNLP 2016
      Jayant Krishnamurthy, Oyvind Tafjord, and Aniruddha Kembhavi
      Situated question answering is the problem of answering questions about an environment such as an image or diagram. This problem requires jointly interpreting a question and an environment using background knowledge to select the correct answer. We present Parsing to Probabilistic Programs (P3), a…  (More)
    • COLING 2016
      Peter Jansen, Niranjan Balasubramanian, Mihai Surdeanu, and Peter Clark
      QA systems have been making steady advances in the challenging elementary science exam domain. In this work, we develop an explanation-based analysis of knowledge and inference requirements, which supports a fine-grained characterization of the challenges. In particular, we model the requirements…  (More)
    • NIPS 2016
      Been Kim, Sanmi Koyejo and Rajiv Khanna
      Example-based explanations are widely used in the effort to improve the interpretability of highly complex distributions. However, prototypes alone are rarely sufficient to represent the gist of the complexity. In order for users to construct better mental models and understand complex data…  (More)
    • NIPS 2016
      Shengjia Zhao, Enze Zhou, Ashish Sabharwal, and Stefano Ermon
      A key challenge in sequential decision problems is to determine how many samples are needed for an agent to make reliable decisions with good probabilistic guarantees. We introduce Hoeffding-like concentration inequalities that hold for a random, adaptively chosen number of samples. Our…  (More)
    • AI Magazine 2016
      Peter Clark and Oren Etzioni
      Given the well-known limitations of the Turing Test, there is a need for objective tests to both focus attention on, and measure progress towards, the goals of AI. In this paper we argue that machine performance on standardized tests should be a key component of any new measure of AI, because…  (More)
    • AAAI 2016
      Ashish Sabharwal, Horst Samulowitz, and Gerald Tesauro
      We study a novel machine learning (ML) problem setting of sequentially allocating small subsets of training data amongst a large set of classifiers. The goal is to select a classifier that will give near-optimal accuracy when trained on all data, while also minimizing the cost of misallocated…  (More)
    • AAAI 2016
      Carolyn Kim, Ashish Sabharwal, and Stefano Ermon
      We consider the problem of sampling from a discrete probability distribution specified by a graphical model. Exact samples can, in principle, be obtained by computing the mode of the original model perturbed with exponentially many i.i.d. random variables. We propose a novel algorithm that views…  (More)
    • AAAI 2016
      Shengjia Zhao, Sorathan Chaturapruek, Ashish Sabharwal, and Stefano Ermon
      Many recent algorithms for approximate model counting are based on a reduction to combinatorial searches over random subsets of the space defined by parity or XOR constraints. Long parity constraints (involving many variables) provide strong theoretical guarantees but are computationally difficult…  (More)
    • AAAI 2016
      Shuo Yang, Tushar Khot, Kristian Kersting, and Sriraam Natarajan
      Many real world applications in medicine, biology, communication networks, web mining, and economics, among others, involve modeling and learning structured stochastic processes that evolve over continuous time. Existing approaches, however, have focused on propositional domains only. Without…  (More)
    • AAAI 2016
      Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, and Peter Turney
      What capabilities are required for an AI system to pass standard 4th Grade Science Tests? Previous work has examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline…  (More)
    • WSDM 2016
      Bhavana Dalvi, Aditya Mishra, and William W. Cohen
      In an entity classification task, topic or concept hierarchies are often incomplete. Previous work by Dalvi et al. has shown that in non-hierarchical semi-supervised classification tasks, the presence of such unanticipated classes can cause semantic drift for seeded classes. The Exploratory…  (More)
    • Proceedings of IAAI 2015
      Peter Clark
      While there has been an explosion of impressive, data-driven AI applications in recent years, machines still largely lack a deeper understanding of the world to answer questions that go beyond information explicitly stated in text, and to explain and discuss those answers. To reach this next…  (More)
    • NAACL 2015
      Rebecca Sharp, Peter Jansen, Mihai Surdeanu, and Peter Clark
      Monolingual alignment models have been shown to boost the performance of question answering systems by "bridging the lexical chasm" between questions and answers. The main limitation of these approaches is that they require semistructured training data in the form of question-answer pairs, which is…  (More)
    • NAACL 2015
      Ben Hixon, Peter Clark, and Hannaneh Hajishirzi
      We describe how a question-answering system can learn about its domain from conversational dialogs. Our system learns to relate concepts in science questions to propositions in a fact corpus, stores new concepts and relations in a knowledge graph (KG), and uses the graph to solve questions. We are…  (More)
    • TACL 2015
      Daniel Fried, Peter Jansen, Gustave Hahn-Powell, Mihai Surdeanu, and Peter Clark
      Lexical semantic models provide robust performance for question answering, but, in general, can only capitalize on direct evidence seen during training. For example, monolingual alignment models acquire term alignment probabilities from semistructured data such as question-answer pairs; neural…  (More)
    • EMNLP 2015
      Tushar Khot, Niranjan Balasubramanian, Eric Gribkoff, Ashish Sabharwal, Peter Clark, and Oren Etzioni
      Elementary-level science exams pose significant knowledge acquisition and reasoning challenges for automatic question answering. We develop a system that reasons with knowledge derived from textbooks, represented in a subset of first-order logic. Automatic extraction, while scalable, often results…  (More)
    • EMNLP 2015
      Yang Li and Peter Clark
      Much of what we understand from text is not explicitly stated. Rather, the reader uses his/her knowledge to fill in gaps and create a coherent, mental picture or “scene” depicting what text appears to convey. The scene constitutes an understanding of the text, and can be used to answer questions…  (More)
    • K-CAP • First International Workshop on Capturing Scientific Knowledge (SciKnow) 2015
      Samuel Louvan, Chetan Naik, Veronica Lynn, Ankit Arun, Niranjan Balasubramanian, and Peter Clark
      We consider a 4th grade level question answering task. We focus on a subset involving recognizing instances of physical, biological, and other natural processes. Many processes involve similar entities and are hard to distinguish using simple bag-of-words representations alone.
    • CPAIOR 2015
      Brian Kell, Ashish Sabharwal, and Willem-Jan van Hoeve
      Nogood learning is a critical component of Boolean satisfiability (SAT) solvers, and increasingly popular in the context of integer programming and constraint programming. We present a generic method to learn valid clauses from exact or approximate binary decision diagrams (BDDs) and resolution in…  (More)
    • EACL 2014
      Yuen-Hsien Tseng, Lung-Hao Lee, Shu-Yen Lin, Bo-Shun Liao, Mei-Jun Liu, Hsin-Hsi Chen, Oren Etzioni, and Anthony Fader
      This study presents the Chinese Open Relation Extraction (CORE) system that is able to extract entity-relation triples from Chinese free texts based on a series of NLP techniques, i.e., word segmentation, POS tagging, syntactic parsing, and extraction rules. We employ the proposed CORE techniques…  (More)
    • ACL • Workshop on Semantic Parsing 2014
      Xuchen Yao, Jonathan Berant, and Benjamin Van Durme
      We contrast two seemingly distinct approaches to the task of question answering (QA) using Freebase: one based on information extraction techniques, the other on semantic parsing. Results over the same test-set were collected from two state-of-the-art, open-source systems, then analyzed in…  (More)
    • KDD 2014
      Anthony Fader, Luke Zettlemoyer, and Oren Etzioni
      We consider the problem of open-domain question answering (Open QA) over massive knowledge bases (KBs). Existing approaches use either manually curated KBs like Freebase or KBs automatically extracted from unstructured text. In this paper, we present OQA, the first approach to leverage both curated…  (More)
    • Best Paper Award
      EMNLP 2014
      Jonathan Berant, Vivek Srikumar, Pei-Chun Chen, Brad Huang, Christopher D. Manning, Abby Vander Linden, Brittany Harding, and Peter Clark
      Machine reading calls for programs that read and understand text, but most current work only attempts to extract facts from redundant web-scale corpora. In this paper, we focus on a new reading comprehension task that requires complex reasoning over a single document. The input is a paragraph…  (More)
    • Best Paper Award
      AKBC 2014
      Peter Clark, Niranjan Balasubramanian, Sumithra Bhakthavatsalam, Kevin Humphreys, Jesse Kinkead, Ashish Sabharwal, and Oyvind Tafjord
      While there has been tremendous progress in automatic database population in recent years, most of human knowledge does not naturally fit into a database form. For example, knowledge that "metal objects can conduct electricity" or "animals grow fur to help them stay warm" requires a substantially…  (More)
    • International Conference on Principles and Practice of Constraint Programming 2014
      Ashish Sabharwal and Horst Samulowitz
      Novel search space splitting techniques have recently been successfully exploited to parallelize Constraint Programming and Mixed Integer Programming solvers. We first show how universal hashing can be used to extend one such interesting approach to a generalized setting that goes beyond discrepancy…  (More)
    • ACL 2014
      Peter Jansen, Mihai Surdeanu, and Peter Clark
      We propose a robust answer reranking model for non-factoid questions that integrates lexical semantics with discourse information, driven by two representations of discourse: a shallow representation centered around discourse markers, and a deep one based on Rhetorical Structure Theory. We evaluate…  (More)
    • ACL 2013
      Xuchen Yao, Benjamin Van Durme, and Peter Clark
      Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned…  (More)
    • ACL 2013
      Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, and Peter Clark
      Fast alignment is essential for many natural language tasks. But in the setting of monolingual alignment, previous work has not been able to align more than one sentence pair per second. We describe a discriminatively trained monolingual word aligner that uses a Conditional Random Field to globally…  (More)
    • NAACL 2013
      Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, and Peter Clark
      Our goal is to extract answers from preretrieved sentences for Question Answering (QA). We construct a linear-chain Conditional Random Field based on pairs of questions and their possible answer sentences, learning the association between questions and answer types. This casts answer extraction as…  (More)
    • EMNLP 2013
      Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, and Peter Clark
      We introduce a novel discriminative model for phrase-based monolingual alignment using a semi-Markov CRF. Our model achieves state-of-the-art alignment accuracy on two phrase-based alignment datasets (RTE and paraphrase), while doing significantly better than other strong baselines in both non…  (More)
    • AKBC 2013
      Xiao Ling, Dan Weld, and Peter Clark
      Knowledge of objects and their parts, meronym relations, are at the heart of many question-answering systems, but manually encoding these facts is impractical. Past researchers have tried hand-written patterns, supervised learning, and bootstrapped methods, but achieving both high precision and…  (More)
    • EMNLP 2013
      Aju Thalappillil Scaria, Jonathan Berant, Mengqiu Wang, Christopher D. Manning, Justin Lewis, Brittany Harding, and Peter Clark
      Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as…  (More)
    • CIKM • AKBC 2013
      Peter Clark, Phil Harrison, and Niranjan Balasubramanian
      Our long-term interest is in machines that contain large amounts of general and scientific knowledge, stored in a "computable" form that supports reasoning and explanation. As a medium-term focus for this, our goal is to have the computer pass a fourth-grade science test, anticipating that much of…  (More)
    • NAACL-HLT • AKBC Workshop 2012
      Peter Clark, Phil Harrison, Niranjan Balasubramanian, and Oren Etzioni
      As part of our work on building a "knowledgeable textbook" about biology, we are developing a textual question-answering (QA) system that can answer certain classes of biology questions posed by users. In support of that, we are building a "textual KB" - an assembled set of semi-structured…  (More)