Viewing 6 datasets from 2018
    • 5,957 multiple-choice questions probing a book of 1,326 science facts

      OpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension.

    • 488 richly annotated paragraphs about processes (containing 3,300 sentences)

      The ProPara dataset is designed to train and test comprehension of simple paragraphs describing processes (e.g., photosynthesis). It targets the task of predicting, tracking, and answering questions about how entities change during the process.

    • Over 39 million published research papers in Computer Science, Neuroscience, and Biomedicine

      This is a subset of the full Semantic Scholar corpus which represents papers crawled from the Web and subjected to a number of filters.

    • Over 14K paper drafts and over 10K textual peer reviews

      PeerRead is a dataset of scientific peer reviews available to help researchers study this important artifact.

    • 7,787 multiple-choice science questions and associated corpora

      ARC is a new dataset of 7,787 genuine grade-school-level, multiple-choice science questions, assembled to encourage research in advanced question-answering. The dataset is partitioned into a Challenge Set and an Easy Set, where the former contains only questions answered incorrectly by both a retrieval-based algorithm and a word co-occurrence algorithm. We also include a corpus of over 14 million science sentences relevant to the task, and an implementation of three neural baseline models for this dataset. We pose ARC as a challenge to the community.

    • Explanation graphs for 1,680 questions

      A collection of resources for studying explanation-centered inference, including explanation graphs for 1,680 questions, with 4,950 tablestore rows, and other analyses of the knowledge required to answer elementary and middle-school science questions. ExplanationBank was constructed by Peter Jansen (University of Arizona), in collaboration with AI2.
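
The Challenge/Easy partition described for ARC above can be sketched in a few lines. This is only an illustration of the stated rule (a question goes to the Challenge Set when both baseline solvers get it wrong), not the code used to build the dataset. The `Question` type, the `partition` function, and the two solver callables are hypothetical names introduced here, and the real retrieval-based and word co-occurrence solvers are not reproduced.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Question:
    """Hypothetical record for one multiple-choice question."""
    qid: str
    stem: str
    choices: Dict[str, str]  # answer label -> answer text
    answer: str              # gold answer label


# A solver maps a question to the answer label it predicts.
Solver = Callable[[Question], str]


def partition(
    questions: Iterable[Question],
    retrieval_solver: Solver,
    cooccurrence_solver: Solver,
) -> Tuple[List[Question], List[Question]]:
    """Split questions into (challenge, easy).

    A question is placed in the challenge list only if BOTH solvers
    answer it incorrectly; every other question goes to the easy list.
    """
    challenge: List[Question] = []
    easy: List[Question] = []
    for q in questions:
        retrieval_wrong = retrieval_solver(q) != q.answer
        cooccurrence_wrong = cooccurrence_solver(q) != q.answer
        if retrieval_wrong and cooccurrence_wrong:
            challenge.append(q)
        else:
            easy.append(q)
    return challenge, easy
```

Any pair of functions that take a `Question` and return an answer label can stand in for the two baselines, which keeps the partition rule separate from how each solver works.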