Datasets

  • Science Terms and Sentences

    9,356 science terms and sentences
    Aristo • 2017
    The dataset contains 9,356 science terms and, for each term, an average of 16,000 sentences that contain the term.
  • Textbook Question Answering (TQA)

    1,076 textbook lessons, 26,260 questions, 6,229 images
    PRIOR • 2017
    The Textbook Question Answering (TQA) dataset is drawn from middle school science curricula. It consists of 1,076 lessons from Life Science, Earth Science and Physical Science textbooks. It contains 26,260 questions, including 12,567 that have an accompanying…
  • Explicit Semantic Ranking Dataset

    March 2017
    Semantic Scholar • 2017
    This is the dataset for the paper "Explicit Semantic Ranking for Academic Search via Knowledge Graph Embedding". It includes the query log used in the paper, relevance judgements for the queries, ranking lists from Semantic Scholar, candidate documents, entity…
  • Aristo Tuple KB

    294,000 science-relevant tuples
    Aristo • 2017
    The Aristo Tuple KB contains 294,000 high-precision, domain-targeted (subject, relation, object) tuples extracted from text using a high-precision extraction pipeline, and guided by domain vocabulary constraints.
  • Aristo Mini Corpus

    1,197,377 science-relevant sentences
    Aristo • 2016
    The Aristo Mini corpus contains 1,197,377 (very loosely) science-relevant sentences drawn from public data. It provides simple science-relevant text that may be useful to help answer elementary science questions.
  • Open IE Demo Dataset

    A dataset of Open IE extractions over ClueWeb.
    Oren • 2016
    The dataset that powered the Open IE demo (formerly at openie.allenai.org). It includes ReVerb extractions over a billion sentences from the ClueWeb dataset. The full sentences were removed to comply with the ClueWeb license.
  • Foodwebs

    5,000 questions about 500 food web diagrams.
    Aristo • 2016
    The foodwebs dataset contains 5,000 questions about 500 food web diagrams. Each diagram has annotations from a computer vision system and each question is annotated with a logical form.
  • Explanations for Science Questions

    1,363 gold explanation sentences
    Aristo • 2016
    This dataset contains gold explanation sentences supporting 363 science questions, relation annotation for a subset of those explanations, and a graphical annotation tool with annotation guidelines.
  • AI2 Diagram Dataset (AI2D)

    4,817 illustrative diagrams for research on diagram understanding and associated question answering.
    PRIOR • 2016
    AI2D is a dataset of illustrative diagrams for research on diagram understanding and associated question answering.
  • Perturbed Version of AI2 Elementary School Science Questions (No Diagrams)

    1,080 questions
    Aristo • 2016
    These questions were created from the "AI2 Elementary School Science Questions (No Diagrams)" dataset by replacing every incorrect answer option of each question with another related word. This dataset can be a good measure of robustness for QA…