Datasets

  • Science Terms and Sentences

    9,356 science terms and sentences
    Aristo • 2017
    The dataset contains 9,356 science terms and, for each term, an average of 16,000 sentences that contain the term.
  • Aristo Tuple KB

    294,000 science-relevant tuples
    Aristo • 2017
    The Aristo Tuple KB contains 294,000 high-precision, domain-targeted (subject, relation, object) tuples extracted from text using a high-precision extraction pipeline, and guided by domain vocabulary constraints.
  • Aristo Mini Corpus

    1,197,377 science-relevant sentences
    Aristo • 2016
    The Aristo Mini corpus contains 1,197,377 (very loosely) science-relevant sentences drawn from public data. It provides simple science-relevant text that may be useful to help answer elementary science questions.
  • Foodwebs

    5000 questions about 500 food web diagrams.
    Aristo • 2016
    The foodwebs dataset contains 5000 questions about 500 food web diagrams. Each diagram has annotations from a computer vision system and each question is annotated with a logical form.
  • Explanations for Science Questions

    1,363 gold explanation sentences
    Aristo • 2016
    This dataset contains gold explanation sentences supporting 363 science questions, relation annotation for a subset of those explanations, and a graphical annotation tool with annotation guidelines.
  • Perturbed Version of AI2 Elementary School Science Questions (No Diagrams)

    1,080 questions
    Aristo • 2016
    These questions were created from the "AI2 Elementary School Science Questions (No Diagrams)" data set by replacing all of the incorrect answer options of each question with other related words. This dataset can be a good measure of robustness for QA…
  • Foodchains

    774 food chain questions designed to imitate actual questions from the New York State Grade 4 Regents Exam.
    Aristo • 2016
  • AI2 TabMCQ: Multiple Choice Questions aligned with the Aristo Tablestore

    9092 crowd-sourced science questions and 68 tables of curated facts
    Aristo • 2016
    This dataset contains a copy of the Aristo Tablestore (Nov. 2015 Snapshot), plus a large set of crowd-sourced multiple-choice questions covering the facts in the tables. Through the setup of the crowd-sourced annotation task, the package also contains…
  • AI2 Tablestore (November 2015 Snapshot)

    68 tables of curated facts
    Aristo • 2015
    This dataset contains a collection of curated facts in the form of tables used by the Aristo Question-Answering System, collected using a mixture of manual and semi-automated techniques.
  • AI2 Conversational Dialog Traces

    81 dialog traces and extractions
    Aristo • 2015
    This dataset contains files for the paper "Learning knowledge graphs for question answering through conversational dialog", presented at the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language…