- 4998 facts and 12147 constraints to test a model's consistency (Aristo • 2021): Dataset of 4998 simple facts and 12147 constraints to test, and improve, a model's accuracy and consistency.
- 2k multi-step entailment trees, explaining the answers to ARC science questions (Aristo • 2021)
- 2,780 implicit multi-hop reasoning questions (AI2 Israel, Question Understanding, Aristo • 2021): StrategyQA is a question-answering benchmark focusing on open-domain questions where the required reasoning steps are implicit in the question and should be inferred using a strategy. StrategyQA includes 2,780 examples, each consisting of a strategy question, its decomposition, and evidence paragraphs.
- Updated RuleTaker datasets with 500k questions, answers and proofs over rulebases (Aristo • 2020): These datasets accompany the paper "ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language". They contain updated RuleTaker-style datasets with 500k questions, answers and proofs over natural-language rulebases, used to show that Transformers can emulate reasoning over rules expressed in language, including proof generation. The datasets include variants using closed- and open-world semantics. Proofs include intermediate conclusions. Extra annotations provide data to train the iterative ProofWriter model, as well as data for abductive reasoning to make uncertain statements certain.
- Datasets used to teach transformers to reason (Aristo • 2020): Can transformers be trained to reason (or emulate reasoning) over rules expressed in language? In the associated paper and demo we provide evidence that they can. Our models, which we call RuleTakers, are trained on datasets of synthetic rule bases plus derived conclusions, provided here. The resulting models provide the first demonstration that this kind of soft reasoning over language is indeed learnable.
- 33K state changes over 4,050 sentences from 810 procedural, real-world paragraphs (Aristo, Mosaic • 2020): Open PI is the first dataset for tracking state changes in procedural text from arbitrary domains by using an unrestricted (open) vocabulary. Our solution is a new task formulation in which just the text is provided, from which a set of state changes (entity, attribute, before, after) is generated for each step, where the entity, attribute, and values must all be predicted from an open vocabulary.
- 98k annotated explanations for the QASC dataset (Aristo • 2020): This dataset contains 98k 2-hop explanations for questions in the QASC dataset, with annotations indicating if they are valid (~25k) or invalid (~73k) explanations.
- A high-quality KB of hasPart relations (Aristo • 2020): A high-quality knowledge base of ~50k hasPart relationships, extracted from a large corpus of generic statements.
- A large knowledge base of generic sentences (Aristo • 2020): The GenericsKB contains 3.4M+ generic sentences about the world, i.e., sentences expressing general truths such as "Dogs bark," and "Trees remove carbon dioxide from the atmosphere." Generics are potentially useful as a knowledge source for AI systems requiring general world knowledge. The GenericsKB is the first large-scale resource containing naturally occurring generic sentences (as opposed to extracted or crowdsourced triples), and is rich in high-quality, general, semantically complete statements. Generics were primarily extracted from three large text sources, namely the Waterloo Corpus, selected parts of Simple Wikipedia, and the ARC Corpus. A filtered, high-quality subset is also available in GenericsKB-Best, containing 1,020,868 sentences. We recommend you start with GenericsKB-Best.
- A dataset of 2,985 grade-school level, direct-answer science questions derived from the ARC multiple-choice question set (Aristo • 2020): A dataset of 2,985 grade-school level, direct-answer ("open response", "free form") science questions derived from the ARC multiple-choice question set released as part of the AI2 Reasoning Challenge in 2018.