Datasets

  • The Fermi Challenge

    A challenge dataset of Fermi (estimation) problems, currently beyond the capabilities of modern methods.
    Aristo • 2021
  • Qasper

    Question Answering on Research Papers
    AllenNLP, Semantic Scholar • 2021
    A dataset containing 1585 papers with 5049 information-seeking questions asked by regular readers of NLP papers, and answered by a separate set of NLP practitioners.
  • BeliefBank

    4998 facts and 12147 constraints to test a model's consistency
    Aristo • 2021
    Dataset of 4998 simple facts and 12147 constraints to test, and improve, a model's accuracy and consistency.
  • EntailmentBank

    2k multi-step entailment trees, explaining the answers to ARC science questions
    Aristo • 2021
  • ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning

    An atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge.
    Mosaic • 2021
    We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge. Compared to existing resources that center around taxonomic knowledge, ATOMIC focuses on inferential knowledge organized as…
  • ATOMIC 2020

    An atlas of everyday commonsense reasoning, organized through 1.33M textual descriptions of inferential knowledge.
    Mosaic • 2021
    We present ATOMIC 2020, a commonsense knowledge graph with 1.33M everyday inferential knowledge tuples about entities and events. ATOMIC 2020 represents a large-scale common sense repository of textual descriptions that encode both the social and the physical…
  • Rainbow: A Commonsense Reasoning Benchmark

    A commonsense reasoning benchmark spanning social and physical common sense
    Mosaic • 2021
    Rainbow is a universal commonsense reasoning benchmark spanning both social and physical common sense. Rainbow brings together 6 existing commonsense reasoning tasks: aNLI, Cosmos QA, HellaSWAG, Physical IQa, Social IQa, and WinoGrande. Modelers are…
  • Scruples: Subreddit Corpus Requiring Understanding Principles in Life-like Ethical Situations

    A corpus and benchmark for predicting communities' ethical judgments on real-life anecdotes
    Mosaic • 2021
    Scruples is a corpus and benchmark for studying descriptive machine ethics, or machines' ability to understand people's ethical judgments. Scruples offers two datasets: the Anecdotes and the Dilemmas. The Anecdotes collect real-life experiences with ethical…
  • StrategyQA

    2,780 implicit multi-hop reasoning questions
    AI2 Israel, Question Understanding, Aristo • 2021
    StrategyQA is a question-answering benchmark focusing on open-domain questions where the required reasoning steps are implicit in the question and should be inferred using a strategy. StrategyQA includes 2,780 examples, each consisting of a strategy question…
  • ProofWriter

    Updated RuleTaker datasets with 500k questions, answers and proofs over rulebases.
    Aristo • 2020
    These datasets accompany the paper "ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language". They contain updated RuleTaker-style datasets with 500k questions, answers and proofs over natural-language rulebases, used to…