Viewing 41-50 of 81 datasets
  • Social IQA

    37,000 QA pairs for evaluating models’ abilities to reason about the social implications of everyday events and situations
    Mosaic • 2019
    We introduce Social IQa: Social Interaction QA, a new question-answering benchmark for testing social commonsense intelligence. Contrary to many prior benchmarks that focus on physical or taxonomic knowledge, Social IQa focuses on reasoning about people’s…
  • QuaRTz Dataset

    3864 questions about open domain qualitative relationships
    Aristo • 2019
    QuaRTz is a crowdsourced dataset of 3864 multiple-choice questions about open domain qualitative relationships. Each question is paired with one of 405 different background sentences (sometimes short paragraphs).
  • ARC Question Classification Dataset

    7,787 multiple-choice questions annotated with question classification labels
    Aristo • 2019
    A dataset of detailed problem domain classification labels for each of the 7,787 multiple-choice science questions found in the AI2 Reasoning Challenge (ARC) dataset, to enable targeted pairing of questions with problem-specific solvers. Also included is a…
  • What-If Question Answering

    Large-scale dataset of 39705 "What if..." questions over procedural text
    Aristo • 2019
    The WIQA dataset V1 has 39705 questions containing a perturbation and a possible effect in the context of a paragraph. The dataset is split into 29808 train questions, 6894 dev questions and 3003 test questions.
  • CommonsenseQA

    12,102 multiple-choice questions with one correct answer and four distractor answers
    AI2 Israel, Question Understanding • 2019
    CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions with one correct answer and four distractor answers.
  • Discrete Reasoning Over the content of Paragraphs (DROP)

    The DROP dataset contains 96k question-answer pairs (QAs) over 6.7K paragraphs, split between train (77k QAs), development (9.5k QAs) and a hidden test partition (9.5k QAs).
    AllenNLP • 2019
    A lot of diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple paraphrase matching and entity typing to entity tracking and understanding the implications of the context. Given…
  • HellaSwag

    HellaSWAG is a dataset for studying grounded commonsense inference.
    Mosaic • 2019
    HellaSWAG is a dataset for studying grounded commonsense inference. It consists of 70k multiple choice questions about grounded situations: each question comes from one of two domains -- activitynet or wikihow -- with four answer choices about what might…
  • SciCite: Citation intent classification dataset

    A large dataset of citation intent classification based on citation text
    Semantic Scholar • 2019
    Citations play a unique role in scientific discourse and are crucial for understanding and analyzing scientific work. However, not all citations are equal. Some citations refer to use of a method from another work, some discuss results or findings of other…
  • QuaRel Dataset

    2771 story questions about qualitative relationships
    Aristo • 2018
    QuaRel is a crowdsourced dataset of 2771 multiple-choice story questions, including their logical forms.
  • ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning

    An atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge.
    Mosaic • 2018
    We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge. Compared to existing resources that center around taxonomic knowledge, ATOMIC focuses on inferential knowledge organized as…
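
The split sizes quoted in the WIQA and DROP entries above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the dictionary names and the `split_fractions` helper are hypothetical, the numbers are copied from the listing, and the DROP figures are the rounded thousands given there rather than exact counts.

```python
# Split sizes as quoted in the listing (not re-derived from the data files).
wiqa_splits = {"train": 29808, "dev": 6894, "test": 3003}
wiqa_total = 39705

# DROP sizes are rounded, in thousands of QA pairs, as stated in the listing.
drop_splits_k = {"train": 77.0, "dev": 9.5, "test": 9.5}
drop_total_k = 96.0


def split_fractions(splits):
    """Return each split's share of the total (hypothetical helper)."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


# The quoted splits should add up to the quoted totals.
assert sum(wiqa_splits.values()) == wiqa_total
assert sum(drop_splits_k.values()) == drop_total_k

for name, frac in split_fractions(wiqa_splits).items():
    print(f"WIQA {name}: {frac:.1%}")
```

Both totals agree with the listing: the three WIQA splits sum to 39705 questions, and the rounded DROP splits sum to 96k QA pairs.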