Allen Institute for AI

Datasets

Viewing 1-3 of 3 datasets
  • Break

    83,978 examples sampled from 10 question answering datasets over text, images and databases.AI2 Israel, Question Understanding • 2020Break is a human annotated dataset of natural language questions and their Question Decomposition Meaning Representations (QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases.
  • CommonsenseQA

    12,102 multiple-choice questions with one correct answer and four distractor answersAI2 Israel, Question Understanding • 2019CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions with one correct answer and four distractor answers.
  • ComplexWebQuestions

    34,689 complex questions and their answers, web snippets, and SPARQL queryAI2 Israel, Question Understanding • 2018ComplexWebQuestions is a dataset for answering complex questions that require reasoning over multiple web snippets. It contains a large set of complex questions in natural language, and can be used in multiple ways: 1) By interacting with a search engine, which is the focus of our paper (Talmor and Berant, 2018); 2) As a reading comprehension task: we release 12,725,989 web snippets that are relevant for the questions, and were collected during the development of our model; 3) As a semantic parsing task: each question is paired with a SPARQL query that can be executed against Freebase to retrieve the answer.