AI2 Biology How/Why Corpus

Aristo • 2014
This dataset consists of 185 "how" and 193 "why" biology questions authored by a domain expert, each with one or more gold answer passages identified in an undergraduate textbook.
License: See Repo

Construction

The expert was not constrained in any way during the annotation process, so a gold answer may be shorter than a paragraph or span multiple paragraphs.

This dataset was used for the question-answering system described in the paper “Discourse Complements Lexical Semantics for Non-factoid Answer Reranking” (ACL 2014).

Authors

This dataset was produced by the Allen Institute for AI and Mihai Surdeanu (University of Arizona).