AI2 Biology How/Why Corpus

Name: biology-how-why-corpus
Creator: Allen Institute for AI

Aristo • 2014

This dataset consists of 378 biology questions (185 "how" and 193 "why") authored by a domain expert, each with one or more gold answer passages identified in an undergraduate textbook.

License: See Repo

Construction

The expert was not constrained in any way during the annotation process, so a gold answer may be shorter than a paragraph or may span multiple paragraphs.

This dataset was used for the question-answering system described in the paper “Discourse Complements Lexical Semantics for Non-factoid Answer Reranking” (ACL 2014).

Authors

This dataset was produced by the Allen Institute for AI and Mihai Surdeanu (University of Arizona).
