eQASC: Multihop Explanations for QASC

Aristo • 2020

This dataset contains 98k 2-hop explanations for questions in the QASC dataset, with annotations indicating if they are valid (~25k) or invalid (~73k) explanations.


License: CC BY

This repository addresses the current lack of training data for distinguishing valid multihop explanations from invalid ones by providing three new datasets. The main one, eQASC, contains 98k explanation annotations for the multihop question answering dataset QASC, and is the first to annotate multiple candidate explanations for each answer.

The second dataset, eQASC-perturbed, is constructed by crowd-sourcing perturbations of a subset of explanations in QASC (while preserving their validity), to test the consistency and generalization of explanation prediction models. The third dataset, eOBQA, is constructed by adding explanation annotations to the OBQA dataset, to test the generalization of models trained on eQASC. In the associated paper, we show that this data can be used to significantly improve explanation quality (+14% absolute F1) using a BERT-based classifier, although performance still falls short of the upper bound, offering a new challenge for future research. We also explore a delexicalized chain representation in which repeated noun phrases are replaced by variables, turning explanations into generalized reasoning chains (for example: “X is a Y” AND “Y has Z” IMPLIES “X has Z”). We find that generalized chains maintain performance while also being more robust to perturbations.
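The delexicalization idea described above can be illustrated with a small sketch. This is not the authors' implementation: the function name `generalize_chain` is hypothetical, and the noun phrases are assumed to be supplied by the caller (in practice they would come from a chunker or parser). Phrases that occur in at least two sentences of a chain are replaced by variables in order of first appearance.

```python
import re

# Hypothetical variable names: X, Y, Z first, then the remaining letters.
VARIABLE_NAMES = "XYZABCDEFGHIJKLMNOPQRSTUVW"


def generalize_chain(sentences, noun_phrases):
    """Replace noun phrases shared by two or more sentences with variables.

    `noun_phrases` is assumed to be given; extracting them is out of scope.
    Returns the rewritten sentences and the phrase-to-variable mapping.
    """
    def contains(sentence, phrase):
        pattern = rf"\b{re.escape(phrase)}\b"
        return re.search(pattern, sentence, re.IGNORECASE) is not None

    # Keep only phrases that link at least two sentences of the chain.
    repeated = [p for p in noun_phrases
                if sum(contains(s, p) for s in sentences) >= 2]
    if not repeated:
        return list(sentences), {}

    # Assign variables in order of first appearance in the chain.
    text = " ".join(sentences).lower()
    repeated.sort(key=lambda p: text.find(p.lower()))
    variables = {p.lower(): VARIABLE_NAMES[i] for i, p in enumerate(repeated)}

    # Match longer phrases first so overlapping phrases are handled sensibly.
    alternatives = "|".join(
        re.escape(p) for p in sorted(repeated, key=len, reverse=True))
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)

    generalized = [
        pattern.sub(lambda m: variables[m.group(1).lower()], s)
        for s in sentences
    ]
    return generalized, variables


chain = ["squirrels are mammals", "mammals have fur", "squirrels have fur"]
generalized, mapping = generalize_chain(chain, ["squirrels", "mammals", "fur"])
print(" AND ".join(generalized[:2]), "IMPLIES", generalized[2])
```

The toy chain here is made up for illustration and is not taken from eQASC. The sketch supports at most 26 distinct repeated phrases per chain, which is an arbitrary limit of the variable alphabet chosen above.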

Leaderboard

Top Public Submissions

Rank	Details	Created	Explain NDCG
1	BERT-GeneralizedReasoningChains (Harsh Jhamtani from CMU, Peter Clark from AI2)	12/23/2020	64%
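
For readers unfamiliar with the metric, the following is a minimal sketch of the standard NDCG computation applied to ranking candidate explanations for a single question. It assumes binary validity labels (1 = valid, 0 = invalid) and no rank cutoff; the exact variant used by the leaderboard is not specified on this page, and the function names are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of labels listed in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(scores, labels):
    """NDCG for one question.

    scores: model scores for each candidate explanation (higher = better).
    labels: gold validity labels, assumed binary here (1 valid, 0 invalid).
    """
    # Order the gold labels by descending model score.
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    # The ideal ranking places all valid explanations first.
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(ranked) / ideal if ideal > 0 else 0.0


# Toy example: the model ranks an invalid explanation first.
print(round(ndcg([0.9, 0.2, 0.6], [0, 1, 1]), 3))
```

A dataset-level score would typically average this per-question value over all questions, though that aggregation is also an assumption here.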


Authors

Harsh Jhamtani, Peter Clark
