Discrete Reasoning Over the content of Paragraphs (DROP)

AllenNLP • 2019
Many diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple paraphrase matching and entity typing to entity tracking and understanding the implications of the context. With so many datasets available, comprehensive and reliable evaluation is tedious and time-consuming. ORB is an evaluation server that reports performance on diverse reading comprehension datasets, encouraging and facilitating tests of a single model's ability to understand a wide variety of reading phenomena. It also includes a suite of synthetic augmentations that test a model's ability to generalize to out-of-distribution syntactic structures.
License: CC BY

Leaderboard

Top Public Submissions
Rank  Details                                         Created     Exact Match
1     MindOpt Copilot (DI Lab, Alibaba Damo Academy)  11/17/2023  90%
2     D-Reasoner (iFLYTEK Research & HFL)             6/19/2023   88%
3     QDGAT2.0 (AntGroup NLP)                         1/7/2023    88%
4     AeNER (Anonymous for blind review)              10/19/2022  88%
5     OPERA++ (JD AI Research)                        2/24/2022   88%
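
The Exact Match column reports the percentage of questions for which a submission's answer matches a reference answer. The sketch below illustrates how such a score is commonly computed; it is not the leaderboard's official evaluator. The normalization steps (lowercasing, dropping punctuation and English articles) and the function names `normalize_answer`, `exact_match`, and `exact_match_score` are assumptions made for illustration.

```python
import re
import string
from typing import List, Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed normalization for illustration only. Note that stripping
    punctuation also removes decimal points, so a real evaluator would
    need to treat numbers separately.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> bool:
    """Return True if the prediction equals any gold answer after normalization."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def exact_match_score(predictions: Sequence[str],
                      references: Sequence[Sequence[str]]) -> float:
    """Percentage of predictions that exactly match one of their references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical predictions and reference answers, not taken from DROP.
    preds: List[str] = ["4", "the Broncos"]
    golds: List[List[str]] = [["4"], ["Packers"]]
    print(exact_match_score(preds, golds))
```

Under this scheme, a score such as 90% would mean that nine out of ten answers matched a reference exactly after normalization.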

Authors

Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, Matt Gardner