Skip to content

NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks

License

Notifications You must be signed in to change notification settings

allenai/numglue

Folders and files

NameName
Last commit message
Last commit date

Latest commit

3722b91 · May 10, 2022

History

23 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NumGLUE

NumGLUE is a multi-task benchmark that evaluates the performance of AI systems on eight different tasks, that at their core require simple arithmetic understanding.

Dataset

NumGLUE has 8 tasks

  • Task 1: Commonsense + Arithmetic Reasoning
  • Task 2: Domain Specific + Arithmetic Reasoning
  • Task 3: Commonsense + Quantitative Comparison
  • Task 4: Fill-in-the-blanks Format
  • Task 5: Reading Comprehension (RC) + Explicit Numerical Reasoning
  • Task 6: Reading Comprehension (RC) + Implicit Numerical Reasoning
  • Task 7: Quantitative NLI
  • Task 8: Arithmetic Word Problems

Download data from ./data/. It contains the train, dev and test split. Note that the provided task types need to be only used for evaluating model performance across various tasks. They should not be used as additional information during model training, since one of the goal in this benchmark is to identify task types directly from data.

Baseline Model

We used numnetplus as the baseline model in NumGLUE. We use reading comprehension as the common format and convert questions of all tasks to the reading comprehension format.

For more details, please refer to our paper NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks

Feel free to cite us

@article{mishra2022numglue,
  title={NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks},
  author={Mishra, Swaroop and Mitra, Arindam and Varshney, Neeraj and Sachdeva, Bhavdeep and Clark, Peter and Baral, Chitta and Kalyan, Ashwin},
  journal={ACL},
  year={2022}
}

If you use the NumGLUE data, please cite the source dataset papers. The full bibtex of source dataset papers is here.

About

NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published