ZEST: ZEroShot learning from Task descriptions

Mosaic, AllenNLP • 2020
ZEST tests whether NLP systems can perform unseen tasks in a zero-shot way, given a natural language description of the task. It is an instantiation of our proposed framework "learning from task descriptions". The tasks include classification, typed entity extraction and relationship extraction, and each task is paired with 20 different annotated (input, output) examples. ZEST's structure allows us to systematically test whether models can generalize in five different ways.
License: CC BY

Typically, machine learning systems solve new tasks by training on thousands of examples. In contrast, humans can solve new tasks by reading some instructions, with perhaps an example or two. To take a step toward closing this gap, we introduce a framework and benchmark dataset for building NLP systems that solve new tasks after reading their descriptions. See our EMNLP 2020 paper “Learning from task descriptions” for a description of the framework, evaluation metric, and baseline model results.

ZEST contains task descriptions (formatted as questions) for 1,251 different NLP tasks. Each task has 20 different (context, answer) annotations. This page provides download links (see above) for the dataset. Check out the GitHub repository for a detailed description of the data, evaluation code, and information about submitting to the leaderboard to evaluate on the test set.
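As a rough illustration of the structure described above, the sketch below models one task as a question plus its (context, answer) annotations and flattens it into text pairs for a text-to-text model. The class names, the `to_pairs` helper, the prompt layout, and the sample question and context are all hypothetical and invented for illustration; the actual file format is documented in the GitHub repository and may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Figures taken from the dataset description on this page.
NUM_TASKS = 1251
EXAMPLES_PER_TASK = 20


@dataclass
class Example:
    """One (context, answer) annotation. Field names are assumptions."""
    context: str
    answer: str


@dataclass
class Task:
    """A task description (a question) with its annotated examples."""
    question: str
    examples: List[Example]


def to_pairs(task: Task) -> List[Tuple[str, str]]:
    """Flatten a task into (input text, target text) pairs.

    The "question: ... context: ..." layout is a hypothetical choice,
    not necessarily the format used by the baselines.
    """
    return [
        (f"question: {task.question} context: {ex.context}", ex.answer)
        for ex in task.examples
    ]


# Placeholder content, invented purely for illustration.
task = Task(
    question="Is a permit mentioned in the text?",
    examples=[
        Example(context=f"Placeholder passage {i}.", answer="yes")
        for i in range(EXAMPLES_PER_TASK)
    ],
)

pairs = to_pairs(task)
total_annotations = NUM_TASKS * EXAMPLES_PER_TASK  # 25,020 across all tasks
```

Keeping the question separate from the examples reflects the zero-shot setup: at test time the model sees a question it was never trained on, together with each new context.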


Top Public Submissions
Details | Created | Output Structure Mean
T5-11B MTL baseline (Orion Weller, Nicholas Lourie, Matt Gardner, and Matt Peters)
T5-11B baseline (Orion Weller, Nicholas Lourie, Matt Gardner, and Matt Peters)
Hypter (BART-Large)
BART-Large baseline (Orion Weller, Nicholas Lourie, Matt Gardner, and Matt Peters)


Orion Weller, Nicholas Lourie, Matt Gardner, Matthew E. Peters