Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest
We challenge AI models to “demonstrate understanding” of the sophisticated multimodal humor of The New Yorker Caption Contest. Concretely, we develop three carefully circumscribed tasks for which…
Do language models have coherent mental models of everyday things?
When people think of everyday things like an “egg,” they typically have a mental image associated with it. This commonsense knowledge helps us understand how these everyday things work and how to…
Efficient Methods for Natural Language Processing: A Survey
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource…
Elaboration-Generating Commonsense Question Answering at Scale
In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working…
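As a rough illustration of the two-stage pipeline this abstract describes (generate background knowledge with a language model, then answer conditioned on it), here is a minimal sketch. The function and prompt wording are placeholders, not the paper's implementation, and `lm` stands in for any text-generation call.

```python
# Minimal sketch: (1) prompt a language model for a short "elaboration" of
# background knowledge, (2) answer the question with that elaboration as
# extra context. `lm` is a placeholder for any text-generation call
# (API or local model); this is not the paper's implementation.
from typing import Callable

def elaborate_then_answer(question: str, choices: list[str],
                          lm: Callable[[str], str]) -> str:
    # Stage 1: generate background knowledge about the question.
    elaboration = lm(f"Explain the background knowledge needed to answer: {question}")
    # Stage 2: answer with the elaboration prepended as context.
    prompt = (f"Background: {elaboration}\n"
              f"Question: {question}\n"
              f"Choices: {', '.join(choices)}\n"
              f"Answer:")
    return lm(prompt).strip()

if __name__ == "__main__":
    # Toy stand-in LM so the sketch runs without any model download.
    canned = {"Explain": "Metal conducts heat faster than wood.",
              "Background": "metal spoon"}
    fake_lm = lambda p: next(v for k, v in canned.items() if p.startswith(k))
    print(elaborate_then_answer("Which spoon gets hot faster in soup?",
                                ["metal spoon", "wooden spoon"], fake_lm))
```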
Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation
Few-shot fine-tuning and in-context learning are two alternative strategies for task adaptation of pre-trained language models. Recently, in-context learning has gained popularity over fine-tuning…
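To make the contrast between the two adaptation strategies concrete, the toy sketch below shows in-context learning as prompt construction with frozen weights versus fine-tuning as gradient updates on the same few examples. The tiny linear classifier is only a stand-in for a pre-trained model; none of the data or hyperparameters come from the paper.

```python
# Toy contrast of the two strategies: ICL keeps the model frozen and puts the
# labelled examples in the prompt; few-shot fine-tuning uses them for
# gradient updates. The linear model is a stand-in, purely illustrative.
import torch
import torch.nn as nn

shots = [("great movie", 1), ("boring plot", 0), ("loved it", 1)]

# --- Strategy 1: in-context learning (no parameter updates) --------------
def icl_prompt(test_text: str) -> str:
    demos = "\n".join(f"Review: {t}\nSentiment: {'positive' if y else 'negative'}"
                      for t, y in shots)
    return f"{demos}\nReview: {test_text}\nSentiment:"

print(icl_prompt("a dull mess"))  # would be sent to a frozen pre-trained LM

# --- Strategy 2: few-shot fine-tuning (gradient updates on the shots) ----
vocab = {w: i for i, w in enumerate({w for t, _ in shots for w in t.split()})}

def featurize(text: str) -> torch.Tensor:
    # Bag-of-words vector as a stand-in for the model's representation.
    x = torch.zeros(len(vocab))
    for w in text.split():
        if w in vocab:
            x[vocab[w]] = 1.0
    return x

clf = nn.Linear(len(vocab), 2)
opt = torch.optim.SGD(clf.parameters(), lr=0.1)
for _ in range(20):                      # a few epochs over the few shots
    for text, label in shots:
        loss = nn.functional.cross_entropy(clf(featurize(text)).unsqueeze(0),
                                            torch.tensor([label]))
        opt.zero_grad(); loss.backward(); opt.step()
print(clf(featurize("loved it")).argmax().item())  # fine-tuned prediction
```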
FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning
Large pre-trained models are capable of few-shot in-context learning (ICL), i.e., performing a new task by prepending a few demonstrations before the test input. However, the concatenated…
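The snippet's point is that standard ICL concatenates all demonstrations with the test input into one long sequence. As a heavily simplified, hedged sketch of the fusion-in-decoder intuition, the code below encodes each demonstration separately (so the encodings can be reused) and fuses the encoded states afterwards; the modules and shapes are generic stand-ins, not the paper's architecture.

```python
# Sketch of the efficiency argument: standard ICL encodes one long
# concatenated sequence, while a fusion-in-decoder style approach encodes
# each demonstration independently and fuses the encoder states. All
# modules and sizes are illustrative stand-ins, not the paper's model.
import torch
import torch.nn as nn

d_model, n_demos, demo_len, test_len = 64, 4, 32, 16
enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(enc_layer, num_layers=2)

demos = torch.randn(n_demos, demo_len, d_model)   # embedded demonstrations
test = torch.randn(1, test_len, d_model)          # embedded test input

# Standard ICL: one sequence of length n_demos*demo_len + test_len,
# so self-attention cost grows quadratically with the full prompt.
concat = torch.cat([demos.reshape(1, -1, d_model), test], dim=1)
full = encoder(concat)

# Fusion style: encode each demonstration on its own (cacheable across
# test inputs), then concatenate the *encoded* states for a decoder
# to attend over.
enc_demos = encoder(demos)                        # (n_demos, demo_len, d)
enc_test = encoder(test)
fused = torch.cat([enc_demos.reshape(1, -1, d_model), enc_test], dim=1)
print(full.shape, fused.shape)
```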
HINT: Hypernetwork Instruction Tuning for Efficient Zero-Shot Generalisation
Recent NLP models show a remarkable ability to generalise ‘zero-shot’ to new tasks using only an instruction as guidance. However, these approaches usually repeat their instructions with every input,…
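As a generic sketch of the hypernetwork idea the title points at (processing the instruction once rather than with every input), the code below maps an instruction embedding to the weights of a small adapter that is then reused for all inputs of the task. The module sizes and layout are illustrative assumptions, not the HINT architecture.

```python
# Generic hypernetwork sketch: convert the instruction once into a small
# adapter's weights, then reuse them for every input of that task instead
# of re-reading the instruction each time. Sizes are illustrative; this is
# not the HINT model.
import torch
import torch.nn as nn

d_model, d_adapter = 64, 16

class HyperNet(nn.Module):
    """Maps a pooled instruction embedding to adapter weights."""
    def __init__(self):
        super().__init__()
        self.to_down = nn.Linear(d_model, d_model * d_adapter)
        self.to_up = nn.Linear(d_model, d_adapter * d_model)

    def forward(self, instr_emb):                 # (d_model,)
        w_down = self.to_down(instr_emb).view(d_adapter, d_model)
        w_up = self.to_up(instr_emb).view(d_model, d_adapter)
        return w_down, w_up

hyper = HyperNet()
instruction_emb = torch.randn(d_model)            # pooled instruction encoding
w_down, w_up = hyper(instruction_emb)             # computed once per task

def adapter(hidden):                              # applied to every input
    return hidden + torch.relu(hidden @ w_down.T) @ w_up.T

for _ in range(3):                                # many inputs, one instruction pass
    h = torch.randn(10, d_model)                  # token states for one input
    print(adapter(h).shape)
```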
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Large-scale language model pretraining is a very successful form of self-supervised learning in natural language processing, but it is increasingly expensive to perform as the models and pretraining…
Nonparametric Masked Language Modeling
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first nonparametric masked…
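The contrast in this abstract, a softmax over a fixed vocabulary versus a nonparametric predictor, can be illustrated in a heavily simplified way: fill a masked position by nearest-neighbour search over token representations drawn from a reference corpus rather than by scoring the vocabulary. The embeddings below are random stand-ins; this is not the NPM model itself.

```python
# Simplified contrast: a standard masked LM scores a finite vocabulary with
# a softmax, while a nonparametric predictor fills the mask by retrieving
# the nearest token representation from a reference corpus. Random
# embeddings are stand-ins for a trained encoder.
import torch

d = 32
vocab = ["cat", "dog", "apple", "banana"]
emb = {w: torch.randn(d) for w in vocab}          # stand-in token embeddings

# Reference corpus: (token, contextual representation) pairs a real system
# would produce with a trained encoder.
corpus = [("cat", emb["cat"] + 0.1 * torch.randn(d)),
          ("banana", emb["banana"] + 0.1 * torch.randn(d))]

query = emb["cat"] + 0.1 * torch.randn(d)         # representation of [MASK]

# Parametric prediction: logits over the finite vocabulary.
logits = torch.stack([emb[w] for w in vocab]) @ query
print("vocabulary softmax pick:", vocab[int(logits.argmax())])

# Nonparametric prediction: retrieve the nearest corpus token instead,
# so anything that appears in the corpus can be predicted.
sims = torch.stack([c for _, c in corpus]) @ query
print("retrieval pick:", corpus[int(sims.argmax())][0])
```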
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
We introduce INSTRUCTOR, a new method for computing text embeddings given task instructions: every text input is embedded together with instructions explaining the use case (e.g., task and domain…
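The pattern this abstract describes, embedding the same text differently depending on a natural-language instruction, can be sketched generically by encoding instruction and text together. The character-trigram "encoder" below is a deterministic stand-in so the example runs anywhere; it is not the INSTRUCTOR model or its API.

```python
# Generic instruction-conditioned embedding: the instruction is prepended to
# the text before encoding, so the same text gets a task-specific embedding.
# The trigram-hash "encoder" is a stand-in, not INSTRUCTOR.
import hashlib
import numpy as np

def toy_encode(text: str, dim: int = 64) -> np.ndarray:
    """Hash character trigrams into a fixed-size, unit-norm vector."""
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        v[h % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

def embed(instruction: str, text: str) -> np.ndarray:
    # Instruction and input are concatenated before encoding.
    return toy_encode(f"{instruction} {text}")

doc = "Transformers use attention to mix information across tokens."
e_retrieval = embed("Represent the scientific document for retrieval:", doc)
e_cluster = embed("Represent the sentence for clustering:", doc)
print(float(e_retrieval @ e_cluster))   # same text, different task embeddings
```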