Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
Despite recent advances in natural language generation, it remains challenging to control attributes of generated text. We propose DExperts: Decoding-time Experts, a decoding-time method for…
DeLighT: Deep and Light-weight Transformer
We introduce a very deep and light-weight transformer, DeLighT, that delivers similar or better performance than transformer-based models with significantly fewer parameters. DeLighT more…
Challenges in Automated Debiasing for Toxic Language Detection
Biased associations have been a challenge in the development of classifiers for detecting toxic language, hindering both fairness and accuracy. As potential solutions, we investigate recently…
Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI
Trust is a central component of the interaction between people and AI, in that 'incorrect' levels of trust may cause misuse, abuse or disuse of the technology. But what, precisely, is the nature of…
GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation
Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository. Their adoption, however, is so far limited…
Green AI
The computations required for deep learning research have been doubling every few months, resulting in an estimated 300,000x increase from 2012 to 2018 [2]. These computations have a surprisingly…
A Simple Yet Strong Pipeline for HotpotQA
State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition,…
Easy, Reproducible and Quality-Controlled Data Collection with Crowdaq
High-quality and large-scale data are key to success for AI systems. However, large-scale data annotation efforts are often confronted with a set of common challenges: (1) designing a user-friendly…
Grounded Compositional Outputs for Adaptive Language Modeling
Language models have emerged as a central component across NLP, and a great deal of progress depends on the ability to cheaply adapt them (e.g., through finetuning) to new domains and tasks. A…