AI & Fairness

AI & Fairness

AI & Fairness

We are building on AI2's expertise in NLP, computer vision, and engineering to deliver a tangible positive impact on fairness.


  • Oren Etzioni's Profile PhotoOren EtzioniChief Executive Officer
  • Nicole DeCario's Profile PhotoNicole DeCarioOperations
  • Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations

    Tianlu Wang, Jieyu Zhao, Mark Yatskar, Kai-Wei Chang, Vicente OrdonezICCV2019
    In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables --such as gender-- in visual recognition tasks. We show that trained models significantly amplify the association of target labels with gender beyond what one would expect from biased datasets. Surprisingly, we show that even when datasets are balanced such that each label co-occurs equally with each gender, learned models amplify the association between labels and gender, as much as if data had not been balanced! To mitigate this, we adopt an adversarial approach to remove unwanted features corresponding to protected variables from intermediate representations in a deep neural network -- and provide a detailed analysis of its effectiveness. Experiments on two datasets: the COCO dataset (objects), and the imSitu dataset (actions), show reductions in gender bias amplification while maintaining most of the accuracy of the original models.
  • The Risk of Racial Bias in Hate Speech Detection

    Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, Noah A. SmithACL2019
    We investigate how annotators’ insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. We first uncover unexpected correlations between surface markers of African American English (AAE) and ratings of toxicity in several widely-used hate speech datasets. Then, we show that models trained on these corpora acquire and propagate these biases, such that AAE tweets and tweets by self-identified African Americans are up to two times more likely to be labelled as offensive compared to others. Finally, we propose dialect and race priming as ways to reduce the racial bias in annotation, showing that when annotators are made explicitly aware of an AAE tweet’s dialect they are significantly less likely to label the tweet as offensive.
  • Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets

    Mor Geva, Yoav Goldberg, Jonathan BerantarXiv2019
    Crowdsourcing has been the prevalent paradigm for creating natural language understanding datasets in recent years. A common crowdsourcing practice is to recruit a small number of high-quality workers, and have them massively generate examples. Having only a few workers generate the majority of examples raises concerns about data diversity, especially when workers freely generate sentences. In this paper, we perform a series of experiments showing these concerns are evident in three recent NLP datasets. We show that model performance improves when training with annotator identifiers as features, and that models are able to recognize the most productive annotators. Moreover, we show that often models do not generalize well to examples from annotators that did not contribute to the training set. Our findings suggest that annotator bias should be monitored during dataset creation, and that test set annotators should be disjoint from training set annotators.
  • Evaluating Gender Bias in Machine Translation

    Gabriel Stanovsky, Noah A. Smith, Luke ZettlemoyerACL2019
    We present the first challenge set and evaluation protocol for the analysis of gender bias in machine translation (MT). Our approach uses two recent coreference resolution datasets composed of English sentences which cast participants into non-stereotypical gender roles (e.g., "The doctor asked the nurse to help her in the operation"). We devise an automatic gender bias evaluation method for eight target languages with grammatical gender, based on morphological analysis (e.g., the use of female inflection for the word "doctor"). Our analyses show that four popular industrial MT systems and two recent state-of-the-art academic MT models are significantly prone to gender-biased translation errors for all tested target languages. Our data and code are made publicly available.
  • Green AI

    Roy Schwartz, Jesse Dodge, Noah A. Smith, Oren EtzioniarXiv2019
    The computations required for deep learning research have been doubling every few months, resulting in an estimated 300,000x increase from 2012 to 2018 [2]. These computations have a surprisingly large carbon footprint [38]. Ironically, deep learning was inspired by the human brain, which is remarkably energy efficient. Moreover, the financial cost of the computations can make it difficult for academics, students, and researchers from emerging economies to engage in deep learning research. This position paper advocates a practical solution by making efficiency an evaluation criterion for research alongside accuracy and related measures. In addition, we propose reporting the financial cost or "price tag" of developing, training, and running models to provide baselines for the investigation of increasingly efficient methods. Our goal is to make AI both greener and more inclusive---enabling any inspired undergraduate with a laptop to write high-quality research papers. Green AI is an emerging focus at the Allen Institute for AI.
“By working arm-in-arm with multiple stakeholders, we can address the important topics rising at the intersection of AI, people, and society.”
Eric Horvitz

The hidden costs of AI

October 29, 2019
Read the Article

At Tech’s Leading Edge, Worry About a Concentration of Power

The New York Times
September 26, 2019
Read the Article

Artificial Intelligence Can’t Think Without Polluting

The Wire
September 26, 2019
Read the Article

Artificial Intelligence Confronts a 'Reproducibility' Crisis

September 16, 2019
Read the Article

המחיר המושתק של בינה מלאכותית (The secret price of artificial intelligence)

August 12, 2019
Read the Article

AI researchers need to stop hiding the climate toll of their work

MIT Tech Review
August 2, 2019
Read the Article

Greening AI | New AI2 Initiative Promotes Model Efficiency

July 31, 2019
Read the Article

Amid a rapid rise in AI resource needs, AI2 campaigns to make it easier to be green

July 26, 2019
Read the Article