Allen Institute for AI

ARISTO

Build machines that read, learn and reason.

The Aristo Project aims to build systems that demonstrate a deep understanding of the world, integrating technologies for reading, learning, reasoning, and explanation.

Our research integrates multiple AI technologies, including:
  • Natural language processing
  • Information extraction
  • Knowledge representation
  • Machine reasoning
  • Commonsense knowledge

Research Areas

Probing Reasoning with Language Models

Language models (LMs) have dominated much of AI recently. But what kind(s) of reasoning are they capable of? And how can they be taught to do more? We are developing analytical datasets to probe LMs and help answer these questions.


Multihop Reasoning

Many questions require multiple pieces of information to be combined to arrive at an answer. We are developing new multihop models capable of identifying and combining relevant facts to answer such questions.
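As a toy illustration of what combining facts means here, the sketch below chains two hand-written facts that share a term. It is not Aristo's approach, which uses learned models over text; the `Fact` triples, the `FACTS` list, and the `compose_two_hop` helper are all hypothetical names introduced only for this example.

```python
from typing import List, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object); hypothetical representation

# Hand-written facts for illustration only; a real system would retrieve these from a corpus.
FACTS: List[Fact] = [
    ("differential heating of air", "produces", "wind"),
    ("wind", "is used for producing", "electricity"),
    ("plants", "produce", "oxygen"),
]


def compose_two_hop(facts: List[Fact], start: str) -> List[Tuple[Fact, Fact]]:
    """Return pairs of facts where the object of the first fact
    is the subject of the second, beginning from `start`."""
    chains = []
    for first in facts:
        if first[0] != start:
            continue
        for second in facts:
            if second[0] == first[2]:
                chains.append((first, second))
    return chains


chains = compose_two_hop(FACTS, "differential heating of air")
for first, second in chains:
    # The bridge term ("wind") appears in both facts but in neither the start nor the answer.
    print(f"{first[0]} -> {first[2]} -> {second[2]}")
```

Exact string matching on the bridge term is the weakest part of this sketch; with free text, the two facts rarely share identical wording, which is part of what makes the multihop setting hard.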


Explanation

An intelligent system should not only answer questions correctly, but also be able to explain why its answers are correct. Such a capability is essential for practical acceptance of AI technology. It is also essential for the broader goals of communicating knowledge to a user, and receiving correction from the user when the system's answer is wrong.


Reasoning about Actions

A key aspect of intelligence is being able to reason about the dynamics of the world. This requires modeling what state the world might be in, and how different actions might affect that state. Such capabilities are essential for understanding what happens during a procedure or process, for planning, and for reasoning about "what if..." scenarios.
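The idea of modeling world state and the effect of actions can be sketched with a very small simulator. This is only an illustrative toy under assumed names: the `State` dictionary (entity to location), the `move` action, and the `simulate` function are hypothetical and do not describe how Aristo's models represent a process.

```python
from typing import Callable, Dict, List

State = Dict[str, str]  # entity -> location (hypothetical representation)
Action = Callable[[State], State]


def move(entity: str, destination: str) -> Action:
    """Build an action that relocates `entity` without mutating the input state."""
    def apply(state: State) -> State:
        new_state = dict(state)
        new_state[entity] = destination
        return new_state
    return apply


def simulate(initial: State, actions: List[Action]) -> List[State]:
    """Apply actions in order and return every intermediate state."""
    states = [initial]
    for action in actions:
        states.append(action(states[-1]))
    return states


initial = {"water": "soil"}
steps = [move("water", "root"), move("water", "leaf")]

trajectory = simulate(initial, steps)   # the process as described
what_if = simulate(initial, steps[:1])  # "what if" the second step never happens?
```

Because each action returns a new state instead of editing the old one, the same starting state can be replayed under different action sequences, which is the minimum needed to compare a described process with a "what if" variant.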


  • Transformers as Soft Reasoners over Language | Aristo

    RuleTaker determines whether statements are True or False based on rules given in natural language.

  • Crossing format boundaries with a single QA system | Aristo

    UnifiedQA is a single pre-trained QA model that performs surprisingly well across 17 QA datasets spanning 4 diverse formats. Fine-tuning UnifiedQA into specialized models results in a new state-of-the-art on 6 datasets, establishing this model as a strong starting point for building QA systems.

    • Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text

      Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark. EMNLP 2019. Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others. Our approach builds on a prior process comprehension framework for predicting actions' effects, to also identify…
    • QASC: A Dataset for Question Answering via Sentence Composition

      Tushar Khot, Peter Clark, Michal Guerquin, Paul Edward Jansen, Ashish Sabharwal. AAAI 2020. Composing knowledge from multiple pieces of texts is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice…
    • Probing Natural Language Inference Models through Semantic Fragments

      Kyle Richardson, Hai Na Hu, Lawrence S. Moss, Ashish Sabharwal. AAAI 2020. Do state-of-the-art models for language understanding already have, or can they easily learn, abilities such as boolean coordination, quantification, conditionals, comparatives, and monotonicity reasoning (i.e., reasoning about word substitutions in sentential contexts)? While such phenomena are…
    • What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge

      Kyle Richardson, Ashish Sabharwal. TACL 2020. Open-domain question answering (QA) is known to involve several underlying knowledge and reasoning challenges, but are models actually learning such knowledge when trained on benchmark tasks? To investigate this, we introduce several new challenge tasks that probe whether state-of-the-art QA models…
    • Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models

      Tushar Khot, Daniel Khashabi, Kyle Richardson, Peter Clark, Ashish Sabharwal. arXiv 2020. A common approach to solve complex tasks is by breaking them down into simple sub-problems that can then be solved by simpler modules. However, these approaches often need to be designed and trained specifically for each complex task. We propose a general approach, Text Modular Networks (TMNs…

    QuaRTz Dataset

    3864 questions about open domain qualitative relationships

    QuaRTz is a crowdsourced dataset of 3864 multiple-choice questions about open domain qualitative relationships. Each question is paired with one of 405 different background sentences (sometimes short paragraphs).

    QuaRel Dataset

    2771 story questions about qualitative relationships

    QuaRel is a crowdsourced dataset of 2771 multiple-choice story questions, including their logical forms.

    hasPart KB

    A high-quality KB of hasPart relations

    A high-quality knowledge base of ~50k hasPart relationships, extracted from a large corpus of generic statements.

    RuleTaker: Transformers as Soft Reasoners over Language

    Datasets used to teach transformers to reason

    Can transformers be trained to reason (or emulate reasoning) over rules expressed in language? In the associated paper and demo, we provide evidence that they can. Our models, which we call RuleTakers, are trained on datasets of synthetic rule bases plus derived conclusions, provided here. The resulting models provide the first demonstration that this kind of soft reasoning over language is indeed learnable.
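To make "rule bases plus derived conclusions" concrete, here is a minimal symbolic sketch that derives conclusions from a tiny rule base by forward chaining. It is not the RuleTaker method (RuleTakers are transformers that read the rules as text), and the example sentences, the `Rule` type, and the `forward_chain` function are hypothetical. Treating any statement that cannot be derived as False is an assumption of this toy.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion); hypothetical representation


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Repeatedly apply rules whose premises all hold until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Illustrative rule base, written in the style of simple natural-language rules.
facts = {"Bob is big", "Bob is green"}
rules = [
    (frozenset({"Bob is big"}), "Bob is strong"),
    (frozenset({"Bob is strong", "Bob is green"}), "Bob is rough"),
    (frozenset({"Bob is red"}), "Bob is kind"),
]

conclusions = forward_chain(facts, rules)


def label(statement: str) -> bool:
    # Assumption of this sketch: statements that cannot be derived are labeled False.
    return statement in conclusions
```

A symbolic procedure like this can generate (rule base, statement, label) examples; the question the section raises is whether a transformer given the same rules as plain text can learn to produce the same labels.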

    “Knowing is not enough, we must apply. Willing is not enough, we must do.”
    Johann Wolfgang von Goethe

    Paul Allen's 'Digital Aristotle' sets eyes on accomplishing practical tasks

    KOMO News
    February 5, 2020

    מערכת בינה מלאכותית עברה בהצטיינות יתרה מבחן במדעים של כיתה ח' (Artificial Intelligence System Passes Eighth-Grade Science Test with High Distinction)

    Haaretz
    September 6, 2019

    Allen Institute's Aristo AI makes breakthrough, passes eighth-grade science test

    TechSpot
    September 5, 2019

    Allen Institute’s Aristo AI system finally passes an eighth-grade science test

    GeekWire
    September 4, 2019

    How to tutor AI from an ‘F’ to an ‘A’

    Vulcan Inc
    September 4, 2019

    A Breakthrough for A.I. Technology: Passing an 8th-Grade Science Test

    The New York Times
    September 4, 2019

    AI assistants say dumb things, and we’re about to find out why

    MIT Tech Review
    March 14, 2018

    Moving Beyond the Turing Test with the Allen AI Science Challenge

    CACM
    September 4, 2017

    Team

    • Peter Clark, Research
    • Sumithra Bhakthavatsalam, Engineering
    • Bhavana Dalvi, Research
    • Michal Guerquin, Engineering
    • Daniel Khashabi, Young Investigator
    • Tushar Khot, Research
    • Kyle Richardson, Research
    • Ashish Sabharwal, Research
    • Carissa Schoenick, Product
    • Oyvind Tafjord, Research
    • Niket Tandon, Research