ARISTO

Build machines that read, learn and reason.

The Aristo Project aims to build systems that demonstrate a deep understanding of the world, integrating technologies for reading, learning, reasoning, and explanation.

Our research integrates multiple AI technologies, including:
  • Natural language processing
  • Information extraction
  • Knowledge representation
  • Machine reasoning
  • Commonsense knowledge

Research Areas

Probing Reasoning with Language Models

Language models (LMs) have dominated much of AI recently. But what kind(s) of reasoning are they capable of? And how can they be taught to do more? We are developing analytical datasets to probe LMs and help answer these questions.

Multihop Reasoning

Many questions require multiple pieces of information to be combined to arrive at an answer. We are developing new multihop models capable of identifying and combining relevant facts to answer such questions.

Explanation

An intelligent system should not only answer questions correctly, but also be able to explain why its answers are correct. Such a capability is essential for practical acceptance of AI technology. It is also essential for the broader goals of communicating knowledge to a user, and receiving correction from the user when the system's answer is wrong.

Reasoning about Actions

A key aspect of intelligence is being able to reason about the dynamics of the world. This requires modeling what state the world might be in, and how different actions might affect that state. Such capabilities are essential for understanding what happens during a procedure or process, for planning, and for reasoning about "what if..." scenarios.

  • Rover Demo
    Transformers as Soft Reasoners over Language | Aristo

    ROVER determines whether statements are True or False based on rules given in natural language.
  • Aristo demo
    Science question answering with AI | Aristo

    Aristo is a multidisciplinary project that aims to develop systems that have a deeper understanding of the world and are capable of demonstrating that understanding through question answering and explanation.

    • Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text

      Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark
      EMNLP 2019
      Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others. Our approach builds on a prior process comprehension framework for predicting actions' effects, to also identify…
    • QASC: A Dataset for Question Answering via Sentence Composition

      Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, Ashish Sabharwal
      AAAI 2020
      Composing knowledge from multiple pieces of texts is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice…
    • Probing Natural Language Inference Models through Semantic Fragments

      Kyle Richardson, Hai Na Hu, Lawrence S. Moss, Ashish Sabharwal
      AAAI 2020
      Do state-of-the-art models for language understanding already have, or can they easily learn, abilities such as boolean coordination, quantification, conditionals, comparatives, and monotonicity reasoning (i.e., reasoning about word substitutions in sentential contexts)? While such phenomena are…
    • Adversarial Filters of Dataset Biases

      Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi
      arXiv 2020
      Large neural models have demonstrated human-level performance on language and vision benchmarks such as ImageNet and Stanford Natural Language Inference (SNLI). Yet, their performance degrades considerably when tested on adversarial or out-of-distribution samples. This raises the question of whether…
    • Transformers as Soft Reasoners over Language

      Peter Clark, Oyvind Tafjord, Kyle Richardson
      arXiv 2020
      AI has long pursued the goal of having systems reason over explicitly provided knowledge, but building suitable representations has proved challenging. Here we explore whether transformers can similarly learn to reason (or emulate reasoning), but using rules expressed in language, thus bypassing a…

    QuaRTz Dataset

    3,864 questions about open-domain qualitative relationships

    QuaRTz is a crowdsourced dataset of 3,864 multiple-choice questions about open-domain qualitative relationships. Each question is paired with one of 405 different background sentences (sometimes short paragraphs).

    QuaRel Dataset

    2,771 story questions about qualitative relationships

    QuaRel is a crowdsourced dataset of 2,771 multiple-choice story questions, including their logical forms.

    Question Answering via Sentence Composition (QASC)

    9,980 8-way multiple-choice questions about grade-school science

    QASC is a question-answering dataset with a focus on sentence composition. It consists of 9,980 8-way multiple-choice questions about grade-school science (8,134 train, 926 dev, 920 test), and comes with a corpus of 17M sentences.
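
    The split sizes quoted above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the names `QASC_SPLITS`, `NUM_CHOICES`, and `split_fractions` are hypothetical and introduced here for the example; they are not part of any QASC release or loader.

```python
# Split sizes exactly as quoted in the QASC description above.
QASC_SPLITS = {"train": 8134, "dev": 926, "test": 920}  # hypothetical name
NUM_CHOICES = 8  # each question is 8-way multiple choice


def split_fractions(splits):
    """Return each split's share of the total number of questions."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total_questions = sum(QASC_SPLITS.values())
fractions = split_fractions(QASC_SPLITS)
# Expected accuracy of uniform random guessing over the answer options.
random_baseline = 1 / NUM_CHOICES

print(total_questions)   # 9980, matching the headline figure
print({name: round(frac, 3) for name, frac in fractions.items()})
print(random_baseline)   # 0.125
```

    The three splits add up to the 9,980 questions in the headline figure, and with eight answer options a uniform random guesser would be expected to score 12.5%.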

    ARC Question Classification Dataset

    7,787 multiple-choice questions annotated with question classification labels

    A dataset of detailed problem domain classification labels for each of the 7,787 multiple-choice science questions found in the AI2 Reasoning Challenge (ARC) dataset, to enable targeted pairing of questions with problem-specific solvers. Also included is a taxonomy of 462 detailed problem domains for grade-school science, organized into 6 levels of specificity.

    “Knowing is not enough, we must apply. Willing is not enough, we must do.”
    Johann Wolfgang von Goethe

    Paul Allen's 'Digital Aristotle' sets eyes on accomplishing practical tasks

    KOMO News
    February 5, 2020

    מערכת בינה מלאכותית עברה בהצטיינות יתרה מבחן במדעים של כיתה ח' (Artificial Intelligence System Passes 8th-Grade Science Test with High Honors)

    Haaretz
    September 6, 2019

    Allen Institute's Aristo AI makes breakthrough, passes eighth-grade science test

    TechSpot
    September 5, 2019

    A Breakthrough for A.I. Technology: Passing an 8th-Grade Science Test

    The New York Times
    September 4, 2019

    Allen Institute’s Aristo AI system finally passes an eighth-grade science test

    GeekWire
    September 4, 2019

    How to tutor AI from an ‘F’ to an ‘A’

    Vulcan Inc
    September 4, 2019

    AI assistants say dumb things, and we’re about to find out why

    MIT Tech Review
    March 14, 2018

    Moving Beyond the Turing Test with the Allen AI Science Challenge

    CACM
    September 4, 2017

    Team

    • Peter Clark, Research
    • Sumithra Bhakthavatsalam, Engineering
    • Bhavana Dalvi, Research
    • Michal Guerquin, Engineering
    • Daniel Khashabi, Young Investigator
    • Tushar Khot, Research
    • Kyle Richardson, Research
    • Ashish Sabharwal, Research
    • Carissa Schoenick, Product
    • Oyvind Tafjord, Research
    • Niket Tandon, Research