Allen Institute for AI

AI2 IRVINE

About

The AI2 Irvine office is a collaboration between AI2 scientists and UC Irvine faculty and students, established in May 2018 on the UC Irvine campus. AI2's mission is to contribute to humanity through high-impact AI research and engineering.


Our Focus

The focus of AI2 Irvine is fundamental, long-term research on getting machines to read and understand text. This includes carefully defining what it means to read, building models that read, and understanding what models are actually doing when they operate on existing datasets. We have built a large number of linguistically-motivated datasets targeted at pushing the boundaries of machine reading.

Our modeling advances aim for interpretable, compositional reasoning over long sequences of text. With both dataset construction and modeling developments, we strive to ensure that models perform well for the right reasons, using state-of-the-art dataset construction and model analysis techniques, some of which we developed ourselves.

The AI2 Irvine team enjoys a close research collaboration with the University of California, Irvine.

Team

AI2 Irvine Members

  • Matt Gardner, Research
  • Sanjay Subramanian, Predoctoral Young Investigator

Interns

  • Nitish Gupta, Intern

Alumni

  • Qiang Ning, Research
  • Orion Weller, Intern
  • Jianming Liu, Intern
  • Eric Wallace, Intern
  • Yizhong Wang, Intern

UC Irvine Collaborators

  • Sameer Singh, Assistant Professor
  • Anthony Chen, PhD Student
  • Dheeru Dua, PhD Student
  • Robert L. Logan IV, PhD Student
Projects

  • AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models | AllenNLP, AI2 Irvine

    The AllenNLP Interpret toolkit makes it easy to apply gradient-based saliency maps and adversarial attacks to new models, as well as to develop new interpretation methods. AllenNLP Interpret contains three components: a suite of interpretation techniques applicable to most models, APIs for developing new interpretation methods (e.g., APIs to obtain input gradients), and reusable front-end components for visualizing the interpretation results.

    Try the demo
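The gradient-based saliency idea the toolkit implements can be shown with a minimal, self-contained sketch. This is not AllenNLP's actual API; the toy logistic-regression classifier, vocabulary, and weights below are invented for illustration of the underlying "simple gradients" technique:

```python
import math

# Toy sentiment classifier: logistic regression over a bag-of-words
# vector. Gradient-based saliency scores each input token by the
# magnitude of the output gradient with respect to it. The vocabulary
# and weights are hypothetical, not trained.

vocab = ["the", "movie", "was", "great", "terrible"]
weights = [0.1, 0.2, -0.1, 2.3, -2.7]  # hypothetical trained weights

def predict(x):
    """Positive-class probability for a bag-of-words vector x."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def saliency(x):
    """Normalized gradient magnitudes for the tokens present in x.

    For logistic regression, dp/dx_i = p * (1 - p) * w_i, so the
    magnitudes rank how strongly each token drives the prediction.
    """
    p = predict(x)
    grads = [abs(p * (1.0 - p) * w) if xi else 0.0
             for w, xi in zip(weights, x)]
    total = sum(grads)
    return [g / total for g in grads]

x = [1, 1, 1, 1, 0]  # "the movie was great"
scores = saliency(x)
top_token = vocab[max(range(len(vocab)), key=lambda i: scores[i])]
# "great" receives the largest saliency: its weight dominates the
# other tokens present in the sentence.
```

Real interpreters compute the same kind of gradient through a full neural model's embedding layer rather than a linear score, and the front-end components visualize the resulting per-token scores as a heat map.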
  • AllenNLP: State-of-the-art open source NLP research library | AllenNLP

    AllenNLP is an open source NLP research library that makes it easy for researchers to design and evaluate new deep learning models for nearly any NLP problem, and makes state-of-the-art implementations of several important NLP models and tools readily available for researchers to use and build upon.

    Try the demo
Publications

  • Easy, Reproducible and Quality-Controlled Data Collection with Crowdaq
    Qiang Ning, Hao Wu, Pradeep Dasigi, Dheeru Dua, Matt Gardner, Robert L. Logan IV, Ana Marasović, Z. Nie. EMNLP Demo, 2020.
    High-quality and large-scale data are key to success for AI systems. However, large-scale data annotation efforts are often confronted with a set of common challenges: (1) designing a user-friendly annotation interface; (2) training enough annotators efficiently; and (3) reproducibility. To address…

  • IIRC: A Dataset of Incomplete Information Reading Comprehension Questions
    James Ferguson, Matt Gardner, Hannaneh Hajishirzi, Tushar Khot, Pradeep Dasigi. EMNLP, 2020.
    Humans often have to read multiple documents to address their information needs. However, most existing reading comprehension (RC) tasks only focus on questions for which the contexts provide all the information required to answer them, thus not evaluating a system's performance at identifying a…

  • Improving Compositional Generalization in Semantic Parsing
    Inbar Oren, Jonathan Herzig, Nitish Gupta, Matt Gardner, Jonathan Berant. Findings of EMNLP, 2020.
    Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently. Specifically, compositional generalization, i.e., whether a model generalizes to new structures built of components observed during training, has sparked substantial interest. In this work, we…

  • Learning from Task Descriptions
    Orion Weller, Nick Lourie, Matt Gardner, Matthew Peters. EMNLP, 2020.

  • MedICaT: A Dataset of Medical Images, Captions, and Textual References
    Sanjay Subramanian, Lucy Lu Wang, Sachin Mehta, Ben Bogin, Madeleine van Zuylen, Sravanthi Parasa, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi. Findings of EMNLP, 2020.
    Understanding the relationship between figures and text is key to scientific document understanding. Medical figures in particular are quite complex, often consisting of several subfigures (75% of figures in our dataset), with detailed text describing their content. Previous work studying figures…

Datasets

    ZEST: ZEroShot learning from Task descriptions

    ZEST is a benchmark for zero-shot generalization to unseen NLP tasks, with 25K labeled instances across 1,251 different tasks.

    ZEST tests whether NLP systems can perform unseen tasks in a zero-shot way, given a natural language description of the task. It is an instantiation of our proposed framework "learning from task descriptions". The tasks include classification, typed entity extraction and relationship extraction, and each task is paired with 20 different annotated (input, output) examples. ZEST's structure allows us to systematically test whether models can generalize in five different ways.
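The "task description plus annotated examples" structure can be sketched concretely. The field names and example text below are illustrative assumptions, not ZEST's exact schema:

```python
# A sketch of a ZEST-style instance: a natural-language task
# description paired with annotated (input, output) examples.
# Field names and text are invented for illustration.
zest_task = {
    "task_description": "Is this national park a good place to see birds?",
    "examples": [
        {"input": "Everglades National Park is famous for its wading birds.",
         "output": "Yes"},
        {"input": "This park is mostly known for its sand dunes.",
         "output": "No"},
    ],
}

def zero_shot_prompt(task, new_input):
    """Pair the task description with an unseen input, the way a
    zero-shot system consumes a ZEST task at evaluation time."""
    return f"Task: {task['task_description']}\nInput: {new_input}\nOutput:"

prompt = zero_shot_prompt(zest_task, "Badlands National Park is known for fossils.")
```

The key point is that the system never trains on this task: it must produce the output from the description and the new input alone, while the annotated examples are used only to score its predictions.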

    Quoref

    24K QA pairs over 4.7K paragraphs, split between train (19K QAs), development (2.4K QAs) and a hidden test partition (2.5K QAs).

    Quoref is a QA dataset which tests the coreferential reasoning capability of reading comprehension systems. In this span-selection benchmark containing 24K questions over 4.7K paragraphs from Wikipedia, a system must resolve hard coreferences before selecting the appropriate span(s) in the paragraphs for answering questions.

    ROPES

    14K QA pairs over 1.7K paragraphs, split between train (10K QAs), development (1.6K QAs) and a hidden test partition (1.7K QAs).

    ROPES is a QA dataset which tests a system's ability to apply knowledge from a passage of text to a new situation. A system is presented with a background passage containing one or more causal or qualitative relations, a novel situation that uses this background, and questions that require reasoning about the effects of the relationships in the background passage in the context of the situation.

    DROP

    The DROP dataset contains 96K QA pairs over 6.7K paragraphs, split between train (77K QAs), development (9.5K QAs) and a hidden test partition (9.5K QAs).

    DROP is a QA dataset that tests comprehensive understanding of paragraphs. In this crowdsourced, adversarially-created, 96K-question benchmark, a system must resolve multiple references in a question, map them onto a paragraph, and perform discrete operations over them (such as addition, counting, or sorting).
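The extract-then-compute pattern behind DROP's discrete operations can be illustrated with a small sketch. The passage and question below are made up for illustration and are not drawn from the dataset:

```python
import re

# A DROP-style question requires discrete operations over numbers
# mentioned in a paragraph: here, finding the numeric mentions and
# subtracting one from another.

passage = ("The Bears scored two touchdowns in the first quarter and "
           "kicked field goals of 23 and 41 yards in the second.")

def extract_numbers(text):
    """Collect the numeric mentions a system must operate over
    (digit spans only here; real systems also parse number words)."""
    return [int(m) for m in re.findall(r"\d+", text)]

def longest_minus_shortest(text):
    """Example discrete operation: how many yards longer was the
    longest field goal than the shortest?"""
    nums = extract_numbers(text)
    return max(nums) - min(nums)
```

A DROP system must additionally decide which operation the question calls for and which references in the paragraph the question's phrases map onto, which is what makes the benchmark hard.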

News

    ICS Partnership with AI2 Leads to a New Toolkit and Best Demo Paper Award

    UCI Dept. of Computer Science
    November 19, 2019
    Read the Article

    AI/NLP Research Partnership with Allen Institute for AI

    UCI CML
    September 30, 2019
    Read the Article

    ICS Partnership with Allen Institute for AI Advances Machine Learning

    UCI Dept. of Computer Science
    April 24, 2019
    Read the Article