Videos

See AI2's full collection of videos on our YouTube channel.
Viewing 1-10 of 244 videos
  • Language AI for RNA Virus and RNA Vaccine

    November 29, 2023  |  Liang Huang
    Abstract: Linguistics and biology are two sides of the same coin. This talk features several highly unexpected connections between them which yield efficient algorithms with substantial biological impacts. One such connection (Nature, 2023) is between messenger RNA (mRNA) vaccines and formal language theory…
  • OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text

    November 28, 2023  |  Keiran Paster
    Abstract: There is growing evidence that pretraining on high-quality, carefully thought-out tokens such as code or mathematics plays an important role in improving the reasoning abilities of large language models. For example, Minerva, a PaLM model finetuned on billions of tokens of mathematical documents from…
  • The Worlds I See: Curiosity, Exploration and Discovery at the Dawn of AI

    November 13, 2023  |  Dr. Fei-Fei Li
    Dr. Fei-Fei Li joins us for a fireside chat with Ali. She discusses her latest book, The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI. Bio: Dr. Fei-Fei Li is the inaugural Sequoia Professor in the Computer Science Department at Stanford University, and Co-Director of Stanford’s Human…
  • Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing

    November 2, 2023  |  Tom Sherborne
    Abstract: Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods; however, exploiting few-shot gold data is…
  • On Parameter Efficiency of Neural Language Models

    November 2, 2023  |  Chen Liang
    Abstract: Pre-trained neural language models have demonstrated remarkable generalizability in various downstream tasks, such as natural language understanding and question answering. However, these models have grown to contain hundreds of billions of parameters, making them difficult to deploy in…
  • Benchmarking Compositionality with Formal Languages

    November 1, 2023  |  Josef Valvoda
    Abstract: Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal…
  • Studying Large Language Model Generalization with Influence Functions

    October 31, 2023  |  Roger Grosse
    Abstract: When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the…
  • Modular Language Models

    October 16, 2023  |  Suchin Gururangan
    Conventional language models (LMs) are trained densely: all parameters are updated with respect to all data. We argue that dense training leads to a variety of well-documented issues with LMs, including their prohibitive training cost and unreliable downstream behavior. We then introduce a new class of LMs that…
  • Towards Cost-Efficient Use of Pre-trained Models

    October 10, 2023  |  Alan Ritter
    Abstract: Large language models are leading to many exciting breakthroughs, but this comes at a significant cost in terms of both computational and data labeling expenses. Training state-of-the-art models requires access to high-end GPUs for pre-training and inference, in addition to labeled data for fine-tuning…
  • Reliability and interactive debugging for language models

    October 6, 2023  |  Bhargavi Paranjape
    Abstract: Large language models have permeated our everyday lives and are used in critical decision making scenarios that can affect millions of people. Despite their impressive progress, model deficiencies may result in exacerbating harmful biases or lead to catastrophic failures. In this talk, I discuss several…