Papers

  • SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding

    Favyen Bastani, Piper Wolters, Ritwik Gupta, Joe Ferdinando, Aniruddha Kembhavi. ICCV 2023. Remote sensing images are useful for a wide variety of planet monitoring applications, from tracking deforestation to tackling illegal fishing. The Earth is extremely diverse -- the amount of potential tasks in remote sensing images is massive, and the sizes…
  • Phone2Proc: Bringing Robust Robots Into Our Chaotic World

    Matt Deitke, Rose Hendrix, Luca Weihs, Ali Farhadi, Kiana Ehsani, Aniruddha Kembhavi. CVPR 2023. Training embodied agents in simulation has become mainstream for the embodied AI community. However, these agents often struggle when deployed in the physical world due to their inability to generalize to real-world environments. In this paper, we present…
  • Visual Programming: Compositional visual reasoning without training

    Tanmay Gupta, Aniruddha Kembhavi. CVPR 2023. We present VISPROG, a neuro-symbolic approach to solving complex and compositional visual tasks given natural language instructions. VISPROG avoids the need for any task-specific training. Instead, it uses the in-context learning ability of large language…
  • EXCALIBUR: Encouraging and Evaluating Embodied Exploration

    Hao Zhu, Raghav Kapoor, So Yeon Min, Winson Han, Jiatai Li, Kaiwen Geng, Graham Neubig, Yonatan Bisk, Aniruddha Kembhavi, Luca Weihs. CVPR 2023. Experience precedes understanding. Humans constantly explore and learn about their environment out of curiosity, gather information, and update their models of the world. On the other hand, machines are either trained to learn passively from static and fixed…
  • Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

    Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi. ICLR 2023. We tackle the problem of aligning pre-trained large language models (LMs) with human preferences. If we view text generation as a sequential decision-making problem, reinforcement learning (RL) appears to be a natural conceptual framework. However, using RL…
  • Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics

    Kuo-Hao Zeng, Luca Weihs, Roozbeh Mottaghi, Ali Farhadi. ICLR 2023. A common assumption when training embodied agents is that the impact of taking an action is stable; for instance, executing the "move ahead" action will always move the agent forward by a fixed distance, perhaps with some small amount of actuator-induced noise…
  • Ask4Help: Learning to Leverage an Expert for Embodied Tasks

    Kunal Pratap Singh, Luca Weihs, Alvaro Herrasti, Jonghyun Choi, Aniruddha Kembhavi, Roozbeh Mottaghi. arXiv 2022. Embodied AI agents continue to become more capable every year with the advent of new models, environments, and benchmarks, but are still far away from being performant and reliable enough to be deployed in real, user-facing applications. In this paper, we…
  • ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

    Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Jordi Salvador, Kiana Ehsani, Winson Han, Eric Kolve, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi. NeurIPS 2022. Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories in Embodied AI. We propose ProcTHOR, a framework for…
  • I Can't Believe There's No Images! Learning Visual Tasks Using only Language Data

    Sophia Gu, Christopher Clark, Aniruddha Kembhavi. arXiv 2022. Many high-level skills that are required for computer vision tasks, such as parsing questions, comparing and contrasting semantics, and writing descriptions, are also required in other domains such as natural language processing. In this paper, we ask…
  • Webly Supervised Concept Expansion for General Purpose Vision Models

    Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi. ECCV 2022. General purpose vision (GPV) systems [25] are models that are designed to solve a wide array of visual tasks without requiring architectural changes. Today, GPVs primarily learn both skills and concepts from large fully supervised datasets. Scaling GPVs to…