As we set our goals and make our predictions for the new year, plenty of experts are anticipating what 2023 will mean for artificial intelligence. At AI2, Semantic Scholar is looking into even more personalized features for scholars, Aristo is investigating how to repair bad chains of reasoning in AI models, and the AllenNLP team wants to keep lowering the barriers to model usage through greater efficiency. Stay tuned for all that and a lot more in what promises to be another incredible year for AI.
Best Long Paper Award at EMNLP’22
Researchers including Mosaic team Young Investigator Alane Suhr received the Best Long Paper Award at EMNLP 2022 for their paper "Abstract Visual Reasoning with Tangram Shapes." The paper introduces KiloGram, a richly annotated dataset that is orders of magnitude larger and more diverse than previous resources. KiloGram is an exciting new way to evaluate the abstract visual reasoning capacities of recent multi-modal models.
The PRIOR team announced AI2-THOR v5.0, introducing the new ArchitecTHOR, a collection of 10 high-quality, house-sized environments hand-designed by 3D artists. This new collection allows researchers to work with more complex, realistic environments for testing AI algorithms.
2022 was a year of growth, change, and innovation at AI2. Our researchers were honored with seven prestigious paper awards across several conferences. Our common sense AI research lead Yejin Choi was honored with a MacArthur Foundation fellowship. Our founding CEO Oren Etzioni stepped down after nearly nine years leading the institute, joining the newly independent AI2 Incubator as a technical director.
Our teams all produced exciting new work that we will continue to build on in 2023. Here are a few key highlights:
• We're working to empower AI practitioners to measure and mitigate AI's carbon emissions
• The new AI2 Tango allows users to build machine learning experiments out of steps that can be reused and repeated
• MERLOT Reserve is a step toward helping AI understand the world more as humans do
• Unified-IO was unveiled as a new general-purpose model with unprecedented breadth
• Training your virtual robot got way, way better with ProcTHOR
• EarthRanger held its biggest user conference yet with participants from 35 countries and 5 continents
• Skylight doubled the number of countries using the platform: today, 68 countries use Skylight to track vessels and manage illegal, unreported, and unregulated (IUU) fishing
Līla: A Unified Benchmark for Mathematical Reasoning
Members of the Aristo team presented the first benchmark for comprehensive evaluation of the mathematical reasoning abilities of AI systems at EMNLP 2022.
Let’s face it: understanding new scientific concepts can be challenging. The ACCoRD system takes advantage of the many ways a concept is talked about across the scientific literature to produce diverse descriptions of concepts.