Ai2 blog

August 2025 - MolmoAct: An Action Reasoning Model that reasons in 3D space

MolmoAct is the first model able to “think” in three dimensions, trained efficiently and delivering benchmark-topping performance.

August 2025 - NSF and NVIDIA award Ai2 a combined $152M to support building a national-level, fully open AI ecosystem

Ai2 has been awarded a combined $152 million from the U.S. National Science Foundation (NSF) and NVIDIA as part of a jointly funded project to advance our research and develop truly open AI models and solutions that will accelerate scientific discovery.

July 2025 - Introducing FlexOlmo: a new paradigm for language model training and data collaboration

Explore how FlexOlmo enables collaborative language model training without sacrificing data privacy or control, introducing a new, flexible approach to building shared AI models.

August 2025 - Signal and Noise: Reducing uncertainty in language model evaluation

We find that two simple metrics, signal and noise, reveal key differences in the utility of current LLM benchmarks.

August 2025 - MoNaCo: More natural questions for reasoning across dozens of documents

Introducing MoNaCo, a benchmark of highly challenging questions spanning dozens of documents for evaluating large…

July 2025 - Contextualized Evaluations: Judging language model responses to underspecified queries

How do we evaluate LLMs on underspecified queries? We show that adding clarifying context flips model rankings…

July 2025 - AutoDS: A prototype engine for autonomous, open-ended scientific discovery

AutoDS goes beyond standard data crunching by building upon its own findings and uncovering insights that may not…

July 2025 - SciArena: A new platform for evaluating foundation models in scientific literature tasks

Discover how SciArena is being used to evaluate foundation models’ capabilities in scientific literature tasks…

June 2025 - OMEGA: Can LLMs reason outside the box in math?

Discover how OMEGA is being used to evaluate large language models' ability to generalize in math through…

June 2025 - New applications of the Ai2 Climate Emulator (ACE) by the international climate modeling community

Learn how ACE is being used for seasonal forecasts and understanding decadal variations in global warming.

June 2025 - Revisiting critical batch size for large-batch OLMo pretraining

We introduce a more reliable method to measure the critical batch size (CBS), analyze how CBS changes over…

April 2025 - Introducing Atlantes: the first AI-powered GPS model for real-time global scale maritime intelligence

Atlantes: a system of transformers for real-time GPS modeling.
