
Open by design: Ai2 brings fully open AI infrastructure online with NSF OMAI

May 7, 2026

Ai2


Open source drives scientific discovery. The ability to reproduce experiments, inspect methods, and build on shared work has always been central to how research advances — and in AI, open infrastructure is what keeps that engine running. Without access to model weights, training data, and methods, researchers are left studying a black box. Today, NSF, NVIDIA, and Ai2 are changing that.

Last year, Ai2 was awarded $152 million from the U.S. National Science Foundation (NSF) and NVIDIA to build the Open Multimodal AI Infrastructure for Science (NSF OMAI). Today, that investment is becoming operational.

With the deployment of NVIDIA Blackwell Ultra-powered systems, Ai2 is bringing new compute online, and with it, a different model for how that compute is used. Instead of powering a single proprietary system, this infrastructure will support a fully open ecosystem where each training run can be reused, extended, and built on by others. Ai2’s research is designed to maximize the return on this infrastructure by making its models, tools, and the processes behind them fully open, expanding access and increasing the impact of every GPU hour.

"At a time when access to advanced AI systems is increasingly concentrated among a small number of companies, bringing this hardware infrastructure online represents a critical step for us. NSF OMAI represents a national investment in open infrastructure that has turned into real, usable compute that benefits a broader ecosystem of researchers. Our goal is to accelerate a truly open technology ecosystem with broad impact, developing fully-open AI systems, resources, and tools that strengthen AI research and support continued U.S. leadership in the field." — Noah A. Smith, Principal Investigator, NSF OMAI, and Senior Research Director, Ai2

From infrastructure to impact 

In closed systems, substantial compute is spent on experiments, iteration, and intermediate results that never leave the organization—often yielding only a single final product, an AI model that’s used for commercial purposes. If that same compute is instead spent on generating an open artifact, it continues to generate value long after training ends. The data, checkpoints, methods, and even the final model itself can get picked up and adapted across many downstream applications, and other labs can avoid repeating costly experiments to learn lessons that have been thoroughly documented.

Recent internal research from Ai2 estimates that, in some cases, 82% of a training effort goes toward exploratory work rather than the final model. When this work is shared, each GPU hour contributes not just to one release but to a growing body of work the entire field can draw on. The result is a multiplier effect: the same resources support more ideas, applications, and progress over time.

The new cluster reflects this philosophy. Built on NVIDIA B300 systems, the cluster prioritizes how effectively its capacity is used and shared, rather than relying on raw scale. Deployed and managed in partnership with Cirrascale Cloud Services, it supports both large-scale training and ongoing experimentation across language, multimodal, and scientific domains.

In that sense, return on investment in AI infrastructure can’t be measured solely in outputs, but rather in how much innovation it enables. By focusing on openness and reuse, NSF OMAI is designed to deliver outsized impact without the footprint required by large, closed deployments.

"NSF OMAI reflects our commitment to ensuring that advanced AI infrastructure supports the broader research community. By investing in open, shared resources, we are enabling scientists and researchers across disciplines to build, test, reproduce, validate, and advance AI systems. This work accelerates discovery, strengthens scientific rigor through replicability and transparency, and reinforces U.S. leadership in the field." — Wendy Nilsen, Deputy Directorate Head, NSF Computer and Information Science and Engineering Directorate

"Scientific advancement requires accessible infrastructure to scale AI research and ensure its benefits are widely distributed across a global community. By executing the building of the NSF Open Multimodal AI Infrastructure project cluster on NVIDIA Blackwell Ultra, Ai2 is creating a highly efficient, open ecosystem that maximizes the impact of every compute hour." — Jack Wells, Director of Higher Education and Research Computing, NVIDIA

Looking ahead: powering the future of open source AI

As the cluster comes online, Ai2's work across language and multimodal disciplines is converging, reflecting a greater focus on unified architectures that handle multiple data types and tasks natively. And Ai2 is continuing to invest in models that act as agents – ones that can plan, use tools, and act autonomously in complex environments. Some of this work has already come to light, like its Open Coding Agents family, MolmoWeb, and ongoing research into how training strategies and environments shape reliable agentic behavior. 

Alongside this, Ai2 is investing in improving its infrastructure for training and evaluation—ensuring the systems used to build and benchmark models can scale with the research. 

As part of this work, Olmo team researchers are conducting outreach to science communities to make sure the next generations of models from Ai2 are genuinely useful to those communities, adding a new layer on top of the foundational and general-purpose model work Ai2 has long pursued. Across these projects, the direction is not toward isolated systems, but toward open, reusable building blocks that can support both general AI research and scientific discovery.

"As a member of the original Olmo team, I’m excited to be back at Ai2 at this pivotal moment as we continue to advance fully open AI for science. There's a striking symmetry between what I see today and how things felt at the start of the Olmo project—disparate threads of research waiting to be woven together. We’re just scratching the surface of the research we plan to do in areas like novel model architectures, natively multimodal models, scaled pretraining, and RL." — Iz Beltagy, Research Lead, Olmo Team, Ai2

Research supported through NSF OMAI is already producing results: 

  • Molmo 2 introduced video understanding, pointing, and object tracking to Ai2's multimodal model family, with an 8B-parameter model surpassing the original 72B Molmo on key benchmarks, alongside nine new datasets covering tasks such as advanced video grounding, multi-image grounding, ultra-dense video captioning, and free-form video question answering, all released under a permissive license. 
  • MolmoPoint followed with a new pointing architecture that replaces text-coordinate outputs with a token-based grounding mechanism tied directly to the model's visual features, achieving state-of-the-art accuracy on spatial reasoning tasks. 
  • On the language modeling side, Olmo Hybrid combined transformer attention with linear RNN layers in a new architecture that matches prior models while using significantly less training data, roughly twice the data efficiency in some cases. 
  • In agentic AI, work on meta-reinforcement learning with self-reflection is advancing how search agents learn from prior attempts, using cross-episode reflection to improve exploration without relying on external reward models.

Together, these projects illustrate the breadth of research that NSF OMAI is helping accelerate across Ai2's language modeling programs—producing not just models, but open artifacts that other teams can inspect, adapt, and build on. 

[Olmo 3 Model Flow diagram: Pretraining → Midtraining → Long context → Olmo 3 Base, branching into Instruct SFT → Instruct DPO → Instruct RL → Olmo 3 Instruct; Thinking SFT → Thinking DPO → Thinking RL → Olmo 3 Think; and RL Zero → Olmo 3 RL Zero]

The model flow above traces the Olmo 3 family from pretraining and midtraining through long-context training and into post-training branches for the Instruct, Think, and RL Zero variants, with artifacts available for download at each stage.

The benefit to the scientific community is not theoretical. In the years since Olmo's release, it’s become a foundational part of what makes studying the science of language models possible for researchers across the country.  

"Olmo uniquely enabled [my research] work because it is fully open source… Olmo offers an open and credible platform where the community can investigate fundamental questions about reasoning, bias, fairness, and trustworthiness. In a landscape dominated by commercially optimized systems, Olmo stands out by empowering deeper understanding rather than just application." — Yuan He, Former Research Associate, University of Oxford, and first author of “Supposedly Equivalent Facts That Aren't? Entity Frequency in Pre-training Induces Asymmetry in LLMs”

NSF OMAI began as a national investment in the idea that frontier AI development doesn't have to be closed. With new compute now operational and research already in use beyond Ai2, the project is entering its next phase—one where the work produced on open infrastructure can compound in ways that private efforts, by design, cannot.


About NSF OMAI

In 2025, Ai2 was awarded a cooperative agreement through the U.S. National Science Foundation’s Mid-Scale Research Infrastructure program, combined with a $77M investment from NVIDIA, to form the NSF Mid-Scale RI-2: Open Multimodal AI Infrastructure to Accelerate Science (NSF OMAI). NSF OMAI aims to empower the AI community to inspect, reproduce, and innovate, while transforming scientific discovery across disciplines. Ai2 leads NSF OMAI alongside co-PIs from the University of Hawai’i Hilo, the University of New Hampshire, the University of New Mexico, and the University of Washington. Learn more from the NSF OMAI project website at www.allenai.org/omai.

This material is based upon work supported by the U.S. National Science Foundation under Cooperative Agreement No. 2413244.