Latest research
February 25, 2026
PreScience: Forecasting the future of science end-to-end
PreScience is a new benchmark that evaluates whether AI can forecast how science unfolds end-to-end, from team formation through eventual impact.
February 13, 2026
Olmix: A framework for data mixing throughout LM development
Olmix is a framework for language model data mixing that provides empirically grounded defaults and efficient reuse techniques.
February 12, 2026
Introducing AutoDiscovery: Automated scientific discovery, now in AstaLabs
AutoDiscovery explores data autonomously, generating its own hypotheses to surface surprising findings that researchers might never have thought to look for.
February 12, 2026
How researchers are using AutoDiscovery
Learn how researchers are using AutoDiscovery, our scientific discovery tool, to make a transformative impact across their fields.
February 11, 2026
MolmoSpaces, an open ecosystem for embodied AI
MolmoSpaces is our new open platform for embodied AI that provides physics-grounded scenes, objects, and grasp annotations to train and evaluate generalist robotic policies.
February 10, 2026
How2Everything: Mining the web to evaluate and improve LLMs on real-world procedures
How2Everything is an open framework for evaluating and improving how well LLMs generate step-by-step procedures.
February 4, 2026
Now in Nature: Synthesizing scientific literature with retrieval-augmented LMs
We're excited to share that our paper “Synthesizing scientific literature with retrieval-augmented language models” has been accepted to Nature.
January 28, 2026
Theorizer: Turning thousands of papers into scientific laws
Theorizer is a system that automatically reads scientific literature and synthesizes structured, testable theories.
January 27, 2026
Open Coding Agents: Fast, accessible coding agents that adapt to any repo
SERA is the first in our family of Open Coding Agents, achieving state-of-the-art performance at low cost.