Latest research
April 15, 2025
DataDecide: How to predict best pretraining data with small experiments
Explore the secrets of how language model developers make decisions with DataDecide.
April 9, 2025
Going beyond open data – increasing transparency and trust in language models with OLMoTrace
OLMoTrace lets you trace the outputs of language models back to their full, multi-trillion-token training data in real time.
March 31, 2025
Introducing CodeScientist: A step toward automated scientific discovery
Will there be a system that automatically identifies gaps in scientific knowledge and runs experiments?
March 26, 2025
Introducing Ai2 Paper Finder
Ai2 Paper Finder is an LLM-powered literature search system that mimics the iterative paper-finding process.
March 13, 2025
OLMo 2 32B: First fully open model to outperform GPT 3.5 and GPT 4o mini
Introducing OLMo 2 32B, the most capable and largest model in the OLMo 2 family.
February 25, 2025
olmOCR: Efficient PDF text extraction with vision language models
We introduce olmOCR, a high-performance toolkit designed to convert PDFs and document images into clean, structured plain text.
February 11, 2025
OLMoE, meet iOS
Our mixture-of-experts model is available on the Apple App Store! The OLMoE app allows anyone to test the model privately and securely.
January 30, 2025
Scaling the Tülu 3 post-training recipes to surpass the performance of DeepSeek V3
Meet Tülu 3 405B, the first application of fully open post-training recipes to the largest open-weight models.
January 21, 2025