Release date: March 13, 2025
What's new
The OLMo 2 family is growing – today we introduce OLMo 2 32B. This model is the largest and most capable model in the OLMo 2 family, scaling up the OLMo 2 training recipe used for our 7B and 13B models released in November. Trained on up to 6T tokens and post-trained using Tülu 3.1, it is the first fully-open model to outperform GPT-3.5 Turbo and GPT-4o mini on a suite of popular, multi-skill academic benchmarks. It is comparable to the leading open-weight models while requiring only a fraction of their training compute.
In short, OLMo 2 32B is efficient, performant, and accurate. Read our technical blog for details, and get your hands on OLMo 2 32B (artifact links below).
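If you just want to try the model, a minimal sketch using Hugging Face transformers is below. The model identifier is an assumption based on the OLMo 2 collection's naming; check the HuggingFace collection linked in the artifacts for the exact id.

```python
# Minimal sketch: load and sample from OLMo 2 32B via Hugging Face transformers.
# The model id below is an assumption -- confirm the exact name in the OLMo 2 collection.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0325-32B"  # assumed id; see the HuggingFace collection

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" shards the 32B model across available GPUs (requires accelerate).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

inputs = tokenizer("Language modeling is ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```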
Key improvements include:
• Improved data and efficient pretraining: OLMo 2 32B was built using our previously released resource-efficient pretraining mix. We have optimized our pretraining infrastructure (OLMo-core) for greater efficiency and developed modeling and data techniques that maximize downstream performance per unit of computation.
• Improved post-training and RLVR: Our models incorporate our latest advances in reinforcement learning with verifiable rewards (RLVR), now using Group Relative Policy Optimization (GRPO) as part of the Tülu 3 recipe, further enhancing their capabilities.
• Training infrastructure: OLMo 2 32B was trained on “Augusta”, a 160-node AI Hypercomputer provided by Google Cloud Engine. Each node has 8 H100 GPUs, and the nodes are connected with GPUDirect-TCPXO interconnect. Over the course of the training run, we reached a throughput of over 1800 tokens per second per GPU, equivalent to about 38% MFU (a rough back-of-the-envelope check follows below).
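As a sanity check on the throughput number above, here is a rough MFU estimate. It uses the common ~6 FLOPs-per-parameter-per-token approximation (which ignores attention FLOPs) and an assumed H100 BF16 dense peak of ~989 TFLOP/s, so it slightly underestimates the reported 38%; both constants are assumptions, not numbers from this release.

```python
# Rough MFU estimate from tokens/sec/GPU. Constants here are assumptions:
# ~6*N FLOPs per token (ignores attention terms) and ~989 TFLOP/s H100 BF16 dense peak.
params = 32e9                                   # OLMo 2 32B parameter count
tokens_per_sec_per_gpu = 1800                   # reported training throughput
flops_per_token = 6 * params                    # ~1.9e11 FLOPs per token (approximate)
achieved_flops = tokens_per_sec_per_gpu * flops_per_token   # ~3.5e14 FLOP/s per GPU
peak_flops = 989e12                             # assumed H100 BF16 dense peak

mfu = achieved_flops / peak_flops
print(f"approximate MFU: {mfu:.0%}")            # ~35%, in the ballpark of the reported ~38%
```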
Artifacts:
• OLMo 2 HuggingFace Collection
• Pretraining dataset: OLMo-mix-1124
• Mid-training dataset: Dolmino-mix-1124
• Post-training dataset: Tülu 3 SFT Mix (updated)
• Preference data for OLMo 2 32B
• RLVR mix
Release date: November 26, 2024
What's new
OLMo 2 introduces a new family of 7B and 13B models trained on up to 5T tokens, representing the best fully-open language models to date. These models sit at the Pareto frontier of performance and training efficiency, with OLMo 2 7B outperforming Llama 3.1 8B and OLMo 2 13B outperforming Qwen 2.5 7B despite lower total training FLOPs. Check out the artifacts linked below, and read the blog.
Key improvements include:
• Enhanced architecture with RMSNorm, QK-Norm, an auxiliary Z-loss, and rotary positional embeddings (a minimal sketch of two of these changes follows this list)
• Two-stage curriculum training using OLMo-mix-1124 and Dolmino-mix-1124, with model souping to produce the final checkpoints
• State-of-the-art post-training methodology from Tülu 3
• Evaluated on the OLMES suite
• The Instruct variants are competitive with the best open-weight models, with OLMo 2 13B Instruct outperforming the Qwen 2.5 14B Instruct, Tülu 3 8B, and Llama 3.1 8B Instruct models.
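To make the architecture bullets above concrete, here is a minimal PyTorch-style sketch of two of the listed changes: QK-Norm (RMSNorm applied to queries and keys before attention) and the auxiliary Z-loss term added to the cross-entropy objective. Shapes, the loss coefficient, and the module layout are illustrative assumptions, not the exact OLMo 2 implementation.

```python
# Illustrative sketch of QK-Norm and an auxiliary z-loss (not the exact OLMo 2 code).
import torch
import torch.nn.functional as F
from torch import nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        # Normalize by root-mean-square instead of mean/variance (no centering).
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

def qk_norm_attention(q, k, v, q_norm: RMSNorm, k_norm: RMSNorm):
    # QK-Norm: normalize queries and keys (per head dimension) before dot-product
    # attention, which keeps attention logits in a stable range during training.
    q, k = q_norm(q), k_norm(k)
    return F.scaled_dot_product_attention(q, k, v)

def loss_with_z_loss(logits, targets, z_coef: float = 1e-4):
    # Auxiliary z-loss penalizes large log-partition values log(sum(exp(logits))),
    # discouraging output logits from drifting to extreme magnitudes.
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    z = torch.logsumexp(logits, dim=-1)
    return ce + z_coef * (z ** 2).mean()
```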
Artifacts:
• Demo
• Pretraining dataset stage 1: OLMo-mix-1124
• Pretraining dataset stage 2: Dolmino-mix-1124
• Post-training dataset: Tülu 3 SFT Mix
• Preference data for OLMo 2 7B
• Preference data for OLMo 2 13B
• RLVR mix
Release date: September 3, 2024
What's new
• OLMoE is the first good Mixture-of-Experts LLM that is 100% open-source. The model has 1B active parameters and 7B total parameters, and it is trained on a total of 5T tokens. Performance-wise, OLMoE is the state of the art among models with a similar cost of roughly 1B active parameters. It even beats a number of larger models, such as Gemma2, Llama2 13B Chat, OLMo-7B-0724, and DeepSeekMoE 16B, on common benchmarks like MMLU and AlpacaEval. Check out the main links below and read the blog announcement for more details.
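For readers unfamiliar with the active-vs-total-parameter distinction, the toy top-k Mixture-of-Experts layer below illustrates it: every expert's weights count toward the total parameters, but each token only runs through the k experts its router selects, so the "active" compute is a small fraction of the total. Expert count, top-k, and dimensions here are illustrative assumptions, not OLMoE's exact configuration.

```python
# Toy top-k MoE layer showing "active" vs. "total" parameters (not OLMoE's exact config).
import torch
from torch import nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        # The router scores every expert, but each token is sent to only the top-k of them.
        gates = self.router(x).softmax(dim=-1)
        topk_gates, topk_idx = gates.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e
                if mask.any():
                    out[mask] += topk_gates[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(4, 512)
layer = TopKMoE()
print(layer(x).shape)  # torch.Size([4, 512]); only 2 of the 8 experts ran per token
```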
Main links:
• OLMoE fine-tuning dataset: tulu-v3.1-mix-preview-4096-OLMoE
• OLMoE preference dataset: ultrafeedback_binarized_cleaned
Release date: July 31, 2024
📌 A quick note on naming: we have updated the OLMo naming convention to the following format: model name, model version (whole numbers only), model parameter count, followed by the month and year of release. This naming structure makes updates easier to track over time and is scalable, so we can have infinite OLMos! As an example, OLMo v1.7 7B is now OLMo 7B April 2024, and today’s released models adhere to this updated naming convention.
What's new
• Improvements: OLMo 1B July 2024 shows a 4.4-point increase on HellaSwag, among other evaluation improvements, thanks to an improved version of the Dolma dataset and staged training. OLMo 7B July 2024 also leverages the newest version of the Dolma dataset and is trained with a two-stage curriculum; the second stage consistently adds 2-3 points of performance improvement. The OLMo July 2024 SFT and Instruct models apply the Tülu 2 recipe to OLMo 7B July 2024 and are generally more capable than OLMo 7B April 2024 and the original OLMo 7B.
Main links:
Release date: April 17, 2024
What's new
• Improvements: OLMo 7B April 2024 (previously known as OLMo 1.7-7B) has a longer context length, up from 2048 to 4096 tokens, and is trained on the new Dolma 1.7 dataset. Thanks to the improved Dolma dataset, this model scores 52 on MMLU, sitting above Llama 2 7B and approaching Llama 2 13B, and it outperforms Llama 2 13B on GSM8K.
Main links:
Release date: February 1, 2024
What's new
Announcing OLMo, Ai2’s first Open Language Model. The Ai2 LLM framework is intentionally designed to provide access to the data, training code, models, and evaluation code necessary to advance AI through open research, empowering academics and researchers to study the science of language models collectively. This first batch of OLMo models includes four variants of our language model at the 7B scale, corresponding to different architectures, optimizers, and training hardware, as well as one model at the 1B scale. All variants are trained on at least 2T tokens.
Main links:
• OLMo 1B
• OLMo 7B