Llama 3.3 🦙

I’m back with the last release of the year - an early Christmas gift for the community. We’re releasing an updated 70B model, which we know is the workhorse of the OSS community, but now as capable as our 405B. We’ve been refining our post-training recipe, introducing new online RL techniques that pushed on domains like math and reasoning. The minute we saw this model almost reach parity with 405B, we thought it would be great to share with everyone. It can now act as a powerful synthetic data generator or teacher for all your distillation needs.

On a personal note, this has been an exciting year for the Llama organization. Five community moments (4 Llama and Movie Gen), combined with lots of product updates, have kept us busy through 2024. 2025 will be the year of Llama 4.

I’ll end by saying we’re hiring - Directors and Research Scientists. Ping me and join us on this journey!

Download from Meta ➡️ https://lnkd.in/gPK9QzxM
Roshan Sumbaly, you mentioned the introduction of new online RL techniques that enhanced domains like math and reasoning. Can you elaborate on the specific changes in the RL approach and how they differ from traditional methods in improving model performance?
Can you help me get my hacked Facebook account back? I was hacked on 11-26-2024.
Well done, Duccio! You're really great!
Is there a reference for the "new online RL techniques"?