David Pereira’s Post

Europe & Latam Lead - Data & AI

NVIDIA outperforms GPT-4o and Sonnet 3.5

NVIDIA has released a fine-tuned version of Llama 3.1 70B that outperforms both OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on multiple benchmarks. The model was trained using Reinforcement Learning from Human Feedback (RLHF), with Llama-3.1-Nemotron-70B-Reward and HelpSteer2-Preference prompts, starting from a Llama-3.1-70B-Instruct model as the initial policy.

The NVIDIA models are now available worldwide on Hugging Face under the Llama 3.1 Community License. According to the model card there, the Llama-3.1-Nemotron-70B-Reward-HF model is "#1 on all three automatic alignment benchmarks, edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet" as of October 1st.

🔗 Hugging Face repository: https://lnkd.in/dqexeux6
🔗 Paper: https://lnkd.in/db7YfEDP

Andy Manson

Senior Consultant - Business Solutions & Architecture

2mo

Yep - the progress on Llama, including by Andrej Karpathy, is tremendous - and it's encouraging a lot of others to rethink! I know of at least one team who think they can go even faster...
