𝐍𝐕𝐈𝐃𝐈𝐀 𝐨𝐮𝐭𝐩𝐞𝐫𝐟𝐨𝐫𝐦𝐬 𝐆𝐏𝐓4𝐨 𝐚𝐧𝐝 𝐒𝐨𝐧𝐧𝐞𝐭 3.5
NVIDIA has released a fine-tuned version of Llama 3.1 70B that outperforms both OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on multiple benchmarks. The model was trained with Reinforcement Learning from Human Feedback (RLHF), using the Llama-3.1-Nemotron-70B-Reward model and HelpSteer2-Preference prompts, with Llama-3.1-70B-Instruct as the initial policy. The NVIDIA models are now available worldwide under a Llama 3.1 open-source license on Hugging Face, and according to NVIDIA the Llama-3.1-Nemotron-70B-Reward-HF model is "#1 on all three automatic alignment benchmarks, edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet" as of October 1st.
🔗 Hugging face repository: https://lnkd.in/dqexeux6
🔗 Paper: https://lnkd.in/db7YfEDP
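If you want to try it, a minimal sketch using the Hugging Face transformers library could look like the snippet below. The repo id nvidia/Llama-3.1-Nemotron-70B-Instruct-HF, the bfloat16 dtype and device_map="auto" are assumptions to check against the model card, and a 70B model needs multi-GPU memory (or a quantized/hosted variant).

# Minimal sketch (assumptions noted above): chat with the Nemotron fine-tune via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))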
David Pereira’s Post
More Relevant Posts
-
NVIDIA quietly released a new Llama 3 finetune trained specifically for RAG tasks, and it is really good: it beats even GPT-4 on retrieval-augmented generation benchmarks, and even the 8B-parameter version is almost as good as GPT-4. This is great news for basically everyone, since RAG is one of the most profitable applications of LLMs. https://lnkd.in/dGpZNK2k
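If RAG is new to you, here is a bare-bones sketch of the pattern these benchmarks measure: retrieve the chunks most relevant to a query, paste them into the prompt, and let the model answer only from that context. The word-overlap retriever and sample documents below are purely illustrative; real systems use embeddings plus a vector store, with the RAG-tuned model doing the final generation.

# Bare-bones RAG sketch: toy keyword-overlap retrieval + prompt assembly.
# The resulting prompt would be sent to whichever LLM you actually use.
def retrieve(query, documents, k=2):
    # Toy retriever: rank documents by word overlap with the query.
    q_words = set(query.lower().split())
    ranked = sorted(documents, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query, context_chunks):
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

docs = [
    "The warranty covers manufacturing defects for 24 months.",
    "Returns are accepted within 30 days with the original receipt.",
]
question = "How long is the warranty?"
print(build_prompt(question, retrieve(question, docs)))  # feed this to the model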
-
NVIDIA released their open model Nemotron-4 340B, which comes close to GPT-4 on some benchmarks; 98% of the data used to train its Instruct model is synthetic. The interesting thing is that they are not positioning it as a competitor to other open models like Llama-3, but rather as a tool to help other developers train better models in their own domains. https://lnkd.in/gH-HCqC7
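To make that "tool for other developers" idea concrete, here is a rough sketch of the synthetic-data workflow such a model family targets: prompt an instruct model for new question/answer pairs and keep only the ones that pass a quality filter. The chat() stub and the length-based filter are hypothetical placeholders, not NVIDIA's actual pipeline, which pairs the generator with a dedicated reward model.

# Rough synthetic-data sketch: an instruct model writes Q/A pairs, a filter keeps the good ones.
import json

def chat(prompt: str) -> str:
    # Stub so the sketch runs end to end; replace with a call to your
    # instruct model (e.g. a hosted endpoint or a local LLM).
    return json.dumps([{
        "question": "What is a Pod?",
        "answer": "The smallest deployable unit in Kubernetes, grouping one or more "
                  "containers that share a network namespace and storage.",
    }])

def generate_pairs(topic: str, n: int = 5) -> list[dict]:
    prompt = (
        f"Write {n} diverse question/answer pairs about {topic}. "
        'Return JSON: [{"question": ..., "answer": ...}]'
    )
    return json.loads(chat(prompt))

def keep(pair: dict) -> bool:
    # Placeholder quality filter; a real pipeline would score candidates
    # with a reward model and deduplicate against existing data.
    return len(pair["answer"].split()) > 10

dataset = [p for p in generate_pairs("Kubernetes networking") if keep(p)]
print(dataset)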
-
You can now transcribe 2.5 hours of audio in 98 seconds, locally. A new implementation called insanely-fast-whisper is blowing up on GitHub. It works on Mac or Nvidia GPUs and uses the Whisper + Pyannote libraries to speed up transcription and speaker segmentation. Here's how you can use it:
pip install insanely-fast-whisper
insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>
♻️ Repost this if you found it useful. ↓ Are you technical? Check out https://AlphaSignal.ai to get a daily summary of breakthrough models, repos and papers in AI. Read by 200,000+ devs.
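For those who would rather stay in Python, the CLI is essentially a thin wrapper around the Hugging Face transformers speech-recognition pipeline with chunked, batched inference (the pyannote diarization part is what needs the HF token and is not shown here). A minimal sketch, with an illustrative model id, device and file name:

# Minimal sketch of chunked, batched Whisper inference with transformers.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",
    torch_dtype=torch.float16,
    device="cuda:0",          # use "mps" on Apple Silicon, "cpu" otherwise
)

result = asr(
    "meeting.mp3",            # path or URL to your audio (ffmpeg needed for mp3)
    chunk_length_s=30,        # split long audio into 30-second chunks
    batch_size=8,             # transcribe several chunks in parallel
    return_timestamps=True,
)
print(result["text"])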
-
This looks like it could be very useful, thinking of the thousands of scenarios. Hopefully this can be automated, deployed and scaled beyond local use. Repo: https://lnkd.in/ePmTCfsK
-
NVIDIA released a new Llama 3 finetune trained specifically for RAG tasks. It beats even GPT-4 on retrieval-augmented generation benchmarks, and even the 8B-parameter version is almost as good as GPT-4. This is great news for basically everyone, since RAG is one of the most profitable applications of LLMs. https://lnkd.in/e728hc2T
nvidia/Llama3-ChatQA-1.5-8B · Hugging Face
huggingface.co
-
🚀 Breakthrough in Audio Transcription Speed! 🎙️
Imagine transcribing a feature-length movie in less time than it takes to make a cup of coffee. That's now possible with 'insanely-fast-whisper', a game-changing GitHub project.
Key highlights:
• Transcribe 2.5 hours of audio in just 98 seconds
• Works locally on Mac or Nvidia GPUs
• Combines Whisper + Pyannote for rapid transcription and speaker segmentation
For the tech-savvy, here's a quick setup:
1. pip install insanely-fast-whisper
2. Run with your file and settings
This tool isn't just fast, it's revolutionizing how we process audio data. Think about the implications for:
• Journalists transcribing interviews
• Researchers analyzing focus groups
• Content creators captioning videos
What would you do with local and near-instant transcription? Share your ideas below! 👇
-
pip install insanely-fast-whisper
insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>
-
🚀 NVIDIA launches Llama-3.1-Nemotron-70B! https://lnkd.in/efahSbKW
📊 As of Oct 1, 2024, it's topping the charts:
• Arena Hard: 85.0
• AlpacaEval 2 LC: 57.6
• MT-Bench: 8.98
🔑 Key points:
• Outperforms other models across multiple benchmarks
• Longer responses (avg 2199.8 characters)
• Consistent performance (narrow confidence interval)
𝗛𝗲𝗿𝗲 𝗶𝘀 𝘀𝗼𝗺𝗲 𝗶𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻 𝗮𝗯𝗼𝘂𝘁 𝘁𝗵𝗲 𝗘𝘃𝗮𝗹 𝗺𝗲𝘁𝗿𝗶𝗰𝘀 𝘂𝘀𝗲𝗱
𝑨𝒓𝒆𝒏𝒂-𝑯𝒂𝒓𝒅 is a challenging AI benchmark created from real user queries on Chatbot Arena. It uses the BenchBuilder pipeline to select 500 high-quality, diverse prompts that test language models on complex, real-world tasks across various domains. The process involves:
* Question source: real user queries from Chatbot Arena (initially 200,000)
* Question selection process:
  - The BenchBuilder pipeline filters and evaluates queries
  - An AI judge (GPT-4-Turbo) scores questions on 7 key qualities: specificity, domain knowledge, complexity, problem-solving, creativity, technical accuracy, and real-world relevance
  - Topic modeling clusters similar queries
  - High-quality clusters are selected
  - 500 challenging prompts are sampled for the final benchmark
Evaluation uses an LLM-as-judge approach, comparing model outputs to a baseline (see the sketch below). This method provides a more comprehensive, updatable, and cost-effective evaluation than previous benchmarks, better separating top models and aligning well with human judgments.
Arena-Hard Paper: https://lnkd.in/eh_hK7NK
Github: https://lnkd.in/e9qBbd3E
#NVIDIA #AI #LLM #TechInnovation
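Since Arena-Hard's scores hinge on that LLM-as-judge step, here is a stripped-down sketch of what a pairwise judgment looks like: show the judge model the question plus two answers and aggregate its verdicts into a win rate against the baseline. The prompt wording and the judge() stub are illustrative, not the actual Arena-Hard/BenchBuilder code.

# Stripped-down pairwise LLM-as-judge sketch (illustrative, not the real harness).
def judge(judge_prompt: str) -> str:
    # Stand-in for a call to a strong judge model such as GPT-4-Turbo.
    raise NotImplementedError("plug in your judge-model client here")

def pairwise_verdict(question: str, answer_a: str, answer_b: str) -> str:
    judge_prompt = (
        "You are an impartial judge. Given the user question and two answers, "
        "reply with exactly 'A', 'B', or 'tie' for the better answer.\n\n"
        f"Question: {question}\n\n[Answer A]\n{answer_a}\n\n[Answer B]\n{answer_b}"
    )
    return judge(judge_prompt).strip()

def win_rate(questions, model_answers, baseline_answers) -> float:
    # Fraction of prompts where the evaluated model beats the baseline.
    wins = sum(
        pairwise_verdict(q, a, b) == "A"
        for q, a, b in zip(questions, model_answers, baseline_answers)
    )
    return wins / len(questions)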
-
NVIDIA’s new open-source model, Nemotron-70B, surpasses GPT-4o and Claude 3.5 Sonnet on several benchmarks (Arena Hard, AlpacaEval 2 LC, MT-Bench) despite its relatively small 70B parameter count. Key ingredients are RLHF (Reinforcement Learning from Human Feedback) with the REINFORCE algorithm, a custom reward model (Llama-3.1-Nemotron-70B-Reward) that scores response quality, and the HelpSteer2-Preference prompts that steer training with detailed user feedback, keeping responses helpful and aligned with user preferences. https://lnkd.in/gAjKNwbm
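For readers wondering what "RLHF with the REINFORCE algorithm" means mechanically, here is a toy single policy-gradient step: sample a response from the policy, score it with a reward model, and scale the log-likelihood of the sampled tokens by (reward - baseline). GPT-2 and the word-count reward are stand-ins purely so the snippet runs; the actual recipe applies the same idea to Llama-3.1-70B-Instruct, with the Nemotron reward model scoring responses to HelpSteer2-Preference prompts.

# Toy REINFORCE update for RLHF (illustrative stand-ins, see note above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # tiny stand-in policy
policy = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-5)

def reward_model(text: str) -> float:
    # Dummy reward so the sketch runs; the real recipe scores helpfulness
    # with Llama-3.1-Nemotron-70B-Reward.
    return float(len(text.split()))

prompt = "Explain what a reward model does:"
prompt_ids = tok(prompt, return_tensors="pt").input_ids

# 1) Sample a response from the current policy.
with torch.no_grad():
    full_ids = policy.generate(prompt_ids, do_sample=True, max_new_tokens=40,
                               pad_token_id=tok.eos_token_id)

# 2) Score it and subtract a baseline to reduce variance.
reward = reward_model(tok.decode(full_ids[0, prompt_ids.shape[1]:]))
advantage = reward - 20.0                              # baseline, e.g. a running mean of rewards

# 3) REINFORCE: loss = -(advantage) * sum of log-probs of the sampled response tokens.
logits = policy(full_ids).logits[:, :-1, :]            # position t predicts token t+1
logprobs = torch.log_softmax(logits, dim=-1)
token_logprobs = logprobs.gather(-1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
response_logprob = token_logprobs[:, prompt_ids.shape[1] - 1:].sum()

optimizer.zero_grad()
loss = -advantage * response_logprob
loss.backward()
optimizer.step()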
NVIDIA NIM | llama-3_1-nemotron-51b-instruct
build.nvidia.com
Senior Consultant - Business Solutions & Architecture
Yep - the progress on Llama, including by Andrej Karpathy, is tremendous - and it's encouraging a lot of others to re-think! I know of at least one team who think they can go even faster...