David Pereira’s Post

Europe & Latam Lead - Data & AI

2mo

NVIDIA outperforms GPT-4o and Sonnet 3.5

NVIDIA has released a fine-tuned version of Llama 3.1 70B that outperforms both OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on multiple benchmarks.

The model was trained with Reinforcement Learning from Human Feedback (RLHF), using Llama-3.1-Nemotron-70B-Reward as the reward model and HelpSteer2-Preference prompts, with Llama-3.1-70B-Instruct as the initial policy.

The NVIDIA models are now available worldwide on Hugging Face under the Llama 3.1 Community License. According to the model card, Llama-3.1-Nemotron-70B-Instruct-HF is "#1 on all three automatic alignment benchmarks, edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet" as of October 1st.

🔗 Hugging Face repository: https://lnkd.in/dqexeux6
🔗 Paper: https://lnkd.in/db7YfEDP

1 Comment

Andy Manson

Senior Consultant - Business Solutions & Architecture

2mo

Yep - the progress on Llama, including by Andrej Karpathy, is tremendous - and it's encouraging a lot of others to rethink! I know of at least one team who think they can go even faster...


More Relevant Posts

Andreas Nigg

I write about tips and tricks around AI, LLMs and data
7mo Edited
NVIDIA quietly released a new Llama 3 fine-tune trained specifically for RAG tasks, and it is really good. It beats even GPT-4 on retrieval-augmented generation (RAG) benchmarks, and even the 8B-parameter model is almost as good as GPT-4. This is good news for basically everyone, as RAG is one of the most profitable applications of LLMs. https://lnkd.in/dGpZNK2k
Kim Kuhlman, PhD

B2B Content Marketing Strategist | SEO | AI Expert | Data-driven business growth. Let's connect to elevate your marketing strategy with GenAI solutions. #B2BMarketing #ContentMarketing #SEO #GenAI #DigitalGrowth
6mo
NVIDIA released its open model Nemotron-4 340B, which comes close to GPT-4 on some benchmarks. 98% of the data used to train its Instruct model is synthetic. Interestingly, NVIDIA is not positioning it as a competitor to other open models like Llama 3. Instead, it is positioning it as a tool to help other developers train better models, or more models, in different domains. https://lnkd.in/gH-HCqC7

Nvidia releases free LLMs that match GPT-4 in some benchmarks

the-decoder.com
Lior Sinclair

Covering the latest in AI R&D • ML-Engineer • MIT Lecturer • Building AlphaSignal, a newsletter read by 200,000+ AI engineers.
3mo
You can now transcribe 2.5 hours of audio in 98 seconds, locally. A new implementation called insanely-fast-whisper is blowing up on GitHub. It works on Mac or Nvidia GPUs and uses the Whisper and Pyannote libraries to speed up transcription and speaker segmentation.

Here's how you can use it:

pip install insanely-fast-whisper
insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>

♻️ Repost this if you found it useful.

↓ Are you technical? Check out https://AlphaSignal.ai to get a daily summary of breakthrough models, repos and papers in AI. Read by 200,000+ devs.
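
As a rough sanity check on the speed claim above, the two figures quoted in the post can be turned into a real-time factor. This is only arithmetic on those numbers; the helper name `realtime_factor` is made up for illustration and is not part of insanely-fast-whisper.

```python
def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


audio_seconds = 2.5 * 60 * 60  # 2.5 hours of audio = 9000 seconds
wall_seconds = 98              # transcription time quoted in the post

factor = realtime_factor(audio_seconds, wall_seconds)
print(f"{factor:.1f}x faster than real time")  # prints "91.8x faster than real time"
```

In other words, the quoted numbers work out to roughly 92 times faster than real time. Actual throughput will depend on the hardware, model and batch size used.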

71 Comments
Antonio Bray

Founder, President, Chief Visionary Officer and Chief Technology Officer at AudioOne, Inc
3mo Edited
This looks like it could be very useful. Thinking of the thousands of scenarios. Hopefully this can be automated, deployed and scaled beyond local. Repo: https://lnkd.in/ePmTCfsK

Alexandre BERGERE

Head of Data & AI Engineer, Partners at @DataGalaxy, startup founder, Investor at @Formance ☁️ Delta Lake & openLineage lover
7mo
NVIDIA released a new Llama 3 fine-tune trained specifically for RAG tasks. It beats even GPT-4 on retrieval-augmented generation (RAG) benchmarks, and even the 8B-parameter model is almost as good as GPT-4. This is good news for basically everyone, as RAG is one of the most profitable applications of LLMs. https://lnkd.in/e728hc2T

nvidia/Llama3-ChatQA-1.5-8B · Hugging Face

huggingface.co
⭐Patrice Séjalon

CTO | Transforming risk communication & insurance training with AI | Innovator | Metaverse | Web3 | Founder | ex Société Générale, Crédit Agricole, BNP | Zurich, Silicon Valley
3mo
🚀 Breakthrough in Audio Transcription Speed! 🎙️

Imagine transcribing a feature-length movie in less time than it takes to make a cup of coffee. That's now possible with 'insanely-fast-whisper', a game-changing GitHub project.

Key highlights:
• Transcribe 2.5 hours of audio in just 98 seconds
• Works locally on Mac or Nvidia GPUs
• Combines Whisper + Pyannote for rapid transcription and speaker segmentation

For the tech-savvy, here's a quick setup:
1. pip install insanely-fast-whisper
2. Run with your file and settings

This tool isn't just fast; it's revolutionizing how we process audio data. Think about the implications for:
• Journalists transcribing interviews
• Researchers analyzing focus groups
• Content creators captioning videos

What would you do with local and near-instant transcription? Share your ideas below! 👇

David K.

🚀 LLMs & NLP Innovator | AI & Big Data Engineering Leader | Python Back-end Expert | 15+ Years in Tech | Speaker & Mentor
2mo
pip install insanely-fast-whisper
insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>
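
For anyone who wants to call the command from a script, here is a minimal sketch that assembles the same argument list in Python. It uses only the flags shown in the post; the wrapper `build_command`, the file name `interview.wav` and the token `hf_xxx` are hypothetical placeholders, and treating the token as optional is an assumption.

```python
import shlex


def build_command(file_name, batch_size=2, device_id="mps", hf_token=None):
    """Build the argv list for the CLI call, using only the flags from the post."""
    cmd = [
        "insanely-fast-whisper",
        "--file-name", file_name,
        "--batch-size", str(batch_size),
        "--device-id", device_id,
    ]
    # Assumption: the token is only appended when the caller supplies one.
    if hf_token:
        cmd += ["--hf_token", hf_token]
    return cmd


cmd = build_command("interview.wav", hf_token="hf_xxx")  # placeholder values
print(shlex.join(cmd))

# To actually run it (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list rather than a single shell string avoids quoting problems when the file name or URL contains spaces.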


2 Comments
Sayed Raheel Hussain

ML Engineer | AI Researcher | Generative AI & LLMs | Computer Vision & Data Science
2mo Edited
🚀 NVIDIA launches Llama-3.1-Nemotron-70B! https://lnkd.in/efahSbKW

📊 As of Oct 1, 2024, it's topping the charts:
• Arena Hard: 85.0
• AlpacaEval 2 LC: 57.6
• MT-Bench: 8.98

🔑 Key points:
• Outperforms other models across multiple benchmarks
• Longer responses (avg 2199.8 characters)
• Consistent performance (narrow confidence interval)

Here is some information about the eval metrics used

Arena-Hard is a challenging AI benchmark created from real user queries on Chatbot Arena. It uses the BenchBuilder pipeline to select 500 high-quality, diverse prompts that test language models on complex, real-world tasks across various domains.

The process involves:
• Question source: real user queries from Chatbot Arena (initially 200,000)
• Question selection process:
  - The BenchBuilder pipeline filters and evaluates queries
  - An LLM (GPT-4-Turbo) scores questions on 7 key qualities: specificity, domain knowledge, complexity, problem-solving, creativity, technical accuracy, real-world relevance
  - Topic modeling clusters similar queries
  - High-quality clusters are selected
  - 500 challenging prompts are sampled for the final benchmark

Evaluation uses an LLM-as-judge approach, comparing model outputs to a baseline. This method provides a more comprehensive, updatable, and cost-effective evaluation than previous benchmarks, better separating top models and aligning well with human judgments.

Arena-Hard paper: https://lnkd.in/eh_hK7NK
GitHub: https://lnkd.in/e9qBbd3E

#NVIDIA #AI #LLM #TechInnovation
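
The selection steps described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the BenchBuilder code: the per-quality flags are precomputed by hand instead of coming from a GPT-4-Turbo judge, topic labels stand in for topic modeling, and the cluster threshold `min_cluster_mean` and the sample queries are invented.

```python
import random
from collections import defaultdict
from statistics import mean

# The seven qualities listed in the post.
QUALITIES = [
    "specificity", "domain_knowledge", "complexity", "problem_solving",
    "creativity", "technical_accuracy", "real_world_relevance",
]


def quality_score(flags):
    """Number of the seven qualities a query satisfies (0 to 7)."""
    return sum(1 for name in QUALITIES if flags.get(name, False))


def select_prompts(queries, min_cluster_mean=6.0, sample_size=500, seed=0):
    """Group queries by topic, keep high-scoring clusters, then sample prompts."""
    clusters = defaultdict(list)
    for query in queries:
        clusters[query["topic"]].append(query)

    kept = []
    for members in clusters.values():
        # Assumed rule: keep a cluster when its mean score reaches the threshold.
        if mean(quality_score(m["flags"]) for m in members) >= min_cluster_mean:
            kept.extend(members)

    rng = random.Random(seed)
    return rng.sample(kept, min(sample_size, len(kept)))


# Hypothetical queries; in the real pipeline an LLM judge assigns the flags.
queries = [
    {"text": "Write a thread-safe LRU cache", "topic": "coding",
     "flags": {name: True for name in QUALITIES}},
    {"text": "Debug a deadlock between two SQL transactions", "topic": "coding",
     "flags": {name: True for name in QUALITIES[:6]}},
    {"text": "hi", "topic": "chitchat", "flags": {}},
    {"text": "tell me a joke", "topic": "chitchat", "flags": {"creativity": True}},
]

picked = select_prompts(queries, sample_size=2)
print(sorted(p["text"] for p in picked))
```

Here the "coding" cluster averages 6.5 and is kept, while "chitchat" averages 0.5 and is dropped, so both sampled prompts come from the coding cluster.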
Fabrizio Billi

HealthTech Innovator. Professor, Department of Orthopaedic Surgery, UCLA. Director, Musculoskeletal Innovation Group (BiMIG), Co-Chair Digital Orthopaedic Conference San Francisco.
2mo
NVIDIA's new open model, Nemotron-70B, surpasses GPT-4o and Claude 3.5 Sonnet on several benchmarks (Arena Hard, AlpacaEval 2 LC, MT-Bench) despite its relatively small 70B parameter count. It was trained with RLHF (Reinforcement Learning from Human Feedback) using the REINFORCE algorithm and two ingredients: Llama-3.1-Nemotron-70B-Reward, a reward model that scores response quality, and HelpSteer2-Preference prompts, which supply the training prompts and steer the model toward responses that match user preferences. https://lnkd.in/gAjKNwbm

NVIDIA NIM | llama-3_1-nemotron-51b-instruct

build.nvidia.com
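
To give a feel for what REINFORCE does, here is a toy sketch on a three-response bandit. It is not NVIDIA's training setup: the fixed `rewards` list stands in for scores a reward model might assign, the policy is a plain softmax over three logits, and the moving-average baseline, learning rate and step count are assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(rewards, steps=2000, lr=0.1, seed=0):
    """REINFORCE on a bandit: sample a response, scale its log-prob gradient by the advantage."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]  # stand-in for a reward model's score
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(action) w.r.t. logit k is 1[k == action] - probs[k].
        for k in range(len(logits)):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)


rewards = [0.2, 0.9, 0.5]  # hypothetical scores for three candidate responses
probs = reinforce(rewards)
print([round(p, 3) for p in probs])
```

Over the updates the policy shifts probability toward the response with the highest reward, which is the same basic mechanism, at a vastly smaller scale, as tuning a language model against a reward model.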
