Lior Sinclair’s Post

Lior Sinclair

Covering the latest in AI R&D • ML-Engineer • MIT Lecturer • Building AlphaSignal, a newsletter read by 200,000+ AI engineers.

You can now transcribe 2.5 hours of audio in 98 seconds, locally.

A new implementation called insanely-fast-whisper is blowing up on GitHub. It works on Mac or Nvidia GPUs and uses Whisper and the Pyannote library to speed up transcription and speaker segmentation.

Here's how you can use it:

pip install insanely-fast-whisper
insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>

♻️ Repost this if you found it useful.

↓ Are you technical? Check out https://AlphaSignal.ai to get a daily summary of breakthrough models, repos and papers in AI. Read by 200,000+ devs.
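For anyone who wants to script the command from the post, here is a minimal Python sketch. It only assembles the argument list using the flags exactly as they are spelled in the post (the flag spelling, especially for the token, should be checked against the project's README before relying on it). The helper names `build_command` and `realtime_factor` are made up for illustration and are not part of the tool.

```python
from typing import List, Optional


def build_command(file_name: str, batch_size: int = 2, device_id: str = "mps",
                  hf_token: Optional[str] = None) -> List[str]:
    """Assemble the CLI call quoted in the post as an argument list.

    Flags are copied from the post as written; this helper is hypothetical.
    """
    cmd = [
        "insanely-fast-whisper",
        "--file-name", file_name,
        "--batch-size", str(batch_size),
        "--device-id", device_id,
    ]
    if hf_token is not None:
        # Token flag spelled as in the post; verify against the README.
        cmd += ["--hf_token", hf_token]
    return cmd


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall-clock time."""
    return audio_seconds / wall_seconds


if __name__ == "__main__":
    print(build_command("interview.mp3"))
    # The headline claim: 2.5 hours of audio in 98 seconds.
    print(round(realtime_factor(2.5 * 3600, 98), 1))
```

Once the package is installed, the list can be handed to `subprocess.run(cmd, check=True)`. The headline figure works out to 9000 / 98, or roughly 92 times faster than real time.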

Sahar Mor

I help researchers and builders make sense of AI | ex-Stripe | aitidbits.ai | Angel Investor

3mo

For those looking to run Whisper locally in one click, Mozilla openly released Whisperfile a few weeks ago: a high-performance, local tool for audio transcription and translation using OpenAI's Whisper model. https://huggingface.co/Mozilla/whisperfile

This is amazing! 🚀 Transcribing 2.5 hours of audio in just 98 seconds locally is a huge leap forward. Loving how insanely-fast-whisper leverages Whisper and Pyannote for such efficiency. Can't wait to try it out on my projects! Who else is excited to boost their transcription workflows? 🔥 #AI #Transcription #MachineLearning #TechInnovation

Isham Rashik

🤖 Machine Learning Engineer 🦾 Generative AI 🧠 Natural Language Processing 💻 Prompt Engineering 👨💻 Computer Vision 👁️ Data Science 📊 Community Builder & Mentor 👨🏫 On a drive to change the world 🚀

3mo

I used this 8 months ago

Simon Sobisch

Senior Backend Developer; GNU Maintainer and Project Lead of the GnuCOBOL compiler

3mo

Somehow people tend to forget how easy it is to add your sources... I personally find that more useful than reposting. https://github.com/Vaibhavs10/insanely-fast-whisper with the info from above plus more documentation and example calls.

Adebolajo Sunday

AI/ML ENGINEER (COMPUTER VISION, NLP and System Integration)

3mo

I benchmarked both faster-whisper and insanely-fast-whisper on an RTX 3070. Faster-whisper still performs better. Check out this benchmark as well: https://medium.com/@GenerationAI/streaming-with-faster-whisper-vs-insanely-fast-whisper-9ecfa4792fd7

Antonio Bray

Founder, President, Chief Visionary Officer and Chief Technology Officer at AudioOne, Inc

3mo

Looks very useful. I am thinking of the thousands of use cases now, especially if this could be automated.

Just cut the audio into chunks (look for low-amplitude points to cut at) and process them in parallel. It would surprise me if you couldn't transcribe it in under a second. Just add more hardware.
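The chunk-and-parallelize idea in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a list of per-frame loudness values (for example, RMS per short frame) rather than real audio, the function names are hypothetical, and `transcribe` stands in for whatever model call you would actually use. Threads are used to keep the sketch simple; a real setup would more likely spread chunks across processes or GPUs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def find_split_points(levels: Sequence[float], target_len: int, search: int) -> List[int]:
    """Choose cut indices roughly every target_len frames, each moved to the
    quietest frame within +/- search frames of the nominal position."""
    if not 0 <= search < target_len:
        raise ValueError("search must satisfy 0 <= search < target_len")
    cuts: List[int] = []
    pos = target_len
    while pos < len(levels):
        lo = pos - search
        hi = min(pos + search, len(levels) - 1)
        # Quietest frame in the window; ties resolve to the earliest index.
        best = min(range(lo, hi + 1), key=lambda i: abs(levels[i]))
        cuts.append(best)
        pos = best + target_len
    return cuts


def split_chunks(levels: Sequence[float], cuts: List[int]) -> List[Sequence[float]]:
    """Slice the sequence at the given cut indices."""
    bounds = [0] + cuts + [len(levels)]
    return [levels[a:b] for a, b in zip(bounds, bounds[1:])]


def transcribe_parallel(chunks: List[Sequence[float]],
                        transcribe: Callable[[Sequence[float]], T],
                        workers: int = 4) -> List[T]:
    """Run the (placeholder) transcribe callable on every chunk concurrently.
    Results come back in the original chunk order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe, chunks))


if __name__ == "__main__":
    levels = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.1, 1.0, 1.0]
    cuts = find_split_points(levels, target_len=4, search=1)
    chunks = split_chunks(levels, cuts)
    # len() is a stand-in for a real transcription call.
    print(cuts, transcribe_parallel(chunks, len, workers=2))
```

Cutting at quiet frames reduces the chance of splitting a word in half. Whether this gets anywhere near the sub-second figure depends entirely on how much hardware is available and on per-chunk overhead such as model loading.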

That sounds like a game changer! Super cool tech for quick audio transcriptions. How do you see it impacting your work, Lior Sinclair?

Swapnil Gupta

AI Ninja 🥷✧25k+🚀 | Deloitte | Spring Boot Microservices | Open Source 🥑 | Helping Software developers to build and scale applications | Building  Apple Vision Pro apps

3mo

Can you recommend a good text-to-speech model that produces human-like audio?

Whether you're working on research, podcasting, or interviews, tools like this are driving accessibility and productivity through rapid, high-quality transcriptions. AI innovation at its best!
