You can now transcribe 2.5 hours of audio in 98 seconds, locally.

A new implementation called insanely-fast-whisper is blowing up on GitHub. It works on Mac or Nvidia GPUs and uses Whisper and the Pyannote library to speed up transcription and speaker segmentation.

Here's how you can use it:

pip install insanely-fast-whisper
insanely-fast-whisper --file-name <FILE NAME or URL> --batch-size 2 --device-id mps --hf_token <HF TOKEN>

(--device-id mps targets Macs with Apple Silicon; on an Nvidia GPU, pass the GPU's device number instead.)

♻️ Repost this if you found it useful.

↓ Are you technical? Check out https://AlphaSignal.ai to get a daily summary of breakthrough models, repos and papers in AI. Read by 200,000+ devs.
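For anyone who wants to call this from a script, here is a minimal Python sketch that assembles the same command as an argument list and works out the speed-up implied by the headline numbers. The helper names (`build_command`, `transcribe`, `realtime_factor`) and the file name `interview.mp3` are made up for illustration; the flags are copied as-is from the post and are not verified against the current CLI.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(file_name: str, batch_size: int = 2,
                  device_id: str = "mps",
                  hf_token: Optional[str] = None) -> List[str]:
    """Assemble the CLI call quoted in the post as an argument list."""
    cmd = [
        "insanely-fast-whisper",
        "--file-name", file_name,
        "--batch-size", str(batch_size),
        "--device-id", device_id,
    ]
    if hf_token:
        # Flag spelling copied from the post; check it against `--help`.
        cmd += ["--hf_token", hf_token]
    return cmd


def transcribe(file_name: str, **kwargs) -> None:
    """Run the CLI; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_command(file_name, **kwargs), check=True)


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    return audio_seconds / wall_seconds


if __name__ == "__main__":
    # Print the command instead of running it, so this works without the CLI.
    print(shlex.join(build_command("interview.mp3")))
    # 2.5 hours of audio in 98 seconds, as claimed in the post.
    print(round(realtime_factor(2.5 * 3600, 98), 1))
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when file names contain spaces. By the post's own numbers, the claimed speed is roughly 92 times faster than real time.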
This is amazing! 🚀 Transcribing 2.5 hours of audio in just 98 seconds locally is a huge leap forward. Loving how insanely-fast-whisper leverages Whisper and Pyannote for such efficiency. Can't wait to try it out on my projects! Who else is excited to boost their transcription workflows? 🔥 #AI #Transcription #MachineLearning #TechInnovation
I used this 8 months ago
Somehow people tend to forget how easy it is to add your sources... I personally find that more useful than reposting. https://github.com/Vaibhavs10/insanely-fast-whisper with the info from above plus more documentation and example calls.
I benchmarked both faster-whisper and insanely-fast-whisper on an RTX 3070. faster-whisper still performs better. Check this benchmark as well: https://medium.com/@GenerationAI/streaming-with-faster-whisper-vs-insanely-fast-whisper-9ecfa4792fd7
Looks very useful. I am thinking of the 1000s of use cases now, especially if this could be automated.
Just cut the audio into chunks (look out for low amplitude) and go for parallel processing. It would surprise me if you can't transcribe it in under a second. Just add more hardware.
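A rough Python sketch of that idea, under stated assumptions: the audio is already decoded into a flat sequence of float samples, cuts are placed at the quietest short frame near each target boundary, and the chunks are handed to a worker pool. All function names here are hypothetical, and `transcribe_chunk` is a placeholder for whatever model call you actually use. This is not how insanely-fast-whisper itself splits audio.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def frame_energy(samples: Sequence[float], start: int, size: int) -> float:
    """Mean squared amplitude of one frame (lower means quieter)."""
    frame = samples[start:start + size]
    return sum(x * x for x in frame) / max(len(frame), 1)


def find_split_points(samples: Sequence[float], sample_rate: int,
                      target_s: float = 30.0, window_s: float = 5.0,
                      frame_s: float = 0.05) -> List[int]:
    """Pick one cut roughly every target_s seconds, at the quietest frame
    within +/- window_s of the ideal boundary."""
    if window_s >= target_s:
        raise ValueError("window_s must be smaller than target_s")
    frame = max(int(frame_s * sample_rate), 1)
    target = int(target_s * sample_rate)
    window = int(window_s * sample_rate)
    points: List[int] = []
    boundary = target
    while boundary < len(samples):
        lo = max(boundary - window, 0)
        hi = min(boundary + window, len(samples) - frame)
        candidates = range(lo, max(hi, lo + 1), frame)
        # On ties, min() keeps the earliest candidate.
        cut = min(candidates, key=lambda s: frame_energy(samples, s, frame))
        points.append(cut)
        boundary = cut + target
    return points


def split_at(samples: Sequence[float], points: List[int]) -> List[Sequence[float]]:
    """Slice the signal at the given sample indices."""
    edges = [0, *points, len(samples)]
    return [samples[a:b] for a, b in zip(edges, edges[1:])]


def transcribe_parallel(chunks: List[Sequence[float]],
                        transcribe_chunk: Callable[[Sequence[float]], T],
                        workers: int = 4) -> List[T]:
    """Run the placeholder transcribe_chunk on every chunk.
    pool.map returns results in input order, so the text stays in sequence."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe_chunk, chunks))
```

Threads only help when `transcribe_chunk` spends its time outside the Python interpreter (a subprocess, a GPU call, a remote worker); otherwise separate processes or machines are the better fit. Whether this gets anywhere near "under a second" depends entirely on the hardware and is not something the sketch demonstrates.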
that sounds like a game changer! super cool tech for quick audio transcriptions. how do you see it impacting your work? Lior Sinclair
Can you recommend a good text-to-speech model with human-like audio?
Whether you're working on research, podcasting, or interviews, tools like this are driving accessibility and productivity through rapid, high-quality transcriptions. AI innovation at its best!
For those looking to run Whisper locally in one click, Mozilla openly released Whisperfile a few weeks ago: a high-performance, local tool for audio transcription and translation using OpenAI's Whisper model. https://huggingface.co/Mozilla/whisperfile