Back in February, our CPO Trevor Back and CTO Will Williams shared our belief that “Her”-like AI assistants lie in our future. Last week's demos from OpenAI and Google show that Big Tech recognizes this vision too. You can read our reaction to their announcements here 👇 🔗 https://lnkd.in/eXTgUNkV The main takeaway? Being able to interact with technology via audio is on the agenda more than ever. No keyboards. No touch screens. No mice. No eye tracking. Not even a brain interface. Just our most natural, most seamless, most human way of interacting with our world: our voice. We love to see it 😍 We believe that if you're innovating in the conversational AI space, or building an AI assistant, world-class speech-to-text as an input is the only way you'll see widespread adoption.
Speechmatics’ Post
More Relevant Posts
😲 They are putting AI into everything! 😲 This year's Google I/O event had a lot of announcements and guess what... they were all about AI. The tech giant aims to infuse AI into all of its products and services, which doesn't come as a surprise given the race between Google and OpenAI. Here are some of the key announcements they made during the event: 🔥 Updates to the Gemini models + a whole new Gemini model (1.5 Flash), designed to be fast and efficient at scale. 🔥 Gemini 1.5 Pro will soon have a 2 million token context window. 🔥 New image generation model - Imagen 3. 🔥 Sora's only real competitor - Google's Veo, a powerful video generation model rumored to be very close to Sora's output quality. Unlike Sora, Google will release Veo to the public, and you can already sign up for the waitlist. 🔥 Music AI Sandbox - a music generation tool. (Udio is way better) 🔥 Google Search will be infused with Gemini to make the search engine even more powerful. 🔥 Gemini 1.5 Pro will also be added to Google Workspace and Photos (Gmail, Docs, Drive, Slides and Sheets). 🔥 Maybe the most interesting announcement was their new Project Astra. It's better to see it than read about it, so here is a link to their quick demo: https://lnkd.in/d3SGp6kn Did Google catch up to OpenAI with these announcements, or are they still way off target? We will be exploring this in the coming posts!
Project Astra: Our vision for the future of AI assistants
Google released some new AI stuff 🤖 Here's what I think is worth knowing from the event: • Introduction and evolution of Gemini, a multimodal AI model • Updates and integration of Gemini across Google's major products • Expanding AI Overviews in Search • New features in Google Photos powered by Gemini • Unlocking more knowledge with multimodality and long context • Expanding to 2 million tokens in private preview • Integration of Gemini into Google Workspace • Introduction of audio capabilities in Gemini • Developing more autonomous and capable AI agents • Innovations in AI modeling and infrastructure • Enhancements in Gemini's conversational capabilities • Incorporation of Gemini in Android devices • Efforts to ensure ethical and responsible AI development • Emphasis on community and developer involvement in AI innovation For a detailed exploration of each point, you can read the full post on Google's blog here: https://lnkd.in/eBb2JVXC It's a two-horse race between OpenAI and Google at this point for whose AI will be in your pocket and on your desktop. We are living in the future. Good luck.
🌟📱🚀 **Discover the Future Today with OpenAI's Advanced Voice Mode!** 🚀📱🌟 📣 Exciting news from TechCrunch! OpenAI has just rolled out its new Advanced Voice Mode (AVM), setting a new benchmark in AI technology. 🤖💬 After a week of testing, Maxwell Zeff reports that AVM offers an incredibly natural and conversational experience, well ahead of Siri and Alexa. 🌟 Who's ready for the future of AI-powered interactions? 🙋♂️🙋♀️ 🔍 Highlights of AVM: ✅ Faster response times ⏱️ ✅ Unique, engaging answers 🎯 ✅ Capable of handling complex questions 🧠 While Gemini Live might have more voices, OpenAI's AVM delivers a more convincing and personal experience. 🌐🔮 ✨🌿 "The future belongs to those who believe in the beauty of their dreams." - Eleanor Roosevelt 🌿✨ Power your life with the potential of world-class AI! Are you ready to take your daily interactions to the next level? 🚀📈 #InnovationNation #FutureTech #AIRevolution #StockMarket #TechCrunch #OpenAI #InvestInSuccess #AI Comment below if you're as excited as we are! 🚀📈 Let's discuss the future of AI interactions! 🤩👇 https://lnkd.in/eP26cr3c
Google Pixel Screenshots hands-on: Android version of “Review”, local AI analysis, organization of screenshot information
https://xenluo.xyz
Google I/O was packed with AI announcements. As expected, the event focused heavily on Google’s Gemini AI models, along with the ways they’re being integrated into apps like Workspace and Chrome. Here is a summary of the announcements. #google #AI
Google I/O 2024: all the news from the developer conference regarding AI and more
theverge.com
Veo 2 Sets the Bar for AI Video Generation 📹 Only a week after the release of OpenAI's Sora Turbo, Google has launched its new AI video generation model, Veo 2, and it's already making waves. 🌊 When tested against other models on over 1,000 prompts from MovieGenBench, Veo 2 achieved 58.8% general preference and 58.2% preference for prompt adherence. While these results don't indicate a definitive performance advantage, they strongly suggest that OpenAI may be in for some competition when it comes to video generation. 🎥 Key takeaways include: 🔑 - Veo 2 supports video generation at resolutions up to 4K, while Sora Turbo can only output up to 1080p. - Veo 2 can follow complex prompt instructions, letting users specify camera controls such as wide shots, POV, and more. - Veo 2 comes fully equipped with advanced motion capabilities, thanks to an understanding of physics – seemingly a sore spot for OpenAI's Sora, which is noted as not being able to “capture movement in a way that looks natural to the eye”. 👁️ People can sign up for Veo 2's waitlist on VideoFX. 📋 When you add this on top of the recent news of Amazon's additional $4 billion investment in Anthropic, it looks like 2025 could be a challenging year for OpenAI. 🤖 VideoFX Waitlist: https://lnkd.in/gfYX-egF Veo 2 Announcement: https://lnkd.in/ggFnYeCV Disclaimer ‼️ The video used in this post is not mine. It is from a promotional video compilation on Google DeepMind's YouTube channel.
Meta takes on Google and OpenAI with a new text-to-video generation model. Here's how Movie Gen works: https://ift.tt/kohuI32 Meta introduced Movie Gen, a new AI tool capable of generating realistic videos and audio from text prompts. While it performs well compared to rivals, it is not yet ready for public release due to high costs and long generation times. via mint - ai, October 05, 2024 https://ift.tt/CRj2duc
Breaking: Google DeepMind Takes on Sora with New Video AI Model Google DeepMind has unveiled its latest video generation model, positioning it as a direct rival to OpenAI’s Sora. This new model showcases groundbreaking advancements in AI-driven video creation, hinting at an even more competitive landscape in the generative AI space. For creatives and businesses, the implications are exciting: higher-quality video generation, more control over outputs, and an expanding toolkit for storytelling and content creation. The race between AI innovators like Google DeepMind and OpenAI is pushing the boundaries of what’s possible, and the creative industry stands to benefit. Will this new model challenge the dominance of OpenAI’s Sora? Only time (and testing) will tell. Read more here: https://lnkd.in/gkDy7Nbj
Google DeepMind unveils a new video model to rival Sora | TechCrunch
https://techcrunch.com
Last week Matt Marshall and I sat down to talk about the Gemini 2.0 launch and what it unlocks for agents and GenAI apps. From everything that is new and doable now to possible counter moves by OpenAI, we covered it all. Check out the video on VentureBeat.
Google's Gemini 2.0 Flash feels like an iPhone moment for AI. The technology allows real-time interaction with video, transforming the way we engage with our surroundings—whether for education, productivity, or creative workflows. I had the chance to test it myself this morning using Google's AI Studio, and the potential is astonishing. Imagine analyzing live video, asking questions, or even editing in real time. Content creators and developers are already exploring its capabilities, with reactions like “This is absolutely insane” from users editing videos in seconds. Key Highlights: • Real-time multimodal AI: Interact with live video, audio, and images for unprecedented insights. • Practical enterprise use cases: Boost productivity and creativity with tools that analyze, suggest edits, and troubleshoot on the fly. • Developer-ready APIs: Seamless integration for apps that push the boundaries of interaction. (See my full post on VentureBeat about this new era here: https://lnkd.in/dfyEDj9E) 🎙️ Podcast Alert: Don’t miss my latest conversation with generative AI developer Sam Witteveen, where we explore how Gemini 2.0 Flash compares to offerings from OpenAI and Microsoft—and why it could lead the next wave of enterprise AI innovation. Watch here: https://lnkd.in/dTJ73C2m #AI #RealTimeAI #MultimodalAI #EnterpriseTech #AIForDevelopers #AIApplications #FutureOfAI
Gemini 2.0: A New Era of Real-World AI
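For the developers mentioned in the post above, here is a minimal sketch of how a multimodal request body for a Gemini-family model might be assembled. The endpoint path, model name, and request-body shape are assumptions based on Google's public REST documentation, not details from the post, so check the current docs before relying on them.

```python
import json
from typing import Optional

# Assumed endpoint for a Gemini-family model's generateContent call
# (hypothetical: verify the model name and API version in Google's docs).
GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)

def build_request(prompt: str, image_b64: Optional[str] = None) -> dict:
    """Assemble a request body: a text part plus an optional inline image part."""
    parts = [{"text": prompt}]
    if image_b64:
        # Inline media is sent base64-encoded alongside the text prompt.
        parts.append(
            {"inline_data": {"mime_type": "image/jpeg", "data": image_b64}}
        )
    return {"contents": [{"role": "user", "parts": parts}]}

body = build_request("What is happening in this frame?")
print(json.dumps(body, indent=2))
```

Actually sending the body would require an API key and an HTTP POST; the sketch deliberately stops at request construction so it stays runnable offline.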
The Seeing AI app is a great example of the power of computer vision. Designed for the blind and low-vision community, Seeing AI harnesses the power of AI to open up the visual world, describing nearby people, text and objects. To find out more, check out the Seeing AI web page: https://lnkd.in/d3C5UPcw Microsoft is changing the way we use AI.