Latency matters, and on-chip inference helps tremendously. Take a look at this LLaMA 3 + Groq demo now! #groq #inference #llama3 #opensource
Mind-blowing real-time LLaMA 3 + Groq demo. Now extrapolate this five years out: even the technology we have today will change the app landscape dramatically, and that's before counting the daily advances happening in AI.

It's well known that latency shapes user experience. Think of Google's early research showing how even a slight increase in latency leads to significantly fewer searches (see https://lnkd.in/df26CZam). So even just speeding up existing capabilities will raise the quality of future apps.

This real-time table manipulation is a great proxy for everything we'll be able to do: imagine curating your Notion tables like this instead of clicking through the GUI and Googling how to do X. A rough sketch of what that could look like in code follows below.

If you want to learn what makes Groq tick, I had their head of silicon talk about their chips, LPUs: https://lnkd.in/dgx3kmCb

And here I had Thomas Scialom, PhD, a LLaMA 2 author, talk about LLaMA 2: https://lnkd.in/deibURjY

I'll try to bring Thomas back on over the next few days to share insights behind building LLaMA 3!
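To make the "edit a table in plain English" idea concrete, here is a minimal sketch of calling a Llama 3 model on a fast inference endpoint and asking it to rewrite a small table. The endpoint URL, model id, and GROQ_API_KEY environment variable are assumptions for illustration (not from the demo); check the provider's docs for current values.

```python
# Minimal sketch: natural-language table editing via a low-latency LLM endpoint.
# Assumes an OpenAI-compatible chat completions API hosted by Groq and a Llama 3
# model id; both are placeholders, verify them against the provider's documentation.
import os
import requests

API_URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed endpoint
MODEL = "llama3-70b-8192"  # assumed model id

table = """| project | owner | status |
|---------|-------|--------|
| Atlas   | Ana   | done   |
| Borei   | Ben   | open   |"""

instruction = "Add a 'priority' column, mark open items as high, and sort open items first."

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json={
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You edit markdown tables. Reply with the edited table only."},
            {"role": "user", "content": f"{instruction}\n\n{table}"},
        ],
        "temperature": 0,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The point of the sketch is the interaction pattern, not the specific API: one round trip that returns fast enough to feel like direct manipulation instead of a batch job.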
Exciting!