STOP THE CRAZY TRAIN BEFORE IT'S TOO LATE! Here's a nice blog post in "Mission Impossible" style, explaining how my French colleagues built a fun and inspiring AI demo with moving parts and cameras. There's also a link to a YouTube video showing how it all works in practice. Thanks to Nicolas Massé Adrien Legros Pauline T. Mourad Ouachani https://lnkd.in/dH72KMHJ #openshiftai #ai #mlops #predictive #redhat #fun #openshift #opensource
Johan Robinson’s Post
-
There are plenty of reasons why you'd want to run LLMs on your local machine. It may be difficult at times, but here's how you can do it. #datascience #AI #artificialintelligence https://hubs.li/Q02J_d3h0
Bringing LLMs Back to Your Local Machine
https://odsc.com
-
One of my favorite AI tools! Discover how Perplexity AI built an AI architecture to scale its Llama 3 #LLM deployment to serve millions of inference requests. Key takeaways: ✅ 435M queries/month ✅ 20+ AI models, including Llama 3 (8B, 70B, 405B) ✅ 3x cost reduction
Spotlight: Perplexity AI Serves 400 Million Search Queries a Month Using NVIDIA Inference Stack | NVIDIA Technical Blog
developer.nvidia.com
-
🚀 Exciting News in AI! 🚀 Cerebras has just unveiled the fastest LLM inference processor, and it's not even close! This groundbreaking innovation is set to revolutionize the AI landscape, offering unprecedented speed and efficiency. 🌟 This huge leap forward also supports significant advancements in Agentic AI, paving the way for more autonomous and intelligent systems. The Cerebras processor's capabilities will undoubtedly open new doors for innovation and development in our field. Check out the full article below. #AI #Innovation #Technology #Cerebras #LLM #TechNews #AgenticAI Infused Innovations Gregory Geehan Jeff Wilhelm Connor O'Neill Stephen Webster
Cerebras Trains Llama Models To Leap Over GPUs
https://www.nextplatform.com
-
OMG… can GPU choice impact AI output? Good morning, world! From NZ. Yes, I’m excited by this article shared by Manish Kukreja 💙, but I do want validation. Today’s thought started many moons ago, before my conscience got side-tracked by pesky topics like ethics, privacy and value creation around AI.

🔵 recent alignment discussions
A passion of mine, and what started me posting regularly, was the topic of alignment and #ai: the psychology of it, the challenges of managing it, and the risks. Here is a recent short post summarising the alignment issues and the ways they can be controlled: https://lnkd.in/g6qDfssc. From that post I ask two questions: 1️⃣ why use AI if it does not align with our values? 2️⃣ what happens if we give up control? A recent example: Taylor Swift was tagged as “feminist”, so the advertising algorithm did not place ads next to content about her, reducing ad revenue. https://lnkd.in/gjWaksr2. This demonstrates what can happen when we hand our decision making over to an algorithm or AI tool that does not align with our values. At its core, I suggest that AI values are based on the values of the people creating it.

🔵 what choices impact AI output?
In the discussions above, the focus for managing alignment has been on controlling the data and using guardrails. However, in earlier posts I explored other factors that could influence AI output: https://lnkd.in/g8MQTNdM. In that post I respond to research indicating that small model design choices have an impact in mobile device environments. The key question raised was: how deep into the tech stack do choices impact AI? From the post: “Does the nature of the coding language impact the model? Does the CPU choice impact model output?”

🔵 the article below
The article suggests my last question may have some validity. Anis ZAKARI shares the steps of an experiment he ran to see whether the GPU has an impact on #llm output in RAG. Based on his results, it appears that it does. Great to have validation for the idea… assuming it’s true. Could there be other confounding variables?

🔵 implications
If different GPUs can have an impact, then switching from GPU to CPU would too, no?
🔸 Do the differences matter? If they do: do some GPUs have higher accuracy? (For now I doubt this.) NVIDIA and other GPU vendors are going to love this: potentially more demand for the same GPU across different environments. If they don’t: then we can continue as usual.
🔸 my current hypothesis: the variations between GPUs or CPUs change the probability calculations. #generativeartificialintelligence #genai
Now this: AI outputs can differ when your RAG pipeline runs on a different GPU in production than the one you evaluated on in dev or test. #gpu #LLM #ai #probability https://lnkd.in/d3puW9yi
Changing the GPU is changing the behaviour of your LLM.
medium.com
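The hypothesis above, that hardware differences perturb the probability calculations, can be illustrated without any GPU at all: floating-point addition is not associative, so a reduction that runs in a different order on different hardware can produce slightly different sums, and a near-tie in the logits can then flip greedy decoding. A minimal Python sketch (the logit values below are invented for illustration, not taken from the experiment):

```python
# Floating-point addition is not associative, so reductions performed in a
# different order (as different GPUs/kernels may do) can give different sums.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c   # (a + b) cancels exactly, then + 1.0 -> 1.0
right = a + (b + c)  # (b + c) rounds back to -1e16, so the sum is 0.0
print(left, right)   # 1.0 0.0

# A tiny perturbation in logits can flip greedy decoding when two tokens are
# nearly tied (hypothetical 3-token vocabulary, made-up values):
logits_dev  = [2.000001, 2.000000, 0.5]
logits_prod = [2.000000, 2.000001, 0.5]  # same model, different reduction order

def argmax(xs):
    """Index of the largest value, as greedy decoding would pick it."""
    return max(range(len(xs)), key=xs.__getitem__)

print(argmax(logits_dev), argmax(logits_prod))  # different tokens chosen
```

This is only a toy model of the effect: it shows why bitwise-identical outputs across hardware should not be assumed, not that any particular GPU pair will diverge.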
-
With Gaudi 3, Intel Can Sell AI Accelerators To The PyTorch Masses
https://www.nextplatform.com
-
What happens when we quantize both the weights and activations of LLMs? Following our previous post on weight-only quantization, this time we explore weight-activation quantization in #vLLM and #TensorRT_LLM. Discover how this quantization approach impacts model efficiency and performance across different configurations. Curious about balancing precision and speed? Check out our findings on achieving high-quality results at lower compute cost. #AI #LLM #Optimization #TensorRT #TRTLLM #vLLM #Quantization #Deployment #FitsonChips #SqueezeBits
[vLLM vs TensorRT-LLM] #7. Weight-Activation Quantization - SqueezeBits
blog.squeezebits.com
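The idea behind weight-activation quantization can be sketched as symmetric per-tensor int8 quantization: both the weights and the activations are mapped to 8-bit integers with a scale factor, the dot product runs in integer arithmetic, and the result is rescaled to float at the end. A minimal pure-Python sketch of the concept (not the actual vLLM or TensorRT-LLM implementation, and real deployments use per-channel scales and calibration):

```python
def quantize(xs, n_bits=8):
    """Symmetric per-tensor quantization: map floats to signed integers."""
    qmax = 2 ** (n_bits - 1) - 1                  # 127 for int8
    scale = max(abs(x) for x in xs) / qmax or 1.0  # avoid a zero scale
    return [round(x / scale) for x in xs], scale

# Toy weight and activation vectors (made-up values):
weights = [0.4, -1.27, 0.05]
acts    = [2.0, 0.5, -1.0]

qw, sw = quantize(weights)   # int8 weights + weight scale
qa, sa = quantize(acts)      # int8 activations + activation scale

# Dot product in integer arithmetic, rescaled back to float at the end:
int_dot = sum(w * a for w, a in zip(qw, qa))
approx  = int_dot * sw * sa
exact   = sum(w * a for w, a in zip(weights, acts))
print(approx, exact)  # close, but not equal: that gap is the quantization error
```

The appeal is that the inner loop touches only 8-bit integers, which is what lets hardware int8 units cut memory traffic and compute cost; the trade-off the post measures is how much that rounding error costs in output quality.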
-
Stable Diffusion 3 Medium — Stability AI
https://dhaabanews.com
-
The AI universe shook this week with two massive announcements. First, #Anthropic announced that their advanced Claude 3 models are now available on #Google #Vertex AI, offering developers access to state-of-the-art language models. Meanwhile, #Microsoft made waves by hiring the co-founders of #Inflection AI, the creators of the popular Pi AI assistant. These moves signal a growing commitment from tech giants to stay at the forefront of AI innovation. As if that wasn't enough excitement, Jensen Huang, CEO of #NVIDIA, delivered a jaw-dropping opening keynote at the #GTC2024 AI Conference and Exhibition. For a quick overview of the two-hour keynote, head over to your favorite blog, DigitrendZ.
GTC 2024: Nvidia’s Pioneering AI Conference
https://digitrendz.blog
-
Our Instinct MI300X accelerators already power leading AI models from OpenAI, Meta and Hugging Face. With the introduction of the Instinct MI325X AI chip, we are redefining performance for the most demanding AI workloads. Learn more from Quartz: https://bit.ly/4841mkS
AMD is going after Nvidia with new AI chips
qz.com