Use linear programming and LLMs to put AI agents to work on optimization problems. ➡️ https://nvda.ws/4dfHNrb The cuOpt AI agent is built from multiple #LLM agents and acts as a natural language front end to cuOpt, letting you turn natural language queries into code and then into an optimized plan, seamlessly.
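To make the "natural language to optimized plan" idea concrete, here is a hypothetical sketch of the kind of linear program such an agent might emit from a query like "maximize profit from products A and B under machine-hour limits". This uses plain SciPy, not the cuOpt API; the objective and constraints are invented for illustration.

```python
# Hypothetical LP an agent might generate -- NOT cuOpt code.
# Maximize 3*A + 5*B subject to resource limits; linprog minimizes,
# so we negate the objective.
from scipy.optimize import linprog

c = [-3, -5]                       # maximize 3A + 5B
A_ub = [[1, 0],                    # A uses 1 unit of resource 1 (cap 4)
        [0, 2],                    # B uses 2 units of resource 2 (cap 12)
        [3, 2]]                    # joint capacity constraint (cap 18)
b_ub = [4, 12, 18]
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)], method="highs")
plan = dict(A=res.x[0], B=res.x[1], profit=-res.fun)
```

The optimum here is A=2, B=6 for a profit of 36; a cuOpt-style agent would generate and solve a (much larger) model like this on the GPU.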
Yogesh Patil’s Post
-
🚀 Unlock the Power of Custom Data Curation with NVIDIA NeMo Curator! 🚀 Are you ready to elevate your AI game? NVIDIA has just open-sourced NeMo Curator, a groundbreaking data curation toolkit designed to help you build high-quality datasets for training large language models (LLMs) and small language models (SLMs). 🌟🔍
Why NeMo Curator?
- Tailored Data Pipelines: Customize your data curation to fit your unique project needs.
- Quality Assurance: Rigorous filtering and deduplication ensure top-notch data quality.
- Blazing-Fast Performance: Leverage GPU acceleration and distributed computing for efficiency.
- Seamless Integration: Easily integrate new data sources and techniques.
Ready to dive in? Check out the full blog post to learn how NeMo Curator can transform your data curation process and supercharge your AI projects! Mehran Maghoumi #AI #DataCuration #NeMoCurator #NVIDIA #MachineLearning #BigData #LLM
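Two of the steps the post highlights, deduplication and quality filtering, can be illustrated in a few lines of plain Python. This is a conceptual sketch only, not the NeMo Curator API (which runs these steps GPU-accelerated and distributed; see the blog post for the real thing):

```python
# Conceptual sketch of exact deduplication + heuristic quality
# filtering. NOT the NeMo Curator API -- illustration only.
import hashlib

def exact_dedup(docs):
    """Drop documents whose normalized text hashes identically."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.md5(doc.strip().lower().encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

def quality_filter(docs, min_words=5):
    """Keep only documents long enough to carry training signal."""
    return [d for d in docs if len(d.split()) >= min_words]

corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "the quick brown fox jumps over the lazy dog.",  # duplicate after normalization
    "Too short.",                                    # fails the length heuristic
]
curated = quality_filter(exact_dedup(corpus))
```

Real pipelines add fuzzy (MinHash) deduplication, language identification, and classifier-based filters on top of heuristics like these.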
Curating Custom Datasets for LLM Training with NVIDIA NeMo Curator | NVIDIA Technical Blog
developer.nvidia.com
-
Data processing and curation are important steps in LLM workflows. The cleaner your data, the faster your model can converge and the better it will perform. Beyond training and inference, accelerated computing also speeds up the data processing steps in your LLM pipelines. 💡 Here's a great tutorial that shows how.
Curating Custom Datasets for LLM Training with NVIDIA NeMo Curator | NVIDIA Technical Blog
developer.nvidia.com
-
Exciting news from NVIDIA! I'm delighted to share the LLaMA-Mesh project, which unifies 3D mesh generation with large language models (LLMs). This innovative approach allows LLMs to understand and generate 3D meshes from text prompts, making it a game changer in 3D modeling. 🔑 Key Features: 1. Unified Model: Combines text and 3D mesh generation in one framework. 2. Text-Based Representation: Uses plain text to represent 3D mesh data, simplifying processing. 3. High Quality: Achieves mesh generation quality comparable to models trained from scratch. All code and pre-trained weights will be available this November! 🏗️ Check out the paper for more insights: https://lnkd.in/dX8d8P4r Project Page: https://lnkd.in/dswT95Ev NVIDIA Let’s explore new horizons in 3D modeling together! #NVIDIA #AI #3DModeling #LLaMAMesh #LLM
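The text-based representation is the key trick: a mesh can be written as OBJ-style plain text ("v x y z" vertex lines, "f i j k" face lines), which an LLM can emit token by token like any other text. A minimal parser for such text, as a sketch of the format rather than the project's actual code:

```python
# Parse OBJ-style plain-text mesh data (the kind of text an LLM could
# generate) into vertex and face lists. Sketch only, not LLaMA-Mesh code.
def parse_obj_text(text):
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":      # vertex: three coordinates
            vertices.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":    # face: 1-based vertex indices
            faces.append(tuple(int(i.split("/")[0]) for i in parts[1:]))
    return vertices, faces

mesh_text = """\
v 0 0 0
v 1 0 0
v 0 1 0
f 1 2 3
"""
verts, faces = parse_obj_text(mesh_text)  # a single triangle
```

Because the representation is ordinary text, the same tokenizer and training loop used for language work unchanged for geometry.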
GitHub - nv-tlabs/LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
github.com
-
Created by Mistral AI and NVIDIA, Mistral NeMo 12B is an advanced open language model, available as a NIM, for chatbots, multilingual tasks, coding, and summarization. #Mistral #NVIDIA #NeMo #NIM #multilingual #coding #summarization
Mistral AI and NVIDIA Unveil Mistral NeMo 12B, a Cutting-Edge Enterprise AI Model
blogs.nvidia.com
-
NVIDIA Leading #AI
NVLM 1.0 demonstrates several comparative advantages over other multimodal large language models (LLMs) like GPT-4o and Llama 3-V:
- State-of-the-Art Performance: NVLM 1.0 achieves competitive results on vision-language tasks, rivaling leading proprietary models such as GPT-4o and open-access models like Llama 3-V. It shows improved performance in both multimodal and text-only tasks compared to its LLM backbone after multimodal training.
- Text-Only Performance Improvement: Unlike some models that experience a degradation in text-only performance after multimodal training, NVLM 1.0, particularly the NVLM-D variant, exhibits significant improvements on text-only benchmarks, with an average accuracy increase of 4.3 points on math and coding tasks.
- Architectural Flexibility: NVLM 1.0 features a hybrid architecture (NVLM-H) that combines the strengths of decoder-only and cross-attention-based models. This allows it to handle high-resolution images more efficiently while maintaining robust multimodal reasoning capabilities.
- Dataset Quality Focus: The training of NVLM 1.0 emphasizes the quality and diversity of datasets rather than just their scale. This approach has been shown to enhance performance across all architectures and is a critical factor in its success compared to other models.
- Production-Grade Multimodality: NVLM 1.0 has been developed with production-grade multimodality, enabling it to excel in vision-language tasks while also improving text-only performance, a balance other models often fail to maintain.
Overall, NVLM 1.0 stands out for its architectural innovations, improved performance metrics, and strong focus on dataset quality, making it a formidable competitor in the multimodal LLM landscape.
#AIlearning #GenerativeAI #AIForEveryone #FreeCourses #AIEducation #artificialintelligence #ai #machinelearning #technology #datascience #python #deeplearning #programming #tech #robotics #innovation #bigdata #coding #iot #computerscience #data #dataanalytics #business #engineering #robot #datascientist #art #software #automation #analytics #ml #pythonprogramming #programmer #digitaltransformation #developer
-
Another day, another Large Language Model. Last week, Meta released Llama 3.1 – a new generation of its open source AI models – to the AI community. Llama 3.1 is available in three model configurations: - Llama 3.1 8B: An 8-billion parameter model for basic use cases - Llama 3.1 70B: A 70-billion parameter model for AI enthusiasts to play around with - Llama 3.1 405B: A 405-billion parameter model aimed at enterprise-level use cases All three models are available for download from Hugging Face, along with extensive code snippets. What can I do with these models? The most obvious advantage of these open source models lies in the fact that they can be operated locally and offline. The Llama 3.1 models can perform the usual tasks (e.g. dialogue, instruction following, summarisation, coding, ...) and can also be fine-tuned on the basis of proprietary data. What is new is the ability of the 405B model to generate synthetic data and to serve as an evaluator of other AI models (LLM-as-a-Judge). What to expect in terms of performance? When assessing LLM benchmark scores for reasoning and knowledge abilities, the 405B model is surprisingly able to keep pace with paid (and also much larger) AI models such as GPT-4o (OpenAI) and Claude 3.5 Sonnet (Anthropic). The mid-range 70B model is comparable to OpenAI’s legacy model GPT-4, while the 8B model achieves performance similar to GPT-3.5. Can I run it at home? To some extent, yes, with the right hardware. Llama 3.1 8B can be run on most current PCs or laptops with a dedicated GPU (for reference, I can run it directly on my work laptop). Llama 3.1 70B is considerably more computationally intensive and calls for an enthusiast-level setup of roughly €6,000-8,000 to achieve a meaningful token generation speed. The largest model, Llama 3.1 405B, has a massive video memory footprint (810GB VRAM at FP16) and therefore requires an enterprise-level setup costing about €240,000 when using the latest-generation GPUs. Training Llama 3.1?
Training Llama 3.1 405B took 3.8 × 10^25 floating point operations (FLOPs) and was performed over 39.3 million GPU hours on 16,000 Nvidia H100s (a hardware configuration worth €480 million), producing the equivalent of approximately 8,930 tonnes of CO2. This massive computational effort also means that Llama 3.1 405B qualifies as a general-purpose AI model with systemic risk under Article 51 of the AI Act. Some questions/thoughts I leave you with: - Meta’s decision to release these AI models for free is interesting, to say the least. One could speculate about the possible motives behind this move. - Is the performance of large language models plateauing? It seems that the recently released AI models are somewhat converging on the same performance level. - What to do when all (organic) data is consumed for AI model training? Can we rely on synthetic data instead? - More broadly, how do you see this space evolving in the future? Happy to further discuss in the comments!
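The VRAM figures above follow from simple arithmetic: at FP16, model weights take 2 bytes per parameter (activations and KV cache add more on top). A quick back-of-the-envelope check:

```python
# Weight-only memory footprint at FP16 (2 bytes per parameter).
# Activations and KV cache come on top of this.
def weights_gb(params_billions, bytes_per_param=2):
    return params_billions * 1e9 * bytes_per_param / 1e9

for size in (8, 70, 405):
    print(f"Llama 3.1 {size}B: ~{weights_gb(size):.0f} GB at FP16")
```

This reproduces the post's 810GB figure for the 405B model, and shows why 8B fits on a single consumer GPU at reduced precision while 70B already needs multi-GPU or aggressive quantization.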
-
🚀 Exciting News! 🚀 🔥 Supercharging Large Language Models (LLMs) Deployment: Amazon SageMaker x NVIDIA NIM Integration 🔥 We're thrilled to share the latest advancement in deploying Large Language Models (LLMs) for generative AI applications! The seamless integration of Amazon SageMaker with NVIDIA NIM inference microservices is here to revolutionize LLM deployment. ✨ Key Benefits: 1️⃣ Enhanced Price-Performance: Achieve superior price-performance ratios with NVIDIA's accelerated computing infrastructure and SageMaker's managed service capabilities. 2️⃣ Lightning-Fast Deployment: Cut deployment times from days to minutes, enabling rapid iteration and deployment cycles. 3️⃣ Advanced Framework Support: Leverage frameworks like NVIDIA Triton Inference Server and TensorRT-LLM for optimized performance. 4️⃣ Out-of-the-Box Model Support: Access pre-built containers for popular LLMs like Llama, Mistral, Mixtral, Nemotron, StarCoder, and StarCoderPlus. 5️⃣ Customization Options: Tailor deployment with tools to create GPU-optimized versions for specific LLMs. 🛠️ How to Get Started: 1️⃣ Select the desired LLM container from the NVIDIA API catalog. 2️⃣ Deploy effortlessly on Amazon SageMaker, creating an inference endpoint with the chosen NIM container. 3️⃣ Scale seamlessly based on workload demands with SageMaker's managed service capabilities. 🚀 Let's propel your LLM deployment to new heights with Amazon SageMaker and NVIDIA NIM integration! Explore the possibilities and unleash the power of generative AI. #AmazonSageMaker #NVIDIA #NIM #LLMs #GenerativeAI #AIInfrastructure #MachineLearning #Deployment #Inference #Optimization #ArtificialIntelligence #TechInnovation #AI #ML #DeepLearning #NVIDIAAI #AmazonWebServices #AWS #AIEngineering #CloudComputing #DataScience #DigitalTransformation https://lnkd.in/demYYJET
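Once a NIM container is deployed behind a SageMaker endpoint, you talk to it through an OpenAI-compatible chat completions API. A sketch of building such a request payload; the model name is one of the NIM catalog examples and stands in for whatever container you deployed:

```python
# Build an OpenAI-compatible chat completions payload of the kind a
# NIM container accepts. Endpoint wiring (invoke_endpoint etc.) is
# omitted; this only shows the request shape.
import json

def build_chat_request(model, prompt, max_tokens=256):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize NIM in one line.")
body = json.dumps(payload)
# In production, `body` would be sent to the SageMaker runtime's
# invoke_endpoint call with content type application/json.
```

Because the request shape matches the OpenAI API, existing client code can usually be pointed at a NIM endpoint with minimal changes.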
The best Large Language Models in 2023: The Cutting Edge of AI
https://innovationincubator.com
-
Introducing Phi-3: Redefining what’s possible with SLMs We are excited to introduce Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks. This release expands the selection of high-quality models for customers, offering more practical choices as they compose and build generative AI applications. Starting today, Phi-3-mini, a 3.8B-parameter language model, is available on Microsoft Azure AI Studio, Hugging Face, and Ollama. Phi-3-mini is available in two context-length variants—4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality. It is instruction-tuned, meaning that it’s trained to follow different types of instructions reflecting how people normally communicate. This ensures the model is ready to use out of the box. It is available on Azure AI to take advantage of the deploy-eval-finetune toolchain, and on Ollama for developers to run locally on their laptops. It has been optimized for ONNX Runtime with support for Windows DirectML along with cross-platform support across graphics processing units (GPUs), CPUs, and even mobile hardware. It is also available as an NVIDIA NIM microservice with a standard API interface that can be deployed anywhere, and has been optimized for NVIDIA GPUs. More info: https://lnkd.in/g_C5z4-q #phi3 #aoai #azure #ai
Introducing Phi-3: Redefining what's possible with SLMs | Microsoft Azure Blog
https://azure.microsoft.com/en-us/blog
-
🤯 AI is about to level up on all dimensions. Soon, you might tell an AI, "Make this boring meeting into a hilarious TikTok" or "Show me potential emergency scenarios for this subway station." Think next-gen Sora, but driven by your own videos and prompts as input! Sound crazy? It's closer than you think! Models like Gemini 1.5 Pro or open-source alternatives like LWM (https://lnkd.in/gxQdiy5i) already process an hour of video at once! But get this... these next-gen AI models demand next-gen infrastructure. They gobble up memory as they analyze longer clips, with the KV cache being a major bottleneck. A 65B-parameter LLM with a 1M context length needs 1.2TB of memory (roughly 16 A100 GPUs)! Extrapolate that to a 10M context length, and you're looking at 12TB of memory and over 153 GPUs! To make this a reality, as an industry we need: - Software optimizations like KV cache quantization and compression - More HBM please (AMD's MI300 offers a huge 192GB!) - Faster interconnects for ring attention (like Ultra Ethernet Consortium) So, what mind-blowing video edits would YOU ask an AI to do? Share your wildest ideas below! 👇
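The KV-cache arithmetic behind those numbers: every token stores a key and a value vector per layer, so the cache grows linearly with context length. Exact totals depend on layer count, hidden size, grouped-query attention (which shrinks the cache by the group factor), and precision; the parameters below (80 layers, hidden size 8192, FP16, dense attention) are illustrative assumptions for a 65B-class model, not the exact configuration behind the post's 1.2TB figure.

```python
# KV cache size: 2 (key + value) * layers * hidden * bytes * tokens.
# Assumed dimensions are illustrative, not a specific model's config.
def kv_cache_bytes(n_layers, hidden, context_len, bytes_per_val=2):
    return 2 * n_layers * hidden * bytes_per_val * context_len

tb = kv_cache_bytes(80, 8192, 1_000_000) / 1e12
print(f"~{tb:.1f} TB of KV cache at 1M context")
# Note the linear growth: 10x the context means 10x the cache,
# which is exactly why long-video models hit a memory wall.
```

Techniques like KV cache quantization attack the `bytes_per_val` factor, while grouped-query attention attacks the per-layer head count.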
GitHub - LargeWorldModel/LWM
github.com
-
🧭 Google is also in the space of Small Language Models: 🚀 The new model Gemma 2 2B: A Deeper Dive into Distillation Excellence and Exceptional Performance! 🌟 I'm thrilled to share insights about Gemma 2 2B, a small language model in the AI landscape. This open-source language model, with just 2 billion parameters, offers significant advancements in efficiency, performance, and flexibility. 🧪 Distillation Excellence 🔹 Knowledge Transfer: Gemma 2 2B inherits knowledge from larger models through knowledge distillation, training to mimic their output while reducing size. 🔹 Efficiency Gains: Distillation maintains much of the original model's performance with a smaller, more computationally efficient footprint. 🔹 Hardware Compatibility: This model runs seamlessly on a wide range of hardware, from high-end servers to edge devices. 📈 Exceptional Performance 🔹 Benchmark Dominance: Demonstrates superior performance in natural language understanding, generation, and question answering benchmarks. 🔹 Chatbot Arena Success: Excels in human-like conversations, often surpassing larger models. 🔹 Fine-Tuning Potential: Can be further enhanced through fine-tuning on specific tasks or domains. ⚡ Efficiency and Speed 🔹 NVIDIA Speed Optimization: Leverages NVIDIA's optimizations for faster inference times, ideal for real-time applications. 🔹 Low Latency: Essential for interactive applications like chatbots and virtual assistants. 🔹 Cost-Effective Deployment: Requires less computational resources, lowering operating costs. 🔄 Flexibility 🔹 Model Variants: Offers different parameter counts to suit specific needs. 🔹 Diverse Applications: Deployable on edge devices and capable of handling complex tasks. 🔹 Scalability: Can scale up or down to handle varying workloads. 🛠️ Technical Specifications 🔹 Transformer Architecture: Based on the standard architecture for state-of-the-art language models.
🔹 Training Methodology: Likely trained with massive text data and self-supervised learning techniques. 🔹 Hardware Acceleration: Utilizes GPUs or TPUs for speed and efficiency. Gemma 2 2B is poised to revolutionize AI applications with its blend of power and versatility. Whether for chatbots, content generation, or research, this model is a game-changer! 🔗 Explore more about Gemma 2 2B: https://lnkd.in/dgtfP4tK 🔗 Read the Documentation: https://lnkd.in/dKkifK7p #AI #MachineLearning #Gemma2 #OpenSource #Python #DeepLearning #AIResearch #LanguageModel #TechInnovation
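The distillation idea mentioned above fits in one formula: the student is trained to match the teacher's softened output distribution, typically by minimizing a KL divergence at temperature T > 1. A dependency-free sketch of that loss over a single toy logit vector (the logit values are invented for illustration):

```python
# Knowledge distillation loss sketch: KL(teacher || student) over
# temperature-softened softmax distributions. Toy logits, no framework.
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)                       # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher_logits = [2.0, 1.0, 0.1]          # hypothetical teacher output
student_logits = [1.5, 1.2, 0.3]          # hypothetical student output
T = 2.0                                    # temperature softens both
loss = kl_divergence(softmax(teacher_logits, T), softmax(student_logits, T))
```

In real training this term is averaged over the vocabulary and batch, usually mixed with the ordinary next-token cross-entropy loss; the temperature exposes the teacher's "dark knowledge" about near-miss tokens.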