🚀 Powering ML with Ray: Insights from Zhe Zhang's Tech Talk 🖥️

Last month, we had the privilege of hosting Zhe Zhang (Distinguished Engineer at NVIDIA and Ray expert) at Granica HQ in Mountain View 🙌

Zhe delivered a "no-BS", thought-provoking talk that dove into a critical question: 💡 How do you command and utilize remote servers effectively for ML workloads?

Here are the 🔑 takeaways from Zhe's talk:

1️⃣ RPC is foundational: Remote Procedure Calls (RPCs) are the building block of distributed applications, enabling computation across remote servers (a minimal sketch follows below).

2️⃣ The cluster spectrum: Frameworks like K8s, Ray, and Apache Spark let us scale workloads across clusters, each offering unique tradeoffs for ML practitioners.

3️⃣ Choosing the right framework:
- 🏈 K8s shines for workloads resembling American football (a quarterback passing to multiple runners).
- ⚽ Ray excels when your workload mirrors soccer (everyone passing to everyone).
- Apache Spark simplifies things: if your workload fits its expressive language, it "does the RPCs for you."

We're incredibly grateful to Zhe for sharing his practical, real-world perspective on ML infra, one shaped by his hands-on experience at NVIDIA and Anyscale. 🎉 The session was insightful and approachable, with plenty of helpful Q&A, and it is a must-watch for anyone navigating the challenges of scaling ML workloads 📖

Thanks again, Zhe, for bringing your expertise to Granica HQ! 🙏

👉 Check out Zhe's blog (with the session recording inside) using the link in the comments.

#MachineLearning #Infrastructure #DistributedComputing #MLPractitioners
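To make the RPC idea concrete, here is a minimal sketch using Ray's core task API (our illustration, not code from Zhe's talk; the square() workload is an arbitrary stand-in). Each .remote() call is, conceptually, an RPC that Ray schedules on some worker in the cluster:

```python
import ray

ray.init()  # start or connect to a Ray cluster (local by default)

@ray.remote
def square(x):
    # This body runs in a worker process, possibly on another machine;
    # the .remote() call below is the RPC that dispatches it.
    return x * x

# Fan out eight remote calls; each returns a future immediately.
futures = [square.remote(i) for i in range(8)]

# Block until all remote calls finish, then fetch the results.
print(ray.get(futures))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Because any Ray task or actor can invoke any other, this model supports the "everyone passing to everyone" pattern from the soccer analogy above.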
Blog and video: https://granica.ai/blog/in-reference-to-rpc-choosing-the-right-compute-framework-for-your-ml-workloads
More Relevant Posts
-
Microsoft and Quantinuum created four highly reliable logical qubits from only 30 physical qubits while demonstrating an 800x improvement in error rate.
-
Microsoft and Quantinuum demonstrate the most reliable logical #qubits on record with an error rate 800x better than physical qubits
Microsoft and Quantinuum demonstrate the most reliable logical qubits on record with an error rate 800x better than physical qubits - Inside Quantum Technology
https://www.insidequantumtechnology.com
-
Explore the design principles of NVIDIA's GB200 NVL72 at the #OCPSummit2024 keynote and discover how we're collaborating with OCP to supercharge accelerated computing. Join us on October 15 to learn more about how this new architecture is tackling the challenges of deploying massive AI workloads.
Fostering Collaboration: Designing Data Centers for Tomorrow's AI Workloads
-
Microsoft and Quantinuum reach a new milestone in quantum error correction. The collaboration claims to have used an innovative qubit-virtualization system on Quantinuum's H2 ion-trap platform to create 4 highly reliable logical qubits from only 30 physical qubits.

What is quantum error correction? Physical qubits, with error rates on the order of 10^-2, are combined to deliver logical qubits with error rates on the order of 10^-5 (see the toy simulation after the video link below). According to their press release, this is the largest gap between physical and logical error rates reported to date, and it allowed them to run more than 14,000 individual experiments without a single error. (https://lnkd.in/dzETsvVA)

The race for qubit count seemed to end in 2023, with the latest update on IBM's roadmap focusing on quality rather than quantity (https://lnkd.in/dFu52wJR, "Until this year, our path was scaling the number of qubits. Going forward we will add a new metric, gate operations—a measure of the workloads our systems can run."), alongside other developments in quantum error correction, like the one announced in December by Harvard University, Massachusetts Institute of Technology, QuEra Computing Inc. and National Institute of Standards and Technology (NIST)/University of Maryland (https://lnkd.in/dkW-TT-w).

Practical quantum computing gets a little closer, although it is still a distant target.

Microsoft press release: https://lnkd.in/deJ4QCBk
Quantinuum's press release: https://lnkd.in/d4Wnmvdq
More details from Microsoft: https://lnkd.in/dusfZ4KY
Paper: https://lnkd.in/dpPCX3td

#quantumcomputing #quantumerrorcorrection #technology
Microsoft and Quantinuum demonstrate the most reliable logical qubits on record
https://www.youtube.com/
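For intuition about how many noisy physical components can combine into one far more reliable logical unit, here is a toy classical repetition-code simulation. To be clear, this is our own analogy, not the Microsoft/Quantinuum qubit-virtualization scheme: real quantum error correction uses stabilizer codes on quantum states, not classical majority votes.

```python
import random

def physical_error(p):
    """One noisy 'physical' component: returns True if it errs."""
    return random.random() < p

def logical_error(n, p):
    """Encode one bit into n copies; with majority-vote decoding,
    a logical error occurs only if a majority of copies err."""
    return sum(physical_error(p) for _ in range(n)) > n // 2

def estimate(n, p, trials=200_000):
    """Monte Carlo estimate of the logical error rate."""
    return sum(logical_error(n, p) for _ in range(trials)) / trials

p = 1e-2  # physical error rate on the order of 10^-2, as in the post
for n in (1, 3, 5):
    print(f"{n} copies -> logical error rate ~ {estimate(n, p):.2e}")
```

Already at 5 copies the logical error rate drops to roughly 10 * p^3 ≈ 10^-5, which is the flavor of the physical-to-logical gap described above.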
-
I'm happy to share that I've obtained a new certification: Efficient Large Language Model (#LLM) Customization from #NVIDIA!

In this day-long, hands-on workshop on high-end NVIDIA servers and models, we sharpened our skills in:
- Applying parameter-efficient fine-tuning techniques with limited data.
- Using LLMs to generate synthetic data for fine-tuning smaller models.
- Reducing model size requirements through a cycle of synthetic data generation and model customization.
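For a taste of the first bullet, here is a minimal parameter-efficient fine-tuning setup using LoRA via the Hugging Face peft library. This is an illustrative sketch, not the workshop's NVIDIA material; the base model (gpt2) and hyperparameters are assumptions for demonstration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Illustrative base model; the workshop used NVIDIA-hosted models.
model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# LoRA trains small low-rank adapter matrices instead of all weights.
config = LoraConfig(
    r=8,                        # adapter rank
    lora_alpha=16,              # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

# Typically well under 1% of parameters end up trainable, which is
# why limited data and modest hardware can suffice for customization.
model.print_trainable_parameters()
```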
-
Quantum Leap Achieved! Microsoft and Quantinuum have unlocked a new quantum computing milestone: 4 logical qubits from 30 physical ones, reducing error rates by 800x. This hybrid supercomputer approach brings us closer to solving previously intractable problems. The future of computing is here! #QuantumComputing #Microsoft #Quantinuum #Innovation
Microsoft and Quantinuum demonstrate the most reliable logical qubits on record
https://www.youtube.com/
-
Microsoft and Quantinuum have built a quantum computer that shows exceptional reliability, achieving a significant milestone in error correction, a crucial aspect of quantum computing. Their experiment involved running over 14,000 computational routines on Quantinuum's H2 quantum processors without any errors. They used a method developed by Microsoft to create "logical quantum bits" or qubits, which proved to be much more stable than traditional qubits, with the logical qubits producing significantly fewer errors.

"Today's results mark a historic achievement and are a wonderful reflection of how this collaboration continues to push the boundaries for the quantum ecosystem," said Ilyas Khan, founder and chief product officer of Quantinuum. "With Microsoft's state-of-the-art error correction aligned with the world's most powerful quantum computer and a fully integrated approach, we are so excited for the next evolution in quantum applications and can't wait to see how our customers and partners will benefit from our solutions especially as we move towards quantum processors at scale."

More from Microsoft here: https://lnkd.in/gRywQzve

#quantumcomputing
Microsoft and Quantinuum demonstrate the most reliable logical qubits on record
https://www.youtube.com/
-
Started learning Q# (Quantum Sharp) from Microsoft! It's always good to start working with the latest and I'm a big believer in quantum computing! #SoftwareEngineering #QuantumComputing #Developing #MicrosoftLearning
-
Owning a Grace Hopper is like owning a Lambo: you want to drive it hard… but can you? The most sophisticated platform engineering teams I talk to are utilizing these features to get the most CUDA for their bucks:

1. **Multi-Instance GPU (MIG):** MIG allows the GPU to be partitioned into multiple instances, each with dedicated compute, memory, and bandwidth. This enables efficient utilization of GPU resources, allowing multiple users or applications to run simultaneously without interference. Benefits include improved GPU utilization, isolation between tasks, and the ability to run different CUDA applications concurrently.

2. **Time Slicing:** Time slicing enables the GPU to share its resources between multiple applications by dividing time into slices and assigning them to different tasks. This allows for better resource management and can lead to improved overall system throughput. This feature is particularly useful for mixed workloads, where different applications require GPU resources at different times (i.e., almost everyone).

3. **Unified Memory:** Unified Memory simplifies memory management by providing a single memory address space accessible from both the CPU and GPU. This feature eliminates the need for explicit memory transfers between host and device, making programming easier and more efficient. Benefits include reduced programming complexity and potentially better performance for applications with complex memory access patterns.

4. **cuda-checkpoint:** This command-line utility allows for transparent checkpointing and restoring of CUDA applications on Linux. MemVerge supports first-of-its-kind GPU optimization, resource sharing, and bursting capabilities by taking advantage of cuda-checkpoint. Reach out to me if you'd like to learn more.

5. **Asynchronous Execution:** CUDA supports asynchronous execution, allowing computation and data transfer operations to be overlapped. This feature can lead to better performance by keeping the GPU busy while data is being transferred. Asynchronous execution is crucial for optimizing performance in applications with significant data transfer requirements (see the sketch below).

If you're gonna own the Lambo, you may as well learn to drive it hard too. What CUDA features do you find most interesting or useful in your work?

#NVIDIA #AI #ML #HPC #AWS
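On point 5, here is a minimal sketch of the two-stream pattern, using CuPy for brevity (our choice for illustration; the same structure applies in CUDA C++ with cudaStream_t). Note that true copy/compute overlap additionally requires pinned host memory, which this sketch omits.

```python
import cupy as cp
import numpy as np

# Two independent CUDA streams so transfers and kernels can interleave.
stream_a = cp.cuda.Stream(non_blocking=True)
stream_b = cp.cuda.Stream(non_blocking=True)

host_a = np.random.rand(1 << 22).astype(np.float32)
host_b = np.random.rand(1 << 22).astype(np.float32)

with stream_a:
    dev_a = cp.asarray(host_a)   # host-to-device copy enqueued on stream_a
    out_a = cp.sqrt(dev_a)       # kernel queued behind the copy on stream_a
with stream_b:
    dev_b = cp.asarray(host_b)   # stream_b's work can proceed independently
    out_b = cp.sqrt(dev_b)

# Wait for both pipelines to finish before reading results.
stream_a.synchronize()
stream_b.synchronize()
print(float(out_a.sum()), float(out_b.sum()))
```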
-
Exascale Computing: The Next Frontier in Supercomputing
https://nathealliv.com