Granica’s Post

🚀 Powering ML with Ray: Insights from Zhe Zhang's Tech Talk 🖥️

Last month, we had the privilege of hosting Zhe Zhang (Distinguished Engineer at NVIDIA and Ray expert) at Granica HQ in Mountain View 🙌

Zhe delivered a “no-BS”, thought-provoking talk that dove into a critical question:

💡 How do you command and utilize remote servers effectively for ML workloads?

Here are the 🔑 takeaways from Zhe’s talk:

1️⃣ RPC is foundational: Remote Procedure Calls (RPCs) are the building block of distributed applications, enabling computation across remote servers.

2️⃣ The cluster spectrum: Frameworks like K8s, Ray, and Apache Spark let us scale workloads across clusters, each with its own tradeoffs for ML practitioners.

3️⃣ Choosing the right framework:
- 🏈 K8s shines for workloads resembling American football (quarterback passing to multiple runners).
- ⚽ Ray excels when your workload mirrors soccer (everyone passing to everyone).
- Apache Spark simplifies things: if your workload fits its expressive language, it "does the RPCs for you."

We’re incredibly grateful to Zhe for sharing his practical, real-world perspective on ML infra, one shaped by his hands-on experience at NVIDIA and Anyscale. 🎉

The session was both insightful and approachable, with plenty of helpful Q&A, and it is a must-watch for anyone navigating the challenges of scaling ML workloads. 📖

Thanks again, Zhe, for bringing your expertise to Granica HQ! 🙏

👉 Check out Zhe’s blog (with the session recording inside) using the link in the comments.

#MachineLearning #Infrastructure #DistributedComputing #MLPractitioners
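
For readers who like to see the football-versus-soccer analogy in code, here is a rough sketch of the two communication patterns in plain Python. It is our own illustration, not material from Zhe's talk: it uses only the standard library rather than K8s or Ray, and the names `remote_square`, `fan_out`, and `all_to_all` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def remote_square(x: int) -> int:
    """Hypothetical stand-in for a procedure that would run on a remote server."""
    return x * x


def fan_out(inputs: list[int]) -> list[int]:
    """'Football' pattern: one coordinator hands work to many workers."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(remote_square, inputs))


def all_to_all(num_players: int) -> dict[int, list[int]]:
    """'Soccer' pattern: every player sends a message to every other player."""
    inbox: dict[int, list[int]] = {p: [] for p in range(num_players)}
    for sender in range(num_players):
        for receiver in range(num_players):
            if sender != receiver:
                # Record who sent a message to this receiver.
                inbox[receiver].append(sender)
    return inbox


if __name__ == "__main__":
    print(fan_out([1, 2, 3]))
    print(all_to_all(3))
```

In the fan-out case the coordinator is the only party that talks to workers, while in the all-to-all case the number of messages grows with the square of the number of players, which is one way to read why the two patterns favor different frameworks.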


