Zhe Zhang’s Post

Building the future of AI infra at NVIDIA. Apache Software Foundation Member; Former Head of Open Source (Ray) + Head of Field Engineering @ Anyscale.

I'm really excited to share this new blog on Amazon's exabyte-scale migration from #Spark to #Ray!

🐘 1.5 EiB of data processed in a quarter
🚀 82% better cost efficiency
💸 $120M in savings per year

OK, you might ask: isn't #BigData processing Apache Spark's bread and butter? 🤔 Well, this is a story about software abstractions. As Patrick Ames wrote in the blog, "... (Amazon engineers) had limited options to resolve performance issues due to Apache Spark successfully (and unfortunately in this case) abstracting away most of the low-level data processing details".

So, the takeaway: if you need more flexibility in data processing (e.g. you need GPUs, or work with #unstructured data like video), let's talk! https://lnkd.in/gesSyxHF to get started

https://lnkd.in/gfj2Xp_2

Amazon’s Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2 | Amazon Web Services

aws.amazon.com

Max Barker

Hire FAANG talent on Discord 🕹️ | Trusted by top VC backed startups | Send me a DM for access 👋

4mo
Leo Liang

Building Something New | Venture Partner @ Sancus | Plumber in Data, AI and Blockchain

4mo

The day eventually comes - congrats Anyscale team!

Akshay Verma

#python #dataops #mlops #opensource

4mo

Really helpful!

Malathi Sankar

Director | Data/ML/Search Engineering and Platform

4mo

Fascinating read. Thanks for sharing!

Pritam Pan

Senior Data Engineer@Pinterest📌 || AWS || Spark || Iceberg || Airflow || Ex-Groupon

4mo

This is another excellent use case for Ray beyond machine learning. Fascinating read!
