Applying your server-side deep learning skills on Apple devices may be easier than you think. Yunfei Cheng and I evaluated the learning curve by comparing MLX kernels running on the Metal GPUs of Apple Silicon chips with PyTorch kernels on CUDA GPUs. The image below depicts the scalability of scaled dot-product attention (SDPA) and linear projection on the M1 Max, M2 Ultra, A100, and H100. The x-y plane represents the beam shape used in our Recurrent Drafting work (https://lnkd.in/dvrvUwbU). All of these kernels show a similar scalability trend as the beam shape grows. Interestingly, the performance difference between CUDA and Metal in SDPA is considerably smaller than in linear projection. For example, linear projection showed a roughly 100x performance difference between the M1 Max and the H100, whereas SDPA showed only about a 25x difference on the same hardware.
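To give a feel for how such a measurement looks on the MLX side, here is a minimal sketch that times SDPA and a linear projection on the Metal GPU. The beam shapes, head count, and hidden size are illustrative assumptions, not the exact configuration behind the figure; `mx.fast.scaled_dot_product_attention` is MLX's built-in SDPA kernel, and `mx.eval` forces MLX's lazy graph to actually run.

```python
# Hedged sketch: timing SDPA and a linear projection with MLX on Metal.
# The beam shapes and tensor sizes below are assumptions for illustration.
import time
import mlx.core as mx

def time_op(fn, warmup=3, iters=20):
    """Run fn several times, forcing evaluation, and return mean latency in ms."""
    for _ in range(warmup):
        mx.eval(fn())
    start = time.perf_counter()
    for _ in range(iters):
        mx.eval(fn())
    return (time.perf_counter() - start) / iters * 1e3

num_heads, head_dim, hidden = 32, 128, 4096          # assumed model dimensions
for num_beams, beam_len in [(1, 16), (4, 16), (8, 32), (16, 64)]:  # assumed beam shapes
    q = mx.random.normal((num_beams, num_heads, beam_len, head_dim))
    k = mx.random.normal((num_beams, num_heads, beam_len, head_dim))
    v = mx.random.normal((num_beams, num_heads, beam_len, head_dim))
    x = mx.random.normal((num_beams * beam_len, hidden))
    w = mx.random.normal((hidden, hidden))

    sdpa_ms = time_op(lambda: mx.fast.scaled_dot_product_attention(
        q, k, v, scale=head_dim ** -0.5))
    linear_ms = time_op(lambda: x @ w)  # linear projection as a plain matmul
    print(f"beam=({num_beams},{beam_len})  sdpa={sdpa_ms:.3f} ms  linear={linear_ms:.3f} ms")
```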
How does MLX on Metal perform on machine learning workloads? Yi Wang and I conducted a set of benchmarks using the M1 Max and M2 Ultra with MLX, and the A100 and H100 with PyTorch, to compare the performance of two fundamental operations, SDPA and linear projection. A surprising finding is how close the M2 Ultra comes to the A100, underscoring the potential of on-device machine learning. The benchmarks also reveal distinct performance trends: linear projection shows a linear increase in latency with larger input sizes, while SDPA's latency grows super-linearly due to its higher computational complexity. Interestingly, the performance disparity in SDPA is much less pronounced than in linear projection. For instance, linear projection demonstrates a nearly 100x performance difference between the M1 Max and the H100, whereas SDPA shows only about a 25x difference on the same hardware. These findings highlight the promise of on-device machine learning, and we look forward to further performance gains, particularly as Metal advances.
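For comparison, the CUDA-side measurement with PyTorch could look like the sketch below. The tensor sizes mirror the MLX sketch above and are likewise assumptions rather than the exact benchmark setup; `torch.nn.functional.scaled_dot_product_attention` is PyTorch's SDPA entry point, and explicit `torch.cuda.synchronize()` calls keep asynchronous kernel launches from skewing the timings.

```python
# Hedged sketch: timing SDPA and a linear projection with PyTorch on CUDA.
# Sizes mirror the MLX sketch above and are illustrative assumptions.
import time
import torch
import torch.nn.functional as F

def time_op(fn, warmup=3, iters=20):
    """Time a CUDA op; synchronize so queued kernels finish before the timer stops."""
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3

device, dtype = "cuda", torch.float16
num_heads, head_dim, hidden = 32, 128, 4096          # assumed model dimensions
for num_beams, beam_len in [(1, 16), (4, 16), (8, 32), (16, 64)]:  # assumed beam shapes
    q = torch.randn(num_beams, num_heads, beam_len, head_dim, device=device, dtype=dtype)
    k, v = torch.randn_like(q), torch.randn_like(q)
    x = torch.randn(num_beams * beam_len, hidden, device=device, dtype=dtype)
    w = torch.randn(hidden, hidden, device=device, dtype=dtype)

    sdpa_ms = time_op(lambda: F.scaled_dot_product_attention(q, k, v))
    linear_ms = time_op(lambda: x @ w)  # linear projection as a plain matmul
    print(f"beam=({num_beams},{beam_len})  sdpa={sdpa_ms:.3f} ms  linear={linear_ms:.3f} ms")
```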