Charles H. Martin, PhD’s Post

AI Specialist and Distinguished Engineer (NLP & Search). Inventor of weightwatcher.ai . TEDx Speaker. Need help with AI ? #talkToChuck

4w Edited

"Fine-Tuning is a nightmare in practice..." (advice from X). I agree. I have built ML and AI models for clients for 20 years. eHow. eBay. Walmart. Blackrock. Even Google (well, Aardvark). It's hard. But the payoff is there --if you can get it right. Best example: eHow was the first $1B IPO since Google. So if you are Fine-Tuning your own models, how can you know if you are on the right path ? Weightwatcher can help. I invented weightwatcher to help my clients who are training and/or fine-tuning their own AI models. And it's open-source. How can it help ? Here are over a dozen examples of how to interpret weighwatcher results for Instruction Fine-Tuned models. Generally speaking, if you get the Fine-Tuning right, your model will follow the predictions of the weightwatcher HTSR theory. And it when it doesn't, that special case can be useful too. If you know what you are looking at. If you have a fine tuned model, and you have the base model, this is all you do: pip install weightwatcher import weightwatcher as ww watcher = ww.WeightWatcher() details = watcher.analyze(model=model, base_model=base_model) Weightwatcher will remove the instruction fine-tuned components from the base model and analyze them for you Want to learn more ? Check out the examples: https://lnkd.in/gS8bS3tM Have questions? Join our Community Discord: https://lnkd.in/gZQF64Bw Or ping me here. And if there are cases you think we should add, please let us know. WeightWatcher is a one-of-a-kind must-have tool for anyone training, deploying, or monitoring Deep Neural Networks (DNNs). #talkToChuck. #theAIguy

11 Comments

Marco Herzog 🤖

AI Insights from the Trenches | Translating Complex AI Concepts into Actionable Knowledge

Oh wow, that actually sounds interesting. I was looking for something to get good evals for QLoRA adapters. I will check it out.


Mariusz Kurman

AI for Health | MD

Playing in progress... :)


Remigiusz Kinas

Head of AI @ Grupa NEUCA | AI, iRPA | 2xKaggle GM | Speakleash core team

Thank you Charles for Bielik evaluation 👍👏🙏

Creds

Shoutout for sharing this incredible resource! It's a gift to the world. Your selflessness is inspiring, and we're all supporting your endeavors! 🚀👏


Igor Couto

CEO of Sofya | Founder & Chief Technologist at 1STI | Deep Tech Amplified Organization Author

Bruno Dorneles



More Relevant Posts

Charles H. Martin, PhD

AI Specialist and Distinguished Engineer (NLP & Search). Inventor of weightwatcher.ai . TEDx Speaker. Need help with AI ? #talkToChuck
3w Edited
I have been building production machine learning models for clients on Wall Street, Main Street, and in Silicon Valley for over 20 years. I believe that the next generation of AI models will use large-scale instruction fine-tuning. And to achieve this, you need to know if the fine-tuning is working. To solve this problem, I invented weightwatcher. If you want to get into the space and understand how to fine-tune models at scale, please check out these examples of what you should expect from a good model.
Patrick Hall

Machine Learning & AI Risk Management
9mo Edited
Running the SHAP package on default settings? Then your explanations are probably wrong!

I'll be giving a free preview of my summer Maven course on explainable #AI (#XAI), covering the topic I get asked about the most: Shapley values. This free course preview will be held Monday, April 8th, 2024 at 10:30 AM EDT (30 minutes) via Zoom.

In this preview, I'll be giving pointers on the good, bad, and ugly of using Shapley values and the SHAP package to explain #machinelearning models. I'll be covering:
- The Good: How SHAP is ideally supposed to work.
- The Bad: Silly results due to correlation and high variance in feature attribution values, and waiting forever for those silly results!
- The Ugly: What it takes to improve SHAP, including interventional feature perturbation, monotonicity and interaction constraints, large reference datasets, minimizing correlation, feature grouping, Owen values, and more!

Learn more about the free course preview here: https://bit.ly/3TIQnba.

Understand SHAP (SHapley Additive exPlanations)

maven.com

Anuradha Tibile

TEDx Organiser | Google Developer Experts | Google Developers Group | Google Women Techmaker| Women and Climate | Speaker at TensorFlow |
8mo Edited
Ready to Unbox the Future? Google Dev Groups Build with AI - Calling All Machine Learning Mavens and Neural Network Ninjas! Get ready to #transmute your ideas into groundbreaking innovations! We're brewing up a #PowerfulPotion of knowledge at the next Google Developer Groups event in the #BuildwithAI series, and guess who's stirring the cauldron? (That's right, yours truly!) @AnuradhaTibile Mark your calendars for #May10th because we're about to unleash a potent blend of cutting-edge AI techniques and practical implementation strategies. This isn't your average developer gathering – we're diving deep into the heart of AI and crafting spells (a.k.a. code) that will #revolutionize the future. Stay tuned for more & enjoy this sneak peek. #BuildWithAI #GoogleDevGroups #AI #MachineLearning #GirlsInITC P.S. Spread the word! Let's make this event a gathering of the brightest minds in the AI realm.
Andrew Solomon

Lead Generative AI Engineer | JLR
3mo
🚀 Demystifying OpenAI’s New Model: #o1 🤖 The buzz around #OpenAI’s o1 model is impossible to ignore—but let’s take a moment to break it down, beyond the hype. Many people see these advanced AI models as magical, mysterious black boxes. But here’s the thing: AI, especially models like o1, aren’t magic—they’re systems designed with layers of logic and reasoning, much like how our own brains approach complex problems. Imagine you’re in a brainstorming session. 🧠 You start with basic ideas, then gradually add layers of depth, considering different angles, combining thoughts, and eventually reaching a refined conclusion. That’s how models like o1 work too. Just as a deep neural network has layers—each contributing something new and unique—the o1 model processes information in layers of reasoning. It starts with simple, foundational understanding and builds toward more complex insights. It’s like how a human would reason through a problem, one step at a time. Rather than being a mystical force, the o1 model is a tool that mimics this human-like reasoning, breaking problems down into understandable pieces and working through them logically. It’s closer to having a collaborative brainstorming partner than anything magical. Let’s embrace the real power of these models: making reasoning more accessible and scalable—not mysterious. #AI #OpenAI #DeepLearning #NeuralNetworks #AIReasoning #Innovation #Brainstorming #FutureTech #LogicNotMagic #LetsBreakItDownLogically
Corey Abshire

ML & GenAI at Databricks
9mo
Really looking forward to this! Check it out and sign up if you're interested in explainable AI (and if you're in or around AI and ML you should be)!

Pietro Bolcato

Lead AI Engineer @Kittl | Gen AI, CV, NLP, MLOps | MSc AI, Double Degree | 2x Azure AI certified
1mo
🏆 Great blogpost from HF explaining Vision-Language Models! Very much recommended, even if we are working with these models on a day-to-day basis. It's always good to take a step back and see the field as a whole.

In a nutshell: these models combine vision and language for complex tasks like image captioning and visual question answering. Since 2021, interest in these models has surged, with examples like OpenAI’s CLIP. They excel in zero-shot generalization, making them very versatile.

Vision-language models use strategies like contrastive learning and multi-modal fusing to align images and text. They are pre-trained on large datasets and fine-tuned for specific tasks. Thanks to HF, they can be used really easily with the Transformers library. Kudos!

🔗 Read the blogpost: https://lnkd.in/evC-QWkb

⤵ Helpful? Follow me and join ⚡️ AI Pulse (https://lnkd.in/eWudwDsd) for daily, curated, bite-sized updates on AI, focused on what truly matters to keep you ahead of the curve 🔥
Manasa H M

Student at SVCE BANGALORE | CSE | Fellow@NxtWave's CCBP 4.0 Academy | Python, Front End Development
3w
🎥 First Step Towards Machine Learning 🚀 Thrilled to share my journey into the world of machine learning with this hands-on project! 🌟 I explored how we can train models to recognize patterns and make predictions, all without diving into complex code. This project is just the beginning, and I’m excited to explore more advanced concepts in ML and AI. Your feedback is invaluable, so let me know what you think! #MachineLearning #DeepLearning #AI #LearningJourney #Projects
Cloudflare

1,007,010 followers
8mo
Workers AI now supports fine-tuned models using LoRAs. But what is a LoRA and how does it work? In this post, we dive into fine-tuning, LoRAs and even some math to share the details of how it all works under the hood. https://cfl.re/4ajt39H #DeveloperWeek

Running fine-tuned models on Workers AI with LoRAs

blog.cloudflare.com
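For readers who want the gist of the math before opening the blog post, here is a minimal sketch of the standard LoRA update, W' = W + (alpha / r) * B * A, where B and A are small low-rank matrices trained in place of the full weight matrix. This is the textbook formulation and makes no claim about how Workers AI implements it; `matmul`, `merge_lora`, and the toy matrices are hypothetical names and made-up values.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain-Python matrix product of x (n x m) and y (m x p)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return w + (alpha / r) * (b @ a), where r is the adapter rank."""
    r = len(a)  # rank = number of rows of A (and columns of B)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as w
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy 3x3 base weight with a rank-1 adapter: B is 3x1, A is 1x3.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [[1.0], [2.0], [3.0]]
a = [[0.5, 0.0, -0.5]]
merged = merge_lora(w, b, a, alpha=2.0)
print(merged)
```

The adapter here stores 6 numbers instead of 9; at realistic layer sizes with a small rank, that gap is what makes LoRA fine-tunes cheap to train and ship.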
Thorben Dannegger

Coca-Cola Europacific Partners
3mo
Just when you thought AI advancement would be slowing down... OpenAI is releasing a new series of models: o1. The benchmarks are absolutely mind-blowing. I literally talked yesterday about how AI hasn't surpassed humans yet in competition-level maths, and now this. Just amazing. What's particularly interesting is the new approach they've taken. Instead of just relying on a larger training dataset, the model "thinks" before it answers, refining its responses autonomously in the background. Full article: https://lnkd.in/eWKiry2f
Logan Grasby

ML Engineer at Cloudflare
8mo Edited
I had the opportunity to build this feature with Cloudflare's incredible Workers AI team. It is a bit surreal to be able to ship something like this on a platform with over 2 million developers. Thankfully I get to work with some of the smartest people I've ever met. But what even is this feature? You can now bring your own LLM fine-tunes, trained on your own data, to Cloudflare. This means you can customize the models we host and adapt them to your use case. It also means that you can achieve beyond-GPT-4 performance on specific tasks. We're just getting started with fine-tuning and many more task types are coming soon. Read more below about how we built this feature!

