Stephen Li’s Post

Apple Foundation Model: What’s the Big Deal for AI Infra?

Apple dropped some AI heat in June, and iPhone users are hyped. Sure, it’s a bummer that it only works on iPhone 15 Pro and above, but Apple’s paper on their Foundation Model (AFM-on-Device) shows exciting potential!

First, the model is small: just 3 billion parameters, similar in size to Phi-3. Apple squeezed it down with INT4 quantization, so the model weights need only around 1.5GB of memory. With the KV cache and other overheads, the whole thing should stay under 2GB! 🚀 (Quick back-of-envelope math in the first sketch below.)

What’s even smarter? Because Apple controls the whole stack, it introduced adapters: the iPhone loads only a small, task-specific slice of extra weights depending on the app you’re using. If you’re in Mail, it loads just what’s needed for email replies. That keeps the model zippy and sharp. (See the second sketch below for the idea.)

Now, here’s the kicker: Apple trained the big server model on 8,192 TPUv4 chips in Google Cloud (yep, Google’s TPUs!), then distilled it down to the tiny on-device version trained on 2,048 TPUv5p chips. If this works, it might shift more attention to TPU training, which could eat into NVIDIA’s GPU dominance. 🤔

If AFM-on-Device performs well, we might see:
• More edge AI adoption, with small on-device models lightening the memory and compute load.
• More flexibility in data center locations, since edge AI reduces latency concerns. That said, some heavy lifting will still need to happen in the cloud (hello, o1!).
• A rise in TPU training customers.

Before the first iPhone, BlackBerry ruled smartphones (and Android arrived soon after). The iPhone totally changed the world. This time, let’s see if Apple can lead the way for edge AI! 🧠📱

This post is part of my Independent Study in AI Infrastructure at Stanford GSB, where I’m exploring potential challenges and opportunities over the next 2-3 years. If you have any insights or suggestions, feel free to comment or DM me! https://lnkd.in/gP3kKsTS

#AI #Apple #EdgeAI #NVIDIA #TPU #LLM #390IndependentStudy #StanfordGSB #MSx
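A quick sanity check on that memory claim. This is a rough sketch: the 3B/INT4 figures come from the post, while the KV-cache shape is an assumed config, not Apple’s published architecture.

```python
# Back-of-envelope memory math for a ~3B-parameter model quantized to INT4.
params = 3e9                  # ~3 billion weights (AFM-on-Device class)
bits_per_weight = 4           # INT4 quantization
weight_bytes = params * bits_per_weight / 8
print(f"weights:  {weight_bytes / 2**30:.2f} GiB")  # ~1.40 GiB (~1.5 GB)

# KV cache for a *hypothetical* 3B-class transformer (assumed, not Apple's specs):
layers, kv_heads, head_dim = 26, 8, 128    # assumed model shape
context_len, bytes_per_val = 4096, 2       # 4k-token context, FP16 K/V entries
kv_cache = 2 * layers * kv_heads * head_dim * context_len * bytes_per_val
print(f"KV cache: {kv_cache / 2**30:.2f} GiB")      # ~0.41 GiB

print(f"total:    {(weight_bytes + kv_cache) / 2**30:.2f} GiB")  # comfortably < 2 GB
```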
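And here’s the adapter trick as a minimal sketch. Apple’s stack is closed, so everything below (names, shapes, rank) is hypothetical; it just illustrates the LoRA-style idea from the paper: one frozen shared base model, with tiny low-rank weight deltas swapped in per app.

```python
import numpy as np

def lora_linear(x, w_base, a, b, scale=0.1):
    """One linear layer with a LoRA-style adapter:
    y = x @ w_base + scale * (x @ a) @ b.
    w_base stays frozen; only the small a (d x r) and b (r x d) differ per task."""
    return x @ w_base + scale * (x @ a) @ b

d, r = 64, 8                           # hidden size and adapter rank (r << d)
rng = np.random.default_rng(0)
w_base = rng.standard_normal((d, d))   # shared base weight, never swapped out

# Tiny per-app adapters: each holds only 2*d*r params vs d*d for the base layer.
adapters = {
    "mail_reply": (rng.standard_normal((d, r)), rng.standard_normal((r, d))),
    "summarize":  (rng.standard_normal((d, r)), rng.standard_normal((r, d))),
}

x = rng.standard_normal((1, d))
a, b = adapters["mail_reply"]          # "user opened Mail" -> load only this delta
y = lora_linear(x, w_base, a, b)
print(y.shape)                         # (1, 64)
```

The payoff: Apple’s report puts the adapters at tens of megabytes each, so the phone swaps a small file per app instead of reloading a multi-GB model.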

Introducing Apple’s On-Device and Server Foundation Models

machinelearning.apple.com

Godwin Josh

Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer

2mo

The use of INT4 precision and adapters is ingenious, enabling efficient on-device inference while minimizing memory footprint. This approach could significantly impact the landscape of federated learning by allowing for more privacy-preserving model training directly on user devices. However, how will Apple's AFM-on-Device navigate the complexities of dynamic resource allocation in heterogeneous edge environments, given device heterogeneity and fluctuating network conditions?
