Last year, Apple released open-source tools for training and inference on Apple silicon. Yesterday, they provided more details about their models running on Apple silicon. While Apple's on-device models currently lag behind GPT-4 in capability, the strategy of leveraging the computing power already in personal devices offers real advantages: low latency, high energy efficiency, no additional compute cost for Apple, and data that never leaves the device. Even the relatively low cost of using OpenAI's models can accumulate over time and rule out some use cases. It remains to be seen whether Apple's on-device models will become capable enough for widespread applications.
-
This validates much of my own experience with where the industry is going. Apple owns this market, and it might break them into the enterprise by clustering an office of MacBook laptops for combined inference and training power. Conveniently, MLX introduced distributed computing in a new release, which is almost certainly being used to build out Apple's new (AI-native) private cloud. And if you read the analyses by security researchers, that cloud, while never perfect, follows all the right patterns. The new Apple Intelligence runs tiny LLMs (probably classifiers to identify the workload) plus small, heavily quantized 1-3B LLMs with LoRA adapters measured in megabytes, meaning you can fit tons of different use cases on a single device. And no Nvidia needed, thanks to M-series silicon. Ignore the ChatGPT stuff when looking at the long term: it's likely a stopgap, and I wouldn't be surprised if Apple just partners with everyone in the near to mid term to commoditize it. In short, Apple created an ecosystem and platform that nobody else could, and is about to unleash it on the world with a UX that will change behavior. https://lnkd.in/gHsSUeRP
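For a taste of what that MLX distributed release enables, here is a minimal sketch of averaging gradients across machines. It assumes MLX 0.14+ (the release that added mlx.core.distributed) and an MPI launcher; the gradient values are made up for illustration:

```python
# Minimal MLX distributed sketch: average per-machine gradients.
# Launch across hosts with MPI, e.g.:
#   mpirun -np 4 --hostfile hosts.txt python all_sum_demo.py
import mlx.core as mx

group = mx.distributed.init()               # join the distributed group
print(f"node {group.rank()} of {group.size()}")

# Stand-in for gradients computed locally on this machine's data shard.
local_grads = mx.ones((1024,)) * group.rank()

# all_sum sums elementwise across every node; divide for the mean.
avg_grads = mx.distributed.all_sum(local_grads) / group.size()
mx.eval(avg_grads)                          # force MLX's lazy evaluation
```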
-
Apple's Foundation Models

At WWDC 2024, Apple introduced its new suite of AI models under the "Apple Intelligence" banner. Integrated into iOS 18, iPadOS 18, and macOS Sequoia, these models aim to enhance everyday user experiences through advanced generative capabilities.

Key Highlights:
- Apple unveiled a ~3 billion parameter on-device language model and a larger server-based model, both tailored for efficiency and accuracy. These models power a range of features, including text refinement, notification summarization, and image creation.
- Apple's commitment to responsible AI development is evident in its focus on user empowerment, authentic representation, careful design, and robust privacy protections.
- Built with the AXLearn framework, Apple's models are trained on licensed and public data, with strict filters to exclude personal information.
- Techniques like low-bit palletization and grouped-query-attention ensure high performance and efficiency, especially for the on-device model.
- Apple uses adapters to fine-tune models for specific tasks without altering the base parameters, enabling dynamic specialization while maintaining general knowledge.
- Performance is rigorously tested through human evaluation, focusing on real-world applications and adversarial prompts to ensure safety and effectiveness.

Apple's approach to data privacy is robust: on-device processing safeguards user data, and Private Cloud Compute secures server operations, so personal data remains protected in line with Apple's long-standing commitment to privacy.

Apple's foundation models represent a significant advancement in AI integration. They enhance the user experience across its ecosystem while adhering to principles of responsible AI development and data privacy. For more details, visit https://lnkd.in/gtxia6Gd

#apple #foundationalmodels #llm
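To make the adapter point concrete, here is a minimal LoRA-style layer in plain numpy. This is my own illustration of the general technique, not Apple's implementation: the base weight stays frozen and only two small low-rank matrices are trained per task.

```python
import numpy as np

class LoRALinear:
    """Frozen base weight W plus a trainable low-rank update.

    y = x @ W.T + scale * (x @ A.T) @ B.T
    Only A and B (rank r) are trained per task, so an "adapter" is just
    these two small matrices: megabytes, not gigabytes.
    """
    def __init__(self, w_base: np.ndarray, rank: int = 8, scale: float = 1.0):
        d_out, d_in = w_base.shape
        self.w = w_base                                 # frozen
        self.a = np.random.randn(rank, d_in) * 0.01     # trainable
        self.b = np.zeros((d_out, rank))                # trainable, zero init
        self.scale = scale

    def __call__(self, x: np.ndarray) -> np.ndarray:
        return x @ self.w.T + self.scale * (x @ self.a.T) @ self.b.T

layer = LoRALinear(np.random.randn(64, 32), rank=4)
print(layer(np.random.randn(8, 32)).shape)              # (8, 64)
```

Swapping tasks then means swapping (A, B) pairs while the base model never changes, which is what lets one ~3B model serve many features.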
-
Apple's recent announcement at WWDC about Apple Intelligence has created excitement in the tech community. The use of on-device LLM inference is truly impressive and in sync with Apple's privacy and responsible AI policies. What stands out is the handful of remarkable optimizations applied to the foundation model and adapters, enabling a time-to-first-token latency of about 0.6 milliseconds per prompt token on iPhone 15 Pro. The incorporation of speculative decoding further improves the results. Here are some key highlights from this innovative development:
- Shared input and output vocab embedding tables
- Low-bit palletization for on-device inference
- Grouped-query-attention in both the on-device and server models
- A new framework using LoRA adapters with a mixed 2-bit and 4-bit configuration strategy (3.5 bits per weight on average)
- Talaria, an interactive model latency and power analysis tool, to guide bit-rate selection
- Activation quantization and embedding quantization
- Efficient key-value (KV) cache updates on the Neural Engine, which is likely a custom kernel-level optimization
These optimizations signify a remarkable step forward in machine learning and artificial intelligence, setting new standards for efficiency and performance. Learn more about Apple's Foundation Models at https://lnkd.in/enwB3GJS.
#Apple #WWDC #MachineLearning #ArtificialIntelligence #optimization #LLM
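One number worth unpacking is that 3.5-bit average. If a fraction f of the weights are stored at 4 bits and the rest at 2 bits, the average is 2 + 2f, so 3.5 bits implies roughly three quarters of the weights at 4 bits. Back-of-the-envelope only; Apple has not published the actual split:

```python
# Mixed 2-bit / 4-bit palletization averaging 3.5 bits per weight (bpw).
# avg = 2 * (1 - f) + 4 * f = 2 + 2f  =>  f = (avg - 2) / 2
avg_bits = 3.5
frac_4bit = (avg_bits - 2) / 2
print(f"fraction of weights at 4-bit: {frac_4bit:.0%}")     # 75%

# Rough weight footprint of a ~3B-parameter model at 3.5 bpw:
params = 3e9
print(f"~{params * avg_bits / 8 / 1e9:.2f} GB of weights")  # ~1.31 GB
```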
-
I’m kind of impressed with Apple’s on-device model implementation strategy, which uses mixed-precision quantization to deliver an optimal device experience. It employs 2-bit and 4-bit quantization (averaging 3.5 bits per weight) for routine tasks, enhancing efficiency and reducing power consumption, while more complex operations (image and voice processing, perhaps?) appear to run at FP16 precision. Apple has also introduced adapters: small neural network modules that dynamically fine-tune the model for specific tasks like text generation and summarization by overlaying specialized weights. This aggressive quantization combined with an adaptable architecture allows data to be processed directly on-device, which means faster response times and better privacy, since user data stays local. I know, I know... I'm leaving out the OpenAI integration, but that's not the part that impressed me. This is from their documentation; I haven't tested it. I can't wait to see how it performs in real life. If you want to read the press release: https://lnkd.in/gfcB3zt3 And if you want to understand more about quantization: https://lnkd.in/gxwWvVQR
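If the term is unfamiliar: "palletization" is palette (lookup-table) quantization. Weights are clustered, the cluster centers form a small shared palette, and each weight stores only a tiny index into it. A toy 2-bit version in numpy, purely illustrative and far simpler than Apple's actual kernels:

```python
import numpy as np

def palettize(w: np.ndarray, bits: int = 2, iters: int = 10):
    """Toy k-means palettization: each weight becomes a `bits`-wide
    index into a palette of 2**bits shared centroid values."""
    k = 2 ** bits
    palette = np.quantile(w, np.linspace(0, 1, k))   # init from quantiles
    for _ in range(iters):
        idx = np.abs(w[:, None] - palette[None, :]).argmin(axis=1)
        for c in range(k):                           # recompute centroids
            if np.any(idx == c):
                palette[c] = w[idx == c].mean()
    return idx.astype(np.uint8), palette             # indices + lookup table

w = np.random.randn(4096)
idx, palette = palettize(w)
w_hat = palette[idx]                                 # dequantized weights
print(f"MSE: {np.mean((w - w_hat) ** 2):.5f}")
```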
-
https://lnkd.in/g7Gv4ASB reveals some interesting details on Apple's on-device approach. Here are a few of my sloppy but hopefully useful highlights:

- "Apple Intelligence is comprised of multiple highly-capable generative models that are specialized for our users’ everyday tasks, and can adapt on the fly for their current activity. The foundation models built into Apple Intelligence have been fine-tuned for user experiences such as writing and refining text, prioritizing and summarizing notifications, creating playful images…"

- "…detail how two of these models — a ~3 billion parameter on-device language model, and a larger server-based language model available with Private Cloud Compute and running on Apple silicon servers — have been built and adapted to perform specialized tasks efficiently, accurately, and responsibly. These two foundation models are part of a larger family of generative models created by Apple…"

- "…foundation models are trained on Apple's AXLearn framework, an open-source project we released in 2023. It builds on top of JAX and XLA, and allows us to train the models with high efficiency and scalability on various training hardware and cloud platforms, including TPUs and both cloud and on-premise GPUs. We used a combination of data parallelism, tensor parallelism, sequence parallelism, and Fully Sharded Data Parallel (FSDP) to scale training along multiple dimensions such as data…"

- "…server-based models’ general capabilities. We utilize a comprehensive evaluation set of real-world prompts to test the general model capabilities. These prompts are diverse across different difficulty levels and cover major categories such as brainstorming, classification, closed question answering, coding, extraction, mathematical reasoning, open…"
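Since AXLearn sits on top of JAX, those parallelism strategies are mostly expressed as sharding and collective annotations rather than hand-written communication. A minimal data-parallel training step in plain JAX (generic JAX, not AXLearn's actual configuration):

```python
# Data parallelism, the simplest of the four axes the excerpt lists:
# replicate the weights, shard the batch, and average gradients.
import functools
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

@functools.partial(jax.pmap, axis_name="devices")
def train_step(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    grads = jax.lax.pmean(grads, axis_name="devices")  # sync replicas
    return w - 0.1 * grads

n = jax.local_device_count()
w = jnp.stack([jnp.zeros((16, 1))] * n)   # replicated weights
x = jnp.ones((n, 32, 16))                 # 32 examples per device
y = jnp.ones((n, 32, 1))
w = train_step(w, x, y)
print(w.shape)                            # (n, 16, 1)
```

Tensor, sequence, and FSDP parallelism layer further sharding on top of the same idea, splitting weights and activations instead of just the batch.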
-
If you are interested in AI and machine learning, I really recommend you head to machinelearning.apple.com to read up on how Apple developed the "Apple Intelligence" features coming in the upcoming iOS, iPadOS, and macOS releases. The local model is a 3B-parameter SLM (Small Language Model) that uses adapters trained for each specific feature; the diffusion model does the same thing, with an adapter for each style. This is in line with the predictions I posted last week in Tech Insights Week 23. Anything running locally or in Apple's secure cloud (Private Cloud Compute) is an Apple model, not OpenAI. https://lnkd.in/dCr7UqTC
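A quick size estimate shows why per-feature adapters are so cheap. The layer counts and dimensions below are my assumptions for illustration, not Apple's published architecture; the point is the order of magnitude:

```python
# Rough size of one rank-16 LoRA adapter set for a ~3B model.
# All architecture numbers are assumptions, chosen for illustration.
d_model, n_layers, rank = 3072, 28, 16
mats_per_layer = 6   # e.g. q/k/v/o plus two MLP projections (assumed)

# Each adapted matrix adds two low-rank factors: (r, d) and (d, r).
params = n_layers * mats_per_layer * 2 * rank * d_model
mb = params * 2 / 1e6                     # 16-bit adapter weights
print(f"{params / 1e6:.1f}M params ≈ {mb:.0f} MB per adapter")  # ~33 MB
```

Tens of megabytes per feature against gigabytes for the base model, which is how one device can hold an adapter per feature and per image style.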
-
Apple's WWDC 2024 Highlights: A Glimpse into Next-Gen AI Capabilities

This year's Worldwide Developers Conference showcased innovation, particularly w/ the intro of Apple Intelligence. As we dive into the details, here's my take on the groundbreaking developments and how they might transform our interaction w/ tech.

Apple Intelligence: A New Era? Apple's announcement of integrating Apple Intelligence into iOS 18, iPadOS 18 & macOS Sequoia opens a new chapter in personalized tech. Comprising various generative models, this system adapts to user activities and enhances tasks from writing texts to managing notifications, and more.

Four Things Stand Out:
1️⃣ Dual Models for Diverse Needs: The integration of a ~3 billion parameter on-device model and a more extensive server-based model is intriguing. The balance between on-device processing and cloud-based capabilities could set new standards for speed and privacy.
2️⃣ Commitment to Responsible AI: Apple's outlined AI principles resonate deeply, especially their focus on user empowerment and privacy protection. The proactive approach to AI misuse and bias prevention is commendable.
3️⃣ Optimization Excellence: The reported improvements in model latency and efficiency on the iPhone 15 Pro are impressive. Apple's use of low-bit palletization and quantization techniques could be game-changers.
4️⃣ Evaluating the Impact: The benchmarks shared show that Apple's models not only perform well but are preferred by users over competing models. This human-centered evaluation approach is crucial for understanding real-world utility.

Questions: How will this impact developers and creators? What are the potential privacy concerns for end-users? How will these changes affect daily tech interactions? I'd love to hear your thoughts on these developments.

Here's the link to read more: https://lnkd.in/eGnWdimE
#Ai #Future #Innovation #Technology #SocialNetworking
-
“Apple Intelligence is designed with our core values at every step and built on a foundation of groundbreaking privacy innovations. Additionally, we have created a set of Responsible AI principles to guide how we develop AI tools, as well as the models that underpin them:

Empower users with intelligent tools: We identify areas where AI can be used responsibly to create tools for addressing specific user needs. We respect how our users choose to use these tools to accomplish their goals.

Represent our users: We build deeply personal products with the goal of representing users around the globe authentically. We work continuously to avoid perpetuating stereotypes and systemic biases across our AI tools and models.

Design with care: We take precautions at every stage of our process, including design, model training, feature development, and quality evaluation to identify how our AI tools may be misused or lead to potential harm. We will continuously and proactively improve our AI tools with the help of user feedback.

Protect privacy: We protect our users' privacy with powerful on-device processing and groundbreaking infrastructure like Private Cloud Compute. We do not use our users' private personal data or user interactions when training our foundation models.”

#aiml #appleintelligence #wwdc2024 #apple #wwdc https://lnkd.in/g8bTreeP
-
https://lnkd.in/grAibE9m

🚀 **Apple Unveils Next-Gen Foundation Models for On-Device and Server AI**

Apple has launched its Apple Intelligence system, introducing powerful generative models for iOS 18, iPadOS 18, and macOS Sequoia. These include a ~3 billion parameter on-device model and a larger server model, enhancing user experiences with text refinement, notification summarization, image creation, and more. Prioritizing privacy and responsible AI, these models aim to streamline daily tasks effectively and securely. Some of the published benchmarks are right at the top of the ladder 🙌

Learn more: [Apple Foundation Models](https://lnkd.in/grAibE9m)

#AI #MachineLearning #Apple #Privacy #Innovation #Tech 🚀📱💡
-
🚀 **Introducing Apple’s On-Device and Server Foundation Models** 🚀

At the 2024 Worldwide Developers Conference, Apple unveiled **Apple Intelligence**, a personal intelligence system integrated into **iOS 18, iPadOS 18, and macOS Sequoia**. This innovation leverages multiple advanced generative models designed to enhance the user experience through specialized tasks like text refinement, notification summarization, and creating playful images for personal conversations.

### Key Highlights:
🔹 **Diverse and Specialized Generative Models:** Tailored for specific user activities, adaptable in real time.
🔹 **Dual Model Approach:** A ~3 billion parameter on-device language model and a larger server-based model for efficiency and accuracy.
🔹 **Comprehensive Support:** Includes a coding model for Xcode and a diffusion model for visual expression in Messages.
🔹 **Responsible AI Development:** Upholds principles to empower users, represent diverse communities, design with care, and protect privacy.

### Responsible AI Principles:
1. **Empower Users with Intelligent Tools:** Respecting user autonomy.
2. **Authentic Representation:** Avoiding biases and authentically representing global users.
3. **Careful Design:** Considering potential misuse and harm, improving based on feedback.
4. **Privacy Protection:** Utilizing on-device processing and Private Cloud Compute.

### Technical Overview:
🔹 **Pre-Training with AXLearn:** Leveraging advanced parallelism techniques and rigorous data filtering.
🔹 **Post-Training Innovations:** Enhancing instruction-following with novel algorithms like rejection sampling fine-tuning.
🔹 **Optimization for Performance:** Advanced techniques ensure speed and efficiency.

### Performance and Evaluation:
🔹 **User-Centric Evaluation:** Benchmarking through human evaluations and adversarial testing.
🔹 **Superior Instruction-Following:** Outperforming comparable open-source and commercial models.
🔹 **Enhanced Writing Ability:** Leading in summarization and composition tasks.

#Apple #WWDC2024 #AI #MachineLearning #iOS18 #Innovation #Technology #Privacy #ResponsibleAI
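Of those post-training innovations, rejection sampling fine-tuning is easy to sketch: sample several candidate responses per prompt, keep only the best-scoring ones, and fine-tune on the survivors. A schematic loop with placeholder objects (`model` and `reward_model` are hypothetical stand-ins, not a real API; Apple's committee-based variant is not public):

```python
# Schematic rejection-sampling fine-tuning round.
# `model` and `reward_model` are hypothetical stand-ins.
def rejection_sampling_round(model, reward_model, prompts, k=8):
    kept = []
    for prompt in prompts:
        # Draw k candidate completions per prompt.
        candidates = [model.generate(prompt, temperature=1.0) for _ in range(k)]
        # "Reject" all but the highest-scoring candidate.
        best = max(candidates, key=lambda c: reward_model.score(prompt, c))
        kept.append((prompt, best))
    model.fine_tune(kept)   # ordinary supervised fine-tuning on survivors
    return model
```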