Plamen Florov’s Post


Managing Director, Regiware Bulgaria

To me quantization is part of democratizing AI.

Andrew Ng

Founder of DeepLearning.AI; Managing General Partner of AI Fund; Exec Chairman of Landing AI

LLMs can take gigabytes of memory to store, which limits what can be run on consumer hardware. But quantization can dramatically compress models, making a wider selection of models available to developers. You can often reduce model size by 4x or more while maintaining reasonable performance.

In our new short course, Quantization Fundamentals, taught by Hugging Face's Younes Belkada and Marc Sun, you'll:

- Learn how to quantize nearly any open source model
- Use int8 and bfloat16 (Brain Float 16) data types to load and run LLMs using PyTorch and the Hugging Face Transformers library
- Dive into the technical details of linear quantization to map 32-bit floats to 8-bit integers

As models get bigger and bigger, quantization becomes more important for making models practical and accessible. Please check out the course here: https://lnkd.in/g66yNW8W
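To make the "linear quantization" idea concrete: the core trick is to pick a scale factor that maps a tensor's float values onto the int8 range, store the small integers, and multiply back by the scale when you need floats again. Here is a minimal sketch of symmetric linear quantization in NumPy (function names and the example tensor are illustrative, not the course's actual code):

```python
import numpy as np

def linear_quantize(x, num_bits=8):
    """Symmetric linear quantization: map float32 values to int8.

    The scale maps the largest absolute value onto the signed
    integer range, so q = round(x / scale), clipped to [-127, 127].
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = np.abs(x).max() / qmax          # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def linear_dequantize(q, scale):
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.2, 3.0, -0.01], dtype=np.float32)
q, scale = linear_quantize(x)               # q -> [21, -51, 127, 0]
x_hat = linear_dequantize(q, scale)

# Each int8 element takes 1 byte vs 4 bytes for float32: a 4x size
# reduction, at the cost of a rounding error bounded by the scale.
```

Real libraries refine this with per-channel scales and zero-points for asymmetric ranges, but the memory arithmetic above is where the "4x or more" compression figure comes from.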

