Learn about new PyTorch advancements for LLMs and how PyTorch is enhancing every aspect of the LLM lifecycle.
In this talk from AI Infra @ Scale 2024, software engineers Wanchao Liang and Evan Smothers are joined by Meta research scientist Kimish Patel to discuss our newest features and tools that enable large-scale training, memory efficient fine-tuning, and on-device LLM capabilities.
First, they cover the importance of memory-efficient fine-tuning and a few common architectural and algorithmic techniques to enable fine-tuning on consumer-grade hardware. Then they discuss the challenges of deploying large models for on-device deployment and how techniques such as quantization make these deployments possible.