Ramin Mehran’s Post

Tech Lead @ Google DeepMind, Multi-Modal Perception/Generation; AI Breakdown Podcaster

In this episode, we discuss Giraffe: Adventures in Expanding Context Lengths in LLMs by Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, and Siddartha Naidu. The paper surveys techniques for overcoming the fixed context length of large language models such as LLaMA and LLaMA 2 by modifying their positional encodings, and introduces a new truncation strategy for the positional basis. It also presents three novel evaluation tasks, finding that linear scaling of positions at evaluation time improves long-context performance, especially when combined with a truncated positional basis. The researchers release new models named Giraffe with extended context lengths, along with datasets and code on HuggingFace, to encourage further exploration of context length extrapolation.
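The "linear scaling at evaluation time" idea can be sketched in a few lines: divide each token's position by a scale factor before computing its rotary (RoPE) angles, so positions beyond the training window map back into the angle range the model saw during training. This is a minimal illustration of the general technique, not the paper's code; the function names and the NumPy implementation are my own.

```python
import numpy as np

def rope_angles(position, dim, scale=1.0, base=10000.0):
    """Per-frequency rotation angles for one position.

    Dividing the position by `scale` (linear position interpolation)
    stretches an extended context back into the trained angle range.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return (position / scale) * inv_freq

def apply_rope(x, position, scale=1.0):
    """Rotate consecutive feature pairs of `x` by the angles for `position`."""
    theta = rope_angles(position, x.shape[-1], scale)
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

With `scale=4.0`, position 8000 yields exactly the same angles as position 2000 does unscaled, which is why a model trained on a 2k window can address 8k positions after scaling, at the cost of compressing positional resolution.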

arxiv preprint - Giraffe: Adventures in Expanding Context Lengths in LLMs

podbean.com
