In this episode, we discuss Giraffe: Adventures in Expanding Context Lengths in LLMs by Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, Siddartha Naidu. The paper surveys techniques for overcoming the fixed context length of large language models such as LLaMA and LLaMA 2 by modifying their positional encodings, and introduces a new truncation strategy for the positional basis. It presents three novel evaluation tasks and finds that linearly scaling the positional encodings at evaluation time improves long-context performance, especially when combined with a truncated basis. The researchers release new models named Giraffe with extended context lengths, along with datasets and code on HuggingFace, to encourage further exploration of context length extrapolation.
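For a concrete picture of the linear scaling mentioned above, here is a minimal sketch of how rotary positional angles can be computed with positions divided by a scale factor, so that a model trained on a 2,048-token context sees an 8,192-token input compressed into its familiar range. The function name and PyTorch framing are illustrative assumptions, not the paper's code.

```python
import torch

def scaled_rope_angles(dim: int, max_pos: int, scale: float = 1.0,
                       base: float = 10000.0) -> torch.Tensor:
    """Rotary-embedding angle table with linear position scaling.

    With scale > 1, positions are divided down so that evaluation-time
    contexts longer than the training context reuse the angle range the
    model saw during training.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(max_pos).float() / scale  # linear interpolation
    return torch.outer(positions, inv_freq)            # (max_pos, dim // 2)

# e.g. evaluate a model trained at 2,048 tokens on 8,192-token inputs
angles = scaled_rope_angles(dim=128, max_pos=8192, scale=4.0)
```

The truncated-basis variant the episode mentions further modifies the inv_freq spectrum; the sketch above shows only the plain linear scaling.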
-
Vision-Language Modeling
New from FAIR: An Introduction to Vision-Language Modeling. Paper ➡️ https://go.fb.me/ncjj6t This guide covers how VLMs work, how to train them, and approaches to evaluation. While it primarily covers mapping images to language, it also discusses how to extend VLMs to videos. FAIR is releasing this guide together with a set of collaborators to enable a greater understanding of the mechanics behind mapping vision to language.
-
Vision-language models are a great option for managing large datasets: they offer major advantages when it comes to processing and analyzing massive amounts of textual and visual information simultaneously. Investigating their capabilities and applications can yield valuable insights and improve your understanding of how big data is managed and used, which is especially helpful for studies or projects involving extensive data analysis. 💡 💡
-
LLMs are trained on written language; VLMs extend LLMs to vision-language inputs.
-
An introduction to Vision-Language Models (VLMs).
-
I’m putting this at the top of my reading list. If you’ve ever been curious about the technical details behind multimodal vision/text models and their applications, this looks like a great place to start! #artificialintelligence #computervision
-
🌟 Excited to share our latest paper, "Rotation Averaging: A Primal-Dual Method and Closed-Forms in Cycle Graphs", now on arXiv! This work addresses the optimization problem at the heart of rotation averaging, a crucial component of geometric reconstruction and visual simultaneous localization and mapping (SLAM). Our novel primal-dual method and insights from spectral graph theory offer valuable advances in this area. Check out the full article here: https://bit.ly/3L4az24 #rotationaveraging #geometricreconstruction #arXivpublication
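For readers new to the topic, here is a minimal sketch of the simplest rotation-averaging subproblem: the single-rotation chordal mean, which has a closed-form solution via SVD projection onto SO(3). This is standard background to illustrate the objective, not the primal-dual method from the paper.

```python
import numpy as np

def chordal_mean(rotations):
    """Single-rotation averaging under the chordal (Frobenius) metric.

    The minimizer of sum_i ||R - R_i||_F^2 over SO(3) is the projection
    of the arithmetic mean of the R_i onto SO(3), computed via an SVD.
    """
    M = np.mean(rotations, axis=0)  # arithmetic mean; generally not a rotation
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # enforce det = +1
    return U @ D @ Vt

# usage: average two rotations about the z-axis; result is close to Rz(0.2)
Rz = lambda t: np.array([[np.cos(t), -np.sin(t), 0.0],
                         [np.sin(t),  np.cos(t), 0.0],
                         [0.0,        0.0,       1.0]])
R_avg = chordal_mean([Rz(0.1), Rz(0.3)])
```

The multi-rotation problem on a graph (recovering absolute rotations from noisy relative ones) is the harder setting the paper tackles.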
-
In this episode, we discuss Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation by Jaemin Cho, Yushi Hu, Roopal Garg, Peter Anderson, Ranjay Krishna, Jason Baldridge, Mohit Bansal, Jordi Pont-Tuset, Su Wang. The paper addresses the evaluation of text-to-image models, measuring how faithfully generated images match their text prompts through a question generation and answering framework. It introduces the Davidsonian Scene Graph (DSG), a strategy that improves question quality and answer consistency by producing a structured set of unique, semantically grounded questions organized by their dependencies. Extensive experiments and human studies demonstrate DSG's effectiveness, and the released DSG-1k benchmark enables wider adoption and evaluation in the field.
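As a rough illustration of the dependency idea described above, here is a hedged sketch of how answers might be aggregated over a question graph. The dict layout and answer_fn are hypothetical stand-ins; in the paper, questions are generated by an LLM and answered by a VQA model.

```python
def dsg_style_score(questions, answer_fn):
    """Aggregate yes/no answers over a dependency graph of questions.

    questions: dict mapping id -> {"text": str, "deps": [parent ids]}
    answer_fn: callable mapping question text -> bool (e.g. a VQA model)

    A question is only asked if all of its parents were answered "yes";
    counting children of failed parents as failures is one simple choice
    (a pipeline could also skip them entirely).
    """
    answers = {}

    def resolve(qid):
        if qid not in answers:
            q = questions[qid]
            if all(resolve(parent) for parent in q["deps"]):
                answers[qid] = answer_fn(q["text"])
            else:
                answers[qid] = False  # parent failed, so question is invalid
        return answers[qid]

    for qid in questions:
        resolve(qid)
    return sum(answers.values()) / len(answers)

# usage with a toy answerer: "is there a cat?" gates "is the cat black?"
qs = {1: {"text": "is there a cat?", "deps": []},
      2: {"text": "is the cat black?", "deps": [1]}}
print(dsg_style_score(qs, answer_fn=lambda text: "cat?" in text))  # 0.5
```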
-
🌟 Day 12 of 100 Days Challenge: Exploring Subarray Counts 🌟 Today, I delved deep into arrays and subarrays, encountering an intriguing problem: counting subarrays with a specific property. #CodingExploration #AlgorithmicThinking #100DaysChallenge
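The post doesn't say which property was being counted, so as one classic instance, here is the prefix-sum technique for counting subarrays whose sum equals a target k in O(n); the choice of property and the function name are illustrative assumptions.

```python
from collections import defaultdict

def count_subarrays_with_sum(nums, k):
    """Count contiguous subarrays of nums that sum to k, in O(n).

    Uses the identity sum(nums[i:j]) == k  <=>  prefix[j] - prefix[i] == k:
    for each running prefix sum, look up how many earlier prefixes equal
    prefix - k.
    """
    seen = defaultdict(int)
    seen[0] = 1  # the empty prefix
    prefix = count = 0
    for x in nums:
        prefix += x
        count += seen[prefix - k]
        seen[prefix] += 1
    return count

assert count_subarrays_with_sum([1, 1, 1], 2) == 2  # [1,1] occurs twice
```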
-
Martti Kirkko-Jaakkola is one of our brilliant mathematicians, whose task, together with the team, is to provide clients with answers drawn from the data produced by measuring equipment. But how do albatrosses' flight paths and indoor navigation relate to the work of this academic? Get to know Martti here: https://lnkd.in/dt5HCuy6 Curious about joining the Nordic Inertial team? Visit our Careers page or contact Jussi Collin. #Team #Mathematics #Engineering #Research #Inertial #Algorithms #StaffIntroduction #MotionSensing