Speculative Sampling is the technique of running a small draft model alongside a large target model, so the large model's output is produced faster and at a reduced cost. https://buff.ly/3WAGKNq Author: https://buff.ly/3JVGHEn
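For anyone curious about the mechanics, here is a minimal sketch of the accept/reject loop speculative sampling is built on. The `draft_probs` and `target_probs` callables are hypothetical stand-ins for the small and large models; this is an illustration of the idea, not the article's code.

```python
# Minimal sketch of speculative sampling: a small draft model proposes k tokens,
# the large target model verifies them, and rejected tokens are resampled so the
# final output still follows the target model's distribution.
import numpy as np

def speculative_step(prefix, draft_probs, target_probs, k=4, rng=None):
    """One round: draft k tokens cheaply, then accept/reject against the target."""
    if rng is None:
        rng = np.random.default_rng()

    # 1) Draft model proposes k tokens autoregressively (cheap).
    drafted, q, ctx = [], [], list(prefix)
    for _ in range(k):
        dist = draft_probs(ctx)                  # next-token distribution (sums to 1)
        tok = rng.choice(len(dist), p=dist)
        drafted.append(tok)
        q.append(dist)
        ctx.append(tok)

    # 2) Target model scores the drafted positions (one parallel pass in practice;
    #    emulated with a loop here for clarity).
    accepted = []
    for i, tok in enumerate(drafted):
        p = target_probs(list(prefix) + accepted)
        # Accept with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p[tok] / max(q[i][tok], 1e-12)):
            accepted.append(tok)
        else:
            # On rejection, resample from the residual distribution max(p - q, 0).
            residual = np.maximum(p - q[i], 0.0)
            residual /= residual.sum()
            accepted.append(rng.choice(len(residual), p=residual))
            break
    # (The full algorithm also samples one bonus token from the target when all
    #  k drafts are accepted; omitted here for brevity.)
    return accepted
```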
Machine Learning - IAEE’s Post
More Relevant Posts
-
It's weird that LLMs immediately forget old conversations, or even parts of a conversation that are simply too far back, once the context window runs out. Yes, there are some huge context windows out there. But still, computers can easily store, word-for-word, the exact history of a practically unlimited conversation. It feels like LLMs ought to remember things they heard years ago, just as a person (with a good memory) might.

As a fun thought experiment: I type about 90 words per minute. If I typed non-stop for 100 years (no lazy things like sleeping or eating allowed), I'd produce less than 30 gigs of data. In other words, the entire lifetime text output of a single human can easily fit in the memory of a modern laptop.

Are researchers just going to sit around and let LLMs have poor memories??? Obviously not, otherwise I wouldn't be writing this. The progress to report here is a new paper from Google about an idea called Infini-attention. I have to give the authors credit for not flippantly using the word "infini": this work really does provide a modification to attention that, in theory, has no time limit on what it can remember.

Well ... let's get somewhat technical. The "infini" memory of the model is a matrix that can effectively capture many key-value data points from all past conversation history. One very cool fact about high-dimensional vectors is that most of them (chosen at random) are nearly orthogonal to each other, which means that, in some sense, you can pack more information into a matrix than its size alone would suggest. Having said that, the rank of the matrix (bounded by the smaller of its number of rows or columns) is still an upper bound on the truly independent directions of information a single matrix can hold. The capacity is _not_ infinite, so it's inevitable that such a matrix must gradually forget things over time. In a way, if this is a model for the human brain, it may help us understand why most people tend to forget, or experience a fading away of memories, as time passes.

This week's Learn & Burn summary explains the clever way the authors incrementally update the memory matrix and incorporate non-linearity into the key-value lookups: https://lnkd.in/g-Gjr5aP
LLMs that never forget
learnandburn.ai
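To make the idea concrete, here is a toy sketch of this kind of incrementally updated memory matrix, written in the linear-attention style the paper builds on. The class name and the numpy framing are mine, and the ELU+1 non-linearity and the accumulation of key-value outer products follow my reading of the paper; treat it as an illustration, not the authors' implementation.

```python
# Toy sketch of a fixed-size "compressive" memory: fold each segment's keys and
# values into a single matrix, then answer queries against everything stored so far.
import numpy as np

def elu1(x):
    # ELU + 1 keeps the transformed keys/queries positive (the non-linearity
    # used for the key-value lookups).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0)))

class CompressiveMemory:
    def __init__(self, d_key, d_value):
        self.M = np.zeros((d_key, d_value))   # the fixed-size memory matrix
        self.z = np.zeros(d_key)              # running normalisation term

    def update(self, K, V):
        """Incrementally fold a new segment's keys (n, d_key) and values (n, d_value) in."""
        sK = elu1(K)
        self.M += sK.T @ V                    # rank-limited accumulation of key-value pairs
        self.z += sK.sum(axis=0)

    def retrieve(self, Q):
        """Look up values for queries (m, d_key) against the accumulated memory."""
        sQ = elu1(Q)
        return (sQ @ self.M) / (sQ @ self.z + 1e-6)[:, None]
```

Because `M` has fixed size, its rank bounds how many independent directions it can store, which is exactly the gradual-forgetting behaviour described above.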
-
CLIP bridges text and image, but literally nobody used it for text retrieval—𝙪𝙣𝙩𝙞𝙡 𝙣𝙤𝙬. We're excited to introduce 𝐉𝐢𝐧𝐚 𝐂𝐋𝐈𝐏: a CLIP-like model that's great at text-text, text-image, image-text, and image-image retrieval. From now on, your Jina CLIP model 𝐢𝐬 𝐚𝐥𝐬𝐨 your text retriever. No need to switch between different embedding models when building MuRAG (Multimodal RAG) - one model, two modalities, four search directions. Not to mention it also handles an 8K context length. So how did we do it? Read more: https://lnkd.in/e8mYHNYJ
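To illustrate what "one model, four search directions" means in practice, here is a small sketch. The `embed_text` and `embed_image` functions below are random stand-ins for the model's two encoders (not the actual Jina API); the point is that every retrieval direction reduces to the same cosine-similarity search in one shared embedding space.

```python
# Sketch: one shared embedding space, four retrieval directions.
import numpy as np

rng = np.random.default_rng(0)

def embed_text(texts, dim=64):
    # Stand-in encoder: in practice this would be the multimodal model's text tower.
    return rng.normal(size=(len(texts), dim))

def embed_image(paths, dim=64):
    # Stand-in encoder: in practice this would be the model's image tower.
    return rng.normal(size=(len(paths), dim))

def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

doc_vecs = embed_text(["a long document", "another document"])   # text corpus
img_vecs = embed_image(["cat.jpg", "dog.jpg"])                   # image corpus

query_t = embed_text(["query about cats"])
query_i = embed_image(["query.jpg"])

text_to_text   = cosine_sim(query_t, doc_vecs)
text_to_image  = cosine_sim(query_t, img_vecs)
image_to_text  = cosine_sim(query_i, doc_vecs)
image_to_image = cosine_sim(query_i, img_vecs)
```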
-
Generating cat videos is fun and all, but this research is important: https://lnkd.in/gyyRwVSa
V-JEPA trains a visual encoder by predicting masked spatio-temporal regions in a learned latent space.
ai.meta.com
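As a rough illustration of that objective (not Meta's code), here is a toy sketch: encode the visible patches, encode the masked patches with a separate target encoder, and regress the predicted latents against the target latents. Module sizes, the crude pooling, and the name `predictor` are all illustrative.

```python
# Toy sketch of a JEPA-style objective: predict the *latent* representation of
# masked spatio-temporal patches, rather than reconstructing their pixels.
import torch
import torch.nn as nn

d = 128                                              # latent width (illustrative)
encoder        = nn.Linear(3 * 16 * 16 * 2, d)       # context encoder over patch "tubes"
target_encoder = nn.Linear(3 * 16 * 16 * 2, d)       # EMA copy in the real recipe; frozen here
predictor      = nn.Linear(d, d)

patches = torch.randn(64, 3 * 16 * 16 * 2)           # 64 flattened spatio-temporal patches
mask = torch.rand(64) < 0.5                          # which patches are hidden

with torch.no_grad():
    targets = target_encoder(patches[mask])          # latents the model must predict

context = encoder(patches[~mask]).mean(0, keepdim=True)  # crude pooling of visible context
pred = predictor(context).expand_as(targets)

loss = nn.functional.l1_loss(pred, targets)          # regress predicted vs. target latents
loss.backward()
```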
-
Jens Fiederer Thanks for the link. I remember you talking about his method, and perhaps even reading his paper. Perhaps stop by tomorrow's CoSy TuesZoom, https://lnkd.in/g9tBW-X5 , and describe it in more detail.
Learn about the only language which spans from the chip in Forth to array mathematics (e.g., accounting, AI, voxel modeling), simplified from APL/K.
cosy.com
-
Support your answers with course material concepts, principles, and theories fro
https://professionalwriters.blog
-
Here's a beautiful interactive map referencing all the cognitive biases that affect your thinking. Buster Benson's codex: https://lnkd.in/g7TTTjUq
-
In this episode, we discuss Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer? by Nishant Balepur, Feng Gu, Abhilasha Ravichander, Shi Feng, Jordan Boyd-Graber, and Rachel Rudinger. The paper investigates the reverse question answering (RQA) task, in which a question is generated from a given answer, and examines how 16 large language models (LLMs) perform on it compared to traditional question answering (QA). The study reveals that LLMs are less accurate at RQA for numerical answers than for textual ones, and that they can often correctly answer the very questions they generated incorrectly, indicating the errors are not due solely to knowledge gaps. The findings also show that RQA errors correlate with question difficulty and are inversely related to how frequently the answer appears in the data corpus, highlight the challenge of generating valid multi-hop questions, and suggest areas for improving LLM reasoning in RQA.
Arxiv Paper - Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can’t Answer?
podbean.com
-
The Llama 3.1 paper is one of the most open and transparent model reports so far, covering essentially every aspect of building the new model; highly recommended reading.
The Llama 3 Herd of Models
ai.meta.com
-
For a clear and accessible introduction to LLM fine-tuning with Low Rank Adaptation (LoRA), don't miss Matthew Gunton's latest paper walkthrough.
Understanding Low Rank Adaptation (LoRA) in Fine Tuning LLMs
towardsdatascience.com
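As a companion to the walkthrough, here is a minimal sketch of the core LoRA trick: freeze the pretrained weight W and train only a low-rank update B·A, so the effective weight becomes W + (alpha/r)·B·A. The class name and hyperparameters below are illustrative, not taken from the article.

```python
# Minimal LoRA-style wrapper around an existing nn.Linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)       # frozen pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: only these r*(d_in + d_out) parameters are trained.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        # Frozen path plus scaled low-rank update.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

# Usage: wrap an existing projection layer and fine-tune only A and B.
layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(4, 512))
```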