Introducing “An Evolved Universal Transformer Memory” 🧠 (Blog: https://sakana.ai/namm/)

Neural Attention Memory Models (NAMMs) are a new kind of neural memory system for Transformers that not only boost their performance and efficiency but are also transferable to other foundation models, without any additional training!

Memory is a crucial component of cognition, allowing humans to selectively store and extract important notions from our ceaseless exposure to information and noise. Our work learns an artificial memory to replicate these capabilities thanks to the power of evolution, leading to smarter, faster, and more adaptable foundation models.

Full Paper: https://lnkd.in/gkDzMCG3
Code: https://lnkd.in/gmvFSma7
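To make the idea concrete, here is a minimal, hedged sketch of the general mechanism the post describes: a small learned model scores each token in a layer's KV cache and evicts low-scoring ones. The names (`TokenScorer`, `prune_kv_cache`) and the simple attention statistics used as features are illustrative assumptions, not Sakana's implementation; the paper's NAMMs condition on richer spectrogram-style features of the attention history and are optimized with evolution rather than backprop.

```python
# Minimal sketch (not the official implementation): a learned "memory model"
# that scores each cached token and keeps only those with a positive score.
import torch
import torch.nn as nn


class TokenScorer(nn.Module):
    """Tiny network mapping per-token attention features to a keep/evict score."""

    def __init__(self, feat_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_tokens, feat_dim) -> scores: (num_tokens,)
        return self.net(feats).squeeze(-1)


def prune_kv_cache(keys, values, attn_history, scorer):
    """Keep only tokens whose learned score is positive.

    keys/values:  (num_tokens, head_dim)
    attn_history: (num_queries, num_tokens) attention each cached token received
    """
    # Illustrative per-token features: mean and max attention received.
    # (An assumption; NAMMs use spectrogram features of the attention history.)
    feats = torch.stack([attn_history.mean(0), attn_history.amax(0)], dim=-1)
    keep = scorer(feats) > 0
    return keys[keep], values[keep]


# Toy usage: 16 cached tokens, 8-dim heads, 4 recent queries.
scorer = TokenScorer(feat_dim=2)
k, v = torch.randn(16, 8), torch.randn(16, 8)
attn = torch.softmax(torch.randn(4, 16), dim=-1)
k_kept, v_kept = prune_kv_cache(k, v, attn, scorer)
print(f"kept {k_kept.shape[0]} of 16 cached tokens")
```

Because the keep/evict decision is a hard, non-differentiable choice, the post's mention of "the power of evolution" refers to optimizing the scorer's parameters with an evolution strategy over downstream task performance instead of gradient descent.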
amazing, pushing the boundaries as always
Nice to meet you
It’s fascinating how NAMMs differ fundamentally from H2O/L2! While those use static rules to compress memory, NAMMs learn what to remember, similar to how military intelligence must adaptively filter crucial signals from noise. The zero-shot transfer capability is particularly relevant for some application areas I’m looking into.
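For contrast with the learned scorer sketched above, here is a hedged sketch of the "static rule" side of that comparison: an H2O-style heavy-hitter heuristic that keeps the top-k tokens by accumulated attention plus a window of recent tokens. Budgets and feature choices are illustrative, and this only captures the heavy-hitter idea, not the exact H2O or L2-norm eviction methods.

```python
# H2O-style static eviction rule (illustrative): no learning, just fixed
# heuristics over the attention history and recency.
import torch


def heavy_hitter_keep_mask(attn_history: torch.Tensor,
                           heavy_budget: int,
                           recent_budget: int) -> torch.Tensor:
    """attn_history: (num_queries, num_tokens) -> boolean keep mask (num_tokens,)."""
    num_tokens = attn_history.shape[1]
    keep = torch.zeros(num_tokens, dtype=torch.bool)

    # Static rule 1: keep tokens with the largest accumulated attention mass.
    accumulated = attn_history.sum(dim=0)
    keep[accumulated.topk(min(heavy_budget, num_tokens)).indices] = True

    # Static rule 2: always keep the most recent tokens.
    keep[-recent_budget:] = True
    return keep


attn = torch.softmax(torch.randn(4, 16), dim=-1)
mask = heavy_hitter_keep_mask(attn, heavy_budget=4, recent_budget=4)
print(f"static rule keeps {int(mask.sum())} of 16 tokens")
```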