A great course for anyone who wants to learn about LLMs in detail: key prompting techniques; fine-tuning methodologies, both instruction-based and parameter-efficient fine-tuning (PEFT) such as LoRA; Reinforcement Learning from Human Feedback (RLHF); Retrieval-Augmented Generation with LLMs; and evaluation methodologies for LLMs. All of this comes with well-crafted lab sessions covering nearly all of the listed concepts.
Navratan Sharma’s Post
-
Paper Review: YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
artgor.medium.com
-
Embracing Simplicity in Representation Learning: The Surprising Power of Random Transformations in Deep and Continual Learning
Bhakta Vaschal Samal

Abstract: This paper delves into the foundational aspects of data representation in machine learning, with a particular focus on deep learning and continual learning paradigms. Challenging the conventional wisdom that complex, learned representations are inherently superior, we present empirical evidence demonstrating that simpler, even randomly generated representations can outperform sophisticated models in certain contexts. We introduce a method that employs fixed random transformations combined with linear classifiers, which consistently surpasses state-of-the-art techniques across multiple online continual learning benchmarks. Furthermore, we explore the theoretical underpinnings of this phenomenon by analysing the spectral properties of Gram matrices derived from these representations. Our findings reveal that first- and second-order statistics are sufficient to capture the essential characteristics of data representations, thereby simplifying the evaluation process. These insights prompt a re-evaluation of the necessity for intricate representation learning in specific scenarios and highlight the potential of more straightforward, computationally efficient approaches.
Embracing Simplicity in Representation Learning: The Surprising Power of Random Transformations in…
link.medium.com
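The recipe in the abstract is easy to try end to end: a fixed, never-trained random transformation followed by a trained linear classifier. Here is a minimal sketch on synthetic data; the dataset, projection width, and ReLU nonlinearity are my assumptions for illustration, not the paper's setup.

```python
# Sketch: fixed random transformation + linear classifier (not the paper's exact setup).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=50, n_informative=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fixed random transformation: a frozen random projection with a ReLU nonlinearity.
W = rng.normal(size=(X.shape[1], 512)) / np.sqrt(X.shape[1])
phi = lambda X: np.maximum(X @ W, 0.0)  # the representation itself is never learned

# Only the linear classifier on top is trained.
clf = LogisticRegression(max_iter=1000).fit(phi(X_tr), y_tr)
print("test accuracy:", clf.score(phi(X_te), y_te))
```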
-
I am trying to find a simple-to-follow book/course on Deep Learning. I have gathered quite a few materials, but some of them require a lengthy commitment, such as a book over 700 pages long: an excellent book, but technical, and it will take some time to digest. That said, here is one course that has both the book and slides available and is fairly easy to follow: https://lnkd.in/g-WvXpNk The whole book is available at https://lnkd.in/gW6NaNgw The mathematical prerequisite for this course is at least college/undergraduate-level algebra and calculus, and possibly a little more. I have many more books (all free) and will post about them one by one. My goal: the theory of Deep Learning must be accessible, not a black box reserved for a coveted few.
GDL Course
geometricdeeplearning.com
-
In this episode of paper-talk I cover a very interesting paper from the lab of Sevgi Zübeyde Gürbüz at the U of A. The paper, entitled "Self-Supervised Contrastive Learning for Radar-Based Human Activity Recognition", describes an innovative method of: (1) augmenting RF spectrogram "images" by varying the window length, overlap length, and window function used in the calculation of the Short-Time Fourier Transform (STFT), and (2) using a contrastive loss term between a ResNet-based Encoder+MLP and a physics-aware GAN branch. By requiring the contrastive loss to be minimized, the GAN "helps" the encoder generate better latent representations of the data, surpassing a self-supervised CAE by 4% in classification accuracy on 14 human activities using only radar micro-Doppler spectrograms/signatures. The physics-aware constraint prevents the GAN from generating non-physical "radar images"; this will be a topic of subsequent episodes. (A toy sketch of the augmentation idea follows below.) Two questions for Sevgi Zübeyde Gürbüz: (1) The paper discusses "dynamic range", but it's not clear (to me) what the problem with the dynamic range is. Are you working with fixed-point representations? (2) Did you try image-style augmentation techniques like translation along both axes? I think this could add some augmentation (although varying overlap and window length can be seen as translation in the time domain, what about frequency-domain translation?). #radar #ai #ML #GAN #microdoppler
Paper Talk Episode 2: Self Supervised Contrastive Learning for Radar Based Human Activity Recognition
https://www.youtube.com/
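As promised above, here is a toy sketch of augmentation idea (1): recompute the STFT of the same raw radar return with randomized window length, overlap, and window function, so two "views" of one signal can serve as a positive pair for contrastive training. The simulated signal and the parameter ranges are my guesses, not the paper's settings.

```python
# Sketch: STFT-parameter augmentation for contrastive views (parameter ranges are assumptions).
import numpy as np
from scipy.signal import stft

rng = np.random.default_rng(0)

def random_spectrogram_view(x, fs):
    """One augmented spectrogram 'view' of raw signal x via randomized STFT settings."""
    nperseg = int(rng.choice([128, 256, 512]))            # window length
    noverlap = int(nperseg * rng.uniform(0.5, 0.9))       # overlap length
    window = rng.choice(["hann", "hamming", "blackman"])  # window function
    _, _, Z = stft(x, fs=fs, window=window, nperseg=nperseg, noverlap=noverlap)
    return 20 * np.log10(np.abs(Z) + 1e-9)                # log-magnitude spectrogram

# Two views of the same simulated micro-Doppler-like chirp form a positive pair.
fs = 4000
t = np.arange(0, 2, 1 / fs)
x = np.cos(2 * np.pi * (200 * t + 80 * np.sin(2 * np.pi * 1.5 * t)))
view_a, view_b = random_spectrogram_view(x, fs), random_spectrogram_view(x, fs)
```

Note that the two views generally have different time-frequency shapes, so an encoder consuming them would need a resize or crop to a common input size.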
-
“The geometry of the loss landscape is key to understanding neural networks.” In the latest AXRP episode, Jesse Hoogland chats with Daniel Filan about:
🔍 Timaeus Research - scaling SLT for alignment applications
🧠 Beyond Interpretability - probing geometry for broader insights
🛠️ Refined LLC - a tool revealing specialization in attention heads (a toy sketch of the estimator idea follows below)
🧩 Multigram Circuit - a new mechanism for nested pattern matching
Recorded at the FAR.AI-hosted Bay Area #AlignmentWorkshop, this episode explores cutting-edge tools for understanding and improving neural networks. 🎥 Watch the full episode: https://lnkd.in/gFsJzCX4
38.2 - Jesse Hoogland on Singular Learning Theory
https://www.youtube.com/
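For the curious, here is a toy sketch of the LLC-estimation idea referenced in the bullet above: sample losses near a trained optimum with SGLD and compare their mean to the loss at the optimum. This is a deliberately simplified illustration on a linear model, not the Timaeus/devinterp implementation; the model, hyperparameters, and estimator details are my assumptions.

```python
# Toy sketch of an SGLD-based local learning coefficient estimate (not the Timaeus code).
import torch

torch.manual_seed(0)
n = 1024
X = torch.randn(n, 8)
y = X @ torch.randn(8, 1) + 0.1 * torch.randn(n, 1)

w_star = torch.linalg.lstsq(X, y).solution      # the "trained" optimum of a linear model
def loss(w):
    return ((X @ w - y) ** 2).mean()

beta = 1.0 / torch.log(torch.tensor(float(n)))  # inverse temperature on the order of 1/log n
eps, gamma, steps = 1e-4, 100.0, 2000           # SGLD step size, localization strength
w = w_star.clone()
loss_samples = []
for _ in range(steps):
    w = w.detach().requires_grad_(True)
    L = loss(w)
    # Potential: tempered loss plus a quadratic term localizing samples near w_star.
    (n * beta * L + gamma / 2 * ((w - w_star) ** 2).sum()).backward()
    with torch.no_grad():
        w = w - eps / 2 * w.grad + torch.sqrt(torch.tensor(eps)) * torch.randn_like(w)
    loss_samples.append(L.item())

# Estimate: n * beta * (mean sampled loss - loss at the optimum), using the second half.
llc_hat = n * beta.item() * (sum(loss_samples[steps // 2:]) / (steps // 2) - loss(w_star).item())
print("estimated LLC:", llc_hat)
```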
-
🚀 New research introduces LISA as an innovative alternative to the beloved LoRA! 🎉 🔍 Memory consumption is a significant hurdle in training. Enter LISA: Layerwise Importance Sampled AdamW. 💡 By leveraging the pronounced skewness of weight norms across layers, LISA introduces a surprisingly simple yet incredibly effective training strategy (sketched below). 📈 With memory costs as low as LoRA's, LISA outperforms both LoRA and full-parameter training in different settings, showcasing its prowess in optimizing GPU resources without compromising performance. 💻 🌟 LISA consistently outshines LoRA by a staggering 11% to 37% in MT-Bench scores across various downstream fine-tuning tasks. And that's not all: even on mammoth models like LLaMA-2-70B, LISA stands tall, matching or surpassing LoRA's performance on the MT-Bench, GSM8K, and PubMedQA benchmarks. 📊
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
arxiv.org
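From the abstract, the core mechanic appears to be keeping only a small random subset of layers trainable at any time and resampling that subset periodically, so optimizer state is held only for active layers. A minimal PyTorch sketch of that mechanic; the layer attributes, the always-active embedding and head, and the uniform sampling are my reading of the description, not a verified reimplementation.

```python
# Sketch of periodic layerwise sampling in the spirit of LISA (details are assumptions).
import random
import torch.nn as nn

def resample_active_layers(model: nn.Module, layers: list[nn.Module],
                           n_active: int, always_on: list[nn.Module]):
    """Freeze everything, then unfreeze a random subset of layers plus always-on modules."""
    for p in model.parameters():
        p.requires_grad = False
    for layer in random.sample(layers, n_active) + always_on:
        for p in layer.parameters():
            p.requires_grad = True

# Hypothetical usage every K optimizer steps (module attributes here are illustrative):
# resample_active_layers(model, list(model.transformer.h), n_active=2,
#                        always_on=[model.transformer.wte, model.lm_head])
```

After each resampling, the optimizer would be rebuilt over the currently trainable parameters so AdamW state exists only for active layers, which is where the memory saving comes from.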
-
Extremely excited to share that our recent paper "Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning" was accepted to this year's International Conference on Machine Learning (ICML)! The paper investigates three narratives explaining the training instabilities often observed in reinforcement learning. We evaluate the effectiveness of various algorithmic improvements derived from these narratives in an experimental setup involving ~20,000 agents. Our main finding is that specific combinations of these techniques yield superior performance and enhanced training stability, enabling a simple Soft Actor-Critic agent to effectively solve the notoriously challenging dog simulator (a background sketch of the overestimation countermeasure follows below). Paper: https://lnkd.in/dHk-2zjV With Michał Bortkiewicz, Mateusz Ostaszewski, Piotr Milos, Tomasz Trzcinski, and Marek Cygan.
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
arxiv.org
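For background on the overestimation narrative mentioned above: the standard countermeasure in SAC-style agents is the clipped double-Q target, sketched minimally below. This is offered as context, not as the paper's specific technique mix; the function signature and names are hypothetical.

```python
# Background sketch: clipped double-Q target against overestimation (not the paper's code).
import torch

def td_target(reward, done, next_obs, q1_target, q2_target, policy, alpha, gamma=0.99):
    """SAC-style target: take the min over twin critics to damp overestimation bias."""
    with torch.no_grad():
        next_action, next_logp = policy(next_obs)  # sample a' and its log-probability
        q_next = torch.min(q1_target(next_obs, next_action),
                           q2_target(next_obs, next_action))
        return reward + gamma * (1.0 - done) * (q_next - alpha * next_logp)
```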
-
I wrote a new blog post summarizing the ideas in one of the new PEFT methods, RoSA (https://lnkd.in/eQsEiRcs). Taking inspiration from the classical problem of low-rank + sparse decomposition, it improves on the popular LoRA method for training LLMs (a toy sketch of the idea follows below). https://lnkd.in/ed4qDHvg
RoSA: From Video Surveillance to Large Language Models
oxhidingacne.com
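The gist of the low-rank + sparse idea is to model the weight update as ΔW = BA + S, where BA is a LoRA-style low-rank factor and S is constrained to a sparse support. A toy sketch of a linear layer carrying such an adapter; the fixed random mask and the initialization are my guesses at the spirit of the method, not the RoSA implementation (which picks the sparse support differently).

```python
# Toy sketch of a low-rank + sparse (RoSA-style) adapter on a frozen linear layer.
import torch
import torch.nn as nn

class LowRankPlusSparseLinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, sparsity: float = 0.01):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # pretrained weight stays frozen
            p.requires_grad = False
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # low-rank factors B @ A
        self.B = nn.Parameter(torch.zeros(out_f, rank))
        # Fixed random sparse support for S (an assumption for this sketch).
        self.register_buffer("mask", (torch.rand(out_f, in_f) < sparsity).float())
        self.S = nn.Parameter(torch.zeros(out_f, in_f))

    def forward(self, x):
        delta_w = self.B @ self.A + self.S * self.mask  # delta W = BA + sparse S
        return self.base(x) + x @ delta_w.T
```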
-
A very interesting talk by Leslie Valiant on "How to Augment Supervised Learning with Reasoning": https://lnkd.in/drEx6B2U Leslie Valiant received the 2010 #TuringAward for his fundamental contributions to the development of computational learning theory and to the broader theory of computer science. #AI
Leslie Valiant on "How to Augment Supervised Learning with Reasoning"
https://www.youtube.com/