Why do attention models require positional encoding while RNNs and CNNs do not? How does it work? I have provided a detailed explanation in my recent blog. #LLMs #AttentionModels #GenAI #PositionalEncodings https://lnkd.in/dHzQfDsi
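For a quick taste of what the blog covers, here is a minimal NumPy sketch of the standard sinusoidal scheme from "Attention Is All You Need" (illustrative only; the blog walks through the reasoning):

```python
import numpy as np

# Sinusoidal positional encoding. Self-attention is permutation-
# invariant, so token order must be injected explicitly; RNNs get
# order from their recurrence, CNNs from their local convolutions.

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]               # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)  # broadcast to grid
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

print(positional_encoding(seq_len=4, d_model=8).round(3))
```

Adding this matrix to the token embeddings gives every position a unique, smoothly varying signature that the attention layers can exploit.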
-
FeatUp: A Machine Learning Algorithm that Upgrades the Resolution of Deep Neural Networks for Improved Performance in Computer Vision Tasks

Deep features are pivotal in computer vision studies, unlocking image semantics and empowering researchers to tackle various tasks, even in scenarios with minimal data. Lately, techniques have been developed to extract features from diverse data typ... https://lnkd.in/eG8JAPDP #AI #ML #Automation
-
The size of current LLMs, by parameter count: a higher count means greater potential for capturing complex patterns, but also greater computational resources needed for training. With the following models expected to launch soon, who do you think will rule the charts?
- Amazon Olympus - Aug 2024
- Grok 2 - 2025
- GPT-5 - soon
- Llama 3 405B - soon
- Gemini 2 - Nov 2024
-
Interested in quantization? Arun Nanda's article introduces this technique and sets the stage for exploring the journey from simpler models like BERT to advanced 1.58-bit LLMs. #LLM #ML
Reducing the Size of AI Models (towardsdatascience.com)
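For anyone new to the idea, here is a minimal sketch of symmetric (absmax) int8 weight quantization, the simplest form of the technique; this is illustrative only, not the article's exact method:

```python
import numpy as np

# Symmetric (absmax) int8 quantization: store int8 weights plus one
# float scale per tensor instead of full float32 weights.

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", float(np.abs(w - dequantize(q, s)).max()))
```

Int8 storage cuts memory roughly 4x versus float32; 1.58-bit schemes push the same idea much further by restricting each weight to {-1, 0, 1}.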
-
This one is groundbreaking: LSTMs can now power LLMs and may prove better than transformers in terms of speed and accuracy. The lack of parallel training was the main obstacle for LSTMs. In the future, I expect transformer and LSTM layers to be fused and stacked together, making LLMs more capable, along with even larger context windows. I also expect more capable models that can generate algorithms on their own from patterns, instead of just doing similarity search; then there would be no need for traditional ML and DL.
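Purely as a speculative sketch of that fusion idea (a hypothetical toy architecture, not any published model), an LSTM layer can be stacked with a self-attention block in Keras:

```python
from tensorflow import keras

# Hypothetical LSTM + self-attention stack: recurrence for local order,
# attention for long-range mixing. Layer sizes are arbitrary.
inputs = keras.Input(shape=(128, 64))            # (seq_len, features)
x = keras.layers.LSTM(64, return_sequences=True)(inputs)
attn = keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)(x, x)
x = keras.layers.LayerNormalization()(x + attn)  # residual + norm
outputs = keras.layers.Dense(64)(x)
model = keras.Model(inputs, outputs)
model.summary()
```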
-
Excited to embark on my journey into the world of computer vision, delving into advanced ML and DL techniques. Committed to mastering image processing, object detection, CNNs, and state-of-the-art vision applications. #MachineLearning #DeepLearning #ComputerVision
-
In this video, Prof. Kopal presents the joint research results of the University of Siegen and the AI-Lab of the Department of Secure Information Systems of the Upper Austrian University of Applied Sciences, published at HistoCrypt 2024 in Oxford in our joint paper "Cryptanalysis of Hagelin M-209 Cipher Machine with Artificial Neural Networks: A Known-Plaintext Attack":
Breaking the M-209 Cipher Machine using Machine Learning in a Known-Plaintext Scenario (youtube.com)
-
Sharing the completion of my recent ML project: implemented an ANN (Artificial Neural Network) model for regression on the Power Plant Output dataset, where the goal is to predict the power generated from various operating factors. GitHub: https://lnkd.in/gQ3uKtKG #machinelearning #ANN #neuralnetwork #regression
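For context, a minimal Keras sketch of an ANN regressor for this kind of task (the layer sizes and 4-feature input are assumptions; the actual model is in the linked repo, and random data stands in for the dataset):

```python
import numpy as np
from tensorflow import keras

# Toy stand-in data: 4 sensor-style features -> 1 power-output target.
X = np.random.rand(1000, 4).astype("float32")
y = np.random.rand(1000, 1).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),  # linear output for regression
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print("train MSE:", model.evaluate(X, y, verbose=0))
```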
-
With a few tweaks, LSTMs and GRUs can leverage parallel training and compete with Transformer models: https://lnkd.in/eAP3sg3S
Minimized RNNs offer a fast and efficient alternative to Transformers (bdtechtalks.com)
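Assuming the article refers to "minimal" RNN variants such as minGRU, the core trick is that the gates depend only on the current input, not the previous hidden state, so the recurrence becomes linear in h and can be trained with a parallel prefix scan. A NumPy sketch of the reference sequential form:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def min_gru(x, Wz, Wh, h0):
    """h_t = (1 - z_t) * h_{t-1} + z_t * h~_t, with input-only gates."""
    h, hs = h0, []
    for t in range(x.shape[0]):
        z = sigmoid(x[t] @ Wz)  # gate: no dependence on h_{t-1}
        h_tilde = x[t] @ Wh     # candidate state: also input-only
        h = (1 - z) * h + z * h_tilde
        hs.append(h)
    return np.stack(hs)

T, d = 16, 8
out = min_gru(np.random.randn(T, d), np.random.randn(d, d),
              np.random.randn(d, d), np.zeros(d))
print(out.shape)  # (16, 8)
```

Because a_t = 1 - z_t and b_t = z_t * h~_t are all computable up front, h_t = a_t * h_{t-1} + b_t is a linear recurrence that a parallel scan can solve in O(log T) depth instead of T sequential steps.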
-
I want to run hyperparameter tuning for a Neural Style Transfer algorithm, which results in the GPU running out of memory between runs. Check it out: https://lnkd.in/dCBxARKy Join the conversation! #googlecolaboratory #tensorflow
How to clear GPU memory without restarting the runtime in Google Colaboratory (TensorFlow)
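A commonly suggested workaround for this situation (an assumption about what the thread recommends, and not a guaranteed fix, since TensorFlow's allocator keeps its memory pool for the life of the process) is to drop model references, reset the Keras session, and force garbage collection between trials:

```python
import gc
import tensorflow as tf

# Call between tuning trials, after dropping your own references
# (e.g. `del model`) so the objects become collectable.
def reset_keras():
    tf.keras.backend.clear_session()  # clear Keras global state/graphs
    gc.collect()                      # reclaim unreferenced objects
```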