RAG Trick: Cosine similarity is the same as dot product

Many embedding models normalize their output to unit length, e.g. the models from OpenAI, Cohere, etc. (check the documentation). In other words, the length of every vector is 1. For unit vectors, these all produce the same ranking:
- Dot product
- Cosine similarity
- Euclidean distance (monotonically related: d² = 2 − 2·cos, so its range is 0–2 instead of −1 to 1, and smaller means more similar)

Just use the dot product. It takes far fewer mathematical operations to get the same result.

💡 It's a (hyper)sphere!

Having a hard time wrapping your brain around distances in embedding space? Think of it as a sphere. All embedding points lie on the surface of that sphere. Not sure about you, but that simplifies a lot for me.

These two classifiers are also equivalent:
- Distance from centroid (average of vectors)
- Logistic regression (scikit-learn)

The difference between them is the paradigm. With the centroid, you find the center of a class by averaging its vectors. With logistic regression, you find the outside edge by training a model to draw a separating hyperplane. Since all points are on a hypersphere, they get you the same result. Choose whichever makes more sense to you. IMO:
- centroids are easier to update and delete from, but you have to come up with the distance threshold yourself
- logistic regression is more obviously a classifier, so it's easier to wrap your head around and makes code clearer

#RAG #LLMs #LLM #AI #embeddings #vectordb #vectordatabase
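A quick way to convince yourself of the equivalence (a minimal numpy sketch; the random vectors are stand-ins for whatever your embedding model returns, assuming it normalizes to unit length):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for embeddings: random vectors projected onto the unit sphere,
# mimicking what a normalizing embedding model returns.
a = rng.normal(size=384)
a /= np.linalg.norm(a)
b = rng.normal(size=384)
b /= np.linalg.norm(b)

dot = a @ b
cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
euclidean = np.linalg.norm(a - b)

print(dot, cosine)                      # identical for unit vectors
print(euclidean, np.sqrt(2 - 2 * dot))  # d = sqrt(2 - 2*dot): same ranking, reversed
```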
Tim Kellogg’s Post
More Relevant Posts
-
🚀 First-Time Integration: Voting Regressor as Meta-Model in Stacking Regressor! 💡

I'm excited to share a new milestone in my machine learning journey! Today, I successfully implemented a Voting Regressor as the meta-model in a Stacking Regressor, and it's the first time I've achieved this! 🎉

🛠 What's Under the Hood?
I used powerful base models including:
- Extra Trees Regressor
- XGBoost Regressor
- Random Forest Regressor

📊 For the meta-model, I didn't stop there! Instead of a single regressor, I implemented a Voting Regressor, which includes:
- CatBoost
- Random Forest
- Extra Trees

📈 The result is a robust ensemble model that combines the strengths of various algorithms to improve overall prediction accuracy. Here's a sneak peek of the structure I built (see attached image).

💼 This approach is particularly valuable for enhancing the performance of regression tasks by leveraging both the stacking and voting ensemble techniques.

🌟 If you're working on regression problems or advanced machine learning ensembles, this hybrid approach might just give you the edge you're looking for!

#MachineLearning #DataScience #EnsembleLearning #StackingRegressor #VotingRegressor #CatBoost #XGBoost #RandomForest #AI #ML #DataScienceProjects
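For anyone who wants to try the same structure, here is a minimal scikit-learn sketch (synthetic data and default hyperparameters as placeholders; the post's exact settings aren't shown):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (ExtraTreesRegressor, RandomForestRegressor,
                              StackingRegressor, VotingRegressor)
from xgboost import XGBRegressor
from catboost import CatBoostRegressor

X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Base models feed their out-of-fold predictions to the meta-model.
base_models = [
    ("extra_trees", ExtraTreesRegressor(random_state=42)),
    ("xgb", XGBRegressor(random_state=42)),
    ("random_forest", RandomForestRegressor(random_state=42)),
]

# The meta-model is itself an ensemble: a Voting Regressor that averages
# the predictions of CatBoost, Random Forest, and Extra Trees.
meta_model = VotingRegressor([
    ("catboost", CatBoostRegressor(verbose=0, random_state=42)),
    ("random_forest", RandomForestRegressor(random_state=42)),
    ("extra_trees", ExtraTreesRegressor(random_state=42)),
])

stack = StackingRegressor(estimators=base_models, final_estimator=meta_model, cv=5)
stack.fit(X, y)
```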
-
Support Vector Machines (SVMs) are one of the most popular algorithms in machine learning, known for their versatility in solving both classification and regression problems. But what makes them so powerful? Let's break it down:

1.) The Decision Boundary: At the core of SVM is a hyperplane (w^T x = 0) that separates data points belonging to different classes. The objective is to find the "optimal" hyperplane that best divides the dataset.

2.) Margins: SVM doesn't just aim to classify data but also focuses on maximizing the margin: the distance between the decision boundary and the nearest data points of each class. A larger margin often results in better generalization on unseen data.

3.) Support Vectors: These are the critical data points closest to the margin. They "support" the decision boundary, making them essential for determining the model's performance.

4.) Positive and Negative Hyperplanes: The lines w^T x = +1 and w^T x = -1 form the boundaries of the margin. The decision boundary lies equidistant between them.

What makes SVM even more fascinating is its ability to handle non-linear data. By applying kernel functions (e.g., RBF or polynomial kernels), SVM can project data into higher-dimensional spaces, where it becomes easier to separate classes.

Whether you're solving a binary classification task or working with complex datasets, SVM's robustness in high-dimensional spaces is worth exploring!

Have you ever used SVM for a real-world problem? If yes, what challenges did you face, and how did you overcome them? Let's discuss!

#MachineLearning #AI #SupportVectorMachines #DataScience #Classification #Algorithms #TechLearning #AIForAll
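As a concrete starting point, here is a minimal scikit-learn sketch of an RBF-kernel SVM on a toy non-linear dataset (the dataset and hyperparameter values are illustrative, not from the post):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two interleaved half-moons: not linearly separable, so the RBF kernel
# implicitly maps the data into a higher-dimensional space.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C trades margin width against misclassification; gamma controls the
# reach of the RBF kernel.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("support vectors per class:", clf[-1].n_support_)  # points defining the margin
```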
-
Where does a model look when differentiating between real vs fake images?

I did this small experiment a while ago. Picked up a dataset from Kaggle of deepfakes and real images. The deepfakes were generated by a GAN.

Two models were trained on the dataset:
1. 𝐑𝐞𝐬𝐍𝐞𝐭𝟑𝟒: Achieved 91% accuracy at 8 epochs
2. 𝐕𝐢𝐓 𝐒𝐦𝐚𝐥𝐥: Achieved 94.5% accuracy at 7 epochs

Trained using 𝐟𝐚𝐬𝐭.𝐚𝐢, my go-to library for training vision models.

Extracted feature maps by grabbing hold of the convolution layer weights for layers 1, 5, ..., 30. Then used Captum for these 5 attribution techniques:

𝐈𝐧𝐭𝐞𝐠𝐫𝐚𝐭𝐞𝐝 𝐆𝐫𝐚𝐝𝐢𝐞𝐧𝐭𝐬: Assigns importance scores to input features by gradually changing a baseline input into the actual input.
𝐒𝐚𝐥𝐢𝐞𝐧𝐜𝐲: Returns gradients with respect to inputs as a baseline approach.
𝐃𝐞𝐞𝐩𝐋𝐈𝐅𝐓 (𝐃𝐞𝐞𝐩𝐒𝐇𝐀𝐏): Explains predictions by comparing neuron activations to a reference state, extended to approximate SHAP values.
𝐈𝐧𝐩𝐮𝐭 𝐗 𝐆𝐫𝐚𝐝𝐢𝐞𝐧𝐭: Multiplies the input with the gradient with respect to the input.
𝐅𝐞𝐚𝐭𝐮𝐫𝐞 𝐀𝐛𝐥𝐚𝐭𝐢𝐨𝐧: Replaces input features with a baseline and computes the difference in output. (Takes a very long time, but very accurate.)

One interesting thing I found with ViT was that it was looking at the 𝐬𝐜𝐥𝐞𝐫𝐚 (𝐰𝐡𝐢𝐭𝐞 𝐩𝐚𝐫𝐭 𝐨𝐟 𝐭𝐡𝐞 𝐞𝐲𝐞) to make a prediction.

Reporting attribution from the 𝐟𝐞𝐚𝐭𝐮𝐫𝐞 𝐚𝐛𝐥𝐚𝐭𝐢𝐨𝐧 𝐬𝐭𝐮𝐝𝐲 for ViT on three images. This takes a very long time since it perturbs every pixel: for an image of size 256 x 256, that's 65,536 pixels x 3 channels = 196,608 forward passes. It is also very accurate, though.

It was a fun little project. Link to the repo: in the first comment. It has two notebooks: one for training and the other for attribution.

#deepfakes #ai #aideepfakes
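For reference, the Captum calls for this kind of study look roughly like the sketch below. The tiny linear model and random image are placeholders standing in for the trained ResNet34/ViT and real inputs; the post's actual code is in the linked repo:

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients, Saliency, FeatureAblation

# Placeholder classifier; in the post this would be the trained fast.ai model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2))
model.eval()

image = torch.rand(1, 3, 64, 64)      # placeholder input image
target = model(image).argmax(dim=1)   # class to attribute (real vs fake)

# Integrated Gradients: interpolate from a zero baseline to the input.
ig_attr = IntegratedGradients(model).attribute(
    image, baselines=torch.zeros_like(image), target=target)

# Saliency: plain input gradients as a baseline attribution.
sal_attr = Saliency(model).attribute(image, target=target)

# Feature Ablation: replace each feature with the baseline and measure the
# output change; one forward pass per pixel per channel, hence the cost.
fa_attr = FeatureAblation(model).attribute(image, baselines=0.0, target=target)
```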
-
🍓 𝗠𝘆 𝗘𝘅𝗵𝗮𝘂𝘀𝘁𝗶𝘃𝗲 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝗼𝗻 𝘁𝗵𝗲 𝗟𝗮𝘁𝗲𝘀𝘁 𝗦𝘁𝗿𝗮𝘄𝗯𝗲𝗿𝗿𝘆 𝗥𝗲𝗹𝗲𝗮𝘀𝗲 (𝗮𝗸𝗮 𝗢𝗽𝗲𝗻𝗔𝗜 𝗼𝟭) 🍓

I've spent the last couple of days diving deep into OpenAI o1, and let's just say… it's been 𝑒𝑛𝑙𝑖𝑔ℎ𝑡𝑒𝑛𝑖𝑛𝑔. Attached is a simplified visual report with my findings! 😅

First things first: can we stop calling LLMs "intelligent humans" or "PhDs"? The sooner we separate science from sales, the better. I've seen way too much of the latter lately. And if I can spot that as a salesperson, well… you get the picture.

Here's why the whole o1 thing was a bit of a letdown for me:

🚀 𝐒𝐩𝐞𝐞𝐝: Users want fast, very fast, especially in use cases where we're emulating real "human" conversations. Customer experience matters, and let's be honest, waiting more than 15 seconds for the same answer I'd get with GPT-4o? Not ideal.

💵 𝐂𝐨𝐬𝐭: One of the best things about LLM APIs over the past two years has been the dramatic reduction in price per token, making AI more accessible. But with o1, you're looking at unpredictable costs per call because of a hidden chain of thought that you can't see or control. 🙄

So yeah, it's been an interesting ride, but not quite the breakthrough we were all hoping for.

#AI #GenerativeAI #OpenAI #LLMs #TechInsights #hype
-
Hello LinkedIn community! I'm excited to share that I have successfully completed the SAWIT.AI Learnathon Program by Guvi, focused on Generative AI 🤖.

During this incredible journey, I gained knowledge in:
-> Retrieval-Augmented Generation (RAG)
-> Generative AI and its applications
-> Model Architecture
-> OpenAI Models & APIs
-> Data Processing

As part of the program, I built Neo Bot, a RAG bot that brings these concepts to life! 🎯

#SAWIT #GenerativeAI #AI #RAG #OpenAI #DataProcessing #GUVI
-
Day 1 of AI supervised learning with Scikit-Learn... Taking historical data and fitting a linear regression:

```python
from sklearn.linear_model import LinearRegression

# X, y: historical features and targets; x_new: unseen inputs to predict for
model = LinearRegression()
model.fit(X, y)
predictions = model.predict(x_new)
```
-
🚀 Mastering the Art of Hyperparameter Tuning in Machine Learning! 🛠️💡

Ever find yourself puzzled over which knobs to turn to improve your model's performance? Here's a crisp guide to hyperparameter tuning for some popular algorithms:

🔹 Linear Regression: Optimize regularization parameters like alpha (in Ridge/Lasso variants) to prevent overfitting.
🔹 Logistic Regression: Fine-tune C and the penalty type (L1, L2) to balance bias and variance.
🔹 Decision Trees: Adjust max_depth, min_samples_split, and more to control tree complexity.
🔹 K-Nearest Neighbors: Select the right number of neighbors and distance metric for better proximity calculation.
🔹 Support Vector Machines: Play with C, kernel type, and gamma to shape the decision boundaries effectively.

🤓 Hyperparameter tuning is crucial as it directly influences how well our models learn and predict. Getting them right can be the difference between a good model and a great one!

👥 What's your go-to strategy for tuning parameters? Share your tips or ask questions below! 🗨️

#MachineLearning #DataScience #AI #TechTips #Hyperparameters
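One common way to turn these knobs systematically is a grid search with cross-validation; a minimal sketch (the candidate values in the grid are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate values for C, kernel, and gamma; every combination is scored
# with 5-fold cross-validation.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["rbf", "poly"],
    "gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```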
-
🚀 Day 4 of My 50-Day Machine Learning Challenge! 🚀

Today, I delved into the crucial aspects of preparing and building a machine learning model. Here's a snapshot of what I learned:

🔄 Preprocess + EDA + Feature Selection:
- Preprocessing
- Exploratory Data Analysis (EDA)
- Feature Selection

🔢 Steps in Building a Machine Learning Model:
1. Extract input and output columns
2. Scale the values
3. Train-test split
4. Train the model
5. Evaluate the model / model selection
6. Deploy the model

🧩 Framing a Machine Learning Problem:
Framing involves clearly defining the problem, understanding the data requirements, and determining the machine learning approach suitable for solving it. It's crucial to establish the problem's objectives, constraints, and success criteria to ensure the model delivers valuable insights and results.

I'm excited about these new insights and look forward to putting them into practice. Stay tuned for more updates as I continue my ML journey!

#MachineLearning #DataScience #MLChallenge #AI #ModelBuilding #Preprocessing
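Steps 1 through 5 map almost one-to-one onto scikit-learn; a minimal sketch with a placeholder dataset (deployment is out of scope here):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Extract input and output columns
X, y = load_breast_cancer(return_X_y=True)

# 3. Train-test split (done before scaling to avoid leaking test statistics)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 2. Scale the values (fit the scaler on the training set only)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# 4. Train the model
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 5. Evaluate the model
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```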
-
🚀 Demystifying Multiclass Classification: OvA vs. OvO Strategies! 🤖

When tackling real-world classification problems, we often move beyond simple binary outcomes (e.g., spam vs. not spam) to multiclass classification tasks, where there are multiple possible labels (e.g., handwritten digits 0-9). But did you know that there are specific strategies to adapt binary classifiers for these tasks? Enter One-versus-All (OvA) and One-versus-One (OvO)!

🔹 One-versus-All (OvA)
This approach trains one binary classifier per class. For example, if you're classifying images of digits, you'd have 10 classifiers: one for each digit (0 vs. all, 1 vs. all, etc.). The input is passed through each classifier, and the class with the highest decision score is chosen.
Pros: Simpler setup with fewer models.
Best For: Algorithms that scale well with the size of the training set.

🔹 One-versus-One (OvO)
In OvO, a binary classifier is trained for each pair of classes, resulting in N × (N − 1) / 2 classifiers (e.g., for 10 classes, that's 45 models!). During prediction, the input is evaluated by all classifiers, and the class with the most "wins" is selected.
Pros: Each model is trained on smaller subsets, which is ideal for algorithms like SVMs that don't scale well with large datasets.
Best For: When training many smaller models is preferable to training fewer, larger ones.

Understanding these strategies not only helps optimize your machine learning workflow but also deepens your appreciation for how algorithms scale and perform across various tasks. Next time you face a multiclass problem, remember that your choice between OvA and OvO can make all the difference! 💡

Which strategy do you prefer for multiclass problems? Share your thoughts below! 👇

#MachineLearning #DataScience #MulticlassClassification #OvA #OvO #AI
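In scikit-learn the two strategies are explicit wrappers; a minimal sketch on the digits dataset (LinearSVC here is just one choice of binary base estimator):

```python
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)  # 10 classes: digits 0-9

# OvA (called "one-vs-rest" in scikit-learn): one classifier per digit.
ova = OneVsRestClassifier(LinearSVC(max_iter=5000)).fit(X, y)
print(len(ova.estimators_))  # 10

# OvO: one classifier per pair of classes: 10 * 9 / 2 = 45 models.
ovo = OneVsOneClassifier(LinearSVC(max_iter=5000)).fit(X, y)
print(len(ovo.estimators_))  # 45
```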
-
🚀 Dissecting the concept of Feature Selection in Machine Learning! 🤖💡

🎯 Looking to supercharge your machine learning models? Dive into the fascinating dimension of feature selection! Here are some quirky insights to elevate your ML journey:

1️⃣ Less is More: Ever heard the phrase "trimming the fat"? Feature selection does just that! It helps your model focus on the most relevant features while discarding the junk completely.

2️⃣ Statistical Sorcery: Filter methods like correlation and mutual information work their magic, separating the signal from the noise.

3️⃣ Model Mind Readers: Wrapper methods are like mind readers for your models. They iterate, they evaluate, and they pick the perfect team of features for peak performance.

4️⃣ Tree Tango: Picture decision trees doing a graceful dance, effortlessly selecting the best among all features. It's nature's way of feature selection, and it's as beautiful as it sounds!

5️⃣ Mix & Match: Why settle for one when you can have it all? Hybrid approaches blend the best of both worlds, creating a feature selection cocktail that packs an innovative punch!

✨ Explore the world of feature selection with us – where every bite brings you closer to that mouthwatering job offer!

#MachineLearning #DataScience #FeatureSelection #AI #TechInnovation #ModelOptimization #Algorithm #DataAnalysis #ArtificialIntelligence #MLCommunity
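To make the filter and wrapper categories concrete, here is a minimal scikit-learn sketch (the feature counts are arbitrary placeholders):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter method: score each feature by mutual information with the target
# and keep the top 10. No model is involved.
X_filtered = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

# Wrapper method: recursive feature elimination repeatedly trains the model
# and drops the weakest features until 10 remain.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10).fit(X, y)
print(rfe.support_)  # boolean mask of the selected features
```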
Software Engineer at PNNL (Center for AI, rapid prototyping)
I think you know this, but for other folks who read just the first line: the dot product is only the same as cosine similarity if they are unit vectors, so if the embedding model doesn't normalize its output, they won't be the same. Euclidean distance is not the same as the other two. In three dimensions, it's the length of a direct line drawn from one vector to another. And I'm not sure thinking of it as a sphere in "high" dimensions is right. The range (of distances between random unit vectors) seems to narrow as the dimensions increase. For 1000 random vectors:
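(The comment's numbers are cut off above; a sketch of the experiment it describes, pairwise distances between 1000 random unit vectors at increasing dimension, might look like this:)

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
for dim in (2, 16, 128, 1024):
    v = rng.normal(size=(1000, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)  # project onto the unit sphere
    d = pdist(v)  # all pairwise Euclidean distances
    print(f"dim={dim:5d}  min={d.min():.3f}  max={d.max():.3f}  std={d.std():.3f}")
```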