Manuel Romero has just released an excellent notebook for training text embedding models whose embeddings can be truncated with minimal performance loss. This allows for faster retrieval, clustering, etc. Plus, you can train the model on your domain for better performance. Check it out here: https://lnkd.in/ezRrRy7r Or learn more about training embedding models here: https://sbert.net/
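For a sense of what such training looks like, here is a minimal sketch using Sentence Transformers' MatryoshkaLoss, which trains embeddings to stay useful when truncated to smaller dimensions. The model name, toy data, and dimension list below are illustrative assumptions, not taken from the notebook:

```python
# Minimal Matryoshka-style training sketch with sentence-transformers (v2.4+).
# The model choice, dimension list, and toy data are illustrative placeholders.
import numpy as np
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("microsoft/mpnet-base")  # any encoder checkpoint

# Toy positive pairs; replace with your domain data.
train_examples = [
    InputExample(texts=["How do I reset my password?", "Steps to reset a password"]),
    InputExample(texts=["Best pizza in town", "Top-rated local pizzerias"]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# A base contrastive loss, wrapped so that the first k dimensions are also
# trained to be good embeddings on their own -- this is what makes truncation cheap.
base_loss = losses.MultipleNegativesRankingLoss(model)
loss = losses.MatryoshkaLoss(model, base_loss, matryoshka_dims=[768, 512, 256, 128, 64])

model.fit(train_objectives=[(train_dataloader, loss)], epochs=1, warmup_steps=10)

# At inference time, truncate and renormalize for faster retrieval.
emb = model.encode(["a query"])[:, :128]
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
```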
-
Predicting purchase decisions with a kernel SVM classification model: https://lnkd.in/grrhYA5m
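As a rough illustration of the approach (the features and data here are made-up placeholders, not taken from the linked notebook), a kernel SVM purchase classifier in scikit-learn might look like:

```python
# Hypothetical purchase-decision classifier with an RBF-kernel SVM.
# The features (e.g. age, salary) and labels are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # e.g. [age, estimated salary]
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)   # synthetic purchase labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Feature scaling matters for RBF kernels: distances drive the kernel values.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(scaler.transform(X_train), y_train)

pred = clf.predict(scaler.transform(X_test))
print("accuracy:", accuracy_score(y_test, pred))
```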
-
One thing to keep in mind with BM25/full-text search libraries: there's a difference between
- Computing a BM25 term matrix for a model to consume
- Retrieving just the top N results vs. scoring everything
- Bells and whistles like phrases, position-aware matching, fuzzy search, custom similarities, and boolean queries
I think there are tools for the former; for SearchArray I've tried to focus on the latter two (score everything, but with all the bells and whistles you'd expect from a full-text search library). You can see that in the Colab notebook I've been building up: https://lnkd.in/eRv8emgY
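To make the "score everything vs. retrieve top N" distinction concrete, here is a small sketch using the rank_bm25 library as a stand-in (SearchArray's own API differs; this only illustrates the two access patterns):

```python
# Illustrating "score every document" vs. "retrieve top N" with rank_bm25.
# This is a stand-in example; SearchArray's API is different.
from rank_bm25 import BM25Okapi

corpus = [
    "the quick brown fox",
    "jumps over the lazy dog",
    "bm25 scores documents by term statistics",
]
tokenized = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized)

query = "bm25 term statistics".split()

# 1) Score everything: one score per document, e.g. as a feature column
#    for a downstream ranking or learning-to-rank model.
all_scores = bm25.get_scores(query)
print(all_scores)  # array of len(corpus) floats

# 2) Retrieve just the top N: what a search engine typically returns.
top_docs = bm25.get_top_n(query, corpus, n=2)
print(top_docs)
```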
-
Always better to learn with a hands-on example :) With the Keras + Comet integration, automatically start logging:
- Model and graph description
- Steps and epochs
- Metrics (such as loss and accuracy)
- Hyperparameters
- Optimizer parameters (such as the learning rate, beta decay rate, and more)
- Number of trainable parameters
- Histograms for weights and biases
- Histograms for activations
- Histograms for gradients
Try the quick-start Colab to get started.
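The setup itself is tiny; here's a minimal sketch (the API key, project name, and toy model below are placeholders):

```python
# Minimal Comet + Keras setup sketch; API key and project name are placeholders.
# Import comet_ml BEFORE keras/tensorflow so auto-logging can hook in.
from comet_ml import Experiment

experiment = Experiment(
    api_key="YOUR_API_KEY",           # placeholder
    project_name="keras-quickstart",  # placeholder
)

import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(10,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.rand(128, 10)
y = (X.sum(axis=1) > 5).astype("float32")

# Metrics, hyperparameters, and histograms are captured automatically
# for each epoch while the experiment is active.
model.fit(X, y, epochs=3, batch_size=16)
experiment.end()
```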
-
Below is the Colab link for my K-NN project, part of a competition hosted by Sarah R. for her ML workshop, CodewithSerah: https://lnkd.in/dS3ZeGRq My code implements a custom K-NN classifier to classify blurry handwritten digits from the sklearn digits dataset. I compared the performance and accuracy of my classifier against scikit-learn's built-in implementation by evaluating both over various values of k. I also visualized the accuracies of both classifiers and displayed sample predictions alongside the true labels. Digit recognition is a real-world problem: many tasks we want to automate involve handwritten numerical data, such as banking, finance, healthcare (medical records, prescriptions), and data entry, and an accurate classifier improves the efficiency of such applications. I chose this problem because it offers practical experience with image classification and is useful in many domains, and K-NN is well suited to this kind of task and dataset.
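A condensed sketch of that comparison (the k values and train/test split here are illustrative, not necessarily the competition's exact setup):

```python
# Compare a hand-rolled K-NN against scikit-learn's on the digits dataset.
# The k values and the split are illustrative choices.
import numpy as np
from collections import Counter
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

def knn_predict(X_train, y_train, X_test, k):
    """Majority vote over the k nearest training points (Euclidean distance)."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dists)[:k]]
        preds.append(Counter(nearest).most_common(1)[0][0])
    return np.array(preds)

for k in (1, 3, 5, 7):
    mine = accuracy_score(y_test, knn_predict(X_train, y_train, X_test, k))
    sk = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    theirs = accuracy_score(y_test, sk.predict(X_test))
    print(f"k={k}: custom={mine:.3f}  sklearn={theirs:.3f}")
```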
-
NER is not fully solved by throwing your prompt at an API. You can kickstart with some generalist models, but you need to iterate on your data and build your own model, which will be smaller, more affordable, and more efficient. This notebook (efficient_token_classification.ipynb) shows how.
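In that spirit, here is a minimal token-classification fine-tuning sketch with Hugging Face transformers (the base model, label set, and tiny toy dataset are illustrative assumptions, not necessarily what the notebook uses):

```python
# Minimal NER fine-tuning sketch; model, labels, and data are placeholders.
from datasets import Dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          DataCollatorForTokenClassification, Trainer,
                          TrainingArguments)

labels = ["O", "B-PER", "B-LOC", "B-ORG"]   # toy tag set
model_name = "distilbert-base-uncased"      # small, cheap to iterate on

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(labels)
)

# Toy word-level annotations; replace with your own iterated data.
ds = Dataset.from_dict({
    "tokens": [["John", "lives", "in", "Paris"], ["Acme", "hired", "Mary"]],
    "ner_tags": [[1, 0, 0, 2], [3, 0, 1]],
})

def tokenize_and_align(batch):
    # Repeat each word's tag on its subword tokens; mask special tokens
    # with -100 so they are ignored by the loss.
    enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    enc["labels"] = [
        [tags[w] if w is not None else -100 for w in enc.word_ids(batch_index=i)]
        for i, tags in enumerate(batch["ner_tags"])
    ]
    return enc

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ner-model", num_train_epochs=1),
    train_dataset=ds.map(tokenize_and_align, batched=True),
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```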
-
I do love Daniel Han's work at Unsloth - these economical-to-run (thanks to Unsloth!) and well-documented Google Colabs are super useful when it comes to standing up and playing with the Llama series of models. #ai #opensource #unsloth #llama3
I'm releasing 3 free Llama 3.1 notebooks for 2.1x faster & 60% less memory finetuning! Llama 3.1 comes in 8b, 70b & a large 405b size! All have 128K context length, are multilingual, and have tool support! I also uploaded 4bit pre-quantized base and instruction-tuned models for 4x faster downloads, and detail what we found in the new Llama model in our new Unsloth AI release and blog post! Also releasing a preview of our Studio UI chat, which makes running Llama 3.1 8b in a Colab 2x faster and for free!
Llama 3.1 8b Colab finetuning notebook: https://lnkd.in/eDSCZdBt
Llama 3.1 8b Kaggle finetuning notebook: https://lnkd.in/euyDpts6
Unsloth's HF repo: huggingface.co/unsloth
Unsloth UI Chat Preview: https://lnkd.in/ek5btpKU
Our blog: unsloth.ai/blog
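For context on what those notebooks do, here is a heavily condensed sketch of Unsloth-style 4bit LoRA finetuning (the LoRA rank, training data, and hyperparameters are illustrative placeholders; see the notebooks for the real recipes):

```python
# Condensed Unsloth finetuning sketch; hyperparameters and data are placeholders.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Load a 4bit pre-quantized Llama 3.1 8b for low-memory finetuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Toy dataset; replace with your instruction data.
dataset = Dataset.from_dict({"text": ["### Question: 2+2?\n### Answer: 4"]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(output_dir="outputs", max_steps=10,
                           per_device_train_batch_size=2),
)
trainer.train()
```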
-
Impressed by Daniel Han and Unsloth AI's Llama 3.1 notebooks! 🚀 Faster finetuning and lower memory usage. Check out the details. #AI #MachineLearning #Llama3.1 #UnslothAI
-
Microsoft's LightGBM is a distributed, efficient gradient-boosting framework that uses tree-based learning algorithms and is capable of handling large-scale data. Comet can automatically log your LightGBM model graph, metrics, and parameters in just 3 lines of code! 👇 Check out the Colab tutorial to get started.
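Roughly, the setup looks like this (the API key, project name, and data are placeholders; the Comet-specific lines are just the import, the Experiment, and the end call):

```python
# Comet + LightGBM sketch; API key, project name, and data are placeholders.
# Import comet_ml before lightgbm so auto-logging can hook in.
from comet_ml import Experiment

experiment = Experiment(api_key="YOUR_API_KEY", project_name="lightgbm-demo")

import lightgbm as lgb
import numpy as np

X = np.random.rand(500, 10)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
train_data = lgb.Dataset(X[:400], label=y[:400])
valid_data = lgb.Dataset(X[400:], label=y[400:], reference=train_data)

params = {"objective": "binary", "metric": "auc", "learning_rate": 0.1}

# Params and per-iteration eval metrics are captured by the active experiment.
booster = lgb.train(params, train_data, num_boost_round=50,
                    valid_sets=[valid_data])
experiment.end()
```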