Manuel Romero has just released an excellent notebook for training text embedding models whose embeddings can be truncated with minimal performance loss. This allows for faster retrieval, clustering, etc. Plus, you can train the model on your domain for better performance. Check it out here: https://lnkd.in/ezRrRy7r Or learn more about training embedding models here: https://sbert.net/
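For a sense of what such training looks like, here is a minimal sketch using Sentence Transformers' MatryoshkaLoss, which trains embeddings to stay useful when truncated to smaller dimensions. The model name, toy data, and dimension list below are illustrative assumptions, not taken from the notebook:

```python
# Minimal Matryoshka-style training sketch with sentence-transformers (v2.4+).
# The model choice, dimension list, and toy data are illustrative placeholders.
import numpy as np
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("microsoft/mpnet-base")  # any encoder checkpoint

# Toy positive pairs; replace with your domain data.
train_examples = [
    InputExample(texts=["How do I reset my password?", "Steps to reset a password"]),
    InputExample(texts=["Best pizza in town", "Top-rated local pizzerias"]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# A base contrastive loss, wrapped so that the first k dimensions are also
# trained to be good embeddings on their own -- this is what makes truncation cheap.
base_loss = losses.MultipleNegativesRankingLoss(model)
loss = losses.MatryoshkaLoss(model, base_loss, matryoshka_dims=[768, 512, 256, 128, 64])

model.fit(train_objectives=[(train_dataloader, loss)], epochs=1, warmup_steps=10)

# At inference time, truncate and renormalize for faster retrieval.
emb = model.encode(["a query"])[:, :128]
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
```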
-
Predicting purchase decisions with a kernel SVM classification model: https://lnkd.in/grrhYA5m
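As a rough illustration of the approach (the features and data here are made-up placeholders, not taken from the linked notebook), a kernel SVM purchase classifier in scikit-learn might look like:

```python
# Hypothetical purchase-decision classifier with an RBF-kernel SVM.
# The features (e.g. age, salary) and labels are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # e.g. [age, estimated salary]
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)   # synthetic purchase labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Feature scaling matters for RBF kernels: distances drive the kernel values.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(scaler.transform(X_train), y_train)

pred = clf.predict(scaler.transform(X_test))
print("accuracy:", accuracy_score(y_test, pred))
```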
-
One thing to keep in mind with BM25/full-text search libraries: there's a difference between
- Computing a BM25 term matrix for a model to consume
- Retrieving just the top N results vs. scoring everything
- Bells and whistles like phrases, position-aware matching, fuzzy search, custom similarities, and boolean queries
I think there are tools for the former; for SearchArray I've tried to focus on the latter two (score everything, but with all the bells and whistles you'd expect from a full-text search library). You can see that in the Colab notebook I've been building up: https://lnkd.in/eRv8emgY
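To make the "score everything vs. retrieve top N" distinction concrete, here is a small sketch using the rank_bm25 library as a stand-in (SearchArray's own API differs; this only illustrates the two access patterns):

```python
# Illustrating "score every document" vs. "retrieve top N" with rank_bm25.
# This is a stand-in example; SearchArray's API is different.
from rank_bm25 import BM25Okapi

corpus = [
    "the quick brown fox",
    "jumps over the lazy dog",
    "bm25 scores documents by term statistics",
]
tokenized = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized)

query = "bm25 term statistics".split()

# 1) Score everything: one score per document, e.g. as a feature column
#    for a downstream ranking or learning-to-rank model.
all_scores = bm25.get_scores(query)
print(all_scores)  # array of len(corpus) floats

# 2) Retrieve just the top N: what a search engine typically returns.
top_docs = bm25.get_top_n(query, corpus, n=2)
print(top_docs)
```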
-
Always better to learn with a hands-on example :) With the Keras + Comet integration, automatically start logging:
- Model and graph description
- Steps and epochs
- Metrics (such as loss and accuracy)
- Hyperparameters
- Optimizer parameters (such as the learning rate, beta decay rate, and more)
- Number of trainable parameters
- Histograms for weights and biases
- Histograms for activations
- Histograms for gradients
Try the quick-start Colab to get started.
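The setup itself is tiny; here's a minimal sketch (the API key, project name, and toy model below are placeholders):

```python
# Minimal Comet + Keras setup sketch; API key and project name are placeholders.
# Import comet_ml BEFORE keras/tensorflow so auto-logging can hook in.
from comet_ml import Experiment

experiment = Experiment(
    api_key="YOUR_API_KEY",           # placeholder
    project_name="keras-quickstart",  # placeholder
)

import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(10,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.rand(128, 10)
y = (X.sum(axis=1) > 5).astype("float32")

# Metrics, hyperparameters, and histograms are captured automatically
# for each epoch while the experiment is active.
model.fit(X, y, epochs=3, batch_size=16)
experiment.end()
```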
-
Below is the Colab link for my K-NN project, part of a competition hosted by Sarah R. for her ML workshop, CodewithSerah: https://lnkd.in/dS3ZeGRq My code implements a custom K-NN classifier to classify blurry handwritten digits from the sklearn digits dataset. I compared the performance and accuracy of my classifier against scikit-learn's built-in implementation by evaluating both over various values of k. I also visualized the accuracies of both classifiers and displayed sample predictions alongside the true labels. Digit recognition is a real-world problem: many tasks we want to automate involve handwritten numerical data, such as banking, finance, healthcare (medical records, prescriptions), and data entry, and an accurate classifier improves the efficiency of such applications. I chose this problem because it offers practical experience with image classification and is useful in many domains, and K-NN is well suited to this kind of task and dataset.
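A condensed sketch of that comparison (the k values and train/test split here are illustrative, not necessarily the competition's exact setup):

```python
# Compare a hand-rolled K-NN against scikit-learn's on the digits dataset.
# The k values and the split are illustrative choices.
import numpy as np
from collections import Counter
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

def knn_predict(X_train, y_train, X_test, k):
    """Majority vote over the k nearest training points (Euclidean distance)."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dists)[:k]]
        preds.append(Counter(nearest).most_common(1)[0][0])
    return np.array(preds)

for k in (1, 3, 5, 7):
    mine = accuracy_score(y_test, knn_predict(X_train, y_train, X_test, k))
    sk = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    theirs = accuracy_score(y_test, sk.predict(X_test))
    print(f"k={k}: custom={mine:.3f}  sklearn={theirs:.3f}")
```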
-
NER is not fully solved by throwing your prompt at an API. You can kickstart with some generalist models, but you need to iterate on your data and build your own model, which will be smaller, more affordable, and more efficient. This notebook (efficient_token_classification.ipynb) shows how.
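In that spirit, here is a minimal token-classification fine-tuning sketch with Hugging Face transformers (the base model, label set, and tiny toy dataset are illustrative assumptions, not necessarily what the notebook uses):

```python
# Minimal NER fine-tuning sketch; model, labels, and data are placeholders.
from datasets import Dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          DataCollatorForTokenClassification, Trainer,
                          TrainingArguments)

labels = ["O", "B-PER", "B-LOC", "B-ORG"]   # toy tag set
model_name = "distilbert-base-uncased"      # small, cheap to iterate on

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(labels)
)

# Toy word-level annotations; replace with your own iterated data.
ds = Dataset.from_dict({
    "tokens": [["John", "lives", "in", "Paris"], ["Acme", "hired", "Mary"]],
    "ner_tags": [[1, 0, 0, 2], [3, 0, 1]],
})

def tokenize_and_align(batch):
    # Repeat each word's tag on its subword tokens; mask special tokens
    # with -100 so they are ignored by the loss.
    enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    enc["labels"] = [
        [tags[w] if w is not None else -100 for w in enc.word_ids(batch_index=i)]
        for i, tags in enumerate(batch["ner_tags"])
    ]
    return enc

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ner-model", num_train_epochs=1),
    train_dataset=ds.map(tokenize_and_align, batched=True),
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```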
-
I do love Daniel Han's work at Unsloth - these economical-to-run (thanks to Unsloth!) and well-documented Google Colabs are super useful when it comes to standing up and playing with the Llama series of models. #ai #opensource #unsloth #llama3
I'm releasing 3 free Llama 3.1 notebooks for 2.1x faster & 60% less memory finetuning! Llama 3.1 comes in 8b, 70b & a large 405b size! All have 128K context length, are multilingual, and have tool support! I also uploaded 4bit pre-quantized base and instruction-tuned models for 4x faster downloads, and detail what we found in the new Llama model in our new Unsloth AI release and blog post! Also releasing a preview of our Studio UI chat, which makes running Llama 3.1 8b in a Colab 2x faster and for free!
Llama 3.1 8b Colab finetuning notebook: https://lnkd.in/eDSCZdBt
Llama 3.1 8b Kaggle finetuning notebook: https://lnkd.in/euyDpts6
Unsloth's HF repo: huggingface.co/unsloth
Unsloth UI Chat Preview: https://lnkd.in/ek5btpKU
Our blog: unsloth.ai/blog
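For context on what those notebooks do, here is a heavily condensed sketch of Unsloth-style 4bit LoRA finetuning (the LoRA rank, training data, and hyperparameters are illustrative placeholders; see the notebooks for the real recipes):

```python
# Condensed Unsloth finetuning sketch; hyperparameters and data are placeholders.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Load a 4bit pre-quantized Llama 3.1 8b for low-memory finetuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Toy dataset; replace with your instruction data.
dataset = Dataset.from_dict({"text": ["### Question: 2+2?\n### Answer: 4"]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(output_dir="outputs", max_steps=10,
                           per_device_train_batch_size=2),
)
trainer.train()
```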
-
Impressed by Daniel Han and Unsloth AI's Llama 3.1 notebooks! 🚀 Faster finetuning and lower memory usage. Check out the details. #AI #MachineLearning #Llama3.1 #UnslothAI
-
Microsoft's LightGBM is a distributed, efficient gradient-boosting framework that uses tree-based learning algorithms and is capable of handling large-scale data. Comet can automatically log your LightGBM model graph, metrics, and parameters in just 3 lines of code! 👇 Check out the Colab tutorial to get started.
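Roughly, the setup looks like this (the API key, project name, and data are placeholders; the Comet-specific lines are just the import, the Experiment, and the end call):

```python
# Comet + LightGBM sketch; API key, project name, and data are placeholders.
# Import comet_ml before lightgbm so auto-logging can hook in.
from comet_ml import Experiment

experiment = Experiment(api_key="YOUR_API_KEY", project_name="lightgbm-demo")

import lightgbm as lgb
import numpy as np

X = np.random.rand(500, 10)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
train_data = lgb.Dataset(X[:400], label=y[:400])
valid_data = lgb.Dataset(X[400:], label=y[400:], reference=train_data)

params = {"objective": "binary", "metric": "auc", "learning_rate": 0.1}

# Params and per-iteration eval metrics are captured by the active experiment.
booster = lgb.train(params, train_data, num_boost_round=50,
                    valid_sets=[valid_data])
experiment.end()
```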