Fine-tuning GPT-2 on the indic-gpt dataset. I ran one epoch of fine-tuning on an arbitrary Indic-gpt dataset and trained the model for language detection on Colab. Have a look: https://lnkd.in/dZJUY-UV #IndicGPT #FineTuning #LanguageDetection
Shyamkant Kulkarni’s Post
-
🚀 Excited to share my recent work on fine-tuning the LLaMA-2 model! 🚀 After a lot of hard work and dedication, I successfully fine-tuned the LLaMA-2 model using the Glaive Function Calling v2 dataset, achieving an impressive training loss of 0.218200. 🎉 This fine-tuned model is now available on Hugging Face: https://lnkd.in/gGzCrR2i A big shoutout to everyone who supported me throughout this journey! 🙌 Looking forward to seeing the amazing applications and innovations that will come from this work. #AI #MachineLearning #NLP #HuggingFace #LLMs #ArtificialIntelligence #FunctionCalling #DeepLearning
Danjin/Llama-2-7b-chat-finetunev2 at main
huggingface.co
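For anyone curious what a function-calling training example in this style looks like, here is a minimal sketch. The field names and the `<functioncall>` marker are assumptions loosely modeled on the Glaive-style format, not the exact schema of the dataset above; check the dataset card for the real template.

```python
import json

# Hypothetical tool schema (the function name and fields are illustrative).
function_spec = {
    "name": "get_stock_price",
    "description": "Get the current price of a stock ticker",
    "parameters": {
        "type": "object",
        "properties": {"ticker": {"type": "string"}},
        "required": ["ticker"],
    },
}

def build_training_text(spec, user_msg, call_args):
    """Assemble one supervised example: a system prompt carrying the tool
    schema, the user turn, and the assistant's emitted function call."""
    system = "SYSTEM: You have access to this function:\n" + json.dumps(spec)
    user = f"USER: {user_msg}"
    assistant = "ASSISTANT: <functioncall> " + json.dumps(
        {"name": spec["name"], "arguments": call_args}
    )
    return "\n".join([system, user, assistant])

example = build_training_text(
    function_spec, "What is AAPL trading at?", {"ticker": "AAPL"}
)
print(example)
```

The point of the format is that the target text the model learns to produce is itself machine-parseable JSON, so a runtime can execute the call and feed the result back.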
-
🌟 New Research Alert! Check out this groundbreaking blog post on MambaLRP: Explaining Selective State Space Sequence Models. The post delves into the importance of transparency in sequence modeling and introduces MambaLRP, a novel algorithm within the LRP framework. The research promises a more stable and reliable relevance propagation across diverse models and datasets. Don't miss out on this cutting-edge development in language modeling and beyond. Read the full post here: https://bit.ly/4chKhVI #SequenceModeling #MambaLRP #ResearchUpdate
-
Thrilled to bring you Research Log #044! https://lnkd.in/gWWnbB6e A big highlight is from our Multimodal Action Models team, who are developing a prompt-engineering framework to adapt SoTA VLMs like GPT-4o to action trajectories. More details and project updates inside the log! Another incredible week for the ML research community, some shoutouts!
- Sihang Li et al. for SciLitLLM, an LLM designed to extract key information from scientific publications.
- Shijia Yang et al. for their work The Law of Vision Representation in MLLMs!
More inside! If you liked this post, join the conversation! https://lnkd.in/gyT9Kx2W
Research Log #044
manifoldrg.com
-
#AI is not cheap: it needs money to make, and it brings money once it's made. #OpenSource is a gift, but that does not mean integrating #AI comes at zero cost. #LLM #RealityCheck
This model will be further fine-tuned on:
1. maths
2. SQL and custom datasets
#DM for any other dataset you would like to see.
Base model: #Mistral Instruct v2. Dataset: #Orca pairs. Fine-tuning: DPO.
https://lnkd.in/gaSjVifd
tvkkishore/inspire-Mistral-7B-v2-DPO-Math at main
huggingface.co
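For readers unfamiliar with DPO, the core of it is a single loss over preference pairs: push the policy's log-probability margin between the chosen and rejected response above the reference model's margin. A minimal sketch in pure Python (the log-probability values fed in below are made up for illustration):

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the trained policy and the frozen reference model."""
    logits = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    # -log(sigmoid(logits)): small when the policy widens the margin.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Policy prefers the chosen answer more than the reference does: loss < log(2).
print(dpo_loss(-12.0, -20.0, -14.0, -18.0))
# No preference shift at all: loss is exactly log(2) ≈ 0.693.
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
```

`beta` controls how strongly the policy is allowed to drift from the reference; trainers typically expose it as a hyperparameter.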
-
🚀 Exciting Update! 🚀 I’ve just released a quantized version of the Llama3-ChatQA-2-8B model, specifically optimized for Retrieval-Augmented Generation (RAG) tasks for large language models! If you’re looking for improved performance, reduced memory usage, and faster response times without compromising accuracy, this is for you! 🔗 Check it out here: https://lnkd.in/e96hbns2 🔧 Designed for those diving into Q&A tasks and enhancing LLM applications, this is a game-changer for efficient deployment. Get your hands on it and let me know your thoughts! 💡 #AI #LLM #RAG #DeepLearning #MachineLearning #Llama3ChatQA #QuantizedModel #OpenSource
bobofrut/Llama3-ChatQA-2-8B-GGUF · Hugging Face
huggingface.co
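The memory win from quantization is easy to ballpark: weight memory is roughly parameters × bits per weight. A quick sketch (the bits-per-weight figures for the GGUF quant types are approximations, and this ignores KV cache, activations, and per-block quantization overhead):

```python
def approx_model_size_gb(n_params_billions, bits_per_weight):
    """Rough weight-memory estimate: parameters × bits / 8, in GB."""
    return n_params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common formats (assumed values).
for label, bits in [("FP16", 16), ("Q8_0", 8.5), ("Q4_K_M", 4.8)]:
    print(f"8B @ {label}: ~{approx_model_size_gb(8, bits):.1f} GB")
```

That is why an 8B model that needs ~16 GB in FP16 can fit comfortably on consumer GPUs once quantized to 4–5 bits.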
-
🌟 Excited to announce my latest contribution to the AI community! 🌟 I've fine-tuned a LLaMA-3 model specifically for Finance tasks, accessible here: "https://lnkd.in/gTb5KFcY". Trained on high-quality Q&A financial data, this model is primed for industry-specific NLP tasks like regulatory analysis, market predictions, and financial Q&A. 📊💼
Key Data Insights
The dataset focuses on financial questions and contextual answers, making it perfect for models needing precise, industry-related knowledge. Dataset link -> "https://lnkd.in/gY9hgUsk"
Get Started
Here's how to load and query the model:
---------------------------------------------------------------------------------------
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_name = "Uddeshya/ud_llama"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
---------------------------------------------------------------------------------------
I hope this helps drive innovation in finance-related NLP. Excited to see how this evolves! 🚀 #MachineLearning #FinanceAI #HuggingFace #NLP #LLM #AIinFinance #LLaMA
Uddeshya/ud_llama · Hugging Face
huggingface.co
-
Published an Instruction-Tuned version of the Starcoder2-3B model to Hugging Face: https://lnkd.in/gnVd_UYd
Why?
* Officially, only the Starcoder2-15B version is available.
* There are a few instruction-tuned versions of the 3B one, but they either lack documentation & training hyperparameters or have not been trained on the same dataset as the Starcoder2-15B one.
* It's trained on the same dataset as Starcoder2-15B.
* It can be easily run locally on 4 GB, 6 GB, 8 GB, and 10 GB VRAM machines with appropriate quantization.
* It can be integrated with Ollama to serve as a local coding assistant.
Some information:
* Trained for 4 epochs
* GPU: 40 GB A100
* Time: ~4 hours
* Training type: full fine-tuning
All hyperparameters, the training command, and a link to the official StarCoder Self-Align repository are given in the model card. #LLM #NLP #DeepLearning
sovitrath/starcoder2-3b-instruct · Hugging Face
huggingface.co
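Instruct-tuned code models only behave well when queried with the template they were trained on. Here is a minimal sketch of the "### Instruction / ### Response" style commonly used by StarCoder2 instruct tunes; the exact template string is an assumption for illustration, so check the model card above for the real one.

```python
def build_prompt(instruction):
    """Wrap a user request in an assumed instruction-tuning template.

    Generation would then be stopped when the model starts a new
    '### Instruction' section."""
    return (
        "### Instruction\n"
        f"{instruction.strip()}\n\n"
        "### Response\n"
    )

print(build_prompt("Write a Python function that reverses a string."))
```

Sending raw text without the template to an instruct tune often yields completions instead of answers, which is the most common integration mistake with local coding assistants.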
-
📣 Function-Calling Llama3 by ScaleGenAI is here! 📣 Check it out here: https://lnkd.in/gBtKCbNj ScaleGenAI ❤️ Open Source! As part of our drive to promote open-source genAI, we've released our version of Llama3-8B that supports function calling. This model is intended for use in environments where automated function-calling capabilities are required to enhance data manipulation and retrieval tasks. It is particularly useful in scenarios involving complex data analysis, where users can query data interactively through natural-language commands. ⚠️ Oh, we forgot! This can also take chain of thought as a parameter: chain of thought increases the chances of getting better responses, as each function call is equipped with reasoning by the LLM. #llama3 #FunctionCalling #genAI #ChainOfThought #DataAnalysis #NaturalLanguage
ScaleGenAI/Llama3-8B-Function-Calling · Hugging Face
huggingface.co
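On the consuming side, the application has to detect whether the model answered in plain text or emitted a tool call, then parse the call. A minimal sketch in pure Python; the `<functioncall>` marker and JSON payload shape are assumptions for illustration, as real models differ in their call markers.

```python
import json

def parse_function_call(model_output):
    """Extract a (name, arguments) pair from model text of the assumed
    form '<functioncall> {"name": ..., "arguments": {...}}'.

    Returns None for a plain-text answer with no tool use."""
    marker = "<functioncall>"
    if marker not in model_output:
        return None
    payload = model_output.split(marker, 1)[1].strip()
    call = json.loads(payload)
    return call["name"], call.get("arguments", {})

out = '<functioncall> {"name": "mean", "arguments": {"column": "revenue"}}'
print(parse_function_call(out))
```

In a full loop, the parsed call is executed, its result is appended to the conversation, and the model is queried again to produce the final natural-language answer.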
-
Now this is exciting! ScaleGenAI has made the Function-Calling Llama3 open source! It adds function-calling capabilities for complex workflows to the already powerful Llama3-8B model by Meta. Check it out: https://lnkd.in/gDUCtw-g #llama3 #llamaByMeta #functionCalling #openSource #huggingFace #generativeAI
-
Guess what's cooking at ScaleGenAI? 🔥 We're all about spreading the Open Source love, and we're thrilled to unveil our latest creation: an amped-up version of Llama3-8b with function-calling superpowers! Picture this: you're building your GenAI app, and suddenly you need some serious automation muscle. Boom! Our model swoops in like a superhero, making data manipulation a breeze with its slick function-calling features. Ready to join the adventure? Jump on board and let's explore the possibilities of AI in style! 🚀 #OpenSourceRevolution #AIAdventures #llama3 #functioncalling