Fine-tuning GPT-2 on the indic-gpt dataset. I ran one epoch of fine-tuning on an arbitrary Indic-gpt dataset and trained the model for language detection on Colab. Have a look: https://lnkd.in/dZJUY-UV #IndicGPT #FineTuning #LanguageDetection
Shyamkant Kulkarni’s Post
-
🚀 Excited to share my recent work on fine-tuning the LLaMA-2 model! 🚀 After a lot of hard work and dedication, I successfully fine-tuned the LLaMA-2 model using the Glaive Function Calling v2 dataset, achieving an impressive training loss of 0.218200. 🎉 This fine-tuned model is now available on Hugging Face: https://lnkd.in/gGzCrR2i A big shoutout to everyone who supported me throughout this journey! 🙌 Looking forward to seeing the amazing applications and innovations that will come from this work. #AI #MachineLearning #NLP #HuggingFace #LLMs #ArtificialIntelligence #FunctionCalling #DeepLearning
Danjin/Llama-2-7b-chat-finetunev2 at main
huggingface.co
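For anyone curious what a function-calling training example in this style looks like, here is a minimal sketch. The field names and the `<functioncall>` marker are assumptions loosely modeled on the Glaive-style format, not the exact schema of the dataset above; check the dataset card for the real template.

```python
import json

# Hypothetical tool schema (the function name and fields are illustrative).
function_spec = {
    "name": "get_stock_price",
    "description": "Get the current price of a stock ticker",
    "parameters": {
        "type": "object",
        "properties": {"ticker": {"type": "string"}},
        "required": ["ticker"],
    },
}

def build_training_text(spec, user_msg, call_args):
    """Assemble one supervised example: a system prompt carrying the tool
    schema, the user turn, and the assistant's emitted function call."""
    system = "SYSTEM: You have access to this function:\n" + json.dumps(spec)
    user = f"USER: {user_msg}"
    assistant = "ASSISTANT: <functioncall> " + json.dumps(
        {"name": spec["name"], "arguments": call_args}
    )
    return "\n".join([system, user, assistant])

example = build_training_text(
    function_spec, "What is AAPL trading at?", {"ticker": "AAPL"}
)
print(example)
```

The point of the format is that the target text the model learns to produce is itself machine-parseable JSON, so a runtime can execute the call and feed the result back.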
-
🌟 New Research Alert! Check out this groundbreaking blog post on MambaLRP: Explaining Selective State Space Sequence Models. The post delves into the importance of transparency in sequence modeling and introduces MambaLRP, a novel algorithm within the LRP framework. The research promises a more stable and reliable relevance propagation across diverse models and datasets. Don't miss out on this cutting-edge development in language modeling and beyond. Read the full post here: https://bit.ly/4chKhVI #SequenceModeling #MambaLRP #ResearchUpdate
-
Thrilled to bring you Research Log #044! https://lnkd.in/gWWnbB6e A big highlight is from our Multimodal Action Models team, who are developing a prompt-engineering framework to adapt SoTA VLMs like GPT-4o to action trajectories. More details and project updates inside the log! Another incredible week for the ML research community, some shoutouts!
- Sihang Li et al. for SciLitLLM, an LLM designed to extract key information from scientific publications.
- Shijia Yang et al. for their work The Law of Vision Representation in MLLMs!
More inside! If you liked this post, join the conversation! https://lnkd.in/gyT9Kx2W
Research Log #044
manifoldrg.com
-
#AI is not cheap: it needs money to make, and it brings money once it's made. #OpenSource is a gift, but that does not mean integrating #AI comes at zero cost. #LLM #RealityCheck
This model will be further fine-tuned on:
1. maths
2. SQL and custom datasets
#DM for any other dataset you would like to see.
Base model: #Mistral Instruct v2. Dataset: #Orca pairs. Fine-tuning: DPO.
https://lnkd.in/gaSjVifd
tvkkishore/inspire-Mistral-7B-v2-DPO-Math at main
huggingface.co
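For readers unfamiliar with DPO, the core of it is a single loss over preference pairs: push the policy's log-probability margin between the chosen and rejected response above the reference model's margin. A minimal sketch in pure Python (the log-probability values fed in below are made up for illustration):

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the trained policy and the frozen reference model."""
    logits = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    # -log(sigmoid(logits)): small when the policy widens the margin.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Policy prefers the chosen answer more than the reference does: loss < log(2).
print(dpo_loss(-12.0, -20.0, -14.0, -18.0))
# No preference shift at all: loss is exactly log(2) ≈ 0.693.
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
```

`beta` controls how strongly the policy is allowed to drift from the reference; trainers typically expose it as a hyperparameter.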
-
🚀 Exciting Update! 🚀 I’ve just released a quantized version of the Llama3-ChatQA-2-8B model, specifically optimized for Retrieval-Augmented Generation (RAG) tasks for large language models! If you’re looking for improved performance, reduced memory usage, and faster response times without compromising accuracy, this is for you! 🔗 Check it out here: https://lnkd.in/e96hbns2 🔧 Designed for those diving into Q&A tasks and enhancing LLM applications, this is a game-changer for efficient deployment. Get your hands on it and let me know your thoughts! 💡 #AI #LLM #RAG #DeepLearning #MachineLearning #Llama3ChatQA #QuantizedModel #OpenSource
bobofrut/Llama3-ChatQA-2-8B-GGUF · Hugging Face
huggingface.co
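The memory win from quantization is easy to ballpark: weight memory is roughly parameters × bits per weight. A quick sketch (the bits-per-weight figures for the GGUF quant types are approximations, and this ignores KV cache, activations, and per-block quantization overhead):

```python
def approx_model_size_gb(n_params_billions, bits_per_weight):
    """Rough weight-memory estimate: parameters × bits / 8, in GB."""
    return n_params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common formats (assumed values).
for label, bits in [("FP16", 16), ("Q8_0", 8.5), ("Q4_K_M", 4.8)]:
    print(f"8B @ {label}: ~{approx_model_size_gb(8, bits):.1f} GB")
```

That is why an 8B model that needs ~16 GB in FP16 can fit comfortably on consumer GPUs once quantized to 4–5 bits.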
-
🌟 Excited to announce my latest contribution to the AI community! 🌟 I've fine-tuned a LLaMA-3 model specifically for Finance tasks, accessible here: "https://lnkd.in/gTb5KFcY". Trained on high-quality Q&A financial data, this model is primed for industry-specific NLP tasks like regulatory analysis, market predictions, and financial Q&A. 📊💼
Key Data Insights
The dataset focuses on financial questions and contextual answers, making it perfect for models needing precise, industry-related knowledge. Dataset link -> "https://lnkd.in/gY9hgUsk"
Get Started
Here's how to load and query the model:
---------------------------------------------------------------------------------------
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_name = "Uddeshya/ud_llama"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
---------------------------------------------------------------------------------------
I hope this helps drive innovation in finance-related NLP. Excited to see how this evolves! 🚀 #MachineLearning #FinanceAI #HuggingFace #NLP #LLM #AIinFinance #LLaMA
Uddeshya/ud_llama · Hugging Face
huggingface.co
-
Published an Instruction-Tuned version of the Starcoder2-3B model to Hugging Face: https://lnkd.in/gnVd_UYd
Why?
* Officially, only the Starcoder2-15B version is available.
* There are a few instruction-tuned versions of the 3B one, but they either lack documentation & training hyperparameters or have not been trained on the same dataset as the Starcoder2-15B one.
* It's trained on the same dataset as Starcoder2-15B.
* It can be easily run locally on 4 GB, 6 GB, 8 GB, and 10 GB VRAM machines with appropriate quantization.
* It can be integrated with Ollama to serve as a local coding assistant.
Some information:
* Trained for 4 epochs
* GPU: 40 GB A100
* Time: ~4 hours
* Training type: full fine-tuning
All hyperparameters, the training command, and a link to the official StarCoder Self-Align repository are given in the model card. #LLM #NLP #DeepLearning
sovitrath/starcoder2-3b-instruct · Hugging Face
huggingface.co
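Instruct-tuned code models only behave well when queried with the template they were trained on. Here is a minimal sketch of the "### Instruction / ### Response" style commonly used by StarCoder2 instruct tunes; the exact template string is an assumption for illustration, so check the model card above for the real one.

```python
def build_prompt(instruction):
    """Wrap a user request in an assumed instruction-tuning template.

    Generation would then be stopped when the model starts a new
    '### Instruction' section."""
    return (
        "### Instruction\n"
        f"{instruction.strip()}\n\n"
        "### Response\n"
    )

print(build_prompt("Write a Python function that reverses a string."))
```

Sending raw text without the template to an instruct tune often yields completions instead of answers, which is the most common integration mistake with local coding assistants.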
-
📣 Function-Calling Llama3 by ScaleGenAI is here! 📣 Check it out here: https://lnkd.in/gBtKCbNj ScaleGenAI ❤️ Open Source! As part of our drive to promote open-source genAI, we've released our version of Llama3-8B that supports function calling. This model is intended for use in environments where automated function-calling capabilities are required to enhance data manipulation and retrieval tasks. It is particularly useful in scenarios involving complex data analysis, where users can query data interactively through natural-language commands. ⚠️ Oh, we forgot! This can also take chain of thought as a parameter: chain of thought increases the chances of getting better responses, as each function call is equipped with reasoning by the LLM. #llama3 #FunctionCalling #genAI #ChainOfThought #DataAnalysis #NaturalLanguage
ScaleGenAI/Llama3-8B-Function-Calling · Hugging Face
huggingface.co
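On the consuming side, the application has to detect whether the model answered in plain text or emitted a tool call, then parse the call. A minimal sketch in pure Python; the `<functioncall>` marker and JSON payload shape are assumptions for illustration, as real models differ in their call markers.

```python
import json

def parse_function_call(model_output):
    """Extract a (name, arguments) pair from model text of the assumed
    form '<functioncall> {"name": ..., "arguments": {...}}'.

    Returns None for a plain-text answer with no tool use."""
    marker = "<functioncall>"
    if marker not in model_output:
        return None
    payload = model_output.split(marker, 1)[1].strip()
    call = json.loads(payload)
    return call["name"], call.get("arguments", {})

out = '<functioncall> {"name": "mean", "arguments": {"column": "revenue"}}'
print(parse_function_call(out))
```

In a full loop, the parsed call is executed, its result is appended to the conversation, and the model is queried again to produce the final natural-language answer.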
-
Now this is exciting! ScaleGenAI has made the Function-Calling Llama3 open source! It adds function-calling capabilities for complex workflows to the already powerful Llama3-8B model by Meta. Check it out: https://lnkd.in/gDUCtw-g #llama3 #llamaByMeta #functionCalling #openSource #huggingFace #generativeAI
-
Guess what's cooking at ScaleGenAI? 🔥 We're all about spreading the Open Source love, and we're thrilled to unveil our latest creation: an amped-up version of Llama3-8b with function-calling superpowers! Picture this: you're building your GenAI app, and suddenly you need some serious automation muscle. Boom! Our model swoops in like a superhero, making data manipulation a breeze with its slick function-calling features. Ready to join the adventure? Jump on board and let's explore the possibilities of AI in style! 🚀 #OpenSourceRevolution #AIAdventures #llama3 #functioncalling