It's an open secret in GenAI: most projects never make it to production. 💀 At Adaptive ML, we help you escape 🪦 𝐩𝐫𝐨𝐨𝐟-𝐨𝐟-𝐜𝐨𝐧𝐜𝐞𝐩𝐭 𝐩𝐮𝐫𝐠𝐚𝐭𝐨𝐫𝐲 🪦 and deploy with confidence using our automated evals framework. First, compare proprietary APIs and open models with 𝐀𝐈 𝐣𝐮𝐝𝐠𝐞𝐬 personalized to your use case and built-in RAG metrics. Then, validate with LMSYS-style 𝐀/𝐁 𝐭𝐞𝐬𝐭𝐢𝐧𝐠 managed through the platform. Finally, monitor your deployment with granular observability. Track 𝐛𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐦𝐞𝐭𝐫𝐢𝐜𝐬 and inspect 𝐦𝐨𝐝𝐞𝐥 𝐢𝐧𝐭𝐞𝐫𝐚𝐜𝐭𝐢𝐨𝐧𝐬. Learn more about Adaptive Engine: https://lnkd.in/evYMnwJh
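One way to picture the "AI judges" step is a simple pairwise LLM-as-judge comparison. The sketch below is purely illustrative and is not Adaptive Engine's API; the OpenAI-compatible client, judge model name, and prompt are placeholder assumptions.

```python
# Hypothetical sketch of an LLM-as-judge pairwise eval (not Adaptive Engine's API).
# Assumes an OpenAI-compatible client; model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are evaluating two answers to the same customer question.
Question: {question}
Answer A: {answer_a}
Answer B: {answer_b}
Reply with exactly "A" or "B" for the more helpful, grounded answer."""

def judge_pair(question: str, answer_a: str, answer_b: str) -> str:
    """Ask a judge model which of two candidate answers is better."""
    response = client.chat.completions.create(
        model="gpt-4o",  # judge model; swap for any capable LLM
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, answer_a=answer_a, answer_b=answer_b)}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# Example: compare a proprietary API's answer against an open model's answer.
verdict = judge_pair(
    "How do I reset my password?",
    answer_a="Click 'Forgot password' on the login page and follow the email link.",
    answer_b="Contact support.",
)
print("Judge prefers:", verdict)
```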
About
Continuously evaluate and adapt models with synthetic data and production feedback to surpass frontier performance—from your cloud or ours.
- Website
- https://www.adaptive-ml.com
- Industry
- Technology, Information and Internet
- Company size
- 11-50 employees
- Headquarters
- Paris
- Type
- Civil company/Commercial company/Other types of companies
- Founded
- 2023
- Specialties
- Generative AI, Reinforcement Learning, Large Language Models, RLHF, RLAIF, Monitoring, A/B Testing, and Post-Training
Locations
- Paris, FR (primary)
- New York, US
Updates
-
According to the 2024 Gartner® Innovation Guide for Generative AI Technologies report, "GenAI also has a systemic impact on AI overall and unlocks the next phase of AI — namely, 𝐚𝐝𝐚𝐩𝐭𝐢𝐯𝐞 𝐀𝐈." 👇

"Adaptive AI systems allow for model behavior change post-deployment by learning behavioral patterns from past human and machine experience, and within runtime environments to adapt more quickly to changing, real-world circumstances."

We believe compound AI systems will be fine-tuned to every task, learning from a continuous stream of production feedback. In other words, 𝐀𝐈, 𝐭𝐮𝐧𝐞𝐝 𝐭𝐨 𝐩𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧. 😉

This is exactly what Adaptive Engine provides today. Learn more: https://lnkd.in/evYMnwJh

𝘎𝘢𝘳𝘵𝘯𝘦𝘳, 𝘐𝘯𝘯𝘰𝘷𝘢𝘵𝘪𝘰𝘯 𝘎𝘶𝘪𝘥𝘦 𝘧𝘰𝘳 𝘎𝘦𝘯𝘦𝘳𝘢𝘵𝘪𝘷𝘦 𝘈𝘐 𝘛𝘦𝘤𝘩𝘯𝘰𝘭𝘰𝘨𝘪𝘦𝘴, 14 𝘕𝘰𝘷𝘦𝘮𝘣𝘦𝘳 2024. 𝘎𝘈𝘙𝘛𝘕𝘌𝘙 𝘪𝘴 𝘢 𝘳𝘦𝘨𝘪𝘴𝘵𝘦𝘳𝘦𝘥 𝘵𝘳𝘢𝘥𝘦𝘮𝘢𝘳𝘬 𝘢𝘯𝘥 𝘴𝘦𝘳𝘷𝘪𝘤𝘦 𝘮𝘢𝘳𝘬 𝘰𝘧 𝘎𝘢𝘳𝘵𝘯𝘦𝘳, 𝘐𝘯𝘤. 𝘢𝘯𝘥/𝘰𝘳 𝘪𝘵𝘴 𝘢𝘧𝘧𝘪𝘭𝘪𝘢𝘵𝘦𝘴 𝘪𝘯 𝘵𝘩𝘦 𝘜.𝘚. 𝘢𝘯𝘥 𝘪𝘯𝘵𝘦𝘳𝘯𝘢𝘵𝘪𝘰𝘯𝘢𝘭𝘭𝘺 𝘢𝘯𝘥 𝘪𝘴 𝘶𝘴𝘦𝘥 𝘩𝘦𝘳𝘦𝘪𝘯 𝘸𝘪𝘵𝘩 𝘱𝘦𝘳𝘮𝘪𝘴𝘴𝘪𝘰𝘯. 𝘈𝘭𝘭 𝘳𝘪𝘨𝘩𝘵𝘴 𝘳𝘦𝘴𝘦𝘳𝘷𝘦𝘥.
-
GPT-4o likes loooooong answers. 🥱 When faced with two completions, it will choose the verbose option 85% 𝘰𝘧 𝘵𝘩𝘦 𝘵𝘪𝘮𝘦. Such biases make LLM evals hard, delaying production. We present a new way of 𝐞𝐥𝐢𝐦𝐢𝐧𝐚𝐭𝐢𝐧𝐠 𝐥𝐞𝐧𝐠𝐭𝐡 𝐛𝐢𝐚𝐬, enabling fairer LLM evals. To even the odds for introverted models, benchmarks like Alpaca Eval 2 use logistic regression to predict preferences while accounting for length differences, ensuring more balanced comparisons. Others resort to prompt engineering, adding a hail-mary ‘do not be biased by verbosity’ instruction to the prompt of the LLM judge. We propose a different approach: https://lnkd.in/ecmXbVb3
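As a rough illustration of the logistic-regression idea mentioned above (a simplified sketch, not AlpacaEval 2's actual implementation), one can model the judge's preference as a function of the length difference between completions, then read off the win probability with the length term zeroed out:

```python
# Simplified sketch of a length-controlled win rate (not AlpacaEval 2's exact method).
# judged_wins[i] = 1 if the judge preferred model A's completion over model B's.
import numpy as np
from sklearn.linear_model import LogisticRegression

def length_controlled_win_rate(judged_wins, len_a, len_b):
    """Estimate model A's win rate with the effect of length differences removed."""
    length_delta = (np.array(len_a) - np.array(len_b)).reshape(-1, 1)
    y = np.array(judged_wins)

    # Fit P(win) ~ sigmoid(bias + w * length_delta): the bias captures quality,
    # the weight captures how much the judge is swayed by verbosity.
    clf = LogisticRegression().fit(length_delta, y)

    # Zeroing the length feature asks: "what would the win rate be at equal lengths?"
    return clf.predict_proba(np.zeros((1, 1)))[0, 1]

# Toy usage with made-up data in which wins correlate with being longer.
wins = [1, 1, 0, 1, 0, 1, 1, 0]
lengths_a = [220, 310, 120, 280, 100, 260, 300, 90]
lengths_b = [150, 140, 200, 130, 210, 150, 160, 180]
print(f"Length-controlled win rate: {length_controlled_win_rate(wins, lengths_a, lengths_b):.2f}")
```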
-
𝐏𝐫𝐨𝐱𝐢𝐦𝐚𝐥 𝐩𝐨𝐥𝐢𝐜𝐲 𝐨𝐩𝐭𝐢𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧 (PPO) 🤜 vs 🤛 𝐝𝐢𝐫𝐞𝐜𝐭 𝐩𝐨𝐥𝐢𝐜𝐲 𝐨𝐩𝐭𝐢𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧 (DPO). Two of the foremost preference-tuning algorithms. But which should you use? We've done the research. 👇

We control for 𝐥𝐞𝐧𝐠𝐭𝐡 𝐛𝐢𝐚𝐬 (the tendency of LLM judges to prefer longer responses, regardless of quality) in a novel way. Instead of simply 𝘮𝘪𝘵𝘪𝘨𝘢𝘵𝘪𝘯𝘨 length bias, we address it directly by training our PPO model to follow specific length instructions using a penalty in its reward. This allows us to compare completions of approximately the same length, effectively 𝘦𝘭𝘪𝘮𝘪𝘯𝘢𝘵𝘪𝘯𝘨 length bias from the comparison.

Below, we present the results of comparing DPO and PPO across three common datasets used for both training and evaluation: Helpful & Harmless (𝐇𝐇), 𝐓𝐋;𝐃𝐑, and Ultra Feedback (𝐔𝐅). Across all datasets, 𝐏𝐏𝐎 𝐜𝐨𝐧𝐬𝐢𝐬𝐭𝐞𝐧𝐭𝐥𝐲 𝐰𝐢𝐧𝐬 𝐚𝐠𝐚𝐢𝐧𝐬𝐭 𝐃𝐏𝐎, while generating slightly shorter responses. Sometimes, as in the case of HH, PPO's win rate is a staggering 76%: 𝐚 𝐛𝐢𝐠𝐠𝐞𝐫 𝐠𝐚𝐩 𝐭𝐡𝐚𝐧 𝐭𝐡𝐞 𝐣𝐮𝐦𝐩 𝐟𝐫𝐨𝐦 𝐋𝐥𝐚𝐦𝐚 2 𝐭𝐨 3.

We then used the same approach to evaluate our DPO- and PPO-trained models on 𝐌𝐓-𝐛𝐞𝐧𝐜𝐡 (with both trained on UF and HH). Once again, PPO appears to be the better method.

See the full results in our most recent blog: https://lnkd.in/ecmXbVb3
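The length-penalty idea described above could look roughly like the following. This is a hypothetical sketch; the penalty form, coefficient, and function names are illustrative assumptions, not the exact setup used in the blog's experiments.

```python
# Hypothetical sketch of length-penalized reward shaping for PPO training.
# The reward model score is combined with a penalty for missing the target length
# requested in the prompt, so the policy learns to follow length instructions.

def shaped_reward(rm_score: float, completion_tokens: int,
                  target_tokens: int, alpha: float = 0.001) -> float:
    """Reward model score minus a penalty proportional to the deviation
    from the requested completion length."""
    length_penalty = alpha * abs(completion_tokens - target_tokens)
    return rm_score - length_penalty

# Example: two completions with the same RM score, one ignoring the
# "answer in about 100 tokens" instruction.
print(shaped_reward(rm_score=2.4, completion_tokens=105, target_tokens=100))  # barely penalized
print(shaped_reward(rm_score=2.4, completion_tokens=400, target_tokens=100))  # heavily penalized
```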
-
At Artificial Intelligence Marseille, Adaptive ML CTO Baptiste Pannier spoke with Youssef El Manssouri, CEO of Sesterce, about integrating GenAI into your existing data landscape. 🤖 🖥️ Too many AI initiatives get stuck at proof-of-concept; Adaptive ML enables organizations to tune models with production feedback, ensuring consistent performance as operations evolve. Our thanks to La Tribune and La Provence for sponsoring and hosting the conversation! 🎊 #AIMSummit
-
What is 𝐏𝐫𝐨𝐱𝐢𝐦𝐚𝐥 𝐏𝐨𝐥𝐢𝐜𝐲 𝐎𝐩𝐭𝐢𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧 (𝐏𝐏𝐎), and why does it seem so complicated? PPO requires four models. Importance sampling. Clipping. KL divergence. But, most importantly... ...why would I bother when there are other, simpler methods like 𝐬𝐮𝐩𝐞𝐫𝐯𝐢𝐬𝐞𝐝 𝐟𝐢𝐧𝐞-𝐭𝐮𝐧𝐢𝐧𝐠 (𝐒𝐅𝐓) for training my model? Our latest blog, 𝐅𝐫𝐨𝐦 𝐙𝐞𝐫𝐨 𝐭𝐨 𝐏𝐏𝐎: 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐭𝐡𝐞 𝐏𝐚𝐭𝐡 𝐭𝐨 𝐇𝐞𝐥𝐩𝐟𝐮𝐥 𝐀𝐈 𝐌𝐨𝐝𝐞𝐥𝐬, builds an intuitive understanding of PPO and how it differs from other tuning techniques: https://lnkd.in/eUSMbJYx
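For readers who want to see the moving parts named above in one place, here is a minimal, framework-agnostic sketch of PPO's clipped surrogate objective with a KL penalty. It is an illustrative simplification (tensor shapes, coefficients, and the KL approximation are assumptions), not Adaptive Engine's training code.

```python
# Minimal sketch of PPO's clipped surrogate loss with a KL penalty (illustrative only).
import torch

def ppo_loss(logprobs_new, logprobs_old, logprobs_ref, advantages,
             clip_eps=0.2, kl_coef=0.1):
    """All inputs are per-token log-probs / advantages for the sampled completions."""
    # Importance sampling ratio between the current policy and the policy that sampled the data.
    ratio = torch.exp(logprobs_new - logprobs_old)

    # Clipping keeps each update close to the sampling policy.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()

    # KL penalty (simple approximation) keeps the policy close to the frozen reference model.
    kl_penalty = (logprobs_new - logprobs_ref).mean()

    return policy_loss + kl_coef * kl_penalty

# Toy usage with random tensors standing in for real rollout statistics.
n = 8
lp_new, lp_old, lp_ref = (torch.randn(n) for _ in range(3))
adv = torch.randn(n)
print(ppo_loss(lp_new, lp_old, lp_ref, adv))
```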
-
Why isn't supervised fine-tuning (𝐒𝐅𝐓) good enough to train a model for my use case? 𝐒𝐅𝐓 𝐠𝐢𝐯𝐞𝐬 𝐚𝐧 𝐋𝐋𝐌 𝐚 𝐟𝐢𝐬𝐡; 𝐢𝐭 𝐝𝐨𝐞𝐬 𝐧𝐨𝐭 𝐭𝐞𝐚𝐜𝐡 𝐢𝐭 𝐭𝐨 𝐟𝐢𝐬𝐡. 🪝🐟

Wrestling helpful answers out of a pre-trained model isn't simple. With only a few words as context, it's hard for the model to judge how to continue appropriately. The adequate continuation might be a Shakespearean sonnet, a paragraph fit for your teenage blog, or a quick answer to a travel question. All of these, and more, are weighted equally in the pre-training corpus.

Let's fix that. Why not extend the simple next-word prediction of pre-training, but with data illustrative of the conversations we want the model to have? This is the basis of 𝐬𝐮𝐩𝐞𝐫𝐯𝐢𝐬𝐞𝐝 𝐟𝐢𝐧𝐞-𝐭𝐮𝐧𝐢𝐧𝐠 (𝐒𝐅𝐓). Training on this dataset will 𝐦𝐚𝐱𝐢𝐦𝐢𝐳𝐞 𝐭𝐡𝐞 𝐥𝐢𝐤𝐞𝐥𝐢𝐡𝐨𝐨𝐝 that, when faced with users' questions, the model will answer adequately. After SFT, the model will adhere to the examples of our gold dataset, delivering similarly helpful answers.

However, SFT has 𝐚 𝐩𝐚𝐭𝐡𝐨𝐥𝐨𝐠𝐢𝐜𝐚𝐥 𝐬𝐡𝐨𝐫𝐭𝐜𝐨𝐦𝐢𝐧𝐠 related to 𝐠𝐞𝐧𝐞𝐫𝐚𝐥𝐢𝐳𝐚𝐭𝐢𝐨𝐧. SFT hands the model specific demonstrations of the right answer ex nihilo: the model did not come up with them on its own. SFT gives the LLM a fish; it does not teach it to fish. Parroting gold answers can lead to poor generalization when the LLM is left to its own devices, 𝐫𝐞𝐬𝐮𝐥𝐭𝐢𝐧𝐠 𝐢𝐧 𝐡𝐚𝐥𝐥𝐮𝐜𝐢𝐧𝐚𝐭𝐢𝐨𝐧𝐬. Additionally, producing complete gold demonstrations may be costly and difficult to scale.

A more effective training process would be for the LLM to suggest completions and learn from the evaluation of those completions instead. See how in our blog 𝐅𝐫𝐨𝐦 𝐙𝐞𝐫𝐨 𝐭𝐨 𝐏𝐏𝐎: 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐭𝐡𝐞 𝐏𝐚𝐭𝐡 𝐭𝐨 𝐇𝐞𝐥𝐩𝐟𝐮𝐥 𝐀𝐈 𝐌𝐨𝐝𝐞𝐥𝐬: https://lnkd.in/eUSMbJYx
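A minimal sketch of the "maximize the likelihood of the gold answers" step described above, assuming a Hugging Face-style causal LM, tokenizer, and optimizer are already provided; names and details are illustrative, not the blog's training code.

```python
# Minimal sketch of supervised fine-tuning: next-token cross-entropy on a gold
# demonstration, with the loss computed only on the answer tokens.
import torch
import torch.nn.functional as F

def sft_step(model, tokenizer, prompt: str, gold_answer: str, optimizer):
    ids = tokenizer(prompt + gold_answer, return_tensors="pt").input_ids
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]

    logits = model(ids).logits
    # Shift so that position t predicts token t+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = ids[:, 1:].clone()
    # Mask out the prompt: we only maximize the likelihood of the answer tokens.
    shift_labels[:, : prompt_len - 1] = -100

    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```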
-
Pretrained LLMs are 𝐚𝐥𝐢𝐞𝐧𝐬 𝐨𝐟 𝐞𝐱𝐭𝐫𝐚𝐨𝐫𝐝𝐢𝐧𝐚𝐫𝐲 𝐢𝐧𝐭𝐞𝐥𝐥𝐢𝐠𝐞𝐧𝐜𝐞, 𝐲𝐞𝐭 𝐥𝐢𝐭𝐭𝐥𝐞 𝐮𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠. 👽 How do post-training techniques like 𝐒𝐅𝐓, 𝐑𝐄𝐈𝐍𝐅𝐎𝐑𝐂𝐄, and 𝐏𝐏𝐎 work in tandem to turn these aliens into helpful AI assistants?

Helpfulness is instilled in LLMs through extensive 𝐩𝐨𝐬𝐭-𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠. One approach in particular has been exceptionally successful: 𝐑𝐞𝐢𝐧𝐟𝐨𝐫𝐜𝐞𝐦𝐞𝐧𝐭 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 𝐅𝐫𝐨𝐦 𝐇𝐮𝐦𝐚𝐧 𝐅𝐞𝐞𝐝𝐛𝐚𝐜𝐤 (𝐑𝐋𝐇𝐅). 𝐑𝐋𝐇𝐅 enables models to learn directly from human preferences, capturing rich, nuanced feedback rather than relying solely on specific gold examples.

One of the de facto engines of RLHF has been 𝐏𝐫𝐨𝐱𝐢𝐦𝐚𝐥 𝐏𝐨𝐥𝐢𝐜𝐲 𝐎𝐩𝐭𝐢𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧 (𝐏𝐏𝐎). Taken at face value, PPO is puzzling; when applied to LLMs, it involves no fewer than four different versions of the model interacting together (𝐩𝐨𝐥𝐢𝐜𝐲, 𝐯𝐚𝐥𝐮𝐞, 𝐫𝐞𝐰𝐚𝐫𝐝, and 𝐫𝐞𝐟𝐞𝐫𝐞𝐧𝐜𝐞), and is driven by an intricate loss function.

In our blog, we build up to PPO, starting from 𝐬𝐮𝐩𝐞𝐫𝐯𝐢𝐬𝐞𝐝 𝐟𝐢𝐧𝐞-𝐭𝐮𝐧𝐢𝐧𝐠 (𝐒𝐅𝐓). We connect the dots between 𝐫𝐞𝐣𝐞𝐜𝐭𝐢𝐨𝐧 𝐬𝐚𝐦𝐩𝐥𝐢𝐧𝐠, 𝐫𝐞𝐰𝐚𝐫𝐝 𝐦𝐨𝐝𝐞𝐥𝐬, 𝐑𝐄𝐈𝐍𝐅𝐎𝐑𝐂𝐄, and 𝐀𝐝𝐯𝐚𝐧𝐭𝐚𝐠𝐞 𝐀𝐜𝐭𝐨𝐫-𝐂𝐫𝐢𝐭𝐢𝐜, building a deeper understanding of how to tune LLMs to deliver helpful, harmless, and honest answers.

Read 𝐅𝐫𝐨𝐦 𝐙𝐞𝐫𝐨 𝐭𝐨 𝐏𝐏𝐎: 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐭𝐡𝐞 𝐏𝐚𝐭𝐡 𝐭𝐨 𝐇𝐞𝐥𝐩𝐟𝐮𝐥 𝐀𝐈 𝐌𝐨𝐝𝐞𝐥𝐬: https://lnkd.in/eUSMbJYx
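As a taste of the progression the blog walks through, here is a bare-bones REINFORCE-style update on sampled completions scored by a reward model. It is a conceptual sketch: `policy.generate` and `reward_model.score` are placeholder interfaces assumed for illustration, not the blog's implementation.

```python
# Bare-bones REINFORCE sketch: sample completions, score them with a reward
# model, and push up the log-probability of high-reward samples.
import torch

def reinforce_step(policy, reward_model, prompts, optimizer, baseline=0.0):
    total_loss = torch.zeros(())
    for prompt in prompts:
        # Placeholder interface: `generate` is assumed to return the sampled
        # completion and the summed log-probability of its tokens (a torch tensor).
        completion, logprob = policy.generate(prompt)
        reward = reward_model.score(prompt, completion)  # placeholder reward model API

        # Subtracting a baseline reduces variance without biasing the gradient.
        total_loss = total_loss - (reward - baseline) * logprob

    loss = total_loss / len(prompts)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```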
-
📣 At #VDS2024, Adaptive ML CTO Baptiste Pannier joined Ahmed Menshawy of Mastercard, Margarida Garcia of poolside, and Neema Balolebwami Nelly of NEEMA AI for a discussion on the challenges and rewards of getting GenAI into production. 📣 The conversation, moderated by Ruben Colomer Flos of Next Tier Ventures, centered on how organizations can leverage LLMs to foster innovation and enhance operational efficiency while avoiding potential legal challenges. Baptiste highlighted the importance of keeping models in tune with operations, continuously learning from production feedback to ensure consistent performance. 🎉 Thanks to VDS for hosting and organizing a great event! 🎉
-
🥳 Please welcome the newest addition to Adaptive ML, Tugdual de Kerviler! Tug joins us as a 𝐅𝐫𝐨𝐧𝐭𝐞𝐧𝐝 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫, building intuitive interfaces for our product and ensuring a seamless user experience. 💻 A repeat founder (Diversified, acquired by Konvi, and Nirror, acquired by AB Tasty) and seasoned software engineer, Tug was most recently a 𝐌𝐚𝐜𝐡𝐢𝐧𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫 at SalesScan, where he designed and orchestrated LLM agents to automate observations on sales opportunities. 🦸 Want to join us too? We're hiring for three roles: a 𝐑𝐞𝐬𝐞𝐚𝐫𝐜𝐡 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫, a 𝐃𝐞𝐯𝐎𝐩𝐬 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫, and another 𝐅𝐫𝐨𝐧𝐭𝐞𝐧𝐝 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫. Apply now 👉 https://lnkd.in/e_PEEbif