We are at NeurIPS 2024 in Vancouver! Hit up Ari Heljakka or Oguzhan (Ouz) Gencoglu for a chat about LLM evaluations, LLM judges, and how we bring state-of-the-art AI research to production at scale.
About us
Root Signals helps developers create, optimize, and embed the LLM evaluators they need to continuously monitor the behavior of LLM automations in production. With the Root Signals End-to-End Evaluation Platform, development teams deliver reliable, measurable, and auditable LLM automations at scale.
- Website
-
https://rootsignals.ai
- Industry
- Software Development
- Company size
- 2-10 employees
- Headquarters
- Helsinki
- Type
- Privately Held
- Founded
- 2023
- Specialties
Locations
-
Primary
Helsinki, FI
-
Dover, US
Employees at Root Signals
-
Otso Kallinen
Co-founder & Head of Product Design @ Root Signals | Measure and Control Your GenAI
-
Juho Ylikylä
Software @ Root Signals
-
Oguzhan (Ouz) Gencoglu
Co-founder & Head of AI @ Root Signals | Measure and Control Your GenAI
-
Janne Alasaarela
Let's roll up our sleeves and get the job done.
Updates
-
🌎 Excited to be at AWS re:Invent in Las Vegas! Drop Ari Heljakka and Root Signals a line to chat about the latest in #EvalOps and #LLMevaluation.
-
🚨 Webinar Reminder: our webinar "Top 10 Misconceptions About LLM Judges in Production" is coming up soon. Join Ari Heljakka, Oguzhan (Ouz) Gencoglu, and Data Science Salon to explore: • Key misconceptions about #LLMjudges • Best practices for #EvalOps • How to build reliable, scalable evaluation systems Register now: https://lnkd.in/dZNMnWDh #LLMjudges #EvalOps
-
🌐 AWS re:Invent Guide for #GenAI Developers If you're an engineer working with LLMs, Amazon Web Services (AWS) re:Invent 2024 in Las Vegas has a lot to offer. We've put together a guide to help you get the most out of the event. Here's what we cover: 🆕 What's new in 2024 for developers building with LLMs 📋 How to prepare for the event to maximize your time 🔎 A quick guide to understanding session abbreviations and tracks 🎯 Our curated list of sessions, from keynotes to hands-on bootcamps, that are especially valuable for GenAI engineers Whether you're looking to learn about the latest tools, dive deep into technical sessions, or connect with the GenAI developer community, this guide has everything you need to navigate #AWSreInvent effectively. Check out the guide here: https://lnkd.in/dzrQ6fFz
-
#LLM judges in production can transform AI evaluation, but they come with challenges around reliability, explainability, cost unpredictability, and maintainability when not implemented properly. 👨💻 Join Ari Heljakka, Oguzhan (Ouz) Gencoglu, and Data Science Salon to explore: • Key misconceptions about #LLMjudges • Best practices for #EvalOps • How to build reliable, scalable evaluation systems 📆 Don't miss this webinar! Learn more at: https://lnkd.in/dEusmY-5
-
🚀 Root Signals is at Slush in Helsinki, Nov 20-21! Meet Ari Heljakka and Oguzhan (Ouz) Gencoglu to explore the future of #EvalOps and the best practices for #LLMevaluation. See you at #Slush2024?!
-
It was a pleasure to host 40+ LLM experts and developers at our offices for the LLM Developer event organized by Tuomas Lounamaa, Symposium AI & Root Signals. The evening featured insightful talks from Aapo Tanskanen, Rasmus Toivanen, Markus S. and a demo from our Head of AI, Oguzhan (Ouz) Gencoglu, showcasing our Control, Evaluation & Observability Platform for GenAI applications. A heartfelt thank you to everyone who attended and made this gathering so engaging. We’re excited to continue building and learning with this incredible community. Stay tuned for more events focused on advancing LLM development!
-
Root Signals reposted this
#EvalsTuesdays Week 5 - Confirmation Bias in LLM-Judges

LLM-as-a-Judge is the gift that keeps on giving (both joys and headaches). This week, we're tackling yet another bias that sneaks into our LLM evaluations: Confirmation Bias.

Confirmation Bias: the tendency of LLM-Judges to favor responses that confirm their existing beliefs or the information presented in the prompt, while ignoring evidence to the contrary. In simpler terms, they might be agreeing with themselves a bit too much.

Why does this matter?
🔍 It can lead to skewed evaluations, where certain types of responses are consistently overvalued or undervalued.
🧠 LLM-Judges may overlook errors or hallucinations if the response "sounds right" based on prior context.
🌐 This bias can be especially problematic in domains requiring critical analysis or when evaluating for factual accuracy.

So, what's causing this? LLMs are trained on vast amounts of data, and they're great at picking up patterns. However, they also tend to reinforce patterns they've seen before. When an LLM-Judge evaluates a response, it might be more inclined to agree with content that aligns with those patterns, even if it's not the most accurate or helpful.

How do we fight back?
✅ Diversify your prompts: introduce variability in your evaluation prompts to prevent the model from getting too cozy with any one perspective.
✅ Encourage critical thinking: incorporate instructions that nudge the LLM-Judge to consider alternative viewpoints or to critically assess the response.
✅ Meta-evaluation: regularly test your LLM-Judges with known examples where the correct evaluation is counterintuitive, ensuring they're not just coasting on confirmation bias (see the sketch after this post).

At Root Signals, we obsess over these nuances so you don't have to. Our LLM-Judges are fine-tuned to spot not just the obvious issues but the subtle ones that can slip through the cracks. And most importantly, we provide a systematic and easy way to meta-evaluate, i.e. measure and tune your LLM-Judges.

Remember, in the world of LLMs, vigilance is key. Don't let your judges get complacent: challenge them, test them, and keep them sharp.

What's next? Maybe we'll dive into the rabbit hole of Chain-of-Thought prompting in LLM-Judges, maybe something else. Stay tuned!
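To make the meta-evaluation idea concrete, here is a minimal Python sketch, not Root Signals' actual implementation: it assumes you already have some judge() callable that returns a score for a (prompt, response) pair, and it measures how often that judge agrees with human-verified verdicts on a small golden set that deliberately includes counterintuitive cases.

```python
# Minimal meta-evaluation sketch (illustrative only, not Root Signals' implementation).
# Assumes you already have a judge() callable that scores a (prompt, response) pair.

from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenCase:
    prompt: str          # input shown to the application LLM
    response: str        # candidate answer being judged
    expected_pass: bool  # human-verified verdict, including counterintuitive cases


def meta_evaluate(judge: Callable[[str, str], float],
                  cases: list[GoldenCase],
                  threshold: float = 0.5) -> float:
    """Return the judge's agreement rate with human-verified verdicts."""
    hits = 0
    for case in cases:
        verdict = judge(case.prompt, case.response) >= threshold
        hits += int(verdict == case.expected_pass)
    return hits / len(cases)


# Example golden case: a fluent, "sounds right" answer that is factually wrong
# should FAIL; a confirmation-biased judge is likely to pass it anyway.
golden_set = [
    GoldenCase(
        prompt="When did the Berlin Wall fall?",
        response="The Berlin Wall fell in 1991, marking the end of the Cold War.",
        expected_pass=False,
    ),
]
```

Running meta_evaluate against a set like this on a schedule is one way to notice when a judge starts "coasting" on plausible-sounding answers.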
-
Root Signals reposted this
#EvalsTuesdays Week 4 - Verbosity Bias in LLM-Judges

Creating reliable evaluation metrics for #LLMs by using LLMs, i.e. LLM-as-a-Judge, is more than simply writing an evaluation prompt and calling an API. One reason is that LLM-Judges are full of biases, and verbosity bias is one of them.

Verbosity bias: LLM judges favor longer responses, even if they are not as clear, high-quality, or accurate as shorter alternatives. They are like lazy teachers who give high grades to essays simply because they are long.

Here is a quick example from Google's #Gemini ⬇. The first answer is not only more to-the-point but also more precise. It is simply more helpful. Yet, Gemini scores the rambling management-consultant answer higher.

When it comes to measuring your LLM applications, the devil is in the details. If your metrics are not reliable in the first place, what's the point? Verbosity bias affects all sorts of use cases where LLM-Judges are utilized. Judge scores need to be calibrated and normalized with respect to the length of the text being evaluated (a simple sketch of this idea follows below). But this normalization is not universal either; it is model dependent.

We worry about all these things at Root Signals so that our users don't need to. I would love to hear how you evaluate your GenAI applications.
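As an illustration of the length-calibration idea mentioned above, here is a rough Python sketch, not Root Signals' actual method: it fits a simple linear trend of raw judge score versus response length on a small calibration set, then subtracts the length-predicted component from new scores. The numbers are made up, and the fitted slope is specific to the judge model, which is why the calibration has to be redone per judge.

```python
# One simple way to length-calibrate judge scores (an illustrative sketch,
# not Root Signals' actual method): estimate how much of the raw score is
# explained by response length alone on a calibration set, then remove it.

import numpy as np


def fit_length_bias(lengths: list[int], raw_scores: list[float]) -> tuple[float, float]:
    """Fit score ~= slope * length + intercept on calibration data.

    The slope captures how strongly this particular judge model rewards
    length; it must be re-fit per judge model, since the bias is model-dependent.
    """
    slope, intercept = np.polyfit(lengths, raw_scores, deg=1)
    return slope, intercept


def calibrated_score(raw_score: float, length: int, slope: float) -> float:
    """Remove the length-predicted component from a raw judge score."""
    return raw_score - slope * length


# Usage: scores from the same judge on answers of varying length (made-up numbers).
lengths = [40, 120, 300, 650]     # response lengths in tokens (calibration set)
raw = [0.62, 0.70, 0.81, 0.90]    # raw judge scores for those responses
slope, _ = fit_length_bias(lengths, raw)
print(calibrated_score(0.90, 650, slope))  # discounts the long, rambling answer
```

A linear fit is the crudest possible correction; the point is only that the adjustment depends on measured, model-specific behavior rather than a universal constant.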
-
Root Signals reposted this
Join us at #DSSSF for a session with Ari Heljakka, Founder and CEO of Root Signals, who will talk about EvalOps: Mastering the Game of LLM Judges. The session will dive into the innovative use of #LLMs as "judges" to oversee and refine the outputs of AI pipelines. With a focus on maintaining alignment with human norms and organizational policies, Ari's talk will explore the complexities and challenges of implementing these judge models effectively. This talk is essential for professionals involved in AI development and management who want to enhance the reliability and accountability of AI systems. In-person only on Nov 7 at Google HQ in SF. [Link in comments] #EvalOps #machinelearning #genAI