Sahil Sinha’s Post


Co-Founder @ lytix.ai [YC W24]

Human-in-the-loop evals 🧑‍💻👀 → custom, automated evaluation models 🤖📈? https://lnkd.in/gruPtC8d

Human in the loop (HITL) is a really popular technique for teams just getting started with LLMs and their LLMOps functions. When your biggest concern is your model acting 'not like a human', it's easy to see how putting a human in the loop to review and edit any potential mistakes or weirdness solves that problem. But it's equally easy to see how hard this is to scale, and how it can eat into a lot of the cost savings promised by automation.

So how should teams balance the security and certainty of human-in-the-loop with an eye towards scaling? Over the weekend I got the chance to experiment with a few approaches to turning HITL datasets into their own custom evaluation models. I also share how I'd balance my human-agent evaluations alongside my model-driven evaluations, to set myself up for long-term success. https://lnkd.in/gruPtC8d
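The details are in the linked post, but to illustrate the general idea (a minimal sketch, not lytix's actual method): once humans have been approving or rejecting outputs for a while, that review log is itself a labeled dataset, and you can train a lightweight classifier on it to act as a custom evaluator. The data fields, model choice, and confidence threshold below are all hypothetical.

```python
# Sketch: turn human-in-the-loop review labels into a custom eval model.
# Hypothetical data and threshold; not lytix's implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each record: an LLM output a human reviewed, and their verdict (1 = approved).
hitl_log = [
    {"output": "Your refund was processed on May 2.", "approved": 1},
    {"output": "As an AI, I cannot help with that request!!", "approved": 0},
    # ... in practice, hundreds or thousands of reviewed examples
]

texts = [r["output"] for r in hitl_log]
labels = [r["approved"] for r in hitl_log]

# A simple text classifier standing in for the "custom evaluation model".
evaluator = make_pipeline(TfidfVectorizer(), LogisticRegression())
evaluator.fit(texts, labels)

def route(output: str, threshold: float = 0.9) -> str:
    """Auto-approve confident passes; keep a human in the loop otherwise."""
    p_pass = evaluator.predict_proba([output])[0][1]
    return "auto-approve" if p_pass >= threshold else "send to human review"
```

The routing function is the balance the post describes: the model only takes work off the humans' plate where it's confident, so human review capacity concentrates on the uncertain cases instead of every output.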


Tausif Khan

Software Engineer @Stealth.design

4w

Great insights! Finding the sweet spot between human oversight and automated precision is key to effective LLMOps.

