G. Bailey Stockdale’s Post

Ag LLM benchmark update:
1. V2: added 300 new questions (thanks to Noah Freeman!).
2. Reran models on new benchmark.
3. Categorized results.

Farmer's Business Network, Inc.'s Norm has come way up in the rankings! Looks like they quietly updated. Full V2 benchmark here: https://lnkd.in/er8KqXxf

Pratik Desai, PhD

Founder, KissanAI | Computer Scientist | Farmer

5mo

I like this initiative and the approach to evaluation. I'm creating a Q&A set specifically for Indian agronomy. Our last model was fine-tuned on India-specific data, so it would make sense to evaluate on that. Would you be okay if we started contributing to your repo? I also have a Climate Resilient Agriculture (CRA) eval set from the UN project. Again, it is India-specific, but it won't be hard to create another set for general CRA questions.

Max von Olfers

Agriculture & Technology 👨🏻🌾 🌱 | Web & LLM 🤖 🌻 | 🤩 Spirits (e)commerce 🥃 | 🇫🇷 🇩🇪 Hamburg & Bordeaux

6mo

Let’s see how Sonnet 3.5 performs here! Great work, G. Bailey Stockdale and Noah Freeman!

Good stuff. Looking forward to taking a look and testing these new 300 questions over here.
