AI4Bhārat’s Post

🚨🚨 𝗡𝗲𝘄 𝗜𝗻𝗱𝗶𝗰 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗟𝗟𝗠 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸! 🚨🚨

🎉 Excited to share our latest work: 𝗠𝗜𝗟𝗨, a Multi-task Indic Language Understanding Benchmark, built in collaboration between AI4Bhārat and IBM Research India as part of The AI Alliance 🌏🤝.

𝗞𝗲𝘆 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀:
• 85K multiple-choice questions in 11 Indian languages.
• Spanning 8 diverse domains and more than 40 subjects.
• Built with an India-centric approach, evaluating both general and cultural knowledge.

We also evaluated 40+ LLMs (proprietary, open-source, and Indic language-specific). Our key findings:
• GPT-4o achieved the highest accuracy at 72%.
• Open LLMs (like Llama3.1, Gemma, etc.) outperform Indic-language fine-tuned LLMs.
• Models struggle more with culturally relevant domains than with STEM.

We hope this benchmark helps drive the development of more culturally aware and linguistically competent AI systems for India's 1.4B+ people.

𝗣𝗮𝗽𝗲𝗿 📄: arxiv.org/abs/2411.02538
𝗖𝗼𝗱𝗲 💻: https://lnkd.in/gDhnRsf5
𝗗𝗮𝘁𝗮𝘀𝗲𝘁 🤗: https://lnkd.in/gjruEKjP

Work done by: Sshubam Verma; Mohammed Safi Ur Rahman Khan; Vishwajeet Kumar, PhD; Rudra Murthy; Jaydeep Sen

#AI4Bharat #IBMResearch #NLP #AI #IndianLanguages #Benchmark #LLMs #Evaluation #LLMevaluation #AIAlliance

MILU: A Multi-task Indic Language Understanding Benchmark

arxiv.org

Rushi Tulasi

Research @proshort | finetuning, RAGs, KG, Agentic.

1mo

\usepackage[backend=biber, style=numeric-comp, maxcitenames=1, maxbibnames=2]{biblatex}

Pursottam Sah

SWE-1 @UKG | Ex- Full Stack Developer @AI4Bharat | Frontend Developer @Uniflik Pvt Ltd | Ex - SDE @Indian Oil | GDSC Open Source Lead | NIT AP CSE'24 | TnP Volunteer

1mo

Always there is love and respect for AI4Bhārat 🤩 Congratulations Team

Krishnanjaneyulu Payala

Research Scholar- IIIT- Kottayam, Professional Member- ACM, ACM Anveshan Setu Fellow (2024- 25), ISRO (IIRS) Outreach Coordinator, Member (97999340) Software Defined Networks Community, IEEE

1mo

Excellent work

Rishav Dash

Data Science @ Xelpmoc Design and Tech Ltd || Prev AI @ Squareyards @iNeuron.ai || Kaggle Competition Expert || AI for Good @Omdena|| RAITian

1mo

Congratulations AI4bharat
