🚨🚨 𝗡𝗲𝘄 𝗜𝗻𝗱𝗶𝗰 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗟𝗟𝗠 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸! 🚨🚨

🎉 Excited to share our latest work: 𝗠𝗜𝗟𝗨 - A Multi-task Indic Language Understanding Benchmark, a collaboration between AI4Bhārat and IBM Research India as part of The AI Alliance 🌏🤝.

𝗞𝗲𝘆 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀:
• 85K multiple-choice questions in 11 Indian languages.
• Spanning 8 diverse domains and more than 40 subjects.
• Built with an India-centric approach, evaluating both general and cultural knowledge.

We also evaluated 40+ different LLMs (proprietary, open-source, and Indic language-specific). Our key findings:
• GPT-4o achieved the highest accuracy at 72%.
• Open LLMs (like Llama 3.1, Gemma, etc.) outperform LLMs fine-tuned for Indic languages.
• Models struggle more with culturally relevant domains than with STEM.

We hope this benchmark helps drive the development of more culturally aware and linguistically competent AI systems for India's 1.4B+ people.

𝗣𝗮𝗽𝗲𝗿 📄: arxiv.org/abs/2411.02538
𝗖𝗼𝗱𝗲 💻: https://lnkd.in/gDhnRsf5
𝗗𝗮𝘁𝗮𝘀𝗲𝘁 🤗: https://lnkd.in/gjruEKjP

Work done by: Sshubam Verma, Mohammed Safi Ur Rahman Khan, Vishwajeet Kumar, PhD, Rudra Murthy, Jaydeep Sen

#AI4Bharat #IBMResearch #NLP #AI #IndianLanguages #Benchmark #LLMs #Evaluation #LLMevaluation #AIAlliance
Always there is love and respect for AI4Bhārat 🤩 Congratulations Team
Excellent work
Congratulations, AI4Bharat