FruitPunch AI’s Post

OCR* has been solved for Western/Latin script for over 20 years, yet it is still too inaccurate to use on handwritten Devanagari script. This keeps a huge digital divide in place, denying Indian farmers benefits that western farmers already enjoy, such as yield prediction and loan calculation. This is a huge problem!

So we’re launching a new Challenge with Heifer International:
🇮🇳 AI for Indian Farmers - Handwritten Devanagari ✍️

🎯 The goal: Develop Optical Character Recognition for handwritten Devanagari script to close the digital divide for Indian farmers.

💡 So why is this old Machine Learning problem still unsolved? While the Latin script has 26 letters and a total of ~70 different characters including numbers and punctuation, Devanagari has thousands of combinations, because consonants, diacritics and modifiers join into compound forms. As the cherry on the cake, the letters within a word are joined together as well.

Heifer Labs & Heifer India have provided us with big datasets from the handwritten financial records of farmers to start solving this problem.

We’re looking for experts to help solve this Challenge. Apply & more info here: https://lnkd.in/exEpF9q4

❣️ Big plus if you can read Devanagari ❣️

Technical objectives:
🖼️ Image Preprocessing: Improve the quality of the images so that the OCR results become more standardised and accurate. Identify which techniques improve OCR quality.
#️⃣ OCR for Devanagari Script: Develop an OCR solution to extract both handwritten and typeset text from loan documents. The OCR system must handle the complexities of the Devanagari script effectively.
✅ Post-processing / Quality Control: Assess the quality of the OCR outputs, make a best guess on any uninterpretable characters, and determine whether words are spelled correctly.

Topics that will be covered: #ComputerVision (e.g. OpenCV, PyTorch, TensorFlow), #DeepLearning Methodologies, #CNN, #GAN (Generative Adversarial Networks), #RNN (Recurrent Neural Networks), #FeatureExtraction Techniques and #NLP

*#OCR: Optical Character Recognition, a computer vision technique that turns images of letters and numbers into their digital equivalents

Vess Antoinette Buster Kishore Tanishka Juhee Reena Deepa

#devanagari #LLMs #digitaldivide #agritech
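To make the point about combinations concrete, here is a minimal Python sketch (not part of the Challenge materials) of how Devanagari code points group into written syllable clusters. The function name `split_aksharas` and the grouping rule are illustrative assumptions: a real system would follow the Unicode grapheme cluster rules and handle cases such as joiner characters that this simplification ignores.

```python
import unicodedata

VIRAMA = "\u094d"  # Devanagari sign virama: fuses a consonant with the next one


def split_aksharas(word):
    """Group code points into rough syllable clusters (simplified assumption)."""
    clusters = []
    for ch in word:
        # Vowel signs, the virama and other modifiers are combining marks.
        is_mark = unicodedata.category(ch) in ("Mn", "Mc")
        # A consonant that follows a virama joins the previous cluster (conjunct).
        joins_previous = bool(clusters) and clusters[-1].endswith(VIRAMA)
        if clusters and (is_mark or joins_previous):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "kisan" (farmer): ka + vowel sign i, sa + vowel sign aa, na
kisan = "\u0915\u093f\u0938\u093e\u0928"
# "kshetra" (field): ka + virama + ssa + vowel sign e, ta + virama + ra
kshetra = "\u0915\u094d\u0937\u0947\u0924\u094d\u0930"

for word in (kisan, kshetra):
    clusters = split_aksharas(word)
    print(word, len(word), "code points ->", len(clusters), "clusters:", clusters)
```

Under these assumptions the second word, seven code points long, collapses into just two visual clusters, each a compound shape that a recognizer has to read as a whole rather than letter by letter.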

Buster Franken

CEO of FruitPunch AI | Building the global AI for Good community to solve humanity's greatest challenges! | AI, community building, education

2mo

The fact that OCR, or the translation of handwritten script to digital documents, is still too inaccurate to use for Devanagari script, while it has long been solved for Latin script, is widening the gap between rich western countries and developing nations. We must give Indian farmers the same advantages that western entrepreneurs have had for the last 30 years!
