Large Language Models (LLMs) are kinda dumb 🤔

That's quite a statement, I know. But I can explain how LLMs work by showing some of their strange behavior and why it happens.

Meet 'tokens', the building blocks of LLMs, similar to the letters of the Latin alphabet. While we have 26 letters, LLMs like GPT-4 use around 100,000 'letters' (tokens).

When someone asks you how many times the letter 'r' appears in "strawberry", you can:

↳ Dissect the word into its letters
↳ You get: s-t-r-a-w-b-e-r-r-y
↳ Count each occurrence of 'r'
↳ The outcome is 3

But if I asked you how many times the letter 'r' appears in the word '🍓', what would your answer be? How would you dissect '🍓'?

That's pretty much what you're asking an LLM to do when you ask it to count the characters in a word... Think about it. A small code sketch of the difference is below.

When you understand how LLMs work, operate, and think, with their 100,000-button keyboard, you can better grasp the tasks they can and can't do.

In the coming weeks, I'll share more examples to help you understand how LLMs work and how to improve your way of working with them.

Let's connect if you want to know more.
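Here's a minimal sketch of the contrast, using OpenAI's open-source tiktoken library and its cl100k_base encoding (my assumption for "GPT-4's ~100k-token vocabulary"; the post doesn't name a specific tokenizer). The point isn't the exact token split, it's that the model receives opaque token IDs, not letters:

```python
# pip install tiktoken  -- OpenAI's open-source tokenizer library
import tiktoken

word = "strawberry"

# Counting letters the way a human does: dissect the word, then count.
print("-".join(word))                    # s-t-r-a-w-b-e-r-r-y
print("count of 'r':", word.count("r"))  # 3

# What the LLM actually "sees": token IDs, not letters.
# cl100k_base is the ~100k-token encoding associated with GPT-4.
enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode(word)
print("token IDs:", token_ids)
print("token pieces:", [enc.decode([t]) for t in token_ids])

# The strawberry emoji is one 'word' to us, but the model still only
# gets token IDs -- there are no letters inside it to dissect and count.
print("token IDs for 🍓:", enc.encode("🍓"))
```

Run it and you'll see "strawberry" arrive as a handful of sub-word pieces rather than ten letters, which is exactly why letter-counting trips these models up.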
I was sad that LLMs cannot count letters... I felt it was my duty to help them.
I had the same conversation with a colleague. If the problem space fits a passive robot parrot 🦜, you're OK. Otherwise, don't believe your toaster can talk about economics.
Despite what's being parroted by so-called AI skeptics all around social media, language models are actually capable of counting any letter in any word with 100% accuracy: https://chatgpt.com/share/ebf16d0f-c588-4ec4-acd1-03aed9a95231