Large Language Models (LLMs) are kinda dumb 🤔

That's quite a statement, I know. But I can explain how LLMs work by showing some of their strange behavior and why it happens.

Meet 'tokens', the building blocks of LLMs, similar to the letters of the Latin alphabet. While we have 26 letters, LLMs like GPT-4 use around 100,000 'letters' (tokens).

When someone asks you how many times the letter 'r' appears in "strawberry", you can:

↳ Dissect the word into its letters
↳ You get: s-t-r-a-w-b-e-r-r-y
↳ Count each occurrence of 'r'
↳ The outcome is 3

But if I asked you how many times the letter 'r' appears in the word '🍓', what would your answer be? How would you dissect '🍓'?

That's pretty much what you're asking an LLM to do when you ask it to count the characters in a word... Think about it. A small code sketch of the difference is below.

When you understand how LLMs work, operate, and think, with their 100,000-button keyboard, you can better grasp the tasks they can and can't do.

In the coming weeks, I'll share more examples to help you understand how LLMs work and how to improve your way of working with them.

Let's connect if you want to know more.
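Here's a minimal sketch of the contrast, using OpenAI's open-source tiktoken library and its cl100k_base encoding (my assumption for "GPT-4's ~100k-token vocabulary"; the post doesn't name a specific tokenizer). The point isn't the exact token split, it's that the model receives opaque token IDs, not letters:

```python
# pip install tiktoken  -- OpenAI's open-source tokenizer library
import tiktoken

word = "strawberry"

# Counting letters the way a human does: dissect the word, then count.
print("-".join(word))                    # s-t-r-a-w-b-e-r-r-y
print("count of 'r':", word.count("r"))  # 3

# What the LLM actually "sees": token IDs, not letters.
# cl100k_base is the ~100k-token encoding associated with GPT-4.
enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode(word)
print("token IDs:", token_ids)
print("token pieces:", [enc.decode([t]) for t in token_ids])

# The strawberry emoji is one 'word' to us, but the model still only
# gets token IDs -- there are no letters inside it to dissect and count.
print("token IDs for 🍓:", enc.encode("🍓"))
```

Run it and you'll see "strawberry" arrive as a handful of sub-word pieces rather than ten letters, which is exactly why letter-counting trips these models up.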
I was sad that LLMs cannot count letters... I felt it was my duty to help them.
I had the same conversation with a colleague. If the problem space fits a passive robot parrot 🦜, you're OK. Otherwise, don't believe your toaster can talk about economics.
Despite what's being parroted by so-called AI skeptics all around social media, language models are actually capable of counting any letter in any word with 100% accuracy: https://chatgpt.com/share/ebf16d0f-c588-4ec4-acd1-03aed9a95231