𝕎𝕙𝕪 #𝔸𝕀 𝕗𝕠𝕝𝕜𝕤 𝕟𝕖𝕖𝕕 𝕒 𝕓𝕣𝕠𝕒𝕕 𝕓𝕒𝕤𝕖𝕕 𝕀𝕟𝕥𝕣𝕠 𝕥𝕠 #𝔸𝕀 👉

As I go around giving talks/tutorials on the planning and reasoning abilities of LLMs, I am constantly surprised at the rather narrow, ML-centric background that grad students/young researchers have in #AI. This seems to be especially the case with those who think LLMs are already doing planning, reasoning, etc. Most of them don't seem to know much about the many topics taught in a broad-based Intro to #AI course--such as combinatorial search, logic, CSPs, the difference between inductive and deductive reasoning (aka learning vs. inference), and soundness vs. completeness of inference/reasoning.

I can understand why a strong background in ML and DL is the sine qua non these days for using/applying current #AI technology. That doesn't, however, mean that the things typically covered in Intro #AI courses but not in ML courses are expendable. If you don't know those concepts, you are more likely than not to reinvent crooked wheels (see this for examples of how people get tripped up: https://lnkd.in/gUPPb7s4).

All this is particularly relevant for those busy building empirical scaffolds over LLMs (the "LLMs are Zero-shot <XXX>" variety). Most often, these young researchers come from NLP. At one point NLP used to be NLU, and students had quite a firm grasp of logic (e.g., Montague semantics!). But over the years NLU became NLP, which in turn has become applied machine learning, and students no longer get much background in logic and reasoning. Now that LLMs have basically "solved" the "processing" tasks--such as information extraction, format conversion, etc.--NLP folks are turning to reasoning tasks, but often lack the necessary background. (See this unsolicited advice to NLP students: https://lnkd.in/gKTdsH2P)

A background in the standard Intro AI topics like search/CSP/logic is useful even if you don't plan to use those techniques directly (e.g., because you want everything differentiable to make use of your SGD hammer). Like MDPs, they provide a normative basis for many of the deeper reasoning tasks AI systems will have to carry out as they broaden their scope beyond statistical learning. Without that background, you will likely try to pigeonhole everything into the "in/out of distribution" framework, when what you really need to think about is "in/out of deductive/knowledge closure" (see https://lnkd.in/gTWVibdt).

One of the other things you get exposed to in a standard Intro #AI course is the computational complexity of the various reasoning tasks. People who jumped in directly via applied ML may understand a bit of sample complexity (maybe?), but are not nearly as attuned to reasoning complexity. (Contd. in the comment below)
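To make the "in/out of deductive closure" point concrete, here is a minimal sketch (my own illustration, not from the post): forward chaining over a couple of Horn rules derives facts that were never explicitly given, so a query can lie outside the set of facts you have seen ("out of distribution") and still be squarely inside the knowledge closure. The atoms and rules below are made up for the example.

```python
def forward_chain(facts, rules):
    """Compute the deductive closure of `facts` under definite Horn `rules`.
    Each rule is a (body, head) pair, where body is a frozenset of atoms."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= closure and head not in closure:
                closure.add(head)
                changed = True
    return closure

facts = {"on(a,b)", "on(b,c)"}
rules = [
    (frozenset({"on(a,b)", "on(b,c)"}), "above(a,c)"),  # a transitivity instance
    (frozenset({"on(a,b)"}), "above(a,b)"),
    (frozenset({"on(b,c)"}), "above(b,c)"),
]

closure = forward_chain(facts, rules)
print("above(a,c)" in facts)    # False: never stated, so "out of distribution"
print("above(a,c)" in closure)  # True: entailed, so inside the deductive closure
```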
Symbolic AI is a tool, just as DL is, and has its own shortcomings. It's a useful tool to learn, no doubt, especially for practitioners. But I am a bit more hesitant to just assume that many people "simply don't know enough about symbolic AI." Every CS degree (with a specialization in AI) I've ever heard of features the course you're referring to. They likely do know a thing or two about it but want to steer away from it in general. Many times what gets people into AI is the desire to understand and model human intelligence. There is little to nothing that is biologically plausible about symbolic AI, including search, CSP, logic, etc. To be clear, DL and SGD are not exactly plausible either, but they are at least subsymbolic.
Though I agree with the need for a much broader understanding of the pillars of modern AI, as pointed out in your post, Prof. Subbarao Kambhampati, I disagree with the assertion that LLMs outright lack planning. The terseness of LI threads doesn't do justice to a complex topic, but I will try to provide a schematic:
- LLMs, through training, capture the human experience recorded in millennia worth of text
- the acquired knowledge is codified as parameters and embeddings--a specialized and effective world model (with its limitations, of course)
- that amounts to empirical planning for all kinds of generalized, human-related scenarios
- in fine-tuning, the LLM gets instructed to prioritize certain scenarios at the expense of others (i.e., embeddings and parameters get updated accordingly)
- at inference time, prompts get overlaid by the transformer on the previously created model, and actionable outcomes are generated
- CSP and other search-space methods are simply intractable in an otherwise combinatorially explosive search space; DL cleverly optimizes the search space
I have a brief reader's digest of how LLMs work at the end of this article: https://enterai.world/how-intelligent-ai-really-is/
Agree! I usually start my AI workshops with basic AI 101: what is AI, ML, DL, neural networks, gen AI, LLMs--and how do all these concepts relate to one another. We don't know what we don't know. I initially found it surprising how many people with deep ML expertise and experience would come up and thank me for the big-picture overview and tell me how much they learned and how helpful it is to understand the broader landscape. I worried I might come across as talking down to some of these brilliant folks I have the honor of speaking to. 😅
Not only has NLU devolved to NLP, but natural language generation used to mean “express some conceptual intent in a natural language.” Now it means “extend this text with more text.”
I would actually go further - I have no problem swallowing the bitter pill that arguably none of symbolic AI ever truly worked on real-world problems. What I worry more about is that we are losing the methodological foundation of computer science, which is to rigorously define a computational problem and think about what representations and algorithms will solve it best. That's what people are no longer interested in.
So, what are universities, CS departments, and academics doing about this? This knowledge gap in a whole generation of researchers is reflective of the hiring bias US departments have shown in the past decade.
Totally agree. Also, we can apply our own minds to learn and understand these (classical AI) topics, rather than throw data and compute at them and expect our hypotheses not to turn out false (as David Donoho says, deep learning is a magical mirror: you see what you want to see).
If all you know about is hammers…
(Contd. from the main post) While computational complexity has increasingly been sidelined these days, when we mostly ignore the costs of offline training (see this thread on the "death of computational complexity": https://x.com/rao2z/status/1500178336504442880), it will eventually rise up and bite you (especially if you are trying to make money without an #AI pyramid scheme...). For example, trying to get LLMs to fake reasoning via fine-tuning often ignores the amortization costs associated with memoization (as sent up in this GoFAI vs. LLaMAI satire: https://x.com/rao2z/status/1749104832450072953).

Understanding the conservation of computational complexity also makes you question/avoid unwarranted optimism about reasoning being solved by mere approximate retrieval from LLMs (even with pre-training on web-scale corpora, synthetic data, etc.), given that they take constant time per completion token! (c.f. https://x.com/rao2z/status/1766087877216371072)

(Finally, fwiw, here is the link to the Intro to #AI course I teach at ASU that brings all these things together: https://rakaposhi.eas.asu.edu/cse471/)
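To put rough numbers on the "constant time per completion token" point, here is a back-of-envelope sketch (my own illustration; the constants are placeholder assumptions, not measurements): an autoregressive decoder's cost grows only linearly with the length of the answer it emits, while the worst case of a sound-and-complete combinatorial search grows exponentially with problem size, so a fixed per-token budget cannot in general absorb worst-case reasoning cost.

```python
# Illustrative only: linear decoding cost vs. exponential worst-case search.
# `flops_per_token` and `tokens_per_problem` are made-up placeholder constants.

def decoding_cost(num_tokens: int, flops_per_token: float = 1.0) -> float:
    """Autoregressive decoding: roughly constant work per emitted token."""
    return num_tokens * flops_per_token

def worst_case_search_nodes(num_vars: int, domain_size: int = 2) -> int:
    """Worst-case nodes a sound-and-complete CSP/SAT search may visit."""
    return domain_size ** num_vars

tokens_per_problem = 20  # assume answer length scales mildly with problem size
for n in (10, 20, 40, 80):
    print(f"problem size {n:>3}: "
          f"decode cost ~{decoding_cost(tokens_per_problem * n):>7.0f}, "
          f"complete-search worst case ~{worst_case_search_nodes(n):.2e} nodes")
```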