Shivendra Upadhyay’s Post
Spatial reasoning did not ‘emerge’ spontaneously in Large Language Models (LLMs) the way so many other reasoning capabilities did. Humans have specialized, highly capable spatial reasoning abilities that LLMs have not replicated. Yet every subsequent release of the major models (GPT, Claude, Gemini) promises better multimodal support, and all of them will accept uploaded graphics alongside text and try to use them. #llms #chatgpt4 #gemini #llmops #aiml #ml https://lnkd.in/dFSWKqKs
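As a hedged illustration of that "uploaded graphics alongside text" workflow, here is a minimal sketch of sending an image plus a spatial question to a vision-capable chat model through the OpenAI Python SDK. The model name, the file name, and the question are illustrative assumptions, not details from the post.

```python
# Minimal sketch: asking a multimodal chat model a spatial question about an image.
# Assumptions: a vision-capable model name ("gpt-4o"), a local file "floor_plan.png",
# and an OPENAI_API_KEY in the environment; none of these come from the post itself.
import base64
from openai import OpenAI

client = OpenAI()

with open("floor_plan.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which of the two marked boxes is closer to the door, and why?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)

print(response.choices[0].message.content)
```

Whether the answer reflects genuine spatial reasoning or pattern matching over the image caption space is exactly the open question the post raises.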
More Relevant Posts
-
Very interesting paper on what these LLMs actually are: "Can Large Language Models put 2 and 2 together?" As has been said many times before, the conclusion of the research is illuminating: "We conclude that non neuro-symbolic LLMs are in effect big statistical search engines. Although the supported data distributions are ever more rich with each GPT release, giving an impression of innate reasoning capabilities, we emphasize that these models are not bona fide reasoners." https://lnkd.in/e7_wuGTs
arXiv:2404.19432
arxiv.org
-
📃Scientific paper: Large Language Models Are Zero-Shot Time Series Forecasters
Abstract: By encoding time series as a string of numerical digits, we can frame time series forecasting as next-token prediction in text. Developing this approach, we find that large language models (LLMs) such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks. To facilitate this performance, we propose procedures for effectively tokenizing time series data and converting discrete distributions over tokens into highly flexible densities over continuous values. We argue the success of LLMs for time series stems from their ability to naturally represent multimodal distributions, in conjunction with biases for simplicity, and repetition, which align with the salient features in many time series, such as repeated seasonal trends. We also show how LLMs can naturally handle missing data without imputation through non-numerical text, accommodate textual side information, and answer questions to help explain predictions. While we find that increasing model size generally improves performance on time series, we show GPT-4 can perform worse than GPT-3 because of how it tokenizes numbers, and poor uncertainty calibration, which is likely the result of alignment interventions such as RLHF.
Comment: NeurIPS 2023. Code available at: https://lnkd.in/eR8EABaY
Continued on ES/IODE ➡️ https://etcse.fr/1pC
If you find this interesting, feel free to follow, comment and share. We need your help to enhance our visibility, so that our platform continues to serve you.
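To make the encoding idea concrete, here is a minimal sketch in the spirit of the abstract: values are rescaled, rendered at fixed precision, digits are space-separated and timesteps comma-separated so a tokenizer sees individual digits. The scaling constant and formatting details are illustrative assumptions, not the paper's exact implementation.

```python
# Toy sketch of framing time series forecasting as next-token prediction:
# encode a numeric series as digit text an LLM can continue, then decode the
# continuation back to numbers. Constants and formatting are illustrative.

def encode_series(values, precision=2, scale=None):
    """Encode a numeric series as a digit string an LLM can continue."""
    scale = scale or max(abs(v) for v in values) or 1.0
    tokens = []
    for v in values:
        # Fixed-precision rendering of the rescaled value, digits space-separated.
        s = f"{abs(v) / scale:.{precision}f}".replace(".", "")
        sign = "-" if v < 0 else ""
        tokens.append(sign + " ".join(s))          # e.g. 0.35 -> "0 3 5"
    return " , ".join(tokens), scale

def decode_series(text, scale, precision=2):
    """Invert encode_series on the model's text continuation."""
    out = []
    for tok in text.split(","):
        digits = tok.replace(" ", "")
        sign = -1.0 if digits.startswith("-") else 1.0
        out.append(sign * int(digits.lstrip("-")) / (10 ** precision) * scale)
    return out

history = [0.8, 1.1, 1.5, 1.4, 1.9, 2.3]
prompt, scale = encode_series(history)
# The prompt text would be sent to an LLM as-is; the model continues the digit
# sequence, and decode_series(continuation, scale) maps the text back to numbers.
print(prompt)
```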
Large Language Models Are Zero-Shot Time Series Forecasters
ethicseido.com
-
Jonathan Johnson-Swagel and all my Chain-of-Thought friends out there...
👉 You've heard of Chain-of-Thought, but have you heard of Whiteboard-of-Thought?
🔑 Whiteboard-of-Thought enables multimodal language models to use images as intermediate steps in thinking, improving performance on tasks that require visual and spatial reasoning.
💠 Researchers from Columbia University have developed a new technique that allows multimodal large language models (MLLMs) like OpenAI's GPT-4o to use visual intermediate steps while thinking.
💠 While Chain-of-Thought prompts language models to write out intermediate reasoning steps, Whiteboard-of-Thought provides MLLMs with a metaphorical "whiteboard" where they can record the results of intermediate thinking steps as images!
https://lnkd.in/ek_BFdim
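For flavor, here is a minimal sketch of the whiteboard loop as the article describes it: the model emits drawing code, the code is executed to render an image, and the image is fed back for a final answer. The `query_mllm` helper, the prompts, and the use of matplotlib are hypothetical placeholders, not the authors' implementation.

```python
# Sketch of a Whiteboard-of-Thought-style loop (illustrative only).
# query_mllm() is a hypothetical helper standing in for any multimodal chat API
# that accepts text plus an optional image and returns the model's text reply.
import subprocess
import tempfile
from pathlib import Path

def query_mllm(prompt: str, image_path: str | None = None) -> str:
    """Placeholder for a real multimodal model call (assumption, not a real API)."""
    raise NotImplementedError

def whiteboard_of_thought(question: str) -> str:
    # Step 1: ask the model to draw its intermediate reasoning as matplotlib code.
    drawing_code = query_mllm(
        "Write self-contained Python (matplotlib) that draws a diagram of the "
        "layout described below and saves it to out.png:\n" + question
    )

    # Step 2: execute the generated code in a scratch directory to render out.png.
    # (In practice, model-generated code should be run in a sandbox.)
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "draw.py"
        script.write_text(drawing_code)
        subprocess.run(["python", str(script)], cwd=tmp, check=True)
        image = Path(tmp) / "out.png"

        # Step 3: feed the rendered "whiteboard" back to the model for the answer.
        return query_mllm("Using this diagram, answer: " + question,
                          image_path=str(image))
```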
Whiteboard of Thought: New method allows GPT-4o to reason with images
the-decoder.com
-
How to mitigate hallucinations in Large Language Models (#LLMs)?
1. Knowledge Base Integration: Incorporate domain-specific knowledge bases to provide accurate, contextual information.
2. Chunk Optimization: Break down large documents into smaller, manageable chunks for better context retention.
3. Prompt Engineering: Craft precise prompts to guide the LLM towards accurate responses.
4. API Parameter Tuning: Adjust parameters like temperature and top_p to control output randomness.
5. Domain-Specific Classification: Implement classifiers to categorize queries and route them to appropriate models or knowledge bases.
6. Retrieval-Augmented Generation (RAG): Use RAG techniques to enhance responses with relevant, factual information (a minimal retrieval sketch follows below).
7. Ensemble Approaches: Combine multiple models or techniques to improve overall accuracy and reliability.
8. Continuous Monitoring: Implement robust monitoring systems to detect and address hallucinations in real time.
9. Feedback Loops: Establish mechanisms for user feedback to continuously improve the model’s performance.
10. Evaluation Frameworks: Utilize tools like #RAGAS and #TruLens to assess and improve the quality of LLM outputs.
https://lnkd.in/gX54GwVC
#AIOptimization #LLMTuning #NLPTech #MachineLearning #AIInnovation #LanguageModels #TextGeneration #DeepLearning #ArtificialIntelligence
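As a hedged illustration of points 2, 3, 4, and 6 above, here is a minimal retrieval-augmented prompting sketch. The `embed` function is a hypothetical stand-in for any embedding model, and the chunk size, top-k, and temperature values are illustrative assumptions rather than recommendations from the linked article.

```python
# Minimal RAG sketch: chunk documents, retrieve the chunks most similar to the
# query, and build a grounded prompt to be sent with a conservative temperature.
# embed() is a hypothetical placeholder for a real embedding model.
import math

def embed(text: str) -> list[float]:
    """Placeholder: return an embedding vector for `text` (assumed external model)."""
    raise NotImplementedError

def chunk(document: str, size: int = 500) -> list[str]:
    # Point 2: split large documents into manageable, fixed-size chunks.
    return [document[i:i + size] for i in range(0, len(document), size)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def build_rag_prompt(query: str, documents: list[str], top_k: int = 3) -> str:
    # Point 6: retrieve the chunks most relevant to the query.
    chunks = [c for d in documents for c in chunk(d)]
    q_vec = embed(query)
    scored = sorted(chunks, key=lambda c: cosine(embed(c), q_vec), reverse=True)
    context = "\n---\n".join(scored[:top_k])
    # Point 3: a precise prompt that instructs the model to stay within the context.
    return (
        "Answer using only the context below. If the answer is not in the "
        f"context, say you don't know.\n\nContext:\n{context}\n\nQuestion: {query}"
    )

# Point 4: when calling the model with this prompt, keep randomness low,
# e.g. temperature=0.2 and top_p=0.9 (illustrative values, not prescriptions).
```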
Mitigating Hallucinations in Foundation Language Models: A Structured Approach for Hallucination- Free Query Responses in Regulatory Domains
https://adasci.org
-
AI researchers examined the latest large language models and, in "Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models", report that a simple common-sense question causes a complete reasoning breakdown, with GPT-4 and Claude 3 only sometimes providing the right answer https://lnkd.in/enw-nDkR. The benchmark question is of the form "Alice has N brothers and M sisters; how many sisters does Alice's brother have?" #airesearch #largelanguagemodel #gpt4 #claude3 #generativeai #artificialintelligence
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
arxiv.org
-
CompactifAI: Large Language Models Don’t Have to Be Large
📌 Large language models (LLMs) have been steadily growing in size and complexity over the years. Despite the performance improvements this has brought, it is an unsustainable path given the energy and resources required to train ever bigger models. With CompactifAI, LLMs can perform equally well without their huge size. Article link: https://lnkd.in/gvp67vg6
🔹 LLMs nowadays have billions and even trillions of parameters to support their impressive natural language generation abilities, but this is unsustainable. Existing compression techniques like quantisation and knowledge distillation are brute-force and cannot provide controlled compression, causing unpredictable drops in performance.
🔹 CompactifAI compresses the weights within LLMs using a novel approach based on tensor networks, specifically Matrix Product Operators (MPOs). Using MPOs, weight matrices can be decomposed into smaller ones, and the compression can be controlled explicitly via the bond dimension (see the sketch below).
🔹 Evaluations of CompactifAI on the Llama-2 7B model showed that the compressed versions required less training and inference time. Most excitingly, with 2.1 billion parameters rather than the original 7 billion, the compressed models suffered only a 2-3% drop in accuracy across five benchmarks.
🔹 These results suggest that existing LLMs are heavily overparameterised and do not need to be as large as they are to perform well. With other methods like Sparse Llama also being released, it is evident that the world is moving toward smaller LLMs that are no less powerful than their larger counterparts.
📑 Tomut, A., Jahromi, S. S., Sarkar, A., Kurt, U., Singh, S., Ishtiaq, F., Muñoz, C., Bajaj, P. S., Elborady, A., del Bimbo, G., Alizadeh, M., Montero, D., Martin-Ramiro, P., Ibrahim, M., Alaoui, O. T., Malcolm, J., Mugel, S., & Orus, R. (2024). CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks (Version 2). arXiv. DOI: 10.48550/ARXIV.2401.14109
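To make the bond-dimension idea concrete, here is a toy sketch of a two-core, MPO-style decomposition of a single weight matrix via a truncated SVD. The shapes, the bond dimension, and the use of plain NumPy are illustrative assumptions; this is not the CompactifAI implementation.

```python
# Toy illustration of MPO-style weight compression: reshape a weight matrix into
# a 4-index tensor, split it with an SVD, and truncate to a chosen bond dimension.
# Shapes and the bond dimension are illustrative, not taken from the paper.
import numpy as np

def mpo_compress(W, m1, m2, n1, n2, bond_dim):
    """Split W of shape (m1*m2, n1*n2) into two cores linked by `bond_dim`."""
    # Group row/column indices pairwise: (i1 i2, j1 j2) -> (i1 j1, i2 j2).
    T = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, S, Vt = np.linalg.svd(T, full_matrices=False)
    # Truncating the singular values is what controls compression explicitly.
    core1 = (U[:, :bond_dim] * S[:bond_dim]).reshape(m1, n1, bond_dim)
    core2 = Vt[:bond_dim, :].reshape(bond_dim, m2, n2)
    return core1, core2

def mpo_reconstruct(core1, core2):
    m1, n1, _ = core1.shape
    _, m2, n2 = core2.shape
    # Contract over the bond index and restore the original index order.
    T = np.einsum("xya,abz->xbyz", core1, core2)   # indices (i1, i2, j1, j2)
    return T.reshape(m1 * m2, n1 * n2)

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))                  # stand-in weight matrix
c1, c2 = mpo_compress(W, 8, 8, 8, 8, bond_dim=16)
W_hat = mpo_reconstruct(c1, c2)
# A random matrix compresses poorly; the paper's point is that real LLM weight
# matrices carry enough structure that aggressive truncation costs little accuracy.
print(f"params: {W.size} -> {c1.size + c2.size}, "
      f"rel. error {np.linalg.norm(W - W_hat) / np.linalg.norm(W):.3f}")
```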
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks
arxiv.org
-
LLMs getting better at logic! 🧠 Introducing SymbCoT, a framework combining symbolic logic & natural language processing for enhanced reasoning. #AI #NLProc #SymbCoT https://lnkd.in/dHnn2eg4
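As a rough illustration of the symbolic chain-of-thought idea (translate the question into a formal representation, derive step by step, then verify), here is a hedged prompt-pipeline sketch. The stage wording and the `call_llm` helper are assumptions for illustration, not the SymbCoT authors' exact prompts or module design.

```python
# Sketch of a symbolic chain-of-thought pipeline in the spirit of SymbCoT:
# translate natural language into first-order-logic style facts and rules,
# reason over the symbolic form, then verify. Prompts and call_llm() are
# illustrative placeholders, not the authors' modules.

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion call (assumed external API)."""
    raise NotImplementedError

def symbolic_cot(question: str, context: str) -> str:
    # Stage 1: translate natural language into a symbolic representation.
    symbols = call_llm(
        "Translate the following context and question into first-order logic "
        f"facts, rules, and a goal.\n\nContext: {context}\nQuestion: {question}"
    )
    # Stage 2: reason step by step over the symbolic form.
    derivation = call_llm(
        "Derive the goal from the facts and rules step by step, citing the rule "
        f"used at each step.\n\n{symbols}"
    )
    # Stage 3: verify the derivation before committing to a final answer.
    return call_llm(
        "Check each step of this derivation against the facts and rules; if it is "
        f"sound, state the final answer, otherwise correct it.\n\n{symbols}\n\n{derivation}"
    )
```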
Elevating the Mind of Machines with a Symbolic Chain-of-Thought
medium.com
-
The Rise of Large Language Models: What's the Fuss About? Artificial Intelligence has surged in recent years, and at the heart of it are large language models (LLMs). #AIcomparison #AIinbusiness #AImodels #AIrealworldapplications #GPT4 #GPT4usecases #largelanguagemodels #MachineLearning #Reflection70B #Reflection70Bperformance
Reflection 70B Vs GPT-4: Who Wins In Real-World Use Cases?
aicompetence.org