Even when accurate context is retrieved, can LLMs extract the preferred answer when layers of conditions or constraints are applied across conversation turns? 🤔

🌟 Thrilled to share my recent work, RAD-Bench: Evaluating Large Language Models' Capabilities in Retrieval Augmented Dialogues, developed during my internship at MediaTek Research 聯發創新基地!

With RAG, SAG, and tool use becoming prevalent, it is crucial to evaluate how models handle progressively accumulated constraints and external information in context-rich scenarios. Existing benchmarks either assess LLMs' chat abilities in multi-turn dialogues or their use of retrieval for augmented responses in single-turn settings. To bridge this gap, we propose RAD-Bench, which to the best of our knowledge is the first benchmark to assess LLMs' ability to follow user instructions in multi-turn dialogues while effectively utilizing retrieved context.

The benchmark focuses on two key abilities: Retrieval Synthesis and Retrieval Reasoning. We built a pipeline that uses LLMs to generate, select, and synthesize synthetic questions and retrieved contexts, producing 89 high-quality multi-turn samples (267 turns in total). We then evaluated popular LLMs, including GPT-4, Llama, Gemma, Mistral, DeepSeek, and Breeze, using LLM-as-a-Judge with tailored prompts.

Our findings show that as conditions accumulate, models struggle more to identify key information from the context. Comparison with Chatbot Arena reveals that RAD-Bench effectively distinguishes LLM performance in context-rich, augmented dialogues: models with similar performance in regular multi-turn conversations may differ in retrieval-augmented scenarios.

For detailed insights, check out our paper on arXiv: https://lnkd.in/gKT8ACze

Heartfelt thanks to my mentor Feng-Ting Liao for guidance and discussions, Mu-Wei Hsieh for additional experiments and result visualization, and Mark for paper revisions.
Your contributions were instrumental in bringing this work to fruition! #LLM #MultiTurn #RAG #Benchmark #Research #Internship
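For readers curious what per-turn judging with accumulating constraints might look like in practice, here is a minimal sketch. It is not the paper's actual pipeline: `JUDGE_PROMPT`, `score_dialogue`, and `judge_fn` are all hypothetical names, and `judge_fn` stands in for whatever LLM API call produces the judge's rating.

```python
# Hypothetical sketch of LLM-as-a-Judge scoring for one multi-turn,
# retrieval-augmented sample. Constraints accumulate across turns,
# mirroring how conditions stack up in the benchmark description.

JUDGE_PROMPT = """You are an impartial judge. Given the retrieved context,
the accumulated user constraints, and the assistant's answer, rate how well
the answer follows the instructions and uses the context (1-10).

Context: {context}
Constraints so far: {constraints}
Answer: {answer}

Rating:"""


def score_dialogue(turns, judge_fn):
    """Score each turn of a multi-turn sample.

    turns: list of dicts with 'context', 'constraint', and 'answer' keys.
    judge_fn: callable mapping a judge prompt to a rating string, e.g. "8".
    Returns the list of integer per-turn ratings.
    """
    ratings, constraints = [], []
    for turn in turns:
        # Each new turn adds a constraint on top of all earlier ones.
        constraints.append(turn["constraint"])
        prompt = JUDGE_PROMPT.format(
            context=turn["context"],
            constraints="; ".join(constraints),
            answer=turn["answer"],
        )
        ratings.append(int(judge_fn(prompt)))
    return ratings
```

Keeping the judge stateless and re-sending the full constraint list each turn makes it easy to see where a model first drops a condition, since each rating is conditioned on everything the user has asked for so far.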