LLMs tokenize language strings into subword chunks for maximum efficiency rather than reading text character by character, and this shapes how they interpret your prompts. This is important to understand when you are writing prompts. The example with Gemini below shows what I mean. Keep this in mind if you are struggling to detect hallucinations in your work.
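A minimal sketch of subword tokenization in action. Note the assumption: `cl100k_base` is an OpenAI tokenizer, not Gemini's, used here only because it is publicly inspectable; the point about subword chunks applies to LLM tokenizers generally.

```python
import tiktoken  # pip install tiktoken

# cl100k_base is an OpenAI encoding, standing in for any LLM tokenizer here.
enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("strawberry")
print(tokens)                             # a short list of token IDs
print([enc.decode([t]) for t in tokens])  # the subword pieces the model actually sees
```

Because the model sees these chunks rather than letters, questions that depend on character-level detail can trip it up unless the prompt spells the task out explicitly.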
-
Ploughing through some elearnings while on public transport and in between meetings. This one is a real quick intro to the subject. I've been experimenting a lot with Gemini 1.0 Advanced, Gemini 1.5 Pro (Preview), and the experimental Gemini large language models, and I like how the material sets up a deeper understanding of the topic. 👍👍, would recommend.
-
#llm #nuggets
❓ Which inference configuration parameter can be adjusted to increase or decrease randomness in the model's output layer?
A. Max new tokens
B. Top-k sampling
C. Temperature

Correct answer: C

ℹ️ Explanation: During text generation, large language models (LLMs) rely on a softmax layer to assign probabilities to potential next tokens. Temperature is the key parameter controlling the randomness of this probability distribution.
- Lower temperature: the softmax assigns a significantly higher probability to the single most likely token given the current context, making output more deterministic.
- Higher temperature: the distribution is "softened," making less likely tokens more competitive and the output more random.

💁‍♂️ Why the other options are incorrect:
- (A) Max new tokens simply caps the number of tokens the LLM can generate in a single sequence; it does not affect randomness.
- (B) Top-k sampling restricts the candidate set to the k most probable tokens for the next prediction; it truncates the distribution rather than reshaping its randomness.

credits: https://lnkd.in/gpz6nUUH
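A minimal sketch of how temperature reshapes a softmax distribution, using NumPy only; the logits are made-up toy values, not from any real model.

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Sample a token index from temperature-scaled softmax probabilities."""
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature  # divide logits by T
    scaled -= scaled.max()                                  # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()           # softmax
    return rng.choice(len(probs), p=probs), probs

logits = [2.0, 1.0, 0.2]  # toy logits for three candidate tokens
for t in (0.2, 1.0, 2.0):
    _, probs = sample_with_temperature(logits, temperature=t)
    print(f"T={t}: {probs.round(3)}")  # low T sharpens the peak, high T flattens it
```

At T=0.2 nearly all probability mass lands on the top token; at T=2.0 the three options become far more competitive, which is exactly the randomness the quiz question is about.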
-
Training LLMs to reason in continuous latent space

LLMs typically solve complex problems by reasoning in "language space," using chains of thought (CoT). But language-based reasoning isn't always ideal: many tokens exist to keep sentences coherent rather than to advance the actual reasoning.

Important keywords before reading:
- Reasoning / language space
- Chain of Thought (CoT)
- Chain of Continuous Thought (Coconut)
- Latent space reasoning
- Backtracking
- Breadth-First Search (BFS)
- Hidden state embedding

Introducing Coconut: Chain of Continuous Thought (Coconut) shifts reasoning from language space to a continuous latent space. Instead of producing word tokens between reasoning steps, the model feeds its reasoning states back in as embeddings within the LLM's hidden layers. This allows Coconut to explore multiple reasoning paths simultaneously, in a manner resembling breadth-first search (BFS), rather than committing to a single solution path prematurely.

Key highlights:
1. Coconut reduces unnecessary "thinking tokens" while reasoning.
2. It improves logical reasoning tasks, especially those needing backtracking.
3. It demonstrates new, advanced reasoning patterns in LLMs.

This approach reveals a new paradigm for reasoning that could redefine how LLMs solve complex problems.

#AIReasoning #Coconut #LatentSpace #ChainOfContinuousThought #AIInnovation
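A minimal sketch of the core mechanism, assuming a Hugging Face causal LM whose hidden size matches its embedding size (true for GPT-2, used here as a stand-in). The number of latent steps is an arbitrary choice of mine, and a real Coconut model is trained for this regime rather than used off the shelf.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Question: 2 + 3 * 4 = ?"
ids = tok(prompt, return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids)

n_latent_steps = 4  # arbitrary number of "continuous thoughts" (assumption)
with torch.no_grad():
    for _ in range(n_latent_steps):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        # Reuse the final-layer hidden state of the last position as the next
        # input embedding -- a latent "thought" instead of a decoded token.
        thought = out.hidden_states[-1][:, -1:, :]
        embeds = torch.cat([embeds, thought], dim=1)
    # Decode normally once the latent reasoning steps are done.
    logits = model(inputs_embeds=embeds).logits
print(tok.decode(logits[:, -1, :].argmax(dim=-1)))
```

The key design choice is that nothing forces the intermediate "thoughts" through the vocabulary: the hidden state can encode a superposition of candidate paths, which is what enables the BFS-like exploration described above.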
-
Can LLM preference tuning degrade multi-agent system performance? In a recent tweet, Andrej Karpathy questioned why all LLMs sound similar, prompting Sebastian Raschka to speculate that alignment datasets have become uniform. Aligning with user preferences is a last-mile step in training large language models. Regardless of the pre-training datasets used, could the preference datasets used during alignment compromise the effectiveness of LLMs? If so, the issue could adversely affect multi-agent systems that incorporate multiple LLMs in their design.
-
🌟 Excited to share a new blog post discussing a simple yet powerful method to enhance the structured text generation capabilities of large language models. The post introduces G&O, an efficient two-step pipeline that improves named entity recognition (NER) and relation extraction (RE). The method separates content generation from the structuring step, yielding significant performance improvements with minimal additional effort. If you're interested in advancing structured language model output, check out the full post here: https://bit.ly/3T7MBHW
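A hypothetical sketch of what a two-step generate-then-organize pipeline can look like. `call_llm` and both prompts are placeholders of mine, not the actual G&O implementation from the post.

```python
def call_llm(prompt: str) -> str:
    # Stub standing in for any chat-completion client (hypothetical).
    return f"[model output for prompt: {prompt[:40]}...]"

def extract_entities(text: str) -> str:
    # Step 1 (generate): elicit the content in free-form text, no format rules.
    raw = call_llm(f"List the named entities in the following text:\n{text}")
    # Step 2 (organize): structure the free-form answer in a separate call.
    return call_llm(f"Reformat this list as JSON with 'entity' and 'type' fields:\n{raw}")

print(extract_entities("Ada Lovelace worked with Charles Babbage in London."))
```

Keeping the two steps separate means the model never has to reason about content and output format at the same time, which is the intuition behind the reported gains.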
-
Next stage of ML learning passed: "Fundamentals of Text Analysis with the Language Service" completed and badge earned. It was not an easy path, but worth the time in a still-changing marketplace.
-
I just uploaded a new video series on language model development to my YouTube channel. This time I briefly explain BERT, the pioneer of transformer-based language models, and why it is still relevant today for specific discriminative tasks, where it can reduce operational costs compared to using a modern LLM 🤖 Check out the video at the link below 👇 https://lnkd.in/gpPTwR-d #largelanguagemodels #machinelearning #artificialintelligence
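To illustrate the discriminative use case: a small fine-tuned encoder can serve classification requests far more cheaply than a generative LLM. The checkpoint below is a public sentiment model chosen by me as a stand-in, not something from the video.

```python
from transformers import pipeline  # pip install transformers

# A compact BERT-family encoder handles classification with no generation step.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("This tutorial was genuinely helpful."))
# e.g. [{'label': 'POSITIVE', 'score': ...}]
```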
-
The diagram further illustrates how RAG* works:
- A user inputs a query.
- The system retrieves relevant information from external sources (such as databases or documents).
- This information is then fed into the LLM.
- The LLM generates a response that is both relevant and context-aware, grounded in the retrieved information.

* RAG stands for Retrieval-Augmented Generation, an approach that enhances the performance of Large Language Models (LLMs) by incorporating external information sources.
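A minimal, self-contained sketch of that flow. The keyword "retriever" here is a toy stand-in for a real vector database, and `generate()` is a stub for an LLM call; all names are illustrative assumptions, not a specific framework's API.

```python
DOCUMENTS = [
    "RAG retrieves external documents before generation.",
    "Temperature controls sampling randomness.",
    "BERT is an encoder-only transformer.",
]

def retrieve(query, k=2):
    # Toy retrieval: rank documents by word overlap with the query.
    q = set(query.lower().split())
    return sorted(DOCUMENTS, key=lambda d: -len(q & set(d.lower().split())))[:k]

def generate(prompt):
    # Stub so the sketch runs end to end; a real system calls an LLM here.
    return "[LLM answer grounded in the retrieved context]"

def answer(query):
    context = "\n".join(retrieve(query))   # retrieve relevant information
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)                # context-aware generation

print(answer("What does RAG retrieve before generation?"))
```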
ChatGPT 4o cheekily implies that Gemini isn't as good at understanding questions, or it would be able to do this correctly. Gemini may be able to process a million tokens now, but you have to prompt it much more specifically to get a good result.