New article on the Polars blog! “Breaking the rules with expression expansion” delves into how `.struct.unnest` seems to break one of Polars' most fundamental principles but doesn't: a single expression must always produce a single column as a result. This short article explains how expression expansion makes this possible without breaking the principles that govern Polars. Read it here 👉 https://lnkd.in/dKQmCUMX
-
Day 227 of the #365GenAIChallenge! 📅 Don't HyDE from today's definition! 👀 HyDE (Hypothetical Document Embedding) is a retrieval method that uses a language model to generate a hypothetical answer to the query, embeds that answer, and retrieves by answer-to-answer similarity, which improves search results for vague or complex queries. However, this approach can falter when the model lacks knowledge of the subject, leading to less accurate matches.
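A minimal sketch of the idea, assuming a generic `llm` completion callable and a sentence-transformers embedder (the model name, prompt and `hyde_retrieve` helper are illustrative, not part of any official HyDE implementation):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def hyde_retrieve(query, corpus, llm, embedder, top_k=5):
    # 1) Draft a hypothetical answer to the query with the LLM.
    hypothetical = llm(f"Write a short passage that answers the question: {query}")
    # 2) Embed the *answer* rather than the query, so retrieval becomes
    #    answer-to-answer similarity.
    q_vec = embedder.encode([hypothetical], normalize_embeddings=True)
    d_vecs = embedder.encode(corpus, normalize_embeddings=True)
    scores = (d_vecs @ q_vec.T).ravel()
    return [corpus[i] for i in np.argsort(-scores)[:top_k]]

# Usage (illustrative):
# embedder = SentenceTransformer("all-MiniLM-L6-v2")
# passages = hyde_retrieve("What causes auroras?", corpus, llm=my_llm, embedder=embedder)
```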
-
pub.towardsai.net: RAG, short for Retrieval-Augmented Generation, is a method that integrates retrieval systems with LLMs (Large Language Models), aiming to improve the quality of a model's answers by grounding them in retrieved information.
Retrieval-Augmented Generation, aka RAG — How does it work?
pub.towardsai.net
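As a rough sketch (not the article's code), the retrieve-then-generate loop boils down to two steps; `retrieve` and `llm` below are placeholder callables standing in for any retriever and any LLM:

```python
def rag_answer(query, retrieve, llm, top_k=3):
    # 1) Retrieval: fetch the passages most relevant to the query.
    passages = retrieve(query, top_k=top_k)
    context = "\n\n".join(passages)
    # 2) Augmented generation: the LLM answers grounded in the retrieved context.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return llm(prompt)
```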
-
We've witnessed remarkable improvements in the field of text embeddings throughout recent years, entering the 4th era: Universal Text Embeddings. This era is characterized by comprehensive, unified models capable of handling a diverse range of input text lengths, downstream tasks, domains, and languages. This ongoing evolution has been made possible through significant advancements in the quality, quantity, and diversity of pre-training/training/pre-finetuning/finetuning data, alongside synthetic data generation from LLMs, the use of instructions, and the utilization of LLMs as backbones. Additionally, the focus on task and domain generalization benchmarks such as the Massive Text Embedding Benchmark (MTEB) has played a crucial role. For a more detailed exploration, I invite you to have a look at my latest review draft. I eagerly welcome your feedback and suggestions to further enhance the quality of this draft. Link to the draft: https://lnkd.in/e3fmM2Q2
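As a small illustration of one enabler mentioned above (instruction-style prefixes), this is roughly what querying an instruction-prefixed embedding model looks like; the model name, prefixes and texts are illustrative examples, not taken from the review:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/e5-base-v2")  # an E5-style model expecting query/passage prefixes
query = "query: what characterises universal text embeddings?"
passages = [
    "passage: Universal embedding models are trained across many tasks, domains and languages.",
    "passage: MTEB evaluates embeddings on retrieval, clustering, classification and more.",
]
q_emb = model.encode(query, normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)
print(util.cos_sim(q_emb, p_emb))  # one similarity score per passage
```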
-
Conclusions of the (thoroughly enjoyed!) 2024 paper "Searching for Best Practices in Retrieval-Augmented Generation" (https://lnkd.in/dxRcuGqV):
1. Incorporate a query classification module (i.e. a model trained to classify whether a given user query requires retrieval at all, or whether the LLM can answer accurately without external information). This can improve the accuracy of the system.
2. If efficiency matters, use "Hybrid Search" (i.e. a weighted average of BM25 on a sparse bag-of-words vector and a dense embedding vector; a rough hybrid-scoring sketch follows below). If accuracy is all that matters, use "Hybrid Search with HyDE" (https://lnkd.in/dkeMf4Uq), i.e. the same thing but replacing the dense embeddings with hypothetical document embeddings.
3. If efficiency matters, use TILDEv2 (https://lnkd.in/d5a6Vwf6) for reranking. If accuracy is all that matters, use monoT5 (https://lnkd.in/dYCFDBPU) for reranking.
4. Repack retrieved results in reverse (ascending order of relevance score, i.e. the most relevant document last, closest to the query).
5. Use Recomp (https://lnkd.in/dJVQFzhg) for summarisation.
Searching for Best Practices in Retrieval-Augmented Generation
arxiv.org
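To make point 2 concrete, here is a rough hybrid-scoring sketch; the libraries, the weight `alpha` and the min-max normalisation are illustrative choices, not the paper's exact setup (swapping the dense query embedding for a hypothetical-document embedding would give the "Hybrid with HyDE" variant):

```python
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder dense model

def hybrid_search(query, corpus, alpha=0.3, top_k=5):
    # Sparse side: BM25 over whitespace-tokenised passages.
    bm25 = BM25Okapi([doc.split() for doc in corpus])
    sparse = np.array(bm25.get_scores(query.split()))
    # Dense side: cosine similarity of normalised embeddings.
    q = embedder.encode([query], normalize_embeddings=True)
    d = embedder.encode(corpus, normalize_embeddings=True)
    dense = (d @ q.T).ravel()
    # Min-max normalise each score list before mixing so the weight is meaningful.
    def norm(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-9)
    scores = alpha * norm(sparse) + (1 - alpha) * norm(dense)
    return [corpus[i] for i in np.argsort(-scores)[:top_k]]
```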
-
RAFT: Adapting Language Model to Domain Specific RAG 📚 Retrieval Augmented Fine Tuning is a powerful fine-tuning recipe to enhance the model's performance in answering questions within specific domains in an "open-book" setting. Blog: https://lnkd.in/gi4ZYNYx Paper: https://lnkd.in/gV_wTR2J
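A rough sketch of how a single RAFT-style training example might be assembled (field names, probabilities and the helper below are illustrative, not the authors' exact recipe): each question is paired with its oracle document plus sampled distractors, and the target answer reasons over the oracle.

```python
import random

def make_raft_example(question, oracle_doc, doc_pool, answer,
                      n_distractors=3, p_include_oracle=0.8):
    # Sample distractor documents that do not contain the answer.
    distractors = random.sample([d for d in doc_pool if d != oracle_doc], n_distractors)
    # Sometimes drop the oracle entirely, so the model also learns to cope
    # when the retrieved context does not contain the answer.
    docs = distractors + ([oracle_doc] if random.random() < p_include_oracle else [])
    random.shuffle(docs)
    context = "\n\n".join(f"[doc {i}] {d}" for i, d in enumerate(docs))
    return {
        "prompt": f"{context}\n\nQuestion: {question}",
        "completion": answer,  # ideally a chain-of-thought answer citing the oracle doc
    }
```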
-
Some insights on the promising paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention." It introduces an attention mechanism, Infini-attention, which permits Large Language Models to handle infinitely long texts efficiently. No need for fine-tuning with long context inputs! Through a compressive memory mechanism, it enables Transformers to manage both long- and short-term information while keeping a bounded memory footprint. Could be great in the field of summarization and key information extraction. Check out the full presentation here! Let me know what models/techniques you use for your long text inputs >>>
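A simplified numpy sketch of the compressive-memory idea (the shapes and the ELU+1 feature map follow the paper's description; function names and everything else here are illustrative): past keys and values are folded into a fixed-size matrix, so memory stays bounded no matter how long the sequence gets.

```python
import numpy as np

def elu1(x):
    # ELU(x) + 1: keeps the feature map positive, as in linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def memory_update(M, z, K, V):
    # Fold this segment's keys/values into the running memory and normaliser.
    # M: (d_k, d_v), z: (d_k,), K: (seg_len, d_k), V: (seg_len, d_v)
    sK = elu1(K)
    return M + sK.T @ V, z + sK.sum(axis=0)

def memory_read(M, z, Q):
    # Retrieve from the compressed memory for this segment's queries.
    sQ = elu1(Q)                          # (seg_len, d_k)
    return (sQ @ M) / (sQ @ z)[:, None]   # (seg_len, d_v)
```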
-
This paper proposes a new prompting method, Logic-of-Thought (LoT), which improves logical reasoning by expanding the logical information contained in the input context and adding it to the input prompt. LoT extracts logical propositions, expands them using logical laws, translates them back into natural language, and integrates this expanded information into the prompt. The authors claim that this approach enhances reasoning without fully relying on symbolic tools. LoT can also be combined with other prompting methods (like Chain-of-Thought) to improve performance. In experiments across five datasets, LoT significantly improved the accuracy of various prompting techniques. The source: https://lnkd.in/d-KabaWc
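Schematically, the three LoT phases could be wired up as below; `llm` is any completion callable, and note this sketch delegates the expansion step to the LLM for brevity, whereas the paper applies logical laws (e.g. contraposition) symbolically:

```python
def logic_of_thought(question, llm):
    # Phase 1 - Logic Extraction: pull conditional propositions out of the input.
    rules = llm(f"Extract the logical propositions (as 'if A then B' rules) from:\n{question}")
    # Phase 2 - Logic Extension: expand the rules with logical laws such as
    # contraposition ("if A then B" also yields "if not B then not A").
    expanded = llm(f"Apply contraposition and transitivity to extend these rules:\n{rules}")
    # Phase 3 - Logic Translation: express the expanded logic in natural language
    # and append it to the original prompt.
    hints = llm(f"Rewrite these rules as plain natural-language statements:\n{expanded}")
    return llm(f"{question}\n\nAdditional logical information:\n{hints}")
```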
-
Retrieval Augmented Generation: Where Information Retrieval Meets Text Generation https://lnkd.in/dWUq9Qi3
Retrieval Augmented Generation: Where Information Retrieval Meets Text Generation - KDNuggets
kdnuggets.com
-
🌟 Excited to share a new blog post discussing a simple yet powerful method to enhance the structured text generation capabilities of large language models. The post introduces G&O, an efficient two-step pipeline approach to improve named entity recognition (NER) and relation extraction (RE) tasks. The method effectively separates the generation of content from the structuring process, leading to significant performance improvements with minimal additional effort. If you're interested in advancing structured language model output, check out the full post here: https://bit.ly/3T7MBHW
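A hedged sketch of the two-step "generate, then organise" idea (the prompts, the JSON target format and the `extract_entities` helper are illustrative, not the post's exact pipeline):

```python
import json

def extract_entities(text, llm):
    # Step 1 - Generate: let the model describe the entities in free-form prose,
    # without worrying about output structure.
    free_form = llm(f"List every named entity mentioned below and its type.\n\n{text}")
    # Step 2 - Organise: restructure the free-form answer into the target schema.
    structured = llm(
        'Convert the following list into JSON of the form '
        '[{"entity": ..., "type": ...}]:\n\n' + free_form
    )
    return json.loads(structured)  # may need light cleanup if the model adds extra text
```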
-
Dynamic Decoding: A Practical Step Forward for LLMs
The concept of Adaptive Decoding via Latent Preference Optimization offers a thoughtful way to improve language model outputs by dynamically selecting decoding temperatures. By building on frozen LLMs, the AdaptiveDecoder extracts rich contextual insights from hidden states, efficiently adapting to tasks like reasoning and storytelling. Its reliance on Latent Preference Optimization (LPO) instead of manual tuning reduces complexity while improving flexibility. With minimal training data, the method achieves impressive adaptability across diverse tasks, marking a practical and scalable improvement for LLM applications. This approach feels like a natural progression toward more tailored and efficient language model usage. https://lnkd.in/e-_JRtAP
Adaptive Decoding via Latent Preference Optimization
arxiv.org
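As a rough sketch of the adaptive-temperature idea (layer sizes, candidate temperatures and the decoding step are illustrative, and the LPO training loop is omitted): a small head on top of the frozen LLM's hidden state picks a decoding temperature for each step.

```python
import torch
import torch.nn.functional as F

CANDIDATE_TEMPS = torch.tensor([0.0, 0.6, 1.0])  # e.g. greedy, moderate, creative

class AdaptiveDecoderHead(torch.nn.Module):
    """Small trainable head over the frozen LLM's hidden state."""
    def __init__(self, hidden_size, n_temps=len(CANDIDATE_TEMPS)):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_size, n_temps)

    def forward(self, hidden_state):
        # Probability over the candidate temperatures for this position.
        return F.softmax(self.proj(hidden_state), dim=-1)

def decode_step(logits, hidden_state, head):
    # logits: (vocab,), hidden_state: (hidden_size,) for one sequence position.
    temp = CANDIDATE_TEMPS[torch.argmax(head(hidden_state))].item()
    if temp == 0.0:
        return int(torch.argmax(logits))         # greedy decoding
    probs = F.softmax(logits / temp, dim=-1)
    return int(torch.multinomial(probs, 1))      # sample at the chosen temperature
```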