👉🏼 Large Language Models and the Wisdom of Small Crowds 🤓 Sean Trott 👇🏻 https://lnkd.in/ednsfpYm

🔍 Focus on data insights:
- The study introduces the "number needed to beat" (NNB) metric to assess the quality of human data compared to LLM-generated data.
- NNB varies across tasks, highlighting the importance of task-specific considerations in data analysis.
- Two "centaur" methods are proposed for combining LLM and human data, showing improved performance over standalone approaches.

💡 Main outcomes and implications:
- Empirical evidence suggests that LLMs do not fully capture the "wisdom of the crowd" and that human input remains crucial in certain tasks.
- The study provides a framework for decision-making on integrating LLM-generated data into research processes, considering trade-offs in data cost and quality.

📚 Field significance:
- Advances our understanding of the role of LLMs in research methodologies.
- Highlights the complementary nature of LLM and human data in achieving optimal results.

🗄️: [#LargeLanguageModels #DataInsights #ResearchMethodologies]
Nick Tarazona, MD’s Post
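The NNB idea summarized above can be illustrated with a toy sketch. This is not the paper's actual procedure; the function, the error metric (mean absolute error), and the random-crowd averaging are all illustrative assumptions: NNB is treated here as the smallest crowd size k whose averaged ratings beat the LLM's estimates against a gold standard.

```python
import random
from statistics import mean

def mae(preds, gold):
    """Mean absolute error between two equal-length lists."""
    return mean(abs(p - g) for p, g in zip(preds, gold))

def number_needed_to_beat(human_ratings, llm_preds, gold, trials=200, seed=0):
    """Toy NNB: smallest crowd size k whose averaged per-item ratings have
    lower mean absolute error than the LLM's predictions, averaged over
    random crowds of size k. Returns None if no k up to the pool size wins.
    human_ratings: one list of per-item ratings per human rater."""
    rng = random.Random(seed)
    llm_err = mae(llm_preds, gold)
    for k in range(1, len(human_ratings) + 1):
        errs = []
        for _ in range(trials):
            crowd = rng.sample(human_ratings, k)
            avg = [mean(r[i] for r in crowd) for i in range(len(gold))]
            errs.append(mae(avg, gold))
        if mean(errs) < llm_err:
            return k
    return None
```

With noisy but unbiased human raters and a slightly biased LLM, a single human typically loses to the LLM, while a large enough crowd wins — which is exactly the trade-off the NNB metric is meant to quantify.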
-
Staff Software Engineer - Applied Generative AI | React, Python, AWS, JavaScript, TypeScript | AI/ML | Ph.D. in Economics
If you are using a language model to create a quantitative output, consider this instead:
1. Classical analysis.
2. Asking the language model to produce a Likert-type output instead.
3. A hybrid of 1 and 2: run classical analytics on the LLM's Likert output. This gives you access to many of the quantitative measures you want, but alters the interpretation slightly - you need to acknowledge the influence of your LLM's training set.
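The hybrid in point 3 can be sketched in a few lines. The function name and the 5-point label mapping are illustrative assumptions, not something from the post: Likert labels are recoded to an ordinal scale, then classical descriptive statistics are applied.

```python
from statistics import mean, stdev

# Illustrative 5-point Likert mapping (an assumption; adapt to your prompt).
LIKERT = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
          "agree": 4, "strongly agree": 5}

def summarize_llm_likert(labels):
    """Run classical descriptive stats over LLM-produced Likert labels.
    Interpretation caveat: these numbers reflect the LLM's training
    distribution, not a sample from a human population."""
    scores = [LIKERT[label.lower()] for label in labels]
    return {"n": len(scores), "mean": mean(scores), "sd": stdev(scores)}
```

For example, `summarize_llm_likert(["agree", "agree", "neutral", "strongly agree"])` yields a mean of 4.0 on the ordinal scale - a number you can carry into standard analyses, with the training-set caveat attached.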
-
Academic Guidance || proposal/thesis/assignments || CareerGrowth || email us at dissertationshelp4u@gmail.com
Thematic analysis is a method for analyzing qualitative data that involves reading through a data set and looking for patterns of meaning in order to identify themes. It is an active, reflexive process in which the researcher's subjective experience is central to making sense of the data. The screenshot shows multiple respondent groups analysed against primary and secondary themes in the software, which helps to narrow the analysis down and supports the 'general to specific' movement of an inductive research approach. https://lnkd.in/g-6D_yDj #research #PhD #masters #thematicanalysis #qualitativeresearch #EU #UK
-
Interesting paper by Emmanouil Tranos et al., proposing a novel methodology to identify economic clusters over time using archive data from the JISC UK Web Domain Dataset (subset of the Internet Archive). For validation, they looked at Shoreditch in East London, and found some interesting details! 📘 Read the full paper here: https://lnkd.in/etPH5qCw #Research #EconomicClusters #Tech #DataScience
-
I simply need to accept that the vast majority of what is published in Statistics and Data Science is not mathematically rigorous. While the symbol pushing is syntactically correct, the claims are frequently provably wrong. One of the thorns in my side is the sheer number of authors who claim their algorithm du jour generates Lebesgue measures - translation invariant, in the strong topology of function spaces. This claim is provably false because the unit ball is not compact in infinite-dimensional function spaces. If we constrained ourselves to only those analyses that are rigorously defensible, we would be far more parsimonious in the algorithms we use and the conclusions we reach.
-
I hope that my study, ‘USE OF ARTIFICIAL INTELLIGENCE AND AUDIT ANALYTICS IN INTERNAL AUDIT PROCESSES IN THE PUBLIC SECTOR’, will contribute to the literature and serve as a guide for researchers working on this subject.
-
Full-Stack Developer | MERN Stack | MongoDB | Express.js | React.js | Next.js | Node.js | Mongoose | Ejs | Postman | JavaScript | Passionate about Data Structures & Algorithms | AWS Cloud Enthusiast
Understanding Subarray vs. Subsequence in Data Structures

Today, let's unravel the distinction between subarrays and subsequences in the realm of Data Structures and Algorithms (DSA)! These concepts play crucial roles in problem-solving and algorithmic optimization, yet they possess distinct characteristics.

🔍 Subarray: A subarray is a contiguous portion of an array, consisting of elements that appear consecutively in the original array. It maintains the relative order of elements and cannot skip any elements.

🔍 Subsequence: A subsequence, on the other hand, is a sequence of elements that need not be contiguous in the original array. It retains the relative order of elements but can skip elements in between.

⚙️ Key Differences:
- Subarrays are contiguous, while subsequences can be non-contiguous.
- Every subarray is a subsequence, but not every subsequence is a subarray.

Understanding the nuances between subarrays and subsequences is pivotal in algorithm design and problem-solving strategies. Whether optimizing for efficiency or exploring creative solutions, clarity on these concepts is indispensable. Excited to delve deeper into these distinctions and apply them in tackling real-world problems in DSA! Join me on this journey of exploration and mastery.

#DSA #Subarray #Subsequence #Algorithms #LearningJourney #gauravsah
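The distinction above can be made concrete with a brute-force enumeration (a teaching sketch - exponential in input size, so only for tiny arrays; the function names are mine):

```python
from itertools import combinations

def subarrays(arr):
    """All non-empty contiguous slices of arr."""
    return [arr[i:j] for i in range(len(arr))
                     for j in range(i + 1, len(arr) + 1)]

def subsequences(arr):
    """All non-empty subsequences: keep relative order, allow gaps."""
    idx = range(len(arr))
    return [[arr[i] for i in picked]
            for r in range(1, len(arr) + 1)
            for picked in combinations(idx, r)]
```

For `[1, 2, 3]` there are 6 subarrays (n(n+1)/2) but 7 subsequences (2^n - 1); `[1, 3]` is the one subsequence that is not a subarray, since it skips the middle element.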
-
More Is Less: https://lnkd.in/d9qwU4Yc

This - underrated - paper proves (mathematically) that huge datasets contain arbitrary correlations, independently of the nature of the data. The results are based on:
- Poincaré recurrence: famous for providing regularities in large subsets, e.g. Szemerédi's theorem.
- Ramsey theory: famous for providing a lower bound on the size a set must reach before a fixed "correlation" is guaranteed to hold in any set of that size or bigger.
Cristian S. Calude & Giuseppe Longo, The Deluge of Spurious Correlations in Big Data - PhilPapers
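The paper's point can also be demonstrated numerically. This is a toy Monte Carlo sketch with NumPy, not the paper's combinatorial construction: among many columns of independent noise, some pair will correlate strongly by chance alone, and the effect grows with the number of columns.

```python
import numpy as np

def max_spurious_corr(n_rows, n_cols, seed=0):
    """Largest absolute pairwise correlation among columns of pure noise."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_rows, n_cols))   # independent N(0,1) columns
    corr = np.corrcoef(data, rowvar=False)         # n_cols x n_cols matrix
    np.fill_diagonal(corr, 0.0)                    # ignore self-correlation
    return float(np.abs(corr).max())
```

With 30 observations, 10 noise columns rarely produce a strong pairwise correlation, but 500 noise columns almost always do - "more data" (more variables) manufactures spurious structure.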
-
RECODING QUANTITATIVE VARIABLES INTO QUALITATIVE ONES – TECHNIQUES AND THEIR PRACTICAL APPLICATIONS https://lnkd.in/d2JMnjEA
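The linked article covers this recode in PS IMAGO PRO, but the core operation is small: choose cut points and map each numeric value to a labelled class. A minimal stdlib sketch (the function name, cut points, and labels are illustrative; in pandas the analogous tool is `pandas.cut`):

```python
from bisect import bisect_left

def recode(values, cuts, labels):
    """Recode numeric values into qualitative classes.
    cuts are inclusive upper bounds: value <= cuts[i] -> labels[i];
    anything above the last cut gets the final label."""
    assert len(labels) == len(cuts) + 1
    return [labels[bisect_left(cuts, v)] for v in values]

# e.g. recode ages into life-stage categories
stages = recode([12, 25, 47, 70], cuts=[17, 64],
                labels=["minor", "adult", "senior"])
# → ['minor', 'adult', 'adult', 'senior']
```

Using `bisect_left` keeps the boundaries inclusive on the upper side (age 17 is still "minor", 64 still "adult"), which is the convention you would otherwise configure explicitly in a recode dialog.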
-
✨ Excited to share that our paper "Challenging Fairness: A Comprehensive Exploration of Bias in LLM-Based Recommendations" has been accepted to IEEE BigData 2024! 🎉 In this paper, we explore fairness in large language model-based recommendation systems. We study several research questions and address them in terms of different fairness metrics, highlighting the urgent need for more equitable solutions in LLM-based recommendations. A version of the paper can be found here - https://lnkd.in/geUWCyBQ Thanks to my co-author Shahnewaz Karim Sakib. We look forward to discussing our findings and engaging with peers in the community to further explore these vital issues. See you at IEEE BigData! #IEEEBigData #FairnessInAI #MachineLearning #ArtificialIntelligence
-
Starting another statistics course at Stanford University to gain comprehensive knowledge of data collection, organization, interpretation, analysis, and representation.