Miguel Jetté’s Post

Head of AI @ Circle Medical | Healthcare, ASR, NLU, & genAI

The Rev team has recently released another great open source paper and dataset. Proud of the work we did over all those years and of the work they continue to do! Congrats to you all! Very proud, and I love watching these releases come to life! Corey Miller, Miguel del Rio Fernandez, Nishchal Bhandari, Martin Ratajczak, Danny Chen, and Quinn McNamara!

Paper: https://lnkd.in/gTqrnikD
GitHub: https://lnkd.in/gdFuegYt

"Word error rate (WER) as a metric has a variety of limitations that have plagued the field of speech recognition. Evaluation datasets suffer from varying style, formality, and inherent ambiguity of the transcription task. In this work, we attempt to mitigate some of these differences by performing style-agnostic evaluation of ASR systems using multiple references transcribed under opposing style parameters. As a result, we find that existing WER reports are likely significantly over-estimating the number of contentful errors made by state-of-the-art ASR systems. In addition, we have found our multireference method to be a useful mechanism for comparing the quality of ASR models that differ in the stylistic makeup of their training data and target task."

#asr #speechrecognition #speech #wer #speechrec #rev
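For readers new to the metric, here is a minimal sketch of word error rate and of why a second reference in a different style can change the score. It is not the paper's method: taking the minimum WER over whole-utterance references is a simplifying assumption made here for illustration, and the function names (`wer`, `multi_reference_wer`) and the example sentences are hypothetical.

```python
from typing import Iterable, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def multi_reference_wer(references: Iterable[str], hypothesis: str) -> float:
    """Assumption for illustration: score against the closest reference.

    The paper describes a multireference method; this utterance-level
    minimum is only a rough stand-in for that idea.
    """
    return min(wer(reference, hypothesis) for reference in references)


# Two hypothetical references for the same audio, in opposing styles.
verbatim = "um i i think we're gonna start now"
clean = "i think we are going to start now"
hypothesis = "i think we are going to start now"

print(wer(verbatim, hypothesis))                         # penalised for style alone
print(multi_reference_wer([verbatim, clean], hypothesis))  # 0.0
```

Against the verbatim reference alone, the hypothesis is charged for filler words and contractions even though nothing contentful is wrong; with the clean reference available, those style differences no longer count as errors.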


More Relevant Posts

Brig Hisamullah Beg (SIM)

Pensioner at Home
4mo
The purpose of a glossary is to clarify and define specialized terminology for the reader. https://lnkd.in/g34cTcn9

Glossary: A to Z of BOOK-III

hisamullahbeg.blogspot.com
Dr. Lennart Weiß

AI infused Patent & Science Search for Industry Leaders.
8mo
Is it true that you MUST use citations to find similar prior art? I'd say this perception is totally misleading. Finding similar documents has almost nothing to do with citation-based searches. It’s about the written text. Everything is in there.
- What is it about ... the context
- How is it operating ... the applied method
- Why is it working this way ... the characteristic features
- Where is it useful ... the use case
Context + Method + Features + Use Case = Finding similar prior art
Do you agree?
Laura Dietz

Professor at University of New Hampshire
5mo
Contrary to popular belief, MAP has a pretty easy interpretation: when you locate all known relevant documents (R) in a ranking, how few non-relevant docs are above them (Prec@r)? MAP is a recall-based measure that wants you to place all relevant docs at the top of a ranking -- correcting for broken ties. #informationretrieval #evaluation
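A small sketch may make that interpretation concrete. It uses the standard definition of average precision (precision at the rank of each relevant document, averaged over all known relevant documents); the function names and document IDs are hypothetical, and the ranking is assumed to contain no duplicates.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranking: Sequence[Hashable], relevant: Iterable[Hashable]) -> float:
    """Average of Prec@r over the ranks r where relevant documents appear.

    Dividing by the number of *known* relevant documents means a relevant
    document that is never retrieved contributes zero, which is what makes
    the measure recall-oriented.
    """
    relevant_set: Set[Hashable] = set(relevant)
    if not relevant_set:
        raise ValueError("at least one relevant document is required")
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant_set:
            hits += 1
            precision_sum += hits / rank  # Prec@r at this relevant document
    return precision_sum / len(relevant_set)


def mean_average_precision(
    runs: Sequence[Tuple[Sequence[Hashable], Iterable[Hashable]]]
) -> float:
    """MAP: the mean of average precision over (ranking, relevant) pairs."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(average_precision(ranking, relevant) for ranking, relevant in runs) / len(runs)


# One non-relevant document (d2) sits above the second relevant one (d3).
print(average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"}))  # (1/1 + 2/3) / 2
# All relevant documents at the top gives a perfect score.
print(average_precision(["d1", "d3", "d2", "d4"], {"d1", "d3"}))  # 1.0
```

Each non-relevant document ranked above a relevant one lowers Prec@r at that rank, so the score is 1.0 only when every known relevant document sits at the top.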
Casper van Elteren

PHD Computational science at University of Amsterdam
1mo
What is information and can information be negative? Read my latest take on the nature of information and its relation to multivariate contexts. Through illustrative interactive examples, I discuss what information is, and how to interpret negative information.

The Difference Operator

thefriendlyghost.nl

Nathan Burns

Mr Metacognition - Teacher Educator. Expert in metacognition. ITT facilitator. Maths leadership and curriculum support. More Able & Talented consultancy. Researcher and author.
5mo Edited
I'm a particular fan of this InnerDrive & Bradley Busch article: Specific Retrieval Questions. I often find that retrieval is just lumped together as one big thing... 'recalling prior learning', but there's more to it. Are you using specific or open questions? If push came to shove, I think specific questions are both more important and easier to use effectively. But this isn't to say that open retrieval questions are not also crucial. Anyway, check out the article because it does walk you through the pros and cons, and the literature, around specific retrieval questions. https://lnkd.in/gPjsYSDU
Ruben Horbach

Emerging tech researcher - helping you navigate tomorrow -> Innovation strategist | Public Speaker | Sparringpartner | Workshop facilitator
2w
This is a seriously good summarisation prompt:
1.) Analyse the input text and generate 5 essential questions that, when answered, capture the main points and core meaning of the text.
2.) When formulating your questions:
a. Address the central theme or argument
b. Identify key supporting ideas
c. Highlight important facts or evidence
d. Reveal the author's purpose or perspective
e. Explore any significant implications or conclusions.
3.) Answer all of your generated questions one-by-one in detail.
< input text >
Towards Data Science

639,408 followers
8mo Edited
If you'd like to improve the effectiveness, coverage, and adaptability of HyDE (hypothetical document embeddings) for advanced RAG applications, Ian Ho presents his novel approach: AutoHyDE.

AutoHyDE: Making HyDE Better for Advanced LLM RAG

towardsdatascience.com
Antonio Montano 🪄

Delivering perpetual agility via technology ✨
1mo
💥💥💥 Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations
Nicolò Penzo, Maryam Sajedinia, Bruno Lepri, Sara Tonelli, Marco Guerini
Abstract
Assessing the performance of systems to classify Multi-Party Conversations (MPC) is challenging due to the interconnection between linguistic and structural characteristics of conversations. Conventional evaluation methods often overlook variances in model behavior across different levels of structural complexity on interaction graphs. In this work, we propose a methodological pipeline to investigate model performance across specific structural attributes of conversations. As a proof of concept we focus on Response Selection and Addressee Recognition tasks, to diagnose model weaknesses. To this end, we extract representative diagnostic subdatasets with a fixed number of users and a good structural variety from a large and open corpus of online MPCs. We further frame our work in terms of data minimization, avoiding the use of original usernames to preserve privacy, and propose alternatives to using original text messages. Results show that response selection relies more on the textual content of conversations, while addressee recognition requires capturing their structural dimension. Using an LLM in a zero-shot setting, we further highlight how sensitivity to prompt variations is task-dependent.
👉 https://lnkd.in/djP_yaaC #machinelearning
Guy W Wallace

Retired Performance Analyst & Instructional Architect - Award-winning consultant to Enterprise L&D in performance-based Instructional Architecture Analysis, Design & Development 1979 to 2023.
2mo
Enterprise L&D: The language/terminology in L&D has always been a mess - going back long before I got into the field in 1979.

Enterprise L&D: Wet is Just One of Many Umbrella Terms

http://tppannex.wordpress.com
Yu Cao
4mo
From Multimodal Thinking to Efficient Context Compression: The Innovation of COCOM
After reading "Context Embeddings for Efficient Answer Generation in RAG," I realized that COCOM represents a variant of multimodal thinking, treating compressed context embeddings like visual features and integrating them with LLMs through adapter layers. This approach not only effectively handles long text contexts but also significantly enhances generation efficiency. By combining different forms of information, COCOM achieves higher task performance and efficiency, offering valuable insights and innovative directions for future multimodal research and applications. https://lnkd.in/eSSAHEKK

Context Embeddings for Efficient Answer Generation in RAG

arxiv.org