The Rev team has recently released another great open-source paper and dataset. Proud of the work we did all those years and of the work they continue to do! Congrats to you all! Very proud, and I love watching these releases come to life! Corey Miller, Miguel del Rio Fernandez, Nishchal Bhandari, Martin Ratajczak, Danny Chen, and Quinn McNamara!

Paper: https://lnkd.in/gTqrnikD
Github: https://lnkd.in/gdFuegYt

"Word error rate (WER) as a metric has a variety of limitations that have plagued the field of speech recognition. Evaluation datasets suffer from varying style, formality, and inherent ambiguity of the transcription task. In this work, we attempt to mitigate some of these differences by performing style-agnostic evaluation of ASR systems using multiple references transcribed under opposing style parameters. As a result, we find that existing WER reports are likely significantly over-estimating the number of contentful errors made by state-of-the-art ASR systems. In addition, we have found our multireference method to be a useful mechanism for comparing the quality of ASR models that differ in the stylistic makeup of their training data and target task."

#asr #speechrecognition #speech #wer #speechrec #rev
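The multi-reference idea is easy to play with: score a hypothesis against several stylistically different references and keep the best match, so a legitimate stylistic choice present in at least one reference is not charged as an error. A minimal sketch (plain word-level WER plus a min-over-references wrapper; illustrative only, not Rev's actual scoring pipeline):

```python
def wer(ref_words, hyp_words):
    """Word error rate: word-level Levenshtein distance divided by
    the number of reference words."""
    # One-row dynamic programme over the hypothesis.
    d = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp_words, 1):
            cur = d[j]
            d[j] = min(d[j] + 1,          # deletion
                       d[j - 1] + 1,      # insertion
                       prev + (r != h))   # substitution (or match)
            prev = cur
    return d[-1] / max(len(ref_words), 1)

def multi_ref_wer(references, hypothesis):
    """Score against several references and keep the most favourable
    one, instead of punishing every deviation from a single style."""
    hyp = hypothesis.split()
    return min(wer(ref.split(), hyp) for ref in references)
```

For example, `multi_ref_wer(["it is okay", "it's okay"], "it's okay")` is 0.0, while single-reference WER against the verbatim-style reference alone would charge two errors out of three words.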
Miguel Jetté’s Post
More Relevant Posts
-
The purpose of a glossary is to clarify and define specialized terminology for the reader. https://lnkd.in/g34cTcn9
-
Is it true that you MUST use citations to find similar prior art? I'd say this perception is totally misleading. Finding similar documents has almost nothing to do with citation-based searches. It's about the written text. Everything is in there.
- What is it about ... the context
- How is it operating ... the applied method
- Why is it working this way ... the characteristic features
- Where is it useful ... the use case
Context + Method + Features + Use Case = finding similar prior art. Do you agree?
-
Contrary to popular belief, MAP has a pretty easy interpretation: when you locate all known relevant documents (R) in a ranking, how few non-relevant docs sit above each of them (Prec@r)? MAP is a recall-based measure that wants you to place all relevant docs at the top of a ranking, correcting for broken ties. #informationretrieval #evaluation
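That interpretation maps directly onto code: average the precision at each rank where a relevant document appears, normalised by all R known relevant docs so that anything the system fails to retrieve drags the score down. A minimal sketch (function names are my own, not from any particular IR toolkit):

```python
def average_precision(ranking, relevant):
    """AP for one query: precision at each rank holding a relevant
    document, averaged over ALL known relevant docs (R), so missed
    documents count as zero-precision contributions."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranking, 1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)  # Prec@k at this relevant doc
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(run, qrels):
    """MAP: mean of the per-query AP values over all queries."""
    return sum(average_precision(run[q], qrels[q]) for q in qrels) / len(qrels)
```

With ranking `["d1", "d2", "d3", "d4"]` and relevant set `{"d1", "d3"}`, the precisions at the relevant ranks are 1/1 and 2/3, so AP is (1 + 2/3) / 2 ≈ 0.833; placing both relevant docs first would give 1.0.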
-
What is information and can information be negative? Read my latest take on the nature of information and its relation to multivariate contexts. Through illustrative interactive examples, I discuss what information is, and how to interpret negative information.
The Difference Operator
thefriendlyghost.nl
-
I'm a particular fan of this InnerDrive & Bradley Busch article: Specific Retrieval Questions. I often find that retrieval is just lumped together as one big thing, 'recalling prior learning', but there's more to it. Are you using specific or open questions? If push came to shove, I think specific questions are both more important and easier to use effectively. But this isn't to say that open retrieval questions are not also crucial. Anyway, check out the article, because it walks you through the pros, cons, and literature around specific retrieval questions. https://lnkd.in/gPjsYSDU
-
This is a seriously good summarisation prompt:
1.) Analyse the input text and generate 5 essential questions that, when answered, capture the main points and core meaning of the text.
2.) When formulating your questions:
a. Address the central theme or argument
b. Identify key supporting ideas
c. Highlight important facts or evidence
d. Reveal the author's purpose or perspective
e. Explore any significant implications or conclusions.
3.) Answer all of your generated questions one-by-one in detail.
< input text >
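To reuse the prompt programmatically, it drops neatly into a string template. A minimal sketch; the `SUMMARY_PROMPT` constant and `build_summary_prompt` helper are illustrative names, and the actual LLM client call is left out:

```python
# The prompt above as a reusable template with an input-text slot.
SUMMARY_PROMPT = """\
1.) Analyse the input text and generate 5 essential questions that, when \
answered, capture the main points and core meaning of the text.
2.) When formulating your questions:
a. Address the central theme or argument
b. Identify key supporting ideas
c. Highlight important facts or evidence
d. Reveal the author's purpose or perspective
e. Explore any significant implications or conclusions.
3.) Answer all of your generated questions one-by-one in detail.

<input text>
{text}
</input text>"""

def build_summary_prompt(text: str) -> str:
    """Fill the input-text slot; send the result to whatever LLM client you use."""
    return SUMMARY_PROMPT.format(text=text)
```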
-
If you'd like to improve the effectiveness, coverage, and adaptability of HyDE (hypothetical document embeddings) for advanced RAG applications, Ian Ho presents his novel approach: AutoHyDE.
AutoHyDE: Making HyDE Better for Advanced LLM RAG
towardsdatascience.com
-
💥💥💥 Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations
Nicolò Penzo, Maryam Sajedinia, Bruno Lepri, Sara Tonelli, Marco Guerini

Abstract: Assessing the performance of systems to classify Multi-Party Conversations (MPC) is challenging due to the interconnection between linguistic and structural characteristics of conversations. Conventional evaluation methods often overlook variances in model behavior across different levels of structural complexity on interaction graphs. In this work, we propose a methodological pipeline to investigate model performance across specific structural attributes of conversations. As a proof of concept we focus on Response Selection and Addressee Recognition tasks, to diagnose model weaknesses. To this end, we extract representative diagnostic subdatasets with a fixed number of users and a good structural variety from a large and open corpus of online MPCs. We further frame our work in terms of data minimization, avoiding the use of original usernames to preserve privacy, and propose alternatives to using original text messages. Results show that response selection relies more on the textual content of conversations, while addressee recognition requires capturing their structural dimension. Using an LLM in a zero-shot setting, we further highlight how sensitivity to prompt variations is task-dependent.

👉 https://lnkd.in/djP_yaaC #machinelearning
-
Enterprise L&D: The language/terminology in L&D has always been a mess - going back long before I got into the field in 1979.
Enterprise L&D: Wet is Just One of Many Umbrella Terms
http://tppannex.wordpress.com
-
From Multimodal Thinking to Efficient Context Compression: The Innovation of COCOM

After reading "Context Embeddings for Efficient Answer Generation in RAG," I realized that COCOM represents a variant of multimodal thinking: it treats compressed context embeddings like visual features and integrates them with LLMs through adapter layers. This approach not only handles long text contexts effectively but also significantly improves generation efficiency. By combining different forms of information, COCOM achieves higher task performance and efficiency, offering valuable insights and innovative directions for future multimodal research and applications. https://lnkd.in/eSSAHEKK
Context Embeddings for Efficient Answer Generation in RAG
arxiv.org
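COCOM's compressor is learned end-to-end, but the shape of the idea (many token embeddings in, a handful of context embeddings out for the LLM to attend over) can be sketched with a crude stand-in. This is purely illustrative, mean-pooling contiguous chunks in place of the paper's trained compression module:

```python
def compress_context(token_embeddings, k):
    """Crude stand-in for a learned context compressor: reduce n token
    embeddings to (at most) k context embeddings by mean-pooling
    contiguous chunks. COCOM trains this step end-to-end instead."""
    n = len(token_embeddings)
    chunk = max(1, -(-n // k))  # ceil(n / k) tokens per chunk
    compressed = []
    for start in range(0, n, chunk):
        block = token_embeddings[start:start + chunk]
        dim = len(block[0])
        # Mean over the chunk, dimension by dimension.
        compressed.append([sum(vec[d] for vec in block) / len(block)
                           for d in range(dim)])
    return compressed
```

Feeding k compressed vectors to the decoder instead of n token embeddings is where the efficiency gain comes from: attention cost over the context drops roughly by the compression factor n/k.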