About
I'm a machine learning scientist working on ML techniques for recommendations in online…
Activity
-
Synthetic data is becoming essential for training and fine-tuning models, but there’s a lot we still need to learn about best practices for…
Liked by Rishabh Mehrotra
-
Mattias Frånberg, Mårten Schultzberg and I just posted a new paper on arXiv: https://lnkd.in/dDC7CmJH In the paper, we look at how different types of…
Liked by Rishabh Mehrotra
-
📣 I am recruiting Postdoc and PhD students at McGill University Mila - Quebec Artificial Intelligence Institute ! ** PhD positions on "Fairness…
Liked by Rishabh Mehrotra
Experience
Education
Publications
-
Deriving User- and Content-specific Rewards for Contextual Bandits
World Wide Web Conference
-
Explore, Exploit, and Explain: Personalizing Explainable Recommendations with Bandits
ACM Conference on Recommender Systems (RecSys)
-
Auditing Search Engines for Differential Performance Across Demographics
World Wide Web Conference
-
Hey Cortana! Exploring the use cases of a Desktop based Digital Assistant
Workshop on Conversational Approaches to Information Retrieval (CAIR)
-
Topics, Tasks & Beyond: Learning Representations for Personalization
In Proceedings of Doctoral Consortium at the 8th ACM International Conference of Web Search and Data Mining (WSDM 2015), Shanghai.
-
Towards Hierarchies of Search Tasks & Subtasks
In Proceedings of the 24th International World Wide Web Conference (WWW-15) Florence, Italy.
-
Task-Based User Modelling for Personalization via Probabilistic Matrix Factorization
8th ACM Conference on Recommender Systems (RecSys 2014)
We introduce a novel approach to user modelling for behavioral targeting, task-based user representation, and present a method based on search task extraction from search logs wherein users are represented by their actions over a task space. Given a web search log, we extract the search tasks performed by users and derive user representations from these tasks. More specifically, we construct a user-task association matrix and borrow insights from collaborative filtering to learn a low-dimensional factor model wherein the interests and preferences of a user are determined by a small number of latent factors. We evaluate the proposed approach on collaborative query recommendation using the publicly available AOL search log, compare it with a standard term-similarity baseline, and discuss potential future research directions.
-
Improving LDA Topic Models for Microblogs via Automatic Tweet Labeling and Pooling
36th ACM Special Interest Group on Information Retrieval Conference (SIGIR 2013)
Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and academic abstracts, they are often less coherent when applied to microblog content like Twitter. In this paper, we investigate methods to improve topics learned from Twitter content without modifying the basic machinery of LDA; we achieve this through various pooling schemes that aggregate tweets in a data preprocessing step for LDA. We empirically establish that a novel method of tweet pooling by hashtags leads to a vast improvement in a variety of measures for topic coherence across three diverse Twitter datasets in comparison to an unmodified LDA baseline and a variety of pooling schemes. An additional contribution of automatic hashtag labelling further improves on the hashtag pooling results for a subset of metrics. Overall, these two novel schemes lead to significantly improved LDA topic models on Twitter content.
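The hashtag pooling step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `pool_by_hashtag`, the regular expression, the lowercasing of tags, and the choice to keep hashtag-free tweets as separate documents are all assumptions made for illustration. The abstract does not say how tweets with several hashtags or with none are handled.

```python
import re
from collections import defaultdict

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Aggregate tweets into one pooled document per hashtag.

    Assumptions (not taken from the paper): tags are case-insensitive,
    a tweet with several hashtags joins every matching pool, and tweets
    without hashtags are returned separately as single-tweet documents.
    """
    pools = defaultdict(list)
    unpooled = []
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if not tags:
            unpooled.append(text)
        for tag in tags:
            pools[tag].append(text)
    # Each pooled document is the concatenation of its tweets, in input order.
    pooled_docs = {tag: " ".join(texts) for tag, texts in pools.items()}
    return pooled_docs, unpooled


tweets = [
    "Loving #NLP today",
    "topic models on #nlp and #LDA",
    "no tags here",
]
docs, rest = pool_by_hashtag(tweets)
```

The pooled documents in `docs`, rather than the individual tweets, would then be passed as the corpus to an unmodified LDA implementation, which is the sense in which pooling is a preprocessing step.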
-
Coupled Dictionary Learning for Cross Lingual Information Retrieval
NIPS 2012 Workshop on Analysis Operator Learning vs. Dictionary Learning, Lake Tahoe, US
Automatic text understanding has been an unsolved research problem for many years. This partially results from the dynamic and diverging nature of human languages, which ultimately results in many different varieties of natural language. These variations range from the individual level, to regional and social dialects, and up to seemingly separate languages and language families. However, in recent years there have been considerable achievements in data-driven approaches to computational linguistics that exploit the redundancy in the encoded information and the structures used. Those approaches are mostly not language specific or can even exploit redundancies across languages. Representing documents by vectors that are independent of language enhances the performance of cross-lingual tasks such as comparable document retrieval and mate retrieval.
In this paper, we explore the use of dictionary-based approaches to solve the task of cross-lingual information retrieval. We propose a new dictionary learning algorithm that learns a pair of coupled dictionaries representing basis atoms in a pair of languages, alongside two mapping functions that help transform representations learnt in one language to the other. Such transformations are necessary for finding similar documents in a different language and hence find wide application in various cross-lingual information retrieval tasks. We present an optimization procedure that iterates between two objectives and uses the K-SVD formulation to efficiently compute the parameters involved. We evaluate our algorithm on the task of cross-lingual comparable document retrieval and compare our results with existing approaches; the results highlight the efficacy of our method.
-
Dictionary based Sparse Representation for Domain Adaptation
21st ACM Conference on Information and Knowledge Management CIKM 2012, Maui, USA
Machine learning algorithms are often as good as the data they can learn from. An enormous amount of unlabeled data is readily available, and the ability to use such data efficiently holds significant promise for increasing the performance of various learning tasks. We consider the task of supervised domain adaptation and present a self-taught learning framework that uses the K-SVD algorithm to learn sparse representations of data in an unsupervised manner. To the best of our knowledge, this is the first work that integrates the K-SVD algorithm into the self-taught learning framework. The K-SVD algorithm iteratively alternates between sparse coding of the instances based on the current dictionary and a process of updating the dictionary to better fit the data, so as to achieve a sparse representation under strict sparsity constraints. Using the learnt dictionary, a rich feature representation of the few labeled instances is obtained, which is fed to a classifier along with class labels to build the model. We evaluate our framework on the task of domain adaptation for sentiment classification. Both self-domain classification (requiring very few domain-specific training instances) and cross-domain classification (requiring no labeled instances of the target domain and very few labeled instances of the source domain) are performed. Empirical comparisons of self-domain and cross-domain results establish the efficacy of the proposed framework.
-
Corporate News Classification and Valence Prediction: A Supervised Approach
49th Association for Computational Linguistics: ACL HLT 2011 Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011), Portland, Oregon, USA.
News articles have always been a prominent force in the formation of a company’s financial image in the minds of the general public, especially investors. Given the large amount of news being generated these days through various websites, it is possible to mine the general sentiment about a particular company being portrayed by media agencies over a period of time, which can be used to gauge the long-term impact on the investment potential of the company. However, given such a vast amount of news data, we first need to separate corporate news from other kinds, namely sports, entertainment, science & technology, etc. We propose a system which takes news as input, checks whether it is of corporate nature, and then identifies the polarity of the sentiment expressed in the news. The system is also capable of distinguishing the company or organization that is the subject of the news from other organizations that are mentioned, and this is used to pair the sentiment polarity with the identified company.
Courses
-
Advanced Algorithms
-
-
Artificial Intelligence
-
-
Cloud Computing
-
-
Data Structure and Algorithms
-
-
Differential Geometry
-
-
Machine Learning
-
-
Object Oriented Programming
-
-
Optimization
-
-
Real Analysis
-
-
Topology
-
Honors & Awards
-
Y Ali Research Award
-
-
Donald B Crouch Research Grant
-
More activity by Rishabh
-
I remember that the first time I had to design an online experiment I realized that it was far less trivial than it looked on paper. Even deciding…
Liked by Rishabh Mehrotra