Rishabh Mehrotra

London, England, United Kingdom
16K followers 500+ connections

About

I'm a machine learning scientist working on ML techniques for recommendations in online…

Experience

  • Sourcegraph

    London, England, United Kingdom

  • -

    London, England, United Kingdom

  • -

    London, England, United Kingdom

  • -

    London, England, United Kingdom

  • -

    London, United Kingdom

  • -

    London, United Kingdom

  • -

    London, United Kingdom

  • -

    London, United Kingdom

  • -

    Greater New York City Area

  • -

    Redmond, WA

  • -

    London, United Kingdom

  • -

    Bangalore

  • -

    Houston, Texas Area

  • -

    Canberra, Australia

  • -

    Singapore

Education

  • UCL

Publications

  • Sequence-aware Reinforcement Learning over Knowledge Graphs

    Reveal Workshop @ RecSys

  • The Music Streaming Sessions Dataset

    World Wide Web Conference

  • Towards Task Understanding in Visual Settings

    AAAI

  • Hey Cortana! Exploring the use cases of a Desktop based Digital Assistant

    Workshop on Conversational Approaches to Information Retrieval (CAIR)

  • Identifying User Sessions in Interactions with Intelligent Assistants

    World Wide Web Conference (WWW)

  • Predictive Power of Online and Offline Behavior Sequences: Evidence from a Micro-finance Context

    ICIS

  • Characterizing Users' Multi-Tasking Behavior in Web Search

    ACM SIGIR

  • Deconstructing Complex Search Tasks

    NAACL

  • Query Log Mining for Inferring User Tasks and Needs

    ECML

  • The Information Network: Exploiting Causal Dependencies in Online Information Seeking

    CHIIR

  • Uncovering Task Based Behavioral Heterogeneities in Online Search Behavior

    ACM SIGIR

  • Modeling the Evolution of User-generated Content on a Large Video Sharing Platform

    In Proceedings of the Web Science Track at 24th International World Wide Web Conference (WWW-15) (to appear) Florence, Italy.

  • Topics, Tasks & Beyond: Learning Representations for Personalization

    In Proceedings of the Doctoral Consortium at the 8th ACM International Conference on Web Search and Data Mining (WSDM 2015), Shanghai.

  • Towards Hierarchies of Search Tasks & Subtasks

    In Proceedings of the 24th International World Wide Web Conference (WWW-15) Florence, Italy.

    Other authors
    • Emine Yilmaz
  • Task-Based User Modelling for Personalization via Probabilistic Matrix Factorization

    8th ACM Conference on Recommender Systems (RecSys 2014)

    We introduce a novel approach to user modelling for behavioral targeting: task-based user representation. The approach is based on search task extraction from search logs, wherein users are represented by their actions over a task-space. Given a web search log, we extract search tasks performed by users and find user representations based on these tasks. More specifically, we construct a user-task association matrix and borrow insights from Collaborative Filtering to learn a low-dimensional factor model wherein the interests/preferences of a user are determined by a small number of latent factors. We compare the performance of the proposed approach on the task of collaborative query recommendation on the publicly available AOL search log with a standard term-similarity baseline and discuss potential future research directions.

    Other authors
    • Emine Yilmaz
    • Manisha Verma
  • Improving LDA Topic Models for Microblogs via Automatic Tweet Labeling and Pooling

    36th ACM Special Interest Group on Information Retrieval Conference (SIGIR 2013)

    Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and academic abstracts, they are often less coherent when applied to microblog content like Twitter. In this paper, we investigate methods to improve topics learned from Twitter content without modifying the basic machinery of LDA; we achieve this through various pooling schemes that aggregate tweets in a data preprocessing step for LDA. We empirically establish that a novel method of tweet pooling by hashtags leads to a vast improvement in a variety of measures for topic coherence across three diverse Twitter datasets in comparison to an unmodified LDA baseline and a variety of pooling schemes. An additional contribution of automatic hashtag labelling further improves on the hashtag pooling results for a subset of metrics. Overall, these two novel schemes lead to significantly improved LDA topic models on Twitter content.

    Other authors
    • Scott Sanner
    • Wray Buntine
    • Lexing Xie
  • Coupled Dictionary Learning for Cross Lingual Information Retrieval

    NIPS 2012 Workshop on Analysis Operator Learning vs. Dictionary Learning, Lake Tahoe, US

    Automatic text understanding has been an unsolved research problem for many years. This partially results from the dynamic and diverging nature of human languages, which ultimately results in many different varieties of natural language. These variations range from the individual level, to regional and social dialects, and up to seemingly separate languages and language families. However, in recent years there have been considerable achievements in data-driven approaches to computational linguistics exploiting the redundancy in the encoded information and the structures used. Those approaches are mostly not language specific or can even exploit redundancies across languages. Representing documents by vectors that are independent of languages enhances the performance of cross-lingual tasks such as comparable document retrieval and mate retrieval.
    In this paper, we explore the use of dictionary-based approaches to solve the task of cross-lingual information retrieval. We propose a new dictionary learning algorithm for learning a pair of coupled dictionaries representing basis atoms in a pair of languages, alongside learning two mapping functions which help in transforming representations learnt in one language to the other. Such transformations are necessary for the task of finding similar documents in a different language and hence find immense application in various cross-lingual information retrieval tasks. We present an optimization procedure that iterates between two objectives and uses the K-SVD formulation to efficiently compute the parameters involved. We evaluate our algorithm on the task of cross-lingual comparable document retrieval and compare our results with existing approaches; the results highlight the efficacy of our method.

  • Dictionary based Sparse Representation for Domain Adaptation

    21st ACM Conference on Information and Knowledge Management CIKM 2012, Maui, USA

    Machine Learning algorithms are often only as good as the data they can learn from. An enormous amount of unlabeled data is readily available, and the ability to efficiently use such data holds significant promise in terms of increasing the performance of various learning tasks. We consider the task of supervised Domain Adaptation and present a Self-Taught learning based framework which makes use of the K-SVD algorithm for learning a sparse representation of data in an unsupervised manner. To the best of our knowledge this is the first work that integrates the K-SVD algorithm into the self-taught learning framework. The K-SVD algorithm iteratively alternates between sparse coding of the instances based on the current dictionary and a process of updating/adapting the dictionary to better fit the data so as to achieve a sparse representation under strict sparsity constraints. Using the learnt dictionary, a rich feature representation of the few labeled instances is obtained which is fed to a classifier along with class labels to build the model. We evaluate our framework on the task of domain adaptation for sentiment classification. Both self-domain (requiring very few domain-specific training instances) and cross-domain classification (requiring 0 labeled instances of target domain and very few labeled instances of source domain) are performed. Empirical comparisons of self-domain and cross-domain results establish the efficacy of the proposed framework.

  • Corporate News Classification and Valence Prediction: A Supervised Approach

    49th Association for Computational Linguistics: ACL HLT 2011 Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011), Portland, Oregon, USA.

    News articles have always been a prominent force in the formation of a company’s financial image in the minds of the general public, especially the investors. Given the large amount of news being generated these days through various websites, it is possible to mine the general sentiment of a particular company being portrayed by media agencies over a period of time, which can be utilized to gauge the long term impact on the investment potential of the company. However, given such a vast amount of news data, we need to first separate corporate news from other kinds, namely sports, entertainment, science & technology, etc. We propose a system which takes news as input, checks whether it is of corporate nature, and then identifies the polarity of the sentiment expressed in the news. The system is also capable of distinguishing the company/organization which is the subject of the news from other organizations which find mention, and this is used to pair the sentiment polarity with the identified company.


Courses

  • Advanced Algorithms

    -

  • Artificial Intelligence

    -

  • Cloud Computing

    -

  • Data Structures and Algorithms

    -

  • Differential Geometry

    -

  • Machine Learning

    -

  • Object Oriented Programming

    -

  • Optimization

    -

  • Real Analysis

    -

  • Topology

    -

Honors & Awards

  • Y Ali Research Award

    -

  • Donald B Crouch Research Grant

    -
